hsiehyichia / scene-text-recognition

Scene text detection and recognition based on Extremal Region(ER)

License: MIT License

C++ 84.05% C 8.44% Python 2.36% Jupyter Notebook 3.78% CMake 1.37%

mser adaboost svm opencv ocr image-processing computer-vision scene-text-recognition scene-text-detection non-maximum-suppression lbp machine-learning algorithm detection text-recognition chaincode canny cascade-classifier spelling-checker classifier

scene-text-recognition's Issues

Documentation on Training

@HsiehYiChia Thank you for your hard work.

After reading your replies on issue #6, it is still unclear to me how to train our own model.
Could you explain the training process in simpler terms for us?

I get this error when executing it via Visual Studio 2015

The exe file compiles and runs, but nothing is shown in the window. After a few seconds I get this: Microsoft C++ Exception: std::length_error.

I am using Visual Studio 2015 Enterprise and getting an error (Error: "Node" is ambiguous).

The debugger takes me to ER.h:
struct Node
{
Node(ER v, const int i) : vertex(v), index(i){};
ER vertex;
int index;
vector adj_list;
vector edge_prob;
};

typedef vector < Node > Graph;

In typedef vector < Node > Graph, Node is reported as ambiguous.
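
One plausible cause of an "ambiguous" name like this, offered as an assumption and not a confirmed diagnosis for this project: a using-directive (for example `using namespace cv;`) makes a second `Node` visible next to the project's own struct, so the unqualified name no longer resolves to a single declaration. The sketch below reproduces the situation without OpenCV. The namespace `lib`, the simplified `Node` struct and the helper `first_index` are all hypothetical stand-ins.

```cpp
#include <vector>

// Hypothetical stand-in for a library namespace that also declares `Node`.
namespace lib {
template <class T> class Node {};
}

// Simplified stand-in for the project's own struct (fields reduced for the sketch).
struct Node {
    explicit Node(int i) : index(i) {}
    int index;
};

// After this directive, unqualified `Node` can mean ::Node or lib::Node.
using namespace lib;

// typedef std::vector<Node> Graph;    // would fail: reference to 'Node' is ambiguous
typedef std::vector< ::Node > Graph;   // qualifying with :: selects the global struct

// Small helper so the typedef is actually used.
int first_index(const Graph& g) { return g.front().index; }
```

If this is what is happening, other options would be to rename the struct or to drop the using-directive and qualify library names explicitly. Which one fits depends on how the rest of the code is written.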

How can I train with my own data?

Hi, you have done a wonderful job. Can you tell me how to train with my own data? Thanks!

The positive and negative samples cannot be uncompressed after downloading

Hi!
Thank you for your project!
Maybe the pos and neg files are too large to unzip.

I want to use the text detection function without OCR, but I don't know how to prepare my own neg and pos samples. Could you tell me how to prepare the training data? Could you upload the training data again?

Error while building scene-text-recognition

OS: Ubuntu 16.04
CMake version: 3.5.1
OpenCV: 3.4

g++ (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609

make -j8
[ 9%] Building CXX object CMakeFiles/svm-train.dir/src/svm-train.cpp.o
[ 18%] Building CXX object CMakeFiles/scene_text_recognition.dir/src/adaboost.cpp.o
[ 27%] Building CXX object CMakeFiles/scene_text_recognition.dir/src/ER.cpp.o
[ 36%] Building CXX object CMakeFiles/svm-train.dir/src/svm.cpp.o
[ 54%] Building CXX object CMakeFiles/scene_text_recognition.dir/src/main.cpp.o
[ 63%] Building CXX object CMakeFiles/scene_text_recognition.dir/src/SpellingCorrector.cpp.o
[ 72%] Building CXX object CMakeFiles/scene_text_recognition.dir/src/svm.cpp.o
[ 54%] Building CXX object CMakeFiles/scene_text_recognition.dir/src/OCR.cpp.o
[ 81%] Building CXX object CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o
[ 90%] Linking CXX executable svm-train
[ 90%] Built target svm-train
/home/giuser/Scene-text-recognition/src/utils.cpp: In function ‘void get_lbp_data()’:
/home/giuser/Scene-text-recognition/src/utils.cpp:1416:24: warning: ISO C++ forbids converting a string constant to ‘char*’ [-Wwrite-strings]
char data_filename = "training/detection_training_data.txt";
^
[100%] Linking CXX executable scene_text_recognition
CMakeFiles/scene_text_recognition.dir/src/ER.cpp.o: In function ERFilter::er_tree_extract(cv::Mat)': ER.cpp:(.text+0x165d): undefined reference to cv::error(int, cv::String const&, char const, char const*, int)'
CMakeFiles/scene_text_recognition.dir/src/ER.cpp.o: In function cv::String::String(char const*)': ER.cpp:(.text._ZN2cv6StringC2EPKc[_ZN2cv6StringC5EPKc]+0x54): undefined reference to cv::String::allocate(unsigned long)'
CMakeFiles/scene_text_recognition.dir/src/ER.cpp.o: In function cv::String::~String()': ER.cpp:(.text._ZN2cv6StringD2Ev[_ZN2cv6StringD5Ev]+0x14): undefined reference to cv::String::deallocate()'
CMakeFiles/scene_text_recognition.dir/src/ER.cpp.o: In function cv::String::operator=(cv::String const&)': ER.cpp:(.text._ZN2cv6StringaSERKS0_[_ZN2cv6StringaSERKS0_]+0x28): undefined reference to cv::String::deallocate()'
CMakeFiles/scene_text_recognition.dir/src/OCR.cpp.o: In function OCR::extract_feature(cv::Mat&, svm_node*)': OCR.cpp:(.text+0xce1): undefined reference to cv::findContours(cv::_InputOutputArray const&, cv::OutputArray const&, int, int, cv::Point)'
OCR.cpp:(.text+0x10b8): undefined reference to cv::normalize(cv::_InputArray const&, cv::_InputOutputArray const&, double, double, int, int, cv::_InputArray const&)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function image_mode(ERFilter*, char*)':
utils.cpp:(.text+0x44e): undefined reference to cv::imread(cv::String const&, int)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function video_mode(ERFilter*, char*)':
utils.cpp:(.text+0x977): undefined reference to cv::VideoCapture::VideoCapture(cv::String const&)' utils.cpp:(.text+0xa99): undefined reference to cv::VideoWriter::fourcc(char, char, char, char)'
utils.cpp:(.text+0xaed): undefined reference to cv::VideoWriter::open(cv::String const&, int, double, cv::Size_<int>, bool)' utils.cpp:(.text+0xb32): undefined reference to cv::VideoWriter::fourcc(char, char, char, char)'
utils.cpp:(.text+0xb86): undefined reference to cv::VideoWriter::open(cv::String const&, int, double, cv::Size_<int>, bool)' utils.cpp:(.text+0x1800): undefined reference to cv::imwrite(cv::String const&, cv::_InputArray const&, std::vector<int, std::allocator > const&)'
CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function load_challenge2_test_file(cv::Mat&, int)': utils.cpp:(.text+0x23f8): undefined reference to cv::imread(cv::String const&, int)'
CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function load_challenge2_training_file(cv::Mat&, int)': utils.cpp:(.text+0x274b): undefined reference to cv::imread(cv::String const&, int)'
CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function show_result(cv::Mat&, cv::Mat&, std::vector<Text, std::allocator<Text> >&, std::vector<double, std::allocator<double> >, std::vector<ER*, std::allocator<ER*> >, std::vector<std::vector<ER*, std::allocator<ER*> >, std::allocator<std::vector<ER*, std::allocator<ER*> > > >, std::vector<std::vector<ER*, std::allocator<ER*> >, std::allocator<std::vector<ER*, std::allocator<ER*> > > >, std::vector<std::vector<ER*, std::allocator<ER*> >, std::allocator<std::vector<ER*, std::allocator<ER*> > > >, std::vector<std::vector<ER*, std::allocator<ER*> >, std::allocator<std::vector<ER*, std::allocator<ER*> > > >)': utils.cpp:(.text+0x3526): undefined reference to cv::getTextSize(cv::String const&, int, double, int, int*)'
utils.cpp:(.text+0x372c): undefined reference to cv::putText(cv::_InputOutputArray const&, cv::String const&, cv::Point_<int>, int, double, cv::Scalar_<double>, int, int, bool)' utils.cpp:(.text+0x3b2b): undefined reference to cv::imshow(cv::String const&, cv::_InputArray const&)'
utils.cpp:(.text+0x3ba5): undefined reference to cv::imshow(cv::String const&, cv::_InputArray const&)' utils.cpp:(.text+0x3c1f): undefined reference to cv::imshow(cv::String const&, cv::_InputArray const&)'
utils.cpp:(.text+0x3c99): undefined reference to cv::imshow(cv::String const&, cv::_InputArray const&)' utils.cpp:(.text+0x3d13): undefined reference to cv::imshow(cv::String const&, cv::_InputArray const&)'
CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o:utils.cpp:(.text+0x3d77): more undefined references to cv::imshow(cv::String const&, cv::_InputArray const&)' follow CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function draw_FPS(cv::Mat&, double)':
utils.cpp:(.text+0x41a1): undefined reference to cv::putText(cv::_InputOutputArray const&, cv::String const&, cv::Point_<int>, int, double, cv::Scalar_<double>, int, int, bool)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function draw_linear_time_MSER(std::__cxx11::basic_string<char, std::char_traits, std::allocator >)':
utils.cpp:(.text+0x425d): undefined reference to cv::imread(cv::String const&, int)' utils.cpp:(.text+0x42c5): undefined reference to cv::VideoWriter::fourcc(char, char, char, char)'
utils.cpp:(.text+0x4319): undefined reference to cv::VideoWriter::open(cv::String const&, int, double, cv::Size_<int>, bool)' utils.cpp:(.text+0x4ca3): undefined reference to cv::imshow(cv::String const&, cv::_InputArray const&)'
CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function draw_multiple_channel(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)': utils.cpp:(.text+0x5707): undefined reference to cv::imread(cv::String const&, int)'
utils.cpp:(.text+0x5aaf): undefined reference to cv::imshow(cv::String const&, cv::_InputArray const&)' utils.cpp:(.text+0x5b23): undefined reference to cv::imshow(cv::String const&, cv::_InputArray const&)'
utils.cpp:(.text+0x5b97): undefined reference to cv::imshow(cv::String const&, cv::_InputArray const&)' utils.cpp:(.text+0x5c21): undefined reference to cv::imwrite(cv::String const&, cv::_InputArray const&, std::vector<int, std::allocator > const&)'
utils.cpp:(.text+0x5cba): undefined reference to cv::imwrite(cv::String const&, cv::_InputArray const&, std::vector<int, std::allocator<int> > const&)' utils.cpp:(.text+0x5d53): undefined reference to cv::imwrite(cv::String const&, cv::_InputArray const&, std::vector<int, std::allocator > const&)'
CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function output_MSER_time(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)': utils.cpp:(.text+0x60c9): undefined reference to cv::imread(cv::String const&, int)'
CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function output_optimal_path(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)': utils.cpp:(.text+0x6763): undefined reference to cv::imread(cv::String const&, int)'
CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function load_gt(int)': utils.cpp:(.text+0x6cf3): undefined reference to cv::imread(cv::String const&, int)'
CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function calc_recall_rate()': utils.cpp:(.text+0x7b2e): undefined reference to cv::MSER::create(int, int, int, double, double, int, double, double, int)'
CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function bootstrap()': utils.cpp:(.text+0xd10f): undefined reference to cv::imread(cv::String const&, int)'
utils.cpp:(.text+0xd34d): undefined reference to cv::imwrite(cv::String const&, cv::_InputArray const&, std::vector<int, std::allocator<int> > const&)' utils.cpp:(.text+0xd4c2): undefined reference to cv::imwrite(cv::String const&, cv::_InputArray const&, std::vector<int, std::allocator > const&)'
CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function get_lbp_data()': utils.cpp:(.text+0xd9df): undefined reference to cv::imread(cv::String const&, int)'
utils.cpp:(.text+0xdc31): undefined reference to cv::imread(cv::String const&, int)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function get_ocr_data()':
utils.cpp:(.text+0xee47): undefined reference to cv::imread(cv::String const&, int)' CMakeFiles/scene_text_recognition.dir/src/utils.cpp.o: In function cv::String::String(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&)':
utils.cpp:(.text._ZN2cv6StringC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE[_ZN2cv6StringC5ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE]+0x5d): undefined reference to `cv::String::allocate(unsigned long)'
collect2: error: ld returned 1 exit status
CMakeFiles/scene_text_recognition.dir/build.make:268: recipe for target 'scene_text_recognition' failed
make[2]: *** [scene_text_recognition] Error 1
CMakeFiles/Makefile2:104: recipe for target 'CMakeFiles/scene_text_recognition.dir/all' failed
make[1]: *** [CMakeFiles/scene_text_recognition.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

Can anyone help me out?
Thanks

What does it mean to be “strong” and “weak”?

I know it uses a machine-learned classifier to reject non-text.
But I don't understand what "strong" and "weak" mean.
What is the "strong" and "weak" classification based on?
I want to learn about it and improve the classification.

My English is not very good, so this may be hard to read.
I hope to get your reply.

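Translated from the Chinese:
What criteria decide whether something is classified as strong or weak?
I would like to understand this in depth and improve on it.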
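
This thread does not contain an answer, so the following is only one possible reading. The executable name canny_text and the "weak", "strong" and "tracked" output windows suggest Canny-style double thresholding: a candidate whose classifier score is above a high threshold is "strong", one between the low and high thresholds is "weak", and a weak candidate is kept only if it connects to a strong one. The sketch below shows that general technique. The function names, thresholds and adjacency structure are hypothetical and are not taken from the project.

```cpp
#include <cstddef>
#include <vector>

enum Label { NON_TEXT, WEAK, STRONG };

// Double threshold on a classifier score (hypothetical thresholds `low` < `high`).
Label classify(double score, double low, double high) {
    if (score >= high) return STRONG;
    if (score >= low) return WEAK;
    return NON_TEXT;
}

// Hysteresis tracking: every strong candidate is kept, and a weak candidate is
// kept only if it is reachable from a strong one through weak neighbors.
// neighbors[i] lists the indices adjacent to candidate i (hypothetical adjacency).
std::vector<bool> track(const std::vector<Label>& labels,
                        const std::vector<std::vector<std::size_t> >& neighbors) {
    std::vector<bool> keep(labels.size(), false);
    std::vector<std::size_t> stack;
    for (std::size_t i = 0; i < labels.size(); ++i) {
        if (labels[i] == STRONG) {
            keep[i] = true;
            stack.push_back(i);
        }
    }
    while (!stack.empty()) {
        std::size_t cur = stack.back();
        stack.pop_back();
        for (std::size_t k = 0; k < neighbors[cur].size(); ++k) {
            std::size_t n = neighbors[cur][k];
            if (!keep[n] && labels[n] == WEAK) {
                keep[n] = true;
                stack.push_back(n);
            }
        }
    }
    return keep;
}
```

Under this reading, improving the classification would mean either improving the score itself or tuning the two thresholds. Whether the project works this way would need to be checked against its source.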

Now I am getting this error after running the .exe file

.\canny_text.exe -icdar
Error: the input file is not opened!!
Error: the input file is not opened!!
Error: the Transition Probability Table file is not opened!!

By the way, I also tried opening an image, but I get the same error.
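
Messages like these often mean that a file referenced by a relative path was not found because the program was started from a different working directory. That is an assumption here, since the thread does not say which paths the program uses. A minimal sketch of a check that prints the exact path that failed, using only the C++ standard library (the helper name `can_open` is hypothetical):

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper: try to open `path` for reading and, on failure,
// report the exact path so a wrong working directory is easy to spot.
bool can_open(const std::string& path) {
    std::ifstream in(path.c_str());
    if (!in.is_open()) {
        std::cerr << "Error: cannot open '" << path
                  << "' (relative paths are resolved against the current working directory)\n";
        return false;
    }
    return true;
}
```

Calling a check like this on each data file before use, or starting the executable from the directory that contains its data files, would help narrow down which file is missing.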

The steps of model training

Hi, thank you for your work. Can you describe the steps of model training in more detail, including which functions are used? Thank you.

Should the text data and non-text data be in the root directory of /res/neg and /res/pos?

Should the text data and non-text data be placed in the root directory of /res/neg and /res/pos, or can they be put in other folders under /res?

Poor Results

Dear HsiehYiChia,

I ran the model with the ".\scene_text_recognition.bat -i res\ICDAR2015_test\img_6.jpg" command and noticed that the result was not the same as the one mentioned. For example, I was expecting to see result1.jpg, but the output was the result. Can you tell me how to fix it? Thank you in advance.

What do the output windows 'pool', 'weak', 'strong', 'tracked', 'result' and 'all' each mean?

I'm new to the image processing field. Would you please help me with this question?

Error executing with a video argument

Hello,

I've successfully compiled the solution and I've been able to test it with some images. However, when I pass a video file as an argument, it seems to process only one frame, the first one. I am using it as written in the docs: "scene_text_recognition.exe -v videofile.name". Am I doing something wrong?

Thanks!
Ana
