Comments (2)
`cairo` is used in the Keras OCR example primarily for rendering text on images to create synthetic datasets. Originally, keras-ocr did the same thing (you can see it if you go back in the commit history).
Line 196 in 5b074c3
In fact, the early version (prior to the first public commit) of this library directly cited the OCR example.
Today, this library uses Pillow (`PIL`) to carry out this task. The benefits of switching were:
- `PIL` has no additional system dependencies that users have to install with `apt`, `brew`, or some other system package manager (or at least none that are unusual for users to have installed already).
- `PIL` has a simpler API than `cairo`.
- The quality of images rendered with `PIL` is roughly the same as that of images rendered with `cairo` in my brief testing.
- `PIL` makes it easy to load fonts from TrueType font files without the fonts needing to be installed on the system (`cairo` required fonts to be installed).
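As a rough illustration of the last point, here is a minimal sketch of rendering text with Pillow from a TrueType file on disk. It is not the code keras-ocr uses; `render_text`, its parameters, and the fallback to Pillow's built-in default font are assumptions made so the example runs without any particular font file being present.

```python
from PIL import Image, ImageDraw, ImageFont


def render_text(text, font_path=None, font_size=32, size=(256, 64)):
    """Draw black text on a white canvas (hypothetical helper, not keras-ocr API)."""
    if font_path is not None:
        # Load the font straight from a .ttf file; it does not need to be
        # installed on the system.
        font = ImageFont.truetype(font_path, font_size)
    else:
        # Assumption for this sketch: fall back to Pillow's bundled default font.
        font = ImageFont.load_default()
    image = Image.new("RGB", size, color=(255, 255, 255))
    draw = ImageDraw.Draw(image)
    draw.text((8, 8), text, font=font, fill=(0, 0, 0))
    return image


image = render_text("hello")
```

Passing `font_path="path/to/some-font.ttf"` (a placeholder path) would render with that font instead of the default one.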
OpenCV is used for generic image manipulation (resizing, warping text boxes, etc.). For generating synthetic data, it is used in conjunction with `PIL`, which carries out the text rendering steps. OpenCV is used here because it is well-supported, ships with pre-built wheels for all major platforms, and has bindings available for other languages (e.g., JavaScript). Doing the image processing in OpenCV leaves the door open to using the models built with this package with, for example, TensorFlow.js. Some of the post-processing steps in the detector (e.g., extracting contours) cannot be easily done with TensorFlow.js alone. But because all of that logic is written with OpenCV functions, porting it to JavaScript will be much easier than if we had used `PIL` for those steps.
We don't use OpenCV for rendering text only because it does not have the same font handling options as `PIL`.
Awesome. thanks for the clear explanations. YOU ROCK!!!