Comments (2)

faustomorales commented on July 20, 2024

cairo is used in the Keras OCR example primarily for rendering text on images to create synthetic datasets. Originally, keras-ocr did the same thing (you can see it if you go back in time).

surface = cairocffi.ImageSurface(cairocffi.FORMAT_ARGB32, width, height)

In fact, the early version (prior to the first public commit) of this library directly cited the OCR example.

Today, this library uses Pillow (PIL) to carry out this task. The benefits of switching were:

  • PIL has no additional system dependencies that users have to install (or at least none that are unusual for users to have installed already) with apt, brew or some other system package manager.
  • PIL has a simpler API than cairo.
  • The quality of images rendered with PIL is roughly the same as that of images rendered with cairo in my brief testing.
  • PIL makes it easy to load fonts from TrueType font files without the fonts needing to be installed on the system (cairo required fonts to be installed).
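To illustrate the last point, here is a minimal sketch of rendering text with Pillow from a font file on disk. It is not the code keras-ocr actually uses; the function name `render_text` and its parameters are hypothetical, chosen only for this example. When no font path is given, it falls back to Pillow's built-in default font so the snippet runs without any font files.

```python
from PIL import Image, ImageDraw, ImageFont


def render_text(text, font_path=None, font_size=32, size=(256, 64)):
    """Render `text` in black on a white RGB canvas and return the PIL image.

    `font_path` is a hypothetical parameter: a path to any .ttf file.
    The font does not need to be installed on the system.
    """
    image = Image.new("RGB", size, color=(255, 255, 255))
    draw = ImageDraw.Draw(image)
    if font_path is not None:
        # Load the font directly from the TrueType file.
        font = ImageFont.truetype(font_path, font_size)
    else:
        # Fall back to the font bundled with Pillow.
        font = ImageFont.load_default()
    draw.text((8, 8), text, fill=(0, 0, 0), font=font)
    return image


image = render_text("hello")
```

Passing something like `font_path="path/to/SomeFont.ttf"` (a placeholder path) would render with that font instead of the default.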

OpenCV is used for generic image manipulation (resizing, warping text boxes, etc.). For generating synthetic data, it is used in conjunction with PIL (which carries out the text rendering steps). It's used here because it is well-supported, ships with pre-built wheels for all major platforms, and has bindings available for other languages (e.g., JavaScript). Doing the image processing in OpenCV leaves the door open to using the models built with this package with, for example, TensorFlow.js. Some of the post-processing steps in the detector (e.g., extracting contours) cannot be easily done with TensorFlow.js alone. But because all the logic is written with OpenCV functions, porting that logic over to JavaScript will be much easier than if we had used PIL for those steps.

The only reason we don't use OpenCV for rendering text is that it does not have the same font handling options as PIL.

bayethiernodiop commented on July 20, 2024

Awesome. Thanks for the clear explanations. YOU ROCK!!!