Giter Club home page Giter Club logo

invoice_text_renderer's Introduction

项目简介

基于原text_render 根据自己的需求做了一些修改;

增加了对于色彩的支持(包含文字色彩与背景色彩图);

增加了通过字符截图来替代字体ttf生成样本的支持(虽然部分票据的点阵字体可以通过淘宝购买,但在出租车票等地域特色显著的票种依然很难收集到相似效果的字体)

Support both latin and non-latin text.

  • Ubuntu 16.04
  • python 3.5+

Install dependencies:

pip3 install -r requirements.txt

Python3 main.py

带参数的:

python3 main.py --num_img 1000000 --strict --num_processes 8 --config_file ./configs/autowork.yaml --bg_dir ./data/autoworkbg --output_dir /home/jovyan/data-modified/ocr_render/output --tag base_10_copy

Demo

example1.jpg example2.jpg example3.jpg example4.jpg

FontByImage

00000000.jpg 00000001.jpg 00000002.jpg 00000003.jpg

文字变换特效

编辑 configs/default.yaml 来实现文字弯曲,模糊等效果,详见原text_render; 新增表格线颜色控制,字体颜色控制,添加冗余文字

Effect name Image
Origin(Font size 25) origin
Perspective Transform perspective
Random Crop rand_crop
Curve curve
Light border light border
Dark border dark border
Random char space big random char space big
Random char space small random char space small
Middle line middle line
Table line table line
Under line under line
Emboss emboss
Reverse color reverse color
Blur blur
font_color font_color
line_color line_color
extra_words extra_words

颜色配置说明: 程序会随机拾取下方所配置颜色种类(按照fraction中的概率,所有颜色fraction相加需要等于1),选中颜色种类后,rgb由l_boundary与h_boundary中取一个随机数确定; font_color 同理

line_color:
  black:
    fraction: 0.3
    l_boundary: 0,0,0
    h_boundary: 64,64,64
  gray:
    fraction: 0.7
    l_boundary: 77,77,77
    h_boundary: 230,230,230

冗余文字说明: 如开启选项,程序会按照设定的总概率随机选取一段独立的文字(颜色,字体及字号大小均采用正文的设置),生成的文字长度在1到最长文本长度之间随机。 top 与 bottom 中的 fraction 决定了冗余文字生成在上方或下方的概率; *注:由于冗余文字尽量生成在上方与下方,所以可能生成在图片之外,所以最终出现的概率会小于设定值;

extra_words:
  enable: true
  fraction: 0.05
  top:
    fraction: 0.5
  bottom:
    fraction: 0.5

Strict mode

For no-latin language(e.g Chinese), it's very common that some fonts only support limited chars. In this case, you will get bad results like these:

bad_example1

bad_example2

bad_example3

Select fonts that support all chars in --chars_file is annoying. Run main.py with --strict option, renderer will retry get text from corpus during generate processing until all chars are supported by a font.

Tools

You can use check_font.py script to check how many chars your font not support in --chars_file:

python3 tools/check_font.py

checking font ./data/fonts/eng/Hack-Regular.ttf
chars not supported(4971):
['', '', '广', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '','', '', '', ''...]
0 fonts support all chars(5071) in ./data/chars/chn.txt:
[]

Generate image using GPU

If you want to use GPU to make generate image faster, first compile opencv with CUDA. Compiling OpenCV with CUDA support

Then build Cython part, and add --gpu option when run main.py

cd libs/gpu
python3 setup.py build_ext --inplace

Debug mode

Run python3 main.py --debug will save images with extract information. You can see how perspectiveTransform works and all bounding/rotated boxes.

debug_demo

Todo

  • 增加同一张图中不同字体、大小与颜色的设置;
  • 增加文字侵蚀,打印不完整的效果;

invoice_text_renderer's People

Contributors

verarong avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.