
breezedeus / cnocr


This project is forked from diaomin/crnn-mxnet-chinese-text-recognition.


CnOCR: Awesome Chinese/English OCR Python toolkit based on PyTorch. It comes with 20+ well-trained models for different application scenarios and can be used directly after installation.

Home Page: https://www.breezedeus.com/article/cnocr

License: Apache License 2.0

Python 98.71% Makefile 1.16% Dockerfile 0.13%
ocr ocr-python pytorch chinese-character-recognition english-character-recognition

cnocr's Introduction

English | 中文

CnOCR

Tech should serve the people, not enslave them!
Please do not use this project for text censorship!
---

[Update 2023.12.24]: V2.3 released

Main changes:

  • All models have been retrained and are more accurate than in the previous release.
  • Models are now grouped into several categories by usage scenario (see the list of recognition models):
    • scene: scene images, suited to recognizing text in ordinary photos. These models start with scene-, e.g. scene-densenet_lite_136-gru.
    • doc: document images, suited to recognizing screenshots of regular documents, such as scanned book pages. These models start with doc-, e.g. doc-densenet_lite_136-gru.
    • number: digits-only images (only the ten digits 0~9 are recognized), suited to scenarios such as bank card numbers and ID numbers. These models start with number-, e.g. number-densenet_lite_136-gru.
    • general: general-purpose models for images with no obvious bias. These models have no special prefix and keep the old model names, e.g. densenet_lite_136-gru.

    Note ⚠️: the descriptions above are for reference only; when choosing a model, go by the actual results.

  • Two series of larger models have been added:
    • *-densenet_lite_246-gru_base: available first to members of the Knowledge Planet CnOCR/CnSTD private group; it will be open-sourced for free one month later.
    • *-densenet_lite_666-gru_large: Pro models, available after purchase.

For more details see: CnOCR V2.3 Release: Better, More, and Larger Models | Breezedeus.com

CnOCR is an Optical Character Recognition (OCR) toolkit for Python 3. It supports recognition of common characters in Simplified Chinese, Traditional Chinese (some models), English, and digits, and supports vertical text. It ships with 20+ trained models for different application scenarios that can be used directly after installation. CnOCR also provides simple training commands so users can train their own models. You are welcome to scan the QR code to add the assistant on WeChat (note: ocr); the assistant regularly invites everyone into the user group:

WeChat group QR code

The author also maintains the Knowledge Planet CnOCR/CnSTD private group, where questions are answered relatively quickly; you are welcome to join. Knowledge Planet members enjoy the following benefits:

  • Free downloads of some paid models that are not open-sourced;
  • A 20% discount on all other paid models;
  • Quick replies from the author to problems encountered during use;
  • A free training service on members' own data, offered twice a month by the author;
  • Private CnOCR/CnSTD-related materials released in the group over time;
  • The latest OCR/STD/CV research materials, published in the group on an ongoing basis.

Documentation

CnOCR online documentation

Usage

Starting from V2.2, CnOCR internally calls the text detection engine CnSTD to detect and locate text. So CnOCR V2.2+ can recognize not only images of printed text with simple layouts, such as screenshots and scans, but also scene text in ordinary photos.

Below are usage examples for different scenarios.

Usage examples for different scenarios

Recognizing ordinary images

Just use the default values for all parameters. If the results are not good enough, experiment with the parameters; you can usually reach fairly good accuracy in the end.

from cnocr import CnOcr

img_fp = './docs/examples/huochepiao.jpeg'
ocr = CnOcr()  # use default values for all parameters
out = ocr.ocr(img_fp)

print(out)
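
Each element of out corresponds to one detected text line. Below is a minimal sketch of consuming the result, assuming the V2.2+ output format in which each element is a dict with 'text', 'score', and 'position' keys (check the online docs for your installed version):

from cnocr import CnOcr

img_fp = './docs/examples/huochepiao.jpeg'
ocr = CnOcr()
out = ocr.ocr(img_fp)

# Assumed output format: each item looks roughly like
#   {'text': recognized string, 'score': confidence, 'position': 4x2 box coordinates}
for line in out:
    print(f"{line['score']:.2f}\t{line['text']}")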

Recognition result:

Train ticket recognition

Recognizing screenshots of printed text with simple layouts

For images of printed text with simple layouts, such as screenshots and scans, you can use det_model_name='naive_det', which amounts to skipping the text detection model and splitting lines with simple rules instead.

Note

det_model_name='naive_det' behaves like the CnOCR versions before V2.2 (V2.0.*, V2.1.*).

The main advantage of det_model_name='naive_det' is speed; the downside is that it is picky about images. How do you decide whether to use it? The simplest way is to try it on your own images: if the results are good, use it; if not, don't.

from cnocr import CnOcr

img_fp = './docs/examples/multi-line_cn1.png'
ocr = CnOcr(det_model_name='naive_det') 
out = ocr.ocr(img_fp)

print(out)

Recognition result:

Image: docs/examples/multi-line_cn1.png
OCR result:
网络支付并无本质的区别,因为
每一个手机号码和邮件地址背后
都会对应着一个账户--这个账
户可以是信用卡账户、借记卡账
户,也包括邮局汇款、手机代
收、电话代收、预付费卡和点卡
等多种形式。

Vertical text recognition

Recognition uses the Chinese recognition model rec_model_name='ch_PP-OCRv3' from PaddleOCR (hereafter ppocr).

from cnocr import CnOcr

img_fp = './docs/examples/shupai.png'
ocr = CnOcr(rec_model_name='ch_PP-OCRv3')
out = ocr.ocr(img_fp)

print(out)

Recognition result:

Vertical text recognition

English recognition

Although the Chinese detection and recognition models can also recognize English, detectors and recognizers trained specifically for English are usually more accurate. For purely English scenarios, it is recommended to use the English detection model det_model_name='en_PP-OCRv3_det' and the English recognition model rec_model_name='en_PP-OCRv3' from ppocr.

from cnocr import CnOcr

img_fp = './docs/examples/en_book1.jpeg'
ocr = CnOcr(det_model_name='en_PP-OCRv3_det', rec_model_name='en_PP-OCRv3')
out = ocr.ocr(img_fp)

print(out)

Recognition result:

English recognition

Traditional Chinese recognition

Recognition uses the Traditional Chinese recognition model rec_model_name='chinese_cht_PP-OCRv3' from ppocr.

from cnocr import CnOcr

img_fp = './docs/examples/fanti.jpg'
ocr = CnOcr(rec_model_name='chinese_cht_PP-OCRv3')  # use the Traditional Chinese recognition model
out = ocr.ocr(img_fp)

print(out)

Keep the following in mind when using this model:

  • Recognition accuracy is mediocre;

  • Apart from Traditional Chinese characters, it does not handle punctuation, English, or digits well;

  • This model does not support vertical text.

Recognition result:

Traditional Chinese recognition

Recognizing single-line text images

If you know for sure that the image to be recognized contains a single line of text (as in the image below), you can use the method CnOcr.ocr_for_single_line(). This skips text detection and is more than twice as fast.

Single-line text recognition
The code is as follows:
from cnocr import CnOcr

img_fp = './docs/examples/helloworld.jpg'
ocr = CnOcr()
out = ocr.ocr_for_single_line(img_fp)
print(out)

More application examples

  • Nucleic acid test / vaccination screenshot recognition
Nucleic acid test / vaccination screenshot recognition
  • ID card recognition
ID card recognition
  • Restaurant receipt recognition
Restaurant receipt recognition

Installation

Well, if all goes well, a single command will do.

$ pip install cnocr[ort-cpu]

If you are using ONNX models in a GPU environment, install with the following command:

$ pip install cnocr[ort-gpu]

If you want to train your own models, install with the following command:

$ pip install cnocr[dev]

If installation is slow, you can point pip at a mirror inside China, e.g. the Aliyun mirror:

$ pip install cnocr[ort-cpu] -i https://mirrors.aliyun.com/pypi/simple

Note

Please use Python 3 (any version from 3.7.* to 3.10.* should work); it has not been tested under Python 2.

More details can be found in the installation documentation.

Warning

If the PyTorch or OpenCV Python packages have never been installed on your machine, you may run into problems on first install, but they are usually common issues that can be resolved with a Baidu/Google search.

Docker Image

You can pull an image with CnOCR pre-installed directly from Docker Hub.

$ docker pull breezedeus/cnocr:latest

More details can be found in the installation documentation.

HTTP service

CnOCR V2.2.1 added an HTTP service based on FastAPI. Running the service requires a few extra packages, which can be installed with:

pip install cnocr[serve]

After installation, start the HTTP service with the following command (the number after -p is the port; adjust it as needed):

cnocr serve -p 8501

Once the service is running, it can be called in the following ways.

Command line

For example, if the file to recognize is docs/examples/huochepiao.jpeg, call the service with curl as follows:

> curl -F image=@docs/examples/huochepiao.jpeg http://0.0.0.0:8501/ocr

Python

Call the service as follows:

import requests

image_fp = 'docs/examples/huochepiao.jpeg'
r = requests.post(
    'http://0.0.0.0:8501/ocr', files={'image': (image_fp, open(image_fp, 'rb'), 'image/png')},
)
ocr_out = r.json()['results']
print(ocr_out)

See also the file scripts/screenshot_daemon_with_server.py.

Other languages

Please implement the call yourself by following the curl example above.

Available models

Available detection models

For details, see CnSTD's download instructions.

| det_model_name | PyTorch version | ONNX version | Original source | Model file size | Supported languages | Vertical text support |
|---|---|---|---|---|---|---|
| db_shufflenet_v2 | ✓ | X | cnocr | 18 M | Simplified Chinese, Traditional Chinese, English, digits | ✓ |
| db_shufflenet_v2_small | ✓ | X | cnocr | 12 M | Simplified Chinese, Traditional Chinese, English, digits | ✓ |
| db_shufflenet_v2_tiny | ✓ | X | cnocr | 7.5 M | Simplified Chinese, Traditional Chinese, English, digits | ✓ |
| db_mobilenet_v3 | ✓ | X | cnocr | 16 M | Simplified Chinese, Traditional Chinese, English, digits | ✓ |
| db_mobilenet_v3_small | ✓ | X | cnocr | 7.9 M | Simplified Chinese, Traditional Chinese, English, digits | ✓ |
| db_resnet34 | ✓ | X | cnocr | 86 M | Simplified Chinese, Traditional Chinese, English, digits | ✓ |
| db_resnet18 | ✓ | X | cnocr | 47 M | Simplified Chinese, Traditional Chinese, English, digits | ✓ |
| ch_PP-OCRv3_det | X | ✓ | ppocr | 2.3 M | Simplified Chinese, Traditional Chinese, English, digits | ✓ |
| ch_PP-OCRv2_det | X | ✓ | ppocr | 2.2 M | Simplified Chinese, Traditional Chinese, English, digits | ✓ |
| en_PP-OCRv3_det | X | ✓ | ppocr | 2.3 M | English, digits | ✓ |
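
For completeness, here is a hedged sketch of selecting a specific detection model by name via the det_model_name parameter already shown in the examples above (whether a given model is usable depends on which backend, PyTorch or ONNX, your installation supports):

from cnocr import CnOcr

# Illustrative combination (not prescribed by this README): an explicitly chosen
# ppocr detection model together with the default recognition model.
ocr = CnOcr(det_model_name='ch_PP-OCRv3_det')
out = ocr.ocr('./docs/examples/huochepiao.jpeg')
print(out)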

Available recognition models

Compared with CnOCR V2.2.*, most models in V2.3 have been retrained and fine-tuned, so they are more accurate than the old ones. Two model series with more parameters have also been added:

  • *-densenet_lite_246-gru_base: available first to members of the Knowledge Planet CnOCR/CnSTD private group; it will be open-sourced for free later.
  • *-densenet_lite_666-gru_large: Pro models, available after purchase. Purchase links are given in the documentation.

The V2.3 models fall into the following categories by usage scenario (a short usage sketch follows this list):

  • scene: scene images, suited to recognizing text in ordinary photos. These models start with scene-, e.g. scene-densenet_lite_136-gru.
  • doc: document images, suited to recognizing screenshots of regular documents, such as scanned book pages. These models start with doc-, e.g. doc-densenet_lite_136-gru.
  • number: digits-only images (only the ten digits 0~9 are recognized), suited to scenarios such as bank card numbers and ID numbers. These models start with number-, e.g. number-densenet_lite_136-gru.
  • general: general-purpose models for images with no obvious bias. These models have no special prefix and keep the old model names, e.g. densenet_lite_136-gru.

Note ⚠️: the descriptions above are for reference only; when choosing a model, go by the actual results.
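
To use one of these scenario-specific models, pass its name through the constructor parameters shown in the earlier examples. A minimal sketch, assuming the model names listed in the table below and the rec_model_name parameter used throughout this README:

from cnocr import CnOcr

# Illustrative choice (not prescribed by this README): a document screenshot,
# so pick a doc- recognition model from the table below.
ocr = CnOcr(rec_model_name='doc-densenet_lite_136-gru')
out = ocr.ocr('./docs/examples/multi-line_cn1.png')
print(out)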

For more details, see: Available Models.

| rec_model_name | PyTorch version | ONNX version | Original source | Model file size | Supported languages | Vertical text support |
|---|---|---|---|---|---|---|
| densenet_lite_136-gru 🆕 | ✓ | ✓ | cnocr | 12 M | Simplified Chinese, English, digits | X |
| scene-densenet_lite_136-gru 🆕 | ✓ | ✓ | cnocr | 12 M | Simplified Chinese, English, digits | X |
| doc-densenet_lite_136-gru 🆕 | ✓ | ✓ | cnocr | 12 M | Simplified Chinese, English, digits | X |
| densenet_lite_246-gru_base 🆕 (Knowledge Planet members only) | ✓ | ✓ | cnocr | 25 M | Simplified Chinese, English, digits | X |
| scene-densenet_lite_246-gru_base 🆕 (Knowledge Planet members only) | ✓ | ✓ | cnocr | 25 M | Simplified Chinese, English, digits | X |
| doc-densenet_lite_246-gru_base 🆕 (Knowledge Planet members only) | ✓ | ✓ | cnocr | 25 M | Simplified Chinese, English, digits | X |
| densenet_lite_666-gru_large 🆕 (purchase links: B站, Lemon Squeezy) | ✓ | ✓ | cnocr | 82 M | Simplified Chinese, English, digits | X |
| scene-densenet_lite_666-gru_large 🆕 (purchase links: B站, Lemon Squeezy) | ✓ | ✓ | cnocr | 82 M | Simplified Chinese, English, digits | X |
| doc-densenet_lite_666-gru_large 🆕 (purchase links: B站, Lemon Squeezy) | ✓ | ✓ | cnocr | 82 M | Simplified Chinese, English, digits | X |
| number-densenet_lite_136-fc 🆕 | ✓ | ✓ | cnocr | 2.7 M | Digits only (0~9) | X |
| number-densenet_lite_136-gru 🆕 (Knowledge Planet members only) | ✓ | ✓ | cnocr | 5.5 M | Digits only (0~9) | X |
| number-densenet_lite_666-gru_large 🆕 (purchase links: B站, Lemon Squeezy) | ✓ | ✓ | cnocr | 55 M | Digits only (0~9) | X |
| ch_PP-OCRv3 | X | ✓ | ppocr | 10 M | Simplified Chinese, English, digits | ✓ |
| ch_ppocr_mobile_v2.0 | X | ✓ | ppocr | 4.2 M | Simplified Chinese, English, digits | ✓ |
| en_PP-OCRv3 | X | ✓ | ppocr | 8.5 M | English, digits | ✓ |
| en_number_mobile_v2.0 | X | ✓ | ppocr | 1.8 M | English, digits | ✓ |
| chinese_cht_PP-OCRv3 | X | ✓ | ppocr | 11 M | Traditional Chinese, English, digits | X |

Future work

  • Support images containing multiple lines of text (Done)
  • Variable-length prediction for the CRNN model, for more flexibility (since V1.0.0)
  • Improve the test suite (Doing)
  • Fix bugs (the code is still rather messy...) (Doing)
  • Support recognition of spaces (since V1.1.0)
  • Try new models such as DenseNet to further improve recognition accuracy (since V1.1.0)
  • Clean up the training set by removing unreasonable samples, then retrain all models
  • Migrate from MXNet to PyTorch (since V2.0.0)
  • Train more efficient models on top of PyTorch
  • Support column-format text recognition
  • Seamless integration with CnSTD (since V2.2)
  • Further improve model accuracy
  • Support more application scenarios

Buy the author a coffee

Open source is not easy. If this project helps you, consider buying the author a drink 🥤 to cheer him on 💪🏻.


Official repository: https://github.com/breezedeus/cnocr

cnocr's People

Contributors

breezedeus, diaomin, hjue, icarusion, jinnrry, myuanz, qianyun210603, sugobet, uranusseven


cnocr's Issues

Can I train on top of an existing model?

The original project had load_epoch for loading an existing model. Can I now train on top of the downloaded 0200 model?
I tried, and got an error.
python scripts/cnocr_train.py --dataset cn_ocr --load_epoch 0020
proc 0 started
proc 1 started
proc 2 started
proc 3 started
proc 0 started
proc 1 started
2019-04-25 23:42:32,320 Loaded model ./models/model_0020.params
Process Process-2:
Traceback (most recent call last):
File "/////lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "////python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "////cnocr/cnocr/data_utils/multiproc_data.py", line 89, in _proc_loop
data = fn()
File "////cnocr/cnocr/data_utils/data_iter.py", line 212, in _gen_sample
labels[idx - 1] = int(img_lst[idx])
IndexError: index 10 is out of bounds for axis 0 with size 10
Process Process-4:

Retraining a model fails

I generated a batch of training data myself; during training, accuracy suddenly drops to 0 and then stays at 0 without ever changing again.

pip install fails; it seems it cannot find mxnet versions 1.4.1 to 1.5.0

I tried downloading from the original source as well as several mirrors, and it is basically always this problem. The error is below:
ERROR: Could not find a version that satisfies the requirement mxnet<1.5.0,>=1.4.1 (from cnocr) (from versions: 0.11.1b20170915, 0.11.1b20170922, 0.11.1b20170929, 0.11.1b20171006, 0.11.1b20171013, 0.12.0b20171020, 0.12.0b20171027, 0.12.0, 0.12.1b20171103, 0.12.1, 1.0.0, 1.0.0.post1, 1.0.0.post3, 1.0.0.post4, 1.0.1b20180114, 1.0.1b20180121, 1.0.1b20180128, 1.0.1b20180202, 1.1.0b20180209, 1.1.0b20180216, 1.1.0.post0, 1.2.0b20180223, 1.2.0b20180302, 1.2.0b20180309, 1.2.0b20180323, 1.2.0b20180330, 1.2.0b20180406, 1.2.0b20180413, 1.2.0b20180420, 1.2.0b20180427, 1.2.0b20180504, 1.2.0, 1.6.0)
ERROR: No matching distribution found for mxnet<1.5.0,>=1.4.1 (from cnocr)

fine-tuning

Hi, I want to add some small datasets to fine-tune your models. Was the final epoch of your Baidu Wangpan model 20?

Would you consider recognition by content type?

Chinese recognition is nearly perfect, but there is one small problem: 0 and o, 1 and l, 9 and g are often confused with each other. Could a text-type parameter be added?

A simple sentence is not recognized

The sentence I want to recognize is "中华人民共和国"; the code is:

import cv2
from cnocr import CnOcr

ocr = CnOcr()
img = cv2.imread("1.png", 0)  # read the image as grayscale
text = ocr.ocr_for_single_line(img)

It recognizes ['严', '这', '吧']. What is going on?

Why is there such a huge gap in accuracy between screenshots and photos?

Accuracy on screenshots is close to 100%, but if I photograph the same screenshot and recognize the photo, almost nothing is recognized. What processing do I need to significantly improve accuracy on photos? Thanks!


The image above can be recognized, the one below cannot.

Cannot download models from Dropbox

Hello, when running the program it downloads models from Dropbox by default, but that is not accessible from inside China. How should I handle this?

mxnet.base.MXNetError

My code is below:
# _*_ coding = utf-8 _*_

from cnocr import CnOcr
png = 'E:\python\py\Vitaminpic\2018-10-29 维生素价格.png'
ocr = CnOcr()
res = ocr.ocr(png)
print("Predicted Chars:", res)`

However, I get the following error with the latest mxnet version (1.4.1):

Traceback (most recent call last):
File "e:\python\exam\exam3.py", line 6, in
res = ocr.ocr(png)
File "C:\Users\NexFord\AppData\Local\Programs\Python\Python37\lib\site-packages\cnocr\cn_ocr.py", line 145, in ocr
img = mx.image.imread(img_fp, 1).asnumpy()
File "C:\Users\NexFord\AppData\Local\Programs\Python\Python37\lib\site-packages\mxnet\image\image.py", line 85, in imread
return _internal._cvimread(filename, *args, **kwargs)
File "", line 35, in _cvimread
File "C:\Users\NexFord\AppData\Local\Programs\Python\Python37\lib\site-packages\mxnet_ctypes\ndarray.py", line 92, in _imperative_invoke
ctypes.byref(out_stypes)))
File "C:\Users\NexFord\AppData\Local\Programs\Python\Python37\lib\site-packages\mxnet\base.py", line 252, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [17:54:53] C:\Jenkins\workspace\mxnet-tag\mxnet\src\io\image_io.cc:222: Check failed: file.is_open() Imread: 'E:\python\py\Vitaminpic\2018-10-29 维生素价格.png' couldn't open file: Invalid argument
So, who is Jenkins?

Error when running ocr = CnOcr()

Following the tutorial, I installed the package and then imported it, but got the following error:
1 attempt left
Downloading /home/yqli/.cnocr/cnocr-models-v1.0.0.zip from https://www.dropbox.com/s/7w8l3mk4pvkt34w/cnocr-models-v1.0.0.zip?dl=1...
Traceback (most recent call last):
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/urllib3/connection.py", line 157, in _new_conn
(self._dns_host, self.port), self.timeout, **extra_kw
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/urllib3/util/connection.py", line 84, in create_connection
raise err
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/urllib3/util/connection.py", line 74, in create_connection
sock.connect(sa)
OSError: [Errno 101] Network is unreachable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/urllib3/connectionpool.py", line 672, in urlopen
chunked=chunked,
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/urllib3/connectionpool.py", line 376, in _make_request
self._validate_conn(conn)
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/urllib3/connectionpool.py", line 994, in _validate_conn
conn.connect()
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/urllib3/connection.py", line 334, in connect
conn = self._new_conn()
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/urllib3/connection.py", line 169, in _new_conn
self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x7faa2f72c9b0>: Failed to establish a new connection: [Errno 101] Network is unreachable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/urllib3/connectionpool.py", line 720, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/urllib3/util/retry.py", line 436, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.dropbox.com', port=443): Max retries exceeded with url: /s/7w8l3mk4pvkt34w/cnocr-models-v1.0.0.zip?dl=1 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7faa2f72c9b0>: Failed to establish a new connection: [Errno 101] Network is unreachable',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "", line 1, in
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/cnocr/cn_ocr.py", line 101, in init
self._assert_and_prepare_model_files(root)
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/cnocr/cn_ocr.py", line 126, in _assert_and_prepare_model_files
get_model_file(root)
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/cnocr/utils.py", line 69, in get_model_file
download(MODEL_BASE_URL, path=zip_file_path, overwrite=True)
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/mxnet/gluon/utils.py", line 342, in download
raise e
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/mxnet/gluon/utils.py", line 309, in download
r = requests.get(url, stream=True, verify=verify_ssl)
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/requests/api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/requests/api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/requests/sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/home/yqli/anaconda3/envs/practice/lib/python3.6/site-packages/requests/adapters.py", line 516, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='www.dropbox.com', port=443): Max retries exceeded with url: /s/7w8l3mk4pvkt34w/cnocr-models-v1.0.0.zip?dl=1 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7faa2f72c9b0>: Failed to establish a new connection: [Errno 101] Network is unreachable',))
This step also fails on Windows.

How is variable-width image recognition supported? By slicing the image?

Thanks to the author for sharing!
The training images are fixed-width, right?
How is variable-width image recognition done? Do you cut images to the training width and then batch them for inference?
I train with fixed-width images, and it seems the longer the image, the harder it is to train and the worse the results.
How do you get such good OCR results even on long images?

Downloaded the model from the network drive and put it under the cnocr folder, but the program still tries to download it

I downloaded the model from the network drive and put it under the cnocr folder, but the program kept trying to download it. Then I unzipped it myself and placed the models folder under the cnocr folder, which still did not work. Later I found that the file names under models were different; after renaming them the download was skipped, but then another error occurred:
[21:51:57] C:\Jenkins\workspace\mxnet-tag\mxnet\src\nnvm\legacy_json_util.cc:209: Loading symbol saved by previous version v1.3.1. Attempting to upgrade...
[21:51:57] C:\Jenkins\workspace\mxnet-tag\mxnet\src\nnvm\legacy_json_util.cc:217: Symbol successfully upgraded!
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\site-packages\mxnet\symbol\symbol.py", line 1523, in simple_bind
ctypes.byref(exe_handle)))
File "C:\ProgramData\Anaconda3\lib\site-packages\mxnet\base.py", line 252, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [21:51:57] c:\jenkins\workspace\mxnet-tag\mxnet\src\executor../common/exec_utils.h:392: InferShape pass cannot decide shapes for the following arguments (0s means unknown dimensions). Please consider providing them as inputs:
l0_init_h: [], l2_init_h: [], l1_init_h: [], l3_init_h: [],

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:/OCR/ocrtest.py", line 2, in
ocr = CnOcr()
File "C:\ProgramData\Anaconda3\lib\site-packages\cnocr\cn_ocr.py", line 107, in init
self._mod = self._get_module(self._hp)
File "C:\ProgramData\Anaconda3\lib\site-packages\cnocr\cn_ocr.py", line 134, in _get_module
mod = load_module(prefix, self._model_epoch, data_names, data_shapes, network=network)
File "C:\ProgramData\Anaconda3\lib\site-packages\cnocr\cn_ocr.py", line 90, in load_module
mod.bind(for_training=False, data_shapes=data_shapes)
File "C:\ProgramData\Anaconda3\lib\site-packages\mxnet\module\module.py", line 429, in bind
state_names=self._state_names)
File "C:\ProgramData\Anaconda3\lib\site-packages\mxnet\module\executor_group.py", line 279, in init
self.bind_exec(data_shapes, label_shapes, shared_group)
File "C:\ProgramData\Anaconda3\lib\site-packages\mxnet\module\executor_group.py", line 375, in bind_exec
shared_group))
File "C:\ProgramData\Anaconda3\lib\site-packages\mxnet\module\executor_group.py", line 662, in _bind_ith_exec
shared_buffer=shared_data_arrays, **input_shapes)
File "C:\ProgramData\Anaconda3\lib\site-packages\mxnet\symbol\symbol.py", line 1529, in simple_bind
raise RuntimeError(error_msg)
RuntimeError: simple_bind error. Arguments:
data: (128, 1, 32, 280)
[21:51:57] c:\jenkins\workspace\mxnet-tag\mxnet\src\executor../common/exec_utils.h:392: InferShape pass cannot decide shapes for the following arguments (0s means unknown dimensions). Please consider providing them as inputs:
l0_init_h: [], l2_init_h: [], l1_init_h: [], l3_init_h: [],

Multi-line text recognition problem

I looked at your implementation. When a document's line spacing is very small, so small that adjacent lines occasionally touch, the lines cannot be separated. I tried the water-drop (dripping) algorithm for segmentation, but the results were not good either. Is there a better algorithm for this case? Another idea is to train a dedicated segmentation model, but I could not find a usable training set for that.

Problem using cnocr on Windows

[17:12:04] C:\Jenkins\workspace\mxnet-tag\mxnet\src\nnvm\legacy_json_util.cc:209: Loading symbol saved by previous version v1.3.1. Attempting to upgrade...
[17:12:04] C:\Jenkins\workspace\mxnet-tag\mxnet\src\nnvm\legacy_json_util.cc:217: Symbol successfully upgraded!
Traceback (most recent call last):
File "C:\Program Files\Anaconda3\lib\site-packages\mxnet\symbol\symbol.py", line 1523, in simple_bind
ctypes.byref(exe_handle)))
File "C:\Program Files\Anaconda3\lib\site-packages\mxnet\base.py", line 252, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [17:12:04] c:\jenkins\workspace\mxnet-tag\mxnet\src\executor../common/exec_utils.h:392: InferShape pass cannot decide shapes for the following arguments (0s means unknown dimensions). Please consider providing them as inputs:
l0_init_h: [], l2_init_h: [], l1_init_h: [], l3_init_h: [],

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:/文件/OCR项目/新OCR项目/OCR/testocr.py", line 4, in
ocr = CnOcr()
File "D:\文件\OCR项目\新OCR项目\OCR\cnocr2\cnocr\cn_ocr.py", line 107, in init
self._mod = self._get_module(self._hp)
File "D:\文件\OCR项目\新OCR项目\OCR\cnocr2\cnocr\cn_ocr.py", line 134, in _get_module
mod = load_module(prefix, self._model_epoch, data_names, data_shapes, network=network)
File "D:\文件\OCR项目\新OCR项目\OCR\cnocr2\cnocr\cn_ocr.py", line 90, in load_module
mod.bind(for_training=False, data_shapes=data_shapes)
File "C:\Program Files\Anaconda3\lib\site-packages\mxnet\module\module.py", line 429, in bind
state_names=self._state_names)
File "C:\Program Files\Anaconda3\lib\site-packages\mxnet\module\executor_group.py", line 279, in init
self.bind_exec(data_shapes, label_shapes, shared_group)
File "C:\Program Files\Anaconda3\lib\site-packages\mxnet\module\executor_group.py", line 375, in bind_exec
shared_group))
File "C:\Program Files\Anaconda3\lib\site-packages\mxnet\module\executor_group.py", line 662, in _bind_ith_exec
shared_buffer=shared_data_arrays, **input_shapes)
File "C:\Program Files\Anaconda3\lib\site-packages\mxnet\symbol\symbol.py", line 1529, in simple_bind
raise RuntimeError(error_msg)
RuntimeError: simple_bind error. Arguments:
data: (128, 1, 32, 280)
[17:12:04] c:\jenkins\workspace\mxnet-tag\mxnet\src\executor../common/exec_utils.h:392: InferShape pass cannot decide shapes for the following arguments (0s means unknown dimensions). Please consider providing them as inputs:
l0_init_h: [], l2_init_h: [], l1_init_h: [], l3_init_h: [],

Process finished with exit code 1
The error above occurs at mod.bind(for_training=False, data_shapes=data_shapes) in this function:
def load_module(prefix, epoch, data_names, data_shapes, network=None):
    """
    Loads the model from checkpoint specified by prefix and epoch, binds it
    to an executor, and sets its parameters and returns a mx.mod.Module
    """
    sym, arg_params, aux_params = mx.model.load_checkpoint(prefix, epoch)
    if network is not None:
        sym = network

    # We don't need CTC loss for prediction, just a simple softmax will suffice.
    # We get the output of the layer just before the loss layer ('pred_fc') and add softmax on top
    pred_fc = sym.get_internals()['pred_fc_output']
    sym = mx.sym.softmax(data=pred_fc)

    mod = mx.mod.Module(symbol=sym, context=mx.cpu(), data_names=data_names, label_names=None)
    mod.bind(for_training=False, data_shapes=data_shapes)
    mod.set_params(arg_params, aux_params, allow_missing=False)
    return mod

Error when the image contains unrecognizable characters

Python version: 3.7.2
OS: Windows 10
There is a garbled character in the image; is that the cause?

Traceback (most recent call last):
File "E:/PycharmProjects/mhxy/main.py", line 9, in
ocr = CnOcr()
File "C:\Users\63110\AppData\Local\Programs\Python\Python37\lib\site-packages\cnocr\cn_ocr.py", line 83, in init
self._alphabet, _ = read_charset(os.path.join(self._model_dir, 'label_cn.txt'))
File "C:\Users\63110\AppData\Local\Programs\Python\Python37\lib\site-packages\cnocr\utils.py", line 65, in read_charset
for line in fp:
UnicodeDecodeError: 'gbk' codec can't decode byte 0x8c in position 10: illegal multibyte sequence

the format of data_root, train_file and test_file

Thanks for your great work.
I want to train the network on my own data, but I don't know the format of the training set. Can you tell me the format of the image names and the train txt file?
Also, is there any requirement on image size?
Thanks

Memory release problem

I wrote a web service that exposes a recognition API. The code is below.

import web
import json
import numpy as np  # needed by NumpyEncoder below
from cnocr import CnOcr
urls = ('/upload', 'Upload')

class Upload:
    ocr = CnOcr()
    def GET(self):
        return """<html><head></head><body>
<form method="POST" enctype="multipart/form-data" action="">
<input type="file" name="myfile" />
<br/>
<input type="submit" />
</form>
</body></html>"""

    def POST(self):
        x = web.input(myfile={})
        filedir = './upload_file' # change this to the directory you want to store the file in.
        if 'myfile' in x: # to check if the file-object is created
            filepath=x.myfile.filename.replace('\\','/') # replaces the windows-style slashes with linux ones.
            filename=filepath.split('/')[-1] # splits the and chooses the last part (the filename with extension)
            fout = open(filedir +'/'+ filename,'wb') # creates the file where the uploaded file should be stored
            fout.write(x.myfile.file.read()) # writes the uploaded file to the newly created file.
            fout.close() # closes the file, upload complete.
            resultData = Upload.ocr.ocr( filedir + '/' + filename )
            jsonStr=json.dumps(resultData, cls=NumpyEncoder)
        return jsonStr;


if __name__ == "__main__":
   app = web.application(urls, globals())
   app.run()

class NumpyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, (np.int_, np.intc, np.intp, np.int8,
            np.int16, np.int32, np.int64, np.uint8,
            np.uint16, np.uint32, np.uint64)):
            return int(obj)
        elif isinstance(obj, (np.float_, np.float16, np.float32,
            np.float64)):
            return float(obj)
        elif isinstance(obj,(np.ndarray,)): #### This is the fix
            return obj.tolist()
        return json.JSONEncoder.default(self, obj)

The problem is that as I keep submitting images for recognition, memory usage keeps climbing, from about 1 GB initially to 3 GB, and that is after recognizing only a dozen or so images. Will memory usage keep growing as the number of recognized images increases?

Is the problem the way I wrote my web service, or something else? Shouldn't the memory be released after each recognition?
Any guidance would be appreciated.

Recognition speed is slow

This project is easy to install and use, but recognition is rather slow. At first I called the multi-line detection function, which was indeed slow; after switching to single-line detection the speed improved, but it is still quite a bit slower than the original crnn-mxnet. There is also no batch recognition support, which makes it slow in real use.

cannot import name 'CnOcr'

As instructed, I installed cnocr with pip, and pip list shows the cnocr package, but when I run my script I get the following error. How can I fix this?
from cnocr import CnOcr
ImportError: cannot import name 'CnOcr'

Training my own model with v1.0.0, accuracy still stays at 0?

2019-08-25 15:34:55,719 Epoch[0] Batch [0-50] Speed: 206.17 samples/sec accuracy=0.000000
2019-08-25 15:35:27,459 Epoch[0] Batch [50-100] Speed: 201.66 samples/sec accuracy=0.000000
2019-08-25 15:35:58,847 Epoch[0] Batch [100-150] Speed: 203.91 samples/sec accuracy=0.000000
2019-08-25 15:36:56,625 Epoch[0] Batch [150-200] Speed: 110.77 samples/sec accuracy=0.000000
2019-08-25 15:38:04,535 Epoch[0] Batch [200-250] Speed: 94.24 samples/sec accuracy=0.000000
2019-08-25 15:39:18,125 Epoch[0] Batch [250-300] Speed: 86.97 samples/sec accuracy=0.000000
2019-08-25 15:40:42,457 Epoch[0] Batch [300-350] Speed: 75.89 samples/sec accuracy=0.000000
2019-08-25 15:42:04,507 Epoch[0] Batch [350-400] Speed: 78.00 samples/sec accuracy=0.000000
2019-08-25 15:43:11,145 Epoch[0] Batch [400-450] Speed: 96.04 samples/sec accuracy=0.000000
2019-08-25 15:44:13,990 Epoch[0] Batch [450-500] Speed: 101.84 samples/sec accuracy=0.000000
2019-08-25 15:45:11,138 Epoch[0] Batch [500-550] Speed: 111.99 samples/sec accuracy=0.000000
2019-08-25 15:47:05,598 Epoch[0] Batch [550-600] Speed: 206.19 samples/sec accuracy=0.000000
2019-08-25 15:47:37,091 Epoch[0] Batch [600-650] Speed: 203.22 samples/sec accuracy=0.000000
2019-08-25 15:48:08,517 Epoch[0] Batch [650-700] Speed: 203.65 samples/sec accuracy=0.000000
As the title says, is this because I trained for too few epochs? If so, roughly how many epochs did it take before accuracy started to change when you trained your own models?

Continued training

If I prepare my own training data (not including the 3.6M-sample training set) and continue training on top of the existing model, will it noticeably hurt the model's overall recognition quality? For example, could characters that the original model recognized correctly become misrecognized after continued training?

Training with cnocr_train.py, accuracy stays at 0

Using samples from crnn-mxnet's Synthetic Chinese Dataset, I run:
python ./cnocr-master/scripts/cnocr_train.py --loss ctc --dataset cn_ocr --data_root ./data/images --train_file ./data/train.txt --test_file ./data/test.txt

cnocr_train/cnocr-master# proc 0 started
proc 1 started
proc 2 started
proc 3 started
proc 0 started
proc 1 started
[16:56:12] src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:97: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)
2019-05-29 16:56:46,983 Epoch[0] Batch [0-50] Speed: 199.47 samples/sec accuracy=0.000000
2019-05-29 16:57:30,919 Epoch[0] Batch [50-100] Speed: 145.66 samples/sec accuracy=0.000000
2019-05-29 16:58:17,541 Epoch[0] Batch [100-150] Speed: 137.27 samples/sec accuracy=0.000000
2019-05-29 16:59:03,438 Epoch[0] Batch [150-200] Speed: 139.44 samples/sec accuracy=0.000000
2019-05-29 16:59:51,143 Epoch[0] Batch [200-250] Speed: 134.16 samples/sec accuracy=0.000000
2019-05-29 17:00:39,957 Epoch[0] Batch [250-300] Speed: 131.11 samples/sec accuracy=0.000000
2019-05-29 17:01:13,673 Epoch[0] Batch [300-350] Speed: 189.82 samples/sec accuracy=0.000000
2019-05-29 17:01:56,561 Epoch[0] Batch [350-400] Speed: 149.23 samples/sec accuracy=0.000000
2019-05-29 17:02:39,406 Epoch[0] Batch [400-450] Speed: 149.37 samples/sec accuracy=0.000000
2019-05-29 17:03:24,134 Epoch[0] Batch [450-500] Speed: 143.09 samples/sec accuracy=0.000000
2019-05-29 17:04:08,365 Epoch[0] Batch [500-550] Speed: 144.69 samples/sec accuracy=0.000000
2019-05-29 17:04:52,112 Epoch[0] Batch [550-600] Speed: 146.30 samples/sec accuracy=0.000000
2019-05-29 17:05:40,416 Epoch[0] Batch [600-650] Speed: 132.49 samples/sec accuracy=0.000000
2019-05-29 17:06:27,831 Epoch[0] Batch [650-700] Speed: 134.98 samples/sec accuracy=0.000000
2019-05-29 17:07:03,921 Epoch[0] Batch [700-750] Speed: 177.34 samples/sec accuracy=0.000000
2019-05-29 17:07:47,872 Epoch[0] Batch [750-800] Speed: 145.62 samples/sec accuracy=0.000000
2019-05-29 17:08:34,820 Epoch[0] Batch [800-850] Speed: 136.32 samples/sec accuracy=0.000000
2019-05-29 17:09:24,825 Epoch[0] Batch [850-900] Speed: 127.99 samples/sec accuracy=0.000000
2019-05-29 17:10:00,173 Epoch[0] Batch [900-950] Speed: 181.06 samples/sec accuracy=0.000000
2019-05-29 17:10:43,914 Epoch[0] Batch [950-1000] Speed: 146.32 samples/sec accuracy=0.000000
2019-05-29 17:11:29,664 Epoch[0] Batch [1000-1050] Speed: 139.89 samples/sec accuracy=0.000000
2019-05-29 17:12:18,869 Epoch[0] Batch [1050-1100] Speed: 130.07 samples/sec accuracy=0.000000
2019-05-29 17:12:54,773 Epoch[0] Batch [1100-1150] Speed: 178.25 samples/sec accuracy=0.000000
2019-05-29 17:13:45,715 Epoch[0] Batch [1150-1200] Speed: 125.63 samples/sec accuracy=0.000000
2019-05-29 17:14:16,195 Epoch[0] Batch [1200-1250] Speed: 209.97 samples/sec accuracy=0.000000
2019-05-29 17:14:58,715 Epoch[0] Batch [1250-1300] Speed: 150.52 samples/sec accuracy=0.000000
2019-05-29 17:15:41,078 Epoch[0] Batch [1300-1350] Speed: 151.08 samples/sec accuracy=0.000000
2019-05-29 17:16:25,221 Epoch[0] Batch [1350-1400] Speed: 144.98 samples/sec accuracy=0.000000
2019-05-29 17:17:09,735 Epoch[0] Batch [1400-1450] Speed: 143.78 samples/sec accuracy=0.000000
2019-05-29 17:17:57,844 Epoch[0] Batch [1450-1500] Speed: 133.03 samples/sec accuracy=0.000000
2019-05-29 17:18:35,149 Epoch[0] Batch [1500-1550] Speed: 171.56 samples/sec accuracy=0.000000
2019-05-29 17:19:22,321 Epoch[0] Batch [1550-1600] Speed: 135.67 samples/sec accuracy=0.000000
2019-05-29 17:19:57,424 Epoch[0] Batch [1600-1650] Speed: 182.32 samples/sec accuracy=0.000000
2019-05-29 17:20:38,650 Epoch[0] Batch [1650-1700] Speed: 155.24 samples/sec accuracy=0.000000
2019-05-29 17:21:23,728 Epoch[0] Batch [1700-1750] Speed: 141.98 samples/sec accuracy=0.000000
2019-05-29 17:22:11,530 Epoch[0] Batch [1750-1800] Speed: 133.89 samples/sec accuracy=0.000000
2019-05-29 17:22:52,154 Epoch[0] Batch [1800-1850] Speed: 157.54 samples/sec accuracy=0.000000
2019-05-29 17:23:38,767 Epoch[0] Batch [1850-1900] Speed: 137.30 samples/sec accuracy=0.000000
2019-05-29 17:24:13,728 Epoch[0] Batch [1900-1950] Speed: 183.07 samples/sec accuracy=0.000000
2019-05-29 17:24:59,449 Epoch[0] Batch [1950-2000] Speed: 139.98 samples/sec accuracy=0.000000
2019-05-29 17:25:45,785 Epoch[0] Batch [2000-2050] Speed: 138.12 samples/sec accuracy=0.000000
2019-05-29 17:26:28,279 Epoch[0] Batch [2050-2100] Speed: 150.61 samples/sec accuracy=0.000000

Problem using cnocr on Windows

raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [16:16:44] c:\jenkins\workspace\mxnet-tag\mxnet\src\executor../common/exec_utils.h:392: InferShape pass cannot decide shapes for the following arguments (0s means unknown dimensions). Please consider providing them as inputs:
l0_init_h: [], l2_init_h: [], l1_init_h: [], l3_init_h: [],
Does anyone know what the problem is?

Recognition accuracy on A4 scans is too low

I find that recognizing a small snippet of text captured with a screenshot tool works fine, but when I feed in a full A4 scan, almost nothing is recognized.
The images are attached below:
img016
img013
img014
img015
