szad670401 / end-to-end-for-chinese-plate-recognition
Multi-label classification: end-to-end Chinese license plate recognition based on MXNet.
I'm trying to reimplement the model in Keras and have some questions about this model:
https://github.com/szad670401/end-to-end-for-chinese-plate-recognition/blob/master/train.py#L108
For each character position there is one fc2n output. In your case there are n = 7 of them, and num_hidden = 65 is the number of unique characters/digits in the dictionary.
So, as I understand it, the output is a 7x65 matrix (rows x columns), and in the label each row contains a single 1.0 with all other values 0.0 (one-hot encoding).
I'm not sure how to deal with a matrix, because in ordinary classification the output is a vector with softmax + categorical_crossentropy on top.
And what if the positions use different alphabets, for example digits ['0','1','2','3','4','5','6','7','8','9'] (num_hidden = 10) and letters ['A','B','C'] (num_hidden = 3)? How do you concatenate vectors of length 3 and 10 into a single matrix?
Can you elaborate on this?
This project also seems very similar:
https://github.com/apache/incubator-mxnet/blob/master/example/captcha/mxnet_captcha.R#L13
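One way to read the 7x65 output, sketched here under assumptions rather than taken from train.py: each of the 7 rows is treated as an independent 65-way softmax, and the total loss is the sum of the per-row cross-entropies. In Keras this would presumably map to one softmax output per position, each with its own categorical cross-entropy. If positions use different alphabets (10 digits, 3 letters), the heads can simply have different sizes, so nothing has to be concatenated into one matrix. The function names below are hypothetical, and the code is plain Python for illustration only.

```python
import math

def softmax(logits):
    """Numerically stable softmax over the logits of one position."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def multi_head_loss(logits_per_position, target_indices):
    """Sum of per-position cross-entropies.

    logits_per_position: list of lists; rows may have different lengths.
    target_indices: the correct class index for each position.
    """
    loss = 0.0
    for logits, target in zip(logits_per_position, target_indices):
        probs = softmax(logits)
        loss += -math.log(probs[target])
    return loss

# Two heads of different sizes: a 10-way digit head and a 3-way letter head.
digit_logits = [0.0] * 10
letter_logits = [0.0] * 3
example_loss = multi_head_loss([digit_logits, letter_logits], [4, 1])
```

With uniform logits each head contributes the log of its class count, so the two heads above contribute log(10) and log(3) respectively.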
When generating plates, the letter W is visibly cut off. What causes this?
cv2.error: OpenCV(3.4.1) /io/opencv/modules/imgproc/src/resize.cpp:4044: error: (-215) ssize.width > 0 && ssize.height > 0 in function resize
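The error is shown above. I'm not sure whether it is related to the OpenCV version.
On Ubuntu 16.04, after downloading and unzipping the archive, running python test.py fails: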
python test.py
(30, 120, 3)
(3, 30, 120)
[18:02:27] include/dmlc/logging.h:235: [18:02:27] src/io/local_filesys.cc:154: Check failed: allow_null LocalFileSystem: fail to open "cnn-ocr-symbol.json"
Traceback (most recent call last):
File "test.py", line 84, in <module>
TestRecognizeOne(cv2.imread("./plate/01.jpg"))
File "test.py", line 59, in TestRecognizeOne
_, arg_params, __ = mx.model.load_checkpoint("cnn-ocr", 1)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.7.0-py2.7.egg/mxnet/model.py", line 372, in load_checkpoint
symbol = sym.load('%s-symbol.json' % prefix)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.7.0-py2.7.egg/mxnet/symbol.py", line 971, in load
check_call(_LIB.MXSymbolCreateFromFile(c_str(fname), ctypes.byref(handle)))
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.7.0-py2.7.egg/mxnet/base.py", line 77, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [18:02:27] src/io/local_filesys.cc:154: Check failed: allow_null LocalFileSystem: fail to open "cnn-ocr-symbol.json"
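For PIL I installed Pillow, and my Python version is 2.7. Which step went wrong?
I'm using Python 3.6 and it fails to run. What environment was this project run in?
It seems to be Python 2.x? Exactly what environment and software versions are required?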
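The traceback above shows load_checkpoint("cnn-ocr", 1) failing to open cnn-ocr-symbol.json in the working directory, and a training log elsewhere in this thread shows the checkpoint being saved as "cnn-ocr-0001.params". A plausible reading, stated as an assumption, is that the trained model files are simply not present where test.py is run. The helper names below are hypothetical; this is a small pre-flight check, not part of the repository.

```python
import os

def checkpoint_files(prefix, epoch):
    """File names that a prefix/epoch checkpoint pair is expected to use."""
    return ["%s-symbol.json" % prefix, "%s-%04d.params" % (prefix, epoch)]

def missing_checkpoint_files(prefix, epoch, directory="."):
    """Return the expected checkpoint files that are absent from directory."""
    return [
        name
        for name in checkpoint_files(prefix, epoch)
        if not os.path.isfile(os.path.join(directory, name))
    ]
```

Calling missing_checkpoint_files("cnn-ocr", 1) before loading the model would list whichever of the two files still has to be produced (for example by running train.py) or copied into place.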
Thanks for your contribution.
I downloaded the code and it runs fine for 8-character recognition. However, after I modified the code to recognize 14 or more characters, the training accuracy stays at zero, even after a dozen epochs of training with 10,000 images per epoch.
2017-08-07 18:16:11,402 - root - INFO - Epoch[11] Batch [100] Speed: 271.67 samples/sec accuracy=0.000000
2017-08-07 18:16:15,152 - root - INFO - Epoch[11] Batch [200] Speed: 266.71 samples/sec accuracy=0.000000
2017-08-07 18:16:18,915 - root - INFO - Epoch[11] Batch [300] Speed: 265.76 samples/sec accuracy=0.000000
2017-08-07 18:16:22,544 - root - INFO - Epoch[11] Batch [400] Speed: 275.58 samples/sec accuracy=0.000000
2017-08-07 18:16:26,258 - root - INFO - Epoch[11] Batch [500] Speed: 269.32 samples/sec accuracy=0.000000
2017-08-07 18:16:30,039 - root - INFO - Epoch[11] Batch [600] Speed: 264.47 samples/sec accuracy=0.000000
2017-08-07 18:16:33,699 - root - INFO - Epoch[11] Batch [700] Speed: 273.27 samples/sec accuracy=0.000000
2017-08-07 18:16:37,393 - root - INFO - Epoch[11] Batch [800] Speed: 270.71 samples/sec accuracy=0.000000
2017-08-07 18:16:41,054 - root - INFO - Epoch[11] Batch [900] Speed: 273.20 samples/sec accuracy=0.000000
I added some new code and raised the accuracy to 95%, but the test results are still poor. It's almost useless for real plates.
PC with Quadro 600(1GB)
2016-09-06 23:23:50,956 Epoch[0] Resetting Data Iterator
2016-09-06 23:23:50,956 Epoch[0] Time cost=13450.590
2016-09-06 23:24:08,369 Epoch[0] Validation-Accuracy=0.667429
2016-09-06 23:24:08,411 Saved checkpoint to "cnn-ocr-0001.params"
('\xe6\xb8\x9dQ8L3PC', [3, 55, 39, 51, 34, 54, 43])
In my project, the check "len(text) == 9" causes an error; the correct check is "len(text) == 7".
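A likely explanation for the 9 versus 7 discrepancy, offered as an assumption: the plate string printed in the log above is a UTF-8 byte string, in which the leading Chinese province character occupies 3 bytes. Counting bytes gives 9, while counting characters gives 7. The snippet below reproduces both counts in Python 3 using the exact bytes from the log.

```python
raw = b"\xe6\xb8\x9dQ8L3PC"  # byte string copied from the log above
text = raw.decode("utf-8")   # decode into a character string

byte_length = len(raw)    # counts bytes
char_length = len(text)   # counts characters
```

Under Python 2, len() on a plain str counts bytes, so a length check written against 9 only holds when the text has not been decoded to unicode.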
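Hello, I have used this kind of end-to-end model before. For fixed-length sequences it can indeed reach state-of-the-art results, but how do you handle variable-length plates? Armed police plates, for example, can have either 7 or 8 characters. I also tried CTC and found that it did not perform well under strict latency requirements. Could we exchange ideas? Thanks.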
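One common workaround when the number of output heads is fixed, not confirmed to be what the author does: pad shorter plates with a dedicated blank class up to the maximum length and drop the blanks when decoding. The blank index and function names below are hypothetical.

```python
BLANK = 65  # hypothetical extra class index, one past the 65 real classes

def pad_labels(indices, max_len=8, blank=BLANK):
    """Pad a label sequence with the blank class up to max_len."""
    if len(indices) > max_len:
        raise ValueError("sequence longer than max_len")
    return list(indices) + [blank] * (max_len - len(indices))

def strip_blanks(indices, blank=BLANK):
    """Remove blank entries from a decoded sequence."""
    return [i for i in indices if i != blank]

# A 7-character plate (indices taken from the log above) padded to 8 slots.
padded = pad_labels([3, 55, 39, 51, 34, 54, 43])
```

This keeps the model a plain multi-head classifier with 8 heads of 66 classes each, at the cost of one extra class per head.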
Hello, I have a question.
def get_ocrnet():
fc1 = mx.symbol.FullyConnected(data = flatten, num_hidden = 120)
fc21 = mx.symbol.FullyConnected(data = fc1, num_hidden = 65)
Do the 65 and 120 in the code have specific meanings? Does 65 stand for 31 provinces, 24 uppercase letters (excluding O and I), and 10 digits?
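The arithmetic in the question is at least self-consistent, as the short check below shows. Whether the repository's dictionary is composed exactly this way is an assumption here, not something confirmed from the code.

```python
import string

# Uppercase letters without I and O, as proposed in the question.
letters = [c for c in string.ascii_uppercase if c not in "IO"]
digits = list(string.digits)

# Classes left over for province characters if the total is 65.
province_count = 65 - len(letters) - len(digits)
```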
The number of labels per plate is 7, but the accuracy function uses 4.
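If an accuracy metric groups predictions by a hard-coded 4 while each plate carries 7 labels, the groups no longer line up with plates. Below is a sketch of whole-plate accuracy with the group size as a parameter. The flat, sample-major layout of the inputs is an assumption and may differ from how train.py arranges its arrays.

```python
def sequence_accuracy(predicted, expected, seq_len=7):
    """Fraction of samples whose seq_len characters are all correct.

    predicted, expected: flat lists of class indices, seq_len per sample.
    """
    if len(predicted) != len(expected) or len(predicted) % seq_len != 0:
        raise ValueError("inputs must have equal length, a multiple of seq_len")
    n_samples = len(predicted) // seq_len
    if n_samples == 0:
        return 0.0
    hits = 0
    for s in range(n_samples):
        start = s * seq_len
        # A sample counts only if every position matches.
        if predicted[start:start + seq_len] == expected[start:start + seq_len]:
            hits += 1
    return hits / n_samples
```

Because a sample only counts when every position is correct, this kind of metric can stay at 0 for a long time on longer sequences even while per-character accuracy is improving.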
...It's basically the original code, unchanged. I don't understand why it won't run.
self.bg = cv2.resize(cv2.imread("./images/template.bmp"),(226,70));
cv2.error: OpenCV(3.4.3) C:\projects\opencv-python\opencv\modules\imgproc\src\resize.cpp:4044: error: (-215:Assertion failed) !ssize.empty() in function 'cv::resize'
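This assertion typically fires when the image handed to cv2.resize is empty. cv2.imread returns None instead of raising when a file cannot be read, so a wrong relative path or working directory only shows up later, inside resize. The sketch below wraps the read in an explicit check. The reader is passed in as a parameter so the helper, whose name is hypothetical, does not depend on OpenCV itself.

```python
def read_image(path, reader):
    """Load an image with `reader` and fail loudly if nothing was loaded.

    `reader` is any callable that, like cv2.imread, returns None on failure.
    """
    image = reader(path)
    if image is None:
        raise IOError(
            "could not read image: %r (check the path and working directory)" % path
        )
    return image

# Hypothetical usage with OpenCV installed:
#   import cv2
#   bg = cv2.resize(read_image("./images/template.bmp", cv2.imread), (226, 70))
```

If this raises, the problem is the file path rather than the OpenCV version.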
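Hello, how did you deal with the poor accuracy on real plates of a network trained on generated plates? Could you explain the approach in more detail? Your introduction mentions it only briefly.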
Hi,
Could I get the training dataset, if possible?
Thanks a million. :)
License plate recognition dataset covering many plate types, with balanced and ample samples for every province: qq1668486259
@szad670401
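How do you generate plates with a GAN? Which model did you use?
Thanks for sharing. I'd like to ask how to adjust the character color of the simulated plates. I want to change the font color to black. I tried modifying the parameters of the font function, but the resulting color differs from what I expected. Is some preprocessing being applied?
How is this implemented? Do you use a GAN to generate data that looks more like real samples and train on it?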
[16:28:34] ./dmlc-core/include/dmlc/logging.h:208: [16:28:34] src/io/local_filesys.cc:149: Check failed: allow_null LocalFileSystem: fail to open "cnn-ocr-symbol.json"
Traceback (most recent call last):
File "/opt/github/end-to-end-for-chinese-plate-recognition/test.py", line 84, in <module>
TestRecognizeOne(cv2.imread("./plate/01.jpg"))
File "/opt/github/end-to-end-for-chinese-plate-recognition/test.py", line 59, in TestRecognizeOne
_, arg_params, __ = mx.model.load_checkpoint("cnn-ocr", 1)
File "/usr/local/lib/python2.7/site-packages/mxnet/model.py", line 437, in load_checkpoint
symbol = sym.load('%s-symbol.json' % prefix)
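I trained on 50,000 plate images. The loss for letters and digits converges to a small value, but the Chinese characters converge slowly and their recognition accuracy is hard to improve. Any suggestions?
I can provide first-hand, commercial-grade datasets for plate detection and plate recognition. An introduction to the data is at: https://www.bilibili.com/video/bv1kU4y1e7Sb
qq:1041357701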
I ran train.py. However, I got the following output:
[14:21:51] src/operator/tensor/./matrix_op-inl.h:144: Using target_shape will be deprecated.
[14:21:51] src/operator/tensor/./matrix_op-inl.h:144: Using target_shape will be deprecated.
2017-06-28 14:21:59,105 Epoch[0] Batch [50] Speed: 51.16 samples/sec Accuracy=0.000000
2017-06-28 14:22:07,049 Epoch[0] Batch [100] Speed: 50.35 samples/sec Accuracy=0.000000
2017-06-28 14:22:15,034 Epoch[0] Batch [150] Speed: 50.09 samples/sec Accuracy=0.000000
2017-06-28 14:22:23,187 Epoch[0] Batch [200] Speed: 49.07 samples/sec Accuracy=0.000000
2017-06-28 14:22:31,466 Epoch[0] Batch [250] Speed: 48.31 samples/sec Accuracy=0.000000
2017-06-28 14:22:39,868 Epoch[0] Batch [300] Speed: 47.61 samples/sec Accuracy=0.000000
2017-06-28 14:22:48,231 Epoch[0] Batch [350] Speed: 47.83 samples/sec Accuracy=0.000000
2017-06-28 14:22:56,492 Epoch[0] Batch [400] Speed: 48.42 samples/sec Accuracy=0.000000
....
Does anyone know the reason? How to solve it? Thanks!
Is your label format n one-hot vectors concatenated together? For example, for the four digits [1,2,3,4], the label would be [[1,0000000],[0,1,000000].....[000,1,00000]], so that each sample sums to 4?
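The format described in the question can be sketched as follows, assuming 10 classes indexed 0 to 9 with digit d mapped to index d. Whether train.py actually stores one-hot rows or plain class indices is not confirmed here; softmax outputs are often trained against integer class indices instead, in which case no one-hot matrix is needed at all.

```python
def one_hot_labels(indices, num_classes):
    """Encode each class index as a one-hot row."""
    rows = []
    for idx in indices:
        if not 0 <= idx < num_classes:
            raise ValueError("class index out of range: %r" % idx)
        row = [0] * num_classes
        row[idx] = 1
        rows.append(row)
    return rows

# Four positions, ten classes: a 4x10 label matrix.
label = one_hot_labels([1, 2, 3, 4], 10)
row_sums = [sum(row) for row in label]
total = sum(row_sums)
```

Each row sums to 1, so the whole label sums to the number of positions, which matches the "each sample sums to 4" reading in the question.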
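How long did training on 500,000 images take? Will a trained model be released later?
Without changing any code.
The only two plate recognition approaches I know of are segmenting and then recognizing individual characters, or using CTC for sequence recognition. I don't understand which method this code uses. What does multi-label classification mean?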
After I changed dev in FeedForward to mx.gpu(), I get this error:
mxnet.base.MXNetError: [16:44:09] src/imperative/imperative.cc:78: Operator _zeros is not implemented for GPU.
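Could anyone who has run this on a GPU help?
Hello, does this multi-label classification method work for vertical text?
How should vertical text be handled? Assume the text to recognize has a fixed length of just 3 characters, drawn from no more than 50 distinct characters.
I read the README, but I don't understand how you did it.
Hello, I'm working on recognizing double-row plates, but my dataset is limited. Do you have code for generating double-row plates?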