dee1024 / pytorch-captcha-recognition
An "end-to-end" captcha recognition model trained with a CNN, using deep learning, training data, and a lot of compute. Recognition accuracy reaches 99.99% on digits only and 96% on digits plus letters.
License: Apache License 2.0
When deploying the model to a server, I need to read the image file that the client POSTs. I tried the following:
image_root = './dataset/predict/1129_1554564483.jpg'
image = Image.open(image_root)
image = image.resize((60, 160))
if transform is not None:
    image = transform(image)
vimage = Variable(image.cuda())
predict_label = cnn(vimage.cuda())
c0 = captcha_setting.ALL_CHAR_SET[np.argmax(predict_label[0, 0:captcha_setting.ALL_CHAR_SET_LEN].data.cpu().numpy())]
c1 = captcha_setting.ALL_CHAR_SET[np.argmax(predict_label[0, captcha_setting.ALL_CHAR_SET_LEN:2 * captcha_setting.ALL_CHAR_SET_LEN].data.cpu().numpy())]
c2 = captcha_setting.ALL_CHAR_SET[np.argmax(predict_label[0, 2 * captcha_setting.ALL_CHAR_SET_LEN:3 * captcha_setting.ALL_CHAR_SET_LEN].data.cpu().numpy())]
c3 = captcha_setting.ALL_CHAR_SET[np.argmax(predict_label[0, 3 * captcha_setting.ALL_CHAR_SET_LEN:4 * captcha_setting.ALL_CHAR_SET_LEN].data.cpu().numpy())]
predict_label = '%s%s%s%s' % (c0, c1, c2, c3)
print(predict_label)
to process my image, but I ran into an input dimension mismatch. However, when I feed the images through the dataloader from the test script:
for i, (images, labels) in enumerate(test_dataloader):
    image = images
    vimage = Variable(image.cuda())
    predict_label = cnn(vimage.cuda())
    c0 = captcha_setting.ALL_CHAR_SET[np.argmax(predict_label[0, 0:captcha_setting.ALL_CHAR_SET_LEN].data.cpu().numpy())]
    c1 = captcha_setting.ALL_CHAR_SET[np.argmax(predict_label[0, captcha_setting.ALL_CHAR_SET_LEN:2 * captcha_setting.ALL_CHAR_SET_LEN].data.cpu().numpy())]
    c2 = captcha_setting.ALL_CHAR_SET[np.argmax(predict_label[0, 2 * captcha_setting.ALL_CHAR_SET_LEN:3 * captcha_setting.ALL_CHAR_SET_LEN].data.cpu().numpy())]
    c3 = captcha_setting.ALL_CHAR_SET[np.argmax(predict_label[0, 3 * captcha_setting.ALL_CHAR_SET_LEN:4 * captcha_setting.ALL_CHAR_SET_LEN].data.cpu().numpy())]
    predict_label = '%s%s%s%s' % (c0, c1, c2, c3)
    true_label = one_hot_encoding.decode(labels.numpy()[0])
    print("predict: %s true: %s" % (predict_label, true_label))
no problem occurs at all. How should I handle this?
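A likely cause, offered as a sketch rather than a confirmed diagnosis: a DataLoader yields batches shaped [N, C, H, W], while a single transformed image is [C, H, W], so the model sees one dimension too few. In PyTorch, `image.unsqueeze(0)` adds the missing batch dimension. Note also that PIL's `resize` takes `(width, height)`, so `(60, 160)` produces an image 60 wide and 160 high; if the model was trained on 160x60 captchas (an assumption, based on the numbers in the question), the call would be `image.resize((160, 60))`. The decoding step can then be a loop instead of four copied lines. The helper below uses plain Python lists so it runs without PyTorch; `decode_scores` and `CHARSET` are hypothetical names standing in for the repository's `captcha_setting.ALL_CHAR_SET`.

```python
import string

# Hypothetical stand-in for captcha_setting.ALL_CHAR_SET (digits + uppercase).
CHARSET = list(string.digits + string.ascii_uppercase)


def decode_scores(scores, charset=CHARSET, n_chars=4):
    """Pick the highest-scoring character in each block of len(charset) scores.

    scores is the flat model output for ONE image, e.g.
    predict_label[0].tolist() after running cnn(image.unsqueeze(0)).
    """
    size = len(charset)
    if len(scores) != n_chars * size:
        raise ValueError("expected %d scores, got %d" % (n_chars * size, len(scores)))
    chars = []
    for pos in range(n_chars):
        block = scores[pos * size:(pos + 1) * size]
        best = max(range(size), key=block.__getitem__)  # argmax of the block
        chars.append(charset[best])
    return "".join(chars)


# Build a fake output vector whose peaks spell "1129".
fake = [0.0] * (4 * len(CHARSET))
for pos, ch in enumerate("1129"):
    fake[pos * len(CHARSET) + CHARSET.index(ch)] = 5.0
print(decode_scores(fake))  # prints 1129
```

The length check is deliberate: a wrong number of scores usually means the charset or captcha length in the settings differs from what the model was trained with.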
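The original one-hot encoding relies too heavily on hard-coded numbers when computing positions, so every change to the image size may require a matching change to the encoding algorithm, and decoding depends on those numbers as well. Addressing both points, the new one-hot code is as follows: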
import numpy
CHARACTER = {'0': 0, '1': 1, '2': 2, '3': 3, '4': 4, '5': 5, '6': 6, '7': 7, '8': 8, '9': 9,
'A': 10, 'B': 11, 'C': 12, 'D': 13, 'E': 14, 'F': 15, 'G': 16, 'H': 17, 'I': 18,
'J': 19, 'K': 20, 'L': 21, 'M': 22, 'N': 23, 'O': 24, 'P': 25, 'Q': 26, 'R': 27,
'S': 28, 'T': 29, 'U': 30, 'V': 31, 'W': 32, 'X': 33, 'Y': 34, 'Z': 35}
CAPTCHA_NUMBER = 6
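def one_hot_encode(value: str) -> tuple:
    """Encode a captcha string as a one-hot vector.

    Returns (vector, order): vector is the one-hot encoding, and order
    lists the index of each 1, which is what one_hot_decode consumes.
    """
    order = []
    vector = numpy.zeros(CAPTCHA_NUMBER * len(CHARACTER), dtype=float)
    for k, v in enumerate(value):
        # Position k owns one block of len(CHARACTER) slots;
        # an unknown character raises KeyError here.
        index = k * len(CHARACTER) + CHARACTER[v]
        vector[index] = 1.0
        order.append(index)
    return vector, order


def one_hot_decode(value: list) -> str:
    """Decode a list of one-hot indices back into the captcha string."""
    res = []
    for ik, iv in enumerate(value):
        val = iv - ik * len(CHARACTER)
        for k, v in CHARACTER.items():
            if val == v:
                res.append(k)
                break
    return "".join(res)

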
if __name__ == '__main__':
    code = '0A2JYD'
    vec, orders = one_hot_encode(code)
    print(orders)
    print(vec)
    print(one_hot_decode(orders))
Running it produces the following output:
[0, 46, 74, 127, 178, 193]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
0A2JYD
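The index arithmetic can also be inverted directly with `divmod`, which avoids searching the dictionary for every character. This is an alternative sketch, not the code from the post; `CHARS`, `encode_indices`, and `decode_indices` are names chosen here for illustration, and the character order is assumed to match the `CHARACTER` mapping (digits first, then A to Z).

```python
import string

CHARS = string.digits + string.ascii_uppercase  # assumed order: 0-9, then A-Z


def encode_indices(text):
    """Return the flat one-hot index of each character: position * 36 + char id."""
    return [pos * len(CHARS) + CHARS.index(ch) for pos, ch in enumerate(text)]


def decode_indices(indices):
    """Invert encode_indices: the remainder modulo 36 is the character id."""
    out = []
    for pos, idx in enumerate(indices):
        block, char_id = divmod(idx, len(CHARS))
        if block != pos:
            raise ValueError("index %d does not belong to position %d" % (idx, pos))
        out.append(CHARS[char_id])
    return "".join(out)


indices = encode_indices("0A2JYD")
print(indices)                  # [0, 46, 74, 127, 178, 193]
print(decode_indices(indices))  # 0A2JYD
```

For example, 'A' in position 1 lands at 1 * 36 + 10 = 46, and `divmod(46, 36)` returns `(1, 10)`, which recovers both the position and the character.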
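Digits and letters are both recognized accurately, but after adding plus and minus signs and retraining, digits are still fine while the symbols get recognized as digits. I added the symbols in the settings and labeled the samples accordingly. Do I also need to adjust the model's parameters? Beginner question.
Could you share the hyperparameter settings and the dataset split used for the model with 97% accuracy? Thanks!
I want to build a prediction endpoint: upload one image and get the result back. How can I predict a single image and get the result?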
Traceback (most recent call last):
  File "/Users/james/pythonproject/pytorch-captcha-recognition/captcha_predict.py", line 34, in <module>
    main()
  File "/Users/james/pythonproject/pytorch-captcha-recognition/captcha_predict.py", line 19, in main
    for i, (images, labels) in enumerate(predict_dataloader):
  File "/Users/james/pythonproject/airasiago/venv/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/Users/james/pythonproject/airasiago/venv/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 385, in _next_data
    data = self._dataset_fetcher.fetch(index) # may raise StopIteration
  File "/Users/james/pythonproject/airasiago/venv/lib/python2.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/Users/james/pythonproject/pytorch-captcha-recognition/my_dataset.py", line 24, in __getitem__
    label = ohe.encode(image_name.split('_')[0]) # For convenience, generated images are named "<4 digits or letters>_<timestamp>.PNG"; the 4 characters are the captcha value (letters uppercase), which is then one-hot encoded
  File "/Users/james/pythonproject/pytorch-captcha-recognition/one_hot_encoding.py", line 22, in encode
    vector[idx] = 1.0
IndexError: index 152 is out of bounds for axis 0 with size 144
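The numbers in this error can be unpacked by hand. A vector of size 144 corresponds to 4 characters times 36 classes, and index 152 = 4 * 36 + 8 falls in a fifth block, so the label being encoded appears to have had more than 4 characters (for example, a file whose name does not follow the `<label>_<timestamp>` pattern). The check below is a hypothetical pre-flight helper, not part of the repository; `label_from_filename` and the 4-character, 36-class defaults are assumptions taken from the sizes in the traceback.

```python
import string

CHARSET = string.digits + string.ascii_uppercase  # assumed 36-class charset
MAX_CAPTCHA = 4                                   # assumed captcha length


def label_from_filename(name, charset=CHARSET, length=MAX_CAPTCHA):
    """Extract and validate the label from '<label>_<timestamp>.<ext>'."""
    label = name.split("_")[0]
    if len(label) != length:
        raise ValueError("%r: label %r has %d characters, expected %d"
                         % (name, label, len(label), length))
    bad = [ch for ch in label if ch not in charset]
    if bad:
        raise ValueError("%r: characters %r are not in the charset" % (name, bad))
    return label


# Locate the block and class that a flat index falls into.
block, class_id = divmod(152, len(CHARSET))
print(block, class_id)  # 4 8: fifth character, class id 8

print(label_from_filename("1129_1554564483.jpg"))
for name in ["11298_1554564483.jpg", "notes.txt"]:
    try:
        label_from_filename(name)
    except ValueError as err:
        print("rejected:", err)
```

Running a check like this over the dataset directory before training would point at the offending file name instead of failing deep inside the data loader.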
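@dee1024 Thank you for your work. I studied your code and made some improvements and customizations to address shortcomings in your repository.
Here is the link: https://github.com/pprp/captcha_identify.torch
Using your repository:
training set: 10k
test set: 1k
it does not reach the 96% target. Could you tell me where exactly the problem is?
Also, my main changes are in the following areas:
The best results so far come from networks based on RES101 and RES152, which can reach 94%.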
pytorch-captcha-recognition git:(master) ✗ python captcha_train.py
init net
('epoch:', 0, 'step:', 0, 'loss:', 0.7110539674758911)
('epoch:', 1, 'step:', 0, 'loss:', 0.20382478833198547)
('epoch:', 2, 'step:', 0, 'loss:', 0.2277430146932602)
('epoch:', 3, 'step:', 0, 'loss:', 0.2373160570859909)
('epoch:', 4, 'step:', 0, 'loss:', 0.22053484618663788)
('epoch:', 5, 'step:', 0, 'loss:', 0.21534274518489838)
('epoch:', 6, 'step:', 0, 'loss:', 0.19580186903476715)
('epoch:', 7, 'step:', 0, 'loss:', 0.1724429875612259)
('epoch:', 8, 'step:', 0, 'loss:', 0.15848731994628906)
('epoch:', 9, 'step:', 0, 'loss:', 0.1449590027332306)
('epoch:', 10, 'step:', 0, 'loss:', 0.14423207938671112)
('epoch:', 11, 'step:', 0, 'loss:', 0.1371624767780304)
('epoch:', 12, 'step:', 0, 'loss:', 0.12939313054084778)
('epoch:', 13, 'step:', 0, 'loss:', 0.12587422132492065)
('epoch:', 14, 'step:', 0, 'loss:', 0.12168151885271072)
('epoch:', 15, 'step:', 0, 'loss:', 0.12061356008052826)
('epoch:', 16, 'step:', 0, 'loss:', 0.11823482811450958)
('epoch:', 17, 'step:', 0, 'loss:', 0.11660336703062057)
('epoch:', 18, 'step:', 0, 'loss:', 0.11383380740880966)
('epoch:', 19, 'step:', 0, 'loss:', 0.11137279868125916)
('epoch:', 20, 'step:', 0, 'loss:', 0.10716967284679413)
('epoch:', 21, 'step:', 0, 'loss:', 0.10241743177175522)
('epoch:', 22, 'step:', 0, 'loss:', 0.10174597799777985)
('epoch:', 23, 'step:', 0, 'loss:', 0.09972652792930603)
('epoch:', 24, 'step:', 0, 'loss:', 0.09792148321866989)
('epoch:', 25, 'step:', 0, 'loss:', 0.09143608063459396)
('epoch:', 26, 'step:', 0, 'loss:', 0.09200140088796616)
('epoch:', 27, 'step:', 0, 'loss:', 0.09041691571474075)
('epoch:', 28, 'step:', 0, 'loss:', 0.08658407628536224)
('epoch:', 29, 'step:', 0, 'loss:', 0.08212457597255707)
save last model
➜ pytorch-captcha-recognition git:(master) ✗ python captcha_test.py
load cnn net.
Test Accuracy of the model on the 3 test images: 0.000000 %
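A loss that falls while accuracy stays at 0% is less surprising than it looks if the training script uses a multi-label sigmoid cross-entropy averaged over all outputs (an assumption here; PyTorch's `nn.MultiLabelSoftMarginLoss` is one such loss). With 4 characters and 36 classes, only 4 of the 144 targets are 1, so a model that ignores the image and outputs the prior probability 1/36 everywhere already scores about 0.127, and an untrained model with all logits at 0 scores ln 2, about 0.693. The sketch below computes both reference values; `bce_mean` and `baseline_loss` are names introduced for illustration.

```python
import math


def bce_mean(p, positives, total):
    """Mean binary cross-entropy when every output predicts probability p."""
    negatives = total - positives
    return -(positives * math.log(p) + negatives * math.log(1.0 - p)) / total


def baseline_loss(n_chars=4, n_classes=36):
    """Loss of a model that outputs the class prior 1/n_classes everywhere."""
    total = n_chars * n_classes
    return bce_mean(1.0 / n_classes, n_chars, total)


untrained = bce_mean(0.5, 4, 144)   # all logits zero, so sigmoid gives 0.5
prior_only = baseline_loss(4, 36)   # ignores the image entirely
print(round(untrained, 4), round(prior_only, 4))  # 0.6931 0.1269
```

Read this way, the first logged loss of 0.71 matches an untrained network, and a final loss of about 0.08 is not far below the prior-only level, so the network may have learned little about individual characters yet. Since a captcha counts as correct only when every character matches, accuracy can remain at 0% at that stage, especially on just 3 test images.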
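Is there a trained model available? If not, could you explain how to train one? Any guidance would be appreciated.
Roughly how many training iterations are needed?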
epoch: 0 step: 9 loss: 0.18992629647254944
epoch: 0 step: 19 loss: 0.1413472294807434
epoch: 0 step: 29 loss: 0.13513770699501038
epoch: 0 step: 39 loss: 0.13097789883613586
epoch: 0 step: 49 loss: 0.13270880281925201
epoch: 0 step: 59 loss: 0.13205519318580627
epoch: 0 step: 69 loss: 0.13079266250133514
epoch: 0 step: 79 loss: 0.13159562647342682
epoch: 0 step: 89 loss: 0.13097992539405823
epoch: 0 step: 99 loss: 0.1281210333108902
save model
epoch: 0 step: 109 loss: 0.13041895627975464
epoch: 0 step: 119 loss: 0.12769392132759094
epoch: 0 step: 129 loss: 0.12638987600803375
epoch: 0 step: 139 loss: 0.126255601644516
epoch: 0 step: 149 loss: 0.12566044926643372
epoch: 0 step: 157 loss: 0.12796930968761444
epoch: 1 step: 9 loss: 0.12493212521076202
epoch: 1 step: 19 loss: 0.12400683015584946
epoch: 1 step: 29 loss: 0.12240520119667053
epoch: 1 step: 39 loss: 0.12404748797416687
epoch: 1 step: 49 loss: 0.11832032352685928
epoch: 1 step: 59 loss: 0.1218739002943039
epoch: 1 step: 69 loss: 0.11750295013189316
epoch: 1 step: 79 loss: 0.1162392869591713
epoch: 1 step: 89 loss: 0.1177394762635231
epoch: 1 step: 99 loss: 0.11434363573789597
save model
epoch: 1 step: 109 loss: 0.11179210245609283
epoch: 1 step: 119 loss: 0.11091677844524384
epoch: 1 step: 129 loss: 0.11243681609630585
epoch: 1 step: 139 loss: 0.11096695810556412
epoch: 1 step: 149 loss: 0.1121552512049675
epoch: 1 step: 157 loss: 0.10759076476097107
epoch: 2 step: 9 loss: 0.10809783637523651
epoch: 2 step: 19 loss: 0.10550016164779663
epoch: 2 step: 29 loss: 0.10933805257081985
epoch: 2 step: 39 loss: 0.1027115136384964
epoch: 2 step: 49 loss: 0.10497838258743286
epoch: 2 step: 59 loss: 0.10278748720884323
epoch: 2 step: 69 loss: 0.10579609125852585
epoch: 2 step: 79 loss: 0.10177668184041977
epoch: 2 step: 89 loss: 0.09883224219083786
epoch: 2 step: 99 loss: 0.10240185260772705
save model
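With 100,000 test images, test accuracy is around 40%. Is this overfitting? How should I adjust the parameters?
Do captchas with different backgrounds and fonts each need to be trained separately? That is a lot of trouble. Under what conditions can a model generalize?
I really can't figure out how to use this. captcha_predict.py works, but how do I pass in a single image file name from somewhere else and get the result back?
I noticed that predicting the same image several times gives different results. I see the model uses dropout; is dropout also active during prediction...?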
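Dropout is the usual suspect for this symptom. In PyTorch, `nn.Dropout` is random only in training mode, and calling `model.eval()` before prediction turns it into a no-op; whether the prediction script in question does this is not shown here, so treat it as something to check. The toy class below is a plain-Python illustration of the train/eval switch (inverted dropout), not the library implementation; `ToyDropout` is a hypothetical name.

```python
import random


class ToyDropout:
    """Inverted dropout with a training flag, mimicking the train/eval switch."""

    def __init__(self, p=0.5, seed=None):
        self.p = p
        self.training = True
        self._rng = random.Random(seed)

    def eval(self):
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            return list(values)          # identity at prediction time
        keep = 1.0 - self.p
        # Zero each value with probability p, rescale survivors by 1/keep.
        return [v / keep if self._rng.random() < keep else 0.0 for v in values]


layer = ToyDropout(p=0.5, seed=0)
x = [1.0, 2.0, 3.0, 4.0]
print(layer(x))          # some entries zeroed, the rest doubled
layer.eval()
print(layer(x) == x)     # True: deterministic once in eval mode
```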
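As someone trying to deploy PyTorch, I'm completely lost. What happened to the promised "end-to-end"?
Hi, during training my loss is 0, but at test time the loss is large and the accuracy is 0.
Can I train on my own set of 6-character captchas?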
RuntimeError: size mismatch, m1: [64 x 8960], m2: [1344 x 1024] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:940
After setting my own image size I get this mismatch error. How do I fix it?
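The two numbers in this error can be reproduced by hand if the network halves the image three times and ends with 64 channels (an assumption about the architecture that happens to match the figures): 64 * (160 // 8) * (60 // 8) = 8960 is what a 160x60 image produces, while 1344 = 64 * 21 is what the first fully connected layer was built for. That pattern suggests the size setting was changed while the images actually fed to the model are still 160x60. The helper below is a sketch for checking a configuration before training; `flat_features` is a hypothetical name.

```python
def flat_features(width, height, channels=64, n_pools=3):
    """Features entering the first linear layer after n_pools 2x2 poolings.

    Assumes 'same'-padded convolutions, so only pooling shrinks the image.
    """
    for _ in range(n_pools):
        width //= 2
        height //= 2
    return channels * width * height


print(flat_features(160, 60))  # 8960, the m1 width in the error
print(flat_features(56, 24))   # 1344, one of several sizes matching m2
```

If the two values disagree for your settings and your files, either regenerate or resize the images to the configured size, or rebuild the model with the size the images really have.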
  File "C:\Users\leping.zhao\AppData\Local\Programs\Python\Python36\lib\site-packages\torch\utils\data\dataloader.py", line 165, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 160 and 164 in dimension 3 at c:\new-builder_2\win-wheel\pytorch\aten\src\th\generic/THTensorMath.cpp:3616
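With only 30 captchas there was no error, but after changing the count to 10,000 or 50,000, generating the captchas, and then training, I get this error. Could you take a look at how to fix it?
I generated 10,000 images with the captcha_gen.py script, then ran captcha_train.py directly with these parameters:
num_epochs = 30
batch_size = 100
learning_rate = 0.001
After training finished I ran the test script, but the model's accuracy was 0.
How many training iterations are needed to reach 0.96 accuracy?
This is single-threaded; isn't that a big efficiency problem?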
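When I run the test it says I'm missing a model.pkl file, and I don't know where to find it or how to generate it.
Hello,
is there any possibility of collaborating with you?
We want to work on parsing login captchas for some banks.
My own training results are poor. Could you provide a pretrained model?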
As in the title.
Hi, could you please approve my request? I have applied to join the group.
11
I'm a complete beginner, and looking at this makes my head spin.
During training, CPU usage is 100% while GPU usage is only 1%.
Traceback (most recent call last):
  File "/Users/haiyuan/Downloads/pytorch-captcha-recognition-master-2/captcha_train.py", line 43, in <module>
    main()
  File "/Users/haiyuan/Downloads/pytorch-captcha-recognition-master-2/captcha_train.py", line 23, in main
    for i, (images, labels) in enumerate(train_dataloader):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 346, in __next__
    data = self._dataset_fetcher.fetch(index) # may raise StopIteration
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 80, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 80, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 56, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 160 and 164 in dimension 3 at ../aten/src/TH/generic/THTensor.cpp:689
init net
epoch: 0 step: 3 loss: 0.2350926697254181
epoch: 1 step: 3 loss: 0.18072116374969482
epoch: 2 step: 3 loss: 0.14990365505218506
epoch: 3 step: 3 loss: 0.1484384387731552
epoch: 4 step: 3 loss: 0.12944868206977844
epoch: 5 step: 3 loss: 0.136171355843544
epoch: 6 step: 3 loss: 0.13556182384490967
epoch: 7 step: 3 loss: 0.1339406818151474
epoch: 8 step: 3 loss: 0.12743189930915833
epoch: 9 step: 3 loss: 0.12321759760379791
epoch: 10 step: 3 loss: 0.11995768547058105
epoch: 11 step: 3 loss: 0.12512315809726715
epoch: 12 step: 3 loss: 0.11607347428798676
epoch: 13 step: 3 loss: 0.1194273829460144
epoch: 14 step: 3 loss: 0.11752907186746597
epoch: 15 step: 3 loss: 0.11002203077077866
epoch: 16 step: 3 loss: 0.11969199776649475
epoch: 17 step: 3 loss: 0.11192518472671509
epoch: 18 step: 3 loss: 0.10305792093276978
epoch: 19 step: 3 loss: 0.10709385573863983
epoch: 20 step: 3 loss: 0.10055689513683319
epoch: 21 step: 3 loss: 0.09390119463205338
epoch: 22 step: 3 loss: 0.10000652819871902
epoch: 23 step: 3 loss: 0.08290736377239227
epoch: 24 step: 3 loss: 0.08297861367464066
epoch: 25 step: 3 loss: 0.07633257657289505
epoch: 26 step: 3 loss: 0.08562453091144562
epoch: 27 step: 3 loss: 0.06919705867767334
epoch: 28 step: 3 loss: 0.06607756018638611
epoch: 29 step: 3 loss: 0.07599730044603348
save last model
Why doesn't it use the GPU by default?