Comments (9)
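At first I made no changes at all and generated the data exactly as described, but training failed with a dimension mismatch. Following the answers in a related issue, I changed the 24 in the reshape layer and the time_step of 24 to 32 (24 is presumably the setting for a width of 96, whereas the current images are 128 wide). After that, training ran fine and reached 99% accuracy on both the training and validation sets. However, the output when testing on training-set images was completely wrong. Two things need to be changed:
1. Each character of the default captcha data is one of 0-9, which together with the blank label makes 11 classes, so blank_label: 10, alphabet_size: 11, and num_output: 11 in the fc layer.
2. The image data is normalized to 0-1 during training, whereas sample_resized.convertTo(sample_float, CV_32FC3) in recognition.cpp does not normalize pixel values to 0-1. Change it to sample_resized.convertTo(sample_float, CV_32FC3, 1/255.0).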
from crnn.caffe.
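The settings named in the comment above all follow from the image width and the character set. A minimal Python sketch, assuming (from the 96 → 24 and 128 → 32 figures quoted there) that the network reduces the image width by a factor of 4 and that the blank takes the last index; `expected_crnn_params` is a hypothetical helper, not part of crnn.caffe:

```python
def expected_crnn_params(image_width, charset, width_downsample=4):
    """Derive the prototxt values that must agree with the dataset.

    width_downsample=4 is an assumption inferred from the thread
    (width 96 -> 24 time steps, width 128 -> 32 time steps).
    """
    if image_width % width_downsample != 0:
        raise ValueError("image width must be divisible by the downsample factor")
    time_step = image_width // width_downsample
    alphabet_size = len(charset) + 1   # every character plus one blank label
    blank_label = alphabet_size - 1    # assumption: blank uses the last index
    return {
        "time_step": time_step,
        "alphabet_size": alphabet_size,
        "num_output": alphabet_size,   # fc layer scores one value per class
        "blank_label": blank_label,
    }


params = expected_crnn_params(128, "0123456789")
print(params)
```

For a larger character set, only the `charset` argument changes; the same relationships between `alphabet_size`, `num_output`, and `blank_label` would be expected to hold.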
@plastic0313 I have the same problem. Did you manage to solve it?
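@plastic0313 @greatgeekgrace That is because my own number of classes is 74. You need to go by your own number of classes and update generate_dataset.py and parameters such as num_output and alphabet_size accordingly.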
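@yalecyu OK, thank you very much. This is the current prediction for a captcha image (the image shows 0642):
74 74 0 74 6 74 4 74 2 74 74 74 74 74 74 74 74 74 74 74 - - - -
The result appears to contain 24 symbols (I counted them one by one), yet num_output is set to 75 in crnn.prototxt and deploy.prototxt. What is going on here? And why does the - symbol appear?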
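One possible reading of this output, offered as a sketch rather than a description of what recognition.cpp actually does: the 24 values are one label per time step, num_output: 75 is the number of classes scored at each step, and 74 (the last index) is the CTC blank. Under that assumption, standard greedy CTC decoding merges consecutive repeats and drops blanks. How the `-` entries should be treated is a guess; they are simply skipped here. `ctc_greedy_decode` is a hypothetical helper:

```python
def ctc_greedy_decode(steps, blank_label):
    """Collapse a per-time-step label sequence the way greedy CTC decoding does."""
    decoded = []
    previous = None
    for token in steps:
        if token == "-":
            # Unknown marker in the printed output; skipped (assumption).
            previous = None
            continue
        label = int(token)
        # Keep a label only if it is not blank and not a repeat of the previous step.
        if label != blank_label and label != previous:
            decoded.append(label)
        previous = label
    return decoded


raw = "74 74 0 74 6 74 4 74 2 74 74 74 74 74 74 74 74 74 74 74 - - - -".split()
print(len(raw))
print(ctc_greedy_decode(raw, blank_label=74))
```

With these assumptions the 24-step sequence collapses to 0, 6, 4, 2, which matches the captcha 0642.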
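@xijunjun Right. The main thing to note is that I changed the prototxt configuration for another OCR project and did not use the generated dataset to verify that the dimensions still match. The other thing to watch is 0-1 versus 0-255; I have not verified it, and there is only a hint about it in the README. I will fix these pitfalls when I have time.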
@xijunjun Where can I see that the data is normalized during training? Isn't normalization exactly what convertTo(sample_float, CV_32FC3, 1/255.0) does?
@dingtao1 I looked at the dataset-generation code and the parameters of the data input layer.
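For context, `cv::Mat::convertTo(dst, rtype, alpha, beta)` computes `dst = src * alpha + beta`, with `alpha` defaulting to 1, so the call without a scale factor changes only the type and leaves values in the 0-255 range. A small Python sketch of that arithmetic (`convert_to` is a hypothetical stand-in for illustration, not an OpenCV binding):

```python
def convert_to(pixels, alpha=1.0, beta=0.0):
    """Mimic the value mapping of cv::Mat::convertTo: dst = src * alpha + beta."""
    return [float(p) * alpha + beta for p in pixels]


raw = [0, 51, 255]
unscaled = convert_to(raw)                   # values stay in 0-255
scaled = convert_to(raw, alpha=1 / 255.0)    # values mapped into 0-1
print(unscaled)
print(scaled)
```

If the training data was stored in 0-1 and inference feeds `unscaled`, the network sees inputs up to 255 times larger than anything it was trained on.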
Hi, sorry for posting on an old discussion, but I need some help or hints, as I can't seem to get consistent and correct results from my own trained CRNN model after following all the steps. I ported the Linux code and compiled it on Windows with the Visual Studio 2017 compiler. I managed to compile the code successfully after making some changes, which I assume shouldn't affect the results.
First, I generated the dataset using generate_captcha.py. The total number of images is 50,000.
Then I executed generate_dataset.py with IMAGE_WIDTH, IMAGE_HEIGHT = 128, 32. Training size = 40,000 and test size = 10,000.
In my crnn.prototxt, I changed the batch size to 50 to suit my GPU, which only has 2 MB memory. I changed the following as well:
layer {
  name: "reshape"
  type: "Reshape"
  bottom: "conv6"
  top: "reshape"
  reshape_param {
    shape {
      #nc(w*h)
      dim: 50
      dim: 512
      dim: 32
    }
  }
}
layer {
  name: "indicator"
  type: "ContinuationIndicator"
  top: "indicator"
  continuation_indicator_param {
    time_step: 32
    batch_size: 50
  }
}
layer {
  name: "ctc_loss"
  type: "CtcLoss"
  bottom: "fc1"
  bottom: "label"
  top: "ctc_loss"
  loss_weight: 1.0
  ctc_loss_param {
    blank_label: 10
    alphabet_size: 11
    time_step: 32
  }
}
layer {
  name: "accuracy"
  type: "LabelsequenceAccuracy"
  bottom: "premuted_fc"
  bottom: "label"
  top: "accuracy"
  labelsequence_accuracy_param {
    blank_label: 10
  }
}
I managed to get over 0.95 accuracy on both the test and training data. The loss is also low (0.00x).
Next, I changed deploy.prototxt:
name: "crnn"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param {shape:{dim:1 dim:3 dim:32 dim:128}}
}
layer {
  name: "reshape"
  type: "Reshape"
  bottom: "conv6"
  top: "reshape"
  reshape_param {
    shape {
      #nc(w*h)
      dim: 1
      dim: 512
      dim: 32
    }
  }
}
layer {
  name: "indicator"
  type: "ContinuationIndicator"
  top: "indicator"
  continuation_indicator_param {
    time_step: 32
    batch_size: 1
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "lstm2"
  top: "fc1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 11
    axis: 2
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
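The numbers quoted in the two prototxt excerpts above can be cross-checked mechanically. The sketch below copies them by hand into a plain dict and tests the relationships discussed earlier in the thread; the key names and the `check_crnn_config` helper are made up for illustration and do not parse real prototxt files:

```python
def check_crnn_config(cfg):
    """Return a list of human-readable inconsistencies (empty if none are found)."""
    problems = []
    steps = {cfg["reshape_steps"], cfg["indicator_time_step"], cfg["ctc_time_step"]}
    if len(steps) != 1:
        problems.append("time steps disagree: %s" % sorted(steps))
    if cfg["reshape_batch"] != cfg["indicator_batch_size"]:
        problems.append("batch sizes disagree")
    if cfg["fc_num_output"] != cfg["alphabet_size"]:
        problems.append("fc num_output != alphabet_size")
    if cfg["ctc_blank_label"] != cfg["accuracy_blank_label"]:
        problems.append("blank_label differs between loss and accuracy layers")
    if not 0 <= cfg["ctc_blank_label"] < cfg["alphabet_size"]:
        problems.append("blank_label outside [0, alphabet_size)")
    return problems


# Values transcribed from the excerpts in this comment.
quoted_cfg = {
    "reshape_batch": 50, "reshape_steps": 32,
    "indicator_time_step": 32, "indicator_batch_size": 50,
    "ctc_time_step": 32, "ctc_blank_label": 10, "alphabet_size": 11,
    "accuracy_blank_label": 10, "fc_num_output": 11,
}
print(check_crnn_config(quoted_cfg))
```

The quoted values pass these particular checks, which hints that the dimension settings are mutually consistent and that the cause may lie elsewhere, for example in input preprocessing; the checks are not exhaustive.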
I also amended recognition.cpp to include the normalization:
if (num_channels_ == 3)
    sample_resized.convertTo(sample_float, CV_32FC3, 1.f/255);
else
    sample_resized.convertTo(sample_float, CV_32FC1, 1.f/255);
As for the output, I run the recognition executable as follows:
recognition D:\ImageProc\ImgDataset\Data\Captcha\49998-7959.png D:\Lib\caffecrnn\examples\crnn\deploy.prototxt D:\Lib\caffecrnn\examples\crnn\model\crnn_captcha_iter_3600.caffemodel
The model does not give consistent results from run to run, and the output is not accurate either.
This is the output from three runs on the same image:
8 9 8 8 8 8 8 8 8 8 8 8 8 8 8 4 4 4 4 4 1 1 1 - 1 1 1 0 0 - 1 2
6 6 - - 9 9 - - - - - - - - - - - - 6 6 6 6 6 6 6 6 6 6 6 6 3 3
1 1 2 1 1 8 8 8 8 1 - 1 1 1 1 1 1 1 1 7 7 7 7 7 7 7 7 7 7 7 6 1
Does anyone have any hints, or can anyone spot where I made a mistake? Has anyone managed to get accurate output from the trained model?
Please help! Thank you.
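(Digits + English letters) Converting the test image from BGR to RGB fixes the problem where test accuracy during training is very high but the cpp_recognition output is wrong!
#~~~~~~~~~~~~~~~~~~ Reason ~~~~~~~~~~~~~~~~~~~~#
When we build the dataset: img = caffe.io.load_image(os.path.join(img_path, image))
Inside caffe.io: img = skimage.img_as_float(skimage.io.imread(filename, as_grey=not color)).astype(np.float32)
The problem: cv2 stores images in BGR order, whereas skimage stores them in RGB. recognition.cpp reads the image with OpenCV, so add cv::cvtColor(resizeimg, resizeimg, cv::COLOR_BGR2RGB);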
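The channel-order mismatch can be illustrated without OpenCV. In this sketch an image is a nested list of `(B, G, R)` tuples, a simplification chosen purely for illustration; `bgr_to_rgb` is a hypothetical helper that mirrors the per-pixel effect of `cv::COLOR_BGR2RGB`:

```python
def bgr_to_rgb(image):
    """Reverse the channel order of every pixel in a row-major nested list."""
    return [[(r, g, b) for (b, g, r) in row] for row in image]


# One row with a pure blue pixel and a pure red pixel, stored in BGR order.
bgr = [[(255, 0, 0), (0, 0, 255)]]
rgb = bgr_to_rgb(bgr)
print(rgb)
```

Without the swap, a network trained on RGB data would receive the red and blue channels exchanged, which matters most for colored inputs such as captchas.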