I trained the model to an accuracy of 0.96 and tested it with ./build/examples/cpp_recognition/recognition.bin data/testimage


Test results are badly wrong about crnn.caffe HOT 9 OPEN

yalecyu commented on July 28, 2024

The test results are badly wrong.

from crnn.caffe.

Comments (9)

xijunjun commented on July 28, 2024

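At first I made no changes and generated the data exactly as described, but training failed with a dimension mismatch. Following the answers in related issues, I changed the 24 in the reshape layer and the time_step of 24 to 32 (24 appears to be the setting for an image width of 96, whereas the current image width is 128). Training then ran and reached 99% accuracy on both the training and validation sets. However, the output when testing training-set images was completely wrong. Two things need to change:
1. The default captcha data contains only the characters 0-9 plus the blank label, 11 classes in total, so blank_label: 10, alphabet_size: 11, and the fc layer's num_output: 11.
2. The training images are normalized to 0-1, but sample_resized.convertTo(sample_float, CV_32FC3) in recognition.cpp does not scale the pixel values to 0-1. Change it to sample_resized.convertTo(sample_float, CV_32FC3, 1/255.0).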
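The 0-1 versus 0-255 mismatch can be illustrated with a small sketch. It is a pure-Python stand-in for the arithmetic that cv::Mat::convertTo applies (dst = src * alpha + beta), not the OpenCV call itself; the helper name `convert_to` and the sample pixel values are made up for illustration.

```python
def convert_to(pixels, alpha=1.0, beta=0.0):
    """Mimic the arithmetic of cv::Mat::convertTo for a float target.

    Hypothetical helper: each output value is src * alpha + beta.
    """
    return [p * alpha + beta for p in pixels]


raw = [0, 128, 255]  # example 8-bit pixel values as read from an image file

# Like convertTo(sample_float, CV_32FC3): the type changes, the range does not.
unscaled = convert_to(raw)

# Like convertTo(sample_float, CV_32FC3, 1/255.0): values are scaled into 0..1.
scaled = convert_to(raw, 1 / 255.0)

print(unscaled)  # values still span 0..255
print(scaled)    # values now span 0..1
```

If the network was trained on 0-1 inputs, feeding it the unscaled values gives it inputs up to 255 times larger than anything it saw during training, which would be consistent with the wrong predictions described above.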


commented on July 28, 2024

@plastic0313 I have the same problem. Did you manage to solve yours?


yalecyu commented on July 28, 2024

@plastic0313 @greatgeekgrace That is because my own setup has 74 classes. Use your own number of classes and change generate_dataset.py, num_output, alphabet_size, and the related parameters accordingly.
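As a rough sketch of how these parameters relate, the helper below derives them from the number of characters, following the convention used in this thread (the blank label takes the index right after the last character). The function name `ctc_params` is hypothetical, and reading "74 classes" as 74 characters plus one blank is an assumption based on the num_output of 75 mentioned elsewhere in this thread.

```python
def ctc_params(num_characters):
    """Derive the CTC-related prototxt values from the character count.

    Assumes the blank label is placed after the last real character,
    as in the digit example from this thread (blank_label: 10).
    """
    blank_label = num_characters          # index just past the last character
    alphabet_size = num_characters + 1    # all characters plus the blank
    return {
        "blank_label": blank_label,
        "alphabet_size": alphabet_size,
        "num_output": alphabet_size,      # fc layer emits one score per class
    }


digits = ctc_params(10)   # captcha digits 0-9
custom = ctc_params(74)   # a 74-character alphabet

print(digits)
print(custom)
```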


commented on July 28, 2024

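@yalecyu OK, thank you very much! Here is the current prediction for a captcha image (the image shows 0642):
74 74 0 74 6 74 4 74 2 74 74 74 74 74 74 74 74 74 74 74 - - - -
Counting them one by one, the result seems to have 24 entries, yet num_output is set to 75 in crnn.prototxt and deploy.prototxt. What is going on? And why does the - symbol appear?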
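In a CTC-based model, the length of the raw output is the number of time steps, while num_output is the number of classes that each time step chooses from, so the two numbers are not expected to match. The sketch below applies standard greedy CTC decoding (collapse repeated labels, then drop blanks) to the output above. It assumes that 74 is the blank label, and it simply skips the "-" entries, since the thread does not explain what they represent.

```python
def ctc_greedy_decode(labels, blank_label):
    """Collapse consecutive repeats, then remove blank labels."""
    decoded = []
    previous = None
    for label in labels:
        # Keep a label only when it starts a new run and is not the blank.
        if label != previous and label != blank_label:
            decoded.append(label)
        previous = label
    return decoded


raw = "74 74 0 74 6 74 4 74 2 74 74 74 74 74 74 74 74 74 74 74 - - - -"

# Assumption: "-" entries carry no label information and can be ignored.
labels = [int(token) for token in raw.split() if token != "-"]

result = ctc_greedy_decode(labels, blank_label=74)
print(result)  # [0, 6, 4, 2]
```

Under these assumptions the raw output decodes to 0642, which matches the image, so the model prediction itself looks correct and only the final decoding step is missing.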


yalecyu commented on July 28, 2024

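@xijunjun Right. The main thing to note is that I changed the prototxt configuration for another OCR project and did not check against the generated dataset whether the dimensions still match. The other thing to watch is 0-1 versus 0-255. I have not verified it and only left a hint in the README. I will fix these pitfalls when I have time.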


dingtao1 commented on July 28, 2024

@xijunjun Where can I see that the data is normalized during training? Isn't normalization exactly what convertTo(sample_float, CV_32FC3, 1/255.0) does?


xijunjun commented on July 28, 2024

@dingtao1 I checked the dataset-generation code and the parameters of the data input layer.


yjtan118 commented on July 28, 2024

Hi, sorry for posting on an old discussion, but I need some help or hints, as I can't seem to get consistent and correct results from my own trained CRNN model after following all the steps. I ported the Linux code to Windows and compiled it with the Visual Studio 2017 compiler. The code compiled successfully after I made some changes, which I assume should not affect the results.

First I generated the dataset using generate_captcha.py: 50,000 images in total.
Then I executed generate_dataset.py.
IMAGE_WIDTH, IMAGE_HEIGHT = 128, 32.
Training size = 40,000 and test size = 10,000.
In my crnn.prototxt, I changed the batch size to 50 to suit my GPU, which only has 2 MB memory. I changed the following as well:
layer {
  name: "reshape"
  type: "Reshape"
  bottom: "conv6"
  top: "reshape"
  reshape_param {
    shape {
      #nc(w*h)
      dim: 50
      dim: 512
      dim: 32
    }
  }
}
layer {
  name: "indicator"
  type: "ContinuationIndicator"
  top: "indicator"
  continuation_indicator_param {
    time_step: 32
    batch_size: 50
  }
}
layer {
  name: "ctc_loss"
  type: "CtcLoss"
  bottom: "fc1"
  bottom: "label"
  top: "ctc_loss"
  loss_weight: 1.0
  ctc_loss_param {
    blank_label: 10
    alphabet_size: 11
    time_step: 32
  }
}
layer {
  name: "accuracy"
  type: "LabelsequenceAccuracy"
  bottom: "premuted_fc"
  bottom: "label"
  top: "accuracy"
  labelsequence_accuracy_param {
    blank_label: 10
  }
}
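The values in these layers have to agree with each other, which is easy to get wrong when editing by hand. The following is a minimal sketch of a consistency check over the numbers posted above. The function name and dictionary keys are hypothetical, the values are copied manually rather than parsed from the prototxt, and the rule that blank_label equals alphabet_size - 1 is the convention used in this thread rather than a general CTC requirement.

```python
def check_crnn_config(cfg):
    """Return a list of inconsistencies (empty if none are found)."""
    problems = []
    time_steps = {
        cfg["reshape_time_dim"],
        cfg["indicator_time_step"],
        cfg["ctc_time_step"],
    }
    if len(time_steps) != 1:
        problems.append("time step values disagree: %s" % sorted(time_steps))
    if cfg["reshape_batch_dim"] != cfg["indicator_batch_size"]:
        problems.append("batch sizes disagree")
    if cfg["blank_label"] != cfg["alphabet_size"] - 1:
        problems.append("blank_label is not alphabet_size - 1")
    if cfg["accuracy_blank_label"] != cfg["blank_label"]:
        problems.append("accuracy layer uses a different blank_label")
    return problems


# Values copied by hand from the training prototxt excerpt in this comment.
posted = {
    "reshape_batch_dim": 50,
    "reshape_time_dim": 32,
    "indicator_time_step": 32,
    "indicator_batch_size": 50,
    "ctc_time_step": 32,
    "blank_label": 10,
    "alphabet_size": 11,
    "accuracy_blank_label": 10,
}
print(check_crnn_config(posted))  # []

# The kind of leftover described earlier in the thread: a stale 24 in reshape.
stale = dict(posted, reshape_time_dim=24)
print(check_crnn_config(stale))
```

The posted values pass this check, so by these rules the training configuration itself is internally consistent.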
I managed to get over 0.95 accuracy on both the test and training data. The loss is low as well (0.00x).
Next, I changed deploy.prototxt:
name: "crnn"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 3 dim: 32 dim: 128 } }
}
layer {
  name: "reshape"
  type: "Reshape"
  bottom: "conv6"
  top: "reshape"
  reshape_param {
    shape {
      #nc(w*h)
      dim: 1
      dim: 512
      dim: 32
    }
  }
}
layer {
  name: "indicator"
  type: "ContinuationIndicator"
  top: "indicator"
  continuation_indicator_param {
    time_step: 32
    batch_size: 1
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "lstm2"
  top: "fc1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 11
    axis: 2
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
I also amended recognition.cpp to include the normalization:
if (num_channels_ == 3)
  sample_resized.convertTo(sample_float, CV_32FC3, 1.f / 255);
else
  sample_resized.convertTo(sample_float, CV_32FC1, 1.f / 255);

As for the output, when I run the recognition executable as below:
recognition D:\ImageProc\ImgDataset\Data\Captcha\49998-7959.png D:\Lib\caffecrnn\examples\crnn\deploy.prototxt D:\Lib\caffecrnn\examples\crnn\model\crnn_captcha_iter_3600.caffemodel
I can't get consistent results from the model from one run to the next, and the output is not accurate either.
Output from three runs:
8 9 8 8 8 8 8 8 8 8 8 8 8 8 8 4 4 4 4 4 1 1 1 - 1 1 1 0 0 - 1 2

6 6 - - 9 9 - - - - - - - - - - - - 6 6 6 6 6 6 6 6 6 6 6 6 3 3

1 1 2 1 1 8 8 8 8 1 - 1 1 1 1 1 1 1 1 7 7 7 7 7 7 7 7 7 7 7 6 1

Does anyone have any hints, or can anyone spot where I made a mistake? Has anyone managed to get accurate output from the trained model?

Please help! Thank you.


BarryKCL commented on July 28, 2024

(Digits + English letters) Converting the test image from BGR to RGB fixes the problem where test accuracy during training is high but the cpp_recognition output is wrong!
#~~~~~~~~~~ The reason is as follows ~~~~~~~~~~#
When we build the dataset: img = caffe.io.load_image(os.path.join(img_path, image))
Inside caffe.io: img = skimage.img_as_float(skimage.io.imread(filename, as_grey=not color)).astype(np.float32)
The problem: cv2 stores images as BGR, while skimage stores them as RGB. recognition.cpp reads images with OpenCV, so add cv::cvtColor(resizeimg, resizeimg, cv::COLOR_BGR2RGB);
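To make the channel-order mismatch concrete, here is a minimal pure-Python sketch of what a BGR to RGB conversion does to each pixel. It stands in for cv::cvtColor with cv::COLOR_BGR2RGB and does not call OpenCV; the helper name `bgr_to_rgb` and the tiny two-pixel image are illustrative only.

```python
def bgr_to_rgb(image):
    """Reverse the channel order of every pixel in a nested-list image.

    `image` is a list of rows, each row a list of [B, G, R] pixels.
    """
    return [[pixel[::-1] for pixel in row] for row in image]


# One row with a pure-blue and a pure-red pixel, in OpenCV's BGR order.
bgr_image = [[[255, 0, 0], [0, 0, 255]]]

rgb_image = bgr_to_rgb(bgr_image)
print(rgb_image)  # [[[0, 0, 255], [255, 0, 0]]]
```

Without the swap, a network trained on RGB data sees red and blue exchanged at test time, which matters for colored captchas and not at all for single-channel input.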
