mhliao / textboxes

This project forked from weiliu89/caffe

TextBoxes: A Fast Text Detector with a Single Deep Neural Network

Home Page: https://github.com/MhLiao/TextBoxes

License: Other

CMake 2.43% Makefile 0.62% Shell 0.48% C++ 80.57% Cuda 5.94% MATLAB 0.78% Python 9.11% Dockerfile 0.06%

aaai detection scene spotting text

textboxes's Issues

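How to run the TextBoxes model under the SSD path

Hello, TextBoxes performs well on horizontal text, but when I run the TextBoxes model under the SSD path I get an error. Compared with the SSD network, TextBoxes lacks distort_param in the data layer and lacks the ignore_cross_boundary_bbox and mining_type parameters in the loss. Is TextBoxes perhaps based on an older version of SSD?
The error is: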
Check failed:num_priors_*num_loc_class_*4==bottom[0]->channels()(295112 vs.590224) Number of priors must match number of location predictions

Can not make

nd@nd-Z97X-UD3H:~/yao/TextBoxes$ make -j8
CXX src/caffe/data_transformer.cpp
CXX src/caffe/util/sampler.cpp
CXX src/caffe/util/upgrade_proto.cpp
CXX src/caffe/util/bbox_util.cpp
CXX src/caffe/util/im2col.cpp
CXX src/caffe/util/im_transforms.cpp
CXX src/caffe/util/blocking_queue.cpp
CXX src/caffe/util/db.cpp
CXX src/caffe/util/io.cpp
CXX src/caffe/util/insert_splits.cpp
In file included from src/caffe/util/im_transforms.cpp:20:0:
./include/caffe/util/im_transforms.hpp:44:37: error: ‘ResizeParameter’ does not name a type
void UpdateBBoxByResizePolicy(const ResizeParameter& param,
^
./include/caffe/util/im_transforms.hpp:46:31: error: ‘NormalizedBBox’ has not been declared
NormalizedBBox* bbox);
^
./include/caffe/util/im_transforms.hpp:48:50: error: ‘ResizeParameter’ does not name a type
cv::Mat ApplyResize(const cv::Mat& in_img, const ResizeParameter& param);
^
./include/caffe/util/im_transforms.hpp:50:49: error: ‘NoiseParameter’ does not name a type
cv::Mat ApplyNoise(const cv::Mat& in_img, const NoiseParameter& param);
^
src/caffe/util/im_transforms.cpp:251:37: error: ‘ResizeParameter’ does not name a type
void UpdateBBoxByResizePolicy(const ResizeParameter& param
.
.
.
make: *** [.build_release/src/caffe/util/upgrade_proto.o] Error 1
^Cmake: *** [.build_release/src/caffe/util/io.o] Interrupt
make: *** [.build_release/src/caffe/data_transformer.o] Interrupt
make: *** [.build_release/src/caffe/util/sampler.o] Interrupt
make: *** [.build_release/src/caffe/util/blocking_queue.o] Interrupt
make: *** [.build_release/src/caffe/util/insert_splits.o] Interrupt
make: *** wait: No child processes. Stop.

I then added caffe.pb.h to 'include/caffe/proto/', but the error still appears.

ImportError: No module named shapely

I followed your guide, but when I run
python demo.py
an error occurs. The error message is as follows:

Traceback (most recent call last):
  File "demo.py", line 8, in <module>
    from nms import nms
  File "/home/wangjianbo_i/TextBoxes/examples/TextBoxes/nms.py", line 3, in <module>
    import shapely
ImportError: No module named shapely

Also, in your guide,

Test
run "python examples/demo.py".

should probably be
python examples/TextBoxes/demo.py
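
The traceback above only says that the interpreter cannot find the shapely package that nms.py imports. A minimal sketch of a pre-flight check follows. It assumes the package is installable from PyPI under the same name, and `missing_modules` is a hypothetical helper, not part of the repository.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be imported here."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_modules(["shapely"])
    if missing:
        # Assumption: the PyPI package name matches the module name.
        print("Missing: %s. Try: pip install %s"
              % (", ".join(missing), " ".join(missing)))
```

Make sure pip belongs to the same Python interpreter that runs demo.py, otherwise the import will keep failing.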

difference between TextBoxes and SSD

What is the difference between TextBoxes and SSD? Is it just a modification of the default boxes?

NameError: name 'get_labelname' is not defined

Dear @MhLiao

I tried to run test_icdar13_multi_scale.py but got the error in the title.

Could you please help me resolve it?

Getting mbox_loss as 0

Hi,
I am trying to train TextBoxes on my custom dataset. The annotations are in pascal_voc format. I get the following log for every iteration so far (training is currently at iteration 830).

Train net output #0: mbox_loss = 0 (* 1 = 0 loss)

Somehow this does not look right to me. Can someone suggest what might be wrong here? I have walked through the code and the annotations seem to be handled correctly. I am debugging it further, but any quick help would be appreciated. Thanks.

what is the best detection_eval in SynthText dataset

@MhLiao, hello
I am training the TextBoxes model on the SynthText dataset to reproduce the results in your paper.
I have two questions:

What was the best detection_eval on the SynthText dataset in your training?
Did you choose the 50k-iteration caffemodel to fine-tune on the ICDAR 2013 dataset, or the snapshot with the best
detection_eval (which is not the 50k one in my training)?

Thanks in advance!

cannot find -lopencv_dep_cudart

I hit this "cannot find" error every time and it is driving me crazy. Please help.

How to reduce GPU memory usage (my GPU is out of memory)?

How can I reduce the GPU memory usage when I run the test? My GPU runs out of memory.

LMDB file creation

Hi, I am new to Caffe and only know its basic workings. For single-character recognition we just provide the image path and label in the train.txt and val.txt files. Can anyone please tell me what the format of train.txt and val.txt is for creating the LMDB in TextBoxes?

My train.txt file contains the image path and annotation file path for the COCO dataset. Is that right? And how can we open an LMDB file that contains annotations?

I used the following code to read the LMDB file:

import caffe
import lmdb
import numpy as np
import cv2
import matplotlib.pyplot as plt
from caffe.proto import caffe_pb2

lmdb_file='/home/arha/workarea/ocr_project/cocotext/data/lmdb/coco_val_lmdb'
lmdb_env = lmdb.open(lmdb_file)
lmdb_txn = lmdb_env.begin()
lmdb_cursor = lmdb_txn.cursor()

datum = caffe_pb2.Datum()
for key, value in lmdb_cursor:
    datum.ParseFromString(value)

    label = datum.label
    print 'label=',label
    data = caffe.io.datum_to_array(datum)
    print data.shape
    print 'data=',data
    image = np.transpose(data, (1,2,0))
    cv2.imshow('cv2', image)
    cv2.waitKey(0)
    print('{},{}'.format(key, label))
It gives the following output:

label= 0
(0, 0, 0)
data= []
OpenCV Error: Assertion failed (size.width>0 && size.height>0) in imshow, file /home/arha/softwares/opencv-2.4.13/modules/highgui/src/window.cpp, line 261
Traceback (most recent call last):
  File "lm.py", line 28, in <module>
    cv2.imshow('cv2', image)
cv2.error: /home/arha/softwares/opencv-2.4.13/modules/highgui/src/window.cpp:261: error: (-215) size.width>0 && size.height>0 in function imshow

Please help me clear up these doubts.

Thank you
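
One possible explanation for the empty array, offered as an assumption rather than a confirmed diagnosis: SSD-style forks of Caffe usually store each LMDB record as an AnnotatedDatum (an encoded image plus its boxes) instead of a plain Datum, so parsing the value as a Datum yields no image data. The sketch below reads records on that assumption. `iter_boxes` and `dump_lmdb` are hypothetical helpers, and the message and field names (`AnnotatedDatum`, `annotation_group`, `annotation`, `bbox`) are taken from the SSD caffe.proto, so check them against the caffe.proto in your checkout.

```python
def iter_boxes(annotated_datum):
    """Yield (group_label, xmin, ymin, xmax, ymax) for every annotation."""
    for group in annotated_datum.annotation_group:
        for ann in group.annotation:
            box = ann.bbox
            yield (group.group_label, box.xmin, box.ymin, box.xmax, box.ymax)


def dump_lmdb(lmdb_path, limit=5):
    """Print the image shape and boxes of the first `limit` records."""
    # Imports are local so iter_boxes can be used without Caffe installed.
    import lmdb
    import numpy as np
    import cv2
    import caffe
    from caffe.proto import caffe_pb2

    env = lmdb.open(lmdb_path, readonly=True, lock=False)
    with env.begin() as txn:
        for i, (key, value) in enumerate(txn.cursor()):
            if i >= limit:
                break
            record = caffe_pb2.AnnotatedDatum()  # assumption: SSD-style record
            record.ParseFromString(value)
            datum = record.datum
            if datum.encoded:
                # The image is stored as compressed bytes (e.g. JPEG).
                buf = np.frombuffer(datum.data, dtype=np.uint8)
                image = cv2.imdecode(buf, 1)  # 1 = load as a colour image
            else:
                image = caffe.io.datum_to_array(datum).transpose(1, 2, 0)
            print(key, image.shape, list(iter_boxes(record)))
```

Box coordinates in such records are normally normalized to the range 0 to 1, so multiply by the image width and height before drawing them.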

Time consumption

@MhLiao, Table 1 of your article presents the time consumption of several methods. How did you measure these times? My GPU is weaker than a Titan X and I'd like to know the times on my machine. I'm particularly interested in your method, and I assume I should add some tic/toc timing to the test_icdar13.py file, but I'd like to know your opinion.

Best regards
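
A minimal sketch of the kind of tic/toc timing described above. It is not how the paper's numbers were produced, which this thread does not say. `time_call` is a hypothetical helper that averages wall-clock time over several calls after a few untimed warm-up calls.

```python
import time


def time_call(fn, warmup=2, runs=10):
    """Return the mean wall-clock seconds per call of fn over `runs` calls."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed; the first calls are often slower
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs
```

In the test script this could wrap the forward pass, for example `time_call(lambda: net.forward())`, where `net` is assumed to be the caffe.Net object the script already creates. Decide up front whether image loading and NMS should be included, since that changes the number noticeably.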

Small objects detection

Hi, @MhLiao
I'm trying to re-implement your work in TensorFlow. So far I have made good progress, but the detection of small objects is quite bad because small objects can't match the default boxes well, so they never get trained.
I tried to tune the anchor sizes but it doesn't work well. Could you give me some advice?
Thanks a lot.

Error in `python': double free or corruption (out) Error

Dear all

I tried to run test_icdar13.py but got the error in the title.

Could someone please help me resolve this issue?

Error parsing text-format caffe.NetParameter: 758:14: Message type "caffe.LayerParameter" has no field named "norm_param". F0709 17:08:07.915925 6370 upgrade_proto.cpp:88] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter

Hello,
I got the following error while running

python demo.py

The output is:

WARNING: Logging before InitGoogleLogging() is written to STDERR
W0709 17:08:07.867985  6370 _caffe.cpp:135] DEPRECATION WARNING - deprecated use of Python interface
W0709 17:08:07.868007  6370 _caffe.cpp:136] Use this instead (with the named "weights" parameter):
W0709 17:08:07.868028  6370 _caffe.cpp:138] Net('/home/ahmed/TextBoxes/examples/TextBoxes/deploy.prototxt', 1, weights='/home/ahmed/TextBoxes/examples/TextBoxes/TextBoxes_icdar13.caffemodel')
[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 758:14: Message type "caffe.LayerParameter" has no field named "norm_param".
F0709 17:08:07.915925  6370 upgrade_proto.cpp:88] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: /home/ahmed/TextBoxes/examples/TextBoxes/deploy.prototxt
*** Check failure stack trace: ***
Aborted (core dumped)
Thanks a lot
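
The norm_param field belongs to the Normalize layer that SSD-derived forks add to caffe.proto, so this parse error typically appears when `import caffe` picks up a different Caffe build than the one compiled in the TextBoxes directory. That diagnosis is an assumption based on the message, not something confirmed in this thread. A minimal sketch for making the fork's Python bindings load first follows. `use_pycaffe_from` is a hypothetical helper, and the path in the usage comment is the reporter's own.

```python
import os
import sys


def use_pycaffe_from(caffe_root):
    """Put <caffe_root>/python at the front of sys.path and return that path."""
    pycaffe_dir = os.path.join(os.path.abspath(caffe_root), "python")
    if pycaffe_dir in sys.path:
        sys.path.remove(pycaffe_dir)
    sys.path.insert(0, pycaffe_dir)
    return pycaffe_dir


# Usage (run before the first `import caffe`):
#   use_pycaffe_from("/home/ahmed/TextBoxes")
#   import caffe
#   print(caffe.__file__)   # should point inside the TextBoxes checkout
```

If `caffe.__file__` still points at another installation, check PYTHONPATH and make sure `make pycaffe` was run inside the TextBoxes checkout.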

Undefined function or variable 'polygon_intersect'

@MhLiao

I am running the evaluation_nms code and it fails with "Undefined function or variable 'polygon_intersect'". How is this function actually implemented?

Thanks a lot.

Get confidence score of CRNN to regularize the detection outputs of textBoxes.

Hello,
let me first thank you for these excellent articles: TextBoxes + CRNN.
On the first page of the TextBoxes paper, the following is mentioned: "we use the confidence scores of CRNN
to regularize the detection outputs of TextBoxes"

However, I am stuck at getting the probability of the sequence output by CRNN.

For example:

--h-e--ll-oo- => 'hello' with a probability = 0.89

How can I get that? I'm using the PyTorch version.

I can't find these probabilities in the CTCLoss code.

In __init__.py the CTC class is defined as follows, but I don't see where I could print the output probabilities:

class _CTC(Function):
    def forward(self, acts, labels, act_lens, label_lens):
        is_cuda = True if acts.is_cuda else False
        acts = acts.contiguous()
        loss_func = warp_ctc.gpu_ctc if is_cuda else warp_ctc.cpu_ctc
        grads = torch.zeros(acts.size()).type_as(acts)
        minibatch_size = acts.size(1)
        costs = torch.zeros(minibatch_size)
        loss_func(acts,
                  grads,
                  labels,
                  label_lens,
                  act_lens,
                  minibatch_size,
                  costs)
        self.grads = grads
        self.costs = torch.FloatTensor([costs.sum()])
        return self.costs

    def backward(self, grad_output):
        return self.grads, None, None, None


class CTCLoss(Module):
    def __init__(self):
        super(CTCLoss, self).__init__()

    def forward(self, acts, labels, act_lens, label_lens):
        """
        acts: Tensor of (seqLength x batch x outputDim) containing output from network
        labels: 1 dimensional Tensor containing all the targets of the batch in one sequence
        act_lens: Tensor of size (batch) containing size of each output sequence from the network
        label_lens: Tensor of size (batch) containing label length of each example
        """
        _assert_no_grad(labels)
        _assert_no_grad(act_lens)
        _assert_no_grad(label_lens)
        return _CTC()(acts, labels, act_lens, label_lens)


Thank you
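
The loss class above only returns the CTC cost, so a score has to be computed from the network's per-timestep outputs instead. The sketch below shows greedy (best-path) decoding with a path probability, assuming the outputs have already been passed through a softmax and that index 0 is the blank. It is not necessarily the scoring used in the paper, and `ctc_greedy_decode` is a hypothetical helper. The best-path probability is only an approximation of the full CTC sequence probability, which sums over all alignments.

```python
import math


def ctc_greedy_decode(probs, alphabet, blank=0):
    """Best-path CTC decoding.

    probs: list of per-timestep probability lists (each sums to 1).
    alphabet: string whose index matches the class index (index 0 = blank).
    Returns (decoded_text, probability_of_the_best_path).
    """
    path = []
    log_p = 0.0
    for step in probs:
        k = max(range(len(step)), key=lambda i: step[i])  # most likely class
        path.append(k)
        log_p += math.log(step[k])
    chars = []
    prev = None
    for k in path:
        # Collapse repeated classes, then drop blanks.
        if k != blank and k != prev:
            chars.append(alphabet[k])
        prev = k
    return "".join(chars), math.exp(log_p)
```

For example, three timesteps over the alphabet "-hi" whose most likely classes are h, blank, i decode to "hi", and the score is the product of the three maximum probabilities.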

Is textBoxes adapted to document images like scanned invoices ?

Hello,
Let me first thank you for this excellent work.

I'm wondering whether TextBoxes can be applied to document images or only to natural images.

Thanks a lot

LMDB creating script

May I ask for the LMDB creation script for training? The path mentioned in the comment seems to be missing. Thank you.

Convert `TextBoxes_icdar13.caffemodel` to `TextBoxes_icdar13.mlmodel` error!

Hi @MhLiao! Can you help me convert TextBoxes_icdar13.caffemodel to TextBoxes_icdar13.mlmodel with coremltools?
See https://pypi.python.org/pypi/coremltools. I would be extremely grateful!

I tried this:

import coremltools
coreml_model = coremltools.converters.caffe.convert(('TextBoxes_icdar13.caffemodel',   'deploy.prototxt'))
coreml_model.save('TextBoxes_icdar13.mlmodel')

But I get this error:

[libprotobuf ERROR /git/coreml/deps/protobuf/src/google/protobuf/text_format.cc:298] Error parsing  text-format caffe.NetParameter: 758:14: Message type "caffe.LayerParameter" has no field named  "norm_param".
Traceback (most recent call last):
  File "convert.py", line 5, in <module>
    coreml_model = coremltools.converters.caffe.convert(('TextBoxes_icdar13.caffemodel', 'deploy.prototxt'))
  File "/Users/mambaxie/anaconda2/lib/python2.7/site-packages/coremltools/converters/caffe/_caffe_converter.py", line 142, in convert
    predicted_feature_name)
  File "/Users/mambaxie/anaconda2/lib/python2.7/site-packages/coremltools/converters/caffe/_caffe_converter.py", line 187, in _export
    predicted_feature_name
RuntimeError: Unable to load caffe network Prototxt file: deploy.prototxt

How do I fix it?

Failed to run make -j8

After I ran the command make -j8, I got this error:
/usr/include/google/protobuf/arenastring.h:219:31: note: candidate expects 1 argument, 0 provided
Makefile:575: recipe for target '.build_release/src/caffe/data_transformer.o' failed
make: *** [.build_release/src/caffe/data_transformer.o] Error 1
I'm using Ubuntu 16.04 with Protobuf 3.2.
Downgrading Protobuf to 3.1 didn't help either.

Query about icdar_2013.caffemodel

Hi @MhLiao

Could you please clarify whether the model icdar_2013.caffemodel was pretrained on the SynthText dataset and then trained on the ICDAR 2013 text localization dataset (as mentioned in the paper), or was trained only on the ICDAR 2013 text localization dataset?

ICDAR 2013 database

Dear All,

Could someone please send me the ICDAR 2013 database and the test_list.txt needed to run test_icdar13.py? I could not find them on the Internet.
Thanks in advance.

demo.py predicts only 5 words out of 500 words, why?

Hello,

I ran demo.py on an invoice image and got the following:

Why is the network not able to predict all the words?

Thank you

RuntimeError: Could not open file ./deploy.prototxt

When I run
python demo.py
an error occurs, and the error message is as follows:

[root@ml-gpu-ser167 TextBoxes]# python demo.py 
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0712 06:58:42.744659 24402 _caffe.cpp:135] DEPRECATION WARNING - deprecated use of Python interface
W0712 06:58:42.744719 24402 _caffe.cpp:136] Use this instead (with the named "weights" parameter):
W0712 06:58:42.744726 24402 _caffe.cpp:138] Net('./deploy.prototxt', 1, weights='./TextBoxes_icdar13.caffemodel')
Traceback (most recent call last):
  File "demo.py", line 36, in <module>
    caffe.TEST)     # use test mode (e.g., don't perform dropout)
RuntimeError: Could not open file ./deploy.prototxt

What's wrong with it?

How to use SynthText for training?

Text in SynthText is oriented and each label has 4 points. How can I use this data for training?

How to train textboxes for my own dataset from scratch?

"clip" parameter in PriorBoxLayer

Hi, @MhLiao
I found that you set the parameter 'clip' to 'true' in your training script.
Could you please explain why you did this?
Does this parameter have an effect on performance?
Thanks a lot.

Error in training upon changing the aspect ratios

Hi !
I was experimenting with the parameters and thought of changing the aspect ratios from [2,3,5,7,10] to [1,2,3,5,7] and got this error while training :

F0615 09:42:51.943936 18711 multibox_loss_layer.cpp:143] Check failed: num_priors_ * loc_classes_ * 4 == bottom[0]->channels() (224088 vs. 266728) Number of priors must match number of location predictions.

I get the same error if I try to add any other integer to [2,3,5,7,10].
Can you please help me resolve this issue?
Thanks in advance!
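
The failing check compares two numbers that come from different layers: the count of default boxes produced by the PriorBox layers, times 4 coordinates, against the channel count of the concatenated location predictions. Changing the aspect ratios changes the first number, so the num_output of every location and confidence convolution has to change with it. The sketch below only illustrates that bookkeeping. `expected_loc_channels` and `conv_num_output` are hypothetical helpers, not functions from the repository, and how many boxes a given aspect-ratio list yields per location depends on the PriorBox implementation in your checkout.

```python
def expected_loc_channels(feature_maps, loc_classes=1):
    """Total default boxes and the location channel count they require.

    feature_maps: list of (height, width, priors_per_location) tuples,
                  one per PriorBox layer.
    Returns (num_priors, num_priors * loc_classes * 4).
    """
    num_priors = sum(h * w * k for h, w, k in feature_maps)
    return num_priors, num_priors * loc_classes * 4


def conv_num_output(priors_per_location, values_per_prior):
    """num_output of one prediction conv layer.

    values_per_prior is 4 for location layers and the number of classes
    (2 for text/background) for confidence layers.
    """
    return priors_per_location * values_per_prior
```

As a made-up example, a 2x2 feature map with 3 boxes per location gives 12 default boxes and needs 48 location channels, so its location convolution needs `conv_num_output(3, 4) == 12` outputs per spatial position.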

Failed to import caffe : ImportError: cannot import name symbol_database

Hello,
I installed Caffe successfully and tested it with the Jupyter notebook examples suggested by Caffe.
Caffe is installed in /home/ahmed/caffe

TextBoxes is installed in /home/ahmed/TextBoxes
When I try to build TextBoxes, I get stuck at the following:
cd /home/ahmed/TextBoxes
works correctly, but when I run
make -j8
I get

Makefile:6: *** Makefile.config not found. See Makefile.config.example.. Stop.

Thank you

Problem when running python test_icdar13.py after modifying the related paths

Hello @MhLiao, the following error occurs when I run python test_icdar13.py. I only compiled the CPU version of Caffe. Would you mind giving me a hand? Thanks in advance.
jsj@jsj:~/TextBoxes/examples/TextBoxes$ python test_icdar13.py
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0527 19:02:48.273658 10427 common.cpp:66] Cannot use GPU in CPU-only Caffe: check mode.
*** Check failure stack trace: ***
Aborted (core dumped)
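
The fatal message above means the script requests GPU mode while Caffe was built CPU-only. A minimal sketch of making the mode switchable is shown below. `configure_caffe` is a hypothetical helper, while `set_mode_cpu`, `set_mode_gpu` and `set_device` are the usual pycaffe calls.

```python
def configure_caffe(caffe_module, use_gpu, device_id=0):
    """Select CPU or GPU mode on an already imported pycaffe module."""
    if use_gpu:
        caffe_module.set_device(device_id)
        caffe_module.set_mode_gpu()
        return "gpu"
    caffe_module.set_mode_cpu()
    return "cpu"


# Usage in a test script, in place of the hard-coded GPU calls:
#   import caffe
#   configure_caffe(caffe, use_gpu=False)
```

With a CPU-only build, pass `use_gpu=False`. Expect the forward pass to be much slower than on a GPU.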

How to generate an LMDB from the ICDAR2013 dataset?

Hi. The process of generating an LMDB from the ICDAR2013 dataset is not clear to me. Could you explain it in detail? Thank you.

How to pre-process ICDAR2017 Scene-text detection dataset

In the ICDAR2017 dataset, each image is described by ground truth bounding boxes, for example 86,191,142,191,139,214,84,214,Latin,Flame. How can I feed this data into the TextBoxes model?
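
One possible first step, sketched under the assumption that the training pipeline wants axis-aligned rectangles (as in Pascal VOC style annotations): convert each 4-point ground truth line to its enclosing box. `parse_gt_line` is a hypothetical helper, and the field order (eight coordinates, script, transcription) is read off the sample line above.

```python
def parse_gt_line(line):
    """Parse 'x1,y1,x2,y2,x3,y3,x4,y4,script,text' into an enclosing box."""
    # Split at most 9 times: the transcription itself may contain commas.
    parts = line.strip().split(",", 9)
    xs = [int(v) for v in parts[0:8:2]]
    ys = [int(v) for v in parts[1:8:2]]
    return {
        "xmin": min(xs), "ymin": min(ys),
        "xmax": max(xs), "ymax": max(ys),
        "script": parts[8], "text": parts[9],
    }
```

For the sample line this gives the box (84, 191) to (142, 214) with script "Latin" and text "Flame". Strongly rotated text loses tightness under this conversion, which is a limitation of rectangular boxes rather than of the parsing.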

Is there a way to do transfer learning / fine-tuning of the model on new data?

Error in running test_icdar13.py

@MhLiao There is an error when running test_icdar13.py at this line:
net = caffe.Net(model_def, # defines the structure of the model
model_weights, # contains the trained weights
caffe.TEST) # use test mode (e.g., don't perform dropout)

The console shows that:
[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 758:14: Message type "caffe.LayerParameter" has no field named "norm_param".
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0404 18:15:56.743150 22335 upgrade_proto.cpp:79] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: /home/scw4750/TextBoxes/examples/TextBoxes/deploy.prototxt
*** Check failure stack trace: ***

It seems that it cannot parse the parameter field in Layer: type: "Normalize" in deploy.prototxt. How can I resolve this issue?

Multiclass implementation of Textboxes

Hi,

I wanted to know if we can implement TextBoxes such that there are more than two possible classes? (currently the classes are text and background)
Thanks!

GPU memory (with / without cudnn)

@MhLiao hello,
Could you please tell me the GPU memory needed to train or test your proposed method?
And how much free memory is required if I use cuDNN?
Thank you very much!

Combined model of TextBoxes + CRNN

Hi,

Is there any plan to release TextBoxes + CRNN combined module or parts of it?
Please let me know. Thanks.

Problems reproducing your experiment

Hi MhLiao,
I have trained your TextBoxes on SynthText and ICDAR 2013, but some problems bother me. I got recall = 0.488, precision = 0.938 and f-measure = 0.64 for 700x700 single scale on the ICDAR test set. For multi-scale the f-measure is 0.73, which seems worse than your result.
The details of my training are below.
Step 1: pretrain on the SynthText data.

pretrain model: VGG_ILSVRC_16_layers_fc_reduced.caffemodel
train data: about 850k SynthText images, train size: 700x700, batch size: 8 (GPU limit)
lr: 0.0001 for 60k iterations, 0.00001 for the remaining 120k iterations, 180k iterations in total (loss about 2.0).
Step 2: train on the ICDAR 2013 training data.
pretrain model: the model from step 1.
train size: 700x700, batch_size = 4.
lr: 0.00001 for 3k iterations (loss about 1.5).
I have tried other settings, such as resizing the training data to 500x500, but still got a low recall (about 0.46). By the way, the final loss only goes down to about 2.0 when training on the synthetic data; I don't know whether there is something I have not taken into account.
Looking forward to your reply.

detected results are not the same as shown in the paper!

@MhLiao Hi,
I downloaded the code and ran test_icdar13.py with TextBoxes_icdar13.caffemodel on the ICDAR13 dataset (which contains 233 pictures), but some results (e.g. img_14, img_19, img_34) are worse than the pictures presented in your paper (Figure 3).

Could you help me find the reason? Did you use multi-scale testing to produce the pictures shown in Figure 3?

Thanks in advance!

Issue with generating training data.

How to create ICDAR-13 data for training. I am getting the following error :

What's the meaning of detection_eval?

Hi, @MhLiao
In the training phase, I saw the test net output "detection_eval = ***". What is the meaning of this value? I wonder whether I can judge the network's performance by it.
Thanks!
(I found this value isn't equal to the actual F-measure.)
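
For context, and as an assumption based on how SSD-derived solvers are usually configured (eval_type "detection" with ap_version "11point"), detection_eval is most likely a mean average precision rather than an F-measure, which would explain the mismatch noted above. A minimal sketch of the 11-point interpolated average precision for one class follows. `eleven_point_ap` is a hypothetical helper, not code from the repository.

```python
def eleven_point_ap(recalls, precisions):
    """11-point interpolated average precision (PASCAL VOC 2007 style).

    recalls, precisions: paired values measured at successive score thresholds.
    """
    total = 0.0
    for i in range(11):
        threshold = i / 10.0
        # Best precision among operating points that reach this recall level.
        candidates = [p for r, p in zip(recalls, precisions) if r >= threshold]
        total += max(candidates) if candidates else 0.0
    return total / 11.0
```

A detector that reaches only 0.5 recall at perfect precision scores 6/11 here, while its F-measure at that operating point would be about 0.67, so the two numbers are not directly comparable.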

TextBoxes_icdar13.caffemodel problem

Hi MhLiao
I downloaded the code and am ready to run it,
but I don't know where to find the file TextBoxes_icdar13.caffemodel.
Where can I find it?
Thanks

Error in running train_icdar13.py

@MhLiao There is an error when running train_icdar13.py.

The console shows that:
I0614 14:06:31.362638 9607 layer_factory.hpp:77] Creating layer data
I0614 14:06:31.371596 9607 net.cpp:100] Creating Layer data
I0614 14:06:31.371624 9607 net.cpp:408] data -> data
I0614 14:06:31.371659 9607 net.cpp:408] data -> label
F0614 14:06:31.371739 9610 db_lmdb.hpp:15] Check failed: mdb_status == 0 (2 vs. 0) No such file or directory
*** Check failure stack trace: ***
@ 0x7f2c227e95cd google::LogMessage::Fail()
@ 0x7f2c227eb433 google::LogMessage::SendToLog()
@ 0x7f2c227e915b google::LogMessage::Flush()
@ 0x7f2c227ebe1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f2c22df1350 caffe::db::LMDB::Open()
@ 0x7f2c22e2d276 caffe::DataReader<>::Body::InternalThreadEntry()
@ 0x7f2c1f4225d5 (unknown)
@ 0x7f2c1ecd06ba start_thread
@ 0x7f2c2128782d clone
@ (nil) (unknown)
Aborted (core dumped)

The single scale f-measure is 55%

@MhLiao hello
I want to reproduce the result in your paper, but I got a much lower f-measure of 55% (single scale). Below is the solver, which was created with your default code:
+++++++++++++++++++++++++++++++++++++++++
train_net: "models/TextBoxes/train.prototxt"
test_net: "models/TextBoxes/test.prototxt"
test_iter: 233
test_interval: 500
base_lr: 0.0001
display: 10
max_iter: 120000
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 60000
snapshot: 500
snapshot_prefix: "models/TextBoxes/snapshots2/"
solver_mode: GPU
device_id: 0
debug_info: false
snapshot_after_train: true
test_initialization: false
average_loss: 10
iter_size: 1
type: "SGD"
eval_type: "detection"
ap_version: "11point"
+++++++++++++++++++++++++++++++++++++++++
However, stepsize is 40000 in your paper.
Below are the transformer parameters of my train.prototxt:
+++++++++++++++++++++++++++++++++++++++++
transform_param {
mirror: false
mean_value: 104
mean_value: 117
mean_value: 123
resize_param {
prob: 1
resize_mode: WARP
height: 300
width: 300
interp_mode: LINEAR
interp_mode: AREA
interp_mode: NEAREST
interp_mode: CUBIC
interp_mode: LANCZOS4
}
emit_constraint {
emit_type: CENTER
}
}

Can you give the details of your solver.prototxt and transformer parameters, or some advice?
Thanks a lot~

Parameter settings for training

I have tried the parameter settings in your paper to train the model, but the performance of detection is bad. The f-measure on ICDAR2013-test is 0.72(All of the following results are tested with 700*700 single-scale. The f-measure should be 0.80 according to the paper). Here is my parameter setting for training:

Step 1. pretrain on the synthetic data

pretrain model: the VGG_ILSVRC_16_layers_fc_reduced.caffemodel
train data: the synthetic data (about 850k images) referred to in the paper
learning rate: 0.001 for the first 40k iterations, 0.0001 for the last 10k iterations
batch size: 8
The f-measure on ICDAR2013-test is about 0.6. The final train-loss is about 1.5

Step 2. train on the ICDAR2013-train data

pretrain model: the model of the Step 1.
train data: ICDAR2013-train
learning rate: 0.0001 for 2k iteration
batch size: 8
The f-measure on ICDAR2013-test is about 0.72. The final train-loss is about 0.9

Could you give me some advice on training or more information about the parameter setting for training? @MhLiao Thanks very much.

Instructions to setup the training.

Hi, I am totally new to this field and am struggling to find out how to train for about 50k iterations on the synthetic data referred to in the paper. I hope there is a guide on how to set up the training. Thanks.

How to test TextBoxes using ICDAR2013 dataset?

Hi,

I've downloaded the ICDAR2013 dataset (Task 2.1: Text Localization, 233 images and ground truth text files), but how can I generate test_list.txt?

Question about your multiscale input

Thanks for the excellent work. I have a question about your multi-scale input. The PriorBox layers need a specific input size to construct the default boxes, but in your multi-scale test file you just invoke the deploy.prototxt generated by the SSD300 training file for all scales. Does this mean that at a larger input scale (e.g. 700x700) the prior boxes have the same size as at 300x300, and the only difference is the number of boxes? Thank you.
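
A small sketch of the arithmetic behind this question, assuming the PriorBox layer behaves like the stock SSD one: min_size is given in pixels in the prototxt, so the pixel size of a default box does not depend on the input resolution, while the number of boxes grows with the feature-map size. Both helpers are hypothetical and ignore max_size boxes and any TextBoxes-specific offsets.

```python
import math


def prior_box_pixels(min_size, aspect_ratio):
    """Width and height in pixels of one default box for a given aspect ratio."""
    width = min_size * math.sqrt(aspect_ratio)
    height = min_size / math.sqrt(aspect_ratio)
    return width, height


def priors_in_map(map_height, map_width, priors_per_location):
    """Number of default boxes generated on one feature map."""
    return map_height * map_width * priors_per_location
```

With `min_size=30` and aspect ratio 4 the box is 60 by 15 pixels whether the input is 300x300 or 700x700. Only `priors_in_map` changes, because the feature maps of the larger input have more locations.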

Training on ICDAR 2017 and Inclusion of vertical offset boxes

I am trying to train on the COCO-Text data, and my loss stops decreasing significantly once it reaches 4.1.
Also, how are the vertical-offset default boxes included in the provided code?

Regards,
Aakriti

error with make runtest?

make succeeded,
but when I run make runtest, I get this error.

Do you know why?

[----------] 3 tests from MultiBoxLossLayerTest/3, where TypeParam = caffe::GPUDevice
[ RUN ] MultiBoxLossLayerTest/3.TestConfGradient
F0122 10:41:19.059348 24720 multibox_loss_layer.cpp:143] Check failed: num_priors_ * loc_classes_ * 4 == bottom[0]->channels() (128 vs. 64) Number of priors must match number of location predictions.
*** Check failure stack trace: ***
@ 0x7f488e9d6b6d google::LogMessage::Fail()
@ 0x7f488e9dab87 google::LogMessage::SendToLog()
@ 0x7f488e9d8a09 google::LogMessage::Flush()
@ 0x7f488e9d8d0d google::LogMessageFatal::~LogMessageFatal()
@ 0x7f488a7b6dc0 caffe::MultiBoxLossLayer<>::Reshape()
@ 0x55d82f caffe::Layer<>::SetUp()
@ 0x55ee3f caffe::GradientChecker<>::CheckGradientExhaustive()
@ 0x9d12d1 caffe::MultiBoxLossLayerTest_TestConfGradient_Test<>::TestBody()
@ 0xa804b3 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0xa782b7 testing::Test::Run()
@ 0xa7835e testing::TestInfo::Run()
@ 0xa78465 testing::TestCase::Run()
@ 0xa7a6f8 testing::internal::UnitTestImpl::RunAllTests()
@ 0xa7a987 testing::UnitTest::Run()
@ 0x54942f main
@ 0x7f4889a50bd5 __libc_start_main
@ 0x552149 (unknown)
make: *** [runtest] Aborted
