huoyijie / advancedeast

AdvancedEAST is an algorithm for text detection in scene images. It is primarily based on EAST, with significant improvements that make long-text predictions more accurate. https://github.com/huoyijie/raspberrypi-car

Home Page: https://huoyijie.cn/

License: MIT License

Python 100.00%

scene text-detect east keras tensorflow python deep-learning machine-learning computer-vision tianchi

advancedeast's Introduction

AdvancedEAST

AdvancedEAST is an algorithm for text detection in scene images. It is primarily based on EAST: An Efficient and Accurate Scene Text Detector, with significant improvements that make long-text predictions more accurate. If this project is helpful to you, a star is welcome. If you have any problems, please contact me.

email:[email protected]
website:https://huoyijie.cn

advantages

written in Keras, easy to read and run
based on EAST, an advanced text detection algorithm
easy to train the model
significant improvements were made, so long-text predictions are more accurate (please see the 'demo results' part below, and pay attention to the activation image, which starts with yellow grids and ends with green grids).

In my experiments, AdvancedEAST obtained much better prediction accuracy than EAST, especially on long text. EAST calculates the final vertex coordinates as a weighted mean of the vertex coordinates predicted by all pixels, so it is too difficult for a pixel to predict the 2 vertices on the far side of the quadrangle. See the EAST limitations, taken from the original paper, below.

project files

config file: cfg.py, control parameters
pre-process data: preprocess.py, resize images
label data: label.py, produce label info
define network: network.py
define loss function: losses.py
execute training: advanced_east.py and data_generator.py
predict: predict.py and nms.py

For a description of the post-processing, see Post-processing (with schematic diagram)

network arch

AdvancedEast

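Network output: the output layer consists of a 1-channel score map (whether the pixel is inside a text box), a 2-channel vertex code (whether the pixel belongs to the text box boundary, and whether it is head or tail), and a 4-channel geo (the coordinates of the 2 vertices that a boundary pixel can predict). All pixels together form the text box shape, and only the boundary pixels are used to regress the vertex coordinates. Boundary pixels are defined as all pixels inside the yellow and green boxes. The weighted average of the predictions of all boundary pixels is used to predict the two vertices at the ends of the head or tail short edge. The head and tail boundary pixels each predict 2 vertices, which gives the 4 vertex coordinates.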
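The weighted averaging described above can be sketched roughly as follows. This is a minimal NumPy illustration and not the repository's nms.py: the function name `weighted_vertex`, the choice of a per-pixel confidence as the weight, and the sample values are all assumptions made for the example.

```python
import numpy as np


def weighted_vertex(pred_xy, weights):
    """Weighted mean of per-pixel vertex predictions (hypothetical helper).

    pred_xy: (N, 2) array, one predicted (x, y) per boundary pixel.
    weights: (N,) array of non-negative confidences for those pixels.
    """
    pred_xy = np.asarray(pred_xy, dtype=np.float64)
    weights = np.asarray(weights, dtype=np.float64)
    total = weights.sum()
    if total <= 0:
        raise ValueError("at least one boundary pixel needs a positive weight")
    # Scale each prediction by its weight, sum, then normalize.
    return (pred_xy * weights[:, None]).sum(axis=0) / total


# Three head-side pixels, each predicting the same vertex of one box.
head_preds = [[10.0, 20.0], [12.0, 22.0], [11.0, 21.0]]
head_scores = [0.5, 0.25, 0.25]
vertex = weighted_vertex(head_preds, head_scores)
```

Because only pixels near one short edge vote for the vertices on that edge, no pixel has to regress a vertex on the far side of a long text line.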

Brief introduction to the principle (with schematic diagram)

East

setup

python 3.6.3+
tensorflow-gpu 1.5.0+(or tensorflow 1.5.0+)
keras 2.1.4+
numpy 1.14.1+
tqdm 4.19.7+

training

Download the Tianchi ICPR dataset. Link: https://pan.baidu.com/s/1NSyc-cHKV3IwDo6qojIrKA Password: ye9y
Prepare the training data: make a data root dir (icpr), then copy the images and the txt files to the root dir. For data format details, refer to 'ICPR MTWI 2018 Challenge 2: Text Detection in Web Images', Link
Modify the config params in cfg.py; see the default values.
python preprocess.py, resizes images to 256*256, 384*384, 512*512, 640*640, 736*736; training at each size respectively could speed up the training process.
python label.py
python advanced_east.py, the training entry point
python predict.py -p demo/001.png, to predict
Pretrained model download (for testing). Link: https://pan.baidu.com/s/1KO7tR_MW767ggmbTjIJpuQ Password: kpm2

demo results

compared with east based on vgg16

As you can see, although the text area prediction is very accurate, the vertex coordinates are not accurate enough.

License

The code is released under the MIT License.

references

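Brief introduction to the principle (with schematic diagram)

For a description of the post-processing, see Post-processing (with schematic diagram)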

A Simple RaspberryPi Car Project


advancedeast's Issues

Could you provide us with a pre-trained model?

how to calculate the f1 score?

what is the biggest difference with the east?

I want to know the biggest difference from EAST: the backbone or the post-processing?
@huoyijie
Looking forward to hearing from you.

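A strange incomplete-detection problem

Your demo detects a motor vehicle driving license. I tested with a photo of a business card and found that this kind of detection does not seem to handle text with gaps well. For example, the name on the card consists of two separated characters, and with your current pixel threshold each one is boxed individually. After I changed the threshold, the result covered both characters, but neither was boxed completely. Am I misunderstanding the code? What is your view on this kind of problem? Also, what effect does each of the three thresholds have? I did not fully understand the scoring strategy and would appreciate an explanation.

A question about label generation for the side regions

Head and tail are marked according to membership in the first poly, but the initial map is 0: ith=0 for head-side, ith=1 for tail-side, and ith=-1 for no-side. With this labeling, head-side and non-side both have the value 0, so visualizing this part shows only tails and no heads, and the head loss would be computed incorrectly during training. What does the author think?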

weight file error

When I use the file 'east_model_weights_2T736.h5', an error comes up: ValueError: You are trying to load a weight file containing 58 layers into a model with 30 layers. It seems the weight file does not match.

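Detection results with missing head or tail

@huoyijie
Hello, and thank you for sharing your code. I have run into some problems that I cannot figure out and would like to ask for advice.
First, I need to detect text in natural scenes, so I changed the training data to the ICDAR 2017 dataset plus a dataset I annotated myself, and then tested on some images. I found several problems: many images show a head without a tail, a tail without a head, neither head nor tail, or two adjacent lines that cross and cannot be separated. I cannot find the cause or a way to improve this. Could you give me some suggestions? I am in my third year of graduate school, so this is rather urgent. Thanks!

Some questions about the loss

What loss does side_vertex_code_loss represent? Also, in what form are the predicted coordinates (for example, top-left and bottom-right corners, or some other form)?

Is QUAD training supported?

Implementing only RBOX is not enough; can QUAD training be implemented?

.npy files

Hello, what are the .npy files that correspond to the label files? Do I need to generate them myself?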

Can you provide us with a pre-trained model?

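Question about the dataset

Hello, how can I generate my own dataset in the required format? The dataset you provide pairs each image with a txt file that contains the coordinates of four points, but labelImg produces xml files, which do not match the required format. Thanks for your help.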
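One possible way to bridge the two formats is sketched below. labelImg's default output is Pascal VOC XML with `xmin`/`ymin`/`xmax`/`ymax` per object. The target line layout used here (eight comma-separated coordinates followed by the text) and the vertex order are assumptions, and the helper name `voc_xml_to_quad_lines` is hypothetical. Check both against the dataset description before relying on this.

```python
import xml.etree.ElementTree as ET


def voc_xml_to_quad_lines(xml_text):
    """Convert labelImg (Pascal VOC) boxes to 'x1,y1,...,x4,y4,text' lines."""
    root = ET.fromstring(xml_text)
    lines = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        xmin = float(box.findtext("xmin"))
        ymin = float(box.findtext("ymin"))
        xmax = float(box.findtext("xmax"))
        ymax = float(box.findtext("ymax"))
        # Assumed vertex order: top-left, bottom-left, bottom-right, top-right.
        quad = [xmin, ymin, xmin, ymax, xmax, ymax, xmax, ymin]
        text = obj.findtext("name", default="")
        lines.append(",".join("{:.2f}".format(v) for v in quad) + "," + text)
    return lines


sample = """<annotation>
  <object>
    <name>text</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>50</ymax></bndbox>
  </object>
</annotation>"""
lines = voc_xml_to_quad_lines(sample)
```

Axis-aligned boxes from labelImg cannot express rotated or skewed text, so a tool that annotates four free points per region is likely a better fit for slanted text.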

python3 predict.py error

python3 predict.py

Traceback (most recent call last):
File "predict.py", line 148, in <module>
east_detect.load_weights(cfg.saved_model_weights_file_path)
File "/home/ddc/.local/lib/python3.5/site-packages/keras/engine/network.py", line 1161, in load_weights
f, self.layers, reshape=reshape)
File "/home/ddc/.local/lib/python3.5/site-packages/keras/engine/saving.py", line 900, in load_weights_from_hdf5_group
str(len(filtered_layers)) + ' layers.')
ValueError: You are trying to load a weight file containing 58 layers into a model with 30 layers.

Detect a line rather than a word

Which field to tweak so that a word box can be converted to line

could you give some analysis of the process of label.py or some code annotation for easy reading

Thanks, I got it solved.

File "/home/louj1/pywork/AEAST/data_generator.py", line 37, in gen
y[i] = np.load(gt_file)
ValueError: could not broadcast input array from shape (21,4,2) into shape (64,64,7)
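The shape (21, 4, 2) looks like 21 quadrangles with 4 (x, y) vertices each rather than a per-pixel label map, which suggests the file being loaded is not the one the generator expects. A small sanity check can be sketched as below. The helper names and the default `pixel_size=4` are assumptions; the value 4 is inferred only from 64 = 256 / 4.

```python
def expected_label_shape(img_size, pixel_size=4, channels=7):
    """Shape of a per-pixel label map for a square training image."""
    if img_size % pixel_size != 0:
        raise ValueError("img_size must be divisible by pixel_size")
    side = img_size // pixel_size
    return (side, side, channels)


def check_label_shape(label_shape, img_size, pixel_size=4):
    """Return None if the shape matches, else a readable error message."""
    expected = expected_label_shape(img_size, pixel_size)
    if tuple(label_shape) == expected:
        return None
    return "label has shape {}, expected {}".format(tuple(label_shape), expected)


ok = check_label_shape((64, 64, 7), img_size=256)
bad = check_label_shape((21, 4, 2), img_size=256)
```

Running such a check on every label file before training reports the offending file early instead of failing inside the generator.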

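How does this model rank on ICPR MTWI 2018?

How does this model rank on ICPR MTWI 2018? Thanks.

Accuracy of the pretrained model

Thank you very much for open-sourcing the code. Have you tried testing it on ICDAR2015? Compared with the original EAST model, roughly what precision and recall does your model achieve? Many thanks.

ImportError: cannot import name 'abs'

Has anyone run into this problem? Is it a version mismatch? Any advice would be appreciated.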

Training error: could not broadcast input array from shape

Hi
I am trying to run the training and am getting the following error:

 File "D:\AdvancedEAST\data_generator.py", line 33, in gen
    img

ValueError: could not broadcast input array from shape (720,1280,3) into shape (736,736,3)

Did you ever get such an error?
Any suggestions?
My data includes just a few images for merely initiating the training loop.

The tree structure of the files used for training is displayed below:

quad invalid with vertex num less then 4.

I used your pretrained model to predict on a business card image and got this error; testing with the images in your demo works fine.

I seem to have found a bug in your code

It is in the function "resize_image" of the file "preprocess.py". The suspect lines are below:

if im_width == max_img_size < im.width:
if o_height == max_img_size < im_height:

I think they should be as below:

if im_width == max_img_size:
if o_height == max_img_size:

I don't know if I am right; could the author please judge?
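For context, Python allows chained comparisons: `a == b < c` is valid syntax and evaluates as `a == b and b < c`. The sketch below compares the chained form with the simplified form from this issue. It assumes the clamped size was computed as `min(original, max_size)`, which is not shown in the quoted lines, and all function names are hypothetical.

```python
def needs_downscale_chained(size, max_size, original):
    # Chained form quoted in the issue: (size == max_size) and (max_size < original).
    return size == max_size < original


def needs_downscale_simple(size, max_size):
    # Simplified form proposed in the issue.
    return size == max_size


def scaled_height(width, height, max_size, use_chained):
    """Height after clamping the width to max_size, keeping the aspect ratio."""
    new_width = min(width, max_size)  # assumed clamping step
    if use_chained:
        resize = needs_downscale_chained(new_width, max_size, width)
    else:
        resize = needs_downscale_simple(new_width, max_size)
    return int((new_width / width) * height) if resize else height


# Wider than the limit, exactly at the limit, and below the limit.
cases = [(1280, 720, 736), (736, 500, 736), (600, 400, 736)]
results = [(scaled_height(w, h, m, True), scaled_height(w, h, m, False))
           for w, h, m in cases]
```

Under that assumption the two forms only disagree on the branch taken when the original size equals the limit exactly, and there the scale factor is 1, so the resulting height is the same either way.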

merge layer produces multiple network branches

AdvancedEAST/network.py

Lines 65 to 70 in ba78824

    inside_score = Conv2D(1, 1, padding='same', name='inside_score'
                          )(self.g(cfg.feature_layers_num))
    side_v_code = Conv2D(2, 1, padding='same', name='side_vertex_code'
                         )(self.g(cfg.feature_layers_num))
    side_v_coord = Conv2D(4, 1, padding='same', name='side_vertex_coord'
                          )(self.g(cfg.feature_layers_num))

These lines generate three separate feature merge layer branches. Is this the intended result?

Could the 7 output maps be generated directly like this instead:
Conv2D(7, 1, padding='same', name='output_all')(self.g(cfg.feature_layers_num))
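The effect described in this issue can be illustrated without any deep learning framework. The toy below assumes that each call to the builder creates fresh layers, which the quoted lines alone do not prove. `LayerCounter` and `merge_branch` are hypothetical stand-ins, not part of network.py.

```python
class LayerCounter:
    """Toy stand-in for a framework: counts how many layers get created."""

    def __init__(self):
        self.created = 0

    def new_layer(self, name):
        self.created += 1
        return name


def merge_branch(counter, depth):
    """Hypothetical recursive builder: each call creates `depth` fresh layers."""
    if depth == 0:
        return []
    return merge_branch(counter, depth - 1) + [counter.new_layer("merge_%d" % depth)]


HEADS = ("inside_score", "side_vertex_code", "side_vertex_coord")

# Pattern from the quoted lines: the builder is called once per output head.
per_head = LayerCounter()
for head in HEADS:
    merge_branch(per_head, 4)
    per_head.new_layer(head)

# Alternative: build the merge branch once and reuse it for every head.
shared = LayerCounter()
merge_branch(shared, 4)
for head in HEADS:
    shared.new_layer(head)
```

If the real builder behaves like this toy, sharing one branch changes the set of layers in the model, so weight files saved from the three-branch version would probably no longer load into it.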

Alternative model download website

Can any of you guys please provide the model weights in a site for non-chinese speakers? ;-) pan.baidu is impossible for me to use.

Dimensions of y

What are the dimensions of this y matrix?

    y = np.zeros((batch_size, pixel_num_h, pixel_num_w, 7), dtype=np.float32)

Why is it 7 rather than 9?
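Following the README's description of the network output, the 7 channels would break down as 1 score + 2 vertex code + 4 geo, because each boundary pixel predicts only the 2 vertices on its own side. A layout with 1 score channel and all 4 vertices (8 coordinates) would give 9. The channel order and the names in the sketch below are assumptions for illustration.

```python
import numpy as np

# Assumed channel layout, following the network output description: 1 + 2 + 4.
CHANNELS = {
    "inside_score": slice(0, 1),       # is the pixel inside a text box
    "side_vertex_code": slice(1, 3),   # is it a boundary pixel; head or tail
    "side_vertex_coord": slice(3, 7),  # (x, y) for the 2 vertices it predicts
}

batch_size, pixel_num_h, pixel_num_w = 2, 64, 64
y = np.zeros((batch_size, pixel_num_h, pixel_num_w, 7), dtype=np.float32)

# Slice the last axis into the three logical groups.
parts = {name: y[..., sl] for name, sl in CHANNELS.items()}
total_channels = sum(p.shape[-1] for p in parts.values())
```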

rrc.cvc.uab.es

@huoyijie Can you publish AdvancedEast to http://rrc.cvc.uab.es/?ch=8&com=evaluation
It is a competition website.

quad invalid with vertex num less then 4

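Hello, I downloaded the pretrained model you provide and tested it on the ID card and motor vehicle driving license images in the GitHub demo, but this problem always occurs and I cannot reproduce your original results. What is the cause?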

Cannot download from Baidu

Hi
Can you please share the data and the pretrained model on GoogleDrive, DropBox, Box or any other cloud service.
Unfortunately, Baidu does not work so great in the west.
Best Regards
Wajahat

validation_steps=None

@huoyijie

(east) home@home-lnx:~/Desktop/program/AdvancedEAST$ python advanced_east.py
Using TensorFlow backend.
2019-02-11 18:44:57.258140: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
2019-02-11 18:44:57.259110: I tensorflow/core/common_runtime/process_util.cc:69] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
.
.
.
.
==================================================================================================
Total params: 15,087,367
Trainable params: 15,083,655
Non-trainable params: 3,712
__________________________________________________________________________________________________
WARNING:tensorflow:Variable *= will be deprecated. Use `var.assign(var * other)` if you want assignment to the variable value or `x = x * y` if you want a new python Tensor object.
Traceback (most recent call last):
  File "advanced_east.py", line 31, in <module>
    verbose=1)])
  File "/home/home/anaconda3/envs/east/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/home/anaconda3/envs/east/lib/python3.6/site-packages/keras/engine/training.py", line 2115, in fit_generator
    raise ValueError('`validation_steps=None` is only valid for a'
ValueError: `validation_steps=None` is only valid for a generator based on the `keras.utils.Sequence` class. Please specify `validation_steps` or use the `keras.utils.Sequence` class.
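The message says that a plain Python generator needs an explicit integer `validation_steps`. One way to derive such values from the dataset size is sketched below. The sample counts, the batch size, and the helper name `steps_for` are hypothetical; how cfg.py computes its own values is not shown in this log.

```python
def steps_for(num_samples, batch_size):
    """Number of full batches in one pass over the data (at least 1)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return max(1, num_samples // batch_size)


# Hypothetical counts; replace with the real sizes of the train/val splits.
num_train = 9000
num_val = 1000
batch_size = 8

steps_per_epoch = steps_for(num_train, batch_size)
validation_steps = steps_for(num_val, batch_size)
```

If the validation split is empty or smaller than one batch, a derived value can silently become 0 or None, so it is worth printing both numbers before starting a long run.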

found a bug in data_generator.py line 37

The shape of y doesn't match the shape of np.load(gt_file).

I found that Megvii (旷视) uses this code in a commercial application; is it authorized?

Respected maintainer, may I ask: I could not find the training entry point for AdvancedEAST, and I suspect many people like me are looking for it. Thank you.

Pretrained model

Can you please share the pretrained model?

Number of training iterations

When you trained at 256*256, roughly how many iterations did it take to converge?

[Request] Few requests

Hello sir.
Thank you for your great work.

Would you mind uploading your model to Google Drive?
And translating the README into English?

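Error when running predict.py

Hello! I set up tensorflow in a new anaconda virtual environment, but the error above appears when I run predict.py, and I cannot find a solution.

How can I make the training dataset format?

Hello, how can I build a dataset in a format like ICDAR2015? What tool should I use to annotate the collected images, and how?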

nms.py costs too much time?

When I predict on a picture, nms.py seems to take too much time. This happens more often with complicated pictures, where it can take 10 seconds or much more. Why?

A minus sign is missing

AdvancedEAST/losses.py

Line 17 in 6790f1e

(1 - beta) * (1 - labels) * tf.log(1 - predicts + cfg.epsilon)))

According to the definition of balanced cross entropy, this line should be -1 * (1 - beta) * (1 - labels) * tf.log(1 - predicts + cfg.epsilon))); a minus sign is missing at the front :)
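For reference, class-balanced cross entropy as defined in the EAST paper negates the sum of both log terms. The plain-Python sketch below illustrates that definition and is not the repository's losses.py. The single quoted line does not show whether it already sits inside an outer negation, so the sketch makes no claim about the actual code. The `epsilon` default is an assumed value.

```python
import math


def balanced_cross_entropy(labels, predicts, beta, epsilon=1e-4):
    """Mean class-balanced cross entropy over paired labels and predictions.

    beta weights the positive term and (1 - beta) the negative term. The
    minus sign is applied to the sum of both terms, turning the negative
    log values into a positive penalty.
    """
    total = 0.0
    for y, p in zip(labels, predicts):
        pos = beta * y * math.log(p + epsilon)
        neg = (1 - beta) * (1 - y) * math.log(1 - p + epsilon)
        total += -(pos + neg)
    return total / len(labels)


labels = [1.0, 0.0, 0.0, 0.0]
beta = 1 - sum(labels) / len(labels)  # fraction of negative samples
good = balanced_cross_entropy(labels, [0.9, 0.1, 0.1, 0.1], beta)
bad = balanced_cross_entropy(labels, [0.1, 0.9, 0.9, 0.9], beta)
```

Negating the whole bracket once is equivalent to negating each term separately, so both ways of writing it give the same loss.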

Please upload pretrained model in other sites like google drive or dropbox

HI Can you upload the pretrained model in google drive/drop box please?

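Test set

The Alibaba Tianchi competition has ended and the dataset can no longer be downloaded. The author provides only the training set; could you also provide the test set?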

inaccurate detection for image larger than 1000 pixel wide

the code resize the image to max 736 pixel wide before processing the detection, this makes the text in the image (larger than 1000 pixel wide) too small to detect, and leads to inaccuracy.

if I change the resize process, i got resource exhausted error. do you have any ideas how to solve this?

Training details about different sizes

python preprocess.py, resize image to 256*256, 384*384, 512*512, 640*640, 736*736, and train respectively could speed up training process.

I am somewhat confused about the meaning of "train respectively".
Does it mean training the network in a coarse-to-fine process, which initializes the network at 256x256 and then fine-tunes it at larger sizes?
Does this make the network converge faster than training it at 736x736 directly?

Hello, is east_model_weights_2T736.h5 a pre-trained model or the final trained model? When I use it for detection, the results are very poor.

The loss of the validation set does not decrease

Excuse me, I used your ICPR dataset and ran python preprocess.py, python label.py and python advanced_east.py, but the validation loss does not decrease, so when training stops the results on test data are very bad.
Can you help me? What went wrong?

Weights loading issues.

Hi!
I try to train my own datasets with your pre-trained weights.
First I set load_weights=True in cfg.py, then an error occurs in advanced_east.py as follow：

Traceback (most recent call last):
File "D:/Workspace/Ad_new/advanced_east.py", line 17, in main
east_network.load_weights(cfg.saved_model_weights_file_path)
File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\network.py", line 1391, in load_weights
saving.load_weights_from_hdf5_group(f, self.layers)
File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\saving.py", line 732, in load_weights_from_hdf5_group
' layers.')
ValueError: You are trying to load a weight file containing 30 layers into a model with 1 layers.

According to stackoverflow, I rewrite advanced_east.py ：

import os
from tensorflow.python.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.python.keras.optimizers import Adam
from tensorflow.python.keras.models import load_model
from tensorflow.python.keras.models import model_from_json

import cfg
from network import East
from losses import quad_loss
from data_generator import gen

east = East()
east_network = east.east_network()

if cfg.load_weights and os.path.exists(cfg.saved_model_weights_file_path):
  east_network.load_weights(cfg.saved_model_weights_file_path, by_name=True)
  json_string = east_network.to_json()
  east_network = model_from_json(json_string)

east_network.summary()
east_network.compile(loss=quad_loss, optimizer=Adam(lr=cfg.lr,
                                                    # clipvalue=cfg.clipvalue,
                                                    decay=cfg.decay))

east_network.fit_generator(generator=gen(),
                           steps_per_epoch=cfg.steps_per_epoch,
                           epochs=cfg.epoch_num,
                           validation_data=gen(is_val=True),
                           validation_steps=cfg.validation_steps,
                           verbose=1,
                           # use_multiprocessing=True,
                           initial_epoch=cfg.initial_epoch,
                           callbacks=[
                               EarlyStopping(patience=cfg.patience, verbose=1),
                               ModelCheckpoint(filepath=cfg.model_weights_path,
                                               save_best_only=False,
                                               save_weights_only=True,
                                               verbose=1)])
east_network.save(cfg.saved_model_file_path)
east_network.save_weights(cfg.saved_model_weights_file_path)

After then, my training process started successfully.
While I tried to predict images with my trained weights east_model_weights_3T736.h5 in folder saved_model , the error showing up again:

Traceback (most recent call last):
File "D:/Workspace/Ad_new/test_img.py", line 163, in <module>
main()
File "D:/Workspace/Ad_new/test_img.py", line 137, in main
east_detect.load_weights("saved_model/east_model_weights_3T736.h5")
File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\network.py", line 1391, in load_weights
saving.load_weights_from_hdf5_group(f, self.layers)
File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\saving.py", line 732, in load_weights_from_hdf5_group
' layers.')
ValueError: You are trying to load a weight file containing 1 layers into a model with 30 layers.

I wanted to use the same solution and code as follow：

  east = East()
  east_detect = east.east_network()
  east_detect.load_weights("saved_model/east_model_weights_3T736.h5", by_name=True)
  json_string = east_detect.to_json()
  east_detect = model_from_json(json_string)

Here occurs new error：

File "D:/Workspace/Ad_new/test_img.py", line 163, in <module>
main()
File "D:/Workspace/Ad_new/test_img.py", line 130, in main
east_detect.load_weights("saved_model/east_model_weights_3T736.h5", by_name=True)
File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\network.py", line 1389, in load_weights
saving.load_weights_from_hdf5_group_by_name(f, self.layers)
File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\saving.py", line 810, in load_weights_from_hdf5_group_by_name
K.batch_set_value(weight_value_tuples)
File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\backend.py", line 2711, in batch_set_value
assign_op = x.assign(assign_placeholder)
File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 945, in assign
self._shape.assert_is_compatible_with(value_tensor.shape)
File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 847, in assert_is_compatible_with
raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (128,) and (1024,) are incompatible

Am I using a wrong way loading or saving weights? And is there anyone else facing the same problem?

	inside_score = Conv2D(1, 1, padding='same', name='inside_score'
	)(self.g(cfg.feature_layers_num))
	side_v_code = Conv2D(2, 1, padding='same', name='side_vertex_code'
	)(self.g(cfg.feature_layers_num))
	side_v_coord = Conv2D(4, 1, padding='same', name='side_vertex_coord'
	)(self.g(cfg.feature_layers_num))