huoyijie / advancedeast

AdvancedEAST is an algorithm for scene image text detection. It is primarily based on EAST, with significant improvements that make long-text predictions more accurate. https://github.com/huoyijie/raspberrypi-car

Home Page: https://huoyijie.cn/

License: MIT License

Python 100.00%
scene text-detect east keras tensorflow python deep-learning machine-learning computer-vision tianchi

advancedeast's Introduction

AdvancedEAST

AdvancedEAST is an algorithm for scene image text detection. It is primarily based on EAST: An Efficient and Accurate Scene Text Detector, with significant improvements that make long-text predictions more accurate. If this project is helpful to you, you are welcome to star it. If you have any problems, please contact me.

advantages

  • written in keras, easy to read and run
  • based on EAST, an advanced text detection algorithm
  • easy to train the model
  • significant improvements make long-text predictions more accurate (please see the 'demo results' part below, and pay attention to the activation image, which starts with yellow grids and ends with green grids)

In my experiments, AdvancedEast obtained much better prediction accuracy than East, especially on long text. East calculates the final vertex coordinates as weighted means of the vertex coordinates predicted by all pixels, so a pixel has to predict the 2 vertexes on the far side of the quadrangle, which is too difficult. See the East limitations, taken from the original paper, below.

East limitations

project files

  • config file: cfg.py, control parameters
  • pre-process data: preprocess.py, resize image
  • label data: label.py, produce label info
  • define network: network.py
  • define loss function: losses.py
  • execute training: advanced_east.py and data_generator.py
  • predict: predict.py and nms.py

For a description of the post-processing procedure, see: Post-processing (with schematic diagrams)

network arch

  • AdvancedEast

AdvancedEast network arch

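Network output: the output layer consists of a 1-channel score map, indicating whether a pixel is inside a text box; a 2-channel vertex code, indicating whether a pixel is a boundary pixel of the text box and whether it belongs to the head or the tail; and a 4-channel geo, holding the coordinates of the 2 vertexes that a boundary pixel can predict. All pixels together form the shape of the text box, and only the boundary pixels are used to regress the vertex coordinates. Boundary pixels are defined as all pixels inside the yellow and green boxes. The weighted mean of the predictions of all boundary pixels is used to predict the two vertexes at the ends of the short side at the head or the tail. The head and tail boundary pixels each predict 2 vertexes, which gives 4 vertex coordinates in total.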
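As a rough illustration of the 1 + 2 + 4 channel layout described above, the sketch below splits the 7 values predicted for one pixel into the three groups. The names `PixelOutput` and `split_pixel`, the channel order, and the sample values are assumptions made for this example, not taken from the project code.

```python
from typing import NamedTuple, Sequence, Tuple


class PixelOutput(NamedTuple):
    """The three output groups for a single pixel (hypothetical container)."""
    inside_score: float                            # 1 channel: inside a text box?
    vertex_code: Tuple[float, float]               # 2 channels: boundary pixel? head or tail?
    vertex_geo: Tuple[float, float, float, float]  # 4 channels: coordinates of 2 vertexes


def split_pixel(values: Sequence[float]) -> PixelOutput:
    """Split 7 per-pixel values, assuming the order score, code, geo."""
    if len(values) != 7:
        raise ValueError("expected 7 output channels, got %d" % len(values))
    return PixelOutput(values[0], tuple(values[1:3]), tuple(values[3:7]))


# Made-up values for one boundary pixel, for illustration only.
example = split_pixel([0.9, 1.0, 0.0, 12.0, -3.0, 12.0, 5.0])
print(example.inside_score, example.vertex_code, example.vertex_geo)
```

Only pixels whose vertex code marks them as boundary pixels would contribute their geo values to the weighted mean that produces the final vertexes.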

Introduction to the principle (with schematic diagrams)

  • East

East network arch

setup

  • python 3.6.3+
  • tensorflow-gpu 1.5.0+(or tensorflow 1.5.0+)
  • keras 2.1.4+
  • numpy 1.14.1+
  • tqdm 4.19.7+

training

  • tianchi ICPR dataset download link: https://pan.baidu.com/s/1NSyc-cHKV3IwDo6qojIrKA password: ye9y

  • prepare training data: make a data root dir (icpr), then copy the images and the txt files to the root dir. For data format details, refer to 'ICPR MTWI 2018 挑战赛二:网络图像的文本检测' (ICPR MTWI 2018 Challenge 2: Text Detection in Web Images), Link

  • modify config params in cfg.py, see default values.

  • python preprocess.py, resize images to 256*256, 384*384, 512*512, 640*640, 736*736; training at each size in turn could speed up the training process.

  • python label.py

  • python advanced_east.py, the training entry point

  • python predict.py -p demo/001.png, to predict

  • pretrained model download (for testing) link: https://pan.baidu.com/s/1KO7tR_MW767ggmbTjIJpuQ password: kpm2

demo results

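001 original image, 001 activation map, 001 prediction

004 original image, 004 activation map, 004 prediction

005 original image, 005 activation map, 005 prediction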

  • compared with east based on vgg16

As you can see, although the text area prediction is very accurate, the vertex coordinates are not accurate enough.

001 activation map, 001 prediction

License

The code is released under the MIT License.

references

Introduction to the principle (with schematic diagrams)

For a description of the post-processing procedure, see: Post-processing (with schematic diagrams)

A Simple RaspberryPi Car Project

advancedeast's Issues

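A strange incomplete-detection problem

Your demo detects a motor vehicle driving license. I photographed a business card to test and found that this kind of detection does not seem to handle text with gaps well. For example, the name on the card consists of two characters set apart from each other; with your current pixel threshold, each character is boxed separately. After I changed the threshold, one box covered both characters, but neither was fully enclosed. Am I misunderstanding the code? What is your view on this problem? Also, what effect does each of the three thresholds have? I did not fully understand the scoring strategy and would appreciate an explanation.

Question about label generation for the side regions

Head and tail are labeled according to the first poly a pixel belongs to, but the map is initialized to 0: ith=0 for head-side, ith=1 for tail-side, and ith=-1 for no-side. With this labeling, head-side and non-side pixels both have the value 0. Visualizing this part shows only the tail and no head, and the head loss will be computed incorrectly during training. What does the author think?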

weight file error

When I use the file 'east_model_weights_2T736.h5', an error comes up. ValueError: You are trying to load a weight file containing 58 layers into a model with 30 layers. It seems the weight file does not match the model.

Detection results have no head or no tail

@huoyijie
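Hello, and thank you for sharing your code. I have run into some problems that I cannot figure out and would like to ask for advice.
First, I want to detect text in natural scenes, so I changed the training data: I use the ICDAR 2017 dataset together with a dataset I annotated myself, and then tested on some additional images. I found several problems. Many images show a head without a tail, a tail without a head, neither head nor tail, or adjacent lines that cross and cannot be separated. I cannot find the cause or a way to improve this. Could you give me some suggestions? I am in my third year of graduate school, so this is rather urgent. Thank you.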

[Images: activation maps and predictions for 31.jpg, 41.jpg, and 8.jpg]

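Some questions about the loss

What loss does side_vertex_code_loss represent? Another question: in what form are the predicted coordinates (for example top-left and bottom-right corners, or some other form)?

.npy files

Hello, what are the .npy files that correspond to the label files? Do I need to generate them myself?

Question about the dataset

Hello, how can I build my own dataset in the required format? The dataset you provide pairs each image with a txt file that contains the coordinates of four points, but labelImg gave me xml files, which do not match that format. Thank you for your help.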
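One possible way to bridge the two formats is sketched below. It assumes labelImg wrote Pascal VOC-style XML with axis-aligned `bndbox` rectangles, and that each txt line holds eight comma-separated coordinates followed by the text. The function name `voc_xml_to_quad_lines`, the vertex order, and the sample annotation are all assumptions for illustration; check them against the dataset description before use.

```python
import xml.etree.ElementTree as ET


def voc_xml_to_quad_lines(xml_text):
    """Convert Pascal VOC-style boxes into 'x1,y1,...,x4,y4,text' lines."""
    root = ET.fromstring(xml_text)
    lines = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # Assumed vertex order: top-left, bottom-left, bottom-right, top-right.
        quad = [xmin, ymin, xmin, ymax, xmax, ymax, xmax, ymin]
        name = obj.findtext("name", default="")
        lines.append(",".join("%g" % v for v in quad) + "," + name)
    return lines


# Made-up annotation with a single rectangle, for illustration only.
SAMPLE = """<annotation>
  <object>
    <name>text</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>"""

print(voc_xml_to_quad_lines(SAMPLE))
```

Because a rectangle cannot describe rotated or skewed text, a tool that records four free corner points would likely give better labels for such cases.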

python3 predict.py error

python3 predict.py

Traceback (most recent call last):
File "predict.py", line 148, in
east_detect.load_weights(cfg.saved_model_weights_file_path)
File "/home/ddc/.local/lib/python3.5/site-packages/keras/engine/network.py", line 1161, in load_weights
f, self.layers, reshape=reshape)
File "/home/ddc/.local/lib/python3.5/site-packages/keras/engine/saving.py", line 900, in load_weights_from_hdf5_group
str(len(filtered_layers)) + ' layers.')
ValueError: You are trying to load a weight file containing 58 layers into a model with 30 layers.

Thanks, solved.

File "/home/louj1/pywork/AEAST/data_generator.py", line 37, in gen
y[i] = np.load(gt_file)
ValueError: could not broadcast input array from shape (21,4,2) into shape (64,64,7)

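Accuracy of the pretrained model

Thank you very much for open-sourcing the code. Have you tested it on ICDAR2015? Compared with the original EAST model, roughly what precision and recall does your model achieve? Many thanks.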

Training error: could not broadcast input array from shape

Hi
I am trying to run the training and am getting the following error:

 File "D:\AdvancedEAST\data_generator.py", line 33, in gen
    img

ValueError: could not broadcast input array from shape (720,1280,3) into shape (736,736,3)

Did you ever get such an error?
Any suggestions?
My data includes just a few images for merely initiating the training loop.

The tree structure of the files used for training is displayed below:

[Image: directory tree of the training files]

I seem to have found a bug in your code

It is in the function "resize_image" of the file "preprocess.py". The suspect lines are below:

if im_width == max_img_size < im.width:
if o_height == max_img_size < im_height:

I think they should be as below:

if im_width == max_img_size:
if o_height == max_img_size:

I do not know whether I am right; I leave it to the author to judge.
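For context, Python allows chained comparisons: `a == b < c` is evaluated as `a == b and b < c`, so the quoted condition is valid syntax. The sketch below shows when the chained form is true. The helper name `width_was_clamped` and the use of `min` to derive `im_width` are assumptions for illustration, since the surrounding code is not quoted in the report.

```python
def width_was_clamped(orig_width, max_img_size):
    """Return True only when the width had to be reduced to max_img_size."""
    # Assumption: im_width is the original width limited to max_img_size.
    im_width = min(orig_width, max_img_size)
    chained = im_width == max_img_size < orig_width
    expanded = (im_width == max_img_size) and (max_img_size < orig_width)
    assert chained == expanded  # the two spellings are equivalent
    return chained


print(width_was_clamped(1280, 736))  # wider than the limit
print(width_was_clamped(736, 736))   # exactly at the limit
print(width_was_clamped(500, 736))   # narrower than the limit
```

Under these assumptions the chained form and the shortened form differ only when the original width equals the limit exactly, in which case no scaling is needed either way.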

Merge layer produces multiple network branches

AdvancedEAST/network.py

Lines 65 to 70 in ba78824

inside_score = Conv2D(1, 1, padding='same', name='inside_score'
)(self.g(cfg.feature_layers_num))
side_v_code = Conv2D(2, 1, padding='same', name='side_vertex_code'
)(self.g(cfg.feature_layers_num))
side_v_coord = Conv2D(4, 1, padding='same', name='side_vertex_coord'
)(self.g(cfg.feature_layers_num))

These lines generate three separate feature merge layer branches. Is that the intended result?

Could the 7 output maps be generated directly like this:
Conv2D(7, 1, padding='same', name='output_all')(self.g(cfg.feature_layers_num))

Question about the dimensions of y

What are the dimensions of this y matrix?

    y = np.zeros((batch_size, pixel_num_h, pixel_num_w, 7), dtype=np.float32)

Why is it 7 rather than 9?

quad invalid with vertex num less then 4

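Hello. I downloaded the pretrained model you provide and tested it on the ID card and the motor vehicle driving license from the demo on GitHub, but this message always appears and I cannot reproduce your original results. What is the cause?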

Cannot download from Baidu

Hi
Can you please share the data and the pretrained model on GoogleDrive, DropBox, Box or any other cloud service.
Unfortunately, Baidu does not work so great in the west.
Best Regards
Wajahat

validation_steps=None

@huoyijie

(east) home@home-lnx:~/Desktop/program/AdvancedEAST$ python advanced_east.py
Using TensorFlow backend.
2019-02-11 18:44:57.258140: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
2019-02-11 18:44:57.259110: I tensorflow/core/common_runtime/process_util.cc:69] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
.
.
.
.
==================================================================================================
Total params: 15,087,367
Trainable params: 15,083,655
Non-trainable params: 3,712
__________________________________________________________________________________________________
WARNING:tensorflow:Variable *= will be deprecated. Use `var.assign(var * other)` if you want assignment to the variable value or `x = x * y` if you want a new python Tensor object.
Traceback (most recent call last):
  File "advanced_east.py", line 31, in <module>
    verbose=1)])
  File "/home/home/anaconda3/envs/east/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/home/anaconda3/envs/east/lib/python3.6/site-packages/keras/engine/training.py", line 2115, in fit_generator
    raise ValueError('`validation_steps=None` is only valid for a'
ValueError: `validation_steps=None` is only valid for a generator based on the `keras.utils.Sequence` class. Please specify `validation_steps` or use the `keras.utils.Sequence` class.
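One possible cause, which the report does not confirm, is that the step counts are derived from the dataset size by integer division and come out as 0 for a small dataset, so Keras receives no usable `validation_steps`. The sketch below computes both counts with a floor of 1. The function name `compute_steps` and its parameter names are hypothetical and are not claimed to match cfg.py.

```python
def compute_steps(total_img, validation_split_ratio, batch_size):
    """Return (steps_per_epoch, validation_steps), each at least 1."""
    val_count = int(round(total_img * validation_split_ratio))
    train_count = total_img - val_count
    # Integer division alone gives 0 when a split is smaller than one batch.
    steps_per_epoch = max(1, train_count // batch_size)
    validation_steps = max(1, val_count // batch_size)
    return steps_per_epoch, validation_steps


print(compute_steps(10000, 0.1, 4))  # a large dataset
print(compute_steps(20, 0.1, 4))     # a tiny dataset: 2 validation images
```

Passing an explicit positive `validation_steps` to `fit_generator`, as the error message suggests, is the other half of the fix.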

Number of training iterations

When you trained at 256*256, roughly how many iterations did it take to converge?

[Request] Few requests

Hello sir.
Thank you for your great work.

Would you mind uploading your model to Google Drive?
And translating the README into English?

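Error when running predict.py

[Screenshot: 2018-11-11 17-20-11]
Hello! I set up tensorflow in a new anaconda virtual environment, but running predict.py produces the error above, and I cannot find a solution.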

nms.py takes too much time?

When I predict on a picture, nms.py seems to take too much time. This happens more often with complicated pictures, where it can take 10 seconds or much more. Why?

Missing a negative sign

(1 - beta) * (1 - labels) * tf.log(1 - predicts + cfg.epsilon)))

According to the definition of balanced cross entropy, this line should be -1 * (1 - beta) * (1 - labels) * tf.log(1 - predicts + cfg.epsilon))); a negative sign is missing at the front :)
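The quoted line ends with three closing parentheses, which suggests it is the tail of a larger expression. If an outer `-1 *` already wraps both terms, no extra sign is needed on the second term, because -(a + b) = -a - b. The pure-Python sketch below checks that the two spellings agree for one pixel. `EPSILON` is a hypothetical stand-in for `cfg.epsilon`, and the input values are made up.

```python
import math

EPSILON = 1e-4  # hypothetical stand-in for cfg.epsilon


def balanced_bce(label, predict, beta):
    """Balanced cross entropy with one outer negation over both terms."""
    pos = beta * label * math.log(predict + EPSILON)
    neg = (1 - beta) * (1 - label) * math.log(1 - predict + EPSILON)
    return -1 * (pos + neg)


def balanced_bce_per_term(label, predict, beta):
    """The same loss with the negation written on each term separately."""
    pos = -beta * label * math.log(predict + EPSILON)
    neg = -(1 - beta) * (1 - label) * math.log(1 - predict + EPSILON)
    return pos + neg


# A negative pixel (label 0) predicted at 0.2, with beta = 0.9.
loss_a = balanced_bce(0.0, 0.2, 0.9)
loss_b = balanced_bce_per_term(0.0, 0.2, 0.9)
print(loss_a, loss_b)
```

Whether the project code needs the extra sign therefore depends on the part of the expression that precedes the quoted line.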

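Test set

The Alibaba Tianchi competition has ended and the dataset can no longer be downloaded. The author provides only the training set. Could you provide the test set as well?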

inaccurate detection for image larger than 1000 pixel wide

The code resizes the image to at most 736 pixels wide before running detection. For images more than 1000 pixels wide, this makes the text too small to detect and leads to inaccurate results.

If I change the resize step, I get a resource exhausted error. Do you have any ideas on how to solve this?

Training details about different sizes

python preprocess.py, resize image to 256*256, 384*384, 512*512, 640*640, 736*736, and train respectively could speed up training process.

I am somewhat confused about the meaning of "train respectively".
Does this mean training the network in a coarse-to-fine process, which initializes the network at 256x256 and then fine-tunes it on larger sizes?
Does this make the network converge faster than training it at 736x736 directly?

The loss of the validation set does not decrease

Excuse me, I used your ICPR dataset and ran python preprocess.py, python label.py, and python advanced_east.py, but the validation loss did not decrease, so when training stopped, the results on test data were very bad.
Can you help me? What went wrong?

Weights loading issues.

Hi!
I try to train my own datasets with your pre-trained weights.
First I set load_weights=True in cfg.py, then an error occurred in advanced_east.py as follows:

Traceback (most recent call last):
File "D:/Workspace/Ad_new/advanced_east.py", line 17, in main
east_network.load_weights(cfg.saved_model_weights_file_path)
File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\network.py", line 1391, in load_weights
saving.load_weights_from_hdf5_group(f, self.layers)
File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\saving.py", line 732, in load_weights_from_hdf5_group
' layers.')
ValueError: You are trying to load a weight file containing 30 layers into a model with 1 layers.

Following advice from stackoverflow, I rewrote advanced_east.py:

import os
from tensorflow.python.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.python.keras.optimizers import Adam
from tensorflow.python.keras.models import load_model
from tensorflow.python.keras.models import model_from_json

import cfg
from network import East
from losses import quad_loss
from data_generator import gen

east = East()
east_network = east.east_network()

if cfg.load_weights and os.path.exists(cfg.saved_model_weights_file_path):
  east_network.load_weights(cfg.saved_model_weights_file_path, by_name=True)
  json_string = east_network.to_json()
  east_network = model_from_json(json_string)

east_network.summary()
east_network.compile(loss=quad_loss, optimizer=Adam(lr=cfg.lr,
                                                    # clipvalue=cfg.clipvalue,
                                                    decay=cfg.decay))

east_network.fit_generator(generator=gen(),
                           steps_per_epoch=cfg.steps_per_epoch,
                           epochs=cfg.epoch_num,
                           validation_data=gen(is_val=True),
                           validation_steps=cfg.validation_steps,
                           verbose=1,
                           # use_multiprocessing=True,
                           initial_epoch=cfg.initial_epoch,
                           callbacks=[
                               EarlyStopping(patience=cfg.patience, verbose=1),
                               ModelCheckpoint(filepath=cfg.model_weights_path,
                                               save_best_only=False,
                                               save_weights_only=True,
                                               verbose=1)])
east_network.save(cfg.saved_model_file_path)
east_network.save_weights(cfg.saved_model_weights_file_path)

After that, my training process started successfully.
But when I tried to predict images with my trained weights east_model_weights_3T736.h5 in the folder saved_model, the error showed up again:

Traceback (most recent call last):
File "D:/Workspace/Ad_new/test_img.py", line 163, in
main()
File "D:/Workspace/Ad_new/test_img.py", line 137, in main
east_detect.load_weights("saved_model/east_model_weights_3T736.h5")
File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\network.py", line 1391, in load_weights
saving.load_weights_from_hdf5_group(f, self.layers)
File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\saving.py", line 732, in load_weights_from_hdf5_group
' layers.')
ValueError: You are trying to load a weight file containing 1 layers into a model with 30 layers.

I wanted to use the same solution, with code as follows:

  east = East()
  east_detect = east.east_network()
  east_detect.load_weights("saved_model/east_model_weights_3T736.h5", by_name=True)
  json_string = east_detect.to_json()
  east_detect = model_from_json(json_string)

Then a new error occurred:

File "D:/Workspace/Ad_new/test_img.py", line 163, in
main()
File "D:/Workspace/Ad_new/test_img.py", line 130, in main
east_detect.load_weights("saved_model/east_model_weights_3T736.h5", by_name=True)
File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\network.py", line 1389, in load_weights
saving.load_weights_from_hdf5_group_by_name(f, self.layers)
File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\engine\saving.py", line 810, in load_weights_from_hdf5_group_by_name
K.batch_set_value(weight_value_tuples)
File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\keras\backend.py", line 2711, in batch_set_value
assign_op = x.assign(assign_placeholder)
File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 945, in assign
self._shape.assert_is_compatible_with(value_tensor.shape)
File "D:\ProgramData\Anaconda3\envs\gpu\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 847, in assert_is_compatible_with
raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (128,) and (1024,) are incompatible

Am I loading or saving the weights the wrong way? Is anyone else facing the same problem?

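Training on other datasets

If I train on the 13 dataset, which training parameters need to be tuned? I see your epoch count is only 24.

How to train

Hello, could you give slightly more detailed training steps? I would like to try training with my own data.

Adjacent lines cannot be separated

@huoyijie
Can the problem shown in the images below be solved? After training I see this a lot: adjacent lines are joined together. Thanks.

[Images: activation map and prediction for img_2061.jpg]
Would switching to resnet improve it?

Trained my own model, but predict yields no boxes

Hello, I used the tianchi ICPR dataset, the one you shared on Baidu Netdisk. After running preprocess, label, and advanced_east, predict detects no boxes at all. What could be the possible causes?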
