whai362 / pan_pp.pytorch

Official implementations of PSENet, PAN and PAN++.

License: Apache License 2.0

Shell 0.10% Python 98.77% Cython 1.13%

text-recognition psenet pan text-detection text-spotting

pan_pp.pytorch's Introduction

My homepage: https://whai362.github.io/

pan_pp.pytorch's Issues

Hi. When I run 'python train.py config/pan/pan_r18_ic15.py', I get the following error:
Do you know how to solve the problem? Thank you very much.
Traceback (most recent call last):
File "train.py", line 234, in <module>
main(args)
File "train.py", line 216, in main
train(train_loader, model, optimizer, epoch, start_iter, cfg)
File "train.py", line 41, in train
for iter, data in enumerate(train_loader):
File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 435, in __next__
data = self._next_data()
File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 1085, in _next_data
return self._process_data(data)
File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 1111, in _process_data
data.reraise()
File "D:\Anaconda3\lib\site-packages\torch\_utils.py", line 428, in reraise
raise self.exc_type(msg)
TypeError: function takes exactly 5 arguments (1 given)

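Why does Acc rec stay at 0 during training?

I have trained for several epochs and the rec acc stays at 0, although the loss decreases normally. Where could the problem be?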

Why not use bilinear interpolation or another method?

pan_pp.pytorch/models/head/pa_head.py

Line 72 in afad0b4

 label = cv2.resize(label, (img_size[1], img_size[0]), interpolation=cv2.INTER_NEAREST) 
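
One likely reason for `INTER_NEAREST` here, offered as a general observation rather than the authors' stated rationale: `label` holds integer instance ids, and any interpolation that averages neighbouring pixels can produce values that belong to no instance. The pure-Python sketch below shows the effect on a single row of labels. The helpers `resize_nearest` and `resize_linear` are hypothetical 1-D stand-ins for the `cv2.resize` call, not code from the repository.

```python
def resize_nearest(row, new_len):
    """Nearest-neighbour resize of a 1-D list: every output is an input value."""
    scale = len(row) / new_len
    return [row[min(int((i + 0.5) * scale), len(row) - 1)] for i in range(new_len)]


def resize_linear(row, new_len):
    """Linear-interpolation resize of a 1-D list (pixel-centre alignment)."""
    scale = len(row) / new_len
    out = []
    for i in range(new_len):
        x = (i + 0.5) * scale - 0.5            # source coordinate of output i
        x = min(max(x, 0.0), len(row) - 1.0)   # clamp to the valid range
        lo = int(x)
        hi = min(lo + 1, len(row) - 1)
        w = x - lo
        out.append(row[lo] * (1 - w) + row[hi] * w)
    return out


labels = [1, 1, 3, 3]                # two instances with ids 1 and 3
near = resize_nearest(labels, 8)
lin = resize_linear(labels, 8)
print(near)  # contains only the original ids
print(lin)   # contains blended values at the instance boundary
```

The blended values at the boundary between the two instances do not correspond to any instance id, which is why label maps are usually resized without averaging.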

Question about training

eval ic15 Error! No module named numpy

Code for pa.py?

Hi there,

Is there a pa.py version implemented in pure python in addition to pa.pyx?

Thanks.

Comparison with DBNet

Hello, I would like to ask: between PAN and DBNet, which one detects Chinese text better? Or, what advantages does this paper have over DB?

Datasets for training PAN++

Hi, I want to train PAN++, but I don't know the expected dataset format, especially for Total-Text, which exists in multiple versions. Could you provide the detailed data structure or a link to the dataset you used? Thank you!

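Failed to install the mmcv library

Installing mmcv with pip keeps failing. I have tried PyTorch 1.1, 1.4 and 1.7 and none of them work. What could be the reason?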

Running test.py raises TypeError: 'module' object is not callable

After setting the model path and the config file path, running python test.py gives the following:
Traceback (most recent call last):
File "test.py", line 117, in <module>
main(args)
File "test.py", line 107, in main
test(test_loader, model, cfg)
File "test.py", line 56, in test
outputs = model(**data)
File "/home/ethony/anaconda3/envs/ocr/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/media/ethony/C14D581BDA18EBFA/lyg_datas_and_code/OCR_work/pan_pp.pytorch-master/models/pan.py", line 104, in forward
det_res = self.det_head.get_results(det_out, img_metas, cfg)
File "/media/ethony/C14D581BDA18EBFA/lyg_datas_and_code/OCR_work/pan_pp.pytorch-master/models/head/pa_head.py", line 65, in get_results
label = pa(kernels, emb)
TypeError: 'module' object is not callable
From the message, it looks like pa under models/post_processing was not imported correctly and was imported as a module instead. How should I fix this?
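
This error pattern usually means the name `pa` is bound to a package or module rather than to the function defined inside it. A minimal, self-contained sketch of the difference follows. The module built here with `types.ModuleType` is a hypothetical stand-in, not the repository's actual `pa` extension, which has to be compiled before its function can be imported.

```python
import types


def _fake_pa(kernels, emb):
    """Placeholder for the real post-processing function."""
    return "label"


# Hypothetical stand-in for a package named `pa` that contains a function `pa`.
pa_module = types.ModuleType("pa")
pa_module.pa = _fake_pa

# Binding the module itself to the name reproduces the reported error.
pa = pa_module
try:
    pa([], [])
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Binding the function inside the module works.
pa = pa_module.pa
result = pa([], [])
print(result)
```

If the compiled extension is missing, an import that was meant to fetch the function may end up with the enclosing package instead, so checking that the build step produced the extension file is a reasonable first step.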

Acc rec: 0.000

When I train PSE, my Acc is 0.00. Do you know what is wrong with it?

Testing issues

I tested your trained model on the Total-Text dataset.

I tried to draw the bounding boxes on the images using outputs['bboxes'], but I couldn't. I used the following code:

for bbox in enumerate(bboxes):
      draw = ImageDraw.Draw(img)
      color = tuple(np.random.choice(range(100, 256), size=3))
      draw.polygon(bbox, outline=color)
return img

But the resulting image does not have any bounding boxes.

What is wrong with this code?
There is also no code in the test function for calculating Precision, Recall and F-measure. How are they calculated?
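
One thing worth checking in the drawing loop above: `enumerate(bboxes)` yields `(index, bbox)` tuples, so `draw.polygon` receives a tuple rather than a list of coordinates. The sketch below is one way to draw the polygons, assuming each entry of `outputs['bboxes']` is a flat sequence `[x1, y1, x2, y2, ...]`. That layout and the helper names `to_xy_pairs` and `draw_bboxes` are assumptions, not the repository's documented interface.

```python
import random


def to_xy_pairs(bbox):
    """Convert a flat [x1, y1, x2, y2, ...] sequence into [(x1, y1), ...]."""
    coords = [int(v) for v in bbox]
    if len(coords) % 2 != 0:
        raise ValueError("bbox must contain an even number of values")
    return list(zip(coords[0::2], coords[1::2]))


def draw_bboxes(img, bboxes):
    """Draw each polygon on a PIL image and return the image."""
    # Pillow is imported here so that to_xy_pairs stays usable without it.
    from PIL import ImageDraw

    draw = ImageDraw.Draw(img)
    for bbox in bboxes:  # iterate the boxes directly, not enumerate(bboxes)
        color = tuple(random.randint(100, 255) for _ in range(3))
        draw.polygon(to_xy_pairs(bbox), outline=color)
    return img
```

With an RGB image opened through Pillow, `draw_bboxes(img, outputs['bboxes'])` would return the annotated image, provided the assumed coordinate layout holds.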

ModuleNotFoundError: No module named 'models.post_processing.pse_v2'

There is no pse_v2.

Backbone Resnet101 - Training time error

I changed the backbone to ResNet101 and started training, but got the following error:

RuntimeError: Given groups=1, weight of size 128 64 1 1, expected input[8, 256, 160, 160] to have 64 channels, but got 256 channels instead

Could you tell me the reason for this error?
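
The shapes in this message are consistent with a neck that still expects ResNet18 feature widths: the 1x1 convolution has 64 input channels, while the first stage of a bottleneck ResNet (ResNet50/101) outputs 256. A hedged sketch of the corresponding config change follows. The stage widths are the standard ResNet values; the backbone type string 'resnet101' and the helper `neck_in_channels` are assumptions about this repository.

```python
# Output channels of the four residual stages for standard ResNet variants.
STAGE_CHANNELS = {
    "resnet18": (64, 128, 256, 512),       # BasicBlock, expansion 1
    "resnet50": (256, 512, 1024, 2048),    # Bottleneck, expansion 4
    "resnet101": (256, 512, 1024, 2048),   # Bottleneck, expansion 4
}


def neck_in_channels(backbone_type):
    """Return the in_channels tuple the neck needs for a given backbone."""
    return STAGE_CHANNELS[backbone_type]


# Config fragment modelled on config/pan/pan_r18_ic15.py (names assumed).
model = dict(
    type="PAN",
    backbone=dict(type="resnet101", pretrained=True),
    neck=dict(
        type="FPEM_v1",
        in_channels=neck_in_channels("resnet101"),
        out_channels=128,
    ),
)
print(model["neck"]["in_channels"])
```

Changing only the backbone type while leaving `in_channels` at the ResNet18 values would produce exactly this kind of channel mismatch in the first layer that consumes the backbone features.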

_pickle.PicklingError: Can't pickle <class 'cPolygon.Error'>: import of module 'cPolygon' failed

A more complete log is below:
Epoch: [1 | 600]
/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/torch/nn/functional.py:2941: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/torch/nn/functional.py:3121: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
(1/374) LR: 0.001000 | Batch: 2.668s | Total: 0min | ETA: 17min | Loss: 1.619 | Loss(text/kernel/emb/rec): 0.680/0.193/0.746/0.000 | IoU(text/kernel): 0.324/0.335 | Acc rec: 0.000
Traceback (most recent call last):
File "/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/multiprocessing/queues.py", line 236, in _feed
obj = _ForkingPickler.dumps(obj)
File "/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <class 'cPolygon.Error'>: import of module 'cPolygon' failed

The code runs normally with the CTW1500 dataset, but fails with my own dataset.

It seems fine in the first iteration (1/374). What is wrong? I have no idea.
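
The PicklingError is raised while a DataLoader worker tries to send an exception back to the main process, so the underlying failure is a `cPolygon.Error` inside the dataset code. One plausible cause with a custom dataset, offered as a guess, is a degenerate annotation (fewer than three points, or zero area). The sketch below filters such polygons before training. `polygon_area` and `is_valid_polygon` are hypothetical helpers, and the annotation layout (a list of (x, y) points) is assumed.

```python
def polygon_area(points):
    """Absolute area of a polygon given as [(x, y), ...] (shoelace formula)."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def is_valid_polygon(points, min_area=1.0):
    """True if the polygon has at least 3 points and a non-trivial area."""
    return len(points) >= 3 and polygon_area(points) >= min_area


annotations = [
    [(0, 0), (10, 0), (10, 5), (0, 5)],   # ordinary box
    [(3, 3), (8, 8)],                      # only two points
    [(0, 0), (5, 5), (10, 10)],            # collinear, zero area
]
kept = [p for p in annotations if is_valid_polygon(p)]
print(len(kept))
```

Running the loader with `num_workers=0` is another way to see the original exception, since nothing has to be pickled across processes.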

Regarding pa.pyx

Hi,

I tried to run your code and noticed that the last line of pa.pyx is

return _pa(kernels[:-1], emb, label, cc, kernel_num, label_num, min_area)

Looks like this should be

return _pa(kernels, emb, label, cc, kernel_num, label_num, min_area)

That way we scan over all kernels (the current code skips the last kernel) and the function does not crash. Am I correct?

Thanks.

Not sure whether compile.sh ran successfully

(zyl_torch16) ubuntu@ubuntu:/data/zhangyl/pan_pp.pytorch-master$ sh ./compile.sh
Compiling pa.pyx because it depends on /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/__init__.pxd.
[1/1] Cythonizing pa.pyx
/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /data/zhangyl/pan_pp.pytorch-master/models/post_processing/pa/pa.pyx
tree = Parsing.p_module(s, pxd, full_module_name)
running build_ext
building 'pa' extension
creating build
creating build/temp.linux-x86_64-3.7
gcc -pthread -B /data/tools/anaconda3/envs/zyl_torch16/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include -I/data/tools/anaconda3/envs/zyl_torch16/include/python3.7m -c pa.cpp -o build/temp.linux-x86_64-3.7/pa.o -O3
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822:0,
from /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
from /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/arrayobject.h:4,
from pa.cpp:647:
/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
#warning "Using deprecated NumPy API, disable it with "
^~~~~~~
g++ -pthread -shared -B /data/tools/anaconda3/envs/zyl_torch16/compiler_compat -L/data/tools/anaconda3/envs/zyl_torch16/lib -Wl,-rpath=/data/tools/anaconda3/envs/zyl_torch16/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/pa.o -o /data/zhangyl/pan_pp.pytorch-master/models/post_processing/pa/pa.cpython-37m-x86_64-linux-gnu.so
(zyl_torch16) ubuntu@ubuntu:/data/zhangyl/pan_pp.pytorch-master$

This is the compile log. I am not sure whether the build succeeded or not.
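
The log ends with the `g++ ... -o .../pa.cpython-37m-x86_64-linux-gnu.so` link step and contains no `error:` line; the messages before it are warnings. A quick way to confirm, sketched below, is to look for the compiled extension file next to `pa.pyx`. The helper name `find_built_extension` is hypothetical; the directory is the one shown in the log.

```python
import os


def find_built_extension(directory, name="pa"):
    """Return compiled extension files (.so or .pyd) for `name` in `directory`."""
    if not os.path.isdir(directory):
        return []
    return sorted(
        f for f in os.listdir(directory)
        if f.startswith(name + ".") and f.endswith((".so", ".pyd"))
    )


built = find_built_extension("models/post_processing/pa")
print(built if built else "no compiled extension found")
```

An empty result when run from the repository root would suggest the build did not produce the extension and that compile.sh should be run again.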

Has the PAN++ code been open-sourced?

Show how to process a single new image.

Thanks!

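What does num_classes=6 in detection_head of the PAN config file mean?

A question for the author: what does num_classes=6 in detection_head of the PAN config file mean? Does it cover the text region and the kernel predicted by the PAN network? What does this num_classes refer to?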

No module named 'models.post_processing.pa.pa'

I get the message "No module named 'models.post_processing.pa.pa'". How can I fix this?

Trained model save/load with DataParallel.

Many thanks for sharing the code of this very good paper.

Model save/load (with DataParallel) can be made easier, as described here:
https://discuss.pytorch.org/t/solved-keyerror-unexpected-key-module-encoder-embedding-weight-in-state-dict/1686/17

Instead of deleting the “module.” string from all the state_dict keys, you can save your model with:

torch.save(model.module.state_dict(), path_to_file)

instead of

torch.save(model.state_dict(), path_to_file)

that way you don’t get the “module.” string to begin with…
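
For checkpoints that were already saved from the wrapped model, the prefix can still be removed at load time. A minimal sketch follows, using a plain dict in place of a real `state_dict`; the helper name `strip_module_prefix` is hypothetical.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading `prefix` removed from keys."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Keys as nn.DataParallel would name them (values are placeholders).
saved = OrderedDict([("module.backbone.conv1.weight", 0), ("module.neck.bias", 1)])
cleaned = strip_module_prefix(saved)
print(list(cleaned))
```

The cleaned mapping can then be passed to `load_state_dict` on a model that is not wrapped in DataParallel.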

Pan++

@whai362 @RoseSakurai Hi there,

Is the focus of PAN++ to improve speed or accuracy?
What are the main new improvements?

About the post-processing

The pse module in PSENet takes only a few ms when scale = 4, that is, when it runs on the network's output feature map (1/4 of the original image).
The pa module in PAN takes more time (more than 10 ms) when producing a feature map of the same size (1/4 of the original image).
So pa helps the accuracy of the final result but is slower than pse on a feature map of the same size.
Is my understanding correct?

your PAN is 83.4% while mmocr PAN is 80.7%

@RoseSakurai @whai362
I have noticed that your implementation of PAN achieves 83.4% on CTW1500, while the mmocr implementation achieves 80.7%.
Do you have any ideas or recommendations for improving the mmocr implementation so that it reaches your level?

Training problem

Thanks for your excellent work! I followed all your settings and trained the PAN model on the Total-Text dataset, but cannot reach the results reported in the paper.

I ran the command 'CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py config/pan/pan_r18_tt.py', and the F1-score is 83.0 (83.5 in the paper). The small difference is fine.

I ran the command 'CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py config/pan/pan_r18_synth.py' to pretrain PAN, and 'CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py config/pan/pan_r18_tt_finetune.py' to fine-tune the model. The final score is 83.5 (85.0 in the paper).

Why do I get worse results? Could you provide the training logs?

AttributeError: 'PAN_PP_jointTrain' object has no attribute 'voc'

How can I resolve "AttributeError: 'PAN_PP_jointTrain' object has no attribute 'voc'"?

What is the data format of the training files?

Similarity vectors: how is this label generated? I could not find it in the code.

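About the loss computation

When computing EmbLoss, the code splits it into l_agg, l_dis and l_reg, but I did not find anything about l_reg in the paper. If l_agg and l_dis are meant to pull pixels close to their own kernel and push them away from other kernels, respectively, what exactly does l_reg mean?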

When do you plan to open-source the PAN++ code?

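Questions about the post-processing

In the post-processing code, when the area ratio of two connected components in the kernel is greater than max_rate, the flag of both connected components is set to 1. During expansion, a point is left unexpanded only when both conditions hold: the flag of the connected component that the current point belongs to is 1, and its distance to the kernel's similarity vector is greater than 3. What is the purpose of setting the flag? Would it work to check only the distance to the kernel's similarity vector?
In the paper, the threshold on the Euclidean distance between an expanded point and the kernel's similarity vector is 6, while in the code it is 3. In practice, what does this value depend on? Is it certain characteristics of the dataset?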
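
The rule described in this question can be written as a small predicate. This is a sketch of the behaviour as the question states it, not a copy of pa.pyx; the function name `blocks_expansion` and its arguments are hypothetical.

```python
import math


def blocks_expansion(flag, pixel_emb, kernel_emb, dist_thresh=3.0):
    """Return True if a neighbouring pixel should NOT be merged into the kernel.

    As described in the question, expansion is blocked only when both hold:
    the connected component is flagged (flag == 1) and the Euclidean distance
    between the pixel's and the kernel's similarity vectors exceeds dist_thresh.
    """
    distance = math.sqrt(sum((p - k) ** 2 for p, k in zip(pixel_emb, kernel_emb)))
    return flag == 1 and distance > dist_thresh


kernel = (0.0, 0.0, 0.0, 0.0)
far_pixel = (4.0, 0.0, 0.0, 0.0)    # distance 4 from the kernel vector
near_pixel = (1.0, 0.0, 0.0, 0.0)   # distance 1 from the kernel vector

print(blocks_expansion(0, far_pixel, kernel))   # unflagged component
print(blocks_expansion(1, far_pixel, kernel))   # flagged and far
print(blocks_expansion(1, near_pixel, kernel))  # flagged but near
```

Under this rule an unflagged component absorbs neighbouring pixels regardless of embedding distance, which is the behaviour the question asks about.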

Evaluation of the performance result

Hello Author,
First of all, I appreciate your work and effort. I tried your repo. The evaluation code gives me the error "The sample 199 not present in GT", but the label text is there. When I visualized the results on the images, they looked good. Let me know if there is any solution on your side.

None

Does PAN++ mainly improve speed, or accuracy?

Facing problems when I try to train PAN++ with my own dataset

ModuleNotFoundError: No module named 'models.post_processing.pa.pa'
@whai362

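How to handle text instances that stick together in detection?

Hello! I am training a text detection model for a specific problem, but sometimes two lines, one above the other, are detected as a single instance. I added some data with overlapping lines, but the results are still not ideal and the problem persists. How can this be effectively avoided or solved?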

Test PAN on a single image

@whai362 @RoseSakurai Hi there,
I want to test PAN on a single image. Is there a script to do that?

AttributeError: 'Namespace' object has no attribute 'resume'

An error appears when trying to test the model:

(pan) home@home-lnx:~/programs/pan_pp.pytorch$ python test.py config/pan/pan_r18_ic15.py checkpoints/pan_r18_ic15/checkpoint.pth.tar
{
    "model": {
        "type": "PAN",
        "backbone": {
            "type": "resnet18",
            "pretrained": true
        },
        "neck": {
            "type": "FPEM_v1",
            "in_channels": [
                64,
                128,
                256,
                512
            ],
            "out_channels": 128
        },
        "detection_head": {
            "type": "PA_Head",
            "in_channels": 512,
            "hidden_dim": 128,
            "num_classes": 6,
            "loss_text": {
                "type": "DiceLoss",
                "loss_weight": 1.0
            },
            "loss_kernel": {
                "type": "DiceLoss",
                "loss_weight": 0.5
            },
            "loss_emb": {
                "type": "EmbLoss_v1",
                "feature_dim": 4,
                "loss_weight": 0.25
            }
        }
    },
    "data": {
        "batch_size": 16,
        "train": {
            "type": "PAN_IC15",
            "split": "train",
            "is_transform": true,
            "img_size": 736,
            "short_size": 736,
            "kernel_scale": 0.5,
            "read_type": "cv2"
        },
        "test": {
            "type": "PAN_IC15",
            "split": "test",
            "short_size": 736,
            "read_type": "cv2",
            "report_speed": false
        }
    },
    "train_cfg": {
        "lr": 0.001,
        "schedule": "polylr",
        "epoch": 600,
        "optimizer": "Adam"
    },
    "test_cfg": {
        "min_score": 0.85,
        "min_area": 16,
        "bbox_type": "rect",
        "result_path": "outputs/submit_ic15.zip"
    },
    "report_speed": false
}
Downloading: "http://sceneparsing.csail.mit.edu/model/pretrained_resnet/resnet18-imagenet.pth" to ./pretrained/resnet18-imagenet.pth
Traceback (most recent call last):
  File "test.py", line 117, in <module>
    main(args)
  File "test.py", line 100, in main
    print("No checkpoint found at '{}'".format(args.resume))
AttributeError: 'Namespace' object has no attribute 'resume'
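
The traceback points at a mismatch between the argument names the parser defines and the attribute the message reads. The sketch below reproduces the pattern with `argparse`. The parser layout (positional `config` and `checkpoint`) is inferred from the command line above and is an assumption about test.py.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("config")
parser.add_argument("checkpoint", nargs="?", default=None)

args = parser.parse_args(
    ["config/pan/pan_r18_ic15.py", "checkpoints/pan_r18_ic15/checkpoint.pth.tar"]
)

# No argument named `resume` was defined, so the attribute does not exist.
has_resume = hasattr(args, "resume")

# Reading the attribute that the parser does define avoids the AttributeError.
message = "No checkpoint found at '{}'".format(args.checkpoint)
print(has_resume)
print(message)
```

Since the failing line sits in a "No checkpoint found" message, it is probably reached only when the checkpoint file is missing, so the path passed on the command line is also worth checking.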

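Detection results are rotated relative to the actual image

Why does the detection come out like this? I drew the points using (y, x) coordinates.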

Simulating training data

Hello! How can curved-text training data be generated synthetically?

Showing output Polygon

Hi,
Thanks for the paper and the code.

I was wondering how the output from test.py can be shown on the image.

Confused about the speed comparison in Table 4

In PSENet, speed statistics are computed on images whose long side is 1280.
In PAN, speed statistics are computed on images whose short side is 640.
Is it fair or proper to compare them in this way?

No module named 'models.post_processing.pse_v2'

There is no pse_v2.

Cannot compile successfully

running build_ext
building 'pa' extension
gcc -pthread -B /home/dell10/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/dell10/anaconda3/lib/python3.7/site-packages/numpy/core/include -I/home/dell10/anaconda3/include/python3.7m -c pa.cpp -o build/temp.linux-x86_64-3.7/pa.o -O3
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /home/dell10/anaconda3/lib/python3.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1832:0,
from /home/dell10/anaconda3/lib/python3.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
from /home/dell10/anaconda3/lib/python3.7/site-packages/numpy/core/include/numpy/arrayobject.h:4,
from pa.cpp:554:
/home/dell10/anaconda3/lib/python3.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
#warning "Using deprecated NumPy API, disable it with "
^
pa.cpp: In function ‘void __Pyx__ExceptionSave(PyThreadState*, PyObject**, PyObject**, PyObject**)’:
pa.cpp:8607:21: error: ‘PyThreadState’ has no member named ‘exc_type’
type = tstate->exc_type;
^
pa.cpp:8608:22: error: ‘PyThreadState’ has no member named ‘exc_value’
value = tstate->exc_value;
^
pa.cpp:8609:19: error: ‘PyThreadState’ has no member named ‘exc_traceback’
tb = tstate->exc_traceback;
^
pa.cpp: In function ‘void __Pyx__ExceptionReset(PyThreadState*, PyObject*, PyObject*, PyObject*)’:
pa.cpp:8616:24: error: ‘PyThreadState’ has no member named ‘exc_type’
tmp_type = tstate->exc_type;
^
pa.cpp:8617:25: error: ‘PyThreadState’ has no member named ‘exc_value’
tmp_value = tstate->exc_value;
^
pa.cpp:8618:22: error: ‘PyThreadState’ has no member named ‘exc_traceback’
tmp_tb = tstate->exc_traceback;
^
pa.cpp:8619:13: error: ‘PyThreadState’ has no member named ‘exc_type’
tstate->exc_type = type;
^
pa.cpp:8620:13: error: ‘PyThreadState’ has no member named ‘exc_value’
tstate->exc_value = value;
^
pa.cpp:8621:13: error: ‘PyThreadState’ has no member named ‘exc_traceback’
tstate->exc_traceback = tb;
^
pa.cpp: In function ‘int __Pyx__GetException(PyThreadState*, PyObject**, PyObject**, PyObject**)’:
pa.cpp:8691:24: error: ‘PyThreadState’ has no member named ‘exc_type’
tmp_type = tstate->exc_type;
^
pa.cpp:8692:25: error: ‘PyThreadState’ has no member named ‘exc_value’
tmp_value = tstate->exc_value;
^
pa.cpp:8693:22: error: ‘PyThreadState’ has no member named ‘exc_traceback’
tmp_tb = tstate->exc_traceback;
^
pa.cpp:8694:13: error: ‘PyThreadState’ has no member named ‘exc_type’
tstate->exc_type = local_type;
^
pa.cpp:8695:13: error: ‘PyThreadState’ has no member named ‘exc_value’
tstate->exc_value = local_value;
^
pa.cpp:8696:13: error: ‘PyThreadState’ has no member named ‘exc_traceback’
tstate->exc_traceback = local_tb;
^
error: command 'gcc' failed with exit status 1

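Training from scratch does not reach the PAN results reported in the paper on Total-Text

Config file: config/pan/pan_r18_tt.py
The config file is identical to the one in the repository, except that I trained on two GPUs.
The model from the final (600th) epoch tests at Precision: 0.834923757993 / Recall: 0.771597096189 / Hmean: 0.802012305697, which falls short of the paper's results (precision 88, recall 79.5, hmean 83.5).
I trained with the newly released Total-Text ground truth; see https://github.com/cs-chan/Total-Text-Dataset/tree/master/Groundtruth/Text
Could you provide more implementation details?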
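
For reference, Hmean is the harmonic mean of precision and recall, and the numbers quoted in this issue are consistent with that definition. A short check follows; the function name `hmean` is chosen here for illustration.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (F-measure); 0 if both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


reported = hmean(0.834923757993, 0.771597096189)  # values from this issue
paper = hmean(0.88, 0.795)                        # values quoted from the paper
print(round(reported, 4), round(paper, 4))
```

Both results agree with the Hmean figures quoted above, so the gap comes from precision and recall themselves rather than from how the score is computed.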

How the similarity vector is represented

How is the similarity vector represented in the code?

Problem encountered when training on Total-Text

After running python train.py config/pan/pan_r18_tt.py, the following occurs:

Traceback (most recent call last):
File "/home/dell2/anaconda3/envs/pannet/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
obj = _ForkingPickler.dumps(obj)
File "/home/dell2/anaconda3/envs/pannet/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <class 'cPolygon.Error'>: import of module 'cPolygon' failed
It seems to be a problem that occurs during iteration, and it only shows up when training on the TT dataset.
How can this be resolved? Thank you.