ying09 / textfusenet

A PyTorch implementation of "TextFuseNet: Scene Text Detection with Richer Fused Features".

License: MIT License

Shell 0.49% Python 88.03% C++ 4.05% Cuda 7.31% Dockerfile 0.12%

scene-text-detection

textfusenet's Issues

TypeError: 'VisImage' object is not iterable (URGENT!!)

I hit this error when running the icdar2015_detection.py file.

I have not been able to figure it out and need help.

Note: I have changed the paths for the respective weights, input, output, and config directories.

detectron2: ImportError: cannot import name '_C'

It raises an import error when I try to run the demo file.

ModuleNotFoundError: No module named '__main__.register_coco'

ModuleNotFoundError: No module named '__main__.register_coco'; '__main__' is not a package
This happens when I run python detectron2/data/datasets/builtin.py to register a custom dataset.

KeyError: 'Non-existent config key: MODEL.TEXTFUSENET_MUTIL_PATH_FUSE_ON'

I hit this error when running the icdar2015_detection.py file.

Note: I have changed the paths for the respective weights, input, output, and config directories.

I could not resolve it and need help.

Confusing IoA computation

TextFuseNet/detectron2/modeling/roi_heads/mutil_path_fuse_module.py

Line 79 in d4708a9

 self_area, inter_area = get_selfarea_and_interarea(proposal_boxes, proposal_boxes) 

It seems this will always produce an inter_percent of all ones, since boxes1 and boxes2 are the same. Is this the expected behavior of the model described in the paper?
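A minimal sketch of what a pairwise intersection-over-area (IoA) computation does, assuming the repository helper compares every box in the first set with every box in the second (this is an assumption; `pairwise_ioa` is a hypothetical name and the repository code remains the authority). Under that assumption, passing the same boxes twice gives 1 only on the diagonal, while the off-diagonal entries describe how much each box is covered by the others.

```python
def pairwise_ioa(boxes):
    """Return m where m[i][j] = area(box_i intersect box_j) / area(box_i).

    Boxes are (x1, y1, x2, y2) tuples. Hypothetical helper for illustration.
    """
    result = []
    for ax1, ay1, ax2, ay2 in boxes:
        self_area = (ax2 - ax1) * (ay2 - ay1)
        row = []
        for bx1, by1, bx2, by2 in boxes:
            # Width and height of the overlap, clamped at zero when disjoint.
            inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
            inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
            row.append(inter_w * inter_h / self_area if self_area > 0 else 0.0)
        result.append(row)
    return result


# Three made-up boxes: the first two overlap by half, the third is disjoint.
boxes = [(0, 0, 10, 10), (5, 0, 15, 10), (20, 20, 30, 30)]
ioa = pairwise_ioa(boxes)
```

With these made-up boxes the diagonal entries are 1, but the entry for the first box against the second is 0.5 and against the third is 0, so the matrix is not all ones.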

By running the demo, we can get a visualization image with bounding boxes and characters. However, is there any inference command which can return the words or phrases instead of only characters? Thanks! @Real-YeJ

Demo for large images with multiple text instances

I observed that the demo takes a lot of GPU memory, which makes it difficult to test large images with multiple text instances (it usually crashes because of the memory limit). Is there any way to work around this, e.g. resizing the image before testing?
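One way to reason about this is to look at the size the image is resized to before inference. The sketch below mirrors the usual shortest-edge resize rule controlled by the INPUT.MIN_SIZE_TEST and INPUT.MAX_SIZE_TEST config keys; `shortest_edge_size` is a hypothetical helper written for illustration, not a function from the repository, and the defaults of 800 and 1333 are only example values.

```python
def shortest_edge_size(height, width, short_edge=800, max_size=1333):
    """Compute the (height, width) after a shortest-edge resize.

    The shorter side is scaled to `short_edge`; if the longer side would then
    exceed `max_size`, both sides are shrunk again so that it fits.
    """
    scale = short_edge / min(height, width)
    new_h, new_w = height * scale, width * scale
    if max(new_h, new_w) > max_size:
        shrink = max_size / max(new_h, new_w)
        new_h, new_w = new_h * shrink, new_w * shrink
    # Round to the nearest integer pixel size.
    return int(new_h + 0.5), int(new_w + 0.5)


# A 3000x4000 photo and a very wide 1000x4000 banner.
print(shortest_edge_size(3000, 4000))
print(shortest_edge_size(1000, 4000))
```

Lowering the two values reduces the number of pixels the network sees and therefore the memory used, at the likely cost of missing small text.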

Train on ICDAR 2013 dataset

Please provide a code snippet for registering the ICDAR 2013 dataset for training.

Do you have an annotation tool for custom data?

I need a tool to annotate custom data.

Could you please provide one?

Thanks.

How to get the word result in the word-level instance instead of the probability?

Hi, how can I get the word result for a word-level instance instead of the probability? Furthermore, could you give me some advice on making the model perform better on vertical or even inverted text using a new training dataset?
Looking forward to your help. Thanks.

KeyError: 'Non-existent config key: MODEL.TEXTFUSENET_MUTIL_PATH_FUSE_ON'

This error is raised when I run python demo/***_detection.py

This is the config file:

Tool to create training data?

I have custom data. Can you suggest a tool for annotating it in the same format as your example training data?

How can I build detectron2 with CUDA 10.0 or for CPU?

I have some problems when I try to run python demo/icdar2013_detection.py on PyTorch 1.4 with CUDA 10.0.

Step-by-step installation at https://github.com/facebookresearch/detectron2/blob/master/INSTALL.md
Step-by-step installation at https://github.com/ying09/TextFuseNet/blob/master/step-by-step%20installation.txt

RuntimeError: CUDA error: invalid device function ROIAlign_forward_cuda

Hi, I have a problem running python demo/icdar2013_detection.py
I followed step-by-step installation.txt.
Next, I set the file paths in demo/icdar2015_detection.py (testing images, configs) and then tried to launch the demo, but it failed.

Here is my error:

Can I recognize Chinese characters?

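I want to recognize Chinese characters. How should I train the model?

How well does the model perform with the char features removed?

@Real-YeJ Hello Ye, in practical work it is not easy to obtain character-level annotations or to train at the character level. If character-level features are not used, roughly what level does your model reach on the various metrics?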

Training a new model

@Real-YeJ @ying09

I have a folder containing ".jpg" images together with ".txt" files that hold their text-line annotations.
I want to train a new CTW1500-based model. How can I do that?

Hello, can you provide the trained model of ResNet50? Thank you!

Hello, thank you for sharing this great work. Could you please provide a trained ResNet50 model? It would be very helpful.

Looking forward to your reply, thank you.

Textline is cut

ImportError: cannot import name '_C' from 'detectron2'

Hi, I'm trying to run your demo.

I installed the PyTorch environment following your 'step-by-step installation.txt'.
But when I use python demo/icdar2015_detection.py to run the demo, I come across this problem.


Traceback (most recent call last):
  File "demo/icdar2015_detection.py", line 12, in <module>
    from detectron2.data.detection_utils import read_image
  File "C:\Users\Tianh\Desktop\1-detect\TextFuseNet-master\demo\detectron2\data\__init__.py", line 4, in <module>
    from .build import (
  File "C:\Users\Tianh\Desktop\1-detect\TextFuseNet-master\demo\detectron2\data\build.py", line 13, in <module>
    from detectron2.structures import BoxMode
  File "C:\Users\Tianh\Desktop\1-detect\TextFuseNet-master\demo\detectron2\structures\__init__.py", line 2, in <module>
    from .boxes import Boxes, BoxMode, pairwise_iou
  File "C:\Users\Tianh\Desktop\1-detect\TextFuseNet-master\demo\detectron2\structures\boxes.py", line 7, in <module>
    from detectron2.layers import cat
  File "C:\Users\Tianh\Desktop\1-detect\TextFuseNet-master\demo\detectron2\layers\__init__.py", line 3, in <module>
    from .deform_conv import DeformConv, ModulatedDeformConv
  File "C:\Users\Tianh\Desktop\1-detect\TextFuseNet-master\demo\detectron2\layers\deform_conv.py", line 10, in <module>
    from detectron2 import _C
ImportError: cannot import name '_C' from 'detectron2' (C:\Users\Tianh\Desktop\1-detect\TextFuseNet-master\demo\detectron2\__init__.py)

Do you know why? Thanks!
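One plausible cause, judging only from the traceback: detectron2 is being imported from the ...\demo\detectron2 folder, and `_C` is the compiled extension that python setup.py build develop produces, so a plain copy of the source folder would not contain it. The check below is a small diagnostic sketch; `find_compiled_extension` is a hypothetical helper, and the file patterns are assumptions about how the built extension is usually named.

```python
import glob
import os


def find_compiled_extension(package_dir, name="_C"):
    """Return the compiled extension files called `name` inside package_dir.

    Looks for the usual suffixes: .so on Linux/macOS and .pyd on Windows.
    An empty list suggests the extension was never built in that folder.
    """
    found = []
    for pattern in (name + "*.so", name + "*.pyd"):
        found.extend(glob.glob(os.path.join(package_dir, pattern)))
    return sorted(found)


# Example: inspect the package directory that appears in a traceback.
print(find_compiled_extension(os.path.join("demo", "detectron2")))
```

If the list is empty for the folder named in the traceback, building the project from the repository root and running the demo from there is the first thing to try.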

Unable to compile if pytorch version is >1.4?

Hi,

I'm trying to pair the detection model with a recognition model that I have already trained while working on the character annotations. However, since I trained that model with PyTorch 1.6 and PyTorch is not forward compatible in this case, I need to use PyTorch 1.6 to compile the detectron2 that ships with TextFuseNet. There seems to be an issue with older versions of detectron2 when compiling with PyTorch >1.4.

I've tried compiling the new detectron2 on my own with the fvcore that was provided, but I was met with the error AttributeError: module 'fvcore' has no attribute 'version'

I've also tried using pip's fvcore, but that produced another error about a missing textfusenet key, which I assume means the provided detectron2 is modified.

Is there any way to use TextFuseNet with a newer version of PyTorch?

Does the tool detect languages other than English?

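On the purpose of summing the features of different characters in Fig. 3

Hello authors, for the character-level features in Fig. 3, the paper says that the features of each character are resized to 14×14 and then summed. But these are features of different characters, for example the features of B added to the features of A. What is the purpose of doing this?
Looking forward to your reply, thanks.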

Use pre-trained model for prediction without GPU

Hi,
I'd like to try the TextFuseNet architecture without training on new data, only to assess the performance of the model. Is it possible to do this without a GPU?
I followed the step-by-step installation guide and placed the detection model in the folder the Python file expects, but the demo requires a GPU when I run it.

Is there something missing here?

Question of NUM_CLASSES

I have a question about training on a Korean dataset.

I proceeded as follows:

1. Write the config file.
2. Register the dataset (my dataset name is AISL dataset).
3. Start training with the command below.

$ python tools/train_net.py --num-gpus 4 --config-file

Below is the config file (I only changed the dataset name from the Total-Text config file):

_BASE_: "./Base-RCNN-FPN.yaml"
MODEL:
  MASK_ON: True
  TEXTFUSENET_MUTIL_PATH_FUSE_ON: True
  WEIGHTS: "./out_dir_r101/totaltext_model/model_tt_r101.pth"
  PIXEL_STD: [57.375, 57.120, 58.395]
  RESNETS:
    STRIDE_IN_1X1: False  # this is a C2 model
    NUM_GROUPS: 32
    WIDTH_PER_GROUP: 8
    DEPTH: 101
  ROI_HEADS:
    NMS_THRESH_TEST: 0.4
  TEXTFUSENET_SEG_HEAD:
    FPN_FEATURES_FUSED_LEVEL: 1
    POOLER_SCALES: (0.125,)

DATASETS:
  TRAIN: ("AISLText",)
  TEST: ("AISLText",)
SOLVER:
  IMS_PER_BATCH: 8
  BASE_LR: 0.001
  STEPS: (40000,80000,)
  MAX_ITER: 120000
  CHECKPOINT_PERIOD: 2500

INPUT:
  MIN_SIZE_TRAIN: (800,1000,1200)
  MAX_SIZE_TRAIN: 1500
  MIN_SIZE_TEST: 800
  MAX_SIZE_TEST: 1333


OUTPUT_DIR: "./out_dir_r101/at_model/"

I registered the dataset in detectron2/data/datasets/builtin.py:

image_path = "/home/ensa/JYB/TextFuseNet/datasets/AISLText/train_images"
json_path = "/home/ensa/JYB/TextFuseNet/datasets/AISLText/trainval.json"
register_coco_instances("AISLText", {},json_path, image_path)

An error occurs during training:

[01/19 18:35:50 d2.data.datasets.coco]: Loaded 3 images in COCO format from /home/ensa/JYB/TextFuseNet/datasets/AISLText/trainval.json
[01/19 18:35:50 d2.data.build]: Removed 0 images with no usable annotations. 3 images left.
[01/19 18:35:50 d2.data.build]: Distribution of training instances among all 31 categories:
|  category  | #instances   |  category  | #instances   |  category  | #instances   |
|:----------:|:-------------|:----------:|:-------------|:----------:|:-------------|
|     -      | 2            |     0      | 2            |     1      | 2            |
|     3      | 3            |     5      | 1            |     7      | 2            |
|     A      | 2            |     B      | 2            |     E      | 4            |
|     K      | 2            |     L      | 2            |     R      | 1            |
|     a      | 1            |     b      | 1            |     c      | 1            |
|     e      | 2            |     i      | 1            |     m      | 1            |
|     o      | 2            |     r      | 3            |     t      | 1            |
|    text    | 7            |     u      | 1            |     y      | 1            |
|     강      | 1            |     료      | 1            |     실      | 3            |
|     의      | 1            |     자      | 1            |     장      | 1            |
|     화      | 1            |            |              |            |              |
|   total    | 56           |            |              |            |              |
[01/19 18:35:50 d2.data.detection_utils]: TransformGens used in training: [ResizeShortestEdge(short_edge_length=(800, 1000, 1200), max_size=1500, sample_style='choice'), RandomFlip(), RandomContrast(intensity_min=0.5, intensity_max=1.5), RandomBrightness(intensity_min=0.5, intensity_max=1.5), RandomSaturation(intensity_min=0.5, intensity_max=1.5), RandomLighting(scale=1.1931034212737668)]
[01/19 18:35:50 d2.data.build]: Using training sampler TrainingSampler
[01/19 18:35:51 fvcore.common.checkpoint]: Loading checkpoint from ./out_dir_r101/totaltext_model/model_tt_r101.pth
[01/19 18:35:51 d2.engine.train_loop]: Starting training from iteration 0
[01/19 18:35:53 d2.engine.hooks]: Total training time: 0:00:01 (0:00:00 on hooks)
Traceback (most recent call last):
  File "tools/train_net.py", line 161, in <module>
    args=(args,),
  File "/home/ensa/JYB/TextFuseNet/detectron2/engine/launch.py", line 49, in launch
    daemon=False,
  File "/home/ensa/anaconda3/envs/textfusenet2/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 171, in spawn
    while not spawn_context.join():
  File "/home/ensa/anaconda3/envs/textfusenet2/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 118, in join
    raise Exception(msg)
Exception: 

-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/home/ensa/anaconda3/envs/textfusenet2/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
    fn(i, *args)
  File "/home/ensa/JYB/TextFuseNet/detectron2/engine/launch.py", line 84, in _distributed_worker
    main_func(*args)
  File "/home/ensa/JYB/TextFuseNet/tools/train_net.py", line 149, in main
    return trainer.train()
  File "/home/ensa/JYB/TextFuseNet/detectron2/engine/defaults.py", line 356, in train
    super().train(self.start_iter, self.max_iter)
  File "/home/ensa/JYB/TextFuseNet/detectron2/engine/train_loop.py", line 132, in train
    self.run_step()
  File "/home/ensa/JYB/TextFuseNet/detectron2/engine/train_loop.py", line 212, in run_step
    loss_dict = self.model(data)
  File "/home/ensa/anaconda3/envs/textfusenet2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ensa/anaconda3/envs/textfusenet2/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 442, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/ensa/anaconda3/envs/textfusenet2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ensa/JYB/TextFuseNet/detectron2/modeling/meta_arch/rcnn.py", line 88, in forward
    _, detector_losses = self.roi_heads(images, features, proposals, gt_instances)
  File "/home/ensa/anaconda3/envs/textfusenet2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ensa/JYB/TextFuseNet/detectron2/modeling/roi_heads/roi_heads.py", line 584, in forward
    losses.update(self._forward_mask(features_list, proposals, targets))
  File "/home/ensa/JYB/TextFuseNet/detectron2/modeling/roi_heads/roi_heads.py", line 684, in _forward_mask
    mask_features = self.mutil_path_fuse_module(mask_features, global_context, proposals)
  File "/home/ensa/anaconda3/envs/textfusenet2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ensa/JYB/TextFuseNet/detectron2/modeling/roi_heads/mutil_path_fuse_module.py", line 110, in forward
    feature_fuse = char_context + x + global_context
RuntimeError: The size of tensor a (19) must match the size of tensor b (145) at non-singleton dimension 0

To check whether training works at all, I tested with just 3 images, and this error occurred.

I compared your sample COCO format with mine, and they are the same.

I need to train on at least 1000 characters. Is this error related to the number of characters, or to the input size?

Thank you for reading.
Please help.

Code for ground-truth generation

Hi,
thanks for the fantastic research. Is there code for running the pretrained model on a new dataset and generating ground truth (a COCO JSON file) containing character-level annotations?

weakly supervised training?

Will you share the details of the weakly supervised part?

Thanks.

RuntimeError: CUDA out of memory

I'm aware that this is more of a hardware issue on my side, but I was wondering whether there is any way to make the model a little smaller to save GPU memory. Thank you in advance!

Please store pretrained model in Google Drive where we can easily access, not able to access via Baidu Netdisk

@ying09

IMS_PER_BATCH: 2 raises an error

Training with a batch size of 1 works, but a batch size of 2 fails:

Traceback (most recent call last):
  File "tools/train_net.py", line 161, in <module>
    args=(args,),
  File "/media/data/bachtuan/TextFuseNet/detectron2/engine/launch.py", line 52, in launch
    main_func(*args)
  File "tools/train_net.py", line 149, in main
    return trainer.train()
  File "/media/data/bachtuan/TextFuseNet/detectron2/engine/defaults.py", line 356, in train
    super().train(self.start_iter, self.max_iter)
  File "/media/data/bachtuan/TextFuseNet/detectron2/engine/train_loop.py", line 132, in train
    self.run_step()
  File "/media/data/bachtuan/TextFuseNet/detectron2/engine/train_loop.py", line 212, in run_step
    loss_dict = self.model(data)
  File "/home/asilla/miniconda3/envs/textfusenet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/data/bachtuan/TextFuseNet/detectron2/modeling/meta_arch/rcnn.py", line 88, in forward
    _, detector_losses = self.roi_heads(images, features, proposals, gt_instances)
  File "/home/asilla/miniconda3/envs/textfusenet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/data/bachtuan/TextFuseNet/detectron2/modeling/roi_heads/roi_heads.py", line 584, in forward
    losses.update(self._forward_mask(features_list, proposals, targets))
  File "/media/data/bachtuan/TextFuseNet/detectron2/modeling/roi_heads/roi_heads.py", line 684, in _forward_mask
    mask_features = self.mutil_path_fuse_module(mask_features, global_context, proposals)
  File "/home/asilla/miniconda3/envs/textfusenet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/data/bachtuan/TextFuseNet/detectron2/modeling/roi_heads/mutil_path_fuse_module.py", line 94, in forward
    text = x[char_pos[i]]
IndexError: The shape of the mask [2] at index 0does not match the shape of the indexed tensor [9, 256, 14, 14] at index 0

Multi-path fusion in the detection branch?

Hi, thank you for the interesting work.

I'm confused about the multi-path fusion in the detection branch.
The paper describes multi-path fusion in the detection branch, which fuses "word level features" and "global level features (from Semantic segmentation branch)". This is depicted in Figure 2 and explained in Sections 3.1 and 3.2 of the paper.

But in the code, the multi-path fusion in the detection branch is not there.
The method "_forward_box" of class "StandardROIHeads" in /detectron2/modeling/roi_heads.py does not use multi-path fusion, unlike the method "_forward_mask" in the same class. Is that right?
Moreover, "mutil_path_fuse_module.py" states that its argument is the mask ROI features.

Is there anything I missed? Thank you.

Installation error

When I follow the step-by-step installation instructions exactly, I get an error message like this when I run the demo code:

However, when I use a different version of PyTorch, installed with pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html, and then run python setup.py build develop, I get this error:

I tried to copy your environment exactly, changed my CUDA version from 10.2 to 10.1, and followed your step-by-step instructions, but it still doesn't work. Can you give me a hint on what I need to do? Thanks! @Real-YeJ

error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

I followed the step-by-step installation.
An error occurred while executing this command:

python setup.py build develop

I used gcc-4.9 and Python 3.6.8.
Is it a version problem?

Please help.

About mutil_path_fuse_module

Hello authors, the flowchart in the paper shows mutil_path_fuse_module in both the detection branch and the mask branch, but in the code only the _forward_mask function in roi_heads.py calls mutil_path_fuse_module; the detection branch does not appear to call it. Does the final implementation use mutil_path_fuse_module in the detection branch?

Question about training on synthetic data and FUNSD data

I have tried to train the model on synthetic data (keras-ocr https://keras-ocr.readthedocs.io/en/latest/examples/end_to_end_training.html#generating-synthetic-data). I have 10000 background images. So far I have trained for 25000 iterations starting from the pre-trained SynthText model weights, but I do not see any results. Can you tell me how many iterations I need to train the model for?
I have also tried training on the https://guillaumejaume.github.io/FUNSD/download/ dataset, which consists of documents annotated at the word level. I modified your code slightly to train only at the word level, starting from the pretrained CTW model weights. My metrics file follows; can you tell me whether it looks fine or whether I have done something wrong?
metrics.txt

Training using an 8 GB GPU and inferencing using CPU

Hi there,

- Can TextFuseNet be trained using a single 8 GB GPU?
- Can inference run on CPU only?
- How much RAM is required for inference?

Where can I find the corresponding JSON files?

Hello authors! Could you provide the corresponding JSON files? Thanks.

How to get the corresponding mask area of all the texts in roi_heads.py in the inference stage, not the probability

Hello, in the def _forward_mask() of roi_heads.py, only the mask probability of each roi is provided in the inference phase. How can I get the mask of each text instance in roi_heads.py in the inference phase (that is, get the coordinates of the contour of each mask)? I have tried for a long time, but still can't get the mask area of each text instance. Please help me, thanks.
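A rough sketch of the usual two steps, thresholding the probabilities and then tracing the boundary, written in plain Python so it runs anywhere. It assumes the mask probabilities have already been resized to the coordinate frame you care about; `binarize` and `boundary_pixels` are hypothetical helpers and not part of the repository. In practice, OpenCV's cv2.findContours on the binarized mask is the more common choice.

```python
def binarize(prob_mask, threshold=0.5):
    """Turn a 2D list of probabilities into a 0/1 mask."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_mask]


def boundary_pixels(binary_mask):
    """Return (x, y) of mask pixels that touch the background or the border.

    A pixel counts as boundary when any of its 4 neighbours is outside the
    mask. The result is an unordered point set, not a traced polygon.
    """
    height = len(binary_mask)
    width = len(binary_mask[0]) if height else 0
    points = []
    for y in range(height):
        for x in range(width):
            if not binary_mask[y][x]:
                continue
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nx, ny = x + dx, y + dy
                outside = nx < 0 or ny < 0 or nx >= width or ny >= height
                if outside or not binary_mask[ny][nx]:
                    points.append((x, y))
                    break
    return points


# Made-up 5x5 probability map with a confident 3x3 block in the middle.
prob = [[0.9 if 1 <= x <= 3 and 1 <= y <= 3 else 0.1 for x in range(5)]
        for y in range(5)]
contour = boundary_pixels(binarize(prob))
```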

Does the code currently support only batch_size == 1?

Mutil_Path_Fuse_Module::forward

    if self.training:
        proposal_boxes = proposals[0].proposal_boxes
        classes = proposals[0].gt_classes
    else:
        proposal_boxes = proposals[0].pred_boxes
        classes = proposals[0].pred_classes

    if len(proposal_boxes) == 0:
        return x

The code only takes proposals[0]; when batch_size > 1, text = x[char_pos[i]] raises an error.
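To illustrate the mismatch, here is a sketch under the assumption that the ROI features of a batch are stacked along the first dimension in image order, so a mask built from the first image's proposals alone is shorter than the stacked features. `split_by_image` is a hypothetical helper that works on plain lists; with tensors, torch.split with a list of sizes plays the same role.

```python
def split_by_image(roi_features, rois_per_image):
    """Split a flat list of per-ROI features into one chunk per image."""
    if sum(rois_per_image) != len(roi_features):
        raise ValueError("ROI counts do not add up to the number of features")
    chunks, start = [], 0
    for count in rois_per_image:
        chunks.append(roi_features[start:start + count])
        start += count
    return chunks


# Nine stacked ROI features from a batch of two images with 2 and 7 proposals,
# the same shapes as in the IndexError reported for IMS_PER_BATCH: 2.
features = ["roi%d" % i for i in range(9)]
chunks = split_by_image(features, [2, 7])
```

Each chunk can then be processed with the proposals of its own image, which is what a batch-aware version of the module would need to do.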

The meaning of each tag in the JSON file

"annotations": [
    {
        "area": 14902.5,
        "bbox": [
            817,
            431,
            164,
            162
        ],..

What is "area"?
And is bbox [xmin, ymin, xmax, ymax]?
Please help me create a custom train.json.
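In the COCO annotation format, bbox is [x, y, width, height] measured from the top-left corner, and area is the pixel area of the annotated region. The values quoted above fit that reading, since 164 and 162 are smaller than 817 and 431. The sketch below derives both fields from a polygon; the helper names and the triangle are made up for illustration.

```python
def polygon_area(flat_polygon):
    """Shoelace area of a polygon given as [x1, y1, x2, y2, ...]."""
    xs, ys = flat_polygon[0::2], flat_polygon[1::2]
    twice_area = 0.0
    for i in range(len(xs)):
        j = (i + 1) % len(xs)
        twice_area += xs[i] * ys[j] - xs[j] * ys[i]
    return abs(twice_area) / 2.0


def polygon_bbox_xywh(flat_polygon):
    """COCO-style [x, y, width, height] box enclosing the polygon."""
    xs, ys = flat_polygon[0::2], flat_polygon[1::2]
    x_min, y_min = min(xs), min(ys)
    return [x_min, y_min, max(xs) - x_min, max(ys) - y_min]


# Made-up triangle whose enclosing box matches the bbox quoted above.
triangle = [817, 431, 981, 431, 817, 593]
annotation = {
    "segmentation": [triangle],
    "bbox": polygon_bbox_xywh(triangle),
    "area": polygon_area(triangle),
}
```

Because the area belongs to the polygon and not to the box, it is normally smaller than width times height, as it is in the quoted annotation.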

Train the model on the ICDAR 2015 dataset

Good day! I want to train the model on the ICDAR 2015 dataset. Is there any way to convert the data into a form that the loader can understand? I have already read the README file in the datasets folder, but I am looking for conversion code that would help. Thank you.
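As a starting point, here is a sketch that turns one ICDAR 2015 ground-truth line of the form x1,y1,x2,y2,x3,y3,x4,y4,transcription into a COCO-style annotation dictionary. Several things are assumptions: `icdar_line_to_annotation` is a hypothetical helper, category_id=1 for the word class is a placeholder, and marking "###" (do not care) regions as iscrowd is just one possible choice. ICDAR 2015 provides word-level boxes only, so this does not produce the character-level annotations that appear in the repository's sample JSON.

```python
def icdar_line_to_annotation(line, image_id, annotation_id, category_id=1):
    """Convert one ICDAR 2015 ground-truth line to a COCO-style dict."""
    # Some ground-truth files start with a byte-order mark; drop it if present.
    parts = line.strip().lstrip("\ufeff").split(",")
    coords = [float(value) for value in parts[:8]]
    # The transcription itself may contain commas, so re-join the remainder.
    transcription = ",".join(parts[8:])
    xs, ys = coords[0::2], coords[1::2]
    x_min, y_min = min(xs), min(ys)
    # Shoelace formula for the quadrilateral area.
    twice_area = 0.0
    for i in range(4):
        j = (i + 1) % 4
        twice_area += xs[i] * ys[j] - xs[j] * ys[i]
    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_id,  # placeholder id for the word class
        "segmentation": [coords],
        "bbox": [x_min, y_min, max(xs) - x_min, max(ys) - y_min],
        "area": abs(twice_area) / 2.0,
        "iscrowd": 1 if transcription == "###" else 0,
    }


# Made-up example line.
ann = icdar_line_to_annotation("10,20,110,20,110,60,10,60,hello", 1, 1)
```

The resulting dictionaries would still need to be collected into a full COCO file with "images" and "categories" sections before registration.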

Question about training custom dataset

Hi, I have 2 questions about training:

1. Are both bbox and segmentation used?
2. Do we have to label character by character, or would words/sentences be fine as well? The sample JSON seems to have labels only by character (e.g. annotation[0] = category_id: 1, segmentation: [[...]]), but since it's a detection model I'm not sure whether the label (category_id) matters.

Thanks for your time and the model!

Textfusenet model pth to mar torchserve?

Has anyone done this yet?
Is it possible to share the code?
I tried torch-model-archiver --model-name textfusenet --version 1.0 --model-file model.py --serialized-file model.pth --export-path model_store --extra-files config.yaml, with model.py = model_zoo.py, and it did not succeed.

synthtext pretrain

demo error. _C.cpython-37m-x86_64-linux-gnu.so: undefined symbol:

I followed the step-by-step installation. (https://github.com/ying09/TextFuseNet/blob/master/step-by-step%20installation.txt)
I got an error when running the demo.

Traceback (most recent call last):
  File "demo/icdar2013_detection.py", line 12, in <module>
    from detectron2.data.detection_utils import read_image
  File "/home/ubuntu/source/TextFuseNet/detectron2/data/__init__.py", line 4, in <module>
    from .build import (
  File "/home/ubuntu/source/TextFuseNet/detectron2/data/build.py", line 13, in <module>
    from detectron2.structures import BoxMode
  File "/home/ubuntu/source/TextFuseNet/detectron2/structures/__init__.py", line 2, in <module>
    from .boxes import Boxes, BoxMode, pairwise_iou
  File "/home/ubuntu/source/TextFuseNet/detectron2/structures/boxes.py", line 7, in <module>
    from detectron2.layers import cat
  File "/home/ubuntu/source/TextFuseNet/detectron2/layers/__init__.py", line 3, in <module>
    from .deform_conv import DeformConv, ModulatedDeformConv
  File "/home/ubuntu/source/TextFuseNet/detectron2/layers/deform_conv.py", line 10, in <module>
    from detectron2 import _C
ImportError: /home/ubuntu/source/TextFuseNet/detectron2/_C.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC1ENS_14SourceLocationERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE

Runtime error

Hi I am getting the following runtime error:

(textfusenet) mickey@MICKEY-2080TI:/mnt/d/download/GitHub/Examples/2020-09-28 TextFuseNet/TextFuseNet-master$ python demo/icdar2015_detection.py --input one-frame.jpg 
Config './configs/ocr/icdar2015_101_FPN.yaml' has no VERSION. Assuming it to be compatible with latest v2.
Traceback (most recent call last):
  File "demo/icdar2015_detection.py", line 128, in <module>
    for i in glob.glob(test_images_path):
  File "/home/mickey/miniconda3/envs/textfusenet/lib/python3.7/glob.py", line 20, in glob
    return list(iglob(pathname, recursive=recursive))
  File "/home/mickey/miniconda3/envs/textfusenet/lib/python3.7/glob.py", line 40, in _iglob
    dirname, basename = os.path.split(pathname)
  File "/home/mickey/miniconda3/envs/textfusenet/lib/python3.7/posixpath.py", line 107, in split
    p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not list
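The traceback shows glob.glob receiving a list where it expects a single pattern string. One way this can happen is a command-line option that collects its values into a list, although how the demo script defines --input has not been checked here. The sketch below is a hypothetical helper, `expand_inputs`, that accepts either form.

```python
import glob


def expand_inputs(inputs):
    """Accept one glob pattern or a list of patterns and return matching paths."""
    if isinstance(inputs, str):
        inputs = [inputs]
    paths = []
    for pattern in inputs:
        # glob.glob takes exactly one pattern string per call.
        paths.extend(sorted(glob.glob(pattern)))
    return paths


# Both call styles work; non-matching patterns simply contribute nothing.
print(expand_inputs("one-frame.jpg"))
print(expand_inputs(["one-frame.jpg", "./input_images/*.jpg"]))
```

Looping over the returned paths in place of glob.glob(test_images_path) would avoid the TypeError regardless of how the argument was parsed.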

Best,
Mickey

Calculate F-measure, Recall, and Precision

Can you guide me on how to calculate F-measure, Recall, and Precision using this code? Do we need to implement it ourselves?
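For reference, the three numbers follow from simple counts once detections have been matched to ground-truth boxes. The sketch below covers only that final arithmetic; the matching rule itself (usually an overlap threshold defined by the benchmark's official evaluation script) is not implemented, and `detection_scores` is a hypothetical helper.

```python
def detection_scores(num_matched, num_detections, num_ground_truth):
    """Return (precision, recall, f_measure) from match counts.

    num_matched is the number of detections accepted as correct matches.
    """
    precision = num_matched / num_detections if num_detections else 0.0
    recall = num_matched / num_ground_truth if num_ground_truth else 0.0
    total = precision + recall
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Made-up counts: 80 correct detections out of 100, with 160 ground-truth boxes.
precision, recall, f_measure = detection_scores(80, 100, 160)
```

For published comparisons, the official evaluation scripts of each benchmark should be used so that the matching rule is the same as in the paper.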

Dataset format

Hi, what is the format of the training dataset ground truth? Is it similar to the ground truth for detection or for semantic segmentation? Should the location of each word be labeled? Can this model be used for semantic segmentation tasks only? Thank you very much!

RuntimeError: Not compiled with GPU support

Hi @ying09 ,

I followed the instructions in the step-by-step installation.txt and was able to go through with no issues.

However, when I try to run the demo\icdar2013_detection.py along with the required options, I get an error RuntimeError: Not compiled with GPU support.

Both the input options and the error are shown in the screenshot below:

Let me know if you need any other information.

Results when using ResNet50 as the backbone

Hello, existing text detectors generally use ResNet50 as the backbone, but the results given in the paper are for ResNet101. What are the results of TextFuseNet on the various datasets when using ResNet50 as the backbone?