weijiawu / transdetr

[IJCV 2024] TransDETR: End-to-end Video Text Spotting with Transformer

Python 90.67% Shell 2.60% C++ 0.53% Cuda 6.12% C 0.09%

transdetr's Issues

Problem encountered during testing

Sorry to bother you again. During testing I ran sh configs/r50_TransDETR_eval_ICDAR2015.sh and ran into this problem.

Have you encountered the same problem?

Error in the evaluation script

Hello, excellent work!

I was trying the evaluation script and got this error, which I could not get past.

What could be the issue here?

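CUDA OOM on A5000 with batchsize=1

When running the DSText baseline with batchsize=1 on a single A5000, I get a CUDA out-of-memory error.
According to the README, 8 V100 GPUs with 32 GB each can run batchsize=16, so a single 24 GB card running batchsize=1 should not hit this problem.
Which other parameters should I adjust to reduce GPU memory usage in this case?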

How should the ICDAR2015 dataset be organized?

This is the ICDAR2015 dataset I downloaded. How should I arrange it into the layout used in your dataset directory?

How much training time is required on 8 V100 GPUs?

Hello! Great work. The README states that training was done on 8 NVIDIA V100 GPUs with batchsize 16, but in practice it seems that the batch size per GPU can only be set to 1. How can this be resolved? Looking forward to your reply.

FP track instances' obj_idx is not set to -2

TransDETR/models/qim.py

Line 109 in b6c73c5

 def _add_fp_tracks(self, track_instances: Instances, active_track_instances: Instances) -> Instances: 

Error during DSText inference

Hello, I get the following error when running inference on DSText:
Traceback (most recent call last):
  File "eval.py", line 905, in <module>
    detr, _, _ = build_model(args)
  File "/home/1080ti/lhe/transdetr_new/TransDETR-main/models/__init__.py", line 24, in build_model
    return build_func(args)
  File "/home/1080ti/lhe/transdetr_new/TransDETR-main/models/TransDETR_ignored.py", line 1066, in build
    criterion = ClipMatcher(num_classes, matcher=img_matcher, weight_dict=weight_dict, losses=losses,language=language)
  File "/home/1080ti/lhe/transdetr_new/TransDETR-main/models/TransDETR_ignored.py", line 64, in __init__
    self.voc, self.char2id, self.id2char = get_vocabulary(language, use_ctc=True)
  File "/home/1080ti/lhe/transdetr_new/TransDETR-main/datasets/data_tools.py", line 306, in get_vocabulary
    with open('/share/lizhuang05/code/pan_pp.pytorch_dev/data/keys.json', encoding='utf-8') as j:
FileNotFoundError: [Errno 2] No such file or directory: '/share/lizhuang05/code/pan_pp.pytorch_dev/data/keys.json'

How can I resolve this? Removing --is_bilingual seems to let the evaluation run normally, but I do not know whether it affects the output results.
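The traceback shows get_vocabulary opening a hard-coded absolute path that exists only on the original author's machine. One possible workaround, sketched here under assumptions, is to resolve keys.json from a configurable location instead. The helper name resolve_keys_path, the environment variable TRANSDETR_KEYS_JSON, and the fallback location data/keys.json are all hypothetical and are not part of the repository; the sketch also does not say where keys.json itself can be obtained.

```python
import os
from pathlib import Path

# Hypothetical names: neither the environment variable nor this helper
# exists in the TransDETR repository; they only illustrate the idea.
ENV_VAR = "TRANSDETR_KEYS_JSON"


def resolve_keys_path(repo_root, env=None):
    """Return the first existing candidate path for keys.json.

    Candidates, in order: the path given in the environment variable,
    then <repo_root>/data/keys.json (an assumed location).
    """
    env = os.environ if env is None else env
    candidates = []
    if env.get(ENV_VAR):
        candidates.append(Path(env[ENV_VAR]))
    candidates.append(Path(repo_root) / "data" / "keys.json")
    for path in candidates:
        if path.is_file():
            return path
    tried = ", ".join(str(p) for p in candidates)
    raise FileNotFoundError(f"keys.json not found; tried: {tried}")
```

With a helper like this, the open(...) call in get_vocabulary could take the resolved path rather than the fixed string, and a missing file would produce an error message listing every location that was tried.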

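Question about testing

Hello, and thank you for open-sourcing such good work!
I tested the released model on ICDAR2015 and submitted the results to https://rrc.cvc.uab.es/?ch=3&com=introduction, but the evaluation result is always 0. What could be the cause?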

Import error

The Rotated_ROIAlign module has already been built, but this error is raised during training:
import rotated_roi as _C
ModuleNotFoundError: No module named 'rotated_roi'

Google Drive checkpoint

Could you allow access to the checkpoint through Google Drive?

Some questions about your proposed Rotated RoI

Could you please clarify the difference between your Rotated RoI and the Rotated RoI Pooling and Alignment operations proposed in [1], [2] and [3]? Although your proposed Rotated RoI differs from RoIRotate (you mention that you use bilinear interpolation to map the feature grid), it appears to be the same component that was proposed in [1-3]. If there is no difference between the two, please cite these papers in your paper and remove the "propose" claim.

[1] Ma, J., Shao, W., Ye, H., Wang, L., Wang, H., Zheng, Y., & Xue, X. (2018). Arbitrary-oriented scene text detection via rotation proposals. IEEE Transactions on Multimedia, 20(11), 3111-3122.
[2] He, T., Tian, Z., Huang, W., Shen, C., Qiao, Y., & Sun, C. (2018). An end-to-end textspotter with explicit alignment and attention. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 5020-5029).
[3] Ma, J. (2020). RRPN++: Guidance towards more accurate scene text detection. arXiv preprint arXiv:2009.13118.

When will the code be available?

Thanks for sharing your fantastic work; when will the code be released?

Evaluation result on the official website of the ICDAR2015 dataset

Hello, dear author. The evaluation result on the official website of the ICDAR2015 dataset reports only MOTA, MOTP and IDF1. How can I obtain precision (P) and recall (R)?

No module named 'rotated_roi'

Hi there, I am facing this issue with the module even though I have already built the dependency as follows:

cd ./models/Rotated_ROIAlign
python setup.py build_ext --inplace

The dependent library ninja is also installed. My main goal is to load the pre-trained model and run inference.
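When a build succeeds but the import still fails, one thing worth checking is whether a compiled rotated_roi file that matches the running interpreter actually exists, and whether its directory is on sys.path. The following is a minimal diagnostic sketch using only the standard library; the helper names find_extension_dirs and make_importable are hypothetical, and the sketch assumes the compiled file is named rotated_roi plus one of the interpreter's extension suffixes.

```python
import sys
from importlib.machinery import EXTENSION_SUFFIXES
from pathlib import Path


def find_extension_dirs(search_root, module_name="rotated_roi"):
    """Return directories under search_root that contain a compiled
    extension named module_name with a suffix this interpreter accepts."""
    found = []
    for suffix in EXTENSION_SUFFIXES:
        for path in Path(search_root).rglob(module_name + suffix):
            if path.parent not in found:
                found.append(path.parent)
    return found


def make_importable(search_root, module_name="rotated_roi"):
    """Prepend the first matching directory to sys.path.

    Returns that directory, or None when nothing matching was found
    (for example, when the build used a different Python version).
    """
    dirs = find_extension_dirs(search_root, module_name)
    if not dirs:
        return None
    sys.path.insert(0, str(dirs[0]))
    return dirs[0]
```

Calling make_importable("./models/Rotated_ROIAlign") before the import would show whether a usable binary is present. A None result suggests the extension was built with a different interpreter or the build output is somewhere else.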

What is the difference between TransDETR_ignored and TransDETR?

Could you explain the difference between TransDETR_ignored and TransDETR?

undefined symbol: _ZN6caffe28TypeMeta21_typeMetaDataInstanceISt7complexIdEEEPKNS_6detail12TypeMetaDataEv

Thanks for your outstanding work! However, something went wrong when I tried to reproduce your project. The error log follows.

Traceback (most recent call last):
  File "main.py", line 26, in <module>
    import datasets
  File "/data1/ljh/code/TransDETR/datasets/__init__.py", line 15, in <module>
    from .detmot import build as build_e2e_mot
  File "/data1/ljh/code/TransDETR/datasets/detmot.py", line 23, in <module>
    from models.structures import Instances
  File "/data1/ljh/code/TransDETR/models/__init__.py", line 12, in <module>
    from .TransDETR import build as build_TransDETR
  File "/data1/ljh/code/TransDETR/models/TransDETR.py", line 32, in <module>
    from .deformable_transformer_plus import build_deforamble_transformer
  File "/data1/ljh/code/TransDETR/models/deformable_transformer_plus.py", line 25, in <module>
    from models.ops.modules import MSDeformAttn
  File "/data1/ljh/code/TransDETR/models/ops/modules/__init__.py", line 9, in <module>
    from .ms_deform_attn import MSDeformAttn
  File "/data1/ljh/code/TransDETR/models/ops/modules/ms_deform_attn.py", line 21, in <module>
    from ..functions import MSDeformAttnFunction
  File "/data1/ljh/code/TransDETR/models/ops/functions/__init__.py", line 9, in <module>
    from .ms_deform_attn_func import MSDeformAttnFunction
  File "/data1/ljh/code/TransDETR/models/ops/functions/ms_deform_attn_func.py", line 18, in <module>
    import MultiScaleDeformableAttention as MSDA
ImportError: /data1/ljh/anaconda3/envs/transDETR/lib/python3.7/site-packages/MultiScaleDeformableAttention-1.0-py3.7-linux-x86_64.egg/MultiScaleDeformableAttention.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe28TypeMeta21_typeMetaDataInstanceISt7complexIdEEEPKNS_6detail12TypeMetaDataEv

How can I solve this problem? It looks like MSDA does not match the installed version of mmcv.
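The undefined symbol is a mangled C++ name, and reading its leading components shows which library it belongs to. The sketch below decodes only the length-prefixed identifiers that follow "_ZN" in an Itanium-ABI mangled name; it is not a full demangler, and the function name nested_name_prefix is hypothetical.

```python
def nested_name_prefix(symbol):
    """Decode the leading length-prefixed identifiers of an Itanium-ABI
    mangled C++ symbol (the part right after "_ZN").

    Stops at the first component that is not of the form
    <decimal length><identifier>, such as template arguments.
    """
    if not symbol.startswith("_ZN"):
        return []
    parts, i = [], 3
    while i < len(symbol) and symbol[i].isdigit():
        j = i
        while j < len(symbol) and symbol[j].isdigit():
            j += 1
        length = int(symbol[i:j])
        parts.append(symbol[j:j + length])
        i = j + length
    return parts


symbol = ("_ZN6caffe28TypeMeta21_typeMetaDataInstance"
          "ISt7complexIdEEEPKNS_6detail12TypeMetaDataEv")
print("::".join(nested_name_prefix(symbol)))
# prints caffe2::TypeMeta::_typeMetaDataInstance
```

The caffe2::TypeMeta name comes from PyTorch's own C++ core, so an error like this typically suggests the extension was compiled against a different PyTorch build than the one loaded at runtime. Rebuilding the extension inside the active environment is a common first thing to try; that is a general observation rather than a confirmed diagnosis for this report.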

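About the Minetto and YVT datasets

Could you tell me how these two datasets can be obtained now? I have searched many sources and followed the published links, but the pages no longer open.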

Requirements not clear

Can you provide detailed package versions and dependencies?
Running the script r50_TransDETR_eval_ICDAR2015.sh gives the error:

ImportError: Rotated_ROIAlign/build/lib.linux-x86_64-cpython-37/rotated_roi.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe28TypeMeta21_typeMetaDataInstanceIdEEPKNS_6detail12TypeMetaDataEv

which may be caused by compatibility issues between packages (detectron2 and PyTorch). requirements.txt does not provide these details.

 track_instances.angle[active_idxes] = torch.abs(active_track_angle[0] - gt_angle) 

Should the code above be track_instances.angle[active_idxes] = torch.abs(active_track_angle - gt_angle)? The first dimension of track_instances.angle is num_queries.