cv516buaa / tph-yolov5

License: GNU General Public License v3.0

Shell 0.88% Python 86.34% Dockerfile 0.63% Jupyter Notebook 12.15%

tph-yolov5's Issues

ret = input.softmax(dim) RuntimeError: CUDA out of memory

Hey,

When I try to run the inference command, I get the following error:
ret = input.softmax(dim)
RuntimeError: CUDA out of memory. Tried to allocate 962.00 MiB (GPU 0; 3.81 GiB total capacity; 1.79 GiB already allocated; 725.00 MiB free; 1.93 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I tried reducing the batch size to 4 and even to 2, but that doesn't solve the problem.

What can I do to solve this?
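One way to see why lowering the batch size alone may not be enough is to estimate how large a single attention-score tensor can get. The sketch below is only a back-of-the-envelope estimate: it assumes global self-attention over a square feature map at a given stride, and the function name `attention_scores_mib`, the head count and the image sizes are illustrative values, not taken from the repository.

```python
def attention_scores_mib(img_size, stride, num_heads, batch=1, bytes_per_elem=4):
    """Rough size (MiB) of one softmax input shaped [batch, heads, tokens, tokens].

    Assumption: global self-attention over a square feature map whose side is
    img_size // stride. Real layers may use windows or other layouts.
    """
    tokens = (img_size // stride) ** 2
    elems = batch * num_heads * tokens * tokens
    return elems * bytes_per_elem / 2**20


if __name__ == "__main__":
    # Illustrative sizes only: compare a large input with one half its side.
    for size in (1536, 768):
        print(size, attention_scores_mib(size, stride=32, num_heads=16))
```

Under these assumptions the tensor grows with the fourth power of the image side, so halving `--img` shrinks it by a factor of 16, while halving the batch size only halves it.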

Training gives 0 mAP

Thanks for your work.
I used this repository to train on the VisDrone dataset with just 10 images as a quick test, but training on VisDrone does not work properly and yields 0 mAP. However, the code works normally on the coco128 dataset. This is very strange, and I need your help.

Can't get attribute 'NonDynamicallyQuantizableLinear'

I ran into this problem during training. How can I solve it?
AttributeError: Can't get attribute 'NonDynamicallyQuantizableLinear' on <module 'torch.nn.modules.linear' from 'D:\conda\anaconda\envs\pytorch\lib\site-packages\torch\nn\modules\linear.py'>
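An error of this kind typically appears when a pickled checkpoint refers to a class that the installed library does not define. A minimal sketch for checking that, assuming the cause is a version mismatch (the helper name `module_has_attr` is hypothetical; the module path in the usage note is copied from the error message):

```python
import importlib


def module_has_attr(module_name, attr_name):
    """Return True if `module_name` can be imported and defines `attr_name`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr_name)
```

If `module_has_attr("torch.nn.modules.linear", "NonDynamicallyQuantizableLinear")` returns False, that would suggest the installed PyTorch is older than the one used to save the weights, in which case upgrading PyTorch is the likely remedy. This is an assumption, not a confirmed diagnosis.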

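Large gap between inference results on the validation set and the test-dev set

Using the weights you provided for inference, the results on the validation set (40.8) and on the test-dev set (32.2) differ greatly. In particular, the test-dev result (32.2) differs somewhat from the result in the paper. Did I set something up incorrectly? I would appreciate your advice on this. Thank you.

When I run val.py, I get the following error: IndexError: index 67 is out of bounds for axis 0 with size 3. How can I solve it?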

Can you share the yolov5x weights that were finally used? And the script to convert the official yolov5 weights to TPH-YOLOv5?

Thanks a lot!

Segmentation fault (core dumped)

The error is: "Segmentation fault (core dumped) env "PYTHONUNBUFFERED"="1" "PYTHONPATH"="/home/hm/LFY/tph-yolov5" "PYCHARM_HOSTED"="1" "JETBRAINS_REMOTE_RUN"="1" "PYTHONIOENCODING"="UTF-8" /home/hm/anaconda3/envs/lfy_yolo5/bin/python -u /home/hm/LFY/tph-yolov5/train.py"
Hello! When I try to run train.py, this error occurs: Segmentation fault (core dumped). I hope someone can tell me the reason.

Any yolov5m medium config for training from scratch?

As far as I can see, all the configs are for the large model.

Converted model gives error on TensorRT

I want to use the yolov5l-xs-1.pt model to perform inference and optimize it using TensorRT. I understand you are not using TensorRT, but I thought you might understand the issue

I have exported the .pt file to an onnx file using the export.py program (without --dynamic flag). It gave this warning, but I don't understand what it means:

WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.

Later, when loading the ONNX file with TensorRT, I get this error:

[TensorRT] ERROR: [graphShapeAnalyzer.cpp::throwIfError::1306] Error Code 9: Internal Error (Reshape_218: reshape changes volume)

Apparently there is a node which reshapes to a different volume, which is not allowed according to TensorRT. Do you know what I can do about this issue? Please let me know if you could use more information!

EDIT:
I have also tried it with the --dynamic flag. It looks like it is going through more of the ONNX model when loading, but eventually it gives this error:

[TRT]    4: [network.cpp::validate::2713] Error Code 4: Internal Error (images: dynamic input is missing dimensions in profile 0.)

EDIT:
This is what the node (from the non-dynamic model) looks like:

Is it possible to run the original yolov5 repository with your weights?

Hey, is it possible to run the original yolov5 repository with your weights?

thanks

About h_slices in create_mask of Swin-T

h_slices = ( (0, -self.window_size),
slice(-self.window_size, -self.shift_size),
slice(-self.shift_size, None))

Shouldn't the first line also be wrapped in slice? I am a bit confused by this and would like to ask why slice was dropped there.
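The difference the question is about can be reproduced on a small array. This is a sketch in NumPy rather than the repository's PyTorch code, with a made-up 8x8 mask and a window size of 4; it only illustrates how a bare tuple and a `slice` behave when used as an index.

```python
import numpy as np

window_size = 4  # illustrative value, not the repository's setting
mask_tuple = np.zeros((1, 8, 8, 1))
mask_slice = np.zeros((1, 8, 8, 1))
cols = slice(0, -window_size)

# A bare tuple is treated as a list of row indices: only rows 0 and -4 are set.
mask_tuple[:, (0, -window_size), cols, :] = 1

# A slice selects the whole range of rows from 0 up to (not including) -4.
mask_slice[:, slice(0, -window_size), cols, :] = 1

rows_tuple = np.flatnonzero(mask_tuple[0, :, 0, 0])
rows_slice = np.flatnonzero(mask_slice[0, :, 0, 0])
print(rows_tuple, rows_slice)
```

In this toy case the tuple version marks two rows while the slice version marks four, which is consistent with the suspicion that the first entry was meant to be `slice(0, -self.window_size)`. Whether PyTorch indexing behaves identically in the repository's code is an assumption worth checking there.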

Error in the Swin module when running training

May I ask the author why this happens? Without modifying any code, I used the training command: python train.py --img 1536 --adam --batch 4 --epochs 80 --data ./data/VisDrone.yaml --weights yolov5l.pt --hy data/hyps/hyp.VisDrone.yaml --cfg models/yolov5l-xs-tph.yaml --name v5l-xs-tph
Error message:

Traceback (most recent call last):
  File "train.py", line 630, in <module>
    main(opt)
  File "train.py", line 527, in main
    train(opt.hyp, opt, device, callbacks)
  File "train.py", line 119, in train
    model = Model(cfg or ckpt['model'].yaml, ch=3, nc=nc, anchors=hyp.get('anchors')).to(device) # create
  File "/root/autodl-tmp/tph-yolov5-main/models/yolo.py", line 104, in __init__
    self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch]) # model, savelist
  File "/root/autodl-tmp/tph-yolov5-main/models/yolo.py", line 291, in parse_model
    m_ = nn.Sequential(*(m(*args) for _ in range(n))) if n > 1 else m(*args) # module
  File "/root/autodl-tmp/tph-yolov5-main/models/common.py", line 493, in __init__
    self.m = SwinTransformerBlock(c_, c_, c_//32, n)
  File "/root/autodl-tmp/tph-yolov5-main/models/common.py", line 426, in __init__
    self.tr = nn.Sequential(*(SwinTransformerLayer(c2, num_heads=num_heads, window_size=window_size, shift_size=0 if (i % 2 == 0) else self.shift_size ) for i in range(num_layers)))
  File "/root/autodl-tmp/tph-yolov5-main/models/common.py", line 426, in <genexpr>
    self.tr = nn.Sequential(*(SwinTransformerLayer(c2, num_heads=num_heads, window_size=window_size, shift_size=0 if (i % 2 == 0) else self.shift_size ) for i in range(num_layers)))
  File "/root/autodl-tmp/tph-yolov5-main/models/common.py", line 338, in __init__
    self.attn = WindowAttention(
  File "/root/autodl-tmp/tph-yolov5-main/models/common.py", line 259, in __init__
    coords = torch.stack(torch.meshgrid([coords_h, coords_w], indexing="ij")) # [2, Mh, Mw]
TypeError: meshgrid() got an unexpected keyword argument 'indexing'

How does C3STR work?

Hello. I want to train on my own image dataset. I used your yolov5l-xs-tph.yaml, but it raises an error. What should I do next?

TypeError: meshgrid() got an unexpected keyword argument 'indexing'

python train.py --img 1536 --adam --batch 4 --epochs 80 --data ./data/VisDrone.yaml --weights yolov5l.pt --hy data/hyps/hyp.VisDrone.yaml --cfg models/yolov5l-xs-tph.yaml --name v5l-xs-tph

I get the following error:
  File "/data/zhangshilin/wangjun/529_zsl_6.0/tph-yolov5/tph-yolov5/models/common.py", line 493, in __init__
    self.m = SwinTransformerBlock(c_, c_, c_//32, n)
  File "/data/zhangshilin/wangjun/529_zsl_6.0/tph-yolov5/tph-yolov5/models/common.py", line 426, in __init__
    self.tr = nn.Sequential(*(SwinTransformerLayer(c2, num_heads=num_heads, window_size=window_size, shift_size=0 if (i % 2 == 0) else self.shift_size ) for i in range(num_layers)))
  File "/data/zhangshilin/wangjun/529_zsl_6.0/tph-yolov5/tph-yolov5/models/common.py", line 426, in <genexpr>
    self.tr = nn.Sequential(*(SwinTransformerLayer(c2, num_heads=num_heads, window_size=window_size, shift_size=0 if (i % 2 == 0) else self.shift_size ) for i in range(num_layers)))
  File "/data/zhangshilin/wangjun/529_zsl_6.0/tph-yolov5/tph-yolov5/models/common.py", line 340, in __init__
    attn_drop=attn_drop, proj_drop=drop)
  File "/data/zhangshilin/wangjun/529_zsl_6.0/tph-yolov5/tph-yolov5/models/common.py", line 259, in __init__
    coords = torch.stack(torch.meshgrid([coords_h, coords_w], indexing="ij")) # [2, Mh, Mw]
TypeError: meshgrid() got an unexpected keyword argument 'indexing'

Unable to run validation

Hello author, the following error occurred during inference. Please advise, thank you.
AttributeError: Can't get attribute 'NonDynamicallyQuantizableLinear' on <module 'torch.nn.modules.linear' from '/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/linear.py'>

TPH

I would like to know where the TPH module is.
I would also like to know whether the network structure described in the file "yolov5l-xs-tr-cbam-spp-bifpn.yaml" is consistent with the network structure described in the paper.

where is the ensemble_boxes.py?

In wbf.py, we can't find the definition of ensemble_boxes or of the weighted_boxes_fusion function. Can you share it?

WBF, mAP

Dear author, I'm confused about how to use WBF. Could you please give me some guidance?

Q: Once I have the wbf_labels from ensembling the val results of two models, how do I calculate the new mAP based on these wbf_labels? Is there a command or code for this calculation?

C3 is replaced by CSTR and an error is reported. I modified it in version 5.0

TypeError (note: full exception trace is shown but execution is paused at: )
meshgrid() got an unexpected keyword argument 'indexing'

Backbone Structure

I have a question about the file https://github.com/cv516Buaa/tph-yolov5/blob/main/models/yolov5l-xs-tr-cbam-spp-bifpn.yaml.
Is this backbone:

[from, number, module, args]
[[-1, 1, Conv, [64, 6, 2, 2]], # 0-P1/2
[-1, 1, Conv, [128, 3, 2]], # 1-P2/4
[-1, 3, C3, [128]],
[-1, 1, Conv, [256, 3, 2]], # 3-P3/8
[-1, 6, C3, [256]],
[-1, 1, Conv, [512, 3, 2]], # 5-P4/16
[-1, 9, C3, [512]],
[-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
[-1, 3, C3TR, [1024]],
[-1, 1, SPPF, [1024, 5]], # 9
]

the same as the one shown in the figure in the paper?

SPP and C3TR appear to be in the reverse order.

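Hello, have you not open-sourced the model from the TPH-YOLOv5 paper?

The model structure in the paper does not match any of the yaml files under models in your open-source project. Has it not been released?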

Can't download weights

Can you share the Google Drive link for the weights? All the Google Drive links are dead.

Higher PyTorch version raises the TypeError "meshgrid() got an unexpected keyword argument 'indexing'"

  File "/root/autodl-tmp/tph-yolov5/models/common.py", line 259, in __init__
    coords = torch.stack(torch.meshgrid([coords_h, coords_w], indexing="ij")) # [2, Mh, Mw]
TypeError: meshgrid() got an unexpected keyword argument 'indexing'

After deleting , indexing="ij" it runs well, like this: coords = torch.stack(torch.meshgrid([coords_h, coords_w]))

The code cannot run on a device whose PyTorch version is higher than 1.9.0.

https://blog.csdn.net/qq_43391414/article/details/122902091?utm_medium=distribute.pc_aggpage_search_result.none-task-blog-2~aggregatepage~first_rank_ecpm_v1~rank_v31_ecpm-8-122902091.pc_agg_new_rank&utm_term=meshgrid%E4%B8%ADindexing&spm=1000.2123.3001.4430

There is a similar issue in #20 (comment)
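The workaround of dropping the keyword can be wrapped so that the same call works whether or not the installed `meshgrid` accepts `indexing`. This is a minimal sketch, not the repository's code: the helper name `meshgrid_ij` is hypothetical, and it assumes that a `meshgrid` which rejects the keyword already defaults to the "ij" layout. The keyword appears to have been added in PyTorch 1.10, so the TypeError would be expected on releases older than that; treat this as something to verify against your installation.

```python
def meshgrid_ij(meshgrid_fn, *tensors):
    """Call `meshgrid_fn` with indexing="ij" when supported, else without it.

    Assumption: when the keyword is rejected, the function's default layout
    is already "ij" (believed to hold for torch.meshgrid, but worth checking).
    """
    try:
        return meshgrid_fn(*tensors, indexing="ij")
    except TypeError:
        # Older signature without the `indexing` keyword.
        return meshgrid_fn(*tensors)
```

With PyTorch this would be called as `meshgrid_ij(torch.meshgrid, [coords_h, coords_w])`. Note that catching `TypeError` this broadly can also hide unrelated argument errors, so an explicit version check may be preferable in real code.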

second-stage classifier

Hi, I would like to know whether you used a second-stage classifier. According to your paper, there is supposed to be a ResNet18 classifier. Could you please point out where it is? Thank you very much.

Model inference accuracy on the test-dev set differs somewhat from what the paper reports

WBF

I want to ask where WBF is defined and called

No Loggers in utils.loggers

Traceback (most recent call last):
  File "train.py", line 48, in <module>
    from utils.loggers import Loggers
ImportError: cannot import name 'Loggers' from 'utils.loggers' (unknown location)

`torch.load('yolov5l-xs-1.pt')` --> `No module named 'models'`

Dear authors,

we are trying to use your weights in our own python application for inference with yolov5.

When we try to load your weights with torch.load('yolov5l-xs-1.pt'), we receive the error No module named 'models'.

Is the missing module the folder "models" in your repo?

Is it possible (and intended) to use your weights out of the box with torch.load()?
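A checkpoint that pickles the whole model object can only be unpickled if the classes it refers to are importable, which would explain why a `models` package is being looked up. A minimal sketch of one way to make it importable, assuming the repository has been cloned locally (the helper name `ensure_on_sys_path` and the path in the usage note are hypothetical):

```python
import sys
from pathlib import Path


def ensure_on_sys_path(repo_root):
    """Put `repo_root` at the front of sys.path if it is not already listed.

    Returns True when the path was added, False when it was already present.
    """
    root = str(Path(repo_root).resolve())
    if root not in sys.path:
        sys.path.insert(0, root)
        return True
    return False
```

Calling `ensure_on_sys_path("/path/to/tph-yolov5")` before `torch.load('yolov5l-xs-1.pt')` should let Python find the `models` folder, provided the checkpoint was indeed saved that way.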

Question on visdrone annotation

I was reading through the repo when I saw the need to convert VisDrone annotations to YOLO labels. Currently, I am using their 2018 MOT toolkit to benchmark some trackers, and one of them is Yolov5 deepsort.
Does that mean I have to convert the generated results text file to the VisDrone annotation format? If so, how do I do it?

RuntimeError: result type Float can't be cast to the desired output type long int

Training with the default settings raises an error:
File "train.py", line 527, in main
train(opt.hyp, opt, device, callbacks)
File "train.py", line 324, in train
loss, loss_items = compute_loss(pred, targets.to(device)) # loss scaled by batch_size
tph-yolov5/utils/loss.py", line 243, in build_targets
indices.append((b, a, gj.clamp_(0, gain[3] - 1), gi.clamp_(0, gain[2] - 1))) # image, anchor, grid indices
RuntimeError: result type Float can't be cast to the desired output type long int

How to run inference on a video?

Hey, how can I run inference on a video?

It looks a little bit complicated since you are using a Dataloader in the code.

thanks!

Create tensorrt for your model

hey,

Is there any chance you are planning to add TensorRT support for your model?

In tensorrtx they have already done this for all the YOLO versions, including v5, and I think it would be easy for you to modify their code and create the tensorrtx implementation for your model. For us it would be too challenging, because we would need to dive into the yolov5 code, your tph-yolov5 code and the tensorrtx code to understand exactly how to create it.

Your model can be used in so many applications, but it is too slow compared to the normal yolov5, and I think it would be great if your model were faster.

thanks

Pb/TFLite conversion with YOLOv5 export.py

Hi, your repository contains an export.py script with some of the conversion formats from the official yolov5 export.py. Is there any way to resolve this error when converting your tph-yolov5 weights to pb/tflite models?

Inference error

Hi,

First I use the command:
python VisDrone2YOLO_lable.py

Then

$ python val.py --weights ./weights/yolov5l-xs-1.pt --img 1996 --data ./data/VisDrone.yaml
yolov5l-xs-2.pt
--augment --save-txt --save-conf --task val --batch-size 8 --verbose --name v5l-xs

WARNING: --img-size 1996 must be multiple of max stride 32, updating to 2016
val: Scanning '../datasets/VisDrone/VisDrone2019-DET-val/labels.cache' images and labels... 548 found, 0 missing, 0 empty, 0 corrupted: 100%|█
Class Images Labels P R mAP@.5 mAP@.5:.95: 0%| | 0/69 [00:00<?, ?it/s]
Killed

I did not get any results or detections in the v5l-xs folder inside val.
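The warning in the log means the run used a larger image than requested. The rounding can be sketched as follows; the helper name `round_up_to_stride` is hypothetical, and it only reproduces the arithmetic implied by the warning text rather than the repository's own function.

```python
import math


def round_up_to_stride(img_size, stride=32):
    """Smallest multiple of `stride` that is greater than or equal to img_size."""
    return math.ceil(img_size / stride) * stride


if __name__ == "__main__":
    print(round_up_to_stride(1996))  # the size from the command above
```

So `--img 1996` is evaluated at 2016 pixels. A bare `Killed` with no Python traceback is often, though not always, a sign that the operating system ended the process for running out of memory, in which case a smaller `--img` or `--batch-size` may be worth trying.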

About SwinTransformer

The model defined in yolov5l-xs-tph.yaml appears to replace the Transformer with a SwinTransformer and to drop the CBAM module. How does this change affect accuracy and speed?

Error when using wbf.py

Traceback (most recent call last):
  File "wbf.py", line 57, in <module>
    score_list.append(row[5])
IndexError: index 5 is out of bounds for axis 0 with size 5

Hi author, there seems to be an index error here. I only modified the path, and the above is the error message.
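The message says each row has five values while the script reads a sixth one as the score. One plausible cause, stated here as an assumption, is that the txt files hold `class x y w h` without a confidence column, for example because predictions were saved without `--save-conf` or because ground-truth labels were passed in. A small sketch that makes the expectation explicit (the function name `parse_prediction_line` is hypothetical):

```python
def parse_prediction_line(line):
    """Parse 'cls x y w h conf' from a YOLO-style txt line.

    Raises ValueError when the confidence column is missing, instead of the
    IndexError seen when reading position 5 of a five-element row.
    """
    parts = line.split()
    if len(parts) < 6:
        raise ValueError(
            f"expected 6 columns (class, x, y, w, h, conf), got {len(parts)}"
        )
    cls = int(float(parts[0]))
    x, y, w, h, conf = (float(p) for p in parts[1:6])
    return cls, (x, y, w, h), conf
```

Running this over a few lines of the input files would show quickly whether the confidence column is present.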

TypeError: meshgrid() got an unexpected keyword argument 'indexing'

Hello, I want to use this repository to train a model on images of starfish, so I prepared the directory as required. But while training
I get the above TypeError from common.py. Could you help me out with this?

Here are my arguments to train.py:
!python train.py --img 1280 --adam --batch 4 --epochs 5 --data data.yaml --weights yolov5l.pt --hy data/hyps/hyp.VisDrone.yaml --cfg models/yolov5l-xs-tph.yaml --name reef-detection

RuntimeError: result type Float can't be cast to the desired output type long int ?

File "/content/drive/MyDrive/Yolo v5/utils/loss.py", line 240, in build_targets
indices.append((b, a, gj.clamp_(0, gain[3] - 1), gi.clamp_(0, gain[2] - 1))) # image, anchor, grid indices
RuntimeError: result type Float can't be cast to the desired output type long int

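Is yolov5l-xs-tph.yaml in the released code the authors' final network?

From what I can see, it does not seem to be the network structure drawn in the paper. Can I run the network defined by yolov5l-xs-tph.yaml on its own and take its results as the final TPH-YOLOv5 predictions? Or is the final network not being released, and this is just a simplified version for us to run? I am very confused. Thank you for your answer.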

help

  File "/home/test/anaconda3/envs/whn_PT/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/test/anaconda3/envs/whn_PT/lib/python3.6/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/home/test/anaconda3/envs/whn_PT/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/4T/whn/thesis/7-add swintransformer/models/common.py", line 318, in forward
    attn_windows = self.attn(x_windows, mask=attn_mask) # [nW*B, Mh*Mw, C]
  File "/home/test/anaconda3/envs/whn_PT/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/4T/whn/thesis/7-add swintransformer/models/common.py", line 239, in forward
    x = (attn @ v).transpose(1, 2).reshape(B, N, C)
RuntimeError: Expected object of scalar type Float but got scalar type Half for argument #2 'mat2' in

Which model did you use to get the result in train.png?

Would you please tell me which model you used to get the result in the picture train.png? I can't get this result with the command
python train.py --img 1536 --adam --batch 4 --epochs 80 --data ./data/VisDrone.yaml --weights yolov5l.pt --hy data/hyps/hyp.VisDrone.yaml --cfg models/yolov5l-xs-tph.yaml --name v5l-xs-tph
My best result is mAP=38.8.

Training with nano size

I am trying to train the same model with a smaller network. I use the yolov5n.pt from the public repo and I created a yolov5n-xs-tph.yaml similar to yolov5l-xs-tph.yaml. It looks like this (note that I only changed the depth and width multiples):

# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.25  # layer channel multiple
anchors: 4
  # - [10,13, 16,30, 33,23]  # P3/8
  # - [30,61, 62,45, 59,119]  # P4/16
  # - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, SPPF, [1024, 5]],  # 9
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [ -1, 1, Conv, [ 128, 1, 1 ] ],
   [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
   [ [ -1, 2 ], 1, Concat, [ 1 ] ],  # cat backbone P2
   [ -1, 2, C3STR, [ 128, False ] ],  # 21 (P2/4-xsmall)

   [ -1, 1, Conv, [ 128, 3, 2 ] ],
   [ [ -1, 18, 4], 1, Concat, [ 1 ] ],  # cat head P3
   [ -1, 2, C3STR, [ 256, False ] ],  # 24 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14, 6], 1, Concat, [1]],  # cat head P4
   [-1, 2, C3STR, [512, False]],  # 27 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 2, C3STR, [1024, False]],  # 30 (P5/32-large)

   [[21, 24, 27, 30], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]

Doing exactly this for the yolov5s model worked for me and it trained fine, but with the yolov5n model I get this error:

Traceback (most recent call last):
  File "train.py", line 631, in <module>
    main(opt)
  File "train.py", line 528, in main
    train(opt.hyp, opt, device, callbacks)
  File "train.py", line 119, in train
    model = Model(cfg or ckpt['model'].yaml, ch=3, nc=nc, anchors=hyp.get('anchors')).to(device)  # create
  File "~/tph-yolov5/models/yolo.py", line 104, in __init__
    self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch])  # model, savelist
  File "~/tph-yolov5/models/yolo.py", line 291, in parse_model
    m_ = nn.Sequential(*(m(*args) for _ in range(n))) if n > 1 else m(*args)  # module
  File "~/tph-yolov5/models/common.py", line 493, in __init__
    self.m = SwinTransformerBlock(c_, c_, c_//32, n)
  File "~/tph-yolov5/models/common.py", line 426, in __init__
    self.tr = nn.Sequential(*(SwinTransformerLayer(c2, num_heads=num_heads, window_size=window_size,  shift_size=0 if (i % 2 == 0) else self.shift_size ) for i in range(num_layers)))
  File "~/tph-yolov5/models/common.py", line 426, in <genexpr>
    self.tr = nn.Sequential(*(SwinTransformerLayer(c2, num_heads=num_heads, window_size=window_size,  shift_size=0 if (i % 2 == 0) else self.shift_size ) for i in range(num_layers)))
  File "/~/tph-yolov5/models/common.py", line 338, in __init__
    self.attn = WindowAttention(
  File "~/tph-yolov5/models/common.py", line 249, in __init__
    head_dim = dim // num_heads
ZeroDivisionError: integer division or modulo by zero

The error occurs when trying to create a C3STR block (# 21). I put these prints in:

c1=64
c2=32
n=1
shortcut=False
g=1
e=0.5
c_=16
num_heads of SwinTransformerBlock that will be created would be: 0

I know the problem has to do with my yolov5n-xs-tph.yaml file, but I don't understand what I should change. Again, for yolov5s-xs-tph.yaml it worked fine, with depth 0.33 and width 0.5... Any ideas?
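The printed values can be reproduced with a few lines of arithmetic. The sketch below assumes the usual YOLOv5-style channel scaling (`c2 = make_divisible(channels * width_multiple, 8)` and `c_ = int(c2 * e)`); the `c_ // 32` head count is taken from the traceback, and the function names are illustrative only.

```python
import math


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def c3str_num_heads(out_channels, width_multiple, e=0.5):
    """Reproduce num_heads = c_ // 32 for a C3STR block.

    Assumptions: c2 = make_divisible(out_channels * width_multiple, 8) and
    c_ = int(c2 * e); the `// 32` comes from the traceback above.
    """
    c2 = make_divisible(out_channels * width_multiple, 8)
    c_ = int(c2 * e)
    return c2, c_, c_ // 32


if __name__ == "__main__":
    # The 128-channel C3STR block (# 21) under three width multiples.
    for name, gw in (("l", 1.0), ("s", 0.5), ("n", 0.25)):
        print(name, c3str_num_heads(128, gw))
```

With width 0.5 the block gets `c_ = 32` and one head, but with width 0.25 it gets `c_ = 16` and zero heads, which matches the printed values and the ZeroDivisionError. Possible workarounds, both untested assumptions, are to give that block more channels in the yaml or to clamp the head count in common.py with something like `max(1, c_ // 32)`.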

Ultralytics yolov5l at image size 1280 vs. TPH at image size 1536 on the same data: accuracy differs greatly during training

having issue with training on a new dataset

I can successfully get good results with yolov5l.yaml on my own dataset and I can get good results on visdrone with yolov5l-xs-tph.yaml

But when I try to train on my own dataset with yolov5l-xs-tph.yaml, using a bigger batch size and multiple GPUs, with both Adam and SGD and with a higher LR (I revised the hardcoded LR in train.py), I see an mAP of 10 after 100 epochs and an mAP of 16 after 300, but it does not get any better. (In contrast, I get 22% with small old models like efficientDet-D0 and 26 mAP with deeper ones.)

Any clues?

Drawing Software

In your paper, Figure 3 which is the architecture of the TPH-YOLOv5, is so exquisite, I am curious about the software you use for drawing.
Thank you.

ONNX Inference failed. Non-zero status code returned while running Reshape node.

I exported yolov5l-xs-1.pt to ONNX format using export.py in this repo.
Then I ran detection. The ONNX model was loaded successfully, but it failed when running this code:
pred = torch.tensor(self.session.run([self.session.get_outputs()[0].name], {self.session.get_inputs()[0].name: img}))
It threw:
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Reshape node. Name:'Reshape_222' Status Message: D:\a\_work\1\s\onnxruntime\core/providers/cpu/tensor/reshape_helper.h:42 onnxruntime::ReshapeHelper::ReshapeHelper gsl::narrow_cast<int64_t>(input_shape.Size()) == size was false. The input tensor cannot be reshaped to the requested shape. Input shape:{756,1,512}, requested shape:{756,12096,32}

The same code runs yolov5's onnx model successfully.

Is there any plan to add ONNX conversion support for yolov5-TPH? Thank you!
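The shapes in the error message can be checked by hand: a reshape with no inferred dimension is only valid when both shapes hold the same number of elements. A minimal sketch (the helper name `reshape_is_valid` is hypothetical, and `math.prod` needs Python 3.8 or newer):

```python
from math import prod


def reshape_is_valid(input_shape, target_shape):
    """A reshape without inferred (-1) dims needs equal element counts."""
    return prod(input_shape) == prod(target_shape)


if __name__ == "__main__":
    # Shapes copied from the onnxruntime error message.
    print(reshape_is_valid((756, 1, 512), (756, 12096, 32)))
    # What a 16-head split of 512 channels would look like instead.
    print(reshape_is_valid((756, 1, 512), (756, 16, 32)))
```

Since 512 = 16 x 32 and 12096 = 756 x 16, the requested shape looks as if a size seen at export time was baked into the graph as a constant. That is a guess based on the numbers, not a confirmed diagnosis.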