cv516buaa / tph-yolov5
License: GNU General Public License v3.0
Hey,
When I try to run the inference command, I get the following error:
ret = input.softmax(dim)
RuntimeError: CUDA out of memory. Tried to allocate 962.00 MiB (GPU 0; 3.81 GiB total capacity; 1.79 GiB already allocated; 725.00 MiB free; 1.93 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
I tried to change the batch to 4 and even to 2 but it doesn't solve the problem.
What can I do to solve this?
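The error message itself points at `max_split_size_mb` and `PYTORCH_CUDA_ALLOC_CONF`. A minimal sketch of setting that variable from Python follows; the helper name `configure_cuda_allocator` and the value 128 are illustrative assumptions, not settings recommended by the repository, and there is no guarantee this resolves the error on a 3.81 GiB GPU.

```python
import os


def configure_cuda_allocator(max_split_size_mb=128):
    """Set PYTORCH_CUDA_ALLOC_CONF unless a value is already present.

    The default of 128 is an arbitrary illustrative value (assumption).
    """
    value = f"max_split_size_mb:{max_split_size_mb}"
    # setdefault keeps any value the user exported in the shell.
    os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", value)
    return os.environ["PYTORCH_CUDA_ALLOC_CONF"]


# Run this before CUDA is initialized; the simplest way is to call it
# before the first `import torch` in the script.
print(configure_cuda_allocator())
```

The same effect can be had by exporting the variable in the shell before launching the inference command.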
I ran into this problem during training. How can I solve it?
AttributeError: Can't get attribute 'NonDynamicallyQuantizableLinear' on <module 'torch.nn.modules.linear' from 'D:\conda\anaconda\envs\pytorch\lib\site-packages\torch\nn\modules\linear.py'>
Thanks a lot!
The error is: "Segmentation fault (core dumped) env "PYTHONUNBUFFERED"="1" "PYTHONPATH"="/home/hm/LFY/tph-yolov5" "PYCHARM_HOSTED"="1" "JETBRAINS_REMOTE_RUN"="1" "PYTHONIOENCODING"="UTF-8" /home/hm/anaconda3/envs/lfy_yolo5/bin/python -u /home/hm/LFY/tph-yolov5/train.py"
Hello! When I try to run train.py, this error occurs: Segmentation fault (core dumped). I hope someone can tell me the reason.
As far as I can see, all configs are for the large model.
I want to use the yolov5l-xs-1.pt model to perform inference and optimize it using TensorRT. I understand you are not using TensorRT, but I thought you might understand the issue
I have exported the .pt file to an ONNX file using the export.py program (without the --dynamic flag). It gave this warning, but I don't understand what it means:
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
Later, when loading the ONNX file with TensorRT, I get this error:
[TensorRT] ERROR: [graphShapeAnalyzer.cpp::throwIfError::1306] Error Code 9: Internal Error (Reshape_218: reshape changes volume)
Apparently there is a node which reshapes to a different volume, which is not allowed according to TensorRT. Do you know what I can do about this issue? Please let me know if you could use more information!
EDIT:
I have also tried it with the --dynamic flag. It looks like it gets further through the ONNX model when loading, but eventually it gives this error:
[TRT] 4: [network.cpp::validate::2713] Error Code 4: Internal Error (images: dynamic input is missing dimensions in profile 0.)
EDIT:
This is what the node (from the non-dynamic model) looks like:
Hey, is it possible to run the original yolov5 repository with your weights?
Thanks
h_slices = ( (0, -self.window_size),
slice(-self.window_size, -self.shift_size),
slice(-self.shift_size, None))
Shouldn't the first line also be wrapped in slice? I find this confusing and would like to ask why slice was removed.
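For comparison, the reference Swin Transformer implementation builds all three entries with `slice(...)`, so the first entry reads `slice(0, -window_size)`. The pure-Python sketch below shows how three such slices partition one axis into regions when building the attention mask. The helper `region_ids` and the example sizes are hypothetical, chosen only for illustration; this shows the reference behaviour, not what the repository's code does.

```python
def region_ids(length, window_size, shift_size):
    """Label each index along one axis with the region (0, 1 or 2) it falls in."""
    slices = (
        slice(0, -window_size),
        slice(-window_size, -shift_size),
        slice(-shift_size, None),
    )
    ids = [None] * length
    for region, s in enumerate(slices):
        # range objects accept slice indexing, which mimics tensor slicing.
        for i in range(length)[s]:
            ids[i] = region
    return ids


print(region_ids(8, window_size=4, shift_size=2))  # [0, 0, 0, 0, 1, 1, 2, 2]
```

Every index lands in exactly one region only when all three entries are slices. A bare tuple such as `(0, -window_size)` used as an index is not interpreted as a range.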
May I ask the author why this happens? Without modifying any code, I ran the training command: python train.py --img 1536 --adam --batch 4 --epochs 80 --data ./data/VisDrone.yaml --weights yolov5l.pt --hy data/hyps/hyp.VisDrone.yaml --cfg models/yolov5l-xs-tph.yaml --name v5l-xs-tph
Error message:
Traceback (most recent call last):
File "train.py", line 630, in <module>
main(opt)
File "train.py", line 527, in main
train(opt.hyp, opt, device, callbacks)
File "train.py", line 119, in train
model = Model(cfg or ckpt['model'].yaml, ch=3, nc=nc, anchors=hyp.get('anchors')).to(device) # create
File "/root/autodl-tmp/tph-yolov5-main/models/yolo.py", line 104, in __init__
self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch]) # model, savelist
File "/root/autodl-tmp/tph-yolov5-main/models/yolo.py", line 291, in parse_model
m_ = nn.Sequential(*(m(*args) for _ in range(n))) if n > 1 else m(*args) # module
File "/root/autodl-tmp/tph-yolov5-main/models/common.py", line 493, in __init__
self.m = SwinTransformerBlock(c_, c_, c_//32, n)
File "/root/autodl-tmp/tph-yolov5-main/models/common.py", line 426, in __init__
self.tr = nn.Sequential(*(SwinTransformerLayer(c2, num_heads=num_heads, window_size=window_size, shift_size=0 if (i % 2 == 0) else self.shift_size ) for i in range(num_layers)))
File "/root/autodl-tmp/tph-yolov5-main/models/common.py", line 426, in <genexpr>
self.tr = nn.Sequential(*(SwinTransformerLayer(c2, num_heads=num_heads, window_size=window_size, shift_size=0 if (i % 2 == 0) else self.shift_size ) for i in range(num_layers)))
File "/root/autodl-tmp/tph-yolov5-main/models/common.py", line 338, in __init__
self.attn = WindowAttention(
File "/root/autodl-tmp/tph-yolov5-main/models/common.py", line 259, in __init__
coords = torch.stack(torch.meshgrid([coords_h, coords_w], indexing="ij")) # [2, Mh, Mw]
TypeError: meshgrid() got an unexpected keyword argument 'indexing'
python train.py --img 1536 --adam --batch 4 --epochs 80 --data ./data/VisDrone.yaml --weights yolov5l.pt --hy data/hyps/hyp.VisDrone.yaml --cfg models/yolov5l-xs-tph.yaml --name v5l-xs-tph
This command fails with the following error:
File "/data/zhangshilin/wangjun/529_zsl_6.0/tph-yolov5/tph-yolov5/models/common.py", line 493, in __init__
self.m = SwinTransformerBlock(c_, c_, c_//32, n)
File "/data/zhangshilin/wangjun/529_zsl_6.0/tph-yolov5/tph-yolov5/models/common.py", line 426, in __init__
self.tr = nn.Sequential(*(SwinTransformerLayer(c2, num_heads=num_heads, window_size=window_size, shift_size=0 if (i % 2 == 0) else self.shift_size ) for i in range(num_layers)))
File "/data/zhangshilin/wangjun/529_zsl_6.0/tph-yolov5/tph-yolov5/models/common.py", line 426, in <genexpr>
self.tr = nn.Sequential(*(SwinTransformerLayer(c2, num_heads=num_heads, window_size=window_size, shift_size=0 if (i % 2 == 0) else self.shift_size ) for i in range(num_layers)))
File "/data/zhangshilin/wangjun/529_zsl_6.0/tph-yolov5/tph-yolov5/models/common.py", line 340, in __init__
attn_drop=attn_drop, proj_drop=drop)
File "/data/zhangshilin/wangjun/529_zsl_6.0/tph-yolov5/tph-yolov5/models/common.py", line 259, in __init__
coords = torch.stack(torch.meshgrid([coords_h, coords_w], indexing="ij")) # [2, Mh, Mw]
TypeError: meshgrid() got an unexpected keyword argument 'indexing'
Hello author, the following error occurred during inference. Please advise, thank you.
AttributeError: Can't get attribute 'NonDynamicallyQuantizableLinear' on <module 'torch.nn.modules.linear' from '/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/linear.py'>
Where is the TPH module defined?
I would also like to know whether the network structure described in the file "yolov5l-xs-tr-cbam-spp-bifpn.yaml" is consistent with the network structure described in the paper.
In wbf.py, we can't find the definition of ensemble_boxes or of the weighted_boxes_fusion function. Can you share it?
Dear author, I'm confused about the usage of WBF. Could you please give me a guide?
Q: Once I have the wbf_labels from ensembling the val results of two models, how do I calculate the new mAP based on these wbf_labels? Is there a command or code for this calculation?
TypeError (note: full exception trace is shown but execution is paused at: )
meshgrid() got an unexpected keyword argument 'indexing'
I have a question about the file https://github.com/cv516Buaa/tph-yolov5/blob/main/models/yolov5l-xs-tr-cbam-spp-bifpn.yaml.
Is the following backbone:
# [from, number, module, args]
[[-1, 1, Conv, [64, 6, 2, 2]], # 0-P1/2
[-1, 1, Conv, [128, 3, 2]], # 1-P2/4
[-1, 3, C3, [128]],
[-1, 1, Conv, [256, 3, 2]], # 3-P3/8
[-1, 6, C3, [256]],
[-1, 1, Conv, [512, 3, 2]], # 5-P4/16
[-1, 9, C3, [512]],
[-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
[-1, 3, C3TR, [1024]],
[-1, 1, SPPF, [1024, 5]], # 9
]
the same as the one shown in the figure in the paper?
SPPF and C3TR appear to be in reverse order.
can you share the google drive link of weights? all google drive links are dead
File "/root/autodl-tmp/tph-yolov5/models/common.py", line 259, in init
coords = torch.stack(torch.meshgrid([coords_h, coords_w], indexing="ij")) # [2, Mh, Mw]
TypeError: meshgrid() got an unexpected keyword argument 'indexing'
Deleting `, indexing="ij"` makes it run: coords = torch.stack(torch.meshgrid([coords_h, coords_w]))
The code cannot run with PyTorch versions lower than 1.10, where torch.meshgrid does not yet accept the indexing argument.
There is a similar issue in #20 (comment)
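Rather than deleting the argument by hand, the call can be wrapped so it works on both sides of that version boundary. Before the `indexing` keyword was introduced, `torch.meshgrid` always behaved like `indexing="ij"`, so falling back to the plain call gives the same result. In the sketch below, the wrapper `meshgrid_ij` and the two stand-in functions are hypothetical names used only to keep the example runnable without PyTorch installed.

```python
def meshgrid_ij(meshgrid_fn, *tensors):
    """Call meshgrid_fn with indexing='ij' when supported, else without it."""
    try:
        return meshgrid_fn(*tensors, indexing="ij")
    except TypeError:
        # Older signature: no `indexing` keyword, "ij" behaviour is implicit.
        return meshgrid_fn(*tensors)


# Stand-ins that mimic the old and new call signatures (illustration only).
def old_meshgrid(*tensors):
    return ("old", tensors)


def new_meshgrid(*tensors, indexing="ij"):
    return ("new", tensors, indexing)


print(meshgrid_ij(old_meshgrid, 1, 2))
print(meshgrid_ij(new_meshgrid, 1, 2))
```

In models/common.py the call would then read `meshgrid_ij(torch.meshgrid, [coords_h, coords_w])`, assuming the helper is defined or imported there.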
Hi, I would like to know whether you used a second-stage classifier. According to your paper, there is supposed to be a ResNet18 classifier. Could you please point out where it is? That would be very helpful. Thank you very much.
I want to ask where WBF is defined and called
Traceback (most recent call last):
File "train.py", line 48, in <module>
from utils.loggers import Loggers
ImportError: cannot import name 'Loggers' from 'utils.loggers' (unknown location)
Dear authors,
we are trying to use your weights in our own python application for inference with yolov5.
When we try to load your weights with torch.load('yolov5l-xs-1.pt'), we receive the error No module named 'models'.
Is the missing module the folder "models" in your repo?
Is it possible (and intended) to use your weights out of the box with torch.load()?
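One likely explanation, assuming the checkpoint pickles the whole model object as upstream YOLOv5 checkpoints do, is that unpickling needs to import the `models` package from the repository. A minimal sketch of making the cloned repository importable before calling `torch.load` follows; `add_repo_to_path` is a hypothetical helper and `"."` is a placeholder for the real clone location.

```python
import sys
from pathlib import Path


def add_repo_to_path(repo_root):
    """Put the repository root on sys.path so `import models` can resolve."""
    root = str(Path(repo_root).resolve())
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


# Placeholder: replace "." with the path to the cloned tph-yolov5 repository.
repo = add_repo_to_path(".")
print(repo in sys.path)  # True
```

After this call, `torch.load('yolov5l-xs-1.pt')` should at least be able to locate `models`, provided the repository's other dependencies are installed.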
I was reading through the repo when I saw the need to convert VisDrone annotations to YOLO labels. Currently, I am using their 2018 MOT toolkit to benchmark some trackers, and one of them is YOLOv5 DeepSORT.
Does that mean I have to convert the generated results text file to the VisDrone annotation format? If so, how do I do it?
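The box geometry is the main thing that changes between the two formats. YOLO labels store a normalized box center and size, while VisDrone annotations store the left and top pixel coordinates plus the width and height in pixels. A minimal sketch of that conversion follows; `yolo_to_ltwh` is a hypothetical helper, and the remaining columns the toolkit expects (frame index, target id, score, category and so on) should be checked against the toolkit's own documentation.

```python
def yolo_to_ltwh(xc, yc, w, h, img_w, img_h):
    """Convert a normalized YOLO box (center x, center y, width, height)
    to pixel units as (left, top, width, height)."""
    box_w = w * img_w
    box_h = h * img_h
    left = xc * img_w - box_w / 2
    top = yc * img_h - box_h / 2
    return left, top, box_w, box_h


print(yolo_to_ltwh(0.5, 0.5, 0.5, 0.5, img_w=100, img_h=200))
# (25.0, 50.0, 50.0, 100.0)
```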
Training with the default settings raises an error:
File "train.py", line 527, in main
train(opt.hyp, opt, device, callbacks)
File "train.py", line 324, in train
loss, loss_items = compute_loss(pred, targets.to(device)) # loss scaled by batch_size
tph-yolov5/utils/loss.py", line 243, in build_targets
indices.append((b, a, gj.clamp_(0, gain[3] - 1), gi.clamp_(0, gain[2] - 1))) # image, anchor, grid indices
RuntimeError: result type Float can't be cast to the desired output type long int
Hey, how can I run inference on a video?
It looks a little complicated since you are using a DataLoader in the code.
Thanks!
Hey,
Is there a chance you are planning to support TensorRT for your model?
The tensorrtx project has already done this for all YOLO versions, including v5, and I think it would be easy for you to modify their code and create a tensorrtx implementation for your model. For us it would be too challenging, because we would need to dive into the YOLOv5 code, your tph-yolov5 code, and the tensorrtx code to understand exactly how to create it.
Your model can be used in so many applications, but it is too slow compared to the normal YOLOv5, and I think it would be great if your model were faster.
Thanks
Hi,
First I use the command:
python VisDrone2YOLO_lable.py
Then
$ python val.py --weights ./weights/yolov5l-xs-1.pt --img 1996 --data ./data/VisDrone.yaml
yolov5l-xs-2.pt
--augment --save-txt --save-conf --task val --batch-size 8 --verbose --name v5l-xs
WARNING: --img-size 1996 must be multiple of max stride 32, updating to 2016
val: Scanning '../datasets/VisDrone/VisDrone2019-DET-val/labels.cache' images and labels... 548 found, 0 missing, 0 empty, 0 corrupted: 100%|█
Class Images Labels P R mAP@.5 mAP@.5:.95: 0%| | 0/69 [00:00<?, ?it/s]
Killed
I did not get any results/detection on folder v5l-xs inside val.
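The warning in the log above is plain arithmetic: the requested image size is rounded up to the next multiple of the maximum stride, which makes the effective input larger than requested. A minimal sketch of that rounding follows; `round_up_to_stride` is a hypothetical helper and not the repository's own function.

```python
import math


def round_up_to_stride(img_size, stride=32):
    """Round img_size up to the nearest multiple of stride."""
    return math.ceil(img_size / stride) * stride


print(round_up_to_stride(1996))  # 2016
```

With `--augment` and `--batch-size 8` at this resolution, memory use is considerable. A bare `Killed` message often comes from the operating system stopping a process that ran out of memory, though the log alone does not confirm that.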
The model defined in yolov5l-xs-tph.yaml seems to replace the Transformer with SwinTransformer and to drop the CBAM module. How does this change affect accuracy and speed?
Traceback (most recent call last):
File "wbf.py", line 57, in <module>
score_list.append(row[5])
IndexError: index 5 is out of bounds for axis 0 with size 5
Hi author, there seems to be an index error here. I only modified the path. The error message is shown above.
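An index of 5 being out of bounds for a row of size 5 is what one would see if the label files were written without a confidence column: YOLOv5-style `--save-txt` output has five values per line (class, x, y, w, h), and `--save-conf` appends a sixth. The sketch below parses a row defensively. `parse_label_row` is a hypothetical helper, and whether this is the actual cause here is an assumption.

```python
def parse_label_row(line):
    """Parse 'cls x y w h [conf]' and return (cls, box, conf_or_None)."""
    fields = line.split()
    if len(fields) not in (5, 6):
        raise ValueError(f"expected 5 or 6 columns, got {len(fields)}: {line!r}")
    cls = int(float(fields[0]))
    box = tuple(float(v) for v in fields[1:5])
    # The sixth column exists only when predictions were saved with confidences.
    conf = float(fields[5]) if len(fields) == 6 else None
    return cls, box, conf


print(parse_label_row("3 0.5 0.5 0.1 0.2 0.87"))
print(parse_label_row("3 0.5 0.5 0.1 0.2"))
```

If `conf` comes back as `None` for the files fed to wbf.py, regenerating them with `--save-conf` (as in the val.py command in the README) would be the first thing to try.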
Hello, I want to use this repository to train a model on images of starfish, so I prepared the directory as required. But while training
I am getting the above type error from common.py. Could you help me out with this?
Here are my arguments to train.py:
!python train.py --img 1280 --adam --batch 4 --epochs 5 --data data.yaml --weights yolov5l.pt --hy data/hyps/hyp.VisDrone.yaml --cfg models/yolov5l-xs-tph.yaml --name reef-detection
File "/content/drive/MyDrive/Yolo v5/utils/loss.py", line 240, in build_targets
indices.append((b, a, gj.clamp_(0, gain[3] - 1), gi.clamp_(0, gain[2] - 1))) # image, anchor, grid indices
RuntimeError: result type Float can't be cast to the desired output type long int
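From what I can see, this does not look like the network structure drawn in the paper. Can the results from running the yolov5l-xs-tph.yaml network on its own be taken as the final TPH-YOLOv5 predictions? Or will the paper's network not be released, with this being only a simplified version for us to run? I am very confused. Thank you for your answer.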
File "/home/test/anaconda3/envs/whn_PT/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/test/anaconda3/envs/whn_PT/lib/python3.6/site-packages/torch/nn/modules/container.py", line 117, in forward
input = module(input)
File "/home/test/anaconda3/envs/whn_PT/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/4T/whn/thesis/7-add swintransformer/models/common.py", line 318, in forward
attn_windows = self.attn(x_windows, mask=attn_mask) # [nW*B, Mh*Mw, C]
File "/home/test/anaconda3/envs/whn_PT/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/4T/whn/thesis/7-add swintransformer/models/common.py", line 239, in forward
x = (attn @ v).transpose(1, 2).reshape(B, N, C)
RuntimeError: Expected object of scalar type Float but got scalar type Half for argument #2 'mat2' in
Would you please tell me which model did you use to get the result in picture train.png? I can't get this result with the command
python train.py --img 1536 --adam --batch 4 --epochs 80 --data ./data/VisDrone.yaml --weights yolov5l.pt --hy data/hyps/hyp.VisDrone.yaml --cfg models/yolov5l-xs-tph.yaml --name v5l-xs-tph
My result is mAP=38.8 for the best.
I am trying to train the same model with a smaller network. I use the yolov5n.pt from the public repo and I created a yolov5n-xs-tph.yaml similar to yolov5l-xs-tph.yaml. It looks like this (note that I only changed the depth and width multiples):
# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# Parameters
nc: 80 # number of classes
depth_multiple: 0.33 # model depth multiple
width_multiple: 0.25 # layer channel multiple
anchors: 4
# - [10,13, 16,30, 33,23] # P3/8
# - [30,61, 62,45, 59,119] # P4/16
# - [116,90, 156,198, 373,326] # P5/32
# YOLOv5 v6.0 backbone
backbone:
# [from, number, module, args]
[[-1, 1, Conv, [64, 6, 2, 2]], # 0-P1/2
[-1, 1, Conv, [128, 3, 2]], # 1-P2/4
[-1, 3, C3, [128]],
[-1, 1, Conv, [256, 3, 2]], # 3-P3/8
[-1, 6, C3, [256]],
[-1, 1, Conv, [512, 3, 2]], # 5-P4/16
[-1, 9, C3, [512]],
[-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
[-1, 3, C3, [1024]],
[-1, 1, SPPF, [1024, 5]], # 9
]
# YOLOv5 v6.0 head
head:
[[-1, 1, Conv, [512, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 6], 1, Concat, [1]], # cat backbone P4
[-1, 3, C3, [512, False]], # 13
[-1, 1, Conv, [256, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 4], 1, Concat, [1]], # cat backbone P3
[-1, 3, C3, [256, False]], # 17 (P3/8-small)
[ -1, 1, Conv, [ 128, 1, 1 ] ],
[ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
[ [ -1, 2 ], 1, Concat, [ 1 ] ], # cat backbone P2
[ -1, 2, C3STR, [ 128, False ] ], # 21 (P2/4-xsmall)
[ -1, 1, Conv, [ 128, 3, 2 ] ],
[ [ -1, 18, 4], 1, Concat, [ 1 ] ], # cat head P3
[ -1, 2, C3STR, [ 256, False ] ], # 24 (P3/8-small)
[-1, 1, Conv, [256, 3, 2]],
[[-1, 14, 6], 1, Concat, [1]], # cat head P4
[-1, 2, C3STR, [512, False]], # 27 (P4/16-medium)
[-1, 1, Conv, [512, 3, 2]],
[[-1, 10], 1, Concat, [1]], # cat head P5
[-1, 2, C3STR, [1024, False]], # 30 (P5/32-large)
[[21, 24, 27, 30], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
]
Doing exactly this for the yolov5s model worked for me and it trained fine, but with the yolov5n model I get this error:
Traceback (most recent call last):
File "train.py", line 631, in <module>
main(opt)
File "train.py", line 528, in main
train(opt.hyp, opt, device, callbacks)
File "train.py", line 119, in train
model = Model(cfg or ckpt['model'].yaml, ch=3, nc=nc, anchors=hyp.get('anchors')).to(device) # create
File "~/tph-yolov5/models/yolo.py", line 104, in __init__
self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch]) # model, savelist
File "~/tph-yolov5/models/yolo.py", line 291, in parse_model
m_ = nn.Sequential(*(m(*args) for _ in range(n))) if n > 1 else m(*args) # module
File "~/tph-yolov5/models/common.py", line 493, in __init__
self.m = SwinTransformerBlock(c_, c_, c_//32, n)
File "~/tph-yolov5/models/common.py", line 426, in __init__
self.tr = nn.Sequential(*(SwinTransformerLayer(c2, num_heads=num_heads, window_size=window_size, shift_size=0 if (i % 2 == 0) else self.shift_size ) for i in range(num_layers)))
File "~/tph-yolov5/models/common.py", line 426, in <genexpr>
self.tr = nn.Sequential(*(SwinTransformerLayer(c2, num_heads=num_heads, window_size=window_size, shift_size=0 if (i % 2 == 0) else self.shift_size ) for i in range(num_layers)))
File "/~/tph-yolov5/models/common.py", line 338, in __init__
self.attn = WindowAttention(
File "~/tph-yolov5/models/common.py", line 249, in __init__
head_dim = dim // num_heads
ZeroDivisionError: integer division or modulo by zero
The error occurs when trying to create a C3STR block (# 21). I put these prints in:
c1=64
c2=32
n=1
shortcut=False
g=1
e=0.5
c_=16
num_heads of SwinTransformerBlock that will be created would be: 0
I know the problem has to do with my yolov5n-xs-tph.yaml file, but I don't understand what I should change. Again, for yolov5s-xs-tph.yaml it worked fine, with depth 0.33 and width 0.5... Any ideas?
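The printed values already contain the arithmetic behind the failure: the head count is computed as `c_ // 32`, and with `c_ = 16` that is 0. The sketch below reproduces the numbers for layer 21 (128 base channels) under different width multiples. It assumes channels are scaled with a `make_divisible(c2 * width_multiple, 8)` step as in upstream YOLOv5; `swin_heads` is a hypothetical helper.

```python
import math


def make_divisible(x, divisor=8):
    """Round x up to a multiple of divisor (assumed to mirror upstream YOLOv5)."""
    return math.ceil(x / divisor) * divisor


def swin_heads(base_channels, width_multiple, e=0.5):
    """Return (c2, c_, num_heads) for a C3STR block."""
    c2 = make_divisible(base_channels * width_multiple)
    c_ = int(c2 * e)       # hidden channels inside the block
    return c2, c_, c_ // 32  # num_heads as passed to SwinTransformerBlock


for name, wm in (("yolov5n", 0.25), ("yolov5s", 0.5), ("yolov5l", 1.0)):
    print(name, swin_heads(128, wm))
# yolov5n (32, 16, 0)
# yolov5s (64, 32, 1)
# yolov5l (128, 64, 2)
```

A width multiple of 0.25 is therefore too small for the 128-channel C3STR layer. Possible workarounds, neither confirmed by the authors, are to raise that layer's channel count in the YAML or to clamp the head count to at least 1 in common.py.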
I can get good results with yolov5l.yaml on my own dataset, and I can get good results on VisDrone with yolov5l-xs-tph.yaml.
But when I train on my own dataset with yolov5l-xs-tph.yaml, using a bigger batch size and multiple GPUs, with both Adam and SGD and with a higher LR (I revised the hardcoded LR in train.py), I see an mAP of 10 after 100 epochs and 16 after 300 epochs, and it does not improve further. (In contrast, I get 22 mAP with small, older models like EfficientDet-D0 and 26 mAP with deeper ones.)
Any clues?
In your paper, Figure 3 which is the architecture of the TPH-YOLOv5, is so exquisite, I am curious about the software you use for drawing.
Thank you.
I exported yolov5l-xs-1.pt to ONNX format using export.py in this repo.
I then ran detection. The ONNX model loaded successfully, but it failed when running this code:
pred = torch.tensor(self.session.run([self.session.get_outputs()[0].name], {self.session.get_inputs()[0].name: img}))
It threw:
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Reshape node. Name:'Reshape_222' Status Message: D:\a\_work\1\s\onnxruntime\core/providers/cpu/tensor/reshape_helper.h:42 onnxruntime::ReshapeHelper::ReshapeHelper gsl::narrow_cast<int64_t>(input_shape.Size()) == size was false. The input tensor cannot be reshaped to the requested shape. Input shape:{756,1,512}, requested shape:{756,12096,32}
The same code runs YOLOv5's ONNX model successfully.
Is there any plan to add ONNX conversion support for TPH-YOLOv5? Thank you!
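The shapes in the error message can be checked by hand: a reshape is valid only when both shapes hold the same number of elements. The short sketch below uses only the two shapes quoted in the message; `volume` is a hypothetical helper.

```python
import math


def volume(shape):
    """Number of elements in a tensor of the given shape."""
    return math.prod(shape)


input_shape = (756, 1, 512)
requested_shape = (756, 12096, 32)

print(volume(input_shape))                             # 387072
print(volume(requested_shape) // volume(input_shape))  # 756
```

Note that 12096 * 32 equals 756 * 512, so the requested shape is exactly 756 times too large. That pattern suggests, without proving it, that a dimension computed at export time was baked into the Reshape node as a constant.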
How to convert TPH-yolo to onnx, why use your export.py to generate onnx, but cannot use
self.m = nn.ModuleList([nn.Conv2d(c_, c_, kernel_size=3, stride=1, padding= x //2, dilation= x //2, bias=False) for x in k]) TypeError: 'int' object is not iterable
help!!!!
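The `'int' object is not iterable` message means that `k` reached the list comprehension as a single integer, whereas `for x in k` expects a sequence of kernel sizes. One defensive pattern is to normalize `k` before iterating, as sketched below. `as_kernel_sizes` is a hypothetical helper, and whether the module is meant to accept a single integer at all is an assumption; the alternative is to pass a list in the model YAML.

```python
def as_kernel_sizes(k):
    """Return k as a tuple so it can be iterated even when given as one int."""
    if isinstance(k, int):
        return (k,)
    return tuple(k)


# Mirrors the padding/dilation expression `x // 2` from the failing line.
print([x // 2 for x in as_kernel_sizes(5)])           # [2]
print([x // 2 for x in as_kernel_sizes([5, 9, 13])])  # [2, 4, 6]
```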