yuhengsss / yolov
This repo is an implementation of the PyTorch version of the YOLOV series.
License: Apache License 2.0
Hello, could you tell me why every epoch of training throws the following error and the run is interrupted, always at iteration 7020?
2023-03-04 06:39:27.342 | INFO | yolox.core.vid_trainer:after_iter:279 - epoch: 1/7, iter: 7020/9366, mem: 8055Mb, iter_time: 4.363s, data_time: 3.605s, total_loss: 1.1, iou_loss: 0.7, l1_loss: 0.0, conf_loss: 0.2, cls_loss: 0.1, lr: 2.247e-03, size: 480, ETA: 2 days, 21:48:48
2023-03-04 06:39:38.261 | INFO | yolox.core.vid_trainer:after_train:198 - Training of experiment is done and the best AP is 0.00
2023-03-04 06:39:38.262 | ERROR | yolox.core.launch:launch:98 - An error has been caught in function 'launch', process 'MainProcess' (267170), thread 'MainThread' (140410779206464):
Traceback (most recent call last):
File "tools/vid_train.py", line 151, in
args=(exp, args),
│ └ Namespace(batch_size=128, cache=False, ckpt='/media/user/A0F260D9F260B566/qsy/YOLOV/weights/yoloxs_vid.pth', devices=1, dist_...
└ ╒═══════════════════╤════════════════════════════════════════════════════════════════════════════════════════════════════════...
File "./yolox/core/launch.py", line 98, in launch
main_func(*args)
│ └ (╒═══════════════════╤═══════════════════════════════════════════════════════════════════════════════════════════════════════...
└ <function main at 0x7fb305add4d0>
File "tools/vid_train.py", line 128, in main
trainer.train()
│ └ <function Trainer.train at 0x7fb305ae15f0>
└ <yolox.core.vid_trainer.Trainer object at 0x7fb3ed55a990>
File "./yolox/core/vid_trainer.py", line 85, in train
self.train_in_epoch()
│ └ <function Trainer.train_in_epoch at 0x7fb305ae1b90>
└ <yolox.core.vid_trainer.Trainer object at 0x7fb3ed55a990>
File "./yolox/core/vid_trainer.py", line 94, in train_in_epoch
self.train_in_iter()
│ └ <function Trainer.train_in_iter at 0x7fb305ae1dd0>
└ <yolox.core.vid_trainer.Trainer object at 0x7fb3ed55a990>
File "./yolox/core/vid_trainer.py", line 100, in train_in_iter
self.train_one_iter()
│ └ <function Trainer.train_one_iter at 0x7fb305ae4d40>
└ <yolox.core.vid_trainer.Trainer object at 0x7fb3ed55a990>
File "./yolox/core/vid_trainer.py", line 107, in train_one_iter
inps = inps.to(self.data_type)
│ │ └ torch.float16
│ └ <yolox.core.vid_trainer.Trainer object at 0x7fb3ed55a990>
└ None
AttributeError: 'NoneType' object has no attribute 'to'
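One possible cause (a guess, not confirmed by the authors): the data prefetcher returns None once its underlying iterator is exhausted, and the trainer then calls `.to()` on it. A minimal sketch of a guard, with hypothetical names that do not match YOLOV's actual classes:

```python
class SafePrefetcher:
    """Hypothetical wrapper: restart the iterator instead of yielding None."""

    def __init__(self, loader):
        self.loader = loader
        self.it = iter(loader)

    def next(self):
        try:
            return next(self.it)
        except StopIteration:
            # The iteration bookkeeping ran past the end of the loader;
            # restarting avoids handing None to inps.to(...).
            self.it = iter(self.loader)
            return next(self.it)

pf = SafePrefetcher([("img0", "tgt0"), ("img1", "tgt1")])
batches = [pf.next() for _ in range(5)]  # runs past the 2-item loader safely
```

If the real bug is a mismatch between the configured iterations per epoch and the loader length, fixing that count is the cleaner solution.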
Dear author, thank you very much for sharing your excellent research. It is very innovative and gets outstanding results. I'm trying to train the yolov_s model with the exp file yolov_s_online.py, but I encountered a problem: "IndexError: The shape of the mask [60] at index 0 does not match the shape of the indexed tensor [30, 30] at index 0". Can you tell me how to fix it? Looking forward to your reply. Thank you very much.
Do I need to train yolox first and then train yolov when training on my own dataset? I don't see the correlation between the two models.
Dear Author,
Thanks for sharing your great work. I'm trying to train yolov_l and yolov_x, but it looks like it's unable to load the checkpoints for yoloxl_vid.pth and yoloxx_vid.pth provided in Google Drive.
Here's my training command
python tools/vid_train.py -f exps/yolov/yolov_l.py -c pretrained_weights/yoloxl_vid.pth --fp16
python tools/vid_train.py -f exps/yolov/yolov_x.py -c pretrained_weights/yoloxx_vid.pth --fp16
And I'm getting this error
My command below for training yolov_s is working fine. Is it possible that the large and extra large checkpoints got corrupted?
python tools/vid_train.py -f exps/yolov/yolov_s.py -c pretrained_weights/yoloxs_vid.pth --fp16
Thanks for the help!
Dear author, I found that if batch_size != gframe (e.g., batch_size=32, gframe=16, lframe=0), one batch contains two groups (16 frames x 2), and the predictions of all 32 frames are used in the MSA-related processing. But the two groups may come from two different videos; won't that cause problems? If I set batch_size = gframe to a small value (e.g., 16), the problem goes away, but GPU memory utilization is very low. If I set batch_size = gframe to a large value (e.g., 64), making a decision during inference needs to consider 64 frames. Am I missing something? How should I set batch_size and gframe/lframe on my 2x24GB GPUs?
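For illustration only, a sketch of what restricting the aggregation input to one group at a time could look like; the tensor shapes and function names are assumptions, not YOLOV's actual MSA code:

```python
import torch

def per_group(features, gframe):
    """Yield per-video groups so aggregation never mixes two videos."""
    assert features.shape[0] % gframe == 0
    for group in features.split(gframe, dim=0):
        yield group  # run the MSA-style aggregation on this group only

feats = torch.randn(32, 30, 128)     # batch of 32 frames, 30 proposals each
groups = list(per_group(feats, 16))  # two groups of 16 frames
```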
I read the source code and found that the YOLO head returns output and output_ori. What is the difference between output and output_ori? Thanks a lot!
Hi, it is unclear how I should format my custom dataset into your expected format. Can you provide information on how to convert my dataset, and what the expected format is? For example, it is unclear to me what the context of the video sequences is.
Thanks for the great work. I found that the Argoverse dataset has been implemented in vid.py, so I want to know: does this dataset work out of the box, or should I make some minor revisions to it? Many thanks for your reply!
Dear authors, really impressive approach and great results! Thank you for publishing your work. First of all, is it possible to run online/real-time inference on a video (stream), in a way that the feature aggregation is done on the previous frames and inference is run on the last frame? Does the script ./tools/yolov_demo_online.py target this purpose? Thanks in advance!
Hi, I saw there are two versions of your implementation, yolov_s and yolov_s_online. What's the difference between them?
I've read the docs but didn't find guidance on how to start video training. Should I start with vid_train.py?
And if I already have a trained detector, what should I do to continue with the video training part?
Hello, I noticed that the loss is set to 3*reg_loss + 2*ref_loss + obj_loss + cls_loss, while YOLOX's loss is 5*reg_loss + obj_loss + cls_loss. Were the weights 3 and 2 determined experimentally? (I didn't see them mentioned in the paper.)
Also, did you use COCO pre-trained weights when training YOLOX on the ImageNet DET data?
Dear Author,
Thanks for sharing your work. I'm trying to run vid_demo.py with yoloxx_vid.pth and yoloxs_vid.pth, but it looks like it's unable to load the checkpoints for yoloxs_vid.pth and yoloxx_vid.pth provided in Google Drive, and I'm getting this error.
Can you tell me what's wrong?
Thanks for the help!
train_log.txt
Can you tell me how to deal with this error during evaluation?
Training for yolov_x stops with an error after 3 epochs, but the same setup runs smoothly for yolov_s.
PyTorch 1.20, CUDA 11.6.
The AttributeError seems related to the dataset path, but I am not sure. Running yolov_s is OK.
Could anyone lend a hand? Thank you so much.
When I try the command from the README, python tools/vid_demo.py -f ./exps/yolov/yoloxs_vid.py -c ./yoloxs_vid.pth --path ./189_4_1_20230217042816000.tmp.mp4 --conf 0.25 --nms 0.5 --tsize 600,
I get the following error:
File "/Users/joy/jtest/YOLOV/tools/vid_demo.py", line 303, in <module>
main(exp, args)
File "/Users/joy/jtest/YOLOV/tools/vid_demo.py", line 296, in main
imageflow_demo(predictor, vis_folder, current_time, args)
File "/Users/joy/jtest/YOLOV/tools/vid_demo.py", line 224, in imageflow_demo
outputs.extend(predictor.inference(ele))
File "/Users/joy/jtest/YOLOV/tools/vid_demo.py", line 137, in inference
outputs,outputs_ori = self.model(img, nms_thresh=self.nmsthre)
File "/Users/joy/jtest/YOLOV/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
TypeError: YOLOX.forward() got an unexpected keyword argument 'nms_thresh'
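A hedged workaround sketch: the demo assumes a video head whose forward() takes nms_thresh, while a plain YOLOX forward() does not, so checking the signature before calling avoids this TypeError. The model class below is a stand-in, not the real YOLOX:

```python
import inspect

def call_model(model, img, nms_thresh):
    # Pass nms_thresh only if this model's forward() actually accepts it.
    params = inspect.signature(model.forward).parameters
    if "nms_thresh" in params:
        return model.forward(img, nms_thresh=nms_thresh)
    return model.forward(img)

class PlainYOLOX:  # stand-in: a detector whose forward() has no nms kwarg
    def forward(self, img):
        return "called-without-nms"

out = call_model(PlainYOLOX(), None, 0.5)
```

The more likely root cause, though, is that the -f exp file built a base YOLOX model rather than the video head the demo expects, so double-checking the exp/checkpoint pairing is worthwhile first.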
Dear author,
Did you check the performance with and without the L1 loss?
When I try to run tools/train.py, I hit an issue:
/root/hhh/YOLO-master
unknown host, exit
How can I fix it?
In the README, it is mentioned that the two files YOLOV/annotations/vid_train_coco.json and YOLOV/yolox/data/dataset/train_seq.npy are required, but I only see train_seq.npy used in the project. What is the JSON file for, and where is it used? Looking forward to your reply, thanks.
Hi! I notice that the input of this model is randomly selected from the original video. May I know how you make use of the non-keyframe information? And is the result of this model the full result of the video (including key and non-key frames) or just the keyframe results? Thanks!
Many thanks for the author's contribution. My questions are as in the title; thanks for answering!
1. Could you provide code for converting a VID-format dataset to the JSON format suitable for YOLOX training?
Or:
2. How can YOLOX be trained directly on the VID format (without JSON)?
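As a starting point for question 1, a minimal sketch of converting one VOC-style VID annotation XML into COCO-style records; the field names follow the COCO convention, but the category mapping and the sample XML here are illustrative assumptions:

```python
import xml.etree.ElementTree as ET

def vid_xml_to_coco(xml_text, image_id, cat_ids):
    """Convert one VOC-style VID frame annotation to COCO-style dicts."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    image = {
        "id": image_id,
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "file_name": root.findtext("filename"),
    }
    anns = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        x1, y1 = int(box.findtext("xmin")), int(box.findtext("ymin"))
        x2, y2 = int(box.findtext("xmax")), int(box.findtext("ymax"))
        anns.append({
            "image_id": image_id,
            "category_id": cat_ids[obj.findtext("name")],  # wnid->id map is assumed
            "bbox": [x1, y1, x2 - x1, y2 - y1],            # COCO xywh
            "area": (x2 - x1) * (y2 - y1),
            "iscrowd": 0,
        })
    return image, anns

xml = """<annotation><filename>000000.JPEG</filename>
<size><width>1280</width><height>720</height></size>
<object><name>n02084071</name>
<bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
</object></annotation>"""
image, anns = vid_xml_to_coco(xml, 1, {"n02084071": 1})
```

A full converter would loop over all frame XMLs, assign unique image/annotation ids, and dump {"images", "annotations", "categories"} with json.dump.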
Is the dataset COCO?
Line 116 in eb1d600
Can anyone download the ilsvrc2015 VID dataset?
http://bvisionweb1.cs.unc.edu/ilsvrc2015/ILSVRC2015_VID.tar.gz
http://image-net.org/challenges/LSVRC/2015/index
The links are dead.
I see in the paper that 2 or 4 GPUs can be used directly to train the yolox model. However, I run into multi-GPU training getting stuck when training the yolox model. Does anyone know how to solve this problem? Thank you!
super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: [enforce fail at inline_container.cc:145] . PytorchStreamReader failed reading zip archive: failed finding central directory
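This error usually means the .pth file is truncated or is not a real checkpoint (for example, an HTML error page saved by the browser during an interrupted download). Since recent torch.save outputs are zip archives, a quick sanity check is possible with the standard library; the in-memory files below are purely for illustration:

```python
import io
import zipfile

def looks_like_torch_zip(f):
    # torch.save's zipfile serialization (the default since PyTorch 1.6)
    # writes a real zip archive, so a corrupt download fails this check.
    return zipfile.is_zipfile(f)

# A garbage "checkpoint" (e.g. a saved HTML error page):
bad = io.BytesIO(b"<html>not a checkpoint</html>")
# A well-formed zip stands in for a healthy checkpoint:
good = io.BytesIO()
with zipfile.ZipFile(good, "w") as z:
    z.writestr("data.pkl", b"x")
good.seek(0)
```

If the check fails on a downloaded weight file, re-downloading it (and comparing the file size against the one listed on the share) is the usual fix.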
In your paper, you mentioned that AP50 of "FCOS + Ours" can reach 73.1%. I am very interested in this. Could you share the relevant code?
Dear author, I've tried to train your yolov on my dataset, and an obvious improvement was observed almost at the first epoch of fine-tuning, but in later epochs the performance never got better. If a big lr is used, the performance gets worse quickly from the first epoch. Did you meet a similar situation while fine-tuning on the pretrained model?
Hi, I have a problem when the post-processing function finds nothing, perhaps because the input is a background image:
YOLOV/yolox/models/yolov_msa_online.py
Lines 255 to 256 in 9f3a5b7
How to make sure the output feature meets the required shape?
YOLOV/yolox/models/yolovp_msa.py
Lines 272 to 274 in 9f3a5b7
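One defensive pattern (an assumption about the fix, not the authors' code): when NMS keeps no boxes for a frame, return an empty tensor with the expected feature width instead of None, so downstream shape checks still pass. The shapes below are illustrative:

```python
import torch

def gather_kept(features, keep):
    """Return kept rows; on an empty keep, an empty (0, dim) tensor, not None."""
    if keep.numel() == 0:
        return features.new_zeros((0, features.shape[1]))
    return features[keep]

feats = torch.randn(30, 128)  # per-frame proposal features
empty = gather_kept(feats, torch.empty(0, dtype=torch.long))
some = gather_kept(feats, torch.tensor([0, 5]))
```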
I intended to run the online demo, but I got this bug. How can I solve it? (I have already installed yolox in my environment.)
Hey, I had a question about the training methodology used to pre-train the yolox baseline models. I added the images from the DET dataset with classes similar to the VID dataset and trained a yolox-s model, but I was not able to replicate your results. Could you elaborate on how you pre-trained the yolox model?
Hello, I'd like to ask two questions:
1. Why is the test-set mAP reported during training different from the mAP obtained by running vid_eval.py on its own? The refined_pred.json files produced by the two runs also differ in size. My initial guess is that different threshold settings lead to different numbers of predicted boxes and therefore different results. Which one should I trust?
2. Is there a script for measuring the model's FPS, or should I add the average forward time and the average inference time to get the total time?
Looking forward to your answer.
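On question 2, a minimal FPS-measurement sketch under the assumption that end-to-end latency is what matters; with a GPU you would add torch.cuda.synchronize() around the timer (omitted here so the snippet runs on CPU):

```python
import time
import torch

def measure_fps(model, frames):
    """Frames processed per second over a list of input tensors."""
    start = time.perf_counter()
    with torch.no_grad():
        for f in frames:
            model(f)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed

net = torch.nn.Conv2d(3, 8, 3)  # tiny stand-in for the detector
fps = measure_fps(net, [torch.randn(1, 3, 64, 64) for _ in range(10)])
```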
Dear author, I noticed the following code:
# def fix_bn(m):
#     classname = m.__class__.__name__
#     if classname.find('BatchNorm') != -1:
#         m.eval()
self.model.apply(init_yolo)
# self.model.apply(fix_bn)
In your experiments, is there a big difference between self.model.apply(init_yolo) and self.model.apply(fix_bn)?
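For reference, a small runnable demonstration of what the commented-out fix_bn would do, i.e. put every BatchNorm layer into eval mode so its running statistics stay frozen during fine-tuning:

```python
import torch.nn as nn

def fix_bn(m):
    # Freeze BatchNorm running statistics by switching only BN layers to eval.
    if "BatchNorm" in m.__class__.__name__:
        m.eval()

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
model.train()        # everything starts in training mode
model.apply(fix_bn)  # ...then BN layers alone are put in eval mode
```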
Hi! In which file did you calculate the AP and times? I want to reproduce them and calculate other metrics.
Thanks a lot!
Hello, I'm working on whether I can replace YOLOX with YOLOv5, so I'm trying to find out how to train the YOLOX and feature-aggregation modules separately.
Hello, and thank you very much for your work! I have a few questions. I couldn't find a way to evaluate on the OVIS dataset; vid_eval.py loads the validation split of the VID dataset. How should OVIS be evaluated? My experiments are on a custom dataset with COCO-format annotations, so I have no way to evaluate how good the model is. Thanks!
Great job! I have a question about the loss: are all the annotations (reference frames and the key frame) used for YOLOV? If so, I want to use only the information from the reference frames for key-frame detection. Is that feasible? Thanks!
Hello, thank you very much for your excellent work. I want to know how to prepare the corresponding vid_train_coco.json and train_seq.npy files if I want to use my own dataset. Is there any guidance for training on a private dataset with your project? Thank you very much.
Could you tell us the complete training command? How can the 87.5 result be reproduced? My torch and CUDA versions already match yours, but I cannot reproduce the result. If it is not a multi-node training command, how should it be run?
Hello, I'm confused about the content of the file imagenet_vid_groundtruth_motion_iou.mat. What does it contain, and how is it generated?
I saw above that yolov7 was added, but I can't find the code for that part; could you tell me which section it is in? Also, any thoughts on adding a detector for yolov6? v6 is an anchor-free detector.
Getting the following error while running the online demo script:
Traceback (most recent call last):
File "tools/yolov_demo_online.py", line 320, in <module>
main(exp, args)
File "tools/yolov_demo_online.py", line 313, in main
imageflow_demo(predictor, vis_folder, current_time, args)
File "tools/yolov_demo_online.py", line 226, in imageflow_demo
N = int(res_dict['cls_scores'].shape[0] / len(tmp_imgs))
TypeError: list indices must be integers or slices, not str
I have not changed anything in the script, although I did check that pred_result and res_dict, both obtained from the inference step, are equal.
pred_result, res_dict = predictor.inference(imgs, other_result)
Can someone please guide me?
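A guess at a workaround, assuming the inference call sometimes returns a one-element list wrapping the result dict (the helper below is hypothetical, not part of YOLOV):

```python
def as_dict(res):
    # Unwrap a one-element list around the result dict before key indexing.
    if isinstance(res, list):
        if len(res) == 1 and isinstance(res[0], dict):
            return res[0]
        raise TypeError(f"unexpected inference result: {type(res)!r}")
    return res

res_dict = as_dict([{"cls_scores": [0.9, 0.8]}])  # list-wrapped case
```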
Since the dataset is very large, could you provide some of the video files from the dataset for testing the demo?
Hello author, I would appreciate your advice on the training hyperparameters for the yolov_x large model: do the learning rate and the number of epochs need to be increased?
Hello, I have a question: I find that PP-YOLOE does not have an objectness prediction on the regression branch of the detection head, so how would the FSM module work? Would it select using class_conf only?
TypeError: forward() got an unexpected keyword argument 'nms_thresh'