Comments (12)
I ran experiments on 4 A100 GPUs with 20% of the Waymo data. The k3 version, voxelnext_ioubranch_large.yaml, seems to take about twice as long as the 2D version, voxelnext2d_ioubranch.yaml: about 10 hours for the former versus 4.5 hours for the latter.
from voxelnext.
Hi @sky-fly97 ,
My machine (4 V100 GPUs) is very old, and its CPU is a real bottleneck for data loading. It took me about 5 days to train voxelnext2d_ioubranch.yaml and 1 week to train voxelnext_ioubranch_large.yaml.
Regards,
Yukang Chen
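To check whether data loading (rather than the model) dominates training time, as described above, wall-clock time can be split into "waiting for the next batch" and "running the step". The following is a minimal sketch, not code from this repository; `profile_loader`, `batches`, and `step_fn` are hypothetical names used only for illustration.

```python
import time


def profile_loader(batches, step_fn):
    """Split wall-clock time into data-loading time and compute time.

    batches: any iterable that yields batches (hypothetical stand-in for a data loader)
    step_fn: callable that runs one training step on a batch (hypothetical)
    Returns (load_seconds, compute_seconds).
    """
    load_s = 0.0
    compute_s = 0.0
    it = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent here is time waiting for data
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)  # time spent here is the training step itself
        t2 = time.perf_counter()
        load_s += t1 - t0
        compute_s += t2 - t1
    return load_s, compute_s
```

If the step runs on a GPU, `step_fn` should wait for the device to finish (for example with `torch.cuda.synchronize()` in PyTorch) before returning; otherwise asynchronous kernel launches make the compute time look smaller than it is.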
Thanks~
May I ask how many epochs you set when training on 20% of the Waymo dataset?
In my case, convergence takes very long and the performance is NaN when I use 20% of the Waymo dataset.
Hi,
In my 20% training, I used exactly the same settings as for the full dataset: 12 epochs in both cases (8 GPUs).
That is strange. The training results for 100% and 20% of the data should not show such a gap.
Regards,
Yukang Chen
Hello,
I solved the problem.
I actually reimplemented your code based on mmdetection3D.
VoxelNeXt/pcdet/utils/loss_utils.py
Line 421 in b5b7d39
This class might have a problem: unlike other loss classes such as RegLossSparse, it has no step that splits the input by batch.
I could not find an equivalent step elsewhere in your code.
In the end, I got the following results on the Waymo dataset with your "voxelnext_ioubranch_large.yaml":
Vehicle/L2 mAPH: 0.6657
Pedestrian/L2 mAPH: 0.6599
Cyclist/L2 mAPH: 0.7042
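The class at line 421 is not shown in this thread, so the following is only an illustration of what "splitting batches" can mean for a loss over sparse voxels that are concatenated across a batch. It is a minimal NumPy sketch, not the repository's code: it uses the CenterNet-style Gaussian focal loss (exponents 2 and 4), and the function names and the `batch_index` layout are assumptions. Whether this difference is what caused the issue above is not confirmed here.

```python
import numpy as np


def gaussian_focal_terms(pred, target, eps=1e-6):
    """Per-voxel CenterNet-style focal terms; positives are voxels with target == 1."""
    pred = np.clip(pred, eps, 1.0 - eps)
    pos = target == 1.0
    pos_loss = -np.log(pred) * (1.0 - pred) ** 2 * pos
    neg_loss = -np.log(1.0 - pred) * pred ** 2 * (1.0 - target) ** 4 * (~pos)
    return pos_loss + neg_loss, pos


def focal_loss_global(pred, target):
    """No batch splitting: one sum, normalized by the positives of the whole batch."""
    terms, pos = gaussian_focal_terms(pred, target)
    return float(terms.sum() / max(pos.sum(), 1))


def focal_loss_per_sample(pred, target, batch_index):
    """Batch splitting: normalize each sample by its own positives, then average."""
    terms, pos = gaussian_focal_terms(pred, target)
    losses = []
    for b in np.unique(batch_index):  # loop over samples, not over voxels
        sel = batch_index == b
        losses.append(terms[sel].sum() / max(pos[sel].sum(), 1))
    return float(np.mean(losses))
```

The two variants agree when the batch holds a single sample and generally differ when samples contain different numbers of positives, because the global version lets object-rich samples dominate the normalization.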
Hi,
I have also been using mmdetection3d to reproduce the results recently, and I have also run into long convergence times.
I analyzed the loss and suspect that the focal loss may be the problem.
May I ask how you resolved the problem with the focal loss?
Hello,
From my perspective, your problem is not caused by the focal loss.
In VoxelNeXt, target generation uses a "for loop" over all ground-truth boxes,
and it gets worse with multi-task head groups such as ArgoverseV2, because targets have to be built for each task per ground truth.
I tried to write a faster targeting module, but this repository uses Gaussian Focal Loss (GFL) and nearest assignment, and I don't know how to avoid the "for loop" when implementing GFL and nearest assignment, so I could not resolve the convergence problem.
Maybe someday, if someone writes a "CUDA version" of the targeting module, training will converge faster.
Good luck
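One possible way to avoid a per-ground-truth "for loop" for nearest assignment and the Gaussian heatmap is to compute all box-to-voxel distances at once with broadcasting. The following is a minimal NumPy sketch under simplifying assumptions (2D centers, one class, a per-box Gaussian standard deviation `gt_sigma` supplied directly); it is not the repository's targeting code, and `assign_targets` is a hypothetical name.

```python
import numpy as np


def assign_targets(voxel_xy, gt_xy, gt_sigma):
    """Vectorized nearest-voxel assignment plus Gaussian heatmap.

    voxel_xy: (N, 2) centers of the active voxels, N >= 1
    gt_xy:    (M, 2) ground-truth box centers
    gt_sigma: (M,)   per-box Gaussian std (assumed given; usually derived from a radius)
    Returns (nearest, heat): nearest has shape (M,), heat has shape (N,).
    """
    num_voxels = voxel_xy.shape[0]
    if gt_xy.shape[0] == 0:
        # No objects: no assignments and an all-zero heatmap.
        return np.zeros(0, dtype=np.int64), np.zeros(num_voxels)
    # Pairwise squared distances between boxes and voxels, shape (M, N).
    diff = gt_xy[:, None, :] - voxel_xy[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    # Index of the nearest voxel for every ground truth at once.
    nearest = dist2.argmin(axis=1)
    # Gaussian response of each box at each voxel, merged with an element-wise max.
    heat = np.exp(-dist2 / (2.0 * gt_sigma[:, None] ** 2)).max(axis=0)
    # The assigned voxels become exact positives.
    heat[nearest] = 1.0
    return nearest, heat
```

The trade-off is memory: the distance matrix has M x N entries, so for very large scenes it may need to be processed in chunks. The same broadcasting pattern carries over to GPU tensors.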
Thank you very much for your detailed reply~
Hi!
I also have a question about evaluation time.
Could you please let me know roughly how much time you spent on evaluation and on generating the final results? In my training, the evaluation time seems normal, but generating the final results takes several times longer than the evaluation.
I would like to know whether this is normal.
Thanks.
Hello,
Unfortunately, I can't remember the exact evaluation time. It has been too long...
But it is strange that generating the final results takes several times longer than the evaluation.
I'm sorry I can't help.
Good luck!
Thank you for your prompt response!
I am using the official configuration files and code from OpenPCDet for training, but generating the results takes four times longer than validation.
I have found some answers, but they haven't resolved the issue.
Anyway, thank you very much.