
2DPASS's Issues

Where the model is saved?

Hi, thanks for sharing your great work!
I am unfamiliar with the PyTorch Lightning module, and I am trying to understand the logic of how the trainer saves the model. Could you explain a bit and tell me how and where I can change the model save path? Thanks for your response in advance!

About Nuscenes test

AssertionError: Error: Array for predictions must be between 1 and 16 (inclusive).

Author, you set the training num_classes = 17, but some predicted categories are 0. The above error occurs when the test is performed, and the same problem occurs when submitting to the online benchmark. How can this be solved?
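For context, a minimal sketch of one plausible workaround (this assumes the repo treats index 0 as the ignore class, which is not confirmed here): drop the ignore channel before the argmax so submitted predictions stay in the 1–16 range.

import numpy as np

def to_submission_labels(logits):
    # logits: (num_points, 17) scores; channel 0 assumed to be the ignore class.
    # Argmax over channels 1..16 yields 0..15; shifting by 1 lands in 1..16.
    return (logits[:, 1:].argmax(axis=1) + 1).astype(np.uint8)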

Unable to do testing with SemanticKITTI using pretrained model

Hello! Thanks for providing the repo. I have tried to follow all the steps in Google Colab. But at the end of the main.py file, with the code

train_dataset_loader, val_dataset_loader, test_dataset_loader = build_loader(configs)
model_file = importlib.import_module('network.' + configs['model_params']['model_architecture'])
my_model = model_file.get_model(configs)

the above code does not load network/arch_2dpass.py, and the testing does not start.
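For context, a minimal debugging sketch (the /content path is a hypothetical Colab checkout location, not from this repo): importlib.import_module resolves 'network.arch_2dpass' against sys.path, so running from a different working directory in Colab makes the import fail before testing can start.

import importlib
import sys

sys.path.insert(0, '/content/2DPASS')  # hypothetical Colab checkout path
model_file = importlib.import_module('network.arch_2dpass')
print(model_file.get_model)  # should print the factory that main.py calls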

Please help regarding this.

error while running pretrained model

Traceback (most recent call last):
File "main.py", line 167, in <module>
train_dataset_loader, val_dataset_loader, test_dataset_loader = build_loader(configs)
File "main.py", line 120, in build_loader
val_pt_dataset = pc_dataset(config, data_path=val_config['data_path'], imageset='val', num_vote=val_config["batch_size"])
File "/home/ps/hcc/code/2DPASS/dataloader/pc_dataset.py", line 64, in __init__
calib = self.read_calib(calib_path)
File "/home/ps/hcc/code/2DPASS/dataloader/pc_dataset.py", line 94, in read_calib
calib_out['Tr'][:3, :4] = calib_all['Tr'].reshape(3, 4)
KeyError: 'Tr'
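For context, a minimal sketch of the kind of parsing that raises this (an illustration, not the repo's exact code): read_calib expects a 'Tr:' line in sequences/XX/calib.txt, and the KeyError usually means the calibration files came from a KITTI odometry download that lacks the velodyne-to-camera transform.

def read_calib(calib_path):
    # Each line of calib.txt is "KEY: v1 v2 ...".
    calib_all = {}
    with open(calib_path) as f:
        for line in f:
            if ':' not in line:
                continue
            key, value = line.split(':', 1)
            calib_all[key.strip()] = [float(x) for x in value.split()]
    # 'Tr' (velodyne-to-camera) is required for the 2D/3D correspondence.
    assert 'Tr' in calib_all, f"no 'Tr' line in {calib_path}; check the calib download"
    return calib_all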

Testing results on SemanticKITTI

Thank you for your great work! We achieved the claimed results on the validation set. However, we only achieve 68.2 mIoU on the test set, which is much lower than the claimed 72.9.

code release

I'm glad to see this interesting work.
I wonder when you will release the source code?

input dims

Hello!

Why did you choose input dims = 4?

Thanks!
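For context, a hedged note (a general SemanticKITTI convention, not an answer from the authors): the .bin scans store 4 float32 values per point, which matches an input dimension of 4.

import numpy as np

# SemanticKITTI scans: 4 float32 values per point (x, y, z, intensity).
scan = np.fromfile('sequences/00/velodyne/000000.bin', dtype=np.float32).reshape(-1, 4)
print(scan.shape)  # (num_points, 4)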

Modality fusion implementation question

Hello. I am studying 2DPASS through the code.
It seems that the modality fusion implementation is in network/arch_2dpass.py, from line 100 to line 105.
The paper specifies an element-wise add, but I do not see it in the code, so I am asking a question.
Below is the code.

# modality fusion
feat_learner = F.relu(self.leaners[idx](pts_feat))
feat_cat = torch.cat([img_feat, feat_learner], 1)
feat_cat = self.fcs1[idx](feat_cat)
feat_weight = torch.sigmoid(self.fcs2[idx](feat_cat))
fuse_feat = F.relu(feat_cat * feat_weight)

I think that fuse_feat = F.relu(feat_cat * feat_weight) + img_feat would implement the formula in the paper.
Is that correct?

training error about KeyError: 'Tr'

Thanks for sharing your excellent work. When I trained the model, I met the error: './2DPASS-main/dataloader/pc_dataset.py", line 94, in read_calib: calib_out['Tr'][:3, :4] = calib_all['Tr'].reshape(3, 4)' (KeyError: 'Tr'). How can I solve this problem?

Baseline training schedule

First of all, thank you for uploading this very comprehensive and capable code. It surpasses the quality of what others upload on GitHub in the 3D semantic segmentation space.

Has there been any additional training schedule for the baseline model?
Models like Cylinder3D used unpublished training schedules and only hinted at some methods in GitHub issues or in their paper. Most likely they used additional test-time augmentation, instance augmentation, an unreported augmentation schedule, model depth/channel tuning, LaserMix, ensembling, etc. Potentially every trick in the book.

Is that the case in this code as well? Or has the baseline only been trained with the details present in this code? I want to train your code, but would rather not invest resources if I cannot reach the reported 67.4 test result of the baseline.

I've read Issue 13, but do the answers in that issue also apply to the baseline?

Fails to load pretrained model

Hi!
I have problems with loading a pretrained model. It says the tensor shapes are mismatched.

Traceback (most recent call last):
  File "main.py", line 181, in <module>
    my_model = my_model.load_from_checkpoint(configs.checkpoint, config=configs, strict=(not configs.pretrain2d))
  File "/usr/local/lib/python3.8/site-packages/pytorch_lightning/core/saving.py", line 157, in load_from_checkpoint
    model = cls._load_model_state(checkpoint, strict=strict, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/pytorch_lightning/core/saving.py", line 205, in _load_model_state
    model.load_state_dict(checkpoint['state_dict'], strict=strict)
  File "/usr/local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1223, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for get_model:
        size mismatch for model_3d.spv_enc.0.v_enc.0.layers_in.0.weight: copying a param with shape torch.Size([64, 1, 1, 1, 64]) from checkpoint, the shape in current model is torch.Size([1, 1, 1, 64, 64]).
        size mismatch for model_3d.spv_enc.0.v_enc.0.layers.0.weight: copying a param with shape torch.Size([64, 3, 3, 3, 64]) from checkpoint, the shape in current model is torch.Size([3, 3, 3, 64, 64]).
        size mismatch for model_3d.spv_enc.0.v_enc.0.layers.3.weight: copying a param with shape torch.Size([64, 3, 3, 3, 64]) from checkpoint, the shape in current model is torch.Size([3, 3, 3, 64, 64]).
        size mismatch for model_3d.spv_enc.0.v_enc.1.layers_in.0.weight: copying a param with shape torch.Size([64, 1, 1, 1, 64]) from checkpoint, the shape in current model is torch.Size([1, 1, 1, 64, 64]).
        size mismatch for model_3d.spv_enc.0.v_enc.1.layers.0.weight: copying a param with shape torch.Size([64, 3, 3, 3, 64]) from checkpoint, the shape in current model is torch.Size([3, 3, 3, 64, 64]).
        size mismatch for model_3d.spv_enc.0.v_enc.1.layers.3.weight: copying a param with shape torch.Size([64, 3, 3, 3, 64]) from checkpoint, the shape in current model is torch.Size([3, 3, 3, 64, 64]).
        size mismatch for model_3d.spv_enc.1.v_enc.0.layers_in.0.weight: copying a param with shape torch.Size([64, 1, 1, 1, 64]) from checkpoint, the shape in current model is torch.Size([1, 1, 1, 64, 64]).
        size mismatch for model_3d.spv_enc.1.v_enc.0.layers.0.weight: copying a param with shape torch.Size([64, 3, 3, 3, 64]) from checkpoint, the shape in current model is torch.Size([3, 3, 3, 64, 64]).
        size mismatch for model_3d.spv_enc.1.v_enc.0.layers.3.weight: copying a param with shape torch.Size([64, 3, 3, 3, 64]) from checkpoint, the shape in current model is torch.Size([3, 3, 3, 64, 64]).
        size mismatch for model_3d.spv_enc.1.v_enc.1.layers_in.0.weight: copying a param with shape torch.Size([64, 1, 1, 1, 64]) from checkpoint, the shape in current model is torch.Size([1, 1, 1, 64, 64]).
        size mismatch for model_3d.spv_enc.1.v_enc.1.layers.0.weight: copying a param with shape torch.Size([64, 3, 3, 3, 64]) from checkpoint, the shape in current model is torch.Size([3, 3, 3, 64, 64]).
        size mismatch for model_3d.spv_enc.1.v_enc.1.layers.3.weight: copying a param with shape torch.Size([64, 3, 3, 3, 64]) from checkpoint, the shape in current model is torch.Size([3, 3, 3, 64, 64]).
        size mismatch for model_3d.spv_enc.2.v_enc.0.layers_in.0.weight: copying a param with shape torch.Size([64, 1, 1, 1, 64]) from checkpoint, the shape in current model is torch.Size([1, 1, 1, 64, 64]).
        size mismatch for model_3d.spv_enc.2.v_enc.0.layers.0.weight: copying a param with shape torch.Size([64, 3, 3, 3, 64]) from checkpoint, the shape in current model is torch.Size([3, 3, 3, 64, 64]).
        size mismatch for model_3d.spv_enc.2.v_enc.0.layers.3.weight: copying a param with shape torch.Size([64, 3, 3, 3, 64]) from checkpoint, the shape in current model is torch.Size([3, 3, 3, 64, 64]).
        size mismatch for model_3d.spv_enc.2.v_enc.1.layers_in.0.weight: copying a param with shape torch.Size([64, 1, 1, 1, 64]) from checkpoint, the shape in current model is torch.Size([1, 1, 1, 64, 64]).
        size mismatch for model_3d.spv_enc.2.v_enc.1.layers.0.weight: copying a param with shape torch.Size([64, 3, 3, 3, 64]) from checkpoint, the shape in current model is torch.Size([3, 3, 3, 64, 64]).
        size mismatch for model_3d.spv_enc.2.v_enc.1.layers.3.weight: copying a param with shape torch.Size([64, 3, 3, 3, 64]) from checkpoint, the shape in current model is torch.Size([3, 3, 3, 64, 64]).
        size mismatch for model_3d.spv_enc.3.v_enc.0.layers_in.0.weight: copying a param with shape torch.Size([64, 1, 1, 1, 64]) from checkpoint, the shape in current model is torch.Size([1, 1, 1, 64, 64]).
        size mismatch for model_3d.spv_enc.3.v_enc.0.layers.0.weight: copying a param with shape torch.Size([64, 3, 3, 3, 64]) from checkpoint, the shape in current model is torch.Size([3, 3, 3, 64, 64]).
        size mismatch for model_3d.spv_enc.3.v_enc.0.layers.3.weight: copying a param with shape torch.Size([64, 3, 3, 3, 64]) from checkpoint, the shape in current model is torch.Size([3, 3, 3, 64, 64]).
        size mismatch for model_3d.spv_enc.3.v_enc.1.layers_in.0.weight: copying a param with shape torch.Size([64, 1, 1, 1, 64]) from checkpoint, the shape in current model is torch.Size([1, 1, 1, 64, 64]).
        size mismatch for model_3d.spv_enc.3.v_enc.1.layers.0.weight: copying a param with shape torch.Size([64, 3, 3, 3, 64]) from checkpoint, the shape in current model is torch.Size([3, 3, 3, 64, 64]).
        size mismatch for model_3d.spv_enc.3.v_enc.1.layers.3.weight: copying a param with shape torch.Size([64, 3, 3, 3, 64]) from checkpoint, the shape in current model is torch.Size([3, 3, 3, 64, 64]).

How can I solve this? Should I change the config file to use your pretrained model?

Multi-GPU training issue

Have you ever run into this issue?

It works well when I use one or two GPUs to train with batch_size = 1 or 2. However, it gets killed when I use three or four GPUs with batch_size 3 or 4, given that the per-GPU memory is around 12 GB.

I don't know if I forgot to set any parameters.

Can anyone who has met this before do me a favor?

Thanks!

Pretrained models

Hello,
Thank you for your great work ,

Were the models you put on Google Drive trained with the additional validation set and with instance-level segmentation?

Thanks in advance

Training Details

Hello, congratulations on your paper; it is very good work. I didn't find which GPUs you used for training and inference, so could you please give me a suggestion about that?

The results on the nuScenes validation set

Thanks for your great work. I tested the checkpoint you released on the nuScenes validation set, but it only got 73.7% mIoU, which is much lower than you reported. I wonder if you have any suggestions on this.

CutMix Augmentation Code

Thank you for sharing this inspiring work. In the paper you mention using CutMix augmentation; however, I could not find the code in the repository. You cite the RPVNet paper, but they also did not publish their code. Could you please provide the code for the CutMix augmentation?
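For illustration, a generic sketch of the idea only (the sector-based mixing below is an assumption for illustration, not the authors' unpublished implementation): two scans are mixed by cutting an angular sector out of one and pasting in the corresponding sector of the other, labels included.

import numpy as np

def sector_cutmix(pts_a, lbl_a, pts_b, lbl_b, sector=np.pi / 2):
    # pts_*: (N, 4) x/y/z/intensity arrays; lbl_*: (N,) per-point labels.
    start = np.random.uniform(0, 2 * np.pi)

    def in_sector(pts):
        ang = np.arctan2(pts[:, 1], pts[:, 0]) % (2 * np.pi)
        return (ang - start) % (2 * np.pi) < sector

    keep_a, take_b = ~in_sector(pts_a), in_sector(pts_b)
    pts = np.concatenate([pts_a[keep_a], pts_b[take_b]])
    lbl = np.concatenate([lbl_a[keep_a], lbl_b[take_b]])
    return pts, lbl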

SemanticKITTI training hyperparameter

Thank you for your work!
You mentioned in the README that higher performance was achieved by fine-tuning the model over more epochs on SemanticKITTI.
How many more epochs did you set?

SimpleProfiler error while training

Hi, I would first like to thank you for providing this repo. I have installed all the dependencies and followed each step on my local machine. However, I get stuck with the following error, both in training and in testing with pre-trained weights. I would be grateful for any help with this.

[screenshot: SimpleProfiler error]

Hidden Layer dimension on nuScenes

Hello, according to the paper, a dimension size of 128 is used to train the network on nuScenes. However, in this released code, the default dimension size on nuScenes is 256. Could you please tell me why they are different and which is right?

Some problems before training

Hi,
I got this error when training, but I don't know how to solve it.
Would you tell me how to solve this error?
Thanks.

Traceback (most recent call last):
File "E:\2DPASS-main\main.py", line 211, in <module>
trainer.fit(my_model, train_dataset_loader, val_dataset_loader)
File "D:\anaconda\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 460, in fit
self._run(model)
File "D:\anaconda\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 714, in _run
self.accelerator.setup_environment()
File "D:\anaconda\lib\site-packages\pytorch_lightning\accelerators\accelerator.py", line 80, in setup_environment
self.training_type_plugin.setup_environment()
File "D:\anaconda\lib\site-packages\pytorch_lightning\plugins\training_type\ddp.py", line 118, in setup_environment
self.setup_distributed()
File "D:\anaconda\lib\site-packages\pytorch_lightning\plugins\training_type\ddp.py", line 206, in setup_distributed
self.init_ddp_connection()
File "D:\anaconda\lib\site-packages\pytorch_lightning\plugins\training_type\ddp.py", line 273, in init_ddp_connection
torch_distrib.init_process_group(self.torch_distributed_backend, rank=global_rank, world_size=world_size)
File "D:\anaconda\lib\site-packages\torch\distributed\distributed_c10d.py", line 754, in init_process_group
store, rank, world_size = next(rendezvous_iterator)
File "D:\anaconda\lib\site-packages\torch\distributed\rendezvous.py", line 248, in _env_rendezvous_handler
store = _create_c10d_store(master_addr, master_port, rank, world_size, timeout)
File "D:\anaconda\lib\site-packages\torch\distributed\rendezvous.py", line 178, in _create_c10d_store
TCPStore(hostname, port, world_size, start_daemon, timeout, multi_tenant=False)
RuntimeError: unmatched '}' in format string

Visualization

Hi, thanks for publishing your code.

Is there any way to visualize the results on the validation set/save the network predictions to a file?

Some questions about your paper

First of all, congratulations on your achievements!
I would like to ask some questions about your paper.

In Fig. 3(a), I understand that the point P̂ on the image path can be mapped from P, and the red line indicates that the correspondence operation can obtain the points in the corresponding camera image.

But I don't know what the blue box I drew in the figure means. What is the relationship between P̂ and M_img?
[figure: annotated Fig. 3]

A question about the dataloader

Thanks for your excellent work! I have a small question about the dataloader. The lengths of img_indices, img_label and point2img_index may differ for each frame of data. How are they put into the same batch?
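For context, a minimal sketch of the usual pattern (the collate_fn and batch_idx names are hypothetical, not necessarily what this repo does): variable-length per-frame tensors are kept as Python lists, or concatenated along the point dimension with an explicit batch index, instead of being stacked.

import torch

def collate_fn(samples):
    batch = {}
    # Lists of per-frame tensors with different lengths can stay as lists...
    batch['img_indices'] = [s['img_indices'] for s in samples]
    # ...or be concatenated with a per-point batch index so each point can be
    # routed back to its frame after the forward pass.
    batch['point2img_index'] = torch.cat([s['point2img_index'] for s in samples], dim=0)
    batch['batch_idx'] = torch.cat(
        [torch.full((len(s['point2img_index']),), i, dtype=torch.long)
         for i, s in enumerate(samples)], dim=0)
    return batch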

Validation sanity check error (SemanticKITTI dataset)

[screenshots: validation sanity check error]
When I run the SemanticKITTI dataset, the above error occurs, but it is normal when I run the nuScenes dataset. I have placed the SemanticKITTI dataset as requested in the README, and when I run the SemanticKITTI dataset with the PMF code it works fine. In addition, I also checked that the dataset code is the same as the code in this GitHub repo. I would like to know why this error occurs and how to solve it.
Thank you very much.

Question about the paper

Hello, author. Thank you very much for producing such excellent work. After reading the paper, I have some questions.

In the MSFSKD module, my understanding is that the upper branch outputs the 2D segmentation result and the lower branch outputs the 3D segmentation result. How is the KL divergence between these two output spaces computed?
[figure: MSFSKD module]

about pictures problem

Hello, thank you for your great work! Why am I unable to test the pretrained model without images?

Training Problem

Thank you for your excellent work! When I run the code, the performance over the first four epochs is 51.682 mIoU, 49.632 mIoU, 50.934 mIoU and 47.684 mIoU. The performance decreases with training. I do not know whether this is a reasonable experimental result.

Inquiry about GPU

Thanks for your great work!
What is the memory size of your graphics card? Can I use a 3090 for training?

How to save the predicted labels?

Thanks for your great work! I'm trying to use the model to visualize the segmentation result; can you tell me how to save the predicted labels?
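For context, a minimal sketch assuming SemanticKITTI conventions (not code from this repo): the benchmark stores one uint32 label per point per scan, which tools such as semantic-kitti-api can then visualize.

import numpy as np

def save_labels(pred, out_path):
    # pred: (num_points,) array of predicted class ids for one scan.
    pred.astype(np.uint32).tofile(out_path)  # e.g. predictions/000000.label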

about inference time

Thank you for your great work! When I run testing on SemanticKITTI, I get an inference time of 480 ms (using a Tesla V100 32G). In your paper, the inference time is 62 ms. I just use the model from your Google Drive folder; what is the problem?
I use the following code to print the inference time:
t1 = time.time()
data_dict = self.forward(data_dict)
t2 = time.time()
print('inference time: ', t2-t1)
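For what it's worth, a hedged measurement note (a general CUDA caveat, not a claim about this repo): CUDA kernels launch asynchronously, so timing forward() without synchronizing can fold queued work into the measurement. A minimal drop-in variant of the snippet above:

import time
import torch

torch.cuda.synchronize()  # wait for queued kernels before starting the clock
t1 = time.time()
data_dict = self.forward(data_dict)  # same call as in the snippet above
torch.cuda.synchronize()  # wait for the forward pass to actually finish
t2 = time.time()
print('inference time: ', t2 - t1)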

Some details about training and inference

Thank you for your excellent work. I am quite curious about the training and inference details of the MSFSKD module. As stated in the paper, the 2D learner takes the 3D features (64 channels) of points in the image FOV as input and outputs the enhanced 3D features. So what is the input to the 2D learner at the inference phase, since the image and the 'points in image FOV' are not given at inference? Besides, how do the enhanced 3D features connect with the backbone feature, that is, how do the features go back into the SPVCNN encoder?

Thank you.

details about SPVCNN

Hi @yanx27, thanks for providing this excellent work. I wonder what network configs you used in the experiments. As mentioned, you use a modified SPVCNN with resolution 0.1 and hidden dimension 64. Does that mean you change https://github.com/mit-han-lab/spvnas/blob/69750e900d8687ac9fcc8e042b171cd1f6beffa1/core/models/semantic_kitti/spvcnn.py#L87 from cs = [32, 32, 64, 128, 256, 256, 128, 96, 96] to cs = [64, 64, 64, 64, 64, 64, 64, 64, 64], and https://github.com/mit-han-lab/spvnas/blob/69750e900d8687ac9fcc8e042b171cd1f6beffa1/configs/semantic_kitti/default.yaml#L14 from "voxel_size: 0.05" to "voxel_size: 0.1"?

And are there any other changes needed to reproduce the reported results?

Thanks again.

validation_epoch_end

It gives a size error when calculating the IoU; what do you think could be the problem?
[screenshot: IoU size error]

AssertionError: Table lidarseg not found

Thanks for your great work!
I prepared the nuScenes data according to your instructions, and encountered the following problem when training:

Validation sanity check: 0%| | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
File "main.py", line 211, in <module>
trainer.fit(my_model, train_dataset_loader, val_dataset_loader)
File "/root/miniconda/envs/2dpass/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 460, in fit
self._run(model)
File "/root/miniconda/envs/2dpass/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 758, in _run
self.dispatch()
File "/root/miniconda/envs/2dpass/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 799, in dispatch
self.accelerator.start_training(self)
File "/root/miniconda/envs/2dpass/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 96, in start_training
self.training_type_plugin.start_training(trainer)
File "/root/miniconda/envs/2dpass/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 144, in start_training
self._results = trainer.run_stage()
File "/root/miniconda/envs/2dpass/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 809, in run_stage
return self.run_train()
File "/root/miniconda/envs/2dpass/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 844, in run_train
self.run_sanity_check(self.lightning_module)
File "/root/miniconda/envs/2dpass/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1112, in run_sanity_check
self.run_evaluation()
File "/root/miniconda/envs/2dpass/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 954, in run_evaluation
for batch_idx, batch in enumerate(dataloader):
File "/root/miniconda/envs/2dpass/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 521, in next
data = self._next_data()
File "/root/miniconda/envs/2dpass/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
return self._process_data(data)
File "/root/miniconda/envs/2dpass/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
data.reraise()
File "/root/miniconda/envs/2dpass/lib/python3.8/site-packages/torch/_utils.py", line 425, in reraise
raise self.exc_type(msg)
AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/root/miniconda/envs/2dpass/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/root/miniconda/envs/2dpass/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/root/miniconda/envs/2dpass/lib/python3.8/site-packages/torch/utils/data/utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/root/Outdoor/2DPASS/dataloader/dataset.py", line 333, in getitem
data, root = self.point_cloud_dataset[index]
File "/root/Outdoor/2DPASS/dataloader/pc_dataset.py", line 257, in getitem
pointcloud, sem_label, instance_label, lidar_sample_token = self.loadDataByIndex(index)
File "/root/Outdoor/2DPASS/dataloader/pc_dataset.py", line 191, in loadDataByIndex
print(self.nusc.get('lidarseg', lidar_sample_token)['filename'])
File "/root/miniconda/envs/2dpass/lib/python3.8/site-packages/nuscenes/nuscenes.py", line 214, in get
assert table_name in self.table_names, "Table {} not found".format(table_name)
AssertionError: Table lidarseg not found
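For context, a minimal check sketch ('/data/nuscenes' is a hypothetical path): the assertion means the devkit never loaded a lidarseg table, which usually indicates the nuScenes-lidarseg expansion (the lidarseg/ folder and its JSON table) was not extracted into the dataset root.

from nuscenes import NuScenes

# After extracting the lidarseg expansion into the dataset root, the devkit
# should register the table.
nusc = NuScenes(version='v1.0-trainval', dataroot='/data/nuscenes')
print('lidarseg' in nusc.table_names)  # should be True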

Error when testing and training

Hi,

I am getting a "Not implemented for CPU only build." error when testing or training the network, although a GPU is available.
Kindly advise.
kindly advise.

generation of 2d semantic map

This is such an excellent job!
I want to ask how I can get a 2D semantic map like the demo picture posted on your GitHub page (the top right one).
Looking forward to your reply!

About xModalKD code and MSFSKD figure

Hi, thanks for your excellent work!

I notice that in Fig. 4 of the 2DPASS paper, the enhanced 3D features are the summation of the 3D features and the output from the 2D learner. However, in this line of the code, the 3D features from spvcnn.py are fed directly into the 3D classifier, without adding feat_learner. What's more, this forward path is not processed during inference, which is also not consistent with the solid line in Fig. 4.

Would you please kindly explain the difference between the code and the paper, and which variable in your code corresponds to the Enhanced 3D Features in Fig. 4? Thanks so much!
