Official implementation of "CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding" (CVPR, 2022)

Home Page: https://mohamedafham.github.io/CrossPoint/

crosspoint's Introduction

CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding (CVPR'22)

Citation

If you find our work, this repository, or the pretrained models useful, please consider giving a star ⭐ and citing the paper.

@InProceedings{Afham_2022_CVPR,
    author    = {Afham, Mohamed and Dissanayake, Isuru and Dissanayake, Dinithi and Dharmasiri, Amaya and Thilakarathna, Kanchana and Rodrigo, Ranga},
    title     = {CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {9902-9912}
}

🚀 News

  • (Mar 25, 2023)
    • An implementation supporting PyTorch DistributedDataParallel (DDP) is available here. Thanks to Jerry Sun.
  • (Mar 2, 2022)
    • Paper accepted at CVPR 2022 🎉
  • (Mar 2, 2022)
    • Training and evaluation code for CrossPoint, along with pretrained models, is released.

Dependencies

Refer to requirements.txt for the required packages.

Pretrained Models

CrossPoint pretrained models with a DGCNN feature extractor are available here.

Download data

Datasets are available here. Run the command below to download all the datasets (ShapeNetRender, ModelNet40, ScanObjectNN, ShapeNetPart) to reproduce the results.

cd data
source download_data.sh

Train CrossPoint

Refer to scripts/script.sh for the commands to train CrossPoint.

Downstream Tasks

1. 3D Object Classification

Run the eval_ssl.ipynb notebook to perform linear SVM object classification on both the ModelNet40 and ScanObjectNN datasets.

2. Few-Shot Object Classification

Refer to scripts/fsl_script.sh to perform few-shot object classification.

3. 3D Object Part Segmentation

Refer to scripts/script.sh for the fine-tuning experiment for part segmentation on the ShapeNetPart dataset.

Acknowledgements

Our code borrows heavily from the DGCNN repository. We thank the authors of DGCNN for releasing their code. If you use our model, please consider citing them as well.

crosspoint's Issues

Definition of the dgcnn_seg model

Thanks for your nice work. I am trying to reproduce your fine-tuning results on ShapeNetPart segmentation. I find that the model architectures for classification pre-training and segmentation pre-training are different. More specifically, the dgcnn model is adopted in classification pre-training, while dgcnn_seg is used in pre-training for part segmentation, as shown in the following:

python train_crosspoint.py --model dgcnn_seg --epochs 100 --lr 0.001 --exp_name crosspoint_dgcnn_seg --batch_size 20 --print_freq 200 --k 15

However, I cannot find the definition of dgcnn_seg in your model library. I guess dgcnn_seg should be the DGCNN_partseg model with pretrain=True, right?

In addition, other papers, such as OcCo, may adopt the same architecture in pre-training for both classification and part segmentation. Such a difference may lead to an unfair comparison. What's your opinion?

It seems that the pretrained model you provide has a gap on ModelNet40

Hi, I used your pretrained model directly to test linear accuracy on ModelNet40 and got 90.27%, the same as when I ran train_crosspoint.py without any initialization. But the result reported in your paper is 91.2%. So I want to know whether there are any tricks in your code, or whether I should continue training from your pretrained model. I look forward to your answer.

Can't download the dataset using gdown

When using download_data.sh, it raises the error:
requests.exceptions.MissingSchema: Invalid URL '': No scheme supplied. Perhaps you meant http://?

How to use gdown to download the dataset?
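The error message suggests that an empty string reached the HTTP layer as the URL. A minimal sketch of a check that could be run before handing a link to any downloader; the helper name `require_url` and the example URL are hypothetical and not part of the repository:

```python
from urllib.parse import urlparse


def require_url(url: str) -> str:
    """Return the stripped URL if it has an http(s) scheme and a host, else raise."""
    cleaned = url.strip()
    parsed = urlparse(cleaned)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not a usable download URL: {url!r}")
    return cleaned


# Hypothetical link, used only to show the call.
print(require_url("https://example.com/data.zip"))
```

Failing early with the offending value makes it easier to see which entry in a download script resolved to an empty string.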

relatively large performance gap on ScanObjectNN

@MohamedAfham Recently, I have run all experiments in the codebase at least 3 times to rule out errors in my own runs.

Some of the results are very encouraging: they are comparable with those reported in the paper and sometimes even higher, e.g. the reproduced results on ModelNet. But some are not.

Specifically, for few-shot classification on ScanObjectNN, the performance gap is relatively large, e.g.,

  1. for 5 way, 10 shot, I got 72.5 ± 8.33,
  2. for 5 way, 20 shot, I got 82.5 ± 5.06,
  3. for 10 way, 10 shot, I got 59.4 ± 3.95,
  4. for 10 way, 20 shot, I got 67.8 ± 4.41
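Figures of this form are usually a mean and standard deviation over repeated few-shot runs. A minimal sketch of that aggregation, assuming the sample standard deviation is used; the function name `summarize_runs` and the accuracies are made up for illustration and are not taken from the experiments above:

```python
from statistics import mean, stdev


def summarize_runs(accuracies):
    """Format per-run accuracies (in percent) as 'mean ± std'."""
    return f"{mean(accuracies):.1f} ± {stdev(accuracies):.2f}"


# Hypothetical per-run accuracies, not results from this repository.
runs = [70.0, 80.0, 75.0, 65.0, 85.0]
print(summarize_runs(runs))  # 75.0 ± 7.91
```

With only a handful of runs the spread is large, so differences of a few points between two reproductions may fall within the reported deviation.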

For linear SVM classification on ScanObjectNN, the reproduced performance is 75.73%. All experiments use the DGCNN backbone and default settings except for the batch size.

In short, all of my results on ScanObjectNN fall behind the performance reported in the paper by a large margin.

I wonder whether there are any precautions to take when experimenting on ScanObjectNN, and what the possible reasons might be. Can you provide some suggestions? Thank you.

Can train_crosspoint.py train the partseg model based on ShapeNetPart?

@MohamedAfham Thank you for releasing the code. The paper is well written and the code is robust.

I have successfully trained the classification and part segmentation models based on train_crosspoint.py and train_partseg.py, respectively. Everything goes smoothly.

One point I'm confused about is the comments in scripts/script.sh, which say that train_crosspoint.py can be used for training the part segmentation model and train_partseg.py for fine-tuning it. The code in train_crosspoint.py, however, only loads ShapeNetRender for pre-training and ModelNet40 for linear accuracy evaluation. It does not load ShapeNetPart for part segmentation.

Instead, I think both training and fine-tuning take place in train_partseg.py, as the train_loader in this file is designed for ShapeNetPart. Further, I think the self-supervised cross-modal contrastive learning is intended for point cloud classification. Have I understood this correctly?

About PointNetRendering Dataset

I think the view metrics of the rendered images are described in your rendering_mentadata.txt. Could you specify the meaning of each metric? Thanks.

the definition of loss function

Thank you for your excellent work. Where in the code can I find the definition of the loss functions (including imid and cmid) mentioned in the paper?
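For orientation while reading the training script, here is a generic NT-Xent-style contrastive loss over paired embeddings. It is a sketch under stated assumptions, not the repository's implementation: the function name `nt_xent`, the temperature value, and the tensor shapes are all illustrative.

```python
import torch
import torch.nn.functional as F


def nt_xent(z_a: torch.Tensor, z_b: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """NT-Xent loss between two batches of embeddings, each of shape (B, D).

    Row i of z_a and row i of z_b form a positive pair; every other row of
    the concatenated batch acts as a negative.
    """
    batch_size = z_a.shape[0]
    z = F.normalize(torch.cat([z_a, z_b], dim=0), dim=1)   # (2B, D), unit norm
    sim = z @ z.t() / temperature                           # scaled cosine similarities
    self_mask = torch.eye(2 * batch_size, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))         # exclude self-pairs
    # The positive for row i is row i + B, and for row i + B it is row i.
    targets = torch.cat([
        torch.arange(batch_size, 2 * batch_size),
        torch.arange(0, batch_size),
    ]).to(z.device)
    return F.cross_entropy(sim, targets)


# Hypothetical embeddings: two views of a batch of 4 samples, 8 dimensions each.
view_1, view_2 = torch.randn(4, 8), torch.randn(4, 8)
loss = nt_xent(view_1, view_2)
```

An intra-modal term would pass embeddings of two augmented point clouds as the pair, while a cross-modal term would pair a point-cloud embedding with an image embedding.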

Availability of checkpoint

Is it possible to have the checkpoints of both models (3D and 2D), so that fine-tuning can start from the pretrained models?

null

Can you provide PointNet as a feature extractor for training with your code? Thank you very much.

I tried this based on the PointNet source code you referenced, but the training results were bad, probably because the parameters were set differently from DGCNN.

Please provide the part of the code that trains PointNet as a feature extractor. It means a lot to me. Thank you!

get_graph_feature adds tensors on two different devices

Thank you for your contributions. Very interesting work! I have two GPUs and I'm trying to run CrossPoint pre-training for classification using:

python train_crosspoint.py --model dgcnn --epochs 100 --lr 0.001 --exp_name crosspoint_dgcnn_cls --batch_size 20 --print_freq 200 --k 15

And I'm getting the following error:

Traceback (most recent call last):
  File "train_crosspoint.py", line 258, in <module>
    train(args, io)
  File "train_crosspoint.py", line 100, in train
    _, point_feats, _ = point_model(data)
  File "/home/nas/anaconda3/envs/crosspoint/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/nas/Desktop/CrossPoint/models/dgcnn.py", line 95, in forward
    x = get_graph_feature(x, k=self.k)
  File "/home/nas/Desktop/CrossPoint/models/dgcnn.py", line 31, in get_graph_feature
    idx = idx + idx_base
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! 

Is there a reason for hardcoding the CUDA device to 1 here: https://github.com/MohamedAfham/CrossPoint/blob/440e3bdf1656014eb4284786a6b2bcdf83e8df30/models/dgcnn.py#L27
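The traceback is consistent with the index base being created on a fixed device while the input sits on another. A minimal sketch of deriving the device from the input instead, assuming the input has shape (batch, channels, points); the helper name `batch_index_offsets` is hypothetical and not part of the repository:

```python
import torch


def batch_index_offsets(x: torch.Tensor) -> torch.Tensor:
    """Per-sample offsets for flattening kNN indices, for x of shape (B, C, N).

    The offsets are created on x.device, so they can be added to an index
    tensor computed from x regardless of which GPU (or CPU) x sits on.
    """
    batch_size, _, num_points = x.shape
    return torch.arange(batch_size, device=x.device).view(-1, 1, 1) * num_points


x = torch.randn(2, 3, 5)                       # CPU here; CUDA tensors work the same way
idx = torch.zeros(2, 5, 4, dtype=torch.long)   # stand-in for kNN indices of shape (B, N, k)
idx = idx + batch_index_offsets(x)             # both operands share x's device
```

Taking the device from the tensor also avoids requesting a GPU ordinal that does not exist on single-GPU machines.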

Downstream tasks 3D Object classification

Thanks for your great work!
Why do you fit a simple linear SVM classifier on the train split of the classification datasets in 3D object classification? Where can I find the corresponding code?

About the pointcloud visualization software in Fig.2

Hi, Mohamed Afham!
I really appreciate your great work! And I think the figures in your paper are wonderful!
Could you please tell me which point cloud visualization software was used in Figure 2? It looks nice!
Thanks in advance!

What's the GPU device used during your training and fine-tuning?

As the title says, I wonder which GPU you used to support batch_size=20.

I use an RTX 2080 Ti, which has 11GB of memory. When running train_crosspoint.py, I have to set batch_size=2 to avoid CUDA out of memory since, as you know, knn and torch.cat in models/dgcnn.py consume a large portion of memory.
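A rough back-of-the-envelope sketch of why those two operations are heavy. It counts only one float32 tensor per operation for a single layer; the point count (2048) and channel width (64) are assumed values, and `tensor_mib` is a hypothetical helper:

```python
def tensor_mib(*shape: int, bytes_per_element: int = 4) -> float:
    """Size in MiB of a dense tensor with the given shape (float32 by default)."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element / 2**20


# batch and k follow the training command; points and channels are assumptions.
batch, points, k, channels = 20, 2048, 15, 64

pairwise = tensor_mib(batch, points, points)             # kNN pairwise-distance matrix
edge_feat = tensor_mib(batch, 2 * channels, points, k)   # concatenated edge features
print(f"pairwise distances: {pairwise:.0f} MiB, edge features: {edge_feat:.0f} MiB")
# pairwise distances: 320 MiB, edge features: 300 MiB
```

Several such tensors are kept alive for the backward pass across layers and across both augmented inputs, so actual usage is a multiple of these figures.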

However, the small batch_size leads to a much slower training procedure, so I will probably get the final results only after 4 or 5 days.

By the way, I have multiple GPUs. Is it possible to incorporate DistributedDataParallel to accelerate the training procedure?

Anyway, I will try it out!

RuntimeError: CUDA error: invalid device ordinal

Hi @MohamedAfham

Have you ever met this bug before? Thanks a lot.

Using GPU : 0 from 1 devices
Use Adam
Start training epoch: (0/100)
/export/home/hanxiaobing/anaconda3/envs/crosspoint/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:134: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
"https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
Traceback (most recent call last):
  File "train_crosspoint.py", line 261, in <module>
    train(args, io)
  File "train_crosspoint.py", line 103, in train
    _, point_feats, _ = point_model(data)
  File "/export/home/hanxiaobing/anaconda3/envs/crosspoint/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/export/home/hanxiaobing/Documents/PlaneNet_PlaneRCNN/DGCNN_PointNet2/SensatUrban/MAE/CrossPoint/models/dgcnn.py", line 95, in forward
    x = get_graph_feature(x, k=self.k)
  File "/export/home/hanxiaobing/Documents/PlaneNet_PlaneRCNN/DGCNN_PointNet2/SensatUrban/MAE/CrossPoint/models/dgcnn.py", line 29, in get_graph_feature
    idx_base = torch.arange(0, batch_size, device=device).view(-1, 1, 1)*num_points
RuntimeError: CUDA error: invalid device ordinal

distributed training for CrossPoint

@MohamedAfham I have successfully integrated the PyTorch DistributedDataParallel mechanism into your codebase, which accelerates the training procedure remarkably and achieves performance similar to that reported in the paper.

Later on I would like to open a pull request to your repository. Thank you.
