
CPPF: Towards Robust Category-Level 9D Pose Estimation in the Wild

Yang You, Ruoxi Shi, Weiming Wang, Cewu Lu

CVPR 2022

Paper PDF | Project Page | Video

CPPF is a pure sim-to-real method that achieves 9D pose estimation in the wild. Our model is trained solely on ShapeNet synthetic models (without any real-world background pasting) and can be directly applied to real-world scenarios (e.g., NOCS REAL275, SUN RGB-D). CPPF achieves this by using only local $SE(3)$-invariant geometric features and a bottom-up voting scheme, which is quite different from previous end-to-end learning methods. Our model is robust to noise and can produce decent predictions even when only bounding box masks are provided.

News

  • [2024.07] Check out our new object pose estimation benchmark PACE (ECCV 2024).
  • [2024.04] Check out CPPF++ (TPAMI) for even better results in the wild!
  • [2022.03] Another detection-by-voting method of ours, Canonical Voting, which achieves SoTA on ScanNet, SceneNN, and SUN RGB-D, has also been accepted to CVPR 2022.

Change Logs

  • [2022.05.05] Fixed a problem in scale target computation.


Overview

This is the official code implementation of CPPF, including both training and testing. Inference on custom datasets is also supported.

Installation

You can run the following commands to set up an environment (tested on Ubuntu 18.04):

# Create conda environment
conda create -n cppf python=3.8

# Install PyTorch
conda install pytorch torchvision cudatoolkit=10.2 -c pytorch-lts

# Install other dependencies
pip install tqdm opencv-python scipy matplotlib open3d==0.12.0 hydra-core pyrender cupy-cuda102 PyOpenGL-accelerate OpenEXR
CXX=g++-7 CC=gcc-7 pip install MinkowskiEngine==0.5.4 -v

Miscellaneous

Note that we use pyrender with OSMesa support; you may need to install OSMesa after running pip install pyrender. More details can be found here.
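
For headless rendering, pyrender's OSMesa backend is selected by setting the PYOPENGL_PLATFORM environment variable before pyrender is imported; a minimal sketch:

import os
os.environ['PYOPENGL_PLATFORM'] = 'osmesa'  # must be set before pyrender/PyOpenGL are imported
import pyrender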

MinkowskiEngine appends its package path to sys.path (i.e., PYTHONPATH), and that path contains a module named utils. To avoid clashing with our own utils package, you should import MinkowskiEngine after importing utils.
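
For example (a minimal sketch; utils here refers to this repository's own package):

import utils                   # import the repo's own utils package first
import MinkowskiEngine as ME   # imported afterwards so its bundled `utils` module cannot shadow ours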

Train on ShapeNet Objects

Data Preparation

Download the ShapeNet v2 dataset and set the shapenet_root key in config/config.yaml to the location of the dataset.
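
For example, the entry in config/config.yaml could look like this (the path below is a placeholder for your local copy):

# config/config.yaml (excerpt)
shapenet_root: /path/to/ShapeNetCore.v2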

Train on NOCS REAL275 objects

To train on synthetic ShapeNet objects that appear in NOCS REAL275, run:

python train.py category=bottle,bowl,camera,can,laptop,mug -m

For laptops, an auxiliary segmentation is needed to ensure a unique pose; please refer to Auxiliary Segmentation for Laptops.

Train on SUN RGB-D objects

To train on synthetic ShapeNet objects that appear in SUN RGB-D, run:

python train.py category=bathtub,bed,bookshelf,chair,sofa,table -m

Auxiliary Segmentation for Laptops

For laptops, geometry alone cannot determine the pose unambiguously, so we rely on an auxiliary segmentation network that segments out the lid and the keyboard base.

To train the segmentation network, first download our physically rendered laptop images (rendered with Blender) from Google Drive and place them under data/laptop. Then run the following command:

python train_laptop_aux.py

Pretrained Models

Pretrained models for various ShapeNet categories can be downloaded from Google Drive.

Test on NOCS REAL275

Data Preparation

First download the detection priors from Google Drive, which are used for evaluation with instance segmentation or bounding box masks. Put the directory under data/nocs_seg.

Then download the RGB-D images from the NOCS REAL275 dataset and put them under data/nocs.

Place (pre-)trained models under checkpoints.

Evaluate with Instance Segmentation Mask

First save inference outputs:

python nocs/inference.py --adaptive_voting

Then evaluate mAP:

python nocs/eval.py | tee nocs/map.txt

Evaluate with Bounding Box Mask

First save inference outputs with bounding box mask enabled:

python nocs/inference.py --bbox_mask --adaptive_voting

Then evaluate mAP:

python nocs/eval.py | tee nocs/map_bbox.txt

Zero-Shot Instance Segmentation and Pose Estimation

For this task, due to memory limitations, we use a regression-based network. You can go through the process by running the Jupyter notebook nocs/zero_shot.ipynb.
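
Assuming Jupyter is installed in the environment, the notebook can be opened with:

jupyter notebook nocs/zero_shot.ipynb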

Test on SUN RGB-D

Data Preparation

We follow the same data preparation process as VoteNet. First download the SUN RGB-D v2 data (SUNRGBD.zip, SUNRGBDMeta2DBB_v2.mat, SUNRGBDMeta3DBB_v2.mat) and the toolbox (SUNRGBDtoolbox.zip). Move all downloaded files under data/OFFICIAL_SUNRGBD and unzip the zip files.
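
For example (a sketch of the expected placement, assuming the files were downloaded to the current directory):

mkdir -p data/OFFICIAL_SUNRGBD
mv SUNRGBD.zip SUNRGBDMeta2DBB_v2.mat SUNRGBDMeta3DBB_v2.mat SUNRGBDtoolbox.zip data/OFFICIAL_SUNRGBD/
cd data/OFFICIAL_SUNRGBD && unzip SUNRGBD.zip && unzip SUNRGBDtoolbox.zip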

Download the prepared extra data for SUN RGB-D from Google Drive, and move it under data/sunrgbd_extra. Unzip the zip files.

Evaluate with Instance Segmentation Mask

First save inference outputs:

python sunrgbd/inference.py

Then evaluate mAP:

python sunrgbd/eval.py | tee sunrgbd/map.txt

Train on Your Own Object Collections

Configuration Explained

To train on custom objects, it is necessary to understand some parameters in configuration files.

  • up_sym: Whether the object is rotationally symmetric around the up axis, i.e., looks like a cylinder when viewed from top to bottom (e.g., bottles). This ensures the voting target is unambiguous.
  • right_sym: Whether the object is rotationally symmetric around the right axis, i.e., looks like a cylinder when viewed from left to right (e.g., rolls). This ensures the voting target is unambiguous.
  • regress_right: Whether to predict the right axis. Some symmetric objects have only a well-defined up axis (e.g., bowls, bottles), while others also have a well-defined right axis (e.g., laptops, mugs).
  • z_right: Whether the objects are placed such that the right axis is [0, 0, 1] (default: [1, 0, 0]).

Voting Statistics Generation

Next, you need to determine scale_range (used for data augmentation; it controls the possible object scales along the bounding-box diagonal), vote_range (the range of the center voting targets $\mu,\nu$), and scale_mean (the average 3D scale, used for scale voting). To generate them, you may refer to gen_stats.py.
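
A minimal sketch of the idea, assuming the canonical object models are available as N x 3 numpy point clouds (this is only an illustration, not gen_stats.py itself; the exact definitions used by the repo may differ):

import numpy as np

def rough_scale_stats(point_clouds):
    # point_clouds: list of (N, 3) arrays, each an object model in its canonical pose
    extents, diagonals = [], []
    for pc in point_clouds:
        extent = pc.max(axis=0) - pc.min(axis=0)   # per-axis bounding-box size
        extents.append(extent)
        diagonals.append(np.linalg.norm(extent))   # bounding-box diagonal length
    scale_range = [float(np.min(diagonals)), float(np.max(diagonals))]  # diagonal range, for augmentation
    scale_mean = np.mean(extents, axis=0).tolist()                      # average 3D scale, for scale voting
    return scale_range, scale_mean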

Write Configuration Files and Train

After you have prepared the necessary configuration values and voting statistics, write your own configuration file similar to those in config/category, and then run train.py.
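
As a rough illustration, a category configuration could look like the following (hypothetical file name and values; check the existing files under config/category for the exact schema):

# config/category/my_object.yaml (hypothetical example)
up_sym: false
right_sym: false
regress_right: true
z_right: false
scale_range: [0.2, 0.6]         # possible scales along the bounding-box diagonal, for augmentation
vote_range: [0.3, 0.2]          # range of the center voting targets mu, nu
scale_mean: [0.25, 0.15, 0.30]  # average 3D scale, for scale voting

With such a file in place, training follows the same pattern as above, e.g., python train.py category=my_object.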

Citation

@inproceedings{you2022cppf,
  title={CPPF: Towards Robust Category-Level 9D Pose Estimation in the Wild},
  author={You, Yang and Shi, Ruoxi and Wang, Weiming and Lu, Cewu},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2022}
}


cppf's Issues

ground truth segmentation of extra_sunrgbd data is not right

Hi, thanks for sharing the code of this excellent work.
I ran sunrgbd/inference.py and saved both the original back-projected points and the segmented points.
For the bed instance, the yellow points are the original back-projected points (from bed_pc.npz), and the blue points are those selected using bed_segment.pkl. It seems that the blue points lose some parts when indexing the original back-projected points with bed_segment.pkl.
Is there something wrong with bed_segment.pkl?

About the experiments

I followed the steps in this repository but could not reproduce the results reported in the paper; the results are shown in the attached screenshots.

Question about the SUNRGBD results

Hi there!
Sorry for bothering you again!
After re-evaluating the models on the SUN RGB-D dataset with the full-rot setting, we get the results below.
They are quite different from the results reported in the paper, especially the translation and rotation errors.
Any suggestions for this situation?
Thank you!
PS: the no-full-rot setting shows the same behavior.

Per-category mAP:

Category  | 3D IoU@10 | 3D IoU@25 | 3D IoU@50 | 3D IoU@75 | 20°, 10cm | 40°, 20cm | 60°, 30cm
bed       | 35.0      | 11.1      | 0.0       | 0.0       | 0.0       | 0.0       | 0.2
table     | 8.1       | 1.5       | 0.0       | 0.0       | 0.0       | 0.1       | 0.6
sofa      | 42.8      | 17.9      | 1.0       | 0.0       | 0.0       | 0.0       | 0.0
chair     | 35.0      | 13.4      | 0.5       | 0.0       | 0.0       | 0.4       | 1.0
bookshelf | 8.4       | 0.4       | 0.0       | 0.0       | 0.0       | 0.1       | 0.5
bathtub   | 59.3      | 27.7      | 3.1       | 0.0       | 0.1       | 0.1       | 0.1

the groundtruth in ShapeNetDataset

Hello, I have been looking at the ShapeNetDataset class in your code recently, where the translation, rotation, and scale targets seem to be represented as classification labels. May I ask which part of the code defines the ground truth of the translation, rotation, and scale in matrix form? Thanks!

Question about the evaluation

Dear Author,
We noticed that you changed your evaluation code from NOCS because the original way to calculate 3D IoU is buggy.
However, I am wondering why you changed all of your evaluation code rather than just replacing the function and keeping the rest of the evaluation code:

def asymmetric_3d_iou(RT_1, RT_2, size_1, size_2):

In my experiments, your model reports the following metrics with your evaluation code, which match the results reported in your paper:

mAP 25 | mAP 50 | mAP 75 | 10cm | 10°  | 10° 10cm | 5° 5cm | 10° 5cm
78.5   | 26     | 0.1    | 99.9 | 48.7 | 48.2     | 17.4   | 17.7

but reports this result when simply replacing asymmetric_3d_iou:

mAP 25 | mAP 50 | mAP 75 | 10cm | 10°  | 10° 10cm | 5° 5cm | 10° 5cm
82.9   | 72.3   | 10.3   | 98.1 | 47   | 46.2     | 16.5   | 44.4

Could you explain why you changed your evaluation code in this way? Thanks a lot!

The missing checkpoints

Dear author,
We found that the checkpoints for the SUN RGB-D classes 'toilet': 4, 'desk': 5, 'dresser': 6, and 'night_stand': 7 were not provided. Could you please provide those checkpoints on Google Drive? Thanks a lot!
