
object-deformnet's Introduction

Shape Prior Deformation for Categorical 6D Object Pose and Size Estimation

(teaser figure)

Overview

This repository contains the PyTorch implementation of the paper "Shape Prior Deformation for Categorical 6D Object Pose and Size Estimation" (arXiv). Our approach recovers the 6D pose and size of unseen objects from an RGB-D image and reconstructs their complete 3D models.

Dependencies

  • Python 3.6
  • PyTorch 1.0.1
  • CUDA 9.0

Installation

ROOT=/path/to/object-deformnet
cd $ROOT/lib/nn_distance
python setup.py install --user
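
A quick way to confirm the extension built correctly is to try importing it. The module name below is an assumption based on the directory name; check lib/nn_distance/setup.py for the name actually registered.

# Minimal sanity check that the compiled extension can be imported.
try:
    import nn_distance  # hypothetical extension name; see lib/nn_distance/setup.py
    print("nn_distance extension imported successfully")
except ImportError as err:
    print("extension not found, re-run `python setup.py install --user`:", err)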

Datasets

Download camera_train, camera_val, real_train, real_test, ground-truth annotations, and mesh models provided by NOCS.
Unzip and organize these files in $ROOT/data as follows:

data
├── CAMERA
│   ├── train
│   └── val
├── Real
│   ├── train
│   └── test
├── gts
│   ├── val
│   └── real_test
└── obj_models
    ├── train
    ├── val
    ├── real_train
    └── real_test

Run the preprocessing scripts to prepare the datasets:

cd $ROOT/preprocess
python shape_data.py
python pose_data.py

Note that running these scripts will additionally shift and re-scale the models of the mug category (without modifying the original files), so that the origin of the object coordinate frame lies on the axis of symmetry. This step was implemented for one of our early experiments and turned out to be unnecessary; skipping it should make no difference to the performance of our approach. We keep it in this repo for reproducibility.
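
For reference, the shift simply moves the origin onto the mug's symmetry axis. A minimal numpy sketch of the idea, using the bounding-box center in the x-z plane as a rough approximation (illustrative only; the actual computation in the preprocessing scripts may differ):

import numpy as np

def recenter_on_symmetry_axis(vertices):
    # vertices: (N, 3) array; assumes the symmetry axis is parallel to the y axis
    xz_min = vertices[:, [0, 2]].min(axis=0)
    xz_max = vertices[:, [0, 2]].max(axis=0)
    xz_center = (xz_min + xz_max) / 2.0
    shifted = vertices.copy()
    shifted[:, 0] -= xz_center[0]   # origin now lies on the x = 0 plane of the axis
    shifted[:, 2] -= xz_center[1]   # and on the z = 0 plane
    return shifted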

Training

# optional - train an Autoencoder from scratch and prepare the shape priors
python train_ae.py
python mean_shape.py

# train DeformNet
python train_deform.py

Evaluation

Download the pre-trained models, segmentation results from Mask R-CNN, and predictions of NOCS from here.

unzip -q deformnet_eval.zip
mv deformnet_eval/* $ROOT/results
rmdir deformnet_eval
cd $ROOT
python evaluate.py

Citation

If you find our work helpful, please consider citing:

@InProceedings{Tian_2020_ECCV,
  author = {Tian, Meng and Ang Jr, Marcelo H and Lee, Gim Hee},
  title = {Shape Prior Deformation for Categorical 6D Object Pose and Size Estimation},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  month = {August},
  year = {2020}
}

Acknowledgment

Our implementation leverages the code from NOCS and 3PU.


object-deformnet's Issues

Confusion about Evaluation Metric

Hello, thanks for your great work. I don't have a thorough understanding of the evaluation metric, and I hope to get your help. I only found the definition of mAP for 2D object detection. Is the calculation of mAP for 3D IoU and n°, m cm the same as for 2D detection mAP, except that the criterion for deciding true and false positives changes from 2D IoU to 3D IoU or the n°, m cm pose error? In addition, what does Acc mean?
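
For context, here is a rough sketch of how average precision at a single threshold could be computed; it mirrors the usual detection-style mAP, with the true-positive criterion swapped for 3D IoU or the n°, m cm pose error (this is an illustration, not the repo's evaluate.py). Acc presumably denotes the plain fraction of instances whose prediction falls within the threshold, but evaluate.py is the authoritative definition.

import numpy as np

def average_precision(scores, is_true_positive, num_gt):
    # scores: (N,) detection confidences
    # is_true_positive: (N,) bool, True if the detection matched an unmatched GT
    #   under the chosen criterion (3D IoU > threshold, or rot < n deg and trans < m cm)
    # num_gt: number of ground-truth instances
    order = np.argsort(-scores)                  # sort detections by confidence
    tp = np.cumsum(is_true_positive[order])
    fp = np.cumsum(~is_true_positive[order])
    recall = tp / max(num_gt, 1)
    precision = tp / np.maximum(tp + fp, 1)
    return np.trapz(precision, recall)           # area under the PR curve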

About pose_data.py

I am sorry to trouble you. I was confused when I ran pose_data.py. Is it used to relabel and write a new .pkl file only for the mug category? What about the other categories?
I modified the path from ../data/results/nocs_results/val/ to ../data/gts/val/.
Then, as you suggested, I added the following code:

            if 'handle_visibility' in nocs:
                gt_handle_visibility = nocs['handle_visibility']
                assert len(nocs['handle_visibility']) == len(gt_class_ids)
            else:
                gt_handle_visibility = np.ones_like(gt_class_ids)

But there are still errors.
I checked the results_real_test_{}_{}.pkl files at ./NOCS/gt/real_test. They only have the key 'gt_RTs', like the following:

{'gt_RTs': array([[[-0.08540888, -0.06049078, -0.15215242, 0.42124944],
[ 0.11055115, -0.14789181, -0.00325966, -0.26176407],
[-0.12078065, -0.09259071, 0.10460977, 0.94286811],
[ 0. , 0. , 0. , 1. ]],

   [[-0.17664521, -0.06370935, -0.04728206, -0.14555978],
    [ 0.07886463, -0.15358652, -0.08769011, -0.24302094],
    [-0.008651  , -0.09924865,  0.16605093,  1.18763963],
    [ 0.        ,  0.        ,  0.        ,  1.        ]],

   [[ 0.48531236, -0.13263202, -0.12733575, -0.30494184],
    [-0.01690129, -0.39007853,  0.34188713,  0.05244928],
    [-0.18308481, -0.3155648 , -0.36909733,  0.91378847],
    [ 0.        ,  0.        ,  0.        ,  1.        ]],

   [[-0.20929454, -0.06829979, -0.00570323,  0.15795401],
    [ 0.06239159, -0.18228103, -0.10668883, -0.32206444],
    [ 0.02836677, -0.10300651,  0.19257889,  1.12987919],
    [ 0.        ,  0.        ,  0.        ,  1.        ]],

   [[ 0.16513061, -0.06186272,  0.08991979, -0.24412596],
    [-0.10036745, -0.15014064,  0.08102373, -0.07852249],
    [ 0.04288286, -0.11318774, -0.15662159,  1.02920438],
    [ 0.        ,  0.        ,  0.        ,  1.        ]],

   [[ 0.17068051, -0.06556304,  0.01535379,  0.04655275],
    [-0.06110246, -0.13321694,  0.11038941, -0.17303414],
    [-0.02829731, -0.10779987, -0.14575519,  1.08886987],
    [ 0.        ,  0.        ,  0.        ,  1.        ]],

   [[-0.14795024, -0.10026765, -0.18662856,  0.28520438],
    [ 0.17137876, -0.19046816, -0.03353048, -0.24504754],
    [-0.1245519 , -0.14297327,  0.17555237,  1.03897479],
    [ 0.        ,  0.        ,  0.        ,  1.        ]]]), 'image_path': '/home/hewang/Projects/CoordRCNN/data/shapenet_toi_330K/real_test/scene_4/0448_color.png'}

Question about visualization results

Dear author, when I used the pre-trained model to visualize the pose results, I found that the visualization of some images was very strange, as shown in the figures below (the red lines). Is this normal? Why does this happen?

(attached images: 00001_0002_pred, 00001_0004_pred, 00004_0004_pred)

install _pickle

Hello, I encountered an environment problem when reproducing your project, concerning import _pickle as cPickle. When I use pip install _pickle to install this package, the system reports: Invalid requirement: '_pickle'. Can you tell me how to install this package? Can you share your environment file? Thank you very much!

`pose_data.py` throws an error because it needs the data used for the evaluation step.

The following line in pose_data.py assumes data that is downloaded in the evaluation section. I think the README.md should be updated accordingly.

nocs_dir = os.path.join(os.path.dirname(data_dir), 'results/nocs_results')
if source == 'CAMERA':
    nocs_path = os.path.join(nocs_dir, 'val', 'results_val_{}_{}.pkl'.format(
        img_path.split('/')[-2], img_path.split('/')[-1]))
else:
    nocs_path = os.path.join(nocs_dir, 'real_test', 'results_test_{}_{}.pkl'.format(
        img_path.split('/')[-2], img_path.split('/')[-1]))

No such file or directory: 'data/CAMERA/val_list.txt'.

Sorry to bother you. When I run evaluate.py, the following problem occurred:

FileNotFoundError: [Errno 2] No such file or directory: 'data/CAMERA/val_list.txt'

I did not find this file in the dataset. Can you provide val_list.txt and train_list.txt?

When I run train_ae.py, the data/obj_models/ShapeNetCore_4096.h5 file is not in the dataset either.

KeyError about object size

Hi! When I run pose_data.py, in annotate_test_data(data_dir), there is a KeyError at sizes[i] = model_sizes[model_list[i]]. I checked the results and found that model_list[i], e.g. e56e77c6eb21d9bdf577ff4de1ac394c, is missing from the keys of the model_sizes dict. Do you know why?

Error with obj_models/val/02876657/d3b53f56b4a7b3b3c9f016d57db96408/model.obj when running shape_data.py and pose_data.py

Dear author, thank you for releasing such wonderful work!
I met a problem when running shape_data.py to preprocess the object models.
The info is as follows.

Traceback (most recent call last):
  File "/home/user/object-deformnet/preprocess/shape_data.py", line 217, in <module>
    save_nocs_model_to_file(obj_model_dir)
  File "/home/user/object-deformnet/preprocess/shape_data.py", line 33, in save_nocs_model_to_file
    model_points = sample_points_from_mesh(path_to_mesh_model, 1024, fps=True, ratio=3)
  File "../lib/utils.py", line 147, in sample_points_from_mesh
    points = uniform_sample(vertices, faces, ratio*n_pts, with_normal)
  File "../lib/utils.py", line 100, in uniform_sample
    faces = vertices[faces]
IndexError: arrays used as indices must be of integer (or boolean) type


For now I chose to skip this case in shape_data.py, but the same error still shows up in pose_data.py (see the screenshot). What should I do? Should I skip it there as well?
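
A possible workaround, assuming the failing model simply stores its face indices as floats (a hypothetical local patch to uniform_sample in lib/utils.py, not an official fix):

# before the line `faces = vertices[faces]` in lib/utils.py
faces = np.asarray(faces)
if not np.issubdtype(faces.dtype, np.integer):
    faces = faces.astype(np.int64)   # cast float indices so they can index the vertex array
faces = vertices[faces]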

error about obj_models in "val/02876657/d3b53f56b4a7b3b3c9f016d57db96408"

Dear author, thank you for releasing such wonderful work!
I met a problem when running shape_data.py to preprocess the object models.
The info is as follows.

Traceback (most recent call last):
  File "/home/user/object-deformnet/preprocess/shape_data.py", line 217, in <module>
    save_nocs_model_to_file(obj_model_dir)
  File "/home/user/object-deformnet/preprocess/shape_data.py", line 33, in save_nocs_model_to_file
    model_points = sample_points_from_mesh(path_to_mesh_model, 1024, fps=True, ratio=3)
  File "../lib/utils.py", line 147, in sample_points_from_mesh
    points = uniform_sample(vertices, faces, ratio*n_pts, with_normal)
  File "../lib/utils.py", line 100, in uniform_sample
    faces = vertices[faces]
IndexError: arrays used as indices must be of integer (or boolean) type

It seems that the model.obj in "obj_models/val/02876657/d3b53f56b4a7b3b3c9f016d57db96408" is not correct. How can I solve this problem?


Is there an additional dataset to finetune Mask R-CNN?

Hi, this is really amazing work and the proposed method boosts the performance a lot.
But there are no 'camera' and 'can' categories in the COCO dataset. Could I ask whether you added other datasets to finetune the Mask R-CNN model? Or could you share how you finetuned or trained it?

Question about evaluation performance

I downloaded and used your pretrained model, but I couldn't get the reported result.
Q1. Can I ask which part I missed?

Total images: 2754
Valid images: 2754,  Total instances: 15573,  Average: 5.65/image
Inference time: 86.035171  Average: 0.031240/image
Umeyama time: 104.452625  Average: 0.037928/image
Total time: 323.474025
mAP:
3D IoU at 25: 81.9
3D IoU at 50: 71.0
3D IoU at 75: 43.1
5 degree, 2cm: 11.4
5 degree, 5cm: 12.0
10 degree, 2cm: 33.5
10 degree, 5cm: 37.8
Acc:
3D IoU at 25: 90.3
3D IoU at 50: 82.9
3D IoU at 75: 57.5
5 degree, 2cm: 23.3
5 degree, 5cm: 24.5
10 degree, 2cm: 45.5
10 degree, 5cm: 50.2

(attached figure: mAP)

Problem when I try to train this model

Thanks for your great work. I am running into a problem when I try to train the model.

I get this error message: Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)

I can run evaluation with this model without any error, but I cannot train it.

Evaluation values differ from the paper on the REAL dataset

When I run the pretrained model you provided for evaluation, I get the following results, which do not match the paper. The CAMERA results are the same, but the REAL results differ, as shown below.
val:
mAP:
3D IoU at 25: 94.4
3D IoU at 50: 93.2
3D IoU at 75: 83.3
5 degree, 2cm: 54.3
5 degree, 5cm: 59.0
10 degree, 2cm: 73.2
10 degree, 5cm: 81.5
Acc:
3D IoU at 25: 85.1
3D IoU at 50: 84.5
3D IoU at 75: 78.9
5 degree, 2cm: 67.6
5 degree, 5cm: 71.3
10 degree, 2cm: 82.3
10 degree, 5cm: 88.1

real_test
mAP:
3D IoU at 25: 82.1
3D IoU at 50: 71.2
3D IoU at 75: 42.9
5 degree, 2cm: 11.5
5 degree, 5cm: 12.1
10 degree, 2cm: 33.6
10 degree, 5cm: 38.0
Acc:
3D IoU at 25: 90.4
3D IoU at 50: 83.0
3D IoU at 75: 57.5
5 degree, 2cm: 23.6
5 degree, 5cm: 24.8
10 degree, 2cm: 45.6
10 degree, 5cm: 50.5

ShapeNetCore_4096.h5

Hello, thank you very much for your excellent work. Could you please provide ShapeNetCore_4096.h5?

Error when running pose_data.py

(screenshot of the error)

During data preprocessing, I encountered the error shown above when running pose_data.py. I downloaded the data following the links on GitHub and made sure all the formats are correct.
Should I write separate code to handle cases where the scale value is None?

Problems reproducing results in a new environment

Hello, thank you for your great work. I used the pre-trained model you provided for evaluation and got results similar to the paper. But when I retrained the model in my own environment and then tested it, I encountered some problems.
I hope to get your help: 1) the best result did not appear at epoch 50 (CAMERA: epoch 48, Real: epoch 52); 2) the test results cannot reach the same performance as the model you provided.
Thanks in advance!

My operating environment is under Anaconda:
python 3.8.0
pytorch 1.7.0
cuda 11.0.221
GPU: NVIDIA RTX 3090
My test results are as follows:
CAMERA: (screenshot of results)
Real: (screenshot of results)

About ground truth size in real_test

Hi, thank you for sharing your code! I have been reading it recently.

I am wondering about the following code in the save_nocs_model_to_file(obj_model_dir) function in preprocess/shape_data.py:
scale = np.linalg.norm(bbox_dims)
model_points = sample_points_from_mesh(inst_path, 1024, fps=True, ratio=3)
model_points /= scale

Thus, I am confused: why is the ground-truth size in the real_test dataset the normalized size rather than the actual size?
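
For reference, the snippet above divides the model by the diagonal length of its tight bounding box. Assuming the stored ground-truth size is this normalized size (as the question suggests), the metric size can be recovered by multiplying back by the per-instance scale. A small illustrative sketch with made-up numbers:

import numpy as np

bbox_dims = np.array([0.10, 0.15, 0.08])   # hypothetical metric extents of a model
scale = np.linalg.norm(bbox_dims)          # diagonal length used for normalization
normalized_size = bbox_dims / scale        # size as stored after preprocessing
metric_size = normalized_size * scale      # actual size, recovered from the scale
print(normalized_size, metric_size)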

Something wrong in computing 3D IoU

def transform_coordinates_3d(coordinates, sRT):
    """
    Args:
        coordinates: [3, N]
        sRT: [4, 4]
    Returns:
        new_coordinates: [3, N]
    """
    assert coordinates.shape[0] == 3
    coordinates = np.vstack([coordinates, np.ones((1, coordinates.shape[1]), dtype=np.float32)])
    new_coordinates = sRT @ coordinates
    new_coordinates = new_coordinates[:3, :] / new_coordinates[3, :]
    return new_coordinates

def compute_3d_IoU(sRT_1, sRT_2, size_1, size_2, class_name_1, class_name_2, handle_visibility):
    """ Computes IoU overlaps between two 3D bboxes. """
    def asymmetric_3d_iou(sRT_1, sRT_2, size_1, size_2):
        noc_cube_1 = get_3d_bbox(size_1, 0)
        bbox_3d_1 = transform_coordinates_3d(noc_cube_1, sRT_1)
        noc_cube_2 = get_3d_bbox(size_2, 0)
        bbox_3d_2 = transform_coordinates_3d(noc_cube_2, sRT_2)

        bbox_1_max = np.amax(bbox_3d_1, axis=0)
        bbox_1_min = np.amin(bbox_3d_1, axis=0)
        bbox_2_max = np.amax(bbox_3d_2, axis=0)
        bbox_2_min = np.amin(bbox_3d_2, axis=0)

        overlap_min = np.maximum(bbox_1_min, bbox_2_min)
        overlap_max = np.minimum(bbox_1_max, bbox_2_max)

        # intersections and union
        if np.amin(overlap_max - overlap_min) < 0:
            intersections = 0
        else:
            intersections = np.prod(overlap_max - overlap_min)
        union = np.prod(bbox_1_max - bbox_1_min) + np.prod(bbox_2_max - bbox_2_min) - intersections
        overlaps = intersections / union
        return overlaps

When I debugged the method asymmetric_3d_iou(), I found that the shape of bbox_3d_1 is (3, 8), as described in transform_coordinates_3d(). Therefore the shape of bbox_1_max is (8,), and so are bbox_1_min, bbox_2_max, and bbox_2_min.

But I think their shapes should be (3,), representing the per-axis bounds of the 3D boxes.

And I did a test using method asymmetric_3d_iou(). Results are as follows:

  • using bbox_1_max = np.amax(bbox_3d_1, axis=0), whose output shape is (8,):
from lib.utils import compute_3d_IoU
import numpy as np
pose = np.eye(4, dtype=np.float32)
pose[:, 3] = np.array([1, 1, 1, 1])
print(compute_3d_IoU(pose, pose, [1, 1, 1], [0.5, 0.5, 0.5], 'laptop', 'laptop', 1))

## output : nan
  • using bbox_1_max = np.amax(bbox_3d_1, axis=1), whose output shape is (3,):
from lib.utils import compute_3d_IoU
import numpy as np
pose = np.eye(4, dtype=np.float32)
pose[:, 3] = np.array([1, 1, 1, 1])
print(compute_3d_IoU(pose, pose, [1, 1, 1], [0.5, 0.5, 0.5], 'laptop', 'laptop', 1))

## output : 0.125
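
For what it's worth, a minimal sketch of the change being suggested (not an official patch): since the corner array is laid out as [3, N], the per-axis bounds should be reduced over the corner dimension.

# corners are [3, 8]: rows are x/y/z, columns are the 8 box corners,
# so reducing over axis=1 gives per-axis bounds of shape (3,)
bbox_1_max = np.amax(bbox_3d_1, axis=1)
bbox_1_min = np.amin(bbox_3d_1, axis=1)
bbox_2_max = np.amax(bbox_3d_2, axis=1)
bbox_2_min = np.amin(bbox_3d_2, axis=1)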

About dataset

Dear author, in your paper you mentioned that you treat the mug as a symmetric object when the handle is unseen. But how do you obtain the handle visibility of mugs in the real_test dataset? Thanks very much.

Wrong results when I run evaluate.py

I tested on the real_test data and got a bad result, which is far from the correct one. Can anyone tell me what is wrong?

100%|██████████| 2754/2754 [03:08<00:00, 14.59it/s]
mAP:
3D IoU at 25: 28.9
3D IoU at 50: 1.3
3D IoU at 75: 0.0
5 degree, 2cm: 0.0
5 degree, 5cm: 0.0
10 degree, 2cm: 0.0
10 degree, 5cm: 0.0
Acc:
3D IoU at 25: 46.1
3D IoU at 50: 5.5
3D IoU at 75: 0.0
5 degree, 2cm: 0.0
5 degree, 5cm: 0.0
10 degree, 2cm: 0.1
10 degree, 5cm: 0.2

Process finished with exit code 0

(attached image: my_0_pred)

Question about ground-truth NOCS

Thanks for sharing this good work.
I have a very simple question about the ground-truth NOCS.

As I understand it, NOCS is a normalized coordinate space and does not itself require 6D pose information (R, T).
In the paper, the ground-truth NOCS coordinates are generated by rendering the object model at its 6D pose.
Could you explain why the 6D pose and image rendering are needed?
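
A rough sketch of the relationship, as I understand it (my own illustration, not the repo's data pipeline): the ground-truth NOCS value at a pixel is the normalized object-frame coordinate of the surface point seen at that pixel, so the pose is needed to know which model point is visible where; equivalently, a point observed in the camera frame can be mapped back into NOCS through the pose.

import numpy as np

def camera_point_to_nocs(p_cam, R, t, s):
    # p_cam: (3,) surface point in the camera frame
    # R: (3, 3) rotation, t: (3,) translation, s: scalar scale,
    #   assuming the convention p_cam = s * R @ p_nocs + t
    return R.T @ (p_cam - t) / s   # normalized object-frame coordinate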

Error When Running shape_data.py

When I run shape_data.py, I get this error:
File "/home/lsl/GenPose/GenPose/preprocess/../lib/utils.py", line 104, in uniform_sample
faces = vertices[faces]
IndexError: arrays used as indices must be of integer (or boolean) type
My environment is:
Ubuntu 20.04
Python 3.8.15
Pytorch 1.12.0
Pytorch3d 0.7.2
CUDA 11.3

So I want to know: do I need to match your environment?

dataset download link not working

Good afternoon, thank you for your work. I would very much like to run training, but the link to the datasets does not work today.
Perhaps you saved the datasets on Google Drive, or know of some other workaround?
I also tried downloading as recommended in hughw19/NOCS_CVPR2019#59
