
object-deformnet's Introduction

Shape Prior Deformation for Categorical 6D Object Pose and Size Estimation

(teaser figure)

Overview

This repository contains the PyTorch implementation of the paper "Shape Prior Deformation for Categorical 6D Object Pose and Size Estimation" (arXiv). Our approach recovers the 6D pose and size of unseen objects from an RGB-D image and reconstructs their complete 3D models.

Dependencies

  • Python 3.6
  • PyTorch 1.0.1
  • CUDA 9.0

Installation

ROOT=/path/to/object-deformnet
cd $ROOT/lib/nn_distance
python setup.py install --user
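
A quick way to confirm the extension built correctly is to try importing it. The module name below is an assumption based on the directory name; check lib/nn_distance/setup.py for the name actually registered.

# Minimal sanity check that the compiled extension can be imported.
try:
    import nn_distance  # hypothetical extension name; see lib/nn_distance/setup.py
    print("nn_distance extension imported successfully")
except ImportError as err:
    print("extension not found, re-run `python setup.py install --user`:", err)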

Datasets

Download camera_train, camera_val, real_train, real_test, ground-truth annotations, and mesh models provided by NOCS.
Unzip and organize these files in $ROOT/data as follows:

data
├── CAMERA
│   ├── train
│   └── val
├── Real
│   ├── train
│   └── test
├── gts
│   ├── val
│   └── real_test
└── obj_models
    ├── train
    ├── val
    ├── real_train
    └── real_test

Run the preprocessing scripts to prepare the datasets:

cd $ROOT/preprocess
python shape_data.py
python pose_data.py

Note that running these scripts will additionally shift and re-scale the models of the mug category (without modifying the original files), so that the origin of the object coordinate frame lies on the axis of symmetry. This step was implemented for one of our early experiments and turned out to be unnecessary; skipping it should make no difference to the performance of our approach. We keep it in this repo for reproducibility.
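
For reference, the shift simply moves the origin onto the mug's symmetry axis. A minimal numpy sketch of the idea, using the bounding-box center in the x-z plane as a rough approximation (illustrative only; the actual computation in the preprocessing scripts may differ):

import numpy as np

def recenter_on_symmetry_axis(vertices):
    # vertices: (N, 3) array; assumes the symmetry axis is parallel to the y axis
    xz_min = vertices[:, [0, 2]].min(axis=0)
    xz_max = vertices[:, [0, 2]].max(axis=0)
    xz_center = (xz_min + xz_max) / 2.0
    shifted = vertices.copy()
    shifted[:, 0] -= xz_center[0]   # origin now lies on the x = 0 plane of the axis
    shifted[:, 2] -= xz_center[1]   # and on the z = 0 plane
    return shifted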

Training

# optional - train an Autoencoder from scratch and prepare the shape priors
python train_ae.py
python mean_shape.py

# train DeformNet
python train_deform.py

Evaluation

Download the pre-trained models, segmentation results from Mask R-CNN, and predictions of NOCS from here.

unzip -q deformnet_eval.zip
mv deformnet_eval/* $ROOT/results
rmdir deformnet_eval
cd $ROOT
python evaluate.py

Citation

If you find our work helpful, please consider citing:

@InProceedings{Tian_2020_ECCV,
  author = {Tian, Meng and Ang Jr, Marcelo H and Lee, Gim Hee},
  title = {Shape Prior Deformation for Categorical 6D Object Pose and Size Estimation},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  month = {August},
  year = {2020}
}

Acknowledgment

Our implementation leverages the code from NOCS and 3PU.


object-deformnet's Issues

Confusion about Evaluation Metric

Hello, thanks for your great work. I don't have a thorough understanding of the evaluation metric, and I hope to get your help. I only found the definition of mAP for 2D object detection. Is the calculation of mAP for 3D IoU and n°, m cm the same as for 2D detection mAP, except that the criterion for deciding true and false positives changes from 2D IoU to 3D IoU or the n°, m cm pose error? In addition, what does Acc mean?
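
For context, here is a rough sketch of how average precision at a single threshold could be computed; it mirrors the usual detection-style mAP, with the true-positive criterion swapped for 3D IoU or the n°, m cm pose error (this is an illustration, not the repo's evaluate.py). Acc presumably denotes the plain fraction of instances whose prediction falls within the threshold, but evaluate.py is the authoritative definition.

import numpy as np

def average_precision(scores, is_true_positive, num_gt):
    # scores: (N,) detection confidences
    # is_true_positive: (N,) bool, True if the detection matched an unmatched GT
    #   under the chosen criterion (3D IoU > threshold, or rot < n deg and trans < m cm)
    # num_gt: number of ground-truth instances
    order = np.argsort(-scores)                  # sort detections by confidence
    tp = np.cumsum(is_true_positive[order])
    fp = np.cumsum(~is_true_positive[order])
    recall = tp / max(num_gt, 1)
    precision = tp / np.maximum(tp + fp, 1)
    return np.trapz(precision, recall)           # area under the PR curve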

About pose_data.py

I am sorry to trouble you. I was confused when I ran pose_data.py. Is it used to relabel and write a new .pkl file only for the mug category? What about the other categories?
I modified the path from ../data/results/nocs_results/val/ to ../data/gts/val/.
Then, as you suggested, I added the following code:

            if 'handle_visibility' in nocs:
                gt_handle_visibility = nocs['handle_visibility']
                assert len(nocs['handle_visibility']) == len(gt_class_ids)
            else:
                gt_handle_visibility = np.ones_like(gt_class_ids)

But there are still errors.
I checked the results_real_test_{}_{}.pkl files at ./NOCS/gt/real_test. They only have the key 'gt_RTs', like the following:

{'gt_RTs': array([[[-0.08540888, -0.06049078, -0.15215242, 0.42124944],
[ 0.11055115, -0.14789181, -0.00325966, -0.26176407],
[-0.12078065, -0.09259071, 0.10460977, 0.94286811],
[ 0. , 0. , 0. , 1. ]],

   [[-0.17664521, -0.06370935, -0.04728206, -0.14555978],
    [ 0.07886463, -0.15358652, -0.08769011, -0.24302094],
    [-0.008651  , -0.09924865,  0.16605093,  1.18763963],
    [ 0.        ,  0.        ,  0.        ,  1.        ]],

   [[ 0.48531236, -0.13263202, -0.12733575, -0.30494184],
    [-0.01690129, -0.39007853,  0.34188713,  0.05244928],
    [-0.18308481, -0.3155648 , -0.36909733,  0.91378847],
    [ 0.        ,  0.        ,  0.        ,  1.        ]],

   [[-0.20929454, -0.06829979, -0.00570323,  0.15795401],
    [ 0.06239159, -0.18228103, -0.10668883, -0.32206444],
    [ 0.02836677, -0.10300651,  0.19257889,  1.12987919],
    [ 0.        ,  0.        ,  0.        ,  1.        ]],

   [[ 0.16513061, -0.06186272,  0.08991979, -0.24412596],
    [-0.10036745, -0.15014064,  0.08102373, -0.07852249],
    [ 0.04288286, -0.11318774, -0.15662159,  1.02920438],
    [ 0.        ,  0.        ,  0.        ,  1.        ]],

   [[ 0.17068051, -0.06556304,  0.01535379,  0.04655275],
    [-0.06110246, -0.13321694,  0.11038941, -0.17303414],
    [-0.02829731, -0.10779987, -0.14575519,  1.08886987],
    [ 0.        ,  0.        ,  0.        ,  1.        ]],

   [[-0.14795024, -0.10026765, -0.18662856,  0.28520438],
    [ 0.17137876, -0.19046816, -0.03353048, -0.24504754],
    [-0.1245519 , -0.14297327,  0.17555237,  1.03897479],
    [ 0.        ,  0.        ,  0.        ,  1.        ]]]), 'image_path': '/home/hewang/Projects/CoordRCNN/data/shapenet_toi_330K/real_test/scene_4/0448_color.png'}

Question about visualization results

Dear author, when I used the pre-trained model to visualize the pose results, I found that the visualization of some images was very strange, as shown in the figures below (the red lines). Is this normal? Why does this happen?

(attached images: 00001_0002_pred, 00001_0004_pred, 00004_0004_pred)

install _pickle

Hello, I encountered an environment problem when reproducing your project, concerning import _pickle as cPickle. When I use pip install _pickle to install this package, the system reports: Invalid requirement: '_pickle'. Can you tell me how to install this package? Can you share your environment file? Thank you very much!

`pose_data.py` throws an error because it needs the data used for the evaluation step.

The following line in pose_data.py assumes data that is downloaded in the evaluation section. I think the README.md should be updated accordingly.

nocs_dir = os.path.join(os.path.dirname(data_dir), 'results/nocs_results')
if source == 'CAMERA':
    nocs_path = os.path.join(nocs_dir, 'val', 'results_val_{}_{}.pkl'.format(
        img_path.split('/')[-2], img_path.split('/')[-1]))
else:
    nocs_path = os.path.join(nocs_dir, 'real_test', 'results_test_{}_{}.pkl'.format(
        img_path.split('/')[-2], img_path.split('/')[-1]))

No such file or directory: 'data/CAMERA/val_list.txt'.

Sorry to bother you. When I run evaluate.py, the following problem occurred:

FileNotFoundError: [Errno 2] No such file or directory: 'data/CAMERA/val_list.txt'

I did not find this file in the dataset. Can you provide val_list.txt and train_list.txt?

When I run train_ae.py, the data/obj_models/ShapeNetCore_4096.h5 file is not in the dataset either.

KeyError about object size

Hi! When I run pose_data.py, in annotate_test_data(data_dir), there is a KeyError at sizes[i] = model_sizes[model_list[i]]. I checked the results and found that model_list[i], e.g. e56e77c6eb21d9bdf577ff4de1ac394c, is missing from the keys of the model_sizes dict. Do you know why?

Error with obj_models/val/02876657/d3b53f56b4a7b3b3c9f016d57db96408/model.obj when running shape_data.py and pose_data.py

Dear author, thank you for releasing such wonderful work!
I met a problem when running shape_data.py to preprocess the object models.
The info is as follows.

Traceback (most recent call last):
  File "/home/user/object-deformnet/preprocess/shape_data.py", line 217, in <module>
    save_nocs_model_to_file(obj_model_dir)
  File "/home/user/object-deformnet/preprocess/shape_data.py", line 33, in save_nocs_model_to_file
    model_points = sample_points_from_mesh(path_to_mesh_model, 1024, fps=True, ratio=3)
  File "../lib/utils.py", line 147, in sample_points_from_mesh
    points = uniform_sample(vertices, faces, ratio*n_pts, with_normal)
  File "../lib/utils.py", line 100, in uniform_sample
    faces = vertices[faces]
IndexError: arrays used as indices must be of integer (or boolean) type


For now I chose to skip this case in shape_data.py, but the same error still shows up in pose_data.py (see the screenshot). What should I do? Should I skip it there as well?
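
A possible workaround, assuming the failing model simply stores its face indices as floats (a hypothetical local patch to uniform_sample in lib/utils.py, not an official fix):

# before the line `faces = vertices[faces]` in lib/utils.py
faces = np.asarray(faces)
if not np.issubdtype(faces.dtype, np.integer):
    faces = faces.astype(np.int64)   # cast float indices so they can index the vertex array
faces = vertices[faces]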

error about obj_models in "val/02876657/d3b53f56b4a7b3b3c9f016d57db96408"

Dear author, thank you for releasing such wonderful work!
I met a problem when running shape_data.py to preprocess the object models.
The info is as follows.

Traceback (most recent call last):
  File "/home/user/object-deformnet/preprocess/shape_data.py", line 217, in <module>
    save_nocs_model_to_file(obj_model_dir)
  File "/home/user/object-deformnet/preprocess/shape_data.py", line 33, in save_nocs_model_to_file
    model_points = sample_points_from_mesh(path_to_mesh_model, 1024, fps=True, ratio=3)
  File "../lib/utils.py", line 147, in sample_points_from_mesh
    points = uniform_sample(vertices, faces, ratio*n_pts, with_normal)
  File "../lib/utils.py", line 100, in uniform_sample
    faces = vertices[faces]
IndexError: arrays used as indices must be of integer (or boolean) type

It seems that the model.obj in "obj_models/val/02876657/d3b53f56b4a7b3b3c9f016d57db96408" is not correct. How can I solve this problem?


Is there an additional dataset to finetune Mask R-CNN?

Hi, this is really amazing work and the proposed method boosts the performance a lot.
But there are no 'camera' and 'can' categories in the COCO dataset. Could I ask whether you added other datasets to finetune the Mask R-CNN model? Or could you share how you finetuned or trained it?

Question about evaluation performance

I downloaded and used your pretrained model, but I couldn't get the reported result.
Q1. Can I ask which part I missed?

Total images: 2754
Valid images: 2754,  Total instances: 15573,  Average: 5.65/image
Inference time: 86.035171  Average: 0.031240/image
Umeyama time: 104.452625  Average: 0.037928/image
Total time: 323.474025
mAP:
3D IoU at 25: 81.9
3D IoU at 50: 71.0
3D IoU at 75: 43.1
5 degree, 2cm: 11.4
5 degree, 5cm: 12.0
10 degree, 2cm: 33.5
10 degree, 5cm: 37.8
Acc:
3D IoU at 25: 90.3
3D IoU at 50: 82.9
3D IoU at 75: 57.5
5 degree, 2cm: 23.3
5 degree, 5cm: 24.5
10 degree, 2cm: 45.5
10 degree, 5cm: 50.2

(attached figure: mAP)

Problem when I try to train this model

Thanks for your great work. I am running into a problem when I try to train the model.

I get this error message: Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)

I can run evaluation with this model without any error, but I cannot train it.

Evaluation values differ from the paper on the REAL dataset

When I run the pretrained model you provided for evaluation, I get the following results, which do not match the paper. The CAMERA results are the same, but the REAL results differ, as shown below.
val:
mAP:
3D IoU at 25: 94.4
3D IoU at 50: 93.2
3D IoU at 75: 83.3
5 degree, 2cm: 54.3
5 degree, 5cm: 59.0
10 degree, 2cm: 73.2
10 degree, 5cm: 81.5
Acc:
3D IoU at 25: 85.1
3D IoU at 50: 84.5
3D IoU at 75: 78.9
5 degree, 2cm: 67.6
5 degree, 5cm: 71.3
10 degree, 2cm: 82.3
10 degree, 5cm: 88.1

real_test
mAP:
3D IoU at 25: 82.1
3D IoU at 50: 71.2
3D IoU at 75: 42.9
5 degree, 2cm: 11.5
5 degree, 5cm: 12.1
10 degree, 2cm: 33.6
10 degree, 5cm: 38.0
Acc:
3D IoU at 25: 90.4
3D IoU at 50: 83.0
3D IoU at 75: 57.5
5 degree, 2cm: 23.6
5 degree, 5cm: 24.8
10 degree, 2cm: 45.6
10 degree, 5cm: 50.5

ShapeNetCore_4096.h5

Hello, thank you very much for your excellent work. Could you please provide ShapeNetCore_4096.h5?

Error when running pose_data.py

(screenshot of the error)

During data preprocessing, I encountered the error shown above when running pose_data.py. I downloaded the data following the links on GitHub and made sure all the formats are correct.
Should I write separate code to handle cases where the scale value is None?

Problems reproducing results in a new environment

Hello, thank you for your great work. I used the pre-trained model you provided for evaluation and got results similar to the paper. But when I retrained the model in my own environment and then tested it, I encountered some problems.
I hope to get your help: 1) the best result did not appear at epoch 50 (CAMERA: epoch 48, Real: epoch 52); 2) the test results cannot reach the same performance as the model you provided.
Thanks in advance!

My operating environment is under Anaconda:
python 3.8.0
pytorch 1.7.0
cuda 11.0.221
GPU: NVIDIA RTX 3090
My test results are as follows:
CAMERA: (screenshot of results)
Real: (screenshot of results)

About ground truth size in real_test

Hi, thank you for sharing your code! I have been reading it recently.

I am wondering about the following code in the save_nocs_model_to_file(obj_model_dir) function in preprocess/shape_data.py:
scale = np.linalg.norm(bbox_dims)
model_points = sample_points_from_mesh(inst_path, 1024, fps=True, ratio=3)
model_points /= scale

Thus, I am confused: why is the ground-truth size in the real_test dataset the normalized size rather than the actual size?
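
For reference, the snippet above divides the model by the diagonal length of its tight bounding box. Assuming the stored ground-truth size is this normalized size (as the question suggests), the metric size can be recovered by multiplying back by the per-instance scale. A small illustrative sketch with made-up numbers:

import numpy as np

bbox_dims = np.array([0.10, 0.15, 0.08])   # hypothetical metric extents of a model
scale = np.linalg.norm(bbox_dims)          # diagonal length used for normalization
normalized_size = bbox_dims / scale        # size as stored after preprocessing
metric_size = normalized_size * scale      # actual size, recovered from the scale
print(normalized_size, metric_size)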

Something wrong in computing 3D IoU

def transform_coordinates_3d(coordinates, sRT):
    """
    Args:
        coordinates: [3, N]
        sRT: [4, 4]
    Returns:
        new_coordinates: [3, N]
    """
    assert coordinates.shape[0] == 3
    coordinates = np.vstack([coordinates, np.ones((1, coordinates.shape[1]), dtype=np.float32)])
    new_coordinates = sRT @ coordinates
    new_coordinates = new_coordinates[:3, :] / new_coordinates[3, :]
    return new_coordinates

def compute_3d_IoU(sRT_1, sRT_2, size_1, size_2, class_name_1, class_name_2, handle_visibility):
    """ Computes IoU overlaps between two 3D bboxes. """
    def asymmetric_3d_iou(sRT_1, sRT_2, size_1, size_2):
        noc_cube_1 = get_3d_bbox(size_1, 0)
        bbox_3d_1 = transform_coordinates_3d(noc_cube_1, sRT_1)
        noc_cube_2 = get_3d_bbox(size_2, 0)
        bbox_3d_2 = transform_coordinates_3d(noc_cube_2, sRT_2)

        bbox_1_max = np.amax(bbox_3d_1, axis=0)
        bbox_1_min = np.amin(bbox_3d_1, axis=0)
        bbox_2_max = np.amax(bbox_3d_2, axis=0)
        bbox_2_min = np.amin(bbox_3d_2, axis=0)

        overlap_min = np.maximum(bbox_1_min, bbox_2_min)
        overlap_max = np.minimum(bbox_1_max, bbox_2_max)

        # intersections and union
        if np.amin(overlap_max - overlap_min) < 0:
            intersections = 0
        else:
            intersections = np.prod(overlap_max - overlap_min)
        union = np.prod(bbox_1_max - bbox_1_min) + np.prod(bbox_2_max - bbox_2_min) - intersections
        overlaps = intersections / union
        return overlaps

When I debugged the method asymmetric_3d_iou(), I found that the shape of bbox_3d_1 is (3, 8), as described in transform_coordinates_3d(). Therefore the shape of bbox_1_max is (8,), and so are bbox_1_min, bbox_2_max, and bbox_2_min.

But I think their shapes should be (3,), representing the per-axis bounds of the 3D boxes.

And I did a test using method asymmetric_3d_iou(). Results are as follows:

  • using bbox_1_max = np.amax(bbox_3d_1, axis=0), whose output shape is (8,):
from lib.utils import compute_3d_IoU
import numpy as np
pose = np.eye(4, dtype=np.float32)
pose[:, 3] = np.array([1, 1, 1, 1])
print(compute_3d_IoU(pose, pose, [1, 1, 1], [0.5, 0.5, 0.5], 'laptop', 'laptop', 1))

## output : nan
  • using bbox_1_max = np.amax(bbox_3d_1, axis=1), whose output shape is (3,):
from lib.utils import compute_3d_IoU
import numpy as np
pose = np.eye(4, dtype=np.float32)
pose[:, 3] = np.array([1, 1, 1, 1])
print(compute_3d_IoU(pose, pose, [1, 1, 1], [0.5, 0.5, 0.5], 'laptop', 'laptop', 1))

## output : 0.125
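
For what it's worth, a minimal sketch of the change being suggested (not an official patch): since the corner array is laid out as [3, N], the per-axis bounds should be reduced over the corner dimension.

# corners are [3, 8]: rows are x/y/z, columns are the 8 box corners,
# so reducing over axis=1 gives per-axis bounds of shape (3,)
bbox_1_max = np.amax(bbox_3d_1, axis=1)
bbox_1_min = np.amin(bbox_3d_1, axis=1)
bbox_2_max = np.amax(bbox_3d_2, axis=1)
bbox_2_min = np.amin(bbox_3d_2, axis=1)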

About dataset

Dear author, in your paper you mentioned that you treat the mug as a symmetric object when the handle is unseen. But how do you obtain the handle visibility of mugs in the real_test dataset? Thanks very much.

Wrong results when I run evaluate.py

I tested on the real_test data and got a bad result, which is far from the correct one. Can anyone tell me what is wrong?

100%|██████████| 2754/2754 [03:08<00:00, 14.59it/s]
mAP:
3D IoU at 25: 28.9
3D IoU at 50: 1.3
3D IoU at 75: 0.0
5 degree, 2cm: 0.0
5 degree, 5cm: 0.0
10 degree, 2cm: 0.0
10 degree, 5cm: 0.0
Acc:
3D IoU at 25: 46.1
3D IoU at 50: 5.5
3D IoU at 75: 0.0
5 degree, 2cm: 0.0
5 degree, 5cm: 0.0
10 degree, 2cm: 0.1
10 degree, 5cm: 0.2

Process finished with exit code 0

(attached image: my_0_pred)

Question about ground-truth NOCS

Thanks for sharing this good work.
I have a very simple question about the ground-truth NOCS.

As I understand it, NOCS is a normalized coordinate space and does not itself require 6D pose information (R, T).
In the paper, the ground-truth NOCS coordinates are generated by rendering the object model at its 6D pose.
Could you explain why the 6D pose and image rendering are needed?
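
A rough sketch of the relationship, as I understand it (my own illustration, not the repo's data pipeline): the ground-truth NOCS value at a pixel is the normalized object-frame coordinate of the surface point seen at that pixel, so the pose is needed to know which model point is visible where; equivalently, a point observed in the camera frame can be mapped back into NOCS through the pose.

import numpy as np

def camera_point_to_nocs(p_cam, R, t, s):
    # p_cam: (3,) surface point in the camera frame
    # R: (3, 3) rotation, t: (3,) translation, s: scalar scale,
    #   assuming the convention p_cam = s * R @ p_nocs + t
    return R.T @ (p_cam - t) / s   # normalized object-frame coordinate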

Error When Running shape_data.py

When I run shape_data.py, I get this error:
File "/home/lsl/GenPose/GenPose/preprocess/../lib/utils.py", line 104, in uniform_sample
faces = vertices[faces]
IndexError: arrays used as indices must be of integer (or boolean) type
My environment is:
Ubuntu 20.04
Python 3.8.15
Pytorch 1.12.0
Pytorch3d 0.7.2
CUDA 11.3

So I want to know: do I need to match your environment?

dataset download link not working

Good afternoon, thank you for your work. I would very much like to run training, but the link to the datasets does not work today.
Perhaps you saved the datasets on Google Drive, or know of some other workaround?
I also tried downloading as recommended in hughw19/NOCS_CVPR2019#59
