
a2j-transformer's People

Contributors

changlongjianggit


a2j-transformer's Issues

How to project the estimated 3D coordinates back onto the 2D image

Thanks for your code! I found that the estimated keypoints do not match the ground truth when I visualize them directly from preds. What should I do to project the estimated 3D coordinates back onto the 2D image? By the way, I'm a beginner, so please don't hesitate to explain in detail!

with torch.no_grad():
    for itr, (inputs, targets, meta_info) in enumerate(tqdm(tester.batch_generator, ncols=150)):
        # forward
        start = time.time()
        out = tester.model(inputs, targets, meta_info, 'test')
        end = time.time()

        joint_coord_out = out['joint_coord'].cpu().numpy()
        inv_trans = out['inv_trans'].cpu().numpy()
        joint_valid = out['joint_valid'].cpu().numpy()

        preds['joint_coord'].append(joint_coord_out)
        preds['inv_trans'].append(inv_trans)
        preds['joint_valid'].append(joint_valid)

        timer.append(end - start)

        # visualization
        # focal = meta_info['focal'][0]
        # princpt = meta_info['princpt'][0]
        # for j in range(42):
        #     joint_coord_out[0][j, :2] = trans_point2d(joint_coord_out[0][j, :2], inv_trans[0])
        # joint_coord_out[0][:, 2] = (joint_coord_out[0][:, 2] / cfg.output_hm_shape[0] * 2 - 1) * (cfg.bbox_3d_size / 2)
        # joint_coord_out[0][:21, 2] += float(targets['rel_root_depth'][0])
        # joint_coord_out[0] = pixel2cam(joint_coord_out[0], focal, princpt)

        # overlay predictions (first 21 joints) and ground truth on the cropped input image
        plt.imshow(inputs['img'][0].permute(1, 2, 0))
        plt.scatter(joint_coord_out[0][:21, 0], joint_coord_out[0][:21, 1])
        plt.scatter(targets['joint_coord'][0][:21, 0], targets['joint_coord'][0][:21, 1])
        plt.savefig('./visualization/result' + str(itr) + '.png')
        plt.close()
[screenshot of the resulting visualization, where predictions and ground truth do not align]
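For what it's worth, here is a sketch of one possible recovery path, assuming (as in the InterHand2.6M baseline this pipeline resembles) that out['joint_coord'] stores x, y on the cfg.output_hm_shape grid of the cropped patch and z as a discretized depth, and reusing the trans_point2d and pixel2cam helpers already referenced in the commented-out block above:

pred = joint_coord_out[0].copy()   # (42, 3): x, y on the heatmap grid, z discretized

# heatmap grid -> cropped input resolution (assumed convention: output_hm_shape = (D, H, W))
pred[:, 0] = pred[:, 0] / cfg.output_hm_shape[2] * cfg.input_img_shape[1]
pred[:, 1] = pred[:, 1] / cfg.output_hm_shape[1] * cfg.input_img_shape[0]

# To overlay on the cropped inputs['img'], plot pred[:, :2] at this point.
# To go back to the ORIGINAL (uncropped) image, additionally invert the crop/affine transform:
for j in range(pred.shape[0]):
    pred[j, :2] = trans_point2d(pred[j, :2], inv_trans[0])

# discretized depth -> metric root-relative depth, then (optionally) back-project to camera space
pred[:, 2] = (pred[:, 2] / cfg.output_hm_shape[0] * 2 - 1) * (cfg.bbox_3d_size / 2)
focal = meta_info['focal'][0]
princpt = meta_info['princpt'][0]
pred_cam = pixel2cam(pred, focal, princpt)

Whether the heatmap-to-input rescale is actually needed depends on the space out['joint_coord'] is defined in, so it is worth checking the model's output convention before applying it.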

about test.py

Hello, thanks for your great work!
When I try to run the code on my computer, it always stops at 'get bbox and root depth from groundtruth annotation 40%'.
I set num_thread to 4 and the test batch_size to 5.
My GPU is a 2070 and my computer has 8 GB of memory.
Do I need to change my GPU or add more memory?
Good luck with your work.
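If the stall comes from the dataloader workers exhausting the 8 GB of system RAM while the ground-truth bboxes and root depths are loaded, one low-risk thing to try is the sketch below; num_thread and test_batch_size are the config fields named in the post, and the import path is an assumption about the repo's layout.

from config import cfg   # assumed import path; use the repo's actual config module

cfg.num_thread = 0       # single-process data loading: workers no longer duplicate the annotation list in RAM
cfg.test_batch_size = 1  # smallest batches while debugging; raise again once it runs through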

Hands2017 training files

Hello, I could not find the three files referenced in the Hands2017 training code anywhere in the folders. Could you provide a way to obtain them?
keypointsfile = '/data/data1/zhangboshen/CODE/219_A2J_original/Anchor_Pose_fpn/data/Hands2017/train_keypointsUVD.mat'
center_train = scio.loadmat('/data/data1/zhangboshen/CODE/219_A2J_original/Anchor_Pose_fpn/data/Hands2017/train_centre_pixel.mat')
center_test = scio.loadmat('/data/data1/zhangboshen/CODE/219_A2J_original/Anchor_Pose_fpn/data/Hands2017/test_centre_pixel.mat')

Dataset

Sorry to bother you, but did you run into a problem like this when extracting the InterHand dataset? (Screenshot of the error below.)

[screenshot of the extraction error]

training code

Your work is fantastic. When will the training code be released?

About Accuracy

Hello,
Thank you very much for your excellent work! I'm trying to reproduce your results. If possible, could you please help me understand how to obtain the model's accuracy and the loss function?

The results I am getting are as follows:
MPJPE: {'total': 9.632784, 'single_hand_total': 8.109349, 'single_hand_2d': 3.632148, 'single_hand_depth': 6.542033289847017, 'inter_hand_total': 10.962901, 'inter_hand_2d': 6.2421556, 'inter_hand_depth': 7.810678911578783}
Hand Accuracy: None
MRRPE: None
Time per batch: 0.39169927277285777

I would appreciate it very much.

Regards,
Harshit Soni
Binghamton University

Training time

Could you please give a detailed description of the GPUs used and the training time? I am trying to reproduce your results. I set the batch_size to 24 and trained on dual 3090 GPUs; one epoch took 4.34 hours. Is this reasonable?

Inference without bounding box

Hi, I'm a student studying hand pose estimation these days.
First of all, thanks a lot for your great project and paper.
However, I have a question.
I want to run inference on my own custom data
(e.g., extract 3D keypoints in real time from a live camera (webcam)).
But I found that there is a function 'augmentation' in utils.preprocessing, and it uses bounding-box information known in advance (from xxxx_test_data.json).
Is there any way to run inference without a bounding box?
Thanks and have a wonderful day! :)
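One workaround is to supply your own box instead of the one read from the annotation json. The sketch below is purely illustrative (run_hand_detector is a hypothetical helper, and the bbox format x, y, width, height is an assumption); the resulting bbox would then be passed to the same augmentation() call the dataset class makes.

import numpy as np

frame = np.zeros((480, 640, 3), dtype=np.uint8)       # stand-in for a webcam frame
h, w = frame.shape[:2]

# Option 1: use the full frame as the box (only sensible if the hand fills most of the view)
bbox = np.array([0, 0, w, h], dtype=np.float32)        # x, y, width, height (assumed format)

# Option 2: use an off-the-shelf hand detector to get a tight box, then pad it so the
# crop keeps some context around the hand.
# x_min, y_min, x_max, y_max = run_hand_detector(frame)   # hypothetical helper
# pad = 0.25 * max(x_max - x_min, y_max - y_min)
# bbox = np.array([x_min - pad, y_min - pad,
#                  (x_max - x_min) + 2 * pad, (y_max - y_min) + 2 * pad], dtype=np.float32)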

code

Thank you for your excellent job! Will the training code be released?

Incomprehensible error

When I run the make.sh script, I get the error "ModuleNotFoundError: No module named 'torch'", but my virtual environment already has PyTorch installed. Why does this error occur?
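A common cause is that make.sh invokes a different python than the one in the activated virtual environment. A quick, repo-independent check is to run the following with the same python command that make.sh uses:

import sys
print(sys.executable)      # should point inside your virtual environment

try:
    import torch
    print(torch.__version__, 'CUDA available:', torch.cuda.is_available())
except ImportError:
    print('torch is not installed for THIS interpreter -- activate the venv '
          'or point make.sh at the venv python explicitly')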

training code

Thank you for your great work; I look forward to the release of your training code.

RHP dataset

Do you have data processing and training scripts for the RHP dataset?

Wrist position in camera space, and relative positions between left and right wrists

Hello. Thanks for sharing the paper and code.

I'm wondering how you model the estimation of wrist position in camera space, and relative positions between left and right wrists in either camera space or right-hand root-aligned space.

I've read the paper and code, and it seems that, for each hand, only the root-aligned joint positions relative to that hand's own wrist (i.e., left-hand joints relative to the left wrist, right-hand joints relative to the right wrist) are taken into consideration.

Thank you for your time and help!

Best regards,
Frances

sh make.sh

I changed the cur_dir path, but when I run sh make.sh (on Win64) the terminal shows:
ValueError: path '/data/data2/a2jformer/camera_ready/dab_deformable_detr/ops/src/vision.cpp' cannot be absolute
How can I solve this problem? Thanks.
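On Windows, setuptools rejects absolute source paths. A commonly suggested workaround is sketched below, assuming the ops extension is built from a setup.py that globs the .cpp/.cu sources (as in the Deformable-DETR ops this module derives from); the idea is to convert the globbed paths to paths relative to the setup.py directory before passing them to the Extension:

import glob
import os.path as osp

this_dir = osp.dirname(osp.abspath(__file__))          # directory containing setup.py
extensions_dir = osp.join(this_dir, 'src')

sources = glob.glob(osp.join(extensions_dir, '*.cpp')) + \
          glob.glob(osp.join(extensions_dir, '*.cu'))

# setuptools on Windows refuses absolute paths, so make them relative to setup.py
sources = [osp.relpath(s, start=this_dir) for s in sources]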

Saving the 3D keypoint coordinates

Hello, I would like to ask how to save the 3D keypoint coordinates detected by this model.
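A minimal sketch, assuming the per-batch out['joint_coord'] arrays are collected in preds['joint_coord'] as in the test loop shown in the first issue above (the stand-in dictionary here only makes the snippet runnable on its own):

import numpy as np

# stand-in; in practice preds['joint_coord'] is filled by the test loop above
preds = {'joint_coord': [np.zeros((1, 42, 3), dtype=np.float32)]}

all_joints = np.concatenate(preds['joint_coord'], axis=0)   # (num_samples, 42, 3)
np.save('predicted_joint_coord.npy', all_joints)            # reload later with np.load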

About code test.

Hello, I followed the installation instructions to test your code, but no matter how many GPUs I set, it only runs on GPU 0. Also, when I use a single 3090 GPU, it reports CUDA out of memory, yet your paper says you tested on a single 2080 Ti GPU. Looking forward to your reply!
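One generic thing to check (a sketch, not specific to this repo): CUDA_VISIBLE_DEVICES has to be set before PyTorch initializes CUDA, otherwise everything silently lands on GPU 0.

import os

# must be set before torch touches the GPU
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'

import torch
print('visible GPUs:', torch.cuda.device_count())   # expect 2 with the line above

For the out-of-memory error on a single card, lowering the test batch size in the config is usually the first knob to try.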

Demo available?

Thanks for your great work! Will you provide visualization demo?

Tested on a subset of InterHand2.6M (38959 images), but I got MPJPE=118.32

Hi! Thanks for your code.

Due to my limited GPU resources, I tested on a subset of the InterHand2.6M dataset (part aa),
but I got MPJPE = 118.32, so I guess there is some mistake in my setup. Could you please point out what I should change in the code? By the way, I noticed that you are Chinese; if it's convenient, we could communicate in Chinese!

Thank you again!

self.datalist = self.datalist_sh + self.datalist_ih
valid_datalist = []
for data in tqdm(self.datalist):
    img_path = data['img_path']
    if osp.exists(img_path):       # keep only images extracted from 'InterHand2.6M.images.5.fps.v1.0.tar.partaa'
        valid_datalist.append(data)
self.datalist = valid_datalist[:]
print('Number of annotations in single hand sequences: ' + str(len(self.datalist_sh)))
print('Number of annotations in interacting hand sequences: ' + str(len(self.datalist_ih)))


create datalist error

Thank you for your code! However, I found some errors in the generated datalist: the images cannot be read. Could you check the correctness of the datalist-generation code in dataset.py?

code for loss function

Thank you very much for your excellent work! I'm trying to reproduce your results. If possible, could you please provide the code for the loss function mentioned in the paper? I would appreciate it very much.
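Until the official training code is released, the following is only a generic placeholder and not the authors' loss: A2J-style methods typically supervise the regressed joint coordinates with a smooth-L1 term masked by joint validity, and a minimal sketch of such a masked loss (all names hypothetical) looks like this:

import torch
import torch.nn.functional as F

def masked_smooth_l1(pred_joints, gt_joints, joint_valid, beta=1.0):
    """Generic masked smooth-L1 joint-regression loss (illustrative only).

    pred_joints, gt_joints: (B, J, 3) predicted / ground-truth coordinates
    joint_valid:            (B, J) 1 for annotated joints, 0 otherwise
    """
    loss = F.smooth_l1_loss(pred_joints, gt_joints, reduction='none', beta=beta)  # (B, J, 3)
    mask = joint_valid.unsqueeze(-1)                                              # (B, J, 1)
    return (loss * mask).sum() / mask.sum().clamp(min=1)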

The paper link

I very much appreciate your contributions to this community. However, I can't find the paper link. When will it be released? Thanks :)

visualize hand skeleton

Hello, I want to visualize the hand skeleton on the output picture. How should I do it? Are you using your vis_keypoints(img, kps, kps_gt, bbox, score, skeleton, filename, score_thr=0.4, line_width=3, circle_rad=3, save_path=None) function?
If so, how should the parameters img, kps, kps_gt, bbox, score, and skeleton be set?
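A hedged guess at a call, going only by the signature quoted above and reusing the tensor names from the test-loop snippet in the first issue; every shape and the skeleton format are assumptions that should be checked against the function body in the repo:

import numpy as np

img = (inputs['img'][0].permute(1, 2, 0).numpy() * 255).astype(np.uint8)   # H x W x 3 uint8 crop
kps = joint_coord_out[0]                       # (42, 3) predicted joints, x/y in image pixels
kps_gt = targets['joint_coord'][0].numpy()     # (42, 3) ground-truth joints in the same space
bbox = np.array([0, 0, img.shape[1], img.shape[0]], dtype=np.float32)      # x, y, w, h
score = np.ones(kps.shape[0], dtype=np.float32)     # per-joint confidence; 1.0 draws every joint
skeleton = [(0, 1), (1, 2), (2, 3)]            # (parent, child) index pairs -- substitute the
                                               # repo's full hand skeleton definition here

vis_keypoints(img, kps, kps_gt, bbox, score, skeleton,
              filename='skeleton_0.png', save_path='./visualization')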
