
pose-residual-network-pytorch's Introduction

Pose Residual Network

This repository contains a PyTorch implementation of the Pose Residual Network (PRN) presented in our ECCV 2018 paper:

Muhammed Kocabas, Salih Karagoz, Emre Akbas. MultiPoseNet: Fast Multi-Person Pose Estimation using Pose Residual Network. In ECCV, 2018. arxiv

PRN is described in Section 3.2 of the paper.

Getting Started

We have tested our method on the COCO dataset.

Prerequisites

python
pytorch
numpy
tqdm
pycocotools
progress
scikit-image

Installing

  1. Clone this repository: git clone https://github.com/salihkaragoz/pose-residual-network-pytorch.git

  2. Install PyTorch.

  3. Install the remaining dependencies: pip install -r src/requirements.txt

  4. Download the COCO train2017 and val2017 annotations: bash data/coco.sh (data size: ~240 MB)

Training

python train.py

For more options, see opt.py.

Testing

  1. Download the pre-trained model.

  2. Run: python test.py --test_cp=PathToPreTrainModel/PRN.pth.tar

Results

Results on COCO val2017 Ground Truth data.

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.892
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.978
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.921
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.883
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.912
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.917
 Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.982
 Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.937
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.902
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.944

License

Citation

If you find this code useful for your research, please consider citing our paper:

@Inproceedings{kocabas18prn,
  Title          = {Multi{P}ose{N}et: Fast Multi-Person Pose Estimation using Pose Residual Network},
  Author         = {Kocabas, Muhammed and Karagoz, Salih and Akbas, Emre},
  Booktitle      = {European Conference on Computer Vision (ECCV)},
  Year           = {2018}
}

pose-residual-network-pytorch's People

Contributors

eakbas, icewinechen, salihkaragoz


pose-residual-network-pytorch's Issues

More details about Fig. 2 in your paper

@salihkaragoz Thanks for your excellent work and for sharing this repo. I'm very interested in the Fig. 2 results of your paper and would like to reproduce them. Would you mind sharing the code that produces the sample poses obtained by clustering the structures learned by PRN?
Thanks.
Looking forward to your reply.

Licensing

Hello,

What is the license of this model?

Thank you

Softmax across all keypoints?

class Flatten(nn.Module):
    def forward(self, input):
        return input.view(input.size(0), -1)

class PRN(nn.Module):
    def __init__(self,node_count,coeff):
        ...
        self.softmax   = nn.Softmax(dim=1)

    def forward(self, x):
        res = self.flatten(x)
        ...
        out = self.add(out,res)  # [N,H*W*C]
        out = self.softmax(out)
        out = out.view(out.size()[0],self.height, self.width, 17)

        return out
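
The question in the title comes down to what nn.Softmax(dim=1) normalises over once the tensor has been flattened to [N, H*W*C]. The following is a minimal sketch in plain Python, with made-up logits and a hypothetical `softmax` helper, contrasting one softmax over all positions of all keypoints with a separate softmax per keypoint channel. It is an illustration of the difference, not the authors' code.

```python
import math

def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Toy logits for 2 keypoint channels with 3 spatial positions each
# (hypothetical numbers, chosen only for illustration).
logits = [
    [2.0, 0.0, 0.0],  # keypoint 0
    [1.0, 1.0, 0.0],  # keypoint 1
]

# Flattened softmax: every position of every keypoint shares
# a single unit of probability mass.
flat = softmax([v for channel in logits for v in channel])

# Per-keypoint alternative: each channel is normalised on its own,
# so every keypoint heatmap sums to 1 independently.
per_keypoint = [softmax(channel) for channel in logits]

print(round(sum(flat), 6))                                  # 1.0
print([round(sum(channel), 6) for channel in per_keypoint])  # [1.0, 1.0]
```

With the flattened variant, the peak value of any single keypoint map shrinks as the number of keypoints grows, because all 17 maps compete for the same total mass.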

Corrupted tar archive

Hello,
Although the name of the issue is self-explanatory, I'll add a few details here:

  • The archive is 806 MB, which seems a bit large, especially when the pth file of a trained RetinaNet is less than 200 MB.
  • The file can't be opened, which is a shame; I'd love to replicate your results :)

I hope you will be able to help. Have a nice day!

Keypoint Estimation Subnet

Does this code contain the implementation of the Keypoint Estimation Subnet? And how can a loss be added at each level of the K features in the Keypoint Estimation Subnet? Thanks!

Confusion about input and label

The COCO dataloader shows how the data is prepared for network training. But when I checked the dataloader function, the weights and the output are effectively the same (dataloader.py, lines 43-64 and lines 75-95). The input is the weights variable from the gendata function. Why does the input come from the known keypoint positions rather than from the actual image data? This is clearly different from what you describe in the paper. Does it mean that you use the known labels to predict the known labels? Does that make sense? If I have misunderstood the code, please let me know.

    for j in range(17):
        if kpv[j] > 0:
            x0 = int((kpx[j] - x) * x_scale)
            y0 = int((kpy[j] - y) * y_scale)

            if x0 >= self.bbox_width and y0 >= self.bbox_height:
                output[self.bbox_height - 1, self.bbox_width - 1, j] = 1
            elif x0 >= self.bbox_width:
                output[y0, self.bbox_width - 1, j] = 1
            elif y0 >= self.bbox_height:
                try:
                    output[self.bbox_height - 1, x0, j] = 1
                except:
                    output[self.bbox_height - 1, 0, j] = 1
            elif x0 < 0 and y0 < 0:
                output[0, 0, j] = 1
            elif x0 < 0:
                output[y0, 0, j] = 1
            elif y0 < 0:
                output[0, x0, j] = 1
            else:
                output[y0, x0, j] = 1

    img_id = ann_data['image_id']
    img_data = coco.loadImgs(img_id)[0]
    ann_data = coco.loadAnns(coco.getAnnIds(img_data['id']))

    for ann in ann_data:
        kpx = ann['keypoints'][0::3]
        kpy = ann['keypoints'][1::3]
        kpv = ann['keypoints'][2::3]

        for j in range(17):
            if kpv[j] > 0:
                if (kpx[j] > bbox[0] - bbox[2] * self.threshold and kpx[j] < bbox[0] + bbox[2] * (1 + self.threshold)):
                    if (kpy[j] > bbox[1] - bbox[3] * self.threshold and kpy[j] < bbox[1] + bbox[3] * (1 + self.threshold)):
                        x0 = int((kpx[j] - x) * x_scale)
                        y0 = int((kpy[j] - y) * y_scale)

                        if x0 >= self.bbox_width and y0 >= self.bbox_height:
                            weights[self.bbox_height - 1, self.bbox_width - 1, j] = 1
                        elif x0 >= self.bbox_width:
                            weights[y0, self.bbox_width - 1, j] = 1
                        elif y0 >= self.bbox_height:
                            weights[self.bbox_height - 1, x0, j] = 1
                        elif x0 < 0 and y0 < 0:
                            weights[0, 0, j] = 1
                        elif x0 < 0:
                            weights[y0, 0, j] = 1
                        elif y0 < 0:
                            weights[0, x0, j] = 1
                        else:
                            weights[y0, x0, j] = 1

    for t in range(17):
        weights[:, :, t] = gaussian(weights[:, :, t])
    output = gaussian(output, sigma=2, mode='constant', multichannel=True)
    # weights = gaussian_multi_input_mp(weights)
    # output = gaussian_multi_output(output)
    return weights, output
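
The boundary handling in the excerpt above amounts to forcing each scaled coordinate into the heatmap grid. A minimal sketch of that idea follows, using a hypothetical helper `clamp_to_grid` that is not part of the repository and an assumed 36 x 56 (width x height) grid.

```python
def clamp_to_grid(x0, y0, width, height):
    """Clamp an integer (x0, y0) location into a width x height grid.

    Hypothetical helper for illustration; returns (row, col).
    """
    col = min(max(x0, 0), width - 1)
    row = min(max(y0, 0), height - 1)
    return row, col

# Assumed grid size; the real values come from the dataset options.
print(clamp_to_grid(40, -3, width=36, height=56))  # (0, 35)
print(clamp_to_grid(10, 60, width=36, height=56))  # (55, 10)
print(clamp_to_grid(5, 7, width=36, height=56))    # (7, 5)
```

The result would be used as `output[row, col, j] = 1`. This is not a drop-in equivalent of the chain above: in mixed cases (for example x0 too large while y0 is negative), the if/elif chain indexes with the negative value, which wraps around to the far edge of the array, whereas the helper clamps both axes.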

It's just a joke

Please give up trying. I checked the model; it is just some linear layers of size 1024*34272.
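
The 1024*34272 figure is consistent with a fully connected layer applied to a flattened stack of keypoint maps. As a rough check, the sketch below assumes a 56 x 36 box grid with 17 COCO keypoints; the grid size is an assumption (it is not stated in this thread) chosen because 56 * 36 * 17 = 34272.

```python
NUM_KEYPOINTS = 17       # COCO keypoints, matching the view(..., 17) call in the PRN snippet
HEIGHT, WIDTH = 56, 36   # assumed box grid; not stated in this thread
HIDDEN = 1024            # hidden width quoted in the issue

flat_size = HEIGHT * WIDTH * NUM_KEYPOINTS
weights = flat_size * HIDDEN
params = weights + HIDDEN  # weights plus one bias per hidden unit

print(flat_size)  # 34272
print(params)     # 35095552
```

Under these assumptions a single such layer holds roughly 35 million parameters.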

Downloadable weights

Hi. I'm very interested in this implementation. For now I'm going to try training it myself.

But do you think you'll publish downloadable weights that reach the scores reported in the paper? I'd be very interested.

Thanks.

Hey @VladislavZavadskyy,

Could you clearly describe your problem instead of making a groundless judgment? This repo isn't a full pipeline of the things that we introduced in our recent paper, just a demo of the main contribution to help people to understand the idea. If you can state your issue in a concise way, maybe we can guide you to correct resources.

Thanks,

Originally posted by @mkocabas in #14 (comment)

Yes, we would like you to point us to the exact training methodology, your preprocessing pipeline, and your hyperparameter tuning strategy. Or you could just release the pretrained model that you have been promising for a long time.

Release all the code

Thanks for sharing your code!
I am very interested in it and want to train your network from scratch. Can you release all the code (backbone, keypoint subnet, person detection subnet, and pose residual network)? Thank you very much!

help

Hello, I have recently been studying your paper and trying to run train.py on Linux, but it fails with a core dump. What should I do?

Zero division by bbox[3]

Hi,
I got a zero division error during training. When would bbox[3] have a zero value?
Please help me solve this issue, thanks a lot!


index created!
 14%|██████████████████▊ | 9072/64115 [11:40<52:52, 17.35it/s]
Traceback (most recent call last):
  File "train.py", line 81, in <module>
    main(option)
  File "train.py", line 68, in main
    Evaluation(model, opt)
  File "~/pose_residual_network.pytorch/src/eval.py", line 221, in Evaluation
    y_scale = float(h) / math.ceil(b[3])
ZeroDivisionError: float division by zero
 14%|██████████████████▌ | 9072/64115 [11:40<1:10:52, 12.95it/s]
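
The failing line divides by math.ceil(b[3]), which is 0 whenever the box height is 0. One plausible cause is an annotation with a degenerate bounding box. A minimal sketch of a guard follows; `safe_scales` is a hypothetical helper that is not part of the repository, and it assumes that `h` in the traceback is the output grid height and that the x scale is computed analogously from b[2].

```python
import math

def safe_scales(out_h, out_w, bbox):
    """Return (x_scale, y_scale), or None for a degenerate box.

    bbox follows the COCO [x, y, width, height] convention.
    Hypothetical helper for illustration only.
    """
    box_w = math.ceil(bbox[2])
    box_h = math.ceil(bbox[3])
    if box_w <= 0 or box_h <= 0:
        return None  # zero-area box: the caller should skip this annotation
    return float(out_w) / box_w, float(out_h) / box_h

print(safe_scales(56, 36, [10.0, 20.0, 18.0, 28.0]))  # (2.0, 2.0)
print(safe_scales(56, 36, [10.0, 20.0, 18.0, 0.0]))   # None
```

In the evaluation loop, a `None` result would translate to a `continue`, so that a single zero-height box does not abort the whole run.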

Training results are worse than reported

Dear salihkaragoz,
Thanks for your code. The results I get from training with your code are not as good as those in your repository. Are some options missing?

Thanks for your reply!

Total Step: 17372 | Total Epoch: 16
17372it [05:57, 48.62it/s] | Epoch: 15 Total: 1:59:03 | ETA: 18:15:13 | loss:0.00240877992474
------------Evaulation Started------------
loading annotations into memory...
Done (t=0.17s)
creating index...
index created!
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2693/2693 [01:22<00:00, 32.71it/s]
Loading and preparing results...
DONE (t=0.11s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type keypoints
DONE (t=4.03s).
Accumulating evaluation results...
DONE (t=0.05s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.852
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.967
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.884
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.835
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.889
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.886

Question about using our own keypoints + our own bbox

Hi @salihkaragoz, thank you for your novel work. I have a question about testing in a real setting.

PRN is trained on ground-truth keypoints and boxes. When we test with our own predicted keypoints and boxes, do we need to train it again, or can we use the same model from this repository? If we need to train it with our keypoints + our bboxes, how should we prepare the input and target?

Pre-trained model is different from the defined model [Solved]

Hi, I downloaded the code and the pre-trained model. Then I just ran the test and got:
RuntimeError: Error(s) in loading state_dict for PRN:
Missing key(s) in state_dict: "bneck.weight", "bneck.bias".
Unexpected key(s) in state_dict: "dens3.weight", "dens3.bias".

It looks like the model definition is different from the pre-trained model.
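
The error reports parameter names that exist in the model but not in the checkpoint ("missing") and names that exist only in the checkpoint ("unexpected"). A minimal sketch of the same comparison on plain key lists can help pin down which layers differ before attempting a load. `diff_state_dict_keys` is a hypothetical helper, and the shared `dens1.*` keys are made up for illustration; only the `bneck.*` and `dens3.*` names come from the error message.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, mirroring the error wording."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected

# bneck.* and dens3.* come from the error message; dens1.* is hypothetical.
model_keys = ["dens1.weight", "dens1.bias", "bneck.weight", "bneck.bias"]
checkpoint_keys = ["dens1.weight", "dens1.bias", "dens3.weight", "dens3.bias"]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)        # missing: ['bneck.bias', 'bneck.weight']
print("unexpected:", unexpected)  # unexpected: ['dens3.bias', 'dens3.weight']
```

With PyTorch, the first list would typically come from `model.state_dict().keys()` and the second from the keys of the loaded checkpoint dictionary, which may be nested under another key depending on how the checkpoint was saved.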

Weird code in eval.py

Hi, thanks for your work. While reading the code in this repository, I found something odd in eval.py. To get the predicted bbox_keypoints, you use the true keypoints to assign bbox_keypoints. The relevant code in eval.py is around lines 200 and 205.

The peaks are the true keypoint coordinates, is that right? It seems that the true coordinates are used to assign the predicted bbox_keypoints. I think lines 209~220 in eval.py are actually the right way to get the real predicted bbox_keypoints.

Maybe you can give me some advice about this, thanks.

This repo is a scam

Don't waste your time dealing with the code, because I did it for you.
I've made an input-label collage. As you can see, the network learns to blur the (slightly blurred and filtered) labels passed to it as input.
[image: io]

Reproducing results reported by paper

Given that the full network and training flow have not been released by the authors, has anyone actually succeeded in fully reproducing the results reported in the paper (both the accuracy and the speed of 23 FPS)? Any DL framework is fine. Thank you.
