
rethinking_of_par's Introduction

Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting (official PyTorch implementation)

This paper, submitted to TIP, is an extension of the earlier arXiv paper.

This project has been adopted by JDAI-CV/fast-reid and the PP-Human module of PaddleDetection.

This project aims to

  1. provide a strong baseline for pedestrian attribute recognition and multi-label classification.
  2. provide two new datasets, RAPzs and PETAzs, that follow the zero-shot pedestrian identity setting.
  3. provide a general training pipeline for pedestrian attribute recognition and multi-label classification tasks.

This project provides

  1. DDP training, mainly used for multi-label classification.
  2. Training on all attributes while testing on "selected" attributes, because the proportion of positive samples for the remaining attributes falls below a threshold, such as 0.01.
    1. For PETA and PETAzs, 35 of the 105 attributes are selected for performance evaluation.
    2. For RAPv1, 51 of the 92 attributes are selected for performance evaluation.
    3. For RAPv2 and RAPzs, 54 and 53 of the 152 attributes are selected for performance evaluation.
    4. For PA100k, all attributes are selected for performance evaluation.
    • However, training on all attributes does not bring consistent performance improvements across datasets.
  3. An EMA model.
  4. Transformer-based models, such as Swin Transformer (with a large performance improvement) and ViT.
  5. A convenient dataset info file, e.g. dataset_all.pkl.
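The EMA model mentioned above can be sketched as follows. This is a generic exponential-moving-average-of-weights sketch, not this repo's exact implementation; the class name `ModelEMA` and the decay value are illustrative assumptions.

```python
import copy
import torch

class ModelEMA:
    """Keeps an exponential moving average of a model's weights
    (generic sketch; the repo's EMA code may differ in detail)."""
    def __init__(self, model, decay=0.9998):
        # The EMA copy is never trained directly.
        self.ema = copy.deepcopy(model).eval()
        self.decay = decay
        for p in self.ema.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model):
        # ema = decay * ema + (1 - decay) * model, per tensor.
        for ema_t, t in zip(self.ema.state_dict().values(),
                            model.state_dict().values()):
            if ema_t.dtype.is_floating_point:
                ema_t.mul_(self.decay).add_(t, alpha=1 - self.decay)

# Usage: update the EMA copy after each optimizer step.
net = torch.nn.Linear(4, 2)
ema = ModelEMA(net, decay=0.9)
for _ in range(3):
    # ... optimizer.step() would update net here ...
    ema.update(net)
```

At evaluation time the averaged weights (`ema.ema`) are used instead of the raw model, which typically smooths out late-training noise.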

Dataset Info

  • PETA: Pedestrian Attribute Recognition At Far Distance [Paper][Project]

  • PA100K [Paper][Github]

  • RAP: A Richly Annotated Dataset for Pedestrian Attribute Recognition

  • PETAzs & RAPzs: Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting [Paper][Project]

Performance

Pedestrian Attribute Recognition

Datasets  Models          ma     Acc    Prec   Rec    F1
PA100k    resnet50        80.21  79.15  87.79  87.01  87.40
--        resnet50*       79.85  79.13  89.45  85.40  87.38
--        resnet50 + EMA  81.97  80.20  88.06  88.17  88.11
--        bninception     79.13  78.19  87.42  86.21  86.81
--        TresnetM        74.46  68.72  79.82  80.71  80.26
--        swin_s          82.19  80.35  87.85  88.51  88.18
--        vit_s           79.40  77.61  86.41  86.22  86.32
--        vit_b           81.01  79.38  87.60  87.49  87.55
PETA      resnet50        83.96  78.65  87.08  85.62  86.35
PETAzs    resnet50        71.43  58.69  74.41  69.82  72.04
RAPv1     resnet50        79.27  67.98  80.19  79.71  79.95
RAPv2     resnet50        78.52  66.09  77.20  80.23  78.68
RAPzs     resnet50        71.76  64.83  78.75  76.60  77.66
  • The resnet50* model is trained with the weighted loss function proposed by Tan et al. (AAAI 2020).
  • The PETAzs and RAPzs results are based on the first version of those datasets, as described in the paper.
  • Experiments use an input size of (256, 192), so there may be minor differences from the results in the paper.
  • The reported performance is reached at the first learning-rate drop, and we take that checkpoint as the best model.
  • Pretrained models are provided now at Google Drive.

Multi-label Classification

Datasets  Models     mAP    CP     CR     CF1    OP     OR     OF1
COCO      resnet101  82.75  84.17  72.07  77.65  85.16  75.47  80.02

Pretrained Models

Dependencies

  • Python 3.7
  • PyTorch 1.7.0
  • torchvision 0.8.2
  • CUDA 10.1

Get Started

  1. Run git clone https://github.com/valencebond/Rethinking_of_PAR.git
  2. Create a directory to hold the datasets listed above.
    cd Rethinking_of_PAR
    mkdir data
    
  3. Prepare the datasets so they have the following structure:
    ${project_dir}/data
        PETA
            images/
            PETA.mat
            dataset_all.pkl
            dataset_zs_run0.pkl
        PA100k
            data/
            dataset_all.pkl
        RAP
            RAP_dataset/
            RAP_annotation/
            dataset_all.pkl
        RAP2
            RAP_dataset/
            RAP_annotation/
            dataset_zs_run0.pkl
        COCO14
            train2014/
            val2014/
            ml_anno/
                category.json
                coco14_train_anno.pkl
                coco14_val_anno.pkl
    
  4. Train the resnet50 baseline:
    sh train.sh
    

Acknowledgements

The code builds on the repositories of Dangwei Li and Houjing Huang. Thanks for their released code.

Citation

If you use this method or this code in your research, please cite as:

@article{jia2021rethinking,
  title={Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting},
  author={Jia, Jian and Huang, Houjing and Chen, Xiaotang and Huang, Kaiqi},
  journal={arXiv preprint arXiv:2107.03576},
  year={2021}
}

@inproceedings{jia2021spatial,
 title={Spatial and semantic consistency regularizations for pedestrian attribute recognition},
 author={Jia, Jian and Chen, Xiaotang and Huang, Kaiqi},
 booktitle={Proceedings of the IEEE/CVF international conference on computer vision},
 pages={962--971},
 year={2021}
}

rethinking_of_par's People

Contributors

valencebond


rethinking_of_par's Issues

Found some mistake in training

While training on RAP2 I followed the README, but the log shows "RAP2 attr_num : 54, eval_attr_num : 54", and the saved checkpoint's classifier is loaded "with shape torch.Size([54, 2048]) from checkpoint".
I think attr_num should be 119, since the README says "Training on all attributes, testing on 'selected' attributes."
So in rapv2.yaml it should be LABEL: 'all' instead of LABEL: 'eval', am I right?

35 Attributes in PETA

I can't find the names of the 35 selected PETA attributes used for training and evaluation. Can you help me?

PETAzs and RAPV2zs download link

Hello,

Thank you for releasing the code for this awesome work!
I wanted to know how to access the datasets proposed in the paper, PETAzs and RAPv2zs.
Where can we download them, and which scripts should we run to arrange them into the required file structure?

Thank you in advance!

Could you provide us training configuration more precisely?

I trained your model with swin_s on PA100K, but there is a noticeable gap between my result (79.48) and your reported performance (80.35).
I assume the configuration I used for training differs from your settings.
Could you provide a more precise configuration for swin_s on PA100K?
Many thanks.

Data augmentations

I noticed that several data augmentation methods are mentioned in the ablation study of the paper. I assume this relates to the _C.TRAIN.DATAAUG.TYPE = 'base' part of the config. How can I use it to test the different augmentation types suggested in the paper?

How to solve the problem of misidentification

Hi~ @valencebond
First of all, thank you very much for your work.
I'm trying to train a model for smoking and phone-call detection, but the false alarm rate of the model is high.

[example images that are mistaken for smoking]

Where do you think I should improve? Should I continue adding images of such situations? Is there any image preprocessing that can reduce misrecognition?

RAPv2 dataset

Hello
How are you?
Could you share the RAPv2 dataset used when training your model?
Thanks

Poor inference results with infer.py

The inference results of infer.py are very poor. Why does pos_recall reach 80 on the validation set during training, but only 47 at inference time?

where is dataset_all.pkl

Hi, I ran PA100K, but it needs the dataset_all.pkl file.

The official PA100K dataset does not contain this file.

Where can I download it? Thanks very much!

about image size

Hi, I found that the input size is (256, 192) in the code but (256, 128) in the paper.

Did I get something wrong?

Custom dataset Annotation

I am currently engaged in attribute recognition work, and I am looking for guidance on annotating my dataset with attributes. Additionally, I am interested in understanding the process of training a model using a custom dataset.

abnormal results on PA100K

When I train this code, I get high results on the train set (ma: 94.4 | Acc: 93.0 | Prec: 95.7 | Rec: 96.4 | F1: 96.1) but much lower results on the test set (ma: 53.4 | Acc: 45.4 | Prec: 60.6 | Rec: 62.0 | F1: 61.3).

Inference time

@valencebond can you please share the inference time of the model along with the system specification? It would be helpful.

LICENSE

@valencebond can you please add a license to your project? It would be really helpful, thanks in advance.
