Giter Club home page Giter Club logo

diag-hoi's Introduction

Diagnosing Human-object Interaction Detectors

Code for "Diagnosing Human-object Interaction Detectors".

Contributed by Fangrui Zhu, Yiming Xie, Weidi Xie, Huaizu Jiang.

Installation

Installl the dependencies.

pip install -r requirements.txt

Data preparation

HICO-DET

Due to the inconsistency in the current evaluation protocol, we remove all no interaction annotations from ground truth annotations and predictions. There are two precomputed data files regarding HICO-DET.

data
 └─ hicodet
     |─ sum_gts_filtered.pkl # dict: {HOI triplet: number of it in the ground truth} (test set)
     └─ gt_wo_nointer.pkl # ground truth HOIs for all test images
     

They can be easily obtained from current HOI benchmarks or you can download them here and put it under data/ folder.

V-COCO

Data files can be obtained from the original repo here. Or you can download files for evaluation here and put it under data/ folder.

data
 └─ vcoco
     |─ vcoco_test.json
     |─ instances_vcoco_all_2014.json 
     └─ vcoco_test.ids 

Usage

Standard Evaluation

Run commands below to get standard $mAP$ results without no interaction class. Need to specify --preds_file in the config. This file is the output of the HOI detector, whose format can be found in the examples data/CDN/preds_wo_nointer.pkl and data/CDN/vcoco_r50.pickle. Or you can follow CDN to save this kind of output.

sh eval_hicodet.sh # evaluate on HICO-DET
sh eval_vcoco.sh # evaluate on V-COCO

mAP Improvement

Simply run the following commands to obtain the new mAP after fixing one type of error. You can specify --fix_type and --model_name while changing the error type and HOI model.

sh map_hicodet.sh 
sh map_vcoco.sh

Pair Localization

In order to diagnose the model's performance on pair localization, we need to extract intermediate~(the first stage) results from HOI detectors, i.e. extracting detected pairs before they are passed for interaction classification. Example results can be found here (data/CDN/hicodet_pair_preds.pkl).

Then run commands below to obtain the average number of pairs, recall and precision regarding detected pairs.

sh pair_loc_hicodet.sh
sh pair_loc_vcoco.sh

Action Analysis

Binary Classification for "no_interaction" class

We compute $AP$ of classifying negative human-object pairs to see if the model is able to give low confidence scores to incorrect localized human-object pairs. We compute the no interaction $AP$ on HICO-DET as it contains no interaction annotations. To get the output from HOI detectors, we need to save the classification score belonging to the negative class as $1 - \max_i (p_i)$, where $p_i$ is the classification score of the $i$-th actual action class. Examples can be found here (data/CDN/pair_pred_wscore.pkl).

With the desired model output, we can obtain the $AP$ of this binary classification problem by running commands below.

sh binary_cls_hicodet.sh

Action mAP on actual classes

For correctly localized pairs, there may be multiple action labels associated with them. We therefore compute the $mAP$ score for the classification over all actual action categories.

In order to get the correct result, one need to save the action score for each predicted triplet, rather than the overall score (human_score * object_score * action_score). Then, running commands below.

sh action_map_hicodet.sh
sh action_map_vcoco.sh

Acknowledgment

Some of the code is borrowed from CDN and v-coco.

diag-hoi's People

Contributors

fangruizhu avatar playerkk avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.