counting-detr's Introduction

Counting-DETR: Few-shot Object Counting and Detection

Table of contents
  1. Introduction
  2. Main Results
  3. Usage
  4. Acknowledgments
  5. Contact

Introduction

Abstract: We tackle a new task of few-shot object counting and detection. Given a few exemplar bounding boxes of a target object class, we seek to count and detect all objects of the target class. This task shares the same supervision as few-shot object counting but additionally outputs the object bounding boxes along with the total object count. To address this challenging problem, we introduce a novel two-stage training strategy and a novel uncertainty-aware few-shot object detector: Counting-DETR. The former is aimed at generating pseudo ground-truth bounding boxes to train the latter. The latter leverages the pseudo ground-truth provided by the former but takes the necessary steps to account for the imperfection of pseudo ground-truth. To validate the performance of our method on the new task, we introduce two new datasets named FSCD-147 and FSCD-LVIS. Both datasets contain images with complex scenes, multiple object classes per image, and a huge variation in object shapes, sizes, and appearance. Our proposed approach outperforms very strong baselines adapted from few-shot object counting and few-shot object detection by a large margin in both counting and detection metrics.


Details of the Counting-DETR model architecture and experimental results can be found in the following paper: Counting-DETR

@inproceedings{countingdetr2022,
  title     = {{Few-shot Object Counting and Detection}},
  author    = {Thanh Nguyen and Chau Pham and Khoi Nguyen and Minh Hoai},
  booktitle = {Proceedings of the European Conference on Computer Vision 2022},
  year      = {2022}
}

Please CITE our paper when Counting-DETR is used to help produce published results or incorporated into other software.

Main Results

For experiments on the FSCD-147 dataset:

FSCD-147 Results

For experiments on the FSCD-LVIS dataset:

FSCD-LVIS Results

Usage

Installation

First, pull the Docker image with the following command:

docker pull quaden/docker_images:pytorch_cuda102

Second, create a container from the pulled image:

docker run -it --name od_cnt --gpus=all --shm-size=8G --volume="$PWD:/workspace/" --ipc=host -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=$DISPLAY quaden/docker_images:pytorch_cuda102 /bin/bash
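
Optionally, verify that the GPU is visible inside the container. A minimal sanity check, assuming the image ships with PyTorch as its tag suggests:

# Run inside the container; both commands should report the GPU.
nvidia-smi
python -c "import torch; print(torch.cuda.is_available())"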

Training and Testing

First, clone this git repository inside the Docker container:

git clone git@github.com:VinAIResearch/Counting-DETR.git

Second, download our FSCD-147 and FSCD-LVIS datasets from the link below:

https://drive.google.com/drive/folders/14qzZaV4S8EBUj3yEkgrDQC7iErHxSPjl?usp=sharing

If the link above doesn't work, use the following link instead:

https://drive.google.com/drive/folders/1tlHZIg6X3jp6qARTxKh0kMsNvuIQop9P?usp=sharing
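
If you prefer the command line, the same folder can be fetched with gdown (our suggestion, not part of the original instructions; Google Drive may cap large folder downloads):

# Download the dataset folder from Google Drive.
pip install gdown
gdown --folder https://drive.google.com/drive/folders/14qzZaV4S8EBUj3yEkgrDQC7iErHxSPjl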

Extract each dataset archive into the corresponding experiment folder. For example, to run the first-stage experiment on the FSCD-147 dataset, extract FSCD_147.zip into src/CountDETR_147_1st_stage (see the commands after the tree below). The folder structure should look like this:

Counting-DETR
│   README.md # This is the Readme you're reading now
│   LICENSE    
└───src
│   └───CountDETR_147_1st_stage # all experiments for the 1st stage of the FSCD-147 dataset are conducted here
│   |    │   FSCD_147 # extracted from FSCD_147.zip
│   |    │   main.py # source code for 1st stage
│   |    │   ...
│   |
│   └───CountDETR_147_2nd_stage # all experiments for the 2nd stage of the FSCD-147 dataset are conducted here
│   |    │   FSCD_147 # extracted from FSCD_147.zip
│   |    │   main.py # source code for 2nd stage
│   |    │   ...
...
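
Assuming FSCD_147.zip sits in the repository root and unpacks into an FSCD_147 folder, the following commands place a copy of the dataset into both FSCD-147 stage folders:

# One copy of the dataset per stage folder.
unzip FSCD_147.zip -d src/CountDETR_147_1st_stage
unzip FSCD_147.zip -d src/CountDETR_147_2nd_stage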

Then, change into the corresponding experiment directory and run its script. The sample scripts both train and evaluate the experiments.

For the 1st stage in FSCD-147 experiments:

cd src/CountDETR_147_1st_stage && ./scripts/weakly_supervise_fscd_147.sh

For the 2nd stage in FSCD-147 experiments:

cd src/CountDETR_147_2nd_stage && ./scripts/var_wh_laplace_600.sh

For the 1st stage in FSCD-LVIS experiments:

cd src/CountDETR_lvis_1st_stage && ./scripts/lvis_1_stage.sh

For the 2nd stage in FSCD-LVIS experiments:

cd src/CountDETR_lvis_2nd_stage && ./scripts/var_wh_laplace_lvis_2nd.sh
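
To run a pipeline end to end, the two stages can be chained in a single script. Below is a sketch for FSCD-147 (the FSCD-LVIS pair is analogous); it assumes the stage scripts exit with a nonzero status on failure:

#!/bin/bash
set -e  # stop if a stage fails
# Stage 1: generate pseudo ground-truth bounding boxes.
( cd src/CountDETR_147_1st_stage && ./scripts/weakly_supervise_fscd_147.sh )
# Stage 2: train the uncertainty-aware detector on the pseudo labels.
( cd src/CountDETR_147_2nd_stage && ./scripts/var_wh_laplace_600.sh )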

Acknowledgments

Our code borrows parts of the official AnchorDETR repository. We thank the authors for releasing their source code and pre-trained weights.

Contact

If you have any questions, feel free to open an issue or contact us at [email protected].


counting-detr's Issues

Code and Data

Hi, this is really impressive work. I am interested in it and wondering when the code and data will be released.

About the CNN+FPN

Excuse me. I have tried the code for this work. However, what confuses me is that the paper mentions using a CNN+FPN structure for feature extraction, but I didn't find the FPN in the code. In fact, after I added an FPN to the code myself, I found that the quality of the generated pseudo-labels decreased. Can you help me understand this? Thank you very much!

How many epochs need to run?

How many epochs do you need to run to achieve good results? I ran about 200 epochs, but the predicted annotations are empty.

Annotation error

The values of the key "id" differ between test_count.json and pseudo_lvis_test_cxcywh.json, so the dataloader cannot work correctly.
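
For reference, one way to inspect the mismatch (a sketch, assuming COCO-style annotation files with an images array, and jq installed):

# Dump the sorted image ids of each file, then diff them.
jq '[.images[].id] | sort' test_count.json > ids_a.txt
jq '[.images[].id] | sort' pseudo_lvis_test_cxcywh.json > ids_b.txt
diff ids_a.txt ids_b.txt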


Error on Stage 1 training with the FSCD-147 dataset

Hi VinAI research, great paper there for counting unknown objects!

Issue:
I have tried the following scripts to test stage 1 with the FSCD-147 dataset.

  1. This script runs successfully.
  2. This script raises "KeyError: 'whs'".

The error happens on stage 1, point 2 (as in the link above), where I get

KeyError: 'whs'

when trying to run this script on stage 1.

CMIIW, but could I know whether this is a bug, unfinished label sharing, or something else?

Here I also attach a snapshot of the error; I hope it helps.

Expected output: I could run the whole script in weakly_supervise_fscd_147.sh without any error.

Thanks in advance~

About Annotation json files

I have a question about the annotation files. What is the difference between count_train and unseen_count_train?
From what I understand, count_train shares its classes with count_val and count_test, while unseen_count_train and unseen_count_test don't share classes.
So, would it be appropriate to use unseen_count_train.json for a few-shot setting?
Also, is the model in Tab. 9 trained on unseen_count_train.json and evaluated on unseen_count_test.json?
