GCoNet+

This repo is the official implementation of "GCoNet+: A Stronger Group Collaborative Co-Salient Object Detector".

[arXiv] [code] [stuff]

Abstract

In this paper, we present a novel end-to-end group collaborative learning network, termed GCoNet+, which can effectively and efficiently (250 fps) identify co-salient objects in natural scenes. The proposed GCoNet+ achieves the new state-of-the-art performance for co-salient object detection (CoSOD) through mining consensus representations based on the following two essential criteria: 1) intra-group compactness to better formulate the consistency among co-salient objects by capturing their inherent shared attributes using our novel group affinity module (GAM); 2) inter-group separability to effectively suppress the influence of noisy objects on the output by introducing our new group collaborating module (GCM) conditioning on the inconsistent consensus. To further improve the accuracy, we design a series of simple yet effective components as follows: i) a recurrent auxiliary classification module (RACM) promoting the model learning at the semantic level; ii) a confidence enhancement module (CEM) helping the model to improve the quality of the final predictions; and iii) a group-based symmetric triplet (GST) loss guiding the model to learn more discriminative features. Extensive experiments on three challenging benchmarks, i.e., CoCA, CoSOD3k, and CoSal2015, demonstrate that our GCoNet+ outperforms the existing 12 cutting-edge models. Code has been released at https://github.com/ZhengPeng7/GCoNet_plus.

Framework Overview

The network architecture figure was drawn in Inkscape (0.92.5) and saved as an .svg file. You can download and modify it if you find it useful.

arch

Result

  • Comparison with the previous state-of-the-art methods with different training sets:

image-20220601123106208

  • Ablation study:

image-20220426224944251

image-20220426225011381

image-20220426225038722

Prediction

To illustrate the performance of GCoNet+, we compare it qualitatively with the most recent top-performing models (UFO-arXiv2022, DCFM-CVPR2022, and CADC-ICCV2021).

We show not only selected extremely hard samples from the test sets, but also unscreened samples (the first 10 samples of the first group in CoCA) for a more objective and fair qualitative comparison.

  • The first 10 samples of the first group in CoCA (unscreened):

qual4README.png

  • The extremely hard cases:

qual4README.png

Usage

  1. Environment

    GPU: V100 x 1
    Install Python 3.7, PyTorch 1.8.2
    pip install -r requirements.txt
    
    
  2. Datasets preparation

    Download the individual train/test datasets from my Google Drive, or download datasets.zip from the same folder to get all the data you need, arranged in the structure shown below (COCO-SEG is large, so you can download it separately). The directory structure on my machine is as follows:

    +-- datasets
    |   +-- sod
    |       +-- images
    |           +-- DUTS_class
    |           +-- COCO-9k
    |           ...
    |           +-- CoSal2015
    |       +-- gts
    |           +-- DUTS_class
    |           +-- COCO-9k
    |           ...
    |           +-- CoSal2015
    |   ...
    ...
    +-- codes
    |   +-- sod
    |       +-- GCoNet_plus
    |       ...
    |   ...
    ...
    
  3. Update the paths

    Replace all occurrences of /root/datasets/sod/GCoNet_plus and /root/codes/sod/GCoNet_plus in this project with /YOUR_PATH/datasets/sod/GCoNet_plus and /YOUR_PATH/codes/sod/GCoNet_plus, respectively.

  4. Training + Test + Evaluate + Select the Best

    ./gco.sh

    If you have access to more GPUs on a DGX cluster, you can run ./sub_by_id.sh to submit the job multiple times for more stable results.

    If you run out of GPU memory (OOM), please decrease batch_size in config.py.

  5. Adapt the settings of modules in config.py

    You can change the loss weights, try various backbones, or use different data augmentation strategies. There are also some modules that are implemented but not used in this work, such as adversarial training, the refiner from BASNet, and the weighted multiple outputs and supervision used in GCoNet and GICD.

    image-20220426234911555
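
Steps 2 and 3 above can also be checked and applied with a short script. The following is a minimal sketch and is not part of this repository: the function names missing_dirs and rewrite_root, the restriction to .py and .sh files, and the default /root prefix are assumptions made for illustration. It reports which dataset folders from the tree in step 2 are absent (only the three names shown there; the elided ones are not guessed) and replaces the /root/datasets/sod and /root/codes/sod prefixes, which cover the two paths named in step 3.

```python
from pathlib import Path

# Assumption: only script files hard-code the paths mentioned in step 3.
TEXT_SUFFIXES = {".py", ".sh"}


def missing_dirs(root, datasets=("DUTS_class", "COCO-9k", "CoSal2015")):
    """Return the expected dataset folders that do not exist under `root`.

    Only the dataset names visible in the README tree are checked by default.
    """
    base = Path(root) / "datasets" / "sod"
    expected = [base / kind / name for kind in ("images", "gts") for name in datasets]
    return [str(p) for p in expected if not p.is_dir()]


def rewrite_root(project_dir, new_root, old_root="/root", dry_run=True):
    """Swap `old_root` for `new_root` in dataset/code paths inside scripts.

    Returns the files containing the old prefix. Files are only modified
    when `dry_run` is False.
    """
    new_root = new_root.rstrip("/")
    swaps = {
        f"{old_root}/{sub}": f"{new_root}/{sub}"
        for sub in ("datasets/sod", "codes/sod")
    }
    changed = []
    for path in sorted(Path(project_dir).rglob("*")):
        if not path.is_file() or path.suffix not in TEXT_SUFFIXES:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # skip files that are not valid UTF-8 text
        new_text = text
        for old, new in swaps.items():
            new_text = new_text.replace(old, new)
        if new_text != text:
            changed.append(str(path))
            if not dry_run:
                path.write_text(new_text, encoding="utf-8")
    return changed


if __name__ == "__main__":
    # Hypothetical usage: run from the project directory, with /YOUR_PATH
    # standing in for your own root as in step 3.
    print("Missing dataset folders:", missing_dirs("/YOUR_PATH"))
    print("Files that would change:", rewrite_root(".", "/YOUR_PATH"))
```

The dry run is the default so that you can review the list of affected files before passing dry_run=False to write the changes.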

Download

Find the well-trained models, predicted saliency maps, and all other materials in my Google Drive folder for this work:

GD_content

Acknowledgement

We appreciate the codebases of GICD and GCoNet. Thanks also for the CoSOD evaluation toolbox provided in eval-co-sod.

Citation

@article{zheng2022gconet+,
  title = {GCoNet+: A Stronger Group Collaborative Co-Salient Object Detector},
  author = {Zheng, Peng and Fu, Huazhu and Fan, Deng-Ping and Fan, Qi and Qin, Jie and Tai, Yu-Wing and Tang, Chi-Keung and Van Gool, Luc},
  journal = {arXiv preprint arXiv:2205.15469},
  year = {2022},
}
@inproceedings{fan2021gconet,
  title = {Group Collaborative Learning for Co-Salient Object Detection},
  author = {Fan, Qi and Fan, Deng-Ping and Fu, Huazhu and Tang, Chi-Keung and Shao, Ling and Tai, Yu-Wing},
  booktitle = {CVPR},
  year = {2021}
}

Contact

For any questions, discussion, or even complaints, feel free to open an issue here or send me an e-mail ([email protected]).
