Context Aggregation Network

This repository maintains the official implementation of the paper Learning to Aggregate Multi-Scale Context for Instance Segmentation in Remote Sensing Images by Ye Liu, Huifang Li, Chao Hu, Shuang Luo, Yan Luo, and Chang Wen Chen.

Installation

We use the following environment settings. If automatic installation fails, you may install these packages manually.

  • CUDA 10.2 Update 2
  • CUDNN 8.0.5.39
  • Python 3.9.7
  • PyTorch 1.10.0
  • MMCV 1.3.17
  • MMDetection 2.18.1
  • NNCore 0.3.2
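
To compare an existing environment against the list above, a minimal sketch such as the following may help. The distribution names (`torch`, `mmcv-full`, `mmdet`, `nncore`) are assumptions about how the packages were installed, and the helper `check_versions` is hypothetical rather than part of this repository. CUDA, cuDNN, and Python versions are not checked here.

```python
from importlib import metadata

# Versions taken from the environment list above. The distribution names are
# assumptions (e.g. MMCV may be installed as "mmcv" instead of "mmcv-full").
EXPECTED = {
    "torch": "1.10.0",
    "mmcv-full": "1.3.17",
    "mmdet": "2.18.1",
    "nncore": "0.3.2",
}


def check_versions(expected=EXPECTED, get_version=metadata.version):
    """Return {name: (expected, installed_or_None, matches)} for each package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build tags, so "1.10.0+cu102" still matches "1.10.0".
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (wanted, installed, matches)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in check_versions().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:10s} expected {wanted:8s} installed {installed} [{status}]")
```

A mismatch does not necessarily mean the code will fail; it only flags a difference from the settings reported here.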

Install from source

  1. Clone the repository from GitHub.
git clone https://github.com/yeliudev/CATNet.git
cd CATNet
  2. Install dependencies.
pip install -r requirements.txt

Getting Started

Download and prepare the datasets

  1. Download and extract the datasets.

Note that the images in the iSAID dataset are split into patches with both sides no more than 512 pixels, as reported in our paper. We strongly recommend using this pre-processed version directly, since the official toolkit has bugs that lead to undesirable patch sizes (e.g. extreme aspect ratios).

  2. Prepare the files in the following structure.
CATNet
├── configs
├── datasets
├── models
├── tools
├── data
│   ├── dior
│   │   ├── Annotations
│   │   ├── ImageSets
│   │   ├── JPEGImages-test
│   │   └── JPEGImages-trainval
│   ├── hrsid
│   │   ├── annotations
│   │   └── images
│   ├── isaid
│   │   ├── annotations
│   │   ├── train
│   │   └── val
│   └── vhr
│       ├── ground truth
│       └── positive image set
├── README.md
├── setup.cfg
└── ···
  3. Convert DIOR annotations to PASCAL VOC format.
python tools/convert_dior.py
  4. Convert NWPU VHR-10 annotations to COCO format.
python tools/convert_vhr.py
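
Before running the conversion scripts, it can be useful to confirm that the `data` folder matches the structure listed in the preparation step. The following is a minimal sketch under that assumption. The helper `missing_dirs` is hypothetical and not part of this repository; it only checks that the directories exist, not that their contents are complete.

```python
from pathlib import Path

# Sub-directories of data/ copied from the directory structure in this README.
EXPECTED_DIRS = [
    "dior/Annotations",
    "dior/ImageSets",
    "dior/JPEGImages-test",
    "dior/JPEGImages-trainval",
    "hrsid/annotations",
    "hrsid/images",
    "isaid/annotations",
    "isaid/train",
    "isaid/val",
    "vhr/ground truth",
    "vhr/positive image set",
]


def missing_dirs(data_root, expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    # Assumes the script is run from the repository root (the CATNet folder).
    missing = missing_dirs("data")
    if missing:
        print("Missing directories:")
        for rel in missing:
            print(f"  data/{rel}")
    else:
        print("All expected dataset directories are present.")
```

If only some datasets are needed, pass a shorter list as `expected` so that the unused ones are not reported.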

Train a model

Run the following command to train a model using a specified config.

torchrun --nproc_per_node=4 tools/train.py <path-to-config>

Test a model and evaluate results

Run the following command to test a model and evaluate results.

torchrun --nproc_per_node=4 tools/test.py <path-to-config> <path-to-checkpoint>
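
The training and testing commands above differ only in the script and its arguments, so they can be assembled programmatically, for example when the number of available GPUs is not 4. The sketch below is illustrative only: `torchrun_command` is a hypothetical helper, and the config path in the usage example is a placeholder. Since the released models were trained on 4 GPUs, runs with a different process count may not reproduce the reported numbers exactly.

```python
import shlex


def torchrun_command(script, *script_args, nproc_per_node=4):
    """Build the torchrun argument list in the form used by this README."""
    if nproc_per_node < 1:
        raise ValueError("nproc_per_node must be at least 1")
    return ["torchrun", f"--nproc_per_node={nproc_per_node}", script, *script_args]


if __name__ == "__main__":
    # "configs/example.py" is a placeholder path, not a file from this repository.
    train = torchrun_command("tools/train.py", "configs/example.py", nproc_per_node=2)
    # Print the shell-quoted command so it can be inspected before running.
    print(shlex.join(train))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` to launch the job.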

Model Zoo

We provide multiple pre-trained models here. All the models are trained using 4 NVIDIA Tesla V100-SXM2 GPUs and are evaluated using the default metrics of the datasets.

| Dataset | Model | Backbone | Schedule | Aug | BBox AP | Mask AP | Download |
|---|---|---|---|---|---|---|---|
| iSAID | CAT Mask R-CNN | ResNet-50 | 1x | ✗ | 46.2 | 38.5 | model / metrics |
| iSAID | CAT Mask R-CNN | ResNet-50 | 1x | ✓ | 47.6 | 40.1 | model / metrics |
| DIOR | CATNet | ResNet-50 | 3x | ✗ | 76.3 | - | model / metrics |
| DIOR | CATNet | ResNet-50 | 3x | ✓ | 78.6 | - | model / metrics |
| DIOR | CAT R-CNN | ResNet-50 | 3x | ✗ | 77.7 | - | model / metrics |
| DIOR | CAT R-CNN | ResNet-50 | 3x | ✓ | 81.9 | - | model / metrics |
| NWPU VHR-10 | CATNet | ResNet-50 | 6x | ✗ | 95.8 | - | model / metrics |
| NWPU VHR-10 | CATNet | ResNet-50 | 6x | ✓ | 97.4 | - | model / metrics |
| NWPU VHR-10 | CAT R-CNN | ResNet-50 | 6x | ✗ | 96.4 | - | model / metrics |
| NWPU VHR-10 | CAT R-CNN | ResNet-50 | 6x | ✓ | 97.7 | - | model / metrics |
| HRSID | CAT Mask R-CNN | ResNet-50 | 3x | ✗ | 71.7 | 58.2 | model / metrics |
| HRSID | CAT Mask R-CNN | ResNet-50 | 3x | ✓ | 73.3 | 59.6 | model / metrics |
| HRSID | CAT R-CNN | ResNet-50 | 3x | ✗ | 70.5 | - | model / metrics |
| HRSID | CAT R-CNN | ResNet-50 | 3x | ✓ | 72.8 | - | model / metrics |

Citation

If you find this project useful for your research, please cite our paper.

@techreport{liu2021learning,
  title={Learning to Aggregate Multi-Scale Context for Instance Segmentation in Remote Sensing Images},
  author={Liu, Ye and Li, Huifang and Hu, Chao and Luo, Shuang and Luo, Yan and Chen, Chang Wen},
  number={arXiv:2111.11057},
  year={2021}
}
