context-cluster's Introduction

Image as Set of Points - ICLR'23 [Oral, Top5%]

by Xu Ma*, Yuqian Zhou*, Huan Wang, Can Qin, Bin Sun, Chang Liu, Yun Fu.

arXiv webpage


TO DO (Mar 9):

  • update the checkpoints (conv1x1 -> nn.Linear, shapes don't match)
  • add the large model and checkpoints
  • release code/checkpoints for CoC without region partition (re-trained with the updated code, with better results)
  • release the visualization script.
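
Until the checkpoints are updated, the shape mismatch from the first item could in principle be worked around when loading. The following is a minimal sketch, not code from this repository: the helper name `squeeze_conv1x1_weights` is hypothetical, and it assumes the only difference is that a 1x1 convolution weight of shape (out, in, 1, 1) is now expected as a linear weight of shape (out, in).

```python
def squeeze_conv1x1_weights(state_dict, target_shapes):
    """Reshape (out, in, 1, 1) conv weights to (out, in) linear weights.

    state_dict: mapping from parameter name to a tensor-like object with
        .shape and .reshape (torch.Tensor qualifies).
    target_shapes: mapping from parameter name to the shape the current
        model expects. Names missing from it are left unchanged.
    """
    converted = {}
    for name, tensor in state_dict.items():
        shape = tuple(tensor.shape)
        target = tuple(target_shapes.get(name, shape))
        is_conv1x1 = len(shape) == 4 and shape[2:] == (1, 1)
        # Only touch weights whose first two dims already match the target.
        if is_conv1x1 and target == shape[:2]:
            tensor = tensor.reshape(target)
        converted[name] = tensor
    return converted
```

With PyTorch, `target_shapes` could be built as `{k: tuple(v.shape) for k, v in model.state_dict().items()}` and the result passed to `model.load_state_dict`. Whether the released checkpoints differ only in this way is an assumption.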

Image Classification

1. Requirements

torch>=1.7.0; torchvision>=0.8.0; pyyaml; timm; einops; apex-amp (if you want to use fp16);

Data preparation: ImageNet with the following folder structure. You can extract ImageNet with this script.

│imagenet/
├──train/
│  ├── n01440764
│  │   ├── n01440764_10026.JPEG
│  │   ├── n01440764_10027.JPEG
│  │   ├── ......
│  ├── ......
├──val/
│  ├── n01440764
│  │   ├── ILSVRC2012_val_00000293.JPEG
│  │   ├── ILSVRC2012_val_00002138.JPEG
│  │   ├── ......
│  ├── ......
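
Before training, it can help to confirm that the dataset folder matches the layout above. The following is a minimal sketch using only the Python standard library. The function name `summarize_imagefolder` and the set of accepted file extensions are assumptions for illustration, not part of this repository.

```python
from pathlib import Path

# Assumed set of image extensions; adjust if your copy of ImageNet differs.
IMAGE_EXTENSIONS = {".jpeg", ".jpg", ".png"}


def summarize_imagefolder(root):
    """Count class folders and images under root/train and root/val.

    Returns a dict like {"train": (num_classes, num_images), "val": (...)}.
    Raises FileNotFoundError if a split folder is missing.
    """
    root = Path(root)
    summary = {}
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split folder: {split_dir}")
        class_dirs = [d for d in split_dir.iterdir() if d.is_dir()]
        num_images = sum(
            1
            for d in class_dirs
            for f in d.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
        summary[split] = (len(class_dirs), num_images)
    return summary
```

For the full ImageNet-1k dataset, each split should report 1000 class folders.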

2. Pre-trained Context Cluster Models

We have uploaded the checkpoints and logs to an anonymous Google Drive. Feel free to download them.

Model                                             #Params  Image resolution  Top-1 Acc  Throughput  Download
ContextCluster-tiny                               5.3M     224               71.8       518.4       [checkpoint & logs]
ContextCluster-tiny*                              5.3M     224               71.7       510.8       [checkpoint & logs]
ContextCluster-tiny_plain (w/o region partition)  5.3M     224               72.9       -           [checkpoint]
ContextCluster-small                              14.0M    224               77.5       513.0       [checkpoint & logs]
ContextCluster-medium                             27.9M    224               81.0       325.2       [checkpoint & logs]

3. Validation

To evaluate our Context Cluster models, run:

MODEL=coc_tiny # one of coc_{tiny, tiny2, small, medium}
python3 validate.py /path/to/imagenet --model $MODEL -b 128 --checkpoint {/path/to/checkpoint}

4. Train

We show how to train Context Cluster on 8 GPUs. The learning rate scales linearly with the total batch size: lr = bs / 1024 * 1e-3. For example, with a total batch size of 1024, the learning rate is 1e-3. At a batch size of 1024, a learning rate of 2e-3 sometimes gives better performance.

MODEL=coc_tiny # coc variants
DROP_PATH=0.1 # drop path rates
python3 -m torch.distributed.launch --nproc_per_node=8 train.py --data_dir /dev/shm/imagenet --model $MODEL -b 128 --lr 1e-3 --drop-path $DROP_PATH --amp
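
The scaling rule lr = bs / 1024 * 1e-3 can be written as a small helper for other GPU counts. This is a sketch rather than code from the repository: `scaled_lr` and its parameter names are hypothetical, and it assumes `-b` is the per-GPU batch size, which is consistent with 8 GPUs at 128 each giving a total of 1024.

```python
def scaled_lr(num_gpus, batch_per_gpu, base_lr=1e-3, base_batch=1024):
    """Linear scaling rule from the text: lr = bs / 1024 * 1e-3.

    bs is the total batch size across all GPUs (assumed to be
    num_gpus * batch_per_gpu).
    """
    total_batch = num_gpus * batch_per_gpu
    return total_batch / base_batch * base_lr


if __name__ == "__main__":
    # 8 GPUs x 128 images per GPU = 1024 total, matching the command above.
    print(scaled_lr(8, 128))
    # Halving the number of GPUs halves the total batch size and the rate.
    print(scaled_lr(4, 128))
```

The result would be passed to `--lr` in place of 1e-3 when the number of GPUs or the per-GPU batch size changes.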

5. Clustering Visualization

We provide a script to visualize the clustering results of CoC for a given stage, block, head.

Different layers/heads will present different clustering patterns.

# Usage example (the generated image will be saved to images/cluster_vis/{model}):
python cluster_visualize.py --image {path_to_image} --model {model} --checkpoint {path_to_checkpoint} --stage {stage} --block {block} --head {head}
 

See folder pointcloud for the point cloud classification task on ScanObjectNN.

See folder detection for detection and instance segmentation tasks on COCO.

See folder segmentation for Semantic Segmentation task on ADE20K.

BibTeX

@inproceedings{ma2023image,
    title={Image as Set of Points},
    author={Xu Ma and Yuqian Zhou and Huan Wang and Can Qin and Bin Sun and Chang Liu and Yun Fu},
    booktitle={The Eleventh International Conference on Learning Representations},
    year={2023},
    url={https://openreview.net/forum?id=awnvqZja69}
}

Acknowledgment

Our implementation is mainly based on the following codebases. We gratefully thank the authors for their wonderful work.

pointMLP, poolformer, pytorch-image-models, mmdetection, mmsegmentation.

License

The majority of Context Cluster is licensed under the Apache License 2.0.
