
lm4cv's Introduction

This is the official implementation of our ICCV paper Learning Concise and Descriptive Attributes for Visual Recognition.

Requirements

  • torch == 2.0.1
  • python 3.9.13
  • torchvision == 0.15.2
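
The pinned versions above can be checked before running anything. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `version_mismatches` are hypothetical, and it assumes the packages are installed under the distribution names torch and torchvision.

```python
import sys
from importlib import metadata

# Versions pinned in the Requirements list above.
PINNED = {"torch": "2.0.1", "torchvision": "0.15.2"}
PINNED_PYTHON = "3.9.13"


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(pinned=PINNED):
    """Map each package whose installed version differs from its pin to what was found."""
    mismatches = {}
    for package, wanted in pinned.items():
        found = installed_version(package)
        # Strip local build tags such as "+cu117" before comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[package] = found
    return mismatches


if __name__ == "__main__":
    current_python = ".".join(map(str, sys.version_info[:3]))
    if current_python != PINNED_PYTHON:
        print(f"python: wanted {PINNED_PYTHON}, found {current_python}")
    for package, found in version_mismatches().items():
        print(f"{package}: wanted {PINNED[package]}, found {found}")
```

A package that is not installed is reported with `None` as the found version.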

Datasets

  • CUB: Download the dataset from here and organize the downloaded files as shown in the tree below.
  • Stanford_Cars: Download the dataset from here and organize the downloaded files as shown in the tree below.
  • CIFAR10: Run python main.py --config configs/cifar10.yaml; the dataset is then downloaded automatically into the folder ./data/cifar-10-batches-py.
  • CIFAR100: Run python main.py --config configs/cifar100_bn.yaml; the dataset is then downloaded automatically into the folder ./data/cifar-100-python.
  • Flowers102: Run python main.py --config configs/flower.yaml; the dataset is then downloaded automatically into the folder ./data/flowers-102.
  • Food101: Run python main.py --config configs/food_bn.yaml; the dataset is then downloaded automatically into the folder ./data/food-101.
  • Oxford-Pets: Run python main.py --config configs/oxford_pets_bn.yaml; the dataset is then downloaded automatically into the folder ./data/oxford-iiit-pet.
  • Imagenet-Animals: Download the dataset from here and organize the downloaded files as shown in the tree below.
- data
    - CUB_200_2011
        - cub_attributes_gpt3.txt # generated by us
        - image_class_labels.txt # generated by us
        - train_test_split.txt
        - images.txt
        - attributes
        - images
        - parts
        - README.md
        - ...
    - stanford_cars
        - cars_attributes.txt # generated by us
        - image_class_labels.txt # generated by us
        - cars_train
            - *.jpg
        - cars_test
            - *.jpg
        - devkit
        - cars_train.tgz
        - cars_test.tgz
        - cars_test_annos_withlabels.mat
        # The download URL used by torchvision is invalid,
        # so download the files manually and place the tgz
        # files in this folder; the dataset class will then
        # treat the dataset as already downloaded.
    - cifar-10-batches-py
        - cifar10_attributes.txt # generated by us
        - image_class_labels.txt # generated by us
    - cifar-100-python
        - cifar100_attributes.txt # generated by us
        - image_class_labels.txt # generated by us
    - flowers-102
        - flower_attributes.txt # generated by us
        - image_class_labels.txt # generated by us
    - food-101
        - food_attributes.txt # generated by us
        - image_class_labels.txt # generated by us
    - oxford-iiit-pet
        - oxford_pets_attributes.txt # generated by us
        - image_class_labels.txt # generated by us
    - imagenet
        - imagenet_animal_attributes.txt # generated by us
        - imagenet_attributes.txt # generated by us
        - image_class_labels.txt # generated by us
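
Before training, it can help to confirm that the attribute and label files listed in the tree above are in place. The following is a minimal sketch, not part of the repository: the function name `find_missing_files` is hypothetical, the file names are copied from the tree, and ./data is assumed to be the data root.

```python
from pathlib import Path

# Files marked "generated by us" in the tree above, keyed by dataset folder.
EXPECTED_FILES = {
    "CUB_200_2011": ["cub_attributes_gpt3.txt", "image_class_labels.txt"],
    "stanford_cars": ["cars_attributes.txt", "image_class_labels.txt"],
    "cifar-10-batches-py": ["cifar10_attributes.txt", "image_class_labels.txt"],
    "cifar-100-python": ["cifar100_attributes.txt", "image_class_labels.txt"],
    "flowers-102": ["flower_attributes.txt", "image_class_labels.txt"],
    "food-101": ["food_attributes.txt", "image_class_labels.txt"],
    "oxford-iiit-pet": ["oxford_pets_attributes.txt", "image_class_labels.txt"],
    "imagenet": [
        "imagenet_animal_attributes.txt",
        "imagenet_attributes.txt",
        "image_class_labels.txt",
    ],
}


def find_missing_files(data_root="./data", datasets=None):
    """Return the expected files that are absent under data_root."""
    root = Path(data_root)
    names = datasets if datasets is not None else list(EXPECTED_FILES)
    missing = []
    for name in names:
        for filename in EXPECTED_FILES[name]:
            path = root / name / filename
            if not path.is_file():
                missing.append(str(path))
    return missing


if __name__ == "__main__":
    for path in find_missing_files():
        print("missing:", path)
```

Passing a list such as `["food-101"]` as `datasets` restricts the check to the datasets you actually plan to use.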

Attributes queried for each class

The attributes queried from GPT-3 for each class are in the folder cls2attributes.

Parameters

The following key parameters are available for customization:

  • cluster_feature_method: Choose one of kmeans, random, or linear. The linear option is our method.
  • model_size: Set the size of the CLIP model.
  • mahalanobis: Enable or disable Mahalanobis distance regularization.
  • division_power: Control the strength of Mahalanobis constraints.
  • reinit: Decide whether to initialize the model with weights from image training features.
  • num_attributes: Specify the number of attributes selected for classification.

Please make sure to adjust these parameters according to your requirements.
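
A config can be sanity-checked against the parameter descriptions above before launching a run. The following is a minimal sketch, not the repository's own validation: `validate_config` is a hypothetical helper, the parameter names come from the list above, and the value types (booleans for mahalanobis and reinit, a positive integer for num_attributes) are assumptions. model_size and division_power are left unchecked because their valid values are not stated here.

```python
# Allowed values taken from the parameter list above.
ALLOWED_CLUSTER_METHODS = ("kmeans", "random", "linear")


def validate_config(cfg):
    """Return a list of problems found in a config dict (empty if none)."""
    problems = []

    method = cfg.get("cluster_feature_method")
    if method not in ALLOWED_CLUSTER_METHODS:
        problems.append(
            f"cluster_feature_method must be one of {ALLOWED_CLUSTER_METHODS}, got {method!r}"
        )

    # Assumption: the two on/off switches are stored as booleans.
    for key in ("mahalanobis", "reinit"):
        if key in cfg and not isinstance(cfg[key], bool):
            problems.append(f"{key} should be true or false, got {cfg[key]!r}")

    # Assumption: num_attributes is a positive integer.
    n = cfg.get("num_attributes")
    if n is not None and (isinstance(n, bool) or not isinstance(n, int) or n <= 0):
        problems.append(f"num_attributes should be a positive integer, got {n!r}")

    return problems


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    example = {
        "cluster_feature_method": "linear",
        "mahalanobis": True,
        "reinit": True,
        "num_attributes": 32,
    }
    print(validate_config(example) or "config looks consistent")
```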

Citation

If you find our codebase useful for your research, please consider citing our paper:

@article{DBLP:journals/corr/abs-2308-03685,
  author       = {An Yan and
                  Yu Wang and
                  Yiwu Zhong and
                  Chengyu Dong and
                  Zexue He and
                  Yujie Lu and
                  William Wang and
                  Jingbo Shang and
                  Julian J. McAuley},
  title        = {Learning Concise and Descriptive Attributes for Visual Recognition},
  journal      = {CoRR},
  volume       = {abs/2308.03685},
  year         = {2023}
}

Contributors

wangyu-ustc, zijizhu
