class-norm's Introduction

About

This repo contains the code for the ICLR 2021 paper "Class Normalization for Continual Zero-Shot Learning":

  • the code to reproduce ZSL and CZSL results
  • the proposed CZSL metrics (located in src/utils/metrics.py)
  • a fast Python implementation of the AUSUC metric

[arXiv Paper] [Google Colab] [OpenReview Paper]

In this project, we explored different normalization strategies used in ZSL and proposed a new one, class normalization, that is suited for deep attribute embedders. This allowed us to outperform existing ZSL models with a simple 3-layer MLP trained in just 30 seconds. We also extended ZSL ideas to a more general setting, Continual Zero-Shot Learning (CZSL), proposed a set of metrics for it, and tested several baselines.

Class Normalization illustration
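As a rough illustration, here is a minimal sketch of such a normalization layer in PyTorch. It assumes the layer simply rescales each embedding to norm sqrt(d), mirroring the attribute-normalization line quoted in the issues below; see the paper for the exact definition of Class Normalization.

import torch
import torch.nn as nn

class ClassNorm(nn.Module):
    # Illustrative sketch, not the repo's exact module: rescale each row
    # (one embedding vector) to norm sqrt(d), as in the notebook line
    # attrs / attrs.norm(dim=1, keepdim=True) * np.sqrt(attrs.shape[1]).
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        d = x.shape[1]
        return x / x.norm(dim=1, keepdim=True) * d ** 0.5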

Installation & training

Data preparation

For ZSL

For ZSL, we tested our method on the standard GBU datasets, which you can download from the original website. The easiest way to reproduce the results is to follow our Google Colab.

For CZSL

For CZSL, we tested our method on the SUN and CUB datasets. In contrast to ZSL, in CZSL we used raw images as inputs instead of features from an ImageNet-pretrained model. For CUB, please follow the instructions in the A-GEM repo. Note that the CUB images now have to be downloaded manually from here, but we used the same splits as A-GEM. Put the A-GEM splits into the CUB data folder.

For SUN, download the data from the official website, put it under data/SUN, and then follow the instructions in scripts/sun_data_preprocessing.py.

Installing the firelab dependency

You will need to install the firelab library to run the training:

pip install firelab

Running ZSL training

Please refer to this Google Colab notebook: it contains the code to reproduce our results.

Running CZSL training

To run CZSL training, use the following command:

python src/run.py -c basic|agem|mas|joint -d cub|sun
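For example, to train the MAS baseline on CUB:

python src/run.py -c mas -d cub

Here -c selects the method (basic, agem, mas, or joint) and -d the dataset (cub or sun).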

Please note that by default we load all the data into memory (to speed things up). This behaviour is controlled by the in_memory flag in the config.

Results

Zero-shot learning results

ZSL results

Continual Zero-Shot Learning results

CZSL results

Training speed results for ZSL

Training speed results

class-norm's People

Contributors

universome

class-norm's Issues

use class-norm

Hi!
When I used class-norm in my code, I found that the accuracy dropped a lot.
Here is my code.
[Screenshot omitted.]
With my method, accuracy on the AwA2 dataset drops considerably, e.g. from 60% to 45%.
Am I using the wrong method?
thanks!

Normalization usage?

Hi, thank you for the awesome work.
I have a question on using class normalization.
According to the 'class-norm-for-czsl.ipynb' file in this repo,
ClassNorm (CN) seems to be applied in the following form:

FC - CN - ReLU - CN - FC - ReLU.

But to my intuition this seems a little odd, since layers are usually stacked in the form:

FC - Normalization - ReLU - FC - Normalization - ReLU.

The current form has an activation layer between two ClassNorm layers, without any Conv / FC layer in between.
Is this intended?
I have gone through the paper but could not find the answer, possibly due to a gap in my understanding.
Could you kindly clarify this?
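For reference, the quoted ordering corresponds to a stack like the following sketch, written with the illustrative ClassNorm module from the About section; the layer sizes here are hypothetical.

import torch.nn as nn

# Sketch of the quoted ordering: FC - CN - ReLU - CN - FC - ReLU.
# 312 and 2048 are hypothetical attribute/feature dimensionalities.
embedder = nn.Sequential(
    nn.Linear(312, 512),
    ClassNorm(),  # ClassNorm as sketched in the About section
    nn.ReLU(),
    ClassNorm(),
    nn.Linear(512, 2048),
    nn.ReLU(),
)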

Why use an additional multiplication by np.sqrt(attrs.shape[1]) in attribute normalization?

Attribute Normalization (AN) in the paper is defined as

a_hat = a / ||a||_2

but the code applies AN like this:

a_hat = sqrt(d_a) * a / ||a||_2

where d_a is the dimensionality of the attribute vector. The corresponding code (in the preprocessing part of the class-norm-for-czsl.ipynb file) is:

attrs = attrs / attrs.norm(dim=1, keepdim=True) * np.sqrt(attrs.shape[1])

I couldn't find this additional multiplication in the paper, and it seems to have a huge influence on performance.
Can you tell me why it is used?
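One way to see what the extra factor does (a small self-contained check, not from the repo): plain unit-norm scaling makes the mean squared entry of a d-dimensional vector 1/d, so individual entries shrink as d grows; the extra sqrt(d) factor undoes this, giving unit scale per coordinate on average.

import numpy as np
import torch

attrs = torch.randn(50, 312)  # hypothetical: 50 classes, 312-dim attributes
an = attrs / attrs.norm(dim=1, keepdim=True) * np.sqrt(attrs.shape[1])
print(an.norm(dim=1))           # every row now has norm sqrt(312) ~= 17.66
print(an.pow(2).mean().item())  # 1.0: unit scale per coordinate on average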

Request for used slurm.utils

Hi @universome ,

In nm-zsl/src/run.py, line 11, you have:

from slurm.utils import generate_experiments_from_hpo_grid

However, in the currently released code, slurm.utils is not provided. Could you please also add it to the repo?

Thanks a lot!
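Until the module is released, a minimal stand-in consistent with the function's name might look like this; it is purely a guess at the intended behaviour (expanding a hyperparameter grid into per-experiment configs), not the author's actual code.

from itertools import product

def generate_experiments_from_hpo_grid(grid):
    # Hypothetical stub: expand a dict mapping hyperparameter names to
    # lists of values into one config dict per point of the Cartesian grid.
    keys = list(grid)
    return [dict(zip(keys, vals)) for vals in product(*(grid[k] for k in keys))]

For example, generate_experiments_from_hpo_grid({'lr': [1e-3, 1e-4], 'seed': [0, 1]}) would yield four experiment configs.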
