

Continual Evaluation for Lifelong Learning

This is a PyTorch- and Avalanche-based repository that enables fine-grained continual evaluation in continual learning, as opposed to the standard evaluation performed only at task transitions. It is the main codebase for the ICLR 2023 Spotlight paper: "Continual evaluation for lifelong learning: Identifying the stability gap".

Why continual evaluation? Using continual evaluation, our work finds a stability gap, where representative continual learning methods falter in maintaining robust performance during the learning process. Measuring continual worst-case performance is important to enable continual learners in the real world, especially for safety-critical applications and real-world actuators.

Main features of this repo:

  • Continual Eval: Per-iteration evaluation, including test-set subsampling and adjustable evaluation periodicity
  • Continual evaluation metrics:
    • New metrics: Worst-case Accuracy (WC-ACC), Average Minimum Accuracy (Min-ACC), Windowed-Forgetting (WF), Windowed-Plasticity (WP).
    • Existing metrics: Learning Curve Area (LCA), Average Forgetting (FORG), Average Accuracy (ACC).
  • Extensive tracking: Track all stats of your continual learning model, e.g. per-iteration feature drift and gradient norms.
  • 7 Continual Learning benchmarks based on: MNIST, CIFAR10, Mini-Imagenet, Mini-DomainNet, PermutedMNIST, RotatedMNIST, Digits
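
To illustrate why evaluation periodicity matters, here is a minimal, framework-free sketch. It is not the repository's implementation: the function `track_min_accuracy`, its parameters, and the toy accuracy curve are hypothetical, and the "minimum accuracy" here is simply the lowest value seen across evaluation points, which simplifies the metrics defined in the paper.

```python
from typing import Callable, Dict, List


def track_min_accuracy(
    num_iterations: int,
    eval_periodicity: int,
    evaluate: Callable[[int], float],
) -> Dict[str, object]:
    """Evaluate every `eval_periodicity` iterations and keep the running minimum.

    Hypothetical helper for illustration only; `evaluate(it)` stands in for
    evaluating the model on held-out data after training iteration `it`.
    """
    if eval_periodicity < 1:
        raise ValueError("eval_periodicity must be >= 1")
    history: List[float] = []
    min_acc = float("inf")
    for it in range(1, num_iterations + 1):
        # A training step on the current mini-batch would happen here.
        if it % eval_periodicity == 0:
            acc = evaluate(it)
            history.append(acc)
            min_acc = min(min_acc, acc)
    return {"history": history, "min_acc": min_acc}


# Toy accuracy curve (made-up values) with a temporary dip at iteration 2.
toy_curve = {1: 0.90, 2: 0.40, 3: 0.85, 4: 0.88}

# Per-iteration evaluation observes the dip.
fine = track_min_accuracy(4, 1, lambda it: toy_curve[it])
# Evaluating only at the end (periodicity 4) misses it.
coarse = track_min_accuracy(4, 4, lambda it: toy_curve[it])
```

With these toy values, the per-iteration run records the transient drop, while the coarse run only sees the recovered accuracy at the final iteration.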

Project Status: Codebase delivered as is, no support available.

Setup

This code uses

  • Python 3.8
  • Avalanche 0.1.0 (beta)
  • PyTorch 1.8.1

To set up your environment, you can use the install script, which automatically creates an Anaconda environment for you. The script defines the default versions used for the paper.

./install_script.sh

You can also define your own conda environment with the environment.yml file.

conda env create -n CLEVAL_ENV -f environment.yml python=3.8
conda activate CLEVAL_ENV

Reproducing results

All configs for the experiments can be found in reproduce/configs. The yaml config files enable a structured way to pass arguments to the python script.

To reproduce an experiment, simply run ./reproduce/run.sh and pass the yaml filename. For example:

./reproduce/run.sh splitmnist_ER.yaml

Note that, for computational efficiency, we did not run with the deterministic cuDNN backend, which might result in small deviations in results. We average all results over 5 initialization seeds. These can be run at once (with n_seeds=5), or you can define a specific seed per run (e.g. seed=0).
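
The averaging over seeds can be sketched as follows. This is an illustrative helper, not code from the repository: `average_over_seeds` and `run_experiment` are hypothetical names, the sketch assumes seeds are numbered 0 to n_seeds-1, and the toy scores are placeholders rather than reported results.

```python
import statistics
from typing import Callable, Tuple


def average_over_seeds(
    run_experiment: Callable[[int], float], n_seeds: int = 5
) -> Tuple[float, float]:
    """Run one experiment per seed and return (mean, sample std) of the scores.

    Assumption: seeds are 0 .. n_seeds - 1.
    """
    scores = [run_experiment(seed) for seed in range(n_seeds)]
    mean = statistics.mean(scores)
    # The sample standard deviation is undefined for a single run.
    std = statistics.stdev(scores) if len(scores) > 1 else 0.0
    return mean, std


# Placeholder experiment: the score depends deterministically on the seed.
mean_acc, std_acc = average_over_seeds(lambda seed: 0.80 + 0.01 * seed, n_seeds=5)
```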

Continual Evaluation Implementation

The continual evaluation is integrated in the Avalanche flow of continual learning. For documentation, see here.

  • src/eval/continual_eval.py: Introduces the Continual Evaluation tracking flow, which runs after each training iteration in Avalanche's after_training_iteration phase. The additional phases are defined as:

      # Standard Avalanche CL flow
      ...                          
      - before_training_iteration
      - after_training_iteration
          - before_tracking        # BEGIN Integrated Continual Evaluation
          - before_tracking_step
          - before_tracking_batch
          - after_tracking_batch
          - after_tracking_step
          - after_tracking         # END
      - after_training_epoch
      ...
    
  • src/eval/continual_eval_metrics.py: Contains all the continual evaluation metrics, which all inherit from TrackerPluginMetric. This plugin defines all the Continual Evaluation tracking phases, which each metric can override as appropriate.

  • main.py passes a list of plugins to Avalanche's EvaluationPlugin. ContinualEvaluationPhasePlugin comes first in the list, so that it updates the Continual Evaluation metric states on after_training_iteration. The metric plugins come next, so that they log the updated metrics on after_training_iteration at each iteration.
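
The tracking phases and the plugin ordering described above can be sketched without Avalanche. The class `RecordingPlugin` and the function `run_tracking_flow` are hypothetical and do not mirror the repository's classes. The nesting of the step and batch phases inside loops is an assumption inferred from the phase names.

```python
from typing import List


class RecordingPlugin:
    """Hypothetical plugin that records every phase it is called for."""

    def __init__(self, name: str, log: List[str]) -> None:
        self.name = name
        self.log = log

    def handle(self, phase: str) -> None:
        self.log.append(f"{self.name}:{phase}")


def run_tracking_flow(
    plugins: List[RecordingPlugin], n_steps: int, n_batches: int
) -> None:
    """Emit the tracking phases that would run inside after_training_iteration.

    Assumption: each tracking step loops over batches, and plugins are
    called in list order for every phase.
    """

    def trigger(phase: str) -> None:
        for plugin in plugins:
            plugin.handle(phase)

    trigger("before_tracking")
    for _ in range(n_steps):
        trigger("before_tracking_step")
        for _ in range(n_batches):
            trigger("before_tracking_batch")
            trigger("after_tracking_batch")
        trigger("after_tracking_step")
    trigger("after_tracking")


log: List[str] = []
# The phase plugin is listed first, so it always runs before the metric plugin.
plugins = [RecordingPlugin("phase", log), RecordingPlugin("metric", log)]
run_tracking_flow(plugins, n_steps=1, n_batches=1)
```

Because dispatch follows list order, the plugin placed first sees each phase before the others, which is why the phase plugin precedes the metric plugins in the list.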

Visualize results

We support both TensorBoard and WandB. To view results in TensorBoard, run:

tensorboard --logdir=OUTPUT_DIR

Citing and license

Please consider citing us upon using this repo:

@inproceedings{
  delange2023continual,
  title={Continual evaluation for lifelong learning: Identifying the stability gap},
  author={Matthias De Lange and Gido M van de Ven and Tinne Tuytelaars},
  booktitle={The Eleventh International Conference on Learning Representations },
  year={2023},
  url={https://openreview.net/forum?id=Zy350cRstc6}
}

Code is available under MIT license: A short and simple permissive license with conditions only requiring preservation of copyright and license notices. See LICENSE for the full license.
