modelvshuman: Does your model generalise better than humans?

modelvshuman is a Python toolbox to benchmark the gap between human and machine vision. Using this library, both PyTorch and TensorFlow models can be evaluated on 17 out-of-distribution datasets with high-quality human comparison data.

๐Ÿ† Benchmark

The top-10 models are listed here; training dataset size is indicated in brackets. Additionally, standard ResNet-50 is included as the last entry of the table for comparison. Model ranks are calculated across the full range of 52 models that we tested. If your model scores better than some (or even all) of the models here, please open a pull request and we'll be happy to include it here!

Most human-like behaviour

| winner | model | accuracy difference ↓ | observed consistency ↑ | error consistency ↑ | mean rank ↓ |
|---|---|---|---|---|---|
| 🥇 | CLIP: ViT-B (400M) | .023 | .758 | .281 | 1.33 |
| 🥈 | SWSL: ResNeXt-101 (940M) | .028 | .752 | .237 | 4 |
| 🥉 | BiT-M: ResNet-101x1 (14M) | .034 | .733 | .252 | 4.33 |
| 👏 | BiT-M: ResNet-152x2 (14M) | .035 | .737 | .243 | 5 |
| 👏 | ViT-L (1M) | .033 | .738 | .222 | 6.66 |
| 👏 | BiT-M: ResNet-152x4 (14M) | .035 | .732 | .233 | 7.66 |
| 👏 | BiT-M: ResNet-50x3 (14M) | .040 | .726 | .228 | 9.33 |
| 👏 | BiT-M: ResNet-50x1 (14M) | .042 | .718 | .240 | 9.66 |
| 👏 | ViT-L (14M) | .035 | .744 | .206 | 9.66 |
| 👏 | SWSL: ResNet-50 (940M) | .041 | .727 | .211 | 11.66 |
| ... | standard ResNet-50 (1M) | .087 | .665 | .208 | 28.66 |
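
The table does not spell out how its columns are computed. As a rough sketch only, the snippet below assumes the commonly used definitions: observed consistency is the fraction of trials on which two observers are both right or both wrong, and error consistency is Cohen's kappa on those trial-level outcomes. The function names and the toy data are illustrative and not part of modelvshuman; the toolbox may aggregate across conditions and observers differently.

```python
import math


def accuracy(correct):
    """Fraction of trials answered correctly (1 = correct, 0 = wrong)."""
    return sum(correct) / len(correct)


def observed_consistency(a, b):
    """Fraction of trials where both observers are right or both are wrong."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty response lists of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def error_consistency(a, b):
    """Cohen's kappa: agreement beyond what the two accuracies alone predict."""
    c_obs = observed_consistency(a, b)
    p_a, p_b = accuracy(a), accuracy(b)
    # Agreement expected by chance from two independent observers.
    c_exp = p_a * p_b + (1 - p_a) * (1 - p_b)
    if c_exp == 1.0:
        return math.nan  # kappa is undefined when chance agreement is perfect
    return (c_obs - c_exp) / (1 - c_exp)


# Toy trial-by-trial outcomes on the same eight images (hypothetical data).
human = [1, 1, 1, 0, 0, 1, 0, 1]
model = [1, 1, 0, 0, 1, 1, 0, 1]

if __name__ == "__main__":
    print("accuracy difference:", abs(accuracy(human) - accuracy(model)))
    print("observed consistency:", observed_consistency(human, model))
    print("error consistency:", round(error_consistency(human, model), 3))
```

In this toy case both observers have the same accuracy, so the accuracy difference is zero, yet they agree on only six of eight trials. The kappa value corrects that agreement for what two independent observers with these accuracies would reach by chance.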

Highest out-of-distribution robustness

| winner | model | OOD accuracy ↑ | rank ↓ |
|---|---|---|---|
| 🥇 | Noisy Student: ENetL2 (300M) | .829 | 1 |
| 🥈 | ViT-L (14M) | .733 | 2 |
| 🥉 | CLIP: ViT-B (400M) | .708 | 3 |
| 👏 | ViT-L (1M) | .706 | 4 |
| 👏 | SWSL: ResNeXt-101 (940M) | .698 | 5 |
| 👏 | BiT-M: ResNet-152x2 (14M) | .694 | 6 |
| 👏 | BiT-M: ResNet-152x4 (14M) | .688 | 7 |
| 👏 | BiT-M: ResNet-101x3 (14M) | .682 | 8 |
| 👏 | BiT-M: ResNet-50x3 (14M) | .679 | 9 |
| 👏 | SimCLR: ResNet-50x4 (1M) | .677 | 10 |
| ... | standard ResNet-50 (1M) | .559 | 31 |

🔧 Installation

Clone the repository to a location of your choice and follow these steps (requires Python 3.8):

  1. Set the repository home path by running the following from the command line:

    export MODELVSHUMANDIR=/absolute/path/to/this/repository/
    
  2. Install the package (omit the -e option if you don't intend to add your own model or make any other changes):

    pip install -e .
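
If something goes wrong after installation, a quick check of the environment variable from step 1 can help. The helper below is a minimal sketch and not part of modelvshuman: `check_repo_home` is a hypothetical name, and the only assumption taken from the steps above is that `MODELVSHUMANDIR` should hold an absolute path to the cloned repository.

```python
import os
from pathlib import Path

ENV_VAR = "MODELVSHUMANDIR"  # variable name taken from step 1


def check_repo_home(environ=None):
    """Return the repository home as a Path, or raise RuntimeError (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    value = environ.get(ENV_VAR)
    if not value:
        raise RuntimeError(f"{ENV_VAR} is not set; export it as described in step 1")
    path = Path(value)
    if not path.is_absolute():
        raise RuntimeError(f"{ENV_VAR} should be an absolute path, got: {value}")
    if not path.is_dir():
        raise RuntimeError(f"{ENV_VAR} does not point to an existing directory: {value}")
    return path


if __name__ == "__main__":
    try:
        print("repository home:", check_repo_home())
    except RuntimeError as err:
        print("configuration problem:", err)
```

Passing a dictionary instead of relying on `os.environ` keeps the check easy to try out without touching the real environment.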
    

🔬 User experience

Edit examples/evaluate.py as desired and run it. This evaluates a list of models on the out-of-distribution datasets and generates plots. If you then compile latex-report/report.tex, all the plots are included in one convenient PDF report.

๐Ÿซ Model zoo

The following models are currently implemented:

If you add or implement your own model, please compute its ImageNet accuracy as a sanity check.

How to load a model

If you just want to load a model from the model zoo, this is what you can do:

    # loading a PyTorch model from the zoo
    from modelvshuman.models.pytorch.model_zoo import InfoMin
    model = InfoMin("InfoMin")

    # loading a TensorFlow model from the zoo
    from modelvshuman.models.tensorflow.model_zoo import efficientnet_b0
    model = efficientnet_b0("efficientnet_b0")

Then, the model can be evaluated via:

    import torch

    # images: a batch of input images
    output_numpy = model.forward_batch(images)

    # by default, type(output_numpy) is numpy.ndarray, which can be converted to a tensor via:
    output_tensor = torch.tensor(output_numpy)
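
As a small illustration of what can be done with such an output, the sketch below turns per-class scores into top-1 predictions and an accuracy. It assumes the output has one row per image and one score per class, which the text above does not state. It works on plain nested lists (for example `output_numpy.tolist()`) so that it runs without extra dependencies. The helper names and the toy scores are hypothetical.

```python
def top1_predictions(scores):
    """Index of the highest score in each row (one row per image)."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


def top1_accuracy(scores, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    predictions = top1_predictions(scores)
    if len(predictions) != len(labels) or not labels:
        raise ValueError("need one label per row and at least one row")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Toy batch: three images, three classes (hypothetical scores and labels).
scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
]
labels = [1, 0, 1]

if __name__ == "__main__":
    print(top1_predictions(scores))
    print(top1_accuracy(scores, labels))
```

On a NumPy array, `output_numpy.argmax(axis=1)` yields the same per-row indices in one call.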

How to list all available models

All implemented models are registered by the model registry, which can then be used to list all available models of a certain framework with the following method:

    from modelvshuman import models
    
    print(models.list_models("pytorch"))
    print(models.list_models("tensorflow"))

How to add a new model

Adding a new model is possible for standard PyTorch and TensorFlow models. Depending on the framework (pytorch / tensorflow), open modelvshuman/models/<framework>/model_zoo.py. Here, you can add your own model with a few lines of code, similar to how you would usually load it. If your model has a custom model definition, create a new subdirectory such as modelvshuman/models/<framework>/my_fancy_model/ containing fancy_model.py, which you can then import from model_zoo.py via from .my_fancy_model import fancy_model.
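
To make the idea of a model registry more concrete, here is a toy, self-contained sketch of the general pattern: a decorator records each model factory under its framework, and a listing function reads the record back. This is not modelvshuman's actual code. `register_model`, `load_model`, and the dictionary stand-in for a model are hypothetical; only the `list_models(framework)` call shape mirrors the usage shown earlier.

```python
# Toy registry: framework name -> {model name: factory function}.
_REGISTRY = {}


def register_model(framework):
    """Decorator that records a model factory under the given framework."""
    def decorator(factory):
        _REGISTRY.setdefault(framework, {})[factory.__name__] = factory
        return factory
    return decorator


def list_models(framework):
    """Sorted names of all factories registered for a framework."""
    return sorted(_REGISTRY.get(framework, {}))


def load_model(framework, name, *args, **kwargs):
    """Look up a factory by name and call it."""
    try:
        factory = _REGISTRY[framework][name]
    except KeyError:
        raise KeyError(f"no {framework} model named {name!r}") from None
    return factory(*args, **kwargs)


@register_model("pytorch")
def my_fancy_model(model_name="my_fancy_model"):
    # A real factory would build the network and wrap it for evaluation;
    # a plain dict stands in for the model here.
    return {"name": model_name, "framework": "pytorch"}


if __name__ == "__main__":
    print(list_models("pytorch"))
    print(load_model("pytorch", "my_fancy_model"))
```

The appeal of this design is that adding a model only requires defining one decorated function; nothing else needs to know the model exists.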

๐Ÿ“ Datasets

In total, 17 datasets with human comparison data collected under highly controlled laboratory conditions in the Wichmannlab are available.

Twelve datasets correspond to parametric or binary image distortions. Top row: colour/grayscale, contrast, high-pass, low-pass (blurring), phase noise, power equalisation. Bottom row: opponent colour, rotation, Eidolon I, II and III, uniform noise.

[Figure: noise-stimuli]

The remaining five datasets correspond to the following nonparametric image manipulations: sketch, stylized, edge, silhouette, texture-shape cue conflict.

[Figure: nonparametric-stimuli]

How to load a dataset

Similarly, if you're interested in just loading a dataset, you can do this via:

    from modelvshuman.datasets import sketch
    dataset = sketch(batch_size=16, num_workers=4)

How to list all available datasets

    from modelvshuman import datasets

    print(list(datasets.list_datasets().keys()))

💳 Credit

Psychophysical data were collected by us in the vision laboratory of the Wichmannlab.

While we collected the psychophysical data ourselves, we used existing image dataset sources. Twelve datasets were obtained from "Generalisation in humans and deep neural networks". Four datasets were obtained from "ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness". Additionally, we used one dataset from "Learning Robust Global Representations by Penalizing Local Predictive Power" (sketch images from ImageNet-Sketch).

We thank all model authors and repository maintainers for providing the models described above.
