Selecting Data Augmentation for Simulating Interventions

by Maximilian Ilse ([email protected]), Jakub M. Tomczak and Patrick Forré

Overview

PyTorch implementation of our paper "Selecting Data Augmentation for Simulating Interventions" (http://arxiv.org/abs/2005.01856).

Used modules

  • Python 3.6
  • PyTorch 1.0.1

Datasets

Pre-trained AlexNet

To reproduce our results on the PACS dataset, please use the pre-trained AlexNet available at: https://drive.google.com/file/d/1wUJTH1Joq2KAgrUDeKJghP1Wf7Q9w4z-/view?usp=sharing

Story behind the paper

Everybody who works with medical imaging data eventually comes across the same problem: appearance variability. This variability is usually caused by the equipment used to generate the data, e.g., CT scanners from different vendors produce images with different intensity patterns. If we train a CNN on data from a single scanner, we are likely to overfit to that scanner's specific intensity pattern. As a result, the model is likely to fail to generalize to data from a different scanner.
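To make the scanner example concrete, here is a minimal sketch in plain Python. It assumes, purely for illustration, that a scanner's intensity pattern can be modelled as a linear gain and offset applied to the underlying image; the names `scanner` and `random_intensity_augment` and all parameter ranges are hypothetical and do not come from this repository. It is not the augmentation-selection procedure from the paper, only an illustration of how randomizing intensity during training can imitate data from other scanners.

```python
import random


def scanner(image, gain, offset):
    """Apply a hypothetical scanner-specific linear intensity transform.

    `image` is a list of rows of floats in [0, 1]; results are clipped to [0, 1].
    """
    return [[min(1.0, max(0.0, gain * px + offset)) for px in row] for row in image]


def random_intensity_augment(image, rng, gain_range=(0.7, 1.3), offset_range=(-0.1, 0.1)):
    """Sample a random gain/offset (assumed ranges) and apply it to the image."""
    gain = rng.uniform(*gain_range)
    offset = rng.uniform(*offset_range)
    return scanner(image, gain, offset)


rng = random.Random(0)

# The same underlying anatomy, imaged by two hypothetical scanners.
anatomy = [[0.2, 0.4], [0.6, 0.8]]
scan_a = scanner(anatomy, gain=1.0, offset=0.0)  # training scanner
scan_b = scanner(anatomy, gain=0.8, offset=0.1)  # unseen scanner

# Augmented copies of the training scan cover a range of intensity patterns,
# so a model trained on them sees more than scanner A's specific pattern.
augmented = [random_intensity_augment(scan_a, rng) for _ in range(4)]
```

Because the sampled gain is always positive, the augmentation changes absolute intensities but preserves the relative ordering of pixel values, which is the part of the image assumed here to carry the anatomy.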

In late 2018, we started to work on the problem of domain generalization/learning invariant representations, motivated by the appearance variability in medical imaging data described above. In domain generalization, one tries to find a representation that generalizes across different environments, called domains, each with a different shift of the input.

This eventually led to a model that we called the Domain Invariant Variational Autoencoder (DIVA, https://arxiv.org/abs/1905.10427, thanks to my co-authors!). While the results of DIVA are promising, a couple of experiments didn’t make it into the paper because DIVA’s performance didn’t match that of a simple baseline CNN. For a while, we thought this was probably due to optimization issues and the like. During 2019, we realized that we had a very poor understanding of the problem itself.

Questions and Issues

If you find any bugs or have any questions about this code, please contact Maximilian. We cannot guarantee any support for this software.

Citation

Please cite our paper if you use this code in your research:

@article{ilse_selecting_2020,
	title = {Selecting {Data} {Augmentation} for {Simulating} {Interventions}},
	url = {http://arxiv.org/abs/2005.01856},
	urldate = {2020-05-06},
	journal = {arXiv:2005.01856 [cs, stat]},
	author = {Ilse, Maximilian and Tomczak, Jakub M. and Forré, Patrick},
	month = may,
	year = {2020},
	note = {arXiv: 2005.01856}
}

Acknowledgments

The work conducted by Maximilian Ilse was funded by the Nederlandse Organisatie voor Wetenschappelijk Onderzoek (Grant DLMedIa: Deep Learning for Medical Image Analysis).
