This project has moved to https://gitlab.com/willgdjones/HistoVAE

HistoVAE

This repository outlines the HistoVAE project, which uses unsupervised deep learning to learn representations from histopathology images, so that unlabelled histopathology images can be put to use.

We use the annotated histopathology images from the GTEx project (Genotype-Tissue Expression Project).

Table of Contents

  1. Download
  2. Patch Coordinates
  3. Training the Convolutional Autoencoder
  4. Viewing decoded encodings

Download

To download a small subset of the dataset from NCI, run:

python scripts/download.py --n_images 50 --n_tissues 10

This will download 50 images from the 10 tissues with the greatest number of samples. Samples are sorted by donorID to ensure replicability.
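The selection rule described above can be sketched roughly as follows. This is an illustration only, not the code in scripts/download.py: the (donor_id, tissue) record layout and the function name select_samples are hypothetical, and the sketch assumes that n_images is a per-tissue count, which the text does not specify.

```python
from collections import Counter


def select_samples(samples, n_images, n_tissues):
    """Pick n_images samples from each of the n_tissues largest tissues.

    `samples` is a list of (donor_id, tissue) tuples. This layout is a
    hypothetical stand-in for whatever metadata the real script reads.
    """
    counts = Counter(tissue for _, tissue in samples)
    # Largest tissues first; ties are broken alphabetically so that the
    # selection is deterministic.
    top_tissues = sorted(counts, key=lambda t: (-counts[t], t))[:n_tissues]

    selected = {}
    for tissue in top_tissues:
        # Sorting by donor ID makes repeated runs return the same samples.
        donors = sorted(d for d, t in samples if t == tissue)
        selected[tissue] = donors[:n_images]
    return selected
```

Because both the tissue ranking and the donor ordering are deterministic, two runs over the same metadata select the same images.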

Patch Coordinates

Much of each tissue image is whitespace. We segment the foreground and background of the tissue slice using Otsu thresholding. We sample square patches with side lengths of 128, 256, 512 and 1024 pixels. We reject any patch in which more than 25% of the pixels are whitespace, defined as pixels that fall above the Otsu threshold and so belong to the background. We store the coordinates of the selected patches in HDF5 files within the data/patches directory using PyTables. Using these coordinates, and knowing the patch size, we efficiently retrieve sampled patches at any level from the image using the OpenSlide DeepZoomGenerator.
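The thresholding and rejection steps can be illustrated with a small, dependency-free sketch. It assumes 8-bit greyscale pixel values and treats pixels brighter than the Otsu threshold as whitespace; the function names are hypothetical, and the project itself may rely on a library implementation of Otsu's method instead.

```python
def otsu_threshold(pixels, n_levels=256):
    """Return the grey level t that maximises between-class variance.

    `pixels` is a flat iterable of integer grey levels in [0, n_levels).
    Pixels <= t form the dark class, pixels > t the bright class.
    """
    hist = [0] * n_levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    total_sum = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_low, sum_low = 0, 0
    for t in range(n_levels):
        weight_low += hist[t]
        sum_low += t * hist[t]
        weight_high = total - weight_low
        if weight_low == 0 or weight_high == 0:
            continue  # one class is empty, so there is no split to score
        mean_low = sum_low / weight_low
        mean_high = (total_sum - sum_low) / weight_high
        # Between-class variance, up to a constant factor of 1 / total**2.
        between_var = weight_low * weight_high * (mean_low - mean_high) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def whitespace_fraction(patch, threshold):
    """Fraction of pixels in a 2D patch (list of rows) above the threshold."""
    n_pixels = sum(len(row) for row in patch)
    n_white = sum(1 for row in patch for p in row if p > threshold)
    return n_white / n_pixels


def accept_patch(patch, threshold, max_whitespace=0.25):
    """Keep a patch unless more than max_whitespace of it is whitespace."""
    return whitespace_fraction(patch, threshold) <= max_whitespace
```

In practice the threshold would be computed once per slide, typically on a low-resolution view, and then reused for every candidate patch from that slide.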

Training the Convolutional Autoencoder

We train a Convolutional Autoencoder with convolutional layers in both the encoder and the decoder. After each convolutional layer in the encoder, we perform 2D max-pooling. After each convolutional layer in the decoder, we perform 2D up-sampling. Together, the up-sampling and convolution operations in the decoder play the role of a deconvolutional (transposed convolution) layer. In the final layer of the encoder, and the first layer of the decoder, we perform dropout with probability 0.5. We use L2 regularization on the final encoded representation, and vary the dimension of this final representation to be a vector of length 256, 512 or 1024.
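To make the effect of the pooling stages concrete, the sketch below traces the feature-map shape through the encoder. It uses the filter counts given in the training details that follow, and it assumes RGB input, convolutions that preserve spatial size, and 2x2 max-pooling; none of these three assumptions is stated in the text, and the function name is hypothetical.

```python
def encoder_shapes(patch_size, filters=(128, 64, 32, 16), pool=2):
    """Trace (height, width, channels) after each conv + max-pool stage.

    Assumes the convolution itself leaves the spatial size unchanged
    ('same' padding) and that pooling windows do not overlap.
    """
    shapes = [(patch_size, patch_size, 3)]  # assumed RGB input patch
    size = patch_size
    for n_filters in filters:
        size //= pool  # each max-pool halves height and width
        shapes.append((size, size, n_filters))
    return shapes


for shape in encoder_shapes(128):
    print(shape)
```

Under these assumptions a 128-pixel patch ends as an 8 x 8 x 16 feature map, which holds 1024 values. A decoder built from the same stages in reverse would double the spatial size at each up-sampling step.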

We augment patches passed through the autoencoder by performing horizontal and vertical flips of the patch, each with probability 0.5. During training, we use the Adam optimizer with a learning rate of 0.0001 and a beta of 0.5. We found the performance of the model to be sensitive to these hyperparameters. For example, when using a learning rate of 0.0005, we noticed stochastic jumps during training. We use a batch size of 64. We use 128 filters for the first convolutional layer, 64 filters for the second, 32 for the third, and 16 for the last convolutional layer. The order of the filters is reversed in the decoder. Receptive fields of size (3, 3) are used throughout.
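The flip augmentation can be sketched as below. This is a minimal illustration on nested lists rather than the project's own code; the function name random_flip and its arguments are hypothetical, and an array library would normally be used for real image data.

```python
import random


def random_flip(patch, rng=random, p=0.5):
    """Flip a patch horizontally and vertically, each with probability p.

    `patch` is a list of rows, and each row is a list of pixels. The two
    flips are drawn independently, so a patch may receive neither, one,
    or both.
    """
    if rng.random() < p:
        patch = [row[::-1] for row in patch]  # horizontal flip (left-right)
    if rng.random() < p:
        patch = patch[::-1]  # vertical flip (top-bottom)
    return patch
```

Passing a seeded random.Random instance as rng makes the augmentation repeatable across runs.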

Viewing decoded encodings

We decode the encodings of test images and obtain realistic reconstructions.

We perform Principal Component Analysis on the encoded representations.
