Kidney Segmentation Model

This repository contains code used for the Kaggle competition SenNet + HOA - Hacking the Human Vasculature in 3D. The goal of this competition was to build a computer vision model to create segmentation masks marking blood vessels in 2D scans of human kidneys. The trained models and pipelines from this repository achieve a surface dice score of ~0.79 on the public test set and ~0.49 on the private test set (~15th percentile of notebook submissions).

Model Architecture

As is common for medical imaging tasks, the models I worked with were based on the U-Net architecture. U-Net is an encoder/decoder architecture which, in its simplest form, is built from convolutional layers (see below; image taken from U-Net: Convolutional Networks for Biomedical Image Segmentation). The left side of U-Net acts as an encoder, extracting the most relevant features from the training images. The right side acts as the decoder. It uses the features from the encoder to decide, for each pixel of the segmentation mask, whether that pixel belongs to a blood vessel or not.

For more complicated models, I tried the Attention U-Net (attenU-Net), which adds attention gates to the skip connections. I also used pre-trained backbones, which replace the encoder part of the U-Net. A useful repository for implementing these is segmentation_models.pytorch. Two backbones which were particularly effective were resnext50_32x4d (convolution based) and mit_b2 (Mix Vision Transformer). The pre-trained models and the final models tuned on the kidney segmentation masks provided are here.

Training

For training, segmentations were provided for 3 kidneys. However, only 2 had accurate labellings, whereas the third was sparsely segmented at 65%. I only used the complete segmentations for training, though it would have been interesting to test approaches using pseudo-labeling for the sparsely segmented kidney. To increase the effective amount of training data, I built an image augmentation pipeline consisting of rotations, changes in brightness, etc., using the albumentations library. Another trick for expanding the training set was to take the set of 2D kidney images (originally sliced along the z axis), stack them together, and then slice the resulting 3D block in the x and y directions, with these new images used for training as well. In addition, to ensure the model was flexible enough to handle input images of varying size, I broke each 2D image into a set of 512x512 pixel tiles, which were fed into the model for training.
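The re-slicing and tiling steps described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code from this repository: the helper names `slices_along_axis` and `tile_image` are hypothetical, the toy volume stands in for a real kidney scan, and zero-padding the image edges up to a multiple of the tile size is an assumption about how partial tiles are handled.

```python
import numpy as np


def slices_along_axis(volume, axis):
    """Return the 2D slices of a 3D volume taken along the given axis."""
    return [np.take(volume, i, axis=axis) for i in range(volume.shape[axis])]


def tile_image(image, tile=512):
    """Zero-pad a 2D image to a multiple of `tile`, then split it into tiles."""
    h, w = image.shape
    pad_h = (-h) % tile  # rows needed to reach the next multiple of `tile`
    pad_w = (-w) % tile  # columns needed to reach the next multiple of `tile`
    padded = np.pad(image, ((0, pad_h), (0, pad_w)), mode="constant")
    tiles = []
    for y in range(0, padded.shape[0], tile):
        for x in range(0, padded.shape[1], tile):
            tiles.append(padded[y:y + tile, x:x + tile])
    return tiles


# Toy stand-in for a stack of z-slices: 4 images of 600x700 pixels.
rng = np.random.default_rng(0)
volume = rng.random((4, 600, 700), dtype=np.float32)  # axes are (z, y, x)

z_slices = slices_along_axis(volume, 0)  # the original images
y_slices = slices_along_axis(volume, 1)  # new views, each of shape (z, x)
x_slices = slices_along_axis(volume, 2)  # new views, each of shape (z, y)

tiles = tile_image(z_slices[0])  # 600x700 is padded to 1024x1024
```

With a real scan the z dimension is comparable to x and y, so the y and x views are full-sized images rather than the thin strips produced by this toy volume. The same slicing and tiling would have to be applied to the label masks so that images and masks stay aligned.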

Choosing a proper loss function also turned out to be critical. The simplest choice was binary cross entropy (BCE), as this is essentially a pixel-wise classification problem. However, this turned out not to be a good proxy for the competition metric, surface dice. Often, an incremental improvement in BCE would lead to no improvement, or even a worsening, of surface dice, even when evaluating on just the training set itself. Some alternatives I tried were focal loss and dice loss, with the latter yielding the best results.
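To make the contrast between BCE and dice loss concrete, here is a small NumPy sketch. It uses the standard soft dice formulation with a smoothing term, which is an assumption: the exact variant used for training is not stated here. Note also that this region-overlap dice is not the competition's surface dice metric. The toy mask, with 1% vessel pixels, is made up for illustration.

```python
import numpy as np


def bce_loss(probs, target, eps=1e-7):
    """Mean binary cross entropy over all pixels."""
    p = np.clip(probs, eps, 1 - eps)
    return float(-np.mean(target * np.log(p) + (1 - target) * np.log(1 - p)))


def soft_dice_loss(probs, target, smooth=1.0):
    """1 - soft dice coefficient; `smooth` avoids division by zero."""
    intersection = np.sum(probs * target)
    denom = np.sum(probs) + np.sum(target)
    return float(1.0 - (2.0 * intersection + smooth) / (denom + smooth))


# Toy mask: one vessel row in a 100x100 image, i.e. 1% foreground.
target = np.zeros((100, 100))
target[50, :] = 1.0

# A prediction that says "almost certainly background" everywhere.
empty_pred = np.full_like(target, 0.01)

bce_empty = bce_loss(empty_pred, target)         # small: most pixels are right
dice_empty = soft_dice_loss(empty_pred, target)  # close to 1: no vessel found
dice_perfect = soft_dice_loss(target, target)    # 0 for a perfect prediction
```

Because vessels occupy a small fraction of each image, a prediction that misses every vessel can still score a low BCE, while the dice loss stays close to its maximum. That imbalance is one plausible reason BCE tracked the competition metric poorly.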

For validation, one simple approach was to train on one completely segmented kidney, and then to do validation on the second completely segmented kidney. However, this turned out to give a very poor estimate for the test error. This was evident because it was relatively easy to build a model which performed well on the training kidney and the validation kidney, but then performed terribly on the hidden test kidney. This indicated that there was very high variation from kidney to kidney, making parameter tuning very difficult. As a result, I opted to not tune precisely, and instead looked for model configurations which performed consistently on the hidden test kidney across a wide range of parameters.

Inference

When it came to inference, the models I initially trained suffered from a high false negative rate, meaning they were missing a high number of detections. To improve this, I implemented test time augmentation, which is similar in spirit to train time augmentation. The main idea is to provide the model with multiple views of the kidney during inference, giving it many chances to make a detection. Specifically, my model performed inference on rotated and flipped versions of each image tile. It also sliced the kidney along all three axes, again increasing the number of chances to make a detection. Whenever the model made a detection for some view, the pixel was classified as belonging to a blood vessel.
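The union-style test time augmentation described above can be sketched as follows for a single tile. This is an illustration under assumptions rather than the repository's implementation: `tta_union` and `toy_predict` are hypothetical names, `predict_fn` stands in for a trained model that maps a 2D array to per-pixel probabilities, and the 0.5 threshold is arbitrary. Slicing along the three axes is not shown.

```python
import numpy as np


def tta_union(tile, predict_fn, threshold=0.5):
    """Predict on 8 rotated/flipped views and OR the thresholded results."""
    mask = np.zeros(tile.shape, dtype=bool)
    for flip in (False, True):
        view = np.fliplr(tile) if flip else tile
        for k in range(4):
            probs = predict_fn(np.rot90(view, k))
            # Undo the rotation, then the flip, to map back to the tile frame.
            restored = np.rot90(probs, -k)
            if flip:
                restored = np.fliplr(restored)
            # A pixel is a vessel if any view detects it.
            mask |= restored > threshold
    return mask


def toy_predict(view):
    """Hypothetical weak model: only detects bright pixels in the top half."""
    probs = np.zeros(view.shape, dtype=float)
    half = view.shape[0] // 2
    probs[:half] = (view[:half] > 0.5).astype(float)
    return probs


tile = np.zeros((4, 4))
tile[0, 0] = 1.0  # visible to the toy model without augmentation
tile[3, 3] = 1.0  # missed without augmentation (bottom half)

single_pass = toy_predict(tile) > 0.5
with_tta = tta_union(tile, toy_predict)
```

Taking the union lowers the false negative rate at the cost of more false positives, since a single spurious detection in any view is kept. Averaging the probabilities across views before thresholding is a common, more conservative alternative.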

Conclusion

The challenge was a great way to dive into an area that I previously knew nothing about, and develop working knowledge in a short period of time. Plus, it's pretty satisfying to see some of the segmentation masks produced during validation:
