Authors: Maxwell Chen and Kevin Miao
This repository contains our final project for CS194-80: Full Stack Deep Learning taught at UC Berkeley by Pieter Abbeel, Sergey Karayev and Joshua Tobin.
The deployment phase can be accessed here:
Archive
- Contains older versions and debugging notebooks/scripts.
HAM10000_metadata.csv
- Original metadata with diagnoses (unmodified from the original HAM10000 dataset).
annotation-v2.py
- Annotation script that uses the provided segmentation maps to produce final.csv, which supplies the ground-truth labels and bounding box areas.
annotation.py
- Script for automatic annotation.
dataset.py
- PyTorch dataset accompanied by transforms.
disc.ipynb/py
- Debugging files.
final.csv
- Ground-truth bounding box coordinates, paths and labels.
mean-std.pt
- PyTorch dictionary containing the mean/std of the training images.
model_util.py
- Util functions for loading/reading models from state.
setup.sh
- Shell script for setting up requirements and dependencies.
state_loading.py
- Script for loading a state dictionary into a model.
sweep.yaml
- Weights and Biases file for hyperparameter sweeping.
train.ipynb
- Notebook for training debugging.
train.py
- Official training script.
transforms.py
- Image transforms.
util.py
- Contains util functions.
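The annotation script is described as deriving bounding boxes from the provided segmentation maps. As a rough illustration of that idea (not the code in annotation-v2.py), the sketch below computes a tight box around the nonzero pixels of a binary mask with NumPy. The function name `mask_to_bbox` and the inclusive `(x_min, y_min, x_max, y_max)` convention are assumptions made for this example; the actual column layout of final.csv may differ.

```python
import numpy as np


def mask_to_bbox(mask):
    """Return (x_min, y_min, x_max, y_max) of the nonzero region of a 2D mask.

    Coordinates are inclusive pixel indices. Returns None for an empty mask.
    This is a hypothetical helper, not the function used in annotation-v2.py.
    """
    mask = np.asarray(mask)
    rows = np.any(mask > 0, axis=1)  # True for every row containing lesion pixels
    cols = np.any(mask > 0, axis=0)  # True for every column containing lesion pixels
    if not rows.any():
        return None
    row_idx = np.where(rows)[0]
    col_idx = np.where(cols)[0]
    return int(col_idx[0]), int(row_idx[0]), int(col_idx[-1]), int(row_idx[-1])


# Toy 6x8 mask with a lesion occupying rows 2..4 and columns 3..6.
mask = np.zeros((6, 8), dtype=np.uint8)
mask[2:5, 3:7] = 255
print(mask_to_bbox(mask))
```

Detection models in torchvision expect boxes with strictly positive width and height, so a real pipeline would likely need to handle single-pixel or empty masks separately.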
Tschandl, Philipp, 2018, "The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions", https://doi.org/10.7910/DVN/DBW86T, Harvard Dataverse, V3, UNF:6:/APKSsDGVDhwPBWzsStU5A== [fileUNF]
The project is based on the HAM10000 dataset, which contains 10,015 dermatoscopic images of pigmented skin lesions in 7 diagnostic categories, along with segmentation maps.
The dataset can be downloaded from Harvard Dataverse.
The architecture is a pretrained Faster R-CNN with a ResNet50 backbone augmented with a Feature Pyramid Network (FPN). The model is adapted from torchvision's Faster R-CNN ResNet-50 FPN, pretrained on COCO.
The dataset contains the following diagnoses, each mapped to the index shown. The last index, 7, is reserved for background.
Dictionary: {'akiec': 0, 'bcc': 1, 'bkl': 2, 'df': 3, 'mel': 4, 'nv': 5, 'vasc': 6}
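As a small illustration of how this mapping might be used, the sketch below encodes diagnosis codes to indices and decodes predictions back to names. The dictionary values are taken from the text above. The names `LABEL_TO_INDEX`, `BACKGROUND_INDEX`, `encode` and `decode`, and the string 'background', are assumptions for this example and may not match the repository's code.

```python
# Mapping as given in this README.
LABEL_TO_INDEX = {'akiec': 0, 'bcc': 1, 'bkl': 2, 'df': 3, 'mel': 4, 'nv': 5, 'vasc': 6}
BACKGROUND_INDEX = 7  # reserved for background, per the text above

# Inverse mapping for turning model outputs back into readable names.
INDEX_TO_LABEL = {index: name for name, index in LABEL_TO_INDEX.items()}
INDEX_TO_LABEL[BACKGROUND_INDEX] = 'background'  # hypothetical display name


def encode(diagnosis):
    """Map a diagnosis code such as 'mel' to its class index."""
    return LABEL_TO_INDEX[diagnosis]


def decode(index):
    """Map a class index back to its diagnosis code."""
    return INDEX_TO_LABEL[index]


print(encode('mel'), decode(5), decode(BACKGROUND_INDEX))
```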
This part of the project uses python 3.8 in a conda environment with the following dependencies. The setup.sh file can be run to initiate the online environment.