monet's Introduction

MONET (Medical cONcept rETriever)

MONET is an image-text foundation model trained on 105,550 dermatological images paired with natural language descriptions from a large collection of medical literature. MONET can accurately annotate concepts across dermatology images as verified by board-certified dermatologists, competitively with supervised models built on previously concept-annotated dermatology datasets of clinical images. MONET enables AI transparency across the entire AI system development pipeline from building inherently interpretable models to dataset and model auditing.

Getting started

Install

To install the required packages, run the following bash commands:

# clone project
git clone https://github.com/suinleelab/MONET
cd MONET

# [OPTIONAL] create conda environment
conda create -n MONET python=3.9.15
conda activate MONET

# install PyTorch according to the instructions at https://pytorch.org/get-started/
# (v1.13.0 was used during development)
# example: conda install pytorch==1.13.0 torchvision==0.14.0 pytorch-cuda=11.7 -c pytorch -c nvidia

# install other required python packages
pip install -r requirements.txt
pip install git+https://github.com/openai/CLIP.git
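
After installation, a quick import check can confirm that the environment is usable. The snippet below is a minimal sketch, not part of the repository: the helper name `missing_packages` is hypothetical, and the default package list assumes that the steps above install the importable modules `torch`, `torchvision`, and `clip`.

```python
import importlib.util


def missing_packages(names=("torch", "torchvision", "clip")):
    """Return the module names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

An empty list means every listed module can be located; it does not verify versions or CUDA availability.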

Initialize model

Using original openai CLIP implementation

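import clip
import torch
import torchvision.transforms as T

def get_transform(n_px):
    # resize, center-crop, convert to RGB, and normalize with the mean/std below
    def convert_image_to_rgb(image):
        return image.convert("RGB")
    return T.Compose(
        [
            T.Resize(n_px, interpolation=T.InterpolationMode.BICUBIC),
            T.CenterCrop(n_px),
            convert_image_to_rgb,
            T.ToTensor(),
            T.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
        ]
    )

# clip.load returns (model, preprocess); keep the model and use the custom transform
model, preprocess = clip.load("ViT-L/14", device="cuda:0", jit=False)[0], get_transform(n_px=224)
# load the MONET weights into the CLIP architecture
model.load_state_dict(torch.hub.load_state_dict_from_url("https://aimslab.cs.washington.edu/MONET/weight_clip.pt"))
model.eval()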

Using huggingface CLIP implementation

from transformers import AutoProcessor, AutoModelForZeroShotImageClassification

processor_hf = AutoProcessor.from_pretrained("chanwkim/monet")
model_hf = AutoModelForZeroShotImageClassification.from_pretrained("chanwkim/monet")
model_hf.to("cuda:0")
model_hf.eval()
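
Once the Hugging Face model is loaded, image-text similarity logits can be read from the model output. The following is a minimal sketch of the generic CLIP zero-shot pattern in `transformers`, not the concept-scoring procedure from the paper (see the tutorial notebooks for that). The helper names `build_prompts` and `concept_logits` and the prompt template "This is {}" are assumptions made for illustration.

```python
def build_prompts(concepts, template="This is {}"):
    """Turn concept names into text prompts (the template is an assumption)."""
    return [template.format(concept) for concept in concepts]


def concept_logits(model, processor, image, concepts, device="cuda:0"):
    """Return one image-text similarity logit per concept for a single PIL image."""
    import torch  # imported lazily so build_prompts works without PyTorch

    inputs = processor(
        text=build_prompts(concepts),
        images=image,
        return_tensors="pt",
        padding=True,
    ).to(device)
    with torch.no_grad():
        outputs = model(**inputs)
    # logits_per_image has shape (num_images, num_prompts)
    return outputs.logits_per_image[0].tolist()
```

With the objects defined above, a call would look like `concept_logits(model_hf, processor_hf, image, ["erythema", "ulcer"])`, where `image` is a PIL image you have opened yourself. Higher logits indicate closer agreement between the image and the prompt.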

Usage

We provide Jupyter notebooks that demonstrate how to use MONET for automatic concept annotation and for transparency tasks such as data auditing, model auditing, and building inherently interpretable models.

  • Automatic concept annotation: tutorial/automatic_concept_annotation.ipynb
  • Data auditing: tutorial/data_auditing.ipynb
  • Model auditing: tutorial/model_auditing.ipynb
  • Inherently interpretable model building: tutorial/inherently_interpretable_model_building.ipynb

MONET Training data

For code to download and preprocess the training data, please refer to the following scripts:

scripts/preprocess/preprocess_pubmed.sh
scripts/preprocess/preprocess_pdf.sh

Training / Evaluation

Code for preprocessing data and training MONET is available in the src folder. Code used for the evaluation in our paper is available in the experiments folder.

Citation

@article{kim2024transparent,
    title={Transparent medical image AI via an image–text foundation model grounded in medical literature},
    author={Chanwoo Kim and Soham U. Gadgil and Alex J. DeGrave and Jesutofunmi A. Omiye and Zhuo Ran Cai and Roxana Daneshjou and Su-In Lee},
    journal={Nature Medicine},
    year={2024},
    doi={10.1038/s41591-024-02887-x},
    url={https://doi.org/10.1038/s41591-024-02887-x}    
}

monet's People

Contributors

chanwkimlab

monet's Issues

Failure to replicate the CLIP concept generation experiment.

Thanks for your great work! I am trying to follow your steps and replicate the CLIP concept generation process on the Fitzpatrick17k split of the SkinCon dataset, but I only get an AUROC of 0.55. Could you please explain at a high level whether I did something wrong here?

  1. Exclude any concept with fewer than 30 positive examples, and use the prompt 'This is {symptom}' for every symptom.
  2. Resize and center-crop every image to 224x224, then normalize it using the mean and standard deviation used in CLIP.
  3. Use a pre-trained CLIP model from Hugging Face. I tried (a) openai/clip-vit-large-patch14, (b) openai/clip-vit-large-patch14-336, and (c) laion/CLIP-ViT-g-14-laion2B-s34B-b88K.

Thank you in advance for any instructions!
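
The concept filter and the AUROC computation described in this issue can be sketched as follows. This is an illustration under stated assumptions, not the evaluation code from the experiments folder: the function names, the dictionary layout of `annotations`, and the pairwise AUROC formula (the fraction of positive-negative pairs ranked correctly, with ties counted as half) are choices made here for clarity.

```python
def keep_concepts(annotations, min_positive=30):
    """Keep concepts with at least `min_positive` positive labels.

    `annotations` maps a concept name to a list of 0/1 labels, one per image
    (a hypothetical layout chosen for this sketch).
    """
    return {
        concept: labels
        for concept, labels in annotations.items()
        if sum(labels) >= min_positive
    }


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Quadratic in the number of images, which is fine for a sanity check.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # prints 0.75
```

A value near 0.5 means the scores rank positives barely better than chance, so an AUROC of 0.55 suggests the scores carry little signal for that concept rather than being inverted.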

Request for 'data' folder

Hello,

I wanted to express my gratitude for the outstanding work you’ve done! I must apologize if I’ve missed it, but I was unable to locate the ‘data’ folder in this repository, which is supposed to contain the ‘/pubmed/search_query.csv’ file and ‘/textbook/pdf_files’. Could you kindly let me know if there are any plans to release these pertinent resources?

Best Regards,
Chenlin

License for code?

Hello,

Thank you for the excellent work! I apologize if I've overlooked it but I could not find a license for the code in this repo. Do you have plans to apply a license to this code?

Thanks,
Jeff

Textbook name

Hi, can you provide the list of textbook names so that we can collect the training data? Many thanks.

Training cost

Hi, the authors!
Thanks for this impressive work. I'm curious about the training cost of the whole pipeline. Have you ever trained using CLIP?

About dataset

Thanks for your great work! Can I ask whether you will release the image-text pair dataset? Also, can you provide some examples of what the captions look like, other than the example in Fig. 1?

Textbook script JSON file

Hi authors, thanks for the great work.
When I run the PDF script in src/MONET/preprocess/pdf_match.py, I found that a data/textbook/pdf_files.config.json file is needed to extract text from the PDFs.
Could you provide your settings or a template for the JSON file?

python src/MONET/preprocess/pdf_match.py \
--image data/textbook/pdf_extracted.compact.uncorrupted.dermonlyv3.hdf5 \
--pdf-extracted data/textbook/pdf_extracted \
--config data/textbook/pdf_files.config.json \
--output data/textbook/pdf_extracted.compact.uncorrupted.dermonlyv3.matched.pkl

path_config = Path(args.config)

with open(path_config) as f:
    config = json.load(f)

text_matched_all = []
for pdf_name, pdf_config_list in config.items():

    for pdf_config in pdf_config_list:
        print(pdf_name, pdf_config)
        text_include_list = pdf_config["text_include_list"]
        fontsize_range = pdf_config["fontsize_range"]
        font_list = pdf_config["font_list"]
        prioritize_text_under_image = pdf_config["prioritize_text_under_image"]
        return_all = pdf_config["return_all"]
        key_images_pdf = [
            key
            for key in key_images
            if os.path.splitext(key)[0].split("_")[0] == pdf_name
            # and os.path.splitext(key)[0].split("_")[1] == "00742"
        ]
        # print(key_images_pdf)

        text_matched = match_text(
            path_base=path_pdf_extracted,
            key_images=key_images_pdf,
            text_include_list=text_include_list,
            fontsize_range=fontsize_range,
            font_list=font_list,
            prioritize_text_under_image=prioritize_text_under_image,
            return_all=return_all,
            verbose=False,
        )
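
The keys read in the loop above imply the shape of the config file: a JSON object that maps each PDF name to a list of settings objects. The sketch below builds such a template and checks that the required keys are present. Only the key names come from the quoted code; the PDF name, every value, and the value types are placeholders assumed for illustration, not the authors' settings.

```python
import json

# Key names taken from the quoted loop in pdf_match.py
REQUIRED_KEYS = (
    "text_include_list",
    "fontsize_range",
    "font_list",
    "prioritize_text_under_image",
    "return_all",
)


def validate_config(config):
    """Raise if any per-PDF settings object lacks a key that the loop reads."""
    for pdf_name, pdf_config_list in config.items():
        if not isinstance(pdf_config_list, list):
            raise TypeError(f"{pdf_name}: expected a list of settings objects")
        for pdf_config in pdf_config_list:
            missing = [key for key in REQUIRED_KEYS if key not in pdf_config]
            if missing:
                raise KeyError(f"{pdf_name}: missing keys {missing}")
    return True


# Hypothetical template; the loop compares the PDF name with the part of
# each image key that precedes the first underscore.
template = {
    "exampletextbook": [
        {
            "text_include_list": ["Fig."],         # placeholder value
            "fontsize_range": [7.0, 10.0],         # placeholder value
            "font_list": ["ExampleFont-Regular"],  # placeholder value
            "prioritize_text_under_image": True,   # placeholder value
            "return_all": False,                   # placeholder value
        }
    ]
}

if __name__ == "__main__":
    validate_config(template)
    print(json.dumps(template, indent=2))
```

The real values depend on how `match_text` interprets each setting, so they would need to be confirmed against that function or by the authors.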
