Giter Club home page Giter Club logo

craft-md's Introduction

CRAFT-MD: Conversational Reasoning Assessment Framework for Testing in Medicine

This repository contains code and data for Conversational Reasoning Assessment Framework for Testing in Medicine (CRAFT-MD). CRAFT-MD is a novel framework designed to assess the conversational reasoning abilities of LLMs through simulated doctor-patient conversations that mimic clinical encounters. It evaluates clinical LLMs using a multi-agent framework comprising four components: a clinical LLM agent, a patient AI agent, a grader AI agent, and medical experts. The clinical LLM agent can be instantiated with any new LLM, allowing seamless evaluation of its clinical capabilities, such as history taking and information synthesis. The other agents play supportive roles in evaluating the clinical LLM agent.

Paper Link:

Dataset

The dataset consists of 2000 questions, each structured as a case vignette followed by four answer choices (see Data Availability). Of these, 1800 were sourced from MedQA-USMLE, encompassing medical conditions commonly encountered in primary and specialist care settings. These questions span 12 medical specialties: Dermatology, Hematology and Oncology, Neurology, Gastroenterology, Pediatrics and Neonatology, Cardiology, Infectious Disease, Obstetrics and Gynecology, Urology and Nephrology, Endocrinology, Rheumatology, and Others. Additional 100 vignettes from an online question back and 100 newly generated vignettes are also included. The case vignettes were written by medical experts and contained details of the age and sex of the patient, current symptoms, medical history of illness and medications, and relevant family history. In some cases, physical exam and laboratory results were also included in the vignette.

Replicating results

Note: You will need to setup an OpenAI API setup, and replace the OpenAI key information in the below script. Jupyter notebooks can be run to replicate the CRAFT-MD framework and other results presented in the manuscript.

System Requirements

All code contained in the repository was tested on python v3.9.17. The conda environment used can be re-created using environment.yml.

All open-source models were tested on Quadro RTX 8000 48gb GPU. The GPT-4 and GPT-3.5 models can be tested on a normal computer.

Citation

Johri, S., Jeong, J., Tran, B. A., Schlessinger, D. I., Wongvibulsin, S., Cai, Z. R., ... & Rajpurkar, P. (2023). Guidelines For Rigorous Evaluation of Clinical LLMs For Conversational Reasoning. medRxiv, 2023-09.

Johri, Shreya, et al. "CRAFT-MD: A Conversational Evaluation Framework for Comprehensive Assessment of Clinical LLMs." AAAI 2024 Spring Symposium on Clinical Foundation Models. 2024.

craft-md's People

Contributors

sjohri20 avatar

Stargazers

Wenke Ren avatar YOLO avatar Xiaoman Zhang avatar Thomas Buckley avatar  avatar Jaehwan Jeong avatar  avatar  avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.