DiagnoAI

DiagnoAI is a tool to detect a disease from a text description of the patient's symptoms and daily condition. It is based on a transformer model called BERT, fine-tuned for 24 common diseases.

Contents

  1. Dataset
  2. Model Training
  3. Testing
  4. References

1. Dataset

We created a dataset containing 24 diseases, with 50 manually written symptom descriptions (in English) for each disease. The disease names, symptoms, and precautions were chosen from the Disease Symptom Prediction dataset [1] on Kaggle.

In total, 1,200 descriptions were created, of which 80% were used for model training and the remaining 20% for validation and testing. An example of a data instance:

Description: There are small red spots all over my body that I can't explain. It's worrying me. I feel extremely tired and experience a mild fever every night.

Disease: Chicken Pox
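The 80/20 split described above can be sketched in plain Python. This is only an illustration, not the project's actual splitting code: the per-class (stratified) split, the `stratified_split` helper, the seed, and the placeholder label names are all assumptions made for the example.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_fraction=0.8, seed=0):
    """Return (train_indices, heldout_indices), splitting each class separately."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, heldout = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        heldout.extend(indices[cut:])
    return train, heldout

# Placeholder labels (hypothetical names): 24 diseases x 50 descriptions each.
labels = [f"disease_{d}" for d in range(24) for _ in range(50)]
train_idx, heldout_idx = stratified_split(labels)
print(len(train_idx), len(heldout_idx))  # 960 240
```

Splitting per class keeps 40 training and 10 held-out descriptions for every disease, so no class is under-represented in either part.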

2. Model Training

Because of the limited data, we decided to fine-tune a pretrained language model. We chose the pretrained BERT model from Hugging Face and its corresponding tokenizer for tokenizing the sentences. TensorFlow was used as the base framework for loading and training the model.

Through experimentation, we found that unfreezing the BERT layer helped achieve better training and validation accuracy. Hence, we decided to keep the complete model trainable.

The model was trained with the following parameters:

  • Loss function: SparseCategoricalCrossentropy
  • Optimizer: Adam
  • Learning Rate: 0.00003
  • Epochs: 5
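The sparse categorical cross-entropy loss takes an integer class index as the target rather than a one-hot vector. As a rough illustration of what it computes for a single example, here is a pure-Python sketch. It is not the TensorFlow implementation, and it assumes the model outputs raw logits, which this README does not state; the function and variable names are made up for the example.

```python
import math

def sparse_categorical_crossentropy(logits, label):
    """Cross-entropy for one example: -log(softmax(logits)[label])."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]

NUM_CLASSES = 24  # one class per disease

# An untrained model that scores every class equally pays ln(24) per example.
uniform_logits = [0.0] * NUM_CLASSES
loss_uniform = sparse_categorical_crossentropy(uniform_logits, label=3)

# A model that strongly favours the correct class pays almost nothing.
confident_logits = [0.0] * NUM_CLASSES
confident_logits[3] = 10.0
loss_confident = sparse_categorical_crossentropy(confident_logits, label=3)

print(loss_uniform, loss_confident)
```

With 24 classes, a loss near ln(24) at the start of training means the model is still guessing uniformly, which makes it a useful sanity check for the first epoch.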

Model Plot

3. Testing

After training, we achieved a training accuracy of 100.00% and a validation accuracy of 98.33%. Although the misclassification rate is quite low, we can't be completely sure of the model's predictions, as it was trained on a relatively small corpus.
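To put the validation figure in perspective, the short sketch below works out how many correct predictions give 98.33% for a given set size. The size of the validation set is not stated here, so the two sizes tried (120, if the held-out 20% were halved between validation and testing, and 240, if all of it were used) are assumptions, and `correct_count_for` is a hypothetical helper.

```python
def correct_count_for(accuracy_percent, set_size):
    """Return the correct-prediction count whose accuracy, rounded to two
    decimals, equals accuracy_percent, or None if no count matches."""
    for correct in range(set_size + 1):
        if round(100 * correct / set_size, 2) == accuracy_percent:
            return correct
    return None

# Hypothetical validation-set sizes; the real size is not given in this README.
for size in (120, 240):
    correct = correct_count_for(98.33, size)
    print(size, correct, size - correct)
```

Under either assumption only a handful of examples are misclassified (2 of 120, or 4 of 240), so a single additional error would shift the reported accuracy noticeably.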

We plan to enlarge the dataset in the future so that the model can generalize better and not suffer from overconfidence.

4. References

  1. Kaggle - Disease Symptom Dataset
  2. Hugging Face - BERT
  3. TensorFlow
  4. Devlin, Jacob, et al. "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding."

Contributors

faizalkarim280280, niyarrbarman
