Pneumonia is a lung infection that inflames the air sacs in one or both lungs. The air sacs may fill with fluid or pus (purulent material). The infection can be bacterial or viral. The main symptoms are cough with phlegm or pus, fever, chills, and difficulty breathing.
Pneumonia is responsible for over 15% of all deaths of children under five years old worldwide, which underscores the severity of the disease and the need for accurate detection.
The most common way to diagnose pneumonia is a chest radiograph (chest X-ray), which shows the infection as increased opacity in specific areas of the lungs.
To improve the efficacy and reach of diagnosis, we can use machine learning algorithms to identify abnormalities in chest X-ray images. In this project, many chest X-ray images (both `normal` and `pneumonia`) are used to train a Convolutional Neural Network (CNN) model for this purpose.
- Python 3.7.0+
- TensorFlow 2.4.1+
- Keras 2.4.3+
- scikit-learn 0.24.1+
- matplotlib 3.3.3+
- texttable 1.6.3+
- gradio 1.5.3+
You can download the dataset from Kaggle using the download link.
- Extract the archive
- You will find several directories in it
- Copy the contents of the `chest-xray` directory (the `train`, `test`, and `val` subdirectories) to the `data` folder
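As a quick sanity check after copying, the layout described in the steps above can be verified with a few lines of Python. This is a minimal sketch, not part of the repository: `missing_splits` is a hypothetical helper, and it only checks that the three subdirectories named above exist.

```python
from pathlib import Path

# Split names taken from the steps above.
EXPECTED_SPLITS = ("train", "test", "val")


def missing_splits(data_dir):
    """Return the expected split subdirectories that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_SPLITS if not (root / name).is_dir()]
```

Calling `missing_splits("data")` from the repository root should return an empty list once all three subdirectories are in place.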
The number of images in each class (`Normal` and `Pneumonia`) in the `train`, `test`, and `val` datasets is:
Dataset Type | Normal | Pneumonia |
---|---|---|
Training | 1341 | 3875 |
Test | 234 | 390 |
Validation | 8 | 8 |
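The classes are not evenly balanced, which matters when reading accuracy figures. The sketch below uses only the counts from the table to compute the share of pneumonia images in each split and the accuracy a trivial "always predict the majority class" baseline would reach. The function names are hypothetical and the script is not part of the repository.

```python
# Image counts copied from the table above.
COUNTS = {
    "train": {"normal": 1341, "pneumonia": 3875},
    "test": {"normal": 234, "pneumonia": 390},
    "val": {"normal": 8, "pneumonia": 8},
}


def pneumonia_share(split):
    """Fraction of images in a split that are labelled pneumonia."""
    counts = COUNTS[split]
    return counts["pneumonia"] / sum(counts.values())


def majority_baseline(split):
    """Accuracy of always predicting the most frequent class in a split."""
    counts = COUNTS[split]
    return max(counts.values()) / sum(counts.values())


for split in COUNTS:
    print(
        f"{split}: pneumonia share={pneumonia_share(split):.3f}, "
        f"majority baseline={majority_baseline(split):.3f}"
    )
```

Roughly 74% of the training images and 62.5% of the test images are pneumonia cases, so a model's accuracy should be judged against those baselines rather than against 50%.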
- Clone the repository
git clone https://github.com/baishalidutta/Pneumonia-Detection.git
- Install the required libraries
pip3 install -r requirements.txt
Change into the `source` directory to run the following scripts.
- To generate the model on your own, run
python3 model_training.py
- To evaluate any dataset using the pre-trained model (in the `model` directory), run
python3 model_evaluation.py
Note that, for evaluation, `model_evaluation.py` uses all the images contained in both the `test` and `val` subdirectories (inside the `data` directory).
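To see which files such an evaluation would cover, the images under both subdirectories can be listed as follows. This is an illustrative sketch only: `collect_image_paths` is a hypothetical helper, the set of file extensions is an assumption, and the actual logic in `model_evaluation.py` may differ.

```python
from pathlib import Path

# Assumption: images use one of these extensions.
IMAGE_SUFFIXES = {".jpeg", ".jpg", ".png"}


def collect_image_paths(data_dir, splits=("test", "val")):
    """Return sorted image paths found under the given split subdirectories."""
    root = Path(data_dir)
    paths = []
    for split in splits:
        split_dir = root / split
        if not split_dir.is_dir():
            continue  # skip splits that are not present
        paths.extend(
            p
            for p in split_dir.rglob("*")
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
    return sorted(paths)
```

For example, `len(collect_image_paths("data"))` gives the number of images found under `data/test` and `data/val` combined.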
Alternatively, you can find the whole analysis in the notebook inside the `notebook` directory. To open it, use Jupyter Notebook, Google Colab, or any IDE that supports notebooks, such as PyCharm Professional.
The model reaches `96%` accuracy on the training dataset. Its accuracy on the `test` and `val` datasets is `91%` and `88%`, respectively. In both cases, the `f1-score` and `ROC_AUC Score` are relatively high, as shown below.
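For readers who want to see how accuracy and the F1-score relate, the sketch below derives both from the four cells of a binary confusion matrix. The function name is hypothetical, and the counts in the example call are made up for illustration; they are not this project's results.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts for illustration only; these are not the project's results.
print(binary_metrics(tp=8, fp=2, fn=2, tn=8))
```

Because the F1-score combines precision and recall, it is less flattering than accuracy when one class dominates, which is why it is worth reporting alongside accuracy for an imbalanced dataset.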
To run the web application locally, go to the `webapp` directory and execute:
python3 web_app.py
This starts a local server that you can access in your browser. You can either upload or drag in a new X-ray image, or select one of the test X-ray images from the examples below.
You can, alternatively, try out the hosted web application here.
Baishali Dutta ([email protected])
If you would like to contribute and improve the model further, check out the Contribution Guide.
This project is licensed under the Apache License, Version 2.0.