
ECG Arrhythmia classification


The repository contains the code for the Master's degree dissertation "Diagnosis of Diseases by ECG Using Convolutional Neural Networks". Only CNN models are considered in the paper and the repository. As part of the work, more than 30 experiments were run. The table with all experiments and their metrics is available via the link.

The best 1D and 2D CNN models are presented in the repository. The repository follows the config principle and can be run in the following modes (a minimal sketch of the config-driven entry point follows the list):

  • Training - use train.py --config configs/training/<config>.json to train the model
  • Validation - use inference.py --config configs/inference/config.json to validate the model
  • Pipeline - use pipeline.py --config configs/pipelines/config.json to test the model on ECG data (i.e. data generation, inference, and visualization of the results)
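The entry points above follow the standard argparse-plus-JSON pattern. A minimal sketch of such an entry point, assuming nothing about the repository's internals (the run function is illustrative, not the project's actual API):

```python
import argparse
import json

def run(config):
    # A real entry point would build the model, data loaders, etc. from here.
    print(config)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", required=True, help="path to a JSON config file")
    args = parser.parse_args()

    # Every mode (train.py, inference.py, pipeline.py) reads its settings
    # from the JSON file passed via --config.
    with open(args.config) as f:
        run(json.load(f))
```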

All available models and all necessary information are described below.

Python 3.7 and PyTorch are used in the project. GitHub Actions are used for installing dependencies and training the implemented models.

Program - Data Mining Department - Computer Science

Principal Investigator - Nikolai Yu. Zolotykh, National Research University Higher School of Economics

Implemented models

1D models:

  • the CNN from arXiv 1707.01836
  • the CNN from the IEEE paper (document 8952723)
  • the CNN from arXiv 2002.00254
  • the novel EcgResNet34

2D models:

  • the CNN from arXiv 1804.06812
  • MobileNetV2
  • EfficientNetB4

All models are listed with their metrics in the table below.

Metrics

| name | type | model | accuracy | val loss |
|------|------|-------|----------|----------|
| exp-025 | 1D (1x128) - [PEAK[t] - 64, PEAK[t] + 64] | https://arxiv.org/pdf/1707.01836.pdf | 0.9827 | 0.0726 |
| exp-030 | 1D (1x128) - [PEAK[t] - 64, PEAK[t] + 64] | https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8952723 | 0.9864 | 1.5 |
| exp-031 | 1D (1x128) - [PEAK[t] - 64, PEAK[t] + 64] | https://arxiv.org/pdf/2002.00254.pdf | 0.9886 | 0.15 |
| exp-018 | 2D (128x128) - [PEAK[t] - 64, PEAK[t] + 64] | https://arxiv.org/pdf/1804.06812.pdf | 0.9920 | 0.1 |
| exp-013 | 2D (128x128) - [PEAK[t] - 64, PEAK[t] + 64] | MobileNetV2 | 0.9934 | 0.088 |
| exp-021 | 2D (128x128) - [PEAK[t-1] + 20, PEAK[t+1] - 20] | EfficientNetB4 | 0.9935 | 0.062 |
| exp-029 | 1D (1x128) - [PEAK[t] - 64, PEAK[t] + 64] | Novel EcgResNet34 | 0.9938 | 0.0500 |
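In the type column, the bracket notation describes how each beat is cut from the signal: for example, [PEAK[t] - 64, PEAK[t] + 64] is a 128-sample window centered on the t-th R peak. A minimal sketch of that slicing, assuming the wfdb package and a local MIT-BIH record (the helper name extract_beat_windows is illustrative, not the repository's code):

```python
import numpy as np
import wfdb

def extract_beat_windows(record_path, half_width=64):
    """Cut a fixed window around every annotated R peak."""
    record = wfdb.rdrecord(record_path)      # e.g. "./mit-bih/100"
    ann = wfdb.rdann(record_path, "atr")     # reference annotations
    signal = record.p_signal[:, 0]           # first ECG lead

    windows, labels = [], []
    for peak, symbol in zip(ann.sample, ann.symbol):
        start, end = peak - half_width, peak + half_width
        if start < 0 or end > len(signal):   # skip beats too close to the record edges
            continue
        windows.append(signal[start:end])    # 128 samples: [PEAK[t]-64, PEAK[t]+64]
        labels.append(symbol)
    return np.stack(windows), labels
```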

Getting started

Training quick start:

  1. Download the dataset and unzip the files into the mit-bih directory
  2. Install the requirements via pip install -r requirements.txt
  3. Generate the 1D and 2D data files by running cd scripts && python dataset-generation-pool.py (a sketch of the 2D rendering idea follows this list)
  4. Create the JSON annotation files
    • For the 1D model - cd scripts && python annotation-generation-1d.py
    • For the 2D model - cd scripts && python annotation-generation-2d.py
  5. Run training - python train.py --config configs/training/<config>.json
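The 2D models consume 128x128 inputs, so the generation script in step 3 presumably renders each beat window as a small image. A minimal sketch of that idea using matplotlib (the function beat_to_image is illustrative, not the repository's actual implementation):

```python
import matplotlib
matplotlib.use("Agg")                      # render off-screen
import matplotlib.pyplot as plt
import numpy as np

def beat_to_image(window, out_path, size_px=128, dpi=96):
    """Render a 1D beat window as a size_px x size_px image."""
    fig = plt.figure(figsize=(size_px / dpi, size_px / dpi), dpi=dpi)
    ax = fig.add_axes([0, 0, 1, 1])        # no margins: the curve fills the canvas
    ax.plot(window, color="black", linewidth=1)
    ax.axis("off")
    fig.savefig(out_path, dpi=dpi)
    plt.close(fig)

beat_to_image(np.sin(np.linspace(0, 6, 128)), "beat.png")
```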

See the CI examples for each model.

Testing and visualization

We use the EcgResNet34 model here, as it shows the best metrics.

  1. Install the requirements via pip install -r requirements.txt
  2. Create a directory named experiments
  3. Download the archive and unzip its content into the experiments directory
  4. Download the WFDB format data
  5. Change the ecg_data path in configs/pipelines/config.json to the record path with no extension:
```json
{
  ...
  "ecg_data": "./mit-bih/100",
  ...
}
```
  6. Run the pipeline - python pipeline.py --config configs/pipelines/config.json

The results will be saved as an HTML file in the experiments/EcgResNet34/results directory.


Experiments

The code for all experiments described in the table is in the experiments/exp-XXX branches.

Other

The repository contains Jupyter Notebooks (see notebooks folder)


Contributors

Support

Please give a ⭐️ if this project helped you

License

This project is licensed under the MIT License


ecg-classification's Issues

Bug in F.maxpool_1d?

It seems that there is a bug here:

```python
identity = F.max_pool1d(identity, self.stride)
```

F.max_pool1d takes kernel_size as its second parameter and stride as its third. If you want to specify the stride explicitly, it should be something like F.max_pool1d(identity, kernel_size=2, stride=2, padding=1). Otherwise, self.stride is interpreted as the kernel size, which may ignore the last element of identity and cause a bug when the last dimension of identity has an odd size.
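A small standalone check of that signature, illustrating the shape difference (generic PyTorch, not code from this repository; the kernel_size=1 variant is just one possible fix alongside the padded version suggested above):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 64, 7)           # (batch, channels, odd-length signal)
stride = 2

# As written in the repository: `stride` lands in the kernel_size slot,
# and stride then defaults to kernel_size, so the final sample is dropped.
y_buggy = F.max_pool1d(x, stride)
print(y_buggy.shape)                # torch.Size([1, 64, 3])

# Downsampling by `stride` without pooling over a window:
y_fixed = F.max_pool1d(x, kernel_size=1, stride=stride)
print(y_fixed.shape)                # torch.Size([1, 64, 4])
```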

conv_subsumpling() takes 2 positional arguments but 3 were given

Branch: exp-025
File: models/models_1d.py

Definition:
```python
def conv_subsumpling(in_planes, out_planes)
```
Reference:
```python
conv_subsumpling(self.inplanes, planes * block.expansion, stride)
```
Error: conv_subsumpling() takes 2 positional arguments but 3 were given

Which is right, and how do I fix this?
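If the call site is taken as correct, one plausible reading is that the definition simply lost its stride parameter. A hedged sketch of that repair (the Conv1d body is an assumption, not the repository's confirmed implementation):

```python
import torch.nn as nn

def conv_subsumpling(in_planes, out_planes, stride=1):
    # 1x1 convolution that matches the channel count and downsamples by `stride`
    return nn.Conv1d(in_planes, out_planes, kernel_size=1,
                     stride=stride, bias=False)
```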

Can you provide the full Python version?

Same as the title: because I have a dependency version problem, could the author provide the exact Python version, e.g. Python 3.7.3?
I downloaded Python 3.7.9, but it cannot install PyTorch 1.1.0, because PyTorch 1.1.0 does not match 3.7.9.

Problem in inference

I tried to run the inference code for the HeartNet2D model by changing a few parameters, but I am getting an error:

RuntimeError: Error(s) in loading state_dict for HeartNet:
Missing key(s) in state_dict: "conv1.weight", "bn1.weight", "bn1.bias", "bn1.running_mean", "bn1.running_var", "layer0.0.conv1.weight", ..., "layer5.0.downsample.0.weight", "fc.weight", "fc.bias" (ResNet-style conv/bn/downsample/fc keys for layer0 through layer5).
Unexpected key(s) in state_dict: "features.0.weight", "features.0.bias", "features.2.weight", ..., "classifier.4.weight", "classifier.4.bias" (features.*/classifier.* keys).

So the checkpoint appears to have been saved from a different architecture than the model being constructed.

Problem related to the JSON file generation

While training with train.py, I get an error that no train.json file was found.

So, how did you generate these JSON files: 1707.01836.json, 1804.06812.json, 1911.IEEE.json, 2002.00254.json, EcgResNet34.json, EfficientNetB4.json, MobileNet.json? Those files are found in the configs/training/ folder.

Are the EcgResNet34 outputs probabilities?

Hello,

If my understanding is right: the dimension of the model output is (batch size × number of classes). Logically, I would expect the elements along the second dimension (of size number of classes) to sum to 1, i.e. to represent the probability of belonging to each class, so that the index of the maximum element identifies the class of the input.

But, in my case at least, the elements do not sum to 1, and I believe a normalization step (a softmax function) should be added as the last layer to make the sum equal to 1.
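For reference: a network that ends in a plain linear layer returns unnormalized logits, and applying softmax outside the model recovers probabilities without retraining. A generic PyTorch sketch, not code from this repository:

```python
import torch

logits = torch.randn(4, 8)              # (batch size, number of classes)
probs = torch.softmax(logits, dim=1)    # each row now sums to 1
print(probs.sum(dim=1))                 # tensor([1., 1., 1., 1.])
print(probs.argmax(dim=1))              # predicted class per sample

# Note: argmax over logits equals argmax over softmax(logits), since softmax
# is monotonic; softmax only matters when calibrated probabilities are needed.
```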

Thanks for your input,

E

subprocess.CalledProcessError: Command '['python3', 'dataset-generation.py', '--file', '../mit-bih\\114']' returned non-zero exit status 9009.

When I get to the third step (generating the 1D and 2D data files), it shows this error:

```
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "dataset-generation-pool.py", line 21, in <module>
    p.map(run, ecg_data)
  File "C:\Users\Weber\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Users\Weber\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 657, in get
    raise self._value
subprocess.CalledProcessError: Command '['python3', 'dataset-generation.py', '--file', '../mit-bih\114']' returned non-zero exit status 9009.
```
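Exit status 9009 is Windows' "command is not recognized" code, so the worker is most likely failing because the python3 launcher does not exist on Windows. Assuming the pool script spawns its workers with subprocess (as the traceback suggests), a portable sketch of the worker launch; the function name run mirrors the traceback, but the body is illustrative:

```python
import subprocess
import sys

def run(record_path):
    # sys.executable is the path of the current interpreter, so the same
    # command works on Windows (no `python3` launcher) and on Linux/macOS.
    subprocess.run(
        [sys.executable, "dataset-generation.py", "--file", record_path],
        check=True,
    )
```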

Where is the file 'data/class-mapper.json'?

Hello,
when I run pipeline.py, a FileNotFoundError happens: No such file or directory: 'data/class-mapper.json'. I wonder where I can find this file. Thank you.
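Judging by how the pipeline indexes it (self.mapper[label] != "N" in the KeyError traceback further below), the file appears to map class indices to annotation symbols. A purely hypothetical example of the shape such a file might take; the real indices and symbols must come from the annotation generation step:

```python
import json

# Hypothetical mapping from model class index to MIT-BIH beat symbol;
# the actual indices and symbols depend on how the annotations were generated.
class_mapper = {"0": "N", "1": "V", "2": "A", "3": "L", "4": "R"}

with open("data/class-mapper.json", "w") as f:
    json.dump(class_mapper, f, indent=2)
```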

Can the author guide me on which version of Python I need to use?

I use Python 3.9, but the console tells me:

C:\Users\xxxx\AppData\Roaming\Python\Python39\site-packages\pkg_resources\__init__.py:122: PkgResourcesDeprecationWarning: p is an invalid version and will not be supported in a future release
warnings.warn(

I think it is a Python and dependency version problem.

Sorry sir, I found that it is Python 3.7.

Get annotation "|" and "+" and "~" when prerpocessing data

The problem occours both in windows 10 and ubuntu 18 .
And also can't finish data processing normally.

Secondly, branch exp-026 don't have requirement.txt for env.
I suppose it's the same as branch master.
but when i execute ''python main.py'' after half-processed data, it run out with "can't find moudle: torchsampler"
and i can't pip install torchsampler, orz.
pls help me!
it's a great work ! BTW
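For context, "|", "+", and "~" are non-beat annotation symbols in the MIT-BIH convention (isolated QRS-like artifact, rhythm change, and signal quality change, respectively), so preprocessing normally filters annotations down to beat symbols before windowing. A hedged sketch with an illustrative whitelist:

```python
import wfdb

# Illustrative whitelist of MIT-BIH beat symbols; adjust to the classes you train on.
BEAT_SYMBOLS = {"N", "L", "R", "A", "a", "J", "S", "V", "F", "e", "j", "E", "/", "f", "Q"}

ann = wfdb.rdann("./mit-bih/100", "atr")
beats = [
    (sample, symbol)
    for sample, symbol in zip(ann.sample, ann.symbol)
    if symbol in BEAT_SYMBOLS   # drops non-beat marks such as "|", "+", "~"
]
```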

!python3 pipeline.py --config configs/pipelines/config.json

Thanks for your code! However, I got the error below. What happened? I am using Colab to run it.

```
Trainer: Pipeline1D <class 'pipelines.pipelines.Pipeline1D'>
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:481: UserWarning: This DataLoader will create 3 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  cpuset_checked))
Checkpoint ./experiments/EcgResNet34/checkpoints/00000635.pth successfully loaded
18it [00:04, 3.99it/s]
Traceback (most recent call last):
  File "pipeline.py", line 20, in <module>
    pipeline.run_pipeline()
  File "/content/drive/MyDrive/researchHub/ecg-classification/pipelines/base_pipeline.py", line 69, in run_pipeline
    and self.mapper[label] != "N"
KeyError: 5
```
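One common cause of exactly this failure: JSON object keys are always strings, so a mapper loaded with json.load has keys like "5", and indexing it with the integer label 5 raises KeyError (the other possibility is simply that class 5 is missing from the file). A hedged sketch of the string-key hypothesis, not a confirmed fix:

```python
import json

with open("data/class-mapper.json") as f:
    mapper = json.load(f)          # JSON keys are strings: {"5": "...", ...}

label = 5                          # integer class index from the model
symbol = mapper[str(label)]        # index with str(label) instead of label

# Alternatively, normalize the keys once after loading:
mapper = {int(k): v for k, v in mapper.items()}
symbol = mapper[label]
```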

Windows incompatibility issue with the label name "\"

Hi, thanks a lot for your work!

I've come across some issues when trying to annotate the raw data on my Windows machine. Taking a peek into your code, I believe the label name "\" is really problematic. I guess the root of the evil is the difference in file system naming conventions between Windows and macOS/Linux, so when trying to create a file with "/" or "\" in its name, the system gives an error. (Actually, os.path.join gives an erroneous result on my machine.)

FYI, I am using Python 3.7.3 and Windows 10 1903.
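Since "/" and "\" are path separators, annotation symbols containing them cannot be used verbatim in file names ("/" fails everywhere, "\" fails on Windows). One hedged workaround is to map unsafe symbols to safe names before building paths; the mapping below is illustrative, not the repository's scheme:

```python
import os

# Illustrative replacements for symbols that are unsafe in file names.
SAFE_NAMES = {"/": "slash", "\\": "backslash", "|": "pipe", "~": "tilde", "+": "plus"}

def safe_label_dir(root, symbol):
    # Fall back to the symbol itself when it is already filesystem-safe.
    name = SAFE_NAMES.get(symbol, symbol)
    path = os.path.join(root, name)
    os.makedirs(path, exist_ok=True)
    return path

print(safe_label_dir("dataset", "/"))   # dataset/slash
```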
