DeepPIC

This is the code repository for the paper "Highly Automatic and Universal Approach for Extracting Features from LC-MS Data Using Deep Learning". We developed DeepPIC, a deep learning-based pure ion chromatogram (PIC) method that extracts PICs directly and automatically from raw data files. DeepPIC has been integrated into the KPIC2 framework, and the combination provides the entire pipeline from raw data to discriminant models for metabolomic datasets.

Installation

1. Install Anaconda for Python 3.8.13.

2. Install R 4.2.1.

3. Install KPIC2 in R.

For KPIC2 installation instructions, see https://github.com/hcji/KPIC2.

  • First, install the dependencies of KPIC2.
    install.packages(c("BiocManager", "devtools", "Ckmeans.1d.dp", "Rcpp", "RcppArmadillo", "mzR", "parallel", "shiny", "plotly", "data.table", "GA", "IRanges",  "dbscan", "randomForest"))
    BiocManager::install(c("mzR","ropls"))
  • Then, download the source package of KPIC2 at url and install the package locally.

4. Create the environment and install the main packages.

  • Open a command line and create the environment.

    conda create --name DeepPIC python=3.8.13
    conda activate DeepPIC
  • Clone the repository and enter it.

    git clone https://github.com/yuxuanliao/DeepPIC.git
    cd DeepPIC
  • Install the main packages listed in requirements.txt with the following command.

    python -m pip install -r requirements.txt
  • Set the environment variables needed to call R through rpy2.

    R_HOME is the installation location of R.

    R_USER is the installation location of the rpy2 package.

    setx "R_HOME" "C:\Program Files\R\R-4.2.1"
    setx "R_USER" "C:\Users\yxliao\anaconda3\Lib\site-packages\rpy2"
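
Values written with setx are only visible in command windows opened afterwards, so it can help to confirm from Python that both variables are actually set before importing rpy2. The following is a minimal sketch, not part of the repository; the function name missing_r_variables is hypothetical.

```python
import os

# The two variables named in the installation step above.
REQUIRED_VARIABLES = ("R_HOME", "R_USER")


def missing_r_variables(environment=None):
    """Return the names of required variables that are unset or empty.

    `environment` defaults to the current process environment; passing a
    plain dict makes the check easy to try out.
    """
    environment = os.environ if environment is None else environment
    return [name for name in REQUIRED_VARIABLES if not environment.get(name)]


if __name__ == "__main__":
    missing = missing_r_variables()
    if missing:
        print("Not set:", ", ".join(missing))
    else:
        print("R_HOME =", os.environ["R_HOME"])
        print("R_USER =", os.environ["R_USER"])
```

If the script reports a variable as not set, open a new command window, activate the DeepPIC environment again, and rerun it.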

DeepPIC

The following files are in the DeepPIC folder:

  • train.py. trains the model
  • extract.py. extracts PICs from raw LC-MS files
  • predict.py. defines the IoU metric for PICs and evaluates the DeepPIC model
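
As a rough illustration of what an IoU (intersection over union) metric measures, the sketch below assumes that a predicted PIC and its label are each represented as a binary mask of equal length, with 1 marking points that belong to the PIC. Both the function name iou and the mask representation are assumptions made for this example; the definition in predict.py may differ.

```python
def iou(predicted, target):
    """Intersection over union of two equal-length binary masks."""
    if len(predicted) != len(target):
        raise ValueError("masks must have the same length")
    # Points marked in both masks.
    intersection = sum(1 for p, t in zip(predicted, target) if p and t)
    # Points marked in at least one mask.
    union = sum(1 for p, t in zip(predicted, target) if p or t)
    # Convention chosen here (an assumption): two empty masks match perfectly.
    return 1.0 if union == 0 else intersection / union


predicted = [0, 1, 1, 1, 0, 0]
target = [0, 0, 1, 1, 1, 0]
# 2 points are shared and 4 points appear in either mask, so this prints 0.5.
print(iou(predicted, target))
```

A value of 1.0 means the predicted points coincide exactly with the label, and 0.0 means they do not overlap at all.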

KPIC2

The following files are in the KPIC2 folder:

  • KPIC2.py. integrates DeepPIC into KPIC2 to implement the whole metabolomics processing workflow
  • KPIC2.R. code for feature detection, alignment, grouping, missing value filling, and building classification models
  • permutation_vip.py. defines functions for file format conversion, permutation testing, and biomarker selection
  • files:

Others

The following files are in the others folder:

  • metabolomics.py. code for the OPLS-DA scores plot, permutation test, biomarker selection, and hierarchical cluster analysis
  • quantitative.py. evaluates the quantitative ability of feature extraction methods
  • XCMS.R. code for peak detection with XCMS
  • Simulation:

Dataset

The dataset of 200 input-label pairs used to train, validate, and test the DeepPIC model is in the dataset folder. Because the model and the data exceed the size limits, the optimized model and the datasets (MM48, simulated MM48, quantitative, metabolomics, and different instrumental datasets) have been uploaded to the GitHub release page.

Usage

Example code for model training is in train.ipynb.

Example code for feature extraction is in extract.ipynb.

Example code for integrating DeepPIC into KPIC2 to implement the whole metabolomics processing workflow is in Integration_into_KPIC2.ipynb.

From raw LC-MS dataset to discriminant model

Running extract.py uses DeepPIC to extract PICs from each LC-MS file in a metabolomics dataset. Running KPIC2.py directly carries out the whole metabolomics processing workflow. See extract.ipynb and Integration_into_KPIC2.ipynb for details on using DeepPIC+KPIC2 to process your own data.
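
The per-file step described above can be pictured as a loop over the raw files in a dataset folder. The sketch below is only an illustration of that pattern: the helper names find_raw_files and process_dataset, the file extensions, and the extract_pics callable are all assumptions, and the real entry points are the ones shown in extract.ipynb.

```python
from pathlib import Path

# Assumed raw-file extensions; adjust to the formats your data actually uses.
RAW_EXTENSIONS = {".mzml", ".mzxml"}


def find_raw_files(dataset_dir):
    """Return the raw LC-MS files directly inside dataset_dir, sorted by name."""
    return sorted(
        path
        for path in Path(dataset_dir).iterdir()
        if path.is_file() and path.suffix.lower() in RAW_EXTENSIONS
    )


def process_dataset(dataset_dir, extract_pics):
    """Apply extract_pics to every raw file and collect results by file name.

    extract_pics is a user-supplied callable (hypothetical here) that takes
    a file path and returns the PICs extracted from that file.
    """
    return {path.name: extract_pics(path) for path in find_raw_files(dataset_dir)}
```

Passing the extraction step in as a callable keeps the file handling separate from the model, so the same loop can be tried with a stand-in function before the trained model is loaded.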

Information of maintainers
