PRS-Net

About

This repository contains the code and resources of the following paper:

PRS-Net: Interpretable polygenic risk scores via geometric learning

Overview of the framework

PRS-Net is an interpretable genomic deep learning-based approach designed to effectively model the nonlinearity of the biological system and deliver precise and interpretable polygenic risk scores.

Data preprocessing

Setup environment

Set up the required environment from environment_data.yml with Anaconda. From the project directory, run:

conda env create -f environment_data.yml

Activate the environment

conda activate PRS-Net_data

Step 1: Arrange GWAS summary statistic data

Ensure the columns in your GWAS summary statistic data are ordered as follows, separated by tabs (\t): CHR, BP, SNP, A1, A2, N, SE, P, OR, INFO, MAF, BETA, where A1 stands for the effect allele of the SNP and A2 stands for the non-effect allele of the SNP.
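If your summary statistics already contain these twelve columns but in a different order, a small script can rearrange them. The following is a minimal sketch, not part of PRS-Net: it assumes the input file has a header row that uses exactly the column names listed above, and the function name reorder_gwas_columns is hypothetical. Whether the PRS-Net scripts expect a header row in the output is not stated here, so check this against 1_data_preprocess.sh before relying on it.

```python
import csv
import io

# Column order stated in Step 1 of this README.
EXPECTED_COLUMNS = ["CHR", "BP", "SNP", "A1", "A2", "N",
                    "SE", "P", "OR", "INFO", "MAF", "BETA"]


def reorder_gwas_columns(src, dst):
    """Copy a tab-separated table from src to dst in EXPECTED_COLUMNS order.

    src and dst are open text file objects. The input is assumed to have a
    header row (an assumption of this sketch). Columns not listed in
    EXPECTED_COLUMNS are dropped; a missing column raises ValueError.
    """
    reader = csv.DictReader(src, delimiter="\t")
    present = reader.fieldnames or []
    missing = [c for c in EXPECTED_COLUMNS if c not in present]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    writer = csv.DictWriter(dst, fieldnames=EXPECTED_COLUMNS, delimiter="\t",
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
```

To use it on files, open the input and output with open(path, newline="") and pass the two file objects to the function.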

Step 2: Run preprocessing script

Execute the 1_data_preprocess.sh script with the following command:

./1_data_preprocess.sh your_base_data_path your_target_data_path your_phenotype_data_path your_ld_data_path your_output_path

your_base_data_path: the path of the GWAS summary statistics file prepared in Step 1
your_target_data_path: the path (without file extension) of the target data in PLINK format (.bim, .fam, .bed)
your_phenotype_data_path: the path of the phenotype data, separated by spaces (Sample_ID Sample_ID Phenotype_label)
your_ld_data_path: the path (without file extension) of the PLINK-format data used for LD calculation
your_output_path: the path of the output files
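Because the target and LD paths are given without a file extension, it is easy to pass a prefix for which one of the three PLINK files is absent. The following is a minimal sketch of a pre-flight check, not part of PRS-Net; the function name missing_plink_files is hypothetical.

```python
from pathlib import Path

# The three files that make up a binary PLINK fileset.
PLINK_SUFFIXES = (".bed", ".bim", ".fam")


def missing_plink_files(prefix):
    """Return the PLINK files that do not exist for a prefix without extension."""
    return [prefix + s for s in PLINK_SUFFIXES if not Path(prefix + s).is_file()]
```

An empty list means that all three files exist for the prefix; otherwise the list names the files to locate before running 1_data_preprocess.sh.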

Step 3: Run PRS-Net data generation script

Execute the 2_data_generation.sh script with the following command:

./2_data_generation.sh your_preprocessed_data_path your_phenotype_data_path your_output_path

your_preprocessed_data_path: the path of the preprocessed files derived from Step 2
your_phenotype_data_path: the path of the phenotype data separated by tabs (ID \t ID \t Label)
your_output_path: the path of the output files
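Step 2 describes the phenotype file as blank-separated, while Step 3 describes it as tab-separated. Whether a single file works for both steps is not stated here. If a tab-separated copy is needed, the conversion can be sketched as below; this is an illustration rather than part of PRS-Net, the function name blanks_to_tabs is hypothetical, and it assumes exactly three fields per line with no header row.

```python
def blanks_to_tabs(lines):
    """Yield tab-separated lines from whitespace-separated phenotype lines.

    Each non-empty input line must hold exactly three fields
    (ID, ID, Label); anything else raises ValueError.
    """
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip empty lines
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(f"line {number}: expected 3 fields, got {len(fields)}")
        yield "\t".join(fields)
```

The function accepts any iterable of lines, so an open file object can be passed directly and the results written out one per line.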

Train

Setup environment

Set up the required environment from environment.yml with Anaconda. From the project directory, run:

conda env create -f environment.yml

Activate the environment

conda activate PRS-Net

An example dataset is available at https://cloud.tsinghua.edu.cn/f/30a1c8df0c024fefa3c1/?dl=1.

Unzip the file, place it in example_dataset/, and execute train.py with the following command:

python train.py --data_path ../example_dataset/ --dataset ad_eur

License

PRS-Net is licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0.

prs-net's Issues

Questions about installing the package

Hi, thanks for your interesting project. While installing the package to reproduce the training result, I encountered the following error:

      /tmp/pip-build-env-5nfkk1gy/overlay/lib/python3.7/site-packages/setuptools/command/build_py.py:201: _Warning: Package 'pyreadr.libs.lzma.lzma' is absent from the `packages` configuration.
      !!
      
              ********************************************************************************
              ############################
              # Package would be ignored #
              ############################
              Python recognizes 'pyreadr.libs.lzma.lzma' as an importable package[^1],
              but it is absent from setuptools' `packages` configuration.
      
              This leads to an ambiguous overall configuration. If you want to distribute this
              package, please make sure that 'pyreadr.libs.lzma.lzma' is explicitly added
              to the `packages` configuration field.
      
              Alternatively, you can also rely on setuptools' discovery methods
              (for example by using `find_namespace_packages(...)`/`find_namespace:`
              instead of `find_packages(...)`/`find:`).
      
              You can read more about "package discovery" on setuptools documentation page:
      
              - https://setuptools.pypa.io/en/latest/userguide/package_discovery.html
      
              If you don't want 'pyreadr.libs.lzma.lzma' to be distributed and are
              already explicitly excluding 'pyreadr.libs.lzma.lzma' via
              `find_namespace_packages(...)/find_namespace` or `find_packages(...)/find`,
              you can try to use `exclude_package_data`, or `include-package-data=False` in
              combination with a more fine grained `package-data` configuration.
      
              You can read more about "package data files" on setuptools documentation page:
      
              - https://setuptools.pypa.io/en/latest/userguide/datafiles.html
      
      
              [^1]: For Python, any directory (with suitable naming) can be imported,
                    even if it does not contain any `.py` files.
                    On the other hand, currently there is no concept of package data
                    directory, all directories are treated like packages.
              ********************************************************************************
      
      !!
        check.warn(importable)
      /tmp/pip-build-env-5nfkk1gy/overlay/lib/python3.7/site-packages/setuptools/command/build_py.py:201: _Warning: Package 'pyreadr.libs.zlib' is absent from the `packages` configuration.
      !!
      
              ********************************************************************************
              ############################
              # Package would be ignored #
              ############################
              Python recognizes 'pyreadr.libs.zlib' as an importable package[^1],
              but it is absent from setuptools' `packages` configuration.
      
              This leads to an ambiguous overall configuration. If you want to distribute this
              package, please make sure that 'pyreadr.libs.zlib' is explicitly added
              to the `packages` configuration field.
      
              Alternatively, you can also rely on setuptools' discovery methods
              (for example by using `find_namespace_packages(...)`/`find_namespace:`
              instead of `find_packages(...)`/`find:`).
      
              You can read more about "package discovery" on setuptools documentation page:
      
              - https://setuptools.pypa.io/en/latest/userguide/package_discovery.html
      
              If you don't want 'pyreadr.libs.zlib' to be distributed and are
              already explicitly excluding 'pyreadr.libs.zlib' via
              `find_namespace_packages(...)/find_namespace` or `find_packages(...)/find`,
              you can try to use `exclude_package_data`, or `include-package-data=False` in
              combination with a more fine grained `package-data` configuration.
      
              You can read more about "package data files" on setuptools documentation page:
      
              - https://setuptools.pypa.io/en/latest/userguide/datafiles.html
      
      
              [^1]: For Python, any directory (with suitable naming) can be imported,
                    even if it does not contain any `.py` files.
                    On the other hand, currently there is no concept of package data
                    directory, all directories are treated like packages.
              ********************************************************************************
      
      !!
        check.warn(importable)
      error: command '/usr/bin/gcc' failed with exit code 1
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for pyreadr
ERROR: Could not build wheels for pyreadr, which is required to install pyproject.toml-based projects

It seems that the current conda.yml file is not sufficient to build the environment. Could you please provide more information? Thanks.
