GRPM System

The GRPM (Gene-Rsid-Pmid-Mesh) system is a tool for integrating and analyzing genetic polymorphism data associated with specific biomedical subjects. It comprises five modules that handle data retrieval, merging, analysis, and the incorporation of GWAS data.

medrxiv Manuscript DOI

Overview

Introduction

GRPM System is a Python framework that builds a comprehensive dataset of human genetic polymorphisms associated with nutrition. By combining data from multiple sources and using MeSH terms as an organizing framework, the workflow lets researchers search the genetic literature for variants significantly associated with a specific biomedical subject. The resource was developed mainly to help nutritionists investigate gene-diet interactions and implement personalized nutrition interventions.

Graphical Abstract

Modules

The GRPM System comprises five modules that perform various tasks to facilitate the integration and analysis of genetic polymorphism data associated with nutrition. These modules are as follows:

To try out GRPM System, run each module separately by clicking its "Open in Colab" badge. Be sure to import all necessary dependencies and files. An option to sync with a Google Drive folder is available.

Each Jupyter notebook includes the code to download and install the requirements it needs to run.

| No. | Notebook | Module | Description |
|-----|----------|--------|-------------|
| 1 | Open In Colab | Dataset Builder | Retrieves data from the LitVar and PubMed databases and merges them into CSV format. |
| 2 | Open In Colab | MeSH Selection for Retrieval | Uses NLP to define a coherent MeSH term list for information retrieval over the whole GRPM Dataset. |
| 3 | Open In Colab | GRPM Dataset MeSH Query | Uses MeSH terms to query the GRPM Dataset, extracts the subset of matched entities, and produces a Data Report. |
| 4 | Open In Colab | GRPM Data Analyzer | Analyzes the retrieved data and calculates survey metrics. Data visualization through matplotlib and seaborn. |
| 5 | Open In Colab | GRPM-GWAS Data Integration | Integrates GWAS data, associating GWAS phenotypes and potential risk/effect alleles with the GRPM Dataset. |

GRPM system: Integrating Genetic Polymorphism Data with PMIDs and MeSH Terms to Retrieve Genes and rsIDs for Biomedical Research Fields. GRPM Dataset: pcg, protein-coding genes; rna, RNA genes; pseudo, pseudogenes; in parentheses, dataset shape.

These modules provide a comprehensive framework for researchers and nutritionists to explore genetic polymorphism data and gain insights into gene-diet interactions and personalized nutrition interventions.
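The flow from Module 1 (merge) through Module 3 (MeSH query) to Module 4 (survey metrics) can be illustrated with a small, dependency-free sketch. This is not the notebooks' code: the function names `build_grpm`, `query_by_mesh`, and `count_pmids`, the row layout, and every gene, rsID, and PMID below are placeholders chosen for illustration, and the real notebooks work on much larger tables.

```python
from collections import defaultdict

# Placeholder records; every identifier below is made up for illustration.
variant_rows = [  # (gene, rsid, pmid), as a LitVar-like source might report
    ("GENE_A", "rs0000001", 101),
    ("GENE_A", "rs0000001", 102),
    ("GENE_B", "rs0000002", 103),
]
mesh_rows = [  # (pmid, MeSH term), as a PubMed-like source might report
    (101, "Obesity"),
    (102, "Diet"),
    (103, "Neoplasms"),
]

def build_grpm(variants, mesh):
    """Join the two sources on PMID into (gene, rsid, pmid, mesh) rows."""
    terms_by_pmid = defaultdict(list)
    for pmid, term in mesh:
        terms_by_pmid[pmid].append(term)
    return [
        (gene, rsid, pmid, term)
        for gene, rsid, pmid in variants
        for term in terms_by_pmid.get(pmid, [])
    ]

def query_by_mesh(grpm, reference_terms):
    """Keep only rows whose MeSH term appears in the reference list."""
    wanted = set(reference_terms)
    return [row for row in grpm if row[3] in wanted]

def count_pmids(rows):
    """Count distinct PMIDs supporting each (gene, rsid) pair."""
    pmids = defaultdict(set)
    for gene, rsid, pmid, _term in rows:
        pmids[(gene, rsid)].add(pmid)
    return {pair: len(found) for pair, found in pmids.items()}

grpm = build_grpm(variant_rows, mesh_rows)
subset = query_by_mesh(grpm, ["Obesity", "Diet"])
report = count_pmids(subset)
print(report)
```

Joining on PMID is what lets a MeSH term list, which describes publications, select genes and rsIDs, which are only linked to publications through the variant source.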

Updates

The GRPM Dataset available on Zenodo is a snapshot of LitVar1. LitVar1 is now deprecated and has been fully replaced by LitVar2. Module 1 (Dataset Builder) has been updated to retrieve data from LitVar2. The subsequent modules in the pipeline remain functional and can be tested using the original version of the GRPM Dataset available on Zenodo.

Installation

To install GRPM System, clone the repository to your local machine:

git clone https://github.com/johndef64/GRPM_system.git

Alternatively, run each module separately in Google Colab, mounting Google Drive to save your progress between sessions.
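As a minimal sketch of the Drive option, the helper below mounts Google Drive when it runs inside Colab and falls back to the current directory elsewhere. The function name `get_workdir` and the folder name `grpm_workdir` are assumptions made for this example; the notebooks may organize their files differently.

```python
from pathlib import Path

def get_workdir(folder="grpm_workdir"):
    """Return a working directory on Google Drive in Colab, or a local one elsewhere.

    The folder name is a hypothetical default, not one defined by GRPM System.
    """
    try:
        from google.colab import drive  # importable only inside Colab
    except ImportError:
        base = Path.cwd()
    else:
        drive.mount("/content/drive")  # prompts for authorization in Colab
        base = Path("/content/drive/MyDrive")
    workdir = base / folder
    workdir.mkdir(parents=True, exist_ok=True)
    return workdir

workdir = get_workdir()
print(workdir)
```

Files written under the returned directory persist across Colab sessions when Drive is mounted, so a later module can read what an earlier one produced.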

Usage

Detailed instructions for each module of GRPM System are provided in the corresponding Jupyter notebook in the repository. Follow those instructions and install the Python packages specified for each module.

Requirements

GRPM System has the following requirements:

  • Python 3.9 or above
  • pandas
  • requests
  • biopython
  • nbib
  • beautifulsoup
  • openai
  • matplotlib
  • seaborn
  • nltk
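Before running a notebook locally, it can help to check which of these requirements are already installed. The sketch below is one way to do that with the standard library; the helper name `missing_packages` is hypothetical. Note that some import names differ from the listed names: biopython is imported as `Bio`, and Beautiful Soup is published on PyPI as `beautifulsoup4` and imported as `bs4`.

```python
import sys
from importlib.util import find_spec

# Listed requirement -> module name used in an import statement.
IMPORT_NAMES = {
    "pandas": "pandas",
    "requests": "requests",
    "biopython": "Bio",
    "nbib": "nbib",
    "beautifulsoup": "bs4",
    "openai": "openai",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "nltk": "nltk",
}

def missing_packages(import_names=IMPORT_NAMES):
    """Return the listed requirements whose module cannot be found."""
    return [pkg for pkg, module in import_names.items() if find_spec(module) is None]

python_ok = sys.version_info >= (3, 9)
missing = missing_packages()
print("Python 3.9 or above:", python_ok)
print("Missing packages:", ", ".join(missing) if missing else "none")
```

The check only confirms that each module can be located; it does not verify versions, which the README does not specify.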
