
genelist-api

KnetMiner genome and genepage API

This script queries the KnetMiner Knowledge Graph with a user-provided gene list (one gene ID per line) and a user-provided keyword file (one keyword per line). It returns the knetscore, genome location, and KnetMiner genepage URL for each user gene that can be associated with any of the keywords. The script was tested with up to 3000 wheat genes and 10 keywords. It works for various species, including Wheat, Rice, Arabidopsis, F. graminearum, Z. tritici and more.
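To make the expected input format concrete, here is a minimal sketch of reading a one-entry-per-line file, as used for both the gene list and the keyword file. The helper name `read_lines` and the gene IDs are hypothetical placeholders, and skipping blank lines is an assumption; the script's own parsing may differ.

```python
from pathlib import Path
import tempfile


def read_lines(path):
    """Return the non-empty, whitespace-stripped lines of a text file."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


# Demonstration with a temporary file holding placeholder gene IDs.
with tempfile.TemporaryDirectory() as tmp:
    gene_file = Path(tmp) / "genes.txt"
    gene_file.write_text("GENE_A\nGENE_B\n\nGENE_C\n", encoding="utf-8")
    genes = read_lines(gene_file)
    print(genes)  # ['GENE_A', 'GENE_B', 'GENE_C']
```

The same helper would work for a keyword file, since both inputs are described as one entry per line.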

Prerequisites

The script requires Python 3 with the following dependencies (last tested with Python 3.8.2):

  • pandas

  • numpy

  • argparse

  • requests

  • resource (included in Linux/Unix but needed for Windows conda)

  • A Python virtual environment, e.g. pyvenv for Python 3. If you do not have root permission on Easybuild, a virtual environment is required to install the dependencies through pip.

  • The script uses the KnetMiner REST API and therefore requires an internet connection.
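Before running the script, it can help to confirm that the listed packages are importable in the active environment. The following is a small sketch using only the standard library. The helper name `missing_modules` is hypothetical, and `resource` is left out of the list because its availability depends on the platform.

```python
import importlib.util

# Package names taken from the dependency list above.
REQUIRED = ["pandas", "numpy", "argparse", "requests"]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are available.")
```

Any package reported as missing can then be installed with pip inside the virtual environment.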

Execution

Downloading the repository

Clone or download this repository using the green "Clone or Download" button. To clone this repository via git (command line), simply execute the following:

git clone https://github.com/Rothamsted/genelist-api.git

You'll find all the relevant files in the genelist-api folder, created in the directory where you ran the clone command.

Setting up Python and dependencies

Log in to a compute node:

srun --pty bash -i

Check the available versions of Python and load one:

module avail Python
module load <Python3 version>

A virtualenv is required if you lack permissions to pip install:

virtualenv <name of Python virtual environment>
source </path to env>/bin/activate

Install any missing Python packages:

pip install pandas
pip install numpy
pip install argparse
pip install requests

Run the script

Change to the cloned/downloaded script folder. To see the script help page run:

python3 genepage_insight.py -h

Supported arguments are:

  • -g OR --gene Text file which contains your list of gene IDs or names (one per line).
  • -k OR --keywords Text file which contains the search terms or keywords of interest to you (one per line).
  • -s OR --species Currently supporting rice, wheat, or arabidopsis (ara).
  • -o OR --output Output directory. If not provided, a file will be created using your gene file name with '_output' appended to it, where your results and dependent files will be found.
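The arguments above could be declared with argparse roughly as follows. This is an illustrative sketch, not the script's actual parser: the function names `build_parser` and `default_output` are hypothetical, the long form `--output` is assumed, and the default output name is assumed to be the gene file name without its extension plus '_output'.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser mirroring the documented command-line flags."""
    parser = argparse.ArgumentParser(
        description="Query KnetMiner with a gene list and keywords."
    )
    parser.add_argument("-g", "--gene", required=True,
                        help="text file with one gene ID or name per line")
    parser.add_argument("-k", "--keywords", required=True,
                        help="text file with one keyword per line")
    parser.add_argument("-s", "--species", required=True,
                        help="species name, e.g. rice, wheat or ara")
    parser.add_argument("-o", "--output", default=None,
                        help="output location (optional)")
    return parser


def default_output(gene_file):
    """Derive a default output name: gene file stem plus '_output' (assumed)."""
    path = Path(gene_file)
    return path.with_name(path.stem + "_output")


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["-g", "example_list.txt", "-k", "keywords_heat.txt", "-s", "wheat"]
)
output = args.output or default_output(args.gene)
print(output)  # example_list_output
```

Passing the argument list explicitly keeps the example self-contained; a real script would call `parse_args()` with no arguments to read from the command line.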

Example command:

python3 genepage-insight.py -g example_list.txt -k keywords_heat.txt -s wheat -o /home/$USER/test_output.txt

Output information

The output will be a tabular text file containing 5 columns: Gene ID, Knetscore, Chromosome, Gene start and Network URL.

The Knetscore indicates the relevance of a gene to the provided keywords as described in Hassani-Pak (2017), PhD thesis.

The URL links to an interactive knowledge network of that gene, with links to publications, ontologies, pathways, etc. that contain the keywords.
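Once a results file exists, it can be ranked by Knetscore with a few lines of Python. The sketch below makes several assumptions: the file is tab-delimited, it has a header row with the five column names listed above, and a higher Knetscore means higher relevance. The sample rows are made-up placeholders, not real KnetMiner results, and the helper name `rank_by_knetscore` is hypothetical.

```python
import csv
import io

# Placeholder data in the assumed layout (tab-delimited, with a header row).
SAMPLE = (
    "Gene ID\tKnetscore\tChromosome\tGene start\tNetwork URL\n"
    "GENE_A\t12.5\t1A\t1000\thttps://example.org/network/GENE_A\n"
    "GENE_B\t48.0\t2B\t2000\thttps://example.org/network/GENE_B\n"
)


def rank_by_knetscore(handle):
    """Read tab-delimited rows and sort them by Knetscore, highest first."""
    rows = list(csv.DictReader(handle, delimiter="\t"))
    return sorted(rows, key=lambda row: float(row["Knetscore"]), reverse=True)


for row in rank_by_knetscore(io.StringIO(SAMPLE)):
    print(row["Gene ID"], row["Knetscore"], row["Network URL"])
```

To use a real output file, replace `io.StringIO(SAMPLE)` with an open file handle, adjusting the delimiter and column names if the actual file differs.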

Authors

Keywan Hassani-Pak

Colin Li

Joseph Hearnshaw

