cMonkey2 - Python port of the cMonkey biclustering algorithm

Description

This is the Python implementation of the cMonkey algorithm based on the original R implementation by David J. Reiss, Institute for Systems Biology.

Documentation

Complete documentation for installing and running cMonkey is available on the wiki. There are also developer and user discussion groups.

Contact

Please report bugs and other issues using the issue tracker. Please direct questions to the developer or user discussion groups.

System requirements

cMonkey2 has been tested on recent versions of Linux (Debian-based [Ubuntu, Mint, Debian] and RPM-based [CentOS, Fedora]) and on recent versions of Mac OS X. It has the following dependencies:

  • Python 2.7.x (the version cMonkey2 is developed and tested with)
  • scipy >= 0.9.0
  • numpy >= 1.6.0
  • biopython >= 1.63
  • MySQLdb >= 1.2.3
  • BeautifulSoup >= 3.2.0
  • R >= 2.14.1
  • rpy2 >= 2.2.1
  • MEME 4.3.0 or >= 4.8.1
  • csh (for running MEME)
  • Weeder 1.4.2 (needed for the human setup)

For running the unit tests (optional):

  • python-xmlrunner

For running the interactive monitoring and visualization web application (optional):

  • CherryPy 3
  • Jinja2
  • python-routes
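Before a first run it can help to confirm which of the Python dependencies listed above are importable. The following is a minimal sketch, not part of cMonkey2: the function name `check_modules` is hypothetical, and the import names are assumptions, since a package name does not always match its module name (biopython, for example, is imported as `Bio`). It does not check R, MEME, csh or Weeder, and it does not compare versions against the minimums.

```python
import importlib

# Assumed import names for the Python packages in the list above.
PYTHON_MODULES = {
    "scipy": "scipy",
    "numpy": "numpy",
    "biopython": "Bio",
    "MySQLdb": "MySQLdb",
    "BeautifulSoup": "BeautifulSoup",
    "rpy2": "rpy2",
}


def check_modules(modules=PYTHON_MODULES):
    """Return {package: version string, or None if the import fails}."""
    found = {}
    for package, module_name in modules.items():
        try:
            module = importlib.import_module(module_name)
        except Exception:
            # A broken install can raise errors other than ImportError.
            found[package] = None
            continue
        # Not every module exposes __version__, so fall back to "unknown".
        found[package] = getattr(module, "__version__", "unknown")
    return found


if __name__ == "__main__":
    for package, version in sorted(check_modules().items()):
        print("%-15s %s" % (package, version or "MISSING"))
```

Any package reported as MISSING has to be installed before cMonkey2 will run; the reported version strings still need to be compared by hand with the minimum versions above.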

Running the Unit Tests

./run_tests.sh

Running cmonkey2

In general, you should be able to run cmonkey2 on microbial gene expression ratios with:

./cmonkey.py --organism <organism-code> --ratios <tab separated file of gene expressions>

The file can be either in your file system or a web URL.
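A quick sanity check of the ratios file can catch formatting problems before a long run. The sketch below is an illustration under assumptions rather than the parser cMonkey2 uses: it assumes a header row naming the conditions, the gene identifier in the first column, and `NA` or an empty field for missing values. The function name `check_ratios` and the tiny in-memory table are made up for the example, and the code is written for a current Python 3 interpreter. It only handles local files, not URLs.

```python
import csv
import io


def check_ratios(handle):
    """Return (n_genes, n_conditions); raise ValueError on a malformed file."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header is None:
        raise ValueError("file is empty")
    n_conditions = len(header) - 1  # first column holds the gene identifier
    n_genes = 0
    for line_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            raise ValueError("line %d: expected %d fields, got %d"
                             % (line_no, len(header), len(row)))
        for value in row[1:]:
            # Empty strings and "NA" are treated as missing values here.
            if value not in ("", "NA"):
                float(value)  # raises ValueError if the field is not numeric
        n_genes += 1
    return n_genes, n_conditions


# Made-up two-gene, two-condition table, for illustration only.
example = io.StringIO("GENE\tcond1\tcond2\ngeneA\t0.12\t-0.4\ngeneB\tNA\t1.5\n")
print(check_ratios(example))  # (2, 2)
```

For a file on disk, pass an open handle instead, for example `check_ratios(open("ratios.tsv", newline=""))`, where `ratios.tsv` stands for your own file.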

Once the program has started, it writes a log file to cmonkey.log. You can see all available options with:

./cmonkey.py --help

Test Run with Halobacterium salinarum

The startup script cmonkey.py runs the current integrated system:

./cmonkey.py --organism hal --ratios example_data/hal/halo_ratios5.tsv

Start the Python-based monitoring application:

python cmviewer/main.py

Another way to run Halobacterium is to specify the RSAT database explicitly:

./cmonkey.py --organism hal --ratios example_data/hal/halo_ratios5.tsv --rsat_organism Halobacterium_NRC_1_uid57769 --rsat_base_url http://pedagogix-tagc.univ-mrs.fr/rsat --rsat_features gene --nooperons --use_BSCM
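Long invocations like the one above are easier to read and vary when the options are kept in a dictionary and turned into an argument list. This is a minimal sketch, assuming Python 3.7 or later (for ordered dictionaries) and that it is run from the repository root; `build_command` and `run_cmonkey` are hypothetical helper names, not part of cMonkey2, and every flag shown is copied from the command above.

```python
import shlex
import subprocess


def build_command(options, switches=(), script="./cmonkey.py"):
    """Turn {"organism": "hal"} into ["./cmonkey.py", "--organism", "hal"]."""
    command = [script]
    for name, value in options.items():
        command += ["--" + name, str(value)]
    # Switches are flags that take no value, such as --nooperons.
    command += ["--" + name for name in switches]
    return command


def run_cmonkey(command):
    """Launch the run and return the exit code of the process."""
    return subprocess.call(command)


command = build_command(
    {
        "organism": "hal",
        "ratios": "example_data/hal/halo_ratios5.tsv",
        "rsat_organism": "Halobacterium_NRC_1_uid57769",
        "rsat_base_url": "http://pedagogix-tagc.univ-mrs.fr/rsat",
        "rsat_features": "gene",
    },
    switches=("nooperons", "use_BSCM"),
)
# Print a shell-safe version of the command for inspection or copy and paste.
print(" ".join(shlex.quote(part) for part in command))
```

Calling `run_cmonkey(command)` starts the actual run; the sketch only prints the command so that it can be reviewed first.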

Running cMonkey on Human

To run cMonkey on human data, run the following command with your own <ratios.tsv> and <stringFile> files:

./cmonkey.py --organism hsa --ratios <ratios.tsv> --string <stringFile> --rsat_organism Homo_sapiens_GRCh37 --rsat_URL http://rsat.sb-roscoff.fr/ --rsat_features protein_coding --nooperons

More details for running cMonkey on human data

Running cMonkey on human data is somewhat difficult because neither the STRING database nor the RSAT database has human data cleanly entered. Here are the steps for a successful Python cMonkey run on human data:

  1. Make a gene interaction file. The example data file mentioned above was generated from BioGRID around 10/6/14.
  2. Find an RSAT mirror that has .raw chromosome files and feature files. In the above example, we use Homo_sapiens_ensembl_74_GRCh37 from the main RSAT database. To annotate these we use 'protein_coding.tab' and 'protein_coding_names.tab'. In principle, other annotation files such as 'processed_transcript' would work just as well.
  3. Adjust the upstream region searched, and perhaps modify the code to search for known TF and miRNA motifs rather than de novo motifs. NOTE: Modifying the motif search step is non-trivial.
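Step 1 amounts to reducing an interaction table to a list of gene pairs. The sketch below shows one way this could be done; it is an assumption-laden illustration, not the procedure used to build the example file. The output layout (two gene columns plus a score), the constant score of 1000, the column names `gene_a` and `gene_b`, and the function name `to_edge_list` are all hypothetical, so check the wiki for the layout that `--string` actually expects. The code is written for a current Python 3 interpreter.

```python
import csv
import io


def to_edge_list(handle, col_a, col_b, score=1000):
    """Collapse a tab-separated interaction table into unique gene pairs."""
    reader = csv.DictReader(handle, delimiter="\t")
    edges = set()
    for row in reader:
        a, b = row[col_a].strip(), row[col_b].strip()
        # Skip incomplete rows and self-interactions.
        if a and b and a != b:
            # Sorting the pair makes A-B and B-A count as the same edge.
            edges.add(tuple(sorted((a, b))))
    return [(a, b, score) for a, b in sorted(edges)]


# Placeholder table with a duplicate pair and a self-interaction.
table = io.StringIO("gene_a\tgene_b\nG2\tG1\nG1\tG2\nG3\tG2\nG1\tG1\n")
for edge in to_edge_list(table, "gene_a", "gene_b"):
    print("\t".join(str(field) for field in edge))
```

With a real export, the two column names would be replaced by whatever the source table calls its interactor columns.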

Contributors

aebrahim, cplaisier, dreiss-isb, sdanzige, weiju
