
exonize's Introduction

███████╗██╗  ██╗ ██████╗ ███╗   ██╗██╗███████╗███████╗
██╔════╝╚██╗██╔╝██╔═══██╗████╗  ██║██║╚══███╔╝██╔════╝
█████╗   ╚███╔╝ ██║   ██║██╔██╗ ██║██║  ███╔╝ █████╗
██╔══╝   ██╔██╗ ██║   ██║██║╚██╗██║██║ ███╔╝  ██╔══╝
███████╗██╔╝ ██╗╚██████╔╝██║ ╚████║██║███████╗███████╗
╚══════╝╚═╝  ╚═╝ ╚═════╝ ╚═╝  ╚═══╝╚═╝╚══════╝╚══════╝

Discover exon duplications!

Dependencies

Getting started

The easiest way to install exonize is from PyPI.org (once we upload it there), using pip install exonize.

If installing from the repo:

$ git clone git@github.com:msarrias/exonize.git
$ cd exonize
$ pip install .

You should now be able to run exonize -h.

Required arguments

Optional arguments

Usage

Analyzing an example dataset

Citation

exonize's People

Contributors

msarrias, arvestad, mvggz

Forkers

arvestad

exonize's Issues

Refactor the blast code

I think clarity would be improved if Blast-related code was refactored into a separate file and class. The Exonize object is quite heavy with many attributes and methods. If blast code was moved to a separate class, it could be instantiated in exonize.py and all its gory details would be "hidden" there, before being shared as an attribute to Exonize. Ideally, the blast abstraction would be so good that Exonize does not even need to know how sequences are compared, but that is not a priority.
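The refactoring proposed above could look roughly like the sketch below. All names here (SequenceSearcher, build_command, ExonizeSketch) are hypothetical and not taken from the exonize code base, and the choice of tblastx with XML output is an assumption, since the issue only says "Blast". The point is only the shape: the search details live in one class, and the main object holds an instance of it as an attribute.

```python
import subprocess


class SequenceSearcher:
    """Hypothetical wrapper that hides how two sequences are compared."""

    def __init__(self, evalue_threshold: float = 1e-5):
        self.evalue_threshold = evalue_threshold

    def build_command(self, query_path: str, subject_path: str) -> list:
        # Assumption: tblastx from BLAST+ is the program being wrapped.
        return [
            "tblastx",
            "-query", query_path,
            "-subject", subject_path,
            "-evalue", str(self.evalue_threshold),
            "-outfmt", "5",  # XML output
        ]

    def search(self, query_path: str, subject_path: str) -> str:
        # Runs the external program and returns its raw output.
        completed = subprocess.run(
            self.build_command(query_path, subject_path),
            check=True,
            capture_output=True,
            text=True,
        )
        return completed.stdout


class ExonizeSketch:
    """Stand-in for the Exonize class; it only stores the searcher."""

    def __init__(self, searcher: SequenceSearcher):
        # The search details stay behind this single attribute.
        self.searcher = searcher
```

With this layout, swapping the comparison method would mean passing a different searcher object, without touching the main class.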

insert_fragments_calls

This relates to the functions insert_fragments_calls() in sql_utils.py and insert_fragments_table() in exonize_handler.py.

  • I was confused by the function insert_fragments_calls(). It returns two strings that are used for two separate SQL execute commands. I think it would make more sense to have one function per execute.
  • insert_fragments_table() has an internal function, which in turn has an internal function. That is too many levels of nesting. The strand analysis could be done somewhere else, either when the blast results are first collected or right before inserting into the DB tables.
  • get_gene_tuple() returns a tuple with 6 elements. Element 3 is within parentheses. Is there a reason for that? I printed the results and cannot see that it is needed.
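The first point, one function per execute, could be sketched as follows. The table names, column names, and function names are made up for illustration and are not the exonize schema; the sketch only assumes that the database is SQLite and is accessed through Python's sqlite3 module.

```python
import sqlite3


def create_tables(connection: sqlite3.Connection) -> None:
    # Illustrative schema only; not the real exonize tables.
    with connection:
        connection.execute(
            "CREATE TABLE Genes (gene_id TEXT, gene_start INTEGER, gene_end INTEGER)"
        )
        connection.execute(
            "CREATE TABLE Fragments (gene_id TEXT, cds_start INTEGER, cds_end INTEGER)"
        )


def insert_genes(connection: sqlite3.Connection, rows: list) -> None:
    # One function owns one statement and executes it itself.
    with connection:  # commits on success, rolls back on error
        connection.executemany(
            "INSERT INTO Genes (gene_id, gene_start, gene_end) VALUES (?, ?, ?)",
            rows,
        )


def insert_fragments(connection: sqlite3.Connection, rows: list) -> None:
    with connection:
        connection.executemany(
            "INSERT INTO Fragments (gene_id, cds_start, cds_end) VALUES (?, ?, ?)",
            rows,
        )


def count_rows(connection: sqlite3.Connection, table: str) -> int:
    # The table name comes from this sketch only, never from user input.
    return connection.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

A caller then no longer has to know which returned string belongs to which execute; it calls insert_genes(connection, rows) and insert_fragments(connection, rows) directly.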

BiopythonWarning: You may be importing Biopython from inside the source tree

After installing exonize following the instructions in the readme file I got the following warning:

/opt/homebrew/lib/python3.11/site-packages/Bio/__init__.py:138: BiopythonWarning: You may be importing Biopython from inside the source tree. This is bad practice and might lead to downstream issues. In particular, you might encounter ImportErrors due to missing compiled C extensions. We recommend that you try running your code from outside the source tree. If you are outside the source tree then you have a setup.py file in an unexpected directory: /opt/homebrew/lib/python3.11/site-packages
warnings.warn(

Did you encounter the same issue?

AttributeError - '_Exonize__query_CDS_frame'

AttributeError: 'Exonize' object has no attribute '_Exonize__target_CDS_frame'. Did you mean: '_Exonize__query_CDS_frame'?
Classifying events %

Seems like I forgot to initialize some attributes.
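The _Exonize__ prefix in the message comes from Python's name mangling of attributes that start with two underscores inside a class body. The sketch below reproduces the same kind of error with two stand-in classes (Broken and Fixed are hypothetical, not the real Exonize class) and shows that assigning the attribute in __init__ resolves it.

```python
class Broken:
    def __init__(self):
        self.__query_CDS_frame = 0
        # __target_CDS_frame is never assigned here.

    def frames(self):
        # Inside the class, __target_CDS_frame is looked up as
        # _Broken__target_CDS_frame, which does not exist.
        return self.__query_CDS_frame, self.__target_CDS_frame


class Fixed:
    def __init__(self):
        self.__query_CDS_frame = 0
        self.__target_CDS_frame = 0  # initialized alongside the query frame

    def frames(self):
        return self.__query_CDS_frame, self.__target_CDS_frame


def error_message_for_broken() -> str:
    # Returns the AttributeError text raised by the incomplete class.
    try:
        Broken().frames()
    except AttributeError as error:
        return str(error)
    return ""
```

Calling error_message_for_broken() returns a message that names the mangled attribute, while Fixed().frames() returns both values.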

MXE cases of multi-CDS duplications

Check for exclusive pairs that are classified as “optional” (neither > 0) but are actually MXE. This may simply be a case where more than two duplicated exons can be sampled at that site.

Check for adjacency

This relates to the duplication mechanism, i.e., whether it is a tandem duplication or a long-region duplication.

If the best reciprocal hits are adjacent to each other, we are dealing with tandem duplications and not with a multi-exon duplication.
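One possible way to express the adjacency test is sketched below. It assumes, purely for illustration, that each hit maps exactly onto one annotated exon and that the exons of a gene are available as a list of (start, end) tuples sorted by position; the function name are_adjacent is hypothetical and the real data structures in exonize may differ.

```python
def are_adjacent(ordered_exons: list, query: tuple, target: tuple) -> bool:
    """Return True if the two exons are consecutive in the ordered list."""
    query_index = ordered_exons.index(query)
    target_index = ordered_exons.index(target)
    # Consecutive positions in the sorted list mean no exon lies between them.
    return abs(query_index - target_index) == 1


exons = [(100, 200), (300, 400), (500, 600)]
neighbours = are_adjacent(exons, (100, 200), (300, 400))
separated = are_adjacent(exons, (100, 200), (500, 600))
```

Under these assumptions, a reciprocal pair for which the function returns True would be a candidate tandem duplication.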
