kleat's People

Contributors

zyxue

kleat's Issues

Think through hardclip vs. softclip

Hardclip is distinct from softclip because it is related to chimeric contigs.

Check whether it is indeed necessary to handle it separately from softclip (likely).

ctg_hex_pos and ctg_hex_dist are buggy when the contig has indels

This is because the coordinate information is currently lost when the sequence is extracted from the contig.

Currently, this search function is used to search for the hexamer in the contig. It only takes into account the extracted sequence, without the CIGAR information.

def search(strand, clv, seq, window=50):

Maybe it's easier to define the search window (e.g. 50 bp) with respect to the reference; the actual sequence length with respect to the contig could then be a few bp more or less than 50 bp.
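As a rough illustration of defining the window with respect to the reference, the sketch below maps a reference coordinate to a contig offset by walking the CIGAR. The helper name `ref_to_ctg_pos` and its arguments are hypothetical and not part of kleat. The operation codes follow the SAM specification, and `cigartuples` is assumed to be a list of `(operation, length)` pairs as pysam exposes them. Hard-clipped bases are assumed to be absent from the extracted sequence.

```python
# CIGAR operation codes as defined in the SAM specification.
MATCH, INS, DEL, REF_SKIP, SOFT_CLIP, HARD_CLIP = 0, 1, 2, 3, 4, 5
EQUAL, DIFF = 7, 8


def ref_to_ctg_pos(ref_pos, ref_start, cigartuples):
    """Return the 0-based contig offset aligned to ref_pos, or None.

    ref_start: 0-based reference coordinate of the first aligned base.
    cigartuples: list of (operation, length) pairs.
    None is returned when ref_pos lies outside the alignment or inside
    a deletion / skipped region.
    """
    ref_cur, ctg_cur = ref_start, 0
    for op, length in cigartuples:
        if op in (MATCH, EQUAL, DIFF):
            # Consumes both reference and contig.
            if ref_cur <= ref_pos < ref_cur + length:
                return ctg_cur + (ref_pos - ref_cur)
            ref_cur += length
            ctg_cur += length
        elif op in (INS, SOFT_CLIP):
            # Consumes contig only.
            ctg_cur += length
        elif op in (DEL, REF_SKIP):
            # Consumes reference only.
            if ref_cur <= ref_pos < ref_cur + length:
                return None
            ref_cur += length
        # Hard clips and padding consume neither sequence.
    return None


# Hypothetical alignment: 10M 2I 10M 3D 10M, starting at reference position 100.
example_cigar = [(MATCH, 10), (INS, 2), (MATCH, 10), (DEL, 3), (MATCH, 10)]
```

With such a mapping, the two ends of a 50 bp reference window could each be converted to contig offsets before slicing `seq`, so that indels inside the window stretch or shrink the contig slice instead of shifting the hexamer coordinates.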

Summing up num_suffix_reads is buggy

While benchmarking (clustering first), I realized that summing up num_suffix_reads is buggy: based on the current definition, one suffix read can support multiple neighbouring cleavage sites before clustering, so it gets counted more than once.

Potential solutions:

  1. more rigorous definition of suffix read, maybe similar to that of bridge read.
  2. take the max of num_suffix_reads instead of summing them up.
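The difference between summing and taking the max can be seen on a toy cluster. The sketch below is only an illustration: the function name `combine_suffix_support`, the read names, and the site coordinates are all made up, and it assumes the per-site counts of one cluster are already available as a list.

```python
def combine_suffix_support(site_counts, method="max"):
    """Combine per-site num_suffix_reads for the sites in one cluster."""
    counts = list(site_counts)
    if not counts:
        return 0
    if method == "max":
        return max(counts)
    if method == "sum":
        return sum(counts)
    raise ValueError(f"unknown method: {method!r}")


# Toy data (made up): three suffix reads, each supporting neighbouring sites.
reads_per_site = {
    100: {"r1", "r2", "r3"},
    101: {"r1", "r2", "r3"},
    102: {"r1", "r2"},
}
counts = [len(reads) for reads in reads_per_site.values()]

summed = combine_suffix_support(counts, method="sum")  # counts reads repeatedly
maxed = combine_suffix_support(counts, method="max")   # at most one count per read
```

Here only three distinct reads exist, yet the sum is larger than three because each read is counted once per site it supports. The max never exceeds the number of distinct reads, although it may undercount when different sites in a cluster are supported by different reads.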

Parallelize the initial loop through contigs

Main difficulty: pysam's AlignedSegment objects cannot be pickled, as the following traceback shows:

  File "stringsource", line 2, in pysam.libcalignedsegment.AlignedSegment.__reduce_cython__
TypeError: self._delegate cannot be converted to a Python object for pickling
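One possible workaround, sketched here under the assumption that the per-contig work only needs a handful of plain values, is to copy those values out of each AlignedSegment into a dict of built-in types before handing it to another process. The function name `to_record` and the choice of fields are hypothetical. The attribute names (`query_name`, `reference_name`, `reference_start`, `is_reverse`, `cigartuples`, `query_sequence`) are existing pysam AlignedSegment attributes, but the example uses a stand-in object so that it runs without pysam or a BAM file.

```python
import pickle
from types import SimpleNamespace


def to_record(contig):
    """Copy the needed fields of an alignment into a picklable dict."""
    return {
        "query_name": contig.query_name,
        "reference_name": contig.reference_name,
        "reference_start": contig.reference_start,
        "is_reverse": contig.is_reverse,
        # cigartuples can be None for unaligned records.
        "cigartuples": [tuple(c) for c in (contig.cigartuples or ())],
        "query_sequence": contig.query_sequence,
    }


# Stand-in for a pysam AlignedSegment (made-up values).
fake_contig = SimpleNamespace(
    query_name="ctg1",
    reference_name="chr1",
    reference_start=100,
    is_reverse=False,
    cigartuples=[(0, 10), (1, 2), (0, 10)],
    query_sequence="A" * 22,
)

record = to_record(fake_contig)
# A dict of built-in types survives a pickle round trip unchanged.
restored = pickle.loads(pickle.dumps(record))
```

Whether this is sufficient depends on how much of the AlignedSegment API the downstream code relies on; code that calls methods on the object itself would need to be adapted to work on the record.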

Infer ref_clv from chimeric contigs

Given a chimeric contig, when a bridge read is aligned to the contig, the contig is in one piece; but when the contig is aligned to the genome, it is split into two pieces.

Problem: when looping through the first piece of the contig, how to infer the genome coordinate of a clv that is actually aligned to the second piece?
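To make the problem concrete, the sketch below models each piece as an interval on the full contig plus its placement on the genome, and looks up which piece covers a given contig position. This is not kleat's approach, only one way to frame the question. The function name `ctg_to_ref`, the dict layout, and the coordinates are hypothetical. It assumes gapless pieces, 0-based half-open contig intervals expressed in the original contig orientation, and that the pieces do not overlap on the contig.

```python
def ctg_to_ref(ctg_pos, pieces):
    """Map a position on the full contig to (ref_name, ref_pos), or None.

    Each piece is a dict with keys ctg_start, ctg_end (half-open, in the
    original contig orientation), ref_name, ref_start and is_reverse.
    Pieces are assumed to be gapless alignments.
    """
    for piece in pieces:
        if piece["ctg_start"] <= ctg_pos < piece["ctg_end"]:
            offset = ctg_pos - piece["ctg_start"]
            length = piece["ctg_end"] - piece["ctg_start"]
            if piece["is_reverse"]:
                # On the reverse strand the contig runs right to left
                # along the reference.
                offset = length - 1 - offset
            return piece["ref_name"], piece["ref_start"] + offset
    return None


# Made-up chimeric contig of 150 bp, split into two pieces.
pieces = [
    {"ctg_start": 0, "ctg_end": 100, "ref_name": "chr1",
     "ref_start": 1000, "is_reverse": False},
    {"ctg_start": 100, "ctg_end": 150, "ref_name": "chr2",
     "ref_start": 5000, "is_reverse": True},
]
```

Under these assumptions, a clv found at a contig position outside the piece currently being processed could still be resolved through the other piece. Real alignments would additionally need indel-aware coordinate mapping and a way to obtain both pieces at the same time.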
