
fermi-lite's People

Contributors

julianhess, lh3


fermi-lite's Issues

Can fermi-lite be made strand-specific?

Is there any parameter or easy code change that could make fermi-lite strand-specific? I am happy to make the changes myself if you think this is straightforward and have some pointers. I tried removing some of the revcomp lines in the source code, but the resulting assembly was worse.

In my use case, I know all my input sequences are from the same strand, so I do not want their reverse complements to be tried for overlaps.

Thank you
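
To make the request concrete, here is a minimal sketch of what "forward-only" overlap detection means. It is not fermi-lite code: `revcomp` and `fwd_overlap` are hypothetical helpers written for illustration, and the overlap test is a plain exact suffix-prefix comparison rather than the index-based search an assembler would use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Complement of one base; anything outside ACGT maps to 'N'. */
char comp_base(char c)
{
    switch (c) {
    case 'A': return 'T';
    case 'C': return 'G';
    case 'G': return 'C';
    case 'T': return 'A';
    default:  return 'N';
    }
}

/* Write the reverse complement of s into out (out needs strlen(s) + 1 bytes). */
void revcomp(const char *s, char *out)
{
    size_t i, n = strlen(s);
    for (i = 0; i < n; ++i)
        out[i] = comp_base(s[n - 1 - i]);
    out[n] = '\0';
}

/* Length of the longest suffix of a that equals a prefix of b,
 * requiring at least min_len matching bases; 0 if there is none.
 * Only the forward strand of b is considered. */
size_t fwd_overlap(const char *a, const char *b, size_t min_len)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t len = la < lb ? la : lb;
    assert(min_len > 0);
    for (; len >= min_len; --len)
        if (memcmp(a + la - len, b, len) == 0)
            return len;
    return 0;
}
```

A strand-agnostic assembler would, in effect, also call `fwd_overlap` on the output of `revcomp`; a strand-specific mode would skip that second call.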

parameter determination

We are interested in using fermi-lite for local assembly. In the readme you mention the need to automatically determine parameters for the assembly. What parameters need to be determined?

Possible Error in bfc_ec1dir?

Hello,

I am using the error correction code of fermi-lite in my thesis and it works pretty well. I have noticed that k-mer occurrences are counted with the help of a table built on top of a set of khash tables (enhanced with locking support). The lowest 14 bits of each table key are used to count occurrences of the corresponding k-mer: bits 0-7 count low-quality instances and bits 8-13 count high-quality ones.

So, to extract the low-quality count, you just AND the key with the 0xff mask. To get the high-quality count, you right-shift by 8 and then AND with the 0x3f mask.

On line 450 of bfc.c (in the bfc_ec1dir function), the wrong mask appears to be applied:
pen.absent_high = ((s>>8&0xff) < e->opt->min_cov);

Could you look into it, please? I think I have a fairly deep understanding of the code now, but I am probably still missing a few details.
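
The layout described above can be sketched as follows. This is only an illustration of the bit arithmetic under the layout stated in this issue; the helper names are hypothetical and nothing here is taken from bfc.c. It also shows when the two masks differ: if the value has already been reduced to its lowest 14 bits, `>>8 & 0xff` and `>>8 & 0x3f` give the same result, and they diverge only when bits 14 and above are still set. Whether `s` is masked at that point in bfc_ec1dir is not established here.

```c
#include <assert.h>
#include <stdint.h>

/* Layout as described in this issue (assumed, not checked against bfc.c):
 * bits 0-7 hold the low-quality count, bits 8-13 the high-quality count. */

/* Hypothetical helper: pack two counts into the lowest 14 bits. */
uint64_t kmer_pack(unsigned low, unsigned high)
{
    assert(low <= 0xff && high <= 0x3f);
    return (uint64_t)low | ((uint64_t)high << 8);
}

/* Hypothetical helper: low-quality count (bits 0-7). */
unsigned kmer_low_count(uint64_t packed)
{
    return (unsigned)(packed & 0xff);
}

/* Hypothetical helper: high-quality count (bits 8-13), 6-bit mask. */
unsigned kmer_high_count(uint64_t packed)
{
    return (unsigned)((packed >> 8) & 0x3f);
}

/* Same extraction with the wider 8-bit mask quoted in this issue. */
unsigned kmer_high_count_wide(uint64_t packed)
{
    return (unsigned)((packed >> 8) & 0xff);
}
```

For a value packed from counts 200 and 37, both extractors return 37; setting bit 14 on the same value makes the wide variant return 101 while the 6-bit variant still returns 37.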

considerations for local assembly

We would like to deploy fermi-lite in vg for local assembly and homogenization. Are there any particular considerations we should take into account when doing this?

It may be helpful to assemble the data from many genomes in a small region (1kb-100kb for instance). What parameters might we use in that case?

bseq1_t --> fseq1_t

Hi Heng,
In trying to link bwa and fml in the same executable, I ran into an issue where bseq1_t was defined differently in each library. I ended up making a fork that fixed this issue for SeqLib, but am getting some feedback that it would be better to avoid having multiple fml / bwa clones out there and instead just have SeqLib link to the official fml.

Would you be willing to consider a PR that does the minimal amount of re-naming within fml to be able to link to bwa without multiple definition errors?

Remove need for SSE2 support

Currently the code in ksw.c seems to need SSE2 to compile. It would be nice to have some kind of -- probably slower -- fallback implementation to improve support on architectures without these instructions.
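
As a rough idea of what such a fallback could look like, here is a minimal scalar Smith-Waterman score routine. It is a sketch only: `sw_score_scalar` is a hypothetical name, it does not reproduce the ksw.c interface, and it uses a linear gap penalty for brevity instead of affine gaps. A build could select something like it when the compiler does not define `__SSE2__`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Scalar Smith-Waterman local alignment score with a linear gap penalty.
 * Hypothetical helper for illustration; it is not the ksw.c interface.
 * match is added for equal bases; mismatch and gap are subtracted.
 * Returns the best local score, or -1 if memory allocation fails. */
int sw_score_scalar(const char *q, const char *t, int match, int mismatch, int gap)
{
    size_t i, j, ql = strlen(q), tl = strlen(t);
    int best = 0;
    int *row;
    assert(match > 0 && mismatch >= 0 && gap >= 0);
    row = calloc(ql + 1, sizeof(int)); /* row[j] holds H(i, j) for the current i */
    if (row == NULL) return -1;
    for (i = 1; i <= tl; ++i) {
        int diag = 0; /* H(i-1, j-1); column 0 is always 0 */
        for (j = 1; j <= ql; ++j) {
            int up = row[j];       /* H(i-1, j), not yet overwritten */
            int left = row[j - 1]; /* H(i, j-1), already updated */
            int h = diag + (t[i - 1] == q[j - 1] ? match : -mismatch);
            if (up - gap > h) h = up - gap;
            if (left - gap > h) h = left - gap;
            if (h < 0) h = 0;      /* local alignment: never go below zero */
            row[j] = h;
            if (h > best) best = h;
            diag = up;
        }
    }
    free(row);
    return best;
}
```

The routine runs in O(|q|·|t|) time with one row of memory, so it would be slower than a vectorized version but has no instruction-set requirements.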

cc1: fatal error: prog.c: No such file or directory

While compiling, I got the following error:

gcc -Wall -O2 prog.c -o prog -L /usr/local/bin -lfml -lz -lm -lpthread
cc1: fatal error: prog.c: No such file or directory

gcc (GCC) 12.2.1 20221121 (Red Hat 12.2.1-4)
Fedora 36.

How can I solve this?

No assembly reported for 100 reads with the same sequence

@lh3 I was playing around with this tool but I couldn't get it to work on a "simple" case. I duplicated a read 100 times and would expect it to output the duplicated read. Any thoughts?

``` @M50205:20:000000000-B82KM:1:1108:8421:4217/2 CTAAGGTGGACATGTTGGCTTCTCTCTGTTCTTAACATGTTAAAATTAAAATTAACTTCTCTGGTGTGTGGAGATGTCTTACAATAACAGTTGCTACTATTTCTTTTCTTTTTCTCTTTCTTTCCTCTCTCTTTTTCTTTCTTTCTTTCTTTCTTTCTTTCTTTCTTTCTTTCTTTCTTAGACAAGGTCTCAATTTGTCACTCAGAGTGAAGTGCATTGGCATGAACATTGCTCACTTCATCCTTAACCTTCTTGGCCAAAGAACTCCTCCTGCCTCACCCCC + 2222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222 ```

Fermi-lite aggressive trimming

Hi,
When setting aggressive trimming in fermi-lite to pop bubbles in heterozygous regions, what strategy is employed?
Is the longer path in the bubble kept, or the shorter one? Or the one with the highest average coverage?
I am using fermi-lite to do local assembly of ~2kb regions.

Thanks,
Cristian

Cannot assemble a simple example

Consider the following 8 reads.

>seq1
ATCCTGAGAATCAATCTGTGAAAATTATGTCTTGGGAGGAGGGGAAGGAAACCAAAAATTTTTAGAAAAGCTGGAACTCTTAGCTATCTAGAAGCAGGTC
>seq2
GGGAAGGAAACCAAAAATTTTTAGAAAAGCTGGAACTCTTAGCTATCTAGAAGCAGGTCTTGAATCTCACAGAATCGCAAAGGAAGAAAATCAGGGCCTA
>seq3
TTTAGAAAAGCTGGAACTCTTAGCTATCTAGAAGCAGGTCTTGAATATCACAGAATCGCAAAGGAAGAAAATCAGGGCCTACCTATCTAAATTTAAAATT
>seq4
GAAATTTTAAATTTAGATATGTAGGCCCTGATTTTCTTCCTTTGCGATTCTGTGATATTCAAGACCTGCTTCTAGATAGCTAAGAGTTCCAGCTTTTCTA
>seq5
TGAGAAAATTATGTCTTGGGAGGAGGGGAAGGAAACCAAAAATTTTTAGAAAAGCTGGAACTCTTAGCTATCTAGAAGCAGGTCTTGAATATCACAGAAT
>seq6
TGAAAATTATGTCTTGGGAGGAGGGGAAGGAAACCAAAAATTTTTAGAAAAGCTGGAACTCTTAGCTATCTAGAAGCAGGTCTTGAATATCACAGAATCG
>seq7
TTTTTAGAAAAGCTGGAACTCTTAGCTATCTAGAAGCAGGTCTTGAATATCACAGAATCGCAAAGGAAGAAAATCAGGGCCTACATATCTAAATTTAAAA
>seq8
ATAGCTAAGAGTTCCAGCTTTTCTAAAAATTTTTGGTTTCCTTCCCCTCCTCCCAAGACATAATTTTCACAGATTGATTCTCAGGATTGGCAATCATGCA

A quick multiple sequence alignment shows that there is very good consensus among these 8 reads for most of the alignment.

seq1            -------------atcctgagaatcaatctgtgaaaattatgtcttgggaggaggggaag
_R_seq8         tgcatgattgccaatcctgagaatcaatctgtgaaaattatgtcttgggaggaggggaag
seq5            -----------------------------tgagaaaattatgtcttgggaggaggggaag
seq6            -------------------------------tgaaaattatgtcttgggaggaggggaag
seq2            ------------------------------------------------------gggaag
seq3            ------------------------------------------------------------
_R_seq4         ------------------------------------------------------------
seq7            ------------------------------------------------------------
                                                                            

seq1            gaaaccaaaaatttttagaaaagctggaactcttagctatctagaagcaggtc-------
_R_seq8         gaaaccaaaaatttttagaaaagctggaactcttagctat--------------------
seq5            gaaaccaaaaatttttagaaaagctggaactcttagctatctagaagcaggtcttgaata
seq6            gaaaccaaaaatttttagaaaagctggaactcttagctatctagaagcaggtcttgaata
seq2            gaaaccaaaaatttttagaaaagctggaactcttagctatctagaagcaggtcttgaatc
seq3            -------------tttagaaaagctggaactcttagctatctagaagcaggtcttgaata
_R_seq4         ---------------tagaaaagctggaactcttagctatctagaagcaggtcttgaata
seq7            -----------tttttagaaaagctggaactcttagctatctagaagcaggtcttgaata
                               *************************                    

seq1            -------------------------------------------------------
_R_seq8         -------------------------------------------------------
seq5            tcacagaat----------------------------------------------
seq6            tcacagaatcg--------------------------------------------
seq2            tcacagaatcgcaaaggaagaaaatcagggccta---------------------
seq3            tcacagaatcgcaaaggaagaaaatcagggcctacctatctaaatttaaaatt--
_R_seq4         tcacagaatcgcaaaggaagaaaatcagggcctacatatctaaatttaaaatttc
seq7            tcacagaatcgcaaaggaagaaaatcagggcctacatatctaaatttaaaa----

However, I cannot get fml-asm to produce any assembly from these reads. I've tried relaxing parameters in various ways but with no success. Are there any parameter settings that will assemble these reads, or is this a particularly challenging case that can't easily be solved?

Thanks!
