Comments (9)

mikolmogorov commented on August 27, 2024

Hi,

Yes, the graph construction algorithm can be slow on highly repetitive genomes, and it is not trivial to make it run in parallel. Given the size of the genome and its high repeat content, the computation might take a while. Unfortunately, it is hard to predict how long it will take, since that varies greatly from one dataset to another. The largest genome that we have tested so far was ~3G, so a 26G genome might hit some unexpected bottlenecks.

from flye.

shanesturrock commented on August 27, 2024

It sounds like it is unlikely to finish any time soon in this case so I think I'll need to investigate other options. I have done an assembly using canu but it seems to only use about half the available data. The repetitiveness is killing all the various tools I've tried unfortunately. Thanks for getting back to me though.

mikolmogorov commented on August 27, 2024

You might also try MARVEL (https://github.com/schloi/MARVEL) - it looks like it was originally designed to assemble very large genomes.

shanesturrock commented on August 27, 2024

Yep, I've got that and it was on my list to try after I read the paper on the Axolotl assembly they did, but they had 32x PacBio coverage, so we're looking at possibly getting some more data. Theirs was also a lot less repetitive than ours. I'm hoping lots more longer reads could resolve the problems. Illumina de novo assembly with SOAPdenovo2 produces a lot of contigs (54 million last run), so that's not a great deal of help either. MaSuRCA got stuck (repeats again) and was likely to take 2000 days to finish. Tough one, this.

mikolmogorov commented on August 27, 2024

Closing this year-old issue. Flye should now be better with respect to the resource usage (still does not scale to axolotl though).

spe238 commented on August 27, 2024

How does the -t option behave? I have requested several CPUs in the SLURM script, but did not use the -t option, and flye is only using one CPU on the server node. Would you recommend using the option?

How long would you expect flye to run with a 50m genome, and 10 CPUs and 90g mem per CPU requested?

mikolmogorov commented on August 27, 2024

@spe238 The -t option sets the number of threads, which is one by default, so please do use it. Some estimates of memory usage and running time are given in the FAQ/Manual.
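
For a SLURM job, one way to keep the thread count in step with the allocation is to read it from the environment rather than hard-coding it. What follows is only a minimal sketch under stated assumptions: the helper name build_flye_command, the file names, and the choice of --nano-raw as the read type are illustrative and not taken from this thread, and SLURM_CPUS_PER_TASK is only set by SLURM when --cpus-per-task is requested.

```python
import os


def build_flye_command(reads, out_dir, environ=None):
    """Build a Flye command line whose thread count follows the SLURM allocation.

    `reads` and `out_dir` are placeholder paths; --nano-raw is an assumed
    read type and should be swapped for whatever matches the actual data.
    """
    env = os.environ if environ is None else environ
    # SLURM exports SLURM_CPUS_PER_TASK only when --cpus-per-task was given.
    # Fall back to 1, which is Flye's default thread count per the reply above.
    threads = int(env.get("SLURM_CPUS_PER_TASK", "1"))
    return [
        "flye",
        "--nano-raw", reads,
        "--out-dir", out_dir,
        "--threads", str(threads),  # long form of -t
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(" ".join(build_flye_command("reads.fastq.gz", "flye_out")))
```

The resulting list could be handed to subprocess.run inside the job, or the printed line pasted into the batch script. Requesting CPUs from the scheduler does not by itself make Flye use them; the value still has to reach -t.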

spe238 commented on August 27, 2024

Thank you!
