Comments (15)

tsibley avatar tsibley commented on September 26, 2024

Two things I notice going on here:

  1. It looks like you're running things locally, not in the container image. The Python source paths in the traceback point into /Users/trvrb/….

  2. The exception is occurring when augur align tries to read the aligned fasta file produced by (in this case) mafft but finds it empty. This can happen if the input file passed to mafft was empty.
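To illustrate point 2, here is a minimal sketch of a guard that could run before parsing the aligner's output, so an empty file produces a descriptive error rather than "No records found in handle". The function name `check_alignment_output` is hypothetical and is not part of augur; it only uses the standard library.

```python
import os


def check_alignment_output(path):
    """Return path if the aligner output exists and is non-empty, else raise.

    Hypothetical helper for illustration; not an augur function.
    """
    if not os.path.exists(path):
        raise FileNotFoundError(f"alignment output {path!r} does not exist")
    if os.path.getsize(path) == 0:
        # An empty file usually means the aligner failed or was given empty input.
        raise ValueError(
            f"alignment output {path!r} is empty; "
            "the aligner may have failed or received empty input"
        )
    return path
```

Calling this on results/aligned.fasta before reading it would point at the aligner step as the place to look.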

from cli.

tsibley avatar tsibley commented on September 26, 2024

Closing this because it doesn't seem like a CLI issue, especially given the traceback isn't from the container image. The full zika build works for me using nextstrain-cli 1.2.0 and nextstrain/base image with id 1fae083a1078.

I'm still happy to debug what happened if you have more logs of what was run.

trvrb avatar trvrb commented on September 26, 2024

I thought "nextstrain build" automatically ran from within the container. Did this change? Is there a flag now?

tsibley avatar tsibley commented on September 26, 2024

No, it didn't change.

trvrb avatar trvrb commented on September 26, 2024

I'm trying to reproduce now on 2nd computer. I reopened because it seems very odd to get an error message referring to local paths when running nextstrain build .

Okay... just reproduced on current computer. Got the following error:

using mafft to align via:
	mafft --reorder --anysymbol --thread 2 results/filtered.fasta.ref.fasta 1> results/aligned.fasta 2> results/aligned.fasta.log 

	Katoh et al, Nucleic Acid Research, vol 30, issue 14
	https://doi.org/10.1093%2Fnar%2Fgkf436

Traceback (most recent call last):
  File "/nextstrain/augur/bin/augur", line 165, in <module>
    return_code = args.func(args)
  File "/nextstrain/augur/augur/align.py", line 71, in run
    aln = AlignIO.read(output, 'fasta')
  File "/usr/lib/python3.6/site-packages/Bio/AlignIO/__init__.py", line 429, in read
    raise ValueError("No records found in handle")
ValueError: No records found in handle

I ran nextstrain update and still reproduce. To be clear I'm at:

nextstrain.cli 1.2.0
zika master (0827730fb201b7595094bb841215d5d0c8613192)

Leaving this here now... going to try a couple more stabs at things. Current hypotheses:

  1. Failing stochastically
  2. I forgot to remove an intermediate file temp... that's in the zika/ directory. I definitely ran snakemake clean but this only gets rid of results/ and not temp files.

Favoring hypothesis 2.

trvrb avatar trvrb commented on September 26, 2024

Clean in zika/Snakefile should be expanded to include temp files. I believe this fixes the issue. Will make a PR today.

tsibley avatar tsibley commented on September 26, 2024

Ok. Do you know what's creating the temp files?

tsibley avatar tsibley commented on September 26, 2024

(I've also submitted a PR to augur to be better about checking if external commands failed or not.)
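The general idea of checking whether an external command failed can be sketched as follows. This is not the code from the augur PR; `run_shell_command` is a hypothetical name, and the sketch assumes the command is run through the shell (as the mafft invocation in this thread is, since it uses `1>` and `2>` redirection).

```python
import subprocess
import sys


def run_shell_command(cmd):
    """Run cmd through the shell and report whether it exited with status 0.

    Hypothetical helper for illustration; not the augur implementation.
    """
    result = subprocess.run(cmd, shell=True)
    if result.returncode != 0:
        print(
            f"command failed with exit code {result.returncode}: {cmd}",
            file=sys.stderr,
        )
        return False
    return True
```

A caller could stop before reading the output file whenever this returns False, instead of failing later on an empty alignment.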

trvrb avatar trvrb commented on September 26, 2024

Thanks for the augur PR.

I'm still not able to reliably reproduce. I also realized that adding things like temp_iqtree.fasta to the snakemake clean rule seems wrong and puts too much burden on snakemake clean. Instead, I would argue that augur modules should remove possible temp cruft at the beginning of the module's run. I made an issue for this here: nextstrain/augur#201.
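A rough sketch of what removing temp cruft at the start of a module's run could look like. The helper name `remove_stale_temp_files` and the `temp_*` glob pattern are assumptions made for illustration (the pattern is chosen only because it matches temp_iqtree.fasta); this is not how augur does it.

```python
import glob
import os


def remove_stale_temp_files(directory, patterns=("temp_*",)):
    """Delete leftover files matching patterns in directory.

    Hypothetical helper; the default pattern is an assumption.
    Returns the sorted list of paths that were removed.
    """
    removed = []
    for pattern in patterns:
        for path in glob.glob(os.path.join(directory, pattern)):
            if os.path.isfile(path):  # leave directories alone
                os.remove(path)
                removed.append(path)
    return sorted(removed)
```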

I'm willing to close this CLI issue in favor of this augur issue.

tsibley avatar tsibley commented on September 26, 2024

👍 Sounds good to me. I concur about overloading snakemake clean.

I was never able to reproduce without intentionally triggering it with empty input. I did briefly look at mafft's source code to see if it writes temp files somewhere in a brittle way that could cause this. There are some suspects, but no obvious culprit I could trigger (at least without spending more time than I thought was worth trying).

alliblk avatar alliblk commented on September 26, 2024

@trvrb @tsibley: I'm also getting this same error running CLI version 1.2.0, i.e. nextstrain build . stochastically fails on the augur align step (error message below).

using mafft to align via:
	mafft --reorder --anysymbol --thread 2 results/filtered.fasta.ref.fasta 1> results/aligned.fasta 2> results/aligned.fasta.log

	Katoh et al, Nucleic Acid Research, vol 30, issue 14
	https://doi.org/10.1093%2Fnar%2Fgkf436

Traceback (most recent call last):
  File "/nextstrain/augur/bin/augur", line 189, in <module>
    return_code = args.func(args)
  File "/nextstrain/augur/augur/align.py", line 71, in run
    aln = AlignIO.read(output, 'fasta')
  File "/usr/lib/python3.6/site-packages/Bio/AlignIO/__init__.py", line 429, in read
    raise ValueError("No records found in handle")
ValueError: No records found in handle

Before running I remove all files from results/ and auspice/.

alliblk avatar alliblk commented on September 26, 2024

@tsibley and I have just been chatting about this. Trying to figure out why mafft seems to be having issues.

alliblk avatar alliblk commented on September 26, 2024

@tsibley In a twist to this scenario, I just ran mafft on its own with the command as Augur would specify, and it ran perfectly, yet it continues to fail when mafft is called within augur align.

alliblk avatar alliblk commented on September 26, 2024

Thanks @tsibley for figuring out that this was likely a memory issue that crops up on Mac machines because Docker limits memory allocation on Mac (but not on Linux).

Apparent problem was that the memory allocation was too low for mafft, leading to failure and an empty alignment file that augur align crashes on when attempting to trim to reference. Raising memory allocation from default 2GB to 8GB fixed this for me.

tsibley avatar tsibley commented on September 26, 2024

Maybe I should add a test to check-setup to see if the memory available to the container is ≤ 2GB. There's no magic number that's always the "right" amount of memory, but 2GB or less is likely not the optimal situation for alignments.
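One way such a check might work, sketched under assumptions: inside a Linux container, the `MemTotal` line of /proc/meminfo is taken as the memory available, and the check flags anything at or below 2 GiB. The function names and the threshold parameter are hypothetical, and this is not the actual check-setup code.

```python
def mem_total_bytes(meminfo_text):
    """Parse the MemTotal value (reported in kB) from /proc/meminfo text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Line format: "MemTotal:        2046748 kB"
            return int(line.split()[1]) * 1024
    raise ValueError("MemTotal not found in meminfo text")


def memory_looks_low(meminfo_text, threshold_gib=2):
    """Return True if total memory is at or below threshold_gib GiB.

    Hypothetical check; the 2 GiB default mirrors the figure discussed above.
    """
    return mem_total_bytes(meminfo_text) <= threshold_gib * 1024**3
```

In a container, the text would come from reading /proc/meminfo; a True result could be reported as a warning rather than a hard failure, since no single amount of memory is always right.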
