Comments (15)
Two things I notice going on here:
- It looks like you're running things locally, not in the container image. The Python source paths in the traceback point into /Users/trvrb/….
- The exception is occurring when augur align tries to read the aligned fasta file produced by (in this case) mafft but finds it empty. This can happen if the input file passed to mafft was empty.
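One way to rule out the empty-input case before mafft runs is to count the records in the input FASTA. This is only a sketch: the helper names count_fasta_records and require_nonempty_fasta are hypothetical and are not part of augur.

```python
def count_fasta_records(path):
    """Count FASTA records by counting header lines (those starting with '>')."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def require_nonempty_fasta(path):
    """Raise a readable error if the file has no records; otherwise return the count.

    Hypothetical helper for illustration, not augur's API.
    """
    n_records = count_fasta_records(path)
    if n_records == 0:
        raise ValueError(f"{path} contains no FASTA records; nothing to align")
    return n_records
```

Calling require_nonempty_fasta("results/filtered.fasta.ref.fasta") before the alignment step would separate "the input was empty" from "mafft produced nothing".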
from cli.
Closing this because it doesn't seem like a CLI issue, especially given the traceback isn't from the container image. The full zika build works for me using nextstrain-cli 1.2.0 and nextstrain/base image with id 1fae083a1078.
I'm still happy to debug what happened if you have more logs of what was run.
from cli.
I thought "nextstrain build" automatically ran from within the container. Did this change? Is there a flag now?
from cli.
No, it didn't change.
from cli.
I'm trying to reproduce now on 2nd computer. I reopened because it seems very odd to get an error message referring to local paths when running nextstrain build .
Okay... just reproduced on current computer. Got the following error:
using mafft to align via:
mafft --reorder --anysymbol --thread 2 results/filtered.fasta.ref.fasta 1> results/aligned.fasta 2> results/aligned.fasta.log
Katoh et al, Nucleic Acid Research, vol 30, issue 14
https://doi.org/10.1093%2Fnar%2Fgkf436
Traceback (most recent call last):
  File "/nextstrain/augur/bin/augur", line 165, in <module>
    return_code = args.func(args)
  File "/nextstrain/augur/augur/align.py", line 71, in run
    aln = AlignIO.read(output, 'fasta')
  File "/usr/lib/python3.6/site-packages/Bio/AlignIO/__init__.py", line 429, in read
    raise ValueError("No records found in handle")
ValueError: No records found in handle
I ran nextstrain update and can still reproduce. To be clear, I'm at:
nextstrain.cli 1.2.0
zika master (0827730fb201b7595094bb841215d5d0c8613192)
Leaving this here for now... going to take a couple more stabs at things. Current hypotheses:
- Failing stochastically
- I forgot to remove an intermediate file temp... that's in the zika/ directory. I definitely ran snakemake clean, but this only gets rid of results/ and not temp files.
Favoring hypothesis 2.
from cli.
Clean in zika/Snakefile should be expanded to include temp files. I believe this fixes the issue. Will make a PR today.
from cli.
Ok. Do you know what's creating the temp files?
from cli.
(I've also submitted a PR to augur to be better about checking if external commands failed or not.)
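For illustration, checking whether an external command failed could look roughly like the sketch below. It is not the code from the augur PR; run_to_file is a hypothetical helper, and it assumes the command writes its result to stdout the way the mafft invocation in the log above does.

```python
import os
import subprocess


def run_to_file(args, output_path, log_path):
    """Run a command, sending stdout to output_path and stderr to log_path.

    Raises RuntimeError if the command exits non-zero (including being killed
    by a signal, which subprocess reports as a negative return code) or if it
    exits cleanly but writes nothing. Hypothetical helper, not augur's API.
    """
    with open(output_path, "w") as out, open(log_path, "w") as log:
        result = subprocess.run(args, stdout=out, stderr=log)
    if result.returncode != 0:
        raise RuntimeError(
            f"{args[0]} exited with status {result.returncode}; see {log_path}"
        )
    if os.path.getsize(output_path) == 0:
        raise RuntimeError(f"{args[0]} produced an empty {output_path}; see {log_path}")
    return output_path
```

With the command from the log, the call would be something like run_to_file(["mafft", "--reorder", "--anysymbol", "--thread", "2", "results/filtered.fasta.ref.fasta"], "results/aligned.fasta", "results/aligned.fasta.log"), so a failed alignment stops with a message pointing at the log instead of surfacing later as "No records found in handle".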
from cli.
Thanks for the augur PR.
I'm still not able to reliably reproduce. And I realized that adding things like temp_iqtree.fasta to the snakemake clean rule seems wrong and puts too much burden on snakemake clean. Instead, I would argue for having augur modules remove possible temp cruft at the beginning of the module's run. I made an issue for this here: nextstrain/augur#201.
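A rough sketch of what "remove temp cruft at the beginning of the run" could mean in practice. The function name and the default temp_* pattern are assumptions for illustration (the pattern is guessed from temp_iqtree.fasta); this is not what augur does.

```python
import glob
import os


def remove_stale_temp_files(directory, patterns=("temp_*",)):
    """Delete leftover files matching the given glob patterns in directory.

    Returns the sorted list of removed paths. The default pattern is an
    assumption, not a list of what augur actually writes.
    """
    removed = []
    for pattern in patterns:
        for path in glob.glob(os.path.join(directory, pattern)):
            if os.path.isfile(path):  # leave directories alone
                os.remove(path)
                removed.append(path)
    return sorted(removed)
```

Each module would call this once at the start of its run, so a stale file from a previous failed run can't be picked up by accident and snakemake clean doesn't need to know about it.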
I'm willing to close this CLI issue in favor of this augur issue.
from cli.
snakemake clean.
I was never able to reproduce without intentionally triggering it with empty input. I did briefly look at mafft's source code to see if it writes temp files somewhere in a brittle way that could cause this. There are some suspects, but no obvious culprit I could trigger (at least without spending more time than I thought was worth trying).
from cli.
@trvrb @tsibley: I'm also having this same error running CLI version 1.2.0, i.e. nextstrain build . stochastically fails on the augur align step (error message below).
using mafft to align via:
mafft --reorder --anysymbol --thread 2 results/filtered.fasta.ref.fasta 1> results/aligned.fasta 2> results/aligned.fasta.log
Katoh et al, Nucleic Acid Research, vol 30, issue 14
https://doi.org/10.1093%2Fnar%2Fgkf436
Traceback (most recent call last):
  File "/nextstrain/augur/bin/augur", line 189, in <module>
    return_code = args.func(args)
  File "/nextstrain/augur/augur/align.py", line 71, in run
    aln = AlignIO.read(output, 'fasta')
  File "/usr/lib/python3.6/site-packages/Bio/AlignIO/__init__.py", line 429, in read
    raise ValueError("No records found in handle")
ValueError: No records found in handle
Before running, I remove all files from results/ and auspice/.
from cli.
@tsibley and I have just been chatting about this. Trying to figure out why mafft seems to be having issues.
from cli.
@tsibley In a twist to this scenario, I just ran mafft on its own with the command as Augur would specify, and it ran perfectly, and yet it continues to fail when mafft is called within augur align.
from cli.
Thanks @tsibley for figuring out that this was likely a memory issue that crops up on Mac machines because Docker limits memory allocation on Mac (but not on Linux).
The apparent problem was that the memory allocation was too low for mafft, leading to failure and an empty alignment file that augur align crashes on when attempting to trim to reference. Raising the memory allocation from the default 2GB to 8GB fixed this for me.
from cli.
Maybe I should add a test to check-setup to see if the memory available to the container is ≤ 2GB. There's no magic number that's always the "right" amount of memory, but 2GB or less is likely not the optimal situation for alignments.
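A minimal sketch of such a check, assuming it runs inside the container and reads the Linux /proc/meminfo file. The function names and the warning text are made up for illustration and are not part of nextstrain-cli's check-setup.

```python
def parse_mem_total_gib(meminfo_text):
    """Return MemTotal from /proc/meminfo-style text in GiB, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])  # the value is reported in kB (KiB)
            return kib / (1024 ** 2)
    return None


def memory_warning(meminfo_text, threshold_gib=2.0):
    """Return a warning string if total memory is at or below the threshold, else None."""
    total = parse_mem_total_gib(meminfo_text)
    if total is not None and total <= threshold_gib:
        return (
            f"Only {total:.1f} GiB of memory is available to the container; "
            "alignments may fail. Consider raising Docker's memory limit."
        )
    return None
```

Inside the container this would be called as memory_warning(open("/proc/meminfo").read()). Keeping the threshold a parameter reflects that there's no single right amount of memory.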
from cli.