
nextstrain / cli


The Nextstrain command-line interface (CLI)—a program called nextstrain—which aims to provide a consistent way to run and visualize pathogen builds and access Nextstrain components like Augur and Auspice across computing environments such as Docker, Conda, and AWS Batch.

Home Page: https://docs.nextstrain.org/projects/cli/

License: MIT License

Languages: Python 94.82%, Shell 3.87%, Starlark 1.31%

cli's Introduction


This repository is archived and contains the content used to build the documentation and splash page found in nextstrain.org. This content can now be found here.

License and copyright

Copyright 2014-2018 Trevor Bedford and Richard Neher.

Source code to Nextstrain is made available under the terms of the GNU Affero General Public License (AGPL). Nextstrain is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

cli's People

Contributors

dependabot[bot], eharkins, huddlej, jameshadfield, joverlee521, kairstenfay, mlarrousse, trvrb, tsibley, victorlin


cli's Issues

view: Advertise over mDNS a nextstrain.local record for nicer URLs

It's possible to use mDNS (zeroconf / avahi / dns-sd) to advertise hostnames, and doing so would turn nextstrain view URLs from:

http://140.107.151.110:4000/zika

to

http://nextstrain.local:4000/zika

I have a prototype of this that's working in some circumstances, but it needs to be made more robust. This issue is a reminder to finish that work at some point.
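
A rough sketch of what the advertisement could look like using the third-party zeroconf package; the service name, port, and address below are illustrative assumptions, not the prototype mentioned above.

# Sketch: advertise a nextstrain.local record over mDNS for the view server.
# Assumes the third-party "zeroconf" package; names, port, and address are illustrative.
import socket
from zeroconf import Zeroconf, ServiceInfo

def advertise(port: int = 4000, host_ip: str = "140.107.151.110") -> Zeroconf:
    info = ServiceInfo(
        "_http._tcp.local.",
        "nextstrain._http._tcp.local.",
        addresses=[socket.inet_aton(host_ip)],
        port=port,
        server="nextstrain.local.",   # the hostname we want peers to resolve
    )
    zc = Zeroconf()
    zc.register_service(info)         # LAN hosts can now resolve nextstrain.local
    return zc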

Run builds on AWS Batch [issue]

This issue is for a work-in-progress feature which I've been working on recently and am currently polishing up.

Even with the working mechanics, there are several external things that need consideration before this can be considered a shippable feature:

  • CLI: Default region? (Region is required by boto3.client("batch"); see the sketch after this list.)
  • IAM: policies allowing normal user roles to submit jobs
  • IAM: role/policies for Batch service (is it limited enough?)
  • IAM: role/policies for Batch jobs (is it limited enough?)
  • Compute environment: Put limits on resources used / costs incurred by Batch jobs to prevent runaways
  • Compute environment: Increase available CPU / memory resources
  • S3: Create build context bucket
  • S3: Add retention policy to S3 bucket
  • S3: Bucket ACLS - https://docs.aws.amazon.com/AmazonS3/latest/dev/s3-access-control.html
  • CloudWatch Logs: Add retention policy for Batch jobs
  • CLI: Delete log stream when done
  • Documentation: Creation of Batch compute environment, queue, job definition, S3 bucket, S3 retention policies, CloudWatch logs retention policies, and IAM roles/policies (+ automated tooling for doing this?)
  • Documentation: Describe security environment and assumptions
  • docker-base: Merge aws-batch branch
  • Batch: Switch job definition to use nextstrain/base:latest instead of nextstrain/base:branch-aws-batch
  • CLI: Terminate remote jobs on ^C (or make issue for this)

(The list above is as much for me as anyone else.)
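
For reference on that first CLI item, a minimal sketch of the kind of Batch job submission involved, with an explicit region; the queue and job definition names are placeholders, not the real Nextstrain resources.

# Sketch: submitting an AWS Batch job with an explicit region, which is what
# the "Default region?" item refers to. Queue/definition names are placeholders.
import boto3

batch = boto3.client("batch", region_name="us-east-1")   # region must come from somewhere

response = batch.submit_job(
    jobName="nextstrain-build-example",
    jobQueue="nextstrain-job-queue",            # placeholder
    jobDefinition="nextstrain-job-definition",  # placeholder
    containerOverrides={
        "command": ["snakemake", "--printshellcmds"],
    },
)
print("Submitted job:", response["jobId"])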

Regression test for deploy/remote upload bug

It'd be great to have a regression test for the recent bug (#62) in nextstrain deploy / nextstrain remote upload.

I wrote an ad-hoc test script to git bisect the problem and test my fix, as noted in this comment. The script is an example of what a regression test for the bug would look like, and I left some more notes on testing in another comment.
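
Roughly sketching what such a regression test might look like as a pytest round-trip check; the bucket URL is hypothetical, the command shapes are assumed from current nextstrain remote usage, and the real test would need to target the specific behavior that broke in #62.

# Rough pytest sketch of a round-trip test for `nextstrain remote upload`.
# The bucket is hypothetical; the real test would exercise the behavior that
# broke in #62 rather than this generic upload/list check.
import json
import subprocess
from pathlib import Path

def test_remote_upload_roundtrip(tmp_path: Path):
    dataset = tmp_path / "example_tree.json"
    dataset.write_text(json.dumps({"example": True}))

    destination = "s3://my-test-bucket/regression-test/"   # hypothetical bucket
    subprocess.run(
        ["nextstrain", "remote", "upload", destination, str(dataset)],
        check=True)
    listing = subprocess.run(
        ["nextstrain", "remote", "list", destination],
        check=True, capture_output=True, text=True)
    assert "example_tree" in listing.stdout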

Move runner-selection options into their own option group

For example, the output of nextstrain build --help might look like this (trimmed) example:

positional arguments:
  <directory>           Path to pathogen build directory
  ...                   Additional arguments to pass to the executed program

environment selection arguments:
  --docker              Run commands inside a container image using Docker.
                        (default)
  --native              Run commands on the native host, outside of any
                        container image.
  --aws-batch           Run commands remotely on AWS Batch inside the
                        Nextstrain container image.

optional arguments:
  --help, -h            Show a brief help message of common options and exit
  --help-all            Show a full help message of all options and exit
  --detach              Run the build in the background, detached from your
                        terminal. Re-attach later using --attach. Currently
                        only supported when also using --aws-batch. (default:
                        False)
  --attach <job-id>     Re-attach to a --detach'ed build to view output and
                        download results. Currently only supported when also
                        using --aws-batch. (default: None)

Run again with --help-all instead to see more options.

instead of how it currently lumps all options together:

positional arguments:
  <directory>           Path to pathogen build directory
  ...                   Additional arguments to pass to the executed program

optional arguments:
  --help, -h            Show a brief help message of common options and exit
  --help-all            Show a full help message of all options and exit
  --detach              Run the build in the background, detached from your
                        terminal. Re-attach later using --attach. Currently
                        only supported when also using --aws-batch. (default:
                        False)
  --attach <job-id>     Re-attach to a --detach'ed build to view output and
                        download results. Currently only supported when also
                        using --aws-batch. (default: None)
  --docker              Run commands inside a container image using Docker.
                        (default)
  --native              Run commands on the native host, outside of any
                        container image.
  --aws-batch           Run commands remotely on AWS Batch inside the
                        Nextstrain container image.

Run again with --help-all instead to see more options.

Note that all commands which register runners should do this, not just build.
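
For reference, argparse already supports this kind of grouping; a sketch of the mechanism (the real registration code in the CLI will differ):

# Sketch: grouping runner-selection flags under their own argparse heading.
# The real CLI's option registration differs; this only shows the mechanism.
import argparse

parser = argparse.ArgumentParser(prog="nextstrain build")
parser.add_argument("directory", metavar="<directory>",
                    help="Path to pathogen build directory")

runners = parser.add_argument_group("environment selection arguments")
runners.add_argument("--docker", action="store_true",
                     help="Run commands inside a container image using Docker. (default)")
runners.add_argument("--native", action="store_true",
                     help="Run commands on the native host, outside of any container image.")
runners.add_argument("--aws-batch", action="store_true",
                     help="Run commands remotely on AWS Batch inside the Nextstrain container image.")

parser.print_help()   # the runner flags now appear under their own heading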

Tag versions 1.0.0 and 1.1.0

I should tag the appropriate commits once I generate and publish a key for my work address. This is a low-priority todo item for my future self.

Nextstrain view fails to find local JSONs

I ran nextstrain build . on https://github.com/nextstrain/zika successfully. This produces: zika/auspice/zika_tree.json and zika/auspice/zika_meta.json as expected. If I move these files to the auspice repo under auspice/data/ I can then run npm start and open a browser to http://localhost:4000/local/zika and everything works as expected.

However, if I run nextstrain view auspice/ from the zika/ directory I get the terminal output:

    The following datasets should be available in a moment:
       • http://127.0.0.1:4000/local/zika

But if I go to this address I get a 404. My guess is that this bug derives from not updating CLI to match changes to server logic introduced in nextstrain/auspice#604.

[Feature request] New command for viewing and editing the config file.

Context
Currently the config values must be edited manually or with a separate tool. A dedicated command would make it easier for users who wish to script CLI commands that rely on certain config settings, and would make changing the config more visible/accessible to new users. Brief discussion here: #53 (comment)

Description
This feature would allow the user to get, set, replace and unset config values.

Examples

$ nextstrain config --add core.runner docker # set the default runner
$ nextstrain config --list # list what's in the config
core.runner=docker
$ nextstrain config --unset core.runner # unset a default runner
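
A sketch of what the command might do internally, assuming it keeps the existing INI-style file at ~/.nextstrain/config and maps dotted keys like core.runner onto a [core] section with a runner option; the key mapping is an assumption.

# Sketch: get/set/unset against the INI-style config, mapping "core.runner"
# onto section "core", option "runner". The path and key mapping are assumptions.
import configparser
from pathlib import Path

CONFIG = Path.home() / ".nextstrain" / "config"

def load() -> configparser.ConfigParser:
    config = configparser.ConfigParser()
    config.read(CONFIG)
    return config

def save(config: configparser.ConfigParser) -> None:
    CONFIG.parent.mkdir(parents=True, exist_ok=True)
    with CONFIG.open("w", encoding="utf-8") as file:
        config.write(file)

def set_value(dotted_key: str, value: str) -> None:
    section, option = dotted_key.split(".", 1)
    config = load()
    if not config.has_section(section):
        config.add_section(section)
    config.set(section, option, value)
    save(config)

def unset_value(dotted_key: str) -> None:
    section, option = dotted_key.split(".", 1)
    config = load()
    if config.has_section(section):
        config.remove_option(section, option)
    save(config)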

Windows support

I would like the CLI to work well on Windows when used with Docker for Windows on Windows 10 Professional or Enterprise in a very vanilla setup. I think it is most of the way there, but there are some showstoppers and several rough edges and sharp corners. It would also be very nice, and serve more users, to support Docker Toolbox on Windows 10 Home or older versions of Windows. To do any of this requires access to a Windows machine (not a VM, unfortunately) for development and testing, which has been a blocker so far.

This issue is going to serve as a punch list of things that need addressing. Some items may be moved to separate issues in the future.

  • Python 3 installs just as python, no python3.
  • netifaces wheel must be available for install. The current version is missing for Python 3.7, but this will be present in the next version (thanks to my PR).
  • Paths for docker run invocation must not be standard Windows paths, but munged versions (e.g. //c/Users/…). Affects volume mapping. (See the path-munging sketch after this list.)
  • Unicode not well-supported it seems in cmd.exe or PowerShell
  • Terminal escapes may not be well-supported as-is
  • os.execvp() doesn't work as expected
  • Use Travis CI's Windows support or AppVeyor for automated testing, prototyped in #14
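
For the path-munging item above, the conversion meant is roughly the following; the exact form Docker expects differs between Docker for Windows and Docker Toolbox, so treat this as an illustrative assumption.

# Sketch: convert a Windows path like C:\Users\me\zika into the //c/Users/me/zika
# form needed for `docker run -v` volume mappings. Illustrative only; the exact
# munging Docker expects varies between Docker for Windows and Docker Toolbox.
from pathlib import PureWindowsPath

def munge_for_docker(path: str) -> str:
    p = PureWindowsPath(path)
    drive = p.drive.rstrip(":").lower()   # "C:" -> "c"
    rest = "/".join(p.parts[1:])          # skip the "C:\" root component
    return f"//{drive}/{rest}"

assert munge_for_docker(r"C:\Users\me\zika") == "//c/Users/me/zika"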

Bug in deploy after version 1.16.0

There seems to be an issue with JSON deployment in version 1.16.0 and 1.16.1. Here are the results of running nextstrain deploy s3://nextstrain-data auspice/ncov_* with different versions of the CLI:

Not exactly sure what's going on here.

[aws batch] Review handling of Snakemake's internal state directory

This issue is to mark that we should comprehensively review which parts of Snakemake's internal state directory, .snakemake/, are ferried to and from the local host and remote AWS Batch job. It was motivated by the following exchange in Slack:

Trevor Bedford (Mar 28th)
Maybe snakemake uses special files in .snakemake when it resolves checkpoints
and these files aren't downloaded?

Thomas Sibley
Yes, I believe it does use the files in .snakemake/metadata/. The files in
there are named using base-64 encodings of output file names and contain JSON
blobs of internal state.

I haven't confirmed if this is one of the underlying issues, but I'd put it
high up there in terms of probability.

nextstrain build explicitly omits .snakemake when uploading/downloading the
S3 results to avoid other issues with state getting mixed up, but some middle
ground might be better. It already makes an exception for .snakemake/logs/.
The AWS job itself also avoids uploading anything in .snakemake except the
logs. So a couple places would need coordinated changing.

We don't have a reproduction case here, but by inspection of the files, it seems possible we should be uploading/downloading .snakemake/metadata/.

It would also be a good point to audit other directories in there, with information on their use obtained from reading Snakemake documentation and gleaned from its source.

An example listing of .snakemake/ is below, from running nextstrain/ncov through aggregate_alignment after removing the state dir entirely. Note that this was generated roughly on 28 March, and I expect will look different on more recent versions of the build workflow or using different workflow profiles.

.snakemake/
├── conda
├── conda-archive
├── locks
├── log
│   ├── 2020-04-01T000053.769503.snakemake.log
│   ├── 2020-04-01T000107.718271.snakemake.log
│   └── 2020-04-01T000133.220994.snakemake.log
├── metadata
│   ├── cmVzdWx0cy9hbGlnbmVkLmZhc3Rh
│   ├── cmVzdWx0cy9maWx0ZXJlZC5mYXN0YQ==
│   ├── cmVzdWx0cy9zcGxpdF9hbGlnbm1lbnRzLzAuZmFzdGE=
│   ├── cmVzdWx0cy9zcGxpdF9hbGlnbm1lbnRzLzcuZmFzdGE=
│   ├── cmVzdWx0cy9zcGxpdF9hbGlnbm1lbnRzLzE0LmZhc3Rh
│   ├── cmVzdWx0cy9zcGxpdF9hbGlnbm1lbnRzLzE1LmZhc3Rh
│   ├── cmVzdWx0cy9zcGxpdF9hbGlnbm1lbnRzLzE2LmZhc3Rh
│   ├── cmVzdWx0cy9zcGxpdF9hbGlnbm1lbnRzLzEuZmFzdGE=
│   ├── cmVzdWx0cy9zcGxpdF9hbGlnbm1lbnRzLzEwLmZhc3Rh
│   ├── cmVzdWx0cy9zcGxpdF9hbGlnbm1lbnRzLzExLmZhc3Rh
│   ├── cmVzdWx0cy9zcGxpdF9hbGlnbm1lbnRzLzEyLmZhc3Rh
│   ├── cmVzdWx0cy9zcGxpdF9hbGlnbm1lbnRzLzEzLmZhc3Rh
│   ├── cmVzdWx0cy9zcGxpdF9hbGlnbm1lbnRzLzguZmFzdGE=
│   ├── cmVzdWx0cy9zcGxpdF9hbGlnbm1lbnRzLzIuZmFzdGE=
│   ├── cmVzdWx0cy9zcGxpdF9hbGlnbm1lbnRzLzkuZmFzdGE=
│   ├── cmVzdWx0cy9zcGxpdF9hbGlnbm1lbnRzLzMuZmFzdGE=
│   ├── cmVzdWx0cy9zcGxpdF9hbGlnbm1lbnRzLzQuZmFzdGE=
│   ├── cmVzdWx0cy9zcGxpdF9hbGlnbm1lbnRzLzUuZmFzdGE=
│   ├── cmVzdWx0cy9zcGxpdF9hbGlnbm1lbnRzLzYuZmFzdGE=
│   ├── cmVzdWx0cy9zcGxpdF9zZXF1ZW5jZXMvcG9zdC80LmZhc3Rh
│   ├── cmVzdWx0cy9zcGxpdF9zZXF1ZW5jZXMvcG9zdC81LmZhc3Rh
│   ├── cmVzdWx0cy9zcGxpdF9zZXF1ZW5jZXMvcG9zdC82LmZhc3Rh
│   ├── cmVzdWx0cy9zcGxpdF9zZXF1ZW5jZXMvcG9zdC83LmZhc3Rh
│   ├── cmVzdWx0cy9zcGxpdF9zZXF1ZW5jZXMvcG9zdC84LmZhc3Rh
│   ├── cmVzdWx0cy9zcGxpdF9zZXF1ZW5jZXMvcG9zdC85LmZhc3Rh
│   ├── cmVzdWx0cy9zcGxpdF9zZXF1ZW5jZXMvcG9zdC8wLmZhc3Rh
│   ├── cmVzdWx0cy9zcGxpdF9zZXF1ZW5jZXMvcG9zdC8xLmZhc3Rh
│   ├── cmVzdWx0cy9zcGxpdF9zZXF1ZW5jZXMvcG9zdC8xMC5mYXN0YQ==
│   ├── cmVzdWx0cy9zcGxpdF9zZXF1ZW5jZXMvcG9zdC8xMi5mYXN0YQ==
│   ├── cmVzdWx0cy9zcGxpdF9zZXF1ZW5jZXMvcG9zdC8xMS5mYXN0YQ==
│   ├── cmVzdWx0cy9zcGxpdF9zZXF1ZW5jZXMvcG9zdC8xMy5mYXN0YQ==
│   ├── cmVzdWx0cy9zcGxpdF9zZXF1ZW5jZXMvcG9zdC8xNC5mYXN0YQ==
│   ├── cmVzdWx0cy9zcGxpdF9zZXF1ZW5jZXMvcG9zdC8xNi5mYXN0YQ==
│   ├── cmVzdWx0cy9zcGxpdF9zZXF1ZW5jZXMvcG9zdC8xNS5mYXN0YQ==
│   ├── cmVzdWx0cy9zcGxpdF9zZXF1ZW5jZXMvcG9zdC8yLmZhc3Rh
│   ├── cmVzdWx0cy9zcGxpdF9zZXF1ZW5jZXMvcG9zdC8zLmZhc3Rh
│   └── cmVzdWx0cy9zcGxpdF9zZXF1ZW5jZXMvcHJlLw==
├── shadow
└── singularity

Quick thoughts:

  • The metadata files are the primary thing to figure out if we should be ferrying between the remote job and local host. With what I know about them, I think the answer is yes, but this should be confirmed.

  • Locks should still be excluded. No processes are shared between remote and local jobs.

  • I believe Conda, Singularity, and shadow dirs are caches and should probably also be excluded for reasons such as binary compatibility, etc. Snakemake should be able to rebuild these faithfully given the declarations in the Snakefile.
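
As a concrete sketch of that middle ground, a path filter that ferries .snakemake/metadata/ and .snakemake/log/ while continuing to exclude locks and the cache directories might look like this; which directories actually need to travel is exactly what this issue asks us to confirm.

# Sketch: decide which .snakemake/ paths to ferry between the local host and
# the remote AWS Batch job. Whether metadata/ really needs to travel is the
# open question; locks and caches stay excluded.
from pathlib import PurePosixPath

FERRIED_SUBDIRS = ("metadata", "log")   # candidate directories to sync

def should_ferry(path: str) -> bool:
    parts = PurePosixPath(path).parts
    if not parts or parts[0] != ".snakemake":
        return True   # everything outside .snakemake/ travels as usual
    return len(parts) > 1 and parts[1] in FERRIED_SUBDIRS

assert should_ferry("results/aligned.fasta")
assert should_ferry(".snakemake/metadata/cmVzdWx0cy9hbGlnbmVkLmZhc3Rh")
assert not should_ferry(".snakemake/locks/some.lock")
assert not should_ferry(".snakemake/conda")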

AWS Batch runner should accept --cpus and --memory options

These options would modify the job submission, overriding any defaults in the job definition. Coupled with a cascade of compute environments defined for the job queue (or an auto-scaling optimal compute environment), this lets users who know what their builds need ask for a certain amount of resources without the CLI needing to be tightly coupled to the AWS Batch config. It also makes the interaction between the resources requested and Snakemake's -j option explicit in the command-line invocation.

@trvrb would like this for seasonal flu builds.

recommend adding `--native` flag to `nextstrain view`

Since there are issues running the CLI in a Docker container on Windows Subsystem for Linux (WSL), there's this nice functionality where you can pass the --native flag to nextstrain build to run the build with the locally installed augur and auspice rather than using those utilities within a container.

I'm opening this issue because I think that this --native flag should be added to nextstrain view as well. While I understand that the nextstrain command is just a wrapper, and I'm able to view my build with auspice view, I think that switching from the nextstrain-cli command notation to the augur/auspice command notation will be confusing for WSL users less familiar with running Nextstrain and the differences between the local and the container ways of using it.

Also I know this is a big ask (I'm sorry Tom), but is there any way that this can be done before I head to the DRC?

[check-setup] UnicodeEncodeErrors on terminals with non-Unicode locales

There is a non-ASCII unicode ellipsis ('\u2026') on line 61 of nextstrain/cli/command/check_setup.py:

$  nextstrain check-setup                                                                                                    (auspice) 
nextstrain-cli is up to date!

Traceback (most recent call last):
  File "/home/terry/miniconda3/envs/auspice/bin/nextstrain", line 8, in <module>
    sys.exit(main())
  File "/home/terry/miniconda3/envs/auspice/lib/python3.6/site-packages/nextstrain/cli/__main__.py", line 10, in main
    return cli.run( argv[1:] )
  File "/home/terry/miniconda3/envs/auspice/lib/python3.6/site-packages/nextstrain/cli/__init__.py", line 56, in run
    return opts.__command__.run(opts)
  File "/home/terry/miniconda3/envs/auspice/lib/python3.6/site-packages/nextstrain/cli/command/check_setup.py", line 61, in run
    print("Testing your setup\u2026")
UnicodeEncodeError: 'ascii' codec can't encode character '\u2026' in position 18: ordinal not in range(128)

Replacing it with 3 dots just gets you into trouble further on (line 91):

  File "/home/terry/s/net/cli/nextstrain/cli/command/check_setup.py", line 91, in run
    print(status.get(result, str(result)) + ":", formatted_description)
UnicodeEncodeError: 'ascii' codec can't encode character '\u2718' in position 7: ordinal not in range(128)

Support builds run in the local environment

nextstrain build should be able to support running builds in a local, non-container environment provided the dependencies are available and any supporting sibling directories (like ../fauna/) exist. The interface might look like one of these:

nextstrain build --local .
nextstrain build --no-container .

Implementing this would involve adding a new runner, nextstrain/cli/runner/local.py, and modifying nextstrain/cli/command/build.py to include the ability to switch between runners (i.e. docker or local).
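
A minimal sketch of what such a runner module might contain, assuming its only job is to exec the requested program in the build directory; the real runner interface used by build.py is richer than this, so treat it as a starting point rather than the implementation.

# Sketch of a hypothetical nextstrain/cli/runner/local.py. Assumes the runner
# only needs to exec the requested program (e.g. snakemake) in the build
# directory; the CLI's actual runner interface is richer than this.
import os
import shutil

def run(argv, working_dir="."):
    program = argv[0]
    if not shutil.which(program):
        print(f"Error: {program!r} was not found on this host.")
        print("Install it locally or use the --docker runner instead.")
        return 1
    os.chdir(working_dir)
    os.execvp(program, argv)   # replace this process with the build program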

[aws batch] Switch to uploading build dirs as tar.bz2 archives instead of zip archives

Unlike zip files, a tar.bz2 archive will let us stream the archive during upload and download. This will alleviate the need for extra disk space on the local computer, as no temporary file is needed, and speed up the upload/download process. We'll also be able to provide informative messages about which files are currently being uploaded/downloaded, so that nextstrain build doesn't look like it's frozen during that step with large builds and/or slow connections. In the future, it would also let us add command-line options to download only a subset of a build's results without needing to download/extract the whole archive first.

Python's standard library includes tarfile which supports streaming operation. When extracting, we probably want to ignore user/group ownership bits, and maybe some mode/permission bits.
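
A sketch of the streaming pieces from the standard library; the S3 upload/download plumbing is elided, and dropping ownership on extraction is one option rather than a settled choice.

# Sketch: stream a build directory as tar.bz2 with no temporary file, and
# extract while dropping ownership bits. The fileobj would be whatever stream
# the S3 uploader/downloader exposes.
import tarfile
from pathlib import Path

def write_archive(workdir: Path, fileobj) -> None:
    # mode "w|bz2" writes sequentially to fileobj instead of seeking.
    with tarfile.open(fileobj=fileobj, mode="w|bz2") as tar:
        for path in sorted(workdir.rglob("*")):
            print("adding:", path)   # informative progress message
            tar.add(path, arcname=str(path.relative_to(workdir)), recursive=False)

def extract_archive(fileobj, destination: Path) -> None:
    with tarfile.open(fileobj=fileobj, mode="r|bz2") as tar:
        for member in tar:                    # stream members in order
            member.uid = member.gid = 0       # ignore remote ownership
            member.uname = member.gname = ""
            tar.extract(member, path=destination)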

https://github.com/nextstrain/docker-base/tree/master/entrypoint-aws-batch will also need updating to add support for tar.bz2 files.

Support using Singularity to run builds in the container image

nextstrain build should be able to support running builds using the container image but in an environment which doesn't support docker or have it installed, such as the Hutch's rhino cluster. This can be accomplished using Singularity. The interface might look like one of these:

nextstrain build --singularity .

Implementing this would involve adding a new runner, nextstrain/cli/runner/singularity.py, and registering it within the runner framework.

CI on Windows

Now that we document an installation path for Windows, it'd be very nice to add Windows to our CI. This would help catch issues like #151 earlier during development.

GitHub Actions supports Windows runners with Windows Server versions. I don't know if differences between Windows Server and our expected users' Desktop versions will make this less useful. Other CI providers like AppVeyor or CircleCI might have better/different Windows support that's worth evaluating? That said, I expect GitHub, owned by Microsoft, to expand here with time.

A while ago, I wrote a proof-of-concept using AppVeyor but never finished it. It is likely not useful now.

AWS Batch runner does not support runtime-configurable Docker images

This limitation is because the --aws-batch runner (nextstrain.cli.runner.aws_batch) relies on a static job definition created manually during AWS Batch setup (as documented). The job definition hardcodes an image, and individual job instances based on that definition are not allowed to override the image (unlike other job properties like CPU, memory, command line, etc).

The --aws-batch runner should allow a Docker image to be selected by the user at runtime, not just setup, using the same mechanisms supported by the --docker runner. Specifically, --aws-batch should use the image passed in via --image or the default Docker image (configured via a config file or environment var).

To accomplish this, the --aws-batch runner will have to create a new job definition before job submission if no job definition already exists for the desired image. The new job definition should be based on the default job definition, using it as a template for all properties except the image. The help message for --aws-batch-job should be updated to document that it may be used this way.

As a rough map of where to start for the person taking on this issue, I annotated the places which need new code to support this functionality.
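
A hedged sketch of the approach described above: look up the default job definition and register a derived one for the requested image if needed. The names are placeholders, and reuse of already-registered definitions plus error handling are omitted.

# Sketch: derive a new Batch job definition from the default one, swapping in
# a runtime-selected image. Names are placeholders; caching of existing derived
# definitions and error handling are omitted.
import boto3

batch = boto3.client("batch", region_name="us-east-1")

def job_definition_for(image: str, default_name: str = "nextstrain-job-definition") -> str:
    default = batch.describe_job_definitions(
        jobDefinitionName=default_name, status="ACTIVE"
    )["jobDefinitions"][0]

    container = dict(default["containerProperties"], image=image)
    derived_name = default_name + "_" + image.replace("/", "_").replace(":", "_")

    registered = batch.register_job_definition(
        jobDefinitionName=derived_name,
        type="container",
        containerProperties=container,
    )
    return registered["jobDefinitionName"]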

facilitate detached builds on AWS

I'd like to be able to run many builds simultaneously on the cloud, and then interrogate the results. The solution described in this issue doesn't fully achieve this, but makes the process smoother.

Currently the --aws-batch functionality is designed to stream logs back to the terminal. Quitting the program while it's running will stop the job running (on EC2), and while you could background the process and have it keep running, this is not ideal for this use case.

Running nextstrain build --aws-batch --detached ${build} could trigger a batch job and simply return the job id. Subsequent functionality can be built in to query the progress of this job id / download results etcetera.

Note that "detached" may not be the correct word. Feel free to suggest alternatives.

What is the remit of CLI?

I want to have a clear remit for the CLI. How does it sit in relation to augur and auspice? Is the current direction a sound one?

I would imagine the CLI making possible the following workflow:

  • nextstrain build --aws --repo nextstrain/zika: starts jobs running via AWS batch and resulting JSON files go to an S3 bucket
  • nextstrain fetch: pulls down data from this bucket to a local directory
  • nextstrain view: to look at this data in auspice
  • nextstrain deploy: to push JSONs live

Or a local build is run with:

  • nextstrain build .: starts job running locally
  • nextstrain view: to look at JSONs in auspice
  • nextstrain deploy: to push JSONs live

This workflow is used for established builds that are debugged and working. Debugging and getting things working takes place within augur. Augur doesn't concern itself with moving files about. deploy takes place after viewing in auspice. Not necessarily advocating this exact solution, but seems like a comfortable direction. nextstrain takes care of the Nextstrain logic, pushing builds live, fetching builds, etc... that is outside the scope of augur or auspice individually.

In general, I'd like to figure out a way to best frame nextstrain/cli. What is its remit? What is its interface? Is this a good general direction, or do people have other things in mind?

[check-setup] Clarify help for --set-default

With release 1.11.1 we can now check the computer's hardware and assess which Nextstrain runtime environments are available (docker, native, or AWS). The nice thing about this is that you can run the same commands (e.g. nextstrain build <dir>) on both a Windows machine that can't use Docker and a Mac or some other machine that can.

You can also set a default runner to use by passing the --set-default flag to nextstrain check-setup. Currently, when nextstrain check-setup --help is run, the help text describes what --set-default does but does not specify the rank order of runners it will choose from when picking a default. The help should be amended to make clear that the first choice for the default is docker, followed by native.

Putting this issue up now to remember it, but I'm unlikely to get to this until after I'm back from the DRC.

confusing behaviour when both auspice & nextstrain view are running

If nextstrain view is running & using the default port 4000, then running auspice view, which also uses port 4000, produces confusing behavior. It doesn't matter which command is run first.

What happens:

On macOS, but not Linux, no error or warning is displayed about multiple processes using the same port, and requests to http://localhost:4000/ will always connect to the nextstrain view process. It is unknown what happens on Windows.

Expected result:

(1) nextstrain view should display an error similar to if multiple auspice view commands are run:

[error]	Port 4000 is currently in use by another program.
      You must either close that program or specify a different port by setting the shell variable
      "$PORT". Note that on MacOS / Linux, "lsof -n -i :4000 | grep LISTEN" should
      identify the process currently using the port.

(2) auspice should detect that port 4000 is in use (by nextstrain view) just as it does if the port was opened by a different auspice view command. Perhaps this is a fix in auspice?

AWS batch file modified dates should be amended

@tsibley ---

Currently, when running with --aws-batch, files get downloaded from S3 and their timestamps reflect download time, for example:

Fenrir:seasonal-flu trvrb$ ls -lah results/
-rw-r--r--   1 trvrb  staff   3.7M Feb  5 13:33 aligned_cdc_h3n2_ha_3y_cell_hi.fasta
-rw-r--r--   1 trvrb  staff    44K Feb  5 13:33 aligned_cdc_h3n2_ha_3y_cell_hi.fasta.log
-rw-r--r--   1 trvrb  staff   4.0M Feb  5 13:33 aligned_cdc_h3n2_ha_3y_cell_hi.fasta.post_aligner.fasta
-rw-r--r--   1 trvrb  staff   3.7M Feb  5 13:33 aligned_cdc_h3n2_ha_3y_cell_hi.fasta.pre_aligner.fasta
-rw-r--r--   1 trvrb  staff   3.1M Feb  5 13:33 aligned_cdc_h3n2_na_3y_cell_hi.fasta
-rw-r--r--   1 trvrb  staff    44K Feb  5 13:33 aligned_cdc_h3n2_na_3y_cell_hi.fasta.log
-rw-r--r--   1 trvrb  staff   3.7M Feb  5 13:33 aligned_cdc_h3n2_na_3y_cell_hi.fasta.post_aligner.fasta
-rw-r--r--   1 trvrb  staff   3.1M Feb  5 13:33 aligned_cdc_h3n2_na_3y_cell_hi.fasta.pre_aligner.fasta
...

This messes with the ability to restart snakemake or nextstrain build locally, as timestamps don't properly reflect build order. This results in Snakemake attempting to start from scratch.

Would it be possible to make these files keep their original timestamps?
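
One hedged way to do this on the download side is to set each file's mtime from the S3 object's LastModified time; whether that is the right timestamp to use (versus recording the original mtimes at upload) is part of what this issue should settle.

# Sketch: after downloading a result file from S3, set its mtime to the S3
# object's LastModified time instead of the download time. Whether LastModified
# is the right source (vs. recording mtimes at upload) is an open question.
import os
import boto3

s3 = boto3.resource("s3")

def download_with_mtime(bucket: str, key: str, local_path: str) -> None:
    obj = s3.Object(bucket, key)
    obj.download_file(local_path)
    timestamp = obj.last_modified.timestamp()      # datetime -> POSIX seconds
    os.utime(local_path, (timestamp, timestamp))   # (atime, mtime)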

view: Support a --public or --allow-network-access option

This option would bind to 0.0.0.0 instead of 127.0.0.1 and, ideally, also print a nice message with the host IPs to use (instead of just 0.0.0.0, which isn't useful).

It may also be worth advertising an mdns / zeroconf / avahi record for nextstrain.local or similar, so that the URL can always be http://nextstrain.local:4000/.
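
A sketch of printing usable host IPs after binding to 0.0.0.0, using the netifaces package mentioned elsewhere in this issue list; the loopback filtering and URL format are simplifications.

# Sketch: after binding to 0.0.0.0, print the host's real IPs so users get a
# usable URL. Uses netifaces; the loopback filtering here is simplistic.
import netifaces

def host_addresses():
    for interface in netifaces.interfaces():
        for addr in netifaces.ifaddresses(interface).get(netifaces.AF_INET, []):
            ip = addr.get("addr")
            if ip and not ip.startswith("127."):
                yield ip

def print_view_urls(port: int = 4000) -> None:
    print("The following addresses should work from other machines on your network:")
    for ip in host_addresses():
        print(f"    http://{ip}:{port}/")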

Catch exception when docker not installed

Running nextstrain check-setup without docker installed results in the following stack trace, which should be caught and something like "install docker" printed...

nextstrain-cli is up to date!

Testing your setup…
Traceback (most recent call last):
  File "/Users/naboo/miniconda3/envs/nextstrain-cli/bin/nextstrain", line 11, in <module>
    load_entry_point('nextstrain-cli', 'console_scripts', 'nextstrain')()
  File "/Users/naboo/github/nextstrain/cli/nextstrain/cli/__main__.py", line 10, in main
    return cli.run( argv[1:] )
  File "/Users/naboo/github/nextstrain/cli/nextstrain/cli/__init__.py", line 51, in run
    return opts.__command__.run(opts)
  File "/Users/naboo/github/nextstrain/cli/nextstrain/cli/command/check_setup.py", line 53, in run
    for runner in all_runners
  File "/Users/naboo/github/nextstrain/cli/nextstrain/cli/command/check_setup.py", line 53, in <listcomp>
    for runner in all_runners
  File "/Users/naboo/github/nextstrain/cli/nextstrain/cli/runner/docker.py", line 155, in test_setup
    *test_memory_limit()
  File "/Users/naboo/github/nextstrain/cli/nextstrain/cli/runner/docker.py", line 122, in test_memory_limit
    if image_exists():
  File "/Users/naboo/github/nextstrain/cli/nextstrain/cli/runner/docker.py", line 291, in image_exists
    stderr = subprocess.DEVNULL)
  File "/Users/naboo/miniconda3/envs/nextstrain-cli/lib/python3.7/subprocess.py", line 453, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/Users/naboo/miniconda3/envs/nextstrain-cli/lib/python3.7/subprocess.py", line 756, in __init__
    restore_signals, start_new_session)
  File "/Users/naboo/miniconda3/envs/nextstrain-cli/lib/python3.7/subprocess.py", line 1499, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'docker': 'docker'
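
A sketch of the kind of guard that would replace the traceback; the exact call site in docker.py's image_exists() may differ.

# Sketch: guard the `docker` invocation so a missing binary produces a short
# message instead of a traceback. The exact call site in docker.py may differ.
import subprocess

def image_exists(image: str = "nextstrain/base") -> bool:
    try:
        result = subprocess.run(
            ["docker", "image", "inspect", image],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL)
    except FileNotFoundError:
        print("The `docker` command was not found.")
        print("Please install Docker to use the Docker runner, or pick another runner.")
        return False
    return result.returncode == 0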

[deploy] Hint that a bucket may not be found because of missing/wrong credentials

When running deploy with missing or wrong credentials, the error message is not "not authenticated" but "bucket doesn't exist"...

\:> nextstrain deploy s3://nextstrain-staging auspice/v2_flu_seasonal_*
No bucket exists with the name "nextstrain-staging".

Buckets are not automatically created for safety reasons.

Not sure whether there is a way around this, but it's not the ideal error message...
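
One possible way to at least distinguish the two cases: head_bucket raises a ClientError whose error code is typically 403 for permission/credential problems and 404 when the bucket really doesn't exist. A rough sketch, with illustrative wording:

# Sketch: distinguish "bucket missing" from "credentials missing/wrong" when
# checking the deploy target. Wording and exact call site are illustrative.
import boto3
from botocore.exceptions import ClientError, NoCredentialsError

def check_bucket(name: str) -> None:
    try:
        boto3.client("s3").head_bucket(Bucket=name)
    except NoCredentialsError:
        print("No AWS credentials were found; configure them before deploying.")
    except ClientError as error:
        code = error.response["Error"]["Code"]
        if code in ("403", "AccessDenied"):
            print(f'Access to the bucket "{name}" was denied; check that your AWS credentials are present and correct.')
        elif code in ("404", "NoSuchBucket"):
            print(f'No bucket exists with the name "{name}".')
        else:
            raise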

Bug with nextstrain update in >1.7

Calling nextstrain update in version 1.6.1 gets me:

Leda:mumps trvrb$ nextstrain update
A new version of nextstrain-cli, 1.7.1, is available!  You're running 1.6.1.

Upgrade by running:

    pip install --upgrade nextstrain-cli

Updating Docker image nextstrain/base…

Using default tag: latest
latest: Pulling from nextstrain/base
Digest: sha256:036a0368430163bfe76ee02cdd1318b9e3ba21ffbde13a26c1eddb05c9c20c75
Status: Downloaded newer image for nextstrain/base:latest

Pruning old copies of image…


Your images are up to date!

…but consider upgrading nextstrain-cli too, as noted above.

Everything working as it should be. However, running nextstrain update in 1.7.0 or 1.7.1 gets me:

Leda:mumps trvrb$ nextstrain update
nextstrain-cli is up to date!

Updating Docker image from nextstrain/base to nextstrain/base:build-20181222T010646Z…

build-20181222T010646Z: Pulling from nextstrain/base
Digest: sha256:036a0368430163bfe76ee02cdd1318b9e3ba21ffbde13a26c1eddb05c9c20c75
Status: Image is up to date for nextstrain/base:build-20181222T010646Z
Traceback (most recent call last):
  File "/Users/trvrb/.pyenv/versions/3.6.1/bin/nextstrain", line 11, in <module>
    sys.exit(main())
  File "/Users/trvrb/.pyenv/versions/3.6.1/lib/python3.6/site-packages/nextstrain/cli/__main__.py", line 10, in main
    return cli.run( argv[1:] )
  File "/Users/trvrb/.pyenv/versions/3.6.1/lib/python3.6/site-packages/nextstrain/cli/__init__.py", line 51, in run
    return opts.__command__.run(opts)
  File "/Users/trvrb/.pyenv/versions/3.6.1/lib/python3.6/site-packages/nextstrain/cli/command/update.py", line 27, in run
    for runner in all_runners
  File "/Users/trvrb/.pyenv/versions/3.6.1/lib/python3.6/site-packages/nextstrain/cli/command/update.py", line 27, in <listcomp>
    for runner in all_runners
  File "/Users/trvrb/.pyenv/versions/3.6.1/lib/python3.6/site-packages/nextstrain/cli/runner/docker.py", line 186, in update
    config.set("docker", "image", latest_image)
  File "/Users/trvrb/.pyenv/versions/3.6.1/lib/python3.6/site-packages/nextstrain/cli/config.py", line 75, in set
    save(config)
  File "/Users/trvrb/.pyenv/versions/3.6.1/lib/python3.6/site-packages/nextstrain/cli/config.py", line 43, in save
    with path.open(mode = "w", encoding = "utf-8") as file:
  File "/Users/trvrb/.pyenv/versions/3.6.1/lib/python3.6/pathlib.py", line 1164, in open
    opener=self._opener)
  File "/Users/trvrb/.pyenv/versions/3.6.1/lib/python3.6/pathlib.py", line 1018, in _opener
    return self._accessor.open(self, flags, mode)
  File "/Users/trvrb/.pyenv/versions/3.6.1/lib/python3.6/pathlib.py", line 390, in wrapped
    return strfunc(str(pathobj), *args)
FileNotFoundError: [Errno 2] No such file or directory: '/Users/trvrb/.nextstrain/config'
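
The traceback points at the config file's parent directory not existing yet on a fresh install; a hedged sketch of the likely fix in config.py's save():

# Sketch of the likely fix in config.py: create ~/.nextstrain/ before writing
# the config file, so a fresh install doesn't hit FileNotFoundError.
from pathlib import Path

def save(config, path: Path = Path.home() / ".nextstrain" / "config") -> None:
    path.parent.mkdir(parents=True, exist_ok=True)   # ensure ~/.nextstrain/ exists
    with path.open(mode="w", encoding="utf-8") as file:
        config.write(file)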

[aws batch] Runner should automatically provide snakemake with the appropriate `--cores N` option

If the AWS Batch runner is executing snakemake, it should provide a default --cores N / --jobs N option where N defaults to the number of CPUs in the Batch job definition (which may be overridden by the environment or CLI options).

This will let the Nextstrain build use all the available resources by default, instead of defaulting to a single core.

Any user-provided --cores, --jobs, or -j option shouldn't be overridden.
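
A sketch of the kind of argument handling meant here; detecting a user-supplied cores/jobs option by simple prefix matching is an approximation of whatever the runner would really do.

# Sketch: give snakemake a default --cores matching the Batch job's CPU count
# unless the user already passed a cores/jobs option. The prefix matching is
# an approximation of real option parsing.
def with_default_cores(exec_args, job_cpus):
    user_set = any(
        arg == "-j" or arg.startswith(("-j", "--jobs", "--cores"))
        for arg in exec_args)
    if user_set:
        return list(exec_args)
    return list(exec_args) + [f"--cores={job_cpus}"]

assert with_default_cores(["snakemake", "-p"], 4) == ["snakemake", "-p", "--cores=4"]
assert with_default_cores(["snakemake", "--cores", "2"], 4) == ["snakemake", "--cores", "2"]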

confusing behavior when a terminal window is closed with `nextstrain view` running

If you close a terminal window that was running nextstrain view then -- at least on MacOS -- the docker container continues to run and the port (normally 4000) still connects to the docker container. Combined with #48 this makes for some fun debugging. Is it possible to close the docker container in this case (as happens with ctrl+c), or is this not possible?

Nextstrain build sometimes fails for Zika

Running nextstrain build . from https://github.com/nextstrain/zika fails at the align step with the error message:

using mafft to align via:
	mafft --reorder --anysymbol --thread 2 results/filtered.fasta.ref.fasta 1> results/aligned.fasta 2> results/aligned.fasta.log 

	Katoh et al, Nucleic Acid Research, vol 30, issue 14
	https://doi.org/10.1093%2Fnar%2Fgkf436

Traceback (most recent call last):
  File "/Users/trvrb/.pyenv/versions/3.6.1/bin/augur", line 6, in <module>
    exec(compile(open(__file__).read(), __file__, 'exec'))
  File "/Users/trvrb/Documents/src/augur/bin/augur", line 176, in <module>
    return_code = args.func(args)
  File "/Users/trvrb/Documents/src/augur/augur/align.py", line 71, in run
    aln = AlignIO.read(output, 'fasta')
  File "/Users/trvrb/.pyenv/versions/3.6.1/lib/python3.6/site-packages/Bio/AlignIO/__init__.py", line 439, in read
    raise ValueError("No records found in handle")
ValueError: No records found in handle

Note that this error does not occur when running snakemake on the same repo, so I believe this is a cli error and not an augur or zika build error.

Edit: And I just ran again and it worked just fine. I have no idea what was going on. @tsibley: Could you test nextstrain build on zika and close this if it works for you?

[deploy] Improve error message when the internet connection is down

I know most of us use Nextstrain with awesome internet, but as the CLI facilitates uptake, hopefully including in places with shoddier internet, we’ll get some commonly thrown errors that should have nice error messages rather than just the go-to exceptions.

For instance, when I run nextstrain deploy but the internet mucks up and it can’t connect to the S3 bucket, I get really ugly standard error output (posted below). This output would probably be concerning to someone who is less familiar with the command line. Given that internet connectivity issues are reasonably common, it seems like a decent thing to make this scenario spit out an informative and shorter error message of our choice, rather than this default blurb.

>>> Traceback (most recent call last):
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/urllib3/connection.py", line 160, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/urllib3/util/connection.py", line 80, in create_connection
    raise err
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/urllib3/util/connection.py", line 70, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 61] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/httpsession.py", line 262, in send
    chunked=self._chunked(request.headers),
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/urllib3/connectionpool.py", line 641, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/urllib3/util/retry.py", line 344, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/urllib3/packages/six.py", line 686, in reraise
    raise value
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/urllib3/connectionpool.py", line 603, in urlopen
    chunked=chunked)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/urllib3/connectionpool.py", line 344, in _make_request
    self._validate_conn(conn)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/urllib3/connectionpool.py", line 843, in _validate_conn
    conn.connect()
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/urllib3/connection.py", line 316, in connect
    conn = self._new_conn()
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/urllib3/connection.py", line 169, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <botocore.awsrequest.AWSHTTPSConnection object at 0x10690da58>: Failed to establish a new connection: [Errno 61] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/alliblk/miniconda3/envs/nextstrain/bin/nextstrain", line 10, in <module>
    sys.exit(main())
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/nextstrain/cli/__main__.py", line 10, in main
    return cli.run( argv[1:] )
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/nextstrain/cli/__init__.py", line 51, in run
    return opts.__command__.run(opts)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/nextstrain/cli/command/deploy.py", line 87, in run
    return deploy.run(url, files)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/nextstrain/cli/deploy/s3.py", line 45, in run
    boto3.client("s3").head_bucket(Bucket = bucket.name)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/client.py", line 648, in _make_api_call
    operation_model, request_dict, request_context)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/client.py", line 667, in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/endpoint.py", line 102, in make_request
    return self._send_request(request_dict, operation_model)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/endpoint.py", line 137, in _send_request
    success_response, exception):
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/endpoint.py", line 231, in _needs_retry
    caught_exception=caught_exception, request_dict=request_dict)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/hooks.py", line 356, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/hooks.py", line 228, in emit
    return self._emit(event_name, kwargs)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/hooks.py", line 211, in _emit
    response = handler(**kwargs)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/retryhandler.py", line 183, in __call__
    if self._checker(attempts, response, caught_exception):
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/retryhandler.py", line 251, in __call__
    caught_exception)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/retryhandler.py", line 277, in _should_retry
    return self._checker(attempt_number, response, caught_exception)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/retryhandler.py", line 317, in __call__
    caught_exception)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/retryhandler.py", line 223, in __call__
    attempt_number, caught_exception)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/retryhandler.py", line 359, in _check_caught_exception
    raise caught_exception
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/endpoint.py", line 200, in _do_get_response
    http_response = self._send(request)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/endpoint.py", line 244, in _send
    return self.http_session.send(request)
  File "/Users/alliblk/miniconda3/envs/nextstrain/lib/python3.6/site-packages/botocore/httpsession.py", line 282, in send
    raise EndpointConnectionError(endpoint_url=request.url, error=e)
botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "https://nextstrain-inrb.s3.amazonaws.com/"
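
A hedged sketch of catching the connection failure near the head_bucket call so users see a short message instead of the traceback above; the wording and call site are illustrative.

# Sketch: turn a botocore connection failure into a short, friendly message
# instead of the full traceback above.
import boto3
from botocore.exceptions import EndpointConnectionError

def check_deploy_target(bucket: str) -> bool:
    try:
        boto3.client("s3").head_bucket(Bucket=bucket)
    except EndpointConnectionError:
        print("Could not connect to AWS S3.")
        print("Check your internet connection and try again.")
        return False
    return True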

Should we auto-update the image?

I had an issue with nextstrain view over in #21, owing to my not having thought to run nextstrain update.

I had thought that it auto-updated. If possible, I would either have it auto-update before running nextstrain build, nextstrain view, etc..., or print that an update exists and you should run nextstrain update. I'd prefer the former. This is in analogy to brew install etc... that automatically runs brew update. Without this feature, there will be many gotcha moments when people have an issue and discover they needed to run nextstrain update.

If we're worried about pinning an image version I'd think to do something akin to nextstrain pin TAGNAME, where TAGNAME defaults to LATEST, which would give the auto-updating functionality.

[aws batch] Gracefully handle network-level errors during uploading

Current Behavior

I'm getting this error frequently when running ncov builds on aws-batch -- only around 1/3 of my attempts to run the command succeed. It's related to uploading the large (400+ MB) repo which is currently needed to run SARS-CoV-2 builds. The internet is not down, so it's not the same as #55.

$ nextstrain build --aws-batch --detach ...
...
zipped: /Users/naboo/github/nextstrain/ncov/Snakefile_Priorities

Traceback (most recent call last):
  File "/Users/naboo/miniconda3/lib/python3.7/site-packages/urllib3/connection.py", line 157, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "/Users/naboo/miniconda3/lib/python3.7/site-packages/urllib3/util/connection.py", line 61, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/Users/naboo/miniconda3/lib/python3.7/socket.py", line 748, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:
...
  File "/Users/naboo/miniconda3/lib/python3.7/site-packages/botocore/httpsession.py", line 283, in send
    raise EndpointConnectionError(endpoint_url=request.url, error=e)
botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: <redacted>

This is the nextstrain-cli function which triggers it:

  File "/Users/naboo/github/nextstrain/cli/nextstrain/cli/runner/aws_batch/__init__.py", line 121, in run
    remote_workdir = s3.upload_workdir(local_workdir, bucket, run_id)
  File "/Users/naboo/github/nextstrain/cli/nextstrain/cli/runner/aws_batch/s3.py", line 70, in upload_workdir
    remote_workdir.upload_fileobj(tmpfile)

Expected behavior
Ability to upload large build directories despite less-than-ideal network upload speeds.

Additional context
nextstrain.cli 1.16.5
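
One hedged mitigation until a real fix lands would be to retry the upload with backoff when these transient network errors occur; the upload callable below is a stand-in for the CLI's own upload step, and the retry counts and delays are arbitrary.

# Sketch: retry a flaky S3 upload with exponential backoff. The upload()
# callable stands in for the CLI's own upload step; counts/delays are arbitrary.
import time
from botocore.exceptions import EndpointConnectionError

def upload_with_retries(upload, attempts=4, base_delay=5.0):
    for attempt in range(1, attempts + 1):
        try:
            return upload()
        except EndpointConnectionError:
            if attempt == attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Upload failed (attempt {attempt}/{attempts}); retrying in {delay:.0f}s...")
            time.sleep(delay)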

Document how to install from source

Reading the current README, it's not clear how to:
(a) install the required dependencies such that one can run ./bin/nextstrain, or
(b) install a system-wide version of nextstrain from the source (my understanding is that python3 -m pip install nextstrain-cli installs it from the package index). I was able to achieve (b) with pip install -e .

[build] Add options for CPUs/cores/threads and memory limits shared by all runners

The goal would be to deprecate --aws-batch-cpus and --aws-batch-memory in favor of generic options supported by all runners (or ignored if not applicable). If the --exec program is snakemake (the default), then these values would also be automatically passed into it.

An equivalency table of how these proposed new options would map to existing options:

Proposed: nextstrain build --cpus 2 --memory 1GiB --docker .
Existing: nextstrain build --docker . --cores 2 --resources mem_mb=1024

Proposed: nextstrain build --cpus 2 --memory 1GiB --aws-batch .
Existing: nextstrain build --aws-batch --aws-batch-cpus 2 --aws-batch-memory 1024 . --cores 2 --resources mem_mb=1024

Proposed: nextstrain build --cpus 2 --memory 1GB --aws-batch .
Existing: nextstrain build --aws-batch --aws-batch-cpus 2 --aws-batch-memory 954 . --cores 2 --resources mem_mb=954

--memory would accept 1024-based units like MiB and GiB as well as 1000-based units like MB and GB. The default would be MB if no unit is specified.

I'm not sure if --cpus should be --cpus, --cores, --threads or something else. All terms are somewhat ambiguous due to overloaded meanings.

Related to #65, which is a more specific issue that would be solved by this more general one.

Providing these general options allows for a cleaner, simpler build invocation regardless of build environment and, in the future where workflow executors other than Snakemake are supported, cleaner integration with their ways of specifying resources too.
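
A sketch of the unit handling described above, showing why 1GB maps to 954 (MiB, which is what AWS Batch expects) while 1GiB maps to 1024; the accepted unit list and rounding are assumptions for illustration.

# Sketch: parse --memory strings into MiB for AWS Batch, which is why 1GB maps
# to 954 and 1GiB to 1024 in the table above. Unit list and rounding are
# illustrative assumptions.
import re

UNITS_IN_BYTES = {
    "": 10**6,   # bare numbers default to MB
    "kb": 10**3, "mb": 10**6, "gb": 10**9,
    "kib": 2**10, "mib": 2**20, "gib": 2**30,
}

def memory_to_mib(text: str) -> int:
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*([a-zA-Z]*)\s*", text)
    if not match or match.group(2).lower() not in UNITS_IN_BYTES:
        raise ValueError(f"Unparseable memory value: {text!r}")
    value, unit = float(match.group(1)), match.group(2).lower()
    return round(value * UNITS_IN_BYTES[unit] / 2**20)

assert memory_to_mib("1GiB") == 1024
assert memory_to_mib("1GB") == 954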
