Comments (16)
I agree `dataset` isn't quite on the nose, especially since it'd (intentionally) work for narratives too. I'm hesitant to lift up its subcommands to the top-level though.
I do prefer upload/download over deploy/retrieve. Deploy has more connotations than I think we want (I regret this choice) and isn't a common term for non-developers, whereas upload/download are widely understood by even non-developers.
I definitely want to make sources first-class things understood by the CLI, but I think that's a different scope of work for the future. The current design of the `deploy` command is set up to support different kinds of destinations (although it only supports S3 right now), and I'll maintain this in whatever additions I make for download/delete.
from cli.
Ok, it sounds like we should re-envision the `nextstrain deploy` command.
What about something like the following, where `deploy` becomes `dataset upload`:
nextstrain dataset upload s3://nextstrain-data auspice/*.json
nextstrain dataset download s3://nextstrain-data/zika_tree.json some/dir/
nextstrain dataset delete s3://nextstrain-data/zika*
# Deploy will be an alias for `dataset upload`, so this still works, but asks
# you to start using the new command.
nextstrain deploy s3://nextstrain-data *.json
The `dataset delete` command will also take care of invalidating the CloudFront cache if necessary, the same way `deploy` does now.
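A minimal sketch of how the proposed layout could be wired up with `argparse`, assuming nested subcommands under `dataset` and a top-level `deploy` kept as a legacy spelling. The function names, argument names, and warning text here are hypothetical and not taken from the actual CLI code.

```python
import argparse
import sys


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="nextstrain")
    commands = parser.add_subparsers(dest="command", required=True)

    # nextstrain dataset {upload,download,delete}
    dataset = commands.add_parser("dataset")
    actions = dataset.add_subparsers(dest="action", required=True)

    upload = actions.add_parser("upload")
    upload.add_argument("destination")
    upload.add_argument("files", nargs="+")

    download = actions.add_parser("download")
    download.add_argument("source")
    download.add_argument("local_dir")

    delete = actions.add_parser("delete")
    delete.add_argument("target")

    # Legacy spelling: same arguments as `dataset upload`.
    deploy = commands.add_parser("deploy")
    deploy.add_argument("destination")
    deploy.add_argument("files", nargs="+")

    return parser


def parse(argv):
    opts = build_parser().parse_args(argv)
    if opts.command == "deploy":
        # Hypothetical notice text; the old command keeps working.
        print(
            "warning: `nextstrain deploy` is now `nextstrain dataset upload`; "
            "please start using the new command.",
            file=sys.stderr,
        )
        opts.command, opts.action = "dataset", "upload"
    return opts
```

With this sketch, `parse(["deploy", "s3://nextstrain-data", "zika_tree.json"])` is treated as a `dataset upload` after printing the notice to stderr.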
from cli.
I spent some time yesterday and today implementing these commands under working names since the real names are still TBD. We should decide on the final names/structure so I can finish this up sooner rather than later. The basics are mostly there, but the behaviour and documentation still need some refining to round it out.
from cli.
Hmm. This seems like scope creep to me. The `nextstrain` command isn't meant to be a general-purpose S3 manipulation tool: `aws s3` exists for that (and supports wildcards).
Note that S3 is just one possible destination for `nextstrain deploy` (albeit the only destination possible right now); future supported destinations might use SFTP, SCP, git, WebDAV, or other upload mechanisms. The intent is for `nextstrain deploy` to do what needs to be done to put locally built files at a location reachable by a deployed Nextstrain instance.
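One way to picture "S3 today, other mechanisms later" is a dispatch on the destination URL's scheme. This is only a sketch under that assumption: `UPLOADERS`, `upload_s3`, and `upload` are hypothetical names, and the S3 handler is a placeholder that reports what it would do instead of talking to AWS.

```python
from urllib.parse import urlparse


def upload_s3(url, files):
    # Placeholder: a real handler would call the AWS API here.
    return f"would upload {len(files)} file(s) to S3 bucket {url.netloc!r}"


# Registry of supported destination schemes. Only S3 exists in this sketch;
# SFTP, SCP, git, WebDAV, etc. would be added as further entries.
UPLOADERS = {"s3": upload_s3}


def upload(destination, files):
    url = urlparse(destination)
    try:
        handler = UPLOADERS[url.scheme]
    except KeyError:
        supported = ", ".join(sorted(UPLOADERS))
        raise ValueError(
            f"unsupported destination scheme {url.scheme!r}; supported: {supported}"
        ) from None
    return handler(url, files)
```

Adding a new kind of destination then means registering one more handler, without changing how the command itself is invoked.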
from cli.
With the implementation of private S3 buckets, we will need the ability to download and remove files (right now we can only do this via the AWS console).
from cli.
This seems very reasonable to me. I would find `nextstrain dataset download` useful in day-to-day work with flu.
However, `dataset` seems slightly imperfect semantically, but I don't have a better suggestion.
I'm 50/50 on whether I prefer:
nextstrain dataset upload s3://nextstrain-data auspice/*.json
nextstrain dataset download s3://nextstrain-data/zika_tree.json some/dir/
nextstrain dataset delete s3://nextstrain-data/zika*
or
nextstrain deploy s3://nextstrain-data auspice/*.json
nextstrain retrieve s3://nextstrain-data/zika_tree.json some/dir/
nextstrain remove s3://nextstrain-data/zika*
And one additional thought... if we're planning for `source` to be a first-class citizen, maybe we should elevate it in the CLI. That would look like this:
nextstrain dataset upload wa-doh auspice/*.json
nextstrain dataset download wa-doh zika_tree.json some/dir/
nextstrain dataset delete wa-doh zika*
from cli.
I'm hesitant to lift up its subcommands to the top-level though.
That said, it may still be the right choice!
from cli.
I see the logic of the subcommand and think it's generally cleaner, but without `dataset` feeling right, I think it's semantically better to just do `nextstrain upload`, `nextstrain download`, `nextstrain delete`. Other options I just considered but didn't really like: `nextstrain io upload`, `nextstrain file upload`, `nextstrain remote upload`.
from cli.
Sounds good to me.
from cli.
I have no comment on the naming but the functionality sounds perfect.
from cli.
I guess we'll also need a `nextstrain list` command?
(I'd plan to alias `delete` to `rm` and `list` to `ls`.)
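For reference, `argparse` supports this kind of aliasing directly through the `aliases` argument of `add_parser`. The sketch below assumes top-level `list` and `delete` commands with a single hypothetical `remote` argument; it is not the actual CLI code.

```python
import argparse

parser = argparse.ArgumentParser(prog="nextstrain")
commands = parser.add_subparsers(dest="command", required=True)

# `ls` is accepted as another spelling of `list`.
list_cmd = commands.add_parser("list", aliases=["ls"])
list_cmd.add_argument("remote")
list_cmd.set_defaults(handler="list")

# `rm` is accepted as another spelling of `delete`.
delete_cmd = commands.add_parser("delete", aliases=["rm"])
delete_cmd.add_argument("remote")
delete_cmd.set_defaults(handler="delete")

# `command` records whatever spelling the user typed, so the canonical
# name is carried separately in `handler` via set_defaults().
opts = parser.parse_args(["rm", "s3://nextstrain-data/zika*"])
```

Here `opts.command` holds the spelling as typed (`rm`), while `opts.handler` holds the canonical name (`delete`), so dispatch does not need to know about the aliases.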
from cli.
I said I wouldn't comment on naming, but having taught `nextstrain deploy` over the past 2 weeks, here is a comment:
"list", "delete", "upload", "download" (and who knows, there may be more in the future, e.g. "move") are all part of the same concept in that they interact with a "source" (or "group", not sure about terminology), and not the local computer / AWS build / Docker build. This should be clear from the command.
from cli.
Yeah, that commonality is a big reason why I'd prefer a subcommand to collect and identify them all, and why I'm hesitant to make them all top-level commands.
I dismissed `dataset` earlier, but coming back to it, maybe we do:
nextstrain dataset {upload,download,delete,list} …
nextstrain narrative {upload,download,delete,list} …
`nextstrain deploy` would become an alias for `nextstrain dataset upload`.
The backing code (currently what's in `nextstrain/cli/deploy/*` as used by `nextstrain/cli/command/deploy.py`) would be largely shared between both `nextstrain dataset` and `nextstrain narrative`, but the `list`ing can be specific to one type of file and we can be dataset- or narrative-specific in help text.
from cli.
Nice! I was just doing some work on, and thinking about, the proposal I had earlier in the summer to have the entrypoint for builds start with downloading a flat file from S3 rather than calling out to fauna. My thought was to mirror data.nextstrain.org/zika.json with something like data.nextstrain.org/zika_sequences.fasta and data.nextstrain.org/zika_metadata.tsv. We don't have to move in this direction, but I can see a creep of more file types than `dataset` and `narrative`. But that doesn't mean that `dataset` and `narrative` aren't good things to have.
That said, I am maybe coming around to `nextstrain remote {upload,download,delete,list} …`.
from cli.
Nod. During my implementation so far, I was realizing that separate `dataset` and `narrative` commands would add a lot of UI duplication to the CLI. So I was thinking it's better to have a single command like `remote` which, if desired, could then have separate filter options for file "types" (dataset, narrative, etc.).
I think `nextstrain remote` makes most sense, unless we want to have `upload`/`download`/etc. at the top level.
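A rough sketch of what a file-"type" filter behind a single `remote` command could look like. It assumes, purely for illustration, that the type can be inferred from the file extension (`.json` for datasets, `.md` for narratives); `TYPE_SUFFIXES` and `filter_by_type` are hypothetical names, and the real distinction may well need more than the extension.

```python
from pathlib import PurePosixPath

# Assumption for illustration only: type is inferred from the extension.
TYPE_SUFFIXES = {"dataset": ".json", "narrative": ".md"}


def filter_by_type(keys, file_type=None):
    """Return the remote file names matching file_type, or all if None."""
    if file_type is None:
        return list(keys)
    suffix = TYPE_SUFFIXES[file_type]
    return [key for key in keys if PurePosixPath(key).suffix == suffix]


keys = ["zika_tree.json", "zika_meta.json", "intro-to-zika.md"]
datasets = filter_by_type(keys, "dataset")
narratives = filter_by_type(keys, "narrative")
```

A new file type (for example sequence or metadata files) would then be one more entry in the mapping instead of another top-level command.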
from cli.
Thanks Tom. After thinking more, my preference is `nextstrain remote upload/download/etc`. If you prefer something else, I'm happy to go with that instead.
from cli.