Hi Laura @laura-bankers,
Thanks for your interest! We are very happy that people are reaching out and asking about new pathogens.
Do you mean these files?
https://github.com/nextstrain/nextclade/tree/master/data/enterovirus/d68
Sadly, these are only a genome annotation, a reference sequence and a few example sequences, so they are not enough to run Nextclade (which currently also requires a reference tree, a QC config and a virus properties config). These files are there for historical reasons, only as examples for running Nextalign (which is like Nextclade, but only does alignment and translation).
Or maybe you've seen other files somewhere else? Could you please send me a link?
I can't exclude the possibility that datasets created by the community exist somewhere on the internet without us knowing about them.
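To make the list of required inputs above concrete, here is a minimal sketch that checks whether a local directory contains all five pieces Nextclade currently needs. The file names are assumptions chosen for illustration only, and `missing_components` is a hypothetical helper, not part of Nextclade.

```python
from pathlib import Path

# Component names come from the explanation above.
# The file names are ASSUMPTIONS for illustration; a real dataset may use others.
REQUIRED_FILES = {
    "reference sequence": "reference.fasta",
    "genome annotation": "genemap.gff",
    "reference tree": "tree.json",
    "QC config": "qc.json",
    "virus properties config": "virus_properties.json",
}


def missing_components(dataset_dir, required=REQUIRED_FILES):
    """Return the names of the components whose file is absent from dataset_dir."""
    root = Path(dataset_dir)
    return [
        name
        for name, filename in required.items()
        if not (root / filename).is_file()
    ]


if __name__ == "__main__":
    # Hypothetical local path; replace with the directory you want to inspect.
    for name in missing_components("data/enterovirus/d68"):
        print(f"missing: {name}")
```

For a directory holding only a reference sequence and a genome annotation, the helper would report the tree, the QC config and the virus properties config as missing.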
A few notes which may help you in your work with Nextclade:
The Nextclade team is not aware of any Dockstore containers. Dockstore is not an official source, so this is probably a community effort. We are happy to hear about such efforts, but we don't have the bandwidth to support them officially.
The official Docker containers (on DockerHub), and all other official means of distributing Nextclade CLI (listed in the docs), don't contain datasets on purpose. Nextclade is pathogen-agnostic by design. It only reads an index.json file hosted on our servers, which contains a list of known datasets, and can then download datasets from this list using the `nextclade dataset get` command. This is purely for convenience. You can also load any dataset you want from your computer. So, if you found a dataset you like, or created one, you can just pass it into Nextclade as you would an officially downloaded one.
You can try to build your own dataset to support a new pathogen. It's quite a challenging adventure at the moment, but I gathered some information in response to this issue in the hope that it helps people: #1225
We are working on the next major version of Nextclade, version 3. The new version will bring significant changes to datasets. Nextalign will be removed, and all dataset files previously required for Nextclade will become optional. This way you will be able to build a dataset gradually, starting small and adding new features later as needed. We are also hoping to document the creation of new datasets better and to provide tools that make the process easier. It's all coming soon. Stay tuned!
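As a rough illustration of the note above about passing a local dataset into Nextclade, the sketch below only assembles the command line and prints it, without running anything. The `--input-dataset` and `--output-all` flags are given as I understand them from the Nextclade CLI; please confirm with `nextclade run --help` for your version. The paths and the `nextclade_run_command` helper are hypothetical.

```python
import shlex


def nextclade_run_command(dataset_dir, sequences, output_dir):
    """Build the argument list for running Nextclade on a local dataset directory."""
    return [
        "nextclade", "run",
        "--input-dataset", str(dataset_dir),  # local dataset, downloaded or self-made
        "--output-all", str(output_dir),      # write all output files to this directory
        str(sequences),                       # input FASTA file with query sequences
    ]


# Hypothetical paths, for illustration only.
cmd = nextclade_run_command("my_datasets/ev-d68", "sequences.fasta", "output")
print(shlex.join(cmd))
```

Building the arguments as a list rather than a single string keeps paths with spaces intact if the list is later handed to `subprocess.run`.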