Comments (4)
Hi @coreymeloche,
Nextclade v2 reports that it was unable to align this sequence. This is a hard error: without an alignment, it is impossible to compare the sequence to the reference sequence or to analyse it further in any way, so mutations cannot be found and clades cannot be assigned.
Nextclade v3, as stated in the changelog, has received improvements in the alignment algorithm, and it seems these were sufficient for this sequence to be aligned now. Nextclade v3 tries to analyze and make sense of it, and it's up to you to interpret the results (or to ignore them). Nextclade applies the same analysis to all sequences and tries to assign clades to all of them. The sequence is still reported as "bad": you can see that the table row is highlighted in red, the QC column reports multiple failed QC rules, other columns show 1200+ mutations, 18k deletions and 102 frame shifts, and the sequence view shows numerous defects. This makes very little sense, so the clade assignment should also be questioned, as you rightfully did.
Back to your original question:
Erroneous Clade Assignment or More Refined Tool?
Nextclade v3 is a bit more refined, yes. At least our team believes so :)
Different people might have different opinions, standards and thresholds for what counts as a "good" or "bad" sequence, or as a "correct" or "erroneous" clade assignment. Some want to squeeze each useful byte from the precious sequencing juice they've just spent. Others might want only the most reliable information and to throw away all the junk. We don't try to guess and just report as much information as we have, so that users can make an informed decision.
Can you also provide release notes for v2.15?
Nothing changed in 2.15 other than the addition of a warning message on the main page. The version bump was necessary to be able to deploy the application. Sadly, the changelog for that version got stuck on the v2 branch, where Nextclade v2 has found its rest: https://github.com/nextstrain/nextclade/blob/e44d0ed5cf25fe38d22cf2f994098bef987734c4/CHANGELOG.md#nextclade-web-2150-2023-12-15
I have now added this information to the old changelog on the master branch as well:
https://github.com/nextstrain/nextclade/blob/08759d219826f05b456d59fff343f7c8cd6d4e0a/docs/changes/CHANGELOG.old.md
from nextclade.
Thank you for your detailed response, this is very helpful!!
Hi @coreymeloche
Just following up on Ivan's message. Version 2 was unable to align diverged or very fragmented sequences, while version 3 can. The clade assignment derives from the closest neighbor on the tree. In the case of a low-quality sequence like the one you posted, this placement on the tree will be essentially meaningless. So for sequences with a very large number of private mutations, the clade call needs to be checked for plausibility using the tree view. Such criteria will likely be different for different viruses. Hope this makes sense.
best,
richard
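For anyone who wants to pre-screen results before opening the tree view, here is a minimal sketch of that plausibility check applied to a Nextclade TSV output. It assumes the TSV contains the columns `seqName`, `clade`, `qc.overallStatus` and `qc.privateMutations.total`; please verify these names against your own output file. The function name `flag_suspect_clade_calls` and the default threshold of 50 private mutations are hypothetical and chosen only for illustration. As noted above, sensible criteria will differ between viruses.

```python
import csv
import io


def flag_suspect_clade_calls(tsv_text, max_private=50):
    """Return (seqName, clade, reason) tuples for rows whose clade call
    deserves a manual look in the tree view.

    Column names are assumptions about the Nextclade TSV layout, and
    max_private is an arbitrary illustrative threshold, not a recommendation.
    """
    flagged = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        reasons = []

        # Overall QC verdict as reported in the table (assumed column name).
        status = (row.get("qc.overallStatus") or "").strip().lower()
        if status == "bad":
            reasons.append("QC status is bad")

        # Number of private mutations relative to the nearest tree neighbor
        # (assumed column name); an empty cell is skipped.
        private = (row.get("qc.privateMutations.total") or "").strip()
        if private and float(private) > max_private:
            reasons.append(f"{private} private mutations")

        if reasons:
            flagged.append(
                (row.get("seqName") or "", row.get("clade") or "", "; ".join(reasons))
            )
    return flagged
```

To use it, read the TSV file into a string and pass it in, for example `flag_suspect_clade_calls(open("nextclade.tsv").read())`. The sequences it returns are the ones to inspect by hand; this does not replace checking the placement on the tree.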
Yes this makes sense, thank you! We will begin using v3.0 for data analysis but use our critical thinking skills and the guidance you outlined on poor sequences. Thank you Ivan and Richard for taking the time to explain!