Comments (6)
Could you explain use cases where a physical coverage track would be useful?
When a contig/scaffold ends or is misassembled, physical coverage helps troubleshoot whether the assembly, the reads, or the reference assembly is at fault.
When a region has no read coverage but does have physical coverage, it indicates a region where the assembler cannot assemble a contig, but should be able to scaffold over it.
If a region has low/no read coverage and low/no physical coverage, it indicates a region where neither a contig nor a scaffold should be expected in the assembly.
If a region has typical read coverage but has low physical coverage, it can indicate a region of reference misassembly.
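The three cases above can be written as a small decision rule. This is only a sketch: the function name `classify_region` and the `low` cutoff are assumptions made for illustration, and a sensible cutoff depends on the typical depth of the data set.

```python
def classify_region(read_cov, phys_cov, low=5.0):
    """Label a region from its mean read coverage and mean physical coverage.

    `low` is a hypothetical cutoff below which coverage counts as low/none.
    """
    reads_low = read_cov < low
    phys_low = phys_cov < low
    if reads_low and not phys_low:
        # No contig can be assembled here, but pairs span the region.
        return "scaffoldable gap"
    if reads_low and phys_low:
        # Neither a contig nor a scaffold is expected here.
        return "no contig or scaffold expected"
    if phys_low:
        # Typical read coverage but few spanning pairs.
        return "possible reference misassembly"
    return "normal"


print(classify_region(read_cov=0.0, phys_cov=40.0))
```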
from quast.
Should we consider single reads or improperly paired reads somehow?
Single reads and improperly paired reads should be ignored in physical coverage calculation.
Read pairs for which both reads are multimapped can be useful to either count or ignore, depending on the situation. Ignoring them is useful to indicate genomic positions where scaffolds should end, because the repeat at that position is larger than the sequencing fragment size.
Should we check insert size deviation somehow and, e.g. skip remotely mapped reads?
Filtering out improperly paired reads should be good enough, as long as that flag is set. Super long incorrectly paired fragments could possibly dominate the physical coverage count if not filtered out.
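A minimal sketch of such a filter, working directly on the SAM FLAG and TLEN fields. The flag bit values come from the SAM specification; the function name `keep_for_physical_coverage` and the optional `max_tlen` cap are assumptions added here for illustration, not something QUAST is known to do.

```python
# SAM FLAG bits (values from the SAM specification).
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def keep_for_physical_coverage(flag, tlen, max_tlen=None):
    """Return True if this record should contribute one fragment.

    Only the leftmost mate (TLEN > 0) is kept so each pair counts once.
    `max_tlen` is a hypothetical extra guard against very long fragments.
    """
    if not (flag & PAIRED and flag & PROPER_PAIR):
        return False  # single reads and improperly paired reads
    if flag & (UNMAPPED | MATE_UNMAPPED | SECONDARY | SUPPLEMENTARY):
        return False
    if tlen <= 0:
        return False  # rightmost mate, or TLEN not set
    if max_tlen is not None and tlen > max_tlen:
        return False
    return True


print(keep_for_physical_coverage(flag=99, tlen=350))
```

The explicit cap is only needed when the properly-paired flag cannot be trusted, which is the caveat raised above.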
How do you calculate physical coverage usually? samtools/bedtools/some ad hoc script?
Filter as above, create a BED file of (read start pos, read start pos + isize) and use bedtools genomecov.
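To make the recipe concrete, here is a pure-Python sketch of the same idea: it turns properly paired SAM records into BED-style fragments (0-based start, start + TLEN as end) and then counts per-base depth with a difference array. It is a stand-in for illustration, not a replacement for bedtools; the function names and the toy records are assumptions.

```python
def fragments_from_sam(lines):
    """Yield (rname, start, end) for the leftmost mate of each proper pair."""
    for line in lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag, rname = int(fields[1]), fields[2]
        pos, tlen = int(fields[3]), int(fields[8])
        if flag & 0x2 and tlen > 0:  # properly paired, leftmost mate
            start = pos - 1  # SAM POS is 1-based, BED start is 0-based
            yield rname, start, start + tlen


def physical_coverage(intervals, length):
    """Per-base depth over [0, length) from half-open (start, end) intervals."""
    diff = [0] * (length + 1)
    for start, end in intervals:
        diff[min(max(start, 0), length)] += 1
        diff[min(max(end, 0), length)] -= 1
    depth, running = [], 0
    for delta in diff[:length]:
        running += delta
        depth.append(running)
    return depth


# Toy records: two proper pairs on a 10 bp contig named "ctg1".
sam = [
    "r1\t99\tctg1\t1\t60\t3M\t=\t6\t8\tACG\tIII",
    "r1\t147\tctg1\t6\t60\t3M\t=\t1\t-8\tACG\tIII",
    "r2\t99\tctg1\t4\t60\t3M\t=\t8\t7\tACG\tIII",
]
intervals = [(s, e) for name, s, e in fragments_from_sam(sam) if name == "ctg1"]
print(physical_coverage(intervals, 10))
```

The list of `(rname, start, end)` tuples corresponds to the BED file described above; written out tab-separated, it could be passed to bedtools genomecov instead of the `physical_coverage` helper.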
We added physical coverage to the coverage track. @sjackman, please take a look at the attached example and feel free to check out the latest source code. We are waiting for your comments :)
alignment_viewer.html.zip
Note that we calculate physical coverage as you suggested (extracting properly paired reads and counting both the reads themselves and the gaps between the paired-end reads as covered fragments). If you find this feature useful in some cases, please send us a few examples. It would be very interesting to look at them!
Thanks! I'll test it out and get back to you. In the example I notice that there are spots where read coverage drops quickly but physical coverage is smoother. That's a good example of why scaffolding works: there are positions with poor read coverage that are still spanned by paired-end reads.