Comments (2)
For your first question, it doesn't matter whether you set the -M or -Y option. xTea re-does the alignment for the clipped reads (primary alignments only).
For the second question, you need to allow some slack when comparing breakpoints. Each insertion actually has two breakpoints, with a (usually short) target site duplication between them. Tools, including xTea, report only one breakpoint, and depending on the settings, different breakpoints may be reported for the same insertion, so you cannot require an exact match. Most of the time, the distance between two distinct insertions is much larger than the slack value you set, so the slack will not affect the comparison results much.
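A minimal sketch of such a slack-based comparison, in case it helps. The function name `match_breakpoints`, the `(chrom, pos)` tuple layout, and the default slack of 50 bp are all assumptions made for illustration; none of them come from xTea. Pick a slack that suits your data.

```python
from bisect import bisect_left


def match_breakpoints(calls_a, calls_b, slack=50):
    """Pair each call in calls_a with the nearest call in calls_b.

    calls_a / calls_b: iterables of (chrom, pos) tuples (hypothetical layout).
    slack: maximum allowed distance in bp (assumed value, tune as needed).
    Returns a list of (chrom, pos_a, pos_b) for pairs within the slack.
    """
    # Index the second call set by chromosome, with sorted positions.
    by_chrom = {}
    for chrom, pos in calls_b:
        by_chrom.setdefault(chrom, []).append(pos)
    for positions in by_chrom.values():
        positions.sort()

    matches = []
    for chrom, pos in calls_a:
        positions = by_chrom.get(chrom, [])
        i = bisect_left(positions, pos)
        # The nearest position is either just before or just at/after index i.
        candidates = positions[max(i - 1, 0):i + 1]
        if not candidates:
            continue
        nearest = min(candidates, key=lambda p: abs(p - pos))
        if abs(nearest - pos) <= slack:
            matches.append((chrom, pos, nearest))
    return matches
```

Calls in `calls_a` that have no partner within the slack are simply left out of the result, so the number of unmatched calls is `len(calls_a) - len(matches)`.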
Thanks a lot!
But as for the first question, I got different results when using bwa -Y. The commands and results are as follows.
bwa code without -Y
@PG ID:bwa PN:bwa VN:0.7.17-r1188 CL:bwa mem -M -R @RG\tID:SRR1264615\tLB:SRR1264615\tPL:Illumina\tPU:SRR1264615\tSM:SRR1264615 -t 20...
xtea result without -Y
awk '{print $1}' SRR1264615_sorted_ALU.vcf | uniq -c | tail -23
110 chr1
129 chr2
83 chr3
88 chr4
85 chr5
110 chr6
67 chr7
73 chr8
62 chr9
72 chr10
64 chr11
79 chr12
62 chr13
54 chr14
41 chr15
23 chr16
22 chr17
38 chr18
14 chr19
19 chr20
12 chr21
3 chr22
23 chrX
bwa code with -Y
@PG ID:bwa PN:bwa VN:0.7.17-r1188 CL:bwa mem -M -Y -t 20 -R @RG\tID:SRR1264615\tLB:SRR1264615\tPL:Illumina\tPU:SRR1264615\tSM:SRR1264615...
xtea result with -Y
awk '{print $1}' test_Y_SRR1264615_sorted_ALU.vcf | uniq -c | tail -23
108 chr1
126 chr2
80 chr3
82 chr4
82 chr5
108 chr6
66 chr7
72 chr8
63 chr9
72 chr10
63 chr11
72 chr12
58 chr13
52 chr14
37 chr15
22 chr16
21 chr17
39 chr18
14 chr19
18 chr20
13 chr21
4 chr22
21 chrX
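For comparing the two runs chromosome by chromosome, a small Python sketch along the lines of the awk pipeline above may be convenient. The helper names `count_per_chrom` and `diff_counts` are made up for illustration. The sketch only assumes that header lines start with `#` and that the first whitespace-separated field is the chromosome, as in the awk command.

```python
from collections import Counter


def count_per_chrom(vcf_lines):
    """Count records per chromosome, skipping '#' header lines and blanks."""
    counts = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        # First whitespace-separated field, mirroring awk's $1.
        counts[line.split(None, 1)[0]] += 1
    return counts


def diff_counts(counts_a, counts_b):
    """Return {chrom: count_b - count_a} for chromosomes whose counts differ."""
    chroms = sorted(set(counts_a) | set(counts_b))
    return {
        c: counts_b.get(c, 0) - counts_a.get(c, 0)
        for c in chroms
        if counts_a.get(c, 0) != counts_b.get(c, 0)
    }
```

Because an open file object iterates line by line, it can be passed directly, for example `count_per_chrom(open("SRR1264615_sorted_ALU.vcf"))`. Note that equal counts do not imply identical calls; matching individual breakpoints with some slack is the stricter comparison.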
I also want to extract the clipped and discordant reads from tmp/cns/temp_clip.sam and tmp/cns/temp_disc.sam, as mentioned in #36, but I could not fully understand these two files. What is the meaning of the numbers between the ~ characters, i.e., ~R~1~ and ~1~0~0~1~1~ in chr1~890330~R~1~890451~1~0 (from temp_clip.sam) and SRR1264615.423066703~1~0~0~1~1~31020097~chr12~31020266~0 (from temp_disc.sam), respectively? How can I get the complete sequence of the clipped reads? (They are incomplete in temp_clip.sam, and the read identifiers are not given.)