Comments (5)
Hi @monovich, CB2 currently supports only single-end FASTQ files. To support your paired-end read files, I need to know more about your data. Could you provide further details?
Thank you,
Hyun-Hwan Jeong
from cb2.
Certainly. I have 150 bp paired-end reads for my sgRNA abundance quantification, which is probably overkill for 23 nt guides, but it is what our sequencing core now provides by default to standardize library preps. Obviously, providing both files to CB2 doesn't work, so I will need to generate some sort of single-file input. I could just use the reads from R1 if you think that is appropriate, or I could generate an interleaved FASTQ, but I suspect CB2 would treat R1 and R2 as unpaired and simply see twice as many reads, which may skew the calculated statistics.
Here are the first 3 reads from my first sample's R1 and R2 files (as you can see, they are paired):
==> 3579-AI-1_TTGAACCG-ATAGGATC_S216_R1_001.fastq <==
@A00437:378:H3JYFDSX2:1:1101:5141:1000 1:N:0:TTGAACCG+ATAGGATC
TNGTGGAAAGGACGAAACACCGCTCCGTCCCCTCCTGCCGCGGTATAAGAGCTATGCTGGAAACAGCATAGCAAGTTTAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTTTTTGAATTCTAGATCTTGA
+
F#FFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFF:,FFFFFFFFFFFFFFFFFFFF:FFFFFF:,FFFFFFFFFFFFFF:FFFFFF,::FFFFFFFF:FFFFFFFF:FFFFF:FFF,FFFFFF:,FFFF,FFF,FFFFFFFF
@A00437:378:H3JYFDSX2:1:1101:8793:1000 1:N:0:TTGAACCG+ATAGGATC
TNGTGGAAAGGACGAAACACCGTTGGAACAAAGAAAACTCCCGTTTAAGAGCTATGCTGGAAACAGCATAGCAAGTTTAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTTTTCGAATTCTAGATCTTGA
+
F#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFF:FFFFFFFFF:FFFFFFFFFFFFFFF:FF:FFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFF
@A00437:378:H3JYFDSX2:1:1101:12897:1000 1:N:0:TTGAACCG+ATAGGATC
TNGTGGAAAGGACGAAACACCGGAACAGGCAGACACATCTCAGTTTAAGAGCTATGCTGGAAACAGCATAGCAAGTTTAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTTTTTGAATTCTAGATCTTGA
==> 3579-AI-1_TTGAACCG-ATAGGATC_S216_R2_001.fastq <==
@A00437:378:H3JYFDSX2:1:1101:5141:1000 2:N:0:TTGAACCG+ATAGGATC
TCTACTATTCTTTCCCCTGCACTGTACCCCCCAATCCCCCCTTTTCTGTTAAAATTGTGGATGAATACTGCCATTTGTCTCAAGATCTAGAATTCAAAAAAAGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACGGACTAGCCTTATT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFF:F,FFFFFFFFF:FFFFFFFF:F:FFF,FFFFF,FF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFF:FFF
@A00437:378:H3JYFDSX2:1:1101:8793:1000 2:N:0:TTGAACCG+ATAGGATC
TCTACTATTCTTTCCCCTGCACTGTACCCCCCAATCCCCCCTTTTCTTTTAAAATTGTGGATGAATACTGCCATTTGTCTCAAGATCTAGAATTCGAAAAAAGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACGGACTAGCCTTATT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,F:FFFFF,FFFF:FFFFFFFF:FFFFFFFF::FFFFFF:FFF:,FFFFFFFFFFFFFFFFFF:FFF:F:FFFFFF:FFFFFFFF,FF::F:FFFFFFFFFFFF::FFFF
@A00437:378:H3JYFDSX2:1:1101:12897:1000 2:N:0:TTGAACCG+ATAGGATC
TCTACTATTCTTTCCCCTGCACTGTACCCCCCAATCCCCCCTTTTCTTTTAAAATTGTGGATGAATACTGCCATTTGTCTCAAGATCTAGAATTCAAAAAAAGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACGGACTAGCCTTATT
In my experience, every guide sits at a fixed position in the read (although it can be slightly staggered), so I expect most guides to appear in only one of the two reads of a pair. In other words, if you see a lot of guides in R1, you may not need to use R2. Have you checked the mappability of R1 and R2 separately? That would show whether my assumption is correct. If it is not, it will be time to find a plan B.
Hyun-Hwan Jeong
That makes good sense. I just quickly ran my R1 file from above on its own through bwa and it looks like I'm getting 99% alignment, so I'll probably just go ahead and use my R1 files. Thanks for the clarification.
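For readers who want a quick version of the R1-versus-R2 check discussed above without running an aligner, here is a minimal Python sketch. It is not CB2's or bwa's method: it only counts the fraction of reads that contain an exact copy of any known guide sequence, in either orientation. The function names (`reverse_complement`, `read_sequences`, `guide_hit_fraction`) are hypothetical, and it assumes uncompressed four-line FASTQ records and a guide set you supply yourself.

```python
from typing import Iterable, Iterator, Set

# Complement table for uppercase DNA; any other character is left unchanged.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_sequences(fastq_lines: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line (2nd line of every 4) from FASTQ lines."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            yield line.strip().upper()


def guide_hit_fraction(fastq_lines: Iterable[str], guides: Set[str]) -> float:
    """Fraction of reads containing an exact match to any guide.

    Both the read and its reverse complement are scanned, so the
    orientation of the guide within the read does not matter.
    Mismatches and indels are NOT tolerated (unlike an aligner).
    """
    lengths = {len(g) for g in guides}
    total = 0
    hits = 0
    for seq in read_sequences(fastq_lines):
        total += 1
        candidates = (seq, reverse_complement(seq))
        if any(
            s[i:i + k] in guides
            for s in candidates
            for k in lengths
            for i in range(len(s) - k + 1)
        ):
            hits += 1
    return hits / total if total else 0.0
```

To compare the two files, open each one (for example `with open(r1_path) as fh:`) and pass the file handle together with the set of guide sequences from your library. A high fraction for R1 and a near-zero fraction for R2 would support using R1 alone. Because matching is exact, expect a somewhat lower number than an aligner reports.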
No problem. I think we can close the issue.