/data4/qiumin/soft/ConsensusCruncher/ConsensusCruncher/SSCS_maker.py --infile LargeMid/bamfiles/LargeMid_56_L005_R1.fastq.sorted.bam --outfile LargeMid/LargeMid_56_L005_R1.fastq.sorted/sscs/LargeMid_56_L005_R1.fastq.sorted.sscs.bam --cutoff 0.7 --bedfile /data4/qiumin/soft/ConsensusCruncher/ConsensusCruncher/hg19_cytoBand.txt
# === SSCS ===
Uncollapsed - Total reads: 19259
Uncollapsed - Unmapped reads: 15
Uncollapsed - Secondary/Supplementary reads: 23
SSCS reads: 0
Singletons: 19216
Bad spacers: 0
# QC: Total uncollapsed reads should be equivalent to mapped reads in bam file.
Total uncollapsed reads: 19259
Total mapped reads in bam file: 19564
QC: check dictionaries to see if there are any remaining reads
=== pair_dict remaining ===
HWI-D00331:196:C900FANXX:5:1101:13274:2385|CG.TG
read remaining:
HWI-D00331:196:C900FANXX:5:1101:13274:2385|CG.TG 161 7 114597515 60 123M 27 128565 123 CATATGCCATGCACATAAAATGTTATTTATATATTTATTGGTTAAATGAATTAACATTTAAATATTGGCATCGTAAGTGAATAAGTATTCAGTATCTTTGTAATCAATGGGTAACTCATGCTT array('B', [15, 25, 34, 33, 37, 16, 37, 35, 37, 38, 38, 38, 38, 38, 38, 38, 34, 37, 38, 38, 38, 38, 38, 38, 37, 38, 38, 38, 38, 38, 38, 38, 34, 37, 38, 38, 38, 38, 38, 38, 38, 37, 33, 37, 38, 38, 37, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 34, 38, 38, 36, 36, 38, 38, 37, 35, 38, 38, 38, 36, 38, 38, 31, 35, 38, 38, 38, 38, 38, 38, 38, 36, 38, 36, 35, 35, 36, 38, 38, 33, 37, 38, 38, 38, 38, 38, 35, 38, 34, 36, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 37, 38, 38]) [('NM', 0), ('MD', '123'), ('AS', 123), ('XS', 20), ('RG', '1')]
mate:
HWI-D00331:196:C900FANXX:5:1101:13274:2385|CG.TG 81 27 128565 0 123M 7 114597515 123 TTGGGAGTTGGCCTTACTGGGTATCTGTAAGAACAGGGAAAAGGACACGCACCTGGCCTGTGGTGGTTACTTCTTTCTGAATCGTGTCAGAGAACTTGGCTGCTCTGGAAGAGCCAGTTTTGT array('B', [35, 38, 38, 38, 38, 38, 38, 28, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 37, 38, 38, 38, 38, 36, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 36, 38, 38, 38, 38, 38, 37, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 36, 38, 35, 38, 38, 38, 38, 38, 37, 38, 38, 38, 38, 37, 38, 38, 38, 38, 38, 38, 38, 38, 38, 37, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 37, 38, 38, 33, 34, 34]) [('NM', 0), ('MD', '123'), ('AS', 123), ('XS', 123), ('RG', '1'), ('XA', 'chr16,-70923438,123M,0;')]
HWI-D00331:196:C900FANXX:5:1101:15070:3473|TC.GT
read remaining:
HWI-D00331:196:C900FANXX:5:1101:15070:3473|TC.GT 129 4 49985033 60 123M 27 359773 123 GAAGTCCAATTTATCTTCTTTTTTTAAAATCTGTGCCTCATCTGCAAATATTACCAATCGAAAGTCATGAAATTTTTCCCCTAAGATTTTATAGTTTTAGCGCTTACGTTTGGGTCTTTGATC array('B', [33, 33, 38, 38, 38, 37, 38, 38, 38, 38, 38, 37, 34, 38, 36, 37, 38, 36, 38, 38, 38, 38, 38, 38, 38, 38, 38, 37, 36, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 36, 38, 38, 37, 36, 38, 38, 36, 38, 38, 37, 36, 37, 38, 38, 38, 38, 38, 38, 38, 38, 34, 38, 38, 38, 38, 38, 38, 36, 31, 31, 29, 33, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 36, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 33, 35, 38, 38, 38, 38, 38, 38, 38, 38, 38, 15]) [('NM', 0), ('MD', '123'), ('AS', 123), ('XS', 22), ('RG', '1')]
mate:
HWI-D00331:196:C900FANXX:5:1101:15070:3473|TC.GT 65 27 359773 2 123M 4 49985033 123 GACACCACACTTCATGCTCTGGGTGCCTGGTAACCTGAGTTTACCACTTGGAGGAGGTCACTACCTAAAATGTCGCAGTAAATGGTCTGTTGATAGAGCTTGGCTTCTAGTGGGTTAAAGTAC array('B', [34, 34, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 34, 38, 38, 38, 38, 38, 38, 38, 37, 38, 38, 38, 38, 38, 38, 38, 37, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 36, 38, 38, 38, 38, 38, 36, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 36, 38, 37, 38, 37, 37, 38, 36, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 35, 38, 38, 38, 38, 38, 36, 34]) [('NM', 0), ('MD', '123'), ('AS', 123), ('XS', 121), ('RG', '1'), ('XA', 'chr16,+71149535,123M,1;')]
HWI-D00331:196:C900FANXX:5:1101:18338:2852|GT.CC
read remaining:
HWI-D00331:196:C900FANXX:5:1101:18338:2852|GT.CC 97 6 140573779 0 123M 60 145190 123 TGGACGCCAACGACAACTCGCCCTTCGTGCTGTACCCGCTGCAGAACGGCTCCGCGCCCTGCACCGAGCTGGTGCCCCGGGCGGCCGAGCCGGGCTACCTGGTGACCAAGGTGGTGGCGGTGG array('B', [33, 33, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 37, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 26, 35, 38, 38, 36, 38, 38, 31, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 28, 38, 38, 38, 31, 35, 38, 13, 35, 35, 38, 38, 38, 38, 38, 38]) [('NM', 0), ('MD', '123'), ('AS', 123), ('XS', 123), ('RG', '1')]
mate:
HWI-D00331:196:C900FANXX:5:1101:18338:2852|GT.CC 145 60 145190 3 21S80M22S 6 140573779 80 TGGCCAAGCACAGGCTAGTGTTGGGTGATCAATGCAGAAATATGTCACAATGCCCCCTTAGGCAGAGCCTAGACAAAAGCCCCATCACCTGGATGATCAGTACAGGGTTATGTCAAAAAGTTA array('B', [34, 38, 38, 35, 36, 38, 38, 38, 36, 37, 38, 38, 38, 38, 37, 38, 38, 37, 37, 38, 38, 38, 38, 38, 38, 35, 29, 38, 38, 38, 38, 36, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 31, 33, 37, 34, 38, 38, 38, 38, 38, 38, 35, 37, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 36, 38, 38, 38, 38, 38, 36, 38, 38, 38, 38, 38, 38, 38, 35, 38, 38, 38, 38, 37, 29, 35, 37, 34, 35, 36, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 37, 38, 38, 38, 38, 34, 38, 38, 38, 38, 38, 37, 38, 38, 37, 38, 38, 38, 38, 38, 32, 33]) [('NM', 1), ('MD', '10G69'), ('AS', 75), ('XS', 72), ('RG', '1'), ('XA', 'chrUn_gl000216,-154629,24S77M22S,1;chrUn_gl000225,+92868,2S89M32S,4;chrUn_gl000225,+117160,22S77M24S,2;chrUn_gl000225,-21045,23S92M8S,6;')]
HWI-D00331:196:C900FANXX:5:1101:2357:3051|TG.CA
read remaining:
HWI-D00331:196:C900FANXX:5:1101:2357:3051|TG.CA 113 20 40392124 0 123M 64 130694 123 CTGAGAGCCGAGCAGCCCAGGGAGCAGGTGTCCGCACAGAGCTCGTAGTGACTGTTCTGAGGGCATTCCATGGCTGCAAGGAGGGGGTGCCGATCAGAGCCCTGGGGAGGGAGGGGCTGCAAG array('B', [38, 38, 36, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 34, 35, 38, 33, 37, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 34, 38, 34, 38, 33, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 37, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 34, 34]) [('NM', 0), ('MD', '123'), ('AS', 123), ('XS', 122), ('RG', '1'), ('XA', 'chr19,-40376440,123M,1;')]
mate:
HWI-D00331:196:C900FANXX:5:1101:2357:3051|TG.CA 177 64 130694 8 123M 20 40392124 123 ACACGTCACCCATAAGTGTGTGTTCCCGTGAGGAGAGATTTCTAAGAAATGGCACTGTACACTGAACGCAGTGGCTCACGTCTGTCATCCCGAGGTCAGGAGTTCGAGACCAGCCCGGCCAAC array('B', [2, 34, 31, 38, 38, 38, 38, 38, 38, 38, 28, 38, 37, 37, 36, 33, 38, 38, 36, 38, 38, 38, 38, 27, 38, 38, 38, 38, 38, 38, 38, 38, 38, 36, 38, 37, 29, 38, 38, 38, 37, 38, 38, 38, 38, 38, 38, 38, 37, 38, 38, 37, 38, 38, 38, 36, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 34, 38, 38, 34, 37, 35, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 34, 37, 31, 37, 38, 34, 37, 38, 38, 31, 34, 37, 37, 29, 38, 38, 38, 36, 38, 38, 38, 38, 38, 38, 38, 38, 38, 33, 31]) [('NM', 2), ('MD', '35T16T70'), ('AS', 113), ('XS', 108), ('RG', '1'), ('XA', 'chrUn_gl000220,-126244,123M,3;')]
HWI-D00331:196:C900FANXX:5:1101:1040:2069|TC.AC
read remaining:
HWI-D00331:196:C900FANXX:5:1101:1040:2069|TC.AC 129 20 41630129 17 36S18M1D28M41S 64 129912 46 CTCCCCCCCCCCCCCCCCCCCCTTTCCCCCCTTTTTTTCTTTTTTTTTTTTTCTTTCCCCCCCCCCTTTTTTTTTTTTTTTTCTTTATTACGTAAGAAATATTGGAGTTGGATGAAATTTTTG array('B', [33, 33, 16, 31, 29, 30, 36, 36, 38, 38, 33, 38, 29, 14, 27, 27, 34, 36, 38, 38, 29, 14, 14, 15, 15, 15, 15, 15, 15, 14, 14, 14, 15, 15, 15, 15, 14, 14, 15, 15, 15, 24, 15, 15, 14, 14, 14, 14, 14, 14, 14, 14, 15, 24, 15, 15, 15, 15, 15, 25, 25, 34, 29, 25, 34, 14, 14, 15, 15, 26, 31, 38, 38, 38, 33, 38, 31, 32, 26, 32, 38, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]) [('NM', 2), ('MD', '15C2^C28'), ('AS', 34), ('XS', 28), ('RG', '1'), ('XA', 'chr14,+55382840,28S28M67S,0;chr2,+61459994,54S28M41S,0;chr11,+76344684,58S28M37S,0;chr3,-59180857,41S28M54S,0;')]
mate:
HWI-D00331:196:C900FANXX:5:1101:1040:2069|TC.AC 65 64 129912 60 123M 20 41630129 123 TGAACACCCCCGTCACAAGTTTACCTATGTCACAGTCTTGCTCATGTATGCTTGAACGACNAATAAAAGTTCGGGGGGGNGAAGAGAGGAGAGAGAGAGAGAGACGGGGAGAGAGGGGGGAGG array('B', [33, 33, 38, 38, 38, 38, 38, 38, 38, 37, 36, 38, 38, 38, 34, 38, 38, 38, 38, 38, 38, 38, 37, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 34, 38, 38, 38, 36, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 2, 28, 15, 27, 37, 38, 38, 38, 38, 38, 38, 38, 37, 38, 38, 38, 38, 38, 38, 2, 25, 14, 25, 34, 31, 36, 33, 35, 34, 24, 34, 36, 38, 33, 38, 38, 38, 36, 38, 27, 35, 38, 38, 38, 27, 38, 38, 31, 38, 23, 34, 34, 38, 13, 25, 36, 38, 38, 26, 35, 13, 23, 34]) [('NM', 3), ('MD', '60A18A42A0'), ('AS', 118), ('XS', 77), ('RG', '1')]
=== read_dict remaining ===
=== csn_pair_dict remaining ===
Traceback (most recent call last):
File "/data4/qiumin/soft/ConsensusCruncher/ConsensusCruncher/SSCS_maker.py", line 391, in <module>
main()
File "/data4/qiumin/soft/ConsensusCruncher/ConsensusCruncher/SSCS_maker.py", line 372, in main
plt.bar(list(tags_per_fam), read_fraction)
File "/data4/qiumin/soft/ConsensusCruncher/dep/lib/python3.5/site-packages/matplotlib/pyplot.py", line 2639, in bar
ax = gca()
File "/data4/qiumin/soft/ConsensusCruncher/dep/lib/python3.5/site-packages/matplotlib/pyplot.py", line 935, in gca
return gcf().gca(**kwargs)
File "/data4/qiumin/soft/ConsensusCruncher/dep/lib/python3.5/site-packages/matplotlib/pyplot.py", line 585, in gcf
return figure()
File "/data4/qiumin/soft/ConsensusCruncher/dep/lib/python3.5/site-packages/matplotlib/pyplot.py", line 534, in figure
**kwargs)
File "/data4/qiumin/soft/ConsensusCruncher/dep/lib/python3.5/site-packages/matplotlib/backends/backend_qt4agg.py", line 46, in new_figure_manager
return new_figure_manager_given_figure(num, thisFig)
File "/data4/qiumin/soft/ConsensusCruncher/dep/lib/python3.5/site-packages/matplotlib/backends/backend_qt4agg.py", line 53, in new_figure_manager_given_figure
canvas = FigureCanvasQTAgg(figure)
File "/data4/qiumin/soft/ConsensusCruncher/dep/lib/python3.5/site-packages/matplotlib/backends/backend_qt4agg.py", line 76, in __init__
FigureCanvasQT.__init__(self, figure)
File "/data4/qiumin/soft/ConsensusCruncher/dep/lib/python3.5/site-packages/matplotlib/backends/backend_qt4.py", line 63, in __init__
_create_qApp()
File "/data4/qiumin/soft/ConsensusCruncher/dep/lib/python3.5/site-packages/matplotlib/backends/backend_qt5.py", line 136, in _create_qApp
raise RuntimeError('Invalid DISPLAY variable')
RuntimeError: Invalid DISPLAY variable
/data4/qiumin/soft/ConsensusCruncher/ConsensusCruncher/DCS_maker.py --infile LargeMid/LargeMid_56_L005_R1.fastq.sorted/sscs/LargeMid_56_L005_R1.fastq.sorted.sscs.sorted.bam --outfile LargeMid/LargeMid_56_L005_R1.fastq.sorted/dcs/LargeMid_56_L005_R1.fastq.sorted.dcs.bam --bedfile /data4/qiumin/soft/ConsensusCruncher/ConsensusCruncher/hg19_cytoBand.txt
# === DCS ===
SSCS - Total reads: 0
SSCS - Unmapped reads: 0
SSCS - Secondary/Supplementary reads: 0
DCS reads: 0
SSCS singletons: 0
/data4/qiumin/soft/ConsensusCruncher/ConsensusCruncher/singleton_correction.py --singleton LargeMid/LargeMid_56_L005_R1.fastq.sorted/sscs/LargeMid_56_L005_R1.fastq.sorted.singleton.sorted.bam --bedfile /data4/qiumin/soft/ConsensusCruncher/ConsensusCruncher/hg19_cytoBand.txt
# === Singleton Correction ===
Total singletons: 19216
Singleton Correction by SSCS: 0
% Singleton Correction by SSCS: 0.0
Singleton Correction by Singletons: 4
% Singleton Correction by Singletons : 0.020815986677768527
Uncorrected Singletons: 19212
0.009413317839304606
/data4/qiumin/soft/ConsensusCruncher/dep/bin/samtools merge LargeMid/LargeMid_56_L005_R1.fastq.sorted/sscs_sc/LargeMid_56_L005_R1.fastq.sorted.sscs.sc.bam LargeMid/LargeMid_56_L005_R1.fastq.sorted/sscs/LargeMid_56_L005_R1.fastq.sorted.sscs.sorted.bam LargeMid/LargeMid_56_L005_R1.fastq.sorted/sscs_sc/LargeMid_56_L005_R1.fastq.sorted.sscs.correction.sorted.bam LargeMid/LargeMid_56_L005_R1.fastq.sorted/sscs_sc/LargeMid_56_L005_R1.fastq.sorted.singleton.correction.sorted.bam
/data4/qiumin/soft/ConsensusCruncher/ConsensusCruncher/DCS_maker.py --infile LargeMid/LargeMid_56_L005_R1.fastq.sorted/sscs_sc/LargeMid_56_L005_R1.fastq.sorted.sscs.sc.sorted.bam --outfile LargeMid/LargeMid_56_L005_R1.fastq.sorted/dcs_sc/LargeMid_56_L005_R1.fastq.sorted.dcs.sc.bam --bedfile /data4/qiumin/soft/ConsensusCruncher/ConsensusCruncher/hg19_cytoBand.txt
# === DCS - Singleton Correction ===
SSCS SC - Total reads: 4
SSCS SC - Unmapped reads: 0
SSCS SC - Secondary/Supplementary reads: 0
DCS SC reads: 2
SSCS SC singletons: 0
LargeMid/LargeMid_56_L005_R1.fastq.sorted/dcs_sc/LargeMid_56_L005_R1.fastq.sorted.all.unique.dcs.bam
Traceback (most recent call last):
File "/data4/qiumin/soft/ConsensusCruncher/ConsensusCruncher.py", line 473, in <module>
args.func(args)
File "/data4/qiumin/soft/ConsensusCruncher/ConsensusCruncher.py", line 293, in consensus
'{}/{}_tag_fam_size.png'.format(sample_dir, identifier))
FileNotFoundError: [Errno 2] No such file or directory: 'LargeMid/LargeMid_56_L005_R1.fastq.sorted/sscs/LargeMid_56_L005_R1.fastq.sorted_tag_fam_size.png' -> 'LargeMid/LargeMid_56_L005_R1.fastq.sorted/LargeMid_56_L005_R1.fastq.sorted_tag_fam_size.png'`
So, are these errors acceptable? I also found that in these 2 files only a very small fraction of the reads was corrected by the singleton method (4 of 19216 singletons in the log above, about 0.02%), but according to your paper, the corrected reads should be more than 30%. Am I doing something wrong?
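For context, the first traceback (`RuntimeError: Invalid DISPLAY variable`) looks like matplotlib selecting a Qt backend on a machine without an X display. The later `FileNotFoundError` for `_tag_fam_size.png` is probably a consequence, since the plot would never have been saved. Below is a minimal sketch of one possible workaround, assuming the installed matplotlib honours the `MPLBACKEND` environment variable: the pipeline is launched from a small wrapper that forces the non-interactive `Agg` backend. The helper names `headless_env` and `run_headless` are hypothetical and are not part of ConsensusCruncher.

```python
import os
import subprocess


def headless_env(base=None):
    """Return a copy of the environment with matplotlib forced to the Agg backend.

    Hypothetical helper: `base` defaults to the current process environment.
    """
    env = dict(os.environ if base is None else base)
    # matplotlib reads MPLBACKEND when it is imported; Agg renders to files
    # and does not need an X server or a valid DISPLAY variable.
    env["MPLBACKEND"] = "Agg"
    return env


def run_headless(cmd):
    """Run `cmd` (a list of arguments) in a child process with the headless environment."""
    return subprocess.run(cmd, env=headless_env(), check=False)
```

It could be used as `run_headless(["python", "ConsensusCruncher.py", ...])`, with the real arguments in place of the ellipsis. An equivalent change inside the script would be to call `matplotlib.use("Agg")` before `matplotlib.pyplot` is imported. Either way this only addresses the plotting error, not the low singleton correction rate.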