MetaSV: An accurate and integrative structural-variant caller for next generation sequencing
See http://bioinform.github.io/metasv/ for help and downloads.
Home Page: http://bioinform.github.io/metasv/
License: BSD 2-Clause "Simplified" License
Older timeout command versions may not have the "-k" option.
I was wondering if MetaSV does "joint genotyping" of multiple samples, analogous to what GATK does for SNPs?
Does MetaSV do breakpoint resolution? For example, if I have a population of samples with structural variants in the same region, can MetaSV use that information to try to figure out the boundaries of the structural variant?
Can the genotyping be done separately from the variant calling/discovery?
Thank you,
Luz
Hello,
I've run MetaSV and received the following message:
ERROR 2016-02-23 02:20:01,971 run_spades_single-<Process(PoolWorker-15, started daemon)> Caught exception in worker thread
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/metasv/run_spades.py", line 75, in run_spades_single
retcode = cmd.run(cmd_log_fd_out=spades_log_fd, timeout=timeout)
File "/usr/local/lib/python2.7/dist-packages/metasv/external_cmd.py", line 20, in run
self.p = subprocess.Popen(self.cmd, stderr=cmd_log_fd_err, stdout=cmd_log_fd_out)
File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1223, in _execute_child
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
()
INFO 2016-02-23 02:20:07,528 metasv.run_spades Merging the contigs from []
I am running on a machine with 96 GB of RAM, so memory shouldn't be an issue here. Monitoring RAM while the script is running, it seems to use about 55 GB. Do you have any idea what the issue might be?
Thanks again for your assistance,
Madeline
Hello,
I ran:
python2.7 run_metasv.py --version
run_metasv.py 0.5
module load bedtools/2.24.0
module load gcc/4.8.4
PATH=$PATH:/bcbiometasv/miniconda/bin
PYTHONPATH="${PYTHONPATH}:/bcbiometasv/miniconda:/bcbiometasv/miniconda/lib/python2.7/site-packages"
python2.7 run_metasv.py --reference hg19_chromosome.fa --boost_sc
--age /bcbiometasv/miniconda/bin/AGE-master/age_align
--pindel_vcf 5.realigned.pindelx5_1toY.N0_PTonly_LI.filtered.somatic.142.recode.vcf 6.realigned.pindelx5_1toY_N0.PTonly_TD.filtered.somatic.142.recode.vcf 7.realigned.pindelx5_1toY_N0.PTonly_D.filtered.somatic.142.recode.vcf 8.realigned.pindelx5_1toY_N0.PTonly_INV.filtered.somatic.142.recode.vcf 9.realigned.pindelx5_1toY_N0.PTonly_SI.filtered.somatic.142.recode.vcf
--cnvnator_vcf 4.PTonly.NTrealign.root.cnvnator.N0.filtered.somatic.142.recode.vcf
--lumpy_vcf 3.tumor.gt.lumpy.svtyper.PRECISE.N0.PTonly.filtered.somatic.142.recode.vcf --manta_vcf 1.somaticSV_manta.PASS.N0only.PTonly.filtered.somatic.142.recode.vcf
--breakdancer_native 2.breakdancer.cfg.LIBTN.a.TumorOnly.noCTXITX.somatic.manEdit.out
--sample filter.somatic
--bam Clean3_mergedL7L8_hg19_kmer_q15_TrimN_N0_L70.recal_sort2_dedup2.realigned2.NTrealign.bam
--spades /bcbiometasv/miniconda/bin/SPAdes-3.6.0/bin/spades.py
--spades_options '-k 71'
--num_threads 4
--workdir /bcbiometasv/miniconda/bin/UP53input
--outdir out_somatic --min_support_ins 2 --max_ins_intervals 1000000
--mean_read_length 146 --isize_mean 365 --isize_sd 104
The run reaches the last step. I can see variant.vcf, but it contains only a header, and I also see the following error. Can you suggest a workaround?
INFO 2016-02-28 23:00:57,715 genotype_interval-<Process(PoolWorker-16, started daemon)> For interval chrY:22260563-22301084 DEL counts are 36, 142 and normal_frac is 0.253521 gt is 0/1
INFO 2016-02-28 23:00:58,091 genotype_interval-<Process(PoolWorker-16, started daemon)> For interval chrY:28792949-28793380 DEL counts are 296, 4088 and normal_frac is 0.072407 gt is 0/1
INFO 2016-02-28 23:00:58,245 genotype_interval-<Process(PoolWorker-16, started daemon)> For interval chrY:28805583-28814110 DEL counts are 229, 3217 and normal_frac is 0.0711843 gt is 0/1
INFO 2016-02-28 23:00:58,700 genotype_intervals-<Process(PoolWorker-16, started daemon)> Genotyped 351 intervals in 1.00553 minutes
INFO 2016-02-28 23:00:58,790 parallel_genotype_intervals-<_MainProcess(MainProcess, started)> Following BED files will be merged: ['/scratch/RDS-SMS-PCaGenomes-RW/weejar/bcbiometasv/miniconda/bin/UP53input/genotyping/0/genotyped.bed', '/scratch/RDS-SMS-PCaGenomes-RW/weejar/bcbiometasv/miniconda/bin/UP53input/genotyping/2/genotyped.bed', '/scratch/RDS-SMS-PCaGenomes-RW/weejar/bcbiometasv/miniconda/bin/UP53input/genotyping/1/genotyped.bed', '/scratch/RDS-SMS-PCaGenomes-RW/weejar/bcbiometasv/miniconda/bin/UP53input/genotyping/3/genotyped.bed']
INFO 2016-02-28 23:00:58,878 parallel_genotype_intervals-<_MainProcess(MainProcess, started)> Finished parallel genotyping of 1410 intervals in 1.08642 minutes
INFO 2016-02-28 23:00:58,882 metasv.main Output final VCF file
Traceback (most recent call last):
File "run_metasv.py", line 5, in <module>
pkg_resources.run_script('MetaSV==0.5', 'run_metasv.py')
File "/usr/local/python/2.7.9/lib/python2.7/site-packages/distribute-0.6.28-py2.7.egg/pkg_resources.py", line 499, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/local/python/2.7.9/lib/python2.7/site-packages/distribute-0.6.28-py2.7.egg/pkg_resources.py", line 1239, in run_script
execfile(script_filename, namespace, namespace)
File "/scratch/RDS-SMS-PCaGenomes-RW/weejar/bcbiometasv/miniconda/lib/python2.7/site-packages/MetaSV-0.5-py2.7.egg/EGG-INFO/scripts/run_metasv.py", line 142, in <module>
sys.exit(run_metasv(args))
File "/scratch/RDS-SMS-PCaGenomes-RW/weejar/bcbiometasv/miniconda/lib/python2.7/site-packages/MetaSV-0.5-py2.7.egg/metasv/main.py", line 335, in run_metasv
convert_metasv_bed_to_vcf(bedfile=genotyped_bed, vcf_out=final_vcf, workdir=args.workdir, sample=args.sample, reference=args.reference, pass_calls=False)
File "/scratch/RDS-SMS-PCaGenomes-RW/weejar/bcbiometasv/miniconda/lib/python2.7/site-packages/MetaSV-0.5-py2.7.egg/metasv/generate_final_vcf.py", line 476, in convert_metasv_bed_to_vcf
vcf_writer = vcf.Writer(open(vcf_out, "w"), vcf_template_reader)
File "/scratch/RDS-SMS-PCaGenomes-RW/weejar/bcbiometasv/miniconda/lib/python2.7/site-packages/vcf/parser.py", line 673, in __init__
if line.length:
AttributeError: 'tuple' object has no attribute 'length'
Sorry to bother you with so many questions. Thank you for your time in helping my research.
James
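The genotyping log lines in this report can be checked by hand: in each of the three quoted lines, the reported normal_frac equals the first DEL count divided by the second (for example, 36 / 142 is about 0.253521). Below is a minimal Python sketch of that check. The regular expression and the helper name parse_genotype_line are illustrative and are not part of MetaSV; the sketch also says nothing about how MetaSV maps normal_frac to a genotype.

```python
import re

# Pattern for the "genotype_interval" INFO lines quoted in this report.
# Both the pattern and the helper below are illustrative, not MetaSV code.
LINE_RE = re.compile(
    r"For interval (?P<interval>\S+) (?P<svtype>\w+) counts are "
    r"(?P<first>\d+), (?P<second>\d+) and normal_frac is "
    r"(?P<frac>[\d.eE+-]+) gt is (?P<gt>\S+)"
)


def parse_genotype_line(line):
    """Return the parsed fields plus a recomputed ratio, or None if no match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    first = int(match.group("first"))
    second = int(match.group("second"))
    return {
        "interval": match.group("interval"),
        "svtype": match.group("svtype"),
        "first": first,
        "second": second,
        "reported": float(match.group("frac")),
        # Ratio of the two logged counts; guarded against a zero denominator.
        "recomputed": first / float(second) if second else None,
        "gt": match.group("gt"),
    }


if __name__ == "__main__":
    line = ("For interval chrY:28792949-28793380 DEL counts are 296, 4088 "
            "and normal_frac is 0.072407 gt is 0/1")
    print(parse_genotype_line(line))
```

Running the helper over a whole log makes it easy to list the intervals whose ratio sits close to whatever threshold is in use.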
Hello,
I'm using Lumpy, BreakSeq, BreakDancer, Pindel and CNVnator to look for CNVs in FASTQ data obtained by WES (I know these are not the tools best suited to WES).
I would like to know what impact the --filter_gaps option has on merging files. How does it work?
Hi,
I checked the output from MetaSV and found that some regions are reported as both a duplication and a deletion. What causes this?
Scaffold_12628_Chr2 93711 . G . PASS CIEND=-10,10;END=102822;SVLEN=9110;SVTYPE=DUP;CIPOS=-10,10;HOMLEN=2;HOMSEQ=TG;SOURCES=Scaffold_12628_Chr2-93711-Scaffold_12628_Chr2-102819-9108-Manta,Scaffold_12628_Chr2-93711-Scaffold_12628_Chr2-102821-9110-Lumpy,Scaffold_12628_Chr2-93712-Scaffold_12628_Chr2-102822-9110-WHAM;NUM_SVMETHODS=6;NUM_SVTOOLS=3;SVMETHOD=RP,RP,RP,SR,SR,SR;A=15;CF=1.0;CIEND95=0,0;CIPOS95=0,0;CW=0.0,1.0,0.0,0.0,0.0;D=0;DI=0.0;EV=8;I=0;PE=0;SR=7;SS=0;STRANDS=-+:7;SU=7;T=0;TAGS=2931_t4;TF=0;U=15;V=0 GT 0/1
Scaffold_12628_Chr2 96164 . T . PASS CIEND=-6,5;END=98928;IMPRECISE;SVLEN=-2764;SVTYPE=DEL;CIPOS=-6,5;SOURCES=Scaffold_12628_Chr2-96164-Scaffold_12628_Chr2-98928-2764-Manta,Scaffold_12628_Chr2-96190-Scaffold_12628_Chr2-98902-2712-Lumpy;NUM_SVMETHODS=4;NUM_SVTOOLS=2;SVMETHOD=RP,RP,SR,SR;CIEND95=0,0;CIPOS95=0,0;PE=0;SR=6;STRANDS=+-:6;SU=6 GT 0/1
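To make the relationship between the two records above concrete, here is a small Python sketch that reads POS, END and SVTYPE from each record and classifies how the intervals relate. It shows only that the DEL interval (96164 to 98928) lies inside the DUP interval (93711 to 102822); it does not explain why both calls were made. The helper names are illustrative and not part of MetaSV.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flags without '=' map to True."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def sv_interval(chrom, pos, info):
    """Return (chrom, start, end, svtype) for one SV record."""
    fields = parse_info(info)
    return chrom, int(pos), int(fields["END"]), fields["SVTYPE"]


def relation(a, b):
    """Classify two (chrom, start, end, svtype) intervals."""
    if a[0] != b[0] or a[2] < b[1] or b[2] < a[1]:
        return "disjoint"
    if a[1] <= b[1] and b[2] <= a[2]:
        return "b inside a"
    if b[1] <= a[1] and a[2] <= b[2]:
        return "a inside b"
    return "partial overlap"


if __name__ == "__main__":
    # INFO strings shortened to the fields this sketch actually uses.
    dup = sv_interval("Scaffold_12628_Chr2", 93711,
                      "CIEND=-10,10;END=102822;SVLEN=9110;SVTYPE=DUP")
    dele = sv_interval("Scaffold_12628_Chr2", 96164,
                       "CIEND=-6,5;END=98928;IMPRECISE;SVLEN=-2764;SVTYPE=DEL")
    print(relation(dup, dele))
```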
I'm using MetaSV to merge the output of Pindel, CNVnator and BreakDancer, but I get the error "OSError: [Errno 12] Cannot allocate memory". I would like to know whether MetaSV reads the entire BAM files into memory when merging the SVs.
Here is my command:
run_metasv.py --reference /public/home/ylma/genome/Sus_scrofa/Sus_scrofa.Sscrofa10.2.dna.toplevel.fa
--breakdancer_native rc.out
--cnvnator_native rc.cnv
--pindel_native rc_D rc_LI rc_SI rc_TD rc_INV
--sample BMX --bams SAMN02298127.02.bam SAMN02298128.02.bam SAMN02298129.02.bam SAMN02298130.02.bam SAMN02298131.02.bam SAMN02298132.02.bam
--spades /public/home/ylma/tools/SPAdes-3.10.1-Linux/bin/spades.py
--age /public/home/ylma/tools/AGE/age_align --num_threads 15 --workdir work --outdir out
--max_ins_intervals 500000 --isize_mean 500 --isize_sd 150
When I run metaSV, I get the following error: Reference file hg19_reference/hg19_multifasta is not indexed
What kind of indexing does the reference fasta file need?
Thank you,
Madeline
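A quick way to check a reference before a run is to look for a samtools-style index next to the FASTA. The Python sketch below assumes that "indexed" means a .fai file created by samtools faidx and stored as the FASTA path plus ".fai"; that is an assumption here, based on the .fai file another report in this tracker regenerates with samtools faidx. The helper names are illustrative and not part of MetaSV.

```python
import os


def fasta_index_path(fasta_path):
    """samtools faidx writes the index next to the FASTA as '<fasta>.fai'."""
    return fasta_path + ".fai"


def has_faidx(fasta_path):
    """True if both the FASTA and its .fai index exist as regular files."""
    return (os.path.isfile(fasta_path)
            and os.path.isfile(fasta_index_path(fasta_path)))


def read_fai_contigs(fai_path):
    """Return [(contig_name, length)] from the first two .fai columns."""
    contigs = []
    with open(fai_path) as handle:
        for line in handle:
            if not line.strip():
                continue
            name, length = line.rstrip("\n").split("\t")[:2]
            contigs.append((name, int(length)))
    return contigs


if __name__ == "__main__":
    # The path is the one from the error message above, used as an example.
    reference = "hg19_reference/hg19_multifasta"
    if has_faidx(reference):
        print(read_fai_contigs(fasta_index_path(reference)))
    else:
        print("No index found; try: samtools faidx " + reference)
```

Note that the path in the error message has no .fa extension, so it is worth confirming that the path given to --reference matches the file name on disk exactly.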
I am running MetaSV with local assembly for duplication variants. My input contains only 5 duplication variants, and 4 of them were skipped due to small size, so only 1 duplication will be processed for local assembly. I am wondering how much time this process will take. My job has been running for over a day with 2 threads and 12G of memory per thread. Is there any way to speed up this process? For other variant types (DEL, insertion, duplication), the same settings finish within a few hours for all variants from one chromosome. Thanks, Justin
My input parameters: --svs_to_assemble DUP --svs_to_softclip DUP
This is how far the run has progressed, according to the log output:
INFO 2017-02-14 17:02:34,915 metasv.sv_interval Loading SV intervals from /work/s167568/MGRAK_2016_10_17_WGS14_1507_0_MetaSV/MantaBreakdancer_metaSV/test_DUP.vcf
WARNING 2017-02-14 17:02:34,923 metasv.sv_interval Skipping Record(CHROM=1, POS=821604, REF=T, ALT=[DUP:TANDEM]) due to small size
WARNING 2017-02-14 17:02:34,923 metasv.sv_interval Skipping Record(CHROM=1, POS=2324462, REF=G, ALT=[DUP:TANDEM]) due to small size
WARNING 2017-02-14 17:02:34,924 metasv.sv_interval Skipping Record(CHROM=1, POS=3714245, REF=T, ALT=[DUP:TANDEM]) due to small size
WARNING 2017-02-14 17:02:34,924 metasv.sv_interval Skipping Record(CHROM=1, POS=4789624, REF=T, ALT=[DUP:TANDEM]) due to small size
INFO 2017-02-14 17:02:34,924 metasv.main SV types are set(['DUP'])
INFO 2017-02-14 17:02:34,924 metasv.main Output per-tool VCFs
INFO 2017-02-14 17:02:34,925 metasv.main Outputting single tool VCF for Manta
INFO 2017-02-14 17:02:34,976 metasv.main Indexing single tool VCF for Manta
INFO 2017-02-14 17:02:35,050 metasv.main Do merging
INFO 2017-02-14 17:02:35,050 metasv.main Processing SVs of type DUP
INFO 2017-02-14 17:02:35,050 metasv.main Intra-tool Merging SVs of type DUP
INFO 2017-02-14 17:02:35,050 metasv.main First level merging for DUP for tool Manta
INFO 2017-02-14 17:02:35,050 metasv.main Inter-tool Merging SVs of type DUP
INFO 2017-02-14 17:02:35,051 metasv.main Output merged VCF without assembly
INFO 2017-02-14 17:02:35,103 metasv.main ('DUP', 'LowQual', 'IMPRECISE', ('Manta',)):1
INFO 2017-02-14 17:02:35,103 metasv.main Running assembly
INFO 2017-02-14 17:02:35,103 metasv.main Creating directory /work/s167568/MGRAK_2016_10_17_WGS14_1507_0_MetaSV/MantaBreakdancer_metaSV/metasv_work_test5DUP/spades
INFO 2017-02-14 17:02:35,111 metasv.main Creating directory /work/s167568/MGRAK_2016_10_17_WGS14_1507_0_MetaSV/MantaBreakdancer_metaSV/metasv_work_test5DUP/age
INFO 2017-02-14 17:02:35,122 metasv.main Generating Soft-Clipping intervals.
INFO 2017-02-14 17:02:35,122 parallel_generate_sc_intervals-<_MainProcess(MainProcess, started)> SVs to soft-clip: set(['DUP', 'INV', 'DEL', 'INS'])
INFO 2017-02-14 17:02:35,315 get_bp_intervals-<_MainProcess(MainProcess, started)> 2 total candidate bp intervals in other methods
INFO 2017-02-14 17:02:35,325 generate_sc_intervals-<Process(PoolWorker-1, started daemon)> Generating candidate intervals from /work/s167568/MGRAK_2016_10_17_WGS14_1507_0_MetaSV/input/HCC4017_Clone4.DupsMarked_RG.bam for chromsome 1
INFO 2017-02-14 17:27:36,793 generate_sc_intervals-<Process(PoolWorker-1, started daemon)> 6949907 candidate reads
INFO 2017-02-14 17:28:07,973 generate_sc_intervals-<Process(PoolWorker-1, started daemon)> 574885 candidate NONE reads
INFO 2017-02-14 17:28:07,974 generate_sc_intervals-<Process(PoolWorker-1, started daemon)> Gather intervals from breakpoints in other methods
INFO 2017-02-14 17:28:12,076 generate_sc_intervals-<Process(PoolWorker-1, started daemon)> 574885 bps in other methods
INFO 2017-02-14 17:44:31,879 resolve_none_svs-<Process(PoolWorker-1, started daemon)> 127 unresolved intervals
INFO 2017-02-14 17:44:33,931 resolve_none_svs-<Process(PoolWorker-1, started daemon)> 94 merged unresolved intervals
INFO 2017-02-14 17:44:34,789 resolve_none_svs-<Process(PoolWorker-1, started daemon)> 94 filtered unresolved intervals
INFO 2017-02-14 17:44:34,935 resolve_none_svs-<Process(PoolWorker-1, started daemon)> 79 coverage filtered unresolved intervals
INFO 2017-02-14 17:44:36,884 resolve_none_svs-<Process(PoolWorker-1, started daemon)> 58 coverage filtered unresolved intervals
INFO 2017-02-14 17:57:45,636 generate_sc_intervals-<Process(PoolWorker-1, started daemon)> 179755 merged intervals with left bp support
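The timestamps in the log above already show where the time goes: about 25 minutes pass between "Generating candidate intervals" (17:02:35) and "6949907 candidate reads" (17:27:36). The following Python sketch computes the gap between consecutive log lines so the slow steps stand out. It assumes each line starts with a level, a date and a time in the format shown above; the helper names are illustrative and not part of MetaSV.

```python
from datetime import datetime


def log_time(line):
    """Parse the timestamp of a line like 'INFO 2017-02-14 17:02:35,325 ...'."""
    parts = line.split(None, 3)
    return datetime.strptime(parts[1] + " " + parts[2],
                             "%Y-%m-%d %H:%M:%S,%f")


def step_durations(lines):
    """Return [(seconds_since_previous_line, line)] for lines after the first."""
    durations = []
    previous = None
    for line in lines:
        current = log_time(line)
        if previous is not None:
            durations.append(((current - previous).total_seconds(), line))
        previous = current
    return durations


if __name__ == "__main__":
    log = [
        "INFO 2017-02-14 17:02:35,325 generate_sc_intervals Generating candidate intervals",
        "INFO 2017-02-14 17:27:36,793 generate_sc_intervals 6949907 candidate reads",
        "INFO 2017-02-14 17:28:07,973 generate_sc_intervals 574885 candidate NONE reads",
    ]
    # Print the slowest steps first.
    for seconds, line in sorted(step_durations(log), reverse=True):
        print("%8.1f s  %s" % (seconds, line))
```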
Hi,
I installed metasv using pip install and when I run it I get a missing file cnvnator.call. Full log is below signature.
Thanks, Colin
run_metasv.py --reference /wd5/sq/grch37decoy/hs37d5.000.fa --breakdancer_native breakdancer.out --breakseq_native breakseq.gff --cnvnator_native cnvnator.call --pindel_native pindel_D pindel_LI pindel_SI pindel_TD pindel_INV --bam chimera.bam --spades SPAdes/spades.py --age AGE/age_align --num_threads 11 --workdir work --outdir out --max_ins_intervals 500000 --isize_mean 500 --isize_sd 150 --sample 1
INFO 2018-02-01 15:07:02,114 metasv.main Running MetaSV 0.5.2
INFO 2018-02-01 15:07:02,114 metasv.main Command-line /st2/colin/.local/bin/run_metasv.py --reference /wd5/sq/grch37decoy/hs37d5.000.fa --breakdancer_native breakdancer.out --breakseq_native breakseq.gff --cnvnator_native cnvnator.call --pindel_native pindel_D pindel_LI pindel_SI pindel_TD pindel_INV --bam chimera.bam --spades SPAdes/spades.py --age AGE/age_align --num_threads 11 --workdir work --outdir out --max_ins_intervals 500000 --isize_mean 500 --isize_sd 150 --sample 1
INFO 2018-02-01 15:07:02,114 metasv.main Arguments are Namespace(age='AGE/age_align', age_timeout=300, age_window=20, assembly_max_tools=1, assembly_pad=500, bams=['chimera.bam'], boost_sc=False, breakdancer_native=['breakdancer.out'], breakdancer_vcf=[], breakseq_native=['breakseq.gff'], breakseq_vcf=[], chromosomes=[], cnvkit_vcf=[], cnvnator_native=['cnvnator.call'], cnvnator_vcf=[], disable_assembly=False, enable_per_tool_output=False, extraction_max_read_pairs=10000, filter_gaps=False, gaps=None, gatk_vcf=[], gt_normal_frac=0.05, gt_window=100, inswiggle=100, isize_mean=500.0, isize_sd=150.0, keep_standard_contigs=False, lumpy_vcf=[], manta_vcf=[], max_ins_cov_frac=1.5, max_ins_intervals=500000, max_nm=10, maxsvlen=1000000, mean_read_coverage=50, mean_read_length=100, min_avg_base_qual=20, min_del_subalign_len=50, min_ins_cov_frac=0.5, min_inv_subalign_len=50, min_mapq=5, min_matches=50, min_soft_clip=20, min_support_frac_ins=0.05, min_support_ins=15, minsvlen=50, num_threads=11, outdir='out', overlap_ratio=0.5, pindel_native=['pindel_D', 'pindel_LI', 'pindel_SI', 'pindel_TD', 'pindel_INV'], pindel_vcf=[], reference='/wd5/sq/grch37decoy/hs37d5.000.fa', sample='1', sc_other_scale=5, spades='SPAdes/spades.py', spades_max_interval_size=50000, spades_options='', spades_timeout=300, stop_spades_on_fail=False, svs_to_assemble=set(['DUP', 'INV', 'DEL', 'INS']), svs_to_report=set(['INV', 'CTX', 'INS', 'DEL', 'ITX', 'DUP']), svs_to_softclip=set(['DUP', 'INV', 'DEL', 'INS']), wham_vcf=[], wiggle=100, workdir='work')
INFO 2018-02-01 15:07:02,115 metasv.main Only SVs on the following contigs will be reported: ['GL000191.1', 'GL000192.1', 'GL000193.1', 'GL000194.1', 'GL000195.1', 'GL000196.1', 'GL000197.1', 'GL000198.1', 'GL000199.1', 'GL000200.1', 'GL000201.1', 'GL000202.1', 'GL000203.1', 'GL000204.1', 'GL000205.1', 'GL000206.1', 'GL000207.1', 'GL000208.1', 'GL000209.1', 'GL000210.1', 'GL000211.1', 'GL000212.1', 'GL000213.1', 'GL000214.1', 'GL000215.1', 'GL000216.1', 'GL000217.1', 'GL000218.1', 'GL000219.1', 'GL000220.1', 'GL000221.1', 'GL000222.1', 'GL000223.1', 'GL000224.1', 'GL000225.1', 'GL000226.1', 'GL000227.1', 'GL000228.1', 'GL000229.1', 'GL000230.1', 'GL000231.1', 'GL000232.1', 'GL000233.1', 'GL000234.1', 'GL000235.1', 'GL000236.1', 'GL000237.1', 'GL000238.1', 'GL000239.1', 'GL000240.1', 'GL000241.1', 'GL000242.1', 'GL000243.1', 'GL000244.1', 'GL000245.1', 'GL000246.1', 'GL000247.1', 'GL000248.1', 'GL000249.1', 'NC_007605', 'chr1', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr2', 'chr20', 'chr21', 'chr22', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chrM', 'chrX', 'chrY', 'hs37d5']
INFO 2018-02-01 15:07:02,115 metasv.main Load native files
INFO 2018-02-01 15:07:02,115 metasv.cnvnator_reader File is cnvnator.call
Traceback (most recent call last):
File "/st2/colin/.local/bin/run_metasv.py", line 143, in <module>
sys.exit(run_metasv(args))
File "/home/colin/.local/lib/python2.7/site-packages/metasv/main.py", line 106, in run_metasv
for record in svReader(native_file, svs_to_report=args.svs_to_report):
File "/home/colin/.local/lib/python2.7/site-packages/metasv/cnvnator_reader.py", line 110, in __init__
self.file_fd = open(file_name)
IOError: [Errno 2] No such file or directory: 'cnvnator.call'
Have an option to output the version number.
Hello,
When I use the following command:
run_metasv.py --reference hg19_reference/hg19_multifasta.fa --boost_ins --breakdancer_native breakdancer.out --breakseq_native breakseq.gff --cnvnator_native cnvnator.call --pindel_native pindel_D pindel_LI pindel_SI pindel_TD pindel_INV --sample A10890 --bam A10890_C1VBNACXX_1.bam --spades SPAdes-3.6.2-Linux/bin/spades.py --age AGE-simple-parseable-output/age_align.cpp --num_threads 15 --workdir work --outdir out --min_ins_support 2 --max_ins_intervals 500000 --isize_mean 500 --isize_sd 150
I get the following output and error message:
INFO 2016-01-20 10:56:47,410 metasv.main Only SVs on the following contigs will be reported: ['chr1', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr2', 'chr20', 'chr21', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chrM', 'chrX', 'chrY']
INFO 2016-01-20 10:56:47,410 metasv.main Load native files
INFO 2016-01-20 10:56:47,410 metasv.cnvnator_reader File is cnvnator.call
Traceback (most recent call last):
File "/usr/local/bin/run_metasv.py", line 6, in <module>
exec(compile(open(__file__).read(), __file__, 'exec'))
File "/home/hpcuser01/metasv-0.3/scripts/run_metasv.py", line 108, in <module>
stop_spades_on_fail=args.stop_spades_on_fail, gt_window=args.gt_window, gt_normal_frac=args.gt_normal_frac, isize_mean=args.isize_mean, isize_sd=args.isize_sd, extraction_max_read_pairs=args.extraction_max_read_pairs))
File "/home/hpcuser01/metasv-0.3/metasv/main.py", line 130, in run_metasv
for record in svReader(native_file):
File "/home/hpcuser01/metasv-0.3/metasv/cnvnator_reader.py", line 106, in __init__
self.file_fd = open(file_name)
IOError: [Errno 2] No such file or directory: 'cnvnator.call'
I am not sure why. Help would be appreciated!
Thank you,
Madeline
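The IOError above is raised by a plain open() on 'cnvnator.call', so the file was not found relative to the directory the command was run from. One way to catch this before starting a run is to check every input path up front. The Python sketch below does that; the helper name missing_inputs is illustrative and not part of MetaSV, and the file names are simply the ones from the command in this report.

```python
import os


def missing_inputs(paths, base_dir="."):
    """Return the paths that do not exist as regular files under base_dir.

    Absolute paths are checked as given, because os.path.join discards
    base_dir when the second argument is absolute.
    """
    return [path for path in paths
            if not os.path.isfile(os.path.join(base_dir, path))]


if __name__ == "__main__":
    inputs = [
        "breakdancer.out",
        "breakseq.gff",
        "cnvnator.call",
        "pindel_D", "pindel_LI", "pindel_SI", "pindel_TD", "pindel_INV",
    ]
    missing = missing_inputs(inputs)
    if missing:
        print("Not found in %s: %s" % (os.getcwd(), ", ".join(missing)))
    else:
        print("All input files are present.")
```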
Hello,
There seems to be a bug in MetaSV when the --boost_sc option is included. With this option, my deletion detection sensitivity drops dramatically from ~80% to 5% (as measured using VarSim). My command is as follows, using MetaSV v0.5.3:
run_metasv.py --bam $BAM \
--reference $REF \
--sample $SAMPLE --boost_sc \
--cnvnator_native $SAMPLE.bam_CNVcall.100 \
--lumpy_vcf $SAMPLE.bam_lumpy.vcf \
--spades /home/hpcuser01/SPAdes-3.6.2-Linux/bin/spades.py \
--age /home/hpcuser01/AGE/age_align \
--min_support_ins 2 \
--max_ins_intervals 500000 --isize_mean $INSMEAN --isize_sd $INSSD \
--num_threads $THREADS --outdir $SAMPLE.metaSV.out --workdir $SAMPLE.metaSV.work
Even if I specify --svs_to_assemble INS, deletions still drop out. I am not sure why this is.
Hi there,
I have generated some VCF files with MetaSV by merging the output of different tools and running assembly, but I don't know how to extract high-confidence SNVs from the outputs. Could you give me some ideas?
Thanks a lot.
Kai
Hi,
I want to detect deletions only. This is my command:
run_metasv.py --reference ../data/MIC_supercont_chr_combine.txt --breakdancer_native ./BDcaller/version1/BD/CU427_EC_chr_BD --pindel_native ./BDcaller/version1/pindel/CU427_D --sample WT --bam ../data/CU427_chr_EC_bwa_sorted.bam --spades spades.py --age age_align --num_threads 4 --workdir work --outdir DEL --isize_mean 236.88 --isize_sd 172.95 --svs_to_assemble DEL --svs_to_report DEL
The error message:
Traceback (most recent call last):
File "/usr/local/bin/run_metasv.py", line 143, in <module>
sys.exit(run_metasv(args))
File "/usr/local/lib/python2.7/dist-packages/metasv/main.py", line 338, in run_metasv
convert_metasv_bed_to_vcf(bedfile=genotyped_bed, vcf_out=final_vcf, workdir=args.workdir, sample=args.sample, reference=args.reference, pass_calls=False)
File "/usr/local/lib/python2.7/dist-packages/metasv/generate_final_vcf.py", line 573, in convert_metasv_bed_to_vcf
filterd_bed = filter_confused_INS_calls(nonfilterd_bed, filterd_bed)
File "/usr/local/lib/python2.7/dist-packages/metasv/generate_final_vcf.py", line 164, in filter_confused_INS_calls
bad_INS = bedtool_INS.window(bedtool_bp_nonINS, w=wiggle)
File "/usr/local/lib/python2.7/dist-packages/pybedtools/bedtool.py", line 664, in decorated
result = method(self, *args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/pybedtools/bedtool.py", line 243, in wrapped
check_stderr=check_stderr)
File "/usr/local/lib/python2.7/dist-packages/pybedtools/helpers.py", line 423, in call_bedtools
raise BEDToolsError(subprocess.list2cmdline(cmds), stderr)
pybedtools.helpers.BEDToolsError:
Command was:
bedtools window -w 20 -b /root/work/genotyping/pybedtools.WpFB8r.tmp -a /root/work/genotyping/pybedtools.rfbScw.tmp
Error message was:
Error: The requested bed file (/root/work/genotyping/pybedtools.rfbScw.tmp) could not be opened. Exiting!
I am not sure what the problem is. Please let me know if there is anything I can do. Thank you!
Hi there,
I have been reading the MetaSV paper.
I am interested in how you implement the "Merge overlapping calls per tool (intra-tool merging)" step.
I have read a post on the bedtools Google group: https://groups.google.com/forum/#!topic/bedtools-discuss/JXZbJSwVxUo
Where can I get the script for merging overlapping VCF files?
Thanks!
Ming
Hi there,
I am trying to test MetaSV. I have BreakSeq, Pindel, and BreakDancer data that I am providing to MetaSV. However, when I ran MetaSV on our cluster using 16 CPUs and 128 GB of memory, it was terminated because it exceeded this memory allotment: it went up to 178 GB. Is it normal behaviour to use this much memory? Any ideas what could be happening?
The run command I used was this:
python run_metasv.py --breakdancer_native $list --reference /projects/trans_scratch/references/genomes/transabyss/bwamem-0.7.10/hg19a.fa --sample TEST --outdir /projects/trans_scratch/validations/workspace/dpaulino/metasvtest --bam /projects/analysis/analysis24/A36971/merge_bwa-mem-0.7.6a/150nt/hg19a/A36971_2_lanes_dupsFlagged.bam --spades /home/dpaulino/.linuxbrew/Cellar/spades/3.10.1/bin/spades.py --age /home/dpaulino/software/ageAligner/AGE-master/age_align --num_threads 5 --breakseq_native /projects/trans_scratch/validations/workspace/dpaulino/breakseqTest/work/breakseq.gff --pindel_native $pindellist
$list and $pindellist contain paths to the BreakDancer and Pindel data.
The dataset is a human genome with chromosomes 1-22, X, and Y. Any advice on how to get metasv to run properly is greatly appreciated!
Thanks,
Daniel
Hi and thanks in advance for your help!
While running MetaSV, it got stuck while running spades. There are a few 'OSError: [Errno 13] Permission denied' errors during the run_spades_single process, and then it seems to hang while merging the contigs. The merged.fa and spades.log files are both empty. Running MetaSV on the same sample with --disable_assembly went fine with no errors. The error messages are below, please let me know if there is anything I should try or additional information that would be helpful.
Thanks!
Amanda
Errors:
INFO 2016-04-06 14:10:19,200 extract_read_pairs-<Process(PoolWorker-7, started daemon)> Examined 256 pairs in 11.3396 seconds
INFO 2016-04-06 14:10:19,200 extract_read_pairs-<Process(PoolWorker-7, started daemon)> Extraction counts [('all_pair_hq', 256), ('non_perfect_hq', 107)]
INFO 2016-04-06 14:10:19,202 run_spades_single-<Process(PoolWorker-7, started daemon)> Running /HOME/BIOINFORMATICS/SOFTWARE/SPADES-3.7.1-LINUX/BIN/ with arguments ['-1', 'work/spades/2/_all_pair_hq_1.fq', '-2', 'work/spades/2/_all_pair_hq_2.fq', '-o', 'work/spades/2/spades_all_pair_hq/', '-m', '4', '-t', '1', '--phred-offset', '33']
ERROR 2016-04-06 14:10:19,950 run_spades_single-<Process(PoolWorker-7, started daemon)> Caught exception in worker thread
Traceback (most recent call last):
File "/home/.local/lib/python2.7/site-packages/metasv/run_spades.py", line 75, in run_spades_single
retcode = cmd.run(cmd_log_fd_out=spades_log_fd, timeout=timeout)
File "/home/.local/lib/python2.7/site-packages/metasv/external_cmd.py", line 20, in run
self.p = subprocess.Popen(self.cmd, stderr=cmd_log_fd_err, stdout=cmd_log_fd_out)
File "/usr/local/apps/python-2.7.11/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/usr/local/apps/python-2.7.11/lib/python2.7/subprocess.py", line 1335, in _execute_child
raise child_exception
OSError: [Errno 13] Permission denied
INFO 2016-04-06 14:10:20,589 metasv.run_spades Merging the contigs from []
^CTraceback (most recent call last):
File "/home/Bioinformatics/software/metasv/scripts/run_metasv.py", line 143, in <module>
sys.exit(run_metasv(args))
File "/home/.local/lib/python2.7/site-packages/metasv/main.py", line 306, in run_metasv
assembly_max_tools=args.assembly_max_tools)
File "/home/.local/lib/python2.7/site-packages/metasv/run_spades.py", line 214, in run_spades_parallel
for line in fileinput.input(assembly_fastas):
File "/usr/local/apps/python-2.7.11/lib/python2.7/fileinput.py", line 254, in next
line = self.readline()
File "/usr/local/apps/python-2.7.11/lib/python2.7/fileinput.py", line 349, in readline
self._buffer = self._file.readlines(self._bufsize)
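One detail in the log above stands out: the program path in the "Running ... with arguments" line ends with a slash (".../BIN/"), which suggests, but does not prove, that a directory rather than the spades.py script was passed to --spades. On Linux, trying to execute a directory fails with "Permission denied". The Python sketch below classifies a path before it is handed to a subprocess; the helper name check_executable is illustrative and not part of MetaSV.

```python
import os


def check_executable(path):
    """Classify a path that is about to be run as an external program."""
    if not os.path.exists(path):
        return "missing"
    if os.path.isdir(path):
        # Executing a directory fails with EACCES ("Permission denied").
        return "directory"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"


if __name__ == "__main__":
    # Hypothetical paths for illustration; substitute the real --spades value.
    for candidate in ["SPAdes/bin/", "SPAdes/bin/spades.py"]:
        print("%s: %s" % (candidate, check_executable(candidate)))
```

If the result is "not executable", setting the execute bit on the script (chmod +x) is the usual remedy.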
This will enable controlling SPAdes and AGE timed executions.
Hi @marghoob,
I just noticed breakseq2 repo. Can I replace (old) breakseq with breakseq2 output as input for metasv?
Best wishes,
Fengyuan
Hi,
I'm trying to use MetaSV for the first time.
I installed version 0.5.2.
I checked that reference.fa and reference.fa.fai are in the same folder with exactly the same name.
I'm trying the following command:
run_metasv.py --reference ~/Documents/Post-Doc/Results/SequencingData/Reference_genomes/Celegans/c_elegans.PRJNA13758.WS243.genomic.fa --boost_sc --breakdancer_native breakdancer.out --breakseq_native breakseq.gff --cnvnator_native cnvnator.call --pindel_native pindel_D pindel_LI pindel_SI pindel_TD pindel_INV --sample 516 --bam ~/Bureau/temp_PSMN/Post-doc/N2vs516/NON-MASKED/BAM-516_20X-unM_RG-sorted-dedup-realign_BQSR1.bam --spades ~/softwares/SPAdes-3.13.0-Linux/bin/spades.py --age ~/softwares/AGE-master/age_align --num_threads 1 --workdir ~/Bureau/temp_PSMN/Post-doc/MetaSV/MetaSV_516/work --outdir ~/Bureau/temp_PSMN/Post-doc/MetaSV/MetaSV_516/out --isize_mean 470 --isize_sd 35
In the folder containing my reference file, I do have the .fa and .fai files (I regenerated the index with samtools faidx to be sure):
ls -l | grep "c_elegans.PRJNA13758.WS243.genomic.fa"
-rwxrwxrwx 1 fabfab fabfab 102292161 mai 19 2014 c_elegans.PRJNA13758.WS243.genomic.fa
-rwxrwxrwx 1 fabfab fabfab 14 avril 22 2015 c_elegans.PRJNA13758.WS243.genomic.fa.amb
-rwxrwxrwx 1 fabfab fabfab 231 avril 22 2015 c_elegans.PRJNA13758.WS243.genomic.fa.ann
-rwxrwxrwx 1 fabfab fabfab 100286508 avril 22 2015 c_elegans.PRJNA13758.WS243.genomic.fa.bwt
-rwxrwxrwx 1 fabfab fabfab 181 oct. 18 11:28 c_elegans.PRJNA13758.WS243.genomic.fa.fai
-rwxrwxrwx 1 fabfab fabfab 25071602 avril 22 2015 c_elegans.PRJNA13758.WS243.genomic.fa.pac
-rwxrwxrwx 1 fabfab fabfab 50143256 avril 22 2015 c_elegans.PRJNA13758.WS243.genomic.fa.sa
Any idea what I'm doing wrong?
Thanks,
Fabrice
Hi Marghoob,
I’m interested to use MetaSV for identifying high confidence SVs from 3 different algorithms e.g. BreakDancer, Delly and CNVnator.
I was wondering whether it is possible to include Delly’s output in MetaSV; and if so, your guidance will be highly appreciated.
Thanks
Mesbah
Hi,
While running MetaSV, I got errors during the SPAdes step. It seems SPAdes cannot run successfully, and in the end I get no output. Running MetaSV on the same sample with --disable_assembly went fine with no errors. The error messages are below; please let me know if there is anything I should try or if additional information would be helpful.
Thanks!
jsxu
Hi,
I am trying a complete run of MetaSV 0.5.4 (installed from bioconda) using 2 SV detectors (Pindel and BreakDancer), soft-clip based analysis, and local assembly, and I got this error:
INFO 2022-02-15 15:51:33,585 metasv.main Load native files
INFO 2022-02-15 15:51:33,586 metasv.pindel_reader File is LA3111t13-LA4330t13_D
INFO 2022-02-15 16:18:19,892 metasv.pindel_reader File is LA3111t13-LA4330t13_SI
INFO 2022-02-15 16:32:20,639 metasv.pindel_reader File is LA3111t13-LA4330t13_TD
INFO 2022-02-15 16:34:33,923 metasv.pindel_reader File is LA3111t13-LA4330t13_INV
INFO 2022-02-15 16:34:35,183 metasv.breakdancer_reader File is breakdancer.sv.vcf
Traceback (most recent call last):
File "/data/home/users/g.silvaarias/anaconda3/envs/metasv/bin/run_metasv.py", line 143, in <module>
sys.exit(run_metasv(args))
File "/data/home/users/g.silvaarias/anaconda3/envs/metasv/lib/python2.7/site-packages/metasv/main.py", line 106, in run_metasv
for record in svReader(native_file, svs_to_report=args.svs_to_report):
File "/data/home/users/g.silvaarias/anaconda3/envs/metasv/lib/python2.7/site-packages/metasv/breakdancer_reader.py", line 222, in next
self.header.parse_header_line(line)
File "/data/home/users/g.silvaarias/anaconda3/envs/metasv/lib/python2.7/site-packages/metasv/breakdancer_reader.py", line 74, in parse_header_line
self.header_dict[fields[0]] = dict(field.split(":") for field in fields[1:])
ValueError: dictionary update sequence element #0 has length 1; 2 is required
Here is the full command:
run_metasv.py --reference $ref \
--outdir $outdir \
--boost_sc \
--breakdancer_native breakdancer.sv.vcf \
--pindel_native LA3111t13-LA4330t13_D LA3111t13-LA4330t13_SI LA3111t13-LA4330t13_TD LA3111t13-LA4330t13_INV \
--sample LA3111t13 --sample LA4330t13 --bam LA3111t13_dedup_RG.bam LA4330t13_dedup_RG.bam --spades spades.py \
--age age_align --num_threads $threads \
--min_support_ins 10 --isize_mean 400 --isize_sd 100
I would appreciate any suggestion to fix that.
Best,
Gustavo
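The ValueError in this report comes from the expression shown on the last traceback line, which builds a dict from tokens that are expected to look like key:value. Any token without a colon splits into a one-element list, and dict() then raises exactly this message. The Python sketch below reproduces that behaviour in isolation. How the reader tokenises a header line is an assumption here (whitespace splitting), and the example tokens are hypothetical. Since the file passed to --breakdancer_native is named breakdancer.sv.vcf, one possible cause, though only a guess, is that a VCF header line is being read as a BreakDancer native header.

```python
def parse_header_fields(fields):
    """Mirror the failing expression: fields[1:] must all be 'key:value'."""
    return dict(field.split(":") for field in fields[1:])


def explain(fields):
    """Return the parsed dict, or a message naming the offending tokens."""
    try:
        return parse_header_fields(fields)
    except ValueError as exc:
        bad = [field for field in fields[1:] if field.count(":") != 1]
        return "cannot parse %r: %s" % (bad, exc)


if __name__ == "__main__":
    # Hypothetical tokens in key:value form: parsed without error.
    print(explain(["#Library", "readlen:100", "num:5"]))
    # Tokens of a VCF column header have no colon and trigger the ValueError.
    print(explain("#CHROM POS ID REF ALT".split()))
```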
Hello,
I have been getting the error below when running MetaSV with soft-clip analysis. Have you seen this before? I am using the correct versions of the tools as listed at http://bioinform.github.io/metasv/.
INFO 2016-08-31 22:00:33,727 genotype_intervals-<Process(PoolWorker-16, started daemon)> Genotyped 6 intervals in 0.0038044 minutes
INFO 2016-08-31 22:00:33,803 parallel_genotype_intervals-<_MainProcess(MainProcess, started)> Following BED files will be merged: ['work/genotyping/0/genotyped.bed', 'work/genotyping/1/genotyped.bed', 'work/genotyping/2/genotyped.bed', 'work/genotyping/3/genotyped.bed']
INFO 2016-08-31 22:00:33,845 parallel_genotype_intervals-<_MainProcess(MainProcess, started)> Finished parallel genotyping of 27 intervals in 0.00704072 minutes
INFO 2016-08-31 22:00:33,847 metasv.main Output final VCF file
feature.field:0/1
Traceback (most recent call last):
File "/mnt/galaxyTools/tools/pymodules/python2.7/bin/run_metasv.py", line 146, in <module>
sys.exit(run_metasv(args))
File "/mnt/galaxyTools/tools/pymodules/python2.7/lib/python/MetaSV-0.5-py2.7.egg/metasv/main.py", line 335, in run_metasv
convert_metasv_bed_to_vcf(bedfile=genotyped_bed, vcf_out=final_vcf, workdir=args.workdir, sample=args.sample, reference=args.reference, pass_calls=False)
File "/mnt/galaxyTools/tools/pymodules/python2.7/lib/python/MetaSV-0.5-py2.7.egg/metasv/generate_final_vcf.py", line 435, in convert_metasv_bed_to_vcf
interval_info = get_interval_info(interval,pass_calls)
File "/mnt/galaxyTools/tools/pymodules/python2.7/lib/python/MetaSV-0.5-py2.7.egg/metasv/generate_final_vcf.py", line 77, in get_interval_info
info.update(json.loads(base64.b64decode(feature.fields[10])))
File "/mnt/galaxyTools/tools/pymodules/python2.7/lib/python/base64.py", line 76, in b64decode
raise TypeError(msg)
TypeError: Incorrect padding
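The traceback shows that the field being decoded is expected to hold base64-encoded JSON, and the line printed just before it (feature.field:0/1) hints that the field may instead contain a genotype-like string. That is only a guess from the log. A minimal sketch of a defensive decoder, using a hypothetical function name that is not part of MetaSV:

```python
import base64
import binascii
import json


def decode_info_field(text):
    """Return the dict stored as base64-encoded JSON, or None if decoding fails."""
    try:
        raw = base64.b64decode(text, validate=True)
        return json.loads(raw.decode("utf-8"))
    except (binascii.Error, ValueError):
        # binascii.Error covers bad padding; ValueError covers invalid UTF-8 or JSON.
        return None
```

Applying such a check to the eleventh column of the genotyped BED files would show which lines carry something other than the encoded annotation.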
Dear MetaSV team,
Does your tool support Manta and Canvas callers?
Many thanks
Dear developers,
Thank you for contributing the tool to the community.
At the moment, I have ref and bam files, and when I run:
run_metasv.py --sample sample_A --reference ./ref/ref.fasta --bam ./bam/sample_A.bam --outdir ./out
I got an error - "Nothing to do since no SV file specified".
Could you please let me know what went wrong? What is the best way to run the tool, and could you show me an example?
Best wishes,
Fengyuan
Hi metasv developer,
Currently, I am using MetaSV to merge SVs from the outputs of BreakDancer, CNVnator, and Pindel for a human genome. Are there any tricks I could use to reduce the running time?
I downloaded metasv from anaconda by using the command below:
conda install -c bioconda metasv
The version of metasv:
[ksu2 18:11:36 ksu2_SVE]$ run_metasv.py --version
run_metasv.py 0.5.4
I ran run_metasv.py on the example files without any issue, so I moved on to my own data. MetaSV has now been running on our HPC for over 5 days. Any suggestions would be greatly appreciated. My batch script is listed below.
`
#!/bin/bash
#SBATCH --qos=long
#SBATCH --time=7-00:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=20
#SBATCH --mem=64G
module load anaconda/2.5.0 bedtools/2.27.1
module load gcc/4.8.2
module load cmake/3.0.2 ROOT/5.34.36
export CONDA_ENVS_PATH=/lustre/project/ksu2_SVE
unset PYTHONPATH
source activate SVE
metaSV_ref=Homo_sapiens_assembly38.fasta
breakdancer_our=/data/BreakDancer_out/Subject_ID.sv.tbl
cnvnator_call=/data/CNVnator_out/Subject_ID.cnv.xls
pindel_out=/data/pindel_out/Sample_dir/Subject_ID/*
sample_id=Subject_ID_tbl
alignments_bam=/data/Subject_ID.bam
spades_exe=/ksu2_SVE/SVE/bin/spades.py
age_align_exe=/ksu2_SVE/SVE/bin/age_align
threads=20
work=/data/metaSV_work2
OUTDIR=/data/metaSV_out2
insert_size_mean=260.04
insert_size_sd=56.34
metaSV_svs_to_assemble={'DEL','INS','INV','DUP'}
run_metasv.py --reference $metaSV_ref \
--breakdancer_native $breakdancer_our \
--cnvnator_native $cnvnator_call \
--pindel_native $pindel_out \
--sample $sample_id \
--bam $alignments_bam \
--spades $spades_exe \
--age $age_align_exe \
--num_threads $threads \
--workdir $work \
--outdir $OUTDIR \
--isize_mean $insert_size_mean \
--isize_sd $insert_size_sd
`
I haven't found any issues in the log file so far, but the running time is longer than I expected.
If you need more information, please let me know, and thank you for your time.
Ray
Hi,
I'm trying to run metasv to integrate calls from breakdancer, lumpy and cnvnator.
I've installed and tested metasv v0.4 without problems, but I get a "Floating point exception" error when I run it on my own data (1 sample and 1 chromosome).
I copy below the last lines of the log and the command I've run.
It creates the work directory and the pre_asm.vcf file, but it's empty.
I've tried combining the output of only 2 callers as well, but all the combinations give the same error.
I would appreciate if you could help me identify what is causing the error.
Thank you in advance and best regards,
amaia
$METASV --reference $resourceDir/human_g1k_v37_decoy.fasta --outdir $WorkingDir/metasv --sample $sample --bam $WorkingDir/bam_sort/$sample.final.sorted.bam --chromosomes $chr --cnvnator_native $WorkingDir/cnvnator/calls/$sample.calls --breakdancer_native $WorkingDir/breakdancer/intrachr/samples61_breakdancer_o_$chr.ctx --lumpy_vcf $WorkingDir/lumpy/$fam'_samples.vcf' --disable_assembly --filter_gaps --keep_standard_contigs
...
INFO 2015-11-12 11:46:24,844 metasv.main SV types are set(['DEL', 'DUP', 'INV', 'INS'])
INFO 2015-11-12 11:46:24,845 metasv.main Do merging
INFO 2015-11-12 11:46:24,845 metasv.main Processing SVs of type DEL
INFO 2015-11-12 11:46:24,845 metasv.main Intra-tool Merging SVs of type DEL
INFO 2015-11-12 11:46:24,845 metasv.main First level merging for DEL for tool CNVnator
INFO 2015-11-12 11:46:24,903 metasv.main First level merging for DEL for tool BreakDancer
INFO 2015-11-12 11:46:25,207 metasv.main First level merging for DEL for tool Lumpy
INFO 2015-11-12 11:46:27,096 metasv.main Inter-tool Merging SVs of type DEL
INFO 2015-11-12 11:46:27,588 metasv.main Checking overlaps SVs of type DEL
INFO 2015-11-12 11:46:28,249 metasv.main Processing SVs of type DUP
INFO 2015-11-12 11:46:28,249 metasv.main Intra-tool Merging SVs of type DUP
INFO 2015-11-12 11:46:28,249 metasv.main First level merging for DUP for tool CNVnator
INFO 2015-11-12 11:46:28,267 metasv.main First level merging for DUP for tool Lumpy
INFO 2015-11-12 11:46:28,477 metasv.main Inter-tool Merging SVs of type DUP
INFO 2015-11-12 11:46:28,549 metasv.main Checking overlaps SVs of type DUP
INFO 2015-11-12 11:46:29,211 metasv.main Processing SVs of type INV
INFO 2015-11-12 11:46:29,211 metasv.main Intra-tool Merging SVs of type INV
INFO 2015-11-12 11:46:29,212 metasv.main First level merging for INV for tool BreakDancer
INFO 2015-11-12 11:46:29,225 metasv.main First level merging for INV for tool Lumpy
INFO 2015-11-12 11:46:29,229 metasv.main Inter-tool Merging SVs of type INV
INFO 2015-11-12 11:46:29,280 metasv.main Checking overlaps SVs of type INV
INFO 2015-11-12 11:46:29,333 metasv.main Processing SVs of type INS
INFO 2015-11-12 11:46:29,333 metasv.main Intra-tool Merging SVs of type INS
INFO 2015-11-12 11:46:29,333 metasv.main First level merging for INS for tool BreakDancer
INFO 2015-11-12 11:46:36,834 metasv.main Inter-tool Merging SVs of type INS
INFO 2015-11-12 11:46:56,726 metasv.main Checking overlaps SVs of type INS
INFO 2015-11-12 11:47:17,721 metasv.main Output merged VCF without assembly
Floating point exception
I get the error below after running MetaSV; can you give me some suggestions?
$METASV --pindel_native /bak01/yangqj/pindel/20220116/vcf_format/ps0001/ps0001_* --cnvnator_native /bak01/yangqj/cnvnator/ps0001.call --reference /bak01/yangqj/Metasv/hg19.fa --outdir out --sample ps0001 --filter_gaps --minsvlen 500 --maxsvlen 500000 --disable_assembly --keep_standard_contigs
INFO 2022-01-25 16:34:31,082 metasv.main Running MetaSV 0.5.4
INFO 2022-01-25 16:34:31,082 metasv.main Command-line /lustre/yangqj/software/miniconda3/envs/metasv/bin/run_metasv.py --pindel_native /bak01/yangqj/pindel/20220116/vcf_format/ps0001/ps0001_DEL.vcf /bak01/yangqj/pindel/20220116/vcf_format/ps0001/ps0001_TD.vcf --cnvnator_native /bak01/yangqj/cnvnator/ps0001.call --reference /bak01/yangqj/Metasv/hg19.fa --outdir out --sample ps0001 --filter_gaps --minsvlen 500 --maxsvlen 500000 --disable_assembly --keep_standard_contigs
INFO 2022-01-25 16:34:31,083 metasv.main Arguments are Namespace(age=None, age_timeout=300, age_window=20, assembly_max_tools=1, assembly_pad=500, bams=[], boost_sc=False, breakdancer_native=[], breakdancer_vcf=[], breakseq_native=[], breakseq_vcf=[], chromosomes=[], cnvkit_vcf=[], cnvnator_native=['/bak01/yangqj/cnvnator/ps0001.call'], cnvnator_vcf=[], disable_assembly=True, enable_per_tool_output=False, extraction_max_read_pairs=10000, filter_gaps=True, gaps=None, gatk_vcf=[], gt_normal_frac=0.05, gt_window=100, inswiggle=100, isize_mean=350.0, isize_sd=50.0, keep_standard_contigs=True, lumpy_vcf=[], manta_vcf=[], max_ins_cov_frac=1.5, max_ins_intervals=10000, max_nm=10, maxsvlen=500000, mean_read_coverage=50, mean_read_length=100, min_avg_base_qual=20, min_del_subalign_len=50, min_ins_cov_frac=0.5, min_inv_subalign_len=50, min_mapq=5, min_matches=50, min_soft_clip=20, min_support_frac_ins=0.05, min_support_ins=15, minsvlen=500, num_threads=1, outdir='out', overlap_ratio=0.5, pindel_native=['/bak01/yangqj/pindel/20220116/vcf_format/ps0001/ps0001_DEL.vcf', '/bak01/yangqj/pindel/20220116/vcf_format/ps0001/ps0001_TD.vcf'], pindel_vcf=[], reference='/bak01/yangqj/Metasv/hg19.fa', sample='ps0001', sc_other_scale=5, spades=None, spades_max_interval_size=50000, spades_options='', spades_timeout=300, stop_spades_on_fail=False, svs_to_assemble=set(['DUP', 'INV', 'INS']), svs_to_report=set(['INV', 'CTX', 'INS', 'DEL', 'ITX', 'DUP']), svs_to_softclip=set(['DUP', 'INV', 'INS']), wham_vcf=[], wiggle=100, workdir='work')
INFO 2022-01-25 16:34:31,084 metasv.main Only SVs on the following contigs will be reported: ['chr1', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr2', 'chr20', 'chr21', 'chr22', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chrM', 'chrX', 'chrY']
INFO 2022-01-25 16:34:31,084 metasv.sv_interval Loading the gaps in the genome from /lustre/yangqj/software/miniconda3/envs/metasv/lib/python2.7/site-packages/metasv/resources/hg19.gaps.bed
INFO 2022-01-25 16:34:31,117 metasv.main Load native files
INFO 2022-01-25 16:34:31,117 metasv.cnvnator_reader File is /bak01/yangqj/cnvnator/ps0001.call
Traceback (most recent call last):
File "/lustre/yangqj/software/miniconda3/envs/metasv/bin/run_metasv.py", line 143, in <module>
sys.exit(run_metasv(args))
File "/lustre/yangqj/software/miniconda3/envs/metasv/lib/python2.7/site-packages/metasv/main.py", line 106, in run_metasv
for record in svReader(native_file, svs_to_report=args.svs_to_report):
File "/lustre/yangqj/software/miniconda3/envs/metasv/lib/python2.7/site-packages/metasv/cnvnator_reader.py", line 123, in next
record = CNVnatorRecord(line.strip())
File "/lustre/yangqj/software/miniconda3/envs/metasv/lib/python2.7/site-packages/metasv/cnvnator_reader.py", line 38, in __init__
self.sv_type = sv_type_dict[fields[0]]
KeyError: 'Assuming'
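The KeyError: 'Assuming' suggests that the first field of some line in the CNVnator file is the word "Assuming", which looks like a log message rather than a call record. A minimal sketch of a pre-filter, assuming that genuine call lines start with "deletion" or "duplication" (the function name and the sample lines in the usage note are made up for illustration):

```python
# Assumed first-column values of CNVnator call records.
CNV_TYPES = ("deletion", "duplication")


def keep_call_lines(lines):
    """Return only the lines whose first whitespace-separated field is a CNV type."""
    kept = []
    for line in lines:
        fields = line.split()
        if fields and fields[0] in CNV_TYPES:
            kept.append(line)
    return kept
```

For example, keep_call_lines(open("ps0001.call")) would drop any message lines, and the remaining lines could be written to a cleaned file that is then passed to --cnvnator_native.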
I tried to run MetaSV with the following command:
run_metasv.py --reference ../../../../../mnt/data/GRCh37_bcgsc/GRCh37-lite.fa --boost_ins --breakdancer_native ../../breakdancer/perl/A10898_breakdancer --breakseq_native ../../work/breakseq.gff --cnvnator_native ../../CNVnator_v0.3.2/src/A10898_100_cnvnator --pindel_native ../../pindel-master/A10898_D ../../pindel-master/A10898_LI ../../pindel-master/A10898_SI ../../pindel-master/A10898_TD ../../pindel-master/A10898_INV --sample A10898 --bam ../../../../mnt/data/A10898_3_lanes_dupsFlagged.bam --spades ../../SPAdes-3.6.2-Linux/bin/spades.py --age ../../AGE-simple-parseable-output/age_align --num_threads 15 --workdir A10898_work_metaSV --outdir A10898_out_metaSV --min_ins_support 2 --max_ins_intervals 500000 --isize_mean 500 --isize_sd 150
Eventually I get this traceback:
Traceback (most recent call last):
File "/usr/local/bin/run_metasv.py", line 6, in <module>
exec(compile(open(__file__).read(), __file__, 'exec'))
File "/home/hpcuser01/metasv-0.3/scripts/run_metasv.py", line 108, in <module>
stop_spades_on_fail=args.stop_spades_on_fail, gt_window=args.gt_window, gt_normal_frac=args.gt_normal_frac, isize_mean=args.isize_mean, isize_sd=args.isize_sd, extraction_max_read_pairs=args.extraction_max_read_pairs))
File "/home/hpcuser01/metasv-0.3/metasv/main.py", line 327, in run_metasv
min_support_frac=min_support_frac, max_intervals=max_intervals)
File "/home/hpcuser01/metasv-0.3/metasv/generate_sv_intervals.py", line 243, in parallel_generate_sc_intervals
"Merging %d features with %d features from %s" % (bedtool.count(), skip_bedtool.count(), skip_bed))
File "/usr/local/lib/python2.7/dist-packages/pybedtools-0.7.4-py2.7-linux-x86_64.egg/pybedtools/bedtool.py", line 2261, in count
return sum(1 for _ in iter(self))
File "/usr/local/lib/python2.7/dist-packages/pybedtools-0.7.4-py2.7-linux-x86_64.egg/pybedtools/bedtool.py", line 2261, in <genexpr>
return sum(1 for _ in iter(self))
File "pybedtools/cbedtools.pyx", line 772, in pybedtools.cbedtools.IntervalIterator.next (pybedtools/cbedtools.cxx:11001)
File "pybedtools/cbedtools.pyx", line 682, in pybedtools.cbedtools.create_interval_from_list (pybedtools/cbedtools.cxx:9798)
pybedtools.cbedtools.MalformedBedLineError: Start is greater than stop
Prior to this message, I was getting errors for each chromosome:
Traceback (most recent call last):
File "/home/hpcuser01/metasv-0.3/metasv/generate_sv_intervals.py", line 148, in generate_sc_intervals
filtered_bed)
File "/usr/local/lib/python2.7/dist-packages/pybedtools-0.7.4-py2.7-linux-x86_64.egg/pybedtools/bedtool.py", line 775, in decorated
result = method(self, *args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/pybedtools-0.7.4-py2.7-linux-x86_64.egg/pybedtools/bedtool.py", line 2912, in moveto
fn = self._collapse(self, fn=fn)
File "/usr/local/lib/python2.7/dist-packages/pybedtools-0.7.4-py2.7-linux-x86_64.egg/pybedtools/bedtool.py", line 1215, in _collapse
for i in iterable:
File "pybedtools/cbedtools.pyx", line 731, in pybedtools.cbedtools.IntervalIterator.next (pybedtools/cbedtools.cxx:10588)
line = next(self.stream)
File "/usr/local/lib/python2.7/dist-packages/pybedtools-0.7.4-py2.7-linux-x86_64.egg/pybedtools/bedtool.py", line 876, in _generator
result = func(f, *args, **kwargs)
File "/home/hpcuser01/metasv-0.3/metasv/generate_sv_intervals.py", line 90, in merged_interval_features
interval_readcount = bam_handle.count(reference=feature.chrom, start=feature.start, end=feature.end)
File "csamtools.pyx", line 1169, in pysam.csamtools.Samfile.count (pysam/csamtools.c:13478)
File "csamtools.pyx", line 989, in pysam.csamtools.Samfile._parseRegion (pysam/csamtools.c:11668)
File "csamtools.pyx", line 923, in pysam.csamtools.Samfile.gettid (pysam/csamtools.c:10827)
File "csamtools.pyx", line 57, in pysam.csamtools._force_bytes (pysam/csamtools.c:3393)
TypeError: Expected bytes, got unicode
I'm not sure what the issue is. Your help would be appreciated!
Thanks,
Madeline
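"Start is greater than stop" is raised by pybedtools when an interval's start coordinate exceeds its end coordinate. One way to narrow this down, sketched below with a hypothetical helper in plain Python, is to scan the intermediate BED files for such lines. Which file to scan is an assumption; the BED file named in the "Merging ... features" message would be a natural first candidate.

```python
def find_inverted_intervals(bed_lines):
    """Return (line_number, chrom, start, end) for every record with start > end."""
    bad = []
    for number, line in enumerate(bed_lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # not a BED record
        try:
            start, end = int(fields[1]), int(fields[2])
        except ValueError:
            continue  # non-numeric coordinates are out of scope for this check
        if start > end:
            bad.append((number, fields[0], start, end))
    return bad
```

Passing an open file handle works as well, since the function only iterates over lines.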
Hello,
When I try to run enhanced insertion detection, I get an error message. This is my command:
run_metasv.py --reference /mnt/data/GRCh37_bcgsc/GRCh37-lite.fa --boost_sc --sample A48018 --bam /mnt/data/A48018_2_lanes_dupsFlagged.bam --spades /home/hpcuser01/SPAdes-3.6.2-Linux/spades.py --age /home/hpcuser01/AGE/age_align --num_threads 50 --workdir A48018_work_boostins --outdir A48018_out_boostinst --max_ins_intervals 500000 --isize_mean 462 --isize_sd 119 --chromosomes X
And the error message:
ERROR 2016-06-24 11:59:42,265 run_spades_single-<Process(PoolWorker-51, started daemon)> Caught exception in worker thread
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/metasv/run_spades.py", line 75, in run_spades_single
retcode = cmd.run(cmd_log_fd_out=spades_log_fd, timeout=timeout)
File "/usr/local/lib/python2.7/dist-packages/metasv/external_cmd.py", line 20, in run
self.p = subprocess.Popen(self.cmd, stderr=cmd_log_fd_err, stdout=cmd_log_fd_out)
File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1327, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
What might be the issue here?
Also, the http://bioinform.github.io/metasv/ website should be updated for enhanced insertion detection (e.g., boost_ins is no longer an option).
Thanks for your help,
Madeline
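OSError: [Errno 2] from subprocess.Popen generally means that the program to be launched could not be found at the given path. A quick check, sketched here with a hypothetical helper, is to confirm that the paths passed to --spades and --age resolve to executable files before starting a long run:

```python
import shutil


def find_tool(path_or_name):
    """Return the resolved executable path, or None if it is missing or not executable."""
    # shutil.which checks a path containing a directory directly and searches
    # PATH for bare command names.
    return shutil.which(path_or_name)


def missing_tools(candidates):
    """Return the entries of `candidates` that cannot be resolved to an executable."""
    return [tool for tool in candidates if find_tool(tool) is None]
```

For the command in this post, missing_tools(["/home/hpcuser01/SPAdes-3.6.2-Linux/spades.py", "/home/hpcuser01/AGE/age_align"]) would return whichever of the two paths cannot be launched.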
Hello,
It seems to be a great tool for merging SV calls from multiple tools.
I did a quick run with the provided script "run_test.sh" as well as with our own data from "CNVnator", "Lumpy" and "Manta". In both cases the output file contains only homozygous ALT genotypes in the genotype field (i.e. 1/1).
How should the GT field be interpreted? Is there any flag that identifies whether a call is heterozygous or homozygous ALT?
Also, our lab is interested in analyzing de novo SVs. Is there functionality that looks at genotypes across multiple samples? On that note, how does MetaSV treat multi-sample VCF input? It does not give any error when provided with a multi-sample VCF from Manta or Lumpy.
Thanks.
Best,
Nick
Hi,
I'm running MetaSV with a command like:
run_metasv.py --reference /home/gst/work/ref/Zea_mays.AGPv4.dna.toplevel.fa --boost_sc --pindel_vcf Q417_pindel.vcf --breakdancer_vcf Q417_breakdancer.vcf --cnvnator_native Q417_cnvnator.out --manta_vcf Q417_manta.vcf --lumpy_vcf Q417_lumpy.vcf --wham_vcf Q417_whamg.vcf --mean_read_length 150 --sample Q417 --bam /home/gst/work/b73_BWAMEM_bam/Q417_bwa/Q417.bwamem.sort.bam --spades ~/gst/sftw/SPAdes-3.11.1-Linux/bin/spades.py --age ~/gst/sftw/anaconda2/bin/age_align --num_threads 8 --workdir work --outdir out --min_support_ins 4 --max_ins_intervals 500000 --isize_mean 350 --isize_sd 50
and it failed with this error:
INFO 2018-07-10 02:09:14,322 parallel_generate_sc_intervals-<_MainProcess(MainProcess, started)> Selecting the top 500000 intervals based on normalized read support
INFO 2018-07-10 02:12:48,947 parallel_generate_sc_intervals-<_MainProcess(MainProcess, started)> After merging with work/metasv.bed 124230 features
<type 'exceptions.IOError'>: Broken pipe
The command was:bedtools sort -i stdin
Things to check:
Traceback (most recent call last):
File "/home/gst/sftw/anaconda2/bin/run_metasv.py", line 143, in <module>
sys.exit(run_metasv(args))
File "/home/gst/sftw/anaconda2/lib/python2.7/site-packages/metasv/main.py", line 292, in run_metasv
other_scale=args.sc_other_scale)
File "/home/gst/sftw/anaconda2/lib/python2.7/site-packages/metasv/generate_sv_intervals.py", line 1165, in parallel_generate_sc_intervals
bedtool = bedtool.each(partial(fix_precise_coords)).sort().saveas(interval_bed)
File "/home/gst/sftw/anaconda2/lib/python2.7/site-packages/pybedtools/bedtool.py", line 668, in decorated
result = method(self, *args, **kwargs)
File "/home/gst/sftw/anaconda2/lib/python2.7/site-packages/pybedtools/bedtool.py", line 243, in wrapped
check_stderr=check_stderr)
File "/home/gst/sftw/anaconda2/lib/python2.7/site-packages/pybedtools/helpers.py", line 456, in call_bedtools
print '\n\t' + '\n\t'.join(problems[err.errno])
KeyError: 32
What's the cause of this error?
Thank you!
best wishes,
songtao gui
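The final KeyError: 32 appears to come from the error-reporting code in pybedtools looking up a hint for error number 32. The underlying failure is the "Broken pipe" reported just before it. Broken pipe typically means the process at the other end of the pipe (here, bedtools sort -i stdin) exited before all input was written, which is worth checking in the bedtools stderr or system logs. A small sketch for translating such numeric codes, with a hypothetical helper name:

```python
import errno
import os


def describe_errno(code):
    """Return 'NAME: message' for an OS error number, e.g. for codes seen in tracebacks."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return "%s: %s" % (name, os.strerror(code))
```

On Linux, describe_errno(32) reports the EPIPE error.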
Hi, I enjoyed your talk at UKGS, where it was stated that the software would be released in "a couple of weeks". Please could you clarify when we can expect MetaSV to be released? I'm looking forward to trying it with my datasets.
Hi,
I have used version 0.4 and seen the error below. Is this fixed in version 0.5?
INFO 2016-02-14 17:11:46,150 run_age_single-<Process(PoolWorker-10, started daemon)> Writing the ref sequence for region chr1.1076464.1076874
INFO 2016-02-14 17:11:46,151 run_age_single-<Process(PoolWorker-10, started daemon)> Processing 13 contigs for region (chr1, 1076464, chr1, 1076874)
INFO 2016-02-14 17:11:46,151 run_age_single-<Process(PoolWorker-10, started daemon)> Writing the assembeled sequence chr1_1076464_1076874_INS_0_NODE_1_length_1122_cov_20.4015_ID_295 of length 1122
INFO 2016-02-14 17:11:46,157 run_age_single-<Process(PoolWorker-10, started daemon)> Running /BCBIOMETASV/MINICONDA/METASV27JAN/AGE_ALIGN with arguments ['-indel', '-both', '-go=-6', '/bcbiometasv/miniconda/metasv27jan/UP53input/age/chr1.1076464.1076874.ref.fa', '/bcbiometasv/miniconda/metasv27jan/UP53input/age/0df25e19515155dfddbed3a8c720a98a.as.fa']
INFO 2016-02-14 17:11:46,199 run_age_single-<Process(PoolWorker-9, started daemon)> Will process 589 intervals
INFO 2016-02-14 17:11:46,218 run_age_single-<Process(PoolWorker-9, started daemon)> Matching interval chr1 1180324 1180344 eyJOVU1fU1ZNRVRIT0RTIjogMSwgIk5VTV9TVlRPT0xTIjogMSwgIlNDX0NPVkVSQUdFIjogIjQxOCIsICJTT1VSQ0VTIjogImNocjEtMTE3OTgyNC1jaHIxLTExODA4NDQtMC1Tb2Z0Q2xpcCIsICJTQ19ORUlHSF9TVVBQT1JUIjogIjQxIiwgIlNDX1JFQURfU1VQUE9SVCI6ICIzIiwgIlNDX0NIUjJfU1RSIjogImNocjE7NDsxMTgwMDI4OzExODA0MjgsLTE7NDstMTs3MixjaHIyOzMyOzMzMTQxMjk5OzMzMTQxNjY4LGNocjE1OzE7ODgwNzA1MzQ7ODgwNzA2MDcifQ==,INS,0,SC 1 .
INFO 2016-02-14 17:11:46,229 run_age_single-<Process(PoolWorker-9, started daemon)> Writing the ref sequence for region chr1.1180324.1180344
INFO 2016-02-14 17:11:46,231 run_age_single-<Process(PoolWorker-9, started daemon)> Processing 3 contigs for region (chr1, 1180324, chr1, 1180344)
INFO 2016-02-14 17:11:46,231 run_age_single-<Process(PoolWorker-9, started daemon)> Writing the assembeled sequence chr1_1180324_1180344_INS_0_NODE_1_length_1095_cov_19.8838_ID_63 of length 1095
INFO 2016-02-14 17:11:46,234 run_age_single-<Process(PoolWorker-9, started daemon)> Running /BCBIOMETASV/MINICONDA/METASV27JAN/AGE_ALIGN with arguments ['-indel', '-both', '-go=-6', '/bcbiometasv/miniconda/metasv27jan/UP53input/age/chr1.1180324.1180344.ref.fa', '/bcbiometasv/miniconda/metasv27jan/UP53input/age/3e23e83081dd0bbc7bc0548bbb1e4534.as.fa']
INFO 2016-02-14 17:11:46,269 run_age_single-<Process(PoolWorker-10, started daemon)> Returned code 0 (0.0812719 seconds)
ERROR 2016-02-14 17:11:46,271 run_age_single-<Process(PoolWorker-10, started daemon)> Caught exception in worker thread
Traceback (most recent call last):
File "/bcbiometasv/miniconda/metasv27jan/metasv/run_age.py", line 146, in run_age_single
age_record = AgeRecord(out,tr_region_1=tr_region)
File "/bcbiometasv/miniconda/metasv27jan/metasv/age_parser.py", line 85, in __init__
INFO 2016-02-14 17:11:46,295 run_age_single-<Process(PoolWorker-9, started daemon)> Returned code 0 (0.04352 seconds)
ERROR 2016-02-14 17:11:46,296 run_age_single-<Process(PoolWorker-9, started daemon)> Caught exception in worker thread
Traceback (most recent call last):
File "/bcbiometasv/miniconda/metasv27jan/metasv/run_age.py", line 146, in run_age_single
age_record = AgeRecord(out,tr_region_1=tr_region)
File "/bcbiometasv/miniconda/metasv27jan/metasv/age_parser.py", line 85, in __init__
self.read_from_age_file(age_out_file)
File "/bcbiometasv/miniconda/metasv27jan/metasv/age_parser.py", line 186, in read_from_age_file
file2, len2 = self.parse_input_descriptor(age_fd)
File "/bcbiometasv/miniconda/metasv27jan/metasv/age_parser.py", line 126, in parse_input_descriptor
raise AgeFormatError("INPUT DESCRIPTOR", age_fd.line_num)
self.read_from_age_file(age_out_file)
File "/bcbiometasv/miniconda/metasv27jan/metasv/age_parser.py", line 186, in read_from_age_file
AgeFormatError: Error while reading AGE output, L13 (section INPUT DESCRIPTOR).
file2, len2 = self.parse_input_descriptor(age_fd)
File "/bcbiometasv/miniconda/metasv27jan/metasv/age_parser.py", line 126, in parse_input_descriptor
raise AgeFormatError("INPUT DESCRIPTOR", age_fd.line_num)
AgeFormatError: Error while reading AGE output, L13 (section INPUT DESCRIPTOR).
Exception in thread Thread-9:
Traceback (most recent call last):
File "/bcbiometasv/miniconda/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/bcbiometasv/miniconda/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "/bcbiometasv/miniconda/lib/python2.7/multiprocessing/pool.py", line 389, in _handle_results
task = get()
TypeError: ('__init__() takes exactly 3 arguments (1 given)', <class 'metasv.age_parser.AgeFormatError'>, ())
James
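The last TypeError in this log looks like a secondary problem on top of the AGE parsing error: when a worker process raises an exception, the multiprocessing pool pickles it and rebuilds it in the parent by calling the exception class with its recorded args. If a custom exception requires constructor arguments but records none, the rebuild fails. The sketch below illustrates that mechanism with two hypothetical classes (they are not MetaSV's AgeFormatError), and the rebuild function only imitates the reconstruction step that unpickling performs.

```python
class FragileFormatError(Exception):
    """Hypothetical exception whose __init__ hides its arguments from the base class."""

    def __init__(self, section, line_num):
        Exception.__init__(self)  # args stays empty, so a rebuild has nothing to pass
        self.section = section
        self.line_num = line_num


class RobustFormatError(Exception):
    """Same idea, but the constructor arguments are forwarded to Exception."""

    def __init__(self, section, line_num):
        Exception.__init__(self, section, line_num)  # args == (section, line_num)
        self.section = section
        self.line_num = line_num


def rebuild(exc):
    # Imitates the unpickling step: call the class with the args from __reduce__.
    cls, args = exc.__reduce__()[:2]
    return cls(*args)
```

With these definitions, rebuild(RobustFormatError("INPUT DESCRIPTOR", 13)) succeeds, while rebuild(FragileFormatError("INPUT DESCRIPTOR", 13)) raises a TypeError about missing constructor arguments.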
It looks like MetaSV ignores translocations in Lumpy VCF files. When I run MetaSV, I get these messages (a short sample):
INFO 2016-10-28 11:30:02,622 metasv.main Load VCF files
INFO 2016-10-28 11:30:02,623 metasv.sv_interval Loading SV intervals from /mnt/data/SV_analysis/HSAN1-c3_bwamem.bam_lumpy.vcf
ERROR 2016-10-28 11:30:02,634 metasv.sv_interval Ignoring record due to missing SVTYPE or INFO field in Record(CHROM=1, POS=29878039, REF=N, ALT=[N]1:31129380]])
ERROR 2016-10-28 11:30:02,634 metasv.sv_interval Ignoring record due to missing SVTYPE or INFO field in Record(CHROM=1, POS=31129380, REF=N, ALT=[N]1:29878039]])
ERROR 2016-10-28 11:30:02,634 metasv.sv_interval Ignoring record due to missing SVTYPE or INFO field in Record(CHROM=1, POS=16415316, REF=N, ALT=[N]1:16416094]])
ERROR 2016-10-28 11:30:02,634 metasv.sv_interval Ignoring record due to missing SVTYPE or INFO field in Record(CHROM=1, POS=16416094, REF=N, ALT=[N]1:16415316]])
ERROR 2016-10-28 11:30:02,635 metasv.sv_interval Ignoring record due to missing SVTYPE or INFO field in Record(CHROM=1, POS=6515317, REF=N, ALT=[[1:17019741[N])
ERROR 2016-10-28 11:30:02,635 metasv.sv_interval Ignoring record due to missing SVTYPE or INFO field in Record(CHROM=1, POS=17019741, REF=N, ALT=[[1:6515317[N])
Lumpy doesn't annotate translocations as CTX or ITX; rather the SVTYPE is replaced by the second breakend coordinate (the first is the entry for the start position of the translocation). Is it possible to add support for lumpy translocations into metaSV?
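The ALT strings in the log messages above follow the VCF breakend notation, in which the mate position is written between a pair of square brackets. As a rough illustration of what supporting such records would involve, here is a sketch that extracts the mate chromosome and position from an ALT string. The function name is hypothetical and this is not how MetaSV handles these records.

```python
import re

# Matches the bracketed mate locus in a breakend ALT, e.g. "N]1:31129380]" or "[1:17019741[N".
_BREAKEND = re.compile(r"[\[\]](.+):(\d+)[\[\]]")


def parse_breakend_alt(alt):
    """Return (mate_chrom, mate_pos) for a breakend ALT string, or None otherwise."""
    match = _BREAKEND.search(alt)
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

Comparing the mate chromosome with the record's own CHROM would then separate intra-chromosomal from inter-chromosomal events.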
Hello, can you please add a description of the MetaSV output that explains all the flags in the VCF? This would help with further filtering of the MetaSV results.
Hi @marghoob ,
I've got the following error; could you advise me on a fix?
Traceback (most recent call last):
File "/home/cbrcmod/scratch/modules/out/modulebin/metasv/0.3/bin/run_metasv.py", line 108, in <module>
stop_spades_on_fail=args.stop_spades_on_fail, gt_window=args.gt_window, gt_normal_frac=args.gt_normal_frac, isize_mean=args.isize_mean, isize_sd=args.isize_sd, extraction_max_read_pairs=args.extraction_max_read_pairs))
File "/home/cbrcmod/scratch/modules/out/modulebin/metasv/0.3/lib/python2.7/site-packages/metasv/main.py", line 295, in run_metasv
pybedtools.BedTool(bed_intervals).saveas(merged_bed)
File "/home/cbrcmod/scratch/modules/out/modulebin/metasv/0.3/lib/python2.7/site-packages/pybedtools/bedtool.py", line 390, in __init__
fn = BedTool(iter(fn)).saveas().fn
File "/home/cbrcmod/scratch/modules/out/modulebin/metasv/0.3/lib/python2.7/site-packages/pybedtools/bedtool.py", line 668, in decorated
result = method(self, *args, **kwargs)
File "/home/cbrcmod/scratch/modules/out/modulebin/metasv/0.3/lib/python2.7/site-packages/pybedtools/bedtool.py", line 2729, in saveas
fn = self._collapse(self, fn=fn, trackline=trackline)
File "/home/cbrcmod/scratch/modules/out/modulebin/metasv/0.3/lib/python2.7/site-packages/pybedtools/bedtool.py", line 1097, in _collapse
for i in iterable:
File "pybedtools/cbedtools.pyx", line 638, in pybedtools.cbedtools.IntervalIterator.__next__ (pybedtools/cbedtools.cpp:9096)
MemoryError
Best wishes,
Fengyuan
Hi,
I am using Meta-SV version 0.5.4. I got an error while reading the Age output. I am attaching the error file here.
It would be great if you can look into it!
Thanks,
Add more details to the MetaSV webpage
Hello MetaSV Team,
I have used version 0.4 and seen 3 kmer values used to refine each SV. Is it possible to have options to use only one kmer per SV?
I know kmers need to be optimised, but I think users have prior knowledge of their reads, such as read length. So, they can reasonably choose a certain kmer in MetaSV to use so as to save computation.
Any thoughts? Thank you for this very great program.
Regards,
James
Hi team,
Does metaSV support the four SV callers, or do I have to install those and run them separately?
Thanks,
Madeline
(Bioinformagician)
Hi Brad,
I found a problem with metaSV when running the local assembly - Spades.
It seems a similar error to: bcbio/bcbio-nextgen#1075, but I'm not sure about what is causing the problem.
Please find the error copied below.
Thanks,
amaia
INFO 2015-12-16 10:43:31,315 metasv.run_spades 5176 intervals selected
INFO 2015-12-16 10:43:31,315 metasv.run_spades 22 intervals ignored
INFO 2015-12-16 10:43:32,807 run_spades_single-<Process(PoolWorker-2, started daemon)> Processing interval 22 16054689 16054690 eyJOVU1fU1ZNRVRIT0RTIjogMSwgIkJEX1NDT1JFIjogMzUuMCwgIkJEX09SSTIiOiAiMTQrMTMtIiwgIkJEX1BPUzEiOiAxNjA1NDY4OCwgIkJEX1BPUzIiOiAxNjA1NDgyOSwgIkJEX09SSTEiOiAiMTQrMTMtIiwgIkJEX0NIUjEiOiAiMjIiLCAiU09VUkNFUyI6ICIyMi0xNjA1NDY4OS0xNjA1NDY4OS0xMjQtQnJlYWtEYW5jZXIiLCAiQkRfQ0hSMiI6ICIyMiIsICJOVU1fU1ZUT09MUyI6IDEsICJCRF9TVVBQT1JUSU5HX1JFQURfUEFJUlMiOiA0fQ==,INS,124,RP 1 .
INFO 2015-12-16 10:43:32,808 extract_read_pairs-<Process(PoolWorker-2, started daemon)> Extracting reads from /data/corpora/MPI_workspace/lag/workspaces/lg-ngs/working//bam_sort/DYS14587.final.sorted.bam for region 22:16054689-16054690 with padding 500 using functions ['all_pair', 'non_perfect']
ERROR 2015-12-16 10:43:32,909 run_spades_single-<Process(PoolWorker-2, started daemon)> Caught exception in worker thread
Traceback (most recent call last):
File "/home/amacar/.local/lib/python2.7/site-packages/metasv/run_spades.py", line 68, in run_spades_single
max_read_pairs=max_read_pairs, sv_type=sv_type)
File "/home/amacar/.local/lib/python2.7/site-packages/metasv/extract_pairs.py", line 93, in extract_read_pairs
aln_list = [aln for aln in bam.fetch(chr_name, start=chr_start, end=chr_end) if not aln.is_secondary]
File "csamtools.pyx", line 1059, in pysam.csamtools.Samfile.fetch (pysam/csamtools.c:12490)
File "csamtools.pyx", line 989, in pysam.csamtools.Samfile._parseRegion (pysam/csamtools.c:11668)
File "csamtools.pyx", line 923, in pysam.csamtools.Samfile.gettid (pysam/csamtools.c:10827)
File "csamtools.pyx", line 57, in pysam.csamtools._force_bytes (pysam/csamtools.c:3393)
TypeError: Expected bytes, got unicode
INFO 2015-12-16 10:43:33,066 metasv.run_spades Merging the contigs from []
Traceback (most recent call last):
File "/data/corpora/MPI_workspace/lag/workspaces/lg-ngs/working/programs/metasv-0.4/scripts/run_metasv.py", line 136, in <module>
sys.exit(run_metasv(args))
File "/home/amacar/.local/lib/python2.7/site-packages/metasv/main.py", line 307, in run_metasv
assembly_max_tools=args.assembly_max_tools)
File "/home/amacar/.local/lib/python2.7/site-packages/metasv/run_spades.py", line 204, in run_spades_parallel
for line in fileinput.input(assembly_fastas):
File "/usr/lib64/python2.7/fileinput.py", line 253, in next
line = self.readline()
File "/usr/lib64/python2.7/fileinput.py", line 346, in readline
self._buffer = self._file.readlines(self._bufsize)
hi,
the output VCF file contains some information of the form "TAGS= sample name". Does this information tell me which samples have this variant?
thanks,
Kai
I'm trying to set up metaSV on a shared HPC (ComputeCanada's Cedar) and am running into an error with the pip installation.
Following the installation instructions, I download/load the system requirements first.
First I load the provided modules and set up the Python env:
module load python/3.8
module load spades/3.13.1
module load samtools/0.1.20
virtualenv metaSV
source metaSV/bin/activate
pip install Cython # needs to be installed before the following 3 dependencies
pip install pysam
pip install pybedtools
pip install pyvcf
SPAdes was already available, but I needed to download and compile AGE with make OMP=no.
Now I try to install metaSV with pip install https://github.com/bioinform/metasv/archive/0.5.2.tar.gz and get an error:
Ignoring pip: markers 'python_version < "3"' don't match your environment
Looking in links: /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/avx2, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/generic
Collecting https://github.com/bioinform/metasv/archive/0.5.2.tar.gz
Using cached https://github.com/bioinform/metasv/archive/0.5.2.tar.gz
ERROR: Command errored out with exit status 1:
command: /project/6013424/common/tools/CNV/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-pd08her8/setup.py'"'"'; __file__='"'"'/tmp/pip-req-build-pd08her8/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-req-build-pd08her8/pip-egg-info
cwd: /tmp/pip-req-build-pd08her8/
Complete output (6 lines):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-req-build-pd08her8/setup.py", line 8
print version
^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print(version)?
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
My uname -a:
Linux cedar1.cedar.computecanada.ca 3.10.0-957.21.3.el7.x86_64 #1 SMP Tue Jun 18 16:35:19 UTC 2019 x86_64 GNU/Linux
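The SyntaxError above comes from a Python 2 style print statement (print version) on line 8 of setup.py, which the Python 3.8 interpreter loaded by module load python/3.8 cannot parse. The following is a minimal sketch, not part of MetaSV, that reproduces the failure in isolation; the helper name compiles_on_this_python is hypothetical and used only for illustration.

```python
import sys


def compiles_on_this_python(source, filename="<setup.py>"):
    """Return True if `source` parses under the interpreter running this script.

    Hypothetical helper for illustration only; it is not part of MetaSV or pip.
    """
    try:
        compile(source, filename, "exec")
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    print("Interpreter:", sys.version.split()[0])
    # Python 2 print statement, as reported on line 8 of the failing setup.py
    print("print version  ->", compiles_on_this_python("print version"))
    # Parenthesized form suggested by the error message
    print("print(version) ->", compiles_on_this_python("print(version)"))
```

On a Python 3 interpreter the first check fails and the second succeeds, which suggests that this release's setup.py expects a Python 2 interpreter. Whether a Python 2 module is available on Cedar is an assumption to verify with module avail python.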
The Makefile for the custom version of age_align has the -O3 option added. On my Debian systems at least, this pretty much breaks age_align: it hangs most of the time, and even at -O it hangs on the test.sh script 90%+ of the time.
I tried compiling with gcc 4.6.3 and 4.9.2, with nearly the same results.
Removing the -O option gets it working.