methphaser's People

Contributors

fu-yilei

methphaser's Issues

Duplicated and missing entries in .methphased.vcf

When I look at the entries in the output .methphased.vcf, there seem to be completely duplicated rows (not just one variant that is methphased into different haplotypes and therefore split into multiple entries with different PS tags, but entirely duplicate entries). This is not a big issue because complete duplicates can easily be dropped, but what could be causing this? Is this intentional?

There also seem to be variants in the original .vcf file that are missing from the output .methphased.vcf, including some that were phased in the original .vcf. Is that intentional?

The original .vcf was filtered to contain only variants on autosomes, and the input .bam files were filtered to contain only primary alignments but were not restricted to reads that map to autosomes.

Thank you so much!
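
As a stopgap for the fully duplicated rows described above, here is a minimal sketch that drops records repeating an earlier line exactly while keeping records that differ only in their PS tag. The function name `drop_exact_duplicates` and the example records are hypothetical, and the sketch does not explain why methphaser emits the duplicates.

```python
def drop_exact_duplicates(vcf_lines):
    """Yield VCF lines in order, skipping records that exactly repeat an earlier one."""
    seen = set()
    for line in vcf_lines:
        if line.startswith("#"):
            yield line  # header lines pass through untouched
            continue
        key = line.rstrip("\n")
        if key in seen:
            continue  # exact repeat of an earlier record
        seen.add(key)
        yield line


# Hypothetical records: one exact duplicate, one same-site record with another PS.
records = [
    "##fileformat=VCFv4.2\n",
    "chr1\t100\t.\tC\tA\t10\tPASS\t.\tGT:PS\t0|1:100\n",
    "chr1\t100\t.\tC\tA\t10\tPASS\t.\tGT:PS\t0|1:100\n",
    "chr1\t100\t.\tC\tA\t10\tPASS\t.\tGT:PS\t0|1:250\n",
]
deduplicated = list(drop_exact_duplicates(records))
```

Records that share a position but carry different PS values are kept on purpose, since the issue describes those as expected output.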

[E::bam_parse_basemod] MM tag refers to bases beyond sequence length

Hi,

I am using ONT R10.4.1 data provided by EPI2ME ( https://labs.epi2me.io/askenazi-kit14-2022-12/ ) and testing with whatshap. However, I encountered an error during the process. How can I resolve it? Here are the commands I used.

whatshap --version
1.7
whatshap phase --ignore-read-groups --indels \
-r GCA_000001405.15_GRCh38_no_alt_analysis_set.fa \
hg002.wf_snp.vcf.gz \
hg002.sup.60x.bam \
-o whatshap_hg02_60x.vcf
bgzip -c whatshap_hg02_60x.vcf > whatshap_hg02_60x.vcf.gz
tabix -p vcf whatshap_hg02_60x.vcf.gz
whatshap haplotag --ignore-read-groups  \
whatshap_hg02_60x.vcf.gz hg002.sup.60x.bam \
-r GCA_000001405.15_GRCh38_no_alt_analysis_set.fa \
-o hg002.sup.60x.whatshap.haplotag.bam

samtools index -@ 24 hg002.sup.60x.whatshap.haplotag.bam
whatshap stats --gtf whatshap_hg02_60x.gtf whatshap_hg02_60x.vcf.gz
~/methphaser/meth_phaser_parallel \
-b hg002.sup.60x.whatshap.haplotag.bam \
-r GCA_000001405.15_GRCh38_no_alt_analysis_set.fa \
-g whatshap_hg02_60x.gtf \
-vc whatshap_hg02_60x.vcf.gz  \
-o work

[E::bam_parse_basemod] MM tag refers to bases beyond sequence length
Traceback (most recent call last):
  File "/home/jyunhong104/methphaser/methphasing", line 1471, in <module>
    main(sys.argv[1:])
  File "/home/jyunhong104/methphaser/methphasing", line 1437, in main
    ) = get_assignment_max(
  File "/home/jyunhong104/methphaser/methphasing", line 898, in get_assignment_max
    base_modification_list = get_base_modification_dictionary(  # build the dictionary with snp phased reads
  File "/home/jyunhong104/methphaser/methphasing", line 239, in get_base_modification_dictionary
    if methylation_identifier_0 in list(mm.keys()):
AttributeError: 'NoneType' object has no attribute 'keys'
[W::bam_next_basemod] MM tag refers to bases beyond sequence length
Traceback (most recent call last):
  File "/home/jyunhong104/methphaser/methphasing", line 1471, in <module>
    main(sys.argv[1:])
  File "/home/jyunhong104/methphaser/methphasing", line 1437, in main
    ) = get_assignment_max(
  File "/home/jyunhong104/methphaser/methphasing", line 898, in get_assignment_max
    base_modification_list = get_base_modification_dictionary(  # build the dictionary with snp phased reads
  File "/home/jyunhong104/methphaser/methphasing", line 239, in get_base_modification_dictionary
    if methylation_identifier_0 in list(mm.keys()):
AttributeError: 'NoneType' object has no attribute 'keys'

...

Thanks
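
The htslib message says the MM tag points past the end of the read. One way to find the affected reads is to compare, for each MM entry, the number of bases it references (the sum of the skip counts plus the number of listed positions) with the number of matching bases in SEQ. The sketch below does that on plain strings. The function name `mm_overruns` is hypothetical, and the suggestion that trimmed or hard-clipped sequences are a likely cause is an assumption, not something confirmed in this thread.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mm_overruns(seq, mm_value, is_reverse=False):
    """Return MM entries that reference more bases than seq contains.

    seq is the SEQ field as stored in the alignment record; mm_value is the
    MM tag value, for example "C+m?,5,12,0;".
    """
    overruns = []
    for entry in filter(None, mm_value.split(";")):
        head, *deltas = entry.split(",")
        base = head[0]
        skips = [int(d) for d in deltas]
        # Each listed position consumes its skipped bases plus itself.
        needed = sum(skips) + len(skips)
        if base == "N":
            available = len(seq)
        else:
            # MM counts bases in the original read orientation, while SEQ is
            # stored reverse-complemented for reverse-strand alignments.
            target = COMPLEMENT.get(base, base) if is_reverse else base
            available = seq.upper().count(target)
        if needed > available:
            overruns.append((head, needed, available))
    return overruns


# Hypothetical read with three C bases.
fits = mm_overruns("ACGCGC", "C+m?,0,1;")
too_long = mm_overruns("ACGCGC", "C+m?,0,1,0;")
```

Reads reported by such a check could be inspected or excluded before running methphaser. Whether methphaser should skip them itself is a question for the maintainers.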

The total size of the output bam file has decreased

The total size of the BAM files output by methphaser is smaller than the input.
My BAM after whatshap processing is 175 G, but the BAM files after methphaser processing total 142 G.
175G Nov 30 01:33 HG002_ONT_md_sup.whatshap.haplotag.bam

total 142G
 6.7G Dec  1 15:20 output_ob.10.methtagged.bam
 6.8G Dec  1 15:47 output_ob.11.methtagged.bam
 6.8G Dec  1 15:28 output_ob.12.methtagged.bam
 4.8G Dec  1 15:25 output_ob.13.methtagged.bam
 4.6G Dec  1 15:53 output_ob.14.methtagged.bam
 4.2G Dec  1 14:55 output_ob.15.methtagged.bam
 4.4G Dec  1 14:58 output_ob.16.methtagged.bam
 4.2G Dec  1 14:56 output_ob.17.methtagged.bam
 3.8G Dec  1 15:23 output_ob.18.methtagged.bam
 3.0G Dec  1 14:42 output_ob.19.methtagged.bam
  12G Dec  1 16:02 output_ob.1.methtagged.bam
 3.1G Dec  1 14:44 output_ob.20.methtagged.bam
 2.0G Dec  1 14:38 output_ob.21.methtagged.bam
 1.9G Dec  1 15:20 output_ob.22.methtagged.bam
  12G Dec  1 15:59 output_ob.2.methtagged.bam
 9.9G Dec  1 15:54 output_ob.3.methtagged.bam
 9.4G Dec  1 15:52 output_ob.4.methtagged.bam
 9.0G Dec  1 15:44 output_ob.5.methtagged.bam
 8.6G Dec  1 15:42 output_ob.6.methtagged.bam
 7.9G Dec  1 15:52 output_ob.7.methtagged.bam
 7.3G Dec  1 15:37 output_ob.8.methtagged.bam
 5.9G Dec  1 15:16 output_ob.9.methtagged.bam
 3.9G Dec  1 15:10 output_ob.X.methtagged.bam
 489M Dec  1 15:27 output_ob.Y.methtagged.bam

This is how I run methphaser:

meth_phaser_parallel -b HG002_ONT_md_sup.whatshap.haplotag.bam -r hs37d5.fa -g whatshap_hg02.gtf -vc whatshap_hg02.vcf.gz -o work -t 16
meth_phaser_post_processing -ib HG002_ONT_md_sup.whatshap.haplotag.bam -if work/ -ov output.vcf -ob output_ob -vc whatshap_hg02.vcf.gz -t 16

What is the reason for this, and how can I solve it?
Thanks!

Two warning messages appear

Two warning messages appear when I run meth_phaser_parallel:

warning:
/home/lixin/miniconda3/envs/viaenv/bin/methphasing:4: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  from pkg_resources import require

and

/home/lixin/miniconda3/envs/viaenv/lib/python3.9/site-packages/scipy/__init__.py:155: UserWarning: A NumPy version >=1.18.5 and <1.26.0 is required for this version of SciPy (detected version 1.26.2
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"

The program does not stop, and I still get the output.
What causes these two warnings, and do they affect the results?
Thanks!

Question on secondary and supplementary reads

It is necessary to exclude secondary and supplementary alignments with "samtools view -bF 2304 -o output.bam input.bam" before running methphaser. However, the secondary and supplementary reads could provide increased depth when assembling the haplotypes after phasing. I guess I can filter out the secondary and supplementary reads that are already phased using whatshap, and re-add them to the new bam file after methphasing?
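
For reference, the value 2304 in the samtools command is the sum of the SAM flag bits for secondary (256) and supplementary (2048) alignments. The sketch below shows the same test on raw flag values, which could be used to split reads into a set passed to methphaser and a set held back for later. The helper names are hypothetical, and whether re-adding the held-back reads afterwards is appropriate is not addressed here.

```python
SECONDARY = 0x100      # 256: secondary alignment
SUPPLEMENTARY = 0x800  # 2048: supplementary alignment
EXCLUDE_MASK = SECONDARY | SUPPLEMENTARY  # 2304, as in `samtools view -F 2304`


def kept_by_filter(flag):
    """True if a read with this SAM flag survives `-F 2304`."""
    return (flag & EXCLUDE_MASK) == 0


def split_flags(flags):
    """Partition flags into (primary, secondary_or_supplementary)."""
    primary = [f for f in flags if kept_by_filter(f)]
    held_back = [f for f in flags if not kept_by_filter(f)]
    return primary, held_back


# 0 and 16 are primary; 272 = 256 + 16; 2064 = 2048 + 16.
primary, held_back = split_flags([0, 16, 272, 2064])
```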

IndexError: list index out of range

Hi there - I am applying Methphaser to a mouse dataset and ran into a minor issue. See error log below:

no reads aligned to chromosome MT, skipping...
no reads aligned to chromosome JH584301.1, skipping...
Traceback (most recent call last):
  File "/data/liuy45/conda/envs/methphaser/bin/meth_phaser_parallel", line 283, in <module>
    main(sys.argv[1:])
  File "/data/liuy45/conda/envs/methphaser/bin/meth_phaser_parallel", line 252, in main
    skipping_pair_start_list.append(phased_block_distance_list_sorted[0][0])
IndexError: list index out of range

Seems like chr_block_num is not zero, but somehow phased_block_distance_dict is empty? What could the problem be? Thank you so much!
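
One way to narrow this down is to count how many phase blocks each contig has in the GTF written by `whatshap stats --gtf`, and to list the contigs with fewer than two. The sketch below does this on tab-separated GTF lines. The helper names are hypothetical, and the idea that contigs with zero or one block trigger this IndexError is an assumption rather than a confirmed diagnosis.

```python
from collections import Counter


def blocks_per_contig(gtf_lines):
    """Count GTF records (phase blocks) per contig name."""
    counts = Counter()
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        counts[line.split("\t", 1)[0]] += 1
    return counts


def contigs_with_few_blocks(gtf_lines, contigs, minimum=2):
    """Return the contigs that have fewer than `minimum` phase blocks."""
    counts = blocks_per_contig(gtf_lines)
    return [name for name in contigs if counts[name] < minimum]


# Hypothetical GTF content: chr1 has two blocks, chr2 has one, MT has none.
gtf = [
    'chr1\tPhasing\texon\t100\t900\t.\t+\t.\tgene_id "100"; transcript_id "100.1";\n',
    'chr1\tPhasing\texon\t2000\t5000\t.\t+\t.\tgene_id "2000"; transcript_id "2000.1";\n',
    'chr2\tPhasing\texon\t300\t700\t.\t+\t.\tgene_id "300"; transcript_id "300.1";\n',
]
suspects = contigs_with_few_blocks(gtf, ["chr1", "chr2", "MT"])
```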

Empty output vcf and bam

Hi,

I am using methphaser as a singularity image on an HPC. I have made the singularity image work with the test data, so there shouldn't be any technical issues with the image.

I am doing adaptive sequencing of a 1.2 Mbp region on chromosome 14 (the IGH locus). I assemble the reads into a personal reference of this region and then re-map the ROI reads to this personal reference. This works very well for my purpose, but when I add methphaser to the pipeline it yields only empty vcf and bam files (only headers). Shouldn't the output at least contain the information from the input bam and input vcf files? I am not getting any errors when running methphaser.

I would be extremely grateful if you could help me troubleshoot.

Below are the truncated input bam file with the first two reads, the truncated input vcf file, the gtf file, and the truncated FASTA reference:

Truncated BAM file:

750c6e3f-9e8c-44b9-9e31-fc66c1b1fc2c 16 contig_1 1 60 1189S416M1I1717M1I1167M5D438M1D86M1I193M2D234M2D285M1I59M1D36M2D23M1I365M1D1397M1I793M1D8M1I4M1D198M2D12M1D1249M1D5M1I11M102S * 0 0 CCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC&')+069:78A?DHBEDFJGD>C>G=:8400.---,'')255676;<;:<5>?@BFBCIDCBABCDGEBBKCED@=CBECB>DCF@@DCCFCA;@@C=?>CAD>><A@CB?BAE>A?ECFB?@A>CA?>B@EB?D-5640...0>?@>D:9*)'&&%$$$$$$$$(()*,/8...-.03656=>>=>??ESGJMFMGIBEEGHEHKPIG<=6/.4..-,+))-2DGIFGD:'&)%$$$%&&'+++'%%&%&(+,2223-,,*((&$$$%&''+++**) NM:i:42 ms:i:17172 AS:i:17172 nn:i:0 tp:A:P cm:i:1027 s1:i:5636 s2:i:324 de:f:0.0039 SA:Z:contig_1,1578,-,2107M7D7888S,1,83; rl:i:4350 CO:Z:MM:Z:C+h?,4,2,4,1,0,0,1,0,1,0,0,0,1,13,13,59,20,42,26,28,24,11,90,4,28,0,38,11,34,1,32,14,23,48,27,17,68,2,5,0,4,126,0,0,3,1,3,1,4,2,9,0,2,2,4,0,5,2,13,0,1,0,0,5,0,8,2,0,1,0,1,0,8,2,0,1,0,1,0,4,2,0,1,0,8,2,0,1,0,1,0,4,2,0,1,0,8,2,0,1,0,1,0,3,2,0,1,0,9,2,0,1,0,1,0,4,2,0,1,0,9,2,0,0,0,9,2,0,0,0,9,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,3,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,2,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,0,0,4,2,0,0,0,1,2,15,0,0,0,0,0; ML:B:C,36,8,165,2,1,16,3,3,2,2,3,264,92,253,254,253,235,248,253,254,255,253,254,190,240,248,250,206,239,9,182,209,4,221,183,65,185,15,242,0,0,241,254,255,254,244,245,227,250,250,254,254,196,230,252,65,248,254,223,248,129,252,204,250,254,253,208,123,242,253,247,254,250,13,60,196,217,255,255,255,240,116,250,254,255,68,92,0,86,119,249,255,255,254,234,252,251,248,254,253,33,238,224,254,252,253,20,1,225,171,22,182,252,253,254,247,0,242,245,254,149,52,242,239,243,54,50,62,58,123,107,134,177,248,254,156,215,205,224,202,201,241 HP:i:1 PC:i:210 PS:i:2459 40fce1e4-96a0-4a00-8695-fa9590857b1d 16 contig_1 1 60 1363S1821M1D653M24D119M1I10M1D113M6I391M1D171M1I342M1I149M1D168M9M1D41M3D760M1D45M2D395M1D2713M2D1363M2D207M1D18M2D16M2D15M32S * 0 0 
AACCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTA&)135@AABBBF?G=A:EBDGCBDBHCA@D@EDA>DEDED>BCDEDDEBDCB>CBFFB>C@C???A@DB@@@>C>>>DADDA?CAEBA:AAD;=BC@CA?;C?CCA9CADB@BC@FB?;?>?>>=@>BBA?B=?=;;4AA46FIBAGMSLSIJKIIIJJSIISSJHSHKGEGJGSJLQKSHBBJ2?FFIIIGIISLQSIKJOIHSPSNHSFSKSSSEGJSBLEFA??=999;>SLKLSIILSJHSS??@BBBBAEDABHISMSSSLOLMLSGNHJQKJISLJSRNSLJHJIMSSOIOLFISMMMJIKILHEHFSPSKKSSSGGLOJJJLHIKSSJMIKMLMKJSJKHHSJPIJSLJND0<;=AB66666LSSIIIJEC6-,,,2015>DIE###$$$%&')&(.,+'%%%+356/..-.56((((()**')),,,,+'&%&%$$%')))'&&&%$$ NM:i:320 ms:i:112130 AS:i:112134 nn:i:0 tp:A:P cm:i:9642 s1:i:52735 s2:i:2852 de:f:0.0037 rl:i:5199 CO:Z:MM:Z:C+h?,16,0,1,3,2,1,1,5,1,2,2,7,3,0,4,1,37,0,3,1488,2,0,0,0,9,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,2,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,3,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,2,15,0,0,0; ML:B:C,29,104,69,23,0,2,62,2,2,2,1,1,1,1,3,0,0,1,0,11,29,1,9,7,3,6,8,22,2,4,1,2,49,1,2,2,68,1,28,1,0,1,92,1,5,2,224,254,8,8,3,5,1,9,2,9,3,2,1,1,1,3,144,2,18,30,9,5,0,5,1,14,1,4,4,1,3,4,0,5,0,1,4,0,1,42,0,53,2,4,16,24,2,0,1,4,1,2,26,41,115,1,2,0,1,3,2,3,9,1,1,19,1,13,7,7,0,1,3,30,7,10,1,149,29,230,252,251,189,252,255,254,247,245,253,202,0,243,254,197,236,240,253,249,252,251,241,133,120,54,240,249,235,130,196,36,249,253,252,216,228,212,233,1 HP:i:2 PC:i:210 PS:i:2459

Truncated VCF file:

##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##source=Clair3
##clair3_version=1.0.0
##FILTER=<ID=LowQual,Description="Low quality variant">
##FILTER=<ID=RefCall,Description="Reference call">
##INFO=<ID=P,Number=0,Type=Flag,Description="Result from pileup calling">
##INFO=<ID=F,Number=0,Type=Flag,Description="Result from full-alignment calling">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Read depth for each allele">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Phred-scaled genotype likelihoods rounded to the closest integer">
##FORMAT=<ID=AF,Number=1,Type=Float,Description="Estimated allele frequency in the range of [0,1]">
##contig=<ID=contig_1,length=2568895>
##FORMAT=<ID=PS,Number=1,Type=Integer,Description="Phase set identifier">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
contig_1 2459 . C A 10.87 PASS F GT:GQ:DP:AF:PS 0|1:10:24:0.2917:2459
contig_1 2465 . C A 10.99 PASS F GT:GQ:DP:AF:PS 0|1:10:24:0.2917:2459
contig_1 2475 . TACCCCAACCCCAACCCCAACCCCA T 9.78 PASS P GT:GQ:DP:AF 0/1:9:24:0.2083
contig_1 2588 . C T 9.03 PASS P GT:GQ:DP:AF:PS 0|1:9:24:0.375:2459
contig_1 2594 . C T 14.76 PASS F GT:GQ:DP:AF:PS 0|1:14:24:0.375:2459
contig_1 2600 . C T 16.54 PASS F GT:GQ:DP:AF:PS 0|1:16:24:0.4167:2459
contig_1 2618 . T TA 15.35 PASS F GT:GQ:DP:AF 0/1:15:24:0.375
contig_1 2628 . TA T 14.22 PASS F GT:GQ:DP:AF 0/1:14:24:0.3333
contig_1 2804 . G T 19.19 PASS F GT:GQ:DP:AF:PS 0|1:19:24:0.375:2459
contig_1 2894 . G T 23.33 PASS F GT:GQ:DP:AF:PS 0|1:23:24:0.3333:2459
contig_1 3300 . AACCCT A 17.11 PASS F GT:GQ:DP:AF 0/1:17:24:0.5833
contig_1 3305 . T TA 8.56 PASS F GT:GQ:DP:AF 0/1:8:24:0.375
contig_1 66619 . AT A 0 LowQual F GT:GQ:DP:AF 1/1:0:23:0.1739

GTF file:

contig_1 Phasing exon 2459 400591 . + . gene_id "2459"; transcript_id "2459.1";
contig_1 Phasing exon 641089 1001211 . + . gene_id "641089"; transcript_id "641089.1";
contig_1 Phasing exon 1230613 1244186 . + . gene_id "1230613"; transcript_id "1230613.1";
contig_1 Phasing exon 1582658 1670184 . + . gene_id "1582658"; transcript_id "1582658.1";
contig_1 Phasing exon 1858102 1886017 . + . gene_id "1858102"; transcript_id "1858102.1";
contig_1 Phasing exon 2054129 2503491 . + . gene_id "2054129"; transcript_id "2054129.1";

Truncated FASTA reference:

>contig_1
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC

Thanks a lot in advance!!

Best,
Andreas
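
In the truncated BAM records above, the string `MM:Z:` appears directly after `CO:Z:`. If that reflects the real layout of the file (the paste is flattened, so this is an assumption), the modification data would sit inside a comment tag instead of in standalone `MM` and `ML` tags. The sketch below checks one tab-separated SAM line for that situation. The function name and example records are hypothetical, and whether this explains the empty output is not confirmed here.

```python
def modification_tag_report(sam_line):
    """Report whether MM/ML exist as real tags or only inside a CO comment."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = fields[11:]  # optional fields follow the 11 mandatory SAM columns
    names = {tag.split(":", 1)[0] for tag in tags}
    inside_comment = any(
        tag.startswith("CO:Z:") and ("MM:Z:" in tag or "Mm:Z:" in tag)
        for tag in tags
    )
    return {
        "has_mm": bool(names & {"MM", "Mm"}),
        "has_ml": bool(names & {"ML", "Ml"}),
        "mm_inside_co": inside_comment,
    }


# Hypothetical records: the first hides MM/ML in CO, the second has real tags.
mandatory = "read1\t16\tcontig_1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII"
in_comment = modification_tag_report(
    mandatory + "\tNM:i:0\tCO:Z:MM:Z:C+h?,0; ML:B:C,200\tHP:i:1"
)
proper = modification_tag_report(
    mandatory + "\tNM:i:0\tMM:Z:C+h?,0;\tML:B:C,200\tHP:i:1"
)
```

On a real file, the input lines could come from `samtools view`, which prints tab-separated SAM records.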

IndexError: list index out of range

Hi.
I am getting the same error:

Traceback (most recent call last):
  File "/iga/scripts/dev_modules/mambaforge/envs/duet-v0.6/bin/methphasing", line 1471, in <module>
    main(sys.argv[1:])
  File "/iga/scripts/dev_modules/mambaforge/envs/duet-v0.6/bin/methphasing", line 1396, in main
    [0], phased_region_list[1][0])]
IndexError: list index out of range

The command that I used was this one:

meth_phaser_parallel -t 10 -ml -2 \
-o methphaser \
-b grapevine.minimap2.filtered_only_primary.bam \
-r reference.fasta \
-g grapevine_phased.whatshap.stats.gtf \
-vc grapevine.clair3.phased.vcf.gz

Theoretically, I should not have problems with sex chromosomes, given the fact that grapevines do not possess them.

Thank you in advance for the help.

Originally posted by @Mar10L in #10 (comment)

Possible uncaught error for single blocks

Hello there,

I just wanted to report what I think is a small error:

/home/sivico/mambaforge/envs/Phasing/bin/methphasing:4: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  from pkg_resources import require
Traceback (most recent call last):
  File "/home/sivico/mambaforge/envs/Phasing/bin/methphasing", line 1470, in <module>
    main(sys.argv[1:])
  File "/home/sivico/mambaforge/envs/Phasing/bin/methphasing", line 1395, in main
    max_expension_list_larger += [(phased_region_list[0][0], phased_region_list[1][0])]
IndexError: list index out of range

When I looked at the code, and considering my input, I think that this IndexError comes from the fact that the code is expecting at least 2 blocks to merge in any given sequence. If there is a sequence with a single block, instead of skipping the sequence (because there is nothing to do), it tries to index the second block that does not exist, and then the error emerges.

I hope this is helpful to you.
Sivico
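
The behaviour described above, skipping a sequence that has only one block instead of indexing a second block that does not exist, can be sketched by building the pairs from adjacent blocks. The function name `adjacent_block_pairs` is hypothetical, and the assumption that each block is a tuple whose first element is its start position is inferred from the indexing in the traceback, not taken from the methphaser source.

```python
def adjacent_block_pairs(phased_region_list):
    """Pair the first element of each block with that of the next block.

    With zero or one block there is nothing to pair, so the result is an
    empty list instead of an IndexError.
    """
    return [
        (left[0], right[0])
        for left, right in zip(phased_region_list, phased_region_list[1:])
    ]


# Hypothetical (start, end) blocks.
single = adjacent_block_pairs([(100, 900)])
several = adjacent_block_pairs([(100, 900), (2000, 5000), (7000, 9000)])
```

A caller could then skip any sequence for which the returned list is empty.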

Some blocks have only one SNP

Hi, I am using ONT R10.4.1 data provided by GIAB ( https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/release/NA12878_HG001/latest/ ) and testing with whatshap.
Here are the commands I used.

whatshap --version
2.0
whatshap phase --ignore-read-groups --indels \
-r GCA_000001405.15_GRCh38_no_alt_analysis_set.fa \
hg001.wf_snp.vcf.gz \
hg001.sup.60x.bam \
-o whatshap_hg001_60x.vcf
bgzip -c whatshap_hg001_60x.vcf > whatshap_hg001_60x.vcf.gz
tabix -p vcf whatshap_hg001_60x.vcf.gz
whatshap haplotag --ignore-read-groups  \
whatshap_hg001_60x.vcf.gz hg001.sup.60x.bam \
-r GCA_000001405.15_GRCh38_no_alt_analysis_set.fa \
-o hg001.sup.60x.whatshap.haplotag.bam
samtools index -@ 24 hg001.sup.60x.whatshap.haplotag.bam
whatshap stats --gtf whatshap_hg001_60x.gtf whatshap_hg001_60x.vcf.gz
samtools view -bF 2304 -o hg001.sup.60x.whatshap.haplotag.primary.bam hg001.sup.60x.whatshap.haplotag.bam
samtools index -@ 24 hg001.sup.60x.whatshap.haplotag.primary.bam
meth_phaser_parallel -b hg001.sup.60x.whatshap.haplotag.primary.bam \
-r GCA_000001405.15_GRCh38_no_alt_analysis_set.fa \
-g whatshap_hg001_60x.gtf -vc whatshap_hg001_60x.vcf.gz \
-o work

meth_phaser_post_processing -ib hg001.sup.60x.whatshap.haplotag.primary.bam -if work -ov output.vcf -ob output_ob -vc /whatshap_hg001_60x.vcf.gz -t 8

bcftools sort -Ov output.vcf -o output_sorted.vcf

I found that there are duplicate positions in the output vcf file.

Also, some blocks have only one SNP. Is this normal?

If I want to evaluate methphaser (switch errors, N50), should I make any corrections to the output vcf file?

Thank you.
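
To quantify how many blocks contain a single SNP, the phased records can be counted per phase set. The sketch below reads tab-separated VCF records, takes the PS value from the sample column, and lists the phase sets with exactly one phased variant. The function names and example records are hypothetical, and the sketch assumes a single-sample VCF.

```python
from collections import Counter


def variants_per_phase_set(vcf_lines):
    """Count phased variants per (contig, PS) in a single-sample VCF."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10:
            continue  # no sample column to read GT and PS from
        sample = dict(zip(cols[8].split(":"), cols[9].split(":")))
        if "PS" in sample and "|" in sample.get("GT", ""):
            counts[(cols[0], sample["PS"])] += 1
    return counts


def singleton_phase_sets(vcf_lines):
    """Return the (contig, PS) pairs that contain exactly one phased variant."""
    counts = variants_per_phase_set(vcf_lines)
    return sorted(key for key, n in counts.items() if n == 1)


# Hypothetical records: two variants in PS 2459, one in PS 5000, one unphased.
vcf = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE\n",
    "chr1\t2459\t.\tC\tA\t10\tPASS\t.\tGT:PS\t0|1:2459\n",
    "chr1\t2465\t.\tC\tA\t10\tPASS\t.\tGT:PS\t1|0:2459\n",
    "chr1\t5000\t.\tG\tT\t10\tPASS\t.\tGT:PS\t0|1:5000\n",
    "chr1\t6000\t.\tG\tT\t10\tPASS\t.\tGT\t0/1\n",
]
singletons = singleton_phase_sets(vcf)
```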

Error when running meth_phaser_parallel

Hi,

I tried to run the example command:

    ./meth_phaser_parallel -b test_data/HLA.R10.haplotagged.bam -r test_data/GCA_000001405.15_GRCh38_no_alt_analysis_set.chr6.fna -g test_data/LSK.filtered.gtf -vc test_data/HLA.R10.phased.vcf.gz  -o test_data/work -k -1 -ml -2

It raises this error:

  File "/home/_env/anaconda3/envs/mp_env/lib/python3.10/subprocess.py", line 1863, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'methphasing'

Any clue how to fix it? Thanks.
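
The FileNotFoundError shows that the subprocess call could not find an executable named `methphasing`. A quick way to confirm this is to check whether the name resolves on PATH, as in the sketch below. The helper name `missing_executables` is hypothetical. The suggestion that the script directory is simply not on PATH is an assumption based on the command being started as `./meth_phaser_parallel`.

```python
import shutil


def missing_executables(names):
    """Return the names that cannot be resolved to an executable on PATH."""
    return [name for name in names if shutil.which(name) is None]


# Names taken from the commands and traceback in this thread.
not_found = missing_executables(["methphasing", "meth_phaser_parallel"])
print("not on PATH:", not_found)
```

If `methphasing` is listed, adding the directory that contains the methphaser scripts to PATH, or installing the package into the active environment, may resolve the error.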
