cbg-ethz / V-pipe
V-pipe is a pipeline designed for analysing NGS data of short viral genomes.
Home Page: https://cbg-ethz.github.io/V-pipe/
License: Apache License 2.0
Does V-pipe trim primers? I can't find the information (I only see this parameter in the config file: primers_file=).
If yes, is it done post-alignment?
This software should be available as an automatically built container image, with all dependencies preinstalled, a well-defined interface, and ready to run.
The container image should be available via Quay or Docker Hub.
This was done manually at some point, but the automated pipeline has not been established yet: https://quay.io/repository/dryak/v-pipe?tab=tags
Hi!
I got this error in rule generate_web_visualization, from the assemble_visualization_webpage.py script at line 16:
from Bio.Alphabet import IUPAC
ImportError: Bio.Alphabet has been removed from Biopython. In many cases, the alphabet can simply be ignored and removed from scripts. In a few cases, you may need to specify the
molecule_type
as an annotation on a SeqRecord for your script to work correctly. Please see https://biopython.org/wiki/Alphabet for more information.
What should I do to fix this? Should I roll back my version of Biopython?
Thanks!
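For context, a sketch of the usual migration (illustrative only, not the actual assemble_visualization_webpage.py code): Bio.Alphabet was removed in Biopython 1.78, and in most cases the import can simply be deleted; where a molecule type matters (e.g. when writing GenBank), it is set as a SeqRecord annotation instead. Pinning biopython below 1.78 is the other workaround.

```python
# Illustrative migration pattern for Biopython >= 1.78 (not the actual
# V-pipe script). The removed alphabet becomes a plain "molecule_type"
# annotation on the SeqRecord, e.g.:
#     record.annotations["molecule_type"] = "DNA"
try:
    from Bio.Alphabet import IUPAC   # only exists in Biopython < 1.78
    molecule_type = "DNA"            # old code would pass IUPAC.unambiguous_dna
except ImportError:
    molecule_type = "DNA"            # new code: annotate the SeqRecord instead
print(molecule_type)  # -> DNA
```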
https://github.com/cbg-ethz/V-pipe/blob/master/utils/quick_install.sh#L216 is wrong:
the URL for the tagged-version tar archive is now
https://github.com/cbg-ethz/V-pipe/archive/refs/tags/v2.99.1.tar.gz
(with refs/tags), not the previous form without the refs/tags prefix.
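If a script needs to build that URL, a quick sketch of the new form (the helper name is hypothetical; the version string is the one from above):

```python
# The tagged-release tarball URL now includes the "refs/tags" path segment.
def tag_archive_url(version: str) -> str:
    return f"https://github.com/cbg-ethz/V-pipe/archive/refs/tags/{version}.tar.gz"

print(tag_archive_url("v2.99.1"))
# -> https://github.com/cbg-ethz/V-pipe/archive/refs/tags/v2.99.1.tar.gz
```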
Dynamic files are marked as "experimental", and using more than one wildcard is marked as "not supported". The snv rule uses two wildcards.
https://bitbucket.org/snakemake/snakemake/issues/577/python-interrupts-snakemake-with-keyerror
Hey,
I have another issue regarding the use of the config file.
I'm running tests with different parameters on different sample sets, so it would be great if I could simply create several config files and run V-pipe using snakemake's --configfile option. I assumed this should be possible, since it is basic snakemake functionality.
However, after some tests I found that using this option always throws an error, even if I pass the default path/to/vpipe/workdir/vpipe.config with --configfile, i.e. exactly the file that is used when running V-pipe without specifying a config file. I get YAML format errors, which is why I checked several times that it really is the same file. I can't think of any reason for this anymore.
Do you have an idea what could cause this? Or have I indeed misunderstood some snakemake logic?
These are the errors I get:
yaml.parser.ParserError: expected '<document start>', but found '<scalar>'
in "vpipe.config", line 2, column 1
[...]
During handling of the above exception, another exception occurred:
[...]
snakemake.exceptions.WorkflowError: Config file is not valid JSON or YAML. In case of YAML, make sure to not mix whitespace and tab indentation.
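For what it's worth, one likely cause (an assumption, not verified against this setup): in V-pipe versions where vpipe.config is an INI-style file with [section] headers, it parses fine with Python's configparser but is not valid YAML, while snakemake's --configfile only accepts YAML or JSON — which would produce exactly this ParserError. A minimal illustration using a hypothetical config excerpt:

```python
# An INI-style excerpt (hypothetical key) is valid for configparser but is
# rejected by YAML parsers: "[general]" reads as a one-element YAML list,
# and the following "key = value" scalar has no document start -- matching
# the "expected '<document start>', but found '<scalar>'" error above.
import configparser

ini_text = "[general]\naligner = bwa\n"

cp = configparser.ConfigParser()
cp.read_string(ini_text)
print(cp["general"]["aligner"])  # -> bwa
```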
I have run "./vpipe --dryrun" and got this:
VPIPE_BASEDIR = /users/Carlotta/v-pipe/testing/V-pipe
Building DAG of jobs...
Job stats:
job count min threads max threads
all 1 1 1
total 1 1 1
[Tue Oct 5 15:46:26 2021]
localrule all:
jobid: 0
resources: tmpdir=/var/folders/rb/rdcsh0j507n5b1ctytvr7xy00000gn/T
Job stats:
job count min threads max threads
all 1 1 1
total 1 1 1
This was a dry-run (flag -n). The order of jobs does not reflect the order of execution.
What does it mean?
Any help is really appreciated, thanks a lot!
Carlotta
Hello,
When I run V-pipe I get the following error log in the initial consensus folder, and the V-pipe run did not complete. Any assistance would be much appreciated.
[bwa_index] Construct BWT for the packed sequence...
[bwa_index] 0.00 seconds elapse.
[bwa_index] Update BWT... 0.00 sec
[bwa_index] Pack forward-only FASTA... 0.00 sec
[bwa_index] Construct SA from BWT and Occ... 0.00 sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa index consensus.fasta
[main] Real time: 0.020 sec; CPU: 0.004 sec
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 22 sequences (4752 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 1, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] skip orientation FR as there are not enough pairs
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 22 reads in 0.005 CPU sec, 0.005 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 4 consensus.fasta ../preprocessed_data/R1.fastq ../preprocessed_data/R2.fastq
[main] Real time: 0.007 sec; CPU: 0.008 sec
INFO 2019-10-14 23:36:08 SamToFastq
********** NOTE: Picard's command line syntax is changing.
********** For more information, please see:
********** https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)
********** The command line looks like this in the new syntax:
********** SamToFastq -I mapped.bam -FASTQ cleaned/R1.fastq -SECOND_END_FASTQ cleaned/R2.fastq -VALIDATION_STRINGENCY SILENT
23:36:09.046 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/matt_hopken/software/V-pipe/.snakemake/conda/b85da07e/share/picard-2.21.1-0/picard.jar!/com/intel/gkl/native/libgkl_compression.so
[Mon Oct 14 23:36:09 MDT 2019] SamToFastq INPUT=mapped.bam FASTQ=cleaned/R1.fastq SECOND_END_FASTQ=cleaned/R2.fastq VALIDATION_STRINGENCY=SILENT OUTPUT_PER_RG=false COMPRESS_OUTPUTS_PER_RG=false RG_TAG=PU RE_REVERSE=true INTERLEAVE=false INCLUDE_NON_PF_READS=false CLIPPING_MIN_LENGTH=0 READ1_TRIM=0 READ2_TRIM=0 INCLUDE_NON_PRIMARY_ALIGNMENTS=false VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Mon Oct 14 23:36:09 MDT 2019] Executing as matt_hopken@abdoserver1 on Linux 4.15.0-51-generic amd64; OpenJDK 64-Bit Server VM 1.8.0_152-release-1056-b12; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.21.1-SNAPSHOT
[Mon Oct 14 23:36:09 MDT 2019] picard.sam.SamToFastq done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=514850816
/bin/bash: line 66: vicuna: command not found
Sorry, false alarm: it was caused by the automatically recognised file path, which did not match the file path on the local system.
MAFFT version 7.310 does not run successfully.
here's the log... savage failed on all three samples... the output was not put into a log file:
Activating conda environment: .snakemake/conda/e6edb6f18a80cf5f3d8af13ded28a55d_
patch 3 - De novo overlap computations - RunningProcessing output[Fri Nov 18 13:13:47 2022]
Finished job 30.
26 of 30 steps (87%) done
patch 9 - De novo overlap computations - Running rust-overlapsINFO 2022-11-18 13:13:51 SamToFastq
********** NOTE: Picard's command line syntax is changing.
**********
********** For more information, please see:
********** https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)
**********
********** The command line looks like this in the new syntax:
**********
********** SamToFastq -I samples/CAP217/4390/alignments/REF_aln.bam -FASTQ samples/CAP217/4390/variants/global/R1.fastq -SECOND_END_FASTQ samples/CAP217/4390/variants/global/R2.fastq -RC false
**********
patch 10 - De novo overlap computations - Running rust-overlaps13:13:51.522 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/ubuntu/new-vpipe-haplotype-recon-experiments/work-hiv-example/.snakemake/conda/e6edb6f18a80cf5f3d8af13ded28a55d_/share/picard-2.22.3-0/picard.jar!/com/intel/gkl/native/libgkl_compression.so
[Fri Nov 18 13:13:51 CET 2022] SamToFastq INPUT=samples/CAP217/4390/alignments/REF_aln.bam FASTQ=samples/CAP217/4390/variants/global/R1.fastq SECOND_END_FASTQ=samples/CAP217/4390/variants/global/R2.fastq RE_REVERSE=false OUTPUT_PER_RG=false COMPRESS_OUTPUTS_PER_RG=false RG_TAG=PU INTERLEAVE=false INCLUDE_NON_PF_READS=false CLIPPING_MIN_LENGTH=0 READ1_TRIM=0 READ2_TRIM=0 INCLUDE_NON_PRIMARY_ALIGNMENTS=false VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Fri Nov 18 13:13:51 CET 2022] Executing as ubuntu@TIM-N716 on Linux 5.10.102.1-microsoft-standard-WSL2 amd64; OpenJDK 64-Bit Server VM 11.0.8-internal+0-adhoc..src; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.22.3
WARNING: BAM index file /home/ubuntu/new-vpipe-haplotype-recon-experiments/work-hiv-example/samples/CAP217/4390/alignments/REF_aln.bam.bai is older than BAM /home/ubuntu/new-vpipe-haplotype-recon-experiments/work-hiv-example/samples/CAP217/4390/alignments/REF_aln.bam
patch 3 - De novo overlap computations - RunningProcessing output[Fri Nov 18 13:13:51 CET 2022] picard.sam.SamToFastq done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=536870912
patch 11 - De novo overlap computations - Running rust-overlaps
-------------------------------------------
SAVAGE - Strain Aware VirAl GEnome assembly
-------------------------------------------
Version: 0.4.2
Author: Jasmijn Baaijens
Command used:
/home/ubuntu/new-vpipe-haplotype-recon-experiments/work-hiv-example/.snakemake/conda/e6edb6f18a80cf5f3d8af13ded28a55d_/opt/savage-0.4.2/savage.py -t 1 --split 20 -p1 /home/ubuntu/new-vpipe-haplotype-recon-experiments/work-hiv-example/samples/CAP217/4390/variants/global/R1.fastq -p2 /home/ubuntu/new-vpipe-haplotype-recon-experiments/work-hiv-example/samples/CAP217/4390/variants/global/R2.fastq -o samples/CAP217/4390/variants/global/
Parameter values:
filtering = True
reference = None
merge_contigs = 0.0
remove_branches = True
contig_len_stage_c = 100
split_num = 20
use_subreads = True
no_assembly = False
diploid_contig_len = 200
overlap_stage_c = 100
input_p2 = /home/ubuntu/new-vpipe-haplotype-recon-experiments/work-hiv-example/samples/CAP217/4390/variants/global/R2.fastq
input_p1 = /home/ubuntu/new-vpipe-haplotype-recon-experiments/work-hiv-example/samples/CAP217/4390/variants/global/R1.fastq
count_strains = False
min_clique_size = 4
diploid_overlap_len = 30
compute_overlaps = True
preprocessing = True
threads = 1
stage_a = True
stage_b = True
stage_c = True
max_tip_len = None
min_overlap_len = None
outdir = samples/CAP217/4390/variants/global/
average_read_len = None
sfo_mm = 50
revcomp = False
input_s = None
diploid = False
Input fastq stats:
Number of single-end reads = 0
Number of paired-end reads = 4872
Total number of bases = 1398543
Average sequence length = 287.1
Using max_tip_len = 287
Using min_overlap_len = 172
*******************
Preprocessing input
Done! s
********************
Overlap computations
Done! t nningProcessing output
**************
SAVAGE Stage a
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [33, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [10, 0]
Processing outputpipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [8, 0]
patch 7 - De novo overlap computationspipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [3, 0]
- Running rust-overlapspipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [0, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [0, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [0, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [1, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [33, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [9, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [0, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [1, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [16, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [4, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [1, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [15, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [0, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [2, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [33, 0]
pipeline_per_stage.py
Processing outputStage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [1, 0]
combine_contigs.py
cat: stage_a/patch0/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch1/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch2/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch3/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch4/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch5/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch6/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch7/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch9/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch10/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch11/stage_a/singles.fastq: No such file or directory
Processing outputcat: stage_a/patch12/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch13/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch14/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch15/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch16/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch17/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch18/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch19/stage_a/singles.fastq: No such file or directory
patch 8 - De novo overlap computationsDone!
**************
SAVAGE Stage b
Empty set of contigs from Stage a (contigs_stage_a.fasta) --> Exiting SAVAGE.
[Fri Nov 18 13:13:58 2022]
Error in rule savage:
jobid: 41
input: samples/CAP188/30/alignments/REF_aln.bam
output: samples/CAP188/30/variants/global/R1.fastq, samples/CAP188/30/variants/global/R2.fastq, samples/CAP188/30/variants/global/contigs_stage_c.fasta
log: samples/CAP188/30/variants/global/savage.out.log, samples/CAP188/30/variants/global/savage.err.log (check log file(s) for error message)
conda-env: /home/ubuntu/new-vpipe-haplotype-recon-experiments/work-hiv-example/.snakemake/conda/e6edb6f18a80cf5f3d8af13ded28a55d_
shell:
# Convert BAM to FASTQ without re-reversing reads - SAVAGE expect all reads in the same direction
source /home/ubuntu/new-vpipe-haplotype-recon-experiments/V-pipe/workflow/scripts/functions.sh
SamToFastq picard I=samples/CAP188/30/alignments/REF_aln.bam FASTQ=samples/CAP188/30/variants/global/R1.fastq SECOND_END_FASTQ=samples/CAP188/30/variants/global/R2.fastq RC=false 2> >(tee samples/CAP188/30/variants/global/savage.err.log >&2)
# Remove /1 and /2 from the read names
sed -i -e "s:/1$::" samples/CAP188/30/variants/global/R1.fastq
sed -i -e "s:/2$::" samples/CAP188/30/variants/global/R2.fastq
R1=${PWD}/samples/CAP188/30/variants/global/R1.fastq
R2=${PWD}/samples/CAP188/30/variants/global/R2.fastq
savage -t 1 --split 20 -p1 ${R1} -p2 ${R2} -o samples/CAP188/30/variants/global/ 2> >(tee -a samples/CAP188/30/variants/global/savage.err.log >&2)
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Removing output files of failed job savage since they might be corrupted:
samples/CAP188/30/variants/global/R1.fastq, samples/CAP188/30/variants/global/R2.fastq
Done! t
**************
SAVAGE Stage a
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [127, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [11, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [16, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [63, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [15, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [18, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [36, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [70, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [48, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [98, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [12, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [14, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [111, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [58, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [51, 0]
Processing outputpipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [53, 0]
pipeline_per_stage.py
patch 8 - De novo overlap computationsStage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [68, 0]
- Running rust-overlapspipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [9, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [32, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [52, 0]
combine_contigs.py
cat: stage_a/patch0/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch1/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch2/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch3/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch4/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch5/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch6/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch7/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch8/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch9/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch10/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch11/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch12/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch13/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch14/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch15/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch17/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch19/stage_a/singles.fastq: No such file or directory
Done!
**************
SAVAGE Stage b
Empty set of contigs from Stage a (contigs_stage_a.fasta) --> Exiting SAVAGE.
[Fri Nov 18 13:14:09 2022]
Error in rule savage:
jobid: 39
input: samples/CAP188/4/alignments/REF_aln.bam
output: samples/CAP188/4/variants/global/R1.fastq, samples/CAP188/4/variants/global/R2.fastq, samples/CAP188/4/variants/global/contigs_stage_c.fasta
log: samples/CAP188/4/variants/global/savage.out.log, samples/CAP188/4/variants/global/savage.err.log (check log file(s) for error message)
conda-env: /home/ubuntu/new-vpipe-haplotype-recon-experiments/work-hiv-example/.snakemake/conda/e6edb6f18a80cf5f3d8af13ded28a55d_
shell:
# Convert BAM to FASTQ without re-reversing reads - SAVAGE expect all reads in the same direction
source /home/ubuntu/new-vpipe-haplotype-recon-experiments/V-pipe/workflow/scripts/functions.sh
SamToFastq picard I=samples/CAP188/4/alignments/REF_aln.bam FASTQ=samples/CAP188/4/variants/global/R1.fastq SECOND_END_FASTQ=samples/CAP188/4/variants/global/R2.fastq RC=false 2> >(tee samples/CAP188/4/variants/global/savage.err.log >&2)
# Remove /1 and /2 from the read names
sed -i -e "s:/1$::" samples/CAP188/4/variants/global/R1.fastq
sed -i -e "s:/2$::" samples/CAP188/4/variants/global/R2.fastq
R1=${PWD}/samples/CAP188/4/variants/global/R1.fastq
R2=${PWD}/samples/CAP188/4/variants/global/R2.fastq
savage -t 1 --split 20 -p1 ${R1} -p2 ${R2} -o samples/CAP188/4/variants/global/ 2> >(tee -a samples/CAP188/4/variants/global/savage.err.log >&2)
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Removing output files of failed job savage since they might be corrupted:
samples/CAP188/4/variants/global/R1.fastq, samples/CAP188/4/variants/global/R2.fastq
Done! t
**************
SAVAGE Stage a
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [31, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [13, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [56, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [18, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [29, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [25, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [34, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [51, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [193, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [55, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [29, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [25, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [83, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [18, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [85, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [9, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [3, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [28, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [44, 0]
pipeline_per_stage.py
Stage a done in 1 iterations
Maximum read length per iteration: [0]
Number of contigs per iteration: [0]
Number of overlaps per iteration: [27, 0]
combine_contigs.py
cat: stage_a/patch0/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch1/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch2/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch3/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch4/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch5/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch6/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch7/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch9/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch10/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch11/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch13/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch14/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch15/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch16/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch17/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch18/stage_a/singles.fastq: No such file or directory
cat: stage_a/patch19/stage_a/singles.fastq: No such file or directory
Done!
**************
SAVAGE Stage b
Empty set of contigs from Stage a (contigs_stage_a.fasta) --> Exiting SAVAGE.
[Fri Nov 18 13:14:31 2022]
Error in rule savage:
jobid: 40
input: samples/CAP217/4390/alignments/REF_aln.bam
output: samples/CAP217/4390/variants/global/R1.fastq, samples/CAP217/4390/variants/global/R2.fastq, samples/CAP217/4390/variants/global/contigs_stage_c.fasta
log: samples/CAP217/4390/variants/global/savage.out.log, samples/CAP217/4390/variants/global/savage.err.log (check log file(s) for error message)
conda-env: /home/ubuntu/new-vpipe-haplotype-recon-experiments/work-hiv-example/.snakemake/conda/e6edb6f18a80cf5f3d8af13ded28a55d_
shell:
# Convert BAM to FASTQ without re-reversing reads - SAVAGE expect all reads in the same direction
source /home/ubuntu/new-vpipe-haplotype-recon-experiments/V-pipe/workflow/scripts/functions.sh
SamToFastq picard I=samples/CAP217/4390/alignments/REF_aln.bam FASTQ=samples/CAP217/4390/variants/global/R1.fastq SECOND_END_FASTQ=samples/CAP217/4390/variants/global/R2.fastq RC=false 2> >(tee samples/CAP217/4390/variants/global/savage.err.log >&2)
# Remove /1 and /2 from the read names
sed -i -e "s:/1$::" samples/CAP217/4390/variants/global/R1.fastq
sed -i -e "s:/2$::" samples/CAP217/4390/variants/global/R2.fastq
R1=${PWD}/samples/CAP217/4390/variants/global/R1.fastq
R2=${PWD}/samples/CAP217/4390/variants/global/R2.fastq
savage -t 1 --split 20 -p1 ${R1} -p2 ${R2} -o samples/CAP217/4390/variants/global/ 2> >(tee -a samples/CAP217/4390/variants/global/savage.err.log >&2)
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Removing output files of failed job savage since they might be corrupted:
samples/CAP217/4390/variants/global/R1.fastq, samples/CAP217/4390/variants/global/R2.fastq
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2022-11-18T125029.533222.snakemake.log
I followed https://github.com/cbg-ethz/V-pipe/blob/master/docs/tutorial_hiv.md; my config was:
general:
  virus_base_config: 'hiv'
  # e.g.: 'hiv', 'sars-cov-2', or absent
  # enable haplotype reconstruction
  # let's try
  haplotype_reconstruction: savage
  # this failed:
  #haplotype_reconstruction: predicthaplo
  # this worked:
  #haplotype_reconstruction: haploclique
input:
  samples_file: samples.tsv
output:
  datadir: samples/
  trim_primers: false
  # see: config/README.md#amplicon-protocols
  snv: false
  local: false
  # enable haplotype reconstruction
  global: true
  visualization: false
  diversity: false
  QA: false
  upload: false
  dehumanized_raw_reads: false
hi all
how can we convert the minority_variants.tsv to VCF, or merge each of the single VCFs into one? (they all have the same naming)
thanks
ibseq
The conda environment for rule 'initial_vicuna' does not work properly. The package on bioconda, mvicuna, is a slightly different tool with different command-line arguments and a different interface. Moreover, VICUNA and mvicuna are no longer maintained.
Additionally, the interface for JAR files is implemented differently when dependencies are fetched from the bioconda channel.
Hey, thanks for all the effort you put in this pipeline!
Because I have to call variants in regions with quite low coverage, I recently tried running the V-pipe SARS-CoV branch using lofreq as SNV caller, defined via the config file as described in the documentation. After some issues I also adjusted the "coverage_intervals": "coverage" value to 10 (to fit the lofreq filter).
In the visualization, however, I only get posterior scores of 1 for every variant. Since it also calls the ShoRAH rule after lofreq I was wondering why this is the case but couldn't find anything so far.
Is this an expected behaviour?
Is there a way to adjust the snv rule to get the posterior scores also when using lofreq as a snv caller?
Do you maybe have any recommendations on how to apply frequency filtering on lofreq variants, regardless of whether they can be included in the visualization afterwards or not? (I think it calculates a p-value, but I couldn't find how to make use of this in V-pipe.)
Any hints on where I could start looking would be highly appreciated. Thanks!
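Regarding frequency filtering: LoFreq records the allele frequency in the AF INFO field of its VCF output, so a cutoff can be applied downstream of V-pipe. A minimal stand-in sketch (in practice a bcftools filter expression would do the same job):

```python
# Keep VCF records whose AF INFO field meets a minimum allele frequency.
# This is a downstream post-processing sketch, not a V-pipe feature.
def filter_by_af(vcf_lines, min_af):
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):          # headers pass through untouched
            kept.append(line)
            continue
        info = line.split("\t")[7]        # 8th VCF column is INFO
        fields = dict(f.split("=", 1) for f in info.split(";") if "=" in f)
        if float(fields.get("AF", 0)) >= min_af:
            kept.append(line)
    return kept
```

The equivalent on the command line should be something like `bcftools view -i 'INFO/AF>=0.05' calls.vcf`.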
Provide a solution for the different interfaces when calling InDelFixer, ConsensusFixer and Picard tools.
There are multiple places to specify GFFs in V-pipe and this is confusing to users.
All this leads to confusion for users, see here
Hi
The following problem I got with V-pipe master (HIV-1 analysis): when I try to initialise the project I have the following Warning:
Warning: cannot detect conda environment
V-pipe project initialized!
Conda V-pipe environment is activated.
And after trying --dryrun, the following error came up:
$ ./vpipe --dryrun
VPIPE_BASEDIR = /Users/sviat/V-pipe
Migrating .snakemake folder to new format...
Migration complete
Building DAG of jobs...
WorkflowError:
WorkflowError:
MissingInputException: Missing input files for rule gunzip:
samples/ADA1038B/20210521/extracted_data/R1.fastq.gz
CyclicGraphException: Cyclic dependency on rule convert_to_ref.
MissingInputException: Missing input files for rule sam2bam:
samples/ADA1038B/20210521/alignments/REF_aln.sam
ADA1038B is the first sample in my sample list.
Samples prepared according to the manual:
v-pipe_workdir/samples/ADA1038B/20210521/raw_data/ADA1038B_R1.fastq
v-pipe_workdir/samples/ADA1038B/20210521/raw_data/ADA1038B_R2.fastq
...
If the files are in *fastq.gz format, the error message looks a bit different:
Building DAG of jobs...
MissingInputException in line 10 of /Users/sviat/V-pipe/rules/quality_assurance.smk:
Missing input files for rule gunzip:
samples/ADA1038B/20210521/extracted_data/R1.fastq.gz
Can you please help me with this issue?
Thank you!
P.S. The SARS-CoV-2 V-pipe branch works perfectly!
Having a little example with sample files somewhere up front in the README would be nice.
I have successfully run the test data and the Wuhan data, which are paired-end, but I am not able to run single-end data, as there is no specific guide/manual for it.
I have learned that single-end data is supported, as reported in the publication.
I tried renaming the file to read_R1.fastq, but with no success:
(base) zuber@gbrc-hpc-42:/opt/data/env-V-pipe/ENV/work$ ./vpipe --cores 40
VPIPE_BASEDIR = /opt/data/env-V-pipe/ENV/V-pipe
AssertionError in line 369 of /opt/data/env-V-pipe/ENV/V-pipe/rules/common.smk:
ERROR: Line '3' does not contain at least two entries!
File "/opt/data/env-V-pipe/ENV/V-pipe/vpipe.snake", line 11, in
File "/opt/data/env-V-pipe/ENV/V-pipe/rules/common.smk", line 369, in
Hello, could you tell me what this error means and how I can solve it?
Activating conda environment: .snakemake/conda/0a11ef3f67c5c382159134f72bbed3ac_
ERROR: Count argument '2-testing-work-results-2VM-sim-20170904' is not an integral/floating point value! Aborting.
[Thu Feb 2 15:03:36 2023]
Error in rule hmm_align:
jobid: 3
input: /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/references/initial_consensus.fasta, /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/preprocessed_data/R1.fastq, /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/preprocessed_data/R2.fastq
output: /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/alignments/full_aln.sam, /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/alignments/rejects.sam, /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/references/ref_ambig.fasta, /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/references/ref_majority.fasta
log: /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/alignments/ngshmmalign.out.log, /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/alignments/ngshmmalign.err.log (check log file(s) for error details)
conda-env: /home/diamantev/HIV/Vpipe/working_2/.snakemake/conda/0a11ef3f67c5c382159134f72bbed3ac_
shell:
CONSENSUS_NAME=/home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904
CONSENSUS_NAME="${CONSENSUS_NAME#*/}"
CONSENSUS_NAME="${CONSENSUS_NAME//\//-}"
# 1. clean previous run
rm -rf /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/alignments
rm -f /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/references/ref_ambig.fasta
rm -f /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/references/ref_majority.fasta
mkdir -p /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/alignments
mkdir -p /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/references
# 2. perform alignment # -l = leave temps
ngshmmalign -v -R /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/references/initial_consensus.fasta -o /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/alignments/full_aln.sam -w /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/alignments/rejects.sam -t 1 -N "${CONSENSUS_NAME}" /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/preprocessed_data/R1.fastq /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/preprocessed_data/R2.fastq > /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/alignments/ngshmmalign.out.log 2> >(tee /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/alignments/ngshmmalign.err.log >&2)
# 3. move references into place
mv /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/{alignments,references}/ref_ambig.fasta
mv /home/diamantev/HIV/Vpipe/working_2/testing/work/results/2VM-sim/20170904/{alignments,references}/ref_majority.fasta
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2023-02-02T150334.285845.snakemake.log
Currently, unknown keywords in config files are silently ignored by snakemake's default "snakemake.utils.validate" validator.
This makes typos in sections and options hard to spot for users.
TODO: warn about unknown keywords and search for similar-looking keywords.
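One way to implement this TODO is to compare user-supplied keys against the schema's known keys with difflib; a sketch (function name and message format are illustrative, not V-pipe code):

```python
import difflib

def warn_unknown_keys(user_keys, known_keys):
    """Return a warning per unknown key, suggesting the closest known key."""
    warnings = []
    for key in user_keys:
        if key not in known_keys:
            # get_close_matches does the fuzzy "similar-looking" lookup
            hints = difflib.get_close_matches(key, known_keys, n=1)
            hint = f" (did you mean '{hints[0]}'?)" if hints else ""
            warnings.append(f"unknown config key '{key}'{hint}")
    return warnings
```

Run against the validated schema's property names, this would surface typos like `snv_caler` instead of silently ignoring them.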
Hi there. So apparently that data would not have enough divergence, so a crash with haploclique is expected.
But predicthaplo, which should apparently work, does not:
Removing output files of failed job predicthaplo since they might be corrupted:
samples/SRR10903401/20200102/variants/global/REF_aln.sam
Configuration:
prefix = samples/SRR10903402/20200102/variants/global/predicthaplo/
cons = /home/ubuntu/new-vpipe-haplotype-recon-experiments/V-pipe/workflow/../resources/sars-cov-2/NC_045512.2.fasta
visualization_level = 1
FASTAreads = samples/SRR10903402/20200102/variants/global/REF_aln.sam
have_true_haplotypes = 0
FASTAhaplos =
do_local_Analysis = 1
After parsing the reads in file samples/SRR10903402/20200102/variants/global/REF_aln.sam: average read length= -nan 0
First read considered in the analysis starts at position 100000. Last read ends at position 0
There are 0 reads
/usr/bin/bash: line 3: 25922 Segmentation fault predicthaplo --sam samples/SRR10903402/20200102/variants/global/REF_aln.sam --reference /home/ubuntu/new-vpipe-haplotype-recon-experiments/V-pipe/workflow/../resources/sars-cov-2/NC_045512.2.fasta --prefix samples/SRR10903402/20200102/variants/global/predicthaplo/ --have_true_haplotypes 0 --min_length 0 2> >(tee -a samples/SRR10903402/20200102/variants/global/predicthaplo.err.log >&2)
[Fri Nov 18 11:34:03 2022]
Error in rule predicthaplo:
jobid: 22
input: samples/SRR10903402/20200102/alignments/REF_aln.bam, /home/ubuntu/new-vpipe-haplotype-recon-experiments/V-pipe/workflow/../resources/sars-cov-2/NC_045512.2.fasta
output: samples/SRR10903402/20200102/variants/global/REF_aln.sam, samples/SRR10903402/20200102/variants/global/predicthaplo_haplotypes.fasta
log: samples/SRR10903402/20200102/variants/global/predicthaplo.out.log, samples/SRR10903402/20200102/variants/global/predicthaplo.err.log (check log file(s) for error message)
conda-env: /home/ubuntu/new-vpipe-haplotype-recon-experiments/work-sars-cov-2-example/.snakemake/conda/648dc97f886b8633756d6cd60de0ff7c_
shell:
samtools sort -n samples/SRR10903402/20200102/alignments/REF_aln.bam -o samples/SRR10903402/20200102/variants/global/REF_aln.sam 2> >(tee samples/SRR10903402/20200102/variants/global/predicthaplo.err.log >&2)
predicthaplo --sam samples/SRR10903402/20200102/variants/global/REF_aln.sam --reference /home/ubuntu/new-vpipe-haplotype-recon-experiments/V-pipe/workflow/../resources/sars-cov-2/NC_045512.2.fasta --prefix samples/SRR10903402/20200102/variants/global/predicthaplo/ --have_true_haplotypes 0 --min_length 0 2> >(tee -a samples/SRR10903402/20200102/variants/global/predicthaplo.err.log >&2)
# TODO: copy over actual haplotypes
touch samples/SRR10903402/20200102/variants/global/predicthaplo_haplotypes.fasta
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Removing output files of failed job predicthaplo since they might be corrupted:
samples/SRR10903402/20200102/variants/global/REF_aln.sam
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2022-11-18T113259.947175.snakemake.log
I just followed the tutorial https://github.com/cbg-ethz/V-pipe/blob/master/docs/tutorial_sarscov2.md and configured in config.yaml
general:
virus_base_config: 'sars-cov-2'
# e.g: 'hiv', 'sars-cov-2', or absent
# the tool selected as haplotype_reconstruction does the global haplotype reconstruction
haplotype_reconstruction: predicthaplo
output:
# enable global haplotype reconstruction
# might not work with this data...
#
# > nope, this data does not support haplotype reconstruction, not enough divergence?...
global: true
the rest are the default options
Having all output on the console is not really helpful when run in a cluster or multi-core environment.
Hi,
Thank you for your support and for creating this pipeline.
The problem I face is the lack of access to the percentage and frequency of each haplotype.
What I did:
Set global: true in the config file, then got the freq_est.py script from here (https://bitbucket.org/jbaaijens/savage/src/master/).
Unfortunately, it does not work and there is no percentage for haplotypes.
Is this not possible by default in the pipeline? (Like what is seen in the webinars and graphs of this pipeline?)
Does the use of Haploclique have an effect on the creation of this report for the percentage of haplotypes?
Sincerely yours,
Naser
I am performing local haplotype reconstruction on more than 100 samples with shorah.
Is there a way to check whether the MCMC sampling has converged for a particular region or overall?
Do we plan to have the parameters used in the various rules exposed in the config file? Or do we envision users who want to change some of them to do this in some other way?
I ran the whole pipeline starting from the usual raw reads, i.e., alignments are made within V-pipe.
ShoRAh error log:
Traceback (most recent call last):
File "/data/nasif12/home_if12/dvoretsk/projects/V-pipe-SARS/.snakemake/conda/5f9cc436/bin/shorah", line 14, in <module>
main()
File "/data/nasif12/home_if12/dvoretsk/projects/V-pipe-SARS/.snakemake/conda/5f9cc436/lib/python3.6/site-packages/shorah/cli.py", line 196, in main
args.func(args)
File "/data/nasif12/home_if12/dvoretsk/projects/V-pipe-SARS/.snakemake/conda/5f9cc436/lib/python3.6/site-packages/shorah/cli.py", line 75, in shotgun_run
shotgun.main(args)
File "/data/nasif12/home_if12/dvoretsk/projects/V-pipe-SARS/.snakemake/conda/5f9cc436/lib/python3.6/site-packages/shorah/shotgun.py", line 440, in main
r = list(aligned_reads.keys())[0]
IndexError: list index out of range
Job counts:
count jobs
1 convert_to_ref
1
convert_reference -t gi|1142969405|gb|KY272010.1| -m references/ALL_aln_ambig.fasta -i samples/HFMD/71147/alignments/full_aln.bam -o samples/HFMD/71147/alignments/REF_aln.bam > samples/HFMD/71147/alignments/convert_to_ref.out.log 2> >(tee samples/HFMD/71147/alignments/convert_to_ref.err.log >&2)
/bin/bash: 1142969405: command not found
/bin/bash: gb: command not found
/bin/bash: KY272010.1: command not found
/bin/bash: -m: command not found
/bin/bash: 1142969405: command not found
/bin/bash: gb: command not found
/bin/bash: KY272010.1: command not found
/bin/bash: -m: command not found
/bin/bash: 1142969405: command not found
/bin/bash: gb: command not found
/bin/bash: KY272010.1: command not found
/bin/bash: 1142969405: command not found
/bin/bash: gb: command not found
/bin/bash: KY272010.1: command not found
/bin/bash: -m: command not found
/bin/bash: -m: command not found
usage: convert_reference [-h] -t TO [-v] -m input -i input [-o output] [-p]
[-X] [-H]
convert_reference: error: the following arguments are required: -m, -i
usage: convert_reference [-h] -t TO [-v] -m input -i input [-o output] [-p]
[-X] [-H]
convert_reference: error: the following arguments are required: -m, -i
usage: convert_reference [-h] -t TO [-v] -m input -i input [-o output] [-p]
[-X] [-H]
convert_reference: error: the following arguments are required: -m, -i
usage: convert_reference [-h] -t TO [-v] -m input -i input [-o output] [-p]
[-X] [-H]
convert_reference: error: the following arguments are required: -m, -i
[Fri Oct 30 07:39:20 2020]
Error in rule convert_to_ref:
jobid: 0
output: samples/HFMD/71157/alignments/REF_aln.bam
log: samples/HFMD/71157/alignments/convert_to_ref.out.log, samples/HFMD/71157/alignments/convert_to_ref.err.log (check log file(s) for error message)
conda-env: /Volumes/AKiTiO_duo3/CoxA10/trimmed/match_to_virus_genome/v-pipe-working-dir/.snakemake/conda/77a26a2e
shell:
convert_reference -t gi|1142969405|gb|KY272010.1| -m references/ALL_aln_ambig.fasta -i samples/HFMD/71157/alignments/full_aln.bam -o samples/HFMD/71157/alignments/REF_aln.bam > samples/HFMD/71157/alignments/convert_to_ref.out.log 2> >(tee samples/HFMD/71157/alignments/convert_to_ref.err.log >&2)
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Exiting because a job execution failed. Look above for error message
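The repeated "command not found" lines indicate the shell is interpreting the unquoted `|` characters of the FASTA header ID `gi|1142969405|gb|KY272010.1|` as pipes, splitting the convert_reference call into bogus commands. A minimal standalone demonstration (not V-pipe's actual shell template):

```shell
# '|' is the shell pipe operator; unquoted, the header ID splits the
# command line into several bogus commands (hence "gb: command not found").
# Single quotes keep the ID literal:
REF_ID='gi|1142969405|gb|KY272010.1|'
printf '%s\n' "$REF_ID"
```

A pragmatic workaround on the user side is to rename the reference FASTA header to something without `|` characters before running the pipeline.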
Hi all,
I ran the above with my own data but got:
(vpipe_env) ibseq:~/testing/work$ ./vpipe --dryrun
VPIPE_BASEDIR = /univ/ibseq/testing/V-pipe
AssertionError in line 68 of /univ/ibseq/testing/V-pipe/rules/common.smk:
ERROR: Line '13' does not contain at least two entries!
File "/univ/ibseq/testing/V-pipe/vpipe.snake", line 11, in
File "/univ/ibseq/testing/V-pipe/rules/common.smk", line 68, in
from https://cbg-ethz.github.io/V-pipe/tutorial/sars-cov2/
any advice?
thanks
ibseq
Output of conda while trying to construct environment:
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
failed
UnsatisfiableError: The following specifications were found to be incompatible with each other:
Output in format: Requested package -> Available versions
Package htslib conflicts for:
lofreq=2.1.4 -> samtools -> htslib[version='>=1.10,<1.11.0a0|>=1.9,<1.10.0a0']
lofreq=2.1.4 -> htslib[version='>=1.10.2,<1.11.0a0']
Package libgcc-ng conflicts for:
lofreq=2.1.4 -> libgcc-ng[version='>=7.3.0|>=7.5.0']
lofreq=2.1.4 -> python[version='>=3.7,<3.8.0a0'] -> libgcc-ng[version='>=4.9|>=7.2.0']
Package libcurl conflicts for:
lofreq=2.1.4 -> htslib[version='>=1.10.2,<1.11.0a0'] -> libcurl[version='>=7.64.1,<8.0a0|>=7.71.1,<8.0a0']
samtools=1.9 -> curl[version='>=7.64.0,<8.0a0'] -> libcurl[version='7.59.0|7.60.0|7.61.0|7.61.1|7.61.1|7.61.1|7.62.0|7.62.0|7.63.0|7.63.0|7.64.0|7.64.0|7.64.0|7.64.0|7.64.0|7.64.1|7.64.1|7.65.2|7.65.3|7.68.0|7.68.0|7.69.1|7.71.0|7.71.1|7.71.0|7.69.1|7.68.0|7.67.0|7.65.3|7.65.2|7.64.1|7.63.0|7.63.0|7.62.0',build='h1ad7b7a_0|h20c2e04_0|hbdb9355_0|h01ee5af_1000|hbdb9355_0|h01ee5af_1000|h20c2e04_2|h20c2e04_0|h01ee5af_0|hda55be3_4|hda55be3_0|hda55be3_0|hcdd3856_0|hf7181ac_0|hf7181ac_1|hda55be3_0|hda55be3_0|hf7181ac_1|hf7181ac_5|h541490c_2|h20c2e04_0|h20c2e04_0|h20c2e04_0|h20c2e04_0|h20c2e04_0|h20c2e04_0|h01ee5af_1002|hbdb9355_2|h20c2e04_1000|h20c2e04_0|h20c2e04_0|h1ad7b7a_0|h1ad7b7a_0']
bcftools=1.9 -> curl[version='>=7.64.1,<8.0a0'] -> libcurl[version='7.59.0|7.60.0|7.61.0|7.61.1|7.61.1|7.61.1|7.62.0|7.62.0|7.63.0|7.63.0|7.64.0|7.64.0|7.64.0|7.64.0|7.64.0|7.64.1|7.64.1|7.64.1|7.65.2|7.65.3|7.68.0|7.68.0|7.69.1|7.71.0|7.71.1|7.71.0|7.69.1|7.68.0|7.67.0|7.65.3|7.65.2|7.63.0|7.63.0|7.62.0',build='h1ad7b7a_0|h1ad7b7a_0|h20c2e04_1000|hbdb9355_0|h01ee5af_1000|hbdb9355_0|h01ee5af_1000|h20c2e04_2|h01ee5af_0|hda55be3_4|hf7181ac_5|h20c2e04_0|h20c2e04_0|hda55be3_0|hda55be3_0|hcdd3856_0|hf7181ac_0|hf7181ac_1|hda55be3_0|hda55be3_0|hf7181ac_1|h20c2e04_0|h20c2e04_0|h20c2e04_0|h20c2e04_0|h20c2e04_0|h541490c_2|h01ee5af_1002|hbdb9355_2|h20c2e04_0|h20c2e04_0|h20c2e04_0|h1ad7b7a_0']
Package libstdcxx-ng conflicts for:
lofreq=2.1.4 -> python[version='>=3.7,<3.8.0a0'] -> libstdcxx-ng[version='>=4.9|>=7.5.0|>=7.2.0']
lofreq=2.1.4 -> libstdcxx-ng[version='>=7.3.0']
Package zlib conflicts for:
lofreq=2.1.4 -> zlib[version='>=1.2.11,<1.3.0a0']
lofreq=2.1.4 -> samtools -> zlib[version='1.2.*|1.2.11|1.2.11.*|1.2.8.*|1.2.8']
Package libssh2 conflicts for:
bcftools=1.9 -> curl[version='>=7.64.1,<8.0a0'] -> libssh2[version='>=1.8.0,<2.0.0a0|>=1.9.0,<2.0a0|>=1.8.2,<2.0a0|>=1.8.0,<2.0a0']
samtools=1.9 -> curl[version='>=7.64.0,<8.0a0'] -> libssh2[version='>=1.8.0,<2.0.0a0|>=1.9.0,<2.0a0|>=1.8.2,<2.0a0|>=1.8.0,<2.0a0']
Package samtools conflicts for:
lofreq=2.1.4 -> samtools
samtools=1.9
Hi,
I have tried to run the tutorial dataset (SARS-CoV-2) with the command mentioned on the webpage,
https://cbg-ethz.github.io/V-pipe/tutorial/sars-cov2/
However, after the whole script ran successfully, the generated output files are all empty. Could anyone help me find out where the problem is?
Is it happening in prinseq? Below is the output of prinseq:
Please let me know,
Best regards,
Wasim
I always ran into this problem. Any idea?
CreateCondaEnvironmentException:
Could not create conda environment from /Users/jameschen/CloudStation/Bioinform/V-pipe/envs/savage.yaml:
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... failed
ResolvePackageNotFound:
Hi,
I have installed v-pipe and modified the config.yaml and created the folder "samples".
I tried the ./vpipe --dryrun and I've got:
VPIPE_BASEDIR = /Users/opentrons-b10/V-pipe/workflow
Using base configuration virus SARS-CoV-2
WARNING: protocols YAML look-up file </Users/opentrons-b10/V-pipe/workflow/../resources/sars-cov-2/primers.yaml> specified, but no sample ever uses it: fourth column absent from samples TSV file.
Building DAG of jobs...
MissingInputException in rule generate_web_visualization in file /Users/opentrons-b10/V-pipe/workflow/rules/visualization.smk, line 10:
Missing input files for rule generate_web_visualization:
output: samples/20230220/torino/visualization/snv_calling.html, samples/20230220/torino/visualization/alignment.html, samples/20230220/torino/visualization/reference_uri_file, samples/20230220/torino/visualization/bam_uri_file
wildcards: dataset=samples/20230220/torino
affected files:
/Users/opentrons-b10/V-pipe/workflow/../resources/sars-cov-2/primers/v3/nCoV-2019.tsv
Could you please help me in solving this issue?
many thanks for the help,
Carlotta Olivero
The following upstream issues affect execution of V-pipe:
snakemake/snakemake#1021
snakemake/snakemake#1024
current work-around:
mamba install snakemake-minimal=6.3.0
Hello V-pipe team,
Thanks for the wonderful tool.
My V-pipe works fine; however, I am getting this warning (Warning: All reads at position 4045 in the same reverse orientation ?) for around 50 positions, and I don't know what is wrong in the dataset. Can you please explain why I get this warning and how to rectify it?
Thanks
Vinoy
Reported by Maryam:
I tried to run v-pipe and I got the following error:
SyntaxError:
Not all output, log and benchmark files of rule gunzip contain the same wildcards. This is crucial though, in order to avoid that two or more jobs write to the same file.
File ".../V-pipe/vpipe.snake", line 415, in <module>
Then I changed the snakemake version from 5.4.2 to 4.8.0 and got a different error:
Building DAG of jobs...
MissingInputException in line 389 of .../V-pipe/vpipe.snake:
Missing input files for rule all:
samples/patient1/20170904/variants/local/snvs.csv
any comments would be appreciated.
According to @sposadac:
At first sight, can you try running it with:
[output]
local = False
Indeed, I haven't tried snakemake 5+ versions, and there might be some things that need updating. Therefore, running V-pipe using snakemake version 4.8.0 sounds like a good idea for the time being.
Removing output files of failed job predicthaplo since they might be corrupted:
samples/CAP188/4/variants/global/REF_aln.sam
After parsing the reads in file samples/CAP217/4390/variants/global/REF_aln.sam: average read length= -nan 0
First read considered in the analysis starts at position 100000. Last read ends at position 0
There are 0 reads
/usr/bin/bash: line 3: 1586 Segmentation fault predicthaplo --sam samples/CAP217/4390/variants/global/REF_aln.sam --reference samples/cohort_consensus.fasta --prefix samples/CAP217/4390/variants/global/predicthaplo/ --have_true_haplotypes 0 --min_length 0 2> >(tee -a samples/CAP217/4390/variants/global/predicthaplo.err.log >&2)
[Fri Nov 18 12:35:01 2022]
Error in rule predicthaplo:
jobid: 41
input: samples/CAP217/4390/alignments/REF_aln.bam, samples/cohort_consensus.fasta
output: samples/CAP217/4390/variants/global/REF_aln.sam, samples/CAP217/4390/variants/global/predicthaplo_haplotypes.fasta
log: samples/CAP217/4390/variants/global/predicthaplo.out.log, samples/CAP217/4390/variants/global/predicthaplo.err.log (check log file(s) for error message)
conda-env: /home/ubuntu/new-vpipe-haplotype-recon-experiments/work-hiv-example/.snakemake/conda/2bf4df4a26b143afa975c4ca179e069b_
shell:
samtools sort -n samples/CAP217/4390/alignments/REF_aln.bam -o samples/CAP217/4390/variants/global/REF_aln.sam 2> >(tee samples/CAP217/4390/variants/global/predicthaplo.err.log >&2)
predicthaplo --sam samples/CAP217/4390/variants/global/REF_aln.sam --reference samples/cohort_consensus.fasta --prefix samples/CAP217/4390/variants/global/predicthaplo/ --have_true_haplotypes 0 --min_length 0 2> >(tee -a samples/CAP217/4390/variants/global/predicthaplo.err.log >&2)
# TODO: copy over actual haplotypes
touch samples/CAP217/4390/variants/global/predicthaplo_haplotypes.fasta
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Removing output files of failed job predicthaplo since they might be corrupted:
samples/CAP217/4390/variants/global/REF_aln.sam
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2022-11-18T121401.627778.snakemake.log
I followed https://github.com/cbg-ethz/V-pipe/blob/master/docs/tutorial_hiv.md and then configured the following:
general:
virus_base_config: 'hiv'
# e.g: 'hiv', 'sars-cov-2', or absent
# enable haplotype reconstruction
# this failed:
haplotype_reconstruction: predicthaplo
# this worked:
#haplotype_reconstruction: haploclique
input:
samples_file: samples.tsv
output:
datadir: samples/
trim_primers: false
# see: config/README.md#amplicon-protocols
snv: false
local: false
# enable haplotype reconstruction
global: true
visualization: false
diversity: false
QA: false
upload: false
dehumanized_raw_reads: false
Hello,
I am having a first try at V-pipe, and it seems that Bio.Alphabet has been removed from Biopython this month, causing an error/crash at the report generation step.
Traceback (most recent call last):
File "/Users/ywenger/vpipe/V-pipe/scripts/assemble_visualization_webpage.py", line 16, in <module>
from Bio.Alphabet import IUPAC
File "/Users/ywenger/vpipe/work/.snakemake/conda/c7ff4d0f/lib/python3.8/site-packages/Bio/Alphabet/__init__.py", line 20, in <module>
raise ImportError(
ImportError: Bio.Alphabet has been removed from Biopython. In many cases, the alphabet can simply be ignored and removed from scripts. In a few cases, you may need to specify the ``molecule_type`` as an annotation on a SeqRecord for your script to work correctly. Please see https://biopython.org/wiki/Alphabet for more information.
Best,
Yvan
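This can usually be fixed without rolling back Biopython: delete the Bio.Alphabet import and, where an alphabet was attached, record the molecule type as an annotation instead, as the error message suggests. A minimal sketch of the migration (not the exact patch to assemble_visualization_webpage.py):

```python
# Biopython >= 1.78: Bio.Alphabet is gone. Instead of
#   from Bio.Alphabet import IUPAC
# drop the import entirely and, where a record needs a molecule type
# (e.g. for GenBank output), set it as an annotation:
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

record = SeqRecord(Seq("ACGT"), id="example")
record.annotations["molecule_type"] = "DNA"  # replaces the IUPAC alphabet
```

In many scripts the alphabet argument can simply be deleted wherever it was passed to `Seq(...)`; the annotation is only needed where a downstream writer asks for it.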
Hello V-pipe Team,
I am from the University Hospital Essen, Germany, and we work extensively with SARS-CoV-2 in our research. We have also developed a SARS-CoV-2 workflow. In preparation for the publication of our workflow, we have looked at several other SARS-CoV-2 related workflows, including your work. We will present this review in the publication and want to ensure that your work is represented as accurately as possible.
Moreover, there is currently little to no current overview of SARS-CoV-2 related workflows. Therefore, we have decided to make the above comparison publicly available via this GitHub repository. It contains a table with an overview of the functions of different SARS-CoV-2 workflows and the tools used to implement these functions.
We would like to give you the opportunity to correct any misunderstandings on our side. Please take a moment to make sure we are not misrepresenting your work or leaving out important parts of it by taking a look at this overview table. If you feel that something is missing or misrepresented, please feel free to give us feedback by contributing directly to the repository.
Thank you very much!
cc @alethomas
I am working on a server where conda and activate are not in a central location, and therefore snakemake isn't able to load up the environments, giving a "/usr/bin/bash: /usr/bin/activate: No such file or directory" error. The snakemake envs load up fine if I do it myself:
Activating conda environment: /XXX/V-pipe/.snakemake/conda/3d1013d0
/usr/bin/bash: /usr/bin/activate: No such file or directory
# but this loads it up fine
conda activate /XXX/V-pipe/.snakemake/conda/3d1013d0
How can I adjust snakemake or init_project.sh to use conda or conda activate from a different location than /usr/bin/activate?
I do have --use-conda in my calls:
./vpipe --dryrun --use-conda
./vpipe --cores 24 --use-conda
Thanks.
Hi, thanks for this pipeline, loving it. BUT am not yet a snakemake guru.
I have a question regarding optimizing the compute. It seems I can run the pipeline two ways:
If I do (1), I am starting vpipe with the --cores 128 option (AMD server with 128 physical cores), but it seems to use only 4 threads for those sub-programs that can use them. In the vpipe config files I see the threads option, but that seems to be set to 1. So, where did it get the 4, and is there an easy way to change that globally? --threads=128 or something?
If I do (2), is there a way to specify the number of samples that should be processed simultaneously AND, similar to above, the threads to use for each process? Something like: process 8 samples at a time using 16 threads each.
Thanks
Bob
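For (1), a hedged sketch: V-pipe's configuration exposes a threads option in the general section (legacy vpipe.config files show the same under [general]); raising it should propagate to the multi-threaded rules, though per-rule section names may differ between V-pipe versions:

```yaml
general:
  threads: 16   # default thread count handed to multi-threaded rules
```

For (2), snakemake itself schedules jobs so that the sum of their thread counts never exceeds --cores, so with --cores 128 and 16 threads per rule, up to 8 samples would run concurrently without any extra option.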
VPIPE_BASEDIR = /opt/V-dock/V-pipe/workflow
Importing legacy configuration file vpipe.config
MissingSectionHeaderError in line 107 of /opt/V-dock/V-pipe/workflow/rules/common.smk:
File contains no section headers.
file: 'vpipe.config', line: 1
'general:\n'
File "/opt/V-dock/V-pipe/workflow/Snakefile", line 12, in
File "/opt/V-dock/V-pipe/workflow/rules/common.smk", line 259, in
File "/opt/V-dock/V-pipe/workflow/rules/common.smk", line 170, in process_config
File "/opt/V-dock/V-pipe/workflow/rules/common.smk", line 107, in load_legacy_ini
File "/opt/conda/envs/snakemake/lib/python3.10/configparser.py", line 698, in read
File "/opt/conda/envs/snakemake/lib/python3.10/configparser.py", line 1086, in _read
For future development it is essential to externalize the CSS to dedicated stylesheets. This not only improves maintainability and enforces standardization, it also encourages reusability of defined classes. It might also make sense to make full use of CSS 3 to keep up with contemporary Web standards.
Best,
Simon
When I go to the wiki page of this project, I just see a list of the four pages.
It should look more like: https://github.com/npm/cli/wiki with a short introductory text.
When defining datadir different from samples/ in the config.yaml, e.g.
input:
  reference: ../../resources/reference/ancestor_consensus.fasta
  datadir: ../../resources/samples/Experiment3
the function len_cutoff() in quality_assurance.smk takes the wrong part of the split string:
def len_cutoff(wildcards):
parts = wildcards.dataset.split("/")
patient_ID = parts[1] # should be: patient_ID = parts[-2]
date = parts[2] # should be: date = parts[-1]
Instead of
patient_ID = parts[1]
date = parts[2]
it should be
patient_ID = parts[-2]
date = parts[-1]
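The fix can be sanity-checked in isolation: with negative indices, the patient ID and date come from the last two path components no matter how deep datadir nests (a standalone sketch, not the actual V-pipe code):

```python
def last_two(dataset):
    # patient ID and date are always the final two path components,
    # however long the datadir prefix is
    parts = dataset.split("/")
    return parts[-2], parts[-1]
```

With parts[1]/parts[2], the second example below would wrongly return ("resources", "samples").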
Any idea about the cause of the following error ?
Best
[Mon Feb 1 23:30:09 2021]
Finished job 16.
21 of 27 steps (78%) done
[Mon Feb 1 23:30:09 2021]
localrule shorah_regions:
input: variants/coverage_intervals.tsv
output: samples/10919588/K-5771/variants/coverage_intervals.tsv, samples/10919594/K-5770/variants/coverage_intervals.tsv
jobid: 10
VPIPE_BASEDIR = /home/x/V-pipe
Job counts:
count jobs
1 shorah_regions
1
[Mon Feb 1 23:30:10 2021]
Error in rule shorah_regions:
jobid: 0
output: samples/10919588/K-5771/variants/coverage_intervals.tsv, samples/10919594/K-5770/variants/coverage_intervals.tsv
RuleException:
FileNotFoundError in line 65 of /home/x/V-pipe/rules/snv.smk:
[Errno 2] No such file or directory: 'samples/10919588-K/5771/variants/coverage_intervals.tsv'
File "/home/x/V-pipe/rules/snv.smk", line 65, in __rule_shorah_regions
File "/home/x/miniconda3/envs/V-pipe/lib/python3.8/concurrent/futures/thread.py", line 57, in run
Exiting because a job execution failed. Look above for error message
[Mon Feb 1 23:30:19 2021]
Finished job 3.
22 of 27 steps (81%) done
[Mon Feb 1 23:30:22 2021]
Finished job 4.
23 of 27 steps (85%) done
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
vpipe.config
[input]
reference = references/ref.fasta
samples_file = samples.tsv
paired = True
[output]
snv = True
local = True
global = False
[hmm_align]
leave_msa_temp = true
[general]
aligner = bwa
threads = 10
snv_caller = shorah
[preprocessing]
extra = -ns_max_n 4 -min_qual_mean 30 -trim_qual_left 30 -trim_qual_right 30 -trim_qual_window 10
Dear V-pipe authors,
Thank you very much for your pipeline. I attempt to use V-pipe to assess the intra-host viral diversity in plant samples. However, I encountered the following issues:
Error in rule frameshift_deletions_checks:
jobid: 9
output: results_rev/capsicum/41239130/references/frameshift_deletions_check.tsv
log: results_rev/capsicum/41239130/references/frameshift_deletions_check.out.log, results_rev/capsicum/41239130/references/frameshift_deletions_check.err.log (check log file(s) for error message)
conda-env: {...}/vpipe_merged_reads/41239130_Capsicum_annuum_TSWV/.snakemake/conda/c2a6b8c9e98375cd16af9408a0e9b8b2
shell:
frameshift_deletions_checks -i results_rev/capsicum/41239130/alignments/REF_aln.bam -c results_rev/capsicum/41239130/references/consensus.bcftools.fasta -f references_rev/41239130_tswv_conc_rev.fasta -g --english=true -o results_rev/capsicum/41239130/references/frameshift_deletions_check.tsv 2> >(tee results_rev/capsicum/41239130/references/frameshift_deletions_check.err.log >&2)
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: {...}/vpipe_merged_reads/41239130_Capsicum_annuum_TSWV/.snakemake/log/2022-11-01T152833.441917.snakemake.log
The error persists even when I provide the .gff file in config.yaml using either of the following two syntaxes:
root:
frameshift_deletion_checks :
genes_gff : gff_dir/[…].gff3
or
frameshift_deletion_checks:
genes_gff : gff_dir/[…].gff3
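One detail worth double-checking (a guess, not a confirmed fix): the failing rule and the shell command in the log are both spelled frameshift_deletions_checks, while the config snippets above use frameshift_deletion_checks. A section keyed to the rule's spelling would look like this (the .gff3 path below is the placeholder from the snippets above):

```yaml
frameshift_deletions_checks:
  genes_gff: gff_dir/[…].gff3
```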
I cannot find the benchmarking functionality in the latest documentation. The config HTML file contains no explanations or tags on how to generate reads or, more generally, how to use the benchmarking modules. Downloading the benchmark branch and adapting it to the deprecated documentation was not successful. Are you planning to update the documentation for the newest version of the master branch so that the pipeline's benchmarking capabilities can be used? Can you please point me to the benchmarking documentation?
Regarding the haplotype-calling modules: I consistently get segmentation faults when using PredictHaplo (switching between local and global analysis) and core-dump errors when using Haploclique. I was trying these options because I want to perform reference-based haplotype reconstruction. Do you have any ideas why these errors occur?
Haploclique error log:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
PredictHaplo's error log remains empty, but after completing the local haplotype reconstruction it raises a segmentation fault.
It may be relevant that my datasets have very high coverage (50,000x to 350,000x).
Please let me know if you need any more information to answer these issues. Thank you very much for your consideration.
Dimitris Karapliafis
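A std::bad_alloc from Haploclique means the process failed to allocate memory, and segmentation faults at extreme depth are often memory-related as well; downsampling the alignment before haplotype reconstruction is a common workaround for 50,000x+ coverage. A hedged sketch with illustrative numbers (not an official V-pipe recommendation):

```python
# Hedged sketch: choose a samtools subsampling fraction that brings a very
# deep alignment down to a target mean coverage (numbers are illustrative).
def subsample_fraction(observed_cov, target_cov=1_000):
    """Fraction of reads to keep so mean coverage is roughly target_cov."""
    return min(1.0, target_cov / observed_cov)

frac = subsample_fraction(350_000)          # deepest dataset mentioned above
frac_str = f"{frac:.4f}".lstrip("0")        # samtools -s takes SEED.FRACTION
# seed 42 concatenated with the fractional part, e.g. "42.0029"
print(f"samtools view -b -s 42{frac_str} alignments/REF_aln.bam")
```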