bagep's People

Contributors

idolawoye

bagep's Issues

Error in rule snippy_core:

Hi, Idolawoye. Please help. I am having a hard time running BAGEP.

Log

(bagep) laurindo@LAPTOP-M857EF9P:~/BAGEP$ snakemake --cores 6 --config ref=vc_reference.fasta
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 6
Rules claiming more threads will be scaled down.
Job stats:
job            count    min threads    max threads
abricate           1              1              1
all                1              1              1
move_files         1              1              1
snippy_core        1              1              1
tree               1              1              1
vcf_viewer         1              1              1
total              6              1              1

Select jobs to execute...

[Thu Jan 4 18:00:07 2024]
Job 2: Aligning core and whole genomes into a multi fasta file

This is snippy-core 4.6.0
Obtained from http://github.com/tseemann/snippy
Enabling bundled tools for linux
Found any2fasta - /home/laurindo/miniconda3/envs/bagep/bin/any2fasta
Found samtools - /home/laurindo/miniconda3/envs/bagep/bin/samtools
Found minimap2 - /home/laurindo/miniconda3/envs/bagep/bin/minimap2
Found bedtools - /home/laurindo/miniconda3/envs/bagep/bin/bedtools
Found snp-sites - /home/laurindo/miniconda3/envs/bagep/bin/snp-sites
Saving reference FASTA: core.ref.fa
This is any2fasta 0.4.2
Opening 'vc_reference.fasta'
Detected FASTA format
Read 67228 lines from 'vc_reference.fasta'
Wrote 2 sequences from FASTA file.
Processed 1 files.
Done.
Loaded 2 sequences totalling 4033501 bp.
Will mask 0 regions totalling 0 bp ~ 0.00%
Opening: core.tab
Opening: core.vcf
Processing contig: AE003852
Processing contig: AE003853
Generating core.full.aln
Creating TSV file: core.txt
Running: snp-sites -c -o core.aln core.full.aln
Warning: No SNPs were detected so there is nothing to output.
ERROR: Could not run: snp-sites -c -o core.aln core.full.aln
[Thu Jan 4 18:00:09 2024]
Error in rule snippy_core:
jobid: 0
output: core.aln, core.vcf, core.full.aln

Traceback (most recent call last):
  File "/home/laurindo/miniconda3/envs/bagep/lib/python3.6/site-packages/snakemake/executors/__init__.py", line 593, in _callback
    raise ex
  File "/home/laurindo/miniconda3/envs/bagep/lib/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/laurindo/miniconda3/envs/bagep/lib/python3.6/site-packages/snakemake/executors/__init__.py", line 579, in cached_or_run
    run_func(*args)
  File "/home/laurindo/miniconda3/envs/bagep/lib/python3.6/site-packages/snakemake/executors/__init__.py", line 2461, in run_wrapper
    raise ex
  File "/home/laurindo/miniconda3/envs/bagep/lib/python3.6/site-packages/snakemake/executors/__init__.py", line 2442, in run_wrapper
    runtime_sourcecache_path,
  File "/home/laurindo/BAGEP/Snakefile", line 171, in __rule_snippy_core
  File "/home/laurindo/miniconda3/envs/bagep/lib/python3.6/site-packages/snakemake/shell.py", line 287, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; snippy-core --ref vc_reference.fasta --prefix core' returned non-zero exit status 2.
Exiting because a job execution failed. Look above for error message
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /home/laurindo/BAGEP/.snakemake/log/2024-01-04T180007.501282.snakemake.log
(bagep) laurindo@LAPTOP-M857EF9P:~/BAGEP$
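
The job stats in this log list no per-sample jobs, and the failing command (`snippy-core --ref vc_reference.fasta --prefix core`) names no sample directories, which may mean the pipeline found no read pairs to process. A minimal Python sketch for checking this, assuming the reads live in a `fastq/` directory and follow the `<sample>_R1.fastq.gz` / `<sample>_R2.fastq.gz` naming shown in another report on this page. The function `find_read_pairs` is hypothetical and not part of BAGEP.

```python
from pathlib import Path

R1_SUFFIX = "_R1.fastq.gz"  # assumed naming, taken from another report on this page
R2_SUFFIX = "_R2.fastq.gz"


def find_read_pairs(fastq_dir="fastq"):
    """Return (complete, incomplete) sample names found in fastq_dir.

    A sample is complete when both its R1 and R2 files are present.
    """
    root = Path(fastq_dir)
    if not root.is_dir():
        return [], []
    r1 = {p.name[: -len(R1_SUFFIX)] for p in root.glob("*" + R1_SUFFIX)}
    r2 = {p.name[: -len(R2_SUFFIX)] for p in root.glob("*" + R2_SUFFIX)}
    # Intersection: both mates present. Symmetric difference: one mate missing.
    return sorted(r1 & r2), sorted(r1 ^ r2)


if __name__ == "__main__":
    complete, incomplete = find_read_pairs()
    print(f"{len(complete)} complete pair(s): {complete}")
    if incomplete:
        print(f"samples missing a mate: {incomplete}")
```

If this reports zero complete pairs when run from the directory where `snakemake` was launched, the file names or the folder location are worth checking before looking at snippy itself.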

snippy error

I think the environment ships an old version of snippy that is affected by this issue: tseemann/snippy#344.

Could you upgrade to the latest snippy version?

samtools version error

I installed the repo today and get this error early, during the pre-run tests:

Need samtools --version >= 1.7 but you have 1.12 - please upgrade it.

snippy --check
[11:51:23] This is snippy 4.4.3
[11:51:23] Written by Torsten Seemann
[11:51:23] Obtained from https://github.com/tseemann/snippy
[11:51:23] Detected operating system: linux
[11:51:23] Enabling bundled linux tools.
[11:51:23] Found bwa - /opt/biotools/miniconda3/envs/bagep/bin/bwa
[11:51:23] Found bcftools - /opt/biotools/miniconda3/envs/bagep/bin/bcftools
[11:51:23] Found samtools - /opt/biotools/miniconda3/envs/bagep/bin/samtools
[11:51:23] Found java - /opt/biotools/miniconda3/envs/bagep/bin/java
[11:51:23] Found snpEff - /opt/biotools/miniconda3/envs/bagep/bin/snpEff
[11:51:23] Found samclip - /opt/biotools/miniconda3/envs/bagep/bin/samclip
[11:51:23] Found seqtk - /opt/biotools/miniconda3/envs/bagep/bin/seqtk
[11:51:23] Found parallel - /opt/biotools/miniconda3/envs/bagep/bin/parallel
[11:51:23] Found freebayes - /opt/biotools/miniconda3/envs/bagep/bin/freebayes
[11:51:23] Found freebayes-parallel - /opt/biotools/miniconda3/envs/bagep/bin/freebayes-parallel
[11:51:23] Found fasta_generate_regions.py - /opt/biotools/miniconda3/envs/bagep/bin/fasta_generate_regions.py
[11:51:23] Found vcfstreamsort - /opt/biotools/miniconda3/envs/bagep/bin/vcfstreamsort
[11:51:23] Found vcfuniq - /opt/biotools/miniconda3/envs/bagep/bin/vcfuniq
[11:51:23] Found vcffirstheader - /opt/biotools/miniconda3/envs/bagep/bin/vcffirstheader
[11:51:23] Found gzip - /bin/gzip
[11:51:23] Found vt - /opt/biotools/miniconda3/envs/bagep/bin/vt
[11:51:23] Found snippy-vcf_to_tab - /opt/biotools/miniconda3/envs/bagep/bin/snippy-vcf_to_tab
[11:51:23] Found snippy-vcf_report - /opt/biotools/miniconda3/envs/bagep/bin/snippy-vcf_report
[11:51:23] Need samtools --version >= 1.7 but you have 1.12 - please upgrade it.
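
The message is contradictory on its face, since samtools 1.12 is newer than 1.7. One plausible cause, which is an assumption and not something confirmed in this thread, is that the check compares the versions as decimal numbers, where 1.12 is smaller than 1.7. A short Python sketch of the difference between a numeric comparison and a component-wise one; `parse_version` is a hypothetical helper, not snippy code.

```python
def parse_version(text):
    """Split a dotted version such as '1.12' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


have, need = "1.12", "1.7"

# Treated as decimals, 1.12 is smaller than 1.7, so this check fails.
print("decimal comparison passes:", float(have) >= float(need))

# Compared component by component, (1, 12) is newer than (1, 7).
print("component comparison passes:", parse_version(have) >= parse_version(need))
```

If that is what happens here, the installed samtools is fine and the check itself is at fault, which would fit the suggestion elsewhere on this page to move to a newer snippy.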

samtools is the copy created in the conda env:

/opt/biotools/miniconda3/envs/bagep/bin/samtools --version
samtools 1.12
Using htslib 1.12
Copyright (C) 2021 Genome Research Ltd.

Samtools compilation details:
    Features:       build=configure curses=yes 
    CC:             /opt/conda/conda-bld/samtools_1616892191687/_build_env/bin/x86_64-conda-linux-gnu-cc
    CPPFLAGS:       -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /opt/biotools/miniconda3/envs/bagep/include
    CFLAGS:         -Wall -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /opt/biotools/miniconda3/envs/bagep/include -fdebug-prefix-map=/opt/conda/conda-bld/samtools_1616892191687/work=/usr/local/src/conda/samtools-1.12 -fdebug-prefix-map=/opt/biotools/miniconda3/envs/bagep=/usr/local/src/conda-prefix
    LDFLAGS:        -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,-rpath,/opt/biotools/miniconda3/envs/bagep/lib -Wl,-rpath-link,/opt/biotools/miniconda3/envs/bagep/lib -L/opt/biotools/miniconda3/envs/bagep/lib
    HTSDIR:         
    LIBS:           
    CURSES_LIB:     -ltinfow -lncursesw

HTSlib compilation details:
    Features:       build=configure plugins=yes, plugin-path=/opt/biotools/miniconda3/envs/bagep/libexec/htslib libcurl=yes S3=yes GCS=yes libdeflate=yes lzma=yes bzip2=yes htscodecs=1.0
    CC:             /opt/conda/conda-bld/htslib_1616818599374/_build_env/bin/x86_64-conda-linux-gnu-cc
    CPPFLAGS:       -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /opt/biotools/miniconda3/envs/bagep/include
    CFLAGS:         -Wall -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /opt/biotools/miniconda3/envs/bagep/include -fdebug-prefix-map=/opt/conda/conda-bld/htslib_1616818599374/work=/usr/local/src/conda/htslib-1.12 -fdebug-prefix-map=/opt/biotools/miniconda3/envs/bagep=/usr/local/src/conda-prefix -fvisibility=hidden
    LDFLAGS:        -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,-rpath,/opt/biotools/miniconda3/envs/bagep/lib -Wl,-rpath-link,/opt/biotools/miniconda3/envs/bagep/lib -L/opt/biotools/miniconda3/envs/bagep/lib -fvisibility=hidden -rdynamic

HTSlib URL scheme handlers present:
    built-in:	 preload, data, file
    S3 Multipart Upload:	 s3w, s3w+https, s3w+http
    Google Cloud Storage:	 gs+http, gs+https, gs
    Amazon S3:	 s3+https, s3+http, s3
    libcurl:	 imaps, pop3, gophers, http, smb, gopher, sftp, ftps, imap, smtp, smtps, rtsp, scp, ftp, telnet, mqtt, https, smbs, tftp, pop3s, dict
    crypt4gh-needed:	 crypt4gh
    mem:	 mem

The workflow proceeds and the fastp processing completes, but then something breaks during snippy:

Error in rule snippy:
    jobid: 84
    output: fastq/ERR2206048/, fastq/ERR2206048.snippy
    shell:
        snippy --force --cleanup --outdir fastq/ERR2206048/ --ref reference/h37rv.gbk --R1 fastp/fastq/ERR2206048_R1.fastq.gz.fastp --R2 fastp/fastq/ERR2206048_R2.fastq.gz.fastp
        (exited with non-zero exit code)

I attach the full snakemake log for more information: error_log.txt

My launch command is: snakemake --jobs 48 --config ref=reference/h37rv.gbk
My reference is NBT H37Rv, found in reference/
My reads are from EBI and were renamed to match your pipeline requirements:

ll fastq/
total 3.7G
drwxr-xr-x  2 u0002316 domain users 4.0K Oct 20 10:53 .
drwxr-xr-x 11 u0002316 domain users 4.0K Oct 20 10:55 ..
-rw-r--r--  1 u0002316 domain users  83M Oct 20 10:41 ERR2206031_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  89M Oct 20 10:41 ERR2206031_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  70M Oct 20 10:41 ERR2206032_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  74M Oct 20 10:41 ERR2206032_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  43M Oct 20 10:41 ERR2206033_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  44M Oct 20 10:41 ERR2206033_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  61M Oct 20 10:41 ERR2206034_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  63M Oct 20 10:41 ERR2206034_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  84M Oct 20 10:41 ERR2206035_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  86M Oct 20 10:41 ERR2206035_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  82M Oct 20 10:41 ERR2206036_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  83M Oct 20 10:41 ERR2206036_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  59M Oct 20 10:41 ERR2206037_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  62M Oct 20 10:41 ERR2206037_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  70M Oct 20 10:41 ERR2206038_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  73M Oct 20 10:41 ERR2206038_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  54M Oct 20 10:41 ERR2206039_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  57M Oct 20 10:41 ERR2206039_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  76M Oct 20 10:41 ERR2206040_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  80M Oct 20 10:41 ERR2206040_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  59M Oct 20 10:41 ERR2206041_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  60M Oct 20 10:41 ERR2206041_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  38M Oct 20 10:41 ERR2206042_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  40M Oct 20 10:41 ERR2206042_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  44M Oct 20 10:41 ERR2206043_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  46M Oct 20 10:41 ERR2206043_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  60M Oct 20 10:41 ERR2206044_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  61M Oct 20 10:41 ERR2206044_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  64M Oct 20 10:41 ERR2206045_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  67M Oct 20 10:41 ERR2206045_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  96M Oct 20 10:41 ERR2206046_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users 101M Oct 20 10:41 ERR2206046_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users 124M Oct 20 10:41 ERR2206047_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users 129M Oct 20 10:41 ERR2206047_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users 112M Oct 20 10:41 ERR2206048_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users 117M Oct 20 10:41 ERR2206048_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  98M Oct 20 10:41 ERR2206049_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users 103M Oct 20 10:41 ERR2206049_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users 126M Oct 20 10:41 ERR2206050_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users 134M Oct 20 10:41 ERR2206050_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  41M Oct 20 10:41 ERR2206051_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  44M Oct 20 10:41 ERR2206051_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  69M Oct 20 10:41 ERR2206052_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  73M Oct 20 10:41 ERR2206052_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  64M Oct 20 10:41 ERR2206053_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  70M Oct 20 10:41 ERR2206053_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  74M Oct 20 10:41 ERR2206054_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  79M Oct 20 10:41 ERR2206054_R2.fastq.gz
-rw-r--r--  1 u0002316 domain users  80M Oct 20 10:41 ERR2206055_R1.fastq.gz
-rw-r--r--  1 u0002316 domain users  86M Oct 20 10:41 ERR2206055_R2.fastq.gz

thanks for your help

Question

I am trying to install BAGEP with the following command, but it has been stuck at "solving environment" for a very long time.

conda env create -f environment.yml

vcf filtered files are empty [Error: fai_fetch failed]

Hi,

I've tried to use your BAGEP pipeline on the dataset available here (https://zenodo.org/record/3731118#.X8jjYOhKg2w).
I downloaded all the fastq files (and put them in a $WorkDir/fastq/ folder) and the reference genome ($WorkDir/ct18genome.fasta), then launched your pipeline using:

snakemake --snakefile $HOME/Software/BAGEP/Snakefile --config ref=ct18genome.fasta

But I got the following error:

[15:15:41] Running: bcftools view --include 'FMT/GT="1/1" && QUAL>=100 && FMT/DP>=10 && (FMT/AO)/(FMT/DP)>=0' snps.raw.vcf  | vt normalize -r reference/ref.fa - | bcftools annotate --remove '^INFO/TYPE,^INFO/DP,^INFO/RO,^INFO/AO,^INFO/AB,^FORMAT/GT,^FORMAT/DP,^FORMAT/RO,^FORMAT/AO,^FORMAT/QR,^FORMAT/QA,^FORMAT/GL' > snps.filt.vcf 2>> snps.log
normalize v0.5

options:     input VCF file                                  -
         [o] output VCF file                                 -
         [w] sorting window size                             10000
         [n] no fail on reference inconsistency for non SNPs false
         [q] quiet                                           false
         [d] debug                                           false
         [r] reference FASTA file                            reference/ref.fa

[fai_fetch_seq] Error: fai_fetch failed. (Seeking in a compressed, .gzi unindexed, file?)
[variant_manip.cpp:67 is_ref_consistent] failure to extract base from fasta file: NC_003198.1:8527-8530

I think it's because the variant is 3 bases long (I used the -d option of vt normalize):

NC_003198.1:8680:T/C
NC_003198.1:8732:A/G
NC_003198.1:8737:T/C
NC_003198.1:8743:C/T
NC_003198.1:8750:CTG/TTA
[fai_fetch_seq] Error: fai_fetch failed. (Seeking in a compressed, .gzi unindexed, file?)
[variant_manip.cpp:67 is_ref_consistent] failure to extract base from fasta file: NC_003198.1:8749-8751

Could you please help me resolve this issue?

Thanks a lot,

Best regards,

Heloise
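
To check whether the interval named in the error can be read from the reference at all, the same region can be extracted independently of vt. A minimal Python sketch, assuming an uncompressed FASTA file and 1-based inclusive coordinates as written in the log (`NC_003198.1:8527-8530`). `fetch_region` is a hypothetical helper, and it does not reproduce how vt or htslib index the file.

```python
def fetch_region(fasta_path, contig, start, end):
    """Return bases start..end (1-based, inclusive) of contig, or None if absent."""
    parts = []
    inside = False
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if inside:
                    break  # reached the record after the one we wanted
                # The contig ID is the first whitespace-delimited token of the header.
                inside = line[1:].split()[:1] == [contig]
            elif inside:
                parts.append(line)
    if not inside:
        return None
    return "".join(parts)[start - 1:end]
```

For example, `fetch_region("reference/ref.fa", "NC_003198.1", 8527, 8530)` uses the file and interval from the log above. A result of `None`, or a string shorter than the requested interval, would point to a contig-name or length mismatch between the VCF and the reference rather than to the multi-base variant itself.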

Error run

Dear Sir/Madam,

I have tried to run the package but unfortunately it does not work. I ran the following command:

(bagep) sam@BioInf2:~/Downloads/BAGEP$ snakemake --config ref=/home/sam/Downloads/BAGEP/Ref-GCF_002005205.3_ASM200520v3_genomic.fna.gz

I then obtained the following error message:
Building DAG of jobs...
MissingInputException in line 76 of /home/sam/Downloads/BAGEP/Snakefile:
Missing input files for rule vcf_viewer:
core.vcf

Could you please help me solve this issue?

Thank you in advance for your support.

Best regards,
Pablo

How to run BAGEP in conda environment?

Hi,

I am trying to analyze Pseudomonas aeruginosa genomes using BAGEP. I followed the installation steps from the BAGEP GitHub page and also set up Centrifuge and the Krona taxonomy plot, but I do not see the command to run the program.
I also do not see any example that illustrates how to use raw .fastq files to generate the results as shown.
It says:
To run the pipeline, simply enter:
snakemake --config ref={users reference genome}
I do not understand where the input files go, or which parameters run the different modules within BAGEP.
Also, what does "users reference genome" within the braces {} refer to?

Please guide me through this.
Thank you,
Govind
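
Judging from the commands quoted in the other reports on this page, `{users reference genome}` appears to be a placeholder for the path to your own reference file, and the reads are placed in a `fastq/` folder inside the directory where `snakemake` is run. A minimal Python sketch that assembles such a command; the helper `bagep_command` and the file name `my_reference.fasta` are hypothetical, and the `--cores` value is only an example.

```python
import shlex


def bagep_command(reference, cores=None):
    """Build the snakemake argument list with the reference passed as a config value."""
    cmd = ["snakemake"]
    if cores is not None:
        cmd += ["--cores", str(cores)]
    # The whole placeholder, braces included, is replaced by the reference path.
    cmd += ["--config", f"ref={reference}"]
    return cmd


# Print the command as it would be typed in the shell, from the BAGEP directory.
print(shlex.join(bagep_command("my_reference.fasta", cores=6)))
```

The list returned by `bagep_command` can also be passed to `subprocess.run` directly if the pipeline is launched from a script.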

no argument -bb in iqtree --help

The iqtree command in the Snakefile reads:

# Building phylogeny
rule tree:
  input:
    "core.aln"
  output:
    "iqtree.log"
  message:
    "Builing phylogeny tree of whole genomes using IQ-Tree"
  shell:
    "iqtree -s results/{input} -bb 1000 > {output}"

Could it be that it should be -b 1000 instead?
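
As far as the IQ-TREE documentation goes, the two options request different analyses: `-b` runs the standard nonparametric bootstrap, while `-bb` is the IQ-TREE 1.x option for the ultrafast bootstrap, which IQ-TREE 2 documents as `-B`. Replacing `-bb 1000` with `-b 1000` would therefore change the method, not only the spelling. A small Python sketch for picking the documented ultrafast-bootstrap option from a version string; `ufboot_flag` is a hypothetical helper, and it assumes a version string of the form `1.6.12` or `2.0.3`.

```python
def ufboot_flag(version):
    """Return the ultrafast-bootstrap option documented for this IQ-TREE version."""
    major = int(version.split(".")[0])
    # IQ-TREE 1.x documents -bb; IQ-TREE 2 documents -B for the same analysis.
    return "-bb" if major < 2 else "-B"


for version in ("1.6.12", "2.0.3"):
    print(version, "->", ufboot_flag(version), "1000")
```

Which IQ-TREE version the BAGEP environment installs is not stated in this report, so it is worth checking the installed version before editing the rule.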
