BTyper3

In silico taxonomic classification of Bacillus cereus group isolates using assembled genomes

Overview

BTyper3 is a command-line tool for taxonomically classifying Bacillus cereus group genomes using a standardized nomenclature.

The program, as well as the associated databases, can be downloaded from https://github.com/lmc297/BTyper3.

Post issues at https://github.com/lmc297/BTyper3/issues.

For more information, check out the BTyper3 wiki at https://github.com/lmc297/BTyper3/wiki.


Installation

For more information, check out the BTyper3 wiki

conda (recommended)

To create a conda environment named btyper3 and install BTyper3 and all of its dependencies:

  1. Install conda, if necessary
  2. Create a new environment named btyper3 by running the following command from your terminal:
    conda create -n btyper3
  3. Activate your btyper3 environment by running the following command from your terminal:
    conda activate btyper3
  4. Install BTyper3 by running the following command from your terminal:
    conda install bioconda::btyper3
  5. You can now run btyper3! Run the following command from your terminal to view all btyper3 options, or check the BTyper3 wiki for details:
    btyper3 --help
  6. When you're done with BTyper3, you can deactivate the btyper3 environment by running the following:
    conda deactivate

pip

  1. To run BTyper3, please download and install its dependencies, if necessary (these include BLAST+ and fastANI, which the steps and commands in this README rely on).

  2. Add BLAST+ to your path, if necessary (to check if BLAST+ is in your path, try running makeblastdb -h and tblastn -h from your command line; you should get a help message for each command, with no error messages)

  3. Install via pip (this will download required Python dependencies as well):

    pip install btyper3  
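
The path check described above can also be scripted. The following is a minimal sketch, not part of BTyper3: it assumes the executables are named makeblastdb, tblastn, and fastANI (the names used in this README), and the helper name missing_tools is hypothetical.

```python
import shutil

# Executable names taken from this README; adjust if yours differ.
REQUIRED_TOOLS = ("makeblastdb", "tblastn", "fastANI")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed executables were found on PATH.")
```

Finding an executable on the PATH does not guarantee that it runs, so the makeblastdb -h and tblastn -h checks above are still worth doing.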

News: updates in BTyper3 v3.2.0, v3.3.0, and v3.4.0 -- new species just dropped!

The primary function of BTyper3 is to allow users to taxonomically classify B. cereus group genomes using a standardized nomenclature (see the two papers listed under Citation for details regarding how the standardized nomenclature was constructed, and how it compares to historical typing methods, respectively). However, we understand that some users may also want to compare their B. cereus group genomes to the type strain genomes of published B. cereus group species. Thus, in BTyper3 v3.2.0, we added the --ani_typestrains option, which calculates ANI values between a query genome and the genomes of all published B. cereus group species type strains and reports the type strain that produces the highest ANI value.

The type strain genomes used by BTyper3's --ani_typestrains option correspond to the species discussed in Figure 2 of our taxonomy review, plus species published after the review (i.e., Bacillus sanguinis, Bacillus paramobilis, and Bacillus hominis, added in v3.2.0; "B. arachidis" and B. rhizoplanae, added in v3.3.0; "B. pretiosus", added in v3.4.0). Within the standardized taxonomy that BTyper3 uses for genomospecies assignment:

  • All members of Bacillus sanguinis (type strain RefSeq Assembly Accession GCF_018332475.1) belong to B. mosaicus (i.e., B. sanguinis is not considered a novel species in the standardized taxonomy)

  • All members of Bacillus paramobilis (type strain RefSeq Assembly Accession GCF_018332495.1) belong to B. mosaicus (i.e., B. paramobilis is not considered a novel species in the standardized taxonomy)

  • All members of Bacillus hominis (type strain RefSeq Assembly Accession GCF_018332515.1) belong to B. mycoides (i.e., B. hominis is not considered a novel species in the standardized taxonomy)

  • "Bacillus arachidis" (type strain RefSeq Assembly Accession GCF_017498775.1) replaces the putative genomospecies previously referred to as "Unknown Species 17" in the standardized taxonomy

  • Bacillus rhizoplanae (type strain RefSeq Assembly Accession GCF_917563915.1) represents a novel genomospecies within the standardized taxonomy and has been added to the database

  • All members of "Bacillus pretiosus" (type strain RefSeq Assembly Accession GCF_025916425.1) belong to B. mosaicus (i.e., "B. pretiosus" is not considered a novel species in the standardized taxonomy)

Importantly, B. cereus group species are often proposed in the literature using unstandardized approaches (e.g., varying genomospecies thresholds, which may produce overlapping genomospecies). We added the type strain comparison method in BTyper3 v3.2.0 because users may still want to compare a query genome with the type strains of published B. cereus group species. However, interpret results with caution: when type strain genomes are used as references, some B. cereus group genomes may be assigned to more than one species.
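
To illustrate both points (the highest-ANI type strain, and the possibility of a genome matching several type strains), here is a minimal sketch that is not BTyper3's own code. It assumes fastANI's tab-delimited output, in which the second column is the reference and the third is the ANI value. The function name rank_type_strains, the file names, the ANI values, and the threshold are all hypothetical; BTyper3's own report format may differ.

```python
def rank_type_strains(fastani_lines, threshold):
    """Parse fastANI-style lines; return (best_hit, hits_at_or_above_threshold).

    Each hit is a (reference, ani) tuple. `threshold` is a caller-supplied
    ANI cutoff, not a value taken from BTyper3.
    """
    hits = []
    for line in fastani_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        hits.append((fields[1], float(fields[2])))
    if not hits:
        return None, []
    hits.sort(key=lambda hit: hit[1], reverse=True)
    above = [hit for hit in hits if hit[1] >= threshold]
    return hits[0], above


if __name__ == "__main__":
    # Made-up ANI values for illustration only; not real results.
    example = [
        "query.fasta\ttype_strain_A.fna\t96.8\t1500\t1700",
        "query.fasta\ttype_strain_B.fna\t95.4\t1480\t1700",
        "query.fasta\ttype_strain_C.fna\t91.2\t1300\t1700",
    ]
    best, above = rank_type_strains(example, threshold=95.0)
    print("Highest ANI:", best)
    print("At or above threshold:", above)
```

When more than one type strain clears the chosen cutoff, the query genome falls into the overlapping case described above, which is why the single highest-ANI hit should not be read as an unambiguous species assignment.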


Citation

If you found the BTyper3 tool, its source code, and/or any of its associated databases useful, please cite:

Carroll, Laura M., Martin Wiedmann, Jasna Kovac. 2020. "Proposal of a Taxonomic Nomenclature for the Bacillus cereus Group Which Reconciles Genomic Definitions of Bacterial Species with Clinical and Industrial Phenotypes." mBio 11(1): e00034-20; DOI: 10.1128/mBio.00034-20.

Carroll, Laura M., Rachel A. Cheng, Jasna Kovac. 2020. "No Assembly Required: Using BTyper3 to Assess the Congruency of a Proposed Taxonomic Framework for the Bacillus cereus group with Historical Typing Methods." Frontiers in Microbiology 11: 580691; DOI: 10.3389/fmicb.2020.580691.


Quick Start

For detailed information, check out the BTyper3 wiki

Command Structure

btyper3 -i [fasta] -o [output directory] [options...]

For help, type btyper3 -h or btyper3 --help

For your current version, type btyper3 --version

Sample Commands

Perform all default analyses, using an assembled genome (complete or draft) in (multi-)FASTA format as input (assumes fastANI is in the user's path):

btyper3 -i /path/to/genome.fasta -o /path/to/desired/output_directory

Perform all default analyses, using an assembled genome (complete or draft) in (multi-)FASTA format as input (fastANI is not in the user's path):

btyper3 -i /path/to/genome.fasta -o /path/to/desired/output_directory --fastani_path /path/to/FastANI_executable/fastANI

Perform all default analyses, plus pseudo-gene flow unit assignment, using an assembled genome (complete or draft) in (multi-)FASTA format as input (assumes fastANI is in the user's path):

btyper3 -i /path/to/genome.fasta -o /path/to/desired/output_directory --ani_geneflow True

Perform seven-gene MLST only, using user-supplied MLST gene sequences and the latest version of the PubMLST B. cereus s.l. database (sequences can be in multi-FASTA format, or concatenated into a single sequence in FASTA format):

btyper3 -i /path/to/mlst.fasta -o /path/to/desired/output_directory --ani_species False --ani_subspecies False --ani_typestrains False --virulence False --bt False --panC False --download_mlst_latest True

Perform panC group assignment only, using a user-supplied panC gene sequence in FASTA format:

btyper3 -i /path/to/panC.fasta -o /path/to/desired/output_directory --ani_species False --ani_subspecies False --ani_typestrains False --virulence False --bt False --mlst False

Perform virulence factor and Bt toxin-encoding gene detection in a plasmid sequence in FASTA format:

btyper3 -i /path/to/plasmid.fasta -o /path/to/desired/output_directory --ani_species False --ani_subspecies False --ani_typestrains False --mlst False --panC False
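
The "only" commands above all follow one pattern: every analysis that is not wanted is switched off with an explicit False. A minimal sketch of that pattern follows, using only flag names that appear in the sample commands; the helper build_command and the constant ANALYSES are hypothetical and are not part of BTyper3.

```python
# Analysis switches that appear in the sample commands above.
ANALYSES = ("ani_species", "ani_subspecies", "ani_typestrains",
            "virulence", "bt", "mlst", "panC")


def build_command(fasta, outdir, keep):
    """Return a btyper3 argument list that disables every analysis not in `keep`."""
    unknown = set(keep) - set(ANALYSES)
    if unknown:
        raise ValueError(f"unknown analyses: {sorted(unknown)}")
    cmd = ["btyper3", "-i", fasta, "-o", outdir]
    for name in ANALYSES:
        if name not in keep:
            cmd += [f"--{name}", "False"]
    return cmd


if __name__ == "__main__":
    # Rebuilds the panC-only sample command.
    print(" ".join(build_command("/path/to/panC.fasta",
                                 "/path/to/desired/output_directory",
                                 keep={"panC"})))
```

The returned list can be passed to subprocess.run(cmd, check=True), which is convenient when looping over many input files. Options such as --download_mlst_latest True are not handled by this sketch and would need to be appended separately.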

Disclaimer: BTyper3 is pretty neat! However, no tool is perfect, and BTyper3 cannot definitively prove whether an isolate is pathogenic or not. As always, interpret your results with caution. We are not responsible for taxonomic misclassifications, misclassifications of an isolate's pathogenic potential or industrial utility, and/or misinterpretations (biological, statistical, or otherwise) of BTyper3 results.

btyper3's Issues

fastANI error loading shared libraries libgsl.so.25

Hi Laura,

I was installing this great tool this morning, but came across an error with libgsl.so.25.

$ btyper3 -i 2218710072.fna -o bytper3
Welcome to BTyper3!
You are initializing this run at 2022-07-19 08:56
You ran the following command:
/phe/tools/miniconda3/envs/btyper3/bin/btyper3 -i 2218710072.fna -o bytper3
Report bugs/concerns to Laura M. Carroll, [email protected]
Using FastANI to assign 2218710072 to a species at 2022-07-19 08:56
fastANI: error while loading shared libraries: libgsl.so.25: cannot open shared object file: No such file or directory
Traceback (most recent call last):
  File "/phe/tools/miniconda3/envs/btyper3/bin/btyper3", line 10, in <module>
    sys.exit(main())
  File "/phe/tools/miniconda3/envs/btyper3/lib/python3.10/site-packages/btyper3/__init__.py", line 441, in main
    run_pipeline(args)
  File "/phe/tools/miniconda3/envs/btyper3/lib/python3.10/site-packages/btyper3/__init__.py", line 105, in run_pipeline
    final_species = get_species.run_fastani("species", fastani_path, infile, final_results_directory, prefix)
  File "/phe/tools/miniconda3/envs/btyper3/lib/python3.10/site-packages/btyper3/ani.py", line 60, in run_fastani
    proc.check_returncode()
  File "/phe/tools/miniconda3/envs/btyper3/lib/python3.10/subprocess.py", line 456, in check_returncode
    raise CalledProcessError(self.returncode, self.args, self.stdout,
subprocess.CalledProcessError: Command '['fastANI', '-q', '2218710072.fna', '--rl', '/tmp/tmp69n8g9vktxt', '-o', 'bytper3/btyper3_final_results/species/2218710072_species_fastani.txt']' returned non-zero exit status 127

After a quick google, the current gsl version 2.7.1 only provides libgsl.so.27. For anyone else that encounters this issue, I fixed it by downgrading the gsl version from 2.7.1 to 2.7.
conda install gsl=2.7

Have a good day!
Caitlin

Feature Request: store downloaded db files

To enable further follow-up work on the exact files used for comparison, for example for the genomes using ANI, could a feature be added to specify an outpath where the files are stored?

Error with download_pubmlst_latest.py

$ python download_pubmlst_latest.py

Downloading most recent PubMLST datbase at 2021-04-12 14:38
Traceback (most recent call last):
  File "download_pubmlst_latest.py", line 74, in <module>
    main()
  File "download_pubmlst_latest.py", line 67, in main
    download_pubmlst(btyper3_path)
  File "download_pubmlst_latest.py", line 28, in download_pubmlst
    tree=ET.parse(xml)
  File "/home/domeni/anaconda3/lib/python3.7/xml/etree/ElementTree.py", line 1197, in parse
    tree.parse(source, parser)
  File "/home/domeni/anaconda3/lib/python3.7/xml/etree/ElementTree.py", line 598, in parse
    self._root = parser._parse_whole(source)
xml.etree.ElementTree.ParseError: no element found: line 1, column 0

I'm not very familiar with the requests and xml modules, which this error seems to be related to. Any clue?

Thank you

Domenico
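
One possible reading of this traceback: "no element found: line 1, column 0" is the message ElementTree gives when it is handed empty input, so the download may have returned nothing to parse. Whether that is what happened here is an assumption; the thread does not confirm it. The sketch below reproduces the message and shows a guard that fails with a clearer error. The helper names are hypothetical and are not taken from download_pubmlst_latest.py.

```python
import xml.etree.ElementTree as ET


def empty_input_error():
    """Return the message ElementTree raises for empty input, or None."""
    try:
        ET.fromstring("")
    except ET.ParseError as err:
        return str(err)
    return None


def parse_xml_or_explain(text):
    """Parse XML text, raising a clearer error when the text is empty."""
    if not text.strip():
        raise ValueError("empty response body; nothing to parse as XML")
    return ET.fromstring(text)


if __name__ == "__main__":
    print("Empty input gives:", empty_input_error())
    print("Root tag:", parse_xml_or_explain("<root><item/></root>").tag)
```

If the guard fires, the next thing to check is the download itself (network access, the URL being requested, and the HTTP status) rather than the XML parsing.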

Installation problem

Hi @lmc297,
I get the following error when trying to install BTyper3:

Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: /
Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.
failed

UnsatisfiableError:


I'm using a fresh installation of Anaconda on Windows 11.

Tutorial not working?

ANI seems not to be working for me. Using the tutorial data, I got the attached results, with no ANI classification. I also attach the log.
I tried a few genomes but never got an ANI classification, even with the --ani_typestrains True parameter.
JHQN01.1.log
JHQN01.1_final_results.txt
