paulstothard / cgview_comparison_tool

The CGView Comparison Tool (CCT) is a package for visually comparing bacterial, plasmid, chloroplast, and mitochondrial sequences.

Home Page: https://paulstothard.github.io/cgview_comparison_tool/

License: GNU General Public License v3.0

Perl 85.13% Shell 14.62% Dockerfile 0.25%
bioinformatics bacterial-genomes bacterial-genome-analysis comparative-genomics mitochondrial-sequences

cgview_comparison_tool's Introduction

CGView Comparison Tool (CCT)

The CGView Comparison Tool (CCT) is a package for visually comparing bacterial, plasmid, chloroplast, and mitochondrial sequences. The comparisons are conducted using BLAST, and the BLAST results are presented in the form of graphical maps that can also show sequence features, gene and protein names, COG category assignments, and sequence composition characteristics. CCT can generate maps in a variety of sizes, including 400 Megapixel maps suitable for posters. Comparisons can be conducted within a particular species or genus, or all available genomes can be used. The entire map creation process, from downloading sequences to redrawing zoomed maps, can be completed easily using scripts included with CCT. User-defined features or analysis results can be included on maps, and maps can be extensively customized.

Sample CCT maps can be viewed here.

CCT was written and is maintained by Paul Stothard and Jason Grant.

CCT citation

Grant JR, Arantes AS, Stothard P (2012) Comparing thousands of circular genomes using the CGView Comparison Tool. BMC Genomics 13:202.

Use CCT online

Proksee uses a JavaScript version of CGView to create interactive, web-based maps. Proksee is suitable for generating maps comparing up to five genomes.

Using the CCT Docker image

Pull the Docker image:

docker pull pstothard/cgview_comparison_tool

Run the Docker image and use fetch_genome_by_accession.sh to download a sequence in GenBank format:

docker run --rm -v "$(pwd)":/dir -u "$(id -u)":"$(id -g)" -w /dir pstothard/cgview_comparison_tool fetch_genome_by_accession.sh -a AC_000022 -o ./

Use build_blast_atlas.sh to create a BLAST atlas project for the sequence that was downloaded:

docker run --rm -v "$(pwd)":/dir -u "$(id -u)":"$(id -g)" -w /dir pstothard/cgview_comparison_tool build_blast_atlas.sh -i AC_000022.gbk

Download some sequences to be used as "comparison genomes" in the BLAST atlas project:

docker run --rm -v "$(pwd)":/dir -u "$(id -u)":"$(id -g)" -w /dir pstothard/cgview_comparison_tool fetch_genome_by_accession.sh -a NC_046914 -o ./AC_000022/comparison_genomes
docker run --rm -v "$(pwd)":/dir -u "$(id -u)":"$(id -g)" -w /dir pstothard/cgview_comparison_tool fetch_genome_by_accession.sh -a NC_047196 -o ./AC_000022/comparison_genomes
docker run --rm -v "$(pwd)":/dir -u "$(id -u)":"$(id -g)" -w /dir pstothard/cgview_comparison_tool fetch_genome_by_accession.sh -a NC_047457 -o ./AC_000022/comparison_genomes
docker run --rm -v "$(pwd)":/dir -u "$(id -u)":"$(id -g)" -w /dir pstothard/cgview_comparison_tool fetch_genome_by_accession.sh -a AC_000022 -o ./AC_000022/comparison_genomes
docker run --rm -v "$(pwd)":/dir -u "$(id -u)":"$(id -g)" -w /dir pstothard/cgview_comparison_tool fetch_genome_by_accession.sh -a NC_043914 -o ./AC_000022/comparison_genomes
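
The download commands above differ only in the accession, so they could be wrapped in a loop. The following is a minimal sketch assuming a bash-compatible shell; the helper names `cct` and `fetch_comparison_genomes` and the `DOCKER` override variable are hypothetical conveniences and are not part of CCT.

```shell
# Sketch only: `cct`, `fetch_comparison_genomes`, and DOCKER are illustrative
# names, not part of CCT. Set DOCKER=echo to preview the commands.
IMAGE="pstothard/cgview_comparison_tool"

# Run a CCT script in the container with the current directory mounted at /dir.
cct() {
  "${DOCKER:-docker}" run --rm -v "$(pwd)":/dir -u "$(id -u)":"$(id -g)" \
    -w /dir "$IMAGE" "$@"
}

# Download each accession into the given output directory.
fetch_comparison_genomes() {
  local outdir="$1"
  shift
  local acc
  for acc in "$@"; do
    cct fetch_genome_by_accession.sh -a "$acc" -o "$outdir" || return 1
  done
}
```

For example, `fetch_comparison_genomes ./AC_000022/comparison_genomes NC_046914 NC_047196 NC_047457 AC_000022 NC_043914` would issue the same five downloads as the commands above.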

Generate the CGView maps:

docker run --rm -v "$(pwd)":/dir -u "$(id -u)":"$(id -g)" -w /dir pstothard/cgview_comparison_tool build_blast_atlas.sh -p AC_000022 -z medium

Once complete, the maps can be found in the AC_000022/maps_for_cds_vs_cds and AC_000022/maps_for_dna_vs_dna directories on the host system.

There are multiple ways to alter the appearance and contents of a map generated by CCT. For example, the configuration files written when a project is first created can be edited prior to completion of the project. Also, CCT script command-line options can be used to alter the contents and appearance of a map. Finally, the CGView XML file that is generated as the input to cgview.jar can be edited and reprocessed. For more details see the CCT documentation, in particular the tutorials.

For example, the following command uses the --custom option to change several aspects of the map:

docker run --rm -v "$(pwd)":/dir -u "$(id -u)":"$(id -g)" -w /dir pstothard/cgview_comparison_tool build_blast_atlas.sh -p AC_000022 -x -z medium --custom "title='Example map' global_label=T legend=F use_opacity=F backboneRadius=900 labelFontSize=60 borderColor=white width=3000 height=3000"

This command redraws the maps in SVG format:

docker run --rm -v "$(pwd)":/dir -u "$(id -u)":"$(id -g)" -w /dir pstothard/cgview_comparison_tool redraw_maps.sh -p AC_000022 -f svg

The SVG maps are added to the AC_000022/maps_for_cds_vs_cds and AC_000022/maps_for_dna_vs_dna directories. Below are the maps generated for the CDS vs CDS comparisons and the DNA vs DNA comparisons, respectively.

[Figure: CGView map for the CDS vs CDS comparisons]
[Figure: CGView map for the DNA vs DNA comparisons]

Downloading and running CCT

To download CCT:

git clone git@github.com:paulstothard/cgview_comparison_tool.git

CCT requires the following programs:

CCT requires the following Perl modules:

  • Bio::SeqIO
  • Bio::SeqUtils
  • Bio::Tools::CodonTable
  • Error
  • File::Temp
  • LWP::Protocol::https
  • Tie::IxHash
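
One way to see which of these modules are still missing is to ask Perl to load each one. The sketch below assumes `perl` is on the PATH; the function name `missing_perl_modules` and the `PERL` override variable are hypothetical and not part of CCT.

```shell
# Sketch only: `missing_perl_modules` and the PERL override are illustrative.
# Prints the name of each listed module that Perl cannot load.
missing_perl_modules() {
  local mod
  for mod in "$@"; do
    "${PERL:-perl}" -M"$mod" -e 1 >/dev/null 2>&1 || echo "$mod"
  done
}
```

Calling `missing_perl_modules Bio::SeqIO Bio::SeqUtils Bio::Tools::CodonTable Error File::Temp LWP::Protocol::https Tie::IxHash` prints nothing when every module loads.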

Set the following environment variables (by editing ~/.bashrc or ~/.bash_profile, for example):

export CCT_HOME="/path/to/cgview_comparison_tool"
export PATH="$PATH:${CCT_HOME}/scripts"
export PERL5LIB="$PERL5LIB:${CCT_HOME}/lib/perl_modules"
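
After opening a new shell, a quick check along the following lines can confirm that the three settings took effect. This is a sketch independent of CCT's own scripts; `check_cct_vars` is a hypothetical helper name.

```shell
# Sketch only: `check_cct_vars` is an illustrative helper, not part of CCT.
# Reports which of the three settings above are missing from the current shell.
check_cct_vars() {
  local rc=0
  if [ -z "${CCT_HOME:-}" ]; then
    echo "CCT_HOME is not set"
    return 1
  fi
  case ":${PATH}:" in
    *":${CCT_HOME}/scripts:"*) ;;
    *) echo "PATH does not include ${CCT_HOME}/scripts"; rc=1 ;;
  esac
  case ":${PERL5LIB:-}:" in
    *":${CCT_HOME}/lib/perl_modules:"*) ;;
    *) echo "PERL5LIB does not include ${CCT_HOME}/lib/perl_modules"; rc=1 ;;
  esac
  return "$rc"
}
```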

Test your setup:

./scripts/check_env.sh

Prepare the COG database:

./scripts/build_cog_db.sh

Build some test maps:

./scripts/process_test_projects.sh

Refer to the CCT documentation for information on how to use CCT. The tutorials are a good starting point.

cgview_comparison_tool's Issues

Incompatibility with BLAST 2.6.0+

Hello @paulstothard

I found a problem trying to run scripts/process_test_projects.sh

DESCRIPTION
   Protein Query-Translated Subject BLAST 2.6.0+

Use '-help' to print detailed descriptions of command line arguments
========================================================================

Error: Unknown argument: "query_gencode"
Error:  (CArgException::eInvalidArg) Unknown argument: "query_gencode"
Program failed, try executing the command manually.

Moreover, I have BLAST 2.9.0 installed, which should be the first option, and you included BLAST 2.2.26.

Should I use BLAST 2.2.26 to run your software?

Thank you for your time,
Best regards.

Warn about wrong input format

Initially I tried to feed plain assembly FASTA files (just two, of 5.5 Mb each) into the program instead of gbk annotations, and that led to uncontrolled RAM consumption. Users should probably be warned about a wrong input format.
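
Until such a warning exists, a user-side check before starting a project could look like the sketch below. It relies only on the fact that GenBank flat files begin with a LOCUS line; `looks_like_genbank` is a hypothetical helper name and does not describe anything CCT does itself.

```shell
# Sketch only: `looks_like_genbank` is an illustrative helper, not part of CCT.
# GenBank flat files begin each record with a LOCUS line; FASTA starts with ">".
looks_like_genbank() {
  local first
  # Take the first non-empty line of the file.
  first="$(grep -m 1 -v '^[[:space:]]*$' "$1")" || return 1
  case "$first" in
    LOCUS*) return 0 ;;
    *) echo "$1 does not start with a LOCUS line (FASTA input?)" >&2; return 1 ;;
  esac
}
```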

Blast colors and filter

Hello,
Thank you for this nice tool!

I have two questions about the tool; the answers would help a lot when drawing maps:

  1. Is it possible to filter the Blast alignments while running? Only allow the tool to use a set of hits with X identities and lengths.
  2. Is it possible to give a list of colors for the Blast alignments? The tool automatically repeats the colors for every three genomes added for comparison. However, I wanted one color for each comparison genome so it is easier to spot. Is that an option for it? Changing it directly from the .xml is too messy.

😄

Please add release tags

Hi,
the Debian Med team is providing cct as an official Debian package. To get new versions there are automatic tools checking GitHub for release tags. So please tag your releases to enable us to detect new versions that are of release quality (in contrast to random development commits).
Kind regards, Andreas.

PERL FAILURE

I tried using cgview and it failed. The check_env step was successful; however, when I used it on my data (Prokka-generated GenBank files), I got this error:

Performing BLAST search for sequence number 8653 (rfaY;S4_09114;_start=9364892;end=9365317;strand=-1;rf=3).
Writing HSP to file.
Open salmonella_project/cct_projects/S4/blast/blast_results_local/S4_CDS_vs_S9.gbk_cds_CDS_blastp to view the BLAST results.
[Wednesday April 07 08:01:05 2021] [Notice] The BLAST results have been written to salmonella_project/cct_projects/S4/blast/blast_results_local.
[Wednesday April 07 08:01:05 2021] [Notice] Assigning COG categories to CDS translations from reference genome sequence in salmonella_project/cct_projects/S4/reference_genome.
Cannot open file : No such file or directory at /home/user/apps/bioinfo/cgview_comparison_tool/lib/scripts/assign_cogs/assign_cogs.pl line 223.
The following command failed: perl /home/yser/apps/bioinfo/cgview_comparison_tool/lib/scripts/assign_cogs/assign_cogs.pl -i 'salmonella_project/cct_projects/S4/reference_genome/S4.gbk' -o 'salmonella_project/cct_projects/S4/features/S4.gbk_cds_cogs.gff' -s 'cds' -myva '/home/user/apps/bioinfo/cgview_comparison_tool/cog_db/myva' -whog '/home/user/apps/bioinfo/cgview_comparison_tool/cog_db/whog' -get_orfs '/home/user/apps/bioinfo/cgview_comparison_tool/lib/scripts/get_orfs/get_orfs.pl' -get_cds '/home/user/apps/bioinfo/cgview_comparison_tool/lib/scripts/get_cds/get_cds.pl' -local_bl '/home/user/apps/bioinfo/cgview_comparison_tool/lib/scripts/local_blast_client/local_blast_client.pl' -blastall 'blastall' -c '11' -e '0.1' -starts 'ttg|ctg|att|atc|ata|atg|gtg' -stops 'taa|tag|tga' -m_orf '25' -m_score '0'.
The following command failed: perl /home/user/apps/bioinfo/cgview_comparison_tool/lib/scripts/assign_cogs/assign_cogs.pl -i 'salmonella_project/cct_projects/S4/reference_genome/S4.gbk' -o 'salmonella_project/cct_projects/S4/features/S4.gbk_cds_cogs.gff' -s 'cds' -myva '/home/user/apps/bioinfo/cgview_comparison_tool/cog_db/myva' -whog '/home/user/apps/bioinfo/cgview_comparison_tool/cog_db/whog' -get_orfs '/home/user/apps/bioinfo/cgview_comparison_tool/lib/scripts/get_orfs/get_orfs.pl' -get_cds '/home/user/apps/bioinfo/cgview_comparison_tool/lib/scripts/get_cds/get_cds.pl' -local_bl '/home/user/apps/bioinfo/cgview_comparison_tool/lib/scripts/local_blast_client/local_blast_client.pl' -blastall 'blastall' -c '11' -e '0.1' -starts 'ttg|ctg|att|atc|ata|atg|gtg' -stops 'taa|tag|tga' -m_orf '25' -m_score '0'. at /home/user/apps/bioinfo/cgview_comparison_tool/scripts/cgview_comparison_tool.pl line 891.
main::_assignCogs(HASH(0x560e9260b7c8), Util::LogManager=HASH(0x560e925750c0)) called at /home/user/apps/bioinfo/cgview_comparison_tool/scripts/cgview_comparison_tool.pl line 799
main::_sequenceAnalysis(HASH(0x560e921513e8), Util::Configurator=HASH(0x560e925a08a8), Util::LogManager=HASH(0x560e925750c0)) called at /home/user/apps/bioinfo/cgview_comparison_tool/scripts/cgview_comparison_tool.pl line 103

[Wednesday April 07 08:01:46 2021] [Error] The following command failed: perl /home/user/apps/bioinfo/cgview_comparison_tool/lib/scripts/assign_cogs/assign_cogs.pl -i 'salmonella_project/cct_projects/S4/reference_genome/S4.gbk' -o 'salmonella_project/cct_projects/S4/features/S4.gbk_cds_cogs.gff' -s 'cds' -myva '/home/user/apps/bioinfo/cgview_comparison_tool/cog_db/myva' -whog '/home/user/apps/bioinfo/cgview_comparison_tool/cog_db/whog' -get_orfs '/home/user/apps/bioinfo/cgview_comparison_tool/lib/scripts/get_orfs/get_orfs.pl' -get_cds '/home/user/apps/bioinfo/cgview_comparison_tool/lib/scripts/get_cds/get_cds.pl' -local_bl '/home/user/apps/bioinfo/cgview_comparison_tool/lib/scripts/local_blast_client/local_blast_client.pl' -blastall 'blastall' -c '11' -e '0.1' -starts 'ttg|ctg|att|atc|ata|atg|gtg' -stops 'taa|tag|tga' -m_orf '25' -m_score '0'.

[Wednesday April 07 08:01:46 2021] [Error] The following command failed: perl /home/user/apps/bioinfo/cgview_comparison_tool/lib/scripts/assign_cogs/assign_cogs.pl -i 'salmonella_project/cct_projects/S4/reference_genome/S4.gbk' -o 'salmonella_project/cct_projects/S4/features/S4.gbk_cds_cogs.gff' -s 'cds' -myva '/home/user/apps/bioinfo/cgview_comparison_tool/cog_db/myva' -whog '/home/user/apps/bioinfo/cgview_comparison_tool/cog_db/whog' -get_orfs '/home/user/apps/bioinfo/cgview_comparison_tool/lib/scripts/get_orfs/get_orfs.pl' -get_cds '/home/user/apps/bioinfo/cgview_comparison_tool/lib/scripts/get_cds/get_cds.pl' -local_bl '/home/user/apps/bioinfo/cgview_comparison_tool/lib/scripts/local_blast_client/local_blast_client.pl' -blastall 'blastall' -c '11' -e '0.1' -starts 'ttg|ctg|att|atc|ata|atg|gtg' -stops 'taa|tag|tga' -m_orf '25' -m_score '0'. at /home/user/apps/bioinfo/cgview_comparison_tool/scripts/cgview_comparison_tool.pl line 891.
main::_assignCogs(HASH(0x560e9260b7c8), Util::LogManager=HASH(0x560e925750c0)) called at /home/user/apps/bioinfo/cgview_comparison_tool/scripts/cgview_comparison_tool.pl line 799
main::_sequenceAnalysis(HASH(0x560e921513e8), Util::Configurator=HASH(0x560e925a08a8), Util::LogManager=HASH(0x560e925750c0)) called at /home/user/apps/bioinfo/cgview_comparison_tool/scripts/cgview_comparison_tool.pl line 103

Multiple genome comparison not being done

I have tried to run cgview on multiple genomes by putting them into the comparison_genomes folder; however, the comparison is always done against the first genome file only, not the others. I have three genome .gbk files in that folder, as shown in the image, but the comparison is being done with just one of them.

Screen Shot 2021-04-12 at 8 46 44 AM

genome with plasmids vs genome with plasmids

One of my genomes with plasmids (a .gb file), run through Docker, gave an accurate representation of its size. Another one, which I downloaded from NCBI, has a genome of around 6503724 bp and 9 plasmids of different sizes, all in one single file. When I draw a map of this latter genome I only get a map of the 6503724 bp sequence, without the 9 plasmids. The file was too big, so I took a snapshot for the image. The gbk file has been uploaded as text because the gbk format is not supported for attachments.
NC_009925.1.txt

Screen Shot 2021-04-20 at 2 21 41 PM
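
When diagnosing a report like this, it can help to confirm how many records the GenBank file actually contains. The sketch below relies only on the fact that each record in a GenBank flat file starts with a LOCUS line; the function names are hypothetical, and whether CCT reads more than the first record is not established here.

```shell
# Sketch only: these helpers are illustrative and not part of CCT.
# Each record in a GenBank flat file starts with a LOCUS line, so counting
# those lines gives the number of sequences (chromosome plus plasmids).
count_genbank_records() {
  grep -c '^LOCUS' "$1"
}

# Print the LOCUS lines, which include each record's name and length.
list_genbank_records() {
  grep '^LOCUS' "$1"
}
```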

CGview Java org.xml.sax.SAXException

When I run redraw_maps.sh -p myproject -f svg or build_blast_atlas.sh -p myproject -m 48g, I get the same error each time:

org.xml.sax.SAXException: value for 'start' attribute in featureRange element must be less than or equal to the length of the plasmid in null at line 52 column 48
        at ca.ualberta.stothard.cgview.CgviewFactory.handleFeatureRange(CgviewFactory.java:3570)
        at ca.ualberta.stothard.cgview.CgviewFactory.startElement(CgviewFactory.java:669)
        at org.apache.xerces.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:497)
        at org.apache.xerces.parsers.AbstractXMLDocumentParser.emptyElement(AbstractXMLDocumentParser.java:180)
        at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:275)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1654)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:324)
        at org.apache.xerces.parsers.XML11Configuration.parse(XML11Configuration.java:845)
        at org.apache.xerces.parsers.XML11Configuration.parse(XML11Configuration.java:768)
        at org.apache.xerces.parsers.XMLParser.parse(XMLParser.java:108)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1201)
        at ca.ualberta.stothard.cgview.CgviewFactory.createCgviewFromFile(CgviewFactory.java:445)
        at ca.ualberta.stothard.cgview.CgviewIO.main(CgviewIO.java:1474)
The following error occurred: org.xml.sax.SAXException: value for 'start' attribute in featureRange element must be less than or equal to the length of the plasmid in null at line 52 column 48

Any tips to fix this error?
Many thanks
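
Since the exception reports a position (line 52, column 48), a first diagnostic step could be to look at that line of the CGView XML file and compare the reported start value with the sequence length. The following is a minimal sketch; `show_xml_line` is a hypothetical helper name and the file path is whatever XML file the project passed to cgview.jar.

```shell
# Sketch only: `show_xml_line` is an illustrative helper, not part of CCT.
# Prints line N of FILE with up to two lines of context on each side.
show_xml_line() {
  local file="$1"
  local line="$2"
  awk -v n="$line" 'NR >= n - 2 && NR <= n + 2 { printf "%d: %s\n", NR, $0 }' "$file"
}
```

For the error above, this would be called with the XML file name followed by `52`.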
