Giter Club home page Giter Club logo

ggsashimi's Introduction

ggsashimi

Build Status

Command-line tool for the visualization of splicing events across multiple samples

Installation
Dependencies
Download docker image
Build docker image
Use docker image
Usage
Galaxy
Cite ggsashimi

image

Installation

The ggsashimi script can be directly downloaded from this repository:

wget https://raw.githubusercontent.com/guigolab/ggsashimi/master/ggsashimi.py

Change the execution permissions:

chmod u+x ggsashimi.py

Provided all dependencies are already installed (see below), you can directly execute the script:

./ggsashimi.py --help

To download the entire repository, which includes the dockerfile and example files:

git clone https://github.com/guigolab/ggsashimi.git

Dependencies

In order to run ggsashimi the following software components and packages are required:

  • python (2.7 or 3)
  • pysam (>=0.10.0)
  • R (>=3.3)
    • ggplot2 (>=2.2.1)
    • data.table (>=1.10.4)
    • gridExtra (>=2.2.1)

Additional required R packages grid and gtable should be automatically installed when installing R and ggplot2, respectively. Package svglite (>=1.2.1) is also required when generating output images in SVG format.

To avoid dependencies issues, the script is also available through a docker image.

Download docker image

A public ggsashimi Docker image is available in the Docker Hub and can be downloaded as follows:

docker pull guigolab/ggsashimi

Alternatively, we provide the Dockerfile if you want to build your local docker image, although most users will not need it.

Build docker image (optional)

After downloading the repository, move inside the repository folder:

cd ggsashimi

To build the docker image run the following command:

docker build -f docker/Dockerfile -t guigolab/ggsashimi .

This can take several minutes.

Use docker image

Once the image is downloaded or built, to execute ggsashimi with docker:

docker run guigolab/ggsashimi --help

Because the image is used in a docker container which has its own file system, to use the program with local files, a host data volume needs to be mounted.

As an example, you can run this command from the main repository folder:

docker run -w $PWD -v $PWD:$PWD guigolab/ggsashimi -b examples/input_bams.tsv -c chr10:27040584-27048100

The '-w' option sets the working directory inside the container to the current directory. The '-v' option mounts the current working directory and all child folders inside the container to the same path (host_path:container_path).

If your files are in another folder, for example the annotation file is stored in a different folder then the one containing the bam file, you can mount extra folders like this:

f="$DIR/annotation.gtf"
docker run -w $PWD -v $PWD:$PWD -v $DIR:$DIR guigolab/ggsashimi -b examples/input_bams.tsv -c chr10:27040584-27048100 -g $f

You can even mount a single file:

docker run -w $PWD -v $PWD:$PWD -v $f:$f guigolab/ggsashimi -b examples/input_bams.tsv -c chr10:27040584-27048100 -g $f

Usage

Execute the script with --help option for a complete list of options. Sample data and usage examples can be found at examples

Debug mode

Debug mode allows to run ggsashimi without producing any graphical output. It creates an Rscript file in the current working folder, containing all the R commands to generate the plot. It can be useful when reporting bugs or trying to debug the program behavior. Debug mode can be enabled by setting the environment variable GGSASHIMI_DEBUG when running the script, e.g.:

# export the environment variable
$ export GGSASHIMI_DEBUG=yes
$ ggsashimi.py -b ...

or

# set the environment variable inline
$ GGSASHIMI_DEBUG=yes ggsashimi.py -b ...

Galaxy

Thanks to ARTbio, now a Galaxy wrapper for ggsashimi is available at the Galaxy ToolShed.

Cite ggsashimi

If you find ggsashimi useful in your research please cite the related publication:

Garrido-Martín, D., Palumbo, E., Guigó, R., & Breschi, A. (2018). ggsashimi: Sashimi plot revised for browser-and annotation-independent splicing visualization. PLoS computational biology, 14(8), e1006360.

ggsashimi's People

Contributors

abreschi avatar dgarrimar avatar emi80 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

ggsashimi's Issues

File "sashimi-plot.py", line 214, in intersect_introns a, b = next(it) StopIteration

Hi,

When I run the sashimi-plot.py it comes out this error. Could you help me solve this problem? Thank you!

(python3) [myang9@sbcsmdplp001 ReadsCoverage]$ sashimi-plot.py -s SENSE --shrink -F png -C 2 -g Homo_sapiens.GRCh38.94.gtf -b BamR6103640ACSE98545.bam -c chr1:196788215-196790227 -o R6103640-AC.SE_98545
Traceback (most recent call last):
File "sashimi-plot.py", line 214, in intersect_introns
a, b = next(it)
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "sashimi-plot.py", line 644, in
intersected_introns = list(intersect_introns(introns))
RuntimeError: generator raised StopIteration

Integration in Galaxy

Hello,

I really like your tool and I find it very useful. I wonder if you already thought about wrapping your ggsashimi into Galaxy ?
I am currently working in the bioinformatics plateform ARTbio (see our github here ) and we work a lot with Galaxy by developping the framework but also different tools and wrapping them.

Would you agree to allow our platform to create the galaxy wrapper ? This would be an opportunity to provide a sashimi plot generator to Galaxy users.

Best,
Lea

What's the difference between the flag -S OUT_STRAND and -s STRAND?

Hi,

I am using the ggsashimi to generate sashimi plots for my results. The alternative splicing events I am interested in are located in either + or - strand. I checked the help info of ggsashimi, and the -S OUT_STRAND and -s STRAND are the two flags I can set to focus on only one strand.

But I am not sure how to correctly set the two flags. For my understanding, if the alternative splicing event I am interested in is located in - strand, -s was set to ANTISENSE. How about -S? Should I set it to minus which is compatible to -s flag? What does the out strand mean?

thanks,

Shan

No reads in the specified Area error

Hi,
I am using the ggsashmi to plot some of the transcripts from gtf file, However when I see in the IGV there are reads for exons as well as junctions. However, when I run sashami-plot script I get error "No reads in the specified area."
Example jpeg for the transcripts from IGV.

I am running following command for the plotting.
sashimi-plot.py -b bam_list -C 4 -L 3 -c chr22:39696699-39700075 -o MSTRG10006 -g ../Final_novel.gtf
Screenshot from 2020-06-30 10-25-20

Add exception for GTF files with gene rows as well

I think now we currently support only gtf files with transcripts and exons, assuming that gene rows would still have a transcript_id tag in the metadata. Unfortunately, newer gtf formats do not have transcript_id for genes anymore and it throws error. We should add an exception for this

ValueError: need more than 1 value to unpack

Hello,

First off, thank you for developing this tool. I'm trying to run ggsashimi but I'm encountering the following error:

Traceback (most recent call last):
File "/home/sashimi-plot.py", line 632, in
transcripts, exons = read_gtf(args.gtf, args.coordinates)
File "/home/sashimi-plot.py", line 285, in read_gtf
el_chr, _, el, el_start, el_end, _, strand, _, tags = line.strip().split("\t")
ValueError: need more than 1 value to unpack

I would appreciate some help with this, thank you.

Nelson Barrientos

Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding

$ ./ggsashimi.sh
Python path configuration:
  PYTHONHOME = '/apps/python/3.7'
  PYTHONPATH = (not set)
  program name = '/usr/bin/python3'
  isolated = 0
  environment = 1
  user site = 1
  import site = 1
  sys._base_executable = '/usr/bin/python3'
  sys.base_prefix = '/apps/python/3.7'
  sys.base_exec_prefix = '/apps/python/3.7'
  sys.executable = '/usr/bin/python3'
  sys.prefix = '/apps/python/3.7'
  sys.exec_prefix = '/apps/python/3.7'
  sys.path = [
    '/apps/python/3.7/lib/python38.zip',
    '/apps/python/3.7/lib/python3.8',
    '/apps/python/3.7/lib/python3.8/lib-dynload',
  ]
Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Python runtime state: core initialized
ModuleNotFoundError: No module named 'encodings'

Current thread 0x00002ae45d1c7740 (most recent call first):
<no Python frame>

ggsashimi.sh looks like:

#!/bin/bash
module load python
module load samtools
module load R/4.0

python3 /path/sashimi-plot.py \
--bam /path/wt_480480_ChP_ggsashimi.tsv \
--coordinates chrX:166170454-166178830 \
--gtf /path/Mus_musculus.GRCm38.100.gtf \
--overlay 1 \
--color-factor 3 \
--labels 1 \
--fix-y-scale

Change font size of transcript id when using `-gtf`

I would like to change the size of transctipt id when using --gtf, but I am not sure which part of the R script needs to be modified to achieve this. Could you point to the part of the script that controls this? Thanks!

Docker build error: package devtools is not available

Hi! I was trying to run it via docker and started building the image:
docker build -f docker/Dockerfile -t guigolab/ggsashimi .

However it complains about devtools:

Step 6/14 : RUN R -e 'install.packages("devtools");' &&     R -e "require(devtools); install_version('ggplot2', version = '${GGPLOT_VER}', repos = 'http://cloud.r-project.org');"
 ---> Running in fcc8a5f682b6

R version 3.3.3 (2017-03-06) -- "Another Canoe"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> install.packages("devtools");
Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
Warning: unable to access index for repository https://mran.microsoft.com/snapshot/2017-04-21/src/contrib:
  Line starting '<!doctype html><html ...' is malformed!
Warning message:
package ‘devtools’ is not available (for R version 3.3.3)
>
>

R version 3.3.3 (2017-03-06) -- "Another Canoe"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> require(devtools); install_version('ggplot2', version = '2.2.1', repos = 'http://cloud.r-project.org');
Loading required package: devtools
Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘devtools’
Error: could not find function "install_version"
Execution halted
The command '/bin/sh -c R -e 'install.packages("devtools");' &&     R -e "require(devtools); install_version('ggplot2', version = '${GGPLOT_VER}', repos = 'http://cloud.r-project.org');"' returned a non-zero code: 1

thoughts?

Color and label settings

Trying to set the colors.
I reused the palette file in the example.
How does it know which colors should be associated to which sample ?
My png plot is always gray. I m trying to use it on 5 samples without overlapping.
Also can you give an example of how labels file should be ? I didn't also succeed to make it works.

coordinate flag do not match with coordinate on x-axis of generated graph if using --shrink flag

Hi,

I noticed the coordinate on x-axis of the generated graph would be horizontally truncated if --shrink flag was used. It's the same in your examples.

the command to generate sashimi.pdf:
../sashimi-plot.py -b input_bams.tsv -c chr10:27040584-27048100 -g annotation.gtf -M 10 -C 3 -O 3 --shrink --alpha 0.25 --base-size=20 --ann-height=4 --height=3 --width=18 -P palette.txt

But the x-axis in the graph end at around 27041500 which is way less than the 27048100 in the -c flag.

sashimi.pdf

Error in seq.default(start, max(start + 1, end - 4), by = 2425)

$ ./sashimi-plot.py -b input_bam.tsv -g nochr_gencode_exon_transcript.gtf -c 2:29,619,720-29,740,978 -M 10 --base-size=20 --ann-height=4 --height=3 --width=18 -P palette.txt

Error in seq.default(start, max(start + 1, end - 4), by = 2425) :
'from' must be of length 1
Calls: rbind -> [ -> [.data.table -> seq -> seq.default
Execution halted

Unexpected input in " color_list = list(11_"

I am encountering this particular error, am I running it incorrectly, or is there a specific input I have to format some way?

./sashimi-plot.py -b 11_sf3_ctrl.mm10.sorted.bam -c chr2:119,046,711-119,057,857 -M 10 -g gencode.vM18.annotation.nogene.gtf
Error: unexpected input in " color_list = list(11_"
Execution halted

Thanks in advance for the help!

Axes labels missing

Running the default example provided results in a PDF figure where expression labels (ex. 0-600) and the coordinate label are missing. The main figure on the repository, however, has these axes labels. Not sure if this is an issue with the R code or the python...

-S plus vs -S minus shows the same reads

Hi,

I've been using ggsashimi for a while now and it's a wonderful package!

I needed some clarification on -s and -S arguments. I went through the options as to what the script says the difference between the two are and also went through a similar question someone asked about this. As far as I understand -s SENSE would be if you want to visualize the reads for the sense strand and -s ANTISENSE would be if you want to visualize the reads for the antisense strand with respect to RNA-Seq. Meanwhile, I would want to visualize the reads of only for the forward strand (+ strand) in one of the junctions I'd like to view and would not want the reverse strand (- strand) reads. I entered the junction coordinates on IGV, viewed the sashimi plot for the forward and reverse strand and for each of the strands there are different numbers. I expected the same output using ggsashimi and used the -S plus once and -S minus another time but I get the same junction reads using both options and there is no difference in the outputs between them.

Here is my script execution incase I'm not making any mistakes and if my understanding is right:
~/tools/ggsashimi/sashimi-plot.py -b ../input_bam.tsv -c 12:6448738-6451363 -g annotation_CD27.gtf --shrink --alpha 0.6 --base-size=20 --ann-height=4 --height=3 --width=18 -P ../palette.txt -C 1 -S plus -o sashimi_CD27_onlyJunc_plus

If needed I can attach a screenshot of what I see for plus and minus strands on IGV and what I see for ggsashimi if I'm not doing something wrong.

Thanks,
Ahish

KeyError: 'transcript_id' with Ensemble human annotation

From @sridhar0605 originally posted in #1:

When using Homo_sapiens.GRCh37.75.gtf as reference from ensembl, I see this error

Using default tag: latest
latest: Pulling from guigolab/ggsashimi
915665fee719: Pull complete
1a0814f59c8e: Pull complete
b3b71680ed5d: Pull complete
1c3c8afa6ada: Pull complete
2fbeb903a5b4: Pull complete
Digest: sha256:82590f821978568e948ad4861ce009fcb26e7543263bea9d7b78c17667f8d675
Status: Downloaded newer image for guigolab/ggsashimi:latest
Traceback (most recent call last):
File "/sashimi-plot.py", line 592, in
transcripts, exons = read_gtf(args.gtf, args.coordinates)
File "/sashimi-plot.py", line 278, in read_gtf
transcript_id = d["transcript_id"]
KeyError: 'transcript_id'

few lines from gtf:

#!genome-build GRCh37.p13
#!genome-version GRCh37
#!genome-date 2009-02
#!genome-build-accession NCBI:GCA_000001405.14
#!genebuild-last-updated 2013-09
1	pseudogene	gene	11869	14412	.	+	.	gene_id "ENSG00000223972"; gene_name "DDX11L1"; gene_source "ensembl_havana"; gene_biotype "pseudogene";
1	processed_transcript	transcript	11869	14409	.	+	.	gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_name "DDX11L1"; gene_source "ensembl_havana"; gene_biotype "pseudogene"; transcript_name "DDX11L1-002"; transcript_source "havana";
1	processed_transcript	exon	11869	12227	.	+	.	gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; exon_number "1"; gene_name "DDX11L1"; gene_source "ensembl_havana"; gene_biotype "pseudogene"; transcript_name "DDX11L1-002"; transcript_source "havana"; exon_id "ENSE00002234944";
1	processed_transcript	exon	12613	12721	.	+	.	gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; exon_number "2"; gene_name "DDX11L1"; gene_source "ensembl_havana"; gene_biotype "pseudogene"; transcript_name "DDX11L1-002"; transcript_source "havana"; exon_id "ENSE00003582793";
1	processed_transcript	exon	13221	14409	.	+	.	gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; exon_number "3"; gene_name "DDX11L1"; gene_source "ensembl_havana"; gene_biotype "pseudogene"; transcript_name "DDX11L1-002"; transcript_source "havana"; exon_id "ENSE00002312635";

Annotate gene symbol and exon number

Hi,

I wanted to know if there is any way to annotate gene_name and exon_number in addition to the transcript_id from the GTF file:

GTF file:

##description: evidence-based annotation of the human genome (GRCh37), version 19 (Ensembl 74)
##provider: GENCODE
##contact: [email protected]
##format: gtf
##date: 2013-12-05
##modified by GTEx_Collapsed_Gene_Model.py
1       HAVANA  gene    11869   14362   .       +       .       gene_id "ENSG00000223972.4"; transcript_id "ENSG00000223972.4"; gene_type "pseudogene"; gene_s
tatus "KNOWN"; gene_name "DDX11L1"; transcript_type "pseudogene"; transcript_status "KNOWN"; transcript_name "DDX11L1"; level 2; havana_gene "OTTHUMG000000009
61.2";
1       HAVANA  transcript      11869   14362   .       +       .       gene_id "ENSG00000223972.4"; transcript_id "ENSG00000223972.4"; gene_type "pseudogene"
; gene_status "KNOWN"; gene_name "DDX11L1"; transcript_type "pseudogene"; transcript_status "KNOWN"; transcript_name "DDX11L1"; level 2; havana_gene "OTTHUMG0
0000000961.2";
1       HAVANA  exon    11869   12227   .       +       .       gene_id "ENSG00000223972.4"; transcript_id "ENSG00000223972.4"; gene_type "pseudogene"; gene_s
tatus "KNOWN"; gene_name "DDX11L1"; transcript_type "pseudogene"; transcript_status "KNOWN"; transcript_name "DDX11L1"; level 2; havana_gene "OTTHUMG000000009
61.2"; exon_id "ENSG00000223972.4_1; exon_number 1";
1       HAVANA  exon    12595   12721   .       +       .       gene_id "ENSG00000223972.4"; transcript_id "ENSG00000223972.4"; gene_type "pseudogene"; gene_s
tatus "KNOWN"; gene_name "DDX11L1"; transcript_type "pseudogene"; transcript_status "KNOWN"; transcript_name "DDX11L1"; level 2; havana_gene "OTTHUMG000000009
61.2"; exon_id "ENSG00000223972.4_2; exon_number 2";
1       HAVANA  exon    12975   13052   .       +       .       gene_id "ENSG00000223972.4"; transcript_id "ENSG00000223972.4"; gene_type "pseudogene"; gene_s
tatus "KNOWN"; gene_name "DDX11L1"; transcript_type "pseudogene"; transcript_status "KNOWN"; transcript_name "DDX11L1"; level 2; havana_gene "OTTHUMG000000009
61.2"; exon_id "ENSG00000223972.4_3; exon_number 3";

Sashimi Plot:

Screen Shot 2019-06-10 at 1 33 24 PM

Thanks
Komal

AttributeError: 'NoneType' object has no attribute 'replace'

used this command - ./sashimi-plot.py -b wt_chr2.bam -g Mus_musculus.GRCm38.99.gtf -o wt_1_

Traceback (most recent call last):
File "./sashimi-plot.py", line 578, in
a, junctions = read_bam(bam, args.coordinates, args.strand)
File "./sashimi-plot.py", line 125, in read_bam
_, start, end = parse_coordinates(c)
File "./sashimi-plot.py", line 67, in parse_coordinates
c = c.replace(",", "")
AttributeError: 'NoneType' object has no attribute 'replace'

No y scale

I'm trying to make sashimi plot using the examples files in the example folder and just copy/paste the command line you suggested and I get the plots like the example but no y axis at all. Any clues to fix this issue?
sashimi.pdf

Error: unexpected symbol in " color_list = list(ConAc_A2-11D.sorted"

Hi,

I get the following error when using sashimi-plot.py

This is my code
python sashimi-plot.py -b /project/uml_frederic_chain/Daphnia/index/ConAc_A2-11D.sorted.bam -c DAPPUscaffold_67:640865-642375 -g /project/uml_frederic_chain/Daphnia/Daphnia_pulex.V1.0.41.exons.gtf --shrink

Could you please help me out in resolving this.

Plot dimension too big

Hi,

I'm trying to plot 26 bams separately in one single plot, but i constantly get the following error. I tried to tweak the parameters, specially the --height but it doesn't seem to work. I can't edit the R script either to set limitsize to FALSE. When I overlay the plot by groups I can nicely see the plot, but then when showing the reads from each sample within the same layer gets a bit messy, and I do not want to aggregate the counts in this case.

Error: Dimensions exceed 50 inches (height and width are specified in 'in' not pixels). If you're sure you want a plot that big, use `limitsize = FALSE`.

Any help one track per sample ?
Best,
Pedro

Feature request: Add option for fixed y-axis scales

Hi!
In some cases it would be nice if the y-axis scale was fixed for each compared condition. I think this could help visualize differences in the inclusion levels of alternative events. There are different ways in which this feature could be implemented, one could either fix the y-scale to that of the max read count among all the plots or it could be fixed to a value that is manually entered by the user. Both of these approaches should be fairly easy to implement using standard ggplot syntax and I could try it out if this sounds interesting to anyone else!

Histogram smoothing possible?

Hi

I pulled ggsashimi from DockerHub and am trying to smooth the histograms with "--smooth" but the argument is not recognized. In the GitHub code it looks to be commented out. Is there another argument or is smoothing not possible, currently?

Thanks, Luke

rpkm/tpm rather than median/mean reads

am plotting two samples on using the same settings on the the example_run.sh, except the gtf file (which is mine). However, while the y axis scale on the first automatically readjusts to median read count, panel 2 is stuck to one. Is there a way to make sure the scales on both panels readjust automatically.
Also is it possible to visual rfkm/tpm rather than median/mean reads?

Error: unexpected '=' in " color_list = list(input_bam="

Hello all,

I'm testing ggshashimi with a single bam input but keep getting the error bellow:

Error: unexpected '=' in " color_list = list(bam_iput="
Execution halted

I'm running:
Python 2.7.14
R version 3.4.1
ggplot2 v2.2.1
gridextra v2.3
data.table v1.11.4

Any help will be much appreciated and thank you for generating ggsashimi!
Best Gil

Error in grid.Call.graphics

Received by email

I am interested in using your ggsashimi utility to generate some plots for a research publication. I have installed the listed dependencies and R packages for the utility from my bash prompt. (I am using the Linux subsystem for Windows 10.) However, when I attempt to run either my own samples or the provided example files, I get the error message provided below. Unfortunately, my knowledge of Python is limited. Does this indicate an error with my system setup? I appreciate the assistance!

Example command I used:

./sashimi-plot.py -b ./examples/input_bams.tsv -c chr10:27040584-27048100 -g ./examples/annotation.gtf -M 10 -C 3 -O 3 --shrink --alpha 0.25 --base-size=20 --ann-height=4 --height=3 --width=18 -P ./examples/palette.txt

Which returns the following message:

Warning messages:
1: In density_grobs[[id]]$widths[1] <- density_grobs[[id]]$widths[1] +  :
  number of items to replace is not a multiple of replacement length
2: In density_grobs[[id]]$widths[3 + vs] <- maxYtextWidth :
  number of items to replace is not a multiple of replacement length
3: In density_grobs[[id]]$widths[1] <- density_grobs[[id]]$widths[1] +  :
  number of items to replace is not a multiple of replacement length
4: In density_grobs[[id]]$widths[3 + vs] <- maxYtextWidth :
  number of items to replace is not a multiple of replacement length
5: In density_grobs[[id]]$widths[1] <- density_grobs[[id]]$widths[1] +  :
  number of items to replace is not a multiple of replacement length
6: In density_grobs[[id]]$widths[3 + vs] <- maxYtextWidth :
  number of items to replace is not a multiple of replacement length
7: In density_grobs[[id]]$widths[1] <- density_grobs[[id]]$widths[1] +  :
  number of items to replace is not a multiple of replacement length
8: In density_grobs[[id]]$widths[3 + vs] <- maxYtextWidth :
  number of items to replace is not a multiple of replacement length
9: In density_grobs[[id]]$widths[1] <- density_grobs[[id]]$widths[1] +  :
  number of items to replace is not a multiple of replacement length
10: In density_grobs[[id]]$widths[3 + vs] <- maxYtextWidth :
  number of items to replace is not a multiple of replacement length
Error in grid.Call.graphics(L_setviewport, vp, TRUE) :
  INTEGER() can only be applied to a 'integer', not a 'NULL'
Calls: ggsave ... lapply -> FUN -> push.vp.viewport -> grid.Call.graphics
Execution halted

no ggplot2 in docker

hello team,

This is a great tool.

I am not sure why when I pull you docker image i see this error

Using default tag: latest
latest: Pulling from guigolab/ggsashimi
Digest: sha256:eab7ae5fa4ae2aadb8fc20e507bda59acb9bf78c12013e8384be2302fbb57f3c
Status: Image is up to date for guigolab/ggsashimi:latest
Error in library(ggplot2) : there is no package called 'ggplot2'
Execution halted

Any help?

Thanks.

Sid

Error if BAM filename starts with number

Hi,
I installed ggsashimi and both examples in example_run.sh worked fine. But when I tried with my BAM file, named 2dpi.bam, I got the following error:

./sashimi-plot.py -b 2dpi.bam -c tig00000003:33700-34300

Error: unexpected symbol in "        color_list = list(2dpi"
Execution halted

I started to mess around with different file names and when I changed 2dpi.bam to dpi2.bam it worked. It looks like if the BAM file name starts with a number ggsashimi gives the error above. Maybe this is an issue you would like to address in future versions of ggsashimi.

Package versions:
Python v3.7.3
R v3.5.1
ggplot2 v3.2.1
data.table v1.12.2
gridExtra v2.3

Exons not displayed properly

Hi,
in some cases R raise a warning (Removed 1 rows containing missing values (geom_segment)) and some exons are not displayed.
I attached an example.

I run the following command

sashimi-plot -b totBs_sashimi.annot.tsv -c '1:109113929-109206781' -g TCONS_00009768_andNear.gtf -M 10 -C 3 -O 3
--shrink --alpha 0.25 --base-size=20 --ann-height=4 --height=3 --width=18 -P palette.txt

Thanks

example.zip

Aggregation doesn't work with long genomic regions

Hi, it seems that the mean and median aggregate functions throw an error if the genomic region is too large. I'm currently looking at a region 147,000 bases long and am getting this error when I try and use the mean or median aggregation

Error in j[4] = as.numeric(d[x == j[2] + 1, y]) : replacement has length zero Execution halted

It works just fine if I remove the aggregation or if I shrink the genomic region. What may be causing this?

Sashimi plot read counts (above curves)

Hi, I'm curious about the meaning of numbers plotted above the curves (splice junction reads).

for single samples, the number seems to match the count of splice junction reads. (something like 206, 159.. )
image

But if I input multiple samples (4 samples) in one group and draw sashimi plot, the numbers get extremely huge. (something like : 347,142,108,206). It is definitely not the sum of junction read counts in 4 samples. I wonder what those huge numbers mean.

image

Thanks!

How to generate transcript diagram at the bottom

Hi, I want to ask how to generate the transcript diagram (that consists transcript id and strand) at the bottom of sahsimi plot as you have generated in your example. I have tried, But I am not getting anything at the bottom..
Thanks
Heena

Warnings when using -A mean_j (median_j)

The test example raises some warnings when -A mean_j or -A median_j option is set:

../sashimi-plot.py -b input_bams.tsv -c chr10:27040584-27048100 -g annotation.gtf -M 10 -C 3 -O 3 --shrink --alpha 0.25 --base-size=20 --ann-height=4 --height=3 --width=18 -P palette.txt -A mean_j

There were 18 warnings (use warnings() to see them)

And the arches in the output plot start from (and end into) wrong signal values

something missing in graphic

--Hi,
i generate the graphic with this command:
./sashimi-plot.py -b input_bams.tsv -c 11:5617339-5625858 -g exons.gtf -M 10 -C 3 -O 3 -A median -F tiff -R 350 --base-size=20 --ann-height=4 --height=3 --width=18 -P palette.txt

but the arrows between exons (below the sashimi plots) are not drawn.

an idea ?
thank you --

IndexError: list index out of range

./sashimi-plot.py -b input_bams.tsv -g Mus_musculus.GRCm38.99.gtf -M 10 -C 3 -O 3 --shrink --alpha 0.25 --base-size=20 --ann-height=4 --height=3 --width=18 -P palette.txt
Traceback (most recent call last):
File "./sashimi-plot.py", line 573, in
for id, bam, overlay_level, color_level, label_text in read_bam_input(args.bam, args.overlay, args.color_factor, args.labels):
File "./sashimi-plot.py", line 176, in read_bam_input
bam = get_bam_path(f, line_sp[1])
IndexError: list index out of range

error in R script portion

when I turn on GGSASHIMI_DEBUG and run the R script it produces, I get the following when I run the script in R:

library(ggplot2)
 library(grid)
 library(gridExtra)
 library(data.table)
 library(gtable)
 
 scale_lwd = function(r) {
   lmin = 0.1
   lmax = 4
   return( r*(lmax-lmin)lmin )
 }
 
 base_size = 14
 height = ( 5.5  base_size*0.352777778/67 ) * 1.02
 width = 10
 theme_set(theme_bw(base_size=base_size))
 theme_update(
   plot.margin = unit(c(15,15,15,15), "pt"),
   panel.grid = element_blank(),
   panel.border = element_blank(),
   axis.line = element_line(size=0.5),
   axis.title.x = element_blank(),
   axis.title.y = element_text(angle=0, vjust=0.5)
 )
 
 labels = list("#000000"="#000000","WT"="WT","1KO"="1KO")
 
 density_list = list()
 junction_list = list()
 
 color_list = list("WT"="#000000","1KO"="#000000")
 
 
 # data table with exons
 ann_list = list(
   "exons" = data.table(),
   "introns" = data.table()
 )
 
 
 gtfp = ggplot()
 if (length(ann_list[['introns']])  0) {
   gtfp = gtfp  geom_segment(data=ann_list[['introns']], aes(x=start, xend=end, y=tx, yend=tx), size=0.3)
   gtfp = gtfp  geom_segment(data=txarrows, aes(x=V1,xend=V2,y=tx,yend=tx), arrow=arrow(length=unit(0.02,"npc")))
 }
 if (length(ann_list[['exons']])  0) {
   gtfp = gtfp  geom_segment(data=ann_list[['exons']], aes(x=start, xend=end, y=tx, yend=tx), size=5, alpha=1)
 }
 gtfp = gtfp  scale_y_discrete(expand=c(0,0.5))
 gtfp = gtfp  scale_x_continuous(expand=c(0,0.25))
 gtfp = gtfp  coord_cartesian(xlim = c(6064339,6070090))
 gtfp = gtfp  labs(y=NULL)
 gtfp = gtfp  theme(axis.line = element_blank(), axis.text.x = element_blank(), axis.ticks = element_blank())
 
[density lists and junction lists omitted bc too long]
                                                                              junction_list[["1KO"]] = data.frame(x=c(6064422,6064422,6065846,6064422,6064422,6065846,6064422,6064422,6065846), xend=c(6065705,6069831,6069831,6065705,6069831,6069831,6065705,6069831,6069831), y=c(39,39,13,39,39,13,39,39,13), yend=c(14,34,34,14,34,34,14,34,34), count=c(13,22,8,13,22,8,13,22,8))
Error: unexpected symbol in:
"20,6064821,6064822,6064823,6064824,6064825,6064826,6064827,6064828,6064829,6064830,6064831,6064832,6064833,6064834,6064835,6064836,6064837,6064838,6064839,6064840,6
                                                                             junction_list"
                                                                              
                                                                              
                                                                              pdf(NULL) # just to remove the blank pdf produced by ggplotGrob
                                                                              
                                                                              if(packageVersion('ggplot2') = '3.0.0'){  # fix problems with ggplot2 vs 3.0.0
                                                                                vs = 1
                                                                              } else {
                                                                                vs = 0
                                                                              }
                                                                              
                                                                              if(TRUE) {
                                                                                print(max(unlist(lapply(density_list, function(df){max(df$y)}))))
                                                                                maxheight = max(unlist(lapply(density_list, function(df){max(df$y)})))
                                                                                breaks_y = labeling::extended(0, maxheight, m = 4)
                                                                              }
[1] -Inf
Error in seq.default(from = best$lmin, to = best$lmax, by = best$lstep) : 
  'from' must be of length 1
In addition: Warning messages:
1: In max(unlist(lapply(density_list, function(df) { :
  no non-missing arguments to max; returning -Inf
2: In max(unlist(lapply(density_list, function(df) { :
  no non-missing arguments to max; returning -Inf
                                                                              
                                                                              if(exists('coord_dict')){
                                                                                all_pos_shrinked = do.call(rbind, density_list)$x
                                                                                s2r = merge(intersected_introns, coord_dict, by.x = 'real_xend', by.y = 'real')
                                                                                s2r = merge(s2r, coord_dict, by.x = 'real_x', by.y = 'real', suffixes = c('_xend', '_x'))
                                                                                breaks_x_shrinked = labeling::extended(min(all_pos_shrinked), max(all_pos_shrinked), m = 5)
                                                                                breaks_x = c()
                                                                                out_range = c()
                                                                                for (b in breaks_x_shrinked){
                                                                                  iintron = FALSE
                                                                                  for (j in 1:nrow(s2r)){
                                                                                    l = s2r[j, ]
                                                                                    if(b = l$shrinked_x && b <= l$shrinked_xend){
                                                                                      # Intersected intron
                                                                                      p = (b-l$shrinked_x)/(l$shrinked_xend - l$shrinked_x)
                                                                                      realb = round(l$real_x  p*(l$real_xend - l$real_x))
                                                                                      breaks_x = c(breaks_x, realb)
                                                                                      iintron = TRUE
                                                                                      break
                                                                                    }
                                                                                  }
                                                                                  if (!iintron){
                                                                                    # Exon, upstream/downstream intergenic region or intron (not intersected)
                                                                                    if(b <= min(s2r$shrinked_x)) {
                                                                                      l <- s2r[which.min(s2r$shrinked_x), ]
                                                                                      if(any(b == all_pos_shrinked)){
                                                                                        # Boundary (subtract)
                                                                                        s = l$shrinked_x - b
                                                                                        realb = l$real_x - s
                                                                                        breaks_x = c(breaks_x, realb)
                                                                                      } else {
                                                                                        out_range <- c(out_range, which(breaks_x_shrinked == b))
                                                                                      }
                                                                                    } else if (b = max(s2r$shrinked_xend)){
                                                                                      l <- s2r[which.max(s2r$shrinked_xend), ]
                                                                                      if(any(b == all_pos_shrinked)){
                                                                                        # Boundary (sum)
                                                                                        s = b - l$shrinked_xend
                                                                                        realb = l$real_xend  s
                                                                                        breaks_x = c(breaks_x, realb)
                                                                                      } else {
                                                                                        out_range <- c(out_range, which(breaks_x_shrinked == b))
                                                                                      }
                                                                                    } else {
                                                                                      delta = b-s2r$shrinked_xend
                                                                                      delta[delta < 0] = Inf
                                                                                      l = s2r[which.min(delta), ]
                                                                                      # Internal (sum)
                                                                                      s = b - l$shrinked_xend
                                                                                      realb = l$real_xend  s
                                                                                      breaks_x = c(breaks_x, realb)
                                                                                    }
                                                                                  }
                                                                                }
                                                                                if(length(out_range)) {
                                                                                  breaks_x_shrinked = breaks_x_shrinked[-out_range]
                                                                                }
                                                                              }
                                                                              
                                                                              density_grobs = list();
                                                                              
                                                                              for (bam_index in 1:length(density_list)) {
                                                                                
                                                                                id = names(density_list)[bam_index]
                                                                                d = data.table(density_list[[id]])
                                                                                junctions = data.table(junction_list[[id]])
                                                                                
                                                                                # Density plot
                                                                                gp = ggplot(d)  geom_bar(aes(x, y), width=1, position='identity', stat='identity', fill=color_list[[id]], alpha=0.05)
                                                                                gp = gp  labs(y=labels[[id]])
                                                                                if(exists('coord_dict')) {
                                                                                  gp = gp  scale_x_continuous(expand=c(0, 0.25), breaks = breaks_x_shrinked, labels = breaks_x)
                                                                                } else {
                                                                                  gp = gp  scale_x_continuous(expand=c(0, 0.25))
                                                                                }
                                                                                
                                                                                if(!TRUE){
                                                                                  maxheight = max(d[['y']])
                                                                                  breaks_y = labeling::extended(0, maxheight, m = 4)
                                                                                  gp = gp  scale_y_continuous(breaks = breaks_y)
                                                                                } else {
                                                                                  gp = gp  scale_y_continuous(breaks = breaks_y, limits = c(NA, maxheight))
                                                                                }
                                                                                
                                                                                # Aggregate junction counts
                                                                                row_i = c()
                                                                                if (nrow(junctions) 0 ) {
                                                                                  
                                                                                  junctions$jlabel = as.character(junctions$count)
                                                                                  junctions = setNames(junctions[,.(max(y), max(yend),round(mean(count)),paste(jlabel,collapse=",")), keyby=.(x,xend)], names(junctions))
                                                                                  if ("mean" != "") {
                                                                                    junctions = setNames(junctions[,.(max(y), max(yend),round(mean(count)),round(mean(count))), keyby=.(x,xend)], names(junctions))
                                                                                  }
                                                                                  # The number of rows (unique junctions per bam) has to be calculated after aggregation
                                                                                  row_i = 1:nrow(junctions)
                                                                                }
                                                                                
                                                                                
                                                                                for (i in row_i) {
                                                                                  
                                                                                  j_tot_counts = sum(junctions[['count']])
                                                                                  
                                                                                  j = as.numeric(junctions[i,1:5])
                                                                                  
                                                                                  if ("mean" != "") {
                                                                                    j[3] = ifelse(length(d[x==j[1]-1,y])==0, 0, max(as.numeric(d[x==j[1]-1,y])))
                                                                                    j[4] = ifelse(length(d[x==j[2]1,y])==0, 0, max(as.numeric(d[x==j[2]1,y])))
                                                                                  }
                                                                                  
                                                                                  # Find intron midpoint
                                                                                  xmid = round(mean(j[1:2]), 1)
                                                                                  ymid = max(j[3:4]) * 1.2
                                                                                  
                                                                                  # Thickness of the arch
                                                                                  lwd = scale_lwd(j[5]/j_tot_counts)
                                                                                  
                                                                                  curve_par = gpar(lwd=lwd, col=color_list[[id]])
                                                                                  
                                                                                  # Arc grobs
                                                                                  
                                                                                  # Choose position of the arch (top or bottom)
                                                                                  nss = i
                                                                                  if (nss%%2 == 0) {  #bottom
                                                                                    ymid = -0.3 * maxheight
                                                                                    # Draw the arcs
                                                                                    # Left
                                                                                    curve = xsplineGrob(x=c(0, 0, 1, 1), y=c(1, 0, 0, 0), shape=1, gp=curve_par)
                                                                                    gp = gp  annotation_custom(grob = curve, j[1], xmid, 0, ymid)
                                                                                    # Right
                                                                                    curve = xsplineGrob(x=c(1, 1, 0, 0), y=c(1, 0, 0, 0), shape=1, gp=curve_par)
                                                                                    gp = gp  annotation_custom(grob = curve, xmid, j[2], 0, ymid)
                                                                                  }
                                                                                  
                                                                                  if (nss%%2 != 0) {  #top
                                                                                    # Draw the arcs
                                                                                    # Left
                                                                                    curve = xsplineGrob(x=c(0, 0, 1, 1), y=c(0, 1, 1, 1), shape=1, gp=curve_par)
                                                                                    gp = gp  annotation_custom(grob = curve, j[1], xmid, j[3], ymid)
                                                                                    # Right
                                                                                    curve = xsplineGrob(x=c(1, 1, 0, 0), y=c(0, 1, 1, 1), shape=1, gp=curve_par)
                                                                                    gp = gp  annotation_custom(grob = curve, xmid, j[2], j[4], ymid)
                                                                                  }
                                                                                  
                                                                                  # Add junction labels
                                                                                  gp = gp  annotate("label", x = xmid, y = ymid, label = as.character(junctions[i,6]),
                                                                                                     vjust=0.5, hjust=0.5, label.padding=unit(0.01, "lines"),
                                                                                                     label.size=NA, size=(base_size*0.352777778)*0.6
                                                                                  )
                                                                                  
                                                                                  
                                                                                  #               gp = gp  annotation_custom(grob = rectGrob(x=0, y=0, gp=gpar(col="red"), just=c("left","bottom")), xmid, j[2], j[4], ymid)
                                                                                  #               gp = gp  annotation_custom(grob = rectGrob(x=0, y=0, gp=gpar(col="green"), just=c("left","bottom")), j[1], xmid, j[3], ymid)
                                                                                  
                                                                                  
                                                                                }
                                                                                
                                                                                gpGrob = ggplotGrob(gp);
                                                                                gpGrob$layout$clip[gpGrob$layout$name=="panel"] <- "off"
                                                                                if (bam_index == 1) {
                                                                                  maxWidth = gpGrob$widths[2vs]  gpGrob$widths[3vs];    # fix problems ggplot2 vs
                                                                                  maxYtextWidth = gpGrob$widths[3vs];                     # fix problems ggplot2 vs
                                                                                  # Extract x axis grob (trim=F -- keep empty cells)
                                                                                  xaxisGrob <- gtable_filter(gpGrob, "axis-b", trim=F)
                                                                                  xaxisGrob$heights[8vs] = gpGrob$heights[1]              # fix problems ggplot2 vs
                                                                                  x.axis.height = gpGrob$heights[7vs]  gpGrob$heights[1] # fix problems ggplot2 vs
                                                                                }
                                                                                
                                                                                
                                                                                # Remove x axis from all density plots
                                                                                kept_names = gpGrob$layout$name[gpGrob$layout$name != "axis-b"]
                                                                                gpGrob <- gtable_filter(gpGrob, paste(kept_names, sep="", collapse="|"), trim=F)
                                                                                
                                                                                # Find max width of y text and y label and max width of y text
                                                                                maxWidth = grid::unit.pmax(maxWidth, gpGrob$widths[2vs]  gpGrob$widths[3vs]); # fix problems ggplot2 vs
                                                                                maxYtextWidth = grid::unit.pmax(maxYtextWidth, gpGrob$widths[3vs]); # fix problems ggplot2 vs
                                                                                density_grobs[[id]] = gpGrob;
                                                                              }
Error in density_list[[id]] : 
  attempt to select less than one element in get1index
                                                                              
                                                                              # Add x axis grob after density grobs BEFORE annotation grob
                                                                              density_grobs[["xaxis"]] = xaxisGrob
                                                                              
                                                                              # Annotation grob
                                                                              if (1.0 == 1) {
                                                                                gtfGrob = ggplotGrob(gtfp);
                                                                                maxWidth = grid::unit.pmax(maxWidth, gtfGrob$widths[2vs]  gtfGrob$widths[3vs]); # fix problems ggplot2 vs
                                                                                density_grobs[['gtf']] = gtfGrob;
                                                                              }
Error in ans[ypos] <- rep(yes, length.out = len)[ypos] : 
  replacement has length zero
In addition: Warning message:
In rep(yes, length.out = len) : 'x' is NULL so the result will be NULL
                                                                              
                                                                              # Reassign grob widths to align the plots
                                                                              for (id in names(density_grobs)) {
                                                                                density_grobs[[id]]$widths[1] <- density_grobs[[id]]$widths[1]  maxWidth - (density_grobs[[id]]$widths[2vs]  maxYtextWidth); # fix problems ggplot2 vs
                                                                                density_grobs[[id]]$widths[3vs] <- maxYtextWidth # fix problems ggplot2 vs
                                                                              }
Error: object of type 'closure' is not subsettable
                                                                              
                                                                              # Heights for density, x axis and annotation
                                                                              heights = unit.c(
                                                                                unit(rep(2, length(density_list)), "in"),
                                                                                x.axis.height,
                                                                                unit(1.5*1.0, "in")
                                                                              )
Error in unit(rep(2, length(density_list)), "in") : 
  'x' and 'units' must have length  0
                                                                              
                                                                              # Arrange grobs
                                                                              argrobs = arrangeGrob(
                                                                                grobs=density_grobs,
                                                                                ncol=1,
                                                                                heights = heights,
                                                                              );
Error in arrangeGrob(grobs = density_grobs, ncol = 1, heights = heights,  : 
  object 'heights' not found
                                                                              
                                                                              # Save plot to file in the requested format
                                                                              if ("pdf" == "tiff"){
                                                                                # TIFF images will be lzw-compressed
                                                                                ggsave("/orange/swanson/carter.h/ChP_Splicing_Project/Mbnl_12KO_ChP/rMATS_output_1KO_ChP_Triplicate/testing_SE_sashimi_plots/Arhgap12_chr18_6064340-6070091_meanj_-.pdf", plot = argrobs, device = "tiff", width = width, height = height, units = "in", dpi = 300, compression = "lzw", limitsize = FALSE)
                                                                              } else {
                                                                                ggsave("/orange/swanson/carter.h/ChP_Splicing_Project/Mbnl_12KO_ChP/rMATS_output_1KO_ChP_Triplicate/testing_SE_sashimi_plots/Arhgap12_chr18_6064340-6070091_meanj_-.pdf", plot = argrobs, device = "pdf", width = width, height = height, units = "in", dpi = 300, limitsize = FALSE)
                                                                              }
Error in grDevices::pdf(file = filename, ..., version = version) : 
  cannot open file '/orange/swanson/carter.h/ChP_Splicing_Project/Mbnl_12KO_ChP/rMATS_output_1KO_ChP_Triplicate/testing_SE_sashimi_plots/Arhgap12_chr18_6064340-6070091_meanj_-.pdf'
                                                                              
                                                                              dev.log = dev.off()

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.