
cicero's Introduction

CICERO


CICERO (Clipped-reads Extended for RNA Optimization) is an assembly-based algorithm to detect diverse classes of driver gene fusions from RNA-seq.


To discover driver fusions beyond canonical exon-to-exon chimeric transcripts, we developed CICERO, a local assembly-based algorithm that integrates RNA-seq read support with extensive annotation for candidate ranking. CICERO outperforms commonly used methods, achieving a 95% detection rate for 184 independently validated driver fusions, including internal tandem duplications and other non-canonical events, in 170 pediatric cancer transcriptomes.

Figure: Overview of the CICERO algorithm, which consists of fusion detection through analysis of candidate SV breakpoints and splice junctions, fusion annotation, and ranking.


Running CICERO

Add the src/scripts directory to your system PATH variable. Add the src/perllib and dependencies/lib/perl directories to your system PERL5LIB variable.
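
As a minimal sketch of the environment setup described above: CICERO_HOME is a hypothetical variable naming the directory where the repository was cloned (the default path here is only an example), so adjust it to your own install location.

```shell
# Assumption: CICERO_HOME points at your local clone of the repository.
CICERO_HOME="${CICERO_HOME:-$HOME/software/CICERO}"

# Make the wrapper scripts callable by name.
export PATH="$CICERO_HOME/src/scripts:$PATH"

# Let Perl find the CICERO modules and the bundled dependencies.
# ${PERL5LIB:+:$PERL5LIB} appends the old value only if one was already set.
export PERL5LIB="$CICERO_HOME/src/perllib:$CICERO_HOME/dependencies/lib/perl${PERL5LIB:+:$PERL5LIB}"
```

Adding these lines to your shell startup file keeps the settings across sessions.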

Then invoke the CICERO wrapper as

Cicero.sh [-h] [-n ncores] -b bamfile -g genome -r refdir [-j junctions] [-o outdir] [-t threshold] [-s sc_cutoff] [-c sc_shift] [-p] [-d]

-p - optimize CICERO, sets sc_cutoff=3 and sc_shift=10 [default true]
-s <num> - minimum number of soft clip support required [default=2]
-t <num> - threshold for enabling increased soft clip cutoff [default=200000]
-c <num> - clustering distance for grouping similar sites [default=3]
-j <file> - junctions file from RNApeg
-n <num> - number of cores to utilize with GNU parallel
-d - disable excluded regions file use
  • ncores is the number of cores to be run on (with GNU parallel).
  • bamfile is the input BAM file, mapped to human genome build GRCh37-lite or GRCh38_no_alt. Contact us if your BAM is based on another reference version.
  • genome is either GRCh37-lite or GRCh38_no_alt. CICERO supports only these two human reference genome versions.
  • refdir is the CICERO-specific reference file directory (see Downloading reference files below), e.g. -r /home/user/software/CICERO/reference_hg38/ or -r /home/user/software/CICERO/reference_hg19/
  • outdir is the user-defined output directory.
  • junctions is the junctions file output by RNApeg. See Generate junctions file with RNApeg below. CICERO can also detect fusions by analyzing splice junction reads. If this option is omitted, fusions generated by small deletions may be missed, because these events may lack soft-clipped reads.
  • threshold is the number of soft-clipped positions above which CICERO raises its read-support requirement. CICERO first detects all soft-clipped positions supported by >=2 reads in the BAM file. For a sample with <=threshold (default 200,000) soft-clipped positions, CICERO detects fusions from all of these positions; otherwise, to reduce run time, it uses only positions supported by >=sc_cutoff reads (3 by default in optimize mode, see below). For samples with many soft-clipped positions, a smaller threshold speeds up CICERO, but fusion events supported by only 2 reads may be missed.
  • sc_cutoff controls the number of soft-clipped reads required to support a putative site. The default is 2, but for samples with large numbers of soft-clipped reads, it may be desirable to require additional support to reduce the computational time required.
  • sc_shift sets the maximum distance at which events are considered to be the same site.
  • optimize defaults to ON. This sets sc_cutoff to 3 for samples where the number of soft clip sites exceeds 200,000. It also sets sc_shift to 10, the distance within which events are considered the same.
  • -no-optimize turns optimizations off. This can increase sensitivity, but increases the computational requirements.
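
The interaction between threshold and sc_cutoff in optimize mode can be sketched as a small decision rule. This is an illustration of the behavior described in the list above, not CICERO's own code; the function name effective_sc_cutoff is hypothetical.

```shell
# effective_sc_cutoff N_SITES [THRESHOLD] [SC_CUTOFF]
# Prints the minimum read support that would apply to a sample with
# N_SITES soft-clipped positions, following the rule described above.
effective_sc_cutoff() {
    n_sites=$1
    threshold=${2:-200000}   # documented default for -t
    sc_cutoff=${3:-3}        # value set by optimize mode
    if [ "$n_sites" -le "$threshold" ]; then
        echo 2               # all positions supported by >=2 reads are used
    else
        echo "$sc_cutoff"    # only positions with >=sc_cutoff reads are used
    fi
}
```

For example, `effective_sc_cutoff 150000` prints 2, while `effective_sc_cutoff 250000` prints 3.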

The final CICERO fusion result file will be located at <outdir>/CICERO_DATADIR/<sample name>/final_fusions.txt. Use the following guide to interpret the results.
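
A small helper can build the expected result path for a run. This sketch assumes the sample name is the BAM file name without its .bam extension, which matches the log output in the issue reports but is not stated explicitly here; final_fusions_path is a hypothetical name.

```shell
# final_fusions_path OUTDIR BAMFILE
# Prints <outdir>/CICERO_DATADIR/<sample name>/final_fusions.txt.
# Assumption: <sample name> is the BAM basename without ".bam".
final_fusions_path() {
    outdir=$1
    sample=$(basename "$2" .bam)
    # ${outdir%/} drops one trailing slash so the path has no "//".
    printf '%s/CICERO_DATADIR/%s/final_fusions.txt\n' "${outdir%/}" "$sample"
}
```

After a run, `[ -s "$(final_fusions_path "$outdir" "$bam")" ]` is a quick check that a non-empty result file was produced.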

To visualize CICERO fusion output you can load the final fusion output file at https://proteinpaint.stjude.org/FusionEditor/.

Dependencies

  • GNU parallel
  • Samtools 1.3.1
  • Cap3
  • Blat
  • Java 1.8.0
  • Perl 5.10.1 with libraries:
    • base
    • Bio
    • Carp
    • Compress
    • Cwd
    • Data
    • DBI
    • diagnostics
    • Digest
    • English
    • enum
    • Exporter
    • File
    • FileHandle
    • List
    • POSIX
    • strict
    • Sys
    • Tree
    • warnings

Running with Docker

CICERO can be run with Docker. Pre-built Docker images are provided for each release in GitHub Packages.

Invoke the CICERO wrapper using the Docker image available in GitHub Packages. You will likely need to add an additional bind mount for the output and input (BAM + junctions) files. Note the following command pulls the latest tag for the Docker image. For reproducible results, it is advisable to specify the exact version to run.

docker run -v <path to reference directory>:/reference ghcr.io/stjude/cicero:latest [-n cores] -b <bam file path> -g <genome> -r /reference -o <output directory> [-j junctions file] [-p] [-s int] [-t int] [-c int]

See Running CICERO for details of the parameters.
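
The extra bind mounts and the pinned version mentioned above might look like the following sketch. The function only prints the command so it can be reviewed first. The /input and /output mount points, the CICERO_TAG variable, and the function name are illustrative choices, not part of CICERO, and paths containing spaces are not handled.

```shell
# cicero_docker_cmd REFDIR INPUTDIR OUTDIR BAM_NAME GENOME
# Prints a docker command with separate mounts for reference, input and output.
# Assumption: BAM_NAME is a file name inside INPUTDIR.
cicero_docker_cmd() {
    ref_dir=$1; in_dir=$2; out_dir=$3; bam=$4; genome=$5
    tag=${CICERO_TAG:-latest}   # set CICERO_TAG to an exact release for reproducibility
    printf 'docker run -v %s:/reference -v %s:/input -v %s:/output ghcr.io/stjude/cicero:%s -b /input/%s -g %s -r /reference -o /output\n' \
        "$ref_dir" "$in_dir" "$out_dir" "$tag" "$bam" "$genome"
}
```

Once the printed command looks right, run it directly, appending -j with the container path of the junctions file if you have one.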

Running with St. Jude Cloud

CICERO is integrated in the St. Jude Cloud Rapid RNA-Seq workflow. To run CICERO in St. Jude Cloud, access the tool through the platform page. Documentation for running and interpreting results is available in the user guide.

Generate junctions file with RNApeg

RNApeg is required to generate a junctions file for use by CICERO. RNApeg can be run with either Docker or Singularity. Once RNApeg has finished, the *.junctions.tab.shifted.tab file can be provided to CICERO using the -j argument.
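
To pass the RNApeg result on to CICERO, the junctions file name can be derived from the BAM name. This sketch assumes the file is named after the full BAM file name with .junctions.tab.shifted.tab appended, as in the commands quoted in the issue reports; junctions_path is a hypothetical helper.

```shell
# junctions_path RNAPEG_OUTDIR BAMFILE
# Prints the expected path of the shifted junctions file for BAMFILE.
# Assumption: RNApeg names it "<bam file name>.junctions.tab.shifted.tab".
junctions_path() {
    printf '%s/%s.junctions.tab.shifted.tab\n' "${1%/}" "$(basename "$2")"
}
```

The printed path is what you would hand to Cicero.sh after -j.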

RNApeg is authored by Michael N. Edmonson (@mnedmonson).

RNApeg overview

This software analyzes nextgen RNA sequencing data which has been mapped to whole-genome coordinates, identifying evidence of both known and novel splicing events from the resulting alignments. The raw junction sites in the mapped BAMs undergo postprocessing to correct various issues related to mapping ambiguity. The result is a more compact and consistent set of junction calls, simplifying downstream quantification, analysis, and comparison.

RNApeg key features

Raw junction extraction

First, the BAM read mappings are analyzed to identify putative junction sites. This produces a list of junction sites along with counts of supporting reads and several associated quality metrics. While reflective of the BAM data, this output typically requires refinement by the following steps.

Correction vs. reference junctions

Novel junctions are compared with reference exon junction boundaries and evaluated for mapping ambiguity which can justify adjusting the sites to match. Even small ambiguities such as the presence of the same nucleotide on either side of a junction can be enough to nudge a prediction that would otherwise perfectly match a reference isoform out of place.

Self-correction for novel junctions

Mapping ambiguity is next evaluated within the novel junctions themselves. Ambiguous junctions are combined where possible, merging their counts of supporting reads and related annotations. This reduces the population of novel junctions while simultaneously improving the evidence for those remaining. Combining evidence for poorly-covered sites also improves the chances of these sites passing the default minimum level of 3 supporting reads required for reporting junctions in the final output.

Correction vs. novel skips of known exons

Additional correction of novel junctions is also performed to identify previously unknown skips of an exon (or exons) within known reference isoforms. Special handling is required in these cases because while the corrected boundaries are known, the events themselves are novel.

Edge correction of novel junctions vs. reference exons

Junctions may also be shifted in cases of ambiguity involving a single edge (i.e. junction start or end). While not doubly-anchorable as with known reference junctions or novel skips of known exons, this adjustment can standardize evidence e.g. for novel exons.

Junction calling

Novel junctions are subjected to additional scrutiny before being reported:

  • must be supported by a minimum of 3 reads
  • at least one read must pass minimum flanking sequence requirements, to avoid false positives near read ends due to insufficient anchoring
  • the junction must be either observed bidirectionally, or be supported by very clean alignments (either perfect or with very few high-quality mismatches, insertions, deletions, or soft clips)

While these requirements are minimal, they substantially reduce background noise.
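
The three reporting criteria can be read as one predicate. The sketch below only illustrates that logic and is not RNApeg's implementation; the function name and its 0/1 flag arguments are hypothetical.

```shell
# passes_novel_junction_filters READS FLANK_OK BIDIRECTIONAL CLEAN
# READS is the supporting read count; the other arguments are 1 (true) or 0 (false).
# Returns success only if all reporting criteria listed above are met.
passes_novel_junction_filters() {
    reads=$1; flank_ok=$2; bidirectional=$3; clean=$4
    [ "$reads" -ge 3 ] || return 1        # minimum of 3 supporting reads
    [ "$flank_ok" -eq 1 ] || return 1     # at least one well-anchored read
    # observed in both directions, or supported by very clean alignments
    [ "$bidirectional" -eq 1 ] || [ "$clean" -eq 1 ]
}
```

A junction with 3 reads, adequate flanking sequence, and bidirectional support passes; one with only 2 reads fails regardless of the other flags.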

Cross-sample correction

This step pools results for a set of samples and does additional standardization of novel exons based on the combined set. Mostly this has the effect of standardizing ambiguous novel junction sites across samples, but it can occasionally result in combinations of sites as well.

Output

The primary output files are tab-delimited text.

Output is also written in UCSC .bed format, which can be used to visualize the junctions and supporting read counts within the UCSC genome browser.

Running RNApeg via Docker

docker run -v <outdir>:/results ghcr.io/stjude/rnapeg:latest -b bamfile -f fasta -r refflat
  • fasta is the reference genome FASTA file, i.e. "Homo_sapiens/GRCh38_no_alt/FASTA/GRCh38_no_alt.fa" or "Homo_sapiens/GRCh37-lite/FASTA/GRCh37-lite.fa" from Reference Files.
  • refflat is the refFlat annotation file, i.e. "Homo_sapiens/GRCh38_no_alt/mRNA/RefSeq/refFlat.txt" or "Homo_sapiens/GRCh37-lite/mRNA/Combined/all_refFlats.txt" from Reference Files.

Running RNApeg via Singularity:

singularity run --containall --bind <outdir>:/results docker://ghcr.io/stjude/rnapeg:latest -b bamfile -f fasta -r refflat

You will also need to add --bind arguments to mount the file paths for bamfile, fasta, and refflat into the container.
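
One way to add those mounts is to bind each input file's parent directory at the same path inside the container, so host paths can be used unchanged. The sketch below prints the resulting command for review rather than running it; the function name is hypothetical and paths containing spaces are not handled.

```shell
# rnapeg_singularity_cmd OUTDIR BAMFILE FASTA REFFLAT
# Prints a singularity command that binds OUTDIR to /results and the
# parent directory of each input file to the same path in the container.
rnapeg_singularity_cmd() {
    outdir=$1; bam=$2; fasta=$3; refflat=$4
    printf 'singularity run --containall --bind %s:/results' "$outdir"
    for f in "$bam" "$fasta" "$refflat"; do
        d=$(dirname "$f")
        printf ' --bind %s:%s' "$d" "$d"
    done
    printf ' docker://ghcr.io/stjude/rnapeg:latest -b %s -f %s -r %s\n' "$bam" "$fasta" "$refflat"
}
```

Use absolute paths for all four arguments so the bind sources and the in-container paths agree.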

Downloading reference files

Reference files are required to run CICERO. They can be found at the following location:

Supported Genome Version

CICERO currently supports GRCh37-lite and GRCh38_no_alt.
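
Because -g expects one of these two names rather than a file path, a wrapper script might check the value before starting a long run. This is a hypothetical pre-flight check, not something Cicero.sh is known to provide.

```shell
# check_genome NAME
# Succeeds only for the two genome names CICERO supports.
check_genome() {
    case "$1" in
        GRCh37-lite|GRCh38_no_alt)
            return 0 ;;
        *)
            echo "unsupported genome '$1': expected GRCh37-lite or GRCh38_no_alt" >&2
            return 1 ;;
    esac
}
```

With this check, values such as GRCh38_no_alt.fa or a path to a FASTA file are rejected up front.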

Demo

A demo of CICERO can be found at the following location:

Output Fields

| Field | Description |
| --- | --- |
| sample | Sample ID |
| geneA / geneB | Gene at breakpoint A / B |
| chrA / chrB | Chromosome at breakpoint A / B |
| posA / posB | Coordinate at breakpoint A / B |
| ortA / ortB | Mapping strand of assembled contig at breakpoint A / B |
| featureA / featureB | 5utr / 3utr / coding / intron / intergenic at breakpoint A / B |
| sv_ort | Whether the mapping orientation of the assembled contig has confident biological meaning; if confident, then '>', else '?' (e.g. the contig mapping is from the sense strand of gene A to the antisense strand of gene B) |
| readsA / readsB | Number of junction reads that support the fusion at breakpoint A / B |
| matchA / matchB | Contig matched length at breakpoint A / B region |
| repeatA / repeatB | Repeat score (0~1) at breakpoint A / B region; the higher, the more repetitive |
| coverageA / coverageB | Coverage of junction reads that support the fusion at breakpoint A / B (sum, over junction reads, of the sequence length that can be mapped to the assembled contig) |
| ratioA / ratioB | MAF of soft-clipped reads at breakpoint A / B (the MAF is calculated separately for plus-mapped and minus-mapped reads, and the maximum is used) |
| qposA / qposB | Breakpoint position in the contig that belongs to the A / B part |
| total_readsA / total_readsB | Total number of reads at breakpoint A / B |
| contig | Assembled contig sequence that supports the fusion |
| type | CTX (interchromosomal translocation) / Internal_dup / ITX (inversion) / DEL (deletion) / INS (insertion) / read_through |
| score | Fusion score; the higher, the better |
| rating | HQ (known fusions) / RT (read_through) / LQ (others) |
| medal | Estimated pathogenicity assessment using St. Jude Medal Ceremony. Value: 0/1/2/3/4; the bigger, the better |
| functional effect | ITD (Internal_dup) / Fusion / upTSS / NLoss / CLoss / other |
| frame | 0 (event is not in frame) / 1 (event is in-frame) / 2 (geneB portion contains canonical coding start site (i.e. the entire CDS for geneB)) / 3 (possible 5' UTR fusion in geneB) |
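
As a hedged example of working with these fields, the sketch below keeps only rows with a given rating. It assumes final_fusions.txt is tab-delimited with a header row whose column names match the field names above; the column is located by name rather than by position, and filter_by_rating is a hypothetical helper.

```shell
# filter_by_rating FILE RATING
# Prints the header plus every row whose "rating" column equals RATING.
# Assumption: FILE is tab-delimited and its first line is a header.
filter_by_rating() {
    awk -F '\t' -v want="$2" '
        NR == 1 {
            # find the rating column by its header name
            for (i = 1; i <= NF; i++) if ($i == "rating") col = i
            print
            next
        }
        col && $col == want
    ' "$1"
}
```

For instance, `filter_by_rating final_fusions.txt HQ` would list only the known-fusion calls. If no rating column is found, only the header is printed.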

Citation

Tian, L., Li, Y., Edmonson, M.N. et al. CICERO: a versatile method for detecting complex and diverse driver fusions using cancer RNA sequencing data. Genome Biol 21, 126 (2020). https://doi.org/10.1186/s13059-020-02043-x

License

Copyright 2020 St. Jude Children's Research Hospital

Licensed under a modified version of the Apache License, Version 2.0 (the "License") for academic research use only; you may not use this file except in compliance with the License. To inquire about commercial use, please contact the St. Jude Office of Technology Licensing at [email protected].

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

cicero's People

Contributors

adthrasher, atrull314, b2pi, claymcleod, drjrm3, drkennetz, jordan-rash, liqingti, mcrusch, mjz1, mnedmonson

cicero's Issues

ExtractSClips.error- fail to determine the sequence name.

Hello
I am running Cicero with Docker. I am using the demo file using SJBALL020016_C27.bam file. I have used GRCh38_no_alt as reference file.
Here is my command
sudo docker run -v /home/deepak/:/home/deepak/ -v /media/deepak/EXTRA/:/media/deepak/EXTRA/ stjude/cicero:0.3.0 Cicero.sh -n 20 -b /media/deepak/EXTRA/SJBALL020016_C27.bam -g GRCh38_no_alt -r /home/deepak/Downloads/reference -o /media/deepak/EXTRA/output
But I am getting some error

  1. ExtractSClips.error- [bam_parse_region] fail to determine the sequence name.
  2. final fusion.txt file is not created.
    Here is my log output-
Optimize: setting SC_SHIFT=10, SC_CUTOFF=3, THRESHOLD=200000
/opt/cicero/configs
configs.b8e35fb555c0.1.tmp
SJ_CONFIGS=configs.b8e35fb555c0.1.tmp
Starting local blat server
Blat server not yet running ...
Blat server not yet running ...
Blat server not yet running ...
Blat server not yet running ...
Blat server not yet running ...
Blat server not yet running ...
Blat server not yet running ...
Blat server not yet running ...
Blat server not yet running ...
Blat server not yet running ...
Blat server not yet running ...
Blat server is up!
Step 01 - 2020.11.11 05:13:01 - ExtractSClips
CICERO_DATADIR/SJBALL020016_C27/SJBALL020016_C27.13.10001.16499999.cover has no soft clipped reads
CICERO_DATADIR/SJBALL020016_C27/SJBALL020016_C27.14.10001.16099999.cover has no soft clipped reads
CICERO_DATADIR/SJBALL020016_C27/SJBALL020016_C27.15.10001.17499999.cover has no soft clipped reads
CICERO_DATADIR/SJBALL020016_C27/SJBALL020016_C27.22.10001.13699999.cover has no soft clipped reads
CICERO_DATADIR/SJBALL020016_C27/SJBALL020016_C27.EBV.171823.cover has no soft clipped reads
CICERO_DATADIR/SJBALL020016_C27/SJBALL020016_C27.M.16569.cover has no soft clipped reads
Step 02 - 2020.11.11 05:35:01 - Cicero
Step 03 - 2020.11.11 05:46:33 - Combine
Step 04 - 2020.11.11 05:46:33 - Annotate
Step 05 - 2020.11.11 05:46:54 - Filter
wc: CICERO_DATADIR/SJBALL020016_C27/final_fusions.txt: No such file or directory
/opt/cicero/src/bin/Cicero.sh: line 348: [: -eq: unary operator expected
Killing blat server with pid 225
Killing   225 ?        00:03:29 gfServer
Killing   225 ?        00:03:29 gfServer

Any suggestion will be helpful.

Thanks

ERRO[4901] error waiting for container: unexpected EOF

I am running Cicero via Docker but unfortunately I am getting one error.

docker run -v /home/deepak/:/home/deepak/ -v /media/deepak/EXTRA/:/media/deepak/EXTRA/ stjude/cicero:0.3.0 Cicero.sh -n 35 -b /home/deepak/B4_second_markdup.bam -g GRCh38_no_alt -r /home/deepak/Downloads/reference -o /media/deepak/EXTRA/output -j /home/deepak/rnapeg/B4_second_markdup.bam.junctions.tab.shifted.tab

Optimize: setting SC_SHIFT=10, SC_CUTOFF=3, THRESHOLD=200000
/opt/cicero/configs
configs.31ace14e1140.1.tmp
SJ_CONFIGS=configs.31ace14e1140.1.tmp
Starting local blat server
Blat server not yet running ...
Blat server not yet running ...
Blat server not yet running ...
Blat server is up!
Step 01 - 2021.08.28 08:41:10 - ExtractSClips
ERRO[4901] error waiting for container: unexpected EOF

Any suggestion to resolve the issue?

Thanks,
Jay

-g vs -r

Could you please let me know what to give for the -g and -r parameters? I tried a few variations of the file paths given to these parameters. However, none of them seems to work.

I downloaded the reference files - Cicero_reference/Homo_sapiens/GRCh37-lite/FASTA/GRCh37-lite.fa.

Failed to build docker image

I am trying to build the Docker image. I have used the following command.
sudo docker build -t stjude/cicero:0.2.0 .
However I am getting the error-

! Installing the dependencies failed: Module 'YAML' is not installed, Module 'XML::LibXML' is not installed, Module 'Test::Most' is not installed, Module 'XML::LibXML::Reader' is not installed
! Bailing out the installation for BioPerl-1.7.8.
! Installing the dependencies failed: Module 'Bio::Root::Version' is not installed
! Bailing out the installation for Bio-SamTools-1.35.
102 distributions installed
The command '/bin/sh -c SAMTOOLS="/tmp/samtools-0.1.17" cpanm --force -i Bio::DB::[email protected] && chown -R root:root /usr/local/.cpanm' returned a non-zero code: 1


Kindly give your suggestion.
Best
Jay

Can cicero be used on other organisms?

Hi!

I wanted to use cicero on Drosophila, but you state that you only support:
CICERO currently supports GRCh37-lite and GRCh38_no_alt.
Is there anything I can do to make it work on fly?

Romain

hg38 reference

Hi there,
I'm interested in trying out CICERO, and I was wondering whether you've prepared the hg38 reference files?
Much appreciated! A

CRLF2 missed fusions

From the hg38 support pull request:

Based on my understanding, this is specific for short deletion derived CRLF2 fusion (e.g. P2RY8-CRLF2 in the paper). We can only detect this kind of fusion through "novel junction" from RNApeg output.
However, sometimes RNApeg output does not put the junction as novel, that is the reason we need "next unless($line =~ m/novel/ || $line =~ m/chrX:1212/);"
For "unless($line =~ m/chrX:1212/)", it means the criteria is not for CRLF2. Because CICERO wants to detect CRLF2 fusion sensitively.

Originally posted by @liqingti in #24

Cannot rebuild docker image

Hello!

I am trying to build a docker image with the given Dockerfile, without the last 2 lines

ENTRYPOINT ["/opt/cicero/src/bin/Cicero.sh"]
CMD ["-h"]

as it is for a nextflow pipeline (nf-core/rnafusion) and nextflow by default expects a bash shell prompt and not the executable, but I have been unable to reproduce the build due to perl packages failing to install.
Mostly it fails with error: Installing the dependencies failed: Your Perl is not in the range '5.014'
I have tried to install with other Perl versions but I am not a Perl expert and run into other errors, such as:
env: ‘/run/rosetta/rosetta’: No such file or directory

Would it be possible to provide an updated docker file?

Final fusion result of local run is different from the standard result released

Why is the final fusion result of a local Docker run significantly different from the released standard result?

Compared with the standard result "CICERO output SJBALL020016_C27_final_fusions.txt", which contains 121 fusions, my final fusion file has only 51. Do some specific parameters need to be set?

Following the instructions, I ran CICERO through Docker on the demo data SJBALL020016_C27.bam with default parameters.

This is my command:
docker run -v /home/libproject/filebase/CICERO_GRCh37-lite_reference:/reference -v /home/libproject/Project/development/CICERO_test:/output ghcr.io/stjude/cicero:v1.5.1 Cicero.sh -b /output/SJBALL020016_C27.bam -g GRCh37-lite -r /reference -o /output/test_CICERO_output_v2

There were no warning or error messages during the run, and I thought it had completed successfully. I have no idea where the differences come from.

Hope for your reply!

ERROR: no local reference sequence found for BAM reference name chr1_KI270762v1_alt

I am running RNApeg via Docker.
Here is my command
sudo docker run -v /home/deepak/rnapeg:/results -v /home/deepak/:/home/deepak/ -v /media/deepak/Deepak4T/:/media/deepak/Deepak4T/ mnedmonson/public:rnapeg RNApeg.sh -b /media/deepak/Deepak4T/B1_dedup_reads.bam -f /home/deepak/Downloads/reference/Homo_sapiens/GRCh38_no_alt/FASTA/GRCh38_no_alt.fa -r /home/deepak/Downloads/reference/Homo_sapiens/GRCh38_no_alt/mRNA/RefSeq/refFlat.txt
However I am getting the following error

ERROR: no local reference sequence found for BAM reference name chr1_KI270762v1_alt
ERROR: no local reference sequence found for BAM reference name chr1_KI270766v1_alt
ERROR: no local reference sequence found for BAM reference name chr1_GL383518v1_alt

.
.
.
ERROR: no local reference sequence found for BAM reference name chr19_GL949752v1_alt
ERROR: no local reference sequence found for BAM reference name chr6_KI270758v1_alt
ERROR: no local reference sequence found for BAM reference name chr19_GL949753v2_alt
ERROR: no local reference sequence found for BAM reference name chr19_KI270938v1_alt
ERROR: java.io.IOException: BAM header not fully compatible with specified reference sequence
java.io.IOException: BAM header not fully compatible with specified reference sequence
at org.stjude.compbio.rnapeg.SplicedReadReporter.report(SplicedReadReporter.java:241)
at org.stjude.compbio.rnapeg.SplicedReadReporter.main(SplicedReadReporter.java:744)
where is /results/B1_dedup_reads.bam.junctions.tab at /RNApeg/src/bin/junction_extraction_wrapper.pl line 338

I have done STAR 2 mode alignment using "Homo_sapiens/GRCh38_no_alt/FASTA/GRCh38_no_alt.fa" followed by
Mark Duplicates using Picard.

Kindly help to resolve the issue.
Thanks and Regards,
Jay

gfServer fails to start/is aborted.

Hello all,

I've been trying to run CICERO on a collection of bams aligned to Grch_38_no_alt. I've already run picard to mark/remove duplicates and RNApeg as well to get the junction files. Upon running CICERO, the help screen for gfServer appears and I get a message saying the gfServer was aborted. The err and log files are very similar, attached is the log:

2022/02/16 02:35:43: error: gfServer v 37x1 - Make a server to quickly find where DNA occurs in genome
To set up a server:
gfServer start host port file(s)
where the files are .2bit or .nib format files specified relative to the current directory
To remove a server:
gfServer stop host port
To query a server with DNA sequence:
gfServer query host port probe.fa
To query a server with protein sequence:
gfServer protQuery host port probe.fa
To query a server with translated DNA sequence:
gfServer transQuery host port probe.fa
To query server with PCR primers:
gfServer pcr host port fPrimer rPrimer maxDistance
To process one probe fa file against a .2bit format genome (not starting server):
gfServer direct probe.fa file(s).2bit
To test PCR without starting server:
gfServer pcrDirect fPrimer rPrimer file(s).2bit
To figure out if server is alive, on static instances get usage statics as well:
gfServer status host port
For dynamic gfServer instances, specify -genome and optionally the -genomeDataDir
to get information on an untranslated genome index. Include -trans to get about information
about a translated genome index
To get input file list:
gfServer files host port
To generate a precomputed index:
gfServer index gfidx file(s)
where the files are .2bit or .nib format files. Separate indexes are
be created for untranslated and translated queries. These can be used
with a persistent server as with 'start -indexFile or a dynamic server.
They must follow the naming convention for for dynamic servers.
To run a dynamic server (usually called by xinetd):
gfServer dynserver rootdir
Data files for genomes are found relative to the root directory.
Queries are made using the prefix of the file path relative to the root
directory. The files $genome.2bit, $genome.untrans.gfidx, and
$genome.trans.gfidx are required. Typically the structure will be in
the form:
$rootdir/$genomeDataDir/$genome.2bit
$rootdir/$genomeDataDir/$genome.untrans.gfidx
$rootdir/$genomeDataDir/$genome.trans.gfidx
in this case, one would call gfClient with
-genome=$genome -genomeDataDir=$genomeDataDir
Often $genomeDataDir will be the same name as $genome, however it
can be a multi-level path. For instance:
GCA/902/686/455/GCA_902686455.1_mSciVul1.1/
The translated or untranslated index maybe omitted if there is no
need to handle that type of request.
The -perSeqMax functionality can be implemented by creating a file
$rootdir/$genomeDataDir/$genome.perseqmax

options:
-tileSize=N Size of n-mers to index. Default is 11 for nucleotides, 4 for
proteins (or translated nucleotides).
-stepSize=N Spacing between tiles. Default is tileSize.
-minMatch=N Number of n-mer matches that trigger detailed alignment.
Default is 2 for nucleotides, 3 for proteins.
-maxGap=N Number of insertions or deletions allowed between n-mers.
Default is 2 for nucleotides, 0 for proteins.
-trans Translate database to protein in 6 frames. Note: it is best
to run this on RepeatMasked data in this case.
-log=logFile Keep a log file that records server requests.
-seqLog Include sequences in log file (not logged with -syslog).
-ipLog Include user's IP in log file (not logged with -syslog).
-debugLog Include debugging info in log file.
-syslog Log to syslog.
-logFacility=facility Log to the specified syslog facility - default local0.
-mask Use masking from .2bit file.
-repMatch=N Number of occurrences of a tile (n-mer) that triggers repeat masking the
tile. Default is 2252.
-noSimpRepMask Suppresses simple repeat masking.
-maxDnaHits=N Maximum number of hits for a DNA query that are sent from the server.
Default is 100.
-maxTransHits=N Maximum number of hits for a translated query that are sent from the server.
Default is 200.
-maxNtSize=N Maximum size of untranslated DNA query sequence.
Default is 40000.
-maxAaSize=N Maximum size of protein or translated DNA queries.
Default is 8000.
-perSeqMax=file File contains one seq filename (possibly with ':seq' suffix) per line.
-maxDnaHits will be applied to each filename[:seq] separately: each may
have at most maxDnaHits/2 hits. The filename MUST not include the directory.
Useful for assemblies with many alternate/patch sequences.
-canStop If set, a quit message will actually take down the server.
-indexFile Index file create by `gfServer index'. Saving index can speed up
gfServer startup by two orders of magnitude. The parameters must
exactly match the parameters when the file is written or bad things
will happen.
-timeout=N Timeout in seconds.
Default is 90.

2022/02/16 02:35:43: error: gfServer aborted

this is the command I am running:

docker run -v ${CICERO_REFERENCES}:/reference \
-v ${bam_dir}:/bamdir \
-v ${genome}:/fasta \
-v ${outdir_cicero}:/out \
-v ${outdir_rnapeg}:/rnapeg \
ghcr.io/stjude/cicero:latest Cicero.sh \
-n 4 \
-b /bamdir/${i}_marked_duplicates.bam \
-g GRCh38_no_alt.fa \
-r /reference \
-o /out \
-j /rnapeg/${i}_marked_duplicates.bam.junctions.tab.shifted.tab

I made sure BLAT and gfServer are installed in usr/local/bin which is in the $PATH, and so should be accessible to Cicero.sh. I've also made sure the CICERO scripts folder is in the $PATH and the perl5libs dependencies are in $PERL5LIB. It just seems like the gfServer isn't launching and I'm not sure why. I've also tried running Cicero in docker with the same results. Could it have to do with the port? This is being done on an azure compute VM, with ubuntu. Can anyone help me?

Can we get read names for fusion-supporting reads in the output?

Is there a way to find the read names for reads that are supporting a particular fusion call?

For a particular fusion I cannot find the reads supporting it in IGV (usually this is not a problem) using the same bam file that was input for CICERO, and would like to try and see what's going on.

Thanks for the great work on this tool!

singularity error

Hi there,
My HPC does not allow direct Docker use, so I'm using Singularity instead to build from the Docker image.
Using this command:

singularity --verbose --debug build cicero.simg docker://stjude/cicero:0.3.0

DEBUG   [U=2341426,P=350718]persistentPreRunE()           Singularity version: 3.5.0+dirty
DEBUG   [U=2341426,P=350718]handleConfDir()               /home/mhamdan/.singularity already exists. Not creating.
DEBUG   [U=2341426,P=350718]getCacheBasedir()             environment variable SINGULARITY_CACHEDIR not set, using default image cache
DEBUG   [U=2341426,P=350718]updateCacheSubdir()           Caching directory set to /home/mhamdan/.singularity/cache/library
DEBUG   [U=2341426,P=350718]updateCacheSubdir()           Caching directory set to /home/mhamdan/.singularity/cache/oci-tmp
DEBUG   [U=2341426,P=350718]updateCacheSubdir()           Caching directory set to /home/mhamdan/.singularity/cache/oci
DEBUG   [U=2341426,P=350718]updateCacheSubdir()           Caching directory set to /home/mhamdan/.singularity/cache/net
DEBUG   [U=2341426,P=350718]updateCacheSubdir()           Caching directory set to /home/mhamdan/.singularity/cache/shub
DEBUG   [U=2341426,P=350718]updateCacheSubdir()           Caching directory set to /home/mhamdan/.singularity/cache/oras
DEBUG   [U=2341426,P=350718]newBundle()                   Created temporary directory "/tmp/bundle-temp-170087539" for the bundle
DEBUG   [U=2341426,P=350718]newBundle()                   Created directory "/tmp/rootfs-e8fe5976-38c8-11eb-ab90-6c2b59b4600d" for the bundle
DEBUG   [U=2341426,P=350718]ensureGzipComp()              Ensuring gzip compression for mksquashfs
DEBUG   [U=2341426,P=350718]ensureGzipComp()              Gzip compression by default ensured
INFO    [U=2341426,P=350718]Full()                        Starting build...
DEBUG   [U=2341426,P=350718]Get()                         Reference: stjude/cicero:0.3.0
DEBUG   [U=2341426,P=350718]cleanUp()                     Cleaning up "/tmp/rootfs-e8fe5976-38c8-11eb-ab90-6c2b59b4600d" and "/tmp/bundle-temp-170087539"
FATAL   [U=2341426,P=350718]runBuildLocal()               While performing build: conveyor failed to get: Error reading manifest 0.3.0 in docker.io/stjude/cicero: errors:
denied: requested access to the resource is denied
unauthorized: authentication required

It says authentication is required. I'm not sure whether I've done anything wrong with the command.
Appreciate any input.
A

Running files in batch

I am running RNApeg and CICERO via Docker.
Is there any way to run multiple files in batch mode?

Thanks
Jay

/opt/cicero/src/bin/Cicero.sh: line 318: /usr/bin/find: Argument list too long

Hi,

This is connected to a closed issue:
#73 (comment)

For certain samples I still get the "Argument list too long" error from find:
/opt/cicero/src/bin/Cicero.sh: line 318: /usr/bin/find: Argument list too long
/opt/cicero/src/bin/Cicero.sh: line 319: /usr/bin/find: Argument list too long

It seems to be connected to the "*" in the command. I manually tried the following:
find ~/CICERO/sample1/CICERO_DATADIR/sample1_Aligned.out_sort/*/ -type f -name 'unfiltered.fusion.txt' -exec cat {} \; | sort -V -k 9,9 -k 10,10n -k 11,11n > ./TEST1_unfiltered.fusion.txt
-bash: /usr/bin/find: Argument list too long

find ~/CICERO/sample1/CICERO_DATADIR/sample1_Aligned.out_sort/ -type f -name 'unfiltered.fusion.txt' -exec cat {} \; | sort -V -k 9,9 -k 10,10n -k 11,11n > ./TEST2_unfiltered.fusion.txt
This works.

PS: I am running RNApeg v2.7.1 with CICERO 1.8.1. However, since the find command is still the same, I expect the newest version to produce the same error.
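
The difference between the two manual commands above is consistent with the shell expanding "*/" into one argument per subdirectory before find starts, which can exceed the system's argument-size limit. A minimal sketch of the same collection step with no glob follows. `collect_fusions` is a hypothetical helper name, not part of CICERO, and the `-mindepth 2` filter assumes the fusion files sit at least one directory below the sample directory, as the "*/" in the original command implies.

```shell
#!/usr/bin/env bash
# Sketch: concatenate every unfiltered.fusion.txt below a sample directory
# without a shell glob. collect_fusions is a hypothetical helper name.
collect_fusions() {
    local sample_dir="$1"
    # Only one path is passed to find; find walks the subdirectories itself.
    # -mindepth 2 skips files located directly in sample_dir, mirroring "*/".
    # "-exec cat {} +" lets find batch the file names within system limits.
    find "$sample_dir" -mindepth 2 -type f -name 'unfiltered.fusion.txt' -exec cat {} + |
        sort -V -k 9,9 -k 10,10n -k 11,11n
}
```

For example, `collect_fusions ~/CICERO/sample1/CICERO_DATADIR/sample1_Aligned.out_sort > ./unfiltered.fusion.txt` would produce the merged, sorted file.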

gf server error in BLAT

I have installed CICERO on my computer, but I am having trouble running it.
My command:
/media/deepak/EXTRA/CICERO-master/src/scripts/Cicero.sh -n 20 -b /media/deepak/Deepak4T/NEW_SAMPLE/SAMPLE_3/star/sample.sorted.bam -g /media/deepak/Deepak4T/Genomedir/hg38/hg38.fa -r /home/deepak/Downloads/reference

Output:
Optimize: setting SC_SHIFT=10, SC_CUTOFF=3, THRESHOLD=200000
/media/deepak/EXTRA/CICERO-master/configs
configs.ngs.26697.tmp
SJ_CONFIGS=configs.ngs.26697.tmp
awk: fatal: cannot open file configs.ngs.26697.tmp/genome//media/deepak/Deepak4T/Genomedir/hg38/hg38.fa.config.txt' for reading (No such file or directory) awk: fatal: cannot open file configs.ngs.26697.tmp/genome//media/deepak/Deepak4T/Genomedir/hg38/hg38.fa.config.txt' for reading (No such file or directory)
awk: fatal: cannot open file configs.ngs.26697.tmp/genome//media/deepak/Deepak4T/Genomedir/hg38/hg38.fa.config.txt' for reading (No such file or directory) awk: fatal: cannot open file configs.ngs.26697.tmp/genome//media/deepak/Deepak4T/Genomedir/hg38/hg38.fa.config.txt' for reading (No such file or directory)
Starting local blat server
Blat server not yet running ...
ERROR: The server has probably died. Please review error logs at below location:
/media/deepak/EXTRA/CICERO-master/src/scripts/gfServer.ngs.26697.err

Kindly help me resolve this issue.

Failure to run docker for rnapeg.

Hi,
I am trying to analyze RNA fusions using CICERO.
I am using the terminal on a MacBook.

I ran the following command line:

docker pull ghcr.io/stjude/cicero:latest --platform linux/amd64

And then,

docker run -v /Users/dajeong/CICERO_DJ:/results ghcr.io/stjude/rnapeg:latest --platform linux/amd64
-b /results/ATL005_STAR_Aligned.out.bam
-f /results/hg38.fa
-r /results/refFlat.txt

The error messages are as follows:

Unable to find image 'ghcr.io/stjude/rnapeg:latest' locally
latest: Pulling from stjude/rnapeg
docker: no matching manifest for linux/arm64/v8 in the manifest list entries.
See 'docker run --help'.

Please help me.
Thanks!

Sincerely,
DJ. J
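
In `docker run`, options have to come before the image name, and everything after the image is handed to the container. In the command above, `--platform linux/amd64` follows the image, so Docker still looks for a manifest matching the host (linux/arm64/v8). A minimal sketch with the option moved follows. `run_rnapeg_amd64` is a hypothetical helper, `DOCKER` is a hypothetical override used only for dry runs, and whether the amd64 image then runs acceptably under emulation on this machine is an assumption.

```shell
#!/usr/bin/env bash
# Sketch: run the RNApeg image with --platform placed before the image name.
# run_rnapeg_amd64 and the DOCKER override are hypothetical, for illustration.
run_rnapeg_amd64() {
    local host_dir="$1" bam="$2" fasta="$3" refflat="$4"
    # Docker options (--platform, -v) come first; the image name ends them.
    # The -b/-f/-r arguments after the image are passed to the container.
    ${DOCKER:-docker} run \
        --platform linux/amd64 \
        -v "${host_dir}:/results" \
        ghcr.io/stjude/rnapeg:latest \
        -b "/results/${bam}" \
        -f "/results/${fasta}" \
        -r "/results/${refflat}"
}
```

With the paths from the report this would be called as `run_rnapeg_amd64 /Users/dajeong/CICERO_DJ ATL005_STAR_Aligned.out.bam hg38.fa refFlat.txt`. Setting `DOCKER=echo` prints the arguments instead of starting a container.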

Error when installing CICERO via docker on ArchLinux

Hello everyone

I'm trying to install cicero using docker.
I've successfully run RNApeg (for now) after installing it with Docker.

But when I try to install CICERO via docker I get the following error:

docker build -t stjude/cicero:1.4.0 https://github.com/stjude/CICERO/blob/master/Dockerfile

Error response from daemon: dockerfile parse error line 7: unknown instruction: <!DOCTYPE

I've also tried to download the Dockerfile and build from it locally, with the same error.

Output of docker version


Client:
 Version:           20.10.8
 API version:       1.41
 Go version:        go1.16.6
 Git commit:        3967b7d28e
 Built:             Wed Aug  4 10:59:01 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.8
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.6
  Git commit:       75249d88bc
  Built:            Wed Aug  4 10:58:48 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.5.5
  GitCommit:        72cec4be58a9eb6b2910f5d10f1c01ca47d231c0.m
 runc:
  Version:          1.0.2
  GitCommit:        v1.0.2-0-g52b36a2d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
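
The "<!DOCTYPE" in the parse error suggests Docker received an HTML page rather than the Dockerfile, which is what a GitHub "/blob/" URL serves. A minimal sketch that rewrites such a URL to its raw-file form follows. `github_blob_to_raw` is a hypothetical helper name, and the guess that the locally downloaded file was also the HTML page is an assumption.

```shell
#!/usr/bin/env bash
# Sketch: convert a GitHub "blob" page URL into the raw-file URL.
# github_blob_to_raw is a hypothetical helper name.
github_blob_to_raw() {
    # https://github.com/OWNER/REPO/blob/REF/PATH
    #   -> https://raw.githubusercontent.com/OWNER/REPO/REF/PATH
    printf '%s\n' "$1" |
        sed -E 's#^https://github\.com/([^/]+)/([^/]+)/blob/#https://raw.githubusercontent.com/\1/\2/#'
}
```

Note that building from a single Dockerfile URL gives Docker no build context. If the Dockerfile copies files from the repository, cloning the repository and running `docker build -t stjude/cicero:1.4.0 .` inside the checkout is likely the safer route.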

Docker Error: Can't locate English.pm in @INC

Hi - I've been trying to run Cicero with the Docker image provided here on GitHub, but I've been having some trouble.

Image Download

Because the High Performance Computing cluster I use does not allow Docker,
I pulled the Docker image with Singularity:

singularity pull docker://ghcr.io/stjude/cicero:v1.8.1

Minimal working example of error

This is the bash script I ran. I used the demo data provided here. (I've substituted my username with xxx)

#!/bin/bash

export LC_ALL=C
export LANG=en_US.UTF-8
export LANGUAGE=en_US.UTF-8

singularity exec \
--bind /home/xxx/analysis/cicero/test-dataset/MV4_11,/home/xxx/ref/grch38/cicero-reference-genome,/home/xxx/analysis/cicero/src/run-cicero,/home/xxx/analysis/cicero/result \
/home/xxx/bin/cicero/v1.8.1/docker/cicero-v1.8.1.simg \
bash /home/xxx/analysis/cicero/src/run-cicero/test_cicero_singularity_runscript_2021-12-18.sh

The bash file /home/xxx/analysis/cicero/src/run-cicero/test_cicero_singularity_runscript_2021-12-18.sh
contains:

#!/bin/bash

export PATH=/opt/conda/bin:/opt/cicero/src/bin:/opt/cicero/configs/genome:/opt/cicero/configs/app:$PATH

Cicero.sh \
-n 8 \
-b /home/xxx/analysis/cicero/test-dataset/MV4_11/MV4_11_RNAseq_1.bam \
-g GRCh38_no_alt \
-r /home/xxx/ref/grch38/cicero-reference-genome/reference/ \
-o /home/xxx/analysis/cicero/result/test/2021-12-18/MV4_11

This is what I get in my standard error stream:

exit: Error in ExtractSClips: numeric argument required

In addition, in one of the output files (01_ExtractSClips.err), I get this error message:

Can't locate English.pm in @INC (you may need to install the English module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.26.1 /usr/local/share/perl/5.26.1 /usr/lib/x86_64-linux-gnu/perl5/5.26 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.26 /usr/share/perl/5.26 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base) at /opt/cicero/src/bin/get_sc_cmds.pl line 8.

BEGIN failed--compilation aborted at /opt/cicero/src/bin/get_sc_cmds.pl line 8.

Can't locate English.pm in @INC (you may need to install the English module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.26.1 /usr/local/share/perl/5.26.1 /usr/lib/x86_64-linux-gnu/perl5/5.26 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.26 /usr/share/perl/5.26 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base) at /opt/cicero/src/bin/get_geneInfo.pl line 11.

BEGIN failed--compilation aborted at /opt/cicero/src/bin/get_geneInfo.pl line 11.

This error message suggests that the English perl module is missing in the Docker container,
but is this the case? Is this because I am running the Docker image in Singularity?

I would very much appreciate any help.

RNApeg input for "-r refflat" and "-rg refflat" is not clear

I am running RNApeg via Docker to generate a junctions file for use by CICERO. However, I do not understand what the input for "-r refflat" and "-rg refflat" should be in the documented command.
I downloaded a refFlat file from UCSC and used the following command:
sudo docker run -v /home/deepak/:/home/deepak/ -v /media/deepak/Deepak4T/:/media/deepak/Deepak4T/ mnedmonson/public:rnapeg RNApeg.sh -b /media/deepak/Deepak4T/NEW_SAMPLE/SAMPLE_3/star/sample.sorted.bam -f /home/deepak/Downloads/reference/Homo_sapiens/GRCh37-lite/FASTA/GRCh37-lite.fa -r /home/deepak/Downloads/HG19REFFLAT

I am getting the following error:

RNApeg.sh [-h] -b bamfile -f fasta -r refflat [-rg refflat]
ERROR: Output directory '/results' does not exist; need mountpoint?

Where in the command should the output directory be specified?

Kindly help to resolve the issue.
Thanks and Regards
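
The error message says the container expects an output directory at /results, and another report on this page mounts a host directory there with `-v ${RNApeg_outdir}:/results`. A minimal sketch that adds such a mount in front of the remaining `docker run` arguments follows. `rnapeg_with_results` is a hypothetical helper, `DOCKER` is a hypothetical override for dry runs, and the host output path in the usage note is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: make sure a host directory is mounted at /results for RNApeg.
# rnapeg_with_results and the DOCKER override are hypothetical.
rnapeg_with_results() {
    local outdir="$1"   # host directory; use an absolute path for a bind mount
    shift
    mkdir -p "$outdir"
    # The remaining arguments are the other docker options, the image name,
    # and the RNApeg arguments, passed through unchanged.
    ${DOCKER:-docker} run -v "${outdir}:/results" "$@"
}
```

With the command from the report this could look like `rnapeg_with_results /home/deepak/rnapeg_out -v /home/deepak/:/home/deepak/ -v /media/deepak/Deepak4T/:/media/deepak/Deepak4T/ mnedmonson/public:rnapeg RNApeg.sh -b ... -f ... -r ...`, keeping the original -b, -f and -r values.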

frameness?

Hi, is there a way to have CICERO output frame predictions, such as in.frame, out.of.frame, or undetermined?

Edit: I noticed two columns, frame and sv_frame, but can't work out how to use them to determine the frame.

thanks!

Update Ground Truth Data

There have been various changes to the CICERO code that have affected results. We need to generate new ground truth data for our test sample and validate that it is correct.

Running in batch error and CICERO keeps processing files using only 1 thread

Hello!
I've been trying to run CICERO using all 32 threads available on my system.
To do that, I used:

docker run -v /home:/reference -v /home:/data -v /home:/output ghcr.io/stjude/cicero:latest Cicero.sh -n 32 -b /data/Análises/Teste1/star/cicero/11_FRAS202372577-2r_1.fqAligned.sortedByCoord.out.bam -g GRCh38_no_alt -r /reference/references/cicero/reference -o /output/Análises/Teste1/cicero/teste -j /data/Análises/Teste1/RNAPEG/trueseq/11_FRAS202372577-2r_1.fqAligned.sortedByCoord.out.bam.junctions.tab.shifted.tab.annotated.tab

This works, but it takes 4 days and only ever uses 1 thread.
I've also been trying to run it on a batch of files. I used:

 while IFS="," read fq tab
    do
    echo $fq
    echo $tab
docker run -v /home:/reference -v /home:/data -v /home:/results  ghcr.io/stjude/cicero:latest Cicero.sh -n 32 -b /data/Análises/Teste1/star/cicero/$fq -g GRCh38_no_alt -r /reference/references/cicero/reference -o /results/Análises/Teste1/cicero/total -j /data/Análises/Teste1/RNAPEG/total/$tab
    done < /home/cicero/samplenames/samples.csv

# samples.csv contains both the BAM and the annotated.tab from RNApeg in two columns

This works, going sample by sample.
But I get the following error every time processing finishes for one sample:

wc: CICERO_DATADIR/100_FRAS202421990-1a_1.fqAligned.sortedByCoord.out/final_fusions.txt: No such file or directory

Every time a new sample is run, the existing files in /CICERO_DATADIR are deleted and I get the error above.

On top of that, each sample takes 4 days. Even though I use -n 32, CICERO uses only 1 thread for almost all of the processing. Since it seems to use only 1 thread anyway, it would be better to run multiple instances of CICERO, because our cluster has plenty of RAM, but I can't start multiple Docker instances with my loop.
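
In the loop above every sample writes to the same `-o` directory, which matches the observation that files under CICERO_DATADIR disappear when the next sample starts. A minimal sketch that gives each sample its own output directory and starts a few containers at a time follows. `cicero_batch`, the `DOCKER` dry-run override, and the `-n 8` core count are hypothetical choices, and the claim that separate output directories remove the wc error is an assumption.

```shell
#!/usr/bin/env bash
# Sketch: one CICERO container per sample, each with its own output directory,
# running up to max_jobs containers at once. cicero_batch is hypothetical.
cicero_batch() {
    local samples_csv="$1"    # two comma-separated columns: BAM name, junctions name
    local max_jobs="${2:-2}"  # containers started before waiting for the group
    local running=0 bam tab sample
    while IFS=, read -r bam tab || [ -n "$bam" ]; do
        [ -n "$bam" ] || continue
        sample=${bam%.bam}    # per-sample output directory name
        # </dev/null keeps the container from reading the CSV on stdin.
        ${DOCKER:-docker} run \
            -v /home:/reference -v /home:/data -v /home:/results \
            ghcr.io/stjude/cicero:latest Cicero.sh -n 8 \
            -b "/data/Análises/Teste1/star/cicero/${bam}" \
            -g GRCh38_no_alt \
            -r /reference/references/cicero/reference \
            -o "/results/Análises/Teste1/cicero/${sample}" \
            -j "/data/Análises/Teste1/RNAPEG/total/${tab}" </dev/null &
        running=$((running + 1))
        if [ "$running" -ge "$max_jobs" ]; then
            wait              # let the current group finish before starting more
            running=0
        fi
    done < "$samples_csv"
    wait
}
```

For example, `cicero_batch /home/cicero/samplenames/samples.csv 4` would run four samples at a time. Calling it with `DOCKER=echo` prints the commands without starting containers.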

ERROR: unrecognized parameter RNApeg.sh

Hi,
When running RNApeg as described in the README, I get the error message "ERROR: unrecognized parameter RNApeg.sh".
If I omit "RNApeg.sh" (see the command below), it runs.
Best, Dagmar

docker run  --user=$(id -u):$(id -g) \
	   -v /etc/localtime:/etc/localtime:ro \
	   -v ${RNApeg_outdir}:/results \
	   -v ${star_outdir}:/bamdir:ro \
	   -v ${refFlatfile}:/refFlat:ro \
	   -v ${fastadir}:/fastadir:ro \
	   -v ${cicero_ref}:/refdir \
	   ghcr.io/stjude/rnapeg:v2.7.1 -b /bamdir/${sampleID}_Aligned.out_sort.bam \
	   -f /fastadir/${fastafile} -r /refFlat

UBTF::ATXN7L3

Hi,
I would highly recommend adding UBTF::ATXN7L3 to known_fusion.txt; otherwise this clinically relevant fusion is filtered out as readthrough. I also added the following coordinates to known_breakpoints.txt, and now it works.
chr17 44209352
chr17 44198130
Best, Dagmar

Missing chromosome information in annotated file

This may only be an issue with hg38, but it is possible for the annotated output file to have a record with no chromosome information for one of the partners, which causes sv_inframe.pl to fail.

SJLGG064626_D1 GTPBP4 chr10 992606 + coding - > 4 85 0.00 0.00 118 0.07 0.01 86 137 79 0 CGAAAGACTCCAACCGTTATTCATAAACATTACCAAATACATCGCATTAGACATTTTTACATGAGAAAAGTCAAATTTACTCAACGAAAGACTCCAACCGTTATTCATAAACATTACCAAATACATCGCATTAGAC CTX

Performance issues

Dear St.Jude Researchers,

Lately I have been trying your tool for RNA-seq fusion detection.
I ran it on a paired-end bulk RNA-seq sample (150 bp) with 16 million reads.
I used 16 threads and 64 GB of memory with the following command:

Cicero.sh -n 16  \
                 -b $PWD/sample.Aligned.sortedByCoord.out.bam \
                 -g "GRCh38_no_alt" \
                 -r $PWD/reference/  \
                 -o sample \
                 -j sample.Aligned.sortedByCoord.out.bam.junctions.tab.shifted.tab

It takes a long time to finish: 10 h 55 m 48 s.

Furthermore, it gives me a high rate of ambiguous calls (I attached the output file).

Am I doing something wrong, or is this normal?

Thank you in advance!
Best regards,
Youssef

sample.Aligned.sortedByCoord.out.final_fusions.txt

Interpretation of output spanning reads and discordant reads

I was wondering if there's a more detailed description of the output columns.
The only resource I could find was here: https://university.stjude.cloud/docs/genomics-platform/workflow-guides/rapid-rnaseq/#interpreting-results, and the FusionEditor help page. Neither of those resources contains definitions for columns such as matchA, repeatA, coverageA, or ratioA.

Specifically, I have the following questions:

  • If the readsA and readsB columns in final_fusions.txt contain the number of chimeric reads for a fusion, why do the two have different values for the same fusion? By definition, a chimeric read should contain sequences from both the A and B genes.

  • Are there metrics equivalent to STAR-Fusion's JunctionReadCount and SpanningFragCount columns? According to the STAR-Fusion wiki, the former "indicates the number of RNA-Seq fragments containing a read that aligns as a split read at the site of the putative fusion junction", and the latter "indicates the number of RNA-Seq fragments that encompass the fusion junction such that one read of the pair aligns to a different gene than the other paired-end read of that fragment".

If you could illustrate how CICERO's output readsA and readsB columns (or maybe other columns) relate to these two concepts, to enable us to do a more thorough comparison of the two methods, that would be great!

Thank you!

Error with reference: BAM header not fully compatible with specified reference sequence

Hello everyone!
I've been trying to run both CICERO and RNApeg with the reference files provided.
I aligned my reads with the Ensembl primary assembly (hg38) FASTA and GTF files, the latest version as of January 2021.

But then I get the following error:

[root@tcg gabriel.gama]# docker run -v /home/Análises/Teste1/RNAPEG/trueseq:/results -v /home/gabriel.gama:/data ghcr.io/stjude/rnapeg:latest -b /data/Análises/Teste1/star/all_samples_ensembl_chimericwithinbam/99_FRAS202421989-1a_1.fqAligned.sortedByCoord.out.bam -f /data/references/cicero/reference/Homo_sapiens/GRCh38_no_alt/FASTA/GRCh38_no_alt.fa -r /data/references/cicero/reference/Homo_sapiens/GRCh38_no_alt/mRNA/RefSeq/refFlat.txt set REFGENE to default /data/references/cicero/reference/Homo_sapiens/GRCh38_no_alt/mRNA/RefSeq/refFlat.txt [*] Running junction_extraction_wrapper.pl Tue Nov 30 20:48:45 2021: running: bam_junction.pl -type all -bam /data/Análises/Teste1/star/all_samples_ensembl_chimericwithinbam/99_FRAS202421989-1a_1.fqAligned.sortedByCoord.out.bam -annotate -now -force 1 -out 99_FRAS202421989-1a_1.fqAligned.sortedByCoord.out.bam.junctions.tab -no-config -refflat /data/references/cicero/reference/Homo_sapiens/GRCh38_no_alt/mRNA/RefSeq/refFlat.txt -fasta /data/references/cicero/reference/Homo_sapiens/GRCh38_no_alt/FASTA/GRCh38_no_alt.fa /usr/bin/env java -Xmx5g org.stjude.compbio.rnapeg.SplicedReadReporter -bam /data/Análises/Teste1/star/all_samples_ensembl_chimericwithinbam/99_FRAS202421989-1a_1.fqAligned.sortedByCoord.out.bam -of 99_FRAS202421989-1a_1.fqAligned.sortedByCoord.out.bam.junctions.tab -refflat /data/references/cicero/reference/Homo_sapiens/GRCh38_no_alt/mRNA/RefSeq/refFlat.txt -fasta /data/references/cicero/reference/Homo_sapiens/GRCh38_no_alt/FASTA/GRCh38_no_alt.fa -annotate intron cache: chr16_KI270854v1_alt: 22 chr19_KI270915v1_alt: 93 chr12_GL877875v1_alt: 36 chr21: 2809 chr6_GL000255v2_alt: 1567 chr22: 5237 chrUn_GL000224v1: 2 chr6_KI270801v1_alt: 56 chr20: 6091 chr8_KI270812v1_alt: 1 chr3_KI270780v1_alt: 5 chr11_KI270903v1_alt: 15 chr20_KI270870v1_alt: 29 chr19_GL949753v2_alt: 338 chr3_KI270779v1_alt: 96 chr5_KI270791v1_alt: 43 chr2_KI270767v1_alt: 12 chr22_KI270879v1_alt: 101 chr10: 10500 chr11: 13005 chrX_KI270881v1_alt: 10 
chr16_GL383557v1_alt: 11 chr17_KI270909v1_alt: 80 chr12: 13602 chr13: 4738 chr17_JH159148v1_alt: 5 chr12_KI270837v1_alt: 8 chr18: 3804 chr19: 13382 chr6_GL000250v2_alt: 658 chr14: 7276 chr15: 9088 chr16: 10275 chr15_KI270905v1_alt: 508 chr17: 13711 chrUn_KI270753v1: 3 chr7_KI270899v1_alt: 8 chr19_KI270930v1_alt: 103 chr4_GL383527v1_alt: 25 chr17_KI270861v1_alt: 111 chr19_KI270885v1_alt: 93 chr18_KI270911v1_alt: 2 chr5_GL339449v2_alt: 245 chr2_KI270773v1_alt: 5 chrX_KI270880v1_alt: 13 chr19_GL949747v2_alt: 336 chr3_KI270935v1_alt: 108 chr5_KI270796v1_alt: 1 chr9_GL383541v1_alt: 14 chr15_KI270906v1_alt: 28 chrUn_KI270741v1: 4 chr12_GL877876v1_alt: 19 chr8_KI270817v1_alt: 20 chr6_GL000254v2_alt: 1494 chr15_KI270848v1_alt: 88 chr4_KI270786v1_alt: 2 chrUn_GL000213v1: 10 chr19_KI270914v1_alt: 91 chr5_GL949742v1_alt: 12 chr19_KI270884v1_alt: 81 chr17_KI270860v1_alt: 36 chr21_GL383579v2_alt: 4 chr4_KI270896v1_alt: 16 chr8_KI270811v1_alt: 2 chr8_KI270818v1_alt: 1 chr15_GL383555v2_alt: 19 chr20_KI270869v1_alt: 17 chr3_KI270934v1_alt: 74 chr2_KI270774v1_alt: 19 chr11_JH159136v1_alt: 1 chr13_KI270838v1_alt: 23 chr15_KI270849v1_alt: 24 chr22_KI270875v1_alt: 19 chr1_KI270759v1_alt: 12 chr4_KI270925v1_alt: 4 chr12_KI270833v1_alt: 20 chr14_KI270847v1_alt: 244 chr19_KI270919v1_alt: 94 chr11_KI270831v1_alt: 71 chr3_KI270777v1_alt: 20 chr19_KI270866v1_alt: 10 chr2_KI270768v1_alt: 11 chr19_KI270921v1_alt: 149 chr12_GL383550v2_alt: 12 chr19_KI270886v1_alt: 102 chr21_GL383580v2_alt: 9 chr19_GL949746v1_alt: 428 chr17_GL383563v3_alt: 25 chr1_KI270763v1_alt: 79 chr8_KI270816v1_alt: 179 chr5_KI270793v1_alt: 2 chr19_KI270889v1_alt: 93 chr19_GL000209v2_alt: 96 chr3_KI270936v1_alt: 74 chr10_KI270825v1_alt: 31 chr7_GL383534v2_alt: 42 chr19_KI270890v1_alt: 103 chr8_KI270821v1_alt: 85 chr21_KI270873v1_alt: 3 chr6_KI270797v1_alt: 27 chr6_GL000256v2_alt: 1426 chr2_GL383521v1_alt: 1 chr1_GL383520v2_alt: 22 chr19_GL949748v2_alt: 237 chr7_KI270804v1_alt: 8 chr3_KI270781v1_alt: 1 chr15_GL383554v1_alt: 
60 chr8_KI270813v1_alt: 28 chr15_KI270727v1_random: 17 chr19_KI270865v1_alt: 31 chr19_KI270917v1_alt: 102 chr19_KI270888v1_alt: 79 chr5_KI270897v1_alt: 158 chr10_GL383546v1_alt: 27 chr19_KI270923v1_alt: 86 chr22_KI270877v1_alt: 14 chr19_KI270891v1_alt: 93 chr12_KI270835v1_alt: 52 chr3_GL383526v1_alt: 7 chr19_KI270916v1_alt: 105 chr22_KB663609v1_alt: 32 chr2_KI270770v1_alt: 2 chr8_KI270814v1_alt: 14 chr17_KI270910v1_alt: 26 chr3_KI270937v1_alt: 77 chr2_GL383522v1_alt: 29 chr11_KI270827v1_alt: 1 chr19_GL949752v1_alt: 462 chr19_KI270887v1_alt: 109 chr19_KI270922v1_alt: 104 chr20_KI270871v1_alt: 6 chr9_KI270823v1_alt: 10 chr10_GL383545v1_alt: 4 chr11_KI270832v1_alt: 127 chr7_KI270806v1_alt: 29 chr21_GL383581v2_alt: 22 chr5_KI270792v1_alt: 36 chr6_KI270802v1_alt: 4 chr1_KI270762v1_alt: 103 chr22_KI270878v1_alt: 24 chr7_KI270803v1_alt: 178 chr8_KI270815v1_alt: 4 chr9: 9769 chr19_GL949749v2_alt: 234 chr7: 11845 chr3_KI270783v1_alt: 4 chr8: 8584 chr5: 10980 chr6_KI270799v1_alt: 2 chr6: 12073 chr3: 14739 chr8_KI270900v1_alt: 14 chr4: 9486 chr1: 24620 chr2: 18571 chrUn_GL000220v1: 6 chr5_KI270794v1_alt: 4 chr8_KI270822v1_alt: 82 chr11_KI270830v1_alt: 23 chr6_GL000252v2_alt: 1561 chr21_KI270872v1_alt: 32 chr11_KI270721v1_random: 12 chr3_KI270924v1_alt: 85 chr14_KI270846v1_alt: 11 chr1_GL383519v1_alt: 90 chr5_KI270898v1_alt: 10 chr19_KI270918v1_alt: 72 chr19_KI270932v1_alt: 103 chr19_KI270938v1_alt: 497 chr22_KI270876v1_alt: 24 chr19_GL949750v2_alt: 218 chr12_KI270834v1_alt: 60 chr19_KI270882v1_alt: 118 chr4_KI270790v1_alt: 3 chr15_KI270852v1_alt: 30 chr22_GL383583v2_alt: 8 chr14_GL000194v1_random: 9 chr8_KI270926v1_alt: 14 chr14_KI270845v1_alt: 10 chr16_KI270728v1_random: 34 chr7_KI270808v1_alt: 5 chr19_KI270920v1_alt: 105 chr1_GL383518v1_alt: 52 chr18_KI270863v1_alt: 17 chr11_KI270829v1_alt: 8 chr8_KI270810v1_alt: 3 chr17_GL383566v1_alt: 2 chr6_KI270758v1_alt: 14 chr19_KI270868v1_alt: 25 chrUn_GL000218v1: 4 chr17_GL000258v2_alt: 176 chr6_KI270798v1_alt: 24 
chr22_GL383582v2_alt: 31 chr7_KI270809v1_alt: 23 chrY: 1017 chr13_KI270842v1_alt: 2 chr3_KI270782v1_alt: 8 chr1_KI270765v1_alt: 3 chrX: 8331 chr17_KI270907v1_alt: 8 chr19_GL949751v2_alt: 256 chr22_KI270928v1_alt: 41 chr19_KI270867v1_alt: 24 chr2_KI270769v1_alt: 10 chr5_KI270795v1_alt: 8 chrUn_GL000195v1: 6 chr16_GL383556v1_alt: 30 chr19_KI270883v1_alt: 94 chr17_GL383564v2_alt: 38 chr6_GL000251v2_alt: 1657 chr17_KI270908v1_alt: 168 chrUn_GL000219v1: 6 chr4_KI270789v1_alt: 3 chr19_KI270929v1_alt: 96 chr3_KI270895v1_alt: 74 chr19_KI270931v1_alt: 95 chr1_KI270766v1_alt: 25 chr17_KI270862v1_alt: 39 chr2_KI270893v1_alt: 12 chr3_JH636055v2_alt: 10 chr19_GL383576v1_alt: 2 chr16_KI270853v1_alt: 504 chr14_KI270844v1_alt: 39 chr18_GL383572v1_alt: 2 chr3_KI270784v1_alt: 26 chr22_KI270731v1_random: 12 chr8_KI270819v1_alt: 21 chr12_KI270904v1_alt: 32 chr11_KI270927v1_alt: 108 chr13_KI270840v1_alt: 23 chr18_GL383567v1_alt: 7 chr16_KI270856v1_alt: 4 chr9_GL383540v1_alt: 10 chr19_GL383575v2_alt: 9 chr6_GL000253v2_alt: 1227 chr19_KI270933v1_alt: 94 chr15_KI270850v1_alt: 100 chr2_KI270776v1_alt: 20 chr2_GL582966v2_alt: 25 chr12_GL383553v2_alt: 19 chr16_KI270855v1_alt: 97 chrX_KI270913v1_alt: 12 chr17_KI270857v1_alt: 587 chr17_JH159146v1_alt: 47 chr18_GL383571v1_alt: 1 chr19_GL383573v1_alt: 13 chr15_KI270851v1_alt: 87 chr22_KI270734v1_random: 22 chr4_GL000008v2_random: 4 chr17_JH159147v1_alt: 11 chr4_GL000257v2_alt: 49 chr1_KI270713v1_random: 11 chr19_GL383574v1_alt: 11 chr11_KI270902v1_alt: 55 total=266245 flatfile db load: 1062 ms checking BAM/reference compatibility...reference sequence names: [chr9_KI270719v1_random, chrUn_KI270509v1, chr21, chr22, chrUn_GL000224v1, chrUn_KI270317v1, chr2_KI270715v1_random, chr20, chrUn_KI270305v1, chrUn_KI270320v1, chrUn_KI270366v1, chrUn_KI270389v1, chrUn_KI270512v1, chrUn_KI270742v1, chrUn_KI270438v1, chrUn_KI270381v1, chrUn_KI270414v1, chr10, chr14_GL000009v2_random, chr11, chr12, chr13, chr18, chr19, chrUn_KI270316v1, chr14, chr15, 
chr14_KI270722v1_random, chr16, chr17, chrUn_KI270753v1, chr1_KI270710v1_random, chr22_KI270736v1_random, chrUn_KI270392v1, chrUn_KI270442v1, chrUn_KI270465v1, chrUn_KI270548v1, chr17_KI270729v1_random, chrUn_KI270382v1, chrUn_KI270741v1, chrUn_KI270749v1, chrUn_GL000213v1, chrUn_KI270329v1, chrUn_KI270378v1, chrUn_KI270393v1, chrUn_KI270752v1, chr22_KI270732v1_random, chrUn_KI270507v1, chrUn_KI270424v1, chrUn_GL000226v1, chr1_KI270712v1_random, chrUn_KI270510v1, chrUn_KI270579v1, chrUn_KI270303v1, chrUn_KI270364v1, chrUn_KI270387v1, chr17_KI270730v1_random, chrUn_KI270744v1, chr1_KI270711v1_random, chr9_KI270720v1_random, chr1_KI270708v1_random, chrUn_KI270412v1, chrUn_KI270435v1, chrUn_GL000214v1, chrUn_KI270519v1, chr1_KI270709v1_random, chrUn_KI270337v1, chrUn_KI270522v1, chrUn_KI270375v1, chrUn_KI270755v1, chrUn_KI270583v1, chrUn_KI270390v1, chr15_KI270727v1_random, chr14_KI270724v1_random, chrUn_KI270508v1, chrUn_KI270425v1, chrUn_KI270448v1, chrUn_KI270511v1, chrUn_KI270304v1, chrUn_KI270388v1, chrUn_GL000216v2, chrUn_KI270743v1, chr9_KI270717v1_random, chr22_KI270739v1_random, chrUn_KI270584v1, chrUn_KI270338v1, chrUn_KI270315v1, chrUn_KI270376v1, chrUn_KI270330v1, chrUn_KI270754v1, chrUn_KI270391v1, chrUn_KI270422v1, chrUn_KI270468v1, chr9, chr22_KI270735v1_random, chr7, chr8, chr5, chr6, chr3, chr4, chr1, chrUn_KI270528v1, chr2, chrUn_GL000220v1, chrUn_KI270362v1, chrUn_KI270385v1, chrUn_KI270419v1, chr11_KI270721v1_random, chr1_KI270714v1_random, chrUn_KI270746v1, chr14_KI270723v1_random, chrUn_KI270517v1, chrUn_KI270312v1, chrUn_KI270335v1, chr5_GL000208v1_random, chrUn_KI270589v1, chrUn_KI270396v1, chrUn_KI270373v1, chr14_GL000194v1_random, chrUn_KI270581v1, chr3_GL000221v1_random, chr16_KI270728v1_random, chrUn_KI270423v1, chrUn_KI270757v1, chrUn_KI270529v1, chrEBV, chr9_KI270718v1_random, chrUn_GL000218v1, chrUn_KI270302v1, chrUn_KI270386v1, chrUn_KI270363v1, chrUn_KI270745v1, chrUn_KI270340v1, chrUn_KI270593v1, chrUn_KI270411v1, chrY, chrX, 
chr2_KI270716v1_random, chrY_KI270740v1_random, chrUn_KI270518v1, chrM, chr1_KI270706v1_random, chrUn_KI270336v1, chr14_KI270725v1_random, chrUn_KI270521v1, chrUn_KI270544v1, chrUn_KI270374v1, chrUn_KI270756v1, chrUn_GL000195v1, chr22_KI270733v1_random, chrUn_KI270582v1, chr22_KI270738v1_random, chrUn_KI270420v1, chrUn_KI270466v1, chrUn_GL000219v1, chrUn_KI270322v1, chrUn_KI270590v1, chrUn_KI270383v1, chrUn_KI270417v1, chrUn_KI270748v1, chr14_GL000225v1_random, chrUn_KI270515v1, chrUn_KI270538v1, chr22_KI270731v1_random, chrUn_KI270587v1, chrUn_KI270310v1, chrUn_KI270379v1, chrUn_KI270333v1, chrUn_KI270394v1, chrUn_KI270371v1, chrUn_KI270751v1, chrUn_KI270467v1, chr1_KI270707v1_random, chrUn_KI270530v1, chr17_GL000205v2_random, chrUn_KI270418v1, chrUn_KI270384v1, chr22_KI270737v1_random, chrUn_KI270591v1, chrUn_KI270747v1, chrUn_KI270516v1, chr22_KI270734v1_random, chr4_GL000008v2_random, chr14_KI270726v1_random, chrUn_KI270539v1, chrUn_KI270311v1, chr1_KI270713v1_random, chrUn_KI270588v1, chrUn_KI270334v1, chrUn_KI270429v1, chrUn_KI270395v1, chrUn_KI270372v1, chrUn_KI270580v1, chrUn_KI270750v1] raw=1 ref_name_bam:1 ref_name_ref:chr1 raw=10 ref_name_bam:10 ref_name_ref:chr10 raw=11 ref_name_bam:11 ref_name_ref:chr11 raw=12 ref_name_bam:12 ref_name_ref:chr12 raw=13 ref_name_bam:13 ref_name_ref:chr13 raw=14 ref_name_bam:14 ref_name_ref:chr14 raw=15 ref_name_bam:15 ref_name_ref:chr15 raw=16 ref_name_bam:16 ref_name_ref:chr16 raw=17 ref_name_bam:17 ref_name_ref:chr17 raw=18 ref_name_bam:18 ref_name_ref:chr18 raw=19 ref_name_bam:19 ref_name_ref:chr19 raw=2 ref_name_bam:2 ref_name_ref:chr2 raw=20 ref_name_bam:20 ref_name_ref:chr20 raw=21 ref_name_bam:21 ref_name_ref:chr21 raw=22 ref_name_bam:22 ref_name_ref:chr22 raw=3 ref_name_bam:3 ref_name_ref:chr3 raw=4 ref_name_bam:4 ref_name_ref:chr4 raw=5 ref_name_bam:5 ref_name_ref:chr5 raw=6 ref_name_bam:6 ref_name_ref:chr6 raw=7 ref_name_bam:7 ref_name_ref:chr7 raw=8 ref_name_bam:8 ref_name_ref:chr8 raw=9 ref_name_bam:9 
ref_name_ref:chr9 raw=MT ref_name_bam:MT ref_name_ref:chrM raw=X ref_name_bam:X ref_name_ref:chrX raw=Y ref_name_bam:Y ref_name_ref:chrY raw=KI270728.1 ref_name_bam:KI270728.1 ref_name_ref:null raw=KI270727.1 ref_name_bam:KI270727.1 ref_name_ref:null raw=KI270442.1 ref_name_bam:KI270442.1 ref_name_ref:null raw=KI270729.1 ref_name_bam:KI270729.1 ref_name_ref:null raw=GL000225.1 ref_name_bam:GL000225.1 ref_name_ref:null raw=KI270743.1 ref_name_bam:KI270743.1 ref_name_ref:null raw=GL000008.2 ref_name_bam:GL000008.2 ref_name_ref:null raw=GL000009.2 ref_name_bam:GL000009.2 ref_name_ref:null raw=KI270747.1 ref_name_bam:KI270747.1 ref_name_ref:null raw=KI270722.1 ref_name_bam:KI270722.1 ref_name_ref:null raw=GL000194.1 ref_name_bam:GL000194.1 ref_name_ref:null raw=KI270742.1 ref_name_bam:KI270742.1 ref_name_ref:null raw=GL000205.2 ref_name_bam:GL000205.2 ref_name_ref:null raw=GL000195.1 ref_name_bam:GL000195.1 ref_name_ref:null raw=KI270736.1 ref_name_bam:KI270736.1 ref_name_ref:null raw=KI270733.1 ref_name_bam:KI270733.1 ref_name_ref:null raw=GL000224.1 ref_name_bam:GL000224.1 ref_name_ref:null raw=GL000219.1 ref_name_bam:GL000219.1 ref_name_ref:null raw=KI270719.1 ref_name_bam:KI270719.1 ref_name_ref:null raw=GL000216.2 ref_name_bam:GL000216.2 ref_name_ref:null raw=KI270712.1 ref_name_bam:KI270712.1 ref_name_ref:null raw=KI270706.1 ref_name_bam:KI270706.1 ref_name_ref:null raw=KI270725.1 ref_name_bam:KI270725.1 ref_name_ref:null raw=KI270744.1 ref_name_bam:KI270744.1 ref_name_ref:null raw=KI270734.1 ref_name_bam:KI270734.1 ref_name_ref:null raw=GL000213.1 ref_name_bam:GL000213.1 ref_name_ref:null raw=GL000220.1 ref_name_bam:GL000220.1 ref_name_ref:null raw=KI270715.1 ref_name_bam:KI270715.1 ref_name_ref:null raw=GL000218.1 ref_name_bam:GL000218.1 ref_name_ref:null raw=KI270749.1 ref_name_bam:KI270749.1 ref_name_ref:null raw=KI270741.1 ref_name_bam:KI270741.1 ref_name_ref:null raw=GL000221.1 ref_name_bam:GL000221.1 ref_name_ref:null raw=KI270716.1 ref_name_bam:KI270716.1 
ref_name_ref:null raw=KI270731.1 ref_name_bam:KI270731.1 ref_name_ref:null raw=KI270751.1 ref_name_bam:KI270751.1 ref_name_ref:null raw=KI270750.1 ref_name_bam:KI270750.1 ref_name_ref:null raw=KI270519.1 ref_name_bam:KI270519.1 ref_name_ref:null raw=GL000214.1 ref_name_bam:GL000214.1 ref_name_ref:null raw=KI270708.1 ref_name_bam:KI270708.1 ref_name_ref:null raw=KI270730.1 ref_name_bam:KI270730.1 ref_name_ref:null raw=KI270438.1 ref_name_bam:KI270438.1 ref_name_ref:null raw=KI270737.1 ref_name_bam:KI270737.1 ref_name_ref:null raw=KI270721.1 ref_name_bam:KI270721.1 ref_name_ref:null raw=KI270738.1 ref_name_bam:KI270738.1 ref_name_ref:null raw=KI270748.1 ref_name_bam:KI270748.1 ref_name_ref:null raw=KI270435.1 ref_name_bam:KI270435.1 ref_name_ref:null raw=GL000208.1 ref_name_bam:GL000208.1 ref_name_ref:null raw=KI270538.1 ref_name_bam:KI270538.1 ref_name_ref:null raw=KI270756.1 ref_name_bam:KI270756.1 ref_name_ref:null raw=KI270739.1 ref_name_bam:KI270739.1 ref_name_ref:null raw=KI270757.1 ref_name_bam:KI270757.1 ref_name_ref:null raw=KI270709.1 ref_name_bam:KI270709.1 ref_name_ref:null raw=KI270746.1 ref_name_bam:KI270746.1 ref_name_ref:null raw=KI270753.1 ref_name_bam:KI270753.1 ref_name_ref:null raw=KI270589.1 ref_name_bam:KI270589.1 ref_name_ref:null raw=KI270726.1 ref_name_bam:KI270726.1 ref_name_ref:null raw=KI270735.1 ref_name_bam:KI270735.1 ref_name_ref:null raw=KI270711.1 ref_name_bam:KI270711.1 ref_name_ref:null raw=KI270745.1 ref_name_bam:KI270745.1 ref_name_ref:null raw=KI270714.1 ref_name_bam:KI270714.1 ref_name_ref:null raw=KI270732.1 ref_name_bam:KI270732.1 ref_name_ref:null raw=KI270713.1 ref_name_bam:KI270713.1 ref_name_ref:null raw=KI270754.1 ref_name_bam:KI270754.1 ref_name_ref:null raw=KI270710.1 ref_name_bam:KI270710.1 ref_name_ref:null raw=KI270717.1 ref_name_bam:KI270717.1 ref_name_ref:null raw=KI270724.1 ref_name_bam:KI270724.1 ref_name_ref:null raw=KI270720.1 ref_name_bam:KI270720.1 ref_name_ref:null raw=KI270723.1 ref_name_bam:KI270723.1 
ref_name_ref:null raw=KI270718.1 ref_name_bam:KI270718.1 ref_name_ref:null raw=KI270317.1 ref_name_bam:KI270317.1 ref_name_ref:null raw=KI270740.1 ref_name_bam:KI270740.1 ref_name_ref:null raw=KI270755.1 ref_name_bam:KI270755.1 ref_name_ref:null raw=KI270707.1 ref_name_bam:KI270707.1 ref_name_ref:null raw=KI270579.1 ref_name_bam:KI270579.1 ref_name_ref:null raw=KI270752.1 ref_name_bam:KI270752.1 ref_name_ref:null raw=KI270512.1 ref_name_bam:KI270512.1 ref_name_ref:null raw=KI270322.1 ref_name_bam:KI270322.1 ref_name_ref:null raw=GL000226.1 ref_name_bam:GL000226.1 ref_name_ref:null raw=KI270311.1 ref_name_bam:KI270311.1 ref_name_ref:null raw=KI270366.1 ref_name_bam:KI270366.1 ref_name_ref:null raw=KI270511.1 ref_name_bam:KI270511.1 ref_name_ref:null raw=KI270448.1 ref_name_bam:KI270448.1 ref_name_ref:null raw=KI270521.1 ref_name_bam:KI270521.1 ref_name_ref:null raw=KI270581.1 ref_name_bam:KI270581.1 ref_name_ref:null raw=KI270582.1 ref_name_bam:KI270582.1 ref_name_ref:null raw=KI270515.1 ref_name_bam:KI270515.1 ref_name_ref:null raw=KI270588.1 ref_name_bam:KI270588.1 ref_name_ref:null raw=KI270591.1 ref_name_bam:KI270591.1 ref_name_ref:null raw=KI270522.1 ref_name_bam:KI270522.1 ref_name_ref:null raw=KI270507.1 ref_name_bam:KI270507.1 ref_name_ref:null raw=KI270590.1 ref_name_bam:KI270590.1 ref_name_ref:null raw=KI270584.1 ref_name_bam:KI270584.1 ref_name_ref:null raw=KI270320.1 ref_name_bam:KI270320.1 ref_name_ref:null raw=KI270382.1 ref_name_bam:KI270382.1 ref_name_ref:null raw=KI270468.1 ref_name_bam:KI270468.1 ref_name_ref:null raw=KI270467.1 ref_name_bam:KI270467.1 ref_name_ref:null raw=KI270362.1 ref_name_bam:KI270362.1 ref_name_ref:null raw=KI270517.1 ref_name_bam:KI270517.1 ref_name_ref:null raw=KI270593.1 ref_name_bam:KI270593.1 ref_name_ref:null raw=KI270528.1 ref_name_bam:KI270528.1 ref_name_ref:null raw=KI270587.1 ref_name_bam:KI270587.1 ref_name_ref:null raw=KI270364.1 ref_name_bam:KI270364.1 ref_name_ref:null raw=KI270371.1 ref_name_bam:KI270371.1 
ref_name_ref:null raw=KI270333.1 ref_name_bam:KI270333.1 ref_name_ref:null raw=KI270374.1 ref_name_bam:KI270374.1 ref_name_ref:null raw=KI270411.1 ref_name_bam:KI270411.1 ref_name_ref:null raw=KI270414.1 ref_name_bam:KI270414.1 ref_name_ref:null raw=KI270510.1 ref_name_bam:KI270510.1 ref_name_ref:null raw=KI270390.1 ref_name_bam:KI270390.1 ref_name_ref:null raw=KI270375.1 ref_name_bam:KI270375.1 ref_name_ref:null raw=KI270420.1 ref_name_bam:KI270420.1 ref_name_ref:null raw=KI270509.1 ref_name_bam:KI270509.1 ref_name_ref:null raw=KI270315.1 ref_name_bam:KI270315.1 ref_name_ref:null raw=KI270302.1 ref_name_bam:KI270302.1 ref_name_ref:null raw=KI270518.1 ref_name_bam:KI270518.1 ref_name_ref:null raw=KI270530.1 ref_name_bam:KI270530.1 ref_name_ref:null raw=KI270304.1 ref_name_bam:KI270304.1 ref_name_ref:null raw=KI270418.1 ref_name_bam:KI270418.1 ref_name_ref:null raw=KI270424.1 ref_name_bam:KI270424.1 ref_name_ref:null raw=KI270417.1 ref_name_bam:KI270417.1 ref_name_ref:null raw=KI270508.1 ref_name_bam:KI270508.1 ref_name_ref:null raw=KI270303.1 ref_name_bam:KI270303.1 ref_name_ref:null raw=KI270381.1 ref_name_bam:KI270381.1 ref_name_ref:null raw=KI270529.1 ref_name_bam:KI270529.1 ref_name_ref:null raw=KI270425.1 ref_name_bam:KI270425.1 ref_name_ref:null raw=KI270396.1 ref_name_bam:KI270396.1 ref_name_ref:null raw=KI270363.1 ref_name_bam:KI270363.1 ref_name_ref:null raw=KI270386.1 ref_name_bam:KI270386.1 ref_name_ref:null raw=KI270465.1 ref_name_bam:KI270465.1 ref_name_ref:null raw=KI270383.1 ref_name_bam:KI270383.1 ref_name_ref:null raw=KI270384.1 ref_name_bam:KI270384.1 ref_name_ref:null raw=KI270330.1 ref_name_bam:KI270330.1 ref_name_ref:null raw=KI270372.1 ref_name_bam:KI270372.1 ref_name_ref:null raw=KI270548.1 ref_name_bam:KI270548.1 ref_name_ref:null raw=KI270580.1 ref_name_bam:KI270580.1 ref_name_ref:null raw=KI270387.1 ref_name_bam:KI270387.1 ref_name_ref:null raw=KI270391.1 ref_name_bam:KI270391.1 ref_name_ref:null raw=KI270305.1 ref_name_bam:KI270305.1 
ref_name_ref:null raw=KI270373.1 ref_name_bam:KI270373.1 ref_name_ref:null raw=KI270422.1 ref_name_bam:KI270422.1 ref_name_ref:null raw=KI270316.1 ref_name_bam:KI270316.1 ref_name_ref:null raw=KI270340.1 ref_name_bam:KI270340.1 ref_name_ref:null raw=KI270338.1 ref_name_bam:KI270338.1 ref_name_ref:null raw=KI270583.1 ref_name_bam:KI270583.1 ref_name_ref:null raw=KI270334.1 ref_name_bam:KI270334.1 ref_name_ref:null raw=KI270429.1 ref_name_bam:KI270429.1 ref_name_ref:null raw=KI270393.1 ref_name_bam:KI270393.1 ref_name_ref:null raw=KI270516.1 ref_name_bam:KI270516.1 ref_name_ref:null raw=KI270389.1 ref_name_bam:KI270389.1 ref_name_ref:null raw=KI270466.1 ref_name_bam:KI270466.1 ref_name_ref:null raw=KI270388.1 ref_name_bam:KI270388.1 ref_name_ref:null raw=KI270544.1 ref_name_bam:KI270544.1 ref_name_ref:null raw=KI270310.1 ref_name_bam:KI270310.1 ref_name_ref:null raw=KI270412.1 ref_name_bam:KI270412.1 ref_name_ref:null raw=KI270395.1 ref_name_bam:KI270395.1 ref_name_ref:null raw=KI270376.1 ref_name_bam:KI270376.1 ref_name_ref:null raw=KI270337.1 ref_name_bam:KI270337.1 ref_name_ref:null raw=KI270335.1 ref_name_bam:KI270335.1 ref_name_ref:null raw=KI270378.1 ref_name_bam:KI270378.1 ref_name_ref:null raw=KI270379.1 ref_name_bam:KI270379.1 ref_name_ref:null raw=KI270329.1 ref_name_bam:KI270329.1 ref_name_ref:null raw=KI270419.1 ref_name_bam:KI270419.1 ref_name_ref:null raw=KI270336.1 ref_name_bam:KI270336.1 ref_name_ref:null raw=KI270312.1 ref_name_bam:KI270312.1 ref_name_ref:null raw=KI270539.1 ref_name_bam:KI270539.1 ref_name_ref:null raw=KI270385.1 ref_name_bam:KI270385.1 ref_name_ref:null raw=KI270423.1 ref_name_bam:KI270423.1 ref_name_ref:null raw=KI270392.1 ref_name_bam:KI270392.1 ref_name_ref:null raw=KI270394.1 ref_name_bam:KI270394.1 ref_name_ref:null done ERROR: no local reference sequence found for BAM reference name KI270728.1 ERROR: no local reference sequence found for BAM reference name KI270727.1 ERROR: no local reference sequence found for BAM reference 
name KI270442.1 ERROR: no local reference sequence found for BAM reference name KI270729.1 ERROR: no local reference sequence found for BAM reference name GL000225.1 ERROR: no local reference sequence found for BAM reference name KI270743.1 ERROR: no local reference sequence found for BAM reference name GL000008.2 ERROR: no local reference sequence found for BAM reference name GL000009.2 ERROR: no local reference sequence found for BAM reference name KI270747.1 ERROR: no local reference sequence found for BAM reference name KI270722.1 ERROR: no local reference sequence found for BAM reference name GL000194.1 ERROR: no local reference sequence found for BAM reference name KI270742.1 ERROR: no local reference sequence found for BAM reference name GL000205.2 ERROR: no local reference sequence found for BAM reference name GL000195.1 ERROR: no local reference sequence found for BAM reference name KI270736.1 ERROR: no local reference sequence found for BAM reference name KI270733.1 ERROR: no local reference sequence found for BAM reference name GL000224.1 ERROR: no local reference sequence found for BAM reference name GL000219.1 ERROR: no local reference sequence found for BAM reference name KI270719.1 ERROR: no local reference sequence found for BAM reference name GL000216.2 ERROR: no local reference sequence found for BAM reference name KI270712.1 ERROR: no local reference sequence found for BAM reference name KI270706.1 ERROR: no local reference sequence found for BAM reference name KI270725.1 ERROR: no local reference sequence found for BAM reference name KI270744.1 ERROR: no local reference sequence found for BAM reference name KI270734.1 ERROR: no local reference sequence found for BAM reference name GL000213.1 ERROR: no local reference sequence found for BAM reference name GL000220.1 ERROR: no local reference sequence found for BAM reference name KI270715.1 ERROR: no local reference sequence found for BAM reference name GL000218.1 ERROR: no local reference 
sequence found for BAM reference name KI270749.1 ERROR: no local reference sequence found for BAM reference name KI270741.1 ERROR: no local reference sequence found for BAM reference name GL000221.1 ERROR: no local reference sequence found for BAM reference name KI270716.1 ERROR: no local reference sequence found for BAM reference name KI270731.1 ERROR: no local reference sequence found for BAM reference name KI270751.1 ERROR: no local reference sequence found for BAM reference name KI270750.1 ERROR: no local reference sequence found for BAM reference name KI270519.1 ERROR: no local reference sequence found for BAM reference name GL000214.1 ERROR: no local reference sequence found for BAM reference name KI270708.1 ERROR: no local reference sequence found for BAM reference name KI270730.1 ERROR: no local reference sequence found for BAM reference name KI270438.1 ERROR: no local reference sequence found for BAM reference name KI270737.1 ERROR: no local reference sequence found for BAM reference name KI270721.1 ERROR: no local reference sequence found for BAM reference name KI270738.1 ERROR: no local reference sequence found for BAM reference name KI270748.1 ERROR: no local reference sequence found for BAM reference name KI270435.1 ERROR: no local reference sequence found for BAM reference name GL000208.1 ERROR: no local reference sequence found for BAM reference name KI270538.1 ERROR: no local reference sequence found for BAM reference name KI270756.1 ERROR: no local reference sequence found for BAM reference name KI270739.1 ERROR: no local reference sequence found for BAM reference name KI270757.1 ERROR: no local reference sequence found for BAM reference name KI270709.1 ERROR: no local reference sequence found for BAM reference name KI270746.1 ERROR: no local reference sequence found for BAM reference name KI270753.1 ERROR: no local reference sequence found for BAM reference name KI270589.1 ERROR: no local reference sequence found for BAM reference name KI270726.1 
ERROR: no local reference sequence found for BAM reference name KI270735.1 ERROR: no local reference sequence found for BAM reference name KI270711.1 ERROR: no local reference sequence found for BAM reference name KI270745.1 ERROR: no local reference sequence found for BAM reference name KI270714.1 ERROR: no local reference sequence found for BAM reference name KI270732.1 ERROR: no local reference sequence found for BAM reference name KI270713.1 ERROR: no local reference sequence found for BAM reference name KI270754.1 ERROR: no local reference sequence found for BAM reference name KI270710.1 ERROR: no local reference sequence found for BAM reference name KI270717.1 ERROR: no local reference sequence found for BAM reference name KI270724.1 ERROR: no local reference sequence found for BAM reference name KI270720.1 ERROR: no local reference sequence found for BAM reference name KI270723.1 ERROR: no local reference sequence found for BAM reference name KI270718.1 ERROR: no local reference sequence found for BAM reference name KI270317.1 ERROR: no local reference sequence found for BAM reference name KI270740.1 ERROR: no local reference sequence found for BAM reference name KI270755.1 ERROR: no local reference sequence found for BAM reference name KI270707.1 ERROR: no local reference sequence found for BAM reference name KI270579.1 ERROR: no local reference sequence found for BAM reference name KI270752.1 ERROR: no local reference sequence found for BAM reference name KI270512.1 ERROR: no local reference sequence found for BAM reference name KI270322.1 ERROR: no local reference sequence found for BAM reference name GL000226.1 ERROR: no local reference sequence found for BAM reference name KI270311.1 ERROR: no local reference sequence found for BAM reference name KI270366.1 ERROR: no local reference sequence found for BAM reference name KI270511.1 ERROR: no local reference sequence found for BAM reference name KI270448.1 ERROR: no local reference sequence found for BAM 
reference name KI270521.1 ERROR: no local reference sequence found for BAM reference name KI270581.1 ERROR: no local reference sequence found for BAM reference name KI270582.1 ERROR: no local reference sequence found for BAM reference name KI270515.1 ERROR: no local reference sequence found for BAM reference name KI270588.1 ERROR: no local reference sequence found for BAM reference name KI270591.1 ERROR: no local reference sequence found for BAM reference name KI270522.1 ERROR: no local reference sequence found for BAM reference name KI270507.1 ERROR: no local reference sequence found for BAM reference name KI270590.1 ERROR: no local reference sequence found for BAM reference name KI270584.1 ERROR: no local reference sequence found for BAM reference name KI270320.1 ERROR: no local reference sequence found for BAM reference name KI270382.1 ERROR: no local reference sequence found for BAM reference name KI270468.1 ERROR: no local reference sequence found for BAM reference name KI270467.1 ERROR: no local reference sequence found for BAM reference name KI270362.1 ERROR: no local reference sequence found for BAM reference name KI270517.1 ERROR: no local reference sequence found for BAM reference name KI270593.1 ERROR: no local reference sequence found for BAM reference name KI270528.1 ERROR: no local reference sequence found for BAM reference name KI270587.1 ERROR: no local reference sequence found for BAM reference name KI270364.1 ERROR: no local reference sequence found for BAM reference name KI270371.1 ERROR: no local reference sequence found for BAM reference name KI270333.1 ERROR: no local reference sequence found for BAM reference name KI270374.1 ERROR: no local reference sequence found for BAM reference name KI270411.1 ERROR: no local reference sequence found for BAM reference name KI270414.1 ERROR: no local reference sequence found for BAM reference name KI270510.1 ERROR: no local reference sequence found for BAM reference name KI270390.1 ERROR: no local 
reference sequence found for BAM reference name KI270375.1 ERROR: no local reference sequence found for BAM reference name KI270420.1 ERROR: no local reference sequence found for BAM reference name KI270509.1 ERROR: no local reference sequence found for BAM reference name KI270315.1 ERROR: no local reference sequence found for BAM reference name KI270302.1 ERROR: no local reference sequence found for BAM reference name KI270518.1 ERROR: no local reference sequence found for BAM reference name KI270530.1 ERROR: no local reference sequence found for BAM reference name KI270304.1 ERROR: no local reference sequence found for BAM reference name KI270418.1 ERROR: no local reference sequence found for BAM reference name KI270424.1 ERROR: no local reference sequence found for BAM reference name KI270417.1 ERROR: no local reference sequence found for BAM reference name KI270508.1 ERROR: no local reference sequence found for BAM reference name KI270303.1 ERROR: no local reference sequence found for BAM reference name KI270381.1 ERROR: no local reference sequence found for BAM reference name KI270529.1 ERROR: no local reference sequence found for BAM reference name KI270425.1 ERROR: no local reference sequence found for BAM reference name KI270396.1 ERROR: no local reference sequence found for BAM reference name KI270363.1 ERROR: no local reference sequence found for BAM reference name KI270386.1 ERROR: no local reference sequence found for BAM reference name KI270465.1 ERROR: no local reference sequence found for BAM reference name KI270383.1 ERROR: no local reference sequence found for BAM reference name KI270384.1 ERROR: no local reference sequence found for BAM reference name KI270330.1 ERROR: no local reference sequence found for BAM reference name KI270372.1 ERROR: no local reference sequence found for BAM reference name KI270548.1 ERROR: no local reference sequence found for BAM reference name KI270580.1 ERROR: no local reference sequence found for BAM reference name 
KI270387.1 ERROR: no local reference sequence found for BAM reference name KI270391.1 ERROR: no local reference sequence found for BAM reference name KI270305.1 ERROR: no local reference sequence found for BAM reference name KI270373.1 ERROR: no local reference sequence found for BAM reference name KI270422.1 ERROR: no local reference sequence found for BAM reference name KI270316.1 ERROR: no local reference sequence found for BAM reference name KI270340.1 ERROR: no local reference sequence found for BAM reference name KI270338.1 ERROR: no local reference sequence found for BAM reference name KI270583.1 ERROR: no local reference sequence found for BAM reference name KI270334.1 ERROR: no local reference sequence found for BAM reference name KI270429.1 ERROR: no local reference sequence found for BAM reference name KI270393.1 ERROR: no local reference sequence found for BAM reference name KI270516.1 ERROR: no local reference sequence found for BAM reference name KI270389.1 ERROR: no local reference sequence found for BAM reference name KI270466.1 ERROR: no local reference sequence found for BAM reference name KI270388.1 ERROR: no local reference sequence found for BAM reference name KI270544.1 ERROR: no local reference sequence found for BAM reference name KI270310.1 ERROR: no local reference sequence found for BAM reference name KI270412.1 ERROR: no local reference sequence found for BAM reference name KI270395.1 ERROR: no local reference sequence found for BAM reference name KI270376.1 ERROR: no local reference sequence found for BAM reference name KI270337.1 ERROR: no local reference sequence found for BAM reference name KI270335.1 ERROR: no local reference sequence found for BAM reference name KI270378.1 ERROR: no local reference sequence found for BAM reference name KI270379.1 ERROR: no local reference sequence found for BAM reference name KI270329.1 ERROR: no local reference sequence found for BAM reference name KI270419.1 ERROR: no local reference sequence 
found for BAM reference name KI270336.1
ERROR: no local reference sequence found for BAM reference name KI270312.1
ERROR: no local reference sequence found for BAM reference name KI270539.1
ERROR: no local reference sequence found for BAM reference name KI270385.1
ERROR: no local reference sequence found for BAM reference name KI270423.1
ERROR: no local reference sequence found for BAM reference name KI270392.1
ERROR: no local reference sequence found for BAM reference name KI270394.1
ERROR: java.io.IOException: BAM header not fully compatible with specified reference sequence
java.io.IOException: BAM header not fully compatible with specified reference sequence
    at org.stjude.compbio.rnapeg.SplicedReadReporter.report(SplicedReadReporter.java:241)
    at org.stjude.compbio.rnapeg.SplicedReadReporter.main(SplicedReadReporter.java:765)
Uncaught exception from user code:
    where is 99_FRAS202421989-1a_1.fqAligned.sortedByCoord.out.bam.junctions.tab at /RNApeg/src/bin/junction_extraction_wrapper.pl line 339.
    main::run_cmd("bam_junction.pl -type all -bam /data/An\x{c3}\x{a1}lises/Teste1/star/all_samples_ensembl_chimericwithinbam/99_FRAS202421989-1a_1.fqAligned.sortedByCoord.out.bam -annotate -now -force 1 -out 99_FRAS202421989-1a_1.fqAligned.sortedByCoord.out.bam.junctions.tab -no-config -refflat /data/references/cicero/reference/Homo_sapiens/GRCh38_no_alt/mRNA/RefSeq/refFlat.txt -fasta /data/references/cicero/reference/Homo_sapiens/GRCh38_no_alt/FASTA/GRCh38_no_alt.fa", "99_FRAS202421989-1a_1.fqAligned.sortedByCoord.out.bam.junctions.tab") called at /RNApeg/src/bin/junction_extraction_wrapper.pl line 217

How can I correct the BAM header, or remove the chromosomes/patches that are not present in the reference provided by CICERO?
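
One way to see exactly which contigs are affected is to compare the @SQ names in the BAM header against the index of the reference FASTA. The sketch below assumes a text header dumped with `samtools view -H` and a `.fai` index for the reference; the function name `missing_contigs` and the file names in the usage comment are hypothetical. It only diagnoses the mismatch and does not rewrite the BAM.

```shell
# List contigs that appear in a BAM header but not in a FASTA index.
#   $1: text header, e.g. produced by `samtools view -H sample.bam > header.sam`
#   $2: FASTA index (.fai) of the reference, contig name in column 1
missing_contigs() {
    awk -F'\t' '
        # First file (.fai): remember every reference contig name.
        NR == FNR { ref[$1] = 1; next }
        # Second file (header): inspect the SN: tag of each @SQ line.
        $1 == "@SQ" {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/) {
                    name = substr($i, 4)
                    if (!(name in ref)) print name
                }
            }
        }
    ' "$2" "$1"
}

# Usage (hypothetical file names):
#   samtools view -H sample.bam > header.sam
#   missing_contigs header.sam GRCh38_no_alt.fa.fai
```

If the printed names match the contigs in the errors above, the BAM was probably aligned against a FASTA whose contig set or naming differs from the CICERO reference; re-aligning against the FASTA shipped with the CICERO reference is one possible way out, though that is an assumption rather than a confirmed fix.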

Call TE-chimeric from RNA seq

Hi CICERO team,

Thanks for developing the method! I am wondering how to access the intermediate file described in the paper as "the portion of the contig not mapped to bp1". I would like to use that intermediate file to call TE-chimeric transcripts (joining parts of a gene and a transposable element) from RNA-seq. Thanks!

Catch cap3 error when it fails to allocate memory due to invocation of cap3

As we recently saw in cicero-itd, Perl can occasionally fail to allocate memory when shelling out to an external command. This has happened on rare occasions in Cicero when calling out to cap3, and until now it was neither diagnosed nor resolved.

We should catch this with something akin to the following:

$fn = `$cmd`;
if ($? == -1) {
    # $! contains error from the C library
    # $^E contains error from the OS
    print STDERR "\$^E: ".$^E."\n";
    die "Error message: $!\n";
}

Singularity RNApeg

Hi,

I get this error when trying to run RNApeg with singularity 3.5.3:

[*] Running junction_extraction_wrapper.pl
Tue Sep 8 16:52:27 2020: running: bam_junction.pl -type all -bam test.bam -annotate -now -force 1 -out test.bam.junctions.tab -no-config -refflat refFlat.txt -fasta GRCh37-lite.fa
Can't locate JavaRun.pm in @INC (you may need to install the JavaRun module)

Any ideas?

Thanks

Dynamic blacklist

Is this correct? It looks like this is a dynamic blacklist to handle highly recurrent genes, however, upon the first gene that falls under the threshold, it ends. Shouldn't it just skip to the next gene?

last if($gene_recurrance{$g} < $max_num_hits*10);
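
To make the difference concrete, here is a shell analogue of the two behaviours (`break` standing in for Perl's `last`, `continue` for `next`). Whether `last` is actually a bug depends on whether the loop visits genes in descending order of recurrence, which the quoted line alone does not show. The gene names, counts, and threshold below are made up for illustration.

```shell
# Hypothetical "gene count" pairs, deliberately NOT sorted by count.
genes='GENE_A 500
GENE_B 5
GENE_C 700'
threshold=100

# Stops at the first gene under the threshold (like Perl `last`).
flag_with_break() {
    printf '%s\n' "$genes" | while read -r gene count; do
        if [ "$count" -lt "$threshold" ]; then
            break
        fi
        echo "$gene"
    done
}

# Skips genes under the threshold and keeps going (like Perl `next`).
flag_with_continue() {
    printf '%s\n' "$genes" | while read -r gene count; do
        if [ "$count" -lt "$threshold" ]; then
            continue
        fi
        echo "$gene"
    done
}
```

With this unsorted input, `flag_with_break` reports only GENE_A, while `flag_with_continue` reports GENE_A and GENE_C. If the input were sorted from most to least recurrent, both would give the same result and stopping early would simply be an optimisation.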

Combine error with many putative fusions

Hello!

I'm running CICERO v1.7.0 (RNAPeg v2.6.4) using default parameters and get the following error in the combine step:

/opt/cicero/src/bin/Cicero.sh: line 318: /bin/cat: Argument list too long
/opt/cicero/src/bin/Cicero.sh: line 319: /bin/cat: Argument list too long

There are 45682 files to be merged, which seems to be too many for cat.

Perhaps using find and xargs would be better here?

Example:

find $CICERO_DATADIR/$SAMPLE/*/ -type f -name 'unfiltered.fusion.txt' -print0 | sort -zV -k 9,9 -k 10,10n -k 11,11n | xargs -0 cat > $CICERO_DATADIR/$SAMPLE/unfiltered.fusion.txt

Could probably be made more efficient, since the find is unnecessary as per user kojiro's comment here: https://unix.stackexchange.com/questions/426748/cat-a-very-large-number-of-files-together-in-correct-order

Although in my own testing this doesn't take very long to run either way, and it successfully outputs the unfiltered.fusion.txt file.
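
For reference, a self-contained sketch of the same idea with quoted paths, so it can be tried outside Cicero.sh. It is not the project's code: `combine_fusions` is a hypothetical helper, it assumes GNU or BSD `find`, `sort`, and `xargs` (for `-mindepth`, `-print0`, `-z`, and `-0`), and it orders files by plain path sort rather than by the key fields used in the command above.

```shell
# Concatenate every per-region unfiltered.fusion.txt under a sample directory
# without passing the file list on a single command line.
#   $1: sample directory (e.g. "$CICERO_DATADIR/$SAMPLE")
#   $2: output file
combine_fusions() {
    sample_dir=$1
    out=$2
    # -mindepth 2 skips a combined file of the same name at the top level;
    # NUL-delimited names keep paths with unusual characters intact;
    # xargs splits the list into as many cat invocations as needed.
    find "$sample_dir" -mindepth 2 -type f -name 'unfiltered.fusion.txt' -print0 |
        sort -z |
        xargs -0 cat > "$out"
}

# Usage (paths as in the report above):
#   combine_fusions "$CICERO_DATADIR/$SAMPLE" "$CICERO_DATADIR/$SAMPLE/unfiltered.fusion.txt"
```

Because `xargs` batches the arguments itself, the number of input files no longer matters for the argument-length limit.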

Final Fusion List Missing

Hi,
I downloaded CICERO and I tested the docker image using the demo data provided in the repo.
For some reason, the final fusion file is missing. Not sure what I missed. Here are the commands I executed:

RNApeg:
docker run -v /mnt/storage/data/cicero/reference/:/mnt/ref/ -v /mnt/storage/data/cicero/bam/demo/:/mnt/bam/ -v /mnt/storage/analysis/fusions/cicero/:/mnt/output/ rnapeg -b /mnt/bam/MV4_11_RNAseq_1.bam -f /mnt/ref/Homo_sapiens/GRCh37-lite/FASTA/GRCh37-lite.fa -r /mnt/ref/Homo_sapiens/GRCh37-lite/mRNA/RefSeq/refFlat.txt -rg /mnt/ref/Homo_sapiens/GRCh37-lite/mRNA/RefSeq/refFlat.txt -O /mnt/output/

cicero:
docker run -v /mnt/storage/data/cicero/reference/:/mnt/ref/ -v /mnt/storage/data/cicero/bam/demo/:/mnt/bam/ -v /mnt/storage/analysis/fusions/cicero/:/mnt/output/ --memory="8g" --memory-swap="20g" cicero:latest -b /mnt/bam/MV4_11_RNAseq_1.bam -r /mnt/ref/ -o /mnt/output/ -j /mnt/output/MV4_11_RNAseq_1.bam.junctions.tab.shifted.tab -g GRCh37-lite -n 32

I got the files:

01_ExtractSClips.err
01_ExtractSClips.log
01_ExtractSClips.out
02_Cicero.err
02_Cicero.log
02_Cicero.out
03_Combine.err
03_Combine.out
04_Annotate.err
04_Annotate.log
04_Annotate.out
05_Filter.err
05_Filter.out
MV4_11_RNAseq_1.bam.junctions.tab
MV4_11_RNAseq_1.bam.junctions.tab.shifted.bed
MV4_11_RNAseq_1.bam.junctions.tab.shifted.tab
MV4_11_RNAseq_1.bam.junctions.tab.shifted.tab.annotated.tab

CICERO_CONFIG
CICERO_DATADIR
CICERO_RUNDIR

And:
ls -1d CICERO_DATADIR/MV4_11_RNAseq_1/*txt
CICERO_DATADIR/MV4_11_RNAseq_1/annotated.all.txt
CICERO_DATADIR/MV4_11_RNAseq_1/annotated.internal.txt
CICERO_DATADIR/MV4_11_RNAseq_1/excluded.new.fusions.txt
CICERO_DATADIR/MV4_11_RNAseq_1/excluded.new.internal.txt
CICERO_DATADIR/MV4_11_RNAseq_1/final_internal.txt
CICERO_DATADIR/MV4_11_RNAseq_1/MV4_11_RNAseq_1.gene_info.txt
CICERO_DATADIR/MV4_11_RNAseq_1/MV4_11_RNAseq_1.SC.txt
CICERO_DATADIR/MV4_11_RNAseq_1/unfiltered.fusion.txt
CICERO_DATADIR/MV4_11_RNAseq_1/unfiltered.internal.txt

When I check the files:
wc -l CICERO_DATADIR/MV4_11_RNAseq_1/*txt
1 CICERO_DATADIR/MV4_11_RNAseq_1/annotated.all.txt
1 CICERO_DATADIR/MV4_11_RNAseq_1/annotated.internal.txt
0 CICERO_DATADIR/MV4_11_RNAseq_1/excluded.new.fusions.txt
0 CICERO_DATADIR/MV4_11_RNAseq_1/excluded.new.internal.txt
0 CICERO_DATADIR/MV4_11_RNAseq_1/final_internal.txt
24731 CICERO_DATADIR/MV4_11_RNAseq_1/MV4_11_RNAseq_1.gene_info.txt
136746 CICERO_DATADIR/MV4_11_RNAseq_1/MV4_11_RNAseq_1.SC.txt
0 CICERO_DATADIR/MV4_11_RNAseq_1/unfiltered.fusion.txt
0 CICERO_DATADIR/MV4_11_RNAseq_1/unfiltered.internal.txt

What did I miss?
Thanks a lot in advance.
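
Since unfiltered.fusion.txt is already empty, the per-step logs are a natural first place to look. A minimal sketch, assuming the `*.err` files listed above sit directly in the output directory; `nonempty_err_logs` is a hypothetical helper name, not part of CICERO.

```shell
# Print the step error logs that actually contain something.
#   $1: directory holding 01_ExtractSClips.err, 02_Cicero.err, ...
nonempty_err_logs() {
    # -size +0c matches files larger than zero bytes.
    find "$1" -maxdepth 1 -type f -name '*.err' -size +0c | sort
}

# Usage (host path from the docker commands above):
#   nonempty_err_logs /mnt/storage/analysis/fusions/cicero/
```

The earliest step with a non-empty `.err` file is usually the most informative one to read, since later steps may only be failing on its missing output.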

gfServer: Error in TCP non-blocking connect() 111 - Connection refused

Hi, I'm having issues trying to run Cicero on an HPC cluster. I get errors when trying to start the gfServer:

$ tail -f gfServer.ji02.77967.*

==> gfServer.ji02.77967.log <==
2023/01/09 10:02:02: info: gfServer version 35 on host localhost, port 2728  (2023-01-09 10:02)
2023/01/09 10:02:02: info: setting up untranslated index
2023/01/09 10:02:58: info: indexing complete
2023/01/09 10:02:58: info: Server ready for queries!

==> gfServer.ji02.77967.out <==
tileSize 11
stepSize 5
minMatch 2
pcr requests 0
blat requests 0
bases 0
misses 0
noSig 0
trimmed 0
warnings 0

==> gfServer.ji02.77967.err <==
Error in TCP non-blocking connect() 111 - Connection refused
Error in TCP non-blocking connect() 111 - Connection refused
Error in TCP non-blocking connect() 111 - Connection refused
Error in TCP non-blocking connect() 111 - Connection refused
Error in TCP non-blocking connect() 111 - Connection refused
Error in TCP non-blocking connect() 111 - Connection refused
Error in TCP non-blocking connect() 111 - Connection refused

Any idea how I can solve or troubleshoot this issue?
Thanks
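
The log shows roughly a minute between startup and "Server ready for queries!", so connection attempts made while the index was being built would be refused. That may or may not be the cause here, but one way to rule it out is to poll until the server answers. The sketch below uses a hypothetical `retry` helper; `gfServer status <host> <port>` is the standard BLAT status query, and the host and port in the usage comment are taken from the log above.

```shell
# Run a command until it succeeds or the attempts run out.
#   $1: maximum number of attempts
#   $2: seconds to wait between attempts
#   remaining arguments: the command to run
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage: wait up to about five minutes for the server to accept queries.
#   retry 30 10 gfServer status localhost 2728
```

If the status query never succeeds even after indexing has finished, the problem is more likely host or port related (for example the server binding to a different interface than the client uses) than a matter of timing.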

Failed to build docker image

I am trying to build the Docker image with the following command:
sudo docker build -t stjude/cicero:0.2.0 .
However, I am getting this error:
The command '/bin/sh -c cpanm --force -i enum Data::[email protected] [email protected] && chown -R root:root /usr/local/.cpanm' returned a non-zero code: 1

Here is the final docker image

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
<none>              <none>              3e3bf687eff9        9 hours ago         1.44GB
ubuntu              18.04               56def654ec22        6 weeks ago         63.2MB
hello-world         latest              bf756fb1ae65        10 months ago       13.3kB

Kindly give your suggestion.
Regards
Jay

Failure to run CICERO

I tried running Cicero both locally and with Docker, and it gets stuck at "Step 01 - 2023.12.02 17:53:45 - ExtractSClips" without any progress. The BAM file is from a targeted RNA-seq run, and running STAR-Fusion and Arriba on it takes less than 10 minutes.
What has gone wrong? Does it depend on the BLAT server? Is there any way to correct this?

The code:
Cicero.sh -b /media/Data/RNA-seq/Bam/08AH9897/Aligned.sortedByCoord.out.bam -r /media/Data/Reference/Cicero/reference -g GRCh38_no_alt -j /media/Data/RNA-seq/Bam/08AH9897/Aligned.sortedByCoord.out.bam.junctions.tab.shifted.tab -O /media/Data/RNA-seq/Bam/08AH9897 -p -n 8

docker:
docker run -v $OUTDIR:/input -v $REFDIR:/reference ghcr.io/stjude/cicero:latest -n 8 -b /input/Aligned.sortedByCoord.out.bam -g GRCh38_no_alt -r /reference -j /input/Aligned.sortedByCoord.out.bam.junctions.tab.shifted.tab -o /input/Cicero -s 2 -no-optimize

Thanks.

Conflicting samtools requirement

The README lists Samtools 1.3.1 as a dependency, but also has Bio as a Perl dependency. Bio::DB::Sam requires Samtools 0.1.10 through 0.1.17. It doesn't work with the HTSlib reconfiguration in 1.0 onwards.

STAR and Picard parameters

Hello! I would like to use STAR and Picard to prepare my samples for CICERO. Which parameters do you recommend for these tools? Is RNApeg necessary, or can it be omitted?

Thanks!
Kristin

build the Docker image using the Dockerfile problem

When I use docker for software installation, the error message is as follows:
/bin/sh: 1: perlbrew: not found
The command '/bin/sh -c perlbrew --notest install 5.10.1' returned a non-zero code: 127

The command line is:
sudo docker build -t stjude/cicero:0.3.0 .

It may be that a line of commands in the Dockerfile needs to be changed.

Add documentation from paper

The Cicero paper contains a description of the various thresholds used throughout the code. This documentation should be included in-line in the code.
