xryanglab / ribocode
License: MIT License
Even though I used the following parameters:
metaplots -a $genecode.v44.ribo.anno -r $sample.Aligned.toTranscriptome.out.bam -o $sample.ribocode -f0_percent 0.01
The log still reported the same content, which caused RiboCode to abort.
Hi,
RiboCode runs on my dataset from Cryptococcus neoformans but fails while writing the GTF. ORF calling succeeded and wrote a complete .txt file, but the .gtf output has large gaps covering entire chromosomes or half-chromosomes.
Error message was:
Errors:
88% has finished!
Writing the results to file .....
error when transform the transcript interval to genomic!
Traceback (most recent call last):
File "/usr/local/bin/RiboCode", line 10, in <module>
sys.exit(main())
File "/usr/local/lib/python3.6/dist-packages/RiboCode/RiboCode.py", line 63, in main
output_gtf=output_gtf, output_bed=output_bed)
File "/usr/local/lib/python3.6/dist-packages/RiboCode/detectORF.py", line 455, in main
write_to_gtf(gene_dict, transcript_dict, orf_results, collapsed_orf_idx, outname)
File "/usr/local/lib/python3.6/dist-packages/RiboCode/detectORF.py", line 192, in write_to_gtf
exon_ivs = transcript_iv_transform(tobj, orf_iv)
File "/usr/local/lib/python3.6/dist-packages/RiboCode/prepare_transcripts.py", line 285, in transcript_iv_transform
exons_ivs.append(Interval_from_directional(exons_bound[i],exons_bound[i+1],strand))
I suggest three possible ways to help with this.
I can provide access to data and annotations if that helps to debug!
Thanks
Edward
Sorry to bother you.
I followed your workflow and used the STAR arguments --quantMode TranscriptomeSAM and --outFilterMultimapNmax 1. This is what the command looks like:
STAR --outFilterType BySJout --runThreadN 10 --outFilterMismatchNmax 2 \
--genomeDir /reference/GRCm38/STAR \
--readFilesIn /my/rmrRNA/${sample}_trim_norrna.fq \
--outFileNamePrefix ${sample} \
--outSAMtype BAM SortedByCoordinate \
--quantMode TranscriptomeSAM GeneCounts \
--outFilterMultimapNmax 1 --outFilterMatchNmin 16 --alignEndsType EndToEnd
The output file Aligned.toTranscriptome.out.bam still contains multi-mapped reads (NH:i > 1), because STAR writes all records to this toTranscriptome BAM file. Do these multi-mapping records affect RiboCode's ability to find uORFs?
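For anyone who wants to drop such records before running RiboCode, here is a minimal sketch that keeps only SAM records whose NH tag equals 1. It works on SAM text lines, so it needs no extra packages; the function names `nh_value` and `keep_unique` are made up for this example and are not part of RiboCode or STAR. Whether RiboCode already ignores these records is not stated here, and in a transcriptome BAM a read may carry NH > 1 simply because it matches several isoforms of one gene, so this filter may be stricter than intended.

```python
def nh_value(sam_line):
    """Return the NH:i tag of a SAM record as an int, or None if it is absent."""
    fields = sam_line.rstrip("\n").split("\t")
    for tag in fields[11:]:  # optional tags start at column 12
        if tag.startswith("NH:i:"):
            return int(tag[5:])
    return None


def keep_unique(sam_lines):
    """Yield header lines and the records whose NH tag is exactly 1."""
    for line in sam_lines:
        if line.startswith("@") or nh_value(line) == 1:
            yield line


# Small in-memory demonstration with two synthetic records.
records = [
    "@HD\tVN:1.4\n",
    "r1\t0\tT1\t1\t255\t5M\t*\t0\t0\tACGTA\tIIIII\tNH:i:1\n",
    "r2\t0\tT2\t1\t3\t5M\t*\t0\t0\tACGTA\tIIIII\tNH:i:2\tHI:i:1\n",
]
filtered = list(keep_unique(records))
print(len(filtered), "of", len(records), "lines kept")
```

In practice the lines would come from `samtools view -h` on the transcriptome BAM, and the filtered output would be converted back to BAM with `samtools view -b`.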
Hi,
I got the following error when running my script:
Error, the references in bam are different from transcriptome annotation
It seems to throw this as my reads were aligned to the transcriptome and have the transcript version ID instead of just the transcript ID. Is it possible for the output of the prepare_transcript function to include the version number?
Thanks
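To check whether the mismatch really comes from version suffixes, a small sketch like the following may help. It assumes Ensembl-style IDs where the version is a trailing `.N`; the helper names `strip_version` and `unmatched` are hypothetical. IDs that contain dots for other reasons (some plant annotations do) would be truncated incorrectly, so treat this as a diagnostic only.

```python
import re

# Trailing ".<digits>" is treated as the version suffix (an assumption).
_VERSION = re.compile(r"\.\d+$")


def strip_version(transcript_id):
    """Remove a trailing '.N' version suffix, if there is one."""
    return _VERSION.sub("", transcript_id)


def unmatched(bam_refs, annot_ids, strip=False):
    """List BAM reference names that are absent from the annotation IDs."""
    annot = set(annot_ids)
    missing = []
    for ref in bam_refs:
        key = strip_version(ref) if strip else ref
        if key not in annot:
            missing.append(ref)
    return missing


bam_refs = ["ENSMUST00000035606.8"]
annot_ids = ["ENSMUST00000035606"]
print(unmatched(bam_refs, annot_ids))              # names that differ as-is
print(unmatched(bam_refs, annot_ids, strip=True))  # names that differ after stripping
```

If the second list is empty while the first is not, the version suffix is the only difference between the two sets of names.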
Hi, I have a custom GTF file that I want to process with GTFupdate, but I get some errors.
The following is the content of my gtf file:
chr1 Cufflinks exon 11872 12227 . + . gene_id "NONHSAG000001.2";gene_name "NONHSAG000001.2";transcript_id "NONHSAT000002.2"
chr1 Cufflinks exon 12613 12721 . + . gene_id "NONHSAG000001.2";gene_name "NONHSAG000001.2";transcript_id "NONHSAT000002.2"
chr1 Cufflinks exon 13225 14412 . + . gene_id "NONHSAG000001.2";gene_name "NONHSAG000001.2";transcript_id "NONHSAT000002.2"
chr1 Cufflinks exon 11874 12227 . + . gene_id "NONHSAG000001.2";gene_name "NONHSAG000001.2";transcript_id "NONHSAT000003.2"
chr1 Cufflinks exon 12595 12721 . + . gene_id "NONHSAG000001.2";gene_name "NONHSAG000001.2";transcript_id "NONHSAT000003.2"
chr1 Cufflinks exon 13403 13655 . + . gene_id "NONHSAG000001.2";gene_name "NONHSAG000001.2";transcript_id "NONHSAT000003.2"
chr1 Cufflinks exon 13661 14409 . + . gene_id "NONHSAG000001.2";gene_name "NONHSAG000001.2";transcript_id "NONHSAT000003.2"
I modified it following the format described in GTF_update.rst, but I still get the following error:
Traceback (most recent call last):
File "/home/leelee/miniconda3/envs/p3/bin/GTFupdate", line 10, in <module>
sys.exit(main())
File "/home/leelee/miniconda3/envs/p3/lib/python3.7/site-packages/RiboCode/GTF_update.py", line 117, in main
gset,sourted_gset_keys = GroupGeneSubsets(args.gtfFile)
File "/home/leelee/miniconda3/envs/p3/lib/python3.7/site-packages/RiboCode/GTF_update.py", line 34, in GroupGeneSubsets
gid=field_dict["attr"]["gene_id"]
KeyError: 'gene_id'
How can I solve this problem?
Thanks,
LeeLee
I think I am still encountering an issue similar to issue #32, i.e., I did not get the psites_number table in the hdf5 file.
group name otype dclass dim
0 / p_sites H5I_DATASET VLEN 46826
1 / transcript_ids H5I_DATASET STRING 46826
Also the dimension of the p_sites table seems to be wrong.
Otherwise the code ran okay and generated other files.
This job is running on skl-119 on Wed May 17 23:21:46 EDT 2023
0% has finished! 2% has finished! 4% has finished! 6% has finished! 9% has finished! 11% has finished! 13% has finished! 15% has finished! 17% has finished! 19% has finished! 21% has finished! 23% has finished! 26% has finished! 28% has finished! 30% has finished! 32% has finished! 34% has finished! 36% has finished! 38% has finished! 41% has finished! 43% has finished! 45% has finished! 47% has finished! 49% has finished! 51% has finished! 53% has finished! 56% has finished! 58% has finished! 60% has finished! 62% has finished! 64% has finished! 66% has finished! 68% has finished! 70% has finished! 73% has finished! 75% has finished! 77% has finished! 79% has finished! 81% has finished! 83% has finished! 85% has finished! 88% has finished! 90% has finished! 92% has finished! 94% has finished! 96% has finished! 98% has finished!
[2023-05-17 23:29:14] Finished!
Loading transcripts.pickle ...
Reading bam file: /mnt/home/larrywu/CTRL_arabidopsis/data/RiboCode_STAR/ribo_mapped/D1//star_D1_Aligned.toTranscriptome.out.bam......
Finished reading bam file!
Any suggestions for how to deal with this issue? Thanks!
I'm trying to debug a workflow that runs RiboCode. I tried using the latest container from quay.io and this is the error I'm running into. Any guidance on what I should look into first?
Loading transcripts.pickle ...
Loading Psites from xxxxx.transcriptome.dedup......
_psites.hd5
Traceback (most recent call last):
File "/usr/local/bin/RiboCode", line 10, in <module>
sys.exit(main())
File "/usr/local/lib/python3.9/site-packages/RiboCode/RiboCode.py", line 40, in main
tpsites_sum, total_psites_number = process_bam.psites_count(configIn.configList,transcript_dict,thread_num=1)
File "/usr/local/lib/python3.9/site-packages/RiboCode/process_bam.py", line 107, in psites_count
tpsites_sum,total_psites_number = read_bam(configList[0])
File "/usr/local/lib/python3.9/site-packages/RiboCode/process_bam.py", line 45, in read_bam
tpsites,psites_number = load_psites(name + "_psites.hd5" )
File "/usr/local/lib/python3.9/site-packages/RiboCode/process_bam.py", line 31, in load_psites
psites_number = fin.attrs["psites_number"]
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/usr/local/lib/python3.9/site-packages/h5py/_hl/attrs.py", line 56, in __getitem__
attr = h5a.open(self._id, self._e(name))
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5a.pyx", line 80, in h5py.h5a.open
KeyError: "Can't open attribute (can't locate attribute: 'psites_number')"
Hi,
I am getting a weird AttributeError with not much description of what's going wrong:
Traceback (most recent call last):
File "/home/bwee/.local/bin/metaplots", line 9, in <module>
load_entry_point('RiboCode==1.2.10', 'console_scripts', 'metaplots')()
File "/home/bwee/.local/lib/python2.7/site-packages/RiboCode/metaplots.py", line 241, in main
meta_analysis(gene_dict,transcript_dict,args)
File "/home/bwee/.local/lib/python2.7/site-packages/RiboCode/metaplots.py", line 227, in meta_analysis
distancePlot(distance_to_start_count,distance_to_stop_count,pre_psite_dict,length_counter,args.outname + sampleName)
File "/home/bwee/.local/lib/python2.7/site-packages/RiboCode/metaplots.py", line 108, in distancePlot
with PdfPages(outname + ".pdf") as pdf:
AttributeError: __exit__
Could someone help me understand what's going wrong?
Thanks,
Brendan
Does this package count only uniquely mapped reads in the "ORFcount" step?
Hey, since RiboCode requires Python 3.6, it would be better for the readme to use pip3 by default, as this is more failsafe for users who have both Python versions installed. Or is there something I am missing?
Traceback (most recent call last):
File "/home/dr/anaconda2/bin/prepare_transcripts", line 10, in <module>
sys.exit(main())
File "/home/dr/anaconda2/lib/python2.7/site-packages/RiboCode/prepare_transcripts.py", line 388, in main
processTranscripts(args.genomeFasta,args.gtfFile,args.out_dir)
File "/home/dr/anaconda2/lib/python2.7/site-packages/RiboCode/prepare_transcripts.py", line 308, in processTranscripts
gene_dict,transcript_dict = readGTF(gtfFile)
File "/home/dr/anaconda2/lib/python2.7/site-packages/RiboCode/prepare_transcripts.py", line 187, in readGTF
gene => transcript => exon (or CDS)" % i)
RiboCode.prepare_transcripts.ParsingError: Error in line 0. The annotation in GTF file should be three-level hierarchy of gene => transcript => exon (or CDS)
I am getting this error. I am new to bioinformatics; could you please help me resolve it?
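The error message says the GTF should contain three levels: gene, transcript, and exon (or CDS). A quick way to see which levels a file actually has is to count the feature column. This is only a rough diagnostic sketch, not RiboCode's own check; `feature_counts` and `missing_levels` are names invented for this example.

```python
from collections import Counter


def feature_counts(gtf_lines):
    """Count the values of the third (feature) column of a GTF."""
    counts = Counter()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts


def missing_levels(gtf_lines, required=("gene", "transcript", "exon")):
    """Return the required feature types that never occur in the file."""
    counts = feature_counts(gtf_lines)
    return [feature for feature in required if counts[feature] == 0]


# Synthetic example: a file that only has exon lines.
example = [
    "#!genome-build demo\n",
    'chr1\tdemo\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'chr1\tdemo\texon\t300\t400\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
]
print(missing_levels(example))
```

For a real file, pass an open file handle instead of the list. If gene or transcript lines are missing, the GTFupdate tool mentioned in other reports on this page may be worth a try.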
Hi,
I got a ModuleNotFoundError when testing the RiboCode installation with RiboCode_onestep -V
Can you help?
$ RiboCode_onestep -V
Traceback (most recent call last):
File "/nexus/posix0/MAGE-flaski/service/projects/data/Bioinformatics/bit_pipe_ribosome_profiling/libraries/venv3/bin/RiboCode_onestep", line 5, in <module>
from RiboCode.RiboCode_onestep import main
File "/nexus/posix0/MAGE-flaski/service/projects/data/Bioinformatics/bit_pipe_ribosome_profiling/libraries/venv3/lib/python3.9/site-packages/RiboCode/RiboCode_onestep.py", line 16, in <module>
from .prepare_transcripts import *
File "/nexus/posix0/MAGE-flaski/service/projects/data/Bioinformatics/bit_pipe_ribosome_profiling/libraries/venv3/lib/python3.9/site-packages/RiboCode/prepare_transcripts.py", line 17, in <module>
from pyfasta import Fasta
File "/nexus/posix0/MAGE-flaski/service/projects/data/Bioinformatics/bit_pipe_ribosome_profiling/libraries/venv3/lib/python3.9/site-packages/pyfasta/__init__.py", line 3, in <module>
from fasta import Fasta, complement, DuplicateHeaderException
ModuleNotFoundError: No module named 'fasta'
My package list:
$ pip3 list
Package Version
------------------ ---------
AGEpy 0.8.2
autopaths 1.6.0
bcrypt 3.2.0
biomart 0.9.2
biopython 1.78
certifi 2020.12.5
cffi 1.14.5
chardet 4.0.0
charset-normalizer 3.1.0
click 7.1.2
coloredlogs 15.0
colormath 3.0.0
cryptography 3.4.6
cycler 0.10.0
decorator 4.4.2
et-xmlfile 1.0.1
future 0.18.2
h5py 3.1.0
HTSeq 0.13.5
humanfriendly 9.1
idna 2.10
ipaddress 1.0.23
jdcal 1.4.1
Jinja2 2.11.3
joblib 1.0.1
kiwisolver 1.3.1
lzstring 1.0.4
Markdown 3.3.4
MarkupSafe 1.1.1
matplotlib 3.3.4
minepy 1.2.6
multiqc 1.9
networkx 2.5
numpy 1.20.1
openpyxl 3.0.6
pandas 1.2.2
paramiko 2.7.2
patsy 0.5.1
Pillow 8.1.0
pip 23.0.1
plumbing 2.11.2
py 1.11.0
pybedtools 0.8.1
pycparser 2.20
pyfasta 0.5.2
PyNaCl 1.4.0
pyparsing 2.4.7
pysam 0.16.0.1
python-dateutil 2.8.1
pytz 2021.1
PyYAML 5.4.1
requests 2.25.1
retry 0.9.2
RiboCode 1.2.15
scikit-learn 0.24.1
scipy 1.6.1
seaborn 0.11.1
setuptools 57.5.0
sh 2.0.3
simplejson 3.17.2
six 1.15.0
spectra 0.0.11
statsmodels 0.12.2
suds-jurko 0.6
threadpoolctl 2.1.0
tqdm 4.65.0
urllib3 1.26.3
Wand 0.6.5
wheel 0.38.4
xlrd 2.0.1
XlsxWriter 1.3.7
RiboCode was installed with conda and runs in a Python 2.7 environment.
When I run the command prepare_transcripts -g ~/data/reference/IRGSP/IRGSP.gtf -f ~/data/reference/IRGSP/IRGSP.fa -o annotation/
I get the following error:
Traceback (most recent call last):
File "/home/wuyuechao/data/bio-tools/anaconda3/envs/ribocode/bin/prepare_transcripts", line 10, in <module>
sys.exit(main())
File "/home/wuyuechao/data/bio-tools/anaconda3/envs/ribocode/lib/python2.7/site-packages/RiboCode/prepare_transcripts.py", line 388, in main
processTranscripts(args.genomeFasta,args.gtfFile,args.out_dir)
File "/home/wuyuechao/data/bio-tools/anaconda3/envs/ribocode/lib/python2.7/site-packages/RiboCode/prepare_transcripts.py", line 311, in processTranscripts
genomic_seq = GenomeSeq(genomeFasta)
File "/home/wuyuechao/data/bio-tools/anaconda3/envs/ribocode/lib/python2.7/site-packages/RiboCode/prepare_transcripts.py", line 205, in __init__
self.fh = Fasta(filename, key_fn = get_chrom)
File "/home/wuyuechao/data/bio-tools/anaconda3/envs/ribocode/lib/python2.7/site-packages/pyfasta/fasta.py", line 73, in __init__
flatten_inplace)
File "/home/wuyuechao/data/bio-tools/anaconda3/envs/ribocode/lib/python2.7/site-packages/pyfasta/records.py", line 48, in prepare
idx = cPickle.load(fh)
ValueError: unsupported pickle protocol: 4
Please help, thanks!
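An "unsupported pickle protocol" error generally means the file was written by a newer Python than the one reading it: Python 2.7 can read protocols up to 2, protocol 4 was added in Python 3.4, and protocol 5 in Python 3.8. Pickles written with protocol 2 or higher start with the byte 0x80 followed by the protocol number, so the version can be checked without unpickling. The sketch below is a hedged illustration; `pickle_protocol` is a made-up helper, and it has to be run under Python 3.

```python
import os
import pickle
import tempfile


def pickle_protocol(path):
    """Return the protocol number stored in a pickle header, or None.

    Protocols 0 and 1 have no header, so None is returned for them.
    """
    with open(path, "rb") as fh:
        head = fh.read(2)
    if len(head) == 2 and head[0:1] == b"\x80":
        return head[1]
    return None


# Demonstration on a throwaway file written with protocol 4.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.pickle")
    with open(demo_path, "wb") as fh:
        pickle.dump({"gene": "example"}, fh, protocol=4)
    detected = pickle_protocol(demo_path)
print(detected)
```

In the traceback above the failing file is loaded inside pyfasta, so it is likely an index that pyfasta cached next to the FASTA file during an earlier run under Python 3. Regenerating that file with the same Python version that runs RiboCode would be one thing to try.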
RiboCode -a RiboCode_annot -c metaplots_pre_config.txt -l no -g -o RiboCode_ORFs_result
Loading transcripts.pickle ...
Reading bam file: /n/jobspace/bbcore/schaffer_rpf_abby_min6/two_batches/rpf/alignments_transcriptome/20210125_Index_3_JS8600_S11_R1_001_Aligned.toTranscriptome.out.bam......
Traceback (most recent call last):
File "/home/panh/miniconda3/envs/ribocode/bin/RiboCode", line 10, in <module>
sys.exit(main())
File "/home/panh/miniconda3/envs/ribocode/lib/python3.7/site-packages/RiboCode/RiboCode.py", line 40, in main
tpsites_sum, total_psites_number = process_bam.psites_count(configIn.configList,transcript_dict,thread_num=1)
File "/home/panh/miniconda3/envs/ribocode/lib/python3.7/site-packages/RiboCode/process_bam.py", line 118, in psites_count
tpsites,psites_number = read_bam(configData)
File "/home/panh/miniconda3/envs/ribocode/lib/python3.7/site-packages/RiboCode/process_bam.py", line 80, in read_bam
write_psites(tpsites,psites_number, name + "_psites.hd5")
File "/home/panh/miniconda3/envs/ribocode/lib/python3.7/site-packages/RiboCode/process_bam.py", line 21, in write_psites
fout.create_dataset("transcript_ids",data=list(tpsites.keys()),dtype=ds)
File "/home/panh/miniconda3/envs/ribocode/lib/python3.7/site-packages/h5py/_hl/group.py", line 136, in create_dataset
dsid = dataset.make_new_dset(self, shape, dtype, data, **kwds)
File "/home/panh/miniconda3/envs/ribocode/lib/python3.7/site-packages/h5py/_hl/dataset.py", line 170, in make_new_dset
dset_id.write(h5s.ALL, h5s.ALL, data)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5d.pyx", line 212, in h5py.h5d.DatasetID.write
File "h5py/h5t.pyx", line 1654, in h5py.h5t.py_create
File "h5py/h5t.pyx", line 1715, in h5py.h5t.py_create
TypeError: No conversion path for dtype: dtype('<U18')
My run failed on a dataset in which all genes have equal length.
What happens is this (here I display the partial hdf5 file):
group name otype dclass dim
0 / p_sites H5I_DATASET VLEN 551 x 1000
1 / transcript_ids H5I_DATASET STRING 1000
You see the "psites_number" table is not made, because it fails at inserting the "p_sites" table properly.
I think the newest h5py version from Anaconda (3.6.0) does not work with the code as currently implemented,
or am I wrong here?
ERROR is:
Traceback (most recent call last):
File "/home/roler/anaconda3/envs/ribocode_env/bin/RiboCode", line 10, in <module>
sys.exit(main())
File "/home/roler/anaconda3/envs/ribocode_env/lib/python3.9/site-packages/RiboCode/RiboCode.py", line 40, in main
tpsites_sum, total_psites_number = process_bam.psites_count(configIn.configList,transcript_dict,thread_num=1)
File "/home/roler/anaconda3/envs/ribocode_env/lib/python3.9/site-packages/RiboCode/process_bam.py", line 111, in psites_count
tpsites_sum,total_psites_number = read_bam(configList[0])
File "/home/roler/anaconda3/envs/ribocode_env/lib/python3.9/site-packages/RiboCode/process_bam.py", line 84, in read_bam
write_psites(tpsites,psites_number, name + "_psites.hd5")
File "/home/roler/anaconda3/envs/ribocode_env/lib/python3.9/site-packages/RiboCode/process_bam.py", line 26, in write_psites
fout.create_dataset("p_sites",data=list(tpsites.values()),dtype=dt, compression="gzip")
File "/home/roler/anaconda3/envs/ribocode_env/lib/python3.9/site-packages/h5py/_hl/group.py", line 149, in create_dataset
dsid = dataset.make_new_dset(group, shape, dtype, data, name, **kwds)
File "/home/roler/anaconda3/envs/ribocode_env/lib/python3.9/site-packages/h5py/_hl/dataset.py", line 145, in make_new_dset
dset_id.write(h5s.ALL, h5s.ALL, data)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5d.pyx", line 232, in h5py.h5d.DatasetID.write
File "h5py/_proxy.pyx", line 145, in h5py._proxy.dset_rw
File "h5py/_conv.pyx", line 784, in h5py._conv.ndarray2vlen
AttributeError: 'int' object has no attribute 'dtype'
I tried to make the h5 file from scratch, but it still fails with this error:
Traceback (most recent call last):
File "/home/roler/anaconda3/envs/ribocode_env/bin/RiboCode", line 10, in <module>
sys.exit(main())
File "/home/roler/anaconda3/envs/ribocode_env/lib/python3.9/site-packages/RiboCode/RiboCode.py", line 40, in main
tpsites_sum, total_psites_number = process_bam.psites_count(configIn.configList,transcript_dict,thread_num=1)
File "/home/roler/anaconda3/envs/ribocode_env/lib/python3.9/site-packages/RiboCode/process_bam.py", line 111, in psites_count
tpsites_sum,total_psites_number = read_bam(configList[0])
File "/home/roler/anaconda3/envs/ribocode_env/lib/python3.9/site-packages/RiboCode/process_bam.py", line 49, in read_bam
tpsites,psites_number = load_psites(name + "_psites.hd5" )
File "/home/roler/anaconda3/envs/ribocode_env/lib/python3.9/site-packages/RiboCode/process_bam.py", line 35, in load_psites
psites_number = fin["psites_number"].value
AttributeError: 'Dataset' object has no attribute 'value'
It now says there is no ".value" getter, which might be because ".value" is deprecated?
Relevant link: https://stackoverflow.com/questions/67409919/attributeerror-dataset-object-has-no-attribute-value
Hi,
I was wondering how to add an alternative start codon list in the RiboCode step. Thank you for your help!
Best,
Qidi
Hi,
When I try to run RiboCode:
RiboCode -a annot/ -c config.txt -l no -g -o result/
I got this error:
Loading transcripts.pickle ...
Reading bam file: newstarAligned.toTranscriptome.out.bam......
Finished reading bam file!
100% has finished!
Writing the results to file .....
Traceback (most recent call last):
File "/usr/local/bin/RiboCode", line 11, in <module>
load_entry_point('RiboCode==1.2.10', 'console_scripts', 'RiboCode')()
File "/usr/local/lib/python3.6/dist-packages/RiboCode/RiboCode.py", line 63, in main
output_gtf=output_gtf, output_bed=output_bed)
File "/usr/local/lib/python3.6/dist-packages/RiboCode/detectORF.py", line 379, in main
write_result(orf_results,outname)
File "/usr/local/lib/python3.6/dist-packages/RiboCode/detectORF.py", line 156, in write_result
header = "\t".join(list(orf_results[0].keys())[:-1])
IndexError: list index out of range
I really don't know how to deal with this problem. Could you give me some advice?
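The failing line indexes `orf_results[0]`, so the IndexError suggests that the list of ORF results was empty, i.e. no ORF was reported for this run. The sketch below shows how such a writer could guard against an empty list. It is a hypothetical stand-in written for illustration (`write_result_guarded` is not a RiboCode function), and it only mirrors the header construction visible in the traceback.

```python
import io


def write_result_guarded(orf_results, out):
    """Write ORF records as a tab-separated table; tolerate an empty list.

    Returns the number of records written.
    """
    if not orf_results:
        out.write("# no ORFs were reported\n")
        return 0
    # As in the traceback, the last key is left out of the table.
    columns = list(orf_results[0].keys())[:-1]
    out.write("\t".join(columns) + "\n")
    for orf in orf_results:
        out.write("\t".join(str(orf[c]) for c in columns) + "\n")
    return len(orf_results)


empty_out = io.StringIO()
n_empty = write_result_guarded([], empty_out)

filled_out = io.StringIO()
n_filled = write_result_guarded(
    [{"ORF_ID": "orf1", "pval": 0.01, "internal": None}], filled_out
)
print(n_empty, n_filled)
```

If the result list really is empty, the more useful question is why no ORFs passed the filters, for example whether the metaplots step found clear periodicity.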
When I run metaplots, it gives an error:
ValueError: numpy.ufunc size changed, may indicate binary incompatibility. Expected 216 from C header, got 192 from PyObject
I don't know where the problem is, please help, thanks.
Correct me if I am wrong, but currently the hdf5 file with P-site information is saved in the working directory rather than in the specified output directory. This makes the run fail if the output directory is not the working directory of the call.
This was tested on the latest version through Conda.
When I use the metaplots function of RiboCode 1.2.14, it shows the following error. How can I solve it? Thank you in advance!
Create metaplot file and predict the P-site locations ...
Loading transcripts.pickle ...
Traceback (most recent call last):
File "/home/zuozd/miniconda3/envs/ribocode/bin/metaplots", line 10, in <module>
sys.exit(main())
File "/home/zuozd/miniconda3/envs/ribocode/lib/python3.7/site-packages/RiboCode/metaplots.py", line 241, in main
gene_dict,transcript_dict = load_transcripts_pickle(os.path.join(args.annot_dir,"transcripts.pickle"))
File "/home/zuozd/miniconda3/envs/ribocode/lib/python3.7/site-packages/RiboCode/prepare_transcripts.py", line 293, in load_transcripts_pickle
gene_dict, transcript_dict = pickle.load(fin)
ValueError: unsupported pickle protocol: 5
Dear developers,
It seems that the adjusted p-values in the collapsed output were calculated from already filtered p-values. Shouldn't multiple testing correction be done before filtering p-values with the default p-value cutoff of 0.05? To do the multiple testing correction correctly, could I set --pval-cutoff 1 and then filter the output by adjusted_pval < 0.05 manually?
> collapsed[order(-pval_combined)]
pval_combined adjusted_pval
1: 4.993873e-02 4.993873e-02
2: 4.977925e-02 4.978278e-02
3: 4.968042e-02 4.968748e-02
4: 4.958320e-02 4.959377e-02
5: 4.957438e-02 4.958848e-02
---
14073: 6.443817e-267 1.814192e-263
14074: 3.203795e-276 1.127496e-272
14075: 0.000000e+00 0.000000e+00
14076: 0.000000e+00 0.000000e+00
14077: 0.000000e+00 0.000000e+00
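The concern can be illustrated numerically. The following is a minimal sketch that assumes a Benjamini-Hochberg adjustment; which method RiboCode actually applies is not stated in this report, and `bh_adjust` plus the toy p-values are made up for the example. Adjusting only the p-values that already passed the 0.05 cutoff uses a smaller number of tests, so the adjusted values come out smaller than when all tests are adjusted first.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


pvals = [0.001, 0.01, 0.04, 0.2, 0.5, 0.9]

# Correct order: adjust over all tests, then apply the cutoff.
adjusted_all = bh_adjust(pvals)

# Order described in the post: filter at 0.05 first, then adjust the survivors.
kept = [p for p in pvals if p < 0.05]
adjusted_filtered = bh_adjust(kept)

for p, a_all, a_filt in zip(kept, adjusted_all, adjusted_filtered):
    print(p, round(a_all, 4), round(a_filt, 4))
```

In this toy set the third p-value (0.04) stays below 0.05 when adjusted after filtering but rises above it when all six tests are adjusted together, which is the effect the post describes.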
Besides, I noticed that there are two parameters on calculating combined p-values:
--dependence_test {none,mic,pcc}
the method for measuring the dependence between frame1 and frame2. This test could help determine whether the combined p-values should be ajusted to account for the dependence between two test (i.e. F0 vs F1 and F0 vs F2). mic: Maximal Information
Coefficient; pcc: Pearson Correlation Coefficient.
--stouffer_adj {none,nyholt,liji,gao,galwey}
the method for adjustment the cominbed p-values to account for the dependence between two tests (i.e. F0 vs F1 and F0 vs F2). see details at: https://search.r-project.org/CRAN/refmans/poolr/html/stouffer.html
What's the recommended way to set the two parameters?
Thanks in advance.
Hi,
When I try to run prepare_transcripts with command line:
prepare_transcripts -g Arabidopsis_thaliana.TAIR10.41.gtf -f tair10.fa -o tair10_annot/
I got this error:
Preparing annotation files ...
Reading the GTF file: Arabidopsis_thaliana.TAIR10.41.gtf .......
Traceback (most recent call last):
File "/usr/local/bin/prepare_transcripts", line 11, in <module>
load_entry_point('RiboCode==1.2.10', 'console_scripts', 'prepare_transcripts')()
File "/usr/local/lib/python3.6/dist-packages/RiboCode/prepare_transcripts.py", line 388, in main
processTranscripts(args.genomeFasta,args.gtfFile,args.out_dir)
File "/usr/local/lib/python3.6/dist-packages/RiboCode/prepare_transcripts.py", line 308, in processTranscripts
gene_dict,transcript_dict = readGTF(gtfFile)
File "/usr/local/lib/python3.6/dist-packages/RiboCode/prepare_transcripts.py", line 174, in readGTF
field_dict = parsing_line(line)
File "/usr/local/lib/python3.6/dist-packages/RiboCode/prepare_transcripts.py", line 102, in parsing_line
field_dict = {"chrom": intern(chrom),"source":source,"feature": intern(feature),"iv":iv,"attr":parsing_attr(attr)}
File "/usr/local/lib/python3.6/dist-packages/RiboCode/prepare_transcripts.py", line 74, in parsing_attr
k,v = i.strip().split(" ",1)
ValueError: not enough values to unpack (expected 2, got 1)
The gtf file was downloaded from http://plants.ensembl.org/index.html
Could you give me some advice about this error?
Have a nice day.
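The last frame of the traceback splits an attribute chunk on its first space and expects a key and a value, so the error suggests that some line in the GTF has an attribute chunk without a value. A tolerant parser can help find such lines. This is a sketch under the assumption that attributes are separated by semicolons; `parse_attributes` is a hypothetical helper, not the RiboCode function, and values that themselves contain a semicolon would be split incorrectly.

```python
def parse_attributes(attr_field):
    """Parse the ninth GTF column into a dict.

    Returns (attrs, skipped), where skipped lists the chunks that did not
    have the form: key "value".
    """
    attrs = {}
    skipped = []
    for chunk in attr_field.strip().split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # a trailing semicolon leaves an empty chunk
        parts = chunk.split(None, 1)
        if len(parts) != 2:
            skipped.append(chunk)
            continue
        key, value = parts
        attrs[key] = value.strip().strip('"')
    return attrs, skipped


attrs, skipped = parse_attributes('gene_id "AT1G01010"; gene_name "NAC001"; orphan;')
print(attrs)
print(skipped)
```

Running this over the attribute column of every line and printing the lines where `skipped` is not empty should point to the records that trigger the ValueError.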
When I use the command "prepare_transcripts", it shows "ValueError: Can't transform the genomic interval, please check!". I also tried "GTFupdate", and it shows the same error. How can I solve this? Thanks!
I am running the analysis on quality-controlled data and I am now getting the following error. How can I change the p-value, or how else can I fix this problem?
error:No obviously periodicity are detected from bam file, it could be due to poor quality sequencing. Please check the metagene plots and try again by lowering the value of frame0_percent
thank you!
Hi, the "transcripts_cds.txt" file generated is empty, how should I fix it? Thanks!
Here is the GTF:
A01 phytozomev13 gene 34694907 34695456 . - . gene_id "Gohir.A01G126666.v2.1"; gene_name "Gohir.A01G126666.v2.1";
A01 phytozomev13 transcript 34694907 34695456 . - . gene_id "Gohir.A01G126666.v2.1"; gene_name "Gohir.A01G126666.v2.1"; transcript_id "Gohir.A01G126666.1.v2.1";
A01 phytozomev13 exon 34694907 34695456 0 - . gene_id "Gohir.A01G126666.v2.1"; transcript_id "Gohir.A01G126666.1.v2.1"; Name "Gohir.A01G126666";
A01 phytozomev13 gene 119528395 119531897 . + . gene_id "Gohir.A01G229700.v2.1"; gene_name "Gohir.A01G229700.v2.1";
A01 phytozomev13 transcript 119528395 119531897 . + . gene_id "Gohir.A01G229700.v2.1"; gene_name "Gohir.A01G229700.v2.1"; transcript_id "Gohir.A01G229700.1.v2.1";
A01 phytozomev13 exon 119528395 119528785 0 + . gene_id "Gohir.A01G229700.v2.1"; transcript_id "Gohir.A01G229700.1.v2.1"; Name "Gohir.A01G229700";
A01 phytozomev13 exon 119528884 119529005 0 + . gene_id "Gohir.A01G229700.v2.1"; transcript_id "Gohir.A01G229700.1.v2.1"; Name "Gohir.A01G229700";
A01 phytozomev13 exon 119529125 119529262 0 + . gene_id "Gohir.A01G229700.v2.1"; transcript_id "Gohir.A01G229700.1.v2.1"; Name "Gohir.A01G229700";
A01 phytozomev13 exon 119529365 119529507 0 + . gene_id "Gohir.A01G229700.v2.1"; transcript_id "Gohir.A01G229700.1.v2.1"; Name "Gohir.A01G229700";
A01 phytozomev13 exon 119529601 119529736 0 + . gene_id "Gohir.A01G229700.v2.1"; transcript_id "Gohir.A01G229700.1.v2.1"; Name "Gohir.A01G229700";
And here is the command:
(ribocode) [hugj2006@bigram2 translatome2022]$ prepare_transcripts -g td1.gtf -f ../cottonLeaf/refGenomes/TM1utx_26.fasta -o RiboCode_annot
[2022-06-08 05:12:58] Preparing annotation files ...
Loading transcripts.pickle ...
[2022-06-08 05:13:09] The step of preparing transcript annotation has been completed.
(ribocode) [hugj2006@bigram2 translatome2022]$
(ribocode) [hugj2006@bigram2 translatome2022]$ ls -lh RiboCode_annot/
total 166M
-rw-rw-r--+ 1 hugj2006 domain users 0 Jun 8 04:53 transcripts_cds.txt
-rw-rw-r--+ 1 hugj2006 domain users 71M Jun 8 04:53 transcripts.pickle
-rw-rw-r--+ 1 hugj2006 domain users 186M Jun 8 04:53 transcripts_sequence.fa
Hi,
I was wondering if it will be possible to identify C-terminal extensions using RiboCode. I've noticed that N-terminal extensions / truncations are indeed identified by RiboCode (they fall in the category 'annotated', but they have different start codon than the main annotated ORF).
Thank you very much,
Kind regards,
Marina
RiboCode does not correctly handle the hdf5 file it creates to store temporary data, resulting in identical results for multiple runs that are run from the same folder.
For example, running this script resulted in all identical ORF libraries for each of the datasets.
while IFS= read -r folder; do
echo "$folder"
mkdir "$folder/GRCh38_110/ribocode/"
RiboCode_onestep -g "$assembly/Homo_sapiens.GRCh38.110.gtf" \
-f "$assembly/Homo_sapiens.GRCh38.dna.primary_assembly.fa" \
-r "${folder}/GRCh38_110/aligned_tran.bam" \
-l no \
-o "${folder}/GRCh38_110/ribocode/ribo" \
-f0_percent 0.5
done < "$1"
I solved this by removing the hdf5 file left over from the previous run before each new run:
while IFS= read -r folder; do
echo "$folder"
mkdir "$folder/GRCh38_110/ribocode/"
rm -f aligned_tran_psites.hd5  # remove the stale file from the previous run
RiboCode_onestep -g "$assembly/Homo_sapiens.GRCh38.110.gtf" \
-f "$assembly/Homo_sapiens.GRCh38.dna.primary_assembly.fa" \
-r "${folder}/GRCh38_110/aligned_tran.bam" \
-l no \
-o "${folder}/GRCh38_110/ribocode/ribo" \
-f0_percent 0.5
done < "$1"
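The same workaround can be written as a small helper for pipelines driven from Python. This is a sketch based only on what the report shows: the cache file appears in the working directory and is named after the BAM file with `_psites.hd5` appended (aligned_tran.bam gives aligned_tran_psites.hd5). The function names are invented for the example, and the naming rule is an inference from this report rather than documented behaviour.

```python
import os
import tempfile


def psites_cache_path(bam_path, workdir="."):
    """Guess the cache file name for a BAM file (naming inferred from the report)."""
    name = os.path.splitext(os.path.basename(bam_path))[0]
    return os.path.join(workdir, name + "_psites.hd5")


def remove_stale_cache(bam_path, workdir="."):
    """Delete the cache file if it exists; return True if something was removed."""
    path = psites_cache_path(bam_path, workdir)
    if os.path.exists(path):
        os.remove(path)
        return True
    return False


# Demonstration in a temporary directory with an empty placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    placeholder = psites_cache_path("sampleA/GRCh38_110/aligned_tran.bam", tmp)
    open(placeholder, "w").close()
    removed_first = remove_stale_cache("sampleA/GRCh38_110/aligned_tran.bam", tmp)
    removed_second = remove_stale_cache("sampleA/GRCh38_110/aligned_tran.bam", tmp)
print(removed_first, removed_second)
```

Calling `remove_stale_cache` before each RiboCode_onestep invocation has the same effect as the `rm` line in the loop above. Giving each sample its own working directory would avoid the name clash altogether.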
Hi Developers!
I am one of the developers of the nf-core/riboseq pipeline that we have started work on recently. It is now recognised as 'in-development" by the nf-core community but we still have a lot of work to do. I am trying to develop a RiboCode module so that the pipeline can support ORF calling as an option.
Firstly, I wanted to let you know about this effort and secondly, I have a few suggestions regarding the usability of the tool.
GTFupdate subtool. E.g., GTFupdate should sort the gtf file.
I am happy to help with these and will likely make PRs here for them as I need them.
I think the input gtf to metaplots and prepare_transcripts needs start_codon and stop_codon features in order to recognize something as an annotated CDS; CDS features are not enough. Could this be added as an option in GTF_update.rst?
Importantly, could this be clarified explicitly in the gtf specification in README.md?
Hi,
I ran your tool following the guidelines but when I try to run the Ribocode.py bit, I am getting this error:
Traceback (most recent call last):
File "", line 1, in
AttributeError: 'module' object has no attribute 'Gene'
It seems to stem from pickle.load()
Let me know if you have any suggestions.
Best
Hi,
I'm trying to run prepare_transcripts to extract transcripts and CDSs from a GTF file. The problem is that this GTF file contains transcripts with multiple CDSs and, when checking the output file 'transcripts_cds.txt' from prepare_transcripts, it seems to contain only one CDS for each transcript. Does RiboCode support transcripts with multiple CDSs at all? Any way around this?
Hi!
I'm sorry to bother you again. I don't know how to deal with different transcripts of one gene when quantifying and studying 3-nt periodicity in translation. Do you have any suggestions? Thank you very much!
Hi!
I'm sorry to bother you. I followed the RiboCode workflow, but I don't know whether I need to remove PCR duplicates after STAR alignment. Could you give me some suggestions? Thank you very much!
Dear developers,
I'm having the following error message:
Traceback (most recent call last):
File "/home/huf/.local/bin/RiboCode", line 11, in <module>
sys.exit(main())
File "/home/huf/.local/lib/python2.7/site-packages/RiboCode/RiboCode.py", line 40, in main
tpsites_sum, total_psites_number = process_bam.psites_count(configIn.configList,transcript_dict,thread_num=1)
File "/home/huf/.local/lib/python2.7/site-packages/RiboCode/process_bam.py", line 111, in psites_count
tpsites_sum,total_psites_number = read_bam(configList[0])
File "/home/huf/.local/lib/python2.7/site-packages/RiboCode/process_bam.py", line 77, in read_bam
tpsites[tid][t_psite] += 1
KeyError: 'ENSMUST00000035606.8'
I even sorted the bam, but it gave me the same error for a different transcript ID. Do you know what might be going wrong?
Best wishes,
Fengyuan
Tkinter is needed when importing the matplotlib module. If the following error is returned, please check whether the tkinter package is installed. On Linux, users can install this package using one of the following commands:
apt-get install python-tk
or
yum install python-tk
Traceback (most recent call last):
File "/usr/bin/metaplots", line 7, in <module>
from RiboCode.metaplots import main
File "/usr/lib64/python2.7/site-packages/RiboCode/metaplots.py", line 7, in <module>
import matplotlib.pyplot as plt
File "/usr/lib64/python2.7/site-packages/matplotlib/pyplot.py", line 115, in <module>
_backend_mod, new_figure_manager, draw_if_interactive, _show = pylab_setup()
File "/usr/lib64/python2.7/site-packages/matplotlib/backends/__init__.py", line 32, in pylab_setup
globals(),locals(),[backend_name],0)
File "/usr/lib64/python2.7/site-packages/matplotlib/backends/backend_tkagg.py", line 6, in <module>
from six.moves import tkinter as Tk
File "/usr/lib/python2.7/site-packages/six.py", line 203, in load_module
mod = mod._resolve()
File "/usr/lib/python2.7/site-packages/six.py", line 115, in _resolve
return _import_module(self.mod)
File "/usr/lib/python2.7/site-packages/six.py", line 82, in _import_module
__import__(name)
ImportError: No module named Tkinter
When trying to use RiboCode_onestep I come across this error below:
Finished reading bam file!
Traceback (most recent call last):
File "/usr/local/bin/RiboCode_onestep", line 10, in <module>
sys.exit(main())
File "/usr/local/lib/python3.9/site-packages/RiboCode/RiboCode_onestep.py", line 76, in main
detectORF.main(gene_dict=gene_dict, transcript_dict=transcript_dict, annot_dir = "annot",
TypeError: main() missing 3 required positional arguments: 'dependence_test', 'stouffer_adj', and 'pval_adj'
My command line looks like this:
Singularity> RiboCode_onestep -g Homo_sapiens.GRCh38.104.gtf -f Homo_sapiens.GRCh38.dna.primary_assembly.fa -r my.transcriptome.dedup.bam -l no -o RiboCode_ORFs_result
When I use the command metaplots -a -r, it shows:
ImportError: cannot import name 'PPoly' from partially initialized module 'scipy.interpolate' (most likely due to a circular import) (/home/zhai2/miniconda3/lib/python3.8/site-packages/scipy/interpolate/__init__.py)
How can I solve this? Thank you.
Does Psites_frame0_RPKM represent the expression level of an ORF calculated only from reads in reading frame 0, excluding reads in frames 1 and 2?
Hi
I just wanted to ask for the supplementary R file for the data analysis. I followed the link, but the document is no longer available.
Thanks!
Dear RiboCode developers,
According to the documentation, it is recommended to include the outFilterMultimapNmax 1 parameter in the STAR alignment to exclude non-unique alignments and reduce noise in downstream analyses.
With the default outFilterMultimapNmax 10 setting, how does RiboCode handle non-unique alignments? Are they included in P-site estimation and ORF detection? Does RiboCode differentiate between primary and secondary alignment flags when dealing with multi-mapped reads?
Thank you very much!
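For readers unfamiliar with how multi-mapped reads appear in an alignment file, the sketch below classifies SAM records using the standard FLAG bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary) and the NH tag, which STAR emits by default to report the number of alignments for a read. It does not describe RiboCode's internal behaviour; the function name and the sample records are hypothetical.

```python
def classify_sam_record(line):
    """Classify one SAM alignment line as 'unmapped', 'secondary',
    'supplementary', 'multi' (primary record with NH > 1) or 'unique'."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & 0x4:
        return "unmapped"
    if flag & 0x100:
        return "secondary"
    if flag & 0x800:
        return "supplementary"
    # Optional tags start at column 12; NH:i:<n> is the number of reported hits.
    for tag in fields[11:]:
        if tag.startswith("NH:i:") and int(tag[5:]) > 1:
            return "multi"
    return "unique"


# Hypothetical records: 11 mandatory columns plus an NH tag.
seq, qual = "A" * 28, "I" * 28
records = [
    "\t".join(["r1", "0", "tx1", "10", "255", "28M", "*", "0", "0", seq, qual, "NH:i:1"]),
    "\t".join(["r2", "0", "tx1", "50", "3", "28M", "*", "0", "0", seq, qual, "NH:i:2"]),
    "\t".join(["r2", "256", "tx2", "70", "3", "28M", "*", "0", "0", seq, qual, "NH:i:2"]),
]
print([classify_sam_record(r) for r in records])
```

Tallying these categories over a BAM file converted to SAM text shows how many non-unique alignments a downstream tool would encounter.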
plot_orf_density fails with the message "the following arguments are required: --start-codon", but I have added the start codon:
plot_orf_density -a totalreads_out/ -c metaplots_pre_config.txt -t itf00g40750.t4 -s 65 -e 295 --start-condon ATG --plot-annotated-orf yes
How can I solve this?
Thank you.
When running the "metaplots" or "plot_orf_density" command, some users received errors similar to the following:
"raise RuntimeError('Invalid DISPLAY variable')"
"tkinter.TclError: no display name and no $DISPLAY environment variable"
The main problem is that the default matplotlib backend is unavailable. The solution is to change the backend. A very simple way is to set the MPLBACKEND environment variable, either for your current shell or for a single script:
export MPLBACKEND="module://my_backend"
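Non-interactive backends capable of writing to a file include Agg (PNG), PDF, PS, and SVG; replace the module://my_backend placeholder above with one of these names, for example Agg.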
See also:
http://matplotlib.org/faq/usage_faq.html#what-is-a-backend
http://matplotlib.org/users/customizing.html#the-matplotlibrc-file
http://stackoverflow.com/questions/2801882/generating-a-png-with-matplotlib-when-display-is-undefined
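The same workaround can be applied from a wrapper script instead of the shell. The sketch below sets MPLBACKEND to the non-interactive Agg backend only when no DISPLAY is present and the user has not already chosen a backend. The helper names `headless_env` and `run_headless` are hypothetical and not part of RiboCode.

```python
import os
import subprocess


def headless_env(environ=None, backend="Agg"):
    """Return a copy of the environment that selects a non-interactive
    matplotlib backend when no display is available."""
    env = dict(os.environ if environ is None else environ)
    if not env.get("DISPLAY"):
        # Keep an existing MPLBACKEND value; only fill it in when absent.
        env.setdefault("MPLBACKEND", backend)
    return env


def run_headless(command):
    """Run a command (for example metaplots) with the adjusted environment."""
    return subprocess.run(command, env=headless_env(), check=True)
```

For example, `run_headless(["metaplots", "-a", "RiboCode_annot", "-r", "HEK293Aligned.toTranscriptome.out.bam"])` would launch metaplots with the Agg backend on a server without a display.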
When I ran metaplots -a RiboCode_annot -r HEK293Aligned.toTranscriptome.out.bam, an error occurred:
[2022-07-28 21:05:31] Create metaplot file and predict the P-site locations ...
Loading transcripts.pickle ...
Traceback (most recent call last):
File "/home/user/BGM/lit/anaconda3/envs/ribocode/bin/metaplots", line 10, in <module>
sys.exit(main())
File "/home/user/BGM/lit/anaconda3/envs/ribocode/lib/python3.7/site-packages/RiboCode/metaplots.py", line 241, in main
meta_analysis(gene_dict,transcript_dict,args)
File "/home/user/BGM/lit/anaconda3/envs/ribocode/lib/python3.7/site-packages/RiboCode/metaplots.py", line 181, in meta_analysis
filter_tids = filter_transcript(gene_dict,transcript_dict)
File "/home/user/BGM/lit/anaconda3/envs/ribocode/lib/python3.7/site-packages/RiboCode/metaplots.py", line 48, in filter_transcript
level = list(sorted(levels))[0]
TypeError: '<' not supported between instances of 'str' and 'NoneType'
How can I solve this? Thanks!
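The last frame of the traceback shows `sorted(levels)` failing because `levels` holds both strings and `None`, which Python 3 refuses to compare. A minimal sketch of the failure and one None-safe alternative follows. The helper name `lowest_level` and the sample values are hypothetical, and this is not necessarily how RiboCode itself resolves the problem.

```python
def lowest_level(levels, default=None):
    """Return the smallest non-None entry of levels, or default if none exist."""
    present = [lv for lv in levels if lv is not None]
    return min(present) if present else default


levels = ["2", None, "1"]  # hypothetical values mixing str and None

try:
    sorted(levels)  # raises TypeError on Python 3
except TypeError as exc:
    print("sorted() failed:", exc)

print(lowest_level(levels))
```

Dropping the `None` entries before comparing avoids the error; whether missing values should be skipped or given a default rank depends on what they mean in the annotation.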