irescue's People

Contributors

bepoli

Forkers

bepoli

irescue's Issues

Better error handling

The program should raise informative errors when something in the input files is wrong, instead of crashing with uninformative messages such as those dealt with in #1.
Here is a list of errors/exceptions to implement:

  • requirement not met (error if not present, warning if version is not supported)
  • cell barcode and UMI tags not found in BAM
    • might check the header if STARsolo attributes are not included (https://github.com//issues/1#issuecomment-1431680110)
  • BAM reference names not matching with BED
  • couldn't download the annotation data for a genome assembly (for any reason)
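
The checks in this list could be sketched roughly as follows. This is only an illustration and not IRescue's actual code: the exception classes and the `check_tags` / `check_references` helpers are hypothetical names, and the inputs are plain Python collections standing in for the tags and reference names that would be read from the BAM and BED files.

```python
class IRescueInputError(Exception):
    """Hypothetical base class for problems found in the input files."""


class MissingTagError(IRescueInputError):
    """Raised when no alignment carries both the cell barcode and UMI tags."""


class ReferenceMismatchError(IRescueInputError):
    """Raised when BAM and BED share no reference names at all."""


def check_tags(read_tags, cb_tag="CB", umi_tag="UR"):
    """Return the 1-based index of the first alignment having both tags.

    read_tags: iterable of sets of tag names, one set per alignment
    (an assumption made here to keep the sketch free of BAM parsing).
    """
    for line_no, tags in enumerate(read_tags, start=1):
        if cb_tag in tags and umi_tag in tags:
            return line_no
    raise MissingTagError(
        f"Tags {cb_tag} and {umi_tag} not found in any inspected alignment."
    )


def check_references(bam_refs, bed_refs):
    """Return BAM references missing from the BED; fail if none are shared."""
    bam_set, bed_set = set(bam_refs), set(bed_refs)
    if not bam_set & bed_set:
        raise ReferenceMismatchError(
            "BAM reference names do not match the BED (e.g. '1' vs 'chr1')."
        )
    # References only in the BAM are worth a warning, not an error.
    return sorted(bam_set - bed_set)


if __name__ == "__main__":
    print(check_tags([{"NH"}, {"NH", "CB", "UR"}]))
    print(check_references(["chr1", "chrM"], ["chr1", "chr2"]))
```

Separating a hard error (no shared references) from a soft warning (some references missing) keeps partial mismatches usable while still stopping early on a completely wrong annotation.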

ValueError: range() arg 3 must not be zero

Hi,
Thanks for the nice work on bioRxiv; I hope it will be accepted in a good journal.
Now I want to use it to quantify my single-cell RNA-seq data.
An error occurred when I ran this command:
nohup ~/anaconda3/envs/irescue/bin/irescue -b possorted_genome_bam.bam -p 8 -r /public1/home/sc60481/Axolotl/sc-RNA/03.deal.TE/All.TE.deal.bed -w ./filtered_feature_bc_matrix/barcodes.tsv.gz &
I am not sure what caused the error.
I hope you can reply and help.

Thanks for your time and work.

[image]

IRescue error: Traceback (most recent call last): File "/apps/software/gcc-12.1.0/python/3.10.5/bin/irescue", line 8, in <module> sys.exit(main())

I am trying to run IRescue on 10X samples that were aligned with STARsolo, and I am getting an error I do not understand. Could you help me?

My submission script is:
#!/bin/bash -l
#SBATCH --job-name=IRescue
#SBATCH --account=tcmartinez
#SBATCH --partition=tier2q
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=4
#SBATCH --time=48:00:00
#SBATCH --cpus-per-task=4
#SBATCH --mem=64gb
#SBATCH --output=/gpfs/data/mcnerney-lab/Tanner/TCM230/ir.out
#SBATCH --error=/gpfs/data/mcnerney-lab/Tanner/TCM230/ir.err

module load gcc/12.1.0
module load python/3.10.5
module load samtools/1.18
module load bedtools/2.30.0

irescue -b /gpfs/data/mcnerney-lab/Tanner/TCM230/STARSolo/Aligned.sortedByCoord.out.bam \
-g mm10 \
-p 8 \
-w /gpfs/data/mcnerney-lab/Tanner/TCM230/STARSolo/whitelist/Anames.tsv

And the error message I am getting is:

[01/15/2024 - 13:11:21] IRescue job starts
[01/15/2024 - 13:11:21] Found CB and UR tags occurrence in bam's line 1.
[01/15/2024 - 13:11:21] Downloading and parsing RepeatMasker annotation for assembly mm10 from https://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/initial/mm10.fa.out.gz ...
[01/15/2024 - 13:12:13] WARNING: The following references contain read alignments but are not found in the TE annotation and will be skipped: chr4_JH584295_random, chrM
[01/15/2024 - 13:14:47] Writing mapped barcodes to ./IRescue_out//barcodes.tsv.gz
[01/15/2024 - 13:14:47] Writing mapped features to ./IRescue_out//features.tsv.gz
Traceback (most recent call last):
  File "/apps/software/gcc-12.1.0/python/3.10.5/bin/irescue", line 8, in <module>
    sys.exit(main())
  File "/apps/software/gcc-12.1.0/python/3.10.5/lib/python3.10/site-packages/irescue/main.py", line 101, in main
    bc_per_thread = list(split_bc(barcodes_file, args.threads))
  File "/apps/software/gcc-12.1.0/python/3.10.5/lib/python3.10/site-packages/irescue/count.py", line 133, in split_bc
    for chunk in split_int(bclen, n):
  File "/apps/software/gcc-12.1.0/python/3.10.5/lib/python3.10/site-packages/irescue/count.py", line 119, in split_int
    for i in range(0, num, split):
ValueError: range() arg 3 must not be zero
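
The traceback above ends in `range(0, num, split)` with a step of zero. One plausible explanation, assuming `split` is computed by floor-dividing the number of mapped barcodes by the number of threads, is that fewer barcodes were found than threads requested (`-p 8`), for example because the whitelist does not match the CB tags in the BAM. The sketch below reproduces that failure and shows one possible guard. Both functions are hypothetical reconstructions for illustration, not the code in `irescue/count.py`.

```python
def split_int_naive(num, div):
    """Assumed shape of the failing logic: the step is num // div."""
    split = num // div  # becomes 0 whenever num < div
    for start in range(0, num, split):  # raises ValueError if split == 0
        yield start, min(start + split, num)


def split_int_safe(num, div):
    """Yield (start, end) chunks of range(num), at most `div` of them."""
    if num <= 0:
        raise ValueError(
            "No barcodes to process: check that the whitelist matches "
            "the cell barcode tags in the BAM file."
        )
    step = max(1, -(-num // div))  # ceiling division, never zero
    for start in range(0, num, step):
        yield start, min(start + step, num)


if __name__ == "__main__":
    # 3 barcodes over 8 threads: the naive version fails, the safe one does not.
    try:
        list(split_int_naive(3, 8))
    except ValueError as err:
        print("naive:", err)
    print("safe:", list(split_int_safe(3, 8)))
```

Comparing the number of lines in `./IRescue_out//barcodes.tsv.gz` with the thread count would be a quick way to confirm or rule out this explanation.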

Confusing number of clusters from the TE matrix

Hi bepoli,
It's me again. After successfully running irescue and getting the three files (matrix.mtx.gz, features.tsv.gz and barcodes.tsv.gz) for each time point, I ran the following commands to add the TE assay alongside the RNA assay:
dpa0.data <- Read10X(data.dir = "/public1/home/sc60481/Axolotl/sc-RNA/dpa0/outs/filtered_feature_bc_matrix")
dpa0 <- CreateSeuratObject(counts = dpa0.data, project = "dpa0", min.cells = 3, min.features = 100)
dpa0.te.data <- Seurat::Read10X('./dpa0/outs/IRescue_out/', gene.column = 1, cell.column = 1)
te.assay <- Seurat::CreateAssayObject(dpa0.te.data)
te.assay <- subset(te.assay, colnames(te.assay)[which(colnames(te.assay) %in% colnames(dpa0))])
dpa0[['TE']] <- te.assay

As the scRNA-seq data had already been analyzed and integrated with cell type annotations before I ran irescue, I found that the TE assay of each stage could not be added to the previous Seurat object.
I then re-ran each stage following the commands above, merged all seven stages with Harmony, and ran normalization, scaling and FindClusters on this object.
[image]
As the species I used has 48 TE subfamilies, the TE matrix is 48 subfamilies × N cells.
[image]

Am I right? I do not understand why the TE matrix is not individual TEs × N cells.
My second point of confusion: when I run FindClusters with resolution < 1.0 I get only 3 clusters, while with resolution > 1.0 (I have tried 1.0001) the number of clusters increases to ~9000.
I think I must have made a mistake somewhere. I hope you can help me.
Thank you very much.
Xiangyu
