icbi-lab / luca

Single-cell Lung Cancer Atlas with 1.2M cells

Home Page: https://luca.icbi.at

License: BSD 3-Clause "New" or "Revised" License

Languages: Shell 0.45%, Python 81.70%, Nextflow 14.89%, R 2.95%

luca's Issues

Epithelial Cell Types

Hi!

We were wondering what happened to the different epithelial cell types in 33_epithelial_cells.

In our project, we are interested in epithelial cells (Goblet cells in particular). Were these cell types filtered out during preprocessing or not found at all?

The script 33_epithelial_cells.py mentions Goblet cells, but the plots that would give more context aren't visible.

Thanks in advance!

Question about cluster idents in Seurat object

Hello authors. Thanks for sharing the valuable data and code.
While reproducing the data analysis using Seurat, I had a question about the differences between the three idents that can be used to annotate the clusters: cell_type, cell_type_major, and cell_type_tumor.
Sometimes the same cluster is named differently depending on which ident is used for the annotation.
For example, a cluster annotated as "type I/II pneumocytes" with the cell_type ident is annotated as "Alveolar cell type 1/2" with the cell_type_major ident.
Could you explain the differences between these three idents, and which one I should use as the primary annotation?
Thank you so much.
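
One way to see how the annotation columns relate is to cross-tabulate them and list, for each label in one column, the labels it co-occurs with in another. This is a minimal standard-library sketch on made-up rows: only the two pneumocyte labels come from the example above, everything else is illustrative, and it says nothing about which column the authors intend as the primary one.

```python
from collections import defaultdict


def label_correspondence(rows, from_key, to_key):
    """Map each label in `from_key` to the sorted labels seen with it in `to_key`."""
    mapping = defaultdict(set)
    for row in rows:
        mapping[row[from_key]].add(row[to_key])
    return {label: sorted(targets) for label, targets in mapping.items()}


# Made-up per-cell annotations; only the two pneumocyte labels come from the post.
cells = [
    {"cell_type": "type I/II pneumocytes", "cell_type_major": "Alveolar cell type 1/2"},
    {"cell_type": "type I/II pneumocytes", "cell_type_major": "Alveolar cell type 1/2"},
    {"cell_type": "hypothetical label A", "cell_type_major": "hypothetical major A"},
]

for label, majors in label_correspondence(cells, "cell_type", "cell_type_major").items():
    print(f"{label} -> {majors}")
```

A label that maps to exactly one label in the other column is a pure renaming; a label that maps to several indicates that the two columns differ in granularity.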

batch correction on expression counts and embeddings

Thanks for sharing this dataset. For the data available from cellxgene, is the batch correction applied to both the low-dimensional embedding (e.g. UMAP) and the expression counts, or only to the embedding? Thanks : )

Is there rds/Seurat data that can be downloaded for the extended and core atlas with the mutation information of STK11?

Hi, I am not a professional bioinformatician. I am interested in the data with and without STK11 mutations. Could you please let me know which dataset I can download? It would be great if the dataset included the mutation information for STK11 as well as for the other key oncogenes mentioned in the paper. I can only work with RDS or Seurat data. Could you please send me a direct link to download the data?

I sincerely appreciate your help.

Regards,
Shawn

Modules scanpy_helpers / AnnotationHelper

Hi,

I love exploring these exciting datasets and the code, but I ran into one error that I would like to ask about.

In the code at analyses/37_subclustering/37_neutrophil_subclustering.py, I tried to import the following modules and functions:

from scanpy_helpers.annotation import AnnotationHelper
import scanpy_helpers as sh

but I got errors saying that there is no module scanpy_helpers and no AnnotationHelper.

From Googling, it seems there is no installable module called scanpy_helpers or AnnotationHelper.
How can I use these modules? Is there a specific way to install them?

Thanks
bangbattlers
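
An import error like this usually means the module is not on Python's search path. A name that cannot be found on a package index may be a helper package shipped inside the repository itself; that is an assumption here, not something the post confirms. The sketch below checks whether a module is importable and, if it is not, adds a local directory to the search path. `LOCAL_LIB_DIR` is a hypothetical placeholder that has to be replaced with wherever the package actually lives in a checkout, and the helper names are illustrative.

```python
import importlib
import importlib.util
import sys
from pathlib import Path

# Hypothetical directory that would contain a `scanpy_helpers/` package folder.
LOCAL_LIB_DIR = Path("path/to/directory/containing/scanpy_helpers")


def module_available(name):
    """Return True if the top-level module `name` is on the current import path."""
    return importlib.util.find_spec(name) is not None


def ensure_on_path(directory):
    """Prepend `directory` to sys.path once; return True if it was added."""
    directory = str(Path(directory).resolve())
    if directory in sys.path:
        return False
    sys.path.insert(0, directory)
    return True


if __name__ == "__main__":
    if not module_available("scanpy_helpers"):
        ensure_on_path(LOCAL_LIB_DIR)
        importlib.invalidate_caches()
    print("scanpy_helpers importable:", module_available("scanpy_helpers"))
```

If the package turns out to ship with its own install metadata, installing it into the environment would be the cleaner alternative to editing `sys.path`.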

Loading the h5ad

Congratulations on a very nice preprint.

I am trying to load the extended atlas from the .h5ad provided. In R, using loomR's connect() function I encounter the following error:

lfile  <- connect( filename = 'data/extended_atlas.h5ad' )
Error in validateLoom(object = self) :
  There can only be one dataset at the root of the loom file

When I tried using Python:

out.file =  scanpy.read_10x_h5 ('data/extended_atlas.h5ad')
ValueError: 'data/extended_atlas.h5ad' contains more than one genome. For legacy 10x h5 files you must specify the genome if more than one is present. Available genomes are: ['X', 'obs', 'obsm', 'obsp', 'raw', 'uns', 'var']

So I then attempted using the raw "genome," and encountered this error:

Exception: File is missing one or more required datasets.

I work mostly in R for scRNA-seq analyses, so I don't have much experience with the h5ad format. How can I go about loading this file?
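
For readers hitting the same errors: an `.h5ad` file is an HDF5 container in the AnnData layout. The names listed in the error above (`X`, `obs`, `obsm`, ...) are its top-level groups, not genomes, which is why neither a loom reader nor the 10x reader accepts it. Below is a minimal sketch, assuming the `anndata` package is installed and reusing the path from the post; `is_hdf5` and `load_atlas` are illustrative helper names, not part of this repository.

```python
from pathlib import Path

HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # first 8 bytes of a standard HDF5 file


def is_hdf5(path):
    """Return True if the file starts with the HDF5 signature."""
    with open(path, "rb") as handle:
        return handle.read(8) == HDF5_SIGNATURE


def load_atlas(path):
    """Read an .h5ad file into an AnnData object (requires `anndata`)."""
    import anndata  # imported lazily so the signature check works without it

    if not is_hdf5(path):
        raise ValueError(f"{path} does not look like an HDF5 file")
    return anndata.read_h5ad(path)


if __name__ == "__main__":
    atlas_path = Path("data/extended_atlas.h5ad")  # path taken from the post
    if atlas_path.exists():
        adata = load_atlas(atlas_path)
        print(adata)  # summary of the obs, var, obsm, ... slots
```

From R, packages such as zellkonverter (`readH5AD`) read the same format; that route is not shown here.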

Project new dataset to the atlas

Dear authors,

Thanks for sharing this valuable data and workflow. I wonder whether you have a script for projecting a user's new dataset onto the atlas, or whether there is a parameter/argument in the Nextflow pipeline that performs such a projection. Many thanks.

Best,
Nan

RAM usage in SCISSOR_TCGA

Hi, thank you for your work. When running the SCISSOR_TCGA step, I encountered an issue: it required over three thousand GB of RAM. How much memory did you use when running SCISSOR_TCGA?

malignant cells labeled in normal samples

Hi,

I downloaded the extended data atlas from CellxGene, and was surprised to find many cells labeled as "malignant" in supposedly normal tissues (see screenshot containing malignant cell counts per disease/study/origin below - data was aggregated from the cell metadata in the downloaded h5ad file). Can you please help me understand how these cell type labels were generated, and explain the presence of these malignant labels in normal tissues? I thought perhaps it was due to mislabeling during transfer learning, but many of these cells come from the core atlas datasets.

Thanks,
Rebecca
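
The aggregation described above can be reproduced from the cell metadata with a grouped count. This is a minimal sketch using only the standard library on a toy table; the column names (`cell_type`, `disease`, `study`, `origin`), the label `"malignant"`, and all values are assumptions and should be checked against the columns actually present in the downloaded file.

```python
from collections import Counter


def count_label_by_group(rows, label, label_key="cell_type",
                         group_keys=("disease", "study", "origin")):
    """Count rows whose `label_key` equals `label`, grouped by `group_keys`."""
    counts = Counter()
    for row in rows:
        if row[label_key] == label:
            counts[tuple(row[key] for key in group_keys)] += 1
    return counts


# Toy metadata standing in for the per-cell annotation table (made-up values).
cells = [
    {"cell_type": "malignant", "disease": "normal", "study": "A", "origin": "normal"},
    {"cell_type": "malignant", "disease": "normal", "study": "A", "origin": "normal"},
    {"cell_type": "malignant", "disease": "tumor", "study": "B", "origin": "tumor_primary"},
    {"cell_type": "T cell", "disease": "normal", "study": "A", "origin": "normal"},
]

for group, n in sorted(count_label_by_group(cells, "malignant").items()):
    print(group, n)
```

Grouping by study alongside disease and origin makes it easier to tell whether unexpected labels are concentrated in a few datasets or spread across all of them.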

Index 0 out of bounds for length 0 error

Dear authors,
Thank you very much for your work. I am new to Nextflow. After downloading the related data and running the command you provide,
"nextflow run main.nf --workflow downstream_analyses -resume -profile icbi_lung --build_atlas_dir ./data/20_build_atlas --outdir ./data/30_downstream_analyses"

this error always occurs. I am running this on a server rather than on an HPC cluster. I don't know why the error occurs. Is it related to the environment I am using?

Here is the error information from the log file:
