This repository contains the R scripts used for our publication "Kinase deficient NTRK2 splice variant, TrkB.T1, links development and oncogenesis".
- Cao, J., Spielmann, M., Qiu, X. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502 (2019). https://doi.org/10.1038/s41586-019-0969-x
- Anders, S., Pyl, P. T., & Huber, W. (2015). HTSeq--a Python framework to work with high-throughput sequencing data. Bioinformatics (Oxford, England), 31(2), 166–169. https://doi.org/10.1093/bioinformatics/btu638
- Trapnell, C. et al. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat. Biotechnol. 32, 381–386 (2014). https://doi.org/10.1038/nbt.2859
- Qiu, X. et al. Reversed graph embedding resolves complex single-cell trajectories. Nat. Methods 14, 979–982 (2017). https://doi.org/10.1038/nmeth.4402
- McInnes, L., Healy, J. & Melville, J. UMAP: Uniform Manifold Approximation and Projection for dimension reduction. Preprint at https://arxiv.org/abs/1802.03426 (2018).
- Vivian, J., Rao, A. A., Nothaft, F. A. et al. Toil enables reproducible, open source, big biomedical data analyses. Nature Biotechnology (2017).
- R Core Team (2018). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
All of our analysis was done in R using the following R/Bioconductor packages and tools:
- ggplot2 for making plots in our paper.
- htseq-count for estimating transcript expression data
- Monocle3 for normalizing transcript expression data
- rtracklayer to import the GTF file downloaded from Gencode (vM12)
Monocle3 can be installed using instructions found here
To ensure smooth execution of the code in this repository, please install the following packages:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("rtracklayer", "ggplot2", "gridExtra"))
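Once the packages are installed, the Gencode GTF can be loaded with rtracklayer. The following is only a minimal sketch and not necessarily how the scripts in this repository do it: the file name `gencode.vM12.annotation.gtf.gz` is an assumption, so point `gtf_path` at wherever you saved the download.

```r
# Sketch: load the Gencode annotation with rtracklayer.
# ASSUMPTION: gtf_path is a hypothetical file name; use the path of your download.
gtf_path <- "gencode.vM12.annotation.gtf.gz"

if (requireNamespace("rtracklayer", quietly = TRUE) && file.exists(gtf_path)) {
  # import() detects the GTF format from the file extension and returns a GRanges object
  gtf <- rtracklayer::import(gtf_path)

  # Keep gene-level records only (the "type" column comes from GTF column 3)
  genes <- gtf[gtf$type == "gene"]

  # Preview coordinates and gene names
  print(head(as.data.frame(genes)[, c("seqnames", "start", "end", "gene_name")]))
} else {
  message("Install rtracklayer and set gtf_path to the downloaded GTF first.")
}
```

The guard around the import means the snippet reports a message instead of failing when the package or the file is missing.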
- SAM alignment files for the MOCA dataset were downloaded from here
- “cell_annotation.csv” which contained TSNE coordinates, UMAP coordinates, and information about clusters and trajectories was downloaded from here
- “Comprehensive gene annotation” file was downloaded from Gencode
- Transcript data (TPM data) was downloaded for TARGET from UCSC Xena
- Transcript data (TCGA RNAseqV2 RSEM data) for each TCGA organ site was downloaded from here.
- Hallmark gene set was downloaded from here
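
As a quick check that the downloaded annotation table is readable, it can be loaded with base R. This is a hedged sketch: the helper `load_cell_annotation` is hypothetical (it is not part of this repository), and it assumes "cell_annotation.csv" sits in the working directory. The column names are not assumed here, so the example prints them rather than relying on any one of them.

```r
# Hypothetical helper: read the MOCA cell annotation table if it is present.
# ASSUMPTION: the default path points at the downloaded "cell_annotation.csv".
load_cell_annotation <- function(path = "cell_annotation.csv") {
  if (!file.exists(path)) {
    message("File not found: ", path)
    return(NULL)
  }
  read.csv(path, stringsAsFactors = FALSE)
}

cell_annotation <- load_cell_annotation()
if (!is.null(cell_annotation)) {
  # List the available columns before relying on any of them
  print(colnames(cell_annotation))
}
```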