scHPF is a tool for de novo discovery of both discrete and continuous expression patterns in single-cell RNA-sequencing (scRNA-seq) data. We find that scHPF’s sparse low-dimensional representations, non-negativity, and explicit modeling of variable sparsity across genes and cells produce highly interpretable factors.
- Documentation
- Changelog
- Paper at Molecular Systems Biology
- Application to human tissue T cells across multiple donors and tissues
scHPF requires Python >= 3.6 and the packages:
- numba (version needed depends on Python version, but should be safe with 0.45)
- scikit-learn
- pandas
- (optional) loompy
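As a quick sanity check of the requirements above, the following is a minimal sketch (not part of scHPF) that reports whether the interpreter is new enough and which of the listed packages cannot be found. The function names `missing_modules` and `check_environment` are made up for this illustration; the only assumption about the packages is that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util
import sys

# Import names for the packages listed above (scikit-learn imports as "sklearn").
REQUIRED = ["numba", "sklearn", "pandas"]
OPTIONAL = ["loompy"]
MIN_PYTHON = (3, 6)


def missing_modules(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_environment():
    """Summarize whether the interpreter and packages meet the stated requirements."""
    return {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "missing_required": missing_modules(REQUIRED),
        "missing_optional": missing_modules(OPTIONAL),
    }


if __name__ == "__main__":
    # Hypothetical helper for illustration only; it does not check package versions.
    print(check_environment())
```

Running this inside the environment you plan to use for scHPF should show an empty `missing_required` list; a non-empty `missing_optional` list only matters if you intend to use loom files as input.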
The easiest way to set up an environment for scHPF is with the Anaconda Python distribution, using either Miniconda or Anaconda:
conda create -n schpf_p37 python=3.7 scikit-learn numba=0.45 pandas
# for newer Anaconda versions
conda activate schpf_p37
# or, for older Anaconda versions
source activate schpf_p37
# Optional, for using loom files as input to preprocessing
pip install -U loompy
Once you have set up the environment, clone this repository and install.
git clone git@github.com:simslab/scHPF.git
cd scHPF
pip install .
scHPF has a scikit-learn-like API. Trained models are stored in a serialized joblib format.
If you have any questions, errors, or issues, please open an issue and I will be happy to provide whatever help and guidance I can.
Contributions to scHPF are welcome. Please get in touch if you would like to discuss a change or check whether it is something I have already done but not yet pushed to master. To contribute, please fork scHPF, make your changes, and submit a pull request.
Hanna Mendes Levitin, Jinzhou Yuan, Yim Ling Cheng, Francisco JR Ruiz, Erin C Bush, Jeffrey N Bruce, Peter Canoll, Antonio Iavarone, Anna Lasorella, David M Blei, Peter A Sims. "De novo gene signature identification from single‐cell RNA‐seq with hierarchical Poisson factorization." Molecular Systems Biology, 2019. [Open access article]
Peter A. Szabo*, Hanna Mendes Levitin*, Michelle Miron, Mark E. Snyder, Takashi Senda, Jinzhou Yuan, Yim Ling Cheng, Erin C. Bush, Pranay Dogra, Puspa Thapa, Donna L. Farber, Peter A. Sims. "Single-cell transcriptomics of human T cells reveals tissue and activation signatures in health and disease." Nature Communications, 2019. [Open access article] * Co-first authors