Multiscale PHATE

Multiscale PHATE is a Python package for multiresolution analysis of high-dimensional data. For an in-depth explanation of the algorithm and its applications, please read our manuscript in Nature Biotechnology.

The biomedical community is producing increasingly high dimensional datasets integrated from hundreds of patient samples that current computational techniques are unable to explore across granularities. To visualize, cluster and analyze massive datasets across granularities, we created Multiscale PHATE. The goal of Multiscale PHATE is to learn and visualize abstract cellular features and groupings of the data at all levels of granularity in an efficient manner to identify meaningful biological relationships and mechanisms. Our approach learns a tree of data granularities which can be cut at coarse levels for high level summarizations of data as well as at fine levels for detailed representations on subsets.

Overview of Algorithm:

Our algorithm integrates the dimensionality reduction technique PHATE with the multigranular analysis tool diffusion condensation. First, the non-linear diffusion manifold is calculated using PHATE. Then diffusion condensation takes this manifold-intrinsic diffusion space and slowly condenses data points towards local centers of gravity to form natural, data-driven groupings across multiple granularities. Each of these granularities can then be visualized.
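The condensation step described above can be illustrated with a small NumPy sketch. This is a toy version written for this README, not the package's implementation: it runs on raw coordinates rather than on the PHATE diffusion space, and the function names, the Gaussian kernel, the bandwidth-doubling rule, and all tolerances are assumptions chosen for illustration.

```python
import numpy as np


def pairwise_sq_dists(X):
    """Squared Euclidean distance between every pair of rows in X."""
    diff = X[:, None, :] - X[None, :, :]
    return (diff ** 2).sum(axis=-1)


def assign_labels(X, merge_tol):
    """Give points that sit within merge_tol of an earlier point the same label."""
    close = pairwise_sq_dists(X) <= merge_tol ** 2
    first_close = close.argmax(axis=1)  # first index j <= i with X[j] near X[i]
    labels = np.empty(len(X), dtype=int)
    for i, j in enumerate(first_close):
        labels[i] = i if j == i else labels[j]
    return labels


def toy_diffusion_condensation(X, epsilon=0.5, merge_tol=1e-3,
                               move_tol=1e-4, max_iter=200):
    """Return one label array per iteration, from finest to coarsest grouping."""
    X = np.array(X, dtype=float)
    history = [np.arange(len(X))]  # level 0: every point is its own cluster
    for _ in range(max_iter):
        # Row-normalized Gaussian kernel: each row is a local averaging rule.
        K = np.exp(-pairwise_sq_dists(X) / epsilon ** 2)
        P = K / K.sum(axis=1, keepdims=True)
        X_new = P @ X  # move each point toward its local center of gravity
        max_move = np.linalg.norm(X_new - X, axis=1).max()
        X = X_new
        labels = assign_labels(X, merge_tol)
        history.append(labels)
        if len(np.unique(labels)) == 1:
            break  # everything has condensed into a single group
        if max_move < move_tol:
            epsilon *= 2.0  # points stopped moving: widen the kernel (assumed rule)
    return history


# Two well-separated synthetic blobs of 20 points each.
rng = np.random.default_rng(0)
blobs = np.vstack([rng.normal(0.0, 0.1, size=(20, 2)),
                   rng.normal(5.0, 0.1, size=(20, 2))])
history = toy_diffusion_condensation(blobs)
cluster_counts = [len(np.unique(labels)) for labels in history]
```

Each entry of `history` is one granularity: early entries keep points separate, intermediate entries group each blob, and the last entry merges everything.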

Using gradient analysis, which looks at shifts in data density during successive iterations of the diffusion condensation process, we can identify stable resolutions of the hierarchical tree for downstream analysis. With this stability information, we can cut the hierarchical tree at multiple resolutions to produce visualizations and clusters across granularities for downstream analysis.
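As a rough illustration of what "stable resolutions" means, the sketch below looks for stretches of iterations during which the grouping does not change. It uses the number of clusters per iteration as a simple stand-in for the density shifts mentioned above; that proxy, the function name `stable_levels`, and the `min_run` parameter are assumptions made for this example and are not the package's gradient analysis.

```python
import numpy as np


def stable_levels(cluster_counts, min_run=3):
    """Return the middle iteration of every run of >= min_run unchanged counts."""
    counts = np.asarray(cluster_counts)
    # Positions where the cluster count differs from the previous iteration.
    change_points = np.flatnonzero(np.diff(counts) != 0) + 1
    starts = np.concatenate(([0], change_points))
    ends = np.concatenate((change_points, [len(counts)]))  # exclusive run ends
    return [int((s + e - 1) // 2) for s, e in zip(starts, ends) if e - s >= min_run]


# Hypothetical cluster counts over 11 condensation iterations.
example_counts = [40, 40, 12, 5, 2, 2, 2, 2, 1, 1, 1]
levels = stable_levels(example_counts)
```

Here the runs of two clusters (iterations 4 to 7) and one cluster (iterations 8 to 10) are long enough to count as stable, so the sketch reports one representative iteration from each.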

By identifying multiple resolutions, Multiscale PHATE enables users to interact with their data and zoom in on cellular subsets of interest to reveal increasingly granular information about cell types and subtypes.
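Conceptually, zooming in amounts to picking one group at a coarse resolution and inspecting how its members are split at a finer resolution. The following sketch shows that idea on plain label arrays; the helper `zoom_in` and the example labels are hypothetical and do not correspond to functions or outputs of the package.

```python
import numpy as np


def zoom_in(coarse_labels, fine_labels, cluster_id):
    """Select one coarse cluster and summarize its finer-grained subclusters."""
    coarse = np.asarray(coarse_labels)
    fine = np.asarray(fine_labels)
    mask = coarse == cluster_id  # points that belong to the chosen coarse cluster
    sub_ids, sub_sizes = np.unique(fine[mask], return_counts=True)
    return mask, dict(zip(sub_ids.tolist(), sub_sizes.tolist()))


# Hypothetical labels for six points at a coarse and a fine resolution.
coarse = [0, 0, 0, 1, 1, 1]
fine = [0, 0, 2, 3, 4, 4]
mask, subclusters = zoom_in(coarse, fine, cluster_id=1)
```

The returned mask can be used to subset the data matrix, and the dictionary maps each fine-level subcluster inside the selected group to its size.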

While this may sound computationally inefficient, we show that we are able to perform these calculations, as well as visualize and cluster the data, significantly faster than “single-scale” visualization techniques like tSNE, UMAP or PHATE, allowing the analysis of millions of cells within minutes. When combined with other computational algorithms for high-dimensional data analysis, such as MELD and DREMI, Multiscale PHATE is able to provide deep and detailed insights into biological processes.

Installation

Multiscale PHATE can be installed with pip. To install the latest version from GitHub, run the following in a terminal:

pip install --user git+https://github.com/KrishnaswamyLab/Multiscale_PHATE

Quick Start

import multiscale_phate
import scprep  # provides the plotting helper used below

# X is your data matrix (observations x features), e.g. a NumPy array
mp_op = multiscale_phate.Multiscale_PHATE()
mp_embedding, mp_clusters, mp_sizes = mp_op.fit_transform(X)

# Plot optimal visualization
scprep.plot.scatter2d(mp_embedding, s=mp_sizes, c=mp_clusters,
                      fontsize=16, ticks=False, label_prefix="Multiscale PHATE", figsize=(16, 12))

Guided Tutorial

For more details on using Multiscale PHATE, see our guided tutorial using 10X's public PBMC4k dataset.
