brm-phylo

Phylogenies of Breast Cancer Brain Metastases

Introduction

This repository contains the code and data for the following work: Yifeng Tao, Haoyun Lei, Adrian V. Lee, Jian Ma, and Russell Schwartz. Phylogenies Derived from Matched Transcriptome Reveal the Evolution of Cell Populations and Temporal Order of Perturbed Pathways in Breast Cancer Brain Metastases. Proceedings of the International Symposium on Mathematical and Computational Oncology (ISMCO). 2019.

How to use the pipeline?

STEP 0: Prerequisites

The code runs on Python 2.7.

  • Standard-library modules used (no installation needed): os, random, pickle, cStringIO, collections.
  • Third-party packages that must be installed: numpy, pandas, scipy, sklearn (scikit-learn), matplotlib, seaborn.
  • Additional third-party packages required by some experiments: statsmodels, networkx, skbio (scikit-bio), Bio (Biopython), PyTorch.
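
Before running the pipeline, it can help to check which of the packages above are importable. The following is a minimal sketch, not part of the repository: the helper name find_missing is hypothetical, and the list uses import names (for example, PyTorch is imported as torch).

```python
import importlib

# Import names of the third-party packages listed above.
THIRD_PARTY = [
    "numpy", "pandas", "scipy", "sklearn", "matplotlib", "seaborn",
    "statsmodels", "networkx", "skbio", "Bio", "torch",
]


def find_missing(module_names):
    """Return the names in module_names that cannot be imported (hypothetical helper)."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception:
            # A broken installation can raise errors other than ImportError,
            # so any failure is treated as "not usable".
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(THIRD_PARTY)
    if missing:
        print("Packages that could not be imported: " + ", ".join(missing))
    else:
        print("All listed packages can be imported.")
```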

The three-step pipeline is described below.

STEP 1: Data preprocessing and mapping to gene modules/cancer pathways

You can load the preprocessed, mapped data (df_modu) with the following Python snippet:

from DataProcessor import DataProcessor
data_proc = DataProcessor()
df_modu, len_kegg = data_proc.load_modu_data()

The DataProcessor class performs the data preprocessing and the mapping. The returned df_modu is a pandas.DataFrame in which each row is a gene module/cancer pathway and each column is a sample.
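
To make the row/column orientation concrete, here is a small sketch on a made-up DataFrame with the same layout as df_modu (rows are modules/pathways, columns are samples). The names df_demo, module_A, sample_1, and so on are hypothetical and the values are arbitrary; none of them come from the repository's data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_modu: 3 modules/pathways (rows) x 4 samples (columns).
df_demo = pd.DataFrame(
    np.arange(12, dtype=float).reshape(3, 4),
    index=["module_A", "module_B", "pathway_C"],
    columns=["sample_1", "sample_2", "sample_3", "sample_4"],
)

# Number of modules/pathways and number of samples.
n_features, n_samples = df_demo.shape

# One module/pathway across all samples (select a row by its label).
one_feature = df_demo.loc["module_B"]

# One sample across all modules/pathways (select a column by its label).
one_sample = df_demo["sample_3"]

# Row labels as a plain list, the same form as list(df_modu.index).
feature_names = list(df_demo.index)
```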

STEP 2: Deconvolution of bulk data

We first run cross-validation to determine the appropriate number of cell communities/components for deconvolution, and then use that number of components to unmix the bulk data:

python run_nnd.py

This script calls models.NND to perform neural network deconvolution (NND). The cross-validation results are available at data/ica/results_cv.pkl, and the unmixed matrices at data/ica/BCF.pkl.
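
The idea of choosing the number of components by cross-validation can be illustrated with a small, self-contained sketch. It is not the repository's models.NND: it swaps in plain NumPy gradient descent for the PyTorch model, runs on synthetic data, and assumes a factorization B ≈ C·F in which each column of F is non-negative and sums to 1, with randomly held-out entries of B used for validation. The function names, fold scheme, learning rate, and iteration count are all hypothetical.

```python
import numpy as np


def softmax_columns(Z):
    """Map each column of Z to non-negative values that sum to 1."""
    Z = Z - Z.max(axis=0, keepdims=True)  # shift for numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=0, keepdims=True)


def fit_masked(B, observed, k, n_iter=2000, lr=0.01, seed=0):
    """Fit B ~ C.dot(F) by gradient descent on the entries where observed is True."""
    rng = np.random.RandomState(seed)
    m, n = B.shape
    C = rng.normal(scale=0.1, size=(m, k))   # component profiles (assumed m x k)
    Z = rng.normal(scale=0.1, size=(k, n))   # unconstrained parameters behind F
    W = observed.astype(float)
    for _ in range(n_iter):
        F = softmax_columns(Z)
        R = W * (C.dot(F) - B)               # residual on observed entries only
        G = C.T.dot(R)                       # gradient of the loss w.r.t. F
        grad_C = R.dot(F.T)
        # Chain rule through the column-wise softmax.
        grad_Z = F * (G - (F * G).sum(axis=0, keepdims=True))
        C -= lr * grad_C
        Z -= lr * grad_Z
    return C, softmax_columns(Z)


def cv_error(B, k, n_folds=5, seed=0):
    """Mean squared error on held-out entries, averaged over folds."""
    rng = np.random.RandomState(seed)
    fold_of_entry = rng.randint(n_folds, size=B.shape)
    errors = []
    for fold in range(n_folds):
        held_out = fold_of_entry == fold
        C, F = fit_masked(B, ~held_out, k, seed=seed)
        diff = (C.dot(F) - B)[held_out]
        errors.append(np.mean(diff ** 2))
    return float(np.mean(errors))


# Synthetic bulk data: 12 features x 16 samples mixed from 3 components.
rng = np.random.RandomState(1)
C_true = rng.normal(size=(12, 3))
F_true = rng.dirichlet(np.ones(3), size=16).T   # 3 x 16, columns sum to 1
B = C_true.dot(F_true) + 0.05 * rng.normal(size=(12, 16))

# Compare candidate component counts by held-out error and keep the smallest.
errors = {k: cv_error(B, k) for k in (1, 2, 3, 4)}
best_k = min(errors, key=errors.get)
print("held-out MSE per k:", errors)
print("selected k:", best_k)
```

Parameterizing F through a column-wise softmax keeps the fractions on the simplex without an explicit projection step, which is one simple way to enforce the assumed constraints.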

STEP 3: Building cell community phylogeny and inferring Steiner node pathways

Some cell components are missing in some patients. We can group patients by the pattern of components that exist in them:

import pickle
from DataProcessor import DataProcessor
from utils_analysis import component_portion, classify_patients, plot_phylo
# Load preprocessed data
data_proc = DataProcessor()
df_modu, len_kegg = data_proc.load_modu_data()
# Load deconvolved components and fraction matrix
with open("data/ica/BCF.pkl", "rb") as fin:
    BCF = pickle.load(fin)
B, C, F = BCF["B"], BCF["C"], BCF["F"]
# Index of the primary component/community
comp_p = component_portion(F, plot_mode=True)
# Aggregate different patterns of components in patients
list_patterns = classify_patients(F, threshold_0=2.5e-2)

Here, list_patterns contains four different patterns of phylogenies. To visualize a specific pattern, e.g., the first one, and print the differentially perturbed pathways along the edges of its phylogeny, run:

pattern = list_patterns[0]
plot_phylo(C, F, list(df_modu.index), len_kegg, comp_p, pattern, threshold=0.05)
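
The grouping performed by classify_patients above can be pictured as thresholding the fraction matrix and collecting units that share the same present/absent pattern. The sketch below is an illustration under assumptions, not the repository's implementation: presence_patterns is a hypothetical helper, F_demo is made-up data, F is assumed to be laid out as components × columns, and each column is treated as one unit (how the repository maps samples to patients is not shown here).

```python
from collections import defaultdict

import numpy as np


def presence_patterns(F, threshold=2.5e-2):
    """Group column indices of F by which components exceed the threshold.

    Hypothetical helper: F is assumed to have one row per component and
    one column per unit being classified.
    """
    present = np.asarray(F) > threshold          # boolean, components x columns
    groups = defaultdict(list)
    for j in range(present.shape[1]):
        pattern = tuple(present[:, j].tolist())  # e.g. (True, False, True)
        groups[pattern].append(j)
    return dict(groups)


# Made-up fractions: 3 components (rows) x 4 units (columns).
F_demo = np.array([
    [0.60, 0.50, 0.98, 0.70],
    [0.39, 0.49, 0.01, 0.29],
    [0.01, 0.01, 0.01, 0.01],
])
patterns = presence_patterns(F_demo)
for pattern, columns in sorted(patterns.items()):
    print(pattern, columns)
```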

How to replicate results in the paper?

python run_nnd.py

This plots Fig. 2b, Fig. A1, and Fig. A2 of the paper.

python analysis.py

This plots the figures and prints the tables of the paper in the following order: Fig. 3b, Table 1, Table A1, Fig. 3a, Fig. 3c, Fig. A3, Fig. 3d, Fig. 3e, Tables A2-A5.

License

The repository is released under the MIT license, so feel free to share or adapt the materials. If you find this work useful, please cite:

@inproceedings{tao2019brm,
  title = {Phylogenies Derived from Matched Transcriptome Reveal the Evolution of Cell Populations and Temporal Order of Perturbed Pathways in Breast Cancer Brain Metastases},
  author = {Tao, Yifeng and
            Lei, Haoyun and
            Lee, Adrian V. and
            Ma, Jian and
            Schwartz, Russell},
  booktitle = {Proceedings of the International Symposium on Mathematical and Computational Oncology},
  month = {Oct},
  year = {2019},
}

You are welcome to reach out to us with any questions.

Contact: Yifeng Tao ([email protected]), Russell Schwartz ([email protected])
