Pilot1 Hackathon

Python functions and command line utilities for working with standard pilot1 data sets.

git clone https://github.com/levinas/p1h
pip install -U pandas scikit-learn xgboost

Command-line Examples

The repository includes scripts for dataframe export and prediction tasks.

Export by-drug data

Save molecular features and dose response data for given drugs to CSV files:

$ python dataframe.py --by drug --drugs 100071 --feature_subsample 10

NSC 100071: saved 52 rows and 11 columns to NSC_100071.csv
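Once a file such as NSC_100071.csv exists, it can be read back with pandas for inspection. The sketch below is illustrative only: it uses a tiny in-memory stand-in instead of a real export, the column names are hypothetical, and the assumption that the first column holds the dose response target is not confirmed by the source.

```python
import io

import pandas as pd

# Stand-in for an exported file such as NSC_100071.csv.
# The column names are hypothetical; check the real header first.
csv_text = """GROWTH,feature_a,feature_b
-0.20,1.5,0.3
0.10,0.7,1.1
0.45,2.2,0.9
"""

# For a real export, use pd.read_csv("NSC_100071.csv") instead.
df = pd.read_csv(io.StringIO(csv_text))

# Assumption: the first column is the response, the rest are features.
y = df.iloc[:, 0]
X = df.iloc[:, 1:]

print(df.shape)
print(list(X.columns))
```

For the real export reported above, df.shape should match the 52 rows and 11 columns printed by dataframe.py.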

Export by-cell data

Save drug features and dose response data for given cell lines to CSV files:

$ python dataframe.py --by cell --cells BR:MCF7 CNS:SF_268

BR:MCF7: saved 15628 rows and 3811 columns to BR:MCF7.csv
CNS:SF_268: saved 28151 rows and 3811 columns to CNS:SF_268.csv
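The output file names above keep the colon from the cell line name (for example BR:MCF7.csv). Colons are not allowed in Windows file names, so copying these files across systems may require renaming. A minimal sketch of one way to derive a portable name follows; safe_filename is a hypothetical helper and not part of this repository.

```python
import re


def safe_filename(label: str, ext: str = ".csv") -> str:
    """Replace characters that are invalid in Windows file names with underscores.

    Hypothetical helper for illustration; not part of p1h.
    """
    cleaned = re.sub(r'[<>:"/\\|?*]', "_", label)
    return cleaned + ext


for cell in ["BR:MCF7", "CNS:SF_268"]:
    print(cell, "->", safe_filename(cell))
```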

By-drug regression runs

Run three regression models on drug set A (defined by Jason to include 306 drugs) using all types of cell line features (expression, miRNA and proteome). Feature importance and model performance, evaluated on several metrics, are saved to files.

$ python by_drug.py --drugs A --models randomforest lasso elasticnet --cell_features all
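To make "feature importance and model performance" concrete, here is a rough sketch of the kind of computation involved, written directly against scikit-learn. It does not reproduce by_drug.py: the data is synthetic, and the metrics (R2 and mean absolute error), hyperparameters and train/test split are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for one drug's data: 60 cell lines, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=60)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hyperparameters here are illustrative, not the script's defaults.
models = {
    "randomforest": RandomForestRegressor(n_estimators=50, random_state=0),
    "lasso": Lasso(alpha=0.01),
    "elasticnet": ElasticNet(alpha=0.01),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    # Tree ensembles expose feature_importances_; linear models expose coef_.
    if hasattr(model, "feature_importances_"):
        importance = model.feature_importances_
    else:
        importance = np.abs(model.coef_)
    results[name] = {
        "r2": r2_score(y_test, pred),
        "mae": mean_absolute_error(y_test, pred),
        "top_feature": int(np.argmax(importance)),
    }

for name, scores in results.items():
    print(name, scores)
```

Using the absolute coefficient as the importance for linear models is one common convention; it is only comparable across features when they are on similar scales.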

Code Examples

Run standard regressions on a drug

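from datasets import NCI60
from skwrapper import regress

# Load the by-drug dataframe for drug NSC 100071
df = NCI60.load_by_drug_data(drug='100071')

# Run two of the standard regression models on the same dataframe
regress('XGBoost', df)
regress('Lasso', df)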

Sweep customized RandomForest regression on all cell lines

from datasets import NCI60
from skwrapper import regress
from sklearn.ensemble import RandomForestRegressor

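# Customized model: a random forest with 20 trees
model = RandomForestRegressor(n_estimators=20)

# Run the regression on each cell line's dataframe in turn
cells = NCI60.all_cells()
for cell in cells:
    df = NCI60.load_by_cell_data(cell)
    regress(model, df, cv=3)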
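The cv=3 argument suggests 3-fold cross-validation. The source does not show how regress implements it, so the following is only a sketch of what 3-fold cross-validation of the same kind of model looks like in plain scikit-learn, on synthetic data and with R2 as an assumed scoring metric.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for one cell line's data: 90 samples, 4 features.
rng = np.random.default_rng(1)
X = rng.normal(size=(90, 4))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=90)

model = RandomForestRegressor(n_estimators=20, random_state=0)

# cv=3 splits the data into three folds; each fold is held out once
# while the model is trained on the other two.
scores = cross_val_score(model, X, y, cv=3, scoring="r2")

print(scores.shape)   # one score per fold
print(scores.mean())  # average held-out R2
```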
