pharmacogxml's Introduction

Biomarker discovery from large pharmacogenomics datasets

Instructors:

Workshop Description

This workshop will focus on the challenges encountered when applying machine learning techniques to complex, high-dimensional biological data. In particular, we will focus on biomarker discovery from pharmacogenomic data, which consists of developing predictors of the response of cancer cell lines to chemical compounds based on their genomic features. From a methodological viewpoint, biomarker discovery is strongly linked to variable selection, through methods such as supervised learning with sparsity-inducing norms (e.g., ElasticNet) or techniques accounting for the complex correlation structure of biological features (e.g., mRMR). Yet the main focus of this workshop will be on the sound use of such methods in a pharmacogenomics context, their validation, and the correct interpretation of the results they produce. We will discuss how to assess the quality of both the input and output data. We will illustrate the importance of unified analytical platforms and of data and code sharing in bioinformatics and biomedical research, as the data generation process becomes increasingly complex and requires a high level of replication to achieve robust results. This is particularly relevant as our portfolio of machine learning techniques keeps growing, each with hyperparameters that can be tuned in a multitude of ways, which increases the risk of overfitting when developing multivariate predictors of drug response.
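To make the link between sparsity-inducing penalties and variable selection concrete, here is a minimal sketch using the glmnet package. It is not taken from the workshop material: the data are simulated, and the names `n_cell_lines`, `n_genes`, `gene1`, and so on are hypothetical stand-ins for a real expression matrix and drug response vector.

```r
# Sketch only: synthetic data stands in for a real pharmacogenomic dataset.
library(glmnet)

set.seed(42)
n_cell_lines <- 100   # samples (hypothetical cell lines)
n_genes <- 500        # features (hypothetical gene expression values)

# Simulated expression matrix: rows are cell lines, columns are genes.
x <- matrix(rnorm(n_cell_lines * n_genes), nrow = n_cell_lines,
            dimnames = list(NULL, paste0("gene", seq_len(n_genes))))

# Simulated drug response driven by the first three genes plus noise.
y <- 2 * x[, 1] - 1.5 * x[, 2] + x[, 3] + rnorm(n_cell_lines)

# Elastic net: alpha = 0.5 mixes the lasso and ridge penalties.
# The penalty strength lambda is chosen by 10-fold cross-validation.
cv_fit <- cv.glmnet(x, y, alpha = 0.5, nfolds = 10)

# Genes with non-zero coefficients at the more conservative lambda.1se.
beta <- as.matrix(coef(cv_fit, s = "lambda.1se"))
selected <- setdiff(rownames(beta)[beta[, 1] != 0], "(Intercept)")
print(selected)
```

Here only lambda is tuned, and alpha is fixed by hand. If alpha or other hyperparameters are also tuned, the performance estimate should come from data that played no part in that tuning (for example, nested cross-validation or a held-out dataset), otherwise it will tend to be optimistic.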

Pre-requisites

  • Basic knowledge of R syntax
  • Familiarity with machine learning concepts and at least a few approaches

The following resources might be useful to read:

Workshop Participation

Participants are expected to have the following packages installed on their machines so that they can run the commands along with the instructors.

  • PharmacoGx and Biobase from Bioconductor
  • mRMRe, caret, glmnet, randomForest from CRAN
  • bhklab/mci and bhklab/PharmacoGx-ML from GitHub
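One possible way to install these packages is sketched below. The use of the BiocManager and remotes helper packages is an assumption, not something the workshop prescribes, and the GitHub repository names are copied from the list above without checking that they are still available.

```r
# Bioconductor packages (BiocManager is assumed as the installer).
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install(c("PharmacoGx", "Biobase"))

# CRAN packages.
install.packages(c("mRMRe", "caret", "glmnet", "randomForest"))

# GitHub packages (remotes is assumed; repository names as listed above).
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}
remotes::install_github("bhklab/mci")
remotes::install_github("bhklab/PharmacoGx-ML")
```

Running `library(PharmacoGx)` afterwards is a quick check that the main package loads.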

R / Bioconductor packages used

Time outline

An example outline for a 60-minute workshop:

Activity                                      Time
Introduction                                  10m
Basic functionalities of PharmacoGx           15m
Consistency assessment between datasets       15m
Machine learning and biomarker discovery      20m

Workshop goals and objectives

Learning goals

  • describe the pharmacogenomic datasets and their usefulness
  • learn how to extract information from these datasets and to intersect them over their common features
  • identify the functionalities available in the PharmacoGx package for working with high-dimensional pharmacogenomics data
  • assess reproducibility and replication of pharmacogenomics studies
  • understand how to frame biomarker discovery as a pattern recognition problem in pharmacogenomics studies

Learning objectives

  • list available standardized pharmacogenomic datasets and download them
  • understand the structure of these datasets and how to access the features and response quantifications
  • create drug dose-response plots
  • measure the consistency across multiple datasets and learn how to improve such measurements
  • assess whether known biomarkers are reproduced within these datasets
  • predict new biomarkers by applying different machine learning methods
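As a rough illustration of the consistency objective, the sketch below intersects two studies over their common cell lines and computes a rank correlation of their drug sensitivity values. It uses base R and simulated values only. The study names, cell line names, and AUC vectors are hypothetical, and the workshop itself would presumably rely on PharmacoGx functions rather than this hand-rolled version.

```r
# Sketch only: two simulated studies that screened overlapping cell lines.
set.seed(1)
all_lines <- paste0("cell", 1:60)

# Shared underlying sensitivity of each cell line to one drug.
latent <- setNames(runif(60), all_lines)

# Each study covers a different, partially overlapping panel.
study_a_lines <- all_lines[1:45]
study_b_lines <- all_lines[16:60]

# Observed AUC = underlying sensitivity + study-specific measurement noise.
auc_a <- latent[study_a_lines] + rnorm(45, sd = 0.1)
auc_b <- latent[study_b_lines] + rnorm(45, sd = 0.1)

# Intersect the studies over their common cell lines, then compare ranks.
common <- intersect(names(auc_a), names(auc_b))
rho <- cor(auc_a[common], auc_b[common], method = "spearman")

cat("Common cell lines:", length(common), "\n")
cat("Spearman correlation:", round(rho, 2), "\n")
```

Spearman correlation is used here because it depends only on ranks, so it is less affected by scale differences between studies. Other consistency measures may suit real data better.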

pharmacogxml's People

Contributors

bhaibeka, p-smirnov, zhaleh-safikhani
