metagx's Introduction

MetaGx

Scripts to perform meta-analysis of cancer gene expression datasets

Dependencies:

Install the R/Bioconductor dependencies:

pp <- c("Biobase", "BiocGenerics", "org.Hs.eg.db", "survival", "survcomp", "genefu", "mRMRe", "WriteXLS")

source("http://bioconductor.org/biocLite.R")

myrepos <- biocinstallRepos()

rr <- biocLite(pkgs=pp, dependencies=TRUE, type="source", destdir=".")
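The biocLite.R script used above was the Bioconductor installer when this README was written; on current Bioconductor releases it has been replaced by the BiocManager package. A minimal sketch of the same installation with BiocManager follows. It assumes that every listed package is still available from the current repositories, which has not been verified here, and the variable missing_pkgs is introduced only for illustration.

```r
# Same dependency list as in the README
pp <- c("Biobase", "BiocGenerics", "org.Hs.eg.db", "survival",
        "survcomp", "genefu", "mRMRe", "WriteXLS")

# Packages from the list that are not installed yet (illustrative helper variable)
missing_pkgs <- setdiff(pp, rownames(installed.packages()))

# Install only in an interactive session so that sourcing this file
# never triggers downloads unexpectedly
if (interactive() && length(missing_pkgs) > 0) {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install(missing_pkgs)
}
```

BiocManager::install() resolves packages from both Bioconductor and CRAN, so a single call can cover the whole list.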

If using Windows:

  • WriteXLS needs Perl to be installed. To check whether Perl is available, run: library(WriteXLS); testPerl(perl="perl", verbose=TRUE)

TODO

  • Adapt the package to handle curatedOvarianData
  • Weighted survival does not work properly (the number of patients per time point should be "unweighted")

metagx's People

Contributors

bhaibeka, dgendoo, gmchen, lwaldron, natchar, p-smirnov

metagx's Issues

Inconsistent annotations

The following annotations, with the corresponding column names, must be included in each dataset in MetaGx and documented:

PROBEID: probe name on the platform
ENTREZID: NCBI Entrez gene ID
ENSEMBLID: Ensembl gene id
SYMBOL: Official gene symbol

Check probeGeneMapping, subtypeClassification and all other functions
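One way to audit a dataset against the required column names listed above is sketched below. This is only an illustration: missing_annotations is a hypothetical helper, not a MetaGx function, and the toy feature table contains made-up values.

```r
# Column names required by the issue above
required_cols <- c("PROBEID", "ENTREZID", "ENSEMBLID", "SYMBOL")

# Hypothetical helper: return the required columns that are absent
# from a feature annotation table (a data.frame)
missing_annotations <- function(feature_data, required = required_cols) {
  setdiff(required, colnames(feature_data))
}

# Toy feature table with made-up values; ENSEMBLID is deliberately left out
toy_features <- data.frame(
  PROBEID  = c("probe_1", "probe_2"),
  ENTREZID = c("1", "2"),
  SYMBOL   = c("GENE1", "GENE2"),
  stringsAsFactors = FALSE
)

missing_annotations(toy_features)  # returns "ENSEMBLID"
```

For an ExpressionSet, the same check could be applied to the feature data, for example missing_annotations(Biobase::fData(eset)).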

Subscript out of bounds error

In robustness_validation.RNW, clustering.subtypes$tcga_nmf_4 is a list of length 15, as expected, but its elements are empty, whereas a list of at least 100 is expected.

read.xls not imported in NAMESPACE

As a result, getHellandSubtypes.R fails unless library(gdata) has been called in the global environment:

https://github.com/bhklab/MetaGx/blob/master/R/getHellandSubtypes.R#L2

In general, MetaGx has a heavy footprint and risks name collisions because of its many DEPENDS. It would be better to import individual functions unless an entire package is really needed; see, for example, http://r-pkgs.had.co.nz/namespace.html. Hadley's strict function-by-function importing may sometimes be overkill, especially when many functions from a base Bioconductor package are used, but in many cases it takes little effort and avoids NAMESPACE collisions and the attaching of many unused functions and methods.

heavy use of DEPENDS

MetaGx has a heavy footprint and risks name collisions because of its many DEPENDS. It would be better to import individual functions unless an entire package is really needed; see, for example, http://r-pkgs.had.co.nz/namespace.html. Hadley's strict function-by-function importing may sometimes be overkill, especially when many functions from a base Bioconductor package are used, but in many cases it takes little effort and avoids NAMESPACE collisions and the attaching of many unused functions and methods.
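As a rough illustration of importing a single function rather than depending on a whole package, the sketch below shows the NAMESPACE directive as a comment and an equivalent explicitly qualified call. The wrapper read_supplement is hypothetical and is not taken from the MetaGx sources; only gdata::read.xls is an existing function.

```r
# NAMESPACE directive (sketch) that imports only the function that is used,
# with gdata listed under Imports rather than Depends in DESCRIPTION:
#   importFrom(gdata, read.xls)

# Alternative: qualify the call where it is used, so that nothing
# has to be attached to the search path at all.
# read_supplement is a hypothetical wrapper used only for illustration.
read_supplement <- function(path, sheet = 1) {
  if (!requireNamespace("gdata", quietly = TRUE)) {
    stop("Package 'gdata' is required to read .xls files")
  }
  gdata::read.xls(path, sheet = sheet)
}
```

Either form keeps read.xls resolvable regardless of which packages the user has attached in the global environment.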

`select()` needs to be imported in `getHellandSubtypes()`

A NAMESPACE collision with select from library(dplyr) causes an error in OvcSubtypes classificationAcrossDatasets.Rnw (see the reproducible example below; the file PMID17290060_eset.rds will be attached). This is related to Issue 3: the error probably does not occur with the earlier versions of R/Bioc/dplyr used during development, but it breaks on more recent installations. Such problems can be avoided by relying more on imports than on depends.

> suppressPackageStartupMessages(library(Biobase))
> suppressPackageStartupMessages(library(MetaGx))
> suppressPackageStartupMessages(library(gdata))
> suppressPackageStartupMessages(library(hgu133plus2.db))
> eset=readRDS("PMID17290060_eset.rds")
> tmp=getHellandSubtypes(eset)
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
> suppressPackageStartupMessages(library(dplyr))
> tmp=getHellandSubtypes(eset)
Error in UseMethod("select_") : 
  no applicable method for 'select_' applied to an object of class "c('ChipDb', 'AnnotationDb', 'envRefClass', '.environment', 'refClass', 'environment', 'refObject', 'AssayData')"
> sessionInfo()
R Under development (unstable) (2016-01-30 r70052)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods  
[9] base     

other attached packages:
 [1] dplyr_0.4.3            hgu133plus2.db_3.2.2   gdata_2.17.0          
 [4] MetaGx_0.9.10          lsa_0.73.1             SnowballC_0.5.1       
 [7] WriteXLS_4.0.0         mRMRe_2.0.5            igraph_1.0.1          
[10] genefu_2.3.2           AIMS_1.3.0             e1071_1.6-7           
[13] iC10_1.1.3             iC10TrainingData_1.0.1 pamr_1.55             
[16] cluster_2.0.3          biomaRt_2.27.2         limma_3.27.11         
[19] mclust_5.1             survcomp_1.21.0        prodlim_1.5.7         
[22] survival_2.38-3        jetset_3.1.3           org.Hs.eg.db_3.2.3    
[25] RSQLite_1.0.0          DBI_0.3.1              AnnotationDbi_1.33.7  
[28] IRanges_2.5.24         S4Vectors_0.9.26       Biobase_2.31.3        
[31] BiocGenerics_0.17.3   

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.3        bitops_1.0-6       class_7.3-14       tools_3.3.0       
 [5] bootstrap_2015.2   amap_0.8-14        gtools_3.5.0       SuppDists_1.1-9.2 
 [9] grid_3.3.0         R6_2.1.2           XML_3.98-1.3       lava_1.4.1        
[13] rmeta_2.16         magrittr_1.5       survivalROC_1.0.3  splines_3.3.0     
[17] assertthat_0.1     KernSmooth_2.23-15 RCurl_1.95-4.7    
> 
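The second call in the transcript above fails because, once dplyr is attached, the unqualified select() resolves to dplyr's function instead of the AnnotationDbi one. One possible fix, sketched here under assumptions, is to qualify the call explicitly. map_probes is a hypothetical helper rather than a MetaGx function, and the choice of columns is illustrative.

```r
# Hypothetical helper: map probe IDs to gene identifiers while always
# calling AnnotationDbi's select(), whatever other packages are attached
map_probes <- function(annot_db, probes) {
  AnnotationDbi::select(annot_db,
                        keys    = probes,
                        columns = c("ENTREZID", "SYMBOL"),
                        keytype = "PROBEID")
}

# Usage, run only if the annotation packages are installed
if (requireNamespace("AnnotationDbi", quietly = TRUE) &&
    requireNamespace("hgu133plus2.db", quietly = TRUE)) {
  mapping <- map_probes(hgu133plus2.db::hgu133plus2.db, "1007_s_at")
  print(head(mapping))
}
```

Inside the package itself, the same effect can be obtained with importFrom(AnnotationDbi, select) in NAMESPACE, since imported functions take precedence over anything attached by the user.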
