orcestra's People

Contributors

anthfm, bhaibeka, dependabot[bot], gangeshberi, mattbocc, mnakano

Forkers

mattbocc

orcestra's Issues

Use of ORCESTRA API endpoints breaks backwards compatibility in PharmacoGx?

I just noticed this right now while trying to respond to a user issue for PharmacoGx.

When we migrated the availablePSets and downloadPSet functions to query the ORCESTRA API, we started forcing all users of package versions since that update to use the most recent PharmacoSet objects. This may break backwards compatibility if, for example, we migrate the data in the @molecularProfiles slot to a MultiAssayExperiment. It may even break across Bioconductor releases if the object serialization format in R changes from one release to another (e.g., as was done in version 3.5 -> 3.6), preventing users from loading PharmacoSets without updating their version of R.

Since all the previously released PharmacoSets are available on Zenodo, would it be possible to version the API URL such that we ensure users of previous PharmacoGx releases download the appropriate PharmacoSets? Then we can update the URL for each release to prevent breaking backwards compatibility. That way users get the appropriate version of the PharmacoSet for the BiocManager version.
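A rough sketch of what I mean by versioning the URL, in base R. Everything here is hypothetical: the function name versionedApiUrl, the example.org base URL and the path layout are placeholders for illustration, not the actual ORCESTRA API.

```r
# Hypothetical sketch: build an API URL that is pinned to a Bioconductor release.
# The base URL and path layout are assumptions, not real ORCESTRA endpoints.
versionedApiUrl <- function(biocVersion,
                            baseUrl = "https://example.org/api") {
    # Accept a string or a package_version object, e.g. "3.12"
    version <- as.character(biocVersion)
    # Only allow a plain "major.minor" release string
    stopifnot(length(version) == 1, grepl("^[0-9]+\\.[0-9]+$", version))
    paste0(baseUrl, "/", version, "/psets/available")
}

versionedApiUrl("3.12")
```

Inside the package, the argument could come from BiocManager::version(), so each PharmacoGx release would query the endpoint that matches the Bioconductor release it was built for.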

Let me know your thoughts on this.

Chris

BeatAML SMILES missing

Hello,

I am using the PharmacoGx R package to access the BeatAML dataset. For the other datasets in PharmacoGx there is a SMILES string associated with each compound, but that is not the case for BeatAML. Since the PSets in PharmacoGx are version controlled via ORCESTRA, I am reaching out to you for help. Is there a way to get access to this information?

Incorrect mutation data for gCSI_2017

While investigating an issue opened by a user on GitHub, I discovered a problem with the gCSI_2017 mutation data. It appears that the matrix contains expression values instead of the character strings that summarizeMolecularProfiles needs in order to work:

> molecularProfiles(gCSI_2017, 'mutation')[1:5, 1:5]
         NCI-H358   NCI-H292   NCI-H522   NCI-H650    NCI-H23
ARID1A -0.4137651 -0.4137651 -0.4137651 -0.4137651 -0.4137651
JAK1   -0.2579865 -0.2579865 -0.2579865 -0.2579865 -0.2579865
MSH2   -0.2113687 -0.2113687 -0.2113687 -0.2113687 -0.2113687
MSH6   -0.3179615 -0.3179615 -0.3179615 -0.3179615 -0.3179615
NFE2L2 -0.1223739 -0.1223739 -0.1223739 -0.1223739 -0.1223739

As such the results returned for summarizeMolecularProfiles are nonsensical:

> assay(summarizeMolecularProfiles(gCSI_2017, 'mutation', summary.stat='and'), 1)[1:5, 1:5]
       NCI-H358 NCI-H292 NCI-H522 NCI-H650 NCI-H23
ARID1A "1"      "1"      "1"      "1"      "1"    
JAK1   "1"      "1"      "1"      "1"      "1"    
MSH2   "1"      "1"      "1"      "1"      "1"    
MSH6   "1"      "1"      "1"      "1"      "1"    
NFE2L2 "1"      "1"      "1"      "1"      "1" 

Please see PharmacoGx issue #71 for more information on the user's code and the sessionInfo for their R environment.
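A check along the following lines could catch this when a PSet is built. This is only a sketch in base R: isValidMutationMatrix is a hypothetical helper, and the toy matrices (including the "wt" values) are made up for illustration rather than taken from gCSI_2017.

```r
# Hypothetical helper: a mutation assay should be a character matrix,
# not a numeric one holding expression-like values.
isValidMutationMatrix <- function(mat) {
    is.matrix(mat) && is.character(mat)
}

# Toy data, for illustration only
genes <- c("ARID1A", "JAK1")
cells <- c("NCI-H358", "NCI-H292")
numericAssay <- matrix(c(-0.41, -0.26, -0.41, -0.26), nrow = 2,
                       dimnames = list(genes, cells))
stringAssay <- matrix("wt", nrow = 2, ncol = 2,
                      dimnames = list(genes, cells))

isValidMutationMatrix(numericAssay)  # FALSE
isValidMutationMatrix(stringAssay)   # TRUE
```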

NCI60 RNA data misannotated

I am not sure what happened in the creation of the NCI60 PSet, but the feature info of the RNASeq data is completely misaligned with the row names of the object. For example:

> rowData(molecularProfiles(NCI60)$rnaseq.comp["ERBB2",])
DataFrame with 1 row and 9 columns
              gene_id hugo_symbol entrez_gid    cytoband          gene_name_url
          <character> <character>  <numeric> <character>            <character>
ERBB2 ENSG00000203663       OR2L2      26246        1q44 http://www.genenames..
              entrez_gid_url      genomic_coord_url       gene_description
                 <character>            <character>            <character>
ERBB2 http://www.ncbi.nlm... https://www.ncbi.nlm.. olfactory receptor f..
                 ensembl_tid
                 <character>
ERBB2 ENST00000642011|ENST..
> rowData(molecularProfiles(NCI60)$rnaseq.iso["ERBB2",])
DataFrame with 1 row and 9 columns
              gene_id hugo_symbol entrez_gid    cytoband          gene_name_url
          <character> <character>  <numeric> <character>            <character>
ERBB2 ENSG00000203663       OR2L2      26246        1q44 http://www.genenames..
              entrez_gid_url      genomic_coord_url       gene_description
                 <character>            <character>            <character>
ERBB2 http://www.ncbi.nlm... https://www.ncbi.nlm.. olfactory receptor f..
                 ensembl_tid
                 <character>
ERBB2 ENST00000642011|ENST..

The PSet needs to be fixed.
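To find every affected feature rather than spot-checking single genes, something like the sketch below could be run on the feature table. It assumes the row names are meant to be HUGO symbols matching the hugo_symbol column, as the ERBB2 example suggests. misalignedFeatures is a hypothetical helper and the table is toy data; for the real object the input would be the rowData converted with as.data.frame.

```r
# Hypothetical helper: return row names whose annotated symbol disagrees
# with the row name itself. Assumes row names should equal `symbolColumn`.
misalignedFeatures <- function(featureInfo, symbolColumn = "hugo_symbol") {
    symbols <- as.character(featureInfo[[symbolColumn]])
    ids <- rownames(featureInfo)
    # A missing symbol also counts as misaligned
    ids[is.na(symbols) | symbols != ids]
}

# Toy feature table mimicking the mismatch shown above
toy <- data.frame(
    hugo_symbol = c("OR2L2", "TP53"),
    row.names = c("ERBB2", "TP53"),
    stringsAsFactors = FALSE
)

misalignedFeatures(toy)  # "ERBB2"
```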

Non-UTF byte in CCLE drugInfo

There is a non-UTF-8 byte in drugInfo(CCLE)[4, 2]. That is the 'Compound..brand.name.' column, and I think it is probably a TM symbol. It breaks a bunch of stuff, such as reading in the table as a .csv in Python, and also some R show methods.

We should have a general mechanism to ensure that only valid UTF-8 strings are stored in a PSet. Base R already has a utility for this, called iconv.

We could do something like:

DF$column <- iconv(DF$column, to='UTF-8', sub='')
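Extending the one-liner above to a whole table might look like the following sketch. sanitizeUtf8 is a hypothetical name, and the example data frame is made up. Unlike the one-liner, it passes from='UTF-8' explicitly so the result does not depend on the session locale; that choice is an assumption about what we want.

```r
# Hypothetical helper: drop invalid UTF-8 bytes from every character column.
sanitizeUtf8 <- function(df) {
    isChar <- vapply(df, is.character, logical(1))
    # sub = '' removes any byte that cannot be converted
    df[isChar] <- lapply(df[isChar], iconv,
                         from = "UTF-8", to = "UTF-8", sub = "")
    df
}

# Toy table with one stray non-UTF-8 byte (0xAE) in a brand name
drugs <- data.frame(
    brand = c("Gleevec\xae", "Tarceva"),
    dose = c(1, 2),
    stringsAsFactors = FALSE
)

clean <- sanitizeUtf8(drugs)
all(validUTF8(clean$brand))
```

Factor columns are left alone here; their levels would need the same treatment if they can carry free text.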

submitting without login fails?

Tried to submit GDSC1 with the old array and no filtering, without logging in. Got this error:

Error in Request ProcessCannot read property 'name' of undefined
