bhklab / orcestra
ORCESTRA is a new web application that enables users to search, request, and manage pharmacogenomic datasets (PSets).
License: Apache License 2.0
Similar to #20, changing the type of microarray in GDSC does not change which download options are shown.
I just noticed this while trying to respond to a user issue for PharmacoGx.
When we migrated the availablePSets and downloadPSet functions to query the ORCESTRA API, we started forcing all package users since that update to use the most recent PharmacoSet objects. This may break backwards compatibility if, for example, we migrate the data in the @molecularProfilesSlot to a MultiAssayExperiment. It may even break for new Bioconductor releases if object serialization in R changes from one release to another (e.g., as was done in version 3.5 -> 3.6), preventing users from loading PharmacoSets without updating their version of R.
Since all the previously released PharmacoSets are available on Zenodo, would it be possible to version the API URL so that users of previous PharmacoGx releases download the appropriate PharmacoSets? Then we can update the URL for each release to prevent breaking backwards compatibility. That way users get the appropriate version of the PharmacoSet for their BiocManager version.
Let me know your thoughts on this.
Chris
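One way to picture the proposal above: each PharmacoGx release pins the API version it was built against, and the client builds its request URL from that pin. The sketch below is only an illustration in Python. The /api/v<N>/ path layout, the release-to-version table, and the function name are all hypothetical and are not part of the existing ORCESTRA API.

```python
from urllib.parse import urljoin

# Hypothetical mapping from a Bioconductor release to the API version whose
# PharmacoSets are known to load under that release. Values are placeholders.
API_VERSION_BY_BIOC_RELEASE = {
    "3.10": 1,
    "3.11": 2,
}

BASE_URL = "https://orcestra.ca/"


def available_psets_url(bioc_release: str) -> str:
    """Build the (hypothetical) versioned endpoint for a given release."""
    try:
        api_version = API_VERSION_BY_BIOC_RELEASE[bioc_release]
    except KeyError:
        raise ValueError(f"no API version pinned for Bioconductor {bioc_release}")
    # Assumed path layout: /api/v<N>/psets/available
    return urljoin(BASE_URL, f"api/v{api_version}/psets/available")


print(available_psets_url("3.10"))
```

Failing loudly on an unknown release, instead of falling back to the newest version, is the point of the pin: an old client never silently receives objects it cannot load.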
Hello,
I am using the PharmacoGx R package to access the BeatAML dataset. For the other datasets in PharmacoGx there is a SMILES string associated with each compound, but that is not the case for BeatAML. Since the PSets in PharmacoGx are version controlled via ORCESTRA, I am reaching out to you for help. Is there a way to get access to this information?
If you do not select the filtered checkbox, filtered datasets are still returned.
The mutation data is also incorrectly annotated in BeatAML:
metadata(SE)$annotation needs to be equal to "mutation".
This should be fixed ASAP.
The gCSI dataset does not include any of Marc Hafner's precomputed GR metrics. The pipeline needs to be fixed.
ss1 <- fNames(CCLE, mDataType = "mutation")
ss2 <- featureInfo(CCLE, mDataType = "mutation")[ , "Symbol",drop=TRUE]
table(is.na(ss1) == is.na(ss2))
FALSE TRUE
83 1584
These 2 vectors should be the same
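The same missing-value check can be sketched outside R, which may help when the exported tables are inspected from Python. This is a minimal illustration on toy values, not on the real CCLE data; the function name and the sample vectors are made up for the example.

```python
from typing import List, Optional, Sequence


def na_mismatches(a: Sequence[Optional[str]], b: Sequence[Optional[str]]) -> List[int]:
    """Return the positions where exactly one of the two vectors is missing."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if (x is None) != (y is None)]


# Toy stand-ins for fNames(...) and featureInfo(...)[, "Symbol"]; not real data.
feature_names = ["GENE_A", None, "GENE_C", "GENE_D"]
symbols = ["GENE_A", "GENE_B", "GENE_C", None]

print(na_mismatches(feature_names, symbols))
```

Returning the offending positions, not just a count, makes it easier to see which features disagree.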
Clicking on a dataset on this page takes me to undefined: https://orcestra.ca/Stats
Nine datasets that were curated as part of Roche RAAN Phase I need to be put on ORCESTRA. The data objects (SE) are already uploaded to Zenodo.
Code - https://github.com/bhklab/Clinical-Trial-SE/blob/master/ClinicalTrial_SE_curation.Rmd
Nine curated datasets available as R SummarizedExperiment objects on Zenodo - (See Open Access datasets section of Clinical trial curation documentation)
While investigating an issue opened by a user on GitHub, I discovered a problem with the gCSI_2017 mutation data. It appears that the matrix contains expression values instead of the normal strings needed for summarizeMolecularProfiles to work:
> molecularProfiles(gCSI_2017, 'mutation')[1:5, 1:5]
NCI-H358 NCI-H292 NCI-H522 NCI-H650 NCI-H23
ARID1A -0.4137651 -0.4137651 -0.4137651 -0.4137651 -0.4137651
JAK1 -0.2579865 -0.2579865 -0.2579865 -0.2579865 -0.2579865
MSH2 -0.2113687 -0.2113687 -0.2113687 -0.2113687 -0.2113687
MSH6 -0.3179615 -0.3179615 -0.3179615 -0.3179615 -0.3179615
NFE2L2 -0.1223739 -0.1223739 -0.1223739 -0.1223739 -0.1223739
As such, the results returned for summarizeMolecularProfiles are nonsensical:
> assay(summarizeMolecularProfiles(gCSI_2017, 'mutation', summary.stat='and'), 1)[1:5, 1:5]
NCI-H358 NCI-H292 NCI-H522 NCI-H650 NCI-H23
ARID1A "1" "1" "1" "1" "1"
JAK1 "1" "1" "1" "1" "1"
MSH2 "1" "1" "1" "1" "1"
MSH6 "1" "1" "1" "1" "1"
NFE2L2 "1" "1" "1" "1" "1"
Please see PharmacoGx issue #71 for more information on the user's code and the sessionInfo for their R environment.
downloadPSet tries to convert the molecular profiles to SummarizedExperiments, but fails because they are already SummarizedExperiments. This is fixed by setting the version >=2.
I am not sure what happened in the creation of the NCI60 PSet, but the feature info of the RNASeq data is completely misaligned with the row names of the object. For example:
> rowData(molecularProfiles(NCI60)$rnaseq.comp["ERBB2",])
DataFrame with 1 row and 9 columns
gene_id hugo_symbol entrez_gid cytoband gene_name_url
<character> <character> <numeric> <character> <character>
ERBB2 ENSG00000203663 OR2L2 26246 1q44 http://www.genenames..
entrez_gid_url genomic_coord_url gene_description
<character> <character> <character>
ERBB2 http://www.ncbi.nlm... https://www.ncbi.nlm.. olfactory receptor f..
ensembl_tid
<character>
ERBB2 ENST00000642011|ENST..
> rowData(molecularProfiles(NCI60)$rnaseq.iso["ERBB2",])
DataFrame with 1 row and 9 columns
gene_id hugo_symbol entrez_gid cytoband gene_name_url
<character> <character> <numeric> <character> <character>
ERBB2 ENSG00000203663 OR2L2 26246 1q44 http://www.genenames..
entrez_gid_url genomic_coord_url gene_description
<character> <character> <character>
ERBB2 http://www.ncbi.nlm... https://www.ncbi.nlm.. olfactory receptor f..
ensembl_tid
<character>
ERBB2 ENST00000642011|ENST..
The PSet needs to be fixed.
Yes, it's not very informative, but it is a required column for PharmacoGx to work.
Title is self-explanatory.
Hi @mnakano,
I am getting an invalid SSL certificate when accessing https://orcestra.ca/.
Best,
Chris
There is a non-UTF-8 byte in drugInfo(CCLE)[4, 2]. That is the 'Compound..brand.name.' column; I think it is probably a TM symbol. It breaks a number of things, such as reading the table in as a .csv in Python, as well as some R show methods.
We should have a general mechanism to ensure that only valid UTF-8 strings are stored in a PSet. There is already a utility for this in base called iconv.
We could do something like:
DF$column <- iconv(DF$column, to='UTF-8', sub='')
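For anyone hitting this from the Python side before the PSet is fixed, roughly the same effect as iconv with sub='' can be had by decoding with invalid bytes ignored. This is a minimal sketch, not a proposed change to the pipeline; the function name and the sample value are hypothetical.

```python
def drop_invalid_utf8(raw: bytes) -> str:
    """Decode bytes as UTF-8, discarding any byte sequence that is not valid."""
    return raw.decode("utf-8", errors="ignore")


# Hypothetical cell value: ASCII text followed by 0x99, the Windows-1252
# trademark sign, which is not valid on its own in UTF-8.
cell = b"BrandName\x99"
print(drop_invalid_utf8(cell))  # prints "BrandName"
```

Valid multi-byte characters pass through unchanged; only bytes that cannot be decoded are dropped.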
Tried to submit GDSC1 with the old array and no filtering, without logging in. Got this error:
Error in Request Process
Cannot read property 'name' of undefined
It looks like this API URL is broken: http://www.orcestra.ca/api/psets/available.
As a result, availablePSets() in PharmacoGx is breaking when canonical=FALSE.
Could you please look into this?
Best,
Chris