ropensci / ucscxenatools

:package: An R package for accessing genomics data from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq https://cran.r-project.org/web/packages/UCSCXenaTools/

Home Page: https://docs.ropensci.org/UCSCXenaTools

License: GNU General Public License v3.0

R 86.59% XQuery 11.14% TeX 2.27%
ucsc-xena downloader api-client tcga ccle icgc ucsc toil treehouse r

ucscxenatools's Introduction

UCSCXenaTools is an R package for accessing genomics data from the UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq. Public omics data from UCSC Xena are served through multiple turn-key Xena Hubs, a collection of UCSC-hosted public databases such as TCGA, ICGC, TARGET, GTEx, CCLE, and others. The databases are normalized so they can be combined, linked, filtered, explored and downloaded.

Who is the target audience and what are the scientific applications of this package?

  • Target Audience: cancer and clinical researchers, bioinformaticians
  • Applications: genomic and clinical analyses

Installation

Install the stable release from CRAN with:

install.packages("UCSCXenaTools")

You can also install the development version of UCSCXenaTools from GitHub with:

# install.packages("remotes")
remotes::install_github("ropensci/UCSCXenaTools")

If you want to build the vignettes locally, add two options:

remotes::install_github("ropensci/UCSCXenaTools", build_vignettes = TRUE, dependencies = TRUE)

Data Hub List

All datasets are available at https://xenabrowser.net/datapages/.

Currently, UCSCXenaTools supports multiple UCSC Xena data hubs.

Users can manually update the dataset list to the newest version of UCSC Xena with the XenaDataUpdate() function, then restart R and run library(UCSCXenaTools).

If the URL of a data hub changes or a new data hub comes online, please let me know by email or by opening an issue on GitHub.

Basic usage

Downloading UCSC Xena datasets and loading them into R with UCSCXenaTools is a five-step workflow: generate, filter, query, download and prepare. These steps are implemented by the XenaGenerate, XenaFilter, XenaQuery, XenaDownload and XenaPrepare functions, respectively. The functions are straightforward to use and combine well with other packages such as dplyr.

To show the basic usage of UCSCXenaTools, we will download the clinical data of LUNG, LUAD and LUSC from the TCGA (hg19 version) data hub. Users can learn more about UCSCXenaTools by running browseVignettes("UCSCXenaTools") to read the vignettes.

XenaData data.frame

UCSCXenaTools uses XenaData, a data.frame object built into the package that records information on all datasets of the UCSC Xena Data Hubs, to generate an instance of the XenaHub class.

You can load XenaData after loading UCSCXenaTools into R.

library(UCSCXenaTools)
#> =========================================================================================
#> UCSCXenaTools version 1.4.8
#> Project URL: https://github.com/ropensci/UCSCXenaTools
#> Usages: https://cran.r-project.org/web/packages/UCSCXenaTools/vignettes/USCSXenaTools.html
#> 
#> If you use it in published research, please cite:
#> Wang et al., (2019). The UCSCXenaTools R package: a toolkit for accessing genomics data
#>   from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq.
#>   Journal of Open Source Software, 4(40), 1627, https://doi.org/10.21105/joss.01627
#> =========================================================================================
#>                               --Enjoy it--
data(XenaData)

head(XenaData)
#> # A tibble: 6 × 17
#>   XenaHosts XenaHostNames XenaCohorts XenaDatasets SampleCount DataSubtype Label
#>   <chr>     <chr>         <chr>       <chr>              <int> <chr>       <chr>
#> 1 https://… publicHub     Breast Can… ucsfNeve_pu…          51 gene expre… Neve…
#> 2 https://… publicHub     Breast Can… ucsfNeve_pu…          57 phenotype   Phen…
#> 3 https://… publicHub     Glioma (Ko… kotliarov20…         194 copy number Kotl…
#> 4 https://… publicHub     Glioma (Ko… kotliarov20…         194 phenotype   Phen…
#> 5 https://… publicHub     Lung Cance… weir2007_pu…         383 copy number CGH  
#> 6 https://… publicHub     Lung Cance… weir2007_pu…         383 phenotype   Phen…
#> # ℹ 10 more variables: Type <chr>, AnatomicalOrigin <chr>, SampleType <chr>,
#> #   Tags <chr>, ProbeMap <chr>, LongTitle <chr>, Citation <chr>, Version <chr>,
#> #   Unit <chr>, Platform <chr>

Workflow

Select datasets.

# The options in XenaFilter function support Regular Expression
XenaGenerate(subset = XenaHostNames=="tcgaHub") %>% 
  XenaFilter(filterDatasets = "clinical") %>% 
  XenaFilter(filterDatasets = "LUAD|LUSC|LUNG") -> df_todo

df_todo
#> class: XenaHub 
#> hosts():
#>   https://tcga.xenahubs.net
#> cohorts() (3 total):
#>   TCGA Lung Cancer (LUNG)
#>   TCGA Lung Adenocarcinoma (LUAD)
#>   TCGA Lung Squamous Cell Carcinoma (LUSC)
#> datasets() (3 total):
#>   TCGA.LUNG.sampleMap/LUNG_clinicalMatrix
#>   TCGA.LUAD.sampleMap/LUAD_clinicalMatrix
#>   TCGA.LUSC.sampleMap/LUSC_clinicalMatrix
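
To see why the two chained XenaFilter() calls narrow the selection, the sketch below applies the same two patterns with base R's grepl() to a small hand-made table. The table mock_catalog and its rows are illustrative assumptions standing in for XenaData, and grepl() is only a stand-in here; it is not claimed to be what XenaFilter() does internally.

```r
# Hypothetical miniature stand-in for XenaData (only two of its columns).
mock_catalog <- data.frame(
  XenaHostNames = c("tcgaHub", "tcgaHub", "tcgaHub", "tcgaHub", "publicHub"),
  XenaDatasets = c(
    "TCGA.LUNG.sampleMap/LUNG_clinicalMatrix",
    "TCGA.LUAD.sampleMap/LUAD_clinicalMatrix",
    "TCGA.LUAD.sampleMap/HiSeqV2",
    "TCGA.BRCA.sampleMap/BRCA_clinicalMatrix",
    "ccle/CCLE_copynumber"
  ),
  stringsAsFactors = FALSE
)

# Step 1: keep one hub, mirroring subset = XenaHostNames == "tcgaHub".
tcga <- mock_catalog[mock_catalog$XenaHostNames == "tcgaHub", ]

# Step 2: keep datasets whose name matches the pattern "clinical".
clinical <- tcga[grepl("clinical", tcga$XenaDatasets), ]

# Step 3: "|" is regex alternation, so any one of the three codes matches.
lung <- clinical[grepl("LUAD|LUSC|LUNG", clinical$XenaDatasets), ]

print(lung$XenaDatasets)
```

In this mock table the BRCA clinical matrix passes the first pattern but not the second, and the LUAD expression matrix fails the first, which is the reason for chaining two filters instead of using one.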

Query and download.

XenaQuery(df_todo) %>%
  XenaDownload() -> xe_download
#> This will check url status, please be patient.
#> All downloaded files will under directory /var/folders/gm/lw6z28md2594gcnws_38_9f40000gn/T//RtmpfewSeZ.
#> The 'trans_slash' option is FALSE, keep same directory structure as Xena.
#> Creating directories for datasets...
#> Downloading TCGA.LUNG.sampleMap/LUNG_clinicalMatrix
#> Downloading TCGA.LUAD.sampleMap/LUAD_clinicalMatrix
#> Downloading TCGA.LUSC.sampleMap/LUSC_clinicalMatrix

Prepare data into R for analysis.

cli = XenaPrepare(xe_download)
class(cli)
#> [1] "list"
names(cli)
#> [1] "LUNG_clinicalMatrix" "LUAD_clinicalMatrix" "LUSC_clinicalMatrix"
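
Since the prepared object is a named list, ordinary list tools apply to it. The following is a minimal sketch on made-up data: cli_mock, its column names and its values are assumptions for illustration only, and real clinical matrices are much larger and do not necessarily share columns.

```r
# Hypothetical stand-in for the list returned by XenaPrepare().
cli_mock <- list(
  LUAD_clinicalMatrix = data.frame(sampleID = c("S1", "S2"), age = c(61, 70)),
  LUSC_clinicalMatrix = data.frame(sampleID = c("S3", "S4", "S5"), age = c(58, 66, 73))
)

# Number of rows per table, keyed by dataset name.
n_samples <- vapply(cli_mock, nrow, integer(1))
print(n_samples)

# Pull a single table out by name.
luad <- cli_mock[["LUAD_clinicalMatrix"]]

# Stack tables that share columns, recording the dataset each row came from.
tagged <- Map(function(df, nm) cbind(dataset = nm, df), cli_mock, names(cli_mock))
combined <- do.call(rbind, tagged)
rownames(combined) <- NULL
print(combined)
```

Stacking with rbind() only works when the tables have the same columns; with real clinical matrices it may be necessary to select a common subset of columns first.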

Citation

Please cite the following paper for this package.

Wang et al., (2019). The UCSCXenaTools R package: a toolkit for accessing genomics data
  from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq. 
  Journal of Open Source Software, 4(40), 1627, https://doi.org/10.21105/joss.01627

# For BibTex
  
@article{Wang2019UCSCXenaTools,
    journal = {Journal of Open Source Software},
    doi = {10.21105/joss.01627},
    issn = {2475-9066},
    number = {40},
    publisher = {The Open Journal},
    title = {The UCSCXenaTools R package: a toolkit for accessing genomics data from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq},
    url = {https://dx.doi.org/10.21105/joss.01627},
    volume = {4},
    author = {Wang, Shixiang and Liu, Xuesong},
    pages = {1627},
    date = {2019-08-05},
    year = {2019},
    month = {8},
    day = {5},
}

Please cite the following paper for UCSC Xena.

Goldman, Mary, et al. "The UCSC Xena Platform for cancer genomics data 
    visualization and interpretation." BioRxiv (2019): 326470.

How to contribute

Anyone who wants to contribute should follow these steps:

  • Clone project from GitHub
  • Open UCSCXenaTools.Rproj with RStudio
  • Modify source code
  • Run devtools::check(), and fix all errors, warnings and notes
  • Create a pull request

Acknowledgment

This package is based on XenaR; thanks to Martin Morgan for his work.

ucscxenatools's People

Contributors

shixiangwang

ucscxenatools's Issues

RemoveGermlineCNV does not remove germline CNVs?

Dear author, I have recently been using this tool to download copy number variation data, and I have a question. I ran

AA=getTCGAdata(project = 'OV',GisticCopyNumber = TRUE, Gistic2Threshold = FALSE,
               download = TRUE, RemoveGermlineCNV = FALSE)

AA=getTCGAdata(project = 'OV',GisticCopyNumber = TRUE, Gistic2Threshold = FALSE,
               download = TRUE, RemoveGermlineCNV = TRUE)

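The downloaded data are identical, even though RemoveGermlineCNV is FALSE in one call and TRUE in the other. Are the two results supposed to be the same, or is something else going on? I would appreciate an answer; this question is very important to me. Thank you very much.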

Introduce R actions

From https://github.com/thomasp85/patchwork/blob/master/.github/workflows/R-CMD-check.yaml

on: [push, pull_request]

name: R-CMD-check

jobs:
  R-CMD-check:
    runs-on: ${{ matrix.config.os }}

    name: ${{ matrix.config.os }} (${{ matrix.config.r }})

    strategy:
      fail-fast: false
      matrix:
        config:
        - { os: windows-latest, r: '3.6', args: "--no-manual"}
        - { os: macOS-latest, r: '3.6'}
        - { os: macOS-latest, r: 'devel', args: "--no-manual"}
        - { os: ubuntu-16.04, r: '3.2', cran: "https://demo.rstudiopm.com/all/__linux__/xenial/latest", args: "--no-manual" }
        - { os: ubuntu-16.04, r: '3.3', cran: "https://demo.rstudiopm.com/all/__linux__/xenial/latest", args: "--no-manual" }
        - { os: ubuntu-16.04, r: '3.4', cran: "https://demo.rstudiopm.com/all/__linux__/xenial/latest", args: "--no-manual" }
        - { os: ubuntu-16.04, r: '3.5', cran: "https://demo.rstudiopm.com/all/__linux__/xenial/latest", args: "--no-manual" }
        - { os: ubuntu-16.04, r: '3.6', cran: "https://demo.rstudiopm.com/all/__linux__/xenial/latest", args: "--no-manual" }

    env:
      R_REMOTES_NO_ERRORS_FROM_WARNINGS: true
      CRAN: ${{ matrix.config.cran }}

    steps:
      - uses: actions/checkout@v1

      - uses: r-lib/actions/setup-r@master
        with:
          r-version: ${{ matrix.config.r }}

      - uses: r-lib/actions/setup-pandoc@master

      - uses: r-lib/actions/setup-tinytex@master
        if: contains(matrix.config.args, 'no-manual') == false

      - name: Cache R packages
        uses: actions/cache@v1
        with:
          path: ${{ env.R_LIBS_USER }}
          key: ${{ runner.os }}-r-${{ matrix.config.r }}-${{ hashFiles('DESCRIPTION') }}

      - name: Install system dependencies
        if: runner.os == 'Linux'
        env:
          RHUB_PLATFORM: linux-x86_64-ubuntu-gcc
        run: |
          Rscript -e "install.packages('remotes')" -e "remotes::install_github('r-hub/sysreqs')"
          sysreqs=$(Rscript -e "cat(sysreqs::sysreq_commands('DESCRIPTION'))")
          sudo -s eval "$sysreqs"
      - name: Install dependencies
        run: Rscript -e "install.packages('remotes')" -e "remotes::install_deps(dependencies = TRUE)" -e "remotes::install_cran('rcmdcheck')"

      - name: Check
        run: Rscript -e "rcmdcheck::rcmdcheck(args = '${{ matrix.config.args }}', error_on = 'warning', check_dir = 'check')"

      - name: Upload check results
        if: failure()
        uses: actions/upload-artifact@master
        with:
          name: ${{ runner.os }}-r${{ matrix.config.r }}-results
          path: check

      - name: Test coverage
        if: matrix.config.os == 'macOS-latest' && matrix.config.r == '3.6'
        run: |
          Rscript -e 'remotes::install_github("r-lib/covr@gh-actions")'
          Rscript -e 'covr::codecov(token = "${{secrets.CODECOV_TOKEN}}")'

treehouse update fails

Hi Shixiang,

Thank you for developing the package! It is very easy to use. However, I failed to update the dataset of treehouse. See below.
Thanks!

> packageVersion("UCSCXenaTools")
[1] ‘1.3.1’
> XenaDataUpdate()
=> Obtaining info from UCSC Xena hubs...
==> Searching cohorts for host https://ucscpublic.xenahubs.net...
==> Trying #1
===> #37 cohorts found.
===> Querying datasets info...
===> #114 datasets found.
...
==> Searching cohorts for host https://xena.treehouse.gi.ucsc.edu...
==> Trying #1
==> Trying #2
==> Trying #3
Error in value[[3L]](cond) : 
  Tried 3 times but failed, please check URL or your internet connection!

more detail in NEWS.md?

hi, thanks for including changes in your NEWS file https://github.com/ropensci/UCSCXenaTools/blob/master/NEWS.md

I wonder if you could include some details of what was done in each release? For example, instead of just

* #14 fixed

Include some details of what was done so users can quickly get a sense for the changes that were made

* fixed wrong url in the vignette (#14)

and having the issue number in parens will link to the issue on github

can not access the GDC dataset

> XenaGenerate(subset = XenaHostNames=="gdcHub") %>% 
+   XenaFilter(filterDatasets = "methylation|phenotype") %>% 
+   XenaFilter(filterDatasets = "UCS") -> df_todo
> XenaQuery(df_todo) %>%
+   XenaDownload() -> xe_download
This will check url status, please be patient.
All downloaded files will under directory /tmp/RtmpciLecI.
The 'trans_slash' option is FALSE, keep same directory structure as Xena.
Creating directories for datasets...
'/tmp/RtmpciLecI/TCGA-UCS/Xena_Matrices' already exists'/tmp/RtmpciLecI/TCGA-UCS/Xena_Matrices' already exists/tmp/RtmpciLecI/TCGA-UCS/Xena_Matrices/TCGA-UCS.GDC_phenotype.tsv.gz, the file has been download!
/tmp/RtmpciLecI/TCGA-UCS/Xena_Matrices/TCGA-UCS.methylation450.tsv.gz, the file has been download!


Cannot query copy number data

This works in xenaPython

hub = "https://tcga.xenahubs.net" 
dataset = "TCGA.PANCAN.sampleMap/Gistic2_CopyNumber_Gistic2_all_data_by_genes"   
samples = ["TCGA-02-0047-01","TCGA-02-0055-01","TCGA-02-2483-01","TCGA-02-2485-01"] 
In [15]: xena.dataset_fetch(hub, dataset, samples, ["TP53"])                                                                         
Out[15]: [[-0.012, -0.323, -0.033, -0.025]]

In [16]: xena.dataset_probe_values(hub, dataset, samples, ["TP53"])                                                                  
Out[16]: [None, [[-0.012, -0.323, -0.033, -0.025]]]

issue: couldn't download pancanAtlas data

Hi authors,
I tried to download the pancancerAtlas dataset through UCSCXenaTools, but it failed. The code is pasted below. I also tried the URL shown in the code output, but it doesn't give me proper data. Could you help me with it? Thank you!

> pcA_cohort = XenaData %>% 
+     filter(XenaHostNames == "pancanAtlasHub") # select pancanAtlas Hub
> cli_query = pcA_cohort %>% 
+     filter(DataSubtype == "gene expression RNAseq") %>%  # select RNAseq data
+     XenaGenerate() %>%  # generate a XenaHub object
+     XenaQuery() %>% 
+     XenaDownload()
This will check url status, please be patient.
All downloaded files will under directory /var/folders/k2/zhwq4hld003_vbl84g1qvxcr0000gn/T//RtmpAjrRSW.
The 'trans_slash' option is FALSE, keep same directory structure as Xena.
Creating directories for datasets...
Downloading EB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena.gz
trying URL 'https://pancanatlas.xenahubs.net/download/EB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena.gz'
==> Trying #2
trying URL 'https://pancanatlas.xenahubs.net/download/EB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena.gz'
==> Trying #3
trying URL 'https://pancanatlas.xenahubs.net/download/EB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena.gz'
Can not find fileEB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena.gz, this file maybe not compressed.
Try downloading fileEB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena...
trying URL 'https://pancanatlas.xenahubs.net/download/EB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena'
==> Trying #2
trying URL 'https://pancanatlas.xenahubs.net/download/EB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena'
==> Trying #3
trying URL 'https://pancanatlas.xenahubs.net/download/EB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena'
Your network is bad (try again) or the data source is invalid (report to the developer).
Warning messages:
1: In download.file(url, destfile, ...) :
  cannot open URL 'https://tcga-pancan-atlas-hub.s3.us-east-1.amazonaws.com:443/download/EB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena.gz': HTTP status was '403 Forbidden'
2: In download.file(url, destfile, ...) :
  cannot open URL 'https://tcga-pancan-atlas-hub.s3.us-east-1.amazonaws.com:443/download/EB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena.gz': HTTP status was '403 Forbidden'
3: In download.file(url, destfile, ...) :
  cannot open URL 'https://tcga-pancan-atlas-hub.s3.us-east-1.amazonaws.com:443/download/EB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena.gz': HTTP status was '403 Forbidden'
4: In download.file(url, destfile, ...) :
  cannot open URL 'https://tcga-pancan-atlas-hub.s3.us-east-1.amazonaws.com:443/download/EB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena': HTTP status was '403 Forbidden'
5: In download.file(url, destfile, ...) :
  cannot open URL 'https://tcga-pancan-atlas-hub.s3.us-east-1.amazonaws.com:443/download/EB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena': HTTP status was '403 Forbidden'
6: In download.file(url, destfile, ...) :
  cannot open URL 'https://tcga-pancan-atlas-hub.s3.us-east-1.amazonaws.com:443/download/EB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena': HTTP status was '403 Forbidden'

Explanation of terms

First of all thank you for developing this package. I am new to clinical analysis and this package and the vignettes / examples were a good start.

Regarding the survival analysis vignette, I have been trying to find a resource that explains or maps the variable names in the clinical data table to those used by the studies and Xena. For example, the term OS.time doesn't show up in my searches of either the Xena portal or the Pan-Cancer Atlas. I am assuming it means time to remission, but that is just a guess. Is there a metadata table that explains what OS.time (and other terms) mean?

cheers,
António

Add ProbeMap download

Some datasets have probe maps used to convert between various IDs; XenaQuery() could support downloading them.

API function for querying single gene or sample does not work

Take .p_dataset_probe_values and .p_dataset_gene_probe_avg as examples.

library(UCSCXenaTools)
hub = "https://pancanatlas.xenahubs.net"
dataset = "EB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena"
samples = c("TCGA-02-0047-01","TCGA-02-0055-01")
probes =c("TP53", "RB1")

Works for multiple samples and probes:

> .p_dataset_probe_values(hub, dataset, samples, probes)
[[1]]
  strand chromend chromstart chrom
1      + 49056122   48877911 chr13
2      -  7590868    7565097 chr17

[[2]]
      [,1]  [,2]
[1,] 10.84  9.96
[2,] 11.22 10.15

> .p_dataset_gene_probe_avg(hub, dataset, samples, probes) 
  gene                     position       scores
1 TP53   -, 7590868, 7565097, chr17  10.84, 9.96
2  RB1 +, 49056122, 48877911, chr13 11.22, 10.15

Does not work for single sample:

> .p_dataset_probe_values(hub, dataset, "TCGA-02-0055-01", probes)
[[1]]
  strand chromend chromstart chrom
1      + 49056122   48877911 chr13
2      -  7590868    7565097 chr17

[[2]]
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15]
[1,]  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN   NaN   NaN   NaN   NaN   NaN   NaN
[2,]  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN   NaN   NaN   NaN   NaN   NaN   NaN

  gene                     position                                                                    scores
1 TP53   -, 7590868, 7565097, chr17 NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN
2  RB1 +, 49056122, 48877911, chr13 NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN

Does not work for single probe (like gene):

> .p_dataset_probe_values(hub, dataset, samples, "TP53")
 Error in UCSCXenaTools:::.xena_post(host, UCSCXenaTools:::.call(xquery,  : 
  Internal Server Error (HTTP 500). 
> .p_dataset_gene_probe_avg(hub, dataset, samples, "TP53") 
 Error in UCSCXenaTools:::.xena_post(host, UCSCXenaTools:::.call(xquery,  : 
  Internal Server Error (HTTP 500). 

Interestingly, .p_dataset_gene_probes_values works for a single gene, but not for a single sample:

> .p_dataset_gene_probes_values(hub, dataset, samples, "TP53")
[[1]]
[[1]]$position
  strand chromend chromstart chrom
1      -  7590868    7565097 chr17

[[1]]$name
[1] "TP53"


[[2]]
      [,1] [,2]
[1,] 10.84 9.96

> .p_dataset_gene_probes_values(hub, dataset, "TCGA-02-0047-01", "TP53")
[[1]]
[[1]]$position
  strand chromend chromstart chrom
1      -  7590868    7565097 chr17

[[1]]$name
[1] "TP53"


[[2]]
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15]
[1,]  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN   NaN   NaN   NaN   NaN   NaN   NaN

Function naming strategy

Comments from ropensci/software-review#315

  • Argument naming is not consistent - for example, the fetch_ API functions are in snake_case but a number of other functions are camel-cased with the first letter capitalized (i.e. XenaScan) or regular camel-cased (getTCGAdata). It would be helpful to adopt a similar casing style even if snake_case doesn't work because of consistency with other tools for these data sets.

This issue should be fixed in the next incompatible version.

Interface for fetching sparse data

.p_sparse_data("https://ucscpublic.xenahubs.net", "ccle/CCLE_DepMap_18Q2_maf_20180502",
               samples = list("HCE4_OESOPHAGUS", "NCIH2818_PLEURA"), genes = list("TP53"))
.p_sparse_data_examples("https://ucscpublic.xenahubs.net", "ccle/CCLE_DepMap_18Q2_maf_20180502", 2)
UCSCXenaTools::fetch_dataset_samples("https://ucscpublic.xenahubs.net", "ccle/CCLE_DepMap_18Q2_maf_20180502")

Write a fetch_sparse_value function.

Error downloading CCLE datasets from publicHub

Hi,

I'm trying to download CCLE files, but I get the file missing message:

cannot open URL 'https://ucscpublic.xenahubs.net/download/ccle/CCLE_copynumber_2013-12-03.seg.txt': HTTP status was '404 Not Found'

The code I use:

mysets <- XenaGenerate(subset = XenaHostNames=="publicHub") %>%
    XenaFilter(filterCohorts = "CCLE")
XenaQuery(mysets) %>%
    XenaDownload() -> ccle_download

If I try the same with MAGIC datasets, it works fine.

downloadTCGA reports an error when downloading

downloadTCGA(project = "OV", data_type = "Phenotype", file_type = "Clinical Information", destdir = tempdir())
This will check url status, please be patient.
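Error: Evaluation error: An unknown option was passed in to libcurl.

Hello Shixiang, I ran into the problem above while using UCSCXenaTools and cannot download the data. I also have a small question: what is the difference between the GDC TCGA data and the TCGA data, and why does Xena host this data twice? Thank you for your help!

--- Zhiqiang, Institute of Computing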

Remove warning

In dir.create(i, recursive = TRUE) : 'data/Xena' already exists
Error: Unable to establish connection with R session

API functions exposed by xenaPython

from . import xenaQuery as xena

def Gene_values (hub, dataset, samples, gene):
    values = xena.dataset_gene_values (hub, dataset, samples, [gene])
    return values[0]["scores"][0]

def Genes_values (hub, dataset, samples, genes):
    values = [x["scores"][0] for x in xena.dataset_gene_values (hub, dataset, samples, genes)]
    return values

def Probe_values (hub, dataset, samples, probe):
    values = xena.dataset_probe_values (hub, dataset, samples, [probe])
    return values[0]

def Probes_values (hub, dataset, samples, probes):
    values = xena.dataset_probe_values (hub, dataset, samples, probes)
    return values

def dataset_samples (hub,dataset):
    return xena.dataset_samples(hub, dataset)

def dataset_fields (hub, dataset):
    return xena.dataset_field (hub, dataset)

def all_cohorts(hub):
    return xena.all_cohorts(hub)

New feature: XenaExperiment?

This comes from the to-do list of XenaR; maybe I can implement it.

XenaExperiment() to represent a collection of datasets from XenaHub(), subset to contain specific samples and features.

Basic data retrieval of all or part of the assays present in a XenaExperiment.

bad option check in fetch_dense_values

> host = "https://toil.xenahubs.net"
> dataset = "tcga_RSEM_gene_tpm"
> samples = c("TCGA-02-0047-01","TCGA-02-0055-01","TCGA-02-2483-01","TCGA-02-2485-01")
> probes = c('ENSG00000282740.1', 'ENSG00000000005.5', 'ENSG00000000419.12')
> genes =c("TP53", "RB1", "PIK3CA")
> fetch_dense_values(host, dataset, genes, samples, check = TRUE, use_probeMap = TRUE)
Checking identifiers...
The following identifiers have been removed fro host https://toil.xenahubs.net dataset tcga_RSEM_gene_tpm
[1] NA NA NA
Done.
Checking samples...
Done.
Checking if the dataset has probeMap...
Done. ProbeMap is found.
Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent

Missing "OS", "OS.time", "OS.unit", "RFS", "RFS.time", "RFS.unit" columns in the clinical information file downloaded from TCGA

Hello,

I'm following this tutorial "TCGA Pan-cancer data download" (https://xsliulab.github.io/tumor-immunogenicity-score/#data-download-and-preprocessing) to download and clean TCGA clinical data.
However, columns such as "OS", "OS.time", "OS.unit", "RFS", "RFS.time" and "RFS.unit" are expected but absent from the clinical information files.
Is this due to updates of the "UCSCXenaTools" package?

Best,
Danshu

CRAN checks

Dear maintainer,

Please see the problems shown on
https://cran.r-project.org/web/checks/check_results_UCSCXenaTools.html.

Please correct before 2021-07-24 to safely retain your package on CRAN.

It seems we need to remind you of the CRAN policy:

'Packages which use Internet resources should fail gracefully with an informative message
if the resource is not available or has changed (and not give a check warning nor error).'

This needs correction whether or not the resource recovers.

The CRAN Team
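
One common way to meet the policy quoted above is to wrap network calls so that a failure becomes a message and an empty result instead of an error or warning. The sketch below is only an illustration of that idea: fetch_gracefully, broken_fetch and working_fetch are hypothetical names that are not part of UCSCXenaTools, and the fetchers are simulated so the example runs offline.

```r
# Hypothetical helper: run a fetching function and turn any error or warning
# into an informative message plus a NULL result.
fetch_gracefully <- function(fetch_fun, ...) {
  tryCatch(
    fetch_fun(...),
    error = function(e) {
      message("Resource unavailable: ", conditionMessage(e))
      NULL
    },
    warning = function(w) {
      message("Resource problem: ", conditionMessage(w))
      NULL
    }
  )
}

# Simulated fetchers (assumptions for illustration, no network access).
broken_fetch <- function(url) stop("cannot open URL '", url, "'")
working_fetch <- function(url) data.frame(url = url, ok = TRUE)

res_bad <- fetch_gracefully(broken_fetch, "https://example.invalid/data.tsv")
res_ok <- fetch_gracefully(working_fetch, "https://example.invalid/data.tsv")

# Callers check for NULL and stop early instead of failing.
if (is.null(res_bad)) message("Skipping the remaining steps.")
```

Examples, vignettes and tests that depend on a remote hub can then return early when the result is NULL, so a hub outage would not surface as a check warning or error.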
