gamblr.utils's Introduction

GAMBLR.utils

A collection of functions that make up the core functionality of GAMBLR.

gamblr.utils's People

Contributors

vladimirsouza, kdreval, mattssca, rdmorin

gamblr.utils's Issues

Bug in `cleanup_maf`

> clean_maf = cleanup_maf(maf_df = GAMBLR.data::sample_data$grch37$maf)
Error in `mutate()`:
ℹ In argument: `EXON = gsub("/.+", "", EXON)`.
Caused by error in `h()`:
! error in evaluating the argument 'x' in selecting a method for function 'gsub': object 'EXON' not found

The input MAF table may not have an EXON column.

> maf <- GAMBLR.results::get_coding_ssm()
> any( names(maf) == "EXON" )
[1] FALSE

In the cleanup_maf code:

> mutate(maf_df,EXON = gsub("/.+", "", EXON))
Error in `mutate()`:
ℹ In argument: `EXON = gsub("/.+", "", EXON)`.
Caused by error in `is.factor()`:
! object 'EXON' not found
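
One way to make this step tolerant of a missing column is to guard it. The following is a minimal sketch in base R, assuming the desired behavior is simply to skip the cleanup when EXON is absent; `strip_exon_total` is a hypothetical helper name, not a function in GAMBLR.utils, and the toy data frames are made up for illustration.

```r
# Hypothetical helper (not part of GAMBLR.utils): strip the "/total" suffix
# from EXON values such as "2/3", but only when the column is present.
strip_exon_total <- function(maf_df) {
  if ("EXON" %in% names(maf_df)) {
    maf_df$EXON <- gsub("/.+", "", maf_df$EXON)
  }
  maf_df
}

# Toy inputs, made up for illustration.
with_exon <- data.frame(
  Hugo_Symbol = c("MYC", "EZH2"),
  EXON = c("2/3", "16/20"),
  stringsAsFactors = FALSE
)
without_exon <- data.frame(
  Hugo_Symbol = c("MYC", "EZH2"),
  stringsAsFactors = FALSE
)

strip_exon_total(with_exon)$EXON       # "2" "16"
names(strip_exon_total(without_exon))  # "Hugo_Symbol" (no error, no new column)
```

The same guard could be expressed inside a dplyr pipeline; base R is used here only to keep the sketch free of dependencies.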

drop maftools dependency

maftools is used as a dependency in two functions:

  • genome_to_exome, as a supported input format for the incoming maf. As we are moving away from maftools, we will discontinue support for this input format.
  • sanitize_maf_data, as a means to create an onco matrix. This could be swapped for the function in the helpers child package.

This will significantly reduce the dependency burden on installing this package.

drop ComplexHeatmap dependency

ComplexHeatmap is only used to generate an optional output in cnvKompare. The heatmap is not clustered, so it can be replaced with a ggplot2 implementation to decrease the number of dependencies.
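
As a rough illustration of what such a replacement could look like, here is a minimal sketch of an unclustered heatmap built with ggplot2's geom_tile(). The long-format table and its column names (`sample`, `gene`, `CN`) are assumptions made up for this example; they are not taken from the cnvKompare code.

```r
library(ggplot2)

# Toy long-format table, made up for illustration: one row per sample/gene pair.
cn_long <- expand.grid(
  sample = c("sample_A", "sample_B"),
  gene = c("EZH2", "TP53", "MYC"),
  stringsAsFactors = FALSE
)
cn_long$CN <- c(2, 3, 1, 2, 4, 2)

# geom_tile() draws one cell per row. Axis order follows the discrete scale
# (alphabetical for character columns), so no clustering is applied.
p <- ggplot(cn_long, aes(x = gene, y = sample, fill = CN)) +
  geom_tile(color = "white") +
  scale_fill_gradient2(low = "blue", mid = "white", high = "red", midpoint = 2) +
  theme_minimal()
```

Printing `p` renders the plot. Setting the midpoint of the diverging scale to 2 assumes a diploid baseline.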

Bug in count_ssm_by_region

The example in the vignette no longer runs after the migration. Besides the highlighted function, GAMBLR.helpers::handle_ssm_by_region is also affected. Self-assigning this issue. Posted for tracking.

Example:
#define a region.
my_region = gene_to_region(gene_symbol = "MYC", return_as = "region")

#subset metadata.
my_metadata = get_gambl_metadata() %>%
  dplyr::filter(pathology == "FL")

#count SSMs for the selected sample subset and defined region.
fl_ssm_counts_myc = count_ssm_by_region(region = my_region, these_samples_metadata = my_metadata)
Error in vctrs::vec_size_common(string = string, pattern = pattern, replacement = replacement,  : 
  argument "this_maf" is missing, with no default

Improvements and generalization to liftover_bedpe

Currently, this is the only option for GAMBLR to lift coordinates between the two projections (grch37 and hg38). The function is heavily restricted by the single data type it accepts (bedpe).

As discussed in Slack, this function should be improved to accept additional data types (e.g. maf, bed, seg). Potentially, the function could also be improved to allow the liftover of strings with genomic coordinates (i.e. not restricting this function to operate solely on data frames with multiple regions).

This should be doable without the need to bundle any additional data sets since the liftover chains are already available in GAMBLR.data.

Lastly, this function could likely also be improved in other ways, such as added flexibility for input data column names. As reported here:

This function seems to have some quirks that could be ironed out, too. If I include any columns other than CHROM, START, END, STRAND for A and B, I get this error:

Error in .local(x, ...) : strand values must be in '+' '-' '*'

with STRAND specified as .
Then, if I change STRAND to + I get

Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : 
  scan() expected 'a real', got '+' 

So I can only run it with a bedpe file with the 8 columns for CHROM, START, END, and STRAND, and STRAND has to be populated with .
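
To illustrate the idea of accepting coordinate strings, here is a minimal sketch in base R that turns a single region string into a one-row data frame, which a data-frame-based function could then consume. It does not perform any liftover. `region_to_df` and the output column names are hypothetical and not part of GAMBLR.utils.

```r
# Hypothetical helper (not part of GAMBLR.utils): split a region string of the
# form "chrom:start-end" into a one-row data frame.
region_to_df <- function(region) {
  pattern <- "^([^:]+):([0-9,]+)-([0-9,]+)$"
  if (!grepl(pattern, region)) {
    stop("Region must look like 'chrom:start-end', got: ", region)
  }
  # Allow thousands separators in the coordinates, e.g. "1,000".
  to_int <- function(x) as.integer(gsub(",", "", x))
  data.frame(
    chrom = sub(pattern, "\\1", region),
    start = to_int(sub(pattern, "\\2", region)),
    end = to_int(sub(pattern, "\\3", region)),
    stringsAsFactors = FALSE
  )
}

# Toy coordinates, made up for illustration.
region_to_df("chr8:1,000-2000")
```

Validating the string up front gives a readable error message instead of a failure deep inside the liftover machinery.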

drop plyr dependency

There is only one place where plyr is used, and it is to round to the nearest 5. We can avoid requiring this dependency by using base round and specifying a base of 5, like so:

base*round(x/base)

# examples
> 5*round(94.3/5)
[1] 95
> 5*round(92.3/5)
[1] 90

This will help to decrease the dependency load.
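
The expression above can be wrapped in a small vectorized helper. This is a minimal sketch; `round_to_base` is a hypothetical name, not an existing function in GAMBLR.utils.

```r
# Hypothetical helper: round x to the nearest multiple of `base` using only
# base R, with no plyr dependency.
round_to_base <- function(x, base = 5) {
  base * round(x / base)
}

round_to_base(c(94.3, 92.3, 97.6))  # 95 90 100
```

Exact ties (for example 12.5 with a base of 5) follow the behavior of base R's round(), which rounds half to even.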

Input parameters missing in `cnvKompare` Examples

> cnvKompare(patient_id = "13-26835",
+            genes_of_interest = c("EZH2",
+                                  "TP53",
+                                  "MYC",
+                                  "CREBBP",
+                                  "GNA13"),
+            projection = "hg38",
+            show_x_labels = FALSE)
Using bundled metadata from GAMBLR.data
Found 4 samples for patient 13-26835 ...
You did not provide path to seg file or segments in data frame.
You can obtain the seg data by using GAMBLR.results::get_sample_cn_segments.
Error in cnvKompare(patient_id = "13-26835", genes_of_interest = c("EZH2",  : 
  Please provide the seg data or retreive the CNV data.

Does not install from current master

There is probably some missing comma or a bracket somewhere. This is what I get when installing from the current master:

✔  checking DESCRIPTION meta-information ...
   Warning: /tmp/Rtmpnv2QOI/Rbuild18ee466cb886f/GAMBLR.utils/man/adjust_ploidy.Rd:28: unknown macro '\item'
   Warning: /tmp/Rtmpnv2QOI/Rbuild18ee466cb886f/GAMBLR.utils/man/adjust_ploidy.Rd:30: unknown macro '\item'
   Warning: /tmp/Rtmpnv2QOI/Rbuild18ee466cb886f/GAMBLR.utils/man/adjust_ploidy.Rd:32: unexpected section header '\value'
   Warning: /tmp/Rtmpnv2QOI/Rbuild18ee466cb886f/GAMBLR.utils/man/adjust_ploidy.Rd:35: unexpected section header '\description'
   Warning: /tmp/Rtmpnv2QOI/Rbuild18ee466cb886f/GAMBLR.utils/man/adjust_ploidy.Rd:38: unexpected section header '\details'
   Warning: /tmp/Rtmpnv2QOI/Rbuild18ee466cb886f/GAMBLR.utils/man/adjust_ploidy.Rd:44: unexpected section header '\examples'
   Warning: /tmp/Rtmpnv2QOI/Rbuild18ee466cb886f/GAMBLR.utils/man/adjust_ploidy.Rd:55: unexpected END_OF_INPUT '
   '
   Warning: /tmp/Rtmpnv2QOI/Rbuild18ee466cb886f/GAMBLR.utils/man/cnvKompare.Rd:45: unknown macro '\item'
   Warning: /tmp/Rtmpnv2QOI/Rbuild18ee466cb886f/GAMBLR.utils/man/cnvKompare.Rd:47: unknown macro '\item'
   Warning: /tmp/Rtmpnv2QOI/Rbuild18ee466cb886f/GAMBLR.utils/man/cnvKompare.Rd:49: unknown macro '\item'
   Warning: /tmp/Rtmpnv2QOI/Rbuild18ee466cb886f/GAMBLR.utils/man/cnvKompare.Rd:51: unknown macro '\item'
   Warning: /tmp/Rtmpnv2QOI/Rbuild18ee466cb886f/GAMBLR.utils/man/cnvKompare.Rd:53: unexpected section header '\value'
   Warning: /tmp/Rtmpnv2QOI/Rbuild18ee466cb886f/GAMBLR.utils/man/cnvKompare.Rd:56: unexpected section header '\description'
   Warning: /tmp/Rtmpnv2QOI/Rbuild18ee466cb886f/GAMBLR.utils/man/cnvKompare.Rd:59: unexpected section header '\details'
   Warning: /tmp/Rtmpnv2QOI/Rbuild18ee466cb886f/GAMBLR.utils/man/cnvKompare.Rd:68: unexpected section header '\examples'
   Warning: /tmp/Rtmpnv2QOI/Rbuild18ee466cb886f/GAMBLR.utils/man/cnvKompare.Rd:79: unexpected END_OF_INPUT '
   '
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
   Omitted ‘LazyData’ from DESCRIPTION
─  building ‘GAMBLR.utils_0.1.0.tar.gz’
   
Running /gsc/software/linux-x86_64-centos7/R-4.1.3/lib64/R/bin/R CMD INSTALL \
  /tmp/RtmpP2SDPz/GAMBLR.utils_0.1.0.tar.gz --install-tests 
* installing to library ‘/home/kdreval/R/x86_64-pc-linux-gnu-library/4.1’
* installing *source* package ‘GAMBLR.utils’ ...
** using staged installation
** R
** byte-compile and prepare package for lazy loading
** help
Warning: /tmp/RtmpSql4pn/R.INSTALL1904d4885504a/GAMBLR.utils/man/adjust_ploidy.Rd:28: unknown macro '\item'
Warning: /tmp/RtmpSql4pn/R.INSTALL1904d4885504a/GAMBLR.utils/man/adjust_ploidy.Rd:30: unknown macro '\item'
Warning: /tmp/RtmpSql4pn/R.INSTALL1904d4885504a/GAMBLR.utils/man/adjust_ploidy.Rd:32: unexpected section header '\value'
Warning: /tmp/RtmpSql4pn/R.INSTALL1904d4885504a/GAMBLR.utils/man/adjust_ploidy.Rd:35: unexpected section header '\description'
Warning: /tmp/RtmpSql4pn/R.INSTALL1904d4885504a/GAMBLR.utils/man/adjust_ploidy.Rd:38: unexpected section header '\details'
Warning: /tmp/RtmpSql4pn/R.INSTALL1904d4885504a/GAMBLR.utils/man/adjust_ploidy.Rd:44: unexpected section header '\examples'
Warning: /tmp/RtmpSql4pn/R.INSTALL1904d4885504a/GAMBLR.utils/man/adjust_ploidy.Rd:55: unexpected END_OF_INPUT '
'
Warning: /tmp/RtmpSql4pn/R.INSTALL1904d4885504a/GAMBLR.utils/man/cnvKompare.Rd:45: unknown macro '\item'
Warning: /tmp/RtmpSql4pn/R.INSTALL1904d4885504a/GAMBLR.utils/man/cnvKompare.Rd:47: unknown macro '\item'
Warning: /tmp/RtmpSql4pn/R.INSTALL1904d4885504a/GAMBLR.utils/man/cnvKompare.Rd:49: unknown macro '\item'
Warning: /tmp/RtmpSql4pn/R.INSTALL1904d4885504a/GAMBLR.utils/man/cnvKompare.Rd:51: unknown macro '\item'
Warning: /tmp/RtmpSql4pn/R.INSTALL1904d4885504a/GAMBLR.utils/man/cnvKompare.Rd:53: unexpected section header '\value'
Warning: /tmp/RtmpSql4pn/R.INSTALL1904d4885504a/GAMBLR.utils/man/cnvKompare.Rd:56: unexpected section header '\description'
Warning: /tmp/RtmpSql4pn/R.INSTALL1904d4885504a/GAMBLR.utils/man/cnvKompare.Rd:59: unexpected section header '\details'
Warning: /tmp/RtmpSql4pn/R.INSTALL1904d4885504a/GAMBLR.utils/man/cnvKompare.Rd:68: unexpected section header '\examples'
Warning: /tmp/RtmpSql4pn/R.INSTALL1904d4885504a/GAMBLR.utils/man/cnvKompare.Rd:79: unexpected END_OF_INPUT '

Conflicting function names

When loading GAMBLR.utils I get a warning message about conflicting function names. This can be resolved by specifying the Import field in the function documentation.

The conflicting packages are IRanges and stats, and the affected functions are start and end.

Without looking into this much further, I assume that any function that imports IRanges (e.g. annotate_ssm_motif_ccontext) should have its @rawNamespace import updated to also exclude start and end, given that this function does not use them from IRanges.

@rawNamespace import(IRanges, except = c("start", "end", "merge", "shift", "collapse", "union", "slice", "intersect", "setdiff", "desc", "reduce")) 

The warning message:

> library(GAMBLR.utils)
Warning messages:
1: replacing previous import ‘IRanges::end’ by ‘stats::end’ when loading ‘GAMBLR.utils’ 
2: replacing previous import ‘IRanges::start’ by ‘stats::start’ when loading ‘GAMBLR.utils’ 

It's crucial to resolve warnings like these, since they will cause the build check run by GitHub Actions to fail.

`get_manta_sv` is used in `annotate_igh_breakpoints` examples

get_manta_sv is a function restricted to GSC users (from the GAMBLR.results package). I think we could keep this example, but it would be great if we also included another example for non-GSC users.

Current example:

manta_sv = get_manta_sv(verbose = FALSE)
all_annotated = annotate_sv(sv_data = manta_sv)
ig_annotated = annotate_igh_breakpoints(all_annotated)
