caravagnalab / mobster
Model-based subclonal deconvolution from bulk sequencing.
Home Page: https://caravagnalab.github.io/mobster/
License: GNU General Public License v3.0
Hi,
This line in selection2clonenested() throws an error that the time variable is missing. Is it a typo, and should it be time1?
Line 149 in b824fe0
Best regards,
Paweł
After moving the repo to caravagnalab, we need to create the GitHub page again, and ideally also set up a redirect from the old URL https://caravagn.github.io/mobster to the new one.
I can't seem to find what the "d" and "popsize" are indicating in the input examples.
Could someone clarify?
Hi,
I came across the following error. Do you think it is because I recently updated some packages? Could you please help me fix this?
Thanks,
library(mobster)
library(tidyr)
library(dplyr)
example_data = Clusters(mobster::fit_example$best)
drivers_rows = c(2239, 3246, 3800)
example_data$is_driver = FALSE
example_data$driver_label = NA
example_data$is_driver[drivers_rows] = TRUE
example_data$driver_label[drivers_rows] = c("DR1", "DR2", "DR3")
# Fit and print the data
fit = mobster_fit(example_data, auto_setup = 'FAST')
[ MOBSTER fit ]
Error in mobster:::check_input(x, K, samples, init, tail, epsilon, maxIter, : There are some reserved names in the input data that cannot be used, please remove or rename columns: cluster, Tail, C1, C2
Traceback:
1. mobster_fit(example_data, auto_setup = "FAST")
2. mobster:::check_input(x, K, samples, init, tail, epsilon, maxIter,
. fit.type, seed, model.selection, trace)
3. stop("There are some reserved names in the input data that cannot be used, please remove or rename columns: ",
. paste0(fixed_names, collapse = ", "))
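One possible workaround, sketched here under assumptions: the error message lists the columns that the fit considers reserved, and these are the columns that Clusters() appends to the data. The sketch below drops them before refitting. It uses a small made-up data frame in place of the real Clusters() output, and the list of reserved names is copied from the error message above; a fit with more Beta components would presumably report additional names.

```r
# Toy stand-in for the output of Clusters(); values are made up for illustration.
example_data <- data.frame(
  VAF     = c(0.45, 0.16, 0.06),
  cluster = c("C1", "C2", "Tail"),
  Tail    = c(0.01, 0.22, 1.00),
  C1      = c(0.99, 0.00, 0.00),
  C2      = c(0.00, 0.78, 0.00),
  stringsAsFactors = FALSE
)

# Column names reported as reserved in the error message above.
reserved <- c("cluster", "Tail", "C1", "C2")

# Keep only the columns that are not reserved.
clean_data <- example_data[, !(names(example_data) %in% reserved), drop = FALSE]

print(names(clean_data))
```

The cleaned data frame could then be passed to mobster_fit() in place of example_data; annotation columns such as is_driver and driver_label are not in the reserved list and would be kept.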
Hi, I was wondering whether it is feasible to deal with exome-sequencing data using mobster?
Error when building the binomial_noise branch from the repo in RStudio
Quitting from lines 31-35 [unnamed-chunk-3] (a4_popgen.Rmd)
Error: processing vignette 'a4_popgen.Rmd' failed with diagnostics:
Bad MOBSTER input (list of fits).
--- failed re-building ‘a4_popgen.Rmd’
Can be worked around by setting 'vignettes = F' in build()
I sometimes get the following error; it seems to go away if I remove mutations with VAF = 0.0. I think at the moment they get set to 1e-9; maybe we should just remove them? Or do you think it's something else?
Error in if (.stoppingCriterion(i, prevNLL, fit$NLL, prevpi, fit$pi, fit.type, :
missing value where TRUE/FALSE needed
In addition: Warning message:
In .dbpmm.EM(x, K = tests[r, "K"], init = init, tail = tests[r, :
Possible singularity in one Beta component a/b --> Inf.
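For context, R raises "missing value where TRUE/FALSE needed" whenever the condition of an `if` evaluates to NA, which happens for instance when a likelihood becomes NaN. The sketch below reproduces the message in isolation and shows a guard. Note that has_converged() and its arguments are hypothetical names invented for this illustration; this is not the package's .stoppingCriterion.

```r
# Comparing against NaN yields NA, and `if (NA)` raises the error seen above.
unguarded <- tryCatch(
  if (abs(NaN - 10) < 1e-10) "stop" else "continue",
  error = function(e) conditionMessage(e)
)
print(unguarded)

# Hypothetical stopping rule: converged when the NLL changes by less than epsilon.
has_converged <- function(prev_nll, nll, epsilon = 1e-10) {
  # A non-finite NLL would make the comparison below return NA,
  # so treat it explicitly as "not converged" instead.
  if (!is.finite(prev_nll) || !is.finite(nll)) {
    return(FALSE)
  }
  abs(prev_nll - nll) < epsilon
}

print(has_converged(NaN, 10))
print(has_converged(10, 10))
```

Whether a guard like this or removing the zero-VAF mutations up front is the better fix depends on where the non-finite value actually comes from, which the log above does not show.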
@marcjwilliams1 Check out commit 7396fea3308aee9921220b05e4b6f7e6267cd93e. I wrapped a call to the neutralitytestr package.
It is wrapped from a general MOBSTER object (k regions) which contains MOBSTER fits in the $fit.MOBSTER field. VAF is already adjusted by MOBSTER, and mutations are assumed to be diploid. The integration range is selected via custom upper and lower quantiles. The test is run on tail mutations.
The function neutralitytest is run on each sample that has a fitted tail. The final mutation rate M is a linear combination of the per-sample mutation rates, weighted by the normalized tail size. The idea is that if one sample has a large tail (900 mutations) and another a very small one (100 mutations), we want to give more weight (90%) to the mutation rate estimated from the larger tail.
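The weighting can be illustrated with a short worked example. The tail sizes are the ones from the description (900 and 100 mutations); the per-sample mutation rates are made-up numbers chosen only to make the arithmetic easy to follow, and the variable names are not taken from the package.

```r
# Per-sample tail sizes from the example in the text.
tail_size <- c(sample_1 = 900, sample_2 = 100)

# Hypothetical per-sample mutation-rate estimates (made-up numbers).
mu <- c(sample_1 = 20, sample_2 = 40)

# Normalise the tail sizes so that the weights sum to 1.
weights <- tail_size / sum(tail_size)

# Weighted linear combination of the per-sample estimates.
M <- sum(weights * mu)

print(weights)
print(M)
```

Here the weights are 0.9 and 0.1, so M = 0.9 * 20 + 0.1 * 40 = 22. The same value is returned by weighted.mean(mu, tail_size).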
(function (command = NULL, args = character(), error_on_status = TRUE, …
:Type .Last.error to see the more details.
Warning messages:
1: In readLines(f, n) :
incomplete final line found on '/Users/madeleine.dale/Library/R/arm64/4.3/library/mobster/DESCRIPTION'
2: In readLines(file) :
incomplete final line found on '/Users/madeleine.dale/Library/R/arm64/4.3/library/mobster/DESCRIPTION'
3: In readLines(f, n) :
incomplete final line found on '/Users/madeleine.dale/Library/R/arm64/4.3/library/mobster/DESCRIPTION'
4: In readLines(file) :
incomplete final line found on '/Users/madeleine.dale/Library/R/arm64/4.3/library/mobster/DESCRIPTION'
Hi,
I would like to ask some questions about the inputs.
I have Mutect2 (both unfiltered and filtered with GATK FilterMutectCalls) and Strelka2 calls.
My questions are:
- Which values should I use for the DP_column and NV_column parameters of the load_vcf() function, for Mutect2 and Strelka2 VCF files?
- Can the VCF files be passed directly to load_vcf(), or is it necessary to manipulate the files? In that case, how should I do it? Are there some kind of "helper" functions I missed from the package manual?
Sorry for the silly questions, I just want to be 100% sure of what I'm doing.
Thank you!
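As a rough illustration of what the two quantities mean, the sketch below derives the number of variant reads (NV) and the depth (DP) from Mutect2-style AD values, where AD holds the reference and alternate read counts separated by a comma. It does not call load_vcf(), whose expected inputs are not documented in this thread; the AD strings are made up, only biallelic sites are assumed, and depth is taken as ref + alt reads, which can differ from the DP field written by the caller.

```r
# Hypothetical AD values as they might appear in the tumour sample column
# of a Mutect2 VCF (biallelic sites only): "ref_count,alt_count".
ad <- c("30,12", "55,5", "18,20")

# Split each AD string into its reference and alternate read counts.
counts <- do.call(rbind, lapply(strsplit(ad, ",", fixed = TRUE), as.integer))
colnames(counts) <- c("ref", "alt")

calls <- data.frame(
  NV = counts[, "alt"],                    # reads supporting the variant
  DP = counts[, "ref"] + counts[, "alt"]   # depth taken as ref + alt reads
)

# Variant allele frequency of each call.
calls$VAF <- calls$NV / calls$DP

print(calls)
```

Strelka2 reports its read counts in different FORMAT fields, so its VCF files would need their own parsing step.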
Implement a direct MLE/MM fit without EM.
In plot_latent_variables a MOBSTER fit is passed as the main argument, and it contains a tibble like
# A tibble: 3 x 7
VAF cluster Tail C1 C2 is_driver driver_label
<dbl> <chr> <dbl> <dbl> <dbl> <lgl> <chr>
1 0.448 C1 0.0125 9.88e- 1 8.08e-21 TRUE DR1
2 0.159 C2 0.225 2.35e-34 7.75e- 1 TRUE DR2
3 0.0629 Tail 1.00 1.91e-82 4.02e- 5 TRUE DR3
After passing the object to the function Clusters we get
# A tibble: 5,000 x 10
VAF cluster Tail...3 C1...4 C2...5 is_driver driver_label Tail...8 C1...9
<dbl> <chr> <dbl> <dbl> <dbl> <lgl> <chr> <dbl> <dbl>
1 0.497 C1 0.00736 0.993 5.22e-27 FALSE NA 0.00736 0.993
2 0.490 C1 0.00669 0.993 4.42e-26 FALSE NA 0.00669 0.993
3 0.470 C1 0.00705 0.993 1.31e-23 FALSE NA 0.00705 0.993
4 0.517 C1 0.0130 0.987 1.83e-29 FALSE NA 0.0130 0.987
5 0.506 C1 0.00903 0.991 3.86e-28 FALSE NA 0.00903 0.991
6 0.440 C1 0.0179 0.982 9.68e-20 FALSE NA 0.0179 0.982
7 0.428 C1 0.0347 0.965 3.88e-18 FALSE NA 0.0347 0.965
8 0.523 C1 0.0164 0.984 3.97e-30 FALSE NA 0.0164 0.984
9 0.482 C1 0.00648 0.994 3.87e-25 FALSE NA 0.00648 0.994
10 0.499 C1 0.00759 0.992 3.20e-27 FALSE NA 0.00759 0.992
# … with 4,990 more rows, and 1 more variable: C2...10 <dbl>
This breaks the execution of
clusters_names = names(x$pi)
assignments %>% select(clusters_names)
where assignments is the result of the Clusters function.
The two tibbles have different column names, so the select doesn't work and causes an error.
The problem can be replicated by executing the second stage of the vignette "2. Plotting fits".
Update DESCRIPTION file to reflect this after commit 72056b6
Hi Giulio,
So I've been running lots of examples with mobster (very cool stuff), but I noticed a consistent pattern of model selection failure (weird Beta distribution fits) when sparse and dispersed low-frequency mutations are present in the VAF. See example plot below (left):
Both fits were run with the default settings (not auto_setup = FAST).
I think a straightforward solution is to trim the neutral tail a bit by removing mutations below ~2-5% VAF (see right plot: trimming mutations below 5% VAF fixes this issue), but I am wondering whether there is an automated solution within the package for this? Maybe I overlooked something in the documentation.
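The manual trimming described above can be sketched in a few lines of base R. This is only an illustration of the pre-filtering step, not a feature of the package: trim_low_vaf() and its min_vaf argument are hypothetical names, and the VAF values are made up.

```r
# Toy VAF values standing in for a real sample (made-up numbers).
mutations <- data.frame(
  VAF = c(0.01, 0.03, 0.04, 0.08, 0.21, 0.25, 0.47, 0.52)
)

# Keep mutations at or above a minimum VAF (hypothetical helper).
trim_low_vaf <- function(data, min_vaf = 0.05) {
  data[data$VAF >= min_vaf, , drop = FALSE]
}

trimmed <- trim_low_vaf(mutations, min_vaf = 0.05)

# Number of mutations retained at each candidate cutoff in the 2-5% range.
cutoffs <- c(0.02, 0.05)
retained <- vapply(
  cutoffs,
  function(cut) nrow(trim_low_vaf(mutations, cut)),
  integer(1)
)
names(retained) <- cutoffs

print(retained)
```

Comparing the retained counts across cutoffs gives a quick sense of how much of the tail each threshold removes before the trimmed data is refitted.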
Hi,
I would like to try MOBSTER, but can't see any examples of how to input data, formatting, or what output will look like.
Is there a 'walk-through' with some example data that I can follow?
Thanks,
Bruce.
Write the bootstrap routines to work on offline data, so as to submit an array job to the cluster using easypar. That is much faster than the current implementation.
Running the following code from the example gives me an error
library(mobster)
dataset = random_dataset(
  seed = 123,
  Beta_variance_scaling = 100 # variance ~ U[0, 1]/Beta_variance_scaling
)
fit = mobster_fit(
  dataset$data,
  auto_setup = "FAST",
  parallel = F
)
Loaded input data, n = 5000.
❯ n = 5000. Mixture with k = 1,2 Beta(s). Pareto tail: TRUE and FALSE. Output clusters with
π > 0.02 and n > 10.
! mobster automatic setup FAST for the analysis.
❯ Scoring (without parallel) 2 x 2 x 2 = 8 models by reICL.
[easypar] run 1 - Error: Can't merge the outer name `init.value` with a vector of length > 1.
Please supply a `.name_spec` specification.
[easypar] run 2 - Error: Can't merge the outer name `init.value` with a vector of length > 1.
Please supply a `.name_spec` specification.
[easypar] run 3 - Error: Can't merge the outer name `init.value` with a vector of length > 1.
Please supply a `.name_spec` specification.
[easypar] run 4 - Error: Can't merge the outer name `init.value` with a vector of length > 1.
Please supply a `.name_spec` specification.
[easypar] run 5 - Error: Can't merge the outer name `init.value` with a vector of length > 1.
Please supply a `.name_spec` specification.
[easypar] run 6 - Error: Can't merge the outer name `init.value` with a vector of length > 1.
Please supply a `.name_spec` specification.
[easypar] run 7 - Error: Can't merge the outer name `init.value` with a vector of length > 1.
Please supply a `.name_spec` specification.
[easypar] run 8 - Error: Can't merge the outer name `init.value` with a vector of length > 1.
Please supply a `.name_spec` specification.
[easypar] 8/8 computations returned errors and will be removed.
Error in mobster_fit(dataset$data, auto_setup = "FAST", parallel = F) :
All task returned errors, no fit available, raising this error to interrupt the computation....
Hi,
I find mobster very useful for subclone reconstruction analysis and would like to use it for my ctDNA research.
Could you please let me know if mobster can deal with WES ctDNA data?
Thank you!
Check that these can be fit appropriately.
At the moment R CMD check is failing when running examples. Here is the error:
Running examples in ‘mobster-Ex.R’ failed
The error most likely occurred in:
> ### Name: get_clone_trees
> ### Title: Return clone trees from the fit.
> ### Aliases: get_clone_trees
...
> trees = get_clone_trees(x)
Error in get_clone_trees(x) :
Your data should have driver events annotated, cannot use 'ctree' otherwise.
Execution halted
"guides(<scale> = FALSE) is deprecated. Please use guides(<scale> = "none") instead."
Here are all the paths that needed updating:
mobster/R/plot_boostrap_coclustering.R
mobster/R/plot_mixing_proportions.R
mobster/R/S3_methods_plot.R
mobster/R/plot_gofit.R
mobster/R/plot_fit_scores.R
mobster/R/plot_boostrap_Beta.R
mobster/R/plot_boostrap_tail.R
mobster/R/plot_model_selection.R
I think I found all of them.
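For reference, the change requested by the deprecation message looks like the following in a standalone ggplot2 plot. The data frame and the choice of the colour scale are made up for illustration and are not taken from the plotting files listed above.

```r
library(ggplot2)

# Small made-up dataset with a grouping variable mapped to colour.
df <- data.frame(
  x     = c(1, 2, 3),
  y     = c(2, 4, 3),
  group = c("a", "b", "c")
)

p <- ggplot(df, aes(x = x, y = y, colour = group)) +
  geom_point() +
  # Previously written as guides(colour = FALSE), which triggers the warning.
  guides(colour = "none")

print(p)
```

Only the value passed to guides() changes; the legend for that scale is suppressed in both forms.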
Hi, I carefully checked the functions in mobster, but I don't see any information about "squareplot". Looking forward to your update.
Best wishes,
Sunny.
Error in -tests$likelihood : invalid argument to unary operator
Require the new R version due in April 2020.
It seems the following CRAN packages are currently missing
wesanderson
reshape2
plus my GitHub package
ctree
Can we add these to the package so they get downloaded automatically? Should we do that in development and mirror the change to master as well?
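Until the dependencies are declared, a quick way to see which of them are absent from a given installation is sketched below. missing_packages() is a hypothetical helper written for this note; it relies only on base R's requireNamespace().

```r
# Return the subset of `pkgs` that cannot be loaded from the current library.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages named in this issue.
needed <- c("wesanderson", "reshape2", "ctree")

print(missing_packages(needed))
```

Listing the packages under Imports in the DESCRIPTION file is the usual way to have the CRAN ones installed automatically; for a GitHub-only package such as ctree, tools like remotes and devtools additionally read a Remotes field, assuming that is how the package is installed.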