
carnival's Introduction

CARNIVAL


Overview

CARNIVAL (CAusal Reasoning for Network identification using Integer VALue programming) is a method for the identification of upstream regulatory signalling pathways from downstream gene expression (GEX).

This tool is currently being developed by saezlab members and is an extension of the previously implemented Causal Reasoning method (Melas et al.). More detailed information on the CARNIVAL pipeline, as well as benchmarking and application studies, is available in the following paper.

Liu A., Trairatphisan P., Gjerga E. et al. From expression footprints to causal pathways: contextualizing large signaling networks with CARNIVAL. npj Systems Biology and Applications, volume 5, Article number: 40 (2019) (equal contributions).

The aim of the CARNIVAL pipeline is to identify a subset of interactions from a prior knowledge network that represent potentially regulated pathways linking known or potential targets of perturbation to active transcription factors derived from GEX data. The pipeline includes a number of improved functionalities compared to the original version and consists of the following processes:

    1. Transcription factor (TF) activities and pathway scores can be inferred from gene expression with our in-house tools DoRothEA and PROGENy, respectively.
    2. TF activities and signed, directed protein-protein interaction networks, with or without the provided targets of perturbation and pathway scores, are then used to derive a series of linear constraints that define integer linear programming (ILP) problems.
    3. An ILP solver (IBM ILOG CPLEX) is subsequently applied to identify the sub-network topology with minimised fitting error and model size.

Applications of CARNIVAL include the identification of drug’s modes of action and of deregulated processes in diseases (even if the molecular targets remain unknown) by deciphering the alterations of main signalling pathways as well as alternative pathways and off-target effects.

Getting Started

A tutorial for preparing CARNIVAL input files starting from differential gene expression (DEG) and for running the CARNIVAL pipeline is provided as vignettes in R-Markdown, R-script and HTML formats. The wrapper script “runCARNIVAL” takes the input arguments, pre-processes the input descriptions, runs the optimisation and exports the results as network files and figures. Three built-in CARNIVAL examples are also supplied as case studies for users.

Prerequisites

CARNIVAL requires the interactive version of the IBM CPLEX, Gurobi or CBC-COIN solver as the network optimiser. IBM ILOG CPLEX is freely available to academics through the IBM Academic Initiative. Gurobi licenses are also free for academics on request. The CBC solver is open source and freely available to any user, but has significantly lower performance than CPLEX or Gurobi; a CBC executable that can be used directly with CARNIVAL is available for download.

Alternatively for small networks, users can rely on the freely available lpSolve R-package, which is automatically installed with the package.

Installation

To install the stable version from Bioconductor:

# install from bioconductor
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("CARNIVAL")

Alternatively, the newest version can be installed from GitHub using:

# install the development version from GitHub
# install.packages("devtools")
devtools::install_github("saezlab/CARNIVAL")

Inputs and Outputs of CARNIVAL

The input for CARNIVAL consists of:

  • A prior knowledge network (PKN) comprising a list of signed and directed interactions between signalling proteins. (Required)

  • Transcription factor activities, which can be inferred from GEX data using DoRothEA. (Required)

  • A list of targets of perturbation (drugs, diseases, etc.) with or without their effects on signalling proteins. (Optional)

  • Inferred pathway scores representing signalling pathway activities from GEX data using PROGENy (Optional)

The output of CARNIVAL includes the list of identified networks that fit the provided experimental data, as well as the predicted activities (up- or down-regulated) of the signalling proteins in those networks.
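As a rough illustration of the shape of these inputs, the sketch below builds a tiny PKN, a set of TF activities and a perturbation target in base R. The column names (`source`, `interaction`, `target`), the node names, the sign encoding and all numeric values are assumptions made for illustration; the exact layout expected by a given CARNIVAL version should be checked against the toy examples shipped with the package.

```r
# Hypothetical prior knowledge network: signed, directed interactions.
# interaction = 1 means activation, -1 means inhibition (assumed encoding).
pkn <- data.frame(
  source      = c("I1", "I1", "N1", "N2"),
  interaction = c(1, -1, 1, 1),
  target      = c("N1", "N2", "TF1", "TF2"),
  stringsAsFactors = FALSE
)

# Hypothetical TF activities (e.g. as estimated by DoRothEA), one value per TF.
tf_activities <- c(TF1 = 1.8, TF2 = -2.3)

# Optional: hypothetical perturbation target with its assumed effect.
perturbations <- c(I1 = 1)

# Basic sanity checks before handing anything to an optimiser:
# every sign is +1/-1 and every measured or perturbed node is in the PKN.
stopifnot(all(pkn$interaction %in% c(-1, 1)))
nodes <- unique(c(pkn$source, pkn$target))
missing_tfs <- setdiff(names(tf_activities), nodes)
missing_inputs <- setdiff(names(perturbations), nodes)
```

Nodes listed in `missing_tfs` or `missing_inputs` would have no counterpart in the network, so it is worth resolving them before running the optimisation.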

Running CARNIVAL

To open the tutorial/vignette of the CARNIVAL package, type the following command in the R console:

vignette("CARNIVAL-vignette")

References

Melas et al.:

Melas IN, Sakellaropoulos T, Iorio F, Alexopoulos L, Loh WY, Lauffenburger DA, Saez-Rodriguez J, Bai JPF. (2015). Identification of drug-specific pathways based on gene expression data: application to drug induced lung injury. Integrative Biology, Issue 7, Pages 904-920, https://doi.org/10.1039/C4IB00294F.

DoRothEA - Garcia-Alonso et al.:

Garcia-Alonso L, Holland CH, Ibrahim MM, Turei D, Saez-Rodriguez J. (2019). Benchmark and integration of resources for the estimation of human transcription factor activities. Genome Research, 29: 1363-1375, https://doi.org/10.1101/gr.240663.118

PROGENy - Schubert et al.:

Schubert M, Klinger B, Klünemann M, Sieber A, Uhlitz F, Sauer S, Garnett MJ, Blüthgen N, Saez-Rodriguez J. (2018). Perturbation-response genes reveal signaling footprints in cancer gene expression. Nature Communications, Volume 9, Article 20, https://doi.org/10.1038/s41467-017-02391-6.

Studies where CARNIVAL was used

Triantafyllidis C.P. et al. A machine learning and directed network optimization approach to uncover TP53 regulatory patterns. iScience 26, 12 (2023), 108291. doi: 10.1016/j.isci.2023.108291

Buhl E.M. et al. Dysregulated mesenchymal PDGFR‐β drives kidney fibrosis. EMBO Mol Med (2020), e11021. doi: 10.15252/emmm.201911021

Binenbaum I. et al. Bioinformatic framework for analysis of transcription factor changes as the molecular link between replicative cellular senescence signaling pathways and carcinogenesis. Biogerontology. doi: 10.1007/s10522-020-09866-y

Acknowledgement

CARNIVAL has been developed as a computational tool to analyse -omics data within the TransQST Consortium and H2020 Symbiosys ITN Training Network.

“This project has received funding by the European Union’s H2020 program (675585 Marie-Curie ITN ‘‘SymBioSys’’) and the Innovative Medicines Initiative 2 Joint Undertaking under grant agreement No 116030. The Joint Undertaking receives support from the European Union’s Horizon 2020 research and innovation programme and EFPIA.”

carnival's People

Contributors

adugourd, anikaliu, bartoszbartmanski, christianholland, enio23, gabora, ivanovaos, matteospatuzzi, nturaga, pierremj, ptrairatphisan


carnival's Issues

Running carnival in parallel

Dear Carnival Developers,
First, thank you for the great tool, it really is amazing.

My issue, now, is trying to run Carnival in parallel, but that is not working. I made a toy example to run it but it is not working.

library(furrr)
plan(multisession, workers = 4)
library(CARNIVAL)

load(file = system.file("toy_inputs_ex1.RData",
                        package="CARNIVAL"))
load(file = system.file("toy_measurements_ex1.RData",
                        package="CARNIVAL"))
load(file = system.file("toy_network_ex1.RData",
                        package="CARNIVAL"))


input_lists <- rep(list(toy_inputs_ex1),times=20)
meas_lists <- rep(list(toy_measurements_ex1),times=20)
net_lists <- rep(list(toy_network_ex1),times=20)

result = future_map(input_lists, function(lst){
  runCARNIVAL(inputObj = lst, measObj = toy_measurements_ex1, threads =1,
              netObj = toy_network_ex1, solverPath = "/usr/bin/cbc", solver = "cbc")
})

This is the error that I get when running it.

Writing constraints...
Solving LP problem...
Error: 'results_cbc_1_1.txt' does not exist in current working directory ('/home/ahmed/Comp_Bio/Projects/Discovery_Pipeline').
In addition: Warning message:
UNRELIABLE VALUE: Future (‘<none>’) unexpectedly generated random numbers without specifying argument 'seed'. There is a risk that those random numbers are not statistically sound and the overall results might be invalid. To fix this, specify 'seed=TRUE'. This ensures that proper, parallel-safe random numbers are produced via the L'Ecuyer-CMRG method. To disable this check, use 'seed=NULL', or set option 'future.rng.onMisuse' to "ignore".

I looked at one of the development branches, and it seems that this was partly addressed with parallelIdx1, but it always has a fixed value of 1, and runCARNIVAL has no argument to change 'condition' so that different arguments are sent to the cbc solver. As a result, every run looks for the same 'results_cbc_1_1.txt' file, which is already in use by the first node.

Thanks,
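One possible workaround, sketched here under the assumption that the collision comes from every worker writing the same file name into a shared working directory, is to give each job its own directory. `run_in_private_dir` and `write_result` are hypothetical helpers, not part of CARNIVAL, and whether this is sufficient for a given CARNIVAL version is untested.

```r
# Hypothetical helper: evaluate `fn` inside a fresh, job-specific directory so
# that files written with fixed names (e.g. 'results_cbc_1_1.txt') cannot clash.
run_in_private_dir <- function(job_id, fn, base_dir = tempdir()) {
  job_dir <- file.path(base_dir, paste0("carnival_job_", job_id))
  dir.create(job_dir, recursive = TRUE, showWarnings = FALSE)
  old_wd <- setwd(job_dir)
  on.exit(setwd(old_wd), add = TRUE)  # always restore the caller's directory
  fn()
}

# Stand-in for a solver call: every job writes the same file name.
write_result <- function() {
  writeLines("solution", "results_cbc_1_1.txt")
  normalizePath("results_cbc_1_1.txt")
}

# Three jobs, three distinct result files despite the identical file name.
paths <- vapply(1:3, run_in_private_dir, character(1), fn = write_result)
```

Inside `future_map`, the `runCARNIVAL(...)` call would take the place of `write_result`, with the job index passed as `job_id`. With `multisession` each worker is a separate R process, so changing the working directory in one worker does not affect the others.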

Error in file(file, "rt"): cannot open the connection

Dear respectful developers,

May I ask for your guidance on the runCARNIVAL step?

When using the latest versions of cplex and CARNIVAL with R 4.1.3, the runCARNIVAL step saves the lp file and then shows Error in file(file, "rt"): cannot open the connection; cplex does not seem to generate the results file in the working directory as expected.

Thanks a lot! Much appreciated!

weightObj error

When I put progeny pathway score (dataframe, 1obs. of 14 variables) into runCARNIVAL function as weightObj, I got the following error message.
"Error in if (nrow(weightObj) != nrow(measObj)) "

However, my weightObj (dataframe, 1 obs. of 14 variables) and measObj (dataframe, 1 obs. of 289 variables) have the same number of rows.

An example pipeline using PROGENy pathway scores as weightObj would be very helpful.

R version 4.0.2 (2020-06-22)
CARNIVAL version 1.0.0

No DOT figure

Using BigSur 11.6 and CPLEX 20.10, I get no DOT figure with CARNIVAL 2.2 as an output network even though the problem is optimal. The same inputs generate a DOT figure file in Windows10 as normal.

Some issues

For my project I read your paper "From expression footprints to causal pathways: contextualizing large signaling networks with CARNIVAL" and found it really interesting; the procedure you applied at the different stages was novel and clever. A few questions remain, and I would really appreciate your help in answering them. First: what query did you use on the Omnipath database to create your PKN? I downloaded it from Omnipath with the following code, but I got more interactions than you mention in the paper. Could you send me your script?
My code:
interactions <-import_omnipath_interactions(resources=c("SignaLink3","PhosphoSite","SIGNOR"))
Secondly: as you mentioned in the paper, PROGENy can be used for pathway activities, but I don't understand which format is accepted. I read your vignette but unfortunately did not find any example data showing what this input should look like.
Finally: could you send me a real example R script that you ran with CARNIVAL? I ran your toy example, but to better understand its real application I need a real one.
I hope you can help me answer my questions.
Best Regards

missing mouse progeny data

The supplied "progenyMembers.RData" file contains only the human genes used in the progeny pre-steps. One with mouse genes would be required to be compatible with mouse input.

Gurobi needs documentation

Hi,
it seems to me that gurobi is already part of this version (can be selected as solver), but there is no documentation about which parameters can/should be used in ?runCARNIVAL(). Is it described somewhere or could you add it to runCARNIVAL?
thanks
Attila

CBC multithreading arguments

Dear @ivanovaos
as we discussed in this issue #61, I would suggest that we add two additional options to be sent to the CBC solver command line.
Those would be: threads and randomseed , which would help speed up CBC that was built with support for multithreading by adding that option to the config file ./configure --enable-cbc-parallel
Many thanks!!
Ahmed

revise dependencies

Hey,
the following dependencies should be revised:
Imports:
doParallel, --- is it compulsory?
readr, (? not sure if really needed)
viper, -- (do we need Dorothea for CARNIVAL? or is it only used for the vignette?)
AnnotationDbi, -- gene name mapping should be done outside carnival, right?
UniProt.ws, -- gene name mapping should be done outside carnival, right?

Memory issues when writing constraints

adj <- as.matrix(adj)

Converting adj from sparse to dense matrix is leading to memory issues on larger problems (>30000 nodes/reactions take up more than 8 Gb).

Since this variable is only needed to identify which rows sum up to 0, line 21 could be removed and this line below could probably replace the idx1 definition, although this is not as efficient as it could be.

idx1 <- which(sapply(1:dim(adj)[1], function(x){sum(adj[x,])}) == 0)
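A sketch of a sparse-friendly alternative, assuming `adj` is a sparse matrix from the Matrix package (the small matrix below is made up for illustration): `Matrix::rowSums` works directly on the sparse representation, so neither a dense copy nor a row-by-row loop is needed.

```r
library(Matrix)

# Made-up 4 x 4 sparse adjacency matrix; rows 2 and 4 have no entries.
adj <- sparseMatrix(
  i = c(1, 1, 3),
  j = c(2, 3, 4),
  x = c(1, 1, 1),
  dims = c(4, 4)
)

# Row sums are computed on the sparse structure, without as.matrix().
idx1 <- which(Matrix::rowSums(adj) == 0)
```

As with the sum-based line above, a row whose non-zero entries cancel out would also be reported, so this is only equivalent to "empty row" when all entries share the same sign.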

Vignette missing

Hi,

it seems the command vignette("CARNIVAL-vignette") leads to a warning:

vignette ‘CARNIVAL-vignette’ not found

I tried the Bioconductor and the GitHub versions, but neither contains the vignette. Can you fix this?

I'm running CARNIVAL v 1.0.0 on R4.0.1.

Many thanks,
Axel

Speed up for carnival

Hi,

Is there any way to speed up the analysis in CARNIVAL?

  • Which solver would you suggest for the best speed (cbc or cplex)?

  • multi-processing/threading or anything

Thank you in advance!

(In my case, there are about 20,000 genes and 171 samples)

generate_measfile() returns: object 'folderpath' not found

Tried to rerun some code written just a couple of weeks ago. Getting this error now when generating the measurement file.

I noticed that the folderpath variable is declared inside the generate_measfile() function but not defined.

Moreover, I cannot find the generate_measfile() in the source code you currently have available. Have you changed the name or otherwise?

I pasted my error below:

> generate_measfile(measurements = TF_uniprot,
+                   topnumber = NULL,
+                   write2folder = "measurements")
Error in paste0(folderpath, "/meas_", drug, "_all.txt") : object 'folderpath' not found

progenyMembers.RData

Hey Saezlab:)

I found an error in progenyMembers.RData: The p53 pathway is mapped to "P53" in the file, but in Omnipath it is called "TP53".

As a consequence, I've been getting this warning:
These nodes are not in prior knowledge network and will be ignored: P53

This might have been simply my mistake back in the days, but if it is due to updates in gene name conventions this might be something to look out for more generally?

All the best,
Anika

ILP issue

Hi,

I am running CARNIVAL using saezlab/transcriptutorial. I tried using IBM cplex but it didn't work out and I kept getting this:

Error in resList[[1]] : subscript out of bounds
In addition: Warning messages:
1: In controlNodeIdentifiers(netObj = netObj) :
  Your network contains identifiers with '-' symbol and they will
            be replaced with '_'
2: In controlNodeIdentifiers(netObj = netObj) :
  Your network contains identifiers with '/' symbol and they will
            be replaced with '_'

Finally, I installed lpSolve in R & made a ~/.Renviron file with R_MAX_VSIZE=100Gb & removed unused variable from the environment. And I got this one:

> carnival_result = CARNIVAL::runCARNIVAL(inputObj= iniciators,
+                                measObj = tfList$t, 
+                                netObj = sif, 
+                                weightObj = progenylist$score, 
+                                solverPath = "/var/folders/5h/nb97vjfj5w13g__ksqy9z0b40000gn/T/RtmpzOFfha/downloaded_packages/lpSolve/R/lpSolve", 
+                                solver = "lpSolve")
Writing constraints...
Solving LP problem...

── Column specification ─────────────────────────────────────────────
cols(
  `enter Problem` = col_character()
)

Error: vector memory exhausted (limit reached?)
In addition: Warning messages:
1: In controlNodeIdentifiers(netObj = netObj) :
  Your network contains identifiers with '-' symbol and they will
            be replaced with '_'
2: In controlNodeIdentifiers(netObj = netObj) :
  Your network contains identifiers with '/' symbol and they will
            be replaced with '_'

Here is my R session:

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] lpSolve_5.6.15      visNetwork_2.0.9    OmnipathR_2.1.1     yaml_2.2.1          rappdirs_0.3.3     
 [6] tidyr_1.1.2         dplyr_1.0.4         readr_1.4.0         RCurl_1.98-1.2      purrr_0.3.4        
[11] magrittr_2.0.1      progress_1.2.2      jsonlite_1.7.2      vsn_3.58.0          Biobase_2.50.0     
[16] BiocGenerics_0.36.0

loaded via a namespace (and not attached):
 [1] httr_1.4.2            mixtools_1.2.0        UniProt.ws_2.30.0     bit64_4.0.5           splines_4.0.2        
 [6] foreach_1.5.1         assertthat_0.2.1      BiocFileCache_1.14.0  BiocManager_1.30.10   affy_1.68.0          
[11] stats4_4.0.2          RBGL_1.66.0           blob_1.2.1            Category_2.56.0       viper_1.24.0         
[16] pillar_1.4.7          RSQLite_2.2.3         lattice_0.20-41       glue_1.4.2            limma_3.46.0         
[21] digest_0.6.27         colorspace_2.0-0      htmltools_0.5.1.1     preprocessCore_1.52.1 Matrix_1.3-2         
[26] GSEABase_1.52.1       XML_3.99-0.5          pkgconfig_2.0.3       logger_0.1            genefilter_1.72.1    
[31] CARNIVAL_1.2.0        zlibbioc_1.36.0       xtable_1.8-4          scales_1.1.1          affyio_1.60.0        
[36] annotate_1.68.0       tibble_3.0.6          generics_0.1.0        IRanges_2.24.1        ggplot2_3.3.3        
[41] ellipsis_0.3.1        cachem_1.0.3          cli_2.3.0             survival_3.2-7        crayon_1.4.0         
[46] memoise_2.0.0         doParallel_1.0.16     MASS_7.3-53           segmented_1.3-1       class_7.3-18         
[51] graph_1.68.0          tools_4.0.2           prettyunits_1.1.1     hms_1.0.0             lifecycle_0.2.0      
[56] S4Vectors_0.28.1      kernlab_0.9-29        munsell_0.5.0         AnnotationDbi_1.52.0  packrat_0.5.0        
[61] compiler_4.0.2        e1071_1.7-4           rlang_0.4.10          grid_4.0.2            iterators_1.0.13     
[66] rstudioapi_0.13       htmlwidgets_1.5.3     igraph_1.2.6          bitops_1.0-6          gtable_0.3.0         
[71] codetools_0.2-18      curl_4.3              DBI_1.1.1             R6_2.5.0              fastmap_1.1.0        
[76] bit_4.0.4             KernSmooth_2.23-18    Rcpp_1.0.6            vctrs_0.3.6           dbplyr_2.1.0         
[81] tidyselect_1.1.0     

I hope you can help me in that issue.

All the best,
Asuman

Saving CARNIVAL outputs

Dear all,
I have some problems saving the outputs of runCARNIVAL() , e.g. weighted_Model.txt, nodesAttribute.txt. etc. These files are just not saved, they appear nowhere on my computer. The only thing being saved when running the function is the DOT-figure.
I have already tried giving an absolute path and just the directory (as input for 'dir_name'), and with and without DOTfigure saving.
Do you have an idea what could be the reason?

Writing cplex command file -> error thrown

Oh hai!
When running runCARNIVAL, the following error message is thrown:

Writing cplex command file
Error in file(file, ifelse(append, "a", "w")) :
cannot open the connection

And the warning:

2: In file(file, ifelse(append, "a", "w")) :
cannot open file '/cplexCommand_t18_19_31d04_06_2021n19.txt': Permission denied

(In the working directory, both the .lp and .RData files are successfully created.)

Can anyone help to solve this issue?
Cheers!
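The leading "/" in the file name of the warning suggests, as a guess, that the directory part of the path was empty, so the command file was aimed at the file system root. The sketch below shows how an empty directory produces such a path and one way to guard against it; `resolve_out_dir` is a hypothetical helper, not a CARNIVAL function.

```r
# An empty directory prefix turns a relative file name into a root-level path.
bad_dir <- ""
root_path <- file.path(bad_dir, "cplexCommand.txt")

# Hypothetical guard: fall back to the working directory when no directory is
# given, and stop early if the chosen directory is not writable.
resolve_out_dir <- function(dir) {
  if (is.null(dir) || !nzchar(dir)) dir <- getwd()
  if (file.access(dir, mode = 2) != 0) {
    stop("Directory is not writable: ", dir)
  }
  dir
}

out_dir <- resolve_out_dir(tempdir())
out_file <- file.path(out_dir, "cplexCommand.txt")
```

Checking the directory before the solver files are written turns a late "Permission denied" into an early, explicit error.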

R version requirement

Dear All,
I noticed that you changed the prerequisite R version in the description file to >=4.0. This prevents installing CARNIVAL on R 3.6.3, which was released 6 weeks ago. Could you decrease that to 3.5?
Kind regards,

Missing complete tutorial

Hi, I'd like to use this software for my research, but I can't find a complete tutorial or example (including DoRothEA and PROGENy) from which I can understand the pipeline. I have a set of target perturbations and a gene expression matrix. How can I get the final signaling network?

Details about output

Hi,

I really liked trying CARNIVAL for my project! For more advanced network visualization, I need to know the details of the output list. Could you please provide that information, especially for weightedSIF and nodesAttributes? Specifically, I have only "T" and NA node types. "T" nodes have an AvgAct of 100, while nodes of type NA can have -100 or 0. What does that mean?

Process quits with message "Killed"

Hi, I've run into an error when running CARNIVAL on my research group's server (resources are first-come, first-served, no queuing system).

Input:

r1 = runCARNIVAL(inputObj=NULL, 
                   weightObj = progenylist$`1`,
                   measObj = measObj, 
                   netObj = netfilefull,
                   solverPath = "../../../ibm/ILOG/CPLEX_Studio1210/cplex/bin/x86-64_linux/cplex",
                   threads = 10,
                   dir_name="../Results/Test")

Output:

Warning message:
In dir.create("../Results/Test") : '../Results/Test' already exists
inputObj set to NULL -- running InvCARNIVAL
Writing constraints...
Solving LP problem...

── Column specification ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
cols(
  `enter Problem` = col_character()
)

Killed

I do not get this issue with the toy example 1.

I tried adjusting threads (we have 40 in total which are all free atm), timelimit (300 to 6000), solver (Cplex or the free solver). I think it is a memory issue but I am not entirely sure how to rectify it. For network I am just using the Omnipath network, so nothing too huge. Default no. TFs and then the PROGENy pathways as input. The run_carnival.R script I am using can be found here https://github.com/laylagerami/EPA/tree/main/Initial_Investigation/Scripts which also has all of the input files etc.

Thanks!

EDIT: .lp file is here https://github.com/laylagerami/EPA/blob/main/Initial_Investigation/Scripts/testFile_1_1.lp

Confusion about inputs for CARNIVAL

Hello,

I am having some issues understanding the inputs for the CARNIVAL program.

1. measFile: I get a matrix (transcription factors as rows and samples as columns) from DoRothEA after using run_viper on the gene expression matrix. But the input to CARNIVAL has two rows: the TFs and the values. Should I take the average (or median) of the TF activity across samples to get the two-row format?

2. weightFile: The same applies to the PROGENy output. Can I again take the average (or median) across samples for the pathway scores?

3. inputFile: Can I use the sign of the fold change from differential gene expression to construct this file, i.e. -1 for down-regulated genes and 1 for up-regulated genes?

Thanks in advance for any help.

Regards,

Anupam
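Whether to summarise across samples or to run each sample separately is a modelling decision, but both options are easy to prepare. A sketch with a made-up TF-by-sample matrix (names and values are illustrative only; the one-row layout with TFs as column names follows the two-row format described in the question):

```r
# Made-up TF activity matrix: TFs as rows, samples as columns.
tf_mat <- matrix(
  c(1.2, 0.8, -0.5, -1.1, 2.0, 1.6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("TF1", "TF2", "TF3"), c("sample1", "sample2"))
)

# Option 1: one input per sample (one row of values, TFs as columns).
meas_per_sample <- lapply(colnames(tf_mat), function(s) {
  as.data.frame(t(tf_mat[, s]))
})
names(meas_per_sample) <- colnames(tf_mat)

# Option 2: a single consensus input using the mean across samples.
meas_mean <- as.data.frame(t(rowMeans(tf_mat)))
```

The same reshaping applies to a pathway-by-sample score matrix from PROGENy. Replacing `rowMeans` with `apply(tf_mat, 1, median)` gives the median-based variant.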

runCARNIVAL

Hi,
Thanks for developing this great tool. I have one question, and your answer would be really helpful for me.
Question: I want to run CARNIVAL on a single-cell dataset that contains more than 300 cancer cells. Should I run CARNIVAL on each cell separately? I ask because in your transcriptutorial you run it in a for loop.
Thanks in advance

Wrong column name inside the measurementsDf?

From create_lp_formulation.R, function createGenerals_v2:

generals <- paste(c(variables$nodesDf$nodesVars,
                      variables$nodesDf$nodesActStateVars,
                      variables$measurementsDf$absDifference), sep="\t")

Hello all. Thanks a lot for putting up such fantastic work!

Very sorry for this non-error issue but I wonder if variables$measurementsDf$absDifference here should be variables$measurementsDf$value instead? There is no error running the pipeline like this though.

Having tried this part of the code, I also realised the two values from the toy_measurements_ex1.RData were not written down at the end of the Generals in the toy_lp_file_ex1.lp file. Should it be like this? Or is it because those values happen to be 1 so when writing output, it just writes two \t?

Thanks a lot.

cbc error : 'data' must be of a vector type, was 'NULL' despite solution found

Me again...

I can't get cbc solver to work. Not sure if it isn't generating results due to the nature of the error message, but it still says "Result - Optimal solution found". Have tried with InvCarnival only for now, with two different data inputs (both of which worked w/ Cplex) and a toy example.

The error message is below - the output starts with the usual text and then continually outputs the solving process as expected, then ends with:

Result - Optimal solution found

Objective value:                1.53549443
Enumerated nodes:               18300
Total iterations:               9011546
Time (CPU seconds):             47645.70
Time (Wallclock seconds):       2513.40

Total time (CPU seconds):       47647.50   (Wallclock seconds):      2513.99 
10:41:44 27.07.2021 Done: solving LP problem. 
10:41:44 27.07.2021 Getting the solution matrix 
Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x),  : 
  'data' must be of a vector type, was 'NULL' 

I used the same settings as I usually do with Cplex:

> carnivalOptions
$solverPath
[1] "/usr/bin/cbc"

$timelimit
[1] 3600

$threads
[1] 20

$solver
[1] "cbc"

$betaWeight
[1] 0.2

$poolrelGap
[1] 0.0001

$lpFilename
[1] ""

$workdir
[1] "output/"

$outputFolder
[1] "temp_carnival"

$cleanTmpFiles
[1] TRUE

$keepLPFiles
[1] FALSE

I also tried increasing/decreasing number of threads (including using 1 thread only) and time limit - still I get the same error.

The result file looks like this, not sure if a symptom or cause of the issue?

Status unknown - objective value 0.00000000

Layla

Saving results step is very slow

when great amount of solutions are generated and/or solution tree is very large

Solution: Review the code part that does it and find optimisations

Reported by Aurelien

Failing tests because local files were used

@ivanovaos , some tests are failing e.g. test_carnival_simplest_case.R because it uses files from your google drive:
'/Users/olgaivanova/GoogleDrive/_PhD_Heidelberg/playground/carnival_style/carnival/tests/test_networks/direct_network.sif'
please fix them

Bioconductor update?

Hi,
could we please update the Bioconductor package with the constraint fix commits? (If this version passes all the requirements/tests, then the current version would also be OK, I guess.)

We got 2 issues opened on COSMOS related to this problem.
thanks

Careful with protein identifiers in CARNIVAL

Hello,

I was running CARNIVAL on a case study and I got an error when cplex was about to solve the ILP problem. The issue was that some of the identifiers contained an undesired "-" symbol (e.g. genes NKX2-1, NKX3-1, etc.). These symbols are interpreted as a minus sign in the problem formulation and can cause problems. Please be careful with the names of the protein identifiers used in the PPI, input or measurement files, and do not include any mathematical symbols in them.

Cheers,
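A minimal sketch of cleaning identifiers before building the inputs, assuming that replacing "-" and "/" with "_" is acceptable for the analysis. `sanitize_ids` is a hypothetical helper name and "A/B" is a made-up identifier; the same function would have to be applied consistently to the network, input and measurement names.

```r
# Replace characters that an LP parser may read as operators ('-' and '/')
# with underscores; `sanitize_ids` is a hypothetical helper name.
sanitize_ids <- function(ids) {
  gsub("[-/]", "_", ids)
}

ids <- c("NKX2-1", "NKX3-1", "TP53", "A/B")
clean <- sanitize_ids(ids)

# Guard against two distinct identifiers collapsing into the same name.
stopifnot(!anyDuplicated(clean))

# Keep a lookup table so results can be mapped back to the original names.
id_map <- setNames(ids, clean)
```

Keeping `id_map` makes it possible to restore the original gene symbols in the output network after the optimisation.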

Very different results with latest CARNIVAL release

Hi devs,

I have noticed that I am getting very different results with CARNIVAL now compared to the previous version. I am using the exact same OS, cplex version, input data and settings (default w/ 20 cores).

Specifically, before the network I obtained had 69 nodes and 140 edges. Running with the new version of CARNIVAL, the weighted network now has 2535 nodes and 7672 edges. This is quite large since I would like to visualise the network in a Shiny app! Is this behaviour intentional?

I also tried running a different dataset with both cbc+cplex and again obtained very big networks compared to what I was getting before, but haven't got an exact "before" to compare to.

Command before update

values$carnival_result <- runCARNIVAL(
          netObj = pkn,
          measObj = meas,
          weightObj = weights,
          solverPath = "/scratch/lh605/ucc-fs-nethome/ibm/ILOG/CPLEX_Studio1210/cplex/bin/x8664_linux/cplex",
          solver = "cplex",
          timelimit = 3600,
          threads = 20
       )

Command upon update

carnival_options = defaultCplexCarnivalOptions()
      carnival_options$solverPath = "/scratch/lh605/ucc-fs-nethome/ibm/ILOG/CPLEX_Studio1210/cplex/bin/x8664_linux/cplex"
      carnival_options$timelimit = 3600
      carnival_options$threads = 20
      carnival_options$workdir = "output/"
      carnival_options$outputFolder = "output/"

values$carnival_result <- runInverseCarnival(
          priorKnowledgeNetwork = pkn,
          measurements = meas,
          weights = weights,
          carnivalOptions = carnival_options
          )

Session info:

R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8          LC_NUMERIC=C                  LC_TIME=en_GB.UTF-8          
 [4] LC_COLLATE=en_GB.UTF-8        LC_MONETARY=en_GB.UTF-8       LC_MESSAGES=en_GB.UTF-8      
 [7] LC_PAPER=en_GB.UTF-8          LC_NAME=en_GB.UTF-8           LC_ADDRESS=en_GB.UTF-8       
[10] LC_TELEPHONE=en_GB.UTF-8      LC_MEASUREMENT=en_GB.UTF-8    LC_IDENTIFICATION=en_GB.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] readr_1.4.0          snowfall_1.84-6.1    snow_0.4-3           GSEABase_1.54.0     
 [5] graph_1.70.0         annotate_1.70.0      XML_3.99-0.6         colorspace_2.0-1    
 [9] sortable_0.4.4       lpSolve_5.6.15       shinyWidgets_0.6.0   shinyalert_2.0.0    
[13] HGNChelper_0.8.1     piano_2.8.0          visNetwork_2.0.9     CARNIVAL_2.2.0      
[17] progeny_1.14.0       ggplot2_3.3.3        tibble_3.1.2         dplyr_1.0.6         
[21] dorothea_1.4.0       org.Hs.eg.db_3.13.0  AnnotationDbi_1.54.0 IRanges_2.26.0      
[25] S4Vectors_0.30.0     Biobase_2.52.0       BiocGenerics_0.38.0  shinyFiles_0.9.0    
[29] shinythemes_1.2.0    shinyBS_0.61         shinysky_0.1.3       rhandsontable_0.3.8 
[33] chemdoodle_0.4.0     rcdk_3.5.0           rcdklibs_2.3         rJava_1.0-4         
[37] miniUI_0.1.1.1       DT_0.18              igraph_1.2.6         shinyjs_2.0.0       
[41] shiny_1.6.0         

loaded via a namespace (and not attached):
  [1] uuid_0.1-4             fastmatch_1.1-0        plyr_1.8.6             fingerprint_3.5.7     
  [5] shinydashboard_0.7.1   splines_4.1.0          BiocParallel_1.26.0    crosstalk_1.1.1       
  [9] GenomeInfoDb_1.28.0    digest_0.6.27          htmltools_0.5.1.1      fansi_0.5.0           
 [13] magrittr_2.0.1         memoise_2.0.0          cluster_2.1.2          mixtools_1.2.0        
 [17] limma_3.48.0           Biostrings_2.60.0      blob_1.2.1             ggrepel_0.9.1         
 [21] xfun_0.23              crayon_1.4.1           RCurl_1.98-1.3         jsonlite_1.7.2        
 [25] survival_3.2-11        iterators_1.0.13       glue_1.4.2             gtable_0.3.0          
 [29] zlibbioc_1.38.0        XVector_0.32.0         kernlab_0.9-29         scales_1.1.1          
 [33] DBI_1.1.1              relations_0.6-9        Rcpp_1.0.6             xtable_1.8-4          
 [37] learnr_0.10.1          bit_4.0.4              proxy_0.4-25           htmlwidgets_1.5.3     
 [41] httr_1.4.2             fgsea_1.18.0           gplots_3.1.1           ellipsis_0.3.2        
 [45] farver_2.1.0           pkgconfig_2.0.3        sass_0.4.0             utf8_1.2.1            
 [49] RJSONIO_1.3-1.4        labeling_0.4.2         tidyselect_1.1.1       rlang_0.4.11          
 [53] later_1.2.0            munsell_0.5.0          tools_4.1.0            cachem_1.0.5          
 [57] cli_2.5.0              generics_0.1.0         RSQLite_2.2.7          stringr_1.4.0         
 [61] evaluate_0.14          fastmap_1.1.0          yaml_2.2.1             knitr_1.33            
 [65] bit64_4.0.5            fs_1.5.0               bcellViper_1.28.0      caTools_1.18.2        
 [69] purrr_0.3.4            KEGGREST_1.32.0        mime_0.10              slam_0.1-48           
 [73] compiler_4.1.0         rstudioapi_0.13        png_0.1-7              e1071_1.7-7           
 [77] marray_1.70.0          viper_1.26.0           stringi_1.6.2          bslib_0.2.5.1         
 [81] lattice_0.20-44        Matrix_1.3-4           markdown_1.1           vctrs_0.3.8           
 [85] pillar_1.6.1           lifecycle_1.0.0        jquerylib_0.1.4        data.table_1.14.0     
 [89] bitops_1.0-7           httpuv_1.6.1           R6_2.5.0               promises_1.2.0.1      
 [93] KernSmooth_2.23-20     gridExtra_2.3          MASS_7.3-54            gtools_3.8.2          
 [97] assertthat_0.2.1       rprojroot_2.0.2        withr_2.4.2            GenomeInfoDbData_1.2.6
[101] hms_1.1.0              grid_4.1.0             tidyr_1.1.3            class_7.3-19          
[105] rmarkdown_2.8          segmented_1.3-4        sets_1.0-18            itertools_0.1-3  

Let me know if you need anything else from my side to understand what's going on,

Layla

error when no solution found?

There is an unhandled error occurring in some cases:
Error in resList[[1]] : subscript out of bounds
Calls: ... <Anonymous> -> solveCARNIVAL -> solveCARNIVALSingle
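
As a minimal sketch of a guard that would avoid this error: the snippet below assumes the solver results are collected in a list (named `resList` in the traceback) that is empty when no solution is found. The helper `firstSolutionOrNull` and the `weightedSIF` field in the toy data are hypothetical and not taken from the CARNIVAL source.

```r
# Hypothetical helper, not part of the CARNIVAL API.
# Returns the first solution, or NULL with a warning when the list is empty.
firstSolutionOrNull <- function(resList) {
  if (length(resList) == 0) {
    warning("No solution found; returning NULL.")
    return(NULL)
  }
  resList[[1]]
}

# Indexing an empty list directly reproduces the reported error:
# list()[[1]]  # Error: subscript out of bounds

# Toy data only: an empty result list and a list with one made-up solution.
noSolution <- suppressWarnings(firstSolutionOrNull(list()))
oneSolution <- firstSolutionOrNull(list(list(weightedSIF = "A 1 B")))
```

With a check like this, the caller can test for `NULL` and report that no solution was found instead of failing with a subscript error.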

current_dir and dir_name conflict

I have realized that sometimes runCARNIVAL finds solutions, but the pipeline returns "No result to be written" without giving any error or warning message.

I have dug into the code and found that it joins current_dir and dir_name. current_dir holds the current working directory, so if dir_name is an absolute path, you can end up with a nested path that doesn't exist. Because of this conflict, even if there are solutions to be generated, they are never written.

Be aware that the argument for Result_dir must NOT be an absolute path.
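
The conflict can be illustrated with a short sketch. The paths are made up, and `isAbsolutePath` and `resolveResultDir` are hypothetical helpers showing one possible way to handle it, not the fix applied in CARNIVAL.

```r
# Illustrative paths only; they are not taken from the CARNIVAL source.
current_dir <- "/home/user/project"
dir_name <- "/home/user/results"

# Naive joining nests the absolute path under the working directory,
# producing a location that most likely does not exist.
nested <- file.path(current_dir, dir_name)

# Hypothetical helpers: prepend the working directory to relative paths only.
# The pattern covers Unix-style roots, "~" and Windows drive letters.
isAbsolutePath <- function(path) grepl("^(/|~|[A-Za-z]:)", path)

resolveResultDir <- function(dir_name, current_dir = getwd()) {
  if (isAbsolutePath(dir_name)) dir_name else file.path(current_dir, dir_name)
}

absoluteResult <- resolveResultDir(dir_name, current_dir)
relativeResult <- resolveResultDir("results", current_dir)
```

Until something along these lines is in place, passing a relative path for Result_dir remains the safe option.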

Incorrect assignment of input values

There was an issue with assigning the order of activity values for the inputs. This issue occurs only when there is a combination of positive (1) and negative (-1) values in the same set of inputs.
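
The report does not describe the mechanism, so the following is only an illustration of how this class of bug can arise, under the assumption that values are regrouped by sign and then assigned back by position. The node names are made up.

```r
# Illustrative input vector; node names are made up.
inputs <- c(A = 1, B = -1, C = 1)

# Grouping by sign changes the order of the values.
regrouped <- c(inputs[inputs > 0], inputs[inputs < 0])

# Assigning the regrouped values back by position mislabels the nodes.
byPosition <- setNames(unname(regrouped), names(inputs))

# Matching by name keeps each value attached to its node.
byName <- regrouped[names(inputs)]
```

When all inputs share the same sign, regrouping leaves the order unchanged, which is consistent with the issue appearing only for mixed 1 and -1 values.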

node name fixing

The formatting of node names is done independently for the network (controlNodeIdentifiers), inputs (checkInputObj) and measurements (checkMeasObj).
If possible, this should be handled by a single function to ensure all three are formatted the same way.
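
A minimal sketch of what a shared helper could look like. The function `sanitizeNodeNames` is hypothetical, and its replacement rule (any character other than letters, digits or underscore becomes an underscore) is an assumption, not the rule used by the existing CARNIVAL checks. The toy network and node names are made up.

```r
# Hypothetical shared helper; the replacement rule is an assumption.
sanitizeNodeNames <- function(nodeNames) {
  gsub("[^A-Za-z0-9_]", "_", nodeNames)
}

# Toy network, inputs and measurements with names that need fixing.
network <- data.frame(
  source = c("TGF-b", "A/B"),
  interaction = c(1, -1),
  target = c("A/B", "SMAD2 3"),
  stringsAsFactors = FALSE
)
inputs <- c("TGF-b" = 1)
measurements <- c("SMAD2 3" = -1)

# Apply the same function everywhere so names stay consistent.
network$source <- sanitizeNodeNames(network$source)
network$target <- sanitizeNodeNames(network$target)
names(inputs) <- sanitizeNodeNames(names(inputs))
names(measurements) <- sanitizeNodeNames(names(measurements))
```

Because every object goes through the same function, an input or measurement name can always be matched back to a node in the network.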
