Giter Club home page Giter Club logo

metabolic-ep's Introduction

Metabolic-EP

Expectation Propagation algorithm for metabolic networks

Documentation Build Status Coverage
Build Status codecov

Authors:

Alfredo Braunstein, Anna Paola Muntoni and Andrea Pagnani

Description

This is an implementation of the Expectation Propagation algorithm for studying the space of solution of constrained metabolic fluxes. The main outputs of the m-function are the means and the variances of truncated Gaussian distributions that approximate the marginal probability density of observing a flux, given a stoichiometric matrix and a measure of the intakes/uptakes. This is part of the work:

"An analytic approximation of the feasible space of metabolic networks" - A. Braunstein, A. Muntoni, A. Pagnani - Nature Communications 8, Article number: 14915 (2017) - doi:10.1038/ncomms14915

There are two implementations: one in matlab (under folder matlab), the second in Julia.

Matlab Version

Input

  • S: stoichiometric matrix of "Nm metabolites" x "Nr reactions"
  • b: vector of Nm intakes/uptakes
  • nuinf, nusup: lower and upper bounds for each metabolic flux
  • Beta: inverse variance of the noise, if any. Otherwise a "large" number (ex. 1e9)
  • damp: damping coefficient (from 0 to 1) applied to the update of means "a" and variances "d" of approximating Gaussians Ex. "new a" = damp * "new a" + (1 - damp) * "old a"
  • max_iter: maximum number of iterations (ex. 1e3)
  • minvar, maxvar: lower and upper bounds for the variances "d" of the approximation. (ex. 1e-50, 1e50)
  • precision: precision required to stop the algorithm (ex. 1e-9)

Input (optional) to fix an experimental profile

  • av_exp: mean of the experimental profile
  • var_exp: variance of the experimental profile
  • exp_i: index of the measured flux If no experimental evidence is available, set av_exp = 0, var_exp = 0 and exp_i = 0.

Output

  • mu: vector parametrizing the mean of the posterior distribution
  • s: vector parametrizing the variance of the posterior distribution
  • a: vector containing the means of the approximated priors
  • d: vector containing the variances of the approximated priors
  • av: averages of the truncated Gaussians of the approximation
  • va: variances of the truncated Gaussians of the approximation
  • t: running time

Julia Version

Installing the package. Enter ] and then:

   (v.1.x)> add https://github.com/anna-pa-m/Metabolic-EP/
   julia> using MetabolicEP

Otherwise, if you do not want to use the package manager:

  • Clone the package git clone https://github.com/anna-pa-m/Metabolic-EP
  • cd Metabolic-EP
  • julia
  • Enter the pkg manager using ] and type instantiate
  • julia> using MetabolicEP

It works with version >= 1.5.

Typical usage is

julia> res=metabolicEP(S,b,numin,numax)

The output in res is of type ``EPout`: there are several fields:

  • μ::Vector: A parameter linked to the mean of the posterior probability
  • σ::Vector: A parameter linked to the std of the posterior probability
  • av::Vector: The mean posterior probability
  • va::Vector: The variance of the posterior probability
  • sol::EPFields: The internal field status. From this value we can restart the sampling from a specific state.
  • status::Symbol: either :converged or :unconverged.

Input (required)

  • S: MxN matrix (either sparse or dense) please note that if you input a dense version, the algorithm is slighlty more efficient. Dense matrices can be create from sparse ones with full(S).
  • b: a vector of M intakes/uptakes
  • nuinf: a vector of lengh N of upper bounds.
  • nusup: a vector of lengh N of lower bounds.

Input (optional argument).

  • beta (inverse temperature::Real): default 10^7; the zero temperature algorithm is run setting beta=Inf.
  • verbose (true or false): default true
  • damp (∈ (0,1) newfield = damp * oldfield + (1-damp)* newfield): default 0.9
  • epsconv (convergence criterion): default 1e-6
  • maxiter (maximum number of iterations): default 2000
  • maxvar (threshold on maximum variance): default 1e50
  • minvar (threshold on minimum variance): default 1e-50
  • solution (start from solution. Is of type EPout): default: nothing
  • expval (fix to posterior probability of mean and/or variance to values): default nothing. expval can be either at Tuple{Float64,Float64,Int} or a Vector{Tuple{Float64,Float64,Int}}. Values can be fixed as expval=(0.2,0.4,4) meaning that for flux index 4 the mean is set to 0.2 and the variance to 0.4. Fixing more values expval=[(0.2, 0.3, 4), (0.4, nothing, 5)]: in this case, we fix the posterior of flux 4 to 0.2 (mean) and 0.3 (variance), while for flux 5 we fix the mean to 0.4 and we keep the variance free.

COBRA compatibility

We developed a COBRA compatibility so that now models can be loaded with the COBRA.loadModel() utility metabolicEP can be also run passing a LPproblem type as returned by loadModel.

COBRA has not yet been updated to julia v1.0. For this reason the compatibility has been temporarily removed.

Reading matlab metabolic reconstruction (.mat files)

There is a small convenience reader for metabolic reconstructions in matlab format (.mat). It can be invoked as:

julia> met=ReadMatrix("nomefile.mat")

The output met is of type MetNet whose fields are:

  • N::Int number of fluxes
  • M::Int number of metabolites
  • S::SparseMatrixCSC{Float64,Int} Stoichiometric matrix M x N sparse
  • b::Array{Float64,1} right hand side of equation S ν = b (vector of size M)
  • c::Array{Float64,1} reaction index of biomass (vector of size N)
  • lb::Array{Float64,1} fluxes lower bound N elements vector
  • ub::Array{Float64,1} fluxes upper bound N elements vector
  • genes::Array{String,1} gene names N elements vector
  • rxnGeneMat::SparseMatrixCSC{Float64,Int}
  • grRules::Array{String,1} gene-reaction rule N elements vector of strings (and / or allowed)
  • mets::Array{String,1} metabolites short-name M elements
  • rxns::Array{String,1} reactions short-name N elements
  • metNames::Array{String,1} metabolites long-names M elements
  • metFormulas::Array{String,1} metabolites formula M elements
  • rxnNames::Array{String,1} reactions long-names N elements
  • rev::Array{Bool,1} reversibility of reactions N elements
  • subSystems::Array{String,1} cellular component of fluxes N elements

Test model (in folder data): iJR904 model for Escherichia Coli. https://doi.org/10.1093/nar/gkv1049

metabolic-ep's People

Contributors

pagnani avatar anna-pa-m avatar abraunst avatar qacwnfq avatar josepereiro avatar

Stargazers

Lei avatar Nilesh Mukherjee avatar Stefano Crotti avatar Marouen avatar  avatar  avatar Gabriel Gellner avatar Linus Schumacher avatar  avatar

Watchers

 avatar  avatar  avatar

metabolic-ep's Issues

Sampling feasible flux vectors from metabolic-EP results

Hello,
I am wondering if there may be a way to sample flux vectors (such as would result from FBA) from the results of metabolic-EP sampling. Indeed your method is much faster than other sampling algorithms but having actual flux vector samples would be useful in some applications.

In the context of my project I am trying to correct sampling results for thermodynamically infeasible fluxes. Flux vectors can be fixed using the CycleFreeFlux method, while we so far could not find a way to prevent metabolic-EP sampling results from being affected by infeasible fluxes.

I would appreciate any insights on whether you believe this may be possible or if you would advise to use alternative sampling algorithms instead.
Thank you in advance,
Bastien

Extracting flux vectors from the sampling space.

Hi,
I'm working on a project which requires me to extract a large number of flux vectors from the sampling space. I wanted to know there is any way I can do that using your tool/approach. Are there any other tools that could do that for me other than optGpSampler?
Any guidance will be much appreciated.

Thanks!

Question about Humongous outputs 'a' generated from MetabolicEP

Dear Dr. Muntoni,

This is Kao Wei Chen from the Maastricht Centre for Systems Biology (MaCSBio) at Maastricht University, Netherlands. We are currently using the analytic flux space approximation method that you published in Nature Communications in 2017, and came across some questions that we hope you can help us with. We also already emailed Prof. Braunstein in this regard, but figured we might also ask you directly on GitHub.

The problem we came across lies in humongous output values being returned by the “metabolicEP” function in the output variable ‘a’. From tracing back the execution of the code, it seems the origin of these humongous output values lies in the update statement in line 98 of metabolicEP.m:

new_a = av + (av – mu) .* new_d .* s1

It turns out new_d occasionally becomes extremely large, 1e50. In our calculation for a human-size network, this happened 162 times. Out of these, in 160 cases the term av – mu was exactly zero, but in two cases it did not exactly cancel out, leaving a small but nonzero result of 1e-19; giving a product (av - mu) .* new_d that still comes out very large, thus presumably propagating to the final result.
As new_d is capped by maxvar in line 97, we have been able to circumvent the issue by decreasing maxvar from 1e50 to 1e19, which prevented this blowup. However, we are now wondering if this “runaway” new_d can happen with certain models (we observed the blowup with several models), or if this should not actually occur.

In this context, we were also wondering about the line 96, which gives new_d as 1./(1./va – 1./s). In our calculation, there are some values of va and s that are the the same, if that occurs new_d would become infinite and get capped by maxvar in the next line. We were wondering whether this was in fact expected behavior for which we can simply decrease maxvar if problems occur, or if something might in fact have gone wrong if new_d became infinite to begin wit

Best regards,

Wei Chen, Kao

Covariance matrix from EP sampling

Hello,
I have been trying out your Metabolic-EP code and I hope you can help me with what I’m trying to do. Specifically, I want to use the covariance matrix for a principal component analysis. So, I would like to know if the covariance matrix as written out by your code is calculated after already performing mean subtraction on the distribution. if not, can this be implemented somehow?
Thank you in advance,

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.