csvoss / syndra
Convert high-level facts about biology into an executable model, using logical deduction (the name is a pun on synthesis + INDRA)
Finish coverage of Predicate and AtomicPredicate, and add tests for whatever else seems useful as well.
In particular, to help me debug check_sat.
This should take the form of a test case that asserts the first two predicates, gets a model, then checks the satisfiability of the third predicate on the resulting model.
I think this would be a better abstraction.
Then I could add Syndra predicates, push, and pop; solver.py would maintain the state of the z3 solver it's using, and I would be able to do things like

    solver.add(first_predicate)   # any Syndra predicate
    solver.get_model()
    solver.push()
    solver.add(second_predicate)  # another Syndra predicate, scoped to this push
    solver.get_model()
    solver.pop()

and have that all do the right thing.
Maybe this is what my solver was supposed to do all along, and now I'm just realizing it!
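A minimal sketch of the push/pop bookkeeping described above. It does not use z3 or any Syndra code: `StackSolver`, its boolean variables, and the brute-force `get_model` are all hypothetical stand-ins, and predicates are plain Python callables on a model dict. The only point is to show the intended scoping, where assertions added after a push disappear on the matching pop.

```python
from itertools import product


class StackSolver:
    """Toy stand-in for a stateful solver (hypothetical, not the solver.py API)."""

    def __init__(self, variables):
        self.variables = list(variables)
        self.frames = [[]]  # one list of predicates per push level

    def add(self, predicate):
        # New assertions always go into the innermost frame.
        self.frames[-1].append(predicate)

    def push(self):
        self.frames.append([])

    def pop(self):
        if len(self.frames) == 1:
            raise IndexError("pop without matching push")
        self.frames.pop()

    def get_model(self):
        """Return the first boolean assignment satisfying every active predicate, or None."""
        active = [p for frame in self.frames for p in frame]
        for values in product([False, True], repeat=len(self.variables)):
            model = dict(zip(self.variables, values))
            if all(p(model) for p in active):
                return model
        return None


solver = StackSolver(["kinase_active", "substrate_phosphorylated"])
solver.add(lambda m: m["kinase_active"])
base_model = solver.get_model()

solver.push()
solver.add(lambda m: not m["kinase_active"])  # contradicts the base assertion
scoped_model = solver.get_model()             # no assignment satisfies both
solver.pop()

restored_model = solver.get_model()           # the contradiction is gone again
```

A real version would presumably translate each Syndra predicate to a z3 formula inside `add` and delegate the stack to the underlying solver rather than re-enumerating assignments.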
(from a conversation with Hector)
For predicate.py:
- ExistsNode
- Something that will be powerful enough to allow us to encode that if a site is labeled ThreoninePhosphorylated, then it must be labeled Phosphorylated and also be labeled HasThreonine
For structure.py:
- with_parent
- "NOT" versions of all of the structure.py methods
- Allow multiple labels and choosing from the powerset of possible labels.
Introduction + proposed work.
(description copypasted from Dropbox Paper)
Currently, from the following Syndra predicate…
    ModelHasRule(lambda r: And(
        PregraphHas(r, kinase.labeled(active)),
        PregraphHas(r, substrate),
        PostgraphHas(r, kinase.labeled(active)),
        PostgraphHas(r, substrate.labeled(phosphate)),
        Not(PregraphHas(r, substrate.labeled(phosphate))),
    ))
…we get the following model:
[Rule({substrate with links to kinase; kinase-(active, phosphate) with links to substrate} -> {substrate-(active, phosphate) with links to kinase; kinase-(active, phosphate) with links to substrate})]
This is a list consisting of one rule, which says that substrate bound to active-phosphorylated kinase becomes active-phosphorylated substrate bound to active-phosphorylated kinase, technically satisfying the Syndra predicate even though there are unnecessary labels and bindings.
It’s even worse with the model we get from Walter’s example:
[Rule({RAF-(GTP, phosphate) with links to ERK1, HRAS, MEK1, SAF1; ERK1 with links to RAF, HRAS, MEK1, SAF1; HRAS with links to RAF, ERK1, MEK1, SAF1; MEK1 with links to RAF, ERK1, HRAS, SAF1; SAF1 with links to RAF, ERK1, HRAS, MEK1} -> {RAF-(GTP, phosphate) with links to ERK1, ERK1, HRAS, HRAS, MEK1, MEK1, SAF1, SAF1; ERK1-(GTP, phosphate) with links to RAF, RAF, HRAS, HRAS, MEK1, MEK1, SAF1, SAF1; HRAS-(GTP, phosphate) with links to RAF, RAF, ERK1, ERK1, MEK1, MEK1, SAF1, SAF1; MEK1-(GTP, phosphate) with links to RAF, RAF, ERK1, ERK1, HRAS, HRAS, SAF1, SAF1; SAF1-(GTP, phosphate) with links to RAF, RAF, ERK1, ERK1, HRAS, HRAS, MEK1, MEK1}),
 Rule({RAF with links to ERK1, HRAS, MEK1, SAF1; ERK1-(GTP, phosphate) with links to RAF, HRAS, MEK1, SAF1; HRAS with links to RAF, ERK1, MEK1, SAF1; MEK1 with links to RAF, ERK1, HRAS, SAF1; SAF1 with links to RAF, ERK1, HRAS, MEK1} -> {RAF with links to ERK1, HRAS, MEK1, SAF1; ERK1 with links to RAF, HRAS, MEK1, SAF1; HRAS with links to RAF, ERK1, MEK1, SAF1; MEK1 with links to RAF, ERK1, HRAS, SAF1; SAF1 with links to RAF, ERK1, HRAS, MEK1}),
 Rule({RAF-(GTP, phosphate) with links to ERK1, HRAS, MEK1, SAF1; ERK1-(GTP, phosphate) with links to RAF, HRAS, MEK1, SAF1; HRAS-(GTP, phosphate) with links to RAF, ERK1, MEK1, SAF1; MEK1-(GTP, phosphate) with links to RAF, ERK1, HRAS, SAF1; SAF1-(GTP, phosphate) with links to RAF, ERK1, HRAS, MEK1} -> {RAF-(GTP, phosphate) with links to ERK1, HRAS, MEK1, SAF1; ERK1-(GTP, phosphate) with links to RAF, HRAS, MEK1, SAF1; HRAS-(GTP, phosphate) with links to RAF, ERK1, MEK1, SAF1; MEK1-(GTP, phosphate) with links to RAF, ERK1, HRAS, SAF1; SAF1-(GTP, phosphate) with links to RAF, ERK1, HRAS, MEK1})]
There are two ways to fix these unnecessary labels:
We must make it so that we can correctly implement the statement “active Enzyme phosphorylates Substrate at site S222”. In this statement, the enzyme is active, so it’s probably bound to some agent we don’t know about yet that activates it. This is an argument for implementing a minimizer over implementing more Syndra predicates: an implementation of “active enzyme phosphorylates substrate @ S222” must permit the extension of the rule when more statements clarifying “active” are added, while also remaining minimal in the case when no such statements are added. This cannot be done with extra Syndra predicates, because if we try to make the rule stay minimal by requiring it be bound to no other extra things, then it will not be able to be extended by new statements.
However, if we’re trying to minimize the number of rules, maybe we’ll end up with two statements “A+B→C” and “D→E” being combined by Syndra into the single rule “A+B+D→C+E”, so we still want some way of saying that the rule “A+B→C” does not involve any agents we haven’t mentioned yet. This is an argument for implementing more Syndra predicates over implementing a minimizer.
This is a puzzle.
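To make the minimizer side of the argument concrete, here is a rough sketch. It assumes a heavy simplification that is not how Syndra represents rules: a rule is flattened to a set of atoms such as "pre:kinase.active", and a predicate is just a set of required atoms plus a set of forbidden ones. The names `UNIVERSE`, `satisfies`, and `minimal_rule` are made up for illustration. It shows that a bloated rule satisfies the predicate, that a smallest-first search returns only what was asked for, and that a later statement can still extend the rule.

```python
from itertools import combinations

# Hypothetical flattened view of a rule: pregraph/postgraph label atoms only.
UNIVERSE = [
    "pre:kinase.active", "pre:kinase.phosphate",
    "pre:substrate.active", "pre:substrate.phosphate",
    "post:kinase.active", "post:kinase.phosphate",
    "post:substrate.active", "post:substrate.phosphate",
]


def satisfies(rule, required, forbidden):
    """A rule satisfies the predicate if it has every required atom and no forbidden one."""
    return required <= rule and not (forbidden & rule)


def minimal_rule(required, forbidden):
    """Try candidate rules from smallest to largest and return the first that satisfies."""
    for size in range(len(UNIVERSE) + 1):
        for atoms in combinations(UNIVERSE, size):
            rule = frozenset(atoms)
            if satisfies(rule, required, forbidden):
                return rule
    return None


required = frozenset({
    "pre:kinase.active", "post:kinase.active", "post:substrate.phosphate",
})
forbidden = frozenset({"pre:substrate.phosphate"})

# A solver is free to return something like this: valid, but full of extra labels.
bloated = frozenset(UNIVERSE) - forbidden
smallest = minimal_rule(required, forbidden)

# A later statement clarifying "active" adds a requirement; the rule grows to match.
extended = minimal_rule(required | {"pre:kinase.phosphate"}, forbidden)
```

This only addresses the first argument. It says nothing about the second one, where minimizing the number of rules could merge unrelated statements into a single rule.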
Ideas:
to contain macros and other nuggets of biological information
I added Syndra to PyPI, and `pip search syndra` works, but `pip install syndra` doesn't work yet. Not sure why.
This should be a port-over of `statements_to_predicates.py`. It should actually work as-is – the INDRA pattern matching hasn't changed, and `statements_to_predicate.py` interfaces with Syndra via `macros.py`, whose interface hasn't changed.
Possibly even as a library all on its own.
For example, we should allow the user to make labels like SerinePhosphorylated, from which we can infer that a site labeled SerinePhosphorylated would also count as being labeled Phosphorylated as well as Serine.
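One possible way to get this inference is a table of label implications plus a closure over it. The sketch below is an assumption about how it could look, not existing Syndra code: `IMPLIES`, `implied_labels`, and `counts_as` are hypothetical names, and only the label names come from the issues on this page.

```python
# Hypothetical implication table: each label maps to the labels it entails.
IMPLIES = {
    "SerinePhosphorylated": {"Phosphorylated", "Serine"},
    "ThreoninePhosphorylated": {"Phosphorylated", "HasThreonine"},
}


def implied_labels(labels, implies=IMPLIES):
    """Return the given labels plus everything they transitively imply."""
    closed = set(labels)
    frontier = list(labels)
    while frontier:
        label = frontier.pop()
        for parent in implies.get(label, ()):
            if parent not in closed:
                closed.add(parent)
                frontier.append(parent)
    return closed


def counts_as(site_labels, wanted):
    """True if a site carrying site_labels also counts as carrying the wanted label."""
    return wanted in implied_labels(site_labels)
```

The implication only runs one way: a site labeled Phosphorylated does not count as SerinePhosphorylated.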
This idea came from a conversation with Hector. A syntax more like Kappa would be better than the conventions I came up with for printing models.
Now that I can pretty-print my z3 models, I see that the following Syndra predicate:
    ModelHasRule(lambda r: And(
        PregraphHas(r, kinase.labeled(active)),
        PregraphHas(r, substrate),
        PostgraphHas(r, kinase.labeled(active)),
        PostgraphHas(r, substrate.labeled(phosphate)),
    )).get_python_model()
produces the following model:
[Rule({kinase-(active, phosphate) with links to kinase} -> {kinase-(active, phosphate) with links to kinase})]
This is in error; the substrate does not become phosphorylated. (Maybe it's incorrectly merging substrate and kinase into the same agent?) Fix this.
This should help with the Occam's Razor bug.
Subtasks (woah, these checkboxes show up on the issue page!):
Given the context from Walter's example (HRAS-MEK-ERK pathway + HRAS is not a kinase), create a demo that is able to infer both (a) that a causal gap with an unboxed statement exists, and (b) that there are a few ways to close the causal gap:
The goal here is that we can take in a whole pile of statements, have Syndra automatically unbox some of them using the info we get from other statements, and then at the end tell the user if there are any loose ends – unboxed statements still unboxed – remaining.
e.g. if A activates B via a series of steps, prove that we only have to look at finitely many steps (since there are only finitely many rules) in order to show the deduction.
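A rough sketch of why the search can stop after finitely many steps, under a strong simplifying assumption: each rule is reduced to a single "X directly activates Y" pair. The function `activation_chain` and the example pairs are hypothetical and not taken from Walter's actual statements. Since a shortest chain never needs to use the same rule twice, its length is bounded by the number of rules.

```python
from collections import deque


def activation_chain(direct, source, target):
    """Return a shortest chain of direct activations from source to target, or None.

    `direct` is a list of (activator, activated) pairs, one per rule.
    """
    max_steps = len(direct)  # a shortest chain never reuses a rule
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        agent, chain = queue.popleft()
        if agent == target:
            return chain
        if len(chain) == max_steps:
            continue  # nothing longer than this can be a shortest chain
        for activator, activated in direct:
            if activator == agent and activated not in seen:
                seen.add(activated)
                queue.append((activated, chain + [(activator, activated)]))
    return None


# Illustrative pairs only; the agent names are borrowed from the model output above.
direct = [("HRAS", "RAF"), ("RAF", "MEK1"), ("MEK1", "ERK1")]
chain = activation_chain(direct, "HRAS", "ERK1")
no_chain = activation_chain(direct, "ERK1", "HRAS")
```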
This has actually been a problem for a while, but it's more annoying now. I need to figure out the intended way to import INDRA. Fixing this may involve making a pull request to INDRA.
I cannot do something like this:

    from indra.indra import trips

because the first `indra` is not a package (it lacks an `__init__.py`). Even supposing that I fix that, I run into issues where stuff in INDRA assumes that we'll be doing imports a la `import indra.whatever`, which breaks things when I'm instead doing `indra.indra.whatever`.
It would be nice if I could simply fix this by using the inner `indra` subfolder of the outer `indra` folder as my module, but I can't do that because the inner `indra` has dependencies on the folder `data`, which is in the outer `indra` folder. This feels like an abstraction violation, so I'll see if I can fix it.
Also to consider: looking into whether `indra` is pip-installable; then we could get rid of the submodule altogether.
Either make the system managing variables and labels so that we can specify `Forall("x", pred)` without having to write a variable for x – basically, nice syntactic sugar – or give up on the syntactic sugar and just take variables as inputs.
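One way the sugar could work is to have the quantifier take a function and create the fresh variable itself, in the same lambda style that ModelHasRule already uses in the predicates quoted in the issues above. The sketch below is a guess at that shape, not Syndra's implementation: `Var`, `Forall`, the counter-based naming, and the tuple-shaped bodies are all hypothetical.

```python
from itertools import count

_fresh_ids = count()  # shared counter so every generated variable name is unique


class Var:
    """A bound variable with an automatically generated name."""

    def __init__(self, hint="x"):
        self.name = f"{hint}_{next(_fresh_ids)}"

    def __repr__(self):
        return self.name


class Forall:
    """Universal quantifier whose body is given as a function of a fresh variable."""

    def __init__(self, body, hint="x"):
        self.var = Var(hint)        # created here, so the caller never declares it
        self.body = body(self.var)  # body may itself contain nested quantifiers

    def __repr__(self):
        return f"Forall({self.var}, {self.body})"


# Bodies are plain tuples here, standing in for real predicate objects.
labeled = Forall(lambda x: ("Labeled", x, "active"))
nested = Forall(lambda x: Forall(lambda y: ("Linked", x, y), hint="y"))
```

Because each quantifier draws its own name from the counter, nested quantifiers cannot capture each other's variables by accident.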