syndra's People

Contributors

bgyori, csvoss, jeanqasaur

syndra's Issues

More more tests!

Finish coverage of Predicate and AtomicPredicate, and add tests for whatever else seems useful as well.

More tests!

In particular, to help me debug check_sat.

Make solver.py behave like z3's solver, but for Syndra predicates instead of z3 predicates

I think this would be a better abstraction.

Then I could add Syndra predicates, push, and pop, and solver.py would maintain the state of the z3 solver it's using, and I would be able to do stuff like

solver.add(Syndra predicate)
solver.get_model()
solver.push()
solver.add(Syndra predicate)
solver.get_model()
solver.pop()

and have that all do the right thing.

Maybe this is what my solver was supposed to do all along, and now I'm just realizing it!
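
A minimal sketch of what such a wrapper might look like. It assumes each Syndra predicate can lower itself to a backend formula through a no-argument get_predicate() method (the real signature may differ), and that the backend offers add/push/pop/check/model the way z3.Solver does. RecordingBackend and NamedPredicate are hypothetical stand-ins so the example runs without z3.

```python
class SyndraSolver:
    """Sketch: expose add/push/pop/get_model for Syndra-style predicates."""

    def __init__(self, backend):
        # `backend` is anything with add/push/pop/check/model, such as z3.Solver().
        self._backend = backend
        self._frames = [[]]  # predicates added at each push level

    def add(self, predicate):
        self._frames[-1].append(predicate)
        # Assumption: a predicate lowers itself to a backend formula.
        self._backend.add(predicate.get_predicate())

    def push(self):
        self._frames.append([])
        self._backend.push()

    def pop(self):
        if len(self._frames) == 1:
            raise IndexError("pop() without a matching push()")
        self._frames.pop()
        self._backend.pop()

    def get_model(self):
        # z3's check() result prints as "sat" when a model exists.
        if str(self._backend.check()) != "sat":
            return None
        return self._backend.model()


class RecordingBackend:
    """Stand-in for a real solver so the sketch runs without z3 installed."""

    def __init__(self):
        self._stack = [[]]

    def add(self, formula):
        self._stack[-1].append(formula)

    def push(self):
        self._stack.append([])

    def pop(self):
        self._stack.pop()

    def check(self):
        return "sat"

    def model(self):
        # The "model" here is just the list of formulas currently asserted.
        return [f for frame in self._stack for f in frame]


class NamedPredicate:
    """Hypothetical predicate whose lowered form is just its name."""

    def __init__(self, name):
        self.name = name

    def get_predicate(self):
        return self.name


solver = SyndraSolver(RecordingBackend())
solver.add(NamedPredicate("kinase is active"))
outer_model = solver.get_model()
solver.push()
solver.add(NamedPredicate("substrate is phosphorylated"))
inner_model = solver.get_model()
solver.pop()
restored_model = solver.get_model()
```

After pop(), the second predicate is gone from both the wrapper's own bookkeeping and the backend, which is the "do the right thing" behavior described above.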

More features in structure.py and predicate.py

(from a conversation with Hector)

For predicate.py:

  • ExistsNode
  • Something powerful enough to let us encode that if a site is labeled ThreoninePhosphorylated, then it must also be labeled Phosphorylated and HasThreonine

For structure.py:

  • with_parent
  • "NOT" versions of all of the structure.py methods

Solve Occam's Razor problem

(description copy-pasted from Dropbox Paper)

Currently, from the following Syndra predicate…

ModelHasRule(lambda r: And(
    PregraphHas(r, kinase.labeled(active)),
    PregraphHas(r, substrate),
    PostgraphHas(r, kinase.labeled(active)),
    PostgraphHas(r, substrate.labeled(phosphate)),
    Not(PregraphHas(r, substrate.labeled(phosphate))),
))

…we get the following model:

[Rule({substrate with links to kinase; kinase-(active, phosphate) with links to substrate} -> {substrate-(active, phosphate) with links to kinase; kinase-(active, phosphate) with links to substrate})]

This is a list consisting of one rule, which says that substrate bound to active-phosphorylated kinase becomes active-phosphorylated substrate bound to active-phosphorylated kinase, technically satisfying the Syndra predicate even though there are unnecessary labels and bindings.

It’s even worse with the model we get from Walter’s example:

[Rule({RAF-(GTP, phosphate) with links to ERK1, HRAS, MEK1, SAF1; ERK1 with links to RAF, HRAS, MEK1, SAF1; HRAS with links to RAF, ERK1, MEK1, SAF1; MEK1 with links to RAF, ERK1, HRAS, SAF1; SAF1 with links to RAF, ERK1, HRAS, MEK1} -> {RAF-(GTP, phosphate) with links to ERK1, ERK1, HRAS, HRAS, MEK1, MEK1, SAF1, SAF1; ERK1-(GTP, phosphate) with links to RAF, RAF, HRAS, HRAS, MEK1, MEK1, SAF1, SAF1; HRAS-(GTP, phosphate) with links to RAF, RAF, ERK1, ERK1, MEK1, MEK1, SAF1, SAF1; MEK1-(GTP, phosphate) with links to RAF, RAF, ERK1, ERK1, HRAS, HRAS, SAF1, SAF1; SAF1-(GTP, phosphate) with links to RAF, RAF, ERK1, ERK1, HRAS, HRAS, MEK1, MEK1}), Rule({RAF with links to ERK1, HRAS, MEK1, SAF1; ERK1-(GTP, phosphate) with links to RAF, HRAS, MEK1, SAF1; HRAS with links to RAF, ERK1, MEK1, SAF1; MEK1 with links to RAF, ERK1, HRAS, SAF1; SAF1 with links to RAF, ERK1, HRAS, MEK1} -> {RAF with links to ERK1, HRAS, MEK1, SAF1; ERK1 with links to RAF, HRAS, MEK1, SAF1; HRAS with links to RAF, ERK1, MEK1, SAF1; MEK1 with links to RAF, ERK1, HRAS, SAF1; SAF1 with links to RAF, ERK1, HRAS, MEK1}), Rule({RAF-(GTP, phosphate) with links to ERK1, HRAS, MEK1, SAF1; ERK1-(GTP, phosphate) with links to RAF, HRAS, MEK1, SAF1; HRAS-(GTP, phosphate) with links to RAF, ERK1, MEK1, SAF1; MEK1-(GTP, phosphate) with links to RAF, ERK1, HRAS, SAF1; SAF1-(GTP, phosphate) with links to RAF, ERK1, HRAS, MEK1} -> {RAF-(GTP, phosphate) with links to ERK1, HRAS, MEK1, SAF1; ERK1-(GTP, phosphate) with links to RAF, HRAS, MEK1, SAF1; HRAS-(GTP, phosphate) with links to RAF, ERK1, MEK1, SAF1; MEK1-(GTP, phosphate) with links to RAF, ERK1, HRAS, SAF1; SAF1-(GTP, phosphate) with links to RAF, ERK1, HRAS, MEK1})]

There are two ways to fix these unnecessary labels:

  • Implement a minimizer: minimize the number of rules, links, and labels.
  • Implement some Syndra predicates to allow the user to require that links or labels not exist.

We must make it so that we can correctly implement the statement “active Enzyme phosphorylates Substrate at site S222”. In this statement, the enzyme is active, so it’s probably bound to some agent we don’t know about yet that activates it. This is an argument for implementing a minimizer over implementing more Syndra predicates: an implementation of “active enzyme phosphorylates substrate @ S222” must permit the extension of the rule when more statements clarifying “active” are added, while also remaining minimal in the case when no such statements are added. This cannot be done with extra Syndra predicates, because if we try to make the rule stay minimal by requiring it be bound to no other extra things, then it will not be able to be extended by new statements.

However, if we’re trying to minimize the number of rules, maybe we’ll end up with two statements “A+B→C” and “D→E” being combined by Syndra into the single rule “A+B+D→C+E”, so we still want some way of saying that the rule “A+B→C” does not involve any agents we haven’t mentioned yet. This is an argument for implementing more Syndra predicates over implementing a minimizer.

This is a puzzle.

Ideas:

  • Implement some combination of both? This is a lot of work.
  • Make the minimizer ignore the number of rules? Then it still might spuriously merge rules together.
  • Make the minimizer maximize the number of rules? ← probably a bad plan
  • Make PregraphHas take in all of the agents/structures, and make it require that no other structures besides those be present unless they are linked; then make linkage way more costly than splitting into separate rules? This works, but it’s weird; why privilege linkage? Surely something must break this.
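
To make the tension concrete, here is a rough sketch of a cost function a minimizer could use. RuleSize, model_cost, pick_minimal, and the weights are all hypothetical names for illustration, not part of Syndra. It shows that counting rules favors the merged rule, and that ignoring the rule count only turns that preference into a tie.

```python
from collections import namedtuple

# Hypothetical summary of one rule: how many labels and links it mentions.
RuleSize = namedtuple("RuleSize", ["labels", "links"])


def model_cost(rules, rule_weight=1, label_weight=1, link_weight=1):
    """Weighted count of rules, labels, and links in a candidate model."""
    return (rule_weight * len(rules)
            + label_weight * sum(r.labels for r in rules)
            + link_weight * sum(r.links for r in rules))


def pick_minimal(candidates, **weights):
    """Return the cheapest candidate; ties go to the earliest one listed."""
    return min(candidates, key=lambda rules: model_cost(rules, **weights))


# "A+B→C" and "D→E" as two rules, versus the merged "A+B+D→C+E".
# Label and link counts are set to zero purely to isolate the rule count.
separate = [RuleSize(labels=0, links=0), RuleSize(labels=0, links=0)]
merged = [RuleSize(labels=0, links=0)]

cost_separate = model_cost(separate)                          # 2
cost_merged = model_cost(merged)                              # 1
cost_separate_no_rules = model_cost(separate, rule_weight=0)  # 0
cost_merged_no_rules = model_cost(merged, rule_weight=0)      # 0
```

With rule_weight=1 the merged model wins outright. With rule_weight=0 the two models tie, so nothing in the cost prevents the solver from returning the merged one.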

Fix INDRA integration to work with new engine

This should be a port-over of statements_to_predicates.py. It should actually work as-is: the INDRA pattern matching hasn't changed, and statements_to_predicate.py interfaces with Syndra via macros.py, whose interface hasn't changed.

Extra functionality for refinements over labels

For example, we should allow the user to define labels like SerinePhosphorylated, from which we can infer that a site labeled SerinePhosphorylated also counts as being labeled both Phosphorylated and Serine.
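
One possible shape for this, sketched in plain Python rather than as solver constraints: a table mapping each refined label to the labels it implies, plus a transitive closure over that table. REFINEMENTS, implied_labels, and counts_as are hypothetical names, not existing Syndra API.

```python
# Hypothetical refinement table: each label maps to the labels it implies.
REFINEMENTS = {
    "SerinePhosphorylated": {"Phosphorylated", "Serine"},
    "ThreoninePhosphorylated": {"Phosphorylated", "HasThreonine"},
}


def implied_labels(label, refinements=REFINEMENTS):
    """Return `label` plus everything it transitively implies."""
    seen = set()
    pending = [label]
    while pending:
        current = pending.pop()
        if current in seen:
            continue
        seen.add(current)
        pending.extend(refinements.get(current, ()))
    return seen


def counts_as(site_labels, wanted, refinements=REFINEMENTS):
    """True if any label on the site is, or implies, the wanted label."""
    return any(wanted in implied_labels(label, refinements)
               for label in site_labels)


serine_closure = implied_labels("SerinePhosphorylated")
site_is_phosphorylated = counts_as({"SerinePhosphorylated"}, "Phosphorylated")
site_has_threonine = counts_as({"SerinePhosphorylated"}, "HasThreonine")
```

In the solver, the same table would presumably be turned into implications between label assertions. The closure above only illustrates the inference the user should get.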

Debug system which should produce phosphorylation rule

Now that I can pretty-print my z3 models, I see that the following Syndra predicate:

ModelHasRule(lambda r: And(
    PregraphHas(r, kinase.labeled(active)),
    PregraphHas(r, substrate),
    PostgraphHas(r, kinase.labeled(active)),
    PostgraphHas(r, substrate.labeled(phosphate)),
)).get_python_model()

produces the following model:

[Rule({kinase-(active, phosphate) with links to kinase} -> {kinase-(active, phosphate) with links to kinase})]

This is wrong: the substrate does not become phosphorylated. (Maybe substrate and kinase are being incorrectly merged into the same agent?) Fix this.

Change Node to be an enum

This should help with the Occam's Razor bug.

Subtasks (whoa, these checkboxes show up on the issue page!):

  • Modify predicate.py: simplify API, move extra stuff to solver.py
  • Move pythonize.py to solver.py
  • Move datatypes.py to solver.py
  • Make it so that solver.py keeps track of the nodes that should be in the enum, and calls predicate's get_predicate method with that enum provided
  • Modify structure.py to use the enum
  • Modify predicate.py to pass the enum down to predicate.py
  • Modify pythonize.py (now in solver) to be compatible with the enum
  • Try adding in the edge assertions again, now that everything's been cleaned up
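
A rough sketch of the bookkeeping in the subtasks above, using Python's standard enum module as a stand-in for whatever enumeration the solver side ends up using. NodeRegistry and HasNode are hypothetical names, and passing the enum as the sole argument to get_predicate is an assumption.

```python
import enum


class NodeRegistry:
    """Collects node names, then freezes them into a closed enumeration."""

    def __init__(self):
        self._names = []

    def register(self, name):
        # Registering the same name twice must not create a second node.
        if name not in self._names:
            self._names.append(name)

    def build_enum(self):
        # Functional Enum API: one distinct member per registered name.
        return enum.Enum("Node", self._names)


class HasNode:
    """Hypothetical predicate that needs the enum to resolve its node."""

    def __init__(self, name):
        self.name = name

    def get_predicate(self, node_enum):
        return node_enum[self.name]


registry = NodeRegistry()
for name in ["kinase", "substrate", "kinase"]:
    registry.register(name)

Node = registry.build_enum()
kinase_node = HasNode("kinase").get_predicate(Node)
substrate_node = HasNode("substrate").get_predicate(Node)
```

Because enum members are pairwise distinct by construction, kinase and substrate can no longer collapse into one node, which is the property this change is meant to provide.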

Causality demo with Walter's example

Given the context from Walter's example (HRAS-MEK-ERK pathway + HRAS is not a kinase), create a demo that is able to infer both (a) that a causal gap with an unboxed statement exists, and (b) that there are a few ways to close the causal gap:

  • MEK1P phosphorylates SAF1, or
  • ERK1P phosphorylates SAF1, or
  • RAF, when bound to HRAS-GTP, phosphorylates SAF1

The goal here is that we can take in a whole pile of statements, have Syndra automatically unbox some of them using the info we get from other statements, and then at the end tell the user if there are any loose ends – unboxed statements still unboxed – remaining.

Re-check atomic_predicate.py

  • Do the atomic predicates still work now that sets of nodes and sets of edges are implemented as arrays, not functions?

Fix importing of INDRA dependencies

This has actually been a problem for a while, but it's more annoying now. I need to figure out the intended way to import INDRA. Fixing this may involve making a pull request to INDRA.

I cannot do something like this:

from indra.indra import trips

because the first indra is not a module (it lacks an __init__.py). Even supposing that I fix that, I run into issues where stuff in INDRA assumes that we'll be doing imports a la import indra.whatever, which breaks things when I'm instead doing indra.indra.whatever.

It would be nice if I could simply fix this by using the inner indra subfolder of the outer indra folder as my module, but I can't do that because the inner indra has dependencies on the folder data, which is in the outer indra folder. This feels like an abstraction violation, so I'll see if I can fix it.

Also to consider: looking into whether indra is pip-installable; then we could get rid of the submodule altogether.
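
As a stopgap, one common workaround is to put the outer folder on sys.path so that the inner folder is importable under its own name and its layout stays intact. The sketch below demonstrates this on a throwaway directory tree that mimics the outer/inner layout. demo_pkg and add_checkout_to_path are made-up names, and this has not been checked against INDRA itself.

```python
import importlib
import os
import sys
import tempfile


def add_checkout_to_path(checkout_root):
    """Put the outer checkout folder on sys.path so its inner package imports."""
    root = os.path.abspath(checkout_root)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


# Build a throwaway layout: outer folder (no __init__.py) containing an
# inner package of the same name, as in the situation described above.
with tempfile.TemporaryDirectory() as tmp:
    outer = os.path.join(tmp, "demo_pkg")
    inner = os.path.join(outer, "demo_pkg")
    os.makedirs(inner)
    open(os.path.join(inner, "__init__.py"), "w").close()
    with open(os.path.join(inner, "trips.py"), "w") as handle:
        handle.write("NAME = 'trips'\n")

    root = add_checkout_to_path(outer)
    importlib.invalidate_caches()
    trips = importlib.import_module("demo_pkg.trips")
    loaded_name = trips.NAME

    # Undo the path change so the demo leaves sys.path as it found it.
    sys.path.remove(root)
```

With the outer folder on the path, imports of the form import demo_pkg.whatever resolve to the inner package, and files that sit beside it in the outer folder stay where the inner package expects them.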

Implement Forall and Exists in predicate.py

Either build a system for managing variables and labels so that we can specify Forall("x", pred) without having to declare a variable for x (basically, nice syntactic sugar), or give up on the syntactic sugar and just take variables as inputs.
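
A possible middle ground, sketched with a toy formula representation: let the quantifier take a function, in the same style as ModelHasRule(lambda r: ...), and have it create a fresh variable behind the scenes. Var, fresh_var, Quantifier, and the tuple-based "Linked" body are hypothetical and not tied to z3.

```python
import itertools
from dataclasses import dataclass

_fresh_ids = itertools.count()


@dataclass(frozen=True)
class Var:
    name: str


def fresh_var(prefix):
    """Create a variable whose name cannot clash with earlier ones."""
    return Var("%s!%d" % (prefix, next(_fresh_ids)))


class Quantifier:
    kind = None

    def __init__(self, body_fn, prefix="x"):
        # The caller never declares the variable; it is created here and
        # handed to body_fn, which returns the quantified body.
        self.var = fresh_var(prefix)
        self.body = body_fn(self.var)

    def __repr__(self):
        return "%s(%s, %r)" % (self.kind, self.var.name, self.body)


class Forall(Quantifier):
    kind = "Forall"


class Exists(Quantifier):
    kind = "Exists"


# Nested quantifiers: each one gets its own fresh variable.
formula = Forall(lambda x: Exists(lambda y: ("Linked", x, y), prefix="y"))
```

The trade-off is that the variable name is chosen by the system rather than by the user, so Forall("x", pred) with a user-chosen string would still need a separate lookup mechanism.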
