opencog / agi-bio
Genomic and Proteomic data exploration and pattern mining
License: Other
The escape character in this term name is causing loading of the scheme file to choke, and then the rest of the atoms following are not loaded:
(EvaluationLink
  (PredicateNode "GO_name")
  (ListLink
    (ConceptNode "GO:0033942")
;   (ConceptNode "4-alpha-D-\{(1->4)-alpha-D-glucano}trehalose trehalohydrolase activity")
    (ConceptNode "4-alpha-D-{(1->4)-alpha-D-glucano}trehalose trehalohydrolase activity")
  )
)
Also, the latest version of GO.scm checked into the repo has trailing spaces after some of the GO term ConceptNode names. This was causing problems in inheritance mining and will also cause problems with reasoning. I regenerated the scheme file by running the python script, and the trailing spaces are no longer there. Perhaps the repo version was generated from an older version of the python script?
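Both problems above (the stray backslash escape and the trailing spaces) could be handled where the ConceptNode names are written out. The following is only a sketch, not the code in the repo's python script: the helper names are hypothetical, and it assumes that a backslash in the source name is an OBO-style escape that can simply be dropped before the name is re-escaped for Scheme.

```python
import re


def clean_obo_name(raw):
    """Normalize a term name read from the source file.

    Assumption: a backslash followed by a character (for example "\\{")
    is an escape for that character, so the backslash is dropped.
    Leading and trailing whitespace is removed as well.
    """
    unescaped = re.sub(r"\\(.)", r"\1", raw)
    return unescaped.strip()


def scheme_string(text):
    """Quote text as a Scheme string literal, escaping only \\ and "."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def concept_node(raw_name):
    """Build the ConceptNode expression for a raw term name."""
    return "(ConceptNode " + scheme_string(clean_obo_name(raw_name)) + ")"
```

With these helpers, a name containing `\{` and a trailing space would be written with a plain `{` and no trailing whitespace, so the file no longer contains an escape sequence that the Scheme reader rejects.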
Please add either to the knowledge-import readme or to a github project wiki page a list of the imported knowledge bases that includes the name of the associated python script, name of the atom scheme file, and link to the source.
Please make sure to update the list when new KB's are added to the bio atomspace.
This issue encompasses exporting the models + their fitness semantics, that is what the score of a model means, and how this model relates to the target feature.
Load the MSigDB atoms on a cogserver running on the Hetzner server and make available for people to explore using the atomspace visualizer.
[ ] set up the visualizer code on Hetzner to automatically pull from the GitHub repo when there are commits to it, if this is not too difficult to implement
cross reference issue opencog/atomspace#1708
[ 50%] Building CXX object bioscience/types/CMakeFiles/bioscience-types.dir/BioScienceTypes.cc.o
<command-line>:0:10: warning: missing terminating " character
In file included from /home/cog/opencog/agi-bio/bioscience/types/BioScienceTypes.cc:28:0:
/usr/local/include/opencog/atoms/base/atom_types.cc: In function ‘void init()’:
/usr/local/include/opencog/atoms/base/atom_types.cc:43:40: error: ‘class opencog::ClassServer’ has no member named ‘beginTypeDecls’
bool is_init = opencog::classserver().beginTypeDecls(xstr(INITNAME));
^
In file included from /usr/local/include/opencog/atoms/base/atom_types.cc:46:0,
from /home/cog/opencog/agi-bio/bioscience/types/BioScienceTypes.cc:28:
/home/cog/opencog/agi-bio/build/bioscience/types/atom_types.inheritance:5:45: error: ‘class opencog::ClassServer’ has no member named ‘declType’
opencog::GENE_NODE = opencog::classserver().declType(opencog::CONCEPT_NODE, "GeneNode");
^
/home/cog/opencog/agi-bio/build/bioscience/types/atom_types.inheritance:6:48: error: ‘class opencog::ClassServer’ has no member named ‘declType’
opencog::PROTEIN_NODE = opencog::classserver().declType(opencog::CONCEPT_NODE, "ProteinNode");
^
In file included from /home/cog/opencog/agi-bio/bioscience/types/BioScienceTypes.cc:28:0:
/usr/local/include/opencog/atoms/base/atom_types.cc:50:25: error: ‘class opencog::ClassServer’ has no member named ‘endTypeDecls’
opencog::classserver().endTypeDecls();
^
bioscience/types/CMakeFiles/bioscience-types.dir/build.make:78: recipe for target 'bioscience/types/CMakeFiles/bioscience-types.dir/BioScienceTypes.cc.o' failed
[ ] confirm Amen's docker for bio project is working - #9
[ ] script to load all atom scheme files in a directory
[ ] put bio atomspace scheme representations up in its own repo or on a server where others can access
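For the "load all atom scheme files in a directory" item, one possible approach is to generate one Guile `(load "...")` expression per file and feed those to a Scheme shell. This is a rough sketch under assumptions: the function name is hypothetical, files are taken to end in `.scm`, and loading order is simply alphabetical.

```python
from pathlib import Path


def scheme_load_commands(directory):
    """Return one (load "...") expression per .scm file, sorted by name."""
    commands = []
    for path in sorted(Path(directory).glob("*.scm")):
        # Use an absolute path with forward slashes, and escape the two
        # characters that are special inside a Scheme string literal.
        text = path.resolve().as_posix()
        quoted = text.replace("\\", "\\\\").replace('"', '\\"')
        commands.append('(load "' + quoted + '")')
    return commands
```

Printing the returned lines gives a small script that can be pasted into, or piped to, whatever Scheme shell is connected to the atomspace.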
Although biosciene_types.pyx is generated during build, it is not installed and the types GeneNode and MoleculeNode aren't available for import.
Target completion date: (Selam, please fill in)
Per Bobby:
Can we include Table S6 from this paper as an additional gene list, similar to the ones from mSigDB? http://www.sciencedirect.com/science/article/pii/S1097276512008933
This is a list of genes that seem to be modulated with the epigenetic clock of aging, which has been shown to work repeatably and across different tissues... I don't think it's already included within mSigDB though.
Here's the data: https://drive.google.com/file/d/0B3NYFAN330UTbDdBTC1RNngyV0l3cVRRVVY4cFdKRDZLR0dj/view?usp=sharing
Selam, this set can be represented similarly to how the MSigDB gene sets are represented.
Let's create a new general ConceptNode "GeneSet" that this gene set inherits from.
And then in the MSigDB representation, we should add an inheritance of MSigDB_GeneSet to GeneSet, IOW:
InheritanceLink
  ConceptNode "MSigDB_GeneSet"
  ConceptNode "GeneSet"
Fill in as many of the fields used in the MSigDB representation as can also be applied to this gene set. Perhaps Meseret can help if needed with interpreting things from the article regarding what the field values should be.
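The inheritance described above could be emitted by the import script along these lines. This is a hedged sketch only: the function names are hypothetical, and the MemberLink layout (GeneNode as a member of the gene set's ConceptNode) is an assumption borrowed from the GO annotation atoms, not a confirmed description of the MSigDB representation.

```python
def node(node_type, name):
    """Format a single node, escaping \\ and " in its name."""
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return "(" + node_type + ' "' + escaped + '")'


def inheritance_link(child, parent):
    """InheritanceLink between two ConceptNodes."""
    return ("(InheritanceLink\n    " + node("ConceptNode", child)
            + "\n    " + node("ConceptNode", parent) + ")")


def gene_set_atoms(set_name, genes, parent="GeneSet"):
    """Atoms for one gene set: its inheritance from the general
    "GeneSet" concept, then one MemberLink per gene (assumed layout)."""
    atoms = [inheritance_link(set_name, parent)]
    for gene in genes:
        atoms.append("(MemberLink\n    " + node("GeneNode", gene)
                     + "\n    " + node("ConceptNode", set_name) + ")")
    return atoms
```

Calling `inheritance_link("MSigDB_GeneSet", "GeneSet")` produces the link requested above; the new gene list would get its own set name passed to `gene_set_atoms`.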
For all relevant models we want included in the bio atomspace, especially the nonagenarian models, but for supercentenarians, etc. as well.
These should be in the format Mike specified for the latest conversion script of Nil's that includes model accuracy information for conversion to atomspace truth values.
The models should be delivered to Selam for importing with the latest script from Nil.
This is the log output. The same thing happens when trying to load libruleengine.so, either from the cogserver.conf file or from the cogserver command line.
[2017-06-30 23:11:09:810] [INFO] Loading module "/usr/local/lib/opencog/libbioscience-types.so"
[2017-06-30 23:11:09:810] [ERROR] Unable to find symbol "opencog_module_id": /usr/local/lib/opencog/libbioscience-types.so: undefined symbol: opencog_module_id (module /usr/local/lib/opencog/libbioscience-types.so)
Stack Trace:
2: Logger.cc:481 opencog::Logger::logva(opencog::Logger::Level, char const*, __va_list_tag*)
3: Logger.cc:493 opencog::Logger::Error::operator()(char const*, ...)
4: CogServer.cc:426 opencog::CogServer::loadModule(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
5: CogServer.cc:630 opencog::CogServer::loadModules(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)
6: CogServerMain.cc:223 main()
7: libc-start.c:325 __libc_start_main()
8: ??:0 _start()
[2017-06-30 23:11:09:878] [INFO] Loading module "/usr/local/lib//opencog/libbioscience-types.so"
[2017-06-30 23:11:09:878] [ERROR] Unable to find symbol "opencog_module_id": /usr/local/lib/opencog/libbioscience-types.so: undefined symbol: opencog_module_id (module /usr/local/lib//opencog/libbioscience-types.so)
Stack Trace:
2: Logger.cc:481 opencog::Logger::logva(opencog::Logger::Level, char const*, __va_list_tag*)
3: Logger.cc:493 opencog::Logger::Error::operator()(char const*, ...)
4: CogServer.cc:426 opencog::CogServer::loadModule(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
5: CogServer.cc:630 opencog::CogServer::loadModules(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)
6: CogServerMain.cc:223 main()
7: libc-start.c:325 __libc_start_main()
8: ??:0 _start()
[2017-06-30 23:11:09:948] [WARN] Failed to load cogserver module libbioscience-types.so
I think this got broken in recent atomspace/opencog updates; it built a couple of weeks ago...
Scanning dependencies of target bioscience-types
[ 50%] Building CXX object bioscience/types/CMakeFiles/bioscience-types.dir/BioScienceTypes.cc.o
<command-line>:0:10: warning: missing terminating " character
In file included from /usr/local/include/opencog/atoms/base/atom_types.cc:46:0,
from /home/mjsd/oc/agi-bio/bioscience/types/BioScienceTypes.cc:28:
/home/mjsd/oc/agi-bio/build/bioscience/types/atom_types.inheritance: In function ‘void init()’:
/home/mjsd/oc/agi-bio/build/bioscience/types/atom_types.inheritance:1:39: error: no matching function for call to ‘opencog::ClassServer::beginTypeDecls()’
opencog::classserver().beginTypeDecls();
^
In file included from /home/mjsd/oc/agi-bio/build/bioscience/types/atom_types.definitions:2:0,
from /home/mjsd/oc/agi-bio/bioscience/types/BioScienceTypes.cc:23:
/usr/local/include/opencog/atoms/base/ClassServer.h:97:10: note: candidate: bool opencog::ClassServer::beginTypeDecls(const char*)
bool beginTypeDecls(const char *);
^
/usr/local/include/opencog/atoms/base/ClassServer.h:97:10: note: candidate expects 1 argument, 0 provided
bioscience/types/CMakeFiles/bioscience-types.dir/build.make:78: recipe for target 'bioscience/types/CMakeFiles/bioscience-types.dir/BioScienceTypes.cc.o' failed
make[2]: *** [bioscience/types/CMakeFiles/bioscience-types.dir/BioScienceTypes.cc.o] Error 1
The CSV file used by lifeSpanObservation_2015.py is not in the repo. Does it still exist?
The output file is available at https://gitlab.com/opencog-bio/bio-data/blob/master/scheme-representations/Lifespan-observations_2015-02-21.scm
We will need to implement a backing store when we reach the point that imported bio knowledge will not all fit in RAM.
There should be an end parenthesis after PredicateNodes in EvaluationLinks like this:
(EvaluationLink
  (PredicateNode "GO_namespace")
  (ListLink
    (ConceptNode "GO:0000007")
    (ConceptNode "molecular_function")))
I fixed this for GO terms and GO annotations (2029837). Please fix for the other knowledge bases where needed and regenerate the scheme files.
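A missing parenthesis like this can be caught before a scheme file is checked in. The sketch below is not part of the existing import scripts: the function names are hypothetical, and the checker only understands string literals and `;` line comments, which is assumed to be enough for the generated files.

```python
def evaluation_link(predicate, first, second):
    """Format an EvaluationLink with a closed PredicateNode."""
    return ('(EvaluationLink\n'
            '  (PredicateNode "' + predicate + '")\n'
            '  (ListLink\n'
            '    (ConceptNode "' + first + '")\n'
            '    (ConceptNode "' + second + '")))')


def parens_balanced(text):
    """Return True if parentheses balance, ignoring those that appear
    inside string literals or after a ';' comment marker."""
    depth = 0
    in_string = False
    escaped = False
    in_comment = False
    for ch in text:
        if in_comment:
            if ch == "\n":
                in_comment = False
            continue
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == ";":
            in_comment = True
        elif ch == '"':
            in_string = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0 and not in_string
```

Running `parens_balanced` over each regenerated file, or over each atom as it is written, would flag a PredicateNode that was left open.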
I tried to build opencog with the "bioscience" module. After changing the directory to match the current location of "Module.h" in BioScienceTypes.cc (commit 2b8b5), the build still fails because there is no longer a file "opencog/atomspace/atom_types.cc". Does anyone know what replaced this file?
Add a parameter, 'limit=' to GET .../atoms in the REST API that limits the size of the result set to that number.
It would also be good to include in the returned results the size n of the full result set that would have been returned without the limit.
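The requested behaviour can be illustrated independently of the REST API itself. This is a minimal sketch, not the API's code: the function name and the `total` and `atoms` field names are hypothetical.

```python
def limit_results(atoms, limit=None):
    """Truncate a result list to `limit` items and report the size of
    the full result set alongside the truncated list."""
    total = len(atoms)
    if limit is not None:
        if limit < 0:
            raise ValueError("limit must be non-negative")
        atoms = atoms[:limit]
    return {"total": total, "atoms": list(atoms)}
```

A client can then compare `total` with the number of atoms returned to tell whether the result was cut off.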
This issue is mainly intended for iCog folks with access to the opencog-bio/bio-data GitLab repo, but the python scripts that convert the public data files to scheme are here in agi-bio/knowledge-import. After loading GO.scm and GO_annotations.scm, generated per the README, into an atomspace and using these functions from "GO utilities.scm":
(define GOname
  (lambda (GOterm)
    (gar
      (cog-execute!
        (GetLink
          (EvaluationLink
            (PredicateNode "GO_name")
            (ListLink
              GOterm
              (VariableNode "$name"))))))))
(define get-type-with-prefix
  (lambda (type string)
    (filter
      (lambda (atom) (string-prefix? string (cog-name atom)))
      (cog-get-atoms type))))
this procedure counts almost 2k GO terms with no associated GO_name ConceptNode:
(length (filter
  (lambda (goterm) (eq? (list) (GOname goterm)))
  (get-type-with-prefix 'ConceptNode "GO:")))
$4 = 1976
The error could be in the python script, or something could be missing in the original database downloads.
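To narrow down whether the gap comes from the generated files rather than from the atomspace, the same check can be run on the scheme text directly. The sketch below is an assumption-laden cross-check, not an existing utility: the function name is hypothetical, and the regular expressions assume the atom layout shown in this thread (a GO_name PredicateNode followed by a ListLink whose first element is the GO term).

```python
import re

# Any GO term ConceptNode, wherever it appears.
GO_ID = re.compile(r'\(ConceptNode\s+"(GO:\d+)"\s*\)')

# GO terms that appear as the first element of a GO_name EvaluationLink.
NAMED = re.compile(
    r'\(PredicateNode\s+"GO_name"\s*\)\s*'
    r'\(ListLink\s*\(ConceptNode\s+"(GO:\d+)"\s*\)'
)


def unnamed_go_terms(*scheme_texts):
    """Return the sorted GO IDs that occur in the given scheme texts
    but never appear in a GO_name EvaluationLink."""
    all_ids = set()
    named = set()
    for text in scheme_texts:
        all_ids.update(GO_ID.findall(text))
        named.update(NAMED.findall(text))
    return sorted(all_ids - named)
```

Passing the contents of GO.scm and GO_annotations.scm would list the IDs to look up in the original downloads, which may help tell a script bug from terms that are absent upstream.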
Looks like there are a lot of duplicate MemberLinks in GO_annotations.scm
For example
(MemberLink
(GeneNode "TLR8")
(ConceptNode "GO:0010008"))
has a bunch of entries.
I'm pretty sure these won't add additional atoms to the atomspace, but they are probably slowing down loading of the scheme file considerably.
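One way to avoid writing the duplicates in the first place is to remember which (gene, GO term) pairs have already been emitted. This is a sketch under the assumption that the annotation script handles one pair at a time; the function name is hypothetical.

```python
def unique_member_links(pairs):
    """Yield one MemberLink per distinct (gene, GO term) pair,
    keeping the order in which pairs are first seen."""
    seen = set()
    for gene, go_term in pairs:
        key = (gene, go_term)
        if key in seen:
            continue  # this annotation was already written
        seen.add(key)
        yield ('(MemberLink\n'
               '  (GeneNode "' + gene + '")\n'
               '  (ConceptNode "' + go_term + '"))')
```

The set grows with the number of distinct annotations, which should be acceptable for a one-off conversion but is worth keeping in mind for very large inputs.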
Per specifications in the Google doc.
Target Date: (Selam, please fill in)
Leverage MSigDB geneset data to directly represent regulatory relationships between genes.
Please list the names used for ConceptNodes and PredicateNodes in this google document so that we will have consistency in naming moving forward:
https://docs.google.com/document/d/1S3umczKHFHwQpIbp5jlgIf73tlYRyGNlFuCHBp_q8p0/edit
I'd like to package this for GNU Guix, but I haven't been able to find a license declaration in this repository. Is this free software? If so, under what license is it released?
I've taken it upon myself to rewrite the GO.scm code a bit:
At first I ended up with this: https://github.com/jac2130/obo_to_Atom_Space
But then I found a wonderful little toolbox for ontologies, called pronto and then, after making a few adjustments to that library in my own fork (https://github.com/jac2130/pronto) and after realizing that much work has been done on opencog tools, I came up with this:
https://github.com/CollectiWise/collectiwise/blob/master/python/scheme_router.py
combined with these statements, directly in Scheme Atomese:
https://github.com/CollectiWise/collectiwise/blob/master/statements.scm
Now, you'll notice that they are very incomplete at the moment, but the idea of sending statements and relationship types directly as a JSON stream through a router to the AtomSpace is what drove me to this, because I'm building a live ontology and crowd reasoning system that may be updated in real time. The nice ideas embodied in the pronto library are quite helpful. The pronto library allows for easy ontology merging, taking either OWL or OBO ontologies, and it links information in intuitive and powerful ways. So the idea is to push as much of the work as possible to the ontology library (pronto) and to the Scheme file in which the statements are crafted, leaving a thin message router that just takes an ontology as input and sends terms and relationships to the appropriate Scheme functions, feeding the results directly into the AtomSpace (no need for files). I will add that non-ontological logical statements (implications etc.) can also be sent through this router via JSON.
This will lead to a detailed logical semantics of the ontological relationships in some set of ontologies (in my case it will be the set of relations found in the ENVO and the SDGIO (https://github.com/SDG-InterfaceOntology/sdgio) ontologies). No matter what ontologies or lists of logical statements (axioms) you use, the number of types of relations is rather small, even in huge ontologies or knowledge bases, because they are just the set of rules by which things can be ontologically or logically related. These relationships, in turn, have an even smaller set of properties (is_symmetric, is_transitive, etc.; for an exhaustive list of OBO-defined relationship properties, see https://metacpan.org/pod/OBO::Core::RelationshipType#is_cyclic), the meanings of which I will encode precisely in Scheme Atomese next. Currently, pronto only picks up a handful of those relationship properties, but I will work on that as well.
Further improvements and additional features are planned. While this code is part of a bigger project, I think it could be useful pretty quickly for the things that you are doing here and maybe you could provide me with some feedback and ideas as to how to encode things in Atomese in the process? I hope that it turns out to be useful for someone aside from me and my team!
per ben's document
opencog.conf lists modules that aren't modules anymore. Probably a bunch of other stuff needs updating as well.
When running
./export_models_and_fitness.sh ../bestCombos50/chr22_moses.5x10 localhost 17001
eventually the cogserver crashes, but if I truncate the file to its first 5 lines, it works. That's all I know for now.
also representation for intersection_of
Identify human homolog genes.
Import into atomspace
Supply list of the homolog genes for Bobby