metasoarous / tripl
This one weird trick turns JSON documents into semantic graph databases!
The following code (mostly copy & paste from the readme):
from tripl import tripl

def cft_cons(name):
    return tripl.entity_cons('cft.type:' + name, 'cft.' + name)

def main():
    subject = cft_cons('subject')
    # Next our schema
    schema = {
        'cft.seq:timepoint': {'db:valueType': 'db.type:ref',
                              'db:cardinality': 'db.cardinality:many'},
        'cft.seq:subject': {'db:valueType': 'db.type:ref'}}
    ts = tripl.TripleStore(schema=schema, default_cardinality='db.cardinality:one')
    ts.assert_facts([
        subject(id='QA255')],
        id_attrs=['cft.timepoint:id', 'cft.seq:id', 'cft.subject:id'])
Causes an exception:
Traceback (most recent call last):
  File ".../venv/bin/...", line 11, in <module>
    load_entry_point('...', 'console_scripts', '...')()
  File ".../.../__init__.py", line 38, in main
    id_attrs=['cft.timepoint:id', 'cft.seq:id', 'cft.subject:id'])
  File "build/bdist.linux-x86_64/egg/tripl/tripl.py", line 521, in assert_facts
  File "build/bdist.linux-x86_64/egg/tripl/tripl.py", line 499, in assert_fact
  File "build/bdist.linux-x86_64/egg/tripl/tripl.py", line 472, in _assert_dict
  File "build/bdist.linux-x86_64/egg/tripl/tripl.py", line 448, in _resolve_eid
  File "build/bdist.linux-x86_64/egg/tripl/tripl.py", line 448, in <dictcomp>
  File "build/bdist.linux-x86_64/egg/tripl/tripl.py", line 591, in match
  File "build/bdist.linux-x86_64/egg/tripl/tripl.py", line 586, in _entity_lookup
AttributeError: 'NoneType' object has no attribute 'keys'
Do you have a hint at what might be causing this?
I am using Python 2.7.16 and unmodified Tripl from master branch.
The documentation does not mention the EAV index option for assert_facts, nor EAV as a file format option alongside lists of dicts.
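For readers who have not met the term, here is a minimal sketch of how the two shapes relate, assuming an EAV index is a nested mapping of entity id to attribute to list of values. The `dicts_to_eav` helper and the use of `'db:id'` as the identifying key are assumptions made for illustration; this is not tripl's own code.

```python
from collections import defaultdict

def dicts_to_eav(records, id_attr='db:id'):
    """Fold a list of attribute dicts into {entity: {attribute: [values]}}."""
    eav = defaultdict(lambda: defaultdict(list))
    for record in records:
        eid = record[id_attr]
        for attr, value in record.items():
            if attr == id_attr:
                continue
            # Accept either a single value or a list of values per attribute.
            values = value if isinstance(value, list) else [value]
            eav[eid][attr].extend(values)
    # Convert to plain dicts so the result prints and compares cleanly.
    return {eid: dict(attrs) for eid, attrs in eav.items()}

records = [
    {'db:id': 1, 'person:name': 'Momma Jones'},
    {'db:id': 2, 'person:name': 'Little Joe Jones', 'person:parent': [1]},
]
eav_index = dicts_to_eav(records)
```

The list-of-dicts form is easier to write by hand, while the nested form makes lookups by entity and attribute direct.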
We want a collection of utilities for representing and working with sequence, tree and tabular data. This will involve some data modelling work, tooling for slurping/spitting to standard formats, and in the case of ingest, linking/relating it to the rest of the data (I'm imagining being able to specify a join on some sequence data and a CSV metadata file and representing that as triples, for example). There's also a lot of room here for tooling at the build pipeline level, since these things tend to get into the semantics of the actual data ("for each subject, for each cell cluster, for each ..."; I'll probably build out something specific along these lines for nestly). These things will likely have to be broken up into smaller pieces, so this is a bit of an epic issue.
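A rough sketch of the join described above, using only the standard library: FASTA records are matched to CSV metadata rows by id and emitted as nested dicts ready to be asserted as facts. The helper names, the CSV column names (`seq_id`, `subject`, `timepoint`) and the `cft.seq:string` attribute are assumptions for illustration; none of this is existing tripl tooling.

```python
import csv
import io

def parse_fasta(text):
    """Yield (id, sequence) pairs from FASTA-formatted text."""
    seq_id, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith('>'):
            if seq_id is not None:
                yield seq_id, ''.join(chunks)
            # The id is the first whitespace-delimited token of the header.
            seq_id, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, ''.join(chunks)

def join_to_facts(fasta_text, csv_text, on='seq_id'):
    """Join sequences to metadata rows and return one dict per sequence."""
    metadata = {row[on]: row for row in csv.DictReader(io.StringIO(csv_text))}
    facts = []
    for seq_id, string in parse_fasta(fasta_text):
        fact = {'cft.seq:id': seq_id, 'cft.seq:string': string}
        row = metadata.get(seq_id)
        if row is not None:
            # Nested dicts stand for references to other entities.
            fact['cft.seq:subject'] = {'cft.subject:id': row['subject']}
            fact['cft.seq:timepoint'] = {'cft.timepoint:id': row['timepoint']}
        facts.append(fact)
    return facts

fasta_text = ">a1 first sequence\nACT\nGA\n>a2\nTTGC\n"
csv_text = "seq_id,subject,timepoint\na1,QA255,t1\n"
facts = join_to_facts(fasta_text, csv_text)
```

Sequences without a metadata row are kept with only their own attributes, which mirrors a left join.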
The pull query and the graphql query are homomorphic, so it would be cool to think about how we could plug these things together.
https://blog.codeship.com/an-introduction-to-graphql-via-the-github-api/
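To make the correspondence concrete, here is a small sketch that renders a pull-style pattern as a GraphQL selection set. It assumes a pull pattern is a list of attribute names in which a dict maps a reference attribute to a nested pattern (the Datomic-style shape); `pull_to_graphql` and `field_name` are hypothetical helpers, and replacing `:` and `.` with underscores is just one possible naming convention.

```python
def field_name(attr):
    # GraphQL names may only contain letters, digits and underscores.
    return attr.replace('.', '_').replace(':', '_')

def pull_to_graphql(pattern, indent=0):
    """Render a pull-style pattern as the lines of a GraphQL selection set."""
    pad = '  ' * indent
    lines = []
    for item in pattern:
        if isinstance(item, dict):
            # A dict maps a reference attribute to a nested pattern.
            for attr, subpattern in item.items():
                lines.append(pad + field_name(attr) + ' {')
                lines.extend(pull_to_graphql(subpattern, indent + 1))
                lines.append(pad + '}')
        else:
            lines.append(pad + field_name(item))
    return lines

pattern = ['person:name', {'person:parent': ['person:name']}]
query = '{\n' + '\n'.join(pull_to_graphql(pattern, indent=1)) + '\n}'
```

Both forms are trees of attribute selections, which is why the translation is a plain recursive walk.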
Probably a wiki page or .md file in docs. Should go a little more in depth into how things affect us in bioinformatics, and what these things might look like in a bioinformatic context, with tree/fasta/csv examples. And make reference to nestly work as well.
Right now it's possible to query reverse relationships by using an underscore after the namespace separator (so the reverse relation for person:parent would be person:_parent, and would map from parents to children). This should be possible for assertions as well. Imagine you wanted to describe a mother with 10 children.
data = [
    {'person:name': 'Momma Jones',
     'person:_parent': [{'person:name': 'Little Joe Jones'}, {'person:name': 'Wilma Jenkins'}, ...]}]
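One way such assertions could be handled is to rewrite reverse keys into forward references on the children before asserting. The sketch below is only an illustration of that idea: `expand_reverse_refs`, the `'db:id'` key and the negative temporary ids are assumptions, not tripl behaviour, and each child gets a single forward value rather than a list.

```python
import itertools

def is_reverse(attr):
    # 'person:_parent' is the reverse of 'person:parent'.
    return ':_' in attr

def forward_name(attr):
    return attr.replace(':_', ':', 1)

def expand_reverse_refs(entity, id_attr='db:id', temp_ids=None):
    """Rewrite reverse-reference keys into forward references on the children."""
    if temp_ids is None:
        temp_ids = itertools.count(-1, -1)  # hypothetical temporary ids: -1, -2, ...
    eid = entity.get(id_attr)
    if eid is None:
        eid = next(temp_ids)
    own = {id_attr: eid}
    facts = [own]
    for attr, value in entity.items():
        if attr == id_attr:
            continue
        if is_reverse(attr):
            children = value if isinstance(value, list) else [value]
            for child in children:
                child_facts = expand_reverse_refs(child, id_attr, temp_ids)
                # The first fact is the child itself; point it at this entity.
                child_facts[0][forward_name(attr)] = eid
                facts.extend(child_facts)
        else:
            own[attr] = value
    return facts

momma = {'person:name': 'Momma Jones',
         'person:_parent': [{'person:name': 'Little Joe Jones'},
                            {'person:name': 'Wilma Jenkins'}]}
facts = expand_reverse_refs(momma)
```

The mother is written once and each child carries a `person:parent` reference back to her, so the store only ever sees forward attributes.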
My argument is readability: when you specify a dict, there are many ":" characters, which (imho) decreases readability. Compare
'mock:type': 'mock.type:seq',
'mock.seq:id': 'a1',
'mock.seq:string': 'ACTGA',
'mock:description': 'some foo from bar',
with
'mock/type': 'mock.type:seq',
'mock.seq/id': 'a1',
'mock.seq/string': 'ACTGA',
'mock/description': 'some foo from bar',
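If both spellings were to be accepted, the slash form could be normalised on input. A minimal sketch, assuming only the first "/" in a key is the namespace separator; `to_colon_keys` is a hypothetical helper, not part of tripl.

```python
def to_colon_keys(record):
    """Rewrite 'ns/name' keys as 'ns:name', leaving values untouched."""
    return {key.replace('/', ':', 1): value for key, value in record.items()}

slash_style = {
    'mock/type': 'mock.type:seq',
    'mock.seq/id': 'a1',
    'mock.seq/string': 'ACTGA',
    'mock/description': 'some foo from bar',
}
colon_style = to_colon_keys(slash_style)
```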
There's an entity API via tp.entity(entity_id) that lets you traverse the graph as a "live" graph of connected dicts. There might also need to be some cardinality or reverse lookup ref work here, but whatever the case, details should be documented in the README.
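To illustrate what a "live" graph of connected dicts could mean, here is a self-contained Python 3 sketch of a read-only mapping that resolves references only when they are accessed. The `Entity` class, the nested index layout and the `ref_attrs` parameter are assumptions for illustration and do not describe how tripl implements its entity API.

```python
from collections.abc import Mapping

class Entity(Mapping):
    """A read-only dict view of one entity that follows references lazily."""

    def __init__(self, index, eid, ref_attrs):
        self._index = index          # {entity id: {attribute: [values]}}
        self._eid = eid
        self._ref_attrs = ref_attrs  # attributes whose values are entity ids

    def __getitem__(self, attr):
        values = self._index[self._eid][attr]
        if attr in self._ref_attrs:
            # Resolve referenced ids into further Entity views on access.
            return [Entity(self._index, v, self._ref_attrs) for v in values]
        return values

    def __iter__(self):
        return iter(self._index[self._eid])

    def __len__(self):
        return len(self._index[self._eid])

index = {
    1: {'person:name': ['Momma Jones']},
    2: {'person:name': ['Little Joe Jones'], 'person:parent': [1]},
}
joe = Entity(index, 2, ref_attrs={'person:parent'})
mother_name = joe['person:parent'][0]['person:name'][0]
```

Because references are resolved on access, changes to the underlying index show up the next time an attribute is read.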
Once we're out of alpha/beta we should just NEVER BREAK THE SCHEMA! Because then you'll have to figure out how to load different data in different versions, and that just sucks. So we have to settle on all these little things like: what do we call our primary key? db:ident? tripl:id? Where do we install the schema? On a tripl:schema ident?