davebshow / goblin
A Python 3.5 rewrite of the TinkerPop 3 OGM Goblin
License: Other
Hello Goblin contributors,
I have only just realized that I should have done this several weeks ago. I would like to inform all goblin contributors of, and ask their consent to, a change from the AGPL license to Apache 2. This is a shift towards a more permissive license that should help stimulate community contributions and the overall improvement of this software. @leifurhauks @trojek @H-Plus-Time @rosenbrockc, please confirm that this change is acceptable to you.
Best,
Dave
Hey Dave,
I've been attempting to communicate with DSE graph 5.10 through Goblin with the following code:
async def gremlin_query(script, port, alias=None):
    eventloop = asyncio.get_event_loop()
    if alias is None:
        cluster = await Cluster.open(eventloop, port=port)
    else:
        cluster = await Cluster.open(eventloop, aliases=alias, port=port)
    client = await cluster.connect()
    resp = await client.submit(script)
    messages = []
    async for msg in resp:
        messages.append(msg)
    await cluster.close()
    return messages
The alias is defined as { 'g': 'test_graph.g' }
When run against the DSE Gremlin Server, I get the following error:
aiogremlin.exception.GremlinServerError: 500: 500: Unknown value type for: class java.util.LinkedHashMap
This works fine when used in the context of a Session object like so:
async def create_session(self):
    eventloop = asyncio.get_event_loop()
    self.app = await Goblin.open(eventloop,
                                 get_hashable_id=self.adapter.get_hashable_id,
                                 aliases=self.adapter.get_alias('test_graph'),
                                 port=self.adapter.port)
    self.app.register(User, LivesIn)
    session = await self.app.session()
    return session
Where self.adapter.get_alias('test_graph') returns { 'g': 'test_graph.g' }.
Perhaps the Cluster class from aiogremlin is having trouble serializing the dictionary? I don't really have a clue...
Any help would be appreciated!
When I run the following on 2.1.0rc1
import asyncio, goblin
loop = asyncio.get_event_loop()
app = loop.run_until_complete(goblin.Goblin.open(loop))
client = loop.run_until_complete(app.cluster.connect())
query = """
size = graph.getOpenTransactions().size();
for(i=0;i<size;i++) {graph.getOpenTransactions().getAt(0).rollback()};
graph.getOpenTransactions().size()
"""
result = loop.run_until_complete(client.submit(query))
print(loop.run_until_complete(result.all())) # Hangs here forever
loop.run_until_complete(client.close())
it hangs on the result.all() and I have to interrupt the kernel to get control. This used to work on the previous version. Is there an important change I am missing? I got the latest aiogremlin and the official gremlin_python.
The actual query doesn't matter, this is just one that I had lying around in an ipython notebook. It hangs for all my other traversal queries too.
I have written a goblin OGM sample into my Spark app, but I get the error "Task was destroyed but it is pending!"
Below is my code:
def savePartition(p):
    from goblin import element, properties

    class Brand(element.Vertex):
        name = properties.Property(properties.String)

    import asyncio
    loop = asyncio.get_event_loop()
    from goblin.app import Goblin
    app = loop.run_until_complete(Goblin.open(loop))
    app.register(Brand)

    async def go(app):
        session = await app.session()
        for i in p:
            if i['brand']:
                traversal = session.traversal(Brand)
                result = await traversal.has(Brand.name, i['brand']).oneOrNone()
                if not result:
                    brand = Brand()
                    brand.name = i['brand']
                    session.add(brand)
                    await session.flush()

    loop.run_until_complete(go(app))

rdd = rdd.foreachPartition(savePartition)
I am wondering how to close the response in the above code. Thanks very much.
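One way to make sure nothing is left pending when a partition function returns is to close the app and the loop in a finally block. The sketch below is only an illustration: FakeApp is a hypothetical stand-in so that the example runs without a Gremlin server, and it assumes the real app exposes an awaitable close(), which should be checked against the installed goblin version.

```python
import asyncio


class FakeApp:
    """Hypothetical stand-in for a Goblin app; only tracks open/closed state."""

    def __init__(self):
        self.closed = False

    async def close(self):
        # Assumption: the real app's close() is awaitable.
        self.closed = True


async def process(app, rows):
    # Placeholder for the per-row traversal and save logic.
    return [row for row in rows if row.get("brand")]


def save_partition(rows, app=None):
    if app is None:
        app = FakeApp()
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(process(app, rows))
    finally:
        # Close the app first, then the loop, so no task is left pending.
        loop.run_until_complete(app.close())
        loop.close()
```

Creating a fresh loop per partition keeps each Spark task independent of whatever loop state the worker process already has.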
Some queries return results that aren't vertices or edges, for example:
result = await session.traversal(MyVertex).has(MyVertex.some_prop, 'some_value').count()
Which returns an integer.
A more complex example:
result = await (session.traversal(MyVertex)
                .has(MyVertex.some_prop, 'some_value').as_('v1')
                .out('some_label').as_('v2')
                .select('v1', 'v2').by().by('a_string_property'))
This will return a mapping of vertices ('v1') to strings (from a property of 'v2'), something like:
{
v[xyz]: 'some_string',
...
}
I was thinking about using the new id serializer in goblin-dse to deal with the dict ids.
DSE has a shortcut where you can specify the ID as a string in traversals, of the form "{key1=val1, key2=val2, key3=val3}" and so on. It doesn't care about order.
If DSE ids always get converted to a string like this, they will be hashable already.
@davebshow , do you think this makes sense?
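A minimal sketch of that idea, assuming the string form described above is what DSE accepts: the helper name dse_id_to_string is hypothetical, and quoting or escaping of values is not handled. Sorting the keys gives one canonical string per id, so two dicts with the same content always hash the same way.

```python
def dse_id_to_string(vertex_id):
    """Render a dict-style id as '{key1=val1, key2=val2}' with sorted keys."""
    parts = ("{}={}".format(key, vertex_id[key]) for key in sorted(vertex_id))
    return "{" + ", ".join(parts) + "}"


# Key order in the input does not matter.
first = dse_id_to_string({"member_id": 0, "community_id": 12, "~label": "person"})
second = dse_id_to_string({"~label": "person", "community_id": 12, "member_id": 0})
```

The key names in the usage lines are made up for illustration only.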
Hi I just started learning how to use Gremlin in Python with OGM.
I used JanusGraph 0.1.1, Python 3.6.1, and Goblin 2.0 (from pip) and saw an error.
I ran the following code from the tutorial in the main page:
import asyncio
from goblin import Goblin, element, properties

class Person(element.Vertex):
    name = properties.Property(properties.String)
    age = properties.Property(properties.Integer)

class Knows(element.Edge):
    notes = properties.Property(properties.String, default='N/A')

async def go(app):
    session = await app.session()
    leif = Person()
    leif.name = 'Leif'
    leif.age = 28
    jon = Person()
    jon.name = 'Jonathan'
    works_with = Knows(leif, jon)
    session.add(leif, jon, works_with)
    await session.flush()
    result = await session.g.E(works_with.id).next()
    assert result is works_with
    people = session.traversal(Person)  # element class based traversal source
    async for person in people:
        print(person)

loop = asyncio.get_event_loop()
app = loop.run_until_complete(Goblin.open(loop))
app.config_from_file("config.yaml")
app.register(Person, Knows)
loop.run_until_complete(go(app))
And I received an error at the last line as follows:
TypeError Traceback (most recent call last)
in ()
----> 1 loop.run_until_complete(go(app))
/home/jbkoh/anaconda3/lib/python3.6/asyncio/base_events.py in run_until_complete(self, future)
464 raise RuntimeError('Event loop stopped before Future completed.')
465
--> 466 return future.result()
467
468 def stop(self):
in go(app)
8 works_with = Knows(leif, jon)
9 session.add(leif, jon, works_with)
--> 10 await session.flush()
11 result = await session.g.E(works_with.id).next()
12 assert result is works_with
/home/jbkoh/anaconda3/lib/python3.6/site-packages/goblin/session.py in flush(self)
268 while self._pending:
269 elem = self._pending.popleft()
--> 270 await self.save(elem)
271
272 async def remove_vertex(self, vertex):
/home/jbkoh/anaconda3/lib/python3.6/site-packages/goblin/session.py in save(self, elem)
310 result = await self.save_vertex(elem)
311 elif elem.__type__ == 'edge':
--> 312 result = await self.save_edge(elem)
313 else:
314 raise exception.ElementError(
/home/jbkoh/anaconda3/lib/python3.6/site-packages/goblin/session.py in save_edge(self, edge)
348 self.update_edge)
349 hashable_id = self._get_hashable_id(result.id)
--> 350 self.current[hashable_id] = result
351 return result
352
/home/jbkoh/anaconda3/lib/python3.6/weakref.py in __setitem__(self, key, value)
166 if self._pending_removals:
167 self._commit_removals()
--> 168 self.data[key] = KeyedRef(value, self._remove, key)
169
170 def copy(self):
TypeError: unhashable type: 'dict'
How can I solve this problem? Do I misuse it?
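The traceback ends in a weakref dictionary being asked to use a dict as a key, which Python cannot hash. The sketch below reproduces that failure with the standard library only and then applies a get_hashable_id function of the same shape as the JanusGraph helper quoted further down this page. The Element class and the exact shape of the raw id are assumptions made for illustration.

```python
import weakref


class Element:
    """Stand-in for a saved vertex or edge."""


def get_hashable_id(val):
    # Unwrap JanusGraph relation identifiers; pass other ids through unchanged.
    if isinstance(val, dict) and "@type" in val and "@value" in val:
        if val["@type"] == "janusgraph:RelationIdentifier":
            return val["@value"]["value"]
    return val


current = weakref.WeakValueDictionary()
edge = Element()
# Assumed shape of an edge id as returned over GraphSON 2.
raw_id = {"@type": "janusgraph:RelationIdentifier",
          "@value": {"value": "dc-74-1lh-e8"}}

try:
    current[raw_id] = edge  # fails: a dict cannot be a dictionary key
except TypeError as exc:
    error = str(exc)

current[get_hashable_id(raw_id)] = edge  # works: the key is now a str
```

Other issues on this page pass such a function to Goblin.open through the get_hashable_id keyword.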
Goblin's reliance on asyncio means it cannot be used within a swath of popular frameworks built on top of Twisted. Case in point: Scrapy spiders can't make use of Goblin. While this is a problem for me, personally, I think it's also a limiting factor in the widespread adoption of Goblin.
This limitation could be solved by rewriting goblin to use Autobahn's websocket implementation, which can employ asyncio or Twisted as a backend. For other asynchronous logic, txaio can be used to write coroutines and futures/deferreds in a manner agnostic to the async backend.
Is this something the Goblin team would be interested in doing?
I am trying to find an OGM with certain characteristics and I came across Goblin. Goblin requires a considerable "investment" because it means moving away from the current state which is based on Neo4J. I do not mind too much about the actual backend and Goblin is supposed to work with anything that is based on Tinkerpop, so that's an extra check point.
But, I first need to confirm that Goblin can handle a specific modelling case:
The system I am developing can apply a set of algorithms to data that is structured in a specific way with small differences here and there. For this reason, I am setting up a base schema that is specialised (via the use of inheritance) depending on the use case.
The schema looks more or less like this:
#---BASE--------------------------------------------
class commonVertexFunc(Vertex):
    [. . .]

class Item(commonVertexFunc):
    someProperty = str()
    someOtherProperty = str()
    someRelationship = ItemRelationship(anotherItem, ZeroOrMore)
    [. . .]

class ItemRelationship(Edge):
    [. . .]

class anotherItem(commonVertexFunc):
    [. . .]

#---Specific------------------------------------------------
class specificItem(Item):
    specificAdditionalProperty = str()

class specificOtherItem1(anotherItem):
    [. . .]

class specificOtherItem2(anotherItem):
    [. . .]
Now, what I expect, with this particular setting is:
a. specificItem already has a someRelationship. And that should not be too big of a problem.
b. More importantly, someRelationship will accept ANY object that is of type anotherItem.
Will Goblin be able to handle this?
It appears that the example code in the docs fails for metaproperties, at least on my system.
Here is my code, copied from the web docs:
import goblin
from aiogremlin.gremlin_python import Cardinality

class HistoricalName(goblin.VertexProperty):
    notes = goblin.Property(goblin.String)

class City(goblin.Vertex):
    name = goblin.Property(goblin.String)
    population = goblin.Property(goblin.Integer)
    historical_name = HistoricalName(
        goblin.String, card=Cardinality.list)
When I run this script, I get the error:
File "/usr/lib/python3.5/enum.py", line 274, in __getattr__
raise AttributeError(name) from None
AttributeError: list
I'm using goblin 2.0.0, gremlin-python 3.2.5, aiogremlin 3.2.4.
goblin.Integer uses the built-in Python int for validating whether a given value is an integer (https://github.com/davebshow/goblin/blob/master/goblin/properties.py#L175). However, this causes problems with 64-bit integers. For example, list and set cardinality creates *VertexPropertyManager instances that call validate on each argument they are given.
In order to support correct GraphSON serialization to Int64, gremlin_python creates a long type in gremlin_python.statics. Developers should cast integers to that subclass of int if they want them to be serialized correctly to the database. Sets of 64-bit integers, however, get cast back to the 32-bit int because of the validator.
I propose replacing the existing validate with:
if not isinstance(val, long):
    return int(val)
else:
    return val
where long is imported with: from gremlin_python.statics import long. Happy to open a pull request for this again.
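The proposed check can be tried in isolation. In the sketch below, long is a local stand-in for gremlin_python.statics.long (described above as a subclass of int) so that the example runs without gremlin_python installed, and validate is written as a free function rather than as the method on goblin.Integer.

```python
class long(int):
    """Local stand-in for gremlin_python.statics.long (an int subclass)."""


def validate(val):
    # Keep values already marked as 64-bit; coerce everything else to int.
    if not isinstance(val, long):
        return int(val)
    return val


big = validate(long(2 ** 40))   # stays a long, so the Int64 marker survives
small = validate("7")           # strings are still coerced to a plain int
```

Because long subclasses int, a bare int(val) call would silently drop the marker type, which is the behaviour the proposal is meant to avoid.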
When running the quickstart Titan (titan-1.0.0-hadoop1/bin/titan.sh start) and then trying to follow along with the goblin docs, it turns out that goblin is trying to speak application/vnd.gremlin-v2.0+json, which the default gremlin-server install packaged with Titan doesn't understand.
If I edit the gremlin-server.yaml and add to the serializers
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { useMapperFromGraph: graph }}
which, from some googling, I expected to give the gremlin server the ability to speak the correct dialect, I get the following in the gremlin-server logs:
13217 [main] WARN org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Could not find configured serializer class - org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0 - it will not be available
While I appreciate this might not really be goblin's problem, and that I somehow need to update my gremlin-server, goblin is my route into learning gremlin/titan etc., so it would be nice if goblin specified somewhere that it needs a certain version of gremlin when connecting to the server.
Incidentally, if I run the 1+1 example in your docs with the default gremlin-server that comes with Titan, it just hangs forever; it doesn't notice that the server has spat out its pacifier.
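Until the client detects this case itself, a caller can bound the wait with asyncio.wait_for. The sketch below uses only the standard library: never_answers is a stand-in for a submit call whose server never replies, and submit_with_timeout is a hypothetical helper name, not part of goblin.

```python
import asyncio


async def never_answers():
    # Stand-in for a request to a server that never sends a response.
    await asyncio.Event().wait()


async def submit_with_timeout(coro, timeout):
    try:
        return await asyncio.wait_for(coro, timeout=timeout)
    except asyncio.TimeoutError:
        # wait_for cancels the inner coroutine before raising.
        return "timed out"


result = asyncio.run(submit_with_timeout(never_answers(), 0.05))
```

In real use the timeout would be seconds rather than a fraction of one, and the caller would probably re-raise instead of returning a marker string.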
Suppose I have a vertex type Restaurant:
loop = asyncio.get_event_loop()
app = loop.run_until_complete(goblin.Goblin.open(loop,
    get_hashable_id=get_hashable_id))
S = loop.run_until_complete(app.session())
R = Restaurant()
R.id = 32952
loop.run_until_complete(S.get_vertex(R))
This correctly gets all of the regular properties. However, I have a subclass of VertexProperty called Address. It correctly grabs the set of locations:
{<Address(type=<goblin.properties.String object at 0x107e23da0>, value=murray),
 <Address(type=<goblin.properties.String object at 0x107e23da0>, value=provo)}
but the individual properties of each Address are all None:
assert R.locations("murray").uuid is None
# True
I traced the problem to https://github.com/davebshow/goblin/blob/master/goblin/session.py#L223.
Because I have db_name mappings, the keys in the props dictionary are the db_name instead of the property name. So the __properties__.get returns None and the second valueMap request never fires.
When I add:
for okey, val in props.items():
    key = element.__mapping__.db_properties[okey][0]
    if isinstance(element.__properties__.get(key), VertexProperty):
        trav = self._g.V(
            props['id']).properties(okey).valueMap(True)
        vert_prop = await trav.toList()
        new_props[key] = vert_prop
then it works and the extra traversal is triggered. Mind, I am using the official 2.0 release on PyPI, so my version actually looks like:
for okey, val in props.items():
    key = element.__mapping__.db_properties[okey][0]
    if isinstance(getattr(element, key, None), VertexPropertyManager):
        vert_prop = await self._g.V(
            props['id']).properties(okey).valueMap(True).toList()
        new_props[key] = vert_prop
and the message sent to gremlin server is:
!application/vnd.gremlin-v2.0+json{"requestId": {"@type": "g:UUID", "@value": "0f385674-85f5-4f90-99d7-58ed8b30df93"}, "processor": "traversal", "op": "bytecode", "args": {"aliases": {"g": "g"}, "gremlin": {"@type": "g:Bytecode", "@value": {"step": [["V", {"@type": "g:Int64", "@value": 32952}], ["properties", "restaurant.locations"], ["valueMap", true]]}}}}
Unfortunately, the actual query doesn't execute properly and I am at a loss as to the problem. In python I get:
.../lib/python3.5/site-packages/aiogremlin/driver/resultset.py in wrapper(self)
14 raise exception.GremlinServerError(
15 msg.status_code,
---> 16 "{0}: {1}".format(msg.status_code, msg.message))
17 msg = msg.data
18 return msg
GremlinServerError: 500: id
But, when I execute the following in the gremlin console:
gremlin> g.V(32952).properties("restaurant.locations").valueMap(true).toList()
==>{address.zip=84632, key=restaurant.locations, address.city=Murray, id=oif-pfc-lc5, address.uuid=12345678-1234-5678-1234-567812345679, address.street=456 Ruby Ave., value=murray}
==>{address.zip=84606, key=restaurant.locations, address.city=Provo, id=pav-pfc-lc5, address.street=123 Thai St., address.uuid=12345678-1234-5678-1234-567812345678, value=provo
it works. Any ideas for the error? I am happy to submit a PR for the mapping issue, though you may have a better way to handle it.
Sorry, davebshow.
There may be no need to create an issue for this, but I don't know how else to contact you.
I want to index all vertices; is there any solution? I succeeded in using the index with a property key, like
vertex = await session.traversal(Suspector).has('id_number', id_number).next()
But when I index all the vertices, it seems that HBase scans the whole graph without using Elasticsearch, so it is very slow. For example:
traversal1 = await session.traversal(Suspector).has('id_number').toList()
Hi,
Has anyone been able to use Goblin with Cosmos DB?
Here's our yaml config:
hosts: ['gremlin_uri_that_cosmos_gives_us_in_azure_dashboard']
port: 443
username: '/dbs/graphdb/colls/Persons'
password: 'somepassword'
response_timeout: 5
connectionPool: {
enableSsl: true}
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { serializeResultToString: true }}
We run
import asyncio, datetime
import goblin
from goblin import Goblin
from goblin import driver, abc, exception
loop = asyncio.get_event_loop()
app = loop.run_until_complete(Goblin.open(loop, configfile='config.yaml'))
app.close()
but it always comes back with
Traceback (most recent call last):
File "/usr/lib/python3.6/asyncio/selector_events.py", line 724, in _read_ready
data = self._sock.recv(self.max_size)
ConnectionResetError: [Errno 104] Connection reset by peer
We've tried with the Gremlin console and it sort of works better against cosmos db but then we get back the connection reset by peer error there too after awhile. It seems to be a cosmos db/networking issue with Azure, but we're not sure where to look.
We just have a trial edition of azure at this point, just evaluating Goblin with Cosmos DB for graphs.
The Azure documentation for Cosmos DB using Gremlin says to use the Goblin Python driver for Python support, but we can't even connect at the moment :(
Any help would be appreciated.
Thanks,
Jordan
I have a scenario where I update a vertex after getting it, but it does not work:
traversal = session.traversal(Offer)
offer = await traversal.has(Offer.oid, i['oid']).oneOrNone()
if not offer:
    offer = Offer()
    offer.oid = i['oid']
    offer.url = i['url']
    offer.name = i['name']
    offer.brand = i['brand']
    offer.model = i['model']
offer.model = 'Q2'
await session.save(offer)
Please help.
class Relation(Edge):
    relationship = goblin.Property(goblin.String, default='unkonwn')
I use this to create an edge.
When I create a mixed index in the Gremlin shell:
graph = JanusGraphFactory.open('conf/janusgraph-hbase-es.properties')
graph.tx().rollback()
mgmt = graph.openManagement()
relationship = mgmt.getPropertyKey('relationship')
mgmt.buildIndex("RL", Edge.class).addKey(relationship).buildMixedIndex(search);
it reports:
gremlin> mgmt.buildIndex("RL", Edge.class).addKey(relationship).buildMixedIndex("search")
Could not register new index field 'relationship' with index backend as the data type, cardinality or parameter combination is not supported.
Type ':help' or ':h' for help.
Display stack trace? [yN]n
How can I correct the data type of my relationship property key? Please help me!
I wanted to use goblin with titan 1.0.0 which is the latest mature version of titan, but I had issues even at the most simple operations like adding a vertex :(
It looks like the mature version of Titan does not come with GraphSON2 but with GraphSON1, which is a source of pain on my part.
Hi Dave,
In the Gremlin shell, I use the composite index like this:
g.V().has('id_number', 'xxx').toList()
With goblin I can write:
suspector = await session.traversal(Suspector).has('id_number', 'xxx').toList()
But how can I use the mixed index with goblin?
In the Gremlin shell:
g.V().has('age', inside(18,30)).toList()
How can I do this with goblin?
Hi Dave,
I've been working with your library for most of today but have stumbled upon a bit of an issue. I wonder if you have seen this before?
I'm using Titan 1.1 (built from source), with TinkerPop 3.2.3 and an embedded Cassandra database.
Adding an edge throws a TypeError: unhashable type: 'dict' in line 350 of save_edge().
The following minimal code replicates my error:
loop = asyncio.get_event_loop()
app = loop.run_until_complete(goblin.Goblin.open(loop,
                                                 hosts = ['localhost'],
                                                 port = '8182',
                                                 scheme = 'ws'))

async def create(app, data):
    session = await app.session()
    session.add(data)
    await session.flush()
    return data

class myVertex(goblin.Vertex):
    pass

class myEdge(goblin.Edge):
    pass

v1 = myVertex()
v1 = loop.run_until_complete(create(app, v1))
v2 = myVertex()
v2 = loop.run_until_complete(create(app, v2))
e = myEdge()
e.source = v1
e.target = v2
loop.run_until_complete(create(app, e))
What could be the source of this?
With regards,
Will
Need to review and refactor ElementMeta; inheritance is hosed.
In order to accommodate complex ID datatypes, like DSE map IDs.
Build failure in sphinx:
File "/lib/python3.5/site-packages/sphinx/ext/autodoc.py", line 862, in filter_members
not keep, self.options)
File "/lib/python3.5/site-packages/sphinx/application.py", line 593, in emit_firstresult
for result in self.emit(event, *args):
File "/lib/python3.5/site-packages/sphinx/application.py", line 589, in emit
results.append(callback(self, *args))
File "/lib/python3.5/site-packages/sphinxcontrib/napoleon/__init__.py", line 428, in _skip_member
qualname = getattr(obj, '__qualname__', '')
File "/lib/python3.5/site-packages/goblin/element.py", line 211, in __repr__
self._data_type, self.value)
File "/lib/python3.5/site-packages/goblin/element.py", line 194, in getvalue
return self._val
AttributeError: 'SubOption' object has no attribute '_val'
SubOption is a subclass of VertexProperty:
class SubOption(goblin.VertexProperty):
    a = goblin.Property(goblin.Boolean, default=True)
    b = goblin.Property(goblin.Boolean, default=True)
I hacked element.py to get my build to pass by adding self._val = None to the class initializer for VertexProperty (https://github.com/davebshow/goblin/blob/master/goblin/element.py#L183).
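The effect of that workaround can be shown with a small stand-in. The class below is hypothetical and is not goblin's actual VertexProperty; it only mirrors the attribute names visible in the traceback (_data_type, _val, value) to show that setting _val in the initializer keeps __repr__ safe on instances that have no value yet.

```python
class VertexPropertyStandIn:
    """Hypothetical stand-in; not goblin's actual VertexProperty."""

    def __init__(self, data_type, val=None):
        self._data_type = data_type
        # Always set _val so that __repr__ works before a value is assigned.
        self._val = val

    @property
    def value(self):
        return self._val

    def __repr__(self):
        return "<{}(type={}, value={})>".format(
            self.__class__.__name__, self._data_type, self.value)


class SubOption(VertexPropertyStandIn):
    pass


text = repr(SubOption("Boolean"))
```

Tools such as Sphinx autodoc may call repr() on objects they inspect, so a __repr__ that depends on late-initialized state can break a documentation build.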
Authentication is currently broken.
Trying to submit to an authenticated connection results in an AttributeError caused by this line in goblin.driver.connection. The connection doesn't have an instance attribute _processor.
I was thinking maybe we could add an attribute processor to Response, to record which processor the request was made with, so that _receive can use that in the authentication command.
I'm not sure how to go about testing this. Maybe we could use a mock ws to send a 407 response and check that the correct authentication command gets sent back?
Hi Dave. If I create a vertex class with an integer property that has a meta-property, and then set that property value to zero, I lose the goblin.Integer type.
class vclass(goblin.VertexProperty):
    notes = goblin.Property(goblin.String)

class myclass(goblin.Vertex):
    id1 = vclass(goblin.Integer, card=Cardinality.list_)
    id2 = vclass(goblin.Integer, card=Cardinality.list_)

a = myclass()
setattr(a, 'id1', 0)
setattr(a, 'id2', '0')
print(a.__dict__)
This results in:
{'_id1': 0, '_id2': [<vclass(type=<goblin.properties.Integer object at 0x7f38c1586780>, value=0)]}
This problem is not seen if the value is not zero. For example:
a = myclass()
setattr(a, 'id1', 1)
setattr(a, 'id2', '1')
print(a.__dict__)
This results in the expected behavior:
{'_id1': [<vclass(type=<goblin.properties.Integer object at 0x7f38c15866a0>, value=1)], '_id2': [<vclass(type=<goblin.properties.Integer object at 0x7f38c1586780>, value=1)]}
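One plausible cause, stated here as an assumption because the goblin source was not checked, is a truthiness test somewhere on the path that wraps values: the integer 0 is falsy in Python, while the string '0' is not, which would match the difference between id1 and id2 above. The two hypothetical helpers below show the pattern and a None-based alternative.

```python
def wrap_if_truthy(val):
    # Suspected pattern: 0 is falsy, so it is returned unwrapped.
    if val:
        return [("Integer", val)]
    return val


def wrap_if_not_none(val):
    # Alternative: only None means "no value"; 0 is wrapped like any other int.
    if val is not None:
        return [("Integer", val)]
    return val
```

With these helpers, wrap_if_truthy(0) gives back the bare 0, while wrap_if_not_none(0) gives a wrapped value, and both behave the same for 1.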
While running a short loop I get an error I can't interpret:
$ ./dump_remote_janusgraph.py
2017-10-23 21:03:11,993,DEBUG,selector_events.py:65,Using selector: EpollSelector
Traceback (most recent call last):
File "./dump_remote_janusgraph.py", line 44, in <module>
loop.run_until_complete(dump_all(app))
File "/usr/lib64/python3.6/asyncio/base_events.py", line 467, in run_until_complete
return future.result()
File "./dump_remote_janusgraph.py", line 33, in dump_all
async for msg in session.g.V():
File "/home/goern/Work/thoth/venv/lib/python3.6/site-packages/aiogremlin/gremlin_python/process/traversal.py", line 46, in __anext__
self.last_traverser = await self.traversers.__anext__()
File "/home/goern/Work/thoth/venv/lib/python3.6/site-packages/aiogremlin/driver/resultset.py", line 66, in __anext__
msg = await self.one()
File "/home/goern/Work/thoth/venv/lib/python3.6/site-packages/aiogremlin/driver/resultset.py", line 10, in wrapper
msg = await fn(self)
File "/home/goern/Work/thoth/venv/lib/python3.6/site-packages/aiogremlin/driver/resultset.py", line 86, in one
loop=self._loop)
File "/usr/lib64/python3.6/asyncio/tasks.py", line 342, in wait_for
timeout_handle = loop.call_later(timeout, _release_waiter, waiter)
File "/usr/lib64/python3.6/asyncio/base_events.py", line 543, in call_later
timer = self.call_at(self.time() + delay, callback, *args)
TypeError: unsupported operand type(s) for +: 'float' and 'str'
Here is the loop:
async def dump_all(app):
    session = await app.session()
    async for msg in session.g.V():
        print(msg)

loop = asyncio.get_event_loop()  # pylint: disable=invalid-name
app = loop.run_until_complete(Goblin.open(  # pylint: disable=invalid-name
    loop, get_hashable_id=get_hashable_id, configfile=CONFIG_FILE))
app.register(BinaryPackage, Requires)
loop.run_until_complete(dump_all(app))
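The last frame of the traceback adds self.time() (a float) to the delay and fails because the delay is a str, so the timeout that reached asyncio was a string. A likely source, though this is an assumption, is a quoted timeout value in the config file. A small coercion step, sketched below with a hypothetical helper name, would turn such a value into a number before it is used.

```python
def coerce_timeout(value):
    """Return the timeout as a float, or None when no timeout is configured."""
    if value is None:
        return None
    return float(value)


from_quoted_config = coerce_timeout("5")   # a value that was quoted in the file
from_plain_config = coerce_timeout(5)      # a value that was already numeric
```

Removing the quotes around the value in the config file should have the same effect without any code change.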
What are your thoughts on goblin + grakn.ai?
Running 'pip3 install goblin' had me seemingly install goblin successfully, though I'm having a bit of trouble running the examples found in the documentation.
I cannot do:
from goblin import driver
and when I attempt to access goblin.driver (just importing goblin) I receive:
module 'goblin' has no attribute 'driver'
Any help would be greatly appreciated.
Hi @davebshow,
So Janusgraph works with Goblin, right?
There's an option to use AWS DynamoDB as the backend storage for JanusGraph. We were looking at that as an option for a managed cloud solution on AWS. I know you're looking at getting Goblin working with Cosmos DB, but we thought we might try JanusGraph with DynamoDB and wanted to know whether you (or anyone else) have tried Goblin on JanusGraph with the DynamoDB option.
Here's the link: https://github.com/awslabs/dynamodb-janusgraph-storage-backend
We're trying it right now but running across issues.
Thanks,
Jordan
Hi Dave.
Can you suggest how I might use timestamps with goblin OGM? In the gremlin console, I might use something like:
g.addV("User").property("createdDate",System.currentTimeMillis())
In general, I would like to traverse based on whether a timestamp for an edge/vertex is gt/lt than that of another edge/vertex, and whether a timedelta between two edges/vertexes is gt/lt some number of minutes.
The following example does not address these traversals, but it does suggest how a timestamp for an entry in the database might be created and used for a different kind of traversal. For all three traversals, is there a way to use the goblin OGM?
gremlin> import java.util.concurrent.TimeUnit
gremlin> import com.thinkaurelius.titan.core.attribute.Timestamp
...
gremlin> g = TitanFactory.open("conf/titan-cassandra-es.properties")
==>titangraph[cassandrathrift:[127.0.0.1]]
gremlin> v1 = g.addVertex(null)
==>v[256]
gremlin> v2 = g.addVertex(null)
==>v[512]
gremlin> v1.addEdge("knows", v2)
==>e[dc-74-1lh-e8][256-knows->512]
gremlin> g.commit()
==>null
gremlin> yesterday = System.currentTimeMillis() - 1000 * 60 * 60 * 24
==>1420758191198
gremlin> g.V().bothE().has('$timestamp', Compare.GREATER_THAN_EQUAL, new Timestamp(yesterday, TimeUnit.MILLISECONDS))
==>e[dc-74-1lh-e8][256-knows->512]
==>e[dc-74-1lh-e8][256-knows->512]
For this, the following line is added to janusgraph-cassandra-es.properties:
storage.meta.edgestore.timestamps=true
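On the Python side, the millisecond values involved can be produced with the standard library. The sketch below only covers the client-side arithmetic (a counterpart of System.currentTimeMillis() and a minutes-based comparison); the helper names are hypothetical, and how to express the comparison inside a goblin traversal is not addressed here.

```python
import time


def current_time_millis():
    """Python counterpart of Java's System.currentTimeMillis()."""
    return int(time.time() * 1000)


def minutes_to_millis(minutes):
    return int(minutes * 60 * 1000)


def within_minutes(ts_a, ts_b, minutes):
    """True when two millisecond timestamps differ by less than `minutes`."""
    return abs(ts_a - ts_b) < minutes_to_millis(minutes)


created = current_time_millis()  # value that could be stored as createdDate
```

Storing the timestamp as a plain integer property keeps it comparable with ordinary numeric predicates.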
This is my code:
loop = asyncio.get_event_loop()
app = loop.run_until_complete(Goblin.open(loop,
    get_hashable_id=get_hashable_id))
app.config_from_file('config.yaml')
loop.run_until_complete(gen_jane(app))
In my code I want to update my config, but I can't.
It always seems to use the default config: when I set the port to 123456, no error is reported.
Please test and check this. It may be a bug. Thank you!
I am writing documentation for a project using Sphinx. Here is the first issue causing the build to fail. I'll post the second in another issue:
File "/lib/python3.5/site-packages/sphinx/ext/autodoc.py", line 862, in filter_members
not keep, self.options)
File "/lib/python3.5/site-packages/sphinx/application.py", line 593, in emit_firstresult
for result in self.emit(event, *args):
File "/lib/python3.5/site-packages/sphinx/application.py", line 589, in emit
results.append(callback(self, *args))
File "/lib/python3.5/site-packages/sphinxcontrib/napoleon/__init__.py", line 428, in _skip_member
qualname = getattr(obj, '__qualname__', '')
File "/lib/python3.5/site-packages/goblin/mapper.py", line 213, in __getattr__
value, self._element_type))
goblin.exception.MappingError: unrecognized property __qualname__ for class: edge
In mapper.py, line 206, I hacked it together this way (so that my build will pass and I can keep working; the generated documentation still looks good):
def __getattr__(self, value):
    try:
        mapping, _ = self._ogm_properties[value]
        return mapping
    except:
        if value == "__qualname__":
            return self.__class__.__qualname__
        raise exception.MappingError(
            "unrecognized property {} for class: {}".format(
                value, self._element_type))
It seems strange to me that a class instance in Python 3 would not have __qualname__.
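In Python 3, __qualname__ is an attribute of classes and functions, not of instances, so a lookup on an instance normally raises AttributeError, and getattr(obj, '__qualname__', '') relies on that to fall back to its default. A custom __getattr__ that raises a different exception type defeats the fallback. The sketch below is a possible alternative to the hack above, using stand-in classes (MappingError and MappingStandIn are hypothetical and are not goblin's code): it raises AttributeError for dunder names and keeps the custom error for everything else.

```python
class MappingError(Exception):
    """Stand-in for goblin.exception.MappingError."""


class MappingStandIn:
    """Hypothetical stand-in for the mapping object defined in mapper.py."""

    def __init__(self, element_type, ogm_properties):
        self._element_type = element_type
        self._ogm_properties = ogm_properties

    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        if name.startswith("__") and name.endswith("__"):
            # Let introspection helpers fall back to their defaults.
            raise AttributeError(name)
        try:
            mapping, _ = self._ogm_properties[name]
        except KeyError:
            raise MappingError(
                "unrecognized property {} for class: {}".format(
                    name, self._element_type))
        return mapping


mapping = MappingStandIn("edge", {"notes": ("db_notes", None)})
qualname = getattr(mapping, "__qualname__", "")  # falls back to ''
```

This keeps third-party introspection working without special-casing each dunder name one at a time.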
Just recently updated goblin to version 2.0.0. The following code used to work fine with Titan 1.1 since it was using the GraphSON1 Serializer packaged with goblin 1.0.0b:
eventloop = asyncio.get_event_loop()
conn = await driver.Connection.open(
    'ws://localhost:8182/gremlin', eventloop,
    message_serializer=driver.serializer.GraphSONMessageSerializer)
I noticed the serializer has migrated to aiogremlin as of 2.0.0 and tried to update the code with the following:
eventloop = asyncio.get_event_loop()
cluster = await Cluster.open(
    eventloop,
    message_serializer=driver.GraphSONMessageSerializer)
But now it seems to hang just like it did when using the GraphSON2 serializer that was the default for the Connection.open function. Is Titan 1.1 no longer officially supported?
Hi Dave,
I read the documentation at http://goblin.readthedocs.io/en/latest/driver.html#configuring-cluster but it only contains a few Python shell statements.
Is there a sample showing how to use the driver to connect to a cluster?
I would like to read some code.
Thanks!
I get this error when trying to use my models and save them:
[gremlin-server-exec-2] WARN org.apache.tinkerpop.gremlin.server.op.AbstractEvalOpProcessor - Exception processing a script on request [RequestMessage{, requestId=a686a34e-d070-4900-b36a-d9b9b953f338, op='eval', processor='', args={gremlin=g.addV("memorial").property(k0, v0), aliases={}, bindings={k0=dateOfBirth, v0=now}}}].
java.lang.IllegalArgumentException: The provided key/value array length must be a multiple of two
This is my code:
gigi = MyVertex()
gigi.dateOfBirth = "now"
gigi.verified = False
name = Name()
name.firstName = "Gigi"
name.title = "Mr"
named = KnownAs(gigi, name)
session.add(gigi, name, named)
await session.flush()
While a raw gremlin query would work:
resp = await g.addV('person', 'developer').property('name', 'Leif').next()
script = "g.addV('person','developer').property(k1, v1)"
bindings = {'k1': 'name', 'v1': 'Leif'}
session = await app.session()
resp = await session.submit(gremlin=script, bindings=bindings)
Is there a mapping problem, or is it just that properties should be a tuple? I have no clue what is happening.
I'm thinking it might be useful to have an internal API that lets goblin know about the graph vendor it's running against. Sort of like vendor-level config.
For example, when running against DSE, get_hashable_id should default to our dict hasher function unless the user explicitly sets it to something else.
Also, if we're using DSE it's not really necessary to query the database to discover whether transactions are supported; they're known to be supported for DSE.
(Incidentally, not querying the database for this would also fix another issue, which is that if an alias hasn't been set, the query in Goblin.supports_transactions fails. It should be possible to get an app with Goblin.open without setting an alias, because the alias could be set on the session instead.)
Maybe the way to do this would be to have an optional kwarg on Goblin.open / Goblin.__init__ that accepts a "vendor information" object, which might be a class or mapping. The default value for that kwarg could be the vendor information object for TinkerGraph.
If this sounds reasonable, @davebshow , @jsenecal , I'll get a PR ready for this.
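To make the proposal above a bit more concrete, here is a minimal sketch of what such a "vendor information" object could look like. Everything in it is an assumption for illustration: VendorInfo, dict_hashable_id, resolve_hashable_id and the two constants are hypothetical names, not part of Goblin's API. A value of None for supports_transactions is used here to mean "unknown, ask the server".

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


def identity_id(val: Any) -> Any:
    """Use the id exactly as the server returned it."""
    return val


def dict_hashable_id(val: Any) -> Any:
    """Hypothetical hasher: turn a dict-shaped id into a hashable tuple."""
    if isinstance(val, dict):
        return tuple(sorted((k, dict_hashable_id(v)) for k, v in val.items()))
    return val


@dataclass(frozen=True)
class VendorInfo:
    """Hypothetical container for vendor-level defaults."""
    name: str
    get_hashable_id: Callable[[Any], Any] = identity_id
    # None: unknown, query the server. True/False: known up front.
    supports_transactions: Optional[bool] = None


# Assumed defaults for illustration only.
TINKERGRAPH = VendorInfo(name="tinkergraph")
DSE = VendorInfo(name="dse", get_hashable_id=dict_hashable_id,
                 supports_transactions=True)


def resolve_hashable_id(vendor, user_override=None):
    """An explicit user setting wins over the vendor default."""
    return user_override if user_override is not None else vendor.get_hashable_id
```

With something like this, an app opened for DSE would pick up the dict hasher by default and could skip the transaction-support query, while an explicit get_hashable_id argument would still take precedence.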
Hi Dave. I'm trying to do a few simple tasks with goblin, but am running into two problems. First, while I can traverse a graph using .has() for equalities, I get errors when I use something like .has('count', P.gt(2)). That error message is:
aiogremlin.exception.GremlinServerError: 500: 500: Value [{operator=gt, other=null, value=2}] is not an instance of the expected data type for property key [num1] and cannot be converted. Expected: class java.lang.Integer, found: class java.util.LinkedHashMap
Second, I'm not sure how to alter my code to connect to a specific JanusGraph graph, 'etrg', for a session, rather than the default graph, 'g'. Previously, I used Goblin.open(translator = GroovyTranslator('etrg'), ...) but it seems that GroovyTranslator has recently been removed from gremlin_python and I'm not sure how to proceed. I want to use the 'etrg' graph with a session.
My test code is:
import asyncio
from goblin import element, Goblin
import goblin
#this fails: from gremlin_python.process.translator import GroovyTranslator
from gremlin_python.process.traversal import P
from gremlin_python import statics
from goblin import DriverRemoteConnection
from goblin.session import bindprop
# ======================================
def get_hashable_id(val):
    # Use the value "as-is" by default.
    result = val
    if isinstance(val, dict) and "@type" in val and "@value" in val:
        if val["@type"] == "janusgraph:RelationIdentifier":
            result = val["@value"]["value"]
    return result
# ======================================
#translator = GroovyTranslator('etrg') # previously, this worked.
# I want to connect the session somehow to ('ws://localhost:8182/gremlin', 'etrg')
# Set up event loop and app
loop = asyncio.get_event_loop()
app = loop.run_until_complete(Goblin.open(loop,
    get_hashable_id=get_hashable_id, translator=None))
# define a vertex class
class Event(goblin.Vertex):
    num1 = goblin.Property(goblin.Integer)

    def alterNum(self):
        self.num1 += 5
        print("\n new = {}\n".format(self.num1))
# add a new attribute
setattr(Event, 'num2', goblin.Property(goblin.Integer))
# Register the models with the app
app.register(Event)
Session = loop.run_until_complete(app.session())
evt = Event()
evt.num1 = 5
evt.num2 = 20
Session.add(evt)
loop.run_until_complete(Session.flush())
# get existing vertex, this works
result = loop.run_until_complete(Session.g.V().hasLabel('event').toList())
for v in result:
    print(" id= {}, num1= {}, num2= {}".format(v.id, v.num1, v.num2))
# traversal with bindprop, this works
bound_name = bindprop(Event, 'num1', 5, binding='v1')
v = loop.run_until_complete(Session.traversal(Event).has(*bound_name).next())
print("\n1: id= {}, num1= {}, num2= {}".format(v.id, v.num1, v.num2))
# traversal w/o bindprop, on Event vertex, this works
v = loop.run_until_complete(Session.traversal(Event).has('num1', 5).next())
print("\n2: id= {}, num1= {}, num2= {}".format(v.id, v.num1, v.num2))
# traversal w/o bindprop, any vertex, this works
v = loop.run_until_complete(Session.g.V().has('num1', 5).next())
print("\n3: id= {}, num1= {}, num2= {}".format(v.id, v.num1, v.num2))
# this causes an exception:
v = loop.run_until_complete(Session.g.V().has(Event.num1, P.gt(2)).next())
# change the value of num1 via the class method, this works
v.alterNum()
loop.run_until_complete(Session.update_vertex(v))
v = loop.run_until_complete(Session.g.V().has('num1', 10).next())
print("\n4: id= {}, num1= {}, num2= {}".format(v.id, v.num1, v.num2))
# remove all vertex, this works
for v in result:
    loop.run_until_complete(Session.remove_vertex(v))
result = loop.run_until_complete(Session.g.V().hasLabel('event').toList())
print("\nresult2 = ", result)
loop.run_until_complete(app.close())
The documentation on the OGM says that a future release will support transactions. Do you have a timeline for that? It would be extremely helpful for me to be able to use transactions in Python the way they are supported with Java in TinkerPop. I realize it is a non-trivial feature request, but I would love to see it move up the priority list if no one is vying for something else.
Add support for schema generation for given metadata from Object Graph Mapper.
Add support for App/Cluster configuration using URL.
--
see also http://docs.sqlalchemy.org/en/latest/core/engines.html#sqlalchemy.engine.url.URL
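As a rough illustration of the URL-based configuration idea, the sketch below splits a connection URL into a plain mapping using only the standard library. The function name config_from_url and the keys of the returned dict are assumptions made for this example; they do not correspond to an existing Goblin API.

```python
from urllib.parse import parse_qsl, urlsplit


def config_from_url(url):
    """Split a connection URL into a config mapping (hypothetical keys)."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
        "username": parts.username,
        "password": parts.password,
        # Query-string parameters are kept separately as extra options.
        "options": dict(parse_qsl(parts.query)),
    }
```

For example, config_from_url("ws://localhost:8182/gremlin") would yield the host, port and path that an app or cluster needs to open a connection, much like the SQLAlchemy engine URL linked above.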
Hi,
I'm using JanusGraph 0.1.1, Python 3.6.1, and Goblin 2.0.0 (from pip), and I ran into an error.
Removing a vertex through a new session fails. Here's the example code:
import asyncio
from goblin import Goblin
from goblin.properties import Property, String, Boolean
from goblin.element import Vertex, Edge
import goblin
def get_hashable_id(val):
    # Use the value "as-is" by default.
    result = val
    if isinstance(val, dict) and "@type" in val and "@value" in val:
        if val["@type"] == "janusgraph:RelationIdentifier":
            result = val["@value"]["value"]
    return result

class Person(Vertex):
    name = Property(String)

async def gen_jane(app):
    session = await app.session()
    jane = Person()
    jane.name = 'JaneDoe'
    session.add(jane)
    await session.flush()

async def del_jane(app):
    session = await app.session()
    trav = session.traversal(Person).has('name', 'JaneDoe')
    jane = (await trav.toList())[0]
    await session.remove_vertex(jane)

loop = asyncio.get_event_loop()
app = loop.run_until_complete(Goblin.open(loop,
    get_hashable_id=get_hashable_id))
app.config_from_file('remote.yaml')
app.register(Person)
loop.run_until_complete(gen_jane(app))
loop.run_until_complete(del_jane(app))
The above code produces
Traceback (most recent call last):
  File "jane_test.py", line 40, in <module>
    loop.run_until_complete(del_jane(app))
  File "/home/jbkoh/anaconda3/lib/python3.5/asyncio/base_events.py", line 387, in run_until_complete
    return future.result()
  File "/home/jbkoh/anaconda3/lib/python3.5/asyncio/futures.py", line 274, in result
    raise self._exception
  File "/home/jbkoh/anaconda3/lib/python3.5/asyncio/tasks.py", line 239, in _step
    result = coro.send(None)
  File "jane_test.py", line 30, in del_jane
    await session.remove_vertex(jane)
  File "/home/jbkoh/anaconda3/lib/python3.5/site-packages/goblin/session.py", line 281, in remove_vertex
    vertex = self.current.pop(hashable_id)
  File "/home/jbkoh/anaconda3/lib/python3.5/weakref.py", line 240, in pop
    o = self.data.pop(key)()
KeyError: 20728
I checked that the vertex is actually removed in JanusGraph and the _id was found correctly as 20728 for jane.
Is this expected behavior? How can I remove a vertex that was retrieved with a query? (I would like to use it as a normal CRUD database.)
Suppose I have a vertex v that has a property hashes of cardinality set:
v.hashes = v.hashes.union(hashes)
Here, I want to update the property to hold the union of the two sets. v.hashes is a SetVertexPropertyManager and supports all the relevant set functions. The union works great: it produces a set that mixes objects of type int and VertexProperty. However, in abc.py, when the assignment happens, it assumes that all the values in val are of primitive type. This means that self.validate fails because there are VertexProperty instances in the value, which obviously don't validate as integers because Python's int function doesn't know what to do with them.
We could fix it by only creating a vertex property if the value isn't already one:
for v in val:
    if not isinstance(v, VertexProperty):
        vp = vertex_prop(data_type, card=card)
        vp.value = self.validate(v)
    else:
        vp = v
    vertex_props.add(vp)
I haven't submitted a pull request because I thought you might prefer to fix this elsewhere.
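To show the idea in isolation, here is a self-contained sketch of the same check using stand-in classes. The VertexProperty class and the to_vertex_props function below are simplified placeholders written for this example, not Goblin's real implementations; validate defaults to int only to mimic an integer property.

```python
class VertexProperty:
    """Stand-in for a vertex property wrapper (not Goblin's real class)."""

    def __init__(self, value=None):
        self.value = value


def to_vertex_props(values, validate=int):
    """Wrap raw values, but reuse items that are already wrapped."""
    vertex_props = set()
    for v in values:
        if isinstance(v, VertexProperty):
            vp = v  # already a vertex property: keep it as-is
        else:
            vp = VertexProperty(validate(v))  # only raw values are validated
        vertex_props.add(vp)
    return vertex_props
```

A mixed input such as {VertexProperty(1), 2, "3"} then passes through without int ever being called on a VertexProperty instance.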
I am opening this issue to discuss the level of integration with gremlin-python we will provide in this package. As @leifurhauks suggested, if we were to implement a couple of custom classes, we would achieve an "async style" Gremlin syntax:
resp = await g.V().has('key', 'val')
Well, almost. Currently, in order to actually submit the generated script and bindings we need to call another method signaling that the traversal is finished and needs to be submitted. In the current gremlin-python implementation this is achieved by calling next, which forms part of a kind of strange iterator machinery built into the gremlin_python.Traversal class that relies on another class, gremlin_python.Traverser, which is instantiated by doing an extra iteration over the results of the RemoteConnection object. So I guess in the end, we can achieve something like:
resp = await g.V().has('key', 'val').next()
async for msg in resp:
...
This is great and slick, and I think the driver should implement this. However, I wonder how integrated this async style Gremlin should be with Goblin. I have several concerns. Maybe they are unfounded, but I figured I would open an issue to do a bit of discussion.
So, first of all, I wonder if in some cases we will need a bit more control over the scripts we submit. Imagine we want to submit arbitrary Groovy code to the server, or do something like transaction control. This is a bit more difficult to achieve if we are relying on all of the internals of the GLV.
Currently, we simply use the GLV to generate strings and bindings, which is quite flexible. For example, something like:
traversal = g.addV(...)
script = "graph.tx().rollback(); try { " + traversal.script + "; graph.tx().commit() } catch (Exception e) { graph.tx().rollback() }"
resp = await conn.submit(script)
async for msg in resp:
...
I'm not sure how to achieve something like this using the GLV. I know that this example is pretty ugly, but I can see things like this coming up.
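For illustration, the string manipulation described above could be factored into a small helper. This is only a sketch: wrap_in_transaction is a hypothetical name, it simply builds a Groovy script string, and it assumes the server-side graph variable is called graph and supports graph.tx().

```python
def wrap_in_transaction(traversal_script):
    """Build a Groovy script that commits on success and rolls back on error."""
    return (
        "graph.tx().rollback(); "
        "try { " + traversal_script + "; graph.tx().commit() } "
        "catch (Exception e) { graph.tx().rollback(); throw e }"
    )
```

The resulting string would then be submitted like any other script, with the bindings generated for the original traversal passed alongside it.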
Also, I'm thinking about how we provide queries. If a user wants to just use the GLV, fine, but I had kind of hoped that goblin could provide its own machinery for user-built queries that are bound to the session elements and feature some limitations as well as some convenience methods. For example, as it is currently:
resp = session.traversal(MyElementType).has('key', 'val').out().all()
So here, we can note that MyElementType provides the traversal with some initial information about the type of traversal being built, i.e., the element type (vertex vs. edge) and element label. Furthermore, instead of calling next (which I think is kind of confusing), we call all, which says we want all of the results returned by the db. We could easily provide other methods that return lists and scalars (like in sqlalchemy): one, first, one_or_none.
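As a sketch of what the semantics of these terminal methods might be, the helpers below operate on any async iterator of results. They are written as free functions with hypothetical names and exception classes, and use a stand-in async generator instead of a real traversal, so nothing here reflects Goblin's actual implementation.

```python
import asyncio


class NoResultFound(Exception):
    """Raised by one() when the result stream is empty."""


class MultipleResultsFound(Exception):
    """Raised when exactly one result was expected but more arrived."""


async def all_(results):
    """Collect every result into a list."""
    return [item async for item in results]


async def first(results):
    """Return the first result, or None if there is none."""
    async for item in results:
        return item
    return None


async def one_or_none(results):
    """Return the single result, None if empty, error if more than one."""
    items = await all_(results)
    if len(items) > 1:
        raise MultipleResultsFound(len(items))
    return items[0] if items else None


async def one(results):
    """Return the single result, error if there are zero or several."""
    items = await all_(results)
    if not items:
        raise NoResultFound()
    if len(items) > 1:
        raise MultipleResultsFound(len(items))
    return items[0]


async def fake_results(*items):
    """Stand-in for a server response stream."""
    for item in items:
        yield item
```

In an OGM these would more likely be methods on the traversal object, so that a query could end in .one() or .first() rather than .next().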
I am sure there are more things to consider here...that's why I am looking for input @leifurhauks @jsenecal
I am new to async/await.
On my development machine my async / await code seems to work perfectly (running on a quite slow ssh connection). When moved to a testing machine in AWS (and running against the same titan server in AWS, both machines running python 3.5.2, dev=Mac, test=Ubuntu 16) I am getting the following errors:
Task was destroyed but it is pending!
task: <Task pending coro=<Connection._terminate_response() running at /usr/local/lib/python3.5/site-packages/goblin/driver/connection.py:251> wait_for=<Future pending cb=[Task._wakeup()]>>
This seems to be occurring in the following command run_titan_cmd, which is based on the Goblin docs. Any hints? Is it returning the msg before the response termination is complete?
async def run_titan_cmd(self, script, binding):
    conn = await driver.Connection.open('ws://' + self.titanURI + ':8182/gremlin', loop, message_serializer=GraphSONMessageSerializer)
    async with conn:
        resp = await conn.submit(gremlin=script, bindings=binding)
        async for msg in resp:
            return msg
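One thing that may be worth ruling out, assuming the warning is related to returning from inside the async for loop before the response stream has been fully read, is to collect every message first and return only afterwards. The sketch below demonstrates that pattern with a stand-in async generator rather than a real Goblin connection; fake_response and first_message are hypothetical names.

```python
import asyncio


async def fake_response():
    """Stand-in for a response stream that yields several messages."""
    for msg in ("first", "second"):
        await asyncio.sleep(0)  # give control back to the event loop
        yield msg


async def first_message(resp):
    """Read the whole stream, then return the first message (or None)."""
    messages = []
    async for msg in resp:
        messages.append(msg)
    return messages[0] if messages else None
```

Whether this removes the warning depends on how the driver terminates responses, which this example does not model.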
Examples from documentation won't work with latest TP3 as addV() is expecting an even number of arguments.
Instead of:
resp = await g.addV('developer').property('name', 'Leif').next()
You need something like:
resp = await g.addV('person','developer').property('name', 'Leif').next()
OGM examples ( session.add() ) seem to break for the very same reason.
Kind regards,
Francisco
The session object pops the vertex's hashable id from its current list; however, if the vertex was not first saved in the current session, then the id will not exist, so it cannot be popped. This is a trivial fix at https://github.com/davebshow/goblin/blob/master/goblin/session.py#L262:
if hashable_id in self.current:
    vertex = self.current.pop(hashable_id)
Which branch should I PR to?
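The traceback in the earlier report shows the pop going through weakref.py, so the behavior can be reproduced with a plain weakref.WeakValueDictionary. The sketch below uses a stand-in Vertex class and a hypothetical forget helper; it relies on the fact that WeakValueDictionary.pop accepts a default, which is an alternative to the membership check.

```python
import weakref


class Vertex:
    """Stand-in element; instances of plain classes can be weakly referenced."""

    def __init__(self, id):
        self.id = id


def forget(current, hashable_id):
    """Drop an element from the session cache if it is there."""
    # Passing a default avoids the KeyError for ids this session never saw.
    return current.pop(hashable_id, None)


current = weakref.WeakValueDictionary()
saved = Vertex(1)
current[saved.id] = saved
```

Calling forget(current, 1) returns the cached vertex, while forget(current, 20728) returns None instead of raising.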
from goblin import DriverRemoteConnection
from goblin import Graph
import asyncio
loop = asyncio.get_event_loop()
async def go(loop):
    remote_connection = await DriverRemoteConnection.open('ws://localhost:8182/gremlin', 'g')
    g = Graph().traversal().withRemote(remote_connection)
    vertices = await g.V().name.toList()
    await remote_connection.close()
    return vertices
results = loop.run_until_complete(go(loop))
print(results)
I got this warning:
3638305 [gremlin-server-exec-4] WARN org.janusgraph.graphdb.transaction.StandardJanusGraphTx - Query requires iterating over all vertices [()]. For better performance, use indexes
I have created a composite index, and my property key is 'id_number'.
How do I use the index in goblin? Please help.
At http://goblin.readthedocs.io/en/latest/ogm.html
Cardinality.list -> Cardinality.list_
We may want the driver to handle 204 messages specially, e.g. by returning an empty response to the caller.
Also we were talking about breaking apart the response messages into their components for easier consumption; in this case we would definitely need to handle 204 specially because callers won't be interacting with messages directly.
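A minimal sketch of the idea, assuming response messages expose a status code and a data payload: the Message tuple and extract_results function are hypothetical and only illustrate treating 204 (no content) as an empty result rather than something the caller has to inspect.

```python
from collections import namedtuple

# Hypothetical message shape for illustration only.
Message = namedtuple("Message", ["status_code", "data"])

NO_CONTENT = 204


def extract_results(messages):
    """Flatten message payloads, treating 204 as 'nothing to return'."""
    results = []
    for msg in messages:
        if msg.status_code == NO_CONTENT:
            continue  # no payload to unpack
        results.extend(msg.data)
    return results
```

With this shape, a caller always gets a list back and never needs to look at individual messages.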
I'm trying to connect Goblin to a remote gremlin server.
Thus far I've only been able to do this by:
conn = await driver.Connection.open('ws://localhost:8182/gremlin', loop)
Is it possible to connect goblin to a remote server without first ssh tunneling into the server?
I've tried replacing the localhost with the remote host name as follows with no luck:
conn = await driver.Connection.open('ws://remote_host_name:8182/gremlin', loop)