grand's Issues

Multi-edges between two nodes

Hello, I use the new version 0.5.1 of grand-graph:

from grand import Graph
from grandcypher import GrandCypher
from grand.backends._sqlbackend import SQLBackend

backend=SQLBackend(db_url="sqlite:///demo2.db")
G = Graph(backend=backend)

G.nx.add_node("spranger", type="Person")
G.nx.add_node("meier", type="Person")
G.nx.add_node("krause", type="Person")
G.nx.add_node("Berlin", type="City")
G.nx.add_node("Paris", type="City")
G.nx.add_node("London", type="City")

G.nx.add_edge("spranger", "Paris", type="LIVES_IN")
G.nx.add_edge("krause", "Berlin", type="LIVES_IN")
G.nx.add_edge("meier", "London", type="LIVES_IN")
G.nx.add_edge("spranger", "Berlin", type="BORN_IN")
G.nx.add_edge("krause", "Berlin", type="BORN_IN")
G.nx.add_edge("meier", "Berlin", type="BORN_IN")


result1 = GrandCypher(G.nx).run("""
MATCH (n)-[r]->(c)
WHERE
    n.type == "Person"
    and
    c.type == "City"
    
RETURN n, r, c    
""")

from lark.lexer import Token
n = result1[Token('CNAME', 'n')]
r = result1[Token('CNAME', 'r')]
c = result1[Token('CNAME', 'c')]

for i in range(len(n)):
    print(f"{n[i]} - {r[i].get('type')} -> {c[i]}")

backend.commit()
backend.close()

results in

  • spranger - BORN_IN -> Berlin
  • spranger - LIVES_IN -> Paris
  • meier - BORN_IN -> Berlin
  • meier - LIVES_IN -> London
  • krause - BORN_IN -> Berlin

  1. The "krause - LIVES_IN -> Berlin" relation is not stored (the same happens to any second relation between the same two nodes).
    This might be because "G.nx" does not support multigraphs.
  2. In plain grandcypher I can query "MATCH (p:Person)". How do I do this in my Cypher query above?
  3. Would it be a good idea to make the backend and the graph layer (e.g. networkx) completely transparent and just run Cypher queries, also for creating nodes and relations?

Kind Regards.
Steffen, the graphologist
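
A minimal sketch of why the second relation may disappear, assuming the graph layer stores at most one attribute dictionary per (source, target) pair, as a simple directed graph does. The two containers below are hypothetical stand-ins, not grand's or networkx's actual storage:

```python
from collections import defaultdict

# The two relations between the same pair of nodes from the example above,
# in the order they were added.
edges = [
    ("krause", "Berlin", {"type": "LIVES_IN"}),
    ("krause", "Berlin", {"type": "BORN_IN"}),
]

# Simple-graph style: one attribute dict per (source, target) pair.
# Adding the same pair again updates the existing dict in place.
simple_edges = {}
for source, target, attrs in edges:
    simple_edges.setdefault((source, target), {}).update(attrs)

# Multigraph style: every added edge is kept as its own entry.
multi_edges = defaultdict(list)
for source, target, attrs in edges:
    multi_edges[(source, target)].append(dict(attrs))

simple_types = [simple_edges[("krause", "Berlin")]["type"]]
multi_types = [attrs["type"] for attrs in multi_edges[("krause", "Berlin")]]
print(simple_types)
print(multi_types)
```

In the simple-graph container the later BORN_IN overwrites LIVES_IN, which matches the output reported above. Whether this is the actual cause inside grand is not confirmed here.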

Small wiki doc issue with SQLBackend

In the wiki this:

import grand
from grand.backends import SQLBackend

grand.Graph(backend=SQLBackend("sqlite:///my-file.db"))

should be this:

import grand
from grand.backends import SQLBackend

grand.Graph(backend=SQLBackend(db_url="sqlite:///my-file.db"))

since the SQLBackend constructor only has kwargs, and no positional args.
I am unable to edit the wiki, hence the issue.

best

Add default dialect

To get drop-in compatibility with networkx, we could make nx the default dialect, so that graph.{method} behaves the same as graph.nx.{method}. This could be verified with a duck-typing test against nx.DiGraph.
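
One way this could work is attribute delegation. The sketch below is an assumption about a possible design, not grand's implementation; `FakeNetworkXDialect` and this `Graph` class are hypothetical stand-ins that only mimic the shape of `graph.nx`:

```python
class FakeNetworkXDialect:
    """Hypothetical stand-in for the object exposed as graph.nx."""

    def __init__(self):
        self._nodes = {}

    def add_node(self, name, **attrs):
        self._nodes.setdefault(name, {}).update(attrs)

    def number_of_nodes(self):
        return len(self._nodes)


class Graph:
    """Hypothetical graph wrapper that falls back to its nx dialect."""

    def __init__(self):
        self.nx = FakeNetworkXDialect()

    def __getattr__(self, name):
        # __getattr__ runs only when normal attribute lookup fails.
        # Guard against infinite recursion if self.nx is not set yet.
        if name == "nx":
            raise AttributeError(name)
        return getattr(self.nx, name)


graph = Graph()
graph.add_node("spranger", type="Person")  # forwarded to graph.nx.add_node
node_count = graph.number_of_nodes()
```

With this fallback, `graph.add_node` and `graph.nx.add_node` resolve to the same bound method, while attributes defined on the wrapper itself keep priority.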

Optionally support TTL cache on Backend function calls

Especially when interrogating larger networks, values might not change enough in between function calls to make re-calling out to a database worthwhile. In these cases, it would be advantageous to cache results either for a certain amount of time or for the lifespan of the Backend.
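
A minimal sketch of such a cache as a decorator, using only the standard library. The names `ttl_cache`, `SlowBackend`, and `get_node_count` are hypothetical and not part of grand; the injectable clock exists only to make the example deterministic:

```python
import functools
import time


def ttl_cache(ttl_seconds, clock=time.monotonic):
    """Cache results per argument tuple for ttl_seconds."""

    def decorator(func):
        cache = {}  # args -> (timestamp, value)

        @functools.wraps(func)
        def wrapper(*args):
            now = clock()
            hit = cache.get(args)
            if hit is not None and now - hit[0] < ttl_seconds:
                return hit[1]
            value = func(*args)
            cache[args] = (now, value)
            return value

        return wrapper

    return decorator


fake_now = [0.0]  # mutable fake clock so the demo does not need to sleep


class SlowBackend:
    """Hypothetical backend whose calls are expensive."""

    def __init__(self):
        self.calls = 0

    @ttl_cache(ttl_seconds=30, clock=lambda: fake_now[0])
    def get_node_count(self):
        self.calls += 1  # stands in for a database round trip
        return 42


backend = SlowBackend()
backend.get_node_count()
backend.get_node_count()          # served from the cache
calls_within_ttl = backend.calls
fake_now[0] = 31.0                # move past the 30 second TTL
backend.get_node_count()          # recomputed
calls_after_expiry = backend.calls
```

Passing `ttl_seconds=float("inf")` would approximate caching for the lifespan of the backend. Note that this sketch keeps a reference to each instance in the cache and never invalidates entries on writes, which a real implementation would need to handle.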

Improve SQLBackend degree performance

Can use something akin to:

out_degree

SELECT 
    source,
    COUNT(*) as source_count  -- one row per outgoing edge
FROM {G.backend._edge_table_name}
GROUP BY
    source

degree (undirected)

SELECT 
    vert, COUNT(*) as vert_count 
FROM 
    (
        SELECT source as vert FROM {G.backend._edge_table_name} 
        UNION ALL
        SELECT target as vert FROM {G.backend._edge_table_name}
    ) AS endpoints
GROUP BY vert
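
The idea can be tried out with an in-memory SQLite table. This is a sketch under assumptions: the table name `edges` and the lowercase `source`/`target` columns are placeholders for whatever the backend actually uses, and `COUNT(*)` is used so that every edge row is counted once:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (source TEXT, target TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("a", "b"), ("a", "c"), ("b", "c")],
)

# Out-degree: number of edge rows per source vertex.
out_degree = dict(
    conn.execute("SELECT source, COUNT(*) FROM edges GROUP BY source")
)

# Undirected degree: each edge contributes once to each of its endpoints.
degree = dict(
    conn.execute(
        """
        SELECT vert, COUNT(*)
        FROM (
            SELECT source AS vert FROM edges
            UNION ALL
            SELECT target AS vert FROM edges
        ) AS endpoints
        GROUP BY vert
        """
    )
)
conn.close()
```

Vertices without outgoing edges (here "c") do not appear in the out-degree result, so a caller would need to treat missing keys as zero.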

NetworkXDialect does not work correctly with networkx.DiGraph

Hi @j6k4m8,

There is an issue with grand.Graph and grand.dialects.NetworkXDialect.

Since NetworkXDialect inherits from networkx.Graph, there are discrepancies between grand.dialects.NetworkXDialect and networkx.DiGraph, which propagate back to grand.Graph. One of them is that networkx.Graph.edges returns an EdgeView while networkx.DiGraph.edges returns an OutEdgeView.

Below is one of the tests that reproduces the issue:

def test_nx_edges(self):
    G = Graph(directed=True).nx
    H = nx.DiGraph()
    G.add_edge("1", "2")
    G.add_edge("2", "1")   # <<< this won't work with EdgeView for G
    G.add_edge("1", "3")
    H.add_edge("1", "2")
    H.add_edge("2", "1")   # <<< OutEdgeView returns this for H
    H.add_edge("1", "3")
    self.assertEqual(dict(G.edges), dict(H.edges))
    self.assertEqual(dict(G.edges()), dict(H.edges()))
    self.assertEqual(list(G.edges["1", "2"]), list(H.edges["1", "2"]))

The result is

    def test_nx_edges(self):
        G = Graph(directed=True).nx
        H = nx.DiGraph()
        # H = nx.Graph()
        G.add_edge("1", "2")
        G.add_edge("2", "1")
        G.add_edge("1", "3")
        H.add_edge("1", "2")
        H.add_edge("2", "1")
        H.add_edge("1", "3")
>       self.assertEqual(dict(G.edges), dict(H.edges))
E       AssertionError: {('1', '2'): {}, ('1', '3'): {}} != {('1', '2'): {}, ('1', '3'): {}, ('2', '1'): {}}
E       - {('1', '2'): {}, ('1', '3'): {}}
E       + {('1', '2'): {}, ('1', '3'): {}, ('2', '1'): {}}
E       ?                              ++++++++++++++++

Add fast backend short-circuit for len(edges)

Right now it's very slow to get the count of edges for certain backends. Adding a query for this directly (rather than requiring enumeration of all edges) would make some functions (such as nx.density) MUCH faster.
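
A sketch of the difference for a SQL-style backend, using the standard library's sqlite3. `TinySQLBackend` and `get_edge_count` are hypothetical names used for illustration; only the general idea of pushing the count into the database is taken from the issue:

```python
import sqlite3


class TinySQLBackend:
    """Hypothetical backend storing edges in a single SQL table."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE edges (source TEXT, target TEXT)")

    def add_edge(self, source, target):
        self._conn.execute("INSERT INTO edges VALUES (?, ?)", (source, target))

    def all_edges_as_iterable(self):
        # Slow path: every row is fetched and handed to Python.
        return self._conn.execute("SELECT source, target FROM edges")

    def get_edge_count(self):
        # Fast path: the database counts rows and returns one number.
        return self._conn.execute("SELECT COUNT(*) FROM edges").fetchone()[0]


backend = TinySQLBackend()
for source, target in [("1", "2"), ("2", "1"), ("1", "3")]:
    backend.add_edge(source, target)

count_by_enumeration = sum(1 for _ in backend.all_edges_as_iterable())
count_by_query = backend.get_edge_count()
```

Both paths return the same number, but the second transfers a single value instead of one row per edge, which is where the speedup for functions such as nx.density would come from.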

Question: edge attributes

My situation is that I use networkx and have access to a Postgres database. I find networkx to be quite slow and thought of using one of the alternatives, especially networkit. The challenge I have is with attributes, i.e. networkit seems to allow only a single numerical 'weight' as an edge attribute. My graphs need fairly rich edge attributes, and networkx accommodates those. So:

  1. Would grand allow me to use networkx syntax/features and edge-attribute functionality with networkit, e.g. filter edges on a rich attribute set but still run algorithms at higher speed?

  2. You mention grand interacting with DynamoDB. I'm not sure I understand that. Is grand using the database to store the graph structure, and if so, could it do that with a Postgres database? Note: I had a look at this and it seems like this is what I had in mind but when I read your readme.

Tightly pinned versions in requirements

Hi, I was trying to use the project, but the tight pins in the requirements seem to be preventing me from installing it in my virtualenv.

https://github.com/aplbrain/grand/blob/master/setup.py#L18

Can we relax these constraints to >= so that it's clearer what the minimum requirements are?

Some notes:

  • I still need to support pandas 0.25
  • Support for lower numpy versions like 1.11 is generally preferred as conda will use that by default
  • SQLAlchemy 1.4 is what I'm using right now - but a bunch of folks are using 1.3 also. There are some backward incompatible changes between them
  • networkx - I try to use the latest, but am open to trying older versions if needed
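
A sketch of what loosened pins could look like in setup.py, using only the lower bounds mentioned in the notes above. The package list and the bounds are illustrative assumptions taken from this issue, not the project's actual requirements:

```python
# Hypothetical install_requires list with minimum versions instead of
# exact pins. Bounds come from the notes in this issue, not from setup.py.
install_requires = [
    "networkx",          # no lower bound stated in the notes
    "numpy>=1.11",
    "pandas>=0.25",
    "SQLAlchemy>=1.3",   # 1.3 and 1.4 differ in backward-incompatible ways
]

# Quick sanity check that no exact pin slipped back in.
exact_pins = [req for req in install_requires if "==" in req]
```

Because of the SQLAlchemy 1.3/1.4 differences mentioned above, a lower bound alone does not guarantee that both versions work; that would need to be tested.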

no nodes persisted in sqlite SQLBackend

If I run
G = Graph(backend=SQLBackend(db_url="sqlite:///demo.db"))
G.nx.add_node("A", foo="bar")
demo.db contains the tables grand_Nodes and grand_Edges, but both are empty; the node data is not stored.
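
One possible explanation, which this issue does not confirm, is that the write was never committed. The sketch below reproduces the same symptom (table present, no rows) with plain sqlite3 from the standard library; it does not use grand, and the `nodes` table is a hypothetical stand-in:

```python
import os
import sqlite3
import tempfile


def count_rows(path):
    """Open a fresh connection and count the rows in the nodes table."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]
    finally:
        conn.close()


db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

# Create the table, insert a row, and close WITHOUT committing the insert.
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE nodes (id TEXT PRIMARY KEY)")
conn.commit()  # make sure the table itself is persisted
conn.execute("INSERT INTO nodes VALUES ('A')")
conn.close()  # the pending insert is rolled back
rows_without_commit = count_rows(db_path)

# Same insert again, this time committed before closing.
conn = sqlite3.connect(db_path)
conn.execute("INSERT INTO nodes VALUES ('A')")
conn.commit()
conn.close()
rows_with_commit = count_rows(db_path)
```

The multi-edge issue earlier in this list ends its script with backend.commit() and backend.close(), so calling backend.commit() after adding nodes may be worth trying here as well.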

The 'graph' attribute isn't present in `G.nx`

Via @MikeB2019x:

@j6k4m8 screenshare not required at the moment but I may take you up on that in the future. So trying to write out a graphml as suggested throws an error (stack trace below). If I compare a networkx graph's attributes and those of G.nx, you'll see: [...,'edges', 'get_edge_data','graph','graph_attr_dict_factory','has_edge','has_node'...] for the former compared to [...'edges','get_edge_data','graph_attr_dict_factory','has_edge','has_node',...] for the latter. That is, the 'graph' attribute isn't present in G.nx. I'm guessing that's intentional?

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Input In [35], in <module>
      1 graphml_file_name = 'graphtools.graphml'
----> 3 nx.write_graphml(G.nx, graphml_file_name)

File <class 'networkx.utils.decorators.argmap'> compilation 17:5, in argmap_write_graphml_lxml_13(G, path, encoding, prettyprint, infer_numeric_types, named_key_ids, edge_id_from_attribute)
      3 from contextlib import contextmanager
      4 from pathlib import Path
----> 5 import warnings
      7 import networkx as nx
      8 from networkx.utils import create_random_state, create_py_random_state

File ~/opt/miniconda3/envs/graph_analytics/lib/python3.8/site-packages/networkx/readwrite/graphml.py:171, in write_graphml_lxml(G, path, encoding, prettyprint, infer_numeric_types, named_key_ids, edge_id_from_attribute)
    160 except ImportError:
    161     return write_graphml_xml(
    162         G,
    163         path,
   (...)
    168         edge_id_from_attribute,
    169     )
--> 171 writer = GraphMLWriterLxml(
    172     path,
    173     graph=G,
    174     encoding=encoding,
    175     prettyprint=prettyprint,
    176     infer_numeric_types=infer_numeric_types,
    177     named_key_ids=named_key_ids,
    178     edge_id_from_attribute=edge_id_from_attribute,
    179 )
    180 writer.dump()

File ~/opt/miniconda3/envs/graph_analytics/lib/python3.8/site-packages/networkx/readwrite/graphml.py:729, in GraphMLWriterLxml.__init__(self, path, graph, encoding, prettyprint, infer_numeric_types, named_key_ids, edge_id_from_attribute)
    726 self.attribute_types = defaultdict(set)
    728 if graph is not None:
--> 729     self.add_graph_element(graph)

File ~/opt/miniconda3/envs/graph_analytics/lib/python3.8/site-packages/networkx/readwrite/graphml.py:740, in GraphMLWriterLxml.add_graph_element(self, G)
    737 else:
    738     default_edge_type = "undirected"
--> 740 graphid = G.graph.pop("id", None)
    741 if graphid is None:
    742     graph_element = self._xml.element("graph", edgedefault=default_edge_type)

AttributeError: 'NetworkXDialect' object has no attribute 'graph'

Originally posted by @MikeB2019x in #35 (comment)
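
The attribute comparison described in the comment can be reproduced with a set difference over dir(). The two classes here are hypothetical stand-ins (one sets a `graph` dictionary in its constructor, the other does not), so the sketch runs without networkx or grand:

```python
class GraphWithGraphAttr:
    """Hypothetical stand-in for a networkx graph, which has a .graph dict."""

    def __init__(self):
        self.graph = {}

    def has_node(self, node):
        return False


class DialectWithoutGraphAttr:
    """Hypothetical stand-in for a dialect object lacking .graph."""

    def has_node(self, node):
        return False


reference = GraphWithGraphAttr()
candidate = DialectWithoutGraphAttr()

# Public attributes present on the reference object but missing on the other.
missing = {
    name
    for name in set(dir(reference)) - set(dir(candidate))
    if not name.startswith("_")
}
print(sorted(missing))
```

Running the same comparison between a real networkx graph and G.nx would list every attribute that functions such as nx.write_graphml might expect but not find.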

SQLBackend all_edges_as_iterable accesses _node_table instead of _edge_table resulting in KeyError: 'Source'

Just a quick find which results in:

  File ".local/lib/python3.10/site-packages/grand/backends/_sqlbackend.py", line 309, in all_edges_as_iterable
    self._node_table.c[self._edge_source_key],
  File ".local/lib/python3.10/site-packages/sqlalchemy/sql/base.py", line 1608, in __getitem__
    return self._index[key][1]
KeyError: 'Source'

This:
https://github.com/aplbrain/grand/blob/e71be46259fde37136cff0bdad3f998787f92cf3/grand/backends/_sqlbackend.py#L297C9-L317

should access self._edge_table instead of self._node_table.

best
