
Comments (6)

lvca commented on May 22, 2024

In the end, with the Graph Model this no longer makes sense. The GraphDB approach is faster and lighter. I'm going to close this issue.

from orientdb.

mindplay-dk commented on May 22, 2024

> In the end, with the Graph Model this no longer makes sense. The GraphDB approach is faster and lighter.

Then perhaps links should be deprecated?

IMO, given that they exist and (based on your comment) are a worse-performing, less flexible alternative to graph edges, perhaps it would be preferable for a database engineer not to get distracted and waste any time on them?

At least, the documentation should explain the drawbacks, as they are not going to be obvious to new users. The Udemy course even teaches us to use them, which now sounds like a bad idea (?)


mindplay-dk commented on May 22, 2024

On second thought, what I wrote here last night is nonsense.

There are important differences that give links some advantages over edges:

  1. They resemble in-memory references; that is, they map and correspond 1:1 with object references in an OOP language.
  2. The link or link-collection is visible as a property with a name and embedded directly in the entity.
  3. Link types support and enforce constraints, which helps enforce data consistency.
  4. They are the logical counterpart of embedded documents.

The fact that edges are lighter and faster does not mean that the graph model is a "better choice" - the choice should be made based on what is most suitable for the model you're designing, not based on lacking support for traversal in either direction.
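To make the trade-off concrete, here is a minimal sketch in plain Python (not OrientDB code). The names `Document`, `EdgeStore`, and `referrers_by_scan` are hypothetical and only illustrate the idea: a link is a named property that points one way, while an edge is registered at both endpoints and can therefore be followed in either direction.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Document:
    """Stand-in for a stored record; `rid` plays the role of a record id."""
    rid: str
    # A link: a named property on the document that points one way only.
    link: Optional["Document"] = None


class EdgeStore:
    """Hypothetical edge index: every edge is registered at both endpoints."""

    def __init__(self) -> None:
        self._out = defaultdict(list)
        self._in = defaultdict(list)

    def add_edge(self, source: Document, target: Document) -> None:
        self._out[source.rid].append(target)
        self._in[target.rid].append(source)

    def out(self, doc: Document) -> list:
        return list(self._out[doc.rid])

    def in_(self, doc: Document) -> list:
        return list(self._in[doc.rid])


def referrers_by_scan(target: Document, all_docs: list) -> list:
    """Going 'backwards' over a link means scanning every document."""
    return [d for d in all_docs if d.link is target]


customer = Document("#1")
invoice = Document("#2", link=customer)   # link: invoice -> customer

edges = EdgeStore()
edges.add_edge(invoice, customer)         # edge: invoice -> customer

forward_via_link = invoice.link
backward_via_edge = edges.in_(customer)
backward_via_scan = referrers_by_scan(customer, [customer, invoice])
```

The link is visible as a property on `invoice` and reads like an object reference, but the only way back from `customer` is a full scan. The edge gives cheap reverse traversal, at the cost of living outside the document's own properties.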

I think this needs to be addressed. The data model is incomplete without this feature. For one, people coming from relational databases are used to relations working in both directions.

How about changing the underlying storage mechanism, so that links would be stored as edges? If edges are lighter and faster, that would make sense to me either way.

The simplest approach I can think of is to add a new set of "edge-mapped link" property types: for example, a new Edge property type corresponding to Link, but taking two "type arguments" where Link takes only one, specifying both the class of the document the Edge points to and the class of the outgoing edge it represents.

Similarly, an EdgeList property-type corresponding to LinkList, again with both the document and edge class being specified. And so on for the other collection-types.

These would be stored as visible edges in the graph, and could therefore be traversed and used the same way you would use any other edge. The opposite end of the relationship still would not be visible as a property, unlike what was proposed here, but it no longer needs to be: you can traverse edges in either direction.
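As a rough illustration of the proposal, here is a sketch in plain Python. Nothing here is an existing OrientDB API: `Graph`, `EdgeRecord`, `Node`, and this `EdgeList` descriptor are all hypothetical. The property is declared with both a target class and an edge class, reads like an ordinary property on the owner, and is backed entirely by edges in the graph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeRecord:
    """One stored edge: its class name plus both endpoints."""
    edge_class: str
    source: object
    target: object


class Graph:
    """Hypothetical in-memory graph holding only edges."""

    def __init__(self):
        self.edges = []

    def out(self, source, edge_class):
        return [e.target for e in self.edges
                if e.source is source and e.edge_class == edge_class]

    def in_(self, target, edge_class):
        return [e.source for e in self.edges
                if e.target is target and e.edge_class == edge_class]


class EdgeList:
    """Hypothetical edge-mapped counterpart of LinkList (two 'type arguments')."""

    def __init__(self, target_class, edge_class):
        self.target_class = target_class
        self.edge_class = edge_class

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class: expose the property itself
        return obj.graph.out(obj, self.edge_class)

    def add(self, owner, target):
        # Enforce the declared target class, as a link constraint would.
        if not isinstance(target, self.target_class):
            raise TypeError(f"expected {self.target_class.__name__}")
        owner.graph.edges.append(EdgeRecord(self.edge_class, owner, target))


class Node:
    def __init__(self, graph):
        self.graph = graph


class Item(Node):
    pass


class Order(Node):
    # Reads like a property, but is stored as "HasItem" edges.
    items = EdgeList(target_class=Item, edge_class="HasItem")


graph = Graph()
order, item = Order(graph), Item(graph)
Order.items.add(order, item)

items_of_order = order.items                 # forward, via the named property
orders_of_item = graph.in_(item, "HasItem")  # reverse, via ordinary edge traversal
```

In this sketch `Item` has no property for the reverse side; the reverse direction is reached only through the graph, which matches the point that the opposite end does not need to be visible as a property.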

They would, however, fulfill the four requirements I listed above, and as such provide all the benefits of links and edges combined. The link types could then actually be deprecated: if edges were lighter and faster, and fulfilled all the same requirements as links, then links would no longer be necessary, or at least would be a very poor choice.

Thoughts?


mindplay-dk commented on May 22, 2024

It occurs to me, I may have been thinking about this all wrong - as well as the original reporter, and possibly even @lvca given his remark above before closing this issue.

As I pointed out above, links "resemble in-memory references; that is, they map and correspond 1:1 with object references in an OOP language". If you accept that as true, then it follows that you shouldn't be able to traverse relationships in reverse, because that would deviate from OOP, where a pointer/reference also cannot be traversed in reverse.

Links are similar to embedded objects in that sense too - because embedded objects do not have identity, they are value objects that cannot be identified at all, and so you can't start from those objects in the first place, and therefore of course backwards traversal can't even happen.

In other words, you need to think of both links and embedded objects as belonging to their parent - the only key difference between them, and the only reason to choose one vs the other, is that child objects represented as links have identity, and therefore can have multiple parents, or could even be orphans for that matter.

The mistake seems to be to think of links as "references" in the relational DBMS sense - they're more like pointers in the programming sense, and as such, they make a lot of sense, you just need to be really well-informed about when to use them.

Bottom line, I think, is:

  • Use links or embedded objects only for aggregates in which child objects have no awareness of parent objects (such as is true in an in-memory OOP model)
  • Use embedded objects when a child has precisely one parent
  • Use links when a child may have zero or more parents
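The same rules can be sketched with plain Python objects. This is only an in-memory analogy, not OrientDB code; the classes `Address`, `Author`, `Book`, and `Publisher` are made up for illustration. The embedded child is a value object without identity, the linked child has an identity (`rid`) and can be shared or orphaned, and neither child knows about its parents.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Address:
    """Embedded child: a value object with no identity, exactly one parent."""
    street: str
    city: str


@dataclass(eq=False)
class Author:
    """Linked child: has its own identity and zero or more parents."""
    rid: str
    name: str


@dataclass(eq=False)
class Publisher:
    name: str
    address: Address                              # embedded


@dataclass(eq=False)
class Book:
    title: str
    authors: list = field(default_factory=list)   # links


shared_author = Author("#10", "A. Writer")
orphan_author = Author("#11", "No Books Yet")     # linked child with zero parents

book_a = Book("First", authors=[shared_author])
book_b = Book("Second", authors=[shared_author])  # same child, second parent

pub_a = Publisher("Acme", Address("1 Main St", "Springfield"))
pub_b = Publisher("Apex", Address("1 Main St", "Springfield"))

same_author = book_a.authors[0] is book_b.authors[0]
equal_addresses = pub_a.address == pub_b.address
distinct_address_copies = pub_a.address is not pub_b.address
```

Here `same_author` holds because both books reference one identifiable object, while the two addresses are merely equal values, each owned by its own publisher. Neither `Author` nor `Address` carries any reference back to a parent.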

Does that sum it up?

Perhaps this needs to be clarified in the manual? I really think the manual could use a good section on data storage design and patterns for the OrientDB data model. It takes a lot of time to figure these things out on your own.


lvca commented on May 22, 2024

@mindplay-dk :+1: The first page could be: https://github.com/orientechnologies/orientdb-docs/wiki/Concepts. Would you like to improve it with your considerations?


mindplay-dk commented on May 22, 2024

I could do some work on this, but would you be open to me using the word "reference" and avoiding the word "relationship"?

The word "relationship" implies awareness on either party's behalf. If I just happen to know another person, that doesn't mean we have a relationship.

The word "relation" doesn't strictly imply a two-way relationship: for example, I could have a relation by blood to a relative who doesn't know me. So it accurately describes references and embedded objects in Orient, but the term has already been overloaded by RDBMS, where it does imply being able to query relations in both directions.

So the umbrella term should probably be something like "Aggregation" (rather than "Relationships") which would cover Referenced and Embedded objects.

Using the terms "parent" and "child" in parts of the copy makes sense too.

If you'd like, I can fork and do some work on it?

