Giter Club home page Giter Club logo

Comments (31)

jeremy-murphy avatar jeremy-murphy commented on August 11, 2024 2

I think we have very different perspectives on this question, so for the moment instead of responding to your summary I'll just give you mine.

The premises that I take as given are these:

  • A k-ary tree is a concept in graph theory and so passes some presumed preliminary test of, "Does it even belong in this library?"
  • k-ary trees are an extremely important topic in computer science, so we can presume that people want to model them.

Now the issues start to get a bit more nuanced but I'm going to state them simply.

I'll claim that adding a k-ary tree data structure to Boost.Graph does not preclude the creation of a separate k-ary tree library, either in or out of Boost.

And I'll assert that the audience here is Boost.Graph users, not all Boost users. And so the question is not the general, "Where could/should a k-ary tree data structure go?" -- the question is, "Would Boost.Graph users benefit from a k-ary tree data structure?"

My initial thought was no, there would be no performance benefit, but I was thinking about general trees, not k-ary trees.
Now, I presume there could be a performance benefit. Knowing k at compile-time, theoretically, should allow for optimization of the memory layout. When a binary tree stores a predecessor edge (to the parent), we can do cool things like depth-first search and isomorphism in constant space.
And I guess I am saying that these algorithms cannot be implemented on the existing graph data structures, because they don't have the concept of a fixed k number of successors.

Apart from performance, there could be a much better interface for constructing, querying and mutating a k-ary tree, with functions that relate directly to the particulars of the structure.

How would it work in practice? Boost.Graph has a great generic design, so as I said earlier, a k_ary_tree would exist alongside adjacency_list, adjacency_matrix and compressed_sparse_row_graph. It would satisfy some graph concepts (probably not all) and satisfy some new k-ary tree concepts.

Take an incomplete typical scenario:

adjacency_list<> g;
// Construct g.
MyDFSVisitor vis; // models DFS Visitor concept.
depth_first_search(g, vis, ...);

A user whose graph is a binary tree could instead do this:

k_ary_tree<2> g;
// Construct g using k-ary tree-specific interface.
MyDFSTreeVisitor vis; // models new DFS k-ary tree visitor concept (probably).
depth_first_search(g, vis, ...);

They would not have to use compressed_sparse_row_graph anymore and all the limitations it brings.
Users who do not use k-ary trees would not know anything has changed.
depth_first_search would be a new overload implemented in terms of the k-ary tree concept.

To summarize, I think there has to be either an improvement in performance or interface to justify adding a k-ary tree data structure, and I think there is strong theoretical support that both of these outcomes would be achieved. The real answer will come from experimentation and a proof of concept.

Now, I don't just say all this for the fun of it. Elements of Programming by Stepanov and McJones has a small, complete set of binary tree concepts and algorithms. Anyone can go ahead and make a proof of concept from it. I'll have a go when I have time.

Cheers.

from graph.

jeremy-murphy avatar jeremy-murphy commented on August 11, 2024 2

Also, this presentation might be useful, but I warn you that the speaker is very slow, you need to watch it at 1.5x: https://youtu.be/e-cRWFiGb8E

from graph.

jeremy-murphy avatar jeremy-murphy commented on August 11, 2024 2

Hi @Becheler , sorry yeah, I forgot that there are actually a few files involved. If you look at the corresponding PR, you can see exactly what changed, but essentially there is the 'public' k-ary_tree.hpp file, then a 'private' detail/k-ary_tree.hpp file and the tree concepts that you found added to the existing graph_concepts.hpp file.

from graph.

jzmaddock avatar jzmaddock commented on August 11, 2024 1

I think I agree about closing this one:

  • The answer to the original question is "no".
  • While I suspect that some unnecessary cycles may be used by breadth/depth first searches in checking for cycles, I wonder if this is detectable in practice.
  • If there is a performance improvement to be made - it's in refining the algorithms for this particular case - rather than in a "tree class" as such. I do agree that tree-data structures are an important use case though.

Thinking out loud here.... suppose there was a traits class that allowed you to mark a data structure as a tree (true or false), could if constexpr then be used to optimise the algorithms for that use case? Someone would need to step and implement it, and then prove it actually makes a difference though..

from graph.

jeremy-murphy avatar jeremy-murphy commented on August 11, 2024 1

I didn't really have time but I started anyway and I may as well share my progress now.

from graph.

jeremy-murphy avatar jeremy-murphy commented on August 11, 2024 1

Hey @Becheler thanks for your interest. Best we can offer you right now is the PoC branch I made three years ago.

from graph.

jeremy-murphy avatar jeremy-murphy commented on August 11, 2024 1

That's very flattering, thank you, @Becheler ! It motivates me to one day pick up the baton again and get it finished.

from graph.

Becheler avatar Becheler commented on August 11, 2024 1

Hi @jeremy-murphy !
Trees are back to haunt you! 🌱 👻
I began to play with your bidirectional binary tree code, but I am not sure to understand how I'm supposed to attach vertex/edgeproperties to it? Until now my BGL rocky way has mostly been playing with adjacency_list<....., VertexProperty, EdgeProperty>.
If I understand correctly your implementation I would have to define myself a property map as a data member when I define my class class my_binary_tree : boost::bidirectional_binary_tree<> ? Sorry to bother from ground 0 of BGL 😅

from graph.

jeremy-murphy avatar jeremy-murphy commented on August 11, 2024 1

Hi @jeremy-murphy !
Trees are back to haunt you! 🌱 👻

No problem!

I didn't consider properties when I implemented the class, so yeah, it's completely missing.

I wonder if you could create a new class that simply publicly inherits from both the tree and a property map to get what you need? That would give you the most direct union of functionality and not require redefining the interface.

But use composition if you need more control.

Best of luck!

from graph.

jeremy-murphy avatar jeremy-murphy commented on August 11, 2024 1

Hey @Becheler , yeah that doesn't surprise me. I started writing it with k-ary in mind but at some point I probably mentally switched over to binary and forgot to look back.

I'm planning on putting some effort into my PR again and any contributions would be welcome! However, this is the general issue about adding a tree class to BGL rather than my specific implementation of it, so I think this topic is best moved over to the PR.

from graph.

anadon avatar anadon commented on August 11, 2024

Usually, you just have an acyclic graph, pick a vertex as your root, and use that as a tree. The reason why this usually isn't done with a full fat graph is that there are usually reference based methods or array based methods that can have better performance, lower memory usage, or both. You might want to look at std::make_heap and related functions.

https://en.cppreference.com/w/cpp/algorithm/make_heap

@jzmaddock I think this one can be closed out.

from graph.

mkatliar avatar mkatliar commented on August 11, 2024

In my case the main point is the performance. I know that my graph is always a tree, and I want
a) the traversals (depth-first and breadth-first) to be as quick as possible;
b) avoid dynamic memory allocation whenever possible;
c) fast retrieval of a parent vertex.
Currently I am using boost::compressed_sparse_row_graph as an implementation. I am wondering how much performance improvement could be gained from implementing a specialized tree class. What is your opinion on that?

from graph.

jeremy-murphy avatar jeremy-murphy commented on August 11, 2024

I strongly suspect that you are not prevented from getting maximum performance by the current implementation, although I could be wrong.
My fundamental argument would be that 'treeness' is not a property of type, but of value, in the same sense that 'sorted' is a property of value.
A graph with one node is also a tree. A graph with a cycle becomes a tree when you remove an edge from the cycle (and nominate a root).

I think the key design issue is to implement algorithms that assume a graph is a tree, in the same way that we have algorithms (binary search) that assume a sequence is sorted.

One catch to this is how to deal with trees stored as an undirected graph with explicitly nominated root versus a directed graph with implicit root versus a tree that also stores 'back edges' (from child to parent).

from graph.

jeremy-murphy avatar jeremy-murphy commented on August 11, 2024

I've actually had second thoughts on this and I'm wondering if there is a case for a tree class: not for trees in general, but for k-ary trees.

Correct me if I'm wrong, @jzmaddock , but Graph doesn't have concepts for a fixed number of successors, so the only way I can think of to define a k-ary tree would be to nominate a particular vertex as 'empty' (say, 0, as distinct from the graph_traits<G>::null_vertex()) and create an edge from u to 0 to represent a missing k-th successor. The extra space of one vertex is not an issue, but now the number of vertices is off by one, which makes things a bit awkward.

So maybe it's not a tree class, per se, but a graph with a fixed number of successors.

@mkatliar were you thinking of arbitrarily structured trees or k-ary trees? If the latter, how do you implement them?

from graph.

anadon avatar anadon commented on August 11, 2024

I don't think k-array trees are something which sound be added to the core of BGL. A constraint like that could be added, but it could just as well be user code. What you're trying to get at is a vertex with a set number of out edges and 0 or 1 in edges. Having these predicates checked isn't that bad, but adding complexity to the BGL interface might be.

from graph.

jeremy-murphy avatar jeremy-murphy commented on August 11, 2024

The point is that the k is a compile-time value, not a run-time predicate, thus the reference to concepts.
If you don't think k-ary trees belong in Boost.Graph, where do you think they belong?

from graph.

jeremy-murphy avatar jeremy-murphy commented on August 11, 2024

You may also find it edifying to read the previous discussions about adding a tree class to Boost, which occurred back in 2002 and 2009, and centered around adding one particular tree data structure.
That never eventuated and it's not clear why. I suspect that trying to add an outside tree implementation into a new Boost.Tree library required more energy and analysis than was available or predicted.
(Btw, the rub of all those discussions is that plenty of people want a tree data structure.)
What I think would be simpler is to integrate a new data structure into BGL. It already has adjacency_matrix for whatever special use cases it covers; k_ary_tree would just be another, with accompanying tree-specific overloads of relevant algorithms.

from graph.

anadon avatar anadon commented on August 11, 2024

Reading through the 2009 thread, it doesn't look like an imperative for inclusion was ever reached.

I'm trying to think here, at what point does it make sense to change from having support through graph and runtime checks and support to going through the effort to create a more specialized version of what already exists. I see thee main options: no explicit support, runtime BGL support, or a separate library in Boost. Lets break down pros and cons for each.

No Support pro:

  • Our time, many of us are over taxed as is

No Support con:

  • Betraying users expectations
  • Less usable
  • Higher burden on users

Runtime BGL pro:

  • Accretion of features is usually the right way to go
  • Simpler to implement
    Runtime BGL con:
  • Violates C++'s cardinal rule of "you don't pay for what you don't get"
  • BGL isn't space or computationally optimal (at least without a lot of tweaking and knowledge)

Separate Library pro:

  • Minimal space over head
  • Deeper enforcement of properties
  • Better compile time optimization
    Separate Library con:
  • Likely to require a decent chunk of time

Already with that, I was wrong and the balance tilts towards having something separate. But that something separate should be its own project. But so long as we're here, let's make sure the tree library as a concept would be fully fleshed out. Off hand, I can see that it would need to have a few things available. A forest container, tree data structure, standard functions, and dynamic node expansion. I'm not sold on the forest container being necessary, but having dynamic node expansion is critical for a number of domains and that behavior isn't immediately clear as to what it should even be.

I'm working on 3 hours of sleep. Am I missing anything?

from graph.

jeremy-murphy avatar jeremy-murphy commented on August 11, 2024

What do you mean by "dynamic node expansion"? It sounds like something from a traversal algorithm, but maybe you mean something else.

from graph.

anadon avatar anadon commented on August 11, 2024

I do mean a traversal algorithm as a special case. One of my bigger areas of study has been AI, and this is one feature often worked around in my experience.

Also, for completeness, there probably should be functions for converting to and from a graph.

from graph.

anadon avatar anadon commented on August 11, 2024

Update cross reference from the mailing list: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3700.html

from graph.

jeremy-murphy avatar jeremy-murphy commented on August 11, 2024

Happy new year. @jzmaddock have I not convinced you that this issue is actually worthwhile?

from graph.

jzmaddock avatar jzmaddock commented on August 11, 2024

Maybe, but what do you want me to do?

from graph.

jeremy-murphy avatar jeremy-murphy commented on August 11, 2024

Just reopen this issue.

from graph.

jzmaddock avatar jzmaddock commented on August 11, 2024

Just reopen this issue.

Ah OK, that I can do ;)

BTW with regard to adding this to the BGL - it would be quite a large addition - more than the community maintenance group normally takes on, so something like a mini-review would probably be best. Some open questions off the top of my head:

  • Can it be shown that tree data structures lead to improved performance, how about neater (self explanatory) code?
  • Would the proposal linked by @anadon above work with the BGL "straight out of the box"?

Other than that, something that followed the standardization proposal is more likely to gain traction I would think - and might shake out any issues in the proposal?

from graph.

jeremy-murphy avatar jeremy-murphy commented on August 11, 2024

from graph.

Becheler avatar Becheler commented on August 11, 2024

Hello! Any news on this front? I was hoping to use BGL to upgrade from my own Forest and k-ary Tree 🙃

from graph.

Becheler avatar Becheler commented on August 11, 2024

Oooh nice! Thank you! Well I don't think I want to complicate my dependencies even more, so I will keep my rocky implementation , but I will watch this (not-so-slow-for-non-native-speakers) cool presentation!
[EDIT] I took the time to watch the presentation, and it convinced me to throw my Tree class by the window and begin using the BGL :-)

from graph.

Becheler avatar Becheler commented on August 11, 2024

Hi @jeremy-murphy !
I'm carving my way in, but the code fails to compile with a no template named 'BidirectionalBinaryTreeConcept' in namespace 'boost::concepts';
I can't find anything in the project that defines this concept, am I missing an external dependency?
Also I am not sure this present thread issue is the best suited to this discussion, so feel free to re-orient me somewhere else 😄

from graph.

Becheler avatar Becheler commented on August 11, 2024

Ok, found it in the file https://github.com/jeremy-murphy/graph/blob/k-ary-tree/include/boost/graph/graph_concepts.hpp
. I isolated the required definitions in a concepts.hpp file :

#ifndef BOOST_GRAPH_DETAIL_CONCEPTS_TREE
#define BOOST_GRAPH_DETAIL_CONCEPTS_TREE

#include <boost/concept_check.hpp>
#include <boost/concept/assert.hpp>
#include <boost/concept/detail/concept_def.hpp>

namespace boost::concepts
{

BOOST_concept(ForwardBinaryTree,(G))
: Graph<G>
{
    BOOST_CONCEPT_USAGE(ForwardBinaryTree) {
        t = has_left_successor(u, g);
        t = has_right_successor(u, g);
        v = left_successor(u, g);
        v = right_successor(u, g);
        t = empty(u, g);
        const_constraints(g);
    }
    void const_constraints(G const &g) {
        t = has_left_successor(u, g);
        t = has_right_successor(u, g);
        v = left_successor(u, g);
        v = right_successor(u, g);
        t = empty(u, g);
    }
    bool t;
    G g;
    typename graph_traits<G>::vertex_descriptor u, v;
};


BOOST_concept(BidirectionalBinaryTree,(G))
: ForwardBinaryTree<G>
{
    BOOST_CONCEPT_USAGE(BidirectionalBinaryTree) {
        t = has_predecessor(u, g);
        t = predecessor(u, g);
        u = root(u, g);
        const_constraints(g);
    }

    void const_constraints(G const &g) {
        t = has_predecessor(u, g);
        t = predecessor(u, g);
        u = root(u, g);
    }

    bool t;
    G g;
    typename graph_traits<G>::vertex_descriptor u;
};

BOOST_concept(MutableForwardBinaryTree,(G))
    : ForwardBinaryTree<G>
{
    BOOST_CONCEPT_USAGE(MutableForwardBinaryTree) {
    e = add_left_edge(u, v, g);
    e = add_right_edge(u, v, g);
    // TODO: remove_left_edge, remove_right_edge
    }
    G g;
    typename graph_traits<G>::vertex_descriptor u, v;
    typename graph_traits<G>::edge_descriptor e;
};

BOOST_concept(MutableBidirectionalBinaryTree,(G))
    : MutableForwardBinaryTree<G>
{
    BOOST_CONCEPT_USAGE(MutableBidirectionalBinaryTree) {
    e = add_predecessor(u, v, g);
    remove_predecessor(u, g);
    }
    G g;
    typename graph_traits<G>::vertex_descriptor u, v;
    typename graph_traits<G>::edge_descriptor e;
};
    
}

#endif

from graph.

Becheler avatar Becheler commented on August 11, 2024

@jeremy-murphy in the present state, the syntax k_ary_tree<3> g mentioned here is not supported as the focus seemed to be binary trees only. Am I missing some code updates and do you think it's reasonably feasible to the newcomer to expand on your binary_tree implementation?
Beside having a array<vertex_descriptor, N> successors; that is. I'm no graph theoretician 😛

from graph.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.