Giter Club home page Giter Club logo

Comments (4)

peekxc avatar peekxc commented on June 16, 2024

Hi @corybrunson,

I've changed a few things in the latest commit. I apologize, I made some api-breaking refactorings to simplify some things (insert_simplex -> insert, supports multiple simplices, etc.). A couple of things in response:

One occasional use need is to introduce new vertices "before" or "between" existing vertices

I've also encountered the need to occasionally insert new simplices. I've added a simple feature the to tree that generates new vertex ids. Here's an example:

library("simplextree")
st <- simplex_tree()
st$generate_ids(3) ## 0 1 2
#> [1] 0 1 2
st$insert(st$generate_ids(3))
print(st$vertices) ## 0 1 2
#> [1] 0 1 2
st$insert(st$generate_ids(2))
st$print_tree() 
#> 0 (h = 2): .( 1 2 )..( 2 )
#> 1 (h = 1): .( 2 )
#> 2 (h = 0): 
#> 3 (h = 1): .( 4 )
#> 4 (h = 0):
st$remove(4)
st$generate_ids(1) # 4
#> [1] 4

Created on 2019-06-06 by the reprex package (v0.3.0)

In my case, I actually needed new ids that were guaranteed to be unique across a filtration-like sequence. You can modify an 'id_policy' property of the tree to change the generating mechanism. See ?generate_ids for more details.

library("simplextree")
st <- simplex_tree()
st$id_policy <- "unique"
st$insert(1:3)
st$generate_ids(1)
#> [1] 4
st$generate_ids(1)
#> [1] 5

Created on 2019-06-06 by the reprex package (v0.3.0)

or to reorder existing vertices ...

I've also had the need to do this. I'm not how to achieve this efficiently, in the sense that because the trie is ordered, I assume one might be able to come up with a nice balancing algorithm that takes into the adjacency structure to optimize the re-labeling.

I implemented a simple workaround via reindex method that might be ok for the time being. It takes as input either a vector of integers of the same length as the number of vertices, which are used as the target of the map, or a list that yields the mapping explicitly. Here's an example:

library(simplextree)
st <- simplex_tree()
st$insert(1:3)
st$print_tree()
#> 1 (h = 2): .( 2 3 )..( 3 )
#> 2 (h = 1): .( 3 )
#> 3 (h = 0):
st$reindex(4:6)
st$print_tree()
#> 4 (h = 2): .( 5 6 )..( 6 )
#> 5 (h = 1): .( 6 )
#> 6 (h = 0):
st$reindex(list("6"=1))
st$print_tree()
#> 1 (h = 2): .( 4 5 )..( 5 )
#> 4 (h = 1): .( 5 )
#> 5 (h = 0):

Created on 2019-06-06 by the reprex package (v0.3.0)

It just serializes the tree into a minimal set of simplices needed to reconstitute the trie, clears the simplices, then reinserts the newly mapped simplices.

relative to some height function that otherwise does not change. It would be convenient to simply use print as the height function, but this would require methods to shift and reorder vertex indices in a simplex tree....

I'm not sure what you mean by this?

Relatedly, the insert_simplex method interprets both 0 and 18446744073709551616 (264) as 0 but interprets negative numbers modulo 264 and ignores (without error) numbers greater than 264.

I added a check on the bounds and sign of the insertion/find/remove etc functions.

library("simplextree")
st <- simplex_tree()
st$insert(c(-1, 2))
#> Error in st$insert(c(-1, 2)): Only unsigned integer simplices are supported.

Created on 2019-06-06 by the reprex package (v0.3.0)

My own preferred behavior would be to throw errors for cases outside [0,264–1]; there may be better options or good reasons to keep the behavior as-is, but i didn't see it documented anywhere.

I do only very simple 'exception handling' w/ Rcpp's stop sugar function. I've never really incoporated heavy use of exception-throwing into my code, and don't plan too until something like this happens to the standard.

As a final note, I do plan to eventually switch the label-type used by the implementation (which is size_t right now) to a template parameter that encapsulates any integral-type, I just haven't gottern around to it.

from simplextree.

corybrunson avatar corybrunson commented on June 16, 2024

The new generator is great. For the record, here's an illustration that more clearly illustrates it:

library(simplextree)
st <- simplex_tree()
st$insert(1:5)
st$vertices
#> [1] 1 2 3 4 5
st$print_tree()
#> 1 (h = 4): .( 2 3 4 5 )..( 3 4 5 4 5 5 )...( 4 5 5 5 )....( 5 )
#> 2 (h = 3): .( 3 4 5 )..( 4 5 5 )...( 5 )
#> 3 (h = 2): .( 4 5 )..( 5 )
#> 4 (h = 1): .( 5 )
#> 5 (h = 0):

st$remove(3)
st$vertices
#> [1] 1 2 4 5
st$print_tree()
#> 1 (h = 3): .( 2 4 5 )..( 4 5 5 )...( 5 )
#> 2 (h = 2): .( 4 5 )..( 5 )
#> 4 (h = 1): .( 5 )
#> 5 (h = 0):

st$generate_ids(4)
#> [1] 0 3 6 7
st$insert(st$generate_ids(4))
st$vertices
#> [1] 0 1 2 3 4 5 6 7
st$print_tree()
#> 0 (h = 3): .( 3 6 7 )..( 6 7 7 )...( 7 )
#> 1 (h = 3): .( 2 4 5 )..( 4 5 5 )...( 5 )
#> 2 (h = 2): .( 4 5 )..( 5 )
#> 3 (h = 2): .( 6 7 )..( 7 )
#> 4 (h = 1): .( 5 )
#> 5 (h = 0): 
#> 6 (h = 1): .( 7 )
#> 7 (h = 0):

Created on 2019-06-06 by the reprex package (v0.2.1)

Together with the reindexing method, they accomplish exactly what i was hoping for. Though incautious use can result in strange behavior without warning. Should there be a check that the target indices are not taken?

library(simplextree)
st <- simplex_tree()
st$insert(1:3)
st$print_tree()
#> 1 (h = 2): .( 2 3 )..( 3 )
#> 2 (h = 1): .( 3 )
#> 3 (h = 0):

st$reindex("3"=1)
#> Error in st$reindex(`3` = 1): target id vector must match the size of the number of 0-simplices.
st$print_tree()
#> 1 (h = 2): .( 2 3 )..( 3 )
#> 2 (h = 1): .( 3 )
#> 3 (h = 0):

Created on 2019-06-06 by the reprex package (v0.2.1)

Sorry for confusion over the height function! I only meant to acknowledge that, by a clever but inefficient choice of height function, the behavior i wanted was possible without reindexing.

Thanks for the reference — i do only what i absolutely must in C++, but this is good context to have.

from simplextree.

peekxc avatar peekxc commented on June 16, 2024

Though incautious use can result in strange behavior without warning. Should there be a check that the target indices are not taken?

There are two issues here.

First, note that the syntax for explicitly mapping vertex ids to new ids requires a list, not named parameters. So st$reindex("3"=1) should be st$reindex(list("3"=1)).

The second issue, I agree it's a good idea to check the target indices are not taken, but I also wanted the method to be able to e.g. swap indices. Here's an example of the desired behaviour I swanted with the newest version of the package.

library(simplextree)
st <- simplex_tree()
st$insert(1:3)
st$reindex(2:4)
st$print_tree()
#> 2 (h = 2): .( 3 4 )..( 4 )
#> 3 (h = 1): .( 4 )
#> 4 (h = 0):
st$reindex(list("4"=2, "2"=5))
st$print_tree()
#> 2 (h = 2): .( 3 5 )..( 5 )
#> 3 (h = 1): .( 5 )
#> 5 (h = 0):
st$reindex(list("5"=2))
#> Error in st$reindex(list(`5` = 2)): target ids must all unique.

Created on 2019-06-08 by the reprex package (v0.3.0)

from simplextree.

corybrunson avatar corybrunson commented on June 16, 2024

Oh, that's strange—i must have originally been using list() and dropped it without noticing the change, since i didn't intend to reproduce an error message. Sorry about that. But the intended behavior makes sense now. Thank you!

from simplextree.

Related Issues (8)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.