lacuna / bifurcan Goto Github PK
View Code? Open in Web Editor NEWfunctional, durable data structures
License: MIT License
functional, durable data structures
License: MIT License
Graphs.merge creates a result
variable, but then mutates variable a
instead of result
, and returns result
. Merge winds up being a noop for immutable graphs. :-O
bifurcan/src/io/lacuna/bifurcan/Graphs.java
Line 191 in 32b3fbf
I am using version 0.2.0-alpha6.
In the following, a list of size 1 is sliced to make another list of size 1 that is then correctly made linear.
Then, a list of size 2 is sliced into a list of size 1. This new list is attempted to be made linear but that fails.
import io.lacuna.bifurcan.IList;
import io.lacuna.bifurcan.LinearList;
public class Main {
public static void main(String[] args) {
LinearList<Integer> list1 = new LinearList<>();
list1.addLast(1);
IList<Integer> slice1 = list1.slice(0, 1).linear();
System.out.println("Slice 1 is linear: " + slice1.isLinear());
LinearList<Integer> list2 = new LinearList<>();
list2.addLast(1);
list2.addLast(2);
IList<Integer> slice2 = list2.slice(0, 1).linear();
System.out.println("Slice 2 is linear: " + slice2.isLinear());
}
}
The output is:
Slice 1 is linear: true
Slice 2 is linear: false
The output should be:
Slice 1 is linear: true
Slice 2 is linear: true
It looks like DirectedAcyclicGraph.remove only ever expands the top/bottom sets, which means that removing a vertex from a DAG leaves it in the top and bottom sets indefinitely.
This might be my bug, but it kinda looks like creating graphs with custom equality/hash semantics can result in duplicates. For instance, from the bifurcan-clj test suite:
(testing "custom equality/hash"
; Very weird: I'm not sure why we get duplicate vertices here. Bug in
; bifurcan maybe?
(is (= {:x #{:y "y"}
:y #{:x "x"}}
(-> (g/graph #(hash (name %)) (fn [a b] (= (name a) (name b))))
(g/link :x :y)
(g/link "y" "x")
datafy)))))
The following methods should have their return type adapted.
SortedMap#remove(K)
should return SortedMap<K, V>
SortedMap#put(K, V)
should return SortedMap<K, V>
SortedMap#put(K, V, BinaryOperator)
should return SortedMap<K, V>
Currently they always need to be re-casted to (SortedMap(K, V))
which makes the code a bit ugly.
I tried to build a jar with lein and I get this error:
sputnik% lein jar
Compiling 26 source files to /home/james/code/forks/java/bifurcan/target/classes
/home/james/code/forks/java/bifurcan/src/io/lacuna/bifurcan/utils/Iterators.java:3: error: package clojure.lang does not exist
import clojure.lang.Obj;
^
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
1 error
Compilation of Java sources(lein javac) failed.
So I tried your benchmark alias to see if something else worked and I get this:
sputnik% lein benchmark
Retrieving criterium/criterium/0.4.3/criterium-0.4.3.pom from clojars
Retrieving proteus/proteus/0.1.6/proteus-0.1.6.pom from clojars
Retrieving eftest/eftest/0.1.4/eftest-0.1.4.pom from clojars
Retrieving progrock/progrock/0.1.2/progrock-0.1.2.pom from clojars
Retrieving io/aviso/pretty/0.1.33/pretty-0.1.33.pom from clojars
Retrieving criterium/criterium/0.4.3/criterium-0.4.3.jar from clojars
Retrieving progrock/progrock/0.1.2/progrock-0.1.2.jar from clojars
Retrieving eftest/eftest/0.1.4/eftest-0.1.4.jar from clojars
Retrieving io/aviso/pretty/0.1.33/pretty-0.1.33.jar from clojars
Retrieving proteus/proteus/0.1.6/proteus-0.1.6.jar from clojars
Compiling 26 source files to /home/james/code/forks/java/bifurcan/target/classes
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
benchmarking LinearMap
Exception in thread "main" java.lang.NullPointerException, compiling:(/tmp/form-init9040329292974367555.clj:1:73)
at clojure.lang.Compiler.load(Compiler.java:7391)
at clojure.lang.Compiler.loadFile(Compiler.java:7317)
at clojure.main$load_script.invokeStatic(main.clj:275)
at clojure.main$init_opt.invokeStatic(main.clj:277)
at clojure.main$init_opt.invoke(main.clj:277)
at clojure.main$initialize.invokeStatic(main.clj:308)
at clojure.main$null_opt.invokeStatic(main.clj:342)
at clojure.main$null_opt.invoke(main.clj:339)
at clojure.main$main.invokeStatic(main.clj:421)
at clojure.main$main.doInvoke(main.clj:384)
at clojure.lang.RestFn.invoke(RestFn.java:421)
at clojure.lang.Var.invoke(Var.java:383)
at clojure.lang.AFn.applyToHelper(AFn.java:156)
at clojure.lang.Var.applyTo(Var.java:700)
at clojure.main.main(main.java:37)
Caused by: java.lang.NullPointerException
at java.io.StringReader.<init>(StringReader.java:50)
at clojure.lang.RT.readString(RT.java:1834)
at clojure.lang.RT.readString(RT.java:1830)
at clojure.core$read_string.invokeStatic(core.clj:3687)
at clojure.core$read_string.invoke(core.clj:3677)
at bifurcan.benchmark_test$benchmark_collection.invokeStatic(benchmark_test.clj:602)
at bifurcan.benchmark_test$benchmark_collection.invoke(benchmark_test.clj:601)
at bifurcan.benchmark_test$_main$fn__6660.invoke(benchmark_test.clj:625)
at clojure.core$map$fn__4785.invoke(core.clj:2644)
at clojure.lang.LazySeq.sval(LazySeq.java:40)
at clojure.lang.LazySeq.seq(LazySeq.java:49)
at clojure.lang.RT.seq(RT.java:521)
at clojure.core$seq__4357.invokeStatic(core.clj:137)
at clojure.core.protocols$seq_reduce.invokeStatic(protocols.clj:24)
at clojure.core.protocols$fn__6738.invokeStatic(protocols.clj:75)
at clojure.core.protocols$fn__6738.invoke(protocols.clj:75)
at clojure.core.protocols$fn__6684$G__6679__6697.invoke(protocols.clj:13)
at clojure.core$reduce.invokeStatic(core.clj:6545)
at clojure.core$into.invokeStatic(core.clj:6610)
at clojure.core$into.invoke(core.clj:6604)
at bifurcan.benchmark_test$_main.invokeStatic(benchmark_test.clj:626)
at bifurcan.benchmark_test$_main.doInvoke(benchmark_test.clj:609)
at clojure.lang.RestFn.invoke(RestFn.java:410)
at clojure.lang.Var.invoke(Var.java:379)
at user$eval5.invokeStatic(form-init9040329292974367555.clj:1)
at user$eval5.invoke(form-init9040329292974367555.clj:1)
at clojure.lang.Compiler.eval(Compiler.java:6927)
at clojure.lang.Compiler.eval(Compiler.java:6917)
at clojure.lang.Compiler.load(Compiler.java:7379)
... 14 more
sputnik% java -version
openjdk version "1.8.0_121"
OpenJDK Runtime Environment (Zulu 8.20.0.5-linux64) (build 1.8.0_121-b15)
OpenJDK 64-Bit Server VM (Zulu 8.20.0.5-linux64) (build 25.121-b15, mixed mode)
It'll be really cool if there was a way for existing clojure code to be interpreted/compiled using bifurcan datastructures:
ie.
(assoc {} :a 1 :b 2) => io.lacuna.bifurcan.Map
(conj [] 1 2 3) => io.lacuna.bifurcan.List
I'm not too sure of the difficulty or the feasibility.
Hi @ztellman I see that 0.2.0-alpha2
appeared on Maven Central.
Unfortunately I don't see any Git tag or release on GitHub. I was wondering if you might be able to tell me what has changed since 0.2.0-alpha1
and if there was any resolution to our open issue #23 . Thanks :-)
Was there expected to be API changes? https://travis-ci.com/github/eXist-db/exist/jobs/321004943#L304
I get the impression that DAG bottom sets should be just those vertices with no out edges. However, it looks like they also include nodes in the middle:
(let [g (-> (g/directed-acyclic-graph)
(g/link :a1 :b)
(g/link :a2 :b)
(g/link :b :c1)
(g/link :b :c2)
(g/link :a3 :c3))]
(testing "top"
(is (= #{:a1 :a2 :a3} (datafy (g/top g)))))
(testing "bottom"
; Is the presence of b here a bug, or expected behavior?
(is (= #{:b :c1 :c2 :c3} (datafy (g/bottom g)))))))
There's a project with an implementation of CHAMP Trie based data structures - Capsule.
Do you think it'd make sense to add it to the benchmarks?
There appears to be a null pointer exception that occurs at this line
The rest of the NPE stacktrace
! java.lang.NullPointerException: null
! at io.lacuna.bifurcan.nodes.MapNodes$Node.put(MapNodes.java:193) ~[bifurcan-0.1.0.jar:na]
! at io.lacuna.bifurcan.Map.put(Map.java:139) ~[bifurcan-0.1.0.jar:na]
! at io.lacuna.bifurcan.Map.put(Map.java:135) ~[bifurcan-0.1.0.jar:na]
! at io.lacuna.bifurcan.Map.put(Map.java:130) ~[bifurcan-0.1.0.jar:na]
! at io.lacuna.bifurcan.Map.put(Map.java:24) ~[bifurcan-0.1.0.jar:na]
I see that my project is using an older version of Bifurcan, but judging from my cursory read of the MapNodes implementation, it seems the NPE is still possible?
I suspect the NPE behavior is also non-deterministic. In my testing environment, I had four separate processes populating bifurcan.Map
with data from the same source, but only one resulted in the NPE mentioned above.
Unfortunately, I don't have more precise reproduction steps for you. But will update here if I end up finding a test that results in the NPE.
I'm doing just a simple benchmark test and am finding that the standard implementation in clojure (vec) is faster than List and LinearList on inserts
[(h/bench-ns {}
(dotimes [i 4]
(let [^LinearList v (LinearList.)]
(dotimes [i 1000000]
(.addFirst ^LinearList v (Object.)))
(.size v))))
(h/bench-ns {}
(dotimes [i 4])
(reduce (fn [v i]
(.addFirst ^List v (Object.)))
(List.)
(range 1000000)))
(h/bench-ns {}
(dotimes [i 4])
(reduce (fn [v i]
(conj v (Object.)))
[]
(range 1000000)))]
[85886785
43148048
36295950]
[86910871
48210347
37343646]
this is quite rough but I'm running it a couple of times and it consistently shows the same results. Is there a way to speed it up?
Not totally sure what's up here, but map-value seems to work OK on normal maps, but not on int-maps. It looks like it passes a null value to the function?
(deftest ^:buggy map-values-test
(let [m (i/from {1 10, 3 30})]
(testing "IMap"
; This fails due to a bug I think
(is (= {1 11 3 31} (datafy (m/map-values m (fn [k v]
(prn :k k :v v)
v))))))))
:k 0 :v nil
...
bifurcan-clj.int-map-test/fn int_map_test.clj: 77
bifurcan-clj.map/map-values map.clj: 202
io.lacuna.bifurcan.IntMap.mapValues IntMap.java: 23
io.lacuna.bifurcan.IntMap.mapValues IntMap.java: 286
io.lacuna.bifurcan.nodes.IntMapNodes$Node.mapVals IntMapNodes.java: 197
java.lang.NullPointerException: Cannot invoke "io.lacuna.bifurcan.nodes.IntMapNodes$Node.mapVals(Object, java.util.function.BiFunction)" because "<local3>.content[<local4>]" is null
The current EMPTY
instances have no type parameters, similar to j.u.collection's empty instances.
Add an empty()
method that performs the unchecked cast in the library, akin to Collections.emptyMap()
Sorry to report another one... I am not sure if this bug existed before #23 was fixed, or if it is a regression. I suspect it is related though.
Unfortunately after removing from a LinearMap
the iterator over the keys of that map produces the wrong keys.
The test below shows that:
new Map<>().linear()
works as expectednew LinearMap<>()
causes the wrong keys to be returned by iteration after a remove(key)
.import io.lacuna.bifurcan.*;
import org.junit.Test;
import java.util.Iterator;
import static org.junit.Assert.*;
public class BifurcanTest {
@Test
public void removeFromLinearMap() {
final IMap<String, String> map = new LinearMap<>();
removeRemove(map);
}
@Test
public void removeFromMapConvertedToLinear() {
final IMap<String, String> map = new Map<String, String>().linear();
removeRemove(map);
}
private void removeRemove(final IMap<String, String> map) {
storeDays(map);
final IMap<String, String> allDaysMap = map.forked(); // creates DiffMap from LinearMap (when starting with LinearMap)
final IMap<String, String> withoutWeds = remove(allDaysMap, "We");
assertSetEquals(Set.of("Su", "Mo", "Tu", "Th", "Fr", "Sa"), withoutWeds.keys());
assertSetEqualsByIteration(Set.of("Su", "Mo", "Tu", "Th", "Fr", "Sa"), withoutWeds.keys());
final IMap<String, String> withoutWedsAndTues = remove(withoutWeds, "Tu");
assertSetEquals(Set.of("Su", "Mo", "Th", "Fr", "Sa"), withoutWedsAndTues.keys());
assertSetEqualsByIteration(Set.of("Su", "Mo", "Th", "Fr", "Sa"), withoutWedsAndTues.keys());
}
private IMap<String, String> remove(final IMap<String, String> immutableMap, final String... keys) {
assertTrue( !immutableMap.isLinear());
// create a mutable map
IMap<String, String> mutableMap = immutableMap.linear();
assertTrue(mutableMap.isLinear());
for (final String key: keys) {
mutableMap = mutableMap.remove(key);
assertFalse(mutableMap.contains(key));
assertFalse(mutableMap.keys().contains(key));
//TODO(AR) the blow assertion fails where it should not - this is because toString also makes use of Iterator! Should be uncommented when the Iterator is fixed...
// assertEquals(-1, mutableMap.keys().toString().indexOf(key));
}
// return an immutable map
return mutableMap.forked();
}
private void storeDays(final IMap<String, String> map) {
assertTrue(map.isLinear());
map.put("Su", "Sunday");
map.put("Mo", "Monday");
map.put("Tu", "Tuesday");
map.put("We", "Wednesday");
map.put("Th", "Thursday");
map.put("Fr", "Friday");
map.put("Sa", "Saturday");
assertSetEquals(Set.of("Su", "Mo", "Tu", "We", "Th", "Fr", "Sa"), map.keys());
}
private static <T> void assertSetEquals(final ISet<T> expected, final ISet<T> actual) {
assertEquals("Expected equal size", expected.size(), actual.size());
assertTrue(actual.containsAll(expected));
}
private static <T> void assertSetEqualsByIteration(final ISet<T> expected, final ISet<T> actual) {
assertEquals("Expected equal size", expected.size(), actual.size());
for (final Iterator<T> itActual = actual.iterator(); itActual.hasNext(); ) {
final T act = itActual.next();
assertTrue("`actual` has key '" + act + "', but this is not in `expected`!", expected.contains(act));
}
}
}
As far as I can tell the io.lacuna.bifurcan.Graphs.Edge<V, E> class serves as a standard implementation for IEdge<V, E>.
As in many graph instances the edges won't have a special value, it should be allowed to set the value to null. But now, the #hashCode method will throw a NullPoinrterException.
It should either be changed to:
@Override
public int hashCode() {
if (hash == -1) {
hash = from.hashCode() ^ to.hashCode() ^ (vale == null ? 0 : value.hashCode());
}
return hash;
}
or to:
@Override
public int hashCode() {
if (hash == -1) {
hash = Objects.hash(from, to, value);
}
return hash;
}
If either from or to is null a NullPointerException should be thrown by the constructor.
I recently tried to test the limits of the number of elements that can be stored in an instance of List<Long>
and I learned (as I suspected) that indexing will not work properly with values that are greater than Integer.MAX_VALUE
. This is because there is a downcast from long
to int
at line 79 of List.java
. I think most users will assume that because long
is used to index this collection that it will store more than Integer.MAX_VALUE
number of elements. Some documentation to clarify this or to update the implementation to allow more values would be helpful. ๐
I am a little perplexed over when it is appropriate to call fork or linear.
I have called map2.forked
before I then call map2.linear
to create map3
, however I see changes made to my map3
reflected in my map2
. I had thought that forked
would mean that changes to map3
don't affect map2
.
It may be that I have misunderstood the docs and this isn't a bug after all. Some guidance would be appreciated please...
@Test
public void bifurcanImmutableRemoveThenRemove() {
/*
1. Create the initial map1: `map { 1: true(), 2: true() }`
*/
IMap<Integer, Boolean> map1 = new LinearMap<>();
map1.put(1, true);
map1.put(2, true);
map1 = map1.forked(); // make the map immutable
checkMapIsForked(map1);
/*
2. Create map2 and remove the entry with key `2`
*/
IMap<Integer, Boolean> map2 = map1.linear(); // create a transient map for modifications
map2 = map2.remove(2);
map2 = map2.forked(); // make the map immutable
checkMapIsForked(map2);
/*
3. Check the entries in map2 with key `1` and `2`
*/
final Boolean expected1 = map2.get(1, null);
assertTrue(expected1);
final Boolean expected2 = map2.get(2, null);
assertNull(expected2);
/*
4. Create map3 and remove the entry in the with key `1`
*/
IMap<Integer, Boolean> map3 = map2.linear(); // create a transient map for modifications
map3 = map3.remove(1);
map3 = map3.forked(); // make the map immutable
checkMapIsForked(map3);
/*
I expect map2.get(1, null) to return true
*/
assertEquals(expected1, map2.get(1, null));
/*
I expect map3.get(1, null) to return null
*/
assertNotEquals(expected1, map3.get(1, null));
}
So far I do not know that this is a correctness issue, only a minor performance issue from evidence I have now. Starting from an empty list and appending elements, some trees are constructed with 1 extra level of depth/height than necessary.
$ clj "-J-Dclojure.server.repl={:port 50505 :accept clojure.core.server/repl}" -Sdeps '{:deps {io.lacuna/bifurcan {:mvn/version "0.2.0-alpha1"}}}'
(import '(io.lacuna.bifurcan List))
;; (capac s) is the maximum number of vector elements that can be
;; descendants beneath a tree node with shift s.
(defn capac [shift]
(bit-shift-left 1 (+ shift 5)))
;; make private root field accessible
(def list-class (class (List.)))
(def list-root-field (.getDeclaredField list-class "root"))
(.setAccessible list-root-field true)
(defn get-root [v]
(.get list-root-field v))
(defn listnode? [x]
(instance? io.lacuna.bifurcan.nodes.ListNodes$Node x))
(defn descendant-depths [n]
(loop [remaining-nodes [[0 n]]
depth->count {}]
(if (zero? (count remaining-nodes))
depth->count
(let [[depth node] (peek remaining-nodes)
depth->count (update depth->count depth (fnil inc 0))
children (if (listnode? node)
(take (.numNodes node) (seq (.nodes node))))]
(recur (into (pop remaining-nodes)
(map (fn [x] [(inc depth) x]) children))
depth->count)))))
(defn list-n [n]
(reduce (fn [v e] (.addLast v e))
(List.)
(range n)))
(def n1 (+ (capac 15) (- (capac 10)) (capac 5) (capac 0)))
(def pv (list-n n1))
(descendant-depths (get-root pv))
;; {0 1, 1 1, 2 32, 3 994, 4 31777}
For a List with fewer than (capac 15) elements, I would expect a tree
with shift 15 at the root, but this is the smallest value of n I found
that causes the tree to go to shift 20 at the root, a bit too early.
As far as I can tell so far this is not a correctness issue -- nth
returns the correct value for all indices, for example. It is a
performance issue.
I have not tried changes to the code yet to modify this behavior, but
I believe it is because the condition to add a new level of depth to
the tree in method pushLast only checks that the bottom-most node has
the maximum branching factor, and the root node does, but not that all
of the other nodes on the "right tree edge" have the maximum branching
factor. Thus the code can add a new level of depth to the tree when
an intermediate depth node on the "right tree edge" has only 1 node.
(def wrong-count (atom 0))
@wrong-count
(dotimes [i n1]
(when (not= (.nth pv i) i)
(swap! wrong-count inc)))
@wrong-count
;; 0 - good
The value of n shown below is the smallest one I know of that exhibits this problem.
One issue with generative testing and data structures like List with wide branching factors is that it can take a large number of operations to reach "interesting" tree structures that exercise new code paths. I have found that making a local change with SHIFT_INCREMENT=2 and thus MAX_BRANCHES=(1 << 2) = 4 causes the existing generative tests to find issues like this in a fairly short period of time.
That requires a few replacements of literal constants 5 and 32 in the List.java and ListNodes.java source files to be replaced with appropriate names, to avoid replacing them manually in multiple places. I can submit a PR with such changes if that would be of interest.
$ clj -Sdeps '{:deps {io.lacuna/bifurcan {:mvn/version "0.2.0-alpha1"} org.clojure/clojure {:mvn/version "1.10.1"}}}'
(import '(io.lacuna.bifurcan List))
(def n 33824)
(def l1 (.concat (List.) (List/from (range n))))
(= (seq (range n)) (iterator-seq (.iterator l1)))
;; true. Good
(.size l1)
;; 33824 - the expected size
(def l2 (.removeLast l1))
(= (seq (range (dec n))) (iterator-seq (.iterator l2)))
;; false. bug!
(.size l2)
;; 1055 - not even close to the correct value
I've noticed that there's quite a lot of similarities between the two structures. Is there a way to have a hybrid structure that can upscale the simpler clojure vector if need be to an RRB Tree?
This will get rid of a problem that's always bothered me:
(conj (cons [...] 1) 2) => (2, 1, ...) ;; which is a Linked List, not a vector
will then be:
(conj (cons [...] 1) 2) => [1 ... 2] ;; which is an upscaled Bifurcan List
Hi, thank you for the great library!
I've found out that the forked set can be modified in case the initial linear set gets any modifications.
This behaviour doesn't reproduce for lists.
Here is the code example:
import io.lacuna.bifurcan.List;
import io.lacuna.bifurcan.Set;
public class Main {
public static void main(String[] args) {
Set<Integer> linearSet = new Set<Integer>().linear();
linearSet.add(1);
Set<Integer> forkedSet = linearSet.forked();
linearSet.add(2);
System.out.println("Linear set: " + linearSet);
System.out.println("Forked set: " + forkedSet);
List<Integer> linearList = new List<Integer>().linear();
linearList.addLast(1);
List<Integer> forkedList = linearList.forked();
linearList.addLast(2);
System.out.println("Linear list: " + linearList);
System.out.println("Forked list: " + forkedList);
}
}
And here is the produced output:
Linear set: {1, 2}
Forked set: {1, 2}
Linear list: [1, 2]
Forked list: [1]
I'd expect that the "Forked set" remains as {1}
.
Here is the full project code: https://github.com/AlexPl292/distraught-chimera
Currently, DirectedGraph#edge
relies on Map#get(K)
which returns an optional.
This could be improved by using Map#get(K, V)
with a default value - especially in highly frequently used graph traversals this could save man allocations.
So instead of:
@Override
public E edge(V from, V to) {
Map m = out.get(from).orElseThrow(() -> new IllegalArgumentException("no such edge"));
Object e = m.get(to, DEFAULT);
if (e == DEFAULT) {
throw new IllegalArgumentException("no such edge");
} else {
return (E) e;
}
}
It could be something like:
@Override
public E edge(V from, V to) {
Map m = out.get(from, DEFAULT_MAP)
if (m == DEFAULT) {
throw new IllegalArgumentException("no such edge");
}
Object e = m.get(to, DEFAULT);
if (e == DEFAULT) {
throw new IllegalArgumentException("no such edge");
} else {
return (E) e;
}
}
SortedMap
already has a field comparator
, but is is always set to Comparator.naturalOrder()
.
There is also the private constructor SortedMap(Node<K, V> root, boolean linear, Comparator<K> comparator)
that is used by SortedMap
internally - although it can also be invoked with reflection magic. Tests show that it seems to work fine when supplying a custom Comparator
.
But there should be an "official" possibility.
It appears that several methods like addLast, addFirst can cause method grow to be called, even if the List is documented as immutable, e.g. isLinear() returns false.
I agree that the code is mutating the node in such a way that smaller arrays are replaced with larger arrays that have the same contents in the indices that are common to both the old and new arrays, which is good.
The red flag it raises in my mind is that if thread T1 is executing grow on a forked List, and thread T2 is doing read-only operations on it concurrently, then modifications made by T1 might not be safely published to T2, and T2 could start getting inconsistent results for different array elements.
Does that sound possible?
Hi, first off, very cool library & benchmarks!
I'm not sure if I'm doing something wrong or making an incorrect assumption, but I think I've found a bug causing accidental mutation in List in 0.2.0-alpha6, see my minifed reproducer:
import io.lacuna.bifurcan.List;
public class Test {
public static void main(String args[]) {
int size = 32;
List<Integer> a = new List<Integer>();
List<Integer> b = new List<Integer>();
for (int i = 0; i < size; i++) {
a = a.addLast(1);
b = b.addLast(1);
}
List<Integer> ap = a.set(15, 2);
List<Integer> bp = b.set(15, 2);
List<Integer> z = null;
// Comment out, and the equals is true
z = ap.set(15, 3);
System.out.println("a: " + a);
System.out.println("b: " + b);
System.out.println("ap:" + ap);
System.out.println("bp:" + bp);
System.out.println("z: " + z);
System.out.println("Should be true: " + ap.equals(bp));
}
}
Will print:
~/test โ javac -cp "./bifurcan-0.2.0-alpha6.jar" Test.java && java -cp ".:./bifurcan-0.2.0-alpha6.jar" Test
a: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
b: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
ap:[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
bp:[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
z: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Should be true: false
Commenting out the z = ap.set(15,3);
line, or changing the size to anything less than 32 will make the result true.
Hi!
I have encountered a bug in List implementation when tried to implement Sorted Array data structure. I've been using concat and slice operations heavily to insert new elements at proper places, and it turned out my data structure failed after several iterations.
I've logged requests to List and recreated a pure test which fails with internal AIOOBE. Just in case, I've re-checked, the indices queried are correct and exist.
java.lang.ArrayIndexOutOfBoundsException: -1
at io.lacuna.bifurcan.nodes.ListNodes$Node.pushLast(ListNodes.java:364)
at io.lacuna.bifurcan.nodes.ListNodes.pushLast(ListNodes.java:34)
at io.lacuna.bifurcan.nodes.ListNodes$Node.slice(ListNodes.java:240)
at io.lacuna.bifurcan.List.slice(List.java:228)
at io.lacuna.bifurcan.List.slice(List.java:19)
Here's the link to the gist: https://gist.github.com/valich/0958ed848ff5999532de59f360a2f899
I understand this is not a minimal sample, but I hope this will be enough to locate the problem. Please tell if you need more info.
The bug is present in 0.1.0 and 0.2.0-alpha.
Oh, yes, and removing linear/forked calls does not help either if that's important.
On the following page, you compare a number of persistent data structures:
https://github.com/lacuna/bifurcan/blob/master/doc/comparison.md
In the lists section, I think it would be more appropriate to use clojure.PersistentList than clojure.PersistentVector, considering you switch to the equivalent list types for other languages.
benchmark-iteration is defined twice in the benchmark-test namespace. The first definition is incorrect, but that doesn't matter much since the second one is correct, and replaces the first. No bug here, just a couple lines of unnecessary and unused code.
A smaller example was found via generative testing with List code locally modified with SHIFT_INCREMENT=2 MAX_BRANCH=(1 << 2)=4. Then I tried the scaled up version with a similar tree structure and got the same error.
Everything goes well up until the final slice method call.
$ clj -Sdeps '{:deps {io.lacuna/bifurcan {:mvn/version "0.2.0-alpha1"} org.clojure/clojure {:mvn/version "1.10.1"}}}'
Clojure 1.10.1
user=> (import '(io.lacuna.bifurcan List))
io.lacuna.bifurcan.List
user=> (def b 32)
#'user/b
user=> (def n (+ (* b b b) (- (* b b)) b))
#'user/n
user=> n
31776
(defn same-seq [a b]
(= (seq a) (seq b)))
#'user/same-seq
user=> (def r1 (range 2))
#'user/r1
user=> (def r2 (range 2 (+ 2 n)))
#'user/r2
user=> (def l1 (.concat (List.) (List/from r1)))
#'user/l1
user=> (def l2 (.concat (List.) (List/from r2)))
#'user/l2
user=> (same-seq r1 l1)
true
user=> (same-seq r2 l2)
true
user=> (def l3 (.concat l1 l2))
#'user/l3
user=> (def r3 (concat r1 r2))
#'user/r3
user=> (same-seq r3 l3)
true
user=> (def l4 (.slice l3 1 3))
Execution error (ArrayIndexOutOfBoundsException) at io.lacuna.bifurcan.nodes.ListNodes$Node/pushLast (ListNodes.java:364).
-1
user=> (pst)
ArrayIndexOutOfBoundsException -1
io.lacuna.bifurcan.nodes.ListNodes$Node.pushLast (ListNodes.java:364)
io.lacuna.bifurcan.nodes.ListNodes.pushLast (ListNodes.java:34)
io.lacuna.bifurcan.nodes.ListNodes$Node.slice (ListNodes.java:240)
io.lacuna.bifurcan.List.slice (List.java:228)
sun.reflect.NativeMethodAccessorImpl.invoke0 (NativeMethodAccessorImpl.java:-2)
sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke (Method.java:498)
clojure.lang.Reflector.invokeMatchingMethod (Reflector.java:167)
clojure.lang.Reflector.invokeInstanceMethod (Reflector.java:102)
clojure.lang.Compiler$InstanceMethodExpr.eval (Compiler.java:1555)
clojure.lang.Compiler$DefExpr.eval (Compiler.java:457)
nil
When starting from a LinearMap
and converting to a Map
, the subsequent remove(key)
call on the returned VirtualMap
does nothing. I think this is probably a bug!
// TEST 1 (FAILS)
final IMap<String, String> linearMap = LinearMap.from(Arrays.asList(new Maps.Entry<>("a", "b")));
assert(linearMap.contains("a"));
final IMap<String, String> map = linearMap.forked();
assert(map.contains("a"));
assert(!map.remove("a").contains("a")); // this FAILS
Interestingly when working the other way around, i.e. from Map
to LinearMap
everything works fine, i.e.
// TEST 2 (OK)
final IMap<String, String> map = Map.from(Arrays.asList(new Maps.Entry<>("a", "b")));
assert(map.contains("a"));
final IMap<String, String> linearMap = map.linear();
assert(linearMap.contains("a"));
assert(!linearMap.remove("a").contains("a")); // this is OK
Even stranger, this one also seems to work, which makes me think the problem is definitely caused by LinearMap#forked()
returning a VirtualCollection
(in our first example):
// TEST 3 (OK)
final IMap<String, String> map = Map.from(Arrays.asList(new Maps.Entry<>("a", "b")));
assert(map.contains("a"));
final IMap<String, String> linearMap = map.linear();
assert(linearMap.contains("a"));
final IMap<String, String> map2 = linearMap.forked();
assert(!map2.remove("a").contains("a")); // this is OK
I'm implementing a toy lisp based on kanaka/mal using bifurcan, and I've come across two cases where it would be useful to be able to create subclasses of the bifurcan structures. These are difficult to create since the methods create new instances of the structures and there's no way to create a subclass when that happens.
I've only looked at List so far, but it seems like the operations create new instances via the private List(boolean linear, Node root, int prefixLen, Object[] prefix, int suffixLen, Object[] suffix)
constructor. Could the new instances instead be created by calling a protected method which in List just calls that constructor, but in a subclass could create a Vector, IObjList or whatever?
I'm not sure what the implications are for other interfaces such as Concat though.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.