masstree-beta's Introduction

Masstree

This is the source release for Masstree, a fast, multi-core key-value store. This document describes how to run Masstree and interpret its results.

Contents

  • MTDIR: this directory
  • MTDIR/doc: Masstree algorithm specification

Installation

Masstree is tested on Debian, Ubuntu and Mac OS X. To build from source:

$ ./bootstrap.sh
$ ./configure
$ make

For performance measurements, you should disable assertions:

$ ./configure --disable-assertions

Masstree needs a fast malloc and can link against jemalloc, Google’s tcmalloc, Hoard, or our own Flow allocator. It normally chooses jemalloc or tcmalloc if it finds them. To use a specific memory allocator:

./configure --with-malloc=<jemalloc|tcmalloc|flow|hoard>

Flow is our reimplementation of the Streamflow allocator and may be open-sourced in the future.

See ./configure --help for more configure options.

Testing

The simplest way to try out Masstree is the ./mttest program. This test doesn’t involve disk or network overhead.

$ ./mttest
1/1 rw1/m
0: now getting
1: now getting
0: {"table":"mb","test":"rw1","trial":0,"thread":0,"puts":13243551,"puts_per_sec":1324492.05531,"gets":13243551,"gets_per_sec":1497267.13928,"ops":26487102,"ops_per_sec":1405590.1258}
1: {"table":"mb","test":"rw1","trial":0,"thread":1,"puts":13242601,"puts_per_sec":1324397.45602,"gets":13242601,"gets_per_sec":1481151.35726,"ops":26485202,"ops_per_sec":1398395.26601}
EXPERIMENT x0

The test starts a process that hosts a Masstree, then generates and executes queries over the tree. It uses all available cores (two in the above example). The test lasts 20 seconds: it populates the key-value store with put queries during the first 10 seconds, then issues get queries over the tree during the next 10 seconds. See kvtest_rw1_seed in kvtest.hh for more details about the workload.

The output summarizes the throughput of each core. The 1/1 rw1/m line says that mttest is running the first trial (out of one) of the rw1 workload, using Masstree (m for short) as the internal data structure. When the run completes (the now getting lines are printed during the test), mttest generates a per-core throughput summary, as indicated by the 0: {"table":"mb","test":"rw1",...} lines.

If you redirect its standard output to a file or pipe, mttest produces gnuplot source that plots the median per-core throughput. Each candlestick has five points, for the min, 20%, 50%, 70%, and max of the corresponding metric among all threads.

mttest also writes its output as JSON to a file for further analysis. For example, after ./mttest, notebook-mttest.json will contain:

{
  "experiments":{
    "x0":{
      "git-revision":"673994c43d58d46f4ebf3f7d4e1fce19074594cb",
      "time":"Wed Oct 24 14:54:39 2012",
      "machine":"mat",
      "cores":2,
      "runs":["x0\/rw1\/mb\/0"]
    }
  },
  "data":{
    "x0\/rw1\/mb\/0":[
      {
        "table":"mb",
        "test":"rw1",
        "trial":0,
        "thread":0,
        "puts":13243551,
        "puts_per_sec":1324492.05531,
        "gets":13243551,
        "gets_per_sec":1497267.13928,
        "ops":26487102,
        "ops_per_sec":1405590.1258
      },
      {
        "table":"mb",
        "test":"rw1",
        "trial":0,
        "thread":1,
        "puts":13242601,
        "puts_per_sec":1324397.45602,
        "gets":13242601,
        "gets_per_sec":1481151.35726,
        "ops":26485202,
        "ops_per_sec":1398395.26601
      }
    ]
  }
}

Run ./mttest --help for a list of tests and options.

Network testing

mtclient supports almost the same set of workloads as mttest, but it sends its queries to a Masstree server over the network.

To start the Masstree server, run:

$ ./mtd --logdir=[LOG_DIRS] --ckdir=[CHECKPOINT_DIRS]
mb, Bag, pin-threads disabled, logging enabled
no ./kvd-ckp-gen
no ./kvd-ckp-0-0
no ./kvd-ckp-0-1
2 udp threads
2 tcp threads

LOG_DIRS is a comma-separated list of directories storing Masstree logs, and CHECKPOINT_DIRS is a comma-separated list of directories storing Masstree checkpoints. Masstree writes its logs to the LOG_DIRS and periodic checkpoints to the CHECKPOINT_DIRS. (Both logging and checkpointing are performed using multiple cores, so there are several log and checkpoint files.) Alternatively, run ./mtd -n to turn off logging.

To run the rw1 workload with mtclient on the same machine as mtd, run:

$ ./mtclient -s 127.0.0.1 rw1
tcp, w 500, test rw1, children 2
0 now getting
1 now getting
0 total 7632001 763284 put/s 1263548 get/s
1 total 7612501 761423 put/s 1259847 get/s
{"puts":7632001,"puts_per_sec":763284.211682,"gets":7632001,"gets_per_sec":1263548.30195,"ops":15264002,"ops_per_sec":951678.506329}
{"puts":7612501,"puts_per_sec":761423.014367,"gets":7612501,"gets_per_sec":1259847.22076,"ops":15225002,"ops_per_sec":949182.006246}
total 30489004
puts: n 2, total 15244502, average 7622251, min 7612501, max 7632001, stddev 13789
gets: n 2, total 15244502, average 7622251, min 7612501, max 7632001, stddev 13789
puts/s: n 2, total 1524707, average 762354, min 761423, max 763284, stddev 1316
gets/s: n 2, total 2523396, average 1261698, min 1259847, max 1263548, stddev 2617

masstree-beta's People

Contributors

alevy, benesch, huanchenz, kohler, micchie, mitake, nathanielherman, photoszzt, ydmao


masstree-beta's Issues

Inconsistency bug

This issue is about a key that was inserted successfully into the tree but then fails to be retrieved or deleted. Inserting the same key again may succeed.

Scenario:

  1. Thread 1: Inserting a key A. It reaches leaf X (the leftmost leaf) and does not find key A. It is about to insert the key into leaf X (before node-version validation).
  2. Thread 2: Inserting multiple keys into leaf X, causing a split such that key A no longer belongs to this leaf (it should now be inserted into a leaf to the right).
  3. Thread 2: Deleting all keys in leaf X (leaf X is now empty).
  4. Thread 1: Notices that the node's version has changed and looks for the next node that should contain the key (in the same layer), but decides to stay in the same leaf when it should have continued to the next leaf.

Live-lock bug

This issue is about using the reverse scanner on a tree that contains an empty trie node (B+ tree).

Scenario:

  1. Insert keys with a pattern that creates 3 B+ trees in the 2nd layer (key length = 16)
  2. Delete the keys in the middle B+ tree
  3. Run the reverse scanner with a start key that should have been located in the rightmost B+ tree in the 2nd layer

deadlock bug in gc_layer

child->make_layer_root();

It is unsafe to call make_layer_root without locking the node child.
In the implementation of make_layer_root, setting the root_bit is not atomic.
(Even if it were atomic, there would be other problems.)

v_ |= P::root_bit;

Deadlock scenario:

  1. Thread A: Inserting a key into the node child; locks child (lock_bit is now set).
  2. Thread B: Doing gc_layer; running inside mark_root, loads v_ and computes tmpv = v_ | P::root_bit (lock_bit is set in tmpv).
  3. Thread A: Insert finished; unlocks child (lock_bit is now unset).
  4. Thread B: Assigns tmpv back to v_, so the node child's lock_bit is set forever.

Since the locking order within a layer is bottom-up, could a try_lock operation fix this?
My tentative solution: https://github.com/chubaodb/masstree-beta/commit/64bad5ebe3a2dbf998ff2e0ac9c24548cf0d226f

scantest failed

Hi Eddie,

I've come across a few failed assertions when running scantest over the network.

Server:
(screenshot of server output omitted)

Client:
(screenshot of client output omitted)

Any idea what might be going wrong?

Hamed

can't change number of keys/node

If I pass nodeparams anything less than 15 (I tried 14 and 13), I get a bunch of errors when running ./mttest, which look like:

                        leaf 0x7f573dd57a80: 10 keys, version 68000000, permutation 0b762a4853:91c, parent 0x7f57090cc600, prev 0x7f56e1131700, next 0x7f5706062780 [ksuf i1] 
                          11495984 = SUBTREE #0/137
                            leaf 0x7f5735490000: 2 keys, version 0, permutation 01:cba98765432, parent (nil), prev (nil), next (nil) 
                              19 = 1149598420 @2931851292.828826 #0/2
                              98 = 1149598499 @2931851292.828826 #1/2
                          1149598577 = 1149598578 @2931851292.828826 #b/9
                          1149598656 = 1149598657 @2931851292.828826 #7/9
                          1149598735 = 1149598736 @2931851292.828826 #6/9
                          11495988 = SUBTREE #2/137
                            leaf 0x7f56eb82f5c0: 2 keys, version 0, permutation 01:cba98765432, parent (nil), prev (nil), next (nil) 
                              14 = 1149598815 @2931851292.828826 #0/2
                              93 = 1149598894 @2931851292.828826 #1/2
                          1149598972 = 1149598973 @2931851292.828826 #a/9
                          1149599051 = 1149599052 @2931851292.828826 #4/9
                          1149599888 = 1149599889 @2931851292.828826 #8/9
                          1149599967 = 1149599968 @2931851292.828826 #5/9
                          11496000 = SUBTREE #3/137
                            leaf 0x7f56dbcdb200: 2 keys, version 0, permutation 01:cba98765432, parent (nil), prev (nil), next (nil) 
                              46 = 1149600047 @2931851292.828826 #0/2
                              7 = 114960008 @2931851292.828826 #1/1

Running Masstree with UDP

Hi,

I'm trying to run Masstree with the native UDP client.
After following the setup guidelines in the README, I used the following to start the server:

./mtd --cores 2 -j 2

(screenshot of server output omitted)

and the following to start my client:

./mtclient -s 192.168.1.2 -j 2 -u rw1

(screenshot of client output omitted)

The client seems to hang. However, the same test works fine without the -u flag (which, I believe, makes the client use TCP).

I had specified 2 threads and 2 cores, but I'm also curious why there are node counts for more than two cores.

Are these bugs, or am I missing something?
Would sincerely appreciate your feedback.

Thanks,
Hamed

Bug? about handling keys whose length is 9

I may have found a bug. Could you read this comment?

AC_DEFINE_UNQUOTED([HAVE_UNALIGNED_ACCESS], [1], [Define if unaligned accesses are OK.])

This line defines HAVE_UNALIGNED_ACCESS to 1.

Then, the comparison method is forced down this path:

#if HAVE_UNALIGNED_ACCESS

I think there is a problem:
if we use a key whose length is 9, or 17, or any other length that isn't divisible by 8, it causes a segfault.

Is this right?
Sincerely yours,
Takayuki

Fixing build on BSDs

For what it's worth, one needs to add -lexecinfo to $LIB to build on OpenBSD and FreeBSD (and, I would assume, NetBSD). This is because backtrace(3) and friends are stored there rather than in libc. I tried to find an elegant way to add this to configure.ac, but I have little experience with autoconf and it seems that masstree doesn't yet use the features necessary for AC_CANONICAL_TARGET or AC_CANONICAL_HOST.

If there's an easy way of adding it, it'd be appreciated. Otherwise, it's easy to manually patch into GNUMakefile.in.

Inconsistency bug

This issue is about a key that was inserted successfully into the tree but then fails to be retrieved or deleted. Inserting the same key again may succeed.

Scenario:
Thread 1: Inserting key A; traversing down the tree looking for the right leaf.
Thread 2: Deleting key B, where B > A and the keys share a partial/full internode route to the leaves.
Thread 2: Deleting key B, which is the last key in leaf X (leaf X is now empty).
Thread 2: Deleting leaf X and redirecting the key boundary.
Thread 1: Finishes inserting key A successfully, but the key is not reachable.

query_table::test segfaults

Can reproduce with GCC -O3 and Clang -O0.

diff --git a/GNUmakefile.in b/GNUmakefile.in
index 765ae96..746a1f9 100644
--- a/GNUmakefile.in
+++ b/GNUmakefile.in
@@ -64,6 +64,9 @@ jsontest: jsontest.o string.o straccum.o json.o compiler.o
 msgpacktest: msgpacktest.o string.o straccum.o json.o compiler.o msgpack.o
    $(CXX) $(CXXFLAGS) -o $@ $^ $(MEMMGR) $(LDFLAGS) $(LIBS)

+scantest: scantest.o compiler.o misc.o $(KVTREES) libjson.a
+   $(CXX) $(CXXFLAGS) -o $@ $^ $(MEMMGR) $(LDFLAGS) $(LIBS)
+
 config.h: stamp-h

 GNUmakefile: GNUmakefile.in config.status
diff --git a/query_masstree.hh b/query_masstree.hh
index 166e104..eb99899 100644
--- a/query_masstree.hh
+++ b/query_masstree.hh
@@ -27,7 +27,6 @@ class query_table {
   public:
     typedef P parameters_type;
     typedef node_base<P> node_type;
-    typedef typename node_type::nodeversion_type nodeversion_type;
     typedef typename P::threadinfo_type threadinfo;
     typedef unlocked_tcursor<P> unlocked_cursor_type;
     typedef tcursor<P> cursor_type;
diff --git a/scantest.cc b/scantest.cc
new file mode 100644
index 0000000..8b1ca66
--- /dev/null
+++ b/scantest.cc
@@ -0,0 +1,18 @@
+#include "query_masstree.hh"
+
+using namespace Masstree;
+
+kvepoch_t global_log_epoch = 0;
+volatile uint64_t globalepoch = 1;     // global epoch, updated by main thread regularly
+volatile bool recovering = false; // so don't add log entries, and free old value immediately
+kvtimestamp_t initial_timestamp;
+
+int
+main(int argc, char *argv[])
+{
+    (void) argc;
+    (void) argv;
+
+    threadinfo ti;
+    default_table::test(ti);
+}
$ gdb scantest
(gdb) r
Starting program: /vagrant/scantest 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
pool_allocate (tag=memtag_masstree_leaf, sz=320, this=0x7fffffffe500) at kvthread.hh:334
334             pool_[nl - 1] = *reinterpret_cast<void **>(p);
(gdb) bt
#0  pool_allocate (tag=memtag_masstree_leaf, sz=320, this=0x7fffffffe500) at kvthread.hh:334
#1  make (ti=..., node_ts=0, ksufsize=0) at masstree_struct.hh:312
#2  make_root (ksufsize=0, ti=..., parent=0x0) at masstree_struct.hh:320
#3  Masstree::basic_table<Masstree::default_query_table_params>::initialize (this=0x7fffffffe1e0, ti=...) at masstree_struct.hh:546
#4  0x000000000040d7eb in initialize (ti=..., this=0x7fffffffe1e0) at query_masstree.hh:45
#5  Masstree::query_table<Masstree::default_query_table_params>::test (ti=...) at query_masstree.cc:294
#6  0x0000000000401cff in main (argc=<optimized out>, argv=<optimized out>) at scantest.cc:17
(gdb) 

left-most leaf nodes are not freed after deleting all keys

Steps to reproduce the behavior:

  1. Insert some keys
  2. Delete all keys
  3. Many leftmost leaf nodes remain in the tree, leading to memory consumption

Example tree structure after deleting all keys (green border = leaf node, coral = layer root):

(tree diagram omitted)

I wonder if we can do some checks and issue a gc_layer during remove_leaf.
I tried to fix it like this: https://github.com/jimdb-org/masstree-beta/commit/eabe31105a283a956ca84e309b9221add8f71575 , and the problem goes away.

Dead-Lock bug

This issue might occur when a layer is being removed.

Scenario:

Thread 1: Deleting the last key "1111111122222222" in the 2nd layer, and adding a request to remove the layer that continues from "11111111".
Thread 1: Cleanup is called and the layer is being removed; in function gc_layer, the leaf's lock has just been taken.
Thread 2: Inserting key "1111111122222222"; currently waiting on the same leaf's lock (in function find_locked).
Thread 3: Inserting key "1111111133333333"; currently waiting on the same leaf's lock (in function find_locked).
Thread 1: Marks the leaf as "deleted layer" and unlocks it.
Thread 2: Acquires the leaf's lock, checks the leaf's state, and finds out the layer was removed. It continues without unlocking the leaf.
Thread 3: Stuck in deadlock.
