Giter Club home page Giter Club logo

some-kvstore's Introduction

some-kvstore

A sharded fault-tolerent persistent K/V store that's hopefully fast.

This project is inspired by the MIT Distributed Systems class, but doesn't utilize any of their codebase.

references:

Basically the sharded key/value store consists of a shard orchestrator and a bunch of replication groups, each of which is a Raft cluster.

RPC Services

within Raft cluster

  • node join
  • start cluster
  • request vote
  • append entry

replication group

  • Get <- fe
  • Put <- fe
  • Append <- fe
  • Delete <- fe
  • get shard <- rg

shard orchestrator

  • client registration <- fe, rg
  • Get Configuration <- fe, rg
  • join <- fe
  • leave <- fe

Idempotency Design

Idempotency is achieved by clientID + request sequence number.

shard orchestraor clients:

  • Admin: unique ID, registered
  • Client: unique ID, registered
  • InternalClient: unique ID, registered

replication group clients:

  • Client: same ID as corresponding shard orch client, passed in
  • Transferer:
    • Transferers in the same replication group should have the same ID, registered idempotently using groupID as idempotencyID
    • sharding kicker Transferer in ShardOrchestrator should have ID 0, default

Sharding

Shard Orchestrator decides mapping between replication group and shards.

Query API of the Shard Orchestrator returns a configuration according to version number (or maybe just the latest one). Query from a replication group should contains a list of current addrs within the group (for membership changing).

Join, Leave, Move should change the current configuration and increment the version number.

If no replication group currently exists in the configuration, the shard orchestrator virtually owns all shards. When the first replication group joins, shard orchestrator pushes all shards ownership to that replication group to get things started.

Client operation on a kv pair should start by querying the configuration then talk to the replication group.

Replication Groups periodically query the latest configuration and transfer shard with each other.

Configuration

  • version number
  • map of GID to list of addr
  • map of shard to GID

Shard Transfer

All nodes in a replication group periodically queries the shard orchestrator for latest configuration. On configuration change, update local configuration, ask raft protocol to do necessary shards out. This is step is only effective on a leader node in the raft cluster. Other state should ignore this request.

A shard out stores a raft log. Once committed, this log entry requires

  1. reads local kv within the shard
  2. makes shard transfer rpc call to target replication group with the data
  3. wait for success reply, then hide that shard in state machine, so future client request on that shard should fail

When receive a shard transfer rpc call

  1. checks for idempotency, return if the request has been seen
  2. stores a shard in log entry containing the data
  3. on process, put all the kv pairs in kvstore, own the shard
  4. return success

For shard in requests, replication group doesn't check against its local configuration and trust the caller making the right call. In case of discrepency (lagged local config or remote config), this should be ok after the next periodical query.

Shard Orchestrator

APIs

  • Join
  • Leave
  • Move
  • Query

Replication Group in Raft

A replication groups manages one or more logical shards of the whole dataset, utilizing Raft protocol to reach consensus within a group.

Supporting Operations

  • Get

    returns nil when key doesn't exist, not throwing error

  • Put

    stores a key-value pair, overwrite old value on putting with existing key

  • Append

    append passed in value to the end of existing value, equivalent to Put if key does not previously exist

  • Delete

    delete key from store, pass silently if key not exist, not throwing error

Server Side

Raft Cluster

Consensus protocol. Each node in the cluster manages a KVStore state machine to replicate the key/value pairs stored by client.

KVStore State Machine

Available commands for clent

  • get
  • put
  • append
  • delete

Commands for other replication group

  • getShard

    callee returns and deletes local shard (or postpone delete maybe??)

  • ownShard

    callee takes ownership of the passed in shard

Client

  • put
  • get
  • append
  • delete

Persistence

TODO

  • raft
    • import old raft implementation
    • optimization
      • membership change
      • log compaction
    • client request idempotency cache and cleaning up
  • redesign and refactor RPC calls flow
  • single replication group
    • KV store as state machine
      • implementation
      • test
    • shard kv machine
    • integration
      • replication group node
      • refactor existing raft implementation
        • fix and refactor old tests
      • client
        • change to ack finished request seq
      • integration tests
  • shard orchestrator
    • configuration & config machine
    • API
      • rpc set up
    • client
      • frontend client
      • internal client
  • integration
    • sharding replication group state machine
    • API and communication between replication groups and shard orchestrator
    • shard transfer between replication groups
  • persistence
  • deployment ready and demo

some-kvstore's People

Contributors

ziyaoh avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.