pbailis / thebes

Site of the first HATs

JavaScript 9.20% Shell 0.79% Java 50.40% Python 6.40% HTML 2.30% Thrift 0.31% Makefile 0.01% TeX 30.58%

thebes's Introduction

thebes

Site of the first HATs


thebes's Issues

add metrics support

it'd be really useful to set up a metrics server like
http://metrics.codahale.com/

there are some really great reporting tools; we could hook up a cluster to ganglia or graphite

figure out distributed transaction protocol

do we ship the transaction in its entirety to the remote datacenter and have it execute there? do we do 2PC and 2PL in the remote DC and then send the result back to the client?

todo

add tag support

AARON:
run experiments with no TM, small
test cross-cluster with new AMI

PETER:
write scripts to parse logs
functional testing of RR, RC
figure out percentile composition

once aaron's code works, run full load on
1.) bunch of m1.large on us-east-1; 2 clusters
2.) bunch of m1.xlarge across us-east-1 and us-west-2

add optional port numbers to cluster configuration

SessionIDs should be deterministic

We can have a clientID field in the command line configuration that defaults to a random int if not set.
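
A minimal sketch of the idea above, assuming the clientID arrives as an optional integer from the command-line configuration. The class and method names (`SessionIds`, `resolveClientId`, `sessionId`) and the `clientId-sequence` format are hypothetical, not taken from the Thebes code.

```java
import java.util.Random;

// Hypothetical helper; names and ID format are illustrative only.
final class SessionIds {
    private SessionIds() {}

    // Use the configured clientID if one was given; otherwise fall back to a random int.
    static int resolveClientId(Integer configuredClientId, Random rng) {
        return configuredClientId != null ? configuredClientId : rng.nextInt();
    }

    // Derive the session ID from the clientID and a per-client sequence number,
    // so the same inputs always produce the same ID.
    static String sessionId(int clientId, long sequence) {
        return clientId + "-" + sequence;
    }
}
```

With a fixed clientID, rerunning an experiment yields the same session IDs, which should make logs from different runs easier to line up.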

support basic anti-entropy between nodes

change "ReplicaService"

add "InternalReplicaService":
* add thrift service definition
* optional: add thrift server port to configuration
* start service on server boot
* change configuration in AntiEntropyService

set up actual anti-entropy
* call AntiEntropyService.sendToNeighbors on ReplicaService.put()

add "timestamp" field to put request--hold off for now.

add garbage collection of partial order metadata

one way to do this is to use async handlers for anti-entropy messages; then, when all handlers have acked, send a notification to clients.

the problem here is that now clients have to run a server.

add partial order enforcement/causal consistency mechanisms

There are several steps:
Implement a Local Lock Manager #20
Set partition masters in yaml configuration #21
TwoPL clients directly contact masters to perform transactions #22
Add Transaction Manager Proxy (single cluster) #23
Add cross-cluster transactional support with TransactionManagers #24

Stretch goals:
Dynamic master configuration #25
Durable acks over WAN #26

Configure partition masters in yaml

For each partition [1-N], which server (by IP) is the master?

Java 6 Compatibility

Some of the Config code requires Java 7 features. To preserve backwards compatibility, we should remove them.

partitioning and client-side routing

paper reading

Dynamic master configuration

It would be nice to dynamically configure masters so we can play with failure modes under 2PL. One way to do this is via ZooKeeper.

There are other issues here, like durability (see #26).

Refactor twopl, hat modules

twopl/
client
server
tm

hat/
client
server

figure out TPCC integration

read Spanner

http://research.google.com/archive/spanner.html

test YCSB

run one YCSB process per physical host (1:1 mapping between servers and clients)
vary number of threads per process to vary throughput
measure YCSB reported throughput and latency
each thread gets a separate DB instance, so no problem with synchronization

ycsb load thebes -threads 10 -fieldlength=1 -p fieldcount=1 -p operationcount=10000 -p recordcount=10000 -t

ycsb run thebes -threads 10 -fieldlength=1 -p fieldcount=1 -p operationcount=10000 -p recordcount=10000 -t

make client jar and stub example programs

figure out distributed deadlock strategy

one option: set and forget/fuckit mode
another option: timeouts with retry

we'll likely hit problems when we run at large scale (10s-100s of nodes)
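
A rough sketch of the "timeouts with retry" option, using local `java.util.concurrent` locks as a stand-in for remote lock requests. `TimeoutRetryLocking`, `lockAll`, and the linear backoff are assumptions for illustration; a real version would issue RPCs to the lock holders instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;

// Hypothetical helper; not part of the Thebes codebase.
final class TimeoutRetryLocking {
    private TimeoutRetryLocking() {}

    // Try to take every lock, waiting at most timeoutMs for each. If any
    // acquisition times out, release everything held so far and retry,
    // up to maxAttempts times. Returns true only if all locks are held.
    static boolean lockAll(List<? extends Lock> locks, long timeoutMs, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            List<Lock> held = new ArrayList<Lock>();
            try {
                boolean ok = true;
                for (Lock lock : locks) {
                    if (!lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                        ok = false;
                        break;
                    }
                    held.add(lock);
                }
                if (ok) {
                    return true;
                }
                releaseAll(held);
                held.clear();
                // Linear backoff before the next attempt (an arbitrary choice).
                Thread.sleep(timeoutMs * (attempt + 1));
            } catch (InterruptedException e) {
                releaseAll(held);
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    private static void releaseAll(List<Lock> held) {
        for (Lock lock : held) {
            lock.unlock();
        }
    }
}
```

Releasing every held lock before retrying is what breaks a deadlock cycle here; the cost is wasted work under contention, which is likely to matter at the 10s-100s of nodes scale mentioned above.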

add client-side cache-based read committed and repeatable read

support for version vectors

I need to add version vectors to the Thebes API to track causality (effectively going from a scalar clientId, timestamp to a Map<serverId, timestamp>). How do you want to handle this?

My proposal is that we can change Version to become VersionVector and discard it in TwoPLServer (effectively what we do anyway with dependencies!)

To induce a total ordering between vectors, we can pick the one with the highest timestamp.
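
A small sketch of that ordering rule, assuming a vector is a `Map<serverId, timestamp>` with string server IDs. The class shape is hypothetical, and tie-breaking between vectors with the same highest timestamp is not covered by the proposal, so ties compare as equal here.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical shape for the proposed VersionVector; not the actual Thebes class.
final class VersionVector implements Comparable<VersionVector> {
    private final Map<String, Long> timestamps; // serverId -> timestamp

    VersionVector(Map<String, Long> timestamps) {
        this.timestamps = new HashMap<String, Long>(timestamps);
    }

    // Largest timestamp in the vector; an empty vector sorts before everything.
    long highestTimestamp() {
        return timestamps.isEmpty() ? Long.MIN_VALUE : Collections.max(timestamps.values());
    }

    // Order vectors by their highest timestamp. Ties compare as equal (assumption).
    @Override
    public int compareTo(VersionVector other) {
        long mine = highestTimestamp();
        long theirs = other.highestTimestamp();
        return mine < theirs ? -1 : (mine > theirs ? 1 : 0);
    }
}
```

Note that this ordering ignores which server produced the highest timestamp, so two causally unrelated vectors can compare as equal.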

Thoughts? I am planning to do this soon.

how do we set up isolation level settings and architect for them

my temptation is to do this in the yaml configuration. this will require shutting down the cluster between options.

ansi_isolation: {repeatableread, readcommitted, readuncommitted}
transactional_visibility: true

// partial orders need to be kept separate--or do they
partial_orders: [{causal, explicit, monotonicwrites, monotonicreads}]

with the exception of explicit causality, we shouldn't have to change the API

also, need to figure out how to achieve this with modularity of the code.

add alternative backend; kyotocabinet? berkeleyDB?

add eventual option to HATClient

Add Transaction Manager Proxy (single cluster)

Long-haul WAN latencies will be expensive for individual operations; build a service that allows clients to send their entire transaction logic to a coordinator node.

This will require a wire protocol that allows clients to express their entire transaction, then ship it to the Transaction Manager. The Transaction Manager will look a lot like the previous implementation of the TwoPLClient.

For this milestone, consider the case where all masters are in a single cluster. #24 changes this.

add transaction timing to YCSB

expand interface for replication

need to add timestamp
likely: make a thrift "writerequest" datatype

need to add additional thrift service for intra-replica communication

Durable storage over WAN before acknowledgment

Write to a majority of non-master replicas for a given cluster before acknowledging to the client.

This is what Spanner accomplishes via Paxos-replicated log writes. Omitting it from experiments only makes them look better.
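
One possible way to gate the client acknowledgment on a majority, sketched with a `CountDownLatch`. `MajorityAck` and its methods are hypothetical names; how replica acks actually arrive (Thrift callbacks, for example) is left out.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; not part of the Thebes codebase.
final class MajorityAck {
    private final CountDownLatch remaining;

    MajorityAck(int nonMasterReplicas) {
        // A strict majority of n replicas is floor(n / 2) + 1.
        this.remaining = new CountDownLatch(nonMasterReplicas / 2 + 1);
    }

    // Called once per replica that reports a durable write.
    void onReplicaAck() {
        remaining.countDown();
    }

    // True once a majority has acked; false on timeout or interruption.
    boolean awaitMajority(long timeoutMs) {
        try {
            return remaining.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The master would call `awaitMajority` before replying to the client, so each write pays for a round trip to the slowest replica in the majority.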

Add cross-cluster TransactionManager support

We'll want to run cross-cluster transactions if masters live in different clusters. We should come up with a heuristic to choose an optimal TransactionManager to do the proxying.

A first cut at this is to simply select the TM in the cluster with the most masters for a given transaction's data items.

We can develop latency-specific heuristics later.
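
The first-cut heuristic could look roughly like this, assuming we have a lookup from each data item to the cluster that holds its master. `TransactionManagerChooser`, `chooseCluster`, and the map parameter are hypothetical names.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of the Thebes codebase.
final class TransactionManagerChooser {
    private TransactionManagerChooser() {}

    // Returns the cluster that masters the most of the transaction's keys,
    // or null if none of the keys has a known master. On a tie, the cluster
    // that reached the top count first wins (an arbitrary choice).
    static String chooseCluster(List<String> keys, Map<String, String> masterClusterByKey) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        String best = null;
        int bestCount = 0;
        for (String key : keys) {
            String cluster = masterClusterByKey.get(key);
            if (cluster == null) {
                continue; // no master known for this key
            }
            Integer previous = counts.get(cluster);
            int count = (previous == null ? 0 : previous) + 1;
            counts.put(cluster, count);
            if (count > bestCount) {
                best = cluster;
                bestCount = count;
            }
        }
        return best;
    }
}
```

A latency-aware version could replace the plain count with a per-cluster cost, keeping the same signature.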

reconfigure command line options

since we're shipping the client library as a jar, its dependencies on the command line are going to cause a problem. what we probably want to do is instead set java environment variables.

i think that using environment variables is still better than setting configuration file parameters, simply because the latter causes lots of headache when running experiments and/or quickly changing parameters (especially across a cluster!). this is open for discussion.
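
A sketch of what reading settings without command-line parsing might look like. It assumes "java environment variables" could mean either JVM system properties (`-Dname=value`) or OS environment variables, and checks both. The `ClientConfig` class and the property-to-variable naming rule are hypothetical.

```java
// Hypothetical helper; not part of the Thebes codebase.
final class ClientConfig {
    private ClientConfig() {}

    // Look up a setting as a JVM system property first, then as an OS
    // environment variable (dots become underscores, upper-cased), and
    // finally fall back to the supplied default.
    static String get(String name, String defaultValue) {
        String value = System.getProperty(name);
        if (value == null) {
            value = System.getenv(name.toUpperCase().replace('.', '_'));
        }
        return value != null ? value : defaultValue;
    }
}
```

Either form can be changed per process from an experiment script without editing a configuration file on every node.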

Read_Lock(key) -> True
Write_Lock(key) -> True
Unlock(key) -> True
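
A minimal in-memory sketch of a lock manager with this shape. Two things are assumptions: each call carries a `sessionId` identifying the caller (the signatures above take only a key), and calls return false on conflict instead of blocking.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not the Thebes lock manager.
final class LocalLockManager {
    private static final class LockState {
        final Set<Long> readers = new HashSet<Long>();
        Long writer; // null when no session holds the write lock
    }

    private final Map<String, LockState> locks = new HashMap<String, LockState>();

    // Shared lock: granted unless another session holds the write lock.
    synchronized boolean readLock(String key, long sessionId) {
        LockState s = state(key);
        if (s.writer != null && s.writer != sessionId) {
            return false;
        }
        s.readers.add(sessionId);
        return true;
    }

    // Exclusive lock: granted only if no other session reads or writes the key.
    // A session that is the sole reader may upgrade to a write lock.
    synchronized boolean writeLock(String key, long sessionId) {
        LockState s = state(key);
        boolean otherWriter = s.writer != null && s.writer != sessionId;
        boolean otherReaders = !s.readers.isEmpty()
                && !(s.readers.size() == 1 && s.readers.contains(sessionId));
        if (otherWriter || otherReaders) {
            return false;
        }
        s.writer = sessionId;
        return true;
    }

    // Releases whatever this session holds on the key; false if it held nothing.
    synchronized boolean unlock(String key, long sessionId) {
        LockState s = locks.get(key);
        if (s == null) {
            return false;
        }
        boolean released = s.readers.remove(sessionId);
        if (s.writer != null && s.writer == sessionId) {
            s.writer = null;
            released = true;
        }
        if (s.readers.isEmpty() && s.writer == null) {
            locks.remove(key); // drop idle entries so the table does not grow
        }
        return released;
    }

    private LockState state(String key) {
        LockState s = locks.get(key);
        if (s == null) {
            s = new LockState();
            locks.put(key, s);
        }
        return s;
    }
}
```

A blocking variant that always returns True, as in the signatures above, could wrap these calls in a wait loop or pair them with a timeout-and-retry policy.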

ycsb integration

we need to integrate as a database in YCSB. we should also fork the YCSB codebase to include a "transaction" construct.