talent-plan / tinykv Goto Github PK
View Code? Open in Web Editor NEWA course to build distributed key-value service based on TiKV model
License: Apache License 2.0
A course to build distributed key-value service based on TiKV model
License: Apache License 2.0
Course material
Please follow the course material to learn the background knowledge and finish code step by step.
Should be Course Materials?
At present, all sent messages follow these flow:
request -> (get store/bind stream/get region) -> raft cluster HandleRegionHeartbeat -> Dispatch
drivePushOperator -> PushOperators -> Dispatch -> SendScheduleCommand
PromoteWaitingOperator -> addOperatorLocked -> SendScheduleCommand
All operators are created in these flow:
Scheduler runScheduler -> AddWaitingOperator
patrolRegions -> AddWaitingOperator
We can simplify scheduler in these points:
Remove merge operator and all related changes (because merge operation has two operator)
Simplify the total routine. Maybe the simplest way: request -> operator
. Delete waiting queue, pushDriver... Should we have a waiting queue (or simply cancel all exceeded operation?)
schedule:
schedulers:
there are some runners in raftstore to do async tasks, please add comments to describe what it is used for and how it does.
After I performed the make
command to experience this project. I faced a problem of golang compline. The error msg as below
GO111MODULE=on go build -tags codes -ldflags '-L/usr/local/opt/icu4c/lib -X "main.gitHash=`git rev-parse HEAD`" ' -o bin/tinykv-server kv/tinykv-server/main.go
# command-line-arguments
flag provided but not defined: -L/usr/local/opt/icu4c/lib
Then I checked the Makefile under the root directory
LDFLAGS += -X "main.gitHash=`git rev-parse HEAD`"
It seems like just adding the version info. The L/usr/local/opt/icu4c/lib
wasn't added by tinykv. So I checked the value of this env LDFLAGS
on my host.
The answer is here
# echo $LDFLAGS
-L/usr/local/opt/icu4c/lib
LDFLAGS
has a initial value which affected this project.
My solution is easy:
export LDFLAGS = ''
Actually, I am not sure if this initial value is necessary for tinykv. Maybe we can clear this initial value for LDFLAGS
to avoid errors. It will be friendly for our students.
adjust the implementation https://github.com/ngaut/unistore/blob/master/tikv/server.go#L509 to TinyKV code base https://github.com/pingcap-incubator/tinykv/blob/master/kv/tikv/server.go#L180
The command tests are very repetitive with lots of boilerplate, there must be a better way
PartA: raft
PartB: raftstore + apply
PartC: snapshot + raftloggc
PartA: support conf change in raft
PartB: implement conf change in raftstore
PartC: split
InnerServer
actually gives some operations on the underlying storage, maybe we can rename InnerServer
and DBReader
to a meaningful name, like Storage
and StorageReader
https://github.com/pingcap-incubator/tinykv/blob/1df74d600cb0c5001f32d2d21d6f89e9744ee3e0/kv/test_raftstore/scheduler.go#L345-L348
From the code, it seems that region meta in Scheduler and TinyKV should be independence.
https://github.com/pingcap-incubator/tinykv/blob/1df74d600cb0c5001f32d2d21d6f89e9744ee3e0/kv/raftstore/peer.go#L345-L352
but it send a pointer to scheduler. It will cause problem when change peer. Maybe it should be cloned first.
When I worked on a solution, sometimes I get error related to applying snapshot, complaining some of the sst files are of invalid size 0.
This is strange as snapshots are validated on reception. If it passes reception, it should not have an invalid size.
After digging into the snapshot applying code path, I noticed that DB.IngestExternalFiles is just hardlinking the sst files into badger without removing the received file here:
https://github.com/pingcap-incubator/tinykv/blob/fc36c301fbf92d05e30e629453a7c3282399af75/kv/raftstore/snap/snap.go#L692
so what can happen is sst files are getting compacted as part of lsm compaction, and its size is reduced to 0 (so compaction literally does unmap/truncate to free space I assume?). when snapshot is re-sent, the follower will find the snapshot already exists. then when the follower applies the snapshot, it blows up on size mismatch.
we should either:
I did approach 1) in my solution:
https://github.com/junlong-gao/tinykv/blob/0fc6c28acffee911a38b2c043809e6977d7d2822/kv/raftstore/snap/snap.go#L692
and the issue no longer surfaces.
Could pingcap team please take a look at this and validate if my theory here makes sense?
Currently, the scheduler executes all requests sequentially on a single thread. We should execute them concurrently. This will require synchronization of the storage layer using latches.
go version go1.14.3 linux/amd64
I receive the following error when I run any of the above commands:
go: github.com/pingcap/[email protected] requires
github.com/pingcap-incubator/[email protected]: invalid version: unknown revision f660f80
I encountered a pitfall in kv/server/server_test.go
the test case TestRawGetAfterRawDelete1
in line 169 says we need to expose ErrKeyNotFound err when we try to get a non-exists key.
resp, err := server.RawGet(nil, get)
assert.Nil(t, err)
assert.True(t, resp.NotFound)
the code in TestRawDelete1
says we should return nil when we try to get a key that have already beed deleted.
_, err := server.RawDelete(nil, req)
assert.Nil(t, err)
val, err := Get(s, cf, []byte{99})
assert.Equal(t, nil, err)
assert.Equal(t, []byte(nil), val)
I am able to pass the make project1
when I change this line .
from
assert.Equal(t, nil, err)
to
assert.Equal(t, badger.ErrKeyNotFound, err)
So, I wonder it this a bug ? or there is a missing part of my code that I should continue to implement ?
https://github.com/pingcap-incubator/tinykv/blob/d13ccb8785702ecdf17368bf2350f7f82c58cab5/kv/raftstore/runner/region_task.go#L81-L88
https://github.com/pingcap-incubator/tinykv/blob/d13ccb8785702ecdf17368bf2350f7f82c58cab5/kv/raftstore/peer_storage.go#L157-L162
ps.snapState.Receiver
will never recive msg if error happend. Snapshot
method will always return ErrSnapshotTemporarilyUnavailable
The idea with the package is to provide abstractions for interacting with a low-level kv store. However, transaction and lock/write do kind of different things and it is really a package for lowering TinySQL abstractions to KV abstractions.
createPeer
AdminCmdType_Split
, there is no way to find out whether the new peer is creating or notstore_worker
may call replicatePeer
to create new peer when recive msg from other nodeso, it may create peer twice at same time
createPeer
will init raftLocalState.LastIndex
and raftLocalState.LastTerm
to 5 and replicatePeer
will init them to 0. and write them to storage.Imaging this: if createPeer
write raftLocalState.LastIndex
and raftLocalState.LastTerm
with 5 to storage and replicatePeer
get them, but when init applyState
, replicatePeer
get nothing and init applyState
to 0. it will encounter bug since raftLocalState
and applyState
is not match.
Am I right? or this situation will nerver happened and I miss something?
memInnerServer
is used in transaction tests, we can just remove it and use standaloneInnerServer
In TestRawDelete1, we didn't prepare test data before we delete it.
clean up the implementation that should be filled by students and move to a new branch
In file raft_test.go
(line 535) TestHandleMessageType_MsgAppend2AB
test case 8 and 9.
The message term is 1, and they are sent to a follower which in term 2, and expected the follower could append them and update state. From what I understand, I think this message should be ignore because this two message's term is slower than the follower. someone can explain this? thx. :)
Remove BatchSplitRequest
and BatchSplitResponse
and add SplitResponse
. https://github.com/pingcap-incubator/tinykv/blob/master/proto/proto/raft_cmdpb.proto#L86.
Then make OnReadySplitRegion
and ExecBatchSplit
into an unbatched manner(only one split key).
I think this function initializes storage, but it is not called in kv/server/server_test.go
. So what does this function do?
Currently not possible due to creating import cycles. It would be good to address this somehow (even just combining things into one package).
sm.Prs[2].Next
is a uint64 variable,after sm.Prs[2].Next=0
, sm.Prs[2].Next - 1
might overflow .
In TestRawGetNotFound1
, we create a standalone storage, but we don't start it(invoke storage.Start()
) before using it.
this part is using for building snapshot, it opens metafile and 3 cf files, but it seems the files did not closed. Also, in NewSnapForSending
, NewSnapForReceiving
, NewSnapForApplying
are not close file too.
Or, these files has been closed and I miss something?
DeepEqual will fail because want
and rd
differ in softstate and hardstate.
maybe we need something like
want := Ready{
&SoftState{
Lead: None,
RaftState: StateFollower,
},
pb.HardState{
Term: st.Term,
Commit: st.Commit,
},
[]pb.Entry{},
pb.Snapshot{},
// commit up to commit index in st
entries[:st.Commit],
make([]pb.Message, 0), // Messages []pb.Message should be empty slice or nil?
}
I encounter a bug in test testClusterSuite.TestConcurrentHandleRegion3C
. when call NewTestServer
, it will panic, error stack
PANIC: cluster_test.go:376: testClusterSuite.TestConcurrentHandleRegion3C
... Panic: invalid page type: 0: 4 (PC=0x43F645)
/usr/lib/go-1.14/src/runtime/panic.go:969
in gopanic
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/cursor.go:250
in Cursor.search
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/cursor.go:159
in Cursor.seek
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/bucket.go:165
in Bucket.CreateBucket
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/tx.go:108
in Tx.CreateBucket
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/mvcc/backend/batch_tx.go:72
in batchTx.UnsafeCreateBucket
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/mvcc/kvstore.go:149
in NewStore
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/mvcc/watchable_store.go:78
in newWatchableStore
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/mvcc/watchable_store.go:73
in New
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/etcdserver/server.go:543
in NewServer
/home/xjw/go/pkg/mod/go.etcd.io/[email protected]/embed/etcd.go:211
in StartEtcd
server.go:163
in Server.startEtcd
server.go:300
in Server.Run
testutil.go:48
in NewTestServer
cluster_test.go:379
in testClusterSuite.TestConcurrentHandleRegion3C
/usr/lib/go-1.14/src/reflect/value.go:321
in Value.Call
/home/xjw/go/pkg/mod/github.com/pingcap/[email protected]/check.go:850
in suiteRunner.forkTest.func1
/home/xjw/go/pkg/mod/github.com/pingcap/[email protected]/check.go:739
in suiteRunner.forkCall.func1
/usr/lib/go-1.14/src/runtime/asm_amd64.s:1373
in goexit
do not konw whether it is my fault.
the "overview of the code section" starts with
Same as the architect TiDB + TiKV + PD
should it not be this?
Same as the architecture of TiDB + TiKV + PD
https://github.com/pingcap-incubator/tinykv/blob/b6d7c442e3c9d1eac7192c3bff8372b16a7ae18b/kv/test_raftstore/scheduler.go#L393-L406
code between 396 and 401 does not make sense, it will always return false unless it is so lucky that the first peer in region is the added peer. Also, if there is no peer in region or only one peer that is added peer, code will reach the unreachable part.
TestHandleHeartbeat2AA
expect commit would change.
but later I encounter a bug like this
This issue is for the layer between the raft store and the server API, i.e., the server
and storage
modules in TiKV.
This layer must support the following:
At the highest level, this should be implemented in kv/tikv/server.go, by handling gRPC messages by calling the methods of InnerServer
.
for example tests under peer_storage_test.go
from what I understant both local messages and network messages should be append into Raft.msgs, and some code is responsible for loop through Raft.msgs and call Raft.Step.
Is that correct?
so the local messages(MessageType_MsgHup and MessageType_MsgBeat) generate by Raft.tick() just sit in msgs, waiting to be process, until then it's state will not change.
if that is the case, the following code will not terminate
https://github.com/pingcap-incubator/tinykv/blob/1df74d600cb0c5001f32d2d21d6f89e9744ee3e0/raft/raft_test.go#L797-L799
https://github.com/pingcap-incubator/tinykv/blob/1df74d600cb0c5001f32d2d21d6f89e9744ee3e0/raft/raft_test.go#L853-L855
I would like to define what is in and out of scope for the initial iteration of TinyKV. I'll start to make a list here, please comment if I got something wrong or if there are other things which should be in or out.
just like how RegionError
does, we can implement error for kvrpcpb.KeyError
like this:
type KeyError struct {
Err *kvrpcpb.KeyError
}
func (ke *KeyError) Error() string {
return ke.Err.String()
}
so function can unify return value of kvrpcpb.KeyError
into error
. For example
func (p *Prewrite) prewriteMutation(txn *mvcc.MvccTxn, mut *kvrpcpb.Mutation) (*kvrpcpb.KeyError, error) {}
->
func (p *Prewrite) prewriteMutation(txn *mvcc.MvccTxn, mut *kvrpcpb.Mutation) error {}
func (scan *Scanner) Next() ([]byte, []byte, interface{}) {}
->
func (scan *Scanner) Next() ([]byte, []byte, error) {}
I used make proto
locally, then I used git status
to check if any files have been modified,and the eraftpb.pb.go is the only one.As shown in the figure below,I found that the fileDescriptor of eraftpb.proto generated by protoc has been changed from 6ff387e2d76c3034 to 2f2e0bcef614736b.
Actually I think these generated code should not be pushed to git repo,but as you have pushed these files,is it better to make them consistent?Do I need to fix this?Or this is just because the difference of environment?
for example, storeMeta is accessed in both newRaftWorker (mutation for conf change) and newStoreWorker (reading for heartbeating to pd):
I was working on a solution and noticed inconsistent region info was sent to the scheduler (peer and region epoch field in the storeMeta was inconsistent) probably due to the fact that I was mutating it in place, updating fields one after another, and heartbeat worker reads seeing 2 fields one before and one after the mutation.
I avoided this issue in my solution by doing a deep copy:
yet it is still racy.
Perhaps we should kindly add a mutex in the global context so folks working on this will notice the race condition here?
https://github.com/pingcap-incubator/tinykv/blob/b6d7c442e3c9d1eac7192c3bff8372b16a7ae18b/kv/raftstore/snap/snap.go#L680-L685
cannot find where the file closed
it only changes applyState in DB
https://github.com/pingcap-incubator/tinykv/blob/1df74d600cb0c5001f32d2d21d6f89e9744ee3e0/kv/raftstore/peer_storage_test.go#L34-L40
but provided function PeerStorage.truncatedIndex
and PeerStorage.truncatedTerm
is using in memory applyState
https://github.com/pingcap-incubator/tinykv/blob/1df74d600cb0c5001f32d2d21d6f89e9744ee3e0/kv/raftstore/peer_storage.go#L219-L221
so in TestPeerStorageTerm
test 2 and 3, Term
will return ErrCompacted
, since truncatedIndex is initially 5
should we modify PeerStorage.truncatedIndex
and PeerStorage.truncatedTerm
function?
master branch TestRawNodeStart2C
course branch TestRawNodeRestart2AC
https://github.com/pingcap-incubator/tinykv/blob/edb12af93f17321030036ffcfcc7723239c3f8ad/raft/raft_test.go#L121
should be something like
cfg := func(c *Config) { c.peers = idsBySize(5) }
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.