Giter Club home page Giter Club logo

openraft's Introduction

Openraft

Advanced Raft in 🦀 Rust using Tokio. Please ⭐ on github!

Crates.io docs.rs guides Discord Chat
CI License Crates.io Crates.io

This project intends to improve raft as the next-generation consensus protocol for distributed data storage systems (SQL, NoSQL, KV, Streaming, Graph ... or maybe something more exotic).

Currently, openraft is the consensus engine of meta-service cluster in databend.

  • 🚀 Get started:

  • 🙌 Questions?

    • Why not take a peek at our FAQ? You might find just what you need.
    • Wanna chat? Come hang out with us on Discord!
    • Or start a new discussion over on GitHub.
    • Or join our Feishu group.
    • And hey, if you're on WeChat, add us: drmingdrmer. Let's get the conversation started!

Whatever your style, we're here to support you. 🚀 Let's make something awesome together!

Status

  • The features are almost complete for building an application.
  • Performance: Supports 70,000 writes/sec for single writer, and 1,000,000 writes/sec for 256 writers. See: Performance
  • Unit test coverage stands at 92%.
  • The chaos test has not yet been completed, and further testing is needed to ensure the application's robustness and reliability.

API status

  • Openraft API is not stable yet. Before 1.0.0, an upgrade may contain incompatible changes. Check our change-log. A commit message starts with a keyword to indicate the modification type of the commit:

    • DataChange: on-disk data types changes, which may require manual upgrade.
    • Change: if it introduces incompatible changes.
    • Feature: if it introduces compatible non-breaking new features.
    • Fix: if it just fixes a bug.

Versions

Roadmap

Performance

The benchmark is focused on the Openraft framework itself and is run on a minimized store and network. This is NOT a real world application benchmark!!!

clients put/s ns/op
256 1,014,000 985
64 730,000 1,369
1 70,000 14,273

For benchmark detail, go to the ./cluster_benchmark folder.

Features

  • Async and Event-Driven: Operates based on Raft events without reliance on periodic ticks, optimizing message batching for high throughput.
  • Extensible Storage and Networking: Customizable via RaftLogStorage, RaftStateMachine and RaftNetwork traits, allowing flexibility in choosing storage and network solutions.
  • Unified Raft API: Offers a single Raft type for creating and interacting with Raft tasks, with a straightforward API.
  • Cluster Formation: Provides strategies for initial cluster setup as detailed in the cluster formation guide.
  • Built-In Tracing Instrumentation: The codebase integrates tracing for logging and distributed tracing, with the option to set verbosity levels at compile time.

Functionality:

Who use it

Contributing

Check out the CONTRIBUTING.md guide for more details on getting started with contributing to this project.

Contributors

Made with contributors-img.

License

Openraft is licensed under the terms of the MIT License or the Apache License 2.0, at your choosing.

openraft's People

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

openraft's Issues

Leader Change Hook

What's the best way to hook into leader change events? I see how it's done internally, each append_entry request has this guard:

     if self.current_leader.as_ref() != Some(&msg.leader_id) {
         self.update_current_leader(UpdateCurrentLeader::OtherNode(msg.leader_id));
         report_metrics = true;
     }

But this leader_id field is dropped by the time by time my impl is called:

    #[tracing::instrument(level = "trace", skip_all, fields(len=%entries.len()))]
    async fn append_to_log(&self, entries: &[&Entry<SequencedRequest>]) -> Result<()> {
        // What's the cheapest way to find out whether this entry has a new leader?
        todo!()
    }

Basically, I need set some context following a leader re-election BEFORE applying the associated entry to the state machine? I could simply query the RaftMetrics in apply_to_state_machine if that's how the control flow works. As it stands, I'm not 100% sure when report_metrics takes effect because it sends the update asynchronously:

        let res = self.tx_metrics.send(RaftMetrics {
            id: self.id,
            state: self.target_state,
            current_term: self.current_term,
            last_log_index: self.last_log_id.index,
            last_applied: self.last_applied.index,
            current_leader: self.current_leader,
            membership_config: self.membership.clone(),
            snapshot: self.snapshot_last_log_id,
            leader_metrics,
        });

`RaftStorage`: add method `get_log_state() -> Result<(Option<LogId>, Option<LogId>, StorageError>` to return the first and last log id in the store

RaftStorage: add method get_log_state() -> Result<(Option<LogId>, Option<LogId>, StorageError> to return the first and last log id in the store.

Then remove first_id_in_log() and last_id_in_log().

This way to simplify the trait API users need to impl.

async fn first_id_in_log(&self) -> Result<Option<LogId>, StorageError>;
async fn first_known_log_id(&self) -> Result<LogId, StorageError>;
/// Returns the last log id in log.
///
/// The impl should not consider the applied log id in state machine.
async fn last_id_in_log(&self) -> Result<LogId, StorageError>;

Demo

Could you please add a demo for that, I am trying to follow the tutorial but it's missing a lot of information. Would be nice to have a simple demo implementation of multiple server running and applying the consensus. The content inside Getting Started is just a base implementation of what this library could be. For example:

What get_id_from_storage() could be?
How to properly implement RaftRouter and RaftStorage?

After some investigation I noticed that the demo is outdated. Could you please update it?

How I read data from MemStore? What are the required steps? The documentation is not clear about that.

Wait - Last Applied Index

I'm starting to use your handy wait_ test utils.

I see this to wait for logs, suggesting both values should always be equal:

    #[tracing::instrument(level = "debug", skip(self), fields(msg=msg.to_string().as_str()))]
    pub async fn log(&self, want_log: u64, msg: impl ToString) -> Result<RaftMetrics, WaitError> {
        self.metrics(
            |x| x.last_log_index == want_log,
            &format!("{} .last_log_index -> {}", msg.to_string(), want_log),
        )
        .await?;

        self.metrics(
            |x| x.last_applied == want_log,
            &format!("{} .last_applied -> {}", msg.to_string(), want_log),
        )
        .await
    }

This is used after config changes like change membership. But isn't last_applied incremented only for Normal log entries applied to the state machine?

Refine cluster initialization procedure

The cluster init procedure is specialized and introduced duplicated codes.
This procedure could be safely implemented with existing routines.
This way to reduce code and make it more easier to test.

Currently:

  • Init effective-membership.
  • Enter candidate state.
  • Become leader and commit a membership log that contains the initial config or a blank log(cluster already initialized).

Expected:

  • Append a membership log at index 0, with term 0
  • Enter candidate state.
  • Become leader and commit a blank log entry.

This way:

  • commit_initial_leader_entry does not need to check last_log_index == 0.
  • No more membership operations when initializing a cluster.

bug: CI test `stepdown` failed

here is the log and CI link https://github.com/datafuselabs/openraft/runs/4700908123?check_suite_focus=true#step:4:417

running 1 test
test stepdown ... FAILED

failures:

---- stepdown stdout ----
thread 'stepdown' panicked at 'assertion failed: `(left == right)`: expected node 1 to have state machine last_applied_log 2-4, got 3-4

Diff < left / right > :
 LogId {
<    term: 3,
>    term: 2,
     index: 4,
 }

', openraft/tests/fixtures/mod.rs:701:13
stack backtrace:
   0:     0x55cb16b53f4c - std::backtrace_rs::backtrace::libunwind::trace::h65ef482bb9b15649
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:     0x55cb16b53f4c - std::backtrace_rs::backtrace::trace_unsynchronized::hf1ee7630128bf9a9
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x55cb16b53f4c - std::sys_common::backtrace::_print_fmt::haddc20e8865333bd
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/sys_common/backtrace.rs:67:5
   3:     0x55cb16b53f4c - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hc06b166f304d5a13
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/sys_common/backtrace.rs:46:22
   4:     0x55cb16b7b96c - core::fmt::write::h212f7b7266b9a26a
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/core/src/fmt/mod.rs:1149:17
   5:     0x55cb16b4e913 - std::io::Write::write_fmt::hbfd04b0aa1968e83
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/io/mod.rs:1660:15
   6:     0x55cb16b56702 - std::sys_common::backtrace::_print::h9d519f4e309ac3d8
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/sys_common/backtrace.rs:49:5
   7:     0x55cb16b56702 - std::sys_common::backtrace::print::h8362bac372870d46
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/sys_common/backtrace.rs:36:9
   8:     0x55cb16b56702 - std::panicking::default_hook::{{closure}}::hdf453592ec91e76e
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/panicking.rs:211:50
   9:     0x55cb16b56351 - std::panicking::default_hook::h9b626066b2c6b270
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/panicking.rs:225:9
  10:     0x55cb16b56d53 - std::panicking::rust_panic_with_hook::h90c32e4f1cc54562
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/panicking.rs:606:17
  11:     0x55cb16b56a70 - std::panicking::begin_panic_handler::{{closure}}::h6751672be4522935
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/panicking.rs:502:13
  12:     0x55cb16b543f4 - std::sys_common::backtrace::__rust_end_short_backtrace::hdecab70784de07d3
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/sys_common/backtrace.rs:139:18
  13:     0x55cb16b567a9 - rust_begin_unwind
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/panicking.rs:498:5
  14:     0x55cb163ba401 - core::panicking::panic_fmt::h3a6bf5e754065d1c
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/core/src/panicking.rs:107:14
  15:     0x55cb164f3c04 - stepdown::fixtures::RaftRouter::assert_storage_state::{{closure}}::h436e5f88acb8f336
                               at /home/runner/work/openraft/openraft/openraft/tests/fixtures/mod.rs:701:13
  16:     0x55cb16550429 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hd7c940c832f59945
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/core/src/future/mod.rs:84:19
  17:     0x55cb1661945a - stepdown::stepdown::{{closure}}::h7dff7e0fc7262991
                               at /home/runner/work/openraft/openraft/openraft/tests/stepdown.rs:131:99
  18:     0x55cb1654e1e9 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hb0fac9f34dea2935
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/core/src/future/mod.rs:84:19
  19:     0x55cb165ec630 - tokio::park::thread::CachedParkThread::block_on::{{closure}}::hc967d3ff0ac1d916
                               at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/park/thread.rs:263:54
  20:     0x55cb1662e199 - tokio::coop::with_budget::{{closure}}::hece6e7fff23fbe9b
                               at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/coop.rs:102:9
  21:     0x55cb164d6aea - std::thread::local::LocalKey<T>::try_with::hfd9320b0b9fec881
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/thread/local.rs:412:16
  22:     0x55cb164d52fd - std::thread::local::LocalKey<T>::with::h72ed484b2dfa81fb
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/thread/local.rs:388:9
  23:     0x55cb165ec3ed - tokio::coop::with_budget::h0ed9a56ecce39f22
                               at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/coop.rs:95:5
  24:     0x55cb165ec3ed - tokio::coop::budget::h64415ea0ab85f3d6
                               at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/coop.rs:72:5
  25:     0x55cb165ec3ed - tokio::park::thread::CachedParkThread::block_on::h4e7f5e9c04ec7ab7
                               at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/park/thread.rs:263:31
  26:     0x55cb1655782e - tokio::runtime::enter::Enter::block_on::h2de1d90cc0290968
                               at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/enter.rs:151:13
  27:     0x55cb163c3923 - tokio::runtime::thread_pool::ThreadPool::block_on::h1c0b9c616e7bcdbc
                               at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/thread_pool/mod.rs:77:9
  28:     0x55cb164e7b46 - tokio::runtime::Runtime::block_on::hc0f1407cae225896
                               at /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/mod.rs:463:43
  29:     0x55cb164855a1 - stepdown::stepdown::hafcf32fcebecedb3
                               at /home/runner/work/openraft/openraft/openraft/tests/stepdown.rs:136:5
  30:     0x55cb16613c4e - stepdown::stepdown::{{closure}}::h7550cf795b8ff982
                               at /home/runner/work/openraft/openraft/openraft/tests/stepdown.rs:24:7
  31:     0x55cb16442efe - core::ops::function::FnOnce::call_once::h7271fc4f3780d659
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/core/src/ops/function.rs:227:5
  32:     0x55cb16676b03 - core::ops::function::FnOnce::call_once::h2f9ae408a2054227
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/core/src/ops/function.rs:227:5
  33:     0x55cb16676b03 - test::__rust_begin_short_backtrace::h946635b2b37a3f0d
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/test/src/lib.rs:585:5
  34:     0x55cb166757af - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h327757468c02cf0a
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/alloc/src/boxed.rs:1811:9
  35:     0x55cb166757af - <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once::h51682cfbb1c3ae3c
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/core/src/panic/unwind_safe.rs:271:9
  36:     0x55cb166757af - std::panicking::try::do_call::hacea7f0da16ddb2e
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/panicking.rs:406:40
  37:     0x55cb166757af - std::panicking::try::hb51105ac5e97d475
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/panicking.rs:370:19
  38:     0x55cb166757af - std::panic::catch_unwind::h3e6525b13276184d
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/panic.rs:133:14
  39:     0x55cb166757af - test::run_test_in_process::hc169dee883f59795
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/test/src/lib.rs:608:18
  40:     0x55cb166757af - test::run_test::run_test_inner::{{closure}}::hf2b675a51a8a715b
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/test/src/lib.rs:500:39
  41:     0x55cb1667eb11 - test::run_test::run_test_inner::{{closure}}::h24c5be3fc3c26378
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/test/src/lib.rs:527:37
  42:     0x55cb1667eb11 - std::sys_common::backtrace::__rust_begin_short_backtrace::h02f81884a4850f72
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/sys_common/backtrace.rs:123:18
  43:     0x55cb1664b5df - std::thread::Builder::spawn_unchecked::{{closure}}::{{closure}}::h4b4d68bc1bf14409
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/thread/mod.rs:477:17
  44:     0x55cb1664b5df - <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once::h41705b2de488f292
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/core/src/panic/unwind_safe.rs:271:9
  45:     0x55cb1664b5df - std::panicking::try::do_call::hb0c49e6a8b1782a9
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/panicking.rs:406:40
  46:     0x55cb1664b5df - std::panicking::try::hc9baa64f4cf6e2d3
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/panicking.rs:370:19
  47:     0x55cb1664b5df - std::panic::catch_unwind::h38c4576076b1268c
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/panic.rs:133:14
  48:     0x55cb1664b5df - std::thread::Builder::spawn_unchecked::{{closure}}::h12b59a33f3b9f983
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/thread/mod.rs:476:30
  49:     0x55cb1664b5df - core::ops::function::FnOnce::call_once{{vtable.shim}}::h6d952349615f77cc
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/core/src/ops/function.rs:227:5
  50:     0x55cb16b5cce3 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hc32e0330a23c98fc
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/alloc/src/boxed.rs:1811:9
  51:     0x55cb16b5cce3 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hc425e01eeb6518b7
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/alloc/src/boxed.rs:1811:9
  52:     0x55cb16b5cce3 - std::sys::unix::thread::Thread::new::thread_start::hda5c72740ad584b7
                               at /rustc/8f3238f898163f09726c3d2b2cc9bafb09da26f3/library/std/src/sys/unix/thread.rs:108:17
  53:     0x7fa030cb7609 - start_thread
  54:     0x7fa030a89293 - clone
  55:                0x0 - <unknown>


failures:
    stepdown

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 5.63s

error: test failed, to rerun pass '-p openraft --test stepdown'

cc @drmingdrmer

Re-organize integration tests.

Currently integration tests are in the one-case-per-file mode:

  ▾ �tests/
    ▸ �fixtures/
    ▸ �membership/
      �add_remove_voter.rs
      �api_install_snapshot.rs
      �append_conflicts.rs
      �append_inconsistent_log.rs
      �append_updates_membership.rs
...

The problems are:

  • Slow compilation: every case re-compiles the common mod fixtures.
  • Hard for new participant to understand.

Proposed Solution:

Put related tests into a folder, such as membership related tests are now staying in /tests/membership.
The case may be one of the category:

Update the demo, create a complete raft application as demo

TODO:

  • Create demo implementation of RaftNetwork.
  • Make crate memstore a key-value store to precisely reflect what real-world storage is like.
  • Build an example that brings up a 3-nodes raft cluster application.
  • Update getting-started.md to show how to build this example app.

RaftStorage::get_initial_state should not try to fill in any default values. Leave them None and let the caller to init them.

  • InitialState::new_initial does nothing else than build a default value of InitialState.
    There is nothing a user has to implement for InitialState.
    Thus just let Raft do this job. A user just returns None in the RaftStorage ::get_initial_state().

  • In addition, what get_initial_state does is determined. A user does not need to customize get_initial_state if he has already implemented other methods in RaftStorage. Thus get_initial_state can be removed.

Remove `RaftEvent::Repilcate::entry`. Replace it with log id.

Remove RaftEvent::Repilcate::entry. Replace it with log id.

RaftCore does not need to send an entry, but just a log id is quite enough.
Since a ReplicationStream always reads log entries from a RaftStorage impl.

pub(crate) enum RaftEvent<D: AppData> {
Replicate {
/// The new entry which needs to be replicated.
///
/// This entry will always be the most recent entry to have been appended to the log, so its
/// index is the new last_log_index value.
entry: Arc<Entry<D>>,
/// The index of the highest log entry which is known to be committed in the cluster.
committed: LogId,
},

build fail for stable rust

error[E0658]: arbitrary expressions in key-value attributes are unstable
--> /Users/simon/.cargo/git/checkouts/async-raft-c411f5c0ea29c115/ca6ba30/async-raft/src/lib.rs:1:10
|
1 | #![doc = include_str!("../README.md")]
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: see issue #78835 rust-lang/rust#78835 for more information

error: aborting due to previous error

For more information about this error, try rustc --explain E0658.
error: could not compile async-raft

To learn more, run the command again with --verbose.

Avoid using `set_target_state()` everywhere to modify raft state.

If one piece of code updates the state(e.g., to Leader or Follower), then every other code has to check the state change before going on.

A better way is to return an error indicating state change, to shutdown the current state procedure, and notify the top level to switch to the expected state.

Add Test: handle_append_entries should save hard state with vote_for to be the leader instead of None.

This should be a unit test for a method and does not need to be an integration test.

There may be some refactoring work to be done to make unit test against handle_append_entries() easier to write.

Problem:

A follower saves hard state (term=msg.term, voted_for=None)
when a msg.term > local.term when handling append-entries RPC.

This is quite enough to be correct but not perfect. Correct because:

  • In one term, only an established leader will send append-entries;

  • Thus, there is a quorum voted for this leader;

  • Thus, no matter what voted_for is saved, it is still correct. E.g.
    when handling append-entries, a follower node could save hard state
    (term=msg.term, voted_for=Some(ANY_VALUE)).

The problem is that a follower already knows the legal leader for a term
but still does not save it. This leads to an unstable cluster state: The
test sometimes fails.

Solution:

A follower always save hard state with the id of a known legal leader.

Originally posted by @lichuang in #95 (comment)

raft.rs change_membership function call RaftMsg::ChangeMembership twice

in raft.rs, change_membership function call RaftMsg::ChangeMembership twice with the same params (first to joint config, second to uniform config),this may make source readers confused.

How about add a param to indicate the action(joint,uniform) instead, this will make code more clearly.

Add RPCError; methods in trait Network returns RPCError instead of anyhow::Error

anyhow should not be used in a lib. It can not express correctly what is actually going wrong.
E.g. a transport error or storage error emitted by the remote peer.

API needs explicit error type:

enum RPCError {
     /// timeout recv a response
    Timeout(...)
    /// implementation specific error
    NetworkError(...)
    /// an error happened on remote peer
    RemoteError(...)
}

Simplify trait RaftStorage

Goal: it should only provide the necessary API. Extended API should be defined in RaftStorage or StoreExt. Let a user write less code.

Join forces?

Hello datafuse team! I just wanted to open a channel of communication and see if you all would be interested in getting this fork merged upstream. I know it is 111 commits difference at this point, but taking the changes and adding some folks as co-maintainers would be awesome for the overall health of the async-raft project.

Just let me know, cheers 🍻

adding a client implementation

@drmingdrmer

I would consider adding a client implementation as well since "all communication has to go through the leader" and "it is the clients responsibility to send requests to the leader". That is, show a simple client implementation that handles ForwardToLeader errors.

I'm very happy that you put the demo together ❤️. I think it'll help a lot of people. In particular using HTTP endpoints instead of tonic gRPC will make it easier to follow when you're just getting started.

Originally posted by @MikaelCall in #155 (comment)

Add Doc: explain send_append_entires

This method is a bit complicated and needs a nice explanation.

It involves:

  • binary search for the matching log id on a follower.
  • loading log entries from non-snapshot storage API.
  • deal with lack-entry error and switch to snapshot replication.

Refactor errors

  • Errors should be serializable for transport: by using AnyError to wrap external error
  • Top-level errors are API errors, such as ClientWriteError, ChangeMembershipError etc. API errors should be an enum that lists every possible sub error.
  • Removed RaftError, what it does is not very clear.
  • Avoid using anyhow::Error inside this crate.

Avoid `report_metrics()` everywhere

Currently, if it noticed some state is changed, it calls report_metrics() at once to broadcast the change.
report_metcis() is not that cheap and should be avoided being called frequently.

A better way may be to report once after dealing with one event. Because openraft is event-driven thus if there is no incoming event, the metrics will never change. E.g.: just call report_metrics() once at the end of the loop and when there is an error returned:

pub(self) async fn leader_loop(mut self) -> Result<(), Fatal> {
loop {
if !self.core.target_state.is_leader() {
tracing::info!("id={} state becomes: {:?}", self.core.id, self.core.target_state);
// implicit drop replication_rx
// notify to all nodes DO NOT send replication event any more.
return Ok(());
}
let span = tracing::debug_span!("CHrx:LeaderState");
let _ent = span.enter();
tokio::select! {
Some((msg,span)) = self.core.rx_api.recv() => {
self.handle_msg(msg).instrument(span).await?;
},
Some(update) = self.core.rx_compaction.recv() => {
tracing::info!("leader recv from rx_compaction: {:?}", update);
self.core.update_snapshot_state(update);
}
Some((event, span)) = self.replication_rx.recv() => {
tracing::info!("leader recv from replication_rx: {:?}", event.summary());
self.handle_replica_event(event).instrument(span).await?;
}
Ok(_) = &mut self.core.rx_shutdown => {
tracing::info!("leader recv from rx_shudown");
self.core.set_target_state(State::Shutdown);
}
}
}
}

Draft: introduce type Leader{term, node_id}

Although in raft there is only one leader per term, a Leader is an essential concept behind the simplified raft terminology.

E.g. Voting is actually comparing two vectors: (Leader, last_log_id).

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.