
Comments (13)

killme2008 avatar killme2008 commented on July 17, 2024 3

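After thinking it over carefully, the problem here is similar to #15. The scenario described in the issue can indeed occur: two separate majorities form, (A, B) and (B, C), and writes happen to be rare or absent, so A cannot promptly discover that C has become the new leader. If C then writes a new log entry within one heartbeat period, A still believes it is the leader and a stale read results. preVote should check whether the leader lease is still valid, to prevent C from becoming the new leader.

Thanks for the analysis.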
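
A minimal sketch of the proposed lease check in preVote, assuming a follower that tracks when it last heard from its leader. The class `PreVoteLeaseSketch` and all of its members are hypothetical names chosen for illustration; this is not sofa-jraft's actual code.

```java
/** Hypothetical sketch: a follower rejects pre-votes while its leader lease is still valid. */
final class PreVoteLeaseSketch {
    private final long leaseTimeoutMs;   // assumed lease length, e.g. the election timeout
    private long lastLeaderTimestampMs;  // last time a leader request was received

    PreVoteLeaseSketch(long leaseTimeoutMs, long nowMs) {
        this.leaseTimeoutMs = leaseTimeoutMs;
        this.lastLeaderTimestampMs = nowMs;
    }

    /** Called whenever a heartbeat or replication request arrives from the current leader. */
    void onLeaderMessage(long nowMs) {
        lastLeaderTimestampMs = nowMs;
    }

    /** True while the follower has heard from its leader within the lease window. */
    boolean isLeaderLeaseValid(long nowMs) {
        return nowMs - lastLeaderTimestampMs < leaseTimeoutMs;
    }

    /** Grants a pre-vote only if the lease has expired and the candidate's log is up to date. */
    boolean handlePreVote(long nowMs, boolean candidateLogUpToDate) {
        if (isLeaderLeaseValid(nowMs)) {
            // B still trusts A, so C cannot collect a majority of pre-votes.
            return false;
        }
        return candidateLogUpToDate;
    }
}
```

In the (A, B) / (B, C) scenario, B plays the role of this follower: as long as A's heartbeats keep reaching B, B refuses C's pre-vote, so C cannot be elected while A may still be serving reads.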

from sofa-jraft.

PFZheng avatar PFZheng commented on July 17, 2024 1

Nice one, you respond really quickly.


pifuant avatar pifuant commented on July 17, 2024

You could follow the approach given in Paxos Made Live: have followers refuse to elect a new leader while the lease is still valid.


pifuant avatar pifuant commented on July 17, 2024

In our implementation, all replicas implicitly grant a lease to the master of the previous Paxos instance
and refuse to process Paxos messages from any other replica while the lease is held. The master maintains
a shorter timeout for the lease than the replicas – this protects the system against clock drift. The master
periodically submits a dummy “heartbeat” value to Paxos to refresh its lease.


PFZheng avatar PFZheng commented on July 17, 2024

@pifuant My concern is that sofa-jraft's current implementation does not seem to guarantee this.


pifuant avatar pifuant commented on July 17, 2024

@PFZheng I don't seem to see any related handling either.


killme2008 avatar killme2008 commented on July 17, 2024

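@PFZheng @pifuant It is actually handled, just not by an explicit time comparison; a timer is used to avoid the problem instead. See the handleElectionTimeout implementation: it is invoked periodically by a timer whose interval is the election timeout. A preVote can only be initiated once lastLeaderTimestamp has expired, so a follower is guaranteed to start a preVote only after it has stopped receiving newer requests from the leader.

The heartbeat interval between leader and followers is 1/10 of the election timeout. Besides the periodic heartbeats, every replication request also updates lastLeaderTimestamp.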
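
The timer-based gating described above can be sketched roughly as follows. The names handleElectionTimeout and lastLeaderTimestamp come from the comment; the class `ElectionTimerSketch`, its method bodies, and the `onLeaderRequest` helper are assumptions made for illustration and are not taken from the sofa-jraft source.

```java
/** Hypothetical sketch of a follower's election timer gated by lastLeaderTimestamp. */
final class ElectionTimerSketch {
    private final long electionTimeoutMs;
    private long lastLeaderTimestampMs;

    ElectionTimerSketch(long electionTimeoutMs, long nowMs) {
        this.electionTimeoutMs = electionTimeoutMs;
        this.lastLeaderTimestampMs = nowMs;
    }

    /** Heartbeats are sent at one tenth of the election timeout, as stated in the thread. */
    long heartbeatIntervalMs() {
        return electionTimeoutMs / 10;
    }

    /** Heartbeats and replication requests from the leader both refresh the timestamp. */
    void onLeaderRequest(long nowMs) {
        lastLeaderTimestampMs = nowMs;
    }

    /**
     * Invoked by a periodic timer, once per election timeout.
     * Returns true only if a pre-vote round should be started.
     */
    boolean handleElectionTimeout(long nowMs) {
        if (nowMs - lastLeaderTimestampMs < electionTimeoutMs) {
            return false; // the leader was heard from recently, so stay a follower
        }
        return true; // no leader contact for a full election timeout
    }
}
```

Note that this only restricts when a follower starts a pre-vote itself; it does not by itself stop the follower from granting a pre-vote requested by another node, which is the gap discussed in this issue.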


PFZheng avatar PFZheng commented on July 17, 2024

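Quoting @killme2008: "After thinking it over carefully, the problem here is similar to #15. The scenario described in the issue can indeed occur: two separate majorities form, (A, B) and (B, C), and writes happen to be rare or absent, so A cannot promptly discover that C has become the new leader. If C then writes a new log entry within one heartbeat period, A still believes it is the leader and a stale read results. preVote should check whether the leader lease is still valid, to prevent C from becoming the new leader. Thanks for the analysis."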

Having read that, the same approach can indeed be used to solve this 😊. I would also suggest providing a separate lease timeout parameter.


pifuant avatar pifuant commented on July 17, 2024
            this.checkReplicator(peer);
            if (nowMs - replicatorGroup.getLastRpcSendTimestamp(peer) <= options.getElectionTimeoutMs()) {
                aliveCount++;
                continue;
            }
            deadNodes.addPeer(peer);
 

Shouldn't replicatorGroup.getLastRpcSendTimestamp(peer) here be a time that has been confirmed by a response from the peer, rather than merely the send timestamp?


killme2008 avatar killme2008 commented on July 17, 2024

@pifuant That is the lastRpcSendTimestamp in Replicator, which is updated each time an RPC response comes back. The name may have misled you.
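
To illustrate the point, here is a minimal sketch of a leader-side check that counts a peer as alive only if a response from it arrived within the election timeout. `LeaderLeaseCheckSketch`, `onRpcResponse`, and `hasQuorum` are hypothetical names; the loop is modeled on the snippet quoted above but is not the library's code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: the leader keeps its lease only while a majority has acked recently. */
final class LeaderLeaseCheckSketch {
    private final Map<String, Long> lastAckMs = new HashMap<>();
    private final long electionTimeoutMs;

    LeaderLeaseCheckSketch(long electionTimeoutMs) {
        this.electionTimeoutMs = electionTimeoutMs;
    }

    void addPeer(String peer, long nowMs) {
        lastAckMs.put(peer, nowMs);
    }

    /** The timestamp is recorded when a response returns, not when the request is sent. */
    void onRpcResponse(String peer, long nowMs) {
        lastAckMs.put(peer, nowMs);
    }

    /** Counts the leader itself plus every peer that acked within the election timeout. */
    boolean hasQuorum(long nowMs) {
        int aliveCount = 1; // the leader
        for (long ackMs : lastAckMs.values()) {
            if (nowMs - ackMs <= electionTimeoutMs) {
                aliveCount++;
            }
        }
        int clusterSize = lastAckMs.size() + 1;
        return aliveCount >= clusterSize / 2 + 1;
    }
}
```

If the timestamp were taken at send time instead, a leader cut off from its peers could keep refreshing it without ever receiving a reply and would never notice that it had lost its majority.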


pifuant avatar pifuant commented on July 17, 2024

@killme2008 thx. Given clock drift, when checking whether the lease is valid, wouldn't it be better to use some value smaller than options.getElectionTimeoutMs() in place of options.getElectionTimeoutMs() itself?
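
As a rough illustration of the question, the leader could trust its lease for only a fraction of the election timeout. Everything in this sketch, including the class `LeaseMarginSketch` and the `leaseRatioPercent` parameter, is a hypothetical name invented here; it does not describe existing sofa-jraft behavior or options.

```java
/** Hypothetical sketch: the leader's lease is shorter than the followers' election timeout. */
final class LeaseMarginSketch {
    private LeaseMarginSketch() {
    }

    /** Lease length the leader trusts: the election timeout scaled by a ratio in (0, 100]. */
    static long leaderLeaseMs(long electionTimeoutMs, int leaseRatioPercent) {
        if (leaseRatioPercent <= 0 || leaseRatioPercent > 100) {
            throw new IllegalArgumentException("leaseRatioPercent must be in (0, 100]");
        }
        return electionTimeoutMs * leaseRatioPercent / 100;
    }

    /** True while the most recent acknowledgement is younger than the shortened lease. */
    static boolean leaderLeaseValid(long nowMs, long lastAckMs,
                                    long electionTimeoutMs, int leaseRatioPercent) {
        return nowMs - lastAckMs < leaderLeaseMs(electionTimeoutMs, leaseRatioPercent);
    }
}
```

With an election timeout of 1000 ms and a ratio of 90, the leader stops trusting its lease 900 ms after the last acknowledgement, leaving a 100 ms margin for clock drift before followers would consider an election.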


killme2008 avatar killme2008 commented on July 17, 2024

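@pifuant Because the heartbeat period is one tenth of electionTimeout, no choice of how much smaller than getElectionTimeoutMs the value should be would be entirely reasonable, so keeping the current check is acceptable.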


fengjiachun avatar fengjiachun commented on July 17, 2024

The fix for this bug has been released in v1.2.4, so I'm closing this for now.

