Comments (6)
Hi @Winslett any plans to implement this any time soon?
from governor.
@tvb I'm undecided about how this should work.
My core issue with moving this way is solving the "what if the etcd cluster goes away?" problem. I need to create another issue for that problem, and probably reference this problem. If we relied on etcd state for leader / follower in haproxy_status.sh, and etcd had a maintenance window, crashed, or had a network partition, then the Postgres cluster would go down. With the current behavior, etcd going away would cause governor.py to throw a urllib error, which would stop PostgreSQL. In a perfect scenario, if etcd is unavailable to a running cluster, the cluster should maintain the current Primary if possible, but not fail over.
@jberkus and I chatted about this scenario. If etcd is inaccessible to the leader (network partition, etcd outage, or maintenance), a leader governor should expect a majority of follower governors to provide heartbeats to the leader. If follower heartbeats do not provide enough votes, the leader governor would go read-only and the cluster would wait for etcd to return. I would start the process by modifying the decision tree.
[update: created the issue at https://github.com//issues/7]
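A minimal sketch of the heartbeat vote described above, assuming the leader keeps a map of follower names to the time of their last heartbeat. The function name, the `max_age` threshold, and the data layout are hypothetical and not part of governor.py; "majority" is taken to mean more than half of the known followers.

```python
def leader_should_stay_writable(heartbeats, follower_count, now, max_age=10.0):
    """Return True if a majority of followers sent a recent heartbeat.

    heartbeats: dict mapping a follower name to the timestamp (seconds)
                of its last heartbeat. Hypothetical structure.
    follower_count: number of followers the leader expects to hear from.
    now: current timestamp in the same units as the heartbeat values.
    max_age: heartbeats older than this many seconds are ignored (assumed).
    """
    fresh = sum(1 for ts in heartbeats.values() if now - ts <= max_age)
    # Strict majority of followers; with zero followers this is False,
    # so a lone leader without etcd would go read-only.
    return fresh > follower_count / 2


def next_leader_mode(etcd_reachable, heartbeats, follower_count, now):
    """Pick the leader's mode for this cycle: 'read-write' or 'read-only'."""
    if etcd_reachable:
        # Normal operation: etcd remains the source of truth.
        return "read-write"
    if leader_should_stay_writable(heartbeats, follower_count, now):
        return "read-write"
    return "read-only"
```

With three followers, two fresh heartbeats keep the leader writable while etcd is away; one fresh heartbeat sends it read-only until etcd returns.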
In the interim of solving that problem…
The more I think about this, the more I think governor.py should handle the state responses to haproxy, thus removing the haproxy_status.sh files and moving the HTTP port configuration to the postgres*.yml files.
For people who know Python better than I do, is there a sensible way to run governor.py with a looping runner and an HTTP listener?
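One possible shape for this, sketched with only the Python 3 standard library: the HTTP listener runs in a daemon thread and reads a small lock-protected state object that the main loop updates. All names here (`State`, `status_for`, `start_listener`, `run_loop`) are hypothetical, and the 200/503 convention for leader/follower is an assumption about what the haproxy check would expect, not governor's actual behavior.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


class State:
    """Shared state: written by the loop, read by the HTTP handler."""

    def __init__(self):
        self._lock = threading.Lock()
        self._is_leader = False

    def set_leader(self, value):
        with self._lock:
            self._is_leader = bool(value)

    def is_leader(self):
        with self._lock:
            return self._is_leader


def status_for(is_leader):
    """Map leader state to an HTTP status code and body (assumed convention)."""
    return (200, b"leader\n") if is_leader else (503, b"follower\n")


def make_handler(state):
    class StatusHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            code, body = status_for(state.is_leader())
            self.send_response(code)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep health-check requests out of stderr

    return StatusHandler


def start_listener(state, host="127.0.0.1", port=8008):
    """Start the HTTP listener in a background daemon thread."""
    server = HTTPServer((host, port), make_handler(state))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def run_loop(state, check_leader, cycles, interval=0.0):
    """Run the governor-style loop for a fixed number of cycles."""
    for _ in range(cycles):
        state.set_leader(check_leader())
        time.sleep(interval)
```

In use, `start_listener(state)` would be called once at startup and `run_loop` would then run in the main thread, with `check_leader` standing in for whatever governor.py does each cycle. The port would come from the postgres*.yml configuration rather than being hard-coded.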
from governor.
"In a perfect scenario, if etcd is unavailable to a running cluster, the cluster should maintain the current Primary if possible, but not failover"
This is tricky as there would be no way for the primary to check its status.
from governor.
" If etcd is unaccessible by the leader (network partition, etcd outage, or maintenance), a leader governor should expect a majority of follower governors to provide heartbeats to the leader. If follower heartbeats are not providing enough votes, the leader governor would go read-only and the cluster would wait for etcd to return. I would start the process by modifying the decision tree."
My thinking was this:
- if etcd is not accessible to any follower db, it should remain as it is;
- if etcd is not accessible to the leader db, it should restart in read-only mode in case it is no longer the leader;
- if etcd is not accessible to HAProxy, it should make no changes except for disabling failed nodes.
The last point is a good reason, IMHO, for HAProxy to do direct checks against each node as well as etcd, via this logic:
    Is etcd responding?
        Is node marked leader in etcd?
            Is node responding?
                enable node
            else:
                disable node
        else:
            disable node
    else:
        Is node responding?
            leave node enabled/disabled
        else:
            disable node
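The decision tree above can be written as a small Python function. This is only an illustration: the function name and its boolean parameters are hypothetical, and in practice the inputs would come from an etcd query and a direct check against the node.

```python
def node_should_be_enabled(etcd_responding, marked_leader, node_responding,
                           currently_enabled):
    """Return True if HAProxy should have this node enabled.

    etcd_responding: whether etcd answered the health query.
    marked_leader: whether etcd lists this node as leader
                   (ignored when etcd is not responding).
    node_responding: whether the node itself answered a direct check.
    currently_enabled: the node's present state in HAProxy.
    """
    if etcd_responding:
        # etcd is the source of truth: only a responding leader is enabled.
        return bool(marked_leader and node_responding)
    if node_responding:
        # Without etcd, make no changes to a node that still responds.
        return bool(currently_enabled)
    # Without etcd, the only allowed change is disabling a failed node.
    return False
```

For example, `node_should_be_enabled(False, False, True, True)` keeps an enabled node enabled while etcd is down, whereas the same call with `node_responding=False` disables it.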
One problem with the above logic is that it never supports load-balancing connections to the read replica. However, that seems to be a limitation of any HAProxy-based design if we want automated connection switching, due to an inability to add new backends to HAProxy without restarting. FYI, I plan to instead use Kubernetes networking to handle the load-balancing case.
One thing I don't understand is why we need to have an HTTP daemon for HAProxy auth. Isn't there some way it can check the postgres port? I'm pretty sure there is something for HAProxy; we really want a check based on pg_isready. This is a serious issue if you want to use Postgres in containers, because we really don't want a container listening on two ports.
Also, if we can do the check via the postgres port, then we can implement whatever logic we want on the backend, including checks against etcd and internal postgres status.
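As a rough illustration of a pg_isready-based check, the sketch below shells out to the `pg_isready` utility that ships with PostgreSQL and interprets its exit status (0 accepting connections, 1 rejecting connections, 2 no response, 3 no attempt made). The function names and the treatment of a missing binary are assumptions, and this does not show how HAProxy itself would invoke such a check.

```python
import subprocess

# Exit statuses documented for pg_isready.
PG_ISREADY_STATUS = {
    0: "accepting connections",
    1: "rejecting connections",
    2: "no response",
    3: "no attempt made",
}


def describe_status(returncode):
    """Translate a pg_isready exit status into a readable string."""
    return PG_ISREADY_STATUS.get(returncode, "unknown")


def node_is_ready(host, port=5432, timeout=3):
    """Return True only if pg_isready reports the server accepts connections."""
    try:
        result = subprocess.run(
            ["pg_isready", "-h", host, "-p", str(port), "-t", str(timeout)],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # Assumption: a missing pg_isready binary counts as "not ready".
        return False
    return result.returncode == 0
```

Note that pg_isready only reports whether the server accepts connections. It says nothing about leader or follower role, so any etcd or replication-state logic would still have to live behind the check.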
Parenthetically: At first, the idea of implementing a custom worker for Postgres which implements the leader election portion of Raft is appealing. However, this does not work with binary replication, because without etcd we have nowhere to store status information. And if we're using etcd anyway, we might as well rely on it as a source of truth. Therefore: let's keep governor/etcd.
from governor.
@jberkus
"One problem with the above logic is that this doesn't support ever load-balancing connections to the read replica. However, that seems to be a limitation with any HAProxy-based design if we want automated connection switching, due to an inability to add new backends to HAproxy without restarting. FYI, I plan to instead use Kubernetes networking to handle the load-balancing case"
You can add new backends (modify HAProxy config) with zero-downtime. Reload HAProxy with a little bit of help from iptables. We're using this with great success: https://medium.com/@Drew_Stokes/actual-zero-downtime-with-haproxy-18318578fde6
from governor.
Still seems like a heavy-duty work-around to do something which Kubernetes does as a built-in feature.
from governor.