syleron / pulseha
PulseHA is an active-passive high availability cluster daemon that uses gRPC and is written in Go.
Home Page: http://www.pulseha.com/
License: GNU Affero General Public License v3.0
The networking actions in PulseHA should be extended or replaced by means of plugins for different use cases, such as the AWS and Azure platforms.
Health Check plugins are to be used as additional methods of ensuring that a peer node is unavailable before a failover is to be performed.
The memberlist will contain each member of the cluster and other vital information such as its connection details, hostname, and cluster status.
As we are finding that we need to use the config everywhere, it should probably be made a global.
Furthermore, to preserve the integrity of the information, a mutex should guard it, as multiple threads may attempt to access the config details at once.
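A minimal sketch of what a mutex-guarded global config could look like. The `Config` fields, the `configStore` type, and the `global` variable are hypothetical names chosen for illustration and are not taken from the PulseHA source.

```go
package main

import (
	"fmt"
	"sync"
)

// Config holds a few illustrative fields; the real PulseHA config is larger.
type Config struct {
	Hostname string
	LogLevel string
}

// configStore guards a Config with a read/write mutex.
type configStore struct {
	mu  sync.RWMutex
	cfg Config
}

// global is the single shared instance (hypothetical name).
var global = &configStore{}

// Get returns a copy so callers never hold a reference to shared state.
func (s *configStore) Get() Config {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.cfg
}

// Update applies fn to the config while holding the write lock.
func (s *configStore) Update(fn func(*Config)) {
	s.mu.Lock()
	defer s.mu.Unlock()
	fn(&s.cfg)
}

func main() {
	var wg sync.WaitGroup
	// Several goroutines read and write the config concurrently.
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			global.Update(func(c *Config) { c.Hostname = fmt.Sprintf("pulse%d", n) })
			_ = global.Get()
		}(i)
	}
	wg.Wait()
	fmt.Println("hostname set to:", global.Get().Hostname)
}
```

Returning a copy from `Get` keeps readers from mutating shared state by accident; this only holds while the struct contains no slices, maps, or pointers.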
PulseHA should be able to generate and use an SSL cert for security.
Allow for configurable e-mail alerts in specific situations such as a split brain scenario.
The CLI should have an option to generate a new cert.
By catching the interrupt signal we can perform a graceful shutdown. The active node could use this opportunity to inform the others? Or any node, for that matter.
Look into using a custom logger to catch and format GRPC errors.
Example error:
rpc error: code = Unavailable desc = grpc: the connection is unavailable
An attempt to assign a floating IP group from an interface fails with an error saying that it doesn't exist.
When adding 2 nodes to a master, giving 3 nodes (pulse0, pulse1, pulse2), pulse1 and pulse2 go ACTIVE/UNAVAILABLE and swap roles.
pulse0 becomes passive
| pulse1.as.lab | | | ACTIVE | |
Then
I also note I have no bind address. Also, this is only seen on pulse0; pulse1 and pulse2 show as below.
Pulse1
Pulse2
Extend logging to ensure each process is correctly logged and saved to a readable file.
Send/receive custom events which could be used for actions such as restarting processes, executing deploy scripts etc.
This should probably be built into the CLI.
A new config option needs to be added to prioritize who will always become the master in regards to a failover.
Update PulseHA to reflect the following folder structure:
The ability to reload the PulseHA service via the CLI/GRPC
When adding/removing IPs via the CLI, ensure that a subnet is specified.
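A small sketch of how such a check could look using `net.ParseCIDR` from the standard library. The function name `validateFloatingIP` and the error wording are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// validateFloatingIP rejects addresses that lack an explicit subnet prefix
// or are not valid CIDR notation (hypothetical helper).
func validateFloatingIP(addr string) error {
	if !strings.Contains(addr, "/") {
		return fmt.Errorf("%q has no subnet; expected CIDR notation such as 10.0.0.5/24", addr)
	}
	if _, _, err := net.ParseCIDR(addr); err != nil {
		return fmt.Errorf("%q is not a valid CIDR address: %v", addr, err)
	}
	return nil
}

func main() {
	for _, addr := range []string{"10.0.0.5/24", "10.0.0.5", "10.0.0.5/99"} {
		if err := validateFloatingIP(addr); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", addr)
	}
}
```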
Hi Andrew
I made a typo, but the results do not look nice, especially the panic: messages.
[root@pulse2 ~]# pulse leave
___ _ _
/ _ \_ _| |___ ___ /\ /\/_\
/ /_)/ | | | / __|/ _ \/ /_/ //_\\
/ ___/| |_| | \__ \ __/ __ / _ \ Version v0.0.1-192-g3c01b7f
\/ \__,_|_|___/\___\/ /_/\_/ \_/ Build 3c01b7f
[Dec 4 20:50:58][info] Loading configuration file
[Dec 4 20:50:58][WARNING] TLS Disabled! PulseHA server connection unsecured.
[Dec 4 20:50:58][error] Failed to listen: listen tcp 127.0.0.1:9443: bind: address already in use
panic: runtime error: invalid memory address or nil pointer dereference
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x9965d1]
goroutine 6 [running]:
google.golang.org/grpc.(*Server).Serve.func1(0xc420001c80, 0x0, 0x0)
/root/go/src/google.golang.org/grpc/server.go:486 +0xb1
panic(0xa41a00, 0xf01b50)
/usr/lib/golang/src/runtime/panic.go:491 +0x283
google.golang.org/grpc.(*Server).Serve(0xc420001c80, 0x0, 0x0, 0x0, 0x0)
/root/go/src/google.golang.org/grpc/server.go:495 +0x184
main.(*CLIServer).Setup(0xc420146090)
/root/go/src/github.com/Syleron/PulseHA/src/cliserver.go:470 +0x183
created by main.main
/root/go/src/github.com/Syleron/PulseHA/src/main.go:137 +0x1ab
[root@pulse2 ~]# pulseha leave
pulseha leave
When performing this action it should do one of the following:
Currently the Client:Send() function returns interface{}, error.
Look into whether it is possible to return the proto type itself rather than an interface; otherwise the caller must use a type assertion.
For example:
r, err := client.Send(SendJoin, &proto.PulseJoin{
	Config:   buf,
	Hostname: utils.GetHostname(),
})
// Handle an unsuccessful request
if !r.(*proto.PulseJoin).Success {
	log.Emergency("Peer error: %s", err)
	return &proto.PulseJoin{
		Success: false,
		Message: r.(*proto.PulseJoin).Message,
	}, nil
}
Not very pretty!
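One possible direction, assuming Go 1.18 or later, is a generic wrapper that performs the type assertion once. Everything in this sketch is hypothetical: `PulseJoin` stands in for the generated proto message, `send` mimics the untyped signature, and `sendTyped` is an invented name. It also assumes the response has the same type as the request, as in the example above.

```go
package main

import (
	"errors"
	"fmt"
)

// PulseJoin stands in for the generated proto.PulseJoin message.
type PulseJoin struct {
	Success bool
	Message string
}

// send mimics the current untyped signature: it returns interface{}, error.
func send(req interface{}) (interface{}, error) {
	switch req.(type) {
	case *PulseJoin:
		return &PulseJoin{Success: true, Message: "joined"}, nil
	default:
		return nil, errors.New("unsupported request type")
	}
}

// sendTyped wraps send and performs the type assertion in one place,
// so callers receive the concrete type directly.
func sendTyped[T any](req T) (T, error) {
	var zero T
	r, err := send(req)
	if err != nil {
		return zero, err
	}
	typed, ok := r.(T)
	if !ok {
		return zero, fmt.Errorf("unexpected response type %T", r)
	}
	return typed, nil
}

func main() {
	r, err := sendTyped(&PulseJoin{})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// No type assertion needed at the call site.
	fmt.Println(r.Success, r.Message)
}
```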
The CLI should be able to generate an empty, working config in case someone balls theirs up.
pulseha join ip:port
Join a configured network if communication can be made
Security thoughts:
When over TLS this shouldn't require additional auth as a key is required to connect. However, if TLS is disabled anyone who can access the Pulse instance can join it.
Perhaps we should scrap the disable TLS option haha
pulseha status
This should show a table of the currently configured cluster and each member's status.
At the moment each and every CLI command creates and handles its own GRPC client connection.
Perhaps it would be wise to consider writing a sender so that this logic does not get repeated for every command?
Just a thought.
Oct 03 09:59:36 ted pulse[21067]: panic: runtime error: invalid memory address or nil pointer dereference
Oct 03 09:59:36 ted pulse[21067]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xb64c96]
Oct 03 09:59:36 ted pulse[21067]: goroutine 10 [running]:
Oct 03 09:59:36 ted pulse[21067]: main.(*Member).getHostname(0x0, 0x0, 0x0)
Oct 03 09:59:36 ted pulse[21067]: /home/andrew/Projects/src/github.com/Syleron/PulseHA/src/member.go:69 +0x46
Oct 03 09:59:36 ted pulse[21067]: main.(*Member).makeActive(0x0, 0x0)
Oct 03 09:59:36 ted pulse[21067]: /home/andrew/Projects/src/github.com/Syleron/PulseHA/src/member.go:149 +0x51
Oct 03 09:59:36 ted pulse[21067]: main.(*Member).monitorReceivedHCs(0xc420012840, 0xc4200247a8)
Oct 03 09:59:36 ted pulse[21067]: /home/andrew/Projects/src/github.com/Syleron/PulseHA/src/member.go:239 +0x20e
Oct 03 09:59:36 ted pulse[21067]: main.(*Member).(main.monitorReceivedHCs)-fm(0xc420012ae0)
Oct 03 09:59:36 ted pulse[21067]: /home/andrew/Projects/src/github.com/Syleron/PulseHA/src/memberlist.go:153 +0x42
Oct 03 09:59:36 ted pulse[21067]: github.com/Syleron/PulseHA/src/utils.Scheduler(0xc420161030, 0x2540be400)
Oct 03 09:59:36 ted pulse[21067]: /home/andrew/Projects/src/github.com/Syleron/PulseHA/src/utils/utils.go:82 +0xaa
Oct 03 09:59:36 ted pulse[21067]: created by main.(*Memberlist).Setup
Oct 03 09:59:36 ted pulse[21067]: /home/andrew/Projects/src/github.com/Syleron/PulseHA/src/memberlist.go:153 +0x2e2
Oct 03 09:59:36 ted systemd[1]: pulseha.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Oct 03 09:59:36 ted systemd[1]: Unit pulseha.service entered failed state.
Oct 03 09:59:36 ted systemd[1]: pulseha.service failed.
GRPC will need to be replicated between nodes to ensure each node's configuration is up to date.
The core GRPC Health Check scheduler needs to be written.
The failover process or the process of electing a new active appliance upon failure needs to be completed.
Nicely handle the case where it cannot bind to the port; at the moment it just panics.
`Sep 21 15:07:03.809 [ ERROR] - Failed to listen: listen tcp 127.0.0.1:9443: bind: address already in use
Sep 21 15:07:03.809 [ INFO] - CLI initialised on 127.0.0.1:9443
panic: runtime error: invalid memory address or nil pointer dereference
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x99ef71]
goroutine 5 [running]:
github.com/Syleron/PulseHA/vendor/google.golang.org/grpc.(*Server).Serve.func1(0xc4200de500, 0x0, 0x0)
/home/ben/go/src/github.com/Syleron/PulseHA/vendor/google.golang.org/grpc/server.go:436 +0xb1
panic(0xa301a0, 0xee6b10)
/usr/local/go/src/runtime/panic.go:491 +0x283
github.com/Syleron/PulseHA/vendor/google.golang.org/grpc.(*Server).Serve(0xc4200de500, 0x0, 0x0, 0x0, 0x0)
/home/ben/go/src/github.com/Syleron/PulseHA/vendor/google.golang.org/grpc/server.go:445 +0x184
main.(*Server).SetupCLI(0xc4200cc0c0)
/home/ben/go/src/github.com/Syleron/PulseHA/src/server.go:438 +0x183
created by main.main
/home/ben/go/src/github.com/Syleron/PulseHA/src/main.go:69 +0x1f0
`
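In the traces above, the listen error is logged and `Serve` then appears to be called with a nil listener. A minimal sketch of returning the bind error to the caller instead, using only `net.Listen`; the `listen` wrapper is a hypothetical name, and a real fix would also skip the gRPC `Serve` call when it fails.

```go
package main

import (
	"fmt"
	"net"
)

// listen wraps net.Listen so that a bind failure is returned to the caller
// instead of execution continuing with a nil listener.
func listen(addr string) (net.Listener, error) {
	lis, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, fmt.Errorf("failed to listen on %s: %w", addr, err)
	}
	return lis, nil
}

func main() {
	// Port 0 asks the OS for any free port.
	first, err := listen("127.0.0.1:0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer first.Close()

	// Binding the same address again fails; report it and stop rather than serve.
	if _, err := listen(first.Addr().String()); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("unexpectedly bound the same address twice")
}
```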
When you add a duplicate IP via the CLI, the server logs this and does not add the IP, but the message is not returned to the CLI.
The proto file is getting pretty full now. Perhaps we should split it out into multiple files!
Add the logging level into config so that production systems do not see all of the debug logs.
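A sketch of a level filter driven by a config string. The level names, `parseLevel`, and the `Logger` type are illustrative assumptions; PulseHA's actual logger and config key are not shown in these issues.

```go
package main

import (
	"fmt"
	"strings"
)

// Level orders log severities from most to least verbose.
type Level int

const (
	Debug Level = iota
	Info
	Warning
	Error
)

// parseLevel maps a config string to a Level.
func parseLevel(s string) (Level, error) {
	switch strings.ToLower(s) {
	case "debug":
		return Debug, nil
	case "info":
		return Info, nil
	case "warning":
		return Warning, nil
	case "error":
		return Error, nil
	}
	return Info, fmt.Errorf("unknown log level %q", s)
}

// Logger drops messages below its minimum level.
type Logger struct{ min Level }

func (l Logger) enabled(lvl Level) bool { return lvl >= l.min }

// Log prints msg only when lvl meets the configured minimum.
func (l Logger) Log(lvl Level, msg string) {
	if l.enabled(lvl) {
		fmt.Println(msg)
	}
}

func main() {
	lvl, err := parseLevel("warning") // value would come from the config
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	log := Logger{min: lvl}
	log.Log(Debug, "debug: hidden on production systems")
	log.Log(Error, "error: always shown")
}
```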
Develop a CLI which would allow for easier configuration, and maintenance of a Pulse cluster.
pulseha groups -node=compiler.as.lab -name=group3 -iface=lo unassign
[x] interface does not exist
There is documentation here and there but there is still a lot missing. This needs a good read through and finishing.
Hostnames as a unique identifier for a node seemed like a good idea at the time, but ultimately I believe UUIDs are the way to go.
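For reference, a random (version 4) UUID can be produced with only the standard library, as sketched below. The `newUUID` helper is a hypothetical name; a maintained UUID package would be an equally reasonable choice.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newUUID returns a random (version 4) UUID string as laid out in RFC 4122.
func newUUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	id, err := newUUID()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("node id:", id)
}
```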
CLI command to assign a new active node within the cluster.
Otherwise the IPs do not get brought up until a service restart.
There are currently no test functions which is.. terrible.
We need to get tests written for Pulse!
As per the title, any node can join. I would think this is a security issue, as any node can join and possibly hijack the network resources.
Currently, Pulse only supports IPv4; IPv6 support needs to be added.
Define configurable options within the config that will allow a user to customise their setup including failover threshold etc.
A temporary delay was used to resolve this issue. However, this needs to be properly handled/resolved as the current solution is unacceptable.
When you assign/unassign a floating IP group it currently does not bring up/down the floating IP addresses
An overall service status that can be requested. For example, "Active", "Passive", "Failing over" etc.
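One simple way to model such a status in Go is an enumerated type with a `String` method. The type and constant names below are hypothetical; only the three display strings come from the request above.

```go
package main

import "fmt"

// Status is the overall state of the PulseHA service (illustrative type).
type Status int

const (
	StatusActive Status = iota
	StatusPassive
	StatusFailingOver
)

// String returns the human-readable form shown to a user.
func (s Status) String() string {
	switch s {
	case StatusActive:
		return "Active"
	case StatusPassive:
		return "Passive"
	case StatusFailingOver:
		return "Failing over"
	}
	return "Unknown"
}

func main() {
	current := StatusFailingOver
	fmt.Println("service status:", current)
}
```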
Hey, I was installing PulseHA on FC26
All good! However I want to "make install DESTDIR=/otherroot"
Can you make that possible? It makes it easier to move a package around.
Integrate module system to extend core functionality through the means of plugins.
The current alternate health checks should be removed and converted into plugins.
Move the networking out into its own package.
Networking needs to be extended to allow for plugins to instruct how network changes should be made.
The ability to promote a node within the cluster.
When promoting a specific node, it should let the other nodes in the cluster know who the new active node is. In doing so, the old active node should then demote itself.
Config validation that ensures defined items in the config are what PulseHA expects on load.
Additionally, it should check that the hostname and network interfaces exist.