pulseha's People

Contributors

andrew-loadbalancer, ben-loadbalancer, bencabot, cottonaf, dependabot[bot], matthewcooper, syleron

pulseha's Issues

Network Plugin Support

The networking actions in PulseHA should be extended or replaced by means of plugins for different use cases, such as the AWS and Azure platforms.

Health Check Plugins

Health check plugins are to be used as additional methods of confirming that a peer node is unavailable before a failover is performed.
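
One way this could look is sketched below. The `HealthCheck` interface, the `staticCheck` stand-in, and the rule "fail over only if no plugin can reach the peer" are all assumptions made for illustration, not PulseHA's actual plugin API.

```go
package main

import "fmt"

// HealthCheck is a hypothetical plugin interface; the real plugin API may differ.
type HealthCheck interface {
	Name() string
	// PeerReachable reports whether this check could still reach the peer.
	PeerReachable(addr string) bool
}

// staticCheck is a stand-in plugin that returns a fixed answer.
type staticCheck struct {
	name      string
	reachable bool
}

func (s staticCheck) Name() string              { return s.name }
func (s staticCheck) PeerReachable(string) bool { return s.reachable }

// confirmUnavailable returns true only if no plugin can reach the peer.
// With no plugins registered it returns true, deferring to the primary check.
func confirmUnavailable(addr string, checks []HealthCheck) bool {
	for _, c := range checks {
		if c.PeerReachable(addr) {
			fmt.Printf("%s still reaches %s, skipping failover\n", c.Name(), addr)
			return false
		}
	}
	return true
}

func main() {
	checks := []HealthCheck{
		staticCheck{name: "icmp", reachable: false},
		staticCheck{name: "serial", reachable: true},
	}
	fmt.Println("failover allowed:", confirmUnavailable("192.168.1.200", checks))
}
```

Requiring every plugin to agree is the conservative choice here: a single check that still reaches the peer blocks the failover.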

Memberlist

The memberlist will contain each member of the cluster and other vital information such as its connection details, hostname, and cluster status.
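
A minimal sketch of such a structure follows. The type and field names (`Member`, `Memberlist`, `Upsert`, `Get`) are assumptions chosen to match the description above, not the project's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// Member holds the details named in the issue; the field names are assumptions.
type Member struct {
	Hostname string
	Addr     string // connection details
	Status   string // e.g. "ACTIVE", "PASSIVE", "UNAVAILABLE"
}

// Memberlist is a concurrency-safe collection of members keyed by hostname.
type Memberlist struct {
	mu      sync.RWMutex
	members map[string]Member
}

func NewMemberlist() *Memberlist {
	return &Memberlist{members: make(map[string]Member)}
}

// Upsert adds a member or replaces the existing entry for the same hostname.
func (l *Memberlist) Upsert(m Member) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.members[m.Hostname] = m
}

// Get returns a copy of the member and whether it was found.
func (l *Memberlist) Get(hostname string) (Member, bool) {
	l.mu.RLock()
	defer l.mu.RUnlock()
	m, ok := l.members[hostname]
	return m, ok
}

func main() {
	list := NewMemberlist()
	list.Upsert(Member{Hostname: "pulse0.as.lab", Addr: "192.168.1.200", Status: "PASSIVE"})
	if m, ok := list.Get("pulse0.as.lab"); ok {
		fmt.Println(m.Hostname, m.Addr, m.Status)
	}
}
```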

Global Config

As we are finding that we need to use the config everywhere, it should probably be made global.

Furthermore, to preserve the integrity of the information, a mutex should be used, as multiple threads may attempt to access the config details at the same time.
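
A minimal sketch of a mutex-guarded global config, assuming a `sync.RWMutex` so that concurrent readers do not block each other. The `Config` fields and the `gconf` variable name are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// Config holds illustrative settings; the field names are assumptions.
type Config struct {
	Hostname          string
	FailoverThreshold int
}

// GlobalConfig guards a Config with a read/write mutex.
type GlobalConfig struct {
	mu  sync.RWMutex
	cfg Config
}

// Get returns a copy so callers never hold a reference to shared state.
func (g *GlobalConfig) Get() Config {
	g.mu.RLock()
	defer g.mu.RUnlock()
	return g.cfg
}

// Update applies fn to the config while holding the write lock.
func (g *GlobalConfig) Update(fn func(*Config)) {
	g.mu.Lock()
	defer g.mu.Unlock()
	fn(&g.cfg)
}

// gconf is the single package-level instance shared by all goroutines.
var gconf = &GlobalConfig{}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			gconf.Update(func(c *Config) { c.FailoverThreshold++ })
		}()
	}
	wg.Wait()
	fmt.Println("threshold after concurrent updates:", gconf.Get().FailoverThreshold)
}
```

Returning a copy from `Get` works here because `Config` contains only value fields; a config with maps or slices would need a deep copy.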

E-mail Alerts

Allow for configurable e-mail alerts in specific situations such as a split brain scenario.

Catch the interrupt signal

By catching the interrupt signal we can perform a graceful shutdown. The active node could use this opportunity to inform the others, as could any node for that matter.
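
This could be sketched with the standard `os/signal` package as below. The `listen` and `gracefulShutdown` helpers and the `notifyPeers` hook are hypothetical names; the demo injects a signal itself so that it terminates without waiting for Ctrl+C.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// shutdownSignals lists the signals that should trigger a graceful shutdown.
var shutdownSignals = []os.Signal{os.Interrupt, syscall.SIGTERM}

// listen registers for the shutdown signals and returns the channel they arrive on.
func listen() chan os.Signal {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, shutdownSignals...)
	return sigs
}

// gracefulShutdown blocks until a signal arrives, then calls notifyPeers.
// notifyPeers is a hypothetical hook where a node could inform the cluster.
func gracefulShutdown(sigs <-chan os.Signal, notifyPeers func(reason string)) string {
	s := <-sigs
	reason := "received " + s.String()
	notifyPeers(reason)
	return reason
}

func main() {
	sigs := listen()
	// For demonstration only: inject a signal instead of waiting for Ctrl+C.
	sigs <- os.Interrupt
	gracefulShutdown(sigs, func(reason string) {
		fmt.Println("informing peers before exit:", reason)
	})
}
```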

Catch GRPC errors with custom logger

Look into using a custom logger to catch and format GRPC errors.

Example error:

rpc error: code = Unavailable desc = grpc: the connection is unavailable

With 3 pulse nodes, nodes 1 and 2 swap roles and one becomes unavailable

When adding 2 nodes to a master, giving 3 nodes (pulse0, pulse1, pulse2), pulse1 and pulse2 alternate between ACTIVE and UNAVAILABLE and swap roles.

pulse0 becomes passive


| NODE HOSTNAME | BIND ADDRESS | LATENCY | STATUS | LAST RECEIVED |
| pulse0.as.lab | 192.168.1.200 | 1ms | PASSIVE | Sat, 02 Dec 2017 20:23:57 GMT |
| pulse1.as.lab | | | ACTIVE | |
| pulse2.as.lab | | | | |

Then


| NODE HOSTNAME | BIND ADDRESS | LATENCY | STATUS | LAST RECEIVED |
| pulse0.as.lab | 192.168.1.200 | 1ms | PASSIVE | Sat, 02 Dec 2017 20:24:02 GMT |
| pulse1.as.lab | | | UNAVAILABLE | |
| pulse2.as.lab | | | ACTIVE | |

I also note that I have no bind address. This is only seen on pulse0; pulse1 and pulse2 show as below.

Pulse1


| NODE HOSTNAME | BIND ADDRESS | LATENCY | STATUS | LAST RECEIVED |
| pulse0.as.lab | 192.168.1.200 | 1ms | PASSIVE | Sat, 02 Dec 2017 20:26:06 GMT |
| pulse1.as.lab | | | ACTIVE | |
| pulse2 | | | UNAVAILABLE | |

Pulse2


| NODE HOSTNAME | BIND ADDRESS | LATENCY | STATUS | LAST RECEIVED |
| pulse0.as.lab | 192.168.1.200 | 1ms | PASSIVE | Sat, 02 Dec 2017 20:26:16 GMT |
| pulse1.as.lab | | | UNAVAILABLE | |
| pulse2.as.lab | | | ACTIVE | |

Extend logging

Extend logging to ensure each process is correctly logged and saved to a readable file.

Custom Event Broadcaster

Send/receive custom events which could be used for actions such as restarting processes, executing deploy scripts etc.

This should probably be built into the CLI.

New folder structure

Update PulseHA to reflect the following folder structure:

  • Plugins - /usr/local/lib/pulseha/
  • Config/Certs - /etc/pulseha/
  • Both CLI and Pulse Daemon - /etc/local/sbin/

Reload Command

The ability to reload the PulseHA service via the CLI/GRPC.

I ran pulse leave, not pulseha leave. It would be good not to see this.

Hi Andrew

I made a typo, but the results do not look nice, especially the panic: messages.

[root@pulse2 ~]# pulse leave

   ___       _                  _
  / _ \_   _| |___  ___  /\  /\/_\
 / /_)/ | | | / __|/ _ \/ /_/ //_\\
/ ___/| |_| | \__ \  __/ __  /  _  \  Version v0.0.1-192-g3c01b7f
\/     \__,_|_|___/\___\/ /_/\_/ \_/  Build   3c01b7f

[Dec  4 20:50:58][info] Loading configuration file
[Dec  4 20:50:58][WARNING] TLS Disabled! PulseHA server connection unsecured.
[Dec  4 20:50:58][error] Failed to listen: listen tcp 127.0.0.1:9443: bind: address already in use
panic: runtime error: invalid memory address or nil pointer dereference
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x9965d1]

goroutine 6 [running]:
google.golang.org/grpc.(*Server).Serve.func1(0xc420001c80, 0x0, 0x0)
	/root/go/src/google.golang.org/grpc/server.go:486 +0xb1
panic(0xa41a00, 0xf01b50)
	/usr/lib/golang/src/runtime/panic.go:491 +0x283
google.golang.org/grpc.(*Server).Serve(0xc420001c80, 0x0, 0x0, 0x0, 0x0)
	/root/go/src/google.golang.org/grpc/server.go:495 +0x184
main.(*CLIServer).Setup(0xc420146090)
	/root/go/src/github.com/Syleron/PulseHA/src/cliserver.go:470 +0x183
created by main.main
	/root/go/src/github.com/Syleron/PulseHA/src/main.go:137 +0x1ab
[root@pulse2 ~]# pulseha leave

CLI Leave Cluster

pulseha leave

When performing this action it should do one of the following:

  • In a configured cluster - Remove itself from the cluster and update peers.
  • In an un-configured cluster - Remove itself and clear out the cluster config.

Client:Send(): return the proto type instead of interface{}

Currently the Client:Send() function returns interface{}, error.

Look into whether it is possible to return the proto type itself rather than an interface; otherwise the caller must use a type assertion.

For example:

r, err := client.Send(SendJoin, &proto.PulseJoin{
	Config: buf,
	Hostname: utils.GetHostname(),
})
// Handle an unsuccessful request
if !r.(*proto.PulseJoin).Success {
	log.Emergency("Peer error: %s", err)
	return &proto.PulseJoin{
		Success: false,
		Message: r.(*proto.PulseJoin).Message,
	}, nil
}

Not very pretty!
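
One possible direction, assuming Go 1.18 or later, is a generic wrapper that performs the type assertion once. In the sketch below `PulseJoin` and `send` are local stand-ins for the generated proto type and the current untyped call, and `SendTyped` is a hypothetical name.

```go
package main

import "fmt"

// PulseJoin stands in for the generated proto message; fields mirror the snippet above.
type PulseJoin struct {
	Success bool
	Message string
}

// send mimics the current untyped call: it returns interface{}, error.
func send(req interface{}) (interface{}, error) {
	return &PulseJoin{Success: true, Message: "joined"}, nil
}

// SendTyped wraps send and performs the type assertion in one place,
// so callers receive the concrete response type directly.
func SendTyped[T any](req interface{}) (T, error) {
	var zero T
	r, err := send(req)
	if err != nil {
		return zero, err
	}
	typed, ok := r.(T)
	if !ok {
		return zero, fmt.Errorf("unexpected response type %T", r)
	}
	return typed, nil
}

func main() {
	r, err := SendTyped[*PulseJoin](&PulseJoin{})
	if err != nil {
		fmt.Println("peer error:", err)
		return
	}
	// No assertion needed at the call site.
	fmt.Println(r.Success, r.Message)
}
```

The assertion still happens, but a mismatch now surfaces as an error instead of a panic at each call site.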

CLI Join Cluster

pulseha join ip:port

Join a configured network if communication can be made

Security thoughts:
When over TLS this shouldn't require additional auth as a key is required to connect. However, if TLS is disabled anyone who can access the Pulse instance can join it.

Perhaps we should scrap the disable TLS option haha

CLI Status Command

pulseha status

This should show a table of the currently configured cluster and each member's status.

CLI Sender

At the moment each and every CLI command creates and handles its own GRPC client connection.

Perhaps it would be wise to consider writing a sender so that this logic does not get repeated for every command?

Just a thought.

Fix SIGSEGV upon re-introducing member into cluster

Oct 03 09:59:36 ted pulse[21067]: panic: runtime error: invalid memory address or nil pointer dereference
Oct 03 09:59:36 ted pulse[21067]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xb64c96]
Oct 03 09:59:36 ted pulse[21067]: goroutine 10 [running]:
Oct 03 09:59:36 ted pulse[21067]: main.(*Member).getHostname(0x0, 0x0, 0x0)
Oct 03 09:59:36 ted pulse[21067]: /home/andrew/Projects/src/github.com/Syleron/PulseHA/src/member.go:69 +0x46
Oct 03 09:59:36 ted pulse[21067]: main.(*Member).makeActive(0x0, 0x0)
Oct 03 09:59:36 ted pulse[21067]: /home/andrew/Projects/src/github.com/Syleron/PulseHA/src/member.go:149 +0x51
Oct 03 09:59:36 ted pulse[21067]: main.(*Member).monitorReceivedHCs(0xc420012840, 0xc4200247a8)
Oct 03 09:59:36 ted pulse[21067]: /home/andrew/Projects/src/github.com/Syleron/PulseHA/src/member.go:239 +0x20e
Oct 03 09:59:36 ted pulse[21067]: main.(*Member).(main.monitorReceivedHCs)-fm(0xc420012ae0)
Oct 03 09:59:36 ted pulse[21067]: /home/andrew/Projects/src/github.com/Syleron/PulseHA/src/memberlist.go:153 +0x42
Oct 03 09:59:36 ted pulse[21067]: github.com/Syleron/PulseHA/src/utils.Scheduler(0xc420161030, 0x2540be400)
Oct 03 09:59:36 ted pulse[21067]: /home/andrew/Projects/src/github.com/Syleron/PulseHA/src/utils/utils.go:82 +0xaa
Oct 03 09:59:36 ted pulse[21067]: created by main.(*Memberlist).Setup
Oct 03 09:59:36 ted pulse[21067]: /home/andrew/Projects/src/github.com/Syleron/PulseHA/src/memberlist.go:153 +0x2e2
Oct 03 09:59:36 ted systemd[1]: pulseha.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Oct 03 09:59:36 ted systemd[1]: Unit pulseha.service entered failed state.
Oct 03 09:59:36 ted systemd[1]: pulseha.service failed.

GRPC Replication

GRPC will need to be replicated between nodes to ensure each node's configuration is up to date.

Failover

The failover process or the process of electing a new active appliance upon failure needs to be completed.

Catch panic when unable to bind to port

Handle the failure to bind to the port gracefully. At the moment it just panics.

Sep 21 15:07:03.809 [ ERROR] - Failed to listen: listen tcp 127.0.0.1:9443: bind: address already in use
Sep 21 15:07:03.809 [ INFO] - CLI initialised on 127.0.0.1:9443
panic: runtime error: invalid memory address or nil pointer dereference
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x99ef71]

goroutine 5 [running]:
github.com/Syleron/PulseHA/vendor/google.golang.org/grpc.(*Server).Serve.func1(0xc4200de500, 0x0, 0x0)
/home/ben/go/src/github.com/Syleron/PulseHA/vendor/google.golang.org/grpc/server.go:436 +0xb1
panic(0xa301a0, 0xee6b10)
/usr/local/go/src/runtime/panic.go:491 +0x283
github.com/Syleron/PulseHA/vendor/google.golang.org/grpc.(*Server).Serve(0xc4200de500, 0x0, 0x0, 0x0, 0x0)
/home/ben/go/src/github.com/Syleron/PulseHA/vendor/google.golang.org/grpc/server.go:445 +0x184
main.(*Server).SetupCLI(0xc4200cc0c0)
/home/ben/go/src/github.com/Syleron/PulseHA/src/server.go:438 +0x183
created by main.main
/home/ben/go/src/github.com/Syleron/PulseHA/src/main.go:69 +0x1f0
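
The trace suggests that the server was started even though listening had failed, since Serve appears to receive a nil listener. A minimal sketch of the guard, using only the standard `net` package, is shown below; `bindCLI` is a hypothetical helper name and the real setup code may be structured differently.

```go
package main

import (
	"fmt"
	"net"
)

// bindCLI returns an error instead of handing a nil listener to the server.
func bindCLI(addr string) (net.Listener, error) {
	lis, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, fmt.Errorf("cannot bind to %s: %w", addr, err)
	}
	return lis, nil
}

func main() {
	// Port 0 lets the OS pick a free port for the demonstration.
	first, err := bindCLI("127.0.0.1:0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer first.Close()

	// A second bind to the same address fails; report it and stop
	// before any server is started on a nil listener.
	if _, err := bindCLI(first.Addr().String()); err != nil {
		fmt.Println("error:", err)
		return
	}
}
```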

Separate Protos

The proto file is getting pretty full now. Perhaps we should split it out into multiple files!

PulseCLI

Develop a CLI which would allow for easier configuration, and maintenance of a Pulse cluster.

Overall Documentation

There is documentation here and there but there is still a lot missing. This needs a good read through and finishing.

Function Tests

There are currently no test functions, which is... terrible.

We need to get tests written for Pulse!

Any node can join any node

As per the title, any node can join. I would think this is a security issue, as any node can join and possibly hijack the network resources.

IPv6 Support

Currently, Pulse only supports IPv4. IPv6 support needs to be added.

Configurable options

Define configurable options within the config that will allow a user to customise their setup including failover threshold etc.

Overall Service Status

An overall service status that can be requested, for example "Active", "Passive", "Failing over" etc.
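
As a sketch, the status could be modelled as a small enumerated type with a string form. The type and constant names below are assumptions; only the three labels come from the text above.

```go
package main

import "fmt"

// ServiceStatus is a hypothetical enumeration of the overall service state.
type ServiceStatus int

const (
	StatusPassive ServiceStatus = iota
	StatusActive
	StatusFailingOver
)

// String returns the human-readable label for the status.
func (s ServiceStatus) String() string {
	switch s {
	case StatusPassive:
		return "Passive"
	case StatusActive:
		return "Active"
	case StatusFailingOver:
		return "Failing over"
	default:
		return "Unknown"
	}
}

func main() {
	// fmt uses the String method automatically.
	fmt.Println("service status:", StatusFailingOver)
}
```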

Module System

Integrate a module system to extend core functionality by means of plugins.

The current alternate health checks should be removed and converted into plugins.

Networking Package

Move the networking out into its own package.

Networking needs to be extended to allow for plugins to instruct how network changes should be made.

Promote/Demote Node

The ability to promote a node within the cluster.

When promoting a specific node, it should let the other nodes in the cluster know which node is the new active node. In doing so, the old active node should then demote itself.
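
The state change itself can be sketched as follows, leaving out the network step of informing the peers. `Node` and `promote` are hypothetical names used only for this illustration.

```go
package main

import "fmt"

// Node is a simplified cluster member; the real member type will carry more state.
type Node struct {
	Hostname string
	Active   bool
}

// promote makes the target node active and demotes every other node.
// It returns an error, and changes nothing, if the target is not in the cluster.
func promote(nodes []*Node, target string) error {
	var found *Node
	for _, n := range nodes {
		if n.Hostname == target {
			found = n
			break
		}
	}
	if found == nil {
		return fmt.Errorf("unknown node %q", target)
	}
	for _, n := range nodes {
		n.Active = n == found
	}
	return nil
}

func main() {
	cluster := []*Node{
		{Hostname: "pulse0", Active: true},
		{Hostname: "pulse1"},
	}
	if err := promote(cluster, "pulse1"); err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, n := range cluster {
		fmt.Println(n.Hostname, "active:", n.Active)
	}
}
```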

Config Validation

Config validation that ensures defined items in the config are what PulseHA expects on load.

Additionally, it should check that the hostname and network interfaces exist.
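
A minimal sketch of such a check using the standard library is shown below. The `NodeConfig` type is hypothetical, and treating "the hostname exists" as "the configured hostname matches the local machine's hostname" is an assumption about the intended behaviour.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
)

// NodeConfig is a hypothetical subset of the config relevant to validation.
type NodeConfig struct {
	Hostname  string
	Interface string
}

// validate checks that the hostname matches this machine (an assumed rule)
// and that the named network interface exists.
func validate(cfg NodeConfig) error {
	if cfg.Hostname == "" {
		return errors.New("hostname must not be empty")
	}
	local, err := os.Hostname()
	if err != nil {
		return fmt.Errorf("cannot read local hostname: %w", err)
	}
	if cfg.Hostname != local {
		return fmt.Errorf("hostname %q does not match local hostname %q", cfg.Hostname, local)
	}
	if _, err := net.InterfaceByName(cfg.Interface); err != nil {
		return fmt.Errorf("interface %q: %w", cfg.Interface, err)
	}
	return nil
}

func main() {
	host, _ := os.Hostname()
	// "eth0" is a placeholder; the interface name depends on the machine.
	cfg := NodeConfig{Hostname: host, Interface: "eth0"}
	if err := validate(cfg); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("config OK")
}
```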
