consul-cluster-manager's Introduction

consul-cluster-manager's People

Contributors

n0mer, romalev

consul-cluster-manager's Issues

Load balancing is not working

Round-robin load balancing does not seem to work between two subscribers (subs) of the same event bus address. This has to be addressed.
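
For reference, the expected behavior can be sketched as a cursor that cycles over the subscribers of one address. This is only an illustration in plain Java and not the cluster manager's code: the class RoundRobinSubs and its methods are hypothetical names, and node IDs are modeled as plain strings.

```java
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: picks the subscribers of one event bus address in round-robin order. */
final class RoundRobinSubs {
    private final CopyOnWriteArrayList<String> nodeIds = new CopyOnWriteArrayList<>();
    private final AtomicInteger cursor = new AtomicInteger();

    /** Registers a subscriber node; registering the same node twice has no effect. */
    void add(String nodeId) {
        nodeIds.addIfAbsent(nodeId);
    }

    void remove(String nodeId) {
        nodeIds.remove(nodeId);
    }

    /** Returns the next subscriber in turn, or null when nobody is subscribed. */
    String next() {
        Object[] snapshot = nodeIds.toArray();
        if (snapshot.length == 0) {
            return null;
        }
        // floorMod keeps the index valid even after the cursor overflows to a negative value.
        int index = Math.floorMod(cursor.getAndIncrement(), snapshot.length);
        return (String) snapshot[index];
    }
}
```

With two subs registered, consecutive calls to next() should alternate between them. If every message lands on the same node instead, the selection state is probably being recreated per send or is not shared per address.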

Service's ports overlapping problem

A failing node (service) stays registered in Consul for one minute after its health check has failed. If a new node joins the cluster during that window and uses the same port as the failing node, the failing node could in theory be treated as available again once the new node is registered on that port.

This behavior should be prevented.

Performance & functional testing. Getting ready for the prod release.

See: https://github.com/romalev/vertx-consul-cluster-manager/milestone/2

Goes to the first prod release:

  • CM configuration: right now it looks a bit messy, so come up with a proper way to configure the CM.
  • Subs are properly load balanced.
  • Subs are ephemeral.
  • The hainfo map is NOT ephemeral.
  • Verify the behavior when a node leaves the cluster.
  • Verify the behavior when a node joins the cluster.
  • Distributed lock and distributed counter behavior.
  • TCKs against this CM implementation must pass.
  • HA.
  • Run findbugs against this CM implementation.
  • Run the CM within a Docker image.
  • Logging.
  • Perform some cleanup once the manager is stable.
  • Perform load testing on the cluster manager.
  • Code review to be performed by Vert.x experts.

Goes to the second prod release (to be discussed with Vert.x experts):

  • Restore node's caching data in Consul. Suppose the Consul agent was down for some time (meaning the data it was holding may have been lost) and is now up and running again. Verify whether the data in the local caches is consistent with the data in the Consul agent's KV store.

HA

HA must be working.

Host & port assignment.

Check whether the current implementation of dynamic host & port assignment is correct. See how the host and port get assigned for event bus subscribers.

Nodes de-registration process.

Once a node is down, all of its data in Consul has to be cleaned up (the service itself plus the appropriate distributed map KV pair(s)).

Consider:

  • shutdown hooks;
  • tcp consul checks;
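
The shutdown-hook option could look roughly like the sketch below. It is only an outline under stated assumptions: ConsulCleanup is a hypothetical stand-in for whatever Consul calls remove the service and the node's KV pairs, and NodeDeregistration is an illustrative name rather than a class from this project.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the Consul calls that drop a node's service and KV entries. */
interface ConsulCleanup {
    void deleteKeysOwnedBy(String nodeId);
    void deregisterService(String nodeId);
}

final class NodeDeregistration {
    private final String nodeId;
    private final ConsulCleanup consul;
    private final AtomicBoolean done = new AtomicBoolean();

    NodeDeregistration(String nodeId, ConsulCleanup consul) {
        this.nodeId = nodeId;
        this.consul = consul;
    }

    /** Runs the cleanup at most once, whether it is triggered explicitly or by the JVM hook. */
    void run() {
        if (!done.compareAndSet(false, true)) {
            return;
        }
        consul.deleteKeysOwnedBy(nodeId);
        consul.deregisterService(nodeId);
    }

    /** Registers the cleanup as a JVM shutdown hook and returns the hook thread. */
    Thread installShutdownHook() {
        Thread hook = new Thread(this::run, "consul-deregistration-" + nodeId);
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

A shutdown hook does not run when the process is killed abruptly, so a Consul-side health check would still be needed as a fallback for that case.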

Failed to remove sub - check this.

2018-10-09 09:52:03 [vert.x-eventloop-thread-5] ERROR i.v.c.e.i.c.ClusteredEventBus - Failed to remove sub
java.lang.IllegalStateException: Result is already complete: succeeded
at io.vertx.core.impl.FutureImpl.complete(FutureImpl.java:87)
at io.vertx.spi.cluster.consul.impl.ConsulAsyncMultiMap.lambda$remove$8(ConsulAsyncMultiMap.java:94)
at io.vertx.core.Future.lambda$compose$1(Future.java:265)

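This seems to be a bug: according to the stack trace, the future in ConsulAsyncMultiMap.remove is completed again after it has already succeeded.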

CAS, Transactions, entry locking while entry is being read.

The current implementation of this cluster manager contains a lot of code that first reads an entry and then updates or removes it. This is a sort of CAS operation, but one that is not synchronized by the Consul agent.

Imagine a scenario where Alice reads an entry in order to replace (or remove or update) it, and in the meantime Bob updates the same entry. Bob's write takes place before Alice proceeds, which means the value Alice read is no longer consistent with the central storage. Question: do we allow Alice's write to take place anyway, or should we come up with something like

  • CAS;
  • entry locking;
  • applying Consul transactions;

to abort Alice's write? This has to be investigated.
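
To make the CAS option concrete, here is a minimal in-memory model of check-and-set writes. It is not a Consul client. It only mirrors the idea behind the cas parameter of Consul's KV API, where a write succeeds only if the entry's modify index is unchanged since it was read. VersionedKv, Entry, and putCas are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory model of a KV store with check-and-set writes; not a Consul client. */
final class VersionedKv {
    static final class Entry {
        final String value;
        final long modifyIndex;

        Entry(String value, long modifyIndex) {
            this.value = value;
            this.modifyIndex = modifyIndex;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private long nextIndex = 1;

    synchronized Entry get(String key) {
        return entries.get(key);
    }

    /** Unconditional write: always wins and bumps the modify index. */
    synchronized void put(String key, String value) {
        entries.put(key, new Entry(value, nextIndex++));
    }

    /**
     * Writes only if the entry's index still equals expectedIndex.
     * An expectedIndex of 0 means "only if the key does not exist yet".
     */
    synchronized boolean putCas(String key, String value, long expectedIndex) {
        Entry current = entries.get(key);
        long currentIndex = current == null ? 0 : current.modifyIndex;
        if (currentIndex != expectedIndex) {
            return false; // somebody else wrote in between; the caller must re-read
        }
        entries.put(key, new Entry(value, nextIndex++));
        return true;
    }
}
```

In the Alice and Bob scenario, Alice would pass the index she read to putCas. Because Bob's write bumped the index, her write is rejected and she has to re-read before retrying.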

Design and implement custom TTL handler.

Given the following Consul restriction:

The TTL value (on entries) must currently be between 10s and 86400s. The invalidation time is twice the TTL, which means an entry is actually removed (expired) after double the TTL you specify.

A custom handler is needed in order to satisfy the Vert.x cluster manager SPI.
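
One possible shape for such a handler is local bookkeeping of deadlines, so that TTLs shorter than 10s (or with exact expiry) do not depend on Consul sessions. The sketch below is an assumption about how this could be done, not the project's design. TtlTracker and its methods are hypothetical, and the clock is injected so the logic can be exercised without waiting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical TTL bookkeeping kept by the cluster manager itself, outside Consul sessions. */
final class TtlTracker {
    private final Map<String, Long> deadlines = new ConcurrentHashMap<>();
    private final LongSupplier clockMillis;

    TtlTracker(LongSupplier clockMillis) {
        this.clockMillis = clockMillis;
    }

    /** Records that key expires ttlMillis from now; a later call replaces the old deadline. */
    void track(String key, long ttlMillis) {
        deadlines.put(key, clockMillis.getAsLong() + ttlMillis);
    }

    /** Removes and returns every key whose deadline has passed; the caller deletes them from the KV store. */
    List<String> drainExpired() {
        long now = clockMillis.getAsLong();
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> e : deadlines.entrySet()) {
            // remove(key, value) only succeeds if the deadline was not refreshed concurrently.
            if (e.getValue() <= now && deadlines.remove(e.getKey(), e.getValue())) {
                expired.add(e.getKey());
            }
        }
        return expired;
    }
}
```

In a running node, drainExpired() would presumably be called from a periodic timer, with System::currentTimeMillis as the clock, and each returned key deleted from the Consul KV store.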

Develop distributed counter.

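import io.vertx.core.AsyncResult;
import io.vertx.core.Handler;
import io.vertx.core.shareddata.Counter;
import io.vertx.core.spi.cluster.ClusterManager;

public class ConsulClusterManager implements ClusterManager {
    @Override
    public void getCounter(String name, Handler<AsyncResult<Counter>> resultHandler) {
        // Has to be implemented.
    }
    // Remaining ClusterManager methods are omitted here.
}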
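
One way a counter could be built on top of a KV store is a read, compute, compare-and-set retry loop. The sketch below is synchronous and in-memory for brevity, whereas the Vert.x Counter interface is asynchronous. CasCell and CasCounter are hypothetical names, and CasCell stands in for a single KV entry that supports compare-and-set.

```java
/** Hypothetical stand-in for one KV entry that supports compare-and-set. */
interface CasCell {
    long read();
    boolean compareAndSet(long expected, long updated);
}

/** Counter that never loses an update as long as the cell's compareAndSet is atomic. */
final class CasCounter {
    private final CasCell cell;

    CasCounter(CasCell cell) {
        this.cell = cell;
    }

    long get() {
        return cell.read();
    }

    long addAndGet(long delta) {
        while (true) {
            long current = cell.read();
            long next = current + delta;
            if (cell.compareAndSet(current, next)) {
                return next;
            }
            // Another node changed the value first; re-read and try again.
        }
    }

    long incrementAndGet() {
        return addAndGet(1);
    }
}
```

The retry loop is what makes concurrent increments from several nodes safe: a write based on a stale read is rejected rather than silently overwriting the other node's update.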

Develop lock with timeout.

@Override
public void getLockWithTimeout(String name, long timeout, Handler<AsyncResult<Lock>> resultHandler) {
    // Has to be implemented.
}
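
The timeout semantics can be sketched as repeated acquire attempts until a deadline passes. The model below is in-memory and blocking, which is only suitable for illustration. A real implementation would presumably use Consul's session-based KV acquire and a non-blocking timer instead of Thread.sleep. TimedLocks and its methods are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** In-memory model of acquiring a named lock with a deadline; not a Consul client. */
final class TimedLocks {
    private final ConcurrentMap<String, String> holders = new ConcurrentHashMap<>();

    /** Single attempt: succeeds only if nobody currently holds the lock. */
    boolean tryAcquire(String name, String owner) {
        return holders.putIfAbsent(name, owner) == null;
    }

    /** Releases the lock only if it is held by the given owner. */
    void release(String name, String owner) {
        holders.remove(name, owner);
    }

    /** Polls tryAcquire every pollMillis (must be positive) until it succeeds or timeoutMillis elapses. */
    boolean acquire(String name, String owner, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (tryAcquire(name, owner)) {
                return true;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false; // timed out without getting the lock
            }
            try {
                Thread.sleep(Math.min(pollMillis, Math.max(1, remainingNanos / 1_000_000L)));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

When the timeout elapses, the result handler would be failed rather than completed with a lock, so callers can distinguish "busy" from "acquired".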

Managing sessions.

Session management within this CM implementation currently happens in two places:

  • in the Consul abstract map;
  • in the Consul cluster manager.

Define a way to have a single SessionManager.

Session to be re-created on already existing ttl entries

Imagine a scenario where a map entry with a TTL is already present in the Consul KV store, and a Vert.x node is about to update that entry with a new TTL. Currently a new session gets created, but it is not bound to the existing entry whose TTL we want to update.

TTL on map entries.

The AsyncMap contract includes a TTL on entries. This has to be addressed.

Bugs

  • Perform load testing on the cluster manager.
  • Code review to be performed by Vert.x experts.
  • Restore node's caching data in Consul.
  • #78
  • #42

Restore node's caching data in Consul.

Suppose the Consul agent was down for some time (meaning the data it was holding may have been lost) and is now up and running again. Verify whether the data in the local caches is consistent with the data in the Consul agent's KV store.
