Comments (10)

AndrewDryga commented on June 2, 2024

I think you would have the same issue even if you used a Deployment; the new pods would crash-loop anyway.

The way we solved this: we have our own implementation of the Kubernetes strategy that polls the k8s API and only joins nodes of the same version (taken from the version label) into a cluster. This makes it impossible to hand off state between application versions, but it makes sure that code that was never tested to co-live in a cluster does not end up crashing in production.

from libcluster.

amacciola commented on June 2, 2024

@AndrewDryga do you think this custom k8s strategy is worth a PR, or can it be shared? I wonder why more people are not running into this same issue. Is everyone else who uses this library only ever deploying libcluster once and then never adding new features to its registry from that point forward?

AndrewDryga commented on June 2, 2024

@amacciola the problem with our strategy is that it is very opinionated (uses specific labels named for our environment, uses node names from k8s labels, etc.). I will think about open-sourcing it, but it's a relatively easy change: just leverage labelSelector in the get_nodes/4 callback implementation and query only the pods of a specific version.
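
A minimal sketch of the labelSelector idea, assuming the pods carry a label named `version`. The module `VersionedSelector`, its function names, and the sample values are hypothetical and not part of libcluster. In a Kubernetes label selector, comma-separated requirements are ANDed, so appending `version=<current>` to the existing selector narrows the result to pods of one release.

```elixir
defmodule VersionedSelector do
  # Hypothetical helper for illustration; not part of libcluster.

  # Appends a version requirement to an existing label selector.
  # The label key is an assumption: use whatever your manifests set.
  def with_version(selector, label_key, version) do
    selector <> "," <> label_key <> "=" <> version
  end

  # Builds the pod-list path, URL-encoding the selector so that
  # "=" and "," are passed as a single query-string value.
  def pods_path(namespace, selector) do
    "api/v1/namespaces/#{namespace}/pods?labelSelector=" <>
      URI.encode_www_form(selector)
  end
end

selector = VersionedSelector.with_version("app=myapp", "version", "1.4.2")
IO.puts(VersionedSelector.pods_path("default", selector))
```

The current version would typically come from the running release (for example an environment variable set in the pod spec), so each node only ever asks for its own peers.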

amacciola commented on June 2, 2024

@AndrewDryga okay, I will try this. So if I am trying to extend the k8s DNS strategy here:
https://github.com/bitwalker/libcluster/blob/main/lib/strategy/kubernetes_dns.ex

you are suggesting that I need to tweak the get_nodes function here:

defp get_nodes(%State{topology: topology, config: config}) do

to additionally query for a specific version, or at least only for matching version numbers?

AndrewDryga commented on June 2, 2024

@amacciola you can't extract that information from the DNS server; instead, you should modify that function in lib/strategy/kubernetes.ex. The K8s API returns a lot of information about each pod, including its labels, which is where you need to store the version.

amacciola commented on June 2, 2024

@AndrewDryga I see. So it's changing

    path =
      case ip_lookup_mode do
        :endpoints -> "api/v1/namespaces/#{namespace}/endpoints?labelSelector=#{selector}"
        :pods -> "api/v1/namespaces/#{namespace}/pods?labelSelector=#{selector}"
      end

    headers = [{'authorization', 'Bearer #{token}'}]
    http_options = [ssl: [verify: :verify_none], timeout: 15000]

    case :httpc.request(:get, {'https://#{master}/#{path}', headers}, http_options, []) do
      {:ok, {{_version, 200, _status}, _headers, body}} ->
        parse_response(ip_lookup_mode, Jason.decode!(body))
        |> Enum.map(fn node_info ->
          format_node(
            Keyword.get(config, :mode, :ip),
            node_info,
            app_name,
            cluster_name,
            service_name
          )
        end)

so that it includes additional parameters and only returns info for pods of a certain version?

AndrewDryga commented on June 2, 2024

@amacciola yes, you want to query for pods and return only the ones that match your current version.
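
As a rough illustration of "return only the ones that match", the sketch below filters an already-decoded pod list on the client side. It assumes the usual shape of a Kubernetes pod list (`items`, `metadata.labels`, `status.podIP`) and a label named `version`; the module `VersionFilter` and the sample data are hypothetical and not taken from libcluster.

```elixir
defmodule VersionFilter do
  # Hypothetical helper for illustration; not part of libcluster.

  # Returns the IPs of pods whose `label_key` label equals `version`.
  # Pods without a podIP (for example, still pending) do not match the
  # generator pattern and are skipped.
  def same_version_ips(%{"items" => items}, label_key, version) do
    for %{"metadata" => meta, "status" => %{"podIP" => ip}} <- items,
        get_in(meta, ["labels", label_key]) == version,
        do: ip
  end
end

# Sample data standing in for a decoded API response (made up).
response = %{
  "items" => [
    %{"metadata" => %{"labels" => %{"version" => "1"}}, "status" => %{"podIP" => "10.0.0.1"}},
    %{"metadata" => %{"labels" => %{"version" => "2"}}, "status" => %{"podIP" => "10.0.0.2"}},
    %{"metadata" => %{"labels" => %{"version" => "1"}}, "status" => %{"podIP" => "10.0.0.3"}},
    %{"metadata" => %{"labels" => %{"version" => "1"}}, "status" => %{}}
  ]
}

IO.inspect(VersionFilter.same_version_ips(response, "version", "1"))
```

Filtering after the fact is an alternative to putting the version into the label selector; doing it in the selector keeps the response smaller, while filtering locally keeps the query unchanged.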

amacciola commented on June 2, 2024

@AndrewDryga I am testing this new strategy out now, so thanks for the insight. But I just wanted to make sure I understand how libcluster combined with the Horde registry works under the hood.

Say we have 3 pods running the same application, and each of these pods has, let's say, server_1 registered in the HordeRegistry.

  • pod_1 with version_1
  • pod_2 with version_1 -> running server_1
  • pod_3 with version_1

Suppose we then trigger an update to version_2, which contains a new service, server_2, that gets registered in the Horde Registry, and the update starts with pod_3. pod_3 comes online with the new version and finds pod_1 and pod_2.

Does it just pick a process_id from one of the 3 pods to try to start the new service on? If so, is there a 2 in 3 chance that it tries to start the new service on a pod with version_1 and not version_2?

AndrewDryga commented on June 2, 2024

I'm not using Horde, but the pods with version 1 would not see pods with version 2 in the Erlang cluster, so basically, for each of the islands (one per version), everything would behave like a cluster with a single codebase. If you have globally unique jobs, it also means that you will have two of the workers started (one per island).
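
To make the "islands" picture concrete, here is a small sketch that groups nodes by version. The module `Islands`, the node names, and the version strings are made up for illustration; the point is only that the number of islands equals the number of versions live during a rollout, which is also how many copies of a "globally unique" worker would be running.

```elixir
defmodule Islands do
  # Hypothetical helper for illustration; not part of libcluster or Horde.

  # Takes a list of {node_name, version} tuples and returns a map of
  # version => [node_name], i.e. one island per version.
  def by_version(pods) do
    Enum.group_by(
      pods,
      fn {_node, version} -> version end,
      fn {node, _version} -> node end
    )
  end
end

# Mid-rollout: two pods still on version 1, one already on version 2.
pods = [
  {:"app@10.0.0.1", "1"},
  {:"app@10.0.0.2", "1"},
  {:"app@10.0.0.3", "2"}
]

islands = Islands.by_version(pods)
IO.inspect(islands)

# One globally unique worker per island.
IO.inspect(map_size(islands))
```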

amacciola commented on June 2, 2024

For now I have just created a separate HordeRegistry for each GenServer that we want to use with the libcluster strategies. As long as we don't have too many, it's only a minor annoyance to work around this issue.
