
Swarm


NOTE: If you are upgrading from 1.0, be aware that the autoclustering functionality has been extracted to its own package, which you will need to depend on if you use that feature. The package is libcluster and is available on Hex. Please be sure to read over the README to make sure your config is properly updated.

Swarm is a global distributed registry, offering a feature set similar to that of gproc, but architected to handle dynamic node membership and large volumes of process registrations being created/removed in short time windows.

To be more clear, Swarm was born out of the need for a global process registry which could handle large numbers of persistent processes representing devices/device connections, which needed to be distributed around a cluster of Erlang nodes, and easily found. Messages need to be routed to those processes from anywhere in the cluster, both individually, and as groups. Additionally, those processes need to be shifted around the cluster based on cluster topology changes, or restarted if their owning node goes down.

Before writing Swarm, I tried both global and gproc, but the former is not very flexible, and both of them require leader election, which, in the face of dynamic node membership and the sheer volume of registrations, ended up causing deadlocks/timeouts during leadership contention.

I also attempted to use syn, but because it used mnesia at the time, the requirement of dynamic node membership meant it was dead on arrival for my use case.

In short, are you running a cluster of Erlang nodes under something like Kubernetes? If so, Swarm is for you!

View the docs at https://hexdocs.pm/swarm/.

PLEASE READ: If you are giving Swarm a spin, it is important to understand that you can concoct scenarios in which the registry appears to be out of sync temporarily. This is a side effect of an eventually consistent model and does not mean that Swarm is not working correctly; rather, the applications you build on top of Swarm must embrace eventual consistency, so that periods of inconsistency are tolerated. For the most part the registry replicates extremely quickly, so noticeable inconsistency is the exception rather than the rule, but a proper distributed system should always be designed to tolerate the exceptions, as they become more common as you scale up. If, however, you notice extreme inconsistency or delayed replication, it may be a bug or a performance issue, so feel free to open an issue if you are unsure, and we will gladly look into it.

Installation

defp deps do
  [{:swarm, "~> 3.0"}]
end
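
Then fetch the dependency:

mix deps.get

If you are on an Elixir version prior to 1.4 (which lacks inferred application lists), also add :swarm to the applications list in your mix.exs.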

Features

  • automatic distribution of registered processes across the cluster based on a consistent hashing algorithm, where names are partitioned across nodes based on their hash.
  • easy handoff of processes between one node and another, including handoff of current process state.
  • can do simple registration with {:via, :swarm, name}
  • both an Erlang and Elixir API
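
As an example of the :via registration listed above (MyApp.Worker and :my_worker are placeholder names):

GenServer.start_link(MyApp.Worker, [], name: {:via, :swarm, :my_worker})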

Restrictions

  • auto-balancing of processes in the cluster requires registrations to be done via register_name/5, which takes module/function/args params, and handles starting the process for you. The MFA must return {:ok, pid}. This is how Swarm handles process handoff between nodes, and automatic restarts when nodedown events occur and the cluster topology changes.
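
For illustration, a sketch of such a registration (the supervisor module, function, and worker name below are placeholders; a complete version appears in the Example section):

{:ok, pid} = Swarm.register_name("worker:1", MyApp.Supervisor, :register, ["worker:1"])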

Process handoff

Processes may be redistributed between nodes when a node joins, or leaves, a cluster. You can indicate whether the handoff should simply restart the process on the new node, start the process and then send it the handoff message containing state, or ignore the handoff and remain on its current node.

Process state can be transferred between running nodes during process redistribution by using the {:swarm, :begin_handoff} and {:swarm, :end_handoff, state} callbacks. However, process state will be lost when a node hosting a distributed process terminates; in this scenario you must restore the state yourself.
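
A minimal sketch of these callbacks in a GenServer-based worker (the Example section later in this README shows them in full context):

# reply with `{:resume, state}` to hand the current state to the replacement process,
# `:restart` to restart fresh on the new node, or `:ignore` to stay on the current node
def handle_call({:swarm, :begin_handoff}, _from, state) do
  {:reply, {:resume, state}, state}
end

# receives the state that the old process returned from `begin_handoff`
def handle_cast({:swarm, :end_handoff, handed_off_state}, _state) do
  {:noreply, handed_off_state}
end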

Consistency Guarantees

Like any distributed system, a choice must be made in terms of guarantees provided. You can choose between availability or consistency during a network partition by selecting the appropriate process distribution strategy.

Swarm provides two strategies for you to use:

  • Swarm.Distribution.Ring

    This strategy favors availability over consistency. It is eventually consistent: when a network partition heals, any copies of a given name living on nodes where they don't belong are asked to shut down.

    Network partitions result in all partitions running an instance of processes created with Swarm. Swarm was designed for use in an IoT platform, where process names are generally based on physical device ids, and as such, the consistency issue is less of a problem. If events get routed to two separate partitions, it's generally not an issue if those events are for the same device. However this is clearly not ideal in all situations. Swarm also aims to be fast, so registrations and lookups must be as low latency as possible, even when the number of processes in the registry grows very large. This is achieved without consensus by using a consistent hash of the name which deterministically defines which node a process belongs on, and all requests to start a process on that node will be serialized through that node to prevent conflicts.

    This is the default strategy and requires no configuration.

  • Swarm.Distribution.StaticQuorumRing

    A quorum is the minimum number of nodes that a distributed cluster must have connected in order to be allowed to perform an operation. This can be used to enforce consistent operation in a distributed system.

    You configure the quorum size by defining the minimum number of nodes that must be connected in the cluster to allow process registration and distribution. Calls to Swarm.register_name/5 will return {:error, :no_node_available} if there are fewer nodes available than the configured minimum quorum size.

    In a network partition, the partition containing at least the quorum-size number of nodes will continue operating. Processes running on the other side of the split will be stopped and restarted on the active side. This ensures that only one instance of a registered process is running in the cluster.

    You must configure this strategy and its minimum quorum size using the :static_quorum_size setting:

    config :swarm,
      distribution_strategy: Swarm.Distribution.StaticQuorumRing,
      static_quorum_size: 5

    The quorum size should be set to half the cluster size, plus one node: the quorum size for a three-node cluster is two, for a five-node cluster it is three, and for a nine-node cluster it is five. You must not add more than (2 x quorum size) - 1 nodes to the cluster, as a network split could otherwise result in both partitions continuing operation.

    Processes are distributed amongst the cluster using the same consistent hash of their name as in the ring strategy above.

    This strategy is a good choice when you have a fixed number of nodes in the cluster.

Clustering

Swarm pre-2.0 included auto-clustering functionality, but that has been split out into its own package, libcluster. Swarm works out of the box with Erlang's distribution tools (i.e. Node.connect/1, :net_kernel.connect_node/1, etc.), but if you need the auto-clustering that Swarm previously provided, you will need to add :libcluster to your deps, and make sure it's in your applications list before :swarm. Some of the configuration has changed slightly in :libcluster, so be sure to review the docs.
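
For reference, a minimal libcluster topology using the Epmd strategy might look like the following (the topology name and host list are placeholders; consult the libcluster docs for the exact format of the version you use):

config :libcluster,
  topologies: [
    example: [
      strategy: Cluster.Strategy.Epmd,
      config: [hosts: [:"a@127.0.0.1", :"b@127.0.0.1"]]
    ]
  ]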

Node Blacklisting/Whitelisting

You can explicitly whitelist or blacklist nodes to prevent certain nodes from being included in Swarm's consistent hash ring. This is done with the node_whitelist and node_blacklist settings respectively. These settings must be lists containing either literal strings or valid Elixir regex patterns (as string or regex literals). If no whitelist is set, the blacklist is used; if no blacklist is provided, the default blacklist contains two patterns, both of which ignore nodes created by Relx/ExRM/Distillery when using releases, one for remote shells and one for hot upgrade scripting. The patterns can be found in this repo's config/config.exs file. A quick example:

config :swarm,
  node_whitelist: [~r/^myapp-[\d]@.*$/]

The above will only allow nodes named something like myapp-1@somehost to be included in Swarm's clustering. NOTE: This does not prevent non-matching nodes from connecting to the cluster; it only means that Swarm will not include them in its distribution algorithm or communicate with them.
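
A blacklist is configured the same way. For example, a hypothetical blacklist that keeps remote-shell nodes out of the hash ring (the pattern below is illustrative, not the library default):

config :swarm,
  node_blacklist: [~r/^remsh.*$/]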

Registration/Process Grouping

Swarm is intended to be used by registering processes before they are created, and letting Swarm start them for you on the proper node in the cluster. This is done via Swarm.register_name/5. You may also register processes the normal way, i.e. GenServer.start_link(MyModule, arg, name: {:via, :swarm, name}). Swarm will manage these registrations and replicate them across the cluster; however, these processes will not be moved in response to cluster topology changes.

Swarm also offers process grouping, similar to the way gproc does properties. You "join" a process to a group after it is started (beware of doing so in init/1 outside of a Task, as it will deadlock) with Swarm.join/2. You can then publish messages (i.e. cast) with Swarm.publish/2, and/or call all processes in a group and collect results (i.e. call) with Swarm.multi_call/2 or Swarm.multi_call/3. Leaving a group is done with Swarm.leave/2, but happens automatically when a process dies. Join/leave can be used to do pubsub-like things, or to perform operations over a group of related processes.
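
A short sketch of the grouping API (the group name, messages, and `pid` below are placeholders):

# `pid` is a process started elsewhere, e.g. via Swarm.register_name/5
Swarm.join(:my_group, pid)

# cast-style broadcast to every member of the group
Swarm.publish(:my_group, {:event, :refresh})

# call every member and collect the replies
results = Swarm.multi_call(:my_group, :get_status)

# leave explicitly (this also happens automatically when the process dies)
Swarm.leave(:my_group, pid)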

Debugging/Troubleshooting

By configuring Swarm with debug: true and setting Logger's log level to :debug, you can get much more information about what it is doing during operation to troubleshoot issues.
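
For example:

config :swarm,
  debug: true

config :logger,
  level: :debug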

To dump the tracker's state, you can use :sys.get_state(Swarm.Tracker) or :sys.get_status(Swarm.Tracker). The former will dump the tracker state including what nodes it is tracking, what nodes are in the hash ring, and the state of the interval tree clock. The latter will dump more detailed process info, including the current function and its arguments. This is particularly useful if it appears that the tracker is stuck and not doing anything. If you do find such things, please gist all of these results and open an issue so that I can fix these issues if they arise.

Example

The following example shows a simple case where workers are dynamically created in response to some events under a supervisor, and we want them to be distributed across the cluster and be discoverable by name from anywhere in the cluster. Swarm is a perfect fit for this situation.

defmodule MyApp.Supervisor do
  @moduledoc """
  This is the supervisor for the worker processes you wish to distribute
  across the cluster, Swarm is primarily designed around the use case
  where you are dynamically creating many workers in response to events. It
  works with other use cases as well, but that's the ideal use case.
  """
  use Supervisor

  def start_link() do
    Supervisor.start_link(__MODULE__, [], name: __MODULE__)
  end

  def init(_) do
    children = [
      worker(MyApp.Worker, [], restart: :temporary)
    ]
    supervise(children, strategy: :simple_one_for_one)
  end

  @doc """
  Registers a new worker, and creates the worker process
  """
  def register(worker_name) do
    {:ok, _pid} = Supervisor.start_child(__MODULE__, [worker_name])
  end
end

defmodule MyApp.Worker do
  @moduledoc """
  This is the worker process, in this case, it simply posts on a
  random recurring interval to stdout.
  """
  def start_link(name) do
    GenServer.start_link(__MODULE__, [name])
  end

  def init([name]) do
    {:ok, {name, :rand.uniform(5_000)}, 0}
  end

  # called when a handoff has been initiated due to changes
  # in cluster topology, valid response values are:
  #
  #   - `:restart`, to simply restart the process on the new node
  #   - `{:resume, state}`, to hand off some state to the new process
  #   - `:ignore`, to leave the process running on its current node
  #
  def handle_call({:swarm, :begin_handoff}, _from, {name, delay}) do
    {:reply, {:resume, {name, delay}}, {name, delay}}
  end
  # called after the process has been restarted on its new node,
  # and the old process' state is being handed off. This is only
  # sent if the return to `begin_handoff` was `{:resume, state}`.
  # **NOTE**: This is called *after* the process is successfully started,
  # so make sure to design your processes around this caveat if you
  # wish to hand off state like this.
  def handle_cast({:swarm, :end_handoff, delay}, {name, _}) do
    {:noreply, {name, delay}}
  end
  # called when a network split is healed and the local process
  # should continue running, but a duplicate process on the other
  # side of the split is handing off its state to us. You can choose
  # to ignore the handoff state, or apply your own conflict resolution
  # strategy
  def handle_cast({:swarm, :resolve_conflict, _delay}, state) do
    {:noreply, state}
  end

  def handle_info(:timeout, {name, delay}) do
    IO.puts "#{inspect name} says hi!"
    Process.send_after(self(), :timeout, delay)
    {:noreply, {name, delay}}
  end
  # this message is sent when this process should die
  # because it is being moved, use this as an opportunity
  # to clean up
  def handle_info({:swarm, :die}, state) do
    {:stop, :shutdown, state}
  end
end

defmodule MyApp.ExampleUsage do
  ...snip...

  @doc """
  Starts worker and registers name in the cluster, then joins the process
  to the `:foo` group
  """
  def start_worker(name) do
    {:ok, pid} = Swarm.register_name(name, MyApp.Supervisor, :register, [name])
    Swarm.join(:foo, pid)
  end

  @doc """
  Gets the pid of the worker with the given name
  """
  def get_worker(name), do: Swarm.whereis_name(name)

  @doc """
  Gets all of the pids that are members of the `:foo` group
  """
  def get_foos(), do: Swarm.members(:foo)

  @doc """
  Call some worker by name
  """
  def call_worker(name, msg), do: GenServer.call({:via, :swarm, name}, msg)

  @doc """
  Cast to some worker by name
  """
  def cast_worker(name, msg), do: GenServer.cast({:via, :swarm, name}, msg)

  @doc """
  Publish a message to all members of group `:foo`
  """
  def publish_foos(msg), do: Swarm.publish(:foo, msg)

  @doc """
  Call all members of group `:foo` and collect the results,
  any failures or nil values are filtered out of the result list
  """
  def call_foos(msg), do: Swarm.multi_call(:foo, msg)

  ...snip...
end

License

MIT

Testing

mix test runs a variety of tests; most of them use a cluster of Elixir nodes to test the tracker and the registry. If you want more verbose output during the tests, run them like this:

SWARM_DEBUG=true mix test

This sets the log level to :debug, runs ExUnit with --trace, and enables GenServer tracing on the Tracker processes.

Executing the tests locally

In order to execute the tests locally you'll need to have the Erlang Port Mapper Daemon (epmd) running.

If you don't have epmd running you can start it using the following command:

epmd -daemon

TODO

  • automated testing (some are present)
  • QuickCheck model

swarm's Issues

Purposeful node failover

Hi! Swarm looks awesome, I just miss one feature: manually telling Swarm to fail over everything that it manages on one node.
This could basically be what happens on the addition of a new node, but reversed, and would allow for more seamless upgrades of nodes, downscaling when load decreases, and so on.

Is this something that fits your idea for swarm?

Add non-swarm processes to Swarm group

As the title says, is this possible? I'm testing using Swarm.join(:test, self()) in IEx but Swarm.members(:test) is coming back empty. It works fine if I use a process registered through Swarm.

RFC: Assign roles to nodes to control distribution

Not sure if this belongs here or fits better in libcluster, but it would be great if one could assign roles to nodes, and then start worker processes only on nodes with a particular role.

Thanks!

Topology change on node startup when process modules not yet loaded

Using Swarm 3.0.5, often when I have processes on one node and a new node joins, causing processes to move around to a different node, I will see a warning on the originating node:

2017-11-08 15:01:01.231 [warn] [swarm on [email protected]] [tracker:start_pid_remotely] "ID_4" could not be started on [email protected]: {:error, :undef}

and on the target node:

15:01:01.231 [warn]  [swarm on [email protected]] [tracker:handle_call] ** (UndefinedFunctionError) function Gptest.Service.start_link/1 is undefined (module Gptest.Service is not available)
    Gptest.Service.start_link({:id, "4"})
    (swarm) lib/swarm/tracker/tracker.ex:961: Swarm.Tracker.handle_call/3
    (stdlib) gen_statem.erl:1240: :gen_statem.call_state_function/5
    (stdlib) gen_statem.erl:1012: :gen_statem.loop_event/6
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3

So it would appear that since Swarm loads before our application modules (Gptest, above), the VM has not yet loaded Gptest into memory when Swarm attempts to start it on the new node. And since there are no retries when that error occurs, the process stays down until some external entity restarts it.

Is there a mechanism that I could use to either delay the attempt to load the process or retry when I get this error?

Topology change stops manually registered processes on other nodes

Hello!

I am using Swarm 3.0.5 with libcluster. Imagine, I have two nodes starting up.

On every node startup the same procedure is invoked:

  • Swarm.whereis_name to see if name is registered
  • if not, start GenServer using via tuple
  • try to register with Swarm.register_name/2

NodeA starts up, follows the procedure, and registers a new process.
NodeB follows the same procedure, but as soon as the Swarm application starts on NodeB, before even proceeding to my custom logic, on NodeA I get the following at debug log level:

  • topology change (nodeup....)
  • removing registration for XXXX, nodeA is down

even though NodeA is up, and after that I can run Swarm.registered/0 and see the process registered on NodeB.

Any thoughts about this behavior?

Thanks!

Unable to locate process registered via register_name/2 after hot upgrade

I have a GenServer that I register using the {:via, ...} tuple. After running a hot code upgrade my process isn't discoverable anymore. I wonder if it has something to do with topology change. Maybe Swarm is trying to route it to the wrong node? But either way I think it might be a good idea to add the upgrader node to the blacklist so it doesn't cause unnecessary relocations. Or is the name too general?

iex([email protected])4> pid = Swarm.whereis_name(:foo)
#PID<0.1304.0>
iex([email protected])5> Process.alive? pid
true

22:43:11.968 [info] [swarm on [email protected]] [tracker:nodeup] nodeup [email protected]
22:43:11.973 [info] [swarm on [email protected]] [tracker:topology_change] topology change complete
22:43:13.023 [info] [swarm on [email protected]] [tracker:nodedown] nodedown [email protected]
22:43:13.023 [info] [swarm on [email protected]] [tracker:topology_change] topology change complete

iex([email protected])7> Process.alive? pid
true
iex([email protected])8> pid = Swarm.whereis_name(:foo)
:undefined

If nodes are not connected on application startup tracker won't sync

If I start an isolated node, the tracker will continue without sync; however, if I connect the node to the cluster after seeing such a message, it takes 5 minutes to receive the registry.
With 2 processes it is really easy to lose the registry information.
E.g.:
start both nodes 1 and 2 and connect them -> start a process on node 1 -> restart node 2 and connect it after seeing the "proceeding without sync" message -> stop node 1 any time in the next 5 minutes -> node 2 does not know about the process on node 1 and therefore no handoff is produced.

I am not sure if this is a bug or the intended functionality. Is there a way to control the registry exchange frequency?

Hierarchical Registration

Disclaimer: this may/does have some overlap with #58. This is merely to start spitballing ideas on how we would even accomplish this, or what specific feature set we would want upon implementation.

Enhancement/Feature Request:
(as discussed in slack) It would be great if we could have hierarchical registration within Swarm. I have an application that is using Swarm but is distributed geographically across many DCs (data centers), both internally and with various public cloud providers. These Swarm clusters are independent and I currently have to wire them together outside of Swarm.

Basically what I am thinking is that you can ask a top-level Swarm for a process name, it tells me what DC it's in, and I start talking to that DC/process. Ideally, you'd also be able to direct process creation to a specific DC as well (this is where #58 starts coming into play).

I'm aware that process groups are a thing in Swarm, but afaik there is no way to say "this process group belongs to this/these node(s)". Doing something like hierarchical registration starts to bring that into the picture. You could say "I want to have a new process created and managed by Swarm in DC2, on nodes 'labeled' with, or having the 'role', 'web' within DC2, and made part of the 'UI' process group". Rather ambitious, but a very nice feature to have.

                               +-----------+
                               |           |
                               |           |
                        +------+ Top Level +------+
                        |      |           |      |
                        |      |           |      |
                        |      +-----------+      |
                        |                         |
                        |                         |
                        |                         |
                  +-----v-----+             +-----v-----+
                  |           |             |           |
                  |           |             |           |
      +-----------+    DC1    |             |    DC2    +-----------+
      |           |           |             |           |           |
      |           |           |             |           |           |
      |           +---+-------+             +-----+-----+           |
      |               |                           |                 |
    +---+  +---+  +---+ +--+  +--+  +--+  +--+  +--+  +--+  +---+  +---+
      |               |                           |                 |
      |               |                           |                 |
+-----v-----+  +------v-------+             +-----v-----+  +--------v-----+
|Node1 "web"|  |Node2 "engine"|             |Node1 "web"|  |Node2 "engine"|
+-----------+  +--------------+             +--+------+-+  +--------------+
                                               |      |
                                            +--v-+ +--v--+
                                            | UI | | API |
                                            +----+ +-----+

Dead lock when using `{:via, :swarm, ...}` to register a process.

I'm not sure whether or not I'm making a mistake here, but since I wasn't able to get help on the Elixir Slack, please allow me to explain.

We are using Swarm.register_name/5 to distribute spawning of a process S under dynamic supervision tree. This works great, the process is placed under the appropriate node based on its name's hash.

But in fact, S is itself a supervisor, and it spawns two children via their spec. For the sake of simplicity, we will call them A and B.

The issue seems to be that inside A.start_link and B.start_link, we are not able to register them using Swarm.register_name/2 (or in init) using their pid, nor are we able to use name: {:via, :swarm, their_name}. When we try to do either of these things, iex just hangs forever and one CPU is sometimes maxed out.

What is very interesting is that if we use {:via, :global, their_name} instead, it works, but, as the docs state, we don't really want to do this because those processes will not be handed over in the event of the node going down.

So this might be an actual bug, but more generally speaking, it would be nice to have a small bit of documentation explaining how to achieve the following:

  • If swarm starts a process, all children will be started on the node which was used to start the parent.
  • Given that the previous statement is true, it becomes important to be able to locate some of the children process to send them messages.
  • How to do that?

Cheers, swarm helped us understand a lot how to distribute things and stay consistent. Good job!

tests fail

I performed a checkout of master and ran the tests; the following failures are generated:

14:11:12.342 [info]  Protocol 'inet_tcp': register/listen error: econnrefused

** (EXIT from #PID<0.73.0>) :not_alive

14:11:12.361 [error] Task #PID<0.150.0> started from #PID<0.73.0> terminating
** (stop) :not_alive
    (stdlib) slave.erl:198: :slave.start/5
    (swarm) test/support/cluster.ex:21: Swarm.Cluster.spawn_node/1
    (elixir) lib/task/supervised.ex:88: Task.Supervised.do_apply/2
    (elixir) lib/task/supervised.ex:38: Task.Supervised.reply/5
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
Function: #Function<8.128785774/0 in Swarm.Cluster.spawn/1>
    Args: []

14:11:12.362 [error] Task #PID<0.151.0> started from #PID<0.73.0> terminating
** (stop) :not_alive
    (stdlib) slave.erl:198: :slave.start/5
    (swarm) test/support/cluster.ex:21: Swarm.Cluster.spawn_node/1
    (elixir) lib/task/supervised.ex:88: Task.Supervised.do_apply/2
    (elixir) lib/task/supervised.ex:38: Task.Supervised.reply/5
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
Function: #Function<8.128785774/0 in Swarm.Cluster.spawn/1>
    Args: []

As with libcluster, is there some setup I need to perform prior to running the tests?

Split Swarm.Ring into its own package

Some folks have requested the hash ring implementation be standalone for other use cases. I see no reason not to do that, so this issue is to track splitting it out, and updating this project to use it as a dependency.

Evaluate usage of tracker clock to ensure it's logically correct

To recap, the tracker uses an implementation of an Interval Tree Clock for resolving the causal history of replicated events. My understanding of ITC is that a scenario plays out like so:

  • Node A is created as a seed clock
  • Node B is created by forking A's clock
  • Event X occurs on A, incrementing its event clock
  • When A syncs with B or vice versa, we can compare the clocks, and see that B's clock is strictly in the past of A's clock, so we can update B's state no problem
  • If events occur on both A and B, their clocks will conflict, which requires us to resolve the conflicting state in some way, and then make sure the clocks are synchronized before proceeding

In the case of Swarm, we can deterministically resolve conflicts because at any given point in time, we know which node in the cluster "owns" a given process/registration, with the exception of processes which are registered but not managed by Swarm. We can ask the conflicted process to shut down, or kill it, and the registration will remain with the "correct" version. However, since the way synchronization works today is that the registry is sent to the remote node, I'm not sure a clock is strictly needed. If we assume that it does help us though, we should ensure that our logic for handling the clock forking/peeking/incrementing is correct.

Handoff processes when node shutdowns gracefully

Is there a way to gracefully leave the cluster and force handoffs (passing state along) before the node goes down? If I understand correctly, currently handoffs only occur when a new node joins the cluster and not the other way around.

Swarm for Erlang based projects

Hi, thanks for a wonderful project to solve the distributed node discovery problem in Kubernetes. I came across this project when I was doing some research/prototypes on my own to find a way of solving this kind of problem.

I am an Elixir novice. So, I'm wondering if there is a way by which one can use this project in an Erlang environment, where my projects are 100% Erlang.

  • Can I include this project as a dependency in my Erlang project?
  • If yes, do you have any example?
  • If no, do you know of a way to make this project usable from Erlang? If someone can give me some pointers, I can make the changes and contribute this addition to the project.

Thanks

Unregistering processes

I'm not certain this is a bug, but looking for some clarification (I'm a bit new to Elixir, so sorry if this is obvious).

I integrated Swarm into an existing library using :via tuples. That seems to work fine, but tests that were killing a process and restarting it were failing. It seems there's a longer delay between a process being killed and Swarm.Tracker unregistering the process than with the Registry module.

Now that makes sense since in a distributed system it would take longer, but I'm testing on a single node so I didn't expect this to be an issue. I'm able to make the issue go away in tests by either using Process.sleep() or explicitly calling Swarm.unregister_name. I have an example project that reproduces the issue. Here's the relevant code:

  def test() do
    uuid = "b78932b0-1265-4317-bd92-3a1a3c64789e"
    TestSwarm.Supervisor.start_worker(uuid)
    TestSwarm.Worker.worker_state(uuid)
    shutdown(uuid)
    TestSwarm.Supervisor.start_worker(uuid)
    TestSwarm.Worker.worker_state(uuid)
  end

And the output:

iex(2)> TestSwarm.Helper.test()

15:03:40.975 [debug] Starting worker for uuid: "b78932b0-1265-4317-bd92-3a1a3c64789e"

15:03:40.982 [debug] [swarm on nonode@nohost] [tracker:handle_call] registering #PID<0.178.0> as "b78932b0-1265-4317-bd92-3a1a3c64789e", with metadata %{}

15:03:40.984 [debug] [swarm on nonode@nohost] [tracker:handle_call] add_meta {:workergroup, true} to #PID<0.178.0>

15:03:40.984 [debug] Starting worker for uuid: "b78932b0-1265-4317-bd92-3a1a3c64789e"

15:03:40.984 [debug] [swarm on nonode@nohost] [tracker:handle_monitor] "b78932b0-1265-4317-bd92-3a1a3c64789e" is down: :shutdown
** (exit) exited in: GenServer.call({:via, :swarm, "b78932b0-1265-4317-bd92-3a1a3c64789e"}, {:get_state}, 5000)
    ** (EXIT) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
    (elixir) lib/gen_server.ex:737: GenServer.call/3

If it's not a bug (which is probably likely), is there something my Supervisor or Worker needs to be doing?
Call Swarm.unregister_name from the Supervisor or in the terminate/2 of the worker? I also thought that maybe I need to block until the tracker is updated, but I don't see a good spot to do that.

process replication doesn't work if one of the nodes is started late

I have 2 nodes in my swarm cluster. Here is my sys.config

[{kernel, [ {sync_nodes_optional, ['[email protected]', '[email protected]']}, {sync_nodes_timeout, 10000} ]} ].

Process replication/acknowledgment does not work right if I start node1 first, wait 20 seconds (letting the timeout expire for sync_nodes), and then start node2. At this point node1 will start with "[info] [swarm on [email protected]] [tracker:cluster_wait] no connected nodes, proceeding without sync", and when node2 starts up, it does detect it and outputs the following message:

iex([email protected])2> [warn] [swarm on [email protected]] [tracker:nodeup] nodeup for [email protected] was ignored because swarm failed to start: {:error, {:swarm, {'no such file or directory', 'swarm.app'}}}
[debug] [swarm on [email protected]] [tracker:handle_cast] received sync request from [email protected]
[info] [swarm on [email protected]] [tracker:awaiting_sync_ack] received sync acknowledgement from [email protected]
[info] [swarm on [email protected]] [tracker:resolve_pending_sync_requests] pending sync requests cleared

But when I start a worker from node1, there is no replication of that process acknowledged by node2 but it works just fine if I start a worker from node2.

I'm sure this is a common use case where one of the nodes dies and joins the cluster late after all the other nodes have been initialized. I am not sure what that warning is and whether it has something to do with this not working as expected.

No synchronization between my nodes

Hi,

First, I'd like to thank you for this hard work :)

I tried to use Swarm for my own project. I'm starting processes from my supervisor and I'd like them to be shared between connected nodes.

This is the code I use to start new workers in the swarm:

defmodule ModNotifications.Workers.Sender do
  # Automatically defines child_spec/1
  use DynamicSupervisor

  def start_link(arg) do
    DynamicSupervisor.start_link(__MODULE__, arg, name: __MODULE__)
  end

  def init(_arg) do
    DynamicSupervisor.init(strategy: :one_for_one, max_children: :infinity)
  end

  def schedule(delay, sender) do
    name = UUID.uuid4()

    {:ok, pid} =
      Swarm.register_name(name, __MODULE__, :register, [
        name,
        {ModNotifications.Workers.Delayer, %{delay: delay, sender: sender}}
      ])

    Swarm.join(:notification_sender, pid)
  end

  def register(name, {mod, data}) do
    DynamicSupervisor.start_child(
      __MODULE__,
      {mod, {name, data}}
    )
  end
end

This is just a DynamicSupervisor started at app startup. When I need a new worker I call the schedule function with parameters, which then starts a new worker.

The code for the worker is pretty simple:

defmodule ModNotifications.Workers.Delayer do
  use GenServer, restart: :temporary

  alias ModNotifications.Workers.Sender

  require Logger

  def start_link({name, state}) do
    Logger.warn("START LINK #{inspect(name)} #{inspect(state)}")
    GenServer.start_link(__MODULE__, state)
  end

  def init(%{sender: _, delay: delay_time} = state) do
    Logger.warn("INIT DELAYER #{inspect(state)}")
    delay(delay_time)

    {:ok, state}
  end

  def init(_), do: :ignore

  def handle_call({:swarm, :begin_handoff}, _from, state) do
    Logger.warn("BEGIN_HANDOFF #{inspect(state)}")
    {:reply, {:resume, state}, state}
  end

  # called after the process has been restarted on its new node,
  # and the old process' state is being handed off. This is only
  # sent if the return to `begin_handoff` was `{:resume, state}`.
  # **NOTE**: This is called *after* the process is successfully started,
  # so make sure to design your processes around this caveat if you
  # wish to hand off state like this.
  def handle_cast({:swarm, :end_handoff, state}, old_state) do
    Logger.warn("END_HANDOFF #{inspect(state)}, old state : #{inspect(old_state)}")
    {:noreply, state}
  end

  # called when a network split is healed and the local process
  # should continue running, but a duplicate process on the other
  # side of the split is handing off its state to us. You can choose
  # to ignore the handoff state, or apply your own conflict resolution
  # strategy
  def handle_cast({:swarm, :resolve_conflict, resolve_state}, state) do
    Logger.warn("RESOLVE_CONFLICT #{inspect(resolve_state)}, state : #{inspect(state)}")
    {:noreply, state}
  end

  # this message is sent when this process should die
  # because it is being moved, use this as an opportunity
  # to clean up
  def handle_info({:swarm, :die}, state) do
    Logger.warn("DIE #{inspect(state)}")
    {:stop, :shutdown, state}
  end

  def handle_info(:send, %{sender: sender} = state) do
    NotificationCenter.send(sender)

    {:stop, :normal, state}
  end

  def handle_info(_message, state), do: {:noreply, state}

  def delay(delay) do
    # in minutes
    Process.send_after(self(), :send, delay * 60 * 1000)
  end
end

The worker does only one thing: it calls a send function after a given amount of time.

Everything works fine with a single node. However, when I start a new node and connect it to the first one, I can't find my workers from the second node:

Swarm.members(:notification_sender) == []

Here are the logs from my first node (the one from which I started the workers):

*DBG* 'Elixir.Swarm.Tracker' receive info {nodeup,'[email protected]',[{node_type,visible}]} in state tracking
[info] [swarm on [email protected]] [tracker:ensure_swarm_started_on_remote_node] nodeup [email protected]
*DBG* 'Elixir.Swarm.Tracker' consume info {nodeup,'[email protected]',[{node_type,visible}]} in state tracking
[info] [swarm on [email protected]] [tracker:cluster_wait] joining cluster..
[info] [swarm on [email protected]] [tracker:cluster_wait] found connected nodes: [:"[email protected]"]
[info] [swarm on [email protected]] [tracker:cluster_wait] selected sync node: [email protected]
*DBG* 'Elixir.Swarm.Tracker' receive cast {sync,<27946.629.0>} in state syncing
[info] [swarm on [email protected]] [tracker:syncing] there is a tie between syncing nodes, breaking with die roll (5)..
*DBG* 'Elixir.Swarm.Tracker' consume cast {sync,<27946.629.0>} in state syncing
*DBG* 'Elixir.Swarm.Tracker' receive cast {sync_begin_tiebreaker,<27946.629.0>,1} in state syncing
[info] [swarm on [email protected]] [tracker:syncing] there is a tie between syncing nodes, breaking with die roll (13)..
*DBG* 'Elixir.Swarm.Tracker' consume cast {sync_begin_tiebreaker,<27946.629.0>,1} in state syncing
[info] [swarm on [email protected]] [tracker:syncing] we won the die roll (13 vs 1), sending registry..
*DBG* 'Elixir.Swarm.Tracker' receive cast {sync_end_tiebreaker,<27946.629.0>,5,19} in state awaiting_sync_ack
*DBG* 'Elixir.Swarm.Tracker' postpone cast {sync_end_tiebreaker,<27946.629.0>,5,19} in state awaiting_sync_ack
*DBG* 'Elixir.Swarm.Tracker' receive cast {sync_recv,<27946.629.0>,{{0,1},0},[]} in state awaiting_sync_ack
*DBG* 'Elixir.Swarm.Tracker' consume cast {sync_recv,<27946.629.0>,{{0,1},0},[]} in state awaiting_sync_ack
*DBG* 'Elixir.Swarm.Tracker' receive cast {sync_ack,'[email protected]'} in state awaiting_sync_ack
[info] [swarm on [email protected]] [tracker:awaiting_sync_ack] received sync acknowledgement from [email protected]
*DBG* 'Elixir.Swarm.Tracker' consume cast {sync_ack,'[email protected]'} in state awaiting_sync_ack
[info] [swarm on [email protected]] [tracker:resolve_pending_sync_requests] pending sync requests cleared
*DBG* 'Elixir.Swarm.Tracker' consume cast {sync_end_tiebreaker,<27946.629.0>,5,19} in state tracking
[warn] [swarm on [email protected]] [tracker:handle_cast] unrecognized cast: {:sync_end_tiebreaker, #PID<27946.629.0>, 5, 19}

iex([email protected])2> Swarm.members(:notification_sender)
[#PID<0.888.0>, #PID<0.901.0>]

Here are the logs from the second node:

iex([email protected])2> Node.connect(:"[email protected]")
*DBG* 'Elixir.Swarm.Tracker' receive info {nodeup,'[email protected]',[{node_type,visible}]} in state tracking
true
iex([email protected])3> [info] [swarm on [email protected]] [tracker:ensure_swarm_started_on_remote_node] nodeup [email protected]
[info] [swarm on [email protected]] [tracker:cluster_wait] joining cluster..
*DBG* 'Elixir.Swarm.Tracker' consume info {nodeup,'[email protected]',[{node_type,visible}]} in state tracking
[info] [swarm on [email protected]] [tracker:cluster_wait] found connected nodes: [:"[email protected]"]
[info] [swarm on [email protected]] [tracker:cluster_wait] selected sync node: [email protected]
*DBG* 'Elixir.Swarm.Tracker' receive cast {sync,<29111.620.0>} in state syncing
[info] [swarm on [email protected]] [tracker:syncing] there is a tie between syncing nodes, breaking with die roll (1)..
*DBG* 'Elixir.Swarm.Tracker' consume cast {sync,<29111.620.0>} in state syncing
*DBG* 'Elixir.Swarm.Tracker' receive cast {sync_begin_tiebreaker,<29111.620.0>,5} in state syncing
[info] [swarm on [email protected]] [tracker:syncing] there is a tie between syncing nodes, breaking with die roll (19)..
*DBG* 'Elixir.Swarm.Tracker' consume cast {sync_begin_tiebreaker,<29111.620.0>,5} in state syncing
[info] [swarm on [email protected]] [tracker:syncing] we won the die roll (19 vs 5), sending registry..
*DBG* 'Elixir.Swarm.Tracker' receive cast {sync_end_tiebreaker,<29111.620.0>,1,13} in state awaiting_sync_ack
*DBG* 'Elixir.Swarm.Tracker' postpone cast {sync_end_tiebreaker,<29111.620.0>,1,13} in state awaiting_sync_ack
*DBG* 'Elixir.Swarm.Tracker' receive cast {sync_recv,<29111.620.0>,
         {{0,1},10},
         [{entry,<<"a2762c69-e1f7-4d56-b2d8-bc44bff3527a">>,<29111.888.0>,
              #Ref<29111.1533303619.3224371201.128918>,
              #{mfa =>
                    {'Elixir.ModNotifications.Workers.Sender',register,
                        [<<"a2762c69-e1f7-4d56-b2d8-bc44bff3527a">>,
                         {'Elixir.ModNotifications.Workers.Delayer',
                             #{delay => 14,
                               sender =>
                                   ...long content....}]},
                notification_sender => true},
              {{0,5},{1,5}}},
          {entry,<<"59635bd8-fc95-4134-8cf0-4459cdc87fe7">>,<29111.901.0>,
              #Ref<29111.1533303619.3224371201.129074>,
              #{mfa =>
                    {'Elixir.ModNotifications.Workers.Sender',register,
                        [<<"59635bd8-fc95-4134-8cf0-4459cdc87fe7">>,
                         {'Elixir.ModNotifications.Workers.Delayer',
                             #{delay => 29,
                               sender =>
                                   ...long content....}]},
                notification_sender => true},
              {{0,10},{1,10}}}]} in state awaiting_sync_ack
*DBG* 'Elixir.Swarm.Tracker' consume cast {sync_recv,<29111.620.0>,
         {{0,1},10},
         [{entry,<<"a2762c69-e1f7-4d56-b2d8-bc44bff3527a">>,<29111.888.0>,
              #Ref<29111.1533303619.3224371201.128918>,
              #{mfa =>
                    {'Elixir.ModNotifications.Workers.Sender',register,
                        [<<"a2762c69-e1f7-4d56-b2d8-bc44bff3527a">>,
                         {'Elixir.ModNotifications.Workers.Delayer',
                             #{delay => 14,
                               sender =>
                                   ...long content....}]},
                notification_sender => true},
              {{0,5},{1,5}}},
          {entry,<<"59635bd8-fc95-4134-8cf0-4459cdc87fe7">>,<29111.901.0>,
              #Ref<29111.1533303619.3224371201.129074>,
              #{mfa =>
                    {'Elixir.ModNotifications.Workers.Sender',register,
                        [<<"59635bd8-fc95-4134-8cf0-4459cdc87fe7">>,
                         {'Elixir.ModNotifications.Workers.Delayer',
                             #{delay => 29,
                               sender =>
                                   ...long content....}]},
                notification_sender => true},
              {{0,10},{1,10}}}]} in state awaiting_sync_ack
*DBG* 'Elixir.Swarm.Tracker' receive cast {sync_ack,'[email protected]'} in state awaiting_sync_ack
[info] [swarm on [email protected]] [tracker:awaiting_sync_ack] received sync acknowledgement from [email protected]
*DBG* 'Elixir.Swarm.Tracker' consume cast {sync_ack,'[email protected]'} in state awaiting_sync_ack
[info] [swarm on [email protected]] [tracker:resolve_pending_sync_requests] pending sync requests cleared
[warn] [swarm on [email protected]] [tracker:handle_cast] unrecognized cast: {:sync_end_tiebreaker, #PID<29111.620.0>, 1, 13}
*DBG* 'Elixir.Swarm.Tracker' consume cast {sync_end_tiebreaker,<29111.620.0>,1,13} in state tracking
Swarm.members(:notification_sender)
[]
iex([email protected])4> *DBG* 'Elixir.Swarm.Tracker' receive info {nodedown,'[email protected]',[{node_type,visible}]} in state tracking
[info] [swarm on [email protected]] [tracker:nodedown] nodedown [email protected]
[debug] [swarm on [email protected]] [tracker:handle_topology_change] topology change (nodedown for [email protected])
*DBG* 'Elixir.Swarm.Tracker' consume info {nodedown,'[email protected]',[{node_type,visible}]} in state tracking
[info] [swarm on [email protected]] [tracker:handle_topology_change] topology change complete

iex([email protected])4> Swarm.members(:notification_sender)
[]

Did I do or understand anything wrong?

I expected the second node to be able to find my workers and to get my workers if the first node crashes.

join / publish race condition

When a process joins a group and then immediately publishes in that group, the newly joined process doesn't receive the message. For example:

Swarm.join(:test, self())
Swarm.publish(:test, :ping)
# Doesn't receive :ping message

However, when adding a sleep:

Swarm.join(:test, self())
:timer.sleep(500)
Swarm.publish(:test, :ping)
# Does receive :ping message

Even though add_meta uses a call, it doesn't seem to get registered immediately. The behavior would probably also depend on the host machine being used.

I'm not sure if this is a bug or intended, but I didn't see any mention of this in the readme. Other people might bump into the same issue so I decided to report it. Could you please confirm if this is a bug or not? Thanks

Deadlock when registering process

First off, great job on the library! I can't believe how easy it was to incorporate it into my project. There is one issue that I found though. When spawning lots of processes very rapidly across multiple nodes, I often run into a deadlock. I've tracked the issue to this line

res = GenServer.call({__MODULE__, node}, {:register, name, mfa, groups}, 5_000)

It seems both nodes try to call each other's Swarm.Tracker at the same time and time out when none of them can reply.

node_whitelist and node_blacklist seem to be ignored

Swarm is starting processes on an iex session. My dev config looks like this (prod is using kubernetes and libcluster, the behavior happens in both):

iex> Application.get_all_env(:swarm)
[included_applications: [], node_blacklist: ["[email protected]"], debug: true]
iex> Application.get_all_env(:libcluster)
[topologies: [local: [strategy: Cluster.Strategy.Epmd,
   config: [hosts: [:"[email protected]", :"[email protected]",
     :"[email protected]"]]]], included_applications: []]

When I start the app like this: iex --name [email protected] --cookie foo -S mix run, processes managed by Swarm are spawned in my iex shell. (I'm running the latest from master.)

no function clause matching in :lib.is_op/2

Hi!

I'm currently building a distributed application consisting in 6 nodes (by now) with two different applications: 2 nodes with one app and 4 with the other (these last are the ones registering processes individually and in a group).

The thing is that, during deployment (using Kubernetes), we often get a "no function clause matching in :lib.is_op/2" exception on the nodes that are not being deployed (the ones that detect the topology changes).

For clustering we are using libcluster with the Kubernetes strategy, but we have no clue about how this error arises.

Also, we are not using handoff as the registered processes are stateless. Any idea what can be happening?

Here is the stacktrace:

lib/swarm/tracker/crdt.ex:68:in `leq/2'
lib/swarm/tracker/crdt.ex:79:in `compare/2'
lib/swarm/tracker/tracker.ex:342:in `syncing/3'
gen_statem.erl:1240:in `call_state_function/5'
gen_statem.erl:1012:in `loop_event/6'
proc_lib.erl:247:in `init_p_do_apply/3'

Thanks in advance for such a great pair of libraries!

Is the registry replicated?

If I have 1M registrations across my cluster, does each node contain a table of all the registrations? Or is consistent hashing used to determine which node a process belongs on?

I'm trying to understand if any part of Swarm isn't fully horizontally scalable.

Swarm.registered/0 returns [] but Swarm.whereis_name/1 returns the proper remote pid

Steps to reproduce this behaviour:

  1. Start a node
  2. Create a process X
  3. Start a second node, but do not connect it to the cluster
  4. Wait for "proceeding without sync" message on the second node
  5. Execute Swarm.registered/0 and Swarm.whereis_name/1 on the second node.

Swarm.registered() returns []
Swarm.whereis_name(X) returns pid of process in the first node

Why is swarm able to find the process by name but it is unable to track it as registered?

Node whitelisting doesn't prevent unexpected nodes from joining Swarm's clustering

Hi,

I'm new to Swarm, I find the following case doesn't work as expected, please advise if any misunderstanding.

1, Here is node whitelist config in config.exs:

config :swarm,
  debug: true,
  node_whitelist: [~r/^app[\d]@.*$/]

2, Here is libcluster config in dev.exs:

config :libcluster,
  topologies: [
    main: [
    strategy: Cluster.Strategy.Epmd,
    config: [hosts: [:"app1@server"]],
    connect: {:net_kernel, :connect, []},
  ]
]

3, I run the following commands to set up the nodes in order:
$iex --sname app1 -S mix
$iex --sname unknown -S mix

4, Referring to the Example in https://hexdocs.pm/swarm/readme.html, I start some workers on the unknown node, and then I find some workers started on the app1 node as expected, but some workers started on the unknown node unexpectedly. Why does Swarm's distribution algorithm use the excluded nodes (NOT in the node whitelist) to register workers?

Hex env:
Hex: 0.15.0
Elixir: 1.4.2
OTP: 19.2.3

Deps tree:
├── libcluster ~> 2.0 (Hex package)
│   └── poison ~> 3.0 (Hex package)
├── timex ~> 3.0 (Hex package)
│   ├── combine ~> 0.7 (Hex package)
│   ├── gettext ~> 0.10 (Hex package)
│   └── tzdata ~> 0.1.8 or ~> 0.5 (Hex package)
│       └── hackney ~> 1.0 (Hex package)
│           ├── certifi 1.0.0 (Hex package)
│           ├── idna 4.0.0 (Hex package)
│           ├── metrics 1.0.1 (Hex package)
│           ├── mimerl 1.0.2 (Hex package)
│           └── ssl_verify_fun 1.1.1 (Hex package)
├── sweet_xml ~> 0.6.5 (Hex package)
├── httpoison ~> 0.11.0 (Hex package)
│   └── hackney ~> 1.7.0 (Hex package)
├── swarm ~> 3.0 (Hex package)
│   ├── gen_state_machine ~> 2.0 (Hex package)
│   └── libring ~> 1.0 (Hex package)
└── distillery ~> 1.0 (Hex package)

Thanks

Using Swarm with Agents

My new course is about distributing Elixir, my way :)

I'm using Swarm for registration and pubsub. I might even do some fault-tolerant stuff if I have time.

I'm bumping into a minor roadblock.

I have an agent-based server:

defmodule Dictionary.WordList do

  @me {:via, Swarm, __MODULE__ }

  use Agent

  def start_link(_) do
    Agent.start_link(&word_list/0, name: @me)
  end

  def random_word() do
    Agent.get(@me, &Enum.random/1)
  end

  ...

When another node connects to the one running this server, Swarm sends a {:swarm, :begin_handoff} to the Agent, which crashes it.

Is there a way that I can continue to use an Agent but still register its PID via Swarm?

Cheers

Dave

recovered process doesn't get added to group

I have a test environment set up where I have two nodes talking to each other using Swarm. I start 2 processes, one on each node, with Swarm in group :foo. When I kill node 2, node 1 detects it and creates the pid locally, but it does not add the process to the Swarm group :foo.

I am not sure if this is a bug or if it is expected. If this is by design, how can I store this metadata so that when the process is re-initialized, I can add it to the right group?

More automated tests

The current test suite runs a rather simplistic test, but a decent one:

  • Starts two nodes
  • Connects them together
  • Starts Swarm on both
  • Creates 50 worker processes
  • Disconnects the two nodes, simulating a network partition
  • Checks to see that all 50 procs are registered properly on both sides of the partition
  • Joins the partitions
  • Checks to see that there are only 50 procs in the cluster, and that they are back on the nodes where they belong, and that the registry on both nodes is correct

Tests needed:

  • Basic test as described above
  • The same as the above, but adds/removes registrations from both sides of the partition and then ensures that the conflicts are correctly resolved
  • Same as the above 2, but with 3 nodes, make sure they agree
  • Tests with many short-lived processes, say create 100 processes which exit after 1s to create churn in the tracker, and ensure it does the right thing.
  • Test the original scenario, but brutally kill one of the nodes so that it doesn't have a chance to shut down gracefully.

Tracker becomes non-responsive

The Tracker is getting into a non-responsive state for swarm version 3.0.5 under the following circumstance:

  1. The Tracker has messages in the message queue which triggers a broadcast when handled
  2. one or more node goes down
  3. The handler calls :rpc.sbcast, which tries to send a message to all nodes, including the down nodes, and therefore only returns after the calls to the dead nodes time out. This continues until the nodedown messages are handled.

Our setup is a kubernetes cluster, where we have observed timeouts of 3-6 seconds before it discovers that a node is down. This makes the Tracker non-responsive until the nodedown message is handled, which potentially takes a lot of time.

A hotfix for this, until it is resolved, could be to call :rpc.abcast instead, since the info about the bad nodes are never really used anyway. Are there any issues with this approach? I can't see that it makes any difference other than missing warnings about the bad nodes.

A fix for this could be to look for nodedown messages in the message queue when bad nodes are discovered, and then handle the nodedown messages accordingly. I'm not sure this is the way to do it, just a thought.

Restart crashed process

Is there any way to restart a crashed process?

I tried to set restart: :permanent but my process is not restarted when I kill it with: Process.exit(pid, :kill).

My process is created like this:

defmodule Scheduler do
  def start_link(default) do
    case Swarm.register_name(__MODULE__, __MODULE__, :register, [default]) do
      {:ok, pid} ->
        Swarm.join(:notification_sender, pid)
        {:ok, pid}

      {:error, reason} ->
        {:error, reason}
    end
  end

  def register(default) do
    GenServer.start_link(__MODULE__, default)
  end

  # ... snip
end

The supervisor calling start_link is defined like this:

defmodule App.Application do
  use Supervisor

  def start_link(args \\ []) do
    Supervisor.start_link(__MODULE__, args, name: __MODULE__)
  end

  def init(_) do
    children = [
      %{
        id: Scheduler,
        start: {Scheduler, :start_link, [[]]}
      }
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end
end

GenStateMachine error for Erlang/OTP 19.1 and above

15:04:04.796 [error] GenStateMachine #PID<0.237.0> terminating
** (ErlangError) erlang error: {:bad_return_from_init, {:handle_event_function, nil, nil}}
    (stdlib) gen_statem.erl:596: :gen_statem.init_result/6
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
State: :undefined
Callback mode: :undefined

15:04:04.796 [error] GenStateMachine #PID<0.238.0> terminating
** (ErlangError) erlang error: {:bad_return_from_init, {:state_functions, nil, nil}}
    (stdlib) gen_statem.erl:596: :gen_statem.init_result/6
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
State: :undefined
Callback mode: :undefined

Here's the issue on the gen_state_machine repo
ericentin/gen_state_machine#7

Handoff during node shutdown

I encountered an issue where a handoff was started while the destination node was being killed.

Here is the gist of the logs from the alive node (there is nothing interesting in the killed node): https://gist.github.com/FabienHenon/84d985b370bc76826f46bbebf4ff7563

As you can see from the logs, I have 4 processes, one of which is on the other node. Then I killed the other node (as the logs show), and just after that there is an [error] Scheduler BEGIN_HANDOFF %{} log, which means a handoff started on my scheduler process (#PID<0.792.0>). A few lines below there is another log, [error] Scheduler DIE %{}, which means handle_info with :die was called on my scheduler process.
Finally, the process located on the killed node has been restarted on the alive node, but I have 1 missing process: the scheduler.

Having a global process using Swarm

Hi,

Maybe my use case is not a great fit for the way Swarm is meant to be used, but I thought it was worth trying.

What I would like is:

  • Having a global process across my nodes (let's say a counter, but it could be a process with more complicated state and logic)
  • I need to be able to increment that counter from any node and to read its value from any node
  • I need to launch this process at application startup
  • When I start a new node (a counter is created at startup) and connect it to the cluster, I want to keep only ONE counter in the entire cluster and kill the other
  • But I also need the dying counter to be able to tell the surviving counter what its state is, so that the survivor can choose to merge it with its own state
  • Finally, when I kill the node hosting the counter, I need that counter to be recreated on another node with its current state

I first thought about naming my counter with {:global, :counter}, but as soon as I connect 2 nodes together, all counters die because of the already globally registered counter.

Then I tried with Swarm. I started my counter and registered it to Swarm like this:

def start_link(_) do
  case Swarm.register_name(__MODULE__, __MODULE__, :register, []) do
    {:ok, pid} ->
      Swarm.join(:notification_sender, pid)
      {:ok, pid}

    {:error, reason} ->
      {:error, reason}
  end
end

def register() do
  GenServer.start_link(__MODULE__, 0)
end

The start_link function is called by the main Supervisor.
This works almost fine! When I start a node, a counter is created, and when I connect 2 nodes together, only one counter is kept.
However, this does not cover my last 2 points: giving the state of the dying counter to the surviving one, and recreating a counter with the correct state (like when returning :resume during handoff).
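
For reference, my understanding of the handoff and conflict messages (an untested sketch based on the handoff documentation; the merge-by-addition in resolve_conflict is my own choice) is something like this:

defmodule Counter do
  use GenServer

  def init(count), do: {:ok, count}

  # Swarm asks for the state before moving the process to another node;
  # replying {:resume, count} requests a handoff carrying the state.
  def handle_call({:swarm, :begin_handoff}, _from, count) do
    {:reply, {:resume, count}, count}
  end

  # The handed-off state arrives here on the process started on the new node.
  def handle_cast({:swarm, :end_handoff, remote_count}, _count) do
    {:noreply, remote_count}
  end

  # After a netsplit heals, the surviving process receives the state of the
  # duplicate that is about to be killed, and can merge it.
  def handle_cast({:swarm, :resolve_conflict, remote_count}, count) do
    {:noreply, count + remote_count}
  end

  # Swarm tells the losing duplicate to shut down.
  def handle_info({:swarm, :die}, count) do
    {:stop, :shutdown, count}
  end
end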

Is there any solution with Swarm for this? Or using any other OTP feature?

If there is no solution, I may try something different, like keeping one counter per node in the same Swarm group and broadcasting increments to each process in the group so that the counters stay synchronized (see the sketch below). Is that a better solution? Should I register each process with Swarm.register_name(name, pid) so that my counters stay on their own node? Or is there another way to keep my counters on their parent node?
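
If I go that route, I imagine the setup would look roughly like this (a sketch only; it assumes register_name/2 leaves the process on the node where it was started, and that the counter handles an {:increment, n} message and a :get call):

# Per-node setup, run on every node:
{:ok, pid} = GenServer.start_link(Counter, 0)
:yes = Swarm.register_name({:counter, Node.self()}, pid)
:ok = Swarm.join(:counters, pid)

# Incrementing from any node: send the message to every member of the group.
Swarm.publish(:counters, {:increment, 1})

# Reading: collect a reply from every member and merge/pick on the caller side.
counts = Swarm.multi_call(:counters, :get)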

Thank you for your help!

Slow startup times when running tests

When using Swarm 2.x, I noticed it would take about 2 seconds to execute mix test in my project, even though the tests themselves only took about 0.1 seconds to run.

When I upgraded to 3.x, this time went up to about 5 seconds.

I have a slightly modified version of the example code in the README that isolates the problem; you can see it here: https://github.com/pdilyard/swarm_performance.

  1. Clone that repo
  2. mix test --only without_swarm (starts a process using the global registry)
  3. mix test --only with_swarm (starts a process using swarm registry)

Note how much longer it takes to register with Swarm initially.

Evaluating our use case

I'm trying to figure out whether Swarm can be used to fulfill some of our needs. We're running an application inside a Kubernetes cluster, and we'd like to make a few things happen: certain processes should retain state through a Kubernetes-level rolling update, and certain GenServers should always be running exactly once somewhere in the cluster. At first I thought we could use Swarm pretty easily, but from looking at #11 it seems that isn't currently part of the implementation. I've already got the clustering set up with libcluster. If Swarm isn't the right tool for this, any recommendations for alternative solutions that would cover our use cases would be much appreciated.

Ubuntu 16.04 Could not start application swarm

I'm working on an umbrella app containing one Phoenix application and one regular Elixir application.
Swarm is used in the latter.

When developing on macOS everything works fine, but when I try to get it running on Ubuntu 16.04, the app won't start.

These are the versions of Swarm, Elixir and Erlang:
Swarm 3.0.5
Elixir 1.4.1
Erlang/OTP 19

I'm getting the following output; any ideas on how to troubleshoot this?
Could it be related to the fact that I'm running it on a single node?

[info] [swarm on nonode@nohost] [tracker:init] started

=INFO REPORT==== 21-Feb-2017::19:56:03 ===
    application: logger
    exited: stopped
    type: temporary
** (Mix) Could not start application swarm: Swarm.start(:normal, []) returned an error: shutdown: failed to start child: Swarm.Tracker
    ** (EXIT) {:bad_return_from_init, {:state_functions, :cluster_wait, %Swarm.Tracker.TrackerState{clock: nil, nodes: [], pending_sync_reqs: [], ring: #<Ring[:nonode@nohost]>, self: :nonode@nohost, sync_node: nil, sync_ref: nil}}}

Node Crash

Great library! Exactly what I was going to build myself, but thankfully I didn't have to. I've been testing out certain scenarios to see if it's appropriate for my use case. Currently I'm testing with two nodes on my local machine; when I kill one, the started processes don't seem to transfer over. Calls to the process fail with no connection to node@localhost. When I restart the crashed node, everything seems to be fine.

Is there something I'm missing? Thanks

Pub/sub events for registrations

Is it in any way possible to "subscribe" to a Swarm-registered process? Similar to gproc's gproc_monitor:subscribe.

I want a process to be notified when a name becomes registered / unregistered (crash or shutdown). I can do this by looking up the Swarm name, using Process.monitor on the pid, and keeping track of the pid <-> name relation until it crashes (see the sketch below), but that's a lot of extra (and fragile) work.
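
For the record, the workaround I have in mind looks roughly like this (a sketch; NameWatcher is my own module, not a Swarm API):

defmodule NameWatcher do
  use GenServer

  def start_link(name), do: GenServer.start_link(__MODULE__, name)

  # Look the name up once and monitor the pid we find.
  def init(name) do
    case Swarm.whereis_name(name) do
      :undefined -> {:stop, :not_registered}
      pid -> {:ok, %{name: name, ref: Process.monitor(pid)}}
    end
  end

  # The registered process died or became unreachable; this is the event I
  # would prefer to receive from Swarm itself as an (un)registration notification.
  def handle_info({:DOWN, ref, :process, _pid, reason}, %{ref: ref, name: name} = state) do
    IO.puts("#{inspect(name)} went down: #{inspect(reason)}")
    {:stop, :normal, state}
  end
end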

Are there any plans of adding such a feature?

Node affinity / bias `key_to_node`

It would be great to have a way to override key_to_node so that it hashes something other than whatever is passed in as the name.

Ideally, it would be a function passed via config, that would be called every time something is about to be hashed.

Imagine a process named {Vote.Logic, 1} and its state process named {Vote.Logic.State, 1}. When they get started, their names are hashed independently, so they can end up running on different nodes, adding a lot of network latency between the two processes.
My suggestion is to allow Swarm to be configured with a name_to_key function that would be called every time something is about to be hashed, such as:

config :swarm, name_to_key: fn name ->
  case name do
    {_module, identifier} -> identifier
    name -> name
  end
end

Swarm would call this function and end up hashing its result, not the name of the process, therefore allowing processes that should stay close to one another to run on the same node.

Deadlock during dynamic cluster bringup

  • Erlang/OTP 20
  • Elixir 1.5.1
  • Swarm 3.1.0

I may have observed a deadlock with simultaneous bringup of 3 nodes. In this case all 3 trackers are stuck in :syncing or :cluster_wait states, and they do not recover.

16:48:35.537 [info] [swarm on [email protected]] [tracker:init] started
16:48:35.978 [info] [swarm on [email protected]] [tracker:ensure_swarm_started_on_remote_node] nodeup [email protected]
16:48:36.186 [info] [swarm on [email protected]] [tracker:ensure_swarm_started_on_remote_node] nodeup [email protected]
16:48:40.538 [info] [swarm on [email protected]] [tracker:cluster_wait] joining cluster..
16:48:40.538 [info] [swarm on [email protected]] [tracker:cluster_wait] found connected nodes: [:"[email protected]", :"[email protected]"]
16:48:40.538 [info] [swarm on [email protected]] [tracker:cluster_wait] selected sync node: [email protected]
16:48:40.636 [info] [swarm on [email protected]] [tracker:syncing] pending sync request from [email protected]
16:51:14.556 [info] [swarm on [email protected]] [tracker:syncing] pending sync request from [email protected]
16:48:35.633 [info] [swarm on [email protected]] [tracker:init] started
16:48:35.978 [info] [swarm on [email protected]] [tracker:ensure_swarm_started_on_remote_node] nodeup [email protected]
16:48:36.192 [info] [swarm on [email protected]] [tracker:ensure_swarm_started_on_remote_node] nodeup [email protected]
16:48:40.634 [info] [swarm on [email protected]] [tracker:cluster_wait] joining cluster..
16:48:40.635 [info] [swarm on [email protected]] [tracker:cluster_wait] found connected nodes: [:"[email protected]", :"[email protected]"]
16:48:40.635 [info] [swarm on [email protected]] [tracker:cluster_wait] selected sync node: [email protected]
16:48:40.904 [info] [swarm on [email protected]] [tracker:syncing] pending sync request from [email protected]
16:49:38.217 [info] [swarm on [email protected]] [tracker:syncing] pending sync request from [email protected]
16:48:35.899 [info] [swarm on [email protected]] [tracker:init] started
16:48:36.182 [info] [swarm on [email protected]] [tracker:ensure_swarm_started_on_remote_node] nodeup [email protected]
16:48:36.189 [info] [swarm on [email protected]] [tracker:ensure_swarm_started_on_remote_node] nodeup [email protected]
16:48:40.535 [info] [swarm on [email protected]] [tracker:cluster_wait] pending sync request from [email protected]
16:48:40.900 [info] [swarm on [email protected]] [tracker:cluster_wait] joining cluster..
16:48:40.901 [info] [swarm on [email protected]] [tracker:cluster_wait] found connected nodes: [:"[email protected]", :"[email protected]"]
16:48:40.901 [info] [swarm on [email protected]] [tracker:cluster_wait] selected sync node: [email protected]

Some kind of timing-related issue

I have three nodes that I start independently.

One acts as a meeting place. The other two do a Node.connect to it when they start, retrying until it becomes available.

Swarm is running on all three, and starts before any nodes are connected.

When I start them manually in three tmux panes, everything is great.

If I start them via a script, it starts up OK, but one or more of them will then error out with a variant of

17:38:07.758 [error] ** State machine Swarm.Tracker terminating
** Last event = {:cast, {:sync, #PID<15567.213.0>, nil}}
** When server state  = {:syncing, %Swarm.Tracker.TrackerState{clock: nil, nodes: [:"[email protected]"], pending_sync_reqs: [#PID<15567.179.0>, #PID<15565.174.0>], self: :"[email protected]", strategy: #<Ring[:"[email protected]", :"[email protected]"]>, sync_node: :"[email protected]", sync_ref: #Reference<0.1508653593.3076784129.75024>}}
** Reason for termination = :error::function_clause
** Callback mode = :state_functions
** Stacktrace =
**  [{Swarm.IntervalTreeClock, :leq, [nil, nil], [file: 'lib/swarm/tracker/crdt.ex', line: 68]}, {Swarm.IntervalTreeClock, :compare, 2, [file: 'lib/swarm/tracker/crdt.ex', line: 79]}, {Swarm.Tracker, :syncing, 3, [file: 'lib/swarm/tracker/tracker.ex', line: 342]}]

If I introduce a delay of two seconds between each start, I get this:


17:44:08.213 [info]  [swarm on [email protected]] [tracker:init] started

17:44:08.217 [info]  [swarm on [email protected]] [tracker:cluster_wait] pending sync request from [email protected]
** (Mix) Could not start application dictionary: Dictionary.Application.start(:normal, []) returned an error: shutdown: failed to start child: Dictionary.WordList
    ** (EXIT) exited in: :gen_statem.call(Swarm.Tracker, {:whereis, Dictionary.WordList}, :infinity)
        ** (EXIT) an exception was raised:
            ** (FunctionClauseError) no function clause matching in Swarm.IntervalTreeClock.leq/2
                (swarm) lib/swarm/tracker/crdt.ex:68: Swarm.IntervalTreeClock.leq(nil, nil)
                (swarm) lib/swarm/tracker/crdt.ex:79: Swarm.IntervalTreeClock.compare/2
                (swarm) lib/swarm/tracker/tracker.ex:342: Swarm.Tracker.syncing/3

17:44:08.261 [error] ** State machine Swarm.Tracker terminating
** Last event = {:cast, {:sync, #PID<15618.179.0>, nil}}
** When server state  = {:syncing, %Swarm.Tracker.TrackerState{clock: nil, nodes: [:"[email protected]", :"[email protected]"], pending_sync_reqs: [], self: :"[email protected]", strategy: #<Ring[:"[email protected]", :"[email protected]", :"[email protected]"]>, sync_node: :"[email protected]", sync_ref: #Reference<0.441710707.3882090498.99165>}}
** Reason for termination = :error::function_clause
** Callback mode = :state_functions
** Postponed = [{{:call, {#PID<0.190.0>, #Reference<0.441710707.3882090497.101952>}}, {:whereis, Dictionary.WordList}}]
** Stacktrace =
**  [{Swarm.IntervalTreeClock, :leq, [nil, nil], [file: 'lib/swarm/tracker/crdt.ex', line: 68]}, {Swarm.IntervalTreeClock, :compare, 2, [file: 'lib/swarm/tracker/crdt.ex', line: 79]}, {Swarm.Tracker, :syncing, 3, [file: 'lib/swarm/tracker/tracker.ex', line: 342]}]


17:44:08.262 [info]  Application dictionary exited: Dictionary.Application.start(:normal, []) returned an error: shutdown: failed to start child: Dictionary.WordList
    ** (EXIT) exited in: :gen_statem.call(Swarm.Tracker, {:whereis, Dictionary.WordList}, :infinity)
        ** (EXIT) an exception was raised:
            ** (FunctionClauseError) no function clause matching in Swarm.IntervalTreeClock.leq/2
                (swarm) lib/swarm/tracker/crdt.ex:68: Swarm.IntervalTreeClock.leq(nil, nil)
                (swarm) lib/swarm/tracker/crdt.ex:79: Swarm.IntervalTreeClock.compare/2
                (swarm) lib/swarm/tracker/tracker.ex:342: Swarm.Tracker.syncing/3

Something I'm doing wrong?

Dave

Tests based on process groups

What is the best way to implement tests that depend on process groups? Because Swarm.Tracker.add_meta/3 uses GenStateMachine.cast/2, it is possible (and frequent) for test code to run before a process has joined a group. This creates race conditions, which I've fixed for now just by using :timer.sleep/1. However, I don't feel that is the most practical or reliable solution, so I was wondering if there's a better way to create reliable tests.

(This is at least my theory; feel free to correct me if I'm wrong.)
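
For now I am leaning towards replacing the fixed :timer.sleep/1 with a small polling helper along these lines (a test-only sketch using Swarm.members/1):

defmodule SwarmTestHelpers do
  # Poll the group until the pid shows up or the deadline passes.
  def wait_until_member(group, pid, timeout \\ 1_000) do
    deadline = System.monotonic_time(:millisecond) + timeout
    do_wait(group, pid, deadline)
  end

  defp do_wait(group, pid, deadline) do
    cond do
      pid in Swarm.members(group) ->
        :ok

      System.monotonic_time(:millisecond) > deadline ->
        {:error, :timeout}

      true ->
        Process.sleep(10)
        do_wait(group, pid, deadline)
    end
  end
end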

Question: Cleanly killing a process

Read through the docs and didn't see anything related to this, so thought I'd ask here.

Is there an intended way to kill a process and have it be cleanly removed from the Swarm cluster and its tracking (as well as have the code on the process stop running), or is Process.exit(pid, :kill) or Process.exit(pid, :normal) sufficient?
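
For context, what I am doing at the moment looks roughly like this (a sketch; whether the explicit unregister is actually needed is exactly what I'm unsure about):

def stop_worker(name) do
  case Swarm.whereis_name(name) do
    :undefined ->
      :ok

    pid ->
      # Explicitly drop the registration, then stop the process normally.
      Swarm.unregister_name(name)
      GenServer.stop(pid, :normal)
  end
end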

Problems with example

Hi Bitwalker,

Apologies, but I am still finding problems:

Is there a missing / in

config :swarm,
  node_whitelist: [~r/^myapp-[\d]@.*$]

i.e. should it be:

config :swarm,
  node_whitelist: [~r/^myapp-[\d]@.*$/]

Also:

def start_worker(name) do
  {:ok, pid} = Swarm.register_name(name, MyApp.Supervisor, :register, [name])
  Swarm.join(:foo, pid)
end

But MyApp.Supervisor does not exist, since you named the module 'defmodule MyApp.WorkerSup do'.

It may be worth mentioning that you can't test this on the same host - it needs to be on 2 or more machines :-) I was trying to run it in 2 terminal sessions and it didn't want to know. I also got tripped up by the old "cookie not the same" problem when I did run it across 2 machines :-)

Thanks for your work on this!

regards

Shifters

Create ASCII art diagram of the tracker FSM

The tracker is a complex finite state machine, and in order to support contributions and future maintenance, a visual diagram of that state machine would be immensely helpful. Creating it in ASCII form is not strictly necessary, but doing so means it can also be included in the module docs, which I would like to do.

Handover begins before Supervision tree is fully loaded

Swarm: 3.2.1
Elixir: 1.6.1
Erlang: 20.2.2

Suppose we have two nodes A and B.
We have a worker W which was spawned on node A. Now node A crashes and B takes over W as expected.
When A starts again, Swarm tries to hand W back over. But in our case this happens before W's supervisor, Test.Supervisor, has started. Swarm retries the handover and succeeds after one or two retries, but nevertheless an exception like the following is thrown:

[error] [swarm on [email protected]] [tracker:handle_handoff] ** (exit) exited in: GenServer.call(Test.Supervisor, {:start_child, [:state]}, :infinity)
    ** (EXIT) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
    (elixir) lib/gen_server.ex:821: GenServer.call/3
    (swarm) lib/swarm/tracker/tracker.ex:646: Swarm.Tracker.handle_handoff/3
    (stdlib) gen_statem.erl:1240: :gen_statem.call_state_function/5
    (stdlib) gen_statem.erl:1012: :gen_statem.loop_event/6
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3

Maybe there is a way to tell when node A is fully up again (with the whole supervision tree loaded) so that Swarm can safely begin the handoff.
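
One stopgap on our side (not a change to Swarm) could be to make the registration function we hand to Swarm wait for the supervisor before delegating to it. A rough sketch, where Test.Registry.register/1 is a hypothetical wrapper around the Supervisor.start_child call shown in the error above:

defmodule Test.Registry do
  @retries 50
  @retry_interval 100

  # The MFA registered with Swarm; retries until Test.Supervisor is alive so
  # that an early handoff attempt doesn't call into a missing process.
  def register(args), do: do_register(args, @retries)

  defp do_register(args, attempts) do
    case Process.whereis(Test.Supervisor) do
      nil when attempts > 0 ->
        Process.sleep(@retry_interval)
        do_register(args, attempts - 1)

      nil ->
        {:error, :supervisor_not_started}

      _pid ->
        # simple_one_for_one style, matching the {:start_child, [:state]} call
        # in the error log above.
        Supervisor.start_child(Test.Supervisor, args)
    end
  end
end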
