isaacs / cluster-master

Take advantage of node built-in cluster module behavior

License: ISC License

cluster-master's Introduction

cluster-master

A module for taking advantage of the built-in cluster module in node v0.8 and above.

Your main server.js file uses this module to fire up a cluster of workers. Those workers then do the actual server stuff (using socket.io, express, tako, raw node, whatever; any TCP/TLS/HTTP/HTTPS server would work.)

This module provides some basic functionality to keep a server running. As the name implies, it should only be run in the master module, not in any cluster workers.

var clusterMaster = require("cluster-master")

// most basic usage: just specify the worker
// Spins up as many workers as you have CPUs
//
// Note that this is VERY WRONG for a lot of multi-tenanted
// VPS environments where you may have 32 CPUs but only a
// 256MB RSS cap or something.
clusterMaster("worker.js")

// more advanced usage.  Specify configs.
// in real life, you can only actually call clusterMaster() once.
clusterMaster({ exec: "worker.js" // script to run
              , size: 5 // number of workers
              , env: { SOME: "environment_vars" }
              , args: [ "--deep", "doop" ]
              , silent: true
              , signals: false
              , onMessage: function (msg) {
                  console.error("Message from %s %j"
                               , this.uniqueID
                               , msg)
                }
              })

// methods
clusterMaster.resize(10)

// graceful rolling restart
clusterMaster.restart()

// graceful shutdown
clusterMaster.quit()

// not so graceful shutdown
clusterMaster.quitHard()
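
The note above about CPU count matters because each worker is a full node process with its own memory. An RSS (resident set size) cap is a limit on the physical memory your processes may keep resident, so on a host with many CPUs and a small cap, one worker per CPU can exceed it. A minimal sketch of picking a size from both limits follows. workerCountFor is a hypothetical helper, not part of cluster-master, and the 64MB per-worker figure is only an assumed estimate that you would measure for your own app.

```javascript
var os = require("os")

// Hypothetical helper (not part of cluster-master): choose a worker
// count bounded by both the CPU count and a memory cap.
function workerCountFor (memoryCapBytes, perWorkerBytes, cpuCount) {
  var cpus = cpuCount || os.cpus().length
  var byMemory = Math.floor(memoryCapBytes / perWorkerBytes)
  // always keep at least one worker, never more than one per CPU
  return Math.max(1, Math.min(cpus, byMemory))
}

var MB = 1024 * 1024
// 32 CPUs, 256MB cap, assumed ~64MB per worker
var size = workerCountFor(256 * MB, 64 * MB, 32)
console.log(size) // prints 4
```

The result could then be passed as the size config, for example clusterMaster({ exec: "worker.js", size: size }).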

Methods

clusterMaster.resize(n)

Set the cluster size to n. This will disconnect extra nodes and/or spin up new nodes, as needed. Done by default on restarts.

clusterMaster.restart(cb)

One by one, shut down nodes and spin up new ones. Callback is called when finished.
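
The "one by one" behavior can be pictured as a sequential loop over callbacks. This is only a sketch of the idea, not cluster-master's actual implementation. rollingRestart and replaceWorker are hypothetical names, and replaceWorker stands in for whatever disconnects one old worker, forks its replacement, and calls back once the new worker is ready.

```javascript
// Sketch only: replace workers strictly one at a time, then call done.
function rollingRestart (workerIds, replaceWorker, done) {
  var pending = workerIds.slice()
  function next () {
    if (pending.length === 0) {
      if (done) done()
      return
    }
    // only move on once the current replacement has called back
    replaceWorker(pending.shift(), next)
  }
  next()
}

// Usage with a stand-in replaceWorker that just records the order
var order = []
rollingRestart([ 1, 2, 3 ], function (id, cb) {
  order.push(id)
  cb()
}, function () {
  console.log("restarted in order:", order.join(", "))
})
```

Because only one worker is being replaced at any moment, the remaining workers can keep accepting connections during the restart.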

clusterMaster.quit()

Gracefully shut down the worker nodes and then process.exit(0).

clusterMaster.quitHard()

Forcibly shut down the worker nodes and then process.exit(1).

Configs

The exec, env, args, and silent configs are passed to the cluster.fork() call directly, and have the same meaning.

exec - The worker script to run
env - Envs to provide to workers
args - Additional args to pass to workers.
silent - Boolean, default=false. Do not share stdout/stderr
size - Starting cluster size. Default = CPU count
signals - Boolean, default=true. Set up listeners to:
- SIGHUP - restart
- SIGINT - quit
onMessage - Method that gets called when workers send a message to the parent. Called in the context of the worker, so this is the worker object and you can reply through it.
repl - Where the REPL listens. Defaults to env.CLUSTER_MASTER_REPL || 'cluster-master-socket'
- if repl is null or false - the REPL is disabled and will not be started
- if repl is a string path - the REPL will listen on a unix domain socket at that path
- if repl is an integer port - the REPL will listen on TCP 0.0.0.0:port
- if repl is an object with address and port - the REPL will listen on TCP address:port
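
As an illustration of the onMessage entry, a handler might look like the sketch below. The ping/pong message shape is invented for the example. The only assumption about the worker is that it has a send() method, as worker objects in node's cluster module do. The fake worker at the end is there only so the handler can be exercised without forking anything.

```javascript
// Handler suitable for the onMessage config. `this` is the worker
// that sent the message, so replies go back through this.send().
function onMessage (msg) {
  if (msg && msg.cmd === "ping") {
    this.send({ cmd: "pong" })
  }
}

// Exercise the handler with a stand-in worker object
var replies = []
var fakeWorker = { send: function (m) { replies.push(m) } }
onMessage.call(fakeWorker, { cmd: "ping" })
onMessage.call(fakeWorker, { cmd: "something else" })
console.log(replies)
```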

Examples of configuring repl

var config = { repl: false }                       // disable REPL
var config = { repl: '/tmp/cluster-master-sock' }  // unix domain socket
var config = { repl: 3001 }                        // tcp socket 0.0.0.0:3001
var config = { repl: { address: '127.0.0.1', port: 3002 }}  // tcp 127.0.0.1:3002

Note: be careful when using TCP for your REPL since anyone on the network can connect to your REPL (no security). So either disable the REPL or use a unix domain socket which requires local access (or ssh access) to the server.
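
The repl rules listed above amount to a small normalization step. The sketch below restates them as code. replListenTarget is a hypothetical helper written for this example and is not cluster-master's own implementation. In particular, treating every string (including one taken from CLUSTER_MASTER_REPL) as a socket path is an assumption.

```javascript
// Hypothetical helper: turn a repl config value into either null
// (REPL disabled) or an options object of the kind net's listen() accepts.
function replListenTarget (repl, env) {
  env = env || process.env
  if (repl === undefined) {
    repl = env.CLUSTER_MASTER_REPL || "cluster-master-socket"
  }
  if (repl === null || repl === false) return null
  if (typeof repl === "string") return { path: repl }
  if (typeof repl === "number" && repl % 1 === 0) {
    return { host: "0.0.0.0", port: repl }
  }
  if (typeof repl === "object" && typeof repl.port === "number") {
    return { host: repl.address, port: repl.port }
  }
  throw new TypeError("unsupported repl config: " + repl)
}

console.log(replListenTarget(false))
console.log(replListenTarget("/tmp/cluster-master-sock"))
console.log(replListenTarget(3001))
console.log(replListenTarget({ address: "127.0.0.1", port: 3002 }))
```

Binding to 127.0.0.1 instead of 0.0.0.0, as in the last call, keeps a TCP REPL reachable only from the local machine.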

REPL

Cluster-master provides a REPL into the master process so you can inspect the state of your cluster. By default the REPL is accessible through a unix domain socket named cluster-master-socket in the master process's working directory, but you can override it with the CLUSTER_MASTER_REPL environment variable. You can access the REPL with nc or socat like so:

nc -U ./cluster-master-socket

# OR

socat ./cluster-master-socket stdin

The REPL provides you with access to these objects or functions:

help - display these commands
repl - access the REPL
resize(n) - resize the cluster to n workers
restart(cb) - gracefully restart workers, cb is optional
stop() - gracefully stop workers and master
kill() - forcefully kill workers and master
cluster - node.js cluster module
size - current cluster size
connections - number of REPL connections to master
workers - current workers
select(fld) - map of id to field (from workers)
pids - map of id to pids
ages - map of id to worker ages
states - map of id to worker states
debug(a1) - output a1 to stdout and all REPLs
sock - this REPL socket
.exit - close this connection to the REPL
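
The select(fld), pids, ages, and states entries all describe maps keyed by worker id. A minimal sketch of such a mapping is shown below, assuming workers is a plain object keyed by id whose values carry the fields of interest. That shape, and this standalone select function taking workers as an argument, are assumptions for illustration rather than the REPL's actual code.

```javascript
// Sketch: build a map of worker id -> one field of each worker.
function select (workers, field) {
  var out = {}
  Object.keys(workers).forEach(function (id) {
    out[id] = workers[id][field]
  })
  return out
}

// Made-up worker data for the example
var workers = {
  1: { pid: 4001, state: "listening" },
  2: { pid: 4002, state: "online" }
}
console.log(select(workers, "pid"))
console.log(select(workers, "state"))
```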

cluster-master's Issues

Possible usage for fs.createReadStream

Hi,

Is it possible to use this module to handle reading files?

Thanks

There is no way to disconnect output to console

All the debug outputs are always printed on the console.

There doesn't appear to be a way to get a handle on the workers

I am trying to get my hands on the forked workers but I can't seem to find a way. I've looked at the code and it appears that the workers are spawned and forgotten.

Is this correct? If so would it be possible to add handles and a method to access them?

I apologize if I'm missing something obvious, I'm new to node.js processes.

if workers send a 'field' message, then attach it to its worker object in the repl

I.e., a worker should be able to do process.send({cmd: 'field', key: 'gitsha', value: 'dead134fdecafbad00aaabbb' }) and then, in the workers object in the repl, you'll see gitsha: 'dead134fdecafbad00aaabbb'
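
A master-side handler for this proposal might look like the following sketch. applyFieldMessage and the fields object are hypothetical names. How the result would actually be attached to the worker objects shown in the repl is left open.

```javascript
// Sketch: copy a { cmd: "field", key, value } message onto a
// per-worker fields object. Returns true if the message was applied.
function applyFieldMessage (fields, msg) {
  if (!msg || msg.cmd !== "field" || typeof msg.key !== "string") {
    return false
  }
  fields[msg.key] = msg.value
  return true
}

// Object.create(null) avoids clashes with inherited property names
var fields = Object.create(null)
applyFieldMessage(fields, { cmd: "field", key: "gitsha", value: "dead134fdecafbad00aaabbb" })
console.log(fields.gitsha)
```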

repl doesn't work when running as root

I started my node server app from root's crontab, using cluster-master, and even though I had "repl:3001" set, the repl does not seem to have started.
netstat showed no listener on 3001.
On startup, I did see the message: "resize and then setup repl"

crontab invoked this script:

#!/bin/sh

cd /home/jloveman/callhome2
/usr/bin/screen -dmS callhome2 sudo node /home/jloveman/callhome2/cluster_worker.js

repl: workers array objects should have more options

disconnect -> cluster.workers[worker.id].disconnect()

kill -> cluster.workers[worker.id].process.kill()

Others?

What is "256MB RSS cap"

end event listener registration error

Hi all, I've been working on a tool to identify instances of events registered to the wrong object in uses of some JavaScript event-driven APIs, as part of a research project.
The tool flagged line 213 in cluster-master.js in the root of this repository, on the registration of the “end” event.

The reason I believe this is indicative of an error is as follows (from looking at the nodejs repl API documentation).
This listener for “end” is registered on variable r, which is an object of type repl.REPLServer, initialized on line 138 by a call to repl.start(). However, “end” is not an event emitted on repl.REPLServer.

My guess is that maybe instead you should be listening for “exit” (an event on repl.REPLServer), or “close” (an event on readline.Interface, which repl.REPLServer extends).

Thanks!

frequently two disconnects will be required to disconnect a worker.

Not sure why this is. It's like the first one is being ignored.

Maybe because npm-www starts up 2 servers?

The application shuts down with error 'Cannot set property 'lookup' of undefined(dgram.js:147:20)'

We recently moved from pm2 to supervisor with cluster-master. Yesterday the application went down due to a fatal error, and I have this in the logs:

dgram.js:147
  newHandle.lookup = self._handle.lookup;
                                 ^
TypeError: Cannot set property 'lookup' of undefined
    at replaceHandle (dgram.js:147:20)

Could you please give some insight on this, such as whether the issue is caused by cluster-master or by a wrong implementation on my side?

SIGKILL isn't catchable

And node knows it:

process.on("SIGKILL", function(){})
Error: uv_signal_start EINVAL

I think you're always hitting your catch.
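
The behavior can be checked with a small probe. canListenFor is a hypothetical helper for this example. It relies only on process.on() throwing when a handler cannot be installed, as the error above shows.

```javascript
// Probe whether a handler can be installed for a signal.
// Returns false when process.on() throws (as it does for SIGKILL).
function canListenFor (signal) {
  var noop = function () {}
  try {
    process.on(signal, noop)
  } catch (err) {
    return false
  }
  // clean up so the probe leaves no handler behind
  process.removeListener(signal, noop)
  return true
}

console.log("SIGHUP:", canListenFor("SIGHUP"))
console.log("SIGKILL:", canListenFor("SIGKILL"))
```

A check like this, done once at startup, would let the signal setup skip SIGKILL instead of relying on a catch block.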

Logs streams, worker events, REPL...

This project has not had a lot of activity. I would like to see some new features like:

Transform streams for worker stdout, so you could add the process id etc. to logs bubbling up from the workers.
onMessage is supported, but what about exit, online, etc.?
.exit in the REPL is not exactly right; I think sock.end() is what people are looking for. Update the documentation/implementation.
Additional REPL features (some listed elsewhere), but I would like to be able to attach stdout for a given worker to the REPL...

Should I make some PRs or should I fork the project and start anew?

add a license please

any way you could slap mit or bsd in here somewhere? :)