node-worker-nodes's Introduction

worker-nodes

A Node.js library for running CPU-intensive tasks in separate processes so that they do not block the event loop.

Installation

$ npm install worker-nodes

Node.js greater than 14.0.0 is required

API Reference

WorkerNodes

Kind: global class

new WorkerNodes(path, [options])

Param      Type    Description
---------  ------  ---------------------------------------------------------------
path       String  An absolute path to the module that will be run in the workers.
[options]  Object  See WorkerNodesOptions for a detailed description.

workerNodes.call : Proxy

This exposes the API of the module that the worker nodes are working on. If the module is a function, you can call this directly. If the module exports multiple functions, you can call them as if they were properties of this proxy. Calls return a promise (see Example).

Kind: instance property of WorkerNodes
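
The dispatch behaviour described above can be pictured with a plain JavaScript Proxy. The sketch below is an in-process stand-in, not the library's implementation: makeCallProxy and the sample mathModule are hypothetical names, and the work runs in the same process instead of in a worker. It only illustrates the two call styles and the promise-based results shown in the Example section.

```javascript
// Hypothetical stand-in: dispatches calls to a module loaded in the same process.
function makeCallProxy(targetModule) {
    // Wrap every invocation in a promise; a throwing function becomes a rejection.
    const invoke = (fn, args) => new Promise(resolve => resolve(fn(...args)));

    // The proxy target must be a function so that the proxy itself is callable.
    return new Proxy(function () {}, {
        // proxy(...args): the module itself is a function
        apply(_target, _thisArg, args) {
            if (typeof targetModule !== 'function') {
                return Promise.reject(new TypeError('module is not a function'));
            }
            return invoke(targetModule, args);
        },
        // proxy.name(...args): the module exports several functions
        get(_target, name) {
            return (...args) => {
                const fn = targetModule[name];
                if (typeof fn !== 'function') {
                    return Promise.reject(new TypeError(`${String(name)} is not a function`));
                }
                return invoke(fn, args);
            };
        }
    });
}

const mathModule = { add: (a, b) => a + b, square: x => x * x };
const call = makeCallProxy(mathModule);
const greet = makeCallProxy(() => 'hello');

call.add(2, 3).then(sum => console.log(sum));
greet().then(msg => console.log(msg));
```

With the real library, the same two call styles apply to workerNodes.call, except that the function runs in a worker.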

workerNodes.ready() ⇒ Promise

A method to check if the minimum required number of workers are ready to serve the calls.

Kind: instance method of WorkerNodes
Returns: Promise - resolves with a WorkerNodes instance

workerNodes.terminate() ⇒ Promise

Starts the process of terminating this instance.

Kind: instance method of WorkerNodes
Returns: Promise - resolved when the instance is terminated.

workerNodes.profiler(duration) ⇒ void

Runs the CPU profiler and saves the result in the main process directory.

Kind: instance method of WorkerNodes

Param     Type
--------  ------
duration  number

workerNodes.takeSnapshot() ⇒ void

Takes a heap snapshot and saves the result in the main process directory.

Kind: instance method of WorkerNodes

workerNodes.getUsedWorkers() ⇒ Array.<Worker>

Returns a list of the workers currently in use in the pool.

Kind: instance method of WorkerNodes

WorkerNodesOptions

Describes the WorkerNodes options.

Kind: global class

options.autoStart : Boolean

Whether to initialize the workers before the first call.

If true, depending on the lazyStart option, it will start the min or max number of workers.

Kind: instance property of WorkerNodesOptions
Default: false

options.lazyStart : Boolean

Whether to start a new worker only if all the others are busy.

Kind: instance property of WorkerNodesOptions
Default: false

options.asyncWorkerInitialization : Boolean

Enables asynchronous worker initialization. To start handling tasks, the worker needs to invoke the sendWorkerMessage('ready') function once it is fully initialized. For examples, please refer to the test cases.

Kind: instance property of WorkerNodesOptions
Default: false

options.minWorkers : Number

The minimum number of workers that need to be running to consider the whole pool operational.

Kind: instance property of WorkerNodesOptions
Default: 0

options.maxWorkers : Number

The maximum number of workers that can be running at the same time. Defaults to the number of cores the operating system sees.

Kind: instance property of WorkerNodesOptions

options.maxTasks : Number

The maximum number of calls that can be handled at the same time. Exceeding this limit causes MaxConcurrentCallsError to be thrown.

Kind: instance property of WorkerNodesOptions
Default: Infinity

options.maxTasksPerWorker : Number

The number of calls that can be given to a single worker at the same time.

Kind: instance property of WorkerNodesOptions
Default: 1

options.taskTimeout : Number

The number of milliseconds after which a call is considered to be lost. Exceeding this limit causes TimeoutError to be thrown and the worker that performed the task to be killed.

Kind: instance property of WorkerNodesOptions
Default: Infinity

options.taskMaxRetries : Number

The maximum number of retries that will be performed over a task before reporting it as incorrectly terminated. Exceeding this limit causes ProcessTerminatedError to be thrown.

Kind: instance property of WorkerNodesOptions
Default: 0

options.workerEndurance : Number

The maximum number of calls that a single worker can handle during its whole lifespan. Exceeding this limit causes the termination of the worker.

Kind: instance property of WorkerNodesOptions
Default: Infinity

options.workerStopTimeout : Number

The timeout value (in milliseconds) for the worker to stop before sending SIGKILL.

Kind: instance property of WorkerNodesOptions
Default: 100

options.resourceLimits : Object

Provides the set of JS engine resource constraints inside this Worker thread. (Usable only with workerType: thread.)

Kind: instance property of WorkerNodesOptions
Properties

Name                      Type    Description
------------------------  ------  ------------------------------------------------------------------------------------------------
maxYoungGenerationSizeMb  Number  The maximum size of a heap space for recently created objects
maxOldGenerationSizeMb    Number  The maximum size of the main heap in MB
codeRangeSizeMb           Number  The size of a pre-allocated memory range used for generated code
stackSizeMb               Number  The default maximum stack size for the thread. Small values may lead to unusable Worker instances

options.workerType : string

Can be either process or thread (default). It controls the underlying implementation used: child_process or worker_threads. Most use cases are perfectly fine with the thread implementation, but some workloads might need process, for example if you call process.chdir(), which is not supported in worker_threads.
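
As an illustration of how the options above combine, here is a minimal sketch of an options object. The concrete values are arbitrary assumptions chosen for the example, not recommendations, and the constructor call is left as a comment because the module path is hypothetical.

```javascript
const os = require('os');

// Example values only; every key is an option documented above.
const options = {
    autoStart: true,                              // start workers before the first call
    lazyStart: false,
    minWorkers: 1,                                // pool counts as operational once 1 worker is up
    maxWorkers: Math.max(1, os.cpus().length),    // explicit here; the documented default is the core count
    maxTasksPerWorker: 1,
    taskTimeout: 30 * 1000,                       // milliseconds
    taskMaxRetries: 1,
    workerStopTimeout: 100,                       // milliseconds before SIGKILL
    workerType: 'thread',                         // or 'process'
    resourceLimits: {                             // usable with workerType: 'thread' only
        maxOldGenerationSizeMb: 256
    }
};

// const WorkerNodes = require('worker-nodes');
// const pool = new WorkerNodes('/absolute/path/to/my-module', options);
```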

Example

Given /home/joe.doe/workspace/my-module.js:

module.exports = function myTask() {
    return 'hello from separate process!';
};

you can run it through the worker nodes as follows:

const WorkerNodes = require('worker-nodes');
const myModuleWorkerNodes = new WorkerNodes('/home/joe.doe/workspace/my-module');

myModuleWorkerNodes.call().then(msg => console.log(msg));  // -> 'hello from separate process!'

For more advanced examples please refer to the test cases.

Running tests

Check out the library code and then:

$ npm install
$ npm test

Benchmarks

To run the benchmarks, type:

$ npm install
$ npm run benchmark

It will run a performance test against the selected libraries:

  • data in: an object that consists of a single field that is a 0.5MB random string
  • data out: received object stringified and concatenated with another 1MB string

Example results:

results for 100 executions

name                time: total [ms]  time usr [ms]  time sys [ms]  worker usr [ms]  worker sys [ms]  mem rss [MB]  worker rss [MB]  errors
------------------  ----------------  -------------  -------------  ---------------  ---------------  ------------  ---------------  ------
no-workers                       148            203             37                0                0            98                0       0
[email protected]               362            390            143              389              143           213              210       0
[email protected]                 367            495            185              492              182           236              245       0
[email protected]              1095            520            207              592              243           216               86       0
[email protected]               1886            749            276              947              299           221               70       0
[email protected]              2002            847            285              986              309           219               74       0
[email protected]              13775           7129           5236             1891              952           363               63       0

  os : Darwin / 19.5.0 / x64
 cpu : Intel(R) Core(TM) i7-7660U CPU @ 2.50GHz × 4
node : 14.3.0 / v8: 8.1.307.31-node.33

See also

sources of inspiration:

License

Copyright Allegro Sp. z o.o.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

node-worker-nodes's People

Contributors

adamdubiel, bgalek, dependabot[bot], galcarmi, kwiatkk1, lev-kazakov, mariusalch, mheiniger, noam-almog, rafixer, reshiire, slonka, yurynix

node-worker-nodes's Issues

How to use import/export in the worker module?

test.js

import workerNodes from 'worker-nodes';
const test1 = new workerNodes(path.resolve(__dirname, "test-model"));

test-model.js

import lib from './lib.js';   // **reports that import and export cannot be used**
module.exports =async function(){
};

lib.js

export default {
  test(){
  }
}

Transferring large JS objects can be slow

Thank you so much for your great library! I've been hitting an issue when trying to transfer large JS objects - they seem to be especially slow to encode/decode. I've put together a simple test case to analyze:

https://gist.github.com/jeresig/af003831559a99af63133189e088b4e6

It seems like passing through any JS object ends up being slow - however if we use our own JSON.stringify/JSON.parse and pass through just a string then it ends up being much faster.

We may just encode everything as a big string (non-JS object) and just parse that result, to get optimal speed. Any suggestions you have for improving the performance would be appreciated!
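
The workaround described above, serializing the payload to a string before handing it over and parsing it on the other side, might look like the sketch below. packPayload, unpackPayload and handle are hypothetical names, and the worker call itself is omitted. Whether this is faster in practice depends on the payload and the library version, so it is worth measuring.

```javascript
// Hypothetical helpers: turn a large object into a single string before the call
// and restore it on the receiving side.
function packPayload(value) {
    return JSON.stringify(value);
}

function unpackPayload(text) {
    return JSON.parse(text);
}

// Caller side: send one string instead of a deeply nested object.
const large = { items: Array.from({ length: 1000 }, (_, i) => ({ id: i, label: `item-${i}` })) };
const wire = packPayload(large);

// Worker side (stands in for the exported worker function): parse, work, return a string again.
function handle(wireInput) {
    const input = unpackPayload(wireInput);
    return packPayload({ count: input.items.length });
}

const result = unpackPayload(handle(wire));
console.log(result.count);
```

Note that a JSON round trip drops undefined values and functions and turns Date objects into strings, so it only suits payloads that are plain data.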

Rejection not caught when the error is unserializable

Hi,
I happen to call browserify in a worker node; upon failure it returns an error with a lot of additional structures.
It took me a while to realize I had to rewrite the error using:

const err2 = new Error(err.message);
err2.code = err.code;
err2.stack = err.stack;
reject(err2);

I wonder if there is a way for worker-nodes to avoid this headache.
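
The manual rewrite above can be wrapped in a small helper that is applied before rejecting inside the worker module. This is a sketch of the workaround only, not something the library provides; toPlainError is a hypothetical name.

```javascript
// Hypothetical helper: copy only the plain, serializable parts of an error.
function toPlainError(err) {
    const plain = new Error(err && err.message ? String(err.message) : String(err));
    if (err && err.code !== undefined) {
        plain.code = err.code;
    }
    if (err && err.stack) {
        plain.stack = String(err.stack);
    }
    return plain;
}

// Example: an error carrying an extra, circular structure.
const rich = new Error('bundle failed');
rich.code = 'EBUNDLE';
rich.context = { self: null };
rich.context.self = rich.context; // circular reference

const safe = toPlainError(rich);
console.log(safe.message, safe.code, 'context' in safe);
```

Inside the worker module this would be used as reject(toPlainError(err)), so that only the message, code and stack cross the worker boundary.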

Rename option `workerStopTimeout` to `workerForceKillAfterTimeout`

I didn't understand that workerStopTimeout force-kills the process once the timeout elapses after we call Worker.stop().
I was expecting the same behavior as taskTimeout, and I confused the two options.

I think it would be clearer to call this option workerForceKillAfterTimeout, like execa does.

Has this package been tested on typescript?

Hi,

It's been a few days that I have been struggling to use this package with TypeScript.
Here is an online test implementation: https://repl.it/@JasimKhan/worker-nodes-test

Locally, I am trying to debug it using ts-node, but I don't hit any breakpoints in the worker tasks, nor do I see console traces of their execution.

Can someone point me to some TypeScript usage of this package?

Thanks in advance.

maxWorkers doesn't work as expected

The following configuration results in spawning 8 workers on my machine:

{
    "autoStart": true,
    "maxWorkers": 4
}

I expect 4 workers to be spawned instead.

outdated benchmark results

It seems that with the latest versions of Node.js (7.5.0+) this library does not outperform its competitors, contrary to what the benchmark results state.

Fix memory leak in WorkerProcess

Hey,

We’re using this lib in our service to run worker processes that bundle code with webpack.
Following @yurynix's PR, everything seemed to work fine.

We use this configuration for the pool:

const POOL_OPTIONS = {
  minWorkers: 4,
  maxWorkers: 6,
  // Not passed in production, only used for testing.
  autoStart: process.env.POOL_TESTS_AUTO_START !== "false",
  // Not passed in production, only used for testing.
  taskTimeout: process.env.POOL_TESTS_TASK_TIMEOUT ?? seconds(55),
  workerType: "process",
  workerEndurance: 1,
};

After a while, we noticed that our node processes were leaking memory.
It took some time for us to understand that when a process is being killed after it finishes 1 job (workerEndurance: 1 configuration) we don’t close the server connection here. This is causing server socket files to accumulate in os.tmpdir() since the server never closes

I created a fork that fixes the issue (it closes the server connection when the child process exits).
I already tested it in production and it works great.

I'll create a PR to fix it in this package so I can continue to use it in production.

document promise behaviour

workerNodes.call : Proxy

This exposes the api of a module that the worker nodes are working on. If the module is a function, you can call this directly. If the module exports multiple functions, you can call them as they were properties of this proxy.

Kind: instance property of WorkerNodes

Nothing is mentioned about promises, yet the example shows promise usage.

minWorkers/maxWorkers ambiguity

The minWorkers option is only respected when initially reporting pool readiness, but should also be treated as a number of workers that the pool needs to keep running all the time, i.e.:

  • on any accidental worker termination, the pool should boot a new worker immediately (if the minWorkers number is greater than the current number of healthy workers)
  • a planned worker exit (for example, as a result of exceeding the workerEndurance value) should result in spawning a new worker before starting to terminate the exhausted one (maybe even at the cost of some extra calls above the declared endurance).
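
The proposed behaviour can be summarised as a small decision rule. The sketch below is hypothetical and does not reflect the current pool implementation; workersToSpawn and its parameters are illustrative names.

```javascript
// Hypothetical rule: how many workers should be booted right now so that
// the pool does not drop below minWorkers and does not exceed maxWorkers.
function workersToSpawn({ minWorkers, maxWorkers, healthy, retiring }) {
    // Workers scheduled for a planned exit are replaced before they stop,
    // so they do not count towards the effective total.
    const effective = healthy - retiring;
    const missing = Math.max(0, minWorkers - effective);
    // A replacement may briefly coexist with the retiring worker,
    // but the number of running workers stays capped at maxWorkers.
    const room = Math.max(0, maxWorkers - healthy);
    return Math.min(missing, room);
}

// Accidental termination: 3 of 4 required workers remain.
console.log(workersToSpawn({ minWorkers: 4, maxWorkers: 6, healthy: 3, retiring: 0 }));
// Planned exit: one of 4 workers reached its endurance limit.
console.log(workersToSpawn({ minWorkers: 4, maxWorkers: 6, healthy: 4, retiring: 1 }));
```

When the pool is already at maxWorkers the rule returns 0, which corresponds to the case where the exhausted worker would have to accept a few extra calls until a slot frees up.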

Crash when the program exit: ProcessTerminatedError: cancel after 0 retries!

On CircleCI, my program crashes in the onExit callback.
The program works in my local environment.

Edit: A worker crashes, but I don't know how to get more error output. It seems like the real error is hidden by node-worker-nodes :/

I use node v8.7.0 and npm v5.5.1

> ERROR: ProcessTerminatedError: cancel after 0 retries!
    at tasks.filter.forEach.task (/home/circleci/project/node_modules/worker-nodes/lib/pool.js:111:39)
    at Array.forEach (<anonymous>)
    at WorkerNodes.handleWorkerExit (/home/circleci/project/node_modules/worker-nodes/lib/pool.js:110:14)
    at Worker.worker.on.exitCode (/home/circleci/project/node_modules/worker-nodes/lib/pool.js:160:44)
    at emitOne (events.js:115:13)
    at Worker.emit (events.js:210:7)
    at WorkerProcess.Worker.process.once.code (/home/circleci/project/node_modules/worker-nodes/lib/worker.js:39:18)
    at Object.onceWrapper (events.js:316:30)
    at emitOne (events.js:115:13)
    at WorkerProcess.emit (events.js:210:7)
    at ChildProcess.WorkerProcess.child.once.code (/home/circleci/project/node_modules/worker-nodes/lib/worker/process.js:42:41)
    at Object.onceWrapper (events.js:318:30)
    at emitTwo (events.js:125:13)
    at ChildProcess.emit (events.js:213:7)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:200:12)

{ Error: read ECONNRESET
    at _errnoException (util.js:1021:11)
    at Pipe.onread (net.js:608:25) code: 'ECONNRESET', errno: 'ECONNRESET', syscall: 'read' }
{ Error: read ECONNRESET
    at _errnoException (util.js:1021:11)
    at Pipe.onread (net.js:608:25) code: 'ECONNRESET', errno: 'ECONNRESET', syscall: 'read' }
{ Error: read ECONNRESET
    at _errnoException (util.js:1021:11)
    at Pipe.onread (net.js:608:25) code: 'ECONNRESET', errno: 'ECONNRESET', syscall: 'read' }
{ Error: read ECONNRESET
    at _errnoException (util.js:1021:11)
    at Pipe.onread (net.js:608:25) code: 'ECONNRESET', errno: 'ECONNRESET', syscall: 'read' }
{ Error: read ECONNRESET
    at _errnoException (util.js:1021:11)
    at Pipe.onread (net.js:608:25) code: 'ECONNRESET', errno: 'ECONNRESET', syscall: 'read' }
{ Error: read ECONNRESET
    at _errnoException (util.js:1021:11)
    at Pipe.onread (net.js:608:25) code: 'ECONNRESET', errno: 'ECONNRESET', syscall: 'read' }
{ Error: read ECONNRESET
    at _errnoException (util.js:1021:11)
    at Pipe.onread (net.js:608:25) code: 'ECONNRESET', errno: 'ECONNRESET', syscall: 'read' }
{ Error: This socket has been ended by the other party
    at Socket.writeAfterFIN [as write] (net.js:354:12)
    at Transport.send (/home/circleci/project/node_modules/worker-nodes/lib/worker/transport.js:57:19)
    at Promise.then.result (/home/circleci/project/node_modules/worker-nodes/lib/worker/child-loader.js:36:28)
    at <anonymous>
    at process._tickCallback (internal/process/next_tick.js:188:7) code: 'EPIPE' }

Feature request: add the ability to run "init" code at worker startup

We have the need to initialize some global data structures on our workers before they can start doing work. In addition, the initialization depends on some runtime data known by the parent process.

We'd like to do something like this:

const workerNodes = require("worker-nodes");
const cacheSize = 100;
const workers = new workerNodes(myfile, options, () => initCache(cacheSize));

though that particular implementation probably wouldn't work.

Is such a thing possible? It seems like it should be, by calling handleCall on worker startup, in addition to calling it every time we are starting a new task. But I'm afraid I'm not up to the task of figuring out all the details.

fail cant catch it

mod.js

module.exports = async function(){
  return new Promise((succ, fail ) => {
      fail();
  })
}

test.js

let worker = ......
worker.then().catch() // the rejection is not caught

(node:28649) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1): RangeError: Maximum call stack size exceeded

Worker and nodesocket server

FFmpeg runs in a worker, and in the main process a Node web server that uses sockets is waiting for instructions.
But the web server can't respond while FFmpeg is running. Why?

Worker implementation selection

Hi 👋
Thank you for the great library! 🙏

When moving from 1.x to 2.x, the implementation of the workers was switched from process (child_process) to threads (worker_threads), which is great for many workloads, however, for some workloads, processes are preferred.

Specifically what bothers me is workloads that do process.chdir() which is not supported inside worker_threads.

Would you be open to a new option in WorkerNodesOptions -> .workerType which can get thread (default) or process,
that will control the underlying implementation?

I don't think it's a big code change, however one downside is that WorkerNodesOptions.resourceLimits is only going to be supported in worker_threads.

What do you think? 🙃

Error on passing IP Address and Port to --inspector: 'Unable to open devtools socket: address already in use'

When launching a Node 8 service with the --inspector flag that is passed an IP Address, the following error is printed to the service console when each worker launches:

Unable to open devtools socket: address already in use

I have also sometimes observed this error continuing to print to the service console.

This is down to each worker using the same debugPort as the main service process.

I am creating a Pull Request that will reference this issue and should solve this error by allowing each worker to listen on a different port.
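
One way to give each worker its own inspector port is to rewrite the parent's inspector argument with an offset per worker. The sketch below illustrates that idea and is not the code from the pull request; inspectArgsForWorker is a hypothetical helper, and it assumes the Node.js --inspect=[host:]port argument form.

```javascript
// Hypothetical helper: derive a worker's exec arguments from the parent's,
// shifting the inspector port by the worker's index so that ports do not clash.
function inspectArgsForWorker(execArgv, workerIndex) {
    const pattern = /^(--inspect(?:-brk)?)(?:=(?:(.+):)?(\d+))?$/;
    return execArgv.map(arg => {
        const match = pattern.exec(arg);
        if (!match) {
            return arg; // unrelated flag, keep as-is
        }
        const [, flag, host, port] = match;
        const basePort = port ? Number(port) : 9229; // 9229 is Node's default inspector port
        const workerPort = basePort + workerIndex + 1;
        return host ? `${flag}=${host}:${workerPort}` : `${flag}=${workerPort}`;
    });
}

console.log(inspectArgsForWorker(['--inspect=127.0.0.1:9230', '--max-old-space-size=512'], 0));
console.log(inspectArgsForWorker(['--inspect'], 2));
```

The parent would call this once per worker, for example with process.execArgv and the worker's position in the pool, and pass the result as the child's exec arguments.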

bson buffer overflow

Hi,
Right now this library is using bson from mongo, and there is a problem with it.
Bson uses a predefined buffer with a max size of 17 MB. Once this buffer overflows, node shuts down the process for security reasons, without any way for the system to recover from it (no process events, try/catch, or anything else I could think of).
There aren't many ways to resolve this. Bson does not allow us to modify the buffer size, which leaves us with two basic options:

  1. switch bson with something else, MessagePack can be a viable option (and a simple drop in replacement)
  2. leave things the way they are but allow to customize the message serializer (leaving the door open for many other binary serializers, like protobuf).
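
Option 2 could take the shape of a serializer object with a pair of functions, so that callers can swap the binary format. The sketch below is hypothetical: the serializer shape, jsonSerializer and createTransport are illustrative names, not an existing option of the library, and JSON over a Buffer stands in for MessagePack or protobuf.

```javascript
// Hypothetical serializer contract: encode(value) -> Buffer, decode(Buffer) -> value.
const jsonSerializer = {
    encode: value => Buffer.from(JSON.stringify(value), 'utf8'),
    decode: buffer => JSON.parse(buffer.toString('utf8'))
};

// Hypothetical transport factory that accepts any object honouring the contract.
function createTransport({ serializer = jsonSerializer } = {}) {
    return {
        // Frame = 4-byte big-endian length prefix followed by the encoded payload.
        pack(message) {
            const body = serializer.encode(message);
            const header = Buffer.alloc(4);
            header.writeUInt32BE(body.length, 0);
            return Buffer.concat([header, body]);
        },
        unpack(frame) {
            const length = frame.readUInt32BE(0);
            return serializer.decode(frame.subarray(4, 4 + length));
        }
    };
}

const transport = createTransport();
const frame = transport.pack({ method: 'render', args: [1, 2, 3] });
console.log(transport.unpack(frame));
```

A different format would be plugged in by passing another object with the same encode and decode pair to createTransport.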
