
Comments (3)

cvasseng commented on July 4, 2024

That's a great question!

25 is somewhat magic. It was the worker count that proved the most stable on my test setup early on, when there were still some stability issues, and it has kind of stuck around since then. So right now it's fairly arbitrary.

That said, running more phantom workers than the number of threads the CPU supports did give a slight performance gain in my benchmarks, in the sense that it could handle more requests, but with an additional (small) delay on all requests. That trade-off makes sense for some use cases. But it also comes at the cost of the server becoming IO/memory bound, which can be problematic when running it on virtual servers.

The "right" number of workers for keeping the CPU at ideal load generally depends on the CPU: both the number of cores and, even more importantly, whether the CPU supports hyperthreading or not.
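For reference, Node can report the logical thread count directly; `os.cpus()` counts hyperthreaded cores as separate entries, so it's a reasonable starting point:

```js
// Logical CPU count: on a 4-core CPU with hyperthreading this reports 8.
const os = require('os');
console.log(`logical threads: ${os.cpus().length}`);
```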

There's another side to it as well though, which you also mentioned. Workers have to be restarted every once in a while, as performance starts to degrade the longer they run. The way the pooling works is that each worker has a maximum number of work pieces it can perform before it's killed and replaced.

Having a larger pool means it takes longer for each individual worker to reach its maximum work count. That improves performance quite a bit, since there's overhead involved in starting up a new worker, and that overhead quickly adds up to severe delays under heavy traffic. Each worker is also initially seeded with a random work count to avoid situations where all the workers restart at the same time, and the scheduler tries to distribute traffic uniformly across the entire pool when a new piece of work is posted.

The result is somewhat queue-esque, I suppose, from a lower-level perspective, but from node's perspective tasks are performed continuously (though at the mercy of the OS scheduler). There's a small request queue too, in which work ends up if all workers are busy. This queue is limited to 5 pieces of work, which is also a magic number. However, this one was reached based on real-life data from heavy traffic surges in our production system: having a larger queue would cause it to grow out of control if several workers were hogged for longer periods. Connections are dropped if the pool is saturated and the queue is full. This almost never happens in our production system.
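To make that concrete, here is a minimal sketch of the idea, not the actual node-export-server code; the numbers and the performWork stub are illustrative only:

```js
// Simplified sketch of the pooling behaviour described above:
// each worker is retired after a maximum number of work pieces,
// the initial work count is randomly seeded so workers don't all
// restart in lockstep, and a small queue absorbs bursts when every
// worker is busy. Requests are dropped when the queue overflows.
const MAX_WORK_PER_WORKER = 60; // illustrative, not the real default
const QUEUE_LIMIT = 5;
const POOL_SIZE = 8;

const workers = [];
const queue = [];

// Stand-in for rendering a chart in a phantom worker.
function performWork(worker, job, callback) {
  setTimeout(() => callback(null, `rendered ${job}`), 10);
}

function spawnWorker() {
  return {
    busy: false,
    // Random seed so the whole pool doesn't restart at the same time.
    workCount: Math.floor(Math.random() * MAX_WORK_PER_WORKER),
  };
}

function postWork(job, done) {
  const free = workers.filter((w) => !w.busy);
  if (free.length === 0) {
    if (queue.length >= QUEUE_LIMIT) {
      // Pool saturated and queue full: drop the request.
      return done(new Error('pool saturated, dropping request'));
    }
    return queue.push({ job, done });
  }
  // Spread new work across the free part of the pool.
  const worker = free[Math.floor(Math.random() * free.length)];
  runOn(worker, job, done);
}

function runOn(worker, job, done) {
  worker.busy = true;
  performWork(worker, job, (err, result) => {
    worker.busy = false;
    worker.workCount += 1;
    if (worker.workCount >= MAX_WORK_PER_WORKER) {
      // Retire the worker and replace it with a fresh one.
      workers[workers.indexOf(worker)] = spawnWorker();
    }
    // Dispatch the next queued job, if any.
    const next = queue.shift();
    if (next) postWork(next.job, next.done);
    done(err, result);
  });
}

for (let i = 0; i < POOL_SIZE; i++) workers.push(spawnWorker());
postWork('chart-1', (err, res) => console.log(err || res));
```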

In production (for export.highcharts.com), we use a worker count that ensures there are rarely more active workers than the number of supported HW threads. Our traffic patterns are very predictable, so we chose hardware that could sustain the average request count without saturating the pool. We have a headroom of around 1.7x on top of the hardware-supported thread count (number of cores + hyperthreading). If the pool saturates during traffic spikes, the overall delay is fairly negligible. It took some iterations before we hit the sweet spot, but the service has now been running in production for about three months without any issues or intervention.
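Read one way (pool size of roughly 1.7 times the hardware thread count), the arithmetic looks like the sketch below; the 1.7 figure came out of our own traffic patterns and is not a general recommendation:

```js
// Rough sizing illustration: logical threads times a headroom factor.
// The 1.7 factor is what fit our traffic; treat it as an example only.
const os = require('os');

const hwThreads = os.cpus().length; // cores x hyperthreading
const headroom = 1.7;
const poolSize = Math.round(hwThreads * headroom);

console.log(`${hwThreads} HW threads -> pool of ~${poolSize} workers`);
```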

As for the timeouts, they are there to prevent bad charts from hogging a worker indefinitely, for instance if misbehaving JavaScript is injected (e.g. infinite loops and so on). Not having them would make the pool very susceptible to DoS.
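A minimal sketch of that idea, assuming each worker is a separate child process that can simply be killed; the phantomjs invocation and error handling here are illustrative, not the actual implementation:

```js
// Illustrative per-job timeout: if a render runs too long (e.g. an
// injected infinite loop), kill the worker process and fail the job
// so a bad chart can't hog a pool slot indefinitely.
const { spawn } = require('child_process');

function renderWithTimeout(args, timeoutMs, done) {
  const child = spawn('phantomjs', args); // illustrative invocation
  let finished = false;

  const timer = setTimeout(() => {
    if (!finished) {
      finished = true;
      child.kill('SIGKILL'); // reclaim the pool slot
      done(new Error('render timed out'));
    }
  }, timeoutMs);

  child.on('exit', (code) => {
    if (!finished) {
      finished = true;
      clearTimeout(timer);
      done(code === 0 ? null : new Error(`worker exited with code ${code}`));
    }
  });

  child.on('error', (err) => {
    if (!finished) {
      finished = true;
      clearTimeout(timer);
      done(err);
    }
  });
}
```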

Anyway, the ideal worker count varies based on the hardware it runs on as well as the traffic patterns, so it will differ from use case to use case. The best approach is to benchmark different settings for your specific use case to find a number that makes sense to you.
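For a first pass, something as crude as firing a batch of concurrent export requests and timing them is usually enough. A sketch, assuming the server listens on localhost:7801 and accepts a JSON payload with infile/type fields; adjust the host, port and payload to your own deployment:

```js
// Crude load test: send CONCURRENCY simultaneous export requests and
// report the average response time. Port and payload are assumptions;
// adjust them to your deployment before drawing any conclusions.
const http = require('http');

const CONCURRENCY = 20;
const payload = JSON.stringify({
  infile: { series: [{ data: [1, 2, 3, 4] }] },
  type: 'png',
});

function oneRequest() {
  return new Promise((resolve, reject) => {
    const start = Date.now();
    const req = http.request(
      {
        host: 'localhost',
        port: 7801,
        method: 'POST',
        path: '/',
        headers: {
          'Content-Type': 'application/json',
          'Content-Length': Buffer.byteLength(payload),
        },
      },
      (res) => {
        res.resume(); // drain the response body
        res.on('end', () => resolve(Date.now() - start));
      }
    );
    req.on('error', reject);
    req.end(payload);
  });
}

Promise.all(Array.from({ length: CONCURRENCY }, oneRequest))
  .then((times) => {
    const avg = times.reduce((a, b) => a + b, 0) / times.length;
    console.log(`average over ${CONCURRENCY} requests: ${avg.toFixed(0)} ms`);
  })
  .catch((err) => console.error('request failed:', err.message));
```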

I'm going to adjust the defaults though, as 25 is quite high for most use cases. :)


lahma commented on July 4, 2024

Thank you for this very detailed explanation and reasoning; I'm grateful you took the time to elaborate on real production experiences. I will do more performance testing and try to find the ideal settings for us. We are not so afraid of DoS, as we really want to process all requests if possible since we are testing the service internally.

So far, trying to run load tests using Docker has mostly taken the whole Docker virtual host down - something that does not happen with PhantomJS + an nginx load balancer. Something I need to investigate more.


cvasseng commented on July 4, 2024

No problem at all!

I haven't tried running it in Docker, so I'm afraid I can't really offer any advice there (we run the production service in Elastic Beanstalk), but do let me know if there's anything we can do to help.
