Comments (8)
Hi @isavcic!
The reason for that behaviour is that worker pools are dynamic by default, which means they create more workers as needed depending on the workload. In your example, the worker pool starts out with 1 worker goroutine and can scale up to 100 if needed, but it never gets to launch more than 2 goroutines because its waiting queue (whose size is the pool's max capacity) never fills up. In your case, the max capacity is set to 1000 and only 100 tasks are submitted to the pool.
In order to ensure all 100 tasks are processed concurrently by 100 workers, you have 2 alternatives:
- Reduce the waiting queue size to make sure backpressure is detected sooner:
pool := pond.New(100, 10)
- Use a fixed-size pool by configuring the initial no. of workers:
pool := pond.New(100, 1000, pond.MinWorkers(100))
Both options will cause the pool to launch more goroutines, which should end up producing the result you were expecting.
Please let me know if that works out for you. Have a nice day!
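For reference, the fixed-size behaviour can be sketched with nothing but the standard library. This is only an illustration of what the pond.MinWorkers(100) alternative achieves, not pond's actual code, and peakConcurrency is a name made up here:

```go
package main

import (
	"fmt"
	"sync"
)

// peakConcurrency launches n worker goroutines, hands each one a task,
// and measures how many tasks were running at the same time. It mimics
// a fixed-size pool of n workers; stdlib sketch, not pond's implementation.
func peakConcurrency(n int) int {
	tasks := make(chan struct{})
	release := make(chan struct{})
	var arrived, done sync.WaitGroup
	var mu sync.Mutex
	running, peak := 0, 0

	arrived.Add(n)
	done.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer done.Done()
			<-tasks // pick up a task
			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()
			arrived.Done() // signal: this task has started
			<-release      // hold the task open until every task has started
			mu.Lock()
			running--
			mu.Unlock()
		}()
	}
	for i := 0; i < n; i++ {
		tasks <- struct{}{}
	}
	arrived.Wait() // at this point all n tasks are running simultaneously
	close(release)
	done.Wait()
	return peak
}

func main() {
	fmt.Println(peakConcurrency(100)) // 100: all tasks ran concurrently
}
```

With 100 workers and 100 tasks, every task gets its own goroutine, which is the result described above.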
from pond.
Thanks for the quick reply! That makes sense. So, maxCapacity doesn't represent the maximum simultaneous, active tasks, but rather a (soft) limit of sorts on the task ingestion buffer?
That's correct. Internally, each pool has 2 buffered channels, tasks and dispatchedTasks. When a task is submitted, it's always sent to the tasks channel first. Each pool also has a dispatcher goroutine that continuously reads from the tasks channel and forwards every item to the dispatchedTasks channel, which is the one used to send the actual tasks to worker goroutines. The size of the first channel (tasks) is controlled by the maxCapacity option; maxWorkers, on the other hand, controls the size of the second channel (dispatchedTasks), but it also represents the max number of workers.
So basically, maxCapacity determines whether the pool.Submit(task) call is blocking (maxCapacity = 0) or non-blocking (maxCapacity > 0).
What are the scenarios when that is the bottleneck?
The maxCapacity option is there to absorb large bursts of tasks from client goroutines (e.g. a spike in HTTP requests) and avoid blocking on the call to Submit when all workers are busy, because when that happens, the dispatcher also blocks, waiting until a worker becomes available.
That said, maybe the scenario in which one needs to tune the maxCapacity option is not very frequent, which would mean that option could just default to 0 🤔
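The blocking vs. non-blocking distinction comes straight from Go's channel semantics, which a few lines can illustrate (canAbsorbBurst is a made-up helper, not part of pond):

```go
package main

import "fmt"

// canAbsorbBurst mimics the effect of maxCapacity on Submit: with a
// buffered queue there is room for a burst of tasks, so each send returns
// immediately; with capacity 0 the sender would block until a worker
// (receiver) is ready. Illustrative only.
func canAbsorbBurst(capacity, burst int) bool {
	queue := make(chan int, capacity)
	for i := 0; i < burst; i++ {
		select {
		case queue <- i: // room in the buffer: Submit returns immediately
		default: // buffer full: a real Submit would block here
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(canAbsorbBurst(10, 5)) // true: burst fits in the buffer
	fmt.Println(canAbsorbBurst(0, 1))  // false: unbuffered, sender must wait
}
```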
Yes, that's right, Eager would give you that behaviour (immediate task execution unless the maxWorkers limit is reached). Doing pool := pond.New(512, 1) also gives you something similar, but because the maxCapacity option is set to 1, client goroutines might spend some time blocked in the Submit() call when more than 1 goroutine attempts to submit tasks simultaneously. Now that the resizing strategy is a separate config option, you can safely increase the size of the buffer via maxCapacity and still get immediate execution (this was not possible before).
Thanks for your feedback! It was key to discovering this limitation 🙂
Thanks for the quick reply! That makes sense. So, maxCapacity doesn't represent the maximum simultaneous, active tasks, but rather a (soft) limit of sorts on the task ingestion buffer? What are the scenarios when that is the bottleneck? I'm still trying to grasp the logic behind it, sorry. :)
By creating the pool as pool := pond.New(100, 0) I'm getting the expected behaviour: a maximum of 100 tasks running at the same time. Nice!
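For completeness, here is a stdlib sketch of why an unbuffered queue (the maxCapacity = 0 case) caps in-flight tasks at the number of workers. The helper maxInFlight is illustrative, not part of pond's API:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// maxInFlight runs numTasks tasks through numWorkers goroutines fed by an
// unbuffered channel and reports the highest number of tasks that were
// ever running at once.
func maxInFlight(numWorkers, numTasks int) int {
	tasks := make(chan struct{}) // capacity 0: each submission waits for a free worker
	var mu sync.Mutex
	running, peak := 0, 0
	var wg sync.WaitGroup
	for i := 0; i < numWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range tasks {
				mu.Lock()
				running++
				if running > peak {
					peak = running
				}
				mu.Unlock()
				time.Sleep(time.Millisecond) // simulate some work
				mu.Lock()
				running--
				mu.Unlock()
			}
		}()
	}
	for i := 0; i < numTasks; i++ {
		tasks <- struct{}{} // blocks until a worker is free
	}
	close(tasks)
	wg.Wait()
	return peak
}

func main() {
	fmt.Println(maxInFlight(8, 100) <= 8) // true: never more than 8 in flight
}
```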
Hey @isavcic! I just wanted to let you know that, based on this issue and our discussion around it, I released a new version of the library with changes that give better control over the pool's resizing strategy, and I also revamped the default strategy to avoid the behaviour you were experiencing.
Any feedback you have would be very appreciated 🙂
Hey, great! I'm looking over the docs now. If I understand correctly, if I want immediate task execution (by spawning a new goroutine/worker), I should use the Eager strategy? So far I ended up achieving this by creating the pool with pool := pond.New(512, 1)
Any guarantees on the ordering of dispatchedTasks? Completion order obviously depends on the workers, but it would help to know whether dispatched tasks are FIFO relative to one another.
Hey @ioannist, sorry for the delay, I just came back from vacation.
Currently, there's no ordering guarantee when submitting tasks to worker pools. This library uses a buffered channel internally to dispatch tasks to workers, and with multiple workers receiving from it, processing order is not guaranteed (even though the data itself is ordered within the channel).
More info:
- https://go.dev/doc/effective_go#channels
- https://www.ardanlabs.com/blog/2014/02/the-nature-of-channels-in-go.html