tenjininc / procrastinator
Delayed task queues made simple.
Home Page: https://rubygems.org/gems/procrastinator
License: MIT License
If the task was successful, it's sort of superfluous to then have to deal with possible errors in the success block. Procrastinator should automatically rescue StandardError and just put a message to stderr.
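A minimal sketch of that rescue, assuming hooks are run through a small wrapper. The helper name `run_hook_safely` and its `err:` parameter are hypothetical and not part of Procrastinator's API:

```ruby
# Hypothetical helper: runs a hook block and reports any StandardError
# to the given stream (stderr by default) instead of letting it propagate.
def run_hook_safely(hook_name, err: $stderr)
  yield
rescue StandardError => e
  err.puts "#{hook_name} hook error: #{e.class}: #{e.message}"
  nil
end

# The failure is reported on stderr and the caller carries on.
run_hook_safely(:success) { raise 'mail server unreachable' }
```

Only StandardError is rescued here, so signals and other Exception subclasses still propagate.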
Somehow the bin/ directory is in the gem distributable. It should be removed as its contents are for the gem developers, not end users.
So that one task doesn't clog up the entire queue, QueueWorker should be multithreaded to handle max_tasks tasks at a time.
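One possible shape for this, as a sketch only: tasks are run in batches of at most max_tasks threads. The function name `work_in_batches` is hypothetical, and tasks are assumed to be anything that responds to `call`:

```ruby
# Hypothetical sketch: run tasks in batches of at most max_tasks threads.
# Thread#value waits for the thread and returns the block's result,
# so each batch finishes before the next one starts.
def work_in_batches(tasks, max_tasks:)
  tasks.each_slice(max_tasks).flat_map do |batch|
    batch.map { |task| Thread.new { task.call } }.map(&:value)
  end
end

tasks = [-> { :a }, -> { :b }, -> { :c }]
work_in_batches(tasks, max_tasks: 2)
```

A slow task still holds up its own batch in this version; a thread pool fed from a shared queue would avoid that at the cost of more code.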
The API currently allows you to delay a task, but it cannot cancel or reschedule a task.
Procrastinator logs to STDOUT when test mode is enabled to reduce hits on the hard drive for faster test runs. While this is fine for the default procrastinator output, custom log messages (eg. Success Hooks) still need to be testable because they are app-defined. Those apps can use FakeFS or similar to avoid hitting the drive while still having a "file" to check for testing.
Procrastinator env has methods #log_inside and #log_at_level, which raises options:
#log_to_stdout flag.
#log_inside to accept :stdin
If the task given is a string, it should just write the string. Otherwise, it should dump the object to a string using YAML.
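As a rough sketch of that rule (the method name `loggable` is an assumption made for illustration):

```ruby
require 'yaml'

# Hypothetical helper: strings pass through untouched,
# anything else is dumped to a YAML string.
def loggable(task)
  task.is_a?(String) ? task : YAML.dump(task)
end

loggable('send welcome email')
loggable({ 'id' => 7 })
```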
This is an uninformative method name.
Some suggestions:
task_attributes
import_attributes
load_attributes
Currently, the behaviour is to keep failed and final_failed tasks for diagnosing. We could provide a syntax in setup to allow the environment to delete them instead (perhaps because the final_fail hook sends the data via email to an admin).
Procrastinator.setup(persister) do |env|
  env.define_queue(:test)
  env.delete_on_final_fail
end
It can spiral into a massive chain of object serialisation, depending on the instance variables declared in the task.
This is made worse by the style wherein tasks use task_attr declarations, which define @logger, @context, @data and/or @scheduler, the majority of which could have overly complex (and unhelpful to a log message) serializations.
It's also feasible that Tasks would have additional instance variables, which would also get serialized for (possibly) no benefit.
The original intent of the log message including the YAML dumped task was to provide maximum context, especially if the task was to be deleted on final_fail. Some suitable options:
data. The data is intended to be a simple data type anyway, so this risks far less serialization.
It should gracefully default attempts to zero.
It should take a block in the setup for each queue that will run for any task in that queue.
In short, Procrastinator assumes that the parent process will stay alive, but Passenger kills/respawns processes all the time for resource conservation purposes. When Procrastinator spins off a subprocess, the child remembers the parent's PID and self-terminates when the parent disappears; however, Passenger treats those parent processes as semi-ephemeral, so the PID will change after the process is "paused" in Passenger.
The longterm fix here is to re-engineer Procrastinator to no longer have children rely on their parent process in that way.
For now, though, the workaround is to set Passenger's min instances to 0 and disable preloading. Neither is desirable, but necessary.
https://www.phusionpassenger.com/library/config/apache/reference/#passengermininstances
https://www.phusionpassenger.com/library/config/apache/reference/#passengerspawnmethod
Right now there's no procedure for restarting a subprocess should it die. That's bad.
Ideally, you would be able to just drop in a ROM Repository object as your task persister strategy, and it would work flawlessly.
If a task is loaded from an old persistence, but the queue is no longer defined for it (or is renamed, etc), then there might be a small explosion.
Just a tiny one.
Should this be handled by Procrastinator, or should it be left to the app to manage its own data for this case?
Right now it raises generic RuntimeErrors with a message, but it would be best to have specific classes for each type of problem. eg. TaskNotFoundError
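A sketch of what such a hierarchy could look like. Only the name TaskNotFoundError comes from the text above; the `TaskErrors` namespace, the `Error` base class, and `find_task!` are hypothetical:

```ruby
# Hypothetical namespace and base class, so callers can rescue
# every gem-specific error with a single rescue clause.
module TaskErrors
  class Error < StandardError; end
  class TaskNotFoundError < Error; end
end

# Hypothetical lookup that raises the specific error class.
def find_task!(tasks_by_id, id)
  tasks_by_id.fetch(id) do
    raise TaskErrors::TaskNotFoundError, "no task with id #{id}"
  end
end
```

Inheriting from StandardError keeps the errors rescuable by a bare `rescue`, as the generic RuntimeErrors are today.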
Calling reschedule is broken due to it calling it on the wrong receiver.
Perils of over-mocking tests, I guess.
When it complains about either a wrong or missing queue name, it should list the queue names that have been registered.
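For illustration, assuming the registered queues are held in a hash keyed by name (the method `queue_for!` and the message wording are hypothetical):

```ruby
# Hypothetical lookup: on a miss, the error names every registered queue.
def queue_for!(queues, name)
  queues.fetch(name) do
    known = queues.keys.map(&:inspect).join(', ')
    raise ArgumentError,
          "there is no queue named #{name.inspect}. Known queues are: #{known}"
  end
end
```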
To better control / understand the processes running on a system, the dev may want to be able to specify a prefix (say, the name of the app) for each subprocess.
eg.
Procrastinator.setup(persister) do |env|
  env.define_queue(:test)
  env.subprocess_prefix = 'my-app'
end
Would produce queue process my-app-test-queue-worker.
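The naming rule might be sketched like this. The helper `worker_title` is hypothetical; a subprocess could pass its result to Ruby's `Process.setproctitle`:

```ruby
# Hypothetical helper: builds "<prefix>-<queue>-queue-worker",
# dropping the prefix when it is nil or empty.
def worker_title(queue_name, prefix = nil)
  [prefix, queue_name, 'queue-worker'].map(&:to_s).reject(&:empty?).join('-')
end

worker_title(:test, 'my-app')
worker_title(:email)
```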
Multiprocessing uses Process.fork right now, but that is not friendly to Windows (or some BSDs). Look into gems like childprocess to wrap the child-spawning process in a cross-platform way.
#log_inside could be misinterpreted as inside-vs-outside. #log_directory would be a clearer name.
Right now it's very limited, and multi processing is a complex topic. I would like to see better testing of the looping mechanism, the sleep & its duration, and perhaps a full live integration test.
It should run the setup block on the main process and again on all subprocess queues, ignoring #define_queue in subprocesses.
This would eliminate the need for each_process entirely, rolling it into #setup itself. Doing so is conceptually simpler, and it also reduces duplicated provide_context statements.
Something like:
Procrastinator.setup(task_repo, 'config/queues.yml')
# or
Procrastinator.setup(task_repo, 'config/queues.json')
Would be useful for projects that prefer to define their queue behaviour in external config files.
It could use the file extension to determine the file format.
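A sketch of extension-based dispatch using only the standard library. The function `parse_queue_config` and the shape of the config data are assumptions; nothing here is existing Procrastinator behaviour:

```ruby
require 'json'
require 'yaml'

# Hypothetical parser: picks the format from the file extension.
# Takes the contents separately so it can be exercised without touching disk.
def parse_queue_config(path, contents)
  case File.extname(path).downcase
  when '.yml', '.yaml' then YAML.safe_load(contents)
  when '.json' then JSON.parse(contents)
  else raise ArgumentError, "unsupported queue config format: #{path}"
  end
end

parse_queue_config('config/queues.yml', "email:\n  max_tasks: 2\n")
```

In real use the second argument would come from `File.read(path)`.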
If a dev wants to manage the task queue with an outside system (cron, monit, etc) then Procrastinator should provide a short script like delayed_job's start-stop script.
The worker started log entry doesn't include the prefix, which it should, since people will want to be able to look for it by process title.
eg.
"Started worker process, email-queue-worker, to work off queue email"
should be
"Started worker process, myapp-email-queue-worker, to work off queue email"
If the app is restarted, the main process will die, and then main-prime would start its own workers.
Because of the heartbeat delay, there could be a short period with two QueueWorker processes for the same queue.
It's feasible that the parent could be killed and its pid reused (by anything else) before the child checks for it, so the child would never terminate itself.
The children should use a more robust method to watch for their parent process. Maybe also watch the process name or the start time.
When core testing, it doesn't matter what the task actually does. Therefore it would be worthwhile to allow a nil when test mode is enabled.
If the parent process is owned by another user, then Process.kill(0, pid) will raise an error (Errno::EPERM) even though the process is still running.
http://stackoverflow.com/questions/325082/how-can-i-check-from-ruby-whether-a-process-with-a-certain-pid-is-running
This especially may occur if the standalone script is used.
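A minimal sketch of a liveness check that accounts for this, along the lines of the linked Stack Overflow answer. The method name `process_alive?` is hypothetical:

```ruby
# Signal 0 sends nothing; it only asks the OS whether we could signal pid.
def process_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false # no such process
rescue Errno::EPERM
  true  # process exists but belongs to another user
end

process_alive?(Process.pid)
```

This still cannot tell a living parent from an unrelated process that has reused the pid, so it does not address that concern on its own.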
Tasks should be given an app object (eg. a DI container) that can be used as needed by the task to complete its duties.
For example, a run method might need to fetch data from the app's persistence, but each task does not need to be creating its own persistence env and set of persisters when they could share one that is passed in.
It may be helpful/necessary for the setup to take such an object to be passed in (or a lambda to be run for each task, etc)
Rescheduling is currently done on a static formula, but devs may want to tweak that process. Perhaps a command on the Environment like:
Procrastinator.setup(persister) do |env|
  env.define_queue(:test)
  env.reschedule_with do |attempts|
    # define custom formula here
  end
end
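To make the idea concrete, here is a sketch of a pluggable formula. The default shown (`30 + attempts**4` seconds) is an illustrative backoff chosen for this example, not necessarily the formula Procrastinator uses, and `next_run_at` is a hypothetical helper:

```ruby
# Illustrative default only: the delay in seconds grows with the attempt count.
DEFAULT_BACKOFF = ->(attempts) { 30 + (attempts**4) }

# Hypothetical helper: applies whichever formula the app supplied.
def next_run_at(last_run, attempts, formula = DEFAULT_BACKOFF)
  last_run + formula.call(attempts)
end

# An app-defined formula: retry one minute later per attempt.
linear = ->(attempts) { attempts * 60 }
next_run_at(Time.now, 3, linear)
```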
#delay should raise an error if the task being scheduled is set for more than a short period in the past (to allow the clock ticking over while scheduling it)
Options:
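One way the past-time check on #delay might look, as a sketch. The helper `validate_run_at!`, its keyword arguments, and the 5 second tolerance are all assumptions:

```ruby
# Hypothetical guard: accepts times up to `tolerance` seconds in the past,
# to allow for the clock ticking over while the task is being scheduled.
def validate_run_at!(run_at, now: Time.now, tolerance: 5)
  return run_at if run_at >= now - tolerance

  raise ArgumentError,
        "run_at (#{run_at}) is more than #{tolerance}s in the past"
end

validate_run_at!(Time.now + 3600)
```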
To prevent looping, sleeping, and forking. Those are all gem details that end-apps shouldn't need to care about to test.
Either find a way to use the external API without loops, or add a flag to disable them.