procrastinator's People

Contributors

robinetmiller

procrastinator's Issues

Rescue errors in Success block

If the task was successful, it's somewhat superfluous to then have to deal with possible errors in the success block. Procrastinator should automatically rescue StandardError and just print a message to stderr.
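
A minimal sketch of that behaviour, assuming the task exposes an optional #success hook. The helper name run_success_hook and the err: parameter are hypothetical and not part of the gem:

```ruby
# Hypothetical helper: run a task's success hook and report any
# StandardError to the given stream instead of letting it propagate.
def run_success_hook(task, result, err: $stderr)
  task.success(result) if task.respond_to?(:success)
rescue StandardError => e
  err.puts "Success hook error: #{e.class}: #{e.message}"
end

# Example task whose success hook misbehaves.
class NoisyTask
  def success(_result)
    raise ArgumentError, 'bad hook'
  end
end

run_success_hook(NoisyTask.new, :ok) # writes the message to stderr, does not raise
```

Only StandardError is rescued, so interrupts and other system-level exceptions would still propagate.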

Multithread QueueWorker

So that one task doesn't clog up the entire queue, QueueWorker should be multithreaded to handle max_tasks tasks at a time.
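
One possible shape for this, sketched with plain Ruby threads. The method name work_concurrently is hypothetical, and tasks are assumed to be callable objects; the real QueueWorker would also need to handle persistence and failures:

```ruby
# Hypothetical sketch: work off a list of callable tasks using at most
# max_tasks threads at a time.
def work_concurrently(tasks, max_tasks:)
  queue = Queue.new
  tasks.each { |task| queue << task }
  queue.close # once closed and empty, Queue#pop returns nil

  workers = Array.new(max_tasks) do
    Thread.new do
      # Each thread keeps pulling tasks until the queue is drained.
      while (task = queue.pop)
        task.call
      end
    end
  end

  workers.each(&:join)
end
```

With this arrangement a slow task occupies only one thread, and the remaining threads continue to drain the queue.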

Ability to optionally log to STDOUT when test_mode enabled

Procrastinator logs to STDOUT when test mode is enabled to reduce hits on the hard drive for faster test runs. While this is fine for the default procrastinator output, custom log messages (eg. Success Hooks) still need to be testable because they are app-defined. Those apps can use FakeFS or similar to avoid hitting the drive while still having a "file" to check for testing.

The Procrastinator env already has the methods #log_inside and #log_at_level, which suggests two options:

  1. Add a #log_to_stdout flag.
  2. Allow #log_inside to accept :stdout

Rename task_attr

This is an uninformative method name.

Some suggestions:

  • task_attributes
  • import_attributes
  • load_attributes

Determine behaviour on final_failed tasks

Currently, the behaviour is to keep failed and final_failed tasks for diagnosing. We could provide a syntax in setup to allow the environment to delete them instead (perhaps because the final_fail hook sends the data via email to an admin).

Procrastinator.setup(persister) do |env|
   env.define_queue(:test)

   env.delete_on_final_fail
end

handle_final_failure should not YAML dump the whole task object

It can spiral into a massive chain of object serialisation, depending on the instance variables declared in the task.

This is made worse by the style wherein tasks use task_attr declarations, which define a @logger, @context, @data and/or @scheduler, the majority of which could have overly complex serializations (that are unhelpful in a log message).

It's also feasible that Tasks would have additional instance variables, which would also get serialized for (possibly) no benefit.

The original intent of the log message including the YAML dumped task was to provide maximum context, especially if the task was to be deleted on final_fail. Some suitable options:

  • Only log data. The data is intended to be a simple data type anyway, so this risks far less serialization
  • Do not include task info in the log message at all and expect that if tasks are deleted on final_fail, then the data was not important.
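
The first option could look roughly like this. It is a sketch only: FailedTaskExample and final_failure_message are hypothetical names, and the task is assumed to expose its data through a #data reader:

```ruby
require 'yaml'

# Hypothetical stand-in for a task carrying heavyweight instance variables.
FailedTaskExample = Struct.new(:data, :logger, :context)

# Build the final-failure log message from the task's data alone,
# so the logger, context, and any other state are never serialized.
def final_failure_message(task)
  "Task failed permanently. Data: #{YAML.dump(task.data)}"
end

task = FailedTaskExample.new({ 'user_id' => 5 }, Object.new, Object.new)
message = final_failure_message(task)
```

Since data is meant to be a simple type, the dump stays short regardless of what else the task holds.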

Global event hooks

Setup should take a block for each queue, which will run for any task in that queue.

Procrastinator doesn't play nice with Passenger

In short, Procrastinator assumes that the parent process will stay alive, but Passenger kills/respawns processes all the time for resource conservation purposes. When Procrastinator spins off a subprocess, the child remembers the parent's PID and self-terminates when the parent disappears; however, Passenger treats those parent processes as semi-ephemeral, so the PID will change after the process is "paused" in Passenger.

The longterm fix here is to re-engineer Procrastinator to no longer have children rely on their parent process in that way.

For now, though, the workaround is to set Passenger's min instances to 0 and disable preloading. Neither is desirable, but both are necessary.

https://www.phusionpassenger.com/library/config/apache/reference/#passengermininstances
https://www.phusionpassenger.com/library/config/apache/reference/#passengerspawnmethod

What happens if a task is found that doesn't match a defined queue?

If a task is loaded from an old persistence, but the queue is no longer defined for it (or is renamed, etc), then there might be a small explosion.

Just a tiny one.

Should this be handled by Procrastinator, or should it be left to the app to manage its own data for this case?

Scheduler should raise classed errors

Right now it raises generic RuntimeErrors with a message, but it would be best to have specific classes for each type of problem. eg. TaskNotFoundError
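
A sketch of what such a hierarchy might look like. Only TaskNotFoundError is named in this issue; the SchedulerErrors namespace, the SchedulerError base class, and find_task! are hypothetical:

```ruby
# Hypothetical namespace for scheduler-specific errors.
module SchedulerErrors
  # Shared base class so callers can rescue every scheduler error at once.
  class SchedulerError < StandardError; end

  class TaskNotFoundError < SchedulerError; end
end

# Hypothetical lookup that raises the classed error instead of a RuntimeError.
def find_task!(tasks, id)
  tasks.fetch(id) do
    raise SchedulerErrors::TaskNotFoundError, "no task found with id #{id}"
  end
end
```

Callers could then rescue TaskNotFoundError specifically rather than matching on a RuntimeError's message.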

Reschedule is broken

Calling reschedule fails because it is invoked on the wrong receiver.

Perils of over-mocking tests, I guess.

Providing a process name prefix

To better control / understand the processes running on a system, the dev may want to be able to specify a prefix (say, the name of the app) for each subprocess.

eg.

Procrastinator.setup(persister) do |env|
   env.define_queue(:test)
   env.subprocess_prefix = 'my-app'
end

Would produce queue process my-app-test-queue-worker.
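
The naming rule could be sketched as below. The helper worker_process_name is hypothetical; only the resulting name format comes from this issue:

```ruby
# Hypothetical helper: build the process title for a queue worker,
# prepending the app-supplied prefix when one is set.
def worker_process_name(queue_name, prefix: nil)
  [prefix, "#{queue_name}-queue-worker"].compact.join('-')
end

worker_process_name(:test, prefix: 'my-app')
worker_process_name(:email)
```

The resulting string could then be handed to Process.setproctitle inside the subprocess.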

Cross platform support

Multiprocessing uses Process.fork right now, but that is not friendly to Windows (or some BSDs). Look into gems like childprocess to wrap the child-spawning process in a cross-platform way.

Rename `#log_inside`

#log_inside could be misinterpreted as inside-vs-outside.

#log_directory would be a clearer name.

Improve testing for parent process monitoring

Right now it's very limited, and multi processing is a complex topic. I would like to see better testing of the looping mechanism, the sleep & its duration, and perhaps a full live integration test.

Setup should run on all processes

It should run the setup block on the main process and again on all subprocess queues, ignoring #define_queue in subprocesses.

This would eliminate the need for each_process entirely, rolling it into #setup itself. Doing so is conceptually simpler and also reduces duplicated provide_context statements.

Accept JSON or YAML file queue definitions

Something like:

Procrastinator.setup(task_repo, 'config/queues.yml')
# or
Procrastinator.setup(task_repo, 'config/queues.json')

Would be useful for projects that prefer to define their queue behaviour in external config files.

It could use the file extension to determine the file format.
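
A rough sketch of the extension-based dispatch. The method name load_queue_definitions is hypothetical, and the structure of the config file itself is left undecided here:

```ruby
require 'json'
require 'yaml'

# Hypothetical loader: pick the parser from the file extension and
# return the parsed queue definitions.
def load_queue_definitions(path)
  case File.extname(path).downcase
  when '.yml', '.yaml'
    YAML.safe_load(File.read(path))
  when '.json'
    JSON.parse(File.read(path))
  else
    raise ArgumentError, "Unsupported queue config format: #{path}"
  end
end
```

Raising on an unknown extension keeps a mistyped filename from silently producing an empty configuration.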

Standalone script

If a dev wants to manage the task queue with an outside system (cron, monit, etc) then Procrastinator should provide a short script like delayed_job's start-stop script.

Worker queue log entry doesn't include prefix

The worker started log entry doesn't include the prefix, which it should, since people will want to be able to look for it by process title.

eg.

"Started worker process, email-queue-worker, to work off queue email"

should be

"Started worker process, myapp-email-queue-worker, to work off queue email"

Double queue workers

When the app restarts, the main process will die and then main-prime will start its own workers.

Because of the heartbeat delay, there could be a short period with two QueueWorker processes for the same queue.

Subprocess parent id trickery

It's feasible that the parent could be killed and its PID reused (by anything else) before the child checks for it, in which case the child would never terminate itself.

The children should use a more robust method to watch for their parent process. Maybe also watch the process name or the start time.

It should pass in an app container to task init

Tasks should be given an app object (eg. a DI container) that can be used as needed by the task to complete its duties.

For example, a run method might need to fetch data from the app's persistence, but each task does not need to be creating its own persistence env and set of persisters when they could share one that is passed in.

It may be helpful/necessary for the setup to take such an object to be passed in (or a lambda to be run for each task, etc)
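
As a sketch of the idea, assuming the container is a plain Hash of shared services. ReportTask, build_task, and the :persister key are all hypothetical:

```ruby
# Hypothetical task that receives a shared app container on init
# instead of building its own persistence environment.
class ReportTask
  def initialize(container)
    @persister = container.fetch(:persister)
  end

  def run
    @persister.call
  end
end

# Hypothetical factory: the same container is handed to every task.
def build_task(task_class, container)
  task_class.new(container)
end

container = { persister: -> { :rows } }
build_task(ReportTask, container).run
```

A lambda-based variant would simply call the lambda once per task to produce the container.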

Allow custom rescheduling algo

Rescheduling is currently done on a static formula, but devs may want to tweak that process. Perhaps a command on the Environment like:

Procrastinator.setup(persister) do |env|
   env.define_queue(:test)
   env.reschedule_with do |attempts|
      # define custom formula here
   end
end
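
Behind that syntax, the environment might store the block and fall back to a default. This is a sketch under assumptions: ExampleEnvironment and #reschedule_delay are hypothetical, and DEFAULT_FORMULA is an arbitrary placeholder rather than the gem's actual formula:

```ruby
# Hypothetical stand-in for the setup environment.
class ExampleEnvironment
  # Placeholder default; not necessarily the formula the gem uses.
  DEFAULT_FORMULA = ->(attempts) { 30 + attempts**4 }

  def initialize
    @reschedule_formula = DEFAULT_FORMULA
  end

  # Store the dev-supplied formula for later use.
  def reschedule_with(&block)
    raise ArgumentError, 'reschedule_with requires a block' unless block

    @reschedule_formula = block
  end

  # Seconds to wait before the next attempt.
  def reschedule_delay(attempts)
    @reschedule_formula.call(attempts)
  end
end
```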

Raise error when scheduling in past

#delay should raise an error if the task being scheduled is set for more than a short period in the past (to allow the clock ticking over while scheduling it)

Options:

  1. Provide a setting that the dev can use to customize it (with reasonable default)
  2. Use the sleep-loop time as the limit
  3. Use a constant (eg. 5 sec)
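
Options 1 and 3 could be combined as a setting with a constant default, roughly as follows. PastScheduleError, validate_run_at!, and the tolerance: parameter are hypothetical names:

```ruby
# Hypothetical error class for tasks scheduled too far in the past.
class PastScheduleError < StandardError; end

# Raise if run_at is more than `tolerance` seconds before `now`;
# the tolerance allows for the clock ticking over while scheduling.
def validate_run_at!(run_at, now: Time.now, tolerance: 5)
  if run_at < now - tolerance
    raise PastScheduleError,
          "Cannot schedule task for #{run_at}; it is more than #{tolerance}s in the past"
  end

  run_at
end
```

Passing now: explicitly keeps the check easy to test without stubbing the clock.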

Add a testing mode

A testing mode should prevent looping, sleeping, and forking. Those are all gem details that end apps shouldn't need to care about in their tests.

Either find a way to use the external API without loops, or add a flag to disable them.
