mhenrixon / sidekiq-unique-jobs

Prevents duplicate Sidekiq jobs

License: MIT License

Shell 0.21% Ruby 88.87% HTML 4.81% JavaScript 1.54% CSS 0.01% Lua 4.55% Slim 0.01%

ruby sidekiq redis

sidekiq-unique-jobs's Introduction

SidekiqUniqueJobs

Support Me

Want to show me some ❤️ for the hard work I do on this gem? You can use the following PayPal link: https://paypal.me/mhenrixon1. Any amount is welcome and let me tell you it feels good to be appreciated. Even a dollar makes me super excited about all of this.

Introduction
Usage
Requirements
Locks
Conflict Strategy
- log
- raise
- reject
- replace
- reschedule
- Custom Strategies
- Cleanup Dead Locks
Debugging
Testing
- Validating Worker Configuration
- Uniqueness
Configuration
Communication
Contributing
Contributors

Introduction

This gem adds unique constraints to Sidekiq jobs. The uniqueness is achieved by creating a set of keys in Redis based on the queue, class and args found in the Sidekiq job hash.

By default, only one lock for a given hash can be acquired. What happens when a lock can't be acquired is governed by the chosen Conflict Strategy. If no conflict strategy is chosen, the job that fails to acquire the lock is dropped (see the notes under While Executing).
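To make the idea concrete, here is a minimal sketch of how a lock key could be derived from the job hash. It is an illustration only: the helper name sketch_lock_digest, the use of MD5 and the JSON encoding are assumptions made for this example, not the gem's actual digest algorithm.

```ruby
require "digest/md5"
require "json"

# Hypothetical helper, NOT the gem's real implementation: builds a lock key
# from the parts of the Sidekiq job hash that define uniqueness.
def sketch_lock_digest(job, prefix: "uniquejobs")
  payload = {
    "class" => job["class"],
    "queue" => job["queue"],
    "args"  => job["args"],
  }
  "#{prefix}:#{Digest::MD5.hexdigest(JSON.generate(payload))}"
end

job_a = { "class" => "MyWorker", "queue" => "default", "args" => [1] }
job_b = { "class" => "MyWorker", "queue" => "default", "args" => [1] }
job_c = { "class" => "MyWorker", "queue" => "default", "args" => [2] }

same_key  = sketch_lock_digest(job_a) == sketch_lock_digest(job_b) # identical jobs share a key
other_key = sketch_lock_digest(job_a) == sketch_lock_digest(job_c) # different args, different key
```

Two pushes with the same class, queue and args map to the same key, which is what lets a second push be detected as a duplicate.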

This is the documentation for the main branch. You can find the documentation for each release by navigating to its tag.

Here are links to some of the old versions

Usage

Installation

Add this line to your application's Gemfile:

gem 'sidekiq-unique-jobs'

And then execute:

bundle

Add the middleware

Before v7, the middleware was configured automatically. Since some people reported issues with other gems (see Other Sidekiq Gems) it was decided to give full control over to the user.

NOTE if you want to use the reaper you also need to configure the server middleware.

The following shows how to modify your config/initializers/sidekiq.rb file to use the middleware. Here is a full example.

require "sidekiq-unique-jobs"

Sidekiq.configure_server do |config|
  config.redis = { url: ENV["REDIS_URL"], driver: :hiredis }

  config.client_middleware do |chain|
    chain.add SidekiqUniqueJobs::Middleware::Client
  end

  config.server_middleware do |chain|
    chain.add SidekiqUniqueJobs::Middleware::Server
  end

  SidekiqUniqueJobs::Server.configure(config)
end

Sidekiq.configure_client do |config|
  config.redis = { url: ENV["REDIS_URL"], driver: :hiredis }

  config.client_middleware do |chain|
    chain.add SidekiqUniqueJobs::Middleware::Client
  end
end

Your first worker

The lock type you are most likely to use is :until_executed. This type of lock is held from when UntilExecutedWorker.perform_async is called until right after UntilExecutedWorker.new.perform has been called.

# frozen_string_literal: true

class UntilExecutedWorker
  include Sidekiq::Worker

  sidekiq_options lock: :until_executed

  def perform
    logger.info("cowboy")
    sleep(1) # hardcore processing
    logger.info("beebop")
  end
end

You can read more about the worker configuration in Worker Configuration below.

Requirements

Sidekiq >= 5.0 (>= 5.2 recommended)
Ruby:
- MRI >= 2.5 (>= 2.6 recommended)
- JRuby >= 9.0 (>= 9.2 recommended)
- Truffleruby
Redis Server >= 3.2 (>= 5.0 recommended)
ActiveJob officially not supported
redis-namespace officially not supported

See Sidekiq requirements for detailed requirements of Sidekiq itself (be sure to check the right Sidekiq version).

Locks

Until Executing

A lock is created when UntilExecuting.perform_async is called. Then it is either unlocked when lock_ttl is hit or before Sidekiq calls the perform method on your worker.

Example worker

class UntilExecuting
  include Sidekiq::Worker

  sidekiq_options lock: :until_executing

  def perform(id)
    # Do work
  end
end

NOTE: because the lock is released before perform runs, this lock is probably not a good fit for jobs that shouldn't run simultaneously (for example, slow jobs).

The reason this type of lock exists is to fix the following problem: sidekiq/issues/3471

Until Executed

A lock is created when UntilExecuted.perform_async is called. Then it is either unlocked when lock_ttl is hit or when Sidekiq has called the perform method on your worker.

Example worker

class UntilExecuted
  include Sidekiq::Worker

  sidekiq_options lock: :until_executed

  def perform(id)
    # Do work
  end
end
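The lifecycle described above can be sketched with a toy, in-memory model. This is only an illustration of the locking order (lock on push, unlock after perform); the class ToyUntilExecutedQueue and its methods are hypothetical and have nothing to do with the gem's Redis-backed implementation.

```ruby
require "set"

# Toy model of an :until_executed lock (hypothetical, in-memory only).
class ToyUntilExecutedQueue
  attr_reader :performed

  def initialize
    @locks = Set.new
    @queue = []
    @performed = []
  end

  # Client side: take the lock when pushing; a duplicate push returns nil.
  def push(args)
    return nil unless @locks.add?(args)

    @queue << args
    args
  end

  # Server side: run the job, then release the lock.
  def work_one
    args = @queue.shift
    return nil if args.nil?

    @performed << args  # stand-in for calling perform(*args)
    @locks.delete(args) # unlock after perform has been called
    args
  end
end

queue  = ToyUntilExecutedQueue.new
first  = queue.push([1]) # lock acquired
second = queue.push([1]) # rejected, the lock is still held
queue.work_one           # performs the job and unlocks
third  = queue.push([1]) # accepted again
```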

Until Expired

This lock behaves identically to Until Executed except for one thing: the job won't be unlocked until the expiration is hit. For jobs that need to run only once per day, this would be the perfect lock. This way, we can't create more jobs until one day after this job was first pushed.

Example worker

class UntilExpired
  include Sidekiq::Worker

  sidekiq_options lock: :until_expired, lock_ttl: 1.day

  def perform
    # Do work
  end
end

Until And While Executing

This lock is a combination of two locks (:until_executing and :while_executing). Please see the configuration for Until Executing and While Executing.

Example worker

class UntilAndWhileExecutingWorker
  include Sidekiq::Worker

  sidekiq_options lock: :until_and_while_executing,
                  lock_timeout: 2,
                  on_conflict: {
                    client: :log,
                    server: :raise
                  }
  def perform(id)
    # Do work
  end
end

While Executing

With this lock, jobs are put on the queue without any type of locking mechanism. The locking doesn't happen until Sidekiq pops the job from the queue and starts processing it.

Example worker

class WhileExecutingWorker
  include Sidekiq::Worker

  sidekiq_options lock: :while_executing,
                  lock_timeout: 2,
                  on_conflict: {
                    server: :raise
                  }
  def perform(id)
    # Do work
  end
end

NOTE: Unless a conflict strategy of :raise is specified, the job will be dropped without notice if the lock fails. When told to raise, the job will be put back and retried. It would also be possible to use :reschedule with this lock.

NOTE: Unless this job is configured with lock_timeout: nil or a lock_timeout greater than 0, jobs that cannot acquire the lock will just be dropped without waiting.

There is an example of this to try it out in the myapp application. Run foreman start in the root of the directory and open the url: localhost:5000/work/duplicate_while_executing.

In the console you should see something like:

10:32:24 worker.1 | 2017-04-23T08:32:24.955Z 84404 TID-ougq4thko WhileExecutingWorker JID-400ec51c9523f41cd4a35058 INFO: start
10:32:24 worker.1 | 2017-04-23T08:32:24.956Z 84404 TID-ougq8csew WhileExecutingWorker JID-8d6d9168368eedaed7f75763 INFO: start
10:32:24 worker.1 | 2017-04-23T08:32:24.957Z 84404 TID-ougq8crt8 WhileExecutingWorker JID-affcd079094c9b26e8b9ba60 INFO: start
10:32:24 worker.1 | 2017-04-23T08:32:24.959Z 84404 TID-ougq8cs8s WhileExecutingWorker JID-9e197460c067b22eb1b5d07f INFO: start
10:32:24 worker.1 | 2017-04-23T08:32:24.959Z 84404 TID-ougq4thko WhileExecutingWorker JID-400ec51c9523f41cd4a35058 WhileExecutingWorker INFO: perform(1, 2)
10:32:34 worker.1 | 2017-04-23T08:32:34.964Z 84404 TID-ougq4thko WhileExecutingWorker JID-400ec51c9523f41cd4a35058 INFO: done: 10.009 sec
10:32:34 worker.1 | 2017-04-23T08:32:34.965Z 84404 TID-ougq8csew WhileExecutingWorker JID-8d6d9168368eedaed7f75763 WhileExecutingWorker INFO: perform(1, 2)
10:32:44 worker.1 | 2017-04-23T08:32:44.965Z 84404 TID-ougq8crt8 WhileExecutingWorker JID-affcd079094c9b26e8b9ba60 WhileExecutingWorker INFO: perform(1, 2)
10:32:44 worker.1 | 2017-04-23T08:32:44.965Z 84404 TID-ougq8csew WhileExecutingWorker JID-8d6d9168368eedaed7f75763 INFO: done: 20.009 sec
10:32:54 worker.1 | 2017-04-23T08:32:54.970Z 84404 TID-ougq8cs8s WhileExecutingWorker JID-9e197460c067b22eb1b5d07f WhileExecutingWorker INFO: perform(1, 2)
10:32:54 worker.1 | 2017-04-23T08:32:54.969Z 84404 TID-ougq8crt8 WhileExecutingWorker JID-affcd079094c9b26e8b9ba60 INFO: done: 30.012 sec
10:33:04 worker.1 | 2017-04-23T08:33:04.973Z 84404 TID-ougq8cs8s WhileExecutingWorker JID-9e197460c067b22eb1b5d07f INFO: done: 40.014 sec

Custom Locks

You may need to define a custom lock. You can define it in a folder in your project:

# lib/locks/my_custom_lock.rb
module Locks
  class MyCustomLock < SidekiqUniqueJobs::Lock::BaseLock
    def execute
      # Do something ...
    end
  end
end

You can refer to all the locks defined in lib/sidekiq_unique_jobs/lock/*.rb.

To make it available, register it during your project's startup (for a Rails application, in config/initializers/sidekiq_unique_jobs.rb; for other projects, wherever you prefer):

SidekiqUniqueJobs.configure do |config|
  config.add_lock :my_custom_lock, Locks::MyCustomLock
end

And then you can use it in the jobs definition:

sidekiq_options lock: :my_custom_lock, on_conflict: :log

Please note that if you try to override a default lock, an ArgumentError will be raised.

Conflict Strategy

Decides how we handle conflict. We can either reject the job to the dead queue or reschedule it. Both are useful for jobs that absolutely need to run and have been configured to use the lock WhileExecuting that is used only by the sidekiq server process.

Furthermore, log can be used with the locks UntilExecuted and UntilExpired. In that case we write a log entry saying the job could not be pushed because it is a duplicate of another job with the same arguments.

It is possible for locks to have different conflict strategy for the client and server. This is useful for :until_and_while_executing.

sidekiq_options lock: :until_and_while_executing,
                on_conflict: { client: :log, server: :reject }
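As a rough illustration of how a single option can carry either one strategy or a client/server pair, here is a small sketch. The helper resolve_on_conflict is hypothetical and written only for this example; it is not the gem's internal lookup.

```ruby
# Hypothetical helper (not part of the gem): picks the strategy that applies
# to one side (:client or :server) from an on_conflict option.
def resolve_on_conflict(on_conflict, side)
  case on_conflict
  when Hash then on_conflict[side] # { client: ..., server: ... } form
  else on_conflict                 # a single strategy applies to both sides
  end
end

split  = { client: :log, server: :reject }
client = resolve_on_conflict(split, :client)
server = resolve_on_conflict(split, :server)
single = resolve_on_conflict(:raise, :server)
```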

log

sidekiq_options on_conflict: :log

This strategy is intended to be used with UntilExecuted and UntilExpired. It will log a line that this job is a duplicate of another.

raise

sidekiq_options on_conflict: :raise

This strategy is intended to be used with WhileExecuting. Basically it will allow us to let the server process crash with a specific error message and be retried without messing up the Sidekiq stats.

reject

sidekiq_options on_conflict: :reject

This strategy is intended to be used with WhileExecuting and will push the job to the dead queue on conflict.

replace

sidekiq_options on_conflict: :replace

This strategy is intended to be used with client locks like UntilExecuted. It will delete any existing job for these arguments from retry, schedule and queue and retry the lock again.

This is slightly dangerous and should probably only be used for jobs that are always scheduled in the future. Currently it only attempts to retry the lock once.

reschedule

sidekiq_options on_conflict: :reschedule

This strategy is intended to be used with WhileExecuting and will delay the job to be tried again in 5 seconds. This will mess up the Sidekiq stats but will prevent exceptions from being logged and confusing your sysadmins.

Custom Strategies

You may need to define a custom strategy. You can define it in a folder in your project:

# lib/strategies/my_custom_strategy.rb
module Strategies
  class MyCustomStrategy < SidekiqUniqueJobs::OnConflict::Strategy
    def call
      # Do something ...
    end
  end
end

You can refer to all the strategies defined in lib/sidekiq_unique_jobs/on_conflict.

To make it available, register it during your project's startup (for a Rails application, in config/initializers/sidekiq_unique_jobs.rb; for other projects, wherever you prefer):

SidekiqUniqueJobs.configure do |config|
  config.add_strategy :my_custom_strategy, Strategies::MyCustomStrategy
end

And then you can use it in the jobs definition:

sidekiq_options lock: :while_executing, on_conflict: :my_custom_strategy

Please note that if you try to override a default strategy, an ArgumentError will be raised.

Cleanup Dead Locks

For Sidekiq versions < 5.1 a sidekiq_retries_exhausted block is required per worker class. This is deprecated in Sidekiq 6.0.

class MyWorker
  include Sidekiq::Worker

  sidekiq_retries_exhausted do |msg, _ex|
    digest = msg['lock_digest']
    SidekiqUniqueJobs::Digests.new.delete_by_digest(digest) if digest
  end
end

Starting in v5.1, Sidekiq can also fire a global callback when a job dies. In version 7 of this gem, this is handled automatically for you: if you configure the gem as described in Add the middleware, you don't need to add the death handler below.

Sidekiq.configure_server do |config|
  config.death_handlers << ->(job, _ex) do
    digest = job['lock_digest']
    SidekiqUniqueJobs::Digests.new.delete_by_digest(digest) if digest
  end
end

Debugging

There are several ways of removing keys that are stuck. The preferred way is by using the unique extension to Sidekiq::Web. The old console and command line versions still work but might be deprecated in the future. It is better to search for the digest itself and delete the keys matching that digest.

Sidekiq Web

To use the web extension you need to require it in your routes.

#app/config/routes.rb
require 'sidekiq_unique_jobs/web'
mount Sidekiq::Web, at: '/sidekiq'

There is no need to require 'sidekiq/web' since sidekiq_unique_jobs/web already does this.

To filter/search for keys we can use the wildcard *. If we have a unique digest 'uniquejobs:9e9b5ce5d423d3ea470977004b50ff84' we can search for it by entering *ff84 and it should return all digests that end with ff84.
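The wildcard behaves like a glob pattern. The sketch below mimics that matching locally with Ruby's File.fnmatch?, assuming the search follows ordinary glob rules; it does not talk to Redis or to the web extension, and the second digest is made up for the example.

```ruby
digests = [
  "uniquejobs:9e9b5ce5d423d3ea470977004b50ff84",
  "uniquejobs:0123456789abcdef0123456789abcdef", # made-up digest for illustration
]

# "*ff84" matches any digest that ends with ff84.
matches = digests.select { |digest| File.fnmatch?("*ff84", digest) }
```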

Reflections (metrics, logging, etc.)

To make it possible to gather some insights into what is going on inside this gem, I provide a reflection API.

To setup reflections for logging or metrics, use the following API:

def extract_log_from_job(message, job_hash)
  worker    = job_hash['class']
  args      = job_hash['args']
  lock_args = job_hash['lock_args']
  queue     = job_hash['queue']
  {
    message: message,
    worker: worker,
    args: args,
    lock_args: lock_args,
    queue: queue
  }
end

SidekiqUniqueJobs.reflect do |on|
  on.lock_failed do |job_hash|
    message = extract_log_from_job('Lock Failed', job_hash)
    Sidekiq.logger.warn(message)
  end
end

after_unlock_callback_failed

This is called when you have configured a custom callback for when a lock has been released.

error

Not in use yet but will be used deep into the stack to provide a means to catch and report errors inside the gem.

execution_failed

For certain jobs, this reflection is fired when the Sidekiq processor picks the job off the queue and your job raises an error to the middleware. It is probably nothing to worry about. When your worker raises an error, we need to handle some edge cases for until and while executing.

lock_failed

If we can't acquire a lock, this will be the reflection. It most likely is nothing to worry about. We just couldn't retrieve a lock in a timely fashion.

The biggest reason for this reflection would be to gather metrics on which workers fail the most at the locking step for example.
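A minimal sketch of such a metric follows. The counter hash and the helper record_lock_failure are hypothetical names used only here; a real setup would more likely forward the count to a metrics service, and a shared hash like this would need synchronization across Sidekiq threads.

```ruby
# Hypothetical in-memory counter keyed by worker class name.
LOCK_FAILURES = Hash.new(0)

# Increments the failure count for the worker named in the job hash.
def record_lock_failure(job_hash, counts = LOCK_FAILURES)
  counts[job_hash["class"]] += 1
  counts
end

# It could be wired up the same way as the logging example:
#   SidekiqUniqueJobs.reflect do |on|
#     on.lock_failed { |job_hash| record_lock_failure(job_hash) }
#   end

counts = Hash.new(0)
record_lock_failure({ "class" => "SlowWorker" }, counts)
record_lock_failure({ "class" => "SlowWorker" }, counts)
record_lock_failure({ "class" => "FastWorker" }, counts)
```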

locked

For when a lock has been successful. Again, mostly useful for metrics I suppose.

reschedule_failed

For when the reschedule strategy failed to reschedule the job.

rescheduled

For when a job was successfully rescheduled.

timeout

This is also mostly useful for reporting/metrics purposes. What this reflection does is signal that the job was configured to wait (lock_timeout was configured), but we couldn't retrieve a lock even though we waited for some time.

unlock_failed

This means that the server middleware could not unlock your job and the lock is kept (potentially preventing subsequent jobs from being pushed or processed).

unlocked

Also mostly useful for reporting purposes. The job was successfully unlocked.

unknown_sidekiq_worker

The reason this happens is that the server couldn't find a valid sidekiq worker class. Most likely, that worker isn't intended to be processed by this sidekiq server instance.

Show Locks

Show Lock

Testing

Validating Worker Configuration

Since v7 it is possible to perform some simple validation against your workers' sidekiq_options. What it does is scan for some issues that are known to cause problems in production.

Let's take a bad worker:

#app/workers/bad_worker.rb
class BadWorker
  include Sidekiq::Worker

  sidekiq_options lock: :while_executing, on_conflict: :replace
end

#spec/workers/bad_worker_spec.rb

require "sidekiq_unique_jobs/testing"

RSpec.describe BadWorker do
  specify { expect(described_class).to have_valid_sidekiq_options }
end

This gives us a helpful error message for a wrongly configured worker:

Expected BadWorker to have valid sidekiq options but found the following problems:
    on_server_conflict: :replace is incompatible with the server process

If you are not using RSpec (a lot of people prefer minitest or test unit) you can do something like:

assert_raise(SidekiqUniqueJobs::InvalidWorker) { SidekiqUniqueJobs.validate_worker!(BadWorker.get_sidekiq_options) }

Uniqueness

This has probably been the most confusing part of this gem. People get really confused by how unreliable the unique jobs have been. I therefore decided to do what Mike is doing for Sidekiq Enterprise. Read the section about unique jobs: Enterprise unique jobs

SidekiqUniqueJobs.configure do |config|
  config.enabled = !Rails.env.test?
  config.logger_enabled = !Rails.env.test?
end

If you truly wanted to test the sidekiq client push you could do something like below. Note that it will only work for the jobs that lock when the client pushes the job to redis (UntilExecuted, UntilAndWhileExecuting and UntilExpired).

require "sidekiq_unique_jobs/testing"

RSpec.describe Workers::CoolOne do
  before do
    SidekiqUniqueJobs.config.enabled = false
  end

  # ... your tests that don't test uniqueness

  context 'when Sidekiq::Testing.disabled?' do
    before do
      Sidekiq::Testing.disable!
      Sidekiq.redis(&:flushdb)
    end

    after do
      Sidekiq.redis(&:flushdb)
    end

    it 'prevents duplicate jobs from being scheduled' do
      SidekiqUniqueJobs.use_config(enabled: true) do
        expect(described_class.perform_in(3600, 1)).not_to eq(nil)
        expect(described_class.perform_async(1)).to eq(nil)
      end
    end
  end
end

It is recommended to leave the uniqueness testing to the gem maintainers. If you care about how the gem is integration tested have a look at the following specs:

Configuration

Other Sidekiq gems

apartment-sidekiq

It was reported in #536 that the order of the Sidekiq middleware needs to be as follows.

Sidekiq.client_middleware do |chain|
  chain.add Apartment::Sidekiq::Middleware::Client
  chain.add SidekiqUniqueJobs::Middleware::Client
end

Sidekiq.server_middleware do |chain|
  chain.add Apartment::Sidekiq::Middleware::Server
  chain.add SidekiqUniqueJobs::Middleware::Server
end

The reason is that this gem needs to be configured AFTER the apartment gem, or the apartment will not be considered for uniqueness.

sidekiq-global_id

It was reported in #235 that the order of the Sidekiq middleware needs to be as follows.

For a working setup check the following file.

Sidekiq.client_middleware do |chain|
  chain.add Sidekiq::GlobalId::ClientMiddleware
  chain.add SidekiqUniqueJobs::Middleware::Client
end

Sidekiq.server_middleware do |chain|
  chain.add Sidekiq::GlobalId::ServerMiddleware
  chain.add SidekiqUniqueJobs::Middleware::Server
end

The reason for this is that the global id needs to be set before the unique jobs middleware runs. Otherwise that won't be available for uniqueness.

sidekiq-status

It was reported in #564 that the order of the middleware needs to be as follows.

# Thanks to @ArturT for the correction

Sidekiq.configure_server do |config|
  config.client_middleware do |chain|
    chain.add SidekiqUniqueJobs::Middleware::Client
    chain.add Sidekiq::Status::ClientMiddleware, expiration: 30.minutes
  end

  config.server_middleware do |chain|
    chain.add Sidekiq::Status::ServerMiddleware, expiration: 30.minutes
    chain.add SidekiqUniqueJobs::Middleware::Server
  end

  SidekiqUniqueJobs::Server.configure(config)
end


Sidekiq.configure_client do |config|
  config.client_middleware do |chain|
    chain.add SidekiqUniqueJobs::Middleware::Client
    chain.add Sidekiq::Status::ClientMiddleware, expiration: 30.minutes
  end
end

The reason for this is that if a job is a duplicate, it shouldn't reach the status middleware at all. Status is just a monitor, so to prevent clashes and leftovers and to ensure cleanup, the status middleware should run after uniqueness on the client and before it on the server. This will lead to fewer surprises.
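Why the order matters on the client can be seen in a toy middleware chain. Everything in this sketch (run_chain and the two lambdas) is a hypothetical stand-in, not Sidekiq's chain or either gem's middleware; it only shows that a middleware which stops a duplicate keeps later middleware from ever seeing it.

```ruby
require "set"

# Toy middleware runner: each middleware decides whether to call the next one.
def run_chain(middlewares, job)
  step = lambda do |index|
    next job if index == middlewares.size

    middlewares[index].call(job) { step.call(index + 1) }
  end
  step.call(0)
end

locks = Set.new
seen_by_status = []

# Stand-in for the uniqueness client middleware: stops the chain on duplicates.
unique = lambda { |job, &nxt| locks.add?(job["args"]) ? nxt.call : nil }

# Stand-in for a status client middleware: records every job it sees.
status = lambda do |job, &nxt|
  seen_by_status << job["args"]
  nxt.call
end

first  = run_chain([unique, status], { "args" => [1] }) # passes through both
second = run_chain([unique, status], { "args" => [1] }) # stopped by unique
```

With uniqueness first, the status stand-in records the job only once; with the order reversed it would also record the rejected duplicate.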

Global Configuration

The gem supports a few different configuration options that might be of interest if you run into some weird issues.

Configure SidekiqUniqueJobs in an initializer or the sidekiq initializer on application startup.

SidekiqUniqueJobs.configure do |config|
  config.logger = Sidekiq.logger # default, change at your own discretion
  config.logger_enabled  = true # default, disable for test environments
  config.debug_lua       = false # Turn on when debugging
  config.lock_info       = false # Turn on when debugging
  config.lock_ttl        = 600   # Expire locks after 10 minutes
  config.lock_timeout    = nil   # turn off lock timeout
  config.max_history     = 0     # Turn on when debugging
  config.reaper          = :ruby # :ruby, :lua or :none/nil
  config.reaper_count    = 1000  # Stop reaping after this many keys
  config.reaper_interval = 600   # Reap orphans every 10 minutes
  config.reaper_timeout  = 150   # Timeout reaper after 2.5 minutes
end

debug_lua

SidekiqUniqueJobs.config.debug_lua #=> false

Turning on debug_lua will allow the lua scripts to output debug information about what the lua scripts do. It will log all redis commands that are executed and also some helpful messages about what is going on inside the lua script.

lock_timeout

SidekiqUniqueJobs.config.lock_timeout #=> 0

Set a global lock_timeout to use for all jobs that don't otherwise specify a lock_timeout.

Lock timeout decides how long to wait for acquiring the lock. A value of nil means to wait indefinitely for a lock resource to become available.

lock_ttl

SidekiqUniqueJobs.config.lock_ttl #=> nil

Set a global lock_ttl to use for all jobs that don't otherwise specify a lock_ttl.

Lock TTL decides how long to wait at most before considering a lock to be expired and making it possible to reuse that lock.

enabled

SidekiqUniqueJobs.config.enabled #=> true

Globally turn the locking mechanism on or off.

logger

SidekiqUniqueJobs.config.logger #=> #<Sidekiq::Logger:0x00007fdc1f96d180>

By default this gem piggybacks on the Sidekiq logger. It is not recommended to change this as the gem uses some features in the Sidekiq logger and you might run into problems. If you need a different logger and you do run into problems then get in touch and we'll see what we can do about it.

max_history

SidekiqUniqueJobs.config.max_history #=> 1_000

The max_history setting can be used to tweak the number of changelogs generated. It can also be completely turned off if performance suffers or if you are just not interested in using the changelog.

This is a log that can be accessed by a lock to see what happened for that lock. Any items after the configured max_history will be automatically deleted as new items are added.
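The trimming behaviour can be pictured with a capped list. This sketch is hypothetical: the gem stores its changelog in Redis, and the assumptions that the oldest entries are dropped first and that 0 means "off" are mine.

```ruby
# Hypothetical capped changelog kept in a plain Array.
def append_history(history, entry, max_history)
  return history if max_history.zero? # assumption: 0 turns the changelog off

  history << entry
  history.shift while history.size > max_history # assumption: drop the oldest entries
  history
end

history = []
%w[queued locked unlocked].each { |event| append_history(history, event, 2) }

disabled = append_history([], "queued", 0)
```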

reaper

SidekiqUniqueJobs.config.reaper #=> :ruby

If using the orphans cleanup process it is critical to be aware of the following. The :ruby job is much slower but the :lua job locks Redis while executing. While doing intense processing it is best to avoid locking Redis with a lua script. Therefore the batch size (controlled by the reaper_count setting) needs to be reduced.

In my benchmarks deleting 1000 orphaned locks with lua performs around 65% faster than deleting 1000 keys in ruby.

On the other hand if I increase it to 10 000 orphaned locks per cleanup (reaper_count: 10_000) then Redis starts throwing:

BUSY Redis is busy running a script. You can only call SCRIPT KILL or SHUTDOWN NOSAVE. (RedisClient::CommandError)

If you want to disable the reaper set it to :none, nil or false. Actually, any value that isn't :ruby or :lua will disable the reaping.

SidekiqUniqueJobs.config.reaper = :none
SidekiqUniqueJobs.config.reaper = nil
SidekiqUniqueJobs.config.reaper = false

reaper_count

SidekiqUniqueJobs.config.reaper_count #=> 1_000

The reaper_count setting configures how many orphans at a time will be cleaned up by the orphan cleanup job. This might have to be tweaked depending on which orphan job is running.

reaper_interval

SidekiqUniqueJobs.config.reaper_interval #=> 600

The number of seconds between reaping.

reaper_timeout

SidekiqUniqueJobs.config.reaper_timeout #=> 10

The number of seconds to wait for the reaper to finish before raising a TimeoutError. This is done to ensure that the next time we reap isn't getting stuck due to the previous process already running.
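The effect of such a timeout can be sketched with Ruby's standard Timeout module. The wrapper reap_with_timeout is a hypothetical name for this example and is not how the gem schedules its reaper.

```ruby
require "timeout"

# Hypothetical wrapper: runs one reaping pass, giving up after reaper_timeout seconds.
def reap_with_timeout(reaper_timeout)
  Timeout.timeout(reaper_timeout) { yield }
end

# A pass that finishes in time returns normally.
fast = reap_with_timeout(1) { :reaped }

# A pass that takes too long is interrupted with Timeout::Error.
slow = begin
  reap_with_timeout(0.05) { sleep(1) }
rescue Timeout::Error
  :timed_out
end
```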

lock_prefix

SidekiqUniqueJobs.config.lock_prefix #=> "uniquejobs"

Use if you want a different key prefix for the keys in redis.

lock_info

SidekiqUniqueJobs.config.lock_info #=> false

Using lock info will create an additional key for the lock with a json object containing information about the lock. This will be presented in the web interface and might help track down why some jobs are getting stuck.

Worker Configuration

lock_info

Lock info gathers information about a specific lock. It collects things like which lock_args were used to compute the lock_digest that is used for maintaining uniqueness.

sidekiq_options lock_info: false # this is the default, set to true to turn on

lock_prefix

Use if you want a different key prefix for the keys in redis.

sidekiq_options lock_prefix: "uniquejobs" # this is the default value

lock_ttl

Lock TTL decides how long to wait at most before considering a lock to be expired and making it possible to reuse that lock.

Starting from v7 the expiration will take place when the job is pushed to the queue.

sidekiq_options lock_ttl: nil # default - don't expire keys
sidekiq_options lock_ttl: 20.days.to_i # expire this lock in 20 days

lock_timeout

This is the timeout (how long to wait) when creating the lock. By default we don't use a timeout so we won't wait for the lock to be created. If you want it is possible to set this like below.

sidekiq_options lock_timeout: 0 # default - don't wait at all
sidekiq_options lock_timeout: 5 # wait 5 seconds
sidekiq_options lock_timeout: nil # lock indefinitely, this process won't continue until it gets a lock. VERY DANGEROUS!!
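The three modes can be mirrored with a plain Mutex. This is only a sketch of the waiting rules: acquire_with_timeout and its polling interval are hypothetical, and the gem itself acquires its locks in Redis rather than with a Ruby Mutex.

```ruby
# Toy sketch of the lock_timeout modes using a Mutex as the lock.
def acquire_with_timeout(mutex, lock_timeout, poll_interval: 0.01)
  return mutex.try_lock if lock_timeout == 0 # don't wait at all

  if lock_timeout.nil? # wait indefinitely
    mutex.lock
    return true
  end

  # Otherwise keep trying until the deadline passes.
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + lock_timeout
  loop do
    return true if mutex.try_lock
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep(poll_interval)
  end
end

mutex = Mutex.new
got_free_lock   = acquire_with_timeout(mutex, 0)    # lock was free
no_wait_result  = acquire_with_timeout(mutex, 0)    # held, gives up immediately
waited_result   = acquire_with_timeout(mutex, 0.05) # held, gives up after waiting
mutex.unlock
after_unlock    = acquire_with_timeout(mutex, nil)  # free again, acquired
```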

unique_across_queues

This configuration option is slightly misleading. It doesn't disregard the queue on other jobs, just on itself. This means that a worker that might schedule jobs into multiple queues will be able to have uniqueness enforced on all queues it is pushed to.

This is mainly intended for Worker.set(queue: :another).perform_async.

class Worker
  include Sidekiq::Worker

  sidekiq_options unique_across_queues: true, queue: 'default'

  def perform(args); end
end

Now if you override the queue with Worker.set(queue: 'another').perform_async(1) it will still be considered unique when compared to Worker.perform_async(1) (that was actually pushed to the queue default).

unique_across_workers

This configuration option is slightly misleading. It doesn't disregard the worker class on other jobs, just on itself. This means that the worker class won't be used for generating the unique digest. The only way this option really makes sense is when you want to have uniqueness between two different worker classes.

class WorkerOne
  include Sidekiq::Worker

  sidekiq_options unique_across_workers: true, queue: 'default'

  def perform(args); end
end

class WorkerTwo
  include Sidekiq::Worker

  sidekiq_options unique_across_workers: true, queue: 'default'

  def perform(args); end
end


WorkerOne.perform_async(1)
# => 'the jobs unique id'

WorkerTwo.perform_async(1)
# => nil because WorkerOne just stole the lock
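One way to picture both unique_across_queues and unique_across_workers is as leaving a field out of the digest. The sketch below is an assumption-laden illustration: sketch_digest, MD5 and the JSON encoding are hypothetical and not the gem's actual digest code.

```ruby
require "digest/md5"
require "json"

# Hypothetical digest: the queue or the class is simply left out when the
# corresponding option is enabled.
def sketch_digest(job, unique_across_queues: false, unique_across_workers: false)
  parts = { "args" => job["args"] }
  parts["class"] = job["class"] unless unique_across_workers
  parts["queue"] = job["queue"] unless unique_across_queues
  Digest::MD5.hexdigest(JSON.generate(parts))
end

one = { "class" => "WorkerOne", "queue" => "default", "args" => [1] }
two = { "class" => "WorkerTwo", "queue" => "default", "args" => [1] }
moved = { "class" => "WorkerOne", "queue" => "another", "args" => [1] }

differ_by_default = sketch_digest(one) != sketch_digest(two)
shared_workers = sketch_digest(one, unique_across_workers: true) ==
                 sketch_digest(two, unique_across_workers: true)
shared_queues  = sketch_digest(one, unique_across_queues: true) ==
                 sketch_digest(moved, unique_across_queues: true)
```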

Finer Control over Uniqueness

Sometimes you want finer control over which arguments are used to determine the uniqueness of a job, because some arguments may be transient. For this use case, you need to define either a lock_args method or a Ruby proc.

NOTE: The lock_args method needs to return an array of values to use for the uniqueness check.

NOTE: The arguments passed to the proc or the method are always an array. If your method takes a single array as argument the value of args will be [[...]].

The method or the proc can return a modified version of args without the transient arguments included, as shown below:

class UniqueJobWithFilterMethod
  include Sidekiq::Worker
  sidekiq_options lock: :until_and_while_executing,
                  lock_args_method: :lock_args # this is default and will be used if such a method is defined

  def self.lock_args(args)
    [ args[0], args[2][:type] ]
  end

  ...

end

class UniqueJobWithFilterProc
  include Sidekiq::Worker
  sidekiq_options lock: :until_executed,
                  lock_args_method: ->(args) { [ args.first ] }

  ...

end
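To see what the filter receives and returns, the proc can be exercised on its own, outside Sidekiq. The sample arguments below are made up for illustration.

```ruby
# Same shape as the proc passed to lock_args_method above.
first_arg_only = ->(args) { [args.first] }

# A job pushed with two arguments arrives as a flat array.
plain = first_arg_only.call([1, "transient"])

# A job pushed with a single array argument arrives wrapped: [[1, 2]],
# so args.first is the inner array.
nested = first_arg_only.call([[1, 2]])
```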

It is possible to ensure different types of unique args based on context. I can't vouch for the below example but see #203 for the discussion.

class UniqueJobWithFilterMethod
  include Sidekiq::Worker
  sidekiq_options lock: :until_and_while_executing, lock_args_method: :lock_args

  def self.lock_args(args)
    if Sidekiq::ProcessSet.new.size > 1
      # sidekiq runtime; uniqueness for the object (first arg)
      args.first
    else
      # queuing from the app; uniqueness for all params
      args
    end
  end
end

After Unlock Callback

If you need to perform any additional work after the lock has been released you can provide an #after_unlock instance method. The method will be called when the lock has been unlocked. Most times this means after yield but there are two exceptions to that.

Exception 1: UntilExecuting unlocks and uses callback before yielding.
Exception 2: UntilExpired expires eventually, no after_unlock hook is called.

NOTE: It is also possible to write this code as a class method.

class UniqueJobWithFilterMethod
  include Sidekiq::Worker
  sidekiq_options lock: :while_executing

  def self.after_unlock
    # block has yielded and lock is released
  end

  def after_unlock
    # block has yielded and lock is released
  end
  ...
end

Communication

There is a chat for praise or scorn. It is a good place for lengthy discussions or brilliant suggestions, or simply to nudge me if I forget about anything.

Contributing

Fork it
Create your feature branch (git checkout -b my-new-feature)
Commit your changes (git commit -am 'Add some feature')
Push to the branch (git push origin my-new-feature)
Create new Pull Request

Contributors

You can find a list of contributors over on Contributors

sidekiq-unique-jobs's Issues

Can you update the change log?

I would like to see changes since 3.0.0.
Thanks!

payload_hash staying around

Unique jobs have been broken in testing since sidekiq/sidekiq@07a2781.

The payload_hash stays in Redis across multiple runs within the same timeout.

I'm not sure if there's anything that can be done about this

https://github.com/form26/sidekiq-unique-jobs/blob/master/lib/sidekiq-unique-jobs/middleware/client/unique_jobs.rb#L22

What is the use case for the uniqueness window?

I'm puzzled about the uniqueness window. Can someone please illustrate an example of how it's useful?

For my app I want each job performed once, which is why I sought out this plugin. I don't see how waiting any amount of time would make it ok to allow this "same" job to get enqueued.

Also, it would be nice to have a little more background in the Readme on how uniqueness is implemented. Is there a Redis query each time (performance penalty), or are all the existing job signatures stored in memory (memory constraint)?
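One way to picture the window is as a lock key with an expiry. The sketch below is a plain Ruby, in-memory stand-in and not the gem's implementation: the Introduction says the real keys live in Redis, and the class name `WindowedLocks`, the key format and the 30-minute window are assumptions made for illustration.

```ruby
# Hypothetical in-memory stand-in for the lock keys the gem keeps in Redis.
# Each lock key maps to the time (in seconds) at which it expires.
class WindowedLocks
  def initialize(window_seconds)
    @window = window_seconds
    @expires_at = {}
  end

  # Returns true when the lock was taken, false when an unexpired lock exists.
  def acquire(key, now)
    expiry = @expires_at[key]
    return false if expiry && now < expiry

    @expires_at[key] = now + @window
    true
  end
end

locks = WindowedLocks.new(1800) # assumed 30-minute window

first  = locks.acquire("ImportWorker:[42]", 0)    # accepted
second = locks.acquire("ImportWorker:[42]", 60)   # rejected: inside the window
third  = locks.acquire("ImportWorker:[42]", 1801) # accepted: the window has passed

puts [first, second, third].inspect # prints [true, false, true]
```

In this model the window bounds how long a stale lock can block new jobs. It does not mean a duplicate becomes acceptable after a while.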

What is the exact behavior?

Great gem, thank you for making it.

I have been looking through the code and reading up and can't figure out precisely what the behavior of this gem is regarding what it looks at when determining whether or not to keep or throw away a job. Which already existing jobs are considered? There are processed, failed, busy, enqueued, retries, scheduled and dead jobs. Which of those does this gem care about when deciding whether or not to keep the second job?

Also, if a job is part-way through/currently being processed, what is the behavior? In my case I want the second job to be kept if the first job is already started as the second job may have new information that makes the first job out of date.

Thanks again.

The deprecation message is unclear and unnecessary

I get this every time I start my app:
This method has been deprecated. See readme for information
which is from
https://github.com/mhenrixon/sidekiq-unique-jobs/blob/master/lib/sidekiq_unique_jobs/config.rb

unclear: If I don't search through the source of all the gems I am using, I won't understand what it means
unnecessary: I use .config already, why am I still getting it?

Duplicated Jobs With Nested Sidekiq Workers

I have an issue when using nested workers, where uniqueness is not followed, which leads to duplicate jobs.

When executing RunAJobWorker multiple times it leads to duplicate LongRunningWorker instances.

I am using Sidekiq 2.6.5 and Unique-jobs 2.3.2

Here is the code that will reproduce the problem:

class RunAJobWorker
    include Sidekiq::Worker
    sidekiq_options unique: true

    def perform
        # Do some db lookup to find params for
        # the job
        id = 10
        LongRunningWorker.perform_async(id)
    end
end

class LongRunningWorker
    include Sidekiq::Worker
    sidekiq_options unique: true

    def perform(id)
        # Find model
        # model = Model.find(id)
        # model long running task
        sleep(10)
    end
end

Middleware not added to chain?

I added the gem to my Gemfile and I couldn't get UniqueJobs to work. I pry'ed my code and it looks like the middleware is not automatically inserted in the chain.

I tried adding it in an initializer as well as manually in a Pry session but to no avail. Oddly enough, the Client middleware is added. Any idea what might be going on? Here's some code:

[1] pry(#<NotificationPushWorker>)> Sidekiq.server_middleware
=> #<Sidekiq::Middleware::Chain:0x007fd675435e10
 @entries=
  [#<Sidekiq::Middleware::Entry:0x007fd675435d70
    @args=[],
    @klass=Sidekiq::Middleware::Server::Logging>,
   #<Sidekiq::Middleware::Entry:0x007fd675435cd0
    @args=[],
    @klass=Sidekiq::Middleware::Server::RetryJobs>,
   #<Sidekiq::Middleware::Entry:0x007fd675435c30
    @args=[],
    @klass=Sidekiq::Middleware::Server::ActiveRecord>,
   #<Sidekiq::Middleware::Entry:0x007fd675435b90
    @args=[],
    @klass=Sidekiq::Middleware::Server::Timeout>]>
[2] pry(#<NotificationPushWorker>)> Sidekiq.configure_server do |config|
[2] pry(#<NotificationPushWorker>)*   config.server_middleware do |chain|  
[2] pry(#<NotificationPushWorker>)*     require 'sidekiq-unique-jobs/middleware/server/unique_jobs'    
[2] pry(#<NotificationPushWorker>)*     chain.add SidekiqUniqueJobs::Middleware::Server::UniqueJobs    
[2] pry(#<NotificationPushWorker>)*   end    
[2] pry(#<NotificationPushWorker>)* end  
=> nil
[3] pry(#<NotificationPushWorker>)> 
[4] pry(#<NotificationPushWorker>)> Sidekiq.server_middleware
=> #<Sidekiq::Middleware::Chain:0x007fd675435e10
 @entries=
  [#<Sidekiq::Middleware::Entry:0x007fd675435d70
    @args=[],
    @klass=Sidekiq::Middleware::Server::Logging>,
   #<Sidekiq::Middleware::Entry:0x007fd675435cd0
    @args=[],
    @klass=Sidekiq::Middleware::Server::RetryJobs>,
   #<Sidekiq::Middleware::Entry:0x007fd675435c30
    @args=[],
    @klass=Sidekiq::Middleware::Server::ActiveRecord>,
   #<Sidekiq::Middleware::Entry:0x007fd675435b90
    @args=[],
    @klass=Sidekiq::Middleware::Server::Timeout>]>
[5] pry(#<NotificationPushWorker>)> Sidekiq.client_middleware
=> #<Sidekiq::Middleware::Chain:0x007fd698cc2628
 @entries=
  [#<Sidekiq::Middleware::Entry:0x007fd698cb6328
    @args=[],
    @klass=SidekiqUniqueJobs::Middleware::Client::UniqueJobs>]>

Thanks!

Unique jobs sets Sidekiq testing to inline! mode

I am having an issue where, as soon as I enable unique jobs for a worker, Sidekiq starts to operate in inline mode, thus requiring a Redis connection. I would like to continue using it in fake mode. Is this expected behavior? Here are the versions I use:

Using sidekiq 3.3.4
Using sidekiq-unique-jobs 3.0.13

Missing info from README

I just found this project while googling how to make sure certain Sidekiq jobs are not executed multiple times. sidekiq-unique-jobs seems to do exactly that... awesome!

I think there is some info missing in the README though, specifically:

Are worker arguments taken into account? So if I have a HardWorker and I call HardWorker.perform_async('bob', 5) multiple times, that job should obviously only be queued once. But what if I call HardWorker.perform_async('bob', 5) and HardWorker.perform_async('jane', 10)? Are both those jobs queued? I suppose so but I'm not 100% sure.
Why is the expiration parameter needed? Does it mean that by default the same job cannot be enqueued again up to 30min after it was removed from the queue?

I think both these points (and possibly more) should be explained in the README.
I'm happy to prepare a pull request for it, if you answer my questions in here.
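On the first question, the Introduction states that the lock keys are based off of queue, class and args. A minimal sketch of that idea follows, assuming a SHA-256 digest over a JSON encoding. The helper name `unique_digest` and the hashing scheme are illustrative assumptions and not the gem's actual code.

```ruby
require "digest"
require "json"

# Hypothetical helper: builds a lock digest from class, queue and args.
def unique_digest(worker_class, queue, args)
  Digest::SHA256.hexdigest(JSON.generate([worker_class, queue, args]))
end

bob_a = unique_digest("HardWorker", "default", ["bob", 5])
bob_b = unique_digest("HardWorker", "default", ["bob", 5])
jane  = unique_digest("HardWorker", "default", ["jane", 10])

puts bob_a == bob_b # true: same arguments give the same digest, so one lock
puts bob_a == jane  # false: different arguments give a different digest
```

Under this assumption, the 'bob' and 'jane' jobs would take separate locks and both would be queued.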

Thanks for your work on this!

Example Test using Sidekiq::Testing.inline

I have a few service objects I'd like to write integration tests for, so I'm using Sidekiq::Testing.inline! so they run synchronously. This doesn't appear to work with sidekiq-unique-jobs. Is there an example or workaround on how to get the job to execute immediately?

Crash handling

Currently Unique Jobs does not handle crashes. So if a sidekiq worker crashes whatever it was doing is lost (except for Sidekiq Pro). When using a worker with unique those unique job keys persist in Redis with no way of clearing them out (short of deleting them manually).

is mock_redis really a runtime dependency?

Seems like this would be more of a dev dependency. Happy to submit a PR making that change but I wasn't sure how it's used

Usage of sidekiq-unique-jobs with activejob

To use the uniqueness with active job:

Sidekiq.default_worker_options = {
   'unique' => true,
   'unique_args' => proc do |args|
     [args.first.except('job_id')]
   end
}
SidekiqUniqueJobs.config.unique_args_enabled = true

Maybe you can update the readme for this?
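The reason for dropping 'job_id' is that it differs on every push, so leaving it in would make every job look unique. The sketch below shows the same filter in plain Ruby without ActiveSupport's `except`. The payload shape and the constant name `UNIQUE_ARGS` are assumptions made for illustration only.

```ruby
# Same idea as the proc above, written without ActiveSupport:
# drop "job_id" from the first argument so it does not affect uniqueness.
UNIQUE_ARGS = proc do |args|
  [args.first.reject { |key, _value| key == "job_id" }]
end

# Hypothetical payloads: the same logical job pushed twice.
first_push  = [{ "job_class" => "ReportJob", "job_id" => "aaa", "arguments" => [7] }]
second_push = [{ "job_class" => "ReportJob", "job_id" => "bbb", "arguments" => [7] }]

first_args  = UNIQUE_ARGS.call(first_push)
second_args = UNIQUE_ARGS.call(second_push)

puts first_args == second_args # true: only the job_id differed
```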

Sidekiq tests failed when sidekiq-unique-jobs is used

My sidekiq tests with Sidekiq::Testing.fake! were passing until I added sidekiq-unique-jobs and enabled it in my worker. All other tests pass except lines 19 and 20. Here is my test case:

class MyWorker
  include Sidekiq::Worker
  sidekiq_options :queue => :working, :retry => 1, :backtrace => 10
  sidekiq_options :unique => true

  sidekiq_retries_exhausted do |msg|
    Sidekiq.logger.warn "Failed #{msg['class']} with #{msg['args']}: #{msg['error_message']}"
  end

  def perform(param)
    puts param
  end
end

require "spec_helper"

describe MyWorker do

    context "as a resque worker" do
        it "responds to #perform" do
            MyWorker.new.should respond_to(:perform)
        end
    end

    it { should be_processed_in :working }
    it { should be_retryable 1 }
    it { should be_unique }

    it "enqueue a job" do
        param = 'work'
        expect(MyWorker).to have_enqueued_jobs(0)
        MyWorker.perform_async(param)
        expect(MyWorker).to have_enqueued_jobs(1)
        expect(MyWorker).to have_enqueued_job(param)
    end

    it "performs a job" do
        MyWorker.new.perform('chocolate').should be_true
    end
end

Failures:

  1) MyWorker enqueue a job
     Failure/Error: expect(MyWorker).to have_enqueued_jobs(1)
       expected MyWorker to have 1 enqueued job but got 0
     # ./spec/workers/my_worker_spec.rb:19:in `block (2 levels) in <top (required)>'

sidekiq-unique-jobs prevents creation of non-unique jobs even with sidekiq inline test mode

Hi, sidekiq-unique-jobs doesn't allow creating non-unique jobs even if sidekiq inline test mode is turned on.
sidekiq has an inline test mode for testing jobs. It simply invokes the perform method instead of perform_async. I ran into a problem in my tests: sidekiq-unique-jobs doesn't allow me to create a new non-unique job when the first job has already been performed.

LoadError: cannot load such file -- mock_redis

I am receiving the following error when running my tests (MiniTest via rake test command):
LoadError: cannot load such file -- mock_redis

Some excerpts of how I use sidekiq-unique-jobs:

Gemfile:

# Use sidekiq for background tasks
gem 'sidekiq'

# Use sidekiq enhancement for unique jobs
gem 'sidekiq-unique-jobs'

Worker:

class MyWorker
  include Sidekiq::Worker
  sidekiq_options queue: :my_queue, unique: true, 
                  unique_job_expiration: 24 * 60 * 60

  def perform(user_id)
    # some code
  end
end

test_helper:

require 'sidekiq/testing'
Sidekiq::Testing.fake!

I'm not sure if I'm doing something wrong here or if there is an issue with the combination of Rails 4.1.4, most current sidekiq and sidekiq-unique-jobs gems. Can anybody help me or fix this issue?

I think it is an issue because if I remove this gem from the project then all tests succeed.
Thanks for your help! :)

If a job is deleted from the enqueued list, it's still unique and new jobs can't be added.

For example I have a job:

class TestJob
  include Sidekiq::Worker

  sidekiq_options queue: 'high',
                  unique: true,
                  unique_args: ->(args) { [ args.first ] }

  def perform(arg)
    # do something
  end
end

And calling it:

[1] pry(main)> TestJob.perform_async 1
=> "de5ff32394dbcdab2128d5ee"

Then if I go to sidekiq web interface and delete it from the enqueued list before it processed, I can't add a new one with the same argument.

[2] pry(main)> TestJob.perform_async 1
=> nil

I'm pretty sure this is not an expected behaviour. Am I right?

Throttling jobs

I have received a lot of questions about how to throttle jobs using sidekiq-unique-jobs. After searching for throttling sidekiq jobs I ended up with sidekiq-throttler, which seems like a pretty straightforward way of doing what most people want with the uniqueness expiration.

Is throttling jobs something that should exist in the sidekiq-unique-jobs gem or should we remove all expiration completely and tell people to use sidekiq-throttler instead?

The reason for clearing jobs in the first place was that some jobs never got cleared, so no new jobs of that kind could be scheduled. However, in a recent release of sidekiq, @mperham added clearing of stale jobs after 60 minutes, meaning we could only ever keep jobs for 60 minutes or we would have to turn to another solution. I am still undecided on how to proceed here, so any suggestions you have are greatly appreciated.

undefined method `get_sidekiq_options' for "MyScheduledWorker":String

Our Sidekiq redis instance is shared among multiple services.
So workers are available in one of the repos but not in the others.

When a scheduled or retried job is consumed by the sidekiq poller running on another service, we need to safely re-enqueue the task without an exception.

Currently Sidekiq does re-enqueue jobs without the worker class without issue, but apparently the unique jobs middleware is not accounting for this case.

I have worked around the issue with a monkey patch, but an official fix would be much appreciated. Thanks for the hard work!

Jobs not being executed anymore??

Hi, upgrading the gem from 3.0.2 to 3.0.9 seems to create an issue when executing some kinds of jobs.

I don't know how to build a simple test case to replicate it. It happens every time we are executing a particular job, but this job is not different from the others, and it doesn't even have the unique attribute so it should be totally ignored.
I mean, if the issue is present with this job, then it should be present with all the jobs...

The job contains an options object with { data => { match_id => "XXXXYYYYY"} }. I think something about the hashing didn't work properly, because if one of them was enqueued when another one was already waiting to be executed (both using "perform_in"), then the second was not executed. It was totally forgotten, as if it had never been added to the queue.

I downgraded to 3.0.2 and the issue is solved.

Jobs are unlocked if they fail and are retried

I just discovered that jobs are unlocked if they fail. This happens regardless of whether there are retries left or the job dies. I'm wondering if this is the intended behavior. As I understand it (and need it), a job that will be retried should still be unique.

The code responsible for this is the following: (in /lib/sidekiq_unique_jobs/middleware/server/unique_jobs.rb)

def call(worker, item, _queue, redis_pool = nil)
  ...
  yield
  ensure
    if after_yield? || !defined? unlocked || unlocked != 1
      unlock(lock_key)
    end
end

So are there any reasons for this behavior or do I miss anything?

Update

So after some research I am starting to understand why unlocking works the way it does. I was not aware that jobs in the schedule/retry queue are pushed to the worker queues via a client push (involving Sidekiq's client middleware). If the job is still locked in that situation, it will never be pushed into its worker queue.

To circumvent this issue it might be possible to save the jid of the job that acquired the lock instead of the [1,2]. So the client middleware can check for the jid and let the job reenqueue if they match.

What do you think of this idea?
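The jid idea can be sketched in a few lines of plain Ruby. This is an in-memory illustration of the proposal and not code from the gem. The class name `JidLocks` and the digest strings are hypothetical.

```ruby
# Hypothetical in-memory lock table: digest => jid of the job holding the lock.
class JidLocks
  def initialize
    @owners = {}
  end

  # A push is allowed when the lock is free or already held by the same jid.
  def lock(digest, jid)
    owner = (@owners[digest] ||= jid)
    owner == jid
  end

  # Only the job that owns the lock may release it.
  def unlock(digest, jid)
    @owners.delete(digest) if @owners[digest] == jid
  end
end

locks = JidLocks.new

first_push = locks.lock("digest-1", "jid-a") # first push takes the lock
retry_push = locks.lock("digest-1", "jid-a") # same job pushed again from the retry set
other_push = locks.lock("digest-1", "jid-b") # a different job with the same digest

puts [first_push, retry_push, other_push].inspect # prints [true, true, false]
```

With the jid stored as the lock value, a retried or scheduled job can pass through the client middleware a second time without the lock having to be released on failure.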

mock_redis and the mess

We are running into an issue where the tests pass on a local machine but fail on CircleCI.
Upon close inspection, I can see that between tests the mock_redis instance that SidekiqUniqueJobs points to needs to be cleared so that the preconditions of the spec can be guaranteed.

However, I think the presence of mock_redis makes the gem complicated and isn't worth the hassle. In my opinion, sidekiq-unique-jobs should just use whatever Redis sidekiq is using, even in test mode. This would lead to more predictable test behavior across environments.

Sidekiq::Testing inline detection assumes you're always using inline testing

The check in lib/sidekiq_unique_jobs/config.rb only looks at whether you're using sidekiq testing, not which method you're using. So, even if you're using sidekiq's fake testing method, sidekiq-unique-jobs still blows up with the mock_redis gem. Is this gem necessary in all scenarios?

Add unique job key to the message json

I am attempting to rewrite some slow workers in Go using https://github.com/jrallison/go-workers and need to be able to remove the unique job key from redis when the worker completes, but I do not have access to the key in my Go workers. It would be great if sidekiq-unique-jobs would add the key to the message so that I can accomplish this.

undefined `configuration` when using .configure

When doing

SidekiqUniqueJobs.configure do |c|
…
end

It fails with:

/Users/mrfoto/.gem/ruby/2.0.0/gems/sidekiq-unique-jobs-3.0.9/lib/sidekiq-unique-jobs.rb:26:in `configure': undefined local variable or method `configuration' for SidekiqUniqueJobs:Module (NameError)

Support for sidekiq 3?

@mperham has recently released sidekiq 3.0, but sidekiq-unique-jobs is versioned at ~> 2.6.

What's the roadmap like to support 3.0?

Server middleware removes payload hash key before expiration

Hi,

I'm using this gem to throttle duplicate jobs queued within a 24 hour window.

Unfortunately, the server-side middleware is not letting me achieve this. Once the job is processed by the server middleware, the payload_hash key is removed, whereas I expect it to just expire after my TTL. To get around this I've put in a hack to set the "unique_unlock_order" option to -1, so that the key is never deleted.

I'm a bit confused because if the purpose of the gem is to ensure unique jobs, why would the key ever be removed rather than just left to expire on its own?

I'm also a bit unclear of the use case for the server side piece entirely, so maybe you could provide an example?

unique_args_enabled has been deprecated, nothing in readme

I am getting an error that says unique_args_enabled has been deprecated. See readme for information, but I don't see anything in the readme that mentions it. Is this a bug?

Lock remains when running with Sidekiq::Testing.inline!

When running within Sidekiq::Testing.inline! mode, my jobs seemed to be forever locked. I believe the reason is that when running within inline mode, the server middleware is not run.

I am thinking there should be a way to disable uniqueness when running within inline mode.

thx!

Incorrect README re: uniqueness time?

"For jobs scheduled in the future it is possible to set for how long the job should be unique. The job will be unique for the number of seconds configured (default 30 minutes) or until the job has been completed. Thus, the job will be unique for the shorter of the two."

The SETEX doesn't care about the job finishing, and if the args stay the same, the same hash will be used to look up the lock. So if you set it to be unique for 30 minutes but it finishes in one, how would it get enqueued again?

Test suite unclear on what happens when duplicate job is attempted

The test suite says that a duplicate job is not added, which is the desired behavior.

But what else happens? Is an error raised? Is false returned? How does one know if success occurred or not? The nearest I can tell is that perform just won't return a job_id:

TestJob.perform_async :arg => 1
# => "1234..."

TestJob.perform_async :arg => 1    # a duplicate!
# => nil

The test suite doesn't make it clear how to check this, since it works by looking at the queue size, which definitely isn't the correct strategy for production, since jobs are being added and removed all the time.
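Based on the nil return observed above, a caller could branch on the value of perform_async instead of inspecting the queue size. The sketch below uses a hypothetical stand-in class, `FakeUniqueJob`, so that it runs without Sidekiq or Redis. The helper `enqueue_once` is also an assumption and not part of the gem.

```ruby
require "securerandom"
require "set"

# Hypothetical stand-in for a unique worker: perform_async returns a job id,
# or nil when an identical job is already locked, as observed above.
class FakeUniqueJob
  @locked = Set.new

  def self.perform_async(*args)
    return nil unless @locked.add?(args)

    SecureRandom.hex(12)
  end
end

# Reports whether the push was accepted, based only on the return value.
def enqueue_once(worker, *args)
  jid = worker.perform_async(*args)
  jid.nil? ? :duplicate : :enqueued
end

first_result  = enqueue_once(FakeUniqueJob, 1)
second_result = enqueue_once(FakeUniqueJob, 1)

puts first_result  # prints enqueued
puts second_result # prints duplicate
```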

Documentation Not Clear

I'm trying to understand what this gem does, but the documentation isn't very clear on a basic level. So we're making jobs unique based on worker and arguments. What does that mean? Is only one unique job allowed in the queue at a time? (Meaning that if I put two different jobs in the queue that each match the unique_args constraint, is the second job never executed?) Or are two unique jobs allowed in the queue, but the latter always executed after the former has concluded?

Optimize Redis usage

Right now the uniqueness check requires three network round trips to Redis for each job pushed:

watch
get
(multi / setex) || unwatch

Redis 2.6.12+ has new flags for set which potentially allow this operation with a single command:

conn.set(payload_hash, 1 || 2, nx: true, ex: expires_at)

You don't document a minimum Redis version but requiring Redis 2.6+ is your choice.
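The single-command variant can be sketched as follows. The `FakeRedis` class below is an in-memory stand-in written for this example so that it runs without a server. It mimics only the `nx` behaviour of `set` and records the expiry without enforcing it. The helper name `acquire_lock` is an assumption.

```ruby
# Minimal in-memory stand-in for a Redis connection. It implements only the
# part of SET used here: nx (set only if the key is absent). The ex value is
# stored but not enforced.
class FakeRedis
  def initialize
    @data = {}
  end

  def set(key, value, nx: false, ex: nil)
    return false if nx && @data.key?(key)

    @data[key] = { value: value, ttl: ex }
    true
  end
end

# One round trip: the key is created only when no lock exists yet.
def acquire_lock(conn, payload_hash, ttl_seconds)
  conn.set(payload_hash, 1, nx: true, ex: ttl_seconds)
end

conn = FakeRedis.new
first_attempt  = acquire_lock(conn, "payload-hash-1", 1800)
second_attempt = acquire_lock(conn, "payload-hash-1", 1800)

puts first_attempt  # prints true
puts second_attempt # prints false
```

Because the check and the write happen in one command, there is no gap between reading the key and setting it in which another client could take the same lock.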

Redis not mocked in testing

When I add sidekiq-unique-jobs to my Rails app I noticed that Redis calls aren't stubbed during testing.

I have the following in my spec_helper.rb

require 'sidekiq/testing'

ConnectionPool used incorrectly - causes deadlocks

Been trying to hunt down some mysteriously stalling dynos on our heroku app, and have traced back the source of our woes:

https://github.com/mhenrixon/sidekiq-unique-jobs/blob/master/lib/sidekiq_unique_jobs/connectors/sidekiq_redis.rb#L5

The redis connector classes should not be returning the connections for use outside the #with or #redis blocks - that block is used to guarantee exclusive access to the connection and prevent other threads from touching the connection while it's working. This would blow up a lot more massively, but the redis connections themselves are intended to be thread safe, so the bugs end up being a lot more subtle:

The deadlocks I've been hunting down:
https://github.com/mhenrixon/sidekiq-unique-jobs/blob/master/lib/sidekiq_unique_jobs/middleware/client/strategies/unique.rb#L49-L51 the connection the multi is started on will not necessarily be the same one setex is called on. To be thread safe, redis throws a separate mutex around multi blocks (https://github.com/redis/redis-rb/blob/master/lib/redis.rb#L2147) and every other command as well - so it's possible you try to call setex on a connection that's currently locked for a multi, and then inside that multi locks on the connection waiting for the setex to unlock.
Potential race conditions allowing jobs to be added multiple times - watch needs to be called on the same connection you call multi on, but since #conn is potentially a different connection from the pool every time, there's no guarantee this happens.

There might be other issues stemming from this as well. I was able to reproduce the error case we're seeing with the following script: https://gist.github.com/adstage-david/d1057fb6e4b1a676cce4
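The pattern being asked for is to keep every command of one logical operation inside a single checkout block. The sketch below uses a tiny hypothetical pool (`TinyPool`) and a connection that only records commands (`FakeConn`). Neither is the connection_pool gem or redis-rb, and the command names are just labels.

```ruby
# Tiny stand-in for a connection pool (hypothetical, for illustration only).
class TinyPool
  def initialize(connections)
    @available = Queue.new
    connections.each { |conn| @available << conn }
  end

  # Checks a connection out for the duration of the block, then returns it.
  def with
    conn = @available.pop
    yield conn
  ensure
    @available << conn if conn
  end
end

# Hypothetical connection that only records the commands it receives.
FakeConn = Struct.new(:name, :commands)

pool = TinyPool.new([FakeConn.new("a", []), FakeConn.new("b", [])])

# watch, get and setex all go through the one connection held by this block.
used = pool.with do |conn|
  %i[watch get setex].each { |command| conn.commands << command }
  conn
end

puts used.commands.inspect # prints [:watch, :get, :setex]
```

Returning the connection from a helper and using it after the block has ended would lose this guarantee, which is the misuse described above.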

Change log level to info rather than warn

Attempting to create a job that isn't unique shouldn't be such an important event that it shows up as a warning in the logs. I think the ability to log the attempts is great, but I think the value isn't there unless you're engaged in a level of diagnosis in which you're checking logs at the info or debug levels. Could we switch the logging level to be info?

Happy to do the PR to make the change if there's agreement.

Retries duplicates unique jobs

Hi,

The problem is: when a job fails for some reason, Sidekiq requeues it, creating many duplicates despite the fact that it's unique.

Actually the problem is described here but it's not a Sidekiq issue anymore.

Is there any way to avoid duplicating?

UPD:

I'm running Puma as a web server

Scheduled workers

Was this intended to work with workers that need to be scheduled?

Not all sidekiq:sidekiq_unique keys are removed from Redis

I am seeing weird behavior in production where sidekiq:sidekiq_unique are not always removed after completing a job.

I am running an hourly import job, that queues over 1000 jobs to fetch and process data from an API. To prevent multiple workers processing the same job, I am using sidekiq-unique-jobs with a unique_job_expiration of 1.day.

When I run this on my development machine (OS X), everything is fine. When running in production (Linux), the uniqueness keys are not always removed. This causes import jobs not to run for a whole day.

Normally (and this is what I see on my development machine) the number of sidekiq:sidekiq_unique keys is equal to the number of currently running jobs plus the queue size. When I run the same import in production, I see over 120 sidekiq:sidekiq_unique keys not being unlocked.

My first thought was that this is caused by some worker jobs, queueing other worker jobs. But I could also reproduce this in production by performing the same worker multiple times.

At this moment I don't have any clue what the cause of this is. But maybe someone has the same issue or is able to provide debugging instructions.

Runtime uniqueness when using :before_yield as unlock order

I have a specific use case for sidekiq uniqueness in a project I work on. We need to be able to enqueue more jobs while a job is running, but we never want two unique jobs to run concurrently. As far as I understand, there is no way to do this: the :after_yield unlock order would not allow enqueueing more jobs while a worker is running, and :before_yield would allow simultaneous workers. I made a fork that adds a locking mechanism for each running job to fix our problem. You can find it here:
https://github.com/tsubery/sidekiq-unique-jobs/tree/runtime_uniqueness
I also added some documentation describing it.
Do you think that is something that might be interesting to other users of the gem?
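The behaviour described (jobs may queue up, but only one runs at a time per unique key) can be illustrated with one Mutex per digest. This is an in-process sketch only and not the code in the fork: a real implementation would need a lock shared across processes, for example in Redis. The class name `RuntimeLocks` is hypothetical.

```ruby
# Hypothetical runtime lock: one Mutex per unique digest, in a single process.
class RuntimeLocks
  def initialize
    @guard = Mutex.new
    @locks = Hash.new { |hash, key| hash[key] = Mutex.new }
  end

  # Runs the block while holding the lock for this digest.
  def synchronize(digest, &block)
    mutex = @guard.synchronize { @locks[digest] }
    mutex.synchronize(&block)
  end
end

locks = RuntimeLocks.new
counter = Mutex.new
running = 0
max_running = 0

# Three "jobs" with the same digest are started at once.
threads = 3.times.map do
  Thread.new do
    locks.synchronize("digest-1") do
      counter.synchronize do
        running += 1
        max_running = [max_running, running].max
      end
      sleep 0.01 # simulated work
      counter.synchronize { running -= 1 }
    end
  end
end
threads.each(&:join)

puts max_running # prints 1: the jobs never overlapped
```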

Using with sidekiq delayed extensions

I'm currently using unique to guarantee I don't send multiple emails after jobs complete.

I end up having to wrap the email call in a worker like:

class UniqueMailer
  include Sidekiq::Worker

  sidekiq_options unique: true, unique_job_expiration: 30 * 60, unique_unlock_order: :never

  def perform(resource_id)
    Mailer.delay.mail(resource_id)
  end
end

It would be nice if something like the following worked:

Mailer.delay.mail(resource_id, unique: true, unique_job_expiration: 30 * 60, unique_unlock_order: :never)

or a new method

Mailer.delay.unique_mail(resource_id, unique: true, unique_job_expiration: 30 * 60, unique_unlock_order: :never)

Thoughts?

Unique key inconsistency between server and client

Hi,

I encountered a problem today while trying to use this gem. When using custom uniqueness parameters, I found that the key added to Redis to enforce uniqueness isn't the same as the key later deleted by the server.
In my case, this means that once my first job is pushed to the Sidekiq queue, no more jobs can be added even after the first one is processed, since the lock key is still present in Redis.

It might be because of another middleware misbehaving, but I believe sidekiq-unique-jobs should be able to avoid this kind of deadlock.
After looking a bit at the code, I saw that the name of the key used to enforce uniqueness is added to the job's payload. Is there a reason why this key isn't then used by the server to perform the unlock, instead of trying to recompute it?
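The suggestion above can be sketched like this: the client stores the digest in the payload, and the server unlocks with that stored value even if the arguments were altered in between. The key name "unique_digest", the `LOCKS` hash and the hashing scheme are assumptions for illustration and not the gem's actual payload format.

```ruby
require "digest"
require "json"

LOCKS = {} # hypothetical in-memory stand-in for Redis

def digest_for(item)
  Digest::SHA256.hexdigest(JSON.generate([item["class"], item["args"]]))
end

# Client side: take the lock and record the digest in the payload.
def client_push(item)
  digest = digest_for(item)
  return nil if LOCKS.key?(digest)

  LOCKS[digest] = item["jid"]
  item.merge("unique_digest" => digest)
end

# Server side: unlock with the stored digest instead of recomputing it.
def server_unlock(item)
  LOCKS.delete(item["unique_digest"])
end

job = client_push({ "class" => "SyncWorker", "args" => [1], "jid" => "jid-a" })

# Another middleware changes the args before the job is processed.
job = job.merge("args" => [1, { "trace" => "x" }])
recomputed_matches = digest_for(job) == job["unique_digest"]

server_unlock(job)

puts recomputed_matches # prints false: recomputing would target the wrong key
puts LOCKS.empty?       # prints true: the stored digest released the lock
```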

I'd be happy to provide a pull-request if you want.

Thanks a lot in advance !

clarification on unique_args

Does the args filter need to return an array?
Or can it return any unique string (or object that responds to .hash, etc)?

For example would this work?

class SomeJob
  include Sidekiq::Worker
  sidekiq_options queue: :critical, unique: true, unique_args: :args_filter

  def self.args_filter(*args)
    args.first
  end

  def perform(object_id, attempts=1)
    ...
  end
end

Second question: does args_filter need to include the name of the job ("SomeJob" in this case)? Or is this added for you?

The docs don't make it super clear because they show an example like this (but don't show the corresponding perform method signature, so it's unclear what these variables actually refer to):

  def self.unique_args(name, id, options)
    [ name, options[:type] ]
  end

Thanks!

The second job does not run, even if it has different arguments

The second job does not run, even if it has different arguments.... Why is this?

# WORKER
class FooWorker
  include Sidekiq::Worker

  sidekiq_options({
    # Mitigates race conditions. Should be set to true (enables uniqueness for async jobs)
    unique: :true,
    unique_job_expiration: 5.minutes.to_i, # Unique expiration (optional, default is 30 minutes)
    unique_args: :unique_args,

    retry: false,
    backtrace: true
  })

  def self.unique_args(foo, bar)
    [foo, bar]
  end

  def perform(*args)
    sleep(30.seconds)
  end
end


# In Rails console
FooWorker.perform_async(4, 4) # => "defa813c6ff16a6b9dba6f6a"
FooWorker.perform_async(5, 5) # nil

Short jobs are not unique for the given time window

I ran into a bug at work where the same job was being run multiple times (sending out emails to users). So I did some testing and came up with a minimal reproduction of the problem here: https://github.com/hqmq/sidekiq-not-unique-jobs

In order to run it just bundle install and then in one terminal start sidekiq like normal:

$ bundle exec sidekiq -r config/bootstrap.rb

Then in a second terminal run:

$ ruby test_uniqueness.rb
2014-02-26T21:50:12Z 67772 TID-znnu9k INFO: Sidekiq client with redis options {:url=>"redis://localhost:6379", :namespace=>"sk"}
Expecting the counter to = 1
counter = 9

Will a second job be lost if the job is already queued, or is already scheduled?

A unique job is already queued or scheduled, and then a new job comes in. Will it be lost?

class QueueWorker
  include Sidekiq::Worker
  sidekiq_options queue: 'test', unique: true, unique_args: :unique_args

  def self.unique_args user_id, client_id, options
    [user_id, client_id]
  end

  def perform(*args)
    sleep 10
  end
end

QueueWorker.perform_async(1,1, {})  # No.1
QueueWorker.perform_async(2,1, {})  # No.2
QueueWorker.perform_async(1,1, {})  # No.3

After reading the source code, I found that if the No.1 job was previously scheduled and is now being queued, the No.3 job will be lost!

Scheduled Unique Jobs Not Being Enqueued

Since updating Sidekiq to 2.12.1, unique jobs that I try to schedule 5 minutes in the future with:

JobClass.perform_in(5.minutes, args)

do wind up in the schedule, but then when five minutes rolls around they go away and are not enqueued or performed. It looks like there was a change to the way Sidekiq uses middleware for scheduled jobs (it now calls client middleware when scheduled jobs or retries are put on the queue).

Is it possible that the sidekiq-unique-jobs middleware is setting the payload hash key when the job is scheduled, and since the new Sidekiq version also calls client middleware when the scheduled job is then enqueued, it's not enqueued because the hash is already there?

Latest release is breaking

Rolling back to 3.0.13 fixed the issue. Jobs don't get created, I think.

What does uniqueness mean in case of this gem?

I have a simple use-case:

web server receives an http request and schedules a sidekiq job
sidekiq job sends an http request upon completion (e.g. a completion callback)

I want my sidekiq jobs to be unique for (for example) 2 hours. That means that after pushing a job with arguments {'a'=>1,'b'=>2}, I do not want any other job with exactly the same arguments to appear in the queue and/or among the "working" jobs, regardless of the first job's state (successfully finished or failed). What actually happens is that after the first job has finished I can add another job with the same arguments to the queue, which is definitely not the expected behavior (or am I missing something?).

So is this a bug or a feature?

Thanks!

mock redis dependency

Hi,
after this commit, 9fdc855, related to #46, when I run my app specs I receive an error because mock_redis cannot be loaded.
I think it should be a gem dependency, as it was before. Otherwise people using this gem in their tests will have to require mock_redis explicitly in the Gemfile, and that is something I do not like to do.

Another option would be to let people avoid mock_redis when using the gem in test mode, through some kind of configuration.
What are your thoughts?