
Active Elastic Job


You have your Rails application deployed on the Amazon Elastic Beanstalk platform, and now it needs to offload work, such as sending emails, into asynchronous background jobs. Or you want to perform jobs periodically, similar to cron jobs. Then Active Elastic Job is the right gem. It provides an adapter for Rails' Active Job framework that lets your application queue jobs as messages in an Amazon SQS queue. Elastic Beanstalk provides worker environments that automatically pull messages from the queue and transform them into HTTP requests. This gem knows how to handle these requests: it ships with a Rack middleware that intercepts them and transforms them back into jobs, which are subsequently executed.

Architecture Diagram

Why use this gem?

  • It is easy to setup.
  • It makes your application ready for worker environments that are highly integrated in the Elastic Beanstalk landscape.
  • It is based on Amazon SQS, a fast, fully managed, scalable, and reliable queue service. You do not need to operate and maintain your own messaging cluster.
  • It is easy to deploy. You simply push your application code to a worker environment, the same way that you push your application code to your web environment.
  • It scales. The worker environments come with auto-scale capability. Additional worker instances will spawn automatically and process jobs from the queue if the load increases above a preconfigured threshold.

Usage

  1. Add this line to your application's Gemfile:

     gem 'active_elastic_job'
    
  2. Create an SQS queue:

  • Log into your Amazon Web Service Console and select SQS from the services menu.

  • Create a new queue. Choose any name, but do not forget to use the same name in your Active Job class definition.

    class YourJob < ActiveJob::Base
      queue_as :name_of_your_queue
    end

    Also use that same name in your Action Mailer configuration (if you send emails in background jobs):

    # config/application.rb
    module YourApp
      class Application < Rails::Application
        config.action_mailer.deliver_later_queue_name = :name_of_your_queue
      end
    end
  • Choose a visibility timeout that exceeds the maximum amount of time a single job will take.

  3. Give your EC2 instances permission to send messages to SQS queues:
  • Stay logged in and select the IAM service from the services menu.
  • Select the Roles submenu.
  • Find the role that you selected as the instance profile when creating the Elastic Beanstalk web environment.
  • Attach the AmazonSQSFullAccess policy to this role.
  • Make yourself familiar with AWS Service Roles, Instance Profiles, and User Policies.
  4. Tell the gem the region of the SQS queue that you created in Step 2:
  • Select the web environment that is currently hosting your application and open the Software Configuration settings.
  • Add AWS_REGION and set it to the region of the SQS queue, created in Step 2.
  5. Create a worker environment:
  • Stay logged in and select the Elastic Beanstalk option from the services menu.
  • Select your application, click the Actions button and select Launch New Environment.
  • Click the create worker button and select the same platform that you chose for your web environment.
  • In the Worker Details form, select the queue that you created in Step 2 as the worker queue, and leave the MIME type as application/json. The visibility timeout setting should exceed the maximum time that you expect a single background job to take. The HTTP path setting can be left as it is (it will be ignored).
  6. Configure the worker environment for processing jobs:
  • Select the worker environment that you have just created and open the Software Configuration settings.
  • Add PROCESS_ACTIVE_ELASTIC_JOBS and set it to true.
  7. Configure Active Elastic Job as the queue adapter:

    # config/application.rb
    module YourApp
      class Application < Rails::Application
        config.active_job.queue_adapter = :active_elastic_job
      end
    end
  8. Verify that both environments (web and worker) have the same secret key base:

  • In the Software Configuration settings of the web environment, copy the value of the SECRET_KEY_BASE variable.
  • Open the Software Configuration settings of the worker environment and add the SECRET_KEY_BASE variable. Paste the value from the web environment, so that both environments have the same secret key base.
  9. Deploy the application to both environments (web and worker).

Set up periodic tasks (cron jobs)

Elastic Beanstalk worker environments support the execution of periodic tasks, similar to cron jobs. We recommend making yourself familiar with Elastic Beanstalk's official documentation first.

You don't need this gem to make use of Elastic Beanstalk's periodic tasks feature. However, this gem takes care of intercepting the POST requests from the SQS daemon (explained in the official documentation). If the gem detects a POST request from the daemon caused by a periodic task definition, it creates a corresponding Active Job instance and triggers its execution. To make use of the gem, just follow these conventions when writing your periodic task definitions in cron.yaml:

  • Set name to the class name of the (Active Job) job that should be performed.
  • Set url to /periodic_tasks.

This is an example of a cron.yaml file which sets up a periodic task that is executed at 11pm UTC every day. The url setting leads to requests which will be intercepted by the gem. It then looks at the name setting, passed as a request header value by the SQS daemon, and instantiates a PeriodicTaskJob job object. Subsequently it triggers its execution by calling the #perform_now method.

version: 1
cron:
 - name: "PeriodicTaskJob"
   url: "/periodic_tasks"
   schedule: "0 23 * * *"
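
The dispatch the gem performs for such a request can be pictured with a small sketch. This is illustrative only, not the gem's internal code; `PeriodicTaskJob` and `run_periodic_task` are placeholders:

```ruby
# Illustrative sketch: resolve the task name (passed as a request header
# by the SQS daemon) to a class and trigger its execution.
# PeriodicTaskJob is a plain placeholder, not part of the gem.
class PeriodicTaskJob
  def self.perform_now
    # ... the nightly work would go here ...
    :done
  end
end

def run_periodic_task(task_name)
  Object.const_get(task_name).perform_now
end

run_periodic_task('PeriodicTaskJob')  # => :done
```

In a real application, `PeriodicTaskJob` would be a regular `ActiveJob::Base` subclass, which already provides a class-level `perform_now`.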

FIFO Queues

FIFO (First-In-First-Out) queues are designed to enhance messaging between applications when the order of operations and events is critical, or where duplicates can't be tolerated. FIFO queues also provide exactly-once processing but have a limited number of transactions per second (TPS).

The message group id will be set to the job type, and the message deduplication id will be set to the job id.

Note: Periodic tasks don't work for worker environments that are configured with Amazon SQS FIFO queues.
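
The convention above can be sketched as follows. The helper is hypothetical, not the gem's API; it only illustrates which SQS message fields the gem populates:

```ruby
require 'json'
require 'securerandom'

# Hypothetical helper showing the FIFO convention described above:
# message group id = job type, deduplication id = job id.
def fifo_message_for(job_class:, job_id:, arguments:)
  {
    message_body: JSON.generate(job_class: job_class, job_id: job_id, arguments: arguments),
    message_group_id: job_class,         # jobs of the same type keep their order
    message_deduplication_id: job_id     # the same job is never enqueued twice
  }
end

msg = fifo_message_for(job_class: 'YourJob', job_id: SecureRandom.uuid, arguments: [42])
msg[:message_group_id]  # => "YourJob"
```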

Optional configuration

This gem is configurable in case your setup requires different settings than the defaults. The snippet below shows the various configurable settings and their defaults.

Rails.application.configure do
  config.active_elastic_job.process_jobs = ENV['PROCESS_ACTIVE_ELASTIC_JOBS'] == 'true'
  config.active_elastic_job.aws_credentials = lambda { Aws::InstanceProfileCredentials.new } # allows lambdas for lazy loading
  config.active_elastic_job.aws_region # no default
  config.active_elastic_job.secret_key_base = Rails.application.secrets[:secret_key_base]
  config.active_elastic_job.periodic_tasks_route = '/periodic_tasks'.freeze
end

If you don't want to provide AWS credentials through EC2 instance profiles but via environment variables instead, you can do so:

Rails.application.configure do
  config.active_elastic_job.aws_credentials = Aws::Credentials.new(ENV['AWS_ACCESS_KEY_ID'], ENV['AWS_SECRET_ACCESS_KEY'])
end

Suggested Elastic Beanstalk configuration

Extended Nginx read timeout

By default, Nginx has a read timeout of 60 seconds. If a job takes more than 60 seconds to complete, Nginx will close the connection, making AWS SQS think the job failed. The job, however, will continue running until it completes (or errors out), and SQS will re-queue it to be processed again, which typically is not desirable.

The most basic way to make this change is to simply add a file within nginx/conf.d containing:

fastcgi_read_timeout 1800; # 30 minutes
proxy_read_timeout 1800; # 30 minutes

However, one of the best parts about active-elastic-job is that you can use the same code base for your web environment and your worker environment. You probably don't want your web environment to have a read timeout longer than 60 seconds, so here's an Elastic Beanstalk configuration file that adds this only to your worker environments.

Amazon Linux 2

Create two files (for application and configuration deployment) with the same content:

.platform/hooks/predeploy/nginx_read_timeout.sh
.platform/confighooks/predeploy/nginx_read_timeout.sh

#!/usr/bin/env bash
set -xe

if [ "$PROCESS_ACTIVE_ELASTIC_JOBS" = "true" ]
then
  cat >/var/proxy/staging/nginx/conf.d/read_timeout.conf <<EOL
fastcgi_read_timeout 1800;
proxy_read_timeout 1800;
EOL
fi

Pre-Amazon Linux 2

Coming soon

Experimental

Multiple Queues with Single Worker

The default aws-sqsd daemon only supports one queue at a time, determined by the Elastic Beanstalk configuration. However, as of 3.1.0, we've introduced an experimental feature for also handling requests made by other sqsd daemons. One option is [sqsd](https://github.com/mogadanez/sqsd), but any daemon that makes localhost requests with the user agent sqsd should work.

Potential Setup

In .platform/hooks/postdeploy put a shell script like this:

#!/usr/bin/env bash
set -xe

if [ "$PROCESS_ACTIVE_ELASTIC_JOBS" = "true" ]
then
  npm install -g sqsd
  nohup "$(npm bin -g)/sqsd" --queue-url "$SQS_URL" --web-hook '/' --worker-health-url '/health' --ssl-enabled false --daemonized false >> /var/log/sqsd.log 2>&1 &
fi
  • worker-health-url is optional, but better to have than not
  • ssl-enabled is set to false as the default Elastic Beanstalk setup has the SSL ending at the load balancer and not the application.
  • daemonized is set to false otherwise sqsd would stop once the queue was empty (this seems backwards from the sqsd README, but it works this way)
  • user-agent is technically optional as the default value is sqsd, but there's potential to expand features based on this field
  • Everything starting with the >> is optional unless you want output from the daemon logged

Potential Problems

aws-sqsd cannot coordinate resources with sqsd; therefore it can't properly "load balance" tasks like normal. If this becomes a problem, you can lower the number of concurrent workers and max messages to help keep resources in check.

FAQ

A summary of frequently asked questions:

What are the advantages in comparison to popular alternatives like Resque, Sidekiq or DelayedJob?

You decided to use Elastic Beanstalk because it facilitates deploying and operating your application. Active Elastic Job embraces this approach and keeps deployment and maintenance simple. To use Resque, Sidekiq or DelayedJob as a queuing backend, you would need to set up at least one extra EC2 instance to run your queue application. This complicates deployment. Furthermore, you would need to monitor your queue and make sure that it stays in a healthy state.

Can I run Resque or DelayedJob in my web environment which already exists?

It is possible but not recommended. Your jobs will be executed on the same instance that is hosting your web server, which handles your users' HTTP requests. Therefore, the web server and the worker processes will fight for the same resources. This leads to slower responses of your application. But a fast response time is actually one of the main reasons to offload tasks into background jobs.

Is there a possibility to prioritize certain jobs?

Amazon SQS does not support prioritization. In order to achieve faster processing of your jobs you can add more instances to the worker environment or create a separate queue with its own worker environment for your high-priority jobs.

Can jobs be delayed?

You can schedule jobs no more than 15 minutes into the future. See the Amazon SQS API reference. If you need to postpone the execution of a job further into the future, consider setting up a periodic task.
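
With Active Job's standard API a delayed job is scheduled via `YourJob.set(wait: 10.minutes).perform_later(...)`. The 15-minute ceiling itself can be pictured with a small sketch; `sqs_delay_for` is a hypothetical helper, not part of the gem:

```ruby
# SQS caps a message's DelaySeconds at 900 (15 minutes), so a job cannot
# be deferred further than that with a single enqueue.
SQS_MAX_DELAY_SECONDS = 900

# Hypothetical helper: compute the SQS delay for a scheduled time and
# reject anything beyond the 15-minute limit.
def sqs_delay_for(run_at, now: Time.now)
  delay = (run_at - now).ceil
  if delay > SQS_MAX_DELAY_SECONDS
    raise ArgumentError, 'scheduled too far ahead; SQS allows at most 15 minutes'
  end
  [delay, 0].max
end

sqs_delay_for(Time.now + 600)  # fine: about 600 seconds
```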

Can I monitor and inspect failed jobs?

Amazon SQS provides dead-letter queues. These queues can be used to isolate and sideline unsuccessful jobs.

Is my internet-facing web environment protected against being spoofed into processing jobs?

The Rails application treats any request presenting a user agent value of aws-sqsd/* as a request from the SQS daemon; therefore, it tries to unmarshal the request body back into a job object for execution. This adds a potential attack vector, since anyone can fabricate a request with this user agent and might thereby try to spoof the application into processing jobs or even malicious code. This gem takes several countermeasures to block this attack vector.

  • The middleware that processes the requests from the SQS daemon is disabled by default. It has to be enabled deliberately by setting the environment variable PROCESS_ACTIVE_ELASTIC_JOBS to true, as instructed in the Usage section.
  • Messages that represent the jobs are signed before they are enqueued, and the signature is verified before the job is executed. This is the reason both environments (web and worker) need to have the same value for the environment variable SECRET_KEY_BASE (see the Usage section), since the secret key base is used to generate and verify the signature.
  • Only requests that originate from the same host (localhost) are considered to be requests from the SQS daemon. SQS daemons are installed in all instances running in a worker environment and will only send requests to the application running in the same instance. Because of these safety measures it is possible to deploy the same codebase to both environments, which keeps the deployment simple and reduces complexity.
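
Conceptually, the signing step works like the following sketch. It illustrates HMAC signing with a shared secret; the gem's actual message format and verification code may differ:

```ruby
require 'openssl'

# Sign the serialized job with the shared secret so the worker can verify
# that it was enqueued by the web tier. Illustrative sketch only.
def sign(message, secret)
  OpenSSL::HMAC.hexdigest('SHA256', secret, message)
end

def verified?(message, signature, secret)
  # NOTE: production code should use a constant-time comparison here.
  sign(message, secret) == signature
end

secret  = 'the-shared-secret-key-base'
payload = '{"job_class":"YourJob","arguments":[42]}'
sig     = sign(payload, secret)

verified?(payload, sig, secret)            # => true
verified?('tampered payload', sig, secret) # => false
```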

Can jobs get lost?

Active Elastic Job will raise an error if a job has not been sent successfully to the SQS queue. It expects the queue to return an MD5 digest of the message contents, which it verifies for correctness. Amazon advertises SQS as reliable, and messages are stored redundantly. If a job is not executed successfully, the corresponding message becomes visible in the queue again. Depending on the queue's settings, the worker environment will pull the message again and another attempt will be made to execute the job.
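
The digest check can be pictured like this. It is a sketch of the idea only; the real client-side verification lives in the gem and the AWS SDK:

```ruby
require 'digest'

# SQS echoes back an MD5 of the message body; comparing it with a locally
# computed digest confirms the message arrived intact.
def md5_matches?(message_body, returned_md5)
  Digest::MD5.hexdigest(message_body) == returned_md5
end

body = '{"job_class":"YourJob","arguments":[42]}'
md5_matches?(body, Digest::MD5.hexdigest(body))  # => true
md5_matches?(body, 'deadbeef')                   # => false
```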

What can be the reason if jobs are not executed?

Inspect the log files of your worker tier environment. It should contain entries for the requests that are performed by the AWS SQS daemon. Look out for POST requests from user agents starting with aws-sqsd/. If the log does not contain any, then make sure that there are messages enqueued in the SQS queue which is attached to your worker tier. You can do this from your AWS console.

When you have found the requests, check their response codes which give a clue on why a job is not executed:

  • status code 500: something went wrong. The job might have raised an error.
  • status code 403: the request seems to originate from a host other than localhost, or the message which represents the job could not be verified. Make sure that both environments, web and worker, use the same SECRET_KEY_BASE.
  • status code 404 or 301: the gem is not included in the bundle, the PROCESS_ACTIVE_ELASTIC_JOBS variable is not set to true in the worker environment (see step 6), or the worker environment uses an outdated platform running version 1 of the AWS SQS daemon. Check the user agent again: if it looks like aws-sqsd/1.*, then it uses the old version. This gem works only with daemons of version 2 or newer.

Bugs - Questions - Improvements

Whether you catch a bug, have a question or a suggestion for improvement, I sincerely appreciate any feedback. Please feel free to create an issue and I will follow up as soon as possible.

Contribute

Running the complete test suite requires launching Elastic Beanstalk environments. Travis builds triggered by a pull request will launch the needed environments and subsequently run the complete test suite. You can run all specs that do not depend on running Elastic Beanstalk environments by setting an environment variable:

EXCEPT_DEPLOYED=true bundle exec rspec spec

Feel free to open a pull request if this subset of specs passes.

Development environment with Docker

We recommend running the test suite in a controlled and predictable environment. If your development machine has Docker installed, you can make use of the Dockerfile that comes with this package: build an image and run the tests in a container of that image.

docker build -t active-elastic-job-dev .
docker run -e EXCEPT_DEPLOYED=true -v $(pwd):/usr/src/app active-elastic-job-dev bundle exec rspec spec


active-elastic-job's Issues

Pausing Jobs

If we experience a bug / issue, how can we pause a set of jobs?

Rails 5 Issue?

Checking this Gem out on Rails 5, the following error occurs when attempting to #perform_later:

NoMethodError (undefined method `enqueue' for #<ActiveJob::QueueAdapters::ActiveElasticJobAdapter:0x00558d3e366bd8>):

Odd since the method does exist.

Just wanted to give a heads up. When I have some time, I'll try to help resolve the issue.

Allow user to configure gem in initializer

I would like to be able to configure my AEJ user independently from the AWS users required by other gems. This is not possible because this gem uses the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables. As it stands, this configuration requirement dictates that I must give multiple permissions to a single AWS user.

I want to be able to configure this gem in an initializer like so:

ActiveElasticJob.setup do |config|
  config.aws_access_id = ENV.fetch("AEJ_ACCESS_ID")
  config.aws_secret_key = ENV.fetch("AEJ_SECRET_KEY")
  config.aws_region = ENV.fetch("AEJ_REGION")
  config.disable_sqs_consumer = ENV.fetch("DISABLE_SQS_CONSUMER")
end

How to find out why a job failed?

Hi Tawan,

I am using your library for a month already and so far it works really smooth. Thanks a lot for that.

I have some issues debugging why the jobs are failing on the worker. I noticed that a message ended up in a dead queue, so I tried to resend it. That failed. Usually I get an error from AppSignal. But this time - nothing. Then I tried to take a look in the EB logs:

/var/log/aws-sqsd/default.log

2016-03-13T17:33:56Z message: sent to %[http://localhost:80]
2016-03-13T17:33:56Z http-err: f0d021e8-cde9-413f-aafc-e1a02a523912 (1) 403 - 0.004
2016-03-13T19:03:13Z message: sent to %[http://localhost:80]
2016-03-13T19:04:24Z message: sent to %[http://localhost:80]
2016-03-13T19:04:24Z message: sent to %[http://localhost:80]
2016-03-13T19:59:59Z message: sent to %[http://localhost:80]
2016-03-13T19:59:59Z http-err: 5ba003c1-b275-4582-841f-f73024f94a6c (1) 403 - 0.004

Not sure if this one is relevant, but in /var/log/cfn-hup.log I get lots of these:

2016-03-13 20:35:05,351 [DEBUG] Receiving messages for queue https://eu-central-1.queue.amazonaws.com/45051790cxxx/xxxxf0ab5de7d1b20a6692967bcb1cb79d8b8d8b
2016-03-13 20:35:05,416 [ERROR] IOError caught while processing messages
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/cfnbootstrap/update_hooks.py", line 407, in process
    for msg in self.sqs_client.receive_message(self.queue_url, request_credentials = self._creds_provider.credentials, wait_time=20):
  File "/usr/lib/python2.7/dist-packages/cfnbootstrap/util.py", line 181, in _retry
    raise last_error
AwsQueryError: [Errno 403] RequestThrottled: Your ReceiveMessage request is throttled because you have too many concurrent ReceiveMessage requests coming from your account.  Reduce the number of clients polling your queues.
2016-03-13 20:35:05,416 [DEBUG] Receiving messages for queue https://eu-central-1.queue.amazonaws.com/45051790cxxx/xxxxf0ab5de7d1b20a6692967bcb1cb79d8b8d8b
2016-03-13 20:35:05,481 [ERROR] IOError caught while processing messages
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/cfnbootstrap/update_hooks.py", line 407, in process
    for msg in self.sqs_client.receive_message(self.queue_url, request_credentials = self._creds_provider.credentials, wait_time=20):
  File "/usr/lib/python2.7/dist-packages/cfnbootstrap/util.py", line 181, in _retry
    raise last_error

Do you have any advice how I can find out what happened?

Config for force_ssl = true

When an app has the config:

config.force_ssl = true

SQSd requests are redirected:

==> /var/log/nginx/access.log <==
127.0.0.1 - - [14/Mar/2016:20:15:52 +0000] "POST / HTTP/1.1" 301 5 "-" "aws-sqsd/2.0" "-"

==> /var/log/aws-sqsd/default.log <==
2016-03-14T20:15:52Z http-err: 802ae37e-ec44-435c-9faa-d77c0c338d36 (3) 301 - 0.002
2016-03-14T20:15:54Z message: sent to %[http://localhost:80]

I'm not sure which Rails/Rack middleware is doing this and I don't know if moving ActiveElasticJob::Rack::SqsMessageConsumer "up" in the middleware order would be ok.

I worked around with this config for webapp environment:

config.force_ssl = ENV['DISABLE_SQS_CONSUMER'] == 'true'

Not the prettiest thing, but ok.
Do you have another suggestion? I was thinking this could probably go into the README or a wiki page.
This is quite hard to notice until you SSH into the instance and tail the logs...

Deprecation in Rails 5

Hi, so far I can see, everything works in Rails 5. There is one little deprecation:

DEPRECATION WARNING: Passing strings or symbols to the middleware builder is deprecated, please change
them to actual class references.  For example:

  "ActiveElasticJob::Rack::SqsMessageConsumer" => ActiveElasticJob::Rack::SqsMessageConsumer

Could not obtain a database connection (Puma-ActiveRecord-Postgres-RDS)

Thanks for this gem - fantastic ease of deployment to elastic beanstalk.

Sometimes, when I have piling up of jobs in SQS, I end up getting a lot of errors like:

Error while trying to deserialize arguments: could not obtain a database connection within 5.000 seconds (waited 5.000 seconds)

This would seem to indicate that the worker instance is trying to use more connections than are available in the ActiveRecord pool (set at 20 to accommodate Puma). However, the postgres activity doesn't seem to show this. At the time of errors I only observed 5 or so postgres connections. I expected to see close to 20 if it was indeed running out of connections from the pool.

Is it possible that ActiveElasticJob isn't respecting my pool size?

Thanks,
Jeff

Limit number of concurrent jobs

How can I limit the number of concurrent jobs?

A worker instance only has enough resources for about 10 concurrent jobs, but it receives more requests than that, so it cannot handle all of them.

Please help to prevent this.

not detecting queue

Not sure how to handle this; it seems it's not detecting my SQS queue? Here's the response I get when I try to queue up a job:

ActiveJob::QueueAdapters::ActiveElasticJobAdapter::NonExistentQueue: The job is bound to queue at test_queue. Unfortunately a queue
with this name does not exist in this region. Either create an Amazon SQS queue
named test_queue - you can do this in AWS console, make sure to select
region 'us-west-2' - or you select another queue for your jobs.

my queue is https://sqs.us-west-2.amazonaws.com/622149014054/test_queue, so it's definitely not an issue of proper name or region.

HTTP Error 301

Hi, I am getting a 301 error in my worker env logs, any idea why?

var/app/support/logs/access.log
127.0.0.1 - - [28/May/2016:02:04:30 +0000] "POST / HTTP/1.1" 301 5 "-" "aws-sqsd/2.0"

/var/log/aws-sqsd/default.log

2016-05-28T02:07:33Z message: sent to %[http://localhost:80]
2016-05-28T02:07:33Z http-err: 3d762187-16aa-409e-aa6f-ba21f2a2962a (995) 301 - 0.004

403 error, same secret key base

2016-11-15T17:44:50Z message: sent to http://localhost:80
2016-11-15T17:44:50Z http-err: 2dd429b5-be92-4cc5-bdaf-92e7ba41bc8e (1) 403 - 0.004

I'm getting a request from localhost but still getting a 403, I have the same secret key base for web and worker tiers. Any tips for debugging or things I may have missed?

Update README to state why it is safe to have the gem installed on the web environment

The worker is safe by default because the Rails application will listen only on localhost, and thus only the daemon is able to communicate with it. So any message the Rails application receives is bound to have been sent by the daemon.

On the other hand, the web application is exposed and anyone could curl a request with the right user agent and stuff. I assume the message digest is meant to keep "odd" requests at bay and prevent people from spoofing the web environment into processing jobs. I had to dig a bit into the code to ensure this was covered, so it is useful to point out.

How are multiple jobs processed?

Hi,

If there are 10 jobs coming in (assume each job requires 50% CPU running for 2 minutes), how many will the worker server take and process?

As I used Resque before, I could set the worker number to 2 per server, so the server would take only 2 jobs at a time and use 10 minutes to finish all the jobs. And I could view the process details via a web UI.

So I want to know if I can set a worker number for AEJ like in Resque? And can I also view the currently processing jobs, like in the Resque web UI?

Thanks, the AEJ is a great tool to enjoy aws eb with no extra pain = ]

Handling errors

Should the documentation include ways to handle failed jobs so that SQS does not keep running them indefinitely?

Sudden loss of sqs queue

Hi,

Recently we experienced a sudden loss of the SQS queue.
An exception like this is thrown:

ActiveJob::QueueAdapters::ActiveElasticJobAdapter::NonExistentQueue:
The job is bound to queue at <queue_name>.
Unfortunately a queue with this name does not exist in this region.
Either create an Amazon SQS queue named <queue_name> - you can do this in AWS console,
make sure to select region '<region>' - or you select another queue for your jobs.

We run multi-threaded. Not all threads/processes seem to throw this error, and it seems to occur when there are relatively long-running jobs (10+ seconds).

We run version 1.4.2.

Going to investigate further.

EB Web and Work - Sharing Database

I'm assuming they need to share a database, but didn't see this mentioned in the Readme. What are the recommended steps for sharing the RDS instance?

Config for Rack::SslEnforcer middleware

I have to use another way instead of config.force_ssl to enable the Health Check feature http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features.healthstatus.html.

I tried a lot of things to make it work with force_ssl, like described here:
http://robforman.com/how-to-setup-a-public-and-http-version-health-check-while-still-using-global-force_ssl-in-rails-for-elastic-load-balancer/

But only one way works for me:

config.middleware.use(Rack::SslEnforcer, except: ['/health'], strict: :true)

Now I have to use an ENV var again:

config.middleware.use(Rack::SslEnforcer, except: ['/health'], strict: :true) if ENV['DISABLE_SQS_CONSUMER'] == 'true'

Is there any chance to make a fix for Rack::SslEnforcer too, as you did here in #25? Or does anyone use force_ssl with the EB Health Check enabled?

Thanks,
Alexander.

Priority

Hi @tawan!

Any idea how to implement some sort of priority system for messages?
With delayed job you can give the job a priority. So it will be processed with precedence above others.

In Beanstalk I can imagine the following:

  • Make multiple worker environments
  • One with more app servers than the other so jobs will be processed quicker.
  • Send jobs to a different queues.

Is there a better way?

Jobs that take longer than 60s fail.

I suspect that long running jobs will not work with the default configuration, as nginx times them out after 60s. You may want to add something to the readme about this:

nginx/error.log:

2017/01/28 17:55:02 [error] 24608#0: *41 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 127.0.0.1, server: _, request: "POST / HTTP/1.1", upstream: "http://unix:///var/run/puma/my_app.sock/", host: "localhost"

/var/log/aws-sqsd/default.log

2017-01-28T17:55:02Z http-err: 1bed1375-83ef-44f1-93b6-2a4ae7f3d612 (4) 504 - 68.761

Using instance profile credentials?

The README says to explicitly configure the SQS Client connection:

- Add AWS_ACCESS_KEY_ID and set it to access key id of the newly created user (from Step 3).
- Add AWS_SECRET_ACCESS_KEY and set it to the secret access key of the newly created user (from Step 3).
- Add AWS_REGION and set it to the region of the SQS queue, created in Step 2.

Is it possible to use the implicit instance profile credentials when running in EC2? It seems like the environment-based configuration is pretty hard-coded, but maybe you can suggest an approach that would work?

Thoughts on Message Digest

I'm trying to use this gem in an service-oriented architecture, where there are multiple writers to SQS queues.

The Rails app is the "center-piece" and also where all job processing is performed.

Due to the checks of a message-digest and the "origin=AEJ" header (https://github.com/tawan/active-elastic-job/blob/master/lib/active_elastic_job/rack/sqs_message_consumer.rb#L59) it is currently very hard to have publishers to SQS, other than the Rails app (with the same secret_key_base).

I believe I'll be able to "fake" these headers so that I can have my other publishers' jobs accepted. (Either that or I'll fork this gem and delete that check :P)

Either way, this got me thinking if this feature isn't "too much", and maybe it should be configurable (at least to turn on/off).

I think the check for localhost requests is all ok, but other than that it feels like this gem is doing too much. It's the user's job to ensure that their queues aren't publicly accessible and AWS provides all the tools to make that happen.

In my case the other publishers aren't rails/ruby apps, but even if they were, I'd have to ensure those other rails apps had the same secret_key_base, which is a bad thing.

Local development environment

I was wondering how would the local development be like with this gem?

Jobs should still be queued on SQS, but I can't see how those jobs would be retrieved and ran locally.
Any idea on how this is done?
I'd think this gem could provide a simple command/rake task that periodically fetches messages from the specified queues.

(I'm trying to replace Shoryuken with this gem in an effort to move our production environment to AWS EB.)

ArgumentError: wrong number of arguments (given 1, expected 2)

Hello, I tried to follow all the documentation for configuring my queue and my application.
However I get an error with the following log:
ArgumentError: wrong number of arguments (given 1, expected 2)
from /opt/rubies/ruby-2.3.1/lib/ruby/gems/2.3.0/gems/active_elastic_job-2.0.0/lib/active_job/queue_adapters/active_elastic_job_adapter.rb:53:in `initialize'
from /opt/rubies/ruby-2.3.1/lib/ruby/gems/2.3.0/gems/active_elastic_job-2.0.0/lib/active_job/queue_adapters/active_elastic_job_adapter.rb:145:in `exception'
from /opt/rubies/ruby-2.3.1/lib/ruby/gems/2.3.0/gems/active_elastic_job-2.0.0/lib/active_job/queue_adapters/active_elastic_job_adapter.rb:145:in `raise'
from /opt/rubies/ruby-2.3.1/lib/ruby/gems/2.3.0/gems/active_elastic_job-2.0.0/lib/active_job/queue_adapters/active_elastic_job_adapter.rb:145:in `rescue in queue_url'
from /opt/rubies/ruby-2.3.1/lib/ruby/gems/2.3.0/gems/active_elastic_job-2.0.0/lib/active_job/queue_adapters/active_elastic_job_adapter.rb:139:in `queue_url'
from /opt/rubies/ruby-2.3.1/lib/ruby/gems/2.3.0/gems/active_elastic_job-2.0.0/lib/active_job/queue_adapters/active_elastic_job_adapter.rb:122:in `build_message'
from /opt/rubies/ruby-2.3.1/lib/ruby/gems/2.3.0/gems/active_elastic_job-2.0.0/lib/active_job/queue_adapters/active_elastic_job_adapter.rb:96:in `enqueue_at'
from /opt/rubies/ruby-2.3.1/lib/ruby/gems/2.3.0/gems/active_elastic_job-2.0.0/lib/active_job/queue_adapters/active_elastic_job_adapter.rb:90:in `enqueue'
from /opt/rubies/ruby-2.3.1/lib/ruby/gems/2.3.0/gems/active_elastic_job-2.0.0/lib/active_job/queue_adapters/active_elastic_job_adapter.rb:81:in `enqueue'
from /opt/rubies/ruby-2.3.1/lib/ruby/gems/2.3.0/gems/activejob-5.0.0.1/lib/active_job/enqueuing.rb:76:in `block in enqueue'

Do you know what I did wrong?

Thanks for helping.

Cache queue URL

Currently the SQS queue URL is fetched every time a job is enqueued. This is an API call to AWS, which decreases performance significantly (relevant code part).

  1. Cache the fetched queue url in memory for each queue.
  2. If Aws::SQS::Client#send_message throws an Aws::SQS::Errors::NonExistentQueue and the queue url was taken from the cache, fetch the queue url again with an API call and retry once to send the message.

No route matches [POST] "/"

Hi,

First off, thanks for making this gem. It's just what we needed. Currently I am in the process of getting this set up on AWS (Elastic Beanstalk). For our application it's used to send email asynchronously.

The messages are enqueued correctly, but the worker environment fails with the following error while dequeuing:


/var/app/containerfiles/logs/production.log

I, [2016-04-01T01:03:37.421909 #9616] INFO -- : Started POST "/" for 127.0.0.1 at 2016-04-01 01:03:37 +0000
F, [2016-04-01T01:03:37.554981 #9616] FATAL -- :
ActionController::RoutingError (No route matches [POST] "/"):

Any ideas?

Thanks in advance,
Mike

Perform more than 15 minutes later

We want to be able to run jobs at a specific time in the future. Since AWS SQS doesn't allow delays of more than 15 minutes, we can't do this:

GuestsCleanupJob.set(wait_until: Date.tomorrow.noon).perform_later(guest)

Anybody else need this functionality? What did you do?
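One workaround people reach for (a sketch, not something the gem provides) is to let the job re-enqueue itself in hops of at most 15 minutes until the target time is reached. The hop arithmetic is the testable core; inside your job's `perform` you would call something like `self.class.set(wait: delay).perform_later(target_time.iso8601, *args)` whenever a delay is returned, and do the real work otherwise.

```ruby
# SQS caps per-message delay at 15 minutes, so a long wait becomes a chain
# of short hops that re-enqueue the job until the target time is reached.
MAX_SQS_DELAY = 15 * 60  # seconds

# Returns the delay for the next re-enqueue hop, or nil when the target
# time has been reached and the work should actually run.
def next_hop_delay(target_time, now = Time.now)
  remaining = target_time - now
  return nil if remaining <= 0
  [remaining, MAX_SQS_DELAY].min
end
```

Note each hop costs one SQS message, so a job scheduled a day out makes roughly 96 hops.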

managing priority?

Sorry I'm throwing so many questions at you!

As it's built now, my understanding is that it's a first-come-first-served model. Is there a way to manage the priority of tasks? I.e., if I have a long queue of tasks running, is there a way I can shoot off an email asynchronously without too much delay? Or, in this scenario, would I have to set up a separate background worker and queue purely for emails?

-chris
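One answer that fits the gem's model (a sketch; SQS itself has no message priorities): dedicate a separate queue, drained by its own worker environment, to latency-sensitive jobs such as mailers, and route jobs with `queue_as`:

```ruby
# Sketch: route latency-sensitive jobs to their own SQS queue so they are
# not stuck behind long-running work. Queue names here are illustrative;
# each queue would be drained by its own worker environment.
class LongRunningJob < ActiveJob::Base
  queue_as :default_queue
end

class UrgentMailerJob < ActiveJob::Base
  queue_as :mailers_queue
end
```

This trades one auto-scaled pool for two, but gives the mailer queue its own latency budget.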

request.local? fails with Dockerized application

In ActiveElasticJob::Rack::SqsMessageConsumer there is a check that the request is local (request.local?).
https://github.com/tawan/active-elastic-job/blob/master/lib/active_elastic_job/rack/sqs_message_consumer.rb#L30-L32

Our application is running inside a docker container, and this check currently fails as the remote_addr and remote_ip resolve to 172.17.0.1.

My suggestion is to make the local check optional, set in an initializer, or to allow the user to supply a proc that performs their own check.

Thanks again @tawan for all the work on this gem. I'll submit more docker related issues as I find them.
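For illustration, the suggested user-supplied check could be as small as a predicate like the one below (a sketch; the gem currently has no such option). It accepts loopback addresses plus the Docker bridge network mentioned above.

```ruby
require 'ipaddr'

# Hypothetical trust predicate a configurable check could accept.
# 172.17.0.0/16 is Docker's default bridge network.
DOCKER_BRIDGE = IPAddr.new('172.17.0.0/16')

def trusted_source?(remote_ip)
  ip = IPAddr.new(remote_ip)
  ip.loopback? || DOCKER_BRIDGE.include?(ip)
end
```

A proc-based setting would let users swap this logic for whatever their network layout requires.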

Errno::ETIMEDOUT raised

I ran a long job (about 3 hours) using active-elastic-job.
Then I got an error:

2017-04-30T02:01:25Z socket-err: 96055c97-fc82-41d0-a3ec-4aa000fb970e (2) Errno::ETIMEDOUT - 1798.986

But the background job was still working.
How can I solve this?

Visibility timeout: 21600
Error visibility timeout: 2
(On "Advanced")
Inactivity timeout: 1799

Great library

Hi @tawan,

Thanks for sharing this great library and clear instructions. I just installed it and everything worked fine from the first time.

  • I am considering using it in production and I am wondering what can go wrong. I assume that once installed, it keeps working, right? Is this library vulnerable to sudden AWS changes? Are there any other risks?
  • Is there a way that I can keep track of successful jobs?

Karens

2.0 config issue (possible)

Hello,

I've upgraded to 2.0 and the periodic tasks feature is very useful. However, I've noticed a possible issue introduced by the "Simplify configuration" commit.

The AWS credential setting, this line:

config.active_elastic_job.aws_credentials = Aws::InstanceProfileCredentials.new

  1. It seems very slow (my small app obviously takes much longer to start after upgrading to 2.0).
  2. It doesn't seem to work, and I have to set the credentials manually, like:

config.active_elastic_job.aws_credentials = Aws::Credentials.new(Rails.application.secrets.aws_key_id, Rails.application.secrets.aws_key_secret)

I'm not 100% sure, but it looks like a problem. Thanks for any help.

Support AWS_SECRET_KEY as well

The AWS Ruby SDK supports all three:
AWS_SECRET_ACCESS_KEY, AMAZON_SECRET_ACCESS_KEY, AWS_SECRET_KEY

It would be great if this gem followed suit. It wouldn't complicate the code too much, and you wouldn't have to mention it too prominently in the docs either. Also, when creating a new worker environment, the configuration says:

[screenshot of the worker environment configuration, 2016-04-21]
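The fallback itself is a one-liner (a sketch; `aws_secret` is an illustrative helper, not the gem's code):

```ruby
# Sketch: check the three environment variable names the AWS Ruby SDK
# accepts, in order, and return the first one that is set.
def aws_secret(env = ENV)
  env['AWS_SECRET_ACCESS_KEY'] || env['AMAZON_SECRET_ACCESS_KEY'] || env['AWS_SECRET_KEY']
end
```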

jobs being repeatedly called

I'm not 100% sure of the source of this issue, but if you can shed any light it's much appreciated. I'm still just trying to make sure everything is wired up properly, so I set up a really simple job. For some reason the job seems to keep getting called over and over again. Here's a snippet of the log when I made a second request. I made a call from my webserver console and called UpdateSongJob.perform_later(args):


[ActiveJob] [UpdateSongJob] [6b99eb86-3981-4bdf-bfd4-21702fca7c4f] Performing UpdateSongJob from ActiveElasticJob(test_queue) with arguments: "568979ced2b38b05d00003d6", "188.40.62.138", "80"
[ActiveJob] [UpdateSongJob] [6b99eb86-3981-4bdf-bfd4-21702fca7c4f] Performed UpdateSongJob from ActiveElasticJob(test_queue) in 0.48ms
Started POST "/" for 127.0.0.1 at 2016-02-18 11:14:27 +0000
[ActiveJob] [UpdateSongJob] [97ea6055-8940-4d18-a02a-f0911d83509f] Performing UpdateSongJob from ActiveElasticJob(test_queue) with arguments: "5670b6fe7e8482b048000166", "188.40.62.138", "80"
[ActiveJob] [UpdateSongJob] [97ea6055-8940-4d18-a02a-f0911d83509f] Performed UpdateSongJob from ActiveElasticJob(test_queue) in 0.51ms
Started POST "/" for 127.0.0.1 at 2016-02-18 11:14:27 +0000
[ActiveJob] [UpdateSongJob] [c2bde572-73d2-4369-be53-263cd86179c5] Performing UpdateSongJob from ActiveElasticJob(test_queue) with arguments: "5670b6fe7e8482b048000166", "188.40.62.138", "80"
[ActiveJob] [UpdateSongJob] [c2bde572-73d2-4369-be53-263cd86179c5] Performed UpdateSongJob from ActiveElasticJob(test_queue) in 0.38ms
Started POST "/" for 127.0.0.1 at 2016-02-18 11:14:27 +0000
[ActiveJob] [UpdateSongJob] [c41a9f00-4437-4889-a1a7-681777b4dd0f] Performing UpdateSongJob from ActiveElasticJob(test_queue) with arguments: "5670b6fe7e8482b048000166", "188.40.62.138", "80"
[ActiveJob] [UpdateSongJob] [c41a9f00-4437-4889-a1a7-681777b4dd0f] Performed UpdateSongJob from ActiveElasticJob(test_queue) in 0.54ms
Started POST "/" for 127.0.0.1 at 2016-02-18 11:14:27 +0000
[ActiveJob] [UpdateSongJob] [73b93aa3-1903-49ef-9da8-171ed1cce732] Performing UpdateSongJob from ActiveElasticJob(test_queue) with arguments: "5670b6fe7e8482b048000166", "188.40.62.138", "80"
[ActiveJob] [UpdateSongJob] [73b93aa3-1903-49ef-9da8-171ed1cce732] Performed UpdateSongJob from ActiveElasticJob(test_queue) in 0.46ms
Started POST "/" for 127.0.0.1 at 2016-02-18 11:14:28 +0000
[ActiveJob] [UpdateSongJob] [6b99eb86-3981-4bdf-bfd4-21702fca7c4f] Performing UpdateSongJob from ActiveElasticJob(test_queue) with arguments: "568979ced2b38b05d00003d6", "188.40.62.138", "80"
[ActiveJob] [UpdateSongJob] [6b99eb86-3981-4bdf-bfd4-21702fca7c4f] Performed UpdateSongJob from ActiveElasticJob(test_queue) in 0.45ms
