aws-batch-image-processor

aws-batch-image-processor is a sample project that demonstrates the use of AWS Batch. Here, we use Amazon Rekognition to detect entities in a photo uploaded to Amazon S3. While this is a fairly simple use case, the job itself could be extended to handle more complex batch scenarios. See my blog post for additional details.

The sample code contains the following:

  • template.tf - Terraform plan containing the AWS resources required (e.g. AWS Batch Job Queue and Job Definition).
  • /job - assets to execute the batch job.
  • /lambda - AWS Lambda function used to submit / start the batch job.

Getting Started

To get started, clone this repository locally:

$ git clone https://github.com/jkahn117/aws-batch-image-processor.git

Prerequisites

To run the project, you will need to:

  1. Select an AWS Region. Be sure that all required services (e.g. AWS Batch, AWS Lambda) are available in the Region selected.
  2. Install Docker.
  3. Install HashiCorp Terraform.
  4. Install the latest version of the AWS CLI and confirm it is properly configured.

Create AWS Resources with Terraform

For this project, we will use Terraform to deploy our AWS resources. These include various Batch components (Compute Environment, Job Queue, and Job Definition) as well as a Lambda function and related IAM Roles.

(Figure: Project Architecture)

To build infrastructure with Terraform (be sure AWS credentials are configured with appropriate permissions):

# initialize the terraform environment
$ terraform init

# review the plan
$ terraform plan

# deploy...
$ terraform apply

Build and Push Docker Image

Once finished, Terraform will output the name of your newly created ECR Repository, e.g. 123456789098.dkr.ecr.us-east-1.amazonaws.com/aws-batch-image-processor-sample. Note this value as we will use it in subsequent steps (referred to as MY_REPO_NAME):

$ cd job

# build the docker image
$ docker build -t aws-batch-image-processor-sample .

# tag the image
$ docker tag aws-batch-image-processor-sample:latest <MY_REPO_NAME>:latest

# push the image to the repository
$ docker push <MY_REPO_NAME>:latest

Pushing the image may take several minutes.
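
If the push is rejected with an authentication error, Docker probably needs to log in to the ECR registry first. The sketch below is not part of the sample: it assumes a repository URI of the form shown above, splits it into registry host and Region, and pipes `aws ecr get-login-password` into `docker login`. The names `EcrRepo`, `parse_ecr_repo`, and `docker_login` are hypothetical helpers.

```python
import subprocess
from typing import NamedTuple


class EcrRepo(NamedTuple):
    """Hypothetical helper type, not part of the sample project."""
    registry: str  # e.g. 123456789098.dkr.ecr.us-east-1.amazonaws.com
    region: str    # e.g. us-east-1
    name: str      # e.g. aws-batch-image-processor-sample


def parse_ecr_repo(repo_uri: str) -> EcrRepo:
    """Split <account>.dkr.ecr.<region>.amazonaws.com/<name> into its parts."""
    registry, _, name = repo_uri.partition("/")
    parts = registry.split(".")
    if not name or len(parts) < 6 or parts[1:3] != ["dkr", "ecr"]:
        raise ValueError(f"not an ECR repository URI: {repo_uri!r}")
    return EcrRepo(registry=registry, region=parts[3], name=name)


def docker_login(repo: EcrRepo) -> None:
    """Fetch a temporary ECR password and pass it to `docker login` on stdin."""
    password = subprocess.run(
        ["aws", "ecr", "get-login-password", "--region", repo.region],
        check=True, capture_output=True, text=True,
    ).stdout
    subprocess.run(
        ["docker", "login", "--username", "AWS", "--password-stdin", repo.registry],
        input=password, check=True, text=True,
    )
```

Calling `docker_login(parse_ecr_repo("<MY_REPO_NAME>"))` before the push should be enough, provided the AWS CLI credentials are allowed to fetch an ECR authorization token.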

Upload an Image to S3

In addition to the repository name, Terraform will also output the name of a newly created S3 bucket (starts with "aws-batch-sample-"). We will use that name next (referred to as MY_BUCKET_NAME).
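
Rather than copying both values from the terminal, they can be read back at any time with `terraform output -json`. Here is a minimal sketch that flattens that JSON into a plain dictionary. The helper names are hypothetical, and the output names defined in template.tf are not assumed, so print the dictionary to see which keys it contains.

```python
import json
import subprocess


def parse_terraform_outputs(raw_json: str) -> dict:
    """Map each Terraform output name to its plain value."""
    return {name: entry["value"] for name, entry in json.loads(raw_json).items()}


def read_terraform_outputs(workdir: str = ".") -> dict:
    """Run `terraform output -json` in workdir (hypothetical helper)."""
    result = subprocess.run(
        ["terraform", "output", "-json"],
        cwd=workdir, check=True, capture_output=True, text=True,
    )
    return parse_terraform_outputs(result.stdout)
```

Run `read_terraform_outputs()` from the directory that contains template.tf, after `terraform apply` has completed.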

Select an image, perhaps of a pet or your street.

# upload the image to s3
$ aws s3 cp <MY_IMAGE> s3://<MY_BUCKET_NAME>
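
The same upload can be done from Python with boto3. In this sketch the S3 client is passed in, and the object key is the file name, which is what the `aws s3 cp` command above produces. `object_key_for` and `upload_image` are hypothetical helpers, not part of the sample.

```python
import os


def object_key_for(image_path: str) -> str:
    """Return the key `aws s3 cp <file> s3://<bucket>` would use: the file name."""
    return os.path.basename(image_path)


def upload_image(s3_client, image_path: str, bucket: str) -> str:
    """Upload the image and return the object key it was stored under."""
    key = object_key_for(image_path)
    # boto3 S3 client signature: upload_file(Filename, Bucket, Key)
    s3_client.upload_file(image_path, bucket, key)
    return key
```

With boto3 installed, `upload_image(boto3.client("s3"), "reptile.jpg", "<MY_BUCKET_NAME>")` returns the key to use as the image name in the next step.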

Invoke Lambda to Submit Batch Job

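Finally, let's invoke our Lambda function to submit a new batch job. Set imageName to the name of the image you uploaded (reptile.jpg is only an example); the function's response is written to response.json.

# with AWS CLI v2, also pass: --cli-binary-format raw-in-base64-out
$ aws lambda invoke --function-name aws-batch-image-processor-function \
                    --payload '{"imageName": "reptile.jpg"}' \
                    response.json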
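
For scripted runs, the invocation can also go through boto3. The sketch below assumes the function name used in the CLI call above and the same event shape with a single `imageName` field. `build_payload` and `submit_image` are hypothetical helpers.

```python
import json


def build_payload(image_name: str) -> bytes:
    """Encode the same event the CLI call passes with --payload."""
    return json.dumps({"imageName": image_name}).encode("utf-8")


def submit_image(lambda_client, image_name: str,
                 function_name: str = "aws-batch-image-processor-function") -> int:
    """Invoke the function synchronously and return the HTTP status code."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        Payload=build_payload(image_name),
    )
    return response["StatusCode"]
```

A status code of 200 only means the function ran. Whether the batch job was queued still has to be checked in AWS Batch.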

Verification

Creating the compute resources to run the Batch job may require several minutes. During this time, I recommend visiting the AWS Console. A few things to review:

  • AWS Batch Dashboard and Job Queue
    • Note the state of your job.
    • Submitting multiple jobs will cause your queue length to increase.
  • AWS Batch Job Definition
    • Defines the batch job; note the Docker image it references.
    • Note the command used to initiate the job worker (written here in Ruby, but could be any language).
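
The job state shown in the console can also be polled from a script. This sketch assumes you already know the job ID (it is listed on the Batch dashboard) and uses the Batch `describe_jobs` call. `wait_for_job` is a hypothetical helper, and the polling interval is arbitrary.

```python
import time

# A Batch job ends in one of these two states.
TERMINAL_STATES = {"SUCCEEDED", "FAILED"}


def wait_for_job(batch_client, job_id: str, poll_seconds: float = 15.0,
                 max_polls: int = 80, sleep=time.sleep) -> str:
    """Poll describe_jobs until the job succeeds or fails; return the final status."""
    status = "UNKNOWN"
    for _ in range(max_polls):
        jobs = batch_client.describe_jobs(jobs=[job_id])["jobs"]
        if not jobs:
            raise LookupError(f"no such job: {job_id}")
        status = jobs[0]["status"]
        if status in TERMINAL_STATES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} still {status} after {max_polls} polls")
```

A job typically sits in RUNNABLE while the compute environment scales up, which is why the first run can take several minutes.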

Once your job state is "SUCCEEDED", visit the DynamoDB console and open the aws-batch-image-processor table. It should contain an entry such as:

(Figure: Result in DynamoDB)

Cleaning Up

When ready, it is easy to remove the resources created in this sample via Terraform (note that you may need to empty your S3 bucket first):

$ terraform destroy
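
If `terraform destroy` fails because the bucket still holds objects, the bucket can be emptied first, for example with `aws s3 rm s3://<MY_BUCKET_NAME> --recursive`. The same idea as a boto3 sketch, assuming an unversioned bucket (`empty_bucket` is a hypothetical helper):

```python
def empty_bucket(s3_client, bucket: str) -> int:
    """Delete every object in the bucket; return how many were removed."""
    deleted = 0
    kwargs = {"Bucket": bucket}
    while True:
        # Each page holds at most 1000 keys, which is also the
        # per-request limit of delete_objects.
        page = s3_client.list_objects_v2(**kwargs)
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if keys:
            s3_client.delete_objects(Bucket=bucket, Delete={"Objects": keys})
            deleted += len(keys)
        if not page.get("IsTruncated"):
            return deleted
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
```

Be careful to pass the sample bucket's name here, since the deletion cannot be undone.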

Authors

  • Josh Kahn - initial work
