Giter Club home page Giter Club logo

stepwise's Introduction

Stepwise

Clojars Project

Stepwise inverts the responsibility [1] of AWS Step Functions such that you control your workflow orchestration with a Stepwise DSL from inside Clojure. Use Stepwise to coordinate asynchronous, distributed processes with AWS managing state, branching, and retries.

This Motiva-AI library is a fork of uwcpdx/stepwise and contains breaking changes.

Implemented Features:

Stepwise is in production at:

  • UW Medicine Center For Precision Diagnostics orchestrate their ETL with Stepwise
  • Motiva's production workflow serving a handful of Fortune 500 companies is mostly built upon this library

Further reading:

  1. You Can't Buy Integration
  2. Orchestrating Pizza-Making: A Tutorial for AWS Step Functions with Stepwise

Basic Usage

Here's how to make a trivial state machine that just adds two inputs together. The only prerequisite is to install the AWS CLI and run aws configure to set up your authentication and region.

At the REPL:

(require '[stepwise.core :as stepwise])

(stepwise/ensure-state-machine :adder
                               {:start-at :add
                                :states   {:add {:type     :task
                                                 :resource :activity/add
                                                 :end      true}}})
=> "arn:aws:states:us-west-2:123456789012:stateMachine:adder"

(stepwise/start-workers! {:activity/add (fn [{:keys [x y]}] (+ x y))})
=> ...

(stepwise/start-execution!! :adder {:input {:x 1 :y 1}})
=>
{:input             {:x 1 :y 1}
 :output            2
 :state-machine-arn "arn:aws:states:us-west-2:123456789012:stateMachine:adder"
 :start-date        #inst"2017-06-20T22:48:14.241-00:00"
 :stop-date         #inst"2017-06-20T22:48:14.425-00:00"
 :arn               "arn:aws:states:us-west-2:123456789012:execution:adder:9c5623c6-eee3-49fa-a7d6-22e3ca236c9a"
 :status            "SUCCEEDED"
 :name              "9c5623c6-eee3-49fa-a7d6-22e3ca236c9a"}

(stepwise/shutdown-workers *2)
=> #{:done}

Development Workflow

Stepwise enables a rapidly cycling development workflow for Step Functions. State machine definition updates are eventually consistent, so new state machines need to be created on each cycle during development. Also, activity task polls are long and cannot be interrupted, demanding registration of new activities (or a JVM restart) to prevent stealing by stale bytecode. Stepwise provides a single function, stepwise.reloaded/start-execution!!, that uses fresh state machine and activity task registrations to run a state machine execution wherein code changes are immediately reflected.

Example:

(stepwise.reloaded/start-execution!! :adder
                                     {:start-at :add
                                      :states   {:add {:type     :task
                                                       :resource :activity/add
                                                       :end      true}}}
                                     {:activity/add (fn [{:keys [x y]}] (+ x y))}
                                     {:x 41 :y 1})
=>
{:output 42,
 :state-machine-arn "arn:aws:states:us-west-2:256212633204:stateMachine:adder-1522697821734",
 :start-date #inst"2018-04-02T19:37:02.061-00:00",
 :stop-date #inst"2018-04-02T19:37:02.183-00:00",
 :input {:x 41,
         :y 1,
         :state-machine-name "adder-1522697821734",
         :execution-name "93f1d268-b2ff-4261-bf53-8ff92d7bc2c2"},
 :arn "arn:aws:states:us-west-2:256212633204:execution:adder-1522697821734:93f1d268-b2ff-4261-bf53-8ff92d7bc2c2",
 :status "SUCCEEDED",
 :name "93f1d268-b2ff-4261-bf53-8ff92d7bc2c2"}

Naturally your state machine and handlers will not be defined inline like this, so pair this call with something like tools.namespace or Cursive's native code reloading to rapidly try out changes to your namespaces.

EDN States Language Representation

Stepwise represents the Amazon States Language as EDN.

The only pervasive departure from the JSON representation is the use of lowered-hyphen keywords for keys, state names, and state types instead of CamelCase strings. The official AWS SDK validates your state machine definitions, and its exceptions are let through unmolested, so their messages are in terms of the CamelCase naming. When retrieving definitions from AWS they are translated into the EDN representation whether created with stepwise or not (state names are converted to keywords but otherwise untouched).

Aside from keywords being substituted for strings, the other sugaring is on:

  • Condition expressions in choice states
  • Error matching expressions in retry and catch blocks
  • Activity task resources

Condition Expressions

Condition expressions in choice states are sugared in stepwise as illustrated by the example below:

JSON
----
{
    "Not": {
      "Variable": "$.type",
      "StringEquals": "Private"
    },
    "Next": "Public"
}

EDN
---
{:condition [:not [:= "$.type" "Private"]]
 :next      :public}

JSON
----
{
  "And": [
    {
      "Variable": "$.value",
      "NumericGreaterThanEquals": 20
    },
    {
      "Variable": "$.value",
      "NumericLessThan": 30
    }
  ],
  "Next": "ValueInTwenties"
}

EDN
---
{:condition [:and [:>= "$.value" 20]
                  [:< "$.value" 30]]
 :next      :value-in-twenties}

See src/stepwise/sugar.clj and src/stepwise/specs/sugar.clj for full details.

Error Matching

In stepwise the :error-equals key that corresponds to the ErrorEquals key in the states language can be a keyword or collection of keywords:

JSON
----
"Retry" : [
    {
      "ErrorEquals": [ "States.ALL" ]
    }
]

EDN
---
{:retry [{:error-equals :States.ALL}]}

You can of course use keywords for your custom error names, including namespaced keywords.

Activity Task Resources

Activity task resources can be specified as keywords and stepwise.core/ensure-state-machine will register appropriately named activities for you and substitute in their ARNs. For example:

JSON
----
{
  "StartAt": "Add",
  "States": {
    "Add": {
      "Type": "Task",
      "Resource": "arn:aws:states:us-west-2:123456789012:activity:add",
      "End": true
    }
  }
}

EDN
---
{:start-at :add
 :states   {:add {:type     :task
                  :resource :add
                  :end      true}}}

You can also supply an ARN string for the resource to specify a lambda task or activity task managed outside of stepwise.

Offloading Large Payload to S3

To offload some of your input to S3 before execution, call stepwise/offload-select-keys-to-s3 like so,

(as-> {:x 1 :y 1} $
  (stepwise/offload-select-keys-to-s3 $ [:x :y] bucket-name)
  (hash-map :input $)
  (stepwise/start-execution!! :adder $))

Note that data will only be offloaded to S3 if the payload is actually large in size, i.e. ~200kb.

On the receiving end, Stepwise workers will automatically load these offloaded data from S3 for you. So that your activity worker can remain unchanged.

(stepwise/start-workers! {:activity/add (fn [{:keys [x y]}] (+ x y)) })

To offload outputs between activities, use the built-in offloading interceptors.

Interceptors

Stepwise support interceptors to the activity handlers.

Example

(require '[stepwise.core :as stepwise])

(stepwise/ensure-state-machine :adder
                               {:start-at :add
                                :states   {:add {:type     :task
                                                 :resource :activity/add
                                                 :end      true}}})
=> "arn:aws:states:us-west-2:123456789012:stateMachine:adder"

(stepwise/start-workers! {:activity/add
                            {:handler-fn (fn [{:keys [x y]}] (+ x y))
                             :interceptors [[:inc-x {:enter (fn [ctx] (update-in ctx [:request :x] inc))}]]}})
=> ...

(stepwise/start-execution!! :adder {:input {:x 1 :y 1}})
=>
{:input             {:x 1 :y 1}
 :output            3
 :state-machine-arn "arn:aws:states:us-west-2:123456789012:stateMachine:adder"
 :start-date        #inst"2017-06-20T22:48:14.241-00:00"
 :stop-date         #inst"2017-06-20T22:48:14.425-00:00"
 :arn               "arn:aws:states:us-west-2:123456789012:execution:adder:98bc5005-4d3d-4a7f-9a34-13ac7f1c0891"
 :status            "SUCCEEDED"
 :name              "98bc5005-4d3d-4a7f-9a34-13ac7f1c0891"}

(stepwise/shutdown-workers *2)
=> #{:done}

Built-In Interceptors

send-heartbeat-every-n-seconds-interceptor

A stepwise.interceptors/send-heartbeat-every-n-seconds-interceptor is provided for use with the :heartbeat-seconds task setting. For example,

(stepwise/ensure-state-machine :adder
                               {:start-at :add
                                :states   {:add {:type     :task
                                                 :resource :activity/add
                                                 :heartbeat-seconds 3
                                                 :end      true}}})
(stepwise/start-workers!
  {:activity/add {:handler-fn (fn [{:keys [x y]}] (Thread/sleep 10000) (+ x y))
                  :interceptors [(stepwise.interceptors/send-heartbeat-every-n-seconds-interceptor 1)]}})

offload-select-keys-to-s3-interceptor

Offload your activity output to S3. Any subsequent Stepwise activity would load them back for you automatically.

Specify the keys that you want to offload to S3 like so,

(stepwise/start-workers!
  {:activity/add {:handler-fn (fn [{:keys [x y]}] {:z (+ x y)})
                  :interceptors [(stepwise.interceptors/offload-select-keys-to-s3-interceptor "my-bucket-name" [:z])]}})

Note that the output needs to be a map for this interceptor to offload the values. Any non-map output would be ignored.

offload-all-keys-to-s3-interceptor

Similar to offload-select-keys-to-s3-interceptor but instead of selecting keys, this would offload all the output values.

(stepwise/start-workers!
  {:activity/add {:handler-fn (fn [{:keys [x y]}] {:z (+ x y)})
                  :interceptors [(stepwise.interceptors/offload-all-keys-to-s3-interceptor "my-bucket-name")]}})

Note that the output needs to be a map for this interceptor to offload the values. Any non-map output would be ignored.

License

Copyright © 2017-2021 UW Medicine Center For Precision Diagnostics, Motiva A.I.

Distributed under the Eclipse Public License, the same as Clojure.

stepwise's People

Contributors

p14n avatar quantisan avatar spieden avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

Forkers

p14n aamedina

stepwise's Issues

statecharts <> Step Functions?

perhaps we can abstract out AWS State Language entirely and use statecharts instead? Is the cost / benefit justifiable?

https://github.com/davidkpiano/xstate via https://github.com/lucywang000/clj-statecharts (only borrow the DSL part and not the underlying engine)

One question is that since this eventually goes to Step Functions, leaving it as close to AWS's implementation as possible has its benefits. e.g. being able to refer to AWS SFN documentations.

At the moment I'm seeing this possibly as a separate library that uses Stepwise under the hood.

Ops improvements

Collecting potential Ops improvements here:

  • add store_test_results step to allow better insight in failed tests
  • remove :gpg and motiva-ai have a type mismatch merging profiles warning
    ...

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.