flow

A small library for batch computations.

flow grew out of billing projects in which the same task-coordination logic was repeatedly rewritten and copied between codebases, and in which workflows destined for AWS Lambda also had to be testable locally.

Note: This project is experimental. Use with extreme caution.

Concepts

  • task: A task is a unit of work. Tasks in flow are described by an object with the following fields:

    • id: Unique identifier for the task.
    • name: Name of the task, used to determine a handler for the task.
    • input: Data to be processed.
    • output: Result of processing.
    • dependencies: Tasks that must be complete before this task.
    • dependents: Tasks that should be started once this task is complete.
    • status: Status of the task, either "pending" or "complete".
    • statusUpdatedAt: When the task's status was last updated. This can be used to determine whether a task has timed out.
  • handler: A handler is a function to process tasks with a certain name. Handlers are given the following arguments:

    • input: Data to be processed.
    • task: Task being processed.
    • medium: Reference to the medium the flow is running through.

    A number of handler decorators with specific purposes are provided:

    • compute: The return value of a compute handler is written as the task output, and dependent tasks are started.
    • graph: Graph handlers are invoked with a task graph instance, through which the handler can define more tasks to be run to compute an output. The return value of a graph handler is ignored; instead, metadata is injected into the sink of the graph so that the task output can be updated later.
    • sink: Sink handlers are invoked with an array of the outputs of the task's dependencies. They also handle metadata from graph tasks to redirect output to a source task.
  • task graph: A task graph defines tasks to be executed, along with the dependencies between them.

  • medium: A 'medium' is a combination of storage and executor, used to run a flow handler. The medium provides the API through which handlers can store output, start dependents or get the status of other tasks.

    • storage: The 'storage' aspect of the medium defines how task definitions are persisted. In particular it defines how to retrieve a persisted task, how to persist a new task, how to update a task's output, etc.
    • executor: The 'executor' aspect of the medium defines how task handlers are invoked. In particular it defines how to start a task with a given ID and name, and how to execute a task.
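
The concepts above can be illustrated with a small, self-contained sketch. It is not flow's actual API: `createMemoryMedium`, `compute`, and every method name below are hypothetical, chosen only to mirror the descriptions above (a task record, a handler receiving `input`, `task` and `medium`, and a medium that pairs storage with an executor). It is synchronous for brevity.

```javascript
// Hypothetical sketch; none of these names are taken from flow's real API.

// A medium pairs a storage aspect (here a Map) with an executor aspect
// (here a direct function call).
function createMemoryMedium(handlers) {
  const tasks = new Map();
  const medium = {
    // Storage: persist, retrieve, and update tasks.
    putTask(task) {
      tasks.set(task.id, { status: 'pending', dependencies: [], dependents: [], ...task });
    },
    getTask(id) {
      return tasks.get(id);
    },
    setOutput(id, output) {
      Object.assign(tasks.get(id), { output, status: 'complete', statusUpdatedAt: Date.now() });
    },
    // Executor: invoke the handler registered for the given task name.
    startTask(id, name) {
      const task = tasks.get(id);
      return handlers[name](task.input, task, medium);
    },
  };
  return medium;
}

// A compute-style decorator: the return value becomes the task output,
// then any dependent whose dependencies are all complete is started.
function compute(fn) {
  return (input, task, medium) => {
    medium.setOutput(task.id, fn(input, task, medium));
    for (const id of task.dependents) {
      const dependent = medium.getTask(id);
      const ready = dependent.dependencies.every(
        (dep) => medium.getTask(dep).status === 'complete',
      );
      if (ready) medium.startTask(dependent.id, dependent.name);
    }
  };
}

const medium = createMemoryMedium({
  double: compute((input) => input * 2),
  describe: compute((input, task, m) => `${input}: ${m.getTask(task.dependencies[0]).output}`),
});
medium.putTask({ id: 'a', name: 'double', input: 21, dependents: ['b'] });
medium.putTask({ id: 'b', name: 'describe', input: 'answer', dependencies: ['a'] });
medium.startTask('a', 'double');
console.log(medium.getTask('b').output); // prints "answer: 42"
```

Swapping the `Map` for a database table and the direct call for a Lambda invocation would, under these assumptions, change only the medium and leave the handlers untouched, which is the separation the storage/executor split describes.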

flow's People

Contributors

connec, ewencochran-cr, kierandoonan, laura-tamm, sftrabbit

flow's Issues

Passing inputs to default task handler

Copied from Jira ticket PDEV-1936. The suggestion from @sftrabbit is:

Context

Flow tasks get their input from storage by fetching the row with the corresponding task ID. The default task is usually executed by passing no task ID and the task name "default", and nothing gets fetched from storage. Therefore, there is no way to pass input to the default task (without some shenanigans).

Done when

There is some clear and concise mechanism for passing input to the default task. Perhaps the task handler takes an optional third parameter that overrides task input if there is no task ID? Not sure.
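
One possible reading of this suggestion is sketched below. It is an illustration only: `executeTask`, its parameter order, and the `Map`-backed storage are hypothetical and do not reflect how flow currently invokes handlers.

```javascript
// Hypothetical sketch of the proposal; all names here are assumptions.
function executeTask(handlers, storage, name, id, inputOverride) {
  // With a task ID, the task (and its input) comes from storage as before.
  // Without one (the "default" case), the explicitly passed input is used.
  const task = id !== undefined
    ? storage.get(id)
    : { name, input: inputOverride };
  return handlers[name](task.input, task);
}

const handlers = { default: (input) => `started with ${input}` };
console.log(executeTask(handlers, new Map(), 'default', undefined, 'invoice-batch'));
// prints "started with invoice-batch"
```

In this sketch the override is ignored whenever a task ID is present, so stored input stays the single source of truth for ordinary tasks.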

HTTP executor - different URLs per task name

Copied from Jira ticket PDEV-1940. The proposal from @sftrabbit is that:

  • You can provide the HTTP executor an object mapping task names to URLs.
  • When starting a task with a particular name, it sends the request to the corresponding URL.
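
A rough sketch of how such a mapping might behave follows. `createHttpExecutor`, the request body shape, and the injected `send` function are all hypothetical; the URLs are placeholders. Injecting `send` keeps the sketch runnable without a network, and a real version would presumably wrap an HTTP client there.

```javascript
// Hypothetical sketch; the executor shape and request body are assumptions.
function createHttpExecutor(urlsByTaskName, send) {
  return {
    startTask(id, name) {
      // Look up the URL configured for this task name.
      if (!Object.prototype.hasOwnProperty.call(urlsByTaskName, name)) {
        throw new Error(`No URL configured for task name "${name}"`);
      }
      return send(urlsByTaskName[name], { id, name });
    },
  };
}

// Usage with a recording stand-in for the HTTP client.
const sent = [];
const executor = createHttpExecutor(
  { default: 'https://example.com/default', report: 'https://example.com/report' },
  (url, body) => { sent.push({ url, body }); },
);
executor.startTask('t1', 'report');
console.log(sent[0].url); // prints "https://example.com/report"
```

Failing loudly on an unmapped task name is a design choice of this sketch; falling back to a single default URL would be an equally plausible behaviour.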

Use async/await in Flow

Copied from Jira ticket PDEV-1978. The suggestion from @sftrabbit is that:

  • All uses of .then or other Promise-chaining shenanigans are replaced by appropriate use of async and await.
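
To show the kind of change this implies, here is the same operation written both ways. The functions and the `storage` interface (`getTask`, `putTask`, both returning Promises) are hypothetical and not taken from flow's code.

```javascript
// Hypothetical example; `storage.getTask` and `storage.putTask` are assumed
// to return Promises.

// Before: Promise chaining with .then.
function completeTaskChained(storage, id, output) {
  return storage.getTask(id)
    .then((task) => storage.putTask({ ...task, output, status: 'complete' }))
    .then(() => id);
}

// After: the same steps with async/await.
async function completeTaskAwaited(storage, id, output) {
  const task = await storage.getTask(id);
  await storage.putTask({ ...task, output, status: 'complete' });
  return id;
}
```

Both versions return a Promise that resolves to the task ID, so callers would not need to change; only the body becomes linear.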
