
Comments (18)

thantos avatar thantos commented on August 20, 2024 1

Totally agree. From chats with people, timeout and heartbeat are important parts of long-running workflows, and supporting them would make the service look more legit/complete. Because timers are implemented from sleep, it is easy to create timeouts now, and the only new part about heartbeat is adding the client operation.

Will start to work on this and if it proves to be high effort, will push off.

from eventual.

thantos avatar thantos commented on August 20, 2024

Tech Design

  1. Orchestrator
    1. Activity Scheduled
    2. Start Timeout Timer (if configured)
    3. Wait - ActivityCompleted, ActivityFailed, ActivityHeartbeat, ActivityHeartbeatTimedOut, ActivityTimedOut
  2. Activity Worker
    1. Activity Worker Locks Activity
    2. Start Heartbeat Timer (if configured)
    3. Activity Handler Invoked
    4. If the Handler returns a value - push complete event to workflow queue
    5. If the Handler returns an AsyncToken - do nothing
  3. On Activity Heartbeat Call
    1. Send ActivityHeartbeat event to the workflow
  4. On client.completeActivity(...)
    1. Send ActivityCompleted to workflow queue
  5. On client.failActivity(...)
    1. Send ActivityFailed to workflow queue
  6. On client.heartbeatActivity(...)
    1. Send ActivityHeartbeat to workflow queue
  7. Orchestrator Wakes Up With...
    1. ActivityCompleted - if the activity has not previously timed out, completed, or failed - return result, else ignore
    2. ActivityFailed - if the activity has not previously timed out, completed, or failed - throw error in the workflow, else ignore
    3. ActivityHeartbeat - Create an ExtendHeartbeat command, unless the activity is completed, failed, or timed out.
    4. ActivityHeartbeatTimedOut - If the activity is completed, failed, or timed out, ignore. Otherwise throw heartbeat error and fail, unless a heartbeat event was received after (timestamp - heartbeat timeout).
    5. ActivityTimedOut - Throw timeout error and fail if not completed, failed, or timed out.
  8. On ExtendHeartbeat command
    1. TimerClient.updateTimer() - new API which tries to update a Schedule or creates a new SQS message.
  • What is the heartbeat timer? Use the TimerClient (EB Scheduler + SQS).
  • How do we handle the repeating nature? Create a one-time timer. After each Heartbeat, create an ExtendHeartbeat command which updates the timer or creates a new heartbeat event.
  • When do we delete the heartbeat timer? Don't for now, the extra event will be ignored.
  • What happens if a Heartbeat timeout makes it to the workflow, but there has been a heartbeat recently? The orchestrator will filter out HeartbeatTimeout events that happen within X time of the last heartbeat.
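The "Orchestrator Wakes Up With..." rules above can be sketched as a small state transition function. This is only an illustration under stated assumptions: the event names come from the design, but the field names, the `ActivityState` shape, the `Command` type, and `applyEvent` itself are hypothetical and not part of eventual's API.

```typescript
// Hypothetical shapes: event names follow the design above, fields are assumed.
type ActivityEvent =
  | { type: "ActivityCompleted"; timestamp: number; result: unknown }
  | { type: "ActivityFailed"; timestamp: number; error: string }
  | { type: "ActivityHeartbeat"; timestamp: number }
  | { type: "ActivityHeartbeatTimedOut"; timestamp: number }
  | { type: "ActivityTimedOut"; timestamp: number };

type ActivityState =
  | { status: "pending"; lastHeartbeat?: number }
  | { status: "completed"; result: unknown }
  | { status: "failed"; error: string }
  | { status: "timedOut" };

type Command = { kind: "ExtendHeartbeat" };

interface Step {
  state: ActivityState;
  commands: Command[];
}

function applyEvent(
  state: ActivityState,
  event: ActivityEvent,
  heartbeatTimeoutMs: number
): Step {
  // Once the activity is completed, failed, or timed out, every later event is ignored.
  if (state.status !== "pending") {
    return { state, commands: [] };
  }
  switch (event.type) {
    case "ActivityCompleted":
      return { state: { status: "completed", result: event.result }, commands: [] };
    case "ActivityFailed":
      return { state: { status: "failed", error: event.error }, commands: [] };
    case "ActivityHeartbeat":
      // Record the heartbeat and ask for the one-time timer to be pushed out.
      return {
        state: { status: "pending", lastHeartbeat: event.timestamp },
        commands: [{ kind: "ExtendHeartbeat" }],
      };
    case "ActivityHeartbeatTimedOut": {
      // Filter out a timeout that fired even though a heartbeat arrived recently.
      const recent =
        state.lastHeartbeat !== undefined &&
        state.lastHeartbeat > event.timestamp - heartbeatTimeoutMs;
      return recent
        ? { state, commands: [] }
        : { state: { status: "failed", error: "HeartbeatTimeout" }, commands: [] };
    }
    case "ActivityTimedOut":
      return { state: { status: "timedOut" }, commands: [] };
  }
}
```

Because stale timer events are simply ignored, this shape is consistent with the choice above not to delete the heartbeat timer for now.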


thantos avatar thantos commented on August 20, 2024

More details on timeouts for all workflows here: #63


sam-goodwin avatar sam-goodwin commented on August 20, 2024

What is deciding that the activity is async here? The declaration or the implementation?

act1 = activity<{ result: string }>({
   heartbeat: { seconds: number },
   timeout: { seconds: number }
}, async (context: Context): Promise<{ result: string } | AsyncToken> => {
   // ...doSomeWork...

   await sendToQueue({ token: context.activity.token });

   return makeAsync();
})


sam-goodwin avatar sam-goodwin commented on August 20, 2024

What's the use case for heartbeat?


cfraz89 avatar cfraz89 commented on August 20, 2024

Just spitballing, would a builder pattern make it more ergonomic?

act1 = ActivityBuilder({heartbeat: {seconds: 20}, timeout: {seconds: 20}})
  .activity(async (context: Context): Promise<{ result: string } | AsyncToken> => {
   // ...doSomeWork...

   await sendToQueue({ token: context.activity.token });

   return makeAsync();
})


sam-goodwin avatar sam-goodwin commented on August 20, 2024

I really don't like builder patterns for constructing a function. Bottom layer should be pure and a builder can always be put on top.

Another consideration is how we use the activity/workflow functions for heuristics in the transformer.


thantos avatar thantos commented on August 20, 2024

What is deciding that the activity is async here? The declaration or the implementation?

act1 = activity<{ result: string }>({
   heartbeat: { seconds: number },
   timeout: { seconds: number }
}, async (context: Context): Promise<{ result: string } | AsyncToken> => {
   // ...doSomeWork...

   await sendToQueue({ token: context.activity.token });

   return makeAsync();
})

The activity decides that it needs to be async, and a single activity can support both patterns (return sync when possible and go async when necessary).

The workflow decides how long it is willing to wait for the activity to complete.
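A minimal sketch of how a worker could support both patterns, assuming the handler signals "go async" by returning a marker object. `AsyncToken` here is a stand-in class, and `toWorkflowEvent`, `runActivity`, and the `WorkflowEvent` shape are hypothetical names for illustration. Reporting a thrown error as ActivityFailed is also an assumption, not something stated in the thread.

```typescript
// Stand-in marker: returning an instance means "the result will arrive later".
class AsyncToken {
  constructor(readonly token: string) {}
}

// Hypothetical events the worker pushes to the workflow queue.
type WorkflowEvent =
  | { type: "ActivityCompleted"; result: unknown }
  | { type: "ActivityFailed"; error: string };

// Decide what, if anything, to report for a handler's return value.
function toWorkflowEvent(outcome: unknown): WorkflowEvent | undefined {
  if (outcome instanceof AsyncToken) {
    // Async path: do nothing now; completeActivity/failActivity is called later.
    return undefined;
  }
  // Sync path: the returned value is the activity result.
  return { type: "ActivityCompleted", result: outcome };
}

async function runActivity(
  handler: () => unknown | Promise<unknown>,
  workflowQueue: WorkflowEvent[]
): Promise<void> {
  try {
    const event = toWorkflowEvent(await handler());
    if (event) workflowQueue.push(event);
  } catch (err) {
    // Assumption: a thrown error is reported as a failure.
    const error = err instanceof Error ? err.message : String(err);
    workflowQueue.push({ type: "ActivityFailed", error });
  }
}
```

The workflow side does not need to know which path was taken: it only waits for a completion or failure event, bounded by its own timeout and heartbeat settings.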

Controls the Workflow Has:

  1. Heartbeat - report back every X or fail
  2. Timeout - finish within X or fail

Controls the Activity Has:

  1. Return Sync or Async
  2. Succeed or Fail
  3. Use Heartbeat to store checkpoints
  4. Use heartbeat to determine if the workflow is still alive

An abstraction would be to support activities that are explicitly async from the workflow like Step Functions does, but it would be basically the same under the hood.

workflow(async () => {
   await asyncEventActivity((token) => {}); // create an event which contains a token and waits on the response
});
// or maybe a special activity type?
const myActivity = eventActivity<string>((token, input) => ({ type: "myEvent", token, input }));

And then the other way to do it would be like SFN's Activities which provide a queue to poll on from anywhere.

Which again could just be a special activity type that is called by the workflow like any other activity.

const myActivity = queueActivity<string>(myQueue); // a queue that contains activity requests to resolve using the token they contain.


thantos avatar thantos commented on August 20, 2024

What's the use case for heartbeat?

Heartbeat is important in durable systems. Let's say you have a long-running activity that may take up to a week, so you set its timeout to 2 weeks just in case. That means if something goes wrong and the message is lost, the workflow won't wake up for 2 weeks just to find it failed. Now you could set an hourly or daily heartbeat, which allows the activity's system to report back to the workflow to say it is still alive.
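To put numbers on that, here is a small sketch that computes when the workflow would notice a dead activity, with and without a heartbeat. `failureDetectedAt` and its parameters are hypothetical names used only for this arithmetic.

```typescript
const HOUR = 60 * 60 * 1000;
const DAY = 24 * HOUR;

// Returns the time (ms since the activity started being tracked) at which the
// workflow learns that an activity has gone silent.
function failureDetectedAt(
  startedAt: number,
  lastHeartbeatAt: number | undefined,
  timeoutMs: number,
  heartbeatMs?: number
): number {
  const timeoutDeadline = startedAt + timeoutMs;
  if (heartbeatMs === undefined) {
    // No heartbeat configured: only the overall timeout can wake the workflow.
    return timeoutDeadline;
  }
  // Each heartbeat pushes the heartbeat deadline out by one interval.
  const heartbeatDeadline = (lastHeartbeatAt ?? startedAt) + heartbeatMs;
  return Math.min(timeoutDeadline, heartbeatDeadline);
}

// The activity's last sign of life is at hour 3, then its message is lost.
const withoutHeartbeat = failureDetectedAt(0, undefined, 14 * DAY);
const withHeartbeat = failureDetectedAt(0, 3 * HOUR, 14 * DAY, HOUR);

console.log(withoutHeartbeat / DAY); // 14 (days)
console.log(withHeartbeat / HOUR); // 4 (hours)
```

With a 2 week timeout alone the failure surfaces after 14 days; with an hourly heartbeat it surfaces one interval after the last heartbeat.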

Yehuda expressed how important this is in his systems when long-running processes are involved.

From Temporal's Docs:

An Activity Heartbeat is a ping from the Worker that is executing the Activity to the Temporal Cluster. Each ping informs the Temporal Cluster that the Activity Execution is making progress and the Worker has not crashed.
For long-running Activities, we recommend using a relatively short Heartbeat Timeout and a frequent Heartbeat. That way if a Worker fails it can be handled in a timely manner.

Step Functions:

It's a good practice to set a timeout value and a heartbeat interval for long-running activities. This can be done by specifying the timeout and heartbeat values, or by setting them dynamically.


thantos avatar thantos commented on August 20, 2024

Use Cases:

  1. Health Ping from Activity to Workflow
  2. Activity checking if the workflow is still alive - Future use case
  3. Checkpointing (activity can save partial data in the heartbeat) - Future use case


sam-goodwin avatar sam-goodwin commented on August 20, 2024

I could have been clearer, I do know why they are important. Just not sure why it's important right now.


thantos avatar thantos commented on August 20, 2024

Yehuda will ask about them and I think we can get the basic impl done quickly.


sam-goodwin avatar sam-goodwin commented on August 20, 2024

Yehuda will ask about them and I think we can get the basic impl done quickly.

So low effort, high ROI? Sounds good. Let's try to think of some examples when we implement it and add them to the test app?

I may be being pedantic, just trying to learn the lesson of functionless and focus on examples and features, not just features.


thantos avatar thantos commented on August 20, 2024

Was looking at how to avoid the context argument.

Option 1: context method

activity(async (...args) => {
    const { asyncToken } = getActivityContext(); // fails when called from outside of an activity.
    await sqs.send(new SendMessageCommand(... token ... ));
    return makeAsync();
});

Option 2: token is provided by the makeAsync function via a callback.

activity((...args) => {
    return makeAsync(async (token) => {
       await sqs.send(new SendMessageCommand(... token ... ));
    });
});

Option 3: context parameter

activity(async (...args, context) => {
    await sqs.send(new SendMessageCommand(... context.asyncToken ... ));

    return makeAsync();
})
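For Option 1, one way a `getActivityContext()` call could find its context without a parameter, and fail outside of an activity, is Node's `AsyncLocalStorage`. This is a sketch under assumptions: the `ActivityContext` shape and the `invokeActivity` wrapper are hypothetical, and nothing in the thread says this is how it would be implemented.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical context shape; only asyncToken is mentioned in the options above.
interface ActivityContext {
  asyncToken: string;
}

const activityStorage = new AsyncLocalStorage<ActivityContext>();

// Option 1: a context method that only works while an activity is running.
function getActivityContext(): ActivityContext {
  const context = activityStorage.getStore();
  if (!context) {
    throw new Error("getActivityContext() must be called from within an activity");
  }
  return context;
}

// Hypothetical worker-side wrapper: runs the handler with the context bound,
// including across any awaits inside an async handler.
function invokeActivity<T>(context: ActivityContext, handler: () => T): T {
  return activityStorage.run(context, handler);
}

const token = invokeActivity({ asyncToken: "abc" }, () => getActivityContext().asyncToken);
console.log(token); // "abc"
```

The trade-off raised in the replies still applies: a context parameter is visible in the signature, while this approach hides the dependency behind a global function.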


sam-goodwin avatar sam-goodwin commented on August 20, 2024

What's wrong with the context parameter? I think we wanted to update activities to only allow a single input argument just like a workflow so that it aligns with a lambda contract. There was another reason I think too, but can't remember.

Context argument is preferable because it's discoverable.


sam-goodwin avatar sam-goodwin commented on August 20, 2024

While writing the documentation, I found myself confused about why heartbeat is a global. Have we closed on our decision to change activities to be single argument only and then add a context parameter? We could then provide the heartbeat function on the context parameter instead.


thantos avatar thantos commented on August 20, 2024

How do you decide what is a context method and what is an intrinsic? Is the difference that heartbeat is specific to activities (and systems acting on behalf of an async activity)?

Would we apply the same logic to workflow only things, sleep, Promise.*, signal, etc?

Heartbeat for an activity can be done by anything with access to the token. I see it as the same as completeActivity and failActivity, an operation performed by an activity or by something acting as an activity. For example, when an activity is async, a workflow, event handler, or some random lambda using the client will need to call heartbeat.

Options:

  1. No intrinsic - Move all intrinsics (sleep, heartbeat, etc) to their respective objects and/or context variables
  2. Move only heartbeat for activities to context (add activity.heartbeat, keep workflowclient.heartbeatActivity)
  3. Rename to heartbeatActivity, add intrinsic for completeActivity and failActivity.


sam-goodwin avatar sam-goodwin commented on August 20, 2024

Was building something today and found myself really wanting a context variable in an activity so I can get the execution ID without having to explicitly pass it through from the workflow.

