alexcasalboni / aws-lambda-power-tuning

AWS Lambda Power Tuning is an open-source tool that can help you visualize and fine-tune the memory/power configuration of Lambda functions. It runs in your own AWS account - powered by AWS Step Functions - and it supports three optimization strategies: cost, speed, and balanced.

License: Apache License 2.0

JavaScript 85.13% Shell 1.09% HCL 7.23% C# 1.73% TypeScript 1.11% Python 1.59% Batchfile 0.22% Go 1.89%
aws aws-lambda cloud cost lambda performance serverless stepfunctions

aws-lambda-power-tuning's Introduction

Hi there 👋

Alex's GitHub stats

Interested in serverless cost/performance optimization? Check this out:

AWS Lambda Power Tuning

aws-lambda-power-tuning's People

Contributors

alexcasalboni, andrestoll, andybkay, arishlabroo, bobsut, claudiopastorini, cledevedec, clementmarcilhacy, clete2, dependabot[bot], dz902, ellisms, fhightower, gino247, gliptak, grzegorzpapkala, hscheib, lavanya0513, ldcorentin, matteo-ronchetti, mettke, neiljed, parro, rrhodes, smosek, tam-alex, teknogeek0, tljdebrouwer, tonysherman


aws-lambda-power-tuning's Issues

Make Node.js 10.x runtime an option

As of May 15, 2019, AWS supports Node.js 10.x in Lambda. It would be nice to have an option to run the power-tuning Lambdas on the same runtime as the function being tuned.

Link to the announcement: AWS What's New

Implement dynamic parallelism

This new feature will make Lambda Power Tuning much more flexible: https://aws.amazon.com/blogs/aws/new-step-functions-support-for-dynamic-parallelism/

Now you can't easily test new memory configurations for each state machine execution, as memory configurations are hard-coded in the state machine structure.

With dynamic parallelism, we'll be able to provide a list of memory configurations as input and dynamically test only those configurations (without any deploy-time parameter).
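With dynamic parallelism, the hard-coded branches could collapse into a single Map state that iterates over the input's power values. An illustrative Amazon States Language sketch (not the project's actual definition; state names and the Executor ARN are placeholders):

```json
{
  "PowerTuning": {
    "Type": "Map",
    "ItemsPath": "$.powerValues",
    "MaxConcurrency": 0,
    "Iterator": {
      "StartAt": "Executor",
      "States": {
        "Executor": {
          "Type": "Task",
          "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:executor",
          "End": true
        }
      }
    },
    "Next": "Cleaner"
  }
}
```

With this shape, changing the tested memory configurations only requires changing the execution input, not redeploying the state machine.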

--dry-run

Per discussion in #37 (comment)

This might be overkill, because num=1 would cover it, but having a way to verify that the user's setup is correct could benefit a lot of people.

I personally always run dry runs when they exist, especially when I'm a beginner in an area and/or the stakes of a mistake are high.

Provide a context to Lambda function when running the lambda power tuning

The lambda function that is being tuned requires a context.

How can we pass the context to this lambda function when running the tuning step function using the following:
{
  "lambdaARN": "lambda ARN",
  "powerValues": [128, 256],
  "num": 5,
  "payload": {},
  "parallelInvocation": true,
  "strategy": "balanced"
}
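For reference, a context cannot be supplied through this input: the Lambda runtime generates a fresh context object for every invocation, and only the payload field reaches the tuned function, as its event argument. A minimal sketch (hypothetical handler, not part of this repo):

```javascript
// Sketch: the tuning input's "payload" arrives as `event`; `context` is
// runtime-provided (functionName, awsRequestId, ...) and cannot be injected.
const handler = async (event, context) => {
    // event  -> the "payload" object from the state machine input
    // context -> generated by the Lambda runtime on each invocation
    return { received: event, fn: context.functionName };
};
```

If the tuned function needs specific data, it has to travel inside the payload (the event), not the context.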

Cannot add memory option

When I deployed via the console, I added a 2048MB option to the comma-separated list:

image

But the state machine doesn't reflect this:

image

Is it supposed to work? (I thought you were doing something along the lines of deploying a custom resource and then using it in the same stack, like what SLS does for EventBridge.)

Reset $LATEST memory configuration after state machine execution

@alexcasalboni Great job on the tool, very handy & useful!

I ran the test on my Lambda function with all possible memory settings. Initially, my function had 512 MB assigned. After the tests completed (I confirmed that Cleaner & Finalizer are green), the function was left with 3008 MB. I also checked that the versions created during the test were removed (which is expected), but the memory stayed at the maximum value used during the tests.

Is this expected?

Thanks!

Environment variable minRAM must contain string

July 4, 2018
serverless/serverless#5094 (comment)
After updating today to 1.28.0, Serverless (or a dependency) now expects all environment variables to be strings. This sounds reasonable, but it's a breaking change so I'm making people aware.

Serverless: Excluding development dependencies...

  Serverless Error ---------------------------------------

  Environment variable minRAM must contain string

  Get Support --------------------------------------------
     Docs:          docs.serverless.com
     Bugs:          github.com/serverless/serverless/issues
     Issues:        forum.serverless.com

  Your Environment Information -----------------------------
     OS:                     win32
     Node Version:           8.11.1
     Serverless Version:     1.28.0

Manually changing the values in serverless.base.yml to strings fixes the issue:

    minRAM: '128'
    minCost: '0.000000208'

Use a unique payload for every run

Hi @alexcasalboni

#85 issue didn't seem to convey my problem properly.

Currently, when 6 powerValues are given and num is 5, the Lambda function is invoked 30 times in total.
If the function deletes one record whose ID is passed in the payload on each invocation, you need at least as many distinct payloads as total invocations.

However, under the current specification, you cannot provide more payloads than the value of num.

As a result, once the first powerValue finishes, the record specified in the payload has already been deleted, so the ID no longer exists when the second and subsequent powerValues run. The function no longer performs its real work, and the required power value cannot be determined.

To solve this, the payload mechanism needs to accept at least as many payloads as the total number of executions.

Alternatively, a feature that lets you specify a Lambda function to run before and after each execution would make a single payload sufficient. I think such a hook would be very useful.

Multiple optimization strategies

As mentioned in #30, new optimization strategies could be more useful in specific use cases.

The default strategy could remain cost, but a few more can be implemented.

The second most straightforward strategy is speed, and we should implement it in a way that makes it easy for new contributors to add new strategies.

The Finalizer function takes all the statistics as input and will return the optimal configuration, so everything can be implemented there.
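To make the idea concrete, here is an illustrative sketch of how a Finalizer could pick the optimal configuration per strategy (function and field names are assumptions, not the project's actual code):

```javascript
// Each stat entry is assumed to look like:
// { value, averagePrice, averageDuration }
function findOptimal(stats, strategy = 'cost') {
    const comparators = {
        cost: (a, b) => a.averagePrice - b.averagePrice,
        speed: (a, b) => a.averageDuration - b.averageDuration,
    };
    const cmp = comparators[strategy];
    if (!cmp) throw new Error(`Unknown strategy: ${strategy}`);
    // sort a copy so the caller's array is left untouched
    return [...stats].sort(cmp)[0];
}
```

Adding a new strategy then amounts to registering one more comparator, which keeps the contribution surface small.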

InvalidParameterValueException on v3.2.3

Just deployed the latest v3.2.3 and every time I run the new state machine I get the following error:

"error": {
    "Error": "InvalidParameterValueException",
    "Cause": "{\"errorType\":\"InvalidParameterValueException\",\"errorMessage\":\"The role defined for the function cannot be assumed by Lambda.\",\"trace\":[\"InvalidParameterValueException: The role defined for the function cannot be assumed by Lambda.\",\"    at Object.extractError (/var/runtime/node_modules/aws-sdk/lib/protocol/json.js:51:27)\",\"    at Request.extractError (/var/runtime/node_modules/aws-sdk/lib/protocol/rest_json.js:55:8)\",\"    at Request.callListeners (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:106:20)\",\"    at Request.emit (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:78:10)\",\"    at Request.emit (/var/runtime/node_modules/aws-sdk/lib/request.js:683:14)\",\"    at Request.transition (/var/runtime/node_modules/aws-sdk/lib/request.js:22:10)\",\"    at AcceptorStateMachine.runTo (/var/runtime/node_modules/aws-sdk/lib/state_machine.js:14:12)\",\"    at /var/runtime/node_modules/aws-sdk/lib/state_machine.js:26:10\",\"    at Request.<anonymous> (/var/runtime/node_modules/aws-sdk/lib/request.js:38:9)\",\"    at Request.<anonymous> (/var/runtime/node_modules/aws-sdk/lib/request.js:685:12)\"]}"
  }
}

Optionally configure a single Lambda ARN at deploy-time

The current IAM statement for initializer, execution and optimizer looks like this:

          Statement:
            - Effect: Allow
              Action:
                - lambda:GetAlias
                - lambda:PublishVersion
                - lambda:UpdateFunctionConfiguration
                - lambda:CreateAlias
                - lambda:UpdateAlias
              Resource: '*'

I think some security-concerned users would rather avoid that Resource: '*'.

We could allow them to optionally configure a Lambda ARN (or prefix) so that these IAM policies are a bit more fine-grained.

Technically, this would be a CFN Parameter (e.g. lambdaResource), directly referenced via !Ref.

Feature Request - Leverage X-Ray to analyze different segments

I have a Lambda that makes an external network call as part of its execution, which is a variable I can't control. It would be awesome if this tool leveraged AWS X-Ray to report on the effect of memory tuning on the various segments of execution.

image

Questions this approach could answer:

  • How do different memory allocations affect my initialization segment? This is a primary factor in cold start times.
  • How do different memory allocations affect TLS connection setup times? I've heard the effect can be significant; it would be helpful to quantify it.

Restrict Lambda IAM permissions

The current role has full access to AWS Lambda:

iamRoleStatements:
    - Effect: Allow
      Action:
        - 'lambda:*'
      Resource: '*'

Since we want the lambdaARN to be given at runtime, we can't really restrict the Resource parameter. We could restrict the set of actions, though. Also, experienced users can always force Resource to be the Lambda Function(s) they want to optimize.

As far as actions are concerned, Initializer, Executor, Finalizer and Cleaner need the following Lambda permissions (only 7 out of 28):

  • GetAlias
  • UpdateFunctionConfiguration
  • PublishVersion
  • DeleteFunction (always with Qualifier)
  • CreateAlias
  • DeleteAlias
  • Invoke
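Under these constraints, the statement could be narrowed to exactly those actions. A sketch (note that the invoke permission is named lambda:InvokeFunction in IAM; the Resource wildcard stays because lambdaARN is a runtime input):

```yaml
Statement:
  - Effect: Allow
    Action:
      - lambda:GetAlias
      - lambda:UpdateFunctionConfiguration
      - lambda:PublishVersion
      - lambda:DeleteFunction
      - lambda:CreateAlias
      - lambda:DeleteAlias
      - lambda:InvokeFunction
    Resource: '*'  # experienced users can narrow this to specific function ARNs
```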

ResourceNotFoundException: Functions from 'us-east-1' are not reachable in this region ('us-west-1')

It seems cross-region invocation is not available for Lambda, so the Step Function should be deployed in each region. Am I right, or is this just a code-level restriction that could be improved?

Full trace from CloudWatch Logs:

`START RequestId: c48d4975-d571-46d7-9835-9a7f84e0300f Version: $LATEST
2019-05-14T13:32:20.857Z c48d4975-d571-46d7-9835-9a7f84e0300f { ResourceNotFoundException: Functions from 'us-east-1' are not reachable in this region ('us-west-1')
at Object.extractError (/var/runtime/node_modules/aws-sdk/lib/protocol/json.js:48:27)
at Request.extractError (/var/runtime/node_modules/aws-sdk/lib/protocol/rest_json.js:52:8)
at Request.callListeners (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:105:20)
at Request.emit (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:77:10)
at Request.emit (/var/runtime/node_modules/aws-sdk/lib/request.js:683:14)
at Request.transition (/var/runtime/node_modules/aws-sdk/lib/request.js:22:10)
at AcceptorStateMachine.runTo (/var/runtime/node_modules/aws-sdk/lib/state_machine.js:14:12)
at /var/runtime/node_modules/aws-sdk/lib/state_machine.js:26:10
at Request. (/var/runtime/node_modules/aws-sdk/lib/request.js:38:9)
at Request. (/var/runtime/node_modules/aws-sdk/lib/request.js:685:12)
message: 'Functions from 'us-east-1' are not reachable in this region ('us-west-1')',
code: 'ResourceNotFoundException',
time: 2019-05-14T13:32:20.857Z,
requestId: 'b3418c32-764c-11e9-ab60-7505d4afde13',
statusCode: 404,
retryable: false,
retryDelay: 52.181729400208376 }
2019-05-14T13:32:20.932Z c48d4975-d571-46d7-9835-9a7f84e0300f Error: Interrupt
at /var/task/initializer.js:68:27
at
at process._tickDomainCallback (internal/process/next_tick.js:228:7)
2019-05-14T13:32:20.932Z c48d4975-d571-46d7-9835-9a7f84e0300f Error: Interrupt
at /var/task/initializer.js:68:27
at
at process._tickDomainCallback (internal/process/next_tick.js:228:7)
2019-05-14T13:32:20.932Z c48d4975-d571-46d7-9835-9a7f84e0300f Error: Interrupt
at /var/task/initializer.js:68:27
at
at process._tickDomainCallback (internal/process/next_tick.js:228:7)
2019-05-14T13:32:20.932Z c48d4975-d571-46d7-9835-9a7f84e0300f Error: Interrupt
at /var/task/initializer.js:68:27
at
at process._tickDomainCallback (internal/process/next_tick.js:228:7)
2019-05-14T13:32:20.932Z c48d4975-d571-46d7-9835-9a7f84e0300f Error: Interrupt
at /var/task/initializer.js:68:27
at
at process._tickDomainCallback (internal/process/next_tick.js:228:7)
2019-05-14T13:32:20.934Z c48d4975-d571-46d7-9835-9a7f84e0300f
{
"errorMessage": "Interrupt",
"errorType": "Error",
"stackTrace": [
"/var/task/initializer.js:68:27",
"",
"process._tickDomainCallback (internal/process/next_tick.js:228:7)"
]
}

END RequestId: c48d4975-d571-46d7-9835-9a7f84e0300f
REPORT RequestId: c48d4975-d571-46d7-9835-9a7f84e0300f Duration: 1510.06 ms Billed Duration: 1600 ms Memory Size: 128 MB Max Memory Used: 59 MB `

Cleanup after failed execution

I noticed that if a Lambda function runs long (e.g. 200 seconds) in the default non-parallel mode, the Executor will time out, because it runs the invocations in sequence.

Note: the documentation is wrong; the parameter is parallelInvocation, not enableParallel.

The failure results in the Executor exceeding its 300-second run time after the 2nd invocation. If you set num to 10, it would take 200 * 10 seconds to finish, which it never will.

In case of failure, it should still clean up the aliases/versions it created.

Also, the error message is unhelpful; it should say that the Executor timed out, or otherwise indicate that the person should run these in parallel mode.

Error

Lambda.Unknown
Cause

The cause could not be determined because Lambda did not return an error type.

Side note: it would be nice to finish the remaining configurations instead of cancelling on the first failure. If you have an array of power values from 128 upward and 128 is too small, it breaks the testing for all the rest.

Question around parallelInvocations

When I run the framework with parallelInvocation=false, the invocation times are great, in the millisecond range. But when we run with the same flag set to true, times are in the hundreds of seconds. Ours is a business-critical app and it needs to stay under 75 ms. Can you please explain a bit more how parallelInvocation is implemented and what happens under the hood? Though reviewing the code is easy, please share your thoughts and insights as well, so we can learn and adjust our code accordingly.

Document IAM role permissions

As mentioned in #5, the Resource attribute of the default IAM role could be restricted so that the state machine can only interact with the configured Lambda Function(s).

Resource is set to * by default because the original goal was to provide any lambdaARN at runtime.

We should document how to update such configuration manually, or eventually implement an additional parameter at generation-time.

The new parameter could look like this:

$ npm run generate -- -A ACCOUNT_ID -L arn:aws:lambda:*:*:function:MyFunctionName

Improve weighted payload logging in case of invocation error

I prepared a function for each CRUD operation on my data.
I wanted to collect statistics for the delete function, but the record referenced in the payload is deleted by the first execution, and an error occurs from the second invocation onward.
To avoid this, I prepared more records than the total number of executions and gave every payload a weight of 1, but the execution failed with an "Invalid payload weight (num is too small)" error.

Please tell me a good way to handle such a function.

Weighted optimization strategy

Based on @pavelloz's feedback in #31, we could have a configurable weight between speed and cost.

Since there are many different use cases and very subjective ways to optimize for cost vs. speed, such a weight would need to be very well documented, imho.

In the long-term, we might be able to "categorize" a given function into some sort of optimization class based on the speed-cost relation across memory configurations, and come up with a globally optimal strategy for each class.

FYI @matteo-ronchetti is already working on the first iteration of this :)

Add state machine invocation command

For now, you have to manually start the state machine and provide the correct input.

There should be a simple command that would take care of:

  • generate the input object based on user-provided params
  • start the state machine and monitor its status
  • fetch the state machine output and clearly visualize it
  • eventually, the script could also set the new power level to the optimal one (or reset it to the original value)

Use the correct regional base price

The base price for Lambda executions (128MB, 100ms) is $0.0000002083 in almost every region.

Here are the regions where the price is slightly different:

  • Hong Kong (ap-east-1): $0.0000002865 (+37%)
  • Cape Town (af-south-1): $0.0000002763 (+32%)
  • Bahrain (me-south-1): $0.0000002583 (+24%)

This difference should be considered in two places:

  • in the state machine output results.stateMachine.lambdaCost
  • in the visualization (charts)

We should update the utilities utils.computePrice(...) and utils.computeTotalCost(...), used by the Executor function.

Thanks #75 for bringing this up.
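A possible fix, sketched below with illustrative names (not the project's actual code): keep a small map of regional base prices and fall back to the common default.

```javascript
// Lambda base price for 128MB per 100ms billing unit.
const DEFAULT_BASE_PRICE = 0.0000002083;
const REGIONAL_BASE_PRICES = {
    'ap-east-1': 0.0000002865,  // Hong Kong
    'af-south-1': 0.0000002763, // Cape Town
    'me-south-1': 0.0000002583, // Bahrain
};

function lambdaBasePrice(region) {
    return REGIONAL_BASE_PRICES[region] || DEFAULT_BASE_PRICE;
}

function computePrice(region, memoryMB, billedMs) {
    // price scales linearly with memory; billing granularity was 100ms
    const billedUnits = Math.ceil(billedMs / 100);
    return lambdaBasePrice(region) * (memoryMB / 128) * billedUnits;
}
```

The same lookup would need to reach the visualization layer so the charts use the region's price as well.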

Is the relation between RAM configuration and cost per ms correct?

Hi Alex!

After some testing with the tool I have seen incorrect numbers on the graph.

I have compared the time/cost results of one Lambda execution in my region (EU, Ireland) with the values that appear when hovering over the graph. And...

Here my calculation:

  1. Data from AWS Lambda pricing (EU, Ireland):
  • RAM configuration: 128 MB
  • Cost per 100 ms: $0.0000002083
  • Cost per 1 ms: $0.0000000020830
  2. State machine execution graph results:
  • Size: 128 MB
  • Time: 366 ms
  • Cost: $0.00000083
    image

I guess the cost is incorrect because...

Cost with Lambda pricing (EU, Ireland): 366 ms * $0.0000000020830 per ms = $0.0000007623780

Also with other RAM configurations, the comparison with my calculations fails...

I don't know if I'm doing something wrong... It was only a doubt during my research about Lambda performance. 👍

Thank you in advance!
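The gap is most likely just billing granularity: at the time, Lambda billed in 100 ms increments, so a 366 ms execution is billed as 400 ms. A quick check using the base price quoted above:

```javascript
// 366ms rounds up to 4 billing units of 100ms each.
const basePrice = 0.0000002083;            // EU (Ireland), 128MB per 100ms
const billedUnits = Math.ceil(366 / 100);  // 4
const cost = billedUnits * basePrice;      // 0.0000008332, i.e. ~$0.00000083
```

This matches the value shown on the graph, so the tool appears to be using billed duration rather than raw duration.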

Invocation error with Cannot read property of undefined

Though the execution succeeded, it shows an error and an empty object in the output.

The input I passed is below, as mentioned in the guide (I changed the ARN to my Lambda's ARN).

{
  "lambdaARN": "your-lambda-function-arn",
  "powerValues": [128, 256, 512, 1024, 2048, 3008],
  "num": 10,
  "payload": {},
  "parallelInvocation": true,
  "strategy": "cost"
}

Should I pass a payload?

Here is the complete error:

{
"error": "Error",
"cause": {
"errorType": "Error",
"errorMessage": "Invocation error: {"errorType":"TypeError","errorMessage":"Cannot read property 'id' of undefined","trace":["TypeError: Cannot read property 'id' of undefined"," at Runtime.module.exports.get [as handler] (/var/task/todos/get.js:13:32)"," at Runtime.handleOnce (/var/runtime/Runtime.js:66:25)"]}",
"trace": [
"Error: Invocation error: {"errorType":"TypeError","errorMessage":"Cannot read property 'id' of undefined","trace":["TypeError: Cannot read property 'id' of undefined"," at Runtime.module.exports.get [as handler] (/var/task/todos/get.js:13:32)"," at Runtime.handleOnce (/var/runtime/Runtime.js:66:25)"]}",
" at /var/task/executor.js:114:19",
" at processTicksAndRejections (internal/process/task_queues.js:97:5)",
" at async Promise.all (index 1)",
" at async runInParallel (/var/task/executor.js:119:5)",
" at async Runtime.module.exports.handler (/var/task/executor.js:31:19)"
]
}
}

ResourceNotFoundException: Function not found

I see the error below when I run power tuning with parallelInvocation: false. But when I run with parallelInvocation: true, it works.

MyInput:

{
  "<LambdaARN>",
  "powerValues": [128, 256, 512, 1024, 2048, 3008],
  "num": 100,
  "payload": {
    "headers": {
      "Authorization": "<Auth Token>",
      "x-api-key": "<API Key>"
    }
  },
  "parallelInvocation": false,
  "strategy": "balanced",
  "balanceWeight": 0.5
}

Error:

{
  "error": "Lambda.Unknown",
  "cause": "The cause could not be determined because Lambda did not return an error type."
}
{
  "error": "ResourceNotFoundException",
  "cause": {
    "errorType": "ResourceNotFoundException",
    "errorMessage": "Function not found: arn:aws:lambda:<region>:<accno>:function:GetCustomerProfile:RAM256",
    "trace": [
      "ResourceNotFoundException: Function not found: arn:aws:lambda:<region>-<accno>:function:GetCustomerProfile:RAM256",
      "    at Object.extractError (/var/runtime/node_modules/aws-sdk/lib/protocol/json.js:51:27)",

Can you please help on this?

Default nodejs version runtime

Looking at https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html

I wonder if https://github.com/alexcasalboni/aws-lambda-power-tuning/blob/master/template.yml#L27 should be 10.x by default.

I know there are differences between Amazon Linux and Amazon Linux 2, and in my own case I could not migrate one Lambda that uses aws-chrome-lambda because of incompatibilities.

But there is a case to be made for either setting it to the current AWS-recommended version, or adding a sentence about it in the documentation as a heads-up. :)

Or maybe I don't understand how it works yet; I'm not the best at reading SAM/CloudFormation manifests.

Refactor Node.js code (new ES)

The current implementation is not very readable because of promise & callback hell.

I'd like to refactor it to use async/await syntax.

Can we use the tool for stress testing?

If you test only one power configuration and use a very large num, this tool could be used for stress testing Lambda functions and visualize average cost and execution time.

We could design a different visualization to be used when testing only one power configuration, where we could visualize more detailed statistics.

Compute and report total cost of state machine execution

The state machine could return its own total cost of execution.

I think this would add more transparency to Lambda Power Tuning.

The cost should include both Lambda execution costs and Step Functions execution costs (even though the max step transitions is always around 15-20, which means less than $0.0005 per state machine execution).

Depending on the value of num, Lambda costs will likely outweigh Step Functions cost. For example, even with a "no-op" function and num: 100, we can expect the overall Lambda cost to be around $0.001.

I will add some more documentation about costs too.

Missing or empty optimal value

Running the tuning app with version 3.2.3, I get an error possibly related to the new dryRun parameter:

{
  "errorType": "Error",
  "errorMessage": "Missing or empty optimal value",
  "trace": [
    "Error: Missing or empty optimal value",
    "    at validateInput (/var/task/optimizer.js:40:15)",
    "    at Runtime.module.exports.handler (/var/task/optimizer.js:14:5)",
    "    at Runtime.handleOnce (/var/runtime/Runtime.js:66:25)"
  ]
}

when the input is:

{
  "lambdaARN": "lambdaARN",
  "powerValues": [
    128,
    256,
    512,
    1024,
    2048,
    3008
  ],
  "num": 30,
  "payload": {},
  "strategy": "speed",
  "dryRun": true,
  "parallelInvocation": false,
  "stats": [
    {
      "averagePrice": 2.08e-7,
      "averageDuration": 36.37111111111112,
      "totalCost": 0.0000066560000000000045,
      "value": 128
    },
    {
      "averagePrice": 4.16e-7,
      "averageDuration": 2.3211111111111116,
      "totalCost": 0.000012480000000000008,
      "value": 256
    },
    {
      "averagePrice": 8.32e-7,
      "averageDuration": 2.4144444444444444,
      "totalCost": 0.000024960000000000015,
      "value": 512
    },
    {
      "averagePrice": 0.000001664,
      "averageDuration": 2.355,
      "totalCost": 0.00004992000000000003,
      "value": 1024
    },
    {
      "averagePrice": 0.000003328,
      "averageDuration": 2.364444444444444,
      "totalCost": 0.00009984000000000006,
      "value": 2048
    },
    {
      "averagePrice": 0.0000048880000000000005,
      "averageDuration": 2.255555555555555,
      "totalCost": 0.00014664000000000005,
      "value": 3008
    }
  ],
  "analysis": null
}

I'm not sure why the analysis field is null. I don't define it in my input to the Step Function, so whatever generates it seems to have a problem with dryRun.
Removing dryRun, the run works fine.

Regional base price selection (Step Functions)

Similarly to #77, we should use the correct regional price of Step Functions based on where the state machine is executed (which might be different from the input function's region!).

Each state transition costs $0.000025 in almost every region, i.e. $0.025 per 1,000 transitions. Here are the prices per 1,000 transitions, including the exceptions:

  • default: $0.025
  • eu-south-1: $0.02625
  • us-west-1: $0.0279
  • af-south-1: $0.02975
  • ap-east-1: $0.0275
  • ap-south-1: $0.0285
  • ap-northeast-2: $0.0271
  • eu-west-3: $0.0297
  • me-south-1: $0.0275
  • sa-east-1: $0.0375
  • us-gov-east-1: $0.03
  • us-gov-west-1: $0.03

The state machine execution cost is computed here:

module.exports.stepFunctionsCost = (nPower) => +(0.000025 * (6 + nPower)).toFixed(5);

The formula to compute the # of state transitions is: 6 + COUNT(POWERVALUES), therefore the Step Functions cost will be REGIONAL_COST * NUMBER_OF_TRANSITIONS.
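A regional version of the formula could look like this (sketch with illustrative names; the price table would carry the exceptions listed above):

```javascript
// Price per 1,000 state transitions; 'default' covers most regions.
const SFN_PRICE_PER_1K = {
    'default': 0.025,
    'us-west-1': 0.0279,
    'af-south-1': 0.02975,
    'ap-east-1': 0.0275,
    // ...remaining exceptions from the list above
};

function stepFunctionsCost(nPower, region) {
    const per1k = SFN_PRICE_PER_1K[region] || SFN_PRICE_PER_1K['default'];
    const transitions = 6 + nPower; // 6 fixed steps + one branch per power value
    return +((per1k / 1000) * transitions).toFixed(5);
}
```

For the default region and six power values this yields the same result as the current hard-coded formula.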

Customizable execution timeout

Currently, the Executor timeout is 300 seconds (5 minutes) and there is no way to customize it if you deploy via SAR.

Also, the same timeout should be configured on the state machine task to make sure the timeout error is properly handled (otherwise Lambda.Unknown is detected instead of States.Timeout).

It could be a simple CloudFormation parameter, used for both values.

3.1.1: Execution output: {} after running execute.sh

Steps:

  1. ./deploy.sh
  2. Result (already ran the deploy, just wanted to make sure it was deployed):
Waiting for changeset to be created..
Error: No changes to deploy. Stack lambda-power-tuning is up to date
  3. ./execute.sh
  4. Result:
-n .
// etc
SUCCEEDED
Execution output:
{}

Checked the logs: aws stepfunctions get-execution-history --profile default --execution-arn $EXECUTION_ARN

There are 96 entries in the logs like:

        {
            "timestamp": 1578505766.01,
            "type": "LambdaFunctionFailed",
            "id": 124,
            "previousEventId": 96,
            "lambdaFunctionFailedEventDetails": {
                "error": "Error",
                "cause": "{\"errorType\":\"Error\",\"errorMessage\":\"Invocation error: {\\\"errorType\\\":\\\"string\\\",\\\"errorMessage\\\":\\\"{\\\\\\\"statusCode\\\\\\\":\\\\\\\"500\\\\\\\",\\\\\\\"message\\\\\\\":\\\\\\\"An unexpected error occurred\\\\\\\"}\\\",\\\"trace\\\":[]}\",\"trace\":[\"Error: Invocation error: {\\\"errorType\\\":\\\"string\\\",\\\"errorMessage\\\":\\\"{\\\\\\\"statusCode\\\\\\\":\\\\\\\"500\\\\\\\",\\\\\\\"message\\\\\\\":\\\\\\\"An unexpected error occurred\\\\\\\"}\\\",\\\"trace\\\":[]}\",\"    at utils.range.map (/var/task/executor.js:67:19)\",\"    at process._tickCallback (internal/process/next_tick.js:68:7)\"]}"
            }
        },

So it appears to have failed, but the failure wasn't caught, and an empty output was returned.

Config:

{
    "lambdaARN": "arn:aws:lambda:us-west-2:etcetc",
    "powerValues": "ALL",
    "num": 5,
    "parallelInvocation": true,
    "strategy": "speed",
    "payload": [ {...} ]
}

Tried with explicit powerValues and parallelInvocation: false as well.

Payload does not support GET Methods

I am trying to test Lambda functions that expect query parameters. I am passing the params via the payload, but it throws invocation errors with null object references.
Our functions expect query parameters. Any chance we can support query params passed within the payload?

memorySize of the Lambdas should be 128

There is no reason for the tester functions to be so big:

functions:
  initializer:
    handler: lambda/initializer.handler
    memorySize: 128
    timeout: 60
  executor:
    handler: lambda/executor.handler
    memorySize: 128
    timeout: 300
  cleaner:
    handler: lambda/cleaner.handler
    memorySize: 128
    timeout: 60
  finalizer:
    handler: lambda/finalizer.handler
    memorySize: 128
    timeout: 60

Integrate stats visualization (chart)

@matteo-ronchetti has developed a simple web interface that we can integrate into the state machine output. This way, users can simply click on a link/URL and visualize useful numbers about cost and performance.

This should be easy to implement in the finalizer function, or maybe as a third parallel step.

I've considered making this an opt-in feature, but I think most users will benefit from it and I can't see any relevant data privacy concern since you can simply not click on that link.

The UI is currently hosted as an Amplify Console app here: https://master.d19f2a8daatc3f.amplifyapp.com

You can provide input data including it in the URL hash: https://master.d19f2a8daatc3f.amplifyapp.com/index.html#gAAAAQACAAQABg==;AACAQQAAAEEAAIBAMzMzQGZmBkA=;CtcjPG8SAzwK16M7vHQTPKabRDw=

The hash structure is as follows: <encode(power_values)>;<encode(execution_time)>;<encode(execution_cost)>.

For example:

let sizes = [128, 256, 512, 1024, 1536];
let times = [16.0, 8.0, 4.0, 2.8, 2.1];
let costs = [0.01, 0.008, 0.005, 0.009, 0.012];
let hash = encode(sizes, Int16Array) + ";" + encode(times) + ";" + encode(costs);

where

const base64js = require('base64-js');

function encode(input, c = Float32Array) {
    input = new c(input);
    if (!(input instanceof Uint8Array)) {
        input = new Uint8Array(input.buffer)
    }
    return base64js.fromByteArray(input);
}

Optimise multiple functions at once

As stated in #55, it would be nice to be able to optimize multiple chained functions at once. @alexcasalboni stated that, in his opinion, the optimum for the overall chain would correlate with the optima of the individual functions.

I researched a bit, and there is research (a bachelor thesis I sadly can't share) showing that the Pareto optimum for all functions can indeed differ from simply optimizing each function individually.

It would be nice to be able to optimize a chain of functions, or even a Step Functions state machine, with this tool.

API Gateway end-to-end testing?

How could we support this?

Would it be mutually exclusive wrt Lambda or maybe a totally independent branch?

APIGW comes with many configurations that might make performance tuning less reliable such as caching, WAF, endpoint type (regional or edge-optimized), etc.

We could simply invoke the API endpoint instead of the Lambda function, but I'm not 100% sure of what the benefit would be.

Set of test payloads

Let's take the hello world of the Lambda world as an example: compressing an image from S3 using sharp.

Having a function that does a complicated thing which depends so heavily on its input forces us to test it with multiple different variants, i.e.:

  • big file
  • small file
  • format 1, format 2, format 3
  • extremely big file
  • transformations passed as options

It would be nice if Power Tuning could take multiple events as inputs and, before recommending a power value, take into consideration the output of all the different tests.

Ideally, I could specify the share of runs for any given test (e.g. medium file with JPEG format: 50%, extremely big file: 3%, transformations: 5%), so that the extremely big file would not skew the results too much (averages like to do that).
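For illustration, a weighted input could look like the sketch below (field names are an assumption about how such a feature might be shaped; weights are relative shares, reusing the percentages above):

```json
{
  "lambdaARN": "your-lambda-function-arn",
  "powerValues": [128, 256, 512, 1024],
  "num": 100,
  "payload": [
    { "payload": { "test": "medium-jpeg" }, "weight": 50 },
    { "payload": { "test": "extremely-big-file" }, "weight": 3 },
    { "payload": { "test": "with-transformations" }, "weight": 5 }
  ]
}
```

Each payload would then receive a share of the num invocations proportional to its weight, so rare-but-heavy inputs inform the recommendation without dominating the average.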

AWS no longer supports nodejs4.3

Deploying gives this error: An error occurred: CleanerLambdaFunction - The runtime parameter of nodejs4.3 is no longer supported for creating or updating AWS Lambda functions. We recommend you use the new runtime (nodejs8.10) while creating or updating functions. (Service: AWSLambdaInternal; Status Code: 400; Error Code: InvalidParameterValueException; Request ID: aa86828b-4748-11e9-9731-7d5c960221fe)

Changing all references to be nodejs8.10 appears to work
