Elastic Serverless Forwarder
License: Other
Describe the enhancement:
We have an integration test for bad paths:
tests.handlers.aws.test_handler.TestLambdaHandlerFailure.test_lambda_handler_failure
We should add the required cases for errors in the Secrets Manager expander
Add telemetry data for Elastic Serverless Forwarder usage (available in SAR for users to use). The initial focus is on usage of the Lambda function itself and the inputs being used (we can further limit this to Elastic Cloud, if really needed). Things like:
We should be able to graph things like:
I want to start with something small and then grow based on specific needs.
A few additional details for consideration:
Describe the enhancement:
EKS has different log streams in the same log group: the forwarder doesn't handle this at the moment. Add support for this and implement the forwarding capability for EKS audit logs.
https://docs.aws.amazon.com/eks/latest/userguide/control-plane-logs.html
Describe a specific use case for the enhancement or feature:
There is a VERY strong need to be able to use this serverless forwarder to forward EKS audit logs to the Elastic stack utilizing the Kubernetes Audit Log integration pipeline when the data gets ingested to properly parse and utilize this data.
Describe the enhancement:
Describe a specific use case for the enhancement or feature:
Currently we expect the users to add and set the S3_CONFIG_FILE environment variable.
This enhancement is to create the environment variable as part of the deployment:
Add the environment variable "S3_CONFIG_FILE" with a value like "s3://bucket-name-changeme/config-file-changeme"
This saves users from having to remember the environment variable name, shows the format of the value, and makes it easier to set the S3 URI of the config file.
share.config.parse_config should be refactored to receive a list of expander callables:

    def parse_config(config_yaml: str, expanders: list[Callable[[str], str]] = []) -> Config:
        for expander in expanders:
            config_yaml = expander(config_yaml)

Add a share.config.aws_sm_expander (name up to you): it should look for references in the format arn:aws:secretsmanager:AWS_REGION:AWS_ACCOUNT_ID:secret:SECRET_NAME:JSON_SECRET_KEY and replace them with the value of the fetched secret for the given JSON secret key. The secret ARN itself is arn:aws:secretsmanager:AWS_REGION:AWS_ACCOUNT_ID:secret:SECRET_NAME; since the secret will be in JSON format, we also have to reference the key by appending :JSON_SECRET_KEY in the reference in the config file.
Add unit test coverage for happy, sad and bad paths
Add the expander in the aws handler (handlers/aws/handler.py:37)
Change the integration test tests.handlers.aws.test_handler.TestLambdaHandlerSuccess.test_lambda_handler
to have at least one secret in the config file
Document the feature in docs/README_AWS.md
Feel free to split the ticket further into multiple ones, one for every step
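The steps above could be sketched roughly like this; the regex and the injected fetch_secret callable are assumptions for illustration, not the final implementation:

```python
import re
from typing import Callable

# Matches arn:aws:secretsmanager:REGION:ACCOUNT:secret:NAME with an
# optional trailing :JSON_SECRET_KEY (assumed reference format).
_SECRET_ARN = re.compile(
    r"arn:aws:secretsmanager:[^:\s]+:\d+:secret:[^:\s]+(?::[^:\s]+)?"
)


def aws_sm_expander(config_yaml: str, fetch_secret: Callable[[str], str]) -> str:
    """Replace every Secrets Manager ARN reference with the fetched value.

    fetch_secret is injected here for testability; the real expander would
    call Secrets Manager (e.g. via boto3) and, when a JSON_SECRET_KEY
    suffix is present, extract that key from the JSON secret.
    """
    return _SECRET_ARN.sub(lambda m: fetch_secret(m.group(0)), config_yaml)


def parse_config(config_yaml: str, expanders: list[Callable[[str], str]] = []) -> str:
    # Each expander rewrites the raw YAML before it is parsed.
    for expander in expanders:
        config_yaml = expander(config_yaml)
    return config_yaml  # the real function would parse and return a Config object
```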
Add support for CloudWatch log input
This helps the use case where users are already storing AWS services or other logs in AWS CloudWatch.
Many AWS services support writing logs directly to AWS CloudWatch and S3, and we want to meet users where they are.
If they are already using CloudWatch we want to integrate without requiring extra AWS setup.
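For reference, a CloudWatch Logs subscription delivers its payload base64-encoded and gzip-compressed; a minimal decoding sketch (the function name is ours, not the forwarder's):

```python
import base64
import gzip
import json


def decode_cloudwatch_logs_event(event: dict) -> dict:
    """Decode a CloudWatch Logs subscription event.

    The payload arrives in event["awslogs"]["data"] base64-encoded and
    gzip-compressed; the decoded JSON carries logGroup, logStream and a
    logEvents array.
    """
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))
```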
A file tracking the third-party licenses used in this project should be added at the root of this repo
Please include configurations and logs if available.
For confirmed bugs, please report:
The lambda function executes and logs an error:
[ERROR] TriggerTypeException: Not supported trigger
Traceback (most recent call last):
File "/var/task/main_aws.py", line 17, in handler
return lambda_handler(lambda_event, lambda_context)
File "/var/task/elasticapm/contrib/serverless/aws.py", line 116, in decorated
self.response = func(*args, **kwds)
File "/var/task/handlers/aws/utils.py", line 109, in wrapper
raise e
File "/var/task/handlers/aws/utils.py", line 99, in wrapper
return func(lambda_event, lambda_context)
File "/var/task/handlers/aws/handler.py", line 61, in lambda_handler
raise TriggerTypeException(e)
Hi
We have an application hosted in AWS that receives JSON requests and returns JSON responses with content we want to see in Elasticsearch. We started using the Elastic Serverless Forwarder and came across a peculiar situation:
This results in Elasticsearch only logging the requests that were correct, while all the errors become invisible from Elastic's point of view.
To reproduce, simply raise a KeyError exception in the application that you want to track logs from.
Below you will find the AWS CloudWatch logs from both our hosted application and the forwarder, for a good and a bad request.
Application Logs (Bad): Raises an Exception
ElasticForwarder Logs (Bad): Abruptly stops running
And the exception details:
{
"@timestamp": "2022-08-11T07:37:12.607Z",
"log.level": "error",
"message": "exception raised",
"ecs": {
"version": "1.6.0"
},
"error": {
"message": "string argument should contain only ASCII characters",
"stack_trace": " File \"/var/task/handlers/aws/utils.py\", line 84, in wrapper\n return func(lambda_event, lambda_context)\n File \"/var/task/handlers/aws/handler.py\", line 133, in lambda_handler\n for (\n File \"/var/task/handlers/aws/cloudwatch_logs_trigger.py\", line 105, in _handle_cloudwatch_logs_event\n for log_event, json_object, ending_offset, starting_offset, newline_length in events:\n File \"/var/task/storage/payload.py\", line 60, in get_by_lines\n base64_decoded = base64.b64decode(self._payload, validate=True)\n File \"/var/lang/lib/python3.9/base64.py\", line 80, in b64decode\n s = _bytes_from_decode_data(s)\n File \"/var/lang/lib/python3.9/base64.py\", line 39, in _bytes_from_decode_data\n raise ValueError('string argument should contain only ASCII characters')\n",
"type": "ValueError"
},
"log": {
"logger": "root",
"origin": {
"file": {
"line": 109,
"name": "utils.py"
},
"function": "wrapper"
},
"original": "exception raised"
},
"process": {
"name": "MainProcess",
"pid": 9,
"thread": {
"id": 140534405121856,
"name": "MainThread"
}
}
}
Application Logs (Good): Normal Execution
ElasticForwarder Logs (Good): Normal Execution
YAML Serverless App Repo Config File (removed data on the region, account, group naming and secrets...)
inputs:
- type: "cloudwatch-logs"
id: "arn:aws:logs:[REGION]:[AWS_ACC_ID]:log-group:/aws/lambda/[LOG_GROUP_NAME]:*"
outputs:
- type: "elasticsearch"
args:
cloud_id: "arn:aws:secretsmanager:[REGION]:[AWS_ACC_ID]:secret:[SECRET_NAME]:[SECRET_KEY]"
username: "arn:aws:secretsmanager:[REGION]:[AWS_ACC_ID]:secret:[SECRET_NAME]:[SECRET_KEY]"
password: "arn:aws:secretsmanager:[REGION]:[AWS_ACC_ID]:secret:[SECRET_NAME]:[SECRET_KEY]"
es_datastream_name: "logs-generic-default"
batch_max_actions: 500
batch_max_bytes: 10485760
Describe the enhancement:
CloudTrail log entries are a single JSON object from which we should extract the content of the Records
field: it contains an array of JSON objects that we should forward as events instead of the encompassing JSON log entry object
Describe a specific use case for the enhancement or feature:
Sending the JSON objects inside the array contained in the Records
field of the CloudTrail JSON log entry object will allow users to use the aws.cloudtrail
integration
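A minimal sketch of the Records extraction; the fallback behaviour for entries without a Records array is an assumption:

```python
import json


def expand_cloudtrail_records(log_entry: str) -> list:
    """Split a CloudTrail log entry into one event per item in Records.

    If the entry is not a JSON object with a Records array, it is
    forwarded unchanged (assumed fallback, not confirmed behaviour).
    """
    try:
        parsed = json.loads(log_entry)
    except json.JSONDecodeError:
        return [log_entry]
    records = parsed.get("Records") if isinstance(parsed, dict) else None
    if not isinstance(records, list):
        return [log_entry]
    # Forward each record as its own event instead of the wrapper object.
    return [json.dumps(record) for record in records]
```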
Based on our documentation I see several discrepancies that we need to review/test from the code, and then fix both code and docs as appropriate.
Review and fix code and documentation as appropriate for the support we have for automatic routing for S3 SQS Event Notifications input.
Conclusion:
When deploying twice, with any method (e.g., AWS console), the second deploy fails. A real case scenario could be to deploy one for staging and another for production, while keeping changes in a Git repo.
The second deploy throws an error when creating the ElasticServerlessForwarderEventMacro
resource. Within that resource, the error is that elastic-serverless-forwarder-macro already exists in the stack.
It is thrown because the first deploy already creates a macro with that exact name.
Please include configurations and logs if available.
For confirmed bugs, please report:
Add support for automatic routing of AWS services logs to the correct data stream without users having to specify dataset and namespace values in the config file. This will be limited to "AWS services logs" for which we have an integration.
The values can still be specified by the user and in that case we will map accordingly.
Describe the enhancement:
Send unprocessed records from the Kinesis data stream input to the continuation queue
Describe a specific use case for the enhancement or feature:
Unprocessed records, due to the Lambda reaching the execution timeout, are handled differently from other inputs: the unprocessed Kinesis records are sent back to the data stream by returning them in the batchItemFailures
payload of the response.
A poison pill might be contained in the records, the timeout could be reached even with the current grace period, and in general sending the records back to Kinesis increases usage costs.
For all the above reasons we should adopt the continuation queue mechanism for the Kinesis stream as well.
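For reference, this is the shape of the partial-batch response the Lambda returns to Kinesis today (batchItemFailures/itemIdentifier is the documented Lambda event-source contract); the proposal above is to send such records to the continuation SQS queue instead:

```python
def kinesis_partial_batch_response(failed_sequence_numbers: list) -> dict:
    """Build the partial-batch response reported back to Kinesis.

    Each unprocessed record is identified by its sequence number in
    batchItemFailures, so Kinesis redelivers those records to the Lambda.
    """
    return {
        "batchItemFailures": [
            {"itemIdentifier": seq} for seq in failed_sequence_numbers
        ]
    }
```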
Describe the enhancement:
Introduce a json_content_type
setting at the input level where the user can provide a hint on the format of the JSON content: either a single JSON object (whether or not it spans multiple lines) per payload, or ndjson
format per payload
Describe a specific use case for the enhancement or feature:
The Elastic Serverless Forwarder is able to automatically discover JSON content in the payload of an input and collect the contained JSON objects.
The JSON objects can either be on a single line or span multiple lines. In the second case the Elastic Serverless Forwarder expects the different JSON objects spanning multiple lines to be separated by a newline.
In the case of JSON objects spanning multiple lines, a limit of 1000 lines is applied: any JSON object spanning more lines than that will not be collected. Every line composing the whole JSON object will be forwarded individually instead.
Sometimes relying on the Elastic Serverless Forwarder's JSON content auto-discovery feature might have a huge impact on performance, or you may have a known payload consisting of a single JSON object spanning more than 1000 lines. In these cases you can provide in the input configuration a hint on the nature of the JSON content: this will change the parsing logic applied, improving performance or overcoming the 1000-line limit.
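A sketch of how such a hint could steer the parsing; the setting values "single" and "ndjson" are hypothetical names, and the auto-discovery fallback is elided:

```python
import json


def parse_payload(payload: str, json_content_type=None) -> list:
    """Parse a payload according to an optional json_content_type hint.

    Hypothetical values: "single" (one JSON object, possibly spanning many
    lines, so no 1000-line limit applies) and "ndjson" (one JSON object
    per line). Without a hint the forwarder would fall back to its
    auto-discovery logic, which this sketch does not implement.
    """
    if json_content_type == "single":
        return [json.loads(payload)]
    if json_content_type == "ndjson":
        return [json.loads(line) for line in payload.splitlines() if line.strip()]
    raise NotImplementedError("auto-discovery path elided in this sketch")
```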
Describe the enhancement:
test_handler.py, which includes unit and integration tests for the Lambda handler, is almost 4000 lines of code and I find it hard to maintain. I propose to refactor it into multiple, purpose-specific Python test files. As a first step, I would split unit and integration tests into 2 different files, and iterate on that where we see fit
Add a tags param as a string list for the elasticsearch output type
(https://github.com/elastic/elastic-serverless-forwarder/blob/main/docs/README-AWS.md#s3_config_file)

inputs:
- type: "sqs"
id: "arn:aws:sqs:%REGION%:%ACCOUNT%:%QUEUENAME%"
outputs:
- type: "elasticsearch"
args:
# either elasticsearch_url or cloud_id, elasticsearch_url takes precedence
elasticsearch_url: "http(s)://domain.tld:port"
cloud_id: "cloud_id:bG9jYWxob3N0OjkyMDAkMA=="
# either api_key or username/password, api_key takes precedence
api_key: "YXBpX2tleV9pZDphcGlfa2V5X3NlY3JldAo="
username: "username"
password: "password"
dataset: "generic"
namespace: "default"
tags:
- tag1
- tag2
- tag3
share.config.ElasticSearchOutput should accept an extra __init__ param for the tags values coming from config
shippers.es.ElasticsearchShipper should accept an extra __init__ param for the tags values coming from config
shippers.factory.ShipperFactory.create_from_output should be adapted to pass tags for the elasticsearch output type
shippers.es.ElasticsearchShipper._enrich_event should merge the tags key of event_payload with the tags coming from input
Add unit test coverage for happy, sad and bad paths
Change the integration test tests.handlers.aws.test_handler.TestLambdaHandlerSuccess.test_lambda_handler
to have at least one tag in the config file
Document the feature in docs/README_AWS.md
Feel free to split the ticket further into multiple ones, one for every step
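The _enrich_event merge step could be sketched as follows; the merge semantics (keep existing tags, append configured ones without duplicates) are an assumption:

```python
def enrich_event_with_tags(event_payload: dict, config_tags: list) -> dict:
    """Merge configured tags into the event's tags field.

    Tags already present on the event are kept; configured tags are
    appended without introducing duplicates (assumed semantics).
    """
    existing = event_payload.get("tags", [])
    event_payload["tags"] = existing + [t for t in config_tags if t not in existing]
    return event_payload
```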
Provide a single configuration parameter (ex: es_index_or_datastream_name) for users to optionally set the index or data stream name where the data is sent to Elasticsearch. The users should be able to use an alias as well.
As part of this change we should remove the current dataset and namespace config values in lieu of this single config value.
For various supported AWS service logs the user doesn’t have to set es_index_or_datastream_name as the code will automatically process the supported AWS services log and send it to the correct integration data stream. This assumes that user has installed the AWS integration assets.
If there is no value specified by the user and also we do not have a match against any of the supported AWS service logs then the data should be sent to "logs-generic-default" data stream.
If es_index_or_datastream_name is specified by the user in the input section of the config, the Lambda will parse the value and check if it matches to the convention of an integration data stream <type>-<dataset>-<namespace>
. If there is a pattern match, it will process it as an integration data stream.
If there is no pattern match then the code will process it as a regular index and send the data without adding any data streams specific parameters.
The above logic doesn’t handle the case of user defined custom data stream name that doesn’t follow the integration data stream naming convention <type>-<dataset>-<namespace>
. We will handle that use case as a separate issue.
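The routing logic described above could be sketched as follows; the regex is an assumption, and (as noted) it cannot distinguish a custom index name that happens to contain two dashes:

```python
import re

# Assumed check for the <type>-<dataset>-<namespace> convention. A plain
# index containing two or more dashes would also match; that ambiguity is
# the separate issue mentioned above.
_DATASTREAM_PATTERN = re.compile(r"^(?P<type>[^-]+)-(?P<dataset>.+)-(?P<namespace>[^-]+)$")


def resolve_destination(es_index_or_datastream_name: str = "") -> dict:
    """Decide how to route based on the single config parameter.

    Falls back to "logs-generic-default" when nothing is specified and no
    supported AWS service log was detected (detection elided here).
    """
    name = es_index_or_datastream_name or "logs-generic-default"
    match = _DATASTREAM_PATTERN.match(name)
    if match:
        return {"kind": "datastream", "name": name, **match.groupdict()}
    # No pattern match: treat as a regular index, no data stream params.
    return {"kind": "index", "name": name}
```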
cc: @aspacca
Describe the enhancement:
An AWS CDK Construct that encapsulates that SAR Application and has L2 support for using CDK S3 Buckets and other related resources.
Describe a specific use case for the enhancement or feature:
We're currently using CDK to manage a lot of our Elastic infrastructure. Ideally, we'd like not to drop down to raw CFN or the L1 SAR Construct, and instead use an Elastic construct directly for this.
I'd be willing to contribute a PoC if there's interest.
The correct format of the ARN of a CloudWatch Logs log group is the following:
arn:aws:logs:eu-central-1:XXXXXXX:log-group:log-group-name:*
The format we expect as value of ElasticServerlessForwarderCloudWatchLogsEvents
parameter in the CloudFormation template is the following: arn:aws:logs:eu-central-1:XXXXXXX:log-group:log-group-name
The number of colons in the ARN string is relevant: in order to grant logs:DescribeLogGroups
to the resource with ARN pattern arn:aws:logs:eu-central-1:XXXXXXX:*:*
(the correct one for that action), the macro does some string manipulation, splitting the value on colons, replacing the last element with *:*
and finally joining the string back together with :
While the wrong ARN format arn:aws:logs:eu-central-1:XXXXXXX:log-group:log-group-name
produces the right ARN for the permission (arn:aws:logs:eu-central-1:XXXXXXX:*:*),
the correct ARN format arn:aws:logs:eu-central-1:XXXXXXX:log-group:log-group-name:*
produces the wrong one (arn:aws:logs:eu-central-1:XXXXXXX:log-group:log-group-name:*:*).
Simply switching to the correct expectation is not the right solution, since there are already deployments using the wrong format that we don't want to break; the macro should handle both formats and apply the correct string manipulation according to the one effectively in use.
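A helper accepting both formats could be sketched like this; it is one possible normalisation that yields the …:*:* permission pattern for both inputs, not necessarily the macro's actual code:

```python
def describe_log_groups_resource(log_group_arn: str) -> str:
    """Normalise either accepted ARN format to the permission pattern.

    Accepts both "...:log-group:name" (the format the template documents)
    and "...:log-group:name:*" (the canonical CloudWatch Logs ARN), and
    produces "arn:aws:logs:REGION:ACCOUNT:*:*" for both.
    """
    parts = log_group_arn.split(":")
    if parts[-1] == "*":            # canonical format: drop the trailing wildcard
        parts = parts[:-1]
    parts[-2:] = ["*", "*"]         # replace "log-group:name" with "*:*"
    return ":".join(parts)
```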
For confirmed bugs, please report:
Hello,
We are trying to deploy it without a signed SSL certificate on the Elasticsearch side, and it doesn't seem to work without one.
Please find below the error we have:
{
"@timestamp": "2022-09-29T11:54:37.848Z",
"log.level": "warning",
"message": "elasticsearch shipper",
"_id": "f5d4f08ae9-000000000000",
"ecs": {
"version": "1.6.0"
},
"error": "ConnectionError([SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1129)) caused by: SSLError([SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1129))",
"log": {
"logger": "root",
"origin": {
"file": {
"line": 167,
"name": "es.py"
},
"function": "_handle_outcome"
},
"original": "elasticsearch shipper"
},
"process": {
"name": "MainProcess",
"pid": 8,
"thread": {
"id": 140115916543808,
"name": "MainThread"
}
}
}
Would it be possible to add an option to the config.yaml
file to skip certificate verification? (We use a self-signed certificate.)
Many thanks for the ✨ ^^
Hints Reference :
elastic-serverless-forwarder/shippers/es.py
Line 108 in 8fd2d8a
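One possible shape for this: translate a hypothetical ssl_verification config key into the elasticsearch-py client options that control certificate verification (verify_certs and ssl_show_warn are real client parameters; the config key name is only an assumption):

```python
def es_client_options(args: dict) -> dict:
    """Translate output args into Elasticsearch client options.

    ssl_verification is a hypothetical config key; verify_certs and
    ssl_show_warn are the elasticsearch-py parameters that would disable
    certificate verification (and the related warning) for self-signed
    certificates.
    """
    verify = bool(args.get("ssl_verification", True))
    return {"verify_certs": verify, "ssl_show_warn": verify}
```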
Let's remove the beta tag from the documentation and make this generally available.
[ERROR] ConfigFileException: Type must be one of elasticsearch: logstash given
Traceback (most recent call last):
File "/var/task/main_aws.py", line 17, in handler
return lambda_handler(lambda_event, lambda_context)
File "/var/task/handlers/aws/utils.py", line 64, in wrapper
return func(lambda_event, lambda_context)
File "/var/task/handlers/aws/utils.py", line 100, in wrapper
raise e
File "/var/task/handlers/aws/utils.py", line 84, in wrapper
return func(lambda_event, lambda_context)
File "/var/task/handlers/aws/handler.py", line 79, in lambda_handler
raise ConfigFileException(e)
I need a Logstash output, so I configured it that way, but it is not supported. Please support Logstash as well. Thanks.
Hello Elastic Team!
Describe the enhancement:
We are using Lambda to get Records from Kinesis and send them to Elastic
Giving the following Input into the Lambda
{"@timestamp": "2022-06-16T04:06:03.064Z", "message": "{ "name":"name #1", "logEvents":[{"key": "value #1"},{"key": "value #2"}]}"}
{"@timestamp": "2022-06-16T04:06:13.888Z", "message": "{ "name":"name #2", "logEvents":[{"key": "value #3"},{"key": "value #4"}]}"}
And we are using expand_event_list_from_field: "logEvents":
https://github.com/elastic/elastic-serverless-forwarder/blob/main/docs/README-AWS.md#expanding-events-from-list-in-json-object
We need to add values outside the expanded array inside each individual record.
Describe a specific use case for the enhancement or feature:
Can we move the values outside logEvents, like name, into each individual record in the Lambda output so we have them available in Elastic?
Desired Config:
type: "kinesis-data-stream"
id: "***"
expand_event_list_from_field: "logEvents"
add_root_fields_to_event_list_extracted_from_field:
- "name"
Desired Output:
{"@timestamp": "2022-06-16T04:06:21.105Z", "message": "{"name":"name #1", "key": "value #1"}"}
{"@timestamp": "2022-06-16T04:06:27.204Z", "message": "{"name":"name #1", "key": "value #2"}"}
{"@timestamp": "2022-06-16T04:06:31.154Z", "message": "{"name":"name #2", "key": "value #3"}"}
{"@timestamp": "2022-06-16T04:06:36.189Z", "message": "{"name":"name #2", "key": "value #4"}"}
Particular use case:
We need to be able to add owner, logGroup and logStream for each es_event inside kinesis_trigger.py.
I did a draft proposal and am looking forward to making it dynamic, specifying the values to be added in the configuration file.
handlers/aws/kinesis_trigger.py
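A sketch of the requested behaviour; the setting name add_root_fields_to_event_list_extracted_from_field comes from the desired config above, and the merge order (record fields win over root fields) is an assumption:

```python
import json


def expand_with_root_fields(message: str, list_field: str, root_fields: list) -> list:
    """Expand list_field and copy selected root fields into each record.

    Fields listed in root_fields are taken from the top-level object and
    merged into every extracted record; a record's own keys take
    precedence on collision (assumed semantics).
    """
    parsed = json.loads(message)
    extras = {k: parsed[k] for k in root_fields if k in parsed}
    return [json.dumps({**extras, **item}) for item in parsed.get(list_field, [])]
```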
Thank you in advance!
Describe the enhancement:
Handle an offset of the position in the list in the case of expand_event_list_from_field
usage, instead of waiting for the full list to be processed regardless of the fact that we have already reached the timeout grace period.
Describe a specific use case for the enhancement or feature:
When we extract a list of events from a JSON field (with the expand_event_list_from_field
setting) we wait for the whole list to be forwarded. In the meantime the timeout grace period could already have been reached, and waiting for the whole list to be forwarded can end up in the Lambda timing out.
For this reason we should introduce offset continuation handling (like we do for regular plaintext content) for this specific case: if, while iterating through an expanded list of events, we reach the timeout grace period, we mark the offset of the position in the list that we reached, stop the iteration, and handle the continuation as in the case where there is no expansion.
In the continuation of the payload we'll skip directly to the offset position that was marked.
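The offset handling could be sketched as follows; ship and out_of_time are placeholders for the real shipper send and the grace-period check:

```python
def forward_expanded_list(events, start_offset, out_of_time, ship):
    """Forward events from start_offset, returning the continuation offset.

    A return value equal to len(events) means the list completed; anything
    smaller is the position to resume from via the continuation queue.
    """
    for position in range(start_offset, len(events)):
        if out_of_time():
            return position      # mark the offset and stop iterating
        ship(events[position])   # forward this expanded event
    return len(events)
```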
Add support for Kinesis data streams input
This helps the use case where users are already using Kinesis data streams to consolidate events from various services.
If they are already using Kinesis data streams we want to integrate without requiring extra AWS setup.
We should include a way to add the required IAM permissions/policies.
Describe the enhancement:
Adopt testcontainers-python
Describe a specific use case for the enhancement or feature:
Currently we use docker directly, with random ports exposed to the host: this requires some boilerplate to retrieve those ports and override them for localstack
so that we can use its test helpers.
The first scope of this issue is to understand whether, using testcontainers-python,
we can keep the random ports but avoid most if not all of the boilerplate
Describe the enhancement:
Currently the replay queue handler accepts batches of messages as input, but every message (messages are 1:1 with failed events) is still sent to the shipper one by one, without exploiting any bulk or batching feature of the output.
We should instead keep a cache of the shippers for all the messages in the input batch, rely on the bulking/batching logic of every shipper, and flush only at the end of the input batch or when the timeout grace period is reached, similar to what we do for the continuing queue.
Describe a specific use case for the enhancement or feature:
Some users attach the replay queue as a trigger of the forwarder Lambda, relying on the eventually transient nature of the forwarding failures. This is indeed an intended scenario, and it's the reason why the replay queue was set up with a max receive count of 3 and a DLQ.
In this case, forwarding one event at a time has a considerable impact on the performance of the replay handler, producing longer execution times and therefore higher costs for the users.
Improve the discoverability of the elastic-serverless-forwarder functionality for users. Currently users (both internal and external) have a hard time finding the serverless data ingestion method. Adding a tile in the Integrations UI creates visibility and provides a way for users to discover the functionality more easily.
Clicking the tile will forward the users to “https://github.com/elastic/elastic-serverless-forwarder/blob/main/docs/README-AWS.md”. We already have this kind of mechanism being utilized for other things like Language client.
The tile should be visible in the “AWS” and “Custom” section.
Details of the tile:
Use the AWS Serverless Application Repository icon.
Bold text says “AWS Serverless Application Repository”.
Detailed text says “Collect logs using AWS Lambda application available in AWS Serverless Application Repository.”
cc: @aspacca
LS,
I am looking at deploying the elastic-serverless-forwarder but I have two questions (is this the correct place for them?).
As a start we will look at using the S3-SQS input as a replacement for our current lambda function.
We could deploy the elastic-serverless-forwarder and consume messages from S3-SQS. But...
if we add an extra S3 bucket sending messages to that SQS queue, do we need to redeploy/reconfigure the elastic-serverless-forwarder?
(assuming the permissions for the Lambda to access the extra S3 bucket are in place)
Why do we need to specify ElasticServerlessForwarderS3Buckets in the deployment of the lambda ?
In our AWS setup the permissions on the bucket are handled by the owner of the bucket and not by the owner of this lambda.
Greetings,
PeterPaul
Hello,
It looks like there are a few unstable tests in the new pipeline for this project:
and
We're now monitoring the health of this pipeline so if these could be cleaned up, it would be terrific. Thanks!
Describe the enhancement:
Provide a way to run the Lambda handler locally, accepting different kinds of trigger events.
Ideally we will be able to run the handler from an IDE with a debugger attached.
Describe a specific use case for the enhancement or feature:
Currently you need to deploy a new version of the Lambda in SAR on a development account and test a payload through the AWS console, or wrap your own __main__
entry point (which isn't versioned) if you want to run locally with a debugger.
This is far from ideal; providing this feature will improve the developer experience.
Currently, the Elastic Serverless Forwarder (ESF) is not available in AWS GovCloud regions.
From the AWS docs "Applications that are publicly shared in other AWS Regions are not automatically available in AWS GovCloud (US) Regions. To make applications available in AWS GovCloud (US) Regions, you must publish and share them independently of other AWS Regions".
This issue is to publish ESF in AWS SAR to be made available on the AWS GovCloud(US) Regions.
There are certain requirements around who can have access to AWS GovCloud(US) Regions. For example:
"AWS GovCloud (US-East) and (US-West) Regions are operated by employees who are U.S. citizens on U.S. soil. AWS GovCloud (US) is only accessible to U.S. entities and root account holders who pass a screening process. Customers must confirm that they will only use a U.S. person (green card holder or citizen as defined by the U.S. Department of State) to manage and access root account keys to these regions."
Describe the enhancement:
We already enrich the events, according to their source, with some metadata related to AWS (s3, kinesis etc).
We should review the available information in order to add as much metadata as possible (for example, at the moment we are not adding cloud.account.id
even though we probably have this information).
A refactoring of the enrichment code will probably be required so the code can be reused no matter what the input of the events is.
Describe a specific use case for the enhancement or feature:
Adding metadata related to the AWS environment makes the events more discoverable for users.
Users could rely on filters or dashboards that exploit the metadata we collect, like isolating events from a specific AWS account ID
Describe the enhancement:
Hi
We are using Lambda to get records from Kinesis and send them to Elastic
The output data is treated as an escaped string and placed inside the message field:
How can we get all the messages as JSON inside Elastic by customizing the S3_CONFIG_FILE YAML file?
Current config file:
`inputs:
Describe a specific use case for the enhancement or feature:
Currently when users use elastic-serverless-forwarder to send AWS services logs to Elastic, they have to understand our data stream naming convention, find out what the default data stream name is (the one set up by the integration) and then specify that name as a config value to get the data routed correctly to Elastic and utilize the OOTB assets (index template, pipeline, dashboards etc.). We currently have auto-routing, but that's based on the S3 object naming convention and hence works only for the Amazon S3 input. This issue is to research using the event data itself to identify the AWS service log for all the OOTB supported AWS services logs, so that the method works for all supported inputs (including Kinesis data stream, SQS, CloudWatch logs).
For other complex use cases users can continue to specify the config value.
This issue is to research and document the findings for auto discovery of AWS service logs based on events data.
This issue scope would be to focus right now on the AWS service logs that we have supported integrations for.
These are the current AWS log integrations we have (based on index template we install).
Please include configurations and logs if available.
For confirmed bugs, please report:
Whenever the Lambda function is triggered by CloudWatch logs, it raises a UnicodeDecodeError.
Describe the enhancement:
The SAR template comes with no IAM permissions/policies.
Still, the user is required to add them in order to run the Lambda successfully once deployed from SAR.
While the manual steps are documented, we want to provide a more seamless experience for the user, like a script they can run or similar.
This could also be used for IaC purposes.
Currently elastic-serverless-forwarder includes all events that are available in a given source input.
This issue is to add regular expression based include and/or exclude functionality where the logs are excluded or included based on matching the regular expression.
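A sketch of such filtering, with assumed semantics: exclude patterns take precedence, an empty include list passes everything, and re.search (substring matching) is used rather than full-line matching:

```python
import re


def event_is_included(message: str, include=None, exclude=None) -> bool:
    """Regex-based include/exclude filter for a single log event.

    Returns False if any exclude pattern matches; otherwise, with an
    include list present, at least one include pattern must match.
    With no include list, every non-excluded event passes.
    """
    if exclude and any(re.search(pattern, message) for pattern in exclude):
        return False
    if include:
        return any(re.search(pattern, message) for pattern in include)
    return True
```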
#115 replaced the config param es_index_or_datastream_name
with es_datastream_name
Backward-compatibility code was put in place in order to allow for a migration path.
It is planned for v1.0.0 to remove the backward-compatibility code
Handle error scenarios gracefully in the Lambda function. A few scenarios come to mind that we need to investigate to see if extra error handling code needs to be added:
Some error cases can be retried, and we should have optimal retry logic in the code for those.
In other cases we will need to use the DLQ and write events there so users can investigate and potentially re-process them.
Describe the enhancement:
It would be nice to be able to output to Apache Kafka
Describe a specific use case for the enhancement or feature:
Sending the data to Apache Kafka opens the door to using Apache Flink for data enrichment prior to sending the data to the Elastic Stack.