terraform-dataproduct-aws-athena

This open source Terraform module provisions the necessary services to provide a data product on AWS.

Overview

Services

  • AWS S3
  • AWS Athena
  • AWS Glue
  • AWS Lambda

Usage

module "my_data_product" {
  source = "git@github.com:datamesh-architecture/terraform-dataproduct-aws-athena.git"

  domain = "<data_product_domain>"
  name   = "<data_product_name>"

  schedule = "0 0 * * ? *" # Run every day at 00:00 UTC

  input = [
    {
      source = "<existing_s3_bucket>"
    }
  ]

  transform = {
    query = "sql/<name_of_the_transform>.sql"
  }

  output = {
    format = "<format>"
    schema = "schema/<name_of_the_schema>.schema.json"
  }
}

Additionally, it's necessary to configure credentials for AWS. This can be done in a separate file terraform.tfvars with the following content:

aws = {
  region     = "REGION"
  access_key = "ACCESS_KEY"
  secret_key = "SECRET_KEY"
}

These values can then be referenced as var.aws in the other *.tf files and forwarded to the module's aws input.
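A minimal sketch of that forwarding, assuming the root configuration declares a variable named aws with the same shape as the module's aws input (see Inputs). Marking it sensitive is an addition of this sketch, not something the module requires.

```hcl
# Root-level variable that receives the values from terraform.tfvars.
# The object shape mirrors the module's "aws" input.
variable "aws" {
  description = "AWS region and credentials"
  type = object({
    region     = string
    access_key = string
    secret_key = string
  })
  sensitive = true # assumption: keeps the keys out of plan output
}

module "my_data_product" {
  source = "git@github.com:datamesh-architecture/terraform-dataproduct-aws-athena.git"

  # Forward the credentials object unchanged to the module.
  aws = var.aws

  domain = "<data_product_domain>"
  name   = "<data_product_name>"

  input = [
    {
      source = "<existing_s3_bucket>"
    }
  ]

  transform = {
    query = "sql/<name_of_the_transform>.sql"
  }

  output = {
    format = "<format>"
    schema = "schema/<name_of_the_schema>.schema.json"
  }
}
```

Because terraform.tfvars then contains secrets, it is usually kept out of version control.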

Endpoint data

The module creates a RESTful endpoint served by AWS Lambda (e.g. https://3jopsshxxc.execute-api.eu-central-1.amazonaws.com/prod/). This endpoint can be used as an input for another data product or to retrieve information about this data product. It returns a JSON document of the following form:

{
  "domain": "<data_product_domain>",
  "name": "<data_product_name>",
  "output": {
    "location": "arn:aws:s3:::<s3_bucket_name>/output/data/"
  }
}
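One way to read this endpoint from another Terraform configuration is the http data source of the hashicorp/http provider. The sketch below is an assumption and not part of this module: the variable upstream_endpoint_url and the local and output names are hypothetical, and this README does not state how the decoded location should be passed on to the input list.

```hcl
terraform {
  required_providers {
    http = {
      source  = "hashicorp/http"
      version = ">= 3.0" # response_body is available from 3.0 onwards
    }
  }
}

# Hypothetical variable holding the endpoint URL of the upstream data product.
variable "upstream_endpoint_url" {
  description = "URL of the upstream data product's info endpoint"
  type        = string
}

# Fetch the endpoint document when Terraform refreshes its data sources.
data "http" "upstream" {
  url = var.upstream_endpoint_url

  request_headers = {
    Accept = "application/json"
  }
}

locals {
  # Decode the JSON body into a Terraform object.
  upstream = jsondecode(data.http.upstream.response_body)
}

# Expose the S3 location reported under "output.location".
output "upstream_output_location" {
  value = local.upstream.output.location
}
```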

Examples

See the examples repository.

Authors

This Terraform module is maintained by André Deuerling, Jochen Christ, and Simon Harrer.

License

MIT License.

Requirements

| Name | Version |
|------|---------|
| aws | >= 4.56 |

Providers

| Name | Version |
|------|---------|
| archive | n/a |
| aws | >= 4.56 |
| local | n/a |

Modules

No modules.

Resources

| Name | Type |
|------|------|
| aws_apigatewayv2_api.lambda_info | resource |
| aws_apigatewayv2_integration.lambda_info | resource |
| aws_apigatewayv2_route.lambda_info | resource |
| aws_apigatewayv2_stage.lambda_info_prod | resource |
| aws_cloudwatch_event_rule.aws_cloudwatch_event_rule | resource |
| aws_cloudwatch_event_target.aws_cloudwatch_event_target | resource |
| aws_cloudwatch_log_group.lambda_info | resource |
| aws_cloudwatch_log_group.lambda_to_cloudwatch | resource |
| aws_glue_catalog_database.aws_glue_catalog_database | resource |
| aws_glue_catalog_table.aws_glue_catalog_table | resource |
| aws_glue_schema.aws_glue_schema | resource |
| aws_iam_role.lambda_execution_role | resource |
| aws_iam_role_policy.lambda_athena | resource |
| aws_iam_role_policy.lambda_glue | resource |
| aws_iam_role_policy.lambda_logs | resource |
| aws_iam_role_policy.lambda_s3 | resource |
| aws_iam_role_policy.lambda_s3_input | resource |
| aws_kms_key.aws_kms_key | resource |
| aws_lambda_function.aws_lambda_function | resource |
| aws_lambda_function.lambda_info | resource |
| aws_lambda_permission.aws_lambda_permission | resource |
| aws_lambda_permission.lambda_info | resource |
| aws_s3_bucket.aws_s3_bucket | resource |
| aws_s3_bucket_acl.aws_s3_bucket_acl | resource |
| aws_s3_bucket_server_side_encryption_configuration.aws_s3_bucket_server_side_encryption_configuration | resource |
| aws_s3_object.archive_info_to_s3_object | resource |
| aws_s3_object.archive_to_s3_object | resource |
| local_file.lambda_info_to_s3 | resource |
| local_file.lambda_to_s3 | resource |
| local_file.query_to_s3 | resource |
| archive_file.archive_info_to_s3 | data source |
| archive_file.archive_to_s3 | data source |
| aws_iam_policy_document.allow_athena | data source |
| aws_iam_policy_document.allow_glue | data source |
| aws_iam_policy_document.allow_logging | data source |
| aws_iam_policy_document.allow_s3 | data source |
| aws_iam_policy_document.allow_s3_input | data source |
| aws_iam_policy_document.lambda_assume | data source |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| aws | AWS related information and credentials | object({ region = string, access_key = string, secret_key = string }) | n/a | yes |
| domain | The domain of the data product | string | n/a | yes |
| input | List of S3 buckets of other data products which should be used as input | list(object({ source = string })) | n/a | yes |
| name | The name of the data product | string | n/a | yes |
| output | format: Output format of this data product (e.g. PARQUET). schema: Path to the JSON schema file which describes the output of this data product | object({ format = string, schema = string }) | n/a | yes |
| schedule | The schedule expression to pass to the EventBridge event rule. Format: Minutes, Hours, Day of month, Month, Day of week, Year (six space-separated fields) | string | "" | no |
| transform | Path to a SQL file, which should be used to transform the input data | object({ query = string }) | n/a | yes |
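As an illustration of the schedule format, here are a few six-field expressions in the same bare form as the Usage example. The local names are hypothetical. The values follow the Amazon EventBridge cron syntax, in which either the day-of-month or the day-of-week field must be ?, and times are in UTC.

```hcl
locals {
  # Fields: Minutes Hours Day-of-month Month Day-of-week Year

  # Every day at 00:00 UTC (the value used in the Usage example).
  schedule_daily_midnight = "0 0 * * ? *"

  # Every 15 minutes, starting at minute 0 of each hour.
  schedule_every_15_minutes = "0/15 * * * ? *"

  # Monday to Friday at 08:00 UTC.
  schedule_weekdays_8am = "0 8 ? * MON-FRI *"

  # On the first day of every month at 06:00 UTC.
  schedule_first_of_month = "0 6 1 * ? *"
}
```

Any of these can be passed to the module, for example schedule = local.schedule_weekdays_8am.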

Outputs

No outputs.
