aws-ecs-airflow's Introduction

Hello

I'm Nicola (he/him). I'm a Data Engineer and Data Architect, mostly working with AWS. I have experience designing, evolving, and maintaining data architectures of different sizes, from data lakes to fat DWHs.

Expertise

  • AWS
  • IaC
    • Terraform
    • Cloudformation
    • CDK
  • Docker/Kubernetes/AWS ECS
  • Airflow/AWS Step Functions
  • Python
  • SQL
  • Spark
  • Redshift/Snowflake
  • Postgres/MySQL
  • Presto/Trino/AWS Athena

Interests

  • Serverless
  • Data Lakes/Data Lakehouses/Data Warehouses
  • Open Table Formats (specifically Iceberg)
  • Distributed systems
  • Streaming

Connect with me

nicolacorda | LinkedIn nicorc88 | Twitter

aws-ecs-airflow's People

Contributors

felix-weizman-deel, neylsoncrepalde, nicor88, sre-ops, yaronshemesh

aws-ecs-airflow's Issues

Can't see logs from Airflow UI

After deploying this to ECS and running an example DAG, I tried to check the task logs, but got this error message:

*** Log file does not exist: /usr/local/airflow/logs/test_dag/test/2022-01-03T22:07:16.085055+00:00/3.log
*** Fetching from: http://:8793/log/test_dag/test/2022-01-03T22:07:16.085055+00:00/3.log
*** Failed to fetch log file from worker. Request URL missing either an 'http://' or 'https://' protocol.

Everything is working locally with docker-compose.

Any idea how to resolve this issue?
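
One detail in the log above is that the fetch URL has nothing between `http://` and `:8793`, which suggests the worker hostname was empty when the URL was built. Below is a minimal Python sketch, using only the standard library, for checking whether a URL carries a host. The `worker-1` hostname is a made-up placeholder, and this only diagnoses the symptom; it does not show where the empty hostname comes from.

```python
from urllib.parse import urlsplit


def has_host(url: str) -> bool:
    """Return True if the URL contains a non-empty hostname."""
    return bool(urlsplit(url).hostname)


# URL copied from the error message: the host part is empty.
bad_url = "http://:8793/log/test_dag/test/2022-01-03T22:07:16.085055+00:00/3.log"

# Same URL with a hypothetical worker hostname filled in.
good_url = "http://worker-1:8793/log/test_dag/test/2022-01-03T22:07:16.085055+00:00/3.log"

if __name__ == "__main__":
    for url in (bad_url, good_url):
        print(has_host(url), url)
```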

Question: Where do DAGs, plugins, etc. live?

Hi,
Once Airflow is deployed via ECS, where do we store custom objects like DAGs, plugins, etc.?
Can they live in CodeCommit/Git, and would ECS be able to pull them once a new or updated DAG is published?
Thanks
Apun

issue with terraform tags

cd infrastructure && terraform plan;

Error: Unsupported block type

on network.tf line 45, in resource "aws_subnet" "public-subnet-1":
45: tags {

Blocks of type "tags" are not expected here. Did you mean to define argument
"tags"? If so, use the equals sign to assign it a value.

Error: Unsupported block type

on network.tf line 62, in resource "aws_subnet" "public-subnet-2":
62: tags {

Blocks of type "tags" are not expected here. Did you mean to define argument
"tags"? If so, use the equals sign to assign it a value.

Error: Unsupported block type

on network.tf line 79, in resource "aws_subnet" "public-subnet-3":
79: tags {

Blocks of type "tags" are not expected here. Did you mean to define argument
"tags"? If so, use the equals sign to assign it a value.

How to view Airflow UI once deployed to AWS?

I have deployed this project to AWS as per the instructions in the project.

It seems to have been stood up correctly: I see the cluster and services in ECS, and the tasks appear to be running fine, with no errors in the logs.

But I can't figure out how to find the public URL to view my Airflow UI now that it is deployed. I also don't see anything about this in the README or in previous issues, just the one other open issue about the Flower UI.

Is the UI just not available with this deployment setup? I figure it must be but am very confused about how to locate it.

Flower error: Flower doesn't start up (i.e. not accessible on port 5555)

worker_1 | [2019-03-07 05:50:51,945] {settings.py:174} INFO - setting.configure_orm(): Using pool settings. pool_size=10, pool_recycle=3600
flower_1 | Traceback (most recent call last):
flower_1 |   File "/usr/local/bin/flower", line 6, in <module>
flower_1 |     from flower.main import main
flower_1 |   File "/usr/local/lib/python3.6/site-packages/flower/main.py", line 4, in <module>
flower_1 |     from flower.command import FlowerCommand
flower_1 |   File "/usr/local/lib/python3.6/site-packages/flower/command.py", line 18, in <module>
flower_1 |     from .app import Flower
flower_1 |   File "/usr/local/lib/python3.6/site-packages/flower/app.py", line 15, in <module>
flower_1 |     from .urls import handlers
flower_1 |   File "/usr/local/lib/python3.6/site-packages/flower/urls.py", line 9, in <module>
flower_1 |     from .api import tasks
flower_1 |   File "/usr/local/lib/python3.6/site-packages/flower/api/tasks.py", line 81, in <module>
flower_1 |     class TaskApply(BaseTaskHandler):
flower_1 |   File "/usr/local/lib/python3.6/site-packages/flower/api/tasks.py", line 83, in TaskApply
flower_1 |     @web.asynchronous
flower_1 | AttributeError: module 'tornado.web' has no attribute 'asynchronous'

getting errors after creating the plugins directory

worker_1 | [: CRITICAL/MainProcess] Unrecoverable error: VersionMismatch('Redis transport requires redis-py versions 3.2.0 or later. You have 2.10.6',)
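
The message above is a version check failing: the Redis transport wants redis-py 3.2.0 or later and found 2.10.6. Below is a minimal Python sketch of that kind of comparison, assuming purely numeric dotted versions. The helper names `parse_version` and `meets_minimum` are made up for illustration and are not part of any library named in the error.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted numeric version such as '2.10.6' into (2, 10, 6)."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed: str, required: str) -> bool:
    """Compare versions numerically, component by component."""
    return parse_version(installed) >= parse_version(required)


if __name__ == "__main__":
    # Versions taken from the error message.
    print(meets_minimum("2.10.6", "3.2.0"))
    # Numeric comparison matters: as plain strings, "3.10.0" sorts before "3.2.0".
    print(meets_minimum("3.10.0", "3.2.0"))
```

Comparing tuples of integers rather than raw strings avoids treating `3.10.0` as older than `3.2.0`.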

What's the reason you're not using the official airflow docker image? -- why build it yourself from scratch?

docker-compose up --build fails with permission denied

Hi,
When I follow the README and run docker-compose up --build, I get permission denied on the postgress_data folder.
I tried changing the owner to my user, but then the subdirectories fail with the same error.

Which user should docker-compose be run as?
Thanks
Shlomi

Referenceability as terraform module

I found this repo via google search and it seems like a great starting point for a project I'm starting on.

Rather than forking the repo, I'm curious if you have any interest in updating the layout of the terraform scripts in order to match the convention of having a core main.tf, outputs.tf and variables.tf for the primary use case(s). This would make the repo more easily referenceable, which could allow one to plug this into other environments more easily.

Feel free to say "no" if that's not the direction this project is going. (Thanks very much either way!)

ImportError: cannot import name 'soft_unicode' from 'markupsafe'

Hi,
I was trying to set up Airflow in ECS using this code. Tasks don't run because we hit this error:
cannot import name 'soft_unicode' from 'markupsafe' (/usr/local/lib/python3.8/site-packages/markupsafe/__init__.py) | ImportError: cannot import name 'soft_unicode' from 'markupsafe' (/usr/local/lib/python3.8/site-packages/markupsafe/__init__.py)

I tried various solutions, but none of them worked. Any idea how to fix this?

Many thanks
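
The error says that the installed `markupsafe` package does not expose the name `soft_unicode`. Below is a minimal Python sketch for checking, inside the container, whether a module provides a given name. The helper `module_exports` is made up for illustration; it only confirms the symptom and does not choose a fix.

```python
import importlib


def module_exports(module_name: str, attribute: str) -> bool:
    """Return True if the module imports cleanly and defines the attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing or fails to import.
        return False
    return hasattr(module, attribute)


if __name__ == "__main__":
    # Names taken from the error message.
    print("markupsafe.soft_unicode available:",
          module_exports("markupsafe", "soft_unicode"))
```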

Errors in infrastructure/network.tf files

The error message at line 45:

Blocks of type "tags" are not expected here. Did you mean to define argument
"tags"? If so, use the equals sign to assign it a value

Adding an "=" after tags resolved the issue for me. Please update the network.tf file.
There are three such occurrences in the same file.
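
The fix described above, turning `tags {` into `tags = {`, can be applied to all occurrences at once. Below is a minimal Python sketch that does this on a string. The `sample` resource body is invented for illustration and is not copied from the repo's network.tf; the regex is an assumption about how the lines are laid out.

```python
import re

# Matches a line that opens a `tags` block without "=", e.g. "  tags {".
# Lines already written as "tags = {" do not match.
TAGS_BLOCK = re.compile(r"^([ \t]*)tags[ \t]*\{", re.MULTILINE)


def fix_tags_blocks(hcl_text: str) -> str:
    """Rewrite `tags {` as `tags = {`, keeping the original indentation."""
    return TAGS_BLOCK.sub(r"\1tags = {", hcl_text)


# Hypothetical snippet in the old block style.
sample = '''resource "aws_subnet" "public-subnet-1" {
  tags {
    Name = "example"
  }
}
'''

if __name__ == "__main__":
    print(fix_tags_blocks(sample))
```

To use it on a real file, read network.tf into a string, pass it through `fix_tags_blocks`, and review the diff before writing it back.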

How to get Flower UI working behind ALB

We have added a forwarding rule for the path /flower* to the ALB listener on port 80 and set up a target group for Flower to forward to. I've also added flower_url_prefix = /flower to airflow.cfg. Now the UI renders and the CSS loads fine, but instead of the content I get an "Error, page not found" message.

terraform version incompatible

Hi,
The AWS provider version pinned in your Terraform config (~> 1.52) is not compatible with the latest Terraform release, which needs provider 2.7.0 or later.
Here is the error when trying to init Terraform:

Provider "aws" v1.60.0 is not compatible with Terraform 0.12.8.

Provider version 2.7.0 is the earliest compatible version. Select it with 
the following version constraint:

	version = "~> 2.7"

Terraform checked all of the plugin versions matching the given constraint:
    ~> 1.52

Question: How will autoscaling work?

Hi,
Let's say I deploy this stack on ECS. How can I scale the number of workers up and down?
Will I need to create my own alerts and auto scaling groups, or is that already managed?

thanks
Shlomi
