aws-airflow-stack's People

Contributors

amizzo87, edbizarro, ferdingler, florpor, rohitjones, tomfaulhaber, villasv

aws-airflow-stack's Issues

Scaling back instances might kill running tasks

The stack's autoscaling model is inflexible: it scales only in response to the queue length. Except for DAGs consisting solely of quick operations, this is problematic.

If scale-in is triggered after the queue has held 0 tasks for N seconds, any task that takes N+1 seconds might get killed. That's just the obvious example: execution times don't need to be close to N to cause problems, because with many tasks the average queue length can still be close to zero.
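The failure mode described above can be sketched with a small simulation. This is only an illustration under stated assumptions, not the stack's actual policy: the `Task` type, the `killed_by_scale_in` helper and the 300-second delay are all hypothetical, and the model assumes scale-in terminates every worker once the queue has been empty for the full delay.

```python
from dataclasses import dataclass


@dataclass
class Task:
    start: int     # second at which a worker picks the task up
    duration: int  # seconds the task runs once started


def killed_by_scale_in(tasks, cooldown):
    """Return the tasks still running when a queue-length-only policy scales in.

    Assumption: the queue counts as empty once every task has been picked up,
    and scale-in fires `cooldown` seconds later, terminating all workers
    regardless of what they are still executing.
    """
    if not tasks:
        return []
    queue_empty_at = max(t.start for t in tasks)
    scale_in_at = queue_empty_at + cooldown
    return [t for t in tasks if t.start + t.duration > scale_in_at]


N = 300  # hypothetical scale-in delay, in seconds
tasks = [Task(start=0, duration=N + 1), Task(start=0, duration=60)]
victims = killed_by_scale_in(tasks, cooldown=N)
print(victims)
```

Only the task that outlives the delay is reported, even though the queue looked perfectly healthy the whole time, which is why queue length alone is a weak scale-in signal.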

It's very unlikely that there can be a one-size-fits-all autoscaling strategy; this one was implemented because it was easy and useful. The problem is that we should strive for a transparent infrastructure and should not have things like an Operator that allocates 5 machines, for example. DAGs should be all about data.

Adhere to the maximum line length for YAML adopted by CFN

Uploading the code to the designer and then downloading it converts a few string blocks from the literal block style | to the folded style >, introducing a bunch of blank lines. This is a minor inconvenience, and the changes are quickly undone.

A very minor issue, but making the upload/download idempotent would be pretty convenient.

Running DAGs don't have their tasks scheduled

Something is quite wrong with manually triggered DAGs. Tasks are being assigned task_state=None and are not seen by the scheduler, so they never get queued.

Running a backfill from the CLI works, so I'm inclined to think this is a problem with the scheduler.

Intermittent problems with EPEL 6 repository

Once in a while yum makecache fails, apparently because EPEL breaks (epel: Check uncompressed DB failed) and expects the wrong payload size ([Errno 14] HTTP Error 416 - Requested Range Not Satisfiable, for all its mirrors).

Perhaps I can make this problem go from "uncommon" to "rare" if I configure yum to ignore unavailable repositories (yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true), therefore encouraging it to ignore the cache.

This shouldn't be very impactful: the only use of EPEL 6 at the moment is the supervisor package.

Merge CloudFormation::Init sections in a single resource

A very welcome trick is described here to share initialization configuration for EC2 instances.

Basically, the initialization routine accepts a resource ID instead of (as I had assumed) always fetching from the calling resource. This means that one resource can access the CFN::Init metadata of another.

Using a dummy resource, we can concentrate all init configs in a single place.

Convert Scheduler EC2 instance to Amazon Linux AMI

This is not strictly necessary, but it will make integration with CloudFormation utilities easier.
It will make setting up the environment harder, but that is something it is generally easier to find answers about.

Necessary step before #8.

Airflow installation doesn’t work in the latest AMIs

Reported by @tomfaulhaber; not reproduced yet.

Still need to decide on #34, because providing baked AMIs adds a lot more complexity to the template, although it greatly simplifies the actual resource creation during autoscaling.

But again, autoscaling is not the main feature of the stack anymore. The installation is not that expensive and as long as it consistently works on the newer official Amazon Linux AMIs it should be fine.

Do not name SQS queues manually

Queue names are unique because they're used by CloudWatch metrics. Just let AWS pick a name so multiple stacks can be deployed simultaneously in the same region.

Turbine CFN stack deploy diagnostic tool

Having a diagnostic tool for Airflow deployments could be really useful, although this task might seem too general and complex. Testing the UI, whether tasks are getting scheduled and executed, and connections would be a good start.
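As a starting point, the tool could be a small runner over named checks. The sketch below is a guess at one possible shape, not an existing Turbine utility: `ui_responds`, `run_checks` and `report` are hypothetical names, and only the web UI check is filled in, using nothing beyond the Python standard library.

```python
import urllib.request
from typing import Callable, Dict


def ui_responds(url: str, timeout: float = 5.0) -> bool:
    """Return True if an HTTP GET on the webserver URL succeeds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:  # URLError, HTTPError and socket timeouts are OSError subclasses
        return False


def run_checks(checks: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Run every named check; a check that raises is reported as failed."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return results


def report(results: Dict[str, bool]) -> str:
    """Format one line per check, in the order the checks were given."""
    return "\n".join(
        f"[{'ok' if ok else 'FAIL'}] {name}" for name, ok in results.items()
    )
```

A call might look like `print(report(run_checks({"web UI": lambda: ui_responds("http://localhost:8080/")})))`, where the URL is a placeholder for the stack's webserver address. Checks for scheduling, execution and connections would plug in as further entries of the same dictionary.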

Use the stack name as tag prefix on AWS resources

The stack is already configured so that multiple stacks can be created without conflicts, by using tags rather than unique names to identify them.

But still, those tags are generally turbine- or a similar prefix. Ideally we should provide support for additional tagging information so users can easily distinguish between a production deployment and a testing deployment.

Reinstall pycurl with proper SSL backend

Celery was erroring with
ImportError: The curl client requires the pycurl library.
even though PyCurl is installed.

It turns out the import itself was failing with
ImportError: pycurl: libcurl link-time ssl backend (openssl) is different from compile-time ssl backend (none/other).

The solution is already documented:
pip-3.5 install --no-cache-dir --compile --ignore-installed --install-option="--with-nss" pycurl
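After reinstalling, a quick check like the following may help confirm that pycurl imports and show which SSL backend libcurl reports. It is a minimal sketch: `ssl_backend` and `KNOWN_BACKENDS` are hypothetical helpers, the backend list is not exhaustive, and it assumes `pycurl.version` is the usual space-separated string of `name/version` tokens.

```python
# Backend names that may appear in the version string (non-exhaustive assumption).
KNOWN_BACKENDS = {"openssl", "nss", "gnutls", "libressl", "boringssl", "mbedtls", "wolfssl"}


def ssl_backend(version_string):
    """Return the first SSL backend token (such as 'NSS/3.36') found, or None."""
    for token in version_string.split():
        name = token.split("/", 1)[0]
        if name.lower() in KNOWN_BACKENDS:
            return token
    return None


if __name__ == "__main__":
    try:
        import pycurl
    except ImportError as exc:
        # The backend mismatch described above surfaces as an ImportError.
        print("pycurl still fails to import:", exc)
    else:
        print("pycurl imports; reported SSL backend:", ssl_backend(pycurl.version))
```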
