airflow's Introduction

ETL

Running

To run the Airflow deployment, use the main deployment repository. Start the services with the command below, then visit http://0.0.0.0:8080/login/

docker-compose up

Then add a new connection in the UI with the following parameters. The DAGs cannot run without it.

The connection is named fs_default and has the type File (path).
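As a possible alternative to clicking through the UI, Airflow can also read connections from environment variables named AIRFLOW_CONN_<CONN_ID>. The sketch below builds such a value for fs_default. It is not something this repository documents. It assumes an Airflow version that accepts JSON-formatted connection variables, that the File (path) type is identified as "fs", and that a base path of "/" is acceptable. Connections defined this way do not appear in the UI's connection list.

```python
import json
import os

# Assumption: the File (path) connection type is identified as "fs" and keeps
# its base directory in the "path" field of the connection extras.
# The "/" base path is a placeholder value, not taken from the repository.
fs_default = {"conn_type": "fs", "extra": {"path": "/"}}

# Airflow resolves a connection id by looking for AIRFLOW_CONN_<ID IN UPPER CASE>.
env_name = "AIRFLOW_CONN_FS_DEFAULT"
env_value = json.dumps(fs_default)

# Setting it here only affects the current process; for the docker-compose
# deployment the same name/value pair would go into the containers' environment.
os.environ[env_name] = env_value
print(f"{env_name}={env_value}")
```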

DAGs

process-highways

This DAG downloads highway geometries, saves them to disk in a database-ready format, and loads them into the database.

process-addresses

This DAG downloads the address data and then inserts it into Neo4j.

Developing

Poetry is used to manage the dependencies and the environment. To activate the environment, run

poetry shell

To exit, run exit.

Linting

Before submitting code for review, format it with the following commands

python3 -m black .
python3 -m isort .

Pull Requests

Pull requests should be made into the develop branch from feature branches.

airflow's People

Contributors

thomasthelen

airflow's Issues

Download & Link Addresses

We need to download addresses and link them to nearest intersections. Create a DAG that does this.
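One way the linking step could work is a nearest-neighbor lookup by great-circle distance. The sketch below is illustrative only. The record layout (id, lat, lon), the sample coordinates, and the function names are hypothetical, and a brute-force scan like this would likely need a spatial index for a full national dataset.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (
        sin((lat2 - lat1) / 2) ** 2
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_intersection(address, intersections):
    """Return the intersection record closest to the address (brute-force scan)."""
    return min(
        intersections,
        key=lambda node: haversine_km(
            address["lat"], address["lon"], node["lat"], node["lon"]
        ),
    )


# Made-up sample records, for illustration only.
intersections = [
    {"id": "a", "lat": 19.4326, "lon": -99.1332},
    {"id": "b", "lat": 20.6597, "lon": -103.3496},
]
address = {"id": "addr-1", "lat": 19.40, "lon": -99.15}

print(nearest_intersection(address, intersections)["id"])
```

The resulting (address, intersection) pairs could then be written to the database as relationships between the two node types.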

Fix timeout on long data downloads

The download for the road networks can take over an hour. Airflow kills the task when it times out before the download finishes. We should configure the DAG's or task's timeout or health check so that this does not happen.
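If the limit being hit is the task's run-time limit, one possible fix is to raise execution_timeout, which Airflow operators accept as a timedelta. The sketch below is an assumption about the cause, not a confirmed diagnosis. The three-hour limit and the retry settings are placeholder values.

```python
from datetime import timedelta

# Hypothetical default arguments for the download DAG.
# execution_timeout, retries and retry_delay are standard operator arguments;
# the values chosen here are assumptions and should be tuned to the real download time.
default_args = {
    "execution_timeout": timedelta(hours=3),  # maximum run time allowed per task
    "retries": 1,                             # retry once if the download fails
    "retry_delay": timedelta(minutes=10),     # wait before retrying
}

print(default_args["execution_timeout"])
```

These values would be passed to the DAG as default_args, or execution_timeout could be set on the download task alone so that other tasks keep a shorter limit.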

Download Mexico Highway Routes

We need to download the geometries (intersections) of all highways in Mexico and upload them to Neo4j. OpenStreetMap can serve as the data source.

Create a DAG that does this.
