treetracker-airflow-dags's Introduction

Treetracker Automation using Airflow

This repository contains automations for Treetracker processes using an open source job runner called Airflow. Airflow lets us flexibly and robustly create jobs that run on a schedule or are triggered manually. Airflow is written in Python and requires Python 3 to be installed.

Set up your development environment

Option 1: Install using pip

sudo pip3 install apache-airflow

This approach seems to work well on MacOS X, but on Windows it requires many extra dependencies. More information is available at https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html#
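As a quick sanity check after installing, the following sketch reports whether the current interpreter is Python 3 and whether the airflow package can be imported. The helper name check_airflow_environment is hypothetical and not part of Airflow; it only inspects the interpreter and does not start anything.

```python
import importlib.util
import sys


def check_airflow_environment():
    """Return (python_ok, airflow_installed) for the current interpreter."""
    # Airflow requires Python 3. The exact minimum minor version depends on
    # the Airflow release, so only the major version is checked here.
    python_ok = sys.version_info.major >= 3
    # find_spec returns None when the top-level package cannot be found.
    airflow_installed = importlib.util.find_spec("airflow") is not None
    return python_ok, airflow_installed


if __name__ == "__main__":
    python_ok, airflow_installed = check_airflow_environment()
    print(f"Python {sys.version.split()[0]} (Python 3: {python_ok})")
    print(f"airflow importable: {airflow_installed}")
```

If the second line reports False, the pip install above most likely went to a different Python interpreter than the one running this script.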

Option 2: Run using docker

You may run an Ubuntu instance in Docker and install Airflow there using pip3. There may also be a Docker image available on Docker Hub for this purpose, but we have not tested any publicly available images at this time.

If you run Airflow in Docker using Ubuntu, the only prerequisites are Python 3 and pip. Bind port 8080 so that you can reach the Airflow admin dashboard from your local machine. Here is an example docker command to run the container:

docker run -it -d -v ~/temp/airflow/mydata:/mydata -p 8080:8080 --name myairflow ubuntu

Author airflow jobs (DAGs)

Airflow DAGs can be authored in any editor you choose. When you author DAGs in the configured Airflow DAGs folder (defaults to /Users/{user}/airflow/dags on MacOS X), Airflow detects them automatically and runs them. You can view the output of each run in the Airflow web panel.
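If you are unsure where the DAGs folder is on your machine, the sketch below makes a best guess from the environment. It assumes the usual Airflow conventions: the AIRFLOW__CORE__DAGS_FOLDER environment variable overrides the setting, and otherwise the folder is the dags directory under AIRFLOW_HOME, which defaults to ~/airflow. The function name default_dags_folder is hypothetical, and the sketch deliberately ignores any dags_folder value set in airflow.cfg.

```python
import os
from pathlib import Path


def default_dags_folder(environ=None):
    """Best-guess location of the DAGs folder, ignoring airflow.cfg."""
    env = os.environ if environ is None else environ
    # An explicit override of the [core] dags_folder setting wins.
    override = env.get("AIRFLOW__CORE__DAGS_FOLDER")
    if override:
        return Path(override).expanduser()
    # Otherwise fall back to $AIRFLOW_HOME/dags, with ~/airflow as the home.
    airflow_home = Path(env.get("AIRFLOW_HOME", "~/airflow")).expanduser()
    return airflow_home / "dags"


if __name__ == "__main__":
    print(default_dags_folder())
```

For the authoritative value, including anything set in airflow.cfg, ask Airflow itself with airflow config get-value core dags_folder.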

  1. Run Airflow locally by executing airflow standalone on the command line. Note that this command gives you an address for the Airflow web panel.
  2. Open the Airflow web panel using the provided web address.
  3. Enable any DAGs in the web panel that you want to develop.
  4. If you are creating a new DAG, copy an example DAG of interest to a new file in the DAGs folder and update the DAG id in the file (see below to locate the DAG id).
  5. As you update your DAG, Airflow will run it according to the schedule you specify, or you can trigger the DAG manually.
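Step 4 above can also be scripted. The following is a minimal sketch that writes a new DAG file from a template instead of copying an existing one by hand. Everything in it is illustrative: the scaffold_dag helper, the template contents, the say_hello task, and the file naming are assumptions, not part of this repository. The template assumes Airflow 2.x, where BashOperator is importable from airflow.operators.bash and DAG accepts schedule_interval.

```python
from pathlib import Path
import tempfile

# Hypothetical template for a new DAG file (assumes Airflow 2.x imports).
DAG_TEMPLATE = '''\
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    {dag_id!r},  # the DAG id; must be unique across the DAGs folder
    description='Hypothetical example DAG',
    schedule_interval='@hourly',
    start_date=datetime(2021, 1, 1),
    catchup=False,
    tags=['example'],
) as dag:
    # A single placeholder task that prints a message.
    say_hello = BashOperator(task_id='say_hello', bash_command='echo hello')
'''


def scaffold_dag(dags_dir, dag_id):
    """Write a minimal DAG file named after dag_id and return its path."""
    dags_dir = Path(dags_dir)
    dags_dir.mkdir(parents=True, exist_ok=True)
    # Hyphens are fine in a DAG id but awkward in a Python file name.
    target = dags_dir / f"{dag_id.replace('-', '_')}.py"
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    target.write_text(DAG_TEMPLATE.format(dag_id=dag_id))
    return target


if __name__ == "__main__":
    # Demonstrate against a throwaway directory rather than the real folder.
    with tempfile.TemporaryDirectory() as tmp:
        path = scaffold_dag(tmp, "my-first-dag")
        print(path.read_text())
```

To use it for real, pass your actual DAGs folder as dags_dir. The script refuses to overwrite an existing file so that a DAG under development is not clobbered.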

Locating the DAG id

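from datetime import datetime

from airflow import DAG

# default_args is assumed to be defined earlier in the DAG file.
with DAG(
    'reporting-schema-copy',  # <-- this is the DAG id; it cannot be duplicated
    default_args=default_args,
    description='Calculate earnings for FCC planters',
    schedule_interval='* * * * *',  # cron expression: run every minute
    # schedule_interval='@hourly',
    start_date=datetime(2021, 1, 1),
    catchup=False,
    tags=['earnings'],
) as dag:
    ...  # task definitions go here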
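Because a DAG id cannot be duplicated, it can help to scan the DAGs folder for clashes before adding a new file. The sketch below does this with Python's ast module, without importing Airflow. The helper names find_dag_ids and find_duplicate_dag_ids are hypothetical, and the scan only recognizes ids written as string literals, either as the first positional argument to DAG(...) or as dag_id=...; ids built at runtime are not detected.

```python
import ast
from collections import defaultdict
from pathlib import Path


def find_dag_ids(source):
    """Return DAG ids passed as string literals to DAG(...) calls in source."""
    ids = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match both DAG(...) and something.DAG(...).
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name != "DAG":
            continue
        # The id is either the first positional argument or dag_id=...
        candidates = node.args[:1] + [k.value for k in node.keywords if k.arg == "dag_id"]
        for arg in candidates:
            if isinstance(arg, ast.Constant) and isinstance(arg.value, str):
                ids.append(arg.value)
    return ids


def find_duplicate_dag_ids(dags_dir):
    """Map each DAG id used more than once to the files that use it."""
    seen = defaultdict(list)
    for path in sorted(Path(dags_dir).rglob("*.py")):
        for dag_id in find_dag_ids(path.read_text()):
            seen[dag_id].append(path.name)
    return {dag_id: files for dag_id, files in seen.items() if len(files) > 1}


if __name__ == "__main__":
    # Assumes the default DAGs folder location; adjust the path as needed.
    for dag_id, files in find_duplicate_dag_ids(Path("~/airflow/dags").expanduser()).items():
        print(f"{dag_id!r} is defined in: {', '.join(files)}")
```

An empty result means no literal DAG id appears more than once in the scanned folder.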

Contributors

dadiorchen, tanguyen1893, zavenarra
