
airflow_learning's Introduction

Airflow

An Airflow DAG is a Python program. It consists of these logical blocks:

  • Imports
  • DAG Arguments
  • DAG Definition
  • Task Definitions
  • Task Pipeline

Imports

Example

# import the libraries

from datetime import timedelta
# The DAG object; we'll need this to instantiate a DAG
from airflow import DAG
# Operators; we need this to write tasks!
# (Airflow 1.10 import path; in Airflow 2.x the module is airflow.operators.bash)
from airflow.operators.bash_operator import BashOperator
# This makes scheduling easy
from airflow.utils.dates import days_ago

DAG Arguments

default_args = {
    'owner': 'HaPhan Tran',
    'start_date': days_ago(0),
    'email': ['[email protected]'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}
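The values in default_args are handed to every task in the DAG, and an argument set directly on a task takes precedence over the default. The following is a minimal plain-Python sketch of that override rule, not Airflow's own code; the helper `effective_args` and the `load` task with `retries=3` are hypothetical and exist only for illustration.

```python
from datetime import timedelta

# Retry-related keys from the default_args above (other keys omitted for brevity).
default_args = {
    "owner": "HaPhan Tran",
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}


def effective_args(defaults, **task_kwargs):
    """Hypothetical helper: task-level arguments win over DAG-level defaults."""
    return {**defaults, **task_kwargs}


# A task that sets nothing extra inherits every default.
extract_args = effective_args(default_args, task_id="extract")

# A task that sets retries itself keeps its own value; the rest is inherited.
load_args = effective_args(default_args, task_id="load", retries=3)

print(extract_args["retries"])  # 1
print(load_args["retries"])     # 3
```

With the defaults above, a failed task is retried once, five minutes after the failure, unless the task overrides those values.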

DAG Definition

The main information about the DAG: name, description, schedule, etc.

dag = DAG(
    'ETL_Server_Access_Log_Processing',
    default_args=default_args,
    description='ETL read server log',
    schedule_interval=timedelta(days=1),
)

Task Definitions

Each task definition gives the task id, the command to run, and the DAG the task belongs to.

extract_data = BashOperator(
    task_id='extract',
    # options before the file name, so the command also works with non-GNU cut
    bash_command='cut -d "#" -f 1,4 server_log.txt',
    dag=dag,
)
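To make the bash command in the extract task concrete, here is a rough Python equivalent of what cut does with the "#" delimiter and fields 1 and 4. It is a sketch for illustration only; the function name `cut_fields` and the sample log line are assumptions, since the real layout of server_log.txt is not shown here.

```python
def cut_fields(line, delimiter="#", fields=(1, 4)):
    """Keep the given 1-based fields of a delimited line, like `cut -d -f`."""
    parts = line.split(delimiter)
    # Like cut without -s, a line that has no delimiter is passed through unchanged.
    if len(parts) == 1:
        return line
    # Fields beyond the end of the line are skipped silently.
    return delimiter.join(parts[i - 1] for i in fields if i <= len(parts))


# Hypothetical log line with five "#"-separated fields.
sample = "2021-06-01 10:00:00#GET#/index.html#10.0.0.1#200"
print(cut_fields(sample))  # 2021-06-01 10:00:00#10.0.0.1
```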

Task Pipeline

After defining all the tasks, you need to specify the order in which those tasks will be triggered.

# task pipeline
# (download, transform_data and load_data are assumed to be defined the same way as extract_data)
download >> extract_data >> transform_data >> load_data
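The >> operator reads as "runs before": each task on the left becomes upstream of the task on its right. The sketch below imitates that chaining with a small stand-in class so it can run without Airflow installed. `Task`, `downstream` and `run_order` are hypothetical names, and this is not how Airflow implements its operators internally.

```python
class Task:
    """Stand-in for an Airflow task; only records what runs after it."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []

    def __rshift__(self, other):
        # a >> b: register b as downstream of a, then return b so that
        # a >> b >> c keeps chaining from left to right.
        self.downstream.append(other)
        return other


download = Task("download")
extract_data = Task("extract")
transform_data = Task("transform")
load_data = Task("load")

# Same pipeline expression as above.
download >> extract_data >> transform_data >> load_data


def run_order(first):
    """Follow a purely linear chain from its first task and list the task ids."""
    order, current = [], first
    while current is not None:
        order.append(current.task_id)
        current = current.downstream[0] if current.downstream else None
    return order


print(run_order(download))  # ['download', 'extract', 'transform', 'load']
```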
