nyc-transit-archive

The NYC Transit Archive service is managed using a set of task graphs (or DAGs) running inside of Apache Airflow.

Development

Running tasks locally

This section documents the steps required to run the tasks locally.

First, create a new environment (the following uses conda, but you can also use something else, e.g. pipenv) with the requisite packages installed:

# the attrs library may fail to install if you do not prepopulate the environment with Python
conda create --name quilt-gtfs-rt-pipeline-dev python=3.6
conda activate quilt-gtfs-rt-pipeline-dev

# install script/dev env packages
conda install jupyter boto3

# install airflow
export AIRFLOW_HOME=~/airflow
pip install apache-airflow

# install the cryptography lib
pip install cryptography

If you haven't done so already, clone this repo:

git clone https://github.com/ResidentMario/nyc-transit-archive.git

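Next you will need to generate a Fernet key. The following command copies one to your clipboard. Note that pbcopy is macOS-only; on other systems, drop the "| pbcopy" part and copy the printed key by hand.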

python -c "from cryptography.fernet import Fernet; fernet_key= Fernet.generate_key(); print(fernet_key.decode())" | pbcopy
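Where the cryptography package or pbcopy is not at hand, a key of the same shape can be produced with the standard library alone. The sketch below relies on a Fernet key being 32 random bytes in URL-safe base64, which is what Fernet.generate_key() returns. The function name make_fernet_key is illustrative and not part of this repo.

```python
import base64
import os


def make_fernet_key() -> str:
    """Return a fresh Fernet-compatible key as text.

    Same shape as the output of cryptography's Fernet.generate_key():
    32 random bytes, URL-safe base64 encoded (44 characters).
    """
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


if __name__ == "__main__":
    # Print the key so it can be pasted into airflow.cfg by hand.
    print(make_fernet_key())
```

Each call returns a different key, so generate it once and reuse the same value in your configuration.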

Now edit ~/airflow/airflow.cfg. Make the following edits:

  • Set sql_alchemy_conn to sqlite:////Users/alex/airflow/airflow.db, replacing /Users/alex with the absolute path to your own home directory.
  • Set load_examples to False.
  • Set fernet_key to the value you just copied to the clipboard.
  • Set executor to SequentialExecutor.
  • Set dags_folder to the dags/ folder of your clone of this repo (the path used in the original setup was ~/Desktop/quilt-airflow/dags/).
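The edits above can also be applied from a script. The following is a minimal sketch, assuming all five keys live in the [core] section of airflow.cfg, which may not hold for every Airflow version. The helper names apply_airflow_settings and local_settings are hypothetical and not part of this repo.

```python
import configparser


def local_settings(home, dags_folder, fernet_key):
    """Build the five settings from the list above for a given home directory."""
    return {
        # An absolute home path yields the four-slash sqlite URL form.
        "sql_alchemy_conn": f"sqlite:///{home}/airflow/airflow.db",
        "load_examples": "False",
        "fernet_key": fernet_key,
        "executor": "SequentialExecutor",
        "dags_folder": str(dags_folder),
    }


def apply_airflow_settings(cfg_path, settings, section="core"):
    """Set each key in `settings` under `section` of the file at cfg_path.

    Caveat: configparser discards comments when it rewrites a file,
    so keep a backup of the original airflow.cfg.
    """
    # interpolation=None leaves literal '%' characters in values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(cfg_path)
    # Assumption: the target section is [core]; create it if it is missing.
    if not parser.has_section(section):
        parser.add_section(section)
    for key, value in settings.items():
        parser.set(section, key, str(value))
    with open(cfg_path, "w") as handle:
        parser.write(handle)
```

To use it, call apply_airflow_settings with the path to your airflow.cfg and the dictionary returned by local_settings, passing your own home directory, the path to the dags/ folder, and your Fernet key.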

Next, export your MTA access key to the mtakey environment variable:

export mtakey=$KEY_VALUE

If you do not have an access key you will need to create one. You can do so on the MTA website.
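Before starting Airflow, it can help to confirm that the variable is visible to Python. This is a small sketch that assumes the tasks read the key from the process environment; the helper name require_env is hypothetical.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise if unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"environment variable {name!r} is not set")
    return value
```

Calling require_env("mtakey") in the same shell session where you ran the export should return your key. A RuntimeError means the export did not take effect in that session.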

Make sure that you are authenticated with AWS. You will also need to change the buckets that the tasks in the dags/ folder write to, pointing them at buckets you have access to.

Start the scheduler process:

airflow scheduler

Start the webserver process:

airflow webserver

Navigate to localhost:8080 and you're ready to go.
