Illinois Campaign Cash loader

Loads Illinois campaign fundraising data.

Requirements

  • GNU Make
  • PostgreSQL
  • Python 3
  • aria2c
  • FTP access to Illinois State Board of Elections data

Install

Some kind of Python environment tool (e.g. pipenv or virtualenv) is highly recommended but not required.

pip install -r requirements.txt

Configure

You'll need to export or set some environment variables:

export ILCAMPAIGNCASH_FTP_USER=<USERNAME>
export ILCAMPAIGNCASH_FTP_PASSWD=<PASSWORD>
export ILCAMPAIGNCASH_DB_NAME=ilcampaigncash
export ILCAMPAIGNCASH_DB_ROOT_URL=postgres://localhost:5432/postgres
export ILCAMPAIGNCASH_DB_URL=postgres://localhost:5432/ilcampaigncash
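
Before running any Make task, it can help to confirm that all five variables are set. The following is a minimal sketch, not part of the loader: check_config is a hypothetical helper, and it assumes only that every variable listed above must be present and non-empty.

```python
import os

# Variable names taken from the exports above.
REQUIRED_VARS = (
    "ILCAMPAIGNCASH_FTP_USER",
    "ILCAMPAIGNCASH_FTP_PASSWD",
    "ILCAMPAIGNCASH_DB_NAME",
    "ILCAMPAIGNCASH_DB_ROOT_URL",
    "ILCAMPAIGNCASH_DB_URL",
)


def check_config(environ=None):
    """Return the required settings as a dict, or raise naming what is missing.

    Hypothetical helper for illustration; the loader itself does not ship it.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Calling check_config() with no arguments reads the current shell environment; passing a dict makes it easy to try out without touching real credentials.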

Loading the data

See the Makefile for all available tasks.

Load all

Download, process, and load.

make all

Clean

Wipes the database and files.

make clean

Download

Download the latest data.

make download

Advanced usage

Because of how Make schedules parallel jobs, loading in parallel takes two steps: run the setup targets serially, then run load_data with -j.

make create_db sql_init create_tables create_views && make -j 4 load_data

This creates the database and then loads the tables using up to four parallel jobs. Because the expenditure and receipts tables are orders of magnitude larger than the others, the performance gain is modest.
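
To see why, here is a rough back-of-the-envelope model. The per-table load times are made-up placeholders, not measurements, the table names other than receipts and expenditures are hypothetical, and the scheduler is a simplified stand-in for how make -j hands out jobs. Only the shape of the result matters.

```python
import heapq


def makespan(durations, jobs):
    """Estimate wall-clock time when tasks are handed to `jobs` workers.

    Each task goes to whichever worker frees up first, in the order given.
    This is a simplified model, not a description of Make's real scheduler.
    """
    workers = [0] * jobs  # time at which each worker becomes free
    heapq.heapify(workers)
    for duration in durations:
        free_at = heapq.heappop(workers)
        heapq.heappush(workers, free_at + duration)
    return max(workers)


# Hypothetical load times in seconds (placeholders, not measurements).
load_times = {
    "receipts": 600,
    "expenditures": 400,
    "table_a": 2,
    "table_b": 2,
    "table_c": 2,
    "table_d": 2,
}

serial = sum(load_times.values())
parallel = makespan(load_times.values(), jobs=4)
print(f"serial: {serial}s, 4 jobs: {parallel}s")
```

With these placeholder numbers the serial load takes 1008 seconds and the four-job load takes 600 seconds. The wall-clock time cannot drop below the largest single table, so adding jobs beyond that point does not help.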

How it works

This loader mimics the Illinois Sunshine extract-transform-load process.

It uses the efficient Postgres COPY command to load the raw data into a Postgres schema called raw. The raw tables are then cleaned and copied as materialized views into a public schema which substantially matches the Illinois Sunshine data model.
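
The two-stage pattern can be sketched as the SQL a loader of this kind might issue. This is an illustration under assumptions, not the project's actual SQL: the column names, the CSV-with-header input format, and the cleaning step (trimming whitespace) are all hypothetical. Only the COPY command, the raw schema, and the materialized views in the public schema come from the description above.

```python
import re

_IDENT = re.compile(r"[a-z_][a-z0-9_]*")


def _check(name):
    """Reject anything that is not a plain lowercase SQL identifier."""
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def copy_statement(table):
    """Build a COPY that bulk-loads CSV from standard input into raw.<table>.

    The CSV format and header row are assumptions made for this sketch.
    """
    return f"COPY raw.{_check(table)} FROM STDIN WITH (FORMAT csv, HEADER)"


def view_statement(table, columns):
    """Build a materialized view in public that trims each raw text column.

    Trimming stands in for whatever cleaning the real loader performs.
    """
    table = _check(table)
    selected = ", ".join(f"trim({_check(c)}) AS {c}" for c in columns)
    return (
        f"CREATE MATERIALIZED VIEW public.{table} AS "
        f"SELECT {selected} FROM raw.{table}"
    )


# Hypothetical column names, used only to show the generated statements.
print(copy_statement("receipts"))
print(view_statement("receipts", ["last_name", "amount"]))
```

Keeping the untouched source rows in raw and exposing cleaned copies through views means the cleaning logic can change without reloading the files.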

The loader does NOT handle updates. It could be adapted to, but this is not recommended: inserts are faster than updates and somewhat easier to parallelize, and static builds are cheaper and more reliable than dynamic sites. At the time of writing, the data itself is only updated daily.

Because we use a static build process, the created tables are not currently optimized: a few seconds' difference in query time has no impact on client performance. If you need better query performance, please add it and send a pull request.

Contributors

  • eads
