The ballot-cure from mcculleydj

Ingest Scripts

Process county and SoS data for the purposes of helping cure ballots. This was built targeting a GCP PSQL instance. You will need your own PSQL instance to run this without modifying the code. In the csv directory create at least one sub-directory named for a date in this format MM-DD (e.g. 10-06). In that sub-directory place at least one CSV named for a county (e.g. Polk.csv).

service.py: defines the Postgres context manager class, you will need to supply your own values for host, dbname, etc. either in the constructor call or as hardcoded defaults in this file
constants.py: exists to share common constants between other scripts and keep them out of the way
initialize.py: removes all logs then drops and recreates the voters table (executes schema.sql)
ingest_county.py: target a directory that is named following this format MM-DD containing one or more county CSV files to ingest this content into the database
- relies on logging both to the DB and to flat files to track changes over time to voter records
- CSVs should be ingested in chronological order from oldest to most recent
digest_sos.py: process the SoS daily CSV (does not yet interact with the persistent layer) to output the top 5 counties by number rejected and rejection rate. easily extended to answer specific questions, e.g. how many counties are reporting at least one rejected ballot?
schema.md: a description of the fields in the county CSVs, SoS CSV, and the schema defined in schema.sql

Table Parser

a demonstration using AWS (textract) to convert images of tabular data into a CSV. to run this you will need to supply your own AWS credentials and a test file converted into images. e.g. break Howard 10.08.pdf into 5 images files and use those image files as inputs to table_parser.py. early tests are promising that we can use this to convert scanned PDFs into usable data.

mcculleydj / ballot-cure Goto Github PK

ballot-cure's Introduction

Ingest Scripts

Table Parser

ballot-cure's People

Contributors

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent