code-for-nashville / inclucivics

Data visualization of Nashville Metropolitan Government employee salary and demographics

Home Page: http://www.codefornashville.org/inclucivics/

License: MIT License

CSS 7.06% JavaScript 85.93% HTML 7.01%

civic-tech data-viz diversity-measurement

inclucivics's Introduction

IncluCivics

IncluCivics is a data visualization app completed in partnership with the Human Relations Commission. It provides transparency on employee demographics within the Nashville Metropolitan Government.

Running

To run, make sure you have yarn installed, and run

yarn install && yarn start

to see the site live at http://localhost:3000.

This project is built using create-react-app. Check out the User Guide for more information about testing and building.

Adding Data

To add data:

Run yarn data:fetch to download the latest file to the input/ directory.
Run yarn data:import. This will produce new files in public/data/ that will be used in the Explore section of the graphs. Note: The import may fail because of changes to column names or data format. If you encounter a failure, please file an issue ✍️. This will also generate a summary of all files in the input/ directory.
Commit any changes and submit a pull request.

Deploying

Run yarn deploy. This will fail if you don't have push rights to the repository's gh-pages branch. The app is configured to use git over HTTPS. If you have two-factor authentication enabled on GitHub, you will need to create a personal access token with the 'public_repo' permission.

Contributing

Contributions are welcome. Look at the "Issues" tab to squash 🐛s, add features and suggest improvements. If you are new to open source, check out How to Contribute to Open Source for a rundown.

License

MIT

inclucivics's People

mshenfield gtback terryjbates luketlancaster jordanthomp81 moniquebt enlore topherhooper libardo1 jcockhren combinatorist jackmoch braindawg kazshak sarah-weatherbee staufferalexander heymonicakay akilah-littlejohn

inclucivics's Issues

Front end testing

Break it!

Fix link names on first page

Would be great to reduce the links for the open data portal to something not so ugly.

feature: Query socrata api checking for last update of demographics data.

feature: Write request for socrata API
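
A minimal sketch of what such a request could look like, assuming the dataset is still served through Socrata's standard SODA endpoint (`/resource/<dataset id>.json`). The dataset ID `4ibi-mxs4` comes from the open data portal link used elsewhere in this project; the function names and default page size are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumptions: domain and dataset ID taken from the portal URL for the
# General Government Employees Demographics dataset.
DOMAIN = "data.nashville.gov"
DATASET_ID = "4ibi-mxs4"


def resource_url(limit=1000, offset=0):
    """Build a SODA URL that pages through the dataset with $limit/$offset."""
    query = urlencode({"$limit": limit, "$offset": offset}, safe="$")
    return "https://%s/resource/%s.json?%s" % (DOMAIN, DATASET_ID, query)


def fetch_rows(limit=1000, offset=0):
    """Download one page of rows as a list of dicts (needs network access)."""
    with urlopen(resource_url(limit, offset)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Only `resource_url` is pure; `fetch_rows` depends on the portal being reachable and on the dataset still living at that ID.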

Add Web Analytics

We want to know if the public are using/visiting the site. inclucivics.com

We have two options:

Google Analytics
Piwik

Automating builds and deploys

This issue is in regards to the following:

Hook up this repo to the Sophicware-hosted Jenkins instance
~~[ ] Create Jenkins job for building AMIs within C4N AWS account~~
Set up continuous deployment

Sort graph items

Not sure what the deal is with this, but a simple sorting of values either alphabetically or by numeric value would be helpful.

@colbydehart You think you could handle this one?

Feature: Add parser to create data for timelapse analysis

This is currently loaded from a static json file. I'll need to preseed the database first and have the parser create an object from all available timestamps.

Handle dns setup

Heroku project setup.

@jchapin @JamesNix

Based on the discussions we had it seemed like the basic approach was to use mongo for the db and host the app using Heroku. I don't think MQL will be too difficult to pick up.

I started playing with Heroku, but am not particularly familiar with it. Does either of you want to try to set up a project that I can tag along on at the next hack night? I need a few Python libraries, but from what I read Heroku should support them.

Also, if you want to begin working on an interface and testing image rendering, check out Vega. The Vega editor has examples of JSON specifications for graphs that you could use for testing purposes.

Add project to website

Please add this to our website's projects page.

seed database with "previous" socrata imports

The major problem with switching out the backend here is that we've got to maintain the data we had when we were importing from text files, and put it in a format identical to that which is expected by Socrata. I'm thinking here that I'm just going to manually create a database that has that data, complete with timestamps, and will provide a RethinkDB backup to be imported if the new database does not exist. This way we should be able to continue on as if we had been importing data all along.

begin piping data.nashville.gov data to a staging ground/private data portal

Feature: add celery and associated tasks

I could pretty easily use celery to handle the periodic update request... seems like a bit of overkill for this, but ... I can set it up pretty fast. The real issue will be deployment.

Need all fields for actual demographics even when values are 0.

This makes calculating observed/expected values considerably easier.
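
One way to guarantee that is to fill in the gaps before any statistics are computed. This is only a sketch: the group names below are placeholders, not the categories used in the real dataset.

```python
def fill_missing_groups(counts, all_groups):
    """Return a count for every group, using 0 where the data has no entry."""
    return {group: counts.get(group, 0) for group in all_groups}


# Hypothetical category list; the real one would come from the dataset.
ALL_GROUPS = ["Group A", "Group B", "Group C"]
department_counts = {"Group A": 12, "Group C": 3}
complete_counts = fill_missing_groups(department_counts, ALL_GROUPS)
```

With every group present, the observed and expected values line up key for key.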

Figure out a metric for measuring changes over time

@mshenfield

Potential metrics...:

Chi-square. Compare the observed/expected ratios for Metro as a whole and plot the resulting chi-square value over time.

Advantages: Statistical backing, appropriate for this type of data, creates a single metric that can be plotted

Disadvantages: May be too sophisticated for the average user; how do we handle multiple income levels?

Some metric that relates the number of departments at or near the values predicted by census ?

Figure out how to sort the graph income ranges in a human logical way.

Currently the ranges are sorted as strings (machine order): 1, 10, 100, 2, 20, etc.
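
A possible fix is to sort on the first number found in each label instead of on the raw string. The sketch below assumes each label starts its range with a number (for example "$20,000-$39,999"); the actual label format in the app may differ.

```python
import re


def range_sort_key(label):
    """Sort key: the first integer in the label, with non-numeric labels last."""
    match = re.search(r"\d[\d,]*", label)
    if match is None:
        return (1, 0, label)
    return (0, int(match.group().replace(",", "")), label)


def sort_income_ranges(labels):
    """Return the labels ordered by the lower bound of each range."""
    return sorted(labels, key=range_sort_key)
```

For example, `sort_income_ranges(["100", "2", "10", "1", "20"])` returns `["1", "2", "10", "20", "100"]`.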

Hook up api to front end

Mobile responsiveness

@bdfinlayson

We need your help here. The mobile version of this project looks absolutely atrocious. We can't have public attention on this platform when it looks like this. It will reflect poorly on Code for Nashville and what we are able to do as an organization.

If you can rally some people to fix this, or handle it yourself, it would be deeply appreciated.

look at www.inclucivics.com to get an idea

python cleanup/refactor

The Python side of this is pretty rusty at this point. I'm going to go through and do some refactoring of that whole piece.

502 Bad Gateway

@jcockhren

http://52.1.163.16/

Update the paragraph on the home page

Content to Use

In January 2015, the Metro Human Relations Commission (MHRC) released the IncluCivics Report, analyzing the demographic makeup of 50 Metro Nashville departments. The data in the original report was provided by Metro Human Resources (Metro HR) in August 2014. Since then, Metro HR has provided more recent data (captured April 1, 2015) and has announced that updated data will be released quarterly. The original IncluCivics Report, and a recent and more robust Data Update are at https://www.nashville.gov/Human-Relations-Commission/IncluCivics.aspx.

This platform, graciously created and maintained free of charge by Code for Nashville, exists for two reasons. First, it is imperative to establish a baseline from which to assess our collective efforts at attaining a more diverse workforce in the future. Second, to further encourage transparency and public education, this platform will capture the demographic data provided quarterly by Metro HR, render it in user-friendly charts and graphs, and will track changes in the data over time.

The raw data used on this platform is available to the public and can be found at https://data.nashville.gov/Metro-Government/General-Government-Employees-Demographics/4ibi-mxs4

Feed HR data to brigade open data portal and have a data integrity check there

I think this is an excellent use case to begin playing with our own open data portal.

I downloaded the CSV from data.nashville.gov only to find that the key names had changed (surprise!) between the first dataset I got and the next. I resolved this temporarily by just editing the header myself, but obviously this is not a repeatable process.

We need to handle basic ETL issues (type serialization, missing values, key_names). I'd like to be pulling this data from a staging area rather than from CSVs.
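
As a rough illustration of the key-name part of that ETL step, the sketch below normalizes CSV headers and fails loudly when an expected column is missing. The alias table and required columns are hypothetical; the real ones depend on what the portal actually publishes.

```python
import csv
import io
import re

# Hypothetical mappings: old or variant header names -> canonical key.
HEADER_ALIASES = {"dept": "department", "annual_salary": "salary"}
REQUIRED_COLUMNS = {"department", "salary"}


def normalize_key(name):
    """Lowercase a header, collapse punctuation/spaces to "_", apply aliases."""
    key = re.sub(r"[^a-z0-9]+", "_", name.strip().lower()).strip("_")
    return HEADER_ALIASES.get(key, key)


def read_rows(csv_text):
    """Parse CSV text into dicts keyed by normalized header names."""
    reader = csv.reader(io.StringIO(csv_text))
    header = [normalize_key(name) for name in next(reader)]
    missing = REQUIRED_COLUMNS - set(header)
    if missing:
        raise ValueError("missing columns: %s" % ", ".join(sorted(missing)))
    return [dict(zip(header, row)) for row in reader]
```

Raising on a missing column turns a silent header change into an import failure that can be reported as an issue.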

Handle python path issues for deployment

Mod charts.js so it shows theoretical distributions for all race/ethnicities

Need a way to get back to the homepage from graphs

remove rethinkdb as a service

Makes more sense to just have batch jobs output static json files.

Add chart to show temporal changes in demographics

@mshenfield and I discussed this. It would be best to put this chart on the initial load, underneath the text snippet describing the intention of the site.

Should use highcharts and be a line graph of some sort. Even a preliminary with fake data would be progress at this point.

Add in HRC logo

Update backend to handle datasets over time

This is probably just going to be a simple extension of the simple import job we're using to bring in data right now.

Eventually we'll want to figure out how to integrate with Metro's Open Data Portal and perhaps utilize our own Socrata portal as a staging ground for cleaned data. If we can do a JSON request from our own portal we may be able to eliminate RethinkDB as a dependency altogether (which is to our advantage, as I believe I am the only person who uses it. Shame though, great NoSQL db).

Importing from `ntp.*` fails on deployment, runs locally

For some reason, calling an import statement like from ntp.{module or modules} import {something} fails on the deployment server. However, these same statements run without error on the VM from our Vagrantfile.

Removing all such statements in commit b755eee allowed the build to be deployed on the server.

The server and local versions are both running Python 2.7.6, and a fresh install is done each deployment.

Here are some example stack traces from the server deployment for commit 0d23ae7 where the ntp.* imports have not been removed. Haven't been able to recreate on the local VM running the sudo python {relative path e.g. /vagrant/ntp/}run_server.py from different directories.

Traceback (most recent call last):
  File "/opt/inclucivics/ntp/run_server.py", line 8, in <module>
    from app import app
  File "/opt/inclucivics/ntp/app/__init__.py", line 7, in <module>
    from api import departments, data
  File "/opt/inclucivics/ntp/app/api.py", line 3, in <module>
    from include.functions import rdb_get_data_by_department, rdb_get_department_names#, rdb_get_temporal_values
  File "/opt/inclucivics/ntp/app/include/functions.py", line 1, in <module>
    from ntp.data.include.rethinkdb.tables import RdbMostRecent, RdbChiMerged
ImportError: cannot import name RdbMostRecent

Traceback (most recent call last):
  File "/opt/inclucivics/ntp/run_server.py", line 2, in <module>
    from data.load import table_check
  File "/opt/inclucivics/ntp/data/load.py", line 2, in <module>
    import sanitize as clean
  File "/opt/inclucivics/ntp/data/sanitize.py", line 6, in <module>
ImportError: cannot import name filter_str

Traceback (most recent call last):
  File "/opt/inclucivics/ntp/run_server.py", line 2, in <module>
    from data.load import table_check
  File "/opt/inclucivics/ntp/data/load.py", line 1, in <module>
    import aggregate as agg
  File "/opt/inclucivics/ntp/data/aggregate.py", line 1, in <module>
    from ntp.project.common.helpers import merge_json_like, sortDict
ImportError: cannot import name merge_json_like
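
Since "cannot import name" means the module itself was found but did not expose that name (a circular import or a different file with the same module path can both cause this), one thing worth comparing between the server and the VM is which file each module actually resolves to. A small diagnostic sketch, with a hypothetical helper name:

```python
import importlib
import sys


def describe_import(module_name):
    """Report where a module was loaded from and which public names it has."""
    module = importlib.import_module(module_name)
    return {
        "file": getattr(module, "__file__", None),
        "names": sorted(n for n in dir(module) if not n.startswith("_")),
        "sys_path_head": sys.path[:3],
    }


# Demonstrated on a standard-library module; on the server one would pass
# e.g. "ntp.project.common.helpers" and diff the output against the VM.
info = describe_import("json")
```

If the "file" values differ between environments, or the expected name is absent from "names", that narrows the problem to path resolution rather than the code itself.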

Calculate Chi square values

Need to figure out an easy dependency for this or roll my own
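
If rolling our own, the statistic itself is small: sum (observed - expected)^2 / expected over all groups. Below is a dependency-free sketch, assuming Python 3 division, observed head counts per group, and census-derived expected shares that sum to 1. The function name and argument shapes are illustrative, not the project's actual code.

```python
def chi_square(observed, expected_share):
    """Pearson chi-square statistic for observed counts vs. expected shares.

    observed: dict of group -> head count
    expected_share: dict of group -> expected proportion (each must be > 0)
    """
    total = sum(observed.values())
    statistic = 0.0
    for group, share in expected_share.items():
        if share <= 0:
            raise ValueError("expected share for %r must be positive" % (group,))
        expected = total * share
        statistic += (observed.get(group, 0) - expected) ** 2 / expected
    return statistic
```

This returns only the statistic. Turning it into a p-value needs the chi-square distribution, which is where a dependency such as SciPy would come in.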

Fix graph column alignment

Hey @beck410

I've added the graphs in, but I think you've modded the primary div container so that it only has a single column, but there are two divs of graphs that need to be compared side by side. Could you check the add_graphs branch and change it so that the graphs are aligned left and right?

Expected graph is broken

For some reason the result graphs are not displaying any results! It is getting 0 back for all the expected values. See http://52.1.163.16/

Remove pandas dependency

Handle data upload using python

README Update

Update README to say:

'launch the python webserver available on localhost:8082'

See PR #75

Apologies for pushing to my master branch...

Need to combine or delete the departments that Jon and I previously discussed.

Correct the dates on the temporal graphs

Dates on the temporal graphs should be: 2014-August, 2015-January, and 2015-April

feature: Import and sanitize data from API

Known problems are mostly associated with type serialization.

Not uncommon to find the salary field with mixed string and floating types.

I.e.

"$37,000.00"
and "37000.00"

The simple answer is to split on ".", filter out non-numeric characters, and convert index 0 to an int.
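
That approach could be sketched as follows. The function name is hypothetical, and it assumes salaries are non-negative and that cents can be dropped.

```python
def parse_salary(raw):
    """Convert a salary such as "$37,000.00" or 37000.0 to whole dollars."""
    # Keep only the part before the decimal point.
    whole_part = str(raw).split(".")[0]
    # Drop "$", "," and anything else that is not an ASCII digit.
    digits = "".join(ch for ch in whole_part if ch in "0123456789")
    if not digits:
        raise ValueError("no numeric salary in %r" % (raw,))
    return int(digits)
```

Both `parse_salary("$37,000.00")` and `parse_salary("37000.00")` return `37000`.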

Change pie charts headings from "Actual Demographics" to "Metro Demographics."
Remove the income brackets from the "Census Predicted" pie charts
Correct the dates on the temporal graphs: Dates on the temporal graphs should be: 2014-August, 2015-January, and 2015-April