judges-education-map

Mapping Court of Appeals judges' schools

Final project for CJS Data and Databases class

  • Jupyter notebooks are in the main repository
  • Data output - Data exported from and imported into the Jupyter notebooks
  • Final map

Process:

  • Scrape data from the Federal Judicial Center using BeautifulSoup and Selenium
  • Webpages with all of the judges (past and present) from each circuit: https://fjc.gov/history/courts/u.s.-court-appeals-{court-name}-judges
  • Loop through these webpages to retrieve links for each judge's biography: https://fjc.gov/history/judges/{lastname-firstname-middlename}
  • Find each judge's name using BeautifulSoup, then use regex to extract the text between "Education:" and "Professional"; this outputs one long string with the judge's education
  • Output: 01-scrape.csv
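The extraction in this step can be sketched with the standard library alone. This is a minimal illustration, not the notebook's actual code: the function names, the `first-circuit` slug in the usage note, and the sample page text are all assumptions, and the real pages would first be fetched with Selenium and parsed with BeautifulSoup.

```python
import re

# URL pattern taken from the process description above.
# The slug substituted for {court_name} is an assumption.
CIRCUIT_URL = "https://fjc.gov/history/courts/u.s.-court-appeals-{court_name}-judges"

# Lazily capture everything between the two labels, including line breaks.
EDUCATION_RE = re.compile(r"Education:\s*(.*?)\s*Professional", re.DOTALL)


def circuit_url(court_name):
    """Build the listing URL for one circuit (hypothetical helper)."""
    return CIRCUIT_URL.format(court_name=court_name)


def extract_education(page_text):
    """Return the text between "Education:" and "Professional", or None."""
    match = EDUCATION_RE.search(page_text)
    return match.group(1) if match else None


# Made-up stand-in for the visible text of a biography page.
sample_text = (
    "Education:\n"
    "Yale College, A.B., 1868Columbia Law School, LL.B.,1870\n"
    "Professional Career:\n"
    "Private practice"
)
education = extract_education(sample_text)
print(education)
```

Calling `circuit_url("first-circuit")` fills the placeholder in the listing URL. Whether that slug matches the site's naming has not been checked here.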
  • Use regex to split the judges' names and educations into multiple columns
  • Example: the original string "Yale College, A.B., 1868Columbia Law School, LL.B.,1870" was split into the following columns:
    • school1: "Yale College, A.B."
    • year1: 1868
    • school2: "Columbia Law School, LL.B."
    • year2: 1870
  • Output: 02-split.csv
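One way to reproduce the split shown in the example is to treat every four-digit year as the end of a school entry. The pattern and function name below are assumptions for illustration. Entries without a year, or with a year range, would need extra handling.

```python
import re

# A school entry is any text followed by a comma and a four-digit year.
PAIR_RE = re.compile(r"(.+?),\s*(\d{4})")


def split_education(education):
    """Split one education string into school1/year1, school2/year2, ... columns."""
    columns = {}
    for index, match in enumerate(PAIR_RE.finditer(education), start=1):
        columns[f"school{index}"] = match.group(1).strip()
        columns[f"year{index}"] = int(match.group(2))
    return columns


row = split_education("Yale College, A.B., 1868Columbia Law School, LL.B.,1870")
print(row)
```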
  • Replace mentions of "Read law" with NaNs. These entries indicate legal training through apprenticeship, so no actual school is listed (mostly for older judges)
  • Use regex to remove degrees (B.A., Ph.D., LL.B., etc.)
    • Replaced the "St." in school names that start with it with "St###" to avoid accidentally splitting them in this process
    • Also renamed schools with periods in their names, like "Ohio Northern University, Warren G. Harding College of Law"
  • Renamed schools for the following reasons:
    • Consolidate "Harvard College," "Harvard School of Law" and "Harvard Business School" into "Harvard University"
    • Clean straggling punctuation
    • Align with our merge to get addresses for each school in Step 5
  • Output: 03-clean.csv
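The cleaning steps can be sketched as one small function. The degree pattern, the helper name, and the use of `None` in place of NaN are assumptions, and the "St. John's University" input in the usage note is a made-up example. With this pattern a lone "St." would not match anyway, so the "St###" placeholder is kept only to mirror the safeguard described above.

```python
import re

# Two or more short capitalised tokens, each ending in a period
# (A.B., LL.B., Ph.D.), plus any comma and spaces in front of them.
DEGREE_RE = re.compile(r",?\s*\b(?:[A-Z][A-Za-z]{0,2}\.){2,}")

# Consolidation named in the process description.
CANONICAL_NAMES = {
    "Harvard College": "Harvard University",
    "Harvard School of Law": "Harvard University",
    "Harvard Business School": "Harvard University",
}


def clean_school(raw):
    """Return a cleaned school name, or None where the notebook would use NaN."""
    if raw is None or raw.strip().lower() == "read law":
        return None
    protected = raw.replace("St.", "St###")         # shield "St." from the regex
    without_degrees = DEGREE_RE.sub("", protected)  # drop degree abbreviations
    restored = without_degrees.replace("St###", "St.")
    tidy = restored.strip(" ,;")                    # clean straggling punctuation
    return CANONICAL_NAMES.get(tidy, tidy)


print(clean_school("Harvard College, A.B."))
```

For example, `clean_school("St. John's University, LL.B.")` keeps the "St." prefix intact while dropping the degree.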
  • Count number of times each school appeared overall, across all circuits, for the table that appears on the final map
  • Count number of times each school appears in each circuit for the final map
  • Outputs: 04-format.csv and 04-format-table.csv
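Both counts can be illustrated with `collections.Counter`. The rows below are made up, and the notebooks may well have used pandas grouping instead. This sketch only shows the two levels of aggregation.

```python
from collections import Counter

# Made-up (circuit, school) rows standing in for the cleaned data.
rows = [
    ("First Circuit", "Harvard University"),
    ("First Circuit", "Harvard University"),
    ("First Circuit", "Yale University"),
    ("Second Circuit", "Yale University"),
    ("Second Circuit", "Columbia University"),
]

# Overall appearances per school, across all circuits (for the table).
overall_counts = Counter(school for _, school in rows)

# Appearances per (circuit, school) pair (for the map).
circuit_counts = Counter(rows)

print(overall_counts.most_common())
print(circuit_counts[("First Circuit", "Harvard University")])
```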
  • Data on schools downloaded from the Database of Accredited Postsecondary Institutions and Programs (DAPIP)
  • Renamed schools to align with the address data
    • The data on judges' educations include state systems with multiple locations, like University of Texas, so these defaulted to one school in the system, like University of Texas at Austin
    • International schools were left with NaN addresses
    • Schools that had closed were given either their former address, if it could be found, or the address of their city's City Hall
    • Also replaced the address for West Virginia State University, since the one provided in the DAPIP data geocodes incorrectly
  • Removed extra columns, keeping only the name of the school, the count, the circuit and the address
  • Output: 05-address.csv
  • Used HERE Geocoder API to get the latitude and longitude for each address
  • NaNs in the "address" column were automatically geocoded to a latitude of 6.48812 and a longitude of 2.6138; these values were then replaced with NaNs
  • Output: 06-geocode.csv
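The placeholder cleanup can be sketched as a small check on each coordinate pair. The function name and tolerance are assumptions, and no geocoder call is shown. Only the two placeholder values come from the description above.

```python
import math

# Coordinates returned when the "address" value was NaN.
SENTINEL_LAT = 6.48812
SENTINEL_LNG = 2.6138


def blank_sentinel(lat, lng, tolerance=1e-5):
    """Return (nan, nan) for the placeholder point, else the pair unchanged."""
    is_sentinel = (
        math.isclose(lat, SENTINEL_LAT, abs_tol=tolerance)
        and math.isclose(lng, SENTINEL_LNG, abs_tol=tolerance)
    )
    return (math.nan, math.nan) if is_sentinel else (lat, lng)


print(blank_sentinel(6.48812, 2.6138))
print(blank_sentinel(40.0, -75.0))  # arbitrary point, left untouched
```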
  • Create popup text (article), rollover text (headline), name (the school), names (group_name) and order (group_id) for the groups in the dropdown menu
  • Colors were set based on the circuit; circles were scaled by area
  • Output: 07-geojson.js, copied to the final map folder as geo-data.js
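A sketch of how one map feature might be assembled follows. The property names `article`, `headline`, `name`, `group_name` and `group_id` come from the list above. The `color` and `radius` properties, the palette, the `geoData` variable name, the text templates and the sample values are all assumptions. Scaling by area means the radius grows with the square root of the count.

```python
import json
import math

# Hypothetical palette; the real per-circuit colors are not given here.
CIRCUIT_COLORS = {"First Circuit": "#1f77b4", "Second Circuit": "#ff7f0e"}


def circle_radius(count, base_radius=4.0):
    """Scale by area: area grows with count, so radius grows with sqrt(count)."""
    return base_radius * math.sqrt(count)


def make_feature(school, circuit, group_id, count, lat, lng):
    """Build one GeoJSON point feature (coordinates are [longitude, latitude])."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lng, lat]},
        "properties": {
            "name": school,
            "headline": f"{school}: {count}",
            "article": f"{school} appears {count} time(s) in the {circuit}.",
            "group_name": circuit,
            "group_id": group_id,
            "color": CIRCUIT_COLORS.get(circuit, "#888888"),
            "radius": circle_radius(count),
        },
    }


# Made-up values for illustration only.
feature = make_feature("Yale University", "Second Circuit", 2, 4, 41.0, -73.0)
collection = {"type": "FeatureCollection", "features": [feature]}

# Wrap the GeoJSON in a JavaScript assignment so the map page can load it.
js_source = "var geoData = " + json.dumps(collection, indent=2) + ";"
print(js_source[:60])
```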
