historical-travel-of-us-secretary-state's Introduction

US Secretary Of State Travel Data

The US Secretary of State's website publishes an overview of the Secretary's official international travel. The data goes back continuously to 1905. Unfortunately, it is only available as a series of webpages, and individual rows may contain multiple cities.

This repository contains a single CSV file of this data. It also includes the Node.js scripts used to process the data. Hopefully this data is useful for visualization projects, statistical analyses, and other uses of geographic, time-based, historical data.

The Dataset

/data/destinations.csv

  • original_country: text string identifying country scraped from Secretary Of State website
  • original_city: text string identifying city scraped from Secretary Of State website.
  • original_date: text string identifying date interval scraped from Secretary Of State website.
  • description: text string describing the Secretary's travel arrangement.
  • country_modified_for_geo: text string based on original but modified for geocoding.
  • city_modified_for_geo: text string based on original but modified for geocoding.
  • date: unused; slated for removal.
  • sec_id: text string identifying the Secretary of State
  • sec_name: text string of Secretary's name
  • id: chronological integer of destinations
  • glat: latitude of location from Google
  • glon: longitude of location from Google
  • gcity: text string of what Google identifies as the city of the destination
  • gcountry: text string of what Google identifies as the country of the destination
  • isGeocoded: binary note that the destination was geocoded
  • split_added: binary note that the entry was interpolated from a single Secretary of State entry.
  • original_line: binary note that the entry generated multiple destinations, which have split_added as true
  • start_time: start time of the destination in milliseconds since epoch.
  • end_time: end time of the destination in milliseconds since epoch.
  • start_time_form: start time of the destination in UTC format.
  • end_time_form: end time of the destination in UTC format.
  • elapsed_days: number of days at the destination
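
As a rough illustration of how the time fields fit together, the sketch below parses a few rows and converts the epoch-millisecond values in start_time and end_time into dates. The sample rows are invented for illustration and are not taken from the dataset. The helper names (parseSimpleCsv, withDates) are hypothetical, and the naive comma split assumes no quoted fields; a proper CSV parser would be safer for the full file, since fields such as description may contain commas.

```javascript
// Hypothetical sample rows (NOT real data) using a subset of the documented columns.
const sampleCsv = [
  'id,sec_name,gcity,gcountry,start_time,end_time',
  '1,Example Secretary,Paris,France,0,172800000',
  '2,Example Secretary,London,United Kingdom,172800000,259200000',
].join('\n');

// Naive parser: assumes no quoted fields or embedded commas.
function parseSimpleCsv(text) {
  const [headerLine, ...lines] = text.trim().split('\n');
  const headers = headerLine.split(',');
  return lines.map((line) => {
    const cells = line.split(',');
    return Object.fromEntries(headers.map((h, i) => [h, cells[i]]));
  });
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Convert the millisecond timestamps to Date objects and derive a raw day count.
function withDates(row) {
  const start = new Date(Number(row.start_time));
  const end = new Date(Number(row.end_time));
  return { ...row, start, end, days: (end - start) / MS_PER_DAY };
}

const stays = parseSimpleCsv(sampleCsv).map(withDates);
for (const s of stays) {
  console.log(`${s.gcity}: ${s.start.toISOString()} to ${s.end.toISOString()} (${s.days} days)`);
}
```

The derived days value is only the raw difference between the two timestamps; it may not follow the same counting convention as the dataset's elapsed_days field.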

/data/secretaries.csv

  • sec_id: id of secretary. Same as in /data/destinations.csv
  • url: url of the page where destination data was scraped
  • name: full name of Secretary
  • years: years the Secretary served

/country-regions.csv

  • country: every country from the original_country field
  • id: rough approximation of world region (A: Americas, AS: Asia, AF: Africa, MD: Middle East, E: Europe)
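
One possible use of this file is to tally destinations by world region by looking up each destination's original_country. The sketch below shows the idea with invented rows that only mimic the shape of the two files; the function name countByRegion and the 'Unknown' fallback label are assumptions made for the example.

```javascript
// Hypothetical rows (NOT real data) shaped like country-regions.csv and destinations.csv.
const regionRows = [
  { country: 'France', id: 'E' },
  { country: 'Japan', id: 'AS' },
  { country: 'Mexico', id: 'A' },
];
const destinationRows = [
  { original_country: 'France' },
  { original_country: 'Japan' },
  { original_country: 'France' },
  { original_country: 'Atlantis' }, // deliberately unmatched, to show the fallback
];

// Region codes as documented for the id field.
const REGION_NAMES = { A: 'Americas', AS: 'Asia', AF: 'Africa', MD: 'Middle East', E: 'Europe' };

// Build a country -> region code lookup once, then reuse it for every destination.
const regionByCountry = new Map(regionRows.map((r) => [r.country, r.id]));

function countByRegion(destinations) {
  const counts = {};
  for (const d of destinations) {
    const code = regionByCountry.get(d.original_country);
    const name = REGION_NAMES[code] ?? 'Unknown';
    counts[name] = (counts[name] ?? 0) + 1;
  }
  return counts;
}

const regionCounts = countByRegion(destinationRows);
console.log(regionCounts);
```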

Generating Data

The dataset can be regenerated by running four Node.js scripts in order:

  1. scrape-raw.js: Pull in initial data and generate secretaries.csv
  2. fix-dates.js: Transform the date/interval text string to structured form
  3. split-and-clean-locs.js: Split original lines that list multiple cities over multiple days into separate destinations, provided the original line's stay lasted more than one day.
  4. geocode.js: Geocode the destinations with Google's geocoding service. Requires a free API key.
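
To make the splitting step more concrete, here is a minimal sketch of how a multi-city line might be divided into separate destinations. It is not the code in split-and-clean-locs.js: the comma delimiter, the even division of the stay across cities, and the function name splitDestination are all assumptions made for illustration. Only the field names and the more-than-one-day condition come from the descriptions above.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Sketch only: splits a multi-city row into one entry per city, dividing the
// stay evenly. The even split and the comma delimiter are assumptions.
function splitDestination(row) {
  const cities = row.original_city.split(',').map((c) => c.trim()).filter(Boolean);
  const duration = row.end_time - row.start_time;

  // Leave single-city rows and stays of one day or less untouched.
  if (cities.length < 2 || duration <= MS_PER_DAY) {
    return [{ ...row, original_line: false, split_added: false }];
  }

  const slice = duration / cities.length;
  const parts = cities.map((city, i) => ({
    ...row,
    original_city: city,
    start_time: row.start_time + i * slice,
    end_time: row.start_time + (i + 1) * slice,
    original_line: false,
    split_added: true,
  }));

  // Keep the source row, flagged as the line that produced the split entries.
  return [{ ...row, original_line: true, split_added: false }, ...parts];
}

// Hypothetical input row (NOT real data).
const exampleRow = {
  original_city: 'Paris, London',
  start_time: 0,
  end_time: 4 * MS_PER_DAY,
};
const splitRows = splitDestination(exampleRow);
console.log(splitRows);
```

In this sketch the source row is kept and flagged with original_line, while each generated entry carries split_added, mirroring how those two fields are described in the dataset section.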

Contributors

sciutoalex
