CTA Bus Data Analysis

CTA Bus Data Analysis is an ongoing project to analyze the performance of the Chicago Transit Authority's bus routes.

Visit the current project page: https://spencerchan.github.io/ctabus

Motivation

This project started because I kept having bad luck catching the bus. I wanted to know: is the bus schedule for CTA route 55 Garfield overpromising on its late-afternoon wait times, or am I just unlucky? Asked more precisely: is 20 minutes an unreasonable amount of time to wait for an eastbound 55 Garfield bus at the Garfield Red Line Station at 4 p.m. on a weekday? Since answering this question, the project has morphed into something larger and more ambitious.

Even in the face of declining ridership over the past decade, the bus remains a popular way to get around Chicago: the CTA provides over 800,000 bus rides every weekday, totaling nearly 250 million bus rides each year. In fact, bus rides account for over half of the CTA's daily and annual ridership. With so many Chicagoans depending on the bus each day, it is critical that bus service throughout the city is reliable, fast, and frequent. Dependable service also matters for encouraging more people to use public transit and for stemming further losses in ridership.

The goal of this project is not to offer suggestions to improve service, but to leverage data to highlight issues with the existing service. Some of the questions the project hopes to answer are: what are travel and wait times for each bus route, and how do these times compare to the scheduled service? Which routes experience bus bunching most frequently? Are there areas of the city with better bus performance than others? A secondary goal for the future is to offer a statistical approach to commuting by bus that improves upon individual experience or directions provided by a service like Google Maps.

Technologies Used

  • Python 2.7
  • Jupyter
  • pandas
  • D3.js
  • SQLite
  • HTML
  • JavaScript

Project Status

The project is actively collecting and analyzing data. Throughout 2019, bus location data will be gathered for all active CTA bus routes (over 120 routes) and will be processed and analyzed on a monthly basis.

Data Sources

Bus location data is gathered from the CTA Bus Tracker’s getvehicles API at a regular interval via a Python script run as a cron job. The Bus Tracker family of APIs provides near-real-time locations and estimated arrival times of all CTA buses. Because the API exposes only the most recent position of each vehicle, it must be polled regularly and the results archived.
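
The poll-and-archive loop described above might look roughly like the sketch below. It is not the project's actual script. The endpoint URL, the JSON layout (`bustime-response` containing a `vehicle` list), the `vid` and `rt` field names, and the `vehicles` table are all assumptions based on a reading of the public Bus Tracker documentation, so check them against the docs before relying on them. The sketch is written for Python 3, whereas the project lists Python 2.7.

```python
import json
import sqlite3
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; confirm against the Bus Tracker documentation.
BASE_URL = "http://www.ctabustracker.com/bustime/api/v2/getvehicles"


def build_url(api_key, routes):
    """Build a getvehicles request URL for a list of route IDs."""
    params = {"key": api_key, "rt": ",".join(routes), "format": "json"}
    return BASE_URL + "?" + urlencode(params)


def extract_rows(payload):
    """Pull (vid, rt, pdist, tmstmp) tuples out of a decoded JSON response.

    The nesting under "bustime-response" -> "vehicle" is an assumption.
    """
    vehicles = payload.get("bustime-response", {}).get("vehicle", [])
    return [(v["vid"], v["rt"], int(v["pdist"]), v["tmstmp"]) for v in vehicles]


def archive(conn, rows):
    """Append observations to a hypothetical `vehicles` table.

    The (vid, tmstmp) primary key plus INSERT OR IGNORE skips a position
    that was already stored by an earlier poll.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS vehicles ("
        "vid TEXT, rt TEXT, pdist INTEGER, tmstmp TEXT, "
        "PRIMARY KEY (vid, tmstmp))"
    )
    conn.executemany("INSERT OR IGNORE INTO vehicles VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def poll_once(conn, api_key, routes):
    """Fetch the latest positions once and archive them."""
    with urlopen(build_url(api_key, routes), timeout=10) as resp:
        payload = json.load(resp)
    archive(conn, extract_rows(payload))
```

A cron entry would then call something like `poll_once(sqlite3.connect("vehicles.db"), API_KEY, ["55"])` once per interval, where `vehicles.db` and `API_KEY` are placeholders.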

Supporting data, such as the locations of bus stops, also comes from the Bus Tracker API. Read the documentation to learn more about what data is available through the API and how to access it. Scheduled service data is obtained from the CTA's GTFS feed.

For the purposes of this project, the most important fields collected from getvehicles are the location of each vehicle along its route (pdist) and the time the vehicle was at that location (tmstmp). Those two fields, especially when combined with the bus stop location data, are enough to derive the speed of a bus, the wait time between two buses at a particular stop, the travel time between two stops, and more.
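
As an illustration of how much can be derived from pdist and tmstmp alone, here is a minimal Python 3 sketch of two of those calculations. It is not the project's processing code. It assumes that pdist is a distance in feet along the route, that tmstmp looks like `20190301 16:00`, and that each stop's distance along the route (`stop_pdist`) is already known from the stop location data. The function names are made up for this example.

```python
from datetime import datetime

# Assumed timestamp layout; adjust if the API returns seconds as well.
TMSTMP_FORMAT = "%Y%m%d %H:%M"


def parse_tmstmp(text):
    """Convert a tmstmp string into a datetime."""
    return datetime.strptime(text, TMSTMP_FORMAT)


def speed_mph(obs_a, obs_b):
    """Average speed between two (pdist_feet, tmstmp) observations of one bus."""
    (d0, t0), (d1, t1) = obs_a, obs_b
    seconds = (parse_tmstmp(t1) - parse_tmstmp(t0)).total_seconds()
    if seconds <= 0:
        raise ValueError("observations must be in increasing time order")
    feet_per_second = (d1 - d0) / seconds
    return feet_per_second * 3600 / 5280  # 5280 feet per mile


def time_at_stop(observations, stop_pdist):
    """Estimate when a bus passed stop_pdist by linear interpolation.

    `observations` is a time-ordered list of (pdist_feet, tmstmp) pairs for
    one bus. Returns None if no pair of observations brackets the stop.
    """
    parsed = [(d, parse_tmstmp(t)) for d, t in observations]
    for (d0, t0), (d1, t1) in zip(parsed, parsed[1:]):
        if d1 > d0 and d0 <= stop_pdist <= d1:
            fraction = (stop_pdist - d0) / (d1 - d0)
            return t0 + (t1 - t0) * fraction
    return None


def headway_minutes(bus_a, bus_b, stop_pdist):
    """Minutes between two buses passing the same stop, or None if unknown."""
    ta = time_at_stop(bus_a, stop_pdist)
    tb = time_at_stop(bus_b, stop_pdist)
    if ta is None or tb is None:
        return None
    return abs((tb - ta).total_seconds()) / 60
```

Linear interpolation treats the bus as moving at constant speed between two polls, which is a simplification: a bus that dwells at the stop between polls will have its arrival estimated slightly late. Travel time between two stops follows the same pattern, by subtracting two `time_at_stop` results for the same bus.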

Current Analysis

The current project website and its visualizations focus on bus location data collected during February and March 2017 from the 55 Garfield bus. At present, they show:

  • When buses are dispatched throughout the day and week
  • The time of day and location along the route where bus bunching occurs most often
  • Travel times between any two major stops broken down by time of day
  • Wait times between consecutive buses broken down by time of day

Future Plans

  • Create a brand new project website with pages providing analysis of each CTA bus route and analysis of bus service by Chicago neighborhood
  • Analyze how each bus route adheres to or deviates from its scheduled service
  • Improve project pages to be more mobile friendly
  • Add documentation and provide instructions for replicating analysis
  • Publish raw data

Featured Notebooks

Directory structure

├── data
│   └── processed				# final data sets used in visualizations and analysis
├── flask-bokeh_site				# old flask site with bokeh visualizations
├── notebooks					# jupyter notebooks with data analysis
├── references					# reference documents
├── scripts					# D3.js visualizations for current project page
├── src						# source code for project
│   ├── processing				# data processing scripts
│   └── remote					# scripts to scrape vehicle data from Bus Tracker API
├── .gitignore
├── index.html					# current landing page for project
├── LICENSE
└── README.md

Author

Spencer Chan - https://github.com/spencerchan

Acknowledgments

I started this project as a participant in the Spring 2017 ChiPy Mentorship Program. I want to thank my mentor, Matt Hall, for his support and guidance while I learned Python for the first time. Special thanks to the program director, Ray Berg, without whose hard work and organization, the program would not have been possible. Data was collected from the Chicago Transit Authority's Bus Tracker API.

License

Released under the GNU General Public License, version 3

