Standard Traffic Data

The first fully open-source repository for road traffic timeseries.

This repository is a mix of engineering, data science and knowledge. The underlying topic is road traffic timeseries analysis: vehicles (or, in general, objects) move in a road network, and it's worth spending time to study the data, to ask questions and to uncover patterns.

Everything you will see here is open-source and reproducible:

  • Data generation is reproducible via scripting and Docker files.
  • The generated data is published and can be downloaded by everybody (no registration required).
  • Analysis, studies and techniques are published in the form of articles and/or reproducible notebooks.

How to contribute

The real aim of this project is to involve as many people as possible. Whether you are an experienced engineer, a data scientist or a student (isn't everybody?), if you are interested in playing with these datasets then please go ahead and have fun. We hope you'll get in touch and collaborate, because we believe open-source is meant to produce and share knowledge. There are already some amazing people sharing their experience with us, and we'd love for you to be the next one.

If that's what you believe too, open an issue now and explain what ideas you have for your next article with this data.

But if for some reason you'd rather not, then you can simply download the data and use it for your own purpose. You don't need to ask permission. Take note of the licence though: it's MIT.

Repository structure

Here's how you can navigate this repository after you fork it.

  • knowledge/ is where you should start. The directory contains all articles and notebooks with the studies other contributors have made and published. Remember, you can be the next one!
  • std_traffic is a Python package. You can install it in editable mode with pip install -e . (the trailing dot is part of the command).
  • std_traffic/pipelines/ is where the software pipelines for data generation, processing and storage are (sort of ETL scripts).
  • std_traffic/utils/ contains Python functions that can be useful for a variety of things, mainly interacting with cloud storage and databases.
  • scripts/ contains ... executable scripts!
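
After installing the package, a quick way to confirm that Python can find it is to look up its import spec. This is only a minimal sketch: the helper name is_installed is hypothetical and not part of std_traffic.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be located by the import system."""
    # find_spec returns None when no top-level module of that name is found.
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Expected to be True only after the package has been installed.
    print("std_traffic importable:", is_installed("std_traffic"))
```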

The data

Whenever we generate or collect data, we publish it for everybody's benefit. Below is a list of all the datasets, or databases, that we have.

Principality of Monaco

These are timeseries of simulated road traffic data. The simulator used is SUMO, and the simulated city is the Principality of Monaco. We built on the previous work of researchers at the Communication Systems Department of Sophia-Antipolis, France. We took their (quite complex!) work and made it 100% reproducible with a Docker file. The story is told in the introduction of this article.

For a description of the data, read the introduction of this other article.

For more information about the ETL process, read this page.

Time horizon     File size   Download
4am - 6:30am     200 MB      link
4am - 7am        686 MB      link
4am - 8am        1.4 GB      link
4am - 8:30am     2 GB        link
4am - 9am        2.5 GB      link
4am - 10am       3.9 GB      link
4am - 11am       5.2 GB      link
4am - 12pm       6.2 GB      link
4am - 1pm        7 GB        link
4am - 2pm        7+ GB       link
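
The larger files may not fit comfortably in memory. One possible way to work with a downloaded CSV is to stream it row by row and keep only a uniform random sample (reservoir sampling). The sketch below uses only the standard library. It assumes the file has a header line and a known delimiter; the real column names and delimiter are not stated here, so check them against the downloaded file. The function name sample_rows is hypothetical.

```python
import csv
import random
from typing import Dict, List, Optional


def sample_rows(
    csv_path: str,
    k: int,
    seed: Optional[int] = None,
    delimiter: str = ",",  # assumption: adjust to the actual file
) -> List[Dict[str, str]]:
    """Return up to k rows drawn uniformly at random from a CSV file.

    The file is read one row at a time, so memory use grows with k,
    not with the size of the file.
    """
    rng = random.Random(seed)
    reservoir: List[Dict[str, str]] = []
    with open(csv_path, newline="") as handle:
        # Assumption: the first line of the file is a header.
        reader = csv.DictReader(handle, delimiter=delimiter)
        for index, row in enumerate(reader):
            if index < k:
                # Fill the reservoir with the first k rows.
                reservoir.append(row)
            else:
                # Keep the new row with probability k / (index + 1).
                slot = rng.randint(0, index)
                if slot < k:
                    reservoir[slot] = row
    return reservoir
```

For example, sample_rows("monaco.csv", 10000, seed=42) would return 10,000 rows as dictionaries keyed by the header names, where monaco.csv is a placeholder for whichever file you downloaded.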

We have also saved the same data in a database that is accessible over the internet. For statistical sampling and large data volumes, querying the database is a better approach than downloading a huge CSV. See this article for a usage example.
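
To illustrate why a database suits sampling, the sketch below draws a random subset with a single SQL query, so only the sampled rows reach the client. It is not the project's actual setup: an in-memory SQLite database stands in for the hosted one, and the table name traffic and its columns are hypothetical. The syntax for random ordering also varies between database engines.

```python
import sqlite3
from typing import List, Tuple


def build_demo_db(n_rows: int) -> sqlite3.Connection:
    """Create an in-memory stand-in database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE traffic (step INTEGER, vehicle_id TEXT, speed REAL)"
    )
    conn.executemany(
        "INSERT INTO traffic VALUES (?, ?, ?)",
        [(i, f"veh{i % 50}", float(i % 14)) for i in range(n_rows)],
    )
    conn.commit()
    return conn


def sample_rows_sql(conn: sqlite3.Connection, n: int) -> List[Tuple]:
    """Fetch n randomly chosen rows; the database does the shuffling."""
    cursor = conn.execute(
        "SELECT step, vehicle_id, speed FROM traffic "
        "ORDER BY RANDOM() LIMIT ?",
        (n,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    connection = build_demo_db(1000)
    for row in sample_rows_sql(connection, 5):
        print(row)
```

Ordering by a random value still makes the database scan the whole table, so on very large tables a different sampling strategy may be preferable.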

Maintaining the database is a bit expensive for us, especially because this is a nonprofit, self-funded project. Therefore, we don't disclose the host and password, to avoid bots.

But know this: if you request access and tell us what your idea is, we will definitely share the database credentials with you. No request has been rejected so far. Open an issue to start collaborating!

Project contributors (submit a PR if your name is missing!)

The list is in alphabetical order (by last names).

Contributors (GitHub handles): ruggerofabbiano, pgrandinetti, hrshtt, pedrohgv
