Socrata.jl

Socrata.jl is a Julia wrapper for accessing the Socrata Open Data API (http://dev.socrata.com) and importing data into a DataFrame. Socrata is an open data platform used by many local and state governments as well as by the federal government.

Here are just a few examples of Socrata datasets/repositories:

More Open Data Resources can be found here.

Installation

Pkg.clone("https://github.com/dreww2/Socrata.jl.git")

Basic Usage

The Socrata.jl API consists of a single function, socrata, which at a minimum takes a Socrata URL and returns a DataFrame:

julia> using Socrata

julia> df = socrata("http://soda.demo.socrata.com/resource/4334-bgaj")
100x9 DataFrame
|-------|--------------------|------------|---------|
| Col # | Name               | Eltype     | Missing |
| 1     | Source             | UTF8String | 0       |
| 2     | Earthquake_ID      | UTF8String | 0       |
| 3     | Version            | UTF8String | 0       |
| 4     | Datetime           | UTF8String | 0       |
| 5     | Magnitude          | Float64    | 0       |
| 6     | Depth              | Float64    | 0       |
| 7     | Number_of_Stations | Int64      | 0       |
| 8     | Region             | UTF8String | 0       |
| 9     | Location           | UTF8String | 0       |

The URL may be a Socrata API endpoint or the URL from the browser's address bar (in which case Socrata.jl will automatically attempt to parse the string into a usable format). For example, the following are all valid URLs for the same dataset:

Optional Arguments

Basic Arguments

There are several optional keyword string arguments:

  • app_token is your Socrata application token, which allows more API requests per unit of time.
  • limit is the number of rows to retrieve. The default is 100 and the maximum is 1,000 (Socrata's limit). To download a large dataset, set fulldataset=true (see below).
  • offset is the number of rows to skip before the first row returned (offset="0" starts at the first row).
  • fulldataset downloads the entire dataset, ignoring all query parameters, including limit, offset, and any of the Socrata Query Language (SoQL) arguments.
  • usefieldids is not yet implemented, but will substitute the default human-readable column headers with API field IDs.
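As a rough illustration of how limit and offset work together, the sketch below splits a request for a given number of rows into (limit, offset) windows of at most 1,000 rows each. The helper page_windows is hypothetical and not part of Socrata.jl; it only does the arithmetic and makes no claim about how fulldataset is implemented internally.

```julia
# Hypothetical helper (not part of Socrata.jl): split a request for `total`
# rows into (limit, offset) windows no larger than `page_size`.
function page_windows(total::Integer; page_size::Integer=1000)
    windows = Tuple{Int,Int}[]
    offset = 0
    while offset < total
        limit = min(page_size, total - offset)  # last window may be smaller
        push!(windows, (limit, offset))
        offset += limit
    end
    return windows
end

println(page_windows(2500))  # prints [(1000, 0), (1000, 1000), (500, 2000)]
```

Each window could then be passed along as strings, for example limit=string(l) and offset=string(o), matching the keyword string arguments described above.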

Socrata Query Language (SoQL) Arguments

Socrata.jl supports SoQL queries using the following arguments:

  • select
  • where
  • order
  • group
  • q
  • limit and offset as described above.

Note that any references to columns inside these arguments must reference the dataset's API Field ID, which can be found on any Socrata dataset page under Export => SODA API => Column IDs.
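For orientation, the SODA API expresses these clauses as URL query parameters prefixed with a dollar sign ($select, $where, $order, $group, $q, $limit, $offset). The sketch below shows how such a request URL could be assembled by hand. The helpers percent_encode and soql_url are hypothetical names written for illustration; they are not part of Socrata.jl, whose internals may differ.

```julia
# Hypothetical helper: percent-encode every byte except the RFC 3986
# unreserved characters (letters, digits, '-', '_', '.', '~').
function percent_encode(s::AbstractString)
    io = IOBuffer()
    for b in codeunits(s)
        c = Char(b)
        if b < 0x80 && (isletter(c) || isdigit(c) || c in ('-', '_', '.', '~'))
            print(io, c)
        else
            print(io, '%', uppercase(string(b, base=16, pad=2)))
        end
    end
    return String(take!(io))
end

# Hypothetical helper: append SoQL clauses to an endpoint as $name=value pairs.
# `clauses` is any iterable of name => value pairs, e.g. a Vector of Pairs.
function soql_url(endpoint::AbstractString, clauses)
    isempty(clauses) && return String(endpoint)
    parts = [string('$', name, '=', percent_encode(value)) for (name, value) in clauses]
    return string(endpoint, '?', join(parts, '&'))
end

url = "http://soda.demo.socrata.com/resource/4334-bgaj"
println(soql_url(url, ["select" => "magnitude, region", "where" => "magnitude > 5.5"]))
```

Pairs are passed as a vector of strings rather than as keyword arguments so that the clause order is preserved in the resulting URL.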

Examples

using Socrata

url = "http://soda.demo.socrata.com/resource/4334-bgaj"
token = "your_app_token_goes_here"

A basic query, getting the first 5 rows:

df = socrata(url, app_token=token, limit="5")

Skip the first 5 rows and get the next 5 (rows 6-10 of the data):

df = socrata(url, app_token=token, limit="5", offset="5")

Get only the first 10 rows and the Source, Earthquake_ID, Magnitude, and Region columns:

df = socrata(url, app_token=token, limit="10", select="source, earthquake_id, magnitude, region")

You can add multiple conditions within a single argument. For example, get only rows where magnitude is greater than 5.5 and depth is less than 30:

df = socrata(url, app_token=token, where="magnitude > 5.5 AND depth < 30")

Search for Hawaii in the dataset where Magnitude > 2 and only select certain columns:

df = socrata(url, app_token=token, q="hawaii", where="magnitude > 2", select="datetime, magnitude, region, location")

TODO

  • Add support for automatically getting API Field IDs
  • Implement better app_token system
  • Add support for JSON and XML

Contributors

  • drewgendreau
