Module 10: Introduction to Application Programming Interfaces

Overview

The term Application Programming Interface (API, for short) encompasses a broad set of utilities (protocols, tools) used for building software. In fact, the term API could appropriately describe some of the libraries we've been using in R, such as dplyr. The general pattern used by dplyr, and the steps necessary for using it, are referred to as its API. This is well described on Wikipedia:

the API describes and prescribes the expected behavior while the library is an actual implementation of this set of rules

In addition to exposing functions for building software, APIs are often built to expose data components. This learning module focuses on accessing data from APIs that expose data as their primary function.

Resources

REST APIs

APIs are often developed to provide a consistent way to access data from a complex data structure. In particular, Representational State Transfer (REST) APIs were designed to transfer data given a set of predefined protocols. REST APIs most commonly communicate data through HyperText Transfer Protocol, or HTTP. This allows developers to access different web endpoints to request data, and process it in a language / application of their choice.

For example, we'll use the Spotify API to query publicly available data from Spotify. The actual data structure contains massive amounts of information (artists, songs, playlists, users, listens, etc.) and the underlying structure of the data may change. By developing an API, Spotify allows developers to access their data in a consistent framework.

Accessing Data

To access data from an API, you'll need to navigate to the appropriate API endpoint (i.e., the URL that will return the information you're seeking). For example, try opening up the following URL in your web browser:

https://api.spotify.com/v1/search?q=adele&type=artist

When you open that URL, you should see the data structure returned by the REST API (more on that below). The URL you've entered constitutes a request to the Spotify API: it constructs a query (q=adele) and specifies what you're looking for (type=artist). To write a query, you'll need to read the API's documentation to learn how to request particular information. Note that Spotify now requires an access token for its Web API, so an unauthenticated request to this URL may return an authorization error rather than search results.

In R, you can make requests to a URL using the fromJSON function, which is part of the jsonlite package.

# Install jsonlite
install.packages('jsonlite')
library(jsonlite)

# Read data using the fromJSON function (part of jsonlite)
data <- fromJSON('https://api.spotify.com/v1/search?q=adele&type=artist')

In practice, you'll likely want to paste together the pieces of a search query rather than hard-coding the full URL:

# Base URL of API
base <- 'https://api.spotify.com/v1/search?'

# Parameters
search <- 'q=adele'
type <- '&type=artist'

# Query string
query_url <- paste0(base, search, type)
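
The pasting shown above can also be wrapped in a small helper function. The sketch below is one possible way to do it: the name build_search_url and its arguments are hypothetical (they are not part of jsonlite or the Spotify API), and it assumes that percent-encoding the query with base R's URLencode is enough to keep queries containing spaces valid.

```r
# Hypothetical helper (not part of jsonlite or the Spotify API):
# build a search URL from a query string and a result type.
build_search_url <- function(query, type = "artist",
                             base = "https://api.spotify.com/v1/search?") {
  # Percent-encode the query so characters such as spaces are URL-safe
  search <- paste0("q=", utils::URLencode(query, reserved = TRUE))
  type_param <- paste0("&type=", type)

  # Same pattern as above: base URL, then each parameter
  paste0(base, search, type_param)
}

# Produces the same URL as the hand-pasted version
query_url <- build_search_url("adele")
print(query_url)
```

The resulting string can then be passed to fromJSON in the same way as the hard-coded URL in the earlier example.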

Data Structure

Most REST APIs will return your data in JavaScript Object Notation (JSON) format. JSON format is a common structure for storing data using key-value pairs. Note, these values can be nested, so you can have many levels of a JSON object. Here's an example JSON object storing a set of items in a todo list:

// All data is stored in the same object
{
    "todos":{
        "one":{
            "description": "Do INFO 343 Homework",
            "status":"Incomplete",
            "urgency":"Low"
        },
        "two":{
            "description": "Do INFO 474 Homework",
            "status":"Incomplete",
            "urgency":"High"
        }
    }
}

The counterpart to a JSON object in R is a list of lists. A list can hold key-value pairs, where the names of its elements are the keys and the elements themselves are the values. As with JSON objects, lists can be nested: a value can itself be a list with its own set of key-value pairs.
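
To make the comparison concrete, here is a minimal sketch that builds the todo list from the JSON example by hand as a nested R list. The variable name todo_data is just an illustration. It uses only base R, so it shows the shape of the data rather than how jsonlite parses it.

```r
# Hand-built R equivalent of the todo JSON object shown above
todo_data <- list(
  todos = list(
    one = list(
      description = "Do INFO 343 Homework",
      status = "Incomplete",
      urgency = "Low"
    ),
    two = list(
      description = "Do INFO 474 Homework",
      status = "Incomplete",
      urgency = "High"
    )
  )
)

# The keys at each level are the element names
names(todo_data$todos)

# Drill into nested values with $ or with [[ ]]
todo_data$todos$two$urgency
todo_data[["todos"]][["one"]][["description"]]
```

Each additional `$` (or `[[ ]]`) steps one level deeper, in the same way that each nested pair of braces does in the JSON text.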

To practice requesting data from the Spotify API, see exercise-2.

Flattening Data

One challenge of working with data from APIs is making sure that the data is in the proper format. In order to ask questions of our datasets, we'll still want to put them in dataframes. Unfortunately, sometimes the nested nature of JSON data will result in unintended data structures. In particular, you may end up with a dataframe in which one of the columns is a dataframe. For example,

# Let's do something silly
people <- data.frame(names = c('Spencer', 'Jessica', 'Keagan'))
favorites <- data.frame(
                food = c('Pizza', 'Pasta', 'salad'),
                music = c('Bluegrass', 'Indie', 'Electronic')
            )
# Store dataframe column
people$favorites <- favorites

R will print the dataframe as if it were flat, but it isn't stored that way. If you View the dataframe created above, it will appear to have a favorites.food column, but people$favorites.food is actually NULL (the values are stored in people$favorites$food). Luckily, you're not the first person to encounter this issue, and it can easily be addressed with the flatten function, which is also part of the jsonlite package:

# Spread a dataframe into separate columns
people <- flatten(people)
people$favorites.food # this just got created
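
To see the problem for yourself, the sketch below recreates the nested dataframe and inspects it with base R only. The final step uses do.call(data.frame, ...) as a base R way to get a similar flat result. That step is an alternative shown for illustration, not the approach this module uses, and the variable name flat is arbitrary.

```r
# Recreate the nested dataframe from the example above
people <- data.frame(names = c("Spencer", "Jessica", "Keagan"))
people$favorites <- data.frame(
  food = c("Pizza", "Pasta", "salad"),
  music = c("Bluegrass", "Indie", "Electronic")
)

# Before flattening: only two columns, one of which is itself a dataframe
ncol(people)
is.data.frame(people$favorites)

# There is no top-level favorites.food column; the values live one level down
is.null(people$favorites.food)
people$favorites$food

# Base R alternative to jsonlite's flatten (illustrative assumption):
# passing the columns to data.frame() spreads the nested dataframe out
flat <- do.call(data.frame, people)
names(flat)
```

After this step, flat has three ordinary columns, and the nested columns carry the favorites. prefix in their names.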

To practice working with flattening data, see exercise-3.
