tidygoogleway

The googleway package provides some excellent and highly versatile methods for querying and analyzing data from the Google Maps APIs.

tidygoogleway builds on the functionality in googleway with a single purpose: to provide a tidy interface to the Google Places API. The functions in this package assume that you are starting with a dataframe/tibble of location data that you wish to enrich with data from Google Places.

Installation

You can install tidygoogleway from GitHub using the following command:

# You must have devtools installed first
devtools::install_github("joshmuncke/tidygoogleway")

Setup

To use this package you’ll need a Google Places API key. You can save this key to your environment variables using googleway::set_key and it will be automatically picked up by tidygoogleway.

googleway::set_key("<YOUR API KEY>")

Usage

The add_google_places function expects a dataframe with, at a minimum, fields containing the name and address of the locations you wish to enrich with Google Places data. It returns the dataframe with the relevant Places data appended, so it is pipe-able.

A Google Places search often returns multiple results. In that case, add_google_places performs a string similarity comparison between the location name and address you provide and the values returned from Google. If you also supply latitude and longitude fields, add_google_places factors geographic distance into this calculation too.
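To make the matching idea concrete, here is a minimal sketch of how candidates could be scored. It is an illustration under stated assumptions, not the package's actual implementation: the helper names (string_distance, haversine_km), the use of normalised Levenshtein distance, the way the two distances are combined, and all data values below are hypothetical.

```r
# Hypothetical illustration only; tidygoogleway's internals may differ.

# Normalised Levenshtein distance: 0 = identical, 1 = completely different.
string_distance <- function(a, b) {
  a <- tolower(a)
  b <- tolower(b)
  as.numeric(utils::adist(a, b)) / pmax(nchar(a), nchar(b), 1)
}

# Great-circle (haversine) distance in kilometres.
haversine_km <- function(lat1, lng1, lat2, lng2) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlng <- (lng2 - lng1) * to_rad
  h <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlng / 2)^2
  2 * 6371 * asin(pmin(1, sqrt(h)))
}

# The location we supplied (coordinates are made up for the example).
supplied <- list(
  name = "McDonalds",
  address = "2457 Lincoln Blvd, Venice, CA 90291",
  lat = 33.9900, lng = -118.4500
)

# Made-up candidates standing in for results returned by a Places search.
candidates <- data.frame(
  name = c("McDonald's", "McDonald's", "Burger King"),
  address = c(
    "2457 Lincoln Blvd, Venice, CA 90291, USA",
    "4680 Lincoln Blvd, Los Angeles, CA 90292, USA",
    "1919 Pico Blvd, Santa Monica, CA 90405, USA"
  ),
  lat = c(33.9905, 33.9800, 34.0200),
  lng = c(-118.4510, -118.4400, -118.4700),
  stringsAsFactors = FALSE
)

# Average the name and address distances, then combine with geo-distance
# using a geometric mean (the combination rule here is an assumption).
candidates$string_dist <- (
  string_distance(supplied$name, candidates$name) +
    string_distance(supplied$address, candidates$address)
) / 2
candidates$geo_km <- haversine_km(
  supplied$lat, supplied$lng, candidates$lat, candidates$lng
)
candidates$score <- sqrt(candidates$string_dist * candidates$geo_km)

# The candidate with the smallest combined distance is the best match.
best <- candidates[which.min(candidates$score), ]
```

Lower scores mean closer matches, so the candidate whose name, address and position are all nearest to the supplied values ends up in best.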

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(magrittr)
library(furrr)
#> Loading required package: future
library(purrr)
#> 
#> Attaching package: 'purrr'
#> The following object is masked from 'package:magrittr':
#> 
#>     set_names
library(tidygoogleway)

# The mcdonalds dataframe contains the name and address of 11 McDonalds locations in Los Angeles
mcdonalds %>% head(5)
#> # A tibble: 5 x 2
#>   name      address                                    
#>   <chr>     <chr>                                      
#> 1 McDonalds 2809 N Lincoln Blvd Santa Monica, CA 90405 
#> 2 McDonalds 4680 Lincoln Blvd, Los Angeles, CA 90292   
#> 3 McDonalds 2457 Lincoln Blvd, Venice, CA 90291        
#> 4 McDonalds 1540 2nd Ave, Santa Monica, CA 90405       
#> 5 McDonalds 2902 West Pico Blvd, Santa Monica, CA 90405

# Now add Google Places data to our dataframe
enriched <- mcdonalds %>% add_google_places(name, address, radar = FALSE)
#> The radar argument is now deprecated
#> The radar argument is now deprecated
#> The radar argument is now deprecated
#> The radar argument is now deprecated
#> The radar argument is now deprecated
#> The radar argument is now deprecated
#> The radar argument is now deprecated
#> The radar argument is now deprecated
#> The radar argument is now deprecated
#> The radar argument is now deprecated
#> The radar argument is now deprecated
enriched %>% select(name, address, google_place_id, google_rating)
#> # A tibble: 11 x 4
#>    name     address                     google_place_id       google_rating
#>    <chr>    <chr>                       <chr>                         <dbl>
#>  1 McDonal… 2809 N Lincoln Blvd Santa … ChIJG1-i-tm6woARShGW…           3.6
#>  2 McDonal… 4680 Lincoln Blvd, Los Ang… ChIJsWoNelHBwoARe6bL…           3.6
#>  3 McDonal… 2457 Lincoln Blvd, Venice,… ChIJo_SkgY26woARR0FJ…           3.5
#>  4 McDonal… 1540 2nd Ave, Santa Monica… ChIJIzmJms-kwoARsrO3…           3.6
#>  5 McDonal… 2902 West Pico Blvd, Santa… ChIJlaqRZhe7woARwM2i…           3.5
#>  6 McDonal… 2712 Santa Monica Blvd, Sa… ChIJEc3LTka7woARUaiX…           3.5
#>  7 McDonal… 11300 National Blvd, Los A… ChIJM7ZVgq67woARCeiE…           3.7
#>  8 McDonal… 10623 Venice Blvd, Los Ang… ChIJZ295STC6woARhE1O…           3.5
#>  9 McDonal… 3571 Rosecrans Ave, Hawtho… ChIJL_usBci1woARxjKJ…           3.8
#> 10 McDonal… 15810 Crenshaw Blvd, Garde… ChIJgzc3DaG1woARezwC…           3.9
#> 11 McDonal… 101 W Manchester Ave, Los … ChIJMb-tCr_JwoARwjL-…           3.6

By default, only the best matching location is returned, so the number of rows out is the same as the number of rows in. If you wish to override this behaviour and return multiple results, use .keep_all = TRUE.

Note that if you set .keep_all = TRUE you may end up with more rows than you started with. These can be filtered using the mean_distance column (the geometric mean of geo-distance and string distance) or the google_result_number column (the ordering of results from the Google Places API).
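As a rough sketch of that filtering step, the example below uses dplyr on a small made-up tibble that stands in for the multi-result output. Only the mean_distance and google_result_number column names come from the description above; the data values, and the assumption that name plus address identifies each original row, are hypothetical.

```r
library(dplyr)

# Made-up rows standing in for output that contains several
# Google Places candidates per input location.
multi <- tibble(
  name = c("McDonalds", "McDonalds", "McDonalds"),
  address = c(
    "2457 Lincoln Blvd, Venice, CA 90291",
    "2457 Lincoln Blvd, Venice, CA 90291",
    "1540 2nd Ave, Santa Monica, CA 90405"
  ),
  google_result_number = c(1L, 2L, 1L),
  mean_distance = c(0.42, 0.08, 0.11)
)

# Option 1: keep the candidate with the smallest mean_distance
# for each input location.
best_by_distance <- multi %>%
  group_by(name, address) %>%
  slice_min(mean_distance, n = 1, with_ties = FALSE) %>%
  ungroup()

# Option 2: keep whichever candidate Google ranked first.
first_from_google <- multi %>%
  filter(google_result_number == 1)
```

The two options can disagree: for the Venice address in this made-up data, Google's first result is not the one with the smallest mean_distance.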

Parallel processing

Often for these kinds of use cases you are iterating over a large number of locations. To speed this up (and provide progress visibility), add_google_places uses the furrr package.

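N.B. To make use of the parallel processing capabilities you must set plan(multiprocess) before running add_google_places. This syntax should work on Windows and Mac. Recent versions of the future package deprecate the multiprocess strategy in favour of multisession, so you may need plan(multisession) instead.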
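A minimal sketch of setting a plan before doing parallel work is shown below. It uses plan(multisession) because newer releases of the future package deprecate multiprocess; whether tidygoogleway behaves identically under multisession is an assumption. The slow_lookup function is a hypothetical stand-in for a per-location request and makes no API call.

```r
library(future)
library(furrr)

# Start background R sessions as workers. multisession works on
# Windows, macOS and Linux.
plan(multisession, workers = 2)

# Hypothetical stand-in for a slow per-location lookup.
slow_lookup <- function(x) {
  Sys.sleep(0.1)
  toupper(x)
}

# furrr distributes the calls across the workers set by plan().
results <- future_map_chr(c("a", "b", "c"), slow_lookup, .progress = TRUE)

# With a plan in place, the package call would follow the same pattern, e.g.
# enriched <- mcdonalds %>% add_google_places(name, address)

# Return to sequential processing when finished.
plan(sequential)
```

Setting the plan once at the top of a script keeps the parallel backend separate from the code that does the work, so the same script can be run sequentially by changing a single line.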
