BillPetti commented on July 24, 2024

The source data does not return game_id, but it does return game_pk, which can be used to discern between two games of a double header.
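As a rough illustration of that idea, here is a minimal base R sketch that numbers the games of a doubleheader by ranking each game_pk on its earliest sv_id. The toy data frame, its game_pk values, and the game_number column are made up for the example; they are assumptions, not output from the scraper.

```r
# Toy Statcast-like rows: two games between the same teams on one date.
# All values here, including the game_pk numbers, are invented for illustration.
x <- data.frame(
  game_date = rep("2016-05-07", 4),
  home_team = rep("CHC", 4),
  away_team = rep("WSH", 4),
  game_pk   = c(447001L, 447002L, 447001L, 447002L),
  sv_id     = c("160507_131500", "160507_190500",
                "160507_132000", "160507_191000"),
  stringsAsFactors = FALSE
)

# Earliest pitch timestamp for each game_pk within a date and home team.
first_pitch <- aggregate(sv_id ~ game_date + home_team + game_pk,
                         data = x, FUN = min)

# Order games by first pitch, then number them within date + home team.
first_pitch <- first_pitch[order(first_pitch$sv_id), ]
first_pitch$game_number <- ave(
  seq_len(nrow(first_pitch)),
  first_pitch$game_date, first_pitch$home_team,
  FUN = seq_along
)

# Attach the (hypothetical) game_number column back onto the pitch-level data.
x <- merge(x, first_pitch[, c("game_pk", "game_number")], by = "game_pk")
```

This assumes sv_id sorts chronologically as a string, which holds for the YYMMDD_HHMMSS pattern used in the toy data.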

ssp3nc3r commented on July 24, 2024

Thanks. I've written a function that adds a PITCHf/x gameday_link to the Statcast data frame returned by your scraping function. Feel free to use and improve:

#' Create PITCHf/x gameday_link for Statcast
#'
#' This function allows you to add a PITCHf/x gameday_link to Statcast data.
#' @param x a data frame with Statcast variables: game_date, home_team, away_team, sv_id, game_pk
#' @keywords MLB, sabermetrics, Statcast, PITCHf/x, gameday_link
#' @importFrom plyr ddply
#' @export
#' @examples
#' \dontrun{
#' stopifnot(require(baseballr))
#' x <- scrape_statcast_savant_batter_all("2016-05-07", "2016-05-07")
#' x <- sc_add_gameday_link(x)
#' }

sc_add_gameday_link <- function(x) {
  
  # create lookup to convert team abbreviations to those used by PITCHf/x
  abb <- data.frame(
    px.name = c("ari","atl","bal","bos","chn","cin","cle","col","cha","det",
                "hou","kca","ana","lan","mia","mil","min","nyn","nya","oak",
                "phi","pit","sdn","sea","sfn","sln","tba","tex","tor","was"),
    sc.name = c("ARI","ATL","BAL","BOS","CHC","CIN","CLE","COL","CWS","DET",
                "HOU","KC","LAA","LAD","MIA","MIL","MIN","NYM","NYY","OAK",
                "PHI","PIT","SD","SEA","SF","STL","TB","TEX","TOR","WSH"), 
    stringsAsFactors = FALSE)
  
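  # attach PITCHf/x abbreviations for the home and away teams
  x <- merge(x, abb, by.x = "home_team", by.y = "sc.name")
  names(x)[names(x) == "px.name"] <- "px.home"
  x <- merge(x, abb, by.x = "away_team", by.y = "sc.name")
  names(x)[names(x) == "px.name"] <- "px.away"

  # sort by sv_id after merging, since merge() does not preserve row order;
  # the first game_pk in each group below is then the earlier game
  x <- x[order(x$sv_id), ]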

  # group by game_date and home_team, create link
  x <- plyr::ddply(x, 
             c("game_date", "home_team"), 
             transform, 
             gameday_link = paste0("gid_", gsub("-", "_", game_date), "_", 
                                   px.away, "mlb_", 
                                   px.home, "mlb_",
                                   ifelse(game_pk == game_pk[1], 1, 2))
             , stringsAsFactors = FALSE)
  
  # remove temp variables in dataframe
  x <- within(x, rm(px.home, px.away))

  # return original Statcast data plus gameday_link
  return(x)
}
