ebbr's Issues

Extending ebbr past binomial distributions

Sorry this is not really an issue but I did not know of a better way to contact you.

I'm working through your book Empirical Bayes: Examples from Baseball Statistics, and it's a huge blast. However, I've been having a hard time extending the concepts in the book and ebbr past success/total analyses.

I can extrapolate the book and package to something like K% (strikeouts/batters) or SwStr% (swinging strikes/pitches thrown) for pitchers, but I get conceptually tripped up by more complex, composite metrics like wOBA. Can ebbr be applied to metrics like this? If not, would you be able to point me in the direction of something in the same vein?

Also, do you have any idea of the implications of using ebb_fit_prior() fitted values in composite metrics?

prior_subset not working in combination with method = "gamlss"

Hello,

I tried to pass prior_subset to add_ebb_estimate in combination with method = "gamlss", as below:

eb_career_ab <- career %>%
  add_ebb_estimate(H, AB, method = "gamlss",
                   prior_subset = AB >= 500,
                   mu_predictors = ~ log10(AB))

However, this gives the following error:

Error in lm.wfit(X[onlydata, , drop = FALSE], y, w) : 
  incompatible dimensions
In addition: Warning message:
`data_frame()` is deprecated as of tibble 1.1.0.
Please use `tibble()` instead.

As a workaround, I did succeed in fitting the prior with ebb_fit_prior on a subset of the data and then calling augment(prior, newdata = full_dataset) on the complete dataset.

Thanks.

CRAN?

Is there a reason this was never released to CRAN?

ebb_fit_prior error when using beta-binomial regression

While debugging the error below, I found that it seems to be caused by the call parameters <- broom::tidy(fit) in ebb_fit_prior().

# recreating chapter 11 of David Robinson's Introduction to Empirical Bayes

library(Lahman)
#> Warning: package 'Lahman' was built under R version 4.1.3
library(ebbr)
library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 4.1.2
#> Warning: package 'ggplot2' was built under R version 4.1.2
#> Warning: package 'tibble' was built under R version 4.1.2
#> Warning: package 'tidyr' was built under R version 4.1.2
#> Warning: package 'readr' was built under R version 4.1.2
#> Warning: package 'purrr' was built under R version 4.1.2
#> Warning: package 'dplyr' was built under R version 4.1.2
#> Warning: package 'stringr' was built under R version 4.1.2
#> Warning: package 'forcats' was built under R version 4.1.2
theme_set(theme_light())

# grab career batting average of non-pitchers
pitchers <- 
  Pitching %>% 
  group_by(playerID) %>% 
  summarise(gamesPitched = sum(G)) %>% 
  filter(gamesPitched > 3)

# add player names
player_names <- 
  People %>% 
  tibble %>% 
  select(playerID, nameFirst, nameLast, bats) %>% 
  unite(name, nameFirst, nameLast, sep = " ")

career_full <- 
  Batting %>% 
  filter(AB > 0) %>% 
  anti_join(pitchers, by = "playerID") %>% 
  group_by(playerID) %>% 
  summarise(H = sum(H), AB = sum(AB), year = mean(yearID)) %>% 
  inner_join(player_names, by = "playerID") %>% 
  filter(!is.na(bats))

career <- 
  career_full %>% 
  select(-bats, -year)

# solve this with beta-binomial regression
eb_career_ab <- 
  career %>% 
  ebb_fit_prior(H, AB, method = "gamlss",
                mu_predictors = ~ log10(AB))
#> Warning in summary.gamlss(x): summary: vcov has failed, option qr is used instead
#> ******************************************************************
#> Family:  c("BB", "Beta Binomial") 
#> 
#> Call:  
#> gamlss::gamlss(formula = form, family = fam, data = tbl, sigma.predictors = sigma_predictors) 
#> 
#> 
#> Fitting method: RS() 
#> 
#> ------------------------------------------------------------------
#> Mu link function:  logit
#> Mu Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) -1.694982   0.009005 -188.23   <2e-16 ***
#> log10(AB)    0.193192   0.002768   69.79   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> ------------------------------------------------------------------
#> Sigma link function:  log
#> Sigma Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  -6.3316     0.0225  -281.3   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> ------------------------------------------------------------------
#> No. of observations in the fit:  10240 
#> Degrees of Freedom for the fit:  3
#>       Residual Deg. of Freedom:  10237 
#>                       at cycle:  5 
#>  
#> Global Deviance:     74715.37 
#>             AIC:     74721.37 
#>             SBC:     74743.07 
#> ******************************************************************
#> Error: $ operator is invalid for atomic vectors

Created on 2022-05-31 by the reprex package (v2.0.1)

ebbr doesn't allow for custom functions

Hi, I'm trying to write a custom function that computes ebb_fit_prior estimates for many columns in a data frame.

I'm running into a lot of issues passing variable column names to ebbr from inside a custom function. I've tried using !!as.symbol(), but some people on reddit said this might not work, given the base R NSE that you might be using in this code.

Can you suggest a way to do this?

library(tidyverse)
library(Lahman)
library(ebbr)

career <- Batting %>%
  filter(AB > 0) %>%
  anti_join(Pitching, by = "playerID") %>%
  group_by(playerID) %>%
  summarize(H = sum(H), AB = sum(AB)) %>%
  mutate(average = H / AB)

# this works
career %>%
  ebbr::ebb_fit_prior(H, AB)

# function that I can use to make a bunch of estimates
make_eb_estimate = function(data, success, total, method = "mle"){
  fitted = data %>%
    ebb_fit_prior(x = success, n = total, method = method) %>%
    augment() %>%
    .$.fitted
}

# this does not work
career %>%
  make_eb_estimate("H", "AB")
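One general base R pattern for this kind of problem is to build the call with bquote() so that the column names arrive as bare symbols, then evaluate it. The sketch below is self-contained and does not call ebbr: nse_ratio() is a hypothetical stand-in for a function that captures its column arguments unevaluated, and call_with_names() is an illustrative wrapper name. Whether ebb_fit_prior() accepts calls built this way has not been verified here.

```r
# Hypothetical stand-in for an NSE function: it captures x and n
# unevaluated and looks them up as columns of tbl.
nse_ratio <- function(tbl, x, n) {
  x_val <- eval(substitute(x), tbl)
  n_val <- eval(substitute(n), tbl)
  sum(x_val) / sum(n_val)
}

# Illustrative wrapper: takes column names as strings, converts them to
# symbols, splices them into the call with bquote(), then evaluates it.
call_with_names <- function(data, success, total) {
  eval(bquote(nse_ratio(data, .(as.name(success)), .(as.name(total)))))
}

toy <- data.frame(H = c(1, 2, 3), AB = c(4, 4, 4))
result <- call_with_names(toy, "H", "AB")
print(result)
```

Under the same assumption, the wrapper in the question would replace nse_ratio(data, ...) with the ebb_fit_prior() call, keeping the column names as strings at the call site.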

Cannot install on Windows 7 Home Premium under latest R (3.5.1), using latest devtools and install_github

`
R version 3.5.1 (2018-07-02) -- "Feather Spray"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Welcome at Sat Dec 01 13:41:05 2018

library(devtools)
install_github("dgrtwo/ebbr")
Downloading GitHub repo dgrtwo/ebbr@master

checking for file 'C:\Users\Jan\AppData\Local\Temp\RtmpchhTfy\remotes19f824eb4e27\dgrtwo-ebbr-4b9747d/DESCRIPTION' ...

checking for file 'C:\Users\Jan\AppData\Local\Temp\RtmpchhTfy\remotes19f824eb4e27\dgrtwo-ebbr-4b9747d/DESCRIPTION' ...

√ checking for file 'C:\Users\Jan\AppData\Local\Temp\RtmpchhTfy\remotes19f824eb4e27\dgrtwo-ebbr-4b9747d/DESCRIPTION'

  • preparing 'ebbr':
    checking DESCRIPTION meta-information ...

    checking DESCRIPTION meta-information ...

√ checking DESCRIPTION meta-information

  • checking for LF line-endings in source and make files and shell scripts

  • checking for empty or unneeded directories

  • building 'ebbr_0.1.tar.gz'

Welcome at Sat Dec 01 13:41:33 2018

  • installing source package 'ebbr' ...
    ** R
    ** byte-compile and prepare package for lazy loading
    ** help
    *** installing help indices
    converting help for package 'ebbr'
    finding HTML links ... done
    add_ebb_estimate html
    add_ebb_prop_test html
    ebb_fit_mixture html
    ebb_fit_prior html
    ebb_mixture_tidiers html
    ebb_prior_tidiers html
    h html
    logLik.ebb_mixture html
    logLik.ebb_prior html
    model.frame.ebb_prior html
    print.ebb_mixture html
    print.ebb_prior html
    reexports html
    ** building package indices
    ** testing if installed package can be loaded
    *** arch - i386
    *** arch - x64

Welcome at Sat Dec 01 13:41:54 2018

Goodbye at Sat Dec 01 13:41:55 2018
ERROR: loading failed for 'i386'

  • removing 'C:/Program Files/R/R-2.13.1/library/ebbr'
    In R CMD INSTALL
    Error in i.p(...) :
    (converted from warning) installation of package ‘C:/Users/Jan/AppData/Local/Temp/RtmpchhTfy/file19f8474d1c01/ebbr_0.1.tar.gz’ had non-zero exit status

2018-12-01_134145
`

Failure on Missing gamlss.data package

Using ebbr for the first time on a reasonably clean R 3.3.3 install, the calculation fails because the gamlss.data package is missing:

> trainer_sr_bbr <- trainer_sr %>%
+   ebbr::add_ebb_estimate(wins, runs, method = "gamlss", mu_predictors = ~ log10(runs))
Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : 
  there is no package called 'gamlss.data'

The ebbr package installs fine and I notice gamlss and gamlss.dist are Imports in the DESCRIPTION file. Should gamlss.data also be added here?

add_ebb_estimate works correctly after manually installing the gamlss.data package.

Install Fails on Missing 'psych' Package

Attempting to install ebbr this morning on a Windows 7 machine, I encountered a build error caused by the missing psych package.

> devtools::install_github("dgrtwo/ebbr")
Downloading GitHub repo dgrtwo/ebbr@master
from URL https://api.github.com/repos/dgrtwo/ebbr/zipball/master
Installing ebbr
"C:/PROGRA~1/R/R-33~1.3/bin/x64/R" --no-site-file --no-environ --no-save --no-restore --quiet CMD INSTALL  \
  "C:/Users/phillc/AppData/Local/Temp/RtmpMVzQV6/devtools1dcc44f5442d/dgrtwo-ebbr-4b9747d"  \
  --library="C:/Program Files/R/R-3.3.3/library" --install-tests 

* installing *source* package 'ebbr' ...
** R
** tests
** preparing package for lazy loading
Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) : 
  there is no package called 'psych'
ERROR: lazy loading failed for package 'ebbr'
* removing 'C:/Program Files/R/R-3.3.3/library/ebbr'
Error: Command failed (1)

This is a reasonably clean R install, with a very small collection of external libraries installed. On my main Linux machine, running R 3.3.2, which has many additional libraries installed, ebbr installed fine.

Manually installing the psych package fixes the issue, and ebbr subsequently installs without error.

I notice psych is not listed as an Import or Suggest in the DESCRIPTION file.

ebbr fails with unhelpful error when all observations are 0

When all observations (k) are 0, ebbr fails with an unhelpful error message:

> ebbr::add_ebb_estimate(data.frame(k=rep(0, 10), n=sample(100, 10)), k ,n)

 Error in if (!all(lower <= start & start <= upper)) { : 
  missing value where TRUE/FALSE needed

If at least one observation is nonzero, it works fine:

> ebbr::add_ebb_estimate(data.frame(k=c(1, rep(0, 9)), n=sample(100, 10)), k ,n)
   k  n   .alpha1   .beta1     .fitted       .raw         .low       .high
1  1 30 1.5202724 293.3042 0.005156533 0.03333333 3.829061e-04 0.015922746
2  0 74 0.5202724 338.3042 0.001535522 0.00000000 1.958268e-06 0.007564209
3  0 52 0.5202724 316.3042 0.001642147 0.00000000 2.094575e-06 0.008088588
4  0 97 0.5202724 361.3042 0.001437914 0.00000000 1.833525e-06 0.007084076
5  0 95 0.5202724 359.3042 0.001445906 0.00000000 1.843738e-06 0.007123393
6  0 58 0.5202724 322.3042 0.001611626 0.00000000 2.055554e-06 0.007938499
7  0 41 0.5202724 305.3042 0.001701212 0.00000000 2.170101e-06 0.008379020
8  0 39 0.5202724 303.3042 0.001712411 0.00000000 2.184422e-06 0.008434081
9  0 38 0.5202724 302.3042 0.001718066 0.00000000 2.191654e-06 0.008461884
10 0 44 0.5202724 308.3042 0.001684686 0.00000000 2.148968e-06 0.008297763
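Until the package reports this case more clearly, a caller could guard against it before fitting. The sketch below is a hypothetical helper, not part of ebbr, and it assumes that the failure comes from the degenerate all-zero input shown above; the function name and messages are illustrative.

```r
# Hypothetical pre-check (not part of ebbr): fail early with a readable
# message when the success counts cannot support a prior fit.
check_counts <- function(x, n) {
  if (anyNA(x) || anyNA(n)) {
    stop("x and n must not contain missing values")
  }
  if (any(x < 0) || any(x > n)) {
    stop("each success count must be between 0 and its total")
  }
  if (all(x == 0)) {
    stop("all success counts are 0; the prior fit is degenerate for these data")
  }
  invisible(TRUE)
}

# One nonzero observation passes the check.
ok <- check_counts(c(1, 0, 0), c(30, 74, 52))

# The all-zero case stops with a descriptive message instead.
err <- tryCatch(
  check_counts(rep(0, 3), c(30, 74, 52)),
  error = function(e) conditionMessage(e)
)
print(err)
```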

ebb_fit_mixture error

I'm running into errors using the example code in the documentation for the ebb_fit_mixture function.

First, by_row appears to be from purrrlyr, which is not loaded at the top of the example. But even with this loaded, I get the following error:

# simulate some data
set.seed(2017)
sim_data <- data_frame(cluster = 1:2,
                       alpha = c(30, 35),
                       beta = c(70, 15),
                       size = c(300, 700)) %>%
  by_row(~ rbeta(.$size, .$alpha, .$beta)) %>%
  unnest(p = .out) %>%
  mutate(total = round(rlnorm(n(), 5, 2) + 1),
         x = rbinom(n(), total, p))

mm <- ebb_fit_mixture(sim_data, x, total)

Error: `.x` must be a vector, not a function
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
All elements of `...` must be named.
Did you want `data = c(id, x, n)`? 

I ran last_error() and got:

<error/purrr_error_bad_type>
`.x` must be a vector, not a function
Backtrace:
  1. ebbr::ebb_fit_mixture(sim_data, x, total)
 24. purrr:::stop_bad_type(...)
Run `rlang::last_trace()` to see the full context.

Any tips on how to proceed?
