
BTYDplus


The BTYDplus R package provides advanced statistical methods to describe and predict customers' purchase behavior. It uses historical transaction records to fit a probabilistic model, which then allows computing quantities of managerial interest at the cohort as well as at the customer level (Customer Lifetime Value, Customer Equity, P(alive), etc.).

This package complements the BTYD package by providing several additional buy-till-you-die models that have been published in the marketing literature, but whose implementations are complex and non-trivial. These models are: NBD, MBG/NBD, BG/CNBD-k, MBG/CNBD-k, Pareto/NBD (HB), Pareto/NBD (Abe) and Pareto/GGG.

Installation

# install.packages("devtools")
devtools::install_github("mplatzer/BTYDplus", dependencies=TRUE)
library(BTYDplus)

Getting Started

demo("cdnow")        # Demonstration of fitting various models to the CDNow dataset
demo("mbg-cnbd-k")   # Demonstration of MBG/CNBD-k model with grocery dataset
demo("pareto-abe")   # Demonstration of Abe's Pareto/NBD variant with CDNow dataset
demo("pareto-ggg")   # Demonstration of Pareto/NBD (HB) & Pareto/GGG model with grocery dataset

Implemented Models

These R source files extend the functionality of the BTYD package by providing functions for parameter estimation and scoring for NBD, MBG/NBD, BG/CNBD-k, MBG/CNBD-k, Pareto/NBD (HB), Pareto/NBD (Abe) and Pareto/GGG.

  • NBD: Ehrenberg, Andrew S. C. "The pattern of consumer purchases." Applied Statistics (1959): 26-41. doi:10.2307/2985810
  • MBG/NBD: Batislam, E. P., M. Denizel, A. Filiztekin. "Empirical validation and comparison of models for customer base analysis." International Journal of Research in Marketing 24(3) (2007): 201-209. doi:10.1016/j.ijresmar.2006.12.005
  • (M)BG/CNBD-k: Reutterer, T., M. Platzer, N. Schroeder. "Leveraging purchase regularity for predicting customer behavior the easy way." International Journal of Research in Marketing (2020). doi:10.1016/j.ijresmar.2020.09.002
  • Pareto/NBD (HB): Ma, Shao-Hui, and Jin-Lan Liu. "The MCMC approach for solving the Pareto/NBD model and possible extensions." Third International Conference on Natural Computation (ICNC 2007), Vol. 2. IEEE, 2007. doi:10.1109/ICNC.2007.728
  • Pareto/NBD (Abe): Abe, Makoto. "Counting your customers one by one: A hierarchical Bayes extension to the Pareto/NBD model." Marketing Science 28.3 (2009): 541-553. doi:10.1287/mksc.1090.0502
  • Pareto/GGG: Platzer, Michael, and Thomas Reutterer. "Ticking away the moments: Timing regularity helps to better predict customer activity." Marketing Science (2016). doi:10.1287/mksc.2015.0963
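For orientation, a typical end-to-end workflow along the lines of the package demos might look as follows. This is a sketch, not a substitute for the demos; the function names are those exported by BTYDplus:

```r
library(BTYDplus)

# Convert an event log (cust, date) into a customer-by-sufficient-statistic
# summary, holding out everything after the calibration cutoff.
data("groceryElog")
cbs <- elog2cbs(groceryElog, T.cal = "2006-12-31")

# Fit the MBG/CNBD-k model via maximum likelihood.
params <- mbgcnbd.EstimateParameters(cbs)

# Predict holdout transactions and P(alive) per customer.
cbs$xstar.pred <- mbgcnbd.ConditionalExpectedTransactions(
  params, T.star = cbs$T.star, x = cbs$x, t.x = cbs$t.x, T.cal = cbs$T.cal)
cbs$palive <- mbgcnbd.PAlive(params, cbs$x, cbs$t.x, cbs$T.cal)
```

The other models follow the same pattern, with their respective `*.EstimateParameters` (or `*.mcmc.DrawParameters`) counterparts.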

Contributions

We certainly welcome all feedback and contributions to this package! Please use GitHub Issues for filing bug reports and feature requests, and provide your contributions in the form of Pull Requests. See also these general guidelines to contribute to Open Source projects on GitHub.

Contributors

jackiep00, jakeruss, michaelchirico, mplatzer, paolorais, rqueraud


BTYDplus Issues

increase code coverage to 100%

  • test MBG/NBD
  • test MBG/CNBD-k
  • test Pareto/NBD (HB) Ma/Liu
  • drop unused C++ methods: post_gamma, slice_sample_gamma, post_mvnorm, slice_sample_mvnorm
  • etc...

could not find function "plotSampledTimingPatterns"

On a fresh BTYDplus install I am not able to run demo("mbg-cnbd-k"). I get the following error message:

# Plot timing patterns of a few sampled customers
plotSampledTimingPatterns(groceryElog, T.cal = "2006-12-31")
Error in eval(expr, envir, enclos) :
  could not find function "plotSampledTimingPatterns"

Rcpp warning

warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
if (add < 1e-8 & i >= 100) {

slice_sample_cpp loop does not finish

structure(list(cust = c(103523979L, 106539747L, 109669649L, 114536229L, 
115893543L, 115959927L, 117271474L, 121815850L, 123091742L, 123545940L, 
126225453L, 126396424L, 126844622L, 126963045L, 132218991L, 133320753L, 
134979819L, 135327695L, 136081180L, 136651599L, 137087039L, 138027869L, 
138717722L, 139202404L, 141638965L, 142495136L, 143705614L, 145281013L, 
147186313L, 147408159L, 149208624L), x = c(0, 0, 0, 0, 1, 0, 
0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 4, 0, 0, 0, 
1, 0, 0, 0), t.x = c(0, 0, 0, 0, 20.1428571428571, 0, 0, 0, 0, 
0, 0, 3, 0, 0, 0.285714285714286, 0, 0, 16.8571428571429, 15.4285714285714, 
0, 0, 0, 0, 9.28571428571429, 0, 0, 0, 16.1428571428571, 0, 0, 
0), litt = c(0, 0, 0, 0, 3.00284974132285, 0, 0, 0, 0, 0, 0, 
1.09861228866811, 0, 0, -1.25276296849537, 0, 0, 2.82477447541035, 
2.73622107806891, 0, 0, 0, 0, 2.06327660482648, 0, 0, 0, 2.78147766965703, 
0, 0, 0), sales = c(154.98, 849.98, 269.94, 54.98, 499.43, 249.99, 
106.98, 944.97, 36.67, 349.97, 24.99, 2149.93, 257.71, 889.95, 
37.97, 29.99, 349.99, 1494.92, 594.92, 977.93, 479.99, 59.99, 
204.96, 7566.69, 54.98, 839.97, 14.99, 2914.91, 192.98, 208.94, 
1029.92), sales.x = c(0, 0, 0, 0, 162.98, 0, 0, 0, 0, 0, 0, 869.95, 
0, 0, 29.98, 0, 0, 274.94, 149.99, 0, 0, 0, 0, 1698.89, 0, 0, 
0, 139.99, 0, 0, 0), first = structure(c(12059, 12056, 12070, 
12079, 12075, 12056, 12077, 12066, 12058, 12070, 12059, 12078, 
12062, 12054, 12055, 12054, 12055, 12071, 12069, 12056, 12077, 
12064, 12062, 12066, 12082, 12080, 12082, 12069, 12062, 12069, 
12059), class = "Date"), T.cal = c(23.8571428571429, 24.2857142857143, 
22.2857142857143, 21, 21.5714285714286, 24.2857142857143, 21.2857142857143, 
22.8571428571429, 24, 22.2857142857143, 23.8571428571429, 21.1428571428571, 
23.4285714285714, 24.5714285714286, 24.4285714285714, 24.5714285714286, 
24.4285714285714, 22.1428571428571, 22.4285714285714, 24.2857142857143, 
21.2857142857143, 23.1428571428571, 23.4285714285714, 22.8571428571429, 
20.5714285714286, 20.8571428571429, 20.5714285714286, 22.4285714285714, 
23.4285714285714, 22.4285714285714, 23.8571428571429), T.star = c(74.7142857142857, 
74.7142857142857, 74.7142857142857, 74.7142857142857, 74.7142857142857, 
74.7142857142857, 74.7142857142857, 74.7142857142857, 74.7142857142857, 
74.7142857142857, 74.7142857142857, 74.7142857142857, 74.7142857142857, 
74.7142857142857, 74.7142857142857, 74.7142857142857, 74.7142857142857, 
74.7142857142857, 74.7142857142857, 74.7142857142857, 74.7142857142857, 
74.7142857142857, 74.7142857142857, 74.7142857142857, 74.7142857142857, 
74.7142857142857, 74.7142857142857, 74.7142857142857, 74.7142857142857, 
74.7142857142857, 74.7142857142857), x.star = c(0L, 0L, 0L, 1L, 
0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 0L, 6L, 1L, 0L, 
0L, 1L, 0L, 2L, 0L, 2L, 0L, 2L, 0L, 0L, 1L), sales.star = c(0, 
0, 0, 54.9, 0, 155.97, 0, 0, 0, 299.94, 0, 379.98, 0, 210.76, 
149.99, 0, 0, 1597.82, 389.98, 0, 0, 393.92, 0, 399.97, 0, 1764.45, 
0, 1899.94, 0, 0, 49.99)), .Names = c("cust", "x", "t.x", "litt", 
"sales", "sales.x", "first", "T.cal", "T.star", "x.star", "sales.star"
), row.names = c(NA, -31L), class = "data.frame")

pggg.draws <- pggg.mcmc.DrawParameters(cbs, mcmc=2500, burnin=500, chains = 1, thin=50)

results in

Error in pggg_slice_sample("lambda", x = data$x, tx = data$t.x, Tcal = data$T.cal,  :
  slice_sample_cpp loop did not finish

Dockerfile

More a proposal than an issue: I wrote a Dockerfile for BTYDplus; any interest in integrating it into the project?

auto-guess `T.tot` in `bgcnbd.PlotTrackingInc`

and allow skipping the inc argument, as well as allowing longer time horizons to be plotted

cdnow <- cdnow.sample()
elog  <- cdnow$elog
cbs   <- cdnow$cbs
(params <- bgcnbd.EstimateParameters(cbs))
inc <- elog2inc(elog)
bgcnbd.PlotTrackingInc(params, cbs$T.cal, T.tot = 78, inc) # check e.g. T.tot = 100

also, reconsider the logic for by=7 in elog2cum

interactive online demo of BTYDplus

Let's supplement the R package with a Shiny app to demo the provided BTYDplus functionality (see this tutorial). As a second step, the Shiny App could then be hosted at www.shinyapps.io or at our own Shiny Server.

UI Flow:

  • upload transaction data (with cust and date fields) and specify calibration period cutoff
  • or generate random transaction data following a model's assumption and specified parameters - the parameters should have reasonable defaults already provided
  • select the model (NBD, Pareto/NBD, BG/CNBD-k, MBG/CNBD-k, Pareto/GGG)
  • estimate parameters
  • estimate conditional expected transactions
  • plot aggregate level (plotTrackingInc, plotTrackingCum)
  • report MAE error
  • report frequency distribution for holdout

add methods to MCMC models

we should have:

  • mcmc.ExpectedCumulativeTransactions
  • mcmc.Expectation
  • mcmc.PlotTrackingCum
  • mcmc.PlotTrackingInc
  • mcmc.PlotFrequencyInCalibration

data.table.rdb is corrupt error on installing BTYDplus

I've run into this problem before, and recall solving it, but am at a loss today.

I get this error when loading BTYDplus (library(BTYDplus))

Error: package or namespace load failed for 'data.table' in get(Info[i, 1], envir = env):
  lazy-load database '/home/mark/R/x86_64-pc-linux-gnu-library/3.4/data.table/R/data.table.rdb' is corrupt
In addition: Warning message:
In get(Info[i, 1], envir = env) : internal error -3 in R_decompress1

I've tried removing data.table and BTYDplus, and reinstalling, but no luck.

I've installed with devtools::install_github("mplatzer/BTYDplus", dependencies=TRUE)

add example for estimating monetary component

> setDT(elog)
> spends <- elog[, .(m.x=mean(sales), x=.N), by=cust]
> spend.params <- spend.EstimateParameters(spends[m.x>0, m.x], spends[m.x>0, x])
> cbs$avg.spend <- spends$m.x
> cbs$avg.spend.star <- spend.expected.value(spend.params, spends$m.x, spends$x)
> cbs
       cust x       t.x     litt    T.cal T.star x.star avg.spend avg.spend.star
   1: 10002 0  0.000000 0.000000 33.71429     39      0  23.72333       24.70481
   2: 10020 0  0.000000 0.000000 33.71429     39      0  37.55500       36.74995
   3: 10041 0  0.000000 0.000000 33.71429     39      0   6.79000       13.84195

bgcnbd.PlotTrackingCum / mbgcnbd.PlotTrackingCum

Just downloaded the latest version of package via command:
devtools::install_github("mplatzer/BTYDplus", dependencies=TRUE)

While I am trying to call subj functions I got an error:
Error in .Call("_BTYDplus_xbgcnbd_exp_cpp", PACKAGE = "BTYDplus", params, :
"_BTYDplus_xbgcnbd_exp_cpp" not available for .Call() for package "BTYDplus".

It seems like this function is not public anymore.

Add "m.x" to elog2cbs() output

Calculation of the average spend per repeat transaction in the calibration period ("m.x") cannot be derived from the cbs created by elog2cbs(), because "sales" appears to be a vector of TOTAL sales (i.e. the first transaction is NOT removed).

It would be great to provide "sales.x" (repeat sales only, first transaction removed) so one can calculate m.x = sales.x / x; and/or provide "m.x" directly.
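Assuming elog2cbs() were extended to also return a sales.x column as requested, the measure could then be derived in one line (a sketch; the column names are the ones proposed above, not yet part of the package):

```r
# Hypothetical: average spend per repeat transaction, given a cbs that
# carries repeat sales ("sales.x") alongside the repeat count ("x").
# Customers without repeat transactions (x == 0) get m.x = 0.
cbs$m.x <- ifelse(cbs$x > 0, cbs$sales.x / cbs$x, 0)
```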

TODO

  • add clumpy measure
  • add plot_samples, plot_totals,..
  • handle large cohorts (1M+ customers)? check out pnbd.compress.cbs
  • extract individual-level draws from draws$level_1?

Python version

Hi, I wonder if there is a python version of this package.
Thanks,
Nahue

Use of table in xbgcnbd.PlotFrequencyInCalibration (and mcmc.PlotFrequencyInCalibration)

Hi!

First, thanks a lot for the package. Really useful stuff here.

I noticed what I think is a bug in xbgcnbd.PlotFrequencyInCalibration (and potentially in mcmc.PlotFrequencyInCalibration, but I didn't use it directly).
The behavior at the line
x_act <- table(x_act)
is not correct, in my view, when a member of the series 0:censor is missing from the table, e.g. if censor is 52 and nobody made 20, 35, or 45 repeat transactions.
It does not fill these missing frequencies with 0, so x_act ends up shorter than x_est, and due to the way R recycles vectors the resulting matrix will be ... strange (I got a warning in RStudio, but not in Visual Studio).
It can be fixed by using:
x_act <- table(factor(x_act, levels=c(0:censor)))

See this toy example to understand what I mean:

censor=25

x_act <- c(1:10, 20:30)
x_act[x_act > censor] <- censor
x_act <- table(x_act)
x_est <- 0:censor
mat <- matrix(c(x_act, x_est), nrow = 2, ncol = censor + 1, byrow = TRUE)
mat

x_act2 <- c(1:10, 20:30)
x_act2[x_act2 > censor] <- censor
x_act2 <- table(factor(x_act2,levels=c(0:censor)))
x_est <- 0:censor
mat2 <- matrix(c(x_act2, x_est), nrow = 2, ncol = censor + 1, byrow = TRUE)
mat2

I know this is an edge case, but I encountered it analysing some 'small' data subsets and thought it was worth raising here.
I'd gladly submit a pull request if you want.

Cheers,
Sébastien

PS: this is actually my first post on GitHub... please don't hate me if I'm too long-winded...

avoid returning of data.tables

the R community is not necessarily familiar with data.table, so let's ensure that methods always return data.frames, and that we don't use data.table syntax in demo/help code
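One low-effort way to do this, if the internals keep using data.table, is to strip the data.table class just before returning (a sketch of the pattern, not existing package code; the wrapper name is made up for illustration):

```r
# Sketch: keep data.table for fast internal aggregation, but hand a plain
# data.frame back to the user. as.data.frame() drops the data.table class.
elog2cbs_wrapped <- function(...) {
  cbs <- elog2cbs(...)   # may be a data.table internally
  as.data.frame(cbs)     # guarantee a plain data.frame for callers
}
```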

can't use BTYDplus package(c++ function)

param.draws <- pnbd.mcmc.DrawParameters(data,mcmc = 20000, burnin = 3000, thin = 20, chains = 1, param_init =list(r=1,alpha=0.5,s=0.5,beta=1)) # short MCMC to run demo fast

Error in .Call("_BTYDplus_slice_sample_gamma_parameters", PACKAGE = "BTYDplus", :
"_BTYDplus_slice_sample_gamma_parameters" not available for .Call() for package "BTYDplus"

mcmc.PlotTrackingCum throws error for Pareto/NBD (Abe)

(bug report thanks to Keith Botner)

cdnowElog <- read.csv(system.file("data/cdnowElog.csv", package = "BTYD"),
                      stringsAsFactors = FALSE,
                      col.names = c("cust", "sampleid", "date", "cds", "sales"))

cdnowElog$date <- as.Date(as.character(cdnowElog$date), format = "%Y%m%d")
cbs <- elog2cbs(cdnowElog, T.cal = "1997-09-30", T.tot = "1998-06-30")

first <- aggregate(sales ~ cust, cdnowElog, function(x) x[1] * 10^-3)
names(first) <- c("cust", "first.sales")

cbs <- merge(cbs, first, by = "cust")

draws.m2 <- abe.mcmc.DrawParameters(cbs,
                                    covariates = c("first.sales"),
                                    mcmc = 7500, burnin = 2500)
weekly_cum_repeat <- elog2cum(cdnowElog)
cum_vs_exp_repeat <- mcmc.PlotTrackingCum(draws.m2,
                                          T.cal = cbs$T.cal,
                                          T.tot = round(max(cbs$T.cal + cbs$T.star)),
                                          weekly_cum_repeat,
                                          sample_size = 10000)

throws

Error in covars %*% params$beta : 
  requires numeric/complex matrix/vector arguments

mbg-cnbd-k: ConditionalExpectedTransactions - why length >= 100

this may be a dumb one... The model file mbg-cnbd-k.R has the ConditionalExpectedTransactions function, which contains code to correct for bias. My question is: why length >= 100? Is this a rule of thumb? What's the rationale behind it? It makes the forecasts conditional on the length of the vector, and not only on the customers' past behaviour. Am I missing something?

I also do not understand the comment: "Only do so, if we can safely assume that the full customer cohort is passed."

# Adjust bias of the BG/NBD-based approximation by scaling via the
# unconditional expectations (for which we have an exact expression). Only do
# so if we can safely assume that the full customer cohort is passed.
do.bias.corr <- k > 1 && length(x) == length(t.x) &&
  length(x) == length(T.cal) && length(x) >= 100

PGGG: lower the upper bound for k slice sampling

When drawing pggg parameters I tend to get a limited (0.5 %) cluster of cases with mean k's between 985 and 999, with absolutely no mean k between 50 and 985 (Expected aggregate k is around 0.8).

The only thing seemingly setting these cases apart is that they're somewhat regular but rather short-lived, they're probably somewhat overrepresented and must be confusing the algorithm.

The upper bound for k slice sampling is set at 1000, which at first sight doesn't seem like a realistic value in any scenario. I suspect a limit of around 100 would be safer, and would adjust the algorithm accordingly.

Speaking of assumptions, it occurred to me that k's aggregate distribution is more likely to follow a lognormal than a gamma distribution. Even in the clumpiest of scenarios, extremely low k's remain less likely than values around 0.5, with a few higher k cases always remaining quite likely, a situation the gamma distribution doesn't allow for.

add missing BTYD plots

  • pnbd.PlotFreqVsConditionalExpectedFrequency
  • pnbd.PlotRecVsConditionalExpectedFrequency

Error when running Pareto NBD

When I run Pareto/NBD, it throws an error:

customer_by_sufficient_statistic_pnbd <-
  transaction_data %>%
  elog2cbs(elog = ., units = "days")

params_pnbd <- BTYD::pnbd.EstimateParameters(customer_by_sufficient_statistic_pnbd)
names(params_pnbd) <- c("r", "alpha", "s", "beta")
round(params_pnbd, 3) 
Error in optim(logparams, pnbd.eLL, cal.cbs = cal.cbs, max.param.value = max.param.value,  : 
  L-BFGS-B needs finite values of 'fn'

My data is transaction data sample:

knitr::kable(head(transaction_data, 50))
|cust          |date                | sales|
|:-------------|:-------------------|-----:|
|000000000000o |2016-08-24 06:36:28 |  3000|
|00000025      |2016-08-02 10:14:31 |  1000|
|00000027      |2016-08-02 13:18:19 |  3000|
|00000030      |2016-08-23 19:48:21 |  3000|
|00000030      |2016-08-25 11:25:29 |  3000|
|00000030      |2016-08-26 00:18:09 |  3000|
|00000030      |2016-08-26 13:56:38 |  1000|
|00000030      |2016-08-26 14:06:14 |  1000|
|00000112      |2016-08-05 16:32:23 |  1000|
|00000113      |2016-08-04 21:32:20 |  1000|
|00000134      |2016-08-03 17:45:27 |  1000|
|00000134      |2016-08-03 18:19:24 |  1000|
|00000134      |2016-08-04 11:10:53 |  1000|
|00000134      |2016-08-04 21:26:10 |  1000|
|00000139      |2016-08-07 00:40:19 |  1000|
|00000139      |2016-08-07 18:40:35 |  1000|
|00000139      |2016-08-07 18:41:35 |  1000|
|00000139      |2016-08-09 14:54:24 |  1000|
|00000139      |2016-08-09 14:55:31 |  1000|
|00000148      |2016-08-04 17:47:40 |  3000|
|00000148      |2016-08-04 18:36:44 |  5000|
|00000148      |2016-08-05 01:43:27 |  5000|
|00000148      |2016-08-05 11:53:36 |  1000|
|00000148      |2016-08-05 14:34:41 |  1000|
|00000148      |2016-08-10 13:15:12 |  5000|
|00000148      |2016-08-11 11:02:51 |  5000|
|00000148      |2016-08-12 13:38:01 |  5000|
|00000179      |2016-09-25 13:21:36 |  1000|
|00000180      |2016-08-02 19:16:07 |  3000|
|00000188      |2016-08-04 06:24:25 |  3000|
|00000188      |2016-08-04 06:24:49 |  3000|
|00000191      |2016-08-05 13:48:37 |  1000|
|00000191      |2016-08-08 12:26:30 |  1000|
|00000191      |2016-08-08 13:39:25 |  1000|
|00000191      |2016-08-08 13:41:20 |  1000|
|00000191      |2016-08-09 09:57:15 |  1000|
|00000191      |2016-08-09 13:58:22 |  1000|
|00000191      |2016-09-27 17:50:06 |  1000|
|00000222      |2016-08-02 20:09:57 |  1000|
|00000222      |2016-08-03 09:09:58 |   200|
|00000222      |2016-08-03 09:10:17 |   200|
|00000222      |2016-08-03 09:10:37 |   200|
|00000222      |2016-08-03 15:07:04 |   200|
|00000222      |2016-08-04 11:04:57 |   200|
|00000222      |2016-08-04 11:05:15 |   200|
|00000222      |2016-08-04 11:49:45 |   200|
|00000222      |2016-08-05 08:52:40 |   200|
|00000222      |2016-08-05 08:53:29 |   200|
|00000222      |2016-08-07 14:22:04 |   500|
|00000222      |2016-08-07 14:22:37 |   200|

My cbs sample:

knitr::kable(head(customer_by_sufficient_statistic_pnbd, 50))
|cust          |  x|        t.x|        litt| sales|first               |    T.cal|
|:-------------|--:|----------:|-----------:|-----:|:-------------------|--------:|
|000000000000o |  0|  0.0000000|    0.000000|  3000|2016-08-24 06:36:28 | 75.72418|
|00000025      |  0|  0.0000000|    0.000000|  1000|2016-08-02 10:14:31 | 97.57275|
|00000027      |  0|  0.0000000|    0.000000|  3000|2016-08-02 13:18:19 | 97.44512|
|00000030      |  4|  2.7624190|   -5.696879| 11000|2016-08-23 19:48:21 | 76.17426|
|00000112      |  0|  0.0000000|    0.000000|  1000|2016-08-05 16:32:23 | 94.31035|
|00000113      |  0|  0.0000000|    0.000000|  1000|2016-08-04 21:32:20 | 95.10205|
|00000134      |  3|  1.1532755|   -4.951050|  4000|2016-08-03 17:45:27 | 96.25961|
|00000139      |  4|  2.5938889|  -14.110905|  5000|2016-08-07 00:40:19 | 92.97150|
|00000148      |  7|  7.8266319|   -6.040406| 30000|2016-08-04 17:47:40 | 95.25807|
|00000179      |  0|  0.0000000|    0.000000|  1000|2016-09-25 13:21:36 | 43.44284|
|00000180      |  0|  0.0000000|    0.000000|  3000|2016-08-02 19:16:07 | 97.19664|
|00000188      |  1|  0.0002778|   -8.188689|  6000|2016-08-04 06:24:25 | 95.73255|
|00000191      |  6| 53.1676968|   -6.586634|  7000|2016-08-05 13:48:37 | 94.42407|
|00000222      | 13|  5.8879861|  -47.090180|  4100|2016-08-02 20:09:57 | 97.15926|
|00000227      |  0|  0.0000000|    0.000000|  1000|2016-08-03 14:14:16 | 96.40626|
|00000233      | 11| 67.8287731|  -16.401807| 16000|2016-08-12 22:34:53 | 87.05861|
|00000245      |  3|  2.3961111|   -1.882027|  4000|2016-08-03 09:16:11 | 96.61326|
|00000247      |  0|  0.0000000|    0.000000|  1000|2016-08-09 20:10:06 | 90.15916|
|00000252      | 13|  7.7163773|  -33.256233|  7300|2016-09-22 18:33:27 | 46.22627|
|00000268      |  1|  2.2652315|    0.817677|  4000|2016-08-02 17:04:13 | 97.28824|
|00000293      |  4|  6.7340046|   -9.404809|  5000|2016-08-02 18:00:45 | 97.24898|
|00000297      |  6|  0.9672569|  -12.129457|  7000|2016-08-02 15:58:47 | 97.33368|
|00000306      | 34| 39.8468981|  -49.574711| 62000|2016-08-02 18:23:14 | 97.23337|
|00000322      | 45| 37.1828356|  -71.839988| 46000|2016-08-05 10:16:48 | 94.57117|
|00000348      |  0|  0.0000000|    0.000000|  1000|2016-08-02 16:21:03 | 97.31822|
|00000358      |  0|  0.0000000|    0.000000|  1000|2016-08-05 14:46:34 | 94.38383|
|00000384      |  0|  0.0000000|    0.000000|  5000|2016-08-07 19:31:40 | 92.18584|
|00000388      |  0|  0.0000000|    0.000000|  1000|2016-08-02 17:38:09 | 97.26468|
|00000410      | 19|  2.9094792|  -55.245497| 24000|2016-08-02 14:25:06 | 97.39874|
|00000429      |  0|  0.0000000|    0.000000|  1000|2016-08-04 18:58:38 | 95.20878|
|00000449      | 37|  4.2021528| -188.924434| 80000|2016-08-02 13:03:11 | 97.45562|
|00000474      |  2|  0.7993750|   -2.135192|  3000|2016-08-02 20:04:38 | 97.16295|
|00000487      |  0|  0.0000000|    0.000000|  3000|2016-08-02 13:36:12 | 97.43270|
|00000508      | 19| 53.3256597|  -39.142569| 24000|2016-08-02 13:57:38 | 97.41781|
|00000533      |  1|  0.1032870|   -2.270243|  2000|2016-08-05 10:00:43 | 94.58234|
|00000543      |  1|  0.0118981|   -4.431373|  8000|2016-08-03 16:18:37 | 96.31991|
|00000550      |  0|  0.0000000|    0.000000|  5000|2016-08-02 18:17:33 | 97.23731|
|00000559      | 31| 36.2829051| -110.370241| 33900|2016-08-03 17:21:53 | 96.27597|
|00000562      |  4|  0.2167361|  -13.778951|  9000|2016-08-02 20:40:43 | 97.13789|
|00000567      |  0|  0.0000000|    0.000000|  1000|2016-08-02 19:19:55 | 97.19400|
|00000577      |  0|  0.0000000|    0.000000|  1000|2016-08-02 21:11:37 | 97.11644|
|00000580      | 30| 96.1238889|   17.855325| 31000|2016-08-03 16:11:21 | 96.32495|
|00000596      |  2|  3.9631134|    1.355179|  3000|2016-08-03 21:06:09 | 96.12023|
|0000060009    |  0|  0.0000000|    0.000000|  1000|2016-08-27 11:29:07 | 72.52095|
|00000603      |  0|  0.0000000|    0.000000|  1000|2016-08-02 22:51:40 | 97.04696|
|00000619      |  0|  0.0000000|    0.000000|  3000|2016-08-05 07:20:43 | 94.69345|
|00000637      |  2|  0.0893750|   -9.179702|  3000|2016-08-02 16:35:39 | 97.30808|
|00000654      |  0|  0.0000000|    0.000000|  3000|2016-09-15 17:13:16 | 53.28196|
|00000657      |  2| 27.8792130|    3.926673|  3000|2016-08-03 13:54:12 | 96.42020|
|00000661      |  1|  0.2140509|   -1.541541|  2000|2016-08-02 15:08:05 | 97.36889|

Does the data require any constraint in order to apply the method properly?

Thank you in advance.

Throw error for empty data frame?

I recently encountered the following error message, Error: T.tot >= min(elog$date) is not TRUE, which led me to investigate whether the dates inside my list of data frames were correct. Finally, I made the connection that some of my data frames were empty, which prompted the message.

It might be worth adding a check for whether the supplied data frame is empty, and give a more informative error message. Do you agree? If so, I'll file a pull request.

library(BTYDplus)
df <- data.frame(cust = character(), date = as.Date(character()))
elog2cbs(df)
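The proposed check could be as simple as the following guard at the top of elog2cbs() (a sketch of the suggestion, not existing package code; the wording of the message is made up):

```r
# Sketch: fail fast with an informative message on an empty event log,
# instead of surfacing the cryptic `T.tot >= min(elog$date)` stopifnot.
if (!is.data.frame(elog) || nrow(elog) == 0)
  stop("`elog` must be a non-empty data.frame with `cust` and `date` fields")
```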

data generators don't handle varying `T.cal` properly, since 0.8.1

changes in 0.8.1 introduced a bug for *.GenerateData methods in the case of varying T.cal values being provided; that bug is only of significance for large discrepancies in T.cal.

the following test case will ensure that this bug is fixed

  n <- 5000
  params <- list(r = 0.9, alpha = 10, s = 0.8, beta = 12)
  # constant T.cal
  T.cal <- rep(52, n)
  cbs1 <- pnbd.GenerateData(n, T.cal, 52, params, TRUE)$cbs
  # varying T.cal, with one T.cal being particularly large
  T.cal[n] <- 52 * 100
  cbs2 <- pnbd.GenerateData(n, T.cal, 52, params, TRUE)$cbs
  expect_equal(sum(cbs1[-n]$x.star), sum(cbs2[-n]$x.star), tolerance = 0.2)
