
cbctools's Introduction

Hello! πŸ‘‹

My name is John Paul Helveston and I am an Assistant Professor in EMSE @ GWU. On GitHub you'll usually see me contributing to research projects, R packages, courses I develop / teach, or other fun side projects. I speak English, Chinese, R (base + tidyverse), and Python fluently as well as some moderate CSS and HTML. πŸ˜„

Links

Research Projects

Each of these repositories contains the data and code to reproduce analyses for research projects:

  • pev-resale-2024: Replication code for our 2024 paper "Battery-Powered Bargains? Assessing Electric Vehicle Resale Value in the United States," Environmental Research Letters. DOI: 10.1088/1748-9326/ad3fce
  • vmt-2023: Replication code for our 2023 paper "Quantifying electric vehicle mileage in the United States," Joule. DOI: 10.1016/j.joule.2023.09.015
  • solar-learning-2021: Replication code for our 2022 paper "Quantifying the cost savings of global solar photovoltaic supply chains," Nature. DOI: 10.1038/s41586-022-05316-6
  • pev-incentives-2021: Replication code for our 2022 paper "Not all subsidies are equal: Measuring preferences for electric vehicle financial incentives," Environmental Research Letters. DOI: 10.1088/1748-9326/ac7df3
  • dcTravelSurvey: A conjoint survey about user trip travel preferences in the DC Metro Area, conducted at George Washington University.
  • pev-experience-2019: Replication code for our 2020 paper "Electric vehicle adoption: can short experiences lead to big change?," Environmental Research Letters, 15(0940c3). DOI: 10.1088/1748-9326/aba715
  • tra2015: Replication code for our 2015 paper "Will subsidies drive electric vehicle adoption? Measuring consumer preferences in the U.S. and China," Transportation Research Part A: Policy and Practice, 73, 96–112. DOI: 10.1016/j.tra.2015.01.002

R Packages

  • logitr: Fast Estimation of Multinomial and Mixed Logit Models with Preference Space and Willingness to Pay Space Utility Parameterizations. Accompanying JSS article: DOI: 10.18637/jss.v105.i10
  • cbcTools: An R package with tools for designing choice-based conjoint (cbc) survey experiments and conducting power analyses (see the minimal sketch after this list).
  • renderthis: Package for rendering media (e.g., xaringan slides) into multiple different formats. Co-authored with Garrick Aden-Buie.
  • surveydown: An attempt to build a markdown-based survey platform using Quarto & Shiny. This is a work in progress and is not yet a formal package.
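
As a quick taste of the {cbcTools} workflow referenced throughout this page, here is a minimal sketch (modeled on the apple example that appears in the issues further down; the attribute values are illustrative only):

library(cbcTools)

# Define attributes and levels, then build a randomized survey design
profiles <- cbc_profiles(
  price = c(1, 2, 3),
  type = c("Fuji", "Gala", "Honeycrisp"),
  freshness = c("Poor", "Average", "Excellent")
)
design <- cbc_design(profiles = profiles, n_resp = 100, n_alts = 3, n_q = 6)

# Inspect the design, simulate choices, and run a power analysis
cbc_balance(design)
cbc_overlap(design)
data <- cbc_choices(design = design, obsID = "obsID")
power <- cbc_power(
  data = data,
  pars = c("price", "type", "freshness"),
  outcome = "choice",
  obsID = "obsID",
  nbreaks = 10,
  n_q = 6
)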

Open Source Courses

Community

  • GW Coders: A community organization I co-founded with Ryan Watkins that brings together students and faculty to apply computational and data analytics skills in research.
  • The Distillery: A distill blog and showcase about building distill websites and blogs.

Keyboards

  • splitKbCompare: An interactive tool for comparing layouts of different split mechanical keyboards.
  • Wireless Corne: Build log and photo gallery for my Wireless Corne keyboard.
  • Iris Rev 2: Build log and photo gallery for my Iris Rev 2 keyboard.

cbctools's People

Contributors

jhelvy, olivroy


cbctools's Issues

cbc_design includes restricted profiles as NA when method = "orthogonal"

This issue was previously fixed for Bayesian designs using cbc_design; however, if the method is changed to "orthogonal", restricted profiles still show up in the final orthogonal design, with NA as the profileID (see rows 24, 31, and 36 of the results below). Also, running the following code produces many repeated warnings when creating the orthogonal design. Details below:

Code:

library(cbcTools)
library(logitr)

Nresp = 20
Nchoice = 16 
Nalt = 4
HeadN <- Nalt*Nchoice

profiles <- cbc_profiles(
  cost = seq(20, 30, 5), #generates sequence of numbers between 20 and 30 inclusive, 5 by 5. --Ratio Data
  brand = c("O", "HH"), #--Nominal Data
  usrrt = seq(30, 20, -5), #generates sequence of numbers between 30 and 20 inclusive, in steps of -5, so the highest rating comes first and the top generated profile is the best. --Ratio Data
  accinf = c("H", "M", "L", "U") #--Ordinal Data
)

rstrct_profiles <- cbc_restrict(
  profiles,
  cost == 20 & brand == "O" & usrrt == 30 & accinf == "H", #exclude dominant alternative with the best levels for each attribute
  cost == 30 & brand == "HH" & usrrt == 20 & accinf == "U" #exclude the worst alternative with the worst levels for each attribute
)

design_ortho <- cbc_design(
  profiles = rstrct_profiles,
  n_resp = Nresp,
  n_alts = Nalt,
  n_q = Nchoice,
  method = "orthogonal"
)

DF_design_ortho <- as.data.frame(design_ortho)
FirstOrtho <- head(DF_design_ortho, HeadN)
FirstOrtho

Results:

Orthogonal array found; using 72 out of 70 profiles for design
There were 50 or more warnings (use warnings() to see the first 50)
   profileID respID qID altID obsID cost brand usrrt accinf
1         12      1   1     1     1   20     O    20      H
2         47      1   1     2     1   30    HH    25      L
3         63      1   1     3     1   20    HH    25      U
4         25      1   1     4     1   25     O    25      M
5          8      1   2     1     2   30     O    25      H
6         45      1   2     2     2   20    HH    25      L
7         27      1   2     3     2   20    HH    25      M
8         26      1   2     4     2   30     O    25      M
9         41      1   3     1     3   30    HH    30      L
10        40      1   3     2     3   25    HH    30      L
11        46      1   3     3     3   25    HH    25      L
12        20      1   3     4     3   30     O    30      M
13        19      1   4     1     4   25     O    30      M
14        51      1   4     2     4   20    HH    20      L
15         8      1   4     3     4   30     O    25      H
16         2      1   4     4     4   30     O    30      H
17        32      1   5     1     5   30     O    20      M
18         3      1   5     2     5   20    HH    30      H
19        50      1   5     3     5   30     O    20      L
20        63      1   5     4     5   20    HH    25      U
21        50      1   6     1     6   30     O    20      L
22        45      1   6     2     6   20    HH    25      L
23        30      1   6     3     6   20     O    20      M
24        NA      1   6     4     6   30    HH    20      U
25        37      1   7     1     7   25     O    30      L
26        29      1   7     2     7   30    HH    25      M
27        59      1   7     3     7   30    HH    30      U
28        62      1   7     4     7   30     O    25      U
29        40      1   8     1     8   25    HH    30      L
30        51      1   8     2     8   20    HH    20      L
31        NA      1   8     3     8   30    HH    20      U
32        39      1   8     4     8   20    HH    30      L
33        34      1   9     1     9   25    HH    20      M
34        39      1   9     2     9   20    HH    30      L
35        36      1   9     3     9   20     O    30      L
36        NA      1   9     4     9   30    HH    20      U
37        44      1  10     1    10   30     O    25      L
38        23      1  10     2    10   30    HH    30      M
39        63      1  10     3    10   20    HH    25      U
40        59      1  10     4    10   30    HH    30      U
41        41      1  11     1    11   30    HH    30      L
42        10      1  11     2    11   25    HH    25      H
43        59      1  11     3    11   30    HH    30      U
44        54      1  11     4    11   20     O    30      U
45        13      1  12     1    12   25     O    20      H
46         1      1  12     2    12   25     O    30      H
47        21      1  12     3    12   20    HH    30      M
48        67      1  12     4    12   25     O    20      U
49        31      1  13     1    13   25     O    20      M
50        43      1  13     2    13   25     O    25      L
51         5      1  13     3    13   30    HH    30      H
52        33      1  13     4    13   20    HH    20      M
53        12      1  14     1    14   20     O    20      H
54        22      1  14     2    14   25    HH    30      M
55        34      1  14     3    14   25    HH    20      M
56        28      1  14     4    14   25    HH    25      M
57        59      1  15     1    15   30    HH    30      U
58        40      1  15     2    15   25    HH    30      L
59        41      1  15     3    15   30    HH    30      L
60        47      1  15     4    15   30    HH    25      L
61        26      1  16     1    16   30     O    25      M
62        62      1  16     2    16   30     O    25      U
63        50      1  16     3    16   30     O    20      L
64        35      1  16     4    16   30    HH    20      M

Expanding warnings:

> warnings()
Warning messages:
1: In (function (..., deparse.level = 1)  ... :
  number of columns of result is not a multiple of vector length (arg 8)
2: In (function (..., deparse.level = 1)  ... :
  number of columns of result is not a multiple of vector length (arg 2)
3: In (function (..., deparse.level = 1)  ... :
  number of columns of result is not a multiple of vector length (arg 12)
[... warnings 4 through 50 repeat the same message, differing only in the argument number ...]

Release cbcTools 0.3.0

Prepare for release:

  • git pull
  • Check current CRAN check results
  • Check if any deprecation processes should be advanced, as described in Gradual deprecation
  • Polish NEWS
  • devtools::build_readme()
  • urlchecker::url_check()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • revdepcheck::revdep_check(num_workers = 4)
  • Update cran-comments.md
  • git push
  • Draft blog post

Submit to CRAN:

  • usethis::use_version('minor')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted πŸŽ‰
  • git push
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • git push
  • Finish blog post
  • Tweet
  • Add link to blog post in pkgdown news menu

error in check_inputs_design(): Modfed and dopt not included

The code right now is:

# Check on restricted profile sets

if (profiles_restricted) {
  if (!method %in% c('random', 'full')) {
    stop(
      'Restricted profile sets can only be used with the "random", "full" ',
      '"dopt", or "Modfed" methods'
    )
  }
}

Because of this, it's not possible to use a restricted profile set with either the Modfed or dopt methods. This should be an easy fix, however :)
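
A minimal sketch of such a fix (not the actual package patch), simply allowing the two methods that the error message already names:

if (profiles_restricted) {
  if (!method %in% c('random', 'full', 'dopt', 'Modfed')) {
    stop(
      'Restricted profile sets can only be used with the "random", "full", ',
      '"dopt", or "Modfed" methods'
    )
  }
}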

Cheers!
Marianne

Undefined columns selected error

I have been encountering an error when trying to run the code provided for generating a Bayesian D-efficient design. The error message that I receive is as follows: "Error in [.data.frame(des, c("profileID", varnames)) : undefined columns selected."

I was wondering if you could help me with this issue. Is there a way to fix this error, or a workaround I can use to generate a Bayesian D-efficient design with the {cbcTools} package?

Thanks!

cbc_overlap help description possibly incorrect

I tested a design I built using the SAS macros such that no level of any attribute appeared more than once in a choice task (choice question). When I examined the cbc_overlap output, 100% of every attribute showed up under the column labeled 1. The help says that the count under 1 is the count of tasks where every level of an attribute is the same across all alternatives. I think you really meant that a 3 (assuming the number of alternatives is 3) is the count of tasks where the attribute level is the same across all 3 alternatives, and a 1 is the count of tasks where there was NO overlap.
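
One way to double-check this interpretation by hand (a sketch, not cbcTools internals; it assumes dplyr is installed and a design data frame with an attribute column named price): for each question, count how many alternatives share the most common level of the attribute. A 1 means no overlap in that question; a value equal to the number of alternatives means complete overlap.

library(dplyr)

design |>
  group_by(obsID) |>
  summarise(max_same_level = max(table(price)), .groups = "drop") |>
  count(max_same_level)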

I would be glad to share my code and the design for you to double check my work. Please let me know. My email is [email protected]

Tom Eagle

request: randomize questions/alternatives per respondent

Hello,

cbc_design() currently returns a design where respondents can share common questions & alternatives (except with the random methods).
It would be very helpful if we could specify randomizing the order of the design rows per respondent.
(This request is based on #27 (comment).)

Thank you,
Shigeru

request: minimize overlap or avoid full overlap in cbc_design()

Hello,

In designs generated by cbc_design(method = random|full|orthogonal|dopt), overlapping levels within the same question are not minimized. This can be problematic, especially in random designs.
It would be very helpful if we could minimize overlap, or at least avoid full overlap of any one attribute.
(This request is based on #27 (comment).)

Thank you so much,
Shigeru

TODO

I'm just collecting some todo items here from existing issues:

  • Consider adding support for naming the attribute levels in the priors arguments for cbc_choices and cbc_design (see #24).
  • Consider adding sigma as another argument to cbc_design for Bayesian D-efficient designs to have more control over priors (see here).
  • Minimize overlap / avoid full overlap in cbc_design #30.
  • Randomize order across respondents #29
  • full / orthogonal methods return NA #28
  • Update cbc_power() to return a new class object of all of the estimated models and then create a print and summary method for this class that shows key information (see here).

Bug in creating Bayesian D-efficient designs when not all combinations of levels are allowed

I ran the following code:

cbcTools::cbc_design(profiles, n_resp, n_alts, n_q, priors = priors)

where n_resp = 735, n_alts = 4, n_q = 50, there are 12 discrete attributes with numbers of levels 12, 6, 3, ..., 3, 4, 4, and profiles only contains 1932 rows, which is far less than 1263*...34*4 = 7,558,272. I got the following error:

Error in $<-.data.frame(*tmp*, blockID, value = c(1L, 1L, 1L, 1L, :
replacement has 200 rows, data has 3

Investigating with the debugger, I found the following erroneous code in cbcTools:::make_design_deff():

...
else {
  ...
  des <- merge(des, profiles, by = varnames)
  ...
}
...
des$blockID <- rep(seq(n_blocks), each = n_alts * n_q)
...

The issue is that an earlier call to idefix::CEA() produced a design des that ignored profiles, and this is patched up via the call to merge(); but the number of rows in des gets drastically reduced at this point, from 200 to only 3, and the line

des$blockID <- rep(seq(n_blocks), each = n_alts * n_q)

assumes that des still has the original number of rows.

Even if the above assignment were fixed to account for the reduced number of rows, there are still problems -- every block of n_alts = 4 rows in the original des is a question in the design, and this structure is destroyed by the call to merge(), not to mention that there are no longer sufficient profiles in des to even produce n_q = 50 questions.
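
A small, self-contained illustration of the row-count mismatch described above (not cbcTools code; the data frames are made up):

# merge() keeps only the rows whose attribute combinations appear in `profiles`,
# so the merged design can have far fewer rows than the design returned by the
# optimizer, and a subsequent rep() sized to the original design no longer fits.
des <- data.frame(a = c(1, 2, 3, 1), b = c("x", "y", "z", "x"))
profiles <- data.frame(a = c(1, 2), b = c("x", "y"), profileID = c(10, 20))
des <- merge(des, profiles, by = c("a", "b"))
nrow(des) # 3, not 4: the (3, "z") row had no match and was dropped
# des$blockID <- rep(1, each = 4) # would error: replacement has 4 rows, data has 3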

Praise πŸŽ‰

Funders and academic employers are increasingly interested in seeing evidence for the impact academic research generates. For software such as {cbcTools}, this is hard to demonstrate because the typical metrics that matter for promotion and tenure (publications, grants, and citations) don't really apply. The consequence is that there are fewer and fewer incentives to develop packages like {cbcTools} 😞

The good news is you can help! If you have found {cbcTools} useful in any regard, please leave feedback here for other users, funders, and employers to view. This helps the package authors show how {cbcTools} is being used by academic and non-academic users to increase their productivity and work quality.

So please contribute some praise 😁! Tell us how cool this package is and how you use it in your work!

And if you write a paper using {cbcTools}, please cite the package in your publications, see: citation("cbcTools")

Alternative design packages

As of version 0.3.0, the Bayesian D-efficient designs are generated via the idefix package. However, as was raised in #10, idefix has some limitations. In particular, the idefix::Modfed() function is quite slow, and it is the only option available if using a restricted set of profiles.

So with that in mind, I'm posting this issue to discuss potential alternatives. I'm not necessarily looking to replace idefix, but perhaps supplement cbc_design() with even more options for generating D-efficient designs.

This summary of related packages was recently published. After a quick look, there appear to be quite a few possible alternatives that could be incorporated into `cbc_design()`:

  • support.CEs: Looks pretty simple, but it doesn't look like it supports Bayesian designs with priors. The author also has a book with a lot of other examples.
  • skpr: Looks promising, but it isn't clear whether it's made for choice experiments.
  • ExpertChoice: Clearly made for choice experiments; not clear whether priors can be included.

cbc_design includes profiles that have been previously restricted (either manually or through cbc_restrict)

It seems like, for some reason, cbc_design still includes profiles that have been previously restricted. The only difference is that, when printed, the profileID of restricted profiles appears as NA instead of a number that corresponds to the list of all available profiles. This seems to happen regardless of whether cbc_restrict is used or the profiles are manually excluded. Version used: v0.3.3. The same issue appeared with v0.3.2.

library("cbcTools")
library("logitr")

#global variables
Nresp = 150 #global variable to identify number of respondents
Nchoice = 12 #global variable for the number of choice sets
Nalt = 4 #global variable for the number of alternatives per choice set
HeadN <- Nalt*Nchoice #secondary global variable to only print out the first respID set in the DB-Efficient design; other respIDs are just repetition of the same design.

#create full-factorial design based on attributes and levels
profiles <- cbc_profiles(
cost = seq(20, 30, 5), #generates sequence of numbers between 20 and 30 inclusive, 5 by 5. --Ratio Data
brand = c("known", "unknown"), #--Nominal Data
usrrt = seq(4.8, 3.2, -0.8), #generates sequence of numbers between 4.8 and 3.2 inclusive, -0.8 by -0.8.
accinf = c("high", "low", "unavailable") #--Nominal Data
)
profiles

#restrict undesired profiles
rstrct_profiles <- cbc_restrict(
profiles,
cost == 20 & brand == "known" & usrrt == 4.8 & accinf == "high", #exclude dominant alternative
cost == 30 & brand == "unknown" & usrrt == 3.2 & accinf == "unavailable" #exclude the worst alternative
)
rstrct_profiles

#create DB-efficient design from restricted full-factorial design
design_dbeff <- cbc_design(
profiles = rstrct_profiles,
n_resp = Nresp,
n_alts = Nalt, #number of alternatives in each choice set
n_q = Nchoice, #number of "questions" or choice sets
n_start = 50, #numeric value indicating the number of random start designs to use.
priors = list(
cost = 0,
brand = 0,
usrrt = 0,
accinf = c(0, 0)
), #using priors in designing DB-efficient fractional-factorial design
max_iter = 10000
)
head(design_dbeff, HeadN)

Duplicate profile combinations within a respondent

First of all, thanks for the package; it has been very helpful.

In generating designs, I've encountered situations where the same profile combinations are shown within a respondent. This is not desirable, as the same respondent will see the exact same question twice (or even more often).

See some replicable code below. In this case, obsID 3 and 8 have exactly the same profile combinations. The same with obsID 7 and 10. These are all IDs within the same respondent (respID). Despite generating 12 profile combinations (i.e., questions), only 10 are actually unique.

profiles <- cbc_profiles(
a = c(0, 1),
b = c("Yes", "No"),
c = c("100%", "80%"),
d = c("day", "week")
)

set.seed(64)
design <- cbc_design(
profiles = profiles,
n_resp = 4,
n_alts = 2,
n_q = 12,
)
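
One way to spot such repeated questions in a generated design (a sketch; it assumes dplyr is installed and the respID/obsID/profileID columns shown above): build a key from each question's sorted profileIDs and flag keys that appear more than once within a respondent.

library(dplyr)

design |>
  group_by(respID, obsID) |>
  summarise(key = paste(sort(profileID), collapse = "-"), .groups = "drop") |>
  group_by(respID) |>
  filter(duplicated(key) | duplicated(key, fromLast = TRUE))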

`cbc_choices()` with an outside good (`no_choice = TRUE`)

When I use cbc_choices() with a design where no_choice = TRUE, the choices get simulated, but the model always has NA for the standard errors. There may need to be a special argument for handling the no_choice variable, similar to the prior_no_choice argument in cbc_design(). Here is an example:

# Load libraries
library(cbcTools)
library(logitr)

# Define profiles with attributes and levels
profiles <- cbc_profiles(
    price       = c(15, 20, 25), # Price ($1,000)
    fuelEconomy = c(20, 25, 30),   # Fuel economy (mpg)
    accelTime   = c(6, 7, 8),      # 0-60 mph acceleration time (s)
    powertrain  = c("Gasoline", "Electric")
)

# Make a full-factorial design of experiment
design <- cbc_design(
    profiles = profiles,
    n_resp   = 1000, # Number of respondents
    n_alts   = 3,    # Number of alternatives per question
    n_q      = 8,     # Number of questions per respondent
    no_choice = TRUE
)

# Simulate choices according to an assumed utility model
data <- cbc_choices(
    design = design,
    obsID = "obsID",
    priors = list(
        price       = -0.7,
        fuelEconomy = 0.1,
        accelTime   = -0.2,
        powertrain_Electric = -4.0,
        no_choice = -2
    )
)

# Estimate a model 
model <- logitr(
  data   = data,
  outcome = "choice",
  obsID  = "obsID",
  pars   = c(
      "price", "fuelEconomy", "accelTime", "powertrain_Electric", "no_choice"
  )
)

summary(model)

Power Analysis Simulation comparison between optimal designs with zero priors and true priors

EDIT: I made a mistake by not including the true priors in cbc_choices. I will not touch this post because I think it is still interesting to see how cbc_choices behaves. I added the priors and ran the power analysis simulation again; the new results, as well as the section of the code that was updated, are in the second comment. In summary, it seems like "fixing" the code by including the true priors actually made the simulation results even more conservative.

Background: I designed a DZ-Efficient DCE (Bayesian D-Efficient with zero priors) using {cbcTools}. Using cbc_choices and cbc_power, I performed a power analysis simulation to determine an optimal sample size. Based on the output, it seemed like with about 180 participants I should be able to achieve standard errors of less than 0.05 for the model coefficients. Based on the recommended best practices in the literature, I conducted a pilot study to estimate the true priors, with 1/10th of the sample size (N=18). After analyzing the results, I found that almost all estimated coefficients had achieved very high statistical significance. For some, the p-value was <0.001, and the rest were less than 0.05, except for one factor that did not achieve significance (however, based on other indicators in the study, this factor truly did not impact the decisions of my respondents).
The discrepancy between the simulation results and the true results is understandable, since zero priors were specified for the optimization. With zero priors, cbc_choices has no way of knowing how the choices are likely to be made. As a result, the estimated impact of sample size on the standard errors of the coefficients is on the "very conservative" side. To explore this discrepancy further, I compared the results of the power analysis simulation with zero priors against the results with the true priors from the pilot study.

I will be renaming some attributes and levels for simplification and readability, otherwise, everything else will remain true to the actual procedures and findings.

Study Parameters:
Attributes and Levels: Cost ($20, $25, $30), Type (A, B), UserRatings (3.2 Stars, 4.0 Stars, 4.8 Stars), Information (Unavailable, Low, Medium, High)
Note: Type and Information are categorical attributes, Cost and UserRatings are numerical attributes
Choice Matrix Shape: 16 choice tasks, each with 4 profiles (16x4)
No Choice Option: False

Pilot Results:
These coefficient estimates will be used as true priors.

Model Type:    Multinomial Logit
Model Space:          Preference
Model Run:                1 of 1
Iterations:                   20
Elapsed Time:        0h:0m:0.01s
Algorithm:        NLOPT_LD_LBFGS
Weights Used?:             FALSE
Robust?                    FALSE

Model Coefficients: 
                     Estimate Std. Error z-value  Pr(>|z|)    
Cost              -0.130720   0.048061 -2.7199  0.006530 ** 
TypeA              0.416037   0.260832  1.5950  0.110703    
UserRatings        3.241950   0.357835  9.0599 < 2.2e-16 ***
InformationLow    -1.794934   0.629253 -2.8525  0.004338 ** 
InformationMedium  1.893437   0.403223  4.6958 2.656e-06 ***
InformationHigh    5.237012   0.574275  9.1194 < 2.2e-16 ***
---
Signif. codes:  0 β€˜***’ 0.001 β€˜**’ 0.01 β€˜*’ 0.05 β€˜.’ 0.1 β€˜ ’ 1
                                    
Log-Likelihood:         -112.5016203
Null Log-Likelihood:    -399.2527760
AIC:                     237.0032407
BIC:                     258.9810000
McFadden R2:               0.7182196
Adj McFadden R2:           0.7031915
Number of Observations:  288.0000000

Power Analysis Simulation Comparison (Results):
This is very interesting! My expectation was that including the true priors would improve things drastically. Evidently, it only made things worse! Also noteworthy: the DB-error went from 0.34 (zero priors) to 0.54 (true priors)!

[Figure: plot comparing simulated standard errors versus sample size for the zero-prior (DZ) and true-prior (DB) designs]

Power Analysis Simulation Comparison (Code):

library(cbcTools)
library(logitr)

Nresp = 200
Nchoice = 16
Nalt = 4
Nbreaks = 20
HeadN <- Nalt*Nchoice
set.seed(6132)

# Profiles for design with zero priors
# UserRatings need to range and vary similar to Cost, otherwise there will be balance issues due to zero priors
DZ_profiles <- cbc_profiles(
  Cost = seq(20, 30, 5),
  Type = c("A", "B"),
  UserRatings = seq(20, 30, 5),
  Information = c("Unavailable", "High", "Medium", "Low")
)

DZ_rstrct_profiles <- cbc_restrict(
  DZ_profiles,
  Cost == 20 & Type == "A" & UserRatings == 30 & Information == "High",
  Cost == 30 & Type == "B" & UserRatings == 20 & Information == "Unavailable"
)

# Profiles for design with true priors
# UserRatings can now take their true values since priors will not be zero anymore
DB_profiles <- cbc_profiles(
  Cost = seq(20, 30, 5),
  Type = c("B", "A"),
  UserRatings = seq(3.2, 4.8, 0.8),
  Information = c("Unavailable", "High", "Medium", "Low")
)

DB_rstrct_profiles <- cbc_restrict(
  DB_profiles,
  Cost == 20 & Type == "A" & UserRatings == 4.8 & Information == "High",
  Cost == 30 & Type == "B" & UserRatings == 3.2 & Information == "Unavailable"
)

# Create DZ-Efficient design
DZ_design <- cbc_design(
  profiles = DZ_rstrct_profiles,
  n_resp = Nresp,
  n_alts = Nalt,
  n_q = Nchoice,
  n_start = 10,
  priors = list(
    Cost = 0,
    Type = 0,
    UserRatings = 0,
    Information = c(0, 0, 0)
  ),
  max_iter = 1000,
  method = "Modfed",
  keep_db_error = TRUE,
  parallel = TRUE
)
DZerr <- as.numeric(DZ_design$db_err)
DZ_design_dataframe <- as.data.frame(DZ_design$design)
first_design_DZ <- head(DZ_design_dataframe, HeadN)

# Create DB-Efficient design
DB_design <- cbc_design(
  profiles = DB_rstrct_profiles,
  n_resp = Nresp,
  n_alts = Nalt,
  n_q = Nchoice,
  n_start = 10,
  priors = list(
    Cost = -0.13,
    Type = 0, # did not achieve significance in pilot
    UserRatings = 3.2,
    Information = c(5.2, 1.9, -1.8)
  ),
  max_iter = 1000,
  method = "Modfed",
  keep_db_error = TRUE,
  parallel = TRUE
)
DBerr <- as.numeric(DB_design$db_err)
DB_design_dataframe <- as.data.frame(DB_design$design)
first_design_DB <- head(DB_design_dataframe, HeadN)

# Examining balance and overlap
cbc_balance(first_design_DZ)
cbc_overlap(first_design_DZ)

cbc_balance(first_design_DB)
cbc_overlap(first_design_DB)

# DZ choice simulation
choice_sim_DZ <- cbc_choices(
  design = DZ_design_dataframe,
  obsID = "obsID"
)
choice_sim_DZ <- as.data.frame(choice_sim_DZ)

# DB choice simulation
choice_sim_DB <- cbc_choices(
  design = DB_design_dataframe,
  obsID = "obsID"
)
choice_sim_DB <- as.data.frame(choice_sim_DB)

# Power analysis simulation and comparison
powerDZ <- cbc_power(
  data = choice_sim_DZ,
  pars = c("Cost", "Type", "UserRatings", "Information"),
  outcome = "choice",
  obsID = "obsID",
  nbreaks = Nbreaks,
  n_q = Nchoice,
)

powerDB <- cbc_power(
  data = choice_sim_DB,
  pars = c("Cost", "Type", "UserRatings", "Information"),
  outcome = "choice",
  obsID = "obsID",
  nbreaks = Nbreaks,
  n_q = Nchoice,
)

plot_compare_power(powerDZ, powerDB)

cbc_power - "NA" generated for 'estimated coefficient' and 'standard errors' in Output

John --

When running cbcTools with the example from https://rdrr.io/github/jhelvy/cbcTools/man/cbc_power.html, I get the following output with NAs for the estimated coefficients and standard errors:
I get the following Output with "NA"s for Estimated Coefficients and Standard Errors:

# A simple conjoint experiment about apples
library(cbcTools)

# Generate all possible profiles
profiles <- cbc_profiles(
  price = c(1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5),
  type = c("Fuji", "Gala", "Honeycrisp"),
  freshness = c('Poor', 'Average', 'Excellent')
)

# Make a randomized survey design
design <- cbc_design(
  profiles = profiles,
  n_resp = 300, # Number of respondents
  n_alts = 3,   # Number of alternatives per question
  n_q = 6       # Number of questions per respondent
)

# Simulate random choices
data <- cbc_choices(
  design = design,
  obsID = "obsID"
)

# Conduct a power analysis
power <- cbc_power(
  data = data,
  pars = c("price", "type", "freshness"),
  outcome = "choice",
  obsID = "obsID",
  nbreaks = 10,
  n_q = 6
)

Output:

Estimating models using 3 cores...
done!

> head(power)
  sampleSize               coef est se
1         30              price  NA NA
2         30           typeGala  NA NA
3         30     typeHoneycrisp  NA NA
4         30   freshnessAverage  NA NA
5         30 freshnessExcellent  NA NA
6         60              price  NA NA


I re-ran this several times but still get this NA result. Am I missing something?

The issue of balance in DB-Efficient designs

Consider the following attributes and levels:

  1. Cost: $20, $25, $30
  2. Brand: Known, Unknown
  3. Rating: 3.2 Stars, 4.0 Stars, and 4.8 Stars
  4. Usability: High, Low, Unavailable

These attribute-levels were presented in the code as follows:
profiles <- cbc_profiles(
  cost = seq(20, 30, 5),
  brand = c("known", "unknown"),
  rating = seq(4.8, 3.2, -0.8),
  usability = c("high", "low", "unavailable")
)

I found that creating a DB-Efficient design using these attributes and levels results in the complete omission of "Rating: 4.0 Stars" from the final design. Using cbc_balance, I observed that there was no representation of that particular attribute level. At first I thought floats were not allowed in the design and were being truncated, resulting in an unbalanced design, but that is not the case. After changing the levels to "5, 4, 3" instead, I found that 4 does get representation, but the design was still very unbalanced (e.g., 25 appearances for 5 stars, 4 appearances for 4 stars, and 25 appearances for 3 stars).

The issue seems to be that, since both Cost and Rating are presented as ratio data, the optimization algorithm deems the differences among 3.2, 4.0, and 4.8 to be much less significant than the differences among 20, 25, and 30. As such, the optimal design just ignores the "medium" value for Rating and only represents the highest and lowest values. Obviously, in a real-world application, there is a significant and perceivable difference between a 3.2-star product and a 4-star product.

The workaround I found is to introduce the Ratings with the same values as Cost, i.e., rating = seq(30, 20, -5), and then manually replace the values in the final design with 4.8, 4.0, and 3.2, respectively. For manual replacement, I personally write the data frame of the optimal design to a CSV file and then find and replace the values accordingly, since I'll be using the CSV file to generate my survey. This resulted in a far more balanced optimal design, with almost equal representation for all levels of Rating. The only issue I can anticipate is that the power analysis simulation using cbc_power may be even further removed from the findings of the study with actual participants, and thus the power analysis simulation becomes less reliable.

I wonder if this workaround violates anything else in the optimal design algorithm; however, this is the only way I could find to create a balanced optimal design. Although this is technically an issue with the idefix package, I wonder if there is a proper way to address it in cbcTools?
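
If it helps, the same find-and-replace can be done in R before writing the CSV (a sketch; it assumes the final design data frame is called design and its numeric rating column is coded as 20/25/30):

# Map the cost-like placeholder codes back to the real star ratings
rating_map <- c("20" = 3.2, "25" = 4.0, "30" = 4.8)
design$rating <- unname(rating_map[as.character(design$rating)])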

Db Efficient vs D efficient designs

Hi there - this is an awesome package and I've found it very easy to use however, just seeking some clarification based on my limited experience with other software packages.
It was my understanding that you should specify point-estimate priors for D-efficient designs and prior distributions for Bayesian D-efficient designs. However, from what I can tell, in this package you specify no priors for D-efficient designs and point estimates for DB-efficient designs. Can you speak to this at all?

Thanks,
Jack

The order of priors in cbc_design and its connection to cbc_profiles

Background: I have conducted a pilot study, designed using cbcTools with zero priors, and the results of the experiment were analyzed using logitr. When estimating the model in logitr, for one of my categorical variables I wanted to use a different category as the reference. logitr (I'm assuming) was picking the first category alphabetically and using that as the reference. I used the factor() function, as specified in the paper on logitr (under section 5.4, Continuous and discrete variable coding), and identified another category as the reference. Let's call the attribute Information, with levels c("High", "Medium", "Low", "Unavailable"). In logitr, if I didn't use the following code, "High" would be used as the reference category, but I wanted "Unavailable" to be the reference:

ChoiceData$Information <- factor(
  x = ChoiceData$Information,
  levels = c("Unavailable", "Low", "Medium", "High")
)

Now, the resulting MNL model coefficient estimates are:

Model Coefficients: 
                     Estimate Std. Error z-value  Pr(>|z|)    
Cost              -0.130720   0.048061 -2.7199  0.006530 ** 
UserRating         3.241950   0.357835  9.0599 < 2.2e-16 ***
InformationLow    -1.794934   0.629253 -2.8525  0.004338 ** 
InformationMedium  1.893437   0.403223  4.6958 2.656e-06 ***
InformationHigh    5.237012   0.574275  9.1194 < 2.2e-16 ***
---
Signif. codes:  0 β€˜***’ 0.001 β€˜**’ 0.01 β€˜*’ 0.05 β€˜.’ 0.1 β€˜ ’ 1

My understanding, based on the examples provided for cbc_design, is that when creating profiles using cbc_profiles, the order of levels used to define a categorical attribute determines the order of the priors in cbc_design. The first level is omitted from the priors vector (i.e., treated as the reference), and priors for the rest of the levels follow in order. So now that I will be including my priors, and because I used "Unavailable" as the reference category when obtaining those priors, I think I need to make sure the order of levels for Information is consistent:

Profiles <- cbc_profiles(
  Cost = seq(20, 30, 5),
  UserRating = seq(3.2, 4.8, 0.8),
  Information = c("Unavailable", "Low", "Medium", "High") #<=== The order
)

design_dbeff <- cbc_design(
  profiles = Profiles,
  n_resp = Nresp,
  n_alts = Nalt,
  n_q = Nchoice,
  n_start = 10,
  priors = list(
    Cost = -0.13,
    UserRating = 3.2,
    Information = c(-1.8, 1.9, 5.2) #<=== The same order for "Low", "Medium", "High"  but excluding the prior for "Unavailable"
  ),
  method = "Modfed",
  keep_db_error = TRUE,
  parallel = TRUE
)

My Questions:

  1. Are my assumptions correct that 1) logitr alphabetically selects the reference category, 2) the order of levels for a categorical attribute determines the order of the priors, and 3) the prior for the first level needs to be ignored in the vector of priors, if that level was used as reference?

  2. Considering that using priors is the main point of DB-Efficient designs, is there a way to increase the perceived reliability of this process when identifying priors for categorical attributes, maybe by using a named list (aka dictionary) to identify the priors? I think improving this aspect will help users make sure they are identifying the correct priors for each level, without having to rely on identifying the values in a specific order that may be different from their original design. For example:

.
.
.
  priors = list(
    Cost = -0.13,
    UserRatings = 3.2,
    Information = list(Unavailable = "Reference", Low = -1.8, Medium = 1.9, High = 5.2)
  ),
.
.
.

If creating a nested list like above wreaks havoc on the internal operations, maybe the named list can be assigned to a variable before using cbc_design, and then that variable can be identified as the prior for a categorical attribute? For example:

InformationPriors <- list(Unavailable = "Reference", Low = -1.8, Medium = 1.9, High = 5.2)
.
.
.
  priors = list(
    Cost = -0.13,
    UserRatings = 3.2,
    Information = InformationPriors
  ),
.
.
.

cbc_design returns a Bayesian D-Efficient design as a list instead of dataframe

After updating my R packages (which updated cbcTools from v0.3.4 to v0.4.0 for me), I noticed that my code wasn't functioning properly anymore. After checking the updates to the package and my code, I noticed that the output of the cbc_design function has changed from a data frame to a list. I suspect this has to do with the addition of reporting db_err.
In my old code, I used to isolate only the design for the first respondent (respID = 1) to inspect the design for balance and overlap; I call it the first design. I did this because in my case (1 block, unlabeled experiment) cbc_design repeats the first design for all respondents (as determined by n_resp). This was causing the problem in my old code, as I was extracting the first design using the head function.
By manipulating the object type (using as.data.frame), I am now getting the first design correctly; however, db_err is now added as a column and repeats the same value for all rows.
I was wondering if it's possible that:

  1. the output of cbc_design can only consist of the efficient design (so cbc_design returns a data frame that contains only the profileID, respID, qID, altID, obsID, attributes, and blockID).
  2. the value of db_err can be accessed independently so it can be assigned to a variable and reported separately (e.g. DBError <- Design(db_err), or a similar method)

While obviously the output list from cbc_design can be manipulated to present a separate data frame for the design, and a value for db_err that can then be assigned to a variable, I think presenting the output as described above creates less complexity for using other functions in this package (e.g. inspecting the design using cbc_balance and cbc_overlap, or choice and power analysis simulation). Currently, the raw output of cbc_design, at least in my case, is incompatible with other functions in the package.
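
For what it's worth, the two pieces can already be pulled out of the returned list directly (a sketch, assuming the v0.4.0 return value has design and db_err elements, which is consistent with the column names in the output below):

DF_design <- as.data.frame(design_dbeff$design) # just the design data frame
DBError <- design_dbeff$db_err                  # the DB-error on its own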

Here's my code, and the output will be pasted below the code:

library("cbcTools")
library("logitr")

Nresp = 150 #identify number of respondents for which the design is optimized
Nchoice = 16 #global variable for the number of choice sets
Nalt = 4 #global variable for the number of alternatives per choice set
HeadN <- Nalt*Nchoice #secondary variable to only print out the first respID set in the DB-Efficient design, other respIDs are just repetition of the same design.

#create full-factorial design based on attributes and levels
profiles <- cbc_profiles(
  cost = seq(20, 30, 5), #generates sequence of numbers between 20 and 30 inclusive, 5 by 5. --Ratio Data
  brand = c("O", "HH"), #--Nominal Data
  usrrt = seq(30, 20, -5), #generates sequence of numbers between 30 and 20 inclusive, in steps of -5, so the highest rating comes first and the top generated profile is the best. --Ratio Data
  accinf = c("H", "M", "L", "U") #--Ordinal Data
)

#restrict dominant alternative and worst alternative
rstrct_profiles <- cbc_restrict(
  profiles,
  cost == 20 & brand == "O" & usrrt == 30 & accinf == "H", #exclude dominant alternative with the best levels for each attribute
  cost == 30 & brand == "HH" & usrrt == 20 & accinf == "U" #exclude the worst alternative with the worst levels for each attribute
)

#create DB-efficient design from restricted full-factorial design with zero priors
design_dbeff <- cbc_design(
  profiles = rstrct_profiles,
  n_resp = Nresp,
  n_alts = Nalt, #number of alternatives in each choice set
  n_q = Nchoice, #number of "questions" or choice sets
  n_start = 20, #numeric value indicating the number of random start designs to use
  priors = list(
    cost = 0,
    brand = 0,
    usrrt = 0,
    accinf = c(0, 0, 0)
  ),
  max_iter = 1000,
  method = "Modfed",
  keep_db_error = TRUE,
  parallel = TRUE
)

typeof(design_dbeff)

DF_design_dbeff <- as.data.frame(design_dbeff)

FirstDesign <- DF_design_dbeff[1:HeadN,] #Only save the design for the first respID; the rest of the respIDs are repetitions of the first design
FirstDesign

Output:

> typeof(design_dbeff)
[1] "list"
> 
> DF_design_dbeff <- as.data.frame(design_dbeff) #converting list to data frame
> 
> FirstDesign <- DF_design_dbeff[1:HeadN,] #Only save the design for the first respID; the rest of the respIDs are repetitions of the first design
> FirstDesign
   design.profileID design.respID design.qID design.altID design.obsID design.cost design.brand design.usrrt design.accinf design.blockID    db_err
1                10             1          1            1            1          25           HH           25             H              1 0.4261433
2                25             1          1            2            1          25            O           25             M              1 0.4261433
3                64             1          1            3            1          25           HH           25             U              1 0.4261433
4                37             1          1            4            1          25            O           30             L              1 0.4261433
5                14             1          2            1            2          30            O           20             H              1 0.4261433
6                61             1          2            2            2          25            O           25             U              1 0.4261433
7                35             1          2            3            2          30           HH           20             M              1 0.4261433
8                46             1          2            4            2          25           HH           25             L              1 0.4261433
9                63             1          3            1            3          20           HH           25             U              1 0.4261433
10               40             1          3            2            3          25           HH           30             L              1 0.4261433
11               22             1          3            3            3          25           HH           30             M              1 0.4261433
12                6             1          3            4            3          20            O           25             H              1 0.4261433
13               50             1          4            1            4          30            O           20             L              1 0.4261433
14               17             1          4            2            4          30           HH           20             H              1 0.4261433
15               20             1          4            3            4          30            O           30             M              1 0.4261433
16               59             1          4            4            4          30           HH           30             U              1 0.4261433
17               25             1          5            1            5          25            O           25             M              1 0.4261433
18               38             1          5            2            5          30            O           30             L              1 0.4261433
19               63             1          5            3            5          20           HH           25             U              1 0.4261433
20               59             1          5            4            5          30           HH           30             U              1 0.4261433
21               53             1          6            1            6          30           HH           20             L              1 0.4261433
22               66             1          6            2            6          20            O           20             U              1 0.4261433
23               33             1          6            3            6          20           HH           20             M              1 0.4261433
24               14             1          6            4            6          30            O           20             H              1 0.4261433
25                5             1          7            1            7          30           HH           30             H              1 0.4261433
26               56             1          7            2            7          30            O           30             U              1 0.4261433
27               49             1          7            3            7          25            O           20             L              1 0.4261433
28               34             1          7            4            7          25           HH           20             M              1 0.4261433
29               63             1          8            1            8          20           HH           25             U              1 0.4261433
30               20             1          8            2            8          30            O           30             M              1 0.4261433
31                6             1          8            3            8          20            O           25             H              1 0.4261433
32               45             1          8            4            8          20           HH           25             L              1 0.4261433
33               30             1          9            1            9          20            O           20             M              1 0.4261433
34                8             1          9            2            9          30            O           25             H              1 0.4261433
35               69             1          9            3            9          20           HH           20             U              1 0.4261433
36               47             1          9            4            9          30           HH           25             L              1 0.4261433
37               17             1         10            1           10          30           HH           20             H              1 0.4261433
38               53             1         10            2           10          30           HH           20             L              1 0.4261433
39                4             1         10            3           10          25           HH           30             H              1 0.4261433
40               32             1         10            4           10          30            O           20             M              1 0.4261433
41                4             1         11            1           11          25           HH           30             H              1 0.4261433
42               19             1         11            2           11          25            O           30             M              1 0.4261433
43               61             1         11            3           11          25            O           25             U              1 0.4261433
44               46             1         11            4           11          25           HH           25             L              1 0.4261433
45               34             1         12            1           12          25           HH           20             M              1 0.4261433
46               68             1         12            2           12          30            O           20             U              1 0.4261433
47                7             1         12            3           12          25            O           25             H              1 0.4261433
48               46             1         12            4           12          25           HH           25             L              1 0.4261433
49               43             1         13            1           13          25            O           25             L              1 0.4261433
50                3             1         13            2           13          20           HH           30             H              1 0.4261433
51               54             1         13            3           13          20            O           30             U              1 0.4261433
52               23             1         13            4           13          30           HH           30             M              1 0.4261433
53                8             1         14            1           14          30            O           25             H              1 0.4261433
54               42             1         14            2           14          20            O           25             L              1 0.4261433
55                9             1         14            3           14          20           HH           25             H              1 0.4261433
56               65             1         14            4           14          30           HH           25             U              1 0.4261433
57               28             1         15            1           15          25           HH           25             M              1 0.4261433
58               10             1         15            2           15          25           HH           25             H              1 0.4261433
59               43             1         15            3           15          25            O           25             L              1 0.4261433
60               60             1         15            4           15          20            O           25             U              1 0.4261433
61               68             1         16            1           16          30            O           20             U              1 0.4261433
62               27             1         16            2           16          20           HH           25             M              1 0.4261433
63                6             1         16            3           16          20            O           25             H              1 0.4261433
64               17             1         16            4           16          30           HH           20             H              1 0.4261433

Release cbcTools 0.1.0

First release:

Prepare for release:

  • git pull
  • Check if any deprecation processes should be advanced, as described in Gradual deprecation
  • devtools::build_readme()
  • urlchecker::url_check()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • git push
  • Draft blog post

Submit to CRAN:

  • usethis::use_version('minor')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted πŸŽ‰
  • git push
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • git push
  • Finish blog post
  • Tweet
  • Add link to blog post in pkgdown news menu

duplicated profiles within a question

Hello,
I think I found that cbc_design() generates designs which contain errors. Please see the demo code below.

library(cbcTools)
profiles_3x3x2 <- cbc_profiles(
  A = c("A1", "A2", "A3"), 
  B = c("B1", "B2", "B3"), 
  C = c("C1", "C2")
)
set.seed(123)
demo_CEA <- cbc_design(
  profiles = profiles_3x3x2,  
  n_resp   = 10,      
  n_alts   = 3,       
  n_q      = 7,       
  no_choice = FALSE,  
  method   = 'CEA',   
  priors   = list(A = c(0,0), B = c(0, 0), C = 0) 
)
print(demo_CEA[demo_CEA$respID == 1 & demo_CEA$qID == 1,])

The output is:

  profileID respID qID altID obsID blockID  A  B  C
1        16      1   1     1     1       1 A1 B3 C2
2        14      1   1     2     1       1 A2 B2 C2
3        14      1   1     3     1       1 A2 B2 C2

The profiles for altID = 2 and 3 are duplicated.

I guess this behavior comes from join_profiles() in design.R. At line 332, it merges the design with the profiles, and it doesn't retain the row order of the design.
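
A minimal illustration of that merge() behavior (not cbcTools code; made-up data frames):

# merge() returns rows ordered by the merge key, not in the original row order,
# so positional alignment with the design rows is lost.
design <- data.frame(altID = 1:3, A = c("A3", "A1", "A2"))
profiles <- data.frame(A = c("A1", "A2", "A3"), profileID = 1:3)
merge(design, profiles, by = "A")
#    A altID profileID
# 1 A1     2         1
# 2 A2     3         2
# 3 A3     1         3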

just to be sure, my sessionInfo() is:

R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default


locale:
[1] LC_COLLATE=Japanese_Japan.utf8  LC_CTYPE=Japanese_Japan.utf8    LC_MONETARY=Japanese_Japan.utf8
[4] LC_NUMERIC=C                    LC_TIME=Japanese_Japan.utf8    

time zone: Asia/Tokyo
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] cbcTools_0.5.0

loaded via a namespace (and not attached):
 [1] vctrs_0.6.3       cli_3.6.1         rlang_1.1.1       promises_1.2.0.1  generics_0.1.3   
 [6] shiny_1.7.4.1     xtable_1.8-4      glue_1.6.2        colorspace_2.1-0  htmltools_0.5.5  
[11] httpuv_1.6.11     scales_1.2.1      fansi_1.0.4       grid_4.3.1        munsell_0.5.0    
[16] tibble_3.2.1      ellipsis_0.3.2    MASS_7.3-60       fastmap_1.1.1     lifecycle_1.0.3  
[21] idefix_1.0.3      compiler_4.3.1    dplyr_1.1.2       Rcpp_1.0.11       pkgconfig_2.0.3  
[26] later_1.3.1       rstudioapi_0.15.0 digest_0.6.33     R6_2.5.1          tidyselect_1.2.0 
[31] utf8_1.2.3        Rdpack_2.4        pillar_1.9.0      rbibutils_2.2.13  magrittr_2.0.3   
[36] tools_4.3.1       gtable_0.3.3      mime_0.12         ggplot2_3.4.2    

Thank you,
Shigeru ONO

Release cbcTools 0.2.0

First release:

Prepare for release:

  • git pull
  • Check if any deprecation processes should be advanced, as described in Gradual deprecation
  • devtools::build_readme()
  • urlchecker::url_check()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • git push
  • Draft blog post

Submit to CRAN:

  • usethis::use_version('minor')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted πŸŽ‰
  • git push
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • git push
  • Finish blog post
  • Tweet
  • Add link to blog post in pkgdown news menu

Error when using restricted_profiles and method="Modfed" or "dopt"

When I use method="Modfed" or method="dopt" in the example from the vignette with restricted_profiles I get an error:

design <- cbc_design(
  profiles = restricted_profiles,
  n_resp = 900, # Number of respondents
  n_alts = 3,   # Number of alternatives per question
  n_q = 6,      # Number of questions per respondent
  method = "Modfed"
)

Error in check_inputs_design(profiles, n_resp, n_alts, n_q, n_blocks, :
  Restricted profile sets can only be used with the "random", "full" "dopt", or "Modfed" methods

Any hints on what is wrong? Thank you!
