Set up MPT model formula
mpt_formula() sets up the model formula(s) that specify the regression
structure and hierarchical structure of the MPT model parameters. The regression
structure allows parameters to vary across both between-participants and
within-participants conditions. Random effects (i.e., a hierarchical structure)
are defined using lme4/brms-style syntax (e.g., (1|id)).
For mpt_formula objects, brms::stancode and brms::standata methods are
provided.
Usage
mpt_formula(
formula,
...,
response,
model,
data_format = "long",
brms_args = list()
)
# S3 method for class 'mpt_formula'
stancode(
object,
data,
default_prior_intercept = "normal(0, 1)",
default_prior_coef = "normal(0, 0.5)",
default_priors = TRUE,
tree,
log_p = FALSE,
link = "probit",
...
)
# S3 method for class 'mpt_formula'
standata(object, data, tree, ...)
Arguments
- formula
An object of class formula providing a symbolic description of the regression model and hierarchical structure applied to MPT model parameters. If only one formula is given, the left-hand side (LHS) needs to give the response variable and the right-hand side (RHS) gives the model structure for all parameters.
- ...
For mpt_formula(), optional additional formula objects providing a symbolic description of the regression model and hierarchical structure applied to the remaining MPT model parameters. For the mpt_formula methods, additional arguments passed to the corresponding default methods.
- response
One-sided formula or character vector giving the name of the response variable. Cannot be missing if a formula is specified for each parameter and the data are given in long format (see the data_format argument).
- model
An mpt_model object as created by make_mpt().
- data_format
Character string indicating whether the formula is to be generated for fitting data in long format / non-aggregated data ("long", the default), where a single variable contains trial-level responses, or for data in wide format / aggregated data ("wide"), where a separate column for each response category contains the respective frequency.
- brms_args
A list of additional arguments, such as center, passed to brms::brmsformula(), which is the function that ultimately creates the formula for fitting the model.
- object
An object of class mpt_formula.
- data
A data.frame containing the variables in formula. The data need to be at the observation level (i.e., each row is one response/observation) and cannot be aggregated in any way. TODO: change this
- default_prior_intercept
Character string describing the prior applied to the fixed-effect intercepts for each MPT model parameter on the unconstrained scale (if default_priors = TRUE). The default, "normal(0, 1)", implies a flat prior on the MPT parameter scale under the default probit link.
- default_prior_coef
Character string describing the prior applied to the non-intercept fixed-effect parameters for each MPT model parameter on the unconstrained scale (if default_priors = TRUE).
- default_priors
Logical value indicating whether (the default, TRUE) or not (FALSE) the priors specified via the default_prior_intercept and default_prior_coef arguments should be applied.
- tree
One-sided formula or character specifying the variable in data indicating the tree (or item type) of a given observation. The values of the tree variable need to match the names of the trees in model. Can be omitted for models with only one tree.
- log_p
Logical value indicating whether the likelihood should be evaluated with probabilities (the default, FALSE) or log probabilities (TRUE). Setting log_p to TRUE can help in case of convergence issues but might be slower.
- link
Character specifying the link function for transforming from unconstrained space to MPT model parameter (i.e., 0 to 1) space. Default is "probit".
Value
An object of class mpt_formula, which is a list containing the
following slots:
- formulas: A list of formulas for each MPT model parameter.
- response: A one-sided formula giving the response variable on the RHS.
- brms_formula: The brmsformula object created by brms::brmsformula().
- model: The mpt_model object passed in the model argument.
- data_format: See the corresponding argument.
The brms::stancode and brms::standata methods for mpt_formula objects
return the same objects as the corresponding default brms methods (which
are internally called).
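Because the returned object is a list, its slots can be inspected with the usual list accessors. The following sketch assumes that mptstan is installed and reuses the model file from the Examples section. The slot names are taken from the list above, and the object name f is only illustrative.

```r
library(mptstan)

# Model with 4 parameters (Dn, Do, g1x, g2x), as in the Examples section
EQNFILE <- system.file("extdata", "u2htm.eqn", package = "mptstan")
model <- make_mpt(EQNFILE)

f <- mpt_formula(resp ~ race + (race | p | id), model = model)

class(f)          # should include "mpt_formula"
names(f)          # slot names as listed in the Value section
f$formulas        # one formula per MPT model parameter
f$response        # one-sided formula holding the response variable
f$data_format     # data format the formula was set up for
```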
Details
There are two ways of using the mpt_formula() function:
- Specify a single formula that applies to all MPT model parameters (passed via model). In this case, the LHS of the formula needs to give the response variable if the data are in long format. For data in wide format / aggregated data, any LHS that is given is ignored.
- Specify a formula for each MPT model parameter of the model. In this case, the LHS of each formula needs to give the parameter's name. Furthermore, the name of the response variable needs to be passed via the response argument for data in long format.
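The relationship between the two ways of specifying formulas can be illustrated in base R. The sketch below is not the implementation used by mpt_formula(). It only shows the idea that a single formula carries the response on its LHS and a shared RHS, which is then paired with each parameter name. The parameter names are those of the model used in the Examples section, and the helper objects (single, params, per_param) are hypothetical.

```r
# A single formula: LHS = response variable, RHS = structure for all parameters
single <- resp ~ race + (race | p | id)

# Parameter names of the example model (assumed here, normally taken from model)
params <- c("Dn", "Do", "g1x", "g2x")

response <- single[[2]]   # LHS: the response variable
rhs <- single[[3]]        # RHS: the shared regression/hierarchical structure

# Build one formula per parameter, with the parameter name on the LHS
per_param <- lapply(params, function(par) {
  as.formula(paste(par, "~", paste(deparse(rhs), collapse = " ")))
})
names(per_param) <- params

response
per_param$Dn
```

Writing out one formula per parameter by hand, as in the second way, becomes useful once parameters should differ in their structure.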
Examples
## Model with 4 parameters: Dn, Do, g1x, g2x
EQNFILE <- system.file("extdata", "u2htm.eqn", package = "mptstan")
u2htsm_model <- make_mpt(EQNFILE)
#> model type auto-detected as 'eqn'
#> Warning: parameter names ending with a number amended with 'x'
u2htsm_model
#>
#> MPT model with 4 independent categories (from 2 trees) and 4 parameters:
#> Dn, Do, g1x, g2x
#>
#> Tree 1: old
#> Categories: old, unsure, new
#> Parameters: Do, g1x, g2x
#> Tree 2: new
#> Categories: old, unsure, new
#> Parameters: Dn, g1x, g2x
#>
## formulas are given for following data
str(skk13)
#> 'data.frame': 8400 obs. of 7 variables:
#> $ id : Factor w/ 42 levels "1","3","5","6",..: 1 1 1 1 1 1 1 1 1 1 ...
#> $ trial: Factor w/ 200 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
#> $ race : Factor w/ 2 levels "german","arabic": 2 1 1 1 1 1 1 2 1 1 ...
#> $ type : Factor w/ 2 levels "old","new": 1 1 2 2 1 1 1 1 2 2 ...
#> $ resp : Factor w/ 3 levels "old","unsure",..: 3 1 3 1 1 1 3 1 3 3 ...
#> $ rt : num 4.68 2.75 4.25 1.6 0.95 ...
#> $ stim : Factor w/ 200 levels "A001","A002",..: 40 132 117 143 140 162 193 19 120 170 ...
#### simplest possible formula: ~ 1
## no random-effects and there is only one set of parameters (i.e., no
## differences across conditions).
## Same model holds for all MPT model parameters
(f1 <- mpt_formula(resp ~ 1, model = u2htsm_model))
#> MPT formulas for long / non-aggregated data (response: resp):
#> Dn ~ 1
#> <environment: 0x55eb8d1ba928>
#> Do ~ 1
#> <environment: 0x55eb8d1ba928>
#> g1x ~ 1
#> <environment: 0x55eb8d1ba928>
#> g2x ~ 1
#> <environment: 0x55eb8d1ba928>
#### model with condition effects: ~ race
## Each parameter differs across the race variable
(f2 <- mpt_formula(resp ~ race, model = u2htsm_model))
#> MPT formulas for long / non-aggregated data (response: resp):
#> Dn ~ race
#> <environment: 0x55eb8d1ba928>
#> Do ~ race
#> <environment: 0x55eb8d1ba928>
#> g1x ~ race
#> <environment: 0x55eb8d1ba928>
#> g2x ~ race
#> <environment: 0x55eb8d1ba928>
### model with simple by-participant random effects
## because race is a within-subject factor, we need random slopes for race
## this model only has correlations within one MPT model parameter
(f3 <- mpt_formula(resp ~ race + (race|id), model = u2htsm_model))
#> MPT formulas for long / non-aggregated data (response: resp):
#> Dn ~ race + (race | id)
#> <environment: 0x55eb8d1ba928>
#> Do ~ race + (race | id)
#> <environment: 0x55eb8d1ba928>
#> g1x ~ race + (race | id)
#> <environment: 0x55eb8d1ba928>
#> g2x ~ race + (race | id)
#> <environment: 0x55eb8d1ba928>
### model with correlated by-participant random effects
## to employ the full latent-trait structure (Klauer, 2010), we need to have
## correlations across MPT model parameters
(f4 <- mpt_formula(resp ~ race + (race|p|id), model = u2htsm_model))
#> MPT formulas for long / non-aggregated data (response: resp):
#> Dn ~ race + (race | p | id)
#> <environment: 0x55eb8d1ba928>
#> Do ~ race + (race | p | id)
#> <environment: 0x55eb8d1ba928>
#> g1x ~ race + (race | p | id)
#> <environment: 0x55eb8d1ba928>
#> g2x ~ race + (race | p | id)
#> <environment: 0x55eb8d1ba928>
### model with crossed random-effects for participants and items:
## because race is a between-item factor (i.e., race is nested within the item
## factor), we only have random intercepts for item, but they are correlated
## across MPT model parameters as well.
(f5 <- mpt_formula(resp ~ race + (race|p|id) + (1|i|item), model = u2htsm_model))
#> MPT formulas for long / non-aggregated data (response: resp):
#> Dn ~ race + (race | p | id) + (1 | i | item)
#> <environment: 0x55eb8d1ba928>
#> Do ~ race + (race | p | id) + (1 | i | item)
#> <environment: 0x55eb8d1ba928>
#> g1x ~ race + (race | p | id) + (1 | i | item)
#> <environment: 0x55eb8d1ba928>
#> g2x ~ race + (race | p | id) + (1 | i | item)
#> <environment: 0x55eb8d1ba928>
### we can also specify an individual structure for each parameter.
## In this case, we need to specify the response variable separately.
(f6 <- mpt_formula(
Do ~ race + (race|p|id) + (1|i|item),
Dn ~ race + (race|p|id) + (1|i|item),
g1x ~ race + (race|p|id) + (1|i|item),
g2x ~ race + (race|p|id) + (1|i|item),
response = ~ resp,
model = u2htsm_model))
#> MPT formulas for long / non-aggregated data (response: resp):
#> Dn ~ race + (race | p | id) + (1 | i | item)
#> <environment: 0x55eb8d1ba928>
#> Do ~ race + (race | p | id) + (1 | i | item)
#> <environment: 0x55eb8d1ba928>
#> g1x ~ race + (race | p | id) + (1 | i | item)
#> <environment: 0x55eb8d1ba928>
#> g2x ~ race + (race | p | id) + (1 | i | item)
#> <environment: 0x55eb8d1ba928>
all.equal(f5, f6) ## TRUE
#> [1] TRUE
### this is more interesting if we want different structures for each parameter
(f7 <- mpt_formula(
Do ~ 1 + (1|p|id) + (1|i|item),
Dn ~ race + (race|p|id) + (1|i|item),
g1x ~ 1 + (1|p|id) + (1|i|item),
g2x ~ race + (race|p|id) + (1|i|item),
response = ~ resp,
model = u2htsm_model))
#> MPT formulas for long / non-aggregated data (response: resp):
#> Dn ~ race + (race | p | id) + (1 | i | item)
#> <environment: 0x55eb8d1ba928>
#> Do ~ 1 + (1 | p | id) + (1 | i | item)
#> <environment: 0x55eb8d1ba928>
#> g1x ~ 1 + (1 | p | id) + (1 | i | item)
#> <environment: 0x55eb8d1ba928>
#> g2x ~ race + (race | p | id) + (1 | i | item)
#> <environment: 0x55eb8d1ba928>
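The stancode and standata methods described above can be applied to any of these formula objects. The following sketch assumes that mptstan and a version of brms that exports the stancode() and standata() generics are installed, and that the type column of skk13 identifies the tree (its levels, old and new, match the tree names of the model). It uses the formula from f4 because the str(skk13) output above does not list an item column. The object names code and sdat are only illustrative.

```r
library(mptstan)

EQNFILE <- system.file("extdata", "u2htm.eqn", package = "mptstan")
u2htsm_model <- make_mpt(EQNFILE)

# Formula with correlated by-participant random effects (f4 above)
f4 <- mpt_formula(resp ~ race + (race | p | id), model = u2htsm_model)

# Stan code for the model; tree gives the column identifying the tree
code <- brms::stancode(f4, data = skk13, tree = "type")

# Data list in the form passed to Stan
sdat <- brms::standata(f4, data = skk13, tree = "type")

cat(code)     # print the generated Stan program
names(sdat)   # names of the elements passed to Stan
```

Inspecting the generated code and data before fitting can help to check that priors and grouping factors were set up as intended.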