[1] '2.1.5'
questionnaire_gen(n_obs, cat_prop = NULL, n_vars = NULL, n_X = NULL, n_W = NULL, cor_matrix = NULL,
cov_matrix = NULL, c_mean = NULL, c_sd = NULL, theta = FALSE, family = NULL, full_output = FALSE,
verbose = TRUE)
By default, the function returns a data.frame object where the first column (“subject”) is a \(1, \ldots, n\) ordered list of the \(n\) observations and the other columns correspond to the questionnaire answers. If theta = TRUE, the first column after “subject” will be the latent variable theta; in any case, the continuous variables always come before the categorical ones.
If the logical argument full_output is TRUE, the output will be a list containing the questionnaire data as well as several objects that might be of interest for further analysis of the data, listed below:
bg: a data frame containing the background questionnaire answers (i.e., the same object output if full_output = FALSE).
c_mean: vector of population means for each continuous variable (\(Y\) and \(X\)).
c_sd: vector of population standard deviations for each continuous variable (\(Y\) and \(X\)).
cat_prop: list of cumulative proportions for each item. If theta = TRUE, the first element of cat_prop must be a scalar 1, which corresponds to theta.
cat_prop_W_p: list containing the probabilities for each category of the categorical variables (cat_prop_W contains the cumulative probabilities).
cor_matrix: latent correlation matrix. The first row/column corresponds to the latent trait (\(Y\)). The other rows/columns correspond to the continuous (\(X\) or \(Z\)) or the discrete (\(W\)) background variables, in the same order as cat_prop.
cov_matrix: latent covariance matrix, formatted as cor_matrix.
family: distribution of the background variables. Can be NULL (default) or "gaussian".
n_obs: number of observations to generate.
n_tot: named vector containing the total number of variables, the number of continuous background variables (i.e., the continuous variables other than theta) and the number of categorical variables.
n_W: vector containing the number of categorical variables.
n_X: vector containing the number of continuous variables (except theta).
sd_YXW: vector with the standard deviations of all the variables.
sd_YXZ: vector containing the standard deviations of theta, the background continuous variables (\(X\)) and the normally distributed variables \(Z\) which will generate the background categorical variables (\(W\)).
theta: if TRUE, the first continuous variable will be labeled “theta”. Otherwise, it will be labeled q1.
var_W: list containing the variances of the categorical variables.
var_YX: list containing the variances of the continuous variables (including theta).
linear_regression: this list is returned only if theta = TRUE, family = "gaussian" and full_output = TRUE. It contains one vector named betas and one table named vcov_YXW. The former displays the true linear regression coefficients of theta on the background questionnaire answers; the latter contains the covariance matrix between all these variables.
We generate one continuous and two ordinal covariates. We specify the covariance matrix between the numeric and ordinal variables. The data is generated from a multivariate normal distribution. We also set the logical argument full_output = TRUE.
The output is a list containing the following elements: bg, c_mean, c_sd, cat_prop, cat_prop_W_p, cor_matrix, cov_matrix, family, n_W, n_X, n_obs, n_tot, sd_YXW, sd_YXZ, theta, var_W, var_YX, verbose, linear_regression.
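The call in this example uses two objects, props and yw_cov, whose contents are printed next but whose definitions are not shown. The following is a minimal sketch that rebuilds them in base R from the printed values; the way they are constructed here is an assumption, only the numbers come from the output. The helper name cat_probs is introduced purely for illustration and shows how cumulative proportions relate to per-category probabilities.

```r
# Cumulative proportions: a scalar 1 for theta, then one vector per
# ordinal item (values copied from the printed list).
props <- list(1, c(0.25, 1), c(0.2, 0.8, 1))

# Covariance matrix between theta and the two ordinal items
# (values copied from the printed matrix).
yw_cov <- matrix(c(1.0, 0.5, 0.5,
                   0.5, 1.0, 0.8,
                   0.5, 0.8, 1.0),
                 nrow = 3, byrow = TRUE)

# Per-category probabilities are the successive differences of the
# cumulative proportions; theta (the first element) is skipped.
# cat_probs is a hypothetical name used only in this sketch.
cat_probs <- lapply(props[-1], function(p) diff(c(0, p)))

print(props)
print(yw_cov)
print(cat_probs)
```

With these inputs, the differences in cat_probs can be compared against the cat_prop_W_p element of the full output.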
[[1]]
[1] 1
[[2]]
[1] 0.25 1.00
[[3]]
[1] 0.2 0.8 1.0
     [,1] [,2] [,3]
[1,]  1.0  0.5  0.5
[2,]  0.5  1.0  0.8
[3,]  0.5  0.8  1.0
questionnaire_gen(n_obs = 10, cat_prop = props, cov_matrix = yw_cov, theta = TRUE, family = "gaussian",
full_output = TRUE)
$bg
   subject      theta q1 q2
1        1 -0.8440231  2  2
2        2 -2.0198262  2  2
3        3 -0.7921984  1  1
4        4 -1.1724355  1  1
5        5 -0.5099209  2  2
6        6 -0.4202077  1  1
7        7 -0.2292551  2  3
8        8 -0.4616903  2  2
9        9 -0.8524573  1  2
10      10 -1.1829590  2  1
$c_mean
[1] 0
$c_sd
[1] 1
$cat_prop
$cat_prop[[1]]
[1] 1
$cat_prop[[2]]
[1] 0.25 1.00
$cat_prop[[3]]
[1] 0.2 0.8 1.0
$cat_prop_W_p
$cat_prop_W_p[[1]]
[1] 0.25 0.75
$cat_prop_W_p[[2]]
[1] 0.2 0.6 0.2
$cor_matrix
      theta  q1  q2
theta   1.0 0.5 0.5
q1      0.5 1.0 0.8
q2      0.5 0.8 1.0
$cov_matrix
      theta  q1  q2
theta   1.0 0.5 0.5
q1      0.5 1.0 0.8
q2      0.5 0.8 1.0
$family
[1] "gaussian"
$n_W
[1] 2
$n_X
[1] 0
$n_obs
[1] 10
$n_tot
n_vars    n_X    n_W  theta
     3      0      2      1
$sd_YXW
[1] 1.0000000 0.4330127 0.4330127 0.4000000 0.4898979 0.4000000
$sd_YXZ
[1] 1 1 1
$theta
[1] TRUE
$var_W
$var_W[[1]]
[1] 0.1875 0.1875
$var_W[[2]]
[1] 0.16 0.24 0.16
$var_YX
[1] 1
$verbose
[1] TRUE
$linear_regression
$linear_regression$betas
     theta       q1.2       q2.2       q2.3
-0.8218134  0.4547365  0.4450622  1.0686187
$linear_regression$vcov_YXW
             theta       q1.2          q2.2        q2.3
theta 1.000000e+00 0.15888829  5.551115e-17  0.13998096
q1.2  1.588883e-01 0.18750000  4.710271e-02  0.04928003
q2.2  5.551115e-17 0.04710271  2.400000e-01 -0.12000000
q2.3  1.399810e-01 0.04928003 -1.200000e-01  0.16000000
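Some of the printed elements can be cross-checked with base R. The sketch below rests on two assumptions that are not taken from the package source: that var_W holds the Bernoulli variance p(1 - p) of each category indicator, and that betas follows the usual population least-squares formula applied to vcov_YXW. The names p, S, b and intercept are introduced only for this illustration.

```r
# Category probabilities, copied from $cat_prop_W_p in the output above.
p <- list(q1 = c(0.25, 0.75), q2 = c(0.2, 0.6, 0.2))

# Assumed relation: each category indicator is Bernoulli, so its
# variance is p * (1 - p); sd_YXW stacks sd(theta) = 1 and their roots.
var_W  <- lapply(p, function(x) x * (1 - x))
sd_YXW <- c(1, sqrt(unlist(var_W, use.names = FALSE)))

# Covariance matrix copied from $linear_regression$vcov_YXW
# (entries printed as 5.551115e-17 are entered as 0).
vars <- c("theta", "q1.2", "q2.2", "q2.3")
S <- matrix(c(1,          0.15888829, 0,          0.13998096,
              0.15888829, 0.18750000, 0.04710271, 0.04928003,
              0,          0.04710271, 0.24,      -0.12,
              0.13998096, 0.04928003, -0.12,      0.16),
            nrow = 4, byrow = TRUE, dimnames = list(vars, vars))

# Assumed formula: the slopes solve Cov(W) %*% b = Cov(W, theta).
b <- solve(S[-1, -1], S[-1, 1])

# Intercept: E[theta] - sum(b * E[W]); E[theta] = 0 (see $c_mean) and each
# E[W] is the probability of the corresponding non-reference category.
intercept <- 0 - sum(b * c(p$q1[2], p$q2[2:3]))

# Follow the printed labeling, where the first entry is named "theta".
betas <- setNames(c(intercept, b), vars)
print(round(sd_YXW, 7))
print(round(betas, 4))
```

Under these assumptions the results agree with the printed sd_YXW and betas up to rounding, which suggests that the first entry of betas (labeled theta) acts as the intercept of the regression.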