### Terrence D. Jorgensen
### Last updated: 12 March 2025
##' Parcel-Allocation Variability in Model Ranking
##'
##' This function quantifies and assesses the consequences of parcel-allocation
##' variability for model ranking of structural equation models (SEMs) that
##' differ in their structural specification but share the same parcel-level
##' measurement specification (see Sterba & Rights, 2016). This function calls
##' [parcelAllocation()]---which can be used with only one SEM in
##' isolation---to fit two (assumed) nested models to each of a specified number
##' of random item-to-parcel allocations. Output includes summary information
##' about the distribution of model selection results (including plots) and the
##' distribution of results for each model individually, across allocations
##' within-sample. Note that this function can be used when selecting among more
##' than two competing structural models as well (see instructions below
##' involving the `iseed=` argument).
##'
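##' As a minimal sketch of ranking more than two models, each pairwise call
##' can reuse the same allocations by fixing `iseed=` (here, `modA`, `modB`,
##' and `modC` are hypothetical model-syntax strings, with data and parceling
##' arguments as in the **Examples** below):
##'
##' ```
##' ## modA, modB, and modC are hypothetical lavaan model-syntax strings
##' AB <- PAVranking(model0 = modA, model1 = modB, data = simParcel,
##'                  parcel.names = parcel.names, item.syntax = item.syntax,
##'                  nAlloc = 100, iseed = 12345)
##' AC <- PAVranking(model0 = modA, model1 = modC, data = simParcel,
##'                  parcel.names = parcel.names, item.syntax = item.syntax,
##'                  nAlloc = 100, iseed = 12345)
##' ```
##'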
##' This is based on a SAS macro `ParcelAlloc` (Sterba & MacCallum, 2010).
##' The `PAVranking()` function produces results discussed in Sterba and
##' Rights (2016) relevant to the assessment of parcel-allocation variability in
##' model selection and model ranking. Specifically, the `PAVranking()`
##' function first calls [parcelAllocation()] to generate a given
##' number (`nAlloc=`) of item-to-parcel allocations, fitting both specified
##' models to each allocation, and providing summaries of PAV for each model.
##' Additionally, `PAVranking()` provides the following new summaries:
##'
##' \itemize{
##' \item{PAV in model selection index values and model ranking between
##' Models `model0=` and `model1=`.}
##' \item{The proportion of allocations that converged and the proportion of
##' proper solutions (results are summarized only across allocations that
##' both converged and yielded proper solutions).}
##' }
##'
##' For further details on the benefits of the random allocation of items to
##' parcels, see Sterba (2011) and Sterba and MacCallum (2010).
##'
##' To test whether nested models have equivalent fit, results can be pooled
##' across allocations using the same methods available for pooling results
##' across multiple imputations of missing data (see **Examples**).
##'
##' *Note*: This function requires the `lavaan` package. Missing data
##' must be coded as `NA`. If the function returns `"Error in
##' plot.new() : figure margins too large"`, the user may need to increase the
##' size of the plot window (e.g., in RStudio) and rerun the function.
##'
##'
##' @importFrom stats sd
##' @importFrom lavaan parTable lavListInspect lavaanList
##' @importFrom graphics hist
##'
##' @param model0,model1 [lavaan::lavaan()] model syntax specifying
##' nested models (`model0` within `model1`) to be fitted
##' to the same parceled data. Note that there can be a mixture of
##' items and parcels (even within the same factor), in case certain items
##' should never be parceled. Can be a character string or parameter table.
##' Also see [lavaan::lavaanify()] for more details.
##' @param data A `data.frame` containing all observed variables appearing
##' in `model0=` and `model1=`, as well as those in the `item.syntax=` used to
##' create parcels. If the data have missing values, multiple imputation
##' before parceling is recommended: submit a stacked data set (with a variable
##' for the imputation number, so they can be separated later) and set
##' `do.fit=FALSE` to return the list of `data.frame`s (one per
##' allocation), each of which is a stacked, multiply imputed data set with
##' parcels created using the same allocation scheme.
##' @param parcel.names `character` vector containing names of all parcels
##' appearing as indicators in `model0=` or `model1=`.
##' @param item.syntax [lavaan::lavaan()] model syntax specifying the model
##' that would be fit to all of the unparceled items, including items that
##' should be randomly allocated to parcels appearing in `model0=` and `model1=`.
##' @param nAlloc The number of random items-to-parcels allocations to generate.
##' @param fun `character` string indicating the name of the
##' [lavaan::lavaan()] function used to fit `model0=` and `model1=` to `data=`.
##' Can only take the values `"lavaan"`, `"sem"`, `"cfa"`, or `"growth"`.
##' @param alpha Alpha level used as criterion for significance.
##' @param bic.crit Criterion for assessing evidence in favor of one model
##' over another. See Raftery (1995) for guidelines (default is "very
##' strong evidence" in favor of the model with lower BIC).
##' @param fit.measures `character` vector containing names of fit measures
##' to request from each fitted [lavaan::lavaan-class] model. See the
##' output of [lavaan::fitMeasures()] for a list of available measures.
##' @param \dots Additional arguments to be passed to
##' [lavaan::lavaanList()]. See also [lavaan::lavOptions()]
##' @param show.progress If `TRUE`, show a [utils::txtProgressBar()]
##' indicating how fast each model-fitting iterates over allocations.
##' @param iseed (Optional) Random seed used for parceling items. When the same
##' random seed is specified and the program is re-run, the same allocations
##' will be generated. The `iseed=` argument can be used to assess parcel-allocation
##' variability in model ranking when considering more than two models. For each
##' pair of models under comparison, the program should be rerun using the same
##' random seed. Doing so ensures that multiple model comparisons will employ
##' the same set of parcel datasets. *Note*: When using \pkg{parallel}
##' options, you must first type `RNGkind("L'Ecuyer-CMRG")` into the R
##' Console, so that the seed will be controlled across cores.
##' @param warn Whether to print warnings when fitting models to each allocation
##'
##' @return
##' A `list` with 3 elements. The first two (`Model0_Results` and
##' `Model1_Results`) are results returned by [parcelAllocation()]
##' for `model0` and `model1`, respectively.
##' The third element (`Model0.v.Model1`) is a `list` of
##' model-comparison results, including the following:
##' \item{`LRT_Summary:`}{ The average likelihood ratio test across
##' allocations, as well as the *SD*, minimum, maximum, range, and the
##' proportion of allocations for which the test was significant.}
##' \item{`Fit_Index_Differences:`}{ Differences in fit indices, organized
##' by what proportion favored each model and among those, what the average
##' difference was.}
##' \item{`Favored_by_BIC:`}{ The proportion of allocations in which each
##' model met the criterion (`bic.crit`) for a substantial difference
##' in fit.}
##' \item{`Convergence_Summary:`}{ The proportion of allocations in which
##' each model (and both models) converged on a solution.}
##'
##' Histograms are also printed to the current plot-output device.
##'
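##' As a brief illustration (assuming the returned `list` is stored in `out`),
##' the model-comparison summaries can be inspected directly:
##'
##' ```
##' out$Model0.v.Model1$LRT_Summary         # average LRT across allocations
##' out$Model0.v.Model1$Favored_by_BIC      # proportion of allocations per model
##' out$Model0.v.Model1$Convergence_Summary # proportion of converged allocations
##' ```
##'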
##' @author
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @seealso [parcelAllocation()] for fitting a single model,
##' [poolMAlloc()] for choosing the number of allocations
##'
##' @references
##' Raftery, A. E. (1995). Bayesian model selection in social
##' research. *Sociological Methodology, 25*, 111--163. \doi{10.2307/271063}
##'
##' Sterba, S. K. (2011). Implications of parcel-allocation variability for
##' comparing fit of item-solutions and parcel-solutions. *Structural
##' Equation Modeling, 18*(4), 554--577. \doi{10.1080/10705511.2011.607073}
##'
##' Sterba, S. K., & MacCallum, R. C. (2010). Variability in parameter estimates
##' and model fit across repeated allocations of items to parcels.
##' *Multivariate Behavioral Research, 45*(2), 322--358.
##' \doi{10.1080/00273171003680302}
##'
##' Sterba, S. K., & Rights, J. D. (2016). Accounting for parcel-allocation
##' variability in practice: Combining sources of uncertainty and choosing the
##' number of allocations. *Multivariate Behavioral Research, 51*(2--3),
##' 296--313. \doi{10.1080/00273171.2016.1144502}
##'
##' Sterba, S. K., & Rights, J. D. (2017). Effects of parceling on model
##' selection: Parcel-allocation variability in model ranking.
##' *Psychological Methods, 22*(1), 47--68. \doi{10.1037/met0000067}
##'
##' @examples
##'
##' ## Specify the item-level model (if NO parcels were created)
##' ## This must apply to BOTH competing models
##'
##' item.syntax <- c(paste0("f1 =~ f1item", 1:9),
##'                  paste0("f2 =~ f2item", 1:9))
##' cat(item.syntax, sep = "\n")
##' ## Below, we reduce the size of this same model by
##' ## applying different parceling schemes
##'
##' ## Specify a 2-factor CFA with correlated factors, using 3-indicator parcels
##' mod1 <- '
##' f1 =~ par1 + par2 + par3
##' f2 =~ par4 + par5 + par6
##' '
##' ## Specify a more restricted model with orthogonal factors
##' mod0 <- '
##' f1 =~ par1 + par2 + par3
##' f2 =~ par4 + par5 + par6
##' f1 ~~ 0*f2
##' '
##' ## names of parcels (must apply to BOTH models)
##' (parcel.names <- paste0("par", 1:6))
##'
##' \donttest{
##' ## override default random-number generator to use parallel options
##' RNGkind("L'Ecuyer-CMRG")
##'
##' PAVranking(model0 = mod0, model1 = mod1, data = simParcel, nAlloc = 100,
##'            parcel.names = parcel.names, item.syntax = item.syntax,
##'            # parallel = "multicore", # parallel available on Mac/Linux
##'            std.lv = TRUE) # any additional lavaan arguments
##'
##'
##'
##' ## POOL RESULTS by treating parcel allocations as multiple imputations.
##' ## Details provided in Sterba & Rights (2016); see ?poolMAlloc.
##'
##' ## save list of data sets instead of fitting model yet
##' dataList <- parcelAllocation(mod0, # or mod1 (either uses same allocations)
##'                              data = simParcel, nAlloc = 100,
##'                              parcel.names = parcel.names,
##'                              item.syntax = item.syntax,
##'                              do.fit = FALSE)
##' ## now fit each model to each data set
##' if (requireNamespace("lavaan.mi")) {
##'   library(lavaan.mi)
##'   fit0 <- cfa.mi(mod0, data = dataList, std.lv = TRUE)
##'   fit1 <- cfa.mi(mod1, data = dataList, std.lv = TRUE)
##'   anova(fit0, fit1) # Pooled test statistic comparing models.
##'   help(package = "lavaan.mi") # Find more methods for pooling results.
##' }
##'
##' }
##'
##' @export
PAVranking <- function(model0, model1, data, parcel.names, item.syntax,
                       nAlloc = 100, fun = "sem", alpha = .05, bic.crit = 10,
                       fit.measures = c("chisq","df","cfi","tli","rmsea",
                                        "srmr","logl","aic","bic","bic2"), ...,
                       show.progress = FALSE, iseed = 12345, warn = FALSE) {
  if (alpha >= 1 || alpha <= 0) stop('alpha level must be between 0 and 1')
  bic.crit <- abs(bic.crit)
  ## fit each model
  out0 <- parcelAllocation(model = model0, data = data, nAlloc = nAlloc,
                           parcel.names = parcel.names, item.syntax = item.syntax,
                           fun = fun, alpha = alpha, fit.measures = fit.measures,
                           ..., show.progress = show.progress, iseed = iseed,
                           return.fit = TRUE, warn = warn)
  out1 <- parcelAllocation(model = model1, data = data, nAlloc = nAlloc,
                           parcel.names = parcel.names, item.syntax = item.syntax,
                           fun = fun, alpha = alpha, fit.measures = fit.measures,
                           ..., show.progress = show.progress, iseed = iseed,
                           return.fit = TRUE, warn = warn)
  ## convergence summary
  conv0 <- out0$Model@meta$ok
  conv1 <- out1$Model@meta$ok
  conv01 <- conv0 & conv1
  conv <- data.frame(Proportion_Converged = sapply(list(conv0, conv1, conv01), mean),
                     row.names = c("model0","model1","Both_Models"))
  ## add proper solutions? I would advise against
  ## check df matches assumed nesting
  DF0 <- out0$Fit["df", "Avg"]
  DF1 <- out1$Fit["df", "Avg"]
  if (DF0 == DF1) stop('Models have identical df, so they cannot be compared.')
  if (DF0 < DF1) {
    warning('model0 should be nested within model1, ',
            'but df_0 < df_1. Swapping models.')
    temp <- out0
    out0 <- out1
    out1 <- temp
  }
  ## Re-run lavaanList to collect model-comparison results
  if (show.progress) message('Re-fitting model0 to collect model-comparison ',
                             'statistics\n')
  oldCall <- out0$Model@call
  oldCall$model <- parTable(out0$Model)
  oldCall$dataList <- out0$Model@external$dataList[conv01]
  if (!is.null(oldCall$parallel)) {
    if (oldCall$parallel == "snow") {
      oldCall$parallel <- "no"
      oldCall$ncpus <- 1L
      if (warn) warning("Unable to pass lavaan::lavTestLRT() arguments when ",
                        "parallel = 'snow'. Switching to parallel = 'no'. ",
                        "Unless using Windows, parallel = 'multicore' works.")
    }
  }
  PT1 <- parTable(out1$Model)
  op1 <- lavListInspect(out1$Model, "options")
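  ## Custom FUN for lavaanList(): each fitted model0 (obj) is compared to
  ## model1, which is refitted to the same data slot, via lavTestLRT()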
  oldCall$FUN <- function(obj) {
    fit1 <- try(lavaan::lavaan(model = PT1, slotOptions = op1,
                               slotData = obj@Data), silent = TRUE)
    if (inherits(fit1, "try-error")) return("fit failed")
    out <- try(lavaan::lavTestLRT(obj, fit1), silent = TRUE)
    if (inherits(out, "try-error")) return("lavTestLRT() failed")
    out
  }
  fit01 <- eval(as.call(oldCall))
  ## check if there are any results
  noLRT <- sapply(fit01@funList, is.character)
  if (all(noLRT)) stop("No success using lavTestLRT() on any allocations.")
  ## anova() results
  CHIs <- sapply(fit01@funList[!noLRT], "[", i = 2, j = "Chisq diff")
  pVals <- sapply(fit01@funList[!noLRT], "[", i = 2, j = "Pr(>Chisq)")
  LRT <- c(`Avg LRT` = mean(CHIs), df = abs(DF0 - DF1), SD = sd(CHIs),
           Min = min(CHIs), Max = max(CHIs), Range = max(CHIs) - min(CHIs),
           `% Sig` = mean(pVals < alpha))
  class(LRT) <- c("lavaan.vector","numeric")
indices <- fit.measures[!grepl(pattern = "chisq|df|pvalue", fit.measures)]
Fit0 <- do.call(cbind, out0$Model@funList[conv01])[indices, ]
Fit1 <- do.call(cbind, out1$Model@funList[conv01])[indices, ]
## higher values for model0
Fit01 <- Fit0 - Fit1
higher0 <- Fit0 > Fit1
perc0 <- rowMeans(higher0)
avg0 <- rowMeans(Fit01 * higher0)
## higher values for model1
Fit10 <- Fit1 - Fit0
higher1 <- Fit1 > Fit0
perc1 <- rowMeans(higher1)
avg1 <- rowMeans(Fit10 * higher1)
fitDiff <- data.frame(Prop0 = perc0, Avg0 = avg0, Prop1 = perc1, Avg1 = avg1)
class(fitDiff) <- c("lavaan.data.frame","data.frame")
attr(fitDiff, "header") <- paste("Note: Higher values of goodness-of-fit",
"indices (e.g., CFI) favor that model, but",
"higher values of badness-of-fit indices",
"(e.g., RMSEA) indicate the competing model",
"is favored.\n\n'Prop0' indicates the",
"proportion of allocations for which each",
"index was higher for model0 (likewise,",
"'Prop1' indicates this for model1).\n",
"\nAmong those allocations, 'Avg0' or 'Avg1'",
"indicates the average amount by which the",
"index was higher for that model.")
  ## favored by BIC
  favorBIC <- NULL
  if (any(grepl(pattern = "bic", fit.measures))) {
    if ("bic" %in% fit.measures) {
      highBIC <- abs(Fit01["bic",]) >= bic.crit
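      ## lower BIC indicates better fit, so model0 is favored when model1's
      ## BIC is higher (higher1) and the difference exceeds bic.crit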
favor0 <- mean(higher1["bic",] & highBIC)
favor1 <- mean(higher0["bic",] & highBIC)
favorBIC <- data.frame("bic" = c(favor0, favor1),
row.names = paste("Evidence Favoring Model", 0:1))
}
if ("bic2" %in% fit.measures) {
highBIC <- abs(Fit01["bic2",]) >= bic.crit
favor0 <- mean(higher1["bic2",] & highBIC)
favor1 <- mean(higher0["bic2",] & highBIC)
favorBIC2 <- data.frame("bic2" = c(favor0, favor1),
row.names = paste("Evidence Favoring Model", 0:1))
if (is.null(favorBIC)) {
favorBIC <- favorBIC2
} else favorBIC <- cbind(favorBIC, favorBIC2)
}
class(favorBIC) <- c("lavaan.data.frame","data.frame")
attr(favorBIC, "header") <- paste("Percent of Allocations with |BIC Diff| >",
bic.crit)
}
  ## return results
  list(Model0_Results = out0[c("Estimates","SE","Fit")],
       Model1_Results = out1[c("Estimates","SE","Fit")],
       Model0.v.Model1 = list(LRT_Summary = LRT,
                              Fit_Index_Differences = fitDiff,
                              Favored_by_BIC = favorBIC,
                              Convergence_Summary = conv))
}
## ------------
## old function
## ------------
## @param nPerPar A list in which each element is a vector, corresponding to
## each factor, indicating sizes of parcels. If variables are left out of
## parceling, they should not be accounted for here (i.e., there should not be
## parcels of size "1").
## @param facPlc A list of vectors, each corresponding to a factor, specifying
## the item indicators of that factor (whether included in parceling or not).
## Either variable names or column numbers. Variables not listed will not be
## modeled or included in output datasets.
## @param nAlloc The number of random allocations of items to parcels to
## generate.
## @param syntaxA lavaan syntax for Model A. Note that, for likelihood ratio
## test (LRT) results to be interpretable, Model A should be nested within Model
## B (though the function will still provide results when Models A and B are
## nonnested).
## @param syntaxB lavaan syntax for Model B. Note that, for likelihood ratio
## test (LRT) results to be appropriate, Model A should be nested within Model
## B (though the function will still provide results when Models A and B are
## nonnested).
## @param dataset Item-level dataset
## @param parceloutput folder to which parceled data sets will be written (note
## for Windows users: the file path must be specified using forward slashes).
## @param names (Optional) A character vector containing the names of parceled
## variables.
## @param leaveout (Optional) A vector of variables to be left out of
## randomized parceling. Either variable names or column numbers are allowed.
## @examples
##
## \dontrun{
## ## lavaan syntax for Model A: a 2-factor CFA model with
## ## uncorrelated factors, to be fit to parceled data
##
## parmodelA <- '
## f1 =~ NA*p1f1 + p2f1 + p3f1
## f2 =~ NA*p1f2 + p2f2 + p3f2
## p1f1 ~ 1
## p2f1 ~ 1
## p3f1 ~ 1
## p1f2 ~ 1
## p2f2 ~ 1
## p3f2 ~ 1
## p1f1 ~~ p1f1
## p2f1 ~~ p2f1
## p3f1 ~~ p3f1
## p1f2 ~~ p1f2
## p2f2 ~~ p2f2
## p3f2 ~~ p3f2
## f1 ~~ 1*f1
## f2 ~~ 1*f2
## f1 ~~ 0*f2
## '
##
## ## lavaan syntax for Model B: a 2-factor CFA model with
## ## correlated factors, to be fit to parceled data
##
## parmodelB <- '
## f1 =~ NA*p1f1 + p2f1 + p3f1
## f2 =~ NA*p1f2 + p2f2 + p3f2
## p1f1 ~ 1
## p2f1 ~ 1
## p3f1 ~ 1
## p1f2 ~ 1
## p2f2 ~ 1
## p3f2 ~ 1
## p1f1 ~~ p1f1
## p2f1 ~~ p2f1
## p3f1 ~~ p3f1
## p1f2 ~~ p1f2
## p2f2 ~~ p2f2
## p3f2 ~~ p3f2
## f1 ~~ 1*f1
## f2 ~~ 1*f2
## f1 ~~ f2
## '
##
## ## specify items for each factor
## f1name <- colnames(simParcel)[1:9]
## f2name <- colnames(simParcel)[10:18]
##
## ## run function
## PAVranking(nPerPar = list(c(3,3,3), c(3,3,3)), facPlc = list(f1name,f2name),
## nAlloc = 100, parceloutput = 0, leaveout = 0,
## syntaxA = parmodelA, syntaxB = parmodelB, dataset = simParcel,
## names = list("p1f1","p2f1","p3f1","p1f2","p2f2","p3f2"))
## }
##
# PAVranking <- function(nPerPar, facPlc, nAlloc = 100, parceloutput = 0, syntaxA, syntaxB,
# dataset, names = NULL, leaveout = 0, seed = NA, ...) {
# if (is.null(names))
# names <- matrix(NA, length(nPerPar), 1)
# ## set random seed if specified
# if (is.na(seed) == FALSE)
# set.seed(seed)
# ## allow many tables to be outputted
# options(max.print = 1e+06)
#
# ## Create parceled datasets
# if (is.character(dataset)) dataset <- utils::read.csv(dataset)
# dataset <- as.matrix(dataset)
#
# if (nAlloc < 2)
# stop("Minimum of two allocations required.")
#
# if (is.list(facPlc)) {
# if (is.numeric(facPlc[[1]][1]) == FALSE) {
# facPlcb <- facPlc
# Namesv <- colnames(dataset)
#
# for (i in 1:length(facPlc)) {
# for (j in 1:length(facPlc[[i]])) {
# facPlcb[[i]][j] <- match(facPlc[[i]][j], Namesv)
# }
# facPlcb[[i]] <- as.numeric(facPlcb[[i]])
# }
# facPlc <- facPlcb
# }
#
# # facPlc2 <- rep(0, sum(sapply(facPlc, length)))
# facPlc2 <- rep(0, ncol(dataset))
#
# for (i in 1:length(facPlc)) {
# for (j in 1:length(facPlc[[i]])) {
# facPlc2[facPlc[[i]][j]] <- i
# }
# }
# facPlc <- facPlc2
# }
#
# if (leaveout != 0) {
# if (is.numeric(leaveout) == FALSE) {
# leaveoutb <- rep(0, length(leaveout))
# Namesv <- colnames(dataset)
#
# for (i in 1:length(leaveout)) {
# leaveoutb[i] <- match(leaveout[i], Namesv)
# }
# leaveout <- as.numeric(leaveoutb)
# }
# k1 <- 0.001
# for (i in 1:length(leaveout)) {
# facPlc[leaveout[i]] <- facPlc[leaveout[i]] + k1
# k1 <- k1 + 0.001
# }
# }
#
# if (0 %in% facPlc == TRUE) {
# Zfreq <- sum(facPlc == 0)
# for (i in 1:Zfreq) {
# Zplc <- match(0, facPlc)
# dataset <- dataset[, -Zplc]
# facPlc <- facPlc[-Zplc]
# }
# ## this allows for unused variables in dataset, which are specified by zeros, and
# ## deleted
# }
#
# if (is.list(nPerPar)) {
# nPerPar2 <- c()
# for (i in 1:length(nPerPar)) {
# Onesp <- sum(facPlc > i & facPlc < i + 1)
# nPerPar2 <- c(nPerPar2, nPerPar[i], rep(1, Onesp), recursive = TRUE)
# }
# nPerPar <- nPerPar2
# }
#
# Npp <- c()
# for (i in 1:length(nPerPar)) {
# Npp <- c(Npp, rep(i, nPerPar[i]))
# }
#
# Locate <- sort(round(facPlc))
# Maxv <- max(Locate) - 1
#
# if (length(Locate) != length(Npp))
# stop("Parcels incorrectly specified.\",
# \" Check input!")
#
# if (Maxv > 0) {
# ## Bug was here. With 1 factor Maxv=0. Skip this with a single factor
# for (i in 1:Maxv) {
# Mat <- match(i + 1, Locate)
# if (Npp[Mat] == Npp[Mat - 1])
# stop("Parcels incorrectly specified.\",
# \" Check input!")
# }
# }
# ## warning message if parcel crosses into multiple factors vector, parcel to which
# ## each variable belongs vector, factor to which each variables belongs if
# ## variables are in the same parcel, but different factors error message given in
# ## output
#
# Onevec <- facPlc - round(facPlc)
# NleaveA <- length(Onevec) - sum(Onevec == 0)
# NleaveP <- sum(nPerPar == 1)
#
# if (NleaveA < NleaveP)
# warning("Single-variable parcels have been requested.\",
# \" Check input!")
#
# if (NleaveA > NleaveP)
# warning("More non-parceled variables have been", " requested than provided for in parcel",
# " vector. Check input!")
#
# if (length(names) > 1) {
# if (length(names) != length(nPerPar))
# warning("Number of parcel names provided not equal to number", " of parcels requested")
# }
#
# Data <- c(1:ncol(dataset))
# ## creates a vector of the number of indicators e.g. for three indicators, c(1, 2,
# ## 3)
# Nfactors <- max(facPlc)
# ## scalar, number of factors
# Nindicators <- length(Data)
# ## scalar, number of indicators
# Npar <- length(nPerPar)
# ## scalar, number of parcels
# Rmize <- runif(Nindicators, 1, Nindicators)
# ## create vector of randomly ordered numbers, length of number of indicators
#
# Data <- rbind(facPlc, Rmize, Data)
# ## 'Data' becomes object of three rows, consisting of 1) factor to which each
# ## indicator belongs (in order to preserve indicator/factor assignment during
# ## randomization) 2) randomly order numbers 3) indicator number
#
# Results <- matrix(numeric(0), nAlloc, Nindicators)
# ## create empty matrix for parcel allocation matrix
#
# Pin <- nPerPar[1]
# for (i in 2:length(nPerPar)) {
# Pin <- c(Pin, nPerPar[i] + Pin[i - 1])
# ## creates vector which indicates the range of columns (endpoints) in each parcel
# }
#
# for (i in 1:nAlloc) {
# Data[2, ] <- runif(Nindicators, 1, Nindicators)
# ## Replace second row with newly randomly ordered numbers
#
# Data <- Data[, order(Data[2, ])]
# ## Order the columns according to the values of the second row
#
# Data <- Data[, order(Data[1, ])]
# ## Order the columns according to the values of the first row in order to preserve
# ## factor assignment
#
# Results[i, ] <- Data[3, ]
# ## assign result to allocation matrix
# }
#
# Alpha <- rbind(Results[1, ], dataset)
# ## bind first random allocation to dataset 'Alpha'
#
# Allocations <- list()
# ## create empty list for allocation data matrices
#
# for (i in 1:nAlloc) {
# Ineff <- rep(NA, ncol(Results))
# Ineff2 <- c(1:ncol(Results))
# for (inefficient in 1:ncol(Results)) {
# Ineff[Results[i, inefficient]] <- Ineff2[inefficient]
# }
#
# Alpha[1, ] <- Ineff
# ## replace first row of dataset matrix with row 'i' from allocation matrix
#
# Beta <- Alpha[, order(Alpha[1, ])]
# ## arrange dataset columns by values of first row; assign to temporary
# ## matrix 'Beta'
#
# Temp <- matrix(NA, nrow(dataset), Npar)
# ## create empty matrix for averaged parcel variables
#
# TempAA <- if (length(1:Pin[1]) > 1)
# Beta[2:nrow(Beta), 1:Pin[1]] else cbind(Beta[2:nrow(Beta), 1:Pin[1]], Beta[2:nrow(Beta), 1:Pin[1]])
# Temp[, 1] <- rowMeans(TempAA, na.rm = TRUE)
# ## fill first column with averages from assigned indicators
# for (al in 2:Npar) {
# Plc <- Pin[al - 1] + 1
# ## placeholder variable for determining parcel width
# TempBB <- if (length(Plc:Pin[al]) > 1)
# Beta[2:nrow(Beta), Plc:Pin[al]] else cbind(Beta[2:nrow(Beta), Plc:Pin[al]], Beta[2:nrow(Beta), Plc:Pin[al]])
# Temp[, al] <- rowMeans(TempBB, na.rm = TRUE)
# ## fill remaining columns with averages from assigned indicators
# }
# if (length(names) > 1)
# colnames(Temp) <- names
# Allocations[[i]] <- Temp
# ## assign result to list of parcel datasets
# }
#
# ## Write parceled datasets
# if (as.vector(regexpr("/", parceloutput)) != -1) {
# replist <- matrix(NA, nAlloc, 1)
# for (i in 1:nAlloc) {
# ## if (is.na(names)==TRUE) names <- matrix(NA,nrow(
# colnames(Allocations[[i]]) <- names
# utils::write.table(Allocations[[i]], paste(parceloutput, "/parcelruns", i,
# ".dat", sep = ""),
# row.names = FALSE, col.names = TRUE)
# replist[i, 1] <- paste("parcelruns", i, ".dat", sep = "")
# }
# utils::write.table(replist, paste(parceloutput, "/parcelrunsreplist.dat",
# sep = ""),
# quote = FALSE, row.names = FALSE, col.names = FALSE)
# }
#
#
# ## Model A estimation
#
# {
# Param_A <- list()
# ## list for parameter estimated for each imputation
# Fitind_A <- list()
# ## list for fit indices estimated for each imputation
# Converged_A <- list()
# ## list for whether or not each allocation converged
# ProperSolution_A <- list()
# ## list for whether or not each allocation has proper solutions
# ConvergedProper_A <- list()
# ## list for whether or not each allocation converged and has proper solutions
#
# for (i in 1:nAlloc) {
# data_A <- as.data.frame(Allocations[[i]], row.names = NULL, optional = FALSE)
# ## convert allocation matrix to dataframe for model estimation
# fit_A <- lavaan::sem(syntaxA, data = data_A, ...)
# ## estimate model in lavaan
# if (lavInspect(fit_A, "converged") == TRUE) {
# Converged_A[[i]] <- 1
# } else Converged_A[[i]] <- 0
# ## determine whether or not each allocation converged
# Param_A[[i]] <- lavaan::parameterEstimates(fit_A)[, c("lhs", "op", "rhs",
# "est", "se", "z", "pvalue", "ci.lower", "ci.upper")]
# ## assign allocation parameter estimates to list
# if (lavInspect(fit_A, "post.check") == TRUE & Converged_A[[i]] == 1) {
# ProperSolution_A[[i]] <- 1
# } else ProperSolution_A[[i]] <- 0
# ## determine whether or not each allocation has proper solutions
# if (any(is.na(Param_A[[i]][, 5] == TRUE)))
# ProperSolution_A[[i]] <- 0
# ## make sure each allocation has existing SE
# if (Converged_A[[i]] == 1 & ProperSolution_A[[i]] == 1) {
# ConvergedProper_A[[i]] <- 1
# } else ConvergedProper_A[[i]] <- 0
# ## determine whether or not each allocation converged and has proper solutions
#
# if (ConvergedProper_A[[i]] == 0)
# Param_A[[i]][, 4:9] <- matrix(data = NA, nrow(Param_A[[i]]), 6)
# ## make parameter estimates null for nonconverged, improper solutions
#
# if (ConvergedProper_A[[i]] == 1) {
# Fitind_A[[i]] <- lavaan::fitMeasures(fit_A, c("chisq", "df", "cfi",
# "tli", "rmsea", "srmr", "logl", "bic", "aic"))
# } else Fitind_A[[i]] <- c(NA, NA, NA, NA, NA, NA, NA, NA, NA)
# ### assign allocation parameter estimates to list
#
# }
#
#
# nConverged_A <- Reduce("+", Converged_A)
# ## count number of converged allocations
#
# nProperSolution_A <- Reduce("+", ProperSolution_A)
# ## count number of allocations with proper solutions
#
# nConvergedProper_A <- Reduce("+", ConvergedProper_A)
# ## count number of allocations that converged and have proper solutions
#
# if (nConvergedProper_A == 0)
# stop("All allocations failed to converge and/or yielded improper solutions for Model A and/or B.")
# ## stop program if no allocations converge
#
# Parmn_A <- Param_A[[1]]
# ## assign first parameter estimates to mean dataframe
#
# ParSE_A <- matrix(NA, nrow(Parmn_A), nAlloc)
# ParSEmn_A <- Parmn_A[, 5]
#
# Parsd_A <- matrix(NA, nrow(Parmn_A), nAlloc)
# ## assign parameter estimates for S.D. calculation
#
# Fitmn_A <- Fitind_A[[1]]
# ## assign first fit indices to mean dataframe
#
# Fitsd_A <- matrix(NA, length(Fitmn_A), nAlloc)
# ## assign fit indices for S.D. calculation
#
# Sigp_A <- matrix(NA, nrow(Parmn_A), nAlloc)
# ## assign p-values to calculate percentage significant
#
# Fitind_A <- data.frame(Fitind_A)
# ### convert fit index table to data frame
#
# for (i in 1:nAlloc) {
#
# Parsd_A[, i] <- Param_A[[i]][, 4]
# ## assign parameter estimates for S.D. estimation
#
# ParSE_A[, i] <- Param_A[[i]][, 5]
#
# if (i > 1) {
# ParSEmn_A <- rowSums(cbind(ParSEmn_A, Param_A[[i]][, 5]), na.rm = TRUE)
# }
#
# Sigp_A[, ncol(Sigp_A) - i + 1] <- Param_A[[i]][, 7]
# ## assign p-values to calculate percentage significant
#
# Fitsd_A[, i] <- Fitind_A[[i]]
# ## assign fit indices for S.D. estimation
#
# if (i > 1) {
# Parmn_A[, 4:ncol(Parmn_A)] <- rowSums(cbind(Parmn_A[, 4:ncol(Parmn_A)],
# Param_A[[i]][, 4:ncol(Parmn_A)]), na.rm = TRUE)
# }
# ## add together all parameter estimates
#
# if (i > 1)
# Fitmn_A <- rowSums(cbind(Fitmn_A, Fitind_A[[i]]), na.rm = TRUE)
# ## add together all fit indices
#
# }
#
#
# Sigp_A <- Sigp_A + 0.45
# Sigp_A <- apply(Sigp_A, c(1, 2), round)
# Sigp_A <- 1 - as.vector(rowMeans(Sigp_A, na.rm = TRUE))
# ## calculate percentage significant parameters
#
# Parsum_A <- cbind(apply(Parsd_A, 1, mean, na.rm = TRUE),
# apply(Parsd_A, 1, sd, na.rm = TRUE),
# apply(Parsd_A, 1, max, na.rm = TRUE),
# apply(Parsd_A, 1, min, na.rm = TRUE),
# apply(Parsd_A, 1, max, na.rm = TRUE) - apply(Parsd_A, 1, min, na.rm = TRUE), Sigp_A * 100)
# colnames(Parsum_A) <- c("Avg Est.", "S.D.", "MAX", "MIN", "Range", "% Sig")
# ## calculate parameter S.D., minimum, maximum, range, bind to percentage
# ## significant
#
# ParSEmn_A <- Parmn_A[, 1:3]
# ParSEfn_A <- cbind(ParSEmn_A, apply(ParSE_A, 1, mean, na.rm = TRUE),
# apply(ParSE_A, 1, sd, na.rm = TRUE),
# apply(ParSE_A, 1, max, na.rm = TRUE),
# apply(ParSE_A, 1, min, na.rm = TRUE),
# apply(ParSE_A, 1, max, na.rm = TRUE) - apply(ParSE_A, 1, min, na.rm = TRUE))
# colnames(ParSEfn_A) <- c("lhs", "op", "rhs", "Avg SE", "S.D.",
# "MAX", "MIN", "Range")
#
# Fitsum_A <- cbind(apply(Fitsd_A, 1, mean, na.rm = TRUE),
# apply(Fitsd_A, 1, sd, na.rm = TRUE),
# apply(Fitsd_A, 1, max, na.rm = TRUE),
# apply(Fitsd_A, 1, min, na.rm = TRUE),
# apply(Fitsd_A, 1, max, na.rm = TRUE) - apply(Fitsd_A, 1, min, na.rm = TRUE))
# rownames(Fitsum_A) <- c("chisq", "df", "cfi", "tli", "rmsea", "srmr", "logl",
# "bic", "aic")
# ## calculate fit S.D., minimum, maximum, range
#
# Parmn_A[, 4:ncol(Parmn_A)] <- Parmn_A[, 4:ncol(Parmn_A)]/nConvergedProper_A
# ## divide totalled parameter estimates by number converged allocations
# Parmn_A <- Parmn_A[, 1:3]
# ## remove confidence intervals from output
# Parmn_A <- cbind(Parmn_A, Parsum_A)
# ## bind parameter average estimates to cross-allocation information
# Fitmn_A <- Fitmn_A/nConvergedProper_A
# ## divide totalled fit indices by number converged allocations
#
# pChisq_A <- list()
# ## create empty list for Chi-square p-values
# sigChisq_A <- list()
# ## create empty list for Chi-square significance
#
# for (i in 1:nAlloc) {
# pChisq_A[[i]] <- (1 - pchisq(Fitsd_A[1, i], Fitsd_A[2, i]))
# ## calculate p-value for each Chi-square
# if (is.na(pChisq_A[[i]]) == FALSE & pChisq_A[[i]] < 0.05) {
# sigChisq_A[[i]] <- 1
# } else sigChisq_A[[i]] <- 0
# }
# ## count number of allocations with significant chi-square
#
# PerSigChisq_A <- (Reduce("+", sigChisq_A))/nConvergedProper_A * 100
# PerSigChisq_A <- round(PerSigChisq_A, 3)
# ## calculate percent of allocations with significant chi-square
#
# PerSigChisqCol_A <- c(PerSigChisq_A, "n/a", "n/a", "n/a", "n/a", "n/a", "n/a",
# "n/a", "n/a")
# ## create list of Chi-square Percent Significant and 'n/a' (used for fit summary
# ## table)
#
# options(stringsAsFactors = FALSE)
# ## set default option to allow strings into dataframe without converting to factors
#
# Fitsum_A <- data.frame(Fitsum_A, PerSigChisqCol_A)
# colnames(Fitsum_A) <- c("Avg Ind", "S.D.", "MAX", "MIN", "Range", "% Sig")
# ### bind to fit averages
#
# options(stringsAsFactors = TRUE)
# ## unset option to allow strings into dataframe without converting to factors
#
# ParSEfn_A[, 4:8] <- apply(ParSEfn_A[, 4:8], 2, round, digits = 3)
# Parmn_A[, 4:9] <- apply(Parmn_A[, 4:9], 2, round, digits = 3)
# Fitsum_A[, 1:5] <- apply(Fitsum_A[, 1:5], 2, round, digits = 3)
# ## round output to three digits
#
# Fitsum_A[2, 2:5] <- c("n/a", "n/a", "n/a", "n/a")
# ## Change df row to 'n/a' for sd, max, min, and range
#
# Output_A <- list(Parmn_A, ParSEfn_A, Fitsum_A)
# names(Output_A) <- c("Estimates_A", "SE_A", "Fit_A")
# ## output summary for model A
#
# }
#
# ## Model B estimation
#
# {
# Param <- list()
# ## list for parameter estimated for each imputation
# Fitind <- list()
# ## list for fit indices estimated for each imputation
# Converged <- list()
# ## list for whether or not each allocation converged
# ProperSolution <- list()
# ## list for whether or not each allocation has proper solutions
# ConvergedProper <- list()
# ## list for whether or not each allocation is converged and proper
#
# for (i in 1:nAlloc) {
# data <- as.data.frame(Allocations[[i]], row.names = NULL, optional = FALSE)
# ## convert allocation matrix to dataframe for model estimation
# fit <- lavaan::sem(syntaxB, data = data, ...)
# ## estimate model in lavaan
# if (lavInspect(fit, "converged") == TRUE) {
# Converged[[i]] <- 1
# } else Converged[[i]] <- 0
# ## determine whether or not each allocation converged
# Param[[i]] <- lavaan::parameterEstimates(fit)[, c("lhs", "op", "rhs",
# "est", "se", "z", "pvalue", "ci.lower", "ci.upper")]
# ## assign allocation parameter estimates to list
# if (lavInspect(fit, "post.check") == TRUE & Converged[[i]] == 1) {
# ProperSolution[[i]] <- 1
# } else ProperSolution[[i]] <- 0
# ## determine whether or not each allocation has proper solutions
# if (any(is.na(Param[[i]][, 5] == TRUE)))
# ProperSolution[[i]] <- 0
# ## make sure each allocation has existing SE
# if (Converged[[i]] == 1 & ProperSolution[[i]] == 1) {
# ConvergedProper[[i]] <- 1
# } else ConvergedProper[[i]] <- 0
# ## determine whether or not each allocation converged and has proper solutions
#
# if (ConvergedProper[[i]] == 0)
# Param[[i]] <- matrix(data = NA, nrow(Param[[i]]), ncol(Param[[i]]))
# ## make parameter estimates null for nonconverged, improper solutions
#
# if (ConvergedProper[[i]] == 1) {
# Fitind[[i]] <- lavaan::fitMeasures(fit, c("chisq", "df", "cfi", "tli",
# "rmsea", "srmr", "logl", "bic", "aic"))
# } else Fitind[[i]] <- c(NA, NA, NA, NA, NA, NA, NA, NA, NA)
# ### assign allocation parameter estimates to list
#
#
# }
#
#
#
#
# nConverged <- Reduce("+", Converged)
# ## count number of converged allocations
#
# nProperSolution <- Reduce("+", ProperSolution)
# ## count number of allocations with proper solutions
#
# nConvergedProper <- Reduce("+", ConvergedProper)
# ## count number of allocations that converged and have proper solutions
#
# if (nConvergedProper == 0)
# stop("All allocations failed to converge", " and/or yielded improper solutions for",
# " Model A and/or B.")
# ## stop program if no allocations converge
#
# Parmn <- Param[[1]]
# ## assign first parameter estimates to mean dataframe
#
# ParSE <- matrix(NA, nrow(Parmn), nAlloc)
# ParSEmn <- Parmn[, 5]
#
# Parsd <- matrix(NA, nrow(Parmn), nAlloc)
# ## assign parameter estimates for S.D. calculation
#
# Fitmn <- Fitind[[1]]
# ## assign first fit indices to mean dataframe
#
# Fitsd <- matrix(NA, length(Fitmn), nAlloc)
# ## assign fit indices for S.D. calculation
#
# Sigp <- matrix(NA, nrow(Parmn), nAlloc)
# ## assign p-values to calculate percentage significant
#
# Fitind <- data.frame(Fitind)
# ### convert fit index table to dataframe
#
#
# for (i in 1:nAlloc) {
#
# Parsd[, i] <- Param[[i]][, 4]
# ## assign parameter estimates for S.D. estimation
#
# ParSE[, i] <- Param[[i]][, 5]
#
# if (i > 1)
# ParSEmn <- rowSums(cbind(ParSEmn, Param[[i]][, 5]), na.rm = TRUE)
#
# Sigp[, ncol(Sigp) - i + 1] <- Param[[i]][, 7]
# ## assign p-values to calculate percentage significant
#
#
# Fitsd[, i] <- Fitind[[i]]
# ## assign fit indices for S.D. estimation
#
# if (i > 1) {
# Parmn[, 4:ncol(Parmn)] <- rowSums(cbind(Parmn[, 4:ncol(Parmn)], Param[[i]][,
# 4:ncol(Parmn)]), na.rm = TRUE)
# }
# ## add together all parameter estimates
#
# if (i > 1)
# Fitmn <- rowSums(cbind(Fitmn, Fitind[[i]]), na.rm = TRUE)
# ## add together all fit indices
#
# }
#
#
# Sigp <- Sigp + 0.45
# Sigp <- apply(Sigp, c(1, 2), round)
# Sigp <- 1 - as.vector(rowMeans(Sigp, na.rm = TRUE))
# ## calculate percentage significant parameters
#
# Parsum <- cbind(apply(Parsd, 1, mean, na.rm = TRUE), apply(Parsd, 1, sd, na.rm = TRUE),
# apply(Parsd, 1, max, na.rm = TRUE), apply(Parsd, 1, min, na.rm = TRUE),
# apply(Parsd, 1, max, na.rm = TRUE) - apply(Parsd, 1, min, na.rm = TRUE),
# Sigp * 100)
# colnames(Parsum) <- c("Avg Est", "S.D.", "MAX", "MIN", "Range", "% Sig")
# ## calculate parameter S.D., minimum, maximum, range, bind to percentage
# ## significant
#
# ParSEmn <- Parmn[, 1:3]
# ParSEfn <- cbind(ParSEmn, apply(ParSE, 1, mean, na.rm = TRUE), apply(ParSE,
# 1, sd, na.rm = TRUE), apply(ParSE, 1, max, na.rm = TRUE), apply(ParSE,
# 1, min, na.rm = TRUE), apply(ParSE, 1, max, na.rm = TRUE) - apply(ParSE,
# 1, min, na.rm = TRUE))
# colnames(ParSEfn) <- c("lhs", "op", "rhs", "Avg SE", "S.D.", "MAX", "MIN",
# "Range")
#
# Fitsum <- cbind(apply(Fitsd, 1, mean, na.rm = TRUE), apply(Fitsd, 1, sd, na.rm = TRUE),
# apply(Fitsd, 1, max, na.rm = TRUE), apply(Fitsd, 1, min, na.rm = TRUE),
# apply(Fitsd, 1, max, na.rm = TRUE) - apply(Fitsd, 1, min, na.rm = TRUE))
# rownames(Fitsum) <- c("chisq", "df", "cfi", "tli", "rmsea", "srmr", "logl",
# "bic", "aic")
# ## calculate fit S.D., minimum, maximum, range
#
# Parmn[, 4:ncol(Parmn)] <- Parmn[, 4:ncol(Parmn)]/nConvergedProper
# ## divide totalled parameter estimates by number converged allocations
# Parmn <- Parmn[, 1:3]
# ## remove confidence intervals from output
# Parmn <- cbind(Parmn, Parsum)
# ## bind parameter average estimates to cross-allocation information
# Fitmn <- as.numeric(Fitmn)
# ## make fit index values numeric
# Fitmn <- Fitmn/nConvergedProper
# ## divide totalled fit indices by number converged allocations
#
# pChisq <- list()
# ## create empty list for Chi-square p-values
# sigChisq <- list()
# ## create empty list for Chi-square significance
#
# for (i in 1:nAlloc) {
#
# pChisq[[i]] <- (1 - pchisq(Fitsd[1, i], Fitsd[2, i]))
# ## calculate p-value for each Chi-square
#
# if (is.na(pChisq[[i]]) == FALSE & pChisq[[i]] < 0.05) {
# sigChisq[[i]] <- 1
# } else sigChisq[[i]] <- 0
# }
# ## count number of allocations with significant chi-square
#
# PerSigChisq <- (Reduce("+", sigChisq))/nConvergedProper * 100
# PerSigChisq <- round(PerSigChisq, 3)
# ## calculate percent of allocations with significant chi-square
#
# PerSigChisqCol <- c(PerSigChisq, "n/a", "n/a", "n/a", "n/a", "n/a", "n/a",
# "n/a", "n/a")
# ## create list of Chi-square Percent Significant and 'n/a' (used for fit summary
# ## table)
#
# options(stringsAsFactors = FALSE)
# ## set default option to allow strings into dataframe without converting to factors
#
# Fitsum <- data.frame(Fitsum, PerSigChisqCol)
# colnames(Fitsum) <- c("Avg Ind", "S.D.", "MAX", "MIN", "Range", "% Sig")
# ### bind to fit averages
#
# options(stringsAsFactors = TRUE)
# ## unset option to allow strings into dataframe without converting to factors
#
# ParSEfn[, 4:8] <- apply(ParSEfn[, 4:8], 2, round, digits = 3)
# Parmn[, 4:9] <- apply(Parmn[, 4:9], 2, round, digits = 3)
# Fitsum[, 1:5] <- apply(Fitsum[, 1:5], 2, round, digits = 3)
# ## round output to three digits
#
# Fitsum[2, 2:5] <- c("n/a", "n/a", "n/a", "n/a")
# ## Change df row to 'n/a' for sd, max, min, and range
#
# Output_B <- list(Parmn, ParSEfn, Fitsum)
# names(Output_B) <- c("Estimates_B", "SE_B", "Fit_B")
# ## output summary for model B
#
# }
#
# ## Model Comparison (everything in this section is new)
#
# {
# Converged_AB <- list()
# ## create list of convergence comparison for each allocation
# ProperSolution_AB <- list()
# ## create list of proper solution comparison for each allocation
# ConvergedProper_AB <- list()
# ## create list of convergence and proper solution comparison for each allocation
# lrtest_AB <- list()
# ## create list for likelihood ratio test for each allocation
# lrchisq_AB <- list()
# ## create list for likelihood ratio chi square value
# lrchisqp_AB <- list()
# ## create list for likelihood ratio test p-value
# lrsig_AB <- list()
# ## create list for likelihood ratio test significance
#
# for (i in 1:nAlloc) {
# if (Converged_A[[i]] == 1 & Converged[[i]] == 1) {
# Converged_AB[[i]] <- 1
# } else Converged_AB[[i]] <- 0
# ## compare convergence
#
# if (ProperSolution_A[[i]] == 1 & ProperSolution[[i]] == 1) {
# ProperSolution_AB[[i]] <- 1
# } else ProperSolution_AB[[i]] <- 0
# ## compare existence of proper solutions
#
# if (ConvergedProper_A[[i]] == 1 & ConvergedProper[[i]] == 1) {
# ConvergedProper_AB[[i]] <- 1
# } else ConvergedProper_AB[[i]] <- 0
# ## compare existence of proper solutions and convergence
#
#
#
# if (ConvergedProper_AB[[i]] == 1) {
#
# data <- as.data.frame(Allocations[[i]], row.names = NULL, optional = FALSE)
# ## convert allocation matrix to dataframe for model estimation
# fit_A <- lavaan::sem(syntaxA, data = data, ...)
# ## estimate model A in lavaan
# fit <- lavaan::sem(syntaxB, data = data, ...)
# ## estimate model B in lavaan
# lrtest_AB[[i]] <- lavaan::lavTestLRT(fit_A, fit)
# ## likelihood ratio test comparing A and B
# lrtestd_AB <- as.data.frame(lrtest_AB[[i]], row.names = NULL, optional = FALSE)
# ## convert lrtest results to dataframe
# lrchisq_AB[[i]] <- lrtestd_AB[2, 5]
# ## write lrtest chisq as single numeric variable
# lrchisqp_AB[[i]] <- lrtestd_AB[2, 7]
# ## write lrtest p-value as single numeric variable
# if (lrchisqp_AB[[i]] < 0.05) {
# lrsig_AB[[i]] <- 1
# } else {
# lrsig_AB[[i]] <- 0
# }
# ## determine statistical significance of lrtest
#
# }
# }
#
# lrchisqp_AB <- unlist(lrchisqp_AB, recursive = TRUE, use.names = TRUE)
# ## convert lrchisqp_AB from list to vector
# lrchisqp_AB <- as.numeric(lrchisqp_AB)
# ## make lrchisqp_AB numeric
# lrsig_AB <- unlist(lrsig_AB, recursive = TRUE, use.names = TRUE)
# ## convert lrsig_AB from list to vector
# lrsig_AB <- as.numeric(lrsig_AB)
# ### make lrsig_AB numeric
#
#
# nConverged_AB <- Reduce("+", Converged_AB)
# ## count number of allocations that converged for both A and B
# nProperSolution_AB <- Reduce("+", ProperSolution_AB)
# ## count number of allocations with proper solutions for both A and B
# nConvergedProper_AB <- Reduce("+", ConvergedProper_AB)
# ## count number of allocations that converged and have proper solutions for both A
# ## and B
# ProConverged_AB <- (nConverged_AB/nAlloc) * 100
# ## calc proportion of allocations that converged for both A and B
# nlrsig_AB <- Reduce("+", lrsig_AB)
# ## count number of allocations with significant lrtest between A and B
# Prolrsig_AB <- (nlrsig_AB/nConvergedProper_AB) * 100
# ## calc proportion of allocations with significant lrtest between A and B
# lrchisq_AB <- unlist(lrchisq_AB, recursive = TRUE, use.names = TRUE)
# ### convert lrchisq_AB from list to vector
# lrchisq_AB <- as.numeric(lrchisq_AB)
# ### make lrchisq_AB numeric
# AvgLRT_AB <- (Reduce("+", lrchisq_AB))/nConvergedProper_AB
# ## calc average LRT
#
# LRTsum <- cbind(AvgLRT_AB, lrtestd_AB[2, 3], sd(lrchisq_AB, na.rm = TRUE),
# max(lrchisq_AB), min(lrchisq_AB),
# max(lrchisq_AB) - min(lrchisq_AB), Prolrsig_AB)
# colnames(LRTsum) <- c("Avg LRT", "df", "S.D.", "MAX", "MIN", "Range", "% Sig")
# ## calculate LRT distribution statistics
#
# FitDiff_AB <- Fitsd_A - Fitsd
# ## compute fit index difference matrix
#
# for (i in 1:nAlloc) {
# if (ConvergedProper_AB[[i]] != 1)
# FitDiff_AB[1:9, i] <- 0
# }
# ### make fit differences zero for each non-converged allocation
#
# BICDiff_AB <- list()
# AICDiff_AB <- list()
# RMSEADiff_AB <- list()
# CFIDiff_AB <- list()
# TLIDiff_AB <- list()
# SRMRDiff_AB <- list()
# BICDiffGT10_AB <- list()
# ## create list noting each allocation in which A is preferred over B
#
# BICDiff_BA <- list()
# AICDiff_BA <- list()
# RMSEADiff_BA <- list()
# CFIDiff_BA <- list()
# TLIDiff_BA <- list()
# SRMRDiff_BA <- list()
# BICDiffGT10_BA <- list()
# ## create list noting each allocation in which B is preferred over A
#
# for (i in 1:nAlloc) {
# if (FitDiff_AB[8, i] < 0) {
# BICDiff_AB[[i]] <- 1
# } else BICDiff_AB[[i]] <- 0
# if (FitDiff_AB[9, i] < 0) {
# AICDiff_AB[[i]] <- 1
# } else AICDiff_AB[[i]] <- 0
# if (FitDiff_AB[5, i] < 0) {
# RMSEADiff_AB[[i]] <- 1
# } else RMSEADiff_AB[[i]] <- 0
# if (FitDiff_AB[3, i] > 0) {
# CFIDiff_AB[[i]] <- 1
# } else CFIDiff_AB[[i]] <- 0
# if (FitDiff_AB[4, i] > 0) {
# TLIDiff_AB[[i]] <- 1
# } else TLIDiff_AB[[i]] <- 0
# if (FitDiff_AB[6, i] < 0) {
# SRMRDiff_AB[[i]] <- 1
# } else SRMRDiff_AB[[i]] <- 0
# if (FitDiff_AB[8, i] < (-10)) {
# BICDiffGT10_AB[[i]] <- 1
# } else BICDiffGT10_AB[[i]] <- 0
# }
# nBIC_AoverB <- Reduce("+", BICDiff_AB)
# nAIC_AoverB <- Reduce("+", AICDiff_AB)
# nRMSEA_AoverB <- Reduce("+", RMSEADiff_AB)
# nCFI_AoverB <- Reduce("+", CFIDiff_AB)
# nTLI_AoverB <- Reduce("+", TLIDiff_AB)
# nSRMR_AoverB <- Reduce("+", SRMRDiff_AB)
# nBICDiffGT10_AoverB <- Reduce("+", BICDiffGT10_AB)
# ## compute number of 'A preferred over B' for each fit index
#
# for (i in 1:nAlloc) {
# if (FitDiff_AB[8, i] > 0) {
# BICDiff_BA[[i]] <- 1
# } else BICDiff_BA[[i]] <- 0
# if (FitDiff_AB[9, i] > 0) {
# AICDiff_BA[[i]] <- 1
# } else AICDiff_BA[[i]] <- 0
# if (FitDiff_AB[5, i] > 0) {
# RMSEADiff_BA[[i]] <- 1
# } else RMSEADiff_BA[[i]] <- 0
# if (FitDiff_AB[3, i] < 0) {
# CFIDiff_BA[[i]] <- 1
# } else CFIDiff_BA[[i]] <- 0
# if (FitDiff_AB[4, i] < 0) {
# TLIDiff_BA[[i]] <- 1
# } else TLIDiff_BA[[i]] <- 0
# if (FitDiff_AB[6, i] > 0) {
# SRMRDiff_BA[[i]] <- 1
# } else SRMRDiff_BA[[i]] <- 0
# if (FitDiff_AB[8, i] > (10)) {
# BICDiffGT10_BA[[i]] <- 1
# } else BICDiffGT10_BA[[i]] <- 0
# }
# nBIC_BoverA <- Reduce("+", BICDiff_BA)
# nAIC_BoverA <- Reduce("+", AICDiff_BA)
# nRMSEA_BoverA <- Reduce("+", RMSEADiff_BA)
# nCFI_BoverA <- Reduce("+", CFIDiff_BA)
# nTLI_BoverA <- Reduce("+", TLIDiff_BA)
# nSRMR_BoverA <- Reduce("+", SRMRDiff_BA)
# nBICDiffGT10_BoverA <- Reduce("+", BICDiffGT10_BA)
# ## compute number of 'B preferred over A' for each fit index
#
# BICDiffAvgtemp <- list()
# AICDiffAvgtemp <- list()
# RMSEADiffAvgtemp <- list()
# CFIDiffAvgtemp <- list()
# TLIDiffAvgtemp <- list()
# SRMRDiffAvgtemp <- list()
# BICgt10DiffAvgtemp <- list()
# ## create empty list for average fit index differences
#
# for (i in 1:nAlloc) {
# if (BICDiff_AB[[i]] != 1) {
# BICDiffAvgtemp[[i]] <- 0
# } else BICDiffAvgtemp[[i]] <- FitDiff_AB[8, i]
# if (AICDiff_AB[[i]] != 1) {
# AICDiffAvgtemp[[i]] <- 0
# } else AICDiffAvgtemp[[i]] <- FitDiff_AB[9, i]
# if (RMSEADiff_AB[[i]] != 1) {
# RMSEADiffAvgtemp[[i]] <- 0
# } else RMSEADiffAvgtemp[[i]] <- FitDiff_AB[5, i]
# if (CFIDiff_AB[[i]] != 1) {
# CFIDiffAvgtemp[[i]] <- 0
# } else CFIDiffAvgtemp[[i]] <- FitDiff_AB[3, i]
# if (TLIDiff_AB[[i]] != 1) {
# TLIDiffAvgtemp[[i]] <- 0
# } else TLIDiffAvgtemp[[i]] <- FitDiff_AB[4, i]
# if (SRMRDiff_AB[[i]] != 1) {
# SRMRDiffAvgtemp[[i]] <- 0
# } else SRMRDiffAvgtemp[[i]] <- FitDiff_AB[6, i]
# if (BICDiffGT10_AB[[i]] != 1) {
# BICgt10DiffAvgtemp[[i]] <- 0
# } else BICgt10DiffAvgtemp[[i]] <- FitDiff_AB[8, i]
# }
# ## make average fit index difference list composed solely of values where A is
# ## preferred over B
#
# BICDiffAvg_AB <- Reduce("+", BICDiffAvgtemp)/nBIC_AoverB * (-1)
# AICDiffAvg_AB <- Reduce("+", AICDiffAvgtemp)/nAIC_AoverB * (-1)
# RMSEADiffAvg_AB <- Reduce("+", RMSEADiffAvgtemp)/nRMSEA_AoverB * (-1)
# CFIDiffAvg_AB <- Reduce("+", CFIDiffAvgtemp)/nCFI_AoverB
# TLIDiffAvg_AB <- Reduce("+", TLIDiffAvgtemp)/nTLI_AoverB
# SRMRDiffAvg_AB <- Reduce("+", SRMRDiffAvgtemp)/nSRMR_AoverB * (-1)
# BICgt10DiffAvg_AB <- Reduce("+", BICgt10DiffAvgtemp)/nBICDiffGT10_AoverB *
# (-1)
# ## calc average fit index difference when A is preferred over B
#
# FitDiffAvg_AoverB <- list(BICDiffAvg_AB, AICDiffAvg_AB, RMSEADiffAvg_AB, CFIDiffAvg_AB,
# TLIDiffAvg_AB, SRMRDiffAvg_AB)
# ## create list of all fit index differences when A is preferred over B
#
# FitDiffAvg_AoverB <- unlist(FitDiffAvg_AoverB, recursive = TRUE, use.names = TRUE)
# ### convert from list to vector
#
# for (i in 1:nAlloc) {
# if (BICDiff_BA[[i]] != 1) {
# BICDiffAvgtemp[[i]] <- 0
# } else BICDiffAvgtemp[[i]] <- FitDiff_AB[8, i]
# if (AICDiff_BA[[i]] != 1) {
# AICDiffAvgtemp[[i]] <- 0
# } else AICDiffAvgtemp[[i]] <- FitDiff_AB[9, i]
# if (RMSEADiff_BA[[i]] != 1) {
# RMSEADiffAvgtemp[[i]] <- 0
# } else RMSEADiffAvgtemp[[i]] <- FitDiff_AB[5, i]
# if (CFIDiff_BA[[i]] != 1) {
# CFIDiffAvgtemp[[i]] <- 0
# } else CFIDiffAvgtemp[[i]] <- FitDiff_AB[3, i]
# if (TLIDiff_BA[[i]] != 1) {
# TLIDiffAvgtemp[[i]] <- 0
# } else TLIDiffAvgtemp[[i]] <- FitDiff_AB[4, i]
# if (SRMRDiff_BA[[i]] != 1) {
# SRMRDiffAvgtemp[[i]] <- 0
# } else SRMRDiffAvgtemp[[i]] <- FitDiff_AB[6, i]
# if (BICDiffGT10_BA[[i]] != 1) {
# BICgt10DiffAvgtemp[[i]] <- 0
# } else BICgt10DiffAvgtemp[[i]] <- FitDiff_AB[8, i]
# }
# ## make average fit index difference list composed solely of values where B is
# ## preferred over A
#
# BICDiffAvg_BA <- Reduce("+", BICDiffAvgtemp)/nBIC_BoverA
# AICDiffAvg_BA <- Reduce("+", AICDiffAvgtemp)/nAIC_BoverA
# RMSEADiffAvg_BA <- Reduce("+", RMSEADiffAvgtemp)/nRMSEA_BoverA
# CFIDiffAvg_BA <- Reduce("+", CFIDiffAvgtemp)/nCFI_BoverA * (-1)
# TLIDiffAvg_BA <- Reduce("+", TLIDiffAvgtemp)/nTLI_BoverA * (-1)
# SRMRDiffAvg_BA <- Reduce("+", SRMRDiffAvgtemp)/nSRMR_BoverA
# BICgt10DiffAvg_BA <- Reduce("+", BICgt10DiffAvgtemp)/nBICDiffGT10_BoverA
# ## calc average fit index difference when B is preferred over A
#
# FitDiffAvg_BoverA <- list(BICDiffAvg_BA, AICDiffAvg_BA, RMSEADiffAvg_BA, CFIDiffAvg_BA,
# TLIDiffAvg_BA, SRMRDiffAvg_BA)
# ## create list of all fit index differences when B is preferred over A
#
# FitDiffAvg_BoverA <- unlist(FitDiffAvg_BoverA, recursive = TRUE, use.names = TRUE)
# ### convert from list to vector
#
# FitDiffBICgt10_AoverB <- nBICDiffGT10_AoverB/nConvergedProper_AB * 100
# ### calculate proportion of allocations where A is strongly preferred over B
#
# FitDiffBICgt10_BoverA <- nBICDiffGT10_BoverA/nConvergedProper_AB * 100
# ### calculate proportion of allocations where B is strongly preferred over A
#
# FitDiffBICgt10 <- rbind(FitDiffBICgt10_AoverB, FitDiffBICgt10_BoverA)
# rownames(FitDiffBICgt10) <- c("Very Strong evidence for A>B", "Very Strong evidence for B>A")
# colnames(FitDiffBICgt10) <- "% Allocations"
# ### create table of proportions of 'A strongly preferred over B' and 'B strongly
# ### preferred over A'
#
# FitDiff_AoverB <- list(nBIC_AoverB/nConvergedProper_AB * 100, nAIC_AoverB/nConvergedProper_AB *
# 100, nRMSEA_AoverB/nConvergedProper_AB * 100, nCFI_AoverB/nConvergedProper_AB *
# 100, nTLI_AoverB/nConvergedProper_AB * 100, nSRMR_AoverB/nConvergedProper_AB *
# 100)
# ### create list of all proportions of 'A preferred over B'
# FitDiff_BoverA <- list(nBIC_BoverA/nConvergedProper_AB * 100, nAIC_BoverA/nConvergedProper_AB *
# 100, nRMSEA_BoverA/nConvergedProper_AB * 100, nCFI_BoverA/nConvergedProper_AB *
# 100, nTLI_BoverA/nConvergedProper_AB * 100, nSRMR_BoverA/nConvergedProper_AB *
# 100)
# ### create list of all proportions of 'B preferred over A'
#
# FitDiff_AoverB <- unlist(FitDiff_AoverB, recursive = TRUE, use.names = TRUE)
# ### convert from list to vector
#
# FitDiff_BoverA <- unlist(FitDiff_BoverA, recursive = TRUE, use.names = TRUE)
# ### convert from list to vector
#
# FitDiffSum_AB <- cbind(FitDiff_AoverB, FitDiffAvg_AoverB, FitDiff_BoverA,
# FitDiffAvg_BoverA)
# colnames(FitDiffSum_AB) <- c("% A>B", "Avg Amount A>B", "% B>A", "Avg Amount B>A")
# rownames(FitDiffSum_AB) <- c("bic", "aic", "rmsea", "cfi", "tli", "srmr")
# ## create table showing number of allocations in which A>B and B>A as well as
# ## average difference values
#
# for (i in 1:nAlloc) {
# is.na(FitDiff_AB[1:9, i]) <- ConvergedProper_AB[[i]] != 1
# }
# ### make fit differences missing for each non-converged allocation
#
# LRThistMax <- max(hist(lrchisqp_AB, plot = FALSE)$counts)
# BIChistMax <- max(hist(FitDiff_AB[8, 1:nAlloc], plot = FALSE)$counts)
# AIChistMax <- max(hist(FitDiff_AB[9, 1:nAlloc], plot = FALSE)$counts)
# RMSEAhistMax <- max(hist(FitDiff_AB[5, 1:nAlloc], plot = FALSE)$counts)
# CFIhistMax <- max(hist(FitDiff_AB[3, 1:nAlloc], plot = FALSE)$counts)
# TLIhistMax <- max(hist(FitDiff_AB[4, 1:nAlloc], plot = FALSE)$counts)
# ### calculate y-axis height for each histogram
#
# LRThist <- hist(lrchisqp_AB, ylim = c(0, LRThistMax), xlab = "p-value", main = "LRT p-values")
# ## plot histogram of LRT p-values
#
# BIChist <- hist(FitDiff_AB[8, 1:nAlloc], ylim = c(0, BIChistMax), xlab = "BIC_modA - BIC_modB",
# main = "BIC Diff")
# AIChist <- hist(FitDiff_AB[9, 1:nAlloc], ylim = c(0, AIChistMax), xlab = "AIC_modA - AIC_modB",
# main = "AIC Diff")
# RMSEAhist <- hist(FitDiff_AB[5, 1:nAlloc], ylim = c(0, RMSEAhistMax), xlab = "RMSEA_modA - RMSEA_modB",
# main = "RMSEA Diff")
# CFIhist <- hist(FitDiff_AB[3, 1:nAlloc], ylim = c(0, CFIhistMax), xlab = "CFI_modA - CFI_modB",
# main = "CFI Diff")
# TLIhist <- hist(FitDiff_AB[4, 1:nAlloc], ylim = c(0, TLIhistMax), xlab = "TLI_modA - TLI_modB",
# main = "TLI Diff")
# ### plot histograms for each index_modA - index_modB
# BIChist
# AIChist
# RMSEAhist
# CFIhist
# TLIhist
#
# ConvergedProperSum <- rbind(nConverged_A/nAlloc, nConverged/nAlloc, nConverged_AB/nAlloc,
# nConvergedProper_A/nAlloc, nConvergedProper/nAlloc, nConvergedProper_AB/nAlloc)
# rownames(ConvergedProperSum) <- c("Converged_A", "Converged_B", "Converged_AB",
# "ConvergedProper_A", "ConvergedProper_B", "ConvergedProper_AB")
# colnames(ConvergedProperSum) <- "Proportion of Allocations"
# ### create table summarizing proportions of converged allocations and allocations
# ### with proper solutions
#
# Output_AB <- list(round(LRTsum, 3), "LRT results are interpretable specifically for nested models",
# round(FitDiffSum_AB, 3), round(FitDiffBICgt10, 3), ConvergedProperSum)
# names(Output_AB) <- c("LRT Summary, Model A vs. Model B", "Note:", "Fit Index Differences",
# "Percent of Allocations with |BIC Diff| > 10", "Converged and Proper Solutions Summary")
# ### output for model comparison
#
# }
#
# return(list(Output_A, Output_B, Output_AB))
# ## returns output for model A, model B, and the comparison of these
# }
semTools/R/permuteMeasEq.R 0000644 0001762 0000144 00000215230 14764267640 015203 0 ustar ligges users ### Terrence D. Jorgensen
### Last updated: 12 March 2025
### permutation randomization test for measurement equivalence and DIF
## -----------------
## Class and Methods
## -----------------
##' Class for the Results of Permutation Randomization Tests of Measurement
##' Equivalence and DIF
##'
##' This class contains the results of tests of Measurement Equivalence and
##' Differential Item Functioning (DIF).
##'
##'
##' @name permuteMeasEq-class
##' @aliases permuteMeasEq-class show,permuteMeasEq-method
##' summary,permuteMeasEq-method hist,permuteMeasEq-method
##' @docType class
##'
##' @slot PT A `data.frame` returned by a call to
##' [lavaan::parTable()] on the constrained model
##' @slot modelType A character indicating the specified `modelType` in the
##' call to `permuteMeasEq`
##' @slot ANOVA A `numeric` vector indicating the results of the observed
##' (\eqn{\Delta})\eqn{\chi^2} test, based on the central \eqn{\chi^2}
##' distribution
##' @slot AFI.obs A vector of observed (changes in) user-selected fit measures
##' @slot AFI.dist The permutation distribution(s) of user-selected fit measures.
##' A `data.frame` with `n.Permutations` rows and one column for each
##' `AFI.obs`.
##' @slot AFI.pval A vector of *p* values (one for each element in slot
##' `AFI.obs`) calculated using slot `AFI.dist`, indicating the
##' probability of observing a change at least as extreme as `AFI.obs`
##' if the null hypothesis were true
##' @slot MI.obs A `data.frame` of observed Lagrange Multipliers
##' (modification indices) associated with the equality constraints or fixed
##' parameters specified in the `param` argument. This is a subset of the
##' output returned by a call to [lavaan::lavTestScore()] on the
##' constrained model.
##' @slot MI.dist The permutation distribution of the maximum modification index
##' (among those seen in slot `MI.obs$X2`) at each permutation of group
##' assignment or of `covariates`
##' @slot extra.obs If `permuteMeasEq` was called with an `extra`
##' function, the output when applied to the original data is concatenated
##' into this vector
##' @slot extra.dist A `data.frame`, each column of which contains the
##' permutation distribution of the corresponding statistic in slot
##' `extra.obs`
##' @slot n.Permutations An `integer` indicating the number of permutations
##' requested by the user
##' @slot n.Converged An `integer` indicating the number of permutation
##' iterations which yielded a converged solution
##' @slot n.nonConverged An `integer` vector of length
##' `n.Permutations` indicating how many times group assignment was
##' randomly permuted (at each iteration) before converging on a solution
##' @slot n.Sparse Only relevant with `ordered` indicators when
##' `modelType == "mgcfa"`. An `integer` vector of length
##' `n.Permutations` indicating how many times group assignment was
##' randomly permuted (at each iteration) before obtaining a sample with all
##' categories observed in all groups.
##' @slot oldSeed An `integer` vector storing the value of
##' `.Random.seed` before running `permuteMeasEq`. Only relevant
##' when using a parallel/multicore option and the original
##' `RNGkind() != "L'Ecuyer-CMRG"`. This enables users to restore their
##' previous `.Random.seed` state, if desired, by running:
##' `.Random.seed[-1] <- permutedResults@oldSeed[-1]`
##' @section Objects from the Class: Objects can be created via the
##' [semTools::permuteMeasEq()] function.
##'
##' @return
##' \itemize{
##' \item The `show` method prints a summary of the multiparameter
##' omnibus test results, using the user-specified AFIs. The parametric
##' (\eqn{\Delta})\eqn{\chi^2} test is also displayed.
##' \item The `summary` method prints the same information from the
##' `show` method, but when `extra = FALSE` (the default) it also
##' provides a table summarizing any requested follow-up tests of DIF using
##' modification indices in slot `MI.obs`. The user can also specify an
##' `alpha` level for flagging modification indices as significant, as
##' well as `nd` (the number of digits displayed). For each modification
##' index, the *p* value is displayed using a central \eqn{\chi^2}
##' distribution with the *df* shown in that column. Additionally, a
##' *p* value is displayed using the permutation distribution of the
##' maximum index, which controls the familywise Type I error rate in a manner
##' similar to Tukey's studentized range test. If any indices are flagged as
##' significant using the `tukey.p.value`, then a message is displayed for
##' each flagged index. The invisibly returned `data.frame` is the
##' displayed table of modification indices, unless
##' [semTools::permuteMeasEq()] was called with `param = NULL`,
##' in which case the invisibly returned object is `object`. If
##' `extra = TRUE`, the permutation-based *p* values for each
##' statistic returned by the `extra` function are displayed and returned
##' in a `data.frame` instead of the modification indices requested in the
##' `param` argument.
##' \item The `hist` method returns a list of `length == 2`,
##' containing the arguments for the call to `hist` and the arguments
##' to the call for `legend`, respectively. This list may facilitate
##' creating a customized histogram of `AFI.dist`, `MI.dist`, or
##' `extra.dist`
##' }
##'
##' @author Terrence D. Jorgensen (University of Amsterdam;
##' \email{TJorgensen314@@gmail.com})
##'
##' @seealso [semTools::permuteMeasEq()]
##'
##' @examples
##'
##' # See the example from the permuteMeasEq function
##'
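##' ## A hypothetical sketch (assuming 'out' were a permuteMeasEq object
##' ## returned by permuteMeasEq), the methods would be called as:
##' # out                                  # show: omnibus test results
##' # difTab <- summary(out, alpha = .05)  # DIF table, returned invisibly
##' # hist(out, AFI = "cfi")               # plot permutation distribution
##'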
setClass("permuteMeasEq", slots = c(PT = "data.frame",
modelType = "character",
ANOVA = "vector",
AFI.obs = "vector",
AFI.dist = "data.frame",
AFI.pval = "vector",
MI.obs = "data.frame",
MI.dist = "vector",
extra.obs = "vector",
extra.dist = "data.frame",
n.Permutations = "integer",
n.Converged = "integer",
n.nonConverged = "vector",
n.Sparse = "vector",
oldSeed = "integer"))
##' @rdname permuteMeasEq-class
##' @aliases show,permuteMeasEq-method
##' @export
setMethod("show", "permuteMeasEq", function(object) {
## print warning if there are nonConverged permutations
if (object@n.Permutations != object@n.Converged) {
warning(paste("Only", object@n.Converged, "out of",
object@n.Permutations, "models converged within",
max(object@n.nonConverged), "attempts per permutation.\n\n"))
}
## print ANOVA
cat("Omnibus p value based on parametric chi-squared difference test:\n\n")
print(round(object@ANOVA, digits = 3))
## print permutation results
cat("\n\nOmnibus p values based on nonparametric permutation method: \n\n")
AFI <- data.frame(AFI.Difference = object@AFI.obs, p.value = object@AFI.pval)
class(AFI) <- c("lavaan.data.frame","data.frame")
print(AFI, nd = 3)
invisible(object)
})
##' @rdname permuteMeasEq-class
##' @aliases summary,permuteMeasEq-method
##' @export
setMethod("summary", "permuteMeasEq", function(object, alpha = .05, nd = 3,
extra = FALSE) {
## print warning if there are nonConverged permutations
if (object@n.Permutations != object@n.Converged) {
warning(paste("Only", object@n.Converged, "out of",
object@n.Permutations, "models converged within",
max(object@n.nonConverged), "attempts per permutation.\n\n"))
}
## print ANOVA
cat("Omnibus p value based on parametric chi-squared difference test:\n\n")
print(round(object@ANOVA, digits = nd))
## print permutation results
cat("\n\nOmnibus p values based on nonparametric permutation method: \n\n")
AFI <- data.frame(AFI.Difference = object@AFI.obs, p.value = object@AFI.pval)
class(AFI) <- c("lavaan.data.frame","data.frame")
print(AFI, nd = nd)
## print extras or DIF test results, if any were requested
if (extra && length(object@extra.obs)) {
cat("\n\nUnadjusted p values of extra statistics,\n",
"based on permutation distribution of each statistic: \n\n")
MI <- data.frame(Statistic = object@extra.obs)
class(MI) <- c("lavaan.data.frame","data.frame")
MI$p.value <- sapply(names(object@extra.dist), function(nn) {
mean(abs(object@extra.dist[,nn]) >= abs(object@extra.obs[nn]), na.rm = TRUE)
})
MI$flag <- ifelse(MI$p.value < alpha, "* ", "")
print(MI, nd = nd)
} else if (length(object@MI.dist)) {
cat("\n\n Modification indices for equality constrained parameter estimates,\n",
"with unadjusted 'p.value' based on chi-squared distribution and\n",
"adjusted 'tukey.p.value' based on permutation distribution of the\n",
"maximum modification index per iteration: \n\n")
MI <- do.call(paste("summ", object@modelType, sep = "."),
args = list(object = object, alpha = alpha))
print(MI, nd = nd)
## print messages about potential DIF
if (all(MI$tukey.p.value > alpha)) {
cat("\n\n No equality constraints were flagged as significant.\n\n")
return(invisible(MI))
}
if (object@modelType == "mgcfa") {
cat("\n\nThe following equality constraints were flagged as significant:\n\n")
for (i in which(MI$tukey.p.value < alpha)) {
cat("Parameter '", MI$parameter[i], "' may differ between Groups '",
MI$group.lhs[i], "' and '", MI$group.rhs[i], "'.\n", sep = "")
}
cat("\nUse lavTestScore(..., epc = TRUE) on your constrained model to",
"display expected parameter changes for these equality constraints\n\n")
}
} else return(invisible(object))
invisible(MI)
})
## Hidden helper for summary(): formats modification indices for multigroup
## CFA, labeling each constrained parameter and the groups it compares, and
## warning if an equality constraint refers to different parameters
summ.mgcfa <- function(object, alpha) {
MI <- object@MI.obs
class(MI) <- c("lavaan.data.frame","data.frame")
PT <- object@PT
eqPar <- rbind(PT[PT$plabel %in% MI$lhs, ], PT[PT$plabel %in% MI$rhs, ])
MI$flag <- ""
MI$parameter <- ""
MI$group.lhs <- ""
MI$group.rhs <- ""
for (i in 1:nrow(MI)) {
par1 <- eqPar$par[ eqPar$plabel == MI$lhs[i] ]
par2 <- eqPar$par[ eqPar$plabel == MI$rhs[i] ]
MI$parameter[i] <- par1
MI$group.lhs[i] <- eqPar$group.label[ eqPar$plabel == MI$lhs[i] ]
MI$group.rhs[i] <- eqPar$group.label[ eqPar$plabel == MI$rhs[i] ]
if (par1 != par2) {
myMessage <- paste0("Constraint '", MI$lhs[i], "==", MI$rhs[i],
"' refers to different parameters: \n'",
MI$lhs[i], "' is '", par1, "' in group '",
MI$group.lhs[i], "'\n'",
MI$rhs[i], "' is '", par2, "' in group '",
MI$group.rhs[i], "'\n")
warning(myMessage)
}
if (MI$tukey.p.value[i] < alpha) MI$flag[i] <- "* -->"
}
MI
}
## Hidden helper for summary(): flags significant modification indices in
## MIMIC models using the Tukey-adjusted p values
summ.mimic <- function(object, alpha) {
MI <- object@MI.obs
class(MI) <- c("lavaan.data.frame","data.frame")
MI$flag <- ifelse(MI$tukey.p.value < alpha, "* ", "")
MI
}
##' @rdname permuteMeasEq-class
##' @aliases hist,permuteMeasEq-method
##' @importFrom stats qchisq dchisq quantile
##' @param object,x object of class `permuteMeasEq`
##' @param ... Additional arguments to pass to [graphics::hist()]
##' @param AFI `character` indicating the fit measure whose permutation
##' distribution should be plotted
##' @param alpha alpha level used to draw confidence limits in `hist` and
##' flag significant statistics in `summary` output
##' @param nd number of digits to display
##' @param extra `logical` indicating whether the `summary` output
##' should return permutation-based *p* values for each statistic returned
##' by the `extra` function. If `FALSE` (default), `summary`
##' will return permutation-based *p* values for each modification index.
##' @param printLegend `logical`. If `TRUE` (default), a legend will
##' be printed with the histogram
##' @param legendArgs `list` of arguments passed to the
##' [graphics::legend()] function. The default argument is a list
##' placing the legend at the top-left of the figure.
##' @export
setMethod("hist", "permuteMeasEq", function(x, ..., AFI, alpha = .05, nd = 3,
printLegend = TRUE,
legendArgs = list(x = "topleft")) {
histArgs <- list(...)
histArgs$x <- x@AFI.dist[[AFI]]
if (is.null(histArgs$col)) histArgs$col <- "grey69"
histArgs$freq <- !grepl("chi", AFI)
histArgs$ylab <- if (histArgs$freq) "Frequency" else "Probability Density"
if (printLegend) {
if (is.null(legendArgs$box.lty)) legendArgs$box.lty <- 0
if (nd < length(strsplit(as.character(1 / alpha), "")[[1]]) - 1) {
warning(paste0("The number of digits argument (nd = ", nd ,
") is too low to display your p value at the",
" same precision as your requested alpha level (alpha = ",
alpha, ")"))
}
if (x@AFI.pval[[AFI]] < (1 / 10^nd)) {
pVal <- paste(c("< .", rep(0, nd - 1),"1"), collapse = "")
} else {
pVal <- paste("=", round(x@AFI.pval[[AFI]], nd))
}
}
delta <- length(x@MI.dist) > 0L && x@modelType == "mgcfa"
if (grepl("chi", AFI)) { ####################################### Chi-squared
ChiSq <- x@AFI.obs[AFI]
DF <- x@ANOVA[2]
histArgs$xlim <- range(c(ChiSq, x@AFI.dist[[AFI]], qchisq(c(.01, .99), DF)))
xVals <- seq(histArgs$xlim[1], histArgs$xlim[2], by = .1)
theoDist <- dchisq(xVals, df = DF)
TheoCrit <- round(qchisq(p = alpha, df = DF, lower.tail = FALSE), 2)
Crit <- quantile(histArgs$x, probs = 1 - alpha)
if (ChiSq > histArgs$xlim[2]) histArgs$xlim[2] <- ChiSq
if (delta) {
histArgs$main <- expression(Permutation~Distribution~of~Delta*chi^2)
histArgs$xlab <- expression(Delta*chi^2)
if (printLegend) {
legendArgs$legend <- c(bquote(Theoretical~Delta*chi[Delta*.(paste("df =", DF))]^2 ~ Distribution),
bquote(Critical~chi[alpha~.(paste(" =", alpha))]^2 == .(round(TheoCrit, nd))),
bquote(.(paste("Permuted Critical Value =", round(Crit, nd)))),
bquote(Observed~Delta*chi^2 == .(round(ChiSq, nd))),
expression(paste("")),
bquote(Permuted~italic(p)~.(pVal)))
}
} else {
histArgs$main <- expression(Permutation~Distribution~of~chi^2)
histArgs$xlab <- expression(chi^2)
if (printLegend) {
legendArgs$legend <- c(bquote(Theoretical~chi[.(paste("df =", DF))]^2 ~ Distribution),
bquote(Critical~chi[alpha~.(paste(" =", alpha))]^2 == .(round(TheoCrit, nd))),
bquote(.(paste("Permuted Critical Value =", round(Crit, nd)))),
bquote(Observed~chi^2 == .(round(ChiSq, nd))),
expression(paste("")),
bquote(Permuted~italic(p)~.(pVal)))
}
}
H <- do.call(hist, c(histArgs["x"], plot = FALSE))
histArgs$ylim <- c(0, max(H$density, theoDist))
if (printLegend) {
legendArgs <- c(legendArgs, list(lty = c(2, 2, 1, 1, 0, 0),
lwd = c(2, 2, 2, 3, 0, 0),
col = c("black","black","black","red","","")))
}
} else { ################################################### other AFIs
badness <- grepl(pattern = "fmin|aic|bic|rmr|rmsea|cn|sic|hqc",
x = AFI, ignore.case = TRUE)
if (badness) {
Crit <- quantile(histArgs$x, probs = 1 - alpha)
} else {
Crit <- quantile(histArgs$x, probs = alpha)
}
histArgs$xlim <- range(histArgs$x, x@AFI.obs[AFI])
if (delta) {
histArgs$main <- bquote(~Permutation~Distribution~of~Delta*.(toupper(AFI)))
histArgs$xlab <- bquote(~Delta*.(toupper(AFI)))
if (printLegend) {
legendArgs$legend <- c(bquote(Critical~Delta*.(toupper(AFI))[alpha~.(paste(" =", alpha))] == .(round(Crit, nd))),
bquote(Observed~Delta*.(toupper(AFI)) == .(round(x@AFI.obs[AFI], nd))),
expression(paste("")),
bquote(Permuted~italic(p)~.(pVal)))
}
} else {
histArgs$main <- paste("Permutation Distribution of", toupper(AFI))
histArgs$xlab <- toupper(AFI)
if (printLegend) {
legendArgs$legend <- c(bquote(Critical~.(toupper(AFI))[alpha~.(paste(" =", alpha))] == .(round(Crit, nd))),
bquote(Observed~.(toupper(AFI)) == .(round(x@AFI.obs[AFI], nd))),
expression(paste("")),
bquote(Permuted~italic(p)~.(pVal)))
}
}
if (printLegend) {
legendArgs <- c(legendArgs, list(lty = c(1, 1, 0, 0),
lwd = c(2, 3, 0, 0),
col = c("black","red","","")))
}
}
## print histogram (and optionally, print legend)
suppressWarnings({
do.call(hist, histArgs)
if (grepl("chi", AFI)) {
lines(x = xVals, y = theoDist, lwd = 2, lty = 2)
abline(v = TheoCrit, col = "black", lwd = 2, lty = 2)
}
abline(v = Crit, col = "black", lwd = 2)
abline(v = x@AFI.obs[AFI], col = "red", lwd = 3)
if (printLegend) do.call(legend, legendArgs)
})
## return arguments to create histogram (and optionally, legend)
invisible(list(hist = histArgs, legend = legendArgs))
})
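
## A minimal sketch (assuming 'out' is a permuteMeasEq object): the list
## invisibly returned by the hist() method can be tweaked and replayed to
## customize the figure without recomputing the permutation distribution.
# h <- hist(out, AFI = "cfi")  # draws the plot, returns arguments invisibly
# h$hist$col <- "skyblue"      # modify any stored hist() argument
# do.call(hist, h$hist)        # redraw with the modified settings
# do.call(legend, h$legend)    # re-add the stored legend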
## --------------------
## Constructor Function
## --------------------
##' Permutation Randomization Tests of Measurement Equivalence and Differential
##' Item Functioning (DIF)
##'
##' The function `permuteMeasEq` provides tests of hypotheses involving
##' measurement equivalence, in one of two frameworks: multigroup CFA or MIMIC
##' models.
##'
##'
##' The function `permuteMeasEq` provides tests of hypotheses involving
##' measurement equivalence, in one of two frameworks:
##' \enumerate{
##' \item For multiple-group CFA models, provide a pair of nested lavaan objects,
##' the less constrained of which (`uncon`) freely estimates a set of
##' measurement parameters (e.g., factor loadings, intercepts, or thresholds;
##' specified in `param`) in all groups, and the more constrained of which
##' (`con`) constrains those measurement parameters to equality across
##' groups. Group assignment is repeatedly permuted and the models are fit to
##' each permutation, in order to produce an empirical distribution under the
##' null hypothesis of no group differences, both for (a) changes in
##' user-specified fit measures (see `AFIs` and `moreAFIs`) and for
##' (b) the maximum modification index among the user-specified equality
##' constraints. Configural invariance can also be tested by providing that
##' fitted lavaan object to `con` and leaving `uncon = NULL`, in which
##' case `param` must be `NULL` as well.
##'
##' \item In MIMIC models, one or a set of continuous and/or discrete
##' `covariates` can be permuted, and a constrained model is fit to each
##' permutation in order to provide a distribution of any fit measures (namely,
##' the maximum modification index among fixed parameters in `param`) under
##' the null hypothesis of measurement equivalence across levels of those
##' covariates.
##' }
##'
##' In either framework, modification indices for equality constraints or fixed
##' parameters specified in `param` are calculated from the constrained
##' model (`con`) using the function [lavaan::lavTestScore()].
##'
##' For multiple-group CFA models, the multiparameter omnibus null hypothesis of
##' measurement equivalence/invariance is that there are no group differences in
##' any measurement parameters (of a particular type). This can be tested using
##' the `anova` method on nested `lavaan` objects, as seen in the
##' output of [semTools::measurementInvariance()], or by inspecting
##' the change in alternative fit indices (AFIs) such as the CFI. The
##' permutation randomization method employed by `permuteMeasEq` generates
##' an empirical distribution of any `AFIs` under the null hypothesis, so
##' the user is not restricted to using fixed cutoffs proposed by Cheung &
##' Rensvold (2002), Chen (2007), or Meade, Johnson, & Braddy (2008).
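##'
##' As a conceptual sketch (hypothetical objects: `AFI.dist` holds the
##' permutation distribution of a badness-of-fit measure such as RMSEA, and
##' `AFI.obs` its observed change), each one-directional permutation
##' *p* value amounts to:
##'
##' ```
##' pval <- mean(AFI.dist >= AFI.obs, na.rm = TRUE)
##' ```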
##'
##' If the multiparameter omnibus null hypothesis is rejected, partial
##' invariance can still be established by freeing invalid equality constraints,
##' as long as equality constraints are valid for at least two indicators per
##' factor. Modification indices can be calculated from the constrained model
##' (`con`), but multiple testing leads to inflation of Type I error rates.
##' The permutation randomization method employed by `permuteMeasEq`
##' creates a distribution of the maximum modification index if the null
##' hypothesis is true, which allows the user to control the familywise Type I
##' error rate in a manner similar to Tukey's *q* (studentized range)
##' distribution for the Honestly Significant Difference (HSD) post hoc test.
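##'
##' Conceptually (again with hypothetical objects: `MI.obs` holding the
##' observed modification indices and `maxMI.dist` the permutation
##' distribution of each iteration's maximum index), the familywise-adjusted
##' *p* values amount to:
##'
##' ```
##' tukey.p <- sapply(MI.obs, function(mi) mean(maxMI.dist >= mi, na.rm = TRUE))
##' ```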
##'
##' For MIMIC models, DIF can be tested by comparing modification indices of
##' regression paths to the permutation distribution of the maximum modification
##' index, which controls the familywise Type I error rate. The MIMIC approach
##' could also be applied with multiple-group models, but the grouping variable
##' would not be permuted; rather, the covariates would be permuted separately
##' within each group to preserve between-group differences. So whether
##' parameters are constrained or unconstrained across groups, the MIMIC
##' approach is only for testing null hypotheses about the effects of
##' `covariates` on indicators, controlling for common factors.
##'
##' In either framework, [lavaan::lavaan()]'s `group.label`
##' argument is used to preserve the order of groups seen in `con` when
##' permuting the data.
##'
##'
##' @importFrom lavaan lavInspect parTable
##'
##' @param nPermute An integer indicating the number of random permutations used
##' to form empirical distributions under the null hypothesis.
##' @param modelType A character string indicating type of model employed:
##' multiple-group CFA (`"mgcfa"`) or MIMIC (`"mimic"`).
##' @param con The constrained `lavaan` object, in which the parameters
##' specified in `param` are constrained to equality across all groups when
##' `modelType = "mgcfa"`, or which regression paths are fixed to zero when
##' `modelType = "mimic"`. In the case of testing *configural*
##' invariance when `modelType = "mgcfa"`, `con` is the configural
##' model (implicitly, the unconstrained model is the saturated model, so use
##' the defaults `uncon = NULL` and `param = NULL`). When
##' `modelType = "mimic"`, `con` is the MIMIC model in which the
##' covariate predicts the latent construct(s) but no indicators (unless they
##' have already been identified as DIF items).
##' @param uncon Optional. The unconstrained `lavaan` object, in which the
##' parameters specified in `param` are freely estimated in all groups.
##' When `modelType = "mgcfa"`, only in the case of testing
##' *configural* invariance should `uncon = NULL`. When
##' `modelType = "mimic"`, any non-`NULL` `uncon` is silently set to
##' `NULL`.
##' @param null Optional. A `lavaan` object, in which an alternative null
##' model is fit (besides the default independence model specified by
##' `lavaan`) for the calculation of incremental fit indices. See Widamin &
##' Thompson (2003) for details. If `NULL`, `lavaan`'s default
##' independence model is used.
##' @param param An optional character vector or list of character vectors
##' indicating which parameters the user would test for DIF following a
##' rejection of the omnibus null hypothesis tested using
##' (`more`)`AFIs`. Note that `param` does not guarantee certain
##' parameters *are* constrained in `con`; that is for the user to
##' specify when fitting the model. If users have any "anchor items" that they
##' would never intend to free across groups (or levels of a covariate), these
##' should be excluded from `param`; exceptions to a type of parameter can
##' be specified in `freeParam`. When `modelType = "mgcfa"`,
##' `param` indicates which parameters of interest are constrained across
##' groups in `con` and are unconstrained in `uncon`. Parameter names
##' must match those returned by `names(coef(con))`, but omitting any
##' group-specific suffixes (e.g., `"f1~1"` rather than `"f1~1.g2"`)
##' or user-specified labels (that is, the parameter names must follow the rules
##' of lavaan's [lavaan::model.syntax()]). Alternatively (or
##' additionally), to test all constraints of a certain type (or multiple types)
##' of parameter in `con`, `param` may take any combination of the
##' following values: `"loadings"`, `"intercepts"`,
##' `"thresholds"`, `"residuals"`, `"residual.covariances"`,
##' `"means"`, `"lv.variances"`, and/or `"lv.covariances"`. When
##' `modelType = "mimic"`, `param` must be a vector of individual
##' parameters or a list of character strings to be passed one-at-a-time to
##' `lavaan::lavTestScore(object = con, add = param[i])`,
##' indicating which (sets of) regression paths fixed to zero in `con` that
##' the user would consider freeing (i.e., exclude anchor items). If
##' `modelType = "mimic"` and `param` is a list of character strings,
##' the multivariate test statistic will be saved for each list element instead
##' of 1-*df* modification indices for each individual parameter, and
##' `names(param)` will name the rows of the `MI.obs` slot (see
##' [permuteMeasEq-class]). Set `param = NULL` (default) to avoid
##' collecting modification indices for any follow-up tests.
##' @param freeParam An optional character vector, silently ignored when
##' `modelType = "mimic"`. If `param` includes a type of parameter
##' (e.g., `"loadings"`), `freeParam` indicates exceptions (i.e.,
##' anchor items) that the user would *not* intend to free across groups
##' and should therefore be ignored when calculating *p* values adjusted
##' for the number of follow-up tests. Parameter types that are already
##' unconstrained across groups in the fitted `con` model (i.e., a
##' *partial* invariance model) will automatically be ignored, so they do
##' not need to be specified in `freeParam`. Parameter names must match
##' those returned by `names(coef(con))`, but omitting any group-specific
##' suffixes (e.g., `"f1~1"` rather than `"f1~1.g2"`) or
##' user-specified labels (that is, the parameter names must follow the rules
##' of lavaan's [lavaan::model.syntax()]).
##' @param covariates An optional character vector, only applicable when
##' `modelType = "mimic"`. The observed data are partitioned into columns
##' indicated by `covariates`, and the rows are permuted simultaneously for
##' the entire set before being merged with the remaining data. Thus, the
##' covariance structure is preserved among the covariates, which is necessary
##' when (e.g.) multiple dummy codes are used to represent a discrete covariate
##' or when covariates interact. If `covariates = NULL` when
##' `modelType = "mimic"`, the value of `covariates` is inferred by
##' searching `param` for predictors (i.e., variables appearing after the
##' "`~`" operator).
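##'
##' Conceptually, this joint permutation amounts to shuffling entire rows of
##' the covariate columns at once (a hypothetical sketch, where `d` is the
##' analyzed `data.frame`):
##'
##' ```
##' d[ , covariates] <- d[sample(nrow(d)), covariates]
##' ```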
##' @param AFIs A character vector indicating which alternative fit indices (or
##' chi-squared itself) are to be used to test the multiparameter omnibus null
##' hypothesis that the constraints specified in `con` hold in the
##' population. Any fit measures returned by [lavaan::fitMeasures()]
##' may be specified (including constants like `"df"`, which would be
##' nonsensical). If both `AFIs` and `moreAFIs` are `NULL`, only
##' `"chisq"` will be returned.
##' @param moreAFIs Optional. A character vector indicating which (if any)
##' alternative fit indices returned by [semTools::moreFitIndices()]
##' are to be used to test the multiparameter omnibus null hypothesis that the
##' constraints specified in `con` hold in the population.
##' @param maxSparse Only applicable when `modelType = "mgcfa"` and at
##' least one indicator is `ordered`. An integer indicating the maximum
##' number of consecutive times that randomly permuted group assignment can
##' yield a sample in which at least one category (of an `ordered`
##' indicator) is unobserved in at least one group, such that the same set of
##' parameters cannot be estimated in each group. If such a sample occurs, group
##' assignment is randomly permuted again, repeatedly until a sample is obtained
##' with all categories observed in all groups. If `maxSparse` is exceeded,
##' `NA` will be returned for that iteration of the permutation
##' distribution.
##' @param maxNonconv An integer indicating the maximum number of consecutive
##' times that a random permutation can yield a sample for which the model does
##' not converge on a solution. If such a sample occurs, permutation is
##' attempted repeatedly until a sample is obtained for which the model does
##' converge. If `maxNonconv` is exceeded, `NA` will be returned for
##' that iteration of the permutation distribution, and a warning will be
##' printed when using `show` or `summary`.
##' @param showProgress Logical. Indicating whether to display a progress bar
##' while permuting. Silently set to `FALSE` when using parallel options.
##' @param warn Sets the handling of warning messages when fitting model(s) to
##' permuted data sets. See [base::options()].
##' @param datafun An optional function that can be applied to the data
##' (extracted from `con`) after each permutation, but before fitting the
##' model(s) to each permutation. The `datafun` function must have an
##' argument named `data` that accepts a `data.frame`, and it must
##' return a `data.frame` containing the same column names. The column
##' order may differ, the values of those columns may differ (so be careful!),
##' and any additional columns will be ignored when fitting the model, but an
##' error will result if any column names required by the model syntax do not
##' appear in the transformed data set. Although available for any
##' `modelType`, `datafun` may be useful when using the MIMIC method
##' to test for nonuniform DIF (metric/weak invariance) by using product
##' indicators for a latent factor representing the interaction between a factor
##' and one of the `covariates`, in which case the product indicators would
##' need to be recalculated after each permutation of the `covariates`. To
##' access other R objects used within `permuteMeasEq`, the arguments to
##' `datafun` may also contain any subset of the following: `"con"`,
##' `"uncon"`, `"null"`, `"param"`, `"freeParam"`,
##' `"covariates"`, `"AFIs"`, `"moreAFIs"`, `"maxSparse"`,
##' `"maxNonconv"`, and/or `"iseed"`. The values for those arguments
##' will be the same as the values supplied to `permuteMeasEq`.
##' @param extra An optional function that can be applied to any (or all) of the
##' fitted lavaan objects (`con`, `uncon`, and/or `null`). This
##' function will also be applied after fitting the model(s) to each permuted
##' data set. To access the R objects used within `permuteMeasEq`, the
##' arguments to `extra` must be any subset of the following: `"con"`,
##' `"uncon"`, `"null"`, `"param"`, `"freeParam"`,
##' `"covariates"`, `"AFIs"`, `"moreAFIs"`, `"maxSparse"`,
##' `"maxNonconv"`, and/or `"iseed"`. The values for those arguments
##' will be the same as the values supplied to `permuteMeasEq`. The
##' `extra` function must return a named `numeric` vector or a named
##' `list` of scalars (i.e., a `list` of `numeric` vectors of
##' `length == 1`). Any unnamed elements (e.g., `""` or `NULL`)
##' of the returned object will result in an error.
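##'
##' A minimal sketch of a valid `extra` function (hypothetical; it merely
##' records the constrained model's parameter count as a named scalar):
##'
##' ```
##' extra <- function(con) list(npar = unname(lavaan::fitMeasures(con, "npar")))
##' ```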
##' @param parallelType The type of parallel operation to be used (if any). The
##' default is `"none"`. Forking is not possible on Windows, so if
##' `"multicore"` is requested on a Windows machine, the request will be
##' changed to `"snow"` with a message.
##' @param ncpus Integer: number of processes to be used in parallel operation.
##' If `NULL` (the default) and `parallelType %in%
##' c("multicore","snow")`, the default is one less than the maximum number of
##' processors detected by [parallel::detectCores()]. This default is
##' also silently set if the user specifies more than the number of processors
##' detected.
##' @param cl An optional \pkg{parallel} or \pkg{snow} cluster for use when
##' `parallelType = "snow"`. If `NULL`, a `"PSOCK"` cluster on
##' the local machine is created for the duration of the `permuteMeasEq`
##' call. If a valid [parallel::makeCluster()] object is supplied,
##' `parallelType` is silently set to `"snow"`, and `ncpus` is
##' silently set to `length(cl)`.
##' @param iseed Integer: Only used to set the states of the RNG when using
##' parallel options, in which case [base::RNGkind()] is set to
##' `"L'Ecuyer-CMRG"` with a message. See
##' [parallel::clusterSetRNGStream()] and Section 6 of
##' `vignette("parallel", "parallel")` for more details. If user supplies
##' an invalid value, `iseed` is silently set to the default (12345). To
##' set the state of the RNG when not using parallel options, call
##' [base::set.seed()] before calling `permuteMeasEq`.
##'
##' @return The [permuteMeasEq-class] object representing the results of
##' testing measurement equivalence (the multiparameter omnibus test) and DIF
##' (modification indices), as well as diagnostics and any `extra` output.
##'
##' @author Terrence D. Jorgensen (University of Amsterdam;
##' \email{TJorgensen314@@gmail.com})
##'
##' @seealso [stats::TukeyHSD()], [lavaan::lavTestScore()],
##' [semTools::measurementInvariance()],
##' [semTools::measurementInvarianceCat()]
##'
##' @references
##'
##' **Papers about permutation tests of measurement equivalence:**
##'
##' Jorgensen, T. D., Kite, B. A., Chen, P.-Y., & Short, S. D. (2018).
##' Permutation randomization methods for testing measurement equivalence and
##' detecting differential item functioning in multiple-group confirmatory
##' factor analysis. *Psychological Methods, 23*(4), 708--728.
##' \doi{10.1037/met0000152}
##'
##' Kite, B. A., Jorgensen, T. D., & Chen, P.-Y. (2018). Random permutation
##' testing applied to measurement invariance testing with ordered-categorical
##' indicators. *Structural Equation Modeling 25*(4), 573--587.
##' \doi{10.1080/10705511.2017.1421467}
##'
##' Jorgensen, T. D. (2017). Applying permutation tests and multivariate
##' modification indices to configurally invariant models that need
##' respecification. *Frontiers in Psychology, 8*(1455).
##' \doi{10.3389/fpsyg.2017.01455}
##'
##' **Additional reading:**
##'
##' Chen, F. F. (2007). Sensitivity of goodness of fit indexes to
##' lack of measurement invariance. *Structural Equation Modeling, 14*(3),
##' 464--504. \doi{10.1080/10705510701301834}
##'
##' Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes
##' for testing measurement invariance. *Structural Equation Modeling,
##' 9*(2), 233--255. \doi{10.1207/S15328007SEM0902_5}
##'
##' Meade, A. W., Johnson, E. C., & Braddy, P. W. (2008). Power and sensitivity
##' of alternative fit indices in tests of measurement invariance. *Journal
##' of Applied Psychology, 93*(3), 568--592. \doi{10.1037/0021-9010.93.3.568}
##'
##' Widaman, K. F., & Thompson, J. S. (2003). On specifying the null model for
##' incremental fit indices in structural equation modeling. *Psychological
##' Methods, 8*(1), 16--37. \doi{10.1037/1082-989X.8.1.16}
##'
##' @examples
##'
##' \donttest{
##'
##' ########################
##' ## Multiple-Group CFA ##
##' ########################
##'
##' ## create 3-group data in lavaan example(cfa) data
##' HS <- lavaan::HolzingerSwineford1939
##' HS$ageGroup <- ifelse(HS$ageyr < 13, "preteen",
##' ifelse(HS$ageyr > 13, "teen", "thirteen"))
##'
##' ## specify and fit an appropriate null model for incremental fit indices
##' mod.null <- c(paste0("x", 1:9, " ~ c(T", 1:9, ", T", 1:9, ", T", 1:9, ")*1"),
##' paste0("x", 1:9, " ~~ c(L", 1:9, ", L", 1:9, ", L", 1:9, ")*x", 1:9))
##' fit.null <- cfa(mod.null, data = HS, group = "ageGroup")
##'
##' ## fit target model with varying levels of measurement equivalence
##' mod.config <- '
##' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + x8 + x9
##' '
##' fit.config <- cfa(mod.config, data = HS, std.lv = TRUE, group = "ageGroup")
##' fit.metric <- cfa(mod.config, data = HS, std.lv = TRUE, group = "ageGroup",
##' group.equal = "loadings")
##' fit.scalar <- cfa(mod.config, data = HS, std.lv = TRUE, group = "ageGroup",
##' group.equal = c("loadings","intercepts"))
##'
##'
##' ####################### Permutation Method
##'
##' ## fit indices of interest for multiparameter omnibus test
##' myAFIs <- c("chisq","cfi","rmsea","mfi","aic")
##' moreAFIs <- c("gammaHat","adjGammaHat")
##'
##' ## Use only 20 permutations for a demo. In practice,
##' ## use > 1000 to reduce sampling variability of estimated p values
##'
##' ## test configural invariance
##' set.seed(12345)
##' out.config <- permuteMeasEq(nPermute = 20, con = fit.config)
##' out.config
##'
##' ## test metric equivalence
##' set.seed(12345) # same permutations
##' out.metric <- permuteMeasEq(nPermute = 20, uncon = fit.config, con = fit.metric,
##' param = "loadings", AFIs = myAFIs,
##' moreAFIs = moreAFIs, null = fit.null)
##' summary(out.metric, nd = 4)
##'
##' ## test scalar equivalence
##' set.seed(12345) # same permutations
##' out.scalar <- permuteMeasEq(nPermute = 20, uncon = fit.metric, con = fit.scalar,
##' param = "intercepts", AFIs = myAFIs,
##' moreAFIs = moreAFIs, null = fit.null)
##' summary(out.scalar)
##'
##' ## Not much to see without significant DIF.
##' ## Try using an absurdly high alpha level for illustration.
##' outsum <- summary(out.scalar, alpha = .50)
##'
##' ## notice that the returned object is the table of DIF tests
##' outsum
##'
##' ## visualize permutation distribution
##' hist(out.config, AFI = "chisq")
##' hist(out.metric, AFI = "chisq", nd = 2, alpha = .01,
##' legendArgs = list(x = "topright"))
##' hist(out.scalar, AFI = "cfi", printLegend = FALSE)
##'
##'
##' ####################### Extra Output
##'
##' ## function to calculate expected change of Group-2 and -3 latent means if
##' ## each intercept constraint were released
##' extra <- function(con) {
##' output <- list()
##' output["x1.vis2"] <- lavTestScore(con, release = 19:20, univariate = FALSE,
##' epc = TRUE, warn = FALSE)$epc$epc[70]
##' output["x1.vis3"] <- lavTestScore(con, release = 19:20, univariate = FALSE,
##' epc = TRUE, warn = FALSE)$epc$epc[106]
##' output["x2.vis2"] <- lavTestScore(con, release = 21:22, univariate = FALSE,
##' epc = TRUE, warn = FALSE)$epc$epc[70]
##' output["x2.vis3"] <- lavTestScore(con, release = 21:22, univariate = FALSE,
##' epc = TRUE, warn = FALSE)$epc$epc[106]
##' output["x3.vis2"] <- lavTestScore(con, release = 23:24, univariate = FALSE,
##' epc = TRUE, warn = FALSE)$epc$epc[70]
##' output["x3.vis3"] <- lavTestScore(con, release = 23:24, univariate = FALSE,
##' epc = TRUE, warn = FALSE)$epc$epc[106]
##' output["x4.txt2"] <- lavTestScore(con, release = 25:26, univariate = FALSE,
##' epc = TRUE, warn = FALSE)$epc$epc[71]
##' output["x4.txt3"] <- lavTestScore(con, release = 25:26, univariate = FALSE,
##' epc = TRUE, warn = FALSE)$epc$epc[107]
##' output["x5.txt2"] <- lavTestScore(con, release = 27:28, univariate = FALSE,
##' epc = TRUE, warn = FALSE)$epc$epc[71]
##' output["x5.txt3"] <- lavTestScore(con, release = 27:28, univariate = FALSE,
##' epc = TRUE, warn = FALSE)$epc$epc[107]
##' output["x6.txt2"] <- lavTestScore(con, release = 29:30, univariate = FALSE,
##' epc = TRUE, warn = FALSE)$epc$epc[71]
##' output["x6.txt3"] <- lavTestScore(con, release = 29:30, univariate = FALSE,
##' epc = TRUE, warn = FALSE)$epc$epc[107]
##' output["x7.spd2"] <- lavTestScore(con, release = 31:32, univariate = FALSE,
##' epc = TRUE, warn = FALSE)$epc$epc[72]
##' output["x7.spd3"] <- lavTestScore(con, release = 31:32, univariate = FALSE,
##' epc = TRUE, warn = FALSE)$epc$epc[108]
##' output["x8.spd2"] <- lavTestScore(con, release = 33:34, univariate = FALSE,
##' epc = TRUE, warn = FALSE)$epc$epc[72]
##' output["x8.spd3"] <- lavTestScore(con, release = 33:34, univariate = FALSE,
##' epc = TRUE, warn = FALSE)$epc$epc[108]
##' output["x9.spd2"] <- lavTestScore(con, release = 35:36, univariate = FALSE,
##' epc = TRUE, warn = FALSE)$epc$epc[72]
##' output["x9.spd3"] <- lavTestScore(con, release = 35:36, univariate = FALSE,
##' epc = TRUE, warn = FALSE)$epc$epc[108]
##' output
##' }
##'
##' ## observed EPC
##' extra(fit.scalar)
##'
##' ## permutation results, including extra output
##' set.seed(12345) # same permutations
##' out.scalar <- permuteMeasEq(nPermute = 20, uncon = fit.metric, con = fit.scalar,
##' param = "intercepts", AFIs = myAFIs,
##' moreAFIs = moreAFIs, null = fit.null, extra = extra)
##' ## summarize extra output
##' summary(out.scalar, extra = TRUE)
##'
##'
##' ###########
##' ## MIMIC ##
##' ###########
##'
##' ## Specify Restricted Factor Analysis (RFA) model, equivalent to MIMIC, but
##' ## the factor covaries with the covariate instead of being regressed on it.
##' ## The covariate defines a single-indicator construct, and the
##' ## double-mean-centered products of the indicators define a latent
##' ## interaction between the factor and the covariate.
##' mod.mimic <- '
##' visual =~ x1 + x2 + x3
##' age =~ ageyr
##' age.by.vis =~ x1.ageyr + x2.ageyr + x3.ageyr
##'
##' x1 ~~ x1.ageyr
##' x2 ~~ x2.ageyr
##' x3 ~~ x3.ageyr
##' '
##'
##' HS.orth <- indProd(var1 = paste0("x", 1:3), var2 = "ageyr", match = FALSE,
##' data = HS[ , c("ageyr", paste0("x", 1:3))] )
##' fit.mimic <- cfa(mod.mimic, data = HS.orth, meanstructure = TRUE)
##' summary(fit.mimic, stand = TRUE)
##'
##' ## Whereas MIMIC models specify direct effects of the covariate on an indicator,
##' ## DIF can be tested in RFA models by specifying free loadings of an indicator
##' ## on the covariate's construct (uniform DIF, scalar invariance) and the
##' ## interaction construct (nonuniform DIF, metric invariance).
##' param <- as.list(paste0("age + age.by.vis =~ x", 1:3))
##' names(param) <- paste0("x", 1:3)
##' # param <- as.list(paste0("x", 1:3, " ~ age + age.by.vis")) # equivalent
##'
##' ## test both parameters simultaneously for each indicator
##' do.call(rbind, lapply(param, function(x) lavTestScore(fit.mimic, add = x)$test))
##' ## or test each parameter individually
##' lavTestScore(fit.mimic, add = as.character(param))
##'
##'
##' ####################### Permutation Method
##'
##' ## function to recalculate interaction terms after permuting the covariate
##' datafun <- function(data) {
##' d <- data[, c(paste0("x", 1:3), "ageyr")]
##' indProd(var1 = paste0("x", 1:3), var2 = "ageyr", match = FALSE, data = d)
##' }
##'
##' set.seed(12345)
##' perm.mimic <- permuteMeasEq(nPermute = 20, modelType = "mimic",
##' con = fit.mimic, param = param,
##' covariates = "ageyr", datafun = datafun)
##' summary(perm.mimic)
##'
##' }
##'
##' @export
permuteMeasEq <- function(nPermute, modelType = c("mgcfa","mimic"),
con, uncon = NULL, null = NULL,
param = NULL, freeParam = NULL, covariates = NULL,
AFIs = NULL, moreAFIs = NULL,
maxSparse = 10, maxNonconv = 10, showProgress = TRUE,
warn = -1, datafun, extra,
parallelType = c("none","multicore","snow"),
ncpus = NULL, cl = NULL, iseed = 12345) {
## save arguments from call
availableArgs <- as.list(formals(permuteMeasEq))
argNames <- names(availableArgs)
if (missing(datafun)) argNames <- setdiff(argNames, "datafun")
if (missing(extra)) argNames <- setdiff(argNames, "extra")
for (aa in argNames) {
if (!is.null(eval(as.name(aa))))
suppressWarnings(availableArgs[[aa]] <- eval(as.name(aa)))
}
## check and return them
fullCall <- do.call(checkPermArgs, availableArgs)
## assign them to workspace (also adds old_RNG & oldSeed to workspace)
for (aa in names(fullCall)) assign(aa, fullCall[[aa]])
###################### SAVE OBSERVED RESULTS ##########################
AFI.obs <- do.call(getAFIs, fullCall)
## save modification indices if !is.null(param)
if (is.null(param)) {
MI.obs <- data.frame(NULL)
} else MI.obs <- do.call(getMIs, fullCall)
## anything extra?
if (!missing(extra)) {
extraArgs <- formals(extra)
neededArgs <- intersect(names(extraArgs), names(fullCall))
extraArgs <- do.call(c, lapply(neededArgs, function(nn) fullCall[nn]))
extraOut <- do.call(extra, extraArgs)
## check that extra() returns a named list of scalars
if (!is.list(extraOut)) extraOut <- as.list(extraOut)
wrongFormat <- paste('Function "extra" must return a numeric vector or a',
'list of scalars, with each element named.')
if (!all(sapply(extraOut, is.numeric))) stop(wrongFormat)
if (!all(sapply(extraOut, length) == 1L)) stop(wrongFormat)
if (is.null(names(extraOut)) | any(names(extraOut) == "")) stop(wrongFormat)
extra.obs <- do.call(c, extraOut)
} else extra.obs <- numeric(length = 0L)
######################### PREP DATA ##############################
argList <- fullCall[c("con","uncon","null","param","freeParam","covariates",
"AFIs","moreAFIs","maxSparse","maxNonconv","warn","iseed")]
argList$G <- lavInspect(con, "group")
## check for categorical variables
# catVars <- lavaan::lavNames(con, type = "ov.ord")
# numVars <- lavaan::lavNames(con, type = "ov.num")
# latentVars <- lavaan::lavNames(con, type = "lv.regular")
## assemble data to which the models were fit
if (length(argList$G)) {
dataList <- mapply(FUN = function(x, g, n) {
y <- data.frame(as.data.frame(x), g, stringsAsFactors = FALSE)
names(y) <- c(n, argList$G)
y
}, SIMPLIFY = FALSE,
x = lavInspect(con, "data"), g = lavInspect(con, "group.label"),
n = lavaan::lavNames(con, type = "ov",
group = seq_along(lavInspect(con, "group.label"))))
argList$d <- do.call(rbind, dataList)
} else {
argList$d <- as.data.frame(lavInspect(con, "data"))
names(argList$d) <- lavaan::lavNames(con, type = "ov")
}
## check that covariates are actual variables
if (modelType == "mimic") {
if (length(covariates) && !all(covariates %in% names(argList$d)))
stop('These specified covariates are not columns in the data.frame:\n',
paste(setdiff(covariates, names(argList$d)), collapse = ", "))
}
## anything extra?
if (!missing(extra)) argList$extra <- extra
if (!missing(datafun)) argList$datafun <- datafun
###################### PERMUTED RESULTS ###########################
## permute and return distributions of (delta)AFIs, largest MI, and extras
if (showProgress) {
mypb <- utils::txtProgressBar(min = 1, max = nPermute, initial = 1,
char = "=", width = 50, style = 3, file = "")
permuDist <- list()
for (j in 1:nPermute) {
permuDist[[j]] <- do.call(paste("permuteOnce", modelType, sep = "."),
args = c(argList, i = j))
utils::setTxtProgressBar(mypb, j)
}
close(mypb)
} else if (parallelType == "multicore") {
if (length(iseed)) set.seed(iseed)
argList$FUN <- as.name(paste("permuteOnce", modelType, sep = "."))
argList$X <- 1:nPermute
argList$mc.cores <- ncpus
argList$mc.set.seed <- TRUE
pmcl <- function(...) { parallel::mclapply(...) }
permuDist <- do.call(pmcl, args = argList)
## restore old RNG type
if (fullCall$old_RNG[1] != "L'Ecuyer-CMRG") RNGkind(fullCall$old_RNG[1])
} else if (parallelType == "snow") {
stopTheCluster <- FALSE
if (is.null(cl)) {
stopTheCluster <- TRUE
cl <- parallel::makePSOCKcluster(rep("localhost", ncpus))
}
parallel::clusterSetRNGStream(cl, iseed = iseed)
argList$cl <- cl
argList$X <- 1:nPermute
argList$fun <- paste("permuteOnce", modelType, sep = ".")
parallel::clusterExport(cl, varlist = c(argList$fun, "getAFIs","getMIs")) #FIXME: need update?
tempppl <- function(...) { parallel::parLapply(...) }
permuDist <- do.call(tempppl, args = argList)
if (stopTheCluster) parallel::stopCluster(cl)
## restore old RNG type
if (fullCall$old_RNG[1] != "L'Ecuyer-CMRG") RNGkind(fullCall$old_RNG[1])
} else {
argList$X <- 1:nPermute
argList$FUN <- paste("permuteOnce", modelType, sep = ".")
permuDist <- do.call(lapply, args = argList)
}
## extract AFI distribution
if (length(AFI.obs) > 1) {
AFI.dist <- as.data.frame(t(sapply(permuDist, function(x) x$AFI)))
}
if (length(AFI.obs) == 1L) {
AFI.dist <- data.frame(sapply(permuDist, function(x) x$AFI))
colnames(AFI.dist) <- names(AFI.obs)
}
## identify badness-of-fit measures
badness <- grepl(pattern = "fmin|chi|aic|bic|rmr|rmsea|cn|sic|hqc",
x = names(AFI.obs), ignore.case = TRUE)
## calculate all one-directional p-values
AFI.pval <- mapply(FUN = function(x, y, b) {
if (b) return(mean(x >= y, na.rm = TRUE))
mean(x <= y, na.rm = TRUE)
}, x = unclass(AFI.dist), y = AFI.obs, b = badness)
## extract distribution of maximum modification indices
MI.dist <- as.numeric(unlist(lapply(permuDist, function(x) x$MI)))
## calculate Tukey-adjusted p values for modification indices
if (!is.null(param)) {
MI.obs$tukey.p.value <- sapply(MI.obs$X2,
function(i) mean(i <= MI.dist, na.rm = TRUE))
MI.obs <- as.data.frame(unclass(MI.obs))
rownames(MI.obs) <- names(param)
}
## anything extra?
if (!missing(extra)) {
extra.dist <- do.call(rbind, lapply(permuDist, function(x) x$extra))
} else extra.dist <- data.frame(NULL)
## save parameter table for show/summary methods
PT <- as.data.frame(parTable(con))
PT$par <- paste0(PT$lhs, PT$op, PT$rhs)
if (length(lavInspect(con, "group")))
PT$group.label[PT$group > 0] <- lavInspect(con, "group.label")[PT$group[PT$group > 0] ]
## return observed results, permutation p values, and ANOVA results
if (is.null(uncon)) {
delta <- lavaan::anova(con)
} else {
delta <- lavaan::anova(uncon, con)
}
ANOVA <- sapply(delta[,c("Chisq diff","Df diff","Pr(>Chisq)")], function(x) x[2])
out <- new("permuteMeasEq", PT = PT, modelType = modelType, ANOVA = ANOVA,
AFI.obs = AFI.obs, AFI.dist = AFI.dist, AFI.pval = AFI.pval,
MI.obs = MI.obs, MI.dist = MI.dist,
extra.obs = extra.obs, extra.dist = extra.dist,
n.Permutations = nPermute, n.Converged = sum(!is.na(AFI.dist[,1])),
n.nonConverged = sapply(permuDist, function(x) x$n.nonConverged),
n.Sparse = sapply(permuDist, function(x) x$n.Sparse),
oldSeed = fullCall$oldSeed)
out
}
## ----------------
## Hidden Functions
## ----------------
## function to check validity of arguments to permuteMeasEq()
#' @importFrom lavaan lavInspect parTable
checkPermArgs <- function(nPermute, modelType, con, uncon, null,
param, freeParam, covariates, AFIs, moreAFIs,
maxSparse, maxNonconv, showProgress, warn,
datafun, extra, parallelType, ncpus, cl, iseed) {
fixedCall <- as.list(match.call())[-1]
fixedCall$nPermute <- as.integer(nPermute[1])
fixedCall$modelType <- modelType[1]
if (!fixedCall$modelType %in% c("mgcfa","mimic","long"))
stop('modelType must be one of c("mgcfa","mimic","long")')
if (fixedCall$modelType == "long") stop('modelType "long" is not yet available.')
if (fixedCall$modelType == "mgcfa" && lavInspect(con, "ngroups") == 1L)
stop('modelType = "mgcfa" applies only to multigroup models.')
if (fixedCall$modelType == "mimic") {
uncon <- NULL
fixedCall$uncon <- NULL
fixedCall <- c(fixedCall, list(uncon = NULL))
}
## strip white space
if (is.list(param)) {
fixedCall$param <- lapply(param, function(cc) gsub("[[:space:]]+", "", cc))
} else if (!is.null(param)) fixedCall$param <- gsub("[[:space:]]+", "", param)
if (!is.null(freeParam)) fixedCall$freeParam <- gsub("[[:space:]]+", "", freeParam)
if (fixedCall$modelType == "mimic") {
# PT <- lavaan::lavaanify(fixedCall$param)
# checkCovs <- unique(PT$rhs[PT$op == "~"])
# if (is.null(covariates)) covariates <- checkCovs
# if (length(setdiff(covariates, checkCovs)))
# warning('Argument "covariates" includes predictors not in argument "param"')
##### ordVars <- lavaan::lavNames(con, type = "ov.ord")
fixedCall$covariates <- as.character(covariates)
}
fixedCall$maxSparse <- as.integer(maxSparse[1])
fixedCall$maxNonconv <- as.integer(maxNonconv[1])
fixedCall$showProgress <- as.logical(showProgress[1])
fixedCall$warn <- as.integer(warn[1])
fixedCall$oldSeed <- as.integer(NULL)
parallelType <- as.character(parallelType[1])
if (!parallelType %in% c("none","multicore","snow")) parallelType <- "none"
if (!is.null(cl)) {
if (!is(cl, "cluster")) stop("Invalid cluster object. Check class(cl)")
parallelType <- "snow"
ncpus <- length(cl)
}
if (parallelType == "multicore" && .Platform$OS.type == "windows") {
parallelType <- "snow"
message("'multicore' option unavailable on Windows. Using 'snow' instead.")
}
## parallel settings, adapted from boot::boot()
if (parallelType != "none") {
if (is.null(ncpus) || ncpus > parallel::detectCores()) {
ncpus <- parallel::detectCores() - 1
}
if (ncpus <= 1L) {
parallelType <- "none"
} else {
fixedCall$showProgress <- FALSE
fixedCall$old_RNG <- RNGkind()
fixedCall$oldSeed <- .Random.seed
if (fixedCall$old_RNG[1] != "L'Ecuyer-CMRG") {
RNGkind("L'Ecuyer-CMRG")
message("Your RNGkind() was changed from ", fixedCall$old_RNG[1],
" to L'Ecuyer-CMRG, which is required for reproducibility ",
" in parallel jobs. Your RNGkind() has been returned to ",
fixedCall$old_RNG[1], " but the seed has not been set. ",
" The state of your previous RNG is saved in the slot ",
" named 'oldSeed', if you want to restore it using ",
" the syntax:\n",
".Random.seed[-1] <- permuteMeasEqObject@oldSeed[-1]")
}
fixedCall$iseed <- as.integer(iseed[1])
if (is.na(fixedCall$iseed)) fixedCall$iseed <- 12345
}
}
fixedCall$parallelType <- parallelType
if (is.null(ncpus)) {
fixedCall$ncpus <- NULL
fixedCall <- c(fixedCall, list(ncpus = NULL))
} else fixedCall$ncpus <- ncpus
## Check that "param" is NULL if uncon is NULL, and check for lavaan class.
## Also check that models are fitted to raw data, not summary stats.
notLavaan <- "Non-NULL 'con', 'uncon', or 'null' must be fitted lavaan object."
notRawData <- "lavaan models ('con', 'uncon', or 'null') must be fitted to raw data=, not summary statistics (e.g., sample.cov=)"
if (is.null(uncon)) {
if (!is.null(fixedCall$param) && fixedCall$modelType == "mgcfa") {
message(c(" When 'uncon = NULL', only configural invariance is tested.",
"\n So the 'param' argument was changed to NULL."))
fixedCall$param <- NULL
fixedCall <- c(fixedCall, list(param = NULL))
}
if (!inherits(con, "lavaan")) stop(notLavaan)
if (con@Data@data.type != "full") stop(notRawData)
} else {
if (!inherits(con, "lavaan")) stop(notLavaan)
if (!inherits(uncon, "lavaan")) stop(notLavaan)
if ( con@Data@data.type != "full") stop(notRawData)
if (uncon@Data@data.type != "full") stop(notRawData)
}
if (!is.null(null)) {
if (!inherits(null, "lavaan")) stop(notLavaan)
if (null@Data@data.type != "full") stop(notRawData)
}
############ FIXME: check that lavInspect(con, "options")$conditional.x = FALSE (find defaults for continuous/ordered indicators)
if (!is.null(fixedCall$param)) {
## Temporarily warn about testing thresholds without necessary constraints. FIXME: check for binary indicators
if ("thresholds" %in% fixedCall$param | any(grepl("\\|", fixedCall$param))) {
warning(c("This function is not yet optimized for testing thresholds.\n",
"Necessary identification contraints might not be specified."))
}
## collect parameter types for "mgcfa"
if (fixedCall$modelType != "mimic") {
## save all estimates from constrained model
PT <- parTable(con)[ , c("lhs","op","rhs","group","plabel")]
## extract parameters of interest
paramTypes <- c("loadings","intercepts","thresholds","residuals","means",
"residual.covariances","lv.variances","lv.covariances")
params <- PT[paste0(PT$lhs, PT$op, PT$rhs) %in% setdiff(fixedCall$param,
paramTypes), ]
## add parameters by type, if any are specified
types <- intersect(fixedCall$param, paramTypes)
ov.names <- lavaan::lavNames(con, "ov")
isOV <- PT$lhs %in% ov.names
lv.names <- con@pta$vnames$lv[[1]]
isLV <- PT$lhs %in% lv.names & PT$rhs %in% lv.names
if ("loadings" %in% types) params <- rbind(params, PT[PT$op == "=~", ])
if ("intercepts" %in% types) {
params <- rbind(params, PT[isOV & PT$op == "~1", ])
}
if ("thresholds" %in% types) params <- rbind(params, PT[PT$op == "|", ])
if ("residuals" %in% types) {
params <- rbind(params, PT[isOV & PT$lhs == PT$rhs & PT$op == "~~", ])
}
if ("residual.covariances" %in% types) {
params <- rbind(params, PT[isOV & PT$lhs != PT$rhs & PT$op == "~~", ])
}
if ("means" %in% types) {
params <- rbind(params, PT[PT$lhs %in% lv.names & PT$op == "~1", ])
}
if ("lv.variances" %in% types) {
params <- rbind(params, PT[isLV & PT$lhs == PT$rhs & PT$op == "~~", ])
}
if ("lv.covariances" %in% types) {
params <- rbind(params, PT[isLV & PT$lhs != PT$rhs & PT$op == "~~", ])
}
## remove parameters specified by "freeParam" argument
params <- params[!paste0(params$lhs, params$op, params$rhs) %in% fixedCall$freeParam, ]
fixedCall$param <- paste0(params$lhs, params$op, params$rhs)
}
}
if (is.null(AFIs) && is.null(moreAFIs)) {
message("No AFIs were selected, so only chi-squared will be permuted.\n")
fixedCall$AFIs <- "chisq"
AFIs <- "chisq"
}
if ("ecvi" %in% AFIs & lavInspect(con, "ngroups") > 1L)
stop("ECVI is not available for multigroup models.")
## check estimators
leastSq <- grepl("LS", lavInspect(con, "options")$estimator)
if (!is.null(uncon)) {
if (lavInspect(uncon, "options")$estimator != lavInspect(con, "options")$estimator)
stop("Models must be fit using same estimator.")
}
if (!is.null(null)) {
if (lavInspect(null, "options")$estimator != lavInspect(con, "options")$estimator)
stop("Models must be fit using same estimator.")
}
## check extra functions, if any
restrictedArgs <- c("con","uncon","null","param","freeParam","covariates",
"AFIs","moreAFIs","maxSparse","maxNonconv","iseed")
if (!missing(datafun)) {
if (!is.function(datafun)) stop('Argument "datafun" must be a function.')
extraArgs <- formals(datafun)
if (!all(names(extraArgs) %in% c(restrictedArgs, "data")))
stop('The user-supplied function "datafun" may only have arguments among ',
'the following:\n', paste(c(restrictedArgs, "data"), collapse = ", "))
}
if (!missing(extra)) {
if (!is.function(extra)) stop('Argument "extra" must be a function.')
extraArgs <- formals(extra)
if (!all(names(extraArgs) %in% restrictedArgs))
stop('The user-supplied function "extra" may only have arguments among ',
'the following:\n', paste(restrictedArgs, collapse = ", "))
}
## return evaluated list of other arguments
lapply(fixedCall, eval)
}
## function to extract fit measures
#' @importFrom lavaan lavInspect
getAFIs <- function(...) {
dots <- list(...)
AFI1 <- list()
AFI0 <- list()
leastSq <- grepl("LS", lavInspect(dots$con, "options")$estimator)
## check validity of user-specified AFIs, save output
if (!is.null(dots$AFIs)) {
IC <- grep("ic|logl", dots$AFIs, value = TRUE)
if (leastSq && length(IC) > 0L) {
stop(paste("Argument 'AFIs' includes invalid options:",
paste(IC, collapse = ", "),
"Information criteria unavailable for least-squares estimators.",
sep = "\n"))
}
if (!is.null(dots$uncon))
AFI1[[1]] <- lavaan::fitMeasures(dots$uncon, fit.measures = dots$AFIs,
baseline.model = dots$null)
AFI0[[1]] <- lavaan::fitMeasures(dots$con, fit.measures = dots$AFIs,
baseline.model = dots$null)
}
## check validity of user-specified moreAFIs
if (!is.null(dots$moreAFIs)) {
IC <- grep("ic|hqc", dots$moreAFIs, value = TRUE)
if (leastSq && length(IC) > 0L) {
stop(paste("Argument 'moreAFIs' includes invalid options:",
paste(IC, collapse = ", "),
"Information criteria unavailable for least-squares estimators.",
sep = "\n"))
}
if (!is.null(dots$uncon))
AFI1[[2]] <- moreFitIndices(dots$uncon, fit.measures = dots$moreAFIs)
AFI0[[2]] <- moreFitIndices(dots$con, fit.measures = dots$moreAFIs)
}
## save observed AFIs or delta-AFIs
if (is.null(dots$uncon)) {
AFI.obs <- unlist(AFI0)
} else {
AFI.obs <- unlist(AFI0) - unlist(AFI1)
}
AFI.obs
}
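## e.g., with a non-NULL uncon and AFIs = c("cfi","rmsea"), AFI.obs holds the
## observed differences (constrained minus unconstrained) in each fit measure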
## Function to extract modification indices for equality constraints
#' @importFrom lavaan parTable
getMIs <- function(...) {
dots <- list(...)
if (dots$modelType == "mgcfa") {
## save all estimates from constrained model
PT <- parTable(dots$con)[ , c("lhs","op","rhs","group","plabel")]
## extract parameters of interest
params <- PT[paste0(PT$lhs, PT$op, PT$rhs) %in% dots$param, ]
## return modification indices for specified constraints (param)
MIs <- lavaan::lavTestScore(dots$con)$uni
MI.obs <- MIs[MIs$lhs %in% params$plabel, ]
} else if (dots$modelType == "mimic") {
if (is.list(dots$param)) {
MI <- lapply(dots$param, function(x) lavaan::lavTestScore(dots$con, add = x)$test)
MI.obs <- do.call(rbind, MI)
} else MI.obs <- lavaan::lavTestScore(dots$con, add = dots$param)$uni
} else if (dots$modelType == "long") {
## coming soon
}
MI.obs
}
## Functions to find delta-AFIs & maximum modification index in one permutation
permuteOnce.mgcfa <- function(i, d, G, con, uncon, null, param, freeParam,
covariates, AFIs, moreAFIs, maxSparse, maxNonconv,
iseed, warn, extra = NULL, datafun = NULL) {
old_warn <- options()$warn
options(warn = warn)
## save arguments from call
argNames <- names(formals(permuteOnce.mgcfa))
availableArgs <- lapply(argNames, function(x) eval(as.name(x)))
names(availableArgs) <- argNames
group.label <- lavaan::lavInspect(con, "group.label")
nSparse <- 0L
nTries <- 1L
while (nSparse <= maxSparse && nTries <= maxNonconv) {
## permute grouping variable
d[ , G] <- sample(d[ , G])
## transform data?
if (!is.null(datafun)) {
extraArgs <- formals(datafun)
neededArgs <- intersect(names(extraArgs), names(availableArgs))
extraArgs <- do.call(c, lapply(neededArgs, function(nn) availableArgs[nn]))
extraArgs$data <- d
originalNames <- colnames(d)
d <- do.call(datafun, extraArgs)
## check that datafun returned a usable data.frame
if (!is.data.frame(d)) stop('Argument "datafun" did not return a data.frame')
if (!all(originalNames %in% colnames(d)))
stop('The data.frame returned by argument "datafun" did not contain ',
'column names required by the model:\n',
paste(setdiff(originalNames, colnames(d)), collapse = ", "))
}
## for ordered indicators, check that groups have same observed categories
ordVars <- lavaan::lavNames(con, type = "ov.ord")
if (length(ordVars) > 0) {
onewayTables <- try(lavaan::lavTables(d, dimension = 1L,
categorical = ordVars, group = G),
silent = TRUE)
## a lavTables() error probably indicates empty categories in 1+ groups
if (inherits(onewayTables, "try-error") || any(onewayTables$obs.prop == 1)) {
nSparse <- nSparse + 1L
next
}
}
## fit null model, if it exists
if (!is.null(null)) {
out.null <- lavaan::update(null, data = d, group.label = group.label)
}
## fit constrained model, check for convergence. Assign the try() result
## directly, so a stale fit from an earlier iteration cannot pass the check.
out0 <- try(lavaan::update(con, data = d, group.label = group.label))
if (inherits(out0, "try-error") || !lavaan::lavInspect(out0, "converged")) {
nTries <- nTries + 1L
next
}
## fit unconstrained model (unless NULL), check for convergence
if (!is.null(uncon)) {
out1 <- try(lavaan::update(uncon, data = d, group.label = group.label))
if (inherits(out1, "try-error") || !lavaan::lavInspect(out1, "converged")) {
nTries <- nTries + 1L
next
}
}
## If you get this far, everything converged, so break WHILE loop
break
}
## if WHILE loop ended before getting results, return NA
if (nSparse > maxSparse || nTries > maxNonconv) {
allAFIs <- c(AFIs, moreAFIs)
AFI <- rep(NA, sum(!is.na(allAFIs)))
names(AFI) <- allAFIs[!is.na(allAFIs)]
MI <- if (is.null(param)) NULL else NA
extra.obs <- NA
nTries <- nTries + 1L
} else {
availableArgs$con <- out0
if (exists("out1")) availableArgs$uncon <- out1
if (exists("out.null")) availableArgs$null <- out.null
AFI <- do.call(getAFIs, availableArgs)
## save max(MI) if !is.null(param)
if (is.null(param)) {
MI <- NULL
} else {
MI <- max(do.call(getMIs, c(availableArgs, modelType = "mgcfa"))$X2)
}
## anything extra?
if (!is.null(extra)) {
extraArgs <- formals(extra)
neededArgs <- intersect(names(extraArgs), names(availableArgs))
extraArgs <- do.call(c, lapply(neededArgs, function(nn) availableArgs[nn]))
extraOut <- do.call(extra, extraArgs)
## coerce extraOut to data.frame
if (!is.list(extraOut)) extraOut <- as.list(extraOut)
extra.obs <- data.frame(extraOut)
} else extra.obs <- data.frame(NULL)
}
options(warn = old_warn)
list(AFI = AFI, MI = MI, extra = extra.obs,
n.nonConverged = nTries - 1L, n.Sparse = nSparse)
}
permuteOnce.mimic <- function(i, d, G, con, uncon, null, param, freeParam,
covariates, AFIs, moreAFIs, maxSparse, maxNonconv,
iseed, warn, extra = NULL, datafun = NULL) {
old_warn <- options()$warn
options(warn = warn)
## save arguments from call
argNames <- names(formals(permuteOnce.mimic))
availableArgs <- lapply(argNames, function(x) eval(as.name(x)))
names(availableArgs) <- argNames
group.label <- lavaan::lavInspect(con, "group.label")
nTries <- 1L
while (nTries <= maxNonconv) {
## permute covariate(s) within each group
if (length(G)) {
for (gg in group.label) {
dG <- d[ d[[G]] == gg, ]
N <- nrow(dG)
newd <- dG[sample(1:N, N), covariates, drop = FALSE]
for (COV in covariates) d[d[[G]] == gg, COV] <- newd[ , COV]
}
} else {
N <- nrow(d)
newd <- d[sample(1:N, N), covariates, drop = FALSE]
for (COV in covariates) d[ , COV] <- newd[ , COV]
}
## transform data?
if (!is.null(datafun)) {
extraArgs <- formals(datafun)
neededArgs <- intersect(names(extraArgs), names(availableArgs))
extraArgs <- do.call(c, lapply(neededArgs, function(nn) availableArgs[nn]))
extraArgs$data <- d
originalNames <- colnames(d)
d <- do.call(datafun, extraArgs)
## check that datafun returned a usable data.frame
if (!is.data.frame(d)) stop('Argument "datafun" did not return a data.frame')
if (!all(originalNames %in% colnames(d)))
stop('The data.frame returned by argument "datafun" did not contain ',
'column names required by the model:\n',
paste(setdiff(originalNames, colnames(d)), collapse = ", "))
}
## fit null model, if it exists
if (!is.null(null)) {
out.null <- lavaan::update(null, data = d, group.label = group.label)
}
## fit constrained model
out0 <- try(lavaan::update(con, data = d, group.label = group.label))
## check for convergence (the try() result is assigned directly, so a stale
## fit from an earlier iteration cannot pass the check after an error)
if (inherits(out0, "try-error") || !lavaan::lavInspect(out0, "converged")) {
nTries <- nTries + 1L
next
}
## If you get this far, everything converged, so break WHILE loop
break
}
## if WHILE loop ended before getting results, return NA
if (nTries > maxNonconv) {
allAFIs <- c(AFIs, moreAFIs)
AFI <- rep(NA, sum(!is.na(allAFIs)))
names(AFI) <- allAFIs[!is.na(allAFIs)]
MI <- if (is.null(param)) NULL else NA
extra.obs <- NA
nTries <- nTries + 1L
} else {
availableArgs$con <- out0
if (exists("out.null")) availableArgs$null <- out.null
AFI <- do.call(getAFIs, availableArgs)
if (is.null(param)) {
MI <- NULL
} else {
MI <- max(do.call(getMIs, c(availableArgs, modelType = "mimic"))$X2)
}
## anything extra?
if (!is.null(extra)) {
extraArgs <- formals(extra)
neededArgs <- intersect(names(extraArgs), names(availableArgs))
extraArgs <- do.call(c, lapply(neededArgs, function(nn) availableArgs[nn]))
extraOut <- do.call(extra, extraArgs)
## coerce extraOut to data.frame
if (!is.list(extraOut)) extraOut <- as.list(extraOut)
extra.obs <- data.frame(extraOut)
} else extra.obs <- data.frame(NULL)
}
options(warn = old_warn)
list(AFI = AFI, MI = MI, extra = extra.obs,
n.nonConverged = nTries - 1L, n.Sparse = integer(length = 0))
}
semTools/R/residualCovariate.R 0000644 0001762 0000144 00000003576 14632143377 016076 0 ustar ligges users
### Sunthud Pornprasertmanit
### Last updated: 10 January 2021
##' Residual-center all target indicators by covariates
##'
##' This function will regress target variables on the covariate and replace the
##' target variables by the residual of the regression analysis. This procedure
##' is useful to control the covariate from the analysis model (Geldhof,
##' Pornprasertmanit, Schoemann, & Little, 2013).
##'
##'
##' @importFrom stats lm
##'
##' @param data The desired data to be transformed.
##' @param targetVar Variable names or the position of indicators that users wish
##' to be residual centered (as dependent variables)
##' @param covVar Covariate names or the position of the covariates used for
##' residual centering (as independent variables) of the target variables
##'
##' @return The data set, with the target variables replaced by their residuals
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##' @seealso [indProd()] For creating the indicator products with no
##' centering, mean centering, double-mean centering, or residual centering.
##'
##' @references Geldhof, G. J., Pornprasertmanit, S., Schoemann, A. M., &
##' Little, T. D. (2013). Orthogonalizing through residual centering:
##' Extended applications and caveats. *Educational and Psychological
##' Measurement, 73*(1), 27--46. \doi{10.1177/0013164412445473}
##'
##' @examples
##'
##' dat <- residualCovariate(attitude, 2:7, 1)
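##'
##' ## residuals from lm() are orthogonal to the predictors, so the centered
##' ## targets should now correlate (near) zero with the covariate "rating"
##' round(cor(dat)[ , "rating"], 3)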
##'
##' @export
residualCovariate <- function(data, targetVar, covVar) {
x <- as.list(match.call())
cov <- eval(x$covVar)
target <- eval(x$targetVar)
if (is.numeric(cov)) cov <- colnames(data)[cov]
if (is.numeric(target)) target <- colnames(data)[target]
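## build a multivariate-response formula string,
## e.g., "cbind(complaints, privileges) ~ rating"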
express <- paste("cbind(", paste(target, collapse = ", "), ") ~ ",
paste(cov, collapse = " + "), sep = "")
data[, target] <- lm(express, data = data)$residuals
return(data)
}
semTools/R/powerAnalysisSS.R 0000644 0001762 0000144 00000020014 14632143377 015520 0 ustar ligges users
### Alexander M. Schoemann & Terrence D. Jorgensen
### Last updated: 2 June 2022
### Function to apply Satorra & Saris method for chi-squared power analysis
## Steps:
## 1. Specify model (use lavaan syntax based on simulateData)
## 2. get model implied covariance matrix
## 3. Fit model with parameter constrained to 0 (or take a model specification for multiparameter tests?)
## 4. Use chi square from step 3 as non-centrality parameter to get power.
## Alternatively, combine steps 1 and 2 by providing population moments directly
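## A toy illustration of step 4 (assuming a hypothetical NCP of 7.5 from a
## single invalid constraint, alpha = .05):
## critVal <- qchisq(.05, df = 1, lower.tail = FALSE) # 3.84
## pchisq(critVal, df = 1, ncp = 7.5, lower.tail = FALSE) # power ~ .78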
##' Power for model parameters
##'
##' Apply Satorra & Saris (1985) method for chi-squared power analysis.
##'
##' Specify all non-zero parameters in a population model, either by using
##' lavaan syntax (`popModel`) or by submitting a population covariance
##' matrix (`Sigma`) and optional mean vector (`mu`) implied by the
##' population model. Then specify an analysis model that places at least
##' one invalid constraint (indicate how many in the `nparam` argument).
##'
##' There is also a Shiny app called "power4SEM" that provides a graphical user
##' interface for this functionality (Jak et al., 2021). It can be accessed
##' at <https://sjak.shinyapps.io/power4SEM/>.
##'
##'
##' @importFrom stats qchisq pchisq
##'
##' @param powerModel lavaan [lavaan::model.syntax()] for the model to
##' be analyzed. This syntax should constrain at least one nonzero parameter
##' to 0 (or another number).
##' @param n `integer`. Sample size used in power calculation, or a vector
##' of sample sizes if analyzing a multigroup model. If
##' `length(n) < length(Sigma)` when `Sigma` is a list, `n` will
##' be recycled. If `popModel` is used instead of `Sigma`, `n`
##' must specify a sample size for each group, because that is used to infer
##' the number of groups.
##' @param nparam `integer`. Number of invalid constraints in `powerModel`.
##' @param popModel lavaan [lavaan::model.syntax()] specifying the
##' data-generating model. This syntax should specify values for all nonzero
##' parameters in the model. If `length(n) > 1`, the same population
##' values will be used for each group, unless different population values are
##' specified per group, either in the lavaan [lavaan::model.syntax()]
##' or by utilizing a list of `Sigma` (and optionally `mu`).
##' @param mu `numeric` or `list`. For a single-group model, a vector
##' of population means. For a multigroup model, a list of vectors (one per
##' group). If `mu` and `popModel` are missing, mean structure will
##' be excluded from the analysis.
##' @param Sigma `matrix` or `list`. For a single-group model,
##' a population covariance matrix. For a multigroup model, a list of matrices
##' (one per group). If missing, `popModel` will be used to generate a
##' model-implied Sigma.
##' @param fun character. Name of `lavaan` function used to fit
##' `powerModel` (i.e., `"cfa"`, `"sem"`, `"growth"`, or
##' `"lavaan"`).
##' @param alpha Type I error rate used to set a criterion for rejecting H0.
##' @param ... additional arguments to pass to [lavaan::lavaan()].
##' See also [lavaan::lavOptions()].
##'
##' @author
##' Alexander M. Schoemann (East Carolina University; \email{schoemanna@@ecu.edu})
##'
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @references
##' Satorra, A., & Saris, W. E. (1985). Power of the likelihood ratio
##' test in covariance structure analysis. *Psychometrika, 50*(1), 83--90.
##' \doi{10.1007/BF02294150}
##'
##' Jak, S., Jorgensen, T. D., Verdam, M. G., Oort, F. J., & Elffers, L.
##' (2021). Analytical power calculations for structural equation modeling:
##' A tutorial and Shiny app. *Behavior Research Methods, 53*, 1385--1406.
##' \doi{10.3758/s13428-020-01479-0}
##'
##' @examples
##' ## Specify population values. Note every parameter has a fixed value.
##' modelP <- '
##' f1 =~ .7*V1 + .7*V2 + .7*V3 + .7*V4
##' f2 =~ .7*V5 + .7*V6 + .7*V7 + .7*V8
##' f1 ~~ .3*f2
##' f1 ~~ 1*f1
##' f2 ~~ 1*f2
##' V1 ~~ .51*V1
##' V2 ~~ .51*V2
##' V3 ~~ .51*V3
##' V4 ~~ .51*V4
##' V5 ~~ .51*V5
##' V6 ~~ .51*V6
##' V7 ~~ .51*V7
##' V8 ~~ .51*V8
##' '
##' ## Specify analysis model. Note parameter of interest f1~~f2 is fixed to 0.
##' modelA <- '
##' f1 =~ V1 + V2 + V3 + V4
##' f2 =~ V5 + V6 + V7 + V8
##' f1 ~~ 0*f2
##' '
##' ## Calculate power
##' SSpower(powerModel = modelA, popModel = modelP, n = 150, nparam = 1,
##' std.lv = TRUE)
##'
##' ## Get power for a range of sample sizes
##' Ns <- seq(100, 500, 40)
##' Power <- rep(NA, length(Ns))
##' for(i in 1:length(Ns)) {
##' Power[i] <- SSpower(powerModel = modelA, popModel = modelP,
##' n = Ns[i], nparam = 1, std.lv = TRUE)
##' }
##' plot(x = Ns, y = Power, type = "l", xlab = "Sample Size")
##'
##'
##' ## Optionally specify different values for multiple populations
##'
##' modelP2 <- '
##' f1 =~ .7*V1 + .7*V2 + .7*V3 + .7*V4
##' f2 =~ .7*V5 + .7*V6 + .7*V7 + .7*V8
##' f1 ~~ c(-.3, .3)*f2 # DIFFERENT ACROSS GROUPS
##' f1 ~~ 1*f1
##' f2 ~~ 1*f2
##' V1 ~~ .51*V1
##' V2 ~~ .51*V2
##' V3 ~~ .51*V3
##' V4 ~~ .51*V4
##' V5 ~~ .51*V5
##' V6 ~~ .51*V6
##' V7 ~~ .51*V7
##' V8 ~~ .51*V8
##' '
##' modelA2 <- '
##' f1 =~ V1 + V2 + V3 + V4
##' f2 =~ V5 + V6 + V7 + V8
##' f1 ~~ c(psi21, psi21)*f2 # EQUALITY CONSTRAINT ACROSS GROUPS
##' '
##' ## Calculate power
##' SSpower(powerModel = modelA2, popModel = modelP2, n = c(100, 100), nparam = 1,
##' std.lv = TRUE)
##' ## Get power for a range of sample sizes
##' Ns2 <- cbind(Group1 = seq(10, 100, 10), Group2 = seq(10, 100, 10))
##' Power2 <- apply(Ns2, MARGIN = 1, FUN = function(nn) {
##' SSpower(powerModel = modelA2, popModel = modelP2, n = nn,
##' nparam = 1, std.lv = TRUE)
##' })
##' plot(x = rowSums(Ns2), y = Power2, type = "l", xlab = "Total Sample Size",
##' ylim = 0:1)
##' abline(h = c(.8, .9), lty = c("dotted","dashed"))
##' legend("bottomright", c("80% Power","90% Power"), lty = c("dotted","dashed"))
##'
##' @export
SSpower <- function(powerModel, n, nparam, popModel, mu, Sigma,
fun = "sem", alpha = .05, ...) {
if (missing(Sigma)) {
## specify (vector of) sample size(s) for optional multigroup syntax to work
popMoments <- lavaan::fitted(do.call(fun, list(model = popModel,
sample.nobs = n),
envir = getNamespace("lavaan")))
## without data, can't apply fitted() to multigroup model syntax, so
## save the same fitted moments for each group
if (length(n) > 1L) {
Sigma <- lapply(popMoments, "[[", i = "cov")
mu <- if (!is.null(popMoments[[1]]$mean)) {
lapply(popMoments, "[[", i = "mean")
} else NULL
} else {
## single group
Sigma <- popMoments$cov
mu <- popMoments$mean
}
} else {
## otherwise, user-supplied moments
if (is.list(Sigma)) {
nG <- length(Sigma)
if (length(n) < nG) n <- rep(n, length.out = nG)
if (length(n) > nG) n <- n[1:nG]
no.mu <- missing(mu)
if (!no.mu) null.mu <- any(sapply(mu, is.null))
if (no.mu || null.mu) {
mu <- NULL
}
} else if (is.matrix(Sigma)) {
n <- n[[1]]
if (missing(mu)) {
mu <- NULL
} else if (!is.null(mu) && !is.numeric(mu)) {
stop('mu must be a numeric vector, or a list (one vector per group)')
}
} else stop('Sigma must be a covariance matrix, or a list (one matrix per group)')
}
## Fit (probably constrained) analysis model
dots <- list(...)
funArgs <- list(model = powerModel, sample.cov = Sigma,
sample.mean = mu, sample.nobs = n)
useArgs <- c(funArgs, dots[setdiff(names(dots), names(funArgs))])
fit <- do.call(fun, useArgs, envir = getNamespace("lavaan"))
## get NCP from chi square
ncp <- lavaan::fitmeasures(fit)["chisq"]
## critical value under H0
critVal <- qchisq(alpha, df = nparam, lower.tail = FALSE)
## return power
pchisq(critVal, df = nparam, ncp = ncp, lower.tail = FALSE)
}
semTools/R/parcelAllocation.R 0000644 0001762 0000144 00000042730 14764334616 015703 0 ustar ligges users
### Terrence D. Jorgensen
### Last updated: 12 March 2025
##' Random Allocation of Items to Parcels in a Structural Equation Model
##'
##' @description
##' This function generates a given number of randomly generated item-to-parcel
##' allocations, fits a model to each allocation, and provides averaged results
##' over all allocations.
##'
##' @details
##' This function implements the random item-to-parcel allocation procedure
##' described in Sterba (2011) and Sterba and MacCallum (2010). The function
##' takes a single data set with item-level data, randomly assigns items to
##' parcels, fits a structural equation model to the parceled data using
##' [lavaan::lavaanList()], and repeats this process for a user-specified
##' number of random allocations. Results from all fitted models are summarized
##' in the output. For further details on the benefits of randomly allocating
##' items to parcels, see Sterba (2011) and Sterba and MacCallum (2010).
##'
##' @importFrom stats sd qnorm
##' @importFrom lavaan parTable lavInspect lavaanList lavaanify lavNames
##'
##' @param model [lavaan::lavaan()] model syntax specifying the model
##' fit to (at least some) parceled data. Note that there can be a mixture of
##' items and parcels (even within the same factor), in case certain items
##' should never be parceled. Can be a character string or parameter table.
##' Also see [lavaan::lavaanify()] for more details.
##' @param data A `data.frame` containing all observed variables appearing
##' in the `model`, as well as those in the `item.syntax` used to
##' create parcels. If the data have missing values, multiple imputation
##' before parceling is recommended: submit a stacked data set (with a variable
##' for the imputation number, so they can be separateed later) and set
##' `do.fit = FALSE` to return the list of `data.frame`s (one per
##' allocation), each of which is a stacked, imputed data set with parcels.
##' @param parcel.names `character` vector containing names of all parcels
##' appearing as indicators in `model`.
##' @param item.syntax [lavaan::model.syntax()] specifying the model
##' that would be fit to all of the unparceled items, including items that
##' should be randomly allocated to parcels appearing in `model`.
##' @param nAlloc The number of random items-to-parcels allocations to generate.
##' @param fun `character` string indicating the name of the
##' [lavaan::lavaan()] function used to fit `model` to
##' `data`. Can only take the values `"lavaan"`, `"sem"`,
##' `"cfa"`, or `"growth"`.
##' @param alpha Alpha level used as criterion for significance.
##' @param fit.measures `character` vector containing names of fit measures
##' to request from each fitted [lavaan::lavaan()] model. See the
##' output of [lavaan::fitMeasures()] for a list of available measures.
##' @param \dots Additional arguments to be passed to
##' [lavaan::lavaanList()]. See also [lavaan::lavOptions()]
##' @param show.progress If `TRUE`, show a [utils::txtProgressBar()]
##' indicating how fast the model-fitting iterates over allocations.
##' @param iseed (Optional) Random seed used for parceling items. When the same
##' random seed is specified and the program is re-run, the same allocations
##' will be generated. Using the same `iseed` argument will ensure that
##' any model is fit to the same parcel allocations. *Note*: When using
##' \pkg{parallel} options, you must first type `RNGkind("L'Ecuyer-CMRG")`
##' into the R Console, so that the seed will be controlled across cores.
##' @param do.fit If `TRUE` (default), the `model` is fitted to each
##' parceled data set, and the summary of results is returned (see the Value
##' section below). If `FALSE`, the items are randomly parceled, but the
##' model is not fit; instead, the `list` of `data.frame`s is
##' returned (so assign it to an object).
##' @param return.fit If `TRUE`, a [lavaan::lavaanList-class] object
##' is returned with the `list` of results across allocations.
##' @param warn `logical` indicating whether to print warnings when fitting
##' `model` to each allocation
##'
##' @return
##' \item{Estimates}{A `data.frame` containing results related to
##' parameter estimates with columns corresponding to their names; average
##' and standard deviation across allocations; minimum, maximum, and range
##' across allocations; and the proportion of allocations in which each
##' parameter estimate was significant.}
##' \item{SE}{A `data.frame` containing results similar to
##' `Estimates`, but related to the standard errors of parameter
##' estimates.}
##' \item{Fit}{A `data.frame` containing results related to model fit,
##' with columns corresponding to fit index names; their average and
##' standard deviation across allocations; the minimum, maximum, and range
##' across allocations; and (if the test statistic or RMSEA is included in
##' `fit.measures`) the proportion of allocations in which each
##' test of (exact or close) fit was significant.}
##' \item{Model}{A [lavaan::lavaanList-class] object containing results
##' of the `model` fitted to each parcel allocation. Only returned if
##' `return.fit = TRUE`.}
##'
##' @author
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @seealso [PAVranking()] for comparing 2 models,
##' [poolMAlloc()] for choosing the number of allocations
##'
##' @references
##'
##' Sterba, S. K. (2011). Implications of parcel-allocation
##' variability for comparing fit of item-solutions and parcel-solutions.
##' *Structural Equation Modeling, 18*(4), 554--577.
##' \doi{10.1080/10705511.2011.607073}
##'
##' Sterba, S. K. & MacCallum, R. C. (2010). Variability in parameter estimates
##' and model fit across random allocations of items to parcels.
##' *Multivariate Behavioral Research, 45*(2), 322--358.
##' \doi{10.1080/00273171003680302}
##'
##' Sterba, S. K., & Rights, J. D. (2016). Accounting for parcel-allocation
##' variability in practice: Combining sources of uncertainty and choosing the
##' number of allocations. *Multivariate Behavioral Research, 51*(2--3),
##' 296--313. \doi{10.1080/00273171.2016.1144502}
##'
##' Sterba, S. K., & Rights, J. D. (2017). Effects of parceling on model
##' selection: Parcel-allocation variability in model ranking.
##' *Psychological Methods, 22*(1), 47--68. \doi{10.1037/met0000067}
##'
##' @examples
##'
##' ## Fit 2-factor CFA to simulated data. Each factor has 9 indicators.
##'
##' ## Specify the item-level model (if NO parcels were created)
##' item.syntax <- c(paste0("f1 =~ f1item", 1:9),
##' paste0("f2 =~ f2item", 1:9))
##' cat(item.syntax, sep = "\n")
##' ## Below, we reduce the size of this same model by
##' ## applying different parceling schemes
##'
##'
##' ## 3-indicator parcels
##' mod.parcels <- '
##' f1 =~ par1 + par2 + par3
##' f2 =~ par4 + par5 + par6
##' '
##' ## names of parcels
##' (parcel.names <- paste0("par", 1:6))
##'
##' \donttest{
##' ## override default random-number generator to use parallel options
##' RNGkind("L'Ecuyer-CMRG")
##'
##' parcelAllocation(mod.parcels, data = simParcel, nAlloc = 100,
##' parcel.names = parcel.names, item.syntax = item.syntax,
##' # parallel = "multicore", # parallel available in Mac/Linux
##' std.lv = TRUE) # any additional lavaan arguments
##'
##'
##'
##' ## POOL RESULTS by treating parcel allocations as multiple imputations
##' ## Details provided in Sterba & Rights (2016); see ?poolMAlloc.
##'
##' ## save list of data sets instead of fitting model yet
##' dataList <- parcelAllocation(mod.parcels, data = simParcel, nAlloc = 100,
##' parcel.names = parcel.names,
##' item.syntax = item.syntax,
##' do.fit = FALSE)
##' ## now fit the model to each data set
##' library(lavaan.mi)
##' fit.parcels <- cfa.mi(mod.parcels, data = dataList, std.lv = TRUE)
##' summary(fit.parcels) # pooled using Rubin's rules
##' anova(fit.parcels) # pooled test statistic
##' help(package = "lavaan.mi") # find more methods for pooling results
##' }
##'
##'
##' ## multigroup example
##' simParcel$group <- 0:1 # arbitrary groups for example
##' mod.mg <- '
##' f1 =~ par1 + c(L2, L2)*par2 + par3
##' f2 =~ par4 + par5 + par6
##' '
##' ## names of parcels
##' (parcel.names <- paste0("par", 1:6))
##'
##' parcelAllocation(mod.mg, data = simParcel, parcel.names, item.syntax,
##' std.lv = TRUE, group = "group", group.equal = "loadings",
##' nAlloc = 20, show.progress = TRUE)
##'
##'
##'
##' ## parcels for first factor, items for second factor
##' mod.items <- '
##' f1 =~ par1 + par2 + par3
##' f2 =~ f2item2 + f2item7 + f2item8
##' '
##' ## names of parcels
##' (parcel.names <- paste0("par", 1:3))
##'
##' parcelAllocation(mod.items, data = simParcel, parcel.names, item.syntax,
##' nAlloc = 20, std.lv = TRUE)
##'
##'
##'
##' ## mixture of 1- and 3-indicator parcels for second factor
##' mod.mix <- '
##' f1 =~ par1 + par2 + par3
##' f2 =~ f2item2 + f2item7 + f2item8 + par4 + par5 + par6
##' '
##' ## names of parcels
##' (parcel.names <- paste0("par", 1:6))
##'
##' parcelAllocation(mod.mix, data = simParcel, parcel.names, item.syntax,
##' nAlloc = 20, std.lv = TRUE)
##'
##' @export
parcelAllocation <- function(model, data, parcel.names, item.syntax,
nAlloc = 100, fun = "sem", alpha = .05,
fit.measures = c("chisq","df","cfi",
"tli","rmsea","srmr"), ...,
show.progress = FALSE, iseed = 12345,
do.fit = TRUE, return.fit = FALSE, warn = FALSE) {
if (nAlloc < 2) stop("Minimum of two allocations required.")
if (!fun %in% c("sem","cfa","growth","lavaan"))
stop("'fun' argument must be either 'lavaan', 'cfa', 'sem', or 'growth'")
lavArgs <- list(...)
lavArgs$model <- item.syntax
lavArgs$data <- data
lavArgs$do.fit <- FALSE
## fit item-level model to data
item.fit <- do.call(fun, lavArgs)
item.PT <- parTable(item.fit)
## construct parameter table for parcel-level model
if (is.character(model)) {
## default lavaanify arguments
ptArgs <- formals(lavaanify)
## arguments passed to lavaan by user
fitArgs <- lavInspect(item.fit, "call")[-1]
## overwrite defaults with user's values
sameArgs <- intersect(names(ptArgs), names(fitArgs))
ptArgs[sameArgs] <- fitArgs[sameArgs]
ptArgs$model <- model
if (is.null(ptArgs$model.type)) ptArgs$model.type <- "sem"
if (ptArgs$model.type != "growth") ptArgs$model.type <- "sem"
ptArgs$ngroups <- lavInspect(item.fit, "ngroups")
PT <- do.call("lavaanify", ptArgs)
} else if (is.data.frame(model)) {
PT <- model
} else stop("'model' argument must be a character string of lavaan model",
" syntax or a lavaan parameter table. See ?lavaanify help page.")
## check that both models specify the same factors
## NOTE: lv.names remains unaltered, to omit latent indicators from $items
factorNames <- lv.names <- lavNames(PT, type = "lv")
if (!all(sort(lavNames(item.PT, type = "lv")) == sort(factorNames))) {
stop("'model' and 'item.syntax' arguments specify different factors.\n",
"'model' specifies: ", paste(sort(factorNames), collapse = ", "), "\n",
"'item.syntax' specifies: ", paste(sort(lavNames(item.PT,
type = "lv")),
collapse = ", "))
}
## for each factor, assign item sets to parcel sets
assignments <- list()
for (i in factorNames) {
## all indicators from parcel-level model
parcels <- PT$rhs[PT$lhs == i & PT$op == "=~"]
## all indicators from item-level model
items <- item.PT$rhs[item.PT$lhs == i & item.PT$op == "=~"]
## exclude observed indicators from parceling scheme if specified
## in parcel-level model
assignments[[i]]$parcels <- setdiff(parcels, c(names(data), lv.names))
assignments[[i]]$items <- setdiff(items , c( parcels , lv.names))
## Does this factor have parcels? If not, omit this factor from next loop
if (length(assignments[[i]]$parcels) == 0L) {
factorNames <- factorNames[-which(factorNames == i)]
next
}
## how many items per parcel?
nItems <- length(assignments[[i]]$items)
nParcels <- length(assignments[[i]]$parcels)
assignments[[i]]$nPerParcel <- rep(nItems %/% nParcels, nParcels)
if (nItems %% nParcels > 0) for (j in 1:(nItems %% nParcels)) {
assignments[[i]]$nPerParcel[j] <- assignments[[i]]$nPerParcel[j] + 1
}
names(assignments[[i]]$nPerParcel) <- assignments[[i]]$parcels
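## e.g., 10 items across 3 parcels yields nPerParcel = c(4, 3, 3)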
}
## for each allocation, create parcels from items
dataList <- list()
for (i in 1:nAlloc) {
dataList[[i]] <- data
for (j in factorNames) {
## create a random assignment pattern
ranAss <- sample(rep(names(assignments[[j]]$nPerParcel),
times = assignments[[j]]$nPerParcel))
## add each parcel to a copy of the original data set
for (k in assignments[[j]]$parcels) {
## which items were selected for this parcel?
ranVars <- assignments[[j]]$items[ranAss == k]
## calculate row means of those items, save as parcel
dataList[[i]][ , k] <- rowMeans(data[ , ranVars, drop = FALSE]) # drop = FALSE guards 1-item parcels
}
}
}
if (!do.fit) return(dataList)
## fit parcel-level model to list of data sets
set.seed(iseed) # in case not using parallel
fitList <- lavaanList(model, dataList, cmd = fun, ..., warn = warn, iseed = iseed,
FUN = lavaan::fitMeasures, show.progress = show.progress)
## for which data sets did the model converge?
conv <- fitList@meta$ok
if (!any(conv)) stop("The model did not converge for any allocations.")
if (!all(conv)) message("The model did not converge for the following ",
"allocations: ", paste(which(!conv), collapse = ", "))
## tools to extract output
getOutput <- function(x) {
c(Avg = mean(x, na.rm = TRUE), SD = sd(x, na.rm = TRUE),
Min = min(x, na.rm = TRUE), Max = max(x, na.rm = TRUE),
Range = max(x, na.rm = TRUE) - min(x, na.rm = TRUE))
}
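## e.g., getOutput(1:3) returns c(Avg = 2, SD = 1, Min = 1, Max = 3, Range = 2)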
out <- list()
myCols <- c("lhs","op","rhs","group", "block","label")
template <- data.frame(fitList@ParTableList[[which(conv)[1]]][myCols])
## parameter estimates
Est <- sapply(fitList@ParTableList[conv], function(x) x$est)
out$Estimates <- cbind(template, t(apply(Est, 1, getOutput)))
## standard errors
SE <- sapply(fitList@ParTableList[conv], function(x) x$se)
## Any for which SE could not be calculated?
missingSE <- apply(SE, 2, function(x) any(is.na(x)))
if (!all(missingSE)) {
if (any(missingSE)) message("Standard errors could not be computed for ",
"the following allocations: ",
paste(which(missingSE), collapse = ", "))
out$SE <- cbind(template, t(apply(SE[ , !missingSE], 1, getOutput)))
## add significance test results to $Estimates
Sig <- abs(Est[, !missingSE] / SE[, !missingSE]) > qnorm(alpha / 2,
lower.tail = FALSE)
out$Estimates$Percent_Sig <- rowMeans(Sig)
out$Estimates$Percent_Sig[fitList@ParTableList[[which(conv)[1]]]$free == 0L] <- NA
} else {
message("Standard errors could not be calculated for any converged",
" data sets, so no significance tests could be conducted.")
out$SE <- NULL
}
## fit measures
Fit <- do.call(cbind, fitList@funList[conv])[fit.measures, ]
out$Fit <- data.frame(t(apply(Fit, 1, getOutput)))
if (any(grepl(pattern = "chisq", fit.measures))) {
out$Fit$Percent_Sig <- NA
if ("chisq" %in% fit.measures) {
pvalues <- sapply(fitList@funList[conv], "[", i = "pvalue")
out$Fit["chisq", "Percent_Sig"] <- mean(pvalues < alpha, na.rm = TRUE)
}
if ("chisq.scaled" %in% fit.measures) {
pvalues <- sapply(fitList@funList[conv], "[", i = "pvalue.scaled")
out$Fit["chisq.scaled", "Percent_Sig"] <- mean(pvalues < alpha, na.rm = TRUE)
}
}
if (any(grepl(pattern = "rmsea", fit.measures))) {
if (is.null(out$Fit$Percent_Sig)) out$Fit$Percent_Sig <- NA
if ("rmsea" %in% fit.measures) {
pvalues <- sapply(fitList@funList[conv], "[", i = "rmsea.pvalue")
out$Fit["rmsea", "Percent_Sig"] <- mean(pvalues < alpha, na.rm = TRUE)
}
if ("rmsea.scaled" %in% fit.measures) {
pvalues <- sapply(fitList@funList[conv], "[", i = "rmsea.pvalue.scaled")
out$Fit["rmsea.scaled", "Percent_Sig"] <- mean(pvalues < alpha, na.rm = TRUE)
}
}
## check for robust test
if (any(grepl(pattern = "scaled", names(fitList@funList[conv][[1]]))) &
!any(grepl(pattern = "scaled", fit.measures))) {
warning('Robust test requested, but "fit.measures" argument does not',
' include any scaled measures (e.g., "chisq.scaled", ',
'"rmsea.scaled", or "rmsea.robust").')
}
## remove rows that do not correspond to estimates
out$Estimates <- out$Estimates[fitList@ParTableList[[which(conv)[1]]]$group > 0L, ]
if (!is.null(out$SE)) out$SE <- out$SE[fitList@ParTableList[[which(conv)[1]]]$group > 0L, ]
## assign class for lavaan's print method
class(out$Estimates) <- c("lavaan.data.frame","data.frame")
if (!is.null(out$SE)) class(out$SE) <- c("lavaan.data.frame","data.frame")
class(out$Fit) <- c("lavaan.data.frame","data.frame")
## return output
if (return.fit) {
out$Model <- fitList
out$Model@external$dataList <- dataList
}
out
}
semTools/R/compareFit.R 0000644 0001762 0000144 00000052021 14764323566 014514 0 ustar ligges users
### Terrence D. Jorgensen & Sunthud Pornprasertmanit
### Last updated: 12 March 2025
### source code for compareFit() function and FitDiff class
## -----------------
## Class and Methods
## -----------------
##' Class For Representing A Template of Model Fit Comparisons
##'
##' This class contains model fit measures and model fit comparisons among
##' multiple models
##'
##'
##' @name FitDiff-class
##' @aliases FitDiff-class show,FitDiff-method summary,FitDiff-method
##' @docType class
##'
##' @slot name `character`. The name of each model
##' @slot model.class `character`. One class to which each model belongs
##' @slot nested `data.frame`. Model fit comparisons between adjacently
##' nested models that are ordered by their degrees of freedom (*df*)
##' @slot fit `data.frame`. Fit measures of all models specified in the
##' `name` slot, ordered by their *df*
##' @slot fit.diff `data.frame`. Sequential differences in fit measures in
##' the `fit` slot
##'
##' @section Objects from the Class: Objects can be created via the
##' [compareFit()] function.
##'
##'
##' @author Terrence D. Jorgensen (University of Amsterdam;
##' \email{TJorgensen314@@gmail.com})
##'
##' Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' @seealso [compareFit()]; [clipboard()]
##'
##' @examples
##'
##' HS.model <- ' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + x8 + x9 '
##' fit.config <- cfa(HS.model, data = HolzingerSwineford1939, group = "school")
##' ## invariance constraints
##' fit.metric <- cfa(HS.model, data = HolzingerSwineford1939, group = "school",
##' group.equal = "loadings")
##' fit.scalar <- cfa(HS.model, data = HolzingerSwineford1939, group = "school",
##' group.equal = c("loadings","intercepts"))
##' fit.strict <- cfa(HS.model, data = HolzingerSwineford1939, group = "school",
##' group.equal = c("loadings","intercepts","residuals"))
##' measEqOut <- compareFit(fit.config, fit.metric, fit.scalar, fit.strict)
##' summary(measEqOut)
##' summary(measEqOut, fit.measures = "all")
##' summary(measEqOut, fit.measures = c("aic", "bic"))
##'
##' if(interactive()){
##' ## Save results to a file
##' saveFile(measEqOut, file = "measEq.txt")
##'
##' ## Copy to a clipboard
##' clipboard(measEqOut)
##' }
##'
setClass("FitDiff", slots = c(name = "character", # list of model names
model.class = "character", # lavaan or lavaan.mi
nested = "data.frame", # anova() table
fit = "data.frame", # fitMeasures() output
fit.diff = "data.frame")) # index differences
##' @rdname FitDiff-class
##' @aliases show,FitDiff-method
##' @importFrom methods getMethod
##' @export
setMethod("show", signature(object = "FitDiff"), function(object) {
cat("The following", object@model.class, "models were compared:\n ")
cat(object@name, sep = "\n ")
cat("To view results, assign the compareFit() output to an object and ",
"use the summary() method; see the class?FitDiff help page.\n")
invisible(object)
})
##' @rdname FitDiff-class
##' @aliases summary,FitDiff-method
##'
##' @param object object of class `FitDiff`
##' @param fit.measures `character` vector naming fit indices the user can
##' request from [lavaan::fitMeasures()]. If `"default"`, the
##' fit measures will be `c("chisq", "df", "pvalue", "cfi", "tli",
##' "rmsea", "srmr", "aic", "bic")`. If `"all"`, all available fit
##' measures will be returned.
##' @param nd number of digits printed
##' @param tag single `character` used to flag the model preferred by each
##' fit index. To omit tags, set to `NULL` or `NA`.
##'
##' @export
setMethod("summary", signature(object = "FitDiff"),
function(object, fit.measures = "default", nd = 3, tag = "\u2020") {
if (nrow(object@nested) > 0L) {
cat("################### Nested Model Comparison #########################\n")
test.statistics <- object@nested
if (object@model.class == "lavaan") {
print(test.statistics, nd = nd)
} else {
class(test.statistics) <- c("lavaan.data.frame","data.frame")
stats::printCoefmat(test.statistics, P.values = TRUE, has.Pvalue = TRUE)
}
cat("\n")
}
noFit <- ncol(object@fit) == 1L && names(object@fit)[1] == "df"
if (!noFit) {
if (is.null(fit.measures)) fit.measures <- colnames(object@fit)
if ("all" %in% fit.measures) fit.measures <- colnames(object@fit)
if (length(fit.measures) == 1 && fit.measures == "default") {
## robust or scaled test statistics?
if (is.null(object@fit$cfi.scaled)) {
fit.measures <- c("chisq","df","pvalue","rmsea","cfi","tli","srmr")
} else if (!is.null(object@fit$cfi.robust) && all(!is.na(object@fit$cfi.robust))) {
fit.measures <- c("chisq.scaled","df.scaled","pvalue.scaled",
"rmsea.robust","cfi.robust","tli.robust","srmr")
} else {
fit.measures <- c("chisq.scaled","df.scaled","pvalue.scaled",
"rmsea.scaled","cfi.scaled","tli.scaled","srmr")
}
if ("aic" %in% colnames(object@fit)) {
fit.measures <- c(fit.measures, "aic", "bic")
}
}
cat("####################### Model Fit Indices ###########################\n")
## this is the object to return (numeric, no printed daggers)
fit.indices <- object@fit[ , fit.measures , drop = FALSE]
## print with daggers marking each fit index's preferred model
## (turns "numeric" vectors into "character")
badness <- grepl(pattern = c("chisq|rmsea|ic|rmr|ecvi|fmin|hqc"),
x = colnames(fit.indices))
goodness <- grepl(pattern = c("cfi|tli|rfi|nfi|ifi|rni|cn|gfi|mfi|Hat"),
x = colnames(fit.indices))
minvalue <- badness & !goodness
minvalue[!badness & !goodness] <- NA
fit.integer <- grepl(pattern = c("df|npar|ntotal"),
x = colnames(fit.indices))
suppressWarnings(fitTab <- as.data.frame(mapply(tagCharacter, nd = nd,
char = tag,
vec = fit.indices,
minvalue = minvalue,
print_integer = fit.integer),
stringsAsFactors = FALSE))
rownames(fitTab) <- object@name
colnames(fitTab) <- colnames(fit.indices)
class(fitTab) <- c("lavaan.data.frame","data.frame")
print(fitTab, nd = nd)
cat("\n")
if (nrow(object@nested) > 0L) {
fit.diff.measures <- fit.measures[!grepl(pattern = "chisq|pvalue|ntotal",
x = fit.measures)]
cat("################## Differences in Fit Indices #######################\n")
fit.diff <- object@fit.diff[ , fit.diff.measures, drop = FALSE]
class(fit.diff) <- c("lavaan.data.frame","data.frame")
print(fit.diff, nd = nd)
cat("\n")
}
}
invisible(object)
})
## "method" for saveFile() function (see "clipboard.R")
saveFileFitDiff <- function(object, file, what = "summary",
tableFormat = FALSE, fit.measures = "default",
writeArgs = list()) {
if (tableFormat) {
writeArgs$file <- file
writeArgs$append <- TRUE
if (is.null(writeArgs$sep)) writeArgs$sep <- "\t"
if (is.null(writeArgs$quote)) writeArgs$quote <- FALSE
if (is.null(writeArgs$row.names)) writeArgs$row.names <- FALSE
if (nrow(object@nested) > 0L) {
cat("Nested Model Comparison\n\n", file = file, append = TRUE)
out <- object@nested
#out <- data.frame(model.diff = rownames(out), out)
writeArgs$x <- out
do.call("write.table", writeArgs)
cat("\n\n", file = file, append = TRUE)
}
out2 <- getFitSummary(object, fit.measures)
out2 <- data.frame(model = object@name, out2)
cat("Fit Indices Summaries\n\n", file = file, append = TRUE)
writeArgs$x <- out2
do.call("write.table", writeArgs)
} else {
write(paste(utils::capture.output(getMethod("summary",
signature = "FitDiff")(object)),
collapse = "\n"), file = file)
}
}
## --------------------
## Constructor Function
## --------------------
##' Build an object summarizing fit indices across multiple models
##'
##' This function will create the template to compare fit indices across
##' multiple fitted lavaan objects. The results can be exported to a clipboard
##' or a file later.
##'
##' @importFrom lavaan lavTestLRT
##' @importMethodsFrom lavaan fitMeasures
##'
##' @param ... fitted `lavaan` models or list(s) of `lavaan` objects.
##' [lavaan.mi::lavaan.mi-class] objects are also accepted, but all models
##' must belong to the same class.
##' @param nested `logical` indicating whether the models in `...` are
##' nested. See [net()] for an empirical test of nesting.
##' @param argsLRT `list` of arguments to pass to
##' [lavaan::lavTestLRT()], as well as to
##' [lavaan.mi::lavTestLRT.mi()] and [lavaan::fitMeasures()] when
##' comparing [lavaan.mi::lavaan.mi-class] models.
##' @param indices `logical` indicating whether to return fit indices from
##' the [lavaan::fitMeasures()] function. Selecting particular
##' indices is controlled in the `summary` method; see
##' [FitDiff-class].
##' @param moreIndices `logical` indicating whether to return fit indices
##' from the [moreFitIndices()] function. Selecting particular
##' indices is controlled in the `summary` method; see
##' [FitDiff-class].
##' @param baseline.model optional fitted [lavaan::lavaan-class] model passed to
##' [lavaan::fitMeasures()] to calculate incremental fit indices.
##' @param nPrior passed to [moreFitIndices()], if relevant
##'
##' @return A [FitDiff-class] object that saves model fit
##' comparisons across multiple models. If the models are not nested, only
##' fit indices for each model are returned. If the models are nested, the
##' differences in fit indices are additionally returned, as well as test
##' statistics comparing each sequential pair of models (ordered by their
##' degrees of freedom).
##'
##' @author Terrence D. Jorgensen (University of Amsterdam;
##' \email{TJorgensen314@@gmail.com})
##'
##' Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' @seealso [FitDiff-class], [clipboard()]
##'
##' @examples
##'
##' HS.model <- ' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + x8 + x9 '
##'
##' ## non-nested models
##' fit1 <- cfa(HS.model, data = HolzingerSwineford1939)
##'
##' m2 <- ' f1 =~ x1 + x2 + x3 + x4
##' f2 =~ x5 + x6 + x7 + x8 + x9 '
##' fit2 <- cfa(m2, data = HolzingerSwineford1939)
##'
##' (out1 <- compareFit(fit1, fit2, nested = FALSE))
##' summary(out1)
##'
##'
##' ## nested model comparisons: measurement equivalence/invariance
##' fit.config <- cfa(HS.model, data = HolzingerSwineford1939, group = "school")
##' fit.metric <- cfa(HS.model, data = HolzingerSwineford1939, group = "school",
##' group.equal = "loadings")
##' fit.scalar <- cfa(HS.model, data = HolzingerSwineford1939, group = "school",
##' group.equal = c("loadings","intercepts"))
##' fit.strict <- cfa(HS.model, data = HolzingerSwineford1939, group = "school",
##' group.equal = c("loadings","intercepts","residuals"))
##'
##' measEqOut <- compareFit(fit.config, fit.metric, fit.scalar, fit.strict,
##' moreIndices = TRUE) # include moreFitIndices()
##' summary(measEqOut)
##' summary(measEqOut, fit.measures = "all")
##' summary(measEqOut, fit.measures = c("aic", "bic", "sic", "ibic"))
##'
##'
#FIXME: why doesn't this example work?
## \donttest{
## ## also applies to lavaan.mi objects (fit model to multiple imputations)
## library(lavaan.mi)
## data("HS20imps", package = "lavaan.mi") # example data: 20 imputations
##
## ## request robust test statistics
## mgfit2 <- cfa.mi(HS.model, data = HS20imps, group = "school", estimator = "mlm")
## mgfit1 <- cfa.mi(HS.model, data = HS20imps, group = "school", estimator = "mlm",
## group.equal = "loadings")
## mgfit0 <- cfa.mi(HS.model, data = HS20imps, group = "school", estimator = "mlm",
## group.equal = c("loadings","intercepts"))
##
## ## request the strictly-positive robust test statistics
## out2 <- compareFit(scalar = mgfit0, metric = mgfit1, config = mgfit2,
## argsLRT = list(asymptotic = TRUE,
## method = "satorra.bentler.2010"))
## ## note that moreFitIndices() does not work for lavaan.mi objects
## summary(out2, fit.measures = c("crmr","srmr", "cfi.robust","tli.robust",
## "rmsea.robust",
## "rmsea.ci.lower.robust",
## "rmsea.ci.upper.robust"))
## }
##'
##' @export
compareFit <- function(..., nested = TRUE, argsLRT = list(), indices = TRUE,
moreIndices = FALSE, baseline.model = NULL, nPrior = 1) {
## make sure there is something to do
if (!(nested || indices || moreIndices)) {
message('User requested neither indices nor tests of nested models.')
return(NULL)
}
## separate models from lists of models
dots <- list(...)
idx.list <- sapply(dots, is.list)
modLists <- dots[ idx.list]
mods <- dots[!idx.list]
## capture names of any arguments passed via dots
allnames <- sapply(substitute(list(...))[-1], deparse)
listnames <- allnames[ idx.list]
modnames <- allnames[!idx.list]
## make sure models are named
if (length(mods) && is.null(names(mods))) {
names(mods) <- modnames
} else for (nn in seq_along(mods)) {
if (names(mods)[nn] == "") names(mods)[nn] <- modnames[nn]
}
## make sure lists are named
if (length(modLists) && is.null(names(modLists))) {
names(modLists) <- listnames
} else for (nn in seq_along(modLists)) {
if (names(modLists)[nn] == "") names(modLists)[nn] <- listnames[nn]
}
## within each list, make sure models are named
for (i in seq_along(modLists)) {
if (length(modLists[[i]]) && is.null(names(modLists[[i]]))) {
names(modLists[[i]]) <- seq_along(modLists[[i]])
} else for (nn in seq_along(modLists[[i]])) {
if (names(modLists[[i]])[nn] == "") names(modLists[[i]])[nn] <- nn
}
}
## collapse into a single list of models
if (length(modLists)) mods <- c(mods, unlist(modLists))
## check for lavaan models
not.lavaan <- !sapply(mods, inherits, what = c("lavaan","lavaan.mi"))
if (any(not.lavaan)) stop("The following are not fitted lavaan(.mi) models:\n",
paste0(names(which(not.lavaan)), collapse = ", "))
modClass <- unique(sapply(mods, class))
if (length(modClass) > 1L) stop('All models must be of the same class (e.g.,',
' cannot compare lavaan objects to lavaan.mi)')
if (inherits(mods[[1]], "lavaan")) {
nonConv <- !sapply(mods, lavInspect, what = "converged")
} else if (inherits(mods[[1]], "lavaan.mi")) {
## attach lavaan.mi to access LRT + fitMeasures
if (!"package:lavaan.mi" %in% search()) attachNamespace("lavaan.mi")
nonConv <- !sapply(mods, function(fit) {
any(sapply(fit@convergence, "[", i = "converged"))
})
}
if (all(nonConv)) {
stop('No models converged')
} else if (any(nonConv)) {
message('The following models did not converge, so they are ignored:\n',
paste(names(nonConv)[nonConv], collapse = ",\t"))
mods <- mods[which(!nonConv)]
}
## grab lavaan.mi options, if relevant
if (is.null(argsLRT$pool.robust)) {
pool.robust <- formals(lavaan.mi::lavTestLRT.mi)$pool.robust # default value
} else {
pool.robust <- argsLRT$pool.robust # user-specified value
}
if (is.null(argsLRT$pool.method)) {
pool.method <- eval(formals(lavaan.mi::lavTestLRT.mi)$pool.method) # default value
} else {
pool.method <- argsLRT$pool.method # user-specified value
}
## FIT INDICES
if (indices || moreIndices) {
fitList <- lapply(mods, fitMeasures, baseline.model = baseline.model,
pool.robust = pool.robust, pool.method = pool.method)
if (moreIndices && modClass == "lavaan") {
moreFitList <- lapply(mods, moreFitIndices, nPrior = nPrior)
fitList <- mapply(c, fitList, moreFitList, SIMPLIFY = FALSE)
}
if (length(unique(sapply(fitList, length))) > 1L) {
warning('fitMeasures() returned vectors of different lengths for different',
' models, probably because certain options are not the same. Check',
' lavInspect(fit, "options")[c("estimator","test","meanstructure")]',
' for each model, or run fitMeasures() on each model to investigate.')
indexList <- lapply(fitList, names)
useNames <- names(which(table(unlist(indexList)) == length(fitList)))
fitList <- lapply(fitList, "[", i = useNames)
}
fit <- as.data.frame(do.call(rbind, fitList))
} else {
fitList <- lapply(mods, fitMeasures, fit.measures = "df",
pool.robust = pool.robust, pool.method = pool.method)
## check for scaled tests
nDF <- sapply(fitList, length)
if (any(nDF != nDF[1])) stop('Some (but not all) models have robust tests,',
' so they cannot be compared as nested models.')
fit <- data.frame(df = sapply(fitList, "[",
i = if (any(nDF > 1L)) 2L else 1L))
}
## order models by increasing df (least-to-most constrained)
ord <- order(fit$df) #FIXME: what if test == "mean.var.adjusted"?
fit <- fit[ord, , drop = FALSE]
mods <- mods[ord]
## TEST STATISTICS
if (nested) {
argsLRT$model.names <- names(mods)
argsLRT$object <- mods[[1]]
if (inherits(mods[[1]], "lavaan")) {
nestedout <- do.call(lavTestLRT, c(mods[-1], argsLRT))
} else if (inherits(mods[[1]], "lavaan.mi")) {
nestedout <- do.call(lavaan.mi::lavTestLRT.mi, c(mods[-1], argsLRT))
}
## not nested
} else nestedout <- data.frame()
## DIFFERENCES IN FIT INDICES
if ((indices || moreIndices) && length(names(mods)) > 1L) {
fitSubset <- colnames(fit)[!grepl(pattern = "chisq|pvalue|ntotal",
x = colnames(fit))]
fitTab <- fit[ , fitSubset, drop = FALSE]
diffTab <- as.data.frame(do.call(cbind, lapply(fitTab, diff)))
rownames(diffTab) <- paste(names(mods)[-1], "-", names(mods)[-length(names(mods))])
} else diffTab <- data.frame(df = diff(fit))
new("FitDiff", name = names(mods), model.class = modClass,
nested = nestedout, fit = fit, fit.diff = diffTab)
}
## ----------------
## Hidden Functions
## ----------------
noLeadingZero <- function(vec, fmt, nd = 3L) {
out <- sprintf(fmt, vec)
upper.limit <- paste0(".", paste(rep(9, nd - 1L), collapse = ""), "5")
used <- vec < as.numeric(upper.limit) & vec >= 0
used[is.na(used)] <- FALSE
out[used] <- substring(out[used], 2)
out
}
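## Illustration (hypothetical call, not run): leading zeros are stripped from
## small non-negative values, while other values keep them:
# noLeadingZero(c(.123, -.123, 1.234), fmt = "%.3f")
# [1] ".123"   "-0.123" "1.234"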
tagCharacter <- function(vec, char = "\u2020", minvalue = NA,
print_integer = FALSE, nd = 3L) {
char <- if (is.null(char)) as.character(NA) else as.character(char)[1]
if (nchar(char) != 1L) {
message('Only a single character can be used to tag= preferred models, so ',
'no tags were added. To omit tags, specify tag=NULL or tag=NA.')
char <- as.character(NA)
}
if (print_integer) {
vec <- noLeadingZero(vec, fmt = "%.0f", nd = nd)
} else if (is.na(minvalue)) {
vec <- noLeadingZero(vec, fmt = paste0("%.", nd, "f"), nd = nd)
} else {
target <- if (minvalue) min(vec, na.rm = TRUE) else max(vec, na.rm = TRUE)
tag <- rep(" ", length(vec))
if (!is.na(char)) tag[vec == target] <- char
vec <- noLeadingZero(vec, fmt = paste0("%.", nd, "f"), nd = nd)
vec <- paste0(vec, tag)
}
vec
}
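## Illustration (hypothetical call, not run): the most preferable value is
## tagged; minvalue = FALSE indicates larger values are better (e.g., CFI):
# tagCharacter(c(.95, .90, .99), char = "*", minvalue = FALSE)
# [1] ".950 " ".900 " ".990*"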
getFitSummary <- function(object, fit.measures = "default", return.diff = FALSE) {
if (is.null(fit.measures)) fit.measures <- colnames(object@fit)
if ("all" %in% fit.measures) fit.measures <- colnames(object@fit)
if (length(fit.measures) == 1 && fit.measures == "default") {
## robust or scaled test statistics?
if (is.null(object@fit$cfi.scaled)) {
fit.measures <- c("chisq","df","pvalue","rmsea","cfi","tli","srmr")
} else if (all(!is.na(object@fit$cfi.robust)) && !is.null(object@fit$cfi.robust)) {
fit.measures <- c("chisq.scaled","df.scaled","pvalue.scaled",
"rmsea.robust","cfi.robust","tli.robust","srmr")
} else {
fit.measures <- c("chisq.scaled","df.scaled","pvalue.scaled",
"rmsea.scaled","cfi.scaled","tli.scaled","srmr")
}
if ("aic" %in% colnames(object@fit)) {
fit.measures <- c(fit.measures, "aic", "bic")
}
}
## chi-squared difference test already reported, so remove (diff in p-value)
if (return.diff) {
fit.measures <- fit.measures[!grepl(pattern = "chisq|pvalue|ntotal",
x = fit.measures)]
}
## return numeric values
fitTab <- object@fit[ , colnames(object@fit) %in% fit.measures, drop = FALSE]
if (!return.diff) return(fitTab)
## or return differences in fit indices
diffTab <- as.data.frame(do.call(cbind, lapply(fitTab, diff)))
rownames(diffTab) <- paste(object@name[-1], "-", object@name[-length(object@name)])
diffTab
}
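## Sketch of usage (hypothetical; 'fd' is assumed to be a FitDiff object
## returned by compareFit()):
# getFitSummary(fd)                                     # default fit measures
# getFitSummary(fd, fit.measures = "all", return.diff = TRUE) # differences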
semTools/R/discriminantValidity.R 0000644 0001762 0000144 00000022453 14753073506 016615 0 ustar ligges users ### Mikko Rönkkö (Roxygen edits by TDJ)
### Last updated: 12 February 2025
##' Calculate discriminant validity statistics
##'
##' Calculate discriminant validity statistics based on a fitted lavaan object
##'
##' At the measurement scale level, discriminant validity is commonly
##' evaluated by checking whether each pair of latent correlations is
##' sufficiently below one (in absolute value) that the latent variables can be
##' thought of as representing two distinct constructs.
##'
##' The `discriminantValidity` function calculates two sets of statistics that
##' are commonly used in discriminant validity evaluation. The first set consists
##' of factor correlation estimates and their confidence intervals. The second set
##' is a series of nested model tests, where the baseline model is compared
##' against a set of constrained models that are constructed by constraining
##' each factor correlation to the specified cutoff one at a time.
##'
##' The function assumes that `object` is a set of confirmatory
##' factor analysis results in which the latent variables are scaled by fixing
##' their variances to 1. If the model is not a CFA model, the function will
##' calculate the statistics for the correlations among exogenous latent
##' variables, but based on the *residual* variances and covariances of
##' endogenous latent variables. If the latent
##' variables are scaled in some other way (e.g., by fixing the first loadings),
##' the function issues a warning and re-estimates the model by fixing latent
##' variances to 1 (and estimating all loadings) so that factor covariances are
##' already estimated as correlations.
##'
##' The likelihood ratio tests are done by comparing the original baseline model
##' against more constrained alternatives. By default, these alternatives are
##' constructed by fixing each correlation at a time to a cutoff value. The
##' typical purpose of this test is to demonstrate that the estimated factor
##' correlation is well below the cutoff and a significant \eqn{\chi^2} statistic
##' thus indicates support for discriminant validity. In some cases, the original
##' correlation estimate may already be greater than the cutoff, making it
##' redundant to fit a "restricted" model. When this happens, the likelihood
##' ratio test will be replaced by comparing the baseline model against itself.
##' For correlations that are estimated to be negative, a negation of the cutoff
##' is used in the constrained model.
##'
##' Another alternative is to do a nested model comparison against a model where
##' two factors are merged as one by setting the `merge` argument to
##' `TRUE`. In this comparison, the constrained model is constructed by
##' removing one of the correlated factors from the model and assigning its
##' indicators to the factor that remains in the model.
##'
##'
##' @importFrom lavaan lavInspect lavNames parTable
##'
##' @param object The [lavaan::lavaan-class] model object returned by
##' the [lavaan::cfa()] function.
##' @param cutoff A cutoff to be used in the constrained models in likelihood
##' ratio tests.
##' @param merge Whether the constrained models should be constructed by merging
##' two factors as one. Implies `cutoff` = 1.
##'
##' @inheritParams lavaan::parameterEstimates
##'
##' @return A `data.frame` of latent variable correlation estimates, their
##' confidence intervals, and likelihood ratio tests against constrained models,
##' with the following attributes:
##' \describe{
##' \item{baseline}{The baseline model after possible rescaling.}
##' \item{constrained}{A `list` of the fitted constrained models
##' used in the likelihood ratio test.}
##' }
##'
##' @author
##' Mikko Rönkkö (University of Jyväskylä; \email{mikko.ronkko@jyu.fi})
##' @references
##'
##' Rönkkö, M., & Cho, E. (2022). An updated guideline for assessing
##' discriminant validity. *Organizational Research Methods*, 25(1), 6–14.
##' \doi{10.1177/1094428120968614}
##'
##' @examples
##'
##' library(lavaan)
##'
##' HS.model <- ' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + x8 + x9 '
##'
##' fit <- cfa(HS.model, data = HolzingerSwineford1939)
##' discriminantValidity(fit)
##' discriminantValidity(fit, merge = TRUE)
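##'
##' ## likelihood ratio tests against a stricter cutoff than the default .9
##' discriminantValidity(fit, cutoff = .8)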
##'
##' @export
discriminantValidity <- function(object, cutoff = .9, merge = FALSE, level = .95,
boot.ci.type = "perc") {
free <- lavInspect(object, "free", add.class = FALSE)
#FIXME: adapt for multiple blocks by looping over groups/levels
if (lavInspect(object, "ngroups") > 1L || lavInspect(object, "nlevels") > 1L)
stop("Only implemented for single-group, single-level models so far.")
# Identify the latent variables that we will use
lvs <- lavNames(object,"lv")
if (cutoff <= 0 || cutoff > 1) stop("The cutoff must be in the interval (0,1].")
if (merge && !missing(cutoff) && cutoff != 1)
message("Merging factors implies constraining the factor correlation to 1. ",
"Cutoff will be ignored.")
if (length(lvs) == 0) stop("The model does not have any exogenous latent variables.")
if (length(lvs) == 1) stop("The model has only one exogenous latent variable. ",
"At least two are required for assessing discriminant validity.")
if (length(lavNames(object, "lv.y")) > 0)
warning("The model has at least one endogenous latent variable (",
paste(lavNames(object, "lv.y"), collapse = ", "),
"). The correlations of these variables will be estimated after ",
"conditioning on their predictors.")
# Extract the part of psi that contains latent variables
psi <- free$psi[lvs,lvs]
# Identify exogenous variances and covariances
pt <- parTable(object)
varIndices <- which(pt$lhs == pt$rhs & pt$lhs %in% lvs & pt$op == "~~")
covIndices <- which(pt$lhs != pt$rhs & pt$lhs %in% lvs & pt$rhs %in% lvs & pt$op == "~~")
# Check that the diagonal of psi is all zeros
if (any(diag(psi) != 0)) {
message("Some of the latent variable variances are estimated instead of ",
"fixed to 1. The model is re-estimated by scaling the latent ",
"variables by fixing their variances and freeing all factor loadings.")
# Identify free exogenous variances
i <- intersect(varIndices,which(pt$free != 0))
pt$free[i] <- 0
pt$ustart[i] <- 1
pt$user[i] <- 1
# Free all factor loadings corresponding to the lvs whose variances were
# just fixed
i <- which(pt$lhs %in% pt$lhs[i] & pt$op == "=~")
pt$free[i] <- -1
pt$ustart[i] <- NA
# Update parameter numbers
i <- which(pt$free != 0)
pt$free[i] <- seq_along(i)
object <- lavaan::update(object, model = pt[,1:12]) # Leave out starting
# values, estimates, and
# ses from pt
# Update pt based on the new model
pt <- parTable(object)
}
# At this point we can be sure that all exogenous variances are fixed instead
# of being estimated. We still need to check that they are fixed to 1s
est <- lavInspect(object,"est")$psi[lvs,lvs]
if (any(diag(est) != 1)) {
message("Some of the latent variable variances are fixed to values other ",
"than 1. The model is re-estimated by scaling the latent variables",
" based on the first factor loading.")
# constrain the exogenous variances to 1
pt$ustart[varIndices] <- 1
object <- lavaan::update(object, model = pt[,1:12]) # Leave out starting
# values, estimates, and
# ses from pt
# Update pt based on the new estimates
pt <- parTable(object)
}
# At this point we can be sure that all exogenous LVs have their variances
# fixed to ones and can start constructing the matrix to be returned
ret <- lavaan::parameterEstimates(object, ci = TRUE,
level = level,
boot.ci.type = boot.ci.type)[covIndices,
c("lhs","op","rhs","est",
"ci.lower","ci.upper")]
rownames(ret) <- seq_len(nrow(ret))
# Add the chi^2 test to all correlation pairs
constrainedModels <- lapply(covIndices, function(i) {
thisPt <- pt
if (merge) {
lhs <- pt$lhs[i]
rhs <- pt$rhs[i]
# Merge the factors by assigning indicators of lhs to rhs
thisPt$lhs[thisPt$lhs == lhs & thisPt$op == "=~"] <- rhs
# Then remove all other parameters concerning lhs
thisPt <- thisPt[!(thisPt$lhs == lhs | thisPt$rhs == lhs), ]
thisPt$id <- seq_len(nrow(thisPt))
} else {
# If the correlation is estimated to be greater than the cutoff, constrain
# it to the estimated value
if (abs(pt$est[i]) > cutoff) {
thisCutoff <- pt$est[i]
} else {
thisCutoff <- ifelse(pt$est[i] < 0, - cutoff, cutoff)
}
thisPt$free[i] <- 0
thisPt$ustart[i] <- thisCutoff
}
# Update parameter numbers
j <- which(thisPt$free != 0)
thisPt$free[j] <- seq_along(j)
lavaan::update(object, model = thisPt[,1:12],
se = "none") # Disable SEs to save computational time
})
lrTests <- lapply(constrainedModels, function(constrained) {
lavaan::lavTestLRT(object,constrained)[2,] # Return the second row of the
# test
})
ret <- cbind(ret,do.call(rbind,lrTests))
# Store the baseline model
attr(ret,"baseline") <- object
attr(ret,"constrained") <- constrainedModels
ret
}
semTools/R/NET.R 0000644 0001762 0000144 00000031213 14764263766 013056 0 ustar ligges users ### Terrence D. Jorgensen
### Last updated: 12 March 2025
### semTools functions for Nesting and Equivalence Testing
## -----------------
## Class and Methods
## -----------------
##' Class For the Result of Nesting and Equivalence Testing
##'
##' This class contains the results of nesting and equivalence testing among
##' multiple models
##'
##'
##' @name Net-class
##' @aliases Net-class show,Net-method summary,Net-method
##' @docType class
##'
##' @slot test Logical `matrix` indicating nesting/equivalence among models
##' @slot df The degrees of freedom of tested models
##'
##' @section Objects from the Class: Objects can be created via the
##' [net()] function.
##'
##' @param object An object of class `Net`.
##'
##' @return
##' \item{show}{`signature(object = "Net")`: prints the logical matrix of
##' test results. `NA` indicates a model did not converge.}
##' \item{summary}{`signature(object = "Net")`: prints a narrative
##' description of results. The original `object` is invisibly returned.}
##'
##' @author
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @seealso [net()]
##'
##' @examples
##'
##' # See the example in the net function.
##'
setClass("Net", representation(test = "matrix", df = "vector"))
##' @rdname Net-class
##' @aliases show,Net-method
##' @export
setMethod("show", "Net",
function(object) {
if (length(object@test)) {
m <- as.matrix(unclass(object@test))
m[upper.tri(m, diag = TRUE)] <- ""
cat("
If cell [R, C] is TRUE, the model in row R is nested within column C.
If the models also have the same degrees of freedom, they are equivalent.
NA indicates the model in column C did not converge when fit to the
implied means and covariance matrix from the model in row R.
The hidden diagonal is TRUE because any model is equivalent to itself.
The upper triangle is hidden because for models with the same degrees
of freedom, cell [C, R] == cell [R, C]. For all models with different
degrees of freedom, the upper diagonal is all FALSE because models with
fewer degrees of freedom (i.e., more parameters) cannot be nested
within models with more degrees of freedom (i.e., fewer parameters).
\n")
print(m, quote = FALSE)
} else {
cat(data.class(object@test), "(0)\n", sep = "")
}
invisible(object)
})
##' @rdname Net-class
##' @aliases summary,Net-method
##' @export
setMethod("summary", "Net",
function(object) {
DFs <- object@df
x <- object@test
mods <- colnames(x)
## keep track of how many are printed
nPrinted <- 0L
for (R in 2:nrow(x)) {
for (C in (R - 1):1) {
## if model didn't converge (logical value is missing), go to next iteration
if (is.na(x[R, C])) next
## if the models are not nested, go to next iteration
if (!x[R, C]) next
## choose message based on whether models are equivalent or nested
if (identical(DFs[R], DFs[C])) {
rel <- "equivalent to"
nPrinted <- nPrinted + 1L
} else {
rel <- "nested within"
nPrinted <- nPrinted + 1L
}
cat("Model \"", mods[R], "\" is ", rel, " model \"", mods[C], "\"\n", sep = "")
}
}
if (nPrinted == 0L) cat('No models were determined as nested/equivalent.\n')
invisible(object)
})
## --------------------
## Constructor Function
## --------------------
##' Nesting and Equivalence Testing
##'
##' This test examines whether pairs of SEMs are nested or equivalent.
##'
##' The concept of nesting/equivalence should be the same regardless of
##' estimation method. However, the particular method of testing
##' nesting/equivalence (as described in Bentler & Satorra, 2010) employed by
##' the `net` function analyzes summary statistics (model-implied means and
##' covariance matrices, not raw data). In the case of robust methods like MLR,
##' the raw data is only utilized for the robust adjustment to SE and chi-sq,
##' and the net function only checks the unadjusted chi-sq for the purposes of
##' testing nesting/equivalence. This method also applies to models for
##' categorical data, following the procedure described by Asparouhov & Muthen
##' (2019).
##'
##'
##' @importFrom lavaan lavInspect
##'
##' @param \dots The `lavaan` objects used to test nesting and
##' equivalence
##' @param crit The upper-bound criterion for testing the equivalence of models.
##' Models are considered nested (or equivalent) if the difference between
##' their \eqn{\chi^2} fit statistics is less than this criterion.
##'
##' @return The [Net-class] object representing the outputs for nesting
##' and equivalence testing, including a logical matrix of test results and a
##' vector of degrees of freedom for each model.
##'
##' @author
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @references
##'
##' Bentler, P. M., & Satorra, A. (2010). Testing model nesting and equivalence.
##' *Psychological Methods, 15*(2), 111--123. \doi{10.1037/a0019625}
##'
##' Asparouhov, T., & Muthen, B. (2019). Nesting and equivalence testing for
##' structural equation models. *Structural Equation Modeling, 26*(2),
##' 302--309. \doi{10.1080/10705511.2018.1513795}
##'
##' @examples
##'
##' m1 <- ' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + x8 + x9 '
##'
##'
##' m2 <- ' f1 =~ x1 + x2 + x3 + x4
##' f2 =~ x5 + x6 + x7 + x8 + x9 '
##'
##' m3 <- ' visual =~ x1 + x2 + x3
##' textual =~ eq*x4 + eq*x5 + eq*x6
##' speed =~ x7 + x8 + x9 '
##'
##' fit1 <- cfa(m1, data = HolzingerSwineford1939)
##' fit1a <- cfa(m1, data = HolzingerSwineford1939, std.lv = TRUE) # Equivalent to fit1
##' fit2 <- cfa(m2, data = HolzingerSwineford1939) # Not equivalent to or nested in fit1
##' fit3 <- cfa(m3, data = HolzingerSwineford1939) # Nested in fit1 and fit1a
##'
##' \donttest{
##' tests <- net(fit1, fit1a, fit2, fit3)
##' tests
##' summary(tests)
##' }
##'
##' @export
net <- function(..., crit = .0001) {
## put fitted objects in a list
fitList <- list(...)
## check that they are all lavaan objects
notLavaan <- !sapply(fitList, inherits, what = "lavaan")
if (any(notLavaan)) {
fitNames <- sapply(as.list(substitute(list(...)))[-1], deparse)
stop(paste("The following arguments are not fitted lavaan objects:\n",
paste(fitNames[notLavaan], collapse = "\t")))
}
## remove any that did not converge
nonConv <- !sapply(fitList, lavInspect, what = "converged")
if (all(nonConv)) {
stop('No models converged')
} else if (any(nonConv)) {
fitNames <- sapply(as.list(substitute(list(...)))[-1], deparse)
message('The following models did not converge, so they are ignored:\n',
paste(fitNames[nonConv], collapse = ",\t"))
fitList <- fitList[which(!nonConv)]
}
## check for meanstructure
meanstructure <- sapply(fitList, function(x) lavInspect(x, "options")$meanstructure)
if (!(all(meanstructure) || !any(meanstructure)))
stop('Some (but not all) fitted lavaan objects include a mean structure. ',
'Please re-fit all models with the argument meanstructure=TRUE.')
## get degrees of freedom for each model
DFs <- sapply(fitList, function(x) lavInspect(x, "fit")["df"])
## name according to named objects, with DF in parentheses
fitNames <- names(fitList)
dotNames <- sapply(as.list(substitute(list(...)))[-1], deparse)
if (any(nonConv)) dotNames <- dotNames[which(!nonConv)]
if (is.null(names(fitList))) {
fitNames <- dotNames
} else {
noName <- which(fitNames == "")
fitNames[noName] <- dotNames[noName]
}
names(fitList) <- paste(fitNames, " (df = ", DFs, ")", sep = "")
## sort list according to DFs
fitList <- fitList[order(DFs)]
fitNames <- fitNames[order(DFs)]
orderedDFs <- DFs[order(DFs)]
## create structure for sequence of tests (logical matrix), FALSE by default
nestMat <- matrix(FALSE, length(fitList), length(fitList),
dimnames = list(names(fitList), fitNames))
diag(nestMat) <- TRUE # every model is equivalent with itself
## Loop through sorted models in sequence of most to least restricted model
for (R in 2:nrow(nestMat)) {
for (C in (R - 1):1) {
## test for nesting/equivalence
nestMat[R, C] <- x.within.y(x = fitList[[R]], y = fitList[[C]], crit = crit)
## if models are equivalent, set above-diagonal value to TRUE
if (identical(orderedDFs[R], orderedDFs[C])) nestMat[C, R] <- nestMat[R, C]
if (C == 1) next # to prevent the next 2 tests from returning an error
## if model didn't converge (logical value is missing), go to next iteration
if (is.na(nestMat[R, C]) | is.na(nestMat[R - 1, C - 1])) next
## check whether nesting is implied, to skip unnecessary tests
if (nestMat[R, C] & nestMat[R - 1, C - 1]) {
nestMat[R, C - 1] <- TRUE
next
}
}
}
out <- new("Net", test = nestMat, df = orderedDFs)
out
}
## --------------------------------------------------------------------
## Hidden Function to test whether model "x" is nested within model "y"
## --------------------------------------------------------------------
#' @importFrom lavaan lavInspect lavNames parTable
x.within.y <- function(x, y, crit = .0001) {
if (!lavInspect(x, "converged")) return(NA)
if (!lavInspect(y, "converged")) return(NA)
## not possible for clustered data
if (length(lavInspect(x, "cluster")) || length(lavInspect(y, "cluster")))
stop('The net() function does not work with models for clustered data.')
## not currently implemented unless all variables are considered random
exoX <- lavInspect(x, "options")$fixed.x & length(lavNames(x, "ov.x"))
exoY <- lavInspect(y, "options")$fixed.x & length(lavNames(y, "ov.x"))
if (exoX | exoY) {
stop(c("The net() function does not work with exogenous variables.\n",
"Fit the model again with 'fixed.x = FALSE'"))
}
## variable names
Xnames <- sort(lavNames(x))
Ynames <- sort(lavNames(y))
if (!identical(Xnames, Ynames))
stop("Models do not contain the same variables")
## check that the analyzed data matches
xData <- sort(unlist(lavInspect(x, "sampstat")))
yData <- sort(unlist(lavInspect(y, "sampstat")))
names(xData) <- NULL
names(yData) <- NULL
if (!isTRUE(all.equal(xData, yData, tolerance = crit)))
stop("Sample statistics differ. Models must apply to the same data")
#FIXME: this method requires raw data
# xData <- lavInspect(x, "data")
# if (is.list(xData)) xData <- do.call(rbind, xData)
# xData <- xData[ , order(Xnames)]
# yData <- lavInspect(y, "data")
# if (is.list(yData)) yData <- do.call(rbind, yData)
# yData <- yData[ , order(Ynames)]
# if (!identical(xData, yData)) stop("Models must apply to the same data")
## check degrees of freedom support nesting structure
if (lavInspect(x, "fit")["df"] < lavInspect(y, "fit")["df"])
stop("x cannot be nested within y because y is more restricted than x")
## check sample sizes
N <- lavInspect(x, "nobs")
if (!all(N == lavInspect(y, "nobs"))) {
stop("Sample sizes differ. Models must apply to the same data")
}
## model-implied moments
Sigma <- lavInspect(x, "cov.ov")
nBlocks <- if (is.list(Sigma)) length(Sigma) else 1L
## mean structure?
Mu <- lavInspect(x, "mean.ov")
if (nBlocks == 1L) {
if (!length(Mu)) Mu <- NULL
} else {
if (all(sapply(Mu, length) == 0)) Mu <- NULL
}
## thresholds?
Thr <- lavInspect(x, "thresholds")
if (nBlocks == 1L) {
if (!length(Thr)) Thr <- NULL
} else {
if (all(sapply(Thr, length) == 0)) Thr <- NULL
}
if (!is.null(Thr)) attr(Thr, "th.idx") <- lavInspect(x, "th.idx")
## If DWLS, extract WLS.V and NACOV
estimator <- lavInspect(x, "options")$estimator
if (estimator == "DWLS") {
WLS.V <- lavInspect(x, "WLS.V")
NACOV <- lavInspect(x, "gamma")
#TODO: check against same output from y
} else {
WLS.V <- NULL
NACOV <- NULL
}
## fit model and check that chi-squared < crit
PT <- parTable(y)
PT$start <- PT$est
PT$est <- PT$se <- NULL
CALL <- lavInspect(y, "call")
CALL$model <- PT
CALL$data <- NULL
CALL$sample.cov <- Sigma
CALL$sample.mean <- Mu
CALL$sample.nobs <- N
CALL$sample.th <- Thr
CALL$estimator <- estimator
CALL$WLS.V <- WLS.V
CALL$NACOV <- NACOV
CALL$se <- "none" # to save time
CALL$test <- "standard"
CALL$group.label <- NULL # unnecessary, and causes an error when is.integer()
suppressWarnings(try( newFit <- eval(as.call(CALL)) ))
if (!lavInspect(newFit, "converged")) return(NA) else {
result <- lavInspect(newFit, "fit")[["chisq"]] < crit
if (lavInspect(x, "fit")["df"] ==
lavInspect(y, "fit")["df"]) return(c(Equivalent = result))
}
c(Nested = result)
}
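## Example of direct use (hypothetical; reuses fit1 and fit3 from the net()
## examples above, where fit3 is a constrained version of fit1):
# x.within.y(x = fit3, y = fit1) # c(Nested = TRUE) when chi-squared < crit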
semTools/R/splitSample.R 0000644 0001762 0000144 00000013304 14632016456 014710 0 ustar ligges users ### Corbin Quick
### Last updated: 4 April 2017
#' Randomly Split a Data Set into Halves
#'
#' This function randomly splits a data set into two halves, and saves the
#' resulting data sets to the same folder as the original.
#'
#' This function randomly orders the rows of a data set, divides the data set
#' into two halves, and saves the halves to the same folder as the original
#' data set, preserving the original formatting. Data set type (*.csv or *.dat)
#' and formatting (headers) are automatically detected, and output data sets
#' will preserve input type and formatting unless specified otherwise. Input
#' can be in the form of a file path (*.dat or *.csv), or an R object (matrix or
#' dataframe). If input is an R object and path is default, output data sets
#' will be returned as a list object.
#'
#'
#' @importFrom stats runif
#'
#' @param dataset The original data set to be divided. Can be a file path to a
#' *.csv or *.dat file (headers will automatically be detected) or an R object
#' (matrix or dataframe). (Windows users: file path must be specified using
#' FORWARD SLASHES (`/`) ONLY.)
#' @param path File path to folder for output data sets. NOT REQUIRED if
#' dataset is a filename. Specify ONLY if dataset is an R object, or desired
#' output folder is not that of original data set. If path is specified as
#' "object", output data sets will be returned as a list, and not saved to hard
#' drive.
#' @param div Number of output data sets. NOT REQUIRED if default, 2 halves.
#' @param type Output file format ("dat" or "csv"). NOT REQUIRED unless desired
#' output formatting differs from that of input, or dataset is an R object and
#' csv formatting is desired.
#' @param name Output file name. NOT REQUIRED unless desired output name
#' differs from that of input, or input dataset is an R object. (If input is an
#' R object and name is not specified, name will be "splitSample".)
#' @return If `path = "object"`, `list` of output data sets.
#' Otherwise, output will saved to hard drive in the same format as input.
#' @author Corbin Quick (University of Michigan; \email{corbinq@@umich.edu})
#' @examples
#'
#' #### Input is .dat file
#' #splitSample("C:/Users/Default/Desktop/MYDATA.dat")
#' #### Output saved to "C:/Users/Default/Desktop/" in .dat format
#' #### Names are "MYDATA_s1.dat" and "MYDATA_s2.dat"
#'
#' #### Input is R object
#' ## Split C02 dataset from the datasets package
#' library(datasets)
#' splitMyData <- splitSample(CO2, path = "object")
#' summary(splitMyData[[1]])
#' summary(splitMyData[[2]])
#' #### Output object splitMyData becomes list of output data sets
#'
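#' #### The div= argument yields more than two subsamples, e.g., thirds
#' splitMyData3 <- splitSample(CO2, path = "object", div = 3)
#' length(splitMyData3) # 3
#'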
#' #### Input is .dat file in "C:/" folder
#' #splitSample("C:/testdata.dat", path = "C:/Users/Default/Desktop/", type = "csv")
#' #### Output saved to "C:/Users/Default/Desktop/" in *.csv format
#' #### Names are "testdata_s1.csv" and "testdata_s2.csv"
#'
#' #### Input is R object
#' #splitSample(myData, path = "C:/Users/Default/Desktop/", name = "splitdata")
#' #### Output saved to "C:/Users/Default/Desktop/" in *.dat format
#' #### Names are "splitdata_s1.dat" and "splitdata_s2.dat"
#'
#' @export
splitSample <- function(dataset, path = "default", div = 2,
type = "default", name = "splitSample") {
type1 <- type
hea <- FALSE
file <- dataset
if (is.character(file)) {
temp <- strsplit(file, "/", fixed = TRUE)
if (path == "default") {
path <- paste(temp[[1]][1:(length(temp[[1]]) - 1)], "/",
sep = "", collapse = "")
}
fileN <- temp[[1]][length(temp[[1]])]
temp <- strsplit(fileN, ".", fixed = TRUE)
type <- temp[[1]][2]
name <- temp[[1]][1]
if (type == "dat") {
if (is.numeric(as.matrix(utils::read.table(file, nrows = 1))) == FALSE) {
data <- as.matrix(utils::read.table(file, header = TRUE))
hea = TRUE
} else {
data <- as.matrix(utils::read.table(file))
}
}
if (type == "csv") {
if (is.numeric(as.matrix(utils::read.table(file, nrows = 1))) == FALSE) {
data <- as.matrix(utils::read.csv(file, header = TRUE))
hea = TRUE
} else {
data <- as.matrix(utils::read.csv(file))
}
}
} else {
if (is.matrix(file) | is.data.frame(file)) {
data <- as.matrix(file)
} else {
stop("Provide data in *.dat or *.csv format")
}
}
if (type1 != "default") {
type <- type1
}
if (is.character(colnames(data))) {
hea <- TRUE
}
random <- runif(nrow(data), 1, nrow(data))
data <- cbind(random, data)
data <- data[order(random), ]
data <- data[, 2:ncol(data)]
size <- split((1:nrow(data)), cut((1:nrow(data)), div, labels = FALSE))
size <- as.matrix(as.data.frame(lapply(size, length)))
dataL <- list()
dataL[[1]] <- data[1:size[1, 1], ]
for (i in 2:div) {
size[1, i] <- size[1, (i - 1)] + size[1, i]
dataL[[i]] <- data[(size[1, (i - 1)] + 1):size[1, i], ]
}
if (path == "default") {
return(dataL)
} else {
if (path == "object") {
return(dataL)
} else {
for (i in 1:div) {
if (type == "dat") {
utils::write.table(dataL[[i]],
paste(path, name, "_s", i, ".dat", sep = ""),
sep = " ", row.names = FALSE, col.names = hea)
}
if (type == "csv") {
utils::write.table(dataL[[i]],
paste(path, name, "_s", i, ".csv", sep = ""),
sep = ",", row.names = FALSE, col.names = hea)
}
if (type == "default") {
utils::write.table(dataL[[i]],
paste(path, name, "_s", i, ".dat", sep = ""),
sep = " ", row.names = FALSE, col.names = hea)
}
}
}
}
}
semTools/R/aa_semTools-deprecated.R 0000644 0001762 0000144 00000001011 14632016456 016747 0 ustar ligges users ### Terrence D. Jorgensen
### Last updated 25 August 2018
### automatically create documentation for "deprecated" help page
#' @title Deprecated functions in package \pkg{semTools}.
#' @description The functions listed below are deprecated and will be defunct
#' in the near future. When possible, alternative functions with similar
#' functionality are also mentioned. Help pages for deprecated functions are
#' available at `help("semTools-deprecated")`.
#' @name semTools-deprecated
#' @keywords internal
NULL
semTools/R/longInvariance.R 0000644 0001762 0000144 00000061625 14753124664 015370 0 ustar ligges users ### Sunthud Pornprasertmanit & Yves Rosseel
### Last updated: 10 January 2021
##' Measurement Invariance Tests Within Person
##'
##' Testing measurement invariance across timepoints (longitudinal) or any
##' context involving the use of the same scale in one case (e.g., a dyad case
##' with husband and wife answering the same scale). The measurement invariance
##' uses a typical sequence of model comparison tests. This function currently
##' works with only one scale, and only with continuous indicators.
##'
##' If `strict = FALSE`, the following four models are tested in order:
##' \enumerate{
##' \item Model 1: configural invariance. The same factor structure is
##' imposed on all units.
##' \item Model 2: weak invariance. The factor loadings are constrained to be
##' equal across units.
##' \item Model 3: strong invariance. The factor loadings and intercepts are
##' constrained to be equal across units.
##' \item Model 4: The factor loadings, intercepts and means are constrained to
##' be equal across units.
##' }
##'
##' Each time a more restricted model is fitted, a \eqn{\Delta\chi^2} test is
##' reported, comparing the current model with the previous one, and comparing
##' the current model to the baseline model (Model 1). In addition, the
##' difference in CFI is also reported (\eqn{\Delta}CFI).
##'
##' If `strict = TRUE`, the following five models are tested in order:
##'
##' \enumerate{
##' \item Model 1: configural invariance. The same factor structure is imposed
##' on all units.
##' \item Model 2: weak invariance. The factor loadings are constrained to be
##' equal across units.
##' \item Model 3: strong invariance. The factor loadings and intercepts are
##' constrained to be equal across units.
##' \item Model 4: strict invariance. The factor loadings, intercepts and
##' residual variances are constrained to be equal across units.
##' \item Model 5: The factor loadings, intercepts, residual variances and
##' means are constrained to be equal across units.
##' }
##'
##' Note that if the \eqn{\chi^2} test statistic is scaled (e.g., a Satorra-Bentler
##' or Yuan-Bentler test statistic), a special version of the \eqn{\Delta\chi^2}
##' test is used (see the `method` argument, which defaults to
##' `"satorra.bentler.2001"`).
##'
##'
##' @param model lavaan syntax or parameter table
##' @param varList A list containing indicator names of factors used in the
##' invariance testing, such as the list that the first element is the vector
##' of indicator names in the first timepoint and the second element is the
##' vector of indicator names in the second timepoint. The order of indicator
##' names should be the same (but measured in different times or different
##' units).
##' @param auto The order of autocorrelation among measurement errors of the
##' same items across factors (e.g., Item 1 in Time 1 and Time 2). If 0 is
##' specified, no autocorrelation is imposed. If 1 is specified,
##' autocorrelations are imposed between adjacent factors listed in
##' `varList`. The maximum order that can be specified is the number of
##' factors minus 1. If `"all"` is specified, the maximum
##' order will be used.
##' @param constrainAuto If `TRUE`, the function will constrain the
##' auto-*covariances* of the same item to be equal across factors.
##' For example, the covariance of Item 1 in Time 1 and Time 2 is equal to
##' the covariance of Item 1 in Time 2 and Time 3.
##' @param fixed.x See [lavaan::lavOptions()].
##' @param std.lv See [lavaan::lavOptions()].
##' @param group See [lavaan::lavaan()].
##' @param group.equal See [lavaan::lavOptions()].
##' @param group.partial See [lavaan::lavOptions()].
##' @param strict If `TRUE`, the sequence requires strict invariance. See
##' details for more information.
##' @param warn See [lavaan::lavOptions()].
##' @param debug See [lavaan::lavOptions()].
##' @param quiet If `FALSE` (default), a summary is printed out containing
##' an overview of the different models that are fitted, together with some
##' model comparison tests. If `TRUE`, no summary is printed.
##' @param fit.measures Fit measures used to calculate the differences between
##' nested models.
##' @param baseline.model custom baseline model passed to
##' [lavaan::fitMeasures()]
##' @param method The method used to calculate the likelihood ratio test. See
##' [lavaan::lavTestLRT()] for available options
##' @param ... Additional arguments in the [lavaan::lavaan()]
##' function. See also [lavaan::lavOptions()]
##'
##' @return Invisibly, all model fits in the sequence are returned as a list.
##'
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' Yves Rosseel (Ghent University; \email{Yves.Rosseel@@UGent.be})
##'
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @references Vandenberg, R. J., and Lance, C. E. (2000). A review and
##' synthesis of the measurement invariance literature: Suggestions,
##' practices, and recommendations for organizational research.
##' *Organizational Research Methods, 3*(1), 4--70.
##' \doi{10.1177/109442810031002}
##'
##' @examples
##'
##' model <- ' f1t1 =~ y1t1 + y2t1 + y3t1
##' f1t2 =~ y1t2 + y2t2 + y3t2
##' f1t3 =~ y1t3 + y2t3 + y3t3 '
##'
##' ## Create list of variables
##' var1 <- c("y1t1", "y2t1", "y3t1")
##' var2 <- c("y1t2", "y2t2", "y3t2")
##' var3 <- c("y1t3", "y2t3", "y3t3")
##' constrainedVar <- list(var1, var2, var3)
##'
##' ## Invariance of the same factor across timepoints
##' longInvariance(model, auto = 1, constrainAuto = TRUE,
##' varList = constrainedVar, data = exLong)
##'
##' ## Invariance of the same factor across timepoints and groups
##' longInvariance(model, auto = 1, constrainAuto = TRUE,
##' varList = constrainedVar, data = exLong, group = "sex",
##' group.equal = c("loadings", "intercepts"))
##'
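##' ## Strict invariance sequence (adds a model with equal residual variances)
##' longInvariance(model, auto = 1, constrainAuto = TRUE, strict = TRUE,
##' varList = constrainedVar, data = exLong)
##'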
##' @name longInvariance-deprecated
##' @usage
##' longInvariance(model, varList, auto = "all", constrainAuto = FALSE,
##' fixed.x = TRUE, std.lv = FALSE, group = NULL,
##' group.equal = "", group.partial = "", strict = FALSE,
##' warn = TRUE, debug = FALSE, quiet = FALSE,
##' fit.measures = "default", baseline.model = NULL,
##' method = "satorra.bentler.2001", ...)
##' @seealso [semTools-deprecated()]
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##'
##' @export
longInvariance <- function(model, varList, auto = "all", constrainAuto = FALSE,
fixed.x = TRUE, std.lv = FALSE, group = NULL,
group.equal = "", group.partial = "", strict = FALSE,
warn = TRUE, debug = FALSE, quiet = FALSE,
fit.measures = "default", baseline.model = NULL,
method = "satorra.bentler.2001", ...) {
.Deprecated(msg = c("The longInvariance function is deprecated, and ",
"it will cease to be included in future versions of ",
"semTools. See help('semTools-deprecated') for details."))
List <- list(...)
# Find the number of groups
ngroups <- 1
if (!is.null(group)) {
if (!is.null(List$data)) {
ngroups <- length(unique(List$data[,group]))
} else if (!is.null(List$sample.cov)) {
ngroups <- length(List$sample.cov)
} else {
stop("Cannot find the specifying variable name in the 'group' argument.")
}
}
# Get the lavaan parameter table
if (is.character(model)) {
lavaanParTable <-
lavaan::lavaanify(model = model,
meanstructure = TRUE,
int.ov.free = TRUE,
int.lv.free = FALSE,
orthogonal = FALSE,
fixed.x = fixed.x,
std.lv = std.lv,
auto.fix.first = ifelse(std.lv, FALSE, TRUE),
auto.fix.single = TRUE,
auto.var = TRUE,
auto.cov.lv.x = TRUE,
auto.cov.y = TRUE,
ngroups = ngroups,
group.equal = group.equal,
group.partial = group.partial,
debug = debug,
warn = warn,
as.data.frame. = TRUE)
} else if (is.list(model)) {
if (!is.null(model$lhs) && !is.null(model$op) &&
!is.null(model$rhs) && !is.null(model$free)) {
lavaanParTable <- model
} else if (is.character(model[[1]])) {
stop("lavaan ERROR: model is a list, but not a parameterTable?")
}
} else {
cat("model type: ", class(model), "\n")
stop("lavaan ERROR: model is not of type character or list")
}
# Error checking on the varList argument and get the factor name corresponding to each element of the list
facName <- lapply(varList,
function(vec, pt) pt$lhs[(pt$op == "=~") & (pt$rhs %in% vec)],
pt = lavaanParTable)
if (any(sapply(facName, function(x) length(unique(x)) > 1)))
stop("The factor names of the same element of the 'varList' are not the same.")
if (length(unique(sapply(facName, function(x) length(x)))) > 1)
stop("The numbers of variables in each element are not equal.")
facName <- unlist(lapply(facName, unique))
# Impose the autocorrelation in the parameter table
if (auto != 0) {
if (is.numeric(auto) && auto >= length(varList))
stop("The autocorrelation lag order is not possible given the current number of timepoints.")
if (auto == "all") auto <- length(varList) - 1
for (k in 1:ngroups) {
for (i in 1:length(varList[[1]])) {
name <- sapply(varList, function(x, element) x[element], element = i)
for (j in 1:auto) {
vec <- 1:(length(varList) - j)
lavaanParTable <- freeParTable(lavaanParTable, name[vec], "~~", name[vec + j], k, ustart = NA)
if (constrainAuto & (length(vec) > 1))
lavaanParTable <- constrainParTable(lavaanParTable, name[vec], "~~", name[vec + j], k)
}
}
}
}
# Fit configural invariance
fitConfigural <- try(lavaan::lavaan(lavaanParTable, ..., group = group,
group.equal = group.equal,
group.partial = group.partial,
warn = TRUE, debug = FALSE),
silent = TRUE)
# Create the parameter table for metric invariance
ptMetric <- lavaanParTable
if (std.lv) {
for (k in 1:ngroups) {
# Free variances of factor 2, 3, ...
ptMetric <- freeParTable(ptMetric, facName[-1], "~~", facName[-1], k, ustart = NA)
# Constrain factor loadings
for (i in 1:length(varList[[1]])) {
ptMetric <- constrainParTable(ptMetric, facName, "=~", sapply(varList, function(x, element) x[element], element = i), k)
}
}
ptMetric$ustart[(ptMetric$op == "=~") & (ptMetric$rhs %in% sapply(varList, function(x, element) x[element], element = 1))] <- 1
} else {
for (k in 1:ngroups) {
# Constrain factor loadings but keep marker variables
for (i in 2:length(varList[[1]])) {
ptMetric <- constrainParTable(ptMetric, facName, "=~", sapply(varList, function(x, element) x[element], element = i), k)
}
}
}
fitMetric <- try(lavaan::lavaan(ptMetric, ..., group = group,
group.equal = group.equal,
group.partial = group.partial,
warn = TRUE, debug = FALSE),
silent = TRUE)
# Create the parameter table for scalar invariance
ptScalar <- ptMetric
for (k in 1:ngroups) {
# Free means of factors 2, 3, ...
ptScalar <- freeParTable(ptScalar, facName[-1], "~1", "", k, ustart = NA)
# Constrain measurement intercepts
for (i in 1:length(varList[[1]])) {
ptScalar <- constrainParTable(ptScalar, sapply(varList, function(x, element) x[element], element = i), "~1", "", k)
}
}
ptScalar$ustart[(ptScalar$op == "~1") & (ptScalar$lhs %in% facName)] <- 0
fitScalar <- try(lavaan::lavaan(ptScalar, ..., group = group,
group.equal = group.equal,
group.partial = group.partial,
warn = TRUE, debug = FALSE),
silent = TRUE)
ptMeans <- ptScalar
# Create the parameter table for strict invariance if specified
ptStrict <- ptScalar
fitStrict <- NULL
if (strict) {
ptStrict <- ptScalar
for (k in 1:ngroups) {
# Constrain measurement error variances
for (i in 1:length(varList[[1]])) {
name <- sapply(varList, function(x, element) x[element], element = i)
ptStrict <- constrainParTable(ptStrict, name, "~~", name, k)
}
}
fitStrict <- try(lavaan::lavaan(ptStrict, ..., group = group,
group.equal = group.equal,
group.partial = group.partial,
warn = TRUE, debug = FALSE),
silent = TRUE)
ptMeans <- ptStrict
}
# Create the parameter table for mean equality
# Constrain factor means to be equal
for (k in 1:ngroups) {
ptMeans <- fixParTable(ptMeans, facName[-1], "~1", "", k, ustart = 0)
}
fitMeans <- try(lavaan::lavaan(ptMeans, ..., group = group,
group.equal = group.equal,
group.partial = group.partial,
warn = TRUE, debug = FALSE),
silent = TRUE)
FIT <- invisible(list(fit.configural = fitConfigural, fit.loadings = fitMetric,
fit.intercepts = fitScalar, fit.residuals = fitStrict,
fit.means = fitMeans))
FIT <- FIT[!sapply(FIT, is.null)]
if (!quiet) printInvarianceResult(FIT, fit.measures, baseline.model, method)
# Modify these functions from measurementInvariance function
# if(!quiet) {
# cat("\n#################### Measurement invariance tests ####################\n")
# cat("\nThe order of autocorrelation: ", auto, "\n")
# cat("\n#################### Model 1: configural invariance:\n")
# printFitLine(fitConfigural)
# cat("\n#################### Model 2: weak invariance (equal loadings):\n")
# printFitLine(fitMetric)
# cat("\n[Model 1 versus model 2]\n")
# difftest(fitConfigural, fitMetric)
# cat("\n#################### Model 3: strong invariance (equal loadings + intercepts):\n")
# printFitLine(fitScalar)
# cat("\n[Model 1 versus model 3]\n")
# difftest(fitConfigural, fitScalar)
# cat("\n[Model 2 versus model 3]\n")
# difftest(fitMetric, fitScalar)
# if(strict) {
# cat("\n#################### Model 4: strict invariance (equal loadings + intercepts + residuals):\n")
# printFitLine(fitStrict)
# cat("\n[Model 1 versus model 4]\n")
# difftest(fitConfigural, fitStrict)
# cat("\n[Model 2 versus model 4]\n")
# difftest(fitMetric, fitStrict)
# cat("\n[Model 3 versus model 4]\n")
# difftest(fitScalar, fitStrict)
# cat("\n#################### Model 5: equal loadings + intercepts + residuals + means:\n")
# printFitLine(fitMeans, horizontal=TRUE)
# cat("\n[Model 1 versus model 5]\n")
# difftest(fitConfigural, fitMeans)
# cat("\n[Model 2 versus model 5]\n")
# difftest(fitMetric, fitMeans)
# cat("\n[Model 3 versus model 5]\n")
# difftest(fitScalar, fitMeans)
# cat("\n[Model 4 versus model 5]\n")
# difftest(fitStrict, fitMeans)
# } else {
# cat("\n#################### Model 4: equal loadings + intercepts + means:\n")
# printFitLine(fitMeans)
# cat("\n[Model 1 versus model 4]\n")
# difftest(fitConfigural, fitMeans)
# cat("\n[Model 2 versus model 4]\n")
# difftest(fitMetric, fitMeans)
# cat("\n[Model 3 versus model 4]\n")
# difftest(fitScalar, fitMeans)
# }
# }
# return(invisible(list(fit.configural = fitConfigural, fit.loadings = fitMetric, fit.intercepts = fitScalar, fit.residuals = fitStrict, fit.means = fitMeans)))
invisible(FIT)
}
## ----------------
## Hidden Functions
## ----------------
# rearrangeFreeElement: Rearrange the number listed in 'free' in parameter tables
rearrangeFreeElement <- function(vec) {
vec2 <- vec
vec <- vec[vec != 0]
uvec <- unique(vec)
newvec <- 1:length(unique(vec))
vec2[vec2 != 0] <- newvec[match(vec, uvec)]
class(vec2) <- "integer"
vec2
}
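## Illustration (hypothetical input, not run): free-parameter numbers are
## renumbered consecutively; zeros (fixed) and ties (equalities) are preserved:
# rearrangeFreeElement(c(0L, 3L, 0L, 3L, 7L))
# [1] 0 1 0 1 2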
# rearrangept: Rearrange parameter table and plabel
rearrangept <- function(pt) {
createplabel <- function(num) {
result <- paste0(".p", num, ".")
result[num == 0] <- ""
result
}
oldfree <- pt$free
newfree <- rearrangeFreeElement(oldfree)
oldplabel <- pt$plabel
newplabel <- createplabel(seq_along(pt$op))
eqpos <- which(pt$op == "==")
newplabel[eqpos] <- ""
if (length(eqpos) > 0) {
eqlhs <- pt$lhs[eqpos]
eqrhs <- pt$rhs[eqpos]
matchlhs <- match(eqlhs, oldplabel)
matchrhs <- match(eqrhs, oldplabel)
neweqlhs <- newplabel[matchlhs]
neweqrhs <- newplabel[matchrhs]
neweqlhs[is.na(matchlhs)] <- eqlhs[is.na(matchlhs)]
neweqrhs[is.na(matchrhs)] <- eqrhs[is.na(matchrhs)]
pt$lhs[eqpos] <- neweqlhs
pt$rhs[eqpos] <- neweqrhs
}
pt$free <- newfree
pt$plabel <- newplabel
pt
}
# freeParTable: Free elements in parameter table
#FIXME: used in singleParamTest and partialInvariance
freeParTable <- function(parTable, lhs, op, rhs, group, ustart = NA) {
parTable$start <- parTable$est <- parTable$se <- NULL
target <- cbind(lhs, op, rhs, group)
for (i in 1:nrow(target)) {
targetElem <- matchElement(parTable = parTable, vec = target[i,])
ptargetElem <- parTable$plabel[targetElem]
if ((length(targetElem) == 0) || is.na(targetElem)) {
newline <- list(lhs = as.character(target[i, 1]),
op = as.character(target[i, 2]),
rhs = as.character(target[i, 3]),
group = as.integer(target[i, 4]),
free = as.integer(max(parTable$free) + 1),
ustart = as.numeric(NA))
parTable <- patMerge(pt1 = parTable, pt2 = newline)
} else {
if (parTable$free[targetElem] == 0) {
parTable$ustart[targetElem] <- ustart
parTable$user[targetElem] <- 1
parTable$free[targetElem] <- max(parTable$free) + 1
}
equalelement <- which(parTable$op == "==")
rmelem <- intersect(union(match(ptargetElem, parTable$lhs),
match(ptargetElem, parTable$rhs)),
equalelement)
if (length(rmelem) > 0) parTable <- removeEqCon(parTable, rmelem)
}
}
parTable <- rearrangept(parTable)
parTable
}
# fixParTable: Fix elements in parameter table
#FIXME: used in singleParamTest and partialInvariance
fixParTable <- function(parTable, lhs, op, rhs, group, ustart = NA) {
parTable$start <- parTable$est <- parTable$se <- NULL
target <- cbind(lhs, op, rhs, group)
element <- apply(target, 1, matchElement, parTable=parTable)
for (i in 1:nrow(target)) {
## Why was Sunthud printing warnings? (originally used warnings(), not warning()...)
# if (parTable$free[element[i]] == 0) warning('The parameter ', lhs, op, rhs,
# ' in group ', group,
# ' is already fixed.')
# equalelement <- which(parTable$op == "==")
# targetElem <- matchElement(parTable = parTable, vec = target[i,])
# ptargetElem <- parTable$plabel[targetElem]
# rmelem <- intersect(union(match(ptargetElem, parTable$lhs), match(ptargetElem, parTable$rhs)), equalelement)
# if(length(rmelem) > 0) parTable <- removeEqCon(parTable, rmelem)
parTable$ustart[element[i]] <- ustart
parTable$user[element[i]] <- 1
parTable$free[element[i]] <- 0
}
parTable <- rearrangept(parTable)
# rearrangePlabel with change all equality constraints
parTable
}
# constrainParTable: Impose equality constraints in any set of elements in the parameter table
#FIXME: used in partialInvariance
constrainParTable <- function(parTable, lhs, op, rhs, group) {
parTable$start <- parTable$est <- parTable$se <- NULL
target <- cbind(lhs, op, rhs, group)
element <- apply(target, 1, matchElement, parTable=parTable)
# id lhs op rhs user group free ustart exo label plabel start
for (i in 2:length(element)) {
len <- length(parTable$id)
newline <- list(lhs = parTable$plabel[element[1]], op = "==",
rhs = parTable$plabel[element[i]])
if (!any(parTable$lhs == newline$lhs & parTable$op == newline$op &
parTable$rhs == newline$rhs)) parTable <- patMerge(pt1 = parTable, pt2 = newline)
}
parTable
}
# matchElement: Find the number of row that have the specification in vec (lhs, op, rhs, group)
#FIXME: used in partialInvariance
matchElement <- function(parTable, vec) {
if (is.null(parTable$group)) {
return(which((parTable$lhs == vec[1]) & (parTable$op == vec[2]) & (parTable$rhs == vec[3])))
} else {
return(which((parTable$lhs == vec[1]) & (parTable$op == vec[2]) & (parTable$rhs == vec[3]) & (parTable$group == vec[4])))
}
}
#FIXME: used in partialInvariance
getValue <- function(parTable, est, lhs, op, rhs, group) {
target <- cbind(lhs, op, rhs, group)
element <- apply(target, 1, matchElement, parTable = parTable)
free <- parTable$free[element]
out <- parTable$ustart[element]
out[free != 0] <- est[free[free != 0]]
out
}
# removeEqCon: Remove equality constraints
#FIXME: used in singleParamTest
removeEqCon <- function(pt, element) {
pt <- lapply(pt, "[", -element)
pt$id <- seq_along(pt$id)
pt
}
#FIXME: used in singleParamTest
patMerge <- function (pt1 = NULL, pt2 = NULL, remove.duplicated = FALSE,
fromLast = FALSE, warn = TRUE) {
pt1 <- as.data.frame(pt1, stringsAsFactors = FALSE)
pt2 <- as.data.frame(pt2, stringsAsFactors = FALSE)
stopifnot(!is.null(pt1$lhs), !is.null(pt1$op), !is.null(pt1$rhs),
!is.null(pt2$lhs), !is.null(pt2$op), !is.null(pt2$rhs))
if (is.null(pt1$group) && is.null(pt2$group)) {
TMP <- rbind(pt1[, c("lhs", "op", "rhs")],
pt2[, c("lhs", "op", "rhs")])
}
else {
if (is.null(pt1$group) && !is.null(pt2$group)) {
pt1$group <- rep(1L, length(pt1$lhs))
}
else if (is.null(pt2$group) && !is.null(pt1$group)) {
pt2$group <- rep(1L, length(pt2$lhs))
}
TMP <- rbind(pt1[, c("lhs", "op", "rhs", "group")],
pt2[, c("lhs", "op", "rhs", "group")])
}
if (is.null(pt1$user) && !is.null(pt2$user)) {
pt1$user <- rep(0L, length(pt1$lhs))
}
else if (is.null(pt2$user) && !is.null(pt1$user)) {
pt2$user <- rep(0L, length(pt2$lhs))
}
if (is.null(pt1$free) && !is.null(pt2$free)) {
pt1$free <- rep(0L, length(pt1$lhs))
}
else if (is.null(pt2$free) && !is.null(pt1$free)) {
pt2$free <- rep(0L, length(pt2$lhs))
}
if (is.null(pt1$ustart) && !is.null(pt2$ustart)) {
pt1$ustart <- rep(0, length(pt1$lhs))
}
else if (is.null(pt2$ustart) && !is.null(pt1$ustart)) {
pt2$ustart <- rep(0, length(pt2$lhs))
}
if (is.null(pt1$exo) && !is.null(pt2$exo)) {
pt1$exo <- rep(0L, length(pt1$lhs))
}
else if (is.null(pt2$exo) && !is.null(pt1$exo)) {
pt2$exo <- rep(0L, length(pt2$lhs))
}
if (is.null(pt1$label) && !is.null(pt2$label)) {
pt1$label <- rep("", length(pt1$lhs))
}
else if (is.null(pt2$label) && !is.null(pt1$label)) {
pt2$label <- rep("", length(pt2$lhs))
}
if (is.null(pt1$plabel) && !is.null(pt2$plabel)) {
pt1$plabel <- rep("", length(pt1$lhs))
}
else if (is.null(pt2$plabel) && !is.null(pt1$plabel)) {
pt2$plabel <- rep("", length(pt2$lhs))
}
if (is.null(pt1$start) && !is.null(pt2$start)) {
pt1$start <- rep(as.numeric(NA), length(pt1$lhs))
}
else if (is.null(pt2$start) && !is.null(pt1$start)) {
pt2$start <- rep(as.numeric(NA), length(pt2$lhs))
}
if (!is.null(pt1$est)) pt1$est <- NULL
if (!is.null(pt2$est)) pt2$est <- NULL
if (!is.null(pt1$se)) pt1$se <- NULL
if (!is.null(pt2$se)) pt2$se <- NULL
if (remove.duplicated) {
idx <- which(duplicated(TMP, fromLast = fromLast))
if (length(idx)) {
if (warn) {
warning("lavaan WARNING: duplicated parameters are ignored:\n",
paste(apply(pt1[idx, c("lhs", "op", "rhs")],
1, paste, collapse = " "), collapse = "\n"))
}
if (fromLast) {
pt1 <- pt1[-idx, ]
}
else {
idx <- idx - nrow(pt1)
pt2 <- pt2[-idx, ]
}
}
} else if (!is.null(pt1$start) && !is.null(pt2$start)) {
for (i in 1:length(pt1$lhs)) {
idx <- which(pt2$lhs == pt1$lhs[i] & pt2$op == pt1$op[i] &
pt2$rhs == pt1$rhs[i] & pt2$group == pt1$group[i])
pt2$start[idx] <- pt1$start[i]
}
}
if (is.null(pt1$id) && !is.null(pt2$id)) {
nid <- max(pt2$id)
pt1$id <- (nid + 1L):(nid + nrow(pt1))
}
else if (is.null(pt2$id) && !is.null(pt1$id)) {
nid <- max(pt1$id)
pt2$id <- (nid + 1L):(nid + nrow(pt2))
}
NEW <- base::merge(pt1, pt2, all = TRUE, sort = FALSE)
NEW
}
semTools/R/reliability.R 0000644 0001762 0000144 00000372423 14753073401 014733 0 ustar ligges users ### Sunthud Pornprasertmanit, Terrence D. Jorgensen, Yves Rosseel
### Last updated: 12 February 2025
## -----------------------------
## AVE()
## average variance extracted
## (not a reliability index)
## -----------------------------
##' Calculate average variance extracted
##'
##' Calculate average variance extracted (AVE) per factor from `lavaan` object
##'
##' The average variance extracted (AVE) can be calculated by
##'
##' \deqn{ AVE = \frac{\bold{1}^\prime
##' \textrm{diag}\left(\Lambda\Psi\Lambda^\prime\right)\bold{1}}{\bold{1}^\prime
##' \textrm{diag}\left(\hat{\Sigma}\right) \bold{1}}, }
##'
##' Note that this formula is modified from Fornell & Larcker (1981) in the case
##' that factor variances are not 1. The proposed formula from Fornell & Larcker
##' (1981) assumes that the factor variances are 1. Note that AVE will not be
##' provided for factors consisting of items with dual loadings. AVE is the
##' property of items but not the property of factors. AVE is calculated with
##' polychoric correlations when ordinal indicators are used.
##'
##' @importFrom lavaan lavInspect
##' @importFrom methods getMethod
##'
##' @param object A [lavaan::lavaan-class] or [lavaan.mi::lavaan.mi-class] object,
##' expected to contain only exogenous common factors (i.e., a CFA model).
##' Cross-loadings are not allowed and will result in `NA` for any factor with
##' indicator(s) that cross-load.
##' @param obs.var `logical` indicating whether to compute AVE using
##' observed variances in the denominator. Setting `FALSE` triggers
##' using model-implied variances in the denominator.
##' @param omit.imps `character` vector specifying criteria for omitting
##' imputations from pooled results. Can include any of
##' `c("no.conv", "no.se", "no.npd")`, the first 2 of which are the
##' default setting, which excludes any imputations that did not
##' converge or for which standard errors could not be computed. The
##' last option (`"no.npd"`) would exclude any imputations which
##' yielded a nonpositive definite covariance matrix for observed or
##' latent variables, which would include any "improper solutions" such
##' as Heywood cases. NPD solutions are not excluded by default because
##' they are likely to occur due to sampling error, especially in small
##' samples. However, gross model misspecification could also cause
##' NPD solutions, users can compare pooled results with and without
##' this setting as a sensitivity analysis to see whether some
##' imputations warrant further investigation.
##' @param omit.factors `character` vector naming any common factors
##' modeled in `object` whose indicators' AVE is not of interest.
##' @param dropSingle `logical` indicating whether to exclude factors
##' defined by a single indicator from the returned results. If `TRUE`
##' (default), single indicators will still be included in the `total`
##' column when `return.total = TRUE`.
##' @param return.df `logical` indicating whether to return reliability
##' coefficients in a `data.frame` (one row per group/level), which is
##' possible when every model block includes the same factors (after excluding
##' those in `omit.factors` and applying `dropSingle`).
##'
##' @return `numeric` vector of average variance extracted from indicators
##' per factor. For models with multiple "blocks" (any combination of groups
##' and levels), vectors may be returned as columns in a `data.frame`
##' with additional columns indicating the group/level (see `return.df=`
##' argument description for caveat).
##'
##' @author
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @references
##' Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models
##' with unobservable variables and measurement errors. *Journal of
##' Marketing Research, 18*(1), 39--50. \doi{10.2307/3151312}
##'
##' @seealso [compRelSEM()] for composite reliability estimates
##'
##' @examples
##' data(HolzingerSwineford1939)
##' HS9 <- HolzingerSwineford1939[ , c("x7","x8","x9")]
##' HSbinary <- as.data.frame( lapply(HS9, cut, 2, labels=FALSE) )
##' names(HSbinary) <- c("y7","y8","y9")
##' HS <- cbind(HolzingerSwineford1939, HSbinary)
##'
##' HS.model <- ' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ y7 + y8 + y9 '
##'
##' fit <- cfa(HS.model, data = HS, ordered = c("y7","y8","y9"), std.lv = TRUE)
##'
##' ## works for factors with exclusively continuous OR categorical indicators
##' AVE(fit) # uses observed (or unconstrained polychoric/polyserial) by default
##' AVE(fit, obs.var = FALSE)
##'
##'
##' ## works for multigroup models and for multilevel models (and both)
##' data(Demo.twolevel)
##' ## assign clusters to arbitrary groups
##' Demo.twolevel$g <- ifelse(Demo.twolevel$cluster %% 2L, "type1", "type2")
##' model2 <- ' group: type1
##' level: within
##' fac =~ y1 + L2*y2 + L3*y3
##' level: between
##' fac =~ y1 + L2*y2 + L3*y3
##'
##' group: type2
##' level: within
##' fac =~ y1 + L2*y2 + L3*y3
##' level: between
##' fac =~ y1 + L2*y2 + L3*y3
##' '
##' fit2 <- sem(model2, data = Demo.twolevel, cluster = "cluster", group = "g")
##' AVE(fit2)
##'
##' @export
AVE <- function(object, obs.var = TRUE, omit.imps = c("no.conv","no.se"),
omit.factors = character(0), dropSingle = TRUE,
return.df = TRUE) {
## numbers of blocks
ngroups <- lavInspect(object, "ngroups")
nLevels <- lavInspect(object, "nlevels")
nblocks <- ngroups*nLevels #FIXME: always true?
## labels for groups
if (ngroups > 1L) {
group.label <- lavInspect(object, "group.label")
blk.g.lab <- if (!length(group.label)) paste0("g", 1:ngroups) else group.label
} else {
group.label <- blk.g.lab <- NULL
}
## labels for clusters
if (nLevels > 1L) {
#FIXME? lavInspect(object, "level.label") is always ==
# c("within", lavInspect(object, "cluster"))
PT <- parTable(object)
clus.label <- unique(PT$level)
clus.label <- clus.label[which(clus.label != "")]
clus.label <- clus.label[which(clus.label != 0)]
blk.clus.lab <- if (is.numeric(clus.label)) {
c("within", lavInspect(object, "cluster"))
} else clus.label
} else clus.label <- blk.clus.lab <- NULL
## labels for blocks
if (nblocks > 1L) {
block.label <- paste(rep(blk.g.lab, each = nLevels), blk.clus.lab,
sep = if (ngroups > 1L && nLevels > 1L) "_" else "")
} else block.label <- NULL
## check for categorical
anyCategorical <- lavInspect(object, "categorical")
if (inherits(object, "lavaan")) {
PHI <- lavInspect(object, "cov.lv") # common-factor variance
EST <- lavInspect(object, "est") # to extract loadings
SIGMA <- lavInspect(object, # total variance
ifelse(obs.var, "sampstat", "fitted"))
if (nblocks == 1L) {
PHI <- list(PHI)
LAMBDA <- list(EST$lambda)
SIGMA <- list(SIGMA$cov)
} else {
LAMBDA <- sapply(EST, "[[", i = "lambda", simplify = FALSE)
SIGMA <- sapply(SIGMA, "[[", i = "cov", simplify = FALSE)
}
} else if (inherits(object, "lavaan.mi")) {
if (!"package:lavaan.mi" %in% search()) attachNamespace("lavaan.mi")
useImps <- rep(TRUE, length(object@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(object@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(object@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(object@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(object@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
## common-factor variance
phiList <- object@phiList[useImps]
if (nblocks == 1L) for (i in 1:m) phiList[[i]] <- list(phiList[[i]])
PHI <- list()
for (b in 1:nblocks) {
PHI[[b]] <- Reduce("+", lapply(phiList, "[[", i = b) ) / m
}
## loadings
LAMBDA <- vector("list", nblocks)
if (nblocks == 1L) {
lamList <- lapply(object@coefList[useImps], "[[", i = "lambda")
LAMBDA[[1]] <- Reduce("+", lamList) / length(lamList)
} else {
for (b in 1:nblocks) {
lamList <- lapply(object@coefList[useImps], function(i) i[[b]]$lambda)
LAMBDA[[b]] <- Reduce("+", lamList) / length(lamList)
}
}
## total variance
if (obs.var) {
SIGMA <- vector("list", nblocks)
## loop over blocks to pool saturated-model (observed) matrices
for (b in 1:nblocks) {
covList <- lapply(object@h1List[useImps], function(i) i$implied$cov[[b]])
SIGMA[[b]] <- Reduce("+", covList) / m
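      ## The slot does not contain dimnames, so add them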
rownames(SIGMA[[b]]) <- colnames(SIGMA[[b]]) <- lavNames(object, block = b)
}
} else {
## pooled model-implied matrices
if (nblocks == 1L) {
SIGMA <- getMethod("fitted", class(object))(object)["cov"] # retain list format
} else {
SIGMA <- sapply(getMethod("fitted", class(object))(object),
"[[", "cov", simplify = FALSE)
}
}
} # end lavaan vs. lavaan.mi conditional
## scale polychoric/polyserial to modeled LRV scale
if (anyCategorical) {
SDs <- sapply(getScales(object, omit.imps = omit.imps),
FUN = function(x) diag(1 / as.numeric(x)),
simplify = FALSE)
for (b in 1:nblocks) {
dimnames(SDs[[b]]) <- dimnames(SIGMA[[b]])
SIGMA[[b]] <- SDs[[b]] %*% SIGMA[[b]] %*% SDs[[b]]
}
}
avevar <- list()
for (b in 1:nblocks) {
## extract factor and indicator names
LY <- LAMBDA[[b]]
allIndNames <- rownames(LY)
allFacNames <- colnames(LY)
myFacNames <- setdiff(allFacNames, omit.factors)
if (dropSingle) {
multInd <- sapply(myFacNames, function(fn) sum(LY[,fn] != 0) > 1L)
myFacNames <- myFacNames[multInd]
}
    subLY <- LY[ , myFacNames, drop = FALSE]
    myIndNames <- rownames(subLY)[apply(subLY, 1L, function(x) any(x != 0))]
    ## keep only rows for retained indicators, so indices align with myIndNames
    subLY <- subLY[myIndNames, , drop = FALSE]
## check for cross-loadings
Xload <- apply(subLY, 1L, function(x) sum(round(x, 5) != 0) > 1L)
avevar[[b]] <- setNames(rep(NA, length(myFacNames)), nm = myFacNames)
## loop over factors
for (fn in myFacNames) {
idx <- which(subLY[,fn] != 0)
if (any(Xload[idx])) next # cross-loading violates AVE definition
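      ## AVE = sum of common variance, sum(lambda^2)*phi, over the sum of
      ## these indicators' total (observed or model-implied) variances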
commonVar <- sum(subLY[idx, fn]^2) * PHI[[b]][fn, fn]
avevar[[b]][fn] <- commonVar / sum(diag(SIGMA[[b]])[ myIndNames[idx] ])
}
}
## drop list structure?
if (nblocks == 1L) {
avevar <- avevar[[1]]
class(avevar) <- c("lavaan.vector","numeric")
return(avevar)
} else {
facList <- lapply(avevar, names)
sameNames <- all(sapply(2:nblocks, function(i) {
isTRUE(all.equal(facList[[1]], facList[[i]]))
} ))
if (!(sameNames && return.df)) {
## can't simplify, return as a list
for (i in seq_along(avevar)) class(avevar[[i]]) <- c("lavaan.vector","numeric")
names(avevar) <- block.label
return(avevar)
}
}
## concatenate each factor's AVE across blocks
facRel <- sapply(facList[[1]], simplify = FALSE, FUN = function(nn) {
sapply(avevar, "[[", i = nn, USE.NAMES = FALSE) # extract AVE for factor i
})
if (ngroups > 1L && nLevels > 1L) {
out <- data.frame(group = rep(blk.g.lab, each = nLevels),
level = rep(blk.clus.lab, times = ngroups),
facRel)
} else if (ngroups > 1L) {
out <- data.frame(group = blk.g.lab, facRel)
} else if (nLevels > 1L) {
out <- data.frame(level = blk.clus.lab, facRel)
}
class(out) <- c("lavaan.data.frame","data.frame")
out
}
## ------------
## compRelSEM()
## ------------
##' Composite Reliability using SEM
##'
##' Calculate composite reliability from estimated factor-model parameters
##'
##' Several coefficients for factor-analysis reliability have been termed
##' "omega", which Cho (2021) argues is a misleading misnomer and argues for
##' using \eqn{\rho} to represent them all, differentiated by descriptive
##' subscripts. In our package, we strive to provide unlabeled coefficients,
##' leaving it to the user to decide on a label in their report. But we do
##' use the symbols \eqn{\alpha} and \eqn{\omega} in the formulas below in order
##' to distinguish coefficients that do (not) assume essential tau-equivalence.
##' For higher-order constructs with latent indicators, only \eqn{\omega} is
##' available. Lai's (2021) multilevel coefficients are labeled in accordance
##' with the symbols used in that article (more details below).
##'
##' Bentler (1968) first introduced factor-analysis reliability for a
##' unidimensional factor model with congeneric indicators, labeling the
##' coefficient \eqn{\alpha}. McDonald (1999) later referred to this
##' *and other reliability coefficients*, first as \eqn{\theta} (in 1970),
##' then as \eqn{\omega}, which is a source of confusion when reporting
##' coefficients (Cho, 2021). Coefficients based on factor models were later
##' generalized to account for multidimensionality (possibly with
##' cross-loadings) and correlated errors. The general \eqn{\omega} formula
##' implemented in this function is:
##'
##' \deqn{ \omega = \frac{\left( \sum^{k}_{i = 1} \lambda_i \right)^{2}
##' Var\left( \psi \right)}{\bold{1}^\prime \hat{\Sigma} \bold{1}}, }
##'
##' where \eqn{\hat{\Sigma}} can be the model-implied covariance matrix from
##' either the saturated model (i.e., the "observed" covariance matrix, used by
##' default) or from the hypothesized CFA model, controlled by the
##' `obs.var` argument. A \eqn{k}-dimensional vector \eqn{\bold{1}} is used
##' to sum elements in the matrix. Note that if the model includes any directed
##' effects (latent regression slopes), all coefficients are calculated
##' from **total** factor variances: `lavInspect(object, "cov.lv")`.
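##'
##' As a minimal illustration (a sketch of the formula, not the internal
##' implementation), \eqn{\omega} for a single factor could be assembled
##' "by hand" from a hypothetical continuous-indicator CFA `fit.c`, with
##' `lavaan` attached as in the examples below:
##'
##' ```
##' fit.c <- cfa(' visual =~ x1 + x2 + x3 ', data = HolzingerSwineford1939)
##' lam <- lavInspect(fit.c, "est")$lambda                 # factor loadings
##' phi <- lavInspect(fit.c, "cov.lv")["visual","visual"]  # factor variance
##' S   <- lavInspect(fit.c, "sampstat")$cov               # observed covariances
##' sum(lam[ , "visual"])^2 * phi / sum(S)                 # omega (obs.var=TRUE)
##' ```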
##'
##' Assuming (essential) tau-equivalence (`tau.eq=TRUE`) makes \eqn{\omega}
##' equivalent to coefficient \eqn{\alpha} from classical test theory
##' (Cronbach, 1951):
##'
##' \deqn{ \alpha = \frac{k}{k - 1}\left[ 1 - \frac{\sum^{k}_{i = 1}
##' \sigma_{ii}}{\sum^{k}_{i = 1} \sigma_{ii} + 2\sum_{i < j} \sigma_{ij}}
##' \right],}
##'
##' where \eqn{k} is the number of items in a factor's composite,
##' \eqn{\sigma_{ii}} signifies item *i*'s variance, and \eqn{\sigma_{ij}}
##' signifies the covariance between items *i* and *j*. Again, the
##' `obs.var` argument controls whether \eqn{\alpha} is calculated using
##' the observed or model-implied covariance matrix.
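##'
##' A minimal sketch of this calculation (not the package's internal helper)
##' applies the formula directly to any covariance matrix:
##'
##' ```
##' S <- cov(lavaan::HolzingerSwineford1939[paste0("x", 1:3)])
##' k <- ncol(S)
##' k / (k - 1) * (1 - sum(diag(S)) / sum(S))  # coefficient alpha
##' ```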
##'
##' By setting `return.total=TRUE`, one can estimate reliability for a
##' single composite calculated using all indicators in a multidimensional
##' CFA (Bentler, 1972, 2009). Setting `return.total = -1` will return
##' **only** the total-composite reliability (not per factor).
##'
##' **Higher-Order Factors**:
##' The reliability of a composite that represents a higher-order construct
##' requires partitioning the model-implied factor covariance matrix \eqn{\Phi}
##' in order to isolate the common-factor variance associated only with the
##' higher-order factor. Using a second-order factor model, the model-implied
##' covariance matrix of observed indicators \eqn{\hat{\Sigma}} can be
##' partitioned into 3 sources:
##' \enumerate{
##' \item the second-order common-factor (co)variance:
##' \eqn{\Lambda \bold{B} \Phi_2 \bold{B}^{\prime} \Lambda^{\prime}}
##' \item the residual variance of the first-order common factors (i.e., not
##' accounted for by the second-order factor):
##' \eqn{\Lambda \Psi_{u} \Lambda^{\prime}}
##' \item the measurement error of observed indicators: \eqn{\Theta}
##' }
##'
##' where \eqn{\Lambda} contains first-order factor loadings, \eqn{\bold{B}}
##' contains second-order factor loadings, \eqn{\Phi_2} is the model-implied
##' covariance matrix of the second-order factor(s), and \eqn{\Psi_{u}} is the
##' covariance matrix of first-order factor disturbances. In practice, we can
##' use the full \eqn{\bold{B}} matrix and full model-implied \eqn{\Phi} matrix
##' (i.e., including all latent factors) because the zeros in \eqn{\bold{B}}
##' will cancel out unwanted components of \eqn{\Phi}. Thus, we can calculate
##' the proportion of variance of a composite score calculated from the observed
##' indicators (e.g., a total score or scale mean) that is attributable to the
##' second-order factor (i.e., coefficient \eqn{\omega}):
##'
##' \deqn{\omega=\frac{\bold{1}^{\prime} \Lambda \bold{B} \Phi \bold{B}^{\prime}
##' \Lambda^{\prime} \bold{1} }{ \bold{1}^{\prime} \hat{\Sigma} \bold{1}}, }
##'
##' where \eqn{\bold{1}} is the *k*-dimensional vector of 1s and *k*
##' is the number of observed indicators in the composite. Note that if a
##' higher-order factor also has observed indicators, it is necessary to model
##' the observed indicators as single-indicator constructs, so that all of the
##' higher-order factor indicators are latent (with loadings in the Beta matrix,
##' not Lambda).
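##'
##' A minimal sketch of this partition, assuming the second-order fit
##' `fit.hi` from the examples below:
##'
##' ```
##' EST <- lavInspect(fit.hi, "est")
##' L   <- EST$lambda                    # first-order loadings
##' B   <- EST$beta                      # includes second-order loadings
##' Phi <- lavInspect(fit.hi, "cov.lv")  # full factor covariance matrix
##' S   <- lavInspect(fit.hi, "sampstat")$cov
##' sum(L %*% B %*% Phi %*% t(B) %*% t(L)) / sum(S)  # omega for 'general'
##' ```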
##'
##' **Categorical Indicators**:
##' When all indicators (per composite) are ordinal, the `ord.scale`
##' argument controls whether the coefficient is calculated on the
##' latent-response scale (`FALSE`) or on the observed ordinal scale
##' (`TRUE`, the default). For \eqn{\omega}-type coefficients
##' (`tau.eq=FALSE`), Green and Yang's (2009, formula 21) approach is used
##' to transform factor-model results back to the ordinal response scale. When
##' `ord.scale=TRUE` and `tau.eq=TRUE`, coefficient \eqn{\alpha} is
##' calculated using the covariance matrix calculated from the integer-valued
##' numeric weights for ordinal categories, consistent with its definition
##' (Chalmers, 2018) and the `alpha` function in the `psych` package;
##' this implies `obs.var=TRUE`, so `obs.var=FALSE` will be ignored.
##' When `ord.scale=FALSE`, the standard \eqn{\alpha} formula is applied to
##' the polychoric correlation matrix ("ordinal \eqn{\alpha}"; Zumbo et al., 2007),
##' estimated from the saturated or hypothesized model (see `obs.var`),
##' and \eqn{\omega} is calculated from CFA results without applying Green and
##' Yang's (2009) correction (see Zumbo & Kroc, 2019, for a rationalization).
##' No method analogous to Green and Yang (2009) has been proposed for
##' calculating reliability with a mixture of categorical and continuous
##' indicators, so an error is returned if `object` includes factors with a
##' mixture of indicator types (unless omitted using `omit.factors`). If
##' categorical indicators load on a different factor(s) than continuous
##' indicators, then reliability will still be calculated separately for those
##' factors, but `return.total` must be `FALSE` (unless
##' `omit.factors` is used to isolate factors with indicators of the same
##' type).
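##'
##' As an illustration, a sketch of "ordinal alpha" simply applies the
##' standard \eqn{\alpha} formula to the polychoric correlation matrix,
##' here for the categorical-indicator fit `fit` from the examples below:
##'
##' ```
##' yy <- c("y7","y8","y9")
##' P  <- lavInspect(fit, "sampstat")$cov[yy, yy]  # polychoric correlations
##' k  <- length(yy)
##' k / (k - 1) * (1 - sum(diag(P)) / sum(P))      # ordinal alpha for 'speed'
##' ```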
##'
##' **Multilevel Measurement Models**:
##' Under the default settings, `compRelSEM()` will apply the same formula
##' in each "block" (group and/or level of analysis). In the case of multilevel
##' (ML-)SEMs, this yields "reliability" for latent within- and between-level
##' components, as proposed by Geldhof et al. (2014). Although this works fine
##' to calculate reliability per group, this is not recommended for ML-SEMs
##' because the coefficients do not correspond to actual composites that would
##' be calculated from the observed data. Lai (2021) proposed coefficients for
##' reliability of actual composites, depending on the type of construct, which
##' requires specifying the names of constructs for which reliability is desired
##' (or multiple constructs whose indicators would compose a multidimensional
##' composite). Configural (`config=`) and/or `shared=` constructs
##' can be specified; the same construct can be specified in both arguments, so
##' that overall scale-reliability can be estimated for a shared construct by
##' including it in `config`. Instead of organizing the output by block
##' (the default), specifying `config=` and/or `shared=` will prompt
##' organizing the list of output by `$config` and/or `$shared`.
##'
##' \itemize{
##' \item The overall (`_2L`) scale reliability for `config`ural
##' constructs is returned, along with the reliability of a purely
##' individual-level composite (`_W`, calculated by cluster-mean
##' centering).
##' \item The reliability for a `shared` construct quantifies
##' generalizability across both indicators and raters (i.e., subjects rating
##' their cluster's construct). Lüdtke et al. (2011) refer to these as
##' measurement error and sampling error, respectively. An interrater
##' reliability (IRR) coefficient is also returned, quantifying
##' generalizability across rater/sampling-error only. To obtain a
##' scale-reliability coefficient (quantifying a shared construct's
##' generalizability across indicator/measurement-error only), include the
##' same factor name in `config=`. Jak et al. (2021) recommended
##' modeling components of the same construct at both levels, but users may
##' also saturate the within-level model (Lai, 2021).
##' }
##'
##' Be careful about including Level-2 variables in the model, especially
##' whether it makes sense to include them in a total composite for a Level-2
##' construct. `dropSingle=TRUE` only prevents estimating reliability for
##' a single-indicator construct, not from including such an indicator in a
##' total composite. It is permissible for `shared=` constructs to have
##' additional indicators at Level-2 only. If it is necessary to model other
##' Level-2 variables (e.g., to justify the missing-at-random assumption when
##' using `missing = "FIML"` estimation), they should be placed in the
##' `omit.indicators=` argument to exclude them from total composites.
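##'
##' A minimal sketch of Lai's (2021) coefficients for a configural and a
##' shared construct, assuming a hypothetical single-group two-level CFA
##' `fitML` (like `fit2` in the examples below, but without the `group:`
##' blocks):
##'
##' ```
##' fitML <- sem(' level: within
##'                 fac =~ y1 + L2*y2 + L3*y3
##'               level: between
##'                 fac =~ y1 + L2*y2 + L3*y3 ',
##'              data = Demo.twolevel, cluster = "cluster")
##' EST <- lavInspect(fitML, "est")       # per-level parameter matrices
##' S   <- lavInspect(fitML, "sampstat")  # per-level observed (co)variances
##' cc1 <- EST$within$lambda  %*% EST$within$psi  %*% t(EST$within$lambda)
##' cc2 <- EST$cluster$lambda %*% EST$cluster$psi %*% t(EST$cluster$lambda)
##' sum(cc1) / sum(S$within$cov)                        # omega_W
##' sum(cc1 + cc2) / sum(S$within$cov + S$cluster$cov)  # omega_2L
##' ## shared construct: weight within-level (co)variances by the reciprocal
##' ## of the harmonic-mean cluster size (sampling error of cluster means)
##' Ns <- mean(1 / lavInspect(fitML, "cluster.size"))
##' sum(cc2) / sum(S$within$cov * Ns + S$cluster$cov)   # omega_B
##' ```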
##'
##'
##' @importFrom lavaan lavInspect lavNames parTable
##' @importFrom methods getMethod
##'
##' @param object A [lavaan::lavaan-class] or [lavaan.mi::lavaan.mi-class] object,
##' expected to contain only exogenous common factors (i.e., a CFA model).
##' @param obs.var `logical` indicating whether to compute reliability
##' using observed variances in the denominator. Setting `FALSE` instead
##' uses model-implied variances in the denominator.
##' @param tau.eq `logical` indicating whether to assume (essential)
##' tau-equivalence, yielding a coefficient analogous to \eqn{\alpha}.
##' Setting `FALSE` yields an \eqn{\omega}-type coefficient.
##' @param ord.scale `logical` indicating whether to apply Green and Yang's
##' (2009, formula 21) correction, so that reliability is calculated for the
##' actual ordinal response scale (ignored for factors with continuous
##' indicators). Setting `FALSE` yields coefficients that are
##' only applicable to the continuous latent-response scale.
##' @param config `character` vector naming any configural constructs in
##' a multilevel CFA. For these constructs (and optional total composite),
##' Lai's (2021) coefficients \eqn{\omega^\textrm{W}} and \eqn{\omega^\textrm{2L}}
##' are returned (or corresponding \eqn{\alpha} coefficients when
##' `tau.eq=TRUE`), rather than Geldhof et al.'s (2014) coefficients for
##' hypothetical composites of latent components (although the same formula
##' is used for \eqn{\omega^\textrm{W}} in either case). Note that the same name
##' must be used for the factor component represented at each level of the
##' model.
##' @param shared `character` vector naming any shared constructs in
##' a multilevel CFA. For these constructs (and optional total composite),
##' Lai's (2021) coefficient \eqn{\omega^\textrm{B}} or \eqn{\alpha^\textrm{B}} is
##' returned, rather than Geldhof et al.'s (2014) between-level coefficient
##' for hypothetical composites of latent cluster means. Lai's (2021)
##' coefficient quantifies reliability relative to error associated with both
##' indicators (measurement error) and subjects (sampling error), like a
##' generalizability coefficient. Given that subjects can be considered as
##' raters of their cluster's shared construct, an interrater reliability
##' (IRR) coefficient is also returned, quantifying reliability relative to
##' rater/sampling error alone. To quantify reliability relative to
##' indicator/measurement error alone (i.e., \eqn{\omega^\textrm{2L}}), the
##' `shared=` construct name(s) can additionally be included in
##' `config=` argument.
##' @param higher `character` vector naming any higher-order constructs in
##' `object` for which composite reliability should be calculated.
##' Ignored when `tau.eq=TRUE` because alpha is not based on a CFA model;
##' instead, users must fit a CFA with tau-equivalence constraints.
##' To obtain Lai's (2021) multilevel composite-reliability indices for a
##' higher-order factor, do not use this argument; instead, specify the
##' higher-order factor(s) using the `shared=` or `config=` argument
##' (`compRelSEM` will automatically check whether it includes latent
##' indicators and apply the appropriate formula).
##' @param return.total `logical` indicating whether to return a final
##' column containing the reliability of a composite of all indicators (not
##' listed in `omit.indicators`) of first-order factors not listed in
##' `omit.factors`. Ignored in 1-factor models, and should only be set
##' `TRUE` if all factors represent scale dimensions that could be
##' meaningfully collapsed to a single composite (scale sum or scale mean).
##' Setting a negative value (e.g., `-1`) returns **only** the
##' total-composite reliability (excluding coefficients per factor), except
##' when requesting Lai's (2021) coefficients for multilevel `config`ural
##' or `shared=` constructs.
##' @param dropSingle `logical` indicating whether to exclude factors
##' defined by a single indicator from the returned results. If `TRUE`
##' (default), single indicators will still be included in the `total`
##' column when `return.total = TRUE`.
##' @param omit.factors `character` vector naming any common factors
##' modeled in `object` whose composite reliability is not of
##' interest (e.g., higher-order or method factors). Note that the
##' `higher=` argument should be used to calculate composite
##' reliability of a higher-order factor.
##' @param omit.indicators `character` vector naming any observed variables
##' that should be omitted from the composite whose reliability is calculated.
##' @param omit.imps `character` vector specifying criteria for omitting
##' imputations from pooled results. Can include any of
##' `c("no.conv", "no.se", "no.npd")`, the first 2 of which are the
##' default setting, which excludes any imputations that did not
##' converge or for which standard errors could not be computed. The
##' last option (`"no.npd"`) would exclude any imputations which
##' yielded a nonpositive definite covariance matrix for observed or
##' latent variables, which would include any "improper solutions" such
##' as Heywood cases. NPD solutions are not excluded by default because
##' they are likely to occur due to sampling error, especially in small
##' samples. However, gross model misspecification could also cause
##' NPD solutions, so users can compare pooled results with and without
##' this setting as a sensitivity analysis to see whether some
##' imputations warrant further investigation.
##' @param return.df `logical` indicating whether to return reliability
##' coefficients in a `data.frame` (one row per group/level), which is
##' possible when every model block includes the same factors (after excluding
##' those in `omit.factors` and applying `dropSingle`).
##'
##' @return A `numeric` vector of composite reliability coefficients per
##' factor, or a `list` of vectors per "block" (group and/or level of
##' analysis), optionally returned as a `data.frame` when possible (see
##' `return.df=` argument description for caveat). If there are multiple
##' factors, whose multidimensional indicators combine into a single
##' composite, users can request `return.total=TRUE` to add a column
##' including a reliability coefficient for the total composite, or
##' `return.total = -1` to return **only** the total-composite
##' reliability (ignored when `config=` or `shared=` is specified
##' because each factor's specification must be checked across levels).
##'
##' @author
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' Uses hidden functions written by Sunthud Pornprasertmanit
##' (\email{psunthud@@gmail.com}) for the old `reliability()` function.
##'
##' @seealso
##' [maximalRelia()] for the maximal reliability of weighted composite
##'
##' @references
##' Bentler, P. M. (1968). Alpha-maximized factor analysis (alphamax): Its
##' relation to alpha and canonical factor analysis. *Psychometrika, 33*(3),
##' 335--345. \doi{10.1007/BF02289328}
##'
##' Bentler, P. M. (1972). A lower-bound method for the dimension-free
##' measurement of internal consistency. *Social Science Research, 1*(4),
##' 343--357. \doi{10.1016/0049-089X(72)90082-8}
##'
##' Bentler, P. M. (2009). Alpha, dimension-free, and model-based internal
##' consistency reliability. *Psychometrika, 74*(1), 137--143.
##' \doi{10.1007/s11336-008-9100-1}
##'
##' Chalmers, R. P. (2018). On misconceptions and the limited usefulness of
##' ordinal alpha. *Educational and Psychological Measurement, 78*(6),
##' 1056--1071. \doi{10.1177/0013164417727036}
##'
##' Cho, E. (2021). Neither Cronbach’s alpha nor McDonald’s omega: A commentary
##' on Sijtsma and Pfadt. *Psychometrika, 86*(4), 877--886.
##' \doi{10.1007/s11336-021-09801-1}
##'
##' Cronbach, L. J. (1951). Coefficient alpha and the internal structure of
##' tests. *Psychometrika, 16*(3), 297--334. \doi{10.1007/BF02310555}
##'
##' Geldhof, G. J., Preacher, K. J., & Zyphur, M. J. (2014). Reliability
##' estimation in a multilevel confirmatory factor analysis framework.
##' *Psychological Methods, 19*(1), 72--91. \doi{10.1037/a0032138}
##'
##' Green, S. B., & Yang, Y. (2009). Reliability of summed item scores using
##' structural equation modeling: An alternative to coefficient alpha.
##' *Psychometrika, 74*(1), 155--167. \doi{10.1007/s11336-008-9099-3}
##'
##' Jak, S., Jorgensen, T. D., & Rosseel, Y. (2021). Evaluating cluster-level
##' factor models with `lavaan` and M*plus*. *Psych, 3*(2),
##' 134--152. \doi{10.3390/psych3020012}
##'
##' Lai, M. H. C. (2021). Composite reliability of multilevel data: It’s about
##' observed scores and construct meanings. *Psychological Methods, 26*(1),
##' 90--102. \doi{10.1037/met0000287}
##'
##' Lüdtke, O., Marsh, H. W., Robitzsch, A., & Trautwein, U. (2011).
##' A 2 \eqn{\times} 2 taxonomy of multilevel latent contextual models:
##' Accuracy--bias trade-offs in full and partial error correction models.
##' *Psychological Methods, 16*(4), 444--467. \doi{10.1037/a0024376}
##'
##' McDonald, R. P. (1999). *Test theory: A unified treatment*. Mahwah, NJ:
##' Erlbaum.
##'
##' Zumbo, B. D., Gadermann, A. M., & Zeisser, C. (2007). Ordinal versions of
##' coefficients alpha and theta for Likert rating scales.
##' *Journal of Modern Applied Statistical Methods, 6*(1), 21--29.
##' \doi{10.22237/jmasm/1177992180}
##'
##' Zumbo, B. D., & Kroc, E. (2019). A measurement is a choice and Stevens’
##' scales of measurement do not help make it: A response to Chalmers.
##' *Educational and Psychological Measurement, 79*(6), 1184--1197.
##' \doi{10.1177/0013164419844305}
##'
##'
##' @examples
##'
##' data(HolzingerSwineford1939)
##' HS9 <- HolzingerSwineford1939[ , c("x7","x8","x9")]
##' HSbinary <- as.data.frame( lapply(HS9, cut, 2, labels=FALSE) )
##' names(HSbinary) <- c("y7","y8","y9")
##' HS <- cbind(HolzingerSwineford1939, HSbinary)
##'
##' HS.model <- ' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ y7 + y8 + y9 '
##'
##' fit <- cfa(HS.model, data = HS, ordered = c("y7","y8","y9"), std.lv = TRUE)
##'
##' ## works for factors with exclusively continuous OR categorical indicators
##' compRelSEM(fit)
##'
##' ## reliability for ALL indicators only available when they are
##' ## all continuous or all categorical
##' compRelSEM(fit, omit.factors = "speed", return.total = TRUE)
##'
##'
##' ## loop over visual indicators to calculate alpha if one indicator is removed
##' for (i in paste0("x", 1:3)) {
##' cat("Drop ", i, ":\n", sep = "")
##' print(compRelSEM(fit, omit.factors = c("textual","speed"),
##' omit.indicators = i, tau.eq = TRUE))
##' }
##' ## item-total correlations obtainable by adding a composite to the data
##' HS$Visual <- HS$x1 + HS$x2 + HS$x3
##' cor(HS$Visual, y = HS[paste0("x", 1:3)])
##' ## comparable to psych::alpha(HS[paste0("x", 1:3)])
##'
##' ## Reliability of a composite that represents a higher-order factor
##' mod.hi <- ' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + x8 + x9
##' general =~ visual + textual + speed '
##'
##' fit.hi <- cfa(mod.hi, data = HolzingerSwineford1939)
##' compRelSEM(fit.hi, higher = "general")
##' ## reliabilities for lower-order composites also returned
##'
##'
##' ## works for multigroup models and for multilevel models (and both)
##' data(Demo.twolevel)
##' ## assign clusters to arbitrary groups
##' Demo.twolevel$g <- ifelse(Demo.twolevel$cluster %% 2L, "type1", "type2")
##' model2 <- ' group: type1
##' level: 1
##' f1 =~ y1 + L2*y2 + L3*y3
##' f2 =~ y4 + L5*y5 + L6*y6
##' level: 2
##' f1 =~ y1 + L2*y2 + L3*y3
##' f2 =~ y4 + L5*y5 + L6*y6
##'
##' group: type2
##' level: 1
##' f1 =~ y1 + L2*y2 + L3*y3
##' f2 =~ y4 + L5*y5 + L6*y6
##' level: 2
##' f1 =~ y1 + L2*y2 + L3*y3
##' f2 =~ y4 + L5*y5 + L6*y6
##' '
##' fit2 <- sem(model2, data = Demo.twolevel, cluster = "cluster", group = "g")
##' compRelSEM(fit2) # Geldhof's indices (hypothetical, for latent components)
##'
##' ## Lai's (2021) indices for Level-1 and configural constructs
##' compRelSEM(fit2, config = c("f1","f2"))
##' ## Lai's (2021) indices for shared (Level-2) constructs
##' ## (also an interrater reliability coefficient)
##' compRelSEM(fit2, shared = c("f1","f2"))
##'
##'
##' ## Shared construct using saturated within-level model
##' mod.sat1 <- ' level: 1
##' y1 ~~ y1 + y2 + y3 + y4 + y5 + y6
##' y2 ~~ y2 + y3 + y4 + y5 + y6
##' y3 ~~ y3 + y4 + y5 + y6
##' y4 ~~ y4 + y5 + y6
##' y5 ~~ y5 + y6
##' y6 ~~ y6
##'
##' level: 2
##' f1 =~ y1 + L2*y2 + L3*y3
##' f2 =~ y4 + L5*y5 + L6*y6
##' '
##' fit.sat1 <- sem(mod.sat1, data = Demo.twolevel, cluster = "cluster")
##' compRelSEM(fit.sat1, shared = c("f1","f2"))
##'
##'
##' ## Simultaneous shared-and-configural model (Stapleton et al., 2016, 2019),
##' ## not recommended, but possible by omitting shared or configural factor.
##' mod.both <- ' level: 1
##' fc =~ y1 + L2*y2 + L3*y3 + L4*y4 + L5*y5 + L6*y6
##' level: 2
##' ## configural construct
##' fc =~ y1 + L2*y2 + L3*y3 + L4*y4 + L5*y5 + L6*y6
##' ## orthogonal shared construct
##' fs =~ NA*y1 + y2 + y3 + y4 + y5 + y6
##' fs ~~ 1*fs + 0*fc
##' '
##' fit.both <- sem(mod.both, data = Demo.twolevel, cluster = "cluster")
##' compRelSEM(fit.both, shared = "fs", config = "fc")
##'
##' @export
compRelSEM <- function(object, obs.var = TRUE, tau.eq = FALSE, ord.scale = TRUE,
config = character(0), shared = character(0),
higher = character(0),
return.total = FALSE, dropSingle = TRUE,
omit.factors = character(0),
omit.indicators = character(0),
omit.imps = c("no.conv","no.se"), return.df = TRUE) {
## numbers of blocks
ngroups <- lavInspect(object, "ngroups")
nLevels <- lavInspect(object, "nlevels")
nblocks <- ngroups*nLevels #FIXME: always true?
return.total <- rep(return.total, nblocks)
## labels for groups
if (ngroups > 1L) {
group.label <- lavInspect(object, "group.label")
blk.g.lab <- if (!length(group.label)) paste0("g", 1:ngroups) else group.label
} else {
group.label <- blk.g.lab <- NULL
}
## labels for clusters
if (nLevels > 1L) {
#FIXME? lavInspect(object, "level.label") is always ==
# c("within", lavInspect(object, "cluster"))
PT <- parTable(object)
clus.label <- unique(PT$level)
clus.label <- clus.label[which(clus.label != "")]
clus.label <- clus.label[which(clus.label != 0)]
blk.clus.lab <- if (is.numeric(clus.label)) {
c("within", lavInspect(object, "cluster"))
} else clus.label
} else clus.label <- blk.clus.lab <- NULL
## labels for blocks
if (nblocks > 1L) {
block.label <- paste(rep(blk.g.lab, each = nLevels), blk.clus.lab,
sep = if (ngroups > 1L && nLevels > 1L) "_" else "")
} else block.label <- NULL
## check for categorical
anyCategorical <- lavInspect(object, "categorical")
threshold <- if (anyCategorical) getThreshold(object, omit.imps = omit.imps) else NULL
latScales <- if (anyCategorical) getScales(object, omit.imps = omit.imps) else NULL
if (inherits(object, "lavaan")) {
## common-factor variance
PHI <- lavInspect(object, "cov.lv") # ignored if tau.eq
if (nblocks == 1L) PHI <- list(PHI)
names(PHI) <- block.label
## factor loadings
EST <- lavInspect(object, "est", drop.list.single.group = FALSE)
LAMBDA <- sapply(EST, "[[", i = "lambda", simplify = FALSE)
names(LAMBDA) <- block.label
## possibly higher-order loadings?
BETA <- sapply(EST, "[[", i = "beta", simplify = FALSE)
names(BETA) <- block.label
## total variance
if (anyCategorical && tau.eq && ord.scale) {
## calculate SIGMA from data for alpha
#FIXME when MLSEM available for ordinal indicators
# (Level 2 components available? Extend conditional?)
rawData <- try(lavInspect(object, "data", drop.list.single.group = FALSE),
silent = TRUE)
if (inherits(rawData, "try-error"))
stop('Error in lavInspect(fit, "data"); tau.eq= and ord.scale= cannot ',
'both be TRUE for models fitted to summary statistics of ',
           'categorical data.')
SIGMA <- lapply(rawData, cov)
names(SIGMA) <- block.label
} else {
SIGMA <- sapply(lavInspect(object, drop.list.single.group = FALSE,
what = ifelse(obs.var, "sampstat", "fitted")),
"[[", i = "cov", simplify = FALSE)
names(SIGMA) <- block.label
}
} else if (inherits(object, "lavaan.mi")) {
if (!"package:lavaan.mi" %in% search()) attachNamespace("lavaan.mi")
useImps <- rep(TRUE, length(object@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(object@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(object@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(object@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(object@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
## common-factor variance
phiList <- object@phiList[useImps]
if (nblocks == 1L) for (i in 1:m) phiList[[i]] <- list(phiList[[i]])
PHI <- vector("list", nblocks)
names(PHI) <- block.label
for (b in 1:nblocks) {
PHI[[b]] <- Reduce("+", lapply(phiList, "[[", i = b) ) / m
}
## loadings (including higher-order in Beta)
if (nblocks == 1L) {
lamList <- lapply(object@coefList[useImps], "[[", i = "lambda")
LAMBDA <- list(Reduce("+", lamList) / length(lamList))
betList <- lapply(object@coefList[useImps], "[[", i = "beta")
if (length(betList)) {
BETA <- list(Reduce("+", betList) / length(betList))
} else BETA <- list(NULL)
} else {
LAMBDA <- BETA <- vector("list", nblocks)
names(LAMBDA) <- names(BETA) <- block.label
for (b in 1:nblocks) {
lamList <- lapply(object@coefList[useImps], function(i) i[[b]]$lambda)
LAMBDA[[b]] <- Reduce("+", lamList) / length(lamList)
betList <- lapply(object@coefList[useImps], function(i) i[[b]]$beta )
BETA[[b]] <- Reduce("+", betList) / length(betList)
}
}
## total variance
if (anyCategorical && tau.eq && ord.scale) {
## calculate SIGMA from data for alpha
#FIXME when MLSEM available for ordinal indicators
# (Level 2 components available? Extend conditional?)
dataList <- object@DataList[useImps]
SIGMA <- vector("list", nblocks)
names(SIGMA) <- group.label #FIXME when MLSEMs can have ordinal indicators
if (nblocks == 1L) {
VV <- lavNames(object, type = "ov")
impCovList <- lapply(dataList, function(DD) {
dat <- do.call(cbind, sapply(DD[VV], as.numeric, simplify = FALSE))
cov(dat)
})
SIGMA[[1]] <- Reduce("+", impCovList) / length(impCovList)
} else {
## multigroup models need separate data matrices per group
G <- lavInspect(object, "group")
for (g in seq_along(group.label)) {
VV <- lavNames(object, type = "ov",
group = ifelse(length(group.label),
yes = group.label[g], no = g))
impCovList <- lapply(dataList, function(DD) {
RR <- DD[,G] == ifelse(length(group.label),
yes = group.label[g], no = g)
dat <- do.call(cbind, sapply(DD[RR, VV], as.numeric, simplify = FALSE))
cov(dat)
})
SIGMA[[g]] <- Reduce("+", impCovList) / length(impCovList)
} # g
}
} else {
## use model-implied SIGMA from h0 or h1 model
if (obs.var) {
SIGMA <- vector("list", nblocks)
names(SIGMA) <- block.label
## loop over blocks to pool saturated-model (observed) matrices
for (b in 1:nblocks) {
covList <- lapply(object@h1List[useImps], function(i) i$implied$cov[[b]])
SIGMA[[b]] <- Reduce("+", covList) / m
## The slot does not contain dimnames, so add them
rownames(SIGMA[[b]]) <- colnames(SIGMA[[b]]) <- lavNames(object, block = b)
}
} else {
## pooled model-implied matrices
if (nblocks == 1L) {
SIGMA <- getMethod("fitted", class(object))(object)["cov"] # retain list format
} else {
SIGMA <- sapply(getMethod("fitted", class(object))(object),
"[[", "cov", simplify = FALSE)
names(SIGMA) <- block.label
}
}
}
} # end lavaan vs. lavaan.mi conditional
## scale polychoric/polyserial to modeled LRV scale
if (anyCategorical) {
SDs <- sapply(latScales, function(x) diag(1 / x),
simplify = FALSE)
for (b in 1:nblocks) {
dimnames(SDs[[b]]) <- dimnames(SIGMA[[b]])
SIGMA[[b]] <- SDs[[b]] %*% SIGMA[[b]] %*% SDs[[b]]
}
}
warnTotal <- warnAlpha <- warnOmega <- FALSE
if (!length(c(config, shared))) {
rel <- vector("list", length = nblocks)
for (b in 1:nblocks) {
LY <- LAMBDA[[b]]
allIndNames <- rownames(LY)
allFacNames <- colnames(LY)
myFacNames <- setdiff(allFacNames, omit.factors)
if (dropSingle) {
multInd <- sapply(myFacNames, function(fn) sum(LY[,fn] != 0) > 1L)
myFacNames <- myFacNames[multInd]
}
subLY <- LY[ , myFacNames, drop = FALSE]
myIndNames <- rownames(subLY)[apply(subLY, 1L, function(x) any(x != 0))]
## remove unwanted indicators
myIndNames <- setdiff(myIndNames, omit.indicators)
subLY <- subLY[myIndNames, , drop = FALSE]
## distinguish between categorical, continuous, and latent indicators
nameArgs <- list(object = object)
if (nblocks > 1L) nameArgs$block <- b
ordNames <- do.call(lavNames, c(nameArgs, list(type = "ov.ord")))
numNames <- do.call(lavNames, c(nameArgs, list(type = "ov.num")))
if (anyCategorical) {
## identify when the (sub)set of factors are all categorical
blockCat <- all(myIndNames %in% ordNames)
## identify when the (sub)set of factors have mixed indicators, so no total
mix <- any(myIndNames %in% ordNames) && any(myIndNames %in% numNames)
} else {
blockCat <- FALSE
mix <- FALSE
}
if (mix && return.total[b]) {
return.total[b] <- FALSE
if (!(tau.eq && ord.scale)) warnTotal <- TRUE
}
## compute reliability per factor?
if (return.total[b] >= 0) {
## set result missing by default
rel[[b]] <- setNames(rep(NA, length(myFacNames)), nm = myFacNames)
for (fn in myFacNames) {
## names of indicators with nonzero loadings
fIndNames <- myIndNames[which(subLY[,fn] != 0)]
## check for ANY indicators
if (length(fIndNames) == 0L) next
## check for single indicators
if (dropSingle && length(fIndNames) == 1L) next
## check for categorical (or mixed) indicators
fCat <- any(fIndNames %in% ordNames)
## identify when this factor has mixed indicators, so no omegas
fMix <- fCat && any(fIndNames %in% numNames)
## ALPHA
totalCov <- SIGMA[[b]][fIndNames, fIndNames, drop = FALSE]
if (tau.eq) {
if (fMix && !ord.scale) {
## can't mix observed and latent scales
warnAlpha <- TRUE #TODO
next
}
rel[[b]][fn] <- computeAlpha(totalCov)
next
} # else compute omega
## OMEGA
if (fMix) {
warnOmega <- TRUE # can't (yet) mix observed and latent scales
next
}
Lf <- subLY[fIndNames, fn, drop = FALSE]
commonCov <- Lf %*% PHI[[b]][fn, fn] %*% t(Lf)
if (fCat && ord.scale) {
## Green & Yang (2009)
rel[[b]][fn] <- omegaCat(truevar = commonCov, denom = totalCov,
threshold = threshold[[b]][fIndNames],
scales = latScales[[b]][fIndNames])
next
} # else, all continuous or all LRV-scale
rel[[b]][fn] <- sum(commonCov) / sum(totalCov)
} # end loop over factors
} else rel[[b]] <- c(total = as.numeric(NA))
## compute for total composite?
if (return.total[b] && length(myFacNames) > 1L) {
## ALPHA
totalCov <- SIGMA[[b]][myIndNames, myIndNames, drop = FALSE]
if (tau.eq) {
rel[[b]]["total"] <- computeAlpha(totalCov)
next
} # else compute omega
## OMEGA
commonCov <- subLY %*% PHI[[b]][myFacNames, myFacNames] %*% t(subLY)
if (blockCat && ord.scale) {
## Green & Yang (2009)
rel[[b]]["total"] <- omegaCat(truevar = commonCov, denom = totalCov,
threshold = threshold[[b]][myIndNames],
scales = latScales[[b]][myIndNames])
next
} # else, all continuous or all LRV-scale
rel[[b]]["total"] <- sum(commonCov) / sum(totalCov)
}
## composite(s) representing higher-order factor(s)?
for (hf in higher) {
## find latent indicators
L2 <- BETA[[b]][,hf]
latInds <- setdiff(names(L2)[L2 != 0], omit.factors)
## find observed indicators
indList <- lapply(c(hf, latInds), function(i) names(LY[,i])[ LY[,i] != 0])
myIndNames <- setdiff(unique(do.call(c, indList)), omit.indicators)
totalCov <- SIGMA[[b]][myIndNames, myIndNames] # no need for drop = FALSE
L <- LY[myIndNames, c(hf, latInds)]
B <- BETA[[b]][c(hf, latInds), c(hf, latInds)]
Phi <- PHI[[b]][c(hf, latInds), c(hf, latInds)]
commonCov <- L %*% B %*% Phi %*% t(B) %*% t(L)
if (blockCat && ord.scale) {
## Green & Yang (2009)
rel[[b]][hf] <- omegaCat(truevar = commonCov, denom = totalCov,
threshold = threshold[[b]][myIndNames],
scales = latScales[[b]][myIndNames])
next
} # else, all continuous or all LRV-scale
rel[[b]][hf] <- sum(commonCov) / sum(totalCov)
}
}
## drop list structure
if (nblocks == 1L) {
rel <- rel[[1]]
class(rel) <- c("lavaan.vector","numeric")
return(rel)
} else {
facList <- lapply(rel, names)
sameNames <- all(sapply(2:nblocks, function(i) {
isTRUE(all.equal(facList[[1]], facList[[i]]))
} ))
if (!(sameNames && return.df)) {
## can't simplify, return as a list
for (i in seq_along(rel)) class(rel[[i]]) <- c("lavaan.vector","numeric")
names(rel) <- block.label
return(rel)
}
}
## concatenate each factor's reliability across blocks
facRel <- sapply(facList[[1]], simplify = FALSE, FUN = function(nn) {
sapply(rel, "[[", i = nn, USE.NAMES = FALSE) # extract reliability for factor i
})
if (ngroups > 1L && nLevels > 1L) {
out <- data.frame(group = rep(blk.g.lab, each = nLevels),
level = rep(blk.clus.lab, times = ngroups),
facRel)
} else if (ngroups > 1L) {
out <- data.frame(group = blk.g.lab, facRel)
} else if (nLevels > 1L) {
out <- data.frame(level = blk.clus.lab, facRel)
}
class(out) <- c("lavaan.data.frame","data.frame")
return(out)
}
if (warnTotal) {
message('Cannot return.total when model contains both continuous and ',
'binary/ordinal observed indicators. Use the ',
'omit.factors= argument to choose factors with only categorical ',
'or continuous indicators, if that is a composite of interest.\n')
}
if (warnAlpha) {
message('Coefficient alpha cannot be computed for factors as though a ',
'composite would be calculated using both observed-response scales',
' (for continuous indicators) and latent-response scales (for ',
'categorical indicators). If you want to assume tau-equivalence, ',
'either set ord.scale=FALSE or fit a model that treats ordinal ',
'indicators as continuous.')
}
if (warnOmega) {
message('Composite reliability (omega) cannot be computed for factors ',
'with mixed categorical and continuous indicators, unless a model ',
'is fitted by treating ordinal indicators as continuous.')
}
## otherwise, only use Lai's MULTILEVEL coefficients
if (nLevels > 1L && length(c(config, shared))) {
## group-level list, each containing 2 coefs per factor/total in data.frame
rel <- vector("list", length = ngroups)
for (g in 1:ngroups) {
gLab <- ifelse(length(group.label), yes = group.label[g], no = g)
nameArgs <- list(object = object, type = "lv")
if (ngroups > 1L) nameArgs$group <- gLab
lv.names1 <- do.call(lavNames, c(nameArgs, list(level = clus.label[1L])))
lv.names2 <- do.call(lavNames, c(nameArgs, list(level = clus.label[2L])))
nameArgs$type <- "ov"
ov.names1 <- do.call(lavNames, c(nameArgs, list(level = clus.label[1L])))
ov.names2 <- do.call(lavNames, c(nameArgs, list(level = clus.label[2L])))
nameArgs$type <- "lv.ind"
allLatInds <- do.call(lavNames, c(nameArgs, list(level = clus.label[1L])))
## erase higher= argument, use it to collect factor names for later checks
higher <- character(0)
PT <- parTable(object)
PT <- PT[PT$op == "=~", ]
if (ngroups > 1L) PT <- PT[PT$group == gLab, ]
## block indices for 2 levels in this group
#FIXME: eventually possible to model partial clustering?
idx1 <- 1 + (g-1)*2 # within
idx2 <- 2 + (g-1)*2 # cluster
## configural construct(s) defined at both levels this group?
for (fn in config) {
if (fn %in% omit.factors) {
## why would they do this?
config <- setdiff(config, omit.factors)
next
}
if (fn %in% lv.names1 && fn %in% lv.names2) {
## same indicators for this construct at both levels?
indNames1 <- setdiff(PT$rhs[PT$lhs == fn & PT$level == clus.label[1L]],
omit.indicators)
indNames2 <- setdiff(PT$rhs[PT$lhs == fn & PT$level == clus.label[2L]],
omit.indicators)
        if (!isTRUE(all.equal(indNames1, indNames2))) {
stop('After removing omit.indicators=, the indicators of factor ',
fn, ' do not match across levels',
ifelse(ngroups > 1L, paste(' in group', gLab), ""))
}
if (dropSingle && length(indNames1) == 1L) next
## is it a higher-order factor?
latInds1 <- intersect(indNames1, allLatInds)
if (length(latInds1)) {
lowList <- lapply(latInds1, function(fn1) {
indNames1 <- setdiff(PT$rhs[PT$lhs == fn1 & PT$level == clus.label[1L]],
omit.indicators)
indNames2 <- setdiff(PT$rhs[PT$lhs == fn1 & PT$level == clus.label[2L]],
omit.indicators)
## check indicators also match for lower-order factors
            if (!isTRUE(all.equal(indNames1, indNames2))) {
stop('After removing omit.indicators=, the indicators of factor ',
fn1, ' do not match across levels',
ifelse(ngroups > 1L, paste(' in group', gLab), ""))
}
## check indicators of lower-order factors are not also latent
lowerLatent <- intersect(indNames1, allLatInds)
if (length(lowerLatent))
stop('Indicators of lower-order factor ', fn1,
ifelse(ngroups > 1L, paste(' in group', gLab), ""),
' cannot also be latent')
indNames1
})
## update indicator list to only include relevant observed variables
indNames1 <- intersect(c(indNames1, do.call(c, lowList)), ov.names1)
## update list of higher-order factors
higher <- c(higher, fn)
}
Sigma1 <- SIGMA[[idx1]][indNames1, indNames1, drop = FALSE]
Sigma2 <- SIGMA[[idx2]][indNames1, indNames1, drop = FALSE]
if (tau.eq) {
if (fn %in% higher) {
warning('Cannot apply alpha (tau.eq=TRUE) to higher-order factor ',
fn, '. Instead, fit a higher-order CFA that imposes ',
'tau-equivalence constraints.')
} else {
## ALPHA
rel[[g]]$config[[fn]] <- c(`alpha_W` = computeAlpha(Sigma1),
`alpha_2L` = computeAlpha(Sigma1 + Sigma2))
}
} else {
## OMEGA
lam1 <- LAMBDA[[idx1]][indNames1, c(fn, latInds1), drop = FALSE]
lam2 <- LAMBDA[[idx2]][indNames1, c(fn, latInds1), drop = FALSE]
if (!isTRUE(all.equal(lam1, lam2)))
warning('Unequal observed-indicator loadings across levels ',
'detected among factors (', paste(c(fn, latInds1), collapse = ","),
ifelse(ngroups > 1L, paste(') in group', gLab), ")"),
'. omega_2L for configural constructs assumes invariance.')
phi1 <- PHI[[idx1]][c(fn, latInds1), c(fn, latInds1), drop = FALSE]
phi2 <- PHI[[idx2]][c(fn, latInds1), c(fn, latInds1), drop = FALSE]
if (length(latInds1)) {
bet1 <- BETA[[idx1]][c(fn, latInds1), c(fn, latInds1), drop = FALSE]
bet2 <- BETA[[idx2]][c(fn, latInds1), c(fn, latInds1), drop = FALSE]
if (!isTRUE(all.equal(bet1, bet2)))
              warning('Unequal higher-order loadings detected across levels ',
                      'for factor ', fn,
ifelse(ngroups > 1L, paste(' in group', gLab), ""),
'. omega_2L for configural constructs assumes invariance.')
commonCov1 <- lam1 %*% bet1 %*% phi1 %*% t(bet1) %*% t(lam1)
commonCov2 <- lam2 %*% bet2 %*% phi2 %*% t(bet2) %*% t(lam2)
} else {
commonCov1 <- lam1 %*% phi1 %*% t(lam1)
commonCov2 <- lam2 %*% phi2 %*% t(lam2)
}
rel[[g]]$config[[fn]] <- c(`omega_W` = sum(commonCov1) / sum(Sigma1),
`omega_2L` = sum(commonCov1 + commonCov2) / sum(Sigma1 + Sigma2))
}
} else {
        warning('Configural factor ', fn, ' not detected at both levels of ',
'analysis, so removed from config= list. Please use the ',
'same name for the within- and between-level component of a ',
'configural construct in your syntax.')
config <- setdiff(config, fn) # rm non-configural construct
}
}
## after removing ineligible config, still multiple for total?
if (length(config) > 1L) {
## only possible if NONE of config= are higher-order factors, or if
## the first-order factors in config= are not latent indicators of any
## higher-order factors in config=
if (return.total[idx1] && !(length(higher) && any(config %in% allLatInds))) {
indNames1 <- setdiff(PT$rhs[PT$lhs %in% config & PT$level == clus.label[1]],
omit.indicators)
## include observed indicators of any latent indicators
latInds1 <- intersect(indNames1, allLatInds)
indNames1 <- setdiff(PT$rhs[PT$lhs %in% c(config, latInds1) & PT$level == clus.label[1]],
omit.indicators)
Sigma1 <- SIGMA[[idx1]][indNames1, indNames1, drop = FALSE]
Sigma2 <- SIGMA[[idx2]][indNames1, indNames1, drop = FALSE]
if (tau.eq) {
if (any(config %in% higher)) {
warning('Cannot apply alpha (tau.eq=TRUE) to total composite ',
'that includes higher-order configural factor(s):\n',
                    paste(intersect(config, higher), collapse = ", "),
'\nInstead, impose tau-equiv. in a higher-order CFA')
} else {
## ALPHA
rel[[g]]$config$total <- c(`alpha_W` = computeAlpha(Sigma1),
`alpha_2L` = computeAlpha(Sigma1 + Sigma2))
}
} else {
## OMEGA
lam1 <- LAMBDA[[idx1]][indNames1, c(config, latInds1), drop = FALSE]
lam2 <- LAMBDA[[idx2]][indNames1, c(config, latInds1), drop = FALSE]
phi1 <- PHI[[idx1]][c(config, latInds1), c(config, latInds1), drop = FALSE]
phi2 <- PHI[[idx2]][c(config, latInds1), c(config, latInds1), drop = FALSE]
if (length(latInds1)) {
bet1 <- BETA[[idx1]][c(config, latInds1), c(config, latInds1), drop = FALSE]
bet2 <- BETA[[idx2]][c(config, latInds1), c(config, latInds1), drop = FALSE]
commonCov1 <- lam1 %*% bet1 %*% phi1 %*% t(bet1) %*% t(lam1)
commonCov2 <- lam2 %*% bet2 %*% phi2 %*% t(bet2) %*% t(lam2)
} else {
commonCov1 <- lam1 %*% phi1 %*% t(lam1)
commonCov2 <- lam2 %*% phi2 %*% t(lam2)
}
rel[[g]]$config$total <- c(`omega_W` = sum(commonCov1) / sum(Sigma1),
`omega_2L` = sum(commonCov1 + commonCov2) / sum(Sigma1 + Sigma2))
}
}
## collect 2 coefs for multiple composites into a matrix
rel[[g]]$config <- do.call(cbind, rel[[g]]$config)
}
## (reciprocal of) harmonic-mean cluster size
Ns <- mean(1 / lavInspect(object, "cluster.size",
drop.list.single.group = FALSE)[[g]])
## reset higher= argument for later checks
higher <- character(0)
## shared construct(s) defined at between level in this group?
for (fn in shared) {
if (fn %in% omit.factors) {
## why would they do this?
shared <- setdiff(shared, omit.factors)
next
}
if (fn %in% lv.names2) {
indNames2 <- setdiff(PT$rhs[PT$lhs == fn & PT$level == clus.label[2L]],
omit.indicators)
## only Level-2 single-indicator factors are relevant to drop
if (dropSingle && length(indNames2) == 1L) next
## is it a higher-order factor?
latInds2 <- intersect(indNames2, allLatInds)
if (length(latInds2)) {
lowList <- lapply(latInds2, function(fn2) {
indNames2 <- setdiff(PT$rhs[PT$lhs == fn2 & PT$level == clus.label[2L]],
omit.indicators)
## check indicators of lower-order factors are not also latent
lowerLatent <- intersect(indNames2, allLatInds)
if (length(lowerLatent))
stop('Indicators of lower-order factor ', fn2,
ifelse(ngroups > 1L, paste(' in group', gLab), ""),
' cannot also be latent')
indNames2
})
## update indicator list to only include relevant observed variables
indNames2 <- intersect(c(indNames2, do.call(c, lowList)), ov.names2)
## update list of higher-order factors
higher <- c(higher, fn)
}
## capture within-level variance components of same indicators
## (make sure none are Level-2 only)
indNames1 <- intersect(indNames2, ov.names1)
Sigma1 <- SIGMA[[idx1]][indNames1, indNames1, drop = FALSE]
Sigma2 <- SIGMA[[idx2]][indNames2, indNames2, drop = FALSE]
if (tau.eq) {
if (fn %in% higher) {
warning('Cannot apply alpha (tau.eq=TRUE) to higher-order factor ',
fn, '. Instead, fit a higher-order CFA that imposes ',
'tau-equivalence constraints.')
} else {
nI <- length(indNames2)
if (nI == 1L) {
stop('Coefficient alpha is undefined for a single indicator. ',
'Set tau.eq=FALSE or dropSingle=TRUE')
}
kw <- nI / (nI-1) # weight for alpha based on number of items
## ALPHA
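            ## numerator uses only the between-level covariances, so zero the diagonal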
onlyCov2 <- Sigma2
diag(onlyCov2) <- 0
rel[[g]]$shared[[fn]] <- c(`alpha_B` = kw*sum(onlyCov2) / sum(Sigma1*Ns, Sigma2),
`IRR` = sum( Sigma2 ) / sum(Sigma1*Ns, Sigma2))
}
} else {
## OMEGA
lam2 <- LAMBDA[[idx2]][indNames2, c(fn, latInds2), drop = FALSE]
phi2 <- PHI[[idx2]][c(fn, latInds2), c(fn, latInds2), drop = FALSE]
if (length(latInds2)) {
bet2 <- BETA[[idx2]][c(fn, latInds2), c(fn, latInds2), drop = FALSE]
commonCov2 <- lam2 %*% bet2 %*% phi2 %*% t(bet2) %*% t(lam2)
} else commonCov2 <- lam2 %*% phi2 %*% t(lam2)
rel[[g]]$shared[[fn]] <- c(`omega_B` = sum(commonCov2) / sum(Sigma1*Ns, Sigma2),
`IRR` = sum( Sigma2 ) / sum(Sigma1*Ns, Sigma2))
}
} else shared <- setdiff(shared, fn) # rm non-shared construct
}
## after removing ineligible shared, still multiple for total?
if (length(shared) > 1L) {
## only possible if NONE of shared= are higher-order factors, or if
## the first-order factors in shared= are not latent indicators of any
## higher-order factors in shared=
if (return.total[idx2] && !(length(higher) && any(shared %in% allLatInds))) {
indNames2 <- setdiff(PT$rhs[PT$lhs %in% shared & PT$level == clus.label[2]],
omit.indicators)
## include observed indicators of any latent indicators
latInds2 <- intersect(indNames2, allLatInds)
indNames2 <- setdiff(PT$rhs[PT$lhs %in% c(shared, latInds2) & PT$level == clus.label[2]],
omit.indicators)
## capture within-level variance components of same indicators
## (make sure none are Level-2 only)
indNames1 <- intersect(indNames2, ov.names1)
Sigma1 <- SIGMA[[idx1]][indNames1, indNames1, drop = FALSE]
Sigma2 <- SIGMA[[idx2]][indNames2, indNames2, drop = FALSE]
if (tau.eq) {
if (any(shared %in% higher)) {
warning('Cannot apply alpha (tau.eq=TRUE) to total composite ',
'that includes higher-order shared factor(s):\n',
paste(intersect(shared, higher), collapse = ", "),
'\nInstead, impose tau-equiv. in a higher-order CFA')
} else {
## ALPHA
nI <- length(indNames2) #TODO: justify when > length(indNames1)
# (Level-1 component exists with SD=0)
kw <- nI / (nI-1) # weight for alpha based on number of items
onlyCov2 <- Sigma2
diag(onlyCov2) <- 0
rel[[g]]$shared$total <- c(`alpha_B` = kw*sum(onlyCov2) / sum(Sigma1*Ns, Sigma2),
`IRR` = sum( Sigma2 ) / sum(Sigma1*Ns, Sigma2))
}
} else {
## OMEGA
lam2 <- LAMBDA[[idx2]][indNames2, c(shared, latInds2), drop = FALSE]
phi2 <- PHI[[idx2]][c(shared, latInds2), c(shared, latInds2), drop = FALSE]
          if (length(latInds2)) {
bet2 <- BETA[[idx2]][c(shared, latInds2), c(shared, latInds2), drop = FALSE]
commonCov2 <- lam2 %*% bet2 %*% phi2 %*% t(bet2) %*% t(lam2)
} else commonCov2 <- lam2 %*% phi2 %*% t(lam2)
rel[[g]]$shared$total <- c(`omega_B` = sum(commonCov2) / sum(Sigma1*Ns, Sigma2),
`IRR` = sum( Sigma2 ) / sum(Sigma1*Ns, Sigma2))
}
}
## collect 2 coefs for multiple composites into a matrix
rel[[g]]$shared <- do.call(cbind, rel[[g]]$shared)
}
} # end loop over groups
## drop list structure?
if (ngroups == 1L) {
rel <- rel[[1]]
} else names(rel) <- group.label
}
rel
}
## -------------
## reliability()
## (deprecated 10 May 2022)
## -------------
##' Composite Reliability using SEM
##'
##' Calculate composite reliability from estimated factor-model parameters
##'
##' The coefficient alpha (Cronbach, 1951) can be calculated by
##'
##' \deqn{ \alpha = \frac{k}{k - 1}\left[ 1 - \frac{\sum^{k}_{i = 1}
##' \sigma_{ii}}{\sum^{k}_{i = 1} \sigma_{ii} + 2\sum_{i < j} \sigma_{ij}}
##' \right],}
##'
##' where \eqn{k} is the number of items in a factor, \eqn{\sigma_{ii}} is
##' item *i*'s observed variance, and \eqn{\sigma_{ij}} is the observed
##' covariance of items *i* and *j*.
##'
##' Several coefficients for factor-analysis reliability have been termed
##' "omega", which Cho (2021) argues is a misleading misnomer and argues for
##' using \eqn{\rho} to represent them all, differentiated by descriptive
##' subscripts. In our package, we number \eqn{\omega} based on commonly
##' applied calculations. Bentler (1968) first introduced factor-analysis
##' reliability for a unidimensional factor model with congeneric indicators.
##' However, assuming there are no cross-loadings in a multidimensional CFA,
##' this reliability coefficient can be calculated for each factor in the model.
##'
##' \deqn{ \omega_1 =\frac{\left( \sum^{k}_{i = 1} \lambda_i \right)^{2}
##' Var\left( \psi \right)}{\left( \sum^{k}_{i = 1} \lambda_i \right)^{2}
##' Var\left( \psi \right) + \sum^{k}_{i = 1} \theta_{ii} + 2\sum_{i < j}
##' \theta_{ij} }, }
##'
##' where \eqn{\lambda_i} is the factor loading of item *i*, \eqn{\psi} is
##' the factor variance, \eqn{\theta_{ii}} is the variance of measurement errors
##' of item *i*, and \eqn{\theta_{ij}} is the covariance of measurement
##' errors from item *i* and *j*. McDonald (1999) later referred to
##' this *and other reliability coefficients* as "omega", which is a source
##' of confusion when reporting coefficients (Cho, 2021).
##'
##' The additional coefficients generalize the first formula by accounting for
##' multidimensionality (possibly with cross-loadings) and correlated errors.
##' By setting `return.total=TRUE`, one can estimate reliability for a
##' single composite calculated using all indicators in the multidimensional
##' CFA (Bentler, 1972, 2009). `"omega2"` is calculated by
##'
##' \deqn{ \omega_2 = \frac{\left( \sum^{k}_{i = 1} \lambda_i \right)^{2}
##' Var\left( \psi \right)}{\bold{1}^\prime \hat{\Sigma} \bold{1}}, }
##'
##' where \eqn{\hat{\Sigma}} is the model-implied covariance matrix, and
##' \eqn{\bold{1}} is the \eqn{k}-dimensional vector of 1s. The first and
##' second omega coefficients have the same value per factor in models with
##' simple structure, but they differ when there are (e.g.) cross-loadings
##' or method factors. The first omega coefficient can be viewed as the
##' reliability controlling for the other factors (like \eqn{\eta^2_{partial}}
##' in ANOVA), whereas the second can be viewed as the unconditional
##' reliability (like \eqn{\eta^2} in ANOVA).
##'
##' The `"omega3"` coefficient (McDonald, 1999), sometimes referred to as
##' hierarchical omega, can be calculated by
##'
##' \deqn{ \omega_3 =\frac{\left( \sum^{k}_{i = 1} \lambda_i \right)^{2}
##' Var\left( \psi \right)}{\bold{1}^\prime \Sigma \bold{1}}, }
##'
##' where \eqn{\Sigma} is the observed covariance matrix. If the model fits the
##' data well, \eqn{\omega_3} will be similar to \eqn{\omega_2}. Note that if
##' there is a directional effect in the model, all coefficients are calculated
##' from total factor variances: `lavInspect(object, "cov.lv")`.
##'
##' In conclusion, \eqn{\omega_1}, \eqn{\omega_2}, and \eqn{\omega_3} differ
##' only in their denominators. The denominator of the first formula assumes a
##' congeneric factor model in which measurement errors are not correlated.
##' The second formula accounts for correlated measurement errors. However,
##' both of these formulas assume that the model-implied covariance matrix
##' explains item relationships perfectly, although residuals are subject to
##' sampling error. The third formula uses the observed covariance matrix,
##' rather than the model-implied covariance matrix, to calculate the observed
##' total variance, making it the most conservative method of calculating
##' coefficient omega.
##'
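##' As a sketch of how these coefficients differ only in their denominators,
##' suppose the hypothetical objects `lam`, `psi`, and `theta` hold a factor's
##' loadings, its variance, and the error covariance matrix of its items,
##' while `SigmaHat` and `S` hold the model-implied and observed covariance
##' matrices of those items:
##'
##' ```
##' common <- sum(lam)^2 * psi # common-factor variance of the sum score
##' common / (common + sum(theta)) # omega_1
##' common / sum(SigmaHat) # omega_2 (model-implied total variance)
##' common / sum(S) # omega_3 (observed total variance)
##' ```
##'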
##' The average variance extracted (AVE) can be calculated by
##'
##' \deqn{ AVE = \frac{\bold{1}^\prime
##' \textrm{diag}\left(\Lambda\Psi\Lambda^\prime\right)\bold{1}}{\bold{1}^\prime
##' \textrm{diag}\left(\hat{\Sigma}\right) \bold{1}} .}
##'
##' Note that this formula is modified from Fornell & Larcker (1981), whose
##' proposed formula assumes that the factor variances are 1. Note that AVE
##' will not be provided for factors consisting of items with dual loadings,
##' because AVE is a property of the items rather than of the factors. AVE is
##' calculated from polychoric correlations when ordinal indicators are used.
##'
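##' As a sketch, using hypothetical loading (`Lambda`), factor covariance
##' (`Psi`), and model-implied covariance (`SigmaHat`) matrices:
##'
##' ```
##' trueVar <- Lambda %*% Psi %*% t(Lambda)
##' sum(diag(trueVar)) / sum(diag(SigmaHat)) # AVE
##' ```
##'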
##' Coefficient alpha is by definition applied by treating indicators as numeric
##' (see Chalmers, 2018), which is consistent with the `alpha` function in
##' the `psych` package. When indicators are ordinal, `reliability`
##' additionally applies the standard alpha calculation to the polychoric
##' correlation matrix to return Zumbo et al.'s (2007) "ordinal alpha".
##'
##' Coefficient omega for categorical items is calculated using Green and Yang's
##' (2009, formula 21) approach. Three types of coefficient omega indicate
##' different methods to calculate item total variances. The original formula
##' from Green and Yang is equivalent to \eqn{\omega_3} in this function.
##' Green and Yang did not propose a method for
##' calculating reliability with a mixture of categorical and continuous
##' indicators, and we are currently unaware of an appropriate method.
##' Therefore, when `reliability` detects both categorical and continuous
##' indicators of a factor, an error is returned. If the categorical indicators
##' load on a different factor(s) than continuous indicators, then reliability
##' will still be calculated separately for those factors, but
##' `return.total` must be `FALSE` (unless `omit.factors` is used
##' to isolate factors with indicators of the same type).
##'
##'
##' @importFrom lavaan lavInspect lavNames
##' @importFrom methods getMethod
##'
##' @param object A [lavaan::lavaan-class] or [lavaan.mi::lavaan.mi-class] object,
##' expected to contain only exogenous common factors (i.e., a CFA model).
##' @param what `character` vector naming any reliability indices to
##' calculate. All are returned by default. When indicators are ordinal,
##' both traditional `"alpha"` and Zumbo et al.'s (2007) so-called
##' "ordinal alpha" (`"alpha.ord"`) are returned, though the latter is
##' arguably of dubious value (Chalmers, 2018).
##' @param return.total `logical` indicating whether to return a final
##' column containing the reliability of a composite of all indicators (not
##' listed in `omit.indicators`) of factors not listed in
##' `omit.factors`. Ignored in 1-factor models, and should only be set
##' `TRUE` if all factors represent scale dimensions that could be
##' meaningfully collapsed to a single composite (scale sum or scale mean).
##' @param dropSingle `logical` indicating whether to exclude factors
##' defined by a single indicator from the returned results. If `TRUE`
##' (default), single indicators will still be included in the `total`
##' column when `return.total = TRUE`.
##' @param omit.factors `character` vector naming any common factors
##' modeled in `object` whose composite reliability is not of
##' interest. For example, higher-order or method factors. Note that
##' [reliabilityL2()] should be used to calculate composite
##' reliability of a higher-order factor.
##' @param omit.indicators `character` vector naming any observed variables
##' that should be ignored when calculating composite reliability. This can
##' be useful, for example, to estimate reliability when an indicator is
##' removed.
##' @param omit.imps `character` vector specifying criteria for omitting
##' imputations from pooled results. Can include any of
##' `c("no.conv", "no.se", "no.npd")`, the first 2 of which are the
##' default setting, which excludes any imputations that did not
##' converge or for which standard errors could not be computed. The
##' last option (`"no.npd"`) would exclude any imputations which
##' yielded a nonpositive definite covariance matrix for observed or
##' latent variables, which would include any "improper solutions" such
##' as Heywood cases. NPD solutions are not excluded by default because
##' they are likely to occur due to sampling error, especially in small
##' samples. However, gross model misspecification could also cause
##' NPD solutions, so users can compare pooled results with and without
##' this setting as a sensitivity analysis to see whether some
##' imputations warrant further investigation.
##'
##' @return Reliability values (coefficient alpha, coefficients omega, average
##' variance extracted) of each factor in each group. If there are multiple
##' factors, a `total` column can optionally be included.
##'
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' Yves Rosseel (Ghent University; \email{Yves.Rosseel@@UGent.be})
##'
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @references
##' Bentler, P. M. (1972). A lower-bound method for the dimension-free
##' measurement of internal consistency. *Social Science Research, 1*(4),
##' 343--357. \doi{10.1016/0049-089X(72)90082-8}
##'
##' Bentler, P. M. (2009). Alpha, dimension-free, and model-based internal
##' consistency reliability. *Psychometrika, 74*(1), 137--143.
##' \doi{10.1007/s11336-008-9100-1}
##'
##' Chalmers, R. P. (2018). On misconceptions and the limited usefulness of
##' ordinal alpha. *Educational and Psychological Measurement, 78*(6),
##' 1056--1071. \doi{10.1177/0013164417727036}
##'
##' Cho, E. (2021) Neither Cronbach’s alpha nor McDonald’s omega: A commentary
##' on Sijtsma and Pfadt. *Psychometrika, 86*(4), 877--886.
##' \doi{10.1007/s11336-021-09801-1}
##'
##' Cronbach, L. J. (1951). Coefficient alpha and the internal structure of
##' tests. *Psychometrika, 16*(3), 297--334. \doi{10.1007/BF02310555}
##'
##' Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models
##' with unobservable variables and measurement errors. *Journal of
##' Marketing Research, 18*(1), 39--50. \doi{10.2307/3151312}
##'
##' Green, S. B., & Yang, Y. (2009). Reliability of summed item scores using
##' structural equation modeling: An alternative to coefficient alpha.
##' *Psychometrika, 74*(1), 155--167. \doi{10.1007/s11336-008-9099-3}
##'
##' McDonald, R. P. (1999). *Test theory: A unified treatment*. Mahwah, NJ:
##' Erlbaum.
##'
##' Raykov, T. (2001). Estimation of congeneric scale reliability using
##' covariance structure analysis with nonlinear constraints. *British
##' Journal of Mathematical and Statistical Psychology, 54*(2), 315--323.
##' \doi{10.1348/000711001159582}
##'
##' Zumbo, B. D., Gadermann, A. M., & Zeisser, C. (2007). Ordinal versions of
##' coefficients alpha and theta for Likert rating scales.
##' *Journal of Modern Applied Statistical Methods, 6*(1), 21--29.
##' \doi{10.22237/jmasm/1177992180}
##'
##' Zumbo, B. D., & Kroc, E. (2019). A measurement is a choice and Stevens’
##' scales of measurement do not help make it: A response to Chalmers.
##' *Educational and Psychological Measurement, 79*(6), 1184--1197.
##' \doi{10.1177/0013164419844305}
##'
##'
##' @examples
##'
##' data(HolzingerSwineford1939)
##' HS9 <- HolzingerSwineford1939[ , c("x7","x8","x9")]
##' HSbinary <- as.data.frame( lapply(HS9, cut, 2, labels=FALSE) )
##' names(HSbinary) <- c("y7","y8","y9")
##' HS <- cbind(HolzingerSwineford1939, HSbinary)
##'
##' HS.model <- ' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ y7 + y8 + y9 '
##'
##' fit <- cfa(HS.model, data = HS, ordered = c("y7","y8","y9"), std.lv = TRUE)
##'
##' ## works for factors with exclusively continuous OR categorical indicators
##' reliability(fit)
##'
##' ## reliability for ALL indicators only available when they are
##' ## all continuous or all categorical
##' reliability(fit, omit.factors = "speed", return.total = TRUE)
##'
##'
##' ## loop over visual indicators to calculate alpha if one indicator is removed
##' for (i in paste0("x", 1:3)) {
##' cat("Drop x", i, ":\n")
##' print(reliability(fit, omit.factors = c("textual","speed"),
##' omit.indicators = i, what = "alpha"))
##' }
##'
##'
##' ## works for multigroup models and for multilevel models (and both)
##' data(Demo.twolevel)
##' ## assign clusters to arbitrary groups
##' Demo.twolevel$g <- ifelse(Demo.twolevel$cluster %% 2L, "type1", "type2")
##' model2 <- ' group: type1
##' level: within
##' fac =~ y1 + L2*y2 + L3*y3
##' level: between
##' fac =~ y1 + L2*y2 + L3*y3
##'
##' group: type2
##' level: within
##' fac =~ y1 + L2*y2 + L3*y3
##' level: between
##' fac =~ y1 + L2*y2 + L3*y3
##' '
##' fit2 <- sem(model2, data = Demo.twolevel, cluster = "cluster", group = "g")
##' reliability(fit2, what = c("alpha","omega3"))
##'
##' @name reliability-deprecated
##' @usage
##' reliability(object, what = c("alpha", "omega", "omega2", "omega3", "ave"),
##' return.total = FALSE, dropSingle = TRUE, omit.factors = character(0),
##' omit.indicators = character(0), omit.imps = c("no.conv", "no.se"))
##' @seealso [semTools-deprecated()]
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##' @section Reliability:
##' The original `reliability` function was suboptimally designed.
##' For example, AVE was returned, which is not a reliability index. Also,
##' alpha and several omega-type coefficients were returned, including the
##' original formula, which is inappropriate for models with complex structure.
##' Some features could be controlled by the user for one but not both types of
##' index. For example, alpha for categorical indicators was returned on both
##' the observed and latent-response scales, but this was not an option for any
##' omega-type indices. The omegas differed in terms of whether the observed or
##' model-implied covariance matrix was used in the denominator, but alpha was
##' only computed using the observed matrix. These inconsistencies have been
##' resolved in the new [compRelSEM()] function, which returns only
##' one reliability index (per factor, optionally total score) according to the
##' user's requested features, for which there is much more flexibility.
##' Average variance extracted is now available in a dedicated [AVE()]
##' function.
##'
##' @export
reliability <- function(object,
what = c("alpha","omega","omega2","omega3","ave"),
return.total = FALSE, dropSingle = TRUE,
omit.factors = character(0),
omit.indicators = character(0),
omit.imps = c("no.conv","no.se")) {
.Deprecated(msg = c("\nThe reliability() function was deprecated in 2022 and ",
"will cease to be included in future versions of semTools",
". See help('semTools-deprecated) for details.",
"\n\nIt is replaced by the compRelSEM() function, which ",
"can estimate alpha and model-based reliability in an ",
"even wider variety of models and data types, with ",
"greater control in specifying the desired type of ",
"reliability coefficient (i.e., more explicitly choosing ",
"assumptions). \n\nThe average variance extracted should ",
"never have been included because it is not a reliability ",
"coefficient. It is now available from the AVE() function."))
ngroups <- lavInspect(object, "ngroups") #TODO: adapt to multiple levels
nLevels <- lavInspect(object, "nlevels")
nblocks <- ngroups*nLevels #FIXME: always true?
return.total <- rep(return.total, nblocks)
group.label <- if (ngroups > 1L) lavInspect(object, "group.label") else NULL
#FIXME? lavInspect(object, "level.labels")
clus.label <- if (nLevels > 1L) c("within", lavInspect(object, "cluster")) else NULL
if (nblocks > 1L) {
block.label <- paste(rep(group.label, each = nLevels), clus.label,
sep = if (ngroups > 1L && nLevels > 1L) "_" else "")
}
## check for categorical (determines what S will be)
anyCategorical <- lavInspect(object, "categorical")
if (anyCategorical && "alpha" %in% what) {
what <- c(what, "alpha.ord")
what <- unique(what) # in case it was already explicitly requested
}
## categorical-model parameters
threshold <- if (anyCategorical) getThreshold(object, omit.imps = omit.imps) else NULL
latScales <- if (anyCategorical) getScales(object, omit.imps = omit.imps) else NULL
## all other relevant parameters in GLIST format (not flat, need block-level list)
if (inherits(object, "lavaan")) {
param <- lavInspect(object, "est")
ve <- lavInspect(object, "cov.lv") # model-implied latent covariance matrix
S <- object@h1$implied$cov # observed sample covariance matrix (already a list)
if (anyCategorical && any(c("alpha","alpha.ord") %in% what)) {
rawData <- try(lavInspect(object, "data"), silent = TRUE)
if (inherits(rawData, "try-error"))
stop('Error in lavInspect(fit, "data"); what="alpha" unavailable for ',
'models fitted to summary statistics of categorical data.')
if (nblocks == 1L) rawData <- list(rawData)
S.as.con <- lapply(rawData, cov) # for actual "alpha", not "alpha.ord"
}
if (nblocks == 1L) {
param <- list(param)
ve <- list(ve)
}
} else if (inherits(object, "lavaan.mi")) {
if (!"package:lavaan.mi" %in% search()) attachNamespace("lavaan.mi")
useImps <- rep(TRUE, length(object@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(object@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(object@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(object@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(object@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
param <- object@coefList[[ useImps[1] ]] # first admissible as template
coefList <- object@coefList[useImps]
phiList <- object@phiList[useImps]
if (anyCategorical) {
dataList <- object@DataList[useImps]
S.as.con <- vector("list", nblocks) # for group-list of pooled S
}
## add block-level list per imputation?
if (nblocks == 1L) {
param <- list(param)
for (i in 1:m) {
coefList[[i]] <- list(coefList[[i]])
phiList[[i]] <- list(phiList[[i]])
}
if (anyCategorical) { #FIXME: currently no categorical ML-SEMs
#dataList[[i]] <- list(dataList[[i]])
VV <- lavNames(object, type = "ov")
impCovList <- lapply(dataList, function(DD) {
dat <- do.call(cbind, sapply(DD[VV], as.numeric, simplify = FALSE))
cov(dat)
})
S.as.con[[1]] <- Reduce("+", impCovList) / length(impCovList)
}
} else if (anyCategorical) { #FIXME: currently no categorical ML-SEMs
## multigroup models need separate data matrices per group
G <- lavInspect(object, "group")
for (g in seq_along(group.label)) {
VV <- try(lavNames(object, type = "ov", group = group.label[g]),
silent = TRUE)
if (inherits(VV, "try-error")) {
VV <- lavNames(object, type = "ov", group = g)
}
impCovList <- lapply(dataList, function(DD) {
RR <- DD[,G] == group.label[g]
dat <- do.call(cbind, sapply(DD[RR, VV], as.numeric, simplify = FALSE))
cov(dat)
})
S.as.con[[g]] <- Reduce("+", impCovList) / length(impCovList)
}
}
S <- vector("list", nblocks) # pooled observed OR polychoric covariance matrix
ve <- vector("list", nblocks)
## loop over blocks
for (b in 1:nblocks) {
## param: loop over GLIST elements
for (mat in names(param[[b]])) {
matList <- lapply(coefList, function(i) i[[b]][[mat]])
param[[b]][[mat]] <- Reduce("+", matList) / length(matList)
} # mat
## pooled observed OR polychoric covariance matrix
covList <- lapply(object@h1List[useImps], function(i) i$implied$cov[[b]])
S[[b]] <- Reduce("+", covList) / m
## pooled model-implied latent covariance matrix
ve[[b]] <- Reduce("+", lapply(phiList, "[[", i = b) ) / m
} # b
}
if (nblocks == 1L) {
SigmaHat <- getMethod("fitted", class(object))(object)["cov"] # retain list format
} else {
SigmaHat <- sapply(getMethod("fitted", class(object))(object),
"[[", "cov", simplify = FALSE)
}
ly <- lapply(param, "[[", "lambda")
te <- lapply(param, "[[", "theta")
beta <- if ("beta" %in% names(param[[1]])) {
lapply(param, "[[", "beta")
} else NULL
result <- list()
warnTotal <- FALSE
warnHigher <- character(0) # collect list of potential higher-order factors
warnOmega <- FALSE # flag factors with mixed categorical/continuous indicators
## loop over i blocks (groups/levels)
for (i in 1:nblocks) {
## extract factor and indicator names
allIndNames <- rownames(ly[[i]])
allFacNames <- colnames(ly[[i]])
myFacNames <- setdiff(allFacNames, omit.factors)
subLY <- ly[[i]][ , myFacNames, drop = FALSE] != 0
myIndNames <- rownames(subLY)[apply(subLY, MARGIN = 1L, FUN = any)]
## distinguish between categorical, continuous, and latent indicators
nameArgs <- list(object = object)
if (nblocks > 1L) nameArgs$block <- i
ordNames <- do.call(lavNames, c(nameArgs, list(type = "ov.ord")))
numNames <- do.call(lavNames, c(nameArgs, list(type = "ov.num")))
if (anyCategorical) {
## identify when the (sub)set of factors are all categorical
blockCat <- all(myIndNames %in% ordNames)
## identify when the (sub)set of factors have mixed indicators, so no total
mix <- any(myIndNames %in% ordNames) && any(myIndNames %in% numNames)
} else {
blockCat <- FALSE
mix <- FALSE
}
if (mix && return.total[i]) {
return.total[i] <- FALSE
warnTotal <- TRUE
}
## identify POSSIBLE higher-order factors (that affect other latent vars)
latInds <- do.call(lavNames, c(nameArgs, list(type = "lv.ind")))
higher <- if (length(latInds) == 0L) character(0) else {
allFacNames[apply(beta[[i]], MARGIN = 2, function(x) any(x != 0))]
}
## keep track of factor indices to skip
idx.drop <- numeric(0)
## relevant quantities
common <- (apply(ly[[i]], 2, sum)^2) * diag(ve[[i]])
truevar <- ly[[i]] %*% ve[[i]] %*% t(ly[[i]])
## vectors to store results for each factor
error <- rep(NA, length(common))
alpha <- rep(NA, length(common))
alpha.ord <- rep(NA, length(common))
total <- rep(NA, length(common))
omega1 <- omega2 <- omega3 <- rep(NA, length(common))
impliedTotal <- rep(NA, length(common))
avevar <- rep(NA, length(common))
## loop over j factors
for (j in 1:length(common)) {
## skip this factor?
if (allFacNames[j] %in% omit.factors) {
idx.drop <- c(idx.drop, j)
next
}
index <- setdiff(which(ly[[i]][,j] != 0), # nonzero loadings
which(allIndNames %in% omit.indicators))
jIndNames <- allIndNames[index]
## identify when this factor has mixed indicators, so no omegas
jMix <- any(jIndNames %in% ordNames) && any(jIndNames %in% numNames)
## check for ANY indicators (possibly skip purely higher-order factors)
if (length(index) == 0L) {
idx.drop <- c(idx.drop, j)
next
}
## check for single indicators
if (dropSingle && length(index) == 1L) {
idx.drop <- c(idx.drop, j)
next
}
## check for categorical (or mixed) indicators
jCat <- any(jIndNames %in% ordNames)
warnOmega <- warnOmega || (jCat && !all(jIndNames %in% ordNames))
## check for latent indicators
if (allFacNames[j] %in% higher && !(allFacNames[j] %in% omit.factors)) {
warnHigher <- c(warnHigher, allFacNames[j])
}
sigma <- S[[i]][index, index, drop = FALSE]
faccontrib <- ly[[i]][,j, drop = FALSE] %*% ve[[i]][j,j, drop = FALSE] %*% t(ly[[i]][,j, drop = FALSE])
truefac <- diag(faccontrib[index, index, drop = FALSE])
trueitem <- diag(truevar[index, index, drop = FALSE])
erritem <- diag(te[[i]][index, index, drop = FALSE])
if (sum(abs(trueitem - truefac)) < 0.00001 & "ave" %in% what) {
avevar[j] <- sum(trueitem) / sum(trueitem + erritem)
}
if (jCat) {
if ("alpha" %in% what) {
alpha[j] <- computeAlpha(S.as.con[[i]][index, index, drop = FALSE])
}
if ("alpha.ord" %in% what) {
alpha.ord[j] <- computeAlpha(sigma)
}
if ("omega" %in% what) {
omega1[j] <- omegaCat(truevar = faccontrib[index, index, drop = FALSE],
threshold = threshold[[i]][jIndNames],
scales = latScales[[i]][index],
denom = faccontrib[index, index, drop = FALSE] + te[[i]][index, index, drop = FALSE])
}
if ("omega2" %in% what) {
omega2[j] <- omegaCat(truevar = faccontrib[index, index, drop = FALSE],
threshold = threshold[[i]][jIndNames],
scales = latScales[[i]][index],
denom = SigmaHat[[i]][index, index, drop = FALSE])
}
if ("omega3" %in% what) {
omega3[j] <- omegaCat(truevar = faccontrib[index, index, drop = FALSE],
threshold = threshold[[i]][jIndNames],
scales = latScales[[i]][index],
denom = sigma)
}
} else {
alpha[j] <- computeAlpha(sigma)
commonfac <- sum(faccontrib[index, index, drop = FALSE])
error[j] <- sum(te[[i]][index, index, drop = FALSE])
impliedTotal[j] <- sum(SigmaHat[[i]][index, index, drop = FALSE])
total[j] <- sum(sigma)
omega1[j] <- commonfac / (commonfac + error[j])
omega2[j] <- commonfac / impliedTotal[j]
omega3[j] <- commonfac / total[j]
}
## end loop over j factors
}
if (return.total[i] & length(myFacNames) > 1L) {
if (blockCat) {
if ("alpha" %in% what) {
alpha <- c(alpha, total = computeAlpha(S.as.con[[i]]))
}
if ("alpha.ord" %in% what) {
alpha.ord <- c(alpha.ord, total = computeAlpha(S[[i]]))
}
if ("omega" %in% what) {
omega1 <- c(omega1, total = omegaCat(truevar = truevar,
threshold = threshold[[i]],
scales = latScales[[i]],
denom = truevar + te[[i]]))
}
if ("omega2" %in% what) {
omega2 <- c(omega2, total = omegaCat(truevar = truevar,
threshold = threshold[[i]],
scales = latScales[[i]],
denom = SigmaHat[[i]]))
}
if ("omega2" %in% what) {
omega3 <- c(omega3, total = omegaCat(truevar = truevar,
threshold = threshold[[i]],
scales = latScales[[i]],
denom = S[[i]]))
}
} else {
alpha <- c(alpha, total = computeAlpha(S[[i]]))
omega1 <- c(omega1, total = sum(truevar) / (sum(truevar) + sum(te[[i]])))
omega2 <- c(omega2, total = sum(truevar) / (sum(SigmaHat[[i]])))
omega3 <- c(omega3, total = sum(truevar) / (sum(S[[i]])))
}
avevar <- c(avevar,
total = sum(diag(truevar)) / sum((diag(truevar) + diag(te[[i]]))))
}
if (all(is.na(alpha.ord))) alpha.ord <- NULL
result[[i]] <- rbind(alpha = if ("alpha" %in% what) alpha else NULL,
alpha.ord = if ("alpha.ord" %in% what) alpha.ord else NULL,
omega = if ("omega" %in% what) omega1 else NULL,
omega2 = if ("omega2" %in% what) omega2 else NULL,
omega3 = if ("omega3" %in% what) omega3 else NULL,
avevar = if ("ave" %in% what) avevar else NULL)
colnames(result[[i]])[1:length(allFacNames)] <- allFacNames
if (return.total[i] & length(myFacNames) > 1L) {
colnames(result[[i]])[ ncol(result[[i]]) ] <- "total"
}
if (length(idx.drop)) {
result[[i]] <- result[[i]][ , -idx.drop, drop = FALSE]
## reset indices for next block (could have different model/variables)
idx.drop <- numeric(0)
}
## end loop over blocks
}
warnCat <- sapply(result, function(x) any(c("alpha.ord","ave") %in% rownames(x)))
if (any(warnCat)) {
alphaMessage <- paste0("Zumbo et al.'s (2007) \"ordinal alpha\" is calculated",
' in addition to the standard alpha, which treats ',
'ordinal variables as numeric. See Chalmers (2018) ',
'for a critique of "alpha.ord" and the response by ',
'Zumbo & Kroc (2019).')
AVEmessage <- paste0('average variance extracted is calculated from ',
'polychoric (polyserial) not Pearson correlations.')
both <- "alpha.ord" %in% what & "ave" %in% what
connectMessage <- if (both) ' Likewise, ' else ' the '
catMessage <- paste0("For constructs with categorical indicators, ",
if ("alpha.ord" %in% what) alphaMessage else NULL,
if (both) ' Likewise, ' else NULL,
if ("ave" %in% what) AVEmessage else NULL)
if ("alpha.ord" %in% what || "ave" %in% what) message(catMessage, "\n")
}
if (length(warnHigher)) warning('Possible higher-order factors detected:\n',
paste(unique(warnHigher), collapse = ", "))
if (warnTotal) {
message('Cannot return.total when model contains both continuous and ',
'binary/ordinal observed indicators. Use the ',
'omit.factors= argument to choose factors with only categorical ',
'indicators, if that is a composite of interest.\n')
}
if (warnOmega) {
message('Composite reliability (omega) cannot be computed for factors ',
'with mixed categorical and continuous indicators.')
}
## drop list structure?
if (nblocks == 1L) {
result <- result[[1]]
} else names(result) <- block.label
result
}
## ---------------
## reliabilityL2()
## (deprecated 10 May 2022)
## ---------------
##' Calculate the reliability values of a second-order factor
##'
##' Calculate the reliability values (coefficient omega) of a second-order
##' factor
##'
##' The first formula of the coefficient omega (in the
##' [reliability()]) will be mainly used in the calculation. The
##' model-implied covariance matrix of a second-order factor model can be
##' separated into three sources: the second-order common-factor variance,
##' the residual variance of the first-order common factors (i.e., not
##' accounted for by the second-order factor), and the measurement error of
##' observed indicators:
##'
##' \deqn{ \hat{\Sigma} = \Lambda \bold{B} \Phi_2 \bold{B}^{\prime}
##' \Lambda^{\prime} + \Lambda \Psi_{u} \Lambda^{\prime} + \Theta, }
##'
##' where \eqn{\hat{\Sigma}} is the model-implied covariance matrix,
##' \eqn{\Lambda} contains first-order factor loadings, \eqn{\bold{B}} contains
##' second-order factor loadings, \eqn{\Phi_2} is the covariance matrix of the
##' second-order factor(s), \eqn{\Psi_{u}} is the covariance matrix of residuals
##' from first-order factors, and \eqn{\Theta} is the covariance matrix of the
##' measurement errors from observed indicators. Thus, we can calculate the
##' proportion of variance of a composite score calculated from the observed
##' indicators (e.g., a total score or scale mean) that is attributable to the
##' second-order factor, i.e. coefficient omega at Level 1:
##'
##' \deqn{ \omega_{L1} = \frac{\bold{1}^{\prime} \Lambda \bold{B} \Phi_2
##' \bold{B}^{\prime} \Lambda^{\prime} \bold{1}}{\bold{1}^{\prime} \Lambda
##' \bold{B} \Phi_2 \bold{B} ^{\prime} \Lambda^{\prime} \bold{1} +
##' \bold{1}^{\prime} \Lambda \Psi_{u} \Lambda^{\prime} \bold{1} +
##' \bold{1}^{\prime} \Theta \bold{1}}, }
##'
##' where \eqn{\bold{1}} is the *k*-dimensional vector of ones and *k* is
##' the number of observed variables.
##'
##' The model-implied covariance matrix among first-order factors (\eqn{\Phi_1})
##' can be calculated as:
##'
##' \deqn{ \Phi_1 = \bold{B} \Phi_2 \bold{B}^{\prime} + \Psi_{u} .}
##'
##' Thus, the proportion of variance among first-order common factors that is
##' attributable to the second-order factor (i.e., coefficient omega at Level 2)
##' can be calculated as:
##'
##' \deqn{ \omega_{L2} = \frac{\bold{1}_F^{\prime} \bold{B} \Phi_2
##' \bold{B}^{\prime} \bold{1}_F}{\bold{1}_F^{\prime} \bold{B} \Phi_2
##' \bold{B}^{\prime} \bold{1}_F + \bold{1}_F^{\prime} \Psi_{u} \bold{1}_F}, }
##'
##' where \eqn{\bold{1}_F} is the *F*-dimensional vector of ones and *F*
##' is the number of first-order factors. This Level-2 omega can be interpreted
##' as an estimate of the reliability of a hypothetical composite calculated
##' from error-free observable variables representing the first-order common
##' factors. This might only be meaningful as a thought experiment.
##'
##' An additional thought experiment is possible: If the observed indicators
##' contained only the second-order common-factor variance and unsystematic
##' measurement error, then there would be no first-order common factors because
##' their unique variances would be excluded from the observed measures. An
##' estimate of this hypothetical composite reliability can be calculated as the
##' partial coefficient omega at Level 1, or the proportion of observed
##' variance explained by the second-order factor after partialling out the
##' uniqueness from the first-order factors:
##'
##' \deqn{ \omega^{partial}_{L1} = \frac{\bold{1}^{\prime} \Lambda \bold{B} \Phi_2
##' \bold{B}^{\prime} \Lambda^{\prime} \bold{1}}{\bold{1}^{\prime} \Lambda
##' \bold{B} \Phi_2 \bold{B}^{\prime} \Lambda^{\prime} \bold{1} +
##' \bold{1}^{\prime} \Theta \bold{1}} .}
##'
##' Note that if the second-order factor has a direct factor loading on some
##' observed variables, the observed variables will be counted as first-order
##' factors, which might not be desirable.
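##'
##' As a sketch of these three coefficients, suppose the hypothetical objects
##' `Lambda`, `B`, `Phi2`, `Psi.u`, and `Theta` hold the matrices defined
##' above:
##'
##' ```
##' commonL1 <- sum(Lambda %*% B %*% Phi2 %*% t(B) %*% t(Lambda))
##' uniqueL1 <- sum(Lambda %*% Psi.u %*% t(Lambda)) # first-order uniqueness
##' commonL2 <- sum(B %*% Phi2 %*% t(B))
##' commonL1 / (commonL1 + uniqueL1 + sum(Theta)) # omega_L1
##' commonL2 / (commonL2 + sum(Psi.u)) # omega_L2
##' commonL1 / (commonL1 + sum(Theta)) # partial omega_L1
##' ```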
##'
##'
##' @importFrom lavaan lavInspect
##'
##' @param object A [lavaan::lavaan-class] or [lavaan.mi::lavaan.mi-class] object,
##' expected to contain a least one exogenous higher-order common factor.
##' @param secondFactor The name of a single second-order factor in the
##' model fitted in `object`. The function must be called multiple
##' times to estimate reliability for each higher-order factor.
##' @param omit.imps `character` vector specifying criteria for omitting
##' imputations from pooled results. Can include any of
##' `c("no.conv", "no.se", "no.npd")`, the first 2 of which are the
##' default setting, which excludes any imputations that did not
##' converge or for which standard errors could not be computed. The
##' last option (`"no.npd"`) would exclude any imputations which
##' yielded a nonpositive definite covariance matrix for observed or
##' latent variables, which would include any "improper solutions" such
##' as Heywood cases. NPD solutions are not excluded by default because
##' they are likely to occur due to sampling error, especially in small
##' samples. However, gross model misspecification could also cause
##' NPD solutions, so users can compare pooled results with and without
##' this setting as a sensitivity analysis to see whether some
##' imputations warrant further investigation.
##'
##' @return Reliability values at Levels 1 and 2 of the second-order factor, as
##' well as the partial reliability value at Level 1
##'
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' @examples
##'
##' HS.model3 <- ' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + x8 + x9
##' higher =~ visual + textual + speed'
##'
##' fit6 <- cfa(HS.model3, data = HolzingerSwineford1939)
##' reliability(fit6) # Should provide a warning for the endogenous variables
##' reliabilityL2(fit6, "higher")
##'
##' @name reliabilityL2-deprecated
##' @usage
##' reliabilityL2(object, secondFactor, omit.imps = c("no.conv", "no.se"))
##' @seealso [semTools-deprecated()]
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##' @section Higher-Order Reliability:
##' Originally, composite reliability of a single higher-order factor was
##' estimated in a separate function: `reliabilityL2`. It is now available
##' for multiple higher-order factors in the same model, and from the same
##' [compRelSEM()] function that estimates reliability for first-order
##' factors, using the `higher=` argument to name higher-order factors in
##' the fitted model.
##'
##' @export
reliabilityL2 <- function(object, secondFactor,
omit.imps = c("no.conv","no.se")) {
.Deprecated(msg = c("\nThe reliabilityL2() function was deprecated in 2022 and ",
"will cease to be included in future versions of semTools",
". See help('semTools-deprecated) for details.",
"\n\nIt is replaced by the compRelSEM() function, which ",
"can estimate alpha and model-based reliability in an ",
"even wider variety of models and data types, with ",
"greater control in specifying the desired type of ",
"reliability coefficient (i.e., more explicitly choosing ",
"assumptions)."))
secondFactor <- as.character(secondFactor)[1] # only one at a time
ngroups <- lavInspect(object, "ngroups") #TODO: adapt to multiple levels
nLevels <- lavInspect(object, "nlevels")
nblocks <- ngroups*nLevels #FIXME: always true?
group.label <- if (ngroups > 1L) lavInspect(object, "group.label") else NULL
#FIXME? lavInspect(object, "level.labels")
clus.label <- if (nLevels > 1L) c("within", lavInspect(object, "cluster")) else NULL
if (nblocks > 1L) {
block.label <- paste(rep(group.label, each = nLevels), clus.label,
sep = if (ngroups > 1L && nLevels > 1L) "_" else "")
}
## parameters in GLIST format (not flat, need block-level list)
if (inherits(object, "lavaan")) {
param <- lavInspect(object, "est")
ve <- lavInspect(object, "cov.lv") # model-implied latent covariance matrix
S <- object@h1$implied$cov # observed sample covariance matrix (already a list)
if (nblocks == 1L) {
param <- list(param)
ve <- list(ve)
}
} else if (inherits(object, "lavaan.mi")) {
if (!"package:lavaan.mi" %in% search()) attachNamespace("lavaan.mi")
useImps <- rep(TRUE, length(object@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(object@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(object@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(object@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(object@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
param <- object@coefList[[ useImps[1] ]] # first admissible as template
coefList <- object@coefList[useImps]
phiList <- object@phiList[useImps]
## add block-level list per imputation?
if (nblocks == 1L) {
param <- list(param)
for (i in 1:m) {
coefList[[i]] <- list(coefList[[i]])
phiList[[i]] <- list(phiList[[i]])
}
}
S <- vector("list", nblocks) # pooled observed covariance matrix
ve <- vector("list", nblocks)
## loop over blocks
for (b in 1:nblocks) {
## param: loop over GLIST elements
for (mat in names(param[[b]])) {
matList <- lapply(coefList, function(i) i[[b]][[mat]])
param[[b]][[mat]] <- Reduce("+", matList) / length(matList)
} # mat
## pooled observed covariance matrix
covList <- lapply(object@h1List[useImps], function(i) i$implied$cov[[b]])
S[[b]] <- Reduce("+", covList) / m
## pooled model-implied latent covariance matrix
ve[[b]] <- Reduce("+", lapply(phiList, "[[", i = b) ) / m
} # b
}
if (nblocks == 1L) {
SigmaHat <- getMethod("fitted", class(object))(object)["cov"] # retain list format
} else {
SigmaHat <- sapply(getMethod("fitted", class(object))(object),
"[[", "cov", simplify = FALSE)
}
ly <- lapply(param, "[[", "lambda")
te <- lapply(param, "[[", "theta")
ps <- lapply(param, "[[", "psi")
be <- lapply(param, "[[", "beta")
result <- list()
for (i in 1:nblocks) {
# Prepare for higher-order reliability
l2var <- ve[[i]][secondFactor, secondFactor, drop = FALSE]
l2load <- be[[i]][,secondFactor]
indexl2 <- which(l2load != 0)
commonl2 <- (sum(l2load)^2) * l2var
errorl2 <- sum(ps[[i]][indexl2, indexl2, drop = FALSE])
# Prepare for lower-order reliability
indexl1 <- which(apply(ly[[i]][,indexl2, drop = FALSE], 1, function(x) sum(x != 0)) > 0)
l1load <- ly[[i]][,indexl2] %*% as.matrix(be[[i]][indexl2, secondFactor, drop = FALSE])
commonl1 <- (sum(l1load)^2) * l2var
errorl1 <- sum(te[[i]][indexl1, indexl1, drop = FALSE])
uniquel1 <- 0
for (j in seq_along(indexl2)) {
uniquel1 <- uniquel1 + (sum(ly[[i]][,indexl2[j]])^2) * ps[[i]][indexl2[j], indexl2[j], drop = FALSE]
}
# Adjustment for direct loading from L2 to observed variables
if (any(ly[[i]][,secondFactor] != 0)) {
indexind <- which(ly[[i]][,secondFactor] != 0)
if (length(intersect(indexind, indexl1)) > 0)
stop("Direct and indirect loadings of higher-order factor to observed",
" variables are specified at the same time.")
commonl2 <- sum(c(ly[[i]][,secondFactor], l2load))^2 * l2var
errorl2 <- errorl2 + sum(te[[i]][indexind, indexind, drop = FALSE])
commonl1 <- sum(c(ly[[i]][,secondFactor], l1load))^2 * l2var
errorl1 <- errorl1 + sum(te[[i]][indexind, indexind, drop = FALSE])
}
# Calculate Reliability
omegaL1 <- commonl1 / (commonl1 + uniquel1 + errorl1)
omegaL2 <- commonl2 / (commonl2 + errorl2)
partialOmegaL1 <- commonl1 / (commonl1 + errorl1)
result[[i]] <- c(omegaL1 = omegaL1, omegaL2 = omegaL2, partialOmegaL1 = partialOmegaL1)
}
if (nblocks == 1L) {
result <- result[[1]]
} else names(result) <- block.label
result
}
## --------------
## maximalRelia()
## --------------
##' Calculate maximal reliability
##'
##' Calculate maximal reliability of a scale
##'
##' Given that a composite score (\eqn{W}) is a weighted sum of item scores:
##'
##' \deqn{ W = \bold{w}^\prime \bold{x} ,}
##'
##' where \eqn{\bold{x}} is a \eqn{k \times 1} vector of the scores of each
##' item, \eqn{\bold{w}} is a \eqn{k \times 1} weight vector of each item, and
##' \eqn{k} represents the number of items. Then, maximal reliability is
##' obtained by finding \eqn{\bold{w}} such that reliability attains its maximum
##' (Li, 1997; Raykov, 2012). Note that the reliability can be obtained by
##'
##' \deqn{ \rho = \frac{\bold{w}^\prime \bold{S}_T \bold{w}}{\bold{w}^\prime
##' \bold{S}_X \bold{w}}}
##'
##' where \eqn{\bold{S}_T} is the covariance matrix explained by true scores and
##' \eqn{\bold{S}_X} is the observed covariance matrix. A numerical method is
##' used to find \eqn{\bold{w}} in this function.
##'
##' For continuous items, \eqn{\bold{S}_T} can be calculated by
##'
##' \deqn{ \bold{S}_T = \Lambda \Psi \Lambda^\prime,}
##'
##' where \eqn{\Lambda} is the factor loading matrix and \eqn{\Psi} is the
##' covariance matrix among factors. \eqn{\bold{S}_X} is directly obtained by
##' covariance among items.
##'
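##' For continuous items, a sketch of this numerical search, given
##' hypothetical true-score (`ST`) and observed (`SX`) covariance matrices
##' (this mirrors the `stats::nlminb()` optimization used internally):
##'
##' ```
##' negRelia <- function(w, ST, SX) 1 - (t(w) %*% ST %*% w) / (t(w) %*% SX %*% w)
##' opt <- nlminb(rep(1, nrow(ST)), negRelia, ST = ST, SX = SX)
##' 1 - opt$objective # maximal reliability
##' opt$par / mean(opt$par) # optimal weights, scaled to mean 1
##' ```
##'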
##' For categorical items, Green and Yang's (2009) method is used for
##' calculating \eqn{\bold{S}_T} and \eqn{\bold{S}_X}. Element \eqn{(i, j)} of
##' \eqn{\bold{S}_T} can be calculated by
##'
##' \deqn{ \left[\bold{S}_T\right]_{ij} = \sum^{C_i - 1}_{c_i = 1} \sum^{C_j -
##' 1}_{c_j = 1} \Phi_2\left( \tau_{x_{c_i}}, \tau_{x_{c_j}}, \left[ \Lambda
##' \Psi \Lambda^\prime \right]_{ij} \right) - \sum^{C_i - 1}_{c_i = 1}
##' \Phi_1(\tau_{x_{c_i}}) \sum^{C_j - 1}_{c_j = 1} \Phi_1(\tau_{x_{c_j}}),}
##'
##' where \eqn{C_i} and \eqn{C_j} represent the numbers of thresholds in Items
##' \eqn{i} and \eqn{j}, \eqn{\tau_{x_{c_i}}} represents threshold \eqn{c_i}
##' of Item \eqn{i}, \eqn{\tau_{x_{c_j}}} represents threshold \eqn{c_j} of
##' Item \eqn{j}, \eqn{ \Phi_1(\tau_{x_{c_i}})} is the cumulative probability of
##' \eqn{\tau_{x_{c_i}}} given a univariate standard normal cumulative
##' distribution, and \eqn{\Phi_2\left( \tau_{x_{c_i}}, \tau_{x_{c_j}}, \rho
##' \right)} is the joint cumulative probability of \eqn{\tau_{x_{c_i}}} and
##' \eqn{\tau_{x_{c_j}}} given a bivariate standard normal cumulative
##' distribution with a correlation of \eqn{\rho}.
##'
##' Each element of \eqn{\bold{S}_X} can be calculated by
##'
##' \deqn{ \left[\bold{S}_X\right]_{ij} = \sum^{C_i - 1}_{c_i = 1} \sum^{C_j -
##' 1}_{c_j = 1} \Phi_2\left( \tau_{V_{c_i}}, \tau_{V_{c_j}}, \rho^*_{ij}
##' \right) - \sum^{C_i - 1}_{c_i = 1} \Phi_1(\tau_{V_{c_i}}) \sum^{C_j -
##' 1}_{c_j = 1} \Phi_1(\tau_{V_{c_j}}),}
##'
##' where \eqn{\rho^*_{ij}} is a polychoric correlation between Items \eqn{i}
##' and \eqn{j}.
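##'
##' As a sketch of one such element, suppose hypothetical threshold vectors
##' `t1` and `t2` for Items \eqn{i} and \eqn{j}, and their model-implied
##' correlation `rho` (using `mnormt::pmnorm()` for \eqn{\Phi_2}, as this
##' package does internally):
##'
##' ```
##' Phi2 <- function(a, b, r) # bivariate standard-normal CDF
##'   mnormt::pmnorm(c(a, b), c(0, 0), matrix(c(1, r, r, 1), 2, 2))
##' ST.ij <- 0
##' for (c1 in seq_along(t1)) for (c2 in seq_along(t2))
##'   ST.ij <- ST.ij + Phi2(t1[c1], t2[c2], rho)
##' ST.ij - sum(pnorm(t1)) * sum(pnorm(t2)) # one element of S_T
##' ```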
##'
##'
##' @importFrom lavaan lavInspect lavNames
##'
##' @param object A [lavaan::lavaan-class] or [lavaan.mi::lavaan.mi-class] object,
##' expected to contain only exogenous common factors (i.e., a CFA model).
##' @param omit.imps `character` vector specifying criteria for omitting
##' imputations from pooled results. Can include any of
##' `c("no.conv", "no.se", "no.npd")`, the first 2 of which are the
##' default setting, which excludes any imputations that did not
##' converge or for which standard errors could not be computed. The
##' last option (`"no.npd"`) would exclude any imputations which
##' yielded a nonpositive definite covariance matrix for observed or
##' latent variables, which would include any "improper solutions" such
##' as Heywood cases. NPD solutions are not excluded by default because
##' they are likely to occur due to sampling error, especially in small
##' samples. However, gross model misspecification could also cause
##' NPD solutions, so users can compare pooled results with and without
##' this setting as a sensitivity analysis to see whether some
##' imputations warrant further investigation.
##'
##' @return Maximal reliability values of each group. The maximal-reliability
##' weights are also provided. Users may extract the weights using the
##' `attr` function (see example below).
##'
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' @seealso [reliability()] for reliability of an unweighted
##' composite score
##'
##' @references
##' Li, H. (1997). A unifying expression for the maximal reliability of a linear
##' composite. *Psychometrika, 62*(2), 245--249. \doi{10.1007/BF02295278}
##'
##' Raykov, T. (2012). Scale construction and development using structural
##' equation modeling. In R. H. Hoyle (Ed.), *Handbook of structural
##' equation modeling* (pp. 472--494). New York, NY: Guilford.
##'
##' @examples
##'
##' total <- 'f =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 '
##' fit <- cfa(total, data = HolzingerSwineford1939)
##' maximalRelia(fit)
##'
##' # Extract the weight
##' mr <- maximalRelia(fit)
##' attr(mr, "weight")
##'
##' @export
maximalRelia <- function(object, omit.imps = c("no.conv","no.se")) {
ngroups <- lavInspect(object, "ngroups") #TODO: adapt to multiple levels
nLevels <- lavInspect(object, "nlevels")
nblocks <- ngroups*nLevels #FIXME: always true?
group.label <- if (ngroups > 1L) lavInspect(object, "group.label") else NULL
#FIXME? lavInspect(object, "level.labels")
clus.label <- if (nLevels > 1L) c("within", lavInspect(object, "cluster")) else NULL
if (nblocks > 1L) {
block.label <- paste(rep(group.label, each = nLevels), clus.label,
sep = if (ngroups > 1L && nLevels > 1L) "_" else "")
}
## parameters in GLIST format (not flat, need block-level list)
if (inherits(object, "lavaan")) {
param <- lavInspect(object, "est")
ve <- lavInspect(object, "cov.lv") # model-implied latent covariance matrix
S <- object@h1$implied$cov # observed sample covariance matrix (already a list)
if (nblocks == 1L) {
param <- list(param)
ve <- list(ve)
}
} else if (inherits(object, "lavaan.mi")) {
if (!"package:lavaan.mi" %in% search()) attachNamespace("lavaan.mi")
useImps <- rep(TRUE, length(object@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(object@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(object@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(object@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(object@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
param <- object@coefList[[ useImps[1] ]] # first admissible as template
coefList <- object@coefList[useImps]
phiList <- object@phiList[useImps]
## add block-level list per imputation?
if (nblocks == 1L) {
param <- list(param)
for (i in 1:m) {
coefList[[i]] <- list(coefList[[i]])
phiList[[i]] <- list(phiList[[i]])
}
}
S <- vector("list", nblocks) # pooled observed covariance matrix
ve <- vector("list", nblocks)
## loop over blocks
for (b in 1:nblocks) {
## param: loop over GLIST elements
for (mat in names(param[[b]])) {
matList <- lapply(coefList, function(i) i[[b]][[mat]])
param[[b]][[mat]] <- Reduce("+", matList) / length(matList)
} # mat
## pooled observed covariance matrix
covList <- lapply(object@h1List[useImps], function(i) i$implied$cov[[b]])
S[[b]] <- Reduce("+", covList) / m
## pooled model-implied latent covariance matrix
ve[[b]] <- Reduce("+", lapply(phiList, "[[", i = b) ) / m
} # b
}
if (nblocks == 1L) {
SigmaHat <- getMethod("fitted", class(object))(object)["cov"] # retain list format
} else {
SigmaHat <- sapply(getMethod("fitted", class(object))(object),
"[[", "cov", simplify = FALSE)
}
ly <- lapply(param, "[[", "lambda")
te <- lapply(param, "[[", "theta")
categorical <- lavInspect(object, "categorical")
threshold <- if (categorical) getThreshold(object, omit.imps = omit.imps) else NULL
result <- list()
for (i in 1:nblocks) {
truevar <- ly[[i]] %*% ve[[i]] %*% t(ly[[i]])
varnames <- colnames(truevar)
if (categorical) {
invstdvar <- 1 / sqrt(diag(SigmaHat[[i]]))
polyr <- diag(invstdvar) %*% truevar %*% diag(invstdvar)
nitem <- ncol(SigmaHat[[i]])
result[[i]] <- calcMaximalReliaCat(polyr, threshold[[i]], S[[i]], nitem, varnames)
} else {
result[[i]] <- calcMaximalRelia(truevar, S[[i]], varnames)
}
}
if (nblocks == 1L) {
result <- result[[1]]
} else names(result) <- block.label
result
}
## ----------------
## Hidden Functions
## ----------------
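## coefficient alpha computed from a covariance matrix S among composite items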
computeAlpha <- function(S) {
k <- nrow(S)
k/(k - 1) * (1.0 - sum(diag(S)) / sum(S))
}
#' @importFrom stats cov2cor pnorm
omegaCat <- function(truevar, threshold, scales, denom) {
## must be in standardized latent scale
R <- diag(scales) %*% truevar %*% diag(scales)
## denom could be model-implied polychoric correlation assuming diagonal theta,
## model-implied polychoric correlation accounting for error covariances,
## or "observed" polychoric correlation matrix.
## If parameterization="theta", standardize the polychoric coVARIANCE matrix
denom <- cov2cor(denom)
nitem <- ncol(denom)
## initialize sums of cumulative probabilities
sumnum <- 0 # numerator
addden <- 0 # denominator
## loop over all pairs of items
for (j in 1:nitem) {
for (jp in 1:nitem) {
## initialize sums of cumulative probabilities *per item*
sumprobn2 <- 0
addprobn2 <- 0
## for each pair of items, loop over all their thresholds
t1 <- threshold[[j]] * scales[j] # on standardized latent scale
t2 <- threshold[[jp]] * scales[jp] #FIXME: subtract intercept (or marginal mean?)
for (c in 1:length(t1)) {
for (cp in 1:length(t2)) {
sumprobn2 <- sumprobn2 + p2(t1[c], t2[cp], R[j, jp])
addprobn2 <- addprobn2 + p2(t1[c], t2[cp], denom[j, jp])
}
}
sumprobn1 <- sum(pnorm(t1))
sumprobn1p <- sum(pnorm(t2))
sumnum <- sumnum + (sumprobn2 - sumprobn1 * sumprobn1p)
addden <- addden + (addprobn2 - sumprobn1 * sumprobn1p)
}
}
reliab <- sumnum / addden
reliab
}
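## bivariate standard-normal CDF: Pr(X1 <= t1, X2 <= t2) given correlation r,
## used above to accumulate joint response-category probabilities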
p2 <- function(t1, t2, r) {
mnormt::pmnorm(c(t1, t2), c(0,0), matrix(c(1, r, r, 1), 2, 2))
}
# polycorLavaan <- function(object) {
# ngroups <- lavInspect(object, "ngroups")
# coef <- lavInspect(object, "est")
# targettaunames <- NULL
# if (ngroups == 1L) {
# targettaunames <- rownames(coef$tau)
# } else {
# targettaunames <- rownames(coef[[1]]$tau)
# }
# barpos <- sapply(strsplit(targettaunames, ""), function(x) which(x == "|"))
# varnames <- unique(apply(data.frame(targettaunames, barpos - 1), MARGIN = 1,
# FUN = function(x) substr(x[1], 1, x[2])))
# if (length(varnames))
# script <- ""
# for (i in 2:length(varnames)) {
# temp <- paste0(varnames[1:(i - 1)], collapse = " + ")
# temp <- paste0(varnames[i], "~~", temp, "\n")
# script <- paste(script, temp)
# }
# newobject <- refit(script, object)
# if (ngroups == 1L) {
# return(lavInspect(newobject, "est")$theta)
# }
# lapply(lavInspect(newobject, "est"), "[[", "theta")
# }
##' @importFrom lavaan lavInspect lavNames
getThreshold <- function(object, omit.imps = c("no.conv","no.se")) {
ngroups <- lavInspect(object, "ngroups") #TODO: add nlevels when capable
ordnames <- lavNames(object, "ov.ord")
if (inherits(object, "lavaan")) {
EST <- lavInspect(object, "est")
} else if (inherits(object, "lavaan.mi")) {
if (!"package:lavaan.mi" %in% search()) attachNamespace("lavaan.mi")
useImps <- rep(TRUE, length(object@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(object@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(object@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(object@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(object@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
EST <- object@coefList[useImps]
}
if (ngroups == 1L) {
if (inherits(object, "lavaan")) {
thresholds <- EST$tau[,"threshold"]
} else if (inherits(object, "lavaan.mi")) {
tauList <- lapply(EST, function(x) x$tau[,"threshold"])
thresholds <- Reduce("+", tauList) / length(tauList)
}
result <- lapply(ordnames,
function(nn) thresholds[grepl(paste0(nn, "\\|"), names(thresholds))])
names(result) <- ordnames
## needs to be within a list when called above within block-loops
result <- list(result)
} else {
thresholds <- vector("list", ngroups)
for (g in 1:ngroups) {
if (inherits(object, "lavaan")) {
thresholds[[g]] <- EST[[g]]$tau[,"threshold"]
} else if (inherits(object, "lavaan.mi")) {
tauList <- lapply(EST, function(x) x[[g]]$tau[,"threshold"])
thresholds[[g]] <- Reduce("+", tauList) / length(tauList)
}
}
result <- list()
group.label <- lavInspect(object, "group.label")
for (g in 1:ngroups) {
result[[ group.label[g] ]] <- lapply(ordnames, function(nn) {
thresholds[[g]][ grepl(paste0(nn, "\\|"), names(thresholds[[g]])) ]
})
names(result[[ group.label[g] ]]) <- ordnames
}
}
return(result)
}
##' @importFrom lavaan lavInspect lavNames
getScales <- function(object, omit.imps = c("no.conv","no.se")) {
ngroups <- lavInspect(object, "ngroups") #TODO: add nlevels when capable
ordnames <- lavNames(object, "ov.ord") #TODO: use to allow mix of cat/con vars
if (inherits(object, "lavaan")) {
EST <- lavInspect(object, "est")
} else if (inherits(object, "lavaan.mi")) {
if (!"package:lavaan.mi" %in% search()) attachNamespace("lavaan.mi")
useImps <- rep(TRUE, length(object@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(object@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(object@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(object@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(object@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
EST <- object@coefList[useImps]
}
if (ngroups == 1L) {
if (inherits(object, "lavaan")) {
result <- list(EST$delta[,"scales"])
} else if (inherits(object, "lavaan.mi")) {
scales <- lapply(EST, function(x) x$delta[,"scales"])
result <- list(Reduce("+", scales) / length(scales))
}
} else {
result <- vector("list", ngroups)
for (g in 1:ngroups) {
if (inherits(object, "lavaan")) {
result[[g]] <- EST[[g]]$delta[,"scales"]
} else if (inherits(object, "lavaan.mi")) {
scales <- lapply(EST, function(x) x[[g]]$delta[,"scales"])
result[[g]] <- Reduce("+", scales) / length(scales)
}
}
}
return(result)
}
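## objective minimized by nlminb() in calcMaximalRelia(): returns 1 minus the
## reliability of the composite weighted by w, so minimizing it maximizes reliability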
invGeneralRelia <- function(w, truevar, totalvar) {
1 - (t(w) %*% truevar %*% w) / (t(w) %*% totalvar %*% w)
}
#' @importFrom stats pnorm
invGeneralReliaCat <- function(w, polyr, threshold, denom, nitem) {
# denom could be polychoric correlation, model-implied correlation, or model-implied without error correlation
upper <- matrix(NA, nitem, nitem)
lower <- matrix(NA, nitem, nitem)
for (j in 1:nitem) {
for (jp in 1:nitem) {
sumprobn2 <- 0
addprobn2 <- 0
t1 <- threshold[[j]]
t2 <- threshold[[jp]]
for (c in 1:length(t1)) {
for (cp in 1:length(t2)) {
sumprobn2 <- sumprobn2 + p2(t1[c], t2[cp], polyr[j, jp])
addprobn2 <- addprobn2 + p2(t1[c], t2[cp], denom[j, jp])
}
}
sumprobn1 <- sum(pnorm(t1))
sumprobn1p <- sum(pnorm(t2))
upper[j, jp] <- (sumprobn2 - sumprobn1 * sumprobn1p)
lower[j, jp] <- (addprobn2 - sumprobn1 * sumprobn1p)
}
}
1 - (t(w) %*% upper %*% w) / (t(w) %*% lower %*% w)
}
#' @importFrom stats nlminb
calcMaximalRelia <- function(truevar, totalvar, varnames) {
start <- rep(1, nrow(truevar))
out <- nlminb(start, invGeneralRelia, truevar = truevar, totalvar = totalvar)
if (out$convergence != 0) stop("The numerical method for finding the maximal",
" reliability did not converge.")
result <- 1 - out$objective
weight <- out$par / mean(out$par)
names(weight) <- varnames
attr(result, "weight") <- weight
result
}
#' @importFrom stats nlminb
calcMaximalReliaCat <- function(polyr, threshold, denom, nitem, varnames) {
start <- rep(1, nrow(polyr))
out <- nlminb(start, invGeneralReliaCat, polyr = polyr, threshold = threshold,
denom = denom, nitem = nitem)
if (out$convergence != 0) stop("The numerical method for finding the maximal",
" reliability did not converge.")
result <- 1 - out$objective
weight <- out$par / mean(out$par)
names(weight) <- varnames
attr(result, "weight") <- weight
result
}
semTools/R/clipboard.R 0000644 0001762 0000144 00000032444 14764264223 014363 0 ustar ligges users ### Sunthud Pornprasertmanit & Terrence D. Jorgensen
### Last updated: 12 March 2025
### Copy or save each aspect of the lavaan object into a clipboard or a file
##' Copy or save the result of `lavaan` or `FitDiff` objects into a
##' clipboard or a file
##'
##' Copy or save the result of `lavaan` or [FitDiff-class]
##' object into a clipboard or a file. From the clipboard, users may paste the
##' result into Microsoft Excel or another spreadsheet application to create a
##' table of the output.
##'
##'
##' @aliases clipboard saveFile
##'
##' @param object An object of class [lavaan::lavaan-class] or
##' [FitDiff-class].
##' @param what The attributes of the `lavaan` object to be copied in the
##' clipboard. `"summary"` is to copy the screen provided from the
##' `summary` function. `"mifit"` is to copy the result from the
##' [miPowerFit()] function. Other attributes listed in the
##' `inspect` method in the [lavaan::lavaan-class] could also be
##' used, such as `"coef"`, `"se"`, `"fit"`, `"samp"`, and
##' so on. Ignored for [FitDiff-class]-class objects.
##' @param file A file name used for saving the result.
##' @param tableFormat If `TRUE`, save the result in table format, using
##' tabs for separation. Otherwise, save the result as it is printed
##' in the R console.
##' @param fit.measures `character` vector specifying names of fit measures
##' returned by [lavaan::fitMeasures()] to be copied/saved. Only
##' relevant if `object` is class [FitDiff-class].
##' @param writeArgs `list` of additional arguments to be passed to
##' [utils::write.table()]
##' @param \dots Additional arguments when passing a `lavaan` object to the
##' `summary` or [miPowerFit()] function.
##'
##' @return The resulting output will be saved into a clipboard or a file. If
##' using the `clipboard` function, users may paste it in the other
##' applications.
##'
##' @author
##' Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @examples
##'
##' library(lavaan)
##' HW.model <- ' visual =~ x1 + c1*x2 + x3
##' textual =~ x4 + c1*x5 + x6
##' speed =~ x7 + x8 + x9 '
##'
##' fit <- cfa(HW.model, data = HolzingerSwineford1939, group = "school")
##'
##' if(interactive()){
##' # Copy the summary of the lavaan object
##' clipboard(fit)
##'
##' # pass additional arguments to summary() method for class?lavaan
##' clipboard(fit, rsquare = TRUE, standardized = TRUE, fit.measures = TRUE)
##'
##' # Copy modification indices and fit stats from the miPowerFit() function
##' clipboard(fit, "mifit")
##'
##' # Copy the parameter estimates
##' clipboard(fit, "coef")
##'
##' # Copy the standard errors
##' clipboard(fit, "se")
##'
##' # Copy the sample statistics
##' clipboard(fit, "samp")
##'
##' # Copy the fit measures
##' clipboard(fit, "fit")
##'
##' # Save the summary of the lavaan object
##' saveFile(fit, "out.txt")
##'
##' # Save modification indices and fit stats from the miPowerFit() function
##' saveFile(fit, "out.txt", "mifit")
##'
##' # Save the parameter estimates
##' saveFile(fit, "out.txt", "coef")
##'
##' # Save the standard errors
##' saveFile(fit, "out.txt", "se")
##'
##' # Save the sample statistics
##' saveFile(fit, "out.txt", "samp")
##'
##' # Save the fit measures
##' saveFile(fit, "out.txt", "fit")
##' }
##'
##' @export
clipboard <- function(object, what = "summary", ...) {
if (.Platform$OS.type == "windows") {
saveFile(object, file = "clipboard-128", what = what, tableFormat = TRUE, ...)
cat("File saved in the clipboard; please paste it in any program you wish.\n")
} else {
## Mac OS?
if (system("pbcopy -help", ignore.stderr = TRUE) == 0) {
CON <- pipe("pbcopy", "w")
on.exit(close(CON))
saveFile(object, file = CON, what = what, tableFormat = TRUE, ...)
cat("File saved in the clipboard; please paste it in any program you wish. If you cannot paste it, it is okay because this function works for some computers, which I still have no explanation currently. Please consider using the 'saveFile' function instead.\n")
} else if (system("xclip -version", ignore.stderr = TRUE) == 0) {
## Linux OS?
CON <- pipe("xclip -i", "w")
on.exit(close(CON))
saveFile(object, file = CON, what = what, tableFormat = TRUE, ...)
cat("File saved in the xclip; please paste it in any program you wish. If you cannot paste it, it is okay because this function works for some computers, which I still have no explanation currently. Please consider using the 'saveFile' function instead.\n")
} else {
stop("For Mac users, the 'pbcopy' command in the shell file does not work. For linux users, this function depends on the 'xclip' application. Please install and run the xclip application before using this function in R (it does not guarantee to work though). Alternatively, use the 'saveFile' function to write the output into a file.")
}
}
}
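## A minimal standalone sketch of the pipe mechanism used above, assuming
## 'pbcopy' (macOS) or 'xclip' (Linux) is installed; illustrative only, not
## part of the package API:
# toClipboard <- function(x) {
#   cmd <- if (Sys.info()[["sysname"]] == "Darwin") "pbcopy" else "xclip -i"
#   con <- pipe(cmd, "w")  # open a write connection to the clipboard utility
#   on.exit(close(con))    # closing the pipe flushes the text to the clipboard
#   writeLines(as.character(x), con)
# }
# toClipboard("hello, clipboard")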
##' @rdname clipboard
##' @export
saveFile <- function(object, file, what = "summary", tableFormat = FALSE,
fit.measures = "default", writeArgs = list(), ...) {
# Check whether the object is in the lavaan class
if (is(object, "lavaan")) {
saveFileLavaan(object, file, what = what, tableFormat = tableFormat,
writeArgs = writeArgs, ...)
} else if (is(object, "FitDiff")) {
saveFileFitDiff(object, file, what = what, tableFormat = tableFormat,
fit.measures = fit.measures, writeArgs = writeArgs)
} else {
stop("The object must be a class?lavaan object or the",
" output from the compareFit() function.")
}
}
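## Usage sketch (hypothetical 'fit' object, not run): any argument of
## utils::write.table() can be customized through 'writeArgs', e.g. a
## different field separator or decimal mark:
# saveFile(fit, "fits.txt", what = "fit", tableFormat = TRUE,
#          writeArgs = list(sep = ";", dec = ","))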
## ----------------
## Hidden functions
## ----------------
##' @importFrom lavaan lavInspect
saveFileLavaan <- function(object, file, what = "summary", tableFormat = FALSE,
writeArgs = list(), ...) {
if (length(what) > 1) message("only the first `what' option is used")
# be case insensitive
what <- tolower(what[1])
writeArgs$file <- file
if (is.null(writeArgs$sep)) writeArgs$sep <- "\t"
if (is.null(writeArgs$quote)) writeArgs$quote <- FALSE
if (what == "summary") {
if (tableFormat) {
writeArgs <- copySummary(object, file = file, writeArgs = writeArgs, ...)
} else {
writeArgs$x <- paste(utils::capture.output(summary(object, ...)),
collapse = "\n")
}
} else if (what == "mifit") {
if (tableFormat) {
writeArgs$x <- miPowerFit(object, ...)
if (is.null(writeArgs$row.names)) writeArgs$row.names <- FALSE
if (is.null(writeArgs$col.names)) writeArgs$col.names <- TRUE
} else {
writeArgs$x <- paste(utils::capture.output(miPowerFit(object, ...)),
collapse = "\n")
}
} else {
target <- lavInspect(object, what=what)
if (tableFormat) {
if (is(target, "lavaan.data.frame") || is(target, "data.frame")) {
writeArgs$x <- target
if (is.null(writeArgs$row.names)) writeArgs$row.names <- FALSE
if (is.null(writeArgs$col.names)) writeArgs$col.names <- TRUE
} else if (is(target, "list")) {
if (is(target[[1]], "list")) {
target <- lapply(target, listToDataFrame)
target <- mapply(function(x, y) rbind(rep("", ncol(y)), c(x, rep("", ncol(y) - 1)), y),
names(target), target, SIMPLIFY = FALSE)
writeArgs$x <- do.call(rbind, target)
if (is.null(writeArgs$row.names)) writeArgs$row.names <- FALSE
if (is.null(writeArgs$col.names)) writeArgs$col.names <- FALSE
} else {
writeArgs$x <- listToDataFrame(target)
if (is.null(writeArgs$row.names)) writeArgs$row.names <- FALSE
if (is.null(writeArgs$col.names)) writeArgs$col.names <- FALSE
}
} else {
writeArgs$x <- target
if (is.null(writeArgs$row.names)) writeArgs$row.names <- TRUE
if (is.null(writeArgs$col.names)) writeArgs$col.names <- TRUE
}
} else {
writeArgs$x <- paste(utils::capture.output(target), collapse = "\n")
}
}
do.call("write.table", writeArgs)
}
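## The function above fills user-supplied 'writeArgs' with defaults only when
## they are absent, then dispatches once via do.call(). A minimal sketch of
## the same idiom (hypothetical name 'writeWithDefaults'):
# writeWithDefaults <- function(x, file, writeArgs = list()) {
#   writeArgs$x <- x
#   writeArgs$file <- file
#   if (is.null(writeArgs$sep))   writeArgs$sep   <- "\t"
#   if (is.null(writeArgs$quote)) writeArgs$quote <- FALSE
#   do.call(utils::write.table, writeArgs)
# }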
## copySummary: copy the summary of a lavaan object to the clipboard in a
## tab-separated layout that pastes cleanly into a spreadsheet such as Excel
## object = lavaan object input
copySummary <- function(object, file, writeArgs = list(), ...) {
# Capture the output of the lavaan class
outputText <- utils::capture.output(lavaan::summary(object, ...))
# Split the text by two spaces
outputText <- strsplit(outputText, " ")
# Trim and delete the "" elements
outputText <- lapply(outputText, function(x) x[x != ""])
outputText <- lapply(outputText, trim)
outputText <- lapply(outputText, function(x) x[x != ""])
# Group the output into three sections: fit, parameter estimates, and r-squared
cut1 <- grep("Estimate", outputText)[1]
cut2 <- grep("R-Square", outputText)[1]
if (is.na(cut2)) {
## no R-squared output requested, so set3 will contain only the final line
cut2 <- length(outputText)
}
set1 <- outputText[1:(cut1 - 1)]
set2 <- outputText[cut1:(cut2 - 1)]
set3 <- outputText[cut2:length(outputText)]
# Assign the number of columns in the resulting data frame and check whether the output contains any labels
numcol <- 7
test <- set2[-grep("Estimate", set2)]
test <- test[sapply(test, length) >= 2]
if (any(sapply(test, function(x) is.na(suppressWarnings(as.numeric(x[2])))))) numcol <- numcol + 1
# A function to parse the fit-measures output
set1Parse <- function(x, numcol) {
if (length(x) == 0) {
return(rep("", numcol))
} else if (length(x) == 1) {
return(c(x, rep("", numcol - 1)))
} else if ((length(x) >= 2) & (length(x) <= numcol)) {
return(c(x[1], rep("", numcol - length(x)), x[2:length(x)]))
} else {
stop("Cannot parse text")
}
}
set1 <- t(sapply(set1, set1Parse, numcol))
# A function to parse the parameter-estimates output
set2Parse <- function(x, numcol) {
if (length(x) == 0) return(rep("", numcol))
if (any(grepl("Estimate", x))) return(c(rep("", numcol-length(x)), x))
if (length(x) == 1) {
return(c(x, rep("", numcol-1)))
} else {
group1 <- x[1]
group2 <- x[2:length(x)]
if (is.na(suppressWarnings(as.numeric(x[2])))) {
group1 <- x[1:2]
group2 <- x[3:length(x)]
} else if (numcol == 8) {
group1 <- c(group1, "")
}
if (length(group2) %in% c(1, 4)) {
group2 <- c(group2, rep("", 6 - length(group2)))
} else {
group2 <- c(group2[1], rep("", 6 - length(group2)), group2[2:length(group2)])
}
return(c(group1, group2))
}
}
set2 <- t(sapply(set2, set2Parse, numcol))
# A function to parse the r-squared output
set3Parse <- function(x, numcol) {
if (length(x) == 0) {
return(rep("", numcol))
} else {
return(c(x, rep("", numcol - length(x))))
}
}
set3 <- t(sapply(set3, set3Parse, numcol))
# Copy the output into the clipboard
writeArgs$x <- rbind(set1, set2, set3)
writeArgs$file <- file
if (is.null(writeArgs$quote)) writeArgs$quote <- FALSE
if (is.null(writeArgs$sep)) writeArgs$sep <- "\t"
if (is.null(writeArgs$row.names)) writeArgs$row.names <- FALSE
if (is.null(writeArgs$col.names)) writeArgs$col.names <- FALSE
# do.call("write.table", writeArgs)
writeArgs
}
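## A minimal sketch of the parsing idea above (hypothetical 'fit' object,
## not run): capture the printed summary, tokenize each line on runs of
## spaces, and drop empty tokens before padding into fixed-width rows.
# out <- utils::capture.output(summary(fit))
# tokens <- lapply(strsplit(out, "  "), function(x) trim(x[x != ""]))
# head(tokens)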
## trim function from the R.oo package
trim <- function(object) {
s <- sub("^[\t\n\f\r ]*", "", as.character(object));
s <- sub("[\t\n\f\r ]*$", "", s);
s;
}
## listToDataFrame: Change a list with multiple elements into a single data.frame
listToDataFrame <- function(object) {
name <- names(object)
# Count the maximum number of columns (+1 is for the row-name column)
numcol <- max(sapply(object, function(x) {
  if (is(x, "lavaan.matrix") || is(x, "lavaan.matrix.symmetric") ||
      is(x, "matrix") || is(x, "data.frame")) ncol(x) else 1
})) + 1
# Change all objects in the list into a data.frame with the specified column
target <- lapply(object, niceDataFrame, numcol)
# Paste the name of each object into each data.frame
target <- mapply(function(x, y) rbind(rep("", ncol(y)), c(x, rep("", ncol(y) - 1)), y), name, target, SIMPLIFY=FALSE)
# Combine into a single data.frame
target <- do.call(rbind, target)
target[-1,]
}
## niceDataFrame: Change an object into a data.frame with a specified number
## of columns, with the row and column names included in the data.frame
niceDataFrame <- function(object, numcol) {
temp <- NULL
if (is(object, "lavaan.matrix.symmetric")) {
# save only the lower diagonal of the symmetric matrix
temp <- matrix("", nrow(object), ncol(object))
for (i in 1:nrow(object)) {
temp[i, 1:i] <- object[i, 1:i]
}
} else if (is(object, "data.frame") || is(object, "matrix") || is(object, "lavaan.matrix")) {
# copy the matrix
temp <- object
} else if (is(object, "vector") || is(object, "lavaan.vector")) {
# transform a vector into a matrix
object <- as.matrix(object)
temp <- object
} else {
stop("The 'niceDataFrame' function has a bug. Please contact the developer.")
}
# Transform into the result with a specified number of columns, excluding the row name
result <- matrix("", nrow(temp), numcol - 1)
result[, 1:ncol(temp)] <- temp
# Prepend the column names (or blanks when none exist)
firstRow <- colnames(object)
if (is.null(firstRow)) {
  firstRow <- rep("", ncol(result))
} else {
  firstRow <- c(firstRow, rep("", numcol - length(firstRow) - 1))
}
result <- rbind(firstRow, result)
# Prepend the row names (or blanks when none exist)
firstCol <- rownames(object)
if (is.null(firstCol)) {
  firstCol <- rep("", nrow(result))
} else {
  firstCol <- c("", firstCol)
}
result <- cbind(firstCol, result)
dimnames(result) <- NULL
result
}
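## Illustrative sketch (not run): a symmetric lavaan matrix is reduced to its
## lower triangle before being framed by its row and column names.
# m <- matrix(c(1, .5, .5, 1), 2, 2, dimnames = list(c("a", "b"), c("a", "b")))
# class(m) <- c("lavaan.matrix.symmetric", "matrix")
# niceDataFrame(m, numcol = 3)   # 3 columns: row names + 2 data columns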
semTools/R/partialInvariance.R 0000644 0001762 0000144 00000261520 14764267611 016063 0 ustar ligges users
### Sunthud Pornprasertmanit
### Last updated: 12 March 2025
##' Partial Measurement Invariance Testing Across Groups
##'
##' This function provides partial invariance testing by (a) freeing parameters
##' one at a time from the nested model and comparing each result with the
##' original nested model, or (b) fixing (or constraining) parameters one at a
##' time in the parent model and comparing each result with the original parent
##' model. This function only works with congeneric models.
##' `partialInvariance` is used for continuous variables;
##' `partialInvarianceCat` is used for categorical variables.
##'
##' There are four types of partial invariance testing:
##'
##' \itemize{
##' \item Partial weak invariance. The model named 'fit.configural'
##' from the list of models is compared with the model named 'fit.loadings'.
##' Each loading will be freed or fixed from the metric and configural
##' invariance models respectively. The modified models are compared with the
##' original model. Note that the objects in the list of models must have the
##' names of "fit.configural" and "fit.loadings". Users may use "metric",
##' "weak", "loading", or "loadings" in the `type` argument. Note that, for
##' testing invariance on marker variables, other variables will be assigned as
##' marker variables automatically.
##'
##' \item Partial strong invariance. The model
##' named 'fit.loadings' from the list of models is compared with the model
##' named either 'fit.intercepts' or 'fit.thresholds'. Each intercept will be
##' freed or fixed from the scalar and metric invariance models respectively.
##' The modified models are compared with the original model. Note that the
##' objects in the list of models must have the names of "fit.loadings" and
##' either "fit.intercepts" or "fit.thresholds". Users may use "scalar",
##' "strong", "intercept", "intercepts", "threshold", or "thresholds" in the
##' `type` argument. Note that, for testing invariance on marker variables,
##' other variables will be assigned as marker variables automatically. Note
##' that if all variables are dichotomous, scalar invariance testing is not
##' available.
##'
##' \item Partial strict invariance. The model named either
##' 'fit.intercepts' or 'fit.thresholds' (or 'fit.loadings') from the list of
##' models is compared with the model named 'fit.residuals'. Each residual
##' variance will be freed or fixed from the strict and scalar (or metric)
##' invariance models respectively. The modified models are compared with the
##' original model. Note that the objects in the list of models must have the
##' names of "fit.residuals" and either "fit.intercepts", "fit.thresholds", or
##' "fit.loadings". Users may use "strict", "residual", "residuals", "error", or
##' "errors" in the `type` argument.
##'
##' \item Partial mean invariance. The
##' model named either 'fit.intercepts' or 'fit.thresholds' (or 'fit.residuals'
##' or 'fit.loadings') from the list of models is compared with the model named
##' 'fit.means'. Each factor mean will be freed or fixed from the means and
##' scalar (or strict or metric) invariance models respectively. The modified
##' models are compared with the original model. Note that the objects in the
##' list of models must have the names of "fit.means" and either
##' "fit.residuals", "fit.intercepts", "fit.thresholds", or "fit.loadings".
##' Users may use "means" or "mean" in the `type` argument. }
##'
##' Three types of comparisons are used in this function:
##' \enumerate{
##' \item `free`: The nested model is used as a template. Then, one
##' parameter indicating a difference between the two models is freed. The new
##' model is compared with the nested model. This process is repeated for all
##' differences between the two models. The likelihood-ratio test and the
##' difference in CFI are provided.
##'
##' \item `fix`: The parent model is used as a template. Then, one parameter
##' indicating a difference between the two models is fixed or constrained to
##' be equal to other parameters. The new model is then compared with the
##' parent model. This process is repeated for all differences between the two
##' models. The likelihood-ratio test and the difference in CFI are provided.
##'
##' \item `wald`: This method is similar to the `fix` method. However,
##' instead of building a new model and comparing it with a likelihood-ratio
##' test, a multivariate Wald test is used to test equality between parameter
##' estimates. See [lavaan::lavTestWald()] for further details. Note
##' that if any row of the contrast matrix does not sum to 0, the Wald test is
##' not provided, such as comparing two means where one of the means is fixed
##' to 0. This test statistic is not as accurate as the likelihood-ratio test
##' provided by `fix`; it is provided here in case the likelihood-ratio test
##' fails to converge.
##' }
##'
##' Note that this function does not adjust for the inflated Type I error rate
##' from multiple tests. The degrees of freedom for all tests equal the number
##' of groups minus 1.
##'
##' The details of the standardized estimates and the effect sizes used for
##' each parameter are provided in the vignette by running
##' `vignette("partialInvariance")`.
##'
##' @importFrom lavaan lavInspect parTable
##' @aliases partialInvariance partialInvarianceCat
##'
##' @param fit A list of models for invariance testing. Each model should be
##' assigned by appropriate names (see details). The result from
##' [measurementInvariance()] or
##' [measurementInvarianceCat()] could be used in this argument
##' directly.
##' @param type The types of invariance testing: "metric", "scalar", "strict",
##' or "means"
##' @param free A vector of variable names that are free across groups in
##' advance. If partial mean invariance is tested, this argument represents a
##' vector of factor names that are free across groups.
##' @param fix A vector of variable names that are constrained to be equal
##' across groups in advance. If partial mean invariance is tested, this
##' argument represents a vector of factor names that are fixed across groups.
##' @param refgroup The reference group used to make the effect size comparison
##' with the other groups.
##' @param poolvar If `TRUE`, the variances are pooled across groups for
##' standardization. Otherwise, the variances of the reference group are used
##' for standardization.
##' @param p.adjust The method used to adjust p values. See
##' [stats::p.adjust()] for the options for adjusting p values. The
##' default is to not use any corrections.
##' @param fbound The z-score bound of the factor used to calculate the effect
##' size of the loading difference proposed by Millsap and Olivera-Aguilar
##' (2012).
##' @param return.fit Return the submodels fitted by this function
##' @param method The method used to calculate likelihood ratio test. See
##' [lavaan::lavTestLRT()] for available options
##'
##' @return A list of results is provided, consisting of at least
##' two elements:
##' \enumerate{
##' \item `estimates`: The results of parameter estimates including pooled
##' estimates (`poolest`), the estimates for each group, standardized
##' estimates for each group (`std`), the difference in standardized
##' values, and the effect size statistic (*q* for factor loading
##' difference and *h* for error variance difference). See the details of
##' this effect size statistic by running `vignette("partialInvariance")`.
##' In the `partialInvariance` function, the additional effect statistics
##' proposed by Millsap and Olivera-Aguilar (2012) are provided. For factor
##' loading, the additional outputs are the observed mean difference
##' (`diff_mean`), the mean difference if factor scores are low
##' (`low_fscore`), and the mean difference if factor scores are high
##' (`high_fscore`). The low factor score is calculated by (a) finding the
##' factor scores that its *z* score equals -`bound` (the default is
##' \eqn{-2}) from all groups and (b) picking the minimum value among the
##' factor scores. The high factor score is calculated by (a) finding the
##' factor scores that its *z* score equals `bound` (default = 2)
##' from all groups and (b) picking the maximum value among the factor scores.
##' For measurement intercepts, the additional outputs are the observed means
##' difference (`diff_mean`) and the proportion of the differences in the
##' intercepts over the observed means differences (`propdiff`). For error
##' variances, the additional outputs are the proportion of the difference in
##' error variances over the difference in observed variances (`propdiff`).
##'
##' \item `results`: Statistical tests as well as the change in CFI are
##' provided. \eqn{\chi^2} and *p* values are provided for all methods.
##'
##' \item `models`: The submodels used in the `free` and `fix`
##' methods, as well as the nested and parent models. The nested and parent
##' models will be changed from the original models if the `free` or
##' `fix` arguments are specified.
##' }
##'
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' @seealso [measurementInvariance()] for measurement invariance for
##' continuous variables; [measurementInvarianceCat()] for measurement
##' invariance for categorical variables; [lavaan::lavTestWald()] for
##' multivariate Wald test
##'
##' @references Millsap, R. E., & Olivera-Aguilar, M. (2012). Investigating
##' measurement invariance using confirmatory factor analysis. In R. H. Hoyle
##' (Ed.), *Handbook of structural equation modeling* (pp. 380--392). New
##' York, NY: Guilford.
##'
##' @examples
##'
##' ## Conduct weak invariance testing manually by using fixed-factor
##' ## method of scale identification
##'
##' library(lavaan)
##'
##' conf <- "
##' f1 =~ NA*x1 + x2 + x3
##' f2 =~ NA*x4 + x5 + x6
##' f1 ~~ c(1, 1)*f1
##' f2 ~~ c(1, 1)*f2
##' "
##'
##' weak <- "
##' f1 =~ NA*x1 + x2 + x3
##' f2 =~ NA*x4 + x5 + x6
##' f1 ~~ c(1, NA)*f1
##' f2 ~~ c(1, NA)*f2
##' "
##'
##' configural <- cfa(conf, data = HolzingerSwineford1939, std.lv = TRUE, group="school")
##' weak <- cfa(weak, data = HolzingerSwineford1939, group="school", group.equal="loadings")
##' models <- list(fit.configural = configural, fit.loadings = weak)
##' partialInvariance(models, "metric")
##'
##' \donttest{
##' partialInvariance(models, "metric", free = "x5") # "x5" is free across groups in advance
##' partialInvariance(models, "metric", fix = "x4") # "x4" is fixed across groups in advance
##'
##' ## Use the result from the measurementInvariance function
##' HW.model <- ' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + x8 + x9 '
##'
##' models2 <- measurementInvariance(model = HW.model, data=HolzingerSwineford1939,
##' group="school")
##' partialInvariance(models2, "scalar")
##'
##' ## Conduct weak invariance testing manually by using fixed-factor
##' ## method of scale identification for dichotomous variables
##'
##' f <- rnorm(1000, 0, 1)
##' u1 <- 0.9*f + rnorm(1000, 1, sqrt(0.19))
##' u2 <- 0.8*f + rnorm(1000, 1, sqrt(0.36))
##' u3 <- 0.6*f + rnorm(1000, 1, sqrt(0.64))
##' u4 <- 0.7*f + rnorm(1000, 1, sqrt(0.51))
##' u1 <- as.numeric(cut(u1, breaks = c(-Inf, 0, Inf)))
##' u2 <- as.numeric(cut(u2, breaks = c(-Inf, 0.5, Inf)))
##' u3 <- as.numeric(cut(u3, breaks = c(-Inf, 0, Inf)))
##' u4 <- as.numeric(cut(u4, breaks = c(-Inf, -0.5, Inf)))
##' g <- rep(c(1, 2), 500)
##' dat2 <- data.frame(u1, u2, u3, u4, g)
##'
##' configural2 <- "
##' f1 =~ NA*u1 + u2 + u3 + u4
##' u1 | c(t11, t11)*t1
##' u2 | c(t21, t21)*t1
##' u3 | c(t31, t31)*t1
##' u4 | c(t41, t41)*t1
##' f1 ~~ c(1, 1)*f1
##' f1 ~ c(0, NA)*1
##' u1 ~~ c(1, 1)*u1
##' u2 ~~ c(1, NA)*u2
##' u3 ~~ c(1, NA)*u3
##' u4 ~~ c(1, NA)*u4
##' "
##'
##' outConfigural2 <- cfa(configural2, data = dat2, group = "g",
##' parameterization = "theta", estimator = "wlsmv",
##' ordered = c("u1", "u2", "u3", "u4"))
##'
##' weak2 <- "
##' f1 =~ NA*u1 + c(f11, f11)*u1 + c(f21, f21)*u2 + c(f31, f31)*u3 + c(f41, f41)*u4
##' u1 | c(t11, t11)*t1
##' u2 | c(t21, t21)*t1
##' u3 | c(t31, t31)*t1
##' u4 | c(t41, t41)*t1
##' f1 ~~ c(1, NA)*f1
##' f1 ~ c(0, NA)*1
##' u1 ~~ c(1, 1)*u1
##' u2 ~~ c(1, NA)*u2
##' u3 ~~ c(1, NA)*u3
##' u4 ~~ c(1, NA)*u4
##' "
##'
##' outWeak2 <- cfa(weak2, data = dat2, group = "g", parameterization = "theta",
##' estimator = "wlsmv", ordered = c("u1", "u2", "u3", "u4"))
##' modelsCat <- list(fit.configural = outConfigural2, fit.loadings = outWeak2)
##'
##' partialInvarianceCat(modelsCat, type = "metric")
##'
##' partialInvarianceCat(modelsCat, type = "metric", free = "u2")
##' partialInvarianceCat(modelsCat, type = "metric", fix = "u3")
##'
##' ## Use the result from the measurementInvarianceCat function
##'
##' model <- ' f1 =~ u1 + u2 + u3 + u4
##' f2 =~ u5 + u6 + u7 + u8'
##'
##' modelsCat2 <- measurementInvarianceCat(model = model, data = datCat, group = "g",
##' parameterization = "theta",
##' estimator = "wlsmv", strict = TRUE)
##'
##' partialInvarianceCat(modelsCat2, type = "scalar")
##' }
##'
##' @export
partialInvariance <- function(fit, type, free = NULL, fix = NULL, refgroup = 1,
poolvar = TRUE, p.adjust = "none", fbound = 2,
return.fit = FALSE, method = "satorra.bentler.2001") {
# fit <- measurementInvariance(HW.model, data=HolzingerSwineford1939, group="school", strict = TRUE)
# type <- "weak"
# free <- NULL
# fix <- "x1"
# refgroup <- 1
# poolvar <- TRUE
# p.adjust <- "none"
# return.fit <- FALSE
# fbound <- 2
# method <- "satorra.bentler.2001"
type <- tolower(type)
numType <- 0
fit1 <- fit0 <- NULL
# fit0 = Nested model, fit1 = Parent model
if(type %in% c("metric", "weak", "loading", "loadings")) {
numType <- 1
if(all(c("fit.configural", "fit.loadings") %in% names(fit))) {
fit1 <- fit$fit.configural
fit0 <- fit$fit.loadings
} else {
stop("The elements named 'fit.configural' and 'fit.loadings' are needed in the 'fit' argument")
}
} else if (type %in% c("scalar", "strong", "intercept", "intercepts", "threshold", "thresholds")) {
numType <- 2
if(all(c("fit.loadings", "fit.intercepts") %in% names(fit))) {
fit1 <- fit$fit.loadings
fit0 <- fit$fit.intercepts
} else {
stop("The elements named 'fit.loadings' and 'fit.intercepts' are needed in the 'fit' argument")
}
} else if (type %in% c("strict", "residual", "residuals", "error", "errors")) {
numType <- 3
if(all(c("fit.intercepts", "fit.residuals") %in% names(fit))) {
fit1 <- fit$fit.intercepts
fit0 <- fit$fit.residuals
} else {
stop("The elements named 'fit.intercepts' and 'fit.residuals' are needed in the 'fit' argument")
}
} else if (type %in% c("means", "mean")) {
numType <- 4
if("fit.means" %in% names(fit)) {
fit0 <- fit$fit.means
if("fit.residuals" %in% names(fit)) {
fit1 <- fit$fit.residuals
} else if ("fit.intercepts" %in% names(fit)) {
fit1 <- fit$fit.intercepts
} else {
stop("The elements named either 'fit.residuals' or 'fit.intercepts ' is needed in the 'fit' argument")
}
} else {
stop("The elements named 'fit.means' is needed in the 'fit' argument")
}
} else {
stop("Please specify the correct type of measurement invariance. See the help page.")
}
pt1 <- parTable(fit1)
pt0 <- parTable(fit0)
pt0$start <- pt0$est <- pt0$se <- NULL
pt1$start <- pt1$est <- pt1$se <- NULL
pt1$label[substr(pt1$label, 1, 1) == "." & substr(pt1$label, nchar(pt1$label), nchar(pt1$label)) == "."] <- ""
pt0$label[substr(pt0$label, 1, 1) == "." & substr(pt0$label, nchar(pt0$label), nchar(pt0$label)) == "."] <- ""
namept1 <- paramNameFromPt(pt1)
namept0 <- paramNameFromPt(pt0)
if(length(table(table(pt0$rhs[pt0$op == "=~"]))) != 1) stop("The model is not congeneric. This function does not support non-congeneric models.")
varfree <- varnames <- unique(pt0$rhs[pt0$op == "=~"])
facnames <- unique(pt0$lhs[(pt0$op == "=~") & (pt0$rhs %in% varnames)])
facrepresent <- table(pt0$lhs[(pt0$op == "=~") & (pt0$rhs %in% varnames)], pt0$rhs[(pt0$op == "=~") & (pt0$rhs %in% varnames)])
if(any(apply(facrepresent, 2, function(x) sum(x != 0)) > 1)) stop("The model is not congeneric. This function does not support non-congeneric models.")
facList <- list()
for(i in 1:nrow(facrepresent)) {
facList[[i]] <- colnames(facrepresent)[facrepresent[i,] > 0]
}
names(facList) <- rownames(facrepresent)
facList <- facList[facnames]  # align the factor list with the order of facnames
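## Record, for each factor, which indicator serves as the fixed marker
## variable (a loading fixed to a positive value in group 1); the marker may
## need to be reassigned when its own invariance is tested below.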
fixLoadingFac <- list()
for(i in seq_along(facList)) {
select <- pt1$lhs == names(facList)[i] & pt1$op == "=~" & pt1$rhs %in% facList[[i]] & pt1$group == 1 & pt1$free == 0 & (!is.na(pt1$ustart) & pt1$ustart > 0)
fixLoadingFac[[i]] <- pt1$rhs[select]
}
names(fixLoadingFac) <- names(facList)
fixIntceptFac <- list()
for(i in seq_along(facList)) {
select <- pt1$op == "~1" & pt1$rhs %in% facList[[i]] & pt1$group == 1 & pt1$free == 0
fixIntceptFac[[i]] <- pt1$rhs[select]
}
names(fixIntceptFac) <- names(facList)
ngroups <- max(pt0$group)
neach <- lavInspect(fit0, "nobs")
groupvar <- lavInspect(fit0, "group")
grouplab <- lavInspect(fit0, "group.label")
if(!is.numeric(refgroup)) refgroup <- which(refgroup == grouplab)
grouporder <- 1:ngroups
grouporder <- c(refgroup, setdiff(grouporder, refgroup))
grouplaborder <- grouplab[grouporder]
complab <- paste(grouplaborder[2:ngroups], "vs.", grouplaborder[1])
if(ngroups <= 1) stop("Well, the number of groups is 1. Measurement invariance across 'groups' cannot be done.")
if(numType == 4) {
if(!all(c(free, fix) %in% facnames)) stop("'free' and 'fix' arguments should consist of factor names because mean invariance is tested.")
} else {
if(!all(c(free, fix) %in% varnames)) stop("'free' and 'fix' arguments should consist of variable names.")
}
result <- fixCon <- freeCon <- NULL
estimates <- NULL
listFreeCon <- listFixCon <- list()
beta <- lavaan::coef(fit1)
beta0 <- lavaan::coef(fit0)
waldMat <- matrix(0, ngroups - 1, length(beta))
if(numType == 1) {
if(!is.null(free) | !is.null(fix)) {
if(!is.null(fix)) {
facinfix <- findFactor(fix, facList)
dup <- duplicated(facinfix)
for(i in seq_along(fix)) {
if(dup[i]) {
pt0 <- constrainParTable(pt0, facinfix[i], "=~", fix[i], 1:ngroups)
pt1 <- constrainParTable(pt1, facinfix[i], "=~", fix[i], 1:ngroups)
} else {
oldmarker <- fixLoadingFac[[facinfix[i]]]
if(length(oldmarker) > 0) {
oldmarkerval <- pt1$ustart[pt1$lhs == facinfix[i] & pt1$op == "=~" & pt1$rhs == oldmarker & pt1$group == 1]
if(oldmarker == fix[i]) {
pt0 <- fixParTable(pt0, facinfix[i], "=~", fix[i], 1:ngroups, oldmarkerval)
pt1 <- fixParTable(pt1, facinfix[i], "=~", fix[i], 1:ngroups, oldmarkerval)
} else {
pt0 <- freeParTable(pt0, facinfix[i], "=~", oldmarker, 1:ngroups)
pt0 <- constrainParTable(pt0, facinfix[i], "=~", oldmarker, 1:ngroups)
pt1 <- freeParTable(pt1, facinfix[i], "=~", oldmarker, 1:ngroups)
pt0 <- fixParTable(pt0, facinfix[i], "=~", fix[i], 1:ngroups, oldmarkerval)
pt1 <- fixParTable(pt1, facinfix[i], "=~", fix[i], 1:ngroups, oldmarkerval)
fixLoadingFac[[facinfix[i]]] <- fix[i]
}
} else {
pt0 <- constrainParTable(pt0, facinfix[i], "=~", fix[i], 1:ngroups)
pt1 <- constrainParTable(pt1, facinfix[i], "=~", fix[i], 1:ngroups)
}
}
}
}
if(!is.null(free)) {
facinfree <- findFactor(free, facList)
for(i in seq_along(free)) {
# Need to change marker variable if fixed
oldmarker <- fixLoadingFac[[facinfree[i]]]
if(length(oldmarker) > 0 && oldmarker == free[i]) {
oldmarkerval <- pt1$ustart[pt1$lhs == facinfree[i] & pt1$op == "=~" & pt1$rhs == oldmarker & pt1$group == 1]
candidatemarker <- setdiff(facList[[facinfree[i]]], free[i])[1]
pt0 <- freeParTable(pt0, facinfree[i], "=~", free[i], 1:ngroups)
pt1 <- freeParTable(pt1, facinfree[i], "=~", free[i], 1:ngroups)
pt0 <- fixParTable(pt0, facinfree[i], "=~", candidatemarker, 1:ngroups, oldmarkerval)
pt1 <- fixParTable(pt1, facinfree[i], "=~", candidatemarker, 1:ngroups, oldmarkerval)
fixLoadingFac[[facinfree[i]]] <- candidatemarker
} else {
pt0 <- freeParTable(pt0, facinfree[i], "=~", free[i], 1:ngroups)
pt1 <- freeParTable(pt1, facinfree[i], "=~", free[i], 1:ngroups)
}
}
}
namept1 <- paramNameFromPt(pt1)
namept0 <- paramNameFromPt(pt0)
fit0 <- refit(pt0, fit0)
fit1 <- refit(pt1, fit1)
beta <- lavaan::coef(fit1)
beta0 <- lavaan::coef(fit0)
waldMat <- matrix(0, ngroups - 1, length(beta))
varfree <- setdiff(varfree, c(free, fix))
}
obsmean <- sapply(lavInspect(fit0, "sampstat"), "[[", "mean") #FIXME: there might not be a mean structure
obsmean <- obsmean[,grouporder]
obsdiff <- obsmean[,2:ngroups, drop = FALSE] - matrix(obsmean[,1], nrow(obsmean), ngroups - 1)
obsdiff <- obsdiff[varfree, , drop = FALSE]
colnames(obsdiff) <- paste0("diff_mean:", complab)
estimates <- matrix(NA, length(varfree), ngroups + 1)
stdestimates <- matrix(NA, length(varfree), ngroups)
colnames(estimates) <- c("poolest", paste0("load:", grouplab))
colnames(stdestimates) <- paste0("std:", grouplab)
esstd <- esz <- matrix(NA, length(varfree), ngroups - 1)
colnames(esstd) <- paste0("diff_std:", complab)
colnames(esz) <- paste0("q:", complab)
esdiff <- matrix(NA, length(varfree), ngroups - 1)
# Extract facmean, facsd, load, tau -> lowdiff, highdiff
lowdiff <- matrix(NA, length(varfree), ngroups - 1)
highdiff <- matrix(NA, length(varfree), ngroups - 1)
colnames(lowdiff) <- paste0("low_fscore:", complab)
colnames(highdiff) <- paste0("high_fscore:", complab)
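## Millsap & Olivera-Aguilar (2012) effect sizes computed below: for each
## non-reference group, the expected indicator-mean difference at a factor
## score w is (intercept difference) + w * (loading difference), evaluated
## at w = min(facMean - fbound*facSD) and w = max(facMean + fbound*facSD).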
fixCon <- freeCon <- matrix(NA, length(varfree), 4)
waldCon <- matrix(NA, length(varfree), 3)
colnames(fixCon) <- c("fix.chi", "fix.df", "fix.p", "fix.cfi")
colnames(freeCon) <- c("free.chi", "free.df", "free.p", "free.cfi")
colnames(waldCon) <- c("wald.chi", "wald.df", "wald.p")
index <- which((pt1$rhs %in% varfree) & (pt1$op == "=~") & (pt1$group == 1))
facinfix <- findFactor(fix, facList)
varinfixvar <- unlist(facList[facinfix])
varinfixvar <- intersect(varinfixvar, varfree)
indexfixvar <- which((pt1$rhs %in% varinfixvar) & (pt1$op == "=~") & (pt1$group == 1))
varnonfixvar <- setdiff(varfree, varinfixvar)
indexnonfixvar <- setdiff(index, indexfixvar)
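## Indicators are tested in two passes: first those loading on factors that
## already contain a user-fixed parameter, then the remaining indicators
## (whose marker variable may need to be reassigned before testing).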
pos <- 1
for(i in seq_along(indexfixvar)) {
runnum <- indexfixvar[i]
temp <- constrainParTable(pt1, pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups)
tryresult <- try(tempfit <- refit(temp, fit1), silent = TRUE)
if(!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, fit1, method = method), silent = TRUE)
if(!is(compresult, "try-error")) fixCon[pos,] <- c(unlist(modelcomp[2,5:7]), deltacfi(fit1, tempfit))
}
listFixCon <- c(listFixCon, tryresult)
temp0 <- freeParTable(pt0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
estimates[pos, 1] <- getValue(pt0, beta0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1)
tryresult0 <- try(tempfit0 <- refit(temp0, fit0), silent = TRUE)
if(!is(tryresult0, "try-error")) {
compresult0 <- try(modelcomp0 <- lavaan::lavTestLRT(tempfit0, fit0, method = method), silent = TRUE)
if(!is(compresult0, "try-error")) freeCon[pos,] <- c(unlist(modelcomp0[2,5:7]), deltacfi(tempfit0, fit0))
loadVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
estimates[pos, 2:ncol(estimates)] <- loadVal
facVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], "~~", pt0$lhs[runnum], 1:ngroups)
totalVal <- sapply(lavaan::fitted.values(tempfit0), function(x, v) x$cov[v, v], v = pt0$rhs[runnum])
names(facVal) <- names(totalVal) <- grouplab
ifelse(poolvar, refFacVal <- poolVariance(facVal, neach), refFacVal <- facVal[refgroup])
ifelse(poolvar, refTotalVal <- poolVariance(totalVal, neach), refTotalVal <- totalVal[refgroup])
stdLoadVal <- loadVal * sqrt(refFacVal) / sqrt(refTotalVal)
stdestimates[pos,] <- stdLoadVal
stdLoadVal <- stdLoadVal[grouporder]
esstd[pos,] <- stdLoadVal[2:ngroups] - stdLoadVal[1]
if(any(abs(stdLoadVal) > 0.9999)) warning(paste("Standardized loadings of", pt0$rhs[runnum], "in some groups are less than -1 or greater than 1. The standardized loadings used in the Fisher z transformation are truncated to -0.9999 or 0.9999."))
stdLoadVal[stdLoadVal > 0.9999] <- 0.9999
stdLoadVal[stdLoadVal < -0.9999] <- -0.9999
zLoadVal <- atanh(stdLoadVal)
esz[pos,] <- zLoadVal[2:ngroups] - zLoadVal[1]
facMean <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], "~1", "", 1:ngroups)
wlow <- min(facMean - fbound * sqrt(facVal))
whigh <- max(facMean + fbound * sqrt(facVal))
intVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$rhs[runnum], "~1", "", 1:ngroups)
loadVal <- loadVal[grouporder]
intVal <- intVal[grouporder]
loaddiff <- loadVal[2:ngroups] - loadVal[1]
intdiff <- intVal[2:ngroups] - intVal[1]
lowdiff[pos,] <- intdiff + wlow * loaddiff
highdiff[pos,] <- intdiff + whigh * loaddiff
}
listFreeCon <- c(listFreeCon, tryresult0)
waldCon[pos,] <- waldConstraint(fit1, pt1, waldMat, cbind(pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups))
pos <- pos + 1
}
facinvarfree <- findFactor(varnonfixvar, facList)
for(i in seq_along(indexnonfixvar)) {
runnum <- indexnonfixvar[i]
# Need to change marker variable if fixed
oldmarker <- fixLoadingFac[[facinvarfree[i]]]
if(length(oldmarker) > 0 && oldmarker == varnonfixvar[i]) {
candidatemarker <- setdiff(facList[[facinvarfree[i]]], varnonfixvar[i])[1]
temp <- freeParTable(pt1, facinvarfree[i], "=~", varnonfixvar[i], 1:ngroups)
temp <- fixParTable(temp, facinvarfree[i], "=~", candidatemarker, 1:ngroups, ustart = 1)
temp <- constrainParTable(temp, facinvarfree[i], "=~", varnonfixvar[i], 1:ngroups)
newparent <- freeParTable(pt1, facinvarfree[i], "=~", varnonfixvar[i], 1:ngroups)
newparent <- fixParTable(newparent, facinvarfree[i], "=~", candidatemarker, 1:ngroups, ustart = 1)
newparentresult <- try(newparentfit <- refit(newparent, fit1), silent = TRUE)
if(!is(newparentresult, "try-error")) {
tryresult <- try(tempfit <- refit(temp, fit1), silent = TRUE)
if(!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, newparentfit, method = method), silent = TRUE)
if(!is(compresult, "try-error")) fixCon[pos,] <- c(unlist(modelcomp[2,5:7]), deltacfi(newparentfit, tempfit))
}
waldCon[pos,] <- waldConstraint(newparentfit, newparent, waldMat, cbind(facinvarfree[i], "=~", varnonfixvar[i], 1:ngroups))
}
} else {
temp <- constrainParTable(pt1, pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups)
tryresult <- try(tempfit <- refit(temp, fit1), silent = TRUE)
if(!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, fit1, method = method), silent = TRUE)
if(!is(compresult, "try-error")) fixCon[pos,] <- c(unlist(modelcomp[2,5:7]), deltacfi(fit1, tempfit))
}
waldCon[pos,] <- waldConstraint(fit1, pt1, waldMat, cbind(pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups))
}
listFixCon <- c(listFixCon, tryresult)
if(length(oldmarker) > 0 && oldmarker == varnonfixvar[i]) {
temp0 <- freeParTable(pt0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 2:ngroups)
} else {
temp0 <- freeParTable(pt0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
}
estimates[pos, 1] <- getValue(pt0, beta0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1)
tryresult0 <- try(tempfit0 <- refit(temp0, fit0), silent = TRUE)
if(!is(tryresult0, "try-error")) {
compresult0 <- try(modelcomp0 <- lavaan::lavTestLRT(tempfit0, fit0, method = method), silent = TRUE)
if(!is(compresult0, "try-error")) freeCon[pos,] <- c(unlist(modelcomp0[2,5:7]), deltacfi(tempfit0, fit0))
loadVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
estimates[pos, 2:ncol(estimates)] <- loadVal
facVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], "~~", pt0$lhs[runnum], 1:ngroups)
totalVal <- sapply(lavaan::fitted.values(tempfit0), function(x, v) x$cov[v, v], v = pt0$rhs[runnum])
names(facVal) <- names(totalVal) <- grouplab
ifelse(poolvar, refFacVal <- poolVariance(facVal, neach), refFacVal <- facVal[refgroup])
ifelse(poolvar, refTotalVal <- poolVariance(totalVal, neach), refTotalVal <- totalVal[refgroup])
stdLoadVal <- loadVal * sqrt(refFacVal) / sqrt(refTotalVal)
stdestimates[pos,] <- stdLoadVal
stdLoadVal <- stdLoadVal[grouporder]
esstd[pos,] <- stdLoadVal[2:ngroups] - stdLoadVal[1]
if(any(abs(stdLoadVal) > 0.9999)) warning(paste("Standardized loadings of", pt0$rhs[runnum], "in some groups are less than -1 or greater than 1. The standardized loadings used in the Fisher z transformation are truncated to -0.9999 or 0.9999."))
stdLoadVal[stdLoadVal > 0.9999] <- 0.9999
stdLoadVal[stdLoadVal < -0.9999] <- -0.9999
zLoadVal <- atanh(stdLoadVal)
esz[pos,] <- zLoadVal[2:ngroups] - zLoadVal[1]
facMean <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], "~1", "", 1:ngroups)
wlow <- min(facMean - fbound * sqrt(facVal))
whigh <- max(facMean + fbound * sqrt(facVal))
intVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$rhs[runnum], "~1", "", 1:ngroups)
loadVal <- loadVal[grouporder]
intVal <- intVal[grouporder]
loaddiff <- loadVal[2:ngroups] - loadVal[1]
intdiff <- intVal[2:ngroups] - intVal[1]
lowdiff[pos,] <- intdiff + wlow * loaddiff
highdiff[pos,] <- intdiff + whigh * loaddiff
}
listFreeCon <- c(listFreeCon, tryresult0)
pos <- pos + 1
}
freeCon[,3] <- stats::p.adjust(freeCon[,3], p.adjust)
fixCon[,3] <- stats::p.adjust(fixCon[,3], p.adjust)
waldCon[,3] <- stats::p.adjust(waldCon[,3], p.adjust)
rownames(fixCon) <- names(listFixCon) <- rownames(freeCon) <- names(listFreeCon) <- rownames(waldCon) <- rownames(estimates) <- namept1[c(indexfixvar, indexnonfixvar)]
estimates <- cbind(estimates, stdestimates, esstd, esz, obsdiff, lowdiff, highdiff)
result <- cbind(freeCon, fixCon, waldCon)
} else if (numType == 2) {
if(!is.null(free) | !is.null(fix)) {
if(!is.null(fix)) {
facinfix <- findFactor(fix, facList)
dup <- duplicated(facinfix)
for(i in seq_along(fix)) {
if(dup[i]) {
pt0 <- constrainParTable(pt0, fix[i], "~1", "", 1:ngroups)
pt1 <- constrainParTable(pt1, fix[i], "~1", "", 1:ngroups)
} else {
oldmarker <- fixIntceptFac[[facinfix[i]]]
if(length(oldmarker) > 0) {
oldmarkerval <- pt1$ustart[pt1$lhs == fix[i] & pt1$op == "~1" & pt1$rhs == "" & pt1$group == 1]
if(oldmarker == fix[i]) {
pt0 <- fixParTable(pt0, fix[i], "~1", "", 1:ngroups, oldmarkerval)
pt1 <- fixParTable(pt1, fix[i], "~1", "", 1:ngroups, oldmarkerval)
} else {
pt0 <- freeParTable(pt0, oldmarker, "~1", "", 1:ngroups)
pt0 <- constrainParTable(pt0, oldmarker, "~1", "", 1:ngroups)
pt1 <- freeParTable(pt1, oldmarker, "~1", "", 1:ngroups)
pt0 <- fixParTable(pt0, fix[i], "~1", "", 1:ngroups, oldmarkerval)
pt1 <- fixParTable(pt1, fix[i], "~1", "", 1:ngroups, oldmarkerval)
fixIntceptFac[[facinfix[i]]] <- fix[i]
}
} else {
pt0 <- constrainParTable(pt0, fix[i], "~1", "", 1:ngroups)
pt1 <- constrainParTable(pt1, fix[i], "~1", "", 1:ngroups)
}
}
}
}
if(!is.null(free)) {
facinfree <- findFactor(free, facList)
for(i in seq_along(free)) {
# Need to change marker variable if fixed
oldmarker <- fixIntceptFac[[facinfree[i]]]
if(length(oldmarker) > 0 && oldmarker == free[i]) {
oldmarkerval <- pt1$ustart[pt1$lhs == oldmarker & pt1$op == "~1" & pt1$rhs == "" & pt1$group == 1]
candidatemarker <- setdiff(facList[[facinfree[i]]], free[i])[1]
pt0 <- freeParTable(pt0, free[i], "~1", "", 1:ngroups)
pt1 <- freeParTable(pt1, free[i], "~1", "", 1:ngroups)
pt0 <- fixParTable(pt0, candidatemarker, "~1", "", 1:ngroups, oldmarkerval)
pt1 <- fixParTable(pt1, candidatemarker, "~1", "", 1:ngroups, oldmarkerval)
fixIntceptFac[[facinfree[i]]] <- candidatemarker
} else {
pt0 <- freeParTable(pt0, free[i], "~1", "", 1:ngroups)
pt1 <- freeParTable(pt1, free[i], "~1", "", 1:ngroups)
}
}
}
namept1 <- paramNameFromPt(pt1)
namept0 <- paramNameFromPt(pt0)
fit0 <- refit(pt0, fit0)
fit1 <- refit(pt1, fit1)
beta <- lavaan::coef(fit1)
beta0 <- lavaan::coef(fit0)
waldMat <- matrix(0, ngroups - 1, length(beta))
varfree <- setdiff(varfree, c(free, fix))
}
obsmean <- sapply(lavInspect(fit0, "sampstat"), "[[", "mean") #FIXME: there might not be a mean structure
obsmean <- obsmean[,grouporder]
obsdiff <- obsmean[,2:ngroups, drop = FALSE] - matrix(obsmean[,1], nrow(obsmean), ngroups - 1)
obsdiff <- obsdiff[varfree, , drop = FALSE]
colnames(obsdiff) <- paste0("diff_mean:", complab)
# Prop diff
propdiff <- matrix(NA, length(varfree), ngroups - 1)
colnames(propdiff) <- paste0("propdiff:", complab)
estimates <- matrix(NA, length(varfree), ngroups + 1)
stdestimates <- matrix(NA, length(varfree), ngroups)
colnames(estimates) <- c("poolest", paste0("int:", grouplab))
colnames(stdestimates) <- paste0("std:", grouplab)
esstd <- matrix(NA, length(varfree), ngroups - 1)
colnames(esstd) <- paste0("diff_std:", complab)
fixCon <- freeCon <- matrix(NA, length(varfree), 4)
waldCon <- matrix(NA, length(varfree), 3)
colnames(fixCon) <- c("fix.chi", "fix.df", "fix.p", "fix.cfi")
colnames(freeCon) <- c("free.chi", "free.df", "free.p", "free.cfi")
colnames(waldCon) <- c("wald.chi", "wald.df", "wald.p")
index <- which((pt1$lhs %in% varfree) & (pt1$op == "~1") & (pt1$group == 1))
facinfix <- findFactor(fix, facList)
varinfixvar <- unlist(facList[facinfix])
varinfixvar <- intersect(varinfixvar, varfree)
indexfixvar <- which((pt1$lhs %in% varinfixvar) & (pt1$op == "~1") & (pt1$group == 1))
varnonfixvar <- setdiff(varfree, varinfixvar)
indexnonfixvar <- setdiff(index, indexfixvar)
pos <- 1
for(i in seq_along(varinfixvar)) {
runnum <- indexfixvar[i]
temp <- constrainParTable(pt1, pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups)
tryresult <- try(tempfit <- refit(temp, fit1), silent = TRUE)
if(!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, fit1, method = method), silent = TRUE)
if(!is(compresult, "try-error")) fixCon[pos,] <- c(unlist(modelcomp[2,5:7]), deltacfi(fit1, tempfit))
}
listFixCon <- c(listFixCon, tryresult)
temp0 <- freeParTable(pt0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
estimates[pos, 1] <- getValue(pt0, beta0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1)
tryresult0 <- try(tempfit0 <- refit(temp0, fit0), silent = TRUE)
if(!is(tryresult0, "try-error")) {
compresult0 <- try(modelcomp0 <- lavaan::lavTestLRT(tempfit0, fit0, method = method), silent = TRUE)
if(!is(compresult0, "try-error")) freeCon[pos,] <- c(unlist(modelcomp0[2,5:7]), deltacfi(tempfit0, fit0))
intVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
estimates[pos, 2:ncol(estimates)] <- intVal
totalVal <- sapply(lavaan::fitted.values(tempfit0), function(x, v) x$cov[v, v], v = pt0$lhs[runnum])
ifelse(poolvar, refTotalVal <- poolVariance(totalVal, neach), refTotalVal <- totalVal[refgroup])
stdIntVal <- intVal / sqrt(refTotalVal)
stdestimates[pos,] <- stdIntVal
stdIntVal <- stdIntVal[grouporder]
esstd[pos,] <- stdIntVal[2:ngroups] - stdIntVal[1]
intVal <- intVal[grouporder]
propdiff[pos,] <- (intVal[2:ngroups] - intVal[1]) / obsdiff[pos,]
}
listFreeCon <- c(listFreeCon, tryresult0)
waldCon[pos,] <- waldConstraint(fit1, pt1, waldMat, cbind(pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups))
pos <- pos + 1
}
facinvarfree <- findFactor(varnonfixvar, facList)
for(i in seq_along(varnonfixvar)) {
runnum <- indexnonfixvar[i]
# Need to change marker variable if fixed
oldmarker <- fixIntceptFac[[facinvarfree[i]]]
if(length(oldmarker) > 0 && oldmarker == varnonfixvar[i]) {
candidatemarker <- setdiff(facList[[facinvarfree[i]]], varnonfixvar[i])[1]
temp <- freeParTable(pt1, varnonfixvar[i], "~1", "", 1:ngroups)
temp <- constrainParTable(temp, varnonfixvar[i], "~1", "", 1:ngroups)
temp <- fixParTable(temp, candidatemarker, "~1", "", 1:ngroups)
newparent <- freeParTable(pt1, varnonfixvar[i], "~1", "", 1:ngroups)
newparent <- fixParTable(newparent, candidatemarker, "~1", "", 1:ngroups)
newparentresult <- try(newparentfit <- refit(newparent, fit1), silent = TRUE)
if(!is(newparentresult, "try-error")) {
tryresult <- try(tempfit <- refit(temp, fit1), silent = TRUE)
if(!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, newparentfit, method = method), silent = TRUE)
if(!is(compresult, "try-error")) fixCon[pos,] <- c(unlist(modelcomp[2,5:7]), deltacfi(newparentfit, tempfit))
}
waldCon[pos,] <- waldConstraint(newparentfit, newparent, waldMat, cbind(varnonfixvar[i], "~1", "", 1:ngroups))
}
} else {
temp <- constrainParTable(pt1, pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups)
tryresult <- try(tempfit <- refit(temp, fit1), silent = TRUE)
if(!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, fit1, method = method), silent = TRUE)
if(!is(compresult, "try-error")) fixCon[pos,] <- c(unlist(modelcomp[2,5:7]), deltacfi(fit1, tempfit))
}
waldCon[pos,] <- waldConstraint(fit1, pt1, waldMat, cbind(pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups))
}
listFixCon <- c(listFixCon, tryresult)
if(length(oldmarker) > 0 && oldmarker == varnonfixvar[i]) {
temp0 <- freeParTable(pt0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 2:ngroups)
} else {
temp0 <- freeParTable(pt0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
}
estimates[pos, 1] <- getValue(pt0, beta0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1)
tryresult0 <- try(tempfit0 <- refit(temp0, fit0), silent = TRUE)
if(!is(tryresult0, "try-error")) {
compresult0 <- try(modelcomp0 <- lavaan::lavTestLRT(tempfit0, fit0, method = method), silent = TRUE)
if(!is(compresult0, "try-error")) freeCon[pos,] <- c(unlist(modelcomp0[2,5:7]), deltacfi(tempfit0, fit0))
intVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
estimates[pos, 2:ncol(estimates)] <- intVal
totalVal <- sapply(lavaan::fitted.values(tempfit0), function(x, v) x$cov[v, v], v = pt0$lhs[runnum])
ifelse(poolvar, refTotalVal <- poolVariance(totalVal, neach), refTotalVal <- totalVal[refgroup])
stdIntVal <- intVal / sqrt(refTotalVal)
stdestimates[pos,] <- stdIntVal
stdIntVal <- stdIntVal[grouporder]
esstd[pos,] <- stdIntVal[2:ngroups] - stdIntVal[1]
intVal <- intVal[grouporder]
propdiff[pos,] <- (intVal[2:ngroups] - intVal[1]) / obsdiff[pos,]
}
listFreeCon <- c(listFreeCon, tryresult0)
pos <- pos + 1
}
freeCon[,3] <- stats::p.adjust(freeCon[,3], p.adjust)
fixCon[,3] <- stats::p.adjust(fixCon[,3], p.adjust)
waldCon[,3] <- stats::p.adjust(waldCon[,3], p.adjust)
rownames(fixCon) <- names(listFixCon) <- rownames(freeCon) <- names(listFreeCon) <- rownames(waldCon) <- rownames(estimates) <- namept1[c(indexfixvar, indexnonfixvar)]
estimates <- cbind(estimates, stdestimates, esstd, obsdiff, propdiff)
result <- cbind(freeCon, fixCon, waldCon)
} else if (numType == 3) {
if(!is.null(free) | !is.null(fix)) {
if(!is.null(fix)) {
for(i in seq_along(fix)) {
pt0 <- constrainParTable(pt0, fix[i], "~~", fix[i], 1:ngroups)
pt1 <- constrainParTable(pt1, fix[i], "~~", fix[i], 1:ngroups)
}
}
if(!is.null(free)) {
for(i in seq_along(free)) {
pt0 <- freeParTable(pt0, free[i], "~~", free[i], 1:ngroups)
pt1 <- freeParTable(pt1, free[i], "~~", free[i], 1:ngroups)
}
}
namept1 <- paramNameFromPt(pt1)
namept0 <- paramNameFromPt(pt0)
fit0 <- refit(pt0, fit0)
fit1 <- refit(pt1, fit1)
beta <- lavaan::coef(fit1)
beta0 <- lavaan::coef(fit0)
waldMat <- matrix(0, ngroups - 1, length(beta))
varfree <- setdiff(varfree, c(free, fix))
}
# Prop diff
propdiff <- matrix(NA, length(varfree), ngroups - 1)
colnames(propdiff) <- paste0("propdiff:", complab)
estimates <- matrix(NA, length(varfree), ngroups + 1)
stdestimates <- matrix(NA, length(varfree), ngroups)
colnames(estimates) <- c("poolest", paste0("errvar:", grouplab))
colnames(stdestimates) <- paste0("std:", grouplab)
esstd <- esz <- matrix(NA, length(varfree), ngroups - 1)
colnames(esstd) <- paste0("diff_std:", complab)
colnames(esz) <- paste0("h:", complab)
fixCon <- freeCon <- matrix(NA, length(varfree), 4)
waldCon <- matrix(NA, length(varfree), 3)
colnames(fixCon) <- c("fix.chi", "fix.df", "fix.p", "fix.cfi")
colnames(freeCon) <- c("free.chi", "free.df", "free.p", "free.cfi")
colnames(waldCon) <- c("wald.chi", "wald.df", "wald.p")
index <- which((pt1$lhs %in% varfree) & (pt1$op == "~~") & (pt1$lhs == pt1$rhs) & (pt1$group == 1))
for(i in seq_along(index)) {
runnum <- index[i]
temp <- constrainParTable(pt1, pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups)
tryresult <- try(tempfit <- refit(temp, fit1), silent = TRUE)
if(!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, fit1, method = method), silent = TRUE)
if(!is(compresult, "try-error")) fixCon[i,] <- c(unlist(modelcomp[2,5:7]), deltacfi(fit1, tempfit))
}
listFixCon <- c(listFixCon, tryresult)
temp0 <- freeParTable(pt0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
estimates[i, 1] <- getValue(pt0, beta0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1)
tryresult0 <- try(tempfit0 <- refit(temp0, fit0), silent = TRUE)
if(!is(tryresult0, "try-error")) {
compresult0 <- try(modelcomp0 <- lavaan::lavTestLRT(tempfit0, fit0, method = method), silent = TRUE)
if(!is(compresult0, "try-error")) freeCon[i,] <- c(unlist(modelcomp0[2,5:7]), deltacfi(tempfit0, fit0))
errVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
estimates[i, 2:ncol(estimates)] <- errVal
totalVal <- sapply(lavaan::fitted.values(tempfit0), function(x, v) x$cov[v, v], v = pt0$rhs[runnum])
ifelse(poolvar, refTotalVal <- poolVariance(totalVal, neach), refTotalVal <- totalVal[refgroup])
stdErrVal <- errVal / sqrt(refTotalVal)
stdestimates[i,] <- stdErrVal
stdErrVal <- stdErrVal[grouporder]
esstd[i,] <- stdErrVal[2:ngroups] - stdErrVal[1]
if(any(abs(stdErrVal) > 0.9999)) warning(paste("The uniqueness of", pt0$rhs[runnum], "in some groups is greater than 1. The uniqueness values used in the arcsine transformation are truncated to 0.9999."))
stdErrVal[stdErrVal > 0.9999] <- 0.9999
zErrVal <- asin(sqrt(stdErrVal))
esz[i,] <- zErrVal[2:ngroups] - zErrVal[1]
errVal <- errVal[grouporder]
totalVal <- totalVal[grouporder]
errdiff <- errVal[2:ngroups] - errVal[1]
totaldiff <- totalVal[2:ngroups] - totalVal[1]
propdiff[i,] <- errdiff / totaldiff
}
listFreeCon <- c(listFreeCon, tryresult0)
waldCon[i,] <- waldConstraint(fit1, pt1, waldMat, cbind(pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups))
}
freeCon[,3] <- stats::p.adjust(freeCon[,3], p.adjust)
fixCon[,3] <- stats::p.adjust(fixCon[,3], p.adjust)
waldCon[,3] <- stats::p.adjust(waldCon[,3], p.adjust)
rownames(fixCon) <- names(listFixCon) <- rownames(freeCon) <- names(listFreeCon) <- rownames(waldCon) <- rownames(estimates) <- namept1[index]
estimates <- cbind(estimates, stdestimates, esstd, esz, propdiff)
result <- cbind(freeCon, fixCon, waldCon)
} else if (numType == 4) {
varfree <- facnames
if(!is.null(free) | !is.null(fix)) {
if(!is.null(fix)) {
for(i in seq_along(fix)) {
pt0 <- constrainParTable(pt0, fix[i], "~1", "", 1:ngroups)
pt1 <- constrainParTable(pt1, fix[i], "~1", "", 1:ngroups)
}
}
if(!is.null(free)) {
for(i in seq_along(free)) {
pt0 <- freeParTable(pt0, free[i], "~1", "", 1:ngroups)
pt1 <- freeParTable(pt1, free[i], "~1", "", 1:ngroups)
}
}
namept1 <- paramNameFromPt(pt1)
namept0 <- paramNameFromPt(pt0)
fit0 <- refit(pt0, fit0)
fit1 <- refit(pt1, fit1)
beta <- lavaan::coef(fit1)
beta0 <- lavaan::coef(fit0)
waldMat <- matrix(0, ngroups - 1, length(beta))
varfree <- setdiff(varfree, c(free, fix))
}
estimates <- matrix(NA, length(varfree), ngroups + 1)
stdestimates <- matrix(NA, length(varfree), ngroups)
colnames(estimates) <- c("poolest", paste0("mean:", grouplab))
colnames(stdestimates) <- paste0("std:", grouplab)
esstd <- matrix(NA, length(varfree), ngroups - 1)
colnames(esstd) <- paste0("diff_std:", complab)
fixCon <- freeCon <- matrix(NA, length(varfree), 4)
waldCon <- matrix(NA, length(varfree), 3)
colnames(fixCon) <- c("fix.chi", "fix.df", "fix.p", "fix.cfi")
colnames(freeCon) <- c("free.chi", "free.df", "free.p", "free.cfi")
colnames(waldCon) <- c("wald.chi", "wald.df", "wald.p")
index <- which((pt1$lhs %in% varfree) & (pt1$op == "~1") & (pt1$group == 1))
for(i in seq_along(index)) {
runnum <- index[i]
isfree <- pt1$free[runnum] != 0
if(isfree) {
temp <- constrainParTable(pt1, pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups)
} else {
temp <- fixParTable(pt1, pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 2:ngroups, ustart = pt1$ustart[runnum])
}
tryresult <- try(tempfit <- refit(temp, fit1), silent = TRUE)
if(!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, fit1, method = method), silent = TRUE)
if(!is(compresult, "try-error")) fixCon[i,] <- c(unlist(modelcomp[2,5:7]), deltacfi(fit1, tempfit))
}
listFixCon <- c(listFixCon, tryresult)
isfree0 <- pt0$free[runnum] != 0
if(isfree0) {
temp0 <- freeParTable(pt0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
} else {
temp0 <- freeParTable(pt0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 2:ngroups)
}
estimates[i, 1] <- getValue(pt0, beta0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1)
tryresult0 <- try(tempfit0 <- refit(temp0, fit0), silent = TRUE)
if(!is(tryresult0, "try-error")) {
compresult0 <- try(modelcomp0 <- lavaan::lavTestLRT(tempfit0, fit0, method = method), silent = TRUE)
if(!is(compresult0, "try-error")) freeCon[i,] <- c(unlist(modelcomp0[2,5:7]), deltacfi(tempfit0, fit0))
meanVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
estimates[i, 2:ncol(estimates)] <- meanVal
facVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], "~~", pt0$lhs[runnum], 1:ngroups)
ifelse(poolvar, refFacVal <- poolVariance(facVal, neach), refFacVal <- facVal[refgroup])
stdMeanVal <- meanVal / sqrt(refFacVal)
stdestimates[i,] <- stdMeanVal
stdMeanVal <- stdMeanVal[grouporder]
esstd[i,] <- stdMeanVal[2:ngroups] - stdMeanVal[1]
}
listFreeCon <- c(listFreeCon, tryresult0)
waldCon[i,] <- waldConstraint(fit1, pt1, waldMat, cbind(pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups))
}
freeCon[,3] <- stats::p.adjust(freeCon[,3], p.adjust)
fixCon[,3] <- stats::p.adjust(fixCon[,3], p.adjust)
waldCon[,3] <- stats::p.adjust(waldCon[,3], p.adjust)
rownames(fixCon) <- names(listFixCon) <- rownames(freeCon) <- names(listFreeCon) <- rownames(waldCon) <- rownames(estimates) <- namept1[index]
estimates <- cbind(estimates, stdestimates, esstd)
result <- cbind(freeCon, fixCon, waldCon)
}
if(return.fit) {
return(invisible(list(estimates = estimates, results = result, models = list(free = listFreeCon, fix = listFixCon, nested = fit0, parent = fit1))))
} else {
return(list(estimates = estimates, results = result))
}
}
##' @importFrom lavaan lavInspect parTable
##' @rdname partialInvariance
##' @export
partialInvarianceCat <- function(fit, type, free = NULL, fix = NULL,
refgroup = 1, poolvar = TRUE,
p.adjust = "none", return.fit = FALSE,
method = "satorra.bentler.2001") {
# model <- ' f1 =~ u1 + u2 + u3 + u4
# f2 =~ u5 + u6 + u7 + u8'
# modelsCat2 <- measurementInvarianceCat(model, data = datCat, group = "g", parameterization="theta",
# estimator="wlsmv", strict = TRUE)
# fit <- modelsCat2
# type <- "weak"
# free <- NULL
# fix <- NULL
# refgroup <- 1
# poolvar <- TRUE
# p.adjust <- "none"
# return.fit <- FALSE
# method = "satorra.bentler.2001"
type <- tolower(type)
numType <- 1
fit1 <- fit0 <- NULL
# fit0 = Nested model, fit1 = Parent model
if (type %in% c("metric", "weak", "loading", "loadings")) {
numType <- 1
if (all(c("fit.configural", "fit.loadings") %in% names(fit))) {
fit1 <- fit$fit.configural
fit0 <- fit$fit.loadings
} else {
stop("The elements named 'fit.configural' and 'fit.loadings' are needed",
" in the 'fit' argument")
}
} else if (type %in% c("scalar", "strong", "intercept", "intercepts",
"threshold", "thresholds")) {
numType <- 2
if (all(c("fit.loadings", "fit.thresholds") %in% names(fit))) {
fit1 <- fit$fit.loadings
fit0 <- fit$fit.thresholds
} else {
stop("The elements named 'fit.loadings' and 'fit.thresholds' are needed",
" in the 'fit' argument")
}
} else if (type %in% c("strict", "residual", "residuals", "error", "errors")) {
numType <- 3
if ("fit.residuals" %in% names(fit)) {
fit0 <- fit$fit.residuals
if ("fit.thresholds" %in% names(fit)) {
fit1 <- fit$fit.thresholds
} else if ("fit.loadings" %in% names(fit)) {
fit1 <- fit$fit.loadings
} else {
stop("The element named either 'fit.thresholds' or 'fit.loadings' is",
" needed in the 'fit' argument")
}
} else {
stop("The element named 'fit.residuals' is needed in the 'fit' argument")
}
} else if (type %in% c("means", "mean")) {
numType <- 4
if ("fit.means" %in% names(fit)) {
fit0 <- fit$fit.means
if("fit.residuals" %in% names(fit)) {
fit1 <- fit$fit.residuals
} else if ("fit.thresholds" %in% names(fit)) {
fit1 <- fit$fit.thresholds
} else if ("fit.loadings" %in% names(fit)) {
fit1 <- fit$fit.loadings
} else {
stop("The element named either 'fit.residuals', 'fit.thresholds',",
" or 'fit.loadings' is needed in the 'fit' argument")
}
} else {
stop("The element named 'fit.means' is needed in the 'fit' argument")
}
} else {
stop("Please specify the correct type of measurement invariance. See the help page.")
}
pt1 <- parTable(fit1)
pt0 <- parTable(fit0)
pt0$start <- pt0$est <- pt0$se <- NULL
pt1$start <- pt1$est <- pt1$se <- NULL
pt1$label[substr(pt1$label, 1, 1) == "." & substr(pt1$label, nchar(pt1$label),
nchar(pt1$label)) == "."] <- ""
pt0$label[substr(pt0$label, 1, 1) == "." & substr(pt0$label, nchar(pt0$label),
nchar(pt0$label)) == "."] <- ""
namept1 <- paramNameFromPt(pt1)
namept0 <- paramNameFromPt(pt0)
if (length(table(table(pt0$rhs[pt0$op == "=~"]))) != 1)
stop("The model is not congeneric. This function does not support non-congeneric model.")
varfree <- varnames <- unique(pt0$rhs[pt0$op == "=~"])
facnames <- unique(pt0$lhs[(pt0$op == "=~") & (pt0$rhs %in% varnames)])
facrepresent <- table(pt0$lhs[(pt0$op == "=~") & (pt0$rhs %in% varnames)],
pt0$rhs[(pt0$op == "=~") & (pt0$rhs %in% varnames)])
if (any(apply(facrepresent, 2, function(x) sum(x != 0)) > 1))
stop("The model is not congeneric. This function does not support non-congeneric model.")
facList <- list()
for (i in 1:nrow(facrepresent)) {
facList[[i]] <- colnames(facrepresent)[facrepresent[i,] > 0]
}
names(facList) <- rownames(facrepresent)
facList <- facList[match(facnames, names(facList))]
fixLoadingFac <- list()
for (i in seq_along(facList)) {
select <- pt1$lhs == names(facList)[i] & pt1$op == "=~" & pt1$rhs %in% facList[[i]] & pt1$group == 1 & pt1$free == 0 & (!is.na(pt1$ustart) & pt1$ustart > 0)
fixLoadingFac[[i]] <- pt1$rhs[select]
}
names(fixLoadingFac) <- names(facList)
# Find the number of thresholds
# Check whether the factor configuration is the same across groups
conParTable <- lapply(pt1, "[", pt1$op == "==")
group1pt <- lapply(pt1, "[", pt1$group != 1)
numThreshold <- table(sapply(group1pt, "[", group1pt$op == "|")[,"lhs"])
plabelthres <- split(group1pt$plabel[group1pt$op == "|"], group1pt$lhs[group1pt$op == "|"])
numFixedThreshold <- sapply(lapply(plabelthres, function(vec) !is.na(match(vec, conParTable$lhs)) | !is.na(match(vec, conParTable$rhs))), sum)[names(numThreshold)]
#numFixedThreshold <- table(sapply(group1pt, "[", group1pt$op == "|" & group1pt$eq.id != 0)[,"lhs"])
fixIntceptFac <- list()
for (i in seq_along(facList)) {
tmp <- numFixedThreshold[facList[[i]]]
if (all(tmp > 1)) {
fixIntceptFac[[i]] <- integer(0)
} else {
fixIntceptFac[[i]] <- names(which.max(tmp))[1]
}
}
names(fixIntceptFac) <- names(facList)
ngroups <- max(pt0$group)
neach <- lavInspect(fit0, "nobs")
groupvar <- lavInspect(fit0, "group")
grouplab <- lavInspect(fit0, "group.label")
if (!is.numeric(refgroup)) refgroup <- which(refgroup == grouplab)
grouporder <- 1:ngroups
grouporder <- c(refgroup, setdiff(grouporder, refgroup))
grouplaborder <- grouplab[grouporder]
complab <- paste(grouplaborder[2:ngroups], "vs.", grouplaborder[1])
if (ngroups <= 1) stop("Well, the number of groups is 1. Measurement",
" invariance across 'groups' cannot be done.")
if (numType == 4) {
if (!all(c(free, fix) %in% facnames))
stop("'free' and 'fix' arguments should consist of factor names because",
" mean invariance is tested.")
} else {
if (!all(c(free, fix) %in% varnames))
stop("'free' and 'fix' arguments should consist of variable names.")
}
result <- fixCon <- freeCon <- NULL
estimates <- NULL
listFreeCon <- listFixCon <- list()
beta <- lavaan::coef(fit1)
beta0 <- lavaan::coef(fit0)
waldMat <- matrix(0, ngroups - 1, length(beta))
if (numType == 1) {
if (!is.null(free) || !is.null(fix)) {
if (!is.null(fix)) {
facinfix <- findFactor(fix, facList)
dup <- duplicated(facinfix)
for (i in seq_along(fix)) {
if (dup[i]) {
pt0 <- constrainParTable(pt0, facinfix[i], "=~", fix[i], 1:ngroups)
pt1 <- constrainParTable(pt1, facinfix[i], "=~", fix[i], 1:ngroups)
} else {
oldmarker <- fixLoadingFac[[facinfix[i]]]
if (length(oldmarker) > 0) {
oldmarkerval <- pt1$ustart[pt1$lhs == facinfix[i] & pt1$op == "=~" & pt1$rhs == oldmarker & pt1$group == 1]
if (oldmarker == fix[i]) {
pt0 <- fixParTable(pt0, facinfix[i], "=~", fix[i], 1:ngroups, oldmarkerval)
pt1 <- fixParTable(pt1, facinfix[i], "=~", fix[i], 1:ngroups, oldmarkerval)
} else {
pt0 <- freeParTable(pt0, facinfix[i], "=~", oldmarker, 1:ngroups)
pt0 <- constrainParTable(pt0, facinfix[i], "=~", oldmarker, 1:ngroups)
pt1 <- freeParTable(pt1, facinfix[i], "=~", oldmarker, 1:ngroups)
pt0 <- fixParTable(pt0, facinfix[i], "=~", fix[i], 1:ngroups, oldmarkerval)
pt1 <- fixParTable(pt1, facinfix[i], "=~", fix[i], 1:ngroups, oldmarkerval)
fixLoadingFac[[facinfix[i]]] <- fix[i]
}
} else {
pt0 <- constrainParTable(pt0, facinfix[i], "=~", fix[i], 1:ngroups)
pt1 <- constrainParTable(pt1, facinfix[i], "=~", fix[i], 1:ngroups)
}
}
}
}
if (!is.null(free)) {
facinfree <- findFactor(free, facList)
for (i in seq_along(free)) {
# Need to change marker variable if fixed
oldmarker <- fixLoadingFac[[facinfree[i]]]
if (length(oldmarker) > 0 && oldmarker == free[i]) {
oldmarkerval <- pt1$ustart[pt1$lhs == facinfree[i] & pt1$op == "=~" & pt1$rhs == oldmarker & pt1$group == 1]
candidatemarker <- setdiff(facList[[facinfree[i]]], free[i])[1]
pt0 <- freeParTable(pt0, facinfree[i], "=~", free[i], 1:ngroups)
pt1 <- freeParTable(pt1, facinfree[i], "=~", free[i], 1:ngroups)
pt0 <- fixParTable(pt0, facinfree[i], "=~", candidatemarker, 1:ngroups, oldmarkerval)
pt1 <- fixParTable(pt1, facinfree[i], "=~", candidatemarker, 1:ngroups, oldmarkerval)
fixLoadingFac[[facinfree[i]]] <- candidatemarker
} else {
pt0 <- freeParTable(pt0, facinfree[i], "=~", free[i], 1:ngroups)
pt1 <- freeParTable(pt1, facinfree[i], "=~", free[i], 1:ngroups)
}
}
}
namept1 <- paramNameFromPt(pt1)
namept0 <- paramNameFromPt(pt0)
fit0 <- refit(pt0, fit0)
fit1 <- refit(pt1, fit1)
beta <- lavaan::coef(fit1)
beta0 <- lavaan::coef(fit0)
waldMat <- matrix(0, ngroups - 1, length(beta))
varfree <- setdiff(varfree, c(free, fix))
}
estimates <- matrix(NA, length(varfree), ngroups + 1)
stdestimates <- matrix(NA, length(varfree), ngroups)
colnames(estimates) <- c("poolest", paste0("load:", grouplab))
colnames(stdestimates) <- paste0("std:", grouplab)
esstd <- esz <- matrix(NA, length(varfree), ngroups - 1)
colnames(esstd) <- paste0("diff_std:", complab)
colnames(esz) <- paste0("q:", complab)
fixCon <- freeCon <- matrix(NA, length(varfree), 4)
waldCon <- matrix(NA, length(varfree), 3)
colnames(fixCon) <- c("fix.chi", "fix.df", "fix.p", "fix.cfi")
colnames(freeCon) <- c("free.chi", "free.df", "free.p", "free.cfi")
colnames(waldCon) <- c("wald.chi", "wald.df", "wald.p")
index <- which((pt1$rhs %in% varfree) & (pt1$op == "=~") & (pt1$group == 1))
facinfix <- findFactor(fix, facList)
varinfixvar <- unlist(facList[facinfix])
varinfixvar <- setdiff(varinfixvar, setdiff(varinfixvar, varfree))
indexfixvar <- which((pt1$rhs %in% varinfixvar) & (pt1$op == "=~") & (pt1$group == 1))
varnonfixvar <- setdiff(varfree, varinfixvar)
indexnonfixvar <- setdiff(index, indexfixvar)
pos <- 1
for (i in seq_along(indexfixvar)) {
runnum <- indexfixvar[i]
temp <- constrainParTable(pt1, pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups)
tryresult <- try(tempfit <- refit(temp, fit1), silent = TRUE)
if (!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, fit1, method = method), silent = TRUE)
if (!is(compresult, "try-error")) fixCon[pos,] <- c(unlist(modelcomp[2,5:7]), deltacfi(fit1, tempfit))
}
listFixCon <- c(listFixCon, tryresult)
temp0 <- freeParTable(pt0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
estimates[pos, 1] <- getValue(pt0, beta0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1)
tryresult0 <- try(tempfit0 <- refit(temp0, fit0), silent = TRUE)
if (!is(tryresult0, "try-error")) {
compresult0 <- try(modelcomp0 <- lavaan::lavTestLRT(tempfit0, fit0, method = method), silent = TRUE)
if (!is(compresult0, "try-error")) freeCon[pos,] <- c(unlist(modelcomp0[2,5:7]), deltacfi(tempfit0, fit0))
loadVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
estimates[pos, 2:ncol(estimates)] <- loadVal
facVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], "~~", pt0$lhs[runnum], 1:ngroups)
totalVal <- sapply(thetaImpliedTotalVar(tempfit0), function(x, v) x[v, v], v = pt0$rhs[runnum])
names(facVal) <- names(totalVal) <- grouplab
refFacVal <- if (poolvar) poolVariance(facVal, neach) else facVal[refgroup]
refTotalVal <- if (poolvar) poolVariance(totalVal, neach) else totalVal[refgroup]
stdLoadVal <- loadVal * sqrt(refFacVal) / sqrt(refTotalVal)
stdestimates[pos,] <- stdLoadVal
stdLoadVal <- stdLoadVal[grouporder]
esstd[pos,] <- stdLoadVal[2:ngroups] - stdLoadVal[1]
if (any(abs(stdLoadVal) > 0.9999))
warning(paste("Standardized Loadings of", pt0$rhs[runnum],
"in some groups are less than -1 or over 1. The",
" standardized loadings used in Fisher z",
" transformation are changed to -0.9999 or 0.9999."))
stdLoadVal[stdLoadVal > 0.9999] <- 0.9999
stdLoadVal[stdLoadVal < -0.9999] <- -0.9999
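## Fisher r-to-z transformation: atanh(r) = 0.5 * log((1 + r) / (1 - r))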
zLoadVal <- atanh(stdLoadVal)
esz[pos,] <- zLoadVal[2:ngroups] - zLoadVal[1]
}
listFreeCon <- c(listFreeCon, tryresult0)
waldCon[pos,] <- waldConstraint(fit1, pt1, waldMat, cbind(pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups))
pos <- pos + 1
}
facinvarfree <- findFactor(varnonfixvar, facList)
for (i in seq_along(indexnonfixvar)) {
runnum <- indexnonfixvar[i]
# Need to change marker variable if fixed
oldmarker <- fixLoadingFac[[facinvarfree[i]]]
if (length(oldmarker) > 0 && oldmarker == varnonfixvar[i]) {
candidatemarker <- setdiff(facList[[facinvarfree[i]]], varnonfixvar[i])[1]
temp <- freeParTable(pt1, facinvarfree[i], "=~", varnonfixvar[i], 1:ngroups)
temp <- constrainParTable(temp, facinvarfree[i], "=~", varnonfixvar[i], 1:ngroups)
temp <- fixParTable(temp, facinvarfree[i], "=~", candidatemarker, 1:ngroups)
newparent <- freeParTable(pt1, facinvarfree[i], "=~", varnonfixvar[i], 1:ngroups)
newparent <- fixParTable(newparent, facinvarfree[i], "=~", candidatemarker, 1:ngroups)
newparentresult <- try(newparentfit <- refit(newparent, fit1), silent = TRUE)
if (!is(newparentresult, "try-error")) {
tryresult <- try(tempfit <- refit(temp, fit1), silent = TRUE)
if (!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, newparentfit, method = method), silent = TRUE)
if (!is(compresult, "try-error")) fixCon[pos,] <- c(unlist(modelcomp[2,5:7]), deltacfi(newparentfit, tempfit))
}
waldCon[pos,] <- waldConstraint(newparentfit, newparent, waldMat, cbind(facinvarfree[i], "=~", varnonfixvar[i], 1:ngroups))
}
} else {
temp <- constrainParTable(pt1, pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups)
tryresult <- try(tempfit <- refit(temp, fit1), silent = TRUE)
if (!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, fit1, method = method), silent = TRUE)
if (!is(compresult, "try-error")) fixCon[pos,] <- c(unlist(modelcomp[2,5:7]), deltacfi(fit1, tempfit))
}
waldCon[pos,] <- waldConstraint(fit1, pt1, waldMat, cbind(pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups))
}
listFixCon <- c(listFixCon, tryresult)
if (length(oldmarker) > 0 && oldmarker == varnonfixvar[i]) {
temp0 <- freeParTable(pt0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 2:ngroups)
} else {
temp0 <- freeParTable(pt0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
}
estimates[pos, 1] <- getValue(pt0, beta0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1)
tryresult0 <- try(tempfit0 <- refit(temp0, fit0), silent = TRUE)
if (!is(tryresult0, "try-error")) {
compresult0 <- try(modelcomp0 <- lavaan::lavTestLRT(tempfit0, fit0, method = method), silent = TRUE)
if (!is(compresult0, "try-error")) freeCon[pos,] <- c(unlist(modelcomp0[2,5:7]), deltacfi(tempfit0, fit0))
loadVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
estimates[pos, 2:ncol(estimates)] <- loadVal
facVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], "~~", pt0$lhs[runnum], 1:ngroups)
totalVal <- sapply(thetaImpliedTotalVar(tempfit0), function(x, v) x[v, v], v = pt0$rhs[runnum])
names(facVal) <- names(totalVal) <- grouplab
refFacVal <- if (poolvar) poolVariance(facVal, neach) else facVal[refgroup]
refTotalVal <- if (poolvar) poolVariance(totalVal, neach) else totalVal[refgroup]
stdLoadVal <- loadVal * sqrt(refFacVal) / sqrt(refTotalVal)
stdestimates[pos,] <- stdLoadVal
stdLoadVal <- stdLoadVal[grouporder]
esstd[pos,] <- stdLoadVal[2:ngroups] - stdLoadVal[1]
if (any(abs(stdLoadVal) > 0.9999))
warning(paste("Standardized Loadings of", pt0$rhs[runnum],
"in some groups are less than -1 or over 1. The",
" standardized loadings used in Fisher z",
" transformation are changed to -0.9999 or 0.9999."))
stdLoadVal[stdLoadVal > 0.9999] <- 0.9999
stdLoadVal[stdLoadVal < -0.9999] <- -0.9999
zLoadVal <- atanh(stdLoadVal)
esz[pos,] <- zLoadVal[2:ngroups] - zLoadVal[1]
}
listFreeCon <- c(listFreeCon, tryresult0)
pos <- pos + 1
}
freeCon[,3] <- stats::p.adjust(freeCon[,3], p.adjust)
fixCon[,3] <- stats::p.adjust(fixCon[,3], p.adjust)
waldCon[,3] <- stats::p.adjust(waldCon[,3], p.adjust)
rownames(fixCon) <- names(listFixCon) <- rownames(freeCon) <- names(listFreeCon) <- rownames(waldCon) <- rownames(estimates) <- namept1[c(indexfixvar, indexnonfixvar)]
estimates <- cbind(estimates, stdestimates, esstd, esz)
result <- cbind(freeCon, fixCon, waldCon)
} else if (numType == 2) {
if (!is.null(free) || !is.null(fix)) {
if (!is.null(fix)) {
facinfix <- findFactor(fix, facList)
dup <- duplicated(facinfix)
for (i in seq_along(fix)) {
numfixthres <- numThreshold[fix[i]]
if (numfixthres > 1) {
if (dup[i]) {
for (s in 2:numfixthres) {
pt0 <- constrainParTable(pt0, fix[i], "|", paste0("t", s), 1:ngroups)
pt1 <- constrainParTable(pt1, fix[i], "|", paste0("t", s), 1:ngroups)
}
} else {
oldmarker <- fixIntceptFac[[facinfix[i]]]
numoldthres <- numThreshold[oldmarker]
if (length(oldmarker) > 0) {
if (oldmarker == fix[i]) {
for (s in 2:numfixthres) {
pt0 <- constrainParTable(pt0, fix[i], "|", paste0("t", s), 1:ngroups)
pt1 <- constrainParTable(pt1, fix[i], "|", paste0("t", s), 1:ngroups)
}
} else {
for (r in 2:numoldthres) {
pt1 <- freeParTable(pt1, oldmarker, "|", paste0("t", r), 1:ngroups)
}
for (s in 2:numfixthres) {
pt0 <- constrainParTable(pt0, fix[i], "|", paste0("t", s), 1:ngroups)
pt1 <- constrainParTable(pt1, fix[i], "|", paste0("t", s), 1:ngroups)
}
fixIntceptFac[[facinfix[i]]] <- fix[i]
}
} else {
for (s in 2:numfixthres) {
pt0 <- constrainParTable(pt0, fix[i], "|", paste0("t", s), 1:ngroups)
pt1 <- constrainParTable(pt1, fix[i], "|", paste0("t", s), 1:ngroups)
}
}
}
}
}
}
if (!is.null(free)) {
facinfree <- findFactor(free, facList)
for (i in seq_along(free)) {
numfreethres <- numThreshold[free[i]]
# Need to change marker variable if fixed
oldmarker <- fixIntceptFac[[facinfree[i]]]
numoldthres <- numThreshold[oldmarker]
if (length(oldmarker) > 0 && oldmarker == free[i]) {
candidatemarker <- setdiff(facList[[facinfree[i]]], free[i])
candidatemarker <- candidatemarker[numThreshold[candidatemarker] > 1][1]
numcandidatethres <- numThreshold[candidatemarker]
pt0 <- constrainParTable(pt0, candidatemarker, "|", "t2", 1:ngroups)
pt1 <- constrainParTable(pt1, candidatemarker, "|", "t2", 1:ngroups)
for (s in 2:numfreethres) {
pt0 <- freeParTable(pt0, free[i], "|", paste0("t", s), 1:ngroups)
pt1 <- freeParTable(pt1, free[i], "|", paste0("t", s), 1:ngroups)
}
fixIntceptFac[[facinfree[i]]] <- candidatemarker
} else {
for (s in 2:numfreethres) {
pt0 <- freeParTable(pt0, free[i], "|", paste0("t", s), 1:ngroups)
pt1 <- freeParTable(pt1, free[i], "|", paste0("t", s), 1:ngroups)
}
}
}
}
namept1 <- paramNameFromPt(pt1)
namept0 <- paramNameFromPt(pt0)
fit0 <- refit(pt0, fit0)
fit1 <- refit(pt1, fit1)
beta <- lavaan::coef(fit1)
beta0 <- lavaan::coef(fit0)
waldMat <- matrix(0, ngroups - 1, length(beta))
varfree <- setdiff(varfree, c(free, fix))
}
maxcolumns <- max(numThreshold[varfree]) - 1
tname <- paste0("t", 2:(maxcolumns + 1))
estimates <- matrix(NA, length(varfree), (ngroups * length(tname)) + length(tname))
stdestimates <- matrix(NA, length(varfree), ngroups * length(tname))
tnameandlab <- expand.grid(tname, grouplab)
colnames(estimates) <- c(paste0("pool:", tname), paste0(tnameandlab[,1], ":", tnameandlab[,2]))
colnames(stdestimates) <- paste0("std:", tnameandlab[,1], ":", tnameandlab[,2])
esstd <- matrix(NA, length(varfree), (ngroups - 1)* length(tname))
tnameandcomplab <- expand.grid(tname, complab)
colnames(esstd) <- paste0("diff_std:", tnameandcomplab[,1], ":", tnameandcomplab[,2])
fixCon <- freeCon <- matrix(NA, length(varfree), 4)
waldCon <- matrix(NA, length(varfree), 3)
colnames(fixCon) <- c("fix.chi", "fix.df", "fix.p", "fix.cfi")
colnames(freeCon) <- c("free.chi", "free.df", "free.p", "free.cfi")
colnames(waldCon) <- c("wald.chi", "wald.df", "wald.p")
facinfix <- findFactor(fix, facList)
varinfixvar <- unlist(facList[facinfix])
varinfixvar <- setdiff(varinfixvar, setdiff(varinfixvar, varfree))
varnonfixvar <- setdiff(varfree, varinfixvar)
pos <- 1
for (i in seq_along(varinfixvar)) {
temp <- pt1
for (s in 2:numThreshold[varinfixvar[i]]) {
runnum <- which((pt1$lhs == varinfixvar[i]) & (pt1$op == "|") & (pt1$rhs == paste0("t", s)) & (pt1$group == 1))
temp <- constrainParTable(temp, pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups)
}
tryresult <- try(tempfit <- refit(temp, fit1), silent = TRUE)
if (!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, fit1, method = method), silent = TRUE)
if (!is(compresult, "try-error")) fixCon[pos,] <- c(unlist(modelcomp[2,5:7]), deltacfi(fit1, tempfit))
}
listFixCon <- c(listFixCon, tryresult)
temp0 <- pt0
for (s in 2:numThreshold[varinfixvar[i]]) {
runnum <- which((pt0$lhs == varinfixvar[i]) & (pt0$op == "|") & (pt0$rhs == paste0("t", s)) & (pt0$group == 1))
temp0 <- freeParTable(temp0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
estimates[pos, s - 1] <- getValue(pt0, beta0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1)
}
tryresult0 <- try(tempfit0 <- refit(temp0, fit0), silent = TRUE)
if (!is(tryresult0, "try-error")) {
compresult0 <- try(modelcomp0 <- lavaan::lavTestLRT(tempfit0, fit0, method = method), silent = TRUE)
if (!is(compresult0, "try-error")) freeCon[pos,] <- c(unlist(modelcomp0[2,5:7]), deltacfi(tempfit0, fit0))
for (s in 2:numThreshold[varinfixvar[i]]) {
runnum <- which((pt0$lhs == varinfixvar[i]) & (pt0$op == "|") & (pt0$rhs == paste0("t", s)) & (pt0$group == 1))
thresVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
estimates[pos, maxcolumns*(1:ngroups) + (s - 1)] <- thresVal
totalVal <- sapply(thetaImpliedTotalVar(tempfit0), function(x, v) x[v, v], v = pt0$lhs[runnum])
refTotalVal <- if (poolvar) poolVariance(totalVal, neach) else totalVal[refgroup]
stdIntVal <- thresVal / sqrt(refTotalVal)
stdestimates[pos, maxcolumns * ((1:ngroups) - 1) + (s - 1)] <- stdIntVal
stdIntVal <- stdIntVal[grouporder]
esstd[pos, maxcolumns * ((1:length(complab)) - 1) + (s - 1)] <- stdIntVal[2:ngroups] - stdIntVal[1]
}
}
listFreeCon <- c(listFreeCon, tryresult0)
args <- list(fit1, pt1, waldMat)
for (s in 2:numThreshold[varinfixvar[i]]) {
runnum <- which((pt1$lhs == varinfixvar[i]) & (pt1$op == "|") & (pt1$rhs == paste0("t", s)) & (pt1$group == 1))
args <- c(args, list(cbind(pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups)))
}
waldCon[pos,] <- do.call(waldConstraint, args)
pos <- pos + 1
}
facinvarfree <- findFactor(varnonfixvar, facList)
for (i in seq_along(varnonfixvar)) {
# Need to change marker variable if fixed
oldmarker <- fixIntceptFac[[facinvarfree[i]]]
if (length(oldmarker) > 0 && oldmarker == varnonfixvar[i]) {
candidatemarker <- setdiff(facList[[facinvarfree[i]]], varnonfixvar[i])
candidatemarker <- candidatemarker[numThreshold[candidatemarker] > 1][1]
numcandidatethres <- numThreshold[candidatemarker]
newparent <- constrainParTable(pt1, candidatemarker, "|", "t2", 1:ngroups)
for (s in 2:numcandidatethres) {
newparent <- freeParTable(newparent, varnonfixvar[i], "|", paste0("t", s), 1:ngroups)
}
temp <- newparent
for (s in 2:numThreshold[varnonfixvar[i]]) {
runnum <- which((newparent$lhs == varnonfixvar[i]) & (newparent$op == "|") & (newparent$rhs == paste0("t", s)) & (newparent$group == 1))
temp <- constrainParTable(temp, newparent$lhs[runnum], newparent$op[runnum], newparent$rhs[runnum], 1:ngroups)
}
newparentresult <- try(newparentfit <- refit(newparent, fit1), silent = TRUE)
if (!is(newparentresult, "try-error")) {
tryresult <- try(tempfit <- refit(temp, fit1), silent = TRUE)
if (!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, newparentfit, method = method), silent = TRUE)
if (!is(compresult, "try-error")) fixCon[pos,] <- c(unlist(modelcomp[2,5:7]), deltacfi(newparentfit, tempfit))
}
args <- list(newparentfit, newparent, waldMat)
for (s in 2:numThreshold[varnonfixvar[i]]) {
runnum <- which((newparent$lhs == varnonfixvar[i]) & (newparent$op == "|") & (newparent$rhs == paste0("t", s)) & (newparent$group == 1))
args <- c(args, list(cbind(newparent$lhs[runnum], newparent$op[runnum], newparent$rhs[runnum], 1:ngroups)))
}
waldCon[pos,] <- do.call(waldConstraint, args)
}
} else {
temp <- pt1
for (s in 2:numThreshold[varnonfixvar[i]]) {
runnum <- which((pt1$lhs == varnonfixvar[i]) & (pt1$op == "|") & (pt1$rhs == paste0("t", s)) & (pt1$group == 1))
temp <- constrainParTable(temp, pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups)
}
tryresult <- try(tempfit <- refit(temp, fit1), silent = TRUE)
if (!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, fit1, method = method), silent = TRUE)
if (!is(compresult, "try-error")) fixCon[pos,] <- c(unlist(modelcomp[2,5:7]), deltacfi(fit1, tempfit))
}
args <- list(fit1, pt1, waldMat)
for (s in 2:numThreshold[varnonfixvar[i]]) {
runnum <- which((pt1$lhs == varnonfixvar[i]) & (pt1$op == "|") & (pt1$rhs == paste0("t", s)) & (pt1$group == 1))
args <- c(args, list(cbind(pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups)))
}
waldCon[pos,] <- do.call(waldConstraint, args)
}
listFixCon <- c(listFixCon, tryresult)
temp0 <- pt0
for (s in 2:numThreshold[varnonfixvar[i]]) {
runnum <- which((pt0$lhs == varnonfixvar[i]) & (pt0$op == "|") & (pt0$rhs == paste0("t", s)) & (pt0$group == 1))
temp0 <- freeParTable(temp0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
estimates[pos, s - 1] <- getValue(pt0, beta0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1)
}
tryresult0 <- try(tempfit0 <- refit(temp0, fit0), silent = TRUE)
if (!is(tryresult0, "try-error")) {
compresult0 <- try(modelcomp0 <- lavaan::lavTestLRT(tempfit0, fit0, method = method), silent = TRUE)
if (!is(compresult0, "try-error")) freeCon[pos,] <- c(unlist(modelcomp0[2,5:7]), deltacfi(tempfit0, fit0))
for (s in 2:numThreshold[varnonfixvar[i]]) {
runnum <- which((pt0$lhs == varnonfixvar[i]) & (pt0$op == "|") & (pt0$rhs == paste0("t", s)) & (pt0$group == 1))
thresVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
estimates[pos, maxcolumns*(1:ngroups) + (s - 1)] <- thresVal
totalVal <- sapply(thetaImpliedTotalVar(tempfit0), function(x, v) x[v, v], v = pt0$lhs[runnum])
refTotalVal <- if (poolvar) poolVariance(totalVal, neach) else totalVal[refgroup]
stdIntVal <- thresVal / sqrt(refTotalVal)
stdestimates[pos, maxcolumns * ((1:ngroups) - 1) + (s - 1)] <- stdIntVal
stdIntVal <- stdIntVal[grouporder]
esstd[pos, maxcolumns * ((1:length(complab)) - 1) + (s - 1)] <- stdIntVal[2:ngroups] - stdIntVal[1]
}
}
listFreeCon <- c(listFreeCon, tryresult0)
pos <- pos + 1
}
freeCon[,3] <- stats::p.adjust(freeCon[,3], p.adjust)
fixCon[,3] <- stats::p.adjust(fixCon[,3], p.adjust)
waldCon[,3] <- stats::p.adjust(waldCon[,3], p.adjust)
rownames(fixCon) <- names(listFixCon) <- rownames(freeCon) <- names(listFreeCon) <- rownames(waldCon) <- rownames(estimates) <- paste0(c(varinfixvar, varnonfixvar), "|")
estimates <- cbind(estimates, stdestimates, esstd)
result <- cbind(freeCon, fixCon, waldCon)
} else if (numType == 3) {
if (!is.null(free) || !is.null(fix)) {
if (!is.null(fix)) {
for (i in seq_along(fix)) {
pt0 <- constrainParTable(pt0, fix[i], "~~", fix[i], 1:ngroups)
pt1 <- constrainParTable(pt1, fix[i], "~~", fix[i], 1:ngroups)
}
}
if (!is.null(free)) {
for (i in seq_along(free)) {
pt0 <- freeParTable(pt0, free[i], "~~", free[i], 1:ngroups)
pt1 <- freeParTable(pt1, free[i], "~~", free[i], 1:ngroups)
}
}
namept1 <- paramNameFromPt(pt1)
namept0 <- paramNameFromPt(pt0)
fit0 <- refit(pt0, fit0)
fit1 <- refit(pt1, fit1)
beta <- lavaan::coef(fit1)
beta0 <- lavaan::coef(fit0)
waldMat <- matrix(0, ngroups - 1, length(beta))
varfree <- setdiff(varfree, c(free, fix))
}
estimates <- matrix(NA, length(varfree), ngroups + 1)
stdestimates <- matrix(NA, length(varfree), ngroups)
colnames(estimates) <- c("poolest", paste0("errvar:", grouplab))
colnames(stdestimates) <- paste0("std:", grouplab)
esstd <- esz <- matrix(NA, length(varfree), ngroups - 1)
colnames(esstd) <- paste0("diff_std:", complab)
colnames(esz) <- paste0("h:", complab)
fixCon <- freeCon <- matrix(NA, length(varfree), 4)
waldCon <- matrix(NA, length(varfree), 3)
colnames(fixCon) <- c("fix.chi", "fix.df", "fix.p", "fix.cfi")
colnames(freeCon) <- c("free.chi", "free.df", "free.p", "free.cfi")
colnames(waldCon) <- c("wald.chi", "wald.df", "wald.p")
index <- which((pt1$lhs %in% varfree) & (pt1$op == "~~") & (pt1$lhs == pt1$rhs) & (pt1$group == 1))
for (i in seq_along(index)) {
runnum <- index[i]
ustart <- getValue(pt1, beta, pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1)
temp <- fixParTable(pt1, pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 2:ngroups, ustart)
tryresult <- try(tempfit <- refit(temp, fit1), silent = TRUE)
if (!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, fit1, method = method), silent = TRUE)
if (!is(compresult, "try-error")) fixCon[i,] <- c(unlist(modelcomp[2,5:7]), deltacfi(fit1, tempfit))
}
listFixCon <- c(listFixCon, tryresult)
temp0 <- freeParTable(pt0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 2:ngroups)
estimates[i, 1] <- getValue(pt0, beta0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1)
tryresult0 <- try(tempfit0 <- refit(temp0, fit0), silent = TRUE)
if (!is(tryresult0, "try-error")) {
compresult0 <- try(modelcomp0 <- lavaan::lavTestLRT(tempfit0, fit0, method = method), silent = TRUE)
if (!is(compresult0, "try-error")) freeCon[i,] <- c(unlist(modelcomp0[2,5:7]), deltacfi(tempfit0, fit0))
errVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
estimates[i, 2:ncol(estimates)] <- errVal
totalVal <- sapply(thetaImpliedTotalVar(tempfit0), function(x, v) x[v, v], v = pt0$rhs[runnum])
refTotalVal <- if (poolvar) poolVariance(totalVal, neach) else totalVal[refgroup]
stdErrVal <- errVal / sqrt(refTotalVal)
stdestimates[i,] <- stdErrVal
stdErrVal <- stdErrVal[grouporder]
esstd[i,] <- stdErrVal[2:ngroups] - stdErrVal[1]
if (any(abs(stdErrVal) > 0.9999))
warning(paste("The uniqueness of", pt0$rhs[runnum],
"in some groups are over 1. The uniqueness used in",
" arctan transformation are changed to 0.9999."))
stdErrVal[stdErrVal > 0.9999] <- 0.9999
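## arcsine-square-root transform, variance-stabilizing for proportions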
zErrVal <- asin(sqrt(stdErrVal))
esz[i,] <- zErrVal[2:ngroups] - zErrVal[1]
}
listFreeCon <- c(listFreeCon, tryresult0)
waldCon[i,] <- waldConstraint(fit1, pt1, waldMat, cbind(pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups))
}
freeCon[,3] <- stats::p.adjust(freeCon[,3], p.adjust)
fixCon[,3] <- stats::p.adjust(fixCon[,3], p.adjust)
waldCon[,3] <- stats::p.adjust(waldCon[,3], p.adjust)
rownames(fixCon) <- names(listFixCon) <- rownames(freeCon) <- names(listFreeCon) <- rownames(waldCon) <- rownames(estimates) <- namept1[index]
estimates <- cbind(estimates, stdestimates, esstd, esz)
result <- cbind(freeCon, fixCon, waldCon)
} else if (numType == 4) {
varfree <- facnames
if (!is.null(free) || !is.null(fix)) {
if (!is.null(fix)) {
for (i in seq_along(fix)) {
pt0 <- constrainParTable(pt0, fix[i], "~1", "", 1:ngroups)
pt1 <- constrainParTable(pt1, fix[i], "~1", "", 1:ngroups)
}
}
if (!is.null(free)) {
for (i in seq_along(free)) {
pt0 <- freeParTable(pt0, free[i], "~1", "", 1:ngroups)
pt1 <- freeParTable(pt1, free[i], "~1", "", 1:ngroups)
}
}
namept1 <- paramNameFromPt(pt1)
namept0 <- paramNameFromPt(pt0)
fit0 <- refit(pt0, fit0)
fit1 <- refit(pt1, fit1)
beta <- lavaan::coef(fit1)
beta0 <- lavaan::coef(fit0)
waldMat <- matrix(0, ngroups - 1, length(beta))
varfree <- setdiff(varfree, c(free, fix))
}
estimates <- matrix(NA, length(varfree), ngroups + 1)
stdestimates <- matrix(NA, length(varfree), ngroups)
colnames(estimates) <- c("poolest", paste0("mean:", grouplab))
colnames(stdestimates) <- paste0("std:", grouplab)
esstd <- matrix(NA, length(varfree), ngroups - 1)
colnames(esstd) <- paste0("diff_std:", complab)
fixCon <- freeCon <- matrix(NA, length(varfree), 4)
waldCon <- matrix(NA, length(varfree), 3)
colnames(fixCon) <- c("fix.chi", "fix.df", "fix.p", "fix.cfi")
colnames(freeCon) <- c("free.chi", "free.df", "free.p", "free.cfi")
colnames(waldCon) <- c("wald.chi", "wald.df", "wald.p")
index <- which((pt1$lhs %in% varfree) & (pt1$op == "~1") & (pt1$group == 1))
for (i in seq_along(index)) {
runnum <- index[i]
isfree <- pt1$free[runnum] != 0
if (isfree) {
temp <- constrainParTable(pt1, pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups)
} else {
temp <- fixParTable(pt1, pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 2:ngroups, ustart = pt1$ustart[runnum])
}
tryresult <- try(tempfit <- refit(temp, fit1), silent = TRUE)
if (!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, fit1, method = method), silent = TRUE)
if (!is(compresult, "try-error")) fixCon[i,] <- c(unlist(modelcomp[2,5:7]), deltacfi(fit1, tempfit))
}
listFixCon <- c(listFixCon, tryresult)
isfree0 <- pt0$free[runnum] != 0
if (isfree0) {
temp0 <- freeParTable(pt0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
} else {
temp0 <- freeParTable(pt0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 2:ngroups)
}
estimates[i, 1] <- getValue(pt0, beta0, pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1)
tryresult0 <- try(tempfit0 <- refit(temp0, fit0), silent = TRUE)
if (!is(tryresult0, "try-error")) {
compresult0 <- try(modelcomp0 <- lavaan::lavTestLRT(tempfit0, fit0, method = method), silent = TRUE)
if (!is(compresult0, "try-error")) freeCon[i,] <- c(unlist(modelcomp0[2,5:7]), deltacfi(tempfit0, fit0))
meanVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], pt0$op[runnum], pt0$rhs[runnum], 1:ngroups)
estimates[i, 2:ncol(estimates)] <- meanVal
facVal <- getValue(temp0, lavaan::coef(tempfit0), pt0$lhs[runnum], "~~", pt0$lhs[runnum], 1:ngroups)
refFacVal <- if (poolvar) poolVariance(facVal, neach) else facVal[refgroup]
stdMeanVal <- meanVal / sqrt(refFacVal)
stdestimates[i,] <- stdMeanVal
stdMeanVal <- stdMeanVal[grouporder]
esstd[i,] <- stdMeanVal[2:ngroups] - stdMeanVal[1]
}
listFreeCon <- c(listFreeCon, tryresult0)
waldCon[i,] <- waldConstraint(fit1, pt1, waldMat, cbind(pt1$lhs[runnum], pt1$op[runnum], pt1$rhs[runnum], 1:ngroups))
}
freeCon[,3] <- stats::p.adjust(freeCon[,3], p.adjust)
fixCon[,3] <- stats::p.adjust(fixCon[,3], p.adjust)
waldCon[,3] <- stats::p.adjust(waldCon[,3], p.adjust)
rownames(fixCon) <- names(listFixCon) <- rownames(freeCon) <- names(listFreeCon) <- rownames(waldCon) <- rownames(estimates) <- namept1[index]
estimates <- cbind(estimates, stdestimates, esstd)
result <- cbind(freeCon, fixCon, waldCon)
}
if (return.fit) {
return(invisible(list(estimates = estimates, results = result, models = list(free = listFreeCon, fix = listFixCon, nested = fit0, parent = fit1))))
} else {
return(list(estimates = estimates, results = result))
}
}
## ----------------
## Hidden Functions
## ----------------
findFactor <- function(var, facList) {
tempfac <- lapply(facList, intersect, var)
facinvar <- rep(names(tempfac), sapply(tempfac, length))
facinvar[match(unlist(tempfac), var)]
}
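## A worked example with a hypothetical factor list (not package data):
## facList <- list(f1 = c("u1", "u2"), f2 = c("u3", "u4"))
## findFactor(c("u3", "u1"), facList) # c("f2", "f1"): the factor on which
## each element of `var` loads, returned in the order of `var`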
## Terry moved here from wald.R so that wald() could be removed (redundant with lavaan::lavTestWald)
## FIXME: Update WaldConstraint to rely on lavaan::lavTestWald instead
#' @importFrom stats pchisq
waldContrast <- function(object, contrast) {
beta <- lavaan::coef(object)
acov <- lavaan::vcov(object)
chisq <- t(contrast %*% beta) %*% solve(contrast %*% as.matrix(acov) %*% t(contrast)) %*% (contrast %*% beta)
df <- nrow(contrast)
p <- pchisq(chisq, df, lower.tail=FALSE)
c(chisq = chisq, df = df, p = p)
}
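## waldContrast() computes W = (C b)' [C V C']^{-1} (C b) for H0: C b = 0,
## where b = coef(object), V = vcov(object), and C is the contrast matrix;
## under H0, W is chi-squared distributed with nrow(C) degrees of freedom.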
#' @importFrom lavaan parTable
waldConstraint <- function(fit, pt, mat, ...) {
dotdotdot <- list(...)
overallMat <- NULL
for(i in seq_along(dotdotdot)) {
target <- dotdotdot[[i]]
tempMat <- mat
element <- apply(target, 1, matchElement, parTable = pt)
freeIndex <- pt$free[element]
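## build one contrast row per non-reference group: -1 on the first group's
## parameter and +1 on group m's, encoding H0: theta_m - theta_1 = 0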
tempMat[,freeIndex[1]] <- -1
for(m in 2:length(freeIndex)) {
tempMat[m - 1, freeIndex[m]] <- 1
}
overallMat <- rbind(overallMat, tempMat)
}
result <- rep(NA, 3)
if (all(rowSums(overallMat) == 0)) {
try(result <- waldContrast(fit, overallMat), silent = TRUE)
}
return(result)
}
poolVariance <- function(var, n) {
nm <- n - 1
sum(var * nm) / sum(nm)
}
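## e.g., poolVariance(var = c(1.2, 0.8), n = c(101, 51)) returns
## (1.2*100 + 0.8*50) / 150 = 1.0667, the (n - 1)-weighted average variance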
deltacfi <- function(parent, nested) lavaan::fitmeasures(nested)["cfi"] - lavaan::fitmeasures(parent)["cfi"]
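## deltacfi() returns CFI(nested) - CFI(parent), so negative values indicate
## that the more constrained (nested) model fits worse by CFI.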
## For categorical. FIXME: Why is this even necessary?
## Did Sunthud not know implied Sigma is available?
#' @importFrom lavaan lavInspect
thetaImpliedTotalVar <- function(object) {
# param <- lavInspect(object, "est")
# ngroup <- lavInspect(object, "ngroups")
# name <- names(param)
# if(ngroup == 1) {
# ly <- param[name == "lambda"]
# } else {
# ly <- lapply(param, "[[", "lambda")
# }
# ps <- lavInspect(object, "cov.lv")
# if(ngroup == 1) ps <- list(ps)
# if(ngroup == 1) {
# te <- param[name == "theta"]
# } else {
# te <- lapply(param, "[[", "theta")
# }
# result <- list()
# for(i in 1:ngroup) {
# result[[i]] <- ly[[i]] %*% ps[[i]] %*% t(ly[[i]]) + te[[i]]
# }
# result
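## lavInspect(object, "cov.ov") already returns the model-implied (total)
## covariance matrix of the observed variables (a list when ngroups > 1);
## a single-group result is wrapped in a list so callers can index groups
## uniformly.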
if (lavInspect(object, "ngroups") == 1L) return(list(lavInspect(object, "cov.ov")))
lavInspect(object, "cov.ov")
}
### Terrence D. Jorgensen
### Last updated: 12 March 2025
### lavaan model syntax-writing engine for new measEq() to replace
### measurementInvariance(), measurementInvarianceCat(), and longInvariance()
## ----------------------
## Model-Fitting Function
## ----------------------
measEq <- function(configural.model,
ID.fac = c("std.lv","auto.fix.first","effects.coding"),
ID.cat = c("Wu.Estabrook.2016","Mplus","Millsap.Tein.2004","LISREL"),
ID.thr = c(1L, 2L), # only for ID.cat == "Millsap.Tein.2004"
group = NULL, longFacNames = list(), longIndNames = list(),
#group.equal = "", long.equal = "",
group.partial = "", long.partial = "",
auto = "all", extra = NULL,
test.seq = c("thresholds","loadings","intercepts","means",
"lv.variances","residuals"), # optional resid/lv.autocov, residual/lv.covariances
#fixed.x = TRUE, strict = FALSE, quiet = FALSE,
warn = TRUE, debug = FALSE,
alpha = .05, fit.measures = "default", argsLRT = list(), ...) {
#TODO: check GLIST structure for multilevel (with single and multiple groups)
#TODO: compatibility with auxiliary(), runMI(), parcelAllocation(), permuteMeasEq()
#TODO: develop automated anchorSelection() and measEq.partial()
#TODO: if (inherits(configural.model, "lavaan.measEq")) {continue sequence}?
## This example might help: https://groups.google.com/d/msg/lavaan/LvALeUpJBDg/2zD1CoikAQAJ
#TODO: add argument to accept measEq.partial output, to continue sequence (or make and update() method?)
if (is.character(group.partial)) {
if ((group.partial == "")[1L] && length(group.partial) == 1L) {
group.partial <- data.frame(stringsAsFactors = FALSE, lhs = character(0),
op = character(0), rhs = character(0))
} else {
group.partial <- lavaan::lavParseModelString(group.partial,
as.data.frame. = TRUE,
warn = warn, debug = debug)
}
} #TODO: else {extract information from a measEq.partial object}
if (is.character(long.partial)) {
if ((long.partial == "")[1L] && length(long.partial) == 1L) {
long.partial <- data.frame(stringsAsFactors = FALSE, lhs = character(0),
op = character(0), rhs = character(0))
} else {
long.partial <- lavaan::lavParseModelString(long.partial,
as.data.frame. = TRUE,
warn = warn, debug = debug)
}
} #TODO: else {extract information from a measEq.partial object}
## pass arguments to measEq.syntax(), which performs checks
}
## -----------------
## Class and Methods
## -----------------
##' Class for Representing a Measurement-Equivalence Model
##'
##' This class of object stores information used to automatically generate
##' lavaan model syntax to represent user-specified levels of measurement
##' equivalence/invariance across groups and/or repeated measures. See
##' [measEq.syntax()] for details.
##'
##'
##' @name measEq.syntax-class
##' @aliases measEq.syntax-class show,measEq.syntax-method
##' summary,measEq.syntax-method as.character,measEq.syntax-method
##' update,measEq.syntax-method
##' @docType class
##'
##' @slot package `character` indicating the software package used to
##' represent the model. Currently, only `"lavaan"` is available, which
##' uses the LISREL representation (see [lavaan::lavOptions()]).
##' In the future, `"OpenMx"` may become available, using RAM
##' representation.
##' @slot model.type `character`. Currently, only "cfa" is available.
##' Future versions may allow for MIMIC / RFA models, where invariance can be
##' tested across levels of exogenous variables explicitly included as
##' predictors of indicators, controlling for their effects on (or correlation
##' with) the common factors.
##' @slot call The function call as returned by `match.call()`, with
##' some arguments updated if necessary for logical consistency.
##' @slot meanstructure `logical` indicating whether a mean structure is
##' included in the model.
##' @slot numeric `character` vector naming `numeric` manifest indicators.
##' @slot ordered `character` vector naming `ordered` indicators.
##' @slot parameterization `character`. See [lavaan::lavOptions()].
##' @slot specify `list` of parameter matrices, similar in form to the
##' output of `lavInspect(fit, "free")`. These matrices
##' are `logical`, indicating whether each parameter should be specified
##' in the model syntax.
##' @slot values `list` of parameter matrices, similar in form to the
##' output of `lavInspect(fit, "free")`. These matrices
##' are `numeric`, indicating whether each parameter should be freely
##' estimated (indicated by `NA`) or fixed to a particular value.
##' @slot labels `list` of parameter matrices, similar in form to the
##' output of `lavInspect(fit, "free")`. These matrices
##' contain `character` labels used to constrain parameters to equality.
##' @slot constraints `character` vector containing additional equality
##' constraints used to identify the model when `ID.fac = "fx"`.
##' @slot ngroups `integer` indicating the number of groups.
##'
##' @param x,object an object of class `measEq.syntax`
##' @param package `character` indicating the package for which the model
##' syntax should be generated. Currently, only `"lavaan"` and
##' `"mplus"` are supported.
##' @param params `character` vector indicating which type(s) of parameter
##' to print syntax for. Must match a type that can be passed to
##' `group.equal` or `long.equal`, but `"residual.covariances"`
##' and `"lv.covariances"` will be silently ignored. Instead, requesting
##' `"residuals"` or `"lv.variances"` will return covariances along
##' with variances. By default (`NULL`), all types are printed.
##' @param single `logical` indicating whether to concatenate lavaan
##' [lavaan::model.syntax()] into a single `character` string.
##' Setting `FALSE` will return a vector of strings, which may be
##' convenient (or even necessary to prevent an error) in
##' models with long variable names, many variables, or many groups.
##' @param groups.as.blocks `logical` indicating whether to write lavaan
##' [lavaan::model.syntax()] using vectors of labels and values
##' for multiple groups (the default: `FALSE`), or whether to write
##' a separate "block" of syntax per group. The block structure could allow
##' users to apply the generated multigroup syntax (after some editing) to
##' test invariance across levels in a multilevel SEM (see final example on
##' [measEq.syntax()] help page).
##' @param verbose `logical` indicating whether to print a summary to the
##' screen (default). If `FALSE`, only a pattern matrix is returned.
##' @param ... Additional arguments to the `call`, or arguments with
##' changed values.
##' @param evaluate If `TRUE`, evaluate the new `call`; otherwise,
##' return the new `call`.
##' @param change.syntax [lavaan::model.syntax()] specifying
##' labels or fixed/free values of parameters in `object`.
##' These provide some flexibility to customize
##' existing parameters without having to copy/paste the output of
##' `as.character(object)` into an R script. For example,
##' `group.partial` will free a parameter across all groups, but
##' `update` allows users to free the parameter in just one group
##' while maintaining equality constraints among other groups.
##'
##' @return
##' \item{summary}{`signature(object = "measEq.syntax", verbose = TRUE)`:
##' A `character` matrix indicating the pattern of `numeric`,
##' `ordered`, or latent indicators loading on common factors.
##' By default (`verbose = TRUE`), `summary` also prints descriptive
##' details about the model, including the numbers of indicators and factors,
##' and which parameters are constrained to equality.}
##' \item{show}{`signature(object = "measEq.syntax")`: Prints a message
##' about how to use the `object` for model fitting. Invisibly
##' returns the `object`.}
##' \item{update}{`signature(object = "measEq.syntax", ...,
##' evaluate = TRUE, change.syntax = NULL)`: Creates a new
##' `object` with updated arguments in `...`, or updated
##' parameter labels or fixed/free specifications in `object`.}
##' \item{as.character}{`signature(x = "measEq.syntax", package = "lavaan")`:
##' Converts the `measEq.syntax` object to model syntax that can be
##' copy/pasted or written to a syntax file to be edited before analysis,
##' or simply passed to [lavaan::lavaan()] to fit the model to
##' data. Generated M*plus* syntax could also be utilized using the
##' \pkg{MplusAutomation} package.}
##'
##' @author Terrence D. Jorgensen (University of Amsterdam;
##' \email{TJorgensen314@@gmail.com})
##'
##' @examples
##' ## See ?measEq.syntax help page for examples using lavaan
##'
## ## Here, we illustrate how measEq.syntax() objects can be used in
## ## tandem with MplusAutomation.
##
## \donttest{
## ## borrow example data from Mplus user guide
## myData <- read.table("http://www.statmodel.com/usersguide/chap5/ex5.16.dat")
## names(myData) <- c("u1","u2","u3","u4","u5","u6","x1","x2","x3","g")
## bin.mod <- '
## FU1 =~ u1 + u2 + u3
## FU2 =~ u4 + u5 + u6
## '
## ## pretend the 2 factors actually measure the same factor (FU) twice
## longFacNames <- list(FU = c("FU1","FU2"))
## syntax.scalar <- measEq.syntax(configural.model = bin.mod,
## data = myData, ordered = paste0("u", 1:6),
## parameterization = "theta",
## ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
## group = "g", longFacNames = longFacNames,
## group.equal = c("thresholds","loadings","intercepts"),
## long.equal = c("thresholds","loadings","intercepts"))
## if(requireNamespace("MplusAutomation")){
## library(MplusAutomation)
## mpInp <- mplusObject(rdata = myData, TITLE = "Scalar Invariance",
## VARIABLE = "GROUPING = g (1=g1 2=g2);",
## usevariables = c(paste0("u", 1:6), "g"),
## ANALYSIS = "ESTIMATOR = WLSMV;",
## ## model specification from measEq.syntax():
## MODEL = as.character(syntax.scalar, package = "mplus"))
## ## add details for Mplus script:
## mpInp <- update(mpInp, ANALYSIS = ~ . + "PARAMETERIZATION = THETA;",
## VARIABLE = ~ . + "CATEGORICAL = u1 u2 u3 u4 u5 u6;")
## ## fit model
## mpOut <- mplusModeler(mpInp, modelout = "scalar.inp", run = 1L)
## }
## }
#TODO: add configural and DIFFTEST example
##'
setClass("measEq.syntax", slots = c(package = "character", # lavaan, OpenMx in the future?
model.type = "character", # cfa, extend to mimic/rfa?
call = "call",
meanstructure = "logical",
numeric = "character",
ordered = "character",
parameterization = "character",
specify = "list",
values = "list",
labels = "list",
constraints = "character",
updates = "list", # 2 data.frames: labels and values
ngroups = "integer"))
##' @rdname measEq.syntax-class
##' @aliases as.character,measEq.syntax-method
##' @export
setMethod("as.character", "measEq.syntax", function(x, package = "lavaan",
params = NULL, single = TRUE,
groups.as.blocks = FALSE) {
package <- tolower(package)[1]
if (package == "mplus") {
LL <- x@specify[[1]]$lambda
nn <- c(rownames(LL), colnames(LL))
over8 <- nchar(nn) > 8L
if (any(over8)) warning('Mplus only allows variable names to have 8 ',
'characters. The following variable names in ',
'your model exceed 8 characters:\n',
paste(nn[over8], collapse = ", "), '\n',
'Consider shortening variable names before ',
'printing an Mplus MODEL statement.')
## print everything leading up to the MODEL command
script <- c("MODEL:\n")
if (length(x@ordered)) {
script <- c(paste0("!Make sure your VARIABLE command indicates the ",
"following variables as CATEGORICAL:\n!",
paste(x@ordered, collapse = ", "), '\n'), script)
}
if (x@ngroups > 1L) {
script <- c(script, "!This is the first group's MODEL.\n!Group 2's MODEL",
"!will be labeled as 'g2', and so on for any other groups.\n")
}
script <- c(script, write.mplus.syntax(object = x, group = 1L,
params = params))
if (x@ngroups > 1L) for (g in 2:x@ngroups) {
script <- c(script, paste0("\nMODEL g", g, ":\n"),
write.mplus.syntax(object = x, group = g, params = params))
}
return(paste(script, collapse = "\n")) # always return a single string
} else if (package == "lavaan") {
script <- character(0)
pmatList <- c("lambda","tau","nu","delta","theta","alpha","psi")
names(pmatList) <- c("loadings","thresholds","intercepts","scales",
"residuals","means","lv.variances")
## selected parameter types?
if (!is.null(params)) {
requested <- intersect(names(pmatList), params)
if (!length(requested)) stop('invalid choice: params = c("',
paste(params, collapse = '", "'), '")\n',
'Valid choices include: ',
paste(names(pmatList), collapse = ", "))
pmatList <- pmatList[requested]
}
if (groups.as.blocks) {
## loop over groups
for (gg in 1:x@ngroups) {
script <- c(script, paste("group:", gg, "\n", collapse = ""))
## loop over pmats
for (pm in pmatList) {
if (!pm %in% names(x@specify[[gg]])) next
if (pm == "lambda" && "beta" %in% names(x@specify[[gg]])) {
## add higher-order loadings to lambda matrix
specify <- list(rbind(x@specify[[gg]]$lambda, x@specify[[gg]]$beta))
value <- list(rbind(x@values[[gg]]$lambda, x@values[[gg]]$beta))
label <- list(rbind(x@labels[[gg]]$lambda, x@labels[[gg]]$beta))
} else {
specify <- list(x@specify[[gg]][[pm]])
value <- list(x@values[[gg]][[pm]])
label <- list(x@labels[[gg]][[pm]])
}
script <- c(script, write.lavaan.syntax(pmat = pm, specify = specify,
value = value, label = label))
} # end pm
} # end gg
} else {
## the usual multigroup lavaan syntax:
## loop over pmats, send all groups together
for (pm in pmatList) {
if (!pm %in% names(x@specify[[1]])) next
if (pm == "lambda" && "beta" %in% names(x@specify[[1]])) {
## add higher-order loadings to lambda matrix
specify.l <- lapply(x@specify, "[[", i = "lambda")
value.l <- lapply(x@values , "[[", i = "lambda")
label.l <- lapply(x@labels , "[[", i = "lambda")
specify.b <- lapply(x@specify, "[[", i = "beta")
value.b <- lapply(x@values , "[[", i = "beta")
label.b <- lapply(x@labels , "[[", i = "beta")
specify <- mapply(rbind, specify.l, specify.b, SIMPLIFY = FALSE)
value <- mapply(rbind, value.l , value.b , SIMPLIFY = FALSE)
label <- mapply(rbind, label.l , label.b , SIMPLIFY = FALSE)
} else {
specify <- lapply(x@specify, "[[", i = pm)
value <- lapply(x@values, "[[", i = pm)
label <- lapply(x@labels, "[[", i = pm)
}
script <- c(script, write.lavaan.syntax(pmat = pm, specify = specify,
value = value, label = label))
}
}
if (length(x@constraints)) script <- c(script,
"## MODEL CONSTRAINTS:\n",
x@constraints, "")
}
#TODO: else if (package == "openmx") # concatenate matrices for RAM specification
## convert GLIST objects to a character string
if (single) return(paste(script, collapse = "\n"))
script
})
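## A hedged usage sketch (`syn` here is a hypothetical measEq.syntax object
## returned by measEq.syntax(); see its help page for real examples):
## cat(as.character(syn)) # lavaan model syntax as a single string
## cat(as.character(syn, package = "mplus")) # Mplus MODEL statement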
##' @rdname measEq.syntax-class
##' @aliases show,measEq.syntax-method
##' @export
setMethod("show", "measEq.syntax", function(object) {
cat('This object contains information for specifying a CFA using lavaan',
'model syntax.\nTo print the syntax (to copy/paste it into an R script),',
'use the as.character() method:\n\n\tcat(as.character(object))\n\nTo fit',
'this model to data, save the syntax to an object and pass it to lavaan:',
'\n\n\tmodel <- as.character(object)\n\tfit <- lavaan(model, ...)',
'\n\nTo view some key features of the model use: \n\n\tsummary(object)')
invisible(object)
})
##' @rdname measEq.syntax-class
##' @aliases summary,measEq.syntax-method
##' @export
setMethod("summary", "measEq.syntax", function(object, verbose = TRUE) {
nG <- object@ngroups
nOrd <- length(object@ordered)
higher <- !is.null(object@specify[[1]]$beta)
## create pattern matrix
lambda <- object@specify[[1]]$lambda
lambda[!lambda] <- ""
for (RR in 1:nrow(lambda)) {
if (rownames(lambda)[RR] %in% object@ordered) {
lambda[RR, object@specify[[1]]$lambda[RR, ] ] <- "ord"
} else lambda[RR, object@specify[[1]]$lambda[RR, ] ] <- "num"
}
if (higher) {
beta <- object@specify[[1]]$beta
beta[!beta] <- ""
for (RR in 1:nrow(beta)) {
beta[RR, object@specify[[1]]$beta[RR, ] ] <- "lat"
}
rownames(beta) <- paste("**", rownames(beta), "**")
lambda <- rbind(lambda, beta)
}
if (!verbose) return(lambda)
## Basics: number of groups, factors, and indicators (higher order?); ID.fac
nHigher <- if (higher) sum(apply(object@specify[[1]]$beta, 2, any)) else 0L
if (object@call$ID.fac == "ul" && !object@meanstructure) {
ID.fac.text <- "first indicator's factor loading was fixed to 1."
} else if (object@call$ID.fac == "ul" && object@meanstructure) {
ID.fac.text <- paste("first indicator's intercept and factor loading were",
'fixed to 0 and 1, respectively.')
} else if (object@call$ID.fac == "uv" && !object@meanstructure) {
ID.fac.text <- paste('factor variances were fixed to 1, unless equality',
'constraints on factor loadings allow them to be freed.')
} else if (object@call$ID.fac == "uv" && object@meanstructure) {
ID.fac.text <- paste('factor means and variances were fixed to 0 and 1,',
'respectively, unless equality constraints on',
'measurement parameters allow them to be freed.')
} else if (object@call$ID.fac == "fx") {
ID.fac.text <- paste('factor loadings were constrained to average 1',
if (object@meanstructure) 'and intercepts were constrained to average 0',
'within each factor. In models with partial',
'invariance, only the factor loadings',
if (object@meanstructure) 'and intercepts',
'that were constrained to equality across ALL groups',
'and repeated measures (when applicable) are used to',
'identify the common-factor distribution.')
}
cat('This lavaan model syntax specifies a CFA with ',
nrow(object@specify[[1]]$lambda), ' manifest indicators ',
if (nOrd == 1L) {
paste0('(', nOrd, ' of which is ordinal) ')
} else if (nOrd > 1L) {
paste0('(', nOrd, ' of which are ordinal) ')
}, 'of ', ncol(object@specify[[1]]$lambda), ' common factor(s)',
if (nHigher == 1L) {
paste(',', nHigher, 'of which is a higher-order factor. ')
} else if (nHigher > 1L) {
paste(',', nHigher, 'of which are higher-order factors. ')
} else '.\n\n', 'To identify the ',
if (object@meanstructure) 'location and ',
'scale of each common factor, the ', ID.fac.text, "\n\n", sep = '')
## if (ordered) ID.cat and parameterization
if (nOrd) {
if (object@call$ID.cat == "wu") {
ID.cat.author <- 'recommended by Wu & Estabrook (2016). '
ID.cat.DOI <- 'https://doi.org/10.1007/s11336-016-9506-0 \n\n'
} else if (object@call$ID.cat == "millsap") {
ID.cat.author <- 'recommended by Millsap & Tein (2004). '
} else if (object@call$ID.cat == "mplus") {
ID.cat.author <- 'used by default in the Mplus (and lavaan) software. '
} else if (object@call$ID.cat == "lisrel") {
ID.cat.author <- 'used by default in the LISREL software. '
}
if (object@call$ID.cat != "wu") ID.cat.DOI <- 'http://dx.doi.org/10.1207/S15327906MBR3903_4 \n\n'
cat('The location and scale of each latent item-response underlying ', nOrd,
' ordinal indicators were identified using the "', object@parameterization,
'" parameterization, and the identification constraints ',
ID.cat.author, 'For details, read:\n\n\t', ID.cat.DOI, sep = '')
}
## number of occasions per longitudinal construct
if (length(object@call$longFacNames)) {
cat('The following factors were measured on multiple occasions:\n')
for (f in names(object@call$longFacNames)) {
cat('\t"', f, '" was measured on ', length(object@call$longFacNames[[f]]),
' occasions\n', sep = '')
}
cat('\n')
}
## print pattern matrix
cat('Pattern matrix indicating num(eric), ord(ered), and lat(ent)',
'indicators per factor:\n\n')
print(lambda, quote = FALSE)
cat('\n')
## without any constraints, call it the configural model
no.group.equal <- length(object@call$group.equal) == 1L && object@call$group.equal == ""
no.long.equal <- length(object@call$long.equal) == 1L && object@call$long.equal == ""
if (no.group.equal && no.long.equal) {
cat('\nThis model hypothesizes only configural invariance.\n\n')
## return pattern matrix
return(invisible(lambda))
}
## otherwise, print the constraints & exceptions
## constrained parameters across groups (+ partial exceptions)
if (nG > 1L) {
if (no.group.equal) {
cat('No parameters were constrained to equality across groups.\n')
} else {
cat('The following types of parameter were constrained to',
'equality across groups:\n\n')
for (i in object@call$group.equal) {
group.partial <- object@call$group.partial
## first, check for exceptions
if (i == "loadings") {
man.ind <- group.partial$rhs %in% rownames(object@specify[[1]]$lambda)
group.partial <- group.partial[group.partial$op == "=~" & man.ind, ]
} else if (i == "regressions") {
lat.ind <- group.partial$rhs %in% colnames(object@specify[[1]]$lambda)
group.partial <- group.partial[group.partial$op == "=~" & lat.ind, ]
} else if (i == "thresholds") {
man.ind <- group.partial$lhs %in% rownames(object@specify[[1]]$lambda)
group.partial <- group.partial[group.partial$op == "|" & man.ind, ]
} else if (i == "residuals") {
man.ind <- group.partial$rhs %in% rownames(object@specify[[1]]$lambda)
same.ind <- group.partial$rhs == group.partial$lhs
group.partial <- group.partial[group.partial$op == "~~" & man.ind & same.ind, ]
} else if (i == "residual.covariances") {
man.ind <- group.partial$rhs %in% rownames(object@specify[[1]]$lambda)
same.ind <- group.partial$rhs == group.partial$lhs
group.partial <- group.partial[group.partial$op == "~~" & man.ind & !same.ind, ]
} else if (i == "lv.variances") {
lat <- group.partial$rhs %in% colnames(object@specify[[1]]$lambda)
same <- group.partial$rhs == group.partial$lhs
group.partial <- group.partial[group.partial$op == "~~" & lat & same, ]
} else if (i == "lv.covariances") {
lat <- group.partial$rhs %in% colnames(object@specify[[1]]$lambda)
same <- group.partial$rhs == group.partial$lhs
group.partial <- group.partial[group.partial$op == "~~" & lat & !same, ]
} else if (i == "intercepts") {
man.ind <- group.partial$lhs %in% rownames(object@specify[[1]]$lambda)
group.partial <- group.partial[group.partial$op == "~1" & man.ind, ]
} else if (i == "means") {
lat <- group.partial$lhs %in% colnames(object@specify[[1]]$lambda)
group.partial <- group.partial[group.partial$op == "~1" & lat, ]
}
## then print a message
cat('\t', i,
if (nrow(group.partial)) ', with the exception of:\n',
'\n', sep = '')
if (nrow(group.partial)) {
rownames(group.partial) <- paste(" row-",
rownames(group.partial), ": ",
sep = "")
print(group.partial)
cat('\n')
}
}
}
cat('\n')
}
## constrained parameters across repeated measures (+ partial exceptions)
if (length(object@call$longFacNames)) {
if (no.long.equal) {
cat('No parameters were constrained to equality across repeated measures.\n')
} else {
cat('The following types of parameter were constrained to equality',
'across repeated measures:\n\n')
for (i in object@call$long.equal) {
long.partial <- object@call$long.partial
## first, check for exceptions
if (i == "loadings") {
man.ind <- long.partial$rhs %in% names(object@call$longIndNames)
long.partial <- long.partial[long.partial$op == "=~" & man.ind, ]
} else if (i == "regressions") {
lat.ind <- long.partial$rhs %in% names(object@call$longFacNames)
long.partial <- long.partial[long.partial$op == "=~" & lat.ind, ]
} else if (i == "thresholds") {
man.ind <- long.partial$lhs %in% names(object@call$longIndNames)
long.partial <- long.partial[long.partial$op == "|" & man.ind, ]
} else if (i == "residuals") {
man.ind <- long.partial$rhs %in% names(object@call$longIndNames)
same.ind <- long.partial$rhs == long.partial$lhs
long.partial <- long.partial[long.partial$op == "~~" & man.ind & same.ind, ]
} else if (i == "lv.variances") {
lat <- long.partial$rhs %in% names(object@call$longFacNames)
same <- long.partial$rhs == long.partial$lhs
long.partial <- long.partial[long.partial$op == "~~" & lat & same, ]
} else if (i == "intercepts") {
man.ind <- long.partial$lhs %in% names(object@call$longIndNames)
long.partial <- long.partial[long.partial$op == "~1" & man.ind, ]
} else if (i == "means") {
lat <- long.partial$lhs %in% names(object@call$longFacNames)
long.partial <- long.partial[long.partial$op == "~1" & lat, ]
}
## then print a message
cat('\t', i,
if (nrow(long.partial)) ', with the exception of:\n',
'\n', sep = '')
if (nrow(long.partial)) {
rownames(long.partial) <- paste(" row-",
rownames(long.partial), ": ",
sep = "")
print(long.partial)
cat('\n')
}
}
}
cat('\n')
}
## return pattern matrix
invisible(lambda)
})
updateMeasEqSyntax <- function(object, ..., evaluate = TRUE,
change.syntax = NULL) {
# data.frame(stringsAsFactors = FALSE, extras = c(TRUE, FALSE),
# override = c(TRUE, TRUE, FALSE, FALSE),
# eval = c(TRUE, TRUE, TRUE, TRUE,
# FALSE, FALSE, FALSE, FALSE),
# TODO = c("extras; eval; transfer, augment, and apply @updates",
# "apply and augment @updates",
# "extras; eval; transfer and apply @updates", "return object",
# "nothing, can't add to call", "nothing, can't add to call",
# "extras, return call", "return call")) -> foo
# foo[order(foo$extras), ]
# extras override eval TODO
# 1 FALSE TRUE TRUE apply and augment @updates
# 2 FALSE FALSE TRUE return object *
# 3 FALSE TRUE FALSE nothing, can't add to call *
# 4 FALSE FALSE FALSE return call *
# 5 TRUE TRUE TRUE extras; eval; transfer, augment, and apply @updates
# 6 TRUE FALSE TRUE extras, eval, transfer @updates
# 7 TRUE TRUE FALSE nothing, can't add to call *
# 8 TRUE FALSE FALSE extras, return call *
#extras <- match.call(expand.dots = FALSE)$...
extras <- list(...)
custom <- !is.null(change.syntax)
## regardless of customization/evaluation, extras can be added to call first
if (length(extras)) {
## prep 5:8
call <- object@call
existing <- !is.na(match(names(extras), names(call)))
for (a in names(extras)[existing]) call[[a]] <- extras[[a]]
if (any(!existing)) {
call <- c(as.list(call), extras[!existing])
call <- as.call(call)
}
if (!evaluate) {
if (custom) warning('cannot apply "change.syntax" ',
'argument when evaluate=FALSE.')
## finish 7:8
return(call)
}
} else if (!evaluate) {
if (custom) warning('cannot apply "change.syntax" ',
'argument when evaluate=FALSE.')
## finish 3:4
return(object@call)
} else if (!custom) {
## finish 2
return(object)
}
# extras override eval TODO
# 1 FALSE TRUE TRUE apply and augment @updates
# 5 TRUE TRUE TRUE eval; transfer, augment, and apply @updates
# 6 TRUE FALSE TRUE eval; transfer and apply @updates
if (length(extras)) {
## prep 5:6
out <- eval(call, parent.frame())
if (nrow(object@updates$values)) out@updates$values <- object@updates$values
if (nrow(object@updates$labels)) out@updates$labels <- object@updates$labels
} else out <- object # "prep" 1
# extras override eval TODO
# 1 FALSE TRUE TRUE apply and augment @updates
# 5 TRUE TRUE TRUE augment, and apply @updates
# 6 TRUE FALSE TRUE apply @updates
## check before augmenting to prep 1 and 5
if (!is.null(change.syntax)) {
stopifnot(is.character(change.syntax))
## convert syntax to data.frame of updates to make
UPDATES <- char2update(object, change.syntax, return.object = FALSE)
out@updates$values <- rbind(out@updates$values, UPDATES$values)
out@updates$labels <- rbind(out@updates$labels, UPDATES$labels)
}
## nothing left to do but apply @updates
## loop over any values/labels (now stored) to finish 1 and 5:6
if (nrow(out@updates$values)) for (RR in 1:nrow(out@updates$values)) {
valueArgs <- c(list(object = out, slotName = "values"),
as.list(out@updates$values[RR, ]))
BB <- out@updates$values$group[RR]
matName <- out@updates$values$matName[RR]
out@values[[BB]][[matName]] <- do.call(override, valueArgs)
}
if (nrow(out@updates$labels)) for (RR in 1:nrow(out@updates$labels)) {
labelArgs <- c(list(object = out, slotName = "labels"),
as.list(out@updates$labels[RR, ]))
BB <- out@updates$labels$group[RR]
matName <- out@updates$labels$matName[RR]
out@labels[[BB]][[matName]] <- do.call(override, labelArgs)
}
## user-specified parameters to override MUST include:
## - group (eventually block), defaults to 1 (convenient for longitudinal CFA)
## - matName (e.g., lambda)
## - row and col (integer or character indices)
## - replacement (NA or numeric for values, character for labels)
out
}
##' @rdname measEq.syntax-class
##' @aliases update,measEq.syntax-method
##' @importFrom stats update
##' @export
setMethod("update", "measEq.syntax", updateMeasEqSyntax)
## -----------------------
## Syntax-Writing Function
## -----------------------
##' Syntax for measurement equivalence
##'
##' Automatically generates `lavaan` model syntax to specify a confirmatory
##' factor analysis (CFA) model with equality constraints imposed on
##' user-specified measurement (or structural) parameters. Optionally returns
##' the fitted model (if data are provided) representing some chosen level of
##' measurement equivalence/invariance across groups and/or repeated measures.
##'
##' This function is a pedagogical and analytical tool to generate model syntax
##' representing some level of measurement equivalence/invariance across any
##' combination of multiple groups and/or repeated measures. Support is provided
##' for confirmatory factor analysis (CFA) models with simple or complex
##' structure (i.e., cross-loadings and correlated residuals are allowed).
##' For any complexities that exceed the limits of automation, this function is
##' intended to still be useful by providing a means to generate syntax that
##' users can easily edit to accommodate their unique situations.
##'
##' Limited support is provided for bifactor models and higher-order constructs.
##' Because bifactor models have cross-loadings by definition, the option
##' `ID.fac = "effects.code"` is unavailable. `ID.fac = "UV"` is
##' recommended for bifactor models, but `ID.fac = "UL"` is available on
##' the condition that each factor has a unique first indicator in the
##' `configural.model`. In order to maintain generality, higher-order
##' factors may include a mix of manifest and latent indicators, but they
##' therefore require `ID.fac = "UL"` to avoid complications with
##' differentiating lower-order vs. higher-order (or mixed-level) factors.
##' The keyword `"loadings"` in `group.equal` or `long.equal`
##' constrains factor loadings of all manifest indicators (including loadings on
##' higher-order factors that also have latent indicators), whereas the keyword
##' `"regressions"` constrains factor loadings of latent indicators. Users
##' can edit the model syntax manually to adjust constraints as necessary, or
##' clever use of the `group.partial` or `long.partial` arguments
##' can make it possible for users to still automate their model syntax.
##' The keyword `"intercepts"` constrains the intercepts of all manifest
##' indicators, and the keyword `"means"` constrains intercepts and means
##' of all latent common factors, regardless of whether they are latent
##' indicators of higher-order factors. To test equivalence of lower-order and
##' higher-order intercepts/means in separate steps, the user can either
##' manually edit their generated syntax or conscientiously exploit the
##' `group.partial` or `long.partial` arguments as necessary.
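##'
##' For example, a hypothetical sketch (assuming a second-order factor `G`
##' measured by lower-order factors `F1`, `F2`, and `F3` in a model `mod2nd`
##' fitted to data `dat`): lower- and higher-order loadings could be equated
##' across groups while freeing one higher-order loading as a partial
##' exception.
##'
##' ```
##' syntax.part <- measEq.syntax(configural.model = mod2nd, data = dat,
##'                              group = "g",
##'                              group.equal = c("loadings", "regressions"),
##'                              group.partial = "G =~ F2") # freed across groups
##' ```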
##'
##' **`ID.fac`:** If the `configural.model` fixes any (e.g.,
##' the first) factor loadings, the generated syntax object will retain those
##' fixed values. This allows the user to retain additional constraints that
##' might be necessary (e.g., if there are only 1 or 2 indicators). Some methods
##' must be used in conjunction with other settings:
##' \itemize{
##' \item `ID.cat = "Millsap"` requires `ID.fac = "UL"` and
##' `parameterization = "theta"`.
##' \item `ID.cat = "LISREL"` requires `parameterization = "theta"`.
##' \item `ID.fac = "effects.code"` is unavailable when there are any
##' cross-loadings.
##' }
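##'
##' For example, a minimal sketch (with hypothetical model `mod` and data
##' `dat`) combining the settings that Millsap & Tein (2004) recommend using
##' together:
##'
##' ```
##' syntax.millsap <- measEq.syntax(configural.model = mod, data = dat,
##'                                 ordered = c("y1", "y2", "y3"),
##'                                 parameterization = "theta",
##'                                 ID.fac = "marker", ID.cat = "millsap",
##'                                 group = "g")
##' ```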
##'
##' **`ID.cat`:** Wu & Estabrook (2016) recommended constraining
##' thresholds to equality first, and doing so should allow releasing any
##' identification constraints no longer needed. For each `ordered`
##' indicator, constraining one threshold to equality will allow the item's
##' intercepts to be estimated in all but the first group or repeated measure.
##' Constraining a second threshold (if applicable) will allow the item's
##' (residual) variance to be estimated in all but the first group or repeated
##' measure. For binary data, there is no independent test of threshold,
##' intercept, or residual-variance equality. Equivalence of thresholds must
##' also be assumed for three-category indicators. These guidelines provide the
##' least restrictive assumptions and tests, and are therefore the default.
##'
##' The default setting in M*plus* is similar to Wu & Estabrook (2016),
##' except that intercepts are always constrained to zero (so they are assumed
##' to be invariant without testing them). Millsap & Tein (2004) recommended
##' `parameterization = "theta"` and identified an item's residual variance
##' in all but the first group (or occasion; Liu et al., 2017) by constraining
##' its intercept to zero and one of its thresholds to equality. A second
##' threshold for the reference indicator (so `ID.fac = "UL"`) is used to
##' identify the common-factor means in all but the first group/occasion. The
##' LISREL software fixes the first threshold to zero and (if applicable) the
##' second threshold to 1, and assumes any remaining thresholds to be equal
##' across groups / repeated measures; thus, the intercepts are always
##' identified, and residual variances (`parameterization = "theta"`) are
##' identified except for binary data, when they are all fixed to one.
##'
##' **Repeated Measures:** If each repeatedly measured factor is measured
##' by the same indicators (specified in the same order in the
##' `configural.model`) on each occasion, without any cross-loadings, the
##' user can let `longIndNames` be automatically generated. Generic names
##' for the repeatedly measured indicators are created by combining the name
##' of each repeatedly measured factor (i.e., `names(longFacNames)`) with the
##' indicator's position. So the first repeatedly measured indicator of a
##' longitudinal construct called "factor" would be named
##' `"._factor_.ind.1"`.
##'
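##' When that automatic matching is inappropriate (e.g., the indicators are
##' named differently or ordered differently across occasions), the matches
##' can be declared explicitly. A minimal sketch, assuming hypothetical items
##' `y1`--`y3` administered at Time 1 and renamed `y4`--`y6` at Time 2:
##'
##' ```
##' longFacNames <- list(FU = c("FU1", "FU2"))
##' longIndNames <- list(y.a = c("y1", "y4"),  # same item, both occasions
##'                      y.b = c("y2", "y5"),
##'                      y.c = c("y3", "y6"))
##' ```
##'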
##' The same types of parameter can be specified for `long.equal` as for
##' `group.equal` (see [lavaan::lavOptions()] for a list), except
##' for `"residual.covariances"` or `"lv.covariances"`. Instead, users
##' can constrain *auto*covariances using keywords `"resid.autocov"`
##' or `"lv.autocov"`. Note that `group.equal = "lv.covariances"` or
##' `group.equal = "residual.covariances"` will constrain any
##' autocovariances across groups, along with any other covariances the user
##' specified in the `configural.model`. Note also that autocovariances
##' cannot be specified as exceptions in `long.partial`, so anything more
##' complex than the `auto` argument automatically provides should instead
##' be manually specified in the `configural.model`.
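##'
##' For example, a hypothetical sketch (model `mod`, data `dat`) requesting
##' lag-1 residual autocovariances and constraining them to equality across
##' occasions:
##'
##' ```
##' syntax.auto <- measEq.syntax(configural.model = mod, data = dat,
##'                              longFacNames = list(FU = c("FU1", "FU2")),
##'                              auto = 1L,
##'                              long.equal = c("loadings", "residuals",
##'                                             "resid.autocov"))
##' ```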
##'
##' When users set `orthogonal=TRUE` in the `configural.model` (e.g.,
##' in bifactor models of repeatedly measured constructs), autocovariances of
##' each repeatedly measured factor will still be freely estimated in the
##' generated syntax.
##'
##' **Missing Data:** If users wish to utilize the [auxiliary()]
##' function to automatically include auxiliary variables in conjunction with
##' `missing = "FIML"`, they should first generate the hypothesized-model
##' syntax, then submit that syntax as the model to `auxiliary()`.
##' If users utilized [lavaan.mi::lavaan.mi()] to fit their `configural.model`
##' to multiply imputed data, that model can also be passed to the
##' `configural.model` argument, and if `return.fit = TRUE`, the
##' generated model will be fitted to the multiple imputations.
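##'
##' A minimal sketch of that two-step workflow (hypothetical objects
##' `syntax.scalar` and `dat`, with hypothetical auxiliary variables `z1` and
##' `z2`; see [auxiliary()] for its actual arguments):
##'
##' ```
##' mod.scalar <- as.character(syntax.scalar)  # generated syntax as text
##' fit.aux <- auxiliary(mod.scalar, data = dat, aux = c("z1", "z2"),
##'                      fun = "cfa", group = "g", missing = "FIML")
##' ```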
##'
##' @importFrom lavaan lavInspect lavNames parTable cfa
##'
##' @param configural.model A model with no measurement-invariance constraints
##' (i.e., representing only configural invariance), unless required for model
##' identification. `configural.model` can be either:
##' \itemize{
##' \item [lavaan::model.syntax()] or a [lavaan::parTable()] specifying the
##' configural model. Using this option, the user can also provide
##' either raw `data` or summary statistics via `sample.cov`
##' and (optionally) `sample.mean`. See argument descriptions in
##' [lavaan::lavaan()]. In order to include thresholds in
##' the generated syntax, either users must provide raw `data`,
##' or the `configural.model` syntax must specify all thresholds
##' (see first example). If raw `data` are not provided, the
##' number of blocks (groups, levels, or combination) must be
##' indicated using an arbitrary `sample.nobs` argument (e.g.,
##' 3 groups could be specified using `sample.nobs=rep(1, 3)`).
##' \item a fitted [lavaan::lavaan-class] model (e.g., as returned by
##' [lavaan::cfa()]) estimating the configural model
##' }
##' Note that the specified or fitted model must not contain any latent
##' structural parameters (i.e., it must be a CFA model), unless those
##' parameters define higher-order constructs with latent indicators
##' (i.e., a second-order CFA).
##'
##' @param ... Additional arguments (e.g., `data`, `ordered`, or
##' `parameterization`) passed to the [lavaan::lavaan()]
##' function. See also [lavaan::lavOptions()].
##'
##' @param ID.fac `character`. The method for identifying common-factor
##' variances and (if `meanstructure = TRUE`) means. Three methods are
##' available, which go by different names in the literature:
##' \itemize{
##' \item Standardize the common factor (mean = 0, *SD* = 1) by
##' specifying any of: `"std.lv"`, `"unit.variance"`,
##' `"UV"`, `"fixed.factor"`,
##' `"fixed-factor"`
##' \item Choose a reference indicator by specifying any of:
##' `"auto.fix.first"`, `"unit.loading"`, `"UL"`,
##' `"marker"`, `"ref"`, `"ref.indicator"`,
##' `"reference.indicator"`, `"reference-indicator"`,
##' `"marker.variable"`, `"marker-variable"`
##' \item Apply effects-code constraints to loadings and intercepts by
##' specifying any of: `"FX"`, `"EC"`, `"effects"`,
##' `"effects.coding"`, `"effects-coding"`,
##' `"effects.code"`, `"effects-code"`
##' }
##' See Kloessner & Klopp (2019) for details about all three methods.
##'
##' @param ID.cat `character`. The method for identifying (residual)
##' variances and intercepts of latent item-responses underlying any
##' `ordered` indicators. Four methods are available:
##' \itemize{
##' \item To follow Wu & Estabrook's (2016) guidelines (default), specify
##' any of: `"Wu.Estabrook.2016"`, `"Wu.2016"`,
##' `"Wu.Estabrook"`, `"Wu"`, `"Wu2016"`. For
##' consistency, specify `ID.fac = "std.lv"`.
##' \item To use the default settings of M*plus* and `lavaan`,
##' specify any of: `"default"`, `"Mplus"`, `"Muthen"`.
##' Details provided in Millsap & Tein (2004).
##' \item To use the constraints recommended by Millsap & Tein (2004; see
##' also Liu et al., 2017, for the longitudinal case)
##' specify any of: `"millsap"`, `"millsap.2004"`,
##' `"millsap.tein.2004"`. For consistency, specify
##' `ID.fac = "marker"` and `parameterization = "theta"`.
##' \item To use the default settings of LISREL, specify `"LISREL"`
##' or `"Joreskog"`. Details provided in Millsap & Tein (2004).
##' For consistency, specify `parameterization = "theta"`.
##' }
##' See **Details** and **References** for more information.
##'
##' @param ID.thr `integer`. Only relevant when
##' `ID.cat = "Millsap.Tein.2004"`. Used to indicate which thresholds
##' should be constrained for identification. The first integer indicates the
##' threshold used for all indicators, the second integer indicates the
##' additional threshold constrained for a reference indicator (ignored if
##' binary).
##'
##' @param group optional `character` indicating the name of a grouping
##' variable. See [lavaan::cfa()].
##'
##' @param group.equal optional `character` vector indicating type(s) of
##' parameter to equate across groups. Ignored if `is.null(group)`.
##' See [lavaan::lavOptions()].
##'
##' @param group.partial optional `character` vector or a parameter table
##' indicating exceptions to `group.equal` (see
##' [lavaan::lavOptions()]). Any variables not appearing in the
##' `configural.model` will be ignored, and any parameter constraints
##' needed for identification (e.g., two thresholds per indicator when
##' `ID.cat = "Millsap"`) will be removed.
##'
##' @param longFacNames optional named `list` of `character` vectors,
##' each indicating multiple factors in the model that are actually the same
##' construct measured repeatedly. See **Details** and **Examples**.
##'
##' @param longIndNames optional named `list` of `character` vectors,
##' each indicating multiple indicators in the model that are actually the
##' same indicator measured repeatedly. See **Details** and
##' **Examples**.
##'
##' @param long.equal optional `character` vector indicating type(s) of
##' parameter to equate across repeated measures. Ignored if no factors are
##' indicated as repeatedly measured in `longFacNames`.
##'
##' @param long.partial optional `character` vector or a parameter table
##' indicating exceptions to `long.equal`. Any longitudinal variable
##' names not appearing in `names(longFacNames)` or
##' `names(longIndNames)` will be ignored, and any parameter constraints
##' needed for identification will be removed.
##'
##' @param auto Used to automatically include autocorrelated measurement errors
##' among repeatedly measured indicators in `longIndNames`. Specify a
##' single `integer` to set the maximum order (e.g., `auto = 1L`
##' indicates that an indicator's unique factors should only be correlated
##' between adjacently measured occasions). `auto = TRUE` or `"all"`
##' will specify residual covariances among all possible lags per repeatedly
##' measured indicator in `longIndNames`.
##'
##' @param warn,debug `logical`. Passed to [lavaan::lavaan()]
##' and [lavaan::lavParseModelString()].
##' See [lavaan::lavOptions()].
##'
##' @param return.fit `logical` indicating whether the generated syntax
##' should be fitted to the provided `data` (or summary statistics, if
##' provided via `sample.cov`). If `configural.model` is a fitted
##' lavaan model, the generated syntax will be fitted using the `update`
##' method (see [lavaan::lavaan-class]), and \dots will be passed to
##' [lavaan::lavaan()]. If neither data nor a fitted lavaan model
##' were provided, this must be `FALSE`. If `TRUE`, the generated
##' `measEq.syntax` object will be included in the `lavaan` object's
##' `@@external` slot, accessible by `fit@@external$measEq.syntax`.
##'
##' @return By default, an object of class [measEq.syntax-class].
##' If `return.fit = TRUE`, a fitted [lavaan::lavaan()]
##' model, with the `measEq.syntax` object stored in the
##' `@@external` slot, accessible by `fit@@external$measEq.syntax`.
##'
##' @author Terrence D. Jorgensen (University of Amsterdam;
##' \email{TJorgensen314@@gmail.com})
##'
##' @seealso [compareFit()]
##'
##' @references
##' Kloessner, S., & Klopp, E. (2019). Explaining constraint interaction: How
##' to interpret estimated model parameters under alternative scaling methods.
##' *Structural Equation Modeling, 26*(1), 143--155.
##' \doi{10.1080/10705511.2018.1517356}
##'
##' Liu, Y., Millsap, R. E., West, S. G., Tein, J.-Y., Tanaka, R., & Grimm,
##' K. J. (2017). Testing measurement invariance in longitudinal data with
##' ordered-categorical measures. *Psychological Methods, 22*(3),
##' 486--506. \doi{10.1037/met0000075}
##'
##' Millsap, R. E., & Tein, J.-Y. (2004). Assessing factorial invariance in
##' ordered-categorical measures. *Multivariate Behavioral Research, 39*(3),
##' 479--515. \doi{10.1207/S15327906MBR3903_4}
##'
##' Wu, H., & Estabrook, R. (2016). Identification of confirmatory factor
##' analysis models of different levels of invariance for ordered categorical
##' outcomes. *Psychometrika, 81*(4), 1014--1045.
##' \doi{10.1007/s11336-016-9506-0}
##'
##' @examples
##' mod.cat <- ' FU1 =~ u1 + u2 + u3 + u4
##' FU2 =~ u5 + u6 + u7 + u8 '
##' ## the 2 factors are actually the same factor (FU) measured twice
##' longFacNames <- list(FU = c("FU1","FU2"))
##'
##' ## CONFIGURAL model: no constraints across groups or repeated measures
##' syntax.config <- measEq.syntax(configural.model = mod.cat,
##' # NOTE: data provides info about numbers of
##' # groups and thresholds
##' data = datCat,
##' ordered = paste0("u", 1:8),
##' parameterization = "theta",
##' ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
##' group = "g", longFacNames = longFacNames)
##' ## print lavaan syntax to the Console
##' cat(as.character(syntax.config))
##' ## print a summary of model features
##' summary(syntax.config)
##'
##' ## THRESHOLD invariance:
##' ## only necessary to specify thresholds if you have no data
##' mod.th <- '
##' u1 | t1 + t2 + t3 + t4
##' u2 | t1 + t2 + t3 + t4
##' u3 | t1 + t2 + t3 + t4
##' u4 | t1 + t2 + t3 + t4
##' u5 | t1 + t2 + t3 + t4
##' u6 | t1 + t2 + t3 + t4
##' u7 | t1 + t2 + t3 + t4
##' u8 | t1 + t2 + t3 + t4
##' '
##' syntax.thresh <- measEq.syntax(configural.model = c(mod.cat, mod.th),
##' # NOTE: data not provided, so syntax must
##' # include thresholds, and number of
##' # groups == 2 is indicated by:
##' sample.nobs = c(1, 1),
##' parameterization = "theta",
##' ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
##' group = "g", group.equal = "thresholds",
##' longFacNames = longFacNames,
##' long.equal = "thresholds")
##' ## notice that constraining 4 thresholds allows intercepts and residual
##' ## variances to be freely estimated in all but the first group & occasion
##' cat(as.character(syntax.thresh))
##' ## print a summary of model features
##' summary(syntax.thresh)
##'
##'
##' ## Fit a model to the data either in a subsequent step (recommended):
##' mod.config <- as.character(syntax.config)
##' fit.config <- cfa(mod.config, data = datCat, group = "g",
##' ordered = paste0("u", 1:8), parameterization = "theta")
##' ## or in a single step (not generally recommended):
##' fit.thresh <- measEq.syntax(configural.model = mod.cat, data = datCat,
##' ordered = paste0("u", 1:8),
##' parameterization = "theta",
##' ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
##' group = "g", group.equal = "thresholds",
##' longFacNames = longFacNames,
##' long.equal = "thresholds", return.fit = TRUE)
##' ## compare their fit to test threshold invariance
##' anova(fit.config, fit.thresh)
##'
##'
##' ## --------------------------------------------------------
##' ## RECOMMENDED PRACTICE: fit one invariance model at a time
##' ## --------------------------------------------------------
##'
##' ## - A downside of setting return.fit=TRUE is that if the model has trouble
##' ## converging, you don't have the opportunity to investigate the syntax,
##' ## or even to know whether an error resulted from the syntax-generator or
##' ## from lavaan itself.
##' ## - A downside of automatically fitting an entire set of invariance models
##' ## (like the old measurementInvariance() function did) is that you might
##' ## end up testing models that shouldn't even be fitted because less
##' ## restrictive models already fail (e.g., don't test full scalar
##' ## invariance if metric invariance fails! Establish partial metric
##' ##   invariance first, then test equivalence of intercepts ONLY among the
##' ##   indicators that have invariant loadings.)
##'
##' ## The recommended sequence is to (1) generate and save each syntax object,
##' ## (2) print it to the screen to verify you are fitting the model you expect
##' ## to (and potentially learn which identification constraints should be
##' ## released when equality constraints are imposed), and (3) fit that model
##' ## to the data, as you would if you had written the syntax yourself.
##'
##' ## Continuing from the examples above, after establishing invariance of
##' ## thresholds, we proceed to test equivalence of loadings and intercepts
##' ## (metric and scalar invariance, respectively)
##' ## simultaneously across groups and repeated measures.
##'
##' \donttest{
##'
##' ## metric invariance
##' syntax.metric <- measEq.syntax(configural.model = mod.cat, data = datCat,
##' ordered = paste0("u", 1:8),
##' parameterization = "theta",
##' ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
##' group = "g", longFacNames = longFacNames,
##' group.equal = c("thresholds","loadings"),
##' long.equal = c("thresholds","loadings"))
##' summary(syntax.metric) # summarize model features
##' mod.metric <- as.character(syntax.metric) # save as text
##' cat(mod.metric) # print/view lavaan syntax
##' ## fit model to data
##' fit.metric <- cfa(mod.metric, data = datCat, group = "g",
##' ordered = paste0("u", 1:8), parameterization = "theta")
##' ## test equivalence of loadings, given equivalence of thresholds
##' anova(fit.thresh, fit.metric)
##'
##' ## scalar invariance
##' syntax.scalar <- measEq.syntax(configural.model = mod.cat, data = datCat,
##' ordered = paste0("u", 1:8),
##' parameterization = "theta",
##' ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
##' group = "g", longFacNames = longFacNames,
##' group.equal = c("thresholds","loadings",
##' "intercepts"),
##' long.equal = c("thresholds","loadings",
##' "intercepts"))
##' summary(syntax.scalar) # summarize model features
##' mod.scalar <- as.character(syntax.scalar) # save as text
##' cat(mod.scalar) # print/view lavaan syntax
##' ## fit model to data
##' fit.scalar <- cfa(mod.scalar, data = datCat, group = "g",
##' ordered = paste0("u", 1:8), parameterization = "theta")
##' ## test equivalence of intercepts, given equal thresholds & loadings
##' anova(fit.metric, fit.scalar)
##'
##'
##' ## For a single table with all results, you can pass the models to
##' ## summarize to the compareFit() function
##' Comparisons <- compareFit(fit.config, fit.thresh, fit.metric, fit.scalar)
##' summary(Comparisons)
##'
##'
##' ## ------------------------------------------------------
##' ## NOT RECOMMENDED: fit several invariance models at once
##' ## ------------------------------------------------------
##' test.seq <- c("thresholds","loadings","intercepts","means","residuals")
##' meq.list <- list()
##' for (i in 0:length(test.seq)) {
##' if (i == 0L) {
##' meq.label <- "configural"
##' group.equal <- ""
##' long.equal <- ""
##' } else {
##' meq.label <- test.seq[i]
##' group.equal <- test.seq[1:i]
##' long.equal <- test.seq[1:i]
##' }
##' meq.list[[meq.label]] <- measEq.syntax(configural.model = mod.cat,
##' data = datCat,
##' ordered = paste0("u", 1:8),
##' parameterization = "theta",
##' ID.fac = "std.lv",
##' ID.cat = "Wu.Estabrook.2016",
##' group = "g",
##' group.equal = group.equal,
##' longFacNames = longFacNames,
##' long.equal = long.equal,
##' return.fit = TRUE)
##' }
##'
##' evalMeasEq <- compareFit(meq.list)
##' summary(evalMeasEq)
##'
##'
##' ## -----------------
##' ## Binary indicators
##' ## -----------------
##'
##' ## borrow example data from Mplus user guide
##' myData <- read.table("http://www.statmodel.com/usersguide/chap5/ex5.16.dat")
##' names(myData) <- c("u1","u2","u3","u4","u5","u6","x1","x2","x3","g")
##' bin.mod <- '
##' FU1 =~ u1 + u2 + u3
##' FU2 =~ u4 + u5 + u6
##' '
##' ## Must SIMULTANEOUSLY constrain thresholds, loadings, and intercepts
##' test.seq <- list(strong = c("thresholds","loadings","intercepts"),
##' means = "means",
##' strict = "residuals")
##' meq.list <- list()
##' for (i in 0:length(test.seq)) {
##' if (i == 0L) {
##' meq.label <- "configural"
##' group.equal <- ""
##' long.equal <- ""
##' } else {
##' meq.label <- names(test.seq)[i]
##' group.equal <- unlist(test.seq[1:i])
##' # long.equal <- unlist(test.seq[1:i])
##' }
##' meq.list[[meq.label]] <- measEq.syntax(configural.model = bin.mod,
##' data = myData,
##' ordered = paste0("u", 1:6),
##' parameterization = "theta",
##' ID.fac = "std.lv",
##' ID.cat = "Wu.Estabrook.2016",
##' group = "g",
##' group.equal = group.equal,
##' #longFacNames = longFacNames,
##' #long.equal = long.equal,
##' return.fit = TRUE)
##' }
##'
##' evalMeasEq <- compareFit(meq.list)
##' summary(evalMeasEq)
##'
##'
#TODO: add ternary example? or note to start with EQ thresholds?
##'
##' ## ---------------------
##' ## Multilevel Invariance
##' ## ---------------------
##'
##' ## To test invariance across levels in a MLSEM, specify syntax as though
##' ## you are fitting to 2 groups instead of 2 levels.
##'
##' mlsem <- ' f1 =~ y1 + y2 + y3
##' f2 =~ y4 + y5 + y6 '
##' ## metric invariance
##' syntax.metric <- measEq.syntax(configural.model = mlsem, meanstructure = TRUE,
##' ID.fac = "std.lv", sample.nobs = c(1, 1),
##' group = "cluster", group.equal = "loadings")
##' ## by definition, Level-1 means must be zero, so fix them
##' syntax.metric <- update(syntax.metric,
##' change.syntax = paste0("y", 1:6, " ~ c(0, NA)*1"))
##' ## save as a character string
##' mod.metric <- as.character(syntax.metric, groups.as.blocks = TRUE)
##' ## convert from multigroup to multilevel
##' mod.metric <- gsub(pattern = "group:", replacement = "level:",
##' x = mod.metric, fixed = TRUE)
##' ## fit model to data
##' fit.metric <- lavaan(mod.metric, data = Demo.twolevel, cluster = "cluster")
##' summary(fit.metric)
##' }
##' @export
measEq.syntax <- function(configural.model, ..., ID.fac = "std.lv",
ID.cat = "Wu.Estabrook.2016", ID.thr = c(1L, 2L),
group = NULL, group.equal = "", group.partial = "",
longFacNames = list(), longIndNames = list(),
long.equal = "", long.partial = "", auto = "all",
warn = TRUE, debug = FALSE, return.fit = FALSE) {
mc <- match.call(expand.dots = TRUE)
## evaluate promises that might change before being evaluated
## (e.g., for-loops or Monte Carlo studies)
mc$ID.fac <- eval(ID.fac)
mc$ID.cat <- eval(ID.cat)
mc$ID.thr <- eval(ID.thr)
mc$group <- eval(group)
mc$group.equal <- eval(group.equal)
mc$group.partial <- eval(group.partial)
mc$longFacNames <- eval(longFacNames)
mc$longIndNames <- eval(longIndNames)
mc$long.equal <- eval(long.equal)
mc$long.partial <- eval(long.partial)
mc$auto <- eval(auto)
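## e.g., in:  for (EQ in list("", "loadings")) measEq.syntax(mod, group.equal = EQ)
## the promise 'EQ' changes each iteration, so capture its current value now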
## -------------------------------
## Preliminary checks on arguments
## -------------------------------
## check identification arguments
ID.fac <- tolower(as.character(ID.fac)[1])
if (ID.fac %in% c("std.lv","unit.variance","uv",
"fixed.factor","fixed-factor")) {
ID.fac <- "uv"
mc$ID.fac <- "uv"
} else if (ID.fac %in% c("auto.fix.first","unit.loading","ul","marker","ref",
"marker.variable","marker-variable","ref.indicator",
"reference.indicator","reference-indicator")) {
ID.fac <- "ul"
mc$ID.fac <- "ul"
} else if (ID.fac %in% c("fx","ec","effects","effects.coding",
"effects-coding","effects.code","effects-code")) {
ID.fac <- "fx"
mc$ID.fac <- "fx"
} else stop('Invalid choice for argument: ID.fac = "', ID.fac, '"')
ID.cat <- tolower(as.character(ID.cat)[1])
if (ID.cat %in% c("wu.estabrook.2016","wu.2016","wu.estabrook","wu","wu2016")) {
ID.cat <- "wu"
mc$ID.cat <- "wu"
} else if (ID.cat %in% c("millsap","millsap.2004","millsap.tein.2004")) {
ID.cat <- "millsap"
mc$ID.cat <- "millsap"
} else if (ID.cat %in% c("default","mplus","muthen")) {
ID.cat <- "mplus"
mc$ID.cat <- "mplus"
} else if (ID.cat %in% c("joreskog","lisrel")) {
ID.cat <- "lisrel"
mc$ID.cat <- "lisrel"
} else stop('Invalid choice for argument: ID.cat = "', ID.cat, '"')
## pass arguments to lavaan
dots <- list(...)
dots$debug <- debug
dots$warn <- warn
dots$group <- group
## check lavaan arguments
if (!is.null(dots$model)) stop('A model should be specified only with the ',
'"configural.model=" argument, not "model=".')
if (is.null(dots$meanstructure)) {
constrMeanStr <- c("intercepts","means") %in% c(group.equal, long.equal)
if (is.null(dots$data) && is.null(dots$sample.mean) &&
is.null(dots$sample.th) && !any(constrMeanStr)) {
dots$meanstructure <- FALSE
mc$meanstructure <- FALSE
} else {
dots$meanstructure <- TRUE
mc$meanstructure <- TRUE
}
}
## lavaan template from configural model
if (inherits(configural.model, c("lavaan","lavaanList"))) {
lavTemplate <- configural.model
## check that first loading is not constrained unless ID.fac == "ul"
if (ID.fac != "ul" && lavInspect(lavTemplate, "options")$auto.fix.first) {
stop('The "configural.model" argument is a lavaan model fitted using ',
'auto.fix.first=TRUE (or std.lv=FALSE), which conflicts with the ',
'requested "ID.fac" method. To generate syntax using the fixed-',
'factor or effects-coding method of identification, set std.lv=TRUE',
' to prevent initial loadings from being fixed to 1 in the syntax.')
}
## check that if (!meanstructure), not set TRUE in call
if (!is.null(mc$meanstructure)) {
if (!lavInspect(lavTemplate, "options")$meanstructure && mc$meanstructure)
stop('Request for meanstructure=TRUE requires configural.model to be ',
'fitted with meanstructure=TRUE')
} else mc$meanstructure <- lavInspect(lavTemplate, "options")$meanstructure # just in case
} else {
lavArgs <- dots
if (ID.fac != "ul") lavArgs$std.lv <- TRUE
lavArgs$model <- configural.model # let lavaan() do its own checks
lavArgs$do.fit <- FALSE
lavTemplate <- do.call("cfa", lavArgs, envir = getNamespace("lavaan"))
mc$meanstructure <- lavInspect(lavTemplate, "options")$meanstructure # just in case
mc$configural.model <- lavTemplate
}
## warn about regression parameters
if (any(parTable(lavTemplate)$op == "~"))
warning('Regression operator (~) detected. measEq.syntax() was designed ',
'only for multigroup CFA models. Regression operator (~) could be ',
'used to define a higher-order factor (although the =~ operator ',
'is easier), but structural regressions should not be specified.')
## prevent inconsistency
if (lavInspect(lavTemplate, "categorical") &&
ID.cat %in% c("wu","mplus") &&
ID.fac != "uv") warning('For factors measured only by categorical ',
'indicators, constraints on intercepts are ',
'insufficient to identify latent means when the ',
'intercepts are already fixed to zero in order ',
'to identify latent item scales. To prevent ',
'underidentified models, it is recommended to ',
'instead set ID.fac = "std.lv".')
## convert *.partial strings to parTables
if (is.character(group.partial)) {
if ((group.partial == "")[1L] && length(group.partial) == 1L) {
group.partial <- data.frame(stringsAsFactors = FALSE, lhs = character(0),
op = character(0), rhs = character(0))
} else {
group.partial <- lavaan::lavParseModelString(group.partial,
as.data.frame. = TRUE,
warn = warn, debug = debug)
}
} #TODO: else {extract information from a measEq.partial object}
if (is.character(long.partial)) {
if ((long.partial == "")[1L] && length(long.partial) == 1L) {
long.partial <- data.frame(stringsAsFactors = FALSE, lhs = character(0),
op = character(0), rhs = character(0))
} else {
long.partial <- lavaan::lavParseModelString(long.partial,
as.data.frame. = TRUE,
warn = warn, debug = debug)
}
} #TODO: else {extract information from a measEq.partial object}
## only relevant when there are longitudinal factors
if (length(longFacNames) > 0L) {
if (!is.atomic(auto)) stop("'auto' must be a non-negative integer or the character 'all'.")
if (is.logical(auto)) { if (auto) auto <- "all" else auto <- 0L}
if (is.factor(auto)) auto <- as.character(auto)
if (is.character(auto) && auto != "all")
stop("'auto' must be a non-negative integer or the character 'all'.")
if (is.numeric(auto)) {
auto <- as.integer(auto[1]) # only the first integer
if (auto < 1L) auto <- NULL
}
mc$auto <- auto
}
## extract options and other information
if (is.null(mc$parameterization)) {
parameterization <- lavInspect(lavTemplate, "options")$parameterization
} else {
parameterization <- try(eval(mc$parameterization), silent = TRUE)
if (inherits(parameterization, "try-error")) {
parameterization <- try(eval.parent(mc$parameterization, 1), silent = TRUE)
}
}
if (is.null(mc$meanstructure)) {
meanstructure <- lavInspect(lavTemplate, "options")$meanstructure
} else meanstructure <- mc$meanstructure
nG <- lavInspect(lavTemplate, "ngroups")
## names of ordinal indicators, number of thresholds for each
allOrdNames <- lavNames(lavTemplate, type = "ov.ord")
if (length(allOrdNames)) {
#TODO: add nThr= argument (named numeric vector?) so data= not required
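## lavaan names thresholds like "u1|t1", "u1|t2", ..., so tabulating the
## text before "|" counts thresholds per indicator, e.g., c(u1 = 4, u2 = 4)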
nThr <- table(sapply(strsplit(lavNames(lavTemplate, "th"),
split = "|", fixed = TRUE),
"[", i = 1))
} else nThr <- numeric(0)
if (length(allOrdNames) && ID.cat == "millsap") {
## Check for ID.thr
if (is.numeric(ID.thr)) {
if (length(ID.thr) == 1L) ID.thr <- rep(ID.thr, 2)
ID.thr <- sapply(allOrdNames, function(x) ID.thr[1:2], simplify = FALSE)
} else if (is.list(ID.thr)) {
if (length((setdiff(allOrdNames, names(ID.thr)))))
stop('If the same thresholds will not be used for all ordered indicators,',
' then "ID.thr" must specify 2 integers per ordered indicator in ',
'a named list (using names of ordered indicators).')
}
## check identification methods
ID.fac <- "ul"
if (parameterization != "theta") stop('If ID.cat == "millsap", you must ',
'use parameterization = "theta"')
}
if (length(allOrdNames) && ID.cat == "lisrel") {
if (parameterization != "theta") stop('If ID.cat == "lisrel", you must ',
'use parameterization = "theta"')
## thresholds must be constrained to equality
if (!"thresholds" %in% group.equal) group.equal <- c("thresholds", group.equal)
if (!"thresholds" %in% long.equal) long.equal <- c("thresholds", long.equal)
## so remove any thresholds from *.partial
partial.thr <- group.partial$op == "|"
if (any(partial.thr)) group.partial <- group.partial[!partial.thr, ]
partial.thr <- long.partial$op == "|"
if (any(partial.thr)) long.partial <- long.partial[!partial.thr, ]
}
if (length(allOrdNames) && ID.cat %in% c("millsap","mplus")) {
## scalar invariance implies equal intercepts, even though they are
## fixed to zero anyway. This will correctly trigger freeing latent mean(s)
if ("loadings" %in% group.equal && "thresholds" %in% group.equal &&
!"intercepts" %in% group.equal) group.equal <- c("intercepts", group.equal)
if ("loadings" %in% long.equal && "thresholds" %in% long.equal &&
!"intercepts" %in% long.equal) long.equal <- c("intercepts", long.equal)
}
if (!meanstructure) {
## make sure *.equal includes no mean-structure parameters
eq.means <- which(group.equal %in% c("means","intercepts"))
if (length(eq.means)) group.equal <- group.equal[-eq.means]
eq.means <- which(long.equal %in% c("means","intercepts"))
if (length(eq.means)) long.equal <- long.equal[-eq.means]
## make sure *.partial includes no mean-structure parameters
partial.means <- group.partial$op == "~1"
if (any(partial.means)) group.partial <- group.partial[!partial.means, ]
partial.means <- long.partial$op == "~1"
if (any(partial.means)) long.partial <- long.partial[!partial.means, ]
}
mc$group.partial <- group.partial[c("lhs","op","rhs")] #FIXME: any more? "block" for multilevel?
mc$long.partial <- long.partial[c("lhs","op","rhs")]
## check logic of constraints
if (length(allOrdNames) && parameterization == "delta") {
if ("residuals" %in% long.equal) {
stop('Residual variances cannot be tested for invariance ',
'across repeated measures when parameterization = "delta". \n',
'Please set parameterization = "theta". \n')
}
if ("residuals" %in% group.equal) {
stop('Residual variances cannot be tested for invariance ',
'across groups when parameterization = "delta". \n',
'Please set parameterization = "theta". \n')
}
}
if (warn) {
if (any(c("lv.variances","lv.autocov") %in% long.equal) && !"loadings" %in% long.equal)
warning('Latent (co)variances are not comparable over repeated measures ',
'if their respective factor loadings are not equal ',
'over repeated measures.')
if (any(c("lv.variances","lv.covariances") %in% group.equal) && !"loadings" %in% group.equal)
warning('Latent (co)variances are not comparable across groups ',
'if their respective factor loadings are not equal across groups.')
if ("intercepts" %in% long.equal && !"loadings" %in% long.equal)
warning('Indicator intercepts are not comparable over repeated measures ',
'if their respective factor loadings are not equal ',
'over repeated measures.')
if ("intercepts" %in% group.equal && !"loadings" %in% group.equal)
warning('Indicator intercepts are not comparable across groups ',
'if their respective factor loadings are not equal across groups.')
if ("means" %in% long.equal && !all(c("loadings","intercepts") %in% long.equal))
warning('Latent means are not comparable over repeated measures if their ',
'respective factor loadings and intercepts are not equal ',
'over repeated measures.')
if ("means" %in% group.equal && !all(c("loadings","intercepts") %in% group.equal))
warning('Latent means are not comparable across groups if their ',
'respective factor loadings and intercepts are not equal ',
'across groups.')
if ("resid.autocov" %in% long.equal && !"residuals" %in% long.equal)
warning('Residual auto-covariances might not be comparable over repeated ',
'measures if their respective residual variances are not equal ',
'over repeated measures.')
if ("residual.covariances" %in% group.equal && !"residuals" %in% group.equal)
warning('Residual covariances might not be comparable across groups if ',
'their respective residual variances are not equal across groups.')
if ("lv.autocov" %in% long.equal && !"lv.variances" %in% long.equal)
warning('Latent auto-covariances might not be comparable over repeated ',
'measures if their respective latent variances are not equal ',
'over repeated measures.')
if ("lv.covariances" %in% group.equal && !"lv.variances" %in% group.equal)
warning('Latent covariances might not be comparable across groups if ',
'their respective latent variances are not equal across groups.')
}
## ------------------
## Parameter Matrices
## ------------------
## Parameter matrices used for labels, fixed/free values, and whether to specify
GLIST.free <- lavInspect(lavTemplate, "free")
if (nG == 1L) GLIST.free <- list(`1` = GLIST.free)
## only save relevant matrices to specify
pmats <- intersect(c("tau","lambda","beta",
if (meanstructure) "nu" else NULL ,
"theta",
if (meanstructure) "alpha" else NULL ,
"psi",
if (length(allOrdNames) && parameterization == "delta") "delta" else NULL),
names(GLIST.free[[1]]))
if ("beta" %in% pmats && ID.fac != "ul") {
ID.fac <- "ul" #FIXME: could use effects-coding with relative ease?
mc$ID.fac <- ID.fac
message('Higher-order factors detected. ID.fac set to "ul".')
}
## matrices with estimates depends on class of model
if (inherits(lavTemplate, "lavaan")) {
GLIST.est <- lavInspect(lavTemplate, "est")
if (nG == 1L) GLIST.est <- list(`1` = GLIST.est)
} else if (inherits(lavTemplate, "lavaanList")) {
nn <- names(lavTemplate@Model@GLIST) #FIXME: will @Model continue to exist?
GLIST.est <- list()
for (g in 1:nG) {
GLIST.est[[g]] <- list()
for (p in pmats) {
GLIST.est[[g]][[p]] <- lavTemplate@Model@GLIST[[ which(nn == p)[g] ]]
## add dimnames to matrices
dimnames(GLIST.est[[g]][[p]]) <- dimnames(GLIST.free[[g]][[p]])
}
}
}
for (g in 1:nG) {
GLIST.est[[g]] <- GLIST.est[[g]][pmats]
GLIST.free[[g]] <- GLIST.free[[g]][pmats]
if (g > 1L) {
## make sure all groups have the same observed & latent variables
same.obs <- all(rownames(GLIST.free[[g]]$lambda) == rownames(GLIST.free[[1]]$lambda))
same.lat <- all(colnames(GLIST.free[[g]]$lambda) == colnames(GLIST.free[[1]]$lambda))
if (!same.obs) stop('Models contain different observed variables across ',
'groups/blocks. Configural invariance impossible.')
if (!same.lat) stop('Models contain different latent variables across ',
'groups/blocks. Configural invariance impossible.')
}
}
## FIXME: check for others? (e.g., test invariance across multiple levels?)
## In general, specify if GLIST.free > 0 | (GLIST.free == 0 & GLIST.est != 0)
## - tau : specify all
## - lambda: specify any nonzero in free + fixed-nonzero (e.g., auto.fix.first)
## - beta : treat as second-order lambda
## - nu : specify all
## - theta : specify diagonal (unless delta?) + any nonzero off-diagonal
## - delta : specify ONLY if parameterization == "delta"
## - alpha : specify all
## - psi : specify all
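## e.g., in a 2-factor CFA with 3 indicators each and no cross-loadings,
## GLIST.specify[[g]]$lambda is a 6x2 logical matrix, TRUE only in the
## 3 rows of each factor's own indicators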
GLIST.specify <- sapply(names(GLIST.free), function(g) list())
for (g in 1:nG) {
for (p in pmats) {
## THRESHOLDS
if (p == "tau") {
GLIST.specify[[g]]$tau <- GLIST.free[[g]]$tau == 0
GLIST.specify[[g]]$tau[ , 1] <- TRUE
next
}
## LOADINGS
if (p == "lambda") {
free.loading <- GLIST.free[[g]]$lambda > 0L
fixed.nonzero.loading <- GLIST.free[[g]]$lambda == 0L & GLIST.est[[g]]$lambda != 0
GLIST.specify[[g]]$lambda <- free.loading | fixed.nonzero.loading
next
}
## SECOND-ORDER LOADINGS
if (p == "beta") {
free.loading <- GLIST.free[[g]]$beta > 0L
fixed.nonzero.loading <- GLIST.free[[g]]$beta == 0L & GLIST.est[[g]]$beta != 0
GLIST.specify[[g]]$beta <- free.loading | fixed.nonzero.loading
next
}
## INTERCEPTS
if (p == "nu") {
GLIST.specify[[g]]$nu <- GLIST.free[[g]]$nu == 0
GLIST.specify[[g]]$nu[ , 1] <- TRUE
next
}
## LATENT MEANS
if (p == "alpha") {
GLIST.specify[[g]]$alpha <- GLIST.free[[g]]$alpha == 0
GLIST.specify[[g]]$alpha[ , 1] <- TRUE
next
}
## LATENT (CO)VARIANCES
if (p == "psi") {
GLIST.specify[[g]]$psi <- matrix(TRUE, nrow = nrow(GLIST.free[[g]]$psi),
ncol = ncol(GLIST.free[[g]]$psi),
dimnames = dimnames(GLIST.free[[g]]$psi))
## only specify lower triangle
GLIST.specify[[g]]$psi[upper.tri(GLIST.specify[[g]]$psi)] <- FALSE
next
}
## RESIDUAL (CO)VARIANCES
if (p == "theta") {
free.var <- GLIST.free[[g]]$theta > 0L
fixed.nonzero.var <- GLIST.free[[g]]$theta == 0L & GLIST.est[[g]]$theta != 0
GLIST.specify[[g]]$theta <- free.var | fixed.nonzero.var
## can't specify for ordinal indicators using delta parameterization
if (parameterization == "delta")
diag(GLIST.specify[[g]]$theta)[allOrdNames] <- FALSE
## only specify lower triangle
GLIST.specify[[g]]$theta[upper.tri(GLIST.specify[[g]]$theta)] <- FALSE
next
}
## SCALING FACTORS (delta parameters for latent item-responses)
if (p == "delta") {
GLIST.specify[[g]]$delta <- GLIST.free[[g]]$delta == 1
GLIST.specify[[g]]$delta[ , 1] <- parameterization == "delta"
}
## end loops
}
}
## check for any cross-loadings
#TODO: special check for bifactor models possible? Find factors whose indicators all cross-load...
anyXload <- FALSE
for (g in 1:nG) {
if (any(apply(GLIST.specify[[g]]$lambda, 1, sum) > 1)) anyXload <- TRUE
}
## can the effects-coding identification method be used?
if (ID.fac == "fx" && anyXload) {
stop('Effects-coding method of factor identification ',
'("ID.fac") unavailable in models with cross-loadings.')
}
## Warn about constraining intercepts but not means
freeMeans <- ("intercepts" %in% group.equal && !("means" %in% group.equal)) ||
("intercepts" %in% long.equal && !("means" %in% long.equal) )
if (ID.fac == "uv" && anyXload && freeMeans) {
warning('A factor\'s mean cannot be freed unless it has at least one ',
'indicator without a cross-loading whose intercept is constrained ',
'to equality. Use cat(as.character()) to check whether the syntax ',
'returned by measEq.syntax() must be manually adapted to free the ',
'necessary latent means.')
}
## If it is estimated in the user's configural model, free it (NA).
## If it is specified as fixed but != 0, retain fixed value.
GLIST.values <- sapply(names(GLIST.free), function(g) list())
for (g in 1:nG) {
GLIST.values[[g]] <- mapply(function(est, free) {
est[free > 0L] <- NA
est
}, SIMPLIFY = FALSE, est = GLIST.est[[g]], free = GLIST.free[[g]])
## constrain first loadings to 1 and first indicators' intercepts to 0?
if (ID.fac == "ul") {
## matrix to store whether each indicator is a reference indicator
lambda.first <- matrix(FALSE, nrow = nrow(GLIST.values[[g]]$lambda),
ncol = ncol(GLIST.values[[g]]$lambda),
dimnames = dimnames(GLIST.values[[g]]$lambda))
if ("beta" %in% pmats) {
beta.first <- matrix(FALSE, nrow = nrow(GLIST.values[[g]]$beta),
ncol = ncol(GLIST.values[[g]]$beta),
dimnames = dimnames(GLIST.values[[g]]$beta))
}
## loop over factors to constrain loadings to 1
for (f in colnames(GLIST.values[[g]]$lambda)) {
## if any loading(s) is(are) fixed to 1 already, no changes needed
ones.lambda <- which(GLIST.values[[g]]$lambda[ , f] == 1L)
if ("beta" %in% pmats) ones.beta <- which(GLIST.values[[g]]$beta[ , f] == 1L)
any1.lambda <- length(ones.lambda) > 0L
any1.beta <- if ("beta" %in% pmats) length(ones.beta) > 0L else FALSE
if (!any1.lambda && !any1.beta) {
## If not already indicated, find the first indicator and fix it to 1.
## Prioritize latent indicators to be first (in case observed has cross-loading)
if ("beta" %in% pmats) {
indicators <- names(which(GLIST.specify[[g]]$beta[ , f]))
if (length(indicators)) {
first.indicator <- indicators[1]
} else first.indicator <- NULL
} else first.indicator <- NULL
if (length(first.indicator)) {
## only true if ("beta" %in% pmats)
GLIST.values[[g]]$beta[first.indicator, f] <- 1L
beta.first[first.indicator, f] <- TRUE
} else {
## no latent indicators, so look in lambda
indicators <- names(which(GLIST.specify[[g]]$lambda[ , f]))
first.indicator <- indicators[1] #FIXME: no chance of NA by now, right?
GLIST.values[[g]]$lambda[first.indicator, f] <- 1L
lambda.first[first.indicator, f] <- TRUE
}
## otherwise, use first fixed == 1 indicator as the marker variable
} else if (any1.beta) {
beta.first[ones.beta[1], f] <- TRUE
} else if (any1.lambda) {
lambda.first[ones.lambda[1], f] <- TRUE
}
}
## loop over indicators to constrain intercepts to zero
if (meanstructure) {
## manifest indicators
for (i in rownames(GLIST.specify[[g]]$lambda)) {
## for the first indicator of a construct, constrain to zero
first.indicator <- lambda.first[i, ]
if (sum(first.indicator) > 1L)
stop('The intercept of indicator "', i, '" can only be fixed to zero ',
'in order to identify one latent mean, but it is specified as ',
'the first indicator of the following factors:\n\t',
paste(names(which(first.indicator)), collapse = ", "), '\n',
'Please respecify the model so that each factor has a unique ',
'first indicator to use as a reference indicator.')
if (any(first.indicator)) GLIST.values[[g]]$nu[i, 1] <- 0
}
## latent indicators of higher-order constructs
if ("beta" %in% pmats) for (i in rownames(GLIST.specify[[g]]$beta)) {
## for the first indicator of a construct, constrain to zero
first.indicator <- beta.first[i, ]
if (sum(first.indicator) > 1L)
stop('The intercept of indicator "', i, '" can only be fixed to zero ',
'in order to identify one factor mean, but it is specified as ',
'the first indicator of the following factors:\n\t',
paste(names(which(first.indicator)), collapse = ", "), '\n',
'Please respecify the model so that each factor has a unique ',
'first indicator to use as a reference indicator.')
if (any(first.indicator)) {
GLIST.values[[g]]$alpha[i, 1] <- 0
} else GLIST.values[[g]]$alpha[i, 1] <- NA
}
}
}
}
## Make labels
GLIST.labels <- sapply(names(GLIST.free), function(g) list())
for (g in 1:nG) {
for (p in pmats) {
if (p == "tau") {
## THRESHOLDS
GLIST.labels[[g]]$tau <- cbind(gsub(x = rownames(GLIST.free[[g]]$tau),
pattern = "|t", replacement = ".thr",
fixed = TRUE))
dimnames(GLIST.labels[[g]]$tau) <- dimnames(GLIST.free[[g]]$tau)
} else {
## ANY OTHER PARAMETERS
GLIST.labels[[g]][[p]] <- matrix("", nrow = nrow(GLIST.free[[g]][[p]]),
ncol = ncol(GLIST.free[[g]][[p]]),
dimnames = dimnames(GLIST.free[[g]][[p]]))
for (RR in rownames(GLIST.free[[g]][[p]])) {
for (CC in colnames(GLIST.free[[g]][[p]])) {
GLIST.labels[[g]][[p]][RR, CC] <- getLabel(GLIST.labels[[g]],
parMat = p,
RR = RR, CC = CC)
}
}
}
## end loops
}
## no labels for scaling factors (cannot equate, not a measurement parameter)
GLIST.labels[[g]]$delta <- NULL
}
## ------------------------------------
## Preliminary checks on model and data
## ------------------------------------
## check longitudinal factor names
if (!is.list(longFacNames)) stop('"longFacNames" must be a list of character vectors.')
## check that no longitudinal factors are only at 1 occasion
if (length(longFacNames)) longFacNames <- longFacNames[sapply(longFacNames, length) > 1L]
## also check longIndNames, and each non-NULL element
if (!is.list(longIndNames)) stop('"longIndNames" must be a list of character vectors.')
if (length(longIndNames)) {
longIndList <- sapply(longIndNames, is.character)
if (!all(longIndList)) stop('"longIndNames" must be a list of character vectors.')
## No problem if any(length == 1L). It just won't be constrained.
}
## names of factors in syntax
allFacNames <- lapply(GLIST.free, function(x) colnames(x$lambda))
## collapse names of longitudinal factors (plus non-longitudinal factors)
# reducedFacNames <- c(names(longFacNames), setdiff(unlist(allFacNames),
# unlist(longFacNames)))
## check for longitudinal indicator names, automatically generate if empty
make.longIndNames <- length(longIndNames) == 0L
for (f in names(longFacNames)) {
## time-specific factor names
fs <- longFacNames[[f]]
nT <- length(fs) # number of occasions
## get indicators of each
indNames <- sapply(fs, function(ff) {
names(which(GLIST.specify[[1]]$lambda[ , ff]))
}, simplify = FALSE)
if (make.longIndNames) {
# check for same number of indicators, match across factors
nInd <- length(indNames[[1]])
if (!all(sapply(indNames, length) == nInd))
stop('The number of indicators for longitudinal factor "', f,
'" differs across measurement occasions. Please use the ',
'"longIndNames" argument to specify which longitudinal indicators',
' are the same indicator on different occasions of measurement.')
if (nInd > 0L) for (i in 1:nInd) {
longIndNames[[paste0("._", f, "_.ind.", i)]] <- sapply(indNames, "[",
i = i,
USE.NAMES = FALSE)
}
} else {
## add unique indicators per factor (omitted from user-specified matches) ## NO LONGER NECESSARY
# for (i in fs) {
# extraIndicators <- setdiff(indNames[[i]], unlist(longIndNames[[f]]))
# longIndNames[[f]][extraIndicators] <- extraIndicators
# }
}
}
## check none have cross-loadings
longIndTable <- table(unlist(longIndNames))
if (any(longIndTable > 1L))
stop('Some longitudinal indicators define more than one factor:\n ',
paste(names(longIndTable[longIndTable > 1L]), collapse = ", "), "\n ",
'The "longIndNames=" argument must be explicitly declared.')
## check equivalence of data type (ordinal vs. continuous) across time
longOrdNames <- sapply(longIndNames, "%in%", table = allOrdNames, simplify = FALSE)
someNotAll <- sapply(longOrdNames, function(i) any(i) & !all(i))
if (any(someNotAll)) {
stop('At least one longitudinal indicator is declared as "ordered" on',
' at least one, but not every, occasion: \n ',
paste(names(which(someNotAll)), collapse = ", "))
}
## check number of thresholds/categories is equivalent across time
allOrd <- sapply(longOrdNames, all)
if (length(allOrd)) for (i in which(allOrd)) {
checkThr <- nThr[ longIndNames[[ names(allOrd)[i] ]] ]
if (!all(checkThr == checkThr[1]))
stop('These "ordered" longitudinal indicators do not have the same ',
'number of thresholds (endorsed categories) on every occasion: \n',
paste(names(checkThr), collapse = ", "),
"\nConsider collapsing rarely endorsed categories.")
}
## create a backward-key for finding long(Fac/Ind)Names from variable names
longFacKey <- rep(names(longFacNames), times = sapply(longFacNames, length))
names(longFacKey) <- unlist(longFacNames)
longIndKey <- rep(names(longIndNames), times = sapply(longIndNames, length))
names(longIndKey) <- unlist(longIndNames)
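## e.g., given longFacNames = list(FU = c("FU1","FU2","FU3")), the backward
## key maps each occasion-specific name to its common longitudinal name:
# longFacKey == c(FU1 = "FU", FU2 = "FU", FU3 = "FU")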
mc$longFacNames <- longFacNames
mc$longIndNames <- longIndNames
## -----------------
## Apply constraints
## -----------------
## THRESHOLDS (+ intercept & variance ID constraints for allOrdNames)
## longitudinal constraints (one group at a time, but same across groups)
for (g in 1:nG) {
## loop over ordinal indicators
for (i in allOrdNames) {
## which other variables are this same indicator?
longInds <- names(longIndKey)[ which(longIndKey == longIndKey[i]) ]
if (length(longInds) == 0L) next
## keep track of how many thresholds for the i_th indicator have been
## constrained, in case identification constraints can be released
nEqThr <- 0L
## loop over thresholds of the i_th ordinal indicator
for (th in 1:(nThr[i])) {
## (ADD) constraints across repeated measures?
equate.long <- "thresholds" %in% long.equal
## check whether not to equate because it is in long.partial
partial.th <- long.partial$op == "|" & long.partial$rhs == paste0("t", th)
if (equate.long && any(partial.th)) {
partial.inds <- longIndNames[ long.partial$lhs[which(partial.th)] ]
equate.long <- !i %in% unlist(partial.inds)
}
## check whether to equate for identification (overrides *.partial)
if (ID.cat == "millsap") {
## always equate the first (or only, if binary)
if (th == ID.thr[[i]][1]) {
equate.long <- TRUE
## remove this from long.partial, if necessary
rm.th <- which(long.partial$lhs == longIndKey[i] & partial.th)
if (length(rm.th)) long.partial <- long.partial[-rm.th, ]
}
## for the first indicator of a construct, equate the second
fs <- which(GLIST.specify[[g]]$lambda[i, ])
first.indicator <- sapply(fs, function(f) {
lams <- GLIST.specify[[g]]$lambda[ , f]
lam.eq.1 <- which(GLIST.values[[g]]$lambda[ , f] == 1)
if (length(lam.eq.1)) return(names(lams[ lam.eq.1[1] ]) == i)
names(which(lams))[1] == i
})
if (th == ID.thr[[i]][2] && any(first.indicator)) {
equate.long <- TRUE
if (length(fs) > 1L && warn)
warning('Millsap & Tein`s (2004) identification constraints might ',
'not be optimal when the reference indicator ("', i,
'") has a cross-loading (on factors "',
paste0(names(fs), collapse = '", "'), '")')
## remove this from long.partial, if necessary
rm.th <- which(long.partial$lhs == longIndKey[i] & partial.th)
if (length(rm.th)) long.partial <- long.partial[-rm.th, ]
}
}
## apply longitudinal constraint?
if (equate.long) {
## iterate count of constrained thresholds
nEqThr <- nEqThr + 1L
## apply longitudinal constraint
this.th <- paste0(i, "|t", th)
first.th <- paste0(longInds[1], "|t", th)
GLIST.labels[[g]]$tau[this.th, 1] <- GLIST.labels[[g]]$tau[first.th, 1]
}
} ## end loop over thresholds
## check whether enough thresholds were equated to free
## IDENTIFICATION CONSTRAINTS on intercepts & residuals
equate.int <- "intercepts" %in% long.equal &&
!any(long.partial$lhs == longIndKey[i] & long.partial$op == "~1")
equate.resid <- "residuals" %in% long.equal &&
!any(long.partial$lhs == longIndKey[i] &
long.partial$rhs == longIndKey[i] &
long.partial$op == "~~") #FIXME: leave resid==0 for reference indicators
if (i == longInds[1]) {
if (ID.cat == "lisrel") {
## always estimate intercepts, and variances unless binary
GLIST.values[[g]]$nu[i, 1] <- NA
diag(GLIST.values[[g]]$theta)[i] <- if (nThr[i] == 1L) 1 else NA
} else {
## always set reference occasion's intercepts to 0 and variances to 1
GLIST.values[[g]]$nu[i, 1] <- 0
if (parameterization == "theta") {
diag(GLIST.values[[g]]$theta)[i] <- 1
} else {
GLIST.values[[g]]$delta[i, 1] <- 1
}
}
} else if (ID.cat == "wu") {
## priority to freeing intercepts
if (nEqThr == 0L || equate.int) {
GLIST.values[[g]]$nu[i, 1] <- 0
} else GLIST.values[[g]]$nu[i, 1] <- NA
if (nEqThr == 0L || (nEqThr < 2L && !equate.int) || equate.resid) {
## keep (residual) variances fixed
if (parameterization == "theta") {
diag(GLIST.values[[g]]$theta)[i] <- 1
} else {
GLIST.values[[g]]$delta[i, 1] <- 1
}
} else {
## free (residual) variances
if (parameterization == "theta") {
diag(GLIST.values[[g]]$theta)[i] <- NA
} else {
GLIST.values[[g]]$delta[i, 1] <- NA
}
}
} else if (ID.cat %in% c("mplus","millsap")) {
## never free intercepts, only variances
if (nEqThr == 0L || equate.resid) {
## keep (residual) variances fixed
if (parameterization == "theta") {
diag(GLIST.values[[g]]$theta)[i] <- 1
} else {
GLIST.values[[g]]$delta[i, 1] <- 1
}
} else {
## free (residual) variances
if (parameterization == "theta") {
diag(GLIST.values[[g]]$theta)[i] <- NA
} else {
GLIST.values[[g]]$delta[i, 1] <- NA
}
}
} else if (ID.cat == "lisrel") {
## always estimate intercepts, and variances unless binary
GLIST.values[[g]]$nu[i, 1] <- NA
diag(GLIST.values[[g]]$theta)[i] <- if (nThr[i] == 1L) 1 else NA
}
}
}
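## e.g., under ID.cat == "wu", equating at least one threshold across
## occasions permits freeing an indicator's intercept after Time 1; equating
## two thresholds (or one threshold plus the intercept) additionally permits
## freeing its (residual) variance, per Wu & Estabrook (2016)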
## group constraints
if (nG == 1L && ID.cat == "lisrel") {
## Single-group model for repeated measures:
## Longitudinal loop above only places LISREL equality constraints on
## thresholds. Here, we still need to fix the first 2 == {0, 1}.
## loop over ordinal indicators (nG == 1L here, so index group [[1L]] directly)
for (i in allOrdNames) {
## loop over thresholds of the i_th ordinal indicator
for (th in 1:(nThr[i])) {
## always fix the first (or only, if binary) to zero
if (th == 1L) GLIST.values[[1L]]$tau[paste0(i, "|t", th), 1] <- 0
## always fix the second to one
if (th == 2L) GLIST.values[[1L]]$tau[paste0(i, "|t", th), 1] <- 1
## estimate any others
if (th > 2L) GLIST.values[[1L]]$tau[paste0(i, "|t", th), 1] <- NA
} ## end loop over thresholds
} ## end loop over ordinal indicators
} else if (nG > 1L) for (g in 1:nG) {
## loop over ordinal indicators
for (i in allOrdNames) {
## keep track of how many thresholds for the i_th indicator have been
## constrained, in case identification constraints can be released
nEqThr <- 0L
## loop over thresholds of the i_th ordinal indicator
for (th in 1:(nThr[i])) {
## (REMOVE) constraints across groups?
equate.group <- "thresholds" %in% group.equal
## check whether not to equate because it is in group.partial
partial.th <- group.partial$lhs == i & group.partial$op == "|" &
group.partial$rhs == paste0("t", th)
if (equate.group) equate.group <- !any(partial.th)
## check whether to equate for identification (overrides *.partial)
if (ID.cat == "millsap") {
## always equate the first (or only, if binary)
if (th == ID.thr[[i]][1]) {
equate.group <- TRUE
## remove this from group.partial, if necessary
rm.th <- which(partial.th)
if (length(rm.th)) group.partial <- group.partial[-rm.th, ]
}
## for the first indicator of a construct, equate the second
fs <- which(GLIST.specify[[g]]$lambda[i, ])
first.indicator <- sapply(fs, function(f) {
lams <- GLIST.specify[[g]]$lambda[ , f]
lam.eq.1 <- which(GLIST.values[[g]]$lambda[ , f] == 1)
if (length(lam.eq.1)) return(names(lams[ lam.eq.1[1] ]) == i)
names(which(lams))[1] == i
})
if (th == ID.thr[[i]][2] && any(first.indicator)) {
equate.group <- TRUE
if (length(fs) > 1L && warn)
warning('Millsap & Tein`s (2004) identification constraints might ',
'not be optimal when the reference indicator ("', i,
'") has a cross-loading (on factors "',
paste0(names(fs), collapse = '", "'), '")')
## remove this from group.partial, if necessary
rm.th <- which(partial.th)
if (length(rm.th)) group.partial <- group.partial[-rm.th, ]
}
} else if (ID.cat == "lisrel") {
## always fix the first (or only, if binary) to zero
if (th == 1L) GLIST.values[[g]]$tau[paste0(i, "|t", th), 1] <- 0
## always fix the second to one
if (th == 2L) GLIST.values[[g]]$tau[paste0(i, "|t", th), 1] <- 1
## estimate any others
if (th > 2L) GLIST.values[[g]]$tau[paste0(i, "|t", th), 1] <- NA
}
## apply group-specific label, unless constrained
if (!equate.group) {
## row in GLIST
RR <- paste0(i, "|t", th)
GLIST.labels[[g]]$tau[RR, 1] <- paste0(GLIST.labels[[g]]$tau[RR, 1], ".g", g)
} else nEqThr <- nEqThr + 1L # iterate count of constrained thresholds
} ## end loop over thresholds
## check whether enough thresholds were equated to free
## IDENTIFICATION CONSTRAINTS on intercepts & residuals.
## Note: Group 1 constraints already set in longitudinal loop, ONLY if
## there are repeated measures identified by longInds.
## Section below only RELEASES constraints.
## DON'T OVERWRITE FREED CONSTRAINTS AFTER TIME 1.
equate.int <- "intercepts" %in% group.equal &&
!any(group.partial$lhs == i & group.partial$op == "~1")
equate.resid <- "residuals" %in% group.equal &&
!any(group.partial$lhs == i & group.partial$rhs == i & group.partial$op == "~~")
if (g > 1L && ID.cat == "wu") {
## priority to freeing intercepts
#FIXME: binary indicators, latent mean arbitrarily freed, nesting problems
if (nEqThr >= 1L && !equate.int) GLIST.values[[g]]$nu[i, 1] <- NA
if ((nEqThr >= 2L || (nEqThr >= 1L && equate.int)) && !equate.resid) {
## free (residual) variances
if (parameterization == "theta") {
diag(GLIST.values[[g]]$theta)[i] <- NA
} else {
GLIST.values[[g]]$delta[i, 1] <- NA
}
}
} else if (g > 1L && ID.cat %in% c("mplus","millsap")) {
## never free intercepts, only variances
if (nEqThr >= 1L && !equate.resid) {
## free (residual) variances
if (parameterization == "theta") {
diag(GLIST.values[[g]]$theta)[i] <- NA
} else {
GLIST.values[[g]]$delta[i, 1] <- NA
}
}
} else if (ID.cat == "lisrel") {
## always estimate intercepts, and variances unless binary
GLIST.values[[g]]$nu[i, 1] <- NA
diag(GLIST.values[[g]]$theta)[i] <- if (nThr[i] == 1L) 1 else NA
}
}
}
## LATENT MEANS
## longitudinal constraints (one group at a time, but same across groups)
if (meanstructure) for (g in 1:nG) {
## fix or free factor means?
if (ID.fac == "uv") {
GLIST.values[[g]]$alpha[ , 1] <- 0 # free below, if any loading is constrained
## freed when any loading is constrained to equality
} else if ("beta" %in% pmats) {
## latent indicators of any higher-order factors already set to 0 or NA
## in GLIST.values loop above
} else GLIST.values[[g]]$alpha[ , 1] <- NA
## loop over factors
for (f in rownames(GLIST.labels[[g]]$alpha)) {
## which other factors are this same factor?
longFacs <- names(longFacKey)[ which(longFacKey == longFacKey[f]) ]
if (length(longFacs) == 0L) {
## not a longitudinal factor, set first group's mean to 0 for Millsap
if (ID.cat == "millsap" && g == 1L) GLIST.values[[g]]$alpha[f, 1] <- 0
next
}
## first time a factor is measured, set first group's mean to 0 for Millsap
if (ID.cat == "millsap" && g == 1L && longFacs[1] == f) {
GLIST.values[[g]]$alpha[f, 1] <- 0
}
## assign labels
equate.means <- "means" %in% long.equal &&
!any(long.partial$lhs == longFacKey[f] & long.partial$op == "~1")
if (equate.means) {
GLIST.labels[[g]]$alpha[f, 1] <- GLIST.labels[[g]]$alpha[longFacs[1], 1]
}
}
}
## group constraints
if (meanstructure && nG > 1L) for (g in 1:nG) {
## loop over factors
for (f in rownames(GLIST.labels[[g]]$alpha)) {
## assign labels
equate.means <- "means" %in% group.equal &&
!any(group.partial$lhs == f & group.partial$op == "~1")
if (!equate.means) {
GLIST.labels[[g]]$alpha[f, 1] <- paste0(GLIST.labels[[g]]$alpha[f, 1], ".g", g)
}
}
}
## LATENT VARIANCES
## longitudinal constraints (one group at a time, but same across groups)
for (g in 1:nG) {
## fix or free factor variances?
if (ID.fac == "uv") {
diag(GLIST.values[[g]]$psi) <- 1 # free below, if any loading is constrained
## freed when any loading is constrained to equality
} else diag(GLIST.values[[g]]$psi) <- NA
## loop over factors
for (f in colnames(GLIST.labels[[g]]$lambda)) {
## which other factors are this same factor?
longFacs <- names(longFacKey)[ which(longFacKey == longFacKey[f]) ]
if (length(longFacs) == 0L) next
## assign labels
equate.var <- "lv.variances" %in% long.equal &&
!any(long.partial$lhs == longFacKey[f] &
long.partial$op == "~~" &
long.partial$rhs == longFacKey[f])
if (equate.var) {
GLIST.labels[[g]]$psi[f, f] <- GLIST.labels[[g]]$psi[longFacs[1], longFacs[1]]
}
}
}
## group constraints
if (nG > 1L) for (g in 1:nG) {
## loop over factors
for (f in colnames(GLIST.labels[[g]]$lambda)) {
## assign labels
equate.var <- "lv.variances" %in% group.equal &&
!any(group.partial$lhs == f &
group.partial$op == "~~" &
group.partial$rhs == f)
if (!equate.var) {
GLIST.labels[[g]]$psi[f, f] <- paste0(GLIST.labels[[g]]$psi[f, f], ".g", g)
}
}
}
## LOADINGS
## longitudinal constraints (one group at a time, but same across groups)
for (g in 1:nG) {
## loop over factors
for (f in colnames(GLIST.labels[[g]]$lambda)) {
## which other factors are this same factor?
longFacs <- names(longFacKey)[ which(longFacKey == longFacKey[f]) ]
if (length(longFacs) == 0L) next
## loop over any manifest indicators within each factor
for (i in names(which(GLIST.specify[[g]]$lambda[ , f])) ) {
## which other variables are this same indicator?
longInds <- names(longIndKey)[ which(longIndKey == longIndKey[i]) ]
if (length(longInds) == 0L) next
## assign labels
equate.load <- "loadings" %in% long.equal &&
!any(long.partial$lhs == longFacKey[f] &
long.partial$op == "=~" &
long.partial$rhs == longIndKey[i])
if (equate.load) {
GLIST.labels[[g]]$lambda[i, f] <- GLIST.labels[[g]]$lambda[longInds[1], longFacs[1]]
## free factor variance(s) after Time 1
if (ID.fac == "uv" && f %in% longFacs[-1]) diag(GLIST.values[[g]]$psi)[f] <- NA
}
}
## loop over any latent indicators within each factor
if ("beta" %in% pmats) for (i in names(which(GLIST.specify[[g]]$beta[ , f])) ) {
## which other factors are this same factor?
longInds <- names(longFacKey)[ which(longFacKey == longFacKey[i]) ]
if (length(longInds) == 0L) next
## assign labels
equate.load <- "regressions" %in% long.equal &&
!any(long.partial$lhs == longFacKey[f] &
long.partial$op == "=~" &
long.partial$rhs == longFacKey[i])
if (equate.load) {
GLIST.labels[[g]]$beta[i, f] <- GLIST.labels[[g]]$beta[longInds[1], longFacs[1]]
}
}
}
}
## group constraints
if (nG > 1L) for (g in 1:nG) {
## loop over factors
for (f in colnames(GLIST.labels[[g]]$lambda)) {
## loop over any manifest indicators within each factor
for (i in names(which(GLIST.specify[[g]]$lambda[ , f])) ) {
## assign labels
equate.load <- "loadings" %in% group.equal &&
!any(group.partial$lhs == f &
group.partial$op == "=~" &
group.partial$rhs == i)
if (!equate.load) {
GLIST.labels[[g]]$lambda[i, f] <- paste0(GLIST.labels[[g]]$lambda[i, f],
".g", g)
} else if (ID.fac == "uv" && g > 1L) {
## free factor variance(s) in group(s) other than the first
diag(GLIST.values[[g]]$psi)[f] <- NA
}
}
## loop over any latent indicators within each factor
if ("beta" %in% pmats) for (i in names(which(GLIST.specify[[g]]$beta[ , f])) ) {
## assign labels
equate.load <- "regressions" %in% group.equal &&
!any(group.partial$lhs == f &
group.partial$op == "=~" &
group.partial$rhs == i)
if (!equate.load) {
GLIST.labels[[g]]$beta[i, f] <- paste0(GLIST.labels[[g]]$beta[i, f],
".g", g)
}
}
}
}
## INTERCEPTS
## longitudinal constraints (one group at a time, but same across groups)
if (meanstructure) for (g in 1:nG) {
## loop over indicators
for (i in lavNames(lavTemplate, "ov.ind", group = g)) {
## which other variables are this same indicator?
longInds <- names(longIndKey)[ which(longIndKey == longIndKey[i]) ]
if (length(longInds) == 0L) next
## assign labels
equate.int <- "intercepts" %in% long.equal &&
!any(long.partial$lhs == longIndKey[i] & long.partial$op == "~1")
if (equate.int) {
GLIST.labels[[g]]$nu[i, 1] <- GLIST.labels[[g]]$nu[longInds[1], 1]
## free factor mean(s) after Time 1 only if an indicator without a
## cross-loading has an equality-constrained intercept
if (ID.fac == "uv") {
## factors this indicator measures
fs <- colnames(GLIST.specify[[g]]$lambda)[ GLIST.specify[[g]]$lambda[i,] ]
only.measures.1 <- length(fs) == 1L
## name(s) of longitudinal factor(s)
LFN <- longFacKey[fs]
not.time.1 <- fs[1] %in% names(which(longFacKey == LFN))[-1]
if (only.measures.1 && not.time.1) GLIST.values[[g]]$alpha[fs, 1] <- NA
}
}
}
}
## group constraints
if (meanstructure && nG > 1L) for (g in 1:nG) {
## loop over indicators
for (i in lavNames(lavTemplate, "ov.ind", group = g)) {
## assign labels
equate.int <- "intercepts" %in% group.equal &&
!any(group.partial$lhs == i & group.partial$op == "~1")
if (!equate.int) {
GLIST.labels[[g]]$nu[i, 1] <- paste0(GLIST.labels[[g]]$nu[i, 1], ".g", g)
} else if (ID.fac == "uv") {
## factors this indicator measures
fs <- colnames(GLIST.specify[[g]]$lambda)[ GLIST.specify[[g]]$lambda[i,] ]
only.measures.1 <- length(fs) == 1L
## free factor mean(s) other than group 1 only if an indicator without a
## cross-loading has an equality-constrained intercept
if (only.measures.1 && g > 1L) GLIST.values[[g]]$alpha[fs, 1] <- NA
}
}
}
## RESIDUAL VARIANCES
## longitudinal constraints (one group at a time, but same across groups)
for (g in 1:nG) {
## loop over indicators
for (i in lavNames(lavTemplate, "ov.ind", group = g)) {
## which other variables are this same indicator?
longInds <- names(longIndKey)[ which(longIndKey == longIndKey[i]) ]
if (length(longInds) == 0L) next
## assign labels
equate.resid <- "residuals" %in% long.equal &&
!any(long.partial$lhs == longIndKey[i] &
long.partial$rhs == longIndKey[i] &
long.partial$op == "~~")
if (equate.resid) {
diag(GLIST.labels[[g]]$theta)[i] <- diag(GLIST.labels[[g]]$theta)[ longInds[1] ]
}
}
}
## group constraints
if (nG > 1L) for (g in 1:nG) {
## loop over indicators
for (i in lavNames(lavTemplate, "ov.ind", group = g)) {
## assign labels
equate.resid <- "residuals" %in% group.equal &&
!any(group.partial$lhs == i & group.partial$rhs == i & group.partial$op == "~~")
if (!equate.resid) {
diag(GLIST.labels[[g]]$theta)[i] <- paste0(diag(GLIST.labels[[g]]$theta)[i],
".g", g)
}
}
}
## RESIDUAL AUTO-COVARIANCES: longitudinal constraints only
if (length(longIndNames) && !is.null(auto)) for (g in 1:nG) {
## loop over longitudinal indicators
for (i in names(longIndNames)) {
nn <- longIndNames[[i]]
nT <- length(nn) # number of repeated measures of indicator i
auto.i <- suppressWarnings(as.integer(auto))[1] # nT can vary over i
if (auto == "all" | is.na(auto.i)) auto.i <- nT - 1L # max lag
if (auto.i >= nT | auto.i < 0L ) auto.i <- nT - 1L # max lag
## for each lag...
for (lag in 1:auto.i) {
for (tt in 1:(nT - lag)) {
## sort indices to ensure the lower.tri is always specified, in case
## order of longIndNames does not match order in syntax/theta
nn.idx <- c(which(rownames(GLIST.specify[[g]]$theta) == nn[tt]),
which(rownames(GLIST.specify[[g]]$theta) == nn[tt + lag]))
idx1 <- nn.idx[ which.max(nn.idx) ] # row index
idx2 <- nn.idx[ which.min(nn.idx) ] # column index
## specify and set free
GLIST.specify[[g]]$theta[idx1, idx2] <- TRUE
GLIST.values[[g]]$theta[ idx1, idx2] <- NA
## constrain to equality across repeated measures?
if ("resid.autocov" %in% long.equal && tt > 1L) {
o.idx <- c(which(rownames(GLIST.specify[[g]]$theta) == nn[1]),
which(rownames(GLIST.specify[[g]]$theta) == nn[1 + lag]))
o1 <- o.idx[ which.max(o.idx) ] # row index
o2 <- o.idx[ which.min(o.idx) ] # column index
first.label <- GLIST.labels[[g]]$theta[o1, o2]
GLIST.labels[[g]]$theta[idx1, idx2] <- first.label
}
}
}
}
}
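## e.g., with nT = 4 repeated measures of an indicator and auto = 1, the loop
## above frees only the lag-1 residual covariances (t1,t2), (t2,t3), (t3,t4);
## auto = "all" (or NA) frees all lags up to nT - 1 = 3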
## group constraints on any RESIDUAL COVARIANCES
if (nG > 1) for (g in 1:nG) {
## add group-specific labels to any off-diagonal GLIST.specify?
freeTheta <- which(GLIST.specify[[g]]$theta, arr.ind = TRUE)
offDiag <- freeTheta[ , "row"] > freeTheta[ , "col"]
if (sum(offDiag) == 0) break # nothing to do
## loop over elements that require action
free.offDiag <- freeTheta[offDiag, , drop = FALSE]
for (RR in 1:nrow(free.offDiag)) {
i <- free.offDiag[RR, "row"]
j <- free.offDiag[RR, "col"]
## check group.partial in both directions
partial.ij <- any(group.partial$lhs == i & group.partial$rhs == j & group.partial$op == "~~")
partial.ji <- any(group.partial$lhs == j & group.partial$rhs == i & group.partial$op == "~~")
equate.rescov <- "residual.covariances" %in% group.equal && !any(partial.ij | partial.ji)
## assign group-specific labels?
if (!equate.rescov) {
GLIST.labels[[g]]$theta[i, j] <- paste0(GLIST.labels[[g]]$theta[i, j],
".g", g)
}
}
}
## LATENT AUTO-COVARIANCES: longitudinal constraints only
if (length(longFacNames)) for (g in 1:nG) {
## loop over longitudinal factors
for (i in names(longFacNames)) {
nn <- longFacNames[[i]]
nT <- length(nn) # number of repeated measures of factor i
## for each lag...
for (lag in 1:(nT - 1) ) {
for (tt in 1:(nT - lag) ) {
## specify and set free (overwrite possible "orthogonal=TRUE")
GLIST.specify[[g]]$psi[ nn[tt + lag], nn[tt] ] <- TRUE
GLIST.values[[g]]$psi[ nn[tt + lag], nn[tt] ] <- NA
## constrain to equality across repeated measures?
if ("lv.autocov" %in% long.equal && tt > 1L) {
first.label <- GLIST.labels[[g]]$psi[ nn[1 + lag], nn[1] ]
GLIST.labels[[g]]$psi[ nn[tt + lag], nn[tt] ] <- first.label
}
}
}
}
}
## group constraints on any LATENT COVARIANCES
if (nG > 1) for (g in 1:nG) {
## add group-specific labels to any off-diagonal GLIST.specify?
freePsi <- which(GLIST.specify[[g]]$psi, arr.ind = TRUE)
offDiag <- freePsi[ , "row"] > freePsi[ , "col"]
if (sum(offDiag) == 0) break # nothing to do
## loop over elements that require action
free.offDiag <- freePsi[offDiag, , drop = FALSE]
for (RR in 1:nrow(free.offDiag)) {
i <- free.offDiag[RR, "row"]
j <- free.offDiag[RR, "col"]
## check group.partial in both directions
partial.ij <- any(group.partial$lhs == i & group.partial$rhs == j & group.partial$op == "~~")
partial.ji <- any(group.partial$lhs == j & group.partial$rhs == i & group.partial$op == "~~")
equate.latcov <- "lv.covariances" %in% group.equal && !any(partial.ij | partial.ji)
## assign group-specific labels?
if (!equate.latcov) {
GLIST.labels[[g]]$psi[i, j] <- paste0(GLIST.labels[[g]]$psi[i, j], ".g", g)
}
}
}
## assemble parameter labels for effects-code identification constraints
fxList <- character(0)
if (ID.fac == "fx") {
listLabels.L <- list()
if (meanstructure) listLabels.I <- list()
for (g in 1:nG) {
## loadings labels
listLabels.L[[g]] <- sapply(colnames(GLIST.labels[[g]]$lambda), function(f) {
GLIST.labels[[g]]$lambda[GLIST.specify[[g]]$lambda[ , f], f]
}, simplify = FALSE)
## intercept labels
if (meanstructure) {
listLabels.I[[g]] <- sapply(colnames(GLIST.labels[[g]]$lambda), function(f) {
GLIST.labels[[g]]$nu[GLIST.specify[[g]]$lambda[ , f], 1]
}, simplify = FALSE)
#TODO: threshold labels
}
}
## names of factors measured in each group
gFacNames <- lapply(listLabels.L, names)
## loop over common-factor names
for (f in unique(unlist(allFacNames))) {
## in which groups is this factor measured?
groups.with.f <- which(sapply(gFacNames, function(gn) f %in% gn))
## get the labels used for indicators in each group
allLabels.L <- lapply(listLabels.L[groups.with.f], "[[", i = f)
if (meanstructure) allLabels.I <- lapply(listLabels.I[groups.with.f],
"[[", i = f)
## one group, one time --> no checks necessary
if (length(groups.with.f) == 1L && !f %in% names(longFacKey)) {
fxList <- c(fxList, make.FX.constraint(allLabels.L[[1]], "loadings"))
if (meanstructure) {
fxList <- c(fxList, make.FX.constraint(allLabels.I[[1]], "intercepts"))
}
}
## one group, multiple times
if (length(groups.with.f) == 1L && f %in% names(longFacKey)) {
## this factor's name on all occasions
LFN <- names(which(longFacKey == longFacKey[f]))
## count constraints on loadings across time
allConstrained <- which(table(unlist(listLabels.L[[1]][LFN])) == length(LFN))
if (length(allConstrained)) {
if (f == LFN[1]) {
fxList <- c(fxList, make.FX.constraint(names(allConstrained), "loadings"))
}
} else {
## no constraints, each factor gets its own
fxList <- c(fxList, make.FX.constraint(allLabels.L[[1]], "loadings"))
}
## count constraints on intercepts across time
if (meanstructure) {
allConstrained <- which(table(unlist(listLabels.I[[1]][LFN])) == length(LFN))
if (length(allConstrained)) {
if (f == LFN[1]) {
fxList <- c(fxList, make.FX.constraint(names(allConstrained), "intercepts"))
}
} else {
## no constraints, each factor gets its own
fxList <- c(fxList, make.FX.constraint(allLabels.I[[1]], "intercepts"))
}
}
}
## multiple groups, one time
if (length(groups.with.f) > 1L && !f %in% names(longFacKey)) {
## count constraints on loadings across groups
allConstrained <- which(table(unlist(allLabels.L)) == length(groups.with.f))
if (length(allConstrained)) {
fxList <- c(fxList, make.FX.constraint(names(allConstrained), "loadings"))
} else {
## no constraints, each group gets its own
for (g in groups.with.f) {
fxList <- c(fxList, make.FX.constraint(allLabels.L[[g]], "loadings"))
}
}
## count constraints on intercepts across groups
if (meanstructure) {
allConstrained <- which(table(unlist(allLabels.I)) == length(groups.with.f))
if (length(allConstrained)) {
fxList <- c(fxList, make.FX.constraint(names(allConstrained), "intercepts"))
} else {
## no constraints, each group gets its own
for (g in groups.with.f) {
fxList <- c(fxList, make.FX.constraint(allLabels.I[[g]], "intercepts"))
}
}
}
}
## multiple groups, multiple times: Constrain across any/all dimensions?
if (length(groups.with.f) > 1L && f %in% names(longFacKey)) {
## This factor's name on all occasions
LFN <- names(which(longFacKey == longFacKey[f]))
## Number of dimensions (number of groups times number of occasions).
## Assumes each occasion was measured in each group.
nGT <- length(LFN)*length(groups.with.f)
## count constraints on loadings across both dimensions
all.GL.Labels.L <- lapply(LFN, function(ff) {
lapply(listLabels.L[groups.with.f], "[[", i = ff)
})
all.GL.Constrained.L <- which(table(unlist(all.GL.Labels.L)) == nGT)
if (length(all.GL.Constrained.L)) {
if (f == LFN[1]) {
fxList <- c(fxList, make.FX.constraint(names(all.GL.Constrained.L), "loadings"))
}
} else {
if (f == LFN[1])
warning('No indicators of longitudinal factor "', longFacKey[f],
'" have loadings constrained across all groups and all ',
'occasions, so the automatically generated syntax applies ',
'effects-code identification constraints separately for each',
' occasion and group. If at least 1 loading is constrained ',
'across either groups or occasions, the user should save the',
' syntax to manually reduce the number of identification ',
'constraints by applying them only to loadings constrained ',
'to equality across groups or occasions.') #TODO: update() method
for (g in groups.with.f) {
fxList <- c(fxList, make.FX.constraint(allLabels.L[[g]], "loadings"))
}
}
## count constraints on intercepts across both dimensions
if (meanstructure) {
all.GL.Labels.I <- lapply(LFN, function(ff) {
lapply(listLabels.I[groups.with.f], "[[", i = ff)
})
all.GL.Constrained.I <- which(table(unlist(all.GL.Labels.I)) == nGT)
if (length(all.GL.Constrained.I)) {
if (f == LFN[1]) {
fxList <- c(fxList, make.FX.constraint(names(all.GL.Constrained.I), "intercepts"))
}
} else {
if (f == LFN[1])
warning('No indicators of longitudinal factor "', longFacKey[f],
'" have intercepts constrained across all groups and all ',
'occasions, so the automatically generated syntax applies ',
'effects-code identification constraints separately for each',
' occasion and group. If at least 1 intercept is constrained ',
'across either groups or occasions, the user should save the',
' syntax to manually reduce the number of identification ',
'constraints by applying them only to intercepts constrained ',
'to equality across groups or occasions.') #TODO: update() method
for (g in groups.with.f) {
fxList <- c(fxList, make.FX.constraint(allLabels.I[[g]], "intercepts"))
}
}
}
}
} # end loop over common factors
#TODO: Implement effects-coding constraints for thresholds?
# For each latent item-response, mean(thresholds) == 0, which
# identifies intercepts, resolving the problem of effects-coding with
# categorical indicators!
# (i.e., constraining intercepts that == 0 to average 0 is redundant)
}
## -------------
## Return object
## -------------
out <- new("measEq.syntax", package = "lavaan", model.type = "cfa", call = mc,
meanstructure = meanstructure,
numeric = lavNames(lavTemplate, "ov.num"),
ordered = lavNames(lavTemplate, "ov.ord"),
parameterization = parameterization,
specify = GLIST.specify,
values = GLIST.values,
labels = GLIST.labels,
constraints = fxList,
updates = list(values = data.frame(NULL),
labels = data.frame(NULL)),
ngroups = nG)
if (return.fit) {
if (inherits(configural.model, "lavaan")) {
fit <- try(lavaan::update(configural.model,
model = as.character(out), ...),
silent = TRUE)
} else if (inherits(configural.model, "lavaanList")) {
configural.model@call$model <- as.character(out)
configural.model@call$do.fit <- TRUE
fit <- try(eval(configural.model@call, parent.frame()), silent = TRUE)
} else {
lavArgs$model <- as.character(out)
lavArgs$do.fit <- TRUE
fit <- try(do.call("cfa", lavArgs, envir = getNamespace("lavaan")),
silent = TRUE)
}
## check whether the model could be fit
if (inherits(fit, "try-error")) {
warning('The generated model syntax was not successfully fit to the ',
'data, and generated the following error message(s): \n\n',
fit[1:length(fit)], "\n",
"The measEq.syntax object was returned instead.")
} else {
fit@external$measEq.syntax <- out # save the syntax to the lavaan(.mi) object
out <- fit # return the fitted lavaan(.mi) object
}
}
out
}
## ----------------
## Hidden Functions
## ----------------
## function to label a parameter by its location in a parameter matrix
getLabel <- function(GLIST, parMat, RR, CC = 1L) {
dn <- dimnames(GLIST[[parMat]])
out <- paste(parMat, which(dn[[1]] == RR), sep = ".")
if (!parMat %in% c("alpha","nu")) out <- paste(out, which(dn[[2]] == CC),
sep = "_")
out
}
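## Illustrative call (hypothetical names): if a group's lambda matrix has
## rownames c("x1","x2","x3") and colnames "F1", then
# getLabel(GLIST.labels[[1]], parMat = "lambda", RR = "x2", CC = "F1") # "lambda.2_1"
## whereas column vectors alpha and nu are labeled by row only, e.g., "nu.2"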
## function to assemble a model constraint for effects-code identification
make.FX.constraint <- function(parLabels, param) {
nCon <- length(parLabels)
conVal <- if (param == "loadings") nCon else 0 #TODO: algorithm for thresholds
out <- paste0(parLabels[1], " == ", conVal)
if (nCon > 1) out <- paste(c(out, parLabels[-1]), collapse = " - ")
out
}
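## Illustrative output: effects-coding constrains a set of loadings to average
## 1 (i.e., sum to the number of indicators) and intercepts to average 0:
# make.FX.constraint(c("lambda.1_1","lambda.2_1","lambda.3_1"), "loadings")
## yields "lambda.1_1 == 3 - lambda.2_1 - lambda.3_1"
# make.FX.constraint(c("nu.1","nu.2","nu.3"), "intercepts")
## yields "nu.1 == 0 - nu.2 - nu.3"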
## function to generate a character vector of lines of syntax (for as.character)
write.lavaan.syntax <- function(pmat, specify, value, label) {
nG <- length(specify)
## LOADINGS
if (pmat == "lambda") {
params <- "## LOADINGS:\n"
for (fac in colnames(specify[[1]])) {
for (ind in rownames(specify[[1]])) {
if (!specify[[1]][ind, fac]) next
if (nG > 1L) {
params <- c(params,
paste0(fac, " =~ c(",
paste(sapply(value, "[", i = ind, j = fac),
collapse = ", "),
")*", ind, " + c(",
paste(sapply(label, "[", i = ind, j = fac),
collapse = ", "),
")*", ind))
} else {
params <- c(params,
paste0(fac, " =~ ", value[[1]][ind, fac], "*", ind,
" + ", label[[1]][ind, fac], "*", ind))
}
}
}
return(c(params, ""))
}
## THRESHOLDS
if (pmat == "tau") {
params <- sapply(rownames(specify[[1]]), function(th) {
th.names <- strsplit(th, split = "|", fixed = TRUE)[[1]]
if (nG > 1L) {
param <- paste0(th.names[1], " | c(",
paste(sapply(value, "[", i = th, j = 1),
collapse = ", "),
")*", th.names[2], " + c(",
paste(sapply(label, "[", i = th, j = 1),
collapse = ", "),
")*", th.names[2])
} else {
param <- paste0(th.names[1], " | ", value[[1]][th, 1], "*", th.names[2],
" + ", label[[1]][th, 1], "*", th.names[2])
}
param
})
return(c("## THRESHOLDS:\n", params, ""))
}
## INTERCEPTS or LATENT MEANS
if (pmat %in% c("nu","alpha")) {
## specify all, so no need to check
params <- sapply(rownames(specify[[1]]), function(x) {
if (nG > 1L) {
param <- paste0(x, " ~ c(",
paste(sapply(value, "[", i = x, j = 1),
collapse = ", "),
")*1 + c(",
paste(sapply(label, "[", i = x, j = 1),
collapse = ", "),
")*1")
} else {
param <- paste0(x, " ~ ", value[[1]][x, 1], "*1 + ",
label[[1]][x, 1], "*1")
}
param
})
if (pmat == "nu") params <- c("## INTERCEPTS:\n", params)
if (pmat == "alpha") params <- c("## LATENT MEANS/INTERCEPTS:\n", params)
return(c(params, ""))
}
## SCALING FACTORS (delta)
if (pmat == "delta") {
## specify any?
spec.delta <- which(specify[[1]][ , 1])
if (length(spec.delta) == 0L) return(NULL)
## if so...
params <- sapply(names(spec.delta), function(x) {
if (nG > 1L) {
param <- paste0(x, " ~*~ c(",
paste(sapply(value, "[", i = x, j = 1),
collapse = ", "),
")*", x)
} else {
param <- paste0(x, " ~*~ ", value[[1]][x, 1], "*", x)
}
param
})
return(c("## SCALING FACTORS:\n", params, ""))
}
## LATENT or RESIDUAL (CO)VARIANCES
if (pmat %in% c("theta","psi")) {
## do diagonal first, then check for off-diagonal
spec.vars <- which(diag(specify[[1]]))
if (pmat == "psi") {
params <- "## COMMON-FACTOR VARIANCES:\n"
} else if (pmat == "theta" && length(spec.vars)) {
params <- "## UNIQUE-FACTOR VARIANCES:\n"
} else params <- character(0)
## variances
if (length(spec.vars)) {
params <- c(params,
sapply(names(spec.vars), function(x) {
if (nG > 1L) {
param <- paste0(x, " ~~ c(",
paste(sapply(value, function(j) diag(j)[x]),
collapse = ", "),
")*", x, " + c(",
paste(sapply(label, function(j) diag(j)[x]),
collapse = ", "),
")*", x)
} else {
param <- paste0(x, " ~~ ", diag(value[[1]])[x], "*", x,
" + ", diag(label[[1]])[x], "*", x)
}
param
}))
}
## covariances
if (any(specify[[1]][lower.tri(specify[[1]], diag = FALSE)])) {
if (pmat == "psi") params <- c(params, "\n## COMMON-FACTOR COVARIANCES:\n")
if (pmat == "theta") params <- c(params, "\n## UNIQUE-FACTOR COVARIANCES:\n")
}
nn <- rownames(specify[[1]])
if (length(nn) > 1L) for (CC in 1:(length(nn) - 1)) {
for (RR in (CC + 1):length(nn)) {
if (!specify[[1]][RR, CC]) next
if (nG > 1L) {
params <- c(params,
paste0(nn[CC], " ~~ c(",
paste(sapply(value, "[", i = RR, j = CC),
collapse = ", "),
")*", nn[RR], " + c(",
paste(sapply(label, "[", i = RR, j = CC),
collapse = ", "),
")*", nn[RR]))
} else {
params <- c(params,
paste0(nn[CC], " ~~ ", value[[1]][RR, CC], "*", nn[RR],
" + ", label[[1]][RR, CC], "*", nn[RR]))
}
}
}
return(c(params, ""))
}
## out of options, should never get this far
invisible(NULL)
}
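## Illustrative output (hypothetical names): for a 2-group model in which the
## loading of indicator "x1" on factor "FU1" is fixed to 1 in both groups and
## labeled "lambda.1_1", the lambda block would include a line such as
# FU1 =~ c(1, 1)*x1 + c(lambda.1_1, lambda.1_1)*x1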
#TODO: adapt routine to write Mplus MODEL statements and OpenMx RAM commands
write.mplus.syntax <- function(object, group = 1, params = NULL) {
out <- character()
pmatList <- intersect(c("lambda","tau","nu", object@parameterization,
"alpha","psi"), names(object@specify[[group]]))
names(pmatList) <- c("loadings","thresholds","intercepts",
ifelse(object@parameterization == "delta",
"scales", "residuals"),"means","lv.variances")
## selected parameter types?
if (!is.null(params)) {
requested <- intersect(names(pmatList), params)
if (!length(requested)) stop('invalid choice: params = c("',
paste(params, collapse = '", "'), '")\n',
'Valid choices include: ',
paste(names(pmatList), collapse = ", "))
pmatList <- pmatList[requested]
}
## concatenate all latent-variable definitions
if ("beta" %in% names(object@specify[[group]])) {
specify.lambda <- rbind(object@specify[[group]]$lambda,
object@specify[[group]]$beta)
values.lambda <- rbind(object@values[[group]]$lambda,
object@values[[group]]$beta)
labels.lambda <- rbind(object@labels[[group]]$lambda,
object@labels[[group]]$beta)
} else {
specify.lambda <- object@specify[[group]]$lambda
values.lambda <- object@values[[group]]$lambda
labels.lambda <- object@labels[[group]]$lambda
}
## check for @ordered, define latent item-response factors,
if (length(object@ordered)) {
out <- c(out, "! Define LATENT ITEM-RESPONSES as factors",
paste0("LIR", 1:length(object@ordered), " BY ", object@ordered,
"@1; LIR", 1:length(object@ordered), "@0;"))
for (i in seq_along(object@ordered)) {
## update rownames in Lambda
#FIXME: update names in concatenated Lambda instead?
idx <- which(rownames(object@specify[[group]]$lambda) == object@ordered[i])
rownames(specify.lambda)[idx] <- paste0("LIR", i)
rownames(values.lambda)[ idx] <- paste0("LIR", i)
rownames(labels.lambda)[ idx] <- paste0("LIR", i)
## update rownames in Nu
idx <- which(rownames(object@specify[[group]]$nu) == object@ordered[i])
rownames(object@specify[[group]]$nu)[idx] <- paste0("LIR", i)
rownames(object@values[[group]]$nu)[idx] <- paste0("LIR", i)
rownames(object@labels[[group]]$nu)[idx] <- paste0("LIR", i)
}
}
out <- c(out, "! FACTOR LOADINGS")
## shorten labels
labels.lambda <- gsub(pattern = "lambda.", replacement = "L",
x = labels.lambda, fixed = TRUE)
labels.lambda <- gsub(pattern = ".g", replacement = "_",
x = labels.lambda, fixed = TRUE)
## loop over factors
for (fac in colnames(specify.lambda)) {
out <- c(out, paste(fac, "BY"))
ind <- names(which(specify.lambda[ , fac]))
lastInd <- rev(ind)[1]
for (i in ind) {
val <- values.lambda[i, fac]
out <- c(out, paste0(" ", i,
if (is.na(val)) "*" else paste0("@", val),
" (", labels.lambda[i, fac],
")", if (i == lastInd) ";" else ""))
}
}
if ("tau" %in% pmatList) {
out <- c(out, "! THRESHOLDS")
## find unique names to shorten labels
allThrNames <- unique(do.call(c, lapply(object@labels, "[[", i = "tau")))
## loop over ordinal indicators, specify set on a single line
for (i in object@ordered) {
iThr <- grep(paste0(i, "\\|"), rownames(object@labels[[group]]$tau))
specify <- object@specify[[group]]$tau[iThr, 1] #NOTE: These are now vectors
values <- object@values[[ group]]$tau[iThr, 1]
labels <- object@labels[[ group]]$tau[iThr, 1]
## identify unique parameter number among thresholds (for short labels)
idx <- integer()
for (lab in labels) idx <- c(idx, which(allThrNames == lab))
out <- c(out,
paste0("[", i, "$", 1:length(iThr),
ifelse(is.na(values), "", paste("@", values)),
"] (T", idx, ");", collapse = " "))
}
}
## INDICATOR-LEVEL PARAMETERS
hasInts <- object@meanstructure
hasResid <- length(object@numeric) || object@parameterization == "theta"
hasScales <- length(object@ordered) && object@parameterization == "delta"
## assemble comment for this section
if (sum(hasInts, hasResid, hasScales) == 3L) {
out <- c(out, "! INDICATOR INTERCEPTS, RESIDUAL VARIANCES, & SCALING FACTORS")
} else {
element.names <- c("INTERCEPTS","RESIDUAL VARIANCES","SCALING FACTORS")
element.tests <- c(hasInts, hasResid, hasScales)
out <- c(out, paste0("! INDICATOR ", paste(element.names[element.tests],
collapse = " and ")))
}
i.nu <- character()
i.var <- character()
## Loop over indicators
for (i in 1:nrow(object@specify[[group]]$lambda)) {
LIR <- rownames(specify.lambda)[i] # LIR names
RR <- rownames(object@specify[[group]]$lambda)[i]
if (object@meanstructure) {
## INTERCEPTS
N.val <- object@values[[group]]$nu[LIR, 1]
## shorten labels
N.lab <- gsub(pattern = "nu.", replacement = "N",
x = object@labels[[group]]$nu[LIR, 1], fixed = TRUE)
N.lab <- gsub(pattern = ".g", replacement = "_", x = N.lab, fixed = TRUE)
i.nu <- c(i.nu, paste0("[", LIR, ifelse(is.na(N.val), yes = "*",
no = paste0("@", N.val)),
"] (", N.lab, "); "))
}
if (RR %in% object@ordered && object@parameterization == "delta") {
## SCALING FACTORS
E.val <- object@values[[group]]$delta[RR, 1]
E.lab <- ""
i.var <- c(i.var, paste0("{", RR, ifelse(is.na(E.val), yes = "*",
no = paste0("@", E.val)), "};"))
} else {
## RESIDUAL VARIANCES
E.val <- object@values[[group]]$theta[RR, RR]
## shorten labels
E.lab <- gsub(pattern = "theta.", replacement = "E",
x = object@labels[[group]]$theta[RR, RR], fixed = TRUE)
E.lab <- gsub(pattern = ".g", replacement = "_", x = E.lab, fixed = TRUE)
i.var <- c(i.var, paste0(RR, ifelse(is.na(E.val), yes = "*",
no = paste0("@", E.val)),
" (", E.lab, ");"))
}
}
out <- c(out, paste(i.nu, i.var))
E.specify <- object@specify[[group]]$theta
LT <- E.specify & lower.tri(E.specify, diag = FALSE)
if (any(LT)) {
out <- c(out, "! RESIDUAL COVARIANCES")
E.values <- object@values[[group]]$theta
## shorten labels
E.labels <- gsub(pattern = "theta.", replacement = "E",
x = object@labels[[group]]$theta, fixed = TRUE)
E.labels <- gsub(pattern = ".g", replacement = "_",
x = E.labels, fixed = TRUE)
for (CC in 1:(ncol(LT) - 1)) {
if (!any(LT[ , CC])) next
if (sum(LT[ , CC]) == 1L) {
RR <- which(LT[ , CC])
out <- c(out,
paste0(colnames(LT)[CC], " WITH ", rownames(LT)[RR],
ifelse(is.na(E.values[RR, CC]), yes = "",
no = paste("@", E.values[RR, CC])),
" (", E.labels[RR, CC], ");"))
next
}
## else, multiple indicators covary with the variable in column CC
out <- c(out, paste(colnames(LT)[CC], "WITH"))
ind <- names(which(LT[ , CC]))
lastInd <- rev(ind)[1]
for (RR in ind) {
val <- E.values[RR, CC]
out <- c(out, paste0(" ", RR,
if (is.na(val)) "" else paste0("@", val),
" (", E.labels[RR, CC],
")", if (RR == lastInd) ";" else ""))
}
}
}
## FACTOR-LEVEL PARAMETERS
out <- c(out, paste("! FACTOR",
if (object@meanstructure) "INTERCEPTS &" else NULL,
"(RESIDUAL) VARIANCES"))
i.alpha <- character()
i.psi <- character()
## Loop over factors
for (i in rownames(object@specify[[group]]$psi)) {
if (object@meanstructure) {
## INTERCEPTS
A.val <- object@values[[group]]$alpha[i, 1]
## shorten labels
A.lab <- gsub(pattern = "alpha.", replacement = "A",
x = object@labels[[group]]$alpha[i, 1], fixed = TRUE)
A.lab <- gsub(pattern = ".g", replacement = "_", x = A.lab, fixed = TRUE)
i.alpha <- c(i.alpha, paste0("[", i, ifelse(is.na(A.val), yes = "*",
no = paste0("@", A.val)),
"] (", A.lab, "); "))
}
## RESIDUAL VARIANCES
P.val <- object@values[[group]]$psi[i, i]
## shorten labels
P.lab <- gsub(pattern = "psi.", replacement = "P",
x = object@labels[[group]]$psi[i, i], fixed = TRUE)
P.lab <- gsub(pattern = ".g", replacement = "_", x = P.lab, fixed = TRUE)
i.psi <- c(i.psi, paste0(i, ifelse(is.na(P.val), yes = "",
no = paste0("@", P.val)),
" (", P.lab, ");"))
}
out <- c(out, paste(i.alpha, i.psi))
P.specify <- object@specify[[group]]$psi
LT <- P.specify & lower.tri(P.specify, diag = FALSE)
if (any(LT)) {
out <- c(out, "! FACTOR COVARIANCES")
P.values <- object@values[[group]]$psi
## shorten labels
P.labels <- gsub(pattern = "psi.", replacement = "P",
x = object@labels[[group]]$psi, fixed = TRUE)
P.labels <- gsub(pattern = ".g", replacement = "_",
x = P.labels, fixed = TRUE)
for (CC in 1:(ncol(LT) - 1)) {
if (!any(LT[ , CC])) next
if (sum(LT[ , CC]) == 1L) {
RR <- which(LT[ , CC])
out <- c(out,
paste0(colnames(LT)[CC], " WITH ", rownames(LT)[RR],
ifelse(is.na(P.values[RR, CC]), yes = "",
no = paste("@", P.values[RR, CC])),
" (", P.labels[RR, CC], ");"))
next
}
## else, multiple factors covary with the factor in column CC
out <- c(out, paste(colnames(LT)[CC], "WITH"))
ind <- names(which(LT[ , CC]))
lastInd <- rev(ind)[1]
for (RR in ind) {
val <- P.values[RR, CC]
out <- c(out, paste0(" ", RR,
if (is.na(val)) "" else paste0("@", val),
" (", P.labels[RR, CC],
")", if (RR == lastInd) ";" else ""))
}
}
}
## MODEL CONSTRAINTs
if (length(object@constraints) && group == object@ngroups) {
con <- object@constraints
con <- gsub("lambda.", "L", con)
con <- gsub("theta.", "E", con)
con <- gsub("psi.", "P", con)
if (length(object@ordered)) for (th in object@labels[[group]]$tau[ , 1]) {
con <- gsub(th, paste0("T", which(allThrNames == th)), con)
}
if (object@meanstructure) {
con <- gsub("nu.", "N", con)
con <- gsub("alpha.", "A", con)
}
con <- gsub(".g", "_", con)
con <- gsub("==", "=", con)
out <- c(out, "\nMODEL CONSTRAINT:", paste0(con, ";"))
}
#TODO: gsub = for ==, add ";", anything else? object@constraints, "")
paste(out, collapse = "\n")
}
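## Illustrative snippet of the generated Mplus MODEL syntax (hypothetical
## names; lavaan-style labels such as "lambda.2_1" are shortened to "L2_1"):
# ! FACTOR LOADINGS
# F1 BY
#   x1@1 (L1_1)
#   x2* (L2_1);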
# write.OpenMx.syntax <- function(pmat, specify, value, label) {}
## function to allow users to customize syntax with update(),
## so they don't necessarily have to copy/paste a script to adapt it.
override <- function(object, slotName = "values", group = 1L,
matName, row, col, replacement) {
stopifnot(inherits(object, "measEq.syntax"))
MM <- methods::slot(object, slotName)[[group]] # only "values" or "labels"
## check indices
if (is.character(row)) {
if (! row %in% rownames(MM[[matName]]))
stop("'", row, "' not found in rownames(",
deparse(substitute(object)), "@", slotName, "[[", group, "]]$",
matName, ")")
} else if (is.numeric(row)) {
if (! as.integer(row) %in% 1:nrow(MM[[matName]]))
stop("'", as.integer(row), "' is out of bounds for nrow(",
deparse(substitute(object)), "@", slotName, "[[", group, "]]$",
matName, ")")
} else stop('row argument must be a numeric or character index')
## repeat for col
if (matName %in% c("nu","alpha","delta","tau")) col <- 1L else {
if (is.character(col)) {
if (! col %in% colnames(MM[[matName]]))
stop("'", col, "' not found in colnames(",
deparse(substitute(object)), "@", slotName, "[[", group, "]]$",
matName, ")")
} else if (is.numeric(col)) {
if (! as.integer(col) %in% 1:ncol(MM[[matName]]))
stop("'", as.integer(col), "' is out of bounds for ncol(",
deparse(substitute(object)), "@", slotName, "[[", group, "]]$",
matName, ")")
} else stop('col argument must be a numeric or character index')
}
newM <- MM[[matName]]
newM[row, col] <- replacement
if (matName %in% c("theta","psi")) newM[col, row] <- replacement
newM
}
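## Illustrative call (hypothetical object "syn" of class measEq.syntax):
## fix the loading of "x1" on "F1" to 0.8 in group 2's "values" matrices
# syn@values[[2]]$lambda <- override(syn, slotName = "values", group = 2,
#                                    matName = "lambda", row = "x1", col = "F1",
#                                    replacement = 0.8)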
## function to assemble values/labels to update
char2update <- function(object, model, return.object = TRUE) {
stopifnot(inherits(object, "measEq.syntax"))
stopifnot(inherits(model, "character"))
PT <- lavaan::lavParseModelString(model, as.data.frame. = TRUE)
indNames <- lapply(object@values, function(x) rownames(x$lambda)) # per block
facNames <- lapply(object@values, function(x) colnames(x$lambda))
values <- PT$fixed
labels <- PT$label
## check for multigroup specification of values/labels
if (any(grepl(pattern = ";", x = values))) {
values <- strsplit(values, split = ";")
nValues <- sapply(values, length)
} else nValues <- rep(1L, length(values))
if (any(grepl(pattern = ";", x = labels))) {
labels <- strsplit(labels, split = ";")
nLabels <- sapply(labels, length)
} else nLabels <- rep(1L, length(labels))
nBlocks <- length(facNames)
values.DF <- data.frame(NULL)
labels.DF <- data.frame(NULL)
for (RR in 1:nrow(PT)) {
## check whether numbers match
if (nValues[RR] > 1L && nValues[RR] != nBlocks) {
stop('Number of fixed/free values (', nValues[RR],
') specified for parameter "', PT$lhs[RR], PT$op[RR], PT$rhs[RR],
'" does not match the number of groups (', nBlocks, ')')
}
if (nLabels[RR] > 1L && nLabels[RR] != nBlocks) {
stop('Number of labels (', nLabels[RR],
') specified for parameter "', PT$lhs[RR], PT$op[RR], PT$rhs[RR],
'" does not match the number of groups (', nBlocks, ')')
}
## loop over blocks (currently only groups)
for (BB in 1:nBlocks) {
## make template for values and labels, depending on parameter matrix
## INTERCEPTS
if (PT$op[RR] == "~1" && PT$lhs[RR] %in% indNames[[BB]]) {
DF <- data.frame(stringsAsFactors = FALSE, group = BB, matName = "nu",
row = PT$lhs[RR], col = "intercept")
## LATENT MEANS
} else if (PT$op[RR] == "~1" && PT$lhs[RR] %in% facNames[[BB]]) {
DF <- data.frame(stringsAsFactors = FALSE, group = BB, matName = "alpha",
row = PT$lhs[RR], col = "intercept")
## LOADINGS
} else if (PT$op[RR] == "=~" && PT$rhs[RR] %in% indNames[[BB]]) {
DF <- data.frame(stringsAsFactors = FALSE, group = BB, matName = "lambda",
row = PT$rhs[RR], col = PT$lhs[RR])
## SECOND-ORDER LOADINGS
} else if (PT$op[RR] == "=~" && PT$rhs[RR] %in% facNames[[BB]]) {
DF <- data.frame(stringsAsFactors = FALSE, group = BB, matName = "beta",
row = PT$rhs[RR], col = PT$lhs[RR])
## LATENT (CO)VARIANCES
} else if (PT$op[RR] == "~~" && PT$rhs[RR] %in% facNames[[BB]]) {
DF <- data.frame(stringsAsFactors = FALSE, group = BB, matName = "psi",
# symmetry handled in override
row = PT$rhs[RR], col = PT$lhs[RR])
## RESIDUAL (CO)VARIANCES
} else if (PT$op[RR] == "~~" && PT$rhs[RR] %in% indNames[[BB]]) {
DF <- data.frame(stringsAsFactors = FALSE, group = BB, matName = "theta",
# symmetry handled in override
row = PT$rhs[RR], col = PT$lhs[RR])
## THRESHOLDS
} else if (PT$op[RR] == "|") {
if (!length(object@ordered)) {
warning('Thresholds ignored when no indicators are declared as ordered')
}
DF <- data.frame(stringsAsFactors = FALSE, group = BB, matName = "tau",
row = paste0(PT$lhs[RR], "|", PT$rhs[RR]),
col = "threshold")
## SCALING FACTORS (delta parameters for latent item-responses)
} else if (PT$op[RR] == "~*~") {
if (!length(object@ordered)) {
warning('Latent-response scales ignored when no indicators are declared as ordered')
}
if (object@parameterization == "theta") {
warning('Latent-response scales (specified with the "~*~" operator) ',
'ignored when parameterization = "theta"')
}
if (PT$lhs[RR] != PT$rhs[RR]) {
warning('Latent-response scales (specified with the "~*~" operator) ',
'ignored when left- and right-hand side do not match (',
PT$lhs[RR], '~*~', PT$rhs[RR], ')')
next
}
DF <- data.frame(stringsAsFactors = FALSE, group = BB, matName = "delta",
row = PT$lhs[RR], col = "scales")
}
#FIXME? anything that does not match is simply ignored (no error messages)
## change labels?
if (BB > 1L && nLabels[RR] == 1L) {
if (labels[[RR]] != "") {
labels.DF <- rbind(labels.DF, cbind(DF, stringsAsFactors = FALSE,
replacement = labels[[RR]]))
}
} else if (labels[[RR]][BB] != "") {
labels.DF <- rbind(labels.DF, cbind(DF, stringsAsFactors = FALSE,
replacement = labels[[RR]][BB]))
}
## change fixed/free values?
if (BB > 1L && nValues[RR] == 1L) {
if (values[[RR]] != "") {
values.DF <- rbind(values.DF, cbind(DF, stringsAsFactors = FALSE,
replacement = values[[RR]]))
}
} else if (values[[RR]][BB] != "") {
values.DF <- rbind(values.DF, cbind(DF, stringsAsFactors = FALSE,
replacement = values[[RR]][BB]))
}
} # end loop over blocks
} # end loop over parameters
## make sure values are stored as numeric, not character
if (nrow(values.DF)) {
suppressWarnings(values.DF$replacement <- as.numeric(values.DF$replacement))
}
if (return.object) {
object@updates$values <- rbind(object@updates$values, values.DF)
object@updates$labels <- rbind(object@updates$labels, labels.DF)
return(object)
}
## else return the list of data.frames with updates to make
list(values = values.DF, labels = labels.DF)
}
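## Illustrative call (hypothetical object "syn" of class measEq.syntax):
## free x1's loading in group 1 but fix it to 1 in group 2, with
## group-specific labels, using lavaan-style syntax:
# syn <- char2update(syn, model = 'F1 =~ c(NA, 1)*x1 + c(L1_g1, L1_g2)*x1')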
semTools/R/loadingFromAlpha.R 0000644 0001762 0000144 00000001325 14006342740 015613 0 ustar ligges users ### Sunthud Pornprasertmanit
### Last updated: 3 April 2017
#' Find standardized factor loading from coefficient alpha
#'
#' Find standardized factor loading from coefficient alpha assuming that all
#' items have equal loadings.
#'
#' @param alpha A desired coefficient alpha value.
#' @param ni A desired number of items.
#' @return \item{result}{The standardized factor loadings that make desired
#' coefficient alpha with specified number of items.}
#' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
#' @examples
#'
#' loadingFromAlpha(0.8, 4)
#'
#' @export
loadingFromAlpha <- function(alpha, ni) {
denominator <- ni - ((ni - 1) * alpha)
result <- sqrt(alpha/denominator)
return(result)
}
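## e.g., loadingFromAlpha(0.8, 4): denominator = 4 - 3*0.8 = 1.6, so each
## standardized loading = sqrt(0.8 / 1.6) = sqrt(0.5) ~ 0.707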
semTools/R/quark.R 0000644 0001762 0000144 00000046260 14764270116 013546 0 ustar ligges users ### Steven R. Chesnut, Danny Squire, Terrence D. Jorgensen
### Last updated: 12 March 2025
##' Quark
##'
##' The `quark` function provides researchers with the ability to calculate
##' and include component scores that take into account the variance in the
##' original dataset as well as all of the interaction and polynomial effects
##' among its variables.
##'
##' The `quark` function calculates these component scores by first filling
##' in the data via means of multiple imputation methods and then expanding the
##' dataset by aggregating the non-overlapping interaction effects between
##' variables by calculating the mean of the interactions and polynomial
##' effects. The imputation methods include iterative sampling with group-mean
##' substitution, as well as multiple imputation using a polytomous regression
##' algorithm (via the mice package). During the expansion process, the dataset is
##' expanded to three times its normal size (in width). The first third of the
##' dataset contains all of the original data post imputation, the second third
##' contains the means of the polynomial effects (squares and cubes), and the
##' final third contains the means of the non-overlapping interaction effects. A
##' full principal component analysis is conducted, and the individual
##' components are retained. The subsequent [combinequark()] function
##' gives researchers control over how many components to
##' extract and retain. The function returns the dataset as submitted (with
##' missing values) and the component scores as requested for a more accurate
##' multiple imputation in subsequent steps.
##'
##' @param data The data frame is a required component for `quark`. In
##' order for `quark` to process a data frame, it must not contain any
##' factors or text-based variables. All variables must be in numeric format.
##' Identifiers and dates can be left in the data; however, they will need to be
##' identified under the `id` argument.
##' @param id Identifiers and dates within the dataset will need to be
##' acknowledged as `quark` cannot process these. By acknowledging the
##' identifiers and dates as a vector of column numbers or variable names,
##' `quark` will remove them from the data temporarily to complete its main
##' processes. Among many potential issues of not acknowledging identifiers and
##' dates are issues involved with imputation, product and polynomial effects,
##' and principal component analysis.
##' @param order Order is an optional argument provided by quark that can be
##' used when the imputation procedures in mice fail. Under some circumstances,
##' mice cannot impute missing values due to extreme missingness. Should an
##' error occur stating a failure due to not having any columns selected, set
##' the argument `order = 2` to reverse the order of the imputation procedures.
##' Otherwise, use the default `order = 1`.
##' @param silent If `FALSE`, the details of the `quark` process are
##' printed.
##' @param \dots additional arguments to pass to [mice::mice()].
##'
##' @return The output value from using the quark function is a list. It will
##' return a list with 7 components.
##' \item{ID Columns}{Is a vector of the identifier columns entered when
##' running quark.}
##' \item{ID Variables}{Is a subset of the dataset that contains the identifiers
##' as acknowledged when running quark.}
##' \item{Used Data}{Is a matrix / dataframe of the data provided by user as
##' the basis for quark to process.}
##' \item{Imputed Data}{Is a matrix / dataframe of the data after the multiple
##' method imputation process.}
##' \item{Big Matrix}{Is the expanded product and polynomial matrix.}
##' \item{Principal Components}{Is the entire dataframe of principal components
##' for the dataset. This dataset will have the same number of rows of the big
##' matrix, but will have 1 less column (as is the case with principal
##' component analyses).}
##' \item{Percent Variance Explained}{Is a vector of the percent variance
##' explained with each column of principal components.}
##'
##' @author Steven R. Chesnut (University of Southern Mississippi;
##' \email{Steven.Chesnut@@usm.edu})
##'
##' Danny Squire (Texas Tech University)
##'
##' Terrence D. Jorgensen (University of Amsterdam)
##'
##' The PCA code is copied and modified from the `FactoMineR` package.
##'
##' @seealso [combinequark()]
##'
##' @references Howard, W. J., Rhemtulla, M., & Little, T. D. (2015). Using
##' Principal Components as Auxiliary Variables in Missing Data Estimation.
##' *Multivariate Behavioral Research, 50*(3), 285--299.
##' \doi{10.1080/00273171.2014.999267}
##'
##' @examples
##'
##' set.seed(123321)
##'
##' dat <- HolzingerSwineford1939[,7:15]
##' misspat <- matrix(runif(nrow(dat) * 9) < 0.3, nrow(dat))
##' dat[misspat] <- NA
##' dat <- cbind(HolzingerSwineford1939[,1:3], dat)
##' \donttest{
##' quark.list <- quark(data = dat, id = c(1, 2))
##'
##' final.data <- combinequark(quark = quark.list, percent = 80)
##'
##' ## Example to rerun quark after imputation failure:
##' quark.list <- quark(data = dat, id = c(1, 2), order = 2)
##' }
##'
##' @export
quark <- function(data, id, order = 1, silent = FALSE, ...){
if(!is.data.frame(data) && !is.matrix(data)) {
stop("Inappropriate data file provided.")
}
if(!silent) cat("Data Check Passed.\n")
if(is.character(id)) id <- match(id, colnames(data))
for(i in 1:length(id)){
if(id[i] > ncol(data) || id[i] < 1){
stop("At least one of the IDs is out of bounds.")
}
}
if(!silent) cat("ID Check Passed.\n")
if(!(order %in% 1:2)) stop("Currently, the order argument can take either 1 or 2.")
final.collect <- list()
final.collect$ID_Columns <- id
final.collect$ID_Vars <- data[,id]
final.collect$Used_Data <- data[,-c(id)]
##FIXME 26-June-2018: Terrence had to add a logical check for whether mice
## is installed, otherwise won't pass CRAN checks.
checkMice <- requireNamespace("mice", quietly = TRUE)
if (!checkMice) {
message('The quark function requires the "mice" package to be installed.')
return(invisible(NULL))
}
final.collect$Imputed_Data <- imputequark(data = final.collect$Used_Data,
order = order, silent = silent, ...)
final.collect$Big_Data_Matrix <- bigquark(data = final.collect$Imputed_Data,
silent = silent)
cmp <- compquark(data = final.collect$Big_Data_Matrix, silent = silent)
final.collect$Prin_Components <- cmp[[1]]
final.collect$Prin_Components_Prcnt <- cmp[[2]]
return(final.collect)
}
##' Combine the results from the quark function
##'
##' This function builds upon the [quark()] function to provide a
##' final dataset comprised of the original dataset provided to
##' [quark()] and enough principal components to be able to account
##' for a certain level of variance in the data.
##'
##'
##' @param quark Provide the object returned by [quark()]. It should be a
##' list of objects. Make sure to include it in its entirety.
##' @param percent Provide a percentage of variance that you would like to have
##' explained. That many components (columns) will be extracted and kept with
##' the output dataset. Enter this variable as a number WITHOUT a percentage
##' sign.
##'
##' @return The output of this function is the original dataset used in quark
##' combined with enough principal component scores to be able to account for
##' the amount of variance that was requested.
##'
##' @author Steven R. Chesnut (University of Southern Mississippi
##' \email{Steven.Chesnut@@usm.edu})
##'
##' @seealso [quark()]
##'
##' @examples
##'
##' set.seed(123321)
##' dat <- HolzingerSwineford1939[,7:15]
##' misspat <- matrix(runif(nrow(dat) * 9) < 0.3, nrow(dat))
##' dat[misspat] <- NA
##' dat <- cbind(HolzingerSwineford1939[,1:3], dat)
##'
##' quark.list <- quark(data = dat, id = c(1, 2))
##'
##' final.data <- combinequark(quark = quark.list, percent = 80)
##'
##' @export
combinequark <- function(quark, percent) {
data <- cbind(quark$ID_Vars, quark$Used_Data)
pct <- quark$Prin_Components_Prcnt
comp <- quark$Prin_Components
  ## find the first component whose cumulative percentage of variance
  ## meets or exceeds the requested percent
  num <- which(pct >= percent)[1]
  if (is.na(num)) stop("No set of components accounts for ", percent,
                       "% of the variance. Is percent > 100?")
  return(cbind(data, comp[,1:num]))
}
## ----------------
## Hidden Functions
## ----------------
imputequark <- function(data, order, silent = FALSE, ...){
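  ## order == 1: algorithmic (mice) imputation first, then the
  ##             correlation-guided group-mean step (gImp)
  ## order == 2: gImp first, then mice only for any cells still missing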
if (order == 1){
data <- aImp(data = data, silent = silent, ...)
data <- gImp(data = data, silent = silent)
} else if(order == 2) {
data <- gImp(data = data, silent = silent)
    if (any(is.na(data))) { # impute any cells still missing
data <- aImp(data = data, silent = silent, ...)
}
}
return(data)
}
#' @importFrom stats cor
gImp <- function(data, silent = FALSE) {
imputed_data <- data
num_adds <- vector(length = ncol(data)) # number of columns combined into one for averaging.
data.cor <- cor(data, use = "pairwise", method = "pearson")
class(data.cor) <- c("lavaan.matrix.symmetric","matrix")
if (!silent) print(data.cor)
  ## For each column with a small number (2-10) of unique values, compute the
  ## category-conditional means of every column; those means inform imputation
  ## of any column that correlates strongly (|r| >= .5) with the reference column.
if (!silent) cat("Imputing Column... \n")
for (a in 1:ncol(data)) {
temp_mat <- matrix(ncol = ncol(data), nrow = nrow(data))
    vals <- unique(sort(data[,a])) # sort() drops missing values
    if (length(vals) > 1 && length(vals) <= 10) {
      for (b in 1:nrow(data)) {
        for (c in 1:length(vals)) {
          if (!is.na(data[b,a]) && data[b,a] == vals[c]) { # check for NA first
            temp_mat[b,] <- round(colMeans(subset(data, data[ , a] == vals[c]),
                                           na.rm = TRUE), digits = 1)
          } else if (is.na(data[b,a])) {
            for (p in 1:ncol(data)) temp_mat[b,p] <- data[b,p]
          }
}
}
# Here I need to determine if the other columns are correlated enough with
# the reference to ensure accuracy of predictions
temp_cor <- data.cor[,a]
# if (countNA(temp_cor)==0) {
for (i in 1:length(temp_cor)) {
if (i != a) {
if (abs(temp_cor[i]) >= .5 && !is.na(temp_cor[i])) { # Using a moderate effect size, column a, will inform other columns.
for (x in 1:nrow(imputed_data)){
              ## add column a's category-implied prediction of column i
              imputed_data[x,i] <- sum(imputed_data[x,i], temp_mat[x,i], na.rm = TRUE)
}
num_adds[i] <- num_adds[i] + 1
}
}
}
#}
if (!silent) cat("\t", colnames(data)[a])
}
}
if (!silent) cat("\n")
imputed_data <- cleanMat(m1 = data, m2 = imputed_data, impact = num_adds)
imputed_data <- fixData(imputed_data)
return(imputed_data)
}
cleanMat <- function(m1, m2, impact) {
  ## impact[i] = number of columns whose predictions were added to column i.
  ## Observed values are restored from m1; for originally missing cells, the
  ## summed predictions in m2 are averaged by dividing by impact.
new_mat <- m2
for (a in 1:ncol(m1)) {
for (b in 1:nrow(m1)) {
if (!is.na(m1[b,a])) {
new_mat[b,a] <- m1[b,a]
} else if (is.na(m1[b,a])) {
new_mat[b,a] <- new_mat[b,a] / impact[a]
}
}
}
return(new_mat)
}
fixData <- function(data) {
  ## round every value to 1 decimal place
  round(data, digits = 1)
}
aImp <- function(data, silent = FALSE, ...) {
miceArgs <- list(...)
miceArgs$data <- data
miceArgs$maxit <- 1
miceArgs$m <- 1
miceArgs$printFlag <- !silent
if (!("package:mice" %in% search())) attachNamespace("mice")
if (!silent) cat("Starting Algorithm Imputation...\n")
impData <- mice::complete(do.call("mice", miceArgs))
if (!silent) cat("Ending Algorithm Imputation...\n")
return(impData)
}
bigquark <- function(data, silent = FALSE) {
if (!silent) cat("Calculating Polynomial Effects.\n")
poly <- ((data^2)+(data^3))/2
if (!silent) cat("Creating Matrix for Interaction Effects.\n")
prod <- matrix(ncol=(ncol(data)-1),nrow=nrow(data))
if (!silent) cat("Calculating Interaction Effects...0%..")
for (i in 1:nrow(data)) {
if (!silent) printpct(percent = i/nrow(data))
for (j in 1:(ncol(data)-1)) {
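      ## mean cross-product of variable j with all later variables, for row i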
prod[i,j] <- mean(as.numeric(data[i,j])*as.numeric(data[i,(j+1):ncol(data)]))
}
}
cat("\n")
data <- cbind(data,poly,prod)
return(data)
}
compquark <- function(data, silent = FALSE) {
if (!silent) cat("Calculating values for the PCA\n")
pcam <- pcaquark(data, ncp = ncol(data))
cmp <- list()
cmp$pca <- pcam$ind$coord
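  ## third column of pcaquark's eigenvalue table = cumulative % of variance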
cmp$var <- pcam$eig[,3]
colnames(cmp$pca) <- c(paste0("AuxVar",1:ncol(cmp$pca)))
return(cmp)
}
printpct <- function(percent) {
  ## print progress at each 10% increment (0%, 10%, ..., 100%)
  pct <- round(percent, digits = 10)
  marks <- round(seq(0, 1, by = .1), digits = 10)
  if (pct %in% marks) cat(paste0(round(100 * pct), "%.."))
}
## This function is modified from the FactoMineR package.
pcaquark <- function(X, ncp = 5) {
moy.p <- function(V, poids) res <- sum(V * poids)/sum(poids)
ec <- function(V, poids) res <- sqrt(sum(V^2 * poids)/sum(poids))
X <- as.data.frame(X)
if (any(is.na(X))) {
warnings("Missing values are imputed by the mean of the variable: you should use the imputePCA function of the missMDA package")
X[is.na(X)] <- matrix(apply(X,2,mean,na.rm=TRUE),ncol=ncol(X),nrow=nrow(X),byrow=TRUE)[is.na(X)]
}
if (is.null(rownames(X))) rownames(X) <- 1:nrow(X)
if (is.null(colnames(X))) colnames(X) <- paste("V", 1:ncol(X), sep = "")
colnames(X)[colnames(X) == ""] <- paste("V", 1:sum(colnames(X)==""),sep="")
rownames(X)[is.null(rownames(X))] <- paste("row",1:sum(rownames(X)==""),sep="")
Xtot <- X
if (any(!sapply(X, is.numeric))) {
auxi <- NULL
for (j in 1:ncol(X)) if (!is.numeric(X[, j])) auxi <- c(auxi, colnames(X)[j])
stop(paste("\nThe following variables are not quantitative: ", auxi))
}
ncp <- min(ncp, nrow(X) - 1, ncol(X))
row.w <- rep(1, nrow(X))
row.w.init <- row.w
row.w <- row.w/sum(row.w)
col.w <- rep(1, ncol(X))
centre <- apply(X, 2, moy.p, row.w)
X <- as.matrix(sweep(as.matrix(X), 2, centre, FUN = "-"))
ecart.type <- apply(X, 2, ec, row.w)
ecart.type[ecart.type <= 1e-16] <- 1
X <- sweep(as.matrix(X), 2, ecart.type, FUN = "/")
dist2.ind <- apply(sweep(X,2,sqrt(col.w),FUN="*")^2,1,sum)
dist2.var <- apply(sweep(X,1,sqrt(row.w),FUN="*")^2,2,sum)
tmp <- svd.triplet.quark(X, row.w = row.w, col.w = col.w, ncp = ncp)
eig <- tmp$vs^2
vp <- as.data.frame(matrix(NA, length(eig), 3))
rownames(vp) <- paste("comp", 1:length(eig))
colnames(vp) <- c("eigenvalue","percentage of variance",
"cumulative percentage of variance")
vp[, "eigenvalue"] <- eig
vp[, "percentage of variance"] <- (eig/sum(eig)) * 100
vp[, "cumulative percentage of variance"] <- cumsum(vp[, "percentage of variance"])
V <- tmp$V
U <- tmp$U
eig <- eig[1:ncp]
coord.ind <- sweep(as.matrix(U), 2, sqrt(eig), FUN = "*")
coord.var <- sweep(as.matrix(V), 2, sqrt(eig), FUN = "*")
contrib.var <- sweep(as.matrix(coord.var^2), 2, eig, "/")
contrib.var <- sweep(as.matrix(contrib.var), 1, col.w, "*")
dist2 <- dist2.var
cor.var <- sweep(as.matrix(coord.var), 1, sqrt(dist2), FUN = "/")
cos2.var <- cor.var^2
rownames(coord.var) <- rownames(cos2.var) <- rownames(cor.var) <- rownames(contrib.var) <- colnames(X)
colnames(coord.var) <- colnames(cos2.var) <- colnames(cor.var) <- colnames(contrib.var) <- paste("Dim", c(1:ncol(V)), sep = ".")
res.var <- list(coord = coord.var[, 1:ncp], cor = cor.var[, 1:ncp],
cos2 = cos2.var[, 1:ncp], contrib = contrib.var[, 1:ncp] * 100)
dist2 <- dist2.ind
cos2.ind <- sweep(as.matrix(coord.ind^2), 1, dist2, FUN = "/")
contrib.ind <- sweep(as.matrix(coord.ind^2), 1, row.w/sum(row.w), FUN = "*")
contrib.ind <- sweep(as.matrix(contrib.ind), 2, eig, FUN = "/")
rownames(coord.ind) <- rownames(cos2.ind) <- rownames(contrib.ind) <- names(dist2) <- rownames(X)
colnames(coord.ind) <- colnames(cos2.ind) <- colnames(contrib.ind) <- paste("Dim", c(1:ncol(U)), sep = ".")
res.ind <- list(coord = coord.ind[, 1:ncp], cos2 = cos2.ind[, 1:ncp],
contrib = contrib.ind[, 1:ncp] * 100, dist = sqrt(dist2))
res <- list(eig = vp, var = res.var, ind = res.ind, svd = tmp)
class(res) <- c("PCA", "list")
return(res)
}
## This function is modified from the FactoMineR package.
svd.triplet.quark <- function (X, row.w = NULL, col.w = NULL, ncp = Inf) {
tryCatch.W.E <- function(expr) { ## function proposed by Maechlmr
W <- NULL
w.handler <- function(w) { # warning handler
W <<- w
invokeRestart("muffleWarning")
}
list(value = withCallingHandlers(tryCatch(expr, error = function(e) e),
warning = w.handler), warning = W)
}
ncp <- min(ncp,nrow(X)-1,ncol(X))
row.w <- row.w / sum(row.w)
X <- sweep(X, 2, sqrt(col.w), FUN = "*")
X <- sweep(X, 1, sqrt(row.w), FUN = "*")
if (ncol(X) < nrow(X)) {
svd.usuelle <- tryCatch.W.E(svd(X, nu = ncp, nv = ncp))$val
if (names(svd.usuelle)[[1]] == "message") {
svd.usuelle <- tryCatch.W.E(svd(t(X), nu = ncp, nv = ncp))$val
if (names(svd.usuelle)[[1]] == "d") {
aux <- svd.usuelle$u
svd.usuelle$u <- svd.usuelle$v
svd.usuelle$v <- aux
} else {
bb <- eigen(t(X) %*% X, symmetric = TRUE)
svd.usuelle <- vector(mode = "list", length = 3)
      svd.usuelle$d <- bb$values # eigenvalues of crossprod(X) = squared singular values
      svd.usuelle$d[svd.usuelle$d < 0] <- 0
svd.usuelle$d <- sqrt(svd.usuelle$d)
svd.usuelle$v <- bb$vec[,1:ncp]
svd.usuelle$u <- sweep(X %*% svd.usuelle$v, 2, svd.usuelle$d[1:ncp], FUN = "/")
}
}
U <- svd.usuelle$u
V <- svd.usuelle$v
if (ncp > 1) {
mult <- sign(apply(V, 2, sum))
mult[mult == 0] <- 1
U <- sweep(U, 2, mult, FUN = "*")
V <- sweep(V, 2, mult, FUN = "*")
}
U <- sweep(as.matrix(U), 1, sqrt(row.w), FUN = "/")
V <- sweep(as.matrix(V), 1, sqrt(col.w), FUN = "/")
} else {
svd.usuelle <- tryCatch.W.E(svd(t(X), nu = ncp, nv = ncp))$val
if (names(svd.usuelle)[[1]] == "message") {
svd.usuelle <- tryCatch.W.E(svd(X, nu = ncp, nv = ncp))$val
if (names(svd.usuelle)[[1]] == "d") {
aux <- svd.usuelle$u
svd.usuelle$u <- svd.usuelle$v
svd.usuelle$v <- aux
} else {
bb <- eigen(X%*%t(X),symmetric=TRUE)
svd.usuelle <- vector(mode = "list", length = 3)
      svd.usuelle$d <- bb$values # eigenvalues of tcrossprod(X) = squared singular values
      svd.usuelle$d[svd.usuelle$d < 0] <- 0
svd.usuelle$d <- sqrt(svd.usuelle$d)
svd.usuelle$v <- bb$vec[,1:ncp]
svd.usuelle$u <- sweep(t(X) %*% svd.usuelle$v, 2, svd.usuelle$d[1:ncp], FUN = "/")
}
}
U <- svd.usuelle$v
V <- svd.usuelle$u
mult <- sign(apply(V, 2, sum))
mult[mult == 0] <- 1
V <- sweep(V, 2, mult, FUN = "*")
U <- sweep(U, 2, mult, FUN = "*")
U <- sweep(U, 1, sqrt(row.w), FUN = "/")
V <- sweep(V, 1, sqrt(col.w), FUN = "/")
}
vs <- svd.usuelle$d[1:min(ncol(X), nrow(X) - 1)]
num <- which(vs[1:ncp] < 1e-15)
if (length(num)==1) {
U[,num] <- U[,num] * vs[num]
V[,num] <- V[,num] * vs[num]
}
if (length(num) > 1) {
U[,num] <- sweep(U[,num], 2, vs[num], FUN = "*")
V[,num] <- sweep(V[,num], 2, vs[num], FUN = "*")
}
res <- list(vs = vs, U = U, V = V)
return(res)
}
semTools/R/TSML.R
## Terrence D. Jorgensen
### Last updated: 12 February 2025
### semTools function to implement 2-stage ML
## -----------------
## Class and Methods
## -----------------
##' Class for the Results of 2-Stage Maximum Likelihood (TSML) Estimation for
##' Missing Data
##'
##' This class contains the results of 2-Stage Maximum Likelihood (TSML)
##' estimation for missing data. The `summary`, `anova`, `vcov`
##' methods return corrected *SE*s and test statistics. Other methods are
##' simply wrappers around the corresponding [lavaan::lavaan-class]
##' methods.
##'
##'
##' @name twostage-class
##' @aliases twostage-class show,twostage-method summary,twostage-method
##' anova,twostage-method vcov,twostage-method coef,twostage-method
##' fitted.values,twostage-method fitted,twostage-method
##' residuals,twostage-method resid,twostage-method nobs,twostage-method
##' @docType class
##'
##' @slot saturated A fitted [lavaan::lavaan-class] object containing the
##' saturated model results
##' @slot target A fitted [lavaan::lavaan-class] object containing the
##' target/hypothesized model results
##' @slot baseline A fitted [lavaan::lavaan-class] object containing the
##' baseline/null model results
##' @slot auxNames A character string (potentially of `length == 0`) of any
##' auxiliary variable names, if used
##'
##' @param object An object of class `twostage`.
##' @param ... arguments passed to [lavaan::parameterEstimates()].
##' @param h1 An object of class `twostage` in which `object` is
##' nested, so that their difference in fit can be tested using
##' `anova` (see **Value** section for details).
##' @param baseline `logical` indicating whether to return results for the
##' baseline model, rather than the default target (hypothesized) model.
##' @param type The meaning of this argument varies depending on which method
##' it is used for. Find detailed descriptions in the **Value** section
##' under `coef`, `nobs`, and `residuals`.
##' @param model `character` naming the slot for which to return the
##' model-implied sample moments (see the `fitted.values` description).
##' @param labels `logical` indicating whether the model-implied sample
##' moments should have (row/column) labels.
##'
##' @return
##' \item{show}{`signature(object = "twostage"):` The `show` function
##' is used to display the results of the `anova` method, as well as the
##' header of the (uncorrected) target model results.}
##' \item{summary}{`signature(object = "twostage", ...):` The summary
##' function prints the same information from the `show` method, but also
##' provides (and returns) the output of
##' `parameterEstimates(object@target, ...)` with corrected
##' *SE*s, test statistics, and confidence intervals. Additional
##' arguments can be passed to [lavaan::parameterEstimates()],
##' including `fmi = TRUE` to provide an estimate of the fraction of
##' missing information.}
##' \item{anova}{`signature(object = "twostage", h1 = NULL, baseline = FALSE):`
##' The `anova` function returns the residual-based \eqn{\chi^2} test
##' statistic result, as well as the scaled \eqn{\chi^2} test statistic result,
##' for the model in the `target` slot, or for the model in the
##' `baseline` slot if `baseline = TRUE`. The user can also provide
##' a single additional `twostage` object to the `h1` argument, in
##' which case `anova` returns residual-based and scaled
##' (\eqn{\Delta})\eqn{\chi^2} test results, under the assumption that the
##' models are nested. The models will be automatically sorted according to
##' their
##' degrees of freedom.}
##' \item{nobs}{`signature(object = "twostage",
##' type = c("ntotal", "ngroups", "n.per.group", "norig", "patterns", "coverage")):`
##' The `nobs` function will return the total sample size used in the
##' analysis by default. Also available are the number of groups or the sample
##' size per group, the original sample size (if any rows were deleted because
##' all variables were missing), the missing data patterns, and the matrix of
##' coverage (diagonal is the proportion of sample observed on each variable,
##' and off-diagonal is the proportion observed for both of each pair of
##' variables).}
##' \item{coef}{`signature(object = "twostage", type = c("free", "user")):`
##' This is simply a wrapper around the corresponding
##' [lavaan::lavaan-class] method, providing point estimates from the
##' `target` slot.}
##' \item{vcov}{`signature(object = "twostage", baseline = FALSE):` Returns
##' the asymptotic covariance matrix of the estimated parameters (corrected for
##' additional uncertainty due to missing data) for the model in the
##' `target` slot, or for the model in the `baseline` slot if
##' `baseline = TRUE`.}
##' \item{fitted.values, fitted}{`signature(object = "twostage",
##' model = c("target", "saturated", "baseline")):` This is simply a wrapper
##' around the corresponding [lavaan::lavaan-class] method, providing
##' model-implied sample moments from the slot specified in the `model`
##' argument.}
##' \item{residuals, resid}{`signature(object = "twostage", type = c("raw",
##' "cor", "normalized", "standardized")):` This is simply a wrapper around the
##' corresponding [lavaan::lavaan-class] method, providing residuals of
##' the specified `type` from the `target` slot.}
##'
##' @section Objects from the Class: Objects can be created via the
##' [twostage()] function.
##'
##' @author Terrence D. Jorgensen (University of Amsterdam;
##' \email{TJorgensen314@@gmail.com})
##'
##' @seealso [twostage()]
##'
##' @examples
##'
##' # See the example from the twostage function
##'
setClass("twostage",
slots = c(saturated = "lavaan", target = "lavaan", baseline = "lavaan",
auxNames = "character"))
##' @rdname twostage-class
##' @aliases show,twostage-method
##' @export
setMethod("show", "twostage", function(object) {
## show chi-squared test results
cat("Chi-squared test(s) results, ADJUSTED for missing data:\n\n")
getMethod("anova", "twostage")(object)
cat("\n\nChi-squared test results, UNADJUSTED for missing data:\n\n")
show(object@target)
invisible(object)
})
##' @rdname twostage-class
##' @aliases summary,twostage-method
##' @importFrom stats pnorm qnorm
##' @importFrom lavaan parTable
##' @export
setMethod("summary", "twostage", function(object, ...) {
## show chi-squared test results AND estimates
getMethod("show", "twostage")(object)
cat("\n\nParameter Estimates, with SEs (and tests/CIs) ADJUSTED for missing data:\n\n")
dots <- list(...)
if (!"fmi" %in% names(dots)) dots$fmi <- FALSE
if (!"ci" %in% names(dots)) dots$ci <- TRUE
if (!"level" %in% names(dots)) dots$level <- .95
PT <- parTable(object@target)
PT <- PT[PT$group > 0, ]
PE <- do.call(lavaan::parameterEstimates, c(dots, object = object@target))
SEs <- sqrt(diag(getMethod("vcov", "twostage")(object)))
PE$se[PT$free > 0] <- SEs[PT$free]
PE$z[PT$free > 0] <- PE$est[PT$free > 0] / PE$se[PT$free > 0]
PE$pvalue[PT$free > 0] <- pnorm(abs(PE$z[PT$free > 0]), lower.tail = FALSE)*2
if (dots$ci) {
crit <- qnorm(1 - (1 - dots$level) / 2)
PE$ci.lower[PT$free > 0] <- PE$est[PT$free > 0] - crit * PE$se[PT$free > 0]
PE$ci.upper[PT$free > 0] <- PE$est[PT$free > 0] + crit * PE$se[PT$free > 0]
}
if (dots$fmi) {
compVar <- diag(lavaan::vcov(object@target))[PT$free] # uncorrected Stage-2 SE estimates
# compFit <- lavaan::update(object@target, sample.nobs = lavaan::nobs(object@target),
# sample.cov = lavInspect(object@target, "cov.ov"),
# sample.mean = lavInspect(object@target, "mean.ov"))
# compVar <- diag(lavaan::vcov(compFit))[PT$free]
missVar <- SEs^2
PE$fmi[PT$free > 0] <- 1 - compVar / missVar
}
PE
})
## (hidden) function utilized by vcov and anova methods
##' @importFrom lavaan lavInspect parTable
twostageMatrices <- function(object, baseline) {
SLOT <- if (baseline) "baseline" else "target"
## extract parameter table to isolate estimates by group
PTsat <- parTable(object@saturated)
nG <- max(PTsat$group)
isMG <- nG > 1L
## model derivatives
delta <- lavInspect(slot(object, SLOT), "delta")
if (!isMG) delta <- list(delta)
for (g in 1:nG) {
covparams <- grep(pattern = "~~", x = rownames(delta[[g]]))
meanparams <- grep(pattern = "~1", x = rownames(delta[[g]]))
delta[[g]] <- delta[[g]][c(covparams, meanparams), ]
}
## stack groups' deltas into 1 matrix
delta <- do.call(rbind, delta)
## extract estimated moments from saturated model, and number of moments
satSigma <- lavInspect(object@saturated, "cov.ov")
satMu <- lavInspect(object@saturated, "mean.ov")
if (!isMG) {
satSigma <- list(satSigma)
satMu <- list(satMu)
}
if (length(object@auxNames)) {
an <- object@auxNames
tn <- lavaan::lavNames(slot(object, SLOT))
for (g in 1:nG) {
satSigma[[g]] <- satSigma[[g]][tn, tn]
satMu[[g]] <- satMu[[g]][tn]
}
}
p <- length(satMu[[1]])
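  ## p* = number of unique (co)variances among p observed variables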
pStar <- p*(p + 1) / 2
## extract model-implied moments
muHat <- lavInspect(slot(object, SLOT), "mean.ov")
sigmaHat <- lavInspect(slot(object, SLOT), "cov.ov")
if (!isMG) {
sigmaHat <- list(sigmaHat)
muHat <- list(muHat)
}
shinv <- list()
for (g in 1:nG) {
muHat[[g]] <- muHat[[g]][names(satMu[[g]])]
sigmaHat[[g]] <- sigmaHat[[g]][rownames(satSigma[[g]]), colnames(satSigma[[g]])]
shinv[[g]] <- solve(sigmaHat[[g]])
}
## assemble complete-data information matrix
H <- list()
for (g in 1:nG) H[[g]] <- matrix(0, (pStar + p), (pStar + p))
# if (lavInspect(slot(object, SLOT), "options")$estimator == "expected") {
if (TRUE) {
## complete-data expected information (Savalei, 2010, Eq. 6; Savalei & Bentler, 2009, Eq. 7)
for (g in 1:nG) {
H[[g]][1:pStar, 1:pStar] <- .5*lavaan::lav_matrix_duplication_pre_post(shinv[[g]] %x% shinv[[g]])
H[[g]][(pStar + 1):(pStar + p), (pStar + 1):(pStar + p)] <- shinv[[g]]
}
} else {
## information == "observed" (Savalei, 2010, Eq. 5)
dMu <- list()
for (g in 1:nG) {
dMu[[g]] <- satMu[[g]] - muHat[[g]]
H[[g]][1:pStar, 1:pStar] <- lavaan::lav_matrix_duplication_pre_post(shinv[[g]] %x% (shinv[[g]] %*% (satSigma[[g]] + dMu[[g]] %*% t(dMu[[g]])) %*% shinv[[g]] - .5*shinv[[g]]))
H[[g]][(pStar + 1):(pStar + p), 1:pStar] <- lavaan::lav_matrix_duplication_post(shinv[[g]] %x% (t(dMu[[g]]) %*% shinv[[g]]))
H[[g]][1:pStar, (pStar + 1):(pStar + p)] <- t(H[[g]][(pStar + 1):(pStar + p), 1:pStar])
H[[g]][(pStar + 1):(pStar + p), (pStar + 1):(pStar + p)] <- shinv[[g]]
}
}
## combine into 1 block-diagonal matrix
H <- do.call(lavaan::lav_matrix_bdiag, H)
## asymptotic information and covariance matrices of target model
satACOV <- lavaan::vcov(object@saturated) #FIXME: inverting this is NOT equivalent to
satInfo <- solve(satACOV * lavaan::nobs(object@saturated)) # weighing by first-order info
## all(round(acov*N, 8) == round(solve(info), 8))
## all(round(acov, 8) == round(solve(info)/N, 8))
if (length(object@auxNames)) {
dimTar <- !(PTsat$lhs %in% an | PTsat$rhs %in% an)
dimAux <- PTsat$lhs %in% an | PTsat$rhs %in% an
infoTar <- satInfo[dimTar, dimTar]
infoAux <- satInfo[dimAux, dimAux]
infoAT <- satInfo[dimAux, dimTar]
satInfo <- infoTar - t(infoAT) %*% solve(infoAux) %*% infoAT
satACOV <- solve(satInfo) / lavaan::nobs(object@saturated)
}
list(delta = delta, H = H, satACOV = satACOV, satInfo = satInfo)
}
## (hidden?) function utilized by anova method to test 1 or 2 models
##' @importFrom stats pchisq
##' @importFrom lavaan lavInspect
twostageLRT <- function(object, baseline, print = FALSE) {
SLOT <- if (baseline) "baseline" else "target"
## calculate model derivatives and complete-data information matrix
MATS <- twostageMatrices(object, baseline)
## residual-based statistic (Savalei & Bentler, 2009, eq. 8)
N <- lavaan::nobs(slot(object, SLOT))
nG <- lavInspect(slot(object, SLOT), "ngroups")
res <- lavaan::residuals(slot(object, SLOT))
if (nG == 1L) res <- list(res)
etilde <- do.call(c, lapply(res, function(x) c(lavaan::lav_matrix_vech(x$cov), x$mean)))
ID <- MATS$satInfo %*% MATS$delta
T.res <- N*t(etilde) %*% (MATS$satInfo - ID %*% MASS::ginv(t(MATS$delta) %*% ID) %*% t(ID)) %*% etilde # FIXME: why not solve()?
DF <- lavInspect(slot(object, SLOT), "fit")[["df"]]
pval.res <- pchisq(T.res, df = DF, lower.tail = FALSE)
residual <- c(chisq = T.res, df = DF, pvalue = pval.res)
class(residual) <- c("lavaan.vector","numeric")
## scaled test statistic (Savalei & Bentler, 2009, eq. 9)
meat <- MATS$H %*% MATS$delta
bread <- MASS::ginv(t(MATS$delta) %*% meat) # FIXME: why not solve()?
cc <- DF / sum(diag(N*MATS$satACOV %*% (MATS$H - meat %*% bread %*% t(meat))))
chisq <- lavaan::lavTest(slot(object, SLOT))$stat
T.scaled <- cc * chisq
pval.scaled <- pchisq(T.scaled, df = DF, lower.tail = FALSE)
scaled <- c(chisq.naive = chisq, scaling.factor = 1 / cc,
chisq.scaled = T.scaled, df = DF, pvalue = pval.scaled)
class(scaled) <- c("lavaan.vector","numeric")
## return both statistics
if (print) {
if (lavInspect(object@saturated, "options")$se == "standard") {
cat("Browne (1984) residual-based test statistic:\n\n")
print(residual)
}
cat("\n\nSatorra-Bentler (2001) scaled test statistic:\n\n")
print(scaled)
}
invisible(list(residual = residual, scaled = scaled))
}
##' @rdname twostage-class
##' @aliases anova,twostage-method
##' @importFrom lavaan lavInspect
##' @export
setMethod("anova", "twostage", function(object, h1 = NULL, baseline = FALSE) {
if (is.null(h1)) {
return(twostageLRT(object, baseline, print = TRUE))
}
H0 <- twostageLRT(object, baseline = FALSE)
H1 <- twostageLRT(h1, baseline = FALSE)
DF0 <- H0$residual[["df"]]
DF1 <- H1$residual[["df"]]
if (DF0 == DF1) stop("Models have the same degrees of freedom.")
if (min(c(DF0, DF1)) == 0L) return(twostageLRT(object, baseline, print = TRUE))
parent <- which.min(c(DF0, DF1))
if (parent == 1L) {
parent <- H0
H0 <- H1
H1 <- parent
DF0 <- H0$residual[["df"]]
DF1 <- H1$residual[["df"]]
}
DF <- DF0 - DF1
## residual-based statistic
T.res <- H0$residual[["chisq"]] - H1$residual[["chisq"]]
residual <- c(chisq = T.res, df = DF,
pvalue = pchisq(T.res, df = DF, lower.tail = FALSE))
class(residual) <- c("lavaan.vector","numeric")
## scaled test statistic
chisq.naive <- H0$scaled[["chisq.naive"]] - H1$scaled[["chisq.naive"]]
cc <- (DF0*H0$scaled[["scaling.factor"]] - DF1*H1$scaled[["scaling.factor"]]) / DF
if (cc < 0) {
warning("Scaling factor is negative, so it was set to missing.")
cc <- NA
}
scaled <- c(chisq.naive = chisq.naive, scaling.factor = cc,
              chisq.scaled = chisq.naive / cc, df = DF,
pvalue = pchisq(chisq.naive / cc, df = DF, lower.tail = FALSE))
class(scaled) <- c("lavaan.vector","numeric")
## return both statistics
if (lavInspect(object@saturated, "options")$se == "standard") {
cat("Difference test for Browne (1984) residual-based statistics:\n\n")
print(residual)
}
cat("\n\nSatorra-Bentler (2001) scaled difference test:\n\n")
print(scaled)
invisible(list(residual = residual, scaled = scaled))
})
##' @rdname twostage-class
##' @aliases nobs,twostage-method
##' @importFrom lavaan lavInspect
##' @export
setMethod("nobs", "twostage",
function(object, type = c("ntotal","ngroups","n.per.group","norig",
"patterns","coverage")) {
type <- type[1]
if (type == "n.per.group") type <- "nobs"
lavInspect(object@saturated, what = type)
})
##' @rdname twostage-class
##' @aliases coef,twostage-method
##' @export
setMethod("coef", "twostage", function(object, type = c("free","user")) {
type <- type[1]
lavaan::coef(object@target, type = type)
})
##' @rdname twostage-class
##' @aliases vcov,twostage-method
##' @export
setMethod("vcov", "twostage", function(object, baseline = FALSE) {
SLOT <- if (baseline) "baseline" else "target"
## calculate model derivatives and complete-data information matrix
MATS <- twostageMatrices(object, baseline)
meat <- MATS$H %*% MATS$delta
bread <- MASS::ginv(t(MATS$delta) %*% meat) # FIXME: why not solve()?
out <- bread %*% t(meat) %*% MATS$satACOV %*% meat %*% bread
class(out) <- c("lavaan.matrix.symmetric","matrix")
if (baseline) {
rownames(out) <- names(getMethod("coef", "lavaan")(object@baseline))
} else {
rownames(out) <- names(getMethod("coef", "twostage")(object))
}
colnames(out) <- rownames(out)
out
})
##' @rdname twostage-class
##' @aliases fitted.values,twostage-method
##' @export
setMethod("fitted.values", "twostage",
function(object, model = c("target","saturated","baseline"),
type = "moments", labels = TRUE) {
model <- model[1]
lavaan::fitted.values(slot(object, model), type = type, labels = labels)
})
##' @rdname twostage-class
##' @aliases fitted,twostage-method
##' @export
setMethod("fitted", "twostage",
function(object, model = c("target","saturated","baseline"),
type = "moments", labels = TRUE) {
model <- model[1]
lavaan::fitted.values(slot(object, model), type = type, labels = labels)
})
##' @rdname twostage-class
##' @aliases residuals,twostage-method
##' @export
setMethod("residuals", "twostage",
function(object, type = c("raw","cor","normalized","standardized")) {
type <- type[1]
lavaan::residuals(object@target, type = type)
})
##' @rdname twostage-class
##' @aliases resid,twostage-method
##' @export
setMethod("resid", "twostage",
function(object, type = c("raw","cor","normalized","standardized")) {
type <- type[1]
lavaan::residuals(object@target, type = type)
})
# fitS <- cfa(model = model, data = dat1, missing = "fiml", se = "standard")
# fitR <- cfa(model = model, data = dat1, missing = "fiml", se = "robust.huber.white")
# all(lavInspect(fitS, "information") == lavInspect(fitR, "information"))
# all(vcov(fitS) == vcov(fitR))
## ---------------------
## Constructor Functions
## ---------------------
##' Fit a lavaan model using 2-Stage Maximum Likelihood (TSML) estimation for
##' missing data.
##'
##' This function automates 2-Stage Maximum Likelihood (TSML) estimation,
##' optionally with auxiliary variables. Step 1 involves fitting a saturated
##' model to the partially observed data set (to variables in the hypothesized
##' model as well as auxiliary variables related to missingness). Step 2
##' involves fitting the hypothesized model to the model-implied means and
##' covariance matrix (also called the "EM" means and covariance matrix) as if
##' they were complete data. Step 3 involves correcting the Step-2 standard
##' errors (*SE*s) and chi-squared statistic to account for additional
##' uncertainty due to missing data (using information from Step 1; see
##' References section for sources with formulas).
##'
##' All variables (including auxiliary variables) are treated as endogenous
##' variables in the Step-1 saturated model (`fixed.x = FALSE`), so data
##' are assumed continuous, although not necessarily multivariate normal
##' (dummy-coded auxiliary variables may be included in Step 1, but categorical
##' endogenous variables in the Step-2 hypothesized model are not allowed). To
##' avoid assuming multivariate normality, request `se =
##' "robust.huber.white"`. CAUTION: In addition to setting `fixed.x =
##' FALSE` and `conditional.x = FALSE` in [lavaan::lavaan()],
##' this function will automatically set `meanstructure = TRUE`,
##' `estimator = "ML"`, `missing = "fiml"`, and `test =
##' "standard"`. [lavaan::lavaan()]'s `se` option can only be
##' set to `"standard"` to assume multivariate normality or to
##' `"robust.huber.white"` to relax that assumption.
##'
##'
##' @aliases twostage cfa.2stage sem.2stage growth.2stage lavaan.2stage
##' @importFrom lavaan lavInspect
##'
##' @param \dots Arguments passed to the [lavaan::lavaan()] function
##' specified in the `fun` argument. See also
##' [lavaan::lavOptions()]. At a minimum, the user must supply the
##' first two named arguments to [lavaan::lavaan()] (i.e.,
##' `model` and `data`).
##' @param aux An optional character vector naming auxiliary variable(s) in
##' `data`
##' @param fun The character string naming the lavaan function used to fit the
##' Step-2 hypothesized model (`"cfa"`, `"sem"`, `"growth"`, or
##' `"lavaan"`).
##' @param baseline.model An optional character string, specifying the lavaan
##' [lavaan::model.syntax()] for a user-specified baseline model.
##' Interested users can use the fitted baseline model to calculate incremental
##' fit indices (e.g., CFI and TLI) using the corrected chi-squared values (see
##' the `anova` method in [twostage-class]). If `NULL`,
##' the default "independence model" (i.e., freely estimated means and
##' variances, but all covariances constrained to zero) will be specified
##' internally.
##'
##' @return The [twostage-class] object contains 3 fitted lavaan
##' models (saturated, target/hypothesized, and baseline) as well as the names
##' of auxiliary variables. None of the individual models provide the correct
##' model results (except the point estimates in the target model are unbiased).
##' Use the methods in [twostage-class] to extract corrected
##' *SE*s and test statistics.
##'
##' @author
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @seealso [twostage-class]
##'
##' @references
##' Savalei, V., & Bentler, P. M. (2009). A two-stage approach to missing data:
##' Theory and application to auxiliary variables.
##' *Structural Equation Modeling, 16*(3), 477--497.
##' \doi{10.1080/10705510903008238}
##'
##' Savalei, V., & Falk, C. F. (2014). Robust two-stage approach outperforms
##' robust full information maximum likelihood with incomplete nonnormal data.
##' *Structural Equation Modeling, 21*(2), 280--302.
##' \doi{10.1080/10705511.2014.882692}
##'
##' @examples
##'
##' ## impose missing data for example
##' HSMiss <- HolzingerSwineford1939[ , c(paste("x", 1:9, sep = ""),
##' "ageyr","agemo","school")]
##' set.seed(12345)
##' HSMiss$x5 <- ifelse(HSMiss$x5 <= quantile(HSMiss$x5, .3), NA, HSMiss$x5)
##' age <- HSMiss$ageyr + HSMiss$agemo/12
##' HSMiss$x9 <- ifelse(age <= quantile(age, .3), NA, HSMiss$x9)
##'
##' ## specify CFA model from lavaan's ?cfa help page
##' HS.model <- '
##' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + x8 + x9
##' '
##'
##' ## use ageyr and agemo as auxiliary variables
##' out <- cfa.2stage(model = HS.model, data = HSMiss, aux = c("ageyr","agemo"))
##'
##' ## two versions of a corrected chi-squared test results are shown
##' out
##' ## see Savalei & Bentler (2009) and Savalei & Falk (2014) for details
##'
##' ## the summary additionally provides the parameter estimates with corrected
##' ## standard errors, test statistics, and confidence intervals, along with
##' ## any other options that can be passed to parameterEstimates()
##' summary(out, standardized = TRUE)
##'
##'
##'
##' ## use parameter labels to fit a more constrained model
##' modc <- '
##' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + a*x8 + a*x9
##' '
##' outc <- cfa.2stage(model = modc, data = HSMiss, aux = c("ageyr","agemo"))
##'
##'
##' ## use the anova() method to test this constraint
##' anova(out, outc)
##' ## like for a single model, two corrected statistics are provided
##'
##' @export
twostage <- function(..., aux, fun, baseline.model = NULL) {
if (all(aux == "")) aux <- NULL
dots <- list(...)
if (is.null(dots$model)) stop("lavaan model syntax argument must be named 'model'.")
lavaanifyArgs <- dots[intersect(names(dots), names(formals(lavaan::lavaanify)))]
args4lavaan <- c(names(formals(lavaan::lavaan)), names(lavaan::lavOptions()))
funArgs <- dots[intersect(names(dots), args4lavaan)]
## set some non-optional lavaan arguments
funArgs$meanstructure <- TRUE
funArgs$conditional.x <- FALSE
funArgs$fixed.x <- FALSE
funArgs$missing <- "fiml"
funArgs$estimator <- "ML"
funArgs$test <- "standard"
if (is.null(funArgs$information)) funArgs$information <- "observed"
if (funArgs$information[1] == "expected") {
message("If data are MAR, only the observed information matrix is consistent.")
if (!is.null(aux)) {
funArgs$information <- "observed"
message(c("Using auxiliary variables implies assuming that data are MAR. ",
"The lavaan argument 'information' was set to 'observed'."))
}
    if (!is.null(funArgs$se) && funArgs$se != "standard") {
funArgs$information <- "observed"
message(c("The lavaan argument 'information' was set to 'observed' ",
"because adjusting SEs for non-normality requires it."))
}
}
funArgs$NACOV <- NULL
funArgs$do.fit <- NULL
## STAGE 1:
## fit saturated model
if (!is.null(funArgs$group))
lavaanifyArgs$ngroups <- length(table(funArgs$data[ , funArgs$group]))
targetNames <- lavaan::lavNames(do.call(lavaan::lavaanify, lavaanifyArgs))
varnames <- c(targetNames, aux)
covstruc <- outer(varnames, varnames, function(x, y) paste(x, "~~", y))
satArgs <- funArgs
satArgs$constraints <- NULL
satArgs$group.equal <- ""
satArgs$model <- c(covstruc[lower.tri(covstruc, diag = TRUE)],
paste(varnames, "~ 1"))
satFit <- do.call(lavaan::lavaan, satArgs)
## check for robust estimators
opts <- lavInspect(satFit, "options")
if (!opts$se %in% c("standard","robust.huber.white"))
stop(c("Two-Stage estimation requires either se = 'standard' for ",
"multivariate normal data or se = 'robust.huber.white' to ",
"correct for non-normality."))
## STAGE 2:
## fit target model to saturated estimates
targetArgs <- funArgs
targetArgs$data <- NULL
targetArgs$sample.cov <- lavInspect(satFit, "cov.ov")
targetArgs$sample.mean <- lavInspect(satFit, "mean.ov")
targetArgs$sample.nobs <- lavInspect(satFit, "nobs")
targetArgs$se <- "standard"
targetArgs$sample.cov.rescale <- FALSE
targetFit <- do.call(fun, targetArgs)
## STAGE 0:
## fit baseline model (for incremental fit indices)
baseArgs <- targetArgs
if (is.null(baseline.model)) {
basecov <- outer(targetNames, targetNames, function(x, y) paste0(x, " ~~ 0*", y))
diag(basecov) <- paste(targetNames, "~~", targetNames)
baseArgs$model <- c(basecov[lower.tri(basecov, diag = TRUE)],
paste(targetNames, "~ 1"))
} else baseArgs$model <- baseline.model
baseArgs$se <- "standard"
baseFit <- do.call(lavaan::lavaan, baseArgs)
if (length(setdiff(lavaan::lavNames(baseFit), targetNames)))
warning("The baseline model includes variables excluded from the target model.")
if (length(setdiff(targetNames, lavaan::lavNames(baseFit))))
warning("The target model includes variables excluded from the baseline model.")
## return both models
out <- new("twostage", saturated = satFit, target = targetFit,
baseline = baseFit, auxNames = as.character(aux))
out
}
##' @rdname twostage
##' @export
lavaan.2stage <- function(..., aux = NULL, baseline.model = NULL) {
twostage(..., aux = aux, fun = "lavaan", baseline.model = baseline.model)
}
##' @rdname twostage
##' @export
cfa.2stage <- function(..., aux = NULL, baseline.model = NULL) {
twostage(..., aux = aux, fun = "cfa", baseline.model = baseline.model)
}
##' @rdname twostage
##' @export
sem.2stage <- function(..., aux = NULL, baseline.model = NULL) {
twostage(..., aux = aux, fun = "sem", baseline.model = baseline.model)
}
##' @rdname twostage
##' @export
growth.2stage <- function(..., aux = NULL, baseline.model = NULL) {
twostage(..., aux = aux, fun = "growth", baseline.model = baseline.model)
}
semTools/R/ordMoments.R
### Terrence D. Jorgensen & Andrew R. Johnson
### Last updated: 20 January 2025
### function to derive ordinal-scale moments implied by LRV-scale moments
##' Calculate Population Moments for Ordinal Data Treated as Numeric
##'
##' This function calculates ordinal-scale moments implied by LRV-scale moments
##'
##' Binary and ordinal data are frequently accommodated in SEM by incorporating
##' a threshold model that links each observed categorical response variable to
##' a corresponding latent response variable that is typically assumed to be
##' normally distributed (Kamata & Bauer, 2008; Wirth & Edwards, 2007).
##' This function can be useful for real-data analysis or for designing
##' Monte Carlo simulations, as described by Jorgensen and Johnson (2022).
##'
##' @importFrom stats dnorm setNames
##' @importFrom lavaan lavInspect
##' @importFrom pbivnorm pbivnorm
##'
##' @param Sigma Population covariance [matrix()], with variable names
##' saved in the [dimnames()] attribute.
##' @param Mu Optional `numeric` vector of population means. If missing,
##' all means will be set to zero.
##' @param thresholds Either a single `numeric` vector of population
##' thresholds used to discretize each normally distributed variable, or a
##' named `list` of each discretized variable's vector of thresholds.
##' The discretized variables may be a subset of all variables in `Sigma`
##' if the remaining variables are intended to be observed rather than latent
##' normally distributed variables.
##' @param cWts Optional (default when missing is to use 0 for the lowest
##' category, followed by successive integers for each higher category).
##' Either a single `numeric` vector of category weights (if they are
##' identical across all variables) or a named `list` of each
##' discretized variable's vector of category weights.
##'
##' @return A `list` including the LRV-scale population moments (means,
##' covariance matrix, correlation matrix, and thresholds), the category
##' weights, a `data.frame` of implied univariate moments (means,
##' *SD*s, skewness, and excess kurtosis (i.e., in excess of 3, which is
##' the kurtosis of the normal distribution) for discretized data treated as
##' `numeric`, and the implied covariance and correlation matrix of
##' discretized data treated as `numeric`.
##'
##' @author
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' Andrew R. Johnson (Curtin University; \email{andrew.johnson@@curtin.edu.au})
##'
##' @references
##'
##' Jorgensen, T. D., & Johnson, A. R. (2022). How to derive expected values of
##' structural equation model parameters when treating discrete data as
##' continuous. *Structural Equation Modeling, 29*(4), 639--650.
##' \doi{10.1080/10705511.2021.1988609}
##'
##' Kamata, A., & Bauer, D. J. (2008). A note on the relation between factor
##' analytic and item response theory models.
##' *Structural Equation Modeling, 15*(1), 136--153.
##' \doi{10.1080/10705510701758406}
##'
##' Wirth, R. J., & Edwards, M. C. (2007). Item factor analysis: Current
##' approaches and future directions. *Psychological Methods, 12*(1),
##' 58--79. \doi{10.1037/1082-989X.12.1.58}
##'
##' @examples
##'
##' ## SCENARIO 1: DIRECTLY SPECIFY POPULATION PARAMETERS
##'
##' ## specify population model in LISREL matrices
##' Nu <- rep(0, 4)
##' Alpha <- c(1, -0.5)
##' Lambda <- matrix(c(1, 1, 0, 0, 0, 0, 1, 1), nrow = 4, ncol = 2,
##' dimnames = list(paste0("y", 1:4), paste0("eta", 1:2)))
##' Psi <- diag(c(1, .75))
##' Theta <- diag(4)
##' Beta <- matrix(c(0, .5, 0, 0), nrow = 2, ncol = 2)
##'
##' ## calculate model-implied population means and covariance matrix
##' ## of latent response variables (LRVs)
##' IB <- solve(diag(2) - Beta) # to save time and space
##' Mu_LRV <- Nu + Lambda %*% IB %*% Alpha
##' Sigma_LRV <- Lambda %*% IB %*% Psi %*% t(IB) %*% t(Lambda) + Theta
##'
##' ## Specify (unstandardized) thresholds to discretize normally distributed data
##' ## generated from Mu_LRV and Sigma_LRV, based on marginal probabilities
##' PiList <- list(y1 = c(.25, .5, .25),
##' y2 = c(.17, .33, .33, .17),
##' y3 = c(.1, .2, .4, .2, .1),
##' ## make final variable highly asymmetric
##' y4 = c(.33, .25, .17, .12, .08, .05))
##' sapply(PiList, sum) # all sum to 100%
##' CumProbs <- sapply(PiList, cumsum)
##' ## unstandardized thresholds
##' TauList <- mapply(qnorm, p = lapply(CumProbs, function(x) x[-length(x)]),
##' m = Mu_LRV, sd = sqrt(diag(Sigma_LRV)))
##' for (i in 1:4) names(TauList[[i]]) <- paste0(names(TauList)[i], "|t",
##' 1:length(TauList[[i]]))
##'
##' ## assign numeric weights to each category (optional, see default)
##' NumCodes <- list(y1 = c(-0.5, 0, 0.5), y2 = 0:3, y3 = 1:5, y4 = 1:6)
##'
##'
##' ## Calculate Population Moments for Numerically Coded Ordinal Variables
##' lrv2ord(Sigma = Sigma_LRV, Mu = Mu_LRV, thresholds = TauList, cWts = NumCodes)
##'
##'
##' ## SCENARIO 2: USE ESTIMATED PARAMETERS AS POPULATION
##'
##' data(datCat) # already stored as c("ordered","factor")
##' fit <- cfa(' f =~ 1*u1 + 1*u2 + 1*u3 + 1*u4 ', data = datCat)
##' lrv2ord(Sigma = fit, thresholds = fit) # use same fit for both
##' ## or use estimated thresholds with specified parameters, but note that
##' ## lrv2ord() will only extract standardized thresholds
##' dimnames(Sigma_LRV) <- list(paste0("u", 1:4), paste0("u", 1:4))
##' lrv2ord(Sigma = cov2cor(Sigma_LRV), thresholds = fit)
##'
##' @export
lrv2ord <- function(Sigma, Mu, thresholds, cWts) {
if (inherits(Sigma, "lavaan")) {
if (lavInspect(Sigma, "ngroups") > 1L || lavInspect(Sigma, "nlevels") > 1L) {
stop('Sigma= only accepts single-group/level lavaan models')
}
fitSigma <- Sigma
Sigma <- lavInspect(fitSigma, "cov.ov")
} else stopifnot(is.matrix(Sigma))
vn <- rownames(Sigma) # variable names
SDs <- sqrt(diag(Sigma))
if (missing(Mu)) {
Mu <- rep(0, nrow(Sigma))
} else if (inherits(Mu, "lavaan")) {
if (lavInspect(Mu, "ngroups") > 1L || lavInspect(Mu, "nlevels") > 1L) {
stop('Mu= only accepts single-group/level lavaan models')
}
fitMu <- Mu
Mu <- lavInspect(fitMu, "mean.ov")
}
names(Mu) <- names(SDs) <- vn
## If a single vector of thresholds is passed, broadcast to a list
if (inherits(thresholds, "lavaan")) {
if (lavInspect(thresholds, "ngroups") > 1L || lavInspect(thresholds, "nlevels") > 1L) {
stop('thresholds= only accepts single-group/level lavaan models')
}
## check whether diag(Sigma) == 1
isSTD <- sapply(SDs, function(x) {
isTRUE(all.equal(x, current = 1, tolerance = .001))
})
if (!all(isSTD)) warning('standardized thresholds= extracted from a ',
'lavaan object, but Sigma= is not a ',
'correlation matrix.')
fitThr <- thresholds
thresholds <- lavInspect(fitThr, "th") # STANDARDIZED thresholds
thresh <- lapply(unique(lavInspect(fitThr, "th.idx")), function(x) {
thresholds[lavInspect(fitThr, "th.idx") == x]
})
names(thresh) <- sapply(thresh, function(x) {
strsplit(names(x)[1], "|t", fixed = TRUE)[[1]][1]
})
} else if (is.atomic(thresholds)) {
thresh <- sapply(vn, function(x) {thresholds}, simplify = FALSE)
} else {
stopifnot(is.list(thresholds)) # must be a list
stopifnot(length(thresholds) <= nrow(Sigma)) # no more than 1 per variable
stopifnot(all(names(thresholds) %in% vn)) # names must match
thresh <- thresholds
}
cn <- names(thresh)
stopifnot(length(cn) > 0L)
## If no category weights are passed, default to 0:nCat
if (missing(cWts)) {
cWts <- sapply(thresh, function(x) { 0:length(x) }, simplify = FALSE)
} else if (is.atomic(cWts)) {
## If a single vector of category weights is passed, broadcast to a list
#FIXME: assumes same number of thresholds across variables
cWts <- sapply(cn, function(x) { cWts }, simplify = FALSE)
} else {
stopifnot(is.list(cWts)) # must be a list
stopifnot(length(cWts) <= nrow(Sigma)) # no more than 1 per variable
stopifnot(all(names(cWts) %in% vn)) # names must match
stopifnot(all(cn %in% names(cWts))) # names must match
cWts <- cWts[cn] # discard any others
}
stopifnot(all((sapply(thresh, length) + 1L) == sapply(cWts, length)))
## Calculate marginal probabilities implied by thresholds on moments
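  ## P(Y = k) = pnorm(tau_k, m, sd) - pnorm(tau_{k-1}, m, sd),
  ## with tau_0 = -Inf and tau_K = +Inf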
get_marg_probs <- function(threshs, m, sd) {
thr <- c(-Inf, threshs, Inf)
sapply(2:length(thr), function(k) {
pnorm(thr[k], m, sd) - pnorm(thr[k-1], m, sd)
})
}
marginal_probs <- mapply(get_marg_probs, SIMPLIFY = FALSE, threshs = thresh,
m = Mu[cn], sd = SDs[cn])
## Marginal means
Mu_ord <- Mu
Mu_ord[cn] <- mapply(function(p, w) {
stopifnot(length(p) == length(w))
sum(p * w)
}, p = marginal_probs, w = cWts)
## marginal variances (fill in covariances below)
Sigma_ord <- Sigma
if (length(cn) == 1) {
## drop=FALSE is not a solution (yields different error)
Sigma_ord[cn,cn] <- mapply(function(p, w, mu) {
stopifnot(length(p) == length(w))
sum(p * (w - mu)^2)
}, p = marginal_probs, w = cWts, mu = Mu_ord[cn])
} else {
diag(Sigma_ord[cn,cn]) <- mapply(function(p, w, mu) {
stopifnot(length(p) == length(w))
sum(p * (w - mu)^2)
}, p = marginal_probs, w = cWts, mu = Mu_ord[cn])
}
## marginal (standardized) skew
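  ## skew = E[(Y - mu)^3] / SD^3, computed from the category probabilities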
skew_ord <- setNames(rep(0, nrow(Sigma)), nm = vn)
skew_ord[cn] <- mapply(function(p, w, mu) {
stopifnot(length(p) == length(w))
numerator <- sum(p * (w - mu)^3)
Moment2 <- sum(p * (w - mu)^2)
denominator <- sqrt(Moment2)^3
numerator / denominator
}, p = marginal_probs, w = cWts, mu = Mu_ord[cn])
## marginal (standardized, excess) kurtosis
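  ## kurtosis = E[(Y - mu)^4] / SD^4, minus 3 (the kurtosis of the normal)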
kurt_ord <- setNames(rep(0, nrow(Sigma)), nm = vn)
kurt_ord[cn] <- mapply(function(p, w, mu) {
stopifnot(length(p) == length(w))
numerator <- sum(p * (w - mu)^4)
Moment2 <- sum(p * (w - mu)^2)
denominator <- sqrt(Moment2)^4
numerator / denominator
}, p = marginal_probs, w = cWts, mu = Mu_ord[cn]) - 3 # excess kurtosis
## all marginal descriptives
(margMoments <- data.frame(Mean = Mu_ord, SD = sqrt(diag(Sigma_ord)),
Skew = skew_ord, Kurtosis3 = kurt_ord,
row.names = vn))
class(margMoments) <- c("lavaan.data.frame","data.frame") # for printing
## save old copies to return with new
out <- list(Mu_LRV = Mu, Sigma_LRV = Sigma, R_LRV = stats::cov2cor(Sigma),
Thresholds = thresh, Category_weights = cWts, Uni_ord = margMoments)
class(out$Mu_LRV) <- c("lavaan.vector","numeric")
class(out$Sigma_LRV) <- c("lavaan.matrix.symmetric","matrix")
class(out$R_LRV) <- c("lavaan.matrix.symmetric","matrix")
out$Thresholds <- lapply(out$Thresholds, "class<-",
c("lavaan.vector","numeric"))
out$Category_weights <- lapply(out$Category_weights, "class<-",
c("lavaan.vector","numeric"))
## need bivariate moments?
if (length(vn) == 1L) return(out)
## function to apply to any pair of indicators (i and j) in Sigma
getOrdCov <- function(i, j) {
## to use apply(), i= can be 2 values indicating the [row, column]
if (length(i) > 1L) {
if (!missing(j)) warning("j ignored when i has multiple values")
      if (length(i) > 2L) stop("i had ", length(i), " values, but only 2 ",
                               "(row and column indices) are allowed.")
j <- i[2]
i <- i[1]
}
## if i/j are numeric, get names
if (is.numeric(i)) i <- vn[i]
if (is.numeric(j)) j <- vn[j]
## make sure thresholds are standardized
## template for matrices of joint probabilities and cross-products
JointProbs <- CP <- matrix(0, nrow = length(cWts[[i]]),
ncol = length(cWts[[j]]))
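    ## thresholds are standardized here; +/- 1e5 act as effectively infinite bounds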
i.thr <- c(-1e5, (thresh[[i]] - Mu[i]) / SDs[i], 1e5)
j.thr <- c(-1e5, (thresh[[j]] - Mu[j]) / SDs[j], 1e5)
tCombos <- cbind(expand.grid(i = i.thr, j = j.thr),
expand.grid(cat1 = c(0, seq_along(cWts[[i]])),
cat2 = c(0, seq_along(cWts[[j]]))))
tCombos$cp <- pbivnorm(x = tCombos$i, y = tCombos$j, rho = out$R_LRV[i,j])
## loop over rows & columns
for (RR in seq_along(cWts[[i]])) for (CC in seq_along(cWts[[j]])) {
## calculate joint probabilities
idx1 <- which(tCombos$cat1 == RR & tCombos$cat2 == CC )
idx2 <- which(tCombos$cat1 == RR - 1 & tCombos$cat2 == CC )
idx3 <- which(tCombos$cat1 == RR & tCombos$cat2 == CC - 1)
idx4 <- which(tCombos$cat1 == RR - 1 & tCombos$cat2 == CC - 1)
JointProbs[RR,CC] <- tCombos$cp[idx1] - tCombos$cp[idx2] - tCombos$cp[idx3] + tCombos$cp[idx4]
## calculate cross-products
CP[RR,CC] <- (cWts[[i]][RR] - Mu_ord[i]) * (cWts[[j]][CC] - Mu_ord[j])
}
sum(JointProbs * CP) # return covariance
}
## check whether all variables are being discretized
stayCon <- setdiff(vn, cn)
if (length(stayCon) == 0) {
## all are polychoric
(ij <- which(lower.tri(Sigma_ord), arr.ind = TRUE))
Sigma_ord[ij] <- mapply(getOrdCov, i = ij[,1], j = ij[,2])
Sigma_ord[ ij[,2:1] ] <- Sigma_ord[ij] # copy lower to upper triangle
} else {
## pair by pair, choose polychoric or polyserial
for (i in vn[-length(vn)]) for (j in vn[(which(vn == i)+1):length(vn)]) {
if (i %in% stayCon && j %in% stayCon) next
      if (i %in% cn && j %in% cn) {
## both discretized, calculate polychoric
Sigma_ord[i,j] <- Sigma_ord[j,i] <- getOrdCov(i, j)
next
}
## else, calculate polyserial
if (i %in% stayCon) {
CON <- i
CAT <- j
} else {
CAT <- i
CON <- j
}
DENS <- mapply(function(tau, interval, m = 0, sd = 1) {
dnorm(tau, mean = m, sd = sd) * interval
}, tau = thresh[[CAT]], interval = diff(cWts[[CAT]]),
m = Mu[CAT], sd = SDs[CAT])
## Note: polyserial correlation divides by sqrt(diag(Sigma_ord)[CAT]),
## but that cancels out when scaling by both SDs to get covariance
Sigma_ord[CON, CAT] <- Sigma_ord[CAT, CON] <-
out$R_LRV[CON, CAT] * sum(DENS) * sqrt(diag(out$Sigma_LRV)[CON])
}
}
class(Sigma_ord) <- c("lavaan.matrix.symmetric","matrix")
if (nrow(Sigma_ord) > 1L) {
R_ord <- cov2cor(Sigma_ord)
class(R_ord) <- c("lavaan.matrix.symmetric","matrix")
} else R_ord <- NULL
c(out, list(Sigma_ord = Sigma_ord, R_ord = R_ord))
}
semTools/R/gorica.R
### Leonard Vanbrabant (Roxygen edits by Terrence D. Jorgensen)
### Last updated: 12 February 2025
##' Wrapper for `goric.lavaan()` from the `restriktor` package
##'
##' The `goricaSEM()` function is an interface to [restriktor::goric.lavaan()],
##' allowing users to perform generalized order-restricted information criterion
##' approximation (GORICA) analysis specifically for structural equation
##' models fitted using the \pkg{lavaan} package.
##'
##' @details
##' This function is designed as a wrapper for the [restriktor::goric.lavaan()]
##' function. It calculates GORICA values and weights, which can be used to
##' compare models or hypotheses under inequality constraints.
##'
##' The `hypotheses=` argument allows users to specify constraints in text-based
##' syntax or matrix notation. For text-based syntax, constraints are specified
##' as a string (e.g., `"a1 > a2"`). For matrix notation, a named list with
##' `$constraints`, `$rhs`, and `$neq` elements can be provided.
##'
##' The `comparison=` argument determines whether the specified hypothesis is
##' compared against its `"complement"`, the `"unconstrained"` model, or
##' neither (`"none"`).
##'
##' @param object A [lavaan::lavaan-class] object.
##' @param hypotheses A named `list` of hypotheses to test. See **Details** for
##' information on how to specify hypotheses.
##' @param comparison A `character` string specifying the type of comparison.
##' Options are `"unconstrained"`, `"complement"`, or `"none"`.
##' Default behavior depends on the number of hypotheses.
##' @param type A `character` string indicating the type of analysis, either
##' `"gorica"` (default) or `"goricac"`.
##' @param standardized `logical` indicating whether standardized estimates are
##' used in the analysis. Defaults to `FALSE`.
##' @param debug `logical` indicating whether to print debugging information.
##' Defaults to `FALSE`.
##' @param ... Additional arguments passed to [restriktor::goric.lavaan()].
##'
##' @return
##' A `list` containing the results of the [restriktor::goric.lavaan()]
##' function, including:
##' \itemize{
##' \item The log-likelihood.
##' \item Penalty term.
##' \item GORIC(A) values and weights.
##' \item Relative GORIC(A) weights.
##' }
##'
##' @references
##' Kuiper, R. M., Hoijtink, H., & Silvapulle, M. J. (2011). An Akaike-type
##' information criterion for model selection under inequality constraints.
##' *Biometrika, 98*(2), 495--501. \doi{10.1093/biomet/asr002}
##'
##' Vanbrabant, L., Van Loey, N., & Kuiper, R. M. (2020). Evaluating a
##' theory-based hypothesis against its complement using an AIC-type
##' information criterion with an application to facial burn injury.
##' *Psychological Methods, 25*(2), 129--142. \doi{10.1037/met0000238}
##'
##' @seealso [restriktor::goric.lavaan()]
##'
##' @author Leonard Vanbrabant and Rebecca Kuiper
##'
##' @examples
##'
##' ## Example: Perform GORICA analysis on a lavaan model
##' library(lavaan)
##' library(restriktor)
##'
##' ## Define the SEM model
##' model <- '
##' ind60 =~ x1 + x2 + x3
##' dem60 =~ y1 + a1*y2 + b1*y3 + c1*y4
##' dem65 =~ y5 + a2*y6 + b2*y7 + c2*y8
##' dem60 ~ ind60
##' dem65 ~ ind60 + dem60
##' y1 ~~ y5
##' y2 ~~ y4 + y6
##' y3 ~~ y7
##' y4 ~~ y8
##' y6 ~~ y8
##' '
##'
##' ## Fit the model
##' data(PoliticalDemocracy)
##' fit <- sem(model, data = PoliticalDemocracy)
##'
##' ## Define hypotheses
##' myHypothesis <- 'a1 > a2, b1 > b2, c1 > c2'
##'
##' ## Perform GORICA analysis
##' result <- goricaSEM(fit, hypotheses = list(H1 = myHypothesis),
##' standardized = FALSE, comparison = "complement",
##' type = "gorica")
##'
##' ## Print result
##' print(result)
##'
##' @export
goricaSEM <- function(object, ..., hypotheses = NULL,
comparison = NULL,
type = "gorica",
standardized = FALSE,
debug = FALSE) {
## Check if the original function is available
if (!requireNamespace("restriktor", quietly = TRUE)) {
stop("The 'restriktor' package is required but not installed.")
}
## Call the original function
restriktor::goric.lavaan(
object = object,
hypotheses = hypotheses,
comparison = comparison,
type = type,
standardized = standardized,
debug = debug,
...
)
}
semTools/R/zzz.R
.onAttach <- function(libname, pkgname) {
version <- read.dcf(file = system.file("DESCRIPTION", package = pkgname), fields = "Version")
packageStartupMessage(" ")
packageStartupMessage("###############################################################################")
packageStartupMessage("This is ", paste(pkgname, version))
packageStartupMessage("All users of R (or SEM) are invited to submit functions or ideas for functions.")
packageStartupMessage("###############################################################################")
## if lavaan.mi is already loaded, warn users to use :: or switch order
if ("lavaan.mi" %in% names(utils::sessionInfo()$otherPkgs)) {
packageStartupMessage(" ")
packageStartupMessage("The lavaan.mi package was already attached when semTools was attached.")
packageStartupMessage("To access lavaan.mi functions masked by semTools, use the double-colon:")
packageStartupMessage(" ")
packageStartupMessage("\t lavaan.mi::cfa.mi() or lavaan.mi::lavTestLRT.mi()\n")
packageStartupMessage("It is preferable to first attach semTools, then attach lavaan.mi, but")
packageStartupMessage("you can change the path order by detaching and reloading lavaan.mi,")
packageStartupMessage("using this syntax:\n")
packageStartupMessage('\t detach("package:lavaan.mi", unload = TRUE)')
packageStartupMessage('\t library(lavaan.mi) \n')
packageStartupMessage("Then the deprecated semTools::lavaan.mi functionality will be masked.")
}
}
.onLoad <- function(libname, pkgname) {
## "register" emmeans functionality
emInstalled <- try(loadNamespace("emmeans"), silent = TRUE)
if (!inherits(emInstalled, "try-error")) {
emmeans::.emm_register("lavaan", pkgname)
}
}
semTools/R/efa.R 0000644 0001762 0000144 00000066461 14764264620 013166 0 ustar ligges users ### Sunthud Pornprasertmanit & Terrence D. Jorgensen
### Last updated: 27 May 2020
### DEPRECATED 24 November 2022 (after 0.5-6 was on CRAN)
### REMOVED next time!
### fit and rotate EFA models in lavaan
## -------------------------------------
## Exploratory Factor Analysis in lavaan
## -------------------------------------
##' Analyze Unrotated Exploratory Factor Analysis Model
##'
##' This function analyzes an unrotated exploratory factor analysis model. The
##' unrotated solution can be rotated by the \code{\link{orthRotate}} and
##' \code{\link{oblqRotate}} functions.
##'
##' This function will generate a lavaan script for an unrotated exploratory factor
##' analysis model such that (1) all factor loadings are estimated, (2) factor
##' variances are fixed to 1, (3) factor covariances are fixed to 0, and (4) the
##' dot products of any pairs of columns in the factor loading matrix are fixed
##' to zero (Johnson & Wichern, 2002). The reason for creating this function
##' to supplement the \code{\link{factanal}} function is that users can enjoy
##' some advanced features from the \code{lavaan} package, such as scaled
##' \eqn{\chi^2}, diagonally weighted least squares estimation for ordinal
##' indicators, or full-information maximum likelihood (FIML) to handle
##' incomplete data.
##'
##' @importFrom lavaan lavInspect parTable
##' @importFrom stats factanal
##'
##' @param data A target \code{data.frame}
##' @param nf The desired number of factors
##' @param varList Target observed variables. If not specified, all variables in
##' \code{data} will be used (or \code{sample.cov} if \code{is.null(data)};
##' see \code{\link[lavaan]{cfa}} for argument descriptions).
##' @param start Use starting values in the analysis from the
##' \code{\link{factanal}} function. If \code{FALSE}, the starting values
##' from the \code{lavaan} package will be used. \code{TRUE} is ignored with a
##' warning if the \code{aux} argument is used.
##' @param aux The list of auxiliary variables. These variables will be included
##' in the model by the saturated-correlates approach to account for missing
##' information.
##' @param \dots Other arguments in the \code{\link[lavaan]{cfa}} function in
##' the \code{lavaan} package, such as \code{ordered}, \code{se},
##' \code{estimator}, or \code{sample.cov} and \code{sample.nobs}.
##'
##' @return A \code{lavaan} output of unrotated exploratory factor analysis
##' solution.
##'
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @examples
##'
##' \donttest{
##'
##' unrotated <- efaUnrotate(HolzingerSwineford1939, nf = 3,
##' varList = paste0("x", 1:9), estimator = "mlr")
##' summary(unrotated, std = TRUE)
##' inspect(unrotated, "std")
##'
##' dat <- data.frame(HolzingerSwineford1939,
##' z = rnorm(nrow(HolzingerSwineford1939), 0, 1))
##' unrotated2 <- efaUnrotate(dat, nf = 2, varList = paste0("x", 1:9), aux = "z")
##' }
##'
##' @name efaUnrotate-deprecated
##' @usage
##' efaUnrotate(data = NULL, nf, varList = NULL,
##' start = TRUE, aux = NULL, ...)
##' @seealso \code{\link{semTools-deprecated}}
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##'
##' @section Support functions for exploratory factor analysis in \code{lavaan}:
##' The \code{efaUnrotate}, \code{orthRotate}, \code{oblqRotate}, and
##' \code{funRotate} functions will no longer be supported. These functions
##' allowed users to estimate EFA parameters with \code{lavaan}. Instead, users
##' can now directly use \code{lavaan}'s \code{\link[lavaan]{efa}()} function.
##' Exploratory SEM (ESEM) is also supported by lavaan using special operators
##' in \code{lavaan}'s \code{\link[lavaan]{model.syntax}}; see
##' \url{https://github.com/yrosseel/lavaan/issues/112} for details.
##'
##' @export
efaUnrotate <- function(data = NULL, nf, varList = NULL,
start = TRUE, aux = NULL, ...) {
.Deprecated(msg = c("The efaUnrotate function is deprecated, and it will ",
"cease to be included in future versions of semTools. ",
"The lavaan package itself now provides EFA support. ",
"See help('semTools-deprecated) for details."))
efaArgs <- list(...)
if (is.null(data)) {
## check for summary statistics
sample.cov <- efaArgs$sample.cov
sample.mean <- efaArgs$sample.mean
sample.nobs <- efaArgs$sample.nobs
sample.th <- efaArgs$sample.th
WLS.V <- efaArgs$WLS.V
NACOV <- efaArgs$NACOV
if (is.null(sample.cov)) stop('User must supply either raw data or ',
'summary statistics to pass to lavaan().')
if (is.null(varList)) varList <- colnames(sample.cov)
anyOrdered <- !is.null(sample.th)
ordNames <- efaArgs$ordered
if (anyOrdered & (is.logical(ordNames) | is.null(ordNames))) {
if (is.null(ordNames)) {
message('Thresholds supplied, but not an ordered= argument. Must ',
'assume all model variables are ordered.')
}
ordNames <- varList
}
} else {
sample.cov <- NULL
sample.mean <- NULL
sample.nobs <- NULL
sample.th <- NULL
WLS.V <- NULL
NACOV <- NULL
efaArgs$data <- data
if (is.null(varList)) varList <- colnames(data)
anyOrdered <- checkOrdered(data, varList, ...)
ordNames <- checkOrdered(data, varList, ..., return.names = TRUE)
if (!is.null(efaArgs$group)) stop("Multi-group EFA is not currently supported.")
}
if (!is.null(aux)) {
if (anyOrdered) {
stop("The analysis model or the analysis data have ordered categorical",
" variables. The auxiliary variable feature is not available for",
" the models for categorical variables with the weighted least",
" square approach.")
}
efaArgs$fixed.x <- FALSE
efaArgs$missing <- "fiml"
efaArgs$aux <- aux
lavaancfa <- function(...) { cfa.auxiliary(...)}
} else lavaancfa <- function(...) { lavaan::cfa(...)}
nvar <- length(varList)
facnames <- paste0("factor", 1:nf)
loading <- outer(1:nvar, 1:nf, function(x, y) paste0("load", x, "_", y))
syntax <- ""
for (i in 1:nf) {
variablesyntax <- paste(paste0(loading[,i], "*", varList), collapse = " + ")
factorsyntax <- paste0(facnames[i], " =~ NA*", varList[1], " + ", variablesyntax, "\n")
syntax <- paste(syntax, factorsyntax)
}
syntax <- paste(syntax, paste(paste0(facnames, " ~~ 1*", facnames),
collapse = "\n"), "\n")
dataSupportsMeans <- length(setdiff(varList, ordNames)) > 0L && !(is.null(data) && is.null(sample.mean))
meanstructure <- efaArgs$meanstructure
if (is.null(meanstructure)) meanstructure <- anyOrdered #FIXME: wise default for EFA?
stopifnot(is.logical(meanstructure))
if (meanstructure && dataSupportsMeans) {
syntax <- paste(syntax, paste(paste0(setdiff(varList, ordNames),
" ~ 1"), collapse = "\n"), "\n")
}
if (nf > 1) {
covsyntax <- outer(facnames, facnames,
function(x, y) paste0(x, " ~~ 0*", y, "\n"))[lower.tri(diag(nf), diag = FALSE)]
syntax <- paste(syntax, paste(covsyntax, collapse = " "))
for (i in 2:nf) {
for (j in 1:(i - 1)) {
loadconstraint <- paste(paste0(loading[,i], "*", loading[,j]), collapse=" + ")
syntax <- paste(syntax, paste0("0 == ", loadconstraint), "\n")
}
}
}
if (start) {
if (is.null(aux)) {
List <- c(list(model = syntax, data = data), list(...))
List$do.fit <- FALSE
outtemp <- do.call(lavaancfa, List)
covtemp <- lavInspect(outtemp, "sampstat")$cov
partemp <- parTable(outtemp)
err <- try(startload <- factanal(factors = nf, covmat = covtemp)$loadings[],
silent = TRUE)
if (is(err, "try-error")) stop("The starting values from the factanal",
" function cannot be calculated. Please",
" use start = FALSE instead.")
startval <- sqrt(diag(diag(covtemp))) %*% startload
partemp$ustart[match(as.vector(loading), partemp$label)] <- as.vector(startval)
partemp$est <- partemp$se <- NULL
syntax <- partemp
} else warning("The 'start' argument was ignored because factanal() does",
" not support auxiliary variables. When using auxiliary",
" variables, set 'start = FALSE' ")
} else {
## FIXME: optimizer can't escape loadings == 0 without starting values from
## factanal() or by using theta parameterization
## https://groups.google.com/d/msg/lavaan/ujkHmCVirEY/-LGut4ewAwAJ
parameterization <- efaArgs$parameterization
if (is.null(parameterization)) parameterization <- lavaan::lavOptions("parameterization")
if (anyOrdered && parameterization != "theta")
warning('If the default parameterization = "delta" returns results with ',
'all factor loadings equal to zero, try either setting start ',
'= TRUE or setting parameterization = "theta" instead.')
}
efaArgs$model <- syntax
do.call(lavaancfa, efaArgs)
}
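## Hedged illustration of the model syntax assembled above: for nf = 2 and
## varList = c("x1", "x2", "x3"), the generated lavaan syntax is essentially
# factor1 =~ NA*x1 + load1_1*x1 + load2_1*x2 + load3_1*x3
# factor2 =~ NA*x1 + load1_2*x1 + load2_2*x2 + load3_2*x3
# factor1 ~~ 1*factor1
# factor2 ~~ 1*factor2
# factor2 ~~ 0*factor1
# 0 == load1_2*load1_1 + load2_2*load2_1 + load3_2*load3_1
## (plus intercepts such as "x1 ~ 1" when a mean structure is requested).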
## -----------------
## Class and Methods
## -----------------
##' Class For Rotated Results from EFA
##'
##' This class contains the results of rotated exploratory factor analysis
##'
##'
##' @name EFA-class
##' @aliases EFA-class show,EFA-method summary,EFA-method
##' @docType class
##' @slot loading Rotated standardized factor loading matrix
##' @slot rotate Rotation matrix
##' @slot gradRotate gradient of the objective function at the rotated loadings
##' @slot convergence Convergence status
##' @slot phi Factor correlation matrix. Will be an identity matrix if
##' orthogonal rotation is used.
##' @slot se Standard errors of the rotated standardized factor loading matrix
##' @slot method Method of rotation
##' @slot call The command used to generate this object
##' @section Objects from the Class: Objects can be created via the
##' \code{\link{orthRotate}} or \code{\link{oblqRotate}} function.
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##' @seealso \code{\link{efaUnrotate}}; \code{\link{orthRotate}};
##' \code{\link{oblqRotate}}
##' @examples
##'
##' \donttest{
##'
##' unrotated <- efaUnrotate(HolzingerSwineford1939, nf = 3,
##' varList = paste0("x", 1:9), estimator = "mlr")
##' summary(unrotated, std = TRUE)
##' lavInspect(unrotated, "std")
##'
##' # Rotated by Quartimin
##' rotated <- oblqRotate(unrotated, method = "quartimin")
##' summary(rotated)
##' }
##'
setClass("EFA", representation(loading = "matrix",
rotate = "matrix",
gradRotate = "matrix",
convergence = "logical",
phi = "matrix",
se = "matrix",
method = "character",
call = "call"))
##' @rdname EFA-class
##' @aliases show,EFA-method
##' @export
setMethod("show", signature(object = "EFA"), function(object) {
cat("Standardized Rotated Factor Loadings\n")
print(printLoadings(object))
cat("\nFactor Correlation\n")
print(object@phi)
cat("\nMethod of rotation:\t")
cat(object@method, "\n")
message("The standard errors are close but do not match with other packages.",
" Be mindful when using it.")
})
##' @rdname EFA-class
##' @aliases summary,EFA-method
##' @param object object of class \code{EFA}
##' @param suppress any standardized loadings less than the specified value
##' will not be printed to the screen
##' @param sort \code{logical}. If \code{TRUE} (default), factor loadings will
##' be sorted by their size in the console output
##' @export
setMethod("summary", signature(object = "EFA"),
function(object, suppress = 0.1, sort = TRUE) {
cat("Standardized Rotated Factor Loadings\n")
print(printLoadings(object, suppress = suppress, sort = sort))
cat("\nFactor Correlation\n")
print(object@phi)
cat("\nMethod of rotation:\t")
cat(object@method, "\n")
cat("\nTest Statistics for Standardized Rotated Factor Loadings\n")
print(testLoadings(object))
})
## ------------------------------
## Rotation Constructor Functions
## ------------------------------
##' Implement orthogonal or oblique rotation
##'
##' These functions will implement orthogonal or oblique rotation on
##' standardized factor loadings from a lavaan output.
##'
##' These functions will rotate the unrotated standardized factor loadings by
##' orthogonal rotation using the \code{\link[GPArotation]{GPForth}} function or
##' oblique rotation using the \code{\link[GPArotation]{GPFoblq}} function from the
##' \code{GPArotation} package. The resulting rotation matrix will be used to
##' calculate standard errors of the rotated standardized factor loadings by
##' the delta method, numerically computing the Jacobian matrix with the
##' \code{\link[lavaan]{lav_func_jacobian_simple}} function.
##'
## @aliases orthRotate oblqRotate funRotate
## @rdname rotate
##' @param object A lavaan output
##' @param method The method of rotations, such as \code{"varimax"},
##' \code{"quartimax"}, \code{"geomin"}, \code{"oblimin"}, or any gradient
##' projection algorithms listed in the \code{\link[GPArotation]{GPA}} function
##' in the \code{GPArotation} package.
##' @param fun The name of the function used to rotate the standardized
##' solution. The function must take the standardized loading matrix as its
##' first argument and return a \code{GPArotation} object. Check
##' this page for available functions: \code{\link[GPArotation]{rotations}}.
##' @param \dots Additional arguments for the \code{\link[GPArotation]{GPForth}}
##' function (for \code{orthRotate}), the \code{\link[GPArotation]{GPFoblq}}
##' function (for \code{oblqRotate}), or the function that users provide in the
##' \code{fun} argument.
##' @return An \linkS4class{EFA} object containing the rotated EFA solution
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##' @examples
##'
##' \donttest{
##'
##' unrotated <- efaUnrotate(HolzingerSwineford1939, nf = 3,
##' varList = paste0("x", 1:9), estimator = "mlr")
##'
##' # Orthogonal varimax
##' out.varimax <- orthRotate(unrotated, method = "varimax")
##' summary(out.varimax, sort = FALSE, suppress = 0.3)
##'
##' # Orthogonal Quartimin
##' orthRotate(unrotated, method = "quartimin")
##'
##' # Oblique Quartimin
##' oblqRotate(unrotated, method = "quartimin")
##'
##' # Geomin
##' oblqRotate(unrotated, method = "geomin")
##'
##' # Target rotation
##' library(GPArotation)
##' target <- matrix(0, 9, 3)
##' target[1:3, 1] <- NA
##' target[4:6, 2] <- NA
##' target[7:9, 3] <- NA
##' colnames(target) <- c("factor1", "factor2", "factor3")
##' ## This function works with GPArotation version 2012.3-1
##' funRotate(unrotated, fun = "targetQ", Target = target)
##' }
##'
##' @name rotate-deprecated
##'
##' @usage
##' orthRotate(object, method = "varimax", ...)
##'
##' oblqRotate(object, method = "quartimin", ...)
##'
##' funRotate(object, fun, ...)
##'
##' @seealso \code{\link{semTools-deprecated}}
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##' @export
orthRotate <- function(object, method = "varimax", ...) {
.Deprecated(msg = c("The orthRotate function is deprecated, and it will ",
"cease to be included in future versions of semTools. ",
"The lavaan package itself now provides support EFA. ",
"See help('semTools-deprecated) for details."))
requireNamespace("GPArotation")
if (!("package:GPArotation" %in% search())) attachNamespace("GPArotation")
mc <- match.call()
initL <- getLoad(object)
rotated <- GPArotation::GPForth(initL, method=method, ...)
# rotateMat <- t(solve(rotated$Th)) # defined but never used
LIST <- seStdLoadings(rotated, object, fun = GPArotation::GPForth,
MoreArgs = c(method = method, list(...)))
# orthogonal <- rotated$orthogonal # define but never used
loading <- rotated$loadings
rotate <- rotated$Th
gradRotate <- rotated$Gq
convergence <- rotated$convergence
method <- rotated$method
phi <- diag(ncol(loading))
lv.names <- colnames(loading)
dimnames(phi) <- list(lv.names, lv.names)
new("EFA", loading = loading, rotate = rotate, gradRotate = gradRotate,
convergence = convergence, phi = phi, se = LIST, method = method, call = mc)
}
##' @rdname semTools-deprecated
##' @export
oblqRotate <- function(object, method = "quartimin", ...) {
.Deprecated(msg = c("The oblqRotate function is deprecated, and it will ",
"cease to be included in future versions of semTools. ",
"The lavaan package itself now provides support EFA. ",
"See help('semTools-deprecated) for details."))
requireNamespace("GPArotation")
if (!("package:GPArotation" %in% search())) attachNamespace("GPArotation")
mc <- match.call()
initL <- getLoad(object)
rotated <- GPArotation::GPFoblq(initL, method = method, ...)
# rotateMat <- t(solve(rotated$Th)) # defined but never used
LIST <- seStdLoadings(rotated, object, fun = GPArotation::GPFoblq,
MoreArgs = c(method = method, list(...)))
# orthogonal <- rotated$orthogonal # defined but never used
loading <- rotated$loadings
rotate <- rotated$Th
gradRotate <- rotated$Gq
convergence <- rotated$convergence
method <- rotated$method
phi <- rotated$Phi
lv.names <- colnames(loading)
dimnames(phi) <- list(lv.names, lv.names)
new("EFA", loading=loading, rotate = rotate, gradRotate = gradRotate,
convergence = convergence, phi = phi, se = LIST, method = method, call = mc)
}
##' @rdname semTools-deprecated
##' @export
funRotate <- function(object, fun, ...) {
.Deprecated(msg = c("The funRotate function is deprecated, and it will ",
"cease to be included in future versions of semTools. ",
"The lavaan package itself now provides support EFA. ",
"See help('semTools-deprecated) for details."))
stopifnot(is.character(fun))
requireNamespace("GPArotation")
if (!("package:GPArotation" %in% search())) attachNamespace("GPArotation")
mc <- match.call()
initL <- getLoad(object)
rotated <- do.call(fun, c(list(L = initL), list(...)))
# rotateMat <- t(solve(rotated$Th)) # defined but never used
gradRotate <- rotated$Gq
LIST <- seStdLoadings(rotated, object, fun = fun, MoreArgs = list(...))
# orthogonal <- rotated$orthogonal # defined but never used
loading <- rotated$loadings
rotate <- rotated$Th
convergence <- rotated$convergence
method <- rotated$method
phi <- rotated$Phi
if (is.null(phi)) phi <- diag(ncol(loading))
lv.names <- colnames(loading)
dimnames(phi) <- list(lv.names, lv.names)
new("EFA", loading = loading, rotate = rotate, gradRotate = gradRotate,
convergence = convergence, phi = phi, se = LIST, method = method, call = mc)
}
## ----------------
## Hidden Functions
## ----------------
printLoadings <- function(object, suppress = 0.1, sort = TRUE) {
loading <- object@loading
nf <- ncol(loading)
loadingText <- sprintf("%.3f", object@loading)
sig <- ifelse(testLoadings(object)$p < 0.05, "*", " ")
loadingText <- paste0(loadingText, sig)
loadingText[abs(loading) < suppress] <- ""
loadingText <- matrix(loadingText, ncol = nf, dimnames = dimnames(loading))
lead <- apply(abs(loading), 1, which.max)
ord <- NULL
if (sort) {
for (i in 1:nf) {
ord <- c(ord, intersect(order(abs(loading[,i]), decreasing = TRUE), which(lead == i)))
}
loadingText <- loadingText[ord,]
}
as.data.frame(loadingText)
}
##' @importFrom stats pnorm qnorm
testLoadings <- function(object, level = 0.95) {
se <- object@se
loading <- object@loading
lv.names <- colnames(loading)
z <- loading / se
p <- 2 * (1 - pnorm( abs(z) ))
crit <- qnorm(1 - (1 - level)/2)
est <- as.vector(loading)
se <- as.vector(se)
warning("The standard error is currently invalid because it does not account",
" for the variance of the rotation function. It is simply based on",
" the delta method.")
out <- data.frame(lhs=lv.names[col(loading)], op = "=~",
rhs = rownames(loading)[row(loading)], std.loading = est,
se = se, z = as.vector(z), p = as.vector(p),
ci.lower = (est - crit*se), ci.upper = (est + crit*se))
class(out) <- c("lavaan.data.frame", "data.frame")
out
}
##' @importFrom lavaan lavInspect
getLoad <- function(object, std = TRUE) {
out <- lavInspect(object, "est")$lambda #FIXME: check for multiple groups
if (std) {
impcov <- lavaan::fitted.values(object)$cov
impsd <- sqrt(diag(diag(impcov)))
out <- solve(impsd) %*% out
}
rownames(out) <- lavaan::lavNames(object@ParTable, "ov", group = 1)
if (!is.null(object@external$aux)) {
out <- out[!(rownames(out) %in% object@external$aux),]
}
class(out) <- c("loadings", out)
out
}
fillMult <- function(X, Y, fillrowx = 0, fillrowy = 0, fillcolx = 0, fillcoly = 0) {
tempX <- matrix(0, nrow = nrow(X) + fillrowx, ncol = ncol(X) + fillcolx)
tempY <- matrix(0, nrow = nrow(Y) + fillrowy, ncol = ncol(Y) + fillcoly)
tempX[1:nrow(X), 1:ncol(X)] <- X
tempY[1:nrow(Y), 1:ncol(Y)] <- Y
result <- tempX %*% tempY
result[1:nrow(X), 1:ncol(Y)]
}
##' @importFrom lavaan lavInspect
stdRotatedLoadings <- function(est, object, fun, aux = NULL, rotate = NULL, MoreArgs = NULL) {
ov.names <- lavaan::lavNames(object@ParTable, "ov", group = 1)
lv.names <- lavaan::lavNames(object@ParTable, "lv", group = 1)
ind.names <- setdiff(ov.names, aux)
# Compute model-implied covariance matrix
partable <- object@ParTable
# LY
load.idx <- which(partable$op == "=~" & !(partable$rhs %in% lv.names))
loading <- matrix(est[load.idx], ncol = length(lv.names))
loading <- rbind(loading, matrix(0, length(aux), ncol(loading)))
# Nu
if (lavInspect(object, "options")$meanstructure) {
int.idx <- which(partable$op == "~1" & (partable$rhs == "") & (partable$lhs %in% ov.names))
intcept <- matrix(est[int.idx], ncol = 1)
}
# Theta
th.idx <- which(partable$op == "~~" & (partable$rhs %in% ov.names) & (partable$lhs %in% ov.names))
theta <- matrix(0, nrow = length(ov.names), ncol = length(ov.names),
dimnames = list(ov.names, ov.names))
for (i in th.idx) {
theta[partable$lhs[i], partable$rhs[i]] <- theta[partable$rhs[i], partable$lhs[i]] <- est[i]
}
OV <- loading %*% t(loading) + theta
invsd <- solve(sqrt(diag(diag(OV))))
requireNamespace("GPArotation")
if (!("package:GPArotation" %in% search())) attachNamespace("GPArotation")
# Compute standardized results
loading <- invsd %*% loading
obj <- do.call(fun, c(list(loading), MoreArgs))
# GPArotation::GPFoblq(loading, method = "geomin")
loading <- obj$loadings
rotMat <- t(solve(obj$Th))
# %*% rotate
est[load.idx] <- as.vector(loading[seq_along(ind.names),])
if (lavInspect(object, "options")$meanstructure) {
est[int.idx] <- as.vector(invsd %*% intcept)
}
theta <- invsd %*% theta %*% invsd
rownames(theta) <- colnames(theta) <- ov.names
for(i in th.idx) est[i] <- theta[partable$lhs[i], partable$rhs[i]]
# Put phi
rv.idx <- which(partable$op == "~~" & partable$rhs %in% lv.names)
templhs <- match(partable$lhs[rv.idx], lv.names)
temprhs <- match(partable$rhs[rv.idx], lv.names)
# rotate2 <- t(solve(rotate))
# phi <- t(rotate2) %*% rotate2
phi <- obj$Phi
if (!is.null(phi)) {
for (i in seq_along(templhs)) {
est[rv.idx[i]] <- phi[templhs[i], temprhs[i]]
}
}
est
}
##' @importFrom lavaan lavInspect parTable
seStdLoadings <- function(rotate, object, fun, MoreArgs) {
# object <- efaUnrotate(HolzingerSwineford1939, nf=3, varList=paste0("x", 1:9), estimator="mlr")
# initL <- getLoad(object)
# rotate <- GPArotation::GPFoblq(initL, method="oblimin")
rotMat <- t(solve(rotate$Th))
gradient <- rotate$Gq
loading <- rotate$loadings
phi <- rotate$Phi
if (is.null(phi)) phi <- diag(ncol(loading))
est <- lavaan::parameterEstimates(object)$est
aux <- object@external$aux
# Standardized results
JAC1 <- lavaan::lav_func_jacobian_simple(func = stdRotatedLoadings,
x = object@Fit@est, object = object,
aux = aux, rotate = rotMat,
fun = fun, MoreArgs = MoreArgs)
LIST <- lavInspect(object, "list")
free.idx <- which(LIST$free > 0L)
m <- ncol(phi)
phi.idx <- which(LIST$op == "~~" & LIST$lhs != LIST$rhs & (LIST$lhs %in% paste0("factor", 1:m)))
JAC1 <- JAC1[c(free.idx, phi.idx), free.idx]
VCOV <- as.matrix(lavaan::vcov(object, labels = FALSE))
if(object@Model@eq.constraints) {
JAC1 <- JAC1 %*% object@Model@eq.constraints.K
}
COV1 <- JAC1 %*% VCOV %*% t(JAC1)
# I1 <- MASS::ginv(COV1)
# I1p <- matrix(0, nrow(I1) + length(phi.idx), ncol(I1) + length(phi.idx))
# I1p[1:nrow(I1), 1:ncol(I1)] <- I1
# phi.idx2 <- nrow(I1) + 1:length(phi.idx)
# p <- nrow(loading)
# dconlambda <- matrix(0, m^2 - m, p*m)
# gradphi <- gradient %*% solve(phi)
# lambgradphi <- t(loading) %*% gradphi
# lambphi <- loading %*% solve(phi)
# lamblamb <- t(loading) %*% loading
# runrow <- 1
# descript <- NULL
# for(u in 1:m) {
# for(v in setdiff(1:m, u)) {
# runcol <- 1
# for(r in 1:m) {
# for(i in 1:p) {
# mir <- (1 - 1/p) * sum(loading[i,]^2) + sum(loading[,r]^2)/p - loading[i,r]^2
# dur <- 0
# if(u == r) dur <- 1
# dconlambda[runrow, runcol] <- dur * gradphi[i, v] + 4 * mir * loading[i, u] * phi[r, v] + (8 - 8/p)*loading[i,r]*loading[i,u]*lambphi[i,v] + 8*loading[i,r]*lamblamb[u,r]*phi[r,v]/p - 8*loading[i,r]^2*loading[i,u]*phi[r,v]
# descript <- rbind(descript, c(runrow, runcol, u, v, i, r))
# runcol <- runcol + 1
# }
# }
# runrow <- runrow + 1
# }
# }
# dconphi <- matrix(0, m^2 - m, m*(m-1)/2)
# runrow <- 1
# descript2 <- NULL
# for(u in 1:m) {
# for(v in setdiff(1:m, u)) {
# runcol <- 1
# for(x in 2:m) {
# for(y in 1:(x - 1)) {
# dux <- 0
# if(u == x) dux <- 1
# duy <- 0
# if(u == y) duy <- 1
# dconphi[runrow, runcol] <- -(dux * phi[y, v] + duy * phi[x, v]) * lambgradphi[u, u]
# descript2 <- rbind(descript2, c(runrow, runcol, u, v, x, y))
# runcol <- runcol + 1
# }
# }
# runrow <- runrow + 1
# }
# }
# I2 <- matrix(0, nrow(I1p) + m^2 - m, ncol(I1p) + m^2 - m)
# I2[1:nrow(I1p), 1:ncol(I1p)] <- I1p
# I2[lamb.idx, 1:(m^2 - m) + nrow(I1p)] <- t(dconlambda)
# I2[1:(m^2 - m) + nrow(I1p), lamb.idx] <- dconlambda
# I2[phi.idx2, 1:(m^2 - m) + nrow(I1p)] <- t(dconphi)
# I2[1:(m^2 - m) + nrow(I1p), phi.idx2] <- dconphi
# COV2 <- MASS::ginv(I2)[1:nrow(I1p), 1:ncol(I1p)]
COV2 <- COV1
LIST <- LIST[,c("lhs", "op", "rhs", "group")]
LIST$se <- rep(NA, length(LIST$lhs))
LIST$se[c(free.idx, phi.idx)] <- sqrt(diag(COV2))
# tmp.se <- ifelse( LIST$se == 0.0, NA, LIST$se) # defined but never used
lv.names <- lavaan::lavNames(object@ParTable, "lv", group = 1)
partable <- parTable(object)
idx <- which(partable$op == "=~" & !(partable$rhs %in% lv.names))
matrix(LIST$se[idx], ncol = length(lv.names))
}
checkOrdered <- function(dat, varnames, ..., return.names = FALSE) {
ord <- list(...)$ordered
if (is.null(ord)) {
ord <- FALSE
} else {
ord <- TRUE
}
if (is.null(dat)) {
orderedVar <- FALSE
} else {
orderedVar <- sapply(dat[,varnames], function(x) "ordered" %in% is(x))
}
if (return.names) {
## added by TDJ 4-Feb-2020
return(unique(c(list(...)$ordered, names(orderedVar[orderedVar]))))
}
any(c(ord, orderedVar))
}
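## Hedged illustration of checkOrdered() on a toy data.frame:
# d <- data.frame(x = ordered(1:3), y = 1:3)
# checkOrdered(d, c("x", "y"))                      # TRUE (x is ordered)
# checkOrdered(d, c("x", "y"), return.names = TRUE) # "x"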
semTools/R/tukeySEM.R 0000644 0001762 0000144 00000005055 14632016456 014125 0 ustar ligges users ### Alexander M. Schoemann
### Last updated: 9 March 2018
#' Tukey's WSD post-hoc test of means for unequal variance and sample size
#'
#' This function computes Tukey's WSD post hoc test of means when variances and
#' sample sizes are not equal across groups. It can be used as a post hoc test
#' when comparing latent means in multiple group SEM.
#'
#' After conducting an omnibus test of means across three or more groups,
#' researchers often wish to know which sets of means differ at a particular
#' Type I error rate. Tukey's WSD test holds the error rate stable across
#' multiple comparisons of means. This function implements an adaptation of
#' Tukey's WSD test from Maxwell & Delaney (2004), that allows variances and
#' sample sizes to differ across groups.
#'
#'
#' @importFrom stats ptukey
#'
#' @param m1 Mean of group 1.
#' @param m2 Mean of group 2.
#' @param var1 Variance of group 1.
#' @param var2 Variance of group 2.
#' @param n1 Sample size of group 1.
#' @param n2 Sample size of group 2.
#' @param ng Total number of groups to be compared (i.e., the number of groups
#' compared in the omnibus test).
#' @return A vector with three elements:
#' \enumerate{
#' \item `q`: The *q* statistic
#' \item `df`: The degrees of freedom for the *q* statistic
#' \item `p`: A *p* value based on the *q* statistic, *df*,
#' and the total number of groups to be compared
#' }
#' @author Alexander M. Schoemann (East Carolina University;
#' \email{schoemanna@@ecu.edu})
#' @references Maxwell, S. E., & Delaney, H. D. (2004). *Designing
#' experiments and analyzing data: A model comparison perspective* (2nd ed.).
#' Mahwah, NJ: Lawrence Erlbaum Associates.
#' @examples
#'
#' ## For a case where three groups have been compared:
#' ## Group 1: mean = 3.91, var = 0.46, n = 246
#' ## Group 2: mean = 3.96, var = 0.62, n = 465
#' ## Group 3: mean = 2.94, var = 1.07, n = 64
#'
#' ## compare group 1 and group 2
#' tukeySEM(3.91, 3.96, 0.46, 0.62, 246, 465, 3)
#'
#' ## compare group 1 and group 3
#' tukeySEM(3.91, 2.94, 0.46, 1.07, 246, 64, 3)
#'
#' ## compare group 2 and group 3
#' tukeySEM(3.96, 2.94, 0.62, 1.07, 465, 64, 3)
#'
#' @export
tukeySEM <- function(m1, m2, var1, var2, n1, n2, ng) {
qNum <- abs(m1 - m2)
qDenom <- sqrt(((var1/n1) + (var2/n2))/2)
Tukeyq <- qNum / qDenom
Tukeydf <- ((var1/n1) + (var2/n2))^2 /
(((var1/n1)^2 / (n1 - 1)) + ((var2/n2)^2 / (n2 - 1)))
c(q = Tukeyq, df = Tukeydf, p = 1 - ptukey(Tukeyq, ng, Tukeydf))
}
##Example from Schoemann (2013)
##Bio vs. political science on evo misconceptions
#tukeySEM(3.91, 3.96,.46, .62, 246, 425,3)
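## Hedged sketch: all three pairwise comparisons at once, using the group
## summaries from the documented example above.
# ms <- c(3.91, 3.96, 2.94); vs <- c(0.46, 0.62, 1.07); ns <- c(246, 465, 64)
# idx <- combn(3, 2) # each column holds one pair of group indices
# t(mapply(function(i, j) tukeySEM(ms[i], ms[j], vs[i], vs[j], ns[i], ns[j], ng = 3),
#          i = idx[1, ], j = idx[2, ]))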
semTools/R/miPowerFit.R 0000644 0001762 0000144 00000036546 14632143376 014517 0 ustar ligges users ### Sunthud Pornprasertmanit and Terrence D. Jorgensen (added ... argument)
### Last updated: 2 September 2021
##' Modification indices and their power approach for model fit evaluation
##'
##' The model fit evaluation approach using modification indices and expected
##' parameter changes.
##'
##' To decide whether a parameter should be freed, one can inspect its
##' modification index (MI) and expected parameter change (EPC).
##' Those values can be used to evaluate model fit by 2 methods.
##'
##' Method 1: Saris, Satorra, and van der Veld (2009, pp. 570--573) used
##' power (probability of detecting a significant MI) and EPC to decide whether
##' to free a parametr. First, one should evaluate whether a parameter's MI
##' is significant. Second, one should evaluate whether the power to detect a
##' target EPC is high enough. The combination of criteria leads to the
##' so-called "JRule" first implemented with LISREL (van der Veld et al., 2008):
##'
##' \itemize{
##' \item If the MI is not significant and the power is low,
##' the test is inconclusive.
##' \item If the MI is not significant and the power is high,
##' there is no misspecification.
##' \item If the MI is significant and the power is low,
##' the fixed parameter is misspecified.
##' \item If the MI is significant and the power is high,
##' the EPC is investigated. If the EPC is large (greater than the
##' the target EPC), the parameter is misspecified. If the EPC is low
##' (lower than the target EPC), the parameter is not misspecificied.
##' }
##'
##' Method 2: The confidence interval (CI) of an EPC is calculated.
##' These CIs are compared with the range of trivial
##' misspecification, which could be (-`delta`, `delta`) or (0,
##' `delta`) for nonnegative parameters.
##'
##' \itemize{
##' \item If a CI overlaps with the range of trivial misspecification,
##' the test is inconclusive.
##' \item If a CI completely exceeds the range of trivial misspecification,
##' the fixed parameters are severely misspecified.
##' \item If a CI is completely within the range of trivial misspecification,
##' the fixed parameters are trivially misspecified.
##' }
##'
##'
##' @aliases miPowerFit
##' @importFrom lavaan lavInspect
##' @importFrom stats qnorm qchisq pchisq
##'
##' @param lavaanObj The lavaan model object used to evaluate model fit
##' @param stdLoad The amount of standardized factor loading that one would like
##' to be detected (rejected). The default value is 0.4, which is suggested by
##' Saris and colleagues (2009, p. 571).
##' @param cor The amount of factor or error correlations that one would like to
##' be detected (rejected). The default value is 0.1, which is suggested by
##' Saris and colleagues (2009, p. 571).
##' @param stdBeta The amount of standardized regression coefficients that one
##' would like to be detected (rejected). The default value is 0.1, which is
##' suggested by Saris and colleagues (2009, p. 571).
##' @param intcept The amount of standardized intercept (similar to Cohen's
##' *d*) that one would like to be detected (rejected). The default value
##' is 0.2, which is equivalent to a low effect size proposed by Cohen (1988,
##' 1992).
##' @param stdDelta The vector of the standardized parameters that one would
##' like to be detected (rejected). If this argument is specified, the value
##' here will overwrite the other arguments above. The order of the vector
##' must be the same as the row order from modification indices from the
##' `lavaan` object. If a single value is specified, the value will be
##' applied to all parameters.
##' @param delta The vector of the unstandardized parameters that one would like
##' to be detected (rejected). If this argument is specified, the value here
##' will overwrite the other arguments above. The order of the vector must be
##' the same as the row order from modification indices from the `lavaan`
##' object. If a single value is specified, the value will be applied to all
##' parameters.
##' @param cilevel The confidence level of the confidence interval of expected
##' parameter changes. The confidence intervals are used in the equivalence
##' testing.
##' @param \dots arguments passed to [lavaan::modificationIndices()],
##' except for `delta`, which is already an argument (which can be
##' substituted for `stdDelta` or specific sets of parameters using
##' `stdLoad`, `cor`, `stdBeta`, and `intcept`).
##'
##' @return A data frame with these variables:
##' \enumerate{
##' \item `lhs`: The left-hand side variable, with respect to the operator in
##' the lavaan [lavaan::model.syntax()]
##' \item `op`: The lavaan syntax operator: "~~" represents covariance,
##' "=~" represents factor loading, "~" represents regression, and
##' "~1" represents intercept.
##' \item `rhs`: The right-hand side variable
##' \item `group`: The level of the group variable for the parameter in question
##' \item `mi`: The modification index of the fixed parameter
##' \item `epc`: The EPC if the parameter is freely estimated
##' \item `target.epc`: The target EPC that represents the minimum size
##' of misspecification that one would like to be detected
##' by the test with a high power
##' \item `std.epc`: The standardized EPC if the parameter is freely estimated
##' \item `std.target.epc`: The standardized target expected parameter change
##' \item `significant.mi`: Represents whether the modification index value is
##' significant
##' \item `high.power`: Represents whether the power is enough to detect the
##' target expected parameter change
##' \item `decision.pow`: The decision whether the parameter is misspecified
##' or not based on Saris et al's method: `"M"` represents the parameter
##' is misspecified, `"NM"` represents the parameter is not misspecified,
##' `"EPC:M"` represents the parameter is misspecified decided by
##' checking the expected parameter change value, `"EPC:NM"` represents
##' the parameter is not misspecified decided by checking the expected
##' parameter change value, and `"I"` represents the decision is
##' inconclusive.
##' \item `se.epc`: The standard errors of the expected parameter changes.
##' \item `lower.epc`: The lower bound of the confidence interval of expected
##' parameter changes.
##' \item `upper.epc`: The upper bound of the confidence interval of expected
##' parameter changes.
##' \item `lower.std.epc`: Lower confidence limit of standardized EPCs
##' \item `upper.std.epc`: Upper confidence limit of standardized EPCs
##' \item `decision.ci`: Decision whether the parameter is misspecified
##' based on the CI method: `"M"` represents the
##' parameter is misspecified, `"NM"` represents the parameter is not
##' misspecified, and `"I"` represents the decision is inconclusive.
##' }
##'
##' The row numbers match the results obtained from the
##' `inspect(object, "mi")` function.
##'
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' @seealso [moreFitIndices()] For the additional fit indices
##' information
##'
##' @references
##' Cohen, J. (1988). *Statistical power analysis for the
##' behavioral sciences* (2nd ed.). Hillsdale, NJ: Erlbaum.
##'
##' Cohen, J. (1992). A power primer. *Psychological Bulletin, 112*(1),
##' 155--159. \doi{10.1037/0033-2909.112.1.155}
##'
##' Saris, W. E., Satorra, A., & van der Veld, W. M. (2009). Testing structural
##' equation models or detection of misspecifications? *Structural Equation
##' Modeling, 16*(4), 561--582. \doi{10.1080/10705510903203433}
##'
##' van der Veld, W. M., Saris, W. E., & Satorra, A. (2008).
##' *JRule 3.0 Users Guide*. \doi{10.13140/RG.2.2.13609.90729}
##'
##' @examples
##'
##' library(lavaan)
##'
##' HS.model <- ' visual =~ x1 + x2 + x3 '
##' fit <- cfa(HS.model, data = HolzingerSwineford1939,
##' group = "sex", group.equal = c("loadings","intercepts"))
##' miPowerFit(fit, free.remove = FALSE, op = "=~") # loadings
##' miPowerFit(fit, free.remove = FALSE, op = "~1") # intercepts
##'
##' model <- '
##' # latent variable definitions
##' ind60 =~ x1 + x2 + x3
##' dem60 =~ y1 + a*y2 + b*y3 + c*y4
##' dem65 =~ y5 + a*y6 + b*y7 + c*y8
##'
##' # regressions
##' dem60 ~ ind60
##' dem65 ~ ind60 + dem60
##'
##' # residual correlations
##' y1 ~~ y5
##' y2 ~~ y4 + y6
##' y3 ~~ y7
##' y4 ~~ y8
##' y6 ~~ y8
##' '
##' fit2 <- sem(model, data = PoliticalDemocracy, meanstructure = TRUE)
##' miPowerFit(fit2, stdLoad = 0.3, cor = 0.2, stdBeta = 0.2, intcept = 0.5)
##'
##' @export
miPowerFit <- function(lavaanObj, stdLoad = 0.4, cor = 0.1, stdBeta = 0.1,
intcept = 0.2, stdDelta = NULL, delta = NULL,
cilevel = 0.90, ...) {
mi <- lavaan::modificationIndices(lavaanObj, ...)
mi <- mi[mi$op != "==",]
sigma <- mi[,"epc"] / sqrt(mi[,"mi"])
if (is.null(delta)) {
if (is.null(stdDelta))
stdDelta <- getTrivialEpc(mi, stdLoad = stdLoad, cor = cor,
stdBeta = stdBeta, intcept = intcept)
if (length(stdDelta) == 1) stdDelta <- rep(stdDelta, nrow(mi))
delta <- unstandardizeEpc(mi, stdDelta, findTotalVar(lavaanObj))
}
if (length(delta) == 1) delta <- rep(delta, nrow(mi))
ncp <- (delta / sigma)^2
alpha <- 0.05
desiredPow <- 0.80
cutoff <- qchisq(1 - alpha, df = 1)
pow <- 1 - pchisq(cutoff, df = 1, ncp = ncp)
sigMI <- mi[,"mi"] > cutoff
highPow <- pow > desiredPow
group <- rep(1, nrow(mi))
if ("group" %in% colnames(mi)) group <- mi[ , "group"]
decision <- mapply(decisionMIPow, sigMI = sigMI, highPow = highPow,
epc = mi[ , "epc"], trivialEpc = delta)
if (is.null(stdDelta)) stdDelta <- standardizeEpc(mi, findTotalVar(lavaanObj),
delta = delta)
result <- cbind(mi[ , 1:3], group, as.numeric(mi[ , "mi"]), mi[ , "epc"],
delta, standardizeEpc(mi, findTotalVar(lavaanObj)),
stdDelta, sigMI, highPow, decision)
# New method
crit <- abs(qnorm((1 - cilevel)/2))
seepc <- abs(result[,6]) / sqrt(abs(result[,5]))
lowerepc <- result[,6] - crit * seepc
upperepc <- result[,6] + crit * seepc
stdlowerepc <- standardizeEpc(mi, findTotalVar(lavaanObj), delta = lowerepc)
stdupperepc <- standardizeEpc(mi, findTotalVar(lavaanObj), delta = upperepc)
isVar <- mi[,"op"] == "~~" & mi[,"lhs"] == mi[,"rhs"]
decisionci <- mapply(decisionCIEpc, targetval = as.numeric(stdDelta),
lower = stdlowerepc, upper = stdupperepc,
positiveonly = isVar)
result <- cbind(result, seepc, lowerepc, upperepc, stdlowerepc,
stdupperepc, decisionci)
result <- result[!is.na(decision), ]
colnames(result) <- c("lhs","op","rhs","group","mi","epc","target.epc",
"std.epc","std.target.epc","significant.mi",
"high.power","decision.pow","se.epc","lower.epc",
"upper.epc","lower.std.epc","upper.std.epc","decision.ci")
class(result) <- c("lavaan.data.frame","data.frame")
return(result)
}
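## Hedged arithmetic sketch of the power step inside miPowerFit(), using
## hypothetical values: a fixed parameter with MI = 10 and EPC = 0.25, and a
## target (unstandardized) change of delta = 0.4:
# sigma <- 0.25 / sqrt(10)                            # epc / sqrt(mi)
# ncp <- (0.4 / sigma)^2                              # noncentrality parameter
# 1 - pchisq(qchisq(0.95, df = 1), df = 1, ncp = ncp) # power (approx. 1 here)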
## ----------------
## Hidden Functions
## ----------------
## totalFacVar: Find total factor variances when the regression coefficient matrix
## and factor residual covariance matrix are specified
totalFacVar <- function(beta, psi) {
ID <- diag(nrow(psi))
total <- solve(ID - beta) %*% psi %*% t(solve(ID - beta))
return(diag(total))
}
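## Hedged numeric check of totalFacVar(): with one regression (f2 ~ 0.5*f1)
## and residual variances 1 and 0.75, the total variances are 1 and
## 0.5^2 * 1 + 0.75 = 1.
# beta <- matrix(c(0, 0, 0.5, 0), nrow = 2, byrow = TRUE) # row 2: f2 ~ 0.5*f1
# psi <- diag(c(1, 0.75))
# totalFacVar(beta, psi) # c(1, 1)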
## findTotalVar: find the total indicator and factor variances
##' @importFrom lavaan lavInspect
findTotalVar <- function(lavaanObj) {
result <- list()
nGroups <- lavInspect(lavaanObj, "ngroups")
cov.all <- lavInspect(lavaanObj, "cov.all")
if (nGroups == 1) cov.all <- list(cov.all)
for (i in 1:nGroups) {
temp <- diag(cov.all[[i]])
names(temp) <- rownames(cov.all[[i]])
result[[i]] <- temp
}
return(result)
}
## getTrivialEpc: find the trivial misspecified expected parameter changes
## given the type of parameters in each row of modification indices
getTrivialEpc <- function(mi, stdLoad=0.4, cor=0.1, stdBeta=0.1, intcept=0.2) {
op <- mi[,"op"]
result <- gsub("=~", stdLoad, op)
result <- gsub("~~", cor, result)
result <- gsub("~1", intcept, result)
result <- gsub("~", stdBeta, result)
return(result)
}
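## Hedged illustration: getTrivialEpc() maps each operator to its default
## target EPC (the result is character, because gsub() coerces):
# getTrivialEpc(data.frame(op = c("=~", "~~", "~1", "~")))
# ## "0.4" "0.1" "0.2" "0.1"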
## unstandardizeEpc: Transform from standardized EPC to unstandardized EPC
unstandardizeEpc <- function(mi, delta, totalVar) {
name <- names(totalVar[[1]])
lhsPos <- match(mi[,"lhs"], name)
rhsPos <- match(mi[,"rhs"], name)
group <- rep(1, nrow(mi))
if("group" %in% colnames(mi)) group <- mi[,"group"]
getVar <- function(pos, group) totalVar[[group]][pos]
lhsVar <- mapply(getVar, pos=lhsPos, group=group)
rhsVar <- mapply(getVar, pos=rhsPos, group=group)
FUN <- function(op, lhsVar, rhsVar, delta) {
if(op == "|") return(NA)
lhsSD <- sqrt(lhsVar)
rhsSD <- sqrt(rhsVar)
if(!is.numeric(delta)) delta <- as.numeric(delta)
if(op == "=~") {
return((rhsSD * delta) / lhsSD)
} else if (op == "~~") {
return(lhsSD * delta * rhsSD)
} else if (op == "~1") {
return(lhsSD * delta)
} else if (op == "~") {
return((lhsSD * delta) / rhsSD)
} else {
return(NA)
}
}
unstdDelta <- mapply(FUN, op=mi[,"op"], lhsVar=lhsVar, rhsVar=rhsVar, delta=delta)
return(unstdDelta)
}
## standardizeEpc: Transform from unstandardized EPC to standardized EPC.
## If delta is null, the unstandardized EPCs from the modification indices
## data.frame are used
standardizeEpc <- function(mi, totalVar, delta = NULL) {
if(is.null(delta)) delta <- mi[,"epc"]
name <- names(totalVar[[1]])
lhsPos <- match(mi[,"lhs"], name)
rhsPos <- match(mi[,"rhs"], name)
group <- rep(1, nrow(mi))
if("group" %in% colnames(mi)) group <- mi[,"group"]
getVar <- function(pos, group) totalVar[[group]][pos]
lhsVar <- mapply(getVar, pos=lhsPos, group=group)
rhsVar <- mapply(getVar, pos=rhsPos, group=group)
FUN <- function(op, lhsVar, rhsVar, delta) {
lhsSD <- sqrt(lhsVar)
rhsSD <- sqrt(rhsVar)
if(!is.numeric(delta)) delta <- as.numeric(delta)
if(op == "=~") {
#stdload = beta * sdlatent / sdindicator = beta * lhs / rhs
return((delta / rhsSD) * lhsSD)
} else if (op == "~~") {
#r = cov / (sd1 * sd2)
return(delta / (lhsSD * rhsSD))
} else if (op == "~1") {
#d = meanDiff/sd
return(delta / lhsSD)
} else if (op == "~") {
#beta = b * sdX / sdY = b * rhs / lhs
return((delta / lhsSD) * rhsSD)
} else {
return(NA)
}
}
stdDelta <- mapply(FUN, op=mi[,"op"], lhsVar=lhsVar, rhsVar=rhsVar, delta=delta)
return(stdDelta)
}
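## Hedged numeric check of the two transformations above, for a factor
## loading (op "=~") with factor (lhs) SD = 1 and indicator (rhs) SD = 2:
## a standardized EPC of 0.4 unstandardizes to (2 * 0.4) / 1 = 0.8, and
## standardizing 0.8 back gives (0.8 / 2) * 1 = 0.4.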
## decisionMIPow: provide the decision given the significance of modification
## indices and power to detect trivial misspecification
decisionMIPow <- function(sigMI, highPow, epc, trivialEpc) {
if(is.na(sigMI) | is.na(highPow)) return(NA)
if(sigMI & highPow) {
if(abs(epc) > abs(trivialEpc)) {
return("EPC:M")
} else {
return("EPC:NM")
}
} else if (sigMI & !highPow) {
return("M")
} else if (!sigMI & highPow) {
return("NM")
} else if (!sigMI & !highPow) {
return("I")
} else {
return(NA)
}
}
decisionCIEpc <- function(targetval, lower, upper, positiveonly = FALSE) {
if(is.na(lower) | is.na(upper)) return(NA)
if(positiveonly) {
if(lower > targetval) {
return("M")
} else if (upper < targetval) {
return("NM")
} else {
return("I")
}
} else {
negtargetval <- -targetval
if(lower > targetval | upper < negtargetval) {
return("M")
} else if (upper < targetval & negtargetval < lower) {
return("NM")
} else {
return("I")
}
}
}
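## Hedged illustrations of the two decision rules, with made-up values:
# decisionMIPow(sigMI = TRUE, highPow = TRUE, epc = 0.5, trivialEpc = 0.4)  # "EPC:M"
# decisionMIPow(sigMI = TRUE, highPow = FALSE, epc = 0.5, trivialEpc = 0.4) # "M"
# decisionCIEpc(targetval = 0.1, lower = 0.15, upper = 0.30)  # "M" (CI exceeds range)
# decisionCIEpc(targetval = 0.1, lower = -0.05, upper = 0.08) # "NM" (CI within range)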
semTools/R/plausibleValues.R 0000644 0001762 0000144 00000065203 14764326230 015560 0 ustar ligges users ### Terrence D. Jorgensen
### Last updated: 12 March 2025
### function to draw plausible values of factor scores from lavPredict
# library(blavaan)
# bfit <- bcfa(HS.model, data=HolzingerSwineford1939, save.lvs = TRUE,
# bcontrol=list(method="rjparallel"), group = "school",
# #target = "stan", control=list(cores = 4, seed = 123),
# burnin = 4000, sample = 30, n.chains = 2)
# bFS <- do.call(rbind, blavInspect(bfit, "lvs"))
# do.call()
## -------------
## Main function
## -------------
##' Plausible-Values Imputation of Factor Scores Estimated from a lavaan Model
##'
##' Draw plausible values of factor scores estimated from a fitted
##' [lavaan::lavaan()] model, then treat them as multiple imputations
##' of missing data using [lavaan.mi::lavaan.mi()].
##'
##'
##' Because latent variables are unobserved, they can be considered as missing
##' data, which can be imputed using Monte Carlo methods. This may be of
##' interest to researchers with sample sizes too small to fit their complex
##' structural models. Fitting a factor model as a first step,
##' [lavaan::lavPredict()] provides factor-score estimates, which can
##' be treated as observed values in a path analysis (Step 2). However, the
##' resulting standard errors and test statistics could not be trusted because
##' the Step-2 analysis would not take into account the uncertainty about the
##' estimated factor scores. Using the asymptotic sampling covariance matrix
##' of the factor scores provided by [lavaan::lavPredict()],
##' `plausibleValues` draws a set of `nDraws` imputations from the
##' sampling distribution of each factor score, returning a list of data sets
##' that can be treated like multiple imputations of incomplete data. If the
##' data were already imputed to handle missing data, `plausibleValues`
##' also accepts an object of class [lavaan.mi::lavaan.mi-class], and will
##' draw `nDraws` plausible values from each imputation. Step 2 would
##' then take into account uncertainty about both missing values and factor
##' scores. Bayesian methods can also be used to generate factor scores, as
##' available with the \pkg{blavaan} package, in which case plausible
##' values are simply saved parameters from the posterior distribution. See
##' Asparouhov and Muthen (2010) for further technical details and references.
##'
##' Each returned `data.frame` includes a `case.idx` column that
##' indicates the corresponding rows in the data set to which the model was
##' originally fitted (unless the user requests only Level-2 variables). This
##' can be used to merge the plausible values with the original observed data,
##' but users should note that including any new variables in a Step-2 model
##' might not accurately account for their relationship(s) with factor scores
##' because they were not accounted for in the Step-1 model from which factor
##' scores were estimated.
##'
##' If `object` is a multilevel `lavaan` model, users can request
##' plausible values for latent variables at particular levels of analysis by
##' setting the [lavaan::lavPredict()] argument `level=1` or
##' `level=2`. If the `level` argument is not passed via \dots,
##' then both levels are returned in a single merged data set per draw. For
##' multilevel models, each returned `data.frame` also includes a column
##' indicating to which cluster each row belongs (unless the user requests only
##' Level-2 variables).
##'
##'
##' @importFrom lavaan lavInspect lavPredict
##'
##' @param object A fitted model of class [lavaan::lavaan-class],
##' [blavaan::blavaan-class], or [lavaan.mi::lavaan.mi-class]
##' @param nDraws `integer` specifying the number of draws, analogous to
##' the number of imputed data sets. If `object` is of class
##' [lavaan.mi::lavaan.mi-class], this will be the number of draws taken
##' *per imputation*. If `object` is of class
##' [blavaan::blavaan-class], `nDraws` cannot exceed
##' `blavInspect(object, "niter") * blavInspect(bfitc, "n.chains")`
##' (number of MCMC samples from the posterior). The drawn samples will be
##' evenly spaced (after permutation for `target="stan"`), using
##' [ceiling()] to resolve decimals.
##' @param seed `integer` passed to [set.seed()].
##' @param omit.imps `character` vector specifying criteria for omitting
##' imputations when `object` is of class [lavaan.mi::lavaan.mi-class].
##' Can include any of `c("no.conv", "no.se", "no.npd")`.
##' @param ... Optional arguments to pass to [lavaan::lavPredict()].
##' `assemble` will be ignored because multiple groups are always
##' assembled into a single `data.frame` per draw. `type` will be
##' ignored because it is set internally to `type="lv"`.
##'
##' @return A `list` of length `nDraws`, each of which is a
##' `data.frame` containing plausible values, which can be treated as
##' a `list` of imputed data sets to be passed to [lavaan.mi::lavaan.mi()]
##' (see **Examples**). If `object` is of class
##' [lavaan.mi::lavaan.mi-class], the `list` will be of length
##' `nDraws*m`, where `m` is the number of imputations.
##'
##' @author Terrence D. Jorgensen (University of Amsterdam;
##' \email{TJorgensen314@@gmail.com})
##'
##' @references
##' Asparouhov, T. & Muthen, B. O. (2010). *Plausible values for latent
##' variables using M*plus. Technical Report. Retrieved from
##' www.statmodel.com/download/Plausible.pdf
##'
##' @seealso [lavaan.mi::lavaan.mi()], [lavaan.mi::lavaan.mi-class]
##'
##' @examples
##'
##' ## example from ?cfa and ?lavPredict help pages
##' HS.model <- ' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + x8 + x9 '
##'
##' fit1 <- cfa(HS.model, data = HolzingerSwineford1939)
##' fs1 <- plausibleValues(fit1, nDraws = 3,
##' ## lavPredict() can add only the modeled data
##' append.data = TRUE)
##' lapply(fs1, head)
##'
##' \donttest{
##' ## To merge factor scores to original data.frame (not just modeled data)
##' fs1 <- plausibleValues(fit1, nDraws = 3)
##' idx <- lavInspect(fit1, "case.idx") # row index for each case
##' if (is.list(idx)) idx <- do.call(c, idx) # for multigroup models
##' data(HolzingerSwineford1939) # copy data to workspace
##' HolzingerSwineford1939$case.idx <- idx # add row index as variable
##' ## loop over draws to merge original data with factor scores
##' for (i in seq_along(fs1)) {
##' fs1[[i]] <- merge(fs1[[i]], HolzingerSwineford1939, by = "case.idx")
##' }
##' lapply(fs1, head)
##'
##'
##' ## multiple-group analysis, in 2 steps
##' step1 <- cfa(HS.model, data = HolzingerSwineford1939, group = "school",
##' group.equal = c("loadings","intercepts"))
##' PV.list <- plausibleValues(step1)
##'
##' ## subsequent path analysis
##' path.model <- ' visual ~ c(t1, t2)*textual + c(s1, s2)*speed '
##' if(requireNamespace("lavaan.mi")){
##' library(lavaan.mi)
##' step2 <- sem.mi(path.model, data = PV.list, group = "school")
##' ## test equivalence of both slopes across groups
##' lavTestWald.mi(step2, constraints = 't1 == t2 ; s1 == s2')
##' }
##'
##'
##' ## multilevel example from ?Demo.twolevel help page
##' model <- '
##' level: 1
##' fw =~ y1 + y2 + y3
##' fw ~ x1 + x2 + x3
##' level: 2
##' fb =~ y1 + y2 + y3
##' fb ~ w1 + w2
##' '
##' msem <- sem(model, data = Demo.twolevel, cluster = "cluster")
##' mlPVs <- plausibleValues(msem, nDraws = 3) # both levels by default
##' lapply(mlPVs, head, n = 10)
##' ## only Level 1
##' mlPV1 <- plausibleValues(msem, nDraws = 3, level = 1)
##' lapply(mlPV1, head)
##' ## only Level 2
##' mlPV2 <- plausibleValues(msem, nDraws = 3, level = 2)
##' lapply(mlPV2, head)
##'
##'
##'
##' ## example with 20 multiple imputations of missing data:
##' nPVs <- 5
##' nImps <- 20
##'
##' if(requireNamespace("lavaan.mi")){
##' data(HS20imps, package = "lavaan.mi")
##'
##' ## specify CFA model from lavaan's ?cfa help page
##' HS.model <- '
##' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + x8 + x9
##' '
##' out2 <- cfa.mi(HS.model, data = HS20imps)
##' PVs <- plausibleValues(out2, nDraws = nPVs)
##'
##' idx <- out2@@Data@@case.idx # can't use lavInspect() on lavaan.mi
##' ## empty list to hold expanded imputations
##' impPVs <- list()
##' for (m in 1:nImps) {
##' HS20imps[[m]]["case.idx"] <- idx
##' for (i in 1:nPVs) {
##' impPVs[[ nPVs*(m - 1) + i ]] <- merge(HS20imps[[m]],
##' PVs[[ nPVs*(m - 1) + i ]],
##' by = "case.idx")
##' }
##' }
##' lapply(impPVs, head)
##' }
##'
##' }
##'
##' @export
plausibleValues <- function(object, nDraws = 20L, seed = 12345,
omit.imps = c("no.conv","no.se"), ...) {
if (inherits(object, "lavaan")) {
## generate vector of seeds
set.seed(seed)
seeds <- sample(100000:9999999, size = nDraws, replace = FALSE)
PV <- lapply(seeds, plaus.lavaan, object = object, ...)
} else if (inherits(object, "lavaan.mi")) {
## generate vector of seeds
set.seed(seed)
seeds <- sample(100000:9999999, size = nDraws, replace = FALSE)
PV <- plaus.mi(object, seeds = seeds, omit.imps = omit.imps, ...)
} else if (inherits(object, "blavaan")) {
PV <- plaus.blavaan(object, nDraws = nDraws, seed = seed, ...)
#TODO: pass nDraws to sample() iterations?
} else stop("object's class not valid: ", class(object))
PV
}
## ----------------
## Hidden functions
## ----------------
## draw 1 set of plausible values from a lavaan object
##' @importFrom lavaan lavInspect lavPredict lavNames
plaus.lavaan <- function(seed = 1, object, ...) {
stopifnot(inherits(object, "lavaan"))
if (lavInspect(object, "categorical")) {
stop("Plausible values not available (yet) for categorical data")
}
if (lavInspect(object, "options")$missing %in% c("ml", "ml.x")) {
#TODO: verify this:
stop("Plausible values not available (yet) for missing data + fiml.\n",
" Multiple imputations can be used via lavaan.mi()")
}
#FIXME? https://github.com/yrosseel/lavaan/issues/156
set.seed(seed)
cluster <- lavInspect(object, "cluster")
group <- lavInspect(object, "group")
group.label <- lavInspect(object, "group.label")
nG <- lavInspect(object, "ngroups")
nL <- lavInspect(object, "nlevels")
l.names <- o.names <- list()
if (nG == 1L && nL == 1L) {
## single block
l.names <- list(lavNames(object, "lv"))
o.names <- list(lavNames(object, "ov"))
} else if (nG == 1L && nL > 1L) {
## multilevel
for (BB in 1:nL) {
l.names <- c(l.names, list(lavNames(object, "lv", block = BB)))
o.names <- c(o.names, list(lavNames(object, "ov", block = BB)))
}
} else if (nG > 1L && nL == 1L) {
## multigroup
for (BB in 1:nG) {
l.names <- c(l.names, list(lavNames(object, "lv", block = BB)))
o.names <- c(o.names, list(lavNames(object, "ov", block = BB)))
}
} else {
## multilevel + multigroup
for (BB in 1:(nG*nL)) { #FIXME: lavInspect(object, "nblocks")
l.names <- c(l.names, list(lavNames(object, "lv", block = BB)))
o.names <- c(o.names, list(lavNames(object, "ov", block = BB)))
}
}
## extract factor scores + covariance matrix
fsArgs <- list(...)
fsArgs$type <- "lv"
fsArgs$assemble <- FALSE # assemble after drawing
append.data <- fsArgs$append.data
if (is.null(append.data)) append.data <- FALSE # default in lavPredict()
only.L2 <- fsArgs$level == 2L
if (length(only.L2) == 0L) only.L2 <- FALSE
if (only.L2) fsArgs$append.data <- append.data <- FALSE #FIXME: how will Yves handle lavPredict(fit, append=T, level=2)?
bothLevels <- nL > 1L && is.null(fsArgs$level)
fsArgs$object <- object
fsArgs$acov <- "standard" #FIXME: update if other options become available
FS <- do.call(lavPredict, fsArgs) #FIXME: breaks when multigroup MLSEM: https://github.com/yrosseel/lavaan/issues/157
## also draw Level 2, if multilevel and no specific level requested
if (bothLevels) {
fsArgs$level <- 2L
fsArgs$append.data <- FALSE #FIXME: how will Yves handle lavPredict(fit, append=T, level=2)?
FS2 <- do.call(lavPredict, fsArgs)
}
## draw plausible values, if factor scores exist
if (nG == 1L) {
if (ncol(FS) == 0L) {
PV <- FS
} else {
ACOV <- attr(FS, "acov")[[1]]
v.idx <- if (only.L2) 2L else 1L
PV <- apply(FS[ , l.names[[v.idx]], drop = FALSE], 1, function(mu) {
mnormt::rmnorm(n = 1, mean = mu, varcov = ACOV)
})
if (is.null(dim(PV))) {
PV <- as.matrix(PV)
colnames(PV) <- l.names[[v.idx]]
} else PV <- t(PV)
if (append.data) {
PV <- cbind(FS[ , o.names[[v.idx]], drop = FALSE], PV)
}
}
## add Level 2 if multilevel and no specific level requested
if (bothLevels) {
if (ncol(FS2) == 0L) {
PV2 <- FS2
} else {
ACOV2 <- attr(FS2, "acov")[[1]]
#FIXME: how will Yves handle lavPredict(fit, append=T, level=2)?
      PV2 <- apply(FS2, 1, function(mu) {
        mnormt::rmnorm(n = 1, mean = mu, varcov = ACOV2)
      })
if (is.null(dim(PV2))) {
PV2 <- as.matrix(PV2)
colnames(PV2) <- l.names[[2]]
} else PV2 <- t(PV2)
}
}
} else {
ACOV <- list()
PV <- list()
for (gg in 1:nG) {
if (ncol(FS[[gg]]) == 0L) {
PV[[gg]] <- FS[[gg]]
} else {
ACOV[[gg]] <- attr(FS, "acov")[[gg]]
v.idx <- if (only.L2) (2L + (gg - 1L)*nL) else (1L + (gg - 1L)*nL)
PV[[gg]] <- apply(FS[[gg]][ , l.names[[v.idx]], drop = FALSE], 1, function(mu) {
mnormt::rmnorm(n = 1, mean = mu, varcov = ACOV[[gg]])
})
if (is.null(dim(PV[[gg]]))) {
PV[[gg]] <- as.matrix(PV[[gg]])
colnames(PV[[gg]]) <- l.names[[v.idx]]
} else PV[[gg]] <- t(PV[[gg]])
        ## append observed data only when factor scores exist (mirrors the
        ## single-group branch above, and avoids referencing v.idx before
        ## it is defined)
        if (append.data) {
          PV[[gg]] <- cbind(FS[[gg]][ , o.names[[v.idx]], drop = FALSE], PV[[gg]])
        }
      }
    }
## add Level 2 if multilevel and no specific level requested
if (bothLevels) {
ACOV2 <- list()
PV2 <- list()
for (gg in 1:nG) {
if (ncol(FS2[[gg]]) == 0L) {
PV2[[gg]] <- FS2[[gg]]
} else {
ACOV2[[gg]] <- attr(FS2, "acov")[[gg]]
#FIXME: how will Yves handle lavPredict(fit, append=T, level=2)?
PV2[[gg]] <- apply(FS2[[gg]], 1, function(mu) {
mnormt::rmnorm(n = 1, mean = mu, varcov = ACOV2[[gg]])
})
if (is.null(dim(PV2[[gg]]))) {
PV2[[gg]] <- as.matrix(PV2[[gg]])
colnames(PV2[[gg]]) <- colnames(FS2[[gg]])
} else PV2[[gg]] <- t(PV2[[gg]])
}
}
}
}
## save as data.frame
if (nG > 1L) {
temp <- lapply(1:nG, function(gg) {
dd <- data.frame(PV[[gg]])
## add groups if multiple
dd[ , group] <- group.label[gg]
dd <- dd[ , c(group, setdiff(names(dd), group)), drop = FALSE ]
## attach row indices from original data for optional merging
if (only.L2) {
dd[ , cluster] <- lavInspect(object, "cluster.id")[[gg]]
dd <- dd[ , c(cluster, setdiff(names(dd), cluster)), drop = FALSE ]
} else {
dd <- cbind(case.idx = lavInspect(object, "case.idx")[[gg]], dd)
}
## attach cluster IDs, if multilevel and no level requested
if (bothLevels) {
dd[ , cluster] <- lavInspect(object, "cluster.label")[[gg]]
d2 <- data.frame(PV2[[gg]])
d2[ , group] <- group.label[gg]
d2[ , cluster] <- lavInspect(object, "cluster.id")[[gg]]
dd <- merge(dd, d2, by = c(group, cluster), all = TRUE)
}
dd
})
PV <- do.call(rbind, temp)
} else {
PV <- data.frame(PV)
## attach row indices from original data for optional merging
if (only.L2) {
PV[ , cluster] <- lavInspect(object, "cluster.id")
PV <- PV[ , c(cluster, setdiff(names(PV), cluster)), drop = FALSE ]
} else {
PV <- cbind(case.idx = lavInspect(object, "case.idx"), PV)
}
## attach cluster IDs, if multilevel and no level requested
if (bothLevels) {
PV[ , cluster] <- lavInspect(object, "cluster.label")
PV2 <- data.frame(PV2)
PV2[ , cluster] <- lavInspect(object, "cluster.id")
PV <- merge(PV, PV2, by = cluster, all = TRUE)
}
}
PV
}
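## Core idea behind plaus.lavaan() (a hedged illustration): each case's
## plausible value is one draw from a normal distribution centered at that
## case's factor score, with the factor scores' sampling covariance (ACOV)
## as the covariance matrix. All numbers below are hypothetical:
# mu   <- c(f1 = 0.3, f2 = -0.1) # one case's estimated factor scores
# ACOV <- diag(c(0.20, 0.25))    # hypothetical sampling (co)variances
# mnormt::rmnorm(n = 1, mean = mu, varcov = ACOV)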
## draw plausible values from a lavaan.mi object
##' @importFrom lavaan lavInspect lavPredict
plaus.mi <- function(object, seeds = 1:5, omit.imps = c("no.conv","no.se"), ...) {
stopifnot(inherits(object, "lavaan.mi"))
if (!"package:lavaan.mi" %in% search()) attachNamespace("lavaan.mi")
useImps <- rep(TRUE, length(object@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(object@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(object@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(object@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(object@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
m <- sum(useImps)
useImps <- which(useImps)
## check if model has converged
  if (m == 0L) stop("No models converged. Plausible values unavailable.")
oldCall <- object@lavListCall
if (!is.null(oldCall$parallel)) {
if (oldCall$parallel == "snow") {
oldCall$parallel <- "no"
oldCall$ncpus <- 1L
message("Unable to pass lavaan::lavPredict() arguments ",
"when parallel='snow'. Switching to parallel='no'.",
" Unless using Windows, parallel='multicore' should work.")
}
}
  ## call lavaanList() again to draw plausible values from each imputation
oldCall$dataList <- object@DataList[useImps]
oldCall$FUN <- function(obj) lapply(seeds, plaus.lavaan, object = obj, ...)
FIT <- eval(as.call(oldCall))
## check if there are any results
noFS <- sapply(FIT@funList, is.null)
  if (all(noFS)) stop("No success drawing plausible values for any imputation.")
do.call(c, FIT@funList) # concatenate lists
}
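## Returned structure (hedged note): plaus.mi() concatenates the nDraws sets
## drawn from each retained imputation, so with m imputations the list has
## m*nDraws elements, and draw i from imputation k is element
## nDraws*(k - 1) + i (as in the roxygen example above). `fit.mi` below is
## a hypothetical lavaan.mi object:
# PVs <- plausibleValues(fit.mi, nDraws = 5)
# PVs[[ 5*(2 - 1) + 3 ]] # 3rd draw from the 2nd retained imputation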
## draw 1 set of plausible values from a blavaan object
##' @importFrom lavaan lavNames lavInspect
plaus.blavaan <- function(object, nDraws = 20L, seed = 12345, ...) {
stopifnot(inherits(object, "blavaan"))
if (!"package:blavaan" %in% search()) attachNamespace("blavaan")
# cluster <- lavInspect(object, "cluster")
group <- lavInspect(object, "group")
group.label <- lavInspect(object, "group.label")
nG <- lavInspect(object, "ngroups")
# nL <- lavInspect(object, "nlevels")
case.idx <- lavInspect(object, "case.idx")
## plausible values of what? (could be latent item responses)
dots <- list(...)
if (is.null(dots$type)) dots$type <- "lv" # default to factor scores
dots$object <- object
## stack factor scores from each chain (one row per PV)
FS <- do.call(blavaan::blavPredict, dots)
#NOTE: might be latent responses
  ## only save nDraws from posterior
  ## (when FS is a matrix, posterior draws are rows, so count rows, not cells)
  nSamps <- if (is.list(FS)) length(FS) else nrow(FS)
  if (nDraws > nSamps) {
    ## why would anyone want this many? Or sample so few during estimation?
    message('nDraws cannot exceed the number of posterior samples in `object=`.',
            '\nSetting nDraws = blavInspect(object, "niter") * ',
            'blavInspect(object, "n.chains")')
    nDraws <- nSamps
  }
  set.seed(seed)
  idx.sample <- ceiling(1:nDraws * nSamps/nDraws)
#FIXME: if Ed accepts pull request, format will be the same as c("yhat","ypred")
if (dots$type == "lv" && utils::compareVersion(utils::packageDescription('blavaan')$Version, '0.4-2.949') >= 0L) {
## column names contain indices to store PVs in matrix
eta.idx <- colnames(FS)
## N and latent variable names, to know dimensions of PV
N <- lavInspect(object, "ntotal")
etas <- lavNames(object, "lv") #FIXME: assumes same model in both groups
PV <- list()
## loop over nDraws rows, assign columns to eta matrix, save in PV list
    for (d in seq_along(idx.sample)) {
      i <- idx.sample[d] # row of FS holding this posterior draw
      eta <- matrix(NA, nrow = N, ncol = length(etas), dimnames = list(NULL, etas))
      ## column names of FS are expressions like "eta[1,1]", so evaluating
      ## them assigns each sampled value to its cell in the eta matrix
      for (j in eta.idx) eval(parse(text = paste(j, "<-", FS[i, j]) ))
      PV[[d]] <- data.frame(eta) # index PV by d so the list has no NULL gaps
      ## add case indices, and groups (if applicable)
      if (nG == 1L) PV[[d]]$case.idx <- case.idx else {
        PV[[d]]$case.idx <- do.call(c, case.idx)
        PV[[d]][ , group] <- rep(group.label, times = lavInspect(object, "nobs"))
      }
    }
} else {
## latent responses, already a list
PV <- list()
    for (d in seq_along(idx.sample)) {
      ## convert matrix to data.frame (index PV by d to avoid NULL gaps)
      PV[[d]] <- data.frame(FS[[ idx.sample[d] ]])
      ## add case indices, and groups (if applicable)
      if (nG == 1L) PV[[d]]$case.idx <- case.idx else {
        PV[[d]]$case.idx <- do.call(c, case.idx)
        PV[[d]][ , group] <- rep(group.label, times = lavInspect(object, "nobs"))
      }
    }
} #else stop('Not implemented for blavPredict(object, type="', dots$type,'")')
PV
}
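## How posterior draws are thinned above (hedged illustration): idx.sample
## picks nDraws evenly spaced iterations from the stacked posterior samples,
## e.g., with 400 samples and nDraws = 4:
# ceiling(1:4 * 400/4) # 100 200 300 400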
## ------
## Checks
## ------
# HS.model <- ' visual =~ x1 + x2 + x3
# textual =~ x4 + x5 + x6
# speed =~ x7 + x8 + x9 '
#
# fit1 <- cfa(HS.model, data = HolzingerSwineford1939)
# fs1 <- plausibleValues(fit1, nDraws = 3, append.data = T)
# lapply(fs1, head)
#
#
#
# step1 <- cfa(HS.model, data = HolzingerSwineford1939, group = "school",
# group.equal = c("loadings","intercepts"))
# PV.list <- plausibleValues(step1, append.data = T)
# lapply(PV.list[1:3], head)
#
#
# model <- '
# level: 1
# fw =~ y1 + y2 + y3
# fw ~ x1 + x2 + x3
# level: 2
# fb =~ y1 + y2 + y3
# fb ~ w1 + w2
# '
# msem <- sem(model, data = Demo.twolevel, cluster = "cluster")
# mlPVs <- plausibleValues(msem, nDraws = 3, append.data = T) # both levels by default
# lapply(mlPVs, head, n = 10)
# ## only Level 1
# mlPV1 <- plausibleValues(msem, nDraws = 3, level = 1, append.data = T)
# lapply(mlPV1, head)
# ## only Level 2
# mlPV2 <- plausibleValues(msem, nDraws = 3, level = 2, append.data = T)
# lapply(mlPV2, head)
#
#
#
# data(Demo.twolevel)
# Demo.twolevel$g <- ifelse(Demo.twolevel$cluster %% 2L, "foo", "bar") # arbitrary groups
# table(Demo.twolevel$g)
# model2 <- ' group: foo
# level: within
# fw =~ y1 + L2*y2 + L3*y3
# fw ~ x1 + x2 + x3
# level: between
# fb =~ y1 + L2*y2 + L3*y3
# fb ~ w1 + w2
#
# group: bar
#
# level: within
# fw =~ y1 + L2*y2 + L3*y3
# fw ~ x1 + x2 + x3
# level: between
# fb =~ y1 + L2*y2 + L3*y3
# fb ~ w1 + w2
# '
# msem2 <- sem(model2, data = Demo.twolevel, cluster = "cluster", group = "g")
# ml2PVs <- plausibleValues(msem2, nDraws = 3, append.data = T) # both levels by default
# lapply(ml2PVs, head, n = 10)
# ## only Level 1
# ml2PV1 <- plausibleValues(msem2, nDraws = 3, level = 1, append.data = T)
# lapply(ml2PV1, head)
# ## only Level 2
# ml2PV2 <- plausibleValues(msem2, nDraws = 3, level = 2, append.data = T)
# lapply(ml2PV2, head)
## ordered-categorical data
# data(datCat)
#
# modc <- ' ## Set thresholds equal across groups
# ## thresholds at Time 1
# u1 | c(tau1.1, tau1.1)*t1 + c(tau1.2, tau1.2)*t2 + c(tau1.3, tau1.3)*t3 + c(tau1.4, tau1.4)*t4
# u2 | c(tau2.1, tau2.1)*t1 + c(tau2.2, tau2.2)*t2 + c(tau2.3, tau2.3)*t3 + c(tau2.4, tau2.4)*t4
# u3 | c(tau3.1, tau3.1)*t1 + c(tau3.2, tau3.2)*t2 + c(tau3.3, tau3.3)*t3 + c(tau3.4, tau3.4)*t4
# u4 | c(tau4.1, tau4.1)*t1 + c(tau4.2, tau4.2)*t2 + c(tau4.3, tau4.3)*t3 + c(tau4.4, tau4.4)*t4
# ## thresholds at Time 2 equal to Time 1
# u5 | c(tau1.1, tau1.1)*t1 + c(tau1.2, tau1.2)*t2 + c(tau1.3, tau1.3)*t3 + c(tau1.4, tau1.4)*t4
# u6 | c(tau2.1, tau2.1)*t1 + c(tau2.2, tau2.2)*t2 + c(tau2.3, tau2.3)*t3 + c(tau2.4, tau2.4)*t4
# u7 | c(tau3.1, tau3.1)*t1 + c(tau3.2, tau3.2)*t2 + c(tau3.3, tau3.3)*t3 + c(tau3.4, tau3.4)*t4
# u8 | c(tau4.1, tau4.1)*t1 + c(tau4.2, tau4.2)*t2 + c(tau4.3, tau4.3)*t3 + c(tau4.4, tau4.4)*t4
# ## define latent responses as single-indicator factors (resid. var = 0)
# y1 =~ 1*u1 ; u1 ~~ c(0, 0)*u1
# y2 =~ 1*u2 ; u2 ~~ c(0, 0)*u2
# y3 =~ 1*u3 ; u3 ~~ c(0, 0)*u3
# y4 =~ 1*u4 ; u4 ~~ c(0, 0)*u4
# y5 =~ 1*u5 ; u5 ~~ c(0, 0)*u5
# y6 =~ 1*u6 ; u6 ~~ c(0, 0)*u6
# y7 =~ 1*u7 ; u7 ~~ c(0, 0)*u7
# y8 =~ 1*u8 ; u8 ~~ c(0, 0)*u8
# ## only fix mean=0 in first group/occasion
# y1 + y2 + y3 + y4 ~ c( 0, NA)*1
# y5 + y6 + y7 + y8 ~ c(NA, NA)*1
# ## only fix variance=1 in first group/occasion
# y1 ~~ c( 1, NA)*y1
# y2 ~~ c( 1, NA)*y2
# y3 ~~ c( 1, NA)*y3
# y4 ~~ c( 1, NA)*y4
# y5 ~~ c(NA, NA)*y5
# y6 ~~ c(NA, NA)*y6
# y7 ~~ c(NA, NA)*y7
# y8 ~~ c(NA, NA)*y8
# ## estimate all covariances
# y1 ~~ y2 + y3 + y4 + y5 + y6 + y7 + y8
# y2 ~~ y3 + y4 + y5 + y6 + y7 + y8
# y3 ~~ y4 + y5 + y6 + y7 + y8
# y4 ~~ y5 + y6 + y7 + y8
# y5 ~~ y6 + y7 + y8
# y6 ~~ y7 + y8
# y7 ~~ y8
# '
# ## fit in lavaan for sanity check
# fitc <- lavaan(modc, data = datCat, group = "g", parameterization = "theta")
# summary(fitc)
#
# ## impose 5% MCAR
# set.seed(123)
# for (i in 1:8) datCat[sample(1:nrow(datCat), size = .05*nrow(datCat)), i] <- NA
#
# ## try with pairwise deletion
# fitcm <- lavaan(modc, data = datCat, group = "g", parameterization = "theta",
# missing = "pairwise")
# summary(fitcm)
#
# ## try blavaan
# data(datCat) # doesn't yet work with missing ordinal data
# bfitc <- blavaan(modc, data = datCat, group = "g", ordered = TRUE, # why is this needed when they are already ordered?
# n.chains = 2, burnin = 500, sample = 101, seed = 123,
# bcontrol = list(cores = 2),
# save.lvs = TRUE)
# summary(bfitc)
# LIRs <- blavPredict(bfitc, type = "ypred")
# yhats <- blavPredict(bfitc, type = "yhat") # same when resid. var = 0?
# fscores <- blavPredict(bfitc, type = "lv") # same as LIRs?
#
#
# length(LIRs) # list: 1 N*p matrix per chain
# length(yhats) # list: 1 N*p matrix per chain
# length(fscores) # matrix: 1 row per chain, 1 column per [N, fs]
# ## all are basically interchangeable
# ch <- 2
# cor(cbind(yhats = as.numeric(yhats[[ch]]),
# LIRs = as.numeric(LIRs[[ch]]),
# fscores = fscores[ch,]))
# ## compare means (by group)
# aggregate(yhats[[ch]], by = datCat["g"], FUN = mean)
# aggregate( LIRs[[ch]], by = datCat["g"], FUN = mean)
# aggregate(matrix(fscores[ch,], ncol = 8), by = datCat["g"], FUN = mean)
# ## compare to lavaan
# do.call(rbind, sapply(lavInspect(fitc, "est"),
# function(i) i$alpha[,1], simplify = FALSE))
#
#
#
# ## now a CFA (so there are latent item responses and common factors)
# mod <- ' FU1 =~ u1 + u2 + u3 + u4
# FU2 =~ u5 + u6 + u7 + u8 '
# fit <- cfa(mod, data = datCat, std.lv = TRUE)
# bfit <- bcfa(mod, data = datCat, std.lv = TRUE, ordered = TRUE,
# n.chains = 2, burnin = 100, sample = 101, seed = 123,
# save.lvs = TRUE)
# summary(bfit)
# fscores <- blavPredict(bfit, type = "lv")
# LIRs <- blavPredict(bfit, type = "ypred")
# FS <- plausibleValues(bfit)
# LIRs <- plausibleValues(bfit, type = "ypred")
semTools/R/htmt.R
### Ylenio Longo & Terrence D. Jorgensen
### Last updated: 31 January 2023
##' Assessing Discriminant Validity using Heterotrait--Monotrait Ratio
##'
##' This function assesses discriminant validity through the
##' heterotrait--monotrait ratio (HTMT) of the correlations (Henseler, Ringle, &
##' Sarstedt, 2015). Specifically, it assesses the arithmetic (Henseler et al.,
##' 2015) or geometric (Roemer et al., 2021) mean correlation
##' among indicators across constructs (i.e. heterotrait--heteromethod
##' correlations) relative to the geometric-mean correlation among indicators
##' within the same construct (i.e. monotrait--heteromethod correlations).
##' The resulting HTMT(2) values are interpreted as estimates of inter-construct
##' correlations. Absolute values of the correlations are recommended to
##' calculate the HTMT matrix, and are required to calculate HTMT2. Correlations
##' are estimated using the [lavaan::lavCor()] function.
##'
##'
##' @importFrom stats cov2cor
##'
##' @param model lavaan [lavaan::model.syntax()] of a confirmatory factor
##' analysis model. At least two factors are required, with multiple
##' indicators measuring each construct.
##' @param data A `data.frame` or data `matrix`
##' @param sample.cov A covariance or correlation matrix can be used, instead of
##' `data=`, to estimate the HTMT values.
##' @param missing If `"listwise"`, cases with missing values are removed listwise
##' from the data frame. If `"direct"` or `"ml"` or `"fiml"` and the estimator is
##' maximum likelihood, an EM algorithm is used to estimate the unrestricted
##' covariance matrix (and mean vector). If `"pairwise"`, pairwise deletion is
##' used. If `"default"`, the value is set depending on the estimator and the
##' mimic option (see details in [lavaan::lavCor()]).
##' @param ordered Character vector. Only used if `data=` is a `data.frame`.
##' Treat these variables as ordered (ordinal) variables. Importantly, all
##' other variables will be treated as numeric (unless they are already
##' declared as `ordered` in `data=`). See also [lavaan::lavCor()].
##' @param absolute `logical` indicating whether HTMT values should be
##' estimated based on absolute correlations (default is `TRUE`). This is
##' recommended for HTMT but required for HTMT2 (so this argument is
##' silently ignored when `htmt2=TRUE`).
##' @param htmt2 `logical` indicating whether to use the geometric mean
##' (default, appropriate for congeneric indicators) or arithmetic mean
##' (which assumes tau-equivalence).
##'
##' @return A matrix showing HTMT(2) values (i.e., discriminant validity)
##' between each pair of factors.
##'
##' @author
##' Ylenio Longo (University of Nottingham; \email{yleniolongo@@gmail.com})
##'
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @references
##' Henseler, J., Ringle, C. M., & Sarstedt, M. (2015). A new criterion for
##' assessing discriminant validity in variance-based structural equation
##' modeling. *Journal of the Academy of Marketing Science, 43*(1),
##' 115--135. \doi{10.1007/s11747-014-0403-8}
##'
##' Roemer, E., Schuberth, F., & Henseler, J. (2021). HTMT2---An improved
##' criterion for assessing discriminant validity in structural equation
##' modeling. *Industrial Management & Data Systems, 121*(21), 2637--2650.
##' \doi{10.1108/IMDS-02-2021-0082}
##'
##' Voorhees, C. M., Brady, M. K., Calantone, R., & Ramirez, E. (2016).
##' Discriminant validity testing in marketing: An analysis, causes for
##' concern, and proposed remedies.
##' *Journal of the Academy of Marketing Science, 44*(1), 119--134.
##' \doi{10.1007/s11747-015-0455-4}
##'
##' @examples
##'
##' HS.model <- ' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + x8 + x9 '
##'
##' dat <- HolzingerSwineford1939[, paste0("x", 1:9)]
##' htmt(HS.model, dat)
##'
##' ## save covariance matrix
##' HS.cov <- cov(HolzingerSwineford1939[, paste0("x", 1:9)])
##' ## HTMT using arithmetic mean
##' htmt(HS.model, sample.cov = HS.cov, htmt2 = FALSE)
##'
##' @export
htmt <- function(model, data = NULL, sample.cov = NULL, missing = "listwise",
ordered = NULL, absolute = TRUE, htmt2 = TRUE) {
model <- lavaan::lavaanify(model)
model <- model[model$op %in% "=~", ]
factors <- unique(model$lhs)
nf <- length(factors)
var <- list()
for (i in 1:nf) {
var[[i]] <- model$rhs[which(model$lhs %in% factors[i])]
}
varnames <- c(unlist(var))
if(!is.null(data)) { # if data
if(any(! varnames %in% colnames(data))) {
absent.vars <- which(! varnames %in% colnames(data))
stop("Missing observed variables in the dataset: ",
paste(varnames[absent.vars], collapse = " "))
}
data <- data[ , c(varnames)]
R <- lavaan::lavCor(data, missing = missing, ordered = ordered)
rownames(R) <- names(data)
colnames(R) <- names(data)
} else {
if (any(! varnames %in% colnames(sample.cov))) {
absent.vars <- which(! varnames %in% colnames(sample.cov))
stop("Missing observed variables in the covariance or correlation matrix: ",
paste(varnames[absent.vars], collapse = " "))
}
diagR <- diag(sample.cov)
    if (!all(diagR == 1)) { # covariance matrix, so standardize
      R <- cov2cor(sample.cov[varnames, varnames])
    } else { # already a correlation matrix
      R <- sample.cov[varnames, varnames]
    }
}
if (absolute || htmt2) {
R <- abs(R)
}
diag(R) <- NA
m.cor.w <- list()
for (i in 1:nf) {
if (htmt2) {
m.cor.w[[i]] <- exp(mean(log(R[ var[[i]], var[[i]] ]), na.rm = TRUE))
} else m.cor.w[[i]] <- mean(R[ var[[i]], var[[i]] ], na.rm = TRUE)
}
m.cor.w <- as.numeric(m.cor.w)
comb <- expand.grid(1:nf, 1:nf)
g <- list()
for (i in 1:nrow(comb)) {
g[[i]] <- sqrt(m.cor.w[comb[i, 2]] * m.cor.w[comb[i, 1]])
}
g <- as.numeric(g)
m.cor.a <- list()
for (i in 1:nrow(comb)) {
if (htmt2) {
m.cor.a[[i]] <- exp(mean(log(R[ var[[comb[i, 2]]],
var[[comb[i, 1]]] ]),
na.rm = TRUE))
} else m.cor.a[[i]] <- mean(R[ var[[ comb[i,2] ]],
var[[ comb[i,1] ]] ], na.rm = TRUE)
}
m.cor.a <- as.numeric(m.cor.a)
outhtmt <- m.cor.a / g
  res <- matrix(outhtmt, nrow = nf, ncol = nf,
                dimnames = list(factors, factors))
class(res) <- c("lavaan.matrix.symmetric", "matrix")
res
}
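## Manual illustration of the HTMT computation (a hedged sketch; `dat` refers
## to the data in the roxygen example above, and the two-factor split
## F1 = {x1, x2}, F2 = {x3, x4} is illustrative). HTMT is the mean
## heterotrait--heteromethod correlation divided by the geometric mean of
## the two monotrait--heteromethod means:
# R <- abs(lavaan::lavCor(dat[paste0("x", 1:4)]))
# hetero <- mean(R[c("x1","x2"), c("x3","x4")])
# mono1  <- R["x1","x2"] # with 2 indicators, the mean is a single correlation
# mono2  <- R["x3","x4"]
# hetero / sqrt(mono1 * mono2)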
semTools/R/semTools.R
### Terrence D. Jorgensen
### Last updated: 10 June 2024
### package documentation, along with convenience documentation (e.g., imports)
##' semTools: Useful Tools for Structural Equation Modeling
##'
##' The \pkg{semTools} package provides many miscellaneous functions that are
##' useful for statistical analysis involving SEM in R. Many functions extend
##' the functionality of the \pkg{lavaan} package. Some sets of functions in
##' \pkg{semTools} correspond to the same theme. We call such a collection of
##' functions a *suite*. Our suites include:
##' \itemize{
##' \item{Model Fit Evaluation:
##' [moreFitIndices()],
##' [nullRMSEA()],
##' [singleParamTest()],
##' [miPowerFit()], and
##' [chisqSmallN()]}
##' \item{Measurement Invariance:
##' [measEq.syntax()],
##' [partialInvariance()],
##' [partialInvarianceCat()], and
##' [permuteMeasEq()]}
##' \item{Power Analysis:
##' [SSpower()],
##' [findRMSEApower()],
##' [plotRMSEApower()],
##' [plotRMSEAdist()],
##' [findRMSEAsamplesize()],
##' [findRMSEApowernested()],
##' [plotRMSEApowernested()], and
##' [findRMSEAsamplesizenested()]}
##' \item{Missing Data Analysis:
## [runMI()], # DEPRECATE
##' [auxiliary()],
##' [twostage()],
##' [fmi()],
##' [bsBootMiss()],
##' [quark()], and
##' [combinequark()]}
##' \item{Latent Interactions:
##' [indProd()],
##' [orthogonalize()],
##' [probe2WayMC()],
##' [probe3WayMC()],
##' [probe2WayRC()],
##' [probe3WayRC()], and
##' [plotProbe()]}
##' \item{Exploratory Factor Analysis (EFA):
##' [efa.ekc()]}
##' \item{Reliability Estimation:
##' [compRelSEM()] and
##' [maximalRelia()]
##' (see also [AVE()])}
##' \item{Parceling:
##' [parcelAllocation()],
##' [PAVranking()], and
##' [poolMAlloc()]}
##' \item{Non-Normality:
##' [skew()],
##' [kurtosis()],
##' [mardiaSkew()],
##' [mardiaKurtosis()], and
##' [mvrnonnorm()]}
##' }
##' All users of R (or SEM) are invited to submit functions or ideas for
##' functions by contacting the maintainer, Terrence Jorgensen
##' (\email{TJorgensen314@gmail.com}). Contributors are encouraged to use
##' `Roxygen` comments to document their contributed code, which is
##' consistent with the rest of \pkg{semTools}. Read the vignette from the
##' \pkg{roxygen2} package for details:
##' `vignette("rd", package = "roxygen2")`
##'
##' @name semTools
##' @aliases semTools-package
NULL
##' @importFrom methods setClass setMethod getMethod show is new slot as hasArg
NULL
##' @importFrom graphics hist plot par abline lines legend
NULL
##' @importFrom stats nobs residuals resid fitted fitted.values coef anova vcov
NULL
semTools/R/mvrnonnorm.R
### Yves Rosseel, Sunthud Pornprasertmanit, & Terrence D. Jorgensen
### Last updated: 10 June 2024
##' Generate Non-normal Data using Vale and Maurelli (1983) method
##'
##' Generate Non-normal Data using the Vale and Maurelli (1983) method. The
##' function is designed to be as similar as possible to the popular
##' `mvrnorm` function in the `MASS` package. The argument-checking code is
##' adapted from the `mvrnorm` function in the `MASS` package, and the data
##' generation via the Vale and Maurelli (1983) method is adapted from the
##' `lavaan` package.
##'
##'
##' @importFrom stats cov2cor
##'
##' @param n Sample size
##' @param mu A mean vector. If elements are named, those will be used as
##' variable names in the returned data matrix.
##' @param Sigma A positive-definite symmetric matrix specifying the covariance
##' matrix of the variables. If rows or columns are named (and `mu` is
##' unnamed), those will be used as variable names in the returned data matrix.
##' @param skewness A vector of skewness of the variables
##' @param kurtosis A vector of excessive kurtosis of the variables
##' @param empirical deprecated, ignored.
##' @return A data matrix
##' @author The original function is the [lavaan::simulateData()]
##' function written by Yves Rosseel in the `lavaan` package. The function
##' is adjusted for a convenient usage by Sunthud Pornprasertmanit
##' (\email{psunthud@@gmail.com}). Terrence D. Jorgensen added the feature to
##' retain variable names from `mu` or `Sigma`.
##'
##' @references Vale, C. D. & Maurelli, V. A. (1983). Simulating multivariate
##' nonnormal distributions. *Psychometrika, 48*(3), 465--471.
##' \doi{10.1007/BF02293687}
##'
##' @examples
##'
##' set.seed(123)
##' mvrnonnorm(20, c(1, 2), matrix(c(10, 2, 2, 5), 2, 2),
##' skewness = c(5, 2), kurtosis = c(3, 3))
##' ## again, with variable names specified in mu
##' set.seed(123)
##' mvrnonnorm(20, c(a = 1, b = 2), matrix(c(10, 2, 2, 5), 2, 2),
##' skewness = c(5, 2), kurtosis = c(3, 3))
##'
##' @export
mvrnonnorm <- function(n, mu, Sigma, skewness = NULL,
kurtosis = NULL, empirical = FALSE) {
## number of variables
p <- length(mu)
if (!all(dim(Sigma) == c(p, p))) stop("incompatible arguments")
## save variable names, if they exist
varnames <- names(mu)
if (is.null(varnames)) varnames <- rownames(Sigma)
if (is.null(varnames)) varnames <- colnames(Sigma)
## check for NPD
eS <- eigen(Sigma, symmetric = TRUE)
ev <- eS$values
if (!all(ev >= -1e-06 * abs(ev[1L])))
stop("'Sigma' is not positive definite")
  ## simulate
  X <- NULL
if (is.null(skewness) && is.null(kurtosis)) {
X <- mnormt::rmnorm(n = n, mean = mu, varcov = Sigma)
} else {
if (is.null(skewness)) skewness <- rep(0, p)
if (is.null(kurtosis)) kurtosis <- rep(0, p)
Z <- ValeMaurelli1983copied(n = n, COR = cov2cor(Sigma),
skewness = skewness, kurtosis = kurtosis)
TMP <- scale(Z, center = FALSE, scale = 1/sqrt(diag(Sigma)))[ , , drop = FALSE]
X <- sweep(TMP, MARGIN = 2, STATS = mu, FUN = "+")
}
colnames(X) <- varnames
X
}
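## Quick check (hedged): for a large sample, the marginal skewness and
## excess kurtosis should approach the requested values:
# set.seed(1)
# X <- mvrnonnorm(100000, mu = c(0, 0), Sigma = diag(2),
#                 skewness = c(2, 0), kurtosis = c(7, 0))
# semTools::skew(X[, 1])     # estimate close to 2
# semTools::kurtosis(X[, 1]) # estimate close to 7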
## ----------------
## Hidden Functions
## ----------------
## Copied from lavaan package
##' @importFrom stats nlminb
ValeMaurelli1983copied <- function(n = 100L, COR, skewness, kurtosis,
debug = FALSE) {
fleishman1978_abcd <- function(skewness, kurtosis) {
system.function <- function(x, skewness, kurtosis) {
b.=x[1L]; c.=x[2L]; d.=x[3L]
eq1 <- b.^2 + 6*b.*d. + 2*c.^2 + 15*d.^2 - 1
eq2 <- 2*c.*(b.^2 + 24*b.*d. + 105*d.^2 + 2) - skewness
eq3 <- 24*(b.*d. + c.^2*(1 + b.^2 + 28*b.*d.) +
d.^2*(12 + 48*b.*d. + 141*c.^2 + 225*d.^2)) - kurtosis
eq <- c(eq1,eq2,eq3)
sum(eq^2) ## SS
}
out <- nlminb(start = c(1, 0, 0), objective = system.function,
scale = 10, control = list(trace = 0),
skewness = skewness, kurtosis = kurtosis)
if(out$convergence != 0) warning("no convergence")
b. <- out$par[1L]; c. <- out$par[2L]; d. <- out$par[3L]; a. <- -c.
c(a.,b.,c.,d.)
}
getICOV <- function(b1, c1, d1, b2, c2, d2, R) {
objectiveFunction <- function(x, b1, c1, d1, b2, c2, d2, R) {
rho=x[1L]
eq <- rho*(b1*b2 + 3*b1*d2 + 3*d1*b2 + 9*d1*d2) +
rho^2*(2*c1*c2) + rho^3*(6*d1*d2) - R
eq^2
}
#gradientFunction <- function(x, bcd1, bcd2, R) {
#
#}
out <- nlminb(start=R, objective=objectiveFunction,
scale=10, control=list(trace=0),
b1=b1, c1=c1, d1=d1, b2=b2, c2=c2, d2=d2, R=R)
if(out$convergence != 0) warning("no convergence")
rho <- out$par[1L]
rho
}
# number of variables
nvar <- ncol(COR)
# check skewness
if(is.null(skewness)) {
SK <- rep(0, nvar)
} else if(length(skewness) == nvar) {
SK <- skewness
} else if(length(skewness) == 1L) {
SK <- rep(skewness, nvar)
} else {
stop("skewness has wrong length")
}
if(is.null(kurtosis)) {
KU <- rep(0, nvar)
} else if(length(kurtosis) == nvar) {
KU <- kurtosis
} else if(length(kurtosis) == 1L) {
KU <- rep(kurtosis, nvar)
} else {
stop("kurtosis has wrong length")
}
# create Fleishman table
FTable <- matrix(0, nvar, 4L)
for(i in 1:nvar) {
FTable[i,] <- fleishman1978_abcd(skewness=SK[i], kurtosis=KU[i])
}
# compute intermediate correlations between all pairs
ICOR <- diag(nvar)
  if (nvar > 1L) for(j in 1:(nvar-1L)) { # guard: 1:0 would count down when nvar == 1
for(i in (j+1):nvar) {
if(COR[i,j] == 0) next
ICOR[i,j] <- ICOR[j,i] <-
getICOV(FTable[i,2], FTable[i,3], FTable[i,4],
FTable[j,2], FTable[j,3], FTable[j,4], R=COR[i,j])
}
}
if(debug) {
cat("\nOriginal correlations (for Vale-Maurelli):\n")
print(COR)
cat("\nIntermediate correlations (for Vale-Maurelli):\n")
print(ICOR)
cat("\nEigen values ICOR:\n")
print( eigen(ICOR)$values )
}
# generate Z
X <- Z <- mnormt::rmnorm(n = n, mean = rep(0,nvar), varcov = ICOR)
  # transform Z using Fleishman constants
for(i in 1:nvar) {
X[,i] <- FTable[i,1L] + FTable[i,2L]*Z[,i] + FTable[i,3L]*Z[,i]^2 +
FTable[i,4L]*Z[,i]^3
}
X
}
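## The fleishman1978_abcd() step above finds constants (a, b, c, d) such that
## X = a + b*Z + c*Z^2 + d*Z^3 has the target skewness and kurtosis when Z is
## standard normal. A hedged illustration of applying such a transform:
# z <- rnorm(1000)
# abcd <- c(a = 0, b = 1, c = 0, d = 0) # identity case: X remains normal
# x <- abcd["a"] + abcd["b"]*z + abcd["c"]*z^2 + abcd["d"]*z^3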
semTools/R/imposeStart.R
### Sunthud Pornprasertmanit
### Last updated: 2 April 2017
#' Specify starting values from a lavaan output
#'
#' This function will save the parameter estimates of a lavaan output and
#' impose those parameter estimates as starting values for another analysis
#' model. The free parameters with the same names or the same labels across two
#' models will be imposed the new starting values. This function may help to
#' increase the chance of convergence in a complex model (e.g.,
#' multitrait-multimethod model or complex longitudinal invariance model).
#'
#'
#' @param out The `lavaan` output whose parameter estimates will be used as
#' starting values for the analysis model
#' @param expr The original code that users use to run a lavaan model
#' @param silent Logical to print the parameter table with new starting values
#' @return A fitted lavaan model
#' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
#' @examples
#'
#' ## The following example shows that the longitudinal weak invariance model
#' ## using effect coding does not converge with three time points but does
#' ## converge with two time points. Thus, the parameter estimates from the
#' ## model with two time points are used as starting values for the model
#' ## with three time points, which then converges properly.
#'
#' weak2time <- '
#' # Loadings
#' f1t1 =~ LOAD1*y1t1 + LOAD2*y2t1 + LOAD3*y3t1
#' f1t2 =~ LOAD1*y1t2 + LOAD2*y2t2 + LOAD3*y3t2
#'
#' # Factor Variances
#' f1t1 ~~ f1t1
#' f1t2 ~~ f1t2
#'
#' # Factor Covariances
#' f1t1 ~~ f1t2
#'
#' # Error Variances
#' y1t1 ~~ y1t1
#' y2t1 ~~ y2t1
#' y3t1 ~~ y3t1
#' y1t2 ~~ y1t2
#' y2t2 ~~ y2t2
#' y3t2 ~~ y3t2
#'
#' # Error Covariances
#' y1t1 ~~ y1t2
#' y2t1 ~~ y2t2
#' y3t1 ~~ y3t2
#'
#' # Factor Means
#' f1t1 ~ NA*1
#' f1t2 ~ NA*1
#'
#' # Measurement Intercepts
#' y1t1 ~ INT1*1
#' y2t1 ~ INT2*1
#' y3t1 ~ INT3*1
#' y1t2 ~ INT4*1
#' y2t2 ~ INT5*1
#' y3t2 ~ INT6*1
#'
#' # Constraints for Effect-coding Identification
#' LOAD1 == 3 - LOAD2 - LOAD3
#' INT1 == 0 - INT2 - INT3
#' INT4 == 0 - INT5 - INT6
#' '
#' model2time <- lavaan(weak2time, data = exLong)
#'
#' weak3time <- '
#' # Loadings
#' f1t1 =~ LOAD1*y1t1 + LOAD2*y2t1 + LOAD3*y3t1
#' f1t2 =~ LOAD1*y1t2 + LOAD2*y2t2 + LOAD3*y3t2
#' f1t3 =~ LOAD1*y1t3 + LOAD2*y2t3 + LOAD3*y3t3
#'
#' # Factor Variances
#' f1t1 ~~ f1t1
#' f1t2 ~~ f1t2
#' f1t3 ~~ f1t3
#'
#' # Factor Covariances
#' f1t1 ~~ f1t2 + f1t3
#' f1t2 ~~ f1t3
#'
#' # Error Variances
#' y1t1 ~~ y1t1
#' y2t1 ~~ y2t1
#' y3t1 ~~ y3t1
#' y1t2 ~~ y1t2
#' y2t2 ~~ y2t2
#' y3t2 ~~ y3t2
#' y1t3 ~~ y1t3
#' y2t3 ~~ y2t3
#' y3t3 ~~ y3t3
#'
#' # Error Covariances
#' y1t1 ~~ y1t2
#' y2t1 ~~ y2t2
#' y3t1 ~~ y3t2
#' y1t1 ~~ y1t3
#' y2t1 ~~ y2t3
#' y3t1 ~~ y3t3
#' y1t2 ~~ y1t3
#' y2t2 ~~ y2t3
#' y3t2 ~~ y3t3
#'
#' # Factor Means
#' f1t1 ~ NA*1
#' f1t2 ~ NA*1
#' f1t3 ~ NA*1
#'
#' # Measurement Intercepts
#' y1t1 ~ INT1*1
#' y2t1 ~ INT2*1
#' y3t1 ~ INT3*1
#' y1t2 ~ INT4*1
#' y2t2 ~ INT5*1
#' y3t2 ~ INT6*1
#' y1t3 ~ INT7*1
#' y2t3 ~ INT8*1
#' y3t3 ~ INT9*1
#'
#' # Constraints for Effect-coding Identification
#' LOAD1 == 3 - LOAD2 - LOAD3
#' INT1 == 0 - INT2 - INT3
#' INT4 == 0 - INT5 - INT6
#' INT7 == 0 - INT8 - INT9
#' '
#' ### The following command does not provide a convergent result
#' # model3time <- lavaan(weak3time, data = exLong)
#'
#' ### Use starting values from the model with two time points
#' model3time <- imposeStart(model2time, lavaan(weak3time, data = exLong))
#' summary(model3time)
#'
#' @export
imposeStart <- function(out, expr, silent = TRUE) {
if(!is(out, "lavaan")) stop("The first argument of the function must be a lavaan output.")
template2 <- template <- substitute(expr)
template2$do.fit <- FALSE
model <- eval(expr = template2, enclos = parent.frame())
ptmodel <- parTable(model)
coefmodel <- lavaan::coef(model)
coefout <- lavaan::coef(out)
start <- coefout[match(names(coefmodel), names(coefout))]
ptmodel$start[ptmodel$free != 0] <- start[ptmodel$free[ptmodel$free != 0]]
ptmodel$est <- NULL
ptmodel$se <- NULL
if(!silent) {
cat("########## Model with imposed starting values #########\n")
print(ptmodel)
}
if("model" %in% names(template)) {
template$model <- ptmodel
} else {
template[[2]] <- ptmodel
}
eval(expr = template, enclos = parent.frame())
}
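## Quick check (hedged): the imposed starting values can be inspected in the
## parameter table of the returned fit, using `model3time` from the roxygen
## example above:
# pt <- lavaan::parTable(model3time)
# pt[pt$free > 0L, c("lhs", "op", "rhs", "start")]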
semTools/R/auxiliary.R
### Terrence D. Jorgensen
### Last updated: 12 February 2025
##' Implement Saturated Correlates with FIML
##'
##' Automatically add auxiliary variables to a lavaan model when using full
##' information maximum likelihood (FIML) to handle missing data
##'
##' These functions are wrappers around the corresponding lavaan functions.
##' You can use them the same way you use [lavaan::lavaan()], but you
##' *must* pass your full `data.frame` to the `data` argument.
##' Because the saturated-correlates approaches (Enders, 2008) treats exogenous
##' variables as random, `fixed.x` must be set to `FALSE`. Because FIML
##' requires continuous data (although nonnormality corrections can still be
##' requested), no variables in the model nor auxiliary variables specified in
##' `aux` can be declared as `ordered`.
##'
##' @aliases auxiliary lavaan.auxiliary cfa.auxiliary sem.auxiliary growth.auxiliary
##' @importFrom lavaan lavInspect parTable lavNames
##' @importFrom stats cov quantile
##'
##' @param model The analysis model can be specified with 1 of 2 objects:
##' \enumerate{
##' \item lavaan [lavaan::model.syntax()] specifying a hypothesized
##' model *without* mention of auxiliary variables in `aux`
##' \item a parameter table, as returned by [lavaan::parTable()],
##' specifying the target model *without* auxiliary variables.
##' This option requires these columns (and silently ignores all others):
##' `c("lhs","op","rhs","user","group","free","label","plabel","start")`
##' }
##' @param data `data.frame` that includes auxiliary variables as well as
##' any observed variables in the `model`
##' @param aux `character`. Names of auxiliary variables to add to `model`
##' @param fun `character`. Name of a specific lavaan function used to fit
##' `model` to `data` (i.e., `"lavaan"`, `"cfa"`,
##' `"sem"`, or `"growth"`). Only required for `auxiliary`.
##' @param ... Additional arguments to pass to `fun=`.
##' @param envir Passed to [do.call()].
##' @param return.syntax `logical` indicating whether to return a
##' `character` string of [lavaan::model.syntax()] that can be
##' added to a target `model=` that is also a `character` string.
##' This can be advantageous, for example, to add saturated correlates to
##' a \pkg{blavaan} model.
##'
##' @author
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @references
##' Enders, C. K. (2008). A note on the use of missing auxiliary variables in
##' full information maximum likelihood-based structural equation models.
##' *Structural Equation Modeling, 15*(3), 434--448.
##' \doi{10.1080/10705510802154307}
##'
##' @return a fitted [lavaan::lavaan-class] object. Additional
##' information is stored as a `list` in the `@@external` slot:
##' \itemize{
##' \item `baseline.model`. a fitted [lavaan::lavaan-class]
##' object. Results of fitting an appropriate independence model for
##' the calculation of incremental fit indices (e.g., CFI, TLI) in
##' which the auxiliary variables remain saturated, so only the target
##' variables are constrained to be orthogonal. See Examples for how
##' to send this baseline model to [lavaan::fitMeasures()].
##' \item `aux`. The character vector of auxiliary variable names.
##' \item `baseline.syntax`. A character vector generated within the
##' `auxiliary` function, specifying the `baseline.model`
##' syntax.
##' }
##'
##' @examples
##' dat1 <- lavaan::HolzingerSwineford1939
##' set.seed(12345)
##' dat1$z <- rnorm(nrow(dat1))
##' dat1$x5 <- ifelse(dat1$z < quantile(dat1$z, .3), NA, dat1$x5)
##' dat1$x9 <- ifelse(dat1$z > quantile(dat1$z, .8), NA, dat1$x9)
##'
##' targetModel <- "
##' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + x8 + x9
##' "
##'
##' ## works just like cfa(), but with an extra "aux" argument
##' fitaux1 <- cfa.auxiliary(targetModel, data = dat1, aux = "z",
##' missing = "fiml", estimator = "mlr")
##'
##' ## with multiple auxiliary variables and multiple groups
##' fitaux2 <- cfa.auxiliary(targetModel, data = dat1, aux = c("z","ageyr","grade"),
##' group = "school", group.equal = "loadings")
##'
##' ## calculate correct incremental fit indices (e.g., CFI, TLI)
##' fitMeasures(fitaux2, fit.measures = c("cfi","tli"))
##' ## NOTE: lavaan will use the internally stored baseline model, which
##' ## is the independence model plus saturated auxiliary parameters
##' lavInspect(fitaux2@external$baseline.model, "free")
##'
##'
##' @export
auxiliary <- function(model, data, aux, fun, ...,
envir = getNamespace("lavaan"), return.syntax = FALSE) {
lavArgs <- list(...)
lavArgs$data <- substitute(data)
lavArgs$fixed.x <- FALSE
if (fun %in% c("lavaan","cfa","sem","growth")) {
#TODO: allow fun = "efa"?
lavArgs$missing <- "fiml"
lavArgs$meanstructure <- TRUE
}
lavArgs$ordered <- NULL
if (missing(aux))
stop("Please provide a character vector with names of auxiliary variables")
if (missing(data))
stop("Please provide a data.frame that includes modeled and auxiliary variables")
if (fun %in% c("lavaan","cfa","sem","growth") && !all(sapply(data[aux], is.numeric)))
stop("missing = 'FIML' is unavailable for categorical data")
PTcols <- c("lhs","op","rhs","user","block","group","free","label","plabel","start")
#TODO: add any? if fun %in% c("blavaan","bcfa","bsem","bgrowth")
## check parameter table, or create one from syntax
if (is.list(model)) {
if (any(model$exo == 1))
stop("All exogenous variables (covariates) must be treated as endogenous",
" by the 'auxiliary' function. Please set 'fixed.x = FALSE'")
if (!is.null(lavArgs$group.equal))
warning("The 'group.equal' argument is ignored when 'model' is a parameter table.")
if (is.null(model$start)) {
startArgs <- lavArgs
startArgs$model <- model
startArgs$do.fit <- FALSE
model$start <- parTable(do.call(fun, startArgs, envir = envir))$start
}
missingCols <- setdiff(PTcols, names(model))
if (length(missingCols)) stop("If the 'model' argument is a parameter table",
" it must also include these columns: \n",
paste(missingCols, collapse = ", "))
PT <- as.data.frame(model, stringsAsFactors = FALSE)[PTcols]
} else if (is.character(model)) {
ptArgs <- lavArgs
ptArgs$model <- model
ptArgs$do.fit <- FALSE #FIXME: does this work with blavaan?
PT <- parTable(do.call(fun, ptArgs, envir = envir))[PTcols]
} else stop("The 'model' argument must be a character vector of",
" lavaan syntax or a parameter table")
## separately store rows with constraints or user-defined parameters
conRows <- PT$op %in% c("==","<",">",":=")
if (any(conRows)) {
CON <- PT[ conRows, ]
PT <- PT[ !conRows, ]
} else CON <- data.frame(NULL)
## variable names
varnames <- lavNames(PT, type = "ov")
if (length(intersect(varnames, aux))) stop('modeled variable declared as auxiliary')
## specify a saturated model for auxiliaries
covstruc <- outer(aux, aux, function(x, y) paste(x, "~~", y))
satMod <- c(covstruc[lower.tri(covstruc, diag = TRUE)], paste(aux, "~ 1"), # among auxiliaries
outer(aux, varnames, function(x, y) paste(x, "~~", y))) # between aux and targets
if (return.syntax) return(satMod)
satPT <- lavaan::lavaanify(satMod, ngroups = max(PT$group))[c("lhs","op","rhs",
"user","block","group")]
## after omitting duplicates, check number of added parameters, add columns
mergedPT <- lavaan::lav_partable_merge(PT, satPT, remove.duplicated = TRUE, warn = FALSE)
nAuxPar <- nrow(mergedPT) - nrow(PT)
newRows <- 1L:nAuxPar + nrow(PT)
#TODO: mergedPT$user[newRows] <- 2L (list as constraints to omit printing?) or new code (9L)?
mergedPT$free[newRows] <- 1L:nAuxPar + max(PT$free)
mergedPT$plabel[newRows] <- paste0(".p", 1L:nAuxPar + nrow(PT), ".")
## calculate sample moments as starting values (recycle over groups)
# if (is.null(lavArgs$group)) {
# auxCov <- cov(data[aux], use = "pairwise.complete.obs")
# auxM <- colMeans(data[aux], na.rm = TRUE)
# auxTarget <- cov(data[c(aux, varnames)],
# use = "pairwise.complete.obs")[aux, varnames]
# ## match order of parameters in syntax above
# mergedPT$start[newRows] <- c(auxCov[lower.tri(auxCov, diag = TRUE)],
# auxM, as.numeric(auxTarget))
# } else {
# auxCovs <- list()
# auxMs <- list()
# auxTargets <- list()
# startVals <- numeric(0)
# for (g in unique(data[ , lavArgs$group])) {
# auxCovs[[g]] <- cov(data[data[ , lavArgs$group] == g, aux],
# use = "pairwise.complete.obs")
# auxMs[[g]] <- colMeans(data[data[ , lavArgs$group] == g, aux], na.rm = TRUE)
# auxTargets[[g]] <- cov(data[data[ , lavArgs$group] == g, c(aux, varnames)],
# use = "pairwise.complete.obs")[aux, varnames]
# startVals <- c(startVals, auxCovs[[g]][lower.tri(auxCovs[[g]], diag = TRUE)],
# auxMs[[g]], as.numeric(auxTargets[[g]]))
# }
# ## match order of parameters in syntax above
# mergedPT$start[newRows] <- startVals
# }
lavArgs$model <- lavaan::lav_partable_complete(rbind(mergedPT, CON)) #FIXME for blavaan
result <- do.call(fun, lavArgs, envir = envir)
if (fun %in% c("lavaan","cfa","sem","growth")) {
## specify, fit, and attach an appropriate independence model
baseArgs <- list()
baseArgs$model <- lavaan::lav_partable_complete(satPT)
baseArgs$data <- data
baseArgs$group <- lavArgs$group
baseArgs$group.label <- lavArgs$group.label
baseArgs$missing <- "fiml"
baseArgs$cluster <- lavArgs$cluster
baseArgs$sample.cov.rescale <- lavArgs$sample.cov.rescale
baseArgs$estimator <- lavArgs$estimator
baseArgs$information <- lavArgs$information
baseArgs$se <- lavArgs$se
baseArgs$test <- lavArgs$test
baseArgs$bootstrap <- lavArgs$bootstrap
baseArgs$control <- lavArgs$control
baseArgs$optim.method <- lavArgs$optim.method
result@external$baseline.model <- do.call("lavaan", baseArgs, envir = envir)
result@external$aux <- aux
result@external$baseline.syntax <- satMod
} # else something similar for blavaan? Better to fit manually for blavFitIndices
result
}
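## Hedged sketch of the return.syntax= option documented above: obtain the
## saturated-correlates syntax as a character vector and append it to a
## character-string model (e.g., for blavaan). `targetModel` and `dat1`
## refer to the objects in the roxygen examples above:
# auxSyntax <- auxiliary(targetModel, data = dat1, aux = "z",
#                        fun = "sem", return.syntax = TRUE)
# fullModel <- c(targetModel, auxSyntax)
# ## fullModel could then be passed to, e.g., blavaan::bsem(..., fixed.x = FALSE)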
##' @rdname auxiliary
##' @aliases lavaan.auxiliary
##' @export
lavaan.auxiliary <- function(model, data, aux, ...,
envir = getNamespace("lavaan")) {
mc <- match.call(expand.dots = TRUE)
mc$fun <- "lavaan"
mc[[1L]] <- quote(semTools::auxiliary)
eval(mc, parent.frame())
}
##' @rdname auxiliary
##' @aliases cfa.auxiliary
##' @export
cfa.auxiliary <- function(model, data, aux, ...,
envir = getNamespace("lavaan")) {
mc <- match.call(expand.dots = TRUE)
mc$fun <- "cfa"
mc[[1L]] <- quote(semTools::auxiliary)
eval(mc, parent.frame())
}
##' @rdname auxiliary
##' @aliases sem.auxiliary
##' @export
sem.auxiliary <- function(model, data, aux, ...,
envir = getNamespace("lavaan")) {
mc <- match.call(expand.dots = TRUE)
mc$fun <- "sem"
mc[[1L]] <- quote(semTools::auxiliary)
eval(mc, parent.frame())
}
##' @rdname auxiliary
##' @aliases growth.auxiliary
##' @export
growth.auxiliary <- function(model, data, aux, ...,
envir = getNamespace("lavaan")) {
mc <- match.call(expand.dots = TRUE)
mc$fun <- "growth"
mc[[1L]] <- quote(semTools::auxiliary)
eval(mc, parent.frame())
}
semTools/R/kd.R
### Edgar Merkle
### Last updated: 10 January 2021
### Kaiser-Dickman (1962) algorithm for generating sample data
### based on the input covmat, which is a covariance matrix.
##' Generate data via the Kaiser-Dickman (1962) algorithm.
##'
##' Given a covariance matrix and sample size, generate raw data that correspond
##' to the covariance matrix. Data can be generated to match the covariance
##' matrix exactly, or to be a sample from the population covariance matrix.
##'
##' By default, R's `cov()` function divides by `n`-1. The data
##' generated by this algorithm result in a covariance matrix that matches
##' `covmat`, but you must divide by `n` instead of `n`-1.
##'
##'
##' @importFrom stats cov2cor rnorm
##'
##' @param covmat a symmetric, positive definite covariance matrix
##' @param n the sample size for the data that will be generated
##' @param type type of data generation. `exact` generates data that
##' exactly correspond to `covmat`. `sample` treats `covmat` as
##' a population covariance matrix, generating a sample of size `n`.
##'
##' @return `kd` returns a data matrix of dimension `n` by
##' `nrow(covmat)`.
##'
##' @author Ed Merkle (University of Missouri; \email{merklee@@missouri.edu})
##'
##' @references Kaiser, H. F. and Dickman, K. (1962). Sample and population
##' score matrices and sample correlation matrices from an arbitrary population
##' correlation matrix. *Psychometrika, 27*(2), 179--182.
##' \doi{10.1007/BF02289635}
##'
##' @examples
##'
##' #### First Example
##'
##' ## Get data
##' dat <- HolzingerSwineford1939[ , 7:15]
##' hs.n <- nrow(dat)
##'
##' ## Covariance matrix divided by n
##' hscov <- ((hs.n-1)/hs.n) * cov(dat)
##'
##' ## Generate new, raw data corresponding to hscov
##' newdat <- kd(hscov, hs.n)
##'
##' ## Difference between new covariance matrix and hscov is minimal
##' newcov <- (hs.n-1)/hs.n * cov(newdat)
##' summary(as.numeric(hscov - newcov))
##'
##' ## Generate sample data, treating hscov as population matrix
##' newdat2 <- kd(hscov, hs.n, type = "sample")
##'
##' #### Another example
##'
##' ## Define a covariance matrix
##' covmat <- matrix(0, 3, 3)
##' diag(covmat) <- 1.5
##' covmat[2:3,1] <- c(1.3, 1.7)
##' covmat[3,2] <- 2.1
##' covmat <- covmat + t(covmat)
##'
##' ## Generate data of size 300 that have this covariance matrix
##' rawdat <- kd(covmat, 300)
##'
##' ## Covariances are exact if we compute sample covariance matrix by
##' ## dividing by n (vs by n - 1)
##' summary(as.numeric((299/300)*cov(rawdat) - covmat))
##'
##' ## Generate data of size 300 where covmat is the population covariance matrix
##' rawdat2 <- kd(covmat, 300, type = "sample")
##'
##' @export
kd <- function(covmat, n, type=c("exact","sample")) {
type <- match.arg(type)
## Check to ensure that covmat is a valid covariance matrix.
if (nrow(covmat) != ncol(covmat)) stop("non-square matrix supplied")
symmetric <- isSymmetric.matrix(covmat)
if (!symmetric) stop("non-symmetric matrix supplied")
pd <- all(eigen(covmat, only.values = TRUE)$values > 0)
if (!pd) stop("covariance matrix is not positive definite")
p <- nrow(covmat)
## Algorithm works on a correlation matrix
mv.vars <- matrix(0, nrow(covmat), nrow(covmat))
diag(mv.vars) <- sqrt(diag(covmat))
cormat <- cov2cor(covmat)
## Generate standard normal data and mean center each variable
Xscore <- matrix(rnorm(p*n), p, n)
Xsub0 <- t(apply(Xscore, 1, scale, scale = FALSE))
## Correlation matrix factored via Cholesky decomposition:
Fcomp <- t(chol(cormat))
## Equation 2 from K&D:
Zhat <- Fcomp %*% Xscore
## Equation 3 from K&D:
Xsub0.prod <- Xsub0 %*% t(Xsub0)
## Get singular value decomp of Xsub0.prod
Xsub0.svd <- svd(Xsub0.prod)
M.sqrt <- matrix(0, p, p)
diag(M.sqrt) <- 1 / sqrt(Xsub0.svd$d)
## Equation 5 from K&D:
Z <- Fcomp %*% M.sqrt %*% t(Xsub0.svd$u) %*% Xsub0
Z <- Z * sqrt(n)
dat <- Z
if (type == "sample") { dat <- Zhat }
## Scale data to correspond to covmat
dat <- t(dat) %*% mv.vars
## convert to data.frame, use any existing names from covmat
dat <- data.frame(dat)
if(!is.null(colnames(covmat))) names(dat) <- colnames(covmat)
dat
}
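## Quick check (hedged): type = "exact" reproduces covmat to machine
## precision when the sample covariance divides by n, whereas type = "sample"
## only approximates it:
# set.seed(42)
# exact   <- kd(diag(2), n = 50)
# max(abs((49/50) * cov(exact) - diag(2)))   # effectively 0
# sampled <- kd(diag(2), n = 50, type = "sample")
# max(abs((49/50) * cov(sampled) - diag(2))) # nonzero sampling error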
semTools/R/measurementInvarianceCat.R
### Sunthud Pornprasertmanit, Yves Rosseel, & Terrence D. Jorgensen
### Last updated: 12 March 2025
### automate measurement invariance tests for categorical indicators
##' Measurement Invariance Tests for Categorical Items
##'
##' Testing measurement invariance across groups using a typical sequence of
##' model comparison tests.
##'
##' Theta parameterization is used to represent SEM for categorical items. That
##' is, residual variances are modeled instead of the total variance of
##' underlying normal variate for each item. Five models can be tested based on
##' different constraints across groups.
##' \enumerate{
##' \item Model 1: configural invariance. The same factor structure is imposed
##' on all groups.
##' \item Model 2: weak invariance. The factor loadings are constrained to be
##' equal across groups.
##' \item Model 3: strong invariance. The factor loadings and thresholds are
##' constrained to be equal across groups.
##' \item Model 4: strict invariance. The factor loadings, thresholds and
##' residual variances are constrained to be equal across groups.
##' For categorical variables, all residual variances are fixed as 1.
##' \item Model 5: The factor loadings, thresholds, residual variances and
##' means are constrained to be equal across groups.
##' }
##'
##' However, if all items have only two categories (dichotomous), scalar
##' invariance and weak invariance cannot be separated because thresholds need
##' to be equal across groups for scale identification. Users can specify the
##' `strict` option to include the strict invariance model in the sequence.
##' See the further details of scale identification and different
##' parameterization in Millsap and Yun-Tein (2004).
##'
##' @importFrom lavaan lavInspect parTable
##'
##' @param ... The same arguments as for any lavaan model. See
##' [lavaan::cfa()] for more information.
##' @param std.lv If `TRUE`, the fixed-factor method of scale
##' identification is used. If `FALSE`, the first variable for each
##' factor is used as marker variable.
##' @param strict If `TRUE`, the sequence includes the strict invariance model.
##' See details for more information.
##' @param quiet If `FALSE` (default), a summary is printed out containing
##' an overview of the different models that are fitted, together with some
##' model comparison tests. If `TRUE`, no summary is printed.
##' @param fit.measures Fit measures used to calculate the differences between
##' nested models.
##' @param baseline.model custom baseline model passed to
##' [lavaan::fitMeasures()]
##' @param method The method used to calculate likelihood ratio test. See
##' [lavaan::lavTestLRT()] for available options
##'
##' @return Invisibly, all model fits in the sequence are returned as a list.
##'
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' Yves Rosseel (Ghent University; \email{Yves.Rosseel@@UGent.be})
##'
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @references Millsap, R. E., & Yun-Tein, J. (2004). Assessing factorial
##' invariance in ordered-categorical measures. *Multivariate Behavioral
##' Research, 39*(3), 479--515. \doi{10.1207/S15327906MBR3903_4}
##'
##' @examples
##'
##' \donttest{
##' syntax <- ' f1 =~ u1 + u2 + u3 + u4'
##'
##' measurementInvarianceCat(model = syntax, data = datCat, group = "g",
##' parameterization = "theta", estimator = "wlsmv",
##' ordered = c("u1", "u2", "u3", "u4"))
##' }
##'
##' @name measurementInvarianceCat-deprecated
##' @usage
##' measurementInvarianceCat(..., std.lv = FALSE, strict = FALSE,
##' quiet = FALSE, fit.measures = "default",
##' baseline.model = NULL, method = "default")
##' @seealso [semTools-deprecated()]
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##'
##' @export
measurementInvarianceCat <- function(..., std.lv = FALSE, strict = FALSE,
quiet = FALSE, fit.measures = "default",
baseline.model = NULL,
method = "default") {
.Deprecated(msg = c("The measurementInvarianceCat function is deprecated, and",
" it will cease to be included in future versions of ",
"semTools. See help('semTools-deprecated) for details."))
List <- list(...)
  if (is.null(List$model)) stop('all lavaan() and lavOptions() arguments must ',
                                'be named, including the "model=" argument.')
lavaancfa <- function(...) { lavaan::cfa(...) }
lavaanlavaan <- function(...) { lavaan::lavaan(...) }
if (!is.null(List$parameterization) && tolower(List$parameterization) != "theta")
warning("The parameterization is set to 'theta' by default.")
List$parameterization <- "theta"
# Find the number of groups
if (is.null(List$group)) stop("Please specify the group variable")
# Get the lavaan parameter table
template <- do.call(lavaancfa, c(List, do.fit = FALSE))
lavaanParTable <- parTable(template)
# Find the number of groups
ngroups <- max(lavaanParTable$group)
# Check whether all variables are categorical
sampstat <- lavInspect(template, "samp")[[1]]
meanname <- names(sampstat$mean)
thname <- names(sampstat$th)
if (any(is.na(charmatch(meanname, thname))))
stop("Some variables in your model are not identified as categorical.")
varList <- lavaanParTable$rhs[lavaanParTable$op == "=~"]
facName <- lavaanParTable$lhs[(lavaanParTable$op == "=~") & (lavaanParTable$rhs %in% varList)]
if (length(unique(sapply(facName, function(x) length(x)))) > 1)
stop("The numbers of variables in each element are not equal.")
varList <- unique(varList)
facName <- unique(facName)
  # Check whether the factor configuration is the same across groups
groupParTable <- split(lavaanParTable, lavaanParTable$group)
group1pt <- groupParTable[[1]]
groupParTable <- lapply(groupParTable, "[", c("lhs", "op", "rhs"))
if (!multipleAllEqualList(lapply(groupParTable, function(x) sapply(x, "[", x$op == "=~"))))
stop("Factor configuration is not the same across groups")
# Extract the number of thresholds
numThreshold <- table(sapply(group1pt, "[", group1pt$op == "|")[,"lhs"])
# Find the indicators of each factor
group1facload <- sapply(group1pt, "[", group1pt$op == "=~")
factorRep <- split(group1facload[,"rhs"], group1facload[,"lhs"])
# Find marker variables
marker <- rep(NA, length(factorRep))
numThresholdMarker <- rep(NA, length(factorRep))
for (i in seq_along(factorRep)) {
temp <- sapply(group1pt, "[", group1pt$rhs %in% factorRep[[i]] & group1pt$op == "=~" & group1pt$lhs == names(factorRep)[i])
marker[i] <- temp[!is.na(temp[,"ustart"]), "rhs"]
numThresholdMarker[i] <- numThreshold[marker[i]]
}
numThresholdFactorRep <- lapply(factorRep, function(x) numThreshold[x])
constraintSecondThreshold <- unlist(lapply(numThresholdFactorRep, function(x) names(which(x > 1)[1])))
constraintSecondThreshold <- constraintSecondThreshold[!is.na(constraintSecondThreshold)]
  # Constrain the first threshold of each indicator to equality across groups
for (i in names(numThreshold)) {
lavaanParTable <- constrainParTable(lavaanParTable, i, "|", "t1", 1:ngroups)
}
if (length(constraintSecondThreshold) > 0) {
for (i in constraintSecondThreshold) {
lavaanParTable <- constrainParTable(lavaanParTable, i, "|", "t2", 1:ngroups)
}
}
# Group 1
for (i in facName) {
lavaanParTable <- fixParTable(lavaanParTable, i, "~1", "", 1, 0) # Fix factor means as 0
if (std.lv) {
lavaanParTable <- fixParTable(lavaanParTable, i, "~~", i, 1, 1)
} else {
lavaanParTable <- freeParTable(lavaanParTable, i, "~~", i, 1, NA) # Free factor variances
}
    # Assuming that all factor covariances are free
}
for (i in varList) {
lavaanParTable <- fixParTable(lavaanParTable, i, "~~", i, 1, 1)
}
# Other groups
for (k in 2:ngroups) {
for (i in facName) {
lavaanParTable <- freeParTable(lavaanParTable, i, "~1", "", k, NA)
if (std.lv) {
lavaanParTable <- fixParTable(lavaanParTable, i, "~~", i, k, 1)
} else {
lavaanParTable <- freeParTable(lavaanParTable, i, "~~", i, k, NA)
}
}
for (i in varList) {
lavaanParTable <- freeParTable(lavaanParTable, i, "~~", i, k, NA)
}
# Fix the indicator variances of marker variables with two categories as 1
for (i in seq_along(marker)) {
if (numThresholdMarker[i] == 1) lavaanParTable <- fixParTable(lavaanParTable, marker[i], "~~", marker[i], k, 1)
}
}
if (std.lv) {
for (i in seq_along(factorRep)) {
lavaanParTable <- freeParTable(lavaanParTable, names(factorRep)[i], "=~", marker[i], 1:ngroups, NA)
}
}
# Fit configural invariance
ListConfigural <- List
ListConfigural$model <- lavaanParTable
fitConfigural <- try(do.call(lavaanlavaan, ListConfigural), silent = TRUE)
# Create the parameter table for metric invariance
ptMetric <- lavaanParTable
for (i in seq_along(factorRep)) {
varwithin <- factorRep[[i]]
if (!std.lv) {
varwithin <- setdiff(varwithin, marker[i])
}
for (j in seq_along(varwithin)) {
ptMetric <- constrainParTable(ptMetric, names(factorRep)[i], "=~", varwithin[j], 1:ngroups)
}
}
if (std.lv) {
for (k in 2:ngroups) {
for (i in facName) {
ptMetric <- freeParTable(ptMetric, i, "~~", i, k, NA)
}
}
}
ListMetric <- List
ListMetric$model <- ptMetric
fitMetric <- try(do.call(lavaanlavaan, ListMetric), silent = TRUE)
ptMeans <- ptStrict <- ptMetric
nonMarker <- setdiff(names(numThreshold), marker)
  nonDichoMarker <- numThreshold[nonMarker][ numThreshold[nonMarker] > 1 ]
scalar <- length(nonDichoMarker) > 0
if (scalar) {
ptScalar <- ptMetric
for (i in seq_along(numThreshold)) {
thresholdName <- paste0("t", 1:numThreshold[i])
for(j in seq_along(thresholdName)) {
ptScalar <- constrainParTable(ptScalar, names(numThreshold)[i], "|", thresholdName[j], 1:ngroups)
}
}
ListScalar <- List
ListScalar$model <- ptScalar
fitScalar <- try(do.call(lavaanlavaan, ListScalar), silent = TRUE)
ptMeans <- ptStrict <- ptScalar
} else fitScalar <- NULL
fitStrict <- NULL
# Create the parameter table for strict invariance if specified
if (strict) {
if (scalar) ptStrict <- ptScalar
for (k in 2:ngroups) {
# Constrain measurement error variances
for (i in varList) {
ptStrict <- fixParTable(ptStrict, i, "~~", i, k, 1)
}
}
ListStrict <- List
ListStrict$model <- ptStrict
fitStrict <- try(do.call(lavaanlavaan, ListStrict), silent = TRUE)
ptMeans <- ptStrict
}
# Create the parameter table for mean equality
# Constrain factor means to be equal
for (k in 2:ngroups) {
ptMeans <- fixParTable(ptMeans, facName, "~1", "", k, ustart = 0)
}
ListMeans <- List
ListMeans$model <- ptMeans
fitMeans <- try(do.call(lavaanlavaan, ListMeans), silent = TRUE)
FIT <- list(fit.configural = fitConfigural, fit.loadings = fitMetric,
            fit.thresholds = fitScalar, fit.residuals = fitStrict,
            fit.means = fitMeans)
FIT <- FIT[!sapply(FIT, is.null)]
if (!quiet) {
printInvarianceResult(FIT, fit.measures, baseline.model, method)
}
invisible(FIT)
}
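## ------------------------------------------------------------------
## The function above automates, for ordered indicators, a nested
## sequence of invariance models that can also be fit manually with
## lavaan's `group.equal` argument. A minimal sketch (hypothetical
## data.frame `dat` with ordered indicators y1-y3 and grouping
## variable `g`); not run at package load:
if (FALSE) {
  modCat <- ' f1 =~ y1 + y2 + y3 '
  ords <- c("y1", "y2", "y3")
  fit.config <- lavaan::cfa(modCat, data = dat, group = "g",
                            ordered = ords, parameterization = "theta")
  fit.metric <- lavaan::cfa(modCat, data = dat, group = "g",
                            ordered = ords, parameterization = "theta",
                            group.equal = "loadings")
  fit.scalar <- lavaan::cfa(modCat, data = dat, group = "g",
                            ordered = ords, parameterization = "theta",
                            group.equal = c("loadings", "thresholds"))
  ## likelihood-ratio-style comparison of the nested models
  lavaan::lavTestLRT(fit.config, fit.metric, fit.scalar)
}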
## ----------------
## Hidden Functions
## ----------------
multipleAllEqual <- function(...) {
obj <- list(...)
multipleAllEqualList(obj)
}
multipleAllEqualList <- function(obj) {
for (i in 2:length(obj)) {
for (j in 1:(i - 1)) {
temp <- isTRUE(all.equal(obj[[i]], obj[[j]]))
if (!temp)
return(FALSE)
}
}
return(TRUE)
}
multipleAnyEqual <- function(...) {
obj <- list(...)
multipleAnyEqualList(obj)
}
multipleAnyEqualList <- function(obj) {
for (i in 2:length(obj)) {
for (j in 1:(i - 1)) {
temp <- isTRUE(all.equal(obj[[i]], obj[[j]]))
if (temp)
return(TRUE)
}
}
return(FALSE)
}
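## Minimal usage sketch for the equality helpers above (both assume at
## least two objects are supplied); not run at package load:
if (FALSE) {
  multipleAllEqual(1:3, 1:3, 1:3)          # TRUE
  multipleAllEqual(1:3, 1:3, c(1, 2, 4))   # FALSE: third object differs
  multipleAnyEqual(1:3, 4:6, 1:3)          # TRUE: first and third match
}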
semTools/R/probeInteraction.R
### Sunthud Pornprasertmanit & Terrence D. Jorgensen
### Last updated: 12 February 2025
## --------
## 2-way MC
## --------
##' Probing two-way interaction on the no-centered or mean-centered latent
##' interaction
##'
##' Probing interaction for simple intercept and simple slope for the
##' no-centered or mean-centered latent two-way interaction
##'
##' Before using this function, researchers need to make the products of the
##' indicators between the first-order factors using mean centering (Marsh, Wen,
##' & Hau, 2004). Note that the double-mean centering may not be appropriate for
##' probing interaction if researchers are interested in simple intercepts. The
##' mean or double-mean centering can be done by the [indProd()]
##' function. The indicator products can be made for all possible combinations or by the
##' matched-pair approach (Marsh et al., 2004). Next, the hypothesized model
##' with the regression with latent interaction will be used to fit all original
##' indicators and the product terms. See the example for how to fit the product
##' term below. Once the lavaan result is obtained, this function will be used
##' to probe the interaction.
##'
##' Let the latent interaction model regressing the dependent variable
##' (\eqn{Y}) on the independent variable (\eqn{X}) and the moderator (\eqn{Z})
##' be \deqn{ Y = b_0 + b_1X + b_2Z + b_3XZ + r, } where \eqn{b_0} is the
##' estimated intercept or the expected value of \eqn{Y} when both \eqn{X} and
##' \eqn{Z} are 0, \eqn{b_1} is the effect of \eqn{X} when \eqn{Z} is 0,
##' \eqn{b_2} is the effect of \eqn{Z} when \eqn{X} is 0, \eqn{b_3} is the
##' interaction effect between \eqn{X} and \eqn{Z}, and \eqn{r} is the residual
##' term.
##'
##' To probe a two-way interaction, the simple intercept of the independent
##' variable at each value of the moderator (Aiken & West, 1991; Cohen, Cohen,
##' West, & Aiken, 2003; Preacher, Curran, & Bauer, 2006) can be obtained by
##' \deqn{ b_{0|X = 0, Z} = b_0 + b_2 Z. }
##'
##' The simple slope of the independent variable at each value of the moderator
##' can be obtained by
##' \deqn{ b_{X|Z} = b_1 + b_3 Z. }
##'
##' The variance of the simple intercept formula is
##' \deqn{ Var\left(b_{0|X = 0, Z}\right) =
##' Var\left(b_0\right) + 2Z \times Cov\left(b_0, b_2\right) +
##' Z^2 \times Var\left(b_2\right) },
##' where \eqn{Var} denotes the variance of a parameter
##' estimate and \eqn{Cov} denotes the covariance of two parameter estimates.
##'
##' The variance of the simple slope formula is
##' \deqn{ Var\left(b_{X|Z}\right) = Var\left(b_1\right) + 2Z \times
##' Cov\left(b_1, b_3\right) + Z^2 \times Var\left(b_3\right) }
##'
##' The Wald *z* statistic is used as the test statistic (even for objects of
##' class [lavaan.mi::lavaan.mi-class]).
##'
##'
##' @importFrom lavaan lavInspect parTable
##' @importFrom stats pnorm
##' @importFrom methods getMethod
##'
##' @param fit A fitted [lavaan::lavaan-class] or
##' [lavaan.mi::lavaan.mi-class] object with a latent 2-way interaction.
##' @param nameX `character` vector of all 3 factor names used as the
##' predictors. The lower-order factors must be listed first, and the final
##' name must be the latent interaction factor.
##' @param nameY The name of the factor used as the dependent variable.
##' @param modVar The name of the factor used as a moderator. The effect of
##' the other independent factor will be probed at each value of the
##' moderator variable listed in `valProbe`.
##' @param valProbe The values of the moderator that will be used to probe the
##' effect of the focal predictor.
##' @param group In multigroup models, the label of the group for which the
##' results will be returned. Must correspond to one of
##' `lavInspect(fit, "group.label")`, or an integer
##' corresponding to one of those group labels.
##' @param omit.imps `character` vector specifying criteria for omitting
##' imputations from pooled results. Ignored unless `fit` is of
##' class [lavaan.mi::lavaan.mi-class]. Can include any of
##' `c("no.conv", "no.se", "no.npd")`, the first 2 of which are the
##' default setting, which excludes any imputations that did not
##' converge or for which standard errors could not be computed. The
##' last option (`"no.npd"`) would exclude any imputations which
##' yielded a nonpositive definite covariance matrix for observed or
##' latent variables, which would include any "improper solutions" such
##' as Heywood cases. NPD solutions are not excluded by default because
##' they are likely to occur due to sampling error, especially in small
##' samples. However, gross model misspecification could also cause
##' NPD solutions, so users can compare pooled results with and without
##' this setting as a sensitivity analysis to see whether some
##' imputations warrant further investigation.
##'
##' @return A list with two elements:
##' \enumerate{
##' \item `SimpleIntercept`: The simple intercepts given each value of the
##' moderator.
##' \item `SimpleSlope`: The simple slopes given each value of the moderator.
##' }
##' In each element, the first column represents the values of the moderator
##' specified in the `valProbe` argument. The second column is the simple
##' intercept or simple slope. The third column is the *SE* of the simple
##' intercept or simple slope. The fourth column is the Wald (*z*)
##' statistic, and the fifth column is the associated *p* value testing
##' the null hypothesis that each simple intercept or slope is 0.
##'
##' @author
##' Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @seealso \itemize{
##' \item [indProd()] For creating the indicator products with no
##' centering, mean centering, double-mean centering, or residual centering.
##' \item [probe3WayMC()] For probing the three-way latent interaction
##' when the results are obtained from mean-centering, or double-mean centering
##' \item [probe2WayRC()] For probing the two-way latent interaction
##' when the results are obtained from the residual-centering approach.
##' \item [probe3WayRC()] For probing the three-way latent interaction
##' when the results are obtained from the residual-centering approach.
##' \item [plotProbe()] Plot the simple intercepts and slopes of the
##' latent interaction.
##' }
##'
##' @references
##'
##' Tutorial:
##'
##' Schoemann, A. M., & Jorgensen, T. D. (2021). Testing and interpreting
##' latent variable interactions using the `semTools` package.
##' *Psych, 3*(3), 322--335. \doi{10.3390/psych3030024}
##'
##' Background literature:
##'
##' Aiken, L. S., & West, S. G. (1991). *Multiple regression: Testing
##' and interpreting interactions*. Newbury Park, CA: Sage.
##'
##' Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). *Applied
##' multiple regression/correlation analysis for the behavioral sciences*
##' (3rd ed.). New York, NY: Routledge.
##'
##' Marsh, H. W., Wen, Z., & Hau, K. T. (2004). Structural equation models of
##' latent interactions: Evaluation of alternative estimation strategies and
##' indicator construction. *Psychological Methods, 9*(3), 275--300.
##' \doi{10.1037/1082-989X.9.3.275}
##'
##' Preacher, K. J., Curran, P. J., & Bauer, D. J. (2006). Computational tools
##' for probing interactions in multiple linear regression, multilevel modeling,
##' and latent curve analysis. *Journal of Educational and Behavioral
##' Statistics, 31*(4), 437--448. \doi{10.3102/10769986031004437}
##'
##' @examples
##'
##' dat2wayMC <- indProd(dat2way, 1:3, 4:6) # double mean centered by default
##'
##' model1 <- "
##' f1 =~ x1 + x2 + x3
##' f2 =~ x4 + x5 + x6
##' f12 =~ x1.x4 + x2.x5 + x3.x6
##' f3 =~ x7 + x8 + x9
##' f3 ~ f1 + f2 + f12
##' f12 ~~ 0*f1 + 0*f2 # not necessary, but implied by double mean centering
##' "
##'
##' fitMC2way <- sem(model1, data = dat2wayMC, meanstructure = TRUE)
##' summary(fitMC2way)
##'
##' probe2WayMC(fitMC2way, nameX = c("f1", "f2", "f12"), nameY = "f3",
##' modVar = "f2", valProbe = c(-1, 0, 1))
##'
##'
##' ## can probe multigroup models, one group at a time
##' dat2wayMC$g <- 1:2
##'
##' model2 <- "
##' f1 =~ x1 + x2 + x3
##' f2 =~ x4 + x5 + x6
##' f12 =~ x1.x4 + x2.x5 + x3.x6
##' f3 =~ x7 + x8 + x9
##' f3 ~ c(b1.g1, b1.g2)*f1 + c(b2.g1, b2.g2)*f2 + c(b12.g1, b12.g2)*f12
##' f12 ~~ 0*f1 + 0*f2
##' "
##' fit2 <- sem(model2, data = dat2wayMC, group = "g")
##' probe2WayMC(fit2, nameX = c("f1", "f2", "f12"), nameY = "f3",
##' modVar = "f2", valProbe = c(-1, 0, 1)) # group = 1 by default
##' probe2WayMC(fit2, nameX = c("f1", "f2", "f12"), nameY = "f3",
##' modVar = "f2", valProbe = c(-1, 0, 1), group = 2)
##'
##' @export
probe2WayMC <- function(fit, nameX, nameY, modVar, valProbe, group = 1L,
omit.imps = c("no.conv","no.se")) {
## TDJ: verify class
if (inherits(fit, "lavaan.mi")) {
if (!"package:lavaan.mi" %in% search()) attachNamespace("lavaan.mi")
useImps <- rep(TRUE, length(fit@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(fit@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(fit@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(fit@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(fit@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
## custom removal by imputation number
rm.imps <- omit.imps[ which(omit.imps %in% 1:length(useImps)) ]
if (length(rm.imps)) useImps[as.numeric(rm.imps)] <- FALSE
## whatever is left
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
} else if (!inherits(fit, "lavaan")) {
stop('"fit" must inherit from lavaan or lavaan.mi class', call. = FALSE)
}
meanstruc <- lavInspect(fit, "options")$meanstructure
# Check whether modVar is correct
if (is.character(modVar)) modVar <- match(modVar, nameX)
if (is.na(modVar) || !(modVar %in% 1:2))
stop("The moderator name is not in the name of independent factors or not 1 or 2.")
## TDJ: If multigroup, check group %in% group.label
nG <- lavInspect(fit, "ngroups")
if (nG > 1L) {
group.label <- lavInspect(fit, "group.label")
## assign numeric to character
if (is.numeric(group)) {
if (group %in% 1:nG) {
group <- group.label[group]
} else group <- as.character(group)
} else group <- as.character(group)
## check that character is a group
if (!as.character(group) %in% group.label)
stop('"group" must be a character string naming a group of interest, or ',
'an integer corresponding to a group in lavInspect(fit, "group.label")')
group.number <- which(group.label == group)
} else group.number <- 1L
## Get the parameter estimates for that group
if (nG > 1L) {
if (inherits(fit, "lavaan")) {
est <- lavInspect(fit, "est")[[group]]
if (!meanstruc) {
est$alpha <- matrix(0, 1, 1, dimnames = list(nameY, "intercept"))
}
} else if (inherits(fit, "lavaan.mi")) {
est <- list()
GLIST <- fit@coefList[useImps]
est$beta <- Reduce("+", lapply(GLIST, function(i) i[[group]]$beta)) / m
if (meanstruc) {
est$alpha <- Reduce("+", lapply(GLIST, function(i) i[[group]]$alpha)) / m
} else {
est$alpha <- matrix(0, 1, 1, dimnames = list(nameY, "intercept"))
}
}
} else {
## single-group model
if (inherits(fit, "lavaan")) {
est <- lavInspect(fit, "est")
if (!meanstruc) {
est$alpha <- matrix(0, 1, 1, dimnames = list(nameY, "intercept"))
}
} else if (inherits(fit, "lavaan.mi")) {
est <- list()
est$beta <- Reduce("+", lapply(fit@coefList[useImps], "[[", i = "beta")) / m
if (meanstruc) {
est$alpha <- Reduce("+", lapply(fit@coefList[useImps], "[[", i = "alpha")) / m
} else {
est$alpha <- matrix(0, 1, 1, dimnames = list(nameY, "intercept"))
}
}
}
## extract relevant slopes
betaNC <- matrix(est$beta[nameY, nameX], ncol = 1,
dimnames = list(nameX, nameY))
if (modVar == 1) betaNC <- betaNC[c(2, 1, 3)]
## Extract all sampling (co)variances
if (inherits(fit, "lavaan")) {
varEst <- lavaan::vcov(fit)
} else if (inherits(fit, "lavaan.mi")) {
varEst <- getMethod(f = "vcov", signature = "lavaan.mi",
where = getNamespace("lavaan.mi"))(fit, omit.imps = omit.imps)
}
## for indexing the sampling (co)variances
PT <- parTable(fit)
## index slopes in varEst
slope.idx <- which(PT$lhs == nameY & PT$op == "~" & PT$rhs %in% nameX & PT$group == group.number)
## check for user-defined labels; otherwise, set defaults
slope.label <- PT$label[slope.idx]
if (any(slope.label == "")) for (i in which(slope.label == "")) {
slope.label[i] <- paste0(nameY, "~", nameX[i],
ifelse(nG > 1L && group.number > 1L, no = "",
yes = paste0(".g", group.number)))
}
if (modVar == 1) slope.label <- slope.label[c(2, 1, 3)]
## index intercept in varEst
if (meanstruc) {
int.idx <- PT$label[PT$lhs == nameY & PT$op == "~1" & PT$group == group.number]
if (int.idx == "") {
## no custom label, use default
int.idx <- paste0(nameY, "~1")
if (nG > 1L && group.number > 1L) {
int.idx <- paste0(int.idx, ".g", group.number)
}
}
## check whether it is actually estimated (appears in $alpha and varEst)
estimateIntcept <- int.idx %in% rownames(varEst)
} else {
estimateIntcept <- FALSE
## use default label
int.idx <- paste0(nameY, "~1")
if (nG > 1L && group.number > 1L) {
int.idx <- paste0(int.idx, ".g", group.number)
}
}
## if intercept is not estimated, add a row/column with zero sampling variance
if (!estimateIntcept) {
dn <- c(rownames(varEst), int.idx)
varEst <- rbind(cbind(varEst, 0), 0)
dimnames(varEst) <- list(dn, dn)
}
## save all indices in a vector
targetcol <- c(int.idx, slope.label)
## collect relevant estimates and variances to probe the interaction
usedBeta <- rbind(est$alpha[nameY,], betaNC) # point estimates
usedVar <- varEst[targetcol, targetcol] # sampling (co)variances
## function to calculate p values for Wald z tests
pvalue <- function(x) (1 - pnorm(abs(x))) * 2
## Calculate simple intercepts
simpleIntcept <- usedBeta[1] + usedBeta[3] * valProbe
varIntcept <- usedVar[1, 1] + 2 * valProbe * usedVar[1, 3] + (valProbe^2) * usedVar[3, 3]
zIntcept <- simpleIntcept / sqrt(varIntcept)
pIntcept <- pvalue(zIntcept)
resultIntcept <- data.frame(valProbe, simpleIntcept, sqrt(varIntcept), zIntcept, pIntcept)
colnames(resultIntcept) <- c(nameX[modVar], "est", "se", "z", "pvalue")
class(resultIntcept) <- c("lavaan.data.frame","data.frame")
## Calculate simple slopes
simpleSlope <- usedBeta[2] + usedBeta[4] * valProbe
varSlope <- usedVar[2, 2] + 2 * valProbe * usedVar[2, 4] + (valProbe^2) * usedVar[4, 4]
zSlope <- simpleSlope / sqrt(varSlope)
pSlope <- pvalue(zSlope)
resultSlope <- data.frame(valProbe, simpleSlope, sqrt(varSlope), zSlope, pSlope)
colnames(resultSlope) <- c(nameX[modVar], "est", "se", "z", "pvalue")
class(resultSlope) <- c("lavaan.data.frame","data.frame")
list(SimpleIntcept = resultIntcept, SimpleSlope = resultSlope)
}
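## ------------------------------------------------------------------
## Hand-computed check of the simple-slope formula documented above:
## Var(b_{X|Z}) = Var(b1) + 2*Z*Cov(b1, b3) + Z^2 * Var(b3).
## A sketch assuming `fitMC2way` from the @examples (the slopes carry
## lavaan's default names because they are unlabeled); not run:
if (FALSE) {
  idx <- c("f3~f1", "f3~f12")     # b1 (focal slope) and b3 (interaction)
  b <- lavaan::coef(fitMC2way)[idx]
  V <- lavaan::vcov(fitMC2way)[idx, idx]
  z <- 1                          # probe at moderator value Z = 1
  slope <- unname(b[1] + b[2] * z)
  seSlope <- sqrt(V[1, 1] + 2 * z * V[1, 2] + z^2 * V[2, 2])
  c(est = slope, se = seSlope, wald = slope / seSlope)
  ## should match the Z = 1 row of probe2WayMC()'s $SimpleSlope output
}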
## --------
## 2-way RC
## --------
##' Probing two-way interaction on the residual-centered latent interaction
##'
##' Probing interaction for simple intercept and simple slope for the
##' residual-centered latent two-way interaction (Geldhof et al., 2013)
##'
##' Before using this function, researchers need to make the products of the
##' indicators between the first-order factors and residualize the products by
##' the original indicators (Lance, 1988; Little, Bovaird, & Widaman, 2006). The
##' process can be automated by the [indProd()] function. Note that
##' the indicator products can be made for all possible combinations or by the
##' matched-pair approach (Marsh et al., 2004). Next, the hypothesized model
##' with the regression with latent interaction will be used to fit all original
##' indicators and the product terms. To use this function the model must be fit
##' with a mean structure. See the example for how to fit the product term
##' below. Once the lavaan result is obtained, this function will be used to
##' probe the interaction.
##'
##' The probing process on residual-centered latent interaction is based on
##' transforming the residual-centered result into the no-centered result. See
##' Geldhof et al. (2013) for further details. Note that this approach is based
##' on a strong assumption that the first-order latent variables are normally
##' distributed. The probing process is applied after the no-centered result
##' (parameter estimates and the covariance matrix among them)
##' has been computed. See the [probe2WayMC()] for further details.
##'
##'
##' @importFrom lavaan lavInspect parTable
##' @importFrom stats pnorm
##' @importFrom methods getMethod
##'
##' @param fit A fitted [lavaan::lavaan-class] or
##' [lavaan.mi::lavaan.mi-class] object with a latent 2-way interaction.
##' @param nameX `character` vector of all 3 factor names used as the
##' predictors. The lower-order factors must be listed first, and the final
##' name must be the latent interaction factor.
##' @param nameY The name of the factor used as the dependent variable.
##' @param modVar The name of the factor used as a moderator. The effect of
##' the other independent factor will be probed at each value of the
##' moderator variable listed in `valProbe`.
##' @param valProbe The values of the moderator that will be used to probe the
##' effect of the focal predictor.
##' @param group In multigroup models, the label of the group for which the
##' results will be returned. Must correspond to one of
##' `lavInspect(fit, "group.label")`, or an integer
##' corresponding to one of those group labels.
##' @param omit.imps `character` vector specifying criteria for omitting
##' imputations from pooled results. Ignored unless `fit` is of
##' class [lavaan.mi::lavaan.mi-class]. Can include any of
##' `c("no.conv", "no.se", "no.npd")`, the first 2 of which are the
##' default setting, which excludes any imputations that did not
##' converge or for which standard errors could not be computed. The
##' last option (`"no.npd"`) would exclude any imputations which
##' yielded a nonpositive definite covariance matrix for observed or
##' latent variables, which would include any "improper solutions" such
##' as Heywood cases. NPD solutions are not excluded by default because
##' they are likely to occur due to sampling error, especially in small
##' samples. However, gross model misspecification could also cause
##' NPD solutions, so users can compare pooled results with and without
##' this setting as a sensitivity analysis to see whether some
##' imputations warrant further investigation.
##'
##' @return A list with two elements:
##' \enumerate{
##' \item `SimpleIntercept`: The simple intercepts given each value of the
##' moderator.
##' \item `SimpleSlope`: The simple slopes given each value of the moderator.
##' }
##' In each element, the first column represents the values of the moderators
##' specified in the `valProbe` argument. The second column is the simple
##' intercept or simple slope. The third column is the standard error of the
##' simple intercept or slope. The fourth column is the Wald (*z*)
##' statistic, and the fifth column is the associated *p* value testing
##' the null hypothesis that each simple intercept or slope is 0.
##'
##' @author
##' Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @seealso \itemize{
##' \item [indProd()] For creating the indicator products with no
##' centering, mean centering, double-mean centering, or residual centering.
##' \item [probe2WayMC()] For probing the two-way latent interaction
##' when the results are obtained from mean-centering, or double-mean centering
##' \item [probe3WayMC()] For probing the three-way latent interaction
##' when the results are obtained from mean-centering, or double-mean centering
##' \item [probe3WayRC()] For probing the three-way latent interaction
##' when the results are obtained from the residual-centering approach.
##' \item [plotProbe()] Plot the simple intercepts and slopes of the
##' latent interaction.
##' }
##' @references
##'
##' Tutorial:
##'
##' Schoemann, A. M., & Jorgensen, T. D. (2021). Testing and interpreting
##' latent variable interactions using the `semTools` package.
##' *Psych, 3*(3), 322--335. \doi{10.3390/psych3030024}
##'
##' Background literature:
##'
##' Lance, C. E. (1988). Residual centering, exploratory and confirmatory
##' moderator analysis, and decomposition of effects in path models containing
##' interactions. *Applied Psychological Measurement, 12*(2), 163--175.
##' \doi{10.1177/014662168801200205}
##'
##' Little, T. D., Bovaird, J. A., & Widaman, K. F. (2006). On the merits of
##' orthogonalizing powered and product terms: Implications for modeling
##' interactions. *Structural Equation Modeling, 13*(4), 497--519.
##' \doi{10.1207/s15328007sem1304_1}
##'
##' Marsh, H. W., Wen, Z., & Hau, K. T. (2004). Structural equation models of
##' latent interactions: Evaluation of alternative estimation strategies and
##' indicator construction. *Psychological Methods, 9*(3), 275--300.
##' \doi{10.1037/1082-989X.9.3.275}
##'
##' Geldhof, G. J., Pornprasertmanit, S., Schoemann, A. M., & Little, T. D.
##' (2013). Orthogonalizing through residual centering: Extended applications
##' and caveats. *Educational and Psychological Measurement, 73*(1), 27--46.
##' \doi{10.1177/0013164412445473}
##'
##' @examples
##'
##' dat2wayRC <- orthogonalize(dat2way, 1:3, 4:6)
##'
##' model1 <- "
##' f1 =~ x1 + x2 + x3
##' f2 =~ x4 + x5 + x6
##' f12 =~ x1.x4 + x2.x5 + x3.x6
##' f3 =~ x7 + x8 + x9
##' f3 ~ f1 + f2 + f12
##' f12 ~~ 0*f1 + 0*f2
##' x1 + x4 + x1.x4 + x7 ~ 0*1 # identify latent means
##' f1 + f2 + f12 + f3 ~ NA*1
##' "
##'
##' fitRC2way <- sem(model1, data = dat2wayRC, meanstructure = TRUE)
##' summary(fitRC2way)
##'
##' probe2WayRC(fitRC2way, nameX = c("f1", "f2", "f12"), nameY = "f3",
##' modVar = "f2", valProbe = c(-1, 0, 1))
##'
##'
##' ## can probe multigroup models, one group at a time
##' dat2wayRC$g <- 1:2
##'
##' model2 <- "
##' f1 =~ x1 + x2 + x3
##' f2 =~ x4 + x5 + x6
##' f12 =~ x1.x4 + x2.x5 + x3.x6
##' f3 =~ x7 + x8 + x9
##' f3 ~ c(b1.g1, b1.g2)*f1 + c(b2.g1, b2.g2)*f2 + c(b12.g1, b12.g2)*f12
##' f12 ~~ 0*f1 + 0*f2
##' x1 + x4 + x1.x4 + x7 ~ 0*1 # identify latent means
##' f1 + f2 + f12 ~ NA*1
##' f3 ~ NA*1 + c(b0.g1, b0.g2)*1
##' "
##' fit2 <- sem(model2, data = dat2wayRC, group = "g")
##' probe2WayRC(fit2, nameX = c("f1", "f2", "f12"), nameY = "f3",
##' modVar = "f2", valProbe = c(-1, 0, 1)) # group = 1 by default
##' probe2WayRC(fit2, nameX = c("f1", "f2", "f12"), nameY = "f3",
##' modVar = "f2", valProbe = c(-1, 0, 1), group = 2)
##'
##' @export
probe2WayRC <- function(fit, nameX, nameY, modVar, valProbe, group = 1L,
omit.imps = c("no.conv","no.se")) {
## TDJ: verify class
if (inherits(fit, "lavaan.mi")) {
if (!"package:lavaan.mi" %in% search()) attachNamespace("lavaan.mi")
useImps <- rep(TRUE, length(fit@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(fit@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(fit@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(fit@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(fit@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
## custom removal by imputation number
rm.imps <- omit.imps[ which(omit.imps %in% 1:length(useImps)) ]
if (length(rm.imps)) useImps[as.numeric(rm.imps)] <- FALSE
## whatever is left
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
} else if (!inherits(fit, "lavaan")) {
stop('"fit" must inherit from lavaan or lavaan.mi class', call. = FALSE)
}
if (!lavInspect(fit, "options")$meanstructure)
stop('This function requires the model to be fit with a mean structure.',
call. = FALSE)
# Check whether modVar is correct
if (is.character(modVar)) modVar <- match(modVar, nameX)
if (is.na(modVar) || !(modVar %in% 1:2))
stop("The moderator name is not in the name of independent factors or not 1 or 2.")
## TDJ: If multigroup, check group %in% group.label
nG <- lavInspect(fit, "ngroups")
if (nG > 1L) {
group.label <- lavInspect(fit, "group.label")
## assign numeric to character
if (is.numeric(group)) {
if (group %in% 1:nG) {
group <- group.label[group]
} else group <- as.character(group)
} else group <- as.character(group)
## check that character is a group
if (!as.character(group) %in% group.label)
stop('"group" must be a character string naming a group of interest, or ',
'an integer corresponding to a group in lavInspect(fit, "group.label")')
group.number <- which(group.label == group)
} else group.number <- 1L
## Get the parameter estimates for that group
if (nG > 1L) {
if (inherits(fit, "lavaan")) {
est <- lavInspect(fit, "est")[[group]]
} else if (inherits(fit, "lavaan.mi")) {
est <- list()
GLIST <- fit@coefList[useImps]
est$beta <- Reduce("+", lapply(GLIST, function(i) i[[group]]$beta)) / m
est$alpha <- Reduce("+", lapply(GLIST, function(i) i[[group]]$alpha)) / m
est$psi <- Reduce("+", lapply(GLIST, function(i) i[[group]]$psi)) / m
}
} else {
## single-group model
if (inherits(fit, "lavaan")) {
est <- lavInspect(fit, "est")
} else if (inherits(fit, "lavaan.mi")) {
est <- list()
est$beta <- Reduce("+", lapply(fit@coefList[useImps], "[[", i = "beta")) / m
est$alpha <- Reduce("+", lapply(fit@coefList[useImps], "[[", i = "alpha")) / m
est$psi <- Reduce("+", lapply(fit@coefList[useImps], "[[", i = "psi")) / m
}
}
# Find the mean and covariance matrix of independent factors
varX <- est$psi[ nameX, nameX]
meanX <- est$alpha[nameX, , drop = FALSE]
# Find the intercept, regression coefficients, and residual variance of residual-centered regression
intceptRC <- est$alpha[nameY,]
resVarRC <- est$psi[ nameY, nameY]
betaRC <- t(est$beta[ nameY, nameX, drop = FALSE])
# Find the number of observations
numobs <- lavInspect(fit, "nobs")[group.number]
# Compute SSRC
meanXwith1 <- rbind(1, meanX)
varXwith0 <- cbind(0, rbind(0, varX))
SSRC <- numobs * (varXwith0 + (meanXwith1 %*% t(meanXwith1)))
# Compute Mean(Y) and Var(Y)
betaRCWithIntcept <- rbind(intceptRC, betaRC)
meanY <- t(meanXwith1) %*% betaRCWithIntcept
varY <- (t(betaRCWithIntcept) %*% SSRC %*% betaRCWithIntcept)/numobs - meanY^2 + resVarRC
# Compute Cov(Y, X)
covY <- as.matrix((varX %*% betaRC)[1:2,])
# Compute E(XZ)
meanX[3] <- meanX[1] * meanX[2] + varX[1, 2]
# Compute Var(XZ)
varX[3, 3] <- meanX[1]^2 * varX[2, 2] + meanX[2]^2 * varX[1, 1] + 2 * meanX[1] * meanX[2] * varX[1, 2] + varX[1, 1] * varX[2, 2] + varX[1, 2]^2
# Compute Cov(X, XZ), Cov(Z, XZ)
varX[1, 3] <- varX[3, 1] <- meanX[1] * varX[1, 2] + meanX[2] * varX[1, 1]
varX[2, 3] <- varX[3, 2] <- meanX[1] * varX[2, 2] + meanX[2] * varX[1, 2]
# Compute Cov(Y, XZ) and regression coefficients of no-centering
betaNC <- solve(varX[1:2,1:2], covY - rbind(varX[1,3] * betaRC[3,1], varX[2, 3] * betaRC[3,1]))
betaNC <- rbind(betaNC, betaRC[3, 1])
covY <- rbind(covY, (varX %*% betaNC)[3, 1])
# Aggregate the no-centered sufficient statistics (shown for completeness; not strictly necessary)
fullCov <- rbind(cbind(varX, covY), c(covY, varY))
fullMean <- rbind(meanX, meanY)
# Compute the intercept of no-centering
intceptNC <- meanY - t(betaNC) %*% meanX
# Compute SSNC
betaNCWithIntcept <- rbind(intceptNC, betaNC)
meanXwith1 <- rbind(1, meanX)
varXwith0 <- rbind(0, cbind(0, varX))
SSNC <- numobs * (varXwith0 + (meanXwith1 %*% t(meanXwith1)))
# Compute residual variance on non-centering
resVarNC <- varY - (t(betaNCWithIntcept) %*% SSNC %*% betaNCWithIntcept)/numobs + meanY^2
## function to calculate p values for Wald z tests
pvalue <- function(x) (1 - pnorm(abs(x))) * 2
## Extract all sampling (co)variances
if (inherits(fit, "lavaan")) {
varEst <- lavaan::vcov(fit)
} else if (inherits(fit, "lavaan.mi")) {
varEst <- getMethod(f = "vcov", signature = "lavaan.mi",
where = getNamespace("lavaan.mi"))(fit, omit.imps = omit.imps)
}
## for indexing sampling (co)variances
PT <- parTable(fit)
int.idx <- PT$label[PT$lhs == nameY & PT$op == "~1" & PT$group == group.number]
if (int.idx == "") {
## no custom label, use default
int.idx <- paste0(nameY, "~1")
if (nG > 1L && group.number > 1L) {
int.idx <- paste0(int.idx, ".g", group.number)
}
}
## check intercept is actually estimated (thus, has sampling variance)
estimateIntcept <- int.idx %in% rownames(varEst)
## if intercept is not estimated, add a row/column with zero sampling variance
if (!estimateIntcept) {
dn <- c(rownames(varEst), int.idx)
varEst <- rbind(cbind(varEst, 0), 0)
dimnames(varEst) <- list(dn, dn)
}
## index slopes in varEst
slope.idx <- which(PT$lhs == nameY & PT$op == "~" & PT$rhs %in% nameX & PT$group == group.number)
## check for user-defined labels; otherwise, set defaults
slope.label <- PT$label[slope.idx]
if (any(slope.label == "")) for (i in which(slope.label == "")) {
slope.label[i] <- paste0(nameY, "~", nameX[i],
ifelse(nG > 1L && group.number > 1L, no = "",
yes = paste0(".g", group.number)))
}
## save all indices in a vector
targetcol <- c(int.idx, slope.label)
varEstSlopeRC <- varEst[targetcol, targetcol]
## collect relevant estimates and variances to probe the interaction
usedVar <- as.numeric(resVarNC/resVarRC) * (varEstSlopeRC %*% SSRC %*% solve(SSNC))
usedBeta <- betaNCWithIntcept
# Change the order of usedVar and usedBeta if the moderator variable is listed first
if (modVar == 1) {
usedVar <- usedVar[c(1, 3, 2, 4), c(1, 3, 2, 4)]
usedBeta <- usedBeta[c(1, 3, 2, 4)]
}
## Calculate simple intercepts
simpleIntcept <- usedBeta[1] + usedBeta[3] * valProbe
varIntcept <- usedVar[1, 1] + 2 * valProbe * usedVar[1, 3] + (valProbe^2) * usedVar[3, 3]
zIntcept <- simpleIntcept/sqrt(varIntcept)
pIntcept <- pvalue(zIntcept)
resultIntcept <- data.frame(valProbe, simpleIntcept, sqrt(varIntcept), zIntcept, pIntcept)
colnames(resultIntcept) <- c(nameX[modVar], "est", "se", "z", "pvalue")
class(resultIntcept) <- c("lavaan.data.frame","data.frame")
## Calculate simple slopes
simpleSlope <- usedBeta[2] + usedBeta[4] * valProbe
varSlope <- usedVar[2, 2] + 2 * valProbe * usedVar[2, 4] + (valProbe^2) * usedVar[4, 4]
zSlope <- simpleSlope/sqrt(varSlope)
pSlope <- pvalue(zSlope)
resultSlope <- data.frame(valProbe, simpleSlope, sqrt(varSlope), zSlope, pSlope)
colnames(resultSlope) <- c(nameX[modVar], "est", "se", "z", "pvalue")
class(resultSlope) <- c("lavaan.data.frame","data.frame")
list(SimpleIntcept = resultIntcept, SimpleSlope = resultSlope)
}
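## ------------------------------------------------------------------
## Numeric sanity check of the normal-theory product moments used in
## the RC -> NC transformation above: E(XZ), Var(XZ), and Cov(X, XZ).
## A standalone sketch (assumes the MASS package is available); not run:
if (FALSE) {
  set.seed(1)
  mu <- c(.5, -.2)                 # E(X), E(Z)
  S  <- matrix(c(1, .3, .3, 1), 2) # Cov(X, Z)
  XZ <- MASS::mvrnorm(1e6, mu, S)
  x <- XZ[, 1]; z <- XZ[, 2]; xz <- x * z
  ## analytic values, matching the formulas in probe2WayRC():
  c(EXZ    = mu[1] * mu[2] + S[1, 2],
    VarXZ  = mu[1]^2 * S[2, 2] + mu[2]^2 * S[1, 1] +
             2 * mu[1] * mu[2] * S[1, 2] + S[1, 1] * S[2, 2] + S[1, 2]^2,
    CovXXZ = mu[1] * S[1, 2] + mu[2] * S[1, 1])
  ## empirical counterparts (approximately equal):
  c(mean(xz), var(xz), cov(x, xz))
}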
## --------
## 3-way MC
## --------
##' Probing three-way interaction on the no-centered or mean-centered latent
##' interaction
##'
##' Probing interaction for simple intercept and simple slope for the
##' no-centered or mean-centered latent three-way interaction
##'
##' Before using this function, researchers need to make the products of the
##' indicators between the first-order factors using mean centering (Marsh, Wen,
##' & Hau, 2004). Note that the double-mean centering may not be appropriate for
##' probing interaction if researchers are interested in simple intercepts. The
##' mean or double-mean centering can be done by the [indProd()]
##' function. The indicator products can be made for all possible combinations or by the
##' matched-pair approach (Marsh et al., 2004). Next, the hypothesized model
##' with the regression with latent interaction will be used to fit all original
##' indicators and the product terms. See the example for how to fit the product
##' term below. Once the lavaan result is obtained, this function will be used
##' to probe the interaction.
##'
##' Let the latent interaction model regressing the dependent variable
##' (\eqn{Y}) on the independent variable (\eqn{X}) and two moderators (\eqn{Z}
##' and \eqn{W}) be \deqn{ Y = b_0 + b_1X + b_2Z + b_3W + b_4XZ + b_5XW + b_6ZW
##' + b_7XZW + r, } where \eqn{b_0} is the estimated intercept or the expected
##' value of \eqn{Y} when \eqn{X}, \eqn{Z}, and \eqn{W} are 0, \eqn{b_1} is the
##' effect of \eqn{X} when \eqn{Z} and \eqn{W} are 0, \eqn{b_2} is the effect of
##' \eqn{Z} when \eqn{X} and \eqn{W} are 0, \eqn{b_3} is the effect of \eqn{W}
##' when \eqn{X} and \eqn{Z} are 0, \eqn{b_4} is the interaction effect between
##' \eqn{X} and \eqn{Z} when \eqn{W} is 0, \eqn{b_5} is the interaction effect
##' between \eqn{X} and \eqn{W} when \eqn{Z} is 0, \eqn{b_6} is the interaction
##' effect between \eqn{Z} and \eqn{W} when \eqn{X} is 0, \eqn{b_7} is the
##' three-way interaction effect between \eqn{X}, \eqn{Z}, and \eqn{W}, and
##' \eqn{r} is the residual term.
##'
##' To probe a three-way interaction, the simple intercept of the independent
##' variable at the specific values of the moderators (Aiken & West, 1991) can
##' be obtained by
##' \deqn{ b_{0|X = 0, Z, W} = b_0 + b_2Z + b_3W + b_6ZW. }
##' The simple slope of the independent variable at the specific values of the
##' moderators can be obtained by
##' \deqn{ b_{X|Z, W} = b_1 + b_4Z + b_5W + b_7ZW.}
##' The variance of the simple intercept formula is
##' \deqn{ Var\left(b_{0|X = 0, Z, W}\right) =
##' Var\left(b_0\right) + Z^2Var\left(b_2\right) + W^2Var\left(b_3\right) +
##' Z^2W^2Var\left(b_6\right)}
##' \deqn{+ 2ZCov\left(b_0, b_2\right) + 2WCov\left(b_0, b_3\right) +
##' 2ZWCov\left(b_0, b_6\right) + 2ZWCov\left(b_2, b_3\right) +
##' 2Z^2WCov\left(b_2, b_6\right) + 2ZW^2Cov\left(b_3, b_6\right), }
##' where \eqn{Var} denotes the variance of a parameter estimate and \eqn{Cov}
##' denotes the covariance of two parameter estimates.
##'
##' The variance of the simple slope formula is
##' \deqn{ Var\left(b_{X|Z, W}\right) =
##' Var\left(b_1\right) + Z^2Var\left(b_4\right) + W^2Var\left(b_5\right)
##' + Z^2W^2Var\left(b_7\right) }
##' \deqn{+ 2ZCov\left(b_1, b_4\right) + 2WCov\left(b_1, b_5\right) +
##' 2ZWCov\left(b_1, b_7\right) + 2ZWCov\left(b_4, b_5\right) +
##' 2Z^2WCov\left(b_4, b_7\right) + 2ZW^2Cov\left(b_5, b_7\right). }
##'
##' Wald *z* statistics are calculated (even for objects of class
##' [lavaan.mi::lavaan.mi-class]) to test null hypotheses that simple
##' intercepts or slopes are 0.
##'
##'
##' @importFrom lavaan lavInspect parTable
##' @importFrom stats pnorm
##' @importFrom methods getMethod
##'
##' @param fit A fitted [lavaan::lavaan-class] or
##' [lavaan.mi::lavaan.mi-class] object with a latent 3-way interaction.
##' @param nameX `character` vector of all 7 factor names used as the
##' predictors. The 3 lower-order factors must be listed first, followed by
##' the 3 second-order factors (specifically, the 4th element must be the
##' interaction between the factors listed first and second, the 5th element
##' must be the interaction between the factors listed first and third, and
##' the 6th element must be the interaction between the factors listed second
##' and third). The final name will be the factor representing the 3-way
##' interaction.
##' @param nameY The name of the factor used as the dependent variable.
##' @param modVar The names of the two factors used as moderators. The
##' effect of the independent factor will be probed at each combination of
##' the moderator variables' chosen values.
##' @param valProbe1 The values of the first moderator that will be used to
##' probe the effect of the independent factor.
##' @param valProbe2 The values of the second moderator that will be used to
##' probe the effect of the independent factor.
##' @param group In multigroup models, the label of the group for which the
##' results will be returned. Must correspond to one of
##' `lavInspect(fit, "group.label")`.
##' @param omit.imps `character` vector specifying criteria for omitting
##' imputations from pooled results. Ignored unless `fit` is of
##' class [lavaan.mi::lavaan.mi-class]. Can include any of
##' `c("no.conv", "no.se", "no.npd")`, the first 2 of which are the
##' default setting, which excludes any imputations that did not
##' converge or for which standard errors could not be computed. The
##' last option (`"no.npd"`) would exclude any imputations which
##' yielded a nonpositive definite covariance matrix for observed or
##' latent variables, which would include any "improper solutions" such
##' as Heywood cases. NPD solutions are not excluded by default because
##' they are likely to occur due to sampling error, especially in small
##' samples. However, gross model misspecification could also cause
##' NPD solutions, so users can compare pooled results with and without
##' this setting as a sensitivity analysis to see whether some
##' imputations warrant further investigation.
##'
##' @return A list with two elements:
##' \enumerate{
##' \item `SimpleIntercept`: The model-implied intercepts given each
##' combination of moderator values.
##' \item `SimpleSlope`: The model-implied slopes given each combination
##' of moderator values.
##' }
##' In each element, the first column represents values of the first moderator
##' specified in the `valProbe1` argument. The second column represents
##' values of the second moderator specified in the `valProbe2` argument.
##' The third column is the simple intercept or simple slope. The fourth column
##' is the standard error of the simple intercept or simple slope. The fifth
##' column is the Wald (*z*) statistic, and the sixth column is its
##' associated *p* value to test the null hypothesis that each simple
##' intercept or simple slope equals 0.
##'
##' @author
##' Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @seealso \itemize{
##' \item [indProd()] For creating the indicator products with no
##' centering, mean centering, double-mean centering, or residual centering.
##' \item [probe2WayMC()] For probing the two-way latent interaction
##' when the results are obtained from mean-centering, or double-mean centering
##' \item [probe2WayRC()] For probing the two-way latent interaction
##' when the results are obtained from residual-centering approach.
##' \item [probe3WayRC()] For probing the three-way latent interaction
##' when the results are obtained from the residual-centering approach.
##' \item [plotProbe()] Plot the simple intercepts and slopes of the
##' latent interaction.
##' }
##'
##' @references
##' Tutorial:
##'
##' Schoemann, A. M., & Jorgensen, T. D. (2021). Testing and interpreting
##' latent variable interactions using the `semTools` package.
##' *Psych, 3*(3), 322--335. \doi{10.3390/psych3030024}
##'
##' Background literature:
##'
##' Aiken, L. S., & West, S. G. (1991). *Multiple regression: Testing
##' and interpreting interactions*. Newbury Park, CA: Sage.
##'
##' Marsh, H. W., Wen, Z., & Hau, K. T. (2004). Structural equation models of
##' latent interactions: Evaluation of alternative estimation strategies and
##' indicator construction. *Psychological Methods, 9*(3), 275--300.
##' \doi{10.1037/1082-989X.9.3.275}
##'
##' @examples
##'
##' dat3wayMC <- indProd(dat3way, 1:3, 4:6, 7:9)
##'
##' model3 <- " ## define latent variables
##' f1 =~ x1 + x2 + x3
##' f2 =~ x4 + x5 + x6
##' f3 =~ x7 + x8 + x9
##' ## 2-way interactions
##' f12 =~ x1.x4 + x2.x5 + x3.x6
##' f13 =~ x1.x7 + x2.x8 + x3.x9
##' f23 =~ x4.x7 + x5.x8 + x6.x9
##' ## 3-way interaction
##' f123 =~ x1.x4.x7 + x2.x5.x8 + x3.x6.x9
##' ## outcome variable
##' f4 =~ x10 + x11 + x12
##'
##' ## latent regression model
##' f4 ~ b1*f1 + b2*f2 + b3*f3 + b12*f12 + b13*f13 + b23*f23 + b123*f123
##'
##' ## orthogonal terms among predictors
##' ## (not necessary, but implied by double mean centering)
##' f1 ~~ 0*f12 + 0*f13 + 0*f123
##' f2 ~~ 0*f12 + 0*f23 + 0*f123
##' f3 ~~ 0*f13 + 0*f23 + 0*f123
##' f12 + f13 + f23 ~~ 0*f123
##' "
##'
##' fitMC3way <- sem(model3, data = dat3wayMC, meanstructure = TRUE)
##' summary(fitMC3way)
##'
##' probe3WayMC(fitMC3way, nameX = c("f1" ,"f2" ,"f3",
##' "f12","f13","f23", # this order matters!
##' "f123"), # 3-way interaction
##' nameY = "f4", modVar = c("f1", "f2"),
##' valProbe1 = c(-1, 0, 1), valProbe2 = c(-1, 0, 1))
##'
##' @export
probe3WayMC <- function(fit, nameX, nameY, modVar, valProbe1, valProbe2,
group = 1L, omit.imps = c("no.conv","no.se")) {
## TDJ: verify class
if (inherits(fit, "lavaan.mi")) {
if (!"package:lavaan.mi" %in% search()) attachNamespace("lavaan.mi")
useImps <- rep(TRUE, length(fit@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(fit@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(fit@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(fit@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(fit@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
## custom removal by imputation number
rm.imps <- omit.imps[ which(omit.imps %in% 1:length(useImps)) ]
if (length(rm.imps)) useImps[as.numeric(rm.imps)] <- FALSE
## whatever is left
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
} else if (!inherits(fit, "lavaan")) {
stop('"fit" must inherit from lavaan or lavaan.mi class', call. = FALSE)
}
meanstruc <- lavInspect(fit, "options")$meanstructure
# Check whether modVar is correct
if (is.character(modVar)) modVar <- match(modVar, nameX)
if (anyNA(modVar) || !all(modVar %in% 1:3))
stop('"modVar" must contain two of the first three names in "nameX", or integers 1-3.')
## TDJ: If multigroup, check group %in% group.label
nG <- lavInspect(fit, "ngroups")
if (nG > 1L) {
group.label <- lavInspect(fit, "group.label")
## assign numeric to character
if (is.numeric(group)) {
if (group %in% 1:nG) {
group <- group.label[group]
} else group <- as.character(group)
} else group <- as.character(group)
## check that character is a group
if (!as.character(group) %in% group.label)
stop('"group" must be a character string naming a group of interest, or ',
'an integer corresponding to a group in lavInspect(fit, "group.label")')
group.number <- which(group.label == group)
} else group.number <- 1L
## Get the parameter estimates for that group
if (nG > 1L) {
if (inherits(fit, "lavaan")) {
est <- lavInspect(fit, "est")[[group]]
if (!meanstruc) {
est$alpha <- matrix(0, 1, 1, dimnames = list(nameY, "intercept"))
}
} else if (inherits(fit, "lavaan.mi")) {
est <- list()
GLIST <- fit@coefList[useImps]
est$beta <- Reduce("+", lapply(GLIST, function(i) i[[group]]$beta)) / m
if (meanstruc) {
est$alpha <- Reduce("+", lapply(GLIST, function(i) i[[group]]$alpha)) / m
} else {
est$alpha <- matrix(0, 1, 1, dimnames = list(nameY, "intercept"))
}
}
} else {
## single-group model
if (inherits(fit, "lavaan")) {
est <- lavInspect(fit, "est")
if (!meanstruc) {
est$alpha <- matrix(0, 1, 1, dimnames = list(nameY, "intercept"))
}
} else if (inherits(fit, "lavaan.mi")) {
est <- list()
est$beta <- Reduce("+", lapply(fit@coefList[useImps], "[[", i = "beta")) / m
if (meanstruc) {
est$alpha <- Reduce("+", lapply(fit@coefList[useImps], "[[", i = "alpha")) / m
} else {
est$alpha <- matrix(0, 1, 1, dimnames = list(nameY, "intercept"))
}
}
}
## extract relevant slopes
betaNC <- matrix(est$beta[nameY, nameX], ncol = 1,
dimnames = list(nameX, nameY))
## Extract all sampling (co)variances
if (inherits(fit, "lavaan")) {
varEst <- lavaan::vcov(fit)
} else if (inherits(fit, "lavaan.mi")) {
varEst <- getMethod(f = "vcov", signature = "lavaan.mi",
where = getNamespace("lavaan.mi"))(fit, omit.imps = omit.imps)
}
## for indexing the sampling (co)variances
PT <- parTable(fit)
## index slopes in varEst
slope.idx <- which(PT$lhs == nameY & PT$op == "~" & PT$rhs %in% nameX & PT$group == group.number)
## check for user-defined labels; otherwise, set defaults
slope.label <- PT$label[slope.idx]
if (any(slope.label == "")) for (i in which(slope.label == "")) {
slope.label[i] <- paste0(nameY, "~", nameX[i],
ifelse(nG > 1L && group.number > 1L, no = "",
yes = paste0(".g", group.number)))
}
## index intercept in varEst
if (meanstruc) {
int.idx <- PT$label[PT$lhs == nameY & PT$op == "~1" & PT$group == group.number]
if (int.idx == "") {
## no custom label, use default
int.idx <- paste0(nameY, "~1")
if (nG > 1L && group.number > 1L) {
int.idx <- paste0(int.idx, ".g", group.number)
}
}
## check whether it is actually estimated (appears in $alpha and varEst)
estimateIntcept <- int.idx %in% rownames(varEst)
} else {
estimateIntcept <- FALSE
## use default label
int.idx <- paste0(nameY, "~1")
if (nG > 1L && group.number > 1L) {
int.idx <- paste0(int.idx, ".g", group.number)
}
}
## if intercept is not estimated, add a row/column with zero sampling variance
if (!estimateIntcept) {
dn <- c(rownames(varEst), int.idx)
varEst <- rbind(cbind(varEst, 0), 0)
dimnames(varEst) <- list(dn, dn)
}
## save all indices in a vector
targetcol <- c(int.idx, slope.label)
## collect relevant estimates and variances to probe the interaction
usedBeta <- rbind(est$alpha[nameY,], betaNC) # point estimates
usedVar <- varEst[targetcol, targetcol] # sampling (co)variances
## In case a moderator is listed first, find the order to rearrange
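## nameX positions 4:6 hold the 2-way terms for the pairs (1,2), (1,3),
## and (2,3); the pair that omits lower-order factor k sits at position
## 7 - k, so 7 - rev(ord) returns the interactions in the order
## (focal x mod1, focal x mod2, mod1 x mod2).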
ord <- c(setdiff(1:3, modVar), modVar)
ord <- c(ord, 7 - rev(ord))
usedBeta <- usedBeta[c(1, ord+1, 8)]
usedVar <- usedVar[ c(1, ord+1, 8), c(1, ord+1, 8)]
## set all combinations of moderator values
val <- expand.grid(valProbe1, valProbe2)
## function to calculate p values for Wald z tests
pvalue <- function(x) (1 - pnorm(abs(x))) * 2
## Calculate simple intercepts
simpleIntcept <- usedBeta[1] + usedBeta[3] * val[,1] + usedBeta[4] * val[,2] + usedBeta[7] * val[,1] * val[,2]
varIntcept <- usedVar[1, 1] + val[,1]^2 * usedVar[3, 3] + val[,2]^2 * usedVar[4, 4] + val[,1]^2 * val[,2]^2 * usedVar[7, 7] + 2 * val[,1] * usedVar[1, 3] + 2 * val[,2] * usedVar[1, 4] + 2 * val[,1] * val[,2] * usedVar[1, 7] + 2 * val[,1] * val[,2] * usedVar[3, 4] + 2 * val[,1]^2 * val[,2] * usedVar[3, 7] + 2* val[,1] * val[,2]^2 * usedVar[4, 7]
zIntcept <- simpleIntcept / sqrt(varIntcept)
pIntcept <- pvalue(zIntcept)
resultIntcept <- cbind(val, simpleIntcept, sqrt(varIntcept), zIntcept, pIntcept)
colnames(resultIntcept) <- c(nameX[modVar], "est", "se", "z", "pvalue")
class(resultIntcept) <- c("lavaan.data.frame","data.frame")
## Calculate simple slopes
simpleSlope <- usedBeta[2] + usedBeta[5] * val[,1] + usedBeta[6] * val[,2] + usedBeta[8] * val[,1] * val[,2]
varSlope <- usedVar[2, 2] + val[,1]^2 * usedVar[5, 5] + val[,2]^2 * usedVar[6, 6] + val[,1]^2 * val[,2]^2 * usedVar[8, 8] + 2 * val[,1] * usedVar[2, 5] + 2 * val[,2] * usedVar[2, 6] + 2 * val[,1] * val[,2] * usedVar[2, 8] + 2 * val[,1] * val[,2] * usedVar[5, 6] + 2 * val[,1]^2 * val[,2] * usedVar[5, 8] + 2 * val[,1] * val[,2]^2 * usedVar[6, 8]
zSlope <- simpleSlope / sqrt(varSlope)
pSlope <- pvalue(zSlope)
resultSlope <- cbind(val, simpleSlope, sqrt(varSlope), zSlope, pSlope)
colnames(resultSlope) <- c(nameX[modVar], "est", "se", "z", "pvalue")
class(resultSlope) <- c("lavaan.data.frame","data.frame")
list(SimpleIntcept = resultIntcept, SimpleSlope = resultSlope)
}
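## ------------------------------------------------------------------
## The simple-slope variance above is the quadratic form g' V g, with
## gradient g = (1, Z, W, ZW) over the focal slope and its interaction
## terms. A sketch assuming `fitMC3way` from the @examples (moderators
## f1 and f2, so the relevant labels are b3, b13, b23, b123); not run:
if (FALSE) {
  idx <- c("b3", "b13", "b23", "b123")
  b <- lavaan::coef(fitMC3way)[idx]
  V <- lavaan::vcov(fitMC3way)[idx, idx]
  v1 <- 1; v2 <- -1               # probe point: f1 = 1, f2 = -1
  g <- c(1, v1, v2, v1 * v2)
  slope <- sum(g * b)
  seSlope <- sqrt(drop(t(g) %*% V %*% g))
  c(est = slope, se = seSlope, wald = slope / seSlope)
  ## algebraically equivalent to the expanded Var() expression in the docs
}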
## --------
## 3-way RC
## --------
##' Probing three-way interaction on the residual-centered latent interaction
##'
##' Probing interaction for simple intercept and simple slope for the
##' residual-centered latent three-way interaction (Geldhof et al., 2013)
##'
##' Before using this function, researchers need to make the products of the
##' indicators between the first-order factors and residualize the products by
##' the original indicators (Lance, 1988; Little, Bovaird, & Widaman, 2006). The
##' process can be automated by the [indProd()] function. Note that
##' the indicator products can be made for all possible combination or
##' matched-pair approach (Marsh et al., 2004). Next, the hypothesized model
##' with the regression with latent interaction will be used to fit all original
##' indicators and the product terms (Geldhof et al., 2013). To use this
##' function the model must be fit with a mean structure. See the example for
##' how to fit the product term below. Once the lavaan result is obtained, this
##' function will be used to probe the interaction.
##'
##' The probing process on residual-centered latent interaction is based on
##' transforming the residual-centered result into the no-centered result. See
##' Geldhof et al. (2013) for further details. Note that this approach is based on
##' a strong assumption that the first-order latent variables are normally
##' distributed. The probing process is applied after the no-centered result
##' (parameter estimates and the covariance matrix among them)
##' has been computed. See the [probe3WayMC()] for further details.
##'
##'
##' @importFrom lavaan lavInspect parTable
##' @importFrom stats pnorm
##' @importFrom methods getMethod
##'
##' @param fit A fitted [lavaan::lavaan-class] or
##' [lavaan.mi::lavaan.mi-class] object with a latent 3-way interaction.
##' @param nameX `character` vector of all 7 factor names used as the
##' predictors. The 3 lower-order factors must be listed first, followed by
##' the 3 second-order factors (specifically, the 4th element must be the
##' interaction between the factors listed first and second, the 5th element
##' must be the interaction between the factors listed first and third, and
##' the 6th element must be the interaction between the factors listed second
##' and third). The final name will be the factor representing the 3-way
##' interaction.
##' @param nameY The name of the factor used as the dependent variable.
##' @param modVar The names of the two factors used as moderators. The effect
##' of the independent factor will be probed at each combination of the
##' moderators' chosen values.
##' @param valProbe1 The values of the first moderator that will be used to
##' probe the effect of the independent factor.
##' @param valProbe2 The values of the second moderator that will be used to
##' probe the effect of the independent factor.
##' @param group In multigroup models, the label of the group for which the
##' results will be returned. Must correspond to one of
##' `lavInspect(fit, "group.label")`.
##' @param omit.imps `character` vector specifying criteria for omitting
##' imputations from pooled results. Ignored unless `fit` is of
##' class [lavaan.mi::lavaan.mi-class]. Can include any of
##' `c("no.conv", "no.se", "no.npd")`, the first 2 of which are the
##' default setting, which excludes any imputations that did not
##' converge or for which standard errors could not be computed. The
##' last option (`"no.npd"`) would exclude any imputations which
##' yielded a nonpositive definite covariance matrix for observed or
##' latent variables, which would include any "improper solutions" such
##' as Heywood cases. NPD solutions are not excluded by default because
##' they are likely to occur due to sampling error, especially in small
##' samples. However, gross model misspecification could also cause
##' NPD solutions, so users can compare pooled results with and without
##' this setting as a sensitivity analysis to see whether some
##' imputations warrant further investigation.
##'
##' @return A list with two elements:
##' \enumerate{
##' \item `SimpleIntercept`: The model-implied intercepts given each
##' combination of moderator values.
##' \item `SimpleSlope`: The model-implied slopes given each combination
##' of moderator values.
##' }
##' In each element, the first column represents values of the first moderator
##' specified in the `valProbe1` argument. The second column represents
##' values of the second moderator specified in the `valProbe2` argument.
##' The third column is the simple intercept or simple slope. The fourth column
##' is the *SE* of the simple intercept or simple slope. The fifth column
##' is the Wald (*z*) statistic, and the sixth column is its associated
##' *p* value to test the null hypothesis that each simple intercept or
##' simple slope equals 0.
##'
##' @author
##' Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @seealso \itemize{
##' \item [indProd()] For creating the indicator products with no
##' centering, mean centering, double-mean centering, or residual centering.
##' \item [probe2WayMC()] For probing the two-way latent interaction
##' when the results are obtained from mean-centering, or double-mean centering
##' \item [probe3WayMC()] For probing the three-way latent interaction
##' when the results are obtained from mean-centering, or double-mean centering
##' \item [probe2WayRC()] For probing the two-way latent interaction
##' when the results are obtained from the residual-centering approach.
##' \item [plotProbe()] Plot the simple intercepts and slopes of the
##' latent interaction.
##' }
##'
##' @references
##' Tutorial:
##'
##' Schoemann, A. M., & Jorgensen, T. D. (2021). Testing and interpreting
##' latent variable interactions using the `semTools` package.
##' *Psych, 3*(3), 322--335. \doi{10.3390/psych3030024}
##'
##' Background literature:
##'
##' Geldhof, G. J., Pornprasertmanit, S., Schoemann, A., & Little,
##' T. D. (2013). Orthogonalizing through residual centering: Extended
##' applications and caveats. *Educational and Psychological Measurement,
##' 73*(1), 27--46. \doi{10.1177/0013164412445473}
##'
##' Lance, C. E. (1988). Residual centering, exploratory and confirmatory
##' moderator analysis, and decomposition of effects in path models containing
##' interactions. *Applied Psychological Measurement, 12*(2), 163--175.
##' \doi{10.1177/014662168801200205}
##'
##' Little, T. D., Bovaird, J. A., & Widaman, K. F. (2006). On the merits of
##' orthogonalizing powered and product terms: Implications for modeling
##' interactions. *Structural Equation Modeling, 13*(4), 497--519.
##' \doi{10.1207/s15328007sem1304_1}
##'
##' Marsh, H. W., Wen, Z., & Hau, K. T. (2004). Structural equation models of
##' latent interactions: Evaluation of alternative estimation strategies and
##' indicator construction. *Psychological Methods, 9*(3), 275--300.
##' \doi{10.1037/1082-989X.9.3.275}
##'
##' Pornprasertmanit, S., Schoemann, A. M., Geldhof, G. J., & Little, T. D.
##' (submitted). *Probing latent interaction estimated with a residual
##' centering approach.*
##'
##' @examples
##'
##' dat3wayRC <- orthogonalize(dat3way, 1:3, 4:6, 7:9)
##'
##' model3 <- " ## define latent variables
##' f1 =~ x1 + x2 + x3
##' f2 =~ x4 + x5 + x6
##' f3 =~ x7 + x8 + x9
##' ## 2-way interactions
##' f12 =~ x1.x4 + x2.x5 + x3.x6
##' f13 =~ x1.x7 + x2.x8 + x3.x9
##' f23 =~ x4.x7 + x5.x8 + x6.x9
##' ## 3-way interaction
##' f123 =~ x1.x4.x7 + x2.x5.x8 + x3.x6.x9
##' ## outcome variable
##' f4 =~ x10 + x11 + x12
##'
##' ## latent regression model
##' f4 ~ b1*f1 + b2*f2 + b3*f3 + b12*f12 + b13*f13 + b23*f23 + b123*f123
##'
##' ## orthogonal terms among predictors
##' f1 ~~ 0*f12 + 0*f13 + 0*f123
##' f2 ~~ 0*f12 + 0*f23 + 0*f123
##' f3 ~~ 0*f13 + 0*f23 + 0*f123
##' f12 + f13 + f23 ~~ 0*f123
##'
##' ## identify latent means
##' x1 + x4 + x7 + x1.x4 + x1.x7 + x4.x7 + x1.x4.x7 + x10 ~ 0*1
##' f1 + f2 + f3 + f12 + f13 + f23 + f123 + f4 ~ NA*1
##' "
##'
##' fitRC3way <- sem(model3, data = dat3wayRC, meanstructure = TRUE)
##' summary(fitRC3way)
##'
##' probe3WayRC(fitRC3way, nameX = c("f1" ,"f2" ,"f3",
##' "f12","f13","f23", # this order matters!
##' "f123"), # 3-way interaction
##' nameY = "f4", modVar = c("f1", "f2"),
##' valProbe1 = c(-1, 0, 1), valProbe2 = c(-1, 0, 1))
##'
##' @export
probe3WayRC <- function(fit, nameX, nameY, modVar, valProbe1, valProbe2,
group = 1L, omit.imps = c("no.conv","no.se")) {
## TDJ: verify class
if (inherits(fit, "lavaan.mi")) {
if (!"package:lavaan.mi" %in% search()) attachNamespace("lavaan.mi")
useImps <- rep(TRUE, length(fit@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(fit@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(fit@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(fit@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(fit@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
## custom removal by imputation number
rm.imps <- omit.imps[ which(omit.imps %in% 1:length(useImps)) ]
if (length(rm.imps)) useImps[as.numeric(rm.imps)] <- FALSE
## whatever is left
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
} else if (!inherits(fit, "lavaan")) {
stop('"fit" must inherit from lavaan or lavaan.mi class', call. = FALSE)
}
if (!lavInspect(fit, "options")$meanstructure)
stop('This function requires the model to be fit with a mean structure.',
call. = FALSE)
# Check whether modVar is correct
if (is.character(modVar)) modVar <- match(modVar, nameX)
if ((NA %in% modVar) || !all(modVar %in% 1:3))
stop("The moderator name is not in the list of independent factors, or its ",
"index is not 1, 2, or 3.") # JG: Changed error
## TDJ: If multigroup, check group %in% group.label
nG <- lavInspect(fit, "ngroups")
if (nG > 1L) {
group.label <- lavInspect(fit, "group.label")
## assign numeric to character
if (is.numeric(group)) {
if (group %in% 1:nG) {
group <- group.label[group]
} else group <- as.character(group)
} else group <- as.character(group)
## check that character is a group
if (!as.character(group) %in% group.label)
stop('"group" must be a character string naming a group of interest, or ',
'an integer corresponding to a group in lavInspect(fit, "group.label")')
group.number <- which(group.label == group)
} else group.number <- 1L
## Get the parameter estimates for that group
if (nG > 1L) {
if (inherits(fit, "lavaan")) {
est <- lavInspect(fit, "est")[[group]]
} else if (inherits(fit, "lavaan.mi")) {
est <- list()
GLIST <- fit@coefList[useImps]
est$beta <- Reduce("+", lapply(GLIST, function(i) i[[group]]$beta)) / m
est$alpha <- Reduce("+", lapply(GLIST, function(i) i[[group]]$alpha)) / m
est$psi <- Reduce("+", lapply(GLIST, function(i) i[[group]]$psi)) / m
}
} else {
## single-group model
if (inherits(fit, "lavaan")) {
est <- lavInspect(fit, "est")
} else if (inherits(fit, "lavaan.mi")) {
est <- list()
est$beta <- Reduce("+", lapply(fit@coefList[useImps], "[[", i = "beta")) / m
est$alpha <- Reduce("+", lapply(fit@coefList[useImps], "[[", i = "alpha")) / m
est$psi <- Reduce("+", lapply(fit@coefList[useImps], "[[", i = "psi")) / m
}
}
## Find the mean and covariance matrix of predictors
varX <- est$psi[ nameX, nameX]
meanX <- est$alpha[nameX, , drop = FALSE]
# Find the intercept, regression coefficients, and residual variance of residual-centered regression
intceptRC <- est$alpha[nameY, , drop = FALSE]
resVarRC <- est$psi[ nameY, nameY]
if (resVarRC < 0) warning("The residual variance is negative. The model did not converge on a proper solution.")
betaRC <- t(est$beta[nameY, nameX, drop = FALSE])
# Find the number of observations
numobs <- lavInspect(fit, "nobs")[group.number]
# Compute SSRC
meanXwith1 <- rbind(1, meanX)
varXwith0 <- cbind(0, rbind(0, varX))
SSRC <- numobs * (varXwith0 + (meanXwith1 %*% t(meanXwith1)))
# Compute Mean(Y) and Var(Y)
betaRCWithIntcept <- rbind(intceptRC, betaRC)
meanY <- t(meanXwith1) %*% betaRCWithIntcept
varY <- (t(betaRCWithIntcept) %*% SSRC %*% betaRCWithIntcept)/numobs - meanY^2 + resVarRC
# Compute Cov(Y, X)
covY <- as.matrix((varX %*% betaRC)[1:3,])
# Compute E(XZ), E(XW), E(ZW), E(XZW)
meanX[4] <- expect2NormProd(meanX[c(1,2)], varX[c(1,2), c(1,2)])
meanX[5] <- expect2NormProd(meanX[c(1,3)], varX[c(1,3), c(1,3)])
meanX[6] <- expect2NormProd(meanX[c(2,3)], varX[c(2,3), c(2,3)])
meanX[7] <- expect3NormProd(meanX[1:3], varX[1:3, 1:3])
# Compute Var(XZ), Var(XW), Var(ZW), Var(XZW)
varX[4, 4] <- var2NormProd(meanX[c(1,2)], varX[c(1,2), c(1,2)])
varX[5, 5] <- var2NormProd(meanX[c(1,3)], varX[c(1,3), c(1,3)])
varX[6, 6] <- var2NormProd(meanX[c(2,3)], varX[c(2,3), c(2,3)])
varX[7, 7] <- var3NormProd(meanX[1:3], varX[1:3, 1:3])
# Compute All covariances
varX[4, 1] <- varX[1, 4] <- expect3NormProd(meanX[c(1, 2, 1)], varX[c(1, 2, 1),c(1, 2, 1)]) - expect2NormProd(meanX[c(1,2)], varX[c(1,2),c(1,2)]) * meanX[1]
varX[5, 1] <- varX[1, 5] <- expect3NormProd(meanX[c(1, 3, 1)], varX[c(1, 3, 1),c(1, 3, 1)]) - expect2NormProd(meanX[c(1,3)], varX[c(1,3),c(1,3)]) * meanX[1]
varX[6, 1] <- varX[1, 6] <- expect3NormProd(meanX[c(2, 3, 1)], varX[c(2, 3, 1),c(2, 3, 1)]) - expect2NormProd(meanX[c(2,3)], varX[c(2,3),c(2,3)]) * meanX[1]
varX[7, 1] <- varX[1, 7] <- expect4NormProd(meanX[c(1,2,3,1)], varX[c(1,2,3,1),c(1,2,3,1)]) - expect3NormProd(meanX[c(1,2,3)], varX[c(1,2,3),c(1,2,3)]) * meanX[1]
varX[4, 2] <- varX[2, 4] <- expect3NormProd(meanX[c(1, 2, 2)], varX[c(1, 2, 2),c(1, 2, 2)]) - expect2NormProd(meanX[c(1,2)], varX[c(1,2),c(1,2)]) * meanX[2]
varX[5, 2] <- varX[2, 5] <- expect3NormProd(meanX[c(1, 3, 2)], varX[c(1, 3, 2),c(1, 3, 2)]) - expect2NormProd(meanX[c(1,3)], varX[c(1,3),c(1,3)]) * meanX[2]
varX[6, 2] <- varX[2, 6] <- expect3NormProd(meanX[c(2, 3, 2)], varX[c(2, 3, 2),c(2, 3, 2)]) - expect2NormProd(meanX[c(2,3)], varX[c(2,3),c(2,3)]) * meanX[2]
varX[7, 2] <- varX[2, 7] <- expect4NormProd(meanX[c(1,2,3,2)], varX[c(1,2,3,2),c(1,2,3,2)]) - expect3NormProd(meanX[c(1,2,3)], varX[c(1,2,3),c(1,2,3)]) * meanX[2]
varX[4, 3] <- varX[3, 4] <- expect3NormProd(meanX[c(1, 2, 3)], varX[c(1, 2, 3),c(1, 2, 3)]) - expect2NormProd(meanX[c(1,2)], varX[c(1,2),c(1,2)]) * meanX[3]
varX[5, 3] <- varX[3, 5] <- expect3NormProd(meanX[c(1, 3, 3)], varX[c(1, 3, 3),c(1, 3, 3)]) - expect2NormProd(meanX[c(1,3)], varX[c(1,3),c(1,3)]) * meanX[3]
varX[6, 3] <- varX[3, 6] <- expect3NormProd(meanX[c(2, 3, 3)], varX[c(2, 3, 3),c(2, 3, 3)]) - expect2NormProd(meanX[c(2,3)], varX[c(2,3),c(2,3)]) * meanX[3]
varX[7, 3] <- varX[3, 7] <- expect4NormProd(meanX[c(1,2,3,3)], varX[c(1,2,3,3),c(1,2,3,3)]) - expect3NormProd(meanX[c(1,2,3)], varX[c(1,2,3),c(1,2,3)]) * meanX[3]
varX[5, 4] <- varX[4, 5] <- expect4NormProd(meanX[c(1,3,1,2)], varX[c(1,3,1,2),c(1,3,1,2)]) - expect2NormProd(meanX[c(1,3)], varX[c(1,3),c(1,3)]) * expect2NormProd(meanX[c(1,2)], varX[c(1,2),c(1,2)])
varX[6, 4] <- varX[4, 6] <- expect4NormProd(meanX[c(2,3,1,2)], varX[c(2,3,1,2),c(2,3,1,2)]) - expect2NormProd(meanX[c(2,3)], varX[c(2,3),c(2,3)]) * expect2NormProd(meanX[c(1,2)], varX[c(1,2),c(1,2)])
varX[7, 4] <- varX[4, 7] <- expect5NormProd(meanX[c(1,2,3,1,2)], varX[c(1,2,3,1,2),c(1,2,3,1,2)]) - expect3NormProd(meanX[c(1, 2, 3)], varX[c(1, 2, 3),c(1, 2, 3)]) * expect2NormProd(meanX[c(1,2)], varX[c(1,2),c(1,2)])
varX[6, 5] <- varX[5, 6] <- expect4NormProd(meanX[c(2,3,1,3)], varX[c(2,3,1,3),c(2,3,1,3)]) - expect2NormProd(meanX[c(2,3)], varX[c(2,3),c(2,3)]) * expect2NormProd(meanX[c(1,3)], varX[c(1,3),c(1,3)])
varX[7, 5] <- varX[5, 7] <- expect5NormProd(meanX[c(1,2,3,1,3)], varX[c(1,2,3,1,3),c(1,2,3,1,3)]) - expect3NormProd(meanX[c(1, 2, 3)], varX[c(1, 2, 3),c(1, 2, 3)]) * expect2NormProd(meanX[c(1,3)], varX[c(1,3),c(1,3)])
varX[7, 6] <- varX[6, 7] <- expect5NormProd(meanX[c(1,2,3,2,3)], varX[c(1,2,3,2,3),c(1,2,3,2,3)]) - expect3NormProd(meanX[c(1, 2, 3)], varX[c(1, 2, 3),c(1, 2, 3)]) * expect2NormProd(meanX[c(2,3)], varX[c(2,3),c(2,3)])
# Find the meanX and varX without XZW
meanXReducedWith1 <- rbind(1, as.matrix(meanX[1:6]))
varXReducedWith0 <- cbind(0, rbind(0, varX[1:6, 1:6]))
SSMCReduced <- numobs * (varXReducedWith0 + (meanXReducedWith1 %*% t(meanXReducedWith1)))
# Find product of main and two-way onto three-way
covXZWwith0 <- rbind(0, as.matrix(varX[7, 1:6]))
meanXZWwith1 <- meanX[7] * meanXReducedWith1
SSXZW <- numobs * (covXZWwith0 + meanXZWwith1) # should the mean vector be squared (postmultiplied by its transpose)?
# Compute a vector and b4, b5, b6
a <- solve(SSMCReduced) %*% as.matrix(SSXZW)
betaTemp <- betaRC[4:6] - (as.numeric(betaRC[7]) * a[5:7])
betaTemp <- c(betaTemp, betaRC[7])
# Compute Cov(Y, XZ) and regression coefficients of no-centering
betaNC <- solve(varX[1:3,1:3], as.matrix(covY) - (t(varX[4:7, 1:3]) %*% as.matrix(betaTemp)))
betaNC <- rbind(as.matrix(betaNC), as.matrix(betaTemp))
covY <- rbind(covY, as.matrix((varX %*% betaNC)[4:7, 1]))
# Aggregate the non-centering sufficient statistics (shown for illustration; not strictly necessary)
fullCov <- rbind(cbind(varX, covY), c(covY, varY))
fullMean <- rbind(meanX, meanY)
# Compute the intercept of no-centering
intceptNC <- meanY - t(betaNC) %*% meanX
# Compute SSNC
betaNCWithIntcept <- rbind(intceptNC, betaNC)
meanXwith1 <- rbind(1, meanX) #JG: redundant
varXwith0 <- rbind(0, cbind(0, varX)) #JG: redundant
SSNC <- numobs * (varXwith0 + (meanXwith1 %*% t(meanXwith1)))
# Compute residual variance on non-centering
resVarNC <- varY - (t(betaNCWithIntcept) %*% SSNC %*% betaNCWithIntcept)/numobs + meanY^2
## Extract all sampling (co)variances
if (inherits(fit, "lavaan")) {
varEst <- lavaan::vcov(fit)
} else if (inherits(fit, "lavaan.mi")) {
varEst <- getMethod(f = "vcov", signature = "lavaan.mi",
where = getNamespace("lavaan.mi"))(fit, omit.imps = omit.imps)
}
## for indexing sampling (co)variances
PT <- parTable(fit)
int.idx <- PT$label[PT$lhs == nameY & PT$op == "~1" & PT$group == group.number]
if (int.idx == "") {
## no custom label, use default
int.idx <- paste0(nameY, "~1")
if (nG > 1L && group.number > 1L) {
int.idx <- paste0(int.idx, ".g", group.number)
}
}
## check intercept is actually estimated (thus, has sampling variance)
estimateIntcept <- int.idx %in% rownames(varEst)
## if intercept is not estimated, add a row/column with zero sampling variance
if (!estimateIntcept) {
dn <- c(rownames(varEst), int.idx)
varEst <- rbind(cbind(varEst, 0), 0)
dimnames(varEst) <- list(dn, dn)
}
## index slopes in varEst
slope.idx <- which(PT$lhs == nameY & PT$op == "~" & PT$rhs %in% nameX & PT$group == group.number)
## check for user-defined labels; otherwise, set defaults
slope.label <- PT$label[slope.idx]
if (any(slope.label == "")) for (i in which(slope.label == "")) {
slope.label[i] <- paste0(nameY, "~", nameX[i],
ifelse(nG > 1L && group.number > 1L, no = "",
yes = paste0(".g", group.number)))
}
## save all indices in a vector
targetcol <- c(int.idx, slope.label)
varEstSlopeRC <- varEst[targetcol, targetcol]
## collect relevant estimates and variances to probe the interaction
usedVar <- as.numeric(resVarNC/resVarRC) * (varEstSlopeRC %*% SSRC %*% solve(SSNC))
usedBeta <- betaNCWithIntcept
## In case a moderator is listed first, find the order to rearrange
ord <- c(setdiff(1:3, modVar), modVar)
ord <- c(ord, 7 - rev(ord))
usedBeta <- usedBeta[c(1, ord+1, 8)]
usedVar <- usedVar[ c(1, ord+1, 8), c(1, ord+1, 8)]
## set all combinations of moderator values
val <- expand.grid(valProbe1, valProbe2)
## function to calculate p values for Wald z tests
pvalue <- function(x) (1 - pnorm(abs(x))) * 2
## Calculate simple intercepts
simpleIntcept <- usedBeta[1] + usedBeta[3] * val[,1] + usedBeta[4] * val[,2] + usedBeta[7] * val[,1] * val[,2]
varIntcept <- usedVar[1, 1] + val[,1]^2 * usedVar[3, 3] + val[,2]^2 * usedVar[4, 4] + val[,1]^2 * val[,2]^2 * usedVar[7, 7] + 2 * val[,1] * usedVar[1, 3] + 2 * val[,2] * usedVar[1, 4] + 2 * val[,1] * val[,2] * usedVar[1, 7] + 2 * val[,1] * val[,2] * usedVar[3, 4] + 2 * val[,1]^2 * val[,2] * usedVar[3, 7] + 2* val[,1] * val[,2]^2 * usedVar[4, 7]
zIntcept <- simpleIntcept / sqrt(varIntcept)
pIntcept <- pvalue(zIntcept)
resultIntcept <- cbind(val, simpleIntcept, sqrt(varIntcept), zIntcept, pIntcept)
colnames(resultIntcept) <- c(nameX[modVar], "est", "se", "z", "pvalue")
class(resultIntcept) <- c("lavaan.data.frame","data.frame")
## Calculate simple slopes
simpleSlope <- usedBeta[2] + usedBeta[5] * val[,1] + usedBeta[6] * val[,2] + usedBeta[8] * val[,1] * val[,2]
varSlope <- usedVar[2, 2] + val[,1]^2 * usedVar[5, 5] + val[,2]^2 * usedVar[6, 6] + val[,1]^2 * val[,2]^2 * usedVar[8, 8] + 2 * val[,1] * usedVar[2, 5] + 2 * val[,2] * usedVar[2, 6] + 2 * val[,1] * val[,2] * usedVar[2, 8] + 2 * val[,1] * val[,2] * usedVar[5, 6] + 2 * val[,1]^2 * val[,2] * usedVar[5, 8] + 2 * val[,1] * val[,2]^2 * usedVar[6, 8]
zSlope <- simpleSlope / sqrt(varSlope)
pSlope <- pvalue(zSlope)
resultSlope <- cbind(val, simpleSlope, sqrt(varSlope), zSlope, pSlope)
colnames(resultSlope) <- c(nameX[modVar], "est", "se", "z", "pvalue")
class(resultSlope) <- c("lavaan.data.frame","data.frame")
list(SimpleIntcept = resultIntcept, SimpleSlope = resultSlope)
}
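## Hedged usage sketch (not run), reusing the fitted example model from the
## @examples above: the returned list can be passed straight to plotProbe().
## "out3RC" is just an illustrative object name.
# out3RC <- probe3WayRC(fitRC3way,
#                       nameX = c("f1","f2","f3","f12","f13","f23","f123"),
#                       nameY = "f4", modVar = c("f1", "f2"),
#                       valProbe1 = c(-1, 0, 1), valProbe2 = c(-1, 0, 1))
# plotProbe(out3RC, xlim = c(-2, 2))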
## -----------------
## Plotting Function
## -----------------
##' Plot a latent interaction
##'
##' This function plots line graphs representing the simple effect of
##' the independent variable at given values of the moderator. For multigroup
##' models, it will only generate a plot for one group, as specified in the
##' function used to obtain the first argument.
##'
##' @note
##' If the `object` does not contain simple intercepts (i.e., if the
##' `object$SimpleIntcept` element is `NULL`), then all simple
##' intercepts are arbitrarily set to zero in order to plot the simple slopes.
##' This may not be consistent with the fitted model, but was (up until version
##' 0.5-7) the default behavior when the y-intercept was fixed to 0. In this case,
##' although the relative steepness of simple slopes can still meaningfully be
##' compared, the relative vertical positions of lines at any point along the
##' *x*-axis should not be interpreted.
##'
##' @param object A `list`, typically the result of probing a latent 2-way
##' or 3-way interaction obtained from the [probe2WayMC()],
##' [probe2WayRC()], [probe3WayMC()], or
##' [probe3WayRC()] functions.
##' @param xlim The vector of two numbers: the minimum and maximum values of the
##' independent variable
##' @param xlab The label of the x-axis
##' @param ylab The label of the y-axis
##' @param legend `logical`. If `TRUE` (default), a legend is printed.
##' @param legendArgs `list` of arguments passed to [legend()]
##' function if `legend=TRUE`.
##' @param \dots Any additional argument for the [plot()] function
##'
##' @return None. This function will plot the simple main effect only.
##'
##' @author
##' Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @seealso \itemize{
##' \item [indProd()] For creating the indicator products with no
##' centering, mean centering, double-mean centering, or residual centering.
##' \item [probe2WayMC()] For probing the two-way latent interaction
##' when the results are obtained from the mean-centering or double-mean-centering
##' approach.
##' \item [probe3WayMC()] For probing the three-way latent interaction
##' when the results are obtained from the mean-centering or double-mean-centering
##' approach.
##' \item [probe2WayRC()] For probing the two-way latent interaction
##' when the results are obtained from the residual-centering approach.
##' \item [probe3WayRC()] For probing the three-way latent interaction
##' when the results are obtained from the residual-centering approach.
##' }
##'
##' @references
##'
##' Schoemann, A. M., & Jorgensen, T. D. (2021). Testing and interpreting
##' latent variable interactions using the `semTools` package.
##' *Psych, 3*(3), 322--335. \doi{10.3390/psych3030024}
##'
##' @examples
##'
##' library(lavaan)
##'
##' dat2wayMC <- indProd(dat2way, 1:3, 4:6)
##'
##' model1 <- "
##' f1 =~ x1 + x2 + x3
##' f2 =~ x4 + x5 + x6
##' f12 =~ x1.x4 + x2.x5 + x3.x6
##' f3 =~ x7 + x8 + x9
##' f3 ~ f1 + f2 + f12
##' f12 ~~ 0*f1
##' f12 ~~ 0*f2
##' x1 ~ 0*1
##' x4 ~ 0*1
##' x1.x4 ~ 0*1
##' x7 ~ 0*1
##' f1 ~ NA*1
##' f2 ~ NA*1
##' f12 ~ NA*1
##' f3 ~ NA*1
##' "
##'
##' fitMC2way <- sem(model1, data = dat2wayMC, meanstructure = TRUE)
##' result2wayMC <- probe2WayMC(fitMC2way, nameX = c("f1", "f2", "f12"),
##' nameY = "f3", modVar = "f2", valProbe = c(-1, 0, 1))
##' plotProbe(result2wayMC, xlim = c(-2, 2))
##'
##'
##' dat3wayMC <- indProd(dat3way, 1:3, 4:6, 7:9)
##'
##' model3 <- "
##' f1 =~ x1 + x2 + x3
##' f2 =~ x4 + x5 + x6
##' f3 =~ x7 + x8 + x9
##' f12 =~ x1.x4 + x2.x5 + x3.x6
##' f13 =~ x1.x7 + x2.x8 + x3.x9
##' f23 =~ x4.x7 + x5.x8 + x6.x9
##' f123 =~ x1.x4.x7 + x2.x5.x8 + x3.x6.x9
##' f4 =~ x10 + x11 + x12
##' f4 ~ f1 + f2 + f3 + f12 + f13 + f23 + f123
##' f1 ~~ 0*f12
##' f1 ~~ 0*f13
##' f1 ~~ 0*f123
##' f2 ~~ 0*f12
##' f2 ~~ 0*f23
##' f2 ~~ 0*f123
##' f3 ~~ 0*f13
##' f3 ~~ 0*f23
##' f3 ~~ 0*f123
##' f12 ~~ 0*f123
##' f13 ~~ 0*f123
##' f23 ~~ 0*f123
##' x1 ~ 0*1
##' x4 ~ 0*1
##' x7 ~ 0*1
##' x10 ~ 0*1
##' x1.x4 ~ 0*1
##' x1.x7 ~ 0*1
##' x4.x7 ~ 0*1
##' x1.x4.x7 ~ 0*1
##' f1 ~ NA*1
##' f2 ~ NA*1
##' f3 ~ NA*1
##' f12 ~ NA*1
##' f13 ~ NA*1
##' f23 ~ NA*1
##' f123 ~ NA*1
##' f4 ~ NA*1
##' "
##'
##' fitMC3way <- sem(model3, data = dat3wayMC, std.lv = FALSE,
##' meanstructure = TRUE)
##' result3wayMC <- probe3WayMC(fitMC3way, nameX = c("f1", "f2", "f3", "f12",
##' "f13", "f23", "f123"),
##' nameY = "f4", modVar = c("f1", "f2"),
##' valProbe1 = c(-1, 0, 1), valProbe2 = c(-1, 0, 1))
##' plotProbe(result3wayMC, xlim = c(-2, 2))
##'
##' @export
plotProbe <- function(object, xlim, xlab = "Independent Variable",
ylab = "Dependent Variable", legend = TRUE,
legendArgs = list(), ...) {
if (length(xlim) != 2) stop("The x-limit should be specified as a numeric",
" vector of length 2.")
# Extract simple slope
slope <- object$SimpleSlope
# Check whether the object is the two-way or three-way interaction result
numInt <- 2
if (ncol(slope) == 6) numInt <- 3
estSlope <- slope[, ncol(slope) - 3]
# Determine whether each simple slope is significant. Significant slopes
# are drawn in red; nonsignificant slopes in black.
estSlopeSig <- (slope[, ncol(slope)] < 0.05) + 1
# Extract simple intercepts. If simple intercepts are not provided, they
# will be fixed at 0.
estIntercept <- NULL
if (is.null(object$SimpleIntcept)) {
warning("Simple intercepts unavailable from object. All lines therefore ",
"arbitrarily intersect at the origin. Relative vertical positions ",
"of lines at any point on the x-axis are therefore arbitrary and ",
"should not be interpreted. Only the relative steepness of simple ",
"slopes can meaningfully be compared.")
} else estIntercept <- object$SimpleIntcept[, ncol(slope) - 3]
if (numInt == 2) {
if (is.null(legendArgs$title)) legendArgs$title <- colnames(slope)[1]
if (is.null(legendArgs$legend)) legendArgs$legend <- slope[, 1]
plotSingleProbe(estSlope, estIntercept, xlim = xlim, xlab = xlab, ylab = ylab,
colLine = estSlopeSig, legend = legend,
legendArgs = legendArgs, ...)
} else if (numInt == 3) {
# Three-way interaction; separate lines for the first moderator, separate graphs
# for the second moderator
mod2 <- unique(slope[, 2])
mod1 <- unique(slope[, 1])
# Use multiple graphs in a figure
if (length(mod2) == 2) {
obj <- par(mfrow = c(1, 2))
} else if (length(mod2) == 3) {
obj <- par(mfrow = c(1, 3))
} else if (length(mod2) > 3) {
obj <- par(mfrow = c(2, ceiling(length(mod2)/2)))
} else if (length(mod2) == 1) {
# Intentionally left blank: a single panel needs no layout change
} else stop("No values of the second moderator were found to plot.")
for (i in 1:length(mod2)) {
select <- slope[, 2] == mod2[i]
if (is.null(legendArgs$title)) legendArgs$title <- colnames(slope)[1]
if (is.null(legendArgs$legend)) legendArgs$legend <- mod1
plotSingleProbe(estSlope[select], estIntercept[select], xlim = xlim,
xlab = xlab, ylab = ylab, colLine = estSlopeSig[select],
main = paste(colnames(slope)[2], "=", mod2[i]),
legend = legend, legendArgs = legendArgs, ...)
}
if (length(mod2) > 1)
par(obj)
} else {
stop("Please make sure that the object argument is obtained from",
" 'probe2wayMC', 'probe2wayRC', 'probe3wayMC', or 'probe3wayRC'.")
}
invisible(NULL)
}
## ----------------
## Hidden Functions
## ----------------
## Find the expected value of the product of two normal variates
## m = the mean of each normal variate
## s = the covariance matrix of all variates
expect2NormProd <- function(m, s) return(prod(m) + s[1, 2])
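## A minimal Monte Carlo sanity check (not run), assuming the MASS package is
## available: E(X1 * X2) = m1*m2 + s12, so the simulated mean should approach
## prod(m) + s[1, 2] = 2.3 with these illustrative values.
# m <- c(1, 2); s <- matrix(c(1, .3, .3, 1), nrow = 2)
# X <- MASS::mvrnorm(1e6, mu = m, Sigma = s)
# mean(X[, 1] * X[, 2])  # approximately 2.3
# expect2NormProd(m, s)  # exactly 2.3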
## Find the expected value of the product of three normal variates
## m = the mean of each normal variate
## s = the covariance matrix of all variates
expect3NormProd <- function(m, s) {
return(prod(m) + m[3] * s[1, 2] + m[2] * s[1, 3] + m[1] * s[2, 3])
}
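## Hedged check (not run): odd-order product moments of zero-mean normal
## variates vanish, so zero means should return 0 for any covariance matrix.
# expect3NormProd(rep(0, 3), diag(3))  # 0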
## Find the expected value of the product of four normal variates
## m = the mean of each normal variate
## s = the covariance matrix of all variates
expect4NormProd <- function(m, s) {
first <- prod(m)
com <- utils::combn(1:4, 2)
forSecond <- function(draw, meanval, covval, index) {
draw2 <- setdiff(index, draw)
prod(meanval[draw2]) * covval[draw[1], draw[2]]
}
second <- sum(apply(com, 2, forSecond, meanval=m, covval=s, index=1:4))
com2 <- com[,1:3] # keep only the first three pairs (those containing element 1)
forThird <- function(draw, covval, index) {
draw2 <- setdiff(index, draw)
covval[draw[1], draw[2]] * covval[draw2[1], draw2[2]]
}
third <- sum(apply(com2, 2, forThird, covval=s, index=1:4))
return(first + second + third)
}
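## With zero means, only the pairwise-covariance ("third") term survives,
## recovering Isserlis' theorem: E(X1 X2 X3 X4) = s12*s34 + s13*s24 + s14*s23.
## Hedged check (not run) with all off-diagonal covariances set to .2:
# s4 <- matrix(.2, 4, 4); diag(s4) <- 1
# expect4NormProd(rep(0, 4), s4)  # 3 * .2^2 = 0.12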
## Find the expected value of the product of five normal variates
## m = the mean of each normal variate
## s = the covariance matrix of all variates
expect5NormProd <- function(m, s) {
first <- prod(m)
com <- utils::combn(1:5, 2)
forSecond <- function(draw, meanval, covval, index) {
draw2 <- setdiff(index, draw)
prod(meanval[draw2]) * covval[draw[1], draw[2]]
}
second <- sum(apply(com, 2, forSecond, meanval=m, covval=s, index=1:5))
com2 <- utils::combn(1:5, 4)
forThirdOuter <- function(index, m, s, indexall) {
targetMean <- m[setdiff(indexall, index)]
cominner <- utils::combn(index, 2)[,1:3] # keep only the first three pairs (those containing the first element of index)
forThirdInner <- function(draw, covval, index) {
draw2 <- setdiff(index, draw)
covval[draw[1], draw[2]] * covval[draw2[1], draw2[2]]
}
thirdInner <- targetMean * sum(apply(cominner, 2, forThirdInner, covval=s, index=index))
return(thirdInner)
}
third <- sum(apply(com2, 2, forThirdOuter, m=m, s=s, indexall=1:5))
return(first + second + third)
}
## Find the variance of the product of two normal variates
## m = the mean of each normal variate
## s = the covariance matrix of all variates
var2NormProd <- function(m, s) {
first <- m[2]^2 * s[1, 1] + m[1]^2 * s[2, 2]
second <- 2 * m[1] * m[2] * s[1, 2]
third <- s[1, 1] * s[2, 2]
fourth <- s[1, 2]^2
return(first + second + third + fourth)
}
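## Hedged Monte Carlo check (not run), reusing the illustrative m, s, and X
## sketched under expect2NormProd() above: Var(X1 * X2) = m2^2*s11 + m1^2*s22
## + 2*m1*m2*s12 + s11*s22 + s12^2 = 4 + 1 + 1.2 + 1 + 0.09 = 7.29.
# var(X[, 1] * X[, 2])  # approximately 7.29
# var2NormProd(m, s)    # exactly 7.29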
## Find the variance of the product of three normal variates
## m = the mean of each normal variate
## s = the covariance matrix of all variates
var3NormProd <- function(m, s) {
com <- utils::combn(1:3, 2)
forFirst <- function(draw, meanval, covval, index) {
# draw = 2, 3; draw2 = 1
draw2 <- setdiff(index, draw)
term1 <- meanval[draw[1]]^2 * meanval[draw[2]]^2 * covval[draw2, draw2]
term2 <- 2 * meanval[draw2]^2 * meanval[draw[1]] * meanval[draw[2]] * covval[draw[1], draw[2]]
term3 <- (meanval[draw2]^2 * covval[draw[1], draw[1]] * covval[draw[2], draw[2]]) + (meanval[draw2]^2 * covval[draw[1], draw[2]]^2)
term4 <- 4 * meanval[draw[1]] * meanval[draw[2]] * covval[draw2, draw2] * covval[draw[1], draw[2]]
term5 <- 6 * meanval[draw[1]] * meanval[draw[2]] * covval[draw2, draw[1]] * covval[draw2, draw[2]]
term1 + term2 + term3 + term4 + term5
}
first <- sum(apply(com, 2, forFirst, meanval=m, covval=s, index=1:3))
second <- prod(diag(s))
third <- 2 * s[3, 3] * s[1, 2]^2 + 2 * s[2, 2] * s[1, 3]^2 + 2 * s[1, 1] * s[2, 3]^2
fourth <- 8 * s[1, 2] * s[1, 3] * s[2, 3]
return(first + second + third + fourth)
}
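## Hedged check (not run): for three independent standard normal variates,
## only prod(diag(s)) survives, so Var(X1 X2 X3) = 1.
# var3NormProd(rep(0, 3), diag(3))  # 1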
## plotSingleProbe : plot the probing interaction result specific for only one moderator
## estSlope = slope of each line
## estIntercept = intercept of each line
## xlim = the minimum and maximum values of the independent variable (x-axis)
## xlab = the label for the independent variable
## ylab = the label for the dependent variable
## main = the title of the graph
## colLine = the color of each line
## legend = whether to print a legend
## legendArgs = arguments to pass to legend() function
plotSingleProbe <- function(estSlope, estIntercept = NULL, xlim,
xlab = "Indepedent Variable",
ylab = "Dependent Variable", main = NULL,
colLine = "black", legend = TRUE,
legendArgs = list(), ...) {
if (is.null(estIntercept)) estIntercept <- rep(0, length(estSlope))
if (length(colLine) == 1) colLine <- rep(colLine, length(estSlope))
lower <- estIntercept + (xlim[1] * estSlope)
upper <- estIntercept + (xlim[2] * estSlope)
ylim <- c(min(c(lower, upper)), max(c(lower, upper)))
plot(cbind(xlim, ylim), xlim = xlim, ylim = ylim, type = "n",
xlab = xlab, ylab = ylab, main = main, ...)
for (i in 1:length(estSlope)) {
lines(cbind(xlim, c(lower[i], upper[i])),
col = colLine[i], lwd = 1.5, lty = i)
}
if (legend) {
positionX <- 0.25
if (all(estSlope > 0)) positionX <- 0.01
if (all(estSlope < 0)) positionX <- 0.50
if (is.null(legendArgs$x)) legendArgs$x <- positionX * (xlim[2] - xlim[1]) + xlim[1]
if (is.null(legendArgs$y)) legendArgs$y <- 0.99 * (ylim[2] - ylim[1]) + ylim[1]
if (is.null(legendArgs$col)) legendArgs$col <- colLine
if (is.null(legendArgs$lty)) legendArgs$lty <- 1:length(estSlope)
do.call(graphics::legend, legendArgs)
}
invisible(NULL)
}
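## Hedged usage sketch (not run) with made-up values: two simple slopes at
## moderator values -1 and +1, drawn over x in [-2, 2].
# plotSingleProbe(estSlope = c(.2, .8), estIntercept = c(0, .5),
#                 xlim = c(-2, 2), colLine = c("black", "red"),
#                 legendArgs = list(title = "f2", legend = c("-1", "+1")))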
semTools/R/runMI-modification.R
### Terrence D. Jorgensen & Yves Rosseel
### Last updated: 12 March 2025
### adaptation of lavaan::modindices() for OLDlavaan.mi-class objects
### DEPRECATED: 16 June 2024
### supplanted by lavaan.mi package
##' Modification Indices for Multiple Imputations
##'
##' Modification indices (1-*df* Lagrange multiplier tests) from a
##' latent variable model fitted to multiply imputed data sets. Statistics
##' for releasing one or more fixed or constrained parameters in the model can
##' be calculated by pooling the gradient and information matrices
##' across imputed data sets in a method proposed by Mansolf, Jorgensen, &
##' Enders (2020)---analogous to the "D1" Wald test proposed by Li, Meng,
##' Raghunathan, & Rubin (1991)---or by pooling the complete-data score-test
##' statistics across imputed data sets (i.e., "D2"; Li et al., 1991).
##'
## @aliases modificationIndices.mi modificationindices.mi modindices.mi
##' @importFrom lavaan lavInspect lavListInspect lavNames
##' @importFrom methods getMethod
##' @importFrom stats cov pchisq qchisq
##'
##' @param object An object of class [OLDlavaan.mi-class]
##' @param test `character` indicating which pooling method to use.
##' `"D1"` requests Mansolf, Jorgensen, & Enders' (2020) proposed
##' Wald-like test for pooling the gradient and information, which are then
##' used to calculate score-test statistics in the usual manner. `"D2"`
##' (default because it is less computationally intensive) requests that the
##' complete-data score-test statistics be calculated from each imputed data
##' set and then pooled across imputations, as described by Li et al. (1991)
##' and Enders (2010).
##' @param omit.imps `character` vector specifying criteria for omitting
##' imputations from pooled results. Can include any of
##' `c("no.conv", "no.se", "no.npd")`, the first 2 of which are the
##' default setting, which excludes any imputations that did not
##' converge or for which standard errors could not be computed. The
##' last option (`"no.npd"`) would exclude any imputations which
##' yielded a nonpositive definite covariance matrix for observed or
##' latent variables, which would include any "improper solutions" such
##' as Heywood cases. Specific imputation numbers can also be included in this
##' argument, in case users want to apply their own custom omission criteria
##' (or simulations can use different numbers of imputations without
##' redundantly refitting the model).
##' @param standardized `logical`. If `TRUE`, two extra columns
##' (`$sepc.lv` and `$sepc.all`) will contain standardized values
##' for the EPCs. In the first column (`$sepc.lv`), standardization is
##' based on the variances of the (continuous) latent variables. In the second
##' column (`$sepc.all`), standardization is based on the variances
##' of both (continuous) observed and latent variables. (Residual) covariances
##' are standardized using (residual) variances.
##' @param cov.std `logical`. `TRUE` if `test == "D2"`.
##' If `TRUE` (default), the (residual)
##' observed covariances are scaled by the square-root of the diagonal elements
##' of the \eqn{\Theta} matrix, and the (residual) latent covariances are
##' scaled by the square-root of the diagonal elements of the \eqn{\Psi}
##' matrix. If `FALSE`, the (residual) observed covariances are scaled by
##' the square-root of the diagonal elements of the model-implied covariance
##' matrix of observed variables (\eqn{\Sigma}), and the (residual) latent
##' covariances are scaled by the square-root of the diagonal elements of the
##' model-implied covariance matrix of the latent variables.
##' @param information `character` indicating the type of information
##' matrix to use (check [lavaan::lavInspect()] for available options).
##' `"expected"` information is the default, which provides better
##' control of Type I errors.
##' @param power `logical`. If `TRUE`, the (post-hoc) power is
##' computed for each modification index, using the values of `delta`
##' and `alpha`.
##' @param delta The value of the effect size, as used in the post-hoc power
##' computation, currently using the unstandardized metric of the `$epc`
##' column.
##' @param alpha The significance level used for deciding if the modification
##' index is statistically significant or not.
##' @param high.power If the computed power is higher than this cutoff value,
##' the power is considered 'high'. If not, the power is considered 'low'.
##' This affects the values in the `$decision` column in the output.
##' @param sort. `logical`. If `TRUE`, sort the output by the
##' modification index values. Higher values appear first.
##' @param minimum.value `numeric`. Filter output and only show rows with a
##' modification index value equal or higher than this minimum value.
##' @param maximum.number `integer`. Filter output and only show the first
##' `maximum.number` rows. Most useful when combined with the `sort.` option.
##' @param na.remove `logical`. If `TRUE` (default), filter output by
##' removing all rows with `NA` values for the modification indices.
##' @param op `character` string. Filter the output by selecting only those
##' rows with operator `op`.
##'
##' @note When `test = "D2"`, each (S)EPC will be pooled by taking its
##' average across imputations. When `test = "D1"`, EPCs will be
##' calculated in the standard way using the pooled gradient and information,
##' and SEPCs will be calculated by standardizing the EPCs using model-implied
##' (residual) variances.
##'
##' @return A `data.frame` containing modification indices and (S)EPCs.
##'
##' @author
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' Adapted from \pkg{lavaan} source code, written by
##' Yves Rosseel (Ghent University; \email{Yves.Rosseel@@UGent.be})
##'
##' `test = "D1"` method proposed by
##' Maxwell Mansolf (University of California, Los Angeles;
##' \email{mamansolf@@gmail.com})
##'
##' @references
##' Enders, C. K. (2010). *Applied missing data analysis*.
##' New York, NY: Guilford.
##'
##' Li, K.-H., Meng, X.-L., Raghunathan, T. E., & Rubin, D. B. (1991).
##' Significance levels from repeated *p*-values with multiply-imputed
##' data. *Statistica Sinica, 1*(1), 65--92.
##'
##' Mansolf, M., Jorgensen, T. D., & Enders, C. K. (2020). A multiple
##' imputation score test for model modification in structural equation
##' models. *Psychological Methods, 25*(4), 393--411.
##' \doi{10.1037/met0000243}
##'
##' @examples
##'
##' ## See the new lavaan.mi package
##'
##' @name modindices.mi-deprecated
##' @usage
##' modindices.mi(object, test = c("D2","D1"), omit.imps = c("no.conv","no.se"),
##' standardized = TRUE, cov.std = TRUE, information = "expected",
##' power = FALSE, delta = 0.1, alpha = 0.05, high.power = 0.75,
##' sort. = FALSE, minimum.value = 0, maximum.number = nrow(LIST),
##' na.remove = TRUE, op = NULL)
##' @seealso [semTools-deprecated()]
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##' @export
modindices.mi <- function(object,
test = c("D2","D1"),
omit.imps = c("no.conv","no.se"),
standardized = TRUE,
cov.std = TRUE,
information = "expected",
# power statistics?
power = FALSE,
delta = 0.1,
alpha = 0.05,
high.power = 0.75,
# customize output
sort. = FALSE,
minimum.value = 0.0,
maximum.number = nrow(LIST),
na.remove = TRUE,
op = NULL) {
.Deprecated(msg = c("\nThe runMI() and related lavaan.mi functions have been",
" deprecated and will cease to be included in future ",
"versions of semTools.\n\nSupport is still provided for ",
"analyzing lavaan.mi-class objects (e.g., compRelSEM() ",
"can estimate reliability using pooled results), which ",
"can now be created using the lavaan.mi package.\n\nThe ",
"deprecated runMI() function now creates an object of ",
"class OLDlavaan.mi, which can be analyzed using the ",
"deprecated functions in semTools, like lavTestLRT.mi(),",
" that have been updated and improved in the lavaan.mi ",
"package.\n\nFind more details help('semTools-deprecated)"))
stopifnot(inherits(object, "OLDlavaan.mi"))
useImps <- rep(TRUE, length(object@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(object@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(object@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(object@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(object@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
## custom removal by imputation number
rm.imps <- omit.imps[ which(omit.imps %in% 1:length(useImps)) ]
if (length(rm.imps)) useImps[as.numeric(rm.imps)] <- FALSE
## whatever is left
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
test <- tolower(test[1])
N <- lavListInspect(object, "ntotal")
#FIXME: if (lavoptions$mimic == "EQS") N <- N - 1 # not in lavaan, why?
# not ready for estimator = "PML"
if (object@Options$estimator == "PML") {
stop("Modification indices not yet implemented for estimator PML.")
}
# sanity check
if (power) standardized <- TRUE
## use first available modification indices as template to store pooled results
ngroups <- lavListInspect(object, "ngroups")
nlevels <- lavListInspect(object, "nlevels")
myCols <- c("lhs","op","rhs")
if (ngroups > 1L) myCols <- c(myCols,"block","group")
if (nlevels > 1L) myCols <- c(myCols,"block","level")
myCols <- unique(myCols)
for (i in useImps) {
LIST <- object@miList[[i]][myCols]
nR <- try(nrow(LIST), silent = TRUE)
if (inherits(nR, "try-error") || is.null(nR)) {
if (i == max(useImps)) {
stop("No modification indices could be computed for any imputations.")
} else next
} else break
}
## D2 pooling method
if (test == "d2") {
chiList <- lapply(object@miList[useImps], "[[", i = "mi")
## imputations in columns, parameters in rows
pooledList <- apply(do.call(cbind, chiList), 1, function(x) {
calculate.D2(x, DF = 1, asymptotic = TRUE)
})
LIST$mi <- pooledList[1, ] # could be "F" or "chisq"
## diagnostics
LIST$riv <- pooledList["ariv", ]
LIST$fmi <- pooledList["fmi", ]
## also take average of epc & sepc.all
epcList <- lapply(object@miList[useImps], "[[", i = "epc")
LIST$epc <- rowMeans(do.call(cbind, epcList))
if (standardized) {
sepcList <- lapply(object@miList[useImps], "[[", i = "sepc.lv")
LIST$sepc.lv <- rowMeans(do.call(cbind, sepcList))
sepcList <- lapply(object@miList[useImps], "[[", i = "sepc.all")
LIST$sepc.all <- rowMeans(do.call(cbind, sepcList))
fixed.x <- lavListInspect(object, "options")$fixed.x && length(lavNames(object, "ov.x"))
if (fixed.x && "sepc.nox" %in% colnames(object@miList[useImps][[1]])) {
sepcList <- lapply(object@miList[useImps], "[[", i = "sepc.nox")
LIST$sepc.nox <- rowMeans(do.call(cbind, sepcList))
}
}
} else {
scoreOut <- lavTestScore.mi(object, add = cbind(LIST, user = 10L,
free = 1, start = 0),
test = "d1", omit.imps = omit.imps,
epc = TRUE, scale.W = FALSE, asymptotic = TRUE,
information = information)$uni
LIST$mi <- scoreOut$X2
LIST$riv <- scoreOut$riv
LIST$fmi <- scoreOut$fmi
LIST$epc <- scoreOut$epc #FIXME: use average across imputations?
# standardize?
if (standardized) {
## Need full parameter table for lavaan::standardizedSolution()
## Merge parameterEstimates() with modindices()
oldPE <- getMethod("summary","OLDlavaan.mi")(object, se = FALSE,
output = "data.frame",
omit.imps = omit.imps)
PE <- lavaan::lav_partable_merge(oldPE, cbind(LIST, est = 0),
remove.duplicated = TRUE, warn = FALSE)
## merge EPCs, using parameter labels (unavailable for estimates)
rownames(LIST) <- paste0(LIST$lhs, LIST$op, LIST$rhs, ".g", LIST$group) #FIXME: multilevel?
rownames(PE) <- paste0(PE$lhs, PE$op, PE$rhs, ".g", PE$group)
PE[rownames(LIST), "epc"] <- LIST$epc
## need "exo" column?
PT <- parTable(object)
if ("exo" %in% names(PT)) {
rownames(PT) <- paste0(PT$lhs, PT$op, PT$rhs, ".g", PT$group)
PE[rownames(PT), "exo"] <- PT$exo
} else PE$exo <- 0L
rownames(LIST) <- NULL
rownames(PE) <- NULL
EPC <- PE$epc
if (cov.std) {
# replace epc values for variances by est values
var.idx <- which(PE$op == "~~" & PE$lhs == PE$rhs)
EPC[ var.idx ] <- PE$est[ var.idx ]
}
# two problems:
# - EPC of variances can be negative, and that is perfectly legal
# - EPC (of variances) can be tiny (near-zero), and we should
# not divide by tiny variables
small.idx <- which(PE$op == "~~" &
PE$lhs == PE$rhs &
abs(EPC) < sqrt( .Machine$double.eps ) )
if (length(small.idx) > 0L) EPC[small.idx] <- as.numeric(NA)
# get the sign
EPC.sign <- sign(PE$epc)
## pooled estimates for standardizedSolution()
pooledest <- getMethod("coef", "OLDlavaan.mi")(object,
omit.imps = omit.imps)
## update @Model@GLIST for standardizedSolution(..., GLIST=)
object@Model <- lavaan::lav_model_set_parameters(object@Model, x = pooledest)
PE$sepc.lv <- EPC.sign * lavaan::standardizedSolution(object, se = FALSE,
type = "std.lv",
cov.std = cov.std,
partable = PE,
GLIST = object@Model@GLIST,
est = abs(EPC))$est.std
PE$sepc.all <- EPC.sign * lavaan::standardizedSolution(object, se = FALSE,
type = "std.all",
cov.std = cov.std,
partable = PE,
GLIST = object@Model@GLIST,
est = abs(EPC))$est.std
fixed.x <- lavListInspect(object, "options")$fixed.x && length(lavNames(object, "ov.x"))
if (fixed.x) {
PE$sepc.nox <- EPC.sign * lavaan::standardizedSolution(object, se = FALSE,
type = "std.nox",
cov.std = cov.std,
partable = PE,
GLIST = object@Model@GLIST,
est = abs(EPC))$est.std
}
if (length(small.idx) > 0L) {
PE$sepc.lv[small.idx] <- 0
PE$sepc.all[small.idx] <- 0
if (fixed.x) PE$sepc.nox[small.idx] <- 0
}
## remove unnecessary columns, then merge
if (is.null(LIST$block)) PE$block <- NULL
PE$est <- NULL
PE$mi <- NULL
PE$epc <- NULL
PE$exo <- NULL
LIST <- merge(LIST, PE, sort = FALSE)
class(LIST) <- c("lavaan.data.frame","data.frame")
}
}
# power?
if (power) {
LIST$sepc.lv <- NULL
LIST$delta <- delta
# FIXME: this is using epc in unstandardized metric
# this would be much more useful in standardized metric
# we need a standardize.est.all.reverse function...
LIST$ncp <- (LIST$mi / (LIST$epc*LIST$epc)) * (delta*delta)
LIST$power <- 1 - pchisq(qchisq((1.0 - alpha), df=1),
df=1, ncp=LIST$ncp)
LIST$decision <- character( length(LIST$power) )
# five possibilities (Table 6 in Saris, Satorra, van der Veld, 2009)
mi.significant <- ifelse( 1 - pchisq(LIST$mi, df=1) < alpha,
TRUE, FALSE )
high.power <- LIST$power > high.power
# FIXME: sepc.all or epc??
#epc.high <- LIST$sepc.all > LIST$delta
epc.high <- LIST$epc > LIST$delta
LIST$decision[ which(!mi.significant & !high.power)] <- "(i)"
LIST$decision[ which( mi.significant & !high.power)] <- "**(m)**"
LIST$decision[ which(!mi.significant & high.power)] <- "(nm)"
LIST$decision[ which( mi.significant & high.power &
!epc.high)] <- "epc:nm"
LIST$decision[ which( mi.significant & high.power &
epc.high)] <- "*epc:m*"
#LIST$decision[ which(mi.significant & high.power) ] <- "epc"
#LIST$decision[ which(mi.significant & !high.power) ] <- "***"
#LIST$decision[ which(!mi.significant & !high.power) ] <- "(i)"
}
# sort?
if (sort.) {
LIST <- LIST[order(LIST$mi, decreasing = TRUE),]
}
if (minimum.value > 0.0) {
LIST <- LIST[!is.na(LIST$mi) & LIST$mi > minimum.value,]
}
if (maximum.number < nrow(LIST)) {
LIST <- LIST[seq_len(maximum.number),]
}
if (na.remove) {
idx <- which(is.na(LIST$mi))
if (length(idx) > 0) LIST <- LIST[-idx,]
}
if (!is.null(op)) {
idx <- LIST$op %in% op
if (length(idx) > 0) LIST <- LIST[idx,]
}
# add header
# TODO: small explanation of the columns in the header?
# attr(LIST, "header") <-
# c("modification indices for newly added parameters only; to\n",
# "see the effects of releasing equality constraints, use the\n",
# "lavTestScore() function")
LIST
}
##' @name modindices.mi-deprecated
##' @usage
##' modificationIndices.mi(object, test = c("D2","D1"),
##' omit.imps = c("no.conv","no.se"),
##' standardized = TRUE, cov.std = TRUE,
##' information = "expected", power = FALSE, delta = 0.1,
##' alpha = 0.05, high.power = 0.75, sort. = FALSE,
##' minimum.value = 0, maximum.number = nrow(LIST),
##' na.remove = TRUE, op = NULL)
##' @seealso [semTools-deprecated()]
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##' @export
modificationIndices.mi <- modindices.mi
##' @name modindices.mi-deprecated
##' @usage
##' modificationindices.mi(object, test = c("D2","D1"),
##' omit.imps = c("no.conv","no.se"),
##' standardized = TRUE, cov.std = TRUE,
##' information = "expected", power = FALSE, delta = 0.1,
##' alpha = 0.05, high.power = 0.75, sort. = FALSE,
##' minimum.value = 0, maximum.number = nrow(LIST),
##' na.remove = TRUE, op = NULL)
##' @seealso [semTools-deprecated()]
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##' @export
modificationindices.mi <- modindices.mi
semTools/R/powerAnalysisNested.R
### Sunthud Pornprasertmanit, Bell Clinton, Pavel Panko
### Last updated: 10 January 2021
##' Find power given a sample size in nested model comparison
##'
##' Find the statistical power to reject the null pair of nested-model RMSEA
##' values when the samples arise from the alternative pair of RMSEA values,
##' given the specified sample size.
##'
##'
##' @importFrom stats qchisq pchisq
##'
##' @param rmsea0A The \eqn{H_0} baseline RMSEA
##' @param rmsea0B The \eqn{H_0} alternative RMSEA (trivial misfit)
##' @param rmsea1A The \eqn{H_1} baseline RMSEA
##' @param rmsea1B The \eqn{H_1} alternative RMSEA (target misfit to be rejected)
##' @param dfA degrees of freedom of the more-restricted model
##' @param dfB degrees of freedom of the less-restricted model
##' @param n Sample size
##' @param alpha The alpha level
##' @param group The number of groups used in calculating the RMSEA
##'
##' @author Bell Clinton
##'
##' Pavel Panko (Texas Tech University; \email{pavel.panko@@ttu.edu})
##'
##' Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' @seealso \itemize{
##' \item [plotRMSEApowernested()] to plot the statistical power for
##' nested model comparison based on population RMSEA given the sample size
##' \item [findRMSEAsamplesizenested()] to find the minimum sample
##' size for a given statistical power in nested model comparison based on
##' population RMSEA
##' }
##'
##' @references
##' MacCallum, R. C., Browne, M. W., & Cai, L. (2006). Testing
##' differences between nested covariance structure models: Power analysis and
##' null hypotheses. *Psychological Methods, 11*(1), 19--35.
##' \doi{10.1037/1082-989X.11.1.19}
##'
##' @examples
##'
##' findRMSEApowernested(rmsea0A = 0.06, rmsea0B = 0.05, rmsea1A = 0.08,
##' rmsea1B = 0.05, dfA = 22, dfB = 20, n = 200,
##' alpha = 0.05, group = 1)
##'
##' @export
findRMSEApowernested <- function(rmsea0A = NULL, rmsea0B = NULL, rmsea1A, rmsea1B = NULL, dfA, dfB, n, alpha = 0.05, group = 1) {
if(is.null(rmsea0A)) rmsea0A <- 0
if(is.null(rmsea0B)) rmsea0B <- 0
if(is.null(rmsea1B)) rmsea1B <- rmsea0B
if(dfA <= dfB) stop("The degree of freedom of the more-restricted model (dfA) should be greater than the degree of freedom of the less-restricted model (dfB)")
if(rmsea0A < rmsea0B) stop("In the null-hypothesis models, the RMSEA of the more-restricted model (rmsea0A) should have a higher than or equal to the RMSEA of the less-restricted model (rmsea0B).")
if(rmsea1A < rmsea1B) stop("In the alternative-hypothesis models, the RMSEA of the more-restricted model (rmsea1A) should have a higher than or equal to the RMSEA of the less-restricted model (rmsea1B).")
ddiff <- dfA-dfB
f0a <- (dfA*rmsea0A^2)/group
f0b <- (dfB*rmsea0B^2)/group
f1a <- (dfA*rmsea1A^2)/group
f1b <- (dfB*rmsea1B^2)/group
ncp0 <- (n-1)*(f0a-f0b)
ncp1 <- (n-1)*(f1a-f1b)
cval <- qchisq(1-alpha,ddiff,ncp0)
Power <- 1-pchisq(cval,ddiff,ncp1)
Power
}
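## A minimal sketch (not run) of the computation above, using the inputs from
## the @examples. Each model's discrepancy is F = df * RMSEA^2 / group
## (MacCallum et al., 2006), and the LRT statistic is noncentral chi-squared
## with df = dfA - dfB and ncp = (n - 1) * (F_A - F_B). f() is a hypothetical
## helper for this sketch, not part of semTools.
# f <- function(df, rmsea, g = 1) df * rmsea^2 / g
# ncp0 <- (200 - 1) * (f(22, .06) - f(20, .05))  # under H0 (trivial misfit)
# ncp1 <- (200 - 1) * (f(22, .08) - f(20, .05))  # under H1 (target misfit)
# crit <- qchisq(1 - .05, df = 2, ncp = ncp0)
# 1 - pchisq(crit, df = 2, ncp = ncp1)  # matches findRMSEApowernested() above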
##' Find sample size given a power in nested model comparison
##'
##' Find the smallest sample size at which the power to reject the null pair
##' of nested-model RMSEA values, given samples from the alternative pair of
##' RMSEA values, just exceeds the specified power.
##'
##'
##' @param rmsea0A The \eqn{H_0} baseline RMSEA
##' @param rmsea0B The \eqn{H_0} alternative RMSEA (trivial misfit)
##' @param rmsea1A The \eqn{H_1} baseline RMSEA
##' @param rmsea1B The \eqn{H_1} alternative RMSEA (target misfit to be rejected)
##' @param dfA degrees of freedom of the more-restricted model.
##' @param dfB degrees of freedom of the less-restricted model.
##' @param power The desired statistical power.
##' @param alpha The alpha level.
##' @param group The number of groups used in calculating the RMSEA.
##'
##' @author Bell Clinton
##'
##' Pavel Panko (Texas Tech University; \email{pavel.panko@@ttu.edu})
##'
##' Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' @seealso \itemize{
##' \item [plotRMSEApowernested()] to plot the statistical power for
##' nested model comparison based on population RMSEA given the sample size
##' \item [findRMSEApowernested()] to find the power for a given
##' sample size in nested model comparison based on population RMSEA
##' }
##'
##' @references
##' MacCallum, R. C., Browne, M. W., & Cai, L. (2006). Testing
##' differences between nested covariance structure models: Power analysis and
##' null hypotheses. *Psychological Methods, 11*(1), 19--35.
##' \doi{10.1037/1082-989X.11.1.19}
##'
##' @examples
##'
##' findRMSEAsamplesizenested(rmsea0A = 0, rmsea0B = 0, rmsea1A = 0.06,
##' rmsea1B = 0.05, dfA = 22, dfB = 20, power = 0.80,
##' alpha = .05, group = 1)
##'
##' @export
findRMSEAsamplesizenested <- function(rmsea0A = NULL, rmsea0B = NULL, rmsea1A,
rmsea1B = NULL, dfA, dfB, power = 0.80,
alpha = .05, group = 1) {
if(is.null(rmsea0A)) rmsea0A <- 0
if(is.null(rmsea0B)) rmsea0B <- 0
if(is.null(rmsea1B)) rmsea1B <- rmsea0B
if(dfA <= dfB) stop("The degree of freedom of the more-restricted model (dfA) should be greater than the degree of freedom of the less-restricted model (dfB)")
if(rmsea0A < rmsea0B) stop("In the null-hypothesis models, the RMSEA of the more-restricted model (rmsea0A) should have a higher than or equal to the RMSEA of the less-restricted model (rmsea0B).")
if(rmsea1A < rmsea1B) stop("In the alternative-hypothesis models, the RMSEA of the more-restricted model (rmsea1A) should have a higher than or equal to the RMSEA of the less-restricted model (rmsea1B).")
n <- 5:500
pow <- findRMSEApowernested(rmsea0A, rmsea0B, rmsea1A, rmsea1B, dfA, dfB, n, alpha, group = group)
if(all(pow > power)) {
return("Sample Size <= 5")
} else if (all(power > pow)) {
repeat {
n <- n + 500
pow <- findRMSEApowernested(rmsea0A, rmsea0B, rmsea1A, rmsea1B, dfA, dfB, n, alpha, group = group)
if(any(pow > power)) {
index <- which(pow > power)[1]
return(n[index]/group)
}
}
} else {
index <- which(pow > power)[1]
return(n[index]/group)
}
}
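## Hedged sanity check (not run): the returned sample size should achieve at
## least the requested power when passed back into findRMSEApowernested().
## "n.min" is just an illustrative object name.
# n.min <- findRMSEAsamplesizenested(rmsea0A = 0, rmsea0B = 0, rmsea1A = 0.06,
#                                    rmsea1B = 0.05, dfA = 22, dfB = 20,
#                                    power = 0.80, alpha = .05, group = 1)
# findRMSEApowernested(0, 0, rmsea1A = 0.06, rmsea1B = 0.05, dfA = 22,
#                      dfB = 20, n = n.min, alpha = .05)  # >= 0.80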
##' Plot power of nested model RMSEA
##'
##' Plot power of nested model RMSEA over a range of possible sample sizes.
##'
##'
##' @param rmsea0A The \eqn{H_0} baseline RMSEA
##' @param rmsea0B The \eqn{H_0} alternative RMSEA (trivial misfit)
##' @param rmsea1A The \eqn{H_1} baseline RMSEA
##' @param rmsea1B The \eqn{H_1} alternative RMSEA (target misfit to be rejected)
##' @param dfA degrees of freedom of the more-restricted model
##' @param dfB degrees of freedom of the less-restricted model
##' @param nlow Lower bound of sample size
##' @param nhigh Upper bound of sample size
##' @param steps Step size
##' @param alpha The alpha level
##' @param group The number of groups used in calculating the RMSEA
##' @param \dots The additional arguments for the plot function.
##'
##' @author Bell Clinton
##'
##' Pavel Panko (Texas Tech University; \email{pavel.panko@@ttu.edu})
##'
##' Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' @seealso \itemize{
##' \item [findRMSEApowernested()] to find the power for a given
##' sample size in nested model comparison based on population RMSEA
##' \item [findRMSEAsamplesizenested()] to find the minimum sample
##' size for a given statistical power in nested model comparison based on
##' population RMSEA
##' }
##'
##' @references
##' MacCallum, R. C., Browne, M. W., & Cai, L. (2006). Testing
##' differences between nested covariance structure models: Power analysis and
##' null hypotheses. *Psychological Methods, 11*(1), 19--35.
##' \doi{10.1037/1082-989X.11.1.19}
##'
##' @examples
##'
##' plotRMSEApowernested(rmsea0A = 0, rmsea0B = 0, rmsea1A = 0.06,
##' rmsea1B = 0.05, dfA = 22, dfB = 20, nlow = 50,
##' nhigh = 500, steps = 1, alpha = .05, group = 1)
##'
##' @export
plotRMSEApowernested <- function(rmsea0A = NULL, rmsea0B = NULL, rmsea1A,
rmsea1B = NULL, dfA, dfB, nlow, nhigh,
steps = 1, alpha = .05, group = 1, ...){
nseq <- seq(nlow,nhigh, by=steps)
pow1 <- findRMSEApowernested(rmsea0A = rmsea0A, rmsea0B = rmsea0B, rmsea1A = rmsea1A, rmsea1B = rmsea1B, dfA = dfA, dfB = dfB, n = nseq, alpha = alpha, group = group)
plot(nseq, pow1, xlab="Sample Size", ylab="Power", main="Compute Power for Nested RMSEA", type="l", ...)
}
semTools/R/runMI-LRT.R
### Terrence D. Jorgensen & Yves Rosseel
### Last updated: 12 March 2025
### Pooled likelihood ratio test for multiple imputations
### Borrowed source code from lavaan/R/lav_test_LRT.R
### DEPRECATED: 16 June 2024
### supplanted by lavaan.mi package
## -------------
## Main function
## -------------
##' Likelihood Ratio Test for Multiple Imputations
##'
##' Likelihood ratio test (LRT) for lavaan models fitted to multiply imputed
##' data sets. Statistics for comparing nested models can be calculated by
##' pooling the likelihood ratios across imputed data sets, as described by
##' Meng & Rubin (1992), or by pooling the LRT statistics from each imputation,
##' as described by Li, Meng, Raghunathan, & Rubin (1991).
##'
##' The Meng & Rubin (1992) method, also referred to as the `"D3"`
##' statistic, is only applicable when using a likelihood-based estimator.
##' Otherwise (e.g., DWLS for categorical outcomes), users are notified that
##' `test` was set to `"D2"`.
##'
##' `test = "Mplus"` implies `"D3"` and `asymptotic = TRUE`
##' (see Asparouhov & Muthen, 2010).
##'
##' Note that unlike [lavaan::lavTestLRT()], `lavTestLRT.mi()` can
##' only be used to compare a single pair of models, not a longer list of
##' models. To compare several nested models fitted to multiple imputations,
##' see examples on the [compareFit()] help page.
##'
## @aliases lavTestLRT.mi
##' @importFrom lavaan lavListInspect parTable lavTestLRT
##' @importFrom stats cov pchisq pf
##'
##' @param object,h1 An object of class [OLDlavaan.mi-class].
##' `object` should be nested within (more constrained than) `h1`.
##' @param test `character` indicating which pooling method to use.
##' `"D3"`, `"mr"`, or `"meng.rubin"` (default) requests the
##' method described by Meng & Rubin (1992). `"D2"`, `"LMRR"`,
##' or `"Li.et.al"` requests the complete-data LRT statistic should be
##' calculated using each imputed data set, which will then be pooled across
##' imputations, as described in Li, Meng, Raghunathan, & Rubin (1991).
##' Find additional details in Enders (2010, chapter 8).
##' @param omit.imps `character` vector specifying criteria for omitting
##' imputations from pooled results. Can include any of
##' `c("no.conv", "no.se", "no.npd")`, the first 2 of which are the
##' default setting, which excludes any imputations that did not
##' converge or for which standard errors could not be computed. The
##' last option (`"no.npd"`) would exclude any imputations which
##' yielded a nonpositive definite covariance matrix for observed or
##' latent variables, which would include any "improper solutions" such
##' as Heywood cases. Specific imputation numbers can also be included in this
##' argument, in case users want to apply their own custom omission criteria
##' (or simulations can use different numbers of imputations without
##' redundantly refitting the model).
##' @param asymptotic `logical`. If `FALSE` (default), the pooled test
##' will be returned as an *F*-distributed statistic with numerator
##' (`df1`) and denominator (`df2`) degrees of freedom.
##' If `TRUE`, the pooled *F* statistic will be multiplied by its
##' `df1` on the assumption that its `df2` is sufficiently large
##' that the statistic will be asymptotically \eqn{\chi^2} distributed
##' with `df1`.
##' @param pool.robust `logical`. Ignored unless `test = "D2"` and a
##' robust test was requested. If `pool.robust = TRUE`, the robust test
##' statistic is pooled, whereas `pool.robust = FALSE` will pool
##' the naive test statistic (or difference statistic) and apply the average
##' scale/shift parameter to it (unavailable for mean- and variance-adjusted
##' difference statistics, so `pool.robust` will be set `TRUE`).
##' @param ... Additional arguments passed to [lavaan::lavTestLRT()],
##' only if `test = "D2"` and `pool.robust = TRUE`
##'
##' @return
##' A vector containing the LRT statistic (either an `F` or \eqn{\chi^2}
##' statistic, depending on the `asymptotic` argument), its degrees of
##' freedom (numerator and denominator, if `asymptotic = FALSE`), its
##' *p* value, and 2 missing-data diagnostics: the relative increase
##' in variance (RIV, or average for multiparameter tests: ARIV) and the
##' fraction of missing information (FMI = ARIV / (1 + ARIV)). Robust statistics
##' will also include the average (across imputations) scaling factor and
##' (if relevant) shift parameter(s), unless `pool.robust = TRUE`.
##'
##' @author
##' Terrence D. Jorgensen (University of Amsterdam;
##' \email{TJorgensen314@@gmail.com})
##'
##' @references
##' Enders, C. K. (2010). *Applied missing data analysis*.
##' New York, NY: Guilford.
##'
##' Li, K.-H., Meng, X.-L., Raghunathan, T. E., & Rubin, D. B. (1991).
##' Significance levels from repeated *p*-values with multiply-imputed
##' data. *Statistica Sinica, 1*(1), 65--92.
##'
##' Meng, X.-L., & Rubin, D. B. (1992). Performing likelihood ratio tests with
##' multiply-imputed data sets. *Biometrika, 79*(1), 103--111.
##' \doi{10.2307/2337151}
##'
##' Rubin, D. B. (1987). *Multiple imputation for nonresponse in surveys*.
##' New York, NY: Wiley.
##'
##' @seealso [lavaan::lavTestLRT()], [compareFit()]
##'
##' @examples
##'
##' ## See the new lavaan.mi package
##'
##' @name lavTestLRT.mi-deprecated
##' @usage
##' lavTestLRT.mi(object, h1 = NULL, test = c("D3","D2"),
##' omit.imps = c("no.conv","no.se"),
##' asymptotic = FALSE, pool.robust = FALSE, ...)
##' @seealso [semTools-deprecated()]
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##'
##' @export
lavTestLRT.mi <- function(object, h1 = NULL, test = c("D3","D2"),
omit.imps = c("no.conv","no.se"),
asymptotic = FALSE, pool.robust = FALSE, ...) {
.Deprecated(msg = c("\nThe runMI() and related lavaan.mi functions have been",
" deprecated and will cease to be included in future ",
"versions of semTools.\n\nSupport is still provided for ",
"analyzing lavaan.mi-class objects (e.g., compRelSEM() ",
"can estimate reliability using pooled results), which ",
"can now be created using the lavaan.mi package.\n\nThe ",
"deprecated runMI() function now creates an object of ",
"class OLDlavaan.mi, which can be analyzed using the ",
"deprecated functions in semTools, like lavTestLRT.mi(),",
" that have been updated and improved in the lavaan.mi ",
"package.\n\nFind more details help('semTools-deprecated)"))
## check class
if (!inherits(object, "OLDlavaan.mi")) stop("object is not class 'OLDlavaan.mi'")
useImps <- rep(TRUE, length(object@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(object@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(object@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(object@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(object@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
## custom removal by imputation number
rm.imps <- omit.imps[ which(omit.imps %in% 1:length(useImps)) ]
if (length(rm.imps)) useImps[as.numeric(rm.imps)] <- FALSE
## whatever is left
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
DF0 <- object@testList[[ useImps[1] ]]$standard[["df"]]
## model comparison?
if (!is.null(h1)) {
if (!inherits(h1, "OLDlavaan.mi")) stop("h1 is not class 'OLDlavaan.mi'")
if (!all(lavListInspect(object, "options")$test == lavListInspect(h1, "options")$test)) {
stop('Different (sets of) test statistics were requested for the 2 models.')
}
useImps1 <- rep(TRUE, length(h1@DataList))
if ("no.conv" %in% omit.imps) useImps1 <- sapply(h1@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps1 <- useImps1 & sapply(h1@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(h1@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(h1@convergence, "[[", i = "Heywood.ov")
useImps1 <- useImps1 & !(Heywood.lv | Heywood.ov)
}
m <- sum(useImps1)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps1 <- which(useImps1)
h0.not.h1 <- setdiff(useImps, useImps1)
if (length(h0.not.h1)) warn0 <- paste('\n\nFor the following imputations, ',
'"object" converged but not "h1":',
paste(h0.not.h1, collapse = ", "),
'\n\n')
h1.not.h0 <- setdiff(useImps1, useImps)
if (length(h1.not.h0)) warn1 <- paste('\n\nFor the following imputations, ',
'"h1" converged but not "object":',
paste(h1.not.h0, collapse = ", "),
'\n\n')
if (length(c(h0.not.h1, h1.not.h0)))
warning('The models being compared did not converge on the same set of ',
'imputations. ',
if (length(h0.not.h1)) warn0 else '',
if (length(h1.not.h0)) warn1 else '',
'Likelihood ratio test conducted using only the ',
'imputations for which both models converged.')
useImps <- intersect(useImps, useImps1)
m <- length(useImps) # yes, redefine if necessary
if (m == 0L) stop('For no imputations did both models converge.')
## check DF
DF1 <- h1@testList[[ useImps[1] ]]$standard[["df"]]
if (DF0 == DF1) stop("models have equal degrees of freedom")
if (DF0 < DF1) {
H0 <- h1
h1 <- object
object <- H0
H0 <- DF1
DF1 <- DF0
DF0 <- H0
}
DF <- DF0 - DF1
} else DF <- DF0
## only keep arguments relevant to pass to lavTestLRT (if D2)
dots <- list(...)[names(formals(lavTestLRT))]
## check test-pooling options, for backward compatibility?
test <- tolower(test[1])
if (test == "mplus") {
test <- "D3"
asymptotic <- TRUE
}
if (tolower(test) %in% c("mr","meng.rubin","likelihood","lrt","d3")) test <- "D3"
if (tolower(test) %in% c("lmrr","li.et.al","pooled.wald","d2")) test <- "D2"
if (test == "D3" && !lavListInspect(object, "options")$estimator %in% c("ML","PML","FML")) {
message('"D3" only available using maximum likelihood estimation. ',
'Changed test to "D2".')
test <- "D2"
}
## check for robust
test.names <- lavListInspect(object, "options")$test
# lavaan 0.6-5: for now, we only acknowledge the first non-standard @test
if (length(test.names) > 1L) {
## remove standard and any bootstrapped tests
rm.idx <- which(test.names %in% c("standard","bootstrap","bollen.stine"))
if (length(rm.idx) > 0L) {
test.names <- test.names[-rm.idx]
}
## only acknowledge the first scaled test statistic
if (length(test.names) > 1L) {
test.names <- test.names[1]
}
}
robust <- any(test.names %in% c("satorra.bentler","yuan.bentler",
"yuan.bentler.mplus","scaled.shifted",
"mean.var.adjusted","satterthwaite"))
if (!robust) pool.robust <- FALSE
scaleshift <- any(test.names == "scaled.shifted")
if (scaleshift && !is.null(h1)) {
if (test == "D3" | !pool.robust)
message("If test = 'scaled.shifted' (estimator = 'WLSMV' or 'MLMV'), ",
"model comparison is only available by (re)setting test = 'D2' ",
"and pool.robust = TRUE.\n",
"Control more options by passing arguments to lavTestLRT() via ",
"the '...' argument.\n")
pool.robust <- TRUE
test <- 'D2'
}
if (robust && !pool.robust && !asymptotic) {
message('Robust correction can only be applied to pooled chi-squared ',
'statistic, not F statistic. "asymptotic" was switched to TRUE.')
asymptotic <- TRUE
}
if (pool.robust && test == "D3") {
message('pool.robust = TRUE is only applicable when test = "D2". ',
'Changed test to "D2".')
test <- "D2"
}
## calculate pooled test:
if (robust && pool.robust) {
## pool both the naive and robust test statistics, return both to
## make output consistent across options
out.naive <- D2.LRT(object, h1 = h1, useImps = useImps,
asymptotic = asymptotic, pool.robust = FALSE)
out.robust <- D2.LRT(object, h1 = h1, useImps = useImps, LRTargs = dots,
asymptotic = asymptotic, pool.robust = TRUE)
out <- c(out.naive, out.robust)
} else if (test == "D2") {
out <- D2.LRT(object, h1 = h1, useImps = useImps, asymptotic = asymptotic,
pool.robust = pool.robust)
} else if (test == "D3") {
out <- D3.LRT(object, h1 = h1, useImps = useImps, asymptotic = asymptotic,
omit.imps = omit.imps)
}
## If test statistic is negative, return without any indices or robustness
if (asymptotic) {
if (out[["chisq"]] == 0) {
message('Negative pooled test statistic was set to zero, so fit will ',
'appear to be arbitrarily perfect. ',
if (robust) 'Robust corrections uninformative, not returned.',
'\n')
class(out) <- c("lavaan.vector","numeric")
return(out)
}
} else {
if (out[["F"]] == 0) {
message('Negative pooled test statistic was set to zero, so fit will ',
'appear to be arbitrarily perfect.\n')
class(out) <- c("lavaan.vector","numeric")
return(out)
}
}
## If robust statistics were not pooled above, robustify naive statistics
if (robust && !pool.robust) {
out <- robustify(ChiSq = out, object = object, h1 = h1, useImps = useImps)
if (scaleshift) {
extraWarn <- ' and shift parameter'
} else if (any(test.names == "mean.var.adjusted")) {
extraWarn <- ' and degrees of freedom'
} else extraWarn <- ''
message('Robust corrections are made by pooling the naive chi-squared ',
'statistic across ', m, ' imputations for which the model ',
'converged, then applying the average (across imputations) scaling',
' factor', extraWarn, ' to that pooled value. \n',
'To instead pool the robust test statistics, set test = "D2" and ',
'pool.robust = TRUE. \n')
}
class(out) <- c("lavaan.vector","numeric")
out
}
## ----------------
## Hidden Functions
## ----------------
##' @importFrom lavaan lavListInspect parTable lavTestLRT
D2.LRT <- function(object, h1 = NULL, useImps, asymptotic = FALSE,
pool.robust = FALSE, LRTargs = list()) {
warn <- lavListInspect(object, "options")$warn
if (pool.robust && !is.null(h1)) {
PT1 <- parTable(h1)
op1 <- lavListInspect(h1, "options")
oldCall <- object@lavListCall #re-run lavaanList() and save DIFFTEST
if (!is.null(oldCall$parallel)) {
if (oldCall$parallel == "snow") {
oldCall$parallel <- "no"
oldCall$ncpus <- 1L
if (warn) warning("Unable to pass lavaan::lavTestLRT() arguments when ",
"parallel = 'snow'. Switching to parallel = 'no'. ",
"Unless using Windows, parallel = 'multicore' works.")
}
}
## call lavaanList() again to run lavTestLRT() on each imputation
oldCall$FUN <- function(obj) {
fit1 <- try(lavaan::lavaan(PT1, slotOptions = op1, slotData = obj@Data),
silent = TRUE)
if (inherits(fit1, "try-error")) {
return("fit failed")
} else {
argList <- c(list(object = obj, fit1), LRTargs)
}
out <- try(do.call(lavTestLRT, argList),
silent = TRUE)
if (inherits(out, "try-error")) return("lavTestLRT() failed")
c(chisq = out[2, "Chisq diff"], df = out[2, "Df diff"])
}
FIT <- eval(as.call(oldCall))
## check if there are any results
noFit <- sapply(FIT@funList, function(x) x[1] == "fit failed")
noLRT <- sapply(FIT@funList, function(x) x[1] == "lavTestLRT() failed")
if (all(noFit | noLRT)) stop("No success using lavTestLRT() on any imputations.")
chiList <- sapply(FIT@funList[ intersect(which(!(noFit | noLRT)), useImps) ],
"[[", i = "chisq")
dfList <- sapply(FIT@funList[ intersect(which(!(noFit | noLRT)), useImps) ],
"[[", i = "df")
out <- calculate.D2(chiList, DF = mean(dfList), asymptotic)
names(out) <- paste0(names(out), ".scaled")
class(out) <- c("lavaan.vector","numeric")
return(out)
}
## else, return model fit OR naive difference test to be robustified
test.names <- lavListInspect(object, "options")$test
# lavaan 0.6-5: for now, we only acknowledge the first non-standard @test
if (length(test.names) > 1L) {
## remove standard and any bootstrapped tests
rm.idx <- which(test.names %in% c("standard","bootstrap","bollen.stine"))
if (length(rm.idx) > 0L) {
test.names <- test.names[-rm.idx]
}
## only acknowledge the first scaled test statistic
if (length(test.names) > 1L) {
test.names <- test.names[1]
}
}
## pool Wald tests
if (is.null(h1)) {
test <- if (pool.robust) test.names[1] else "standard"
DF <- mean(sapply(object@testList[useImps], function(x) x[[test]][["df"]] ))
w <- sapply(object@testList[useImps], function(x) x[[test]][["stat"]])
} else {
## this will not get run if !pool.robust because logic catches that first
DF0 <- mean(sapply(object@testList[useImps], function(x) x$standard[["df"]]))
DF1 <- mean(sapply( h1@testList[useImps], function(x) x$standard[["df"]]))
DF <- DF0 - DF1
w0 <- sapply(object@testList[useImps], function(x) x$standard[["stat"]])
w1 <- sapply( h1@testList[useImps], function(x) x$standard[["stat"]])
w <- w0 - w1
## check DF
if (DF < 0) {
w <- -1*w
DF <- -1*DF
}
}
out <- calculate.D2(w, DF, asymptotic)
## add .scaled suffix
if (pool.robust) names(out) <- paste0(names(out), ".scaled")
## for 1 model, add extra info (redundant if pool.robust)
if (is.null(h1) && !pool.robust) {
PT <- parTable(object)
out <- c(out, npar = max(PT$free) - sum(PT$op == "=="),
ntotal = lavListInspect(object, "ntotal"))
}
class(out) <- c("lavaan.vector","numeric")
out
}
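## Illustration (hypothetical helper, NOT the package's calculate.D2(), which
## is defined elsewhere in semTools): a minimal sketch of the D2 pooling rule
## of Li, Meng, Raghunathan, & Rubin (1991; see also Enders, 2010), assuming a
## vector 'w' of complete-data chi-squared statistics (one per imputation)
## that share 'DF' degrees of freedom.
sketchD2 <- function(w, DF) {
  m <- length(w)
  ariv <- (1 + 1/m) * stats::var(sqrt(w))      # average relative increase in variance
  Fstat <- max(0, (mean(w)/DF - ariv*(m + 1)/(m - 1)) / (1 + ariv))
  df2 <- DF^(-3/m) * (m - 1) * (1 + 1/ariv)^2  # denominator df
  c("F" = Fstat, df1 = DF, df2 = df2,
    pvalue = stats::pf(Fstat, df1 = DF, df2 = df2, lower.tail = FALSE),
    ariv = ariv, fmi = ariv / (1 + ariv))
}
## e.g., sketchD2(w = c(12.3, 15.1, 13.8, 14.4, 12.9), DF = 5)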
##' @importFrom lavaan parTable lavaan lavListInspect
##' @importFrom methods getMethod
getLLs <- function(object, useImps, saturated = FALSE,
omit.imps = c("no.conv","no.se")) {
## FIXME: lavaanList does not return info when fixed because no convergence!
dataList <- object@DataList[useImps]
lavoptions <- lavListInspect(object, "options")
group <- lavListInspect(object, "group")
if (length(group) == 0L) group <- NULL
cluster <- lavListInspect(object, "cluster")
if (length(cluster) == 0L) cluster <- NULL
if (saturated) {
#FIXME: below is legacy code, no longer needed?
# fit <- lavaan(parTable(object), data = dataList[[ useImps[1] ]],
# slotOptions = lavoptions, group = group, cluster = cluster)
# ## use saturated parameter table as new model
# PT <- lavaan::lav_partable_unrestricted(fit)
# ## fit saturated parameter table to each imputation, return estimates
# satParams <- lapply(object@DataList[useImps], function(d) {
# parTable(lavaan(model = PT, data = d, slotOptions = lavoptions,
# group = group, cluster = cluster))$est
# })
# ## fix them to pooled estimates
# PT$ustart <- colMeans(do.call(rbind, satParams))
PT <- object@h1List[[ useImps[1] ]]$PT
coefList <- lapply(object@h1List[useImps], function(x) x$PT$ustart)
PT$ustart <- colMeans(do.call(rbind, coefList))
## set all parameters fixed
PT$free <- 0L
PT$user <- 1L
PT$start <- NULL
PT$est <- NULL
PT$se <- NULL
} else {
## save parameter table as new model
PT <- parTable(object)
## set all parameters fixed
PT$free <- 0L
PT$user <- 1L
## fix them to pooled estimates
fixedValues <- getMethod("coef","OLDlavaan.mi")(object, type = "user",
omit.imps = omit.imps)
PT$ustart <- fixedValues
PT$start <- NULL
PT$est <- NULL
PT$se <- NULL
## omit (in)equality constraints and user-defined parameters
params <- !(PT$op %in% c("==","<",">",":="))
PT <- PT[params, ]
}
## return log-likelihoods
sapply(dataList, function(d) {
lavaan::logLik(lavaan(PT, data = d, slotOptions = lavoptions,
group = group, cluster = cluster))
})
#TODO: use "dry-run" trick from blavaan:::postpred() to save computing time
}
##' @importFrom stats pf pchisq
##' @importFrom lavaan lavListInspect parTable
D3.LRT <- function(object, h1 = NULL, useImps, asymptotic = FALSE,
omit.imps = c("no.conv","no.se")) {
## NOTE: Need to pass omit.imps= to getLLs(), which calls coef() method
N <- lavListInspect(object, "ntotal")
m <- length(useImps)
if (is.null(h1)) {
DF <- object@testList[[ useImps[1] ]]$standard[["df"]]
} else {
DF1 <- h1@testList[[ useImps[1] ]]$standard[["df"]]
DF0 <- object@testList[[ useImps[1] ]]$standard[["df"]]
DF <- DF0 - DF1
if (DF < 0) stop('The "object" model must be nested within (i.e., have ',
'fewer degrees of freedom than) the "h1" model.')
}
## calculate m log-likelihoods under pooled H0 estimates
LL0 <- getLLs(object, useImps, omit.imps = omit.imps)
## calculate m log-likelihoods under pooled H1 estimates
if (is.null(h1)) {
LL1 <- getLLs(object, useImps, saturated = TRUE, omit.imps = omit.imps)
} else {
LL1 <- getLLs(h1, useImps, omit.imps = omit.imps)
}
#FIXME: check whether LL1 or LL0 returned errors? add try()?
## calculate average of m LRTs
LRT_con <- mean(-2*(LL0 - LL1)) # getLLs() already applies [useImps]
## average chisq across imputations
if (is.null(h1)) {
LRT_bar <- mean(sapply(object@testList[useImps], function(x) x$standard$stat))
} else {
LRT_bar <- mean(sapply(object@testList[useImps], function(x) x$standard$stat) -
sapply(h1@testList[useImps], function(x) x$standard$stat))
}
## calculate average relative increase in variance
a <- DF*(m - 1)
ariv <- ((m + 1) / a) * (LRT_bar - LRT_con)
test.stat <- LRT_con / (DF*(1 + ariv))
if (is.na(test.stat)) stop('D3 test statistic could not be calculated. ',
'Try the D2 pooling method.') #FIXME: check whether model-implied Sigma is NPD
if (test.stat < 0) {
message('Negative test statistic set to zero \n')
test.stat <- 0
}
if (asymptotic) {
out <- c("chisq" = test.stat * DF, df = DF,
pvalue = pchisq(test.stat * DF, df = DF, lower.tail = FALSE),
ariv = ariv, fmi = ariv / (1 + ariv))
} else {
## F statistic
if (a > 4) {
v4 <- 4 + (a - 4) * (1 + (1 - (2 / a))*(1 / ariv))^2 # Enders (eq. 8.34)
} else {
v4 <- a*(1 + 1/DF)*(1 + 1/ariv)^2 / 2 # Enders (eq. 8.35)
# v4 <- (DF + 1)*(m - 1)*(1 + (1 / ariv))^2 / 2 # Grund et al. (eq. 9)
}
out <- c("F" = test.stat, df1 = DF, df2 = v4,
pvalue = pf(test.stat, df1 = DF, df2 = v4, lower.tail = FALSE),
ariv = ariv, fmi = ariv / (1 + ariv))
}
## add log-likelihood and AIC/BIC for target model
if (is.null(h1)) {
PT <- parTable(object)
npar <- max(PT$free) - sum(PT$op == "==")
out <- c(out, npar = npar, ntotal = N,
logl = mean(LL0), unrestricted.logl = mean(LL1),
aic = -2*mean(LL0) + 2*npar, bic = -2*mean(LL0) + npar*log(N),
bic2 = -2*mean(LL0) + npar*log((N + 2) / 24))
## NOTE: Mplus reports the average of m likelihoods evaluated at the
## m point estimates, not evaluated at the pooled point estimates.
## Mplus also uses those to calculate AIC and BIC.
}
class(out) <- c("lavaan.vector","numeric")
out
}
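## Illustration (hypothetical helper, not part of the package API): the D3
## moving parts above, reduced to their arithmetic. Given the average LRT
## evaluated at the pooled estimates (LRT_con), the average of the m
## complete-data LRTs (LRT_bar), and the df of the comparison:
sketchD3 <- function(LRT_con, LRT_bar, DF, m) {
  ariv <- ((m + 1) / (DF * (m - 1))) * (LRT_bar - LRT_con)
  Fstat <- LRT_con / (DF * (1 + ariv))
  c("F" = Fstat, ariv = ariv, fmi = ariv / (1 + ariv))
}
## e.g., sketchD3(LRT_con = 11.5, LRT_bar = 13.0, DF = 5, m = 20)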
##' @importFrom stats pchisq
##' @importFrom lavaan lavListInspect
robustify <- function(ChiSq, object, h1 = NULL, baseline = FALSE, useImps) {
test.names <- lavListInspect(object, "options")$test
# lavaan 0.6-5: for now, we only acknowledge the first non-standard @test
if (length(test.names) > 1L) {
## remove standard and any bootstrapped tests
rm.idx <- which(test.names %in% c("standard","bootstrap","bollen.stine"))
if (length(rm.idx) > 0L) {
test.names <- test.names[-rm.idx]
}
## only acknowledge the first scaled test statistic
if (length(test.names) > 1L) {
test.names <- test.names[1]
}
}
scaleshift <- any(test.names %in% "scaled.shifted")
if (baseline) {
TEST <- lapply(object@baselineList[useImps], "[[", i = "test")
} else TEST <- object@testList[useImps]
d0 <- mean(sapply(TEST, function(x) x[[ test.names[1] ]][["df"]]))
c0 <- mean(sapply(TEST, function(x) x[[ test.names[1] ]][["scaling.factor"]]))
if (!is.null(h1)) {
d1 <- mean(sapply(h1@testList[useImps], function(x) x[[ test.names[1] ]][["df"]]))
c1 <- mean(sapply(h1@testList[useImps],
function(x) x[[ test.names[1] ]][["scaling.factor"]]))
delta_c <- (d0*c0 - d1*c1) / (d0 - d1)
ChiSq["chisq.scaled"] <- ChiSq[["chisq"]] / delta_c
ChiSq["df.scaled"] <- d0 - d1
ChiSq["pvalue.scaled"] <- pchisq(ChiSq[["chisq.scaled"]],
df = ChiSq[["df.scaled"]],
lower.tail = FALSE)
ChiSq["chisq.scaling.factor"] <- delta_c
} else {
ChiSq["chisq.scaled"] <- ChiSq[["chisq"]] / c0
ChiSq["df.scaled"] <- d0
if (scaleshift) {
## add average shift parameter (or average of sums, if nG > 1)
shift <- mean(sapply(TEST, function(x) sum(x[[ test.names[1] ]][["shift.parameter"]]) ))
ChiSq["chisq.scaled"] <- ChiSq[["chisq.scaled"]] + shift
ChiSq["pvalue.scaled"] <- pchisq(ChiSq[["chisq.scaled"]],
df = ChiSq[["df.scaled"]],
lower.tail = FALSE)
ChiSq["chisq.scaling.factor"] <- c0
ChiSq["chisq.shift.parameters"] <- shift
} else {
ChiSq["pvalue.scaled"] <- pchisq(ChiSq[["chisq.scaled"]],
df = ChiSq[["df.scaled"]],
lower.tail = FALSE)
ChiSq["chisq.scaling.factor"] <- c0
}
}
ChiSq
}
semTools/R/fmi.R
### Mauricio Garnier Villarreal & Terrence D. Jorgensen
### Last updated: 12 March 2025
### This function estimates the Fraction of Missing Information for means and
### (co)variances of each variable in a partially observed data set or from
### a list of multiply imputed data sets
##' Fraction of Missing Information.
##'
##' This function estimates the Fraction of Missing Information (FMI) for
##' summary statistics of each variable, using either an incomplete data set or
##' a list of imputed data sets.
##'
##' The function estimates a saturated model with [lavaan::lavaan()] for a
##' single incomplete data set using FIML, or with [lavaan.mi::lavaan.mi()]
##' for a list of imputed data sets. If method = `"saturated"`, FMI will be
##' estimated for all summary statistics, which could take a lot of time with
##' big data sets. If method = `"null"`, FMI will only be estimated for
##' univariate statistics (e.g., means, variances, thresholds). The saturated
##' model gives more reliable estimates, so with a large data set it can help
##' to request only a subset of variables.
##'
##'
##' @importFrom lavaan lavListInspect lavInspect
##'
##' @param data Either a single `data.frame` with incomplete observations,
##' or a `list` of imputed data sets.
##' @param method `character`. If `"saturated"` or `"sat"` (default),
##' the model used to estimate FMI is a freely estimated covariance matrix and
##' mean vector for numeric variables, and/or polychoric correlations and
##' thresholds for ordered categorical variables, for each group (if
##' applicable). If `"null"`, only means and variances are estimated for
##' numeric variables, and/or thresholds for ordered categorical variables
##' (i.e., covariances and/or polychoric/polyserial correlations are
##' constrained to zero). If `"cor"`, the saturated model is parameterized in
##' terms of correlations (using phantom latent variables whose loadings
##' capture standard deviations), so FMI is estimated for correlations rather
##' than covariances. See **Details** for more information.
##' @param group `character`. The optional name of a grouping variable, to
##' request FMI in each group.
##' @param ords Optional `character` vector naming ordered-categorical
##' variables, if they are not already stored as class `ordered` in `data`.
##' @param varnames Optional `character` vector of variable names, to calculate
##' FMI for a subset of variables in `data`. By default, all numeric and
##' `ordered=` variables will be included, unless `data=` is a single
##' incomplete `data.frame`, in which case only numeric variables can be
##' used with FIML estimation. Other variable types will be removed.
##' @param exclude Optional `character` vector naming variables to exclude from
##' the analysis.
##' @param return.fit logical. If `TRUE`, the fitted [lavaan::lavaan-class] or
##' [lavaan.mi::lavaan.mi-class] model is returned, so FMI can be found from
##' `summary(..., fmi=TRUE)`.
##'
##' @return `fmi()` returns a list with at least 2 of the following:
##'
##' \item{Covariances}{A list of symmetric matrices: (1) the estimated/pooled
##' covariance matrix, or a list of group-specific matrices (if applicable)
##' and (2) a matrix of FMI, or a list of group-specific matrices (if
##' applicable). Only available if `method = "saturated"`. When
##' `method="cor"`, this element is replaced by `Correlations`.}
##' \item{Variances}{The estimated/pooled variance for each numeric variable.
##' Only available if `method = "null"` (otherwise, it is on the diagonal
##' of Covariances).}
##' \item{Means}{The estimated/pooled mean for each numeric variable.}
##' \item{Thresholds}{The estimated/pooled threshold(s) for each
##' ordered-categorical variable.}
##'
##' @author
##' Mauricio Garnier Villarreal (Vrije Universiteit Amsterdam; \email{m.garniervillarreal@@vu.nl})
##'
##' Terrence Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @references
##' Rubin, D. B. (1987). *Multiple imputation for nonresponse in surveys*.
##' New York, NY: Wiley.
##'
##' Savalei, V. & Rhemtulla, M. (2012). On obtaining estimates of the fraction
##' of missing information from full information maximum likelihood.
##' *Structural Equation Modeling, 19*(3), 477--494.
##' \doi{10.1080/10705511.2012.687669}
##'
##' Wagner, J. (2010). The fraction of missing information as a tool for
##' monitoring the quality of survey data. *Public Opinion Quarterly,
##' 74*(2), 223--243. \doi{10.1093/poq/nfq007}
##'
##' @examples
##'
##' HSMiss <- HolzingerSwineford1939[ , c(paste("x", 1:9, sep = ""),
##' "ageyr","agemo","school")]
##' set.seed(12345)
##' HSMiss$x5 <- ifelse(HSMiss$x5 <= quantile(HSMiss$x5, .3), NA, HSMiss$x5)
##' age <- HSMiss$ageyr + HSMiss$agemo/12
##' HSMiss$x9 <- ifelse(age <= quantile(age, .3), NA, HSMiss$x9)
##'
##' ## calculate FMI (using FIML, provide partially observed data set)
##' (out1 <- fmi(HSMiss, exclude = "school"))
##' (out2 <- fmi(HSMiss, exclude = "school", method = "null"))
##' (out3 <- fmi(HSMiss, varnames = c("x5","x6","x7","x8","x9")))
##' (out4 <- fmi(HSMiss, method = "cor", group = "school")) # correlations by group
##'
##' ## significance tests in lavaan(.mi) object
##' out5 <- fmi(HSMiss, method = "cor", return.fit = TRUE)
##' summary(out5) # factor loading == SD, covariance = correlation
##'
##' if(requireNamespace("lavaan.mi")){
##' ## ordered-categorical data
##' data(binHS5imps, package = "lavaan.mi")
##'
##' ## calculate FMI, using list of imputed data sets
##' fmi(binHS5imps, group = "school")
##' }
##'
##' @export
fmi <- function(data, method = "saturated", group = NULL, ords = NULL,
varnames = NULL, exclude = NULL, return.fit = FALSE) {
fiml <- is.data.frame(data)
## check for single data set or list of imputed data sets
data1 <- if (fiml) data else data[[1]]
## select user-specified variables
vars <- if (is.character(varnames)) varnames else colnames(data1)
## remove grouping variable and user-specified exclusions, if applicable
vars <- setdiff(vars, c(group, exclude))
## check classes
ordvars <- vars[sapply(data1[vars], is.ordered)]
if (!is.null(ords)) ordvars <- c(ordvars, ords)
numvars <- vars[sapply(data1[vars], is.numeric)]
vars <- union(numvars, ordvars)
numvars <- setdiff(vars, ordvars)
if (fiml) {
#TODO: enable estimator = "PML"
# pass missing= option as another fmi() argument?
if (length(ordvars)) message(c("By providing a single data set, only the ",
"FIML option is available to calculate FMI,",
" which requires continuous variables. The ",
"following variables were removed: ",
paste(ordvars, collapse = ", ")))
if (!length(numvars)) stop("No numeric variables were provided.")
vars <- numvars
}
## construct model
if (method == "saturated" | method == "sat") {
covstruc <- outer(vars, vars, function(x, y) paste(x, "~~", y))
diag(covstruc)[which(vars %in% ordvars)] <- ""
model <- covstruc[lower.tri(covstruc, diag = TRUE)]
} else if (method == "null") {
covstruc <- outer(vars, vars, function(x, y) paste(x, "~~", y))
diag(covstruc)[which(vars %in% ordvars)] <- ""
model <- diag(covstruc)
} else if (method == "cor") {
# phantoms <- paste0(".", vars, ".")
model <- c(paste0(".", vars, ". =~ ", ifelse(vars %in% ordvars, "1*", "NA*"),
vars), # loadings = SDs (fixed to 1 when ordinal)
paste0(vars, " ~~ 0*", vars)) # "residual" variances = 0
} else stop('Invalid method= argument: "', method, '"')
if (length(numvars)) model <- c(model, paste(numvars, "~ 1"))
## fit model
if (fiml) {
if (method == "cor") {
fit <- lavaan::cfa(model, data = data, missing = "fiml", group = group,
std.lv = TRUE)
} else {
fit <- lavaan::lavaan(model, data = data, missing = "fiml", group = group)
}
if (return.fit) return(fit)
comb.results <- lavaan::parameterEstimates(fit, fmi = TRUE, zstat = FALSE,
pvalue = FALSE, ci = FALSE)
nG <- lavInspect(fit, "ngroups")
if (nG == 1L) comb.results$group <- 1L
group.label <- lavInspect(fit, "group.label")
} else {
## list of imputations
if (!"package:lavaan.mi" %in% search()) attachNamespace("lavaan.mi")
if (method == "cor") {
fit <- lavaan.mi::cfa.mi(model, data = data, group = group,
ordered = ordvars, std.lv = TRUE)
} else {
fit <- lavaan.mi::lavaan.mi(model, data = data, group = group,
ordered = ordvars, auto.th = TRUE)
}
if (return.fit) return(fit)
comb.results <- lavaan.mi::parameterEstimates.mi(fit, fmi = TRUE,
zstat = FALSE,
pvalue = FALSE, ci = FALSE)
nG <- lavListInspect(fit, "ngroups")
if (nG == 1L) comb.results$group <- 1L
group.label <- lavListInspect(fit, "group.label")
#FIXME: also return RIV? Or make it an argument (FMI or RIV)
comb.results$riv <- NULL
}
## Variances from null model, if applicable
if (method == "null") {
if (length(numvars)) {
Variances <- comb.results[comb.results$lhs == comb.results$rhs,
c("lhs","group","est","fmi")]
colnames(Variances)[c(1, 3)] <- c("variable","coef")
if (nG > 1L) Variances$group <- group.label[Variances$group]
class(Variances) <- c("lavaan.data.frame","data.frame")
attr(Variances, "header") <- paste("Null-model estimates may not be as",
"accurate as saturated-model estimates.")
## start list of results
results <- list(Variances = Variances)
} else results <- list()
} else {
## covariances from saturated model, including polychorics (if applicable)
if (fiml) {
if (nG == 1L) {
covmat <- lavInspect(fit, "est")[[ifelse(method == "cor", "psi", "theta")]]
covmat <- list(covmat)
} else {
covmat <- sapply(lavInspect(fit, "est"), "[[",
i = ifelse(method == "cor", "psi", "theta"),
simplify = FALSE)
}
} else {
useImps <- sapply(fit@convergence, "[[", "converged")
m <- sum(useImps)
if (nG == 1L) {
CovList <- lapply(fit@coefList[useImps],
function(x) x[[ifelse(method == "cor", "psi", "theta")]])
covmat <- list(Reduce("+", CovList) / m)
} else {
covmat <- list()
for (i in group.label) {
groupList <- lapply(fit@coefList[useImps],"[[", i)
CovList <- lapply(groupList, function(x) x[[ifelse(method == "cor", "psi", "theta")]])
covmat[[i]] <- Reduce("+", CovList) / m
}
}
}
fmimat <- covmat
covars <- comb.results[comb.results$op == "~~", c("lhs","rhs","group","est","fmi")]
for (i in 1:nG) {
theseCovars <- covars[covars$group == i,]
if (method == "cor") {
phantomRows <- !(theseCovars$lhs %in% vars)
theseCovars <- theseCovars[phantomRows,]
}
fmimat[[i]][as.matrix(theseCovars[, 1:2])] <- theseCovars$fmi
fmimat[[i]][as.matrix(theseCovars[, 2:1])] <- theseCovars$fmi
if (method == "cor") {
## reset variable names
dimnames(fmimat[[i]]) <- dimnames(covmat[[i]]) <- list(vars, vars)
}
}
if (nG == 1L) {
Covariances <- list(coef = covmat[[1]], fmi = fmimat[[1]])
} else Covariances <- list(coef = covmat, fmi = fmimat)
## start list of results
results <- setNames(list(Covariances),
nm = ifelse(method == "cor", "Correlations", "Covariances"))
}
## Means, if applicable
if (length(numvars)) {
results$Means <- comb.results[comb.results$op == "~1" & comb.results$lhs %in% numvars,
c("lhs","group","est","fmi")]
colnames(results$Means)[c(1, 3)] <- c("variable","coef")
if (nG > 1L) results$Means$group <- group.label[results$Means$group]
class(results$Means) <- c("lavaan.data.frame","data.frame")
}
## Thresholds, if applicable
if (length(ordvars)) {
results$Thresholds <- comb.results[comb.results$op == "|",
c("lhs","rhs","group","est","fmi")]
colnames(results$Thresholds)[c(1, 2, 4)] <- c("variable","threshold","coef")
if (nG > 1L) results$Thresholds$group <- group.label[results$Thresholds$group]
class(results$Thresholds) <- c("lavaan.data.frame","data.frame")
}
results
}
semTools/R/runMI-Wald.R
### Terrence D. Jorgensen & Yves Rosseel
### Last updated: 12 March 2025
### Pooled Wald test for multiple imputations
### Borrowed source code from lavaan/R/lav_test_Wald.R
### DEPRECATED: 16 June 2024
### supplanted by lavaan.mi package
##' Wald Test for Multiple Imputations
##'
##' Wald test for testing a linear hypothesis about the parameters of lavaan
##' models fitted to multiple imputed data sets. Statistics for constraining
##' one or more free parameters in a model can be calculated from the pooled
##' point estimates and asymptotic covariance matrix of model parameters
##' using Rubin's (1987) rules, or by pooling the Wald test statistics
##' across imputed data sets (Li, Meng, Raghunathan, & Rubin, 1991).
##'
##' The constraints are specified using the `"=="` operator.
##' Both the left-hand side and the right-hand side of the equality can contain
##' a linear combination of model parameters, or a constant (like zero).
##' The model parameters must be specified by their user-specified labels from
##' the [lavaan::model.syntax()]. Names of defined parameters
##' (using the `:=` operator) can be included too.
##'
## @aliases lavTestWald.mi
##' @importFrom lavaan parTable lavListInspect
##' @importFrom stats pchisq pf
##' @importFrom methods getMethod
##'
##' @param object An object of class [OLDlavaan.mi-class].
##' @param constraints A `character` string (typically between single
##' quotes) containing one or more equality constraints.
##' See examples for more details.
##' @param test `character` indicating which pooling method to use.
##' `"D1"` or `"Rubin"` (default) indicates Rubin's (1987) rules
##' will be applied to the point estimates and the asymptotic covariance
##' matrix of model parameters, and those pooled values will be used to
##' calculate the Wald test in the usual manner. `"D2"`, `"LMRR"`,
##' or `"Li.et.al"` indicate that the complete-data Wald test statistic
##' should be calculated using each imputed data set, which will then be
##' pooled across imputations, as described in Li, Meng, Raghunathan, & Rubin
##' (1991) and Enders (2010, chapter 8).
##' @param asymptotic `logical`. If `FALSE` (default), the pooled test
##' will be returned as an *F*-distributed statistic with numerator
##' (`df1`) and denominator (`df2`) degrees of freedom.
##' If `TRUE`, the pooled *F* statistic will be multiplied by its
##' `df1` on the assumption that its `df2` is sufficiently large that
##' the statistic will be asymptotically \eqn{\chi^2} distributed
##' with `df1`.
##' @param scale.W `logical`. If `FALSE`, the pooled
##' asymptotic covariance matrix of model parameters is calculated as the
##' weighted sum of the within-imputation and between-imputation components.
##' Otherwise, the pooled asymptotic covariance matrix of model parameters is
##' calculated by scaling the within-imputation component by the
##' average relative increase in variance (ARIV; see Enders, 2010, p. 235),
##' which is *only* consistent when requesting the *F* test (i.e.,
##' `asymptotic = FALSE`). Ignored (irrelevant) if `test = "D2"`.
##' @param omit.imps `character` vector specifying criteria for omitting
##' imputations from pooled results. Can include any of
##' `c("no.conv", "no.se", "no.npd")`, the first 2 of which are the
##' default setting, which excludes any imputations that did not
##' converge or for which standard errors could not be computed. The
##' last option (`"no.npd"`) would exclude any imputations which
##' yielded a nonpositive definite covariance matrix for observed or
##' latent variables, which would include any "improper solutions" such
##' as Heywood cases. Specific imputation numbers can also be included in this
##' argument, in case users want to apply their own custom omission criteria
##' (or simulations can use different numbers of imputations without
##' redundantly refitting the model).
##' @param verbose `logical`. If `TRUE`, print the restriction
##' matrix and the estimated restricted values.
##' @param warn `logical`. If `TRUE`, print warnings if they occur.
##'
##' @return
##' A vector containing the Wald test statistic (either an `F` or
##' \eqn{\chi^2} statistic, depending on the `asymptotic` argument),
##' the degrees of freedom (numerator and denominator, if
##' `asymptotic = FALSE`), and a *p* value. If
##' `asymptotic = FALSE`, the relative increase in variance (RIV, or
##' average for multiparameter tests: ARIV) used to calculate the denominator
##' *df* is also returned as a missing-data diagnostic, along with the
##' fraction of missing information (FMI = ARIV / (1 + ARIV)).
##'
##' @author
##' Terrence D. Jorgensen (University of Amsterdam;
##' \email{TJorgensen314@@gmail.com})
##'
##' Adapted from \pkg{lavaan} source code, written by
##' Yves Rosseel (Ghent University; \email{Yves.Rosseel@@UGent.be})
##'
##' @references
##' Enders, C. K. (2010). *Applied missing data analysis*.
##' New York, NY: Guilford.
##'
##' Li, K.-H., Meng, X.-L., Raghunathan, T. E., & Rubin, D. B. (1991).
##' Significance levels from repeated *p*-values with multiply-imputed
##' data. *Statistica Sinica, 1*(1), 65--92.
##'
##' Rubin, D. B. (1987). *Multiple imputation for nonresponse in surveys*.
##' New York, NY: Wiley.
##'
##' @seealso [lavaan::lavTestWald()]
##'
##' @examples
##'
##' ## See the new lavaan.mi package
##'
##' @name lavTestWald.mi-deprecated
##' @usage
##' lavTestWald.mi(object, constraints = NULL, test = c("D1","D2"),
##' asymptotic = FALSE, scale.W = !asymptotic,
##' omit.imps = c("no.conv","no.se"),
##' verbose = FALSE, warn = TRUE)
##' @seealso [semTools-deprecated()]
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##' @export
lavTestWald.mi <- function(object, constraints = NULL, test = c("D1","D2"),
asymptotic = FALSE, scale.W = !asymptotic,
omit.imps = c("no.conv","no.se"),
verbose = FALSE, warn = TRUE) {
.Deprecated(msg = c("\nThe runMI() and related lavaan.mi functions have been",
" deprecated and will cease to be included in future ",
"versions of semTools.\n\nSupport is still provided for ",
"analyzing lavaan.mi-class objects (e.g., compRelSEM() ",
"can estimate reliability using pooled results), which ",
"can now be created using the lavaan.mi package.\n\nThe ",
"deprecated runMI() function now creates an object of ",
"class OLDlavaan.mi, which can be analyzed using the ",
"deprecated functions in semTools, like lavTestLRT.mi(),",
" that have been updated and improved in the lavaan.mi ",
"package.\n\nFind more details help('semTools-deprecated)"))
stopifnot(inherits(object, "OLDlavaan.mi"))
useImps <- rep(TRUE, length(object@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(object@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(object@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(object@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(object@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
## custom removal by imputation number
rm.imps <- omit.imps[ which(omit.imps %in% 1:length(useImps)) ]
if (length(rm.imps)) useImps[as.numeric(rm.imps)] <- FALSE
## whatever is left
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
test <- tolower(test[1])
if (test %in% c("d2", "lmrr", "li.et.al")) test <- "D2"
if (test %in% c("d1", "rubin")) test <- "D1"
if (!test %in% c("D1","D2")) stop('Invalid choice of "test" argument.')
message('\nWald test calculated using se = "',
lavListInspect(object, "options")$se, '"\n')
if (test == "D2") {
oldCall <- object@lavListCall
if (!is.null(oldCall$parallel)) {
if (oldCall$parallel == "snow") {
oldCall$parallel <- "no"
oldCall$ncpus <- 1L
if (warn) warning("Unable to pass lavaan::lavTestWald() arguments ",
"when parallel='snow'. Switching to parallel='no'.",
" Unless using Windows, parallel='multicore' works.")
}
}
## call lavaanList() again to run lavTestWald() on each imputation
oldCall$FUN <- function(obj) {
out <- try(lavaan::lavTestWald(object = obj, constraints = constraints,
verbose = FALSE), silent = TRUE)
if (inherits(out, "try-error")) return(NULL)
do.call(c, out[1:2])
}
FIT <- eval(as.call(oldCall))
## check if there are any results
noStats <- sapply(FIT@funList, is.null)
if (all(noStats)) stop("No success using lavTestWald() on any imputations.")
## pool the complete-data Wald test statistics across imputations
chiList <- sapply(FIT@funList[ intersect(useImps, which(!noStats)) ],
"[[", i = 1)
DF <- FIT@funList[[ intersect(useImps, which(!noStats))[1] ]][[2]]
out <- calculate.D2(chiList, DF = DF, asymptotic)
class(out) <- c("lavaan.vector","numeric")
return(out)
} # else test == "D1", making 'scale.W=' relevant
## "borrowed" lavTestWald()
if (is.null(constraints) || nchar(constraints) == 0L) stop("constraints are empty")
# remove == constraints from parTable, save as list
PT <- parTable(object)
partable <- as.list(PT[PT$op != "==", ])
# parse constraints
FLAT <- lavaan::lavParseModelString( constraints )
CON <- attr(FLAT, "constraints")
LIST <- list()
if (length(CON) > 0L) {
lhs <- unlist(lapply(CON, "[[", i = "lhs"))
op <- unlist(lapply(CON, "[[", i = "op"))
rhs <- unlist(lapply(CON, "[[", i = "rhs"))
LIST$lhs <- c(LIST$lhs, lhs) # FIXME: why concatenate with NULL?
LIST$op <- c(LIST$op, op)
LIST$rhs <- c(LIST$rhs, rhs)
} else stop("no equality constraints found in constraints argument")
# theta = free parameters only (equality-constrained allowed)
theta <- getMethod("coef", "OLDlavaan.mi")(object, omit.imps = omit.imps) #object@optim$x
# build constraint function
ceq.function <- lavaan::lav_partable_constraints_ceq(partable = partable,
con = LIST, debug = FALSE)
# compute jacobian restrictions
JAC <- try(lavaan::lav_func_jacobian_complex(func = ceq.function, x = theta),
silent = TRUE)
if (inherits(JAC, "try-error")) { # eg. pnorm()
JAC <- lavaan::lav_func_jacobian_simple(func = ceq.function, x = theta)
}
if (verbose) {cat("Restriction matrix (jacobian):\n"); print(JAC); cat("\n")}
# linear restriction
theta.r <- ceq.function( theta )
if (verbose) {cat("Restricted theta values:\n"); print(theta.r); cat("\n")}
# get VCOV
VCOV <- getMethod("vcov","OLDlavaan.mi")(object, scale.W = scale.W,
omit.imps = omit.imps)
# restricted vcov
info.r <- JAC %*% VCOV %*% t(JAC)
# Wald test statistic
test.stat <- as.numeric(t(theta.r) %*% solve( info.r ) %*% theta.r)
# number of constraints (k in Enders (2010, p. 235) eqs. 8.23-25)
DF <- nrow(JAC)
if (asymptotic) {
out <- c("chisq" = test.stat, df = DF,
pvalue = pchisq(test.stat, df = DF, lower.tail = FALSE))
} else {
W <- getMethod("vcov", "OLDlavaan.mi")(object, type = "within",
omit.imps = omit.imps)
B <- getMethod("vcov", "OLDlavaan.mi")(object, type = "between",
omit.imps = omit.imps)
#FIXME: only valid for linear constraints?
## restricted B & W components of VCOV
W.r <- JAC %*% W %*% t(JAC)
B.r <- JAC %*% B %*% t(JAC)
## relative increase in variance due to missing data
W.inv <- MASS::ginv(W.r)
ariv <- (1 + 1/m) * sum(diag(B.r %*% W.inv)) / DF
## calculate denominator DF for F statistic
a <- DF*(m - 1)
if (a > 4) {
v2 <- 4 + (a - 4) * (1 + (1 - 2/a)*(1 / ariv))^2 # Enders (eq. 8.24)
} else {
v2 <- a*(1 + 1/DF) * (1 + 1/ariv)^2 / 2 # Enders (eq. 8.25)
}
out <- c("F" = test.stat / DF, df1 = DF, df2 = v2,
pvalue = pf(test.stat / DF, df1 = DF, df2 = v2, lower.tail = FALSE),
ariv = ariv, fmi = ariv / (1 + ariv))
}
class(out) <- c("lavaan.vector","numeric")
out
}
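## Illustration (hypothetical helper, not part of the package API): the pooled
## Wald quadratic form at the heart of test = "D1". Assume the constraints
## have already been evaluated at the pooled estimates ('theta.r'), along with
## their Jacobian ('JAC') and the pooled asymptotic covariance matrix ('VCOV').
sketchWaldD1 <- function(theta.r, JAC, VCOV) {
  info.r <- JAC %*% VCOV %*% t(JAC)  # covariance matrix of the restrictions
  W <- as.numeric(t(theta.r) %*% solve(info.r) %*% theta.r)
  c(chisq = W, df = nrow(JAC),
    pvalue = stats::pchisq(W, df = nrow(JAC), lower.tail = FALSE))
}
## e.g., a single constraint a - b == 0, with pooled difference .2:
## sketchWaldD1(theta.r = .2, JAC = matrix(c(1, -1), nrow = 1),
##              VCOV = diag(c(.01, .01)))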
semTools/R/monteCarloCI.R
### Terrence D. Jorgensen
### Last updated: 12 March 2025
## from http://www.da.ugent.be/cvs/pages/en/Presentations/Presentation%20Yves%20Rosseel.pdf
# dd <- read.table("http://www.statmodel.com/examples/shortform/4cat%20m.dat",
# col.names = c("intention", "intervention", "ciguse", "w"))
# myData <- do.call(rbind, lapply(1:nrow(dd), function(RR) {
# data.frame(rep(1, dd$w[RR]) %*% as.matrix(dd[RR, 1:3]))
# }))
# model <- '
# ciguse ~ c*intervention + b*intention
# intention ~ a*intervention
# # label threshold for ciguse
# ciguse | b0*t1
# # biased SEs
# naive.indirect := a*b
# naive.direct := c
# # correct
# probit11 := (-b0+c+b*a)/sqrt(b^2+1)
# probit10 := (-b0+c )/sqrt(b^2+1)
# probit00 := (-b0 )/sqrt(b^2+1)
# indirect := pnorm(probit11) - pnorm(probit10)
# direct := pnorm(probit10) - pnorm(probit00)
# OR.indirect := (pnorm(probit11)/(1-pnorm(probit11))) / (pnorm(probit10)/(1-pnorm(probit10)))
# OR.direct := (pnorm(probit10)/(1-pnorm(probit10))) / (pnorm(probit00)/(1-pnorm(probit00)))
# '
# fit <- sem(model, data = myData, ordered = c("ciguse","intention"))
# summary(fit, ci = TRUE)
##' Monte Carlo Confidence Intervals to Test Functions of Parameter Estimates
##'
##' Robust confidence intervals for functions of parameter estimates,
##' based on empirical sampling distributions of estimated model parameters.
##'
##' This function implements the Monte Carlo method of obtaining an empirical
##' sampling distribution of estimated model parameters, as described by
##' MacKinnon et al. (2004) for testing indirect effects in mediation models.
##' This is essentially a parametric bootstrap method, which (re)samples
##' parameters (rather than raw data) from a multivariate-normal distribution
##' with mean vector equal to estimates in `coef()` and covariance matrix
##' equal to the asymptotic covariance matrix `vcov()` of estimated parameters.
##'
##' The easiest way to use the function is to fit a SEM to data with
##' [lavaan::lavaan()], using the `:=` operator in the
##' [lavaan::model.syntax()] to specify user-defined parameters.
##' All information is then available in the resulting
##' [lavaan::lavaan-class] object. Alternatively (especially when using
##' external SEM software to fit the model), the expression(s) can be explicitly
##' passed to the function, along with the vector of estimated model parameters
##' and their associated asymptotic sampling covariance matrix (ACOV).
##' For further information on the Monte Carlo method, see MacKinnon et al.
##' (2004) and Preacher & Selig (2012).
##'
##' The asymptotic covariance matrix can be obtained easily from many popular
##' SEM software packages.
##' \itemize{
##' \item{LISREL: Including the EC option on the OU line will print the ACM
##' to a seperate file. The file contains the lower triangular elements of
##' the ACM in free format and scientific notation.}
##' \item{M*plus*: Include the command TECH3; in the OUTPUT section.
##' The ACM will be printed in the output.}
##' \item{`lavaan`: Use the [vcov()] method on the fitted [lavaan::lavaan-class]
##' object to return the ACM.}
##' }
##'
##'
##' @importFrom stats quantile
##' @importFrom methods getMethod
##' @importFrom lavaan parTable lavInspect
##'
##' @param object A object of class [lavaan::lavaan-class] in which
##' functions of parameters have already been defined using the `:=`
##' operator in `lavaan`'s [lavaan::model.syntax()]. When
##' `NULL`, users must specify `expr`, `coefs`, and `ACM`.
##' @param expr Optional `character` vector specifying functions of model
##' parameters (e.g., an indirect effect). Ideally, the vector should have
##' names, which is necessary if any user-defined parameters refer to other
##' user-defined parameters defined earlier in the vector (order matters!).
##' All parameters appearing in the vector must be provided in `coefs`,
##' or defined (as functions of `coefs`) earlier in `expr`. If
##' `length(expr) > 1L`, `nRep` samples will be drawn
##' simultaneously from a single multivariate distribution; thus,
##' `ACM` must include all parameters in `coefs`.
##' @param coefs `numeric` vector of parameter estimates used in
##' `expr`. Ignored when `object` is used.
##' @param ACM Symmetric `matrix` representing the asymptotic sampling
##' covariance matrix (ACOV) of the parameter estimates in `coefs`.
##' Ignored when `object` is used. Information on how to obtain the ACOV
##' in popular SEM software is described in **Details**.
##' @param nRep `integer`. The number of samples to draw, to obtain an
##' empirical sampling distribution of model parameters. Many thousands are
##' recommended to minimize Monte Carlo error of the estimated CIs.
##' @param standardized `logical` indicating whether to obtain CIs for the
##' fully standardized (`"std.all"`) estimates, using their asymptotic
##' sampling covariance matrix.
##' @param fast `logical` indicating whether to use a fast algorithm that
##' assumes all functions of parameters (in `object` or `expr`) use
##' standard operations. Set to `FALSE` if using (e.g.) [c()]
##' to concatenate parameters in the definition, which would have unintended
##' consequences when vectorizing functions in `expr` across sampled
##' parameters.
##' @param level `numeric` confidence level, between 0--1
##' @param na.rm `logical` passed to [stats::quantile()]
##' @param append.samples `logical` indicating whether to return the
##' simulated empirical sampling distribution of parameters (in `coefs`)
##' and functions (in `expr`) in a `list` with the results. This
##' could be useful to calculate more precise highest-density intervals (see
##' examples).
##' @param plot `logical` indicating whether to plot the empirical sampling
##' distribution of each function in `expr`
##' @param ask whether to prompt user before printing each plot
##' @param \dots arguments passed to [graphics::hist()] when
##' `plot = TRUE`.
##'
##' @return A `lavaan.data.frame` (to use lavaan's `print` method)
##' with point estimates and confidence limits of each requested function of
##' parameters in `expr` is returned. If `append.samples = TRUE`,
##' output will be a `list` with the same `$Results` along with a
##' second `data.frame` with the `$Samples` (in rows) of each
##' parameter (in columns), and an additional column for each requested
##' function of those parameters.
##'
##' @author Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @references
##' MacKinnon, D. P., Lockwood, C. M., & Williams, J. (2004). Confidence limits
##' for the indirect effect: Distribution of the product and resampling methods.
##' *Multivariate Behavioral Research, 39*(1) 99--128.
##' \doi{10.1207/s15327906mbr3901_4}
##'
##' Preacher, K. J., & Selig, J. P. (2010, July). Monte Carlo method
##' for assessing multilevel mediation: An interactive tool for creating
##' confidence intervals for indirect effects in 1-1-1 multilevel models.
##' Computer software.
##'
##' Preacher, K. J., & Selig, J. P. (2012). Advantages of Monte Carlo confidence
##' intervals for indirect effects. *Communication Methods and Measures, 6*(2),
##' 77--98. \doi{10.1080/19312458.2012.679848}
##'
##' Selig, J. P., & Preacher, K. J. (2008, June). Monte Carlo method for
##' assessing mediation: An interactive tool for creating confidence intervals
##' for indirect effects. Computer software.
##'
##' @aliases monteCarloCI monteCarloMed
##'
##' @examples
##'
##' ## From the mediation tutorial:
##' ## http://lavaan.ugent.be/tutorial/mediation.html
##'
##' set.seed(1234)
##' X <- rnorm(100)
##' M <- 0.5*X + rnorm(100)
##' Y <- 0.7*M + rnorm(100)
##' dat <- data.frame(X = X, Y = Y, M = M)
##'
##' mod <- ' # direct effect
##' Y ~ c*X
##' # mediator
##' M ~ a*X
##' Y ~ b*M
##' # indirect effect (a*b)
##' ind := a*b
##' # total effect
##' total := ind + c
##' '
##' fit <- sem(mod, data = dat)
##' summary(fit, ci = TRUE) # print delta-method CIs
##'
##' ## Automatically extract information from lavaan object
##' set.seed(1234)
##' monteCarloCI(fit) # CIs more robust than delta method in smaller samples
##'
##' ## delta method for standardized solution
##' standardizedSolution(fit)
##' ## compare to Monte Carlo CIs:
##' set.seed(1234)
##' monteCarloCI(fit, standardized = TRUE)
##'
##' \donttest{
##' ## save samples to calculate more precise intervals:
##' set.seed(1234)
##' foo <- monteCarloCI(fit, append.samples = TRUE)
##' # library(HDInterval) # not a dependency; must be installed
##' # hdi(foo$Samples)
##' }
##' ## Parameters can also be obtained from an external analysis
##' myParams <- c("a","b","c")
##' (coefs <- coef(fit)[myParams]) # names must match those in the "expression"
##' ## Asymptotic covariance matrix from an external analysis
##' (AsyCovMat <- vcov(fit)[myParams, myParams])
##' ## Compute CI, include a plot
##' set.seed(1234)
##' monteCarloCI(expr = c(ind = 'a*b', total = 'ind + c',
##' ## other arbitrary functions are also possible
##' meaningless = 'sqrt(a)^b / log(abs(c))'),
##' coefs = coefs, ACM = AsyCovMat,
##' plot = TRUE, ask = TRUE) # print a plot for each
##'
##' @export
monteCarloCI <- function(object = NULL, expr, coefs, ACM, nRep = 2e4,
standardized = FALSE, fast = TRUE, level = 0.95,
na.rm = TRUE, append.samples = FALSE, plot = FALSE,
ask = getOption("device.ask.default"), ...) {
if (inherits(object, c("lavaan","lavaan.mi"))) {
## extract user-defined parameters from parTable (order of user-defined
PT <- parTable(object) # parameters must be correct for model to be fitted)
## create expression vector
expr <- PT$rhs[PT$op == ":="]
names(expr) <- PT$lhs[PT$op == ":="]
if (length(expr) == 0L) stop('No user-defined parameters found.')
}
## provide names if there are none
if (is.null(names(expr))) names(expr) <- expr
## Get names and the number of unique variables in the expression
funcVars <- unique(do.call("c", lapply(paste("~", expr), function(x) {
all.vars(stats::as.formula(x))
})))
## isolate names of model parameters (not user-defined), which get sampled
if (inherits(object, "lavaan")) {
if (standardized) {
STD <- lavaan::standardizedSolution(object)
coefRows <- !(STD$op %in% c(":=","==","<",">","<=",">="))
coefs <- STD$est.std[coefRows]
names(coefs) <- lavaan::lav_partable_labels(STD[coefRows, ])
} else coefs <- lavaan::coef(object)
} else if (inherits(object, "lavaan.mi")) {
if (!"package:lavaan.mi" %in% search()) attachNamespace("lavaan.mi")
if (standardized) {
## only available after this lavaan version:
if ( utils::packageDescription("lavaan", fields = "Version") < "0.6-19" ||
(utils::packageDescription("lavaan", fields = "Version") > "0.6-19" &&
utils::packageDescription("lavaan", fields = "Version") < "0.6-19.2148") ) {
stop("standardized=TRUE for a lavaan.mi-class object requires lavaan ",
"version >= 0.6-19 from CRAN, or development version >= ",
"0.6-19.2148 from GitHub")
}
STD <- lavaan.mi::standardizedSolution.mi(object)
coefRows <- !(STD$op %in% c(":=","==","<",">","<=",">="))
coefs <- STD$est.std[coefRows]
names(coefs) <- lavaan::lav_partable_labels(STD[coefRows, ])
} else coefs <- getMethod(f = "coef", signature = "lavaan.mi",
where = getNamespace("lavaan.mi"))(object)
}
sampVars <- intersect(names(coefs), funcVars)
## If a lavaan(.mi) object is provided, extract coefs and ACM
if (inherits(object, "lavaan")) {
coefs <- coefs[sampVars]
if (standardized) {
ACM <- lavInspect(object, "vcov.std.all")[sampVars, sampVars]
} else {
ACM <- lavaan::vcov(object)[sampVars, sampVars]
}
} else if (inherits(object, "lavaan.mi")) {
coefs <- coefs[sampVars]
if (standardized) {
ACM <- lavaan.mi::standardizedSolution.mi(object, return.vcov = TRUE,
type = "std.all")[sampVars, sampVars]
} else {
ACM <- getMethod(f = "vcov", signature = "lavaan.mi",
where = getNamespace("lavaan.mi"))(object)[sampVars, sampVars]
}
}
## Apply the expression(s) to POINT ESTIMATES
estList <- as.list(coefs)
for (i in seq_along(expr)) {
estList[names(expr[i])] <- eval(parse(text = expr[i]), envir = estList)
}
EST <- data.frame(est = do.call("c", estList[names(expr)]))
## old, buggy code (see issue #142)
# estList <- within(as.list(coefs), expr = {
# for (i in seq_along(expr)) assign(names(expr[i]), eval(parse(text = expr[i])))
# })[names(expr)]
# EST <- data.frame(est = do.call("c", estList))
rownames(EST) <- names(expr)
if (standardized && inherits(object, "lavaan")) colnames(EST) <- "est.std"
## Matrix of sampled values
# dat <-
samples <- data.frame(mnormt::rmnorm(n = nRep, mean = coefs, varcov = ACM))
## Apply the expression(s) to VECTORS of ESTIMATES
if (fast) {
for (i in seq_along(expr)) {
samples[names(expr[i])] <- eval(parse(text = expr[i]), envir = samples)
}
## old, buggy code (see issue #142)
# samples <- within(dat, expr = {
# for (i in seq_along(expr)) assign(names(expr[i]), eval(parse(text = expr[i])))
# })[c(sampVars, names(expr))]
} else {
## SLOWER: only necessary if expr creates objects using (e.g.) c(), which
## would concatenate parameters ACROSS samples as well as WITHIN
datList <- lapply(1:nRep, function(Rep) {
samples[Rep,] # dat[Rep,]
})
samples <- do.call(rbind, lapply(datList, function(Rep) {
for (i in seq_along(expr)) {
Rep[names(expr[i])] <- eval(parse(text = expr[i]), envir = Rep)
}
## old, buggy code (see issue #142)
# within(Rep, expr = {
# for (i in seq_along(expr)) assign(names(expr[i]), eval(parse(text = expr[i])))
# })
Rep
})) # [c(sampVars, names(expr))]
}
## Get the CI(s)
halfAlpha <- (1-level)/2
Limits <- lapply(samples[names(expr)], quantile, na.rm = na.rm,
probs = c(halfAlpha, 1 - halfAlpha))
CIs <- data.frame(do.call("rbind", Limits))
rownames(CIs) <- names(expr)
colnames(CIs) <- c("ci.lower","ci.upper")
## Switch for outputting a plot
if (plot) {
if (length(expr) > 1L && ask) {
opar <- grDevices::devAskNewPage()
grDevices::devAskNewPage(ask = TRUE)
}
for (i in seq_along(expr)) {
histArgs <- list(...)
histArgs$x <- samples[[ names(expr)[i] ]]
if (is.null(histArgs$breaks)) histArgs$breaks <- "FD"
if (is.null(histArgs$xlab)) histArgs$xlab <- paste0(level*100, '% Confidence Interval')
if (is.null(histArgs$main)) histArgs$main <- paste('Distribution of', names(expr)[i])
do.call("hist", histArgs)
abline(v = EST[i,1], lwd = 3)
abline(v = CIs[i,1:2], lwd = 2, lty = "dashed")
}
if (length(expr) > 1L && ask) grDevices::devAskNewPage(ask = opar)
}
## Always return point and interval estimates
out <- cbind(EST, CIs)
class(out) <- c("lavaan.data.frame","data.frame")
## also simulated values? (e.g., to calculate highest-density intervals)
if (append.samples) return(list(Results = out, Samples = samples))
out
}
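## Illustration (hypothetical helper, not exported): the core Monte Carlo step
## for a simple indirect effect a*b, assuming named point estimates and their
## ACOV from any SEM software. monteCarloCI() generalizes this to arbitrary
## (and chained) expressions, plots, and lavaan(.mi) objects.
sketchMonteCarloCI <- function(coefs, ACM, nRep = 2e4, level = .95) {
  samples <- mnormt::rmnorm(n = nRep, mean = coefs, varcov = ACM)
  colnames(samples) <- names(coefs)        # ensure columns keep parameter labels
  ab <- samples[ , "a"] * samples[ , "b"]  # empirical sampling distribution
  halfAlpha <- (1 - level) / 2
  stats::quantile(ab, probs = c(halfAlpha, 1 - halfAlpha))
}
## e.g., sketchMonteCarloCI(coefs = c(a = .5, b = .7), ACM = diag(c(.01, .02)))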
#FIXME: Remove after a few version updates
monteCarloMed <- function(...) {
.Defunct("monteCarloCI",
msg = "monteCarloMed() has been replaced by monteCarloCI()")
}
semTools/R/powerAnalysisRMSEA.R
### Sunthud Pornprasertmanit, Alexander M. Schoemann, Kristopher J. Preacher, Donna Coffman
### Last updated: 22 April 2022
##' Plot power curves for RMSEA
##'
##' Plots power of RMSEA over a range of sample sizes
##'
##' This function creates a plot of power for RMSEA against a range of sample
##' sizes. The plot places sample size on the horizontal axis and power on the
##' vertical axis. The user should indicate the lower and upper values for
##' sample size and the sample size between each estimate ("step size"). We
##' strongly urge the user to read the sources below (see References) before
##' proceeding. This function is also implemented in the web application
##' "power4SEM" (see Jak et al., 2021, in the References).
##'
##'
##'
##' @importFrom stats qchisq pchisq
##'
##' @param rmsea0 Null RMSEA
##' @param rmseaA Alternative RMSEA
##' @param df Model degrees of freedom
##' @param nlow Lower sample size
##' @param nhigh Upper sample size
##' @param steps Increase in sample size for each iteration. Smaller values of
##' steps will lead to more precise plots. However, smaller step sizes mean a
##' longer run time.
##' @param alpha Alpha level used in power calculations
##' @param group The number of groups used to calculate RMSEA.
##' @param \dots Additional arguments passed to the plot function.
##'
##' @return Plot of power for RMSEA against a range of sample sizes
##'
##' @author
##' Alexander M. Schoemann (East Carolina University; \email{schoemanna@@ecu.edu})
##'
##' Kristopher J. Preacher (Vanderbilt University; \email{kris.preacher@@vanderbilt.edu})
##'
##' Donna L. Coffman (Pennsylvania State University; \email{dlc30@@psu.edu})
##'
##' @seealso \itemize{
##' \item [plotRMSEAdist()] to visualize the RMSEA distributions
##' \item [findRMSEApower()] to find the statistical power based on
##' population RMSEA given a sample size
##' \item [findRMSEAsamplesize()] to find the minimum sample size for
##' a given statistical power based on population RMSEA
##' }
##'
##' @references
##' MacCallum, R. C., Browne, M. W., & Cai, L. (2006). Testing
##' differences between nested covariance structure models: Power analysis and
##' null hypotheses. *Psychological Methods, 11*(1), 19--35.
##' \doi{10.1037/1082-989X.11.1.19}
##'
##' MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power analysis
##' and determination of sample size for covariance structure modeling.
##' *Psychological Methods, 1*(2), 130--149. \doi{10.1037/1082-989X.1.2.130}
##'
##' MacCallum, R. C., Lee, T., & Browne, M. W. (2010). The issue of isopower in
##' power analysis for tests of structural equation models. *Structural
##' Equation Modeling, 17*(1), 23--41. \doi{10.1080/10705510903438906}
##'
##' Preacher, K. J., Cai, L., & MacCallum, R. C. (2007). Alternatives to
##' traditional model comparison strategies for covariance structure models. In
##' T. D. Little, J. A. Bovaird, & N. A. Card (Eds.), *Modeling contextual
##' effects in longitudinal studies* (pp. 33--62). Mahwah, NJ: Lawrence Erlbaum
##' Associates.
##'
##' Steiger, J. H. (1998). A note on multiple sample extensions of the RMSEA fit
##' index. *Structural Equation Modeling, 5*(4), 411--419.
##' \doi{10.1080/10705519809540115}
##'
##' Steiger, J. H., & Lind, J. C. (1980, June). *Statistically based tests
##' for the number of factors.* Paper presented at the annual meeting of the
##' Psychometric Society, Iowa City, IA.
##'
##' Jak, S., Jorgensen, T. D., Verdam, M. G., Oort, F. J., & Elffers, L.
##' (2021). Analytical power calculations for structural equation modeling:
##' A tutorial and Shiny app. *Behavior Research Methods, 53*, 1385--1406.
##' \doi{10.3758/s13428-020-01479-0}
##'
##' @examples
##'
##' plotRMSEApower(rmsea0 = .025, rmseaA = .075, df = 23,
##' nlow = 100, nhigh = 500, steps = 10)
##'
##' @export
plotRMSEApower <- function(rmsea0, rmseaA, df, nlow, nhigh, steps = 1,
alpha = .05, group = 1, ...) {
  pow1 <- 0
  nseq <- seq(nlow, nhigh, by = steps)
  for (i in nseq) {
    ncp0 <- ((i - 1) * df * rmsea0^2) / group
    ncpa <- ((i - 1) * df * rmseaA^2) / group
    ## compute power for the test of close fit (rmsea0 < rmseaA) or the
    ## test of not-close fit (otherwise)
    if (rmsea0 < rmseaA) {
      cval <- qchisq(alpha, df, ncp = ncp0, lower.tail = FALSE)
      pow <- pchisq(cval, df, ncp = ncpa, lower.tail = FALSE)
    } else {
      cval <- qchisq(1 - alpha, df, ncp = ncp0, lower.tail = FALSE)
      pow <- 1 - pchisq(cval, df, ncp = ncpa, lower.tail = FALSE)
    }
    pow1 <- c(pow1, pow)
  }
  pow1 <- pow1[-1]
  plot(nseq, pow1, xlab = "Sample Size", ylab = "Power",
       main = "Compute Power for RMSEA", type = "l", ...)
}
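## --- Worked example (added commentary, not part of the package) ---
## One pass of the loop above for a test of close fit (rmsea0 < rmseaA),
## using hypothetical values:
## rmsea0 <- .05; rmseaA <- .08; df <- 20; N <- 200; alpha <- .05; group <- 1
## ncp0 <- ((N - 1) * df * rmsea0^2) / group # null noncentrality
## ncpa <- ((N - 1) * df * rmseaA^2) / group # alternative noncentrality
## cval <- qchisq(alpha, df, ncp = ncp0, lower.tail = FALSE)
## pchisq(cval, df, ncp = ncpa, lower.tail = FALSE) # power at N = 200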
##' Plot the sampling distributions of RMSEA
##'
##' Plots the sampling distributions of RMSEA based on the noncentral chi-square
##' distributions
##'
##' This function creates overlapping plots of the sampling distribution of
##' RMSEA based on the noncentral \eqn{\chi^2} distribution (MacCallum, Browne,
##' & Sugawara, 1996). First, the noncentrality parameter (\eqn{\lambda}) is
##' calculated from RMSEA (Steiger, 1998; Dudgeon, 2004) by \deqn{\lambda = (N -
##' 1)d\varepsilon^2 / K,} where \eqn{N} is the sample size, \eqn{d} is the model
##' degrees of freedom, \eqn{K} is the number of groups, and \eqn{\varepsilon} is
##' the population RMSEA. Next, the noncentral \eqn{\chi^2} distribution with a
##' specified *df* and noncentrality parameter is plotted. Thus,
##' the x-axis represents the sample \eqn{\chi^2} value. The sample \eqn{\chi^2}
##' value can be transformed to the sample RMSEA scale (\eqn{\hat{\varepsilon}})
##' by \deqn{\hat{\varepsilon} = \sqrt{K}\sqrt{\frac{\chi^2 - d}{(N - 1)d}},}
##' where \eqn{\chi^2} is the \eqn{\chi^2} value obtained from the noncentral
##' \eqn{\chi^2} distribution.
##'
##'
##' @importFrom stats qchisq
##'
##' @param rmsea The vector of RMSEA values to be plotted
##' @param n Sample size of a dataset
##' @param df Model degrees of freedom
##' @param ptile The percentile rank of the distribution of the first RMSEA at
##' which a vertical line is drawn in the resulting graph
##' @param caption The name vector of each element of `rmsea`
##' @param rmseaScale If `TRUE`, the RMSEA scale is used in the x-axis. If
##' `FALSE`, the chi-square scale is used in the x-axis.
##' @param group The number of groups used to calculate RMSEA.
##'
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' @seealso \itemize{
##' \item [plotRMSEApower()] to plot the statistical power
##' based on population RMSEA given the sample size
##' \item [findRMSEApower()] to find the statistical power based on
##' population RMSEA given a sample size
##' \item [findRMSEAsamplesize()] to find the minimum sample size for
##' a given statistical power based on population RMSEA
##' }
##'
##' @references
##' Dudgeon, P. (2004). A note on extending Steiger's (1998)
##' multiple sample RMSEA adjustment to other noncentrality parameter-based
##' statistic. *Structural Equation Modeling, 11*(3), 305--319.
##' \doi{10.1207/s15328007sem1103_1}
##'
##' MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power analysis
##' and determination of sample size for covariance structure modeling.
##' *Psychological Methods, 1*(2), 130--149. \doi{10.1037/1082-989X.1.2.130}
##'
##' Steiger, J. H. (1998). A note on multiple sample extensions of the RMSEA fit
##' index. *Structural Equation Modeling, 5*(4), 411--419.
##' \doi{10.1080/10705519809540115}
##'
##' @examples
##'
##' plotRMSEAdist(c(.05, .08), n = 200, df = 20, ptile = .95, rmseaScale = TRUE)
##' plotRMSEAdist(c(.05, .01), n = 200, df = 20, ptile = .05, rmseaScale = FALSE)
##'
##' @export
plotRMSEAdist <- function(rmsea, n, df, ptile = NULL, caption = NULL,
rmseaScale = TRUE, group = 1) {
graph <- cbind(rmsea, df)
ncp <- apply(graph, MARGIN = 1,
FUN = function(x, n, group) ((n - 1) * x[2] * (x[1]^2))/group,
n = n, group = group)
graph <- cbind(graph, ncp)
  dens <- lapply(as.list(data.frame(t(graph))),
                 function(x) findDensity("chisq", df = x[2], ncp = x[3]))
  if (rmseaScale) dens <- lapply(dens, function(x, df, n, group) {
    x[,1] <- (x[,1] - df) / (n - 1)        # (chi^2 - d) / (N - 1)
    x[(x[,1] < 0), 1] <- 0                 # truncate negative values at zero
    x[,1] <- sqrt(group) * sqrt(x[,1]/df)  # convert to the RMSEA scale
    return(x)
  }, df = df, n = n, group = group)
cutoff <- NULL
if (!is.null(ptile)) {
cutoff <- qchisq(ptile, df = graph[1, 2], ncp = graph[1, 3])
if (rmseaScale) cutoff <- sqrt(group) * sqrt((cutoff - df)/(df * (n - 1)))
}
if (is.null(caption)) caption <- sapply(graph[,1],
function(x) paste("Population RMSEA = ",
format(x, digits = 3),
sep = ""))
plotOverlapDensity(dens, cutoff, caption, ylab = "Density",
xlab = ifelse(rmseaScale, "RMSEA", "Chi-Square"))
equal0 <- sapply(dens, function(x) x[,1] == 0)
if (any(equal0)) warning("The density at RMSEA = 0 cannot be trusted",
" because the plots are truncated.")
}
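## --- Worked example (added commentary, not part of the package) ---
## The chi-square-to-RMSEA transformation described in the documentation
## above, with hypothetical values (N = 200, d = 20, K = 1 group):
## N <- 200; d <- 20; K <- 1; rmsea <- .05
## lambda <- (N - 1) * d * rmsea^2 / K       # noncentrality from RMSEA
## chi <- qchisq(.95, df = d, ncp = lambda)  # a quantile of the distribution
## sqrt(K) * sqrt(max(chi - d, 0) / ((N - 1) * d)) # back to the RMSEA scale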
##' Find the statistical power based on population RMSEA
##'
##' Find the proportion of the samples from the sampling distribution of RMSEA
##' in the alternative hypothesis rejected by the cutoff derived from the
##' sampling distribution of RMSEA in the null hypothesis. This function can be
##' applied for both the test of close fit and the test of not-close fit
##' (MacCallum, Browne, & Sugawara, 1996).
##'
##' This function finds the proportion of the sampling distribution derived from the
##' alternative RMSEA that is in the critical region derived from the sampling
##' distribution of the null RMSEA. If `rmseaA` is greater than
##' `rmsea0`, the test of close fit is used and the critical region is in
##' the right hand side of the null sampling distribution. On the other hand, if
##' `rmseaA` is less than `rmsea0`, the test of not-close fit is used
##' and the critical region is in the left hand side of the null sampling
##' distribution (MacCallum, Browne, & Sugawara, 1996).
##'
##' There is also a Shiny app called "power4SEM" that provides a graphical user
##' interface for this functionality (Jak et al., 2021). It can be accessed
##' at .
##'
##'
##' @importFrom stats qchisq pchisq
##'
##' @param rmsea0 Null RMSEA
##' @param rmseaA Alternative RMSEA
##' @param df Model degrees of freedom
##' @param n Sample size of a dataset
##' @param alpha Alpha level used in power calculations
##' @param group The number of groups used to calculate RMSEA.
##'
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' @seealso \itemize{
##' \item [plotRMSEApower()] to plot the statistical power based on
##' population RMSEA given the sample size
##' \item [plotRMSEAdist()] to visualize the RMSEA distributions
##' \item [findRMSEAsamplesize()] to find the minimum sample size for
##' a given statistical power based on population RMSEA
##' }
##'
##' @references
##' MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power analysis
##' and determination of sample size for covariance structure modeling.
##' *Psychological Methods, 1*(2), 130--149. \doi{10.1037/1082-989X.1.2.130}
##'
##' Jak, S., Jorgensen, T. D., Verdam, M. G., Oort, F. J., & Elffers, L.
##' (2021). Analytical power calculations for structural equation modeling:
##' A tutorial and Shiny app. *Behavior Research Methods, 53*, 1385--1406.
##' \doi{10.3758/s13428-020-01479-0}
##'
##' @examples
##'
##' findRMSEApower(rmsea0 = .05, rmseaA = .08, df = 20, n = 200)
##'
##' @export
findRMSEApower <- function(rmsea0, rmseaA, df, n, alpha = .05, group = 1) {
ncp0 <- ((n-1)*df*rmsea0^2)/group
ncpa <- ((n-1)*df*rmseaA^2)/group
  if (rmsea0 < rmseaA) {
    cval <- qchisq(alpha, df, ncp = ncp0, lower.tail = FALSE)
    pow <- pchisq(cval, df, ncp = ncpa, lower.tail = FALSE)
  } else {
    cval <- qchisq(1 - alpha, df, ncp = ncp0, lower.tail = FALSE)
    pow <- 1 - pchisq(cval, df, ncp = ncpa, lower.tail = FALSE)
  }
  pow
}
##' Find the minimum sample size for a given statistical power based on
##' population RMSEA
##'
##' Find the minimum sample size required to achieve a given statistical power,
##' based on the same null and alternative sampling distributions of RMSEA
##' used by [findRMSEApower()] (MacCallum, Browne, & Sugawara, 1996).
##'
##' @importFrom stats qchisq pchisq
##'
##' @param rmsea0 Null RMSEA
##' @param rmseaA Alternative RMSEA
##' @param df Model degrees of freedom
##' @param power Desired statistical power
##' @param alpha Alpha level used in power calculations
##' @param group The number of groups used to calculate RMSEA.
##'
##' @export
findRMSEAsamplesize <- function(rmsea0, rmseaA, df, power = 0.80, alpha = .05,
                                group = 1) {
  n <- 5:500
  pow <- findRMSEApower(rmsea0, rmseaA, df, n, alpha, group = group)
  if (all(pow > power)) {
return("Sample Size <= 5")
} else if (all(power > pow)) {
repeat {
n <- n + 500
pow <- findRMSEApower(rmsea0, rmseaA, df, n, alpha, group=group)
if(any(pow > power)) {
index <- which(pow > power)[1]
return(n[index]/group)
}
}
} else {
index <- which(pow > power)[1]
return(n[index]/group)
}
}
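## --- Example usage (added commentary, not part of the package) ---
## Minimum N for 80% power to reject close fit (H0: RMSEA = .05) when the
## population RMSEA is .08, with hypothetical df = 20:
## findRMSEAsamplesize(rmsea0 = .05, rmseaA = .08, df = 20, power = 0.80)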
## ----------------
## Hidden Functions
## ----------------
## findDensity
## Find the x and y coordinate of a distribution in order to plot a density of a distribution
## dist: target distribution in text, such as "chisq"
## ...: Additional argument of the distribution
## Return the data frame with x and y coordinates for plotting density
findDensity <- function(dist, ...) {
FUN <- list()
FUN[[1]] <- get(paste("q", dist, sep=""))
FUN[[2]] <- c(0.001, 0.999)
FUN <- c(FUN, ...)
bound <- eval(as.call(FUN))
val <- seq(bound[1], bound[2], length.out=1000)
FUN[[1]] <- get(paste("d", dist, sep=""))
FUN[[2]] <- val
height <- eval(as.call(FUN))
return(cbind(val, height))
}
##Example Code
##findDensity("chisq", df=10)
## plotOverlapDensity
## Plot the overlapping distributions using density
## dat: A list of data frame where each data frame has the x coordinate as the variable 1 and y coordinate as the variable 2
## vline: Vertical line in the graph
## caption: The name of each density line
## ...: Additional argument of the plot function
plotOverlapDensity <- function(dat, vline = NULL, caption = NULL, ...) {
if (!is.list(dat)) {
temp <- list()
temp[[1]] <- dat
dat <- temp
}
stack <- do.call(rbind, dat)
lim <- apply(stack, 2, function(x) c(min(x), max(x)))
plot(stack, xlim = lim[,1], ylim = lim[,2], type = "n", ...)
  for (i in seq_along(dat)) lines(dat[[i]], col = i, lwd = 1.5)
  for (i in seq_along(vline)) abline(v = vline[i], lwd = 1.5) # seq_along() skips the loop when vline is NULL
  if (!is.null(caption))
    legend(0.50 * (lim[2,1] - lim[1,1]) + lim[1,1],
           0.99 * (lim[2,2] - lim[1,2]) + lim[1,2],
           caption, col = 1:length(dat), lty = 1)
}
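## --- Example usage of the two helpers together (added commentary) ---
## d1 <- findDensity("chisq", df = 10)
## d2 <- findDensity("chisq", df = 20)
## plotOverlapDensity(list(d1, d2), vline = qchisq(.95, df = 10),
##                    caption = c("df = 10", "df = 20"),
##                    xlab = "Chi-Square", ylab = "Density")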
semTools/R/EmpKaiser.R 0000644 0001762 0000144 00000013653 14632144122 014272 0 ustar ligges users ### Ylenio Longo
### Last updated: 10 January 2021
##' Empirical Kaiser criterion
##'
##' Identify the number of factors to extract based on the Empirical Kaiser
##' Criterion (EKC). The analysis can be run on a `data.frame` or data
##' `matrix` (`data`), or on a correlation or covariance matrix
##' (`sample.cov`) and the sample size (`sample.nobs`). A
##' `data.frame` is returned with two columns: the eigenvalues from your
##' data or covariance matrix and the reference eigenvalues. The number of
##' factors suggested by the Empirical Kaiser Criterion (i.e. the sample
##' eigenvalues greater than the reference eigenvalues), and the number of
##' factors suggested by the original Kaiser Criterion
##' (i.e. sample eigenvalues > 1) are printed above the output.
##'
##'
##' @importFrom stats cov cov2cor
##'
##' @param data A `data.frame` or data `matrix` containing columns of
##' variables to be factor-analyzed.
##' @param sample.cov A covariance or correlation matrix can be used, instead of
##' `data`, to estimate the eigenvalues.
##' @param sample.nobs Number of observations (i.e. sample size) if
##' `is.null(data)` and `sample.cov=` is used.
##' @param missing If `"listwise"`, incomplete cases are removed listwise from
##' the `data.frame`. If `"direct"` or `"ml"` or `"fiml"` and the `estimator=`
##' is maximum likelihood, an EM algorithm is used to estimate an unrestricted
##' covariance matrix (and mean vector). If `"pairwise"`, pairwise deletion is
##' used. If `"default"`, the value is set depending on the estimator and the
##' mimic option (see [lavaan::lavCor()] for details).
##' @param ordered `character` vector. Only used if object is a `data.frame`.
##' Treat these variables as `ordered=` (ordinal) variables. Importantly, all
##' other variables will be treated as numeric (unless `is.ordered == TRUE` in
##' `data`). (see also [lavCor][lavaan::lavCor])
##' @param plot logical. Whether to print a scree plot comparing the sample
##' eigenvalues with the reference eigenvalues.
##' @return A `data.frame` showing the sample and reference eigenvalues.
##'
##' The number of factors suggested by the Empirical Kaiser Criterion (i.e. the
##' sample eigenvalues greater than the reference eigenvalues) is returned as an
##' attribute (see **Examples**).
##'
##' The number of factors suggested by the original Kaiser Criterion (i.e.
##' sample eigenvalues > 1) is also printed as a header to the `data.frame`
##'
##' @author Ylenio Longo (University of Nottingham;
##' \email{yleniolongo@@gmail.com})
##'
##' Terrence D. Jorgensen (University of Amsterdam;
##' \email{TJorgensen314@@gmail.com})
##'
##' @references Braeken, J., & van Assen, M. A. L. M. (2017). An empirical
##' Kaiser criterion. *Psychological Methods, 22*(3), 450--466.
##' \doi{10.1037/met0000074}
##'
##' @examples
##'
##' ## Simulate data with 3 factors
##' model <- '
##' f1 =~ .3*x1 + .5*x2 + .4*x3
##' f2 =~ .3*x4 + .5*x5 + .4*x6
##' f3 =~ .3*x7 + .5*x8 + .4*x9
##' '
##' dat <- simulateData(model, seed = 123)
##' ## save summary statistics
##' myCovMat <- cov(dat)
##' myCorMat <- cor(dat)
##' N <- nrow(dat)
##'
##' ## Run the EKC function
##' (out <- efa.ekc(dat))
##'
##' ## To extract the recommended number of factors using the EKC:
##' attr(out, "nfactors")
##'
##' ## If you do not have raw data, you can use summary statistics
##' (x1 <- efa.ekc(sample.cov = myCovMat, sample.nobs = N, plot = FALSE))
##' (x2 <- efa.ekc(sample.cov = myCorMat, sample.nobs = N, plot = FALSE))
##'
##' @export
efa.ekc <- function(data = NULL, sample.cov = NULL, sample.nobs = NULL,
missing = "default", ordered = NULL, plot = TRUE) {
## if data
if (!is.null(data)) {
data <- as.data.frame(data)
R <- lavaan::lavCor(data, missing = missing, ordered = ordered) #correlations
j <- dim(data)[2] #number of variables
n <- dim(data)[1] #sample size
} else {
    ## if covariance matrix (any diagonal element differs from 1)
    if (any(diag(sample.cov) != 1)) {
R <- cov2cor(sample.cov)
j <- dim(R)[2] #number of variables
n <- sample.nobs #sample size
} else {
## if correlation matrix
R <- sample.cov
j <- dim(R)[2] #number of variables
n <- sample.nobs #sample size
}
}
g <- j/n #gamma: var / sample
l <- (1 + sqrt(g))^2 #1st reference eigenvalue
e <- eigen(R)$values #eigenvalues
v <- cumsum(e) #Define cumulatively summed eigenvalue vector
  v1 <- v[-j] #omit last element
v2 <- c(0, v1) #put a zero upfront
w <- sort(1:j, decreasing = TRUE) #eigenvalue order vector
ref <- (((j - v2)/w) * l) #EKC reference eigenvalues
# results
Eigenvalues <- data.frame(Sample = e, Ref = ref) #sample and reference eigenvalues
rownames(Eigenvalues) <- 1:j
class(Eigenvalues) <- c("lavaan.data.frame","data.frame")
## add no. factors to extract as attribute, using each criterion
nfactors_EKC <- which(!(Eigenvalues[, 1] > Eigenvalues[, 2]))[1] - 1 # EKC
nfactors_KC <- which(!(Eigenvalues[, 1] > 1))[1] - 1 # Kaiser Criterion
attr(Eigenvalues, "header") <- paste(" Empirical Kaiser Criterion suggests",
nfactors_EKC, "factors.\n",
"Traditional Kaiser Criterion suggests",
nfactors_KC, "factors.")
attr(Eigenvalues, "nfactors") <- nfactors_EKC
if (plot) {
plot(Eigenvalues[, 1], type = "b", pch = 20, cex = 0.9, col = "black",
main = "Empirical Kaiser Criterion\nScree Plot", ylab = "Eigenvalues",
ylim = c(min(Eigenvalues), max(ceiling(Eigenvalues))),
xlab = "Factor Number", xlim = c(1, j))
lines(Eigenvalues[, 2], lty = "dashed", col = "blue")
legend("topright", c(" Data", " Empirical\n Reference", " Kaiser Criterion"),
col = c("black","blue","gray"), bty = "n",
pch = c(20, NA, NA), lty = c("solid","dashed","solid"), merge = TRUE)
abline(h = 1, col = "gray") # Kaiser Criterion
}
return(Eigenvalues)
}
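## --- Standalone sketch (added commentary, not part of the package) ---
## The EKC reference-eigenvalue recursion used above: the first reference
## value is the asymptotic upper bound (1 + sqrt(j/n))^2, and each later one
## rescales that bound by the variance left after the preceding eigenvalues.
## Hypothetical j, n, and eigenvalues (scaled so they sum to j, as the
## eigenvalues of a correlation matrix do):
## j <- 9; n <- 300
## e <- sort(rexp(j), decreasing = TRUE); e <- e * j / sum(e)
## l <- (1 + sqrt(j / n))^2
## ref <- ((j - c(0, cumsum(e)[-j])) / (j:1)) * l
## which(!(e > ref))[1] - 1  # EKC-suggested number of factors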
semTools/R/runMI-methods.R 0000644 0001762 0000144 00000203054 14764332442 015113 0 ustar ligges users ### Terrence D. Jorgensen
### Last updated: 12 February 2025
### Class and Methods for OLDlavaan.mi object, returned by runMI()
### DEPRECATED: 16 June 2024
### supplanted by lavaan.mi package
##' Class for a lavaan Model Fitted to Multiple Imputations
##'
##' This class extends the [lavaan::lavaanList-class] class, created by
##' fitting a lavaan model to a list of data sets. In this case, the data sets
##' are multiple imputations of missing data.
##'
##'
##' @name OLDlavaan.mi-class
##' @importClassesFrom lavaan lavaanList
##' @aliases OLDlavaan.mi-class show,OLDlavaan.mi-method summary,OLDlavaan.mi-method
##' fitMeasures,OLDlavaan.mi-method fitmeasures,OLDlavaan.mi-method
##' anova,OLDlavaan.mi-method nobs,OLDlavaan.mi-method coef,OLDlavaan.mi-method
##' vcov,OLDlavaan.mi-method fitted,OLDlavaan.mi-method fitted.values,OLDlavaan.mi-method
##' residuals,OLDlavaan.mi-method resid,OLDlavaan.mi-method
##' @docType class
##'
##' @slot coefList `list` of estimated coefficients in matrix format (one
##' per imputation) as output by `lavInspect(fit, "est")`
##' @slot phiList `list` of model-implied latent-variable covariance
##' matrices (one per imputation) as output by
##' `lavInspect(fit, "cov.lv")`
##' @slot miList `list` of modification indices output by
##' [lavaan::modindices()]
##' @slot seed `integer` seed set before running imputations
##' @slot lavListCall call to [lavaan::lavaanList()] used to fit the
##' model to the list of imputed data sets in `@@DataList`, stored as a
##' `list` of arguments
##' @slot imputeCall call to imputation function (if used), stored as a
##' `list` of arguments
##' @slot convergence `list` of `logical` vectors indicating whether,
##' for each imputed data set, (1) the model converged on a solution, (2)
##' *SE*s could be calculated, (3) the (residual) covariance matrix of
##' latent variables (\eqn{\Psi}) is non-positive-definite, and (4) the
##' residual covariance matrix of observed variables (\eqn{\Theta}) is
##' non-positive-definite.
##' @slot lavaanList_slots All remaining slots are from
##' [lavaan::lavaanList-class], but [runMI()] only populates a
##' subset of the `list` slots, two of them with custom information:
##' @slot DataList The `list` of imputed data sets
##' @slot SampleStatsList List of output from
##' `lavInspect(fit, "sampstat")` applied to each fitted
##' model
##' @slot ParTableList See [lavaan::lavaanList-class]
##' @slot vcovList See [lavaan::lavaanList-class]
##' @slot testList See [lavaan::lavaanList-class]
##' @slot h1List See [lavaan::lavaanList-class]. An additional element is
##' added to the `list`: `$PT` is the "saturated" model's parameter
##' table, returned by [lavaan::lav_partable_unrestricted()].
##' @slot baselineList See [lavaan::lavaanList-class]
##'
##' @param object An object of class `OLDlavaan.mi`
##' @param se,ci,level,standardized,rsquare,header,output See
##' [lavaan::parameterEstimates()]. The `output=` argument
##' can also be passed to [lavaan::fitMeasures()].
##' @param fmi `logical` indicating whether to include the Fraction Missing
##' Information (FMI) for parameter estimates in the `summary`
##' output (see **Value** section).
##' @param asymptotic `logical`. If `FALSE` (typically a default, but
##' see **Value** section for details using various methods), pooled
##' tests (of fit or pooled estimates) will be *F* or *t*
##' statistics with associated degrees of freedom (*df*). If
##' `TRUE`, the (denominator) *df* are assumed to be
##' sufficiently large for a *t* statistic to follow a normal
##' distribution, so it is printed as a *z* statisic; likewise,
##' *F* times its numerator *df* is printed, assumed to follow
##' a \eqn{\chi^2} distribution.
##' @param scale.W `logical`. If `TRUE` (default), the `vcov`
##' method will calculate the pooled covariance matrix by scaling the
##' within-imputation component by the ARIV (see Enders, 2010, p. 235,
##' for definition and formula). Otherwise, the pooled matrix is
##' calculated as the weighted sum of the within-imputation and
##' between-imputation components (see Enders, 2010, ch. 8, for details).
##' This in turn affects how the `summary()` method calculates its
##' pooled standard errors, as well as [lavTestWald.mi()].
##' @param omit.imps `character` vector specifying criteria for omitting
##' imputations from pooled results. Can include any of
##' `c("no.conv", "no.se", "no.npd")`, the first 2 of which are the
##' default setting, which excludes any imputations that did not
##' converge or for which standard errors could not be computed. The
##' last option (`"no.npd"`) would exclude any imputations which
##' yielded a nonpositive definite covariance matrix for observed or
##' latent variables, which would include any "improper solutions" such
##' as Heywood cases. NPD solutions are not excluded by default because
##' they are likely to occur due to sampling error, especially in small
##' samples. However, gross model misspecification could also cause
##' NPD solutions, so users can compare pooled results with and without
##' this setting as a sensitivity analysis to see whether some
##' imputations warrant further investigation. Specific imputation
##' numbers can also be included in this argument, in case users want to
##' apply their own custom omission criteria (or simulations can use
##' different numbers of imputations without redundantly refitting the
##' model).
##' @param labels `logical` indicating whether the `coef` output
##' should include parameter labels. Default is `TRUE`.
##' @param total `logical` (default: `TRUE`) indicating whether the
##' `nobs` method should return the total sample size or (if
##' `FALSE`) a vector of group sample sizes.
##' @param type The meaning of this argument varies depending on which method it
##' is used for. Find detailed descriptions in the **Value** section
##' under `coef`, `vcov`, and `residuals`.
##' @param fit.measures,baseline.model See [lavaan::fitMeasures()].
##' `summary(object, fit.measures = TRUE)` will print (but not
##' return) a table of fit measures to the console.
##' @param ... Additional arguments passed to [lavTestLRT.mi()], or
##' subsequently to [lavaan::lavTestLRT()].
##'
##' @return
##'
##' \item{coef}{`signature(object = "OLDlavaan.mi", type = "free",
##' labels = TRUE, omit.imps = c("no.conv","no.se"))`:
##' See [lavaan::lavaan-class]. Returns the pooled point estimates (i.e.,
##' averaged across imputed data sets; see Rubin, 1987).}
##'
##' \item{vcov}{`signature(object = "OLDlavaan.mi", scale.W = TRUE,
##' omit.imps = c("no.conv","no.se"),
##' type = c("pooled","between","within","ariv"))`: By default, returns the
##' pooled covariance matrix of parameter estimates (`type = "pooled"`),
##' the within-imputations covariance matrix (`type = "within"`), the
##' between-imputations covariance matrix (`type = "between"`), or the
##' average relative increase in variance (`type = "ariv"`) due to
##' missing data.}
##'
##' \item{fitted.values}{`signature(object = "OLDlavaan.mi",
##' omit.imps = c("no.conv","no.se"))`: See [lavaan::lavaan-class].
##' Returns model-implied moments, evaluated at the pooled point estimates.}
##' \item{fitted}{alias for `fitted.values`}
##'
##' \item{residuals}{`signature(object = "OLDlavaan.mi",
##' type = c("raw","cor"), omit.imps = c("no.conv","no.se"))`:
##' See [lavaan::lavaan-class]. By default (`type = "raw"`), returns
##' the difference between the model-implied moments from `fitted.values`
##' and the pooled observed moments (i.e., averaged across imputed data sets).
##' Standardized residuals are also available, using Bollen's
##' (`type = "cor"` or `"cor.bollen"`) or Bentler's
##' (`type = "cor.bentler"`) formulas.}
##' \item{resid}{alias for `residuals`}
##'
##' \item{nobs}{`signature(object = "OLDlavaan.mi", total = TRUE)`: either
##' the total (default) sample size or a vector of group sample sizes
##' (`total = FALSE`).}
##'
##' \item{anova}{`signature(object = "OLDlavaan.mi", ...)`:
##' Returns a test of model fit for a single model (`object`) or test(s)
##' of the difference(s) in fit between nested models passed via `...`.
##' See [lavTestLRT.mi()] and [compareFit()] for details.}
##'
##' \item{fitMeasures}{`signature(object = "OLDlavaan.mi",
##' fit.measures = "all", baseline.model = NULL, output = "vector",
##' omit.imps = c("no.conv","no.se"), ...)`: See lavaan's
##' [lavaan::fitMeasures()] for details. Pass additional arguments
##' to [lavTestLRT.mi()] via `...`.}
##' \item{fitmeasures}{alias for `fitMeasures`.}
##'
##' \item{show}{`signature(object = "OLDlavaan.mi")`: returns a message about
##' convergence rates and estimation problems (if applicable) across imputed
##' data sets.}
##'
##' \item{summary}{`signature(object = "OLDlavaan.mi", se = TRUE, ci = FALSE,
##' level = .95, standardized = FALSE, rsquare = FALSE, fmi = FALSE,
##' scale.W = !asymptotic, omit.imps = c("no.conv","no.se"), asymptotic = FALSE,
##' header = TRUE, output = "text", fit.measures = FALSE, ...)`:
##' See [lavaan::parameterEstimates()] for details.
##' By default, `summary()` returns pooled point and *SE*
##' estimates, along with *t* test statistics and their associated
##' *df* and *p* values. If `ci = TRUE`, confidence intervals
##' are returned with the specified confidence `level` (default 95\% CI).
##' If `asymptotic = TRUE`, *z* instead of *t* tests are
##' returned. `standardized` solution(s) can also be requested by name
##' (`"std.lv"` or `"std.all"`) or both are returned with `TRUE`.
##' *R*-squared for endogenous variables can be requested, as well as the
##' Fraction Missing Information (FMI) for parameter estimates. By default, the
##' output will appear like `lavaan`'s `summary()` output, but if
##' `output == "data.frame"`, the returned `data.frame` will resemble
##' the `parameterEstimates()` output. The `scale.W` argument is
##' passed to `vcov` (see description above).
##' Setting `fit.measures=TRUE` will additionally print fit measures to
##' the console, but they will not be returned; additional arguments may be
##' passed via `...` to [lavaan::fitMeasures()] and
##' subsequently to [lavTestLRT.mi()].}
##'
##' @section Objects from the Class: See the [runMI()] function for details.
##'
##' @author Terrence D. Jorgensen (University of Amsterdam;
##' \email{TJorgensen314@@gmail.com})
##'
##' @references
##' Asparouhov, T., & Muthen, B. (2010). *Chi-square statistics
##' with multiple imputation*. Technical Report. Retrieved from
##'
##'
##' Enders, C. K. (2010). *Applied missing data analysis*. New York, NY:
##' Guilford.
##'
##' Li, K.-H., Meng, X.-L., Raghunathan, T. E., & Rubin, D. B. (1991).
##' Significance levels from repeated *p*-values with multiply-imputed
##' data. *Statistica Sinica, 1*(1), 65--92. Retrieved from
##'
##'
##' Meng, X.-L., & Rubin, D. B. (1992). Performing likelihood ratio tests with
##' multiply-imputed data sets. *Biometrika, 79*(1), 103--111.
##' \doi{10.2307/2337151}
##'
##' Rubin, D. B. (1987). *Multiple imputation for nonresponse in surveys*.
##' New York, NY: Wiley.
##'
##' @examples
##'
##' ## See the new lavaan.mi package
##'
setClass("OLDlavaan.mi", contains = "lavaanList",
slots = c(coefList = "list", # coefficients in matrix format
phiList = "list", # list of model-implied latent covariance matrices
miList = "list", # modification indices
seed = "integer", # seed set before running imputations
lavListCall = "list", # store actual call to lavaanList
imputeCall = "list", # store call from imputation, if used
convergence = "list")) # also check SEs and Heywood cases
##' @name OLDlavaan.mi-class
##' @aliases show,OLDlavaan.mi-method
##' @export
setMethod("show", "OLDlavaan.mi", function(object) {
nData <- object@meta$ndat
useImps <- sapply(object@convergence, "[[", i = "converged")
nConverged <- sum(useImps)
SE <- sapply(object@convergence, "[[", "SE")
SE[is.na(SE)] <- FALSE
Heywood.ov <- sapply(object@convergence, "[[", "Heywood.ov")
Heywood.ov[is.na(Heywood.ov)] <- FALSE
Heywood.lv <- sapply(object@convergence, "[[", "Heywood.lv")
Heywood.lv[is.na(Heywood.lv)] <- FALSE
cat('OLDlavaan.mi object based on ', nData, ' imputed data sets. \n',
'See class?OLDlavaan.mi help page for available methods. \n\n',
'Convergence information:\n', 'The model converged on ',
nConverged, ' imputed data sets \n\n', sep = "")
if (!all(SE)) cat('Standard errors could not be computed for data set(s)',
paste(which(!SE), collapse = ", "), '\nTry fitting the',
'model to the individual data set(s) to diagnose',
'problems. If they cannot be fixed, try inspecting the',
'imputations. It may be necessary to reimpute the data',
'with some restrictions imposed. \n\n')
if (any(Heywood.ov | Heywood.lv))
cat('Heywood cases detected for data set(s)',
paste(which(Heywood.ov | Heywood.lv), collapse = ", "),
'\nThese are not necessarily a cause for concern, unless a pooled',
'estimate is also a Heywood case. \n\n')
object
})
##' @importFrom stats pt qt pnorm qnorm
##' @importFrom lavaan lavListInspect parTable lavNames
##' @importFrom methods getMethod
summary_OLDlavaan.mi <- function(object, se = TRUE, ci = FALSE, level = .95,
standardized = FALSE, rsquare = FALSE,
fmi = FALSE, scale.W = !asymptotic,
omit.imps = c("no.conv","no.se"),
asymptotic = FALSE, header = TRUE,
output = "text", fit.measures = FALSE, ...) {
useImps <- rep(TRUE, length(object@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(object@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(object@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(object@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(object@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
## custom removal by imputation number
rm.imps <- omit.imps[ which(omit.imps %in% 1:length(useImps)) ]
if (length(rm.imps)) useImps[as.numeric(rm.imps)] <- FALSE
## whatever is left
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
lavoptions <- lavListInspect(object, "options")
## extract parameter table with attributes for printing
PT <- parTable(object)
myCols <- c("lhs","op","rhs","exo")
if (lavListInspect(object, "ngroups") > 1L) myCols <- c(myCols,"block","group")
if (lavListInspect(object, "nlevels") > 1L) myCols <- c(myCols,"block","level")
PE <- PT[ , unique(myCols)]
free <- PT$free > 0L | PT$op == ":="
STDs <- !(PT$op %in% c("==","<",">")) # which rows can be standardized
PE$est <- getMethod("coef","OLDlavaan.mi")(object, type = "all",
omit.imps = omit.imps)
if (lavoptions$se == "none") {
warning('pooled variances and tests unavailable when se="none" is requested')
se <- FALSE
}
if (!se) fmi <- FALSE
messPool <- paste0("Rubin's (1987) rules were used to pool point",
if (se) " and SE",
" estimates across ", m, " imputed data sets",
if (se & !asymptotic) ", and to calculate degrees of",
if (se & !asymptotic) " freedom for each parameter's t",
if (se & !asymptotic) " test and CI.",
"\n")
if (se) {
VCOV <- getMethod("vcov","OLDlavaan.mi")(object, scale.W = scale.W,
omit.imps = omit.imps)
PE$se <- lavaan::lav_model_vcov_se(object@Model, VCOV = VCOV,
lavpartable = object@ParTable)
W <- rowMeans(sapply(object@ParTableList[useImps], "[[", i = "se")^2)
B <- apply(sapply(object@ParTableList[useImps], "[[", i = "est"), 1, var)
Bm <- B + B/m
Tot <- W + Bm
if (asymptotic) {
PE$z[free] <- PE$est[free] / PE$se[free]
PE$pvalue <- pnorm(-abs(PE$z))*2
crit <- qnorm(1 - (1 - level) / 2)
} else {
PE$t[free] <- PE$est[free] / PE$se[free]
## calculate df for t test
## can't do finite-sample correction because Wald z tests have no df
## (see Enders, 2010, p. 231, eq. 8.13 & 8.14)
PE$df[free] <- (m - 1) * (1 + W[free] / Bm[free])^2
## if DF are obscenely large, set them to infinity for pretty printing
PE$df <- ifelse(PE$df > 9999, Inf, PE$df)
PE$pvalue <- pt(-abs(PE$t), df = PE$df)*2
crit <- qt(1 - (1 - level) / 2, df = PE$df)
}
if (ci) {
PE$ci.lower <- PE$est - crit * PE$se
PE$ci.upper <- PE$est + crit * PE$se
PE$ci.lower[!free] <- PE$ci.upper[!free] <- PE$est[!free]
}
}
if (is.logical(standardized)) {
if (standardized) {
standardized <- c("std.lv","std.all")
if (length(lavNames(object, "ov.x")) && lavoptions$fixed.x) {
standardized <- c(standardized, "std.nox")
}
} else standardized <- NULL
} else standardized <- tolower(as.character(standardized))
if (length(standardized) || rsquare) {
## pooled estimates for standardizedSolution()
est <- getMethod("coef", "OLDlavaan.mi")(object, omit.imps = omit.imps)
## updates @Model@GLIST for standardizedSolution(..., GLIST=)
object@Model <- lavaan::lav_model_set_parameters(object@Model, x = est)
}
if ("std.lv" %in% standardized) {
PE$std.lv[STDs] <- lavaan::standardizedSolution(object, se = FALSE,
type = "std.lv",
GLIST = object@Model@GLIST,
est = PE$est)$est.std
}
if ("std.all" %in% standardized) {
PE$std.all[STDs] <- lavaan::standardizedSolution(object, se = FALSE,
type = "std.all",
GLIST = object@Model@GLIST,
est = PE$est)$est.std
}
if ("std.nox" %in% standardized) {
PE$std.nox[STDs] <- lavaan::standardizedSolution(object, se = FALSE,
type = "std.nox",
GLIST = object@Model@GLIST,
est = PE$est)$est.std
}
if (fmi) {
PE$fmi[free] <- Bm[free] / Tot[free]
PE$riv[free] <- Bm[free] / W[free] # (Enders, 2010, p. 226, eq. 8.10)
# == PE$riv[free] <- PE$fmi1[free] / (1 - PE$fmi1[free])
messRIV <- paste("The RIV will exceed 1 whenever between-imputation",
"variance exceeds within-imputation variance",
"(when FMI(1) > 50%).\n\n")
}
## fancy or not?
if (output == "text") {
PE$label <- PT$label
#FIXME: no longer needed? PE$exo <- 0L
class(PE) <- c("lavaan.parameterEstimates","lavaan.data.frame","data.frame")
attr(PE, "categorical") <- lavoptions$categorical
attr(PE, "parameterization") <- lavoptions$parameterization
attr(PE, "information") <- lavoptions$information[1]
attr(PE, "information.meat") <- lavoptions$information.meat
attr(PE, "se") <- lavoptions$se
attr(PE, "group.label") <- lavListInspect(object, "group.label")
attr(PE, "level.label") <- c("within", lavListInspect(object, "cluster"))
attr(PE, "bootstrap") <- lavoptions$bootstrap
attr(PE, "bootstrap.successful") <- 0L #FIXME: assumes none. Implement Wei & Fan's mixing method?
attr(PE, "missing") <- lavoptions$missing
attr(PE, "observed.information") <- lavoptions$observed.information[1]
attr(PE, "h1.information") <- lavoptions$h1.information[1]
attr(PE, "h1.information.meat") <- lavoptions$h1.information.meat
attr(PE, "header") <- header
# FIXME: lavaan may add more!!
if (fmi) cat("\n", messRIV, sep = "")
} else {
PE$exo <- NULL
class(PE) <- c("lavaan.data.frame","data.frame")
}
## requested R-squared?
endoNames <- c(lavNames(object, "ov.nox"), lavNames(object, "lv.nox"))
if (rsquare & length(endoNames)) {
isEndo <- sapply(PE$lhs, function(x) x %in% endoNames)
rsqPE <- PE[PE$lhs == PE$rhs & PE$op == "~~" & isEndo, ]
rsqPE$op <- "r2"
for (i in which(!sapply(colnames(PE),
function(x) x %in% c("lhs","op","rhs","block",
"level","group","est","exo")))) {
rsqPE[ , i] <- NA
}
STD <- lavaan::standardizedSolution(object, se = FALSE, type = "std.all",
GLIST = object@Model@GLIST, est = PE$est)
isEndoSTD <- sapply(STD$lhs, function(x) x %in% endoNames)
std.all <- STD$est.std[STD$lhs == STD$rhs & STD$op == "~~" & isEndoSTD]
rsqPE$est <- ifelse(std.all < 0, NA, 1 - std.all) # negative variances
if (output == "text") rsqPE$label <- ""
PE <- rbind(PE, rsqPE)
}
if (output == "data.frame") PE <- PE[!(PE$op %in% c("==","<",">")), ]
rownames(PE) <- NULL
if (output == "text") {
getMethod("show", "OLDlavaan.mi")(object)
cat(messPool)
}
if (fit.measures) {
indices <- c("chisq","df","pvalue","cfi","tli","rmsea","srmr")
FITS <- suppressWarnings(fitMeasures(object, fit.measures = indices,
output = "text", ...))
try(print(FITS, add.h0 = TRUE), silent = TRUE)
}
PE
}
##' @name OLDlavaan.mi-class
##' @aliases summary,OLDlavaan.mi-method
##' @export
setMethod("summary", "OLDlavaan.mi", summary_OLDlavaan.mi)
##' @name OLDlavaan.mi-class
##' @aliases nobs,OLDlavaan.mi-method
##' @importFrom lavaan lavListInspect
##' @export
setMethod("nobs", "OLDlavaan.mi", function(object, total = TRUE) {
if (total) return(lavListInspect(object, "ntotal"))
#FIXME: cluster N for multilevel?
N <- lavListInspect(object, "norig")
if (length(N) > 1L) names(N) <- lavListInspect(object, "group.label")
N
})
##' @importFrom lavaan parTable
coef_OLDlavaan.mi <- function(object, type = "free", labels = TRUE,
omit.imps = c("no.conv","no.se")) {
useImps <- rep(TRUE, length(object@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(object@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(object@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(object@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(object@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
## custom removal by imputation number
rm.imps <- omit.imps[ which(omit.imps %in% 1:length(useImps)) ]
if (length(rm.imps)) useImps[as.numeric(rm.imps)] <- FALSE
## whatever is left
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
PT <- parTable(object)
if (type == "user" || type == "all") {
type <- "user"
idx <- 1:length(PT$lhs)
} else if (type == "free") {
## FIXME: duplicated leftover from old way of handling EQ constraints?
idx <- which(PT$free > 0L & !duplicated(PT$free))
}
## extract coefficients for converged models
coefList <- lapply(object@ParTableList[useImps], "[[", i = "est")
out <- colMeans(do.call(rbind, coefList))[idx]
## attach names, set class
if (labels) names(out) <- lavaan::lav_partable_labels(PT, type = type)
class(out) <- c("lavaan.vector","numeric")
out
}
##' @name OLDlavaan.mi-class
##' @aliases coef,OLDlavaan.mi-method
##' @export
setMethod("coef", "OLDlavaan.mi", coef_OLDlavaan.mi)
##' @importFrom stats cov
##' @importFrom lavaan lavListInspect parTable
vcov_OLDlavaan.mi <- function(object, type = c("pooled","between","within","ariv"),
scale.W = TRUE, omit.imps = c("no.conv","no.se")) {
useImps <- rep(TRUE, length(object@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(object@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(object@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(object@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(object@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
## custom removal by imputation number
rm.imps <- omit.imps[ which(omit.imps %in% 1:length(useImps)) ]
if (length(rm.imps)) useImps[as.numeric(rm.imps)] <- FALSE
## whatever is left
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
if (lavListInspect(object, "options")$se == "none") {
warning('requested se="none", so only between-imputation (co)variance can',
' be computed')
type <- "between"
}
type <- tolower(type[1])
if (!(type %in% c("pooled","between","within","ariv")))
stop("'", type, "' is not a valid option for 'type'")
PT <- parTable(object)
ncon <- sum(PT$op == "==")
npar <- max(PT$free) - ncon
coefList <- lapply(object@ParTableList[useImps], "[[", i = "est")
B <- cov(do.call(rbind, coefList)[ , PT$free > 0L & !duplicated(PT$free)])
class(B) <- c("lavaan.matrix.symmetric","matrix")
rownames(B) <- colnames(B) <- lavaan::lav_partable_labels(PT, type = "free")
if (type == "between") return(B)
W <- Reduce("+", lapply(object@vcovList[useImps], function(x) x$vcov)) / m
class(W) <- c("lavaan.matrix.symmetric","matrix")
dimnames(W) <- dimnames(B)
if (type == "within") return(W)
## check whether equality constraints prevent inversion of W
if (scale.W || type == "ariv") {
inv.W <- if (ncon == 0) try(solve(W), silent = TRUE) else MASS::ginv(W)
if (inherits(inv.W, "try-error")) {
if (ncon == 0) {
warning("Could not invert within-imputation covariance matrix. ",
"Generalized inverse used instead.\nIt may be ",
"safer to set `scale.W = FALSE' (and `asymptotic = TRUE').")
}
inv.W <- MASS::ginv(W)
}
## relative increase in variance due to missing data
r <- (1 + 1/m)/npar * sum(diag(B %*% inv.W)) # Enders (2010, p. 235) eqs. 8.20-21
if (type == "ariv") return(r)
Total <- (1 + r) * W # FIXME: asked Yves for a hack, says it can't be inverted back to infoMat
} else {
## less reliable, but constraints prevent inversion of W
Total <- W + B + (1/m)*B ## Enders (2010, p. 235) eq. 8.19
}
## return pooled variance
Total
}
##' @name OLDlavaan.mi-class
##' @aliases vcov,OLDlavaan.mi-method
##' @export
setMethod("vcov", "OLDlavaan.mi", vcov_OLDlavaan.mi)
##' @importFrom lavaan lavListInspect lavTestLRT
anova_OLDlavaan.mi <- function(object, ...) {
## save model names
objname <- deparse(substitute(object))
dotnames <- as.character(sapply(substitute(list(...))[-1], deparse))
## check class
if (!inherits(object, "OLDlavaan.mi")) stop("object is not class 'OLDlavaan.mi'")
## check for additional arguments
dots <- list(...)
if (length(dots)) {
## separate OLDlavaan.mi objects from other lavTestLRT.mi() arguments
idx.mi <- which(sapply(dots, inherits, what = "OLDlavaan.mi"))
if (length(idx.mi)) {
mods <- dots[idx.mi]
dots <- dots[-idx.mi]
## save names for mods, so compareFit() doesn't break
modnames <- dotnames[idx.mi]
nonames <- which(names(mods) == "")
names(mods)[nonames] <- modnames[nonames]
} else {
mods <- NULL
modnames <- NULL
}
LRT.names <- intersect(names(dots),
union(names(formals(lavTestLRT)),
names(formals(lavTestLRT.mi))))
dots <- if (length(LRT.names)) dots[LRT.names] else NULL
if (!is.null(dots$h1)) {
#FIXME: this shouldn't be necessary: mods <- c(mods, list(h1 = dots$h1))
dots$h1 <- NULL
}
} else mods <- NULL
## run compareFit if length(idx.mi) > 1L
if (length(mods) == 0L) {
argList <- c(list(object = object), dots)
results <- do.call(lavTestLRT.mi, argList)
} else if (length(mods) == 1L) {
argList <- c(list(object = object, h1 = mods[[1]]), dots)
results <- do.call(lavTestLRT.mi, argList)
} else if (length(mods) > 1L) {
modList <- c(list(object), mods)
names(modList) <- c(objname, modnames)
argList <- c(modList, list(argsLRT = dots, indices = FALSE))
results <- do.call(compareFit, argList)@nested
class(results) <- c("lavaan.data.frame","data.frame")
attr(results, "header") <- "Nested Model Comparisons:"
}
results
}
##' @name OLDlavaan.mi-class
##' @aliases anova,OLDlavaan.mi-method
##' @export
setMethod("anova", "OLDlavaan.mi", anova_OLDlavaan.mi)
##' @importFrom lavaan lavListInspect lavNames
##' @importFrom methods getMethod
## utility function called within fitMeasures.mi()
getSRMR <- function(object, type = "cor.bentler", level = "within",
include.mean = TRUE, omit.imps = c("no.conv","no.se")) {
conditional.x <- lavListInspect(object, "options")$conditional.x
include.mean <- include.mean && lavListInspect(object, "meanstructure")
include.diag <- type %in% c("cor.bentler","raw")
mplus <- type == "mplus"
if (mplus) type <- "cor.bollen"
## how many blocks to loop over
nG <- lavListInspect(object, "ngroups")
nlevels <- lavListInspect(object, "nlevels")
## save relevant sample sizes
if (nlevels > 1L && level != "within") {
n.per.group <- lavListInspect(object, "nclusters") #FIXME: only works for 2 levels
N <- sum(n.per.group)
} else {
n.per.group <- lavListInspect(object, "nobs")
N <- lavListInspect(object, "ntotal")
}
## grab residuals
R <- getMethod("resid", "OLDlavaan.mi")(object, type = type,
omit.imps = omit.imps)
if (mplus) Rd <- getMethod("resid", "OLDlavaan.mi")(object,
type = "cor.bentler",
omit.imps = omit.imps)
## restructure, if necessary
if (nG == 1L) {
loopBlocks <- 1L
## extract relevant level
if (nlevels > 1L) {
R <- R[[level]]
if (mplus) Rd <- Rd[[level]]
}
## to loop over blocks
R <- list(R)
if (mplus) Rd <- list(Rd)
## multiple groups AND multilevel
} else if (nlevels > 1L) {
loopBlocks <- 2*(1:nG)
if (level == "within") loopBlocks <- loopBlocks - 1L
R <- R[loopBlocks]
if (mplus) Rd <- Rd[loopBlocks]
} else loopBlocks <- 1:nG # no restructure necessary for multigroup 1-level models
## store vector of squared residuals
RR <- vector("list", nG)
for (b in loopBlocks) {
index <- if (conditional.x) "res.cov" else "cov"
RR[[b]] <- R[[b]][[index]][lower.tri(R[[b]][[index]], diag = FALSE)]^2
## only capture means/variances of numeric modeled variables (not conditional.x)
vv <- intersect(lavNames(object, type = "ov.num", block = b),
lavNames(object, type = "ov.model", block = b))
if (include.diag) RR[[b]] <- c(RR[[b]], diag(R[[b]][[index]])[vv]^2)
if (mplus) RR[[b]] <- c(RR[[b]], diag(Rd[[b]][[index]])[vv]^2)
if (include.mean) {
index <- if (conditional.x) "res.int" else "mean"
RR[[b]] <- c(RR[[b]], R[[b]][[index]][vv]^2)
}
}
## take weighted average of group means
as.numeric( (n.per.group %*% sqrt(sapply(RR, mean))) / N )
}
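## --- Sketch (added commentary, not part of the package) ---
## The core of the SRMR computation in getSRMR() above for a single group:
## the square root of the mean squared residual from the lower triangle
## (plus means/variances when included). "R" is a hypothetical residual
## correlation matrix:
## R <- matrix(c(0, .02, -.03, .02, 0, .01, -.03, .01, 0), nrow = 3)
## rr <- R[lower.tri(R, diag = FALSE)]^2
## sqrt(mean(rr))  # SRMR contribution for this block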
##' @importFrom lavaan lavNames lavListInspect
##' @importFrom stats pchisq uniroot
fitMeasures_OLDlavaan.mi <- function(object, fit.measures = "all", baseline.model = NULL,
output = "vector", omit.imps = c("no.conv","no.se"),
...) {
useImps <- rep(TRUE, length(object@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(object@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(object@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(object@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(object@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
## custom removal by imputation number
rm.imps <- omit.imps[ which(omit.imps %in% 1:length(useImps)) ]
if (length(rm.imps)) useImps[as.numeric(rm.imps)] <- FALSE
## whatever is left
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
lavoptions <- lavListInspect(object, "options")
fit.measures <- tolower(fit.measures)
if (length(fit.measures) == 0L) fit.measures <- "all"
## narrow down fit indices
incremental <- c("cfi","tli","nnfi","rfi","nfi","pnfi","ifi","rni")
if ("all" %in% fit.measures) {
indices <- c("chisq","df","pvalue","scaling", incremental,
"rmsea","rmr","mfi","gammahat")
} else {
indices <- grep(pattern = paste(c("chisq","df","pvalue","scaling",
incremental, "mfi","rmsea",
"gammahat","rmr"), collapse = "|"),
x = fit.measures, ignore.case = TRUE, value = TRUE)
}
## CHI-SQUARED-BASED FIT INDICES
notest <- length(lavoptions$test) == 1L && lavoptions$test == "none"
if (notest || any(!grepl(pattern = "rmr", x = indices))) {
## check for additional arguments
dots <- list(...)
if (length(dots)) {
LRT.names <- intersect(names(dots),
union(names(formals(lavTestLRT)),
names(formals(lavTestLRT.mi))))
dots <- if (length(LRT.names)) dots[LRT.names] else list(asymptotic = TRUE)
} else dots <- list(asymptotic = TRUE)
## check test options (adapted from lavTestLRT.mi, limits duplicate warnings)
test <- dots$test
if (is.null(test)) {
test <- "d3" # default
} else test <- tolower(test[1])
if (tolower(test) %in% c("mr","meng.rubin","likelihood","lrt","mplus","d3")) test <- "D3"
if (tolower(test) %in% c("lmrr","li.et.al","pooled.wald","d2")) test <- "D2"
if (test == "D3" && !lavoptions$estimator %in% c("ML","PML","FML")) {
message('"D3" only available using maximum likelihood estimation. ',
'Changed test to "D2".')
test <- "D2"
}
## check for robust
test.names <- lavoptions$test
# lavaan 0.6-5: for now, we only acknowledge the first non-standard @test
if (length(test.names) > 1L) {
## remove standard and any bootstrapped tests
rm.idx <- which(test.names %in% c("standard","bootstrap","bollen.stine"))
if (length(rm.idx) > 0L) {
test.names <- test.names[-rm.idx]
}
## only acknowledge the first scaled test statistic
if (length(test.names) > 1L) {
test.names <- test.names[1]
}
}
robust <- any(test.names %in% c("satorra.bentler","yuan.bentler",
"yuan.bentler.mplus","scaled.shifted",
"mean.var.adjusted","satterthwaite"))
if (robust) {
## assign pool.robust option to object
if (is.null(dots$pool.robust)) {
pool.robust <- formals(lavTestLRT.mi)$pool.robust # default value
} else {
pool.robust <- dots$pool.robust # user-specified value
}
} else dots$pool.robust <- pool.robust <- FALSE
scaleshift <- any(test.names == "scaled.shifted")
if (scaleshift) {
if (test == "D3") {
message("If test = 'scaled.shifted' (estimator = 'WLSMV' or 'MLMV'), ",
"model evaluation is only available by (re)setting .",
"test = 'D2'.\nControl more options by passing arguments to ",
"lavTestLRT() via the '...' argument.\n")
test <- 'D2'
}
}
if (pool.robust && test == "D3") {
message('pool.robust = TRUE is only applicable when test = "D2". ',
'Changed test to "D2".')
test <- "D2"
}
dots$test <- test
## pooled test statistic(s)
argList <- c(list(object = object), dots)
argList$asymptotic <- TRUE # in case it wasn't set in list(...)
argList$omit.imps <- omit.imps
out <- do.call(lavTestLRT.mi, argList)
## check for scaled test statistic (if not, set robust=FALSE)
if (robust && is.na(out["chisq.scaled"])) robust <- FALSE
## fit baseline model if necessary
if (any(indices %in% incremental)) {
if (inherits(baseline.model, "OLDlavaan.mi")) {
baseFit <- baseline.model
} else if (inherits(object@external$baseline.model, "OLDlavaan.mi")) {
baseFit <- object@external$baseline.model
## MUST fit PTb for "D3" likelihoods, but for "D2" use @baselineList
} else if (test == "D2") {
## length(baseImps) == m, not just length(useImps)
baseImps <- object@meta$baseline.ok
if (!all(baseImps[useImps])) warning('The default independence model ',
'did not converge for data set(s): ',
which(!baseImps))
## only use imputations that converged for both
baseImps <- intersect(useImps, which(baseImps))
w <- sapply(object@baselineList[baseImps],
function(x) x$test$standard[["stat"]])
if (is.list(w)) {
#TODO: figure out why this happens!
w <- unlist(w)
DF <- mean(unlist(sapply(object@baselineList[baseImps],
function(x) x$test$standard[["df"]])))
} else {
DF <- mean(sapply(object@baselineList[baseImps],
function(x) x$test$standard[["df"]]))
}
baseOut <- calculate.D2(w, DF, asymptotic = TRUE)
if (robust) {
if (pool.robust) {
w.r <- sapply(object@baselineList[baseImps],
function(x) x$test[[ test.names[1] ]][["stat"]])
if (is.list(w.r)) {
w.r <- unlist(w.r)
DF.r <- mean(unlist(sapply(object@baselineList[baseImps],
function(x) x$test[[ test.names[1] ]][["df"]])))
} else {
DF.r <- mean(sapply(object@baselineList[baseImps],
function(x) x$test[[ test.names[1] ]][["df"]]))
}
base.robust <- calculate.D2(w.r, DF.r, asymptotic = TRUE)
names(base.robust) <- paste0(names(base.robust), ".scaled")
baseOut <- c(baseOut, base.robust)
} else {
baseOut <- robustify(ChiSq = baseOut, object = object,
baseline = TRUE, useImps = baseImps)
}
}
baseFit <- NULL # for later checking, to avoid unnecessary calls
} else {
PTb <- object@baselineList[[ useImps[1] ]]$partable
PTb[c("est","se")] <- NULL
# FIXME: shouldn't need this line, but lav_partable_merge() fails when
# lavaan:::lav_object_extended() returns a NULL slot instead of "plabel"
PTb$plabel <- paste0(".p", PTb$id, ".")
group <- lavListInspect(object, "group")
if (length(group) == 0L) group <- NULL
cluster <- lavListInspect(object, "cluster")
if (length(cluster) == 0L) cluster <- NULL
baseFit <- runMI(model = PTb, data = object@DataList[useImps],
group = group, cluster = cluster,
test = lavoptions$test, estimator = lavoptions$estimator,
fixed.x = lavoptions$fixed.x, se = "none", # to save time
conditional.x = lavoptions$conditional.x,
ordered = lavListInspect(object, "ordered"),
parameterization = lavoptions$parameterization)
}
if (!is.null(baseFit)) {
## length(baseImps) is only as long as length(useImps), not the original m
baseImps <- sapply(baseFit@convergence, "[[", i = "converged")
if (!all(baseImps)) warning('baseline.model did not converge for data set(s): ',
useImps[!baseImps])
argList <- c(list(object = baseFit), dots)
argList$asymptotic <- TRUE # in case it wasn't set in list(...)
argList$omit.imps <- setdiff(omit.imps, "no.se") # se="none" in baseFit
baseOut <- do.call(lavTestLRT.mi, argList)
}
# else { already used "D2" with @baselineList info to make baseOut }
}
X2 <- out[["chisq"]]
DF <- out[["df"]]
if (robust) {
X2.sc <- out[["chisq.scaled"]]
DF.sc <- out[["df.scaled"]] ## for mean.var.adjusted, mean DF across imputations
if (!pool.robust) ch <- out[["chisq.scaling.factor"]] ## mean c_hat across imputations
if (X2 < .Machine$double.eps && DF == 0) ch <- 0
## for RMSEA
if ("rmsea" %in% indices) {
d <- mean(sapply(object@testList[useImps],
function(x) sum(x[[ test.names[1] ]][["trace.UGamma"]])))
if (is.na(d) || d == 0) d <- NA # FIXME: only relevant when mean.var.adjusted?
}
}
## for CFI, TLI, etc.
if (any(indices %in% incremental)) {
bX2 <- baseOut[["chisq"]]
bDF <- baseOut[["df"]]
out <- c(out, baseline.chisq = bX2, baseline.df = bDF,
baseline.pvalue = baseOut[["pvalue"]])
if (robust) {
out["baseline.chisq.scaled"] <- bX2.sc <- baseOut[["chisq.scaled"]]
out["baseline.df.scaled"] <- bDF.sc <- baseOut[["df.scaled"]]
out["baseline.pvalue.scaled"] <- baseOut[["pvalue.scaled"]]
if (!pool.robust) {
cb <- baseOut[["chisq.scaling.factor"]]
out["baseline.chisq.scaling.factor"] <- cb
if (scaleshift) out["baseline.chisq.shift.parameters"] <- baseOut[["chisq.shift.parameters"]]
}
}
}
if ("cfi" %in% indices) {
t1 <- max(X2 - DF, 0)
t2 <- max(X2 - DF, bX2 - bDF, 0)
out["cfi"] <- if(t1 == 0 && t2 == 0) 1 else 1 - t1/t2
if (robust) {
## scaled
t1 <- max(X2.sc - DF.sc, 0)
t2 <- max(X2.sc - DF.sc, bX2.sc - bDF.sc, 0)
if (is.na(t1) || is.na(t2)) {
out["cfi.scaled"] <- NA
} else if (t1 == 0 && t2 == 0) {
out["cfi.scaled"] <- 1
} else out["cfi.scaled"] <- 1 - t1/t2
## Brosseau-Liard & Savalei MBR 2014, equation 15
if (!pool.robust & test.names[1] %in%
c("satorra.bentler","yuan.bentler","yuan.bentler.mplus")) {
t1 <- max(X2 - ch*DF, 0)
t2 <- max(X2 - ch*DF, bX2 - cb*bDF, 0)
if (is.na(t1) || is.na(t2)) {
out["cfi.robust"] <- NA
} else if (t1 == 0 && t2 == 0) {
out["cfi.robust"] <- 1
} else out["cfi.robust"] <- 1 - t1/t2
}
}
}
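    ## Worked example of the CFI formula above (hypothetical pooled values):
    ## X2 <- 45; DF <- 20; bX2 <- 400; bDF <- 36
    ## 1 - max(X2 - DF, 0) / max(X2 - DF, bX2 - bDF, 0)  # CFI ~= .931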
if ("rni" %in% indices) {
t1 <- X2 - DF
t2 <- bX2 - bDF
out["rni"] <- if (t2 == 0) NA else 1 - t1/t2
if (robust) {
## scaled
t1 <- X2.sc - DF.sc
t2 <- bX2.sc - bDF.sc
if (is.na(t1) || is.na(t2)) {
out["rni.scaled"] <- NA
} else if (t2 == 0) {
out["rni.scaled"] <- NA
} else out["rni.scaled"] <- 1 - t1/t2
## Brosseau-Liard & Savalei MBR 2014, equation 15
if (!pool.robust & test.names[1] %in%
c("satorra.bentler","yuan.bentler","yuan.bentler.mplus")) {
t1 <- X2 - ch*DF
t2 <- bX2 - cb*bDF
if (is.na(t1) || is.na(t2)) {
out["rni.robust"] <- NA
} else if (t1 == 0 && t2 == 0) {
out["rni.robust"] <- NA
} else out["rni.robust"] <- 1 - t1/t2
}
}
}
if (any(indices %in% c("tli","nnfi"))) {
t1 <- (X2 - DF)*bDF
t2 <- (bX2 - bDF)*DF
out["tli"] <- out["nnfi"] <- if (DF > 0) 1 - t1/t2 else 1
if (robust) {
## scaled
t1 <- (X2.sc - DF.sc)*bDF.sc
t2 <- (bX2.sc - bDF.sc)*DF.sc
if (is.na(t1) || is.na(t2)) {
out["tli.scaled"] <- out["nnfi.scaled"] <- NA
} else if (DF > 0 && t2 != 0) {
out["tli.scaled"] <- out["nnfi.scaled"] <- 1 - t1/t2
} else {
out["tli.scaled"] <- out["nnfi.scaled"] <- 1
}
## Brosseau-Liard & Savalei MBR 2014, equation 15
if (!pool.robust & test.names[1] %in%
c("satorra.bentler","yuan.bentler","yuan.bentler.mplus")) {
t1 <- (X2 - ch*DF)*bDF
t2 <- (bX2 - cb*bDF)*DF
if (is.na(t1) || is.na(t2)) {
out["tli.robust"] <- out["nnfi.robust"] <- NA
        } else if (DF > 0 && t2 != 0) {
          ## mirror the scaled version's guard above, avoiding 0/0
          out["tli.robust"] <- out["nnfi.robust"] <- 1 - t1/t2
        } else out["tli.robust"] <- out["nnfi.robust"] <- 1
}
}
}
if ("rfi" %in% indices) {
if (DF > 0) {
t2 <- bX2 / bDF
t1 <- t2 - X2/DF
out["rfi"] <- if (t1 < 0 || t2 < 0) 1 else t1/t2
} else out["rfi"] <- 1
if (robust) {
if (DF > 0) {
t2 <- bX2.sc / bDF.sc
t1 <- t2 - X2.sc/DF.sc
out["rfi.scaled"] <- if (t1 < 0 || t2 < 0) 1 else t1/t2
} else out["rfi.scaled"] <- 1
}
}
if ("nfi" %in% indices) {
if (DF > 0) {
t1 <- bX2 - X2
t2 <- bX2
out["nfi"] <- t1 / t2
} else out["nfi"] <- 1
if (robust) out["nfi.scaled"] <- (bX2.sc - X2.sc) / bX2.sc
}
if ("pnfi" %in% indices) {
t1 <- bX2 - X2
t2 <- bX2
out["pnfi"] <- (DF / bDF) * t1/t2
if (robust) {
t1 <- bX2.sc - X2.sc
t2 <- bX2.sc
out["pnfi.scaled"] <- (DF / bDF) * t1/t2
}
}
if ("ifi" %in% indices) {
t1 <- bX2 - X2
t2 <- bX2 - DF
out["ifi"] <- if (t2 < 0) 1 else t1/t2
if (robust) {
t1 <- bX2.sc - X2.sc
t2 <- bX2.sc - DF.sc
if (is.na(t2)) {
out["ifi.scaled"] <- NA
} else if (t2 < 0) {
out["ifi.scaled"] <- 1
} else out["ifi.scaled"] <- t1/t2
}
}
N <- lavListInspect(object, "ntotal")
Ns <- lavListInspect(object, "nobs") # N per group
nG <- lavListInspect(object, "ngroups")
nVars <- length(lavNames(object))
if (!(lavoptions$likelihood == "normal" |
lavoptions$estimator %in% c("ML","PML","FML"))) {
N <- N - nG
Ns <- Ns - 1
}
if ("mfi" %in% indices) {
out["mfi"] <- exp(-0.5 * (X2 - DF) / N)
}
if ("rmsea" %in% indices) {
N.RMSEA <- max(N, X2*4) # FIXME: good strategy??
if (is.na(X2) || is.na(DF)) {
out["rmsea"] <- as.numeric(NA)
} else if (DF > 0) {
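## getLambda() returns how far the noncentral-chi-squared CDF at X2 is from
## the target probability; uniroot() solves for the noncentrality parameter
## (lambda) that defines each RMSEA confidence limit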
getLambda <- function(lambda, chi, df, p) pchisq(chi, df, ncp=lambda) - p
out["rmsea"] <- sqrt( max(0, (X2/N)/DF - 1/N) ) * sqrt(nG)
## lower confidence limit
if (getLambda(0, X2, DF, .95) < 0.0) out["rmsea.ci.lower"] <- 0 else {
lambda.l <- try(uniroot(f = getLambda, chi = X2, df = DF, p = .95,
lower = 0, upper = X2)$root, silent = TRUE)
if (inherits(lambda.l, "try-error")) lambda.l <- NA
out["rmsea.ci.lower"] <- sqrt( lambda.l/(N*DF) ) * sqrt(nG)
}
## upper confidence limit
if (getLambda(N.RMSEA, X2, DF, .05) > 0 || getLambda(0, X2, DF, .05) < 0) {
out["rmsea.ci.upper"] <- 0
} else {
lambda.u <- try(uniroot(f = getLambda, chi = X2, df = DF, p = .05,
lower = 0, upper = N.RMSEA)$root, silent = TRUE)
if (inherits(lambda.u, "try-error")) lambda.u <- NA
out["rmsea.ci.upper"] <- sqrt( lambda.u/(N*DF) ) * sqrt(nG)
}
## p value
out["rmsea.pvalue"] <- pchisq(X2, DF, ncp = N*DF*0.05^2/nG,
lower.tail = FALSE)
## Scaled versions (naive and robust)
if (robust & !scaleshift) {
## naive
out["rmsea.scaled"] <- sqrt( max(0, (X2/N)/d - 1/N) ) * sqrt(nG)
## lower confidence limit
if (DF < 1 || d < 1 || getLambda(0, X2, d, .95) < 0.0) {
out["rmsea.ci.lower.scaled"] <- 0
} else {
lambda.l <- try(uniroot(f = getLambda, chi = X2, df = d, p = .95,
lower = 0, upper = X2)$root, silent = TRUE)
if (inherits(lambda.l, "try-error")) lambda.l <- NA
out["rmsea.ci.lower.scaled"] <- sqrt( lambda.l/(N*DF) ) * sqrt(nG)
}
## upper confidence limit
if (DF < 1 || d < 1 || getLambda(0, X2, d, .95) < 0.0 || getLambda(N.RMSEA, X2, d, .05) > 0.0) {
out["rmsea.ci.upper.scaled"] <- 0
} else {
lambda.u <- try(uniroot(f = getLambda, chi = X2, df = d, p = .05,
lower = 0, upper = N.RMSEA)$root, silent = TRUE)
if (inherits(lambda.u, "try-error")) lambda.u <- NA
out["rmsea.ci.upper.scaled"] <- sqrt( lambda.u/(N*DF) ) * sqrt(nG)
}
## p value
out["rmsea.pvalue.scaled"] <- pchisq(X2, d, ncp = N*d*0.05^2/nG,
lower.tail = FALSE)
if (!pool.robust & test.names[1] %in%
c("satorra.bentler","yuan.bentler","yuan.bentler.mplus")) {
## robust
out["rmsea.robust"] <- sqrt( max(0, (X2/N)/DF - ch/N ) ) * sqrt(nG)
## lower confidence limit
if (DF.sc < 1 | getLambda(0, X2.sc, DF.sc, .95) < 0.0) {
out["rmsea.ci.lower.robust"] <- 0
} else {
lambda.l <- try(uniroot(f = getLambda, chi = X2.sc, df = DF.sc, p = .95,
lower = 0, upper = X2.sc)$root, silent = TRUE)
if (inherits(lambda.l, "try-error")) lambda.l <- NA
out["rmsea.ci.lower.robust"] <- sqrt( (ch*lambda.l)/(N*DF.sc) ) * sqrt(nG)
}
## upper confidence limit
if (DF.sc < 1 | getLambda(N.RMSEA, X2.sc, DF.sc, .05) > 0.0) {
out["rmsea.ci.upper.robust"] <- 0
} else {
lambda.u <- try(uniroot(f = getLambda, chi = X2.sc, df = DF.sc, p = .05,
lower = 0, upper = N.RMSEA)$root, silent = TRUE)
if (inherits(lambda.u, "try-error")) lambda.u <- NA
out["rmsea.ci.upper.robust"] <- sqrt( (ch*lambda.u)/(N*DF.sc) ) * sqrt(nG)
}
## p value
## (robust p value not yet implemented; formula still to be derived)
}
} else if (robust & scaleshift) {
## naive only
out["rmsea.scaled"] <- sqrt( max(0, (X2.sc/N)/DF - 1/N) ) * sqrt(nG)
## lower confidence limit
if (DF < 1 | getLambda(0, X2.sc, DF, .95) < 0.0) {
out["rmsea.ci.lower.scaled"] <- 0
} else {
lambda.l <- try(uniroot(f = getLambda, chi = X2.sc, df = DF, p = .95,
lower = 0, upper = X2.sc)$root, silent = TRUE)
if (inherits(lambda.l, "try-error")) lambda.l <- NA
out["rmsea.ci.lower.scaled"] <- sqrt( lambda.l/(N*DF) ) * sqrt(nG)
}
## upper confidence limit
if (DF < 1 | getLambda(N.RMSEA, X2.sc, DF, .05) > 0.0) {
out["rmsea.ci.upper.scaled"] <- 0
} else {
lambda.u <- try(uniroot(f = getLambda, chi = X2.sc, df = DF, p = .05,
lower = 0, upper = N.RMSEA)$root, silent = TRUE)
if (inherits(lambda.u, "try-error")) lambda.u <- NA
out["rmsea.ci.upper.scaled"] <- sqrt( lambda.u/(N*DF) ) * sqrt(nG)
}
## p value
out["rmsea.pvalue.scaled"] <- pchisq(X2.sc, DF, ncp = N*DF*0.05^2/nG,
lower.tail = FALSE)
}
}
}
if ("gammahat" %in% indices) {
out["gammaHat"] <- nVars / (nVars + 2*((X2 - DF) / N))
out["adjGammaHat"] <- 1 - (((nG * nVars * (nVars + 1)) / 2) / DF) * (1 - out["gammaHat"])
if (robust) {
out["gammaHat.scaled"] <- nVars / (nVars + 2*((X2.sc - DF.sc) / N))
out["adjGammaHat.scaled"] <- 1 - (((nG * nVars * (nVars + 1)) / 2) / DF.sc) * (1 - out["gammaHat.scaled"])
}
}
## END CHI-SQUARED-BASED FIT INDICES
} else out <- numeric(0)
## RESIDUALS-BASED FIT INDICES
if (any(grepl(pattern = "rmr", x = indices))) {
if (lavListInspect(object, "nlevels") > 1L) {
out["srmr"] <- NA # to preserve the order in lavaan output
out["srmr_within"] <- getSRMR(object, type = "cor", include.mean = FALSE,
level = "within", omit.imps = omit.imps)
out["srmr_between"] <- getSRMR(object, type = "cor", include.mean = FALSE,
level = lavListInspect(object, "cluster"),
omit.imps = omit.imps)
out["srmr"] <- out["srmr_within"] + out["srmr_between"]
} else {
out["rmr"] <- getSRMR(object, type = "raw", include.mean = TRUE,
omit.imps = omit.imps)
out["rmr_nomean"] <- getSRMR(object, type = "raw", include.mean = FALSE,
omit.imps = omit.imps)
out["srmr_bentler"] <- out["srmr"] <- getSRMR(object, type = "cor.bentler",
include.mean = TRUE,
omit.imps = omit.imps)
out["srmr_bentler_nomean"] <- getSRMR(object, type = "cor.bentler",
include.mean = FALSE,
omit.imps = omit.imps)
out["crmr"] <- getSRMR(object, type = "cor.bollen", include.mean = TRUE,
omit.imps = omit.imps)
out["crmr_nomean"] <- getSRMR(object, type = "cor.bollen",
include.mean = FALSE, omit.imps = omit.imps)
out["srmr_mplus"] <- getSRMR(object, type = "mplus", include.mean = TRUE,
omit.imps = omit.imps)
out["srmr_mplus_nomean"] <- getSRMR(object, type = "mplus",
include.mean = FALSE,
omit.imps = omit.imps)
}
## END RESIDUALS-BASED FIT INDICES
}
## return requested measures (loosely matched)
if ("all" %in% fit.measures) {
fits <- out
} else {
fits <- out[grepl(pattern = paste(fit.measures, collapse = "|"),
x = names(out), ignore.case = TRUE)]
fits <- fits[which(!is.na(names(fits)))]
}
class(fits) <- c("lavaan.vector","numeric")
if (output == "text") class(fits) <- c("lavaan.fitMeasures", class(fits))
fits
}
##' @name OLDlavaan.mi-class
##' @aliases fitMeasures,OLDlavaan.mi-method
##' @importFrom lavaan fitMeasures
##' @export
setMethod("fitMeasures", "OLDlavaan.mi", fitMeasures_OLDlavaan.mi)
## lowercase 'm'
##' @name OLDlavaan.mi-class
##' @aliases fitmeasures,OLDlavaan.mi-method
##' @importFrom lavaan fitmeasures
##' @export
setMethod("fitmeasures", "OLDlavaan.mi", fitMeasures_OLDlavaan.mi)
##' @importFrom lavaan lavListInspect lavNames
##' @importFrom methods getMethod
fitted_OLDlavaan.mi <- function(object, omit.imps = c("no.conv","no.se")) {
useImps <- rep(TRUE, length(object@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(object@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(object@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(object@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(object@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
## custom removal by imputation number
rm.imps <- omit.imps[ which(omit.imps %in% 1:length(useImps)) ]
if (length(rm.imps)) useImps[as.numeric(rm.imps)] <- FALSE
## whatever is left
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
## how many blocks to loop over
nG <- lavListInspect(object, "ngroups")
nlevels <- lavListInspect(object, "nlevels")
nBlocks <- nG * nlevels #FIXME: always?
group.label <- if (nG > 1L) lavListInspect(object, "group.label") else NULL
clus.label <- if (nlevels > 1L) c("within", lavListInspect(object, "cluster")) else NULL
if (nBlocks > 1L) {
block.label <- paste(rep(group.label, each = nlevels), clus.label,
sep = if (nG > 1L && nlevels > 1L) "_" else "")
}
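## plug the pooled (averaged) parameter estimates into the model matrices,
## then compute the moments implied by those pooled parameters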
est <- getMethod("coef", "OLDlavaan.mi")(object, omit.imps = omit.imps)
setpar <- lavaan::lav_model_set_parameters(object@Model, x = est)
impMats <- lavaan::lav_model_implied(setpar)
# if (lavListInspect(object, "categorical")) {
# th.idx <- lavListInspect(object, "th.idx") # to select $(res.)th
# if (nBlocks == 1L) th.idx <- list(th.idx) # to loop over
# #FIXME when multilevel accepts categorical
# }
#TODO: adapt to multilevel, multigroup, or both
## loop over (blocks and) moments
Implied <- vector("list", nBlocks)
for (b in 1:nBlocks) {
for (nm in names(impMats)) {
## skip any empty objects
if (is.null(impMats[[nm]][[b]])) next
Implied[[b]][[nm]] <- impMats[[nm]][[b]]
## assign names and classes
if (nm %in% c("cov","res.cov")) {
NAMES <- lavNames(object, type = "ov.model", block = b)
dimnames(Implied[[b]][[nm]]) <- list(NAMES, NAMES)
class(Implied[[b]][[nm]]) <- c("lavaan.matrix.symmetric","matrix")
} else if (nm %in% c("mean","res.int")) {
Implied[[b]][[nm]] <- as.numeric(Implied[[b]][[nm]]) # remove matrix
names(Implied[[b]][[nm]]) <- lavNames(object, type = "ov.model", block = b)
class(Implied[[b]][[nm]]) <- c("lavaan.vector","numeric")
} else if (nm %in% c("th","res.th")) {
#FIXME: When lavaan allows multilevel categorical, thresholds only
## apply once (not to each level, like for all groups).
## Will lavaan return a vector of zeros for all but "within"?
## If not, it will not exist for each block, so count over groups.
Implied[[b]][[nm]] <- as.numeric(Implied[[b]][[nm]]) #[ th.idx[[b]] ] # remove matrix & numeric -means
names(Implied[[b]][[nm]]) <- lavNames(object, type = "th",
block = b) #FIXME?
class(Implied[[b]][[nm]]) <- c("lavaan.vector","numeric")
} else if (nm == "group.w") {
## Only for (D)WLS estimation, but when is it relevant?
## For now, assign no names/class
## The remaining only exist when conditional.x
} else if (nm %in% c("slopes","res.slopes")) {
dimnames(Implied[[b]][[nm]]) <- list(lavNames(object, type = "ov.nox", block = b),
lavNames(object, type = "ov.x", block = b))
class(Implied[[b]][[nm]]) <- c("lavaan.matrix","matrix")
} else if (nm == "cov.x") {
NAMES <- lavNames(object, type = "ov.x", block = b)
dimnames(Implied[[b]][[nm]]) <- list(NAMES, NAMES)
class(Implied[[b]][[nm]]) <- c("lavaan.matrix.symmetric","matrix")
} else if (nm == "mean.x") {
Implied[[b]][[nm]] <- as.numeric(Implied[[b]][[nm]]) # remove matrix
names(Implied[[b]][[nm]]) <- lavNames(object, type = "ov.x", block = b)
class(Implied[[b]][[nm]]) <- c("lavaan.vector","numeric")
}
## end loops
}
}
## drop list for 1 block, or add labels for multiple
if (nBlocks == 1L) {
Implied <- Implied[[1]]
} else names(Implied) <- block.label
Implied
}
##' @name OLDlavaan.mi-class
##' @aliases fitted,OLDlavaan.mi-method
##' @export
setMethod("fitted", "OLDlavaan.mi", fitted_OLDlavaan.mi)
##' @name OLDlavaan.mi-class
##' @aliases fitted.values,OLDlavaan.mi-method
##' @export
setMethod("fitted.values", "OLDlavaan.mi", fitted_OLDlavaan.mi)
##' @importFrom lavaan lavListInspect
##' @importFrom methods getMethod
##' @importFrom stats cov2cor
resid_OLDlavaan.mi <- function(object, type = c("raw","cor"),
omit.imps = c("no.conv","no.se")) {
## @SampleStatsList is (for each imputation) output from:
## getSampStats <- function(obj) lavInspect(obj, "sampstat")
useImps <- rep(TRUE, length(object@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(object@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(object@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(object@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(object@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
## custom removal by imputation number
rm.imps <- omit.imps[ which(omit.imps %in% 1:length(useImps)) ]
if (length(rm.imps)) useImps[as.numeric(rm.imps)] <- FALSE
## whatever is left
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
## check type options
type <- tolower(type[1])
if (type %in% c("raw","rmr")) {
type = "raw"
} else if (type %in% c("cor","cor.bollen","crmr")) {
type <- "cor.bollen"
} else if (type %in% c("cor.bentler","cor.eqs","srmr")) {
type <- "cor.bentler"
} else stop('type="', type, '" not supported for lavaan.mi objects')
## how many blocks to loop over
nG <- lavListInspect(object, "ngroups")
nlevels <- lavListInspect(object, "nlevels")
nBlocks <- nG * nlevels #FIXME: always?
group.label <- if (nG > 1L) lavListInspect(object, "group.label") else NULL
clus.label <- if (nlevels > 1L) c("within", lavListInspect(object, "cluster")) else NULL
if (nBlocks > 1L) {
block.label <- paste(rep(group.label, each = nlevels), clus.label,
sep = if (nG > 1L && nlevels > 1L) "_" else "")
}
if (lavListInspect(object, "categorical")) {
th.idx <- lavListInspect(object, "th.idx") # to select $(res.)th
if (nBlocks == 1L) th.idx <- list(th.idx) # to loop over
#FIXME when multilevel accepts categorical
}
## H0-model-implied moments, already pooled
## (moments-list nested in block-list)
Implied <- getMethod("fitted", "OLDlavaan.mi")(object, omit.imps = omit.imps)
if (nBlocks == 1L) Implied <- list(Implied) # store single block in a block-list
## template to store observed moments & residuals
RES <- OBS <- vector("list", nBlocks)
## loop over (blocks and) moments
for (b in 1:nBlocks) {
for (nm in names(Implied[[b]])) {
## skip if Implied element is not part of the saturated list
if (is.null(object@h1List[[ useImps[1] ]]$implied[[nm]][[b]])) next
## H1 (saturated model) implied moments
## (block-list nested in moments-list)
momentList <- lapply(object@h1List[useImps],
function(x) x$implied[[nm]][[b]])
OBS[[b]][[nm]] <- Reduce("+", momentList) / m
#TODO: unnecessary calculation if standardized and nm %in% c("th","slopes")
## remove numeric -means from thresholds
if (nm %in% c("th","res.th")) OBS[[b]][[nm]] <- as.numeric(OBS[[b]][[nm]])[ th.idx[[b]] ]
## calculate residuals
if (type == "raw") {
RES[[b]][[nm]] <- OBS[[b]][[nm]] - Implied[[b]][[nm]]
class(RES[[b]][[nm]]) <- class(Implied[[b]][[nm]])
## correlation residuals
} else if (type == "cor.bollen") {
if (nm %in% c("cov","res.cov")) {
RES[[b]][[nm]] <- cov2cor(OBS[[b]][[nm]]) - cov2cor(Implied[[b]][[nm]])
class(RES[[b]][[nm]]) <- c("lavaan.matrix.symmetric","matrix")
## mean structure
} else if (nm == "mean") {
std.obs.M <- OBS[[b]][[nm]] / sqrt(diag(OBS[[b]]$cov))
std.mod.M <- Implied[[b]][[nm]] / sqrt(diag(Implied[[b]]$cov))
RES[[b]][[nm]] <- std.obs.M - std.mod.M
class(RES[[b]][[nm]]) <- c("lavaan.vector","numeric")
} else if (nm == "res.int") {
std.obs.M <- OBS[[b]][[nm]] / sqrt(diag(OBS[[b]]$res.cov))
std.mod.M <- Implied[[b]][[nm]] / sqrt(diag(Implied[[b]]$res.cov))
RES[[b]][[nm]] <- std.obs.M - std.mod.M
class(RES[[b]][[nm]]) <- c("lavaan.vector","numeric")
## thresholds, slopes, cov.x, mean.x
} else {
#FIXME: lavaan currently (0.6-4.1399) returns nothing
next
}
## standardized (by observed SDs) residuals
} else if (type == "cor.bentler") {
if (nm %in% c("cov","mean")) {
SDs <- diag(sqrt(diag(OBS[[b]]$cov)))
dimnames(SDs) <- dimnames(OBS[[b]][[nm]])
} else if (nm %in% c("res.cov","res.int")) {
SDs <- diag(sqrt(diag(OBS[[b]]$res.cov)))
dimnames(SDs) <- dimnames(OBS[[b]][[nm]])
} else {
#FIXME: lavaan currently (0.6-4.1399) returns nothing for "th" or "slopes"
next
}
if (nm %in% c("cov","res.cov")) {
RES[[b]][[nm]] <- solve(SDs) %*% (OBS[[b]][[nm]] - Implied[[b]][[nm]]) %*% solve(SDs)
class(RES[[b]][[nm]]) <- c("lavaan.matrix.symmetric","matrix")
} else if (nm %in% c("mean","res.int")) {
RES[[b]][[nm]] <- (OBS[[b]][[nm]] - Implied[[b]][[nm]]) / diag(SDs)
class(RES[[b]][[nm]]) <- c("lavaan.vector","numeric")
}
}
## copy names from fitted() results
if (is.null(dim(RES[[b]][[nm]]))) {
names(RES[[b]][[nm]]) <- names(Implied[[b]][[nm]])
} else dimnames(RES[[b]][[nm]]) <- dimnames(Implied[[b]][[nm]])
## end loop over moments
}
## add type to beginning of each block's list
RES[[b]] <- c(list(type = type), RES[[b]])
#TODO: Rename (res.)cov to (res.)cor? lavResiduals() does not
}
## drop list for 1 block
if (nBlocks == 1L) {
RES <- RES[[1]]
} else names(RES) <- block.label #FIXME: will lavaan do this in the future?
RES
}
##' @name OLDlavaan.mi-class
##' @aliases residuals,OLDlavaan.mi-method
##' @export
setMethod("residuals", "OLDlavaan.mi", resid_OLDlavaan.mi)
##' @name OLDlavaan.mi-class
##' @aliases resid,OLDlavaan.mi-method
##' @export
setMethod("resid", "OLDlavaan.mi", resid_OLDlavaan.mi)
semTools/R/poolMAlloc.R 0000644 0001762 0000144 00000136236 14764270100 014460 0 ustar ligges users ### Authors:
### Jason D. Rights (Vanderbilt University; jason.d.rights@vanderbilt.edu)
### - based on research from/with Sonya Sterba
### - adapted from OLD parcelAllocation() by Corbin Quick and Alexander Schoemann
### - additional "indices" argument added by Terrence D. Jorgensen
### Last updated: 12 March 2025
##' Combine sampling variability with parcel-allocation variability by
##' pooling results across M parcel-allocations
##'
##' @description
##' This function employs an iterative algorithm to pick the number of random
##' item-to-parcel allocations needed to meet user-defined stability criteria
##' for a fitted structural equation model (SEM) (see **Details** below for
##' more information). Pooled point and standard-error estimates from this SEM
##' can be outputted at this final selected number of allocations (however, it
##' is more efficient to save the allocations and treat them as multiple
##' imputations using [lavaan.mi::lavaan.mi()]; see **See Also** for links with
##' examples). Additionally, new indices (see Sterba & Rights, 2016) are
##' outputted for assessing the relative contributions of parcel-allocation
##' variability vs. sampling variability in each estimate. At each iteration,
##' this function generates a given number of random item-to-parcel allocations,
##' fits a SEM to each allocation, pools estimates across allocations from that
##' iteration, and then assesses whether stopping criteria are met. If stopping
##' criteria are not met, the algorithm increments the number of allocations
##' used (generating all new allocations).
##'
##' @details
##'
##' This function implements an algorithm for choosing the number of allocations
##' (*M*; described in Sterba & Rights, 2016), pools point and
##' standard-error estimates across these *M* allocations, and produces
##' indices for assessing the relative contributions of parcel-allocation
##' variability vs. sampling variability in each estimate.
##'
##' To obtain pooled test statistics for model fit or model comparison, the
##' `list` of parcel allocations can be passed to [lavaan.mi::lavaan.mi()]
##' (find **Examples** on the help pages for [parcelAllocation()]
##' and [PAVranking()]).
##'
##' This function randomly generates a given number (`nAllocStart`) of
##' item-to-parcel allocations, fits a SEM to each allocation, and then
##' increments the number of allocations used (by `nAllocAdd`) until the
##' pooled point and standard-error estimates fulfill stopping criteria
##' (`stopProp` and `stopValue`, defined above). A summary of results
##' from the model that was fit to the *M* allocations are returned.
##'
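##' For intuition, the stopping rule can be sketched in a few lines of R. This
##' is only an illustration (not code used internally); `prev` and `curr` are
##' hypothetical vectors holding the pooled estimates and pooled standard
##' errors from two consecutive iterations:
##'
##' ```
##' ## per-parameter threshold: stopProp times the previous value,
##' ## floored at stopValue
##' threshold <- pmax(abs(prev) * stopProp, stopValue)
##' ## the algorithm stops once every change falls within its threshold
##' stop_now <- all(abs(curr - prev) <= threshold)
##' ```
##'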
##' Additionally, this function outputs the proportion of allocations with
##' solutions that converged (using a maximum likelihood estimator) as well as
##' the proportion of allocations with solutions that were converged and proper.
##' The converged and proper solutions among the final *M* allocations are
##' used in computing pooled results.
##'
##' Additionally, after each iteration of the algorithm, information useful for
##' monitoring the algorithm is printed: the number of allocations used at that
##' iteration, the proportions of pooled parameter estimates and of pooled
##' standard errors meeting stopping criteria at the previous iteration, and
##' the runtime of that iteration. When the stopping criteria are satisfied,
##' the full set of results is returned.
##'
##' For further details on the benefits of the random allocation of items to
##' parcels, see Sterba (2011) and Sterba & MacCallum (2010).
##'
##' @importFrom stats sd pnorm pt qt runif pchisq
##' @importFrom lavaan lavInspect
##'
##' @param nPerPar A list in which each element is a vector, corresponding to
##' each factor, indicating sizes of parcels. If variables are left out of
##' parceling, they should not be accounted for here (i.e., there should not
##' be parcels of size "1").
##' @param facPlc A list of vectors, each corresponding to a factor, specifying
##' the item indicators of that factor (whether included in parceling or not).
##' Either variable names or column numbers. Variables not listed will not be
##' modeled or included in output datasets.
##' @param nAllocStart The number of random allocations of items to parcels to
##' generate in the first iteration of the algorithm.
##' @param nAllocAdd The number of allocations to add with each iteration of the
##' algorithm. Note that if only one iteration is desired, `nAllocAdd` can
##' be set to \eqn{0} and results will be output for `nAllocStart`
##' allocations only.
##' @param syntax lavaan syntax that defines the model.
##' @param dataset Item-level dataset
##' @param parceloutput Optional `character`. Path (folder/directory) where
##' *M* (the final selected number of allocations) parceled data sets will
##' be outputted from the iteration where the algorithm met stopping criteria.
##' Note for Windows users: file path must be specified using forward slashes
##' (`/`), not backslashes (`\\`). See [base::path.expand()]
##' for details. If `NULL` (default), nothing is saved to disk.
##' @param stopProp Value used in defining stopping criteria of the algorithm
##' (\eqn{\delta_a} in Sterba & Rights, 2016). This is the minimum proportion
##' of change (in any pooled parameter or pooled standard error estimate
##' listed in `selectParam`) that is allowable from one iteration of the
##' algorithm to the next. That is, change in pooled estimates and pooled
##' standard errors from one iteration to the next must all be less than
##' (`stopProp`) \eqn{\times} (value from former iteration). Note that
##' `stopValue` can override this criterion (see below). Also note that values
##' less than .01 are unlikely to lead to more substantively meaningful
##' precision. Also note that if only `stopValue` is a desired criterion,
##' `stopProp` can be set to 0.
##' @param stopValue Value used in defining stopping criteria of the algorithm
##' (\eqn{\delta_b} in Sterba & Rights, 2016). `stopValue` is a minimum
##' allowable amount of absolute change (in any pooled parameter or pooled
##' standard error estimate listed in `selectParam`) from one iteration of
##' the algorithm to the next. For a given pooled estimate or pooled standard
##' error, `stopValue` is only invoked as a stopping criterion when the
##' minimum change required by `stopProp` is less than `stopValue`.
##' Note that values less than .01 are unlikely to lead to more substantively
##' meaningful precision. Also note that if only `stopProp` is a desired
##' criterion, `stopValue` can be set to 0.
##' @param selectParam (Optional) A list of the pooled parameters to be used in
##' defining stopping criteria (i.e., `stopProp` and `stopValue`).
##' These parameters should appear in the order they are listed in the lavaan
##' syntax. By default, all pooled parameters are used. Note that
##' `selectParam` should only contain freely-estimated parameters. In one
##' example from Sterba & Rights (2016) `selectParam` included all free
##' parameters except item intercepts and in another example `selectParam`
##' included only structural parameters.
##' @param indices Optional `character` vector indicating the names of
##' available [lavaan::fitMeasures()] to be included in the output.
##' The first and second elements should be a chi-squared test statistic and
##' its associated degrees of freedom, both of which will be added if missing.
##' If `"default"`, the indices will be `c("chisq", "df", "cfi", "tli",
##' "rmsea","srmr")`. If a robust test statistic is requested (see
##' [lavaan::lavOptions()]), `c("chisq","df")` will be replaced
##' by `c("chisq.scaled","df.scaled")`. For the output to include both the
##' naive and robust test statistics, `indices` should include both, but
##' put the scaled test statistics first, as in `indices =
##' c("chisq.scaled", "df.scaled", "chisq", "df")`
##' @param double (Optional) If set to `TRUE`, requires stopping criteria
##' (`stopProp` and `stopValue`) to be met for all parameters (in
##' `selectParam`) for two consecutive iterations of the algorithm. By
##' default, this is set to `FALSE`, meaning stopping criteria need only be
##' met at one iteration of the algorithm.
##' @param names (Optional) A character vector containing the names of parceled
##' variables.
##' @param leaveout (Optional) A vector of variables to be left out of
##' randomized parceling. Either variable names or column numbers are allowed.
##' @param useTotalAlloc (Optional) If set to `TRUE`, function will output
##' a separate set of results that uses all allocations created by the
##' algorithm, rather than *M* allocations (see "Allocations needed for
##' stability" below). This distinction is further discussed in Sterba and
##' Rights (2016).
##' @param checkConv (Optional) If set to `TRUE`, function will output pooled
##' estimates and standard errors from 10 iterations post-convergence.
##' @param \dots Additional arguments to be passed to
##' [lavaan::lavaan()]. See also [lavaan::lavOptions()]
##'
##' @return
##'
##' \item{Estimates}{A table containing pooled results across *M*
##' allocations at the iteration where stopping criteria were met. Columns
##' correspond to individual parameter name, pooled estimate, pooled standard
##' error, *p* value for a *z* test of the parameter, normal-theory \eqn{95\%}
##' CI, *p* value for a *t* test of the parameter (using \eqn{df} described in
##' Sterba & Rights, 2016), and *t*-based \eqn{95\%} CI for the parameter.}
##' \item{Fit}{A table containing results related to model fit from the *M*
##' allocations at the iteration where stopping criteria were met. Columns
##' correspond to fit index names, the mean of each index across allocations,
##' the *SD* of each fit index across allocations, the minimum, maximum and
##' range of each fit index across allocations, and the percent of the *M*
##' allocations where the chi-square test of absolute fit was significant.}
##' \item{Proportions}{A table containing the proportion of the final *M*
##' allocations that (a) met the optimizer convergence criteria and
##' (b) converged to proper solutions. Note that pooled estimates, pooled
##' standard errors, and other results are computed using only the converged,
##' proper allocations.}
##' \item{Stability}{The number of allocations (*M*) needed for stability, at
##' which point the algorithm's stopping criteria (defined above) were met.}
##' \item{Uncertainty}{Indices used to quantify uncertainty in estimates due to
##' sample vs. allocation variability. A table containing individual parameter
##' names, an estimate of the proportion of total variance of a pooled
##' parameter estimate that is attributable to parcel-allocation variability
##' (PPAV), and an estimate of the ratio of the between-allocation variance of
##' a pooled parameter estimate to the within-allocation variance (RPAV).
##' See Sterba & Rights (2016) for more detail.}
##' \item{Time}{The total runtime of the function, in minutes. Note that the
##' total runtime will be greater when the specified model encounters
##' convergence problems for some allocations, as is the case with the
##' [simParcel] dataset used below.}
##'
##'
##' @author
##' Jason D. Rights (Vanderbilt University; \email{jason.d.rights@@vanderbilt.edu})
##'
##' The author would also like to credit Corbin Quick and Alexander Schoemann
##' for providing the original [parcelAllocation()] function (prior to its
##' revision by Terrence D. Jorgensen) on which this function is based.
##'
##' @seealso
##' [lavaan.mi::lavaan.mi()] for treating allocations as multiple imputations
##' to pool results across allocations. See **Examples** on help pages for
##' [parcelAllocation()] (when fitting a single model) and [PAVranking()]
##' (when comparing 2 models).
##'
##' @references
##'
##' Sterba, S. K. (2011). Implications of parcel-allocation
##' variability for comparing fit of item-solutions and parcel-solutions.
##' *Structural Equation Modeling, 18*(4), 554--577.
##' \doi{10.1080/10705511.2011.607073}
##'
##' Sterba, S. K., & MacCallum, R. C. (2010). Variability in parameter estimates
##' and model fit across random allocations of items to parcels.
##' *Multivariate Behavioral Research, 45*(2), 322--358.
##' \doi{10.1080/00273171003680302}
##'
##' Sterba, S. K., & Rights, J. D. (2016). Accounting for parcel-allocation
##' variability in practice: Combining sources of uncertainty and choosing the
##' number of allocations. *Multivariate Behavioral Research, 51*(2--3),
##' 296--313. \doi{10.1080/00273171.2016.1144502}
##'
##' Sterba, S. K., & Rights, J. D. (2017). Effects of parceling on model
##' selection: Parcel-allocation variability in model ranking.
##' *Psychological Methods, 22*(1), 47--68. \doi{10.1037/met0000067}
##'
##' @examples
##'
##' \donttest{
##' ## lavaan syntax: a 2-correlated-factor CFA model
##' ## to be fit to parceled data
##'
##' parmodel <- '
##' f1 =~ NA*p1f1 + p2f1 + p3f1
##' f2 =~ NA*p1f2 + p2f2 + p3f2
##' p1f1 ~ 1
##' p2f1 ~ 1
##' p3f1 ~ 1
##' p1f2 ~ 1
##' p2f2 ~ 1
##' p3f2 ~ 1
##' p1f1 ~~ p1f1
##' p2f1 ~~ p2f1
##' p3f1 ~~ p3f1
##' p1f2 ~~ p1f2
##' p2f2 ~~ p2f2
##' p3f2 ~~ p3f2
##' f1 ~~ 1*f1
##' f2 ~~ 1*f2
##' f1 ~~ f2
##' '
##'
##' ## specify items for each factor
##' f1name <- colnames(simParcel)[1:9]
##' f2name <- colnames(simParcel)[10:18]
##'
##' ## run function
##' poolMAlloc(nPerPar = list(c(3,3,3), c(3,3,3)),
##' facPlc = list(f1name, f2name), nAllocStart = 10, nAllocAdd = 10,
##' syntax = parmodel, dataset = simParcel, stopProp = .03,
##' stopValue = .03, selectParam = c(1:6, 13:18, 21),
##' names = list("p1f1","p2f1","p3f1","p1f2","p2f2","p3f2"),
##' double = FALSE, useTotalAlloc = FALSE)
##' }
##'
##' ## See examples on ?parcelAllocation and ?PAVranking for how to obtain
##' ## pooled test statistics and other pooled lavaan output.
##' ## Details provided in Sterba & Rights (2016).
##'
##' @export
poolMAlloc <- function(nPerPar, facPlc, nAllocStart, nAllocAdd = 0,
parceloutput = NULL, syntax, dataset, stopProp, stopValue,
selectParam = NULL, indices = "default", double = FALSE,
checkConv = FALSE, names = "default", leaveout = 0,
useTotalAlloc = FALSE, ...) {
message('Note that more options for pooled results are available using the ',
'lavaan.mi package (see Examples on ?parcelAllocation and ?PAVranking)')
if (!is.null(parceloutput)) {
if (!dir.exists(parceloutput)) stop('invalid directory:\n',
paste(parceloutput), "\n\n")
}
StartTimeFull <- proc.time()
if (is.character(dataset)) dataset <- utils::read.csv(dataset)
if (indices[1] == "default") indices <- c("chisq", "df", "cfi", "tli", "rmsea", "srmr")
## make sure chi-squared and df are the first and second elements
requestedChiSq <- grep(pattern = "chisq", indices, value = TRUE)
if (length(requestedChiSq) == 0L) {
indices <- unique(c("chisq", indices))
} else {
indices <- unique(c(requestedChiSq[1], indices))
}
requestedDF <- grep(pattern = "df", indices, value = TRUE)
if (length(requestedDF) == 0L) {
indices <- unique(c(indices[1], "df", indices[-1]))
} else {
indices <- unique(c(indices[1], requestedDF[1], indices[-1]))
}
isProperSolution <- function(object) {
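## A solution is "proper" when no variance estimate is negative and the
## model-implied latent and residual covariance matrices are positive
## (semi)definite in every group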
lavpartable <- object@ParTable
lavfit <- object@Fit
lavdata <- object@Data
lavmodel <- object@Model
var.idx <- which(lavpartable$op == "~~" & lavpartable$lhs == lavpartable$rhs)
if (length(var.idx) > 0L && any(lavfit@est[var.idx] < 0)) return(FALSE)
if (length(lavaan::lavaanNames(lavpartable, type = "lv.regular")) > 0L) {
ETA <- list(lavInspect(object, "cov.lv"))
for (g in 1:lavdata@ngroups) {
eigvals <- eigen(ETA[[g]], symmetric = TRUE, only.values = TRUE)$values
if (any(eigvals < -1 * .Machine$double.eps^(3/4))) return(FALSE)
}
}
THETA <- list(lavInspect(object, "theta"))
for (g in 1:lavdata@ngroups) {
num.idx <- lavmodel@num.idx[[g]]
if (length(num.idx) > 0L) {
eigvals <- eigen(THETA[[g]][unlist(num.idx),
unlist(num.idx), drop = FALSE], symmetric = TRUE,
only.values = TRUE)$values
if (any(eigvals < -1 * .Machine$double.eps^(3/4))) return(FALSE)
}
}
TRUE
}
nloop <- 0
nAllocStarttemp <- nAllocStart
options(max.print = 1e+06)
BreakCounter <- NA
repeat {
StartTime <- proc.time()
nloop <- nloop + 1
if (double == TRUE & is.na(BreakCounter) == FALSE)
BreakCounter <- BreakCounter + 1
if (checkConv == TRUE & is.na(BreakCounter) == FALSE)
BreakCounter <- BreakCounter + 1
if (nloop > 1) {
if (is.na(BreakCounter) == TRUE) {
Parmn_revFinal <- Parmn_rev[[nloop - 1]]
nConvergedOutput <- nConverged
nConvergedProperOutput <- nConvergedProper
PooledSEwithinvarFinal <- PooledSEwithinvar
PooledSEbetweenvarFinal <- PooledSEbetweenvar
PooledSEFinal <- PooledSE
FitsumOutput <- Fitsum
nAllocOutput <- nAllocStart - nAllocAdd
AllocationsOutput <- Allocations
#ParamFinal <- Param # defined, but never used
}
ParamPooledSE_temp <- ParamPooledSE
ParamTest_temp <- ParamTest
PooledSE_temp <- PooledSE
ParamPoolSEdiffmin <- abs(ParamPooledSE_temp * stopProp)
ParamPoolSEdiffmin[ParamPoolSEdiffmin < stopValue] <- stopValue
ParamDiffMin <- abs(ParamTest * stopProp)
ParamDiffMin[ParamDiffMin < stopValue] <- stopValue
PooledSEmin <- abs(PooledSE * stopProp)
PooledSEmin[PooledSEmin < stopValue] <- stopValue
}
dataset <- as.matrix(dataset)
if (nAllocStart < 2) stop("Minimum of two allocations required.")
if (is.list(facPlc)) {
if (is.numeric(facPlc[[1]][1]) == FALSE) {
facPlcb <- facPlc
Namesv <- colnames(dataset)
for (i in 1:length(facPlc)) {
for (j in 1:length(facPlc[[i]])) {
facPlcb[[i]][j] <- match(facPlc[[i]][j],
Namesv)
}
facPlcb[[i]] <- as.numeric(facPlcb[[i]])
}
facPlc <- facPlcb
}
facPlc2 <- rep(0, ncol(dataset))
for (i in 1:length(facPlc)) {
for (j in 1:length(facPlc[[i]])) {
facPlc2[facPlc[[i]][j]] <- i
}
}
facPlc <- facPlc2
}
if (leaveout != 0) {
if (is.numeric(leaveout) == FALSE) {
leaveoutb <- rep(0, length(leaveout))
Namesv <- colnames(dataset)
for (i in 1:length(leaveout)) {
leaveoutb[i] <- match(leaveout[i], Namesv)
}
leaveout <- as.numeric(leaveoutb)
}
k1 <- 0.001
for (i in 1:length(leaveout)) {
facPlc[leaveout[i]] <- facPlc[leaveout[i]] + k1
k1 <- k1 + 0.001
}
}
if (0 %in% facPlc == TRUE) {
Zfreq <- sum(facPlc == 0)
for (i in 1:Zfreq) {
Zplc <- match(0, facPlc)
dataset <- dataset[, -Zplc]
facPlc <- facPlc[-Zplc]
}
}
if (is.list(nPerPar)) {
nPerPar2 <- c()
for (i in 1:length(nPerPar)) {
Onesp <- sum(facPlc > i & facPlc < i + 1)
nPerPar2 <- c(nPerPar2, nPerPar[i], rep(1, Onesp), recursive = TRUE)
}
nPerPar <- nPerPar2
}
Npp <- c()
for (i in 1:length(nPerPar)) {
Npp <- c(Npp, rep(i, nPerPar[i]))
}
Locate <- sort(round(facPlc))
Maxv <- max(Locate) - 1
if (length(Locate) != length(Npp)) {
stop("** ERROR! ** Parcels incorrectly specified. Check input!")
}
if (Maxv > 0) {
for (i in 1:Maxv) {
Mat <- match(i + 1, Locate)
if (Npp[Mat] == Npp[Mat - 1]) {
stop("** ERROR! ** Parcels incorrectly specified. Check input!")
}
}
}
Onevec <- facPlc - round(facPlc)
NleaveA <- length(Onevec) - sum(Onevec == 0)
NleaveP <- sum(nPerPar == 1)
if (NleaveA < NleaveP) {
warning("** WARNING! ** Single-variable parcels have been requested.",
" Check input!")
}
if (NleaveA > NleaveP)
warning("** WARNING! ** More non-parceled variables have been requested",
" than provided for in parcel vector. Check input!")
if (length(names) > 1) {
if (length(names) != length(nPerPar)) {
warning("** WARNING! ** Number of parcel names provided not equal to",
" number of parcels requested. Check input!")
}
}
Data <- c(1:ncol(dataset))
# Nfactors <- max(facPlc) # defined but never used
Nindicators <- length(Data)
Npar <- length(nPerPar)
Rmize <- runif(Nindicators, 1, Nindicators)
Data <- rbind(facPlc, Rmize, Data)
Results <- matrix(numeric(0), nAllocStart, Nindicators)
Pin <- nPerPar[1]
for (i in 2:length(nPerPar)) {
Pin <- c(Pin, nPerPar[i] + Pin[i - 1])
}
for (i in 1:nAllocStart) {
Data[2, ] <- runif(Nindicators, 1, Nindicators)
Data <- Data[, order(Data[2, ])]
Data <- Data[, order(Data[1, ])]
Results[i, ] <- Data[3, ]
}
Alpha <- rbind(Results[1, ], dataset)
Allocations <- list()
for (i in 1:nAllocStart) {
Ineff <- rep(NA, ncol(Results))
Ineff2 <- c(1:ncol(Results))
for (inefficient in 1:ncol(Results)) {
Ineff[Results[i, inefficient]] <- Ineff2[inefficient]
}
Alpha[1, ] <- Ineff
Beta <- Alpha[, order(Alpha[1, ])]
Temp <- matrix(NA, nrow(dataset), Npar)
TempAA <- if (length(1:Pin[1]) > 1) {
Beta[2:nrow(Beta), 1:Pin[1]]
} else cbind(Beta[2:nrow(Beta), 1:Pin[1]], Beta[2:nrow(Beta), 1:Pin[1]])
Temp[, 1] <- rowMeans(TempAA, na.rm = TRUE)
for (al in 2:Npar) {
Plc <- Pin[al - 1] + 1
TempBB <- if (length(Plc:Pin[al]) > 1) {
Beta[2:nrow(Beta), Plc:Pin[al]]
} else cbind(Beta[2:nrow(Beta), Plc:Pin[al]],
Beta[2:nrow(Beta), Plc:Pin[al]])
Temp[, al] <- rowMeans(TempBB, na.rm = TRUE)
}
if (length(names) > 1) {
colnames(Temp) <- names
}
Allocations[[i]] <- Temp
}
Param <- list()
Fitind <- list()
Converged <- list()
ProperSolution <- list()
ConvergedProper <- list()
for (i in 1:(nAllocStart)) {
data_parcel <- as.data.frame(Allocations[[i]], row.names = NULL, optional = FALSE)
fit <- lavaan::sem(syntax, data = data_parcel, ...)
## if a robust estimator was requested, update fit indices accordingly
requestedTest <- lavInspect(fit, "options")$test
if (any(requestedTest %in% c("satorra.bentler","yuan.bentler",
"yuan.bentler.mplus","scaled.shifted",
"mean.var.adjusted","satterthwaite"))) {
indices[1:2] <- c("chisq.scaled","df.scaled")
} else indices[1:2] <- c("chisq","df")
## check convergence and solution
if (lavInspect(fit, "converged") == TRUE) {
Converged[[i]] <- 1
} else Converged[[i]] <- 0
Param[[i]] <- lavaan::parameterEstimates(fit)[,
c("lhs", "op", "rhs", "est", "se", "z", "pvalue",
"ci.lower", "ci.upper")]
if (isProperSolution(fit) == TRUE & Converged[[i]] == 1) {
ProperSolution[[i]] <- 1
} else ProperSolution[[i]] <- 0
if (any(is.na(Param[[i]][, 5])))
ProperSolution[[i]] <- 0
if (Converged[[i]] == 1 & ProperSolution[[i]] == 1) {
ConvergedProper[[i]] <- 1
} else ConvergedProper[[i]] <- 0
if (ConvergedProper[[i]] == 0)
Param[[i]][, 4:9] <- matrix(data = NA, nrow(Param[[i]]), 6)
if (ConvergedProper[[i]] == 1) {
Fitind[[i]] <- lavaan::fitMeasures(fit, indices)
if (!all(indices %in% names(Fitind[[i]]))) {
invalidIndices <- setdiff(indices, names(Fitind[[i]]))
Fitind[[i]][invalidIndices] <- NA
}
} else Fitind[[i]] <- rep(NA, length(indices))
}
nConverged <- Reduce("+", Converged)
nProperSolution <- Reduce("+", ProperSolution)
nConvergedProper <- Reduce("+", ConvergedProper)
if (nConvergedProper == 0) stop("All allocations failed to converge and/or",
" yielded improper solutions for a given loop.")
Parmn <- Param[[1]]
if (is.null(selectParam))
selectParam <- 1:nrow(Parmn)
ParSE <- matrix(NA, nrow(Parmn), nAllocStart)
ParSEmn <- Parmn[, 5]
Parsd <- matrix(NA, nrow(Parmn), nAllocStart)
Fitmn <- Fitind[[1]]
Fitsd <- matrix(NA, length(Fitmn), nAllocStart)
Sigp <- matrix(NA, nrow(Parmn), nAllocStart)
Fitind <- data.frame(Fitind)
ParamSEsquared <- list()
for (i in 1:nAllocStart) {
ParamSEsquared[[i]] <- cbind(Param[[i]][, 5], Param[[i]][, 5])
if (any(is.na(ParamSEsquared[[i]]) == TRUE)) ParamSEsquared[[i]] <- 0
ParamSEsquared[[i]] <- apply(as.data.frame(ParamSEsquared[[i]]), 1, prod)
Parsd[, i] <- Param[[i]][, 4]
ParSE[, i] <- Param[[i]][, 5]
Sigp[, ncol(Sigp) - i + 1] <- Param[[i]][, 7]
Fitsd[, i] <- Fitind[[i]]
}
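## flag significance: adding 0.45 and rounding maps p values below .05 to 0
## and larger p values to 1, so 1 - rowMeans() is the proportion significant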
Sigp <- Sigp + 0.45
Sigp <- apply(Sigp, c(1, 2), round)
Sigp <- 1 - as.vector(rowMeans(Sigp, na.rm = TRUE))
Parsum <- cbind(apply(Parsd, 1, mean, na.rm = TRUE),
apply(Parsd, 1, sd, na.rm = TRUE),
apply(Parsd, 1, max, na.rm = TRUE),
apply(Parsd, 1, min, na.rm = TRUE),
apply(Parsd, 1, max, na.rm = TRUE) - apply(Parsd, 1, min, na.rm = TRUE),
Sigp)
colnames(Parsum) <- c("Avg Est.", "S.D.", "MAX",
"MIN", "Range", "% Sig")
ParSEmn <- Parmn[, 1:3]
ParSEfn <- cbind(ParSEmn, apply(ParSE, 1, mean, na.rm = TRUE),
apply(ParSE, 1, sd, na.rm = TRUE),
apply(ParSE, 1, max, na.rm = TRUE),
apply(ParSE, 1, min, na.rm = TRUE),
apply(ParSE, 1, max, na.rm = TRUE) - apply(ParSE, 1, min, na.rm = TRUE))
colnames(ParSEfn) <- c("lhs", "op", "rhs", "Avg SE",
"S.D.", "MAX", "MIN", "Range")
Fitsum <- cbind(apply(Fitsd, 1, mean, na.rm = TRUE),
apply(Fitsd, 1, sd, na.rm = TRUE),
apply(Fitsd, 1, max, na.rm = TRUE),
apply(Fitsd, 1, min, na.rm = TRUE),
apply(Fitsd, 1, max, na.rm = TRUE) - apply(Fitsd, 1, min, na.rm = TRUE))
rownames(Fitsum) <- indices
Parmn[, 4:ncol(Parmn)] <- Parmn[, 4:ncol(Parmn)]/nConvergedProper
Parmn <- Parmn[, 1:3]
Parmn <- cbind(Parmn, Parsum)
Fitmn <- Fitmn/nConvergedProper
pChisq <- list()
sigChisq <- list()
for (i in 1:nAllocStart) {
pChisq[[i]] <- (1 - pchisq(Fitsd[1, i], Fitsd[2, i]))
if (is.na(pChisq[[i]]) == FALSE & pChisq[[i]] < 0.05) {
sigChisq[[i]] <- 1
}
else sigChisq[[i]] <- 0
}
PerSigChisq <- (Reduce("+", sigChisq))/nConvergedProper * 100
PerSigChisq <- round(PerSigChisq, 4)
PerSigChisqCol <- c(PerSigChisq, # however many indices != chisq(.scaled)
rep("n/a", sum(!grepl(pattern = "chisq", x = indices))))
Fitsum <- data.frame(Fitsum, PerSigChisqCol, stringsAsFactors = FALSE)
colnames(Fitsum) <- c("Avg Ind", "S.D.", "MAX", "MIN",
"Range", "% Sig")
PooledSEwithinvar <- Reduce("+", ParamSEsquared)/nConvergedProper
PooledSEbetweenvar <- Parmn[, 5]^2
PooledSE <- sqrt(PooledSEwithinvar + PooledSEbetweenvar + PooledSEbetweenvar/nConvergedProper)
ParamPooledSE <- c(Parmn[, 4], PooledSE)
ParamTest <- Parmn[, 4]
if (nloop > 1) {
ParamPoolSEdiff <- abs(ParamPooledSE_temp - ParamPooledSE)
Paramdiff <- abs(ParamTest_temp - ParamTest)
PooledSEdiff <- abs(PooledSE - PooledSE_temp)
ParamPoolSEdifftest <- ParamPoolSEdiff - ParamPoolSEdiffmin
ParamPoolSEdifftest[ParamPoolSEdifftest <= 0] <- 0
ParamPoolSEdifftest[ParamPoolSEdifftest > 0] <- 1
Paramdifftest <- Paramdiff - ParamDiffMin
Paramdifftest[Paramdifftest <= 0] <- 0
Paramdifftest[Paramdifftest > 0] <- 1
PooledSEdifftest <- PooledSEdiff - PooledSEmin
PooledSEdifftest[PooledSEdifftest <= 0] <- 0
PooledSEdifftest[PooledSEdifftest > 0] <- 1
if (nloop == 2) {
ParamPoolSEdifftesttable <- cbind(ParamPoolSEdifftest)
Paramdifftesttable <- cbind(Paramdifftest)
PooledSEdifftesttable <- cbind(PooledSEdifftest)
}
if (nloop > 2) {
ParamPoolSEdifftesttable <- cbind(ParamPoolSEdifftesttable,
ParamPoolSEdifftest)
Paramdifftesttable <- cbind(Paramdifftesttable,
Paramdifftest)
PooledSEdifftesttable <- cbind(PooledSEdifftesttable,
PooledSEdifftest)
}
PropStopParam <- 1 - (Reduce("+", Paramdifftesttable[selectParam,
nloop - 1])/length(selectParam))
PropStopPooled <- 1 - (Reduce("+", PooledSEdifftesttable[selectParam,
nloop - 1])/length(selectParam))
PropStopParamPooled <- 1 - (Reduce("+", ParamPoolSEdifftesttable[c(selectParam, selectParam + nrow(Parmn)), nloop - 1]) /
(2 * length(selectParam)))
if (checkConv == TRUE & is.na(BreakCounter) == TRUE) {
print(nAllocStart)
print("Proportion of pooled estimates meeting stop criteria:")
print(PropStopParam)
print("Proportion of pooled SE meeting stop criteria:")
print(PropStopPooled)
}
if (checkConv == FALSE) {
print(nAllocStart)
print("Proportion of pooled estimates meeting stop criteria:")
print(PropStopParam)
print("Proportion of pooled SE meeting stop criteria:")
print(PropStopPooled)
}
}
nAllocStart <- nAllocStart + nAllocAdd
StopTime <- proc.time() - StartTime
print("Runtime:")
print(StopTime)
Parmn_rev <- list()
Parmn_rev[[nloop]] <- cbind(Parmn[, 1:4], PooledSE)
Parmn_rev[[nloop]][, 4:5] <- sapply(Parmn_rev[[nloop]][,4:5], as.numeric)
colnames(Parmn_rev[[nloop]]) <- c("lhs", "op", "rhs","Estimate", "Pooled SE")
if (nloop == 1) {
Param_revTemp <- cbind(Parmn[, 1:3], Parmn_rev[[nloop]][,4])
Param_revTemp[, 4] <- as.numeric(Param_revTemp[,4])
Param_revTotal <- cbind(Param_revTemp)
PooledSE_revTemp <- cbind(Parmn[, 1:3], Parmn_rev[[nloop]][,5])
PooledSE_revTemp[, 4] <- as.numeric(PooledSE_revTemp[,4])
PooledSE_revTotal <- cbind(PooledSE_revTemp)
}
if (nloop > 1) {
Param_revTemp <- cbind(Parmn_rev[[nloop]][, 4])
Param_revTemp <- as.numeric(Param_revTemp)
Param_revTotal <- cbind(Param_revTotal, Param_revTemp)
PooledSE_revTemp <- cbind(Parmn_rev[[nloop]][,
5])
PooledSE_revTemp <- as.numeric(PooledSE_revTemp)
PooledSE_revTotal <- cbind(PooledSE_revTotal,
PooledSE_revTemp)
}
if (nloop == 1) {
ParamTotal <- Param
FitindTotal <- Fitind
AllocationsTotal <- Allocations
nAllocTotal <- nAllocStart - nAllocAdd
nConvergedTotal <- nConverged
nProperSolutionTotal <- nProperSolution
nConvergedProperTotal <- nConvergedProper
}
if (nloop > 1) {
ParamTotal <- c(ParamTotal, Param)
FitindTotal <- c(FitindTotal, Fitind)
AllocationsTotal <- c(AllocationsTotal, Allocations)
nAllocTotal <- nAllocTotal + nAllocStart - nAllocAdd
nConvergedTotal <- nConverged + nConvergedTotal
nProperSolution <- nProperSolution + nProperSolutionTotal
nConvergedProperTotal <- nConvergedProper + nConvergedProperTotal
}
if (nloop > 1 & double == TRUE & is.na(BreakCounter) == FALSE & BreakCounter == 2) {
if (Reduce("+", ParamPoolSEdifftesttable[c(selectParam,
selectParam + nrow(Parmn_rev[[nloop]])), nloop - 1]) == 0)
break
}
if (nloop > 1 & double == TRUE) {
if (Reduce("+", ParamPoolSEdifftesttable[c(selectParam,
selectParam + nrow(Parmn_rev[[nloop]])), nloop - 1]) == 0) {
BreakCounter <- 1
}
else BreakCounter <- NA
}
if (nloop > 1 & checkConv == TRUE & is.na(BreakCounter) == TRUE) {
if (Reduce("+", ParamPoolSEdifftesttable[c(selectParam,
selectParam + nrow(Parmn_rev[[nloop]])), nloop - 1]) == 0)
BreakCounter <- 0
}
if (nloop > 1 & double == FALSE & checkConv == FALSE) {
if (Reduce("+", ParamPoolSEdifftesttable[c(selectParam,
selectParam + nrow(Parmn_rev[[nloop]])), nloop - 1]) == 0)
break
}
if (nAllocAdd == 0)
break
if (checkConv == TRUE & is.na(BreakCounter) == FALSE & BreakCounter == 9)
break
}
if (nAllocAdd == 0) {
Parmn_revFinal <- Parmn_rev[[nloop]]
nConvergedOutput <- nConverged
nConvergedProperOutput <- nConvergedProper
PooledSEwithinvarFinal <- PooledSEwithinvar
PooledSEbetweenvarFinal <- PooledSEbetweenvar
PooledSEFinal <- PooledSE
FitsumOutput <- Fitsum
nAllocOutput <- nAllocStart - nAllocAdd
AllocationsOutput <- Allocations
}
if (!is.null(parceloutput)) {
replist <- matrix(NA, nAllocOutput, 1)
for (i in 1:(nAllocOutput)) {
colnames(AllocationsOutput[[i]]) <- names
utils::write.table(AllocationsOutput[[i]],
file = paste(parceloutput, "/parcelruns", i, ".dat", sep = ""),
row.names = FALSE, col.names = TRUE)
replist[i, 1] <- paste("parcelruns", i, ".dat", sep = "")
}
utils:: write.table(replist, paste(parceloutput, "/parcelrunsreplist.dat",
sep = ""), quote = FALSE, row.names = FALSE,
col.names = FALSE)
}
if (useTotalAlloc == TRUE) {
ParmnTotal <- ParamTotal[[1]]
ParSETotal <- matrix(NA, nrow(ParmnTotal), nAllocTotal)
ParSEmnTotal <- ParmnTotal[, 5]
ParsdTotal <- matrix(NA, nrow(ParmnTotal), nAllocTotal)
FitmnTotal <- FitindTotal[[1]]
FitsdTotal <- matrix(NA, length(FitmnTotal), nAllocTotal)
SigpTotal <- matrix(NA, nrow(ParmnTotal), nAllocTotal)
FitindTotal <- data.frame(FitindTotal)
ParamSEsquaredTotal <- list()
for (i in 1:nAllocTotal) {
ParamSEsquaredTotal[[i]] <- cbind(ParamTotal[[i]][,5], ParamTotal[[i]][, 5])
if (any(is.na(ParamSEsquaredTotal[[i]]) == TRUE))
ParamSEsquaredTotal[[i]] <- 0
ParamSEsquaredTotal[[i]] <- apply(as.data.frame(ParamSEsquaredTotal[[i]]),1, prod)
ParsdTotal[, i] <- ParamTotal[[i]][, 4]
ParSETotal[, i] <- ParamTotal[[i]][, 5]
SigpTotal[, ncol(SigpTotal) - i + 1] <- ParamTotal[[i]][, 7]
FitsdTotal[, i] <- FitindTotal[[i]]
}
SigpTotal <- SigpTotal + 0.45
SigpTotal <- apply(SigpTotal, c(1, 2), round)
SigpTotal <- 1 - as.vector(rowMeans(SigpTotal, na.rm = TRUE))
ParsumTotal <- cbind(apply(ParsdTotal, 1, mean, na.rm = TRUE),
apply(ParsdTotal, 1, sd, na.rm = TRUE),
apply(ParsdTotal, 1, max, na.rm = TRUE),
apply(ParsdTotal, 1, min, na.rm = TRUE),
apply(ParsdTotal, 1, max, na.rm = TRUE) - apply(ParsdTotal, 1, min, na.rm = TRUE),
SigpTotal)
colnames(ParsumTotal) <- c("Avg Est.", "S.D.", "MAX", "MIN", "Range", "% Sig")
ParSEmnTotal <- ParmnTotal[, 1:3]
ParSEfnTotal <- cbind(ParSEmnTotal,
apply(ParSETotal, 1, mean, na.rm = TRUE),
apply(ParSETotal, 1, sd, na.rm = TRUE),
apply(ParSETotal, 1, max, na.rm = TRUE),
apply(ParSETotal, 1, min, na.rm = TRUE),
apply(ParSETotal, 1, max, na.rm = TRUE) - apply(ParSETotal, 1, min, na.rm = TRUE))
colnames(ParSEfnTotal) <- c("lhs", "op", "rhs", "Avg SE",
"S.D.", "MAX", "MIN", "Range")
FitsumTotal <- cbind(apply(FitsdTotal, 1, mean, na.rm = TRUE),
apply(FitsdTotal, 1, sd, na.rm = TRUE),
apply(FitsdTotal, 1, max, na.rm = TRUE),
apply(FitsdTotal, 1, min, na.rm = TRUE),
apply(FitsdTotal, 1, max, na.rm = TRUE) - apply(FitsdTotal, 1, min, na.rm = TRUE))
rownames(FitsumTotal) <- indices
ParmnTotal[, 4:ncol(ParmnTotal)] <- ParmnTotal[, 4:ncol(ParmnTotal)]/nConvergedProperTotal
ParmnTotal <- ParmnTotal[, 1:3]
ParmnTotal <- cbind(ParmnTotal, ParsumTotal)
FitmnTotal <- FitmnTotal/nConvergedProperTotal
pChisqTotal <- list()
sigChisqTotal <- list()
for (i in 1:nAllocTotal) {
pChisqTotal[[i]] <- (1 - pchisq(FitsdTotal[1,i], FitsdTotal[2, i]))
if (is.na(pChisqTotal[[i]]) == FALSE & pChisqTotal[[i]] < 0.05) {
sigChisqTotal[[i]] <- 1
} else sigChisqTotal[[i]] <- 0
}
PerSigChisqTotal <- (Reduce("+", sigChisqTotal))/nConvergedProperTotal * 100
PerSigChisqTotal <- round(PerSigChisqTotal, 4)
PerSigChisqColTotal <- c(PerSigChisqTotal, # however many indices != chisq(.scaled)
rep("n/a", sum(!grepl(pattern = "chisq", x = indices))))
FitsumTotal <- data.frame(FitsumTotal, PerSigChisqColTotal, stringsAsFactors = FALSE)
colnames(FitsumTotal) <- c("Avg Ind", "S.D.", "MAX", "MIN", "Range", "% Sig")
PooledSEwithinvarTotal <- Reduce("+", ParamSEsquaredTotal)/nConvergedProperTotal
PooledSEbetweenvarTotal <- ParmnTotal[, 5]^2
PooledSETotal <- sqrt(PooledSEwithinvarTotal + PooledSEbetweenvarTotal +
PooledSEbetweenvarTotal/nConvergedProperTotal)
ParamPooledSETotal <- c(ParmnTotal[, 4], PooledSETotal)
ParamTestTotal <- ParmnTotal[, 4]
Parmn_revTotal <- cbind(ParmnTotal[, 1:4], PooledSETotal)
Parmn_revTotal[, 4:5] <- sapply(Parmn_revTotal[,4:5], as.numeric)
colnames(Parmn_revTotal) <- c("lhs", "op", "rhs",
"Estimate", "Pooled SE")
df_tTotal <- (nConvergedProperTotal - 1) *
(1 + (nConvergedProperTotal * PooledSEwithinvarTotal)/(nConvergedProperTotal *
PooledSEbetweenvarTotal + PooledSEbetweenvarTotal))^2
crit_tTotal <- abs(qt(0.05/2, df_tTotal))
pval_zTotal <- 2 * (1 - pnorm(abs(Parmn_revTotal[, 4]/PooledSETotal)))
pval_tTotal <- 2 * (1 - pt(abs(Parmn_revTotal[, 4]/PooledSETotal),
df = df_tTotal))
CI95_Lower_zTotal <- Parmn_revTotal[, 4] - 1.959963985 * PooledSETotal
CI95_Upper_zTotal <- Parmn_revTotal[, 4] + 1.959963985 * PooledSETotal
CI95_Lower_tTotal <- Parmn_revTotal[, 4] - crit_tTotal * PooledSETotal
CI95_Upper_tTotal <- Parmn_revTotal[, 4] + crit_tTotal * PooledSETotal
Parmn_revTotal <- cbind(Parmn_revTotal, pval_zTotal,
CI95_Lower_zTotal, CI95_Upper_zTotal, pval_tTotal,
CI95_Lower_tTotal, CI95_Upper_tTotal)
colnames(Parmn_revTotal) <- c("lhs", "op", "rhs",
"Pooled Est", "Pooled SE", "pval_z", "CI95_LB_z",
"CI95_UB_z", "pval_t", "CI95_LB_t", "CI95_UB_t")
for (i in 1:nrow(Parmn_revTotal)) {
if (Parmn_revTotal[i, 5] == 0)
Parmn_revTotal[i, 6:11] <- NA
}
RPAVTotal <- (PooledSEbetweenvarTotal + (PooledSEbetweenvarTotal/(nConvergedProperTotal)))/PooledSEwithinvarTotal
PPAVTotal <- (((nConvergedProperTotal + 1)/(nConvergedProperTotal)) *
PooledSEbetweenvarTotal)/(PooledSEwithinvarTotal +
(((nConvergedProperTotal + 1)/(nConvergedProperTotal)) * PooledSEbetweenvarTotal))
PAVtableTotal <- cbind(ParmnTotal[1:3], RPAVTotal, PPAVTotal)
Parmn_revTotal[, 4:11] <- apply(Parmn_revTotal[, 4:11], 2, round, digits = 4)
FitsumTotal[, 1:5] <- apply(FitsumTotal[, 1:5], 2, round, digits = 4)
PAVtableTotal[, 4:5] <- apply(PAVtableTotal[, 4:5], 2, round, digits = 4)
FitsumTotal[2, 2:5] <- c("n/a", "n/a", "n/a", "n/a")
ConvergedProperSumTotal <- rbind((nConvergedTotal)/(nAllocTotal),
(nConvergedProperTotal)/(nAllocTotal))
rownames(ConvergedProperSumTotal) <- c("Converged", "Converged and Proper")
colnames(ConvergedProperSumTotal) <- "Proportion of Allocations"
}
if (nAllocAdd != 0) {
if (nloop == 2) {
PropParamMet <- matrix(data = 1, nrow(Parmn), 1)
PropPooledSEMet <- matrix(data = 1, nrow(Parmn), 1)
}
if (nloop != 2) {
PropParamMet <- (1 - apply(Paramdifftesttable[, 1:nloop - 1], 1, mean)) * 100
PropPooledSEMet <- (1 - apply(PooledSEdifftesttable[,1:nloop - 1], 1, mean)) * 100
}
FirstParamMet <- apply(Paramdifftesttable == 0, 1, which.max)
FirstPooledSEMet <- apply(PooledSEdifftesttable == 0, 1, which.max)
}
if (nAllocAdd == 0) {
PropParamMet <- matrix(data = NA, nrow(Parmn), 1)
PropPooledSEMet <- matrix(data = NA, nrow(Parmn), 1)
FirstParamMet <- matrix(data = NA, nrow(Parmn), 1)
FirstPooledSEMet <- matrix(data = NA, nrow(Parmn), 1)
}
PerLoops <- cbind(Parmn[, 1:3], PropParamMet, PropPooledSEMet)
colnames(PerLoops) <- c("lhs", "op", "rhs", "Param Criteria Met",
"PooledSE Criteria Met")
FirstLoops <- cbind(Parmn[, 1:3], FirstParamMet, FirstPooledSEMet)
colnames(FirstLoops) <- c("lhs", "op", "rhs", "Param Criteria Met",
"PooledSE Criteria Met")
NumbAllocations <- cbind(Parmn[, 1:3],
(FirstParamMet - 1) * nAllocAdd + nAllocStarttemp,
(FirstPooledSEMet - 1) * nAllocAdd + nAllocStarttemp)
colnames(NumbAllocations) <- c("lhs", "op", "rhs", "Param Criteria Met",
"PooledSE Criteria Met")
if (nAllocAdd != 0) {
for (i in 1:nrow(Parmn)) {
if ((i %in% selectParam) == FALSE)
PerLoops[i, 4:5] <- NA
if ((i %in% selectParam) == FALSE)
FirstLoops[i, 4:5] <- NA
if ((i %in% selectParam) == FALSE)
NumbAllocations[i, 4:5] <- NA
}
}
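## Rubin's (1987) degrees of freedom for the pooled t test:
## df = (M - 1) * (1 + M*W / ((M + 1)*B))^2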
df_t <- (nConvergedProperOutput - 1) *
(1 + (nConvergedProperOutput * PooledSEwithinvarFinal) /
(nConvergedProperOutput * PooledSEbetweenvarFinal + PooledSEbetweenvarFinal))^2
crit_t <- abs(qt(0.05/2, df_t))
pval_z <- 2 * (1 - pnorm(abs(Parmn_revFinal[, 4]/PooledSEFinal)))
pval_t <- 2 * (1 - pt(abs(Parmn_revFinal[, 4]/PooledSEFinal),
df = df_t))
CI95_Lower_z <- Parmn_revFinal[, 4] - 1.959963985 * PooledSEFinal
CI95_Upper_z <- Parmn_revFinal[, 4] + 1.959963985 * PooledSEFinal
CI95_Lower_t <- Parmn_revFinal[, 4] - crit_t * PooledSEFinal
CI95_Upper_t <- Parmn_revFinal[, 4] + crit_t * PooledSEFinal
Parmn_revFinal <- cbind(Parmn_revFinal, pval_z, CI95_Lower_z,
CI95_Upper_z, pval_t, CI95_Lower_t, CI95_Upper_t)
colnames(Parmn_revFinal) <- c("lhs", "op", "rhs", "Pooled Est",
"Pooled SE", "pval_z", "CI95_LB_z", "CI95_UB_z",
"pval_t", "CI95_LB_t", "CI95_UB_t")
for (i in 1:nrow(Parmn_revFinal)) {
if (Parmn_revFinal[i, 5] == 0 | is.na(Parmn_revFinal[i, 5]) == TRUE)
Parmn_revFinal[i, 6:11] <- NA
}
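## RPAV: ratio of between-allocation variance (plus finite-M correction) to
## within-allocation variance; PPAV: proportion of total variance due to
## parcel-allocation variability (Sterba & Rights, 2016)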
RPAV <- (PooledSEbetweenvarFinal + (PooledSEbetweenvarFinal/(nConvergedProperOutput)))/PooledSEwithinvarFinal
PPAV <- (((nConvergedProperOutput + 1)/(nConvergedProperOutput)) *
PooledSEbetweenvarFinal)/(PooledSEwithinvarFinal +
(((nConvergedProperOutput + 1)/(nConvergedProperOutput)) *
PooledSEbetweenvarFinal))
PAVtable <- cbind(Parmn[1:3], RPAV, PPAV)
colnames(Param_revTotal) <- c("lhs", "op", "rhs", c(1:nloop))
colnames(PooledSE_revTotal) <- c("lhs", "op", "rhs",
c(1:nloop))
Param_revTotal[, 4:(nloop + 3)] <- sapply(Param_revTotal[,
4:(nloop + 3)], as.numeric)
PooledSE_revTotal[, 4:(nloop + 3)] <- sapply(PooledSE_revTotal[,
4:(nloop + 3)], as.numeric)
Parmn_revFinal[, 4:11] <- apply(Parmn_revFinal[, 4:11],
2, round, digits = 4)
FitsumOutput[, 1:5] <- apply(FitsumOutput[, 1:5], 2,
round, digits = 4)
if (nAllocAdd != 0)
Param_revTotal[, 4:(nloop + 3)] <- apply(Param_revTotal[,
4:(nloop + 3)], 2, round, digits = 8)
if (nAllocAdd == 0)
Param_revTotal[, 4] <- round(Param_revTotal[, 4],
8)
if (nAllocAdd != 0)
PooledSE_revTotal[, 4:(nloop + 3)] <- apply(PooledSE_revTotal[,
4:(nloop + 3)], 2, round, digits = 8)
if (nAllocAdd == 0)
PooledSE_revTotal[, 4] <- round(PooledSE_revTotal[,
4], 8)
PAVtable[, 4:5] <- apply(PAVtable[, 4:5], 2, round, digits = 4)
FitsumOutput[2, 2:5] <- c("n/a", "n/a", "n/a", "n/a")
ConvergedProperSum <- rbind((nConvergedOutput)/(nAllocOutput),
(nConvergedProperOutput)/(nAllocOutput))
rownames(ConvergedProperSum) <- c("Converged", "Converged and Proper")
colnames(ConvergedProperSum) <- "Proportion of Allocations"
StopTimeFull <- proc.time() - StartTimeFull
if (useTotalAlloc == FALSE) {
Output_mod <- list(Parmn_revFinal, FitsumOutput,
ConvergedProperSum, nAllocOutput, PAVtable, StopTimeFull[[3]]/60)
names(Output_mod) <- c("Estimates", "Fit",
"Proportion of Converged and Proper Allocations",
"Allocations needed for stability (M)",
"Indices to quantify uncertainty in estimates due to sampling vs. allocation variability",
"Total runtime (minutes)")
}
if (useTotalAlloc == TRUE) {
Output_mod <- list(Parmn_revFinal, FitsumOutput,
ConvergedProperSum, nAllocOutput, PAVtable, Parmn_revTotal,
FitsumTotal, ConvergedProperSumTotal, nAllocTotal,
PAVtableTotal, StopTimeFull[[3]]/60)
names(Output_mod) <- c("Estimates (using M allocations)", "Fit (using M allocations)",
"Proportion of Converged and Proper Allocations (using M allocations)",
"Allocations needed for stability (M)",
"Indices to quantify uncertainty in estimates due to sampling vs. allocation variability (using M allocations)",
"Estimates (using all allocations)", "Fit (using all allocations)",
"Proportion of Converged and Proper Allocations (using all allocations)",
"Total Allocations used by algorithm",
"Indices to quantify uncertainty in estimates due to sampling vs. allocation variability (using all allocations)",
"Total runtime (minutes)")
}
if (exists("invalidIndices")) {
if (length(invalidIndices)) message('\n\nInvalid fit indices requested: ',
paste(invalidIndices, collapse = ", "),
"\n\n")
}
return(Output_mod)
}
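## A minimal standalone sketch (not part of the package API) of the pooling
## rules applied above (Rubin, 1987). The inputs are hypothetical: `est` holds
## one parameter's estimate per allocation, and `se` its standard error.
poolAcrossAllocations <- function(est, se) {
  m <- length(est)                  # number of allocations
  W <- mean(se^2)                   # within-allocation variance
  B <- var(est)                     # between-allocation variance
  pooledSE <- sqrt(W + B + B/m)     # total variance -> pooled SE
  df <- (m - 1) * (1 + (m * W) / (m * B + B))^2
  ci <- mean(est) + c(-1, 1) * qt(0.975, df) * pooledSE
  c(est = mean(est), se = pooledSE, df = df, ci.lower = ci[1], ci.upper = ci[2])
}
## Example:
## poolAcrossAllocations(est = rnorm(10, 0.5, 0.05), se = runif(10, 0.08, 0.12))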
semTools/R/data.R 0000644 0001762 0000144 00000011222 14632143376 013323 0 ustar ligges users ### Terrence D. Jorgensen
### Last updated: 4 April 2017
### document example data sets
#' Simulated Dataset to Demonstrate Two-way Latent Interaction
#'
#' A simulated data set with 2 independent factors and 1 dependent factor,
#' where each factor has three indicators
#'
#'
#' @format A `data.frame` with 500 observations of 9 variables.
#' \describe{
#' \item{x1}{The first indicator of the first independent factor}
#' \item{x2}{The second indicator of the first independent factor}
#' \item{x3}{The third indicator of the first independent factor}
#' \item{x4}{The first indicator of the second independent factor}
#' \item{x5}{The second indicator of the second independent factor}
#' \item{x6}{The third indicator of the second independent factor}
#' \item{x7}{The first indicator of the dependent factor}
#' \item{x8}{The second indicator of the dependent factor}
#' \item{x9}{The third indicator of the dependent factor}
#' }
#' @source Data were generated by the [MASS::mvrnorm()] function in
#' the `MASS` package.
#' @examples head(dat2way)
"dat2way"
#' Simulated Dataset to Demonstrate Three-way Latent Interaction
#'
#' A simulated data set with 3 independent factors and 1 dependent factor,
#' where each factor has three indicators
#'
#'
#' @format A `data.frame` with 500 observations of 12 variables.
#' \describe{
#' \item{x1}{The first indicator of the first independent factor}
#' \item{x2}{The second indicator of the first independent factor}
#' \item{x3}{The third indicator of the first independent factor}
#' \item{x4}{The first indicator of the second independent factor}
#' \item{x5}{The second indicator of the second independent factor}
#' \item{x6}{The third indicator of the second independent factor}
#' \item{x7}{The first indicator of the third independent factor}
#' \item{x8}{The second indicator of the third independent factor}
#' \item{x9}{The third indicator of the third independent factor}
#' \item{x10}{The first indicator of the dependent factor}
#' \item{x11}{The second indicator of the dependent factor}
#' \item{x12}{The third indicator of the dependent factor}
#' }
#' @source Data were generated by the [MASS::mvrnorm()] function in
#' the `MASS` package.
#' @examples head(dat3way)
"dat3way"
#' Simulated Dataset to Demonstrate Categorical Measurement Invariance
#'
#' A simulated data set with 2 factors, each with 4 indicators, observed in
#' two groups
#'
#'
#' @format A `data.frame` with 200 observations of 9 variables.
#' \describe{
#' \item{g}{Sex of respondents}
#' \item{u1}{Indicator 1}
#' \item{u2}{Indicator 2}
#' \item{u3}{Indicator 3}
#' \item{u4}{Indicator 4}
#' \item{u5}{Indicator 5}
#' \item{u6}{Indicator 6}
#' \item{u7}{Indicator 7}
#' \item{u8}{Indicator 8}
#' }
#' @source Data were generated using the `lavaan` package.
#' @examples head(datCat)
"datCat"
#' Simulated Dataset to Demonstrate Longitudinal Measurement Invariance
#'
#' A simulated data set with 1 factor measured by 3 indicators at three time points
#'
#'
#' @format A `data.frame` with 200 observations of 10 variables.
#' \describe{
#' \item{sex}{Sex of respondents}
#' \item{y1t1}{Indicator 1 in Time 1}
#' \item{y2t1}{Indicator 2 in Time 1}
#' \item{y3t1}{Indicator 3 in Time 1}
#' \item{y1t2}{Indicator 1 in Time 2}
#' \item{y2t2}{Indicator 2 in Time 2}
#' \item{y3t2}{Indicator 3 in Time 2}
#' \item{y1t3}{Indicator 1 in Time 3}
#' \item{y2t3}{Indicator 2 in Time 3}
#' \item{y3t3}{Indicator 3 in Time 3}
#' }
#' @source Data were generated using the `simsem` package.
#' @examples head(exLong)
"exLong"
#' Simulated Dataset to Demonstrate Random Allocations of Parcels
#'
#' A simulated data set with 2 factors, each measured by 9 indicators
#'
#'
#' @format A `data.frame` with 800 observations of 18 variables.
#' \describe{
#' \item{f1item1}{Item 1 loading on factor 1}
#' \item{f1item2}{Item 2 loading on factor 1}
#' \item{f1item3}{Item 3 loading on factor 1}
#' \item{f1item4}{Item 4 loading on factor 1}
#' \item{f1item5}{Item 5 loading on factor 1}
#' \item{f1item6}{Item 6 loading on factor 1}
#' \item{f1item7}{Item 7 loading on factor 1}
#' \item{f1item8}{Item 8 loading on factor 1}
#' \item{f1item9}{Item 9 loading on factor 1}
#' \item{f2item1}{Item 1 loading on factor 2}
#' \item{f2item2}{Item 2 loading on factor 2}
#' \item{f2item3}{Item 3 loading on factor 2}
#' \item{f2item4}{Item 4 loading on factor 2}
#' \item{f2item5}{Item 5 loading on factor 2}
#' \item{f2item6}{Item 6 loading on factor 2}
#' \item{f2item7}{Item 7 loading on factor 2}
#' \item{f2item8}{Item 8 loading on factor 2}
#' \item{f2item9}{Item 9 loading on factor 2}
#' }
#' @source Data were generated using the `simsem` package.
#' @examples head(simParcel)
"simParcel"
semTools/R/indProd.R 0000644 0001762 0000144 00000034020 14764266564 014025 0 ustar ligges users ### Sunthud Pornprasertmanit and Alexander M. Schoemann
### Last updated: 12 March 2025
### prepare product indicators for 2-way and 3-way interactions in SEM
##' Make products of indicators using no centering, mean centering, double-mean
##' centering, or residual centering
##'
##' The `indProd` function will make products of indicators using no
##' centering, mean centering, double-mean centering, or residual centering. The
##' `orthogonalize` function is a shortcut for the `indProd` function
##' that makes residual-centered product indicators.
##'
##'
##' @aliases indProd orthogonalize
##' @importFrom stats lm
##'
##' @param data The desired data to be transformed.
##' @param var1 Names or indices of the variables loaded on the first factor
##' @param var2 Names or indices of the variables loaded on the second factor
##' @param var3 Names or indices of the variables loaded on the third factor
##' (for three-way interaction)
##' @param match Specify `TRUE` to use match-paired approach (Marsh, Wen, &
##' Hau, 2004). If `FALSE`, the resulting products are all possible
##' products.
##' @param meanC Specify `TRUE` for mean centering the main effect
##' indicator before making the products
##' @param residualC Specify `TRUE` for residual centering the products by
##' the main effect indicators (Little, Bovaird, & Widaman, 2006).
##' @param doubleMC Specify `TRUE` for centering the resulting products
##' (Lin et al., 2010)
##' @param namesProd The names of resulting products
##' @return The original data attached with the products.
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com}) Alexander
##' Schoemann (East Carolina University; \email{schoemanna@@ecu.edu})
##' @seealso \itemize{ \item [probe2WayMC()] For probing the two-way
##' latent interaction when the results are obtained from mean-centering, or
##' double-mean centering. \item [probe3WayMC()] For probing the
##' three-way latent interaction when the results are obtained from
##' mean-centering, or double-mean centering. \item [probe2WayRC()]
##' For probing the two-way latent interaction when the results are obtained
##' from the residual-centering approach. \item [probe3WayRC()] For
##' probing the three-way latent interaction when the results are obtained from
##' the residual-centering approach. \item [plotProbe()] Plot the simple
##' intercepts and slopes of the latent interaction. }
##' @references Marsh, H. W., Wen, Z. & Hau, K. T. (2004). Structural equation
##' models of latent interactions: Evaluation of alternative estimation
##' strategies and indicator construction. *Psychological Methods, 9*(3),
##' 275--300. \doi{10.1037/1082-989X.9.3.275}
##'
##' Lin, G. C., Wen, Z., Marsh, H. W., & Lin, H. S. (2010). Structural equation
##' models of latent interactions: Clarification of orthogonalizing and
##' double-mean-centering strategies. *Structural Equation Modeling, 17*(3),
##' 374--391. \doi{10.1080/10705511.2010.488999}
##'
##' Little, T. D., Bovaird, J. A., & Widaman, K. F. (2006). On the merits of
##' orthogonalizing powered and product terms: Implications for modeling
##' interactions among latent variables. *Structural Equation Modeling,
##' 13*(4), 497--519. \doi{10.1207/s15328007sem1304_1}
##' @examples
##'
##' ## Mean centering / two-way interaction / match-paired
##' dat <- indProd(attitude[ , -1], var1 = 1:3, var2 = 4:6)
##'
##' ## Residual centering / two-way interaction / all possible products
##' dat2 <- indProd(attitude[ , -1], var1 = 1:3, var2 = 4:6, match = FALSE,
##' meanC = FALSE, residualC = TRUE, doubleMC = FALSE)
##'
##' ## Double-mean centering / two-way interaction / all possible products
##' dat3 <- indProd(attitude[ , -1], var1 = 1:3, var2 = 4:6, match = FALSE,
##' meanC = TRUE, residualC = FALSE, doubleMC = TRUE)
##'
##' ## Mean centering / three-way interaction / match-paired
##' dat4 <- indProd(attitude[ , -1], var1 = 1:2, var2 = 3:4, var3 = 5:6)
##'
##' ## Residual centering / three-way interaction / all possible products
##' dat5 <- orthogonalize(attitude[ , -1], var1 = 1:2, var2 = 3:4, var3 = 5:6,
##' match = FALSE)
##'
##' ## Double-mean centering / three-way interaction / all possible products
##' dat6 <- indProd(attitude[ , -1], var1 = 1:2, var2 = 3:4, var3 = 5:6,
##' match = FALSE, meanC = TRUE, residualC = TRUE,
##' doubleMC = TRUE)
##'
##'
##' ## To add product-indicators to multiple-imputed data sets
##' \donttest{
##' HSMiss <- HolzingerSwineford1939[ , c(paste0("x", 1:9), "ageyr","agemo")]
##' set.seed(12345)
##' HSMiss$x5 <- ifelse(HSMiss$x5 <= quantile(HSMiss$x5, .3), NA, HSMiss$x5)
##' age <- HSMiss$ageyr + HSMiss$agemo/12
##' HSMiss$x9 <- ifelse(age <= quantile(age, .3), NA, HSMiss$x9)
##' library(Amelia)
##' set.seed(12345)
##' HS.amelia <- amelia(HSMiss, m = 3, p2s = FALSE)
##' imps <- HS.amelia$imputations # extract a list of imputations
##' ## apply indProd() to the list of data.frames
##' imps2 <- lapply(imps, indProd,
##' var1 = c("x1","x2","x3"), var2 = c("x4","x5","x6"))
##' ## verify:
##' lapply(imps2, head)
##' }
##'
##' @export
indProd <- function(data, var1, var2, var3 = NULL, match = TRUE, meanC = TRUE,
residualC = FALSE, doubleMC = TRUE, namesProd = NULL) {
# Get all variable names
if (all(is.numeric(var1)))
var1 <- colnames(data)[var1]
if (all(is.numeric(var2)))
var2 <- colnames(data)[var2]
if (!is.null(var3) && all(is.numeric(var3))) var3 <- colnames(data)[var3]
dat1 <- data[, var1]
dat2 <- data[, var2]
dat3 <- NULL
if (!is.null(var3)) dat3 <- data[, var3]
# Mean centering on the original indicators
if (meanC) {
dat1 <- scale(dat1, scale = FALSE)
dat2 <- scale(dat2, scale = FALSE)
if (!is.null(dat3)) dat3 <- scale(dat3, scale = FALSE)
}
if (match) {
# Check whether the number of variables are equal across variable sets
if (length(var1) != length(var2))
stop("If the match-paired approach is used, the number of",
" variables in all sets must be equal.")
if (!is.null(var3) && (length(var1) != length(var3)))
stop("If the match-paired approach is used, the number of",
" variables in all three sets must be equal.")
datProd <- NULL
if (is.null(var3)) {
# Two-way interaction
datProd <- dat1 * dat2
if (residualC) {
notmissing <- which(!apply(datProd, 1, function(x) any(is.na(x))))
colnames(datProd) <- paste("interactionProduct", 1:ncol(datProd), sep = "")
# Write the expression for linear model and residualize the products
temp <- data.frame(datProd, dat1, dat2)
express <- paste("cbind(", paste(colnames(datProd), collapse = ", "),
") ~ ", paste(c(colnames(dat1), colnames(dat2)),
collapse = " + "), sep = "")
datProd[notmissing,] <- lm(express, data = temp)$residuals
}
} else {
# Three-way interaction
datProd2way <- cbind(dat1 * dat2, dat1 * dat3, dat2 * dat3)
datProd3way <- dat1 * dat2 * dat3
if (residualC) {
notmissing2way <- which(!apply(datProd2way, 1, function(x) any(is.na(x))))
colnames(datProd2way) <- paste("interaction2Product", 1:ncol(datProd2way), sep = "")
# Write the expression for linear model and residualize the two-way products
temp2 <- data.frame(datProd2way, dat1, dat2, dat3)
express2 <- paste("cbind(", paste(colnames(datProd2way), collapse = ", "),
") ~ ", paste(c(colnames(dat1), colnames(dat2),
colnames(dat3)), collapse = " + "), sep = "")
datProd2way[notmissing2way,] <- lm(express2, data = temp2)$residuals
# Making all possible products to residualize the 3-way interaction
datProd2wayFull <- matrix(0, nrow(data), 1)
for (i in 1:length(var1)) datProd2wayFull <- data.frame(datProd2wayFull, matrix(rep(dat1[, i], length(var2)), ncol = length(var2)) * dat2)
for (i in 1:length(var1)) datProd2wayFull <- data.frame(datProd2wayFull, matrix(rep(dat1[, i], length(var3)), ncol = length(var3)) * dat3)
for (i in 1:length(var2)) datProd2wayFull <- data.frame(datProd2wayFull, matrix(rep(dat2[, i], length(var3)), ncol = length(var3)) * dat3)
datProd2wayFull <- datProd2wayFull[, -1]
colnames(datProd2wayFull) <- paste("interaction2Product", 1:ncol(datProd2wayFull), sep = "")
notmissing3way <- which(!apply(datProd3way, 1, function(x) any(is.na(x))))
colnames(datProd3way) <- paste("interaction3Product", 1:ncol(datProd3way), sep = "")
# Write the expression for linear model and residualize the three-way products
temp3 <- data.frame(datProd3way, dat1, dat2, dat3, datProd2wayFull)
express3 <- paste("cbind(", paste(colnames(datProd3way), collapse = ", "),
") ~ ", paste(c(colnames(dat1), colnames(dat2), colnames(dat3),
colnames(datProd2wayFull)), collapse = " + "), sep = "")
datProd3way[notmissing3way,] <- lm(express3, data = temp3)$residuals
}
datProd <- cbind(datProd2way, datProd3way)
}
## Mean-centering the final product
if (doubleMC) datProd <- scale(datProd, scale = FALSE)
## Rename the obtained product terms
if (is.null(namesProd)) {
if (is.null(var3)) {
colnames(datProd) <- paste(var1, var2, sep = ".")
} else {
colnames(datProd) <- c(paste(var1, var2, sep = "."),
paste(var1, var3, sep = "."),
paste(var2, var3, sep = "."),
paste(var1, var2, var3, sep = "."))
}
} else {
colnames(datProd) <- namesProd
}
} else {
datProd <- NULL
if (is.null(var3)) {
# Create all possible combinations of the products of indicators
datProd <- matrix(0, nrow(data), 1)
for (i in 1:length(var1)) datProd <- data.frame(datProd, matrix(rep(dat1[, i], length(var2)), ncol = length(var2)) * dat2)
datProd <- datProd[, -1]
if (residualC) {
notmissing <- which(!apply(datProd, 1, function(x) any(is.na(x))))
colnames(datProd) <- paste("interactionProduct", 1:ncol(datProd), sep = "")
# Write the expression for linear model and residualize the two-way products
temp <- data.frame(datProd, dat1, dat2)
express <- paste("cbind(", paste(colnames(datProd), collapse = ", "),
") ~ ", paste(c(colnames(dat1), colnames(dat2)),
collapse = " + "), sep = "")
datProd[notmissing,] <- lm(express, data = temp)$residuals
}
} else {
# Create all possible combinations of the products of indicators
datProd2way <- matrix(0, nrow(data), 1)
for (i in 1:length(var1)) datProd2way <- data.frame(datProd2way, matrix(rep(dat1[, i], length(var2)), ncol = length(var2)) * dat2)
for (i in 1:length(var1)) datProd2way <- data.frame(datProd2way, matrix(rep(dat1[, i], length(var3)), ncol = length(var3)) * dat3)
for (i in 1:length(var2)) datProd2way <- data.frame(datProd2way, matrix(rep(dat2[, i], length(var3)), ncol = length(var3)) * dat3)
datProd3way <- matrix(0, nrow(data), 1)
for (i in 1:length(var1)) {
for(j in 1:length(var2)) {
datProd3way <- data.frame(datProd3way, matrix(rep(dat1[, i], length(var3)), ncol = length(var3)) * matrix(rep(dat2[, j], length(var3)), ncol = length(var3)) * dat3)
}
}
datProd2way <- datProd2way[, -1]
datProd3way <- datProd3way[, -1]
if (residualC) {
notmissing2way <- which(!apply(datProd2way, 1, function(x) any(is.na(x))))
colnames(datProd2way) <- paste("interaction2Product", 1:ncol(datProd2way), sep = "")
# Write the expression for linear model and residualize the two-way products
temp2 <- data.frame(datProd2way, dat1, dat2, dat3)
express2 <- paste("cbind(", paste(colnames(datProd2way), collapse = ", "),
") ~ ", paste(c(colnames(dat1), colnames(dat2),
colnames(dat3)), collapse = " + "), sep = "")
datProd2way[notmissing2way,] <- lm(express2, data = temp2)$residuals
notmissing3way <- which(!apply(datProd3way, 1, function(x) any(is.na(x))))
colnames(datProd3way) <- paste("interaction3Product", 1:ncol(datProd3way), sep = "")
# Write the expression for linear model and residualize the three-way products
temp3 <- data.frame(datProd3way, dat1, dat2, dat3, datProd2way)
express3 <- paste("cbind(", paste(colnames(datProd3way), collapse = ", "),
") ~ ", paste(c(colnames(dat1), colnames(dat2),
colnames(dat3), colnames(datProd2way)),
collapse = " + "), sep = "")
datProd3way[notmissing3way,] <- lm(express3, data = temp3)$residuals
}
datProd <- cbind(datProd2way, datProd3way)
}
## Double-mean centering
if (doubleMC) datProd <- scale(datProd, scale = FALSE)
## Name the resulting product terms
if (is.null(namesProd)) {
temp <- NULL
if (is.null(var3)) {
for (i in 1:length(var1)) temp <- c(temp, paste(var1[i], var2, sep = "."))
} else {
for (i in 1:length(var1)) temp <- c(temp, paste(var1[i], var2, sep = "."))
for (i in 1:length(var1)) temp <- c(temp, paste(var1[i], var3, sep = "."))
for (i in 1:length(var2)) temp <- c(temp, paste(var2[i], var3, sep = "."))
for (i in 1:length(var1)) {
for(j in 1:length(var2)) {
temp <- c(temp, paste(var1[i], var2[j], var3, sep = "."))
}
}
}
colnames(datProd) <- temp
} else {
colnames(datProd) <- namesProd
}
}
## Bind the products back to the original data
data.frame(data, datProd)
}
##' @rdname indProd
##' @export
orthogonalize <- function(data, var1, var2, var3 = NULL,
match = TRUE, namesProd = NULL) {
indProd(data = data, var1 = var1, var2 = var2, var3 = var3,
match = match, meanC = FALSE, residualC = TRUE, doubleMC = FALSE,
namesProd = namesProd)
}
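## A quick illustrative check (not part of the package API): because residual
## centering regresses each product on all main-effect indicators, the
## resulting product indicators are uncorrelated with those indicators.
## Using the match-paired two-way example from the documentation above:
## dat.rc <- orthogonalize(attitude[ , -1], var1 = 1:3, var2 = 4:6)
## round(cor(dat.rc[ , 7:9], dat.rc[ , 1:6]), 10)  # all (near) zero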
semTools/R/singleParamTest.R 0000644 0001762 0000144 00000023723 14632143377 015526 0 ustar ligges users ### Sunthud Pornprasertmanit
### Last updated: 2 June 2022
##' Single Parameter Test Divided from Nested Model Comparison
##'
##' In comparing two nested models, a \eqn{\Delta\chi^2} test may indicate that
##' two models are different. However, like other omnibus tests, it does
##' not reveal which fixed parameters or constraints make these two models
##' different. This function helps researchers identify the parameter(s)
##' responsible for the difference.
##'
##' This function first identifies the differences between these two models. The
##' model with more free parameters is referred to as the parent model, and the
##' model with fewer free parameters is referred to as the nested model. Two tests are
##' implemented here:
##'
##' \enumerate{
##' \item `free`: The nested model is used as a template. Then,
##' one parameter indicating the differences between two models is freed. The new
##' model is compared with the nested model. This process is repeated for all
##' differences between two models.
##' \item `fix`: The parent model is used
##' as a template. Then, one parameter indicating the differences between two
##' models is fixed or constrained to be equal to other parameters. The new
##' model is then compared with the parent model. This process is repeated for
##' all differences between two models.
##' \item `mi`: No longer available
##' because the test of modification indices is not consistent. For example, if
##' two parameters are equality constrained, the modification index from the
##' first parameter is not equal to the second parameter.
##' }
##'
##' Note that this function does not adjust for the inflated Type I error rate
##' from multiple tests.
##'
##' @param model1 Model 1.
##' @param model2 Model 2. Note that the two models must be nested, and their
##' parameters must appear in the same order in both parameter tables. Thus,
##' nested models with different scale identification may not be testable
##' by this function.
##' @param return.fit Return the submodels fitted by this function
##' @param method The method used to calculate likelihood ratio test. See
##' [lavaan::lavTestLRT()] for available options
##' @return If `return.fit = FALSE`, the result tables are provided.
##' \eqn{\chi^2} and *p* values are provided for all methods. Note that each
##' \eqn{\chi^2} statistic is based on 1 *df*. Expected parameter changes
##' and their standardized forms are also provided.
##'
##' If `return.fit = TRUE`, a list with two elements is provided. The
##' first element is the tabular result. The second element is the submodels
##' used in the `free` and `fix` methods.
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##' @examples
##'
##' library(lavaan)
##'
##' # Nested model comparison by hand
##' HS.model1 <- ' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6'
##' HS.model2 <- ' visual =~ a*x1 + a*x2 + a*x3
##' textual =~ b*x4 + b*x5 + b*x6'
##'
##' m1 <- cfa(HS.model1, data = HolzingerSwineford1939, std.lv = TRUE,
##' estimator = "MLR")
##' m2 <- cfa(HS.model2, data = HolzingerSwineford1939, std.lv = TRUE,
##' estimator = "MLR")
##' anova(m1, m2)
##' singleParamTest(m1, m2)
##'
##' ## Nested model comparison from the measurementInvariance function
##' HW.model <- ' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + x8 + x9 '
##'
##' models <- measurementInvariance(model = HW.model, data = HolzingerSwineford1939,
##' group = "school")
##' singleParamTest(models[[1]], models[[2]])
##'
##' ## Note that the comparison between metric (Model 2) and scalar invariance
##' ## (Model 3) cannot be done by this function because the metric invariance
##' ## model fixes factor means as 0 in Group 2 but the strong invariance model
##' ## frees the factor means in Group 2. Users may use this function to compare
##' ## scalar invariance (Model 3) to a homogeneous-means model.
##'
##' @export
singleParamTest <- function(model1, model2, return.fit = FALSE,
method = "satorra.bentler.2001") {
# Check nested models without any swaps
if(lavaan::fitMeasures(model1, "df")[[1]] > lavaan::fitMeasures(model2, "df")[[1]]) {
fit0 <- model1
fit1 <- model2
} else {
fit0 <- model2
fit1 <- model1
}
# fit0 = Nested model, fit1 = Parent model
pt1 <- parTable(fit1)
pt0 <- parTable(fit0)
namept1 <- paramNameFromPt(pt1)
namept0 <- paramNameFromPt(pt0)
# Two possible constraints: fixed parameters and equality constraints
free1 <- (pt1$free != 0) & !(duplicated(pt1$free))
free0 <- (pt0$free != 0) & !(duplicated(pt0$free))
iscon1 <- pt1$op == "=="
iscon0 <- pt0$op == "=="
con1 <- list(id = integer(0), lhs = character(0), op = character(0), rhs = character(0))
con0 <- list(id = integer(0), lhs = character(0), op = character(0), rhs = character(0))
if(any(iscon1)) con1 <- list(id = pt1$id[iscon1], lhs = pt1$lhs[iscon1], op = pt1$op[iscon1], rhs = pt1$rhs[iscon1])
if(any(iscon0)) con0 <- list(id = pt0$id[iscon0], lhs = pt0$lhs[iscon0], op = pt0$op[iscon0], rhs = pt0$rhs[iscon0])
if(length(free1[!iscon1]) != length(free0[!iscon0])) stop("Parameter tables in the two models do not have equal lengths, so this function cannot be used.")
if(!all(free1[free0])) stop("Models are not nested, or parameters are not arranged in the order this function requires.")
if(sum(iscon1) > sum(iscon0)) stop("There are equality constraints in the model with fewer degrees of freedom that do not exist in the model with more degrees of freedom. Thus, the two models are not nested.")
if(!isTRUE(all.equal(lapply(pt1[2:4], "[", !iscon1), lapply(pt0[2:4], "[", !iscon0)))) stop("This function requires the parameter tables of both models to contain the same parameters in the same order.")
# Find fixed values or constraints
difffree <- !free0[!iscon0] & free1[!iscon1]
textcon1 <- paste0(con1$lhs, con1$op, con1$rhs)
textcon0 <- paste0(con0$lhs, con0$op, con0$rhs)
indexsamecon <- match(textcon1, textcon0)
indexdiffcon <- setdiff(seq_along(textcon0), indexsamecon)
diffcon <- lapply(con0, "[", indexdiffcon)
fixval <- which(difffree)
index <- c(fixval, diffcon$id)
if(length(index) <= 0) stop("The two models are identical; no single-parameter test can be done.")
# Find nested model and release 1-by-1
freeCon <- matrix(NA, length(index), 2)
colnames(freeCon) <- c("free.chi", "free.p")
listFreeCon <- list()
runnum <- 1
for(i in seq_along(fixval)) {
temp <- freeParTable(pt0, pt0$lhs[fixval[i]], pt0$op[fixval[i]], pt0$rhs[fixval[i]], pt0$group[fixval[i]])
tryresult <- try(tempfit <- refit(temp, fit0), silent = TRUE)
if(!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, fit0, method = method), silent = TRUE)
if(!is(compresult, "try-error")) freeCon[runnum,] <- unlist(modelcomp[2, c(5, 7)])
}
listFreeCon <- c(listFreeCon, tryresult)
runnum <- runnum + 1
}
rownames(freeCon)[seq_along(fixval)] <- names(listFreeCon)[seq_along(fixval)] <- namept0[fixval]
for(i in seq_along(diffcon$id)) {
temp <- removeEqCon(pt0, diffcon$id[i])
tryresult <- try(tempfit <- refit(temp, fit0), silent = TRUE)
if(!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, fit0, method = method), silent = TRUE)
if(!is(compresult, "try-error")) freeCon[runnum,] <- unlist(modelcomp[2, c(5, 7)])
}
listFreeCon <- c(listFreeCon, tryresult)
runnum <- runnum + 1
}
poscon <- seq_along(diffcon$id) + length(fixval)
rownames(freeCon)[poscon] <- names(listFreeCon)[poscon] <- namept0[diffcon$id]
# Find parent model and constrain 1-by-1
fixCon <- matrix(NA, length(index), 2)
colnames(fixCon) <- c("fix.chi", "fix.p")
listFixCon <- list()
runnum <- 1
for(i in seq_along(fixval)) {
temp <- fixParTable(pt1, pt1$lhs[fixval[i]], pt1$op[fixval[i]], pt1$rhs[fixval[i]], pt1$group[fixval[i]], pt0$ustart[fixval[i]])
tryresult <- try(tempfit <- refit(temp, fit1), silent = TRUE)
if(!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, fit1, method = method), silent = TRUE)
if(!is(compresult, "try-error")) fixCon[runnum,] <- unlist(modelcomp[2,c(5, 7)])
}
listFixCon <- c(listFixCon, tryresult)
runnum <- runnum + 1
}
rownames(fixCon)[seq_along(fixval)] <- names(listFixCon)[seq_along(fixval)] <- namept0[fixval]
for(i in seq_along(diffcon$id)) {
temp <- patMerge(pt1, list(lhs = diffcon$lhs[i], op = diffcon$op[i], rhs = diffcon$rhs[i]))
tryresult <- try(tempfit <- refit(temp, fit1), silent = TRUE)
if(!is(tryresult, "try-error")) {
compresult <- try(modelcomp <- lavaan::lavTestLRT(tempfit, fit1, method = method), silent = TRUE)
if(!is(compresult, "try-error")) fixCon[runnum,] <- unlist(modelcomp[2, c(5, 7)])
}
listFixCon <- c(listFixCon, tryresult)
runnum <- runnum + 1
}
poscon <- seq_along(diffcon$id) + length(fixval)
rownames(fixCon)[poscon] <- names(listFixCon)[poscon] <- namept0[diffcon$id]
result <- cbind(freeCon, fixCon)
if(return.fit) {
return(invisible(list(result = result, models = list(free = listFreeCon, fix = listFixCon))))
} else {
return(result)
}
}
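## A sketch of what the "free" method does by hand, continuing the example in
## the documentation above (m2 constrains loadings equal within each factor):
## releasing one constrained loading (here x2) yields a 1-df comparison.
## HS.free1 <- ' visual =~ a*x1 + x2 + a*x3
##               textual =~ b*x4 + b*x5 + b*x6'
## m.free1 <- cfa(HS.free1, data = HolzingerSwineford1939, std.lv = TRUE,
##                estimator = "MLR")
## lavTestLRT(m.free1, m2, method = "satorra.bentler.2001")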
## ----------------
## Hidden Functions
## ----------------
paramNameFromPt <- function(pt) {
ngroups <- max(pt$group)
result <- NULL
if (ngroups == 1) {
result <- paste0(pt$lhs, pt$op, pt$rhs)
} else {
grouplab <- paste0(".g", pt$group)
grouplab[grouplab == ".g0" | grouplab == ".g1"] <- ""
result <- paste0(pt$lhs, pt$op, pt$rhs, grouplab)
}
con <- pt$op == "=="
pt$lhs[con] <- result[match(pt$lhs[con], pt$plabel)]
pt$rhs[con] <- result[match(pt$rhs[con], pt$plabel)]
result[con] <- paste(pt$lhs[con], pt$op[con], pt$rhs[con])
result
}
##' @importFrom lavaan lavInspect
refit <- function(pt, object, resetstart = TRUE) {
if (resetstart && "start" %in% names(pt)) pt <- pt[-which("start" == names(pt))]
previousCall <- lavInspect(object, "call")
## Why this?
args <- previousCall[-1]
args$model <- pt
funcall <- as.character(previousCall[[1]])
do.call(funcall[length(funcall)], args)
## instead of this?
# previousCall$model <- pt
# eval(previousCall)
}
semTools/R/dataDiagnosis.R 0000644 0001762 0000144 00000024464 14677254477 015216 0 ustar ligges users ### Sunthud Pornprasertmanit & Terrence D. Jorgensen
### Last updated: 2 October 2024
### Higher-order moments. Initial version from the simsem package.
##' Finding skewness
##'
##' Finding skewness (\eqn{g_{1}}) of an object
##'
##' The skewness computed by default is \eqn{g_{1}}, the third standardized
##' moment of the empirical distribution of `object`.
##' The population parameter skewness \eqn{\gamma_{1}} formula is
##'
##' \deqn{\gamma_{1} = \frac{\mu_{3}}{\mu^{3/2}_{2}},}
##'
##' where \eqn{\mu_{i}} denotes the \eqn{i}th-order central moment.
##'
##' The skewness formula for sample statistic \eqn{g_{1}} is
##'
##' \deqn{g_{1} = \frac{k_{3}}{k^{3/2}_{2}},}
##'
##' where \eqn{k_{i}} is the \eqn{i}th-order *k*-statistic.
##'
##' The variance of the skewness is
##'
##' \deqn{Var(\hat{g}_{1}) = \frac{6}{N},}
##'
##' where \eqn{N} is the sample size, so the standard error is \eqn{\sqrt{6/N}}.
##'
##'
##' @importFrom stats pnorm
##'
##' @param object A vector from which to compute the skewness
##' @param population `TRUE` to compute the parameter formula. `FALSE`
##' to compute the sample statistic formula.
##' @return The skewness value, with a test statistic if `population` is
##' specified as `FALSE`
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##' @seealso \itemize{
##' \item [kurtosis()] Find the univariate excessive kurtosis
##' of a variable
##' \item [mardiaSkew()] Find Mardia's multivariate skewness
##' of a set of variables
##' \item [mardiaKurtosis()] Find Mardia's multivariate
##' kurtosis of a set of variables
##' }
##' @references Weisstein, Eric W. (n.d.). *Skewness*. Retrieved from
##' *MathWorld*--A Wolfram Web Resource:
##'
##' @examples
##'
##' skew(1:5)
##'
##' @export
skew <- function(object, population = FALSE) {
if(any(is.na(object))) {
object <- object[!is.na(object)]
warning("Missing observations are removed from a vector.")
}
if(population) {
out <- centralMoment(object, 3) / (centralMoment(object, 2)^(3/2))
} else {
est <- kStat(object, 3) / (kStat(object, 2)^(3/2))
se <- sqrt(6/length(object))
z <- est/se
p <- (1 - pnorm(abs(z)))*2
out <- c("skew (g1)"=est, se=se, z=z, p=p)
}
class(out) <- c("lavaan.vector", "numeric")
out
}
##' Finding excessive kurtosis
##'
##' Finding excessive kurtosis (\eqn{g_{2}}) of an object
##'
##' The excessive kurtosis computed by default is \eqn{g_{2}}, the fourth
##' standardized moment of the empirical distribution of `object`.
##' The population parameter excessive kurtosis \eqn{\gamma_{2}} formula is
##'
##' \deqn{\gamma_{2} = \frac{\mu_{4}}{\mu^{2}_{2}} - 3,}
##'
##' where \eqn{\mu_{i}} denotes the \eqn{i}th-order central moment.
##'
##' The excessive kurtosis formula for sample statistic \eqn{g_{2}} is
##'
##' \deqn{g_{2} = \frac{k_{4}}{k^{2}_{2}},}
##'
##' where \eqn{k_{i}} is the \eqn{i}th-order *k*-statistic; because
##' *k*-statistics are unbiased estimators of cumulants, no \eqn{-3}
##' adjustment is needed here.
##'
##' The variance of the excessive kurtosis is
##'
##' \deqn{Var(\hat{g}_{2}) = \frac{24}{N},}
##'
##' where \eqn{N} is the sample size, so the standard error is \eqn{\sqrt{24/N}}.
##'
##'
##' @importFrom stats pnorm
##'
##' @param object A vector from which to compute the excessive kurtosis
##' @param population `TRUE` to compute the parameter formula. `FALSE`
##' to compute the sample statistic formula.
##' @return The excessive kurtosis value, with a test statistic if
##' `population` is specified as `FALSE`
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##' @seealso \itemize{
##' \item [skew()] Find the univariate skewness of a variable
##' \item [mardiaSkew()] Find Mardia's multivariate
##' skewness of a set of variables
##' \item [mardiaKurtosis()] Find Mardia's multivariate kurtosis
##' of a set of variables
##' }
##' @references Weisstein, Eric W. (n.d.). *Kurtosis.* Retrieved from
##' *MathWorld*--A Wolfram Web Resource:
##'
##'
##' @examples
##'
##' kurtosis(1:5)
##'
##' @export
kurtosis <- function(object, population = FALSE) {
if(any(is.na(object))) {
object <- object[!is.na(object)]
warning("Missing observations are removed from a vector.")
}
if(population) {
out <- (centralMoment(object, 4) / (centralMoment(object, 2)^2)) - 3
} else {
est <- kStat(object, 4) / (kStat(object, 2)^(2))
se <- sqrt(24/length(object))
z <- est/se
p <- (1 - pnorm(abs(z)))*2
out <- c("Excess Kur (g2)"=est, se=se, z=z, p=p)
}
class(out) <- c("lavaan.vector", "numeric")
out
}
##' Finding Mardia's multivariate skewness
##'
##' Finding Mardia's multivariate skewness of multiple variables
##'
##' Mardia's multivariate skewness formula (Mardia, 1970) is
##' \deqn{ b_{1, d} = \frac{1}{n^2}\sum^n_{i=1}\sum^n_{j=1}\left[
##' \left(\bold{X}_i - \bold{\bar{X}} \right)^{'} \bold{S}^{-1}
##' \left(\bold{X}_j - \bold{\bar{X}} \right) \right]^3, }
##' where \eqn{d} is the number of variables, \eqn{X} is the target dataset
##' with multiple variables, \eqn{n} is the sample size, \eqn{\bold{S}} is
##' the sample covariance matrix of the target dataset, and \eqn{\bold{\bar{X}}}
##' is the mean vector of the target dataset, repeated in each of the \eqn{n} rows.
##' When the population distribution is multivariate normal,
##' \eqn{\frac{n}{6}b_{1,d}} is asymptotically \eqn{\chi^2}-distributed
##' with \eqn{d(d + 1)(d + 2)/6} degrees of freedom.
##'
##'
##' @importFrom stats cov pchisq
##'
##' @param dat The target matrix or data frame with multiple variables
##' @param use Missing data handling method from the [stats::cov()]
##' function.
##' @return The value of Mardia's multivariate skewness, with a test statistic
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##' @seealso \itemize{
##' \item [skew()] Find the univariate skewness of a variable
##' \item [kurtosis()] Find the univariate excessive
##' kurtosis of a variable
##' \item [mardiaKurtosis()] Find Mardia's multivariate
##' kurtosis of a set of variables
##' }
##' @references Mardia, K. V. (1970). Measures of multivariate skewness and
##' kurtosis with applications. *Biometrika, 57*(3), 519--530.
##' \doi{10.2307/2334770}
##' @examples
##'
##' library(lavaan)
##' mardiaSkew(HolzingerSwineford1939[ , paste0("x", 1:9)])
##'
##' @export
mardiaSkew <- function(dat, use = "everything") {
centeredDat <- scale(dat, center=TRUE, scale=FALSE)
invS <- solve(cov(dat, use = use))
FUN <- function(vec1, vec2, invS) {
as.numeric(t(as.matrix(vec1)) %*% invS %*% as.matrix(vec2))
}
FUN2 <- function(vec1, listVec2, invS) {
sapply(listVec2, FUN, vec1=vec1, invS=invS)
}
indivTerm <- sapply(as.list(data.frame(t(centeredDat))), FUN2,
listVec2=as.list(data.frame(t(centeredDat))), invS=invS)
b1d <- sum(indivTerm^3, na.rm = TRUE) / (nrow(dat)^2)
d <- ncol(dat)
chi <- nrow(dat) * b1d / 6
df <- d * (d + 1) * (d + 2) / 6
p <- pchisq(chi, df = df, lower.tail = FALSE)
out <- c(b1d = b1d, chi = chi, df=df, p=p)
class(out) <- c("lavaan.vector", "numeric")
return(out)
}
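## An equivalent vectorized computation for complete data (illustration only):
## with centered data Z and A = Z %*% solve(cov(X)) %*% t(Z), Mardia's
## b_{1,d} = sum(A^3) / n^2.
## X <- HolzingerSwineford1939[ , paste0("x", 1:9)]
## Z <- scale(X, scale = FALSE)
## A <- Z %*% solve(cov(X)) %*% t(Z)
## all.equal(sum(A^3) / nrow(X)^2, unname(mardiaSkew(X)["b1d"]))  # TRUE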
##' Finding Mardia's multivariate kurtosis
##'
##' Finding Mardia's multivariate kurtosis of multiple variables
##'
##' Mardia's multivariate kurtosis formula (Mardia, 1970) is
##' \deqn{ b_{2, d} = \frac{1}{n}\sum^n_{i=1}\left[ \left(\bold{X}_i -
##' \bold{\bar{X}} \right)^{'} \bold{S}^{-1} \left(\bold{X}_i -
##' \bold{\bar{X}} \right) \right]^2, }
##' where \eqn{d} is the number of variables, \eqn{X} is the target
##' dataset with multiple variables, \eqn{n} is the sample size, \eqn{\bold{S}}
##' is the sample covariance matrix of the target dataset, and
##' \eqn{\bold{\bar{X}}} is the mean vector of the target dataset, repeated in
##' each of the \eqn{n} rows. When the population distribution is multivariate
##' normal, \eqn{b_{2,d}} is asymptotically normally distributed with
##' mean \eqn{d(d + 2)} and variance \eqn{8d(d + 2)/n}.
##'
##'
##' @importFrom stats cov pnorm
##'
##' @param dat The target matrix or data frame with multiple variables
##' @param use Missing data handling method from the [stats::cov()]
##' function.
##' @return The value of Mardia's multivariate kurtosis, with a test statistic
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##' @seealso \itemize{
##' \item [skew()] Find the univariate skewness of a variable
##' \item [kurtosis()] Find the univariate excessive kurtosis
##' of a variable
##' \item [mardiaSkew()] Find Mardia's multivariate skewness
##' of a set of variables
##' }
##' @references Mardia, K. V. (1970). Measures of multivariate skewness and
##' kurtosis with applications. *Biometrika, 57*(3), 519--530.
##' \doi{10.2307/2334770}
##' @examples
##'
##' library(lavaan)
##' mardiaKurtosis(HolzingerSwineford1939[ , paste0("x", 1:9)])
##'
##' @export
mardiaKurtosis <- function(dat, use = "everything") {
centeredDat <- scale(dat, center=TRUE, scale=FALSE)
invS <- solve(cov(dat, use = use))
FUN <- function(vec, invS) {
as.numeric(t(as.matrix(vec)) %*% invS %*% as.matrix(vec))
}
indivTerm <- sapply(as.list(data.frame(t(centeredDat))), FUN, invS=invS)
b2d <- sum(indivTerm^2, na.rm = TRUE) / nrow(dat)
d <- ncol(dat)
m <- d * (d + 2)
v <- 8 * d * (d + 2) / nrow(dat)
z <- (b2d - m)/sqrt(v)
p <- pnorm(-abs(z)) * 2
out <- c(b2d = b2d, z = z, p=p)
class(out) <- c("lavaan.vector", "numeric")
return(out)
}
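## Similarly for complete data (illustration only): b_{2,d} is the mean of the
## squared diagonal of A = Z %*% solve(cov(X)) %*% t(Z), i.e.,
## mean(diag(A)^2) == unname(mardiaKurtosis(X)["b2d"])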
## ----------------
## Hidden Functions
## ----------------
## centralMoment
## Calculate central moments of a variable
## Arguments:
## x: vector of a variable
## ord: order of the moment
centralMoment <- function(x, ord) {
if(ord < 2) stop("Central moments can only be calculated for integer orders of 2 or more.")
wm <- mean(x)
result <- sum((x - wm)^(ord))/length(x)
return(result)
}
## Example
## centralMoment(1:5, 2)
## kStat
## Calculate the k-statistic (i.e., unbiased estimator of a cumulant) of a variable
## Arguments:
## x: vector of a variable
## ord: order of the k-statistics
kStat <- function(x, ord) {
# Formulas from MathWorld--A Wolfram Web Resource
n <- length(x)
if(ord == 1) {
return(mean(x))
} else if (ord == 2) {
return(centralMoment(x, 2) * n / (n - 1))
} else if (ord == 3) {
return(centralMoment(x, 3) * n^2 / ((n - 1) * (n - 2)))
} else if (ord == 4) {
num1 <- n^2
num2 <- (n + 1) * centralMoment(x, 4)
num3 <- 3 * (n - 1) * centralMoment(x, 2)^2
denom <- (n - 1) * (n - 2) * (n - 3)
return((num1 * (num2 - num3))/denom)
} else {
stop("Order can be 1, 2, 3, or 4 only.")
}
}
## Example
## kStat(1:5, 4)
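## These k-statistics are the building blocks of skew() and kurtosis() above,
## e.g. (illustration only):
## x <- rnorm(100)
## all.equal(unname(skew(x)[1]), kStat(x, 3) / kStat(x, 2)^(3/2))
## all.equal(unname(kurtosis(x)[1]), kStat(x, 4) / kStat(x, 2)^2)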
semTools/R/runMI.R 0000644 0001762 0000144 00000047613 14764332732 013463 0 ustar ligges users ### Terrence D. Jorgensen
### Last updated: 12 March 2025
### runMI creates OLDlavaan.mi object, inherits from lavaanList class
### DEPRECATED: 16 June 2024
### supplanted by lavaan.mi package
## -------------
## Main function
## -------------
##' Fit a lavaan Model to Multiple Imputed Data Sets
##'
##' This function fits a lavaan model to a list of imputed data sets, and can
##' also implement multiple imputation for a single `data.frame` with
##' missing observations, using either the Amelia package or the mice package.
##'
##'
##' @aliases runMI-deprecated lavaan.mi-deprecated cfa.mi-deprecated sem.mi-deprecated growth.mi-deprecated
##' @importFrom lavaan lavInspect parTable
##' @importFrom methods getMethod
##'
##' @param model The analysis model can be specified using
##' [lavaan::model.syntax()] or a [lavaan::parTable()]
##' @param data A `data.frame` with missing observations, or a `list`
##' of imputed data sets (if data are imputed already). If `runMI()` has
##' already been called, then imputed data sets are stored in the
##' `@@DataList` slot, so `data=` can also be an `OLDlavaan.mi` object
##' from which the same imputed data will be used for additional analyses.
##' @param fun `character`. Name of a specific lavaan function used to fit
##' `model=` to `data=` (i.e., `"lavaan"`, `"cfa"`, `"sem"`, or `"growth"`).
##' Only required for `runMI()`.
##' @param \dots additional arguments to pass to [lavaan::lavaan()] or
##' [lavaan::lavaanList()]. See also [lavaan::lavOptions()].
##' Note that `lavaanList` provides parallel computing options, as well as
##' a `FUN` argument so the user can extract custom output after the model
##' is fitted to each imputed data set (see **Examples**). TIP: If a
##' custom `FUN` is used *and* `parallel = "snow"` is requested,
##' the user-supplied function should explicitly call `library` or use
##' \code{\link[base]{::}} for any functions not part of the base distribution.
##' @param m `integer`. Request the number of imputations. Ignored if `data=` is
##' already a `list` of imputed data sets or an `OLDlavaan.mi` object.
##' @param miArgs Additional arguments for the multiple-imputation function
##' (`miPackage`). The arguments should be put in a list (see example
##' below). Ignored if `data=` is already a `list` of imputed data
##' sets or an `OLDlavaan.mi` object.
##' @param miPackage Package to be used for imputation. Currently these
##' functions only support `"Amelia"` or `"mice"` for imputation.
##' Ignored if `data` is already a `list` of imputed data sets or an
##' `OLDlavaan.mi` object.
##' @param seed `integer`. Random number seed to be set before imputing the
##' data. Ignored if `data` is already a `list` of imputed data sets
##' or an `OLDlavaan.mi` object.
##'
##' @return A [semTools::OLDlavaan.mi-class] object
##'
##' @author
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @references
##' Enders, C. K. (2010). *Applied missing data analysis*. New
##' York, NY: Guilford.
##'
##' Rubin, D. B. (1987). *Multiple imputation for nonresponse in surveys*.
##' New York, NY: Wiley.
##'
##' @examples
##'
##' ## See the new lavaan.mi package
##'
##' @name runMI-deprecated
##' @usage
##' runMI(model, data, fun = "lavaan", ...,
##' m, miArgs = list(), miPackage = "Amelia", seed = 12345)
##' @seealso [semTools-deprecated()]
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##' @section `runMI()` and the `lavaan.mi` class functionality:
##' The `runMI()` function and support for `lavaan.mi-class` objects became
##' such a large part of semTools that it made sense to move that functionality
##' to its own package. The \pkg{lavaan.mi} package is now available for users
##' to fit `lavaan` models to their multiply imputed data. The new package
##' already fixes many bugs and provides many new features that make the
##' semTools `OLDlavaan.mi-class` obsolete. Please immediately discontinue
##' your dependence on `semTools::runMI()` and transition to the new
##' \pkg{lavaan.mi} package, which also provides a more similar user interface
##' as the \pkg{lavaan} package provides for a single `data=` set. The README
##' on provides a list of analogous
##' functionality in \pkg{lavaan} and \pkg{lavaan.mi}, and the NEWS file
##' documents new features and other differences from the deprecated
##' `semTools::runMI()` functionality.
##'
##' @export
runMI <- function(model, data, fun = "lavaan", ...,
m, miArgs = list(), miPackage = "Amelia", seed = 12345) {
.Deprecated(msg = c("\nThe runMI() function and lavaan.mi-class have been ",
"deprecated and will cease to be included in future ",
"versions of semTools.\n\nSupport is still provided for ",
"analyzing lavaan.mi-class objects (e.g., compRelSEM() ",
"can estimate reliability using pooled results), which ",
"can now be created using the lavaan.mi package.\n\nThe ",
"deprecated runMI() function now creates an object of ",
"class OLDlavaan.mi, which can be analyzed using the ",
"deprecated functions in semTools, like lavTestLRT.mi(),",
" that have been updated and improved in the lavaan.mi ",
"package.\n\nFind more details help('semTools-deprecated)"))
CALL <- match.call()
dots <- list(...)
## check for (Bollen-Stine) bootstrap request
if (all(!is.null(dots$test),
tolower(dots$test) %in% c("boot","bootstrap","bollen.stine")) ||
all(!is.null(dots$se), tolower(dots$se) %in% c("boot","bootstrap"))) {
stop('Bootstrapping is unavailable (and not recommended) in combination with ',
'multiple imputations. For robust confidence intervals of indirect',
' effects, see the ?semTools::monteCarloCI help page. To bootstrap ',
'within each imputation, users can pass a custom function to the ',
'FUN= argument (see ?lavaanList) to save bootstrap distributions in ',
'the @funList slot, then manually combine afterward.')
}
seed <- as.integer(seed[1])
## Create (or acknowledge) list of imputed data sets
imputedData <- NULL
if (missing(data)) {
#TODO: check for summary statistics
#TODO: make lavaanList() accept lists of summary stats
#TODO: Add argument to implement Li Cai's pool-polychorics first, pass
# to lavaan for DWLS with pooled WLS.V= and NACOV=, return(lavaan).
} else if (is.data.frame(data)) {
if (miPackage[1] == "Amelia") {
requireNamespace("Amelia")
if (!"package:Amelia" %in% search()) attachNamespace("Amelia")
imputeCall <- c(list(Amelia::amelia, x = data, m = m, p2s = 0), miArgs)
set.seed(seed)
imputedData <- unclass(eval(as.call(imputeCall))$imputations)
} else if (miPackage[1] == "mice") {
requireNamespace("mice")
if (!"package:mice" %in% search()) attachNamespace("mice")
imputeCall <- c(list(mice::mice, data = data, m = m, diagnostics = FALSE,
printFlag = FALSE), miArgs)
set.seed(seed)
miceOut <- eval(as.call(imputeCall))
imputedData <- list()
for (i in 1:m) {
imputedData[[i]] <- mice::complete(data = miceOut, action = i, include = FALSE)
}
} else stop("Currently runMI only supports imputation by Amelia or mice")
} else if (is.list(data)) {
## check possibility that it is a mids object (inherits from list)
if (requireNamespace("mice", quietly = TRUE)) {
if (mice::is.mids(data)) {
m <- data$m
imputedData <- list()
for (i in 1:m) {
imputedData[[i]] <- mice::complete(data, action = i, include = FALSE)
}
imputeCall <- list()
} else {
seed <- integer(length = 0)
imputeCall <- list()
imputedData <- data
m <- length(data)
class(imputedData) <- "list" # override inheritance (e.g., "mi" if Amelia)
}
} else {
## can't check for mice, so probably isn't mids
seed <- integer(length = 0)
imputeCall <- list()
imputedData <- data
m <- length(data)
class(imputedData) <- "list" # override inheritance (e.g., "mi" if Amelia)
}
} else if (is(data, "OLDlavaan.mi")) {
seed <- data@seed
imputeCall <- data@imputeCall
imputedData <- data@DataList
m <- length(imputedData)
} else stop("data is not a valid input type: a partially observed data.frame,",
" a list of imputed data.frames, or previous OLDlavaan.mi object")
## Function to get custom output for OLDlavaan.mi object
## NOTE: Need "lavaan::" to allow for parallel computations
.getOutput. <- function(obj) {
converged <- lavaan::lavInspect(obj, "converged")
if (converged) {
se <- lavaan::parTable(obj)$se
se.test <- all(!is.na(se)) & all(se >= 0) & any(se != 0)
if (lavaan::lavInspect(obj, "ngroups") == 1L && lavaan::lavInspect(obj, "nlevels") == 1L) {
Heywood.lv <- det(lavaan::lavInspect(obj, "cov.lv")) <= 0
Heywood.ov <- det(lavaan::lavInspect(obj, "theta")) <= 0
} else {
Heywood.lv <- !all(sapply(lavaan::lavInspect(obj, "cov.lv"), det) > 0)
Heywood.ov <- !all(sapply(lavaan::lavInspect(obj, "theta"), det) > 0)
}
} else {
se.test <- Heywood.lv <- Heywood.ov <- NA
}
list(sampstat = lavaan::lavInspect(obj, "sampstat"),
coefMats = lavaan::lavInspect(obj, "est"),
satPT = data.frame(lavaan::lav_partable_unrestricted(obj),
#FIXME: do starting values ALWAYS == estimates?
stringsAsFactors = FALSE),
modindices = try(lavaan::modindices(obj), silent = TRUE),
cov.lv = lavaan::lavInspect(obj, "cov.lv"), #TODO: calculate from pooled estimates for reliability()
converged = converged, SE = se.test,
Heywood.lv = Heywood.lv, Heywood.ov = Heywood.ov)
}
## fit model using lavaanList
lavListCall <- list(lavaan::lavaanList, model = model, dataList = imputedData,
cmd = fun)
lavListCall <- c(lavListCall, dots)
lavListCall$store.slots <- c("partable","vcov","test","h1","baseline")
lavListCall$FUN <- if (is.null(dots$FUN)) .getOutput. else function(obj) {
temp1 <- .getOutput.(obj)
temp2 <- dots$FUN(obj)
if (!is.list(temp2)) temp2 <- list(userFUN1 = temp2)
if (is.null(names(temp2))) names(temp2) <- paste0("userFUN", 1:length(temp2))
duplicatedNames <- which(sapply(names(temp2), function(x) {
x %in% c("sampstat","coefMats","satPT","modindices","converged",
"SE","Heywood.lv","Heywood.ov","cov.lv")
}))
for (i in duplicatedNames) names(temp2)[i] <- paste0("userFUN", i)
c(temp1, temp2)
}
fit <- eval(as.call(lavListCall))
## Store custom @DataList and @SampleStatsList
fit@SampleStatsList <- lapply(fit@funList, "[[", i = "sampstat")
fit@DataList <- imputedData
## add parameter table to @h1List
for (i in 1:m) fit@h1List[[i]] <- c(fit@h1List[[i]],
list(PT = fit@funList[[i]]$satPT))
## assign class and add new slots
fit <- as(fit, "OLDlavaan.mi")
fit@coefList <- lapply(fit@funList, "[[", i = "coefMats")
fit@miList <- lapply(fit@funList, "[[", i = "modindices")
fit@phiList <- lapply(fit@funList, "[[", i = "cov.lv")
fit@seed <- seed
fit@call <- CALL
fit@lavListCall <- lavListCall
fit@imputeCall <- imputeCall
convList <- lapply(fit@funList, "[", i = c("converged","SE",
"Heywood.lv","Heywood.ov"))
nonConv <- which(sapply(convList, is.null))
if (length(nonConv)) for (i in nonConv) {
convList[[i]] <- list(converged = FALSE, SE = NA, Heywood.lv = NA, Heywood.ov = NA)
}
fit@convergence <- lapply(convList, function(x) do.call(c, x))
conv <- which(sapply(fit@convergence, "[", i = "converged"))
if (!length(conv)) warning('The model did not converge for any imputed data sets.')
## keep any remaining funList slots (if allowing users to supply custom FUN)
funNames <- names(fit@funList[[1]])
keepIndex <- which(!sapply(funNames, function(x) {
x %in% c("sampstat","coefMats","satPT","modindices","converged",
"SE","Heywood.lv","Heywood.ov","cov.lv")
}))
if (length(keepIndex)) {
fit@funList <- lapply(fit@funList, "[", i = keepIndex)
if (length(keepIndex) > 1L) {
keepNames <- funNames[keepIndex]
noNames <- which(keepNames == "")
for (i in seq_along(noNames)) keepNames[ noNames[i] ] <- paste0("userFUN", i)
fit@funList <- lapply(fit@funList, "names<-", value = keepNames)
}
} else fit@funList <- list()
NewStartVals <- try(getMethod("coef", "OLDlavaan.mi")(fit, type = "user",
labels = FALSE),
silent = TRUE)
if (!inherits(NewStartVals, "try-error")) fit@ParTable$start <- NewStartVals
fit
}
##' @name runMI-deprecated
##' @usage
##' lavaan.mi(model, data, ...,
##' m, miArgs = list(), miPackage = "Amelia", seed = 12345)
##' @seealso [semTools-deprecated()]
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##' @export
lavaan.mi <- function(model, data, ...,
m, miArgs = list(), miPackage = "Amelia", seed = 12345) {
# runMI(model = model, data = data, fun = "lavaan", ...,
# m = m, miArgs = miArgs, miPackage = miPackage, seed = seed)
mc <- match.call(expand.dots = TRUE)
mc$fun <- "lavaan"
mc[[1L]] <- quote(semTools::runMI)
eval(mc, parent.frame())
}
##' @name runMI-deprecated
##' @usage
##' cfa.mi(model, data, ...,
##' m, miArgs = list(), miPackage = "Amelia", seed = 12345)
##' @seealso [semTools-deprecated()]
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##' @export
cfa.mi <- function(model, data, ...,
m, miArgs = list(), miPackage = "Amelia", seed = 12345) {
# runMI(model = model, data = data, fun = "cfa", ...,
# m = m, miArgs = miArgs, miPackage = miPackage, seed = seed)
mc <- match.call(expand.dots = TRUE)
mc$fun <- "cfa"
mc[[1L]] <- quote(semTools::runMI)
eval(mc, parent.frame())
}
##' @name runMI-deprecated
##' @usage
##' sem.mi(model, data, ...,
##' m, miArgs = list(), miPackage = "Amelia", seed = 12345)
##' @seealso [semTools-deprecated()]
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##' @export
sem.mi <- function(model, data, ...,
m, miArgs = list(), miPackage = "Amelia", seed = 12345) {
# runMI(model = model, data = data, fun = "sem", ...,
# m = m, miArgs = miArgs, miPackage = miPackage, seed = seed)
mc <- match.call(expand.dots = TRUE)
mc$fun <- "sem"
mc[[1L]] <- quote(semTools::runMI)
eval(mc, parent.frame())
}
##' @name runMI-deprecated
##' @usage
##' growth.mi(model, data, ...,
##' m, miArgs = list(), miPackage = "Amelia", seed = 12345)
##' @seealso [semTools-deprecated()]
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##' @export
growth.mi <- function(model, data, ...,
m, miArgs = list(), miPackage = "Amelia", seed = 12345) {
# runMI(model = model, data = data, fun = "growth", ...,
# m = m, miArgs = miArgs, miPackage = miPackage, seed = seed)
mc <- match.call(expand.dots = TRUE)
mc$fun <- "growth"
mc[[1L]] <- quote(semTools::runMI)
eval(mc, parent.frame())
}
## -----------------
## Utility functions
## -----------------
##' Calculate the "D2" statistic
##'
##' This is a utility function used to calculate the "D2" statistic for pooling
##' test statistics across multiple imputations. This function is called by
##' several functions used for [OLDlavaan.mi-class] objects, such as
##' [lavTestLRT.mi()], [lavTestWald.mi()], and
##' [lavTestScore.mi()]. But this function can be used for any general
##' scenario because it only requires a vector of \eqn{\chi^2} statistics (one
##' from each imputation) and the degrees of freedom for the test statistic.
##' See Li, Meng, Raghunathan, & Rubin (1991) and Enders (2010, chapter 8) for
##' details about how it is calculated.
##'
##' @importFrom stats var pf pchisq
##'
##' @param w `numeric` vector of Wald \eqn{\chi^2} statistics. Can also
##' be Wald *z* statistics, which will be internally squared to make
##' \eqn{\chi^2} statistics with one *df* (must set `DF = 0L`).
##' @param DF degrees of freedom (*df*) of the \eqn{\chi^2} statistics.
##' If `DF = 0L` (default), `w` is assumed to contain *z*
##' statistics, which will be internally squared.
##' @param asymptotic `logical`. If `FALSE` (default), the pooled test
##' will be returned as an *F*-distributed statistic with numerator
##' (`df1`) and denominator (`df2`) degrees of freedom.
##' If `TRUE`, the pooled *F* statistic will be multiplied by its
##' `df1` on the assumption that its `df2` is sufficiently large
##' enough that the statistic will be asymptotically \eqn{\chi^2} distributed
##' with `df1`.
##'
##' @return A `numeric` vector containing the test statistic, *df*,
##' its *p* value, and 2 missing-data diagnostics: the relative increase
##' in variance (RIV, or average for multiparameter tests: ARIV) and the
##' fraction of missing information (FMI = ARIV / (1 + ARIV)).
##'
##' @seealso [lavTestLRT.mi()], [lavTestWald.mi()], [lavTestScore.mi()]
##'
##' @author Terrence D. Jorgensen (University of Amsterdam;
##' \email{TJorgensen314@@gmail.com})
##'
##' @references
##' Enders, C. K. (2010). *Applied missing data analysis*. New
##' York, NY: Guilford.
##'
##' Li, K.-H., Meng, X.-L., Raghunathan, T. E., & Rubin, D. B. (1991).
##' Significance levels from repeated *p*-values with multiply-imputed
##' data. *Statistica Sinica, 1*(1), 65--92. Retrieved from
##'
##'
##' @examples
##' ## generate a vector of chi-squared values, just for example
##' DF <- 3 # degrees of freedom
##' M <- 20 # number of imputations
##' CHI <- rchisq(M, DF)
##'
##' ## pool the "results"
##' calculate.D2(CHI, DF) # by default, an F statistic is returned
##' calculate.D2(CHI, DF, asymptotic = TRUE) # asymptotically chi-squared
##'
##' ## generate standard-normal values, for an example of Wald z tests
##' Z <- rnorm(M)
##' calculate.D2(Z) # default DF = 0 will square Z to make chisq(DF = 1)
##' ## F test is equivalent to a t test with the denominator DF
##'
##' @name calculate.D2-deprecated
##' @usage calculate.D2(w, DF = 0L, asymptotic = FALSE)
##' @seealso [semTools-deprecated()]
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##'
##' @export
calculate.D2 <- function(w, DF = 0L, asymptotic = FALSE) {
.Deprecated(msg = c("\nThe calculate.D2() function has been deprecated from ",
"semTools and moved to the new lavaan.mi package, along ",
"with other multiple-imputation functionality related to",
" runMI().\nSee help('semTools-deprecated)"))
if (length(w) == 0L) return(NA)
w <- as.numeric(w)
DF <- as.numeric(DF)
nImps <- sum(!is.na(w))
if (nImps == 0) return(NA)
if (DF <= 0L) {
## assume they are Z scores
w <- w^2
DF <- 1L
}
## pool test statistics
if (nImps > 1L) {
w_bar <- mean(w, na.rm = TRUE)
ariv <- (1 + 1/nImps) * var(sqrt(w), na.rm = TRUE)
test.stat <- (w_bar/DF - ((nImps + 1) * ariv / (nImps - 1))) / (1 + ariv)
} else {
warning('There was only 1 non-missing value to pool, leading to zero ',
'variance, so D2 cannot be calculated.')
test.stat <- ariv <- NA
}
if (!is.na(test.stat) && test.stat < 0) test.stat <- 0
if (asymptotic) {
out <- c("chisq" = test.stat * DF, df = DF,
pvalue = pchisq(test.stat * DF, df = DF, lower.tail = FALSE),
ariv = ariv, fmi = ariv / (1 + ariv))
} else {
v3 <- DF^(-3 / nImps) * (nImps - 1) * (1 + (1 / ariv))^2
out <- c("F" = test.stat, df1 = DF, df2 = v3,
pvalue = pf(test.stat, df1 = DF, df2 = v3, lower.tail = FALSE),
ariv = ariv, fmi = ariv / (1 + ariv))
}
class(out) <- c("lavaan.vector","numeric")
out
}
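## Illustrative check (not run): the F statistic returned by calculate.D2()
## matches the closed-form D2 above (up to truncation at zero):
# set.seed(1); CHI <- rchisq(20, df = 3)
# r <- (1 + 1/20) * var(sqrt(CHI))
# (mean(CHI)/3 - (21/19) * r) / (1 + r) # = calculate.D2(CHI, DF = 3)["F"]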
semTools/R/measurementInvariance.R 0000644 0001762 0000144 00000027703 14753124717 016754 0 ustar ligges users ### Sunthud Pornprasertmanit, Yves Rosseel, and Terrence D. Jorgensen
### Last updated: 12 June 2024
### DEPRECATED: 9 May 2022
##' Measurement Invariance Tests
##'
##' Testing measurement invariance across groups using a typical sequence of
##' model comparison tests.
##'
##' If `strict = FALSE`, the following four models are tested in order:
##' \enumerate{
##' \item Model 1: configural invariance. The same factor structure
##' is imposed on all groups.
##' \item Model 2: weak invariance. The factor loadings are constrained to
##' be equal across groups.
##' \item Model 3: strong invariance. The factor loadings and intercepts
##' are constrained to be equal across groups.
##' \item Model 4: The factor loadings, intercepts and means are constrained
##' to be equal across groups.
##' }
##'
##' Each time a more restricted model is fitted, a \eqn{\Delta\chi^2} test is
##' reported, comparing the current model with the previous one, and comparing
##' the current model to the baseline model (Model 1). In addition, the
##' difference in CFI is also reported (\eqn{\Delta}CFI).
##'
##' If `strict = TRUE`, the following five models are tested in order:
##' \enumerate{
##' \item Model 1: configural invariance. The same factor structure
##' is imposed on all groups.
##' \item Model 2: weak invariance. The factor loadings are constrained to be
##' equal across groups.
##' \item Model 3: strong invariance. The factor loadings and intercepts are
##' constrained to be equal across groups.
##' \item Model 4: strict invariance. The factor loadings, intercepts and
##' residual variances are constrained to be equal across groups.
##' \item Model 5: The factor loadings, intercepts, residual variances and means
##' are constrained to be equal across groups.
##' }
##'
##' Note that if the \eqn{\chi^2} test statistic is scaled (e.g., a Satorra-Bentler
##' or Yuan-Bentler test statistic), a special version of the \eqn{\Delta\chi^2}
##' test is used, corresponding to the `method` argument (by default, the
##' scaled difference test of Satorra & Bentler, 2001).
##'
##' @importFrom lavaan parTable
##'
##' @param ... The same arguments as for any lavaan model. See
##' [lavaan::cfa()] for more information.
##' @param std.lv If `TRUE`, the fixed-factor method of scale
##' identification is used. If `FALSE`, the first variable for each factor
##' is used as marker variable.
##' @param strict If `TRUE`, the sequence requires "strict" invariance.
##' See details for more information.
##' @param quiet If `FALSE` (default), a summary is printed out containing
##' an overview of the different models that are fitted, together with some
##' model comparison tests. If `TRUE`, no summary is printed.
##' @param fit.measures Fit measures used to calculate the differences between
##' nested models.
##' @param baseline.model custom baseline model passed to
##' [lavaan::fitMeasures()]
##' @param method The method used to calculate likelihood ratio test. See
##' [lavaan::lavTestLRT()] for available options
##'
##' @return Invisibly, all model fits in the sequence are returned as a list.
##'
##' @author Yves Rosseel (Ghent University; \email{Yves.Rosseel@@UGent.be})
##'
##' Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @references
##' Vandenberg, R. J., and Lance, C. E. (2000). A review and synthesis of the
##' measurement invariance literature: Suggestions, practices, and
##' recommendations for organizational research. *Organizational
##' Research Methods, 3,* 4--70.
##'
##' @examples
##'
##' HW.model <- ' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + x8 + x9 '
##'
##' measurementInvariance(model = HW.model, data = HolzingerSwineford1939,
##' group = "school", fit.measures = c("cfi","aic"))
##'
##' @name measurementInvariance-deprecated
##' @usage
##' measurementInvariance(..., std.lv = FALSE, strict = FALSE, quiet = FALSE,
##' fit.measures = "default", baseline.model = NULL,
##' method = "satorra.bentler.2001")
##' @seealso [semTools-deprecated()]
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##' @section Previous measurement-invariance functions:
##' The `measurementInvariance`, `measurementInvarianceCat`, and
##' `longInvariance` functions will no longer be supported. Instead, use
##' the [measEq.syntax()] function, which is much more flexible and
##' supports a wider range of data (e.g., any mixture of `numeric` and
##' `ordered` indicators, any combination of multiple groups and repeated
##' measures, models fit to multiple imputations with [lavaan.mi::lavaan.mi()]).
##'
##' @export
measurementInvariance <- function(..., std.lv = FALSE, strict = FALSE,
quiet = FALSE, fit.measures = "default",
baseline.model = NULL,
method = "satorra.bentler.2001") {
.Deprecated(msg = c("The measurementInvariance function is deprecated, and",
" it will cease to be included in future versions of ",
"semTools. See help('semTools-deprecated) for details."))
lavaancfa <- function(...) { lavaan::cfa(...) }
## check for a group.equal argument in ...
dotdotdot <- list(...)
if (is.null(dotdotdot$model)) stop('all lavaan() and lavOptions() arguments must',
' be named, including the "model=" argument.')
if (!is.null(dotdotdot$group.equal))
stop("lavaan ERROR: group.equal argument should not be used")
## and a model
if (names(dotdotdot)[1] == "") names(dotdotdot)[1] <- "model"
res <- list()
## base-line model: configural invariance
configural <- dotdotdot
configural$group.equal <- ""
template <- try(do.call(lavaancfa, configural), silent = TRUE)
if (inherits(template, "try-error")) stop('Configural model did not converge.')
pttemplate <- parTable(template)
varnames <- unique(pttemplate$rhs[pttemplate$op == "=~"])
facnames <- unique(pttemplate$lhs[(pttemplate$op == "=~") & (pttemplate$rhs %in% varnames)])
ngroups <- max(pttemplate$group)
if (ngroups <= 1) stop("Well, the number of groups is 1. Measurement",
" invariance across 'groups' cannot be done.")
if (std.lv) {
for (i in facnames) {
pttemplate <- fixParTable(pttemplate, i, "~~", i, 1:ngroups, 1)
}
fixloadings <- which(pttemplate$op == "=~" & pttemplate$free == 0)
for (i in fixloadings) {
pttemplate <- freeParTable(pttemplate, pttemplate$lhs[i], "=~",
pttemplate$rhs[i], pttemplate$group[i])
}
dotdotdot$model <- pttemplate
res$fit.configural <- try(do.call(lavaancfa, dotdotdot), silent = TRUE)
} else {
res$fit.configural <- template
}
## fix loadings across groups
if (std.lv) {
findloadings <- which(pttemplate$op == "=~" & pttemplate$free != 0 & pttemplate$group == 1)
for (i in findloadings) {
pttemplate <- constrainParTable(pttemplate, pttemplate$lhs[i],
"=~", pttemplate$rhs[i], 1:ngroups)
}
for (i in facnames) {
pttemplate <- freeParTable(pttemplate, i, "~~", i, 2:ngroups)
}
dotdotdot$model <- pttemplate
res$fit.loadings <- try(do.call(lavaancfa, dotdotdot), silent = TRUE)
} else {
loadings <- dotdotdot
loadings$group.equal <- c("loadings")
res$fit.loadings <- try(do.call(lavaancfa, loadings), silent = TRUE)
}
## fix loadings + intercepts across groups
if (std.lv) {
findintcepts <- which(pttemplate$op == "~1" & pttemplate$lhs %in% varnames &
pttemplate$free != 0 & pttemplate$group == 1)
for (i in findintcepts) {
pttemplate <- constrainParTable(pttemplate,
pttemplate$lhs[i], "~1", "", 1:ngroups)
}
for (i in facnames) {
pttemplate <- freeParTable(pttemplate, i, "~1", "", 2:ngroups)
}
dotdotdot$model <- pttemplate
res$fit.intercepts <- try(do.call(lavaancfa, dotdotdot), silent = TRUE)
} else {
intercepts <- dotdotdot
intercepts$group.equal <- c("loadings", "intercepts")
res$fit.intercepts <- try(do.call(lavaancfa, intercepts), silent = TRUE)
}
if (strict) {
if (std.lv) {
findresiduals <- which(pttemplate$op == "~~" &
pttemplate$lhs %in% varnames &
pttemplate$rhs == pttemplate$lhs &
pttemplate$free != 0 & pttemplate$group == 1)
for (i in findresiduals) {
pttemplate <- constrainParTable(pttemplate, pttemplate$lhs[i], "~~",
pttemplate$rhs[i], 1:ngroups)
}
dotdotdot$model <- pttemplate
res$fit.residuals <- try(do.call(lavaancfa, dotdotdot), silent = TRUE)
for (i in facnames) {
pttemplate <- fixParTable(pttemplate, i, "~1", "", 1:ngroups, 0)
}
dotdotdot$model <- pttemplate
res$fit.means <- try(do.call(lavaancfa, dotdotdot), silent = TRUE)
} else {
# fix loadings + intercepts + residuals
residuals <- dotdotdot
residuals$group.equal <- c("loadings", "intercepts", "residuals")
res$fit.residuals <- try(do.call(lavaancfa, residuals), silent = TRUE)
# fix loadings + residuals + intercepts + means
means <- dotdotdot
means$group.equal <- c("loadings", "intercepts", "residuals", "means")
res$fit.means <- try(do.call(lavaancfa, means), silent = TRUE)
}
} else {
if (std.lv) {
for (i in facnames) {
pttemplate <- fixParTable(pttemplate, i, "~1", "", 1:ngroups, 0)
}
dotdotdot$model <- pttemplate
res$fit.means <- try(do.call(lavaancfa, dotdotdot), silent = TRUE)
} else {
# fix loadings + intercepts + means
means <- dotdotdot
means$group.equal <- c("loadings", "intercepts", "means")
res$fit.means <- try(do.call(lavaancfa, means), silent = TRUE)
}
}
if (!quiet) printInvarianceResult(res, fit.measures, baseline.model, method)
invisible(res)
}
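## Minimal sketch (not run) of the recommended replacement via measEq.syntax(),
## with illustrative argument values; see help("measEq.syntax") for details:
# mod <- ' visual =~ x1 + x2 + x3
#         textual =~ x4 + x5 + x6 '
# fit.scalar <- measEq.syntax(configural.model = mod,
#                             data = HolzingerSwineford1939, group = "school",
#                             group.equal = c("loadings","intercepts"),
#                             return.fit = TRUE)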
## ----------------
## Hidden Functions
## ----------------
##' @importFrom lavaan lavInspect
##' @importMethodsFrom lavaan fitMeasures
printInvarianceResult <- function(FIT, fit.measures, baseline.model, method) {
## check whether models converged
NAMES <- names(FIT)
nonconv <- which(sapply(FIT, class) == "try-error")
if (length(nonconv)) {
message('The following model(s) did not converge: \n', paste(NAMES[nonconv], collapse = "\n"))
FIT <- FIT[-nonconv]
NAMES <- NAMES[-nonconv]
}
names(FIT) <- NULL
## compare models
lavaanLavTestLRT <- function(...) lavaan::lavTestLRT(...)
TABLE <- do.call(lavaanLavTestLRT, c(FIT, list(model.names = NAMES,
method = method)))
if (length(fit.measures) == 1L && fit.measures == "default") {
## scaled test statistic?
if (length(lavInspect(FIT[[1]], "test")) > 1L) {
if (lavInspect(FIT[[1]], "test")[[2]]$test %in% c("satorra.bentler", "yuan.bentler")) {
fit.measures <- c("cfi.robust", "rmsea.robust")
} else fit.measures <- c("cfi.scaled", "rmsea.scaled")
} else {
fit.measures <- c("cfi", "rmsea")
}
}
## add some fit measures
if (length(fit.measures)) {
FM <- lapply(FIT, fitMeasures,
fit.measures = fit.measures, baseline.model = baseline.model)
FM.table1 <- sapply(fit.measures, function(x) sapply(FM, "[[", x))
if (length(FM) == 1L) {
FM.table1 <- rbind( rep(as.numeric(NA), length(fit.measures)), FM.table1)
}
if (length(FM) > 1L) {
FM.table2 <- rbind(as.numeric(NA),
abs(apply(FM.table1, 2, diff)))
colnames(FM.table2) <- paste(colnames(FM.table2), ".delta", sep = "")
FM.TABLE <- as.data.frame(cbind(FM.table1, FM.table2))
} else {
FM.TABLE <- as.data.frame(FM.table1)
}
rownames(FM.TABLE) <- rownames(TABLE)
class(FM.TABLE) <- c("lavaan.data.frame", "data.frame")
}
cat("\n")
cat("Measurement invariance models:\n\n")
cat(paste(paste("Model", seq_along(FIT), ":", NAMES), collapse = "\n"))
cat("\n\n")
print(TABLE)
if (length(fit.measures)) {
cat("\n\n")
cat("Fit measures:\n\n")
print(FM.TABLE)
cat("\n")
return(list(anova = TABLE, fitMeasures = FM.TABLE))
}
TABLE
}
semTools/R/fitIndices.R 0000644 0001762 0000144 00000072676 14751404365 014517 0 ustar ligges users ### Title: Compute more fit indices
### Authors: Terrence D. Jorgensen, Sunthud Pornprasertmanit,
### Aaron Boulton, Ruben Arslan, Mauricio Garnier-Villarreal
### Last updated: 7 February 2025
### Description: Calculations for promising alternative fit indices
##' Calculate more fit indices
##'
##' Calculate more fit indices that are not already provided in lavaan.
##'
##' See [nullRMSEA()] for the further details of the computation of
##' RMSEA of the null model.
##'
##' Gamma-Hat (`gammaHat`; West, Taylor, & Wu, 2012) is a global
##' goodness-of-fit index which can be computed (assuming equal number of
##' indicators across groups) by
##'
##' \deqn{ \hat{\Gamma} =\frac{p}{p + 2 \times \frac{\chi^{2}_{k} - df_{k}}{N}},}
##'
##' where \eqn{p} is the number of variables in the model, \eqn{\chi^{2}_{k}} is
##' the \eqn{\chi^2} test statistic value of the target model, \eqn{df_{k}} is
##' the degree of freedom when fitting the target model, and \eqn{N} is the
##' sample size (or sample size minus the number of groups if `mimic` is
##' set to `"EQS"`).
##'
##' Adjusted Gamma-Hat (`adjGammaHat`; West, Taylor, & Wu, 2012) is a
##' global fit index which can be computed by
##'
##' \deqn{ \hat{\Gamma}_\textrm{adj} = \left(1 - \frac{K \times p \times
##' (p + 1)}{2 \times df_{k}} \right) \times \left( 1 - \hat{\Gamma} \right),}
##'
##' where \eqn{K} is the number of groups (please refer to Dudgeon, 2004, for
##' the multiple-group adjustment for `adjGammaHat`).
##'
##' Note that if Satorra--Bentler's or Yuan--Bentler's method is used, the fit
##' indices using the scaled \eqn{\chi^2} values are also provided.
##'
##' The remaining indices are information criteria calculated using the
##' `object`'s \eqn{-2 \times} log-likelihood, abbreviated \eqn{-2LL}.
##'
##' Corrected Akaike Information Criterion (`aic.smallN`; Burnham &
##' Anderson, 2003) is a corrected version of AIC for small sample size, often
##' abbreviated AICc:
##'
##' \deqn{ \textrm{AIC}_{\textrm{small}-N} = AIC + \frac{2q(q + 1)}{N - q - 1},}
##'
##' where \eqn{AIC} is the original AIC: \eqn{-2LL + 2q} (where \eqn{q}
##' = the number of estimated parameters in the target model). Note that AICc is
##' a small-sample correction derived for univariate regression models, so it is
##' probably *not* appropriate for comparing SEMs.
##'
##' Corrected Bayesian Information Criterion (`bic.priorN`; Kuha, 2004) is
##' similar to BIC but explicitly specifying the sample size on which the prior
##' is based (\eqn{N_{prior}}) using the `nPrior` argument.
##'
##' \deqn{ \textrm{BIC}_{\textrm{prior}-N} = -2LL + q\log{( 1 + \frac{N}{N_{prior}} )}.}
##'
##' Bollen et al. (2012, 2014) discussed additional BICs that incorporate more
##' terms from a Taylor series expansion, which the standard BIC drops. The
##' "Scaled Unit-Information Prior" BIC is calculated depending on whether the
##' product of the vector of estimated model parameters (\eqn{\hat{\theta}}) and
##' the observed information matrix (FIM) exceeds the number of estimated model
##' parameters (Case 1) or not (Case 2), which is checked internally:
##'
##' \deqn{ \textrm{SPBIC}_{\textrm{Case 1}} = -2LL + q \left(1 - \log{\frac{q}{\hat{\theta}^{'} \textrm{FIM} \hat{\theta}}} \right), \textrm{ or}}
##' \deqn{ \textrm{SPBIC}_{\textrm{Case 2}} = -2LL + \hat{\theta}^{'} \textrm{FIM} \hat{\theta},}
##'
##' Note that this implementation of SPBIC is calculated on the assumption that
##' priors for all estimated parameters are centered at zero, which is
##' inappropriate for most SEMs (e.g., variances should not have priors centered
##' at the lowest possible value; Bollen, 2014, p. 6).
##'
##' Bollen et al. (2014, eq. 14) credit the HBIC to Haughton (1988):
##'
##' \deqn{ \textrm{HBIC} = -2LL + q\log{\frac{N}{2 \pi}}.}
##'
##' Bollen et al. (2012, p. 305) proposed the information matrix (\eqn{I})-based BIC by
##' adding another term:
##'
##' \deqn{ \textrm{IBIC} = -2LL + q\log{\frac{N}{2 \pi}} + \log{\det{\textrm{FIM}}},}
##'
##' or equivalently, using the inverse information (the asymptotic sampling
##' covariance matrix of estimated parameters: ACOV):
##'
##' \deqn{ \textrm{IBIC} = -2LL - q\log{2 \pi} - \log{\det{\textrm{ACOV}}}.}
##'
##' Stochastic information criterion (SIC; see Preacher, 2006, for details) is
##' similar to IBIC but does not include the \eqn{q\log{2 \pi}} term that is
##' also in HBIC. SIC and IBIC both account for model complexity in a model's
##' functional form, not merely the number of free parameters. The SIC can be
##' computed as:
##'
##' \deqn{ \textrm{SIC} = -2LL + q\log{N} + \log{\det{\textrm{FIM}}} = -2LL - \log{\det{\textrm{ACOV}}}.}
##'
##' Hannan--Quinn Information Criterion (HQC; Hannan & Quinn, 1979) is used for
##' model selection, similar to AIC or BIC.
##'
##' \deqn{ \textrm{HQC} = -2LL + 2q\log{(\log{N})},}
##'
##' Instead of penalizing the number of free parameters directly, the
##' Bozdogan Information Complexity (ICOMP) criterion (Howe et al., 2011)
##' penalizes the covariance complexity of the model:
##'
##' \deqn{ \textrm{ICOMP} = -2LL + s \times \log{\left( \bar{\lambda}_a / \bar{\lambda}_g \right)},}
##'
##' where \eqn{s} is the rank of the asymptotic covariance matrix of the
##' estimated parameters, and \eqn{\bar{\lambda}_a} and \eqn{\bar{\lambda}_g}
##' are the arithmetic and geometric means of its nonzero eigenvalues.
##'
##'
##' @importFrom lavaan lavInspect
##'
##' @param object The lavaan model object provided after running the `cfa`,
##' `sem`, `growth`, or `lavaan` functions.
##' @param fit.measures Additional fit measures to be calculated. All additional
##' fit measures are calculated by default
##' @param nPrior The sample size on which prior is based. This argument is used
##' to compute `bic.priorN`.
##'
##' @return A `numeric` `lavaan.vector` including any of the
##' following requested via `fit.measures=`
##' \enumerate{
##' \item `gammaHat`: Gamma-Hat
##' \item `adjGammaHat`: Adjusted Gamma-Hat
##' \item `baseline.rmsea`: RMSEA of the default baseline (i.e., independence) model
##' \item `gammaHat.scaled`: Gamma-Hat using scaled \eqn{\chi^2}
##' \item `adjGammaHat.scaled`: Adjusted Gamma-Hat using scaled \eqn{\chi^2}
##' \item `baseline.rmsea.scaled`: RMSEA of the default baseline (i.e.,
##' independence) model using scaled \eqn{\chi^2}
##' \item `aic.smallN`: Corrected (for small sample size) AIC
##' \item `bic.priorN`: BIC with specified prior sample size
##' \item `spbic`: Scaled Unit-Information Prior BIC (SPBIC)
##' \item `hbic`: Haughton's BIC (HBIC)
##' \item `ibic`: Information-matrix-based BIC (IBIC)
##' \item `sic`: Stochastic Information Criterion (SIC)
##' \item `hqc`: Hannan-Quinn Information Criterion (HQC)
##' \item `icomp`: Bozdogan Information Complexity (ICOMP) Criteria
##' }
##'
##' @author Sunthud Pornprasertmanit (\email{psunthud@@gmail.com})
##'
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' Aaron Boulton (University of Delaware)
##'
##' Ruben Arslan (Humboldt-University of Berlin, \email{rubenarslan@@gmail.com})
##'
##' Yves Rosseel (Ghent University; \email{Yves.Rosseel@@UGent.be})
##'
##' Mauricio Garnier-Villarreal (Vrije Universiteit Amsterdam; \email{mgv@@pm.me})
##'
##' A great deal of feedback was provided by Kris Preacher regarding Bollen et
##' al.'s (2012, 2014) extensions of BIC.
##'
##' @seealso
##' \itemize{
##' \item [miPowerFit()] For the modification indices and their
##' power approach for model fit evaluation
##' \item [nullRMSEA()] For RMSEA of the default independence model
##' }
##'
##' @references
##'
##' Bollen, K. A., Ray, S., Zavisca, J., & Harden, J. J. (2012). A comparison of
##' Bayes factor approximation methods including two new methods.
##' *Sociological Methods & Research, 41*(2), 294--324.
##' \doi{10.1177/0049124112452393}
##'
##' Bollen, K. A., Harden, J. J., Ray, S., & Zavisca, J. (2014). BIC and
##' alternative Bayesian information criteria in the selection of structural
##' equation models. *Structural Equation Modeling, 21*(1), 1--19.
##' \doi{10.1080/10705511.2014.856691}
##'
##' Burnham, K., & Anderson, D. (2003). *Model selection and
##' multimodel inference: A practical--theoretic approach*. New York, NY:
##' Springer--Verlag.
##'
##' Dudgeon, P. (2004). A note on extending Steiger's (1998) multiple sample
##' RMSEA adjustment to other noncentrality parameter-based statistic.
##' *Structural Equation Modeling, 11*(3), 305--319.
##' \doi{10.1207/s15328007sem1103_1}
##'
##' Howe, E. D., Bozdogan, H., & Katragadda, S. (2011). Structural equation
##' modeling (SEM) of categorical and mixed-data using the novel Gifi
##' transformations and information complexity (ICOMP) criterion.
##' *Istanbul University Journal of the School of Business Administration, 40*(1), 86--123.
##'
##' Kuha, J. (2004). AIC and BIC: Comparisons of assumptions and performance.
##' *Sociological Methods & Research, 33*(2), 188--229.
##' \doi{10.1177/0049124103262065}
##'
##' Preacher, K. J. (2006). Quantifying parsimony in structural equation
##' modeling. *Multivariate Behavioral Research, 41*(3), 227--259.
##' \doi{10.1207/s15327906mbr4103_1}
##'
##' West, S. G., Taylor, A. B., & Wu, W. (2012). Model fit and model selection
##' in structural equation modeling. In R. H. Hoyle (Ed.), *Handbook of
##' structural equation modeling* (pp. 209--231). New York, NY: Guilford.
##'
##'
##' @examples
##'
##' HS.model <- ' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + x8 + x9 '
##'
##' fit <- cfa(HS.model, data = HolzingerSwineford1939)
##' moreFitIndices(fit)
##'
##' fit2 <- cfa(HS.model, data = HolzingerSwineford1939, estimator = "mlr")
##' moreFitIndices(fit2)
##'
##' @export
moreFitIndices <- function(object, fit.measures = "all", nPrior = 1) {
## check for validity of user-specified "fit.measures" argument
fit.choices <- c("gammaHat","adjGammaHat","baseline.rmsea",
"gammaHat.scaled","adjGammaHat.scaled","baseline.rmsea.scaled",
"aic.smallN","bic.priorN","spbic","hbic","ibic","sic","hqc","icomp")
flags <- setdiff(fit.measures, c("all", fit.choices))
if (length(flags)) stop(paste("Argument 'fit.measures' includes invalid options:",
paste(flags, collapse = ", "),
"Please choose 'all' or among the following:",
paste(fit.choices, collapse = ", "), sep = "\n"))
if ("all" %in% fit.measures) fit.measures <- fit.choices
# Extract fit indices information from lavaan object
fit <- lavInspect(object, "fit")
# Get the number of observed variables
p <- length(lavaan::lavNames(object, type = "ov", group = 1))
# Get the number of parameters
nParam <- fit["npar"]
# Find the number of groups
ngroup <- lavInspect(object, "ngroups")
# Get number of observations
N <- n <- lavInspect(object, "ntotal")
if (lavInspect(object, "options")$mimic == "EQS") n <- n - ngroup
## Calculate -2*log(likelihood)
f <- -2 * fit["logl"]
## Compute fit indices
result <- list()
if (length(grep("gamma", fit.measures, ignore.case = TRUE))) {
gammaHat <- p / (p + 2 * ((fit["chisq"] - fit["df"]) / n))
adjGammaHat <- 1 - (((ngroup * p * (p + 1)) / 2) / fit["df"]) * (1 - gammaHat)
result["gammaHat"] <- gammaHat
result["adjGammaHat"] <- adjGammaHat
if (any(grepl(pattern = "scaled", x = names(fit)))) {
gammaHatScaled <- p / (p + 2 * ((fit["chisq.scaled"] - fit["df.scaled"]) / n))
adjGammaHatScaled <- 1 - (((ngroup * p * (p + 1)) / 2) / fit["df.scaled"]) * (1 - gammaHatScaled)
result["gammaHat.scaled"] <- gammaHatScaled
result["adjGammaHat.scaled"] <- adjGammaHatScaled
}
}
if (length(grep("rmsea", fit.measures))) {
result["baseline.rmsea"] <- nullRMSEA(object, silent = TRUE)
if (any(grepl(pattern = "scaled", x = names(fit)))) {
result["baseline.rmsea.scaled"] <- nullRMSEA(object, scaled = TRUE, silent = TRUE)
}
}
## Compute information criteria
if (!is.na(f)) {
if ("aic.smallN" %in% fit.measures) {
warning('AICc (aic.smallN) was developed for univariate linear models.',
' It is probably not appropriate to use AICc to compare SEMs.')
result["aic.smallN"] <- fit[["aic"]] + (2 * nParam * (nParam + 1)) / (N - nParam - 1)
}
if ("bic.priorN" %in% fit.measures) {
result["bic.priorN"] <- f + log(1 + N/nPrior) * nParam
}
if ("spbic" %in% fit.measures) {
theta <- lavaan::coef(object)
FIM <- lavInspect(object, "information.observed")
junk <- t(theta) %*% FIM %*% theta
if (nParam < junk) {
result["spbic"] <- f + nParam*(1 - log(nParam / junk)) # Case 1
} else result["spbic"] <- f + junk # Case 2
}
if ("hbic" %in% fit.measures) result["hbic"] <- f + nParam*log(N/(2*pi))
if ("icomp" %in% fit.measures) {
Fhatinv <- lavInspect(object, "inverted.information.expected")
s <- qr(Fhatinv)$rank
C1 <- (s/2)*log((sum(diag(Fhatinv)))/s) - .5*log(det(Fhatinv))
result["icomp"] <- f + 2*C1
}
## check determinant of ACOV for IBIC and SIC
if (any(c("ibic","sic") %in% fit.measures)) {
ACOV <- lavInspect(object, "vcov")
detACOV <- det(ACOV)
if (detACOV <= 0) {
## look for duplicate names (simple equality constraints) to
## remove any obviously redundant parameters
#TODO: any way to check for redundancies implied by "==" constraints?
RN <- unique(rownames(ACOV))
if (length(RN) < nrow(ACOV)) detACOV <- det(ACOV[RN, RN])
}
}
if ("ibic" %in% fit.measures) {
if (detACOV <= 0) {
result["ibic"] <- NA
message('Determinant of vcov(object) <= 0, so IBIC cannot be calculated')
} else result["ibic"] <- f - nParam*log(2*pi) - log(detACOV)
}
if ("sic" %in% fit.measures) {
if (detACOV <= 0) {
result["sic"] <- NA
message('Determinant of vcov(object) <= 0, so SIC cannot be calculated')
} else {
result["sic"] <- f - log(detACOV)
## doi:10.1007/s10519-004-5587-0 (p. 596) says to use observed Fisher information,
## but ACOV will be lavInspect(object, "options")$information
## legacy code:
# E.inv <- lavaan::lavTech(object, "inverted.information.observed")
# E <- MASS::ginv(E.inv) * lavaan::nobs(object) # multiply unit information by N
}
}
if ("hqc" %in% fit.measures) result["hqc"] <- f + 2*nParam*log(log(N))
}
result <- unlist(result[fit.measures])
class(result) <- c("lavaan.vector","numeric")
result
}
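## Illustrative check (not run): gammaHat from moreFitIndices() can be
## reproduced from the formula in the documentation (single-group ML case):
# fit <- cfa(HS.model, data = HolzingerSwineford1939)
# FM <- lavInspect(fit, "fit")
# p <- length(lavaan::lavNames(fit))
# N <- lavInspect(fit, "ntotal")
# p / (p + 2 * (FM["chisq"] - FM["df"]) / N) # = moreFitIndices(fit)["gammaHat"]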
##' Calculate the RMSEA of the null model
##'
##' Calculate the RMSEA of the null (baseline) model
##'
##' RMSEA of the null model is calculated similar to the formula provided in the
##' `lavaan` package. The standard formula of RMSEA is
##'
##' \deqn{ RMSEA =\sqrt{\frac{\chi^2}{N \times df} - \frac{1}{N}} \times
##' \sqrt{G} }
##'
##' where \eqn{\chi^2} is the chi-square test statistic value of the target
##' model, \eqn{N} is the total sample size, \eqn{df} is the degrees of freedom
##' of the hypothesized model, and \eqn{G} is the number of groups. Kenny
##' proposed on his website that
##'
##' "A reasonable rule of thumb is to examine the RMSEA for the null model and
##' make sure that is no smaller than 0.158. An RMSEA for the model of 0.05 and
##' a TLI of .90, implies that the RMSEA of the null model is 0.158. If the
##' RMSEA for the null model is less than 0.158, an incremental measure of fit
##' may not be that informative."
##'
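##' (This rule of thumb follows from the approximation
##' \eqn{RMSEA_{null} = RMSEA / \sqrt{1 - TLI}}; for example,
##' \eqn{.05 / \sqrt{1 - .90} \approx .158}.)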
##'
##' @importFrom lavaan lavInspect
##'
##' @param object The lavaan model object provided after running the `cfa`,
##' `sem`, `growth`, or `lavaan` functions.
##' @param scaled If `TRUE`, the scaled (or robust, if available) RMSEA
##' is returned. Ignored if a robust test statistic was not requested.
##' @param silent If `TRUE`, do not print anything on the screen.
##'
##' @return The RMSEA of the null model (a `numeric` value), returned
##' invisibly.
##'
##' @author
##' Ruben Arslan (Humboldt-University of Berlin, \email{rubenarslan@@gmail.com})
##'
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @seealso
##' \itemize{
##' \item [miPowerFit()] For the modification indices and their
##' power approach for model fit evaluation
##' \item [moreFitIndices()] For other fit indices
##' }
##'
##' @references Kenny, D. A., Kaniskan, B., & McCoach, D. B. (2015). The
##' performance of RMSEA in models with small degrees of freedom.
##' *Sociological Methods & Research, 44*(3), 486--507.
##' \doi{10.1177/0049124114543236}
##'
##' @examples
##'
##' HS.model <- ' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + x8 + x9 '
##'
##' fit <- cfa(HS.model, data = HolzingerSwineford1939)
##' nullRMSEA(fit)
##'
##' @export
nullRMSEA <- function(object, scaled = FALSE, silent = FALSE) {
fit <- lavaan::update(object, slotData = object@Data,
model = lavaan::lav_partable_independence(object))
fits <- lavaan::fitMeasures(fit, fit.measures = c("rmsea","rmsea.scaled",
"rmsea.robust"))
if (scaled) {
RMSEA <- fits["rmsea.robust"]
if (is.na(RMSEA)) RMSEA <- fits["rmsea.scaled"]
if (is.na(RMSEA)) RMSEA <- fits["rmsea"]
} else RMSEA <- fits["rmsea"]
if (!silent) {
cat("The baseline model's RMSEA =", RMSEA, "\n\n")
if (RMSEA < 0.158 ) {
cat("CFI, TLI, and other incremental fit indices may not be very",
"informative because the baseline model's RMSEA < 0.158",
"(Kenny, Kaniskan, & McCoach, 2015). \n")
}
}
invisible(RMSEA)
}
## Stochastic Information Criterion
## f = minimized discrepancy function
## lresults = lavaan sem output object
#TODO: update to extract f from lresults. Make public?
sic <- function(f, lresults = NULL) {
## p. 596 of doi:10.1007/s10519-004-5587-0 says to use observed Fisher information
E.inv <- lavaan::lavTech(lresults, "inverted.information.observed")
if (inherits(E.inv, "try-error")) {
return(as.numeric(NA))
}
E <- MASS::ginv(E.inv) * lavaan::nobs(lresults)
eigvals <- eigen(E, symmetric = TRUE, only.values = TRUE)$values
# only positive ones
eigvals <- eigvals[ eigvals > sqrt(.Machine$double.eps)]
DET <- prod(eigvals)
## check singular
if (DET <= 0) return(NA)
## return SIC, consistent with moreFitIndices(): f - log(det(ACOV))
f + log(DET)
}
##' Small-*N* correction for \eqn{\chi^2} test statistic
##'
##' Calculate small-*N* corrections for \eqn{\chi^2} model-fit test
##' statistic to adjust for small sample size (relative to model size).
##'
##' Four finite-sample adjustments to the chi-squared statistic are currently
##' available, all of which are described in Shi et al. (2018). These all
##' assume normally distributed data, and may not work well with severely
##' nonnormal data. Deng et al. (2018, section 4) review proposed small-*N*
##' adjustments that do not assume normality, which rarely show promise, so
##' they are not implemented here. This function currently will apply
##' small-*N* adjustments to scaled test statistics with a warning that
##' they do not perform well (Deng et al., 2018).
##'
##' @importFrom lavaan lavInspect lavNames
##' @importFrom stats pchisq
##' @importFrom methods getMethod
##'
##' @param fit0,fit1 [lavaan::lavaan-class] or [lavaan.mi::lavaan.mi-class] object(s)
##' @param smallN.method `character` indicating the small-*N*
##' correction method to use. Multiple may be chosen (all of which assume
##' normality), as described in Shi et al. (2018):
##' `c("swain","yuan.2015","yuan.2005","bartlett")`. Users may also
##' simply select `"all"`.
##' @param \dots Additional arguments to the [lavaan::lavTestLRT()] or
##' [lavaan.mi::lavTestLRT.mi()] functions. Ignored when `is.null(fit1)`.
##' @param omit.imps `character` vector specifying criteria for omitting
##' imputations from pooled results. Ignored unless `fit0` (and
##' optionally `fit1`) is a [lavaan.mi::lavaan.mi-class] object. See
##' [lavaan.mi::lavTestLRT.mi()] for a description of options and defaults.
##'
##' @return A `list` of `numeric` vectors: one for the originally
##' requested statistic(s), along with one per requested `smallN.method`.
##' All include the (un)adjusted test statistic, its *df*, and the
##' *p* value for the test under the null hypothesis that the model fits
##' perfectly (or that the 2 models have equivalent fit).
##' The adjusted chi-squared statistic(s) also include(s) the scaling factor
##' for the small-*N* adjustment.
##'
##' @author
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' @references
##' Deng, L., Yang, M., & Marcoulides, K. M. (2018). Structural equation
##' modeling with many variables: A systematic review of issues and
##' developments. *Frontiers in Psychology, 9*, 580.
##' \doi{10.3389/fpsyg.2018.00580}
##'
##' Shi, D., Lee, T., & Terry, R. A. (2018). Revisiting the model
##' size effect in structural equation modeling.
##' *Structural Equation Modeling, 25*(1), 21--40.
##' \doi{10.1080/10705511.2017.1369088}
##'
##' @examples
##'
##' HS.model <- '
##' visual =~ x1 + b1*x2 + x3
##' textual =~ x4 + b2*x5 + x6
##' speed =~ x7 + b3*x8 + x9
##' '
##' fit1 <- cfa(HS.model, data = HolzingerSwineford1939[1:50,])
##' ## test a single model (implicitly compared to a saturated model)
##' chisqSmallN(fit1)
##'
##' ## fit a more constrained model
##' fit0 <- cfa(HS.model, data = HolzingerSwineford1939[1:50,],
##' orthogonal = TRUE)
##' ## compare 2 models
##' chisqSmallN(fit1, fit0)
##'
##' @export
chisqSmallN <- function(fit0, fit1 = NULL,
smallN.method = if (is.null(fit1)) c("swain","yuan.2015") else "yuan.2005",
..., omit.imps = c("no.conv","no.se")) {
if ("all" %in% smallN.method) smallN.method <- c("swain","yuan.2015",
"yuan.2005","bartlett")
smallN.method <- intersect(tolower(smallN.method),
c("swain","yuan.2015","yuan.2005","bartlett"))
if (!any(smallN.method %in% c("swain","yuan.2015","yuan.2005","bartlett")))
stop('No recognized options for "smallN.method" argument')
## check class
if (!inherits(fit0, what = c("lavaan","lavaan.mi")))
stop("this function is only applicable to fitted lavaan(.mi) models")
## necessary to load lavaan.mi?
if (inherits(fit0, what = c("lavaan.mi"))) {
if (!"package:lavaan.mi" %in% search()) attachNamespace("lavaan.mi")
}
## if there are 2 models...
if (!is.null(fit1)) {
## check classes
if (!inherits(fit1, what = c("lavaan","lavaan.mi")))
stop("this function is only applicable to fitted lavaan(.mi) models")
modClass <- unique(sapply(list(fit0, fit1), class))
if (length(modClass) > 1L) stop('All models must be of the same class (e.g.,',
' cannot compare lavaan objects to lavaan.mi)')
## check order of DF
suppressMessages(DF0 <- getMethod("fitMeasures", class(fit0))(fit0, fit.measures = "df",
omit.imps = omit.imps)[1])
suppressMessages(DF1 <- getMethod("fitMeasures", class(fit1))(fit1, fit.measures = "df",
omit.imps = omit.imps)[1])
if (DF0 == DF1) stop("Models have the same degrees of freedom.")
parent <- which.min(c(DF0, DF1))
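## if fit0 is the *less* restricted model (fewer df), swap the two
## objects so that fit0 is always the more restricted model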
if (parent == 1L) {
parent <- fit0
fit0 <- fit1
fit1 <- parent
parent <- DF0
DF0 <- DF1
DF1 <- parent
}
if (min(c(DF0, DF1)) == 0L) {
message('Less restricted model has df=0, so chi-squared difference ',
'not needed to compare models. Using only the restricted ',
"model's chi-squared statistic.")
fit1 <- NULL
}
}
## check whether methods can be used
if (!is.null(fit1)) {
if (any(smallN.method %in% c("yuan.2015","swain"))) {
message('Swain (1975) and Yuan (2015) corrections depend on the number ',
'of free parameters, so it is unavailable for model comparison.')
smallN.method <- smallN.method[-which(smallN.method %in% c("yuan.2015","swain"))]
}
if (!length(smallN.method)) {
stop('No valid options for "smallN.method" argument')
} else warning('Small-N corrections developed for single models, not for ',
'model comparison. Experimentally applying correction to ',
'chi-squared difference statistic, which might be invalid.')
}
## save quantities relevant across correction methods
N <- lavInspect(fit0, "ntotal")
Ng <- lavInspect(fit0, "ngroups")
if (!lavInspect(fit0, "options")$sample.cov.rescale) N <- N - Ng
P <- length(lavNames(fit0))
K <- length(lavNames(fit0, type = "lv")) # count latent factors
if (is.null(fit1)) {
FIT <- getMethod("fitMeasures", class(fit0))(fit0,
## lavaan.mi arguments ignored
## for lavaan objects
omit.imps = omit.imps,
asymptotic = TRUE,
fit.measures = c("npar","chisq",
"df","pvalue"))
scaled <- any(grepl(pattern = "scaled", x = names(FIT)))
if (scaled) warning('Small-N corrections developed assuming normality, but',
' a scaled test was requested. Applying correction(s) ',
'to the scaled test statistic, but this has not ',
'performed well in past simulations.')
NPAR <- FIT[["npar"]]
chi <- FIT[[if (scaled) "chisq.scaled" else "chisq"]]
DF <- FIT[[if (scaled) "df.scaled" else "df"]]
PV <- FIT[[if (scaled) "pvalue.scaled" else "pvalue"]]
} else {
## Compare to a second model. Check matching stats.
N1 <- lavInspect(fit1, "ntotal")
Ng1 <- lavInspect(fit1, "ngroups")
if (!lavInspect(fit1, "options")$sample.cov.rescale) N1 <- N1 - Ng1
if (N != N1) stop("Unequal sample sizes")
if (P != length(lavNames(fit1))) stop("Unequal number of observed variables")
K1 <- length(lavNames(fit1, type = "lv"))
if (K != K1 && any(smallN.method %in% c("yuan.2005","bartlett"))) {
warning("Unequal number of latent variables (k). Unclear how to apply ",
"Yuan (2005) or Bartlett (2015) corrections when comparing ",
"models with different k. Experimentally using the larger ",
"model's k, but there is no evidence this is valid.")
K <- max(K, K1)
}
AOV <- try(compareFit(fit0, fit1, argsLRT = list(...), indices = FALSE),
silent = TRUE)
if (inherits(AOV, "try-error")) stop('Model comparison failed. Try using ',
'lavTestLRT() to investigate why.')
if (inherits(fit0, "lavaan")) {
if (grepl("scaled", attr(AOV@nested, "heading"), ignore.case = TRUE))
warning('Small-N corrections developed assuming normality, but scaled ',
'tests were requested. Applying correction(s) to the scaled test',
' statistic, but this has not performed well in past simulations.')
chi <- AOV@nested[["Chisq diff"]][2]
DF <- AOV@nested[["Df diff"]][2]
PV <- AOV@nested[["Pr(>Chisq)"]][2]
} else if (inherits(fit0, "lavaan.mi")) {
if (any(grepl("scaled", colnames(AOV@nested), ignore.case = TRUE)))
warning('Small-N corrections developed assuming normality, but scaled ',
'tests were requested. Applying correction(s) to the scaled test',
' statistic, but this has not performed well in past simulations.')
chi <- AOV@nested[1, 1]
DF <- AOV@nested[1, 2]
PV <- AOV@nested[1, 3]
}
}
## empty list to store correction(s)
out <- list()
out[[ lavInspect(fit0, "options")$test ]] <- c(chisq = chi, df = DF,
pvalue = PV)
class(out[[1]]) <- c("lavaan.vector","numeric")
## calculate Swain's (1975) correction
## (undefined for model comparison)
if ("swain" %in% smallN.method) {
Q <- (sqrt(1 + 8*NPAR) - 1) / 2
num <- P*(2*P^2 + 3*P - 1) - Q*(2*Q^2 + 3*Q - 1)
SC <- 1 - num / (12*DF*N)
out[["swain"]] <- c(chisq = chi*SC, df = DF,
pvalue = pchisq(chi*SC, DF, lower.tail = FALSE),
smallN.factor = SC)
class(out[["swain"]]) <- c("lavaan.vector","numeric")
}
## calculate Yuan's (2015) correction
## (undefined for model comparison)
if ("yuan.2015" %in% smallN.method) {
## numerator uses actual N regardless of sample.cov.rescale
SC <- (lavInspect(fit0, "ntotal") - (2.381 + .361*P + .006*NPAR)) / N
out[["yuan.2015"]] <- c(chisq = chi*SC, df = DF,
pvalue = pchisq(chi*SC, DF, lower.tail = FALSE),
smallN.factor = SC)
class(out[["yuan.2015"]]) <- c("lavaan.vector","numeric")
}
## calculate Yuan's (2005) correction
if ("yuan.2005" %in% smallN.method) {
SC <- 1 - ((2*P + 2*K + 7) / (6*N))
out[["yuan.2005"]] <- c(chisq = chi*SC, df = DF,
pvalue = pchisq(chi*SC, DF, lower.tail = FALSE),
smallN.factor = SC)
class(out[["yuan.2005"]]) <- c("lavaan.vector","numeric")
}
## calculate Bartlett's (1950) k-factor correction (ONLY appropriate for EFA)
if ("bartlett" %in% smallN.method) {
message('Bartlett\'s k-factor correction was developed for EFA models, ',
'not for general SEMs.')
SC <- 1 - ((2*P + 4*K + 5) / (6*N))
out[["bartlett"]] <- c(chisq = chi*SC, df = DF,
pvalue = pchisq(chi*SC, DF, lower.tail = FALSE),
smallN.factor = SC)
class(out[["bartlett"]]) <- c("lavaan.vector","numeric")
}
out[c(lavInspect(fit0, "options")$test, smallN.method)]
}
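## Worked example (illustrative values) of the Yuan (2005) factor applied
## above: with p = 9 indicators, k = 3 factors, and N = 50,
## SC = 1 - (2*9 + 2*3 + 7) / (6*50) = 1 - 31/300 = 0.897,
## so the chi-squared statistic shrinks by roughly 10%.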
semTools/R/missingBootstrap.R 0000644 0001762 0000144 00000101071 14764276216 015771 0 ustar ligges users ### Terrence D. Jorgensen
### Last updated: 12 March 2025
### Savalei & Yuan's (2009) model-based bootstrap for missing data
## ----------------------------
## "BootMiss" Class and Methods
## ----------------------------
##' Class For the Results of Bollen-Stine Bootstrap with Incomplete Data
##'
##' This class contains the results of Bollen-Stine bootstrap with missing data.
##'
##'
##' @name BootMiss-class
##' @aliases BootMiss-class show,BootMiss-method summary,BootMiss-method
##' hist,BootMiss-method
##' @docType class
##' @section Objects from the Class: Objects can be created via the
##' [bsBootMiss()] function.
##' @slot time A list containing 2 `difftime` objects (`transform`
##' and `fit`), indicating the time elapsed for data transformation and
##' for fitting the model to bootstrap data sets, respectively.
##' @slot transData Transformed data
##' @slot bootDist The vector of \eqn{\chi^2} values from bootstrap data sets
##' fitted by the target model
##' @slot origChi The \eqn{\chi^2} value from the original data set
##' @slot df The degrees of freedom of the model
##' @slot bootP The *p* value comparing the original \eqn{\chi^2} with the
##' bootstrap distribution
##' @author Terrence D. Jorgensen (University of Amsterdam;
##' \email{TJorgensen314@@gmail.com})
##' @seealso [bsBootMiss()]
##' @examples
##'
##' # See the example from the bsBootMiss function
##'
setClass("BootMiss", representation(time = "list",
transData = "data.frame",
bootDist = "vector",
origChi = "numeric",
df = "numeric",
bootP = "numeric"))
##' @rdname BootMiss-class
##' @aliases show,BootMiss-method
##' @importFrom stats pchisq
##' @export
setMethod("show", "BootMiss",
function(object) {
cat("Chi-Squared = ", object@origChi, "\nDegrees of Freedom = ",
object@df, "\nTheoretical p value = ",
pchisq(object@origChi, object@df, lower.tail = FALSE),
"\n i.e., pchisq(", object@origChi, ", df = ",
object@df, ", lower.tail = FALSE)\n",
"\nBootstrapped p value = ", object@bootP, "\n\n", sep = "")
invisible(object)
})
##' @rdname BootMiss-class
##' @aliases summary,BootMiss-method
##' @importFrom stats var
##' @export
setMethod("summary", "BootMiss",
function(object) {
cat("Time elapsed to transform the data:\n")
print(object@time$transform)
cat("\nTime elapsed to fit the model to", length(object@bootDist),
"bootstrapped samples:\n")
print(object@time$fit)
cat("\nMean of Theoretical Distribution = DF =", object@df,
"\nVariance of Theoretical Distribution = 2*DF =", 2*object@df,
"\n\nMean of Bootstrap Distribution =", mean(object@bootDist),
"\nVariance of Bootstrap Distribution =",
var(object@bootDist), "\n\n")
show(object)
invisible(object)
})
##' @rdname BootMiss-class
##' @aliases hist,BootMiss-method
##' @importFrom stats qchisq dchisq quantile
##' @param object,x object of class `BootMiss`
##' @param ... Additional arguments to pass to [graphics::hist()]
##' @param alpha alpha level used to draw confidence limits
##' @param nd number of digits to display
##' @param printLegend `logical`. If `TRUE` (default), a legend will
##' be printed with the histogram
##' @param legendArgs `list` of arguments passed to the
##' [graphics::legend()] function. The default argument is a list
##' placing the legend at the top-left of the figure.
##' @return The `hist` method returns a list of `length == 2`,
##' containing the arguments for the call to `hist` and the arguments
##' to the call for `legend`, respectively.
##' @export
setMethod("hist", "BootMiss",
function(x, ..., alpha = .05, nd = 2, printLegend = TRUE,
legendArgs = list(x = "topleft")) {
ChiSq <- x@origChi
DF <- x@df
bootDist <- x@bootDist
bCrit <- round(quantile(bootDist, probs = 1 - alpha), nd)
theoDist <- dchisq(seq(.1, max(c(ChiSq, bootDist)), by = .1), df = DF)
Crit <- round(qchisq(p = alpha, df = DF, lower.tail = FALSE), nd)
Lim <- c(0, max(c(ChiSq, bootDist, Crit)))
if (ChiSq > Lim[2]) Lim[2] <- ChiSq
histArgs <- list(...)
histArgs$x <- bootDist
histArgs$freq <- FALSE
if (is.null(histArgs$col)) histArgs$col <- "grey69"
if (is.null(histArgs$xlim)) histArgs$xlim <- Lim
if (is.null(histArgs$main)) histArgs$main <- expression("Model-Based Bootstrap Distribution of" ~ chi^2)
if (is.null(histArgs$ylab)) histArgs$ylab <- "Probability Density"
if (is.null(histArgs$xlab)) histArgs$xlab <- expression(chi^2)
if (printLegend) {
if (nd < length(strsplit(as.character(1 / alpha), "")[[1]]) - 1) {
warning(paste0("The number of digits argument (nd = ", nd ,
") is too low to display your p value at the",
" same precision as your requested alpha level (alpha = ",
alpha, ")"))
}
if (x@bootP < (1 / 10^nd)) {
pVal <- paste(c("< .", rep(0, nd - 1),"1"), collapse = "")
} else {
pVal <- paste("=", round(x@bootP, nd))
}
if (is.null(legendArgs$box.lty)) legendArgs$box.lty <- 0
if (is.null(legendArgs$lty)) legendArgs$lty <- c(1, 2, 2, 1, 0, 0)
if (is.null(legendArgs$lwd)) legendArgs$lwd <- c(2, 2, 2, 3, 0, 0)
#if (is.null(legendArgs$cex)) legendArgs$cex <- c(1.1, 1, 1, 1, 1, 1)
if (is.null(legendArgs$col)) legendArgs$col <- c("black","black","grey69","red","", "")
legendArgs$legend <- c(bquote(chi[.(paste("df =", DF))]^2),
bquote(Critical ~ chi[alpha ~ .(paste(" =", alpha))]^2 == .(Crit)),
bquote(Bootstrap~Critical~chi[alpha ~ .(paste(" =", alpha))]^2 == .(bCrit)),
expression(Observed ~ chi^2),
bquote(.("")),
bquote(Bootstrap ~ italic(p) ~~ .(pVal)))
}
H <- do.call(hist, c(histArgs["x"], plot = FALSE))
histArgs$ylim <- c(0, max(H$density, theoDist))
suppressWarnings({
do.call(hist, histArgs)
lines(x = seq(.1, max(c(ChiSq, bootDist)), by = .1), y = theoDist, lwd = 2)
abline(v = Crit, col = "black", lwd = 2, lty = 2)
abline(v = bCrit, col = "grey69", lwd = 2, lty = 2)
abline(v = ChiSq, col = "red", lwd = 3)
if (printLegend) do.call(legend, legendArgs)
})
## return arguments to create histogram (and optionally, legend)
invisible(list(hist = histArgs, legend = legendArgs))
})
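## Sketch (not run): the returned arguments can be tweaked and replayed.
## Here "bmOut" is a hypothetical BootMiss object; note the reference
## lines (critical values, observed chi-squared) are not replayed:
# H <- hist(bmOut)
# H$hist$breaks <- 40 # denser bins, then redraw:
# do.call(hist, H$hist)
# do.call(legend, H$legend)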
## --------------------
## Constructor Function
## --------------------
##' Bollen-Stine Bootstrap with the Existence of Missing Data
##'
##' Implements Bollen and Stine's (1992) bootstrap when there are missing
##' observations, using the method proposed by Savalei and Yuan (2009). This
##' can be used in two ways. The first and easiest option is to fit the model to
##' incomplete data in `lavaan` using the FIML estimator, then pass that
##' `lavaan` object to `bsBootMiss`.
##'
##' The second is designed for users of other software packages (e.g., LISREL,
##' EQS, Amos, or Mplus). Users can import their data, \eqn{\chi^2} value, and
##' model-implied moments from another package, and they have the option of
##' saving (or writing to a file) either the transformed data or bootstrapped
##' samples of that data, which can be analyzed in other programs. In order to
##' analyze the bootstrapped samples and return a *p* value, users of other
##' programs must still specify their model using lavaan syntax.
##'
##'
##' @importFrom lavaan lavInspect parTable
##' @param x A target `lavaan` object used in the Bollen-Stine bootstrap
##' @param transformation The transformation methods in Savalei and Yuan (2009).
##' There are three methods in the article, but only the first two are currently
##' implemented here. Use `transformation = 1` when there are few missing
##' data patterns, each of which has a large size, such as in a
##' planned-missing-data design. Use `transformation = 2` when there are
##' more missing data patterns. The currently unavailable
##' `transformation = 3` would be used when several missing data patterns
##' have n = 1.
##' @param nBoot The number of bootstrap samples.
##' @param model Optional. The target model if `x` is not provided.
##' @param rawData Optional. The target raw data set if `x` is not
##' provided.
##' @param Sigma Optional. The model-implied covariance matrix if `x` is
##' not provided.
##' @param Mu Optional. The model-implied mean vector if `x` is not
##' provided.
##' @param group Optional character string specifying the name of the grouping
##' variable in `rawData` if `x` is not provided.
##' @param ChiSquared Optional. The model's \eqn{\chi^2} test statistic if
##' `x` is not provided.
##' @param EMcov Optional, if `x` is not provided. The EM (or Two-Stage ML)
##' estimated covariance matrix used to speed up Transformation 2 algorithm.
##' @param transDataOnly Logical. If `TRUE`, the result will provide the
##' transformed data only.
##' @param writeTransData Logical. If `TRUE`, the transformed data set is
##' written to a text file, `transDataOnly` is set to `TRUE`, and the
##' transformed data is returned invisibly.
##' @param bootSamplesOnly Logical. If `TRUE`, the result will provide
##' bootstrap data sets only.
##' @param writeBootData Logical. If `TRUE`, the stacked bootstrap data
##' sets are written to a text file, `bootSamplesOnly` is set to
##' `TRUE`, and the list of bootstrap data sets are returned invisibly.
##' @param writeArgs Optional `list`. If `writeBootData = TRUE` or
##' `writeBootData = TRUE`, user can pass arguments to the
##' [utils::write.table()] function as a list. Some default values
##' are provided: `file` = "bootstrappedSamples.dat", `row.names` =
##' `FALSE`, and `na` = "-999", but the user can override all of these
##' by providing other values for those arguments in the `writeArgs` list.
##' @param seed The seed number used in randomly drawing bootstrap samples.
##' @param suppressWarn Logical. If `TRUE`, warnings from `lavaan`
##' function will be suppressed when fitting the model to each bootstrap sample.
##' @param showProgress Logical. Indicating whether to display a progress bar
##' while fitting models to bootstrap samples.
##' @param \dots The additional arguments in the [lavaan::lavaan()]
##' function. See also [lavaan::lavOptions()]
##' @return As a default, this function returns a [BootMiss-class]
##' object containing the results of the bootstrap samples. Use `show`,
##' `summary`, or `hist` to examine the results. Optionally, the
##' transformed data set is returned if `transDataOnly = TRUE`. Optionally,
##' the bootstrap data sets are returned if `bootSamplesOnly = TRUE`.
##' @author Terrence D. Jorgensen (University of Amsterdam;
##' \email{TJorgensen314@@gmail.com})
##'
##' Syntax for transformations borrowed from http://www2.psych.ubc.ca/~vsavalei/
##' @seealso [BootMiss-class]
##' @references
##'
##' Bollen, K. A., & Stine, R. A. (1992). Bootstrapping goodness-of-fit measures
##' in structural equation models. *Sociological Methods &
##' Research, 21*(2), 205--229. \doi{10.1177/0049124192021002004}
##'
##' Savalei, V., & Yuan, K.-H. (2009). On the model-based bootstrap with missing
##' data: Obtaining a p-value for a test of exact fit. *Multivariate
##' Behavioral Research, 44*(6), 741--763. \doi{10.1080/00273170903333590}
##' @examples
##'
##' dat1 <- HolzingerSwineford1939
##' dat1$x5 <- ifelse(dat1$x1 <= quantile(dat1$x1, .3), NA, dat1$x5)
##' dat1$x9 <- ifelse(is.na(dat1$x5), NA, dat1$x9)
##'
##' targetModel <- "
##' visual =~ x1 + x2 + x3
##' textual =~ x4 + x5 + x6
##' speed =~ x7 + x8 + x9
##' "
##' targetFit <- sem(targetModel, dat1, meanstructure = TRUE, std.lv = TRUE,
##' missing = "fiml", group = "school")
##' summary(targetFit, fit = TRUE, standardized = TRUE)
##'
##' \donttest{
##' ## The number of bootstrap samples should be much higher than this example
##' temp <- bsBootMiss(targetFit, transformation = 1, nBoot = 10, seed = 31415)
##'
##' temp
##' summary(temp)
##' hist(temp)
##' hist(temp, printLegend = FALSE) # suppress the legend
##' ## user can specify alpha level (default: alpha = 0.05), and the number of
##' ## digits to display (default: nd = 2). Pass other arguments to hist(...),
##' ## or a list of arguments to legend() via "legendArgs"
##' hist(temp, alpha = .01, nd = 3, xlab = "something else", breaks = 25,
##' legendArgs = list("bottomleft", box.lty = 2))
##' }
##'
##' @export
bsBootMiss <- function(x, transformation = 2, nBoot = 500, model, rawData,
Sigma, Mu, group, ChiSquared, EMcov,
writeTransData = FALSE, transDataOnly = FALSE,
writeBootData = FALSE, bootSamplesOnly = FALSE,
writeArgs, seed = NULL, suppressWarn = TRUE,
showProgress = TRUE, ...) {
if(writeTransData) transDataOnly <- TRUE
if(writeBootData) bootSamplesOnly <- TRUE
check.nBoot <- (!is.numeric(nBoot) | nBoot < 1L) & !transDataOnly
if (check.nBoot) stop("The \"nBoot\" argument must be a positive integer.")
## Which transformation?
if (!(transformation %in% 1:2)) stop("User must specify transformation 1 or 2.
Consult Savalei & Yuan (2009) for advice.
Transformation 3 is not currently available.")
if (transformation == 2) SavaleiYuan <- trans2
#if (transformation == 3) SavaleiYuan <- trans3
######################
## Data Preparation ##
######################
## If a lavaan object is supplied, the extracted values for rawData, Sigma, Mu,
## EMcov, and EMmeans will override any user-supplied arguments.
if (hasArg(x)) {
if (!lavInspect(x, "options")$meanstructure)
stop('You must fit the lavaan model with the argument "meanstructure=TRUE".')
nG <- lavInspect(x, "ngroups")
if (nG == 1L) {
rawData <- list(as.data.frame(lavInspect(x, "data")))
} else rawData <- lapply(lavInspect(x, "data"), as.data.frame)
for (g in seq_along(rawData)) {
colnames(rawData[[g]]) <- lavaan::lavNames(x)
checkAllMissing <- apply(rawData[[g]], 1, function(x) all(is.na(x)))
if (any(checkAllMissing)) rawData[[g]] <- rawData[[g]][!checkAllMissing, ]
}
ChiSquared <- lavInspect(x, "fit")[c("chisq", "chisq.scaled")]
ChiSquared <- ifelse(is.na(ChiSquared[2]), ChiSquared[1], ChiSquared[2])
group <- lavInspect(x, "group")
if (length(group) == 0) group <- "group"
group.label <- lavInspect(x, "group.label")
if (length(group.label) == 0) group.label <- 1
Sigma <- lavInspect(x, "cov.ov")
Mu <- lavInspect(x, "mean.ov")
EMcov <- lavInspect(x, "sampstat")$cov
if (nG == 1L) {
Sigma <- list(Sigma)
Mu <- list(Mu)
EMcov <- list(EMcov)
}
} else {
## If no lavaan object is supplied, check that required arguments are.
suppliedData <- c(hasArg(rawData), hasArg(Sigma), hasArg(Mu))
if (!all(suppliedData)) {
stop("Without a lavaan fitted object, user must supply raw data and",
" model-implied moments.")
}
if (!hasArg(model) & !(transDataOnly | bootSamplesOnly)) {
stop("Without model syntax or fitted lavaan object, user can only call",
" this function to save transformed data or bootstrapped samples.")
}
if (!hasArg(ChiSquared) & !(transDataOnly | bootSamplesOnly)) {
stop("Without a fitted lavaan object or ChiSquared argument, user can",
" only call this function to save transformed data, bootstrapped",
" samples, or bootstrapped chi-squared values.")
}
if (!any(c(transDataOnly, bootSamplesOnly))) {
if (!is.numeric(ChiSquared)) stop("The \"ChiSquared\" argument must be numeric.")
}
## If user supplies one-group data & moments, convert to lists.
if ( inherits(rawData, "data.frame")) rawData <- list(rawData)
if (!inherits(rawData, "list")) {
stop("The \"rawData\" argument must be a data.frame or list of data frames.")
} else {
if (!all(sapply(rawData, is.data.frame))) stop("Every element of \"rawData\" must be a data.frame")
}
if (inherits(Sigma, "matrix")) Sigma <- list(Sigma)
if (is.numeric(Mu)) Mu <- list(Mu)
## check whether EMcov was supplied for starting values in Trans2/Trans3
if (!hasArg(EMcov)) {
EMcov <- vector("list", length(Sigma))
} else {
if (inherits(EMcov, "matrix")) EMcov <- list(EMcov)
## check EMcov is symmetric and dimensions match Sigma
for (g in seq_along(EMcov)) {
if (!isSymmetric(EMcov[[g]])) stop("EMcov in group ", g, " not symmetric.")
unequalDim <- !all(dim(EMcov[[g]]) == dim(Sigma[[g]]))
if (unequalDim) stop("Unequal dimensions in Sigma and EMcov.")
}
}
## Check the number of groups by the size of the lists.
unequalGroups <- !all(length(rawData) == c(length(Sigma), length(Mu)))
if (unequalGroups) stop("Unequal number of groups in rawData, Sigma, Mu.
For multiple-group models, rawData must be a list of data frames,
NOT a single data frame with a \"group\" column.")
nG <- length(Sigma)
## In each group, check Sigma is symmetric and dimensions match rawData and Mu.
for (g in seq_along(rawData)) {
if (!isSymmetric(Sigma[[g]])) stop("Sigma in group ", g, " not symmetric.")
unequalDim <- !all(ncol(rawData[[g]]) == c(nrow(Sigma[[g]]), length(Mu[[g]])))
if (unequalDim) stop("Unequal dimensions in rawData, Sigma, Mu.")
}
## Check for names of group levels. If NULL, assign arbitrary ones.
if (!hasArg(group)) group <- "group"
if (!is.character(group)) stop("The \"group\" argument must be a character string.")
if (is.null(names(rawData))) {
group.label <- paste0("g", seq_along(rawData))
} else {
group.label <- names(rawData)
}
}
## save a copy as myTransDat, whose elements will be replaced iteratively by
## group and by missing data pattern within group.
myTransDat <- rawData
names(myTransDat) <- group.label
output <- list()
#########################
## Data Transformation ##
#########################
for (g in seq_along(group.label)) {
if (transformation == 1) {
## get missing data patterns
R <- ifelse(is.na(rawData[[g]]), 1, 0)
rowMissPatt <- apply(R, 1, function(x) paste(x, collapse = ""))
patt <- unique(rowMissPatt)
myRows <- lapply(patt, function(x) which(rowMissPatt == x))
## for each pattern, apply transformation
tStart <- Sys.time()
transDatList <- lapply(patt, trans1, rowMissPatt = rowMissPatt,
dat = rawData[[g]], Sigma = Sigma[[g]], Mu = Mu[[g]])
output$timeTrans <- Sys.time() - tStart
for (i in seq_along(patt)) myTransDat[[g]][myRows[[i]], ] <- transDatList[[i]]
} else {
tStart <- Sys.time()
myTransDat[[g]] <- SavaleiYuan(dat = rawData[[g]], Sigma = Sigma[[g]],
Mu = Mu[[g]], EMcov = EMcov[[g]])
output$timeTrans <- Sys.time() - tStart
}
}
## option to end function here
if (transDataOnly) {
for (g in seq_along(myTransDat)) myTransDat[[g]][ , group] <- group.label[g]
## option to write transformed data to a file
if (writeTransData) {
## Set a few options, if the user didn't.
if (!hasArg(writeArgs)) writeArgs <- list(file = "transformedData.dat",
row.names = FALSE, na = "-999")
if (!exists("file", where = writeArgs)) writeTransArgs$file <- "transformedData.dat"
if (!exists("row.names", where = writeArgs)) writeArgs$row.names <- FALSE
if (!exists("na", where = writeArgs)) writeArgs$na <- "-999"
## add grouping variable and bind together into one data frame
for (g in seq_along(myTransDat)) myTransDat[[g]][ , group] <- group.label[g]
writeArgs$x <- do.call("rbind", myTransDat)
## write to file, print details to screen
do.call("write.table", writeArgs)
cat("Transformed data was written to file \"", writeArgs$file, "\" in:\n\n",
getwd(), "\n\nunless path specified by user in 'file' argument.\n", sep = "")
return(invisible(writeArgs$x))
}
return(do.call("rbind", myTransDat))
}
#############################################
## Bootstrap distribution of fit statistic ##
#############################################
## draw bootstrap samples
if (!is.null(seed)) set.seed(seed)
bootSamples <- lapply(1:nBoot, function(x) getBootSample(myTransDat, group, group.label))
## option to write bootstrapped samples to file(s)
if (writeBootData) {
## Set a few options, if the user didn't.
if (!hasArg(writeArgs)) writeArgs <- list(file = "bootstrappedSamples.dat",
row.names = FALSE, na = "-999")
if (!exists("file", where = writeArgs)) writeTransArgs$file <- "bootstrappedSamples.dat"
if (!exists("row.names", where = writeArgs)) writeArgs$row.names <- FALSE
if (!exists("na", where = writeArgs)) writeArgs$na <- "-999"
## add indicator for bootstrapped sample, bind together into one data frame
for (b in seq_along(bootSamples)) bootSamples[[b]]$bootSample <- b
writeArgs$x <- do.call("rbind", bootSamples)
## write to file, print details to screen
do.call("write.table", writeArgs)
cat("Bootstrapped samples written to file \"", writeArgs$file, "\" in:\n\n",
getwd(), "\n\nunless path specified by user in 'file' argument.\n", sep = "")
return(invisible(bootSamples))
}
## option to end function here
if (bootSamplesOnly) return(bootSamples)
## check for lavaan arguments in (...)
lavaanArgs <- list(...)
lavaanArgs$group <- group
## fit model to bootstrap samples, save distribution of chi-squared test stat
if (hasArg(x)) {
## grab defaults from lavaan object "x"
lavaanArgs$slotParTable <- as.list(parTable(x))
lavaanArgs$slotModel <- x@Model
lavaanArgs$slotOptions <- lavInspect(x, "options")
} else {
lavaanArgs$model <- model
lavaanArgs$missing <- "fiml"
## set defaults that will be necessary for many models to run, that will
## probably not be specified explicitly or included in lavaan syntax
lavaanArgs$meanstructure <- TRUE
if (!exists("auto.var", where = lavaanArgs)) lavaanArgs$auto.var <- TRUE
if (!exists("auto.cov.y", where = lavaanArgs)) lavaanArgs$auto.cov.y <- TRUE
if (!exists("auto.cov.lv.x", where = lavaanArgs)) lavaanArgs$auto.cov.lv.x <- TRUE
}
## run bootstrap fits
if (showProgress) {
mypb <- utils::txtProgressBar(min = 1, max = nBoot, initial = 1, char = "=",
width = 50, style = 3, file = "")
bootFits <- numeric()
tStart <- Sys.time()
for (j in 1:nBoot) {
bootFits[j] <- fitBootSample(bootSamples[[j]], args = lavaanArgs,
suppress = suppressWarn)
utils::setTxtProgressBar(mypb, j)
}
close(mypb)
output$timeFit <- Sys.time() - tStart
} else {
tStart <- Sys.time()
bootFits <- sapply(bootSamples, fitBootSample, args = lavaanArgs,
suppress = suppressWarn)
output$timeFit <- Sys.time() - tStart
}
## stack groups, save transformed data and distribution in output object
for (g in seq_along(myTransDat)) myTransDat[[g]][ , group] <- group.label[g]
output$Transformed.Data <- do.call("rbind", myTransDat)
output$Bootstrapped.Distribution <- bootFits
output$Original.ChiSquared <- ChiSquared
if (hasArg(x)) {
output$Degrees.Freedom <- lavInspect(x, "fit")["df"]
} else {
convSamp <- which(!is.na(bootFits))[1]
lavaanArgs$data <- bootSamples[[convSamp]]
lavaanlavaan <- function(...) { lavaan::lavaan(...) }
output$Degrees.Freedom <- lavInspect(do.call(lavaanlavaan, lavaanArgs), "fit")["df"]
}
## calculate bootstrapped p-value
output$Bootstrapped.p.Value <- mean(bootFits >= ChiSquared, na.rm = TRUE)
## print warning if any models didn't converge
if (any(is.na(bootFits))) {
nonConvMessage <- paste("Model did not converge for the following bootstrapped samples",
paste(which(is.na(bootFits)), collapse = "\t"), sep = ":\n")
warning(nonConvMessage)
}
finalResult <- new("BootMiss", time = list(transform = output$timeTrans, fit = output$timeFit), transData = output$Transformed.Data, bootDist = output$Bootstrapped.Distribution, origChi = output$Original.ChiSquared, df = output$Degrees.Freedom, bootP = output$Bootstrapped.p.Value)
finalResult
}
## ----------------
## Hidden Functions
## ----------------
## Function to execute Transformation 1 on a single missing-data pattern
trans1 <- function(MDpattern, rowMissPatt, dat, Sigma, Mu) {
myRows <- which(rowMissPatt == MDpattern)
X <- apply(dat[myRows, ], 2, scale, scale = FALSE)
observed <- !is.na(X[1, ])
Xreduced <- X[ , observed]
Mreduced <- as.numeric(Mu[observed])
SigmaChol <- chol(Sigma[observed, observed])
S <- t(Xreduced) %*% Xreduced / nrow(X)
Areduced <- t(SigmaChol) %*% t(solve(chol(S)))
Yreduced <- t(Areduced %*% t(Xreduced) + Mreduced)
Y <- replace(X, !is.na(X), Yreduced)
Y
}
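## A minimal illustrative sketch (added for exposition; .demoTrans1() is a
## hypothetical name, not part of the semTools API): on a complete-data
## pattern, Transformation 1 rescales the centered data so that the
## n-denominator sample moments exactly equal the target Sigma and Mu.
.demoTrans1 <- function() {
set.seed(123)
p <- 3
dat <- as.data.frame(matrix(rnorm(100 * p), nrow = 100))
Sigma <- diag(p)
Mu <- rep(1, p)
## every row has the complete-data pattern "000"
rowMissPatt <- rep(paste(rep(0, p), collapse = ""), nrow(dat))
Y <- trans1(rowMissPatt[1], rowMissPatt, dat, Sigma, Mu)
Yc <- scale(Y, scale = FALSE)
list(cov = all.equal(t(Yc) %*% Yc / nrow(Yc), Sigma,
check.attributes = FALSE), # TRUE
means = all.equal(colMeans(Y), Mu, check.attributes = FALSE)) # TRUE
}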
## Function to execute Transformation 2 on a single group
trans2 <- function(dat, Sigma, Mu, EMcov) {
## Computing Function of A (eq. 12), whose root is desired
eq12 <- function(A) {
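## accumulates the pattern-weighted estimating equations g(A): each term is
## N_j * vech(Mj %*% A %*% Hj %*% A %*% Mj - Mj); at the root A, the
## transformed data reproduce Sigma exactly under the saturated ML model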
ga <- rep(0, pStar)
for (j in 1:J) {
Tj <- Mjs[[j]] %*% A %*% Hjs[[j]] %*% A %*% Mjs[[j]] - Mjs[[j]]
ga <- ga + Njs[j] * Dupinv %*% c(Tj) # same as vech(Tj)
}
ga
}
## Computing Derivative of Function of A (eq. 13)
eq13 <- function(A) {
deriv12 <- matrix(0, nrow = pStar, ncol = pStar)
for (j in 1:J) {
Tj1 <- Mjs[[j]] %*% A %*% Hjs[[j]]
deriv12 <- deriv12 + 2*Njs[j]*Dupinv %*% kronecker(Tj1, Mjs[[j]]) %*% Dup
}
deriv12
}
## get missing data patterns
R <- ifelse(is.na(dat), 1, 0)
rowMissPatt <- apply(R, 1, function(x) paste(x, collapse = ""))
MDpattern <- unique(rowMissPatt)
## sample size within each MD pattern
Njs <- sapply(MDpattern, function(patt) sum(rowMissPatt == patt))
J <- length(MDpattern) # number of MD patterns
p <- ncol(dat) # number of variables in model
pStar <- p*(p + 1) / 2 # number of nonredundant covariance elements
## create empty lists for each MD pattern
Xjs <- vector("list", J)
Hjs <- vector("list", J)
Mjs <- vector("list", J)
## Create Duplication Matrix and its inverse (Magnus & Neudecker, 1999)
Dup <- lavaan::lav_matrix_duplication(p)
Dupinv <- solve(t(Dup) %*% Dup) %*% t(Dup)
## step through each MD pattern, populate Hjs and Mjs
for (j in 1:J) {
Xjs[[j]] <- apply(dat[rowMissPatt == MDpattern[j], ], 2, scale, scale = FALSE)
if (!is.matrix(Xjs[[j]])) Xjs[[j]] <- t(Xjs[[j]])
observed <- !is.na(Xjs[[j]][1, ])
Sj <- t(Xjs[[j]]) %*% Xjs[[j]] / Njs[j]
Hjs[[j]] <- replace(Sj, is.na(Sj), 0)
Mjs[[j]] <- replace(Sj, !is.na(Sj), solve(Sigma[observed, observed]))
Mjs[[j]] <- replace(Mjs[[j]], is.na(Mjs[[j]]), 0)
}
## Compute starting Values for A
if (is.null(EMcov)) {
A <- diag(p)
} else {
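## start from A0 = .5*(B + t(B)) with B = Sigma^(1/2) %*% EMcov^(-1/2), the
## matrix square roots built from eigendecompositions, so the Newton
## iterations begin near the EM-based solution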
EMeig <- eigen(EMcov)
EMrti <- EMeig$vectors %*% diag(1 / sqrt(EMeig$values)) %*% t(EMeig$vectors)
Sigeig <- eigen(Sigma)
Sigrt <- Sigeig$vectors %*% diag(sqrt(Sigeig$values)) %*% t(Sigeig$vectors)
B <- Sigrt %*% EMrti
A <- .5*(B + t(B))
}
## Newton Algorithm for finding root (eq. 14)
crit <- .1
a <- c(A)
fA <- eq12(A)
while (crit > 1e-11) {
dvecF <- eq13(A)
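## Newton-Raphson step on vech(A): subtract [d g/d vech(A)]^{-1} g(A), then
## map the update back to vec(A) through the duplication matrix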
a <- a - Dup %*% solve(dvecF) %*% fA
A <- matrix(a, ncol = p)
fA <- eq12(A)
crit <- max(abs(fA))
}
## Transform dataset X to dataset Y
Yjs <- Xjs
for (j in 1:J) {
observed <- !is.na(Xjs[[j]][1, ])
XjReduced <- Xjs[[j]][ , observed, drop = FALSE]
Aj <- A[observed, observed, drop = FALSE]
Mj <- as.numeric(Mu[observed])
Yj <- t(Aj %*% t(XjReduced) + Mj)
Yjs[[j]] <- replace(Yjs[[j]], !is.na(Yjs[[j]]), Yj)
}
Y <- as.data.frame(do.call("rbind", Yjs))
colnames(Y) <- colnames(dat)
Y
}
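## A minimal illustrative sketch (added for exposition; .demoDuplication()
## is a hypothetical name, not part of the semTools API): the duplication
## matrix D_p satisfies D_p %*% vech(S) = vec(S) for symmetric S, and
## Dupinv = (D'D)^{-1} D' maps vec(S) back to vech(S), which is how eq. 12
## extracts the pStar nonredundant elements of c(Tj).
.demoDuplication <- function() {
p <- 3
Dup <- lavaan::lav_matrix_duplication(p)
Dupinv <- solve(t(Dup) %*% Dup) %*% t(Dup)
S <- cov(matrix(rnorm(40 * p), nrow = 40)) # any symmetric p x p matrix
vechS <- lavaan::lav_matrix_vech(S)
list(vec = all.equal(as.numeric(Dup %*% vechS), c(S)), # TRUE
vech = all.equal(as.numeric(Dupinv %*% c(S)), vechS)) # TRUE
}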
## Function to execute Transformation 3 on a single group -- TRANSFORMATION DOES NOT RETURN CHI-SQ = 0
trans3 <- function(dat, Sigma, Mu, EMcov) {
# Computing Saturated Means as a Function of A (eq. B1 in Appendix B)
mut <- function(A) {
M <- matrix(0, ncol = 1, nrow = p)
for (j in 1:J) {
M <- M + Njs[[j]] * Mjs[[j]] %*% A %*% Ybarjs[[j]]
}
Mjtoti %*% M
}
# Computing Function of A (eq. 18) whose root is desired
eq18 <- function(A) {
ga <- rep(0, pStar)
mutilda <- mut(A)
for (j in 1:J) {
Tj <- Mjs[[j]] %*% A %*% Hjs[[j]] %*% A %*% Mjs[[j]] - Mjs[[j]]
dif <- A %*% Ybarjs[[j]] - mutilda
middle <- dif %*% t(dif)
Tjnew <- Tj + Mjs[[j]] %*% middle %*% Mjs[[j]]
ga <- ga + Njs[j] * Dupinv %*% c(Tjnew)
}
ga
}
# Computing Derivative of Function eq. 18
deriv18 <- function(A) {
d18 <- matrix(0, nrow = pStar, ncol = pStar)
for (j in 1:J) {
Tj1 <- Mjs[[j]] %*% A %*% Hjs[[j]]
mutilda <- mut(A)
dif <- A %*% Ybarjs[[j]] - mutilda
Tj2 <- Mjs[[j]] %*% dif %*% t(Ybarjs[[j]])
Tj3 <- kronecker(Mjs[[j]] %*% dif, Mjs[[j]]) %*% Mjtoti %*% Tj3add
d18 <- d18 + 2*Njs[j]*Dupinv %*% ((kronecker((Tj1 + Tj2), Mjs[[j]])) - Tj3) %*% Dup
}
d18
}
## get missing data patterns
R <- ifelse(is.na(dat), 1, 0)
rowMissPatt <- apply(R, 1, function(x) paste(x, collapse = ""))
MDpattern <- unique(rowMissPatt)
## sample size within each MD pattern
Njs <- sapply(MDpattern, function(patt) sum(rowMissPatt == patt))
J <- length(MDpattern) # number of MD patterns
p <- ncol(dat) # number of variables in model
pStar <- p*(p + 1) / 2 # number of nonredundant covariance elements
## create empty lists for each MD pattern
Xjs <- vector("list", J)
Ybarjs <- vector("list", J)
Hjs <- vector("list", J)
Mjs <- vector("list", J)
Mjtot <- matrix(0, ncol = p, nrow = p)
Tj3add <- matrix(0, nrow = p, ncol = p * p)
## Create Duplication Matrix and its inverse (Magnus & Neudecker, 1999)
Dup <- lavaan::lav_matrix_duplication(p)
Dupinv <- solve(t(Dup) %*% Dup) %*% t(Dup)
## step through each MD pattern, populate Hjs and Mjs
for (j in 1:J) {
Xjs[[j]] <- apply(dat[rowMissPatt == MDpattern[j], ], 2, scale, scale = FALSE)
if (!is.matrix(Xjs[[j]])) Xjs[[j]] <- t(Xjs[[j]])
observed <- !is.na(Xjs[[j]][1, ])
pj <- p - sum(observed)
means <- colMeans(dat[rowMissPatt == MDpattern[j], ])
Ybarjs[[j]] <- replace(means, is.na(means), 0)
Sj <- t(Xjs[[j]]) %*% Xjs[[j]] / Njs[j]
Hjs[[j]] <- replace(Sj, is.na(Sj), 0)
Mjs[[j]] <- replace(Sj, !is.na(Sj), solve(Sigma[observed, observed]))
Mjs[[j]] <- replace(Mjs[[j]], is.na(Mjs[[j]]), 0)
Mjtot <- Mjtot + Njs[[j]] * Mjs[[j]]
Tj3add <- Tj3add + Njs[[j]] * kronecker(t(Ybarjs[[j]]), Mjs[[j]])
}
Mjtoti <- solve(Mjtot)
## Compute starting Values for A
if (is.null(EMcov)) {
A <- diag(p)
} else {
EMeig <- eigen(EMcov)
EMrti <- EMeig$vectors %*% diag(1 / sqrt(EMeig$values)) %*% t(EMeig$vectors)
Sigeig <- eigen(Sigma)
Sigrt <- Sigeig$vectors %*% diag(sqrt(Sigeig$values)) %*% t(Sigeig$vectors)
B <- Sigrt %*% EMrti
A <- .5*(B + t(B))
}
## Newton Algorithm for finding root (eq. 14)
crit <- .1
a <- c(A)
fA <- eq18(A)
while (crit > 1e-11) {
dvecF <- deriv18(A)
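## same Newton-Raphson update as in trans2, now using eq. 18 and its derivative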
a <- a - Dup %*% solve(dvecF) %*% fA
A <- matrix(a, ncol = p)
fA <- eq18(A)
crit <- max(abs(fA))
}
## Transform dataset X to dataset Y (Z in the paper, eqs. 15-16)
Yjs <- Xjs
for (j in 1:J) {
observed <- !is.na(Xjs[[j]][1, ])
XjReduced <- Xjs[[j]][ , observed, drop = FALSE]
Aj <- A[observed, observed, drop = FALSE]
Mj <- as.numeric((Mu - mut(A))[observed])
Yj <- t(Aj %*% t(XjReduced) + Mj)
Yjs[[j]] <- replace(Yjs[[j]], !is.na(Yjs[[j]]), Yj)
}
Y <- as.data.frame(do.call("rbind", Yjs))
colnames(Y) <- colnames(dat)
Y
}
## Get a single bootstrapped sample from the transformed data. If there are
## multiple groups, bootstrapping occurs independently within each group, and
## a single data frame is returned. A new column is added to indicate group
## membership, which will be ignored in a single-group analysis.
getBootSample <- function(groupDat, group, group.label) {
bootSamp <- list()
for (g in seq_along(groupDat)) {
dat <- groupDat[[g]]
dat[ , group] <- group.label[g]
bootSamp[[g]] <- dat[sample(1:nrow(dat), nrow(dat), replace = TRUE), ]
}
do.call("rbind", bootSamp)
}
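## A minimal illustrative sketch (added for exposition; .demoGetBootSample()
## and the toy data are hypothetical): resampling happens independently
## within each group, and the grouping column is appended before stacking.
.demoGetBootSample <- function() {
groupDat <- list(data.frame(y = rnorm(5)), data.frame(y = rnorm(7)))
boot1 <- getBootSample(groupDat, group = "school",
group.label = c("g1", "g2"))
table(boot1$school) # 5 rows labeled "g1", 7 labeled "g2"
}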
## fit the model to a single bootstrapped sample and return chi-squared
##' @importFrom lavaan lavInspect
fitBootSample <- function(dat, args, suppress) {
args$data <- dat
lavaanlavaan <- function(...) { lavaan::lavaan(...) }
if (suppress) {
fit <- suppressWarnings(do.call(lavaanlavaan, args))
} else {
fit <- do.call(lavaanlavaan, args)
}
if (!exists("fit")) return(c(chisq = NA))
if (lavInspect(fit, "converged")) {
chisq <- lavInspect(fit, "fit")[c("chisq", "chisq.scaled")]
} else {
chisq <- NA
}
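## prefer the scaled statistic when a robust test was requested; chisq[2] is
## NA otherwise (and chisq[1] is NA when the model did not converge)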
if (is.na(chisq[2])) return(chisq[1]) else return(chisq[2])
}
semTools/R/runMI-score.R
### Terrence D. Jorgensen & Yves Rosseel
### Last updated: 12 March 2025
### Pooled score test (= Lagrange Multiplier test) for multiple imputations
### Borrowed source code from lavaan/R/lav_test_score.R
### DEPRECATED: 16 June 2024
### supplanted by lavaan.mi package
## this function can run two modes:
## MODE 1: 'add'
## add new parameters that are currently not included in the model
## (aka fixed to zero), but should be released
## MODE 2: 'release' (the default)
## release existing "==" constraints
##' Score Test for Multiple Imputations
##'
##' Score test (or "Lagrange multiplier" test) for lavaan models fitted to
##' multiple imputed data sets. Statistics for releasing one or more
##' fixed or constrained parameters in a model can be calculated by pooling
##' the gradient and information matrices across imputed data sets using a
##' method proposed by Mansolf, Jorgensen, & Enders (2020)---analogous to
##' the "D1" Wald test proposed by Li, Meng, Raghunathan, & Rubin (1991)---or
##' by pooling the complete-data score-test statistics across imputed data sets
##' (i.e., "D2"; Li et al., 1991).
##'
## @aliases lavTestScore.mi
##' @importFrom lavaan lavListInspect parTable
##' @importFrom stats cov pchisq pf
##' @importFrom methods getMethod
##'
##' @param object An object of class [OLDlavaan.mi-class].
##' @param add Either a `character` string (typically between single
##' quotes) or a parameter table containing additional (currently
##' fixed-to-zero) parameters for which the score test must be computed.
##' @param release Vector of `integer`s. The indices of the *equality*
##' constraints that should be released. The indices correspond to the order of
##' the equality constraints as they appear in the parameter table.
##' @param test `character` indicating which pooling method to use.
##' `"D1"` requests Mansolf, Jorgensen, & Enders' (2020) proposed
##' Wald-like test for pooling the gradient and information, which are then
##' used to calculate score-test statistics in the usual manner. `"D2"`
##' (default because it is less computationally intensive) requests computing
##' the complete-data score-test statistic within each imputed data set, then
##' pooling those statistics across imputations, as described by Li et al.
##' (1991) and Enders (2010).
##' @param scale.W `logical`. If `FALSE`, the pooled
##' information matrix is calculated as the weighted sum of the
##' within-imputation and between-imputation components. Otherwise, the pooled
##' information is calculated by scaling the within-imputation component by
##' the average relative increase in variance (ARIV; Enders, 2010, p. 235),
##' which is *only* consistent when requesting the *F* test (i.e.,
##' `asymptotic = FALSE`). Ignored (irrelevant) if `test = "D2"`.
##' @param omit.imps `character` vector specifying criteria for omitting
##' imputations from pooled results. Can include any of
##' `c("no.conv", "no.se", "no.npd")`, the first 2 of which are the
##' default setting, which excludes any imputations that did not
##' converge or for which standard errors could not be computed. The
##' last option (`"no.npd"`) would exclude any imputations which
##' yielded a nonpositive definite covariance matrix for observed or
##' latent variables, which would include any "improper solutions" such
##' as Heywood cases. Specific imputation numbers can also be included in this
##' argument, in case users want to apply their own custom omission criteria
##' (or simulations can use different numbers of imputations without
##' redundantly refitting the model).
##' @param asymptotic `logical`. If `FALSE` (default when using
##' `add` to test adding fixed parameters to the model), the pooled test
##' will be returned as an *F*-distributed variable with numerator
##' (`df1`) and denominator (`df2`) degrees of freedom.
##' If `TRUE`, the pooled *F* statistic will be multiplied by its
##' `df1` on the assumption that its `df2` is sufficiently large
##' that the statistic will be asymptotically \eqn{\chi^2} distributed
##' with `df1`. When using the `release` argument, `asymptotic`
##' will be set to `TRUE` because (A)RIV can only be calculated for
##' `add`ed parameters.
##' @param univariate `logical`. If `TRUE`, compute the univariate
##' score statistics, one for each constraint.
##' @param cumulative `logical`. If `TRUE`, order the univariate score
##' statistics from large to small, and compute a series of multivariate
##' score statistics, each time including an additional constraint in the test.
##' @param epc `logical`. If `TRUE`, and we are releasing existing
##' constraints, compute the expected parameter changes for the existing
##' (free) parameters (and any specified with `add`), if all constraints
##' were released. For EPCs associated with a particular (1-*df*)
##' constraint, only specify one parameter in `add` or one constraint in
##' `release`.
##' @param standardized If `TRUE`, two extra columns (`sepc.lv` and
##' `sepc.all`) in the `$epc` table will contain standardized values
##' for the EPCs. See [lavaan::lavTestScore()].
##' @param cov.std `logical`. See [lavaan::standardizedSolution()].
##' @param verbose `logical`. Not used for now.
##' @param warn `logical`. If `TRUE`, print warnings if they occur.
##' @param information `character` indicating the type of information
##' matrix to use (check [lavaan::lavInspect()] for available options).
##' `"expected"` information is the default, which provides better
##' control of Type I errors.
##'
##' @return
##' A list containing at least one `data.frame`:
##' \itemize{
##' \item{`$test`: The total score test, with columns for the score
##' test statistic (`X2`), its degrees of freedom (`df`), its
##' *p* value under the \eqn{\chi^2} distribution (`p.value`),
##' and, if `asymptotic=FALSE`, the average relative increase in
##' variance (ARIV) used to calculate the denominator *df* is also
##' returned as a missing-data diagnostic, along with the fraction missing
##' information (FMI = ARIV / (1 + ARIV)).}
##' \item{`$uni`: Optional (if `univariate=TRUE`).
##' Each 1-*df* score test, equivalent to modification indices. Also
##' includes EPCs if `epc=TRUE`, and RIV and FMI if
##' `asymptotic=FALSE`.}
##' \item{`$cumulative`: Optional (if `cumulative=TRUE`).
##' Cumulative score tests, with ARIV and FMI if `asymptotic=FALSE`.}
##' \item{`$epc`: Optional (if `epc=TRUE`). Parameter estimates,
##' expected parameter changes, and expected parameter values if ALL
##' the tested constraints were freed.}
##' }
##' See [lavaan::lavTestScore()] for details.
##'
##' @author
##' Terrence D. Jorgensen (University of Amsterdam; \email{TJorgensen314@@gmail.com})
##'
##' Adapted from \pkg{lavaan} source code, written by
##' Yves Rosseel (Ghent University; \email{Yves.Rosseel@@UGent.be})
##'
##' `test = "D1"` method proposed by
##' Maxwell Mansolf (University of California, Los Angeles;
##' \email{mamansolf@@gmail.com})
##'
##' @references
##' Bentler, P. M., & Chou, C.-P. (1992). Some new covariance structure model
##' improvement statistics. *Sociological Methods & Research, 21*(2),
##' 259--282. \doi{10.1177/0049124192021002006}
##'
##' Enders, C. K. (2010). *Applied missing data analysis*.
##' New York, NY: Guilford.
##'
##' Li, K.-H., Meng, X.-L., Raghunathan, T. E., & Rubin, D. B. (1991).
##' Significance levels from repeated *p*-values with multiply-imputed
##' data. *Statistica Sinica, 1*(1), 65--92.
##'
##' Mansolf, M., Jorgensen, T. D., & Enders, C. K. (2020). A multiple
##' imputation score test for model modification in structural equation
##' models. *Psychological Methods, 25*(4), 393--411.
##' \doi{10.1037/met0000243}
##'
##' @seealso [lavaan::lavTestScore()]
##'
##' @examples
##'
##' ## See the new lavaan.mi package
##'
##' @name lavTestScore.mi-deprecated
##' @usage
##' lavTestScore.mi(object, add = NULL, release = NULL,
##' test = c("D2","D1"), scale.W = !asymptotic,
##' omit.imps = c("no.conv","no.se"),
##' asymptotic = is.null(add),
##' univariate = TRUE, cumulative = FALSE,
##' epc = FALSE, standardized = epc, cov.std = epc,
##' verbose = FALSE, warn = TRUE, information = "expected")
##' @seealso [semTools-deprecated()]
##' @keywords internal
NULL
##' @rdname semTools-deprecated
##' @export
lavTestScore.mi <- function(object, add = NULL, release = NULL,
test = c("D2","D1"), scale.W = !asymptotic,
omit.imps = c("no.conv","no.se"),
asymptotic = is.null(add), # as F or chi-squared
univariate = TRUE, cumulative = FALSE,
epc = FALSE, standardized = epc, cov.std = epc,
verbose = FALSE, warn = TRUE, information = "expected") {
.Deprecated(msg = c("\nThe runMI() and related lavaan.mi functions have been",
" deprecated and will cease to be included in future ",
"versions of semTools.\n\nSupport is still provided for ",
"analyzing lavaan.mi-class objects (e.g., compRelSEM() ",
"can estimate reliability using pooled results), which ",
"can now be created using the lavaan.mi package.\n\nThe ",
"deprecated runMI() function now creates an object of ",
"class OLDlavaan.mi, which can be analyzed using the ",
"deprecated functions in semTools, like lavTestLRT.mi(),",
" that have been updated and improved in the lavaan.mi ",
"package.\n\nFind more details help('semTools-deprecated)"))
stopifnot(inherits(object, "OLDlavaan.mi"))
lavoptions <- object@Options
useImps <- rep(TRUE, length(object@DataList))
if ("no.conv" %in% omit.imps) useImps <- sapply(object@convergence, "[[", i = "converged")
if ("no.se" %in% omit.imps) useImps <- useImps & sapply(object@convergence, "[[", i = "SE")
if ("no.npd" %in% omit.imps) {
Heywood.lv <- sapply(object@convergence, "[[", i = "Heywood.lv")
Heywood.ov <- sapply(object@convergence, "[[", i = "Heywood.ov")
useImps <- useImps & !(Heywood.lv | Heywood.ov)
}
## custom removal by imputation number
rm.imps <- omit.imps[ which(omit.imps %in% 1:length(useImps)) ]
if (length(rm.imps)) useImps[as.numeric(rm.imps)] <- FALSE
## whatever is left
m <- sum(useImps)
if (m == 0L) stop('No imputations meet "omit.imps" criteria.')
useImps <- which(useImps)
test <- toupper(test[1])
if (!test %in% c("D2","D1")) stop('Invalid choice of "test" argument.')
## check if model has converged
if (m == 0L) stop("No models converged. Score tests unavailable.")
# check for inequality constraints
PT <- parTable(object)
if (any(PT$op == ">" | PT$op == "<")) {
stop("lavTestScore.mi() does not handle inequality constraints (yet)")
}
# check arguments
if (cumulative) univariate <- TRUE
if (sum(is.null(release), is.null(add)) == 0) {
stop("`add' and `release' arguments cannot be used together.\n",
"Fixed parameters can instead be labeled in the model syntax ",
"and those labels can be constrained to fixed values, so that ",
"the constraints can be tested using the `release' argument along ",
"with other released constraints.")
}
oldCall <- object@lavListCall
#oldCall$model <- parTable(object) # FIXME: necessary?
if (test == "D2") {
if (!is.null(oldCall$parallel)) {
if (oldCall$parallel == "snow") {
oldCall$parallel <- "no"
oldCall$ncpus <- 1L
if (warn) warning("Unable to pass lavaan::lavTestScore() arguments ",
"when parallel='snow'. Switching to parallel='no'.",
" Unless using Windows, parallel='multicore' works.")
}
}
## call lavaanList() again to run lavTestScore() on each imputation
oldCall$FUN <- function(obj) {
out <- try(lavaan::lavTestScore(obj, add = add, release = release,
univariate = univariate, epc = epc,
cumulative = cumulative, cov.std = cov.std,
standardized = standardized,
warn = FALSE, information = information),
silent = TRUE)
if (inherits(out, "try-error")) return(NULL)
out
}
FIT <- eval(as.call(oldCall))
## check if there are any results
noScores <- sapply(FIT@funList, is.null)
if (all(noScores)) stop("No success using lavTestScore() on any imputations.")
## template to fill in pooled values
OUT <- FIT@funList[[ intersect(useImps, which(!noScores))[1] ]]
## at a minimum, pool the total score test
chiList <- sapply(FIT@funList[intersect(useImps, which(!noScores))],
function(x) x$test$X2)
chiPooled <- calculate.D2(chiList, DF = OUT$test$df, asymptotic)
OUT$test$X2 <- chiPooled[1]
if (!asymptotic) {
names(OUT$test)[names(OUT$test) == "X2"] <- "F"
names(OUT$test)[names(OUT$test) == "df"] <- "df1"
OUT$test$df2 <- chiPooled[["df2"]]
OUT$test$p.value <- NULL # so it appears after "df2" column
}
OUT$test$p.value <- chiPooled[["pvalue"]]
OUT$test$ariv <- chiPooled[["ariv"]]
OUT$test$fmi <- chiPooled[["fmi"]]
## univariate?
if (univariate) {
if (!asymptotic) {
names(OUT$uni)[names(OUT$uni) == "X2"] <- "F"
OUT$uni$p.value <- NULL # so it appears after "df2" column
OUT$uni$df2 <- NA
OUT$uni$p.value <- NA
OUT$uni$riv <- NA
OUT$uni$fmi <- NA
if ("epc" %in% colnames(OUT$uni)) {
OUT$uni$epc <- NULL # so it appears after "p.value"
OUT$uni$epc <- NA # fill in below
}
}
for (i in 1:nrow(OUT$uni)) {
chiList <- sapply(FIT@funList[intersect(useImps, which(!noScores))],
function(x) x$uni$X2[i] )
chiPooled <- calculate.D2(chiList, DF = OUT$uni$df[i], asymptotic)
if (!asymptotic) {
OUT$uni$F[i] <- chiPooled[[1]]
OUT$uni$df2[i] <- chiPooled[["df2"]]
} else OUT$uni$X2[i] <- chiPooled[[1]]
OUT$uni$p.value[i] <- chiPooled[["pvalue"]]
OUT$uni$riv[i] <- chiPooled[["ariv"]]
OUT$uni$fmi[i] <- chiPooled[["fmi"]]
if ("epc" %in% colnames(OUT$uni)) {
epcUniList <- sapply(FIT@funList[intersect(useImps, which(!noScores))],
function(x) x$uni$epc[i] )
OUT$uni$epc[i] <- mean(epcUniList)
}
}
if (!asymptotic) names(OUT$uni)[names(OUT$uni) == "df"] <- "df1"
}
## cumulative?
if (cumulative) {
if (!asymptotic) {
names(OUT$cumulative)[names(OUT$cumulative) == "X2"] <- "F"
OUT$cumulative$p.value <- NULL # so it appears after "df2" column
OUT$cumulative$df2 <- NA
OUT$cumulative$p.value <- NA
OUT$cumulative$ariv <- NA
OUT$cumulative$fmi <- NA
}
for (i in 1:nrow(OUT$cumulative)) {
chiList <- sapply(FIT@funList[intersect(useImps, which(!noScores))],
function(x) x$cumulative$X2[i] )
chiPooled <- calculate.D2(chiList, DF = OUT$cumulative$df[i], asymptotic)
if (!asymptotic) {
OUT$cumulative$F[i] <- chiPooled[[1]]
OUT$cumulative$df2[i] <- chiPooled[["df2"]]
} else OUT$cumulative$X2[i] <- chiPooled[[1]]
OUT$cumulative$p.value[i] <- chiPooled[["pvalue"]]
OUT$cumulative$ariv[i] <- chiPooled[["ariv"]]
OUT$cumulative$fmi[i] <- chiPooled[["fmi"]]
}
if (!asymptotic) names(OUT$cumulative)[names(OUT$cumulative) == "df"] <- "df1"
}
## EPCs?
if (epc) {
estList <- lapply(FIT@funList[intersect(useImps, which(!noScores))],
function(x) x$epc$est)
OUT$epc$est <- rowMeans(do.call(cbind, estList))
epcList <- lapply(FIT@funList[intersect(useImps, which(!noScores))],
function(x) x$epc$epc)
OUT$epc$epc <- rowMeans(do.call(cbind, epcList))
OUT$epc$epv <- OUT$epc$est + OUT$epc$epc
if (standardized) {
sepcList <- lapply(FIT@funList[intersect(useImps, which(!noScores))],
function(x) x$epc$sepc.lv)
OUT$epc$sepc.lv <- rowMeans(do.call(cbind, sepcList))
sepcList <- lapply(FIT@funList[intersect(useImps, which(!noScores))],
function(x) x$epc$sepc.all)
OUT$epc$sepc.all <- rowMeans(do.call(cbind, sepcList))
if ("sepc.nox" %in% colnames(FIT@funList[intersect(useImps, which(!noScores))][[1]])) {
sepcList <- lapply(FIT@funList[intersect(useImps, which(!noScores))],
function(x) x$epc$sepc.nox)
OUT$epc$sepc.nox <- rowMeans(do.call(cbind, sepcList))
}
}
}
return(OUT)
} # else test == "D1", making 'scale.W=' relevant
## number of free parameters (regardless of whether they are constrained)
npar <- object@Model@nx.free
## sample size
N <- lavListInspect(object, "ntotal")
if (lavoptions$mimic == "EQS") N <- N - 1
# Mode 1: ADDING new parameters
if (!is.null(add) && nchar(add) > 0L) {
## turn off SNOW cluster (can't pass arguments)
if (!is.null(oldCall$parallel)) {
if (oldCall$parallel == "snow") {
oldCall$parallel <- "no"
oldCall$ncpus <- 1L
if (warn) warning("Unable to pass lavaan::lavTestScore() arguments ",
"when parallel='snow'. Switching to parallel='no'.",
" Unless using Windows, parallel='multicore' works.")
}
}
## call lavaanList() to fit augmented model (do.fit = FALSE)
oldCall$FUN <- function(obj) {
ngroups <- lavaan::lavInspect(obj, "ngroups")
nlevels <- lavaan::lavInspect(obj, "nlevels")
## --------------------------------------
## borrowed code from lav_object_extend()
## --------------------------------------
## select columns that should always be included below
myCols <- c("lhs","op","rhs")
if (ngroups > 1L) myCols <- c(myCols,"block","group")
if (nlevels > 1L) myCols <- c(myCols,"block","level")
myCols <- unique(myCols)
# partable original model
oldPT <- lavaan::parTable(obj)[c(myCols, "free","label","plabel")]
# replace 'start' column, since lav_model will fill these in in GLIST
oldPT$start <- lavaan::parameterEstimates(obj, remove.system.eq = FALSE,
remove.def = FALSE,
remove.eq = FALSE,
remove.ineq = FALSE)$est
# add new parameters, extend model
# ADD <- lavaan::modindices(obj, standardized = FALSE)[myCols]
if (is.list(add)) {
stopifnot(!is.null(add$lhs),
!is.null(add$op),
!is.null(add$rhs))
ADD <- as.data.frame(add)
} else if (is.character(add)) {
ADD <- lavaan::lavaanify(add, ngroups = ngroups)
ADD <- ADD[ , c(myCols, "user","label")]
remove.idx <- which(ADD$user == 0)
if (length(remove.idx) > 0L) {
ADD <- ADD[-remove.idx,]
}
ADD$start <- rep( 0, nrow(ADD))
ADD$free <- rep( 1, nrow(ADD))
ADD$user <- rep(10, nrow(ADD))
} else stop("'add' must be lavaan model syntax or a parameter table.")
# nR <- try(nrow(ADD), silent = TRUE)
# if (inherits(nR, "try-error") || is.null(nR)) return(list(gradient = NULL,
# information = NULL))
# ADD$free <- rep(1L, nR)
# ADD$user <- rep(10L, nR)
# merge
LIST <- lavaan::lav_partable_merge(oldPT, ADD, remove.duplicated = TRUE, warn = FALSE)
# redo 'free'
free.idx <- which(LIST$free > 0)
LIST$free[free.idx] <- 1:length(free.idx)
# adapt options
lavoptions <- obj@Options
if (any(LIST$op == "~1")) lavoptions$meanstructure <- TRUE
lavoptions$do.fit <- FALSE
obj2 <- lavaan::lavaan(LIST,
slotOptions = lavoptions,
slotSampleStats = obj@SampleStats,
slotData = obj@Data,
slotCache = obj@Cache,
sloth1 = obj@h1)
## ---------------------------------
list(gradient = lavaan::lavInspect(obj2, "gradient"),
information = lavaan::lavInspect(obj2, paste("information",
information, sep = ".")),
#TODO: Max wants to calculate EPCs as averages across imputations.
# Run lavTestScore(epc=TRUE) here? or be consistent...
nadd = nrow(ADD), parTable = lavaan::parTable(obj2))
}
FIT <- eval(as.call(oldCall))
## obtain list of inverted Jacobians: within-imputation covariance matrices
R.model <- object@Model@con.jac[,,drop = FALSE]
nadd <- FIT@funList[[ useImps[1] ]]$nadd
## pool gradients and information matrices
gradList <- lapply(FIT@funList[useImps], "[[", i = "gradient")
infoList <- lapply(FIT@funList[useImps], "[[", i = "information")
score <- colMeans(do.call(rbind, gradList)) # pooled point estimates
B <- cov(do.call(rbind, gradList) * sqrt(N)) # between-imputation UNIT information
W <- Reduce("+", infoList) / m # within-imputation UNIT information
inv.W <- try(solve(W), silent = TRUE)
if (inherits(inv.W, "try-error")) {
if (warn && scale.W) warning("Could not invert W for total score test, ",
"perhaps due to constraints on estimated ",
"parameters. Generalized inverse used instead.\n",
"If the model does not have equality constraints, ",
"it may be safer to set `scale.W = FALSE'.")
inv.W <- MASS::ginv(W)
}
## relative increase in variance due to missing data
ariv <- (1 + 1/m)/nadd * sum(diag(B %*% inv.W)) # ONLY relevant for scaling full INFO matrix
if (scale.W) {
information <- (1 + ariv) * W # Enders (2010, p. 235) eqs. 8.20-21
} else {
## less reliable, but constraints prevent inversion of W
information <- W + B + (1/m)*B # Enders (2010, p. 235) eq. 8.19
}
if (nrow(R.model) > 0L) {
R.model <- cbind(R.model, matrix(0, nrow(R.model), ncol = nadd))
R.add <- cbind(matrix(0, nrow = nadd, ncol = npar), diag(nadd))
R <- rbind(R.model, R.add)
Z <- cbind(rbind(information, R.model),
rbind(t(R.model),matrix(0,nrow(R.model),nrow(R.model))))
Z.plus <- MASS::ginv(Z)
J.inv <- Z.plus[ 1:nrow(information), 1:nrow(information) ]
r.idx <- seq_len(nadd) + nrow(R.model)
} else {
R <- cbind(matrix(0, nrow = nadd, ncol = npar), diag(nadd))
J.inv <- MASS::ginv(information)
r.idx <- seq_len(nadd)
}
PT <- FIT@funList[[ useImps[1] ]]$parTable
if (is.null(PT$group)) PT$group <- PT$block
# lhs/rhs
lhs <- lavaan::lav_partable_labels(PT)[ PT$user == 10L ]
op <- rep("==", nadd)
rhs <- rep("0", nadd)
Table <- data.frame(lhs = lhs, op = op, rhs = rhs)
class(Table) <- c("lavaan.data.frame", "data.frame")
} else {
# MODE 2: releasing constraints
if (!asymptotic) {
message('The average relative increase in variance (ARIV) cannot be ',
'calculated for releasing estimated constraints, preventing the ',
'denominator degrees of freedom from being calculated for the F ',
'test, so the "asymptotic" argument was switched to TRUE.' )
asymptotic <- TRUE
scale.W <- FALSE
}
if (is.character(release)) stop("not implemented yet") #FIXME: moved up to save time
R <- object@Model@con.jac[,,drop = FALSE]
if (nrow(R) == 0L) stop("No equality constraints found in the model.")
## use lavaanList() to get gradient/information from each imputation
oldCall$FUN <- function(obj) {
list(gradient = lavaan::lavInspect(obj, "gradient"),
information = lavaan::lavInspect(obj, paste("information",
information, sep = ".")))
}
FIT <- eval(as.call(oldCall))
## pool gradients and information matrices
gradList <- lapply(FIT@funList[useImps], "[[", i = "gradient")
infoList <- lapply(FIT@funList[useImps], "[[", i = "information")
score <- colMeans(do.call(rbind, gradList)) # pooled point estimates
B <- cov(do.call(rbind, gradList) * sqrt(N)) # between-imputation UNIT information
W <- Reduce("+", infoList) / m # within-imputation UNIT information
inv.W <- try(solve(W), silent = TRUE)
if (inherits(inv.W, "try-error")) {
if (warn && scale.W) warning("Could not invert W for total score test, ",
"perhaps due to constraints on estimated ",
"parameters. Generalized inverse used instead.\n",
"If the model does not have equality constraints, ",
"it may be safer to set `scale.W = FALSE'.")
inv.W <- MASS::ginv(W)
}
## relative increase in variance due to missing data
k <- length(release)
if (k == 0) k <- nrow(R)
ariv <- (1 + 1/m)/k * sum(diag(B %*% inv.W)) #FIXME: Invalid extrapolation!
if (scale.W) {
#TODO: This option is disabled, kept only to update with future research
information <- (1 + ariv) * W # Enders (2010, p. 235) eqs. 8.20-21
} else {
## less reliable, but constraints prevent inversion of W
information <- W + B + (1/m)*B # Enders (2010, p. 235) eq. 8.19
}
if (is.null(release)) {
# ALL constraints
r.idx <- seq_len( nrow(R) )
J.inv <- MASS::ginv(information) #FIXME? Yves has this above if(is.null(release))
} else if (is.numeric(release)) {
r.idx <- release
if(max(r.idx) > nrow(R)) {
stop("lavaan ERROR: maximum constraint number (", max(r.idx),
") is larger than number of constraints (", nrow(R), ")")
}
# neutralize the non-needed constraints
R1 <- R[-r.idx, , drop = FALSE]
Z1 <- cbind( rbind(information, R1),
rbind(t(R1), matrix(0, nrow(R1), nrow(R1))) )
Z1.plus <- MASS::ginv(Z1)
J.inv <- Z1.plus[ 1:nrow(information), 1:nrow(information) ]
} else if (is.character(release)) {
stop("not implemented yet")
}
# lhs/rhs
eq.idx <- which(object@ParTable$op == "==")
if (length(eq.idx) > 0L) {
lhs <- object@ParTable$lhs[eq.idx][r.idx]
op <- rep("==", length(r.idx))
rhs <- object@ParTable$rhs[eq.idx][r.idx]
}
Table <- data.frame(lhs = lhs, op = op, rhs = rhs)
class(Table) <- c("lavaan.data.frame", "data.frame")
}
if (lavoptions$se == "standard") {
stat <- as.numeric(N * score %*% J.inv %*% score)
} else {
# generalized score test
if (warn) warning("se is not `standard'. Robust test not implemented yet. ",
"Falling back to ordinary score test.")
# NOTE!!!
# we can NOT use VCOV here, because it reflects the constraints,
# and the whole point is to test for these constraints...
stat <- as.numeric(N * score %*% J.inv %*% score)
}
# compute df, taking into account that some of the constraints may
# be needed to identify the model (and hence information is singular)
# information.plus <- information + crossprod(R)
#df <- qr(R[r.idx,,drop = FALSE])$rank +
# ( qr(information)$rank - qr(information.plus)$rank )
DF <- nrow( R[r.idx, , drop = FALSE] )
if (asymptotic) {
TEST <- data.frame(test = "score", X2 = stat, df = DF,
p.value = pchisq(stat, df = DF, lower.tail = FALSE))
} else {
## calculate denominator DF for F statistic
myDims <- 1:nadd + npar #TODO: not available in release mode, unless calculating Jacobian of constraints, like Wald test?
ARIV <- (1 + 1/m)/nadd * sum(diag(B[myDims, myDims, drop = FALSE] %*% inv.W[myDims, myDims, drop = FALSE]))
a <- DF*(m - 1)
if (a > 4) {
df2 <- 4 + (a - 4) * (1 + (1 - 2/a)*(1 / ARIV))^2 # Enders (eq. 8.24)
} else {
df2 <- a*(1 + 1/DF) * (1 + 1/ARIV)^2 / 2 # Enders (eq. 8.25)
}
TEST <- data.frame(test = "score", "F" = stat / DF, df1 = DF, df2 = df2,
p.value = pf(stat / DF, df1 = DF, df2 = df2, lower.tail = FALSE),
ariv = ARIV, fmi = ARIV / (1 + ARIV))
}
class(TEST) <- c("lavaan.data.frame", "data.frame")
attr(TEST, "header") <- "total score test:"
OUT <- list(test = TEST)
if (univariate) {
TS <- numeric( nrow(R) )
EPC.uni <- numeric( nrow(R) ) #FIXME: to add univariate EPCs for added parameters
for (r in r.idx) {
R1 <- R[-r, , drop = FALSE]
Z1 <- cbind( rbind(information, R1),
rbind(t(R1), matrix(0, nrow(R1), nrow(R1))) )
Z1.plus <- MASS::ginv(Z1)
Z1.plus1 <- Z1.plus[ 1:nrow(information), 1:nrow(information) ]
TS[r] <- as.numeric(N * t(score) %*% Z1.plus1 %*% score)
## FIXME: experimentally add univariate EPCs for added parameters, as would accompany modification indices
if (epc && !is.null(add)) EPC.uni[r] <- 1 * utils::tail(as.numeric(score %*% Z1.plus1), n = nrow(R))[r]
}
Table2 <- Table
DF <- rep(1, length(r.idx))
if (asymptotic) {
Table2$X2 <- TS[r.idx]
Table2$df <- DF
Table2$p.value <- pchisq(Table2$X2, df = DF, lower.tail = FALSE)
} else {
Table2$F <- TS[r.idx] / DF
Table2$df1 <- DF
## calculate denominator DF for F statistic using RIV per 1-df test (Enders eq. 8.10)
myDims <- 1:nadd + npar
RIVs <- diag((1 + 1/m) * B[myDims, myDims, drop = FALSE]) / diag(W[myDims, myDims, drop = FALSE])
Table2$df2 <- sapply(RIVs, function(riv) {
DF1 <- 1L # Univariate tests
a <- DF1*(m - 1)
DF2 <- if (a > 4) {
4 + (a - 4) * (1 + (1 - 2/a)*(1 / riv))^2 # Enders (eq. 8.24)
} else a*(1 + 1/DF1) * (1 + 1/riv)^2 / 2 # Enders (eq. 8.25)
DF2
})
Table2$p.value <- pf(Table2$F, df1 = DF, df2 = Table2$df2, lower.tail = FALSE)
Table2$riv <- RIVs
Table2$fmi <- RIVs / (1 + RIVs)
}
## add univariate EPCs, equivalent to modindices() output
if (epc && !is.null(add)) Table2$epc <- EPC.uni[r.idx]
attr(Table2, "header") <- "univariate score tests:"
OUT$uni <- Table2
}
if (cumulative) {
TS.order <- sort.int(TS, index.return = TRUE, decreasing = TRUE)$ix
TS <- numeric( length(r.idx) )
if (!asymptotic) ARIVs <- numeric( length(r.idx) )
for (r in 1:length(r.idx)) {
rcumul.idx <- TS.order[1:r]
R1 <- R[-rcumul.idx, , drop = FALSE]
Z1 <- cbind( rbind(information, R1),
rbind(t(R1), matrix(0, nrow(R1), nrow(R1))) )
Z1.plus <- MASS::ginv(Z1)
Z1.plus1 <- Z1.plus[ 1:nrow(information), 1:nrow(information) ]
TS[r] <- as.numeric(N * t(score) %*% Z1.plus1 %*% score)
if (!asymptotic) {
myDims <- rcumul.idx + npar
ARIVs[r] <- (1 + 1/m)/length(myDims) * sum(diag(B[myDims, myDims, drop = FALSE] %*% inv.W[myDims, myDims, drop = FALSE]))
}
}
Table3 <- Table[TS.order,]
DF <- seq_len( length(TS) )
if (asymptotic) {
Table3$X2 <- TS
Table3$df <- DF
Table3$p.value <- pchisq(Table3$X2, df = DF, lower.tail = FALSE)
} else {
Table3$F <- TS / DF
Table3$df1 <- DF
## calculate denominator DF for F statistic
Table3$df2 <- mapply(FUN = function(DF1, ariv) {
a <- DF1*(m - 1)
DF2 <- if (a > 4) {
4 + (a - 4) * (1 + (1 - 2/a)*(1 / ariv))^2 # Enders (eq. 8.24)
} else a*(1 + 1/DF1) * (1 + 1/ariv)^2 / 2 # Enders (eq. 8.25)
DF2
}, DF1 = DF, ariv = ARIVs)
Table3$p.value = pf(Table3$F, df1 = DF, df2 = Table3$df2, lower.tail = FALSE)
Table3$riv <- ARIVs
Table3$fmi <- ARIVs / (1 + ARIVs)
}
attr(Table3, "header") <- "cumulative score tests:"
OUT$cumulative <- Table3
}
if (epc) {
ngroups <- lavaan::lavInspect(object, "ngroups")
nlevels <- lavaan::lavInspect(object, "nlevels")
################# source code Yves commented out.
################# Calculates 1 EPC-vector per constraint.
################# Better to call lavTestScore() multiple times? Ugh...
# EPC <- vector("list", length = length(r.idx))
# for (i in 1:length(r.idx)) {
# r <- r.idx[i]
# R1 <- R[-r,,drop = FALSE]
# Z1 <- cbind( rbind(information, R1),
# rbind(t(R1), matrix(0,nrow(R1),nrow(R1))) )
# Z1.plus <- MASS::ginv(Z1)
# Z1.plus1 <- Z1.plus[ 1:nrow(information), 1:nrow(information) ]
# EPC[[i]] <- -1 * as.numeric(score %*% Z1.plus1)
# }
# OUT$EPC <- EPC
# EPCs when freeing all constraints together (total test)
R1 <- R[-r.idx, , drop = FALSE]
Z1 <- cbind( rbind(information, R1),
rbind(t(R1), matrix(0, nrow(R1), nrow(R1))) )
Z1.plus <- MASS::ginv(Z1)
Z1.plus1 <- Z1.plus[ 1:nrow(information), 1:nrow(information) ]
EPC.all <- 1 * as.numeric(score %*% Z1.plus1)
# create epc table for the 'free' parameters
myCoefs <- getMethod("coef","OLDlavaan.mi")(object, omit.imps = omit.imps)
myCols <- c("lhs","op","rhs","user")
if (ngroups > 1L) myCols <- c(myCols, "block","group")
if (nlevels > 1L) myCols <- c(myCols, "block","level")
myCols <- c(unique(myCols), "free","exo","label","plabel")
LIST <- if (!is.null(add) && nchar(add) > 0L) {
PT[ , myCols]
} else parTable(object)[ , myCols]
nonpar.idx <- which(LIST$op %in% c("==", ":=", "<", ">"))
if (length(nonpar.idx) > 0L) LIST <- LIST[ -nonpar.idx , ]
LIST$est[ LIST$free > 0 & LIST$user != 10 ] <- myCoefs
LIST$est[ LIST$user == 10L ] <- 0
LIST$epc <- rep(as.numeric(NA), length(LIST$lhs))
LIST$epc[ LIST$free > 0 ] <- EPC.all
LIST$epv <- LIST$est + LIST$epc
#TODO: add SEPCs
if (standardized) {
EPC <- LIST$epc
if (cov.std) {
# replace epc values for variances by est values
var.idx <- which(LIST$op == "~~" & LIST$lhs == LIST$rhs & LIST$exo == 0L)
EPC[ var.idx ] <- LIST$est[ var.idx ]
}
# two problems:
# - EPC of variances can be negative, and that is perfectly legal
# - EPC (of variances) can be tiny (near-zero), and we should
# not divide by tiny variables
small.idx <- which(LIST$op == "~~" &
LIST$lhs == LIST$rhs &
abs(EPC) < sqrt( .Machine$double.eps ) )
if (length(small.idx) > 0L) EPC[ small.idx ] <- as.numeric(NA)
# get the sign
EPC.sign <- sign(LIST$epc)
## pooled estimates for standardizedSolution()
pooledest <- getMethod("coef", "OLDlavaan.mi")(object,
omit.imps = omit.imps)
## update @Model@GLIST for standardizedSolution(..., GLIST=)
object@Model <- lavaan::lav_model_set_parameters(object@Model, x = pooledest)
LIST$sepc.lv <- EPC.sign * lavaan::standardizedSolution(object, se = FALSE,
type = "std.lv",
cov.std = cov.std,
partable = LIST,
GLIST = object@Model@GLIST,
est = abs(EPC))$est.std
LIST$sepc.all <- EPC.sign * lavaan::standardizedSolution(object, se = FALSE,
type = "std.all",
cov.std = cov.std,
partable = LIST,
GLIST = object@Model@GLIST,
est = abs(EPC))$est.std
fixed.x <- lavListInspect(object, "options")$fixed.x && length(lavNames(object, "ov.x"))
if (fixed.x) {
LIST$sepc.nox <- EPC.sign * lavaan::standardizedSolution(object, se = FALSE,
type = "std.nox",
cov.std = cov.std,
partable = LIST,
GLIST = object@Model@GLIST,
est = abs(EPC))$est.std
}
if (length(small.idx) > 0L) {
LIST$sepc.lv[small.idx] <- 0
LIST$sepc.all[small.idx] <- 0
if (fixed.x) LIST$sepc.nox[small.idx] <- 0
}
}
LIST$free[ LIST$user == 10L ] <- 0
LIST$user <- NULL
LIST$exo <- NULL
DF <- if (asymptotic) OUT$test$df else OUT$test$df1
attr(LIST, "header") <- paste0("expected parameter changes (epc) and ",
"expected parameter values (epv)",
if (DF < 2) ":" else {
" if ALL constraints in 'add' or 'release' were freed:" })
OUT$epc <- LIST
}
OUT
}
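## A minimal illustrative sketch (added for exposition): the hidden
## calculate.D2() helper called above is defined elsewhere in semTools.
## Assuming the usual D2 pooling rule (Li et al., 1991; Enders, 2010,
## eqs. 8.33-8.34), a sketch might look like this hypothetical .demoD2(),
## where w holds m chi-squared statistics that share DF degrees of freedom
## and vary at least somewhat across imputations.
.demoD2 <- function(w, DF) {
m <- length(w) # number of imputations
ariv <- (1 + 1/m) * stats::var(sqrt(w)) # variability of sqrt(chisq) across imputations
D2 <- (mean(w)/DF - (m + 1)/(m - 1) * ariv) / (1 + ariv) # pooled F statistic
df2 <- DF^(-3/m) * (m - 1) * (1 + 1/ariv)^2 # denominator df
c(F = D2, df1 = DF, df2 = df2,
pvalue = stats::pf(D2, DF, df2, lower.tail = FALSE))
}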
semTools/R/emmeans_lavaan.R
### Mattan S. Ben-Shachar & Terrence D. Jorgensen
### Last updated: 11 February 2025
### emmeans support for lavaan objects
##' `emmeans` Support Functions for `lavaan` Models
##'
##' @description Provide emmeans support for lavaan objects
##'
##' @param object An object of class [lavaan::lavaan()].
##' See **Details**.
##' @param lavaan.DV `character` string naming the variable(s) for which
##' expected marginal means / trends should be produced.
##' A vector of names indicates a multivariate outcome, treated by default
##' as repeated measures.
##' @param data An optional `data.frame` without missing values, to be passed
##' when `missing="FIML"` estimation was useed, thus avoiding a reference-grid
##' with missing values.
##' @param trms,xlev,grid See `emmeans::emm_basis`
##' @param ... Further arguments passed to `emmeans::recover_data.lm` or
##' `emmeans::emm_basis.lm`
##'
##' @details
##'
##' \subsection{Supported DVs}{
##' `lavaan.DV` must be an *endogenous variable*, by appearing on
##' the left-hand side of either a regression operator (`"~"`)
##' or an intercept operator (`"~1"`), or both.
##' \cr\cr
##' `lavaan.DV` can also be a vector of endogenous variables, in which
##' case they will be treated by `emmeans` as a multivariate outcome
##' (often, this indicates repeated measures) represented by an additional
##' factor named `rep.meas` by default. The `mult.name=` argument
##' can be used to overwrite this default name.
##' }
##'
##' \subsection{Unsupported Models}{
##' This functionality does not support the following models:
##' \itemize{
##' \item Multi-level models are not supported.
##' \item Models not fit to a `data.frame` (i.e., models fit to a
##' covariance matrix).
##' }
##' }
##'
##' \subsection{Dealing with Fixed Parameters}{
##' Fixed parameters (set with `lavaan`'s modifiers) are treated as-is:
##' their values are set by the users, and they have a *SE* of 0 (as such,
##' they do not co-vary with any other parameter).
##' }
##'
##' \subsection{Dealing with Multigroup Models}{
##' If a multigroup model is supplied, a factor is added to the reference grid,
##' the name matching the `group` argument supplied when fitting the model.
##' *Note that you must set* `nesting = NULL`.
##' }
##'
##' \subsection{Dealing with Missing Data}{
##' Limited testing suggests that these functions do work when the model was fit
##' to incomplete data.
##' }
##'
##' \subsection{Dealing with Factors}{
##' By default `emmeans` recognizes binary variables (0,1) as a "factor"
##' with two levels (and not a continuous variable). With some clever contrast
##' definitions it should be possible to get the desired emmeans / contrasts.
##' See example below.
##' }
##'
##' @author Mattan S. Ben-Shachar (Ben-Gurion University of the Negev;
##' \email{matanshm@@post.bgu.ac.il})
##'
##' @example inst/examples/lavaan2emmeans.R
##'
##' @name lavaan2emmeans
NULL
##' @rdname lavaan2emmeans
recover_data.lavaan <- function(object, lavaan.DV, data = NULL, ...){
if (!requireNamespace("emmeans", quietly = TRUE)){
stop("'emmeans' is not installed.")
}
.emlav_test_DV(object, lavaan.DV)
## testing multi-group requires access to ...
dots <- list(...)
#FIXME: the nesting= argument is not passed to ... here, so the warning is
# annoyingly printed even when nesting = NULL is specified.
if (lavInspect(object, 'ngroups') > 1L && !("nesting" %in% names(dots))) {
warning(
"For multi-group models, don't forget to set 'nesting = NULL'.\n",
"See `?lavaan2emmeans` for more info.",
call. = FALSE
)
}
# Fake it
lavaan_data <- .emlav_recover_data(object, data)
recovered <- emmeans::recover_data(.emlav_fake_fit(object, lavaan.DV, lavaan_data),
data = lavaan_data, ...)
# Make it
lavaan_data <- lavaan_data[, colnames(recovered), drop = FALSE]
# Fill attributes (but keep lavaan_data in case of missing data)
mostattributes(lavaan_data) <- attributes(recovered)
return(lavaan_data)
}
##' @rdname lavaan2emmeans
##' @importFrom lavaan lavInspect
emm_basis.lavaan <- function(object, trms, xlev, grid, lavaan.DV, ...) {
if (!requireNamespace("emmeans", quietly = TRUE)) {
stop("'emmeans' is not installed.")
}
# Fake it
emmb <- emmeans::emm_basis(.emlav_fake_fit(object, lavaan.DV),
trms, xlev, grid, ...)
# bhat --------------------------------------------------------------------
pars <- .emlav_clean_pars_tab(object, lavaan.DV, "bhat")
par_names <- pars$rhs
if(nrow(pars) < length(emmb$bhat)) {
warning(
"Not all parameters have been estimated.\n",
"This is usually caused by a missing mean structure.\n",
"Fixing estimates for these parameters at 0.",
call. = FALSE
)
}
# re-shape to deal with any missing estimates
temp_bhat <- rep(0, length = length(emmb$bhat))
temp_bhat[seq_len(nrow(pars))] <- pars$est
names(temp_bhat) <- .emlab_find_term_names(par_names, colnames(emmb$V))
# re-order
b_ind <- match(colnames(emmb$V), names(temp_bhat))
emmb$bhat <- temp_bhat[b_ind]
# VCOV --------------------------------------------------------------------
lavVCOV <- lavInspect(object, "vcov")
pars <- .emlav_clean_pars_tab(object, lavaan.DV, "vcov")
par_names <- paste0(pars$lhs, pars$op, pars$rhs)
# only regression estimates
pattern <- paste0("^(", paste0(lavaan.DV, collapse = "|"), ")")
is_reg <- grepl(paste0(pattern, "~"), par_names)
is_cov <- grepl(paste0(pattern, "~~"), par_names)
only_reg <- is_reg & !is_cov
lavVCOV <- lavVCOV[only_reg, only_reg, drop = FALSE]
if(ncol(lavVCOV) < nrow(emmb$V)) {
warning(
"Not all parameters are included in the VCOV matrix.\n",
"Perhaps some are fixed with a modifier, or the mean structure is missing.\n",
"Fixing SEs for these parameters at 0.",
call. = FALSE
)
}
# get only RHS
par_names <- par_names[only_reg]
par_names <- sub(paste0("~1$"), "~(Intercept)", par_names)
par_names <- sub(paste0(pattern, "~"), "", par_names)
# re-shape to deal with any missing estimates
temp_vcov <- matrix(0, nrow = nrow(emmb$V), ncol = ncol(emmb$V))
temp_vcov[seq_len(ncol(lavVCOV)), seq_len(ncol(lavVCOV))] <- lavVCOV
colnames(temp_vcov) <- .emlab_find_term_names(par_names, colnames(emmb$V))
rownames(temp_vcov) <- .emlab_find_term_names(par_names, colnames(emmb$V))
# re-order
v_ind <- match(colnames(emmb$V), colnames(temp_vcov))
emmb$V <- temp_vcov[v_ind, v_ind]
# dffun & dfargs ----------------------------------------------------------
emmb$dffun <- function(...) Inf
emmb$dfargs <- list(df = Inf)
# nbasis and misc ---------------------------------------------------------
## DONT CHANGE! MESSES UP MULTI-DV REF_GRID
# emmb$nbasis <- matrix(NA, 1, 1)
# emmb$misc <- list()
return(emmb)
}
##' @keywords internal
.emlab_find_term_names <- function(terms, candidates) {
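## Match term labels regardless of the order of ":"-separated components,
## e.g., "a:b" and "b:a" name the same interaction term.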
terms_split <- strsplit(terms, split = ":")
candidates_split <- strsplit(candidates, split = ":")
is_in <- matrix(NA,
nrow = length(terms_split),
ncol = length(candidates_split))
for (i in seq_along(terms_split)) {
for (j in seq_along(candidates_split)) {
is_in[i,j] <-
all(candidates_split[[j]] %in% terms_split[[i]]) &&
all(terms_split[[i]] %in% candidates_split[[j]])
}
}
if (length(terms) > 1L) is_in <- apply(is_in, 2, any)
c(terms, if (any(!is_in)) candidates[!is_in])
}
##' @keywords internal
##' @importFrom lavaan lavInspect
.emlav_test_DV <- function(object, lavaan.DV){
# has DV?
pars <- lavaan::parameterEstimates(object)
pars <- pars[pars$op %in% c("~1", "~"), ]
if (!all(lavaan.DV %in% pars$lhs)) {
lavaan.DV <- lavaan.DV[!lavaan.DV %in% pars$lhs]
lavaan.DV <- paste0(lavaan.DV, collapse = ",")
stop(
"{", lavaan.DV, "} is not predicted (endogenous) in this model!\n",
"See `?lavaan2emmeans` for more info.",
call. = FALSE
)
}
# Is DV ordered?
if (any(lavaan.DV %in% lavInspect(object, 'ordered'))) {
lavaan.DV <- lavaan.DV[lavaan.DV %in% lavInspect(object, 'ordered')]
lavaan.DV <- paste0(lavaan.DV, collapse = ",")
stop(
"{", lavaan.DV, "} is an ordered variable! ",
"Currently only continuous DVs are supported.\n",
"See `?lavaan2emmeans` for more info.",
call. = FALSE
)
}
# is multilevel?
if (lavInspect(object, 'nlevels') > 1L){
warning(
"emmeans support is unavailable for multilevel SEMs.",
call. = FALSE
)
}
invisible(NULL)
}
##' @keywords internal
##' @importFrom lavaan lavInspect
.emlav_recover_data <- function(object, data = NULL, silent = FALSE){
##This function was contributed by TDJ
dat <- lavaan::lavPredict(object, newdata = data,
type = "lv",
assemble = TRUE,
append.data = TRUE)
## convert to data.frame, if necessary (single group)
dat <- as.data.frame(dat)
if (anyNA(dat)) {
## mean-impute any NAs
if (!silent) {
warning("'data' contains missing value. Mean-imputing them.", call. = FALSE)
}
dat[] <- lapply(dat, function(v) {
ix <- which(is.na(v)) # positions of the missing values
if (is.numeric(v) && length(ix)) {
v[ix] <- mean(v, na.rm = TRUE) # replace each NA with the variable's mean
}
v
})
}
dat
}
#TODO: delete old function after verifying the new one (above) works
# function(object){
# data_obs <- lavInspect(object, "data")
# data_lat <- lavaan::lavPredict(object, type = "lv")
#
# # If multi group
# if (lavInspect(object, 'ngroups') > 1L) {
# # make single data frame + add group labels
# group_labels <- sapply(seq_along(names(data_obs)), function(i) {
# label_ <- names(data_obs)[i]
# nobs_ <- nrow(data_obs[[i]])
# rep(label_, times = nobs_)
# }, simplify = FALSE)
#
# data_obs <- data.frame(do.call(rbind, data_obs))
# data_obs[[lavInspect(object, "group")]] <- unlist(group_labels)
# data_lat <- do.call(rbind, data_lat)
# }
#
# data_full <- cbind(data_obs, data_lat)
# return(data.frame(data_full))
# }
##' @keywords internal
##' @importFrom lavaan lavInspect
.emlav_fake_fit <- function(object, lavaan.DV, lavaan_data = NULL){
if (is.null(lavaan_data)) {
lavaan_data <- .emlav_recover_data(object, silent = TRUE)
}
# Fake it
pars <- lavaan::parameterEstimates(object)
pars <- pars[pars$lhs %in% lavaan.DV & pars$op == "~", ]
# If multi-group
if (lavInspect(object, 'ngroups') > 1L) {
# condition on group (no intercept!)
RHS <- paste0(
"0 +",
lavInspect(object, "group"),
"+",
lavInspect(object, "group"),
"/(",
paste0(pars$rhs, collapse = " + "),
")"
)
} else {
RHS <- paste0(pars$rhs, collapse = " + ")
}
lavaan_formula <- stats::as.formula(paste0(
paste0("cbind(",paste0(lavaan.DV, collapse = ","),")"),
"~",
RHS
))
return(lm(lavaan_formula, lavaan_data))
}
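## A minimal illustrative sketch (added for exposition; .demoFakeFit() and
## the model syntax are hypothetical): the "fake" lm() mirrors the
## structural part of the lavaan model, so emmeans can recover the terms and
## a reference grid from a familiar object class.
.demoFakeFit <- function() {
fit <- lavaan::sem("Sepal.Length ~ Sepal.Width + Petal.Length",
data = datasets::iris, meanstructure = TRUE)
fake <- .emlav_fake_fit(fit, lavaan.DV = "Sepal.Length",
lavaan_data = datasets::iris)
formula(fake) # cbind(Sepal.Length) ~ Sepal.Width + Petal.Length
}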
##' @keywords internal
##' @importFrom lavaan lavInspect
.emlav_clean_pars_tab <- function(object, lavaan.DV, type = c("bhat", "vcov")){
type <- match.arg(type)
if (type == "bhat") {
pars <- lavaan::parameterEstimates(object)
pars <- pars[pars$lhs %in% lavaan.DV & pars$op %in% c("~", "~1"), ]
} else {
pars <- lavaan::parameterEstimates(object,
remove.nonfree = TRUE,
remove.def = TRUE)
}
pars$rhs[pars$op == "~1"] <- "(Intercept)"
pars$op[pars$op == "~1"] <- "~"
if (lavInspect(object, 'ngroups') > 1L) {
group_labs <- paste0(lavInspect(object, 'group'),
lavInspect(object, 'group.label'))
pars$group <- group_labs[pars$group]
temp_rhs <- paste0(pars$group, ":", pars$rhs)
## use fixed = TRUE: the parentheses in "(Intercept)" are regex metacharacters
temp_rhs[grepl("(Intercept)", temp_rhs, fixed = TRUE)] <-
pars$group[grepl("(Intercept)", temp_rhs, fixed = TRUE)]
pars$rhs <- temp_rhs
}
if (length(lavaan.DV) > 1L) {
pars$rhs <- paste0(pars$lhs, ":", pars$rhs)
}
return(pars[, colnames(pars) %in% c("lhs", "op", "rhs", "label", "est")])
}
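## Illustrative note (hypothetical labels, not executed): with a grouping
## variable "school" whose labels are "1" and "2", an intercept row for
## group 2 gets rhs = "school2", a slope on x1 gets rhs = "school2:x1",
## and with multiple DVs each rhs is further prefixed by its DV, e.g.,
## "dem60:school2:x1".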
##' @keywords internal test
.emlav_run_tests <- function() {
if (!requireNamespace("testthat")) {
stop("Need 'testthat' for testing")
}
if (!requireNamespace("emmeans")) {
stop("Need 'emmeans' for testing")
}
testthat::test_that("moderation", {
model <- '
# regressions
Sepal.Length ~ b1 * Sepal.Width + b2 * Petal.Length + b3 * Sepal.Width:Petal.Length
# simple slopes for condition effect
below := b2 + b3 * (-1)
above := b2 + b3 * (+1)
'
semFit <- lavaan::sem(model = model,
data = datasets::iris,
meanstructure = TRUE)
em_ <- summary(
emmeans::emtrends(
semFit,
~ Sepal.Width,
"Petal.Length",
lavaan.DV = "Sepal.Length",
at = list(Sepal.Width = c(-1, 1))
)
)
em_est <- em_$Petal.Length.trend
em_se <- em_$SE
lv_est <-
lavaan::parameterEstimates(semFit, output = "pretty")[15:16, "est"]
lv_se <-
lavaan::parameterEstimates(semFit, output = "pretty")[15:16, "se"]
testthat::expect_equal(em_est, lv_est, tolerance = 1e-4)
testthat::expect_equal(em_se, lv_se, tolerance = 1e-4)
})
testthat::test_that("latent", {
model <- '
LAT1 =~ Sepal.Length + Sepal.Width
LAT1 ~ b1 * Petal.Width + 1 * Petal.Length
Petal.Length ~ Petal.Length.mean * 1
V1 := 1 * Petal.Length.mean + 1 * b1
V2 := 1 * Petal.Length.mean + 2 * b1
'
semFit <- suppressWarnings(
lavaan::sem(model = model,
data = datasets::iris,
std.lv = TRUE)
)
em_ <- suppressWarnings(summary(emmeans::emmeans(
semFit,
~ Petal.Width,
lavaan.DV = "LAT1",
at = list(Petal.Width = 1:2)
)))
em_est <- em_$emmean
lv_est <-
lavaan::parameterEstimates(semFit, output = "pretty")[15:16, "est"]
testthat::expect_equal(em_est, lv_est, tolerance = 1e-4)
})
testthat::test_that("multi-dv", {
model <- '
ind60 =~ x1 + x2 + x3
# metric invariance
dem60 =~ y1 + a*y2 + b*y3 + c*y4
dem65 =~ y5 + a*y6 + b*y7 + c*y8
# scalar invariance
y1 + y5 ~ d*1
y2 + y6 ~ e*1
y3 + y7 ~ f*1
y4 + y8 ~ g*1
# regressions (slopes differ: interaction with time)
dem60 ~ b1*ind60
dem65 ~ b2*ind60 + NA*1 + Mean.Diff*1
# residual correlations
y1 ~~ y5
y2 ~~ y4 + y6
y3 ~~ y7
y4 ~~ y8
y6 ~~ y8
# conditional mean differences (besides mean(ind60) == 0)
low := (-1*b2 + Mean.Diff) - (-1*b1) # 1 SD below M
high := (b2 + Mean.Diff) - b1 # 1 SD above M
'
semFit <- lavaan::sem(model, data = lavaan::PoliticalDemocracy)
em_ <- suppressWarnings(summary(emmeans::emmeans(
semFit,
pairwise ~ rep.meas | ind60,
lavaan.DV = c("dem60", "dem65"),
at = list(ind60 = c(-1, 1))
)[[2]]))
em_est <- em_$estimate
lv_est <-
lavaan::parameterEstimates(semFit, output = "pretty")[49:50, "est"]
em_se <- em_$SE
lv_se <-
lavaan::parameterEstimates(semFit, output = "pretty")[49:50, "se"]
testthat::expect_equal(em_est, -lv_est, tolerance = 1e-4)
testthat::expect_equal(em_se, lv_se, tolerance = 1e-4)
})
testthat::test_that("Multi Group", {
model <- '
x1 ~ c(int1, int2)*1 + c(b1, b2)*ageyr
diff_11 := (int2 + b2*11) - (int1 + b1*11)
diff_13 := (int2 + b2*13) - (int1 + b1*13)
diff_15 := (int2 + b2*15) - (int1 + b1*15)
'
semFit <-
lavaan::sem(model,
group = "school",
data = lavaan::HolzingerSwineford1939)
em_ <-
suppressWarnings(summary(
emmeans::emmeans(
semFit,
pairwise ~ school | ageyr,
lavaan.DV = "x1",
at = list(ageyr = c(11, 13, 15)),
nesting = NULL
)[[2]]
))
em_est <- em_$estimate
lv_est <-
lavaan::parameterEstimates(semFit, output = "pretty")$est[11:13]
em_se <- em_$SE
lv_se <-
lavaan::parameterEstimates(semFit, output = "pretty")$se[11:13]
testthat::expect_equal(em_est, lv_est, tolerance = 1e-4)
testthat::expect_equal(em_se, lv_se, tolerance = 1e-4)
})
testthat::test_that("all!", {
model <- '
LAT1 =~ x1 + x2 + x3
LAT2 =~ x4 + x5 + x6
LAT3 =~ LAT1 + LAT2 + x7 + x8 + x9
LAT3 ~ c(b1,b1)*ageyr + agemo
grade ~ ageyr
'
semFit <- lavaan::sem(model,
data = lavaan::HolzingerSwineford1939,
group = "school")
rg <- suppressWarnings(emmeans::ref_grid(semFit, lavaan.DV = c("LAT3", "grade")))
testthat::expect_s4_class(rg, "emmGrid")
})
testthat::test_that("missing data", {
raw_mtcars <- mtcars_na <- datasets::mtcars
mtcars_na$hp[1] <- NA
model <- " mpg ~ hp + drat + hp:drat "
fit <- lavaan::sem(model, mtcars_na, missing = "fiml.x")
testthat::expect_warning(
rg <- emmeans::ref_grid(fit, lavaan.DV = "mpg"),
regexp = "missing")
testthat::expect_false(anyNA(rg@grid))
testthat::expect_equal(rg@grid$hp, 147.871, tolerance = 0.01)
testthat::expect_warning(
rg2 <- emmeans::ref_grid(fit, lavaan.DV = "mpg",
data = raw_mtcars),
regexp = NA)
testthat::expect_equal(rg2@grid$hp, mean(raw_mtcars$hp), tolerance = 0.01)
})
message("All good!")
}
semTools/vignettes/ 0000755 0001762 0000144 00000000000 14764374533 014107 5 ustar ligges users semTools/vignettes/partialInvariance.bib 0000644 0001762 0000144 00000002704 14006342740 020204 0 ustar ligges users @preamble{ " \newcommand{\noop}[1]{} " }
@article{cheung2002,
title={Evaluating goodness-of-fit indexes for testing measurement invariance},
author={Cheung, Gordon W and Rensvold, Roger B},
journal={Structural Equation Modeling},
volume={9},
number={2},
pages={233--255},
year={2002},
publisher={Taylor \& Francis}
}
@article{meade2008,
title={Power and sensitivity of alternative fit indices in tests of measurement invariance.},
author={Meade, Adam W and Johnson, Emily C and Braddy, Phillip W},
journal={Journal of Applied Psychology},
volume={93},
number={3},
pages={568},
year={2008},
publisher={American Psychological Association}
}
@book{muthen1998,
title={Mplus technical appendices},
author={Muth{\'e}n, Bengt O},
publisher={Muth{\'e}n \& Muth{\'e}n},
address={Los Angeles, CA},
year={\noop{1998}1998--2004}
}
@article{cohen1992,
title={A power primer.},
author={Cohen, Jacob},
journal={Psychological Bulletin},
volume={112},
number={1},
pages={155--159},
year={1992},
publisher={American Psychological Association}
}
@incollection{millsap2012,
author = {Millsap, Roger E and Olivera-Aguilar, Margarita},
title = {Investigating measurement invariance using confirmatory factor analysis},
editor = {Hoyle, Rick H},
booktitle = {Handbook of structural equation modeling},
pages = {380--392},
publisher = {Guilford},
address = {New York},
year = {2012}
} semTools/vignettes/partialInvariance.Rnw 0000644 0001762 0000144 00000024372 14006342740 020223 0 ustar ligges users \documentclass[12pt]{article}
%%\VignetteIndexEntry{Partial Invariance}
%%\VignetteDepends{semTools}
\usepackage[utf8]{inputenc}
\usepackage{amsfonts}
\usepackage{amstext}
\usepackage{amsmath}
\usepackage{natbib}
\title{A Note on Effect Size for Measurement Invariance}
\author{Sunthud Pornprasertmanit}
\begin{document}
\maketitle
This article shows the mathematical reasoning behind all effect sizes used in the \texttt{partialInvariance} and \texttt{partialInvarianceCat} functions in the \texttt{semTools} package. In these functions, the following statistics are compared across groups: factor loadings, item intercepts (for continuous items), item thresholds (for categorical items), measurement error variances, and factor means.
A comparison can be made between two groups (e.g., Cohen's \emph{d}) or across multiple groups (e.g., $R^2$). This note provides the details of the effect sizes for comparing two groups only. A comparison across multiple groups can be carried out by picking a reference group and comparing each of the other groups with it, in a fashion similar to dummy coding. For example, a comparison among four groups would create three effect size values (i.e., Group 1 vs. Reference, Group 2 vs. Reference, and Group 3 vs. Reference). Alternatively, for measurement invariance, the change in comparative fit index (CFI) can be used as a measure of effect size. In the measurement invariance literature \citep{cheung2002, meade2008}, the change in CFI is used to test equality constraints for multiple items simultaneously. The functions in \texttt{semTools} show the change in CFI for each individual item: that is, if an item were allowed to have different statistics across groups (e.g., its loading), how much would the CFI drop from the original model? Please note that more research is needed to find appropriate cutoffs for the change in CFI for individual items. Are the cutoffs of .002 or .01 appropriate for this context?
To create an effect size, the target statistic needs to be standardized, and sample variances are used in the standardization formula. If researchers can assume that the target variances are equal across groups in the population, then pooled variances can be used for the standardization. The pooled variance $s^2_P$ can be computed as follows:
$$s^2_P = \frac{\sum^G_{g=1}(n_g - 1)s^2_g}{\sum^G_{g=1}(n_g - 1)},$$
\noindent
where $g$ indexes the groups, $G$ is the number of groups, $s^2_g$ is the variance of Group $g$, and $n_g$ is the size of Group $g$. If the variances cannot be assumed equal across groups, I recommend picking a reference (baseline) group for the standardization.
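As a minimal illustration (all values below are hypothetical, not taken from the package), the pooled variance can be computed in R as follows:
<<pooledVar, eval=FALSE>>=
## hypothetical group variances and sample sizes for G = 3 groups
s2 <- c(1.2, 0.9, 1.5)  # sample variances
n  <- c(110, 95, 120)   # group sizes
s2P <- sum((n - 1) * s2) / sum(n - 1)  # pooled variance
sqrt(s2P)                              # pooled standard deviation
@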
In the following sections, I will show how effect sizes are defined in each type of partial invariance testing.
\section{Factor Loading}
Let $\lambda_{ijg}$ be the unstandardized factor loading of Item $i$ from Factor $j$ in Group $g$. A standardized factor loading $\lambda^*_{ijg}$ can be computed \citep{muthen1998}:
$$\lambda^*_{ijg} = \lambda_{ijg}\cdot\frac{\psi_{jg}}{\sigma_{ig}},$$
\noindent
where $\psi_{jg}$ is the standard deviation of Factor $j$ in Group $g$ and $\sigma_{ig}$ is the total standard deviation of Item $i$ in Group $g$. To quantify the difference in factor loadings between groups on a standardized scale, the standard deviation in the standardization formula needs to be the same across groups. If Group A and Group B are compared, the standardized difference in factor loadings is defined:
$$\Delta\lambda^*_{ij} = (\lambda_{ijA} - \lambda_{ijB})\cdot\frac{\psi_{jP}}{\sigma_{iP}},$$
\noindent
where $\psi_{jP}$ is the pooled standard deviation of Factor $j$ and $\sigma_{iP}$ is the pooled total standard deviation of Item $i$. If Group A is the reference group, $\psi_{jA}$ and $\sigma_{iA}$ can substitute for $\psi_{jP}$ and $\sigma_{iP}$. Assuming the standardized factor loadings come from a congeneric measurement model, they represent the correlations between items and factors. \cite{cohen1992} provides a guideline for interpreting the magnitude of a difference in correlations between independent groups. The correlations are transformed using Fisher's $z$ transformation:
$$q = \tanh^{-1}\left(\lambda_{ijA}\cdot\frac{\psi_{jP}}{\sigma_{iP}}\right) - \tanh^{-1}\left(\lambda_{ijB}\cdot\frac{\psi_{jP}}{\sigma_{iP}}\right)$$
Then, $q$ values of .1, .3, and .5 are interpreted as small, medium, and large effect sizes, respectively.
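As a minimal sketch with hypothetical standardized loadings (in R, \texttt{atanh()} is Fisher's $z$ transformation):
<<cohenQ, eval=FALSE>>=
## hypothetical standardized loadings for Groups A and B
lamA.std <- 0.70
lamB.std <- 0.50
q <- atanh(lamA.std) - atanh(lamB.std)  # difference in Fisher's z
q  # .1, .3, .5 = small, medium, large
@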
For continuous outcomes, the amount of mean difference implied by the factor loading difference, given a factor score, can be used as an effect size \citep{millsap2012}. Let $X_{ijg}$ be the observed score of Item $i$ loaded on Factor $j$ in Group $g$, and let $W_{j}$ represent the score of Factor $j$. The expected value of the observed score difference between Group A and Group B is calculated as follows:
$$E\left(X_{iA} - X_{iB} \mid W_j \right) = \left( \nu_{iA} - \nu_{iB} \right) + \left( \lambda_{ijA} - \lambda_{ijB} \right) \times W_{j},$$
\noindent
where $\nu_{ig}$ represents the intercept of Item $i$ in Group $g$. Let the values between $W_{jl}$ and $W_{jh}$ be the range of factor scores of interest, over which we can find the expected difference in observed scores. \cite{millsap2012} proposed that, if the expected difference exceeds the value of a meaningful difference, the loading difference is not negligible. See their chapter for a discussion of meaningful differences.
Note that, in the \texttt{partialInvariance} function, $W_{jl}$ is calculated by (a) finding the factor score representing a low \emph{z}-score (e.g., $-2$) in each group and (b) selecting the lowest of those factor scores across groups. $W_{jh}$ is calculated by (a) finding the factor score representing a high \emph{z}-score (e.g., $2$) in each group and (b) selecting the highest of those factor scores across groups.
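As a minimal sketch with hypothetical parameter values and an arbitrary factor-score range:
<<loadDiff, eval=FALSE>>=
## hypothetical intercepts, loadings, and factor-score range
nuA  <- 0.10; nuB  <- -0.10  # intercepts of Groups A and B
lamA <- 0.80; lamB <-  0.60  # loadings of Groups A and B
Wj <- seq(-1.5, 1.8, length.out = 5)  # from W_jl to W_jh
(nuA - nuB) + (lamA - lamB) * Wj  # expected observed-score differences
@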
\section{Item Intercepts}
Let $\nu_{ig}$ be the intercept of Item $i$ in Group $g$. A standardized intercept $\nu^*_{ig}$ is defined as follows \citep{muthen1998}:
$$\nu^*_{ig} = \nu_{ig} / \sigma_{ig}.$$
Thus, the standardized difference between Groups A and B in item intercepts is defined:
$$\Delta\nu^*_{i} = (\nu_{iA} - \nu_{iB}) / \sigma_{iP}.$$
Note that $\sigma_{iA}$ can substitute for $\sigma_{iP}$ if Group A is the reference group. On this scale, .2, .5, and .8 can be interpreted as small, medium, and large effect sizes, respectively, according to \cite{cohen1992}.
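As a minimal sketch with hypothetical values (the standardized threshold and factor-mean differences in the later sections follow the same pattern):
<<stdDiff, eval=FALSE>>=
## hypothetical item intercepts and pooled total SD
nuA <- 2.10; nuB <- 1.80
sigmaP <- 1.25
(nuA - nuB) / sigmaP  # compare to .2, .5, .8
@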
The proportion of the intercept difference over the observed score difference may be used as an effect size \citep{millsap2012}:
$$(\nu_{iA} - \nu_{iB}) / (M_{iA} - M_{iB}),
$$
\noindent
where $M_{ig}$ represents the observed mean of Item $i$ in Group $g$. \cite{millsap2012} noted that a relatively small proportion (e.g., less than 20\%) is ignorable. If the sign is negative or the value exceeds 1, the interpretation is doubtful.
\section{Item Thresholds}
Let $\tau_{cig}$ be the threshold separating Category $c$ from Category $c + 1$ for Item $i$ in Group $g$. Note that the maximum value of $c$ is the number of categories minus 1. Because thresholds are locations on the distribution underlying an ordered categorical item (usually a normal distribution), a location statistic can be standardized by dividing it by the standard deviation of the underlying distribution. The standardized threshold $\tau^*_{cig}$ is defined as follows:
$$\tau^*_{cig} = \tau_{cig} / \sigma^u_{ig},$$
\noindent
where $\sigma^u_{ig}$ is the standard deviation of the distribution underlying the categorical data for Item $i$ in Group $g$. In the theta parameterization of categorical confirmatory factor analysis, $\sigma^u_{ig}$ may not be equal across groups, so the standardized difference in thresholds between Groups A and B requires the pooled standard deviation. The standardized difference in thresholds is defined:
$$\Delta\tau^*_{ci} = (\tau_{ciA} - \tau_{ciB}) / \sigma^u_{iP}.$$
Note that $\sigma^u_{iA}$ can substitute for $\sigma^u_{iP}$ if Group A is the reference group. On this scale, .2, .5, and .8 can be interpreted as small, medium, and large effect sizes, respectively, according to \cite{cohen1992}.
\section{Measurement Error Variances}
Let $\theta_{ig}$ be the measurement error variance of Item $i$ in Group $g$. A standardized measurement error variance $\theta^*_{ig}$ is defined as follows \citep{muthen1998}:
$$\theta^*_{ig} = \theta_{ig} / \sigma^2_{ig},$$
Thus, the standardized difference between Groups A and B in measurement error variances could be defined:
$$\Delta\theta^*_{i} = (\theta_{iA} - \theta_{iB}) / \sigma^2_{iP}.$$
Note that $\sigma^2_{iA}$ can substitute for $\sigma^2_{iP}$ if Group A is the reference group. However, \cite{cohen1992} provides no direct guideline for interpreting the magnitude of a difference in measurement error variances, so a new standardized difference index is needed.
Assuming that $\sigma^2_{iP}$ is always greater than $\theta_{iA}$ and $\theta_{iB}$, which is usually the case, $\theta_{iA} / \sigma^2_{iP}$ and $\theta_{iB} / \sigma^2_{iP}$ range between 0 and 1, like proportions. \cite{cohen1992} provided a guideline for interpreting the magnitude of a difference in proportions using the arcsine transformation. The new index ($h$) is defined as follows:
$$h = 2\sin^{-1}\sqrt{\frac{\theta_{iA}}{\sigma^2_{iP}}} - 2\sin^{-1}\sqrt{\frac{\theta_{iB}}{\sigma^2_{iP}}}.$$
Then, $h$ values of .2, .5, and .8 are interpreted as small, medium, and large effect sizes, respectively.
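As a minimal sketch with hypothetical values:
<<errVarH, eval=FALSE>>=
## hypothetical error variances and pooled total variance
thetaA <- 0.40; thetaB <- 0.55
sigma2P <- 1.10
h <- 2 * asin(sqrt(thetaA / sigma2P)) - 2 * asin(sqrt(thetaB / sigma2P))
h  # .2, .5, .8 = small, medium, large
@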
If items are continuous, the proportion of the error variance difference over the observed variance difference may be used as an effect size \citep{millsap2012}:
$$(\theta_{iA} - \theta_{iB}) / (\sigma^2_{iA} - \sigma^2_{iB}).$$
\noindent
If the sign is negative or the value exceeds 1, the interpretation is doubtful.
\section{Factor Means}
Let $\alpha_{jg}$ be the mean of Factor $j$ in Group $g$. A standardized factor mean $\alpha^*_{jg}$ is defined as follows \citep{muthen1998}:
$$\alpha^*_{jg} = \alpha_{jg} / \psi_{jg}.$$
Thus, the standardized difference between Groups A and B in factor means is defined:
$$\Delta\alpha^*_{j} = (\alpha_{jA} - \alpha_{jB}) / \psi_{jP}.$$
Note that $\psi_{jA}$ can substitute for $\psi_{jP}$ if Group A is the reference group. On this scale, .2, .5, and .8 can be interpreted as small, medium, and large effect sizes, respectively, according to \cite{cohen1992}.
\bibliographystyle{plainnat}
\bibliography{partialInvariance}
\end{document} semTools/data/ 0000755 0001762 0000144 00000000000 14006342740 012770 5 ustar ligges users semTools/data/exLong.rda 0000644 0001762 0000144 00000033516 14006342740 014724 0 ustar ligges users [binary data omitted]