Gelman, Andrew, and Jennifer Hill. 2007. *Data Analysis Using Regression
and Multilevel/Hierarchical Models*. Analytical Methods for Social
Research. Cambridge; New York: Cambridge University Press.
Hox, J. J. 2010. *Multilevel Analysis: Techniques and Applications*. 2nd
ed. Quantitative Methodology Series. New York: Routledge.
Johnson, Paul C. D. 2014. “Extension of Nakagawa & Schielzeth’s R2 GLMM
to Random Slopes Models.” Edited by Robert B. O’Hara. *Methods in
Ecology and Evolution* 5 (9): 944–46.
Nakagawa, Shinichi, Paul C. D. Johnson, and Holger Schielzeth. 2017.
“The Coefficient of Determination R2 and Intra-Class Correlation
Coefficient from Generalized Linear Mixed-Effects Models Revisited and
Expanded.” *Journal of The Royal Society Interface* 14 (134): 20170213.
performance/man/model_performance.kmeans.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/model_performance.kmeans.R
\name{model_performance.kmeans}
\alias{model_performance.kmeans}
\title{Model summary for k-means clustering}
\usage{
\method{model_performance}{kmeans}(model, verbose = TRUE, ...)
}
\arguments{
\item{model}{Object of type \code{kmeans}.}
\item{verbose}{Toggle off warnings.}
\item{...}{Arguments passed to or from other methods.}
}
\description{
Model summary for k-means clustering
}
\examples{
# a 2-dimensional example
x <- rbind(
matrix(rnorm(100, sd = 0.3), ncol = 2),
matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2)
)
colnames(x) <- c("x", "y")
model <- kmeans(x, 2)
model_performance(model)
}
performance/man/item_reliability.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/item_reliability.R
\name{item_reliability}
\alias{item_reliability}
\title{Reliability Test for Items or Scales}
\usage{
item_reliability(x, standardize = FALSE, digits = 3)
}
\arguments{
\item{x}{A matrix or a data frame.}
\item{standardize}{Logical, if \code{TRUE}, the data frame's vectors will be
standardized. Recommended when the variables have different measures /
scales.}
\item{digits}{Amount of digits for returned values.}
}
\value{
A data frame with the corrected item-total correlations (\emph{item
discrimination}, column \code{item_discrimination}) and Cronbach's Alpha
(if item deleted, column \code{alpha_if_deleted}) for each item
of the scale, or \code{NULL} if the data frame has too few columns.
}
\description{
Compute various measures of internal consistencies
for tests or item-scales of questionnaires.
}
\details{
This function calculates the item discriminations (corrected item-total
correlations for each item of \code{x} with the remaining items) and the
Cronbach's alpha for each item, if it was deleted from the scale. The
absolute value of the item discrimination indices should be above 0.2. An
index between 0.2 and 0.4 is considered "fair", while an index above 0.4
(or below -0.4) is "good". The range of satisfactory values is from 0.4 to
0.7. Items with low discrimination indices are often ambiguously worded and
should be examined. Items with negative indices should be examined to
determine why a negative value was obtained (e.g. reversed answer categories
regarding positive and negative poles).
}
\examples{
data(mtcars)
x <- mtcars[, c("cyl", "gear", "carb", "hp")]
item_reliability(x)
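# Manual sketch of a corrected item-total correlation (the item correlated
# with the sum of the remaining items) and Cronbach's alpha if that item is
# deleted -- illustration only, not necessarily the function's exact code:
item <- x[, "cyl"]
rest <- x[, c("gear", "carb", "hp")]
cor(item, rowSums(rest))
k <- ncol(rest)
(k / (k - 1)) * (1 - sum(apply(rest, 2, var)) / var(rowSums(rest)))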
}
performance/man/model_performance.lavaan.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/model_performance.lavaan.R
\name{model_performance.lavaan}
\alias{model_performance.lavaan}
\title{Performance of lavaan SEM / CFA Models}
\usage{
\method{model_performance}{lavaan}(model, metrics = "all", verbose = TRUE, ...)
}
\arguments{
\item{model}{A \strong{lavaan} model.}
\item{metrics}{Can be \code{"all"} or a character vector of metrics to be
computed (some of \code{"Chi2"}, \code{"Chi2_df"}, \code{"p_Chi2"}, \code{"Baseline"},
\code{"Baseline_df"}, \code{"p_Baseline"}, \code{"GFI"}, \code{"AGFI"}, \code{"NFI"}, \code{"NNFI"},
\code{"CFI"}, \code{"RMSEA"}, \code{"RMSEA_CI_low"}, \code{"RMSEA_CI_high"}, \code{"p_RMSEA"},
\code{"RMR"}, \code{"SRMR"}, \code{"RFI"}, \code{"PNFI"}, \code{"IFI"}, \code{"RNI"}, \code{"Loglikelihood"},
\code{"AIC"}, \code{"BIC"}, and \code{"BIC_adjusted"}.}
\item{verbose}{Toggle off warnings.}
\item{...}{Arguments passed to or from other methods.}
}
\value{
A data frame (with one row) and one column per "index" (see
\code{metrics}).
}
\description{
Compute indices of model performance for SEM or CFA models from the
\strong{lavaan} package.
}
\details{
\subsection{Indices of fit}{
\itemize{
\item \strong{Chisq}: The model Chi-squared assesses overall fit and the
discrepancy between the sample and fitted covariance matrices. Its p-value
should be > .05 (i.e., the hypothesis of a perfect fit cannot be
rejected). However, it is quite sensitive to sample size.
\item \strong{GFI/AGFI}: The (Adjusted) Goodness of Fit is the proportion
of variance accounted for by the estimated population covariance.
Analogous to R2. The GFI and the AGFI should be > .95 and > .90,
respectively.
\item \strong{NFI/NNFI/TLI}: The (Non) Normed Fit Index. An NFI of 0.95
indicates that the model of interest improves the fit by 95\\% relative to the
null model. The NNFI (also called the Tucker Lewis index; TLI) is
preferable for smaller samples. They should be > .90 (Byrne, 1994) or >
.95 (Schumacker and Lomax, 2004).
\item \strong{CFI}: The Comparative Fit Index is a revised form of NFI.
Not very sensitive to sample size (Fan, Thompson, and Wang, 1999). Compares
the fit of a target model to the fit of an independent, or null, model. It
should be > .90.
\item \strong{RMSEA}: The Root Mean Square Error of Approximation is a
parsimony-adjusted index. Values closer to 0 represent a good fit. It
should be < .08 or < .05. The p-value printed with it tests the hypothesis
that RMSEA is less than or equal to .05 (a cutoff sometimes used for good
fit), and thus should not be significant.
\item \strong{RMR/SRMR}: the (Standardized) Root Mean Square Residual
represents the square-root of the difference between the residuals of the
sample covariance matrix and the hypothesized model. As the RMR can
sometimes be hard to interpret, it is better to use the SRMR, which should be < .08.
\item \strong{RFI}: the Relative Fit Index, also known as RHO1, is not
guaranteed to vary from 0 to 1. However, RFI close to 1 indicates a good
fit.
\item \strong{IFI}: the Incremental Fit Index (IFI) adjusts the Normed Fit
Index (NFI) for sample size and degrees of freedom (Bollen, 1989). Over
0.90 is a good fit, but the index can exceed 1.
\item \strong{PNFI}: the Parsimony-Adjusted Measures Index. There is no
commonly agreed-upon cutoff value for an acceptable model for this index.
Should be > 0.50. }
}
See the documentation for \code{?lavaan::fitmeasures}.
\subsection{What to report}{
Kline (2015) suggests that at a minimum the following indices should be
reported: The model \strong{chi-square}, the \strong{RMSEA}, the \strong{CFI}
and the \strong{SRMR}.
}
}
\examples{
\dontshow{if (require("lavaan")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf}
# Confirmatory Factor Analysis (CFA) ---------
data(HolzingerSwineford1939, package = "lavaan")
structure <- " visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9 "
model <- lavaan::cfa(structure, data = HolzingerSwineford1939)
model_performance(model)
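# A subset of fit indices can be requested via the `metrics` argument
# (see the argument description above):
model_performance(model, metrics = c("CFI", "RMSEA", "SRMR"))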
\dontshow{\}) # examplesIf}
}
\references{
\itemize{
\item Byrne, B. M. (1994). Structural equation modeling with EQS and
EQS/Windows. Thousand Oaks, CA: Sage Publications.
\item Tucker, L. R., and Lewis, C. (1973). The reliability coefficient for
maximum likelihood factor analysis. Psychometrika, 38, 1-10.
\item Schumacker, R. E., and Lomax, R. G. (2004). A beginner's guide to
structural equation modeling, Second edition. Mahwah, NJ: Lawrence Erlbaum
Associates.
\item Fan, X., B. Thompson, and L. Wang (1999). Effects of sample size,
estimation method, and model specification on structural equation modeling
fit indexes. Structural Equation Modeling, 6, 56-83.
\item Kline, R. B. (2015). Principles and practice of structural equation
modeling. Guilford publications.
}
}
performance/man/r2_tjur.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/r2_tjur.R
\name{r2_tjur}
\alias{r2_tjur}
\title{Tjur's R2 - coefficient of determination (D)}
\usage{
r2_tjur(model, ...)
}
\arguments{
\item{model}{Binomial Model.}
\item{...}{Arguments from other functions, usually only used internally.}
}
\value{
A named vector with the R2 value.
}
\description{
This method calculates the Coefficient of Discrimination \code{D}
(also known as Tjur's R2; \cite{Tjur, 2009}) for generalized linear (mixed) models
for binary outcomes. It is an alternative to other pseudo-R2 values like
Nagelkerke's R2 or Cox-Snell R2. The Coefficient of Discrimination \code{D}
can be read like any other (pseudo-)R2 value.
}
\examples{
model <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial")
r2_tjur(model)
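# Tjur's D is the difference between the mean predicted probabilities of the
# two outcome groups (a manual sketch of the definition, for illustration):
pred <- fitted(model)
mean(pred[mtcars$vs == 1]) - mean(pred[mtcars$vs == 0])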
}
\references{
Tjur, T. (2009). Coefficients of determination in logistic regression
models - A new proposal: The coefficient of discrimination. The American
Statistician, 63(4), 366-372.
}
performance/man/r2_bayes.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/r2_bayes.R
\name{r2_bayes}
\alias{r2_bayes}
\alias{r2_posterior}
\alias{r2_posterior.brmsfit}
\alias{r2_posterior.stanreg}
\alias{r2_posterior.BFBayesFactor}
\title{Bayesian R2}
\usage{
r2_bayes(model, robust = TRUE, ci = 0.95, verbose = TRUE, ...)
r2_posterior(model, ...)
\method{r2_posterior}{brmsfit}(model, verbose = TRUE, ...)
\method{r2_posterior}{stanreg}(model, verbose = TRUE, ...)
\method{r2_posterior}{BFBayesFactor}(model, average = FALSE, prior_odds = NULL, verbose = TRUE, ...)
}
\arguments{
\item{model}{A Bayesian regression model (from \strong{brms},
\strong{rstanarm}, \strong{BayesFactor}, etc).}
\item{robust}{Logical, if \code{TRUE}, the median instead of mean is used to
calculate the central tendency of the variances.}
\item{ci}{Value or vector of probability of the CI (between 0 and 1) to be
estimated.}
\item{verbose}{Toggle off warnings.}
\item{...}{Arguments passed to \code{r2_posterior()}.}
\item{average}{Compute model-averaged index? See \code{\link[bayestestR:weighted_posteriors]{bayestestR::weighted_posteriors()}}.}
\item{prior_odds}{Optional vector of prior odds for the models compared to
the first model (or the denominator, for \code{BFBayesFactor} objects). For
\code{data.frame}s, this will be used as the basis of weighting.}
}
\value{
A list with the Bayesian R2 value. For mixed models, a list with the
Bayesian R2 value and the marginal Bayesian R2 value. The standard errors
and credible intervals for the R2 values are saved as attributes.
}
\description{
Compute R2 for Bayesian models. For mixed models (including a
random part), it additionally computes the R2 related to the fixed effects
only (marginal R2). While \code{r2_bayes()} returns a single R2 value,
\code{r2_posterior()} returns a posterior sample of Bayesian R2 values.
}
\details{
\code{r2_bayes()} returns an "unadjusted" R2 value. See
\code{\link[=r2_loo]{r2_loo()}} to calculate a LOO-adjusted R2, which comes
conceptually closer to an adjusted R2 measure.
For mixed models, the conditional and marginal R2 are returned. The marginal
R2 considers only the variance of the fixed effects, while the conditional
R2 takes both the fixed and random effects into account.
\code{r2_posterior()} is the actual workhorse for \code{r2_bayes()} and
returns a posterior sample of Bayesian R2 values.
}
\examples{
\dontshow{if (require("rstanarm") && require("rstantools") && require("BayesFactor") && require("brms")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf}
library(performance)
\donttest{
model <- suppressWarnings(rstanarm::stan_glm(
mpg ~ wt + cyl,
data = mtcars,
chains = 1,
iter = 500,
refresh = 0,
show_messages = FALSE
))
r2_bayes(model)
model <- suppressWarnings(rstanarm::stan_lmer(
Petal.Length ~ Petal.Width + (1 | Species),
data = iris,
chains = 1,
iter = 500,
refresh = 0
))
r2_bayes(model)
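# r2_posterior() returns the full posterior distribution of R2 values, which
# can be summarized or plotted directly (sketch; the returned object is
# assumed to be a list/vector of posterior draws):
post_r2 <- r2_posterior(model)
hist(unlist(post_r2), main = "Posterior of Bayesian R2")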
}
BFM <- BayesFactor::generalTestBF(mpg ~ qsec + gear, data = mtcars, progress = FALSE)
FM <- BayesFactor::lmBF(mpg ~ qsec + gear, data = mtcars)
r2_bayes(FM)
r2_bayes(BFM[3])
r2_bayes(BFM, average = TRUE) # across all models
# with random effects:
mtcars$gear <- factor(mtcars$gear)
model <- BayesFactor::lmBF(
mpg ~ hp + cyl + gear + gear:wt,
mtcars,
progress = FALSE,
whichRandom = c("gear", "gear:wt")
)
r2_bayes(model)
\donttest{
model <- suppressWarnings(brms::brm(
mpg ~ wt + cyl,
data = mtcars,
silent = 2,
refresh = 0
))
r2_bayes(model)
model <- suppressWarnings(brms::brm(
Petal.Length ~ Petal.Width + (1 | Species),
data = iris,
silent = 2,
refresh = 0
))
r2_bayes(model)
}
\dontshow{\}) # examplesIf}
}
\references{
Gelman, A., Goodrich, B., Gabry, J., and Vehtari, A. (2018).
R-squared for Bayesian regression models. The American Statistician, 1–6.
\doi{10.1080/00031305.2018.1549100}
}
performance/man/check_singularity.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/check_singularity.R
\name{check_singularity}
\alias{check_singularity}
\title{Check mixed models for boundary fits}
\usage{
check_singularity(x, tolerance = 1e-05, ...)
}
\arguments{
\item{x}{A mixed model.}
\item{tolerance}{Tolerance level for detecting singularity, i.e. the value
below which random effect variances are considered to be (close to) zero.
The larger \code{tolerance} is, the stricter the test will be.}
\item{...}{Currently not used.}
}
\value{
\code{TRUE} if the model fit is singular.
}
\description{
Check mixed models for boundary fits.
}
\details{
If a model is "singular", this means that some dimensions of the
variance-covariance matrix have been estimated as exactly zero. This
often occurs for mixed models with complex random effects structures.
"While singular models are statistically well defined (it is theoretically
sensible for the true maximum likelihood estimate to correspond to a singular
fit), there are real concerns that (1) singular fits correspond to overfitted
models that may have poor power; (2) chances of numerical problems and
mis-convergence are higher for singular models (e.g. it may be computationally
difficult to compute profile confidence intervals for such models); (3)
standard inferential procedures such as Wald statistics and likelihood ratio
tests may be inappropriate." (\emph{lme4 Reference Manual})
There is no gold standard for how to deal with singularity and which
random-effects specification to choose. Besides using fully Bayesian methods
(with informative priors), proposals in a frequentist framework are:
\itemize{
\item avoid fitting overly complex models, such that the variance-covariance
matrices can be estimated precisely enough (\emph{Matuschek et al. 2017})
\item use some form of model selection to choose a model that balances
predictive accuracy and overfitting/type I error (\emph{Bates et al. 2015},
\emph{Matuschek et al. 2017})
\item "keep it maximal", i.e. fit the most complex model consistent with the
experimental design, removing only terms required to allow a non-singular
fit (\emph{Barr et al. 2013})
}
Note the different meaning between singularity and convergence: singularity
indicates an issue with the "true" best estimate, i.e. whether the maximum
likelihood estimation for the variance-covariance matrix of the random
effects is positive definite or only semi-definite. Convergence is a
question of whether we can assume that the numerical optimization has
worked correctly or not.
}
\examples{
\dontshow{if (require("lme4")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf}
library(lme4)
data(sleepstudy)
set.seed(123)
sleepstudy$mygrp <- sample(1:5, size = 180, replace = TRUE)
sleepstudy$mysubgrp <- NA
for (i in 1:5) {
filter_group <- sleepstudy$mygrp == i
sleepstudy$mysubgrp[filter_group] <-
sample(1:30, size = sum(filter_group), replace = TRUE)
}
model <- lmer(
Reaction ~ Days + (1 | mygrp / mysubgrp) + (1 | Subject),
data = sleepstudy
)
check_singularity(model)
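# Inspecting the estimated variance components can help to see which
# random-effects terms are estimated as (near) zero (sketch using lme4):
VarCorr(model)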
\dontshow{\}) # examplesIf}
}
\references{
\itemize{
\item Bates D, Kliegl R, Vasishth S, Baayen H. Parsimonious Mixed Models.
arXiv:1506.04967, June 2015.
\item Barr DJ, Levy R, Scheepers C, Tily HJ. Random effects structure for
confirmatory hypothesis testing: Keep it maximal. Journal of Memory and
Language, 68(3):255-278, April 2013.
\item Matuschek H, Kliegl R, Vasishth S, Baayen H, Bates D. Balancing type
I error and power in linear mixed models. Journal of Memory and Language,
94:305-315, 2017.
\item lme4 Reference Manual, \url{https://cran.r-project.org/package=lme4}
}
}
\seealso{
Other functions to check model assumptions and assess model quality:
\code{\link{check_autocorrelation}()},
\code{\link{check_collinearity}()},
\code{\link{check_convergence}()},
\code{\link{check_heteroscedasticity}()},
\code{\link{check_homogeneity}()},
\code{\link{check_model}()},
\code{\link{check_outliers}()},
\code{\link{check_overdispersion}()},
\code{\link{check_predictions}()},
\code{\link{check_zeroinflation}()}
}
\concept{functions to check model assumptions and assess model quality}
performance/man/performance_pcp.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/performance_pcp.R
\name{performance_pcp}
\alias{performance_pcp}
\title{Percentage of Correct Predictions}
\usage{
performance_pcp(model, ci = 0.95, method = "Herron", verbose = TRUE)
}
\arguments{
\item{model}{Model with binary outcome.}
\item{ci}{The level of the confidence interval.}
\item{method}{Name of the method to calculate the PCP (see 'Details').
Default is \code{"Herron"}. May be abbreviated.}
\item{verbose}{Toggle off warnings.}
}
\value{
A list with several elements: the percentage of correct predictions
of the full and the null model, their confidence intervals, as well as the
chi-squared and p-value from the Likelihood-Ratio-Test between the full and
null model.
}
\description{
Percentage of correct predictions (PCP) for models
with binary outcome.
}
\details{
\code{method = "Gelman-Hill"} (or \code{"gelman_hill"}) computes the
PCP based on the proposal from \emph{Gelman and Hill 2017, 99}, which is
defined as the proportion of cases for which the deterministic prediction
is wrong, i.e. the proportion where the predicted probability is above 0.5,
although y=0 (and vice versa) (see also \emph{Herron 1999, 90}).
\code{method = "Herron"} (or \code{"herron"}) computes a modified version
of the PCP (\emph{Herron 1999, 90-92}), which is the sum of predicted
probabilities, where y=1, plus the sum of 1 - predicted probabilities,
where y=0, divided by the number of observations. This approach is said to
be more accurate.
The PCP ranges from 0 to 1, where values closer to 1 mean that the model
predicts the outcome better than models with a PCP closer to 0. In general,
the PCP should be above 0.5 (i.e. 50\\%); the closer to 1, the better.
Furthermore, the PCP of the full model should be considerably above
the null model's PCP.
The likelihood-ratio test indicates whether the model has a significantly
better fit than the null-model (in such cases, p < 0.05).
}
\examples{
data(mtcars)
m <- glm(formula = vs ~ hp + wt, family = binomial, data = mtcars)
performance_pcp(m)
performance_pcp(m, method = "Gelman-Hill")
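# A rough manual sketch of the Herron (1999) definition described in
# 'Details' (illustration only, not the function's exact implementation):
pr <- predict(m, type = "response")
y <- mtcars$vs
(sum(pr[y == 1]) + sum(1 - pr[y == 0])) / length(y)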
}
\references{
\itemize{
\item Herron, M. (1999). Postestimation Uncertainty in Limited Dependent
Variable Models. Political Analysis, 8, 83–98.
\item Gelman, A., and Hill, J. (2007). Data analysis using regression and
multilevel/hierarchical models. Cambridge; New York: Cambridge University
Press, 99.
}
}
performance/man/check_homogeneity.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/check_homogeneity.R
\name{check_homogeneity}
\alias{check_homogeneity}
\alias{check_homogeneity.afex_aov}
\title{Check model for homogeneity of variances}
\usage{
check_homogeneity(x, method = c("bartlett", "fligner", "levene", "auto"), ...)
\method{check_homogeneity}{afex_aov}(x, method = "levene", ...)
}
\arguments{
\item{x}{A linear model or an ANOVA object.}
\item{method}{Name of the method (underlying test) that should be performed
to check the homogeneity of variances. May either be \code{"levene"} for
Levene's Test for Homogeneity of Variance, \code{"bartlett"} for the
Bartlett test (assuming normal distributed samples or groups),
\code{"fligner"} for the Fligner-Killeen test (rank-based, non-parametric
test), or \code{"auto"}. In the latter case, Bartlett test is used if the
model response is normal distributed, else Fligner-Killeen test is used.}
\item{...}{Arguments passed down to \code{car::leveneTest()}.}
}
\value{
Invisibly returns the p-value of the test statistics. A p-value <
0.05 indicates a significant difference in the variance between the groups.
}
\description{
Check model for homogeneity of variances between groups described
by independent variables in a model.
}
\note{
There is also a \href{https://easystats.github.io/see/articles/performance.html}{\code{plot()}-method}
implemented in the \href{https://easystats.github.io/see/}{\pkg{see}-package}.
}
\examples{
model <<- lm(len ~ supp + dose, data = ToothGrowth)
check_homogeneity(model)
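# The underlying test can also be chosen explicitly via `method`:
check_homogeneity(model, method = "bartlett")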
# plot results
if (require("see")) {
result <- check_homogeneity(model)
plot(result)
}
}
\seealso{
Other functions to check model assumptions and assess model quality:
\code{\link{check_autocorrelation}()},
\code{\link{check_collinearity}()},
\code{\link{check_convergence}()},
\code{\link{check_heteroscedasticity}()},
\code{\link{check_model}()},
\code{\link{check_outliers}()},
\code{\link{check_overdispersion}()},
\code{\link{check_predictions}()},
\code{\link{check_singularity}()},
\code{\link{check_zeroinflation}()}
}
\concept{functions to check model assumptions and assess model quality}
performance/man/check_factorstructure.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/check_factorstructure.R
\name{check_factorstructure}
\alias{check_factorstructure}
\alias{check_kmo}
\alias{check_sphericity_bartlett}
\title{Check suitability of data for Factor Analysis (FA) with Bartlett's Test of Sphericity and KMO}
\usage{
check_factorstructure(x, n = NULL, ...)
check_kmo(x, n = NULL, ...)
check_sphericity_bartlett(x, n = NULL, ...)
}
\arguments{
\item{x}{A dataframe or a correlation matrix. If the latter is passed, \code{n}
must be provided.}
\item{n}{If a correlation matrix was passed, the number of observations must
be specified.}
\item{...}{Arguments passed to or from other methods.}
}
\value{
A list of lists of indices related to sphericity and KMO.
}
\description{
This checks whether the data is appropriate for Factor Analysis (FA) by
running the Bartlett's Test of Sphericity and the Kaiser, Meyer, Olkin (KMO)
Measure of Sampling Adequacy (MSA). See \strong{details} below for more information
about the interpretation and meaning of each test.
}
\details{
\subsection{Bartlett's Test of Sphericity}{
Bartlett's (1951) test of sphericity tests whether a matrix (of correlations)
is significantly different from an identity matrix (filled with 0). It tests
whether the correlation coefficients are all 0. The test computes the
probability that the correlation matrix has significant correlations among at
least some of the variables in a dataset, a prerequisite for factor analysis
to work.
While it is often suggested to check whether Bartlett’s test of sphericity is
significant before starting with factor analysis, one needs to remember that
the test is testing a pretty extreme scenario (that all correlations are zero).
As the sample size increases, this test tends to be always significant, which
makes it not particularly useful or informative in well-powered studies.
}
\subsection{Kaiser, Meyer, Olkin (KMO)}{
\emph{(Measure of Sampling Adequacy (MSA) for Factor Analysis.)}
Kaiser (1970) introduced a Measure of Sampling Adequacy (MSA), later modified
by Kaiser and Rice (1974). The Kaiser-Meyer-Olkin (KMO) statistic, which can
vary from 0 to 1, indicates the degree to which each variable in a set is
predicted without error by the other variables.
A value of 0 indicates that the sum of partial correlations is large relative
to the sum of correlations, indicating that factor analysis is likely to be
inappropriate. A KMO value close to 1 indicates that the sum of partial
correlations is not large relative to the sum of correlations and so factor
analysis should yield distinct and reliable factors. It means that patterns
of correlations are relatively compact, and so factor analysis should yield
distinct and reliable factors. Values smaller than 0.5 suggest that you should
either collect more data or rethink which variables to include.
Kaiser (1974) suggested that KMO > .9 were marvelous, in the .80s,
meritorious, in the .70s, middling, in the .60s, mediocre, in the .50s,
miserable, and less than .5, unacceptable. Hair et al. (2006) suggest
accepting a value > 0.5. Values between 0.5 and 0.7 are mediocre, and values
between 0.7 and 0.8 are good.
Variables with individual KMO values below 0.5 could be considered for
exclusion from the analysis (note that you would need to re-compute the
KMO indices as they are dependent on the whole dataset).
}
}
\examples{
library(performance)
check_factorstructure(mtcars)
# One can also pass a correlation matrix
r <- cor(mtcars)
check_factorstructure(r, n = nrow(mtcars))
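# Both checks can also be run individually:
check_kmo(mtcars)
check_sphericity_bartlett(mtcars)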
}
\references{
This function is a wrapper around the \code{KMO} and the \code{cortest.bartlett()}
functions in the \strong{psych} package (Revelle, 2016).
\itemize{
\item Revelle, W. (2016). How To: Use the psych package for Factor Analysis
and data reduction.
\item Bartlett, M. S. (1951). The effect of standardization on a Chi-square
approximation in factor analysis. Biometrika, 38(3/4), 337-344.
\item Kaiser, H. F. (1970). A second generation little jiffy.
Psychometrika, 35(4), 401-415.
\item Kaiser, H. F., & Rice, J. (1974). Little jiffy, mark IV. Educational
and psychological measurement, 34(1), 111-117.
\item Kaiser, H. F. (1974). An index of factorial simplicity.
Psychometrika, 39(1), 31-36.
}
}
\seealso{
\code{\link[=check_clusterstructure]{check_clusterstructure()}}.
}
performance/man/check_normality.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/check_normality.R
\name{check_normality}
\alias{check_normality}
\alias{check_normality.merMod}
\title{Check model for (non-)normality of residuals.}
\usage{
check_normality(x, ...)
\method{check_normality}{merMod}(x, effects = c("fixed", "random"), ...)
}
\arguments{
\item{x}{A model object.}
\item{...}{Currently not used.}
\item{effects}{Should normality for residuals (\code{"fixed"}) or random
effects (\code{"random"}) be tested? Only applies to mixed-effects models.
May be abbreviated.}
}
\value{
The p-value of the test statistics. A p-value < 0.05 indicates a
significant deviation from normal distribution.
}
\description{
Check model for (non-)normality of residuals.
}
\details{
\code{check_normality()} calls \code{stats::shapiro.test} and checks the
standardized residuals (or studentized residuals for mixed models) for
normal distribution. Note that this formal test almost always yields
significant results for the distribution of residuals, so visual inspection
(e.g. Q-Q plots) is preferable. For generalized linear models, no formal
statistical test is carried out. Rather, there's only a \code{plot()} method for
GLMs, which shows a half-normal Q-Q plot of the absolute values of the
standardized deviance residuals (in line with changes in
\code{plot.lm()} for R 4.3+).
}
\note{
For mixed-effects models, studentized residuals, and \emph{not}
standardized residuals, are used for the test. There is also a
\href{https://easystats.github.io/see/articles/performance.html}{\code{plot()}-method}
implemented in the \href{https://easystats.github.io/see/}{\strong{see}-package}.
}
\examples{
\dontshow{if (require("see")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf}
m <<- lm(mpg ~ wt + cyl + gear + disp, data = mtcars)
check_normality(m)
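# The test is a Shapiro-Wilk test on the (standardized) residuals; a rough
# manual equivalent for this linear model (sketch only, not necessarily
# identical to the function's internals):
shapiro.test(rstandard(m))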
# plot results
x <- check_normality(m)
plot(x)
\donttest{
# QQ-plot
plot(check_normality(m), type = "qq")
# PP-plot
plot(check_normality(m), type = "pp")
}
\dontshow{\}) # examplesIf}
}
performance/man/check_collinearity.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/check_collinearity.R, R/check_concurvity.R
\name{check_collinearity}
\alias{check_collinearity}
\alias{multicollinearity}
\alias{check_collinearity.default}
\alias{check_collinearity.glmmTMB}
\alias{check_concurvity}
\title{Check for multicollinearity of model terms}
\usage{
check_collinearity(x, ...)
multicollinearity(x, ...)
\method{check_collinearity}{default}(x, ci = 0.95, verbose = TRUE, ...)
\method{check_collinearity}{glmmTMB}(
x,
component = c("all", "conditional", "count", "zi", "zero_inflated"),
ci = 0.95,
verbose = TRUE,
...
)
check_concurvity(x, ...)
}
\arguments{
\item{x}{A model object (that should at least respond to \code{vcov()},
and if possible, also to \code{model.matrix()} - however, it also should
work without \code{model.matrix()}).}
\item{...}{Currently not used.}
\item{ci}{Confidence Interval (CI) level for VIF and tolerance values.}
\item{verbose}{Toggle off warnings or messages.}
\item{component}{For models with zero-inflation component, multicollinearity
can be checked for the conditional model (count component,
\code{component = "conditional"} or \code{component = "count"}),
zero-inflation component (\code{component = "zero_inflated"} or
\code{component = "zi"}) or both components (\code{component = "all"}).
Following model-classes are currently supported: \code{hurdle},
\code{zeroinfl}, \code{zerocount}, \code{MixMod} and \code{glmmTMB}.}
}
\value{
A data frame with information about name of the model term, the
variance inflation factor and associated confidence intervals, the factor
by which the standard error is increased due to possible correlation
with other terms, and tolerance values (including confidence intervals),
where \code{tolerance = 1/vif}.
}
\description{
\code{check_collinearity()} checks regression models for
multicollinearity by calculating the variance inflation factor (VIF).
\code{multicollinearity()} is an alias for \code{check_collinearity()}.
\code{check_concurvity()} is a wrapper around \code{mgcv::concurvity()}, and can be
considered as a collinearity check for smooth terms in GAMs. Confidence
intervals for VIF and tolerance are based on Marcoulides et al.
(2019, Appendix B).
}
\note{
The code to compute the confidence intervals for the VIF and tolerance
values was adapted from Appendix B of the Marcoulides et al. paper.
Thus, credit for the original algorithm goes to these authors. There is also
a \href{https://easystats.github.io/see/articles/performance.html}{\code{plot()}-method}
implemented in the \href{https://easystats.github.io/see/}{\pkg{see}-package}.
}
\section{Multicollinearity}{
Multicollinearity should not be confused with a raw strong correlation
between predictors. What matters is the association between one or more
predictor variables, \emph{conditional on the other variables in the
model}. In a nutshell, multicollinearity means that once you know the
effect of one predictor, the value of knowing the other predictor is rather
low. Thus, one of the predictors doesn't help much in terms of better
understanding the model or predicting the outcome. As a consequence, if
multicollinearity is a problem, the model seems to suggest that the
predictors in question don't seem to be reliably associated with the
outcome (low estimates, high standard errors), although these predictors
actually are strongly associated with the outcome, i.e. might indeed have a
strong effect (\emph{McElreath 2020, chapter 6.1}).
Multicollinearity might arise when a third, unobserved variable has a causal
effect on each of the two predictors that are associated with the outcome.
In such cases, the actual relationship that matters would be the association
between the unobserved variable and the outcome.
Remember: "Pairwise correlations are not the problem. It is the conditional
associations - not correlations - that matter." (\emph{McElreath 2020, p. 169})
}
\section{Interpretation of the Variance Inflation Factor}{
The variance inflation factor is a measure to analyze the magnitude of
multicollinearity of model terms. A VIF less than 5 indicates a low
correlation of that predictor with other predictors. A value between 5 and
10 indicates a moderate correlation, while VIF values larger than 10 are a
sign for high, not tolerable correlation of model predictors (\emph{James et al.
2013}). The \emph{Increased SE} column in the output indicates how much larger
the standard error is due to the association with other predictors
conditional on the remaining variables in the model. Note that these
thresholds, although commonly used, are also criticized for being too high.
\emph{Zuur et al. (2010)} suggest using lower values, e.g. a VIF of 3 or larger
may already no longer be considered as "low".
}
\section{Multicollinearity and Interaction Terms}{
If interaction terms are included in a model, high VIF values are expected.
This portion of multicollinearity among the component terms of an
interaction is also called "inessential ill-conditioning", which leads to
inflated VIF values that are typically seen for models with interaction
terms \emph{(Francoeur 2013)}.
}
\section{Concurvity for Smooth Terms in Generalized Additive Models}{
\code{check_concurvity()} is a wrapper around \code{mgcv::concurvity()}, and can be
considered as a collinearity check for smooth terms in GAMs. "Concurvity
occurs when some smooth term in a model could be approximated by one or more
of the other smooth terms in the model." (see \code{?mgcv::concurvity}).
\code{check_concurvity()} returns a column named \emph{VIF}, which is the "worst"
measure. While the values from \code{mgcv::concurvity()} range between 0 and 1, the \emph{VIF} value
is \code{1 / (1 - worst)}, to make interpretation comparable to classical VIF
values, i.e. \code{1} indicates no problems, while higher values indicate
increasing lack of identifiability. The \emph{VIF proportion} column equals the
"estimate" column from \code{mgcv::concurvity()}, ranging from 0 (no problem) to
1 (total lack of identifiability).
}
\examples{
m <- lm(mpg ~ wt + cyl + gear + disp, data = mtcars)
check_collinearity(m)
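# Classical definition as a sketch: the VIF of a term is 1 / (1 - R2) from
# regressing that term on all other predictors (illustration only):
r2_wt <- summary(lm(wt ~ cyl + gear + disp, data = mtcars))$r.squared
1 / (1 - r2_wt)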
\dontshow{if (require("see")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf}
# plot results
x <- check_collinearity(m)
plot(x)
\dontshow{\}) # examplesIf}
}
\references{
\itemize{
\item Francoeur, R. B. (2013). Could Sequential Residual Centering Resolve
Low Sensitivity in Moderated Regression? Simulations and Cancer Symptom
Clusters. Open Journal of Statistics, 03(06), 24-44.
\item James, G., Witten, D., Hastie, T., and Tibshirani, R. (eds.). (2013).
An introduction to statistical learning: with applications in R. New York:
Springer.
\item Marcoulides, K. M., and Raykov, T. (2019). Evaluation of Variance
Inflation Factors in Regression Models Using Latent Variable Modeling
Methods. Educational and Psychological Measurement, 79(5), 874–882.
\item McElreath, R. (2020). Statistical rethinking: A Bayesian course with
examples in R and Stan. 2nd edition. Chapman and Hall/CRC.
\item Vanhove, J. (2019). Collinearity isn't a disease that needs curing.
\href{https://janhove.github.io/posts/2019-09-11-collinearity/}{webpage}
\item Zuur AF, Ieno EN, Elphick CS. A protocol for data exploration to avoid
common statistical problems: Data exploration. Methods in Ecology and
Evolution (2010) 1:3–14.
}
}
\seealso{
Other functions to check model assumptions and assess model quality:
\code{\link{check_autocorrelation}()},
\code{\link{check_convergence}()},
\code{\link{check_heteroscedasticity}()},
\code{\link{check_homogeneity}()},
\code{\link{check_model}()},
\code{\link{check_outliers}()},
\code{\link{check_overdispersion}()},
\code{\link{check_predictions}()},
\code{\link{check_singularity}()},
\code{\link{check_zeroinflation}()}
}
\concept{functions to check model assumptions and assess model quality}
performance/man/performance_mae.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/performance_mae.R
\name{performance_mae}
\alias{performance_mae}
\alias{mae}
\title{Mean Absolute Error of Models}
\usage{
performance_mae(model, ...)
mae(model, ...)
}
\arguments{
\item{model}{A model.}
\item{...}{Arguments passed to or from other methods.}
}
\value{
Numeric, the mean absolute error of \code{model}.
}
\description{
Compute mean absolute error of models.
}
\examples{
data(mtcars)
m <- lm(mpg ~ hp + gear, data = mtcars)
performance_mae(m)
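# For a linear model, the MAE is simply the mean of the absolute residuals
# (manual sketch):
mean(abs(residuals(m)))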
}
performance/man/performance_mse.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/performance_mse.R
\name{performance_mse}
\alias{performance_mse}
\alias{mse}
\title{Mean Square Error of Linear Models}
\usage{
performance_mse(model, ...)
mse(model, ...)
}
\arguments{
\item{model}{A model.}
\item{...}{Arguments passed to or from other methods.}
}
\value{
Numeric, the mean square error of \code{model}.
}
\description{
Compute mean square error of linear models.
}
\details{
The mean square error is the mean of the sum of squared residuals, i.e. it
measures the average of the squares of the errors. Less technically speaking,
the mean square error can be considered as the variance of the residuals,
i.e. the variation in the outcome the model doesn't explain. Lower values
(closer to zero) indicate better fit.
}
\examples{
data(mtcars)
m <- lm(mpg ~ hp + gear, data = mtcars)
performance_mse(m)
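# As described in 'Details', the MSE is the mean of the squared residuals
# (manual sketch):
mean(residuals(m)^2)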
}
performance/man/compare_performance.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/compare_performance.R
\name{compare_performance}
\alias{compare_performance}
\title{Compare performance of different models}
\usage{
compare_performance(
...,
metrics = "all",
rank = FALSE,
estimator = "ML",
verbose = TRUE
)
}
\arguments{
\item{...}{Multiple model objects (also of different classes).}
\item{metrics}{Can be \code{"all"}, \code{"common"} or a character vector of
metrics to be computed. See related
\code{\link[=model_performance]{documentation()}} of object's class for
details.}
\item{rank}{Logical, if \code{TRUE}, models are ranked according to 'best'
overall model performance. See 'Details'.}
\item{estimator}{Only for linear models. Corresponds to the different
estimators for the standard deviation of the errors. If \code{estimator = "ML"}
(default), the scaling is done by n (the biased ML estimator), which is
then equivalent to using \code{AIC(logLik())}. Setting it to \code{"REML"} will give
the same results as \code{AIC(logLik(..., REML = TRUE))}.}
\item{verbose}{Toggle warnings.}
}
\value{
A data frame with one row per model and one column per "index" (see
\code{metrics}).
}
\description{
\code{compare_performance()} computes indices of model
performance for different models at once and hence allows comparison of
indices across models.
}
\details{
\subsection{Model Weights}{
When information criteria (IC) are requested in \code{metrics} (i.e., any of \code{"all"},
\code{"common"}, \code{"AIC"}, \code{"AICc"}, \code{"BIC"}, \code{"WAIC"}, or \code{"LOOIC"}), model
weights based on these criteria are also computed. For all IC except LOOIC,
weights are computed as \code{w = exp(-0.5 * delta_ic) / sum(exp(-0.5 * delta_ic))},
where \code{delta_ic} is the difference between the model's IC value and the
smallest IC value in the model set (Burnham and Anderson, 2002).
For LOOIC, weights are computed as "stacking weights" using
\code{\link[loo:loo_model_weights]{loo::stacking_weights()}}.
}
\subsection{Ranking Models}{
When \code{rank = TRUE}, a new column \code{Performance_Score} is returned.
This score ranges from 0\\% to 100\\%, higher values indicating better model
performance. Note that the score values do not necessarily sum up to 100\\%.
Rather, calculation is based on normalizing all indices (i.e. rescaling
them to a range from 0 to 1), and taking the mean value of all indices for
each model. This is a rather quick heuristic, but might be helpful as an
exploratory index.
\cr \cr
In particular when models are of different types (e.g. mixed models,
classical linear models, logistic regression, ...), not all indices will be
computed for each model. In cases where an index can't be calculated for a
specific model type, this model gets an \code{NA} value. All indices that
have any \code{NA}s are excluded from calculating the performance score.
\cr \cr
There is a \code{plot()}-method for \code{compare_performance()},
which creates a "spiderweb" plot, where the different indices are
normalized and larger values indicate better model performance.
Hence, points closer to the center indicate worse fit indices
(see \href{https://easystats.github.io/see/articles/performance.html}{online-documentation}
for more details).
}
\subsection{REML versus ML estimator}{
By default, \code{estimator = "ML"}, which means that values from information
criteria (AIC, AICc, BIC) for specific model classes (like models from \emph{lme4})
are based on the ML-estimator, while the default behaviour of \code{AIC()} for
such classes is setting \code{REML = TRUE}. This default is intentional, because
comparing information criteria based on REML fits is usually not valid
(it might be useful, though, if all models share the same fixed effects -
however, this is usually not the case for nested models, which is a
prerequisite for the LRT). Set \code{estimator = "REML"} explicitly to return the
same (AIC/...) values as from the defaults in \code{AIC.merMod()}.
}
}
\note{
There is also a \href{https://easystats.github.io/see/articles/performance.html}{\code{plot()}-method} implemented in the \href{https://easystats.github.io/see/}{\pkg{see}-package}.
}
\examples{
\dontshow{if (require("lme4")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf}
data(iris)
lm1 <- lm(Sepal.Length ~ Species, data = iris)
lm2 <- lm(Sepal.Length ~ Species + Petal.Length, data = iris)
lm3 <- lm(Sepal.Length ~ Species * Petal.Length, data = iris)
compare_performance(lm1, lm2, lm3)
compare_performance(lm1, lm2, lm3, rank = TRUE)
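# Akaike weights can be reproduced by hand using the formula given in
# 'Model Weights' (sketch):
aics <- c(AIC(lm1), AIC(lm2), AIC(lm3))
delta <- aics - min(aics)
exp(-0.5 * delta) / sum(exp(-0.5 * delta))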
m1 <- lm(mpg ~ wt + cyl, data = mtcars)
m2 <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial")
m3 <- lme4::lmer(Petal.Length ~ Sepal.Length + (1 | Species), data = iris)
compare_performance(m1, m2, m3)
\dontshow{\}) # examplesIf}
}
\references{
Burnham, K. P., and Anderson, D. R. (2002).
\emph{Model selection and multimodel inference: A practical information-theoretic approach} (2nd ed.).
Springer-Verlag. \doi{10.1007/b97636}
}
performance/man/performance_score.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/performance_score.R
\name{performance_score}
\alias{performance_score}
\title{Proper Scoring Rules}
\usage{
performance_score(model, verbose = TRUE, ...)
}
\arguments{
\item{model}{Model with binary or count outcome.}
\item{verbose}{Toggle off warnings.}
\item{...}{Arguments from other functions, usually only used internally.}
}
\value{
A list with three elements, the logarithmic, quadratic/Brier and spherical score.
}
\description{
Calculates the logarithmic, quadratic/Brier and spherical score
from a model with binary or count outcome.
}
\details{
Proper scoring rules can be used to evaluate the quality of model
predictions and model fit. \code{performance_score()} calculates the logarithmic,
quadratic/Brier and spherical scoring rules. The spherical rule takes values
in the interval \verb{[0, 1]}, with values closer to 1 indicating a more
accurate model, and the logarithmic rule in the interval \verb{[-Inf, 0]},
with values closer to 0 indicating a more accurate model.
For \code{stan_lmer()} and \code{stan_glmer()} models, the predicted values
are based on \code{posterior_predict()}, instead of \code{predict()}. Thus,
results may differ more than expected from their non-Bayesian counterparts
in \strong{lme4}.
}
\note{
Code is partially based on
\href{https://drizopoulos.github.io/GLMMadaptive/reference/scoring_rules.html}{GLMMadaptive::scoring_rules()}.
}
\examples{
\dontshow{if (require("glmmTMB")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf}
## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)
model <- glm(counts ~ outcome + treatment, family = poisson())
performance_score(model)
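# For a binary outcome, the quadratic/Brier idea corresponds to the mean
# squared difference between predicted probabilities and observed outcomes
# (classical Brier score; a sketch that may use a different scaling than
# performance_score()):
m_bin <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial")
mean((fitted(m_bin) - mtcars$vs)^2)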
\donttest{
data(Salamanders, package = "glmmTMB")
model <- glmmTMB::glmmTMB(
count ~ spp + mined + (1 | site),
zi = ~ spp + mined,
family = nbinom2(),
data = Salamanders
)
performance_score(model)
}
\dontshow{\}) # examplesIf}
}
\references{
Carvalho, A. (2016). An overview of applications of proper scoring rules.
Decision Analysis 13, 223–242. \doi{10.1287/deca.2016.0337}
}
\seealso{
\code{\link[=performance_logloss]{performance_logloss()}}
}
performance/man/test_performance.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/test_bf.R, R/test_likelihoodratio.R,
% R/test_performance.R, R/test_vuong.R, R/test_wald.R
\name{test_bf}
\alias{test_bf}
\alias{test_bf.default}
\alias{test_likelihoodratio}
\alias{test_lrt}
\alias{test_performance}
\alias{test_vuong}
\alias{test_wald}
\title{Test if models are different}
\usage{
test_bf(...)
\method{test_bf}{default}(..., reference = 1, text_length = NULL)
test_likelihoodratio(..., estimator = "ML", verbose = TRUE)
test_lrt(..., estimator = "ML", verbose = TRUE)
test_performance(..., reference = 1, verbose = TRUE)
test_vuong(..., verbose = TRUE)
test_wald(..., verbose = TRUE)
}
\arguments{
\item{...}{Multiple model objects.}
\item{reference}{This only applies when models are non-nested, and determines
which model should be taken as a reference, against which all the other
models are tested.}
\item{text_length}{Numeric, length (number of chars) of output lines.
\code{test_bf()} describes models by their formulas, which can lead to
overly long lines in the output. \code{text_length} fixes the length of
lines to a specified limit.}
\item{estimator}{Applied when comparing regression models using
\code{test_likelihoodratio()}. Corresponds to the different estimators for
the standard deviation of the errors. Defaults to \code{"OLS"} for linear models,
\code{"ML"} for all other models (including mixed models), or \code{"REML"} for
linear mixed models when these have the same fixed effects. See 'Details'.}
\item{verbose}{Toggle warning and messages.}
}
\value{
A data frame containing the relevant indices.
}
\description{
Testing whether models are "different" in terms of accuracy or explanatory
power is a delicate and often complex procedure, with many limitations and
prerequisites. Moreover, many tests exist, each coming with its own
interpretation, and set of strengths and weaknesses.
The \code{test_performance()} function runs the most relevant and appropriate
tests based on the type of input (for instance, whether the models are
\emph{nested} or not). However, it still requires the user to understand what the
tests are and what they do in order to prevent their misinterpretation. See
the \emph{Details} section for more information regarding the different tests
and their interpretation.
}
\details{
\subsection{Nested vs. Non-nested Models}{
Model's "nesting" is an important concept of models comparison. Indeed, many
tests only make sense when the models are \emph{"nested",} i.e., when their
predictors are nested. This means that all the \emph{fixed effects} predictors of
a model are contained within the \emph{fixed effects} predictors of a larger model
(sometimes referred to as the encompassing model). For instance,
\code{model1 (y ~ x1 + x2)} is "nested" within \code{model2 (y ~ x1 + x2 + x3)}. Usually,
people have a list of nested models, for instance \code{m1 (y ~ 1)}, \code{m2 (y ~ x1)},
\code{m3 (y ~ x1 + x2)}, \code{m4 (y ~ x1 + x2 + x3)}, and it is conventional
that they are "ordered" from the smallest to largest, but it is up to the
user to reverse the order from largest to smallest. The test then shows
whether a more parsimonious model, or whether adding a predictor, results in
a significant difference in the model's performance. In this case, models are
usually compared \emph{sequentially}: m2 is tested against m1, m3 against m2,
m4 against m3, etc.
Two models are considered as \emph{"non-nested"} if their predictors are
different. For instance, \code{model1 (y ~ x1 + x2)} and \code{model2 (y ~ x3 + x4)}.
In the case of non-nested models, all models are usually compared
against the same \emph{reference} model (by default, the first of the list).
Nesting is detected via the \code{insight::is_nested_models()} function.
Note that, apart from the nesting, in order for the tests to be valid,
other requirements often have to be fulfilled. For instance, outcome
variables (the response) must be the same. You cannot meaningfully test
whether apples are significantly different from oranges!
}
\subsection{Estimator of the standard deviation}{
The estimator is relevant when comparing regression models using
\code{test_likelihoodratio()}. If \code{estimator = "OLS"}, then it uses the same
method as \code{anova(..., test = "LRT")} implemented in base R, i.e., scaling
by n-k (the unbiased OLS estimator) and using this estimator under the
alternative hypothesis. If \code{estimator = "ML"}, which is for instance used
by \code{lrtest(...)} in package \strong{lmtest}, the scaling is done by n (the
biased ML estimator) and the estimator under the null hypothesis. In
moderately large samples, the differences should be negligible, but it
is possible that OLS would perform slightly better in small samples with
Gaussian errors. For \code{estimator = "REML"}, the LRT is based on the REML-fit
log-likelihoods of the models. Note that not all types of estimators are
available for all model classes.
}
\subsection{REML versus ML estimator}{
When \code{estimator = "ML"}, which is the default for linear mixed models (unless
they share the same fixed effects), values from information criteria (AIC,
AICc) are based on the ML-estimator, while the default behaviour of \code{AIC()}
may be different (in particular for linear mixed models from \strong{lme4}, which
sets \code{REML = TRUE}). This default in \code{test_likelihoodratio()} is intentional,
because comparing information criteria based on REML fits requires the same
fixed effects for all models, which is often not the case. Thus, while
\code{anova.merMod()} automatically refits all models to REML when performing a
LRT, \code{test_likelihoodratio()} checks if a comparison based on REML fits is
indeed valid, and if so, uses REML as default (else, ML is the default).
Set the \code{estimator} argument explicitly to override the default behaviour.
}
\subsection{Tests Description}{
\itemize{
\item \strong{Bayes factor for Model Comparison} - \code{test_bf()}: If all
models were fit from the same data, the returned \code{BF} shows the Bayes
Factor (see \code{bayestestR::bayesfactor_models()}) for each model against
the reference model (which depends on whether the models are nested or
not). Check out
\href{https://easystats.github.io/bayestestR/articles/bayes_factors.html#bayesfactor_models}{this vignette}
for more details.
\item \strong{Wald's F-Test} - \code{test_wald()}: The Wald test is a rough
approximation of the Likelihood Ratio Test. However, it is more applicable
than the LRT: you can often run a Wald test in situations where no other
test can be run. Importantly, this test only makes statistical sense if the
models are nested.
Note: this test is also available in base R
through the \code{\link[=anova]{anova()}} function. It returns an \code{F-value} column
as a statistic and its associated p-value.
\item \strong{Likelihood Ratio Test (LRT)} - \code{test_likelihoodratio()}:
The LRT tests which model is a better (more likely) explanation of the
data. The Likelihood-Ratio-Test (LRT) usually gives results close to (if
not equivalent to) the Wald test and, similarly, only makes sense for
nested models. However, maximum likelihood tests make stronger assumptions
than method of moments tests like the F-test, and in turn are more
efficient. Agresti (1990) suggests that you should use the LRT instead of
the Wald test for small sample sizes (under or about 30) or if the
parameters are large.
Note: for regression models, this is similar to
\code{anova(..., test="LRT")} (on models) or \code{lmtest::lrtest(...)}, depending
on the \code{estimator} argument. For \strong{lavaan} models (SEM, CFA), the function
calls \code{lavaan::lavTestLRT()}.
For models with transformed response variables (like \code{log(x)} or \code{sqrt(x)}),
\code{logLik()} returns a wrong log-likelihood. However, \code{test_likelihoodratio()}
calls \code{insight::get_loglikelihood()} with \code{check_response=TRUE}, which
returns a corrected log-likelihood value for models with transformed
response variables. Furthermore, since the LRT only accepts nested
models (i.e. models that differ in their fixed effects), the computed
log-likelihood is always based on the ML estimator, not on the REML fits.
\item \strong{Vuong's Test} - \code{test_vuong()}: Vuong's (1989) test can
be used both for nested and non-nested models, and actually consists of two
tests.
\itemize{
\item The \strong{Test of Distinguishability} (the \code{Omega2} column and
its associated p-value) indicates whether or not the models can possibly be
distinguished on the basis of the observed data. If its p-value is
significant, it means the models are distinguishable.
\item The \strong{Robust Likelihood Test} (the \code{LR} column and its
associated p-value) indicates whether each model fits better than the
reference model. If the models are nested, then the test works as a robust
LRT. The code for this function is adapted from the \strong{nonnest2}
package, and all credit goes to its authors.
}
}
}
}
\examples{
# Nested Models
# -------------
m1 <- lm(Sepal.Length ~ Petal.Width, data = iris)
m2 <- lm(Sepal.Length ~ Petal.Width + Species, data = iris)
m3 <- lm(Sepal.Length ~ Petal.Width * Species, data = iris)
test_performance(m1, m2, m3)
test_bf(m1, m2, m3)
test_wald(m1, m2, m3) # Equivalent to anova(m1, m2, m3)
# Equivalent to lmtest::lrtest(m1, m2, m3)
test_likelihoodratio(m1, m2, m3, estimator = "ML")
# Equivalent to anova(m1, m2, m3, test='LRT')
test_likelihoodratio(m1, m2, m3, estimator = "OLS")
if (require("CompQuadForm")) {
test_vuong(m1, m2, m3) # nonnest2::vuongtest(m1, m2, nested=TRUE)
# Non-nested Models
# -----------------
m1 <- lm(Sepal.Length ~ Petal.Width, data = iris)
m2 <- lm(Sepal.Length ~ Petal.Length, data = iris)
m3 <- lm(Sepal.Length ~ Species, data = iris)
test_performance(m1, m2, m3)
test_bf(m1, m2, m3)
test_vuong(m1, m2, m3) # nonnest2::vuongtest(m1, m2)
}
# Tweak the output
# ----------------
test_performance(m1, m2, m3, include_formula = TRUE)
# SEM / CFA (lavaan objects)
# --------------------------
# Lavaan Models
if (require("lavaan")) {
structure <- " visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9
visual ~~ textual + speed "
m1 <- lavaan::cfa(structure, data = HolzingerSwineford1939)
structure <- " visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9
visual ~~ 0 * textual + speed "
m2 <- lavaan::cfa(structure, data = HolzingerSwineford1939)
structure <- " visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9
visual ~~ 0 * textual + 0 * speed "
m3 <- lavaan::cfa(structure, data = HolzingerSwineford1939)
test_likelihoodratio(m1, m2, m3)
# Different Model Types
# ---------------------
if (require("lme4") && require("mgcv")) {
m1 <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)
m2 <- lmer(Sepal.Length ~ Petal.Length + (1 | Species), data = iris)
m3 <- gam(Sepal.Length ~ s(Petal.Length, by = Species) + Species, data = iris)
test_performance(m1, m2, m3)
}
}
}
\references{
\itemize{
\item Vuong, Q. H. (1989). Likelihood ratio tests for model selection and
non-nested hypotheses. Econometrica, 57, 307-333.
\item Merkle, E. C., You, D., & Preacher, K. (2016). Testing non-nested
structural equation models. Psychological Methods, 21, 151-163.
}
}
\seealso{
\code{\link[=compare_performance]{compare_performance()}} to compare the performance indices of
many different models.
}
performance/man/r2_somers.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/r2_somers.R
\name{r2_somers}
\alias{r2_somers}
\title{Somers' Dxy rank correlation for binary outcomes}
\usage{
r2_somers(model)
}
\arguments{
\item{model}{A logistic regression model.}
}
\value{
A named vector with the R2 value.
}
\description{
Calculates the Somers' Dxy rank correlation for logistic regression models.
}
\examples{
\donttest{
if (require("correlation") && require("Hmisc")) {
model <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial")
r2_somers(model)
}
}
}
\references{
Somers, R. H. (1962). A new asymmetric measure of association for
ordinal variables. American Sociological Review. 27 (6).
}
performance/man/model_performance.ivreg.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/model_performance.ivreg.R
\name{model_performance.ivreg}
\alias{model_performance.ivreg}
\title{Performance of instrumental variable regression models}
\usage{
\method{model_performance}{ivreg}(model, metrics = "all", verbose = TRUE, ...)
}
\arguments{
\item{model}{A model.}
\item{metrics}{Can be \code{"all"}, \code{"common"} or a character vector of
metrics to be computed (some of \code{c("AIC", "AICc", "BIC", "R2", "RMSE", "SIGMA", "Sargan", "Wu_Hausman", "weak_instruments")}). \code{"common"} will
compute AIC, BIC, R2 and RMSE.}
\item{verbose}{Toggle off warnings.}
\item{...}{Arguments passed to or from other methods.}
}
\description{
Performance of instrumental variable regression models
}
\details{
\code{model_performance()} correctly detects transformed response and
returns the "corrected" AIC and BIC value on the original scale. To get back
to the original scale, the likelihood of the model is multiplied by the
Jacobian/derivative of the transformation.
}
performance/man/item_split_half.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/item_split_half.R
\name{item_split_half}
\alias{item_split_half}
\title{Split-Half Reliability}
\usage{
item_split_half(x, digits = 3)
}
\arguments{
\item{x}{A matrix or a data frame.}
\item{digits}{Amount of digits for returned values.}
}
\value{
A list with two elements: the split-half reliability \code{splithalf}
and the Spearman-Brown corrected split-half reliability
\code{spearmanbrown}.
}
\description{
Compute various measures of internal consistencies
for tests or item-scales of questionnaires.
}
\details{
This function calculates the split-half reliability for items in
\code{x}, including the Spearman-Brown adjustment. Splitting is done by
selecting odd versus even columns in \code{x}. A value closer to 1
indicates greater internal consistency.
}
\examples{
data(mtcars)
x <- mtcars[, c("cyl", "gear", "carb", "hp")]
item_split_half(x)
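# Manual sketch of the split-half idea described in 'Details': correlate the
# sums of odd- vs. even-numbered columns, then apply the Spearman-Brown
# correction 2r / (1 + r):
odd <- rowSums(x[, seq(1, ncol(x), by = 2), drop = FALSE])
even <- rowSums(x[, seq(2, ncol(x), by = 2), drop = FALSE])
r <- cor(odd, even)
2 * r / (1 + r)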
}
\references{
\itemize{
\item Spearman C. 1910. Correlation calculated from faulty data. British
Journal of Psychology (3): 271-295. \doi{10.1111/j.2044-8295.1910.tb00206.x}
\item Brown W. 1910. Some experimental results in the correlation of mental
abilities. British Journal of Psychology (3): 296-322. \doi{10.1111/j.2044-8295.1910.tb00207.x}
}
}
performance/man/r2_xu.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/r2_xu.R
\name{r2_xu}
\alias{r2_xu}
\title{Xu's R2 (Omega-squared)}
\usage{
r2_xu(model)
}
\arguments{
\item{model}{A linear (mixed) model.}
}
\value{
The R2 value.
}
\description{
Calculates Xu's Omega-squared value, a simple R2 equivalent for
linear mixed models.
}
\details{
\code{r2_xu()} is a crude measure for the explained variance from
linear (mixed) effects models, which is originally denoted as
\ifelse{html}{\out{Ω