emmeans/NAMESPACE
# Generated by roxygen2: do not edit by hand
S3method("+",emmGrid) S3method("[",emmGrid) S3method("[",summary_emm) S3method("levels<-",emmGrid) S3method(as.data.frame,emmGrid) S3method(as.data.frame,emm_list) S3method(as.glht,default) S3method(as.glht,emmGrid) S3method(as.glht,emm_list) S3method(as.list,emmGrid) S3method(as.list,emm_list) S3method(coef,emmGrid) S3method(coef,emm_list) S3method(coef,glht_list) S3method(confint,emmGrid) S3method(confint,emm_list) S3method(confint,glht_list) S3method(contrast,emmGrid) S3method(contrast,emm_list) S3method(emm_basis,aovlist) S3method(emm_basis,lm) S3method(emm_basis,lme) S3method(emm_basis,merMod) S3method(emm_basis,mlm) S3method(emmip,default) S3method(pairs,emmGrid) S3method(pairs,emm_list) S3method(plot,emmGrid) S3method(plot,emm_list) S3method(plot,glht_list) S3method(plot,summary_emm) S3method(predict,emmGrid) S3method(print,emmGrid) S3method(print,emm_list) S3method(print,pwpm) S3method(print,summary_emm) S3method(print,xtable_emm) S3method(rbind,emmGrid) S3method(rbind,emm_list) S3method(recover_data,aovlist) S3method(recover_data,call) S3method(recover_data,lm) S3method(recover_data,lme) S3method(recover_data,merMod) S3method(str,emmGrid) S3method(str,emm_list) S3method(subset,emmGrid) S3method(summary,emmGrid) S3method(summary,emm_list) S3method(summary,glht_list) S3method(test,emmGrid) S3method(test,emm_list) S3method(update,emmGrid) S3method(update,summary_emm) S3method(vcov,emmGrid) S3method(vcov,glht_list) S3method(xtable,emmGrid) S3method(xtable,summary_emm) export(.all.vars) export(.aovlist.dffun) export(.cmpMM) export(.combine.terms) export(.diag) export(.emm_basis) export(.emm_register) export(.emm_vignette) export(.get.excl) export(.get.offset) export(.my.vcov) export(.num.key) export(.recover_data) export(.std.link.labels) export(.zi.simulate) export(add_grouping) export(as.emmGrid) export(as.emm_list) export(as.glht) export(as.mcmc.emmGrid) export(contrast) export(eff_size) export(emm) export(emm_basis) export(emm_defaults) export(emm_options) export(emmeans) export(emmip) export(emmip_ggplot) export(emmip_lattice) export(emmobj) export(emtrends) export(force_regular) export(get.lsm.option) export(get_emm_option) export(hpd.summary) export(joint_tests) export(lsm) export(lsm.options) export(lsmeans) export(lsmip) export(lsmobj) export(lstrends) export(make.tran) export(mvcontrast) export(pmm) export(pmmeans) export(pmmip) export(pmmobj) export(pmtrends) export(pwpm) export(pwpp) export(qdrg) export(recover_data) export(ref_grid) export(regrid) export(test) exportClasses(emmGrid) import(estimability) import(mvtnorm) import(stats) importFrom(graphics,pairs) importFrom(graphics,plot) importFrom(methods,"slot<-") importFrom(methods,as) importFrom(methods,is) importFrom(methods,new) importFrom(methods,slot) importFrom(methods,slotNames) importFrom(stats,coef) importFrom(utils,getS3method) importFrom(utils,hasName) importFrom(utils,installed.packages) importFrom(utils,methods) importFrom(utils,str) importFrom(xtable,xtable) importFrom(xtable,xtableList)
emmeans/README.md
R package **emmeans**: Estimated marginal means
====

## Features

Estimated marginal means (EMMs, previously known as least-squares means in the context of traditional regression models) are derived by using a model to make predictions over
a regular grid of predictor combinations (called a *reference grid*). These predictions may be averaged (typically with equal weights) over one or more of the predictors. Such marginally averaged predictions are useful for describing the results of fitting a model, particularly in presenting the effects of factors. The **emmeans** package can easily produce these results, as well as various graphs of them (interaction-style plots and side-by-side intervals).

* Estimation and testing of pairwise comparisons of EMMs, and several other types of contrasts, are provided.
* In rank-deficient models, the estimability of predictions is checked, to avoid outputting results that are not uniquely defined.
* For models where continuous predictors interact with factors, the package's `emtrends` function works in terms of a reference grid of predicted slopes of trend lines for each factor combination.
* Vignettes are provided on various aspects of EMMs and using the package. See the [CRAN page](https://CRAN.R-project.org/package=emmeans).
* We try to provide flexible (but pretty basic) graphics support for the `emmGrid` objects produced by the package. Also, support is provided for nested fixed effects.
* Response transformations and link functions are supported via a `type` argument in many functions (e.g., `type = "response"` to back-transform results to the response scale). Also, a `regrid()` function is provided to reconstruct the object on any transformed scale that the user wishes.
* Two-way support of the `glht` function in the **multcomp** package.

## Model support

* The package incorporates support for many types of models, including standard models fitted using `lm`, `glm`, and relatives, various mixed models, GEEs, survival models, count models, ordinal responses, zero-inflated models, and others. Provisions for some models include special modes for accessing different types of predictions; for example, with zero-inflated models, one may opt for the estimated response including zeros, just the linear predictor, or the zero model. For details, see [`vignette("models", package = "emmeans")`](https://CRAN.R-project.org/package=emmeans/vignettes/models.html)
* Various Bayesian models (**CARBayes**, **MCMCglmm**, **MCMCpack**) are supported by way of creating a posterior sample of least-squares means or contrasts thereof, which may then be examined using tools such as in the **coda** package.
* Package developers are encouraged to incorporate **emmeans** support for their models by writing `recover_data` and `emm_basis` methods. See [`vignette("xtending", package = "emmeans")`](https://CRAN.R-project.org/package=emmeans/vignettes/xtending.html)

## Versions and installation

* **CRAN** The latest CRAN version may be found at [https://CRAN.R-project.org/package=emmeans](https://CRAN.R-project.org/package=emmeans). Also at that site, formatted versions of this package's vignettes may be viewed.
* **GitHub** To install the latest development version from GitHub, install the newest version of the **remotes** package; then run
```r
remotes::install_github("rvlenth/emmeans", dependencies = TRUE, build_opts = "")

### To install without vignettes (faster):
remotes::install_github("rvlenth/emmeans")
```
*Note:* If you are a Windows user, you should also first download and install the latest version of [`Rtools`](https://cran.r-project.org/bin/windows/Rtools/).
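After installation, here is a minimal sketch of typical usage, to check that things work (it uses the `pigs` dataset shipped with the package, and the same model as in the package's help-page examples):

```r
library(emmeans)

# Fit a model, then obtain marginal means and pairwise comparisons
pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs)
emmeans(pigs.lm, "source", type = "response")   # EMMs, back-transformed
pairs(emmeans(pigs.lm, "source"))               # comparisons on the log scale
```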
For the latest release notes on this development version, see the [NEWS file](https://github.com/rvlenth/emmeans/blob/master/NEWS.md)

## Supersession plan

The developer of **emmeans** continues to maintain and occasionally add new features. However, none of us is immortal; and neither is software. I have thought of trying to find a co-maintainer who could carry the ball once I am gone or lose interest, but the flip side of that is that the codebase is not getting less messy as time goes on -- why impose that on someone else? So my thought now is that if at some point, enough active R developers want the capabilities of **emmeans** but I am no longer in the picture, they should feel free to supersede it with some other package that does it better. All of the code is publicly available on GitHub, so just take what is useful and replace what is not.

##### *Note: **emmeans** supersedes the package **lsmeans**. The latter is just a front end for **emmeans**, and in fact, the `lsmeans()` function itself is part of **emmeans**.*

emmeans/data/
[Binary data files omitted: fiber.RData, MOats.RData, nutrition.RData, autonoise.RData, pigs.rda, neuralgia.rda, ubds.RData, oranges.RData]
emmeans/man/summary.emmGrid.Rd
% Generated by roxygen2: do not edit by hand % Please edit documentation in R/summary.R, R/test.R \name{summary.emmGrid} \alias{summary.emmGrid} \alias{confint.emmGrid} \alias{test} \alias{test.emmGrid} \alias{predict.emmGrid} \alias{as.data.frame.emmGrid} \alias{[.summary_emm} \title{Summaries, predictions, intervals, and tests for \code{emmGrid} objects} \usage{ \method{summary}{emmGrid}(object, infer, level, adjust, by, type, df, calc, null, delta, side, frequentist, bias.adjust = get_emm_option("back.bias.adj"), sigma, ...) \method{confint}{emmGrid}(object, parm, level = 0.95, ...) test(object, null, ...) \method{test}{emmGrid}(object, null = 0, joint = FALSE, verbose = FALSE, rows, by, status = FALSE, ...) \method{predict}{emmGrid}(object, type, interval = c("none", "confidence", "prediction"), level = 0.95, bias.adjust = get_emm_option("back.bias.adj"), sigma, ...) \method{as.data.frame}{emmGrid}(x, row.names = NULL, optional, check.names = TRUE, ...) \method{[}{summary_emm}(x, ..., as.df = TRUE) } \arguments{ \item{object}{An object of class \code{"emmGrid"} (see \link{emmGrid-class})} \item{infer}{A vector of one or two logical values. The first determines whether confidence intervals are displayed, and the second determines whether \emph{t} tests and \emph{P} values are displayed. If only one value is provided, it is used for both.} \item{level}{Numerical value between 0 and 1. Confidence level for confidence intervals, if \code{infer[1]} is \code{TRUE}.} \item{adjust}{Character value naming the method used to adjust \eqn{p} values or confidence limits; or to adjust comparison arrows in \code{plot}. See the P-value adjustments section below.} \item{by}{Character name(s) of variables to use for grouping into separate tables. This affects the family of tests considered in adjusted \emph{P} values.} \item{type}{Character: type of prediction desired. This only has an effect if there is a known transformation or link function. \code{"response"} specifies that the inverse transformation be applied. \code{"mu"} (or equivalently, \code{"unlink"}) is usually the same as \code{"response"}, but in the case where the model has both a link function and a response transformation, only the link part is back-transformed. Other valid values are \code{"link"}, \code{"lp"}, and \code{"linear.predictor"}; these are equivalent, and request that results be shown for the linear predictor, with no back-transformation. The default is \code{"link"}, unless the \code{"predict.type"} option is in force; see \code{\link{emm_options}}, and also the section below on transformations and links.} \item{df}{Numeric. If non-missing, a constant number of degrees of freedom to use in constructing confidence intervals and \emph{P} values (\code{NA} specifies asymptotic results).} \item{calc}{Named list of character value(s) or formula(s). The expressions in \code{calc} are evaluated and appended to the summary, just after the \code{df} column. The expression may include any names up through \code{df} in the summary, any additional names in \code{object@grid} (such as \code{.wgt.} or \code{.offset.}), or any earlier elements of \code{calc}.} \item{null}{Numeric. Null hypothesis value(s), on the linear-predictor scale, against which estimates are tested.
May be a single value used for all, or a numeric vector of length equal to the number of tests in each family (i.e., \code{by} group in the displayed table).} \item{delta}{Numeric value (on the linear-predictor scale). If zero, ordinary tests of significance are performed. If positive, this specifies a threshold for testing equivalence (using the TOST or two-one-sided-test method), non-inferiority, or non-superiority, depending on \code{side}. See Details for how the test statistics are defined.} \item{side}{Numeric or character value specifying whether the test is left-tailed (\code{-1}, \code{"-"}, \code{"<"}, \code{"left"}, or \code{"nonsuperiority"}); right-tailed (\code{1}, \code{"+"}, \code{">"}, \code{"right"}, or \code{"noninferiority"}); or two-sided (\code{0}, \code{2}, \code{"!="}, \code{"two-sided"}, \code{"both"}, \code{"equivalence"}, or \code{"="}). See the special section below for more details.} \item{frequentist}{Ignored except if a Bayesian model was fitted. If missing or \code{FALSE}, the object is passed to \code{\link{hpd.summary}}. Otherwise, a logical value of \code{TRUE} will have it return a frequentist summary.} \item{bias.adjust}{Logical value for whether to adjust for bias in back-transforming (\code{type = "response"}). This requires a value of \code{sigma} to exist in the object or be specified.} \item{sigma}{Error SD assumed for bias correction (when \code{type = "response"} and a transformation is in effect), or for constructing prediction intervals. If not specified, \code{object@misc$sigma} is used, and an error is thrown if it is not found. \emph{Note:} \code{sigma} may be a vector, as long as it conforms to the number of rows of the reference grid.} \item{...}{Optional arguments such as \code{scheffe.rank} (see \dQuote{P-value adjustments}). In \code{as.data.frame.emmGrid}, \code{confint.emmGrid}, \code{predict.emmGrid}, and \code{test.emmGrid}, these arguments are passed to \code{summary.emmGrid}.} \item{parm}{(Required argument for \code{confint} methods, but not used)} \item{joint}{Logical value. If \code{FALSE}, the arguments are passed to \code{\link{summary.emmGrid}} with \code{infer=c(FALSE, TRUE)}. If \code{joint = TRUE}, a joint test of the hypothesis L beta = null is performed, where L is \code{object@linfct} and beta is the vector of fixed effects estimated by \code{object@bhat}. This will be either an \emph{F} test or a chi-square (Wald) test depending on whether degrees of freedom are available. See also \code{\link{joint_tests}}.} \item{verbose}{Logical value. If \code{TRUE} and \code{joint = TRUE}, a table of the effects being tested is printed.} \item{rows}{Integer values. The rows of L to be tested in the joint test. If missing, all rows of L are used. If not missing, \code{by} variables are ignored.} \item{status}{logical. If \code{TRUE}, a \code{note} column showing status flags (for rank deficiencies and estimability issues) is displayed even when empty. If \code{FALSE}, the column is included only if there are such issues.} \item{interval}{Type of interval desired (partial matching is allowed): \code{"none"} for no intervals, otherwise confidence or prediction intervals with given arguments, via \code{\link{confint.emmGrid}}.} \item{x}{object of the given class} \item{row.names}{passed to \code{\link{as.data.frame}}} \item{optional}{required argument, but ignored in \code{as.data.frame.emmGrid}} \item{check.names}{passed to \code{\link{data.frame}}} \item{as.df}{Logical value.
With \code{x[..., as.df = TRUE]}, the result is coerced to an ordinary \code{\link{data.frame}}; otherwise, it is left as a \code{summary_emm} object.} } \value{ \code{summary.emmGrid}, \code{confint.emmGrid}, and \code{test.emmGrid} return an object of class \code{"summary_emm"}, which is an extension of \code{\link{data.frame}} but with a special \code{print} method that displays it with custom formatting. For models fitted using MCMC methods, the call is diverted to \code{\link{hpd.summary}} (with \code{prob} set to \code{level}, if specified); one may alternatively use general MCMC summarization tools with the results of \code{as.mcmc}. \code{predict} returns a vector of predictions for each row of \code{object@grid}. The \code{as.data.frame} method returns a plain data frame, equivalent to \code{as.data.frame(summary(.))}. } \description{ These are the primary methods for obtaining numerical or tabular results from an \code{emmGrid} object. \code{summary.emmGrid} is the general function for summarizing \code{emmGrid} objects. It also serves as the print method for these objects; so for convenience, \code{summary()} arguments may be included in calls to functions such as \code{\link{emmeans}} and \code{\link{contrast}} that construct \code{emmGrid} objects. Note that by default, summaries for Bayesian models are diverted to \code{\link{hpd.summary}}. } \details{ \code{confint.emmGrid} is equivalent to \code{summary.emmGrid} with \code{infer = c(TRUE, FALSE)}. When called with \code{joint = FALSE}, \code{test.emmGrid} is equivalent to \code{summary.emmGrid} with \code{infer = c(FALSE, TRUE)}. With \code{joint = TRUE}, \code{test.emmGrid} calculates the Wald test of the hypothesis \code{linfct \%*\% bhat = null}, where \code{linfct} and \code{bhat} refer to slots in \code{object} (possibly subsetted according to \code{by} or \code{rows}). An error is thrown if any row of \code{linfct} is non-estimable. It is permissible for the rows of \code{linfct} to be linearly dependent, as long as \code{null == 0}, in which case a reduced set of contrasts is tested. Linear dependence and nonzero \code{null} cause an error. } \note{ When doing testing while a transformation and/or link is in force, any \code{null} and/or \code{delta} values specified must always be on the scale of the linear predictor, regardless of the setting for \code{type}. If \code{type = "response"}, the null value displayed in the summary table will be back-transformed from the value supplied by the user. But the displayed \code{delta} will not be changed, because there (often) is not a natural way to back-transform it. When we have \code{type = "response"} and \code{bias.adjust = TRUE}, the \code{null} value displayed in the output is both back-transformed and bias-adjusted, leading to a rather non-intuitive-looking null value. However, since the tests themselves are performed on the link scale, this is the response value at which a \emph{P} value of 1 would be obtained. The default \code{show} method for \code{emmGrid} objects (with the exception of newly created reference grids) is \code{print(summary())}. Thus, with ordinary usage of \code{\link{emmeans}} and such, it is unnecessary to call \code{summary} unless there is a need to specify other than its default options. The \code{as.data.frame} method is intended primarily to allow for \code{emmGrid} objects to be coerced to a data frame as needed internally.
However, we recommend \emph{against} users routinely using \code{as.data.frame}; instead, use \code{summary}, \code{confint}, or \code{test}, which already return a special \code{data.frame} with added annotations. Those annotations display important information such as adjustment methods and confidence levels. If you need to see more digits, use \code{print(summary(object), digits = ...)}; and if you \emph{always} want to see more digits, use \code{emm_options(opt.digits = FALSE)}. } \section{Defaults}{ The \code{misc} slot in \code{object} may contain default values for \code{by}, \code{calc}, \code{infer}, \code{level}, \code{adjust}, \code{type}, \code{null}, \code{side}, and \code{delta}. These defaults vary depending on the code that created the object. The \code{\link{update}} method may be used to change these defaults. In addition, any options set using \samp{emm_options(summary = ...)} will trump those stored in the object's \code{misc} slot. } \section{Transformations and links}{ With \code{type = "response"}, the transformation assumed can be found in \samp{object@misc$tran}, and its label for the summary is in \samp{object@misc$inv.lbl}. Any \eqn{t} or \eqn{z} tests are still performed on the scale of the linear predictor, not the inverse-transformed one. Similarly, confidence intervals are computed on the linear-predictor scale, then inverse-transformed. When \code{bias.adjust} is \code{TRUE}, then back-transformed estimates are adjusted by adding \eqn{0.5 h''(u)\sigma^2}, where \eqn{h} is the inverse transformation and \eqn{u} is the linear predictor. This is based on a second-order Taylor expansion. There are better or exact adjustments for certain specific cases, and these may be incorporated in future updates. } \section{P-value adjustments}{ The \code{adjust} argument specifies a multiplicity adjustment for tests or confidence intervals. This adjustment is always applied \emph{separately} to each table or sub-table that you see in the printed output (see \code{\link{rbind.emmGrid}} for how to combine tables). The valid values of \code{adjust} are as follows: \describe{ \item{\code{"tukey"}}{Uses the Studentized range distribution with the number of means in the family. (Available for two-sided cases only.)} \item{\code{"scheffe"}}{Computes \eqn{p} values from the \eqn{F} distribution, according to the Scheffe critical value of \eqn{\sqrt{rF(\alpha; r, d)}}{sqrt[r*qf(alpha, r, d)]}, where \eqn{d} is the error degrees of freedom and \eqn{r} is the rank of the set of linear functions under consideration. By default, the value of \code{r} is computed from \code{object@linfct} for each by group; however, if the user specifies an argument matching \code{scheffe.rank}, its value will be used instead. Ordinarily, if there are \eqn{k} means involved, then \eqn{r = k - 1} for a full set of contrasts involving all \eqn{k} means, and \eqn{r = k} for the means themselves. (The Scheffe adjustment is available for two-sided cases only.)} \item{\code{"sidak"}}{Makes adjustments as if the estimates were independent (a conservative adjustment in many cases).} \item{\code{"bonferroni"}}{Multiplies \eqn{p} values, or divides significance levels by the number of estimates. This is a conservative adjustment.} \item{\code{"dunnettx"}}{Uses our own \emph{ad hoc} approximation to the Dunnett distribution for a family of estimates having pairwise correlations of \eqn{0.5} (as is true when comparing treatments with a control with equal sample sizes).
The accuracy of the approximation improves with the number of simultaneous estimates, and the computation is much faster than with \code{"mvt"}. (Available for two-sided cases only.)} \item{\code{"mvt"}}{Uses the multivariate \eqn{t} distribution to assess the probability or critical value for the maximum of \eqn{k} estimates. This method produces the same \eqn{p} values and intervals as the default \code{summary} or \code{confint} methods applied to the results of \code{\link{as.glht}}. In the context of pairwise comparisons or comparisons with a control, this produces \dQuote{exact} Tukey or Dunnett adjustments, respectively. However, the algorithm (from the \pkg{mvtnorm} package) uses a Monte Carlo method, so results are not exactly repeatable unless the same random-number seed is used (see \code{\link[base:Random]{set.seed}}). As the family size increases, the required computation time will become noticeable or even intolerable, making \code{"tukey"}, \code{"dunnettx"}, or other methods more attractive.} \item{\code{"none"}}{Makes no adjustments to the \eqn{p} values.} } %%%%%%%%%%%%%%%% end \describe {} For tests, not confidence intervals, the Bonferroni-inequality-based adjustment methods in \code{\link{p.adjust}} are also available (currently, these include \code{"holm"}, \code{"hochberg"}, \code{"hommel"}, \code{"bonferroni"}, \code{"BH"}, \code{"BY"}, \code{"fdr"}, and \code{"none"}). If a \code{p.adjust.methods} method other than \code{"bonferroni"} or \code{"none"} is specified for confidence limits, the straight Bonferroni adjustment is used instead. Also, if an adjustment method is not appropriate (e.g., using \code{"tukey"} with one-sided tests, or with results that are not pairwise comparisons), a more appropriate method (usually \code{"sidak"}) is substituted. In some cases, confidence and \eqn{p}-value adjustments are only approximate -- especially when the degrees of freedom or standard errors vary greatly within the family of tests. The \code{"mvt"} method is always the correct one-step adjustment, but it can be very slow. One may use \code{\link{as.glht}} with methods in the \pkg{multcomp} package to obtain non-conservative multi-step adjustments to tests. \emph{Warning:} Non-estimable cases are \emph{included} in the family to which adjustments are applied. You may wish to subset the object using the \code{[]} operator to work around this problem. } \section{Tests of significance, nonsuperiority, noninferiority, or equivalence}{ When \code{delta = 0}, test statistics are the usual tests of significance. They are of the form \samp{(estimate - null)/SE}. Notationally: \describe{ \item{Significance}{\eqn{H_0: \theta = \theta_0} versus \cr \eqn{H_1: \theta < \theta_0} (left-sided), or\cr \eqn{H_1: \theta > \theta_0} (right-sided), or\cr \eqn{H_1: \theta \ne \theta_0} (two-sided)\cr The test statistic is\cr \eqn{t = (Q - \theta_0)/SE}\cr where \eqn{Q} is our estimate of \eqn{\theta}; then left, right, or two-sided \eqn{p} values are produced, depending on \code{side}.} } When \code{delta} is positive, the test statistic depends on \code{side} as follows.
\describe{ \item{Left-sided (nonsuperiority)}{\eqn{H_0: \theta \ge \theta_0 + \delta} versus \eqn{H_1: \theta < \theta_0 + \delta}\cr \eqn{t = (Q - \theta_0 - \delta)/SE}\cr The \eqn{p} value is the lower-tail probability.} \item{Right-sided (noninferiority)}{\eqn{H_0: \theta \le \theta_0 - \delta} versus \eqn{H_1: \theta > \theta_0 - \delta}\cr \eqn{t = (Q - \theta_0 + \delta)/SE}\cr The \eqn{p} value is the upper-tail probability.} \item{Two-sided (equivalence)}{\eqn{H_0: |\theta - \theta_0| \ge \delta} versus \eqn{H_1: |\theta - \theta_0| < \delta}\cr \eqn{t = (|Q - \theta_0| - \delta)/SE}\cr The \eqn{p} value is the \emph{lower}-tail probability.\cr Note that \eqn{t} is the maximum of \eqn{t_{nonsup}} and \eqn{-t_{noninf}}. This is equivalent to choosing the less significant result in the two-one-sided-test (TOST) procedure.} } %%%%%%%%%%%% end \describe{} } \section{Non-estimable cases}{ When the model is rank-deficient, each row \code{x} of \code{object}'s \code{linfct} slot is checked for estimability. If \code{sum(x*bhat)} is found to be non-estimable, then the string \code{NonEst} is displayed for the estimate, and associated statistics are set to \code{NA}. The estimability check is performed using the orthonormal basis \code{N} in the \code{nbasis} slot for the null space of the rows of the model matrix. Estimability fails when \eqn{||Nx||^2 / ||x||^2} exceeds \code{tol}, which by default is \code{1e-8}. You may change it via \code{\link{emm_options}} by setting \code{estble.tol} to the desired value. See the warning above that non-estimable cases are still included when determining the family size for \emph{P}-value adjustments. } \section{Warning about potential misuse of P values}{ Some in the statistical and scientific community argue that the term \dQuote{statistical significance} should be completely abandoned, and that criteria such as \dQuote{p < 0.05} never be used to assess the importance of an effect. These practices can be too misleading and are prone to abuse. See \href{../doc/basics.html#pvalues}{the \dQuote{basics} vignette} for more discussion. } \examples{ warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks) warp.emm <- emmeans(warp.lm, ~ tension | wool) warp.emm # implicitly runs 'summary' confint(warp.emm, by = NULL, level = .90) # -------------------------------------------------------------- pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs) pigs.emm <- emmeans(pigs.lm, "percent", type = "response") summary(pigs.emm) # (inherits type = "response") summary(pigs.emm, calc = c(n = ".wgt.")) # Show sample size # For which percents is EMM non-inferior to 35, based on a 10\% threshold? 
# Note the test is done on the log scale even though we have type = "response" test(pigs.emm, null = log(35), delta = log(1.10), side = ">") con <- contrast(pigs.emm, "consec") test(con) test(con, joint = TRUE) # default Scheffe adjustment - rank = 3 summary(con, infer = c(TRUE, TRUE), adjust = "scheffe") # Consider as some of many possible contrasts among the six cell means summary(con, infer = c(TRUE, TRUE), adjust = "scheffe", scheffe.rank = 5) # Show estimates to more digits print(test(con), digits = 7) } \seealso{ \code{\link{hpd.summary}} }
emmeans/man/add_grouping.Rd
% Generated by roxygen2: do not edit by hand % Please edit documentation in R/nested.R \name{add_grouping} \alias{add_grouping} \title{Add a grouping factor} \usage{ add_grouping(object, newname, refname, newlevs) } \arguments{ \item{object}{An \code{emmGrid} object} \item{newname}{Character name of grouping factor to add (different from any existing factor in the grid)} \item{refname}{Character name(s) of the reference factor(s)} \item{newlevs}{Character vector or factor of the same length as that of the (combined) levels for \code{refname}. The grouping factor \code{newname} will have the unique values of \code{newlevs} as its levels. The order of levels in \code{newlevs} is the same as the order of the level combinations produced by \code{\link{expand.grid}} applied to the levels of \code{refname} -- that is, the first factor's levels vary the fastest and the last one's the slowest.} } \value{ A revised \code{emmGrid} object having an additional factor named \code{newname}, and a new nesting structure with each \code{refname \%in\% newname} } \description{ This function adds a grouping factor to an existing reference grid or other \code{emmGrid} object, such that the levels of one or more existing factors (call them the reference factors) are mapped to a smaller number of levels of the new grouping factor. The reference factors are then nested in the new grouping factor. This facilitates obtaining marginal means of the grouping factor, and contrasts thereof. } \note{ By default, the levels of \code{newname} will be ordered alphabetically. To dictate a different ordering of levels, supply \code{newlevs} as a \code{factor} having its levels in the desired order. When \code{refname} specifies more than one factor, this can fundamentally (and permanently) change what is meant by the levels of those individual factors. For instance, in the \code{gwrg} example below, there are two levels of \code{wool} nested in each \code{prod}; and that implies that we now regard these as four different kinds of wool. Similarly, there are five different tensions (L, M, H in prod 1, and L, M in prod 2).
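As a schematic of how \code{newlevs} lines up with the \code{expand.grid} ordering in that \code{gwrg} example (this is just an illustration of the mapping used in the examples below):
\preformatted{
  tension:  L  M  H  L  M  H    (varies fastest)
  wool:     A  A  A  B  B  B    (varies slowest)
  newlevs:  2  1  1  1  2  1
}
so the combinations LA and MB form \code{prod} level \code{"2"}, and the remaining four combinations form \code{prod} level \code{"1"}.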
} \examples{ fiber.lm <- lm(strength ~ diameter + machine, data = fiber) ( frg <- ref_grid(fiber.lm) ) # Suppose the machines are two different brands brands <- factor(c("FiberPro", "FiberPro", "Acme"), levels = c("FiberPro", "Acme")) ( gfrg <- add_grouping(frg, "brand", "machine", brands) ) emmeans(gfrg, "machine") emmeans(gfrg, "brand") ### More than one reference factor warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks) gwrg <- add_grouping(ref_grid(warp.lm), "prod", c("tension", "wool"), c(2, 1, 1, 1, 2, 1)) # level combinations: LA MA HA LB MB HB emmeans(gwrg, ~ wool * tension) # some NAs due to impossible combinations emmeans(gwrg, "prod") }
emmeans/man/emmobj.Rd
% Generated by roxygen2: do not edit by hand % Please edit documentation in R/emmeans.R \name{emmobj} \alias{emmobj} \title{Construct an \code{emmGrid} object from scratch} \usage{ emmobj(bhat, V, levels, linfct = diag(length(bhat)), df = NA, dffun, dfargs = list(), post.beta = matrix(NA), nesting = NULL, ...) } \arguments{ \item{bhat}{Numeric. Vector of regression coefficients} \item{V}{Square matrix. Covariance matrix of \code{bhat}} \item{levels}{Named list or vector. Levels of factor(s) that define the estimates defined by \code{linfct}. If not a list, we assume one factor named \code{"level"}} \item{linfct}{Matrix. Linear functions of \code{bhat} for each combination of \code{levels}.} \item{df}{Numeric value or function with arguments \code{(x, dfargs)}. If a number, that is used for the degrees of freedom. If a function, it should return the degrees of freedom for \code{sum(x*bhat)}, with any additional parameters in \code{dfargs}.} \item{dffun}{Overrides \code{df} if specified. This is a convenience to match the slot names of the returned object.} \item{dfargs}{List containing arguments for \code{df}. This is ignored if \code{df} is numeric.} \item{post.beta}{Matrix whose columns comprise a sample from the posterior distribution of the regression coefficients (so that typically, the column averages will be \code{bhat}). A 1 x 1 matrix of \code{NA} indicates that such a sample is unavailable.} \item{nesting}{Nesting specification as in \code{\link{ref_grid}}. This is ignored if \code{model.info} is supplied.} \item{...}{Arguments passed to \code{\link{update.emmGrid}}} } \value{ An \code{emmGrid} object } \description{ This allows the user to incorporate results obtained by some analysis into an \code{emmGrid} object, enabling the use of \code{emmGrid} methods to perform related follow-up analyses. } \details{ The arguments must be conformable. This includes that the length of \code{bhat}, the number of columns of \code{linfct}, and the number of columns of \code{post.beta} must all be equal; and that the product of lengths in \code{levels} must be equal to the number of rows of \code{linfct}. The \code{grid} slot of the returned object is generated by \code{\link{expand.grid}} using \code{levels} as its arguments. So the rows of \code{linfct} should be in corresponding order. The functions \code{qdrg} and \code{\link{emmobj}} are close cousins, in that they both produce \code{emmGrid} objects. When starting with summary statistics for an existing grid, \code{emmobj} is more useful, while \code{qdrg} is more useful when starting from an unsupported fitted model. } \examples{ # Given summary statistics for 4 cells in a 2 x 2 layout, obtain # marginal means and comparisons thereof.
# Assume heteroscedasticity # and use the Satterthwaite method levels <- list(trt = c("A", "B"), dose = c("high", "low")) ybar <- c(57.6, 43.2, 88.9, 69.8) s <- c(12.1, 19.5, 22.8, 43.2) n <- c(44, 11, 37, 24) se2 <- s^2 / n Satt.df <- function(x, dfargs) sum(x * dfargs$v)^2 / sum((x * dfargs$v)^2 / (dfargs$n - 1)) expt.rg <- emmobj(bhat = ybar, V = diag(se2), levels = levels, linfct = diag(c(1, 1, 1, 1)), df = Satt.df, dfargs = list(v = se2, n = n), estName = "mean") plot(expt.rg) ( trt.emm <- emmeans(expt.rg, "trt") ) ( dose.emm <- emmeans(expt.rg, "dose") ) rbind(pairs(trt.emm), pairs(dose.emm), adjust = "mvt") } \seealso{ \code{\link{qdrg}}, an alternative that is useful when starting with a fitted model not supported in \pkg{emmeans}. }
emmeans/man/rbind.emmGrid.Rd
% Generated by roxygen2: do not edit by hand % Please edit documentation in R/rbind.R, R/emm-list.R, R/nested.R \name{rbind.emmGrid} \alias{rbind.emmGrid} \alias{rbind.emm_list} \alias{+.emmGrid} \alias{[.emmGrid} \alias{subset.emmGrid} \alias{force_regular} \title{Combine or subset \code{emmGrid} objects} \usage{ \method{rbind}{emmGrid}(..., deparse.level = 1, adjust = "bonferroni") \method{rbind}{emm_list}(..., which, adjust = "bonferroni") \method{+}{emmGrid}(e1, e2) \method{[}{emmGrid}(x, i, adjust, drop.levels = TRUE, ...) \method{subset}{emmGrid}(x, subset, ...) force_regular(object) } \arguments{ \item{...}{Additional arguments: In \code{rbind}, object(s) of class \code{emmGrid}. In \code{"["}, it is ignored. In \code{subset}, it is passed to \code{[.emmGrid}} \item{deparse.level}{(required but not used)} \item{adjust}{Character value passed to \code{\link{update.emmGrid}}} \item{which}{Integer vector of subset of elements to use; if missing, all are combined} \item{e1}{An \code{emmGrid} object} \item{e2}{Another \code{emmGrid} object} \item{x}{An \code{emmGrid} object to be subsetted} \item{i}{Integer vector of indexes} \item{drop.levels}{Logical value. If \code{TRUE}, the \code{"levels"} slot in the returned object is updated to hold only the predictor levels that actually occur} \item{subset}{logical expression indicating which rows of the grid to keep} \item{object}{an object of class \code{emmGrid}} } \value{ A revised object of class \code{emmGrid} The \code{rbind} method for \code{emm_list} objects simply combines the \code{emmGrid} objects comprising the first element of \code{...}. The result of \code{e1 + e2} is the same as \code{rbind(e1, e2)} \code{force_regular} adds extra (invisible) rows to an \code{emmGrid} object to make it a regular grid (all combinations of factors). This regular structure is needed by \code{emmeans}. An object can become irregular by, for example, subsetting rows, or by obtaining contrasts of a nested structure. } \description{ These functions provide methods for \code{\link[base:cbind]{rbind}} and \code{\link[base:Extract]{[}} that may be used to combine \code{emmGrid} objects together, or to extract a subset of cases. The primary reason for doing this would be to obtain multiplicity-adjusted results for smaller or larger families of tests or confidence intervals. } \note{ \code{rbind} throws an error if there are incompatibilities in the objects' coefficients, covariance structures, etc. But they are allowed to have different factors; a missing level \code{'.'} is added to factors as needed.
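As a sketch of that last point (using the \code{warp.lm} model fitted in the examples below), marginal means for two different factors may be combined into a single family:
\preformatted{
  rbind(emmeans(warp.lm, "wool"), emmeans(warp.lm, "tension"))
}
In the combined result, each row shows level \code{'.'} for the factor that it does not involve.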
} \examples{ warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks) warp.rg <- ref_grid(warp.lm) # Do all pairwise comparisons within rows or within columns, # all considered as one family of tests: w.t <- pairs(emmeans(warp.rg, ~ wool | tension)) t.w <- pairs(emmeans(warp.rg, ~ tension | wool)) rbind(w.t, t.w, adjust = "mvt") update(w.t + t.w, adjust = "fdr") ## same as above except for adjustment ### Working with 'emm_list' objects mod <- lm(conc ~ source + factor(percent), data = pigs) all <- emmeans(mod, list(src = pairwise ~ source, pct = consec ~ percent)) rbind(all, which = c(2, 4), adjust = "mvt") # Show only 3 of the 6 cases summary(warp.rg[c(2, 4, 5)]) # After-the-fact 'at' specification subset(warp.rg, wool == "A") ## or warp.rg |> subset(wool == "A") ### Irregular object tmp <- warp.rg[-1] ## emmeans(tmp, "tension") # will fail because tmp is irregular emmeans(force_regular(tmp), "tension") # will show some results }
emmeans/man/neuralgia.Rd
% Generated by roxygen2: do not edit by hand % Please edit documentation in R/datasets.R \docType{data} \name{neuralgia} \alias{neuralgia} \title{Neuralgia data} \format{ A data frame with 60 observations and 5 variables: \describe{ \item{\code{Treatment}}{Factor with 3 levels \code{A}, \code{B}, and \code{P}. The latter is placebo} \item{\code{Sex}}{Factor with two levels \code{F} and \code{M}} \item{\code{Age}}{Numeric covariate -- patient's age in years} \item{\code{Duration}}{Numeric covariate -- duration of the condition before beginning treatment} \item{\code{Pain}}{Binary response factor with levels \code{No} and \code{Yes}} } } \source{ Cai, Weijie (2014) \emph{Making Comparisons Fair: How LS-Means Unify the Analysis of Linear Models}, SAS Institute, Inc. Technical paper 142-2014, page 12, \url{http://support.sas.com/resources/papers/proceedings14/SAS060-2014.pdf} } \usage{ neuralgia } \description{ These data arise from a study of analgesic effects of treatments of elderly patients who have neuralgia. Two treatments and a placebo are compared. The response variable is whether the patient reported pain or not. Researchers recorded the age and gender of 60 patients along with the duration of complaint before the treatment began. } \examples{ # Model and analysis shown in the SAS report: neuralgia.glm <- glm(Pain ~ Treatment * Sex + Age, family = binomial(), data = neuralgia) pairs(emmeans(neuralgia.glm, ~ Treatment, at = list(Sex = "F")), reverse = TRUE, type = "response", adjust = "bonferroni") } \keyword{datasets}
emmeans/man/emmGrid-class.Rd
% Generated by roxygen2: do not edit by hand % Please edit documentation in R/S4-classes.R \docType{class} \name{emmGrid-class} \alias{emmGrid-class} \title{The \code{emmGrid} class} \description{ The \code{emmGrid} class encapsulates linear functions of regression parameters, defined over a grid of predictors. This includes reference grids and grids of marginal means thereof (aka estimated marginal means). Objects of class \code{emmGrid} may be used independently of the underlying model object. Instances are created primarily by \code{\link{ref_grid}} and \code{\link{emmeans}}, and several related functions. } \section{Slots}{ \describe{ \item{\code{model.info}}{list. Contains the elements \code{call} (the call that produced the model), \code{terms} (its \code{terms} object), and \code{xlev} (factor-level information)} \item{\code{roles}}{list.
Contains at least the elements \code{predictors}, \code{responses}, and \code{multresp}. Each is a character vector of names of these variables.} \item{\code{grid}}{data.frame. Contains the combinations of the variables that define the reference grid. In addition, there is an auxiliary column named \code{".wgt."} holding the observed frequencies or weights for each factor combination (excluding covariates). If the model has one or more \code{\link{offset}()} calls, there is another auxiliary column named \code{".offset."}. Auxiliary columns are not considered part of the reference grid. (However, any variables included in \code{offset} calls \emph{are} in the reference grid.)} \item{\code{levels}}{list. Each entry is a character vector with the distinct levels of each variable in the reference grid. Note that \code{grid} is obtained by applying the function \code{\link{expand.grid}} to this list} \item{\code{matlevs}}{list. Like \code{levels} but has the levels of any matrices in the original dataset. Matrix columns are always concatenated and treated as a single variable for purposes of the reference grid} \item{\code{linfct}}{matrix. Each row consists of the linear function of the regression coefficients for predicting its corresponding element of the reference grid. The rows of this matrix go in one-to-one correspondence with the rows of \code{grid}, and the columns with elements of \code{bhat}.} \item{\code{bhat}}{numeric. The regression coefficients. If there is a multivariate response, the matrix of coefficients is flattened to a single vector, and \code{linfct} and \code{V} redefined appropriately. Important: \code{bhat} must \emph{include} any \code{NA} values produced as a result of collinearity in the predictors. These are taken care of later in the estimability check.} \item{\code{nbasis}}{matrix. The basis for the non-estimable functions of the regression coefficients. Every EMM will correspond to a linear combination of rows of \code{linfct}, and that result must be orthogonal to all the columns of \code{nbasis} in order to be estimable. If everything is estimable, \code{nbasis} should be a 1 x 1 matrix of \code{NA}.} \item{\code{V}}{matrix. The symmetric variance-covariance matrix of \code{bhat}} \item{\code{dffun}}{function having two arguments. \code{dffun(k, dfargs)} should return the degrees of freedom for the linear function \code{sum(k*bhat)}, or \code{NA} if unavailable} \item{\code{dfargs}}{list. Used to hold any additional information needed by \code{dffun}.} \item{\code{misc}}{list. Additional information used by methods. These include at least the following: \code{estName} (the label for the estimates of linear functions), and the default values of \code{infer}, \code{level}, and \code{adjust} to be used in the \code{\link{summary.emmGrid}} method. Elements in this slot may be modified if desired using the \code{\link{update.emmGrid}} method.} \item{\code{post.beta}}{matrix. A sample from the posterior distribution of the regression coefficients, if MCMC methods were used; or a 1 x 1 matrix of \code{NA} otherwise. When it is non-trivial, the \code{\link{as.mcmc.emmGrid}} method returns \code{post.beta \%*\% t(linfct)}, which is a sample from the posterior distribution of the EMMs.} }} \section{Methods}{ All methods for these objects are S3 methods except for \code{show}.
They include \code{\link{[.emmGrid}}, \code{\link{as.glht.emmGrid}}, \code{\link{as.mcmc.emmGrid}}, \code{\link{as.mcmc.list.emmGrid}} (see \pkg{coda}), \code{\link{cld.emmGrid}} (see \pkg{multcomp}), \code{\link{coef.emmGrid}}, \code{\link{confint.emmGrid}}, \code{\link{contrast.emmGrid}}, \code{\link{pairs.emmGrid}}, \code{\link{plot.emmGrid}}, \code{\link{predict.emmGrid}}, \code{\link{print.emmGrid}}, \code{\link{rbind.emmGrid}}, \code{show.emmGrid}, \code{\link{str.emmGrid}}, \code{\link{summary.emmGrid}}, \code{\link{test.emmGrid}}, \code{\link{update.emmGrid}}, \code{\link{vcov.emmGrid}}, and \code{\link{xtable.emmGrid}} }
emmeans/man/pwpm.Rd
% Generated by roxygen2: do not edit by hand % Please edit documentation in R/pwpp.R \name{pwpm} \alias{pwpm} \title{Pairwise P-value matrix (plus other statistics)} \usage{ pwpm(emm, by, reverse = FALSE, pvals = TRUE, means = TRUE, diffs = TRUE, flip = FALSE, digits, ...) } \arguments{ \item{emm}{An \code{emmGrid} object} \item{by}{Character vector of variable(s) in the grid to condition on. These will create different matrices, one for each level or level-combination. If missing, \code{by} is set to \code{emm@misc$by.vars}. Grid factors not in \code{by} are the \emph{primary} factors, whose levels or level combinations are compared pairwise.} \item{reverse}{Logical value passed to \code{\link{pairs.emmGrid}}. Thus, \code{FALSE} specifies \code{"pairwise"} comparisons (earlier vs. later), and \code{TRUE} specifies \code{"revpairwise"} comparisons (later vs. earlier).} \item{pvals}{Logical value. If \code{TRUE}, the P values of the pairwise comparisons are included in each matrix according to \code{flip}.} \item{means}{Logical value. If \code{TRUE}, the estimated marginal means (EMMs) from \code{emm} are included in the matrix diagonal(s).} \item{diffs}{Logical value. If \code{TRUE}, the pairwise differences of the EMMs are included in each matrix according to \code{flip}.} \item{flip}{Logical value that determines where P values and differences are placed. \code{FALSE} places the P values in the upper triangle and differences in the lower, and \code{TRUE} does just the opposite.} \item{digits}{Integer. Number of digits to display. If missing, an optimal number of digits is determined.} \item{...}{Additional arguments passed to \code{\link{contrast.emmGrid}} and \code{\link{summary.emmGrid}}. You should \emph{not} include \code{method} here, because pairwise comparisons are always used.} } \value{ A matrix or \code{list} of matrices, one for each \code{by} level. } \description{ This function presents results from \code{emmeans} and pairwise comparisons thereof in a compact way. It displays a matrix (or matrices) of estimates, pairwise differences, and P values. The user may opt to exclude any of these via arguments \code{means}, \code{diffs}, and \code{pvals}, respectively. To control the direction of the pairwise differences, use \code{reverse}; and to control what appears in the upper and lower triangle(s), use \code{flip}. Optional arguments are passed to \code{contrast.emmGrid} and/or \code{summary.emmGrid}, making it possible to control what estimates and tests are displayed.
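As a schematic of the layout under the defaults (\code{flip = FALSE}), with primary-factor levels L, M, H:
\preformatted{
        L        M        H
  L   [EMM]   <P val>  <P val>
  M   <diff>   [EMM]   <P val>
  H   <diff>  <diff>    [EMM]
}
that is, means on the diagonal, P values above it, and differences below it.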
} \examples{ warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks) warp.emm <- emmeans(warp.lm, ~ tension | wool) pwpm(warp.emm) # use dot options to specify noninferiority tests pwpm(warp.emm, by = NULL, side = ">", delta = 5, adjust = "none") } \seealso{ A graphical display of essentially the same results is available from \code{\link{pwpp}} }
emmeans/man/glht-support.Rd
% Generated by roxygen2: do not edit by hand % Please edit documentation in R/glht-support.R \name{emm} \alias{emm} \alias{glht-support} \alias{glht.emmGrid} \alias{glht.emmlf} \alias{modelparm.emmwrap} \alias{as.glht} \alias{as.glht.emmGrid} \title{Support for \code{multcomp::glht}} \usage{ emm(...) as.glht(object, ...) \method{as.glht}{emmGrid}(object, ...) } \arguments{ \item{...}{In \code{emm}, the \code{specs}, \code{by}, and \code{contr} arguments you would normally supply to \code{\link{emmeans}}. Only \code{specs} is required. Otherwise, arguments that are passed to other methods.} \item{object}{An object of class \code{emmGrid} or \code{emm_list}} } \value{ \code{emm} returns an object of an intermediate class for which there is a \code{multcomp::glht} method. \code{as.glht} returns an object of class \code{glht} or \code{glht_list} according to whether \code{object} is of class \code{emmGrid} or \code{emm_list}. See Details below for more on \code{glht_list}s. } \description{ These functions and methods provide an interface between \pkg{emmeans} and the \code{multcomp::glht} function for simultaneous inference provided by the \pkg{multcomp} package. } \details{ \code{emm} is meant to be called only \emph{from} \code{"glht"} as its second (\code{linfct}) argument. It works similarly to \code{multcomp::mcp}, except with \code{specs} (and optionally \code{by} and \code{contr} arguments) provided as in a call to \code{\link{emmeans}}. } \note{ The multivariate-\eqn{t} routines used by \code{glht} require that all estimates in the family have the same integer degrees of freedom. In cases where that is not true, a message is displayed that shows what df is used. The user may override this via the \code{df} argument. } \section{Details}{ A \code{glht_list} object is simply a \code{list} of \code{glht} objects. It is created as needed -- for example, when there is a \code{by} variable. Appropriate convenience methods \code{coef}, \code{confint}, \code{plot}, \code{summary}, and \code{vcov} are provided, which simply apply the corresponding \code{glht} methods to each member. } \examples{ if(require(multcomp, quietly = TRUE)) withAutoprint({ # --- multcomp must be installed warp.lm <- lm(breaks ~ wool*tension, data = warpbreaks) # Using 'emm' summary(glht(warp.lm, emm(pairwise ~ tension | wool))) # Same, but using an existing 'emmeans' result warp.emm <- emmeans(warp.lm, ~ tension | wool) summary(as.glht(pairs(warp.emm))) # Same contrasts, but treat as one family summary(as.glht(pairs(warp.emm), by = NULL)) }, spaced = TRUE) }
emmeans/man/auto.noise.Rd
% Generated by roxygen2: do not edit by hand % Please edit documentation in R/datasets.R \docType{data} \name{auto.noise} \alias{auto.noise} \title{Auto Pollution Filter Noise} \format{ A data frame with 36 observations on the following 4 variables.
\describe{ \item{\code{noise}}{Noise level in decibels (but see note) -- a numeric vector.} \item{\code{size}}{The size of the vehicle -- an ordered factor with levels \code{S}, \code{M}, \code{L}.} \item{\code{type}}{Type of anti-pollution filter -- a factor with levels \code{Std} and \code{Octel}} \item{\code{side}}{The side of the car where measurement was taken -- a factor with levels \code{L} and \code{R}.} } } \source{ The dataset was obtained from the Data and Story Library (DASL) at Carnegie-Mellon University. Apparently it has since been removed. The original dataset was altered by assigning meaningful names to the factors and sorting the observations in random order as if this were the run order of the experiment. } \usage{ auto.noise } \description{ Three-factor experiment comparing pollution-filter noise for two filters, three sizes of cars, and two sides of the car. } \details{ The data are from a statement by Texaco, Inc., to the Air and Water Pollution Subcommittee of the Senate Public Works Committee on June 26, 1973. Mr. John McKinley, President of Texaco, cited an automobile filter developed by Associated Octel Company as effective in reducing pollution. However, questions had been raised about the effects of filters on vehicle performance, fuel consumption, exhaust gas back pressure, and silencing. On the last question, he referred to the data included here as evidence that the silencing properties of the Octel filter were at least equal to those of standard silencers. } \note{ While the data source claims that \code{noise} is measured in decibels, the values are implausible. I believe that these measurements are actually in tenths of dB (centibels?). Looking at the values in the dataset, note that every measurement ends in 0 or 5, and it is reasonable to believe that measurements are accurate to the nearest half of a decibel. %%% Thanks to an email communication from a speech/hearing scientist } \examples{ # (Based on belief that noise/10 is in decibel units) noise.lm <- lm(noise/10 ~ size * type * side, data = auto.noise) # Interaction plot of predictions emmip(noise.lm, type ~ size | side) # Confidence intervals plot(emmeans(noise.lm, ~ size | side*type)) } \keyword{datasets}
emmeans/man/contrast.Rd
% Generated by roxygen2: do not edit by hand % Please edit documentation in R/contrast.R \name{contrast} \alias{contrast} \alias{contrast.emmGrid} \alias{pairs.emmGrid} \alias{coef.emmGrid} \title{Contrasts and linear functions of EMMs} \usage{ contrast(object, ...) \method{contrast}{emmGrid}(object, method = "eff", interaction = FALSE, by, offset = NULL, scale = NULL, name = "contrast", options = get_emm_option("contrast"), type, adjust, simple, combine = FALSE, ratios = TRUE, parens, ...) \method{pairs}{emmGrid}(x, reverse = FALSE, ...) \method{coef}{emmGrid}(object, ...) } \arguments{ \item{object}{An object of class \code{emmGrid}} \item{...}{Additional arguments passed to other methods} \item{method}{Character value giving the root name of a contrast method (e.g. \code{"pairwise"} -- see \link{emmc-functions}). Alternatively, a function of the same form, or a named \code{list} of coefficients (for a contrast or linear function) that must each conform to the number of results in each \code{by} group. In a multi-factor situation, the factor levels are combined and treated like a single factor.} \item{interaction}{Character vector, logical value, or list. If this is specified, \code{method} is ignored.
See the \dQuote{Interaction contrasts} section below for details.} \item{by}{Character names of variable(s) to be used for ``by'' groups. The contrasts or joint tests will be evaluated separately for each combination of these variables. If \code{object} was created with by groups, those are used unless overridden. Use \code{by = NULL} to use no by groups at all.} \item{offset, scale}{Numeric vectors of the same length as each \code{by} group. The \code{scale} values, if supplied, multiply their respective linear estimates, and any \code{offset} values are added. Scalar values are also allowed. (These arguments are ignored when \code{interaction} is specified.)} \item{name}{Character name to use to override the default label for contrasts used in table headings or subsequent contrasts of the returned object.} \item{options}{If non-\code{NULL}, a named \code{list} of arguments to pass to \code{\link{update.emmGrid}}, just after the object is constructed.} \item{type}{Character: prediction type (e.g., \code{"response"}) -- added to \code{options}} \item{adjust}{Character: adjustment method (e.g., \code{"bonferroni"}) -- added to \code{options}} \item{simple}{Character vector or list: Specify the factor(s) \emph{not} in \code{by}, or a list thereof. See the section below on simple contrasts.} \item{combine}{Logical value that determines what is returned when \code{simple} is a list. See the section on simple contrasts.} \item{ratios}{Logical value determining how log and logit transforms are handled. These transformations are exceptional cases in that there is a valid way to back-transform contrasts: differences of logs are logs of ratios, and differences of logits are odds ratios. If \code{ratios = TRUE} and summarized with \code{type = "response"}, \code{contrast} results are back-transformed to ratios whenever we have true contrasts (coefficients sum to zero). For other transformations, there is no natural way to back-transform contrasts, so even when summarized with \code{type = "response"}, contrasts are computed and displayed on the linear-predictor scale. Similarly, if \code{ratios = FALSE}, log and logit transforms are treated in the same way as any other transformation.} \item{parens}{character or \code{NULL}. If a character value, the labels for levels being contrasted are parenthesized if they match the regular expression in \code{parens[1]} (via \code{\link{grep}}). The default is \code{get_emm_option("parens")}. Optionally, \code{parens} may contain second and third elements specifying what to use for left and right parentheses (default \code{"("} and \code{")"}). Specify \code{parens = NULL} or \code{parens = "a^"} (which won't match anything) to disable all parenthesization.} \item{x}{An \code{emmGrid} object} \item{reverse}{Logical value - determines whether to use \code{"pairwise"} (if \code{FALSE}) or \code{"revpairwise"} (if \code{TRUE}).} } \value{ \code{contrast} and \code{pairs} return an object of class \code{emmGrid}. Its grid will correspond to the levels of the contrasts and any \code{by} variables. The exception is that an \code{\link{emm_list}} object is returned if \code{simple} is a list and \code{combine} is \code{FALSE}. \code{coef} returns a \code{data.frame} containing the object's grid, along with columns named \code{c.1, c.2, ...} containing the contrast coefficients. } \description{ These methods provide for follow-up analyses of \code{emmGrid} objects: Contrasts, pairwise comparisons, tests, and confidence intervals.
They may also be used to compute arbitrary linear functions of predictions or EMMs. } \note{ When \code{object} has a nesting structure (this can be seen via \code{str(object)}), then any grouping factors involved are forced into service as \code{by} variables, and the contrasts are thus computed separately in each nest. This in turn may lead to an irregular grid in the returned \code{emmGrid} object, which may not be valid for subsequent \code{emmeans} calls. } \section{Pairs method}{ The call \code{pairs(object)} is equivalent to \code{contrast(object, method = "pairwise")}; and \code{pairs(object, reverse = TRUE)} is the same as \code{contrast(object, method = "revpairwise")}. } \section{Interaction contrasts}{ When \code{interaction} is specified, interaction contrasts are computed. Specifically, contrasts are generated for each factor separately, one at a time; and these contrasts are applied to the object (the first time around) or to the previous result (subsequently). (Any factors specified in \code{by} are skipped.) The final result comprises contrasts of contrasts, or, equivalently, products of contrasts for the factors involved. Any named elements of \code{interaction} are assigned to contrast methods; others are assigned in order of appearance in \code{object@levels}. The contrast factors in the resulting \code{emmGrid} object are ordered the same as in \code{interaction}. \code{interaction} may be a character vector or list of valid contrast methods (as documented for the \code{method} argument). If the vector or list is shorter than the number needed, it is recycled. Alternatively, if the user specifies \code{interaction = TRUE}, the contrast specified in \code{method} is used for all factors involved. } \section{Simple contrasts}{ \code{simple} is essentially the complement of \code{by}: When \code{simple} is a character vector, \code{by} is set to all the factors in the grid \emph{except} those in \code{simple}. If \code{simple} is a list, each element is used in turn as \code{simple}, and assembled in an \code{"emm_list"}. To generate \emph{all} simple main effects, use \code{simple = "each"} (this works unless there actually is a factor named \code{"each"}). Note that a non-missing \code{simple} will cause \code{by} to be ignored. Ordinarily, when \code{simple} is a list or \code{"each"}, the return value is an \code{\link{emm_list}} object with each entry in correspondence with the entries of \code{simple}. However, with \code{combine = TRUE}, the elements are all combined into one family of contrasts in a single \code{\link[=emmGrid-class]{emmGrid}} object using \code{\link{rbind.emmGrid}}. In that case, the \code{adjust} argument sets the adjustment method for the combined set of contrasts.
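For example, a minimal sketch using the \code{warp.emm} object constructed in the examples below (the choice of \code{"consec"} contrasts and the \code{"mvt"} adjustment is purely illustrative): \preformatted{contrast(warp.emm, "consec", simple = "each", combine = TRUE, adjust = "mvt")} This yields a single family comprising the consecutive comparisons of \code{tension} for each \code{wool} and of \code{wool} for each \code{tension}, with one multiplicity adjustment applied to all of them.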
} \examples{ warp.lm <- lm(breaks ~ wool*tension, data = warpbreaks) warp.emm <- emmeans(warp.lm, ~ tension | wool) contrast(warp.emm, "poly") # inherits 'by = "wool"' from warp.emm pairs(warp.emm) # ditto contrast(warp.emm, "eff", by = NULL) # contrasts of the 6 factor combs pairs(warp.emm, simple = "wool") # same as pairs(warp.emm, by = "tension") # Do all "simple" comparisons, combined into one family pairs(warp.emm, simple = "each", combine = TRUE) \dontrun{ ## Note that the following are NOT the same: contrast(warp.emm, simple = c("wool", "tension")) contrast(warp.emm, simple = list("wool", "tension")) ## The first generates contrasts for combinations of wool and tension ## (same as by = NULL) ## The second generates contrasts for wool by tension, and for ## tension by wool, respectively. } # An interaction contrast for tension:wool tw.emm <- contrast(warp.emm, interaction = c(tension = "poly", wool = "consec"), by = NULL) tw.emm # see the estimates coef(tw.emm) # see the contrast coefficients # Use of scale and offset # an unusual use of the famous stack-loss data... mod <- lm(Water.Temp ~ poly(stack.loss, degree = 2), data = stackloss) (emm <- emmeans(mod, "stack.loss", at = list(stack.loss = 10 * (1:4)))) # Convert results from Celsius to Fahrenheit: confint(contrast(emm, "identity", scale = 9/5, offset = 32)) } emmeans/man/ubds.Rd0000644000176200001440000000271514137062735013713 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/datasets.R \docType{data} \name{ubds} \alias{ubds} \title{Unbalanced dataset} \format{ A data frame with 100 observations, 5 variables, and a special \code{"cells"} attribute: \describe{ \item{A}{Factor with levels 1, 2, and 3} \item{B}{Factor with levels 1, 2, and 3} \item{C}{Factor with levels 1, 2, and 3} \item{x}{A numeric variable} \item{y}{A numeric variable} } In addition, \code{attr(ubds, "cells")} consists of a named list of length 27 with the row numbers for each combination of \code{A, B, C}. For example, \code{attr(ubds, "cells")[["213"]]} has the row numbers corresponding to levels \code{A == 2, B == 1, C == 3}. The entries are ordered by length, so the first entry is the cell with the lowest frequency. } \usage{ ubds } \description{ This is a simulated unbalanced dataset with three factors and two numeric variables. There are true relationships among these variables. This dataset can be useful in testing or illustrating messy-data situations. There are no missing data, and there is at least one observation for every factor combination; however, the \code{"cells"} attribute makes it simple to construct subsets that have empty cells. } \examples{ # Omit the three lowest-frequency cells low3 <- unlist(attr(ubds, "cells")[1:3]) messy.lm <- lm(y ~ (x + A + B + C)^3, data = ubds, subset = -low3) } \keyword{datasets} emmeans/man/make.tran.Rd0000644000176200001440000001614614137062735014641 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/transformations.R \name{make.tran} \alias{make.tran} \title{Response-transformation extensions} \usage{ make.tran(type = c("genlog", "power", "boxcox", "sympower", "asin.sqrt", "bcnPower", "scale"), param = 1, y, ...) } \arguments{ \item{type}{The name of the transformation. See Details.} \item{param}{Numeric parameter needed for the transformation. Optionally, it may be a vector of two numeric values; the second element specifies an alternative base or origin for certain transformations. 
See Details.} \item{y, ...}{Used only with \code{type = "scale"}. These parameters are passed to \code{\link{scale}} to determine \code{param}.} } \value{ A \code{list} having at least the same elements as those returned by \code{\link{make.link}}. The \code{linkfun} component is the transformation itself. } \description{ The \code{make.tran} function creates the needed information to perform transformations of the response variable, including inverting the transformation and estimating variances of back-transformed predictions via the delta method. \code{make.tran} is similar to \code{\link{make.link}}, but it covers additional transformations. The result can be used as an environment in which the model is fitted, or as the \code{tran} argument in \code{\link{update.emmGrid}} (when the given transformation was already applied in an existing model). } \details{ The functions \code{\link{emmeans}}, \code{\link{ref_grid}}, and related ones automatically detect response transformations that are recognized by examining the model formula. These are \code{log}, \code{log2}, \code{log10}, \code{log1p}, \code{sqrt}, \code{logit}, \code{probit}, \code{cauchit}, \code{cloglog}; as well as (for a response variable \code{y}) \code{asin(sqrt(y))}, \code{asinh(sqrt(y))}, and \code{sqrt(y) + sqrt(y+1)}. In addition, any constant multiple of these (e.g., \code{2*sqrt(y)}) is auto-detected and appropriately scaled (see also the \code{tran.mult} argument in \code{\link{update.emmGrid}}). A few additional character strings may be supplied as the \code{tran} argument in \code{\link{update.emmGrid}}: \code{"identity"}, \code{"1/mu^2"}, \code{"inverse"}, \code{"reciprocal"}, \code{"log10"}, \code{"log2"}, \code{"asin.sqrt"}, and \code{"asinh.sqrt"}. More general transformations may be provided as a list of functions and supplied as the \code{tran} argument as documented in \code{\link{update.emmGrid}}. The \code{make.tran} function returns a suitable list of functions for several popular transformations. Besides being usable with \code{update}, the user may use this list as an enclosing environment in fitting the model itself, in which case the transformation is auto-detected when the special name \code{linkfun} (the transformation itself) is used as the response transformation in the call. See the examples below. Most of the transformations available in \code{make.tran} require a parameter, specified in \code{param}; in the following discussion, we use \eqn{p} to denote this parameter, and \eqn{y} to denote the response variable. The \code{type} argument specifies the following transformations: \describe{ \item{\code{"genlog"}}{Generalized logarithmic transformation: \eqn{log(y + p)}, where \eqn{y > -p}} \item{\code{"power"}}{Power transformation: \eqn{y^p}, where \eqn{y > 0}. When \eqn{p = 0}, \code{"log"} is used instead} \item{\code{"boxcox"}}{The Box-Cox transformation (unscaled by the geometric mean): \eqn{(y^p - 1) / p}, where \eqn{y > 0}. When \eqn{p = 0}, \eqn{log(y)} is used.} \item{\code{"sympower"}}{A symmetrized power transformation on the whole real line: \eqn{abs(y)^p * sign(y)}. There are no restrictions on \eqn{y}, but we require \eqn{p > 0} in order for the transformation to be monotone and continuous.} \item{\code{"asin.sqrt"}}{Arcsin-square-root transformation: \eqn{sin^{-1}(\sqrt{y/p})}{sin^(-1)(sqrt(y/p))}.
Typically, the parameter \eqn{p} is equal to 1 for a fraction, or 100 for a percentage.} \item{\code{"bcnPower"}}{Box-Cox with negatives allowed, as described for the \code{bcnPower} function in the \pkg{car} package. It is defined as the Box-Cox transformation \eqn{(z^p - 1) / p} of the variable \eqn{z = y + (y^2+g^2)^(1/2)}. This requires \code{param} to have two elements: the power \eqn{p} and the offset \eqn{g > 0}.} \item{\code{"scale"}}{This one is a little different from the others, in that \code{param} is ignored; instead, \code{param} is determined by calling \code{scale(y, ...)}. The user should give as \code{y} the response variable in the model to be fitted to its scaled version.} } The user may include a second element in \code{param} to specify an alternative origin (other than zero) for the \code{"power"}, \code{"boxcox"}, or \code{"sympower"} transformations. For example, \samp{type = "power", param = c(1.5, 4)} specifies the transformation \eqn{(y - 4)^{1.5}}. In the \code{"genlog"} transformation, a second \code{param} element may be used to specify a base other than the default natural logarithm. For example, \samp{type = "genlog", param = c(.5, 10)} specifies the \eqn{log10(y + .5)} transformation. In the \code{"bcnPower"} transformation, the second element is required and must be positive. For purposes of back-transformation, the \samp{sqrt(y) + sqrt(y+1)} transformation is treated exactly the same way as \samp{2*sqrt(y)}, because both are regarded as estimates of \eqn{2\sqrt\mu}. } \note{ The \code{genlog} transformation is technically unneeded, because a response transformation of the form \code{log(y + c)} is now auto-detected by \code{\link{ref_grid}}. We modify certain \code{\link{make.link}} results in transformations where there is a restriction on valid prediction values, so that reasonable inverse predictions are obtained, no matter what. For example, if a \code{sqrt} transformation was used but a predicted value is negative, the inverse transformation is zero rather than the square of the prediction. A side effect of this is that it is possible for one or both confidence limits, or even a standard error, to be zero. } \examples{ # Fit a model using an oddball transformation: bctran <- make.tran("boxcox", 0.368) warp.bc <- with(bctran, lm(linkfun(breaks) ~ wool * tension, data = warpbreaks)) # Obtain back-transformed LS means: emmeans(warp.bc, ~ tension | wool, type = "response") ### Using a scaled response... # Case where it is auto-detected: fib.lm <- lm(scale(strength) ~ diameter + machine, data = fiber) ref_grid(fib.lm) # Case where scaling is not auto-detected -- and what to do about it: fib.aov <- aov(scale(strength) ~ diameter + Error(machine), data = fiber) fib.rg <- suppressWarnings(ref_grid(fib.aov, at = list(diameter = c(20, 30)))) # Scaling was not retrieved, so we can do: fib.rg = update(fib.rg, tran = make.tran("scale", y = fiber$strength)) emmeans(fib.rg, "diameter") \dontrun{ ### An existing model 'mod' was fitted with a y^(2/3) transformation...
ptran = make.tran("power", 2/3) emmeans(mod, "treatment", tran = ptran) } } emmeans/man/emmeans.Rd0000644000176200001440000002567314137062735014413 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/emmeans.R \name{emmeans} \alias{emmeans} \title{Estimated marginal means (Least-squares means)} \usage{ emmeans(object, specs, by = NULL, fac.reduce = function(coefs) apply(coefs, 2, mean), contr, options = get_emm_option("emmeans"), weights, offset, trend, ..., tran) } \arguments{ \item{object}{An object of class \code{emmGrid}; or a fitted model object that is supported, such as the result of a call to \code{lm} or \code{lmer}. Many fitted-model objects are supported; see \href{../doc/models.html}{\code{vignette("models", "emmeans")}} for details.} \item{specs}{A \code{character} vector specifying the names of the predictors over which EMMs are desired. \code{specs} may also be a \code{formula} or a \code{list} (optionally named) of valid \code{spec}s. Use of formulas is described in the Overview section below.} \item{by}{A character vector specifying the names of predictors to condition on.} \item{fac.reduce}{A function that combines the rows of a matrix into a single vector. This implements the ``marginal averaging'' aspect of EMMs. The default is the mean of the rows. Typically if it is overridden, it would be some kind of weighted mean of the rows. If \code{fac.reduce} is nonlinear, bizarre results are likely, and EMMs will not be interpretable. NOTE: If the \code{weights} argument is non-missing, \code{fac.reduce} is ignored.} \item{contr}{A character value or \code{list} specifying contrasts to be added. See \code{\link{contrast}}. NOTE: \code{contr} is ignored when \code{specs} is a formula.} \item{options}{If non-\code{NULL}, a named \code{list} of arguments to pass to \code{\link{update.emmGrid}}, just after the object is constructed. (Options may also be included in \code{...}; see the \sQuote{options} section below.)} \item{weights}{Character value, numeric vector, or numeric matrix specifying weights to use in averaging predictions. See \dQuote{Weights} section below. Also, if \code{object} is not already a reference grid, \code{weights} (if it is character) is passed to \code{ref_grid} as \code{wt.nuis} in case nuisance factors are specified. We can override this by specifying \code{wt.nuis} explicitly. This more-or-less makes the weighting of nuisance factors consistent with that of primary factors.} \item{offset}{Numeric vector or scalar. If specified, this adds an offset to the predictions, or overrides any offset in the model or its reference grid. If a vector of length differing from the number of rows in the result, it is subsetted or cyclically recycled.} \item{trend}{This is now deprecated. Use \code{\link{emtrends}} instead.} \item{...}{When \code{object} is not already a \code{"emmGrid"} object, these arguments are passed to \code{\link{ref_grid}}. Common examples are \code{at}, \code{cov.reduce}, \code{data}, code{type}, \code{transform}, \code{df}, \code{nesting}, and \code{vcov.}. Model-type-specific options (see \href{../doc/models.html}{\code{vignette("models", "emmeans")}}), commonly \code{mode}, may be used here as well. In addition, if the model formula contains references to variables that are not predictors, you must provide a \code{params} argument with a list of their names. These arguments may also be used in lieu of \code{options}. 
See the \sQuote{Options} section below.} \item{tran}{Placeholder to prevent it from being included in \code{...}. If non-missing, it is added to \code{options}. See the \sQuote{Options} section.} } \value{ When \code{specs} is a \code{character} vector or one-sided formula, an object of class \code{"emmGrid"}. A number of methods are provided for further analysis, including \code{\link{summary.emmGrid}}, \code{\link{confint.emmGrid}}, \code{\link{test.emmGrid}}, \code{\link{contrast.emmGrid}}, and \code{\link{pairs.emmGrid}}. When \code{specs} is a \code{list} or a \code{formula} having a left-hand side, the return value is an \code{\link{emm_list}} object, which is simply a \code{list} of \code{emmGrid} objects. } \description{ Compute estimated marginal means (EMMs) for specified factors or factor combinations in a linear model; and optionally, comparisons or contrasts among them. EMMs are also known as least-squares means. } \details{ Users should also consult the documentation for \code{\link{ref_grid}}, because many important options for EMMs are implemented there, via the \code{...} argument. } \section{Overview}{ Estimated marginal means or EMMs (sometimes called least-squares means) are predictions from a linear model over a \emph{reference grid}; or marginal averages thereof. The \code{\link{ref_grid}} function identifies/creates the reference grid upon which \code{emmeans} is based. For those who prefer the terms \dQuote{least-squares means} or \dQuote{predicted marginal means}, functions \code{lsmeans} and \code{pmmeans} are provided as wrappers. See \code{\link{wrappers}}. If \code{specs} is a \code{formula}, it should be of the form \code{~ specs}, \code{~ specs | by}, \code{contr ~ specs}, or \code{contr ~ specs | by}. The formula is parsed and the variables therein are used as the arguments \code{specs}, \code{by}, and \code{contr} as indicated. The left-hand side is optional, but if specified it should be the name of a contrast family (e.g., \code{pairwise}). Operators like \code{*} or \code{:} are needed in the formula to delineate names, but otherwise are ignored. In the special case where the mean (or weighted mean) of all the predictions is desired, specify \code{specs} as \code{~ 1} or \code{"1"}. A number of standard contrast families are provided. They can be identified as functions having names ending in \code{.emmc} -- see the documentation for \code{\link{emmc-functions}} for details -- including how to write your own \code{.emmc} function for custom contrasts. } \section{Weights}{ If \code{weights} is a vector, its length must equal the number of predictions to be averaged to obtain each EMM. If a matrix, each row of the matrix is used in turn, wrapping back to the first row as needed. When in doubt about what is being averaged (or how many), first call \code{emmeans} with \code{weights = "show.levels"}. If \code{weights} is a string, it should partially match one of the following: \describe{ \item{\code{"equal"}}{Use an equally weighted average.} \item{\code{"proportional"}}{Weight in proportion to the frequencies (in the original data) of the factor combinations that are averaged over.} \item{\code{"outer"}}{Weight in proportion to each individual factor's marginal frequencies.
Thus, the weights for a combination of factors are the outer product of the one-factor margins} \item{\code{"cells"}}{Weight according to the frequencies of the cells being averaged.} \item{\code{"flat"}}{Give equal weight to all cells with data, and ignore empty cells.} \item{\code{"show.levels"}}{This is a convenience feature for understanding what is being averaged over. Instead of a table of EMMs, this causes the function to return a table showing the levels that are averaged over, in the order that they appear.} } Outer weights are like the 'expected' counts in a chi-square test of independence, and will yield the same results as those obtained by proportional averaging with one factor at a time. All except \code{"cells"} use the same set of weights for each mean. In a model where the predicted values are the cell means, cell weights will yield the raw averages of the data for the factors involved. Using \code{"flat"} is similar to \code{"cells"}, except nonempty cells are weighted equally and empty cells are ignored. } \section{Offsets}{ Unlike in \code{ref_grid}, an offset need not be scalar. If not enough values are supplied, they are cyclically recycled. For a vector of offsets, it is important to understand that the ordering of results goes with the first name in \code{specs} varying fastest. If there are any \code{by} factors, those vary slower than all the primary ones, but the first \code{by} variable varies the fastest within that hierarchy. See the examples. } \section{Options and \code{...}}{ Arguments that could go in \code{options} may instead be included in \code{...}, typically, arguments such as \code{type}, \code{infer}, etc. that in essence are passed to \code{\link{summary.emmGrid}}. Arguments in both places are overridden by the ones in \code{...}. There is a danger that \code{...} arguments could partially match those used by both \code{ref_grid} and \code{update.emmGrid}, creating a conflict. If these occur, usually they can be resolved by providing complete (or at least longer) argument names; or by isolating non-\code{ref_grid} arguments in \code{options}; or by calling \code{ref_grid} separately and passing the result as \code{object}. See a not-run example below. Also, when \code{specs} is a two-sided formula, or \code{contr} is specified, there is potential confusion concerning which \code{...} arguments apply to the means, and which to the contrasts. When such confusion is possible, we suggest doing things separately (a call to \code{emmeans} with no contrasts, followed by a call to \code{\link{contrast}}). We do treat \code{adjust} as a special case: it is applied to the \code{emmeans} results \emph{only} if there are no contrasts specified, otherwise it is passed to \code{contrast}. } \examples{ warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks) emmeans (warp.lm, ~ wool | tension) # or equivalently emmeans(warp.lm, "wool", by = "tension") # 'adjust' argument ignored in emmeans, passed to contrast part... emmeans (warp.lm, poly ~ tension | wool, adjust = "sidak") \dontrun{ # 'adjust' argument NOT ignored ... emmeans (warp.lm, ~ tension | wool, adjust = "sidak") } \dontrun{ ### Offsets: Consider a silly example: emmeans(warp.lm, ~ tension | wool, offset = c(17, 23, 47)) @ grid # note that offsets are recycled so that each level of tension receives # the same offset for each wool. # But using the same offsets with ~ wool | tension will probably not # be what you want because the ordering of combinations is different. ### Conflicting arguments...
# This will error because 'tran' is passed to both ref_grid and update emmeans(some.model, "treatment", tran = "log", type = "response") # Use this if the response was a variable that is the log of some other variable # (Keep 'tran' from being passed to ref_grid) emmeans(some.model, "treatment", options = list(tran = "log"), type = "response") # This will re-grid the result as if the response had been log-transformed # ('transform' is passed only to ref_grid, not to update) emmeans(some.model, "treatment", transform = "log", type = "response") } } \seealso{ \code{\link{ref_grid}}, \code{\link{contrast}}, \href{../doc/models.html}{vignette("models", "emmeans")} } emmeans/man/oranges.Rd0000644000176200001440000000315614137062735014414 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/datasets.R \docType{data} \name{oranges} \alias{oranges} \title{Sales of oranges} \format{ A data frame with 36 observations and 6 variables: \describe{ \item{\code{store}}{a factor with levels \code{1} \code{2} \code{3} \code{4} \code{5} \code{6}. The store that was observed.} \item{\code{day}}{a factor with levels \code{1} \code{2} \code{3} \code{4} \code{5} \code{6}. The day the observation was taken (same for each store).} \item{\code{price1}}{a numeric vector. Price of variety 1.} \item{\code{price2}}{a numeric vector. Price of variety 2.} \item{\code{sales1}}{a numeric vector. Sales (per customer) of variety 1.} \item{\code{sales2}}{a numeric vector. Sales (per customer) of variety 2.} } } \source{ This is (or once was) available as a SAS sample dataset. } \usage{ oranges } \description{ This example dataset on sales of oranges has two factors, two covariates, and two responses. There is one observation per factor combination. } \examples{ # Example on p.244 of Littell et al. oranges.lm <- lm(sales1 ~ price1*day, data = oranges) emmeans(oranges.lm, "day") # Example on p.246 of Littell et al. emmeans(oranges.lm, "day", at = list(price1 = 0)) # A more sensible model to consider, IMHO (see vignette("interactions")) org.mlm <- lm(cbind(sales1, sales2) ~ price1 * price2 + day + store, data = oranges) } \references{ Littell, R., Stroup W., Freund, R. (2002) \emph{SAS For Linear Models} (4th edition). SAS Institute. ISBN 1-59047-023-0. } \keyword{datasets} emmeans/man/feedlot.Rd0000644000176200001440000000377614137062735014410 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/datasets.R \docType{data} \name{feedlot} \alias{feedlot} \title{Feedlot data} \format{ A data frame with 67 observations and 4 variables: \describe{ \item{\code{herd}}{a factor with levels \code{9} \code{16} \code{3} \code{32} \code{24} \code{31} \code{19} \code{36} \code{34} \code{35} \code{33}, designating the herd that a feeder calf came from.} \item{\code{diet}}{a factor with levels \code{Low} \code{Medium} \code{High}: the energy level of the diet given the animal.} \item{\code{swt}}{a numeric vector: the weight of the animal at slaughter.} \item{\code{ewt}}{a numeric vector: the weight of the animal at entry to the feedlot.} } } \source{ Urquhart NS (1982) Adjustment in covariates when one factor affects the covariate. \emph{Biometrics} 38, 651-660. } \usage{ feedlot } \description{ This is an unbalanced analysis-of-covariance example, where one covariate is affected by a factor. Feeder calves from various herds enter a feedlot, where they are fed one of three diets. 
The weight of the animal at entry is the covariate, and the weight at slaughter is the response. } \details{ The data arise from a Western Regional Research Project conducted at New Mexico State University. Calves born in 1975 in commercial herds entered a feedlot as yearlings. Both diets and herds are of interest as factors. The covariate, \code{ewt}, is thought to be dependent on \code{herd} due to different genetic backgrounds, breeding history, etc. The levels of \code{herd} are ordered according to similarity of genetic background. Note: There are some empty cells in the cross-classification of \code{herd} and \code{diet}. } \examples{ feedlot.lm <- lm(swt ~ ewt + herd*diet, data = feedlot) # Obtain EMMs with a separate reference value of ewt for each # herd. This reproduces the last part of Table 2 in the reference emmeans(feedlot.lm, ~ diet | herd, cov.reduce = ewt ~ herd) } \keyword{datasets} emmeans/man/nutrition.Rd0000644000176200001440000000350414137062735015006 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/datasets.R \docType{data} \name{nutrition} \alias{nutrition} \title{Nutrition data} \format{ A data frame with 107 observations and 4 variables: \describe{ \item{\code{age}}{a factor with levels \code{1}, \code{2}, \code{3}, \code{4}. Mother's age group.} \item{\code{group}}{a factor with levels \code{FoodStamps}, \code{NoAid}. Whether or not the family receives food stamp assistance.} \item{\code{race}}{a factor with levels \code{Black}, \code{Hispanic}, \code{White}. Mother's race.} \item{\code{gain}}{a numeric vector (the response variable). Gain score (posttest minus pretest) on knowledge of nutrition.} } } \source{ Milliken, G. A. and Johnson, D. E. (1984) \emph{Analysis of Messy Data -- Volume I: Designed Experiments}. Van Nostrand, ISBN 0-534-02713-7. } \usage{ nutrition } \description{ This observational dataset involves three factors, where several factor combinations are missing. It is used as a case study in Milliken and Johnson, Chapter 17, p.202. (You may also find it in the second edition, p.278.) } \details{ A survey was conducted by home economists ``to study how much lower-socioeconomic-level mothers knew about nutrition and to judge the effect of a training program designed to increase their knowledge of nutrition.'' This is a messy dataset with several empty cells. } \examples{ nutr.aov <- aov(gain ~ (group + age + race)^2, data = nutrition) # Summarize predictions for age group 3 nutr.emm <- emmeans(nutr.aov, ~ race * group, at = list(age="3")) emmip(nutr.emm, race ~ group) # Hispanics seem exceptional; but this doesn't test out due to very sparse data pairs(nutr.emm, by = "group") pairs(nutr.emm, by = "race") } \keyword{datasets} emmeans/man/models.Rd0000644000176200001440000000050114137062735014230 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/models.R \name{models} \alias{models} \title{Models supported in \pkg{emmeans}} \description{ Documentation for models has been moved to a vignette. To access it, use \href{../doc/models.html}{\code{vignette("models", "emmeans")}}.
} emmeans/man/ref_grid.Rd0000644000176200001440000005004714137062735014540 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/ref-grid.R \name{ref_grid} \alias{ref_grid} \title{Create a reference grid from a fitted model} \usage{ ref_grid(object, at, cov.reduce = mean, cov.keep = get_emm_option("cov.keep"), mult.names, mult.levs, options = get_emm_option("ref_grid"), data, df, type, transform, nesting, offset, sigma, nuisance = character(0), non.nuisance, wt.nuis = "equal", rg.limit = get_emm_option("rg.limit"), ...) } \arguments{ \item{object}{An object produced by a supported model-fitting function, such as \code{lm}. Many models are supported. See \href{../doc/models.html}{\code{vignette("models", "emmeans")}}.} \item{at}{Optional named list of levels for the corresponding variables} \item{cov.reduce}{A function, logical value, or formula; or a named list of these. Each covariate \emph{not} specified in \code{cov.keep} or \code{at} is reduced according to these specifications. See the section below on \dQuote{Using \code{cov.reduce} and \code{cov.keep}}.} \item{cov.keep}{Character vector: names of covariates that are \emph{not} to be reduced; these are treated as factors and used in weighting calculations. \code{cov.keep} may also include integer value(s), and if so, the maximum of these is used to set a threshold such that any covariate having no more than that many unique values is automatically included in \code{cov.keep}.} \item{mult.names}{Character value: the name(s) to give to the pseudo-factor(s) whose levels delineate the elements of a multivariate response. If this is provided, it overrides the default name(s) used for \code{class(object)} when it has a multivariate response (e.g., the default is \code{"rep.meas"} for \code{"mlm"} objects).} \item{mult.levs}{A named list of levels for the dimensions of a multivariate response. If there is more than one element, the combinations of levels are used, in \code{\link{expand.grid}} order. The (total) number of levels must match the number of dimensions. If \code{mult.name} is specified, this argument is ignored.} \item{options}{If non-\code{NULL}, a named \code{list} of arguments to pass to \code{\link{update.emmGrid}}, just after the object is constructed.} \item{data}{A \code{data.frame} to use to obtain information about the predictors (e.g. factor levels). If missing, then \code{\link{recover_data}} is used to attempt to reconstruct the data. See the note with \code{\link{recover_data}} for an important precaution.} \item{df}{Numeric value. This is equivalent to specifying \code{options(df = df)}. See \code{\link{update.emmGrid}}.} \item{type}{Character value. If provided, this is saved as the \code{"predict.type"} setting. See \code{\link{update.emmGrid}} and the section below on prediction types and transformations.} \item{transform}{Character, logical, or list. If non-missing, the reference grid is reconstructed via \code{\link{regrid}} with the given \code{transform} argument. See the section below on prediction types and transformations.} \item{nesting}{If the model has nested fixed effects, this may be specified here via a character vector or named \code{list} specifying the nesting structure. Specifying \code{nesting} overrides any nesting structure that is automatically detected. See the section below on Recovering or Overriding Model Information.} \item{offset}{Numeric scalar value (if a vector, only the first element is used). 
This may be used to add an offset, or override offsets based on the model. A common usage would be to specify \code{offset = 0} for a Poisson regression model, so that predictions from the reference grid become rates relative to the offset that had been specified in the model.} \item{sigma}{Numeric value to use for subsequent predictions or back-transformation bias adjustments. If not specified, we use \code{sigma(object)}, if available, and \code{NULL} otherwise.} \item{nuisance, non.nuisance, wt.nuis}{If \code{nuisance} is a vector of predictor names, those predictors are omitted from the reference grid. Instead, the result will be as if we had averaged over the levels of those factors, with either equal or proportional weights as specified in \code{wt.nuis} (see the \code{weights} argument in \code{\link{emmeans}}). The factors in \code{nuisance} must not interact with other factors, not even other nuisance factors. Specifying nuisance factors can save considerable storage and computation time, and help avoid exceeding the maximum reference-grid size (\code{get_emm_option("rg.limit")}).} \item{rg.limit}{Integer limit on the number of reference-grid rows to allow (checked before any multivariate responses are included).} \item{...}{Optional arguments passed to \code{\link{summary.emmGrid}}, \code{\link{emm_basis}}, and \code{\link{recover_data}}, such as \code{params}, \code{vcov.} (see \bold{Covariance matrix} below), or options such as \code{mode} for specific model types (see \href{../doc/models.html}{vignette("models", "emmeans")}).} } \value{ An object of the S4 class \code{"emmGrid"} (see \code{\link{emmGrid-class}}). These objects encapsulate everything needed to do calculations and inferences for estimated marginal means, and contain nothing that depends on the model-fitting procedure. } \description{ Using a fitted model object, determine a reference grid for which estimated marginal means are defined. The resulting \code{ref_grid} object encapsulates all the information needed to calculate EMMs and make inferences on them. } \details{ To users, the \code{ref_grid} function itself is important because most of its arguments are in effect arguments of \code{\link{emmeans}} and related functions, in that those functions pass their \code{...} arguments to \code{ref_grid}. The reference grid consists of combinations of independent variables over which predictions are made. Estimated marginal means are defined as these predictions, or marginal averages thereof. The grid is determined by first reconstructing the data used in fitting the model (see \code{\link{recover_data}}), or by using the \code{data.frame} provided in \code{data}. The default reference grid is determined by the observed levels of any factors, the ordered unique values of character-valued predictors, and the results of \code{cov.reduce} for numeric predictors. These may be overridden using \code{at}. See also the section below on recovering/overriding model information. } \note{ The system default for \code{cov.keep} causes models containing indicator variables to be handled differently than in \pkg{emmeans} version 1.4.1 or earlier. To replicate older analyses, change the default via \samp{emm_options(cov.keep = character(0))}. Some earlier versions of \pkg{emmeans} offer a \code{covnest} argument. This is now obsolete; if \code{covnest} is specified, it is harmlessly ignored. Cases where it was needed are now handled appropriately via the code associated with \code{cov.keep}. 
} \section{Using \code{cov.reduce} and \code{cov.keep}}{ The \code{cov.keep} argument was not available in \pkg{emmeans} versions 1.4.1 and earlier. Any covariates named in this list are treated as if they are factors: all the unique levels are kept in the reference grid. The user may also specify an integer value, in which case any covariate having no more than that number of unique values is implicitly included in \code{cov.keep}. The default for \code{cov.keep} is set and retrieved via the \code{\link{emm_options}} framework, and the system default is \code{"2"}, meaning that covariates having only two unique values are automatically treated as two-level factors. See also the Note below on backward compatibility. There is a subtle distinction between including a covariate in \code{cov.keep} and specifying its values manually in \code{at}: Covariates included in \code{cov.keep} are treated as factors for purposes of weighting, while specifying levels in \code{at} will not include the covariate in weighting. See the \code{mtcars.lm} example below for an illustration. \code{cov.reduce} may be a function, logical value, formula, or a named list of these. If a single function, it is applied to each covariate. If logical and \code{TRUE}, \code{mean} is used. If logical and \code{FALSE}, it is equivalent to including all covariates in \code{cov.keep}. Use of \samp{cov.reduce = FALSE} is inadvisable because it can result in a huge reference grid; it is far better to use \code{cov.keep}. If a formula (which must be two-sided), then a model is fitted to that formula using \code{\link{lm}}; then in the reference grid, its response variable is set to the results of \code{\link{predict}} for that model, with the reference grid as \code{newdata}. (This is done \emph{after} the reference grid is determined.) A formula is appropriate here when you think experimental conditions affect the covariate as well as the response. To allow for situations where a simple \code{lm()} call as described above won't be adequate, a formula of the form \code{ext ~ fcnname} is also supported, where the left-hand side may be \code{ext}, \code{extern}, or \code{external} (and must \emph{not} be a predictor name) and the right-hand side is the name of an existing function. The function is called with one argument, a data frame with columns for each variable in the reference grid. The function is expected to use that frame as new data to be used to obtain predictions for one or more models; and it should return a named list or data frame with replacement values for one or more of the covariates. If \code{cov.reduce} is a named list, then the above criteria are used to determine what to do with covariates named in the list. (However, formula elements do not need to be named, as those names are determined from the formulas' left-hand sides.) Any unresolved covariates are reduced using \code{"mean"}. Any \code{cov.reduce} or \code{cov.keep} specification for a covariate also named in \code{at} is ignored. } \section{Interdependent covariates}{ Care must be taken when covariate values depend on one another. For example, when a polynomial model was fitted using predictors \code{x}, \code{x2} (equal to \code{x^2}), and \code{x3} (equal to \code{x^3}), the reference grid will by default set \code{x2} and \code{x3} to their means, which is inconsistent. The user should instead use the \code{at} argument to set these to the square and cube of \code{mean(x)}.
Better yet, fit the model using a formula involving \code{poly(x, 3)} or \code{I(x^2)} and \code{I(x^3)}; then there is only \code{x} appearing as a covariate; it will be set to its mean, and the model matrix will have the correct corresponding quadratic and cubic terms. } \section{Matrix covariates}{ Support for covariates that appear in the dataset as matrices is very limited. If the matrix has but one column, it is treated like an ordinary covariate. Otherwise, with more than one column, each column is reduced to a single reference value -- the result of applying \code{cov.reduce} to each column (averaged together if that produces more than one value); you may not specify values in \code{at}; and they are not treated as variables in the reference grid, except for purposes of obtaining predictions. } \section{Recovering or overriding model information}{ Ability to support a particular class of \code{object} depends on the existence of \code{recover_data} and \code{emm_basis} methods -- see \link{extending-emmeans} for details. The call \code{methods("recover_data")} will help identify these. \bold{Data.} In certain models (e.g., results of \code{\link[lme4]{glmer.nb}}), it is not possible to identify the original dataset. In such cases, we can work around this by setting \code{data} equal to the dataset used in fitting the model, or a suitable subset. Only the complete cases in \code{data} are used, so it may be necessary to exclude some unused variables. Using \code{data} can also help save computing, especially when the dataset is large. In any case, \code{data} must represent all factor levels used in fitting the model. It \emph{cannot} be used as an alternative to \code{at}. (Note: If there is a pattern of \code{NAs} that caused one or more factor levels to be excluded when fitting the model, then \code{data} should also exclude those levels.) \bold{Covariance matrix.} By default, the variance-covariance matrix for the fixed effects is obtained from \code{object}, usually via its \code{\link{vcov}} method. However, the user may override this via a \code{vcov.} argument, specifying a matrix or a function. If a matrix, it must be square, with the same dimension and parameter order as the fixed effects. If a function, it must return a suitable matrix when it is called with arguments \code{(object, ...)}. Be careful with possible unintended conflicts with arguments in \code{...}; for example, \code{sandwich::vcovHAC()} has optional arguments \code{adjust} and \code{weights} that may be intended for \code{emmeans()} but will also be passed to \code{vcov.()}. \bold{Nested factors.} Having a nesting structure affects marginal averaging in \code{emmeans} in that it is done separately for each level (or combination thereof) of the grouping factors. \code{ref_grid} tries to discern which factors are nested in other factors, but it is not always obvious, and if it misses some, the user must specify this structure via \code{nesting}; or later using \code{\link{update.emmGrid}}. The \code{nesting} argument may be a character vector, a named \code{list}, or \code{NULL}. If a \code{list}, each name should be the name of a single factor in the grid, and its entry a character vector of the name(s) of its grouping factor(s). \code{nesting} may also be a character value of the form \code{"factor1 \%in\% (factor2*factor3)"} (the parentheses are optional). If there is more than one such specification, they may be appended separated by commas, or as separate elements of a character vector.
For example, these specifications are equivalent: \code{nesting = list(state = "country", city = c("state", "country"))}, \code{nesting = "state \%in\% country, city \%in\% (state*country)"}, and \code{nesting = c("state \%in\% country", "city \%in\% state*country")}. } \section{Predictors with subscripts and data-set references}{ When the fitted model contains subscripts or explicit references to data sets, the reference grid may optionally be post-processed to simplify the variable names, depending on the \code{simplify.names} option (see \code{\link{emm_options}}), which by default is \code{TRUE}. For example, if the model formula is \code{data1$resp ~ data1$trt + data2[[3]] + data2[["cov"]]}, the simplified predictor names (for use, e.g., in the \code{specs} for \code{\link{emmeans}}) will be \code{trt}, \code{data2[[3]]}, and \code{cov}. Numerical subscripts are not simplified; nor are variables having simplified names that coincide, such as if \code{data2$trt} were also in the model. Please note that this simplification is performed \emph{after} the reference grid is constructed. Thus, non-simplified names must be used in the \code{at} argument (e.g., \code{at = list(`data2["cov"]` = 2:4)}). If you don't want names simplified, use \code{emm_options(simplify.names = FALSE)}. } \section{Prediction types and transformations}{ Transformations can exist because of a link function in a generalized linear model, or as a response transformation, or even both. In many cases, they are auto-detected, for example a model formula of the form \code{sqrt(y) ~ ...}. Even transformations containing multiplicative or additive constants, such as \code{2*sqrt(y + pi) ~ ...}, are auto-detected. A response transformation of \code{y + 1 ~ ...} is \emph{not} auto-detected, but \code{I(y + 1) ~ ...} is interpreted as \code{identity(y + 1) ~ ...}. A warning is issued if it gets too complicated. Complex transformations like the Box-Cox transformation are not auto-detected; but see the help page for \code{\link{make.tran}} for information on some advanced methods. There is a subtle difference between specifying \samp{type = "response"} and \samp{transform = "response"}. While the summary statistics for the grid itself are the same, subsequent use in \code{\link{emmeans}} will yield different results if there is a response transformation or link function. With \samp{type = "response"}, EMMs are computed by averaging together predictions on the \emph{linear-predictor} scale and then back-transforming to the response scale; while with \samp{transform = "response"}, the predictions are already on the response scale so that the EMMs will be the arithmetic means of those response-scale predictions. To add further to the possibilities, \emph{geometric} means of the response-scale predictions are obtainable via \samp{transform = "log", type = "response"}. See also the help page for \code{\link{regrid}}. } \section{Optional side effect}{ If the \code{save.ref_grid} option is set to \code{TRUE} (see \code{\link{emm_options}}), the most recent result of \code{ref_grid}, whether called directly or indirectly via \code{\link{emmeans}}, \code{\link{emtrends}}, or some other function that calls one of these, is saved in the user's environment as \code{.Last.ref_grid}. This facilitates checking what reference grid was used, or reusing the same reference grid for further calculations. This automatic saving is disabled by default, but may be enabled via \samp{emm_options(save.ref_grid = TRUE)}.
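For example, a minimal sketch using the \code{fiber.lm} model fitted in the examples below: \preformatted{emm_options(save.ref_grid = TRUE)
emmeans(fiber.lm, "machine")
.Last.ref_grid    # inspect the reference grid that was just used}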
} \examples{ fiber.lm <- lm(strength ~ machine*diameter, data = fiber) ref_grid(fiber.lm) ref_grid(fiber.lm, at = list(diameter = c(15, 25))) \dontrun{ # We could substitute the sandwich estimator vcovHAC(fiber.lm) # as follows: summary(ref_grid(fiber.lm, vcov. = sandwich::vcovHAC)) } # If we thought that the machines affect the diameters # (admittedly not plausible in this example), then we should use: ref_grid(fiber.lm, cov.reduce = diameter ~ machine) ### Model with indicator variables as predictors: mtcars.lm <- lm(mpg ~ disp + wt + vs * am, data = mtcars) (rg.default <- ref_grid(mtcars.lm)) (rg.nokeep <- ref_grid(mtcars.lm, cov.keep = character(0))) (rg.at <- ref_grid(mtcars.lm, at = list(vs = 0:1, am = 0:1))) # Two of these have the same grid but different weights: rg.default@grid rg.at@grid ### Using cov.reduce formulas... # Above suggests we can vary disp indep. of other factors - unrealistic rg.alt <- ref_grid(mtcars.lm, at = list(wt = c(2.5, 3, 3.5)), cov.reduce = disp ~ vs * wt) rg.alt@grid # Alternative to above where we model sqrt(disp) disp.mod <- lm(sqrt(disp) ~ vs * wt, data = mtcars) disp.fun <- function(dat) list(disp = predict(disp.mod, newdata = dat)^2) rg.alt2 <- ref_grid(mtcars.lm, at = list(wt = c(2.5, 3, 3.5)), cov.reduce = external ~ disp.fun) rg.alt2@grid # Multivariate example MOats.lm = lm(yield ~ Block + Variety, data = MOats) ref_grid(MOats.lm, mult.names = "nitro") # Silly illustration of how to use 'mult.levs' to make comb's of two factors ref_grid(MOats.lm, mult.levs = list(T=LETTERS[1:2], U=letters[1:2])) # Using 'params' require("splines") my.knots = c(2.5, 3, 3.5) mod = lm(Sepal.Length ~ Species * ns(Sepal.Width, knots = my.knots), data = iris) ## my.knots is not a predictor, so need to name it in 'params' ref_grid(mod, params = "my.knots") } \seealso{ Reference grids are of class \code{\link[=emmGrid-class]{emmGrid}}, and several methods exist for them -- for example \code{\link{summary.emmGrid}}. Reference grids are fundamental to \code{\link{emmeans}}. Supported models are detailed in \href{../doc/models.html}{\code{vignette("models", "emmeans")}}. See \code{\link{update.emmGrid}} for details of arguments that can be in \code{options} (or in \code{...}). } emmeans/man/xtable.emmGrid.Rd0000644000176200001440000000440414137062735015615 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/xtable-method.R \name{xtable.emmGrid} \alias{xtable.emmGrid} \alias{xtable.summary_emm} \alias{print.xtable_emm} \title{Using \code{xtable} for EMMs} \usage{ \method{xtable}{emmGrid}(x, caption = NULL, label = NULL, align = NULL, digits = 4, display = NULL, auto = FALSE, ...) \method{xtable}{summary_emm}(x, caption = NULL, label = NULL, align = NULL, digits = 4, display = NULL, auto = FALSE, ...) \method{print}{xtable_emm}(x, type = getOption("xtable.type", "latex"), include.rownames = FALSE, sanitize.message.function = footnotesize, ...) 
} \arguments{ \item{x}{Object of class \code{emmGrid}} \item{caption}{Passed to \code{\link[xtable]{xtableList}}} \item{label}{Passed to \code{xtableList}} \item{align}{Passed to \code{xtableList}} \item{digits}{Passed to \code{xtableList}} \item{display}{Passed to \code{xtableList}} \item{auto}{Passed to \code{xtableList}} \item{...}{Arguments passed to \code{\link{summary.emmGrid}}} \item{type}{Passed to \code{\link[xtable]{print.xtable}}} \item{include.rownames}{Passed to \code{print.xtable}} \item{sanitize.message.function}{Passed to \code{print.xtable}} } \value{ The \code{xtable} methods return an \code{xtable_emm} object, for which its print method is \code{print.xtable_emm}. } \description{ These methods provide support for the \pkg{xtable} package, enabling polished presentations of tabular output from \code{\link{emmeans}} and other functions. } \details{ The methods actually use \code{\link[xtable]{xtableList}}, because of its ability to display messages such as those for P-value adjustments. These methods return an object of class \code{"xtable_emm"} -- an extension of \code{"xtableList"}. Unlike other \code{xtable} methods, the number of digits defaults to 4; and degrees of freedom and \emph{t} ratios are always formatted independently of \code{digits}. The \code{print} method uses \code{\link[xtable:xtableList]{print.xtableList}}, and any \code{\dots} arguments are passed there. } \examples{ pigsint.lm <- lm(log(conc) ~ source * factor(percent), data = pigs) pigsint.emm <- emmeans(pigsint.lm, ~ percent | source) xtable::xtable(pigsint.emm, type = "response") } emmeans/man/as.emmGrid.Rd0000644000176200001440000000467714137062735014745 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/emmeans.R, R/emm-list.R \name{as.list.emmGrid} \alias{as.list.emmGrid} \alias{as.emm_list} \alias{as.emmGrid} \title{Convert to and from \code{emmGrid} objects} \usage{ \method{as.list}{emmGrid}(x, model.info.slot = FALSE, ...) as.emm_list(object, ...) as.emmGrid(object, ...) } \arguments{ \item{x}{An \code{emmGrid} object} \item{model.info.slot}{Logical value: Include the \code{model.info} slot? Set this to \code{TRUE} if you want to preserve the original call and information needed by the \code{submodel} option. If \code{FALSE}, only the nesting information (if any) is saved} \item{...}{In \code{as.emmGrid}, additional arguments passed to \code{\link{update.emmGrid}} before returning the object. This argument is ignored in \code{as.list.emmGrid}} \item{object}{Object to be converted to class \code{emmGrid}. It may be a \code{list} returned by \code{as.list.emmGrid}, or a \code{ref.grid} or \code{lsmobj} object created by \pkg{emmeans}'s predecessor, the \pkg{lsmeans} package. An error is thrown if \code{object} cannot be converted.} } \value{ \code{as.list.emmGrid} returns an object of class \code{list}. \code{as.emm_list} returns an object of class \code{emm_list}. \code{as.emmGrid} returns an object of class \code{emmGrid}. However, in fact, both \code{as.emmGrid} and \code{as.emm_list} check for an attribute in \code{object} to decide whether to return an \code{emmGrid} or \code{emm_list} object. } \description{ These are useful utility functions for creating a compact version of an \code{emmGrid} object that may be saved and later reconstructed, or for converting old \code{ref.grid} or \code{lsmobj} objects into \code{emmGrid} objects.
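For instance, a sketch of a save-and-restore round trip (the file name is hypothetical), using the \code{pigs.lm} model fitted in the examples below: \preformatted{pigs.sav <- as.list(ref_grid(pigs.lm))       # compact version, suitable for saving
saveRDS(pigs.sav, "pigs_rg.rds")             # hypothetical file name
pigs.anew <- as.emmGrid(readRDS("pigs_rg.rds"))  # restore a working emmGrid}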
} \details{ An \code{emmGrid} object is an S4 object, and as such cannot be saved in a text format or saved without a lot of overhead. By using \code{as.list}, the essential parts of the object are converted to a list format that can be easily and compactly saved for use, say, in another session or by another user. Providing this list as the arguments for \code{\link{emmobj}} allows the user to restore a working \code{emmGrid} object. } \examples{ pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs) pigs.sav <- as.list(ref_grid(pigs.lm)) pigs.anew <- as.emmGrid(pigs.sav) emmeans(pigs.anew, "source") } \seealso{ \code{\link{emmobj}} } emmeans/man/extending-emmeans.Rd0000644000176200001440000002321614137062735016365 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/interfacing.R, R/zzz.R \name{extending-emmeans} \alias{extending-emmeans} \alias{recover_data} \alias{recover_data.call} \alias{emm_basis} \alias{.recover_data} \alias{.emm_basis} \alias{.emm_register} \title{Support functions for model extensions} \usage{ recover_data(object, ...) \method{recover_data}{call}(object, trms, na.action, data = NULL, params = "pi", frame, ...) emm_basis(object, trms, xlev, grid, ...) .recover_data(object, ...) .emm_basis(object, trms, xlev, grid, ...) .emm_register(classes, pkgname) } \arguments{ \item{object}{An object of the same class as is supported by a new method.} \item{...}{Additional parameters that may be supported by the method.} \item{trms}{The \code{\link{terms}} component of \code{object} (typically with the response deleted, e.g. via \code{\link{delete.response}})} \item{na.action}{Integer vector of indices of observations to ignore; or \code{NULL} if none} \item{data}{Data frame. Usually, this is \code{NULL}. However, if non-null, this is used in place of the reconstructed dataset. It must have all of the predictors used in the model, and any factor levels must match those used in fitting the model.} \item{params}{Character vector giving the names of any variables in the model formula that are \emph{not} predictors. For example, a spline model may involve a local variable \code{knots} that is not a predictor, but its value is needed to fit the model. Names of parameters not actually used are harmless, and the default value \code{"pi"} (the only numeric constant in base R) is provided in case the model involves it. An example involving splines may be found at \url{https://github.com/rvlenth/emmeans/issues/180}.} \item{frame}{Optional \code{data.frame}. Many model objects contain the model frame used when fitting the model. In cases where there are no predictor transformations, this model frame has all the original predictor values and so is usable for recovering the data. Thus, if \code{frame} is non-missing and \code{data} is \code{NULL}, a check is made on \code{trms} and if there are no function calls, we use \code{data = frame}. This can be helpful because it provides a modicum of security against the possibility that the original data used when fitting the model has been altered or removed.} \item{xlev}{Named list of factor levels (\emph{excluding} ones coerced to factors in the model formula)} \item{grid}{A \code{data.frame} (provided by \code{ref_grid}) containing the predictor settings needed in the reference grid} \item{classes}{Character names of one or more classes to be registered. 
The package must contain the functions \code{recover_data.foo} and \code{emm_basis.foo} for each class \code{foo} listed in \code{classes}.} \item{pkgname}{Character name of package providing the methods (usually should be the second argument of \code{.onLoad})} } \value{ The \code{recover_data} method must return a \code{\link{data.frame}} containing all the variables that appear as predictors in the model, and attributes \code{"call"}, \code{"terms"}, \code{"predictors"}, and \code{"responses"}. (\code{recover_data.call} will provide these attributes.) The \code{emm_basis} method should return a \code{list} with the following elements: \describe{ \item{X}{The matrix of linear functions over \code{grid}, having the same number of rows as \code{grid} and the number of columns equal to the length of \code{bhat}.} \item{bhat}{The vector of regression coefficients for fixed effects. This should \emph{include} any \code{NA}s that result from rank deficiencies.} \item{nbasis}{A matrix whose columns form a basis for non-estimable functions of beta, or a 1x1 matrix of \code{NA} if there is no rank deficiency.} \item{V}{The estimated covariance matrix of \code{bhat}.} \item{dffun}{A function of \code{(k, dfargs)} that returns the degrees of freedom associated with \code{sum(k * bhat)}.} \item{dfargs}{A \code{list} containing additional arguments needed for \code{dffun}.} } %%% end of describe \code{.recover_data} and \code{.emm_basis} are hidden exported versions of \code{recover_data} and \code{emm_basis}, respectively. They run in \pkg{emmeans}'s namespace, thus providing access to all existing methods. } \description{ This documents the methods that \code{\link{ref_grid}} calls. A user or package developer may add \pkg{emmeans} support for a model class by writing \code{recover_data} and \code{emm_basis} methods for that class. (Users in need of a quick way to obtain results for a model that is not supported may be better served by the \code{\link{qdrg}} function.) } \note{ Without an explicit \code{data} argument, \code{recover_data} returns the \emph{current version} of the dataset. If the dataset has changed since the model was fitted, then this will not be the data used to fit the model. It is especially important to know this in simulation studies where the data are randomly generated or permuted, and in cases where several datasets are processed in one step (e.g., using \code{dplyr}). In those cases, users should be careful to provide the actual data used to fit the model in the \code{data} argument. } \section{Details}{ To create a reference grid, the \code{ref_grid} function needs to reconstruct the data used in fitting the model, and then obtain a matrix of linear functions of the regression coefficients for a given grid of predictor values. These tasks are performed by calls to \code{recover_data} and \code{emm_basis}, respectively. A vignette giving details and examples is available via \href{../doc/xtending.html}{vignette("xtending", "emmeans")}. To extend \pkg{emmeans}'s support to additional model types, one need only write S3 methods for these two functions. The existing methods serve as helpful guidance for writing new ones. Most of the work for \code{recover_data} can be done by its method for class \code{"call"}, providing the \code{terms} component and \code{na.action} data as additional arguments. Writing an \code{emm_basis} method is more involved, but the existing methods (e.g., \code{emmeans:::emm_basis.lm}) can serve as models. 
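To fix ideas, here is a rough sketch of what such a pair of methods often looks like. (This is only a hedged outline: the class \code{mymod} is hypothetical, and it is assumed that its objects carry lm-like components -- \code{call}, \code{terms}, \code{na.action}, \code{contrasts}, \code{coef}, \code{vcov}, and \code{df.residual}. Real methods may need considerably more care.)
\preformatted{
## Sketch of emmeans support for a hypothetical class 'mymod'
recover_data.mymod <- function(object, ...) {
    fcall <- object$call
    # Delegate the real work to the method for class "call"
    emmeans::recover_data(fcall, trms = delete.response(terms(object)),
                          na.action = object$na.action, ...)
}

emm_basis.mymod <- function(object, trms, xlev, grid, ...) {
    # Model matrix evaluated at the grid of predictor settings
    m <- model.frame(trms, grid, na.action = na.pass, xlev = xlev)
    X <- model.matrix(trms, m, contrasts.arg = object$contrasts)
    list(X = X,
         bhat = coef(object),
         nbasis = matrix(NA),   # this sketch assumes no rank deficiency
         V = vcov(object),
         dffun = function(k, dfargs) dfargs$df,
         dfargs = list(df = object$df.residual))
}
}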
Certain \code{recover_data} and \code{emm_basis} methods are exported from \pkg{emmeans}. (To find out, do \code{methods("recover_data")}.) If your object is based on another model-fitting object, it may be that all that is needed is to call one of these exported methods and perhaps make modifications to the results. Contact the developer if you need others of these to be exported. If the model has a multivariate response, \code{bhat} needs to be \dQuote{flattened} into a single vector, and \code{X} and \code{V} must be constructed consistently. In models where a non-full-rank result is possible (often, you can tell by seeing if there is a \code{singular.ok} argument in the model-fitting function), \code{\link{summary.emmGrid}} and its relatives check the estimability of each prediction, using the \code{\link[estimability]{nonest.basis}} function in the \pkg{estimability} package. The models already supported are detailed in \href{../doc/models.html}{the "models" vignette}. Some packages may provide additional \pkg{emmeans} support for their object classes. } \section{Communication between methods}{ If the \code{recover_data} method generates information needed by \code{emm_basis}, that information may be incorporated by creating a \code{"misc"} attribute in the returned recovered data. That information is then passed as the \code{misc} argument when \code{ref_grid} calls \code{emm_basis}. } \section{Optional hooks}{ Some models may need something other than standard linear estimates and standard errors. If so, custom functions may be pointed to via the items \code{misc$estHook}, \code{misc$vcovHook} and \code{misc$postGridHook}. If just the name of the hook function is provided as a character string, then it is retrieved using \code{\link{get}}. The \code{estHook} function should have arguments \samp{(object, do.se, tol, ...)} where \code{object} is the \code{emmGrid} object, \code{do.se} is a logical flag for whether to return the standard error, and \code{tol} is the tolerance for assessing estimability. It should return a matrix with 3 columns: the estimates, standard errors (\code{NA} when \code{do.se==FALSE}), and degrees of freedom (\code{NA} for asymptotic). The number of rows should match the number of rows of \samp{object@linfct}. The \code{vcovHook} function should have arguments \samp{(object, tol, ...)} as described. It should return the covariance matrix for the estimates. Finally, \code{postGridHook}, if present, is called at the very end of \code{ref_grid}; it takes one argument, the constructed \code{object}, and should return a suitably modified \code{emmGrid} object. } \section{Registering S3 methods for a model class}{ The \code{.emm_register} function is provided as a convenience to conditionally register your S3 methods for a model class, \code{recover_data.foo} and \code{emm_basis.foo}, where \code{foo} is the class name. Your package should implement an \code{.onLoad} function and call \code{.emm_register} if \pkg{emmeans} is installed. See the example. 
} \examples{ \dontrun{ #--- If your package provides recover_data and emm_basis methods for class 'mymod', #--- put something like this in your package code -- say in zzz.R: .onLoad = function(libname, pkgname) { if (requireNamespace("emmeans", quietly = TRUE)) emmeans::.emm_register("mymod", pkgname) } } } \seealso{ \href{../doc/xtending.html}{Vignette on extending emmeans} } 
emmeans/man/pigs.Rd0000644000176200001440000000220514137062735013712 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/datasets.R \docType{data} \name{pigs} \alias{pigs} \title{Effects of dietary protein on free plasma leucine concentration in pigs} \format{ A data frame with 29 observations and 3 variables: \describe{ \item{source}{Source of protein in the diet (factor with 3 levels: fish meal, soybean meal, dried skim milk)} \item{percent}{Protein percentage in the diet (numeric with 4 values: 9, 12, 15, and 18)} \item{conc}{Concentration of free plasma leucine, in mcg/ml} } } \source{ Windels HF (1964) PhD thesis, Univ. of Minnesota. (Reported as Problem 10.8 in Oehlert G (2000) \emph{A First Course in Design and Analysis of Experiments}, licensed under Creative Commons, \url{http://users.stat.umn.edu/~gary/Book.html}.) Observations 7, 22, 23, 31, 33, and 35 have been omitted, creating a more notable imbalance. } \usage{ pigs } \description{ A two-factor experiment with some observations lost } \examples{ pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs) emmeans(pigs.lm, "source") } \keyword{datasets} 
emmeans/man/emmGrid-methods.Rd0000644000176200001440000000227114137062735016000 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/emmGrid-methods.R \name{str.emmGrid} \alias{str.emmGrid} \alias{print.emmGrid} \alias{vcov.emmGrid} \title{Miscellaneous methods for \code{emmGrid} objects} \usage{ \method{str}{emmGrid}(object, ...) \method{print}{emmGrid}(x, ..., export = FALSE) \method{vcov}{emmGrid}(object, ...) } \arguments{ \item{object}{An \code{emmGrid} object} \item{...}{(required but not used)} \item{x}{An \code{emmGrid} object} \item{export}{Logical value. If \code{FALSE}, the object is printed. If \code{TRUE}, a list is invisibly returned, which contains character elements named \code{summary} and \code{annotations} that may be saved or displayed as the user sees fit. \code{summary} is a character matrix (or list of such matrices, if a \code{by} variable is in effect). \code{annotations} is a character vector of the annotations that would have been printed below the summary or summaries.} } \value{ The \code{vcov} method returns a symmetric matrix of variances and covariances for \code{predict.emmGrid(object, type = "lp")} } \description{ Miscellaneous methods for \code{emmGrid} objects } 
emmeans/man/mvcontrast.Rd0000644000176200001440000000702314137062735015153 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/multiv.R \name{mvcontrast} \alias{mvcontrast} \title{Multivariate contrasts} \usage{ mvcontrast(object, method = "eff", mult.name = object@roles$multresp, null = 0, by = object@misc$by.vars, adjust = c("sidak", p.adjust.methods), show.ests = FALSE, ...) } \arguments{ \item{object}{An object of class \code{emmGrid}} \item{method}{A contrast method, per \code{\link{contrast.emmGrid}}} \item{mult.name}{Character vector of names of the factors whose levels define the multivariate means to contrast. 
If the model itself has a multivariate response, that is what is used. Otherwise, \code{mult.name} \emph{must} be specified.} \item{null}{Scalar or conformable vector of null-hypothesis values to test against} \item{by}{Any \code{by} variable(s). These should not include the primary variables to be contrasted. For convenience, the \code{by} variable is nulled-out if it would result in no primary factors being contrasted.} \item{adjust}{Character value of a multiplicity adjustment method (\code{"none"} for no adjustment). The available adjustment methods are more limited than in \code{contrast}, and any default adjustment returned via \code{method} is ignored.} \item{show.ests}{Logical flag determining whether the multivariate means are displayed} \item{...}{Additional arguments passed to \code{contrast}} } \value{ An object of class \code{summary_emm} containing the multivariate test results; or a list of the estimates and the tests if \code{show.ests} is \code{TRUE}. The test results include the Hotelling \eqn{T^2} statistic, \eqn{F} ratios, degrees of freedom, and \eqn{P} values. } \description{ This function displays tests of multivariate comparisons or contrasts. The contrasts are constructed at each level of the variable in \code{mult.name}, and then we do a multivariate test that the vector of estimates is equal to \code{null} (zero by default). The \emph{F} statistic and degrees of freedom are determined via the Hotelling distribution. That is, if there are \eqn{m} error degrees of freedom and multivariate dimensionality \eqn{d}, then the resulting \eqn{F} statistic has degrees of freedom \eqn{(d, m - d + 1)} as shown in Hotelling (1931). } \note{ If some interactions among the primary and \code{mult.name} factors are absent, the covariance of the multivariate means is singular; this situation is accommodated, but the result has reduced degrees of freedom and a message is displayed. If there are other abnormal conditions such as non-estimable results, estimates are shown as \code{NA}. While designed primarily for testing contrasts, multivariate tests of the mean vector itself can be implemented via \code{method = "identity"} (see the examples). } \examples{ MOats.lm <- lm(yield ~ Variety + Block, data = MOats) MOats.emm <- emmeans(MOats.lm, ~ Variety | rep.meas) mvcontrast(MOats.emm, "consec", show.ests = TRUE) # mult.name defaults to rep.meas # Test each mean against a specified null vector mvcontrast(MOats.emm, "identity", name = "Variety", null = c(80, 100, 120, 140), adjust = "none") # (Note 'name' is passed to contrast() and overrides default name "contrast") # 'mult.name' need not refer to a multivariate response mvcontrast(MOats.emm, "trt.vs.ctrl1", mult.name = "Variety") } \references{ Hotelling, Harold (1931) "The generalization of Student's ratio", \emph{Annals of Mathematical Statistics} 2(3), 360–378. doi:10.1214/aoms/1177732979 } 
emmeans/man/emmeans-package.Rd0000644000176200001440000001247014137062735015773 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/emmeans-package.R \name{emmeans-package} \alias{emmeans-package} \title{Estimated marginal means (aka Least-squares means)} \description{ This package provides methods for obtaining estimated marginal means (EMMs, also known as least-squares means) for factor combinations in a variety of models. Supported models include [generalized linear] models, models for counts, multivariate, multinomial and ordinal responses, survival models, GEEs, and Bayesian models. 
For the latter, posterior samples of EMMs are provided. The package can compute contrasts or linear combinations of these marginal means with various multiplicity adjustments. One can also estimate and contrast slopes of trend lines. Some graphical displays of these results are provided. } \section{Overview}{ \describe{ \item{Vignettes}{A number of vignettes are provided to help the user get acquainted with the \pkg{emmeans} package and see some examples.} \item{Concept}{Estimated marginal means (see Searle \emph{et al.} 1980) are popular for summarizing linear models that include factors. For balanced experimental designs, they are just the marginal means. For unbalanced data, they in essence estimate the marginal means you \emph{would} have observed, had the data arisen from a balanced experiment. These techniques were originally developed in a least-squares context, which is why the results are sometimes referred to as \dQuote{least-squares means}. Since those early developments, the concept has expanded far beyond least-squares settings.} \item{Reference grids}{ The implementation in \pkg{emmeans} relies on our own concept of a \emph{reference grid}, which is an array of factor and predictor levels. Predictions are made on this grid, and estimated marginal means (or EMMs) are defined as averages of these predictions over zero or more dimensions of the grid. The function \code{\link{ref_grid}} explicitly creates a reference grid that can subsequently be used to obtain least-squares means. The object returned by \code{ref_grid} is of class \code{"emmGrid"}, the same class as is used for estimated marginal means (see below). Our reference-grid framework expands slightly upon Searle \emph{et al.}'s definitions of EMMs, in that it is possible to include multiple levels of covariates in the grid. } \item{Models supported}{As is mentioned in the package description, many types of models are supported by the package. See \href{../doc/models.html}{vignette("models", "emmeans")} for full details. Some models may require other packages be installed in order to access all of the available features. For models not explicitly supported, it may still be possible to do basic post hoc analyses of them via the \code{\link{qdrg}} function.} \item{Estimated marginal means}{ The \code{\link{emmeans}} function computes EMMs given a fitted model (or a previously constructed \code{emmGrid} object), using a specification indicating what factors to include. The \code{\link{emtrends}} function creates the same sort of results for estimating and comparing slopes of fitted lines. Both return an \code{emmGrid} object.} \item{Summaries and analysis}{ The \code{\link{summary.emmGrid}} method may be used to display an \code{emmGrid} object. Special-purpose summaries are available via \code{\link{confint.emmGrid}} and \code{\link{test.emmGrid}}, the latter of which can also do a joint test of several estimates. The user may specify by variables, multiplicity-adjustment methods, confidence levels, etc., and if a transformation or link function is involved, may reverse-transform the results to the response scale.} \item{Contrasts and comparisons}{ The \code{\link{contrast}} method for \code{emmGrid} objects is used to obtain contrasts among the estimates; several standard contrast families are available such as deviations from the mean, polynomial contrasts, and comparisons with one or more controls. Another \code{emmGrid} object is returned, which can be summarized or further analyzed. 
For convenience, a \code{pairs.emmGrid} method is provided for the case of pairwise comparisons. } \item{Graphs}{The \code{\link{plot.emmGrid}} method will display side-by-side confidence intervals for the estimates, and/or \dQuote{comparison arrows} whereby the \emph{P} values of pairwise differences can be observed by how much the arrows overlap. The \code{\link{emmip}} function displays estimates like an interaction plot, multi-paneled if there are by variables. These graphics capabilities require that the \pkg{lattice} package be installed.} \item{MCMC support}{When a model is fitted using MCMC methods, the posterior chain(s) of parameter estimates are retained and converted into posterior samples of EMMs or contrasts thereof. These may then be summarized or plotted like any other MCMC results, using tools in, say, \pkg{coda} or \pkg{bayesplot}.} \item{\pkg{multcomp} interface}{The \code{\link{as.glht}} function and \code{glht} method for \code{emmGrid}s provide an interface to the \code{glht} function in the \pkg{multcomp} package, thus providing for more exacting simultaneous estimation or testing. The package also provides an \code{\link{emm}} function that works as an alternative to \code{mcp} in a call to \code{glht}. } } %%% end describe } 
emmeans/man/MOats.Rd0000644000176200001440000000316714137062735014003 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/datasets.R \docType{data} \name{MOats} \alias{MOats} \title{Oats data in multivariate form} \format{ A data frame with 18 observations and 3 variables: \describe{ \item{\code{Variety}}{a factor with levels \code{Golden Rain}, \code{Marvellous}, \code{Victory}} \item{\code{Block}}{an ordered factor with levels \code{VI} < \code{V} < \code{III} < \code{IV} < \code{II} < \code{I}} \item{\code{yield}}{a matrix with 4 columns, giving the yields with nitrogen concentrations of 0, .2, .4, and .6.} } } \source{ The dataset \code{\link[nlme]{Oats}} in the \pkg{nlme} package. } \usage{ MOats } \description{ This is the \code{Oats} dataset provided in the \pkg{nlme} package, but it is rearranged as one multivariate observation per plot. } \details{ These data arise from a split-plot experiment reported by Yates (1935) and used as an example in Pinheiro and Bates (2000) and other texts. Six blocks were divided into three whole plots, randomly assigned to the three varieties of oats. The whole plots were each divided into 4 split plots and randomized to the four concentrations of nitrogen. } \examples{ MOats.lm <- lm (yield ~ Block + Variety, data = MOats) MOats.rg <- ref_grid (MOats.lm, mult.name = "nitro") emmeans(MOats.rg, ~ nitro | Variety) } \references{ Pinheiro, J. C. and Bates D. M. (2000) \emph{Mixed-Effects Models in S and S-PLUS}, Springer, New York. (Appendix A.15) Yates, F. (1935) Complex experiments, \emph{Journal of the Royal Statistical Society} Suppl. 2, 181-247. } \keyword{datasets} 
emmeans/man/pwpp.Rd0000644000176200001440000001260314137062735013741 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/pwpp.R \name{pwpp} \alias{pwpp} \title{Pairwise P-value plot} \usage{ pwpp(emm, method = "pairwise", by, sort = TRUE, values = TRUE, rows = ".", xlab, ylab, xsub = "", plim = numeric(0), add.space = 0, aes, ...) } \arguments{ \item{emm}{An \code{emmGrid} object} \item{method}{Character or list. Passed to \code{\link{contrast}}, and defines the contrasts to be displayed. 
Any contrast method may be used, provided that each contrast includes one coefficient of \code{1}, one coefficient of \code{-1}, and the rest \code{0}. That is, calling \code{contrast(object, method)} produces a set of comparisons, each with one estimate minus another estimate.} \item{by}{Character vector of variable(s) in the grid to condition on. These will create different panels, one for each level or level-combination. Grid factors not in \code{by} are the \emph{primary} factors, whose levels or level combinations are compared pairwise.} \item{sort}{Logical value. If \code{TRUE}, levels of the factor combinations are ordered by their marginal means. If \code{FALSE}, they appear in order based on the existing ordering of the factor levels involved. Note that the levels are ordered the same way in all panels, and in many cases this implies that the means in any particular panel will \emph{not} be ordered even when \code{sort = TRUE}.} \item{values}{Logical value. If \code{TRUE}, the values of the EMMs are included in the plot. When there are several side-by-side panels due to \code{by} variable(s), the labels showing values start stealing a lot of space from the plotting area; in those cases, it may be desirable to specify \code{FALSE} or use \code{rows} so that some panels are vertically stacked.} \item{rows}{Character vector of which \code{by} variable(s) are used to define rows of the panel layout. Those variables in \code{by} not included in \code{rows} define columns in the array of panels. A \code{"."} indicates that only one row is used, so all panels are arranged side-by-side.} \item{xlab}{Character label to use in place of the default for the P-value axis.} \item{ylab}{Character label to use in place of the default for the primary-factor axis.} \item{xsub}{Character label used as caption at the lower right of the plot.} \item{plim}{numeric vector of value(s) between 0 and 1. These are included among the observed P values so that the range of tick marks includes at least the range of \code{plim}. Choosing \code{plim = c(0,1)} will ensure the widest possible range.} \item{add.space}{Numeric value to adjust amount of space used for value labels. Positioning of value labels is tricky, and depends on the number of panels and the physical size of the plotting region. This parameter allows the user to adjust the position. Changing it by one unit should shift the position by about one character width (right if positive, left if negative). Note that this interacts with \code{aes$label} below.} \item{aes}{optional named list of lists. Entries considered are \code{point}, \code{segment}, and \code{label}, and contents are passed to the respective \code{ggplot2::geom_xxx()} functions. These affect rendering of points, line segments joining them, and value labels. Defaults are \code{point = list(size = 2)}, \code{segment = list()}, and \code{label = list(size = 2.5)}.} \item{...}{Additional arguments passed to \code{contrast} and \code{\link{summary.emmGrid}}, as well as to \code{geom_segment} and \code{geom_label}} } \description{ Constructs a plot of P values associated with pairwise comparisons of estimated marginal means. } \details{ Factor levels (or combinations thereof) are plotted on the vertical scale, and P values are plotted on the horizontal scale. Each P value is plotted twice -- at vertical positions corresponding to the levels being compared -- and connected by a line segment. Thus, it is easy to visualize which P values are small and large, and which levels are compared. 
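For instance (a small sketch -- it assumes the \pkg{ggplot2} package is installed, and uses the \code{pigs} dataset supplied with this package), panels can be stacked via \code{rows} and the widest possible P-value scale requested via \code{plim}:
\preformatted{
pigs.lm <- lm(log(conc) ~ source * factor(percent), data = pigs)
emm <- emmeans(pigs.lm, ~ percent | source)
# one column of vertically arranged panels, full P scale, no value labels
pwpp(emm, by = "source", rows = "source", plim = c(0, 1), values = FALSE)
}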
In addition, factor levels are color-coded, and the points and half-line segments appear in the color of the other level. The P-value scale is nonlinear, so as to stretch out smaller P values and compress larger ones. P values smaller than 0.0004 are altered and plotted in a way that makes them more distinguishable from one another. If \code{xlab}, \code{ylab}, and \code{xsub} are not provided, reasonable labels are created. \code{xsub} is used to note special features; e.g., equivalence thresholds or one-sided tests. } \note{ The \pkg{ggplot2} and \pkg{scales} packages must be installed in order for \code{pwpp} to work. Additional plot aesthetics are available by adding them to the returned object; see the examples. } \examples{ pigs.lm <- lm(log(conc) ~ source * factor(percent), data = pigs) emm = emmeans(pigs.lm, ~ percent | source) pwpp(emm) pwpp(emm, method = "trt.vs.ctrl1", type = "response", side = ">") # custom aesthetics: my.aes <- list(point = list(shape = "square"), segment = list(linetype = "dashed", color = "red"), label = list(family = "serif", fontface = "italic")) my.pal <- c("darkgreen", "blue", "magenta", "orange") pwpp(emm, aes = my.aes) + ggplot2::scale_color_manual(values = my.pal) } \seealso{ A numerical display of essentially the same results is available from \code{\link{pwpm}} } 
emmeans/man/regrid.Rd0000644000176200001440000001404214137062735014226 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/emmGrid-methods.R \name{regrid} \alias{regrid} \title{Reconstruct a reference grid with a new transformation or simulations} \usage{ regrid(object, transform = c("response", "mu", "unlink", "none", "pass", links), inv.link.lbl = "response", predict.type, bias.adjust = get_emm_option("back.bias.adj"), sigma, N.sim, sim = mvtnorm::rmvnorm, ...) } \arguments{ \item{object}{An object of class \code{emmGrid}} \item{transform}{Character, list, or logical value. If \code{"response"}, \code{"mu"}, or \code{TRUE}, the inverse transformation is applied to the estimates in the grid (but if there is both a link function and a response transformation, \code{"mu"} back-transforms only the link part); if \code{"none"} or \code{FALSE}, \code{object} is re-gridded so that its \code{bhat} slot contains \code{predict(object)} and its \code{linfct} slot is the identity. Any internal transformation information is preserved. If \code{transform = "pass"}, the object is not re-gridded in any way (this may be useful in conjunction with \code{N.sim}). If \code{transform} is a character value in \code{links} (which is the set of valid arguments for the \code{\link{make.link}} function, excepting \code{"identity"}), or if \code{transform} is a list of the same form as returned by \code{make.link} or \code{\link{make.tran}}, the results are formulated as if the response had been transformed with that link function.} \item{inv.link.lbl}{Character value. This applies only when \code{transform} is in \code{links}, and is used to label the predictions if subsequently summarized with \code{type = "response"}.} \item{predict.type}{Character value. If provided, the returned object is updated with the given type to use by default by \code{summary.emmGrid} (see \code{\link{update.emmGrid}}). This may be useful, for example, when one specifies \code{transform = "log"} but desires summaries to be produced by default on the response scale.} \item{bias.adjust}{Logical value for whether to adjust for bias in back-transforming (\code{transform = "response"}). 
This requires a value of \code{sigma} to exist in the object or be specified.} \item{sigma}{Error SD assumed for bias correction (when \code{transform = "response"} and a transformation is in effect). If not specified, \code{object@misc$sigma} is used, and an error is thrown if it is not found.} \item{N.sim}{Integer value. If specified and \code{object} is based on a frequentist model (i.e., does not have a posterior sample), then a fake posterior sample is generated using the function \code{sim}.} \item{sim}{A function of three arguments (no names are assumed). If \code{N.sim} is supplied with a frequentist model, this function is called with respective arguments \code{N.sim}, \code{object@bhat}, and \code{object@V}. The default is the multivariate normal distribution.} \item{...}{Ignored.} } \value{ An \code{emmGrid} object with the requested changes } \description{ The typical use of this function is to cause EMMs to be computed on a different scale, e.g., the back-transformed scale rather than the linear-predictor scale. In other words, if you want back-transformed results, do you want to average and then back-transform, or back-transform and then average? } \details{ The \code{regrid} function reparameterizes an existing \code{ref.grid} so that its \code{linfct} slot is the identity matrix and its \code{bhat} slot consists of the estimates at the grid points. If \code{transform} is \code{TRUE}, the inverse transform is applied to the estimates. Outwardly, when \code{transform = "response"}, the result of \code{\link{summary.emmGrid}} after applying \code{regrid} is identical to the summary of the original object using \samp{type="response"}. But subsequent EMMs or contrasts will be conducted on the new scale -- which is the reason this function exists. This function may also be used to simulate a sample of regression coefficients for a frequentist model for subsequent use as though it were a Bayesian model. To do so, specify a value for \code{N.sim} and a sample is simulated using the function \code{sim}. The grid may be further processed in accordance with the other arguments; or if \code{transform = "pass"}, it is simply returned with the only change being the addition of the simulated sample. } \note{ Another way to use \code{regrid} is to supply a \code{transform} argument to \code{\link{ref_grid}} (either directly or indirectly via \code{\link{emmeans}}). This is often a simpler approach if the reference grid has not already been constructed. } \section{Degrees of freedom}{ In cases where the degrees of freedom depend on the linear function being estimated (e.g., Satterthwaite method), the d.f. from the reference grid are saved, and a kind of \dQuote{containment} method is substituted in the returned object, whereby the calculated d.f. for a new linear function will be the minimum d.f. among those having nonzero coefficients. This is kind of an \emph{ad hoc} method, and it can over-estimate the degrees of freedom in some cases. An annotation is displayed below any subsequent summary results stating that the degrees-of-freedom method is inherited from the previous method at the time of re-gridding. 
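As an illustration -- a sketch only, since the model, the dataset \code{mydata}, and the use of the \pkg{lmerTest} package here are all hypothetical:
\preformatted{
# fm <- lmerTest::lmer(y ~ treat + (1 | block), data = mydata)
# rg <- regrid(ref_grid(fm))  # Satterthwaite d.f. are saved at re-gridding
# emmeans(rg, "treat")        # summaries note the inherited d.f. method
}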
} \examples{ pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs) rg <- ref_grid(pigs.lm) # This will yield EMMs as GEOMETRIC means of concentrations: (emm1 <- emmeans(rg, "source", type = "response")) pairs(emm1) ## We obtain RATIOS # This will yield EMMs as ARITHMETIC means of concentrations: (emm2 <- emmeans(regrid(rg, transform = "response"), "source")) pairs(emm2) ## We obtain DIFFERENCES # Same result, useful if we hadn't already created 'rg' # emm2 <- emmeans(pigs.lm, "source", transform = "response") # Simulate a sample of regression coefficients set.seed(2.71828) rgb <- regrid(rg, N.sim = 200, transform = "pass") emmeans(rgb, "source", type = "response") ## similar to emm1 } 
emmeans/man/fiber.Rd0000644000176200001440000000240214137062735014036 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/datasets.R \docType{data} \name{fiber} \alias{fiber} \title{Fiber data} \format{ A data frame with 15 observations and 3 variables: \describe{ \item{\code{machine}}{a factor with levels \code{A} \code{B} \code{C}. This is the primary factor of interest.} \item{\code{strength}}{a numeric vector. The response variable.} \item{\code{diameter}}{a numeric vector. A covariate.} } } \source{ Montgomery, D. C. (2013) \emph{Design and Analysis of Experiments} (8th ed.). John Wiley and Sons, ISBN 978-1-118-14692-7. } \usage{ fiber } \description{ Fiber data from Montgomery's \emph{Design and Analysis of Experiments} (8th ed.), p. 656 (Table 15.10). Useful as a simple analysis-of-covariance example. } \details{ The goal of the experiment is to compare the mean breaking strength of fibers produced by the three machines. When testing this, the technician also measured the diameter of each fiber, and this measurement may be used as a concomitant variable to improve precision of the estimates. } \examples{ fiber.lm <- lm(strength ~ diameter + machine, data=fiber) ref_grid(fiber.lm) # Covariate-adjusted means and comparisons emmeans(fiber.lm, pairwise ~ machine) } \keyword{datasets} 
emmeans/man/hpd.summary.Rd0000644000176200001440000000401514137062735015220 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/MCMC-support.R \name{hpd.summary} \alias{hpd.summary} \title{Summarize an emmGrid from a Bayesian model} \usage{ hpd.summary(object, prob, by, type, point.est = median, bias.adjust = get_emm_option("back.bias.adj"), sigma, ...) } \arguments{ \item{object}{an \code{emmGrid} object having a non-missing \code{post.beta} slot} \item{prob}{numeric probability content for HPD intervals (note: when not specified, the current \code{level} option is used; see \code{\link{emm_options}})} \item{by}{factors to use as \code{by} variables} \item{type}{prediction type as in \code{\link{summary.emmGrid}}} \item{point.est}{function to use to compute the point estimates from the posterior sample for each grid point} \item{bias.adjust}{Logical value for whether to adjust for bias in back-transforming (\code{type = "response"}). This requires a value of \code{sigma} to exist in the object or be specified.} \item{sigma}{Error SD assumed for bias correction (when \code{type = "response"}). If not specified, \code{object@misc$sigma} is used, and an error is thrown if it is not found. 
\emph{Note:} \code{sigma} may be a vector, as long as it conforms to the number of observations in the posterior sample.} \item{...}{required but not used} } \value{ an object of class \code{summary_emm} } \description{ This function computes point estimates and HPD intervals for each factor combination in \code{object@emmGrid}. While this function may be called independently, it is called automatically by the S3 method \code{\link{summary.emmGrid}} when the object is based on a Bayesian model. (Note: the \code{level} argument, or its default, is passed as \code{prob}). } \examples{ if(require("coda")) { # Create an emmGrid object from a system file cbpp.rg <- do.call(emmobj, readRDS(system.file("extdata", "cbpplist", package = "emmeans"))) hpd.summary(emmeans(cbpp.rg, "period")) } } \seealso{ summary.emmGrid } 
emmeans/man/emm_list-object.Rd0000644000176200001440000000261114137062735016026 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/emm-list.R \name{emm_list} \alias{emm_list} \title{The \code{emm_list} class} \description{ An \code{emm_list} object is simply a list of \code{\link[=emmGrid-class]{emmGrid}} objects. Such a list is returned, for example, by \code{\link{emmeans}} with a two-sided formula or a list as its \code{specs} argument. } \details{ Methods for \code{emm_list} objects include \code{summary}, \code{coef}, \code{confint}, \code{contrast}, \code{pairs}, \code{plot}, \code{print}, and \code{test}. These are all the same as those methods for \code{emmGrid} objects, with an additional \code{which} argument (integer) to specify which members of the list to use. The default is \code{which = seq_along(object)}; i.e., the method is applied to every member of the \code{emm_list} object. The exception is \code{plot}, where only the \code{which[1]}th element is plotted. As an example, to summarize a single member -- say the second one -- of an \code{emm_list}, one may use \code{summary(object, which = 2)}, but it is probably preferable to directly summarize it using \code{summary(object[[2]])}. } \note{ No \code{export} option is provided for printing an \code{emm_list} (see \code{\link{print.emmGrid}}). If you wish to export these objects, you must do so separately for each element in the list. } 
emmeans/man/emtrends.Rd0000644000176200001440000001473014137062735014577 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/emtrends.R \name{emtrends} \alias{emtrends} \title{Estimated marginal means of linear trends} \usage{ emtrends(object, specs, var, delta.var = 0.001 * rng, max.degree = 1, ...) } \arguments{ \item{object}{A supported model object (\emph{not} a reference grid)} \item{specs}{Specifications for what marginal trends are desired -- as in \code{\link{emmeans}}. If \code{specs} is missing or \code{NULL}, \code{emmeans} is not run and the reference grid for specified trends is returned.} \item{var}{Character value giving the name of a variable with respect to which a difference quotient of the linear predictors is computed. In order for this to be useful, \code{var} should be a numeric predictor that interacts with at least one factor in \code{specs}. Then instead of computing EMMs, we compute and compare the slopes of the \code{var} trend over levels of the specified other predictor(s). As in EMMs, marginal averages are computed for the predictors in \code{specs} and \code{by}. 
See also the \dQuote{Generalizations} section below.} \item{delta.var}{The value of \emph{h} to use in forming the difference quotient \eqn{(f(x+h) - f(x))/h}. Changing it (especially changing its sign) may be necessary to avoid numerical problems such as logs of negative numbers. The default value is 1/1000 of the range of \code{var} over the dataset.} \item{max.degree}{Integer value. The maximum degree of trends to compute (this is capped at 5). If greater than 1, an additional factor \code{degree} is added to the grid, with corresponding numerical derivatives of orders \code{1, 2, ..., max.degree} as the estimates.} \item{...}{Additional arguments passed to \code{\link{ref_grid}} or \code{\link{emmeans}} as appropriate. See Details.} } \value{ An \code{emmGrid} or \code{emm_list} object, according to \code{specs}. See \code{\link{emmeans}} for more details on when a list is returned. } \description{ The \code{emtrends} function is useful when a fitted model involves a numerical predictor \eqn{x} interacting with another predictor \eqn{a} (typically a factor). Such models specify that \eqn{x} has a different trend depending on \eqn{a}; thus, it may be of interest to estimate and compare those trends. Analogous to the \code{\link{emmeans}} setting, we construct a reference grid of these predicted trends, and then possibly average them over some of the predictors in the grid. } \details{ The function works by constructing reference grids for \code{object} with various values of \code{var}, and then calculating difference quotients of predictions from those reference grids. Finally, \code{\link{emmeans}} is called with the given \code{specs}, thus computing marginal averages as needed of the difference quotients. Any \code{...} arguments are passed to \code{ref_grid} and \code{\link{emmeans}}; examples include arguments (often \code{mode}) that apply to specific models; \code{ref_grid} options such as \code{data}, \code{at}, \code{cov.reduce}, \code{mult.names}, \code{nesting}, or \code{transform}; and \code{emmeans} options such as \code{weights} (but please avoid \code{trend} or \code{offset}). } \note{ In earlier versions of \code{emtrends}, the first argument was named \code{model} rather than \code{object}. (The name was changed because of potential mis-matching with a \code{mode} argument, which is an option for several types of models.) For backward compatibility, \code{model} still works \emph{provided all arguments are named}. It is important to understand that trends computed by \code{emtrends} are \emph{not} equivalent to polynomial contrasts in a parallel model where \code{var} is regarded as a factor. That is because the model \code{object} here is assumed to fit a smooth function of \code{var}, and the estimated trends reflect \emph{local} behavior at particular value(s) of \code{var}; whereas when \code{var} is modeled as a factor and polynomial contrasts are computed, those contrasts represent the \emph{global} pattern of changes over \emph{all} levels of \code{var}. See the \code{pigs.poly} and \code{pigs.fact} examples below for an illustration. The linear and quadratic trends depend on the value of \code{percent}, but the cubic trend is constant (because that is true of a cubic polynomial, which is the underlying model). The cubic contrast in the factorial model has the same P value as for the cubic trend, again because the cubic trend is the same everywhere. 
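To illustrate the backward-compatible usage mentioned above (a small sketch using the \code{fiber} dataset supplied with this package):
\preformatted{
fiber.lm <- lm(strength ~ diameter * machine, data = fiber)
# old-style call: 'model' still works because all arguments are named
emtrends(model = fiber.lm, specs = "machine", var = "diameter")
}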
} \section{Generalizations}{ Instead of a single predictor, the user may specify some monotone function of one variable, e.g., \code{var = "log(dose)"}. If so, the chain rule is applied. Note that, in this example, if \code{object} contains \code{log(dose)} as a predictor, we will be comparing the slopes estimated by that model, whereas specifying \code{var = "dose"} would perform a transformation of those slopes, making the predicted trends vary depending on \code{dose}. } \examples{ fiber.lm <- lm(strength ~ diameter*machine, data=fiber) # Obtain slopes for each machine ... ( fiber.emt <- emtrends(fiber.lm, "machine", var = "diameter") ) # ... and pairwise comparisons thereof pairs(fiber.emt) # Suppose we want trends relative to sqrt(diameter)... emtrends(fiber.lm, ~ machine | diameter, var = "sqrt(diameter)", at = list(diameter = c(20, 30))) # Obtaining a reference grid mtcars.lm <- lm(mpg ~ poly(disp, degree = 2) * (factor(cyl) + factor(am)), data = mtcars) # Center trends at mean disp for each no. of cylinders mtcTrends.rg <- emtrends(mtcars.lm, var = "disp", cov.reduce = disp ~ factor(cyl)) summary(mtcTrends.rg) # estimated trends at grid nodes emmeans(mtcTrends.rg, "am", weights = "prop") ### Higher-degree trends ... pigs.poly <- lm(conc ~ poly(percent, degree = 3), data = pigs) emt <- emtrends(pigs.poly, ~ degree | percent, "percent", max.degree = 3, at = list(percent = c(9, 13.5, 18))) # note: 'degree' is an extra factor created by 'emtrends' summary(emt, infer = c(TRUE, TRUE)) # Compare above results with poly contrasts when 'percent' is modeled as a factor ... pigs.fact <- lm(conc ~ factor(percent), data = pigs) emm <- emmeans(pigs.fact, "percent") contrast(emm, "poly") # Some P values are comparable, some aren't! See Note in documentation } \seealso{ \code{\link{emmeans}}, \code{\link{ref_grid}} } emmeans/man/wrappers.Rd0000644000176200001440000000360714137062735014622 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/wrappers.R \name{lsmeans} \alias{lsmeans} \alias{wrappers} \alias{pmmeans} \alias{lstrends} \alias{pmtrends} \alias{lsmip} \alias{pmmip} \alias{lsm} \alias{pmm} \alias{lsmobj} \alias{pmmobj} \alias{lsm.options} \alias{get.lsm.option} \title{Wrappers for alternative naming of EMMs} \usage{ lsmeans(...) pmmeans(...) lstrends(...) pmtrends(...) lsmip(...) pmmip(...) lsm(...) pmm(...) lsmobj(...) pmmobj(...) lsm.options(...) get.lsm.option(x, default = emm_defaults[[x]]) } \arguments{ \item{...}{Arguments passed to the corresponding \code{em}\emph{xxxx} function} \item{x}{Character name of desired option} \item{default}{default value to return if \code{x} not found} } \value{ The result of the call to \code{em}\emph{xxxx}, suitably modified. \code{get.lsm.option} and \code{lsm.options} remap options from and to corresponding options in the \pkg{lsmeans} options system. } \description{ These are wrappers for \code{\link{emmeans}} and related functions to provide backward compatibility, or for users who may prefer to use other terminology than \dQuote{estimated marginal means} -- namely \dQuote{least-squares means} or \dQuote{predicted marginal means}. } \details{ For each function with \code{ls}\emph{xxxx} or \code{pm}\emph{xxxx} in its name, the same function named \code{em}\emph{xxxx} is called. 
Any estimator names or list items beginning with \dQuote{em} are replaced with \dQuote{ls} or \dQuote{pm} before the results are returned. } \examples{ pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs) lsmeans(pigs.lm, "source") } \seealso{ \code{\link{emmeans}}, \code{\link{emtrends}}, \code{\link{emmip}}, \code{\link{emm}}, \code{\link{emmobj}}, \code{\link{emm_options}}, \code{\link{get_emm_option}} } 
emmeans/man/joint_tests.Rd0000644000176200001440000000751114137062735015320 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/test.R \name{joint_tests} \alias{joint_tests} \title{Compute joint tests of the terms in a model} \usage{ joint_tests(object, by = NULL, show0df = FALSE, cov.reduce = range, ...) } \arguments{ \item{object, cov.reduce}{\code{object} is a fitted model or an \code{emmGrid}. If a fitted model, it is replaced by \code{ref_grid(object, cov.reduce = cov.reduce, ...)}} \item{by}{character names of \code{by} variables. Separate sets of tests are run for each combination of these.} \item{show0df}{logical value; if \code{TRUE}, results with zero numerator degrees of freedom are displayed; if \code{FALSE}, they are skipped} \item{...}{additional arguments passed to \code{ref_grid} and \code{emmeans}} } \value{ a \code{summary_emm} object (same as is produced by \code{\link{summary.emmGrid}}). All effects for which there are no estimable contrasts are omitted from the results. } \description{ This function produces an analysis-of-variance-like table based on linear functions of predictors in a model or \code{emmGrid} object. Specifically, the function constructs, for each combination of factors (or covariates reduced to two or more levels), a set of (interaction) contrasts via \code{\link{contrast}}, and then tests them using \code{\link{test}} with \code{joint = TRUE}. Optionally, one or more of the predictors may be used as \code{by} variable(s), so that separate tables of tests are produced for each combination of them. } \details{ In models with only factors (no covariates), we believe these tests correspond to \dQuote{type III} tests a la \pkg{SAS}, as long as equal-weighted averaging is used and there are no estimability issues. When covariates are present and interact with factors, the results depend on how the covariate is handled in constructing the reference grid. See the example at the end of this documentation. The point that one must always remember is that \code{joint_tests} always tests contrasts among EMMs, in the context of the reference grid, whereas type III tests are tests of model coefficients -- which may or may not have anything to do with EMMs or contrasts. 
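For example (a brief sketch; the examples below explore these distinctions more fully), the function may be applied either to a fitted model or to an existing reference grid, and \code{by} yields separate tables:
\preformatted{
pigs.lm <- lm(log(conc) ~ source * factor(percent), data = pigs)
rg <- ref_grid(pigs.lm)
joint_tests(rg)                 # one table of joint tests
joint_tests(rg, by = "source")  # separate tests of 'percent' for each source
}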
} \examples{ pigs.lm <- lm(log(conc) ~ source * factor(percent), data = pigs) joint_tests(pigs.lm) ## will be same as type III ANOVA joint_tests(pigs.lm, weights = "outer") ## differently weighted joint_tests(pigs.lm, by = "source") ## separate joint tests of 'percent' ### Comparisons with type III tests toy = data.frame( treat = rep(c("A", "B"), c(4, 6)), female = c(1, 0, 0, 1, 0, 0, 0, 1, 1, 0 ), resp = c(17, 12, 14, 19, 28, 26, 26, 34, 33, 27)) toy.fac = lm(resp ~ treat * factor(female), data = toy) toy.cov = lm(resp ~ treat * female, data = toy) # (These two models have identical fitted values and residuals) joint_tests(toy.fac) joint_tests(toy.cov) # female is regarded as a 2-level factor by default joint_tests(toy.cov, at = list(female = 0.5)) joint_tests(toy.cov, cov.keep = 0) # i.e., female = mean(toy$female) joint_tests(toy.cov, at = list(female = 0)) # -- Compare with SAS output -- female as factor -- ## Source DF Type III SS Mean Square F Value Pr > F ## treat 1 488.8928571 488.8928571 404.60 <.0001 ## female 1 78.8928571 78.8928571 65.29 0.0002 ## treat*female 1 1.7500000 1.7500000 1.45 0.2741 # # -- Compare with SAS output -- female as covariate -- ## Source DF Type III SS Mean Square F Value Pr > F ## treat 1 252.0833333 252.0833333 208.62 <.0001 ## female 1 78.8928571 78.8928571 65.29 0.0002 ## female*treat 1 1.7500000 1.7500000 1.45 0.2741 } \seealso{ \code{\link{test}} } 
emmeans/man/update.emmGrid.Rd0000644000176200001440000003060614137062735015623 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/emmGrid-methods.R, R/summary.R \name{update.emmGrid} \alias{update.emmGrid} \alias{levels<-.emmGrid} \alias{update.summary_emm} \title{Update an \code{emmGrid} object} \usage{ \method{update}{emmGrid}(object, ..., silent = FALSE) \method{levels}{emmGrid}(x) <- value \method{update}{summary_emm}(object, by.vars, mesg, ...) } \arguments{ \item{object}{An \code{emmGrid} object} \item{...}{Options to be set. These must match a list of known options (see Details)} \item{silent}{Logical value. If \code{FALSE} (the default), a message is displayed if any options are not matched. If \code{TRUE}, no messages are shown.} \item{x}{an \code{emmGrid} object} \item{value}{\code{list} of replacement levels. See the documentation for \code{update.emmGrid} with the \code{levels} argument, as well as the section below on \dQuote{Replacing levels}} \item{by.vars, mesg}{Attributes that can be altered in \code{update.summary_emm}} } \value{ an updated \code{emmGrid} object. \code{levels<-} replaces the levels of the object in-place. See the section on replacing levels for details. } \description{ Objects of class \code{emmGrid} contain several settings that affect such things as what arguments to pass to \code{\link{summary.emmGrid}}. The \code{update} method allows safer management of these settings than by direct modification of its slots. } \note{ When it makes sense, an option set by \code{update} will persist into future results based on that object. But some options are disabled as well. For example, a \code{calc} option will be nulled-out if \code{contrast} is called, because it probably will not make sense to do the same calculations on the contrast results, and in fact the variable(s) needed may not even still exist. } \section{Details}{ The names in \code{\dots} are partially matched against those that are valid, and if a match is found, it adds or replaces the current setting. 
The valid names are \describe{ \item{\code{tran}, \code{tran2}}{(\code{list} or \code{character}) specifies the transformation which, when inverted, determines the results displayed by \code{\link{summary.emmGrid}}, \code{\link{predict.emmGrid}}, or \code{\link{emmip}} when \code{type="response"}. The value may be the name of a standard transformation from \code{\link{make.link}} or additional ones supported by name, such as \code{"log2"}; or, for a custom transformation, a \code{list} containing at least the functions \code{linkinv} (the inverse of the transformation) and \code{mu.eta} (the derivative thereof). The \code{\link{make.tran}} function returns such lists for a number of popular transformations. See the help page of \code{\link{make.tran}} for details as well as information on the additional named transformations that are supported. \code{tran2} is just like \code{tran} except it is a second transformation (i.e., a response transformation in a generalized linear model).} \item{\code{tran.mult}}{Multiple for \code{tran}. For example, for the response transformation \samp{2*sqrt(y)} (or \samp{sqrt(y) + sqrt(y + 1)}, for that matter), we should have \code{tran = "sqrt"} and \code{tran.mult = 2}. If absent, a multiple of 1 is assumed.} \item{\code{tran.offset}}{Additive constant before a transformation is applied. For example, a response transformation of \code{log(y + pi)} has \code{tran.offset = pi}. If no value is present, an offset of 0 is assumed.} \item{\code{estName}}{(\code{character}) is the column label used for displaying predictions or EMMs.} \item{\code{inv.lbl}}{(\code{character}) is the column label to use for predictions or EMMs when \code{type="response"}.} \item{\code{by.vars}}{(\code{character} vector or \code{NULL}) the variables used for grouping in the summary, and also for defining subfamilies in a call to \code{\link{contrast}}.} \item{\code{pri.vars}}{(\code{character} vector) are the names of the grid variables that are not in \code{by.vars}. Thus, the combinations of their levels are used as columns in each table produced by \code{\link{summary.emmGrid}}.} \item{\code{alpha}}{(numeric) is the default significance level for tests, in \code{\link{summary.emmGrid}} as well as \code{\link{plot.emmGrid}} when \samp{CIs = TRUE}. Be cautious that methods that depend on specifying \code{alpha} are prone to abuse. See the discussion in \href{../doc/basics.html#pvalues}{\code{vignette("basics", "emmeans")}}.} \item{\code{adjust}}{(\code{character}) is the default for the \code{adjust} argument in \code{\link{summary.emmGrid}}.} \item{\code{famSize}}{(integer) is the number of means involved in a family of inferences; used in Tukey adjustment.} \item{\code{infer}}{(\code{logical} vector of length 2) is the default value of \code{infer} in \code{\link{summary.emmGrid}}.} \item{\code{level}}{(numeric) is the default confidence level, \code{level}, in \code{\link{summary.emmGrid}}. \emph{Note:} You must specify all five letters of \sQuote{level} to distinguish it from the slot name \sQuote{levels}.} \item{\code{df}}{(numeric) overrides the default degrees of freedom with a specified single value.} \item{\code{calc}}{(list) additional calculated columns. 
See \code{\link{summary.emmGrid}}.} \item{\code{null}}{(numeric) null hypothesis for \code{summary} or \code{test} (taken to be zero if missing).} \item{\code{side}}{(numeric or character) \code{side} specification for \code{summary} or \code{test} (taken to be zero if missing).} \item{\code{sigma}}{(numeric) Error SD to use in predictions and for bias-adjusted back-transformations.} \item{\code{delta}}{(numeric) \code{delta} specification for \code{summary} or \code{test} (taken to be zero if missing).} \item{\code{predict.type} or \code{type}}{(character) sets the default method of displaying predictions in \code{\link{summary.emmGrid}}, \code{\link{predict.emmGrid}}, and \code{\link{emmip}}. Valid values are \code{"link"} (with synonyms \code{"lp"} and \code{"linear"}), or \code{"response"}.} \item{\code{bias.adjust}, \code{frequentist}}{(character) These are used by \code{summary} if those arguments are not specified.} \item{\code{estType}}{(\code{character}) is used internally to determine what \code{adjust} methods are appropriate. It should match one of \samp{c("prediction", "contrast", "pairs")}. As an example of why this is needed, the Tukey adjustment should only be used for pairwise comparisons (\code{estType = "pairs"}); if \code{estType} is some other string, Tukey adjustments are not allowed.} \item{\code{avgd.over}}{(\code{character} vector) are the names of the variables whose levels are averaged over in obtaining marginal averages of predictions, i.e., estimated marginal means. Changing this might produce a misleading printout, but setting it to \code{character(0)} will suppress the \dQuote{averaged over} message in the summary.} \item{\code{initMesg}}{(\code{character}) is a string that is added to the beginning of any annotations that appear below the \code{\link{summary.emmGrid}} display.} \item{\code{methDesc}}{(\code{character}) is a string that may be used for creating names for a list of \code{emmGrid} objects. } \item{\code{nesting}}{(Character or named \code{list}) specifies the nesting structure. See \dQuote{Recovering or overriding model information} in the documentation for \code{\link{ref_grid}}. The current nesting structure is displayed by \code{\link{str.emmGrid}}.} \item{\code{levels}}{named \code{list} of new levels for the elements of the current \code{emmGrid}. The list name(s) are used as new variable names, and if needed, the list is expanded using \code{expand.grid}. These results replace current variable names and levels. This specification changes the \code{levels}, \code{grid}, \code{roles}, and \code{misc} slots in the updated \code{emmGrid}, and resets \code{pri.vars}, \code{by.vars}, \code{adjust}, \code{famSize}, and \code{avgd.over}. In addition, if there is nesting of factors, that may be altered; a warning is issued if it involves something other than mere name changes. \emph{Note:} All six letters of \code{levels} are needed in order to distinguish it from \code{level}.} \item{\code{submodel}}{\code{formula} or \code{character} value specifying a submodel (requires this feature to be supported by underlying methods for the model class). When specified, the \code{linfct} slot is replaced by its aliases for the specified sub-model. Any factors in the sub-model that do not appear in the model matrix are ignored, as are any interactions that are not in the main model, and any factors associated with multivariate responses. The estimates displayed are then computed as if the sub-model had been fitted. 
(However, the standard errors will be based on the error variance(s) of the full model.) \emph{Note:} The formula should refer only to predictor names, \emph{excluding} any function calls (such as \code{factor} or \code{poly}) that appear in the original model formula. See the example. The character values allowed should partially match \code{"minimal"} or \code{"type2"}. With \code{"minimal"}, the sub-model is taken to be the one involving only the surviving factors in \code{object} (the ones averaged over being omitted). Specifying \code{"type2"} is the same as \code{"minimal"} except that only the highest-order term in the submodel is retained, and all effects not containing it are orthogonalized out. Thus, in a purely linear situation such as an \code{lm} model, the joint test of the modified object is in essence a type-2 test as in \code{car::Anova}. For some objects such as generalized linear models, specifying \code{submodel} will typically not produce the same estimates or type-2 tests as would be obtained by actually fitting a separate model with those specifications. The reason is that those models are fitted by iterative reweighting methods, whereas the \code{submodel} calculations preserve the final weights used in fitting the full model.} \item{(any other slot name)}{If the name matches an element of \code{slotNames(object)} other than \code{levels}, that slot is replaced by the supplied value, if it is of the required class (otherwise an error occurs). The user must be very careful in replacing slots because they are interrelated; for example, the lengths and dimensions of \code{grid}, \code{linfct}, \code{bhat}, and \code{V} must conform.} } %%%%%%% end \describe } \section{Replacing levels}{ The \code{levels<-} method uses \code{update.emmGrid} to replace the levels of one or more factors. This method allows selectively replacing the levels of just one factor (via subsetting operators), whereas \code{update(x, levels = list(...))} requires a list of \emph{all} factors and their levels. If any factors are to be renamed, we must replace all levels and include the new names in the replacements. See the examples. } \section{Method for \code{summary_emm} objects}{ This method exists so that we can change the way a summary is displayed, by changing the \code{by} variables or the annotations.
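For instance, one may re-display an existing summary with different grouping, without recomputing anything. A minimal sketch (using objects like those created in the examples below; the names are illustrative only):
\preformatted{
## s <- summary(pigs.rg)      # a summary_emm object
## update(s, by = "source")   # re-display, grouped by source
## update(s, by = NULL)       # ... and ungrouped again
}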
} \examples{ # Using an already-transformed response: pigs.lm <- lm(log(conc) ~ source * factor(percent), data = pigs) # Reference grid that knows about the transformation # and asks to include the sample size in any summaries: pigs.rg <- update(ref_grid(pigs.lm), tran = "log", predict.type = "response", calc = c(n = ~.wgt.)) emmeans(pigs.rg, "source") # Obtain estimates for the additive model # [Note that the submodel refers to 'percent', not 'factor(percent)'] emmeans(pigs.rg, "source", submodel = ~ source + percent) # Type II ANOVA joint_tests(pigs.rg, submodel = "type2") ## Changing levels of one factor newrg <- pigs.rg levels(newrg)$source <- 1:3 newrg ## Unraveling a previously standardized covariate zd = scale(fiber$diameter) fibz.lm <- lm(strength ~ machine * zd, data = fiber) (fibz.rg <- ref_grid(fibz.lm, at = list(zd = -2:2))) ### 2*SD range lev <- levels(fibz.rg) levels(fibz.rg) <- list ( machine = lev$machine, diameter = with(attributes(zd), `scaled:center` + `scaled:scale` * lev$zd) ) fibz.rg ### Compactify results with a by variable update(joint_tests(pigs.rg, by = "source"), by = NULL) } \seealso{ \code{\link{emm_options}} } emmeans/man/emmc-functions.Rd0000644000176200001440000001611214137062735015701 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/emm-contr.R \name{contrast-methods} \alias{contrast-methods} \alias{pairwise.emmc} \alias{emmc-functions} \alias{revpairwise.emmc} \alias{tukey.emmc} \alias{poly.emmc} \alias{trt.vs.ctrl.emmc} \alias{trt.vs.ctrl1.emmc} \alias{trt.vs.ctrlk.emmc} \alias{dunnett.emmc} \alias{eff.emmc} \alias{del.eff.emmc} \alias{consec.emmc} \alias{mean_chg.emmc} \alias{identity.emmc} \title{Contrast families} \usage{ pairwise.emmc(levs, exclude = integer(0), include, ...) revpairwise.emmc(levs, exclude = integer(0), include, ...) tukey.emmc(levs, reverse = FALSE, ...) poly.emmc(levs, max.degree = min(6, k - 1), ...) trt.vs.ctrl.emmc(levs, ref = 1, reverse = FALSE, exclude = integer(0), include, ...) trt.vs.ctrl1.emmc(levs, ref = 1, ...) trt.vs.ctrlk.emmc(levs, ref = length(levs), ...) dunnett.emmc(levs, ref = 1, ...) eff.emmc(levs, exclude = integer(0), include, ...) del.eff.emmc(levs, exclude = integer(0), include, ...) consec.emmc(levs, reverse = FALSE, exclude = integer(0), include, ...) mean_chg.emmc(levs, reverse = FALSE, exclude = integer(0), include, ...) identity.emmc(levs, exclude = integer(0), include, ...) } \arguments{ \item{levs}{Vector of factor levels} \item{exclude}{integer vector of indices, or character vector of levels to exclude from consideration. These levels will receive weight 0 in all contrasts. Character levels must exactly match elements of \code{levs}.} \item{include}{integer or character vector of levels to include (the complement of \code{exclude}). An error will result if the user specifies both \code{exclude} and \code{include}.} \item{...}{Additional arguments, passed to related methods as appropriate} \item{reverse}{Logical value to determine the direction of comparisons} \item{max.degree}{Integer specifying the maximum degree of polynomial contrasts} \item{ref}{Integer(s) or character(s) specifying which level(s) to use as the reference. Character values must exactly match elements of \code{levs}.} } \value{ A data.frame, each column containing contrast coefficients for levs. The "desc" attribute is used to label the results in emmeans, and the "adjust" attribute gives the default adjustment method for multiplicity. 
} \description{ Functions with an extension of \code{.emmc} provide for named contrast families. One of the standard ones documented here may be used, or the user may write such a function. } \details{ Each standard contrast family has a default multiple-testing adjustment as noted below. These adjustments are often only approximate; for a more exacting adjustment, use the interfaces provided to \code{glht} in the \pkg{multcomp} package. \code{pairwise.emmc}, \code{revpairwise.emmc}, and \code{tukey.emmc} generate contrasts for all pairwise comparisons among estimated marginal means at the levels in \code{levs}. The distinction is in which direction they are subtracted. For factor levels A, B, C, D, \code{pairwise.emmc} generates the comparisons A-B, A-C, A-D, B-C, B-D, and C-D, whereas \code{revpairwise.emmc} generates B-A, C-A, C-B, D-A, D-B, and D-C. \code{tukey.emmc} invokes \code{pairwise.emmc} or \code{revpairwise.emmc} depending on \code{reverse}. The default multiplicity adjustment method is \code{"tukey"}, which is only approximate when the standard errors differ. \code{poly.emmc} generates orthogonal polynomial contrasts, assuming equally-spaced factor levels. These are derived from the \code{\link[stats]{poly}} function, but an \emph{ad hoc} algorithm is used to scale them to integer coefficients that are (usually) the same as in published tables of orthogonal polynomial contrasts. The default multiplicity adjustment method is \code{"none"}. \code{trt.vs.ctrl.emmc} and its relatives generate contrasts for comparing one level (or the average over specified levels) with each of the other levels. The argument \code{ref} should be the index(es) or label(s) of the reference level(s). \code{trt.vs.ctrl1.emmc} is the same as \code{trt.vs.ctrl.emmc} with a reference value of 1, and \code{trt.vs.ctrlk.emmc} is the same as \code{trt.vs.ctrl.emmc} with a reference value of \code{length(levs)}. \code{dunnett.emmc} is the same as \code{trt.vs.ctrl.emmc}. The default multiplicity adjustment method is \code{"dunnettx"}, a close approximation to the Dunnett adjustment. \emph{Note:} In all of these functions, it is illegal to have any overlap between the \code{ref} levels and the \code{exclude} levels; if any is found, an error is thrown. \code{consec.emmc} and \code{mean_chg.emmc} are useful for contrasting treatments that occur in sequence. For a factor with levels A, B, C, D, \code{consec.emmc} generates the comparisons B-A, C-B, and D-C, while \code{mean_chg.emmc} generates the contrasts (B+C+D)/3 - A, (C+D)/2 - (A+B)/2, and D - (A+B+C)/3. With \code{reverse = TRUE}, these differences go in the opposite direction. \code{eff.emmc} and \code{del.eff.emmc} generate contrasts that compare each level with the average over all levels (in \code{eff.emmc}) or over all other levels (in \code{del.eff.emmc}). These differ only in how they are scaled. For a set of k EMMs, \code{del.eff.emmc} gives weight 1 to one EMM and weight -1/(k-1) to the others, while \code{eff.emmc} gives weights (k-1)/k and -1/k respectively, as in subtracting the overall EMM from each EMM. The default multiplicity adjustment method is \code{"fdr"}, which controls the false discovery rate; see \code{\link[stats]{p.adjust}}. \code{identity.emmc} simply returns the identity matrix (as a data frame), minus any columns specified in \code{exclude}. It is potentially useful in cases where a contrast function must be specified, but none is desired.
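To see exactly which coefficients a contrast family generates, the \code{.emmc} functions may be called directly. They are internal (hence the \code{:::}), so this is just an illustrative sketch:
\preformatted{
## emmeans:::pairwise.emmc(c("A", "B", "C"))
## # a data.frame with columns 'A - B', 'A - C', and 'B - C'
## # containing the coefficients applied to the EMMs
}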
} \note{ Caution is needed in cases where the user alters the ordering of results (e.g., using the \code{"[...]"} operator), because the contrasts generated depend on the order of the levels provided. For example, if \code{trt.vs.ctrl1} contrasts are applied to two \code{by} groups with levels ordered (Ctrl, T1, T2) and (T1, T2, Ctrl) respectively, then the contrasts generated will be for (T1 - Ctrl, T2 - Ctrl) in the first group and (T2 - T1, Ctrl - T1) in the second group, because the first level in each group is used as the reference level. } \examples{ warp.lm <- lm(breaks ~ wool*tension, data = warpbreaks) warp.emm <- emmeans(warp.lm, ~ tension | wool) contrast(warp.emm, "poly") contrast(warp.emm, "trt.vs.ctrl", ref = "M") # Compare only low and high tensions # Note pairs(emm, ...) calls contrast(emm, "pairwise", ...) pairs(warp.emm, exclude = 2) # (same results using exclude = "M" or include = c("L","H") or include = c(1,3)) ### Setting up a custom contrast function helmert.emmc <- function(levs, ...) { M <- as.data.frame(contr.helmert(levs)) names(M) <- paste(levs[-1],"vs earlier") attr(M, "desc") <- "Helmert contrasts" M } contrast(warp.emm, "helmert") \dontrun{ # See what is used for polynomial contrasts with 6 levels emmeans:::poly.emmc(1:6) } } emmeans/man/eff_size.Rd0000644000176200001440000001306714157435113014546 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/eff-size.R \name{eff_size} \alias{eff_size} \title{Calculate effect sizes and confidence bounds thereof} \usage{ eff_size(object, sigma, edf, method = "pairwise", ...) } \arguments{ \item{object}{an \code{\link[=emmGrid-class]{emmGrid}} object, typically one defining the EMMs to be contrasted. If instead, \code{class(object) == "emm_list"}, such as is produced by \code{emmeans(model, pairwise ~ treatment)}, a message is displayed; the contrasts already therein are used; and \code{method} is replaced by \code{"identity"}.} \item{sigma}{numeric scalar, value of the population SD.} \item{edf}{numeric scalar that specifies the equivalent degrees of freedom for the \code{sigma}. This is a way of specifying the uncertainty in \code{sigma}, in that we regard our estimate of \code{sigma^2} as being proportional to a chi-square random variable with \code{edf} degrees of freedom. (\code{edf} should not be confused with the \code{df} argument that may be passed via \code{...} to specify the degrees of freedom to use in \eqn{t} statistics and confidence intervals.)} \item{method}{the contrast method to use to define the effects. This is passed to \code{\link{contrast}} after the elements of \code{object} are scaled.} \item{...}{Additional arguments passed to \code{contrast}} } \value{ an \code{\link[=emmGrid-class]{emmGrid}} object containing the effect sizes } \description{ Standardized effect sizes are typically calculated using pairwise differences of estimates, divided by the SD of the population providing the context for those effects. This function calculates effect sizes from an \code{emmGrid} object, and confidence intervals for them, accounting for uncertainty in both the estimated effects and the population SD. } \details{ Any \code{by} variables specified in \code{object} will remain in force in the returned effects, unless overridden in the optional arguments.
For models having a single random effect, such as those fitted using \code{\link{lm}}, the \code{stats::sigma} and \code{stats::df.residual} functions may be useful for specifying \code{sigma} and \code{edf}. For models with more than one random effect, \code{sigma} may be based on some combination of the random-effect variances. Specifying \code{edf} can be rather unintuitive but is also relatively uncritical; the smaller the value, the wider the confidence intervals for effect size. The value of \code{sqrt(2/edf)} can be interpreted as the relative accuracy of \code{sigma}; for example, with \code{edf = 50}, \eqn{\sqrt{2/50} = 0.2}, meaning that \code{sigma} is accurate to plus or minus 20 percent. Note that in an example below, we try two different \code{edf} values as a kind of bracketing/sensitivity-analysis strategy. A value of \code{Inf} is allowable, in which case you are assuming that \code{sigma} is known exactly. Obviously, this narrows the confidence intervals for the effect sizes -- unrealistically if in fact \code{sigma} is unknown. } \note{ The effects are always computed on the scale of the \emph{linear predictor}; any response transformation or link function is completely ignored. If you wish to base the effect sizes on the response scale, it is \emph{not} enough to replace \code{object} with \code{regrid(object)}, because this back-transformation changes the SD required to compute effect sizes. \strong{Disclaimer:} There is substantial disagreement among practitioners on what is the appropriate \code{sigma} to use in computing effect sizes; or, indeed, whether \emph{any} effect-size measure is appropriate for some situations. The user is completely responsible for specifying appropriate parameters (or for failing to do so). The examples here illustrate a sobering message that effect sizes are often not nearly as accurate as you may think. } \section{Computation}{ This function uses calls to \code{\link{regrid}} to put the estimated marginal means (EMMs) on the log scale. Then an extra element is added to this grid for the log of \code{sigma} and its standard error (where we assume that \code{sigma} is uncorrelated with the log EMMs). Then a call to \code{\link{contrast}} subtracts \code{log(sigma)} from each of the log EMMs, yielding values of \code{log(EMM/sigma)}. Finally, the results are re-gridded back to the original scale and the desired contrasts are computed using \code{method}. In the log-scaling part, we actually rescale the absolute values and keep track of the signs.
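In effect, this machinery propagates the uncertainty in \code{sigma} by the delta method. A rough sketch of the idea, under the chi-square model for \code{sigma^2} described above (which implies that the variance of \code{log(sigma)} is approximately \code{1/(2*edf)}):
\preformatted{
## approximate variance of log(d), where d = diff / sigma:
## var.log.d ~= (SE.diff / diff)^2 + 1/(2 * edf)
}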
} \examples{ fiber.lm <- lm(strength ~ diameter + machine, data = fiber) emm <- emmeans(fiber.lm, "machine") eff_size(emm, sigma = sigma(fiber.lm), edf = df.residual(fiber.lm)) # or equivalently: eff_size(pairs(emm), sigma(fiber.lm), df.residual(fiber.lm), method = "identity") ### Mixed model example: if (require(nlme)) withAutoprint({ Oats.lme <- lme(yield ~ Variety + factor(nitro), random = ~ 1 | Block / Variety, data = Oats) # Combine variance estimates VarCorr(Oats.lme) (totSD <- sqrt(214.4724 + 109.6931 + 162.5590)) # I figure edf is somewhere between 5 (Blocks df) and 51 (Resid df) emmV <- emmeans(Oats.lme, ~ Variety) eff_size(emmV, sigma = totSD, edf = 5) eff_size(emmV, sigma = totSD, edf = 51) }, spaced = TRUE) # Multivariate model for the same data: MOats.lm <- lm(yield ~ Variety, data = MOats) eff_size(emmeans(MOats.lm, "Variety"), sigma = sqrt(mean(sigma(MOats.lm)^2)), # RMS of sigma() edf = df.residual(MOats.lm)) } emmeans/man/emm_options.Rd0000644000176200001440000002234514137062735015310 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/emmGrid-methods.R \docType{data} \name{emm_options} \alias{emm_options} \alias{get_emm_option} \alias{emm_defaults} \title{Set or change emmeans options} \format{ An object of class \code{list} of length 21. } \usage{ emm_options(..., disable) get_emm_option(x, default = emm_defaults[[x]]) emm_defaults } \arguments{ \item{...}{Option names and values (see Details)} \item{disable}{If non-missing, this will reset all options to their defaults if \code{disable} tests \code{TRUE} (but first save them for possible later restoration). Otherwise, all previously saved options are restored. This is important for bug reporting; please see the section below on reproducible bugs. When \code{disable} is specified, the other arguments are ignored.} \item{x}{Character value - the name of an option to be queried} \item{default}{Value to return if \code{x} is not found} } \value{ \code{emm_options} returns the current options (same as the result of \samp{getOption("emmeans")}) -- invisibly, unless called with no arguments. \code{get_emm_option} returns the currently stored option for \code{x}, or its default value if not found. } \description{ Use \code{emm_options} to set or change various options that are used in the \pkg{emmeans} package. These options are set separately for different contexts in which \code{emmGrid} objects are created, in a named list of option lists. } \details{ \pkg{emmeans}'s options are stored as a list in the system option \code{"emmeans"}. Thus, \code{emm_options(foo = bar)} is the same as \code{options(emmeans = list(..., foo = bar))} where \code{...} represents any previously existing options. The list \code{emm_defaults} contains the default values in case the corresponding element of system option \code{emmeans} is \code{NULL}. Currently, the following main list entries are supported: \describe{ \item{\code{ref_grid}}{A named \code{list} of defaults for objects created by \code{\link{ref_grid}}. This could affect other objects as well. 
For example, if \code{emmeans} is called with a fitted model object, it calls \code{ref_grid} and this option will affect the resulting \code{emmGrid} object.} \item{\code{emmeans}}{A named \code{list} of defaults for objects created by \code{\link{emmeans}} or \code{\link{emtrends}}.} \item{\code{contrast}}{A named \code{list} of defaults for objects created by \code{\link{contrast.emmGrid}} or \code{\link{pairs.emmGrid}}.} \item{\code{summary}}{A named \code{list} of defaults used by the methods \code{\link{summary.emmGrid}}, \code{\link{predict.emmGrid}}, \code{\link{test.emmGrid}}, \code{\link{confint.emmGrid}}, and \code{\link{emmip}}. The only option that can affect the latter four is \code{"predict.method"}.} \item{\code{sep}}{A character value to use as a separator in labeling factor combinations. Such labels are potentially used in several places such as \code{\link{contrast}} and \code{\link{plot.emmGrid}} when combinations of factors are compared or plotted. The default is \code{" "}.} \item{\code{parens}}{Character vector that determines which labels are parenthesized when they are contrasted. The first element is a regular expression, and the second and third elements are used as left and right parentheses. See details for the \code{parens} argument in \code{\link{contrast}}. The default will parenthesize labels containing the four arithmetic operators, using round parentheses.} \item{\code{cov.keep}}{The default value of \code{cov.keep} in \code{\link{ref_grid}}. Defaults to \code{"2"}, i.e., two-level covariates are treated like factors.} \item{\code{graphics.engine}}{A character value matching \code{c("ggplot", "lattice")}, setting the default engine to use in \code{\link{emmip}} and \code{\link{plot.emmGrid}}. Defaults to \code{"ggplot"}.} \item{\code{msg.interaction}}{A logical value controlling whether or not a message is displayed when \code{emmeans} averages over a factor involved in an interaction. It is probably not appropriate to do this, unless the interaction is weak. Defaults to \code{TRUE}.} \item{\code{msg.nesting}}{A logical value controlling whether or not to display a message when a nesting structure is auto-detected. The existence of such a structure affects computations of EMMs. Sometimes, a nesting structure is falsely detected -- namely when a user has omitted some main effects but included them in interactions. This does not change the model fit, but it produces a different parameterization that is picked up when the reference grid is constructed. Defaults to \code{TRUE}.} \item{\code{rg.limit}}{An integer value setting a limit on the number of rows in a newly constructed reference grid. This is checked based on the number of levels of the factors involved; but it excludes the levels of any multivariate responses because those are not yet known. The reference grid consists of all possible combinations of the predictors, and this can become huge if there are several factors. An error is thrown if this limit is exceeded. One can use the \code{nuisance} argument of \code{\link{ref_grid}} to collapse on nuisance factors, thus making the grid smaller. Defaults to 10,000.} \item{\code{simplify.names}}{A logical value controlling whether to simplify (when possible) names in the model formula that refer to datasets -- for example, should we simplify a predictor name like \dQuote{\code{data$trt}} to just \dQuote{\code{trt}}? Defaults to \code{TRUE}.} \item{\code{opt.digits}}{A logical value controlling the precision with which summaries are printed. 
If \code{TRUE} (default), the number of digits displayed is just enough to reasonably distinguish estimates from the ends of their confidence intervals, but always at least 3 digits. If \code{FALSE}, the system value \code{getOption("digits")} is used.} \item{\code{back.bias.adj}}{A logical value controlling whether we try to adjust bias when back-transforming. If \code{FALSE}, we use naive back transformation. If \code{TRUE} \emph{and \code{sigma} is available}, a second-order adjustment is applied to estimate the mean on the response scale.} \item{\code{enable.submodel}}{A logical value. If \code{TRUE}, enables support for selected model classes to implement the \code{submodel} option. If \code{FALSE}, this support is disabled. Setting this option to \code{FALSE} can reduce memory consumption.} }%%% end describe{} Some other options have more specific purposes: \describe{ \item{\code{estble.tol}}{Tolerance for determining estimability in rank-deficient cases. If absent, the value in \code{emm_defaults$estble.tol} is used.} \item{\code{save.ref_grid}}{Logical value of \code{TRUE} if you wish the latest reference grid created to be saved in \code{.Last.ref_grid}. The default is \code{FALSE}.} \item{Options for \code{lme4::lmerMod} models}{The options \code{lmer.df}, \code{disable.pbkrtest}, \code{pbkrtest.limit}, \code{disable.lmerTest}, and \code{lmerTest.limit} affect how degrees of freedom are computed for \code{lmerMod} objects (produced by the \pkg{lme4} package). See that section of the "models" vignette for details.} } %%%%%% end \describe } \section{Reproducible bugs}{ Most options set display attributes and such that are not likely to be associated with bugs in the code. However, some other options (e.g., \code{cov.keep}) are essentially configuration settings that may affect how/whether the code runs, and the settings for these options may cause subtle effects that may be hard to reproduce. Therefore, when sending a bug report, please create a reproducible example and make sure the bug occurs with all options set at their defaults. This is done by preceding it with \code{emm_options(disable = TRUE)}. By the way, \code{disable} works like a stack (LIFO buffer), in that \code{disable = TRUE} is equivalent to \code{emm_options(saved.opts = emm_options())} and \code{emm_options(disable = FALSE)} is equivalent to \code{options(emmeans = get_emm_option("saved.opts"))}. To completely erase all options, use \code{options(emmeans = NULL)}. } \examples{ \dontrun{ emm_options(ref_grid = list(level = .90), contrast = list(infer = c(TRUE,FALSE)), estble.tol = 1e-6) # Sets default confidence level to .90 for objects created by ref_grid # AS WELL AS emmeans called with a model object (since it creates a # reference grid). In addition, when we call 'contrast', 'pairs', etc., # confidence intervals rather than tests are displayed by default. } \dontrun{ emm_options(disable.pbkrtest = TRUE) # This forces use of asymptotic methods for lmerMod objects. # Set to FALSE or NULL to re-enable using pbkrtest. } # See tolerance being used for determining estimability get_emm_option("estble.tol") \dontrun{ # Set all options to their defaults emm_options(disable = TRUE) # ... and perhaps follow with code for a minimal reproducible bug, # which may include emm_options() calls if they are pertinent ...
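# (for instance, something as simple as:)
# pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs)
# emmeans(pigs.lm, "source")   # hypothetical call exhibiting the bug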
# restore options that had existed previously emm_options(disable = FALSE) } } \seealso{ \code{\link{update.emmGrid}} } \keyword{datasets} emmeans/man/qdrg.Rd0000644000176200001440000001027514157436056013716 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/qdrg.R \name{qdrg} \alias{qdrg} \title{Quick and dirty reference grid} \usage{ qdrg(formula, data, coef, vcov, df, mcmc, object, subset, weights, contrasts, link, qr, ordinal.dim, ...) } \arguments{ \item{formula}{Formula for the fixed effects} \item{data}{Dataset containing the variables in the model} \item{coef}{Fixed-effect regression coefficients (must conform to formula)} \item{vcov}{Variance-covariance matrix of the fixed effects} \item{df}{Error degrees of freedom} \item{mcmc}{Posterior sample of fixed-effect coefficients} \item{object}{Optional model object. If provided, it is used to set certain other arguments that are not specified. See Details.} \item{subset}{Subset of \code{data} used in fitting the model} \item{weights}{Weights used in fitting the model} \item{contrasts}{List of contrasts specified in fitting the model} \item{link}{Link function (character or list) used, if a generalized linear model. (Note: response transformations are auto-detected from \code{formula})} \item{qr}{QR decomposition of the model matrix; needed only if there are \code{NA}s in \code{coef}.} \item{ordinal.dim}{Integer number of levels in an ordinal response. If not missing, the intercept terms are modified appropriately for predicting the latent response (see \code{vignette("models")}, Section O). In this case, we expect the first \code{ordinal.dim - 1} elements of \code{coef} to be the estimated threshold parameters, followed by the coefficients for the linear predictor.} \item{...}{Optional arguments passed to \code{\link{ref_grid}}} } \value{ An \code{emmGrid} object constructed from the arguments } \description{ This function may make it possible to compute a reference grid for a model object that is otherwise not supported. } \details{ If \code{object} is specified, it is used to try to obtain certain other arguments, as detailed below. The user should ensure that these defaults will work. The default values for the arguments are as follows: \itemize{ \item{\code{formula}: }{Required unless obtainable via \code{formula(object)}} \item{\code{data}: }{Required if variables are not in \code{parent.frame()} or obtainable via \code{object$data}} \item{\code{coef}: }{\code{coef(object)}} \item{\code{vcov}: }{\code{vcov(object)}} \item{\code{df}: }{Set to \code{Inf} if not available in \code{object$df.residual}} \item{\code{mcmc}: }{\code{object$sample}} \item{\code{subset}: }{\code{NULL} (so that all observations in \code{data} are used)} \item{\code{contrasts}: }{\code{object$contrasts}} } The functions \code{\link{qdrg}} and \code{emmobj} are close cousins, in that they both produce \code{emmGrid} objects. When starting with summary statistics for an existing grid, \code{emmobj} is more useful, while \code{qdrg} is more useful when starting from a fitted model.
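For instance, a minimal sketch of the \code{emmobj} route (the numbers shown are hypothetical summary statistics, not results from a real analysis):
\preformatted{
## emmobj(bhat = c(3.39, 3.67, 3.80), V = diag(0.01, 3),
##        levels = list(source = c("fish", "soy", "skim")), df = 23)
}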
} \examples{ if (require(biglm, quietly = TRUE)) withAutoprint({ # Post hoc analysis of a "biglm" object -- not supported by emmeans bigmod <- biglm(log(conc) ~ source + factor(percent), data = pigs) rg1 <- qdrg(log(conc) ~ source + factor(percent), data = pigs, coef = coef(bigmod), vcov = vcov(bigmod), df = bigmod$df.residual) emmeans(rg1, "source", type = "response") ## But in this particular case, we could have done it the easy way: ## rg1 <- qdrg(object = bigmod, data = pigs) }, spaced = TRUE) if(require(coda, quietly = TRUE) && require(lme4, quietly = TRUE)) withAutoprint({ # Use a stored example having a posterior sample # Model is based on the data in lme4::cbpp post <- readRDS(system.file("extdata", "cbpplist", package = "emmeans"))$post.beta rg2 <- qdrg(~ size + period, data = lme4::cbpp, mcmc = post, link = "logit") summary(rg2, type = "response") }, spaced = TRUE) if(require(ordinal, quietly = TRUE)) withAutoprint({ wine.clm <- clm(rating ~ temp * contact, data = wine) ref_grid(wine.clm) # verify that we get the same thing via: qdrg(object = wine.clm, data = wine, ordinal.dim = 5) }, spaced = TRUE) } \seealso{ \code{\link{emmobj}} for an alternative way to construct an \code{emmGrid}. } emmeans/man/plot.Rd0000644000176200001440000001555114140550422013723 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/plot.emm.R \name{plot.emmGrid} \alias{plot.emmGrid} \alias{plot.summary_emm} \title{Plot an \code{emmGrid} or \code{summary_emm} object} \usage{ \method{plot}{emmGrid}(x, y, type, CIs = TRUE, PIs = FALSE, comparisons = FALSE, colors = c("black", "blue", "blue", "red"), alpha = 0.05, adjust = "tukey", int.adjust = "none", intervals, frequentist, ...) \method{plot}{summary_emm}(x, y, horizontal = TRUE, CIs = TRUE, xlab, ylab, layout, scale = NULL, colors = c("black", "blue", "blue", "red"), intervals, plotit = TRUE, ...) } \arguments{ \item{x}{Object of class \code{emmGrid} or \code{summary_emm}} \item{y}{(Required but ignored)} \item{type}{Character value specifying the type of prediction desired (matching \code{"linear.predictor"}, \code{"link"}, or \code{"response"}). See details under \code{\link{summary.emmGrid}}. In addition, the user may specify \code{type = "scale"}, in which case a transformed scale (e.g., a log scale) is displayed based on the transformation or link function used. Additional customization of this scale is available through including arguments to \code{ggplot2::scale_x_continuous} in \code{...} .} \item{CIs}{Logical value. If \code{TRUE}, confidence intervals are plotted for each estimate.} \item{PIs}{Logical value. If \code{TRUE}, prediction intervals are plotted for each estimate. If \code{object} is a Bayesian model, this requires \code{frequentist = TRUE} and \code{sigma =} (some value). Note that the \code{PIs} option is \emph{not} available with \code{summary_emm} objects -- only for \code{emmGrid} objects. Also, prediction intervals are not available with \code{engine = "lattice"}.} \item{comparisons}{Logical value. If \code{TRUE}, \dQuote{comparison arrows} are added to the plot, in such a way that the degree to which arrows overlap reflects as much as possible the significance of the comparison of the two estimates. (A warning is issued if this can't be done.) Note that comparison arrows are not available with \code{summary_emm} objects.} \item{colors}{Character vector of color names to use for estimates, CIs, PIs, and comparison arrows, respectively.
CIs and PIs are rendered with some transparency, and colors are recycled if the length is less than four, so all plot elements are visible even if a single color is specified.} \item{alpha}{The significance level to use in constructing comparison arrows} \item{adjust}{Character value: Multiplicity adjustment method for comparison arrows \emph{only}.} \item{int.adjust}{Character value: Multiplicity adjustment method for the plotted confidence intervals \emph{only}.} \item{intervals}{If specified, it is used to set \code{CIs}. This is the previous argument name for \code{CIs} and is provided for backward compatibility.} \item{frequentist}{Logical value. If there is a posterior MCMC sample and \code{frequentist} is non-missing and \code{TRUE}, a frequentist summary is used for obtaining the plot data, rather than the posterior point estimate and HPD intervals. This argument is ignored when the model is not Bayesian.} \item{...}{Additional arguments passed to \code{\link{update.emmGrid}}, \code{\link{predict.emmGrid}}, or \code{\link[lattice:xyplot]{dotplot}}} \item{horizontal}{Logical value specifying whether the intervals should be plotted horizontally or vertically} \item{xlab}{Character label for horizontal axis} \item{ylab}{Character label for vertical axis} \item{layout}{Numeric value passed to \code{\link[lattice:xyplot]{dotplot}} when \code{engine == "lattice"}.} \item{scale}{Object of class \code{trans} (in the \pkg{scales} package) to specify a nonlinear scale. This is used in lieu of \code{type = "scale"} when plotting a \code{summary_emm} object created with \code{type = "response"}. This is ignored with other types of summaries.} \item{plotit}{Logical value. If \code{TRUE}, a graphical object is returned; if \code{FALSE}, a data.frame is returned containing all the values used to construct the plot.} } \value{ If \code{plotit = TRUE}, a graphical object is returned. If \code{plotit = FALSE}, a \code{data.frame} with the table of EMMs that would be plotted. In the latter case, the estimate being plotted is named \code{the.emmean}, and any factors involved have the same names as in the object. Confidence limits are named \code{lower.CL} and \code{upper.CL}, prediction limits are named \code{lpl} and \code{upl}, and comparison-arrow limits are named \code{lcmpl} and \code{ucmpl}. There is also a variable named \code{pri.fac} which contains the factor combinations that are \emph{not} among the \code{by} variables. } \description{ Methods are provided to plot EMMs as side-by-side CIs, and optionally to display \dQuote{comparison arrows} for displaying pairwise comparisons. } \note{ In order to play nicely with the plotting functions, any variable names that are not syntactically correct (e.g., contain spaces) are altered using \code{\link{make.names}}. } \section{Details}{ If any \code{by} variables are in force, the plot is divided into separate panels. For \code{"summary_emm"} objects, the \code{\dots} arguments in \code{plot} are passed \emph{only} to \code{dotplot}, whereas for \code{"emmGrid"} objects, the object is updated using \code{\dots} before summarizing and plotting. In plots with \code{comparisons = TRUE}, the resulting arrows are only approximate, and in some cases may fail to accurately reflect the pairwise comparisons of the estimates -- especially when estimates having large and small standard errors are intermingled in just the wrong way.
Note that the maximum and minimum estimates have arrows only in one direction, since there is no need to compare them with anything higher or lower, respectively. See the \href{../doc/xplanations.html#arrows}{\code{vignette("xplanations", "emmeans")}} for details on how these are derived. If \code{adjust} or \code{int.adjust} are not supplied, they default to the internal \code{adjust} setting saved in \code{pairs(x)} and \code{x} respectively (see \code{\link{update.emmGrid}}). } \examples{ warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks) warp.emm <- emmeans(warp.lm, ~ tension | wool) plot(warp.emm) plot(warp.emm, by = NULL, comparisons = TRUE, adjust = "mvt", horizontal = FALSE, colors = "darkgreen") ### Using a transformed scale pigs.lm <- lm(log(conc + 2) ~ source * factor(percent), data = pigs) pigs.emm <- emmeans(pigs.lm, ~ percent | source) plot(pigs.emm, type = "scale", breaks = seq(20, 100, by = 10)) # Based on a summary. # To get a transformed axis, must specify 'scale'; but it does not necessarily # have to be the same as the actual response transformation pigs.ci <- confint(pigs.emm, type = "response") plot(pigs.ci, scale = scales::log10_trans()) } emmeans/man/CLD.emmGrid.Rd0000644000176200001440000000610714157422304014734 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/cld-emm.R \name{cld.emmGrid} \alias{cld.emmGrid} \alias{cld.emm_list} \title{Compact letter displays} \usage{ \method{cld}{emmGrid}(object, details = FALSE, sort = TRUE, by, alpha = 0.05, Letters = c("1234567890", LETTERS, letters), reversed = FALSE, ...) \method{cld}{emm_list}(object, ..., which = 1) } \arguments{ \item{object}{An object of class \code{emmGrid}} \item{details}{Logical value determining whether detailed information on tests of pairwise comparisons is displayed} \item{sort}{Logical value determining whether the EMMs are sorted before the comparisons are produced. When \code{TRUE}, the results are displayed according to \code{reversed}.} \item{by}{Character value giving the name or names of variables by which separate families of comparisons are tested. If NULL, all means are compared. If missing, the object's \code{by.vars} setting, if any, is used.} \item{alpha}{Numeric value giving the significance level for the comparisons} \item{Letters}{Character vector of letters to use in the display. Any strings of length greater than 1 are expanded into individual characters} \item{reversed}{Logical value (passed to \code{multcompView::multcompLetters}.) If \code{TRUE}, the order of use of the letters is reversed. In addition, if both \code{sort} and \code{reversed} are TRUE, the sort order of results is reversed.} \item{...}{Arguments passed to \code{\link{contrast}} (for example, an \code{adjust} method)} \item{which}{Which element of the \code{emm_list} object to process (If length exceeds one, only the first one is used)} } \description{ A method for \code{multcomp::cld()} is provided for users desiring to produce compact-letter displays (CLDs). This method uses the Piepho (2004) algorithm (as implemented in the \pkg{multcompView} package) to generate a compact letter display of all pairwise comparisons of estimated marginal means. The function obtains (possibly adjusted) P values for all pairwise comparisons of means, using the \code{\link{contrast}} function with \code{method = "pairwise"}. When a P value exceeds \code{alpha}, then the two means have at least one letter in common. 
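For example, if three means are assigned the letters \dQuote{a}, \dQuote{ab}, and \dQuote{b} respectively, then the first and third means are found to differ, while neither of them is shown to differ from the second. (In the output, these grouping letters appear in a \code{.group} column.)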
} \note{ We warn that such displays encourage a poor practice in interpreting significance tests. CLDs are misleading because they visually group means with comparisons \emph{P} > \code{alpha} as though they are equal, when in fact we have only failed to prove that they differ. As alternatives, consider \code{\link{pwpp}} (graphical display of \emph{P} values) or \code{\link{pwpm}} (matrix display). } \examples{ if(requireNamespace("multcomp")) { pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs) pigs.emm <- emmeans(pigs.lm, "percent", type = "response") multcomp::cld(pigs.emm, alpha = 0.10, Letters = LETTERS) } } \references{ Piepho, Hans-Peter (2004) An algorithm for a letter-based representation of all pairwise comparisons, Journal of Computational and Graphical Statistics, 13(2), 456-466. } emmeans/man/mcmc-support.Rd0000644000176200001440000001125414157434227015406 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/MCMC-support.R \name{as.mcmc.emmGrid} \alias{as.mcmc.emmGrid} \alias{mcmc-support} \alias{as.mcmc.list.emmGrid} \title{Support for MCMC-based estimation} \usage{ \method{as.mcmc}{emmGrid}(x, names = TRUE, sep.chains = TRUE, likelihood, NE.include = FALSE, ...) \method{as.mcmc.list}{emmGrid}(x, names = TRUE, ...) } \arguments{ \item{x}{An object of class \code{emmGrid}} \item{names}{Logical scalar or vector specifying whether variable names are appended to levels in the column labels for the \code{as.mcmc} or \code{as.mcmc.list} result -- e.g., column names of \code{treat A} and \code{treat B} versus just \code{A} and \code{B}. When there is more than one variable involved, the elements of \code{names} are used cyclically.} \item{sep.chains}{Logical value. If \code{TRUE}, and there is more than one MCMC chain available, an \code{\link[coda]{mcmc.list}} object is returned by \code{as.mcmc}, with a separate posterior sample of EMMs for each chain.} \item{likelihood}{Character value or function. If given, simulations are made from the corresponding posterior predictive distribution. If not given, we obtain the posterior distribution of the parameters in \code{object}. See Prediction section below.} \item{NE.include}{Logical value. If \code{TRUE}, non-estimable columns are kept but returned as columns of \code{NA} values (this may create errors or warnings in subsequent analyses using, say, \pkg{coda}). If \code{FALSE}, non-estimable columns are dropped, and a warning is issued. (If all are non-estimable, an error is thrown.)} \item{...}{arguments passed to other methods} } \value{ An object of class \code{\link[coda]{mcmc}} or \code{\link[coda]{mcmc.list}}. } \description{ When a model is fitted using Markov chain Monte Carlo (MCMC) methods, its reference grid contains a \code{post.beta} slot. These functions transform those posterior samples to posterior samples of EMMs or related contrasts. They can then be summarized or plotted using, e.g., functions in the \pkg{coda} package. } \section{Details}{ When the object's \code{post.beta} slot is non-trivial, \code{as.mcmc} will return an \code{\link[coda]{mcmc}} or \code{\link[coda]{mcmc.list}} object that can be summarized or plotted using methods in the \pkg{coda} package. In these functions, \code{post.beta} is transformed by post-multiplying it by \code{t(linfct)}, creating a sample from the posterior distribution of LS means. In \code{as.mcmc}, if \code{sep.chains} is \code{TRUE} and there is in fact more than one chain, an \code{mcmc.list} is returned with each chain's results.
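The transformation described above amounts to the following sketch (slot access is shown for illustration only):
\preformatted{
## each row of post.beta is one posterior draw of the coefficients,
## so posterior draws of the EMMs are obtained as
## emm.draws <- object@post.beta %*% t(object@linfct)
}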
The \code{as.mcmc.list} method is guaranteed to return an \code{mcmc.list}, even if it comprises just one chain. } \section{Prediction}{ When \code{likelihood} is specified, it is used to simulate values from the posterior predictive distribution corresponding to the given likelihood and the posterior distribution of parameter values. Denote the likelihood function as \eqn{f(y|\theta,\phi)}, where \eqn{y} is a response, \eqn{\theta} is the parameter estimated in \code{object}, and \eqn{\phi} comprises zero or more additional parameters to be specified. If \code{likelihood} is a function, that function should take as its first argument a vector of \eqn{\theta} values (each corresponding to one row of \code{object@grid}). Any \eqn{\phi} values should be specified as additional named function arguments, and passed to \code{likelihood} via \code{...}. This function should simulate values of \eqn{y}. A few standard likelihoods are available by specifying \code{likelihood} as a character value. They are: \describe{ \item{\code{"normal"}}{The normal distribution with mean \eqn{\theta} and standard deviation specified by additional argument \code{sigma}} \item{\code{"binomial"}}{The binomial distribution with success probability \eqn{\theta}, and number of trials specified by \code{trials}} \item{\code{"poisson"}}{The Poisson distribution with mean \eqn{\theta} (no additional parameters)} \item{\code{"gamma"}}{The gamma distribution with scale parameter \eqn{\theta} and shape parameter specified by \code{shape}} } } \examples{ if(requireNamespace("coda")) { ### A saved reference grid for a mixed logistic model (see lme4::cbpp) cbpp.rg <- do.call(emmobj, readRDS(system.file("extdata", "cbpplist", package = "emmeans"))) # Predictive distribution for herds of size 20 # (perhaps a bias adjustment should be applied; see "sophisticated" vignette) pred.incidence <- coda::as.mcmc(regrid(cbpp.rg), likelihood = "binomial", trials = 20) summary(pred.incidence) } } emmeans/man/emmip.Rd0000644000176200001440000002330714137062735014065 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/emmip.R \name{emmip} \alias{emmip} \alias{emmip.default} \alias{emmip_ggplot} \alias{emmip_lattice} \title{Interaction-style plots for estimated marginal means} \usage{ emmip(object, formula, ...) \method{emmip}{default}(object, formula, type, CIs = FALSE, PIs = FALSE, style, engine = get_emm_option("graphics.engine"), plotit = TRUE, nesting.order = FALSE, ...) emmip_ggplot(emms, style = "factor", dodge = 0.1, xlab = labs$xlab, ylab = labs$ylab, tlab = labs$tlab, facetlab = "label_context", scale, dotarg = list(), linearg = list(), CIarg = list(lwd = 2, alpha = 0.5), PIarg = list(lwd = 1.25, alpha = 0.33), ...) emmip_lattice(emms, style = "factor", xlab = labs$xlab, ylab = labs$ylab, tlab = labs$tlab, pch = c(1, 2, 6, 7, 9, 10, 15:20), lty = 1, col = NULL, ...) } \arguments{ \item{object}{An object of class \code{emmGrid}, or a fitted model of a class supported by the \pkg{emmeans} package} \item{formula}{Formula of the form \code{trace.factors ~ x.factors | by.factors}. The EMMs are plotted against \code{x.factors} for each level of \code{trace.factors}. \code{by.factors} is optional, but if present, it determines separate panels.
Each element of this formula may be a single factor in the model, or a combination of factors using the \code{*} operator.} \item{...}{Additional arguments passed to \code{\link{emmeans}} (when \code{object} is not already an \code{emmGrid} object), \code{predict.emmGrid}, \code{emmip_ggplot}, or \code{emmip_lattice}.} \item{type}{As in \code{\link{predict.emmGrid}}, this determines whether we want to inverse-transform the predictions (\code{type = "response"}) or not (any other choice). The default is \code{"link"}, unless the \code{"predict.type"} option is in force; see \code{\link{emm_options}}. In addition, the user may specify \code{type = "scale"} to create a transformed scale for the vertical axis based on \code{object}'s response transformation or link function.} \item{CIs}{Logical value. If \code{TRUE}, confidence intervals (or HPD intervals for Bayesian models) are added to the plot (works only with \code{engine = "ggplot"}).} \item{PIs}{Logical value. If \code{TRUE}, prediction intervals are added to the plot (works only with \code{engine = "ggplot"}). If both \code{CIs} and \code{PIs} are \code{TRUE}, the prediction intervals will be somewhat longer, lighter, and thinner than the confidence intervals. Additional parameters to \code{\link{predict.emmGrid}} (e.g., \code{sigma}) may be passed via \code{...}. For Bayesian models, PIs require \code{frequentist = TRUE} and a value for \code{sigma}.} \item{style}{Optional character value. This has an effect only when the horizontal variable is a single numeric variable. If \code{style} is unspecified or \code{"numeric"}, the horizontal scale will be numeric and curves are plotted using lines (and no symbols). With \code{style = "factor"}, the horizontal variable is treated as the levels of a factor (equally spaced along the horizontal scale), and curves are plotted using lines and symbols. When the horizontal variable is character or factor, or a combination of more than one predictor, \code{"factor"} style is always used.} \item{engine}{Character value matching \code{"ggplot"} (default), \code{"lattice"}, or \code{"none"}. The graphics engine to be used to produce the plot. These require, respectively, the \pkg{ggplot2} or \pkg{lattice} package to be installed. Specifying \code{"none"} is equivalent to setting \code{plotit = FALSE}.} \item{plotit}{Logical value. If \code{TRUE}, a graphical object is returned; if \code{FALSE}, a data.frame is returned containing all the values used to construct the plot.} \item{nesting.order}{Logical value. If \code{TRUE}, factors that are nested are presented in order according to their nesting factors, even if those nesting factors are not present in \code{formula}. If \code{FALSE}, only the variables in \code{formula} are used to order the variables.} \item{emms}{A \code{data.frame} created by calling \code{emmip} with \code{plotit = FALSE}. Certain variables and attributes are expected to exist in this data frame; see the section detailing the rendering functions.} \item{dodge}{Numerical amount passed to \code{ggplot2::position_dodge} by which points and intervals are offset so they do not collide.} \item{xlab, ylab, tlab}{Character labels for the horizontal axis, vertical axis, and traces (the different curves), respectively. The \code{emmip} function generates these automatically and provides them via the \code{labs} attribute, but the user may override these if desired.} \item{facetlab}{Labeller for facets (when \code{by} variables are in play).
Use \code{"label_value"} to show just the factor levels, or \code{"label_both"} to show both the factor names and factor levels. The default of \code{"label_context"} decides which based on how many \code{by} factors there are. See the documentation for \code{ggplot2::label_context}.} \item{scale}{If not missing, an object of class \code{scales::trans} specifying a (usually) nonlinear scaling for the vertical axis. For example, \code{scale = scales::log_trans()} specifies a logarithmic scale. For fine-tuning purposes, additional arguments to \code{ggplot2::scale_y_continuous} may be included in \code{...} .} \item{dotarg}{\code{list} of arguments passed to \code{geom_point} to customize appearance of points} \item{linearg}{\code{list} of arguments passed to \code{geom_line} to customize appearance of lines} \item{CIarg, PIarg}{\code{list}s of arguments passed to \code{geom_linerange} to customize appearance of intervals} \item{pch}{The plotting characters to use for each group (i.e., levels of \code{trace.factors}). They are recycled as needed.} \item{lty}{The line types to use for each group. Recycled as needed.} \item{col}{The colors to use for each group, recycled as needed. If not specified, the default trellis colors are used.} } \value{ If \code{plotit = FALSE}, a \code{data.frame} (actually, a \code{summary_emm} object) with the table of EMMs that would be plotted. The variables plotted are named \code{xvar} and \code{yvar}, and the trace factor is named \code{tvar}. This data frame has an added \code{"labs"} attribute containing the labels \code{xlab}, \code{ylab}, and \code{tlab} for these respective variables. The confidence limits are also included, renamed \code{LCL} and \code{UCL}. If \code{plotit = TRUE}, the function returns an object of class \code{"ggplot"} or a \code{"trellis"}, depending on \code{engine}. } \description{ Creates an interaction plot of EMMs based on a fitted model and a simple formula specification. } \note{ Conceptually, this function is equivalent to \code{\link{interaction.plot}} where the summarization function is thought to return the EMMs. } \section{Details}{ If \code{object} is a fitted model, \code{\link{emmeans}} is called with an appropriate specification to obtain estimated marginal means for each combination of the factors present in \code{formula} (in addition, any arguments in \code{\dots} that match \code{at}, \code{trend}, \code{cov.reduce}, or \code{fac.reduce} are passed to \code{emmeans}). Otherwise, if \code{object} is an \code{emmGrid} object, its first element is used, and it must contain one estimate for each combination of the factors present in \code{formula}. } \section{Rendering functions}{ The functions \code{emmip_ggplot} and \code{emmip_lattice} are called when \code{plotit == TRUE} to render the plots; but they may also be called later on an object saved via \code{plotit = FALSE} (or \code{engine = "none"}). The functions require that \code{emms} contain variables \code{xvar}, \code{yvar}, and \code{tvar}, and attributes \code{"labs"} and \code{"vars"}. Confidence intervals are plotted if variables \code{LCL} and \code{UCL} exist, and prediction intervals are plotted if \code{LPL} and \code{UPL} exist. Finally, it must contain the variables named in \code{attr(emms, "vars")}.
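For example, here is a sketch of deferred rendering (using \code{noise.lm} as defined in the examples below):
\preformatted{
## dat <- emmip(noise.lm, type ~ side | size, CIs = TRUE, plotit = FALSE)
## emmip_ggplot(dat, CIarg = list(lwd = 1, alpha = 1))
}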
} \examples{ #--- Three-factor example noise.lm = lm(noise ~ size * type * side, data = auto.noise) # Separate interaction plots of size by type, for each side emmip(noise.lm, type ~ size | side) # One interaction plot, using combinations of size and side as the x factor # ... with added confidence intervals and some formatting changes emmip(noise.lm, type ~ side * size, CIs = TRUE, linearg = list(linetype = "dashed"), CIarg = list(lwd = 1, alpha = 1)) # One interaction plot using combinations of type and side as the trace factor emmip(noise.lm, type * side ~ size) # Individual traces in panels emmip(noise.lm, ~ size | type * side) # Example for the 'style' argument fib.lm = lm(strength ~ machine * sqrt(diameter), data = fiber) fib.rg = ref_grid(fib.lm, at = list(diameter = c(3.5, 4, 4.5, 5, 5.5, 6)^2)) emmip(fib.rg, machine ~ diameter) # curves (because diameter is numeric) emmip(fib.rg, machine ~ diameter, style = "factor") # points and lines # For an example using extra ggplot2 code, see 'vignette("messy-data")', # in the section on nested models. ### Options with transformations or link functions neuralgia.glm <- glm(Pain ~ Treatment * Sex + Age, family = binomial(), data = neuralgia) # On link scale: emmip(neuralgia.glm, Treatment ~ Sex) # On response scale: emmip(neuralgia.glm, Treatment ~ Sex, type = "response") # With transformed axis scale and custom scale divisions emmip(neuralgia.glm, Treatment ~ Sex, type = "scale", breaks = seq(0.10, 0.90, by = 0.10)) } \seealso{ \code{\link{emmeans}}, \code{\link{interaction.plot}}. } emmeans/DESCRIPTION0000644000176200001440000000415414165107726013422 0ustar liggesusersPackage: emmeans Type: Package Title: Estimated Marginal Means, aka Least-Squares Means Version: 1.7.2 Date: 2022-01-04 Authors@R: c(person("Russell V.", "Lenth", role = c("aut", "cre", "cph"), email = "russell-lenth@uiowa.edu"), person("Paul", "Buerkner", role = "ctb"), person("Maxime", "Herve", role = "ctb"), person("Jonathon", "Love", role = "ctb"), person("Fernando", "Miguez", role = "ctb"), person("Hannes", "Riebl", role = "ctb"), person("Henrik", "Singmann", role = "ctb")) Depends: R (>= 3.5.0) Imports: estimability (>= 1.3), graphics, methods, numDeriv, stats, utils, mvtnorm, xtable (>= 1.8-2) Suggests: bayesplot, bayestestR, biglm, brms, car, coda (>= 0.17), ggplot2, lattice, logspline, mediation, mgcv, multcomp, multcompView, nlme, ordinal (>= 2014.11-12), pbkrtest (>= 0.4-1), lme4, lmerTest (>= 2.0.32), MASS, MuMIn, rsm, knitr, rmarkdown, scales, splines, testthat Enhances: CARBayes, coxme, gee, geepack, MCMCglmm, MCMCpack, mice, nnet, pscl, rstanarm, sommer, survival URL: https://github.com/rvlenth/emmeans BugReports: https://github.com/rvlenth/emmeans/issues LazyData: yes ByteCompile: yes Description: Obtain estimated marginal means (EMMs) for many linear, generalized linear, and mixed models. Compute contrasts or linear functions of EMMs, trends, and comparisons of slopes. Plots and other displays. Least-squares means are discussed, and the term "estimated marginal means" is suggested, in Searle, Speed, and Milliken (1980) Population marginal means in the linear model: An alternative to least squares means, The American Statistician 34(4), 216-221 . License: GPL-2 | GPL-3 Encoding: UTF-8 RoxygenNote: 7.1.1 VignetteBuilder: knitr NeedsCompilation: no Packaged: 2022-01-04 15:55:42 UTC; rlenth Author: Russell V. 
Lenth [aut, cre, cph], Paul Buerkner [ctb], Maxime Herve [ctb], Jonathon Love [ctb], Fernando Miguez [ctb], Hannes Riebl [ctb], Henrik Singmann [ctb] Maintainer: Russell V. Lenth Repository: CRAN Date/Publication: 2022-01-04 18:20:06 UTC emmeans/build/0000755000176200001440000000000014165066776013020 5ustar liggesusersemmeans/build/vignette.rds0000644000176200001440000000115214165066776015356 0ustar liggesusersTn1\M .{AHT^Icu^Ǔ7}ݬό=srg+jTTj~հO>5}5Kuǵ)ҍUBh%)|+9 B(<ylD֫A e3@ZSROT̈TŽc``T^2O<8ҀLwXHcoC#dFLMpvgz窰UW7;-?Fa[!oC}mR!d&2c(Q9 \>Q?/;$;4>~d%J`ʡEݵ t,3oC~EHq%6 _>!eXglZzYA8rpk0p?!8d3#[f֫UL<܊ G"Lf\g|z%}V UuXYkۧ˾rGq5v`w&oWRemmeans/tests/0000755000176200001440000000000014137062735013051 5ustar liggesusersemmeans/tests/testthat/0000755000176200001440000000000014165107726014712 5ustar liggesusersemmeans/tests/testthat/test-ref_grid.R0000644000176200001440000000437314137062735017601 0ustar liggesuserscontext("Reference grids") pigs.lm = lm(log(conc) ~ source + factor(percent), data = pigs) rg = ref_grid(pigs.lm) rg1 = ref_grid(pigs.lm, at = list(source = "soy", percent = 12)) pigs.lm2 = update(pigs.lm, conc ~ source + percent, data = pigs) rg2 = ref_grid(pigs.lm2) rg2a = ref_grid(pigs.lm2, at = list(source = c("fish", "soy"), percent = 10)) rg2c = ref_grid(pigs.lm2, cov.reduce = FALSE) rg2m = ref_grid(pigs.lm2, cov.reduce = min) pigs.lm3 = update(pigs.lm2, . ~ source + source:factor(percent)) pigs = transform(pigs, sp = interaction(source, percent)) pigs.lm4 = update(pigs.lm2, . ~ source + sp) test_that("Reference grid is constructed correctly", { expect_equal(nrow(rg@grid), 12) expect_equal(nrow(rg1@grid), 1) expect_equal(nrow(rg2@grid), 3) expect_equal(nrow(rg2a@grid), 2) expect_equal(nrow(rg2c@grid), 12) expect_equal(nrow(rg1@grid), 1) expect_equal(length(rg@levels), 2) expect_equal(rg2@levels$percent, mean(pigs$percent)) expect_equal(rg2m@levels$percent, min(pigs$percent)) }) test_that("Reference grid extras are detected", { expect_equal(rg@misc$tran, "log") expect_true(is.null(rg2@misc$tran)) expect_true(is.null(rg2@model.info$nesting)) expect_is(ref_grid(pigs.lm3)@model.info$nesting, "list") # see note above expect_is(ref_grid(pigs.lm4)@model.info$nesting, "list") # see note above }) colnames(ToothGrowth) <- c('len', 'choice of supplement', 'dose') model <- stats::aov(`len` ~ `choice of supplement`, ToothGrowth) test_that("Reference grid handles variables with spaces", { expect_output(str(ref_grid(model, ~`choice of supplement`)), "choice of supplement") }) # models outside of data.frames x = 1:10 y = rnorm(10) mod1 = with(pigs, lm(log(conc) ~ source + factor(percent))) test_that("ref_grid works with no data or subset", { expect_silent(ref_grid(lm(y ~ x))) expect_silent(ref_grid(mod1)) }) # Multivariate models MOats.lm <- lm (yield ~ Block + Variety, data = MOats) MOats.rg <- ref_grid (MOats.lm, mult.levs = list( trt = LETTERS[1:2], dose = as.character(1:2)) ) test_that("We can construct multivariate reference grid", { expect_equal(nrow(MOats.rg@grid), 72) expect_equal(length(MOats.rg@levels), 4) }) emmeans/tests/testthat/test-emtrends.R0000644000176200001440000000152714137062735017637 0ustar liggesuserscontext("Emtrends function") pigs.lm = lm(log(conc) ~ source * poly(percent, 3), data = pigs) test_that("emtrends works", { emt = emtrends(pigs.lm, ~ source, "percent") expect_equal(nrow(semt <- summary(emt)), 3) expect_equal(semt$percent.trend[1], .00429, tol = 0.0001) emtt = emtrends(pigs.lm, ~ source, "sqrt(percent)") expect_equal(summary(emtt)[["sqrt(percent).trend"]][1], .0309, tol = 
0.001) emtp = emtrends(pigs.lm, ~ source, "percent", max.degree = 3) expect_equal(nrow(semtp <- summary(emtp)), 9) expect_equal(semtp$percent.trend[7], .001337, tol = 0.0001) emtpa = emtrends(pigs.lm, ~ source | percent, "percent", max.degree = 2, at = list(percent = c(9,12,15,18))) expect_equal(nrow(summary(emtpa)), 24) # 3 sources * 2 degrees * 4 percents }) emmeans/tests/testthat/test-contrast.R0000644000176200001440000000254614137062735017655 0ustar liggesuserscontext("Contrast function") pigs.lm = lm(log(conc) ~ source + factor(percent), data = pigs) rg = ref_grid(pigs.lm) rgg = add_grouping(rg, "group", "source", c("1", "2", "2")) emms = emmeans(rg, "source") emmg = emmeans(rgg, "group") emmgs = emmeans(rgg, "source") pigs.lmi = lm(log(conc) ~ source * factor(percent), data = pigs) rgi = ref_grid(pigs.lmi) test_that("Non-nested contrasts work", { expect_equal(nrow(summary(contrast(emms))), 3) expect_equal(nrow(summary(contrast(emms, "consec"))), 2) expect_equal(nrow(summary(contrast(rg, by = "source"))), 12) expect_equal(nrow(summary(pairs(rg, by = "source"))), 18) expect_equal(nrow(summary(pairs(rg))), 66) }) test_that("Nested contrasts work", { expect_equal(nrow(summary(contrast(emmg, "consec"))), 1) expect_warning(summary(contrast(emmgs, "consec", by = "group"))) # warning from cov2cor due to mvt adjustment expect_equal(nrow(summary(pairs(emmgs, by = "group"))), 2) }) test_that("Interaction contrasts work", { expect_equal(nrow(summary(contrast(rgi, interaction = TRUE))), 12) expect_equal(nrow(summary(contrast(rgi, interaction = "consec"))), 6) expect_equal(nrow(summary(contrast(rgi, interaction = c("consec", "pairwise")))), 12) expect_equal(nrow(summary(contrast(rgi, interaction = c("pairwise","consec")))), 9) }) emmeans/tests/testthat/test-emmeans.R0000644000176200001440000000241514142775612017441 0ustar liggesuserscontext("Estimated marginal means") pigs.lm = lm(log(conc) ~ source + factor(percent), data = pigs) rg = ref_grid(pigs.lm) rgg = add_grouping(rg, "group", "source", c("1", "2", "2")) test_that("Character interface works", { expect_equal(confint(emmeans(rg, "source"))$emmean, c(3.39, 3.67, 3.80), tol = 0.01) expect_equal(nrow(emmeans(rg, c("source", "percent"))@grid), 12) expect_equal(nrow(emmeans(rg, "source", by = "percent")@grid), 12) expect_equal(nrow(emmeans(rg, c("1"))@grid), 1) }) test_that("Formula interface works", { expect_equal(confint(emmeans(rg, ~ source))$emmean, c(3.39, 3.67, 3.80), tol = 0.01) expect_equal(nrow(emmeans(rg, ~ source * percent)@grid), 12) expect_equal(nrow(emmeans(rg, ~ source | percent)@grid), 12) expect_equal(nrow(emmeans(rg, ~ 1)@grid), 1) expect_equal(nrow(emmeans(rg, ~ 1 | percent)@grid), 4) expect_equal(nrow(emmeans(rg, ~ percent | 1)@grid), 4) }) # nesting test_that("Nested EMMs work", { expect_equal(nrow(emmeans(rgg, ~ group)@grid), 2) expect_equal(nrow(emmeans(rgg, ~ source)@grid), 6) # 3 rows have 0 weight expect_equal(nrow(confint(emmeans(rgg, ~ source))), 3) expect_equal(colnames(emmeans(rgg, ~ source)@grid)[1:2], c("source","group")) }) emmeans/tests/testthat/test-nested.R0000644000176200001440000000223214137062735017272 0ustar liggesuserscontext("Nested structures") set.seed(412.1948) foo = data.frame( nest = factor(c(rep("A",8),rep("B",12),rep("C",4))), bird = factor(rep(1:6, each=4)), m = rep(c(2,3,0,-1,-2,5), each=4), b = rep(c(0,-1,1,2,2,3), each=4), x = 3 + 5*runif(24), e = 0.3*rnorm(24) ) foo = transform(foo, resp = m*x + b + e) foo1.lm = lm(resp ~ nest + bird*x, data = foo) rg1 = ref_grid(foo1.lm) foo2.lm = lm(resp ~ 
(nest + bird)*x, data = foo) rg2 = ref_grid(foo2.lm) test_that("nested EMM works", { emm1 = emmeans(rg1, "bird") expect_equal(nrow(summary(emm1)), 6) expect_equal(names(emm1@grid[1:2]), c("bird", "nest")) p1bn = predict(emmeans(emm1, "nest")) p1n = predict(emmeans(rg1, "nest")) p2n = predict(emmeans(rg2, "nest")) expect_equal(p1bn, p1n, tol = 1e-6) expect_equal(p1n, p2n, tol = 1e-6) }) test_that("nested trends works", { emtb = emtrends(foo1.lm, "bird", "x") emtn = emtrends(foo1.lm, "nest", "x") emtbn = emmeans(emtb, "nest") expect_equal(nrow(summary(emtb)), 6) expect_equal(nrow(summary(emtn)), 3) expect_equal(predict(emtn), predict(emtbn), tol = 1e-6) }) emmeans/tests/testthat.R0000644000176200001440000000007614137062735015037 0ustar liggesuserslibrary(testthat) library(emmeans) test_check("emmeans") emmeans/vignettes/0000755000176200001440000000000014165066776013731 5ustar liggesusersemmeans/vignettes/comparisons.Rmd0000644000176200001440000004044014137062735016722 0ustar liggesusers--- title: "Comparisons and contrasts in emmeans" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{Comparisons and contrasts} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, results = "hide", message = FALSE} require("emmeans") knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") ``` ## Contents This vignette covers techniques for comparing EMMs at levels of a factor predictor, and other related analyses. 1. [Pairwise comparisons](#pairwise) 2. [Other contrasts](#contrasts) 3. [Formula interface](#formulas) 4. [Custom contrasts and linear functions](#linfcns) 5. [Special behavior with log transformations](#logs) 6. Interaction contrasts (in ["interactions" vignette](interactions.html#contrasts)) 7. Multivariate contrasts (in ["interactions" vignette](interactions.html#multiv)) [Index of all vignette topics](vignette-topics.html) ## Pairwise comparisons {#pairwise} The most common follow-up analysis for models having factors as predictors is to compare the EMMs with one another. This may be done simply via the `pairs()` method for `emmGrid` objects. In the code below, we obtain the EMMs for `source` for the `pigs` data, and then compare the sources pairwise. ```{r} pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs) pigs.emm.s <- emmeans(pigs.lm, "source") pairs(pigs.emm.s) ``` In its out-of-the-box configuration, `pairs()` sets two defaults for [`summary()`](confidence-intervals.html#summary): `adjust = "tukey"` (multiplicity adjustment), and `infer = c(FALSE, TRUE)` (test statistics, not confidence intervals). You may override these, of course, by calling `summary()` on the result with different values for these. In the example above, EMMs for later factor levels are subtracted from those for earlier levels; if you want the comparisons to go in the other direction, use `pairs(pigs.emm.s, reverse = TRUE)`. Also, in multi-factor situations, you may specify `by` factor(s) to perform the comparisons separately at the levels of those factors. ### Matrix displays {#pwpm} The numerical main results associated with pairwise comparisons can be presented compactly in matrix form via the `pwpm()` function. We simply hand it the `emmGrid` object to use in making the comparisons: ```{r} pwpm(pigs.emm.s) ``` This matrix shows the EMMs along the diagonal, $P$ values in the upper triangle, and the differences in the lower triangle. 
Options exist to switch off any one of these and to switch which triangle is used for the latter two. Also, optional arguments are passed through to the underlying `pairs()` and `test()` calls. For instance, we can reverse the direction of the comparisons, suppress the display of EMMs, swap where the $P$ values go, and perform noninferiority tests with a threshold of 0.05 as follows: ```{r} pwpm(pigs.emm.s, means = FALSE, flip = TRUE, # args for pwpm() reverse = TRUE, # args for pairs() side = ">", delta = 0.05, adjust = "none") # args for test() ``` With all three *P* values so small, we have fish, soy, and skim in increasing order of noninferiority based on the given threshold. When more than one factor is present, existing or newly specified `by` variable(s) can split the results into a list of matrices. ### Effect size Some users desire standardized effect-size measures. Most popular is probably Cohen's *d*, which is defined as the observed difference divided by the population SD; obviously, Cohen effect sizes are close cousins of pairwise differences. They are available via the `eff_size()` function, where the user must specify the `emmGrid` object with the means to be compared, the estimated population SD `sigma`, and its degrees of freedom `edf`. This is illustrated with the current example: ```{r} eff_size(pigs.emm.s, sigma = sigma(pigs.lm), edf = 23) ``` The confidence intervals shown take into account the error in estimating `sigma` as well as the error in the differences. Note that the intervals are narrower if we claim that we know `sigma` perfectly (i.e., infinite degrees of freedom): ```{r} eff_size(pigs.emm.s, sigma = sigma(pigs.lm), edf = Inf) ``` Note that `eff_size()` expects the object with the means, not the differences. If you want to use the differences, use the `method` argument to specify that you don't want to compute pairwise differences again; e.g., ```{r, eval = FALSE} eff_size(pairs(pigs.emm.s), sigma = sigma(pigs.lm), edf = 23, method = "identity") ``` (results are identical to the first effect sizes shown). ### Graphical comparisons {#graphical} Comparisons may be summarized graphically via the `comparisons` argument in `plot.emmGrid()`: ```{r fig.height = 1.5} plot(pigs.emm.s, comparisons = TRUE) ``` The blue bars are confidence intervals for the EMMs, and the red arrows are for the comparisons among them. If an arrow from one mean overlaps an arrow from another group, the difference is not "significant," based on the `adjust` setting (which defaults to `"tukey"`) and the value of `alpha` (which defaults to 0.05). See the ["xplanations" supplement](xplanations.html#arrows) for details on how these are derived. *Note:* Don't *ever* use confidence intervals for EMMs to perform comparisons; they can be very misleading. Use the comparison arrows instead; or better yet, use `pwpp()`. *A caution:* it really is not good practice to draw a bright distinction based on whether or not a *P* value exceeds some cutoff. This display does dim such distinctions somewhat by allowing the viewer to judge whether a *P* value is close to `alpha` one way or the other; but a better strategy is to simply obtain all the *P* values using `pairs()`, and look at them individually.
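To illustrate that strategy, here is a minimal sketch using objects already created above (the `infer` argument was described near the beginning of this vignette; `adjust = "none"` is shown purely for illustration, not as a recommendation):

```{r, eval = FALSE}
# Examine each pairwise P value individually -- first with the default
# Tukey adjustment made explicit, then with no multiplicity adjustment
summary(pairs(pigs.emm.s), infer = c(FALSE, TRUE))
summary(pairs(pigs.emm.s), adjust = "none")
```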
#### Pairwise *P*-value plots {#pwpp} In trying to develop an alternative to compact-letter displays (see next subsection), we devised the "pairwise *P*-value plot" displaying all the *P* values in pairwise comparisons: ```{r} pwpp(pigs.emm.s) ``` Each comparison is associated with a vertical line segment that joins the scale positions of the two EMMs being compared, and whose horizontal position is determined by the *P* value of that comparison. This kind of plot can get quite "busy" as the number of means being compared goes up. For example, suppose we include the interactions in the model for the pigs data, and compare all 12 cell means: ```{r, fig.width = 9} pigs.lmint <- lm(log(conc) ~ source * factor(percent), data = pigs) pigs.cells <- emmeans(pigs.lmint, ~ source * percent) pwpp(pigs.cells, type = "response") ``` While this plot has a lot going on, consider looking at it row-by-row. Next to each EMM, we can visualize the *P* values of all 11 comparisons with each other EMM (along with their color codes). Also, note that we can include arguments that are passed to `summary()`; in this case, to display the back-transformed means. If we are willing to forgo the diagonal comparisons (where neither factor has a common level), we can make this a lot less cluttered via a `by` specification: ```{r, fig.width = 6} pwpp(pigs.cells, by = "source", type = "response") ``` In this latter plot we can see that the comparisons with `skim` as the source tend to be statistically stronger. This is also an opportunity to remind the user that multiplicity adjustments are made relative to each `by` group. For example, comparing `skim:9` versus `skim:15` has a Tukey-adjusted *P* value somewhat greater than 0.1 when all are in one family of 12 means, but about 0.02 relative to a smaller family of 4 means as depicted in the three-paneled plot. #### Compact letter displays {#CLD} Another way to depict comparisons is by *compact letter displays*, whereby two EMMs sharing one or more grouping symbols are not "significantly" different. These may be generated by the `multcomp::cld()` function. I really recommend against this kind of display, though, and decline to illustrate it. These displays promote visually the idea that two means that are "not significantly different" are to be judged as being equal; and that is a very wrong interpretation. In addition, they draw an artificial "bright line" between *P* values on either side of `alpha`, even ones that are very close. [Back to Contents](#contents) ## Other contrasts {#contrasts} Pairwise comparisons are an example of linear functions of EMMs. You may use `coef()` to see the coefficients of these linear functions: ```{r} coef(pairs(pigs.emm.s)) ``` The pairwise comparisons correspond to columns of the above results. For example, the first pairwise comparison, `fish - soy`, gives coefficients of 1, -1, and 0 to fish, soy, and skim, respectively. In cases such as this one, where each column of coefficients sums to zero, the linear functions are termed *contrasts*. The `contrast()` function provides for general contrasts (and linear functions, as well) of factor levels. Its second argument, `method`, specifies which contrast method to use. In this section we describe the built-in methods, which are requested simply by providing the method's name. Consider, for example, the factor `percent` in the model `pigs.lm`. It is treated as a factor in the model, but it corresponds to equally-spaced values of a numeric variable.
In such cases, users often want to compute orthogonal polynomial contrasts: ```{r} pigs.emm.p <- emmeans(pigs.lm, "percent") ply <- contrast(pigs.emm.p, "poly") ply coef(ply) ``` We obtain tests for the linear, quadratic, and cubic trends. The coefficients are those that can be found in tables in many experimental-design texts. It is important to understand that the estimated linear contrast is *not* the slope of a line fitted to the data. It is simply a contrast having coefficients that increase linearly. It *does* test the linear trend, however. There are a number of other named contrast methods, for example `"trt.vs.ctrl"`, `"eff"`, and `"consec"`. The `"pairwise"` and `"revpairwise"` methods in `contrast()` are the same as `pairs()` and `pairs(..., reverse = TRUE)`. See `help("contrast-methods")` for details. [Back to Contents](#contents) ## Formula interface {#formulas} If you already know what contrasts you will want before calling `emmeans()`, a quick way to get them is to specify the method as the left-hand side of the formula in its second argument. For example, with the `oranges` dataset provided in the package, ```{r} org.aov <- aov(sales1 ~ day + Error(store), data = oranges, contrasts = list(day = "contr.sum")) org.emml <- emmeans(org.aov, consec ~ day) org.emml ``` The contrasts shown are the day-to-day changes. This two-sided formula technique is quite convenient, but it can also create confusion. For one thing, the result is not an `emmGrid` object anymore; it is a `list` of `emmGrid` objects, called an `emm_list`. You may need to be cognizant of that if you are to do further contrasts or other analyses. For example, if you want `"eff"` contrasts as well, you need to do `contrast(org.emml[[1]], "eff")` or `contrast(org.emml, "eff", which = 1)`. Another issue is that it may be unclear which part of the results is affected by certain options. For example, if you were to add `adjust = "bonf"` to the `org.emml` call above, would the Bonferroni adjustment be applied to the EMMs, or to the contrasts? (See the documentation if interested; but the best practice is to avoid such dilemmas.) [Back to Contents](#contents) ## Custom contrasts and linear functions {#linfcns} The user may write a custom contrast function for use in `contrast()`. What's needed is a function having the desired name with `".emmc"` appended, that generates the needed coefficients as a list or data frame. The function should take a vector of levels as its first argument, and any optional parameters as additional arguments. For example, suppose we want to compare every third level of a treatment. The following function provides for this: ```{r} skip_comp.emmc <- function(levels, skip = 1, reverse = FALSE) { if((k <- length(levels)) < skip + 1) stop("Need at least ", skip + 1, " levels") coef <- data.frame() coef <- as.data.frame(lapply(seq_len(k - skip - 1), function(i) { sgn <- ifelse(reverse, -1, 1) sgn * c(rep(0, i - 1), 1, rep(0, skip), -1, rep(0, k - i - skip - 1)) })) names(coef) <- sapply(coef, function(x) paste(which(x == 1), "-", which(x == -1))) attr(coef, "adjust") = "fdr" # default adjustment method coef } ``` To test it, try 5 levels: ```{r} skip_comp.emmc(1:5) skip_comp.emmc(1:5, skip = 0, reverse = TRUE) ``` (The latter is the same as `"consec"` contrasts.) Now try it with the `oranges` example we had previously: ```{r} contrast(org.emml[[1]], "skip_comp", skip = 2, reverse = TRUE) ``` ####### {#linfct} The `contrast()` function may in fact be used to compute arbitrary linear functions of EMMs.
Suppose for some reason we want to estimate the quantities $\lambda_1 = \mu_1+2\mu_2-7$ and $\lambda_2 = 3\mu_2-2\mu_3+1$, where the $\mu_j$ are the population values of the `source` EMMs in the `pigs` example. This may be done by providing the coefficients in a list, and the added constants in the `offset` argument: ```{r} LF <- contrast(pigs.emm.s, list(lambda1 = c(1, 2, 0), lambda2 = c(0, 3, -2)), offset = c(-7, 1)) confint(LF, adjust = "bonferroni") ``` [Back to Contents](#contents) ## Special properties of log (and logit) transformations {#logs} Suppose we obtain EMMs for a model having a response transformation or link function. In most cases, when we compute contrasts of those EMMs, there is no natural way to express those contrasts on anything other than the transformed scale. For example, in a model fitted using `glm()` with the `Gamma()` family, the default link function is the inverse. Predictions on such a model are estimates of $1/\mu_j$ for various $j$. Comparisons of predictions will be estimates of $1/\mu_j - 1/\mu_{k}$ for $j \ne k$. There is no natural way to back-transform these differences to some other interpretable scale. However, logs are an exception, in that $\log\mu_j - \log\mu_k = \log(\mu_j/\mu_k)$. Accordingly, when `contrast()` (or `pairs()`) notices that the response is on the log scale, it back-transforms contrasts to ratios when results are to be of `response` type. For example: ```{r} pairs(pigs.emm.s, type = "lp") pairs(pigs.emm.s, type = "response") ``` As is true of EMM summaries with `type = "response"`, the tests and confidence intervals are done before back-transforming. The ratios estimated here are actually ratios of *geometric* means. In general, a model with a log response is in fact a model for *relative* effects of any of its linear predictors, and this back-transformation to ratios goes hand-in-hand with that. In generalized linear models, this behavior occurs in two common cases: Poisson or count regression, for which the usual link is the log; and logistic regression, because logits are logs of odds ratios. [Back to Contents](#contents) [Index of all vignette topics](vignette-topics.html) emmeans/vignettes/vignette-topics.Rmd0000644000176200001440000006300414165066711017511 0ustar liggesusers--- title: "Index of vignette topics" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{Index of vignette topics} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} ---
### Jump to: [A](#a) [B](#b) [C](#c) [D](#d) [E](#e) [F](#f) [G](#g) [H](#h) [I](#i) [J](#j) [K](#k) [L](#l) [M](#m) [N](#n) [O](#o) [P](#p) [Q](#q) [R](#r) [S](#s) [T](#t) [U](#u) [V](#v) [W](#w) [X](#x) [Z](#z) {#topnav} ### A {#a} * [`add_grouping()`](utilities.html#groups) * `adjust` * [in *comparisons: pairwise*](comparisons.html#pairwise) * [in *confidence-intervals: adjust*](confidence-intervals.html#adjust) * [`afex_aov` objects](models.html#V) * [Alias matrix](xplanations.html#submodels) * [Analysis of subsets of data](FAQs.html#model) * Analysis of variance * [versus *post hoc* comparisons](FAQs.html#anova) * [Type III](confidence-intervals.html#joint_tests) * [`aovList` objects](models.html#V) * [`appx-satterthwaite` method](models.html#K) * [`as.data.frame()`](utilities.html#data) * `as.mcmc()` * [in *models: S*](models.html#S) * [in *sophisticated: bayesxtra*](sophisticated.html#bayesxtra) * [ASA Statement on *P* values](basics.html#pvalues) * [Asymptotic tests](sophisticated.html#dfoptions) * [ATOM](basics.html#pvalues) * [`averaging` models](models.html#I) [Back to top](#topnav) ### B {#b} * [Bayes factor](sophisticated.html#bayesxtra) * Bayesian models * [in *models: S*](models.html#S) * [in *sophisticated: mcmc*](sophisticated.html#mcmc) * [**bayesplot** package](sophisticated.html#bayesxtra) * [**bayestestR** package](sophisticated.html#bayesxtra) * [Beta regression](models.html#B) * [`betareg` models](models.html#B) * Bias adjustment * [For link functions vs. response transformations](transformations.html#link-bias) * [in Bayesian models](sophisticated.html#bias-adj-mcmc) * [In GLMMs and GEE models](transformations.html#cbpp) * [When back-transforming](transformations.html#bias-adj) * [When *not* to use](transformations.html#insects) * [Bonferroni adjustment](confidence-intervals.html#adjmore) * [`boot-satterthwaite` method](models.html#K) * [Brackets (`[ ]` and `[[ ]]` operators)](utilities.html#brackets) * [`brmsfit` objects](models.html#S) * [`by` groups](confidence-intervals.html#byvars) * [Identical comparisons](FAQs.html#additive) [Back to top](#topnav) ### C {#c} * [`cld()`](comparisons.html#CLD) * [`clm` models](models.html#O) * [**coda** package](sophisticated.html#bayesxtra) * [`coef()`](comparisons.html#contrasts) * [Cohen's *d*](comparisons.html#pwpm) * [Compact letter displays](comparisons.html#CLD) * Comparison arrows * [Derivation](xplanations.html#arrows) * Comparisons * [Back-transforming](comparisons.html#logs) * [Displaying as groups](comparisons.html#CLD) * [Displaying *P* values](comparisons.html#pwpp) * [Graphical](comparisons.html#graphical) * [How arrows are determined](xplanations.html#arrows) * [with logs](comparisons.html#logs) * [with overlapping CIs](FAQs.html#CIerror) * [Comparisons result in `(nothing)`](FAQs.html#nopairs) * [Confidence intervals](confidence-intervals.html#summary) * [Overlapping](FAQs.html#CIerror) * [`confint()`](confidence-intervals.html#summary) * [`consec` contrasts](comparisons.html#contrasts) * [Constrained marginal means](xplanations.html#submodels) * [Consultants](basics.html#recs3) * [Containment d.f.](models.html#K) * [`contrast()`](comparisons.html#contrasts) * [`adjust`](comparisons.html#linfcns) * [Changing defaults](utilities.html#defaults) * [`combine`](interactions.html#simple) * [`interaction`](interactions.html#contrasts) * [Linear functions](comparisons.html#linfct) * `simple` * [in *confidence-intervals: simple*](confidence-intervals.html#simple) * [in *interactions: simple*](interactions.html#simple) * 
Contrasts * [of other contrasts](interactions.html#contrasts) * [Custom](comparisons.html#linfcns) * [Formula](comparisons.html#formulas) * [Multivariate](interactions.html#multiv) * [Pairwise](comparisons.html#pairwise) * [Polynomial](comparisons.html#contrasts) * Tests of * [with transformations](comparisons.html#logs) * [Count regression](models.html#C) * [`cov.reduce`](messy-data.html#med.covred) * Covariates * [Adjusted](messy-data.html#adjcov) * [`cov.keep`](FAQs.html#numeric) * [`cov.reduce`](FAQs.html#numeric) * [Derived](basics.html#depcovs) * [`emmeans()` doesn't work](FAQs.html#numeric) * [Interacting with factors](FAQs.html#trends) * [Mediating](messy-data.html#mediators) [Back to top](#topnav) ### D {#d} * [Degrees of freedom](sophisticated.html#dfoptions) * [Infinite](FAQs.html#asymp) * [Digits, optimizing](utilities.html#digits) * [Dunnett method](comparisons.html#contrasts) [Back to top](#topnav) ### E {#e} * [`eff` contrasts](comparisons.html#contrasts) * [`eff_size()`](comparisons.html#pwpm) * [Effect size](comparisons.html#pwpm) * [`emm_basis()`](xtending.html#intro) * [Arguments and returned value](xtending.html#ebreqs) * [Communicating with `recover_data()`](xtending.html#communic) * [Dispatching](xtending.html#dispatch) * [Hook functions](xtending.html#hooks) * [for `lqs` objects](xtending.html#eblqs) * [for `rsm` objects](xtending.html#ebrsm) * [`emm_list` object](comparisons.html#formulas) * [`emm_options()`](utilities.html#defaults) * [`.emmc` functions](comparisons.html#linfcns) * **emmeans** package * [Exporting extensions to](xtending.html#exporting) * [`emmeans()`](basics.html#emmeans) * [And the underlying model](FAQs.html#nowelch) * [Changing defaults](utilities.html#defaults) * [Surprising results from](FAQs.html#transformations) * `weights` * [in *basics: weights*](basics.html#weights) * [in *messy-data: weights*](messy-data.html#weights) * [With transformations](transformations.html#regrid) * [`emmGrid` objects](basics.html#emmobj) * [Accessing data](utilities.html#data) * [Combining and subsetting](utilities.html#rbind) * [Modifying](utilities.html#update) * [Setting defaults for](utilities.html#defaults) * `emmip()` * [in *basics: plots*](basics.html#plots) * [in *interactions: factors*](interactions.html#factors) * [nested factors](messy-data.html#cows) * [EMMs](basics.html#EMMdef) * [Appropriateness of](basics.html#eqwts) * [Projecting to a submodel](xplanations.html#submodels) * [What are they?](FAQs.html#what) * `emtrends()` * [in *interactions: covariates*](interactions.html#covariates) * [in *interactions: oranges*](interactions.html#oranges) * [`estHook`](xtending.html#hooks) * [Estimability](messy-data.html#nonestex) * [Estimability issues](FAQs.html#NAs) * [Estimated marginal means](basics.html#EMMdef) * [Defined](basics.html#emmeans) * Examples * [`auto.noise`](interactions.html#factors) * [Bayesian model](sophisticated.html#mcmc) * `cbpp` * [in *sophisticated: mcmc*](sophisticated.html#mcmc) * [in *transformations: cbpp*](transformations.html#cbpp) * [`cows`](messy-data.html#cows) * [`feedlot`](predictions.html#feedlot) * `fiber` * [in *interactions: covariates*](interactions.html#covariates) * [in *transformations: stdize*](transformations.html#stdize) * [`framing`](messy-data.html#mediators) * [Gamma regression](transformations.html#tranlink) * [`InsectSprays`](transformations.html#insects) * [Insurance claims (SAS)](sophisticated.html#offsets) * [Logistic regression](transformations.html#links) * [`lqs` objects](xtending.html#lqs) * 
[`MOats`](basics.html#multiv) * `mtcars` * [in *basics: altering*](basics.html#altering) * [in *messy-data: nuis.example*](messy-data.html#nuis.example) * [Multivariate](basics.html#multiv) * [Nested fixed effects](messy-data.html#cows) * `neuralgia` * [in *transformations: links*](transformations.html#links) * [in *transformations: trangraph*](transformations.html#trangraph) * [`nutrition`](messy-data.html#nutrex) * [`submodel`](messy-data.html#submodels) * [`weights`](messy-data.html#weights) * [`Oats`](sophisticated.html#lmer) * `oranges` * [in *comparisons: formulas*](comparisons.html#formulas) * [in *interactions: oranges*](interactions.html#oranges) * [Ordinal model](sophisticated.html#ordinal) * `pigs` * [in *basics: motivation*](basics.html#motivation) * [in *confidence-intervals: summary*](confidence-intervals.html#summary) * [in *transformations: altscale*](transformations.html#altscale) * [in *transformations: overview*](transformations.html#overview) * [in *transformations: pigs-biasadj*](transformations.html#pigs-biasadj) * [`rlm` objects](xtending.html#rlm) * [Robust regression](xtending.html#rlm) * [Split-plot experiment](sophisticated.html#lmer) * [Unbalanced data](basics.html#motivation) * `warpbreaks` * [in *transformations: tranlink*](transformations.html#tranlink) * [in *utilities: relevel*](utilities.html#relevel) * [Welch's *t* comparisons](utilities.html#relevel) * [`wine`](sophisticated.html#ordinal) * [Exporting output](basics.html#formatting) * Extending **emmeans** * [Exports useful to developers](xtending.html#exported) * [Restrictions](xtending.html#dispatch) [Back to top](#topnav) ### F {#f} * *F* test * [vs. pairwise comparisons](FAQs.html#anova) * [Role in *post hoc* tests](basics.html#recs1) * Factors * [Mediating](messy-data.html#weights) * [Formatting results](basics.html#formatting) * [Frequently asked questions](FAQs.html) [Back to top](#topnav) ### G {#g} * [`gam` models](models.html#G) * [`gamlss` models](models.html#H) * [GEE models](models.html#E) * [Generalized additive models](models.html#G) * [Generalized linear models](models.html#G) * [Geometric means](transformations.html#bias-adj) * [Get the model right first](FAQs.html#nowelch) * [`get_emm_option()`](utilities.html#options) * **ggplot2** package * [in *basics: ggplot*](basics.html#ggplot) * [in *messy-data: cows*](messy-data.html#cows) * [`glm`*xxx* models](models.html#G) * [`gls` models](models.html#K) * [Graphical displays](basics.html#plots) * [Grouping factors](utilities.html#groups) * [Grouping into separate sets](confidence-intervals.html#byvars) [Back to top](#topnav) ### H {#h} * [Hook functions](xtending.html#hooks) * [Hotelling's $T^2$](interactions.html#multiv) * [`hpd.summary()`](sophisticated.html#mcmc) * [`hurdle` models](models.html#C) [Back to top](#topnav) ### I {#i} * [Infinite degrees of freedom](FAQs.html#asymp) * Interactions * [Analysis](interactions.html) * [Contrasts](interactions.html#contrasts) * [Covariate with factors](interactions.html#covariates) * [Implied](interactions.html#oranges) * [Plotting](interactions.html#factors) * [Possible inappropriateness of marginal means](interactions.html#factors) [Back to top](#topnav) ### J {#j} * [`joint`](confidence-intervals.html#joint) * `joint_tests()` * [in *confidence-intervals: joint_tests*](confidence-intervals.html#joint_tests) * [in *interactions: contrasts*](interactions.html#contrasts) * [with `submodel = "type2"`](messy-data.html#type2submodel) [Back to top](#topnav) ### K {#k} * 
[`kable`](basics.html#formatting) * [Kenward-Roger d.f.](models.html#L) [Back to top](#topnav) ### L {#l} * Labels * [Changing](utilities.html#relevel) * [Large models](messy-data.html#nuisance) * [Least-squares means](FAQs.html#what) * Levels * [Changing](utilities.html#relevel) * [Linear functions](comparisons.html#linfct) * [Link functions](transformations.html#links) * [`lme` models](models.html#K) * `lmerMod` models * [in *models: L*](models.html#L) * [in *sophisticated: lmer*](sophisticated.html#lmer) * [System options for](sophisticated.html#lmerOpts) * Logistic regression * [Odds ratios](transformations.html#oddsrats) * [Surprising results](FAQs.html#transformations) * LSD * [protected](basics.html#recs2) [Back to top](#topnav) ### M {#m} * [`make.tran()`](transformations.html#special) * [`mcmc` objects](models.html#S) * Means * [Cell](basics.html#motivation) * [Generalized](transformations.html#bias-adj) * [Marginal](basics.html#motivation) * [Based on a model](basics.html#EMMdef) * [of cell means](basics.html#eqwts) * Weighted * [in *basics: eqwts*](basics.html#eqwts) * [in *basics: weights*](basics.html#weights) * [Mediating covariates](messy-data.html#mediators) * [Memory usage](messy-data.html#nuisance) * [Limiting](messy-data.html#rg.limit) * [`mira` models](models.html#I) * [`misc` attribute and argument](xtending.html#communic) * [`mlm` models](models.html#N) * [`mmer` models](models.html#G) * Model * [Get it right first](basics.html#recs2) * [Importance of](FAQs.html#fastest) * [Importance of getting it right](FAQs.html#nowelch) * [Model averaging](models.html#I) * Models * [Constrained](messy-data.html#submodels) * [Large](messy-data.html#nuisance) * [Quick reference](models.html#quickref) * [Unsupported](FAQs.html#qdrg) * [Multi-factor studies](FAQs.html#interactions) * [Multinomial models](models.html#N) * [Multiple imputation](models.html#I) * [Multiplicity adjustments](confidence-intervals.html#adjust) * [Multivariate contrasts](interactions.html#multiv) * Multivariate models * [in *basics: multiv*](basics.html#multiv) * [in *interactions: oranges*](interactions.html#oranges) * [in *models: M*](models.html#M) * [with `submodel`](xplanations.html#mult.submodel) * [Multivariate *t* (`"mvt"`) adjustment](confidence-intervals.html#adjmore) * [`mvcontrast()`](interactions.html#multiv) * [**mvtnorm** package](confidence-intervals.html#adjmore) [Back to top](#topnav) ### N {#n} * [`NA`s in the output](FAQs.html#NAs) * [Nesting](messy-data.html#nesting) * [Auto-detection](messy-data.html#nest-trap) * Nesting factors * [Creating](utilities.html#groups) * [Non-estimability](messy-data.html#nonestex) * [`non.nuisance`](messy-data.html#nuisance) * [`NonEst` values](FAQs.html#NAs) * [`(nothing)` in output](FAQs.html#nopairs) * [`nuisance`](messy-data.html#nuisance) * [Nuisance factors](messy-data.html#nuisance) [Back to top](#topnav) ### O {#o} * [Observational data](messy-data.html#issues) * [Odds ratios](transformations.html#oddsrats) * [Offsets](sophisticated.html#offsets) * [`opt.digits` option](utilities.html#digits) * [Options](utilities.html#options) * [Startup](utilities.html#startup) * Ordinal models * [Latent scale](sophisticated.html#ordinal) * [Linear-predictor scale](sophisticated.html#ordlp) * [in *models: O*](models.html#O) * [`prob` and `mean.class`](sophisticated.html#ordprob) * [in *sophisticated: ordinal*](sophisticated.html#ordinal) [Back to top](#topnav) ### P {#p} * *P* values * [Adjusted](basics.html#recs1) * [Adjustment is ignored](FAQs.html#noadjust) * 
[Interpreting](basics.html#pvalues) * [`pairs()`](comparisons.html#pairwise) * [Pairwise comparisons](comparisons.html#pairwise) * [Matrix displays](comparisons.html#pwpm) * [`pairwise` contrasts](comparisons.html#contrasts) * [Pairwise *P*-value plots](comparisons.html#pwpp) * [`params`](basics.html#params) * [Percentage differences](transformations.html#altscale) * `plot()` * [nested factors](messy-data.html#cows) * [`plot.emmGrid()`](basics.html#plot.emmGrid) * Plots * [of confidence intervals](basics.html#plot.emmGrid) * [of EMMs](basics.html#plots) * [Interaction-style](basics.html#plots) * [`+` operator](utilities.html#rbind) * Poisson regression * [Surprising results](FAQs.html#transformations) * [`polreg` models](models.html#O) * [Polynomial regression](basics.html#depcovs) * Pooled *t* * [Instead of Welch's *t*](FAQs.html#nowelch) * [`postGridHook`](xtending.html#hooks) * [Practices, recommended](basics.html#recs1) * Predictions * Bayesian models * [in *predictions: bayes*](predictions.html#bayes) * [in *sophisticated: predict-mcmc*](sophisticated.html#predict-mcmc) * [Error SD](predictions.html#sd-estimate) * [graphics](predictions.html#feedlot) * [on Particular strata](predictions.html#strata) * [Posterior predictive distribution](sophisticated.html#predict-mcmc) * [Reference grid](predictions.html#ref-grid) * [Total SD](predictions.html#feedlot) * [`print.summary_emm()`](basics.html#emmobj) * [`pwpm()`](comparisons.html#pwpm) * [`pwpp()`](comparisons.html#pwpp) [Back to top](#topnav) ### Q {#q} * [`qdrg()`](FAQs.html#qdrg) * [Quadratic terms](basics.html#depcovs) * [Quick start](FAQs.html#fastest) [Back to top](#topnav) ### R {#r} * [`rbind()`](utilities.html#rbind) * [Re-labeling](utilities.html#relevel) * [Recommended practices](basics.html#recs1) * [`recover_data()`](xtending.html#intro) * [Communicating with `emm_basis()`](xtending.html#communic) * [`data` and `params` arguments](xtending.html#rdargs) * [Dispatching](xtending.html#dispatch) * [Error handling](xtending.html#rderrs) * [for `lqs` objects](xtending.html#rd.lqs) * [for `rsm` objects](xtending.html#rdrsm) * `recover_data.call()` * [`frame` argument](xtending.html#rdargs) * [`ref_grid()`](basics.html#ref_grid) * [`at`](basics.html#altering) * [`cov.keep`](basics.html#altering) * [`cov.reduce`](basics.html#altering) * [`mult.name`](basics.html#multiv) * [`nesting`](messy-data.html#nest-trap) * [`offset`](sophisticated.html#offsets) * [Reference grids](basics.html#ref_grid) * [Altering](basics.html#altering) * [Prediction on](predictions.html#ref-grid) * [Region of practical equivalence](sophisticated.html#bayesxtra) * [Registering `recover_data` and `emm_basis` methods](xtending.html#exporting) * [`regrid()`](transformations.html#regrid) * [`transform = "log"`](transformations.html#logs) * [`transform` vs. `type`](transformations.html#regrid) * [Response scale](confidence-intervals.html#tran) * [`revpairwise` contrasts](comparisons.html#contrasts) * [`rg.limit` option](messy-data.html#rg.limit) * [RMarkdown](basics.html#formatting) * [ROPE](sophisticated.html#bayesxtra) * [**rsm** package](xtending.html#rsm) * [`rstanarm`](sophisticated.html#mcmc) [Back to top](#topnav) ### S {#s} * [Sample size, displaying](confidence-intervals.html#summary) * Satterthwaite d.f. 
* [in *models: K*](models.html#K) * [in *models: L*](models.html#L) * [`"scale"` type](transformations.html#trangraph) * [`scale()`](transformations.html#stdize) * [Selecting results](utilities.html#brackets) * [Sidak adjustment](confidence-intervals.html#adjust) * Significance * [Assessing](basics.html#pvalues) * [`simple = "each"`](confidence-intervals.html#simple) * Simple comparisons * [in *confidence-intervals: simple*](confidence-intervals.html#simple) * [in *FAQs: interactions*](FAQs.html#interactions) * [in *interactions: simple*](interactions.html#simple) * `specs` * [Formula](comparisons.html#formulas) * [Standardized response](transformations.html#stdize) * [`stanreg` objects](models.html#S) * [* gazing (star gazing)](interactions.html#factors) * [Startup options](utilities.html#startup) * [Statistical consultants](basics.html#recs3) * [Statistics is hard](basics.html#recs3) * [`str()`](basics.html#emmobj) * `submodel` * [`"minimal"` and `"type2"`](messy-data.html#type2submodel) * [in a multivariate model](xplanations.html#mult.submodel) * [in *messy-data: submodels*](messy-data.html#submodels) * [in *xplanations: submodels*](xplanations.html#submodels) * [Subsets of data](FAQs.html#model) * `summary()` * [`adjust`](comparisons.html#pairwise) * [in *basics: emmobj*](basics.html#emmobj) * Bayesian models * [in *confidence-intervals: summary*](confidence-intervals.html#summary) * [in *models: S*](models.html#S) * [Calculated columns](confidence-intervals.html#summary) * [in *confidence-intervals: summary*](confidence-intervals.html#summary) * [HPD intervals](sophisticated.html#mcmc) * [`hpd.summary()`](confidence-intervals.html#summary) * `infer` * [in *comparisons: pairwise*](comparisons.html#pairwise) * [in *confidence-intervals: summary*](confidence-intervals.html#summary) * [Show sample size](confidence-intervals.html#summary) * [`type = "unlink"`](transformations.html#tranlink) * [`summary_emm` object](basics.html#emmobj) * [As a data frame](utilities.html#data) [Back to top](#topnav) ### T {#t} * [*t* tests vs. 
*z* tests](FAQs.html#asymp) * [`test()`](confidence-intervals.html#summary) * [`delta`](confidence-intervals.html#equiv) * [`joint = TRUE`](confidence-intervals.html#joint) * Tests * [Equivalence](confidence-intervals.html#equiv) * [Noninferiority](confidence-intervals.html#equiv) * [Nonzero null](confidence-intervals.html#summary) * [One- and two-sided](confidence-intervals.html#summary) * Transformations * [Adding after the fact](transformations.html#after) * [Auto-detected](transformations.html#auto) * [Back-transforming](confidence-intervals.html#tran) * [Bias adjustment](transformations.html#bias-adj) * [Custom](transformations.html#special) * [faking](transformations.html#faking) * [Faking a log transformation](transformations.html#logs) * [Graphical display](transformations.html#trangraph) * [with link function](transformations.html#tranlink) * [Log](comparisons.html#logs) * [Overview](transformations.html#overview) * [Percent difference](transformations.html#altscale) * [Re-gridding](transformations.html#regrid) * [Response versus link functions](transformations.html#link-bias) * [`scale()`](transformations.html#stdize) * [Standardizing](transformations.html#stdize) * [Timing is everything](transformations.html#timing) * Trends * [Estimating and comparing](interactions.html#oranges) * [`trt.vs.ctrl` contrasts](comparisons.html#contrasts) * [Tukey adjustment](confidence-intervals.html#adjust) * [Ignored or changed](FAQs.html#notukey) * [`type`](confidence-intervals.html#tran) * [`type = "scale"`](transformations.html#trangraph) * [Type II analysis](messy-data.html#type2submodel) * Type III tests * [in *confidence-intervals: joint*](confidence-intervals.html#joint) * [in *confidence-intervals: joint_tests*](confidence-intervals.html#joint_tests) [Back to top](#topnav) ### U {#u} * [Unadjusted tests](confidence-intervals.html#adjmore) * [`update()`](utilities.html#update) * [`tran`](transformations.html#after) * [Using results](utilities.html#data) [Back to top](#topnav) ### V {#v} * [Variables that are not predictors](basics.html#params) * [`vcovHook`](xtending.html#hooks) * Vignettes * [Basics](basics.html) * [Comparisons](comparisons.html) * [Confidence intervals and tests](confidence-intervals.html) * [Explanations supplement](xplanations.html) * [Extending **emmeans**](xtending.html) * [FAQS](FAQs.html) * [Interactions](interactions.html) * [Messy data](messy-data.html) * [Models](models.html) * [Predictions](predictions.html) * [Sophisticated models](sophisticated.html) * [Transformations and link functions](transformations.html) * [Utilities and options](utilities.html) [Back to top](#topnav) ### W {#w} * [`weights`](messy-data.html#weights) * [Welch's *t* comparisons](FAQs.html#nowelch) * [Example](utilities.html#relevel) * [`wt.nuis`](messy-data.html#nuisance) [Back to top](#topnav) ### X {#x} * [`xtable` method](basics.html#formatting) [Back to top](#topnav) ### Z {#z} * [*z* tests](sophisticated.html#dfoptions) * [vs. 
*t* tests](FAQs.html#asymp) * [`zeroinfl` models](models.html#C) [Back to top](#topnav) *Index generated by the [vigindex](https://github.com/rvlenth/vigindex) package.* emmeans/vignettes/interactions.Rmd0000644000176200001440000004347514137062735017076 0ustar liggesusers--- title: "Interaction analysis in emmeans" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{Interaction analysis in emmeans} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, results = "hide", message = FALSE} require("emmeans") options(show.signif.stars = FALSE) knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro", class.message = "re") ``` Models in which predictors interact seem to create a lot of confusion concerning what kinds of *post hoc* methods should be used. It is hoped that this vignette will be helpful in shedding some light on how to use the **emmeans** package effectively in such situations. ## Contents {#contents} 1. [Interacting factors](#factors) a. [Simple contrasts](#simple) 2. [Interaction contrasts](#contrasts) 3. [Multivariate contrasts](#multiv) 4. [Interactions with covariates](#covariates) 9. [Summary](#summary) [Index of all vignette topics](vignette-topics.html) ## Interacting factors {#factors} As an example for this topic, consider the `auto.noise` dataset included with the package. This is a balanced 3x2x2 experiment with three replications. The response -- noise level -- is measured for different sizes of cars and types of anti-pollution filters, on each side of the car.[^1] [^1]: I sure wish I could ask some questions about how these data were collected; for example, are these independent experimental runs, or are some cars measured more than once? The model is based on the independence assumption, but I have my doubts. Let's fit a model and obtain the ANOVA table (because of the scale of the data, we believe that the response is recorded in tenths of decibels; so we compensate for this by scaling the response): ```{r} noise.lm <- lm(noise/10 ~ size * type * side, data = auto.noise) anova(noise.lm) ``` There are statistically strong 2- and 3-way interactions. One mistake that a lot of people seem to make is to proceed too hastily to estimating marginal means (even in the face of all these interactions!). They would go straight to analyses like this: ```{r} emmeans(noise.lm, pairwise ~ size) ``` The analyst-in-a-hurry would thus conclude that the noise level is higher for medium-sized cars than for small or large ones. But as is seen in the message before the output, `emmeans()` valiantly tries to warn you that it may not be a good idea to average over factors that interact with the factor of interest. It isn't *always* a bad idea to do this, but sometimes it definitely is. What about this time? I think a good first step is always to try to visualize the nature of the interactions before doing any statistical comparisons. The following plot helps. ```{r} emmip(noise.lm, type ~ size | side) ``` Examining this plot, we see that the "medium" mean is not always higher; so the marginal means, and the way they compare, do not represent what is always the case. Moreover, what is evident in the plot is that the peak for medium-size cars occurs for only one of the two filter types. So it seems more useful to do the comparisons of size separately for each filter type.
This is easily done, simply by conditioning on `type`: ```{r} emm_s.t <- emmeans(noise.lm, pairwise ~ size | type) emm_s.t ``` Not too surprisingly, the statistical comparisons are all different for standard filters, but with Octel filters, there isn't much of a difference between small and medium size. For comparing the levels of other factors, similar judgments must be made. It may help to construct other interaction plots with the factors in different roles. In my opinion, almost all meaningful statistical analysis should be grounded in evaluating the practical impact of the estimated effects *first*, and seeing if the statistical evidence backs it up. Those who put all their attention on how many asterisks (I call these people "`*` gazers") are ignoring the fact that these don't measure the sizes of the effects on a practical scale.[^2] An effect can be practically negligible and still have a very small *P* value -- or practically important but have a large *P* value -- depending on sample size and error variance. Failure to describe what is actually going on in the data is a failure to do an adequate analysis. Use lots of plots, and *think* about the results. For more on this, see the discussion of *P* values in the ["basics" vignette](basics.html#pvalues). [^2]: You may have noticed that there are no asterisks in the ANOVA table in this vignette. I habitually opt out of star-gazing by including `options(show.signif.stars = FALSE)` in my `.Rprofile` file. ### Simple contrasts {#simple} An alternative way to specify conditional contrasts or comparisons is through the use of the `simple` argument to `contrast()` or `pairs()`, which amounts to specifying which factors are *not* used as `by` variables. For example, consider: ```{r} noise.emm <- emmeans(noise.lm, ~ size * side * type) ``` Then `pairs(noise.emm, simple = "size")` is the same as `pairs(noise.emm, by = c("side", "type"))`. One may specify a list for `simple`, in which case separate runs are made with each element of the list. Thus, `pairs(noise.emm, simple = list("size", c("side", "type")))` returns two sets of contrasts: comparisons of `size` for each combination of the other two factors; and comparisons of `side*type` combinations for each `size`. A shortcut that generates all simple main-effect comparisons is to use `simple = "each"`. In this example, the result is the same as obtained using `simple = list("size", "side", "type")`. Ordinarily, when `simple` is a list (or equal to `"each"`), a list of contrast sets is returned. However, if the additional argument `combine` is set to `TRUE`, they are all combined into one family: ```{r} contrast(noise.emm, "consec", simple = "each", combine = TRUE, adjust = "mvt") ``` The dots (`.`) in this result correspond to which simple effect is being displayed. If we re-run this same call with `combine = FALSE` or omitted, these twenty comparisons would be displayed in three broad sets of contrasts, each broken down further by combinations of `by` variables, each separately multiplicity-adjusted (a total of 16 different tables). [Back to Contents](#contents) ## Interaction contrasts {#contrasts} An interaction contrast is a contrast of contrasts. For instance, in the auto-noise example, we may want to obtain the linear and quadratic contrasts of `size` separately for each `type`, and compare them.
Here are estimates of those contrasts: ```{r} contrast(emm_s.t[[1]], "poly") ## 'by = "type"' already in previous result ``` The comparison of these contrasts may be done using the `interaction` argument in `contrast()` as follows: ```{r} IC_st <- contrast(emm_s.t[[1]], interaction = c("poly", "consec"), by = NULL) IC_st ``` (Using `by = NULL` restores `type` to a primary factor in these contrasts.) The practical meaning of this is that there isn't a statistical difference in the linear trends, but the quadratic trend for Octel is greater than for standard filter types. (Both quadratic trends are negative, so in fact it is the standard filters that have more pronounced *downward* curvature, as is seen in the plot.) In case you need to understand more clearly what contrasts are being estimated, the `coef()` method helps: ```{r} coef(IC_st) ``` Note that the 4th through 6th contrast coefficients are the negatives of the 1st through 3rd -- thus a comparison of two contrasts. By the way, "type III" tests of interaction effects can be obtained via interaction contrasts: ```{r} test(IC_st, joint = TRUE) ``` This result is exactly the same as the *F* test of `size:type` in the `anova` output. The three-way interaction may be explored via interaction contrasts too: ```{r} contrast(emmeans(noise.lm, ~ size*type*side), interaction = c("poly", "consec", "consec")) ``` One interpretation of this is that the comparison by `type` of the linear contrasts for `size` is different on the left side than on the right side; but the comparison of that comparison of the quadratic contrasts, not so much. Refer again to the plot, and this can be discerned as a comparison of the interaction in the left panel versus the interaction in the right panel. Finally, **emmeans** provides a `joint_tests()` function that obtains and tests the interaction contrasts for all effects in the model and compiles them in one Type-III-ANOVA-like table: ```{r} joint_tests(noise.lm) ``` You may even add `by` variable(s) to obtain separate ANOVA tables for the remaining factors: ```{r} joint_tests(noise.lm, by = "side") ``` [Back to Contents](#contents) ## Multivariate contrasts {#multiv} In the preceding sections, the way we addressed interacting factors was to do comparisons or contrasts of some factor(s) separately at levels of other factor(s). This leads to a lot of estimates and associated tests. Another approach is to compare things in a multivariate way. In the auto-noise example, for example, we have four means (corresponding to the four combinations of `type` and `side`) with each size of car, and we could consider comparing these *sets* of means. Such multivariate comparisons can be done via the *Mahalanobis distance* (a kind of standardized distance measure) between one set of four means and another. This is facilitated by the `mvcontrast()` function: ```{r} mvcontrast(noise.emm, "pairwise", mult.name = c("type", "side")) ``` In this output, the `T.square` values are Hotelling's $T^2$ statistics, which are the squared Mahalanobis distances among the sets of four means. These results thus accomplish a similar objective as the initial comparisons presented in this vignette, but are not complicated by the issue that the factors interact. (Instead, we lose the directionality of the comparisons.) While all comparisons are "significant," the `T.square` values indicate that large cars are statistically most different from the other sizes. We may still break things down using `by` variables.
Suppose, for example, we wish to compare the two filter types for each size of car, without regard to which side: ```{r} update(mvcontrast(noise.emm, "consec", mult.name = "side", by = "size"), by = NULL) ``` One detail to note about multivariate comparisons: in order to make complete sense, all the factors involved must interact. Suppose we were to repeat the initial multivariate comparison after removing all interactions: ```{r} mvcontrast(update(noise.emm, submodel = ~ side + size + type), "pairwise", mult.name = c("type", "side")) ``` Note that each $F$ ratio now has 1 d.f. Also, note that `T.square = F.ratio`, and you can verify that these values are equal to the squares of the `t.ratio`s in the initial example in this vignette ($(-6.147)^2 = 37.786$, etc.). That is, if we ignore all interactions, the multivariate tests are exactly equivalent to the univariate tests of the marginal means. [Back to Contents](#contents) ## Interactions with covariates {#covariates} When a covariate and a factor interact, we typically don't want EMMs themselves, but rather estimates of *slopes* of the covariate trend for each level of the factor. As a simple example, consider the `fiber` dataset, and fit a model including the interaction between `diameter` (a covariate) and `machine` (a factor): ```{r} fiber.lm <- lm(strength ~ diameter*machine, data = fiber) ``` This model comprises fitting, for each machine, a separate linear trend for `strength` versus `diameter`. Accordingly, we can estimate and compare the slopes of those lines via the `emtrends()` function: ```{r} emtrends(fiber.lm, pairwise ~ machine, var = "diameter") ``` We see the three slopes, but no two of them test as being statistically different. To visualize the lines themselves, you may use ```{r fig.height = 2} emmip(fiber.lm, machine ~ diameter, cov.reduce = range) ``` The `cov.reduce = range` argument is passed to `ref_grid()`; it is needed because by default, each covariate is reduced to only one value (see the ["basics" vignette](basics.html)). Instead, we call the `range()` function to obtain the minimum and maximum diameter. ######### {#oranges} For a more sophisticated example, consider the `oranges` dataset included with the package. These data concern the sales of two varieties of oranges. The prices (`price1` and `price2`) were experimentally varied in different stores and on different days, and the responses `sales1` and `sales2` were observed. Let's consider three multivariate models for these data, with additive effects for days and stores, and differing levels of complexity in how the prices enter the model: ```{r} org.quad <- lm(cbind(sales1, sales2) ~ poly(price1, price2, degree = 2) + day + store, data = oranges) org.int <- lm(cbind(sales1, sales2) ~ price1 * price2 + day + store, data = oranges) org.add <- lm(cbind(sales1, sales2) ~ price1 + price2 + day + store, data = oranges) ``` Because these are multivariate models, **emmeans** methods will distinguish the responses as if they were levels of a factor, which we will name "variety". Moreover, separate effects are estimated for each multivariate response, so there is an *implied interaction* between `variety` and each of the predictors involving `price1` and `price2`. (In `org.int`, there is an implied three-way interaction.)
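To verify this structure directly, it can help to look at the reference grid itself. Here is a minimal sketch (`mult.name` merely names the pseudo-factor created for the two responses; if it is omitted, `ref_grid()` chooses a default name):

```{r, eval = FALSE}
# The two responses appear as levels of a pseudo-factor named "variety"
str(ref_grid(org.int, mult.name = "variety"))
```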
An interesting way to view these models is to look at how they predict sales of each variety at the observed values of the prices: ```{r} emmip(org.quad, price2 ~ price1 | variety, mult.name = "variety", cov.reduce = FALSE) ``` The trends portrayed here are quite sensible: In the left panel, as we increase the price of variety 1, sales of that variety will tend to decrease -- and the decrease will be faster when the other variety of oranges is low-priced. In the right panel, as the price of variety 1 increases, sales of variety 2 will increase when it is low-priced, but could also decrease at high prices because oranges in general are just too expensive. A plot like this for `org.int` will be similar but all the curves will be straight lines; and the one for `org.add` will have all lines parallel. In all models, though, there are implied `price1:variety` and `price2:variety` interactions, because we have different regression coefficients for the two responses. Which model should we use? They are nested models, so they can be compared by `anova()`: ```{r} anova(org.quad, org.int, org.add) ``` It seems like the full-quadratic model has little advantage over the interaction model. There truly is nothing magical about a *P* value of 0.05, and we have enough data that over-fitting is not a hazard; so I like `org.int`. However, what follows could be done with any of these models. To summarize and test the results compactly, it makes sense to obtain estimates of a representative trend in each of the left and right panels, and perhaps to compare them. In turn, that can be done by obtaining the slope of the curve (or line) at the average value of `price2`. The `emtrends()` function is designed for exactly this kind of purpose. It uses a difference quotient to estimate the slope of a line fitted to a given variable. It works just like `emmeans()` except for requiring the variable to use in the difference quotient. Using the `org.int` model: ```{r} emtrends(org.int, pairwise ~ variety, var = "price1", mult.name = "variety") ``` From this, we can say that, starting with `price1` and `price2` both at their average values, we expect `sales1` to decrease by about .75 per unit increase in `price1`; meanwhile, there is a suggestion of a slight increase of `sales2`, but without much statistical evidence. Marginally, the first variety has a 0.89 disadvantage relative to sales of the second variety. Other analyses (not shown) with `price2` set at a higher value will reduce these effects, while setting `price2` lower will exaggerate all these effects. If the same analysis is done with the quadratic model, the trends are curved, and so the results will depend somewhat on the setting for `price1`. The graph above gives an indication of the nature of those changes. Similar results hold when we analyze the trends for `price2`: ```{r} emtrends(org.int, pairwise ~ variety, var = "price2", mult.name = "variety") ``` At the averages, increasing the price of variety 2 has the effect of decreasing sales of variety 2 while slightly increasing sales of variety 1 -- a marginal difference of about .92. [Back to Contents](#contents) ## Summary {#summary} Interactions, by nature, make things more complicated. One must resist pressures and inclinations to try to produce simple bottom-line conclusions. Interactions require more work and more patience; they require presenting more cases -- more than are presented in the examples in this vignette -- in order to provide a complete picture.
[Index of all vignette topics](vignette-topics.html) emmeans/vignettes/xtending.Rmd0000644000176200001440000007617314137062735016221 0ustar liggesusers--- title: "For developers: Extending **emmeans**" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{For developers: Extending emmeans} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, results = "hide", message = FALSE} require("emmeans") knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") set.seed(271828) ``` ## Contents {#contents} This vignette explains how developers may incorporate **emmeans** support in their packages. If you are a user looking for a quick way to obtain results for an unsupported model, you are probably better off trying to use the `qdrg()` function. 1. [Introduction](#intro) 2. [Data example](#dataex) 3. [Supporting `rlm` objects](#rlm) 4. [Supporting `lqs` objects](#lqs) 5. [Communication between methods](#communic) 5. [Hook functions](#hooks) 6. [Exported methods from **emmeans**](#exported) 7. [Existing support for `rsm` objects](#rsm) 7. [Dispatching and restrictions](#dispatch) 8. [Exporting and registering your methods](#exporting) 9. [Conclusions](#concl) [Index of all vignette topics](vignette-topics.html) ## Introduction {#intro} Suppose you want to use **emmeans** for some type of model that it doesn't (yet) support. Or, suppose you have developed a new package with a fancy model-fitting function, and you'd like it to work with **emmeans**. What can you do? Well, there is hope because **emmeans** is designed to be extended. The first thing to do is to look at the help page for extending the package: ```{r eval=FALSE} help("extending-emmeans", package="emmeans") ``` It gives details about the fact that you need to write two S3 methods, `recover_data` and `emm_basis`, for the class of object that your model-fitting function returns. The `recover_data` method is needed to recreate the dataset so that the reference grid can be identified. The `emm_basis` method then determines the linear functions needed to evaluate each point in the reference grid and to obtain associated information---such as the variance-covariance matrix---needed to do estimation and testing. These methods must also be exported from your package so that they are available to users. See the section on [exporting the methods](#exporting) for details and suggestions. This vignette presents an example where suitable methods are developed, and discusses a few issues that arise. [Back to Contents](#contents) ## Data example {#dataex} The **MASS** package contains various functions that do robust or outlier-resistant model fitting. We will cobble together some **emmeans** support for these. But first, let's create a suitable dataset (a simulated two-factor experiment) for testing. ```{r} fake = expand.grid(rep = 1:5, A = c("a1","a2"), B = c("b1","b2","b3")) fake$y = c(11.46,12.93,11.87,11.01,11.92,17.80,13.41,13.96,14.27,15.82, 23.14,23.75,-2.09,28.43,23.01,24.11,25.51,24.11,23.95,30.37, 17.75,18.28,17.82,18.52,16.33,20.58,20.55,20.77,21.21,20.10) ``` The `y` values were generated using predetermined means and Cauchy-distributed errors. There are some serious outliers in these data. ## Supporting `rlm` {#rlm} The **MASS** package provides an `rlm` function that fits robust-regression models using *M* estimation. 
We'll fit a model using the default settings for all tuning parameters:
```{r}
library(MASS)
fake.rlm = rlm(y ~ A * B, data = fake)

library(emmeans)
emmeans(fake.rlm, ~ B | A)
```
The first lesson to learn about extending **emmeans** is that sometimes, it already works! It works here because `rlm` objects inherit from `lm`, which is supported by the **emmeans** package, and `rlm` objects aren't different enough to create any problems.

[Back to Contents](#contents)

## Supporting `lqs` objects {#lqs}

The **MASS** resistant-regression functions `lqs`, `lmsreg`, and `ltsreg` are another story, however. They create `lqs` objects that are not extensions of any other class, and have other issues, including not even having a `vcov` method. So for these, we really do need to write new methods for `lqs` objects. First, let's fit a model.
```{r}
fake.lts = ltsreg(y ~ A * B, data = fake)
```

### The `recover_data` method {#rd.lqs}

It is usually an easy matter to write a `recover_data` method. Look at the one for `lm` objects:
```{r}
emmeans:::recover_data.lm
```
Note that all it does is obtain the `call` component and call the method for class `call`, with additional arguments for its `terms` component and `na.action`. It happens that we can access these attributes in exactly the same way as for `lm` objects; so:
```{r}
recover_data.lqs = emmeans:::recover_data.lm
```
Let's test it:
```{r}
rec.fake = recover_data(fake.lts)
head(rec.fake)
```
Our recovered data excludes the response variable `y` (owing to the `delete.response` call), and this is fine.

#### Special arguments {#rdargs}

By the way, there are two special arguments `data` and `params` that may be handed to `recover_data` via `ref_grid` or `emmeans` or a related function; and you may need to provide for them if you don't use the `recover_data.call` function. The `data` argument is needed to cover a desperate situation that occurs with certain kinds of models where the underlying data information is not saved with the object---e.g., models that are fitted by iteratively modifying the data. In those cases, the only way to recover the data is for the user to give it explicitly, and `recover_data` just adds a few needed attributes to it.

The `params` argument is needed when the model formula refers to variables besides predictors. For example, a model may include a spline term, and the knots are saved in the user's environment as a vector and referred to in the call to fit the model. In trying to recover the data, we try to construct a data frame containing all the variables present on the right-hand side of the model, but if some of those are scalars or of different lengths than the number of observations, an error occurs. So you need to exclude any names in `params` when reconstructing the data.

Many model objects contain the model frame as a slot; for example, a model fitted with `lm(..., model = TRUE)` has a member `$model` containing the model frame. This can be useful for recovering the data, provided none of the predictors are transformed (when predictors are transformed, the original predictor values are not in the model frame so it's harder to recover them). Therefore, when the model frame is available in the model object, it should be provided in the `frame` argument of `recover_data.call()`; then when `data = NULL`, a check is made on `trms`, and if it has no function calls, then `data` is set to `frame`. Of course, in the rarer case where the original data are available in the model object, specify that as `data`.
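Pulling these pieces together, here is a minimal sketch of how `data`, `params`, and `frame` might be forwarded. Here, `"myfit"` is a hypothetical model class whose objects store the call, a `terms` component, an `na.action`, and the model frame in `$model`:
```r
# Sketch only: "myfit" is a hypothetical class, not part of any package
recover_data.myfit = function(object, data = NULL, params = NULL, ...) {
    fcall = object$call
    # forward data and params, and offer the saved model frame as a fallback
    emmeans::recover_data(fcall, delete.response(terms(object)), 
        object$na.action, data = data, params = params, 
        frame = object$model, ...)
}
```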
#### Error handling {#rderrs} If you check for any error conditions in `recover_data`, simply have it return a character string with the desired message, rather than invoking `stop`. This provides a cleaner exit. The reason is that whenever `recover_data` throws an error, an informative message suggesting that `data` or `params` be provided is displayed. But a character return value is tested for and throws a different error with your string as the message. ### The `emm_basis` method {#ebreqs} The `emm_basis` method has four required arguments: ```{r} args(emmeans:::emm_basis.lm) ``` These are, respectively, the model object, its `terms` component (at least for the right-hand side of the model), a `list` of levels of the factors, and the grid of predictor combinations that specify the reference grid. The function must obtain six things and return them in a named `list`. They are the matrix `X` of linear functions for each point in the reference grid, the regression coefficients `bhat`; the variance-covariance matrix `V`; a matrix `nbasis` for non-estimable functions; a function `dffun(k,dfargs)` for computing degrees of freedom for the linear function `sum(k*bhat)`; and a list `dfargs` of arguments to pass to `dffun`. Optionally, the returned list may include a `model.matrix` element (the model matrix for the data or a compact version thereof obtained via `.cmpMM()`), which, if included, enables the `submodel` option. To write your own `emm_basis` function, examining some of the existing methods can help; but the best resource is the `predict` method for the object in question, looking carefully to see what it does to predict values for a new set of predictors (e.g., `newdata` in `predict.lm`). Following this advice, let's take a look at it: ```{r} MASS:::predict.lqs ``` ###### {#eblqs} Based on this, here is a listing of an `emm_basis` method for `lqs` objects: ```{r} emm_basis.lqs = function(object, trms, xlev, grid, ...) { m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) X = model.matrix(trms, m, contrasts.arg = object$contrasts) bhat = coef(object) Xmat = model.matrix(trms, data=object$model) # 5 V = rev(object$scale)[1]^2 * solve(t(Xmat) %*% Xmat) nbasis = matrix(NA) dfargs = list(df = nrow(Xmat) - ncol(Xmat)) dffun = function(k, dfargs) dfargs$df list(X = X, bhat = bhat, nbasis = nbasis, V = V, #10 dffun = dffun, dfargs = dfargs) } ``` Before explaining it, let's verify that it works: ```{r} emmeans(fake.lts, ~ B | A) ``` Hooray! Note the results are comparable to those we had for `fake.rlm`, albeit the standard errors are quite a bit smaller. (In fact, the SEs could be misleading; a better method for estimating covariances should probably be implemented, but that is beyond the scope of this vignette.) [Back to Contents](#contents) ### Dissecting `emm_basis.lqs` Let's go through the listing of this method, line-by-line: * Lines 2--3: Construct the linear functions, `X`. This is a pretty standard two-step process: First obtain a model frame, `m`, for the grid of predictors, then pass it as data to `model.matrix` to create the associated design matrix. As promised, this code is essentially identical to what you find in `predict.lqs`. * Line 4: Obtain the coefficients, `bhat`. Most model objects have a `coef` method. * Lines 5--6: Obtain the covariance matrix, `V`, of `bhat`. In many models, this can be obtained using the object's `vcov` method. But not in this case. 
Instead, I cobbled one together using the inverse of the **X'X** matrix as in ordinary regression, and the variance estimate found in the last element of the `scale` element of the object. This probably under-estimates the variances and distorts the covariances, because robust estimators have some efficiency loss.
* Line 7: Compute the basis for non-estimable functions. This applies only when there is a possibility of rank deficiency in the model. But `lqs` methods don't allow rank deficiencies, so if we have fitted such a model, we can be sure that all linear functions are estimable; we signal that by setting `nbasis` equal to a 1 x 1 matrix of `NA`. If rank deficiency were possible, the **estimability** package (which is required by **emmeans**) provides a `nonest.basis` function that makes this fairly painless---I would have coded `nbasis = estimability::nonest.basis(Xmat)`.

  There are some subtleties you need to know regarding estimability. Suppose the model is rank-deficient, so that the design matrix **X** has *p* columns but rank *r* < *p*. In that case, `bhat` should be of length *p* (not *r*), and there should be *p* - *r* elements equal to `NA`, corresponding to columns of **X** that were excluded from the fit. Also, `X` should have all *p* columns. In other words, do not alter or throw out columns of `X` or their corresponding elements of `bhat`---even those with `NA` coefficients---as they are essential for assessing estimability. `V` should be *r* x *r*, however---the covariance matrix for the non-excluded predictors.
* Lines 8--9: Obtain `dffun` and `dfargs`. This is a little awkward because it is designed to allow support for mixed models, where approximate methods may be used to obtain degrees of freedom. The function `dffun` is expected to have two arguments: `k`, the vector of coefficients applied to `bhat`, and `dfargs`, a list containing any additional arguments. In this case (and in many other models), the degrees of freedom are the same regardless of `k`. We put the required degrees of freedom in `dfargs` and write `dffun` so that it simply returns that value. (Note: If asymptotic tests and CIs are desired, return `Inf` degrees of freedom.)
* Line 10: Return these results in a named list.

[Back to Contents](#contents)

## Communication between methods {#communic}

If you need to pass information obtained in `recover_data()` to the `emm_basis()` method, simply incorporate it as `attr(data, "misc")` where `data` is the dataset returned by `recover_data()`. Subsequently, that attribute is available in `emm_basis()` by adding a `misc` argument.

## Hook functions {#hooks}

Most linear models supported by **emmeans** have straightforward structure: Regression coefficients, their covariance matrix, and a set of linear functions that define the reference grid. However, a few are more complex. An example is the `clm` class in the **ordinal** package, which allows a scale model in addition to the location model. When a scale model is used, the scale parameters are included in the model matrix, regression coefficients, and covariance matrix, and we can't just use the usual matrix operations to obtain estimates and standard errors.
To facilitate using custom routines for these tasks, the `emm_basis.clm` function provided in **emmeans** includes, in its `misc` part, the names (as character constants) of two "hook" functions: `misc$estHook` has the name of the function to call when computing estimates, standard errors, and degrees of freedom (for the `summary` method); and `misc$vcovHook` has the name of the function to call to obtain the covariance matrix of the grid values (used by the `vcov` method). These functions are called in lieu of the usual built-in routines for these purposes, and return the appropriately sized matrices.

In addition, you may want to apply some form of special post-processing after the reference grid is constructed. To provide for this, give the name of your function to post-process the object in `misc$postGridHook`. Again, `clm` objects (as well as `polr` in the **MASS** package) serve as an example. They allow a `mode` specification that, in two cases, calls for post-processing. The `"cum.prob"` mode uses the `regrid` function to transform the linear predictor to the cumulative-probability scale. And the `"prob"` mode performs this, as well as applying the contrasts necessary to convert the cumulative probabilities into the class probabilities.

[Back to Contents](#contents)

## Exported methods from **emmeans** {#exported}

For package developers' convenience, **emmeans** exports some of its S3 methods for `recover_data` and/or `emm_basis`---use `methods("recover_data")` and `methods("emm_basis")` to discover which ones. It may be that all you need is to invoke one of those methods and perhaps make some small changes---especially if your model-fitting algorithm makes heavy use of an existing model type supported by **emmeans**. For those methods that are not exported, use `.recover_data()` and `.emm_basis()`, which run in **emmeans**'s namespace, thus providing access to all available methods.

A few additional functions are exported because they may be useful to developers. They are as follows:

* `emmeans::.all.vars(expr, retain)` Some users of your package may include `$` or `[[]]` operators in their model formulas. If you need to get the variable names, `base::all.vars` will probably not give you what you need. For example, if `form = ~ data$x + data[[5]]`, then `base::all.vars(form)` returns the names `"data"` and `"x"`, whereas `emmeans::.all.vars(form)` returns the names `"data$x"` and `"data[[5]]"`. The `retain` argument may be used to specify regular expressions for patterns to retain as parts of variable names.
* `emmeans::.diag(x, nrow, ncol)` The base `diag` function has a booby trap whereby, for example, `diag(57.6)` returns a 57 x 57 identity matrix rather than a 1 x 1 matrix with 57.6 as its only element. But `emmeans::.diag(57.6)` will return the latter. The function works identically to `diag` except for its end run around the identity-matrix trap.
* `emmeans::.aovlist.dffun(k, dfargs)` This function is exported because it is needed for computing degrees of freedom for models fitted using `aov`, but it may be useful for other cases where Satterthwaite degrees-of-freedom calculations are needed. It requires the `dfargs` slot to contain analogous contents.
* `emmeans::.get.offset(terms, grid)` If `terms` is a model formula containing an `offset` call, this will compute that offset in the context of `grid` (a `data.frame`).
* `emmeans::.my.vcov(object, ...)` In a call to `ref_grid`, `emmeans`, etc., the user may use `vcov.` to specify an alternative function or matrix to use as the covariance matrix of the fixed-effects coefficients. This function supports that feature. Calling `.my.vcov` in place of the `vcov` method will substitute the user's `vcov.` when it is specified. * `emmeans::.std.link.labels(fam, misc)` This is useful in `emm_basis` methods for generalized linear models. Call it with `fam` equal to the `family` object for your model, and `misc` either an existing list, or just `list()` if none. It returns a new `misc` list containing the link function and, in some cases, extra features that are used for certain types of link functions (e.g., for a log link, the setups for returning ratio comparisons with `type = "response"`). * `emmeans::.num.key(levs, key)` Returns integer indices of elements of `key` in `levs` when `key` is a character vector; or just returns integer values if already integer. Also throws an error if levels are mismatched or indices exceed legal range. This is useful in custom contrast functions (`.emmc` functions). * `emmeans::.get.excl(levs, exclude, include)` This is support for the `exclude` and `include` arguments of contrast functions. It checks legality and returns an integer vector of `exclude` indices in `levs`, given specified integer or character arguments `exclude` and `include`. In your `.emmc` function, `exclude` should default to `integer(0)` and `include` should have no default. * `emmeans::.cmpMM(X, weights, assign)` creates a compact version of the model matrix `X` (or, preferably, its QR decomposition). This is useful if we want an `emm_basis()` method to return a `model.matrix` element. The returned result is just the R portion of the QR decomposition of `diag(sqrt(weights)) %*% X`, with the `assign` attribute added. If `X` is a `qr` object, we assume the weights are already incorporated, as is true of the `qr` slot of a `lm` object. [Back to Contents](#contents) ## Existing support for `rsm` objects {#rsm} As a nontrivial example of how an existing package supports **emmeans**, we show the support offered by the **rsm** package. Its `rsm` function returns an `rsm` object which is an extension of the `lm` class. Part of that extension has to do with `coded.data` structures whereby, as is typical in response-surface analysis, models are fitted to variables that have been linearly transformed (coded) so that the scope of each predictor is represented by plus or minus 1 on the coded scale. Without any extra support in **rsm**, `emmeans` will work just fine with `rsm` objects; but if the data are coded, it becomes awkward to present results in terms of the original predictors on their original, uncoded scale. The `emmeans`-related methods in **rsm** provide a `mode` argument that may be used to specify whether we want to work with coded or uncoded data. The possible values for `mode` are `"asis"` (ignore any codings, if present), `"coded"` (use the coded scale), and `"decoded"` (use the decoded scale). The first two are actually the same in that no decoding is done; but it seems clearer to provide separate options because they represent two different situations. ### The `recover_data` method {#rdrsm} Note that coding is a *predictor* transformation, not a response transformation (we could have that, too, as it's already supported by the **emmeans** infrastructure). 
So, to handle the `"decoded"` mode, we will need to actually decode the predictors used to construct the reference grid. That means we need to make `recover_data` a lot fancier! Here it is:
```{r}
recover_data.rsm = function(object, data, mode = c("asis", "coded", "decoded"), ...) {
    mode = match.arg(mode)
    cod = rsm::codings(object)
    fcall = object$call
    if(is.null(data))                                                     # 5
        data = emmeans::recover_data(fcall, 
                   delete.response(terms(object)), object$na.action, ...)
    if (!is.null(cod) && (mode == "decoded")) {
        pred = cpred = attr(data, "predictors")
        trms = attr(data, "terms")                                        #10
        data = rsm::decode.data(rsm::as.coded.data(data, formulas = cod))
        for (form in cod) {
            vn = all.vars(form)
            if (!is.na(idx <- grep(vn[1], pred))) {
                pred[idx] = vn[2]                                         #15
                cpred = setdiff(cpred, vn[1])
            }
        }
        attr(data, "predictors") = pred
        new.trms = update(trms, reformulate(c("1", cpred)))               #20
        attr(new.trms, "orig") = trms
        attr(data, "terms") = new.trms
        attr(data, "misc") = cod
    }
    data
}
```
Lines 2--7 ensure that `mode` is legal, retrieve the codings from the object, and obtain the results we would get from `recover_data` had it been an `lm` object. If `mode` is not `"decoded"`, *or* if no codings were used, that's all we need. Otherwise, we need to return the decoded data. However, it isn't quite that simple, because the model equation is still defined on the coded scale. Rather than try to translate the model coefficients and covariance matrix to the decoded scale, we elected to remember what we will need to do later to put things back on the coded scale.

In lines 9--10, we retrieve the attributes of the recovered data that provide the predictor names and `terms` object on the coded scale. In line 11, we replace the recovered data with the decoded data.

By the way, the codings comprise a list of formulas with the coded name on the left and the original variable name on the right. It is possible that only some of the predictors are coded (for example, blocking factors will not be). In the `for` loop in lines 12--18, the coded predictor names are replaced with their decoded names. For technical reasons to be discussed later, we also remove these coded predictor names from a copy, `cpred`, of the list of all predictors in the coded model. In line 19, the `"predictors"` attribute of `data` is replaced with the modified version.

Now, there is a nasty technicality. The `ref_grid` function in **emmeans** has a few lines of code after `recover_data` is called that determine if any terms in the model convert covariates to factors or vice versa; and this code uses the model formula. That formula involves variables on the coded scale, and those variables are no longer present in the data, so an error will occur if it tries to access them. Luckily, if we simply take those terms out of the formula, it won't hurt because those coded predictors would not have been converted in that way. So in line 20, we update `trms` with a simpler model with the coded variables excluded (the intercept is explicitly included to ensure there will be a right-hand side even if `cpred` is empty). We save that as the `terms` attribute, and the original terms as a new `"orig"` attribute to be retrieved later. The `data` object, modified or not, is returned. If data have been decoded, `ref_grid` will construct its grid using decoded variables. In line 23, we save the codings as the `"misc"` attribute, to be accessed later by `emm_basis()`.

### The `emm_basis` method {#ebrsm}

Now comes the `emm_basis` method that will be called after the grid is defined.
It is listed below: ```{r} emm_basis.rsm = function(object, trms, xlev, grid, mode = c("asis", "coded", "decoded"), misc, ...) { mode = match.arg(mode) cod = misc if(!is.null(cod) && mode == "decoded") { # 5 grid = rsm::coded.data(grid, formulas = cod) trms = attr(trms, "orig") } m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) #10 X = model.matrix(trms, m, contrasts.arg = object$contrasts) bhat = as.numeric(object$coefficients) V = emmeans::.my.vcov(object, ...) if (sum(is.na(bhat)) > 0) #15 nbasis = estimability::nonest.basis(object$qr) else nbasis = estimability::all.estble dfargs = list(df = object$df.residual) dffun = function(k, dfargs) dfargs$df #20 list(X = X, bhat = bhat, nbasis = nbasis, V = V, dffun = dffun, dfargs = dfargs, misc = list()) } ``` This is much simpler. The coding formulas are obtained from `misc` (line 4) so that we don't have to re-obtain them from the object. All we have to do is determine if decoding was done (line 5); and, if so, convert the grid back to the coded scale (line 6) and recover the original `terms` attribute (line 7). The rest is borrowed directly from the `emm_basis.lm` method in **emmeans**. Note that line 13 uses one of the exported functions we described in the preceding section. Lines 15--18 use functions from the **estimability** package to handle the possibility that the model is rank-deficient. ### A demonstration {#demo} Here's a demonstration of this **rsm** support. The standard example for `rsm` fits a second-order model `CR.rs2` to a dataset organized in two blocks and with two coded predictors. ```{r results = "hide", warning = FALSE, message = FALSE} library("rsm") example("rsm") ### (output is not shown) ### ``` First, let's look at some results on the coded scale---which are the same as for an ordinary `lm` object. ```{r} emmeans(CR.rs2, ~ x1 * x2, mode = "coded", at = list(x1 = c(-1, 0, 1), x2 = c(-2, 2))) ``` Now, the coded variables `x1` and `x2` are derived from these coding formulas for predictors `Time` and `Temp`: ```{r} codings(CR.rs1) ``` Thus, for example, a coded value of `x1 = 1` corresponds to a time of 85 + 1 x 5 = 90. Here are some results working with decoded predictors. Note that the `at` list must now be given in terms of `Time` and `Temp`: ```{r} emmeans(CR.rs2, ~ Time * Temp, mode = "decoded", at = list(Time = c(80, 85, 90), Temp = c(165, 185))) ``` Since the supplied settings are the same on the decoded scale as were used on the coded scale, the EMMs are identical to those in the previous output. ## Dispatching and restrictions {#dispatch} The **emmeans** package has internal support for a number of model classes. When `recover_data()` and `emm_basis()` are dispatched, a search is made for external methods for a given class; and if found, those methods are used instead of the internal ones. However, certain restrictions apply when you aim to override an existing internal method: 1. The class name being extended must appear in the first or second position in the results of `class(object)`. That is, you may have a base class for which you provide `recover_data()` and `emm_basis()` methods, and those will also work for *direct* descendants thereof; but any class in third place or later in the inheritance is ignored. 2. Certain classes vital to the correct operation of the package, e.g., `"lm"`, `"glm"`, etc., may not be overridden. If there are no existing internal methods for the class(es) you provide methods for, there are no restrictions on them. 
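As a concrete check of restriction 1 above, you can verify where your class sits in an object's inheritance. This is a sketch, assuming a hypothetical object `obj` and a class `"foo"` for which you provide methods:
```r
pos = match("foo", class(obj))  # position of "foo" in the class vector
!is.na(pos) && pos <= 2         # TRUE if your methods can be dispatched
```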
## Exporting and registering your methods {#exporting}

To make the methods available to users of your package, the methods must be exported. R and CRAN are evolving in a way that having S3 methods in the registry is increasingly important; so it is a good idea to provide for that. The problem is that not all of your package users will have **emmeans** installed. Thus, registering the methods must be done conditionally. We provide a courtesy function `.emm_register()` to make this simple. Suppose that your package offers two model classes `foo` and `bar`, and it includes the corresponding functions `recover_data.foo`, `recover_data.bar`, `emm_basis.foo`, and `emm_basis.bar`. Then to register these methods, add or modify the `.onLoad` function in your package (traditionally saved in the source file `zzz.R`):
```r
.onLoad <- function(libname, pkgname) {
    if (requireNamespace("emmeans", quietly = TRUE))
        emmeans::.emm_register(c("foo", "bar"), pkgname)
}
```
You should also add `emmeans (>= 1.4)` and `estimability` (which is required by **emmeans**) to the `Suggests` field of your `DESCRIPTION` file.

[Back to Contents](#contents)

## Conclusions {#concl}

It is relatively simple to write appropriate methods that work with **emmeans** for model objects it does not support. I hope this vignette is helpful for understanding how. Furthermore, if you are the developer of a package that fits linear models, I encourage you to include `recover_data` and `emm_basis` methods for those classes of objects, so that users have access to **emmeans** support.

[Back to Contents](#contents)

[Index of all vignette topics](vignette-topics.html)
emmeans/vignettes/predictions.Rmd0000644000176200001440000001712314137062735016712 0ustar liggesusers---
title: "Prediction in **emmeans**"
author: "emmeans package, Version `r packageVersion('emmeans')`"
output: emmeans::.emm_vignette
vignette: >
  %\VignetteIndexEntry{Prediction in emmeans}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, echo = FALSE, results = "hide", message = FALSE}
require("emmeans")
options(show.signif.stars = FALSE, width = 100)
knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro")
```

In this vignette, we discuss **emmeans**'s rudimentary capabilities for constructing prediction intervals.

## Contents {#contents}

1. [Focus on reference grids](#ref-grid)
2. [Need for an SD estimate](#sd-estimate)
3. [Feedlot example](#feedlot)
4. [Predictions on particular strata](#strata)
5. [Predictions with Bayesian models](#bayes)

[Index of all vignette topics](vignette-topics.html)

## Focus on reference grids {#ref-grid}

Prediction is not the central purpose of the **emmeans** package. Even its name refers to the idea of obtaining marginal averages of fitted values; and it is a rare situation where one would want to make a prediction of the average of several observations. We can certainly do that if it is truly desired, but almost always, predictions should be based on the reference grid itself (i.e., *not* the result of an `emmeans()` call), inasmuch as a reference grid comprises combinations of model predictors.

## Need for an SD estimate {#sd-estimate}

A prediction interval requires an estimate of the error standard deviation, because we need to account for both the uncertainty of our point predictions and the uncertainty of outcomes centered on those estimates. By its current design, we save the value (if any) returned by `stats::sigma(object)` when a reference grid is constructed for a model `object`.
Not all models provide a `sigma()` method, in which case an error is thrown if the error SD is not manually specified. Also, in many cases, there may be a `sigma()` method, but it does not return the appropriate value(s) in the context of the needed predictions. (In an object returned by `lme4::glmer()`, for example, `sigma()` seems to always return 1.0.) Indeed, as will be seen in the example that follows, one usually needs to construct a manual SD estimate when the model is a mixed-effects model. So it is essentially always important to think very specifically about whether we are using an appropriate value. You may check the value being assumed by looking at the `misc` slot in the reference grid:
```{r eval = FALSE}
rg <- ref_grid(model)
rg@misc$sigma
```
Finally, `sigma` may be a vector, as long as it is conformable with the estimates in the reference grid. This would be appropriate, for example, with a model fitted by `nlme::gls()` with some kind of non-homogeneous error structure. It may take some effort, as well as a clear understanding of the model and its structure, to obtain suitable SD estimates. It was suggested to me that the function `insight::get_variance()` may be helpful -- especially when working with an unfamiliar model class. Personally, I prefer to make sure I understand the structure of the model object and/or its summary to ensure I am not going astray.

[Back to Contents](#contents)

## Feedlot example {#feedlot}

To illustrate, consider the `feedlot` dataset provided with the package. Here we have several herds of feeder cattle that are sent to feed lots and given one of three diets. The weights of the cattle are measured at time of entry (`ewt`) and at time of slaughter (`swt`). Different herds have possibly different entry weights, based on breed and ranching practices, so we will center each herd's `ewt` measurements, then use that as a covariate in a mixed model:
```{r, message = FALSE}
feedlot = transform(feedlot, adj.ewt = ewt - predict(lm(ewt ~ herd)))
require(lme4)
feedlot.lmer <- lmer(swt ~ adj.ewt + diet + (1|herd), data = feedlot)
feedlot.rg <- ref_grid(feedlot.lmer, at = list(adj.ewt = 0))
summary(feedlot.rg)   ## point predictions
```

Now, as advised, let's look at the SDs involved in this model:
```{r}
lme4::VarCorr(feedlot.lmer)   ## for the model
feedlot.rg@misc$sigma         ## default in the ref. grid
```
So the residual SD will be assumed in our prediction intervals if we don't specify something else. And we *do* want something else, because in order to predict the slaughter weight of an arbitrary animal, without regard to its herd, we need to account for the variation among herds too, which is seen to be considerable.

The two SDs reported by `VarCorr()` are assumed to represent independent sources of variation, so they may be combined into a total SD using the Pythagorean Theorem. We will update the reference grid with the new value:
```{r}
feedlot.rg <- update(feedlot.rg, sigma = sqrt(77.087^2 + 57.832^2))
```

We are now ready to form prediction intervals. To do so, simply call the `predict()` function with an `interval` argument:
```{r}
predict(feedlot.rg, interval = "prediction")
```
These results may also be displayed graphically:
```{r, fig.height = 2}
plot(feedlot.rg, PIs = TRUE)
```
The inner intervals are confidence intervals, and the outer ones are the prediction intervals. Note that the SEs for prediction are considerably greater than the SEs for estimation in the original summary of `feedlot.rg`.
Also, as a sanity check, observe that these prediction intervals cover about the same ground as the original data: ```{r} range(feedlot$swt) ``` By the way, we could have specified the desired `sigma` value as an additional `sigma` argument in the `predict()` call, rather than updating the `feedlot.rg` object. [Back to Contents](#contents) ## Predictions on particular strata {#strata} Suppose, in our example, we want to predict `swt` for one or more particular herds. Then the total SD we computed is not appropriate for that purpose, because that includes variation among herds. But more to the point, if we are talking about particular herds, then we are really regarding `herd` as a fixed effect of interest; so the expedient thing to do is to fit a different model where `herd` is a fixed effect: ```{r} feedlot.lm <- lm(swt ~ adj.ewt + diet + herd, data = feedlot) ``` So to predict slaughter weight for herds `9` and `19`: ```{r} newrg <- ref_grid(feedlot.lm, at = list(adj.ewt = 0, herd = c("9", "19"))) predict(newrg, interval = "prediction", by = "herd") ``` This is an instance where the default `sigma` was already correct (being the only error SD we have available). The SD value is comparable to the residual SD in the previous model, and the prediction SEs are smaller than those for predicting over all herds. [Back to Contents](#contents) ## Predictions with Bayesian models {#bayes} For models fitted using Bayesian methods, these kinds of prediction intervals are available only by forcing a frequentist analysis (`frequentist = TRUE`). However, a better and more flexible approach with Bayesian models is to simulate observations from the posterior predictive distribution. This is done via `as.mcmc()` and specifying a `likelihood` argument. An example is given in the ["sophisticated models" vignette](sophisticated.html#predict-mcmc). [Back to Contents](#contents) [Index of all vignette topics](vignette-topics.html) emmeans/vignettes/FAQs.Rmd0000644000176200001440000005154714151001674015160 0ustar liggesusers--- title: "FAQs for emmeans" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{FAQs for emmeans} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, results = "hide", message = FALSE} require("emmeans") knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") options(show.signif.stars = FALSE) ``` This vignette contains answers to questions received from users or posted on discussion boards like [Cross Validated](https://stats.stackexchange.com) and [Stack Overflow](https://stackoverflow.com/) ## Contents {#contents} 1. [What are EMMs/lsmeans?](#what) 2. [What is the fastest way to obtain EMMs and pairwise comparisons?](#fastest) 2. [I wanted comparisons, but all I get is (nothing)](#nopairs) 2. [The model I fitted is not supported by **emmeans**](#qdrg) 2. [I have three (or two or four) factors that interact](#interactions) 3. [I have covariate(s) that interact(s) with factor(s)](#trends) 3. [I have covariate(s) and am fitting a polynomial model](#polys) 3. [Some "significant" comparisons have overlapping confidence intervals](#CIerror) 4. [All my pairwise comparisons have the same *P* value](#notfactor) 5. [emmeans() doesn't work as expected](#numeric) 6. [All or some of the results are NA](#NAs) 6. [If I analyze subsets of the data separately, I get different results](#model) 7. [My lsmeans/EMMs are way off from what I expected](#transformations) 8. 
[Why do I get `Inf` for the degrees of freedom?](#asymp)
10. [I get exactly the same comparisons for each "by" group](#additive)
11. [My ANOVA *F* is significant, but no pairwise comparisons are](#anova)
12. [I asked for a Tukey adjustment, but that's not what I got](#notukey)
12. [`emmeans()` completely ignores my P-value adjustments](#noadjust)
13. [`emmeans()` gives me pooled *t* tests, but I expected Welch's *t*](#nowelch)

[Index of all vignette topics](vignette-topics.html)

## What are EMMs/lsmeans? {#what}

Estimated marginal means (EMMs), a.k.a. least-squares means, are predictions on a reference grid of predictor settings, or marginal averages thereof. See details in [the "basics" vignette](basics.html).

## What is the fastest way to obtain EMMs and pairwise comparisons? {#fastest}

There are two answers to this (i.e., be careful what you wish for):

1. Don't think; just fit the first model that comes to mind and run `emmeans(model, pairwise ~ treatment)`. This is the fastest way; however, the results have a good chance of being invalid.
2. *Do* think: Make sure you fit a model that really explains the responses. Do diagnostic residual plots, include appropriate interactions, account for heteroscedasticity if necessary, etc. This is the fastest way to obtain *appropriate* estimates and comparisons.

The point here is that `emmeans()` summarizes the *model*, not the data directly. If you use a bad model, you will get bad results. And if you use a good model, you will get appropriate results. It's up to you: it's your research---is it important?

[Back to Contents](#contents)

## I wanted comparisons, but all I get is (nothing) {#nopairs}

This happens when you have only one estimate; and you can't compare it with itself! This in turn can happen when you have a situation like this: you have fitted
```
mod <- lm(RT ~ treat, data = mydata)
```
and `treat` is coded in your dataset with numbers 1, 2, 3, ... . Since `treat` is a numeric predictor, `emmeans()` just reduces it to a single number, its mean, rather than separate values for each treatment. Also, please note that this is almost certainly NOT the model you want, because it forces an assumption that the treatment effects all fall on a straight line. You should fit a model like
```
mod <- lm(RT ~ factor(treat), data = mydata)
```
then you will have much better luck with comparisons.

## The model I fitted is not supported by **emmeans** {#qdrg}

You may still be able to get results using `qdrg()` (quick and dirty reference grid). See `?qdrg` for details and examples.

## I have three (or two or four) factors that interact {#interactions}

Perhaps your question has to do with interacting factors, and you want to do some kind of *post hoc* analysis comparing levels of one (or more) of the factors on the response. Some specific versions of this question...

* Perhaps you tried to do a simple comparison for one treatment and got a warning message you don't understand
* You do pairwise comparisons of factor combinations and it's just too much -- want just some of them
* How do I even approach this?

My first answer is: plots almost always help. If you have factors A, B, and C, try something like `emmip(model, A ~ B | C)`, which creates an interaction-style plot of the predictions against B, for each A, with separate panels for each C. This will help visualize what effects stand out in a practical way. This can guide you in what post-hoc tests would make sense. See the ["interactions" vignette](interactions.html) for more discussion and examples.
[Back to Contents](#contents)

## I have covariate(s) that interact(s) with factor(s) {#trends}

This is a situation where it may well be appropriate to compare the slopes of trend lines, rather than the EMMs. See `help("emtrends")` and the discussion of this topic in [the "interactions" vignette](interactions.html#covariates)

## I have covariate(s) and am fitting a polynomial model {#polys}

You need to be careful to define the reference grid consistently. For example, if you use covariates `x` and `xsq` (equal to `x^2`) to fit a quadratic curve, the default reference grid uses the mean of each covariate -- and `mean(xsq)` is usually not the same as `mean(x)^2`. So you need to use `at` to ensure that the covariates are set consistently with respect to the model. See [this subsection of the "basics" vignette](basics.html#depcovs) for an example.

## Some "significant" comparisons have overlapping confidence intervals {#CIerror}

That can happen because *it is just plain wrong to use [non-]overlapping CIs for individual means to do comparisons*. Look at the printed results from something like `emmeans(mymodel, pairwise ~ treatment)`. In particular, note that the `SE` values are *not* the same, and may even have different degrees of freedom. Means are one thing statistically, and differences of means are quite another thing. Don't ever mix them up, and don't ever use a CI display for comparing means.

I'll add that making hard-line decisions about "significant" and "non-significant" is in itself a poor practice. See [the discussion in the "basics" vignette](basics.html#pvalues)

## All my pairwise comparisons have the same *P* value {#notfactor}

This will happen if you fitted a model where the treatments you want to compare were put in as a numeric predictor; for example `dose`, with values of 1, 2, and 3. If `dose` is modeled as numeric, you will be fitting a linear trend in those dose values, rather than a model that allows those doses to differ in arbitrary ways. Go back and fit a different model using `factor(dose)` instead; it will make all the difference. This is closely related to the next topic.

## emmeans() doesn't work as expected {#numeric}

Equivalently, users ask how to get *post hoc* comparisons when we have covariates rather than factors. Yes, it does work, but you have to tell it the appropriate reference grid.

But before saying more, I have a question for you: *Are you sure your model is meaningful?*

* If your question concerns *only* two-level predictors such as `sex` (coded 1 for female, 2 for male), no problem. The model will produce the same predictions as you'd get if you'd used these as factors.
* If *any* of the predictors has 3 or more levels, you may have fitted a nonsense model, in which case you need to fit a different model that does make sense before doing any kind of *post hoc* analysis. For instance, if the model contains a covariate `brand` (coded 1 for Acme, 2 for Ajax, and 3 for Al's), it is implying that the difference between Acme and Ajax is exactly equal to the difference between Ajax and Al's, owing to the fact that a linear trend in `brand` has been fitted. If you had instead coded 1 for Ajax, 2 for Al's, and 3 for Acme, the model would produce different fitted values. Ask yourself if it makes sense to have `brand = 2.319`. If not, you need to fit another model using `factor(brand)` in place of `brand`, as in the sketch below.
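Here is what that distinction looks like in code (a sketch, with a hypothetical data frame `mydata`):
```r
bad.mod <- lm(y ~ brand, data = mydata)           # fits a linear trend in 1, 2, 3
good.mod <- lm(y ~ factor(brand), data = mydata)  # fits a separate mean per brand
```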
Assuming that the appropriateness of the model is settled, the current version of **emmeans** automatically casts two-value covariates as factors, but not covariates having higher numbers of unique values. Suppose your model has a covariate `dose` which was experimentally varied over four levels, but can sensibly be interpreted as a numerical predictor. If you want to include the separate values of `dose` rather than the mean `dose`, you can do that using something like `emmeans(model, "dose", at = list(dose = 1:4))`, or `emmeans(model, "dose", cov.keep = "dose")`, or `emmeans(model, "dose", cov.keep = "4")`. There are small differences between these. The last one regards any covariate having 4 or fewer unique values as a factor. See "altering the reference grid" in the ["basics" vignette](basics.html#altering) for more discussion. [Back to Contents](#contents) ## All or some of the results are NA {#NAs} The **emmeans** package uses tools in the **estimability** package to determine whether its results are uniquely estimable. For example, in a two-way model with interactions included, if there are no observations in a particular cell (factor combination), then we cannot estimate the mean of that cell. When *some* of the EMMs are estimable and others are not, that is information about missing information in the data. If it's possible to remove some terms from the model (particularly interactions), that may make more things estimable if you re-fit with those terms excluded; but don't delete terms that are really needed for the model to fit well. When *all* of the estimates are non-estimable, it could be symptomatic of something else. Some possibilities include: * An overly ambitious model; for example, in a Latin square design, interaction effects are confounded with main effects; so if any interactions are included in the model, you will render main effects inestimable. * Possibly you have a nested structure that needs to be included in the model or specified via the `nesting` argument. Perhaps the levels that B can have depend on which level of A is in force. Then B is nested in A and the model should specify `A + A:B`, with no main effect for `B`. * Modeling factors as numeric predictors (see also the [related section on covariates](#numeric)). To illustrate, suppose you have data on particular state legislatures, and the model includes the predictors `state_name` as well as `dem_gov` which is coded 1 if the governor is a Democrat and 0 otherwise. If the model was fitted with `state_name` as a factor or character variable, but `dem_gov` as a numeric predictor, then, chances are, `emmeans()` will return non-estimable results. If instead, you use `factor(dem_gov)` in the model, then the fact that `state_name` is nested in `dem_gov` will be detected, causing EMMs to be computed separately for each party's states, thus making things estimable. * Some other things may in fact be estimable. For illustration, it's easy to construct an example where all the EMMs are non-estimable, but pairwise comparisons are estimable: ```{r} pg <- transform(pigs, x = rep(1:3, c(10, 10, 9))) pg.lm <- lm(log(conc) ~ x + source + factor(percent), data = pg) emmeans(pg.lm, consec ~ percent) ``` The ["messy-data" vignette](messy-data.html) has more examples and discussion. [Back to Contents](#contents) ## If I analyze subsets of the data separately, I get different results {#model} Estimated marginal means summarize the *model* that you fitted to the data -- not the data themselves. 
Many of the most common models rely on several simplifying assumptions -- that certain effects are linear, that the error variance is constant, etc. -- and those assumptions are passed forward into the `emmeans()` results. Doing separate analyses on subsets usually amounts to departing from that overall model, so of course the results are different.

## My lsmeans/EMMs are way off from what I expected {#transformations}

First step: Carefully read the annotations below the output. Do they say something like "results are on the log scale, not the response scale"? If so, that explains it. A Poisson or logistic model involves a link function, and by default, `emmeans()` produces its results on that same scale. You can add `type = "response"` to the `emmeans()` call and it will put the results on the scale you expect. But that is not always the best approach. The ["transformations" vignette](transformations.html) has examples and discussion.

## Why do I get `Inf` for the degrees of freedom? {#asymp}

This is simply the way that **emmeans** labels asymptotic results (that is, estimates that are tested against the standard normal distribution -- *z* tests -- rather than the *t* distribution). Note that obtaining quantiles or probabilities from the *t* distribution with infinite degrees of freedom is the same as obtaining the corresponding values from the standard normal. For example:
```{r}
qt(c(.9, .95, .975), df = Inf)
qnorm(c(.9, .95, .975))
```
so when you see infinite d.f., that just means it's a *z* test or a *z* confidence interval.

[Back to Contents](#contents)

## I get exactly the same comparisons for each "by" group {#additive}

As mentioned elsewhere, EMMs summarize a *model*, not the data. If your model does not include any interactions between the `by` variables and the factors for which you want EMMs, then by definition, the effects for the latter will be exactly the same regardless of the `by` variable settings. So of course the comparisons will all be the same. If you think they should be different, then you are saying that your model should include interactions between the factors of interest and the `by` factors.

## My ANOVA *F* is significant, but no pairwise comparisons are {#anova}

First of all, you should not be making binary decisions of "significant" or "nonsignificant." This is a simplistic view of *P* values that assigns an unmerited magical quality to the value 0.05. It is suggested that you just report the *P* values actually obtained, and let your readers decide how significant your findings are in the context of the scientific findings.

But to answer the question: This is a common misunderstanding of ANOVA. If *F* has a particular *P* value, this implies only that *some contrast* among the means (or effects) has the same *P* value, after applying the Scheffe adjustment. That contrast may be very much unlike a pairwise comparison, especially when there are several means being compared. Having an *F* statistic with a *P* value of, say, 0.06, does *not* imply that any pairwise comparison will have a *P* value of 0.06 or smaller. Again referring to the paragraph above, just report the *P* value for each pairwise comparison, and don't try to relate them to the *F* statistic.

Another consideration is that by default, *P* values for pairwise comparisons are adjusted using the Tukey method, and the adjusted *P* values can be quite a bit larger than the unadjusted ones. (But I definitely do *not* advocate using no adjustment to "repair" this problem.)
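To see the effect of the adjustment, you can compare adjusted and unadjusted results side by side (a sketch, assuming a hypothetical model `mod` with a factor `treat` having several levels):
```r
EMM <- emmeans(mod, "treat")
pairs(EMM)                   # Tukey-adjusted P values (the default)
pairs(EMM, adjust = "none")  # unadjusted P values -- typically smaller
```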
## I asked for Tukey adjustments, but that's not what I got {#notukey} There are two reasons this could happen: 1. There is only one comparison in each `by` group (see next topic). 2. A Tukey adjustment is inappropriate. The Tukey adjustment is appropriate for pairwise comparisons of means. When you have some other set of contrasts, the Tukey method is deemed unsuitable and the Sidak method is used instead. A suggestion is to use `"mvt"` adjustment (which is exact); we don't default to this because it can require a lot of computing time for a large set of contrasts or comparisons. ## `emmeans()` completely ignores my P-value adjustments {#noadjust} This happens when there are only two means (or only two in each `by` group). Thus there is only one comparison. When there is only one thing to test, there is no multiplicity issue, and hence no multiplicity adjustment to the *P* values. If you wish to apply a *P*-value adjustment to all tests across all groups, you need to null-out the `by` variable and summarize, as in the following: ```r EMM <- emmeans(model, ~ treat | group) # where treat has 2 levels pairs(EMM, adjust = "sidak") # adjustment is ignored - only 1 test per group summary(pairs(EMM), by = NULL, adjust = "sidak") # all are in one group now ``` Note that if you put `by = NULL` *inside* the call to `pairs()`, then this causes all `treat`,`group` combinations to be compared. [Back to Contents](#contents) ## `emmeans()` gives me pooled *t* tests, but I expected Welch's *t* {#nowelch} It is important to note that `emmeans()` and its relatives produce results based on the *model object* that you provide -- not the data. So if your sample SDs are wildly different, a model fitted using `lm()` or `aov()` is not a good model, because those R functions use a statistical model that presumes that the errors have constant variance. That is, the problem isn't in `emmeans()`, it's in handing it an inadequate model object. Here is a simple illustrative example. Consider a simple one-way experiment and the following model: ``` mod1 <- aov(response ~ treat, data = mydata) emmeans(mod1, pairwise ~ treat) ``` This code will estimate means and comparisons among treatments. All standard errors, confidence intervals, and *t* statistics are based on the pooled residual SD with *N - k* degrees of freedom (assuming *N* observations and *k* treatments). These results are useful *only* if the underlying assumptions of `mod1` are correct -- including the assumption that the error SD is the same for all treatments. Alternatively, you could fit the following model using generalized least-squares: ``` mod2 = nlme::gls(response ~ treat, data = mydata, weights = varIdent(form = ~1 | treat)) emmeans(mod2, pairwise ~ treat) ``` This model specifies that the error variance depends on the levels of `treat`. This would be a much better model to use when you have wildly different sample SDs. The results of the `emmeans()` call will reflect this improvement in the modeling. The standard errors of the EMMs will depend on the individual sample variances, and the *t* tests of the comparisons will be in essence the Welch *t* statistics with Satterthwaite degrees of freedom. To obtain appropriate *post hoc* estimates, contrasts, and comparisons, one must first find a model that successfully explains the peculiarities in the data. This point cannot be emphasized enough. If you give `emmeans()` a good model, you will obtain correct results; if you give it a bad model, you will obtain incorrect results. 
Get the model right *first*. [Back to Contents](#contents) [Index of all vignette topics](vignette-topics.html) emmeans/vignettes/xplanations.Rmd0000644000176200001440000003421414137062735016727 0ustar liggesusers--- title: "Explanations supplement" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{Explanations supplement} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, results = "hide", message = FALSE} require("emmeans") knitr::opts_chunk$set(fig.width = 4.5, fig.height = 2.0, class.output = "ro", class.message = "re", class.error = "re", class.warning = "re") ###knitr::opts_chunk$set(fig.width = 4.5, fig.height = 2.0) ``` This vignette provides additional documentation for some methods implemented in the **emmeans** package. [Index of all vignette topics](vignette-topics.html) ## Contents {#contents} 1. [Sub-models](#submodels) 2. [Comparison arrows](#arrows) ## Sub-models {#submodels} Estimated marginal means (EMMs) and other statistics computed by the **emmeans** package are *model-based*: they depend on the model that has been fitted to the data. In this section we discuss a provision whereby a different underlying model may be considered. The `submodel` option in `update()` can project EMMs and other statistics to an alternative universe where a simpler version of the model has been fitted to the data. Another way of looking at this is that it constrains certain external effects to be zero -- as opposed to averaging over them as is otherwise done for marginal means. Two things to know before getting into details: 1. The `submodel` option uses information from the fixed-effects portion of the model matrix 2. Not all model classes are supported for the `submodel` option. Now some details. Suppose that we have a fixed-effects model matrix $X$, and let $X_1$ denote a sub-matrix of $X$ whose columns correspond to a specified sub-model. (Note: if there are weights, use $X = W^{1/2}X^*$, where $X^*$ is the model matrix without the weights incorporated.) The trick we use is what is called the *alias matrix*: $A = (X_1'X_1)^-X_1'X$ where $Z^-$ denotes a generalized inverse of $Z$. It can be shown that $(X_1'X_1)^-X_1' = A(X'X)^-X'$; thus, in an ordinary fixed-effects regression model, $b_1 = Ab$ where $b_1$ and $b$ denote the regression coefficients for the sub-model and full model, respectively. Thus, given a matrix $L$ such that $Lb$ provides estimates of interest for the full model, the corresponding estimates for the sub-model are $L_1b_1$, where $L_1$ is the sub-matrix of $L$ consisting of the columns corresponding to the columns of $X_1$. Moreover, $L_1b_1 = L_1(Ab) = (L_1A)b$; that is, we can replace $L$ by $L_1A$ to obtain estimates from the sub-model. That's all that `update(..., submodel = ...)` does. Here are some intuitive observations: 1. Consider the excluded effects, $X_2$, consisting of the columns of $X$ other than $X_1$. The corresponding columns of the alias matrix are regression coefficients treating $X_2$ as the response and $X_1$ as the predictors. 2. Thus, when we obtain predictions via these aliases, we are predicting the effects of $X_2$ based on $X_1$. 3. The columns of the new linear predictor $\tilde L = L_1A$ depend only on the columns of $L_1$, and hence not on other columns of $L$. These three points provide three ways of saying nearly the same thing, namely that we are excluding the effects in $X_2$. 
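Here is a tiny numerical illustration of these relationships -- a sketch (not part of the package) using ordinary regression on the `mtcars` data, where the projection and re-fitting agree exactly:
```r
X  <- model.matrix(~ factor(cyl) * am, data = mtcars)  # full model matrix
X1 <- model.matrix(~ factor(cyl) + am, data = mtcars)  # sub-model columns
b  <- coef(lm(mpg ~ factor(cyl) * am, data = mtcars))  # full-model coefficients
b1 <- coef(lm(mpg ~ factor(cyl) + am, data = mtcars))  # sub-model coefficients
A  <- solve(t(X1) %*% X1, t(X1) %*% X)                 # alias matrix
max(abs(A %*% b - b1))                                 # essentially zero
```
This confirms numerically that $b_1 = Ab$.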
Note that in a rank-deficient situation, there are different possible generalized inverses, and so in (1), $A$ is not unique. However, the predictions in (2) are unique. In ordinary regression models, (1), (2), and (3) all apply and will be the same as predictions from re-fitting the model with model matrix $X_1$; however, in generalized linear models, mixed models, etc., re-fitting will likely produce somewhat different results. That is because fitting such models involves iterative weighting, and the re-fitted models will probably not have the same weights. However, point (3) will still hold: the predictions obtained with a submodel will involve only the columns of $L_1$ and hence constrain all effects outside of the sub-model to be zero. Therefore, when it really matters to get the correct estimates from the stated sub-model, the user should actually fit that sub-model unless the full model is an ordinary linear regression.

A technicality: Most writers define the alias matrix as $(X_1'X_1)^-X_1'X_2$, where $X_2$ denotes that part of $X$ that excludes the columns of $X_1$. We are including all columns of $X$ here just because it makes the notation very simple; the $X_1$ portion of $X$ just reduces to the identity (at least in the case where $X_1$ is full-rank).

A word on computation: Like many matrix expressions, we do not compute $A$ directly as shown. Instead, we use the QR decomposition of $X_1$, obtainable via the R call `Z <- qr(X1)`. Then the alias matrix is computed via `A <- qr.coef(Z, X)` (a code sketch appears at the end of this discussion). In fact, nothing changes if we use just the $R$ portion of $X = QR$, saving us both memory and computational effort. The exported function `.cmpMM()` extracts this $R$ matrix, taking care of any pivoting that might have occurred. And in an `lm` object, the QR decomposition of $X$ is already saved as a slot. The `qr.coef()` function works just fine in both the full-rank and rank-deficient cases, but in the latter situation, some elements of `A` will be `NA`; those correspond to "excluded" predictors, but that is another way of saying that we are constraining their regression coefficients to be zero. Thus, we can easily clean that up via `A[is.na(A)] <- 0`.

If we specify `submodel = "minimal"`, the software figures out the sub-model by extracting terms involving only factors that have not already been averaged over. If the user specifies `submodel = "type2"`, an additional step is performed: Let $X_1^*$ have only the highest-order effect in the minimal model, and let $X_0$ denote the matrix of all columns of $X$ for effects that do not contain the effect in $X_1^*$. We then replace $Z$ by the QR decomposition of $[I - X_0(X_0'X_0)^-X_0']X_1^*$. This projects $X_1^*$ onto the null space of $X_0$. The net result is that we obtain estimates of just the $X_1^*$ effects, after adjusting for all effects that don't contain it (including the intercept if present). Such estimates have very limited use in data description, but provide a kind of "Type II" analysis when used in conjunction with `joint_tests()`. The `"type2"` calculations parallel those [documented by SAS](https://documentation.sas.com/?docsetId=statug&docsetTarget=statug_introglmest_sect011.htm&docsetVersion=14.3&locale=en) for obtaining type II estimable functions in SAS `PROC GLM`. However, we (as well as `car::Anova()`) define "contained" effects differently from SAS, treating covariates no differently than factors.
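Putting those computational remarks together, the basic computation might look like the following sketch (the objects `X`, `X1`, and `L` are hypothetical here, and the actual internal code differs in its details):
```r
Z <- qr(X1)                            # QR decomposition of the sub-model matrix
A <- qr.coef(Z, X)                     # alias matrix; NAs mark excluded predictors
A[is.na(A)] <- 0                       # constrain their coefficients to zero
L1 <- L[, colnames(X1), drop = FALSE]  # columns of L corresponding to X1
L.sub <- L1 %*% A                      # use in place of L for sub-model estimates
```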
### A note on multivariate models {#mult.submodel}

Recall that **emmeans** generates a constructed factor for the levels of a multivariate response. That factor (or factors) is completely ignored in any sub-model calculations. The $X$ and $X_1$ matrices described above involve only the predictors in the right-hand side of the model equation. The multivariate response "factor" implicitly interacts with everything in the right-hand-side model; and the same is true of any sub-model. So it is not possible to consider sub-models where terms are omitted from among those multivariate interactions (note that it is also impossible to fit a multivariate sub-model that excludes those interactions). The only way to remove consideration of multivariate effects is to average over them via a call to `emmeans()`.

[Back to Contents](#contents)

## Comparison arrows {#arrows}

The `plot()` method for `emmGrid` objects offers the option `comparisons = TRUE`. If used, the software attempts to construct "comparison arrows" whereby two estimated marginal means (EMMs) differ significantly if, and only if, their respective comparison arrows do not overlap. In this section, we explain how these arrows are obtained.

First, please understand that these comparison arrows are decidedly *not* the same as confidence intervals. Confidence intervals for EMMs are based on the statistical properties of the individual EMMs, whereas comparison arrows are based on the statistical properties of *differences* of EMMs.

Let the EMMs be denoted $m_1, m_2, ..., m_k$. For simplicity, let us assume that these are ordered: $m_1 \le m_2 \le \cdots \le m_k$. Let $d_{ij} = m_j - m_i$ denote the difference between the $i$th and $j$th EMM. Then the $(1 - \alpha)$ confidence interval for the true difference $\delta_{ij} = \mu_j - \mu_i$ is
$$ d_{ij} - e_{ij}\quad\mbox{to}\quad d_{ij} + e_{ij} $$
where $e_{ij}$ is the "margin of error" for the difference; i.e., $e_{ij} = t\cdot SE(d_{ij})$ for some critical value $t$ (equal to $t_{\alpha/2}$ when no multiplicity adjustment is used). Note that $d_{ij}$ is statistically significant if, and only if, $d_{ij} > e_{ij}$.

Now, how to get the comparison arrows? These arrows are plotted with origins at the $m_i$; we have an arrow of length $L_i$ pointing to the left, and an arrow of length $R_i$ pointing to the right. To compare EMMs $m_i$ and $m_j$ (and remembering that we are supposing that $m_i \le m_j$), we propose to look to see if the arrows extending right from $m_i$ and left from $m_j$ overlap or not. So, ideally, if we want overlap to be identified with statistical non-significance, we want
$$ R_i + L_j = e_{ij} \quad\mbox{for all } i < j $$
If we can do that, then the two arrows will overlap if, and only if, $d_{ij} < e_{ij}$.

This is easy to accomplish if all the $e_{ij}$ are equal: just set all $L_i = R_j = \frac12e_{12}$. But with differing $e_{ij}$ values, it may or may not even be possible to obtain suitable arrow lengths.

The code in **emmeans** uses an *ad hoc* weighted regression method to solve the above equations. We give greater weights to cases where $d_{ij}$ is close to $e_{ij}$, because those are the cases where it is more critical that we get the lengths of the arrows right. Once the regression equations are solved, we test to make sure that $R_i + L_j < d_{ij}$ when the difference is significant, and $\ge d_{ij}$ when it is not. If one or more of those checks fails, a warning is issued.

That's the essence of the algorithm.
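To make the idea concrete, here is a greatly simplified sketch of such a weighted-regression solution (hypothetical code added for illustration; the actual implementation is more careful than this). Given the ordered EMMs `m` and a $k \times k$ matrix `e` of the margins of error $e_{ij}$, we set up one equation $R_i + L_j = e_{ij}$ per pair and solve by weighted least squares:
```r
k <- length(m)
rows <- NULL; rhs <- wts <- numeric(0)
for (i in 1:(k - 1)) for (j in (i + 1):k) {
    z <- numeric(2 * k)       # unknowns, in the order R_1..R_k, L_1..L_k
    z[i] <- z[k + j] <- 1     # equation R_i + L_j = e_ij
    rows <- rbind(rows, z)
    rhs <- c(rhs, e[i, j])
    d <- m[j] - m[i]          # weight more heavily when d_ij is close to e_ij
    wts <- c(wts, 1 / (abs(d - e[i, j]) + 1e-6))
}
# the columns for L_1 and R_k are all zero, so those lengths come back NA --
# consistent with the fact that they are never needed (see below)
arrows <- lm.wfit(rows, rhs, wts)$coefficients
```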
Note, however, that there are a few complications that need to be handled:

* For the lowest EMM $m_1$, $L_1$ is completely arbitrary because there are no right-pointing arrows with which to compare it; in fact, we don't even need to display that arrow. The same is true of $R_k$ for the largest EMM $m_k$. Moreover, there could be additional unneeded arrows when other $m_i$ are equal to $m_1$ or $m_k$.
* Depending on the number $k$ of EMMs and the number of tied minima and maxima, the system of equations could be under-determined, over-determined, or just right.
* It is possible that the solution could result in some $L_i$ or $R_j$ being negative. That would result in an error.

In summary, the algorithm does not always work (in fact it is possible to construct cases where no solution is possible). But we try to do the best we can. The main reason for trying to do this is to discourage people from ever using confidence intervals for the $m_i$ as a means of testing the comparisons $d_{ij}$; that is almost always incorrect. Better yet, simply avoid using comparison arrows altogether and use `pwpp()` or `pwpm()` to display the *P* values directly.

### Examples and tests

Here is a constructed example with specified means and somewhat unequal SEs:
```{r, message = FALSE}
m = c(6.1, 4.5, 5.4, 6.3, 5.5, 6.7)
se2 = c(.3, .4, .37, .41, .23, .48)^2
lev = list(A = c("a1","a2","a3"), B = c("b1", "b2"))
foo = emmobj(m, diag(se2), levels = lev, linfct = diag(6))
plot(foo, CIs = FALSE, comparisons = TRUE)
```
This came out pretty well. But now let's keep the means and SEs the same but make them correlated. Such correlations happen, for example, in designs with subject effects. The function below is used to set a specified intra-class correlation, treating `A` as a within-subjects (or split-plot) factor and `B` as a between-subjects (whole-plot) factor. We'll start with a correlation of 0.3.
```{r, message = FALSE}
mkmat <- function(V, rho = 0, indexes = list(1:3, 4:6)) {
    sd = sqrt(diag(V))
    for (i in indexes)
        V[i,i] = (1 - rho)*diag(sd[i]^2) + rho*outer(sd[i], sd[i])
    V
}
# Intraclass correlation = 0.3
foo3 = foo
foo3@V <- mkmat(foo3@V, 0.3)
plot(foo3, CIs = FALSE, comparisons = TRUE)
```
Same with intraclass correlation of 0.6:
```{r, message = FALSE}
foo6 = foo
foo6@V <- mkmat(foo6@V, 0.6)
plot(foo6, CIs = FALSE, comparisons = TRUE)
```
Now we have a warning that some arrows don't overlap, but should. We can make it even worse by upping the correlation to 0.8:
```{r, message = FALSE, error = TRUE}
foo8 = foo
foo8@V <- mkmat(foo8@V, 0.8)
plot(foo8, CIs = FALSE, comparisons = TRUE)
```
Now the solution actually leads to negative arrow lengths. What is happening here is that we are continually reducing the SE of within-`B` comparisons while keeping the others the same. These all work out if we use `B` as a `by` variable:
```{r, message = FALSE}
plot(foo8, CIs = FALSE, comparisons = TRUE, by = "B")
```
Note that the lengths of the comparison arrows are relatively equal within the levels of `B`.
Or, we can use `pwpp()` or `pwpm()` to show the *P* values for all comparisons among the six means:
```{r}
pwpp(foo6, sort = FALSE)
pwpm(foo6)
```

[Back to Contents](#contents)

[Index of all vignette topics](vignette-topics.html)emmeans/vignettes/messy-data.Rmd0000644000176200001440000005563414137062735016443 0ustar liggesusers---
title: "Working with messy data"
author: "emmeans package, Version `r packageVersion('emmeans')`"
output: emmeans::.emm_vignette
vignette: >
  %\VignetteIndexEntry{Working with messy data}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, echo = FALSE, results = "hide", message = FALSE}
require("emmeans")
require("ggplot2")
options(show.signif.stars = FALSE)
knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro")
```

## Contents {#contents}

1. [Issues with observational data](#issues)
2. [Mediating covariates](#mediators)
3. [Mediating factors and weights](#weights)
3. [Nuisance factors](#nuisance)
4. [Sub-models](#submodels)
5. [Nested fixed effects](#nesting)
    a. [Avoiding mis-identified nesting](#nest-trap)

[Index of all vignette topics](vignette-topics.html)

## Issues with observational data {#issues}

In experiments, we control the conditions under which observations are made. Ideally, this leads to balanced datasets and clear inferences about the effects of those experimental conditions. In observational data, factor levels are observed rather than controlled, and in the analysis we control *for* those factors and covariates. It is possible that some factors and covariates lie in the causal path for other predictors. Observational studies can be designed in ways to mitigate some of these issues; but often we are left with a mess. EMMs do not solve the inherent problems in messy, undesigned studies; but they do give us ways to compensate for imbalance in the data, and allow us to estimate meaningful effects after carefully considering the ways in which they can be confounded.

###### {#nutrex}
As an illustration, consider the `nutrition` dataset provided with the package. These data are used as an example in Milliken and Johnson (1992), *Analysis of Messy Data*, and contain the results of an observational study on nutrition education. Low-income mothers are classified by race, age category, and whether or not they received food stamps (the `group` factor); and the response variable is a gain score (post minus pre scores) after completing a nutrition training program.

First, let's fit a model that includes all main effects and 2-way interactions, and obtain its "type II" ANOVA:
```{r}
nutr.lm <- lm(gain ~ (age + group + race)^2, data = nutrition)
car::Anova(nutr.lm)
```
There is definitely a `group` effect and a hint of an interaction with `race`. Here are the EMMs for those two factors, along with their counts:
```{r}
emmeans(nutr.lm, ~ group * race, calc = c(n = ".wgt."))
```

###### {#nonestex}
Hmmmm. The EMMs when `race` is "Hispanic" are not given; instead they are flagged as non-estimable. What does that mean? Well, when using a model to make predictions, it is impossible to do that beyond the linear space of the data used to fit the model. And we have no data for three of the age groups in the Hispanic population:
```{r}
with(nutrition, table(race, age))
```
We can't make predictions for all the cases we are averaging over in the above EMMs, and that is why some of them are non-estimable. The bottom line is that we simply cannot include Hispanics in the mix when comparing factor effects.
That's a limitation of this study that cannot be overcome without collecting additional data. Our choices for further analysis are to focus only on Black and White populations; or to focus only on age group 3. For example (the latter):
```{r}
summary(emmeans(nutr.lm, pairwise ~ group | race, at = list(age = "3")),
        by = NULL)
```
(We used trickery with providing a `by` variable, and then taking it away, to make the output more compact.) Evidently, the training program has been beneficial to the Black and White groups in that age category. There is no conclusion for the Hispanic group -- for which we have very little data.

[Back to Contents](#contents)

## Mediating covariates {#mediators}

The `framing` data in the **mediation** package has the results of an experiment conducted by Brader et al. (2008) where subjects were given the opportunity to send a message to Congress regarding immigration. However, before being offered this, some subjects (`treat = 1`) were first shown a news story that portrays Latinos in a negative way. Besides the binary response (whether or not they elected to send a message), the experimenters also measured `emo`, the subjects' emotional state after the treatment was applied. There are various demographic variables as well. Let's fit a logistic regression model, after changing the labels for `educ` to shorter strings.
```{r}
framing <- mediation::framing
levels(framing$educ) <- c("NA","Ref","< HS", "HS", "> HS","Coll +")
framing.glm <- glm(cong_mesg ~ age + income + educ + emo + gender * factor(treat),
                   family = binomial, data = framing)
```
The conventional way to handle covariates like `emo` is to set them at their means and use those means for purposes of predictions and EMMs. These adjusted means are shown in the following plot.
```{r}
emmip(framing.glm, treat ~ educ | gender, type = "response")
```
This plot gives the impression that the effect of `treat` is reversed between male and female subjects; and also that the effect of education is not monotone. Both of these are counter-intuitive.

###### {#med.covred}
However, note that the covariate `emo` is measured *post*-treatment. That suggests that in fact `treat` (and perhaps other factors) could affect the value of `emo`; and if that is true (as is in fact established by mediation analysis techniques), we should not pretend that `emo` can be set independently of `treat` as was done to obtain the EMMs shown above. Instead, let `emo` depend on `treat` and the other predictors -- easily done using `cov.reduce` -- and we obtain an entirely different impression:
```{r}
emmip(framing.glm, treat ~ educ | gender, type = "response",
      cov.reduce = emo ~ treat*gender + age + educ + income)
```
The reference grid underlying this plot has different `emo` values for each factor combination. The plot suggests that, after taking emotional response into account, male (but not female) subjects exposed to the negative news story are more likely to send the message than are females or those not seeing the negative news story. Also, the effect of `educ` is now nearly monotone.

###### {#adjcov}
By the way, the results in this plot are the same as what you would obtain by refitting the model with an adjusted covariate
```{r eval = FALSE}
emo.adj <- resid(lm(emo ~ treat*gender + age + educ + income, data = framing))
```
... and then using ordinary covariate-adjusted means at the means of `emo.adj`. This is a technique that is often recommended.
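To spell out that often-recommended technique (a sketch added here for illustration; `emo.adj` and `framing.adj` are names introduced for this purpose), one might refit with the adjusted covariate and plot again -- the result should reproduce the `cov.reduce` plot above:
```r
framing$emo.adj <- resid(lm(emo ~ treat*gender + age + educ + income,
                            data = framing))
framing.adj <- glm(cong_mesg ~ age + income + educ + emo.adj + gender * factor(treat),
                   family = binomial, data = framing)
emmip(framing.adj, treat ~ educ | gender, type = "response")
```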
If there is more than one mediating covariate, their settings may be defined in sequence; for example, if `x1`, `x2`, and `x3` are all mediating covariates, we might use
```{r eval = FALSE}
emmeans(..., cov.reduce = list(x1 ~ trt, x2 ~ trt + x1, x3 ~ trt + x1 + x2))
```
(or possibly with some interactions included as well).

[Back to Contents](#contents)

## Mediating factors and weights {#weights}

A mediating covariate is one that is in the causal path; likewise, it is possible to have a mediating *factor*. For mediating factors, the moral equivalent of the `cov.reduce` technique described above is to use *weighted* averages in lieu of equally-weighted ones in computing EMMs. The weights used in these averages should depend on the frequencies of the mediating factor(s). Usually, the `"cells"` weighting scheme described later in this section is the right approach. In complex situations, it may be necessary to compute EMMs in stages.

As described in [the "basics" vignette](basics.html#emmeans), EMMs are usually defined as *equally-weighted* means of reference-grid predictions. However, there are several built-in alternative weighting schemes that are available by specifying a character value for `weights` in a call to `emmeans()` or a related function. The options are `"equal"` (the default), `"proportional"`, `"outer"`, `"cells"`, and `"flat"`.

The `"proportional"` (or `"prop"` for short) method weights proportionally to the frequencies (or model weights) of each factor combination that is averaged over. The `"outer"` method uses the outer product of the marginal frequencies of each factor that is being averaged over. To explain the distinction, suppose the EMMs for `A` involve averaging over two factors `B` and `C`. With `"prop"`, we use the frequencies for each combination of `B` and `C`; whereas for `"outer"`, we first obtain the marginal frequencies for `B` and for `C` and weight proportionally to the product of these for each combination of `B` and `C`. The latter weights are like the "expected" counts used in a chi-square test for independence. Put another way, outer weighting is the same as proportional weighting applied one factor at a time; the following two would yield the same results:
```{r eval = FALSE}
emmeans(model, "A", weights = "outer")
emmeans(emmeans(model, c("A", "B"), weights = "prop"), "A", weights = "prop")
```
Using `"cells"` weights gives each prediction the same weight as occurs in the model; applied to a reference grid for a model with all interactions, `"cells"`-weighted EMMs are the same as the ordinary marginal means of the data. With `"flat"` weights, equal weights are used, except zero weight is applied to any factor combination having no data. Usually, `"cells"` or `"flat"` weighting will *not* produce non-estimable results, because we exclude empty cells. (That said, if covariates are linearly dependent with factors, we may still encounter non-estimable cases.)

Here is a comparison of predictions for `nutr.lm` defined [above](#issues), using different weighting schemes:
```{r message = FALSE}
sapply(c("equal", "prop", "outer", "cells", "flat"), function(w)
    predict(emmeans(nutr.lm, ~ race, weights = w)))
```
On the other hand, if we do `group * race` EMMs, only one factor (`age`) is averaged over; thus, the results for `"prop"` and `"outer"` weights will be identical in that case.

[Back to Contents](#contents)

## Nuisance factors {#nuisance}

Consider a situation where we have a model with 15 factors, each at 5 levels.
Regardless of how simple or complex the model is, the reference grid consists of all combinations of these factors -- and there are $5^{15}$ of these, or over 30 billion. If there are, say, 100 regression coefficients in the model, then just the `linfct` slot in the reference grid requires $100\times5^{15}\times8$ bytes of storage, or almost 23,000 gigabytes. Suppose in addition the model has a multivariate response with 5 levels. That multiplies *both* the rows and columns in `linfct`, increasing the storage requirements by a factor of 25. Either way, your computer can't store that much -- so this definitely qualifies as a messy situation!

The `ref_grid()` function now provides some relief, in the way of specifying some of the factors as "nuisance" factors. The reference grid is then constructed with those factors already averaged-out. So, for example with the same scenario, if only three of those 15 factors are of primary interest, and we specify the other 12 as nuisance factors to be averaged, that leaves us with only $5^3 = 125$ rows in the reference grid, and hence $125\times100\times8 = 100{,}000$ bytes of storage required for `linfct`. If there is a 5-level multivariate response, we'll have 625 rows in the reference grid and $25\times100{,}000 = 2{,}500{,}000$ bytes in `linfct`. Suddenly a horribly unmanageable situation becomes quite manageable!

But of course, there is a restriction: nuisance factors must not interact with any other factors -- not even other nuisance factors. And a multivariate response (or an implied multivariate response, e.g., in an ordinal model) can never be a nuisance factor. Under that condition, the average effects of a nuisance factor are the same regardless of the levels of other factors, making it possible to pre-average them by considering just one case.

We specify nuisance factors by listing their names in a `nuisance` argument to `ref_grid()` (in `emmeans()`, this argument is passed to `ref_grid()`). Often, it is much more convenient to give the factors that are *not* nuisance factors, via a `non.nuisance` argument. If you do specify a nuisance factor that does interact with others, or doesn't exist, it is quietly excluded from the nuisance list.

###### {#nuis.example}
Time for an example. Consider the `mtcars` dataset standard in R, and the model
```{r}
mtcars.lm <- lm(mpg ~ factor(cyl)*am + disp + hp + drat + log(wt) + vs +
                    factor(gear) + factor(carb), data = mtcars)
```
And let's construct two different reference grids:
```{r}
rg.usual <- ref_grid(mtcars.lm)
rg.usual
nrow(rg.usual@linfct)
rg.nuis <- ref_grid(mtcars.lm, non.nuisance = "cyl")
rg.nuis
nrow(rg.nuis@linfct)
```
Notice that we left `am` out of `non.nuisance` and hence included it in `nuisance`. However, it interacts with `cyl`, so it was not allowed as a nuisance factor. But `rg.nuis` requires 1/36 as much storage. There's really nothing else to show, other than to demonstrate that we get the same EMMs either way, with slightly different annotations:
```{r}
emmeans(rg.usual, ~ cyl * am)
emmeans(rg.nuis, ~ cyl * am)
```
By default, the pre-averaging is done with equal weights. If we specify `wt.nuis` as anything other than `"equal"`, they are averaged proportionally. As described above, this really amounts to `"outer"` weights since they are averaged separately.
Let's try it to see how the estimates differ:
```{r}
predict(emmeans(mtcars.lm, ~ cyl * am, non.nuis = c("cyl", "am"),
                wt.nuis = "prop"))
predict(emmeans(mtcars.lm, ~ cyl * am, weights = "outer"))
```
These are the same as each other, but different from the equally-weighted EMMs we obtained before. By the way, to help make things consistent, if `weights` is character, `emmeans()` passes `wt.nuis = weights` to `ref_grid()` (if it is called), unless `wt.nuis` is also specified.

There is a trick to get `emmeans()` to use the smallest possible reference grid: Pass the `specs` argument to `ref_grid()` as `non.nuisance`. But we have to quote it to delay evaluation, and also use `all.vars()` if (and only if) `specs` is a formula:
```{r}
emmeans(mtcars.lm, ~ gear | am, non.nuis = quote(all.vars(specs)))
```
Observe that `cyl` was passed over as a nuisance factor because it interacts with another factor.

### Limiting the size of the reference grid {#rg.limit}

We have just seen how easily the size of a reference grid can get out of hand. The `rg.limit` option (set via `emm_options()` or as an optional argument in `ref_grid()` or `emmeans()`) serves to guard against excessive memory demands. It specifies the number of allowed rows in the reference grid. But because of the way `ref_grid()` works, this check is made *before* any multivariate-response levels are taken into account. If the limit is exceeded, an error is thrown:
```{r, error = TRUE}
ref_grid(mtcars.lm, rg.limit = 200)
```
The default `rg.limit` is 10,000. With this limit, and if we have 1,000 columns in the model matrix, then the size of `linfct` is limited to about 80MB. If, in addition, there is a 5-level multivariate response, the limit is 2GB -- darn big, but perhaps manageable. Even so, I suspect that the 10,000-row default may be too loose to guard against some users getting into a tight situation.

[Back to Contents](#contents)

## Sub-models {#submodels}

We have just seen that we can assign different weights to the levels of containing factors. Another option is to constrain the effects of those containing factors to zero. In essence, that means fitting a different model without those containing effects; however, for certain models (not all), an `emmGrid` may be updated with a `submodel` specification so as to impose such a constraint. For illustration, return again to the nutrition example, and consider the analysis of `group` and `race` as before, after removing interactions involving `age`:
```{r}
summary(emmeans(nutr.lm, pairwise ~ group | race,
                submodel = ~ age + group*race),
        by = NULL)
```
If you like, you may confirm that we would obtain exactly the same estimates if we had fitted that sub-model to the data, except we continue to use the residual variance from the full model in tests and confidence intervals. Without the interactions with `age`, all of the marginal means become estimable. The results are somewhat different from those obtained earlier where we narrowed the scope to just age 3. These new estimates include all ages, averaging over them equally, but with constraints that the interaction effects involving `age` are all zero.

###### {#type2submodel}
There are two special character values that may be used with `submodel`. Specifying `"minimal"` creates a submodel with only the active factors:
```{r}
emmeans(nutr.lm, ~ group * race, submodel = "minimal")
```
This submodel constrains all effects involving `age` to be zero.
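One way to check such an equivalence directly (a sketch added here; `nutr.sub` is not part of the vignette) is to actually fit the smaller model. The point estimates should match those just shown, while the SEs differ slightly because the refit uses its own residual variance rather than the full model's:
```r
nutr.sub <- lm(gain ~ group * race, data = nutrition)
emmeans(nutr.sub, ~ group * race)   # estimates match submodel = "minimal"
```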
Another interesting option is `"type2"`, whereby we in essence analyze the residuals of the model with all contained or overlapping effects, then constrain the containing effects to be zero. So what is left is only the interaction effects of the factors involved. This is most useful with `joint_tests()`:
```{r}
joint_tests(nutr.lm, submodel = "type2")
```
These results are identical to the type II anova obtained [at the beginning of this example](#nutrex). More details on how `submodel` works may be found in [`vignette("xplanations")`](xplanations.html#submodels)

[Back to Contents](#contents)

## Nested fixed effects {#nesting}

A factor `A` is nested in another factor `B` if the levels of `A` have a different meaning in one level of `B` than in another. Often, nested factors are random effects -- for example, subjects in an experiment may be randomly assigned to treatments, in which case subjects are nested in treatments -- and if we model them as random effects, these random nested effects are not among the fixed effects and are not an issue to `emmeans`. But sometimes we have fixed nested factors.

###### {#cows}
Here is an example of a fictional study of five fictional treatments for some disease in cows. Two of the treatments are administered by injection, and the other three are administered orally. There are varying numbers of observations for each drug. The data and model follow:
```{r}
cows <- data.frame(
    route = factor(rep(c("injection", "oral"), c(5, 9))),
    drug = factor(rep(c("Bovineumab", "Charloisazepam",
                        "Angustatin", "Herefordmycin", "Mollycoddle"),
                      c(3,2,  4,2,3))),
    resp = c(34, 35, 34,   44, 43,
             36, 33, 36, 32,   26, 25,   25, 24, 24)
)
cows.lm <- lm(resp ~ route + drug, data = cows)
```
The `ref_grid` function finds a nested structure in this model:
```{r message = FALSE}
cows.rg <- ref_grid(cows.lm)
cows.rg
```
When there is nesting, `emmeans` computes averages separately in each group ...
```{r}
route.emm <- emmeans(cows.rg, "route")
route.emm
```
... and insists on carrying along any grouping factors that a factor is nested in:
```{r}
drug.emm <- emmeans(cows.rg, "drug")
drug.emm
```
Here are the associated pairwise comparisons:
```{r}
pairs(route.emm, reverse = TRUE)

pairs(drug.emm, by = "route", reverse = TRUE)
```
In the latter result, the contrast itself becomes a nested factor in the returned `emmGrid` object. That would not be the case if there had been no `by` variable.

#### Graphs with nesting

It can be very helpful to take advantage of special features of **ggplot2** when graphing results with nested factors. For example, the default plot for the `cows` example is not ideal:
```{r, fig.width = 5.5}
emmip(cows.rg, ~ drug | route)
```
We can instead remove `route` from the call and handle it with **ggplot2** code to use separate *x* scales:
```{r, fig.width = 5.5}
require(ggplot2)
emmip(cows.rg, ~ drug) + facet_wrap(~ route, scales = "free_x")
```
Similarly with `plot.emmGrid()`:
```{r, fig.height = 2.5, fig.width = 5.5}
plot(drug.emm, PIs = TRUE) +
    facet_wrap(~ route, nrow = 2, scales = "free_y")
```

### Auto-identification of nested factors -- avoid being trapped! {#nest-trap}

`ref_grid()` and `emmeans()` try to discover and accommodate nested structures in the fixed effects. They do this in two ways: first, by identifying factors whose levels appear in combination with only one level of another factor; and second, by examining the `terms` attribute of the fixed effects.
In the latter approach, if an interaction `A:B` appears in the model but `A` is not present as a main effect, then `A` is deemed to be nested in `B`. Note that this can create a trap: some users take shortcuts by omitting some fixed effects, knowing that this won't affect the fitted values. But such shortcuts *do* affect the interpretation of model parameters, ANOVA tables, etc., and I advise against ever taking such shortcuts. Here are some ways you may notice mistakenly-identified nesting: * A message is displayed when nesting is detected * A `str()` listing of the `emmGrid` object shows a nesting component * An `emmeans()` summary unexpectedly includes one or more factors that you didn't specify * EMMs obtained using `by` factors don't seem to behave right, or give the same results with different specifications To override the auto-detection of nested effects, use the `nesting` argument in `ref_grid()` or `emmeans()`. Specifying `nesting = NULL` will ignore all nesting. Incorrectly-discovered nesting can be overcome by specifying something akin to `nesting = "A %in% B, C %in% (A * B)"` or, equivalently, `nesting = list(A = "B", C = c("A", "B"))`. [Back to Contents](#contents) [Index of all vignette topics](vignette-topics.html) emmeans/vignettes/transformations.Rmd0000644000176200001440000007421014150770352017614 0ustar liggesusers--- title: "Transformations and link functions in emmeans" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{Transformations and link functions} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, results = "hide", message = FALSE} require("emmeans") knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") ``` ## Contents {#contents} This vignette covers the intricacies of transformations and link functions in **emmeans**. 1. [Overview](#overview) 2. [Re-gridding](#regrid) 3. [Link functions](#links) 3. [Graphing transformations and links](#trangraph) 4. [Both a response transformation and a link](#tranlink) 5. [Special transformations](#special) 6. [Specifying a transformation after the fact](#after) 6. [Auto-detected transformations](#auto) 6. [Standardized response](#stdize) 7. [Faking a log transformation](#logs) a. [Faking other transformations](#faking) b. [Alternative scale](#altscale) 8. [Bias adjustment](#bias-adj) [Index of all vignette topics](vignette-topics.html) ## Overview {#overview} Consider the same example with the `pigs` dataset that is used in many of these vignettes: ```{r} pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs) ``` This model has two factors, `source` and `percent` (coerced to a factor), as predictors; and log-transformed `conc` as the response. Here we obtain the EMMs for `source`, examine its structure, and finally produce a summary, including a test against a null value of log(35): ```{r} pigs.emm.s <- emmeans(pigs.lm, "source") str(pigs.emm.s) ``` ```{r} summary(pigs.emm.s, infer = TRUE, null = log(35)) ``` Now suppose that we want the EMMs expressed on the same scale as `conc`. This can be done by adding `type = "response"` to the `summary()` call: ```{r} summary(pigs.emm.s, infer = TRUE, null = log(35), type = "response") ``` Note: Looking ahead, this output is compared later in this vignette with a [bias-adjusted version](#pigs-biasadj). ### Timing is everything {#timing} Dealing with transformations in **emmeans** is somewhat complex, due to the large number of possibilities. 
But the key is understanding what happens, when. These results come from a sequence of steps. Here is what happens (and doesn't happen) at each step: 1. The reference grid is constructed for the `log(conc)` model. The fact that a log transformation is used is recorded, but nothing else is done with that information. 2. The predictions on the reference grid are averaged over the four `percent` levels, for each `source`, to obtain the EMMs for `source` -- *still* on the `log(conc)` scale. 3. The standard errors and confidence intervals for these EMMs are computed -- *still* on the `log(conc)` scale. 4. Only now do we do back-transformation... a. The EMMs are back-transformed to the `conc` scale. b. The endpoints of the confidence intervals are back-transformed. c. The *t* tests and *P* values are left as-is. d. The standard errors are converted to the `conc` scale using the delta method. These SEs were *not* used in constructing the tests and confidence intervals. ### The model is our best guide This choice of timing is based on the idea that *the model is right*. In particular, the fact that the response is transformed suggests that the transformed scale is the best scale to be working with. In addition, the model specifies that the effects of `source` and `percent` are *linear* on the transformed scale; inasmuch as marginal averaging to obtain EMMs is a linear operation, that averaging is best done on the transformed scale. For those two good reasons, back-transforming to the response scale is delayed until the very end by default. [Back to Contents](#contents) ## Re-gridding {#regrid} As well-advised as it is, some users may not want the default timing of things. The tool for changing when back-transformation is performed is the `regrid()` function -- which, with default settings of its arguments, back-transforms an `emmGrid` object and adjusts everything in it appropriately. For example: ```{r} str(regrid(pigs.emm.s)) summary(regrid(pigs.emm.s), infer = TRUE, null = 35) ``` Notice that the structure no longer includes the transformation. That's because it is no longer relevant; the reference grid is on the `conc` scale, and how we got there is now forgotten. Compare this `summary()` result with the preceding one, and note the following: * It no longer has annotations concerning transformations. * The estimates and SEs are identical. * The confidence intervals, *t* ratios, and *P* values are *not* identical. This is because, this time, the SEs shown in the table are the ones actually used to construct the tests and intervals. Understood, right? But think carefully about how these EMMs were obtained. They are back-transformed from `pigs.emm.s`, in which *the marginal averaging was done on the log scale*. If we want to back-transform *before* doing the averaging, we need to call `regrid()` after the reference grid is constructed but before the averaging takes place: ```{r} pigs.rg <- ref_grid(pigs.lm) pigs.remm.s <- emmeans(regrid(pigs.rg), "source") summary(pigs.remm.s, infer = TRUE, null = 35) ``` These results all differ from either of the previous two summaries -- again, because the averaging is done on the `conc` scale rather than the `log(conc)` scale. ###### {#regrid} Note: For those who want to routinely back-transform before averaging, the `transform` argument in `ref_grid()` simplifies this. 
The first two steps above could have been done more easily as follows: ```{r eval = FALSE} pigs.remm.s <- emmeans(pigs.lm, "source", transform = "response") ``` But don't get `transform` and `type` confused. The `transform` argument is passed to `regrid()` after the reference grid is constructed, whereas the `type` argument is simply remembered and used by `summary()`. So a similar-looking call: ```{r eval = FALSE} emmeans(pigs.lm, "source", type = "response") ``` will compute the results we have seen for `pigs.emm.s` -- back-transformed *after* averaging on the log scale. Remember again: When it comes to transformations, timing is everything. [Back to Contents](#contents) ## Link functions {#links} Exactly the same ideas we have presented for response transformations apply to generalized linear models having non-identity link functions. As far as **emmeans** is concerned, there is no difference at all. To illustrate, consider the `neuralgia` dataset provided in the package. These data come from an experiment reported in a SAS technical report where different treatments for neuralgia are compared. The patient's sex is an additional factor, and their age is a covariate. The response is `Pain`, a binary variable on whether or not the patient reports neuralgia pain after treatment. The model suggested in the SAS report is equivalent to the following. We use it to obtain estimated probabilities of experiencing pain: ```{r} neuralgia.glm <- glm(Pain ~ Treatment * Sex + Age, family = binomial(), data = neuralgia) neuralgia.emm <- emmeans(neuralgia.glm, "Treatment", type = "response") neuralgia.emm ``` ###### {#oddsrats} (The note about the interaction is discussed shortly.) Note that the averaging over `Sex` is done on the logit scale, *before* the results are back-transformed for the summary. We may use `pairs()` to compare these estimates; note that logits are logs of odds; so this is another instance where log-differences are back-transformed -- in this case to odds ratios: ```{r} pairs(neuralgia.emm, reverse = TRUE) ``` So there is evidence of considerably more pain being reported with placebo (treatment `P`) than with either of the other two treatments. The estimated odds of pain with `B` are about half that for `A`, but this finding is not statistically significant. (The odds that this is a made-up dataset seem quite high, but that finding is strictly this author's impression.) Observe that there is a note in the output for `neuralgia.emm` that the results may be misleading. It is important to take it seriously, because if two factors interact, it may be the case that marginal averages of predictions don't reflect what is happening at any level of the factors being averaged over. To find out, look at an interaction plot of the fitted model: ```{r} emmip(neuralgia.glm, Sex ~ Treatment) ``` There is no practical difference between females and males in the patterns of response to `Treatment`; so I think most people would be quite comfortable with the marginal results that are reported earlier. [Back to Contents](#contents) ## Graphing transformations and links {#trangraph} There are a few options for displaying transformed results graphically. First, the `type` argument works just as it does in displaying a tabular summary. 
Following through with the `neuralgia` example, let us display the marginal `Treatment` EMMs on both the link scale and the response scale (we are opting to do the averaging on the link scale):
```{r, fig.height = 1.5}
neur.Trt.emm <- suppressMessages(emmeans(neuralgia.glm, "Treatment"))
plot(neur.Trt.emm)   # Link scale by default
plot(neur.Trt.emm, type = "response")
```
Besides whether or not we see response values, there is a dramatic difference in the symmetry of the intervals.

For `emmip()` and `plot()` *only* (and currently only with the "ggplot" engine), there is also the option of specifying `type = "scale"`, which causes the response values to be calculated but plotted on a nonlinear scale corresponding to the transformation or link:
```{r, fig.height = 1.5}
plot(neur.Trt.emm, type = "scale")
```
Notice that the interior part of this plot is identical to the plot on the link scale. Only the horizontal axis is different. That is because the response values are transformed using the link function to determine the plotting positions of the graphical elements -- putting them back where they started.

As is the case here, nonlinear scales can be confusing to read, and it is very often true that you will want to display more scale divisions, and even add minor ones. This is done via adding arguments for the function `ggplot2::scale_x_continuous()` (see its documentation):
```{r, fig.height = 1.5}
plot(neur.Trt.emm, type = "scale", breaks = seq(0.10, 0.90, by = 0.10),
     minor_breaks = seq(0.05, 0.95, by = 0.05))
```
When using the `"ggplot"` engine, you always have the option of using **ggplot2** to incorporate a transformed scale -- and it doesn't even have to be the same as the transformation used in the model. For example, here we display the same results on an arcsin-square-root scale.
```{r, fig.height = 1.5}
plot(neur.Trt.emm, type = "response") +
    ggplot2::scale_x_continuous(trans = scales::asn_trans(),
                                breaks = seq(0.10, 0.90, by = 0.10))
```
This comes across as a compromise: not as severe as the logit scaling, and not as distorted as the linear scaling of response values.

Again, the same techniques can be used with `emmip()`, except it is the vertical scale that is affected.

[Back to Contents](#contents)

## Models having both a response transformation and a link function {#tranlink}

It is possible to have a generalized linear model with a non-identity link *and* a response transformation. Here is an example, with the built-in `warpbreaks` dataset:
```{r}
warp.glm <- glm(sqrt(breaks) ~ wool*tension, family = Gamma, data = warpbreaks)
ref_grid(warp.glm)
```
The canonical link for a gamma model is the reciprocal (or inverse); and there is the square-root response transformation besides. If we choose `type = "response"` in summarizing, we undo *both* transformations:
```{r}
emmeans(warp.glm, ~ tension | wool, type = "response")
```
What happened here is that first the linear predictor was back-transformed from the link scale (inverse); then the squares were obtained to back-transform the rest of the way. It is possible to undo the link, and not the response transformation:
```{r}
emmeans(warp.glm, ~ tension | wool, type = "unlink")
```
It is *not* possible to undo the response transformation and leave the link in place, because the response was transformed first, then the link model was applied; we have to undo those in reverse order to make sense. One may also use `"unlink"` as a `transform` argument in `regrid()` or through `ref_grid()`.
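For instance (a sketch of two presumably equivalent formulations):
```r
emm <- emmeans(warp.glm, ~ tension | wool)   # on the linked, sqrt scale
regrid(emm, transform = "unlink")            # back through the link only

# or request it when the reference grid is built:
ref_grid(warp.glm, transform = "unlink")
```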
[Back to Contents](#contents)

## Special transformations {#special}

The `make.tran()` function provides several special transformations and sets things up so they can be handled in **emmeans** with relative ease. (See `help("make.tran", "emmeans")` for descriptions of what is available.) `make.tran()` works much like `stats::make.link()` in that it returns a list of functions `linkfun()`, `linkinv()`, etc. that serve in managing results on a transformed scale. The difference is that most transformations with `make.tran()` require additional arguments.

To use this capability in `emmeans()`, it is convenient to first obtain the `make.tran()` result, and then to use it as the enclosing environment for fitting the model, with `linkfun` as the transformation. For example, suppose the response variable is a percentage and we want to use the response transformation $\sin^{-1}\sqrt{y/100}$. Then proceed like this:
```{r eval = FALSE}
tran <- make.tran("asin.sqrt", 100)
my.model <- with(tran,
    lmer(linkfun(percent) ~ treatment + (1|Block), data = mydata))
```
Subsequent calls to `ref_grid()`, `emmeans()`, `regrid()`, etc. will then be able to access the transformation information correctly.

The help page for `make.tran()` has an example like this using a Box-Cox transformation.

[Back to Contents](#contents)

## Specifying a transformation after the fact {#after}

It is not at all uncommon to fit a model using statements like the following:
```{r eval = FALSE}
mydata <- transform(mydata, logy.5 = log(yield + 0.5))
my.model <- lmer(logy.5 ~ treatment + (1|Block), data = mydata)
```
In this case, there is no way for `ref_grid()` to figure out that a response transformation was used. What can be done is to update the reference grid with the required information:
```{r eval = FALSE}
my.rg <- update(ref_grid(my.model), tran = make.tran("genlog", .5))
```
Subsequently, use `my.rg` in place of `my.model` in any `emmeans()` analyses, and the transformation information will be there.

For standard transformations (those in `stats::make.link()`), just give the name of the transformation; e.g.,
```{r eval = FALSE}
model.rg <- update(ref_grid(model), tran = "sqrt")
```

## Auto-detected response transformations {#auto}

As can be seen in the initial `pigs.lm` example in this vignette, certain straightforward response transformations such as `log`, `sqrt`, etc. are automatically detected when `emmeans()` (really, `ref_grid()`) is called on the model object. In fact, scaling and shifting is supported too; so the preceding example with `my.model` could have been done more easily by specifying the transformation directly in the model formula:
```r
my.better.model <- lmer(log(yield + 0.5) ~ treatment + (1|Block), data = mydata)
```
The transformation would be auto-detected, saving you the trouble of adding it later.

Similarly, a response transformation of `2 * sqrt(y + 1)` would be correctly auto-detected. A model with a linearly transformed response, e.g. `4*(y - 1)`, would *not* be auto-detected, but `4*I(y + -1)` would be interpreted as `4*identity(y + -1)`. Parsing is such that the response expression must be of the form `mult * fcn(resp + const)`; operators of `-` and `/` are not recognized.

[Back to Contents](#contents)

## Faking a log transformation {#logs}

The `regrid()` function makes it possible to fake a log transformation of the response. Why would you want to do this? So that you can make comparisons using ratios instead of differences.
Consider the `pigs` example once again, but suppose we had fitted a model with a square-root transformation instead of a log:
```{r}
pigroot.lm <- lm(sqrt(conc) ~ source + factor(percent), data = pigs)
piglog.emm.s <- regrid(emmeans(pigroot.lm, "source"), transform = "log")
confint(piglog.emm.s, type = "response")
pairs(piglog.emm.s, type = "response")
```
These results are not identical, but very similar to the back-transformed confidence intervals [above](#timing) for the EMMs and the [pairwise ratios in the "comparisons" vignette](comparisons.html#logs), where the fitted model actually used a log response.

### Faking other transformations {#faking}

It is possible to fake transformations other than the log. Just use the same method, e.g.
```{r, eval = FALSE}
regrid(emm, transform = "probit")
```
would re-grid the existing `emm` to the probit scale. Note that any estimates in `emm` outside of the interval $(0,1)$ will be flagged as non-estimable.

The [section on standardized responses](#stdize) gives an example of reverse-engineering a standardized response transformation in this way.

### Alternative scale {#altscale}

It is possible to create a report on an alternative scale by updating the `tran` component. For example, suppose we want percent differences instead of ratios in the preceding example with the `pigs` dataset. This is possible by modifying the inverse transformation: since the usual inverse transformation is a ratio of the form $r = a/b$, we have that the percentage difference between $a$ and $b$ is $100(a-b)/b = 100(r-1)$. Thus,
```{r}
pct.diff.tran <- list(
    linkfun = function(mu) log(mu/100 + 1),
    linkinv = function(eta) 100 * (exp(eta) - 1),
    mu.eta = function(eta) 100 * exp(eta),
    name = "log(pct.diff)"
)

update(pairs(piglog.emm.s, type = "response"),
       tran = pct.diff.tran, inv.lbl = "pct.diff")
```

## Standardized response {#stdize}

In some disciplines, it is common to fit a model to a standardized response variable. R's base function `scale()` makes this easy to do; but it is important to notice that `scale(y)` is more complicated than, say, `sqrt(y)`, because `scale(y)` requires all the values of `y` in order to determine the centering and scaling parameters. The `ref_grid()` function (called by `emmeans()` and others) tries to detect the scaling parameters. To illustrate:
```{r, message = FALSE}
fiber.lm <- lm(scale(strength) ~ machine * scale(diameter), data = fiber)
emmeans(fiber.lm, "machine")   # on the standardized scale
emmeans(fiber.lm, "machine", type = "response")   # strength scale
```
More interesting (and complex) is what happens with `emtrends()`. Without anything fancy added, we have
```{r}
emtrends(fiber.lm, "machine", var = "diameter")
```
These slopes are (change in `scale(strength)`) / (change in `diameter`); that is, we didn't do anything to undo the response transformation, but the trend is based on exactly the variable specified, `diameter`. To get (change in `strength`) / (change in `diameter`), we need to undo the response transformation, and that is done via `transform` (which invokes `regrid()` after the reference grid is constructed):
```{r}
emtrends(fiber.lm, "machine", var = "diameter", transform = "response")
```
What if we want slopes for (change in `scale(strength)`) / (change in `scale(diameter)`)? This can be done, but it is necessary to manually specify the scaling parameters for `diameter`.
```{r} with(fiber, c(mean = mean(diameter), sd = sd(diameter))) emtrends(fiber.lm, "machine", var = "scale(diameter, 24.133, 4.324)") ``` This result is the one most directly related to the regression coefficients: ```{r} coef(fiber.lm)[4:6] ``` There is a fourth possibility, (change in `strength`) / (change in `scale(diameter)`), that I leave to the reader. ### What to do if auto-detection fails Auto-detection of standardized responses is a bit tricky, and doesn't always succeed. If it fails, a message is displayed and the transformation is ignored. In cases where it doesn't work, we need to explicitly specify the transformation using `make.tran()`. The methods are exactly as shown earlier in this vignette, so we show the code but not the results for a hypothetical example. One method is to fit the model and then add the transformation information later. In this example, `some.fcn` is a model-fitting function which for some reason doesn't allow the scaling information to be detected. ```{r, eval = FALSE} mod <- some.fcn(scale(RT) ~ group + (1|subject), data = mydata) emmeans(mod, "group", type = "response", tran = make.tran("scale", y = mydata$RT)) ``` The other, equivalent, method is to create the transformation object first and use it in fitting the model: ```{r, eval = FALSE} mod <- with(make.tran("scale", y = mydata$RT), some.fcn(linkfun(RT) ~ group + (1|subject), data = mydata)) emmeans(mod, "group", type = "response") ``` ### Reverse-engineering a standardized response An interesting twist on all this is the reverse situation: Suppose we fitted the model *without* the standardized response, but we want to know what the results would be if we had standardized. Here we reverse-engineer the `fiber.lm` example above: ```{r, message = FALSE} fib.lm <- lm(strength ~ machine * diameter, data = fiber) # On raw scale: emmeans(fib.lm, "machine") # On standardized scale: tran <- make.tran("scale", y = fiber$strength) emmeans(fib.lm, "machine", transform = tran) ``` In the latter call, the `transform` argument causes `regrid()` to be called after the reference grid is constructed. [Back to Contents](#contents) ## Bias adjustment {#bias-adj} So far, we have discussed ideas related to back-transforming results as a simple way of expressing results on the same scale as the response. In particular, means obtained in this way are known as *generalized means*; for example, a log transformation of the response is associated with geometric means. When the goal is simply to make inferences about which means are less than which other means, and a response transformation is used, it is often acceptable to present estimates and comparisons of these generalized means. However, sometimes it is important to report results that actually do reflect expected values of the untransformed response. An example is a financial study, where the response is in some monetary unit. It may be convenient to use a response transformation for modeling purposes, but ultimately we may want to make financial projections in those same units. In such settings, we need to make a bias adjustment when we back-transform, because any nonlinear transformation biases the expected values of statistical quantities. More specifically, suppose that we have a response $Y$ and the transformed response is $U$. To back-transform, we use $Y = h(U)$; and using a Taylor approximation, $Y \approx h(\eta) + h'(\eta)(U-\eta) + \frac12h''(\eta)(U-\eta)^2$, so that $E(Y) \approx h(\eta) + \frac12h''(\eta)Var(U)$. 
This shows that the amount of needed bias adjustment is approximately $\frac12h''(\eta)\sigma^2$ where $\sigma$ is the error SD in the model for $U$. It depends on $\sigma$, and the larger $\sigma$ is, the greater the needed bias adjustment. This second-order bias adjustment is what is currently used in the **emmeans** package when bias-adjustment is requested. There are better or exact adjustments for certain cases, and future updates may incorporate some of those.

### Pigs example revisited {#pigs-biasadj}

Let us compare the estimates in [the overview](#overview) after we apply a bias adjustment. First, note that an estimate of the residual SD is available via the `sigma()` function:
```{r}
sigma(pigs.lm)
```
This estimate is used by default. The bias-adjusted EMMs for the sources are:
```{r}
summary(pigs.emm.s, type = "response", bias.adj = TRUE)
```
These estimates (and also their SEs) are slightly larger than we had without bias adjustment. They are estimates of the *arithmetic* mean responses, rather than the *geometric* means shown in the overview. Had the value of `sigma` been larger, the adjustment would have been greater. You can experiment with this by adding a `sigma =` argument to the above call.

### Response transformations vs. link functions {#link-bias}

At this point, it is important to point out that the above discussion focuses on *response* transformations, as opposed to link functions used in generalized linear models (GLMs). In an ordinary GLM, no bias adjustment is needed, nor is it appropriate, because the link function is just used to define a nonlinear relationship between the actual response mean $\eta$ and the linear predictor. That is, the back-transformed parameter is already the mean.

#### InsectSprays example {#insects}

To illustrate this, consider the `InsectSprays` data in the **datasets** package. The response variable is a count, and there is one treatment, the spray that is used. Let us model the count as a Poisson variable with (by default) a log link, and obtain the EMMs with and without a bias adjustment:
```{r}
ismod <- glm(count ~ spray, data = InsectSprays, family = poisson())
emmeans(ismod, "spray", type = "response", bias.adj = FALSE)
emmeans(ismod, "spray", type = "response", bias.adj = TRUE)
```
These are substantially different! Which is right? Well, due to the simple structure of this dataset, the estimates should be well in line with the simple observed mean counts:
```{r}
with(InsectSprays, tapply(count, spray, mean))
```
This illustrates that it is the *non*-bias-adjusted results that are appropriate. Again, the point here is that a GLM does not have an additive error term; the model is already formulated in terms of the mean, not some generalized mean. Users must be very careful with this! There is no way to automatically do the right thing.

Note that, in a generalized linear *mixed* model, including generalized estimating equations and such, there *are* additive random components involved, and then bias adjustment becomes appropriate.

#### CBPP example {#cbpp}

Consider an example adapted from the help page for `lme4::cbpp`. Contagious bovine pleuropneumonia (CBPP) is a disease in African cattle, and the dataset contains data on incidence of CBPP in several herds of cattle over four time periods.
We will fit a mixed model that accounts for herd variations as well as overdispersion (variations larger than expected with a simple binomial model): ```{r, message = FALSE} require(lme4) cbpp <- transform(cbpp, unit = 1:nrow(cbpp)) cbpp.glmer <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd) + (1 | unit), family = binomial, data = cbpp) emm <- emmeans(cbpp.glmer, "period") summary(emm, type = "response") ``` The above summary reflects the back-transformed estimates, with no bias adjustment. However, the model estimates two independent sources of random variation that probably should be taken into account: ```{r} lme4::VarCorr(cbpp.glmer) ``` Notably, the over-dispersion SD is considerably greater than the herd SD. Suppose we want to estimate the marginal probabilities of CBPP incidence, averaged over herds and over-dispersion variations. For this purpose, we need the combined effect of these variations; so we compute the overall SD via the Pythagorean theorem: ```{r} total.SD = sqrt(0.89107^2 + 0.18396^2) ``` Accordingly, here are the bias-adjusted estimates of the marginal probabilities: ```{r} summary(emm, type = "response", bias.adjust = TRUE, sigma = total.SD) ``` These estimates are somewhat larger than the unadjusted estimates (actually, any estimates greater than 0.5 would have been adjusted downward). These adjusted estimates are more appropriate for describing the marginal incidence of CBPP for all herds. In fact, these estimates are fairly close to those obtained directly from the incidences in the data: ```{r} cases <- with(cbpp, tapply(incidence, period, sum)) trials <- with(cbpp, tapply(size, period, sum)) cases / trials ``` Left as an exercise: Revisit the `InsectSprays` example, but (using similar methods to the above) create a `unit` variable and fit an over-dispersion model. Compare the results with and without bias adjustment, and evaluate these results against the earlier results. This is simpler than the CBPP example because there is only one random effect. [Back to Contents](#contents) [Index of all vignette topics](vignette-topics.html) emmeans/vignettes/sophisticated.Rmd0000644000176200001440000004326714137062735017242 0ustar liggesusers--- title: "Sophisticated models in emmeans" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{Sophisticated models in emmeans} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, results = "hide", message = FALSE} require("emmeans") options(show.signif.stars = FALSE, width = 100) knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") ``` This vignette gives a few examples of the use of the **emmeans** package to analyze other than the basic types of models provided by the **stats** package. Emphasis here is placed on accessing the optional capabilities that are typically not needed for the more basic models. A reference for all supported models is provided in the ["models" vignette](models.html). ## Contents {#contents} 1. [Linear mixed models (lmer)](#lmer) a. [System options for lmerMod models](#lmerOpts) 2. [Models with offsets](#offsets) 3. [Ordinal models](#ordinal) 4. [Models fitted using MCMC methods](#mcmc) [Index of all vignette topics](vignette-topics.html) ## Linear mixed models (lmer) {#lmer} Linear mixed models are really important in statistics. Emphasis here is placed on those fitted using `lme4::lmer()`, but **emmeans** also supports other mixed-model packages such as **nlme**. 
To illustrate, consider the `Oats` dataset in the **nlme** package. It has the results of a balanced split-plot experiment: experimental blocks are divided into plots that are randomly assigned to oat varieties, and the plots are subdivided into subplots that are randomly assigned to amounts of nitrogen within each plot. We will consider a linear mixed model for these data, excluding interaction (which is justified in this case). For sake of illustration, we will exclude a few observations.
```{r}
Oats.lmer <- lme4::lmer(yield ~ Variety + factor(nitro) + (1|Block/Variety),
                        data = nlme::Oats, subset = -c(1,2,3,5,8,13,21,34,55))
```
Let's look at the EMMs for `nitro`:
```{r}
Oats.emm.n <- emmeans(Oats.lmer, "nitro")
Oats.emm.n
```
You will notice that the degrees of freedom are fractional: that is due to the fact that whole-plot and subplot variations are combined when standard errors are estimated. Different degrees-of-freedom methods are available. By default, the Kenward-Roger method is used, and that's why you see a message about the **pbkrtest** package being loaded, as it implements that method. We may specify a different degrees-of-freedom method via the optional argument `lmer.df`:
```{r}
emmeans(Oats.lmer, "nitro", lmer.df = "satterthwaite")
```

###### {#dfoptions}
This latest result uses the Satterthwaite method, which is implemented in the **lmerTest** package. Note that, with this method, not only are the degrees of freedom slightly different, but so are the standard errors. That is because the Kenward-Roger method also entails making a bias adjustment to the covariance matrix of the fixed effects; that is the principal difference between the methods. A third possibility is `"asymptotic"`:
```{r}
emmeans(Oats.lmer, "nitro", lmer.df = "asymptotic")
```
This just sets all the degrees of freedom to `Inf` -- that's **emmeans**'s way of using *z* statistics rather than *t* statistics. The asymptotic methods tend to make confidence intervals a bit too narrow and *P* values a bit too low; but they involve much, much less computation. Note that the SEs are the same as obtained using the Satterthwaite method.

Comparisons and contrasts are pretty much the same as with other models. As `nitro` has quantitative levels, we might want to test polynomial contrasts:
```{r}
contrast(Oats.emm.n, "poly")
```
The interesting thing here is that the degrees of freedom are much larger than they are for the EMMs. The reason is that `nitro` is a within-plot factor, so inter-plot variations play little role in estimating contrasts among `nitro` levels. On the other hand, `Variety` is a whole-plot factor, and there is not much of a bump in degrees of freedom for comparisons:
```{r}
emmeans(Oats.lmer, pairwise ~ Variety)
```

### System options for lmerMod models {#lmerOpts}
The computations required for the adjusted covariance matrix and degrees of freedom may become cumbersome. Some user options (i.e., `emm_options()` calls) make it possible to streamline these computations through default methods and limitations on them. First, the option `lmer.df`, which may have values of `"kenward-roger"`, `"satterthwaite"`, or `"asymptotic"` (partial matches are OK!) specifies the default degrees-of-freedom method. The options `disable.pbkrtest` and `disable.lmerTest` may be `TRUE` or `FALSE`, and comprise another way of controlling which method is used (e.g., the Kenward-Roger method will not be used if `get_emm_option("disable.pbkrtest") == TRUE`).
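For example, here is how such defaults might be set for a session (an added sketch using the options just described):
```{r, eval = FALSE}
emm_options(lmer.df = "satterthwaite")   # default d.f. method for lmer models
emm_options(disable.pbkrtest = TRUE)     # never use the Kenward-Roger method
get_emm_option("lmer.df")                # check the current setting
```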
Finally, the options `pbkrtest.limit` and `lmerTest.limit`, which should be set to numeric values, enable the corresponding method only when the number of data rows does not exceed the given limit. The factory default is 3000 for both limits.

[Back to Contents](#contents)

## Models with offsets {#offsets}
If a model is fitted and its formula includes an `offset()` term, then by default, the offset is computed and included in the reference grid. To illustrate, consider a hypothetical dataset on insurance claims (used as an [example in SAS's documentation](https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_genmod_sect006.htm)). There are classes of cars of varying counts (`n`), sizes (`size`), and ages (`age`), and we record the number of insurance claims (`claims`). We fit a Poisson model to `claims` as a function of `size` and `age`. An offset of `log(n)` is included so that `n` functions as an "exposure" variable.
```{r}
ins <- data.frame(
    n = c(500, 1200, 100, 400, 500, 300),
    size = factor(rep(1:3,2), labels = c("S","M","L")),
    age = factor(rep(1:2, each = 3)),
    claims = c(42, 37, 1, 101, 73, 14))
ins.glm <- glm(claims ~ size + age + offset(log(n)),
               data = ins, family = "poisson")
```
First, let's look at the reference grid obtained by default:
```{r}
ref_grid(ins.glm)
```
Note that `n` is included in the reference grid and that its average value of 500 is used for all predictions. Thus, if we obtain EMMs for, say, `size`, those results are based on a pool of 500 cars:
```{r}
emmeans(ins.glm, "size", type = "response")
```
However, many users would like to ignore the offset for this kind of model, because then the estimates we obtain are rates per unit value of the (logged) offset. This may be accomplished by specifying an `offset` parameter in the call:
```{r}
emmeans(ins.glm, "size", type = "response", offset = 0)
```
You may verify that the above estimates are 1/500th of the previous ones. You may also verify that the above results are identical to those obtained by setting `n` equal to 1:
```{r eval = FALSE}
emmeans(ins.glm, "size", type = "response", at = list(n = 1))
```
However, those who use these types of models will be more comfortable directly setting the offset to zero. By the way, you may set some other reference value for the rates. For example, if you want estimates of claims per 100 cars, simply use (results not shown):
```{r eval = FALSE}
emmeans(ins.glm, "size", type = "response", offset = log(100))
```

[Back to Contents](#contents)

## Ordinal models {#ordinal}
Ordinal-response models comprise an example where several options are available for obtaining EMMs. To illustrate, consider the `wine` data in the **ordinal** package. The response is a rating of bitterness on a five-point scale. We will consider a probit model in two factors during fermentation: `temp` (temperature) and `contact` (contact with grape skins), with the judge making the rating as a scale predictor:
```{r}
require("ordinal")
wine.clm <- clm(rating ~ temp + contact, scale = ~ judge,
                data = wine, link = "probit")
```
(In earlier modeling, we found little interaction between the factors.)
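(As an added aside, you can inspect how these variables are coded if you want to follow along:)
```{r, eval = FALSE}
str(wine)   # `rating`: ordered factor, 1--5; `temp`, `contact`, `judge`: factors
```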
Here are the EMMs for each factor using default options:
```{r}
emmeans(wine.clm, list(pairwise ~ temp, pairwise ~ contact))
```
These results are on the "latent" scale; the idea is that there is a continuous random variable (in this case normal, due to the probit link) having a mean that depends on the predictors; and that the ratings are a discretization of the latent variable based on a fixed set of cut points (which are estimated). In this particular example, we also have a scale model that says that the variance of the latent variable depends on the judges. The latent results are quite a bit like those for measurement data, making them easy to interpret. The only catch is that they are not uniquely defined: we could apply a linear transformation to them, and the same linear transformation to the cut points, and the results would be the same.

###### {#ordlp}
The `clm` function actually fits the model as an ordinary probit model, but with a different intercept for each cut point. We can get detailed information for this model by specifying `mode = "linear.predictor"`:
```{r}
tmp <- ref_grid(wine.clm, mode = "lin")
tmp
```
Note that this reference grid involves an additional constructed predictor named `cut` that accounts for the different intercepts in the model. Let's obtain EMMs for `temp` on the linear-predictor scale:
```{r}
emmeans(tmp, "temp")
```
These are just the negatives of the latent results obtained earlier (the sign is changed to make the comparisons go the right direction). Closely related to this are `mode = "cum.prob"` and `mode = "exc.prob"`, which simply transform the linear predictor to cumulative probabilities and exceedance (1 - cumulative) probabilities. These modes give us access to the details of the fitted model but are cumbersome to use for describing results. Where they do become useful is when you want to work in terms of a particular cut point. Let's look at `temp` again in terms of the probability that the rating will be at least 4:
```{r}
emmeans(wine.clm, ~ temp, mode = "exc.prob", at = list(cut = "3|4"))
```

###### {#ordprob}
There are yet more modes! With `mode = "prob"`, we obtain estimates of the probability distribution of each rating. Its reference grid includes a factor with the same name as the model response -- in this case `rating`. We usually want to use that as the primary factor, and the factors of interest as `by` variables:
```{r}
emmeans(wine.clm, ~ rating | temp, mode = "prob")
```
Using `mode = "mean.class"` obtains the mean of each of these probability distributions, regarding the ratings as the integers 1--5:
```{r}
emmeans(wine.clm, "temp", mode = "mean.class")
```
And there is a mode for the scale model too. In this example, the scale model involves only judges, and that is the only factor in the grid:
```{r}
summary(ref_grid(wine.clm, mode = "scale"), type = "response")
```
Judge 8's ratings don't vary much, relative to the others. The scale model is in terms of log(SD); again, these estimates are not uniquely identifiable, and the first level's estimate is set to log(1) = 0. So, actually, each estimate shown is a comparison with judge 1.

[Back to Contents](#contents)

## Models fitted using MCMC methods {#mcmc}
To illustrate **emmeans**'s support for models fitted using MCMC methods, consider the `example_model` available in the **rstanarm** package. The example concerns CBPP, a serious disease of cattle in Ethiopia. A generalized linear mixed model was fitted to the data using the code below.
(This is a Bayesian equivalent of the frequentist model we considered in the ["Transformations" vignette](transformations.html#cbpp).) In fitting the model, we first set the contrast coding to `bayestestR::contr.bayes` because this equalizes the priors across different treatment levels (a correction from an earlier version of this vignette). We subsequently obtain the reference grids for these models in the usual way. For later use, we also fit the same model with just the prior information.
```{r eval = FALSE}
cbpp <- transform(lme4::cbpp, unit = 1:56)
require("bayestestR")
options(contrasts = c("contr.bayes", "contr.poly"))
cbpp.rstan <- rstanarm::stan_glmer(
    cbind(incidence, size - incidence) ~ period + (1|herd) + (1|unit),
    data = cbpp, family = binomial,
    prior = student_t(df = 5, location = 0, scale = 2, autoscale = FALSE),
    chains = 2, cores = 1, seed = 2021.0120, iter = 1000)
cbpp_prior.rstan <- update(cbpp.rstan, prior_PD = TRUE)
cbpp.rg <- ref_grid(cbpp.rstan)
cbpp_prior.rg <- ref_grid(cbpp_prior.rstan)
```
```{r echo = FALSE}
cbpp.rg <- do.call(emmobj,
    readRDS(system.file("extdata", "cbpprglist", package = "emmeans")))
cbpp_prior.rg <- do.call(emmobj,
    readRDS(system.file("extdata", "cbpppriorrglist", package = "emmeans")))
cbpp.sigma <- readRDS(system.file("extdata", "cbppsigma", package = "emmeans"))
```
Here is the structure of the reference grid:
```{r}
cbpp.rg
```
So here are the EMMs (no averaging needed in this simple model):
```{r}
summary(cbpp.rg)
```
The summary for EMMs of Bayesian models shows the median of the posterior distribution of each estimate, along with highest posterior density (HPD) intervals. Under the hood, the posterior sample of parameter estimates is used to compute a corresponding sample of posterior EMMs, and it is those that are summarized. (Technical note: the summary is actually rerouted to the `hpd.summary()` function.)

###### {#bayesxtra}
We can access the posterior EMMs via the `as.mcmc` method for `emmGrid` objects. This gives us an object of class `mcmc` (defined in the **coda** package), which can be summarized and explored as we please.
```{r}
require("coda")
summary(as.mcmc(cbpp.rg))
```
Note that `as.mcmc` will actually produce an `mcmc.list` when there is more than one chain present, as in this example. The 2.5th and 97.5th quantiles are similar, but not identical, to the 95% confidence intervals in the frequentist summary.

The **bayestestR** package provides `emmGrid` methods for most of its description and testing functions. For example:
```{r}
bayestestR::bayesfactor_parameters(pairs(cbpp.rg), prior = pairs(cbpp_prior.rg))
bayestestR::p_rope(pairs(cbpp.rg), range = c(-0.25, 0.25))
```
Both of these sets of results suggest that period 1 is different from the others. For more information on these methods, refer to [the CRAN page for **bayestestR**](https://cran.r-project.org/package=bayestestR) and its vignettes, e.g., the one on Bayes factors.

### Bias-adjusted incidence probabilities {#bias-adj-mcmc}
Next, let us consider the back-transformed results. As is discussed with the [frequentist model](transformations.html#cbpp), there are random effects present, and if we want to think in terms of marginal probabilities across all herds and units, we should correct for bias; and to do that, we need the standard deviations of the random effects. The model object has MCMC results for the random effects of each herd and each unit, but after those, there are also summary results for the posterior SDs of the two random effects.
(I used the `colnames` function to find that they are in the 78th and 79th columns.)
```{r eval = FALSE}
cbpp.sigma = as.matrix(cbpp.rstan$stanfit)[, 78:79]
```
Here are the first few:
```{r}
head(cbpp.sigma)
```
To obtain bias-adjusted marginal probabilities, we compute the resultant SD and regrid with bias correction:
```{r}
totSD <- sqrt(apply(cbpp.sigma^2, 1, sum))
cbpp.rgrd <- regrid(cbpp.rg, bias.adjust = TRUE, sigma = totSD)
summary(cbpp.rgrd)
```
Here is a plot of the posterior incidence probabilities, back-transformed:
```{r}
bayesplot::mcmc_areas(as.mcmc(cbpp.rgrd))
```
... and here are intervals for each period compared with its neighbor:
```{r}
contrast(cbpp.rgrd, "consec", reverse = TRUE)
```
The only interval that excludes zero is the one that compares periods 1 and 2.

### Bayesian prediction {#predict-mcmc}
To predict from an MCMC model, just specify the `likelihood` argument in `as.mcmc`. Doing so causes the function to simulate data from the posterior predictive distribution. For example, if we want to predict the CBPP incidence in future herds of 25 cattle, we can do:
```{r}
set.seed(2019.0605)
cbpp.preds <- as.mcmc(cbpp.rgrd, likelihood = "binomial", trials = 25)
bayesplot::mcmc_hist(cbpp.preds, binwidth = 1)
```

[Back to Contents](#contents)

[Index of all vignette topics](vignette-topics.html)
emmeans/vignettes/utilities.Rmd0000644000176200001440000003214114137062735016377 0ustar liggesusers---
title: "Utilities and options for emmeans"
author: "emmeans package, Version `r packageVersion('emmeans')`"
output: emmeans::.emm_vignette
vignette: >
  %\VignetteIndexEntry{Utilities and options}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, echo = FALSE, results = "hide", message = FALSE}
require("emmeans")
emm_options(opt.digits = TRUE)
knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro")
```

## Contents {#contents}
1. [Updating an `emmGrid` object](#update)
2. [Setting options](#options)
    a. [Setting and viewing defaults](#defaults)
    b. [Optimal digits to display](#digits)
    c. [Startup options](#startup)
3. [Combining and subsetting `emmGrid` objects](#rbind)
4. [Accessing results to use elsewhere](#data)
5. [Adding grouping factors](#groups)
6. [Re-labeling and re-leveling an `emmGrid`](#relevel)

[Index of all vignette topics](vignette-topics.html)

## Updating an `emmGrid` object {#update}
Several internal settings are saved when functions like `ref_grid()`, `emmeans()`, `contrast()`, etc. are run. Those settings can be manipulated via the `update()` method for `emmGrid`s. To illustrate, consider the `pigs` dataset and model yet again:
```{r}
pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs)
pigs.emm <- emmeans(pigs.lm, "source")
pigs.emm
```
We see confidence intervals but not tests, by default. This happens as a result of internal settings in `pigs.emm` that are passed to `summary()` when the object is displayed. If we are going to work with this object a lot, we might want to change its internal settings rather than having to rely on explicitly calling `summary()` with several arguments. If so, just update the internal settings to what is desired; for example:
```{r}
pigs.emm.s <- update(pigs.emm, infer = c(TRUE, TRUE), null = log(35),
                     calc = c(n = ".wgt."))
pigs.emm.s
```
Note that by adding `calc`, we have set a default to calculate and display the sample size when the object is summarized. See `help("update.emmGrid")` for details on the keywords that can be changed.
Mostly, they are the same as the names of arguments in the functions that construct these objects. Of course, we can always get what we want via calls to `test()`, `confint()` or `summary()` with appropriate arguments. But the `update()` function is more useful in sophisticated manipulations of objects, or called implicitly via the `...` or `options` argument in `emmeans()` and other functions. Those options are passed to `update()` just before the object is returned. For example, we could have done the above update within the `emmeans()` call as follows (results are not shown because they are the same as before): ```{r eval = FALSE} emmeans(pigs.lm, "source", infer = c(TRUE, TRUE), null = log(35), calc = c(n = ".wgt.")) ``` [Back to contents](#contents) ## Setting options {#options} Speaking of the `options` argument, note that the default in `emmeans()` is `options = get_emm_option("emmeans")`. Let's see what that is: ```{r} get_emm_option("emmeans") ``` So, by default, confidence intervals, but not tests, are displayed when the result is summarized. The reverse is true for results of `contrast()` (and also the default for `pairs()` which calls `contrast()`): ```{r} get_emm_option("contrast") ``` There are also defaults for a newly constructed reference grid: ```{r} get_emm_option("ref_grid") ``` The default is to display neither intervals nor tests when summarizing. In addition, the flag `is.new.rg` is set to `TRUE`, and that is why one sees a `str()` listing rather than a summary as the default when the object is simply shown by typing its name at the console. ### Setting and viewing defaults {#defaults} The user may have other preferences. She may want to see both intervals and tests whenever contrasts are produced; and perhaps she also wants to always default to the response scale when transformations or links are present. We can change the defaults by setting the corresponding options; and that is done via the `emm_options()` function: ```{r} emm_options(emmeans = list(type = "response"), contrast = list(infer = c(TRUE, TRUE))) ``` Now, new `emmeans()` results and contrasts follow the new defaults: ```{r} pigs.anal.p <- emmeans(pigs.lm, consec ~ percent) pigs.anal.p ``` Observe that the contrasts "inherited" the `type = "response"` default from the EMMs. NOTE: Setting the above options does *not* change how existing `emmGrid` objects are displayed; it only affects ones constructed in the future. There is one more option -- `summary` -- that overrides all other display defaults for both existing and future objects. For example, specifying `emm_options(summary = list(infer = c(TRUE, TRUE)))` will result in both intervals and tests being displayed, regardless of their internal defaults, unless `infer` is explicitly specified in a call to `summary()`. To temporarily revert to factory defaults in a single call to `emmeans()` or `contrast()` or `pairs()`, specify `options = NULL` in the call. To reset everything to factory defaults (which we do presently), null-out all of the **emmeans** package options: ```{r} options(emmeans = NULL) ``` ### Optimal digits to display {#digits} When an `emmGrid` object is summarized and displayed, the factory default is to display it with just enough digits as is justified by the standard errors or HPD intervals of the estimates displayed. You may use the `"opt.digits"` option to change this. If it is `TRUE` (the default), we display only enough digits as is justified (but at least 3). 
If it is set to `FALSE`, the number of digits is set using the R system's default, `getOption("digits")`; this is often much more precision than is justified. To illustrate, here is the summary of `pigs.emm` displayed without optimizing digits. Compare it with the first summary in this vignette.
```{r}
emm_options(opt.digits = FALSE)
pigs.emm
emm_options(opt.digits = TRUE)  # revert to optimal digits
```
By the way, setting this option does *not* round the calculated values computed by `summary.emmGrid()` or saved in a `summary_emm` object; it simply controls the precision displayed by `print.summary_emm()`.

### Startup options {#startup}
The options accessed by `emm_options()` and `get_emm_option()` are stored in a list named `emmeans` within R's options environment. Therefore, if you regularly desire options other than the defaults, this can be easily arranged by specifying them in your startup script for R. For example, if you want to default to Satterthwaite degrees of freedom for `lmer` models, and display confidence intervals rather than tests for contrasts, your `.Rprofile` file could contain the line
```{r eval = FALSE}
options(emmeans = list(lmer.df = "satterthwaite",
                       contrast = list(infer = c(TRUE, FALSE))))
```

[Back to contents](#contents)

## Combining and subsetting `emmGrid` objects {#rbind}
Two or more `emmGrid` objects may be combined using the `rbind()` or `+` methods. The most common reason (or perhaps the only good reason) to do this is to combine EMMs or contrasts into one family for purposes of applying a multiplicity adjustment to tests or intervals. A user may want to combine the three pairwise comparisons of sources with the three comparisons of consecutive percents above into a single family of six tests with a suitable multiplicity adjustment. This is done quite simply:
```{r}
rbind(pairs(pigs.emm.s), pigs.anal.p[[2]])
```
The default adjustment is `"bonferroni"`; we could have specified something different via the `adjust` argument. An equivalent way to combine `emmGrid`s is via the addition operator. Any options may be provided by `update()`. Below, we combine the same results into a family but ask for the "exact" multiplicity adjustment.
```{r}
update(pigs.anal.p[[2]] + pairs(pigs.emm.s), adjust = "mvt")
```
Also evident in comparing these results is that settings are obtained from the first object combined. So in the second output, where they are combined in reverse order, we get both confidence intervals and tests, and transformation to the response scale.

###### {#brackets}
To subset an `emmGrid` object, just use the subscripting operator `[]`. For instance,
```{r}
pigs.emm[2:3]
```

## Accessing results to use elsewhere {#data}
Sometimes, users want to use the results of an analysis (say, an `emmeans()` call) in other computations. The `summary()` method creates a `summary_emm` object that inherits from the `data.frame` class; so one may use the variables therein just as those in a data frame. Another way is to use the `as.data.frame()` method for `emmGrid` objects. This is provided to implement the standard way to coerce an object to a data frame. For illustration, let's compute the widths of the confidence intervals in our example.
```{r}
transform(pigs.emm, CI.width = upper.CL - lower.CL)
```
This implicitly converted `pigs.emm` to a data frame by passing it to the `as.data.frame()` method, then performed the required computation. But sometimes you have to explicitly call `as.data.frame()`.
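For instance (an added sketch of an explicit conversion):
```{r, eval = FALSE}
df <- as.data.frame(pigs.emm)
df$emmean    # the estimates, now just an ordinary numeric column
```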
[Note that the `opt.digits` option is ignored in these results, because what we have is a regular data frame, not the summary of an `emmGrid`.]

[Back to contents](#contents)

## Adding grouping factors {#groups}
Sometimes, users want to group levels of a factor into a smaller number of groups. Those groups may then be, say, averaged separately and compared, or used as a `by` factor. The `add_grouping()` function serves this purpose. The function takes four arguments: the object, the name of the grouping factor to be created, the name of the reference factor that is being grouped, and a vector of level names of the grouping factor corresponding to levels of the reference factor. Suppose for example that we want to distinguish animal and non-animal sources of protein in the `pigs` example:
```{r}
pigs.emm.ss <- add_grouping(pigs.emm.s, "type", "source",
                            c("animal", "vegetable", "animal"))
str(pigs.emm.ss)
```
Note that the new object has a nesting structure (see more about this in the ["messy-data" vignette](messy-data.html#nesting)), with the reference factor nested in the new grouping factor. Now we can obtain means and comparisons for each group:
```{r}
emmeans(pigs.emm.ss, pairwise ~ type)
```

[Back to contents](#contents)

## Re-labeling or re-leveling an `emmGrid` {#relevel}
Sometimes it is desirable to re-label the rows of an `emmGrid`, or cast it in terms of other factor(s). This can be done via the `levels` argument in `update()`. As an example, sometimes a fitted model has a treatment factor that comprises combinations of other factors. In subsequent analysis, we may well want to break it down into the individual factors' contributions. Consider, for example, the `warpbreaks` data provided with R. We will define a single factor and fit a non-homogeneous-variance model:
```{r, message = FALSE}
warp <- transform(warpbreaks, treat = interaction(wool, tension))
library(nlme)
warp.gls <- gls(breaks ~ treat, weights = varIdent(form = ~ 1|treat), data = warp)
( warp.emm <- emmeans(warp.gls, "treat") )
```
But now we want to re-cast this `emmGrid` into one that has separate factors for `wool` and `tension`. We can do this as follows:
```{r}
warp.fac <- update(warp.emm, levels = list(
                wool = c("A", "B"), tension = c("L", "M", "H")))
str(warp.fac)
```
So now we can do various contrasts involving the separate factors:
```{r}
contrast(warp.fac, "consec", by = "wool")
```
Note: When re-leveling to more than one factor, you have to be careful to anticipate that the levels will be expanded using `expand.grid()`: the first factor in the list varies the fastest and the last varies the slowest. That was the case in our example, but in others, it may not be. Had the levels of `treat` been ordered as `A.L, A.M, A.H, B.L, B.M, B.H`, then we would have had to specify the levels of `tension` first and the levels of `wool` second.

[Back to contents](#contents)

[Index of all vignette topics](vignette-topics.html)
emmeans/vignettes/basics.Rmd0000644000176200001440000007024314137062735015635 0ustar liggesusers---
title: "Basics of estimated marginal means"
author: "emmeans package, Version `r packageVersion('emmeans')`"
output: emmeans::.emm_vignette
vignette: >
  %\VignetteIndexEntry{Basics of EMMs}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, echo = FALSE, results = "hide", message = FALSE}
require("emmeans")
knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro")
```

## Contents {#contents}
1. [Motivating example](#motivation)
2. [EMMs defined](#EMMdef)
    a. [Reference grids](#ref_grid)
    b.
[Estimated marginal means](#emmeans)
    c. [Altering the reference grid](#altering)
    d. [Derived covariates](#depcovs)
    e. [Non-predictor variables](#params)
    f. [Graphical displays](#plots)
    g. [Formatting results](#formatting)
    h. [Weighting](#weights)
    i. [Multivariate models](#multiv)
3. [Objects, structures, and methods](#emmobj)
4. [P values, "significance", and recommendations](#pvalues)
5. [Summary](#summary)
6. [Further reading](#more)

[Index of all vignette topics](vignette-topics.html)

## Why we need EMMs {#motivation}
Consider the `pigs` dataset provided with the package (`help("pigs")` provides details). These data come from an unbalanced experiment where pigs are given different percentages of protein (`percent`) from different sources (`source`) in their diet, and later we measure the concentration (`conc`) of leucine. Here's an interaction plot showing the mean `conc` at each combination of the other factors.
```{r, echo = FALSE}
par(mar = .1 + c(4, 4, 1, 1))   # reduce head space
```
```{r}
with(pigs, interaction.plot(percent, source, conc))
```
This plot suggests that with each `source`, `conc` tends to go up with `percent`, but that the mean differs with each `source`. Now, suppose that we want to assess, numerically, the marginal results for `percent`. The natural thing to do is to obtain the marginal means:
```{r}
with(pigs, tapply(conc, percent, mean))
```
Looking at the plot, it seems a bit surprising that the last three means are all about the same, with the one for 15 percent being the largest. Hmmmm, so let's try another approach -- actually averaging together the values we see in the plot. First, we need the means that are shown there:
```{r}
cell.means <- matrix(with(pigs,
    tapply(conc, interaction(source, percent), mean)), nrow = 3)
cell.means
```
Confirm that the rows of this matrix match the plotted values for fish, soy, and skim, respectively. Now, average each column:
```{r}
apply(cell.means, 2, mean)
```
These results are decidedly different from the ordinary marginal means we obtained earlier. What's going on? The answer is that some observations were lost, making the data unbalanced:
```{r}
with(pigs, table(source, percent))
```
We can reproduce the marginal means by weighting the cell means with these frequencies. For example, in the last column:
```{r}
sum(c(3, 1, 1) * cell.means[, 4]) / 5
```
The big discrepancy between the ordinary mean for `percent = 18` and the marginal mean from `cell.means` is due to the fact that the lowest value receives 3 times the weight of the other two values.

### The point {#eqwts}
The point is that the marginal means of `cell.means` give *equal weight* to each cell. In many situations (especially with experimental data), that is a much fairer way to compute marginal means, in that they are not biased by imbalances in the data. We are, in a sense, estimating what the marginal means *would* be, had the experiment been balanced. Estimated marginal means (EMMs) serve that need.

All this said, there are certainly situations where equal weighting is *not* appropriate. Suppose, for example, we have data on sales of a product given different packaging and features. The data could be unbalanced because customers are more attracted to some combinations than others.
If our goal is to understand scientifically what packaging and features are inherently more profitable, then equally weighted EMMs may be appropriate; but if our goal is to predict or maximize profit, the ordinary marginal means provide better estimates of what we can expect in the marketplace.

[Back to Contents](#contents)

## What exactly are EMMs? {#EMMdef}
### Model and reference grid {#ref_grid}
Estimated marginal means are based on a *model* -- not directly on data. The basis for them is what we call the *reference grid* for a given model. To obtain the reference grid, consider all the predictors in the model. Here are the default rules for constructing the reference grid:

  * For each predictor that is a *factor*, use its levels (dropping unused ones)
  * For each numeric predictor (covariate), use its average.[^1]

The reference grid is then a regular grid of all combinations of these reference levels. As a simple example, consider again the `pigs` dataset (see `help("pigs")` for details). Examination of residual plots from preliminary models suggests that it is a good idea to work in terms of log concentration. If we treat the predictor `percent` as a factor, we might fit the following model:
```{r}
pigs.lm1 <- lm(log(conc) ~ source + factor(percent), data = pigs)
```
The reference grid for this model can be found via the `ref_grid` function:
```{r}
ref_grid(pigs.lm1)
```
(*Note:* Many of the calculations that follow are meant to illustrate what is inside this reference-grid object; you don't need to do such calculations yourself in routine analysis; just use the `emmeans()` (or possibly `ref_grid()`) function as we do later.)

In this model, both predictors are factors, and the reference grid consists of the $3\times4 = 12$ combinations of these factor levels. It can be seen explicitly by looking at the `grid` slot of this object:
```{r}
ref_grid(pigs.lm1) @ grid
```
Note that other information is retained in the reference grid, e.g., the transformation used on the response, and the cell counts as the `.wgt.` column. Now, suppose instead that we treat `percent` as a numeric predictor. This leads to a different model -- and a different reference grid.
```{r}
pigs.lm2 <- lm(log(conc) ~ source + percent, data = pigs)
ref_grid(pigs.lm2)
```
This reference grid has the levels of `source`, but only one `percent` value, its average. Thus, the grid has only three elements:
```{r}
ref_grid(pigs.lm2) @ grid
```
[^1]: In newer versions of **emmeans**, however, covariates having only two distinct values are by default treated as two-level factors, though there is an option to reduce them to their mean.

[Back to Contents](#contents)

### Estimated marginal means {#emmeans}
Once the reference grid is established, we can consider using the model to estimate the mean at each point in the reference grid. (Curiously, the convention is to call this "prediction" rather than "estimation".)
For `pigs.lm1`, we have
```{r}
pigs.pred1 <- matrix(predict(ref_grid(pigs.lm1)), nrow = 3)
pigs.pred1
```
Estimated marginal means (EMMs) are defined as equally weighted means of these predictions at specified margins:
```{r}
apply(pigs.pred1, 1, mean)   ### EMMs for source
apply(pigs.pred1, 2, mean)   ### EMMs for percent
```
For the other model, `pigs.lm2`, we have only one point in the reference grid for each `source` level; so the EMMs for `source` are just the predictions themselves:
```{r}
predict(ref_grid(pigs.lm2))
```
These are slightly different from the previous EMMs for `source`, emphasizing the fact that EMMs are model-dependent. In models with covariates, EMMs are often called *adjusted means*.

The `emmeans` function computes EMMs, accompanied by standard errors and confidence intervals. For example,
```{r}
emmeans(pigs.lm1, "percent")
```
In these examples, all the results are presented on the `log(conc)` scale (and the annotations in the output warn of this). It is possible to convert them back to the `conc` scale by back-transforming. This topic is discussed in [the vignette on transformations](transformations.html).

An additional note: There is an exception to the definition of EMMs given here. If the model has a nested structure in the fixed effects, then averaging is performed separately in each nesting group. See the [section on nesting in the "messy-data" vignette](messy-data.html#nesting) for an example.

[Back to Contents](#contents)

### Altering the reference grid {#altering}
It is possible to alter the reference grid. We might, for example, want to define a reference grid for `pigs.lm2` that is comparable to the one for `pigs.lm1`.
```{r}
ref_grid(pigs.lm2, cov.keep = "percent")
```
Using `cov.keep = "percent"` specifies that, instead of using the mean, the reference grid should use all the unique values of the covariate `percent`. Another option is to specify a `cov.reduce` function that is used in place of the mean; e.g.,
```{r}
ref_grid(pigs.lm2, cov.reduce = range)
```
Yet another option is to use the `at` argument. Consider this model for the built-in `mtcars` dataset:
```{r}
mtcars.lm <- lm(mpg ~ disp * cyl, data = mtcars)
ref_grid(mtcars.lm)
```
Since both predictors are numeric, the default reference grid has only one point. For purposes of describing the fitted model, you might want to obtain predictions at a grid of points, like this:
```{r}
mtcars.rg <- ref_grid(mtcars.lm, cov.keep = 3,
                      at = list(disp = c(100, 200, 300)))
mtcars.rg
```
This illustrates two things: a new use of `cov.keep` and the `at` argument. `cov.keep = 3` specifies that any covariate having 3 or fewer unique values is treated like a factor (the system default is `cov.keep = "2"`). The `at` specification gives three values of `disp`, overriding the default behavior to use the mean of `disp`. Another use of `at` is to focus on only some of the levels of a factor. Note that `at` does not need to specify every predictor; those not mentioned in `at` are handled by `cov.reduce`, `cov.keep`, or the default methods. Also, covariate values in `at` need not be values that actually occur in the data, whereas `cov.keep` will use only values that are achieved.

[Back to Contents](#contents)

### Derived covariates {#depcovs}
You need to be careful when one covariate depends on the value of another.
To illustrate in the `mtcars` example, suppose we want to use `cyl` as a factor and include a quadratic term for `disp`:
```{r}
mtcars.1 <- lm(mpg ~ factor(cyl) + disp + I(disp^2), data = mtcars)
emmeans(mtcars.1, "cyl")
```
Some users may not like function calls in the model formula, so they instead do something like this:
```{r}
mtcars <- transform(mtcars, Cyl = factor(cyl), dispsq = disp^2)
mtcars.2 <- lm(mpg ~ Cyl + disp + dispsq, data = mtcars)
emmeans(mtcars.2, "Cyl")
```
Wow! Those are really different results -- even though the models are equivalent. Why is this? To understand, look at the reference grids:
```{r}
ref_grid(mtcars.1)
ref_grid(mtcars.2)
```
For both models, the reference grid uses the `disp` mean of 230.72. But for `mtcars.2`, we also set `dispsq` to its mean of 68113. This is not right, because `dispsq` should be the square of `disp` (about 53232, not 68113) in order to be consistent. If we use that value of `dispsq`, we get the same results (modulo rounding error) as for `mtcars.1`:
```{r}
emmeans(mtcars.2, "Cyl", at = list(dispsq = 230.72^2))
```
In summary, for polynomial models and others where some covariates depend on others in nonlinear ways, include that dependence in the model formula (as in `mtcars.1`) using `I()` or `poly()` expressions, or alter the reference grid so that the dependency among covariates is correct.

### Non-predictor variables {#params}
Reference grids are derived using the variables in the right-hand side of the model formula. But sometimes, these variables are not actually predictors. For example:
```{r, eval = FALSE}
deg <- 2
mod <- lm(y ~ treat * poly(x, degree = deg), data = mydata)
```
If we call `ref_grid()` or `emmeans()` with this model, it will try to construct a grid of values of `treat`, `x`, and `deg` -- causing an error because `deg` is not a predictor in this model. To get things to work correctly, you need to name `deg` in a `params` argument, e.g.,
```{r, eval = FALSE}
emmeans(mod, ~ treat | x, at = list(x = 1:3), params = "deg")
```

[Back to Contents](#contents)

### Graphical displays {#plots}
The results of `ref_grid()` or `emmeans()` (these are objects of class `emmGrid`) may be plotted in two different ways. One is an interaction-style plot, using `emmip()`. In the following, let's use it to compare the predictions from `pigs.lm1` and `pigs.lm2`:
```{r}
emmip(pigs.lm1, source ~ percent)
emmip(ref_grid(pigs.lm2, cov.reduce = FALSE), source ~ percent)
```
Notice that `emmip()` may also be used on a fitted model. The formula specification needs the *x* variable on the right-hand side and the "trace" factor (what is used to define the different curves) on the left. This is a good time to yet again emphasize that EMMs are based on a *model*. Neither of these plots is an interaction plot of the *data*; they are interaction plots of model predictions; and since neither model includes an interaction, no interaction at all is evident in the plots.

###### {#plot.emmGrid}
The other graphics option offered is the `plot()` method for `emmGrid` objects. In the following, we display the estimates and 95% confidence intervals for `mtcars.rg` in separate panels for each `disp`.
```{r}
plot(mtcars.rg, by = "disp")
```
This plot illustrates, as much as anything else, how silly it is to try to predict mileage for a 4-cylinder car having high displacement, or an 8-cylinder car having low displacement. The widths of the intervals give us a clue that we are extrapolating.
A better idea is to acknowledge that displacement largely depends on the number of cylinders. So here is yet another way to use `cov.reduce` to modify the reference grid:
```{r}
mtcars.rg_d.c <- ref_grid(mtcars.lm, at = list(cyl = c(4,6,8)),
                          cov.reduce = disp ~ cyl)
mtcars.rg_d.c @ grid
```
The `ref_grid` call specifies that `disp` depends on `cyl`; so a linear model is fitted with the given formula and its fitted values are used as the `disp` values -- only one for each `cyl`. If we plot this grid, the results are sensible, reflecting what the model predicts for typical cars with each number of cylinders:
```{r fig.height = 1.5}
plot(mtcars.rg_d.c)
```

###### {#ggplot}
Wizards with the **ggplot2** package can further enhance these plots if they like. For example, we can add the data to an interaction plot -- this time we opt to include confidence intervals and put the three sources in separate panels:
```{r}
require("ggplot2")
emmip(pigs.lm1, ~ percent | source, CIs = TRUE) +
    geom_point(aes(x = percent, y = log(conc)), data = pigs, pch = 2, color = "blue")
```

### Formatting results {#formatting}
If you want to include `emmeans()` results in a report, you might want to have them in a nicer format than just the printed output. We provide a little bit of help for this, especially if you are using RMarkdown or Sweave to prepare the report. There is an `xtable` method for exporting these results, which we do not illustrate here, but it works similarly to `xtable()` in other contexts. Also, the `export` option of the `print()` method allows the user to save exactly what is seen in the printed output as text, to be saved or formatted as the user likes (see the documentation for `print.emmGrid` for details). Here is an example using one of the objects above:
```{r, eval = FALSE}
ci <- confint(mtcars.rg_d.c, level = 0.90, adjust = "scheffe")
xport <- print(ci, export = TRUE)
cat("<font color = 'blue'>\n")
knitr::kable(xport$summary, align = "r")
for (a in xport$annotations) cat(paste(a, "<br>"))
")) cat("
\n") ``` ```{r, results = "asis", echo = FALSE} ci <- confint(mtcars.rg_d.c, level = 0.90, adjust = "scheffe") xport <- print(ci, export = TRUE) cat("\n") knitr::kable(xport$summary, align = "r") for (a in xport$annotations) cat(paste(a, "
")) cat("
\n") ``` [Back to Contents](#contents) ### Using weights {#weights} It is possible to override the equal-weighting method for computing EMMs. Using `weights = "cells"` in the call will weight the predictions according to their cell frequencies (recall this information is retained in the reference grid). This produces results comparable to ordinary marginal means: ```{r} emmeans(pigs.lm1, "percent", weights = "cells") ``` Note that, as in the ordinary means in [the motivating example](#motivation), the highest estimate is for `percent = 15` rather than `percent = 18`. It is interesting to compare this with the results for a model that includes only `percent` as a predictor. ```{r} pigs.lm3 <- lm(log(conc) ~ factor(percent), data = pigs) emmeans(pigs.lm3, "percent") ``` The EMMs in these two tables are identical, but their standard errors are considerably different. That is because the model `pigs.lm1` accounts for variations due to `source`. The lesson here is that it is possible to obtain statistics comparable to ordinary marginal means, while still accounting for variations due to the factors that are being averaged over. [Back to Contents](#contents) ### Multivariate responses {#multiv} The **emmeans** package supports various multivariate models. When there is a multivariate response, the dimensions of that response are treated as if they were levels of a factor. For example, the `MOats` dataset provided in the package has predictors `Block` and `Variety`, and a four-dimensional response `yield` giving yields observed with varying amounts of nitrogen added to the soil. Here is a model and reference grid: ```{r} MOats.lm <- lm (yield ~ Block + Variety, data = MOats) ref_grid (MOats.lm, mult.name = "nitro") ``` So, `nitro` is regarded as a factor having 4 levels corresponding to the 4 dimensions of `yield`. We can subsequently obtain EMMs for any of the factors `Block`, `Variety`, `nitro`, or combinations thereof. The argument `mult.name = "nitro"` is optional; if it had been excluded, the multivariate levels would have been named `rep.meas`. [Back to Contents](#contents) ## Objects, structures, and methods {#emmobj} The `ref_grid()` and `emmeans()` functions are introduced previously. These functions, and a few related ones, return an object of class `emmGrid`: ```{r} pigs.rg <- ref_grid(pigs.lm1) class(pigs.rg) pigs.emm.s <- emmeans(pigs.rg, "source") class(pigs.emm.s) ``` If you simply show these objects, you get different-looking results: ```{r} pigs.rg pigs.emm.s ``` This is based on guessing what users most need to see when displaying the object. You can override these defaults; for example to just see a quick summary of what is there, do ```{r} str(pigs.emm.s) ``` The most important method for `emmGrid` objects is `summary()`. It is used as the print method for displaying an `emmeans()` result. For this reason, arguments for `summary()` may also be specified within most functions that produce `these kinds of results.`emmGrid` objects. For example: ```{r} # equivalent to summary(emmeans(pigs.lm1, "percent"), level = 0.90, infer = TRUE)) emmeans(pigs.lm1, "percent", level = 0.90, infer = TRUE) ``` This `summary()` method for `emmGrid` objects) actually produces a `data.frame`, but with extra bells and whistles: ```{r} class(summary(pigs.emm.s)) ``` This can be useful to know because if you want to actually *use* `emmeans()` results in other computations, you should save its summary, and then you can access those results just like you would access data in a data frame. 
The `emmGrid` object itself is not so accessible. There is a `print.summary_emm()` function that is what actually produces the output you see above -- a data frame with extra annotations.

[Back to Contents](#contents)

## P values, "significance", and recommendations {#pvalues}
There is some debate among statisticians and researchers about the appropriateness of *P* values, and the term "statistical significance" can be misleading. If you have a small *P* value, it *only* means that the effect being tested is unlikely to be explained by chance variation alone, in the context of the current study and the current statistical model underlying the test. If you have a large *P* value, it *only* means that the observed effect could plausibly be due to chance alone: it is *wrong* to conclude that there is no effect.

The American Statistical Association has for some time been advocating very cautious use of *P* values (Wasserstein and Lazar 2016) because they are too often misinterpreted, and too often used carelessly. Wasserstein *et al.* (2019) even goes so far as to advise against *ever* using the term "statistically significant". The 43 articles it accompanies in the same issue of *TAS* recommend a number of alternatives. I do not agree with all that is said in the main article, and there are portions that are too cutesy or wander off-topic. Further, it is quite dizzying to try to digest all the accompanying articles, and to reconcile their disagreeing viewpoints.

For some time I included a summary of Wasserstein *et al.*'s recommendations and their *ATOM* paradigm (Acceptance of uncertainty, Thoughtfulness, Openness, Modesty). But in the meantime, I have handled a large number of user questions, and many of those have made it clear to me that there are more important fish to fry in a vignette section like this. It is just a fact that *P* values are used, and are useful. So I have my own set of recommendations regarding them.

#### A set of comparisons or well-chosen contrasts is more useful and interpretable than an omnibus *F* test {#recs1}
*F* tests are useful for model selection, but don't tell you anything specific about the nature of an effect. If *F* has a small *P* value, it suggests that there is some effect, somewhere. It doesn't even necessarily imply that any two means differ statistically.

#### Use *adjusted* *P* values
When you run a bunch of tests, there is a risk of making too many type-I errors, and adjusted *P* values (e.g., the Tukey adjustment for pairwise comparisons) keep you from making too many mistakes. That said, it is possible to go overboard; and it's usually reasonable to regard each "by" group as a separate family of tests for purposes of adjustment.

#### It is *not* necessary to have a significant *F* test as a prerequisite to doing comparisons or contrasts {#recs2}
... as long as an appropriate adjustment is used. There do exist rules such as the "protected LSD" by which one is given license to do unadjusted comparisons provided the $F$ statistic is "significant." However, this is a very weak form of protection for which the justification is, basically, "if $F$ is significant, you can say absolutely anything you want."

#### Get the model right first
Everything the **emmeans** package does is an interpretation of the model that you fitted to the data. If the model is bad, you will get bad results from `emmeans()` and other functions.
Every single limitation of your model, be it presuming constant error variance, omitting interaction terms, etc., becomes a limitation of the results `emmeans()` produces. So do a responsible job of fitting the model. And if you don't know what's meant by that...

#### Consider seeking the advice of a statistical consultant {#recs3}
Statistics is hard. It is a lot more than just running programs and copying output. It is *your* research; isn't it important that it be done right? Many academic statistics and biostatistics departments can refer you to someone who can help.

[Back to Contents](#contents)

## Summary of main points {#summary}
  * EMMs are derived from a *model*. A different model for the same data may lead to different EMMs.
  * EMMs are based on a *reference grid* consisting of all combinations of factor levels, with each covariate set to its average (by default).
  * For purposes of defining the reference grid, dimensions of a multivariate response are treated as levels of a factor.
  * EMMs are then predictions on this reference grid, or marginal averages thereof (equally weighted by default).
  * Reference grids may be modified using `at` or `cov.reduce`; the latter may be logical, a function, or a formula.
  * Reference grids and `emmeans()` results may be plotted via `plot()` (for parallel confidence intervals) or `emmip()` (for an interaction-style plot).
  * Be cautious with the terms "significant" and "nonsignificant", and don't ever interpret a "nonsignificant" result as saying that there is no effect.
  * Follow good practices such as getting the model right first, and using adjusted *P* values for appropriately chosen families of comparisons or contrasts.

[Back to Contents](#contents)

### References
Wasserstein RL, Lazar NA (2016) "The ASA's Statement on *p*-Values: Context, Process, and Purpose," *The American Statistician*, **70**, 129--133, https://doi.org/10.1080/00031305.2016.1154108

Wasserstein RL, Schirm AL, Lazar NA (2019) "Moving to a World Beyond 'p < 0.05'," *The American Statistician*, **73**, 1--19, https://doi.org/10.1080/00031305.2019.1583913

## Further reading {#more}
The reader is referred to other vignettes for more details and advanced use.
The strings linked below are the names of the vignettes; i.e., they can also be accessed via `vignette("`*name*`", "emmeans")` * Models that are supported in **emmeans** (there are lots of them) ["models"](models.html) * Confidence intervals and tests: ["confidence-intervals"](confidence-intervals.html) * Often, users want to compare or contrast EMMs: ["comparisons"](comparisons.html) * Working with response transformations and link functions: ["transformations"](transformations.html) * Multi-factor models with interactions: ["interactions"](interactions.html) * Working with messy data and nested effects: ["messy-data"](messy-data.html) * Making predictions from your model: ["predictions"](predictions.html) * Examples of more sophisticated models (e.g., mixed, ordinal, MCMC) ["sophisticated"](sophisticated.html) * Utilities for working with `emmGrid` objects: ["utilities"](utilities.html) * Frequently asked questions: ["FAQs"](FAQs.html) * Adding **emmeans** support to your package: ["xtending"](xtending.html) [Back to Contents](#contents) [Index of all vignette topics](vignette-topics.html)emmeans/vignettes/confidence-intervals.Rmd0000644000176200001440000003067114137062735020474 0ustar liggesusers--- title: "Confidence intervals and tests in emmeans" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{Confidence intervals and tests} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, results = "hide", message = FALSE} require("emmeans") knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") ``` ## Contents {#contents} This vignette describes various ways of summarizing `emmGrid` objects. 1. [`summary()`, `confint()`, and `test()`](#summary) 2. [Back-transforming to response scale](#tran) (See also the ["transformations" vignette](transformations.html)) 3. [Multiplicity adjustments](#adjust) 4. [Using "by" variables](#byvars) 5. [Joint (omnibus) tests](#joint) 6. [Testing equivalence, noninferiority, nonsuperiority](#equiv) 7. Graphics (in ["basics" vignette](basics.html#plots)) [Index of all vignette topics](vignette-topics.html) ## `summary()`, `confint()`, and `test()` {#summary} The most important method for `emmGrid` objects is `summary()`. For one thing, it is called by default when you display an `emmeans()` result. The `summary()` function has a lot of options, and the detailed documentation via `help("summary.emmGrid")` is worth a look. For ongoing illustrations, let's re-create some of the objects in the ["basics" vignette](basics.html) for the `pigs` example: ```{r} pigs.lm1 <- lm(log(conc) ~ source + factor(percent), data = pigs) pigs.rg <- ref_grid(pigs.lm1) pigs.emm.s <- emmeans(pigs.rg, "source") ``` Just `summary()` by itself will produce a summary that varies somewhat according to context. It does this by setting different defaults for the `infer` argument, which consists of two logical values, specifying confidence intervals and tests, respectively. [The exception is models fitted using MCMC methods, where `summary()` is diverted to the `hpd.summary()` function, a preferable summary for many Bayesians.] The summary of a newly made reference grid will show just estimates and standard errors, but not confidence intervals or tests (that is, `infer = c(FALSE, FALSE)`). 
The summary of an `emmeans()` result, as we see above, will have intervals, but no tests (i.e., `infer = c(TRUE, FALSE)`); and the result of a `contrast()` call (see [comparisons and contrasts](comparisons.html)) will show test statistics and *P* values, but not intervals (i.e., `infer = c(FALSE, TRUE)`).

There are courtesy methods `confint()` and `test()` that just call `summary()` with the appropriate `infer` setting; for example,
```{r}
test(pigs.emm.s)
```
It is not particularly useful, though, to test these EMMs against the default of zero -- which is why tests are not usually shown. It makes a lot more sense to test them against some target concentration, say 40. And suppose we want to do a one-sided test to see if the concentration is greater than 40. Remembering that the response is log-transformed in this model,
```{r}
test(pigs.emm.s, null = log(40), side = ">")
```

It is also possible to add calculated columns to the summary, via the `calc` argument. The calculations can include any columns up through `df` in the summary, as well as any variable in the object's `grid` slot. Among the latter are usually weights in a column named `.wgt.`, and we can use that to include sample size in the summary:
```{r}
confint(pigs.emm.s, calc = c(n = ~.wgt.))
```

[Back to Contents](#contents)

## Back-transforming {#tran}
Transformations and link functions are supported in several ways in **emmeans**, making this a complex topic worthy of [its own vignette](transformations.html). Here, we show just the most basic approach. Namely, specifying the argument `type = "response"` will cause the displayed results to be back-transformed to the response scale, when a transformation or link function is incorporated in the model.

For example, let's try the preceding `test()` call again:
```{r}
test(pigs.emm.s, null = log(40), side = ">", type = "response")
```
Note what changes and what doesn't change. In the `test()` call, we *still* use the log of 40 as the null value; `null` must always be specified on the linear-prediction scale, in this case the log. In the output, the displayed estimates, as well as the `null` value, are shown back-transformed. As well, the standard errors are altered (using the delta method). However, the *t* ratios and *P* values are identical to the preceding results. That is, the tests themselves are still conducted on the linear-predictor scale (as is noted in the output).

Similar statements apply to confidence intervals on the response scale:
```{r}
confint(pigs.emm.s, side = ">", level = .90, type = "response")
```
With `side = ">"`, a *lower* confidence limit is computed on the log scale, then that limit is back-transformed to the response scale. (We have also illustrated how to change the confidence level.)

[Back to Contents](#contents)

## Multiplicity adjustments {#adjust}
Both tests and confidence intervals may be adjusted for simultaneous inference. Such adjustments ensure that the confidence coefficient for a whole set of intervals is at least the specified level, or control the familywise error rate for a whole family of tests. This is done via the `adjust` argument. For `ref_grid()` and `emmeans()` results, the default is `adjust = "none"`. For `contrast()` results, the default for `adjust` is often something else, depending on what type of contrasts are created. For example, pairwise comparisons default to `adjust = "tukey"`, i.e., the Tukey HSD method. The `summary()` function sometimes *changes* `adjust` if it is inappropriate.
For example, with
```{r}
confint(pigs.emm.s, adjust = "tukey")
```
the adjustment is changed to the Sidak method because the Tukey adjustment is inappropriate unless you are doing pairwise comparisons.

####### {#adjmore}
An adjustment method that is usually appropriate is Bonferroni; however, it can be quite conservative. Using `adjust = "mvt"` is the closest to being the "exact" all-around "single-step" method, as it uses the multivariate *t* distribution (and the **mvtnorm** package) with the same covariance structure as the estimates to determine the adjustment. However, this comes at high computational expense as the computations are done using simulation techniques. For a large set of tests (and especially confidence intervals), the computational lag becomes noticeable if not intolerable.

For tests, `adjust` increases the *P* values over those otherwise obtained with `adjust = "none"`. Compare the following adjusted tests with the unadjusted ones previously computed.
```{r}
test(pigs.emm.s, null = log(40), side = ">", adjust = "bonferroni")
```

[Back to Contents](#contents)

## "By" variables {#byvars}
Sometimes you want to break a summary down into smaller pieces; for this purpose, the `by` argument in `summary()` is useful. For example,
```{r}
confint(pigs.rg, by = "source")
```
If there is also an `adjust` in force when `by` variables are used, the adjustment is made *separately* on each `by` group; e.g., in the above, we would be adjusting for sets of 4 intervals, not all 12 together.

There can be a `by` specification in `emmeans()` (or equivalently, a `|` in the formula); and if so, it is passed on to `summary()` and used unless overridden by another `by`. Here are examples, not run:
```{r eval = FALSE}
emmeans(pigs.lm1, ~ percent | source)     ### same results as above
summary(.Last.value, by = "percent")      ### grouped the other way
```
Specifying `by = NULL` will remove all grouping.

### Simple comparisons {#simple}
There is also a `simple` argument for `contrast()` that is in essence the inverse of `by`; the contrasts are run using everything *except* the specified variables as `by` variables. To illustrate, let's consider the model for `pigs` that includes the interaction (so that the levels of one factor compare differently at levels of the other factor).
```{r}
pigsint.lm <- lm(log(conc) ~ source * factor(percent), data = pigs)
pigsint.rg <- ref_grid(pigsint.lm)
contrast(pigsint.rg, "consec", simple = "percent")
```
In fact, we may do *all* one-factor comparisons by specifying `simple = "each"`. This typically produces a lot of output, so use it with care.

[Back to Contents](#contents)

## Joint tests {#joint}
From the above, we already know how to test individual results. For pairwise comparisons (details in [the "comparisons" vignette](comparisons.html)), we might do
```{r}
pigs.prs.s <- pairs(pigs.emm.s)
pigs.prs.s
```
But suppose we want an *omnibus* test that all these comparisons are zero. Easy enough, using the `joint` argument in `test` (note: the `joint` argument is *not* available in `summary()`; only in `test()`):
```{r}
test(pigs.prs.s, joint = TRUE)
```
Notice that there are three comparisons, but only 2 d.f. for the test, as cautioned in the message.

The test produced with `joint = TRUE` is a "type III" test (assuming the default equal weights are used to obtain the EMMs). See more on these types of tests for higher-order effects in the ["interactions" vignette section on contrasts](interactions.html#contrasts).
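By way of illustration, here is a rough sketch of how such a joint test can be computed by hand from the estimates and their covariance matrix. We add this purely for insight, not as something you need in practice; we assume the **MASS** package is available for its generalized-inverse function, and using the rank from `qr()` for the degrees of freedom is our simplification rather than the package's exact internal algorithm.
```{r eval = FALSE}
est <- predict(pigs.prs.s)    # the three pairwise differences (log scale)
V <- vcov(pigs.prs.s)         # their covariance matrix -- singular, of rank 2
r <- qr(V)$rank               # effective numerator d.f.
F.stat <- sum(est * (MASS::ginv(V) %*% est)) / r   # Wald F ratio
c(df1 = r, F.ratio = F.stat)
```
The result should essentially reproduce the `F.ratio` shown by `test(pigs.prs.s, joint = TRUE)` above.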
####### {#joint_tests} For convenience, there is also a `joint_tests()` function that performs joint tests of contrasts among each term in a model or `emmGrid` object. ```{r} joint_tests(pigsint.rg) ``` The tests of main effects are of families of contrasts; those for interaction effects are for interaction contrasts. These results are essentially the same as a "Type-III ANOVA", but may differ in situations where there are empty cells or other non-estimability issues, or if generalizations are present such as unequal weighting. (Another distinction is that sums of squares and mean squares are not shown; that is because these really are tests of contrasts among predictions, and they may or may not correspond to model sums of squares.) One may use `by` variables with `joint_tests`. For example: ```{r} joint_tests(pigsint.rg, by = "source") ``` In some models, it is possible to specify `submodel = "type2"`, thereby obtaining something akin to a Type II analysis of variance. See the [messy-data vignette](messy-data.html#type2submodel) for an example. [Back to Contents](#contents) ## Testing equivalence, noninferiority, and nonsuperiority {#equiv} The `delta` argument in `summary()` or `test()` allows the user to specify a threshold value to use in a test of equivalence, noninferiority, or nonsuperiority. An equivalence test is kind of a backwards significance test, where small *P* values are associated with small differences relative to a specified threshold value `delta`. The help page for `summary.emmGrid` gives the details of these tests. Suppose in the present example, we consider two sources to be equivalent if they are within 25% of each other. We can test this as follows: ```{r} test(pigs.prs.s, delta = log(1.25), adjust = "none") ``` By our 25% standard, the *P* value is quite small for comparing soy and skim, providing statistical evidence that their difference is enough smaller than the threshold to consider them equivalent. [Back to Contents](#contents) ## Graphics {#graphics} Graphical displays of `emmGrid` objects are described in the ["basics" vignette](basics.html#plots) [Index of all vignette topics](vignette-topics.html) emmeans/vignettes/models.Rmd0000644000176200001440000007104714147507455015663 0ustar liggesusers--- title: "Models supported by emmeans" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{Models supported by emmeans} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- Here we document what model objects may be used with **emmeans**, and some special features of some of them that may be accessed by passing additional arguments through `ref_grid` or `emmeans()`. Certain objects are affected by optional arguments to functions that construct `emmGrid` objects, including `ref_grid()`, `emmeans()`, `emtrends()`, and `emmip()`. When "*arguments*" are mentioned in the subsequent quick reference and object-by-object documentation, we are talking about arguments in these constructors. If a model type is not included here, users may be able to obtain usable results via the `qdrg()` function; see its help page. Package developers may support their models by writing appropriate `recover_data` and `emm_basis` methods. See the package documentation for `extending-emmeans` and `vignette("xtending")` for details. 
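As a hedged sketch of what a `qdrg()` call can look like: everything about `myfit` below is hypothetical -- it stands in for a fitted object of some unsupported class from which coefficients and a covariance matrix can be extracted, and the residual d.f. of 23 is just an assumed value.

```r
## 'myfit' is a hypothetical unsupported fit of log(conc) on the pigs data
rg <- qdrg(formula = ~ source + factor(percent), data = pigs,
           coef = coef(myfit), vcov = vcov(myfit), df = 23)
emmeans(rg, "source")   # then proceed as with any reference grid
```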
[Index of all vignette topics](vignette-topics.html) ## Quick reference for supported objects and options {#quickref} Here is an alphabetical list of model classes that are supported, and the arguments that apply. Detailed documentation follows, with objects grouped by the code in the "Group" column. Scroll down or follow the links to those groups for more information. |Object.class |Package |Group |Arguments / notes | |:------------|:--------|:-------:|:------------------------------------------------------------| |aov |stats |[A](#A) | | |aovList |stats |[V](#V) |Best with balanced designs, orthogonal coding | |averaging |MuMIn |[I](#I) | | |betareg |betareg |[B](#B) |`mode = c("link", "precision", "phi.link",` | | | | |` "variance", "quantile")` | |brmsfit |brms |[P](#P) |Supported in **brms** package | |carbayes |CARBayes |[S](#S) |`data` is required | |clm |ordinal |[O](#O) |`mode = c("latent", "linear.predictor", "cum.prob",` | | | | |` "exc.prob", "prob", "mean.class", "scale")` | |clmm |ordinal |[O](#O) |Like `clm` but no `"scale"` mode | |coxme |coxme |[G](#G) | | |coxph |survival |[G](#G) | | |gam |mgcv |[G](#G) |`freq = FALSE`, `unconditional = FALSE`, | | | | |`what = c("location", "scale", "shape", "rate", "prob.gt.0")`| |gamm |mgcv |[G](#G) |`call = object$gam$call` | |Gam |gam |[G](#G) |`nboot = 800` | |gamlss |gamlss |[H](#H) |`what = c("mu", "sigma", "nu", "tau")` | |gee |gee |[E](#E) |`vcov.method = c("naive", "robust")` | |geeglm |geepack |[E](#E) |`vcov.method = c("vbeta", "vbeta.naiv", "vbeta.j1s",` | | | | |`"vbeta.fij", "robust", "naive")` or a matrix | |geese |geepack |[E](#E) |Like `geeglm` | |glm |stats |[G](#G) | | |glm.nb |MASS |[G](#G) |Requires `data` argument | |glmerMod |lme4 |[G](#G) | | |glmmadmb |glmmADMB | |No longer supported | |glmmPQL |MASS |[G](#G) |inherits `lm` support | |glmmTMB |glmmTMB |[P](#P) |Supported in **glmmTMB** package (dev. version only?) | |gls |nlme |[K](#K) |`mode = c("auto", "df.error", "satterthwaite", "asymptotic")`| |gnls |nlme |[A](#A) |Supports `params` part. Requires `param = ""` | |hurdle |pscl |[C](#C) |`mode = c("response", "count", "zero", "prob0"),` | | | | |`lin.pred = c(FALSE, TRUE)` | |lm |stats |[A](#A) |Several other classes inherit from this and may be supported | |lme |nlme |[K](#K) |`sigmaAdjust = c(TRUE, FALSE),` | | | | |`mode = c("auto", containment", "satterthwaite", "asymptotic"),`| | | | |`extra.iter = 0` | |lmerMod |lme4 |[L](#L) |`lmer.df = c("kenward-roger", "satterthwaite", "asymptotic")`, | | | | |`pbkrtest.limit = 3000`, `disable.pbkrtest = FALSE`. 
| | | |`emm_options(lmer.df =, pbkrtest.limit =, disable.pbkrtest =)` |
|lqm,lqmm |lqmm |[Q](#Q) |`tau = "0.5"` (must match an entry in `object$tau`) |
| | | |Optional: `method`, `R`, `seed`, `startQR` (must be fully spelled-out) |
|manova |stats |[M](#M) |`mult.name`, `mult.levs` |
|maov |stats |[M](#M) |`mult.name`, `mult.levs` |
|mblogit |mclogit |[P](#P) |Supported in **mclogit** (overrides previous minimal support here) |
|mcmc |mcmc |[S](#S) |May require `formula`, `data` |
|MCMCglmm |MCMCglmm |[S](#S) |(see also [M](#M)) `mult.name`, `mult.levs`, `trait`, |
| | | |`mode = c("default", "multinomial")`; `data` is required |
|mira |mice |[I](#I) |Optional arguments per class of `$analyses` elements |
|mixed |afex |[P](#P) |Supported in **afex** package |
|mlm |stats |[M](#M) |`mult.name`, `mult.levs` |
|mmer |sommer |[G](#G) | |
|multinom |nnet |[N](#N) |`mode = c("prob", "latent")` |
| | | |Always include response in specs for `emmeans()` |
|nauf |nauf.*xxx* |[P](#P) |Supported in **nauf** package |
|nlme |nlme |[A](#A) |Supports fixed part. Requires `param = ""` |
|polr |MASS |[O](#O) |`mode = c("latent", "linear.predictor", "cum.prob",` |
| | | |`"exc.prob", "prob", "mean.class")` |
|rlm |MASS |[A](#A) |inherits `lm` support |
|rms |rms |[O](#O) |`mode = c("middle", "latent", "linear.predictor",` |
| | | |`"cum.prob", "exc.prob", "prob", "mean.class")` |
|rq,rqs |quantreg |[Q](#Q) |`tau = "0.5"` (must match an entry in `object$tau`) |
| | | |Optional: `se`, `R`, `bsmethod`, etc. |
|rlmerMod |robustlmm|[P](#P) |Supported in **robustlmm** package |
|rsm |rsm |[P](#P) |Supported in **rsm** package |
|stanreg |rstanarm |[S](#S) |Args for `stanreg_`*xxx* similar to those for *xxx* |
|survreg |survival |[A](#A) | |
|svyglm |survey |[A](#A) | |
|zeroinfl |pscl |[C](#C) |`mode = c("response", "count", "zero", "prob0")`, |
| | | |`lin.pred = c(FALSE, TRUE)` |

## Group A -- "Standard" or minimally supported models {#A}
Models in this group, such as `lm`, do not have unusual features that need special support; hence no extra arguments are needed. Some may require `data` in the call.

## B -- Beta regression {#B}
The additional `mode` argument for `betareg` objects has possible values of `"response"`, `"link"`, `"precision"`, `"phi.link"`, `"variance"`, and `"quantile"`, which have the same meaning as the `type` argument in `predict.betareg` -- with the addition that `"phi.link"` is like `"link"`, but for the precision portion of the model. When `mode = "quantile"` is specified, the additional argument `quantile` (a numeric scalar or vector) specifies which quantile(s) to compute; the default is 0.5 (the median). Also in `"quantile"` mode, an additional variable `quantile` is added to the reference grid, and its levels are the values supplied.

[Back to quick reference](#quickref)

## Group C -- Count models {#C}
Two optional arguments -- `mode` and `lin.pred` -- are provided. The `mode` argument has possible values `"response"` (the default), `"count"`, `"zero"`, or `"prob0"`. `lin.pred` is logical and defaults to `FALSE`.

With `lin.pred = FALSE`, the results are comparable to those returned by `predict(..., type = "response")`, `predict(..., type = "count")`, `predict(..., type = "zero")`, or `predict(..., type = "prob")[, 1]`. See the documentation for `predict.hurdle` and `predict.zeroinfl`.

The option `lin.pred = TRUE` only applies to `mode = "count"` and `mode = "zero"`.
The results returned are on the linear-predictor scale, with the same transformation as the link function in that part of the model. The predictions for a reference grid with `mode = "count"`, `lin.pred = TRUE`, and `type = "response"` will be the same as those obtained with `lin.pred = FALSE` and `mode = "count"`; however, any EMMs derived from these grids will be different, because the averaging is done on the log-count scale and the actual count scale, respectively -- thereby producing geometric means versus arithmetic means of the predictions.

If the `vcov.` argument is used (see details in the documentation for `ref_grid`), it must yield a matrix of the same size as would be obtained using `vcov.hurdle` or `vcov.zeroinfl` with its `model` argument set to `("full", "count", "zero")` in respective correspondence with `mode` of `("response", "count", "zero")`. If `vcov.` is a function, it must support the `model` argument.

[Back to quick reference](#quickref)

## Group E -- GEE models {#E}
These models all have more than one covariance estimate available, and it may be selected by supplying a string as the `vcov.method` argument. It is partially matched with the available choices shown in the quick reference. In `geese` and `geeglm`, the aliases `"robust"` (for `"vbeta"`) and `"naive"` (for `"vbeta.naiv"`) are also accepted.

If a matrix or function is supplied as `vcov.method`, it is interpreted as a `vcov.` specification as described for `...` in the documentation for `ref_grid`.

## Group G -- Generalized linear models and relatives {#G}
Most models in this group receive only standard support as in [Group A](#A), but typically the tests and confidence intervals are asymptotic. Thus the `df` column for tabular results will be `Inf`. Some objects in this group *require* that the original or reference dataset be provided when calling `ref_grid()` or `emmeans()`.

In the case of `mgcv::gam` objects, there are optional `freq` and `unconditional` arguments as is detailed in the documentation for `mgcv::vcov.gam()`. Both default to `FALSE`. The value of `unconditional` matters only if `freq = FALSE` and `object$Vc` is non-null. For `mgcv::gamm` objects, `emmeans()` results are based on the `object$gam` part. Unfortunately, that is missing its `call` component, so the user must supply it in the `call` argument (e.g., `call = quote(gamm(y ~ s(x), data = dat))`) or give the dataset in the `data` argument. Alternatively (and recommended), you may first set `object$gam$call` to the quoted call ahead of time. The `what` arguments are used to select which model formula to use: `"location", "scale"` apply to `gaulss` and `gevlss` families, `"shape"` applies only to `gevlss`, and `"rate", "prob.gt.0"` apply to `ziplss`.

With `gam::Gam` objects, standard errors are estimated using a bootstrap method when there are any smoothers involved. Accordingly, there is an optional `nboot` argument that sets the number of bootstrap replications used to estimate the variances and covariances of the smoothing portions of the model. Generally, it is better to use models fitted via `mgcv::gam()` rather than `gam::gam()`.

[Back to quick reference](#quickref)

## Group H -- `gamlss` models {#H}
The `what` argument has possible values of `"mu"` (default), `"sigma"`, `"nu"`, or `"tau"` depending on which part of the model you want results for. Currently, there is no support when the selected part of the model contains a smoothing method like `pb()`.
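As a hedged illustration of the `what` argument (the simulated data and all object names here are our own inventions, and we assume the **gamlss** package is installed):

```r
library(gamlss)
set.seed(1)
dat <- data.frame(treat = factor(rep(c("A", "B"), each = 30)),
                  y = rgamma(60, shape = 4, rate = rep(c(1, 2), each = 30)))
# Model both the mean (mu) and the scale (sigma); no smoothers anywhere
fit <- gamlss(y ~ treat, sigma.formula = ~ treat, family = GA, data = dat)
emmeans(fit, "treat", what = "sigma")   # EMMs for the sigma part of the model
```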
## Group I -- Multiple models (via imputation or averaging) {#I}
These objects are the results of fitting several models with different predictor subsets or imputed values. The `bhat` and `V` slots are obtained via averaging and, in the case of multiple imputation, adding a multiple of the between-imputation covariance per Rubin's rules.

Support for `MuMIn::averaging` objects may be somewhat dodgy, as it is not clear that all supported model classes will work. The object *must* have a `"modelList"` attribute (obtained by constructing the object explicitly from a model list or by including `fit = TRUE` in the call). And each model should be fitted with `data` as a **named** argument in the call; or else provide a `data` argument in the call to `emmeans()` or `ref_grid()`. No estimability checking is done at present: if/when it is added, a linear function will be estimable only if it is estimable in *all* models included in the averaging.

## Group K -- `gls` and `lme` models {#K}
The `sigmaAdjust` argument is a logical value that defaults to `TRUE`. It is comparable to the `adjustSigma` option in `nlme::summary.lme` (the name-mangling is to avoid conflicts with the often-used `adjust` argument), and determines whether or not a degrees-of-freedom adjustment is performed with models fitted using the ML method.

The optional `mode` argument affects the degrees of freedom. The `mode = "satterthwaite"` option determines degrees of freedom via the Satterthwaite method: If `s^2` is the estimate of some variance, then its Satterthwaite d.f. is `2*s^4 / Var(s^2)`. In case our numerical methods for this fail, we also offer `mode = "appx-satterthwaite"` as a backup, by which quantities related to `Var(s^2)` are obtained by randomly perturbing the response values. Currently, only `"appx-satterthwaite"` is available for `lme` objects, and it is used if `"satterthwaite"` is requested. Because `appx-satterthwaite` is simulation-based, results may vary if the same analysis is repeated. An `extra.iter` argument may be added to request additional simulation runs (at [possibly considerable] cost of repeating the model-fitting that many more times). (Note: Previously, `"appx-satterthwaite"` was termed `"boot-satterthwaite"`; this is still supported for backward compatibility. The "boot" label was abandoned because this is really an approximation method, not a bootstrap method in the usual resampling sense.)

Alternative methods are `"df.error"` (for `gls`) and `"containment"` (for `lme`). `df.error` is just the error degrees of freedom for the model, minus the number of extra random effects estimated; it generally over-estimates the degrees of freedom. The `"asymptotic"` mode simply sets the degrees of freedom to infinity. `"containment"` mode (for `lme` models) determines the degrees of freedom for the coarsest grouping involved in the contrast or linear function involved, so it tends to under-estimate the degrees of freedom. The default is `mode = "auto"`, which uses Satterthwaite if there are estimated random effects and the non-Satterthwaite option otherwise. The `extra.iter` argument is ignored unless the d.f. method is (or defaults to) `appx-satterthwaite`.

[Back to quick reference](#quickref)

## Group L -- `lmerMod` models {#L}
There is an optional `lmer.df` argument that defaults to `get_emm_option("lmer.df")` (which in turn defaults to `"kenward-roger"`). The possible values are `"kenward-roger"`, `"satterthwaite"`, and `"asymptotic"` (these are partially matched and case-insensitive).
With `"kenward-roger"`, d.f. are obtained using code from the **pbkrtest** package, if installed. With `"satterthwaite"`, d.f. are obtained using code from the **lmerTest** package, if installed. With `"asymptotic"`, or if the needed package is not installed, d.f. are set to `Inf`. (For backward compatibility, the user may specify `mode` in lieu of `lmer.df`.)

A by-product of the Kenward-Roger method is that the covariance matrix is adjusted using `pbkrtest::vcovAdj()`. This can require considerable computation; so to avoid that overhead, the user should opt for the Satterthwaite or asymptotic method; or, for backward compatibility, may disable the use of **pbkrtest** via `emm_options(disable.pbkrtest = TRUE)` (this does not disable the **pbkrtest** package entirely, just its use in **emmeans**). The computation time required depends roughly on the number of observations, *N*, in the design matrix (because a major part of the computation involves inverting an *N* x *N* matrix). Thus, **pbkrtest** is automatically disabled if *N* exceeds the value of `get_emm_option("pbkrtest.limit")`, for which the factory default is 3000. (The user may also specify `pbkrtest.limit` or `disable.pbkrtest` as an argument in the call to `emmeans()` or `ref_grid()`.)

Similarly to the above, the `disable.lmerTest` and `lmerTest.limit` options or arguments affect whether Satterthwaite methods can be implemented.

The `df` argument may be used to specify some other degrees of freedom. Note that if `df` and `lmer.df = "kenward-roger"` are both specified, the covariance matrix is adjusted but the K-R degrees of freedom are not used.

Finally, note that a user-specified covariance matrix (via the `vcov.` argument) will also disable the Kenward-Roger method; in that case, the Satterthwaite method is used in place of Kenward-Roger.

[Back to quick reference](#quickref)

## Group M -- Multivariate models {#M}
When there is a multivariate response, the different responses are treated as if they were levels of a factor -- named `rep.meas` by default. The `mult.name` argument may be used to change this name. The `mult.levs` argument may specify a named list of one or more sets of levels. If this has more than one element, then the multivariate levels are expressed as combinations of the named factor levels via the function `base::expand.grid`.

## N -- Multinomial responses {#N}
The reference grid includes a pseudo-factor with the same name and levels as the multinomial response. There is an optional `mode` argument which should match `"prob"` or `"latent"`. With `mode = "prob"`, the reference-grid predictions consist of the estimated multinomial probabilities. The `"latent"` mode returns the linear predictor, recentered so that it averages to zero over the levels of the response variable (similar to sum-to-zero contrasts). Thus each latent variable can be regarded as the log probability at that level minus the average log probability over all levels.

Please note that, because the probabilities sum to 1 (and the latent values sum to 0) over the multivariate-response levels, all sensible results from `emmeans()` must involve that response as one of the factors. For example, if `resp` is a response with *k* levels, `emmeans(model, ~ resp | trt)` will yield the estimated multinomial distribution for each `trt`; but `emmeans(model, ~ trt)` will just yield the average probability of 1/*k* for each `trt`.
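To illustrate, here is a hedged sketch using the `iris` data (we assume the **nnet** package is installed; the model is not meant to be a serious analysis):

```r
library(nnet)
iris.mn <- multinom(Species ~ Sepal.Length, data = iris, trace = FALSE)
# The response pseudo-factor 'Species' must appear in the specs:
emmeans(iris.mn, ~ Species, mode = "prob")   # probabilities at mean Sepal.Length
```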
[Back to quick reference](#quickref)

## Group O -- Ordinal responses {#O}
The reference grid for ordinal models will include all variables that appear in the main model as well as those in the `scale` or `nominal` models (if provided). There are two optional arguments: `mode` (a character string) and `rescale` (which defaults to `c(0, 1)`). `mode` should match one of `"latent"` (the default), `"linear.predictor"`, `"cum.prob"`, `"exc.prob"`, `"prob"`, `"mean.class"`, or `"scale"` -- see the quick reference and note which are supported.

With `mode = "latent"`, the reference-grid predictions are made on the scale of the latent variable implied by the model. The scale and location of this latent variable are arbitrary, and may be altered via `rescale`. The predictions are multiplied by `rescale[2]`, then added to `rescale[1]`. Keep in mind that the scaling is related to the link function used in the model; for example, changing from a probit link to a logistic link will inflate the latent values by around $\pi/\sqrt{3}$, all other things being equal. `rescale` has no effect for other values of `mode`.

With `mode = "linear.predictor"`, `mode = "cum.prob"`, and `mode = "exc.prob"`, the boundaries between categories (i.e., thresholds) in the ordinal response are included in the reference grid as a pseudo-factor named `cut`. The reference-grid predictions are then of the cumulative probabilities at each threshold (for `mode = "cum.prob"`), exceedance probabilities (one minus cumulative probabilities, for `mode = "exc.prob"`), or the link function thereof (for `mode = "linear.predictor"`).

With `mode = "prob"`, a pseudo-factor with the same name as the model's response variable is created, and the grid predictions are of the probabilities of each class of the ordinal response. With `"mean.class"`, the returned results are means of the ordinal response, interpreted as a numeric value from 1 to the number of classes, using the `"prob"` results as the estimated probability distribution for each case.

With `mode = "scale"` (applicable when the fitted object incorporates a scale model), EMMs are obtained for the factors in the scale model (with a log response) instead of the response model. The grid is constructed using only the factors in the scale model.

Any grid point that is non-estimable by either the location or the scale model (if present) is set to `NA`, and any EMMs involving such a grid point will also be non-estimable. A consequence of this is that if there is a rank-deficient `scale` model, then *all* latent responses become non-estimable because the predictions are made using the average log-scale estimate.

`rms` models have an additional `mode`. With `mode = "middle"` (this is the default), the middle intercept is used, comparable to the default for `rms::Predict()`. This is quite similar in concept to `mode = "latent"`, where all intercepts are averaged together.

[Back to quick reference](#quickref)

## P -- Other packages {#P}
Models in this group have their **emmeans** support provided by the package that implements the model-fitting procedure. Users should refer to the package documentation for details on **emmeans** support. In some cases, a package's models may have been supported here in **emmeans**; if so, the other package's support overrides it.

## Q -- Quantile regression {#Q}
The argument `tau` should match (within a very small margin) one of the quantiles actually specified in fitting the model; otherwise an error results.
In these models, the covariance matrix is obtained via the model's `summary()` method with `covariance = TRUE`. The user may specify one or more of the other arguments for `summary` or to be passed to, say, a bootstrap routine. If so, those optional arguments must be spelled out completely (e.g., `start` will *not* be matched to `startQR`).

## S -- Sampling (MCMC) methods {#S}
Models fitted using MCMC methods contain a sample from the posterior distribution of fixed-effect coefficients. In some cases (e.g., results of `MCMCpack::MCMCregress()` and `MCMCpack::MCMCpoisson()`), the object may include a `"call"` attribute that `emmeans()` can use to reconstruct the data and obtain a basis for the EMMs. If not, a `formula` and `data` argument are provided that may help produce the right results. In addition, the `contrasts` specifications are not necessarily recoverable from the object, so the system default must match what was actually used in fitting the model.

The `summary.emmGrid()` method provides credibility intervals (HPD intervals) of the results, and ignores the frequentist-oriented arguments (`infer`, `adjust`, etc.) An `as.mcmc()` method is provided that creates an `mcmc` object that can be summarized or plotted using the **coda** package (or others that support those objects). It provides a posterior sample of EMMs, or contrasts thereof, for the given reference grid, based on the posterior sample of the fixed effects from the model object.

In `MCMCglmm` objects, the `data` argument is required; however, if you save it as a member of the model object (e.g., `object$data = quote(mydata)`), that removes the necessity of specifying it in each call. The special keyword `trait` is used in some models. When the response is multivariate and numeric, `trait` is generated automatically as a factor in the reference grid, and the argument `mult.levs` can be used to name its levels. In other models such as a multinomial model, use the `mode` argument to specify the type of model, and `trait = ` to specify the name of the data column that contains the levels of the factor response.

The **brms** package, version 2.13 and later, has its own **emmeans** support. Refer to the documentation in that package.

[Back to quick reference](#quickref)

## Group V -- `aovList` objects (also used with `afex_aov` objects) {#V}
Support for these objects is limited. To avoid strong biases in the predictions, it is strongly recommended that when fitting the model, the `contrasts` attribute of all factors should be of a type that sums to zero -- for example, `"contr.sum"`, `"contr.poly"`, or `"contr.helmert"` but *not* `"contr.treatment"`. If that is found not to be the case, the model is re-fitted using sum-to-zero contrasts (thus requiring additional computation). Doing so does *not* remove all bias in the EMMs unless the design is perfectly balanced, and an annotation is added to warn of that. This bias cancels out when doing comparisons and contrasts.

Only intra-block estimates of covariances are used. That is, if a factor appears in more than one error stratum, only the covariance structure from its lowest stratum is used in estimating standard errors. Degrees of freedom are obtained using the Satterthwaite method. In general, `aovList` support is best with balanced designs, with due caution in the use of contrasts. If a `vcov.` argument is supplied, it must yield a single covariance matrix for the unique fixed effects (not a set of them for each error stratum).
In that case, the degrees of freedom are set to `NA`.

[Back to quick reference](#quickref)

[Index of all vignette topics](vignette-topics.html)emmeans/R/0000755000176200001440000000000014165066776012122 5ustar liggesusersemmeans/R/models.R0000644000176200001440000000347314137062735013525 0ustar liggesusers##############################################################################
#    Copyright (c) 2012-2017 Russell V. Lenth                                #
#                                                                            #
#    This file is part of the emmeans package for R (*emmeans*)              #
#                                                                            #
#    *emmeans* is free software: you can redistribute it and/or modify       #
#    it under the terms of the GNU General Public License as published by    #
#    the Free Software Foundation, either version 2 of the License, or       #
#    (at your option) any later version.                                     #
#                                                                            #
#    *emmeans* is distributed in the hope that it will be useful,            #
#    but WITHOUT ANY WARRANTY; without even the implied warranty of          #
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the           #
#    GNU General Public License for more details.                            #
#                                                                            #
#    You should have received a copy of the GNU General Public License       #
#    along with R and *emmeans*.  If not, see                                #
#    <https://www.gnu.org/licenses/> and/or                                  #
#    <http://www.gnu.org/licenses/gpl-2.0.html>.                             #
##############################################################################

# documentation for models
#' Models supported in \pkg{emmeans}
#'
#' Documentation for models has been moved to a vignette. To access it,
#' use \href{../doc/models.html}{\code{vignette("models", "emmeans")}}.
#'
#' @name models
NULL
emmeans/R/emmGrid-methods.R0000644000176200001440000015031514137062735015265 0ustar liggesusers##############################################################################
#    Copyright (c) 2012-2017 Russell V. Lenth                                #
#                                                                            #
#    This file is part of the emmeans package for R (*emmeans*)              #
#                                                                            #
#    *emmeans* is free software: you can redistribute it and/or modify       #
#    it under the terms of the GNU General Public License as published by    #
#    the Free Software Foundation, either version 2 of the License, or       #
#    (at your option) any later version.                                     #
#                                                                            #
#    *emmeans* is distributed in the hope that it will be useful,            #
#    but WITHOUT ANY WARRANTY; without even the implied warranty of          #
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the           #
#    GNU General Public License for more details.                            #
#                                                                            #
#    You should have received a copy of the GNU General Public License       #
#    along with R and *emmeans*.  If not, see                                #
#    <https://www.gnu.org/licenses/> and/or                                  #
#    <http://www.gnu.org/licenses/gpl-2.0.html>.                             #
##############################################################################

### =========== Various methods for emmGrid class =============================
# (note: some major ones have their own file)

### S4 show method
## use S3 for this
setMethod("summary", "emmGrid", summary.emmGrid)

setMethod("show", "emmGrid", function(object) {
    isnewrg = object@misc$is.new.rg
    if (is.null(isnewrg))
        isnewrg = FALSE

    if (isnewrg)
        str.emmGrid(object)
    else
        print(summary.emmGrid(object))
})

### Others are all S3 methods

#' @rdname emmGrid-methods
#' @method str emmGrid
#' @export
str.emmGrid <- function(object, ...)
{ showlevs = function(x) { # internal convenience function if (is.null(x)) cat("(predicted by other variables)") else cat(paste(format(x, digits = 5, justify = "none"), collapse=", ")) } showtran = function(misc, label) { # internal convenience fcn cat(paste(label, dQuote(.fmt.tran(misc)), "\n")) } levs = object@levels cat(paste("'", class(object)[1], "' object with variables:\n", sep="")) for (nm in union(object@roles$predictors, union(object@roles$multresp, object@roles$responses))) { cat(paste(" ", nm, " = ", sep = "")) if (hasName(object@matlevs, nm)) { if (nm %in% object@roles$responses) cat("multivariate response with means: ") else cat("matrix with column reference values: ") cat("\n ") showlevs(object@matlevs[[nm]]) } else if (nm %in% object@roles$multresp) { cat("multivariate response levels: ") showlevs(levs[[nm]]) } else if (nm %in% object@roles$responses) { cat("response variable with mean ") showlevs(levs[[nm]]) } else showlevs(levs[[nm]]) cat("\n") } if (length(nuis <- object@roles$nuisance) > 0) { cat("Nuisance factors that have been collapsed by averaging:\n ") tmp = paste0(names(nuis), "(", sapply(nuis, length), ")") cat(paste(tmp, collapse = ", ")) cat("\n") } if(!is.null(object@model.info$nesting)) { cat("Nesting structure: ") cat(.fmt.nest(object@model.info$nesting)) cat("\n") } if(!is.null(tran <- object@misc$tran)) { showtran(object@misc, "Transformation:") if (!is.null(tran2 <- object@misc$tran2)) showtran(list(tran = tran2), "Additional response transformation:") } } #' @rdname emmGrid-methods #' @param export Logical value. If \code{FALSE}, the object is printed. #' If \code{TRUE}, a list is invisibly returned, which contains character #' elements named \code{summary} and \code{annotations} that may be saved #' or displayed as the user sees fit. \code{summary} is a character matrix #' (or list of such matrices, if a \code{by} variable is in effect). #' \code{annotations} is a character vector of the annotations that would #' have been printed below the summary or summaries. #' @method print emmGrid #' @param x An \code{emmGrid} object #' @export print.emmGrid = function(x, ..., export = FALSE) print(summary.emmGrid(x, ...), export = export) # vcov method #' Miscellaneous methods for \code{emmGrid} objects #' @rdname emmGrid-methods #' #' @param object An \code{emmGrid} object #' @param ... (required but not used) #' #' @return The \code{vcov} method returns a symmetric matrix of variances and #' covariances for \code{predict.emmGrid(object, type = "lp")} #' #' @method vcov emmGrid #' @export vcov.emmGrid = function(object, ...) { tol = get_emm_option("estble.tol") if (!is.null(hook <- object@misc$vcovHook)) { if (is.character(hook)) hook = get(hook) hook(object, tol = tol, ...) } else { X = object@linfct estble = estimability::is.estble(X, object@nbasis, tol) X[!estble, ] = NA X = X[, !is.na(object@bhat), drop = FALSE] X %*% tcrossprod(object@V, X) } } # Method to alter contents of misc slot #' Update an \code{emmGrid} object #' #' Objects of class \code{emmGrid} contain several settings that affect such things as #' what arguments to pass to \code{\link{summary.emmGrid}}. #' The \code{update} method allows safer management of these settings than #' by direct modification of its slots. #' #' @param object An \code{emmGrid} object #' @param ... Options to be set. These must match a list of known options (see #' Details) #' @param silent Logical value. If \code{FALSE} (the default), a message is #' displayed if any options are not matched. 
If \code{TRUE}, no messages are
#'   shown.
#'
#' @return an updated \code{emmGrid} object.
#'
#' @method update emmGrid
#' @order 1
#' @export
#'
#' @section Details:
#' The names in \code{\dots} are partially matched against those that are valid,
#' and if a match is found, it adds or replaces the current setting. The valid
#' names are
#'
#' \describe{
#' \item{\code{tran}, \code{tran2}}{(\code{list} or \code{character}) specifies
#'   the transformation which, when inverted, determines the results displayed by
#'   \code{\link{summary.emmGrid}}, \code{\link{predict.emmGrid}}, or \code{\link{emmip}} when
#'   \code{type="response"}. The value may be the name of a standard
#'   transformation from \code{\link{make.link}} or additional ones supported by
#'   name, such as \code{"log2"}; or, for a custom transformation, a \code{list}
#'   containing at least the functions \code{linkinv} (the inverse of the
#'   transformation) and \code{mu.eta} (the derivative thereof). The
#'   \code{\link{make.tran}} function returns such lists for a number of popular
#'   transformations. See the help page of \code{\link{make.tran}} for details as
#'   well as information on the additional named transformations that are
#'   supported. \code{tran2} is just like \code{tran} except it is a second
#'   transformation (i.e., a response transformation in a generalized linear
#'   model).}
#'
#' \item{\code{tran.mult}}{Multiple for \code{tran}. For example, for the
#'   response transformation \samp{2*sqrt(y)} (or \samp{sqrt(y) + sqrt(y + 1)},
#'   for that matter), we should have \code{tran = "sqrt"} and \code{tran.mult =
#'   2}. If absent, a multiple of 1 is assumed.}
#'
#' \item{\code{tran.offset}}{Additive constant before a transformation is applied.
#'   For example, a response transformation of \code{log(y + pi)} has
#'   \code{tran.offset = pi}. If no value is present, an offset of 0 is assumed.}
#'
#' \item{\code{estName}}{(\code{character}) is the column label used for
#'   displaying predictions or EMMs.}
#'
#' \item{\code{inv.lbl}}{(\code{character}) is the column label to use for
#'   predictions or EMMs when \code{type="response"}.}
#'
#' \item{\code{by.vars}}{(\code{character} vector or \code{NULL}) the variables
#'   used for grouping in the summary, and also for defining subfamilies in a call
#'   to \code{\link{contrast}}.}
#'
#' \item{\code{pri.vars}}{(\code{character} vector) are the names of the grid
#'   variables that are not in \code{by.vars}. Thus, the combinations of their
#'   levels are used as columns in each table produced by \code{\link{summary.emmGrid}}.}
#'
#' \item{\code{alpha}}{(numeric) is the default significance level for tests, in
#'   \code{\link{summary.emmGrid}} as well as \code{\link{plot.emmGrid}}
#'   when \samp{CIs = TRUE}. Be cautious that methods that depend on
#'   specifying \code{alpha} are prone to abuse. See the
#'   discussion in \href{../doc/basics.html#pvalues}{\code{vignette("basics", "emmeans")}}.}
#'
#' \item{\code{adjust}}{(\code{character}) is the default for the \code{adjust}
#'   argument in \code{\link{summary.emmGrid}}.}
#'
#' \item{\code{famSize}}{(integer) is the number of means involved in a family of
#'   inferences; used in Tukey adjustments.}
#'
#' \item{\code{infer}}{(\code{logical} vector of length 2) is the default value
#'   of \code{infer} in \code{\link{summary.emmGrid}}.}
#'
#' \item{\code{level}}{(numeric) is the default confidence level, \code{level},
#'   in \code{\link{summary.emmGrid}}.
\emph{Note:} You must specify all five letters
#'   of \sQuote{level} to distinguish it from the slot name \sQuote{levels}.}
#'
#' \item{\code{df}}{(numeric) overrides the default degrees of freedom with a
#'   specified single value.}
#'
#' \item{\code{calc}}{(list) additional calculated columns. See \code{\link{summary.emmGrid}}.}
#'
#' \item{\code{null}}{(numeric) null hypothesis for \code{summary} or
#'   \code{test} (taken to be zero if missing).}
#'
#' \item{\code{side}}{(numeric or character) \code{side} specification for
#'   \code{summary} or \code{test} (taken to be zero if missing).}
#'
#' \item{\code{sigma}}{(numeric) Error SD to use in predictions and for bias-adjusted
#'   back-transformations.}
#'
#' \item{\code{delta}}{(numeric) \code{delta} specification for \code{summary}
#'   or \code{test} (taken to be zero if missing).}
#'
#' \item{\code{predict.type} or \code{type}}{(character) sets the default method
#'   of displaying predictions in \code{\link{summary.emmGrid}},
#'   \code{\link{predict.emmGrid}}, and \code{\link{emmip}}. Valid values are
#'   \code{"link"} (with synonyms \code{"lp"} and \code{"linear"}), or
#'   \code{"response"}.}
#'
#' \item{\code{bias.adjust}, \code{frequentist}}{(character) These
#'   are used by \code{summary} if the values of these arguments are not specified.}
#'
#' \item{\code{estType}}{(\code{character}) is used internally to determine
#'   what \code{adjust} methods are appropriate. It should match one of
#'   \samp{c("prediction", "contrast", "pairs")}. As an example of why this is needed,
#'   the Tukey adjustment should only be used for pairwise comparisons
#'   (\code{estType = "pairs"}); if \code{estType} is some other string, Tukey
#'   adjustments are not allowed.}
#'
#' \item{\code{avgd.over}}{(\code{character} vector) are the names of the
#'   variables whose levels are averaged over in obtaining marginal averages of
#'   predictions, i.e., estimated marginal means. Changing this might produce a
#'   misleading printout, but setting it to \code{character(0)} will suppress the
#'   \dQuote{averaged over} message in the summary.}
#'
#' \item{\code{initMesg}}{(\code{character}) is a string that is added to the
#'   beginning of any annotations that appear below the \code{\link{summary.emmGrid}}
#'   display.}
#'
#' \item{\code{methDesc}}{(\code{character}) is a string that may be used for
#'   creating names for a list of \code{emmGrid} objects.}
#'
#' \item{\code{nesting}}{(Character or named \code{list}) specifies the nesting
#'   structure. See \dQuote{Recovering or overriding model information} in the
#'   documentation for \code{\link{ref_grid}}. The current nesting structure is
#'   displayed by \code{\link{str.emmGrid}}.}
#'
#' \item{\code{levels}}{named \code{list} of new levels for the elements of the
#'   current \code{emmGrid}. The list name(s) are used as new variable names, and
#'   if needed, the list is expanded using \code{expand.grid}. These results replace
#'   current variable names and levels. This specification changes the \code{levels},
#'   \code{grid}, \code{roles}, and \code{misc} slots in the updated \code{emmGrid},
#'   and resets \code{pri.vars}, \code{by.vars}, \code{adjust}, \code{famSize},
#'   and \code{avgd.over}. In addition, if there is nesting of factors, that may be
#'   altered; a warning is issued if it involves something other than mere name changes.
#'   \emph{Note:} All six letters of \code{levels} are needed in order to distinguish
#'   it from \code{level}.}
#'
#' \item{\code{submodel}}{\code{formula} or \code{character} value specifying a
#'   submodel (requires this feature being supported by underlying methods
#'   for the model class). When specified, the \code{linfct} slot is replaced by
#'   its aliases for the specified sub-model. Any factors in the sub-model that
#'   do not appear in the model matrix are ignored, as are any interactions that
#'   are not in the main model, and any factors associated with multivariate responses.
#'   The estimates displayed are then computed as if
#'   the sub-model had been fitted. (However, the standard errors will be based on the
#'   error variance(s) of the full model.)
#'   \emph{Note:} The formula should refer only to predictor names, \emph{excluding} any
#'   function calls (such as \code{factor} or \code{poly}) that appear in the
#'   original model formula. See the example.
#'
#'   The character values allowed should partially
#'   match \code{"minimal"} or \code{"type2"}. With \code{"minimal"}, the sub-model
#'   is taken to be the one only involving the surviving factors in \code{object}
#'   (the ones averaged over being omitted). Specifying \code{"type2"} is the same as
#'   \code{"minimal"} except only the highest-order term in the submodel is retained,
#'   and all effects not containing it are orthogonalized-out. Thus, in a purely linear
#'   situation such as an \code{lm} model, the joint test
#'   of the modified object is in essence a type-2 test as in \code{car::Anova}.
#'
#'   For some objects such as generalized linear models, specifying \code{submodel}
#'   will typically not produce the same estimates or type-2 tests as would be
#'   obtained by actually fitting a separate model with those specifications.
#'   The reason is that those models are fitted by iterative-reweighting methods,
#'   whereas the \code{submodel} calculations preserve the final weights used in
#'   fitting the full model.}
#'
#' \item{(any other slot name)}{If the name matches an element of
#'   \code{slotNames(object)} other than \code{levels}, that slot is replaced by
#'   the supplied value, if it is of the required class (otherwise an error occurs).
#'
#'   The user must be very careful in
#'   replacing slots because they are interrelated; for example, the lengths
#'   and dimensions of \code{grid}, \code{linfct}, \code{bhat}, and \code{V} must
#'   conform.}
#' } %%%%%%% end \describe
#'
#' @note
#' When it makes sense, an option set by \code{update} will persist into
#' future results based on that object. But some options are disabled as well.
#' For example, a \code{calc} option will be nulled-out if \code{contrast}
#' is called, because it probably will not make sense to do the same
#' calculations on the contrast results, and in fact the variable(s) needed
#' may not even still exist.
#'
#' @seealso \code{\link{emm_options}}
#' @examples
#' # Using an already-transformed response:
#' pigs.lm <- lm(log(conc) ~ source * factor(percent), data = pigs)
#'
#' # Reference grid that knows about the transformation
#' # and asks to include the sample size in any summaries:
#' pigs.rg <- update(ref_grid(pigs.lm), tran = "log",
#'                     predict.type = "response",
#'                     calc = c(n = ~.wgt.))
#' emmeans(pigs.rg, "source")
#'
#' # Obtain estimates for the additive model
#' # [Note that the submodel refers to 'percent', not 'factor(percent)']
#' emmeans(pigs.rg, "source", submodel = ~ source + percent)
#'
#' # Type II ANOVA
#' joint_tests(pigs.rg, submodel = "type2")
update.emmGrid = function(object, ..., silent = FALSE) {
    args = list(...)
    # see .valid.misc below this function for list of legal options
    valid.slots = slotNames(object)
    valid.choices = union(.valid.misc, valid.slots)
    misc = object@misc
    for (nm in names(args)) {
        fullname = try(match.arg(nm, valid.choices), silent = TRUE)
        if(inherits(fullname, "try-error")) {
            if (!silent)
                message("Argument ", sQuote(nm), " was ignored. Valid choices are:\n",
                        paste(valid.choices, collapse = ", "))
        }
        else {
            if (fullname == "type")
                fullname = "predict.type"
            if (fullname == "levels") {
                lvls = args[[nm]]
                if (!is.list(lvls))
                    stop("'levels' must be a named list.")
                nm = names(lvls)
                if (is.null(nm) || any(nm == ""))
                    stop("'levels' must be a named list.")
                grd = do.call(expand.grid, lvls)
                if (nrow(object@grid) != nrow(grd))
                    stop("Length of replacement levels does not match the number of rows in the grid")
                oldlvls = object@levels
                if(!is.null(object@model.info$nesting) &&
                        ((length(oldlvls) != length(lvls)) ||
                         any(sapply(oldlvls, length) != sapply(lvls, length))))
                    warning("Changes to levels may have altered nesting structure.\n",
                            "You likely need to also run 'update(..., nesting = ...)'")
                object@levels = lvls
                for (nm in c(".wgt.", ".offset."))   # carry over weights and offsets
                    grd[[nm]] = object@grid[[nm]]
                object@grid = grd
                object@roles$predictors = misc$pri.vars = names(lvls)
                misc$by.vars = misc$avgd.over = NULL
                if (!is.null(object@model.info$nesting))
                    object@model.info$nesting = .find_nests(grd, NULL, FALSE, lvls)
                misc$adjust = "none"
                misc$famSize = nrow(grd)
            }
            else if (fullname %in% valid.slots)   # all slots but "levels"
                slot(object, fullname) = args[[nm]]
            else {
                if (fullname == "by.vars") {
                    allvars = union(misc$pri.vars, misc$by.vars)
                    misc$pri.vars = setdiff(allvars, args[[nm]])
                }
                if (fullname == "pri.vars") {
                    allvars = union(misc$pri.vars, misc$by.vars)
                    misc$by.vars = setdiff(allvars, args[[nm]])
                }
                # special case - I keep nesting in model.info. Plus add'l checks
                if (fullname == "nesting") {
                    object@model.info$nesting = lst = .parse_nest(args[[nm]])
                    if(!is.null(lst)) {
                        nms = union(names(lst), unlist(lst))
                        if(!all(nms %in% names(object@grid)))
                            stop("Nonexistent variables specified in 'nesting'")
                        object@misc$display = .find.nonempty.nests(object, nms)
                    }
                }
                if (fullname == "submodel") {
                    if(!is.null(A <- .alias.matrix(object, args[[nm]]))) {
                        rcols = attr(A, "rcols")
                        L = object@linfct
                        k = ncol(L) / ncol(A)
                        if(abs(k - (k <- as.integer(k))) > .01)
                            stop("Incompatible columns in alias setup")
                        ixmat = matrix(seq_along(L[1, ]), ncol = k)
                        for (j in seq_len(k)) {
                            rc = ixmat[rcols, j]
                            fc = ixmat[, j]
                            object@linfct[, fc] = L[, rc, drop = FALSE] %*% A
                        }
                        object@model.info$model.matrix = ""   # silent message
                        misc$initMesg = c(misc$initMesg,
                                          paste("submodel: ~", attr(A, "submodstr")))
                    }
                }
                else
                    misc[[fullname]] = args[[nm]]
            }
        }
    }
    object@misc = misc
    object
}

### List of valid strings to match in update() ###
.valid.misc = c("adjust","alpha","avgd.over","bias.adjust","by.vars","calc","delta","df",
    "initMesg","estName","estType","famSize","frequentist","infer","inv.lbl",
    "level","methDesc","nesting","null","predict.type","pri.vars",
    "side","sigma","tran","tran.mult","tran.offset","tran2","type","is.new.rg",
    "submodel")

#' Set or change emmeans options
#'
#' Use \code{emm_options} to set or change various options that are used in
#' the \pkg{emmeans} package. These options are set separately for different contexts in
#' which \code{emmGrid} objects are created, in a named list of option lists.
#'
#' \pkg{emmeans}'s options are stored as a list in the system option \code{"emmeans"}.
#' Thus, \code{emm_options(foo = bar)} is the same as
#' \code{options(emmeans = list(..., foo = bar))} where \code{...} represents any
#' previously existing options. The list \code{emm_defaults} contains the default
#' values in case the corresponding element of system option \code{emmeans} is \code{NULL}.
#'
#' Currently, the following main list entries are supported:
#' \describe{
#' \item{\code{ref_grid}}{A named \code{list} of defaults for objects created by
#'   \code{\link{ref_grid}}. This could affect other objects as well. For example,
#'   if \code{emmeans} is called with a fitted model object, it calls
#'   \code{ref_grid} and this option will affect the resulting \code{emmGrid}
#'   object.}
#' \item{\code{emmeans}}{A named \code{list} of defaults for objects created by
#'   \code{\link{emmeans}} or \code{\link{emtrends}}.}
#' \item{\code{contrast}}{A named \code{list} of defaults for objects created by
#'   \code{\link{contrast.emmGrid}} or \code{\link{pairs.emmGrid}}.}
#' \item{\code{summary}}{A named \code{list} of defaults used by the methods
#'   \code{\link{summary.emmGrid}}, \code{\link{predict.emmGrid}}, \code{\link{test.emmGrid}},
#'   \code{\link{confint.emmGrid}}, and \code{\link{emmip}}. The only option that can
#'   affect the latter four is \code{"predict.method"}.}
#' \item{\code{sep}}{A character value to use as a separator in labeling factor combinations.
#'   Such labels are potentially used in several places such as \code{\link{contrast}} and
#'   \code{\link{plot.emmGrid}} when combinations of factors are compared or plotted.
#'   The default is \code{" "}.}
#' \item{\code{parens}}{Character vector that determines which labels are parenthesized
#'   when they are contrasted. The first element is a regular expression, and the second and
#'   third elements are used as left and right parentheses.
#'   See details for the \code{parens} argument in \code{\link{contrast}}.
The default #' will parenthesize labels containing the four arithmetic operators, #' using round parentheses.} #' \item{\code{cov.keep}}{The default value of \code{cov.keep} in \code{\link{ref_grid}}. #' Defaults to \code{"2"}, i.e., two-level covariates are treated like factors.} #' \item{\code{graphics.engine}}{A character value matching #' \code{c("ggplot", "lattice")}, setting the default engine to use in #' \code{\link{emmip}} and \code{\link{plot.emmGrid}}. Defaults to \code{"ggplot"}.} #' \item{\code{msg.interaction}}{A logical value controlling whether or not #' a message is displayed when \code{emmeans} averages over a factor involved #' in an interaction. It is probably not appropriate to do this, unless #' the interaction is weak. Defaults to \code{TRUE}.} #' \item{\code{msg.nesting}}{A logical value controlling whether or not to #' display a message when a nesting structure is auto-detected. The existence #' of such a structure affects computations of EMMs. Sometimes, a nesting #' structure is falsely detected -- namely when a user has omitted some #' main effects but included them in interactions. This does not change the #' model fit, but it produces a different parameterization that is picked #' up when the reference grid is constructed. Defaults to \code{TRUE}.} #' \item{\code{rg.limit}}{An integer value setting a limit on the number of rows #' in a newly constructed reference grid. This is checked based on the number of #' levels of the factors involved; but it excludes the levels of any multivariate #' responses because those are not yet known. The reference grid consists of all #' possible combinations of the predictors, and this can become huge if there are #' several factors. An error is thrown if this limit is exceeded. One can use the #' \code{nuisance} argument of \code{\link{ref_grid}} to collapse on nuisance #' factors, thus making the grid smaller. Defaults to 10,000.} #' \item{\code{simplify.names}}{A logical value controlling whether to #' simplify (when possible) names in the model formula that refer to datasets -- #' for example, should we simplify a predictor name like \dQuote{\code{data$trt}} #' to just \dQuote{\code{trt}}? Defaults to \code{TRUE}.} #' \item{\code{opt.digits}}{A logical value controlling the precision with which #' summaries are printed. If \code{TRUE} (default), the number of digits #' displayed is just enough to reasonably distinguish estimates from the ends #' of their confidence intervals; but always at least 3 digits. If #' \code{FALSE}, the system value \code{getOption("digits")} is used.} #' \item{\code{back.bias.adj}}{A logical value controlling whether we #' try to adjust bias when back-transforming. If \code{FALSE}, we use naive #' back transformation. If \code{TRUE} \emph{and \code{sigma} is available}, a #' second-order adjustment is applied to estimate the mean on the response #' scale.} #' \item{\code{enable.submodel}}{A logical value. If \code{TRUE}, enables support #' for selected model classes to implement the \code{submodel} option. If #' \code{FALSE}, this support is disabled. Setting this option to \code{FALSE} #' could save excess memory consumption.} #' #' }%%% end describe{} #' Some other options have more specific purposes: #' \describe{ #' \item{\code{estble.tol}}{Tolerance for determining estimability in #' rank-deficient cases. 
#' If absent, the value in \code{emm_defaults$estble.tol}
#'   is used.}
#' \item{\code{save.ref_grid}}{Logical value of \code{TRUE} if you wish the
#'   latest reference grid created to be saved in \code{.Last.ref_grid}.
#'   The default is \code{FALSE}.}
#' \item{Options for \code{lme4::lmerMod} models}{The options \code{lmer.df},
#'   \code{disable.pbkrtest}, \code{pbkrtest.limit}, \code{disable.lmerTest},
#'   and \code{lmerTest.limit} affect how degrees of freedom are computed for
#'   \code{lmerMod} objects produced by the \pkg{lme4} package. See that
#'   section of the "models" vignette for details.}
#' } %%%%%% end \describe
#'
#' @param ... Option names and values (see Details)
#' @param disable If non-missing, this will reset all options to their defaults
#'   if \code{disable} tests \code{TRUE} (but first save them for possible later
#'   restoration). Otherwise, all previously saved options
#'   are restored. This is important for bug reporting; please see the section below
#'   on reproducible bugs. When \code{disable} is specified, the other arguments are ignored.
#'
#' @return \code{emm_options} returns the current options (same as the result
#'   of \samp{getOption("emmeans")}) -- invisibly, unless called with no arguments.
#'
#' @section Reproducible bugs:
#' Most options set display attributes and such that are not likely to be associated
#' with bugs in the code. However, some other options (e.g., \code{cov.keep})
#' are essentially configuration settings that may affect how/whether the code
#' runs, and the settings for these options may cause subtle effects that may be
#' hard to reproduce. Therefore, when sending a bug report, please create a reproducible
#' example and make sure the bug occurs with all options set at their defaults.
#' This is done by preceding it with \code{emm_options(disable = TRUE)}.
#'
#' By the way, \code{disable} works like a stack (LIFO buffer), in that \code{disable = TRUE}
#' is equivalent to \code{emm_options(saved.opts = emm_options())} and
#' \code{emm_options(disable = FALSE)} is equivalent to
#' \code{options(emmeans = get_emm_option("saved.opts"))}. To completely erase
#' all options, use \code{options(emmeans = NULL)}.
#'
#' @seealso \code{\link{update.emmGrid}}
#' @export
#' @examples
#' \dontrun{
#' emm_options(ref_grid = list(level = .90),
#'             contrast = list(infer = c(TRUE,FALSE)),
#'             estble.tol = 1e-6)
#' # Sets default confidence level to .90 for objects created by ref_grid
#' # AS WELL AS emmeans called with a model object (since it creates a
#' # reference grid). In addition, when we call 'contrast', 'pairs', etc.,
#' # confidence intervals rather than tests are displayed by default.
#' }
#'
#' \dontrun{
#' emm_options(disable.pbkrtest = TRUE)
#' # This forces use of asymptotic methods for lmerMod objects.
#' # Set to FALSE or NULL to re-enable using pbkrtest.
#' }
#'
#' # See tolerance being used for determining estimability
#' get_emm_option("estble.tol")
#'
#' \dontrun{
#' # Set all options to their defaults
#' emm_options(disable = TRUE)
#' # ... and perhaps follow with code for a minimal reproducible bug,
#' # which may include emm_options() calls if they are pertinent ...
#'
#' # restore options that had existed previously
#' emm_options(disable = FALSE)
#' }
#'
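#' \dontrun{
#' # A minimal sketch (values are illustrative only): tighten the estimability
#' # tolerance, inspect it, then remove the setting so the default applies again
#' emm_options(estble.tol = 1e-10)
#' get_emm_option("estble.tol")     # 1e-10
#' emm_options(estble.tol = NULL)   # falls back to emm_defaults$estble.tol
#' }
emm_options = function(..., disable) {
    opts = getOption("emmeans", list())
    newopts = list(...)
    display = TRUE # flag to display all options if ... is empty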
    if (!missing(disable)) {
        if (disable)
            opts = list(saved.opts = opts)
        else if (!is.null(saved <- opts$saved.opts))
            opts = saved
        newopts = list()
        display = FALSE
    }
    for (nm in names(newopts))
        opts[[nm]] = newopts[[nm]]
    options(emmeans = opts)
    if (length(newopts) > 0)
        invisible(opts)
    else if (display) {
        opts = c(opts, emm_defaults)
        opts[sort(names(opts))]
    }
}

# equivalent of getOption()
#' @rdname emm_options
#' @param x Character value - the name of an option to be queried
#' @param default Value to return if \code{x} is not found
#' @return \code{get_emm_option} returns the currently stored option for \code{x},
#'   or its default value if not found.
#' @export
get_emm_option = function(x, default = emm_defaults[[x]]) {
    opts = getOption("emmeans", list())
    if(is.null(default) || hasName(opts, x))
        opts[[x]]
    else
        default
}

### Exported defaults for certain options
#' @rdname emm_options
#' @export
emm_defaults = list (
    ref_grid = list(is.new.rg = TRUE, infer = c(FALSE, FALSE)),
    emmeans = list(infer = c(TRUE, FALSE)),
    contrast = list(infer = c(FALSE, TRUE)),
    save.ref_grid = FALSE,       # save new ref_grid in .Last.ref_grid
    cov.keep = "2",              # default for cov.keep arg in ref_grid
    sep = " ",                   # separator for combining factor levels
    parens = c("-|\\+|\\/|\\*", "(", ")"),  # patterns for what/how to parenthesize in contrast
    graphics.engine = "ggplot",  # default for emmip and plot.emmGrid
    ### msg.data.call = TRUE,    # message when there's a call in data or subset
    msg.interaction = TRUE,      # message about averaging w/ interactions
    msg.nesting = TRUE,          # message when nesting is detected
    estble.tol = 1e-8,           # tolerance for estimability checks
    simplify.names = TRUE,       # simplify names like data$x to just "x"
    back.bias.adj = FALSE,       # Try to bias-adjust back-transformations?
    opt.digits = TRUE,           # optimize displayed digits?
    enable.submodel = TRUE,      # enable saving extra info for submodel
    rg.limit = 10000,            # limit on number of rows in a reference grid
    lmer.df = "kenward-roger",   # Use Kenward-Roger for df
    disable.pbkrtest = FALSE,    # whether to bypass pbkrtest routines for lmerMod
    pbkrtest.limit = 3000,       # limit on N for enabling K-R
    disable.lmerTest = FALSE,    # whether to bypass lmerTest routines for lmerMod
    lmerTest.limit = 3000        # limit on N for enabling Satterthwaite
)

# override levels<- method
#' @rdname update.emmGrid
#' @order 5
#' @export
#' @param x an \code{emmGrid} object
#' @param value \code{list} or replacement levels. See the documentation for
#'   \code{update.emmGrid} with the \code{levels} argument,
#'   as well as the section below on \dQuote{Replacing levels}.
#'
#' @return \code{levels<-} replaces the levels of the object in-place.
#'   See the section on replacing levels for details.
#' @section Replacing levels:
#' The \code{levels<-} method uses \code{update.emmGrid} to replace the
#' levels of one or more factors. This method allows selectively replacing
#' the levels of just one factor (via subsetting operators), whereas
#' \code{update(x, levels = list(...))} requires a list of \emph{all} factors
#' and their levels. If any factors are to be renamed, we must replace all
#' levels and include the new names in the replacements. See the examples.
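#' For instance (a sketch, with hypothetical factors \code{A} and \code{B} in \code{x}):
#' \code{levels(x)$A <- c("a1", "a2")} changes just the levels of \code{A}, whereas
#' \code{update(x, levels = list(A = c("a1", "a2"), B = levels(x)$B))} accomplishes
#' the same thing via \code{update.emmGrid}.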
#'
#' @examples
#'
#' ## Changing levels of one factor
#' newrg <- pigs.rg
#' levels(newrg)$source <- 1:3
#' newrg
#'
#' ## Unraveling a previously standardized covariate
#' zd = scale(fiber$diameter)
#' fibz.lm <- lm(strength ~ machine * zd, data = fiber)
#' (fibz.rg <- ref_grid(fibz.lm, at = list(zd = -2:2)))   ### 2*SD range
#' lev <- levels(fibz.rg)
#' levels(fibz.rg) <- list (
#'     machine = lev$machine,
#'     diameter = with(attributes(zd),
#'         `scaled:center` + `scaled:scale` * lev$zd) )
#' fibz.rg
#'
"levels<-.emmGrid" = function(x, value) {
    update.emmGrid(x, levels = value)
}

# ### transform method (I decided to ditch this completely in favor of levels<-)
# #' Modify variable names and/or levels in a reference grid
# #'
# #' @param `_data` An object of class \code{emmGrid}
# #' @param ... Specifications for changes to be made. See Specifications section
# #' @param `_par` named \code{list} containing any additional parameters needed
# #'   in evaluating expressions
# #'
# #' @section Specifications:
# #' Each specification can be of one of the following forms:
# #' \itemize{
# #'   \item{\code{name = <levels>} (replace levels only)}
# #'   \item{\code{name = newname ~ <levels>} (replace levels and rename)}
# #'   \item{\code{name = ~ <expression>} (calculate new levels)}
# #'   \item{\code{name = newname ~ <expression>} (calculate new levels and rename)}
# #'   \item{\code{newname ~ name} (rename with levels unchanged)}
# #'   \item{\code{newname ~ <expression>} (calculate new levels for variable
# #'     in expression, and rename)}
# #' }
# #' Here, \code{name} must be the name of an existing predictor in the grid,
# #' and \code{<levels>} is a character or numeric vector of length
# #' exactly equal to the number of levels of \code{name}. The type of the replacement levels
# #' does not need to match the type of the existing levels; however, any factor in the grid
# #' remains a factor, with its levels changed.
# #'
# #' Expressions must be supplied via a formula, and must be evaluable in the context
# #' of \code{envir} and the existing levels of \code{name}.
# #' If a formula has a left-hand side, it is used as
# #' a replacement name for that variable.
# #'
# #' @return a modified \code{emmGrid} object
# #' @export
# #'
# #' @note
# #' An alternative way to use this is to supply a list of arguments as the \code{morph}
# #' option in \code{\link{update.emmGrid}}.
# #'
# #' @examples
# #' warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks)
# #' (warp.rg <- ref_grid(warp.lm))
# #' transform(warp.rg, tension = 1:3, wool = texture ~ c("soft", "coarse"))
# #'
# #' # Standardized predictor
# #' z <- scale(fiber$diameter)
# #' fiber.lm <- lm(strength ~ z + machine, data = fiber)
# #'
# #' ### Mean predictions at 1-SD intervals:
# #' (fiber.emm <- emmeans(fiber.lm, "z", at = list(z = -1:1)))
# #'
# #' ### Same predictions labeled with actual diameter values:
# #' transform(fiber.emm, diameter ~ `scaled:center` + `scaled:scale` * z,
# #'     `_par` = attributes(z))
# #'
#
# transform.emmGrid <- function(`_data`, ..., `_par` = list()) {
#     specs = list(...)
# nms = names(c(`_dummy_` = 0, specs))[-1] # keeps this from being NULL # for (i in which(nms == "")) { # get name from formula rhs # if (!inherits(spc <- specs[[i]], "formula") # || (inherits(spc, "formula") && (length(spc) < 3))) # stop("Unnamed specifications must be two-sided formulas") # nms[i] = names(specs)[i] = c(intersect(all.vars(spc[-2]), # names(`_data`@levels)), "(absent)")[1] # } # for (var in nms) { # oldlev = `_data`@levels[[var]] # if (is.null(oldlev)) # stop("No variable named '", var, "' in this object.") # newlev = specs[[var]] # if(inherits(newlev, "formula") && length(newlev) > 2) { # newname = as.character(newlev)[2] # newlev = newlev[-2] # } # else # newname = "" # if ((is.numeric(newlev) || is.character(newlev)) # && length(newlev) != length(oldlev)) # stop("Must provide exactly ", length(oldlev), " levels for '", var, "'") # else if (inherits(newlev, "formula")) # newlev = eval(str2expression(as.character(newlev)[2]), # envir = c(`_data`@levels[var], `_par`)) # # so at this point we have conforming numbers of levels. # `_data`@levels[[var]] = newlev # v = newv = `_data`@grid[[var]] # if (inherits(v, "factor")) # levels(newv) = newlev # else for (i in seq_along(oldlev)) # newv[v == oldlev[[i]]] = newlev[[i]] # `_data`@grid[[var]] = newv # if (newname != "") { # i = which(names(`_data`@levels) == var) # names(`_data`@levels)[i] = newname # i = which(names(`_data`@grid) == var) # names(`_data`@grid)[i] = newname # i = which(`_data`@roles$predictors == var) # if (length(i) > 0) # `_data`@roles$predictors[i] = newname # nst = `_data`@model.info$nesting # if (!is.null(nst)) { # names(nst)[names(nst) == var] = newname # nst = lapply(nst, function(x) { x[x == var] = newname; x }) # `_data`@model.info$nesting = nst # } # } # } # `_data` # } ### Utility to change the internal structure of an emmGrid object ### Returned emmGrid object has linfct = I and bhat = estimates ### Primary reason to do this is with transform = TRUE, then can ### work with linear functions of the transformed predictions #' Reconstruct a reference grid with a new transformation or simulations #' #' The typical use of this function is to cause EMMs to be computed on #' a different scale, e.g., the back-transformed scale rather than the #' linear-predictor scale. In other words, if you want back-transformed #' results, do you want to average and then back-transform, or #' back-transform and then average? #' #' The \code{regrid} function reparameterizes an existing \code{ref.grid} so #' that its \code{linfct} slot is the identity matrix and its \code{bhat} slot #' consists of the estimates at the grid points. If \code{transform} is #' \code{TRUE}, the inverse transform is applied to the estimates. Outwardly, #' when \code{transform = "response"}, the result of \code{\link{summary.emmGrid}} #' after applying \code{regrid} is identical to the summary of the original #' object using \samp{type="response"}. But subsequent EMMs or #' contrasts will be conducted on the new scale -- which is #' the reason this function exists. #' #' This function may also be used to simulate a sample of regression #' coefficients for a frequentist model for subsequent use as though it were a #' Bayesian model. To do so, specify a value for \code{N.sim} and a sample is #' simulated using the function \code{sim}. The grid may be further processed in #' accordance with the other arguments; or if \code{transform = "pass"}, it is #' simply returned with the only change being the addition of the simulated #' sample. 
#'
#' @param object An object of class \code{emmGrid}
#' @param transform Character, list, or logical value. If \code{"response"},
#'   \code{"mu"}, or \code{TRUE}, the inverse transformation is applied to the
#'   estimates in the grid (but if there is both a link function and a response
#'   transformation, \code{"mu"} back-transforms only the link part); if
#'   \code{"none"} or \code{FALSE}, \code{object} is re-gridded so that its
#'   \code{bhat} slot contains \code{predict(object)} and its \code{linfct} slot
#'   is the identity. Any internal transformation information is preserved. If
#'   \code{transform = "pass"}, the object is not re-gridded in any way (this
#'   may be useful in conjunction with \code{N.sim}).
#'
#'   If \code{transform} is a character value in \code{links} (which is the set
#'   of valid arguments for the \code{\link{make.link}} function, excepting
#'   \code{"identity"}), or if \code{transform} is a list of the same form as
#'   returned by \code{make.link} or \code{\link{make.tran}}, the results are
#'   formulated as if the response had been transformed with that link function.
#'
#' @param inv.link.lbl Character value. This applies only when \code{transform}
#'   is in \code{links}, and is used to label the predictions if subsequently summarized
#'   with \code{type = "response"}.
#' @param predict.type Character value. If provided, the returned object is
#'   updated with the given type to use by default by \code{summary.emmGrid}
#'   (see \code{\link{update.emmGrid}}). This may be useful, for example,
#'   when one specifies \code{transform = "log"} but desires summaries to be
#'   produced by default on the response scale.
#' @param bias.adjust Logical value for whether to adjust for bias in
#'   back-transforming (\code{transform = "response"}). This requires a value of
#'   \code{sigma} to exist in the object or be specified.
#' @param sigma Error SD assumed for bias correction (when
#'   \code{transform = "response"} and a transformation is in effect). If not specified,
#'   \code{object@misc$sigma} is used, and an error is thrown if it is not found.
#' @param N.sim Integer value. If specified and \code{object} is based on a
#'   frequentist model (i.e., does not have a posterior sample), then a fake
#'   posterior sample is generated using the function \code{sim}.
#' @param sim A function of three arguments (no names are assumed).
#'   If \code{N.sim} is supplied with a frequentist model, this function is called
#'   with respective arguments \code{N.sim}, \code{object@bhat}, and \code{object@V}.
#'   The default is the multivariate normal distribution.
#' @param ... Ignored.
#'
#' @section Degrees of freedom:
#' In cases where the degrees of freedom depend on the linear function being
#' estimated (e.g., the Satterthwaite method), the d.f.
#' from the reference grid are saved, and a kind of \dQuote{containment} method
#' is substituted in the returned object, whereby the calculated d.f. for a new
#' linear function will be the minimum d.f. among those having nonzero
#' coefficients. This is kind of an \emph{ad hoc} method, and it can
#' over-estimate the degrees of freedom in some cases. An annotation is
#' displayed below any subsequent summary results stating that the
#' degrees-of-freedom method is inherited from the previous method at
#' the time of re-gridding.
#'
#' @note Another way to use \code{regrid} is to supply a \code{transform}
#'   argument to \code{\link{ref_grid}} (either directly or indirectly via
#'   \code{\link{emmeans}}).
This is often a simpler approach if the reference #' grid has not already been constructed. #' #' @return An \code{emmGrid} object with the requested changes #' @export #' #' @examples #' pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs) #' rg <- ref_grid(pigs.lm) #' #' # This will yield EMMs as GEOMETRIC means of concentrations: #' (emm1 <- emmeans(rg, "source", type = "response")) #' pairs(emm1) ## We obtain RATIOS #' #' # This will yield EMMs as ARITHMETIC means of concentrations: #' (emm2 <- emmeans(regrid(rg, transform = "response"), "source")) #' pairs(emm2) ## We obtain DIFFERENCES #' # Same result, useful if we hadn't already created 'rg' #' # emm2 <- emmeans(pigs.lm, "source", transform = "response") #' #' # Simulate a sample of regression coefficients #' set.seed(2.71828) #' rgb <- regrid(rg, N.sim = 200, transform = "pass") #' emmeans(rgb, "source", type = "response") ## similar to emm1 regrid = function(object, transform = c("response", "mu", "unlink", "none", "pass", links), inv.link.lbl = "response", predict.type, bias.adjust = get_emm_option("back.bias.adj"), sigma, N.sim, sim = mvtnorm::rmvnorm, ...) { links = c("logit", "probit", "cauchit", "cloglog", "log", "log10", "log2", "sqrt", "1/mu^2", "inverse") if (is.logical(transform)) # for backward-compatibility transform = ifelse(transform, "response", "none") else if (is.list(transform)) { userlink = transform transform = "user" } else transform = match.arg(transform) if (is.na(object@post.beta[1]) && !missing(N.sim)) { message("Simulating a sample of size ", N.sim, " of regression coefficients.") object@post.beta = sim(N.sim, object@bhat, object@V) } if (transform == "pass") return(object) # if we have two transformations to undo, do the first one recursively if ((transform == "response") && (!is.null(object@misc$tran2))) object = regrid(object, transform = "mu") # Save post.beta stuff PB = object@post.beta NC = attr(PB, "n.chains") if (!is.na(PB[1])) { # fix up post.beta BEFORE we overwrite parameters PB = PB %*% t(object@linfct) if (".offset." %in% names(object@grid)) PB = t(apply(PB, 1, function(.) . 
+ object@grid[[".offset."]])) } est = .est.se.df(object, do.se = TRUE) ###FALSE) estble = !(is.na(est[[1]])) object@V = vcov(object)[estble, estble, drop = FALSE] object@bhat = est[[1]] object@linfct = diag(1, length(estble)) if (!is.null(disp <- object@misc$display)) { # fix up for the bookkeeping in nested models object@V = object@V[disp, disp, drop = FALSE] object@linfct = matrix(0, nrow = length(disp), ncol = length(estble)) object@linfct[disp, ] = diag(1, length(estble)) } if(all(estble)) object@nbasis = estimability::all.estble else object@nbasis = object@linfct[, !estble, drop = FALSE] # override the df function df = est$df edf = df[estble] if (length(edf) == 0) edf = NA # note both NA/NA and Inf/Inf test is.na() = TRUE prev.df.msg = attr(object@dffun, "mesg") if (any(is.na(edf/edf)) || (diff(range(edf)) < .01)) { # use common value object@dfargs = list(df = mean(edf, na.rm = TRUE)) object@dffun = function(k, dfargs) dfargs$df } else { # use containment df object@dfargs = list(df = df) object@dffun = function(k, dfargs) { idx = which(zapsmall(k) != 0) ifelse(length(idx) == 0, NA, min(dfargs$df[idx], na.rm = TRUE)) } } if(!is.null(prev.df.msg)) attr(object@dffun, "mesg") = ifelse( startsWith(prev.df.msg, "inherited"), prev.df.msg, paste("inherited from", prev.df.msg, "when re-gridding")) if(transform %in% c("response", "mu", "unlink", links, "user") && !is.null(object@misc$tran)) { flink = link = attr(est, "link") if (bias.adjust) { if(missing(sigma)) sigma = object@misc$sigma link = .make.bias.adj.link(link, sigma) if (!is.na(PB[1])) # special frequentist version when sigma is MCMC sample flink = .make.bias.adj.link(flink, mean(sigma)) else flink = link } D = .diag(flink$mu.eta(object@bhat[estble])) object@bhat = flink$linkinv(object@bhat) object@V = D %*% tcrossprod(object@V, D) if (!is.na(PB[1])) PB = matrix(link$linkinv(PB), ncol = ncol(PB)) inm = object@misc$inv.lbl if (!is.null(inm)) { object@misc$estName = inm if (!is.null(object@misc$log.contrast) && object@misc$log.contrast) # relabel ratios for (v in setdiff(object@misc$pri.vars, object@misc$by.vars)) object@grid[[v]] = gsub(" - ", "/", object@grid[[v]]) } if((transform %in% c("mu", "unlink")) && !is.null(object@misc$tran2)) { object@misc$tran = object@misc$tran2 object@misc$tran2 = object@misc$tran.mult = object@misc$tran.offset = object@misc$inv.lbl = NULL } else object@misc$tran = object@misc$tran.mult = object@misc$tran.offset = object@misc$inv.lbl = NULL sigma = object@misc$sigma = NULL } if (transform %in% c(links, "user")) { # fake a transformation link = if (transform == "user") userlink else .make.link(transform) bounds = range(link$linkinv(c(-1e6, -100, -1, 0, 1, 100, 1e6))) nas = which(is.na(object@bhat)) # already NA incl = vincl = which((object@bhat > bounds[1]) & (object@bhat < bounds[2])) if (length(nas) > 0) vincl = which((object@bhat[-nas] > bounds[1]) & (object@bhat[-nas] < bounds[2])) negs = setdiff(seq_along(object@bhat), incl) if (length(negs) > 0) { message("Invalid response predictions are flagged as non-estimable") object@bhat[negs] = NA tmp = seq_along(object@bhat) object@nbasis = sapply(c(nas, negs), function(ii) 0 + (tmp == ii)) } object@bhat = link$linkfun(object@bhat) Vee = object@V if(length(incl) > 0) { D = .diag(1/link$mu.eta(object@bhat[incl])) object@V = D %*% tcrossprod(Vee[vincl, vincl, drop = FALSE], D) } if (!is.na(PB[1])) { PB[PB <= 0] = NA PB = link$linkfun(PB) PB[1] = ifelse(is.na(PB[1]), 0, PB[1]) # make sure 1st elt isn't NA } object@misc$tran = if (transform == "user") link else 
transform
        object@misc$inv.lbl = inv.link.lbl
    }

    if(!is.na(PB[1])) {
        attr(PB, "n.chains") = NC
        object@post.beta = PB
    }

    # Nix out things that are no longer needed or valid
    object@grid$.offset. = object@misc$offset.mult =
        object@misc$estHook = object@misc$vcovHook = NULL
    object@model.info$model.matrix = "Submodels are not available with regridded objects"
    if(!missing(predict.type))
        object = update(object, predict.type = predict.type)
    object
}
emmeans/R/emmip.R0000644000176200001440000005004714147222702013341 0ustar liggesusers##############################################################################
#    Copyright (c) 2012-2017 Russell V. Lenth                                #
#                                                                            #
#    This file is part of the emmeans package for R (*emmeans*)             #
#                                                                            #
#    *emmeans* is free software: you can redistribute it and/or modify      #
#    it under the terms of the GNU General Public License as published by   #
#    the Free Software Foundation, either version 2 of the License, or      #
#    (at your option) any later version.                                    #
#                                                                            #
#    *emmeans* is distributed in the hope that it will be useful,           #
#    but WITHOUT ANY WARRANTY; without even the implied warranty of         #
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the          #
#    GNU General Public License for more details.                           #
#                                                                            #
#    You should have received a copy of the GNU General Public License      #
#    along with R and *emmeans*.  If not, see                               #
#    <https://www.r-project.org/Licenses/> and/or                           #
#    <http://www.gnu.org/licenses/>.                                        #
##############################################################################

# emmip code - interaction plots

#' Interaction-style plots for estimated marginal means
#'
#' Creates an interaction plot of EMMs based on a fitted model and a simple
#' formula specification.
#'
#' @export
emmip = function(object, formula, ...) {
    UseMethod("emmip")
}

# Our one method
#' @rdname emmip
#' @param object An object of class \code{emmGrid}, or a fitted model of a class
#'   supported by the \pkg{emmeans} package
#' @param formula Formula of the form
#'   \code{trace.factors ~ x.factors | by.factors}. The EMMs are
#'   plotted against \code{x.factor} for each level of \code{trace.factors}.
#'   \code{by.factors} is optional, but if present, it determines separate
#'   panels. Each element of this formula may be a single factor in the model,
#'   or a combination of factors using the \code{*} operator.
#' @param type As in \code{\link{predict.emmGrid}}, this determines
#'   whether we want to inverse-transform the predictions
#'   (\code{type = "response"}) or not (any other choice). The default is
#'   \code{"link"}, unless the \code{"predict.type"} option is in force; see
#'   \code{\link{emm_options}}.
#'   In addition, the user may specify \code{type = "scale"} to create a
#'   transformed scale for the vertical axis based on \code{object}'s response
#'   transformation or link function.
#' @param CIs Logical value. If \code{TRUE}, confidence intervals (or HPD intervals
#'   for Bayesian models) are added to the plot
#'   (works only with \code{engine = "ggplot"}).
#' @param PIs Logical value. If \code{TRUE}, prediction intervals are added to the plot
#'   (works only with \code{engine = "ggplot"}). If both \code{CIs} and
#'   \code{PIs} are \code{TRUE}, the prediction intervals will be somewhat
#'   longer, lighter, and thinner than the confidence intervals. Additional
#'   parameters to \code{\link{predict.emmGrid}} (e.g., \code{sigma}) may be passed via
#'   \code{...}. For Bayesian models, PIs require \code{frequentist = TRUE} and
#'   a value for \code{sigma}.
#' @param style Optional character value. This has an effect only when the
#'   horizontal variable is a single numeric variable.
If \code{style} is #' unspecified or \code{"numeric"}, the horizontal scale will be numeric and #' curves are plotted using lines (and no symbols). With \code{style = #' "factor"}, the horizontal variable is treated as the levels of a factor #' (equally spaced along the horizontal scale), and curves are plotted using #' lines and symbols. When the horizontal variable is character or factor, or #' a combination of more than one predictor, \code{"factor"} style is always used. #' @param engine Character value matching \code{"ggplot"} (default), #' \code{"lattice"}, or \code{"none"}. The graphics engine to be used to produce the plot. #' These require, respectively, the \pkg{ggplot2} or \pkg{lattice} package to #' be installed. Specifying \code{"none"} is equivalent to setting \code{plotit = FALSE}. #' @param plotit Logical value. If \code{TRUE}, a graphical object is returned; #' if \code{FALSE}, a data.frame is returned containing all the values #' used to construct the plot. #' @param nesting.order Logical value. If \code{TRUE}, factors that are nested #' are presented in order according to their nesting factors, even if those nesting #' factors are not present in \code{formula}. If \code{FALSE}, only the #' variables in \code{formula} are used to order the variables. #' @param ... Additional arguments passed to \code{\link{emmeans}} (when #' \code{object} is not already an \code{emmGrid} object), #' \code{predict.emmGrid}, #' \code{emmip_ggplot}, or \code{emmip_lattice}. #' #' @section Details: #' If \code{object} is a fitted model, \code{\link{emmeans}} is called with an #' appropriate specification to obtain estimated marginal means for each #' combination of the factors present in \code{formula} (in addition, any #' arguments in \code{\dots} that match \code{at}, \code{trend}, #' \code{cov.reduce}, or \code{fac.reduce} are passed to \code{emmeans}). #' Otherwise, if \code{object} is an \code{emmGrid} object, its first element is #' used, and it must contain one estimate for each combination of the factors #' present in \code{formula}. #' #' @return If \code{plotit = FALSE}, a \code{data.frame} (actually, a #' \code{summary_emm} object) with the table of EMMs that would be plotted. #' The variables plotted are named \code{xvar} and \code{yvar}, and the trace #' factor is named \code{tvar}. This data frame has an added \code{"labs"} #' attribute containing the labels \code{xlab}, \code{ylab}, and \code{tlab} #' for these respective variables. The confidence limits are also #' included, renamed \code{LCL} and \code{UCL}. #' #' @return If \code{plotit = TRUE}, the function #' returns an object of class \code{"ggplot"} or a \code{"trellis"}, depending #' on \code{engine}. #' #' @note Conceptually, this function is equivalent to #' \code{\link{interaction.plot}} where the summarization function is thought #' to return the EMMs. #' #' @seealso \code{\link{emmeans}}, \code{\link{interaction.plot}}. #' @export #' @method emmip default #' #' @examples #' #--- Three-factor example #' noise.lm = lm(noise ~ size * type * side, data = auto.noise) #' #' # Separate interaction plots of size by type, for each side #' emmip(noise.lm, type ~ size | side) #' #' # One interaction plot, using combinations of size and side as the x factor #' # ... 
with added confidence intervals and some formatting changes #' emmip(noise.lm, type ~ side * size, CIs = TRUE, #' linearg = list(linetype = "dashed"), CIarg = list(lwd = 1, alpha = 1)) #' #' # One interaction plot using combinations of type and side as the trace factor #' emmip(noise.lm, type * side ~ size) #' #' # Individual traces in panels #' emmip(noise.lm, ~ size | type * side) #' #' # Example for the 'style' argument #' fib.lm = lm(strength ~ machine * sqrt(diameter), data = fiber) #' fib.rg = ref_grid(fib.lm, at = list(diameter = c(3.5, 4, 4.5, 5, 5.5, 6)^2)) #' emmip(fib.rg, machine ~ diameter) # curves (because diameter is numeric) #' emmip(fib.rg, machine ~ diameter, style = "factor") # points and lines #' #' # For an example using extra ggplot2 code, see 'vignette("messy-data")', #' # in the section on nested models. emmip.default = function(object, formula, type, CIs = FALSE, PIs = FALSE, style, engine = get_emm_option("graphics.engine"), # pch = c(1,2,6,7,9,10,15:20), # lty = 1, col = NULL, plotit = TRUE, nesting.order = FALSE, ...) { engine = match.arg(engine, c("ggplot", "lattice", "none")) if (engine == "ggplot") .requireNS("ggplot2", "The 'ggplot' engine requires the 'ggplot2' package be installed.") else if (engine == "lattice") .requireNS("lattice", "The 'lattice' engine requires the 'lattice' package be installed.") else plotit = FALSE specs = .parse.by.formula(formula) # list of lhs, rhs, by # Glean the parts of ... to use in emmeans call # arguments allowed to be passed lsa.allowed = c("at","trend","cov.reduce","fac.reduce") xargs = list(...) emmopts = list(...) for (arg in names(xargs)) { idx = pmatch(arg, lsa.allowed) if (!is.na(idx)) { opt = lsa.allowed[idx] emmopts[[opt]] = xargs[[arg]] xargs[[arg]] = NULL } } emmopts$object = object emmopts$specs = .reformulate(unlist(specs)) emmo = do.call("emmeans", emmopts) # add possibility of type = "scale". If so, we use "response" and set a flag if(missing(type)) { type = get_emm_option("summary")$predict.type if (is.null(type)) type = .get.predict.type(emmo@misc) } # If we say type = "scale", set it to "response" and set a flag if (nonlin.scale <- (type %.pin% "scale")) type = "response" type = .validate.type(type) emms = summary(emmo, type = type, infer = c(CIs, F)) if(PIs) { prd = predict(emmo, interval = "pred", ...) emms$LPL = prd$lower.PL emms$UPL = prd$upper.PL } # Ensure the estimate is named "yvar" and the conf limits are "LCL" and "UCL" nm = names(emms) tgts = c(attr(emms, "estName"), attr(emms, "clNames")) subs = c("yvar", "LCL", "UCL") for (i in 1:3) names(emms)[nm == tgts[i]] = subs[i] attr(emms, "estName") = "yvar" if(!nesting.order) { # re-order by factor levels actually in plot snm = intersect(nm, unlist(specs)) ord = do.call(order, unname(emms[rev(snm)])) emms = emms[ord, ] } sep = get_emm_option("sep") # Set up trace vars and key tvars = specs$lhs if (one.trace <- (length(tvars) == 0)) { tlab = "" tvars = ".single." emms$.single. 
= 1
    }
    else
        tlab = paste(tvars, collapse = sep)
    tv = do.call(paste, c(unname(emms[tvars]), sep = sep))
    emms$tvar = factor(tv, levels = unique(tv))

    xvars = specs$rhs
    xv = do.call(paste, c(unname(emms[xvars]), sep = sep))
    ltest = max(apply(table(xv, tv), 2, function(x) sum(x > 0)))   # length of longest trace
    if (!missing(style))
        styl = match.arg(style, c("factor", "numeric"))
    if (missing(style) || styl == "numeric")
        styl = ifelse(length(xvars) == 1 &&
                          is.numeric(emms[[xvars]]) &&
                          ltest > 1,
                      "numeric", "factor")
    if (styl == "factor") {
        emms$xvar = factor(xv, levels = unique(xv))
        predicate = "Levels of "
        if (ltest <= 1)
            message("Suggestion: Add 'at = list(", xvars, " = ...)' ",
                    "to call to see > 1 value per group.")
    }
    else {
        emms$xvar = as.numeric(xv)
        predicate = ""
    }
    emms = emms[order(emms$xvar), ]

    byvars = specs$by
    xlab = ifelse(is.null(xargs$xlab),
                  paste0(predicate, paste(xvars, collapse = sep)), xargs$xlab)
    rspLbl = paste("Predicted",
                   ifelse(is.null(emmo@misc$inv.lbl), "response", emmo@misc$inv.lbl))
    ylab = ifelse(is.null(xargs$ylab),
                  ifelse(type == "response", rspLbl, "Linear prediction"),
                  xargs$ylab)

    # remove the unneeded stuff from xargs
    xargs = xargs[setdiff(names(xargs), c("xlab", "ylab"))]
    emms$.single. = NULL   # in case we have that trick column

    attr(emms, "labs") = list(xlab = xlab, ylab = ylab, tlab = tlab)
    attr(emms, "vars") = list(byvars = byvars, tvars = setdiff(tvars, ".single."))

    if (!plotit || engine == "none")
        return(emms)

    fcn = paste("emmip", engine, sep = "_")
    args = c(list(emms = emms, style = styl), xargs)
    if (nonlin.scale)
        args = c(args, list(scale = .make.scale(emmo@misc)))
    do.call(fcn, args)
}

### render emmip using ggplot
#' @rdname emmip
#' @param dodge Numerical amount passed to \code{ggplot2::position_dodge}
#'   by which points and intervals are offset so they do not collide.
#' @param xlab,ylab,tlab Character labels for the horizontal axis, vertical
#'   axis, and traces (the different curves), respectively. The \code{emmip}
#'   function generates these automatically and provides them via the \code{labs}
#'   attribute, but the user may override these if desired.
#' @param facetlab Labeller for facets (when by variables are in play).
#'   Use \code{"label_value"} to show just the factor levels, or \code{"label_both"}
#'   to show both the factor names and factor levels. The default of
#'   \code{"label_context"} decides which based on how many \code{by} factors there are.
#'   See the documentation for \code{ggplot2::label_context}.
#' @param scale If not missing, an object of class \code{scales::trans} specifying
#'   a (usually) nonlinear scaling for the vertical axis. For example,
#'   \code{scale = scales::log_trans()} specifies a logarithmic scale. For
#'   fine-tuning purposes, additional
#'   arguments to \code{ggplot2::scale_y_continuous} may be included in \code{...} .
#' @param dotarg \code{list}
#'   of arguments passed to \code{geom_point} to customize appearance of points
#' @param linearg \code{list}
#'   of arguments passed to \code{geom_line} to customize appearance of lines
#' @param CIarg,PIarg \code{list}s
#'   of arguments passed to \code{geom_linerange} to customize appearance of intervals
#'
#' @section Rendering functions:
#' The functions \code{emmip_ggplot} and \code{emmip_lattice}
#' are called when \code{plotit == TRUE} to render the plots;
#' but they may also be called later on an object saved via \code{plotit = FALSE}
#' (or \code{engine = "none"}).
The functions require that \code{emms} contains variables #' \code{xvar}, \code{yvar}, and \code{tvar}, and attributes \code{"labs"} and \code{"vars"}. #' Confidence intervals are plotted if variables \code{LCL} and \code{UCL} exist; #' and prediction intervals are plotted if \code{LPL} and \code{UPL} exist. #' Finally, it must contain the variables named in \code{attr(emms, "vars")}. #' @examples #' #'### Options with transformations or link functions #' neuralgia.glm <- glm(Pain ~ Treatment * Sex + Age, family = binomial(), #' data = neuralgia) #' #' # On link scale: #' emmip(neuralgia.glm, Treatment ~ Sex) #' #' # On response scale: #' emmip(neuralgia.glm, Treatment ~ Sex, type = "response") #' #' # With transformed axis scale and custom scale divisions #' emmip(neuralgia.glm, Treatment ~ Sex, type = "scale", #' breaks = seq(0.10, 0.90, by = 0.10)) #' @export emmip_ggplot = function(emms, style = "factor", dodge = .1, xlab = labs$xlab, ylab = labs$ylab, tlab = labs$tlab, facetlab = "label_context", scale, dotarg = list(), linearg = list(), CIarg = list(lwd = 2, alpha = .5), PIarg = list(lwd = 1.25, alpha = .33), ...) { labs = attr(emms, "labs") vars = attr(emms, "vars") CIs = !is.null(emms$LCL) PIs = !is.null(emms$LPL) pos = ggplot2::position_dodge(width = ifelse(CIs|PIs, dodge, 0)) # use dodging if CIs dotarg$position = pos linearg$mapping = ggplot2::aes_(group = ~tvar) linearg$position = pos if (length(vars$tvars) > 0) { grobj = ggplot2::ggplot(emms, ggplot2::aes_(x = ~xvar, y = ~yvar, color = ~tvar)) if (style == "factor") grobj = grobj + do.call(ggplot2::geom_point, dotarg) grobj = grobj + do.call(ggplot2::geom_line, linearg) + ggplot2::labs(x = xlab, y = ylab, color = tlab) } else { # just one trace per plot grobj = ggplot2::ggplot(emms, ggplot2::aes_(x = ~xvar, y = ~yvar)) if (style == "factor") grobj = grobj + do.call(ggplot2::geom_point, dotarg) grobj = grobj + do.call(ggplot2::geom_line, linearg) + ggplot2::labs(x = xlab, y = ylab) } if (PIs) { PIarg$mapping = ggplot2::aes_(ymin = ~LPL, ymax = ~UPL) PIarg$position = pos grobj = grobj + do.call(ggplot2::geom_linerange, PIarg) } if (CIs) { CIarg$mapping = ggplot2::aes_(ymin = ~LCL, ymax = ~UCL) CIarg$position = pos grobj = grobj + do.call(ggplot2::geom_linerange, CIarg) } if (length(byvars <- vars$byvars) > 0) { # we have by variables if (length(byvars) > 1) { byform = as.formula(paste(byvars[1], " ~ ", paste(byvars[-1], collapse="*"))) grobj = grobj + ggplot2::facet_grid(byform, labeller = facetlab) } else grobj = grobj + ggplot2::facet_wrap(byvars, labeller = facetlab) } if (!missing(scale)) { args = list(...) pass = names(args) %.pin% names(as.list(args(ggplot2::scale_y_continuous))) args = c(list(trans = scale), args[pass]) grobj = grobj + do.call(ggplot2::scale_y_continuous, args) } grobj } #' @rdname emmip #' @param emms A \code{data.frame} created by calling \code{emmip} with #' \code{plotit = FALSE}. Certain variables and attributes are expected #' to exist in this data frame; see the section detailing the rendering functions. #' @param pch The plotting characters to use for each group (i.e., levels of #' \code{trace.factors}). They are recycled as needed. #' @param lty The line types to use for each group. Recycled as needed. #' @param col The colors to use for each group, recycled as needed. If not #' specified, the default trellis colors are used. #' @export emmip_lattice = function(emms, style = "factor", xlab = labs$xlab, ylab = labs$ylab, tlab = labs$tlab, pch = c(1,2,6,7,9,10,15:20), lty = 1, col = NULL, ...) 
{
    labs = attr(emms, "labs")
    vars = attr(emms, "vars")
    # Set up the strips the way I want them
    my.strip = lattice::strip.custom(strip.names = c(TRUE, TRUE),
                                     strip.levels = c(TRUE, TRUE),
                                     sep = " = ")

    if (length(vars$byvars) == 0)
        plotform = yvar ~ xvar
    else
        plotform = as.formula(paste("yvar ~ xvar |",
                                    paste(vars$byvars, collapse = "*")))
    sep = get_emm_option("sep")
    my.key = function(tvars)
        list(space = "right",
             title = paste(tvars, collapse = sep),
             points = TRUE,
             lines = length(lty) > 1,
             cex.title = 1)
    TP = TP.orig = lattice::trellis.par.get()
    TP$superpose.symbol$pch = pch
    TP$superpose.line$lty = lty
    if (!is.null(col))
        TP$superpose.symbol$col = TP$superpose.line$col = col
    lattice::trellis.par.set(TP)

    plty = if(style == "factor") c("p", "l") else "l"
    plotspecs = list(x = plotform, data = emms, groups = ~ tvar,
                     xlab = xlab, ylab = ylab,
                     strip = my.strip, auto.key = my.key(vars$tvars),
                     type = plty)
    if(length(vars$tvars) == 0)
        plotspecs$auto.key = NULL   # no key when single trace
    grobj = do.call(lattice::xyplot, c(plotspecs, list(...)))
    lattice::trellis.par.set(TP.orig)
    grobj
}
emmeans/R/zzz.R0000644000176200001440000002161214137062735013072 0ustar liggesusers##############################################################################
#    Copyright (c) 2012-2019 Russell V. Lenth                                #
#                                                                            #
#    This file is part of the emmeans package for R (*emmeans*)             #
#                                                                            #
#    *emmeans* is free software: you can redistribute it and/or modify      #
#    it under the terms of the GNU General Public License as published by   #
#    the Free Software Foundation, either version 2 of the License, or      #
#    (at your option) any later version.                                    #
#                                                                            #
#    *emmeans* is distributed in the hope that it will be useful,           #
#    but WITHOUT ANY WARRANTY; without even the implied warranty of         #
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the          #
#    GNU General Public License for more details.                           #
#                                                                            #
#    You should have received a copy of the GNU General Public License      #
#    along with R and *emmeans*.  If not, see                               #
#    <https://www.r-project.org/Licenses/> and/or                           #
#    <http://www.gnu.org/licenses/>.                                        #
##############################################################################

# Just define the function for now. When we get to R version 3.6 or so
# maybe we can require R >= 3.4 (first that has hasName())
# and add utils::hasName to imports (in emmeans-package.R)
### No longer needed as now I require R >= 3.5.0
# hasName = function(x, name)
#     match(name, names(x), nomatch = 0L) > 0L

### NOTE: Revised just after version 1.3.1 release to move CSS file to inst/css
### because devtools and relatives will delete inst/doc without notice!

# NOTE: Excluded from documentation
# Custom Vignette format
#
# This is used to format HTML vignettes the way its developer wants them.
#
# @param ... Arguments passed to \code{rmarkdown::html_document}
#
# @return R Markdown format used by \code{rmarkdown::render}
#' @export
.emm_vignette = function(css = system.file("css", "clean-simple.css", package = "emmeans"),
                         highlight = NULL, ...) {
    rmarkdown::html_document(theme = NULL, highlight = highlight,
                             fig_width = 3, fig_height = 3,
                             css = css, pandoc_args = "", ...)
###                          css = css, pandoc_args = "--strip-comments", ...)
}

### Dynamic registration of S3 methods
# Code borrowed from hms pkg. I omitted some type checks etc. because
# this is only for internal use and I solemnly promise to behave myself.
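# A sketch of the pattern (see .onLoad() below): a call such as
#     register_s3_method("multcomp", "glht", "emmGrid")
# looks up glht.emmGrid in this package's namespace and registers it in
# multcomp's namespace -- immediately if multcomp is already loaded, or via
# the onLoad hook when it is loaded later.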
register_s3_method = function(pkg, generic, class, envir = parent.frame()) {
    fun = get(paste0(generic, ".", class), envir = envir)
    if (isNamespaceLoaded(pkg)) {
        registerS3method(generic, class, fun, envir = asNamespace(pkg))
    }
    # Register hook in case package is later unloaded & reloaded
    setHook(
        packageEvent(pkg, "onLoad"),
        function(...) {
            registerS3method(generic, class, fun, envir = asNamespace(pkg))
        }
    )
}

.onLoad = function(libname, pkgname) {
    if (.requireNS("coda", fail = .nothing)) {
        register_s3_method("coda", "as.mcmc", "emmGrid")
        register_s3_method("coda", "as.mcmc.list", "emmGrid")
    }
    if (.requireNS("multcomp", fail = .nothing)) {
        register_s3_method("multcomp", "glht", "emmlf")
        register_s3_method("multcomp", "glht", "emmGrid")
        register_s3_method("multcomp", "cld", "emmGrid")
        register_s3_method("multcomp", "cld", "emm_list")
        register_s3_method("multcomp", "modelparm", "emmwrap")
    }
}

# .onAttach <- function(libname, pkgname) {
#     packageStartupMessage("Welcome to emmeans.\n",
#         "NOTE -- Important change from versions <= 1.41:\n",
#         "    Indicator predictors are now treated as 2-level factors by default.\n",
#         "    To revert to old behavior, use emm_options(cov.keep = character(0))")
# }

#' @rdname extending-emmeans
#' @section Registering S3 methods for a model class:
#' The \code{.emm_register} function is provided as a convenience to conditionally
#' register your
#' S3 methods for a model class, \code{recover_data.foo} and \code{emm_basis.foo},
#' where \code{foo} is the class name. Your package should implement an
#' \code{.onLoad} function and call \code{.emm_register} if \pkg{emmeans} is
#' installed. See the example.
#'
#' @param classes Character names of one or more classes to be registered.
#'   The package must contain the functions \code{recover_data.foo} and
#'   \code{emm_basis.foo} for each class \code{foo} listed in \code{classes}.
#' @param pkgname Character name of package providing the methods (usually
#'   should be the second argument of \code{.onLoad})
#'
#' @export
#'
#' @examples
#' \dontrun{
#' #--- If your package provides recover_data and emm_basis methods for class 'mymod',
#' #--- put something like this in your package code -- say in zzz.R:
#' .onLoad = function(libname, pkgname) {
#'     if (requireNamespace("emmeans", quietly = TRUE))
#'         emmeans::.emm_register("mymod", pkgname)
#' }
#' }
.emm_register = function(classes, pkgname) {
    envir = asNamespace(pkgname)
    for (class in classes) {
        register_s3_method("emmeans", "recover_data", class, envir)
        register_s3_method("emmeans", "emm_basis", class, envir)
    }
}

## Here is a utility that we won't export, but can help clean out lsmeans
## stuff from one's workspace, and unload unnecessary junk
convert_workspace = function(envir = .GlobalEnv) {
    if (exists(".Last.ref.grid", envir = envir)) {
        cat("Deleted .Last.ref.grid\n")
        remove(".Last.ref.grid", envir = envir)
    }
    for (nm in names(envir)) {
        obj <- get(nm, envir = envir)
        if (is(obj, "ref.grid")) {
            cat(paste("Converted", nm, "to class 'emmGrid'\n"))
            assign(nm, as.emmGrid(obj), envir = envir)
        }
    }
    if ("package:lsmeans" %in% search())
        detach("package:lsmeans")
    if ("lsmeans" %in% loadedNamespaces())
        unloadNamespace("lsmeans")
    message("The environment has been converted and lsmeans's namespace is unloaded.\n",
            "Now you probably should save it.")
}
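# Usage sketch (comments only; convert_workspace() is not exported, so it must
# be reached via ':::', and it assumes old lsmeans 'ref.grid' objects exist):
#     emmeans:::convert_workspace()   # converts objects in .GlobalEnv, then
#                                     # unloads the lsmeans namespace

## Here is a non-exported utility to convert .R and .Rmd files
## It's entirely menu-driven.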
convert_scripts = function() { infiles = utils::choose.files( caption = "Select R script(s) or markdown file(s) to be converted", multi = TRUE) lsm.to.emmGrid = utils::menu(c("yes", "no"), graphics = TRUE, "lsmxxx() -> emmxxx()?") == 1 pmm.to.emmGrid = utils::menu(c("yes", "no"), graphics = TRUE, "pmmxxx() -> emmxxx()?") == 1 for (infile in infiles) { buffer = scan(infile, what = character(0), sep = "\n", blank.lines.skip = FALSE) buffer = gsub("library *\\(\"*'*lsmeans\"*'*\\)", "library(\"emmeans\")", buffer) buffer = gsub("require *\\(\"*'*lsmeans\"*'*\\)", "require(\"emmeans\")", buffer) buffer = gsub("lsmeans::", "emmeans::", buffer) buffer = gsub("ref\\.grid *\\(", "ref_grid(", buffer) opt.idx = grep("lsm\\.option", buffer) if (length(opt.idx) > 0) { buffer[opt.idx] = gsub("ref\\.grid", "ref_grid", buffer[opt.idx]) buffer[opt.idx] = gsub("lsmeans", "emmeans", buffer[opt.idx]) buffer[opt.idx] = gsub("lsm\\.options *\\(", "emm_options(", buffer[opt.idx]) buffer[opt.idx] = gsub("get\\.lsm\\.option *\\(", "get_emm_option(", buffer[opt.idx]) } buffer = gsub("\\.lsmc", ".emmc", buffer) if (lsm.to.emmGrid) { buffer = gsub("lsmeans *\\(", "emmeans(", buffer) buffer = gsub("lsmip *\\(", "emmip(", buffer) buffer = gsub("lstrends *\\(", "emtrends(", buffer) buffer = gsub("lsm *\\(", "emmGrid(", buffer) buffer = gsub("lsmobj *\\(", "emmobj(", buffer) } if (pmm.to.emmGrid) { buffer = gsub("pmmeans *\\(", "emmeans(", buffer) buffer = gsub("pmmip *\\(", "emmip(", buffer) buffer = gsub("pmtrends *\\(", "emtrends(", buffer) buffer = gsub("pmm *\\(", "emmGrid(", buffer) buffer = gsub("pmmobj *\\(", "emmobj(", buffer) } outfile = file.path(dirname(infile), sub("\\.", "-emm.", basename(infile))) write(buffer, outfile) cat(paste(infile, "\n\twas converted to\n", outfile, "\n")) } } emmeans/R/ordinal-support.R0000644000176200001440000004151314147510540015372 0ustar liggesusers############################################################################## # Copyright (c) 2012-2016 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## ### support for the ordinal package recover_data.clm = function(object, mode = "latent", ...) { if (mode %.pin% "scale") { ###(!is.na(pmatch(mode, "scale"))) { if (is.null(trms <- object$S.terms)) return("Specified mode=\"scale\", but no scale model is present") # ref_grid's error handler takes it from here recover_data(object$call, trms, object$na.action, ...) } else if (is.null(object$S.terms) && is.null(object$nom.terms)) recover_data.lm(object, ...) else { # bring-in predictors from loc, scale, and nom models trms = delete.response(object$terms) x.preds = union(.all.vars(object$S.terms), .all.vars(object$nom.terms)) x.trms = terms(update(trms, .reformulate(c(".", x.preds)))) recover_data(object$call, x.trms, object$na.action, ...) 
} } # For now at least, clmm doesn't cover scale, nominal options recover_data.clmm = function(object, ...) recover_data.lm(object, ...) # Note: For ALL thresholds, object$Theta has all the threshold values # for the different cuts (same as object$alpha when threshold=="flexible") # and object$tJac is s.t. tJac %*% alpha = Theta # Note also that some functions of cut are constrained to be zero when # threshold != "flexible". Can get basis using nonest.basis(t(tJac)) # # opt arg 'mode' - determines what goes into ref_grid # 'rescale' - (loc, scale) for linear transformation of latent result emm_basis.clm = function (object, trms, xlev, grid, mode = c("latent", "linear.predictor", "cum.prob", "exc.prob", "prob", "mean.class", "scale"), rescale = c(0,1), ...) { # general stuff mode = match.arg(mode) if (mode == "scale") return (.emm_basis.clm.scale(object, trms, xlev, grid, ...)) # if (is.null(object$contrasts)) # warning("Contrasts used to fit the model are unknown.\n", # "Defaulting to system option, but results may be wrong.") bhat = coef(object) V = .my.vcov(object, ...) tJac = object$tJac dffun = function(...) Inf link = as.character(object$info$link) cnm = dimnames(object$tJac)[[1]] if (is.null(cnm)) cnm = paste(seq_len(nrow(tJac)), "|", 1 + seq_len(nrow(tJac)), sep = "") misc = list() # My strategy is to piece together the needed matrices for each threshold parameter # Then assemble the results ### ----- Location part ----- ### contrasts = object$contrasts # Remember trms was trumped-up to include scale and nominal predictors. # Recover the actual terms for the principal model trms = delete.response(object$terms) m = model.frame(trms, grid, na.action = na.pass, xlev = object$xlevels) X = model.matrix(trms, m, contrasts.arg = contrasts) # Need following code because clmm objects don't have NAs for dropped columns... nms.needed = c(names(object$alpha), setdiff(colnames(X), "(Intercept)")) if (length(setdiff(nms.needed, bnm <- names(bhat))) > 0) { bhat = seq_along(nms.needed) * NA names(bhat) = nms.needed bhat[bnm] = coef(object) object$coefficients = bhat # will be needed by model.matrix object$beta = bhat[setdiff(nms.needed, names(object$alpha))] } xint = match("(Intercept)", colnames(X), nomatch = 0L) if (xint > 0L) { X = X[, -xint, drop = FALSE] } ### ----- Nominal part ----- ### if (is.null(object$nom.terms)) NOM = matrix(1, nrow = nrow(X)) else { mn = model.frame(object$nom.terms, grid, na.action = na.pass, xlev = object$nom.xlevels) NOM = model.matrix(object$nom.terms, mn, contrasts.arg = object$nom.contrasts) } bigNom = kronecker(tJac, NOM) # cols are in wrong order... 
I'll get the indexes by transposing a matrix of subscripts if (ncol(NOM) > 1) bigNom = bigNom[, as.numeric(t(matrix(seq_len(ncol(bigNom)), nrow=ncol(NOM))))] ### ----- Scale part ----- ### if (!is.null(object$S.terms)) { ms = model.frame(object$S.terms, grid, na.action = na.pass, xlev = object$S.xlevels) S = model.matrix(object$S.terms, ms, contrasts.arg = object$S.contrasts) S = S[, names(object$zeta), drop = FALSE] if (!is.null(attr(object$S.terms, "offset"))) { soff = .get.offset(object$S.terms, grid) # we'll add a column to S and adjust bhat and V accordingly S = cbind(S, offset = soff) bhat = c(bhat, offset = 1) V = rbind(cbind(V, offset = 0), offset = 0) } si = misc$scale.idx = length(object$alpha) + length(object$beta) + seq_len(ncol(S)) # Make sure there are no name clashes names(bhat)[si] = paste(".S", names(object$zeta), sep=".") misc$estHook = ".clm.estHook" misc$vcovHook = ".clm.vcovHook" } else S = NULL ### ----- Get non-estimability basis ----- ### nbasis = snbasis = estimability::all.estble if (any(is.na(bhat))) { obj = object # work around fact that model.matrix.clmm doesn't work class(obj) = "clm" mm = try(model.matrix(obj), silent = TRUE) if (inherits(mm, "try-error")) stop("Currently, it is not possible to construct a reference grid for this\n", "object, because it is rank-deficient and no model matrix is available.") # note: mm has components X, NOM, and S if (any(is.na(c(object$alpha, object$beta)))) { NOMX = if (is.null(mm$NOM)) mm$X else cbind(mm$NOM, mm$X[, -1]) nbasis = estimability::nonest.basis(NOMX) # replicate and reverse the sign of the NOM parts nomcols = seq_len(ncol(NOM)) nbasis = apply(nbasis, 2, function(x) c(rep(-x[nomcols], each = nrow(NOM)), x[-nomcols])) } if (!is.null(mm$S)) { if (any(is.na(object$zeta))) { snbasis = estimability::nonest.basis(mm$S) # put intercept part at end snbasis = rbind(snbasis[-1, , drop=FALSE], snbasis[1, ]) if (!is.null(attr(object$S.terms, "offset"))) snbasis = rbind(snbasis, 0) snbasis = rbind(matrix(0, ncol=ncol(snbasis), nrow=min(si)-1), snbasis) # Note scale intercept is included, so tack it on to the end of everything S = cbind(S, .S.intcpt = 1) bhat = c(bhat, .S.intcpt = 0) V = rbind(cbind(V, .S.intcpt = 0), .S.intcpt = 0) si = misc$scale.idx = c(si, 1 + max(si)) } } if (is.na(nbasis[1])) # then only nonest part is scale nbasis = snbasis else { if (!is.null(S)) # pad nbasis with zeros when there's a scale model nbasis = rbind(nbasis, matrix(0, nrow=length(si), ncol=ncol(nbasis))) if (!is.na(snbasis[1])) nbasis = cbind(nbasis, snbasis) } } if (mode == "latent") { # Create constant columns for means of scale and nominal parts J = matrix(1, nrow = nrow(X)) nomm = rescale[2] * apply(bigNom, 2, mean) X = rescale[2] * X if (!is.null(S)) { sm = apply(S, 2, mean) X = cbind(X, kronecker(-J, matrix(sm, nrow = 1))) } bigX = cbind(kronecker(-J, matrix(nomm, nrow = 1)), X) misc$offset.mult = misc$offset.mult * rescale[2] intcpt = seq_len(ncol(tJac)) bhat[intcpt] = bhat[intcpt] - rescale[1] / rescale[2] } else { ### ----- Piece together big matrix for each threshold ----- ### misc$ylevs = list(cut = cnm) # support for links not in make.link if (is.character(link) && !(link %in% c("logit", "probit", "cauchit", "cloglog"))) { setLinks = get("setLinks", asNamespace("ordinal")) env = new.env() setLinks(env, link) link = list(linkfun = quote(stop), linkinv=env$pfun, mu.eta = env$dfun, name = env$link, lambda = env$lambda) } misc$tran = link misc$inv.lbl = "cumprob" misc$offset.mult = -1 if (!is.null(S)) X = cbind(X, S) J = matrix(1, 
nrow=nrow(tJac)) bigX = cbind(bigNom, kronecker(-J, X)) if (mode != "linear.predictor") { misc$mode = mode misc$respName = as.character.default(object$terms)[2] misc$postGridHook = ".clm.postGrid" } } dimnames(bigX)[[2]] = names(bhat) list(X = bigX, bhat = bhat, nbasis = nbasis, V = V, dffun = dffun, dfargs = list(), misc = misc) } # function called at end of ref_grid # I use this for polr as well # Also used for stanreg result of stan_polr & potentially other MCMC ordinal models .clm.postGrid = function(object, ...) { mode = object@misc$mode object@misc$postGridHook = object@misc$mode = NULL object = regrid(object, transform = "response", ...) if(object@misc$estName == "exc.prob") { # back-transforming yields exceedance probs object@bhat = 1 - object@bhat if(!is.null(object@post.beta[1])) object@post.beta = 1 - object@post.beta object@misc$estName = "cum.prob" } if (mode == "prob") { object = .clm.prob.grid(object, ...) } else if (mode == "mean.class") { object = .clm.mean.class(object, ...) } else if (mode == "exc.prob") { object@bhat = 1 - object@bhat if(!is.null(object@post.beta[1])) object@post.beta = 1 - object@post.beta object@misc$estName = "exc.prob" } # (else mode == "cum.prob" and it's all OK) object@misc$respName = NULL # cleanup object } # Make the linear-predictor ref_grid into one for class probabilities # This assumes that object has already been re-gridded and back-transformed .clm.prob.grid = function(object, thresh = "cut", newname = object@misc$respName, ...) { byv = setdiff(names(object@levels), thresh) newrg = contrast(object, ".diff_cum", by = byv, ...) newrg@grid$.offset. = (apply(newrg@linfct, 1, sum) < 0) + 0 if (!is.null(wgt <- object@grid[[".wgt."]])) { km1 = length(object@levels[[thresh]]) wgt = wgt[seq_len(length(wgt) / km1)] # unique weights for byv combs newrg = force_regular(newrg) key = do.call(paste, object@grid[byv])[seq_along(wgt)] tgt = do.call(paste, newrg@grid[byv]) for (i in seq_along (wgt)) newrg@grid[[".wgt."]][tgt == key[i]] = wgt[i] } # proceed to disavow that this was ever exposed to 'emmeans' or 'contrast' ## class(newrg) = "ref.grid" misc = newrg@misc if(!is.null(misc$display) && all(misc$display)) misc$display = NULL misc$is.new.rg = TRUE misc$infer = c(FALSE,FALSE) misc$estName = "prob" misc$pri.vars = misc$by.vars = misc$con.coef = misc$orig.grid = NULL newrg@misc = misc conid = which(names(newrg@levels) == "contrast") names(newrg@levels)[conid] = names(newrg@grid)[conid] = newname newrg@roles = object@roles newrg@roles$multresp = newname newrg } # special 'contrast' fcn used by .clm.mean.class .meanclass.emmc = function(levs, lf, ...) data.frame(mean = lf) .clm.mean.class = function(object, ...) { prg = .clm.prob.grid(object, newname = "class", ...) byv = setdiff(names(prg@levels), "class") lf = as.numeric(prg@levels$class) newrg = contrast(prg, ".meanclass", lf = lf, by = byv, ...) newrg = update(newrg, infer = c(FALSE, FALSE), pri.vars = NULL, by.vars = NULL, estName = "mean.class") newrg@levels$contrast = newrg@grid$contrast = NULL prg@roles$multresp = NULL newrg@roles = prg@roles ## class(newrg) = "ref.grid" update(force_regular(newrg), is.new.rg = TRUE) } # Contrast fcn for turning estimates of cumulative probabilities # into cell probabilities .diff_cum.emmc = function(levs, sep = "|", ...) 
{ plevs = unique(setdiff(unlist(strsplit(levs, sep, TRUE)), sep)) k = 1 + length(levs) if (length(plevs) != k) plevs = seq_len(k) M = matrix(0, nrow = length(levs), ncol = k) for (i in seq_along(levs)) M[i, c(i,i+1)] = c(1,-1) dimnames(M) = list(levs, plevs) M = as.data.frame(M) attr(M, "desc") = "Differences of cumulative probabilities" attr(M, "adjust") = "none" attr(M, "offset") = c(rep(0, k-1), 1) M } #### replacement estimation routines for cases with a scale param ## workhorse for estHook and vcovHook functions .clm.hook = function(object, tol = 1e-8, ...) { scols = object@misc$scale.idx bhat = object@bhat active = !is.na(bhat) bhat[!active] = 0 linfct = object@linfct estble = estimability::is.estble(linfct, object@nbasis, tol) ###apply(linfct, 1, .is.estble, object@nbasis, tol) estble[!estble] = NA rsigma = estble * as.numeric(linfct[, scols, drop = FALSE] %*% object@bhat[scols]) rsigma = exp(rsigma) * estble # I'll do the scaling later eta = as.numeric(linfct[, -scols, drop = FALSE] %*% bhat[-scols]) if (!is.null(object@grid$.offset.)) eta = eta + object@grid$.offset. for (j in scols) linfct[, j] = eta * linfct[, j] linfct = (.diag(rsigma) %*% linfct) [, active, drop = FALSE] list(est = eta * rsigma, V = linfct %*% tcrossprod(object@V, linfct)) } .clm.estHook = function(object, do.se = TRUE, tol = 1e-8, ...) { raw.matl = .clm.hook(object, tol, ...) SE = if (do.se) sqrt(diag(raw.matl$V)) else NA cbind(est = raw.matl$est, SE = SE, df = Inf) } .clm.vcovHook = function(object, tol = 1e-8, ...) { .clm.hook(object, tol, ...)$V } ### Special emm_basis fcn for the scale model .emm_basis.clm.scale = function(object, trms, xlev, grid, ...) { m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) X = model.matrix(trms, m, contrasts.arg = object$S.contrasts) bhat = c(`(intercept)` = 0, object$zeta) nbasis = estimability::all.estble if (any(is.na(bhat))) nbasis = estimability::nonest.basis(model.matrix(object)$S) k = sum(!is.na(bhat)) - 1 V = .my.vcov(object, ...) pick = nrow(V) - k + seq_len(k) V = V[pick, pick, drop = FALSE] V = cbind(0, rbind(0,V)) misc = list(tran = "log") list(X = X, bhat = bhat, nbasis = nbasis, V = V, dffun = function(...) Inf, dfargs = list(), misc = misc) } emm_basis.clmm = function (object, trms, xlev, grid, ...) { if(is.null(object$Hessian)) { message("Updating the model to obtain the Hessian...") object = update(object, Hess = TRUE) } # borrowed from Maxime's code -- need to understand this better, e.g. when it happens H = object$Hessian if (any(apply(object$Hessian, 1, function(x) all(x == 0)))) { H = H[names(coef(object)), names(coef(object))] object$Hessian = H } result = emm_basis.clm(object, trms, xlev, grid, ...) # strip off covariances of random effects keep = seq_along(result$bhat[!is.na(result$bhat)]) result$V = result$V[keep,keep] result } emmeans/R/aovlist-support.R0000644000176200001440000002070314137062735015430 0ustar liggesusers############################################################################## # Copyright (c) 2012-2016 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. 
# # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## # emmeans support for aovlist objects #' @export recover_data.aovlist = function(object, ...) { fcall = match.call(aov, attr(object, "call")) # matches even if called via `manova()` trms = terms(object) # Find the Error terms lbls = attr(trms, "term.labels") err.idx = grep("^Error\\(", lbls) newf = as.formula(paste(c(".~.", lbls[err.idx]), collapse = "-")) trms = terms(update(trms, newf)) dat = recover_data(fcall, delete.response(trms), na.action = attr(object, "na.action"), ...) attr(dat, "pass.it.on") = TRUE dat } # This works great for balanced experiments, and goes horribly wrong # even for slightly unbalanced ones. So I abort on these kinds of cases #' @export emm_basis.aovlist = function (object, trms, xlev, grid, vcov., ...) { m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) contr = attr(object, "contrasts") non.orth = sapply(contr, function(x) { !is.character(x) || !(x %in% c("contr.helmert", "contr.poly", "contr.sum")) }) if (any(non.orth)) { # refit with sum-to-zero contrasts contr[non.orth] = "contr.sum" message("Note: re-fitting model with sum-to-zero contrasts") cl = attr(object, "call") cl$contrasts = contr object = eval(cl) } X = model.matrix(trms, m, contrasts.arg = contr) xnms = dimnames(X)[[2]] # Check for situations we can't handle... colsums = apply(X[, setdiff(xnms, "(Intercept)"), drop=FALSE], 2, sum) if (any(round(colsums,3) != 0)) warning("Some predictors are correlated with the intercept - results may be very biased") if (length(unlist(lapply(object, function(x) names(coef(x))))) > length(xnms)) message("NOTE: Results are based on intra-block estimates and are biased.") # initialize arrays nonint = setdiff(names(object), "(Intercept)") k = npar = length(xnms) bhat1 = rep(NA, k) # I'll use NAs in 1st dim of bhat to track which slots I've filled # check for multivariate response m = ifelse (is.matrix(coefm <- object[[1]]$coefficients), ncol(coefm), 1) bhat = rep(NA, k*m) zmm1 = seq_len(m) - 1 # seq 0 : (m-1) #### utility functions... # return indices of as.numeric(x[i, ]) where x has nr rows indx = function(i, nr = npar) as.numeric(sapply(zmm1, function(j) nr * j + i)) # get names or rownames mynames = function(x) if (m == 1) names(x) else rownames(x) V = matrix(0, nrow = k*m, ncol = k*m) names(bhat1) = xnms allxnms = xnms if (m > 1) { ylevs = colnames(coefm) allxnms = as.character(sapply(ylevs, function(.) 
paste0(xnms, .))) } names(bhat) = allxnms dimnames(V) = list(allxnms, allxnms) empty.list = as.list(nonint) names(empty.list) = nonint Vmats = Vidx = Vdf = empty.list wts = matrix(0, nrow = length(nonint), ncol = k*m) dimnames(wts) = list(nonint, allxnms) # NOTE: At present, I just do intra-block analysis: wts are all 0 and 1 btemp = bhat1 #++ temp for tracking indexes #++Work thru strata in reverse order for (nm in rev(nonint)) { x = object[[nm]] if (m > 1) class(x) = c("mlm", "lm") # because vcov.aov is NOT suitable bi = coef(x) rn = mynames(bi) nr = length(rn) idx = which(!is.na(bi[seq_along(rn)])) bi = bi[indx(idx, nr)] ii = match(rn[idx], xnms) use = setdiff(ii, which(!is.na(bhat1))) #++ omit elts already filled if(length(use) > 0) { ii.left = seq_along(ii)[!is.na(match(ii,use))] wts[nm, indx(use)] = 1 bhat1[use] = bi[ii.left] allii.left = indx(ii.left, nr) alluse = Vidx[[nm]] = indx(use) bhat[alluse] = bi[allii.left] # following is OK now that we have class(x) = "mlm" Vi = vcov(x, complete = FALSE)[allii.left, allii.left, drop = FALSE] Vmats[[nm]] = Vi V[alluse, alluse] = Vi } else { Vmats[[nm]] = matrix(0, nrow=0, ncol=0) Vidx[[nm]] = integer(0) } # Any cases with 0 df will have NaN for covariances. I make df = -1 # in those cases so I don't divide by 0 later in Satterthwaite calcs Vdf[[nm]] = ifelse(x$df > 0, x$df, -1) } x <- object[["(Intercept)"]] if (!is.null(x)) { # The intercept belongs in the 1st error stratum # So have to add a row and column to its covariance matrix idx1 = indx(1) bhat[idx1] = as.numeric(coef(x)) wts[1, idx1] = 1 x = object[[nonint[1]]] # 1st non-intercept stratum Vidx[[1]] = ii = sort(c(idx1, Vidx[[1]])) k = length(ii) vv = matrix(0, nrow = k, ncol = k) i2k = indx(2:(k / m), k / m) if (k > m) vv[i2k, i2k] = Vmats[[1]] # Variance of intercept is EMS of this stratum divided by N # Here I'm assuming there are no weights N = sum(sapply(object, function(x) length(x$residuals))) / m i1 = indx(1, k / m) if (m > 1) V[idx1, idx1] = vv[i1,i1] = estVar(x) / N else V[1,1] = vv[1,1] = sum(resid(x)^2) / x$df / N #dimnames(vv) = list(c(xnms[ii], xnms[ii])) Vmats[[1]] = vv } # override V if vcov. is supplied if(!missing(vcov.)) { V = .my.vcov(object, vcov.) dfargs = list() dffun = function(k, dfargs) Inf } else { dfargs = list(Vmats=Vmats, Vidx=Vidx, Vdf=unlist(Vdf), wts = wts) dffun = function(k, dfargs) { emmeans::.aovlist.dffun(k, dfargs) } } nbasis = estimability::all.estble # Consider this further? 
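# (As defined in the estimability package, all.estble is a trivial 1 x 1 NA
#  matrix that is.estble() interprets as "everything is estimable."  That is
#  defensible here only under the balanced-design assumption flagged by the
#  initMesg warning just below.)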
misc = list(initMesg = "Warning: EMMs are biased unless design is perfectly balanced")
    if (m > 1) {
        misc$ylevs = list(rep.meas = ylevs)
        X = kronecker(diag(1, m), X)
    }
    # submodel support
    mm = NULL
    if(!is.null(dat <- attr(object, "data"))) {
        m = model.frame(trms, dat, na.action = na.pass, xlev = xlev)
        mm = model.matrix(trms, m, contrasts.arg = contr)
        mm = .cmpMM(mm, assign = attr(mm, "assign"))
    }
    list(X = X, bhat = bhat, nbasis = nbasis, V = V,
         dffun = dffun, dfargs = dfargs, misc = misc, model.matrix = mm)
}

#' @export
.aovlist.dffun = function(k, dfargs) {
    if(is.matrix(k) && (nrow(k) > 1)) {
        dfs = apply(k, 1, .aovlist.dffun, dfargs)
        min(dfs)
    }
    else {
        v = sapply(seq_along(dfargs$Vdf), function(j) {
            ii = dfargs$Vidx[[j]]
            kk = (k * dfargs$wts[j, ])[ii]
            #sum(kk * .mat.times.vec(dfargs$Vmats[[j]], kk))
            .qf.non0(dfargs$Vmats[[j]], kk)
        })
        sum(v)^2 / sum(v^2 / dfargs$Vdf)   # Good ole Satterthwaite
    }
}
emmeans/R/helpers.R0000644000176200001440000010025014162173143013665 0ustar liggesusers##############################################################################
#    Copyright (c) 2012-2017 Russell V. Lenth                                #
#                                                                            #
#    This file is part of the emmeans package for R (*emmeans*)             #
#                                                                            #
#    *emmeans* is free software: you can redistribute it and/or modify      #
#    it under the terms of the GNU General Public License as published by   #
#    the Free Software Foundation, either version 2 of the License, or      #
#    (at your option) any later version.                                    #
#                                                                            #
#    *emmeans* is distributed in the hope that it will be useful,           #
#    but WITHOUT ANY WARRANTY; without even the implied warranty of         #
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the          #
#    GNU General Public License for more details.                           #
#                                                                            #
#    You should have received a copy of the GNU General Public License      #
#    along with R and *emmeans*.  If not, see                               #
#    <https://www.r-project.org/Licenses/> and/or                           #
#    <https://www.gnu.org/licenses/>.                                       #
##############################################################################

### Helper functions for emmeans
### Here we have 'recover_data' and 'emm_basis' methods
### For models that this package supports.

#--------------------------------------------------------------
###  lm objects (and also aov, rlm, others that inherit) -- but NOT aovList
###  Recent additional argument 'frame' should point to where the model frame
###  might be available, or NULL otherwise
#' @method recover_data lm
#' @export
recover_data.lm = function(object, frame = object$model, ...) {
    fcall = object$call
    recover_data(fcall, delete.response(terms(object)), object$na.action,
                 frame = frame, ...)
}

#' @export
emm_basis.lm = function(object, trms, xlev, grid, ...) {
    # coef() works right for lm but coef.aov tosses out NAs
    bhat = object$coefficients
    nm = if(is.null(names(bhat))) row.names(bhat) else names(bhat)
    m = suppressWarnings(model.frame(trms, grid, na.action = na.pass, xlev = xlev))
    X = model.matrix(trms, m, contrasts.arg = object$contrasts)
    assign = attr(X, "assign")
    X = X[, nm, drop = FALSE]
    bhat = as.numeric(bhat)   # stretches it out if multivariate - see mlm method
    V = .my.vcov(object, ...)
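    # Side note on the estimability check that follows: a rank-deficient fit
    # leaves NA coefficients, and nonest.basis(object$qr) yields the null basis
    # used later to flag non-estimable predictions.  A minimal sketch (not run;
    # the data frame here is hypothetical):
    #   dat <- data.frame(y = rnorm(4),
    #                     A = factor(c("a", "a", "b", "b")),
    #                     B = factor(c("x", "x", "y", "y")))   # B aliased with A
    #   fit <- lm(y ~ A + B, data = dat)                       # coef for B is NA
    #   nb  <- estimability::nonest.basis(fit$qr)              # same idiom as below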
if (sum(is.na(bhat)) > 0) nbasis = estimability::nonest.basis(object$qr) else nbasis = estimability::all.estble misc = list() if (inherits(object, "glm")) { misc = .std.link.labels(object$family, misc) dffun = function(k, dfargs) dfargs$df dfargs = list(df = ifelse(object$family$family %in% c("gaussian", "Gamma"), object$df.residual, Inf)) } else { dfargs = list(df = object$df.residual) dffun = function(k, dfargs) dfargs$df } list(X=X, bhat=bhat, nbasis=nbasis, V=V, dffun=dffun, dfargs=dfargs, misc=misc, model.matrix = .cmpMM(object$qr, assign = assign)) } #-------------------------------------------------------------- ### mlm objects # (recover_data.lm works just fine) #' @export emm_basis.mlm = function(object, trms, xlev, grid, ...) { class(object) = c("mlm", "lm") # avoids error in vcov for "maov" objects bas = emm_basis.lm(object, trms, xlev, grid, ...) bhat = coef(object) k = ncol(bhat) bas$X = kronecker(diag(rep(1,k)), bas$X) bas$nbasis = kronecker(rep(1,k), bas$nbasis) ylevs = dimnames(bhat)[[2]] if (is.null(ylevs)) ylevs = seq_len(k) bas$misc$ylevs = list(rep.meas = ylevs) bas } #---------------------------------------------------------- # manova objects recover_data.manova = function(object, ...) { fcall = match.call(aov, object$call) # need to borrow arg matching from aov() recover_data(fcall, delete.response(terms(object)), object$na.action, frame = object$model, ...) } #-------------------------------------------------------------- ### merMod objects (lme4 package) #' @export recover_data.merMod = function(object, ...) { if(!lme4::isLMM(object) && !lme4::isGLMM(object)) return("Can't handle a nonlinear mixed model") fcall = object@call recover_data(fcall, delete.response(terms(object)), attr(object@frame, "na.action"), frame = object@frame, ...) } #' @export emm_basis.merMod = function(object, trms, xlev, grid, vcov., mode = get_emm_option("lmer.df"), lmer.df, disable.pbkrtest = get_emm_option("disable.pbkrtest"), pbkrtest.limit = get_emm_option("pbkrtest.limit"), disable.lmerTest = get_emm_option("disable.lmerTest"), lmerTest.limit = get_emm_option("lmerTest.limit"), options, ...) { if (missing(vcov.)) V = as.matrix(vcov(object, correlation = FALSE)) else V = as.matrix(.my.vcov(object, vcov.)) dfargs = misc = list() if (lme4::isLMM(object)) { # Allow lmer.df in lieu of mode if (!missing(lmer.df)) mode = lmer.df mode = match.arg(tolower(mode), c("satterthwaite", "kenward-roger", "asymptotic")) # if we're gonna override the df anyway, keep it simple # OTOH, if K-R, documentation promises we'll adjust V if (!is.null(options$df) && (mode != "kenward-roger")) mode = "asymptotic" # set flags objN = lme4::getME(object, "N") tooBig.k = (objN > pbkrtest.limit) tooBig.s = (objN > lmerTest.limit) tooBigMsg = function(pkg, limit) { message("Note: D.f. 
calculations have been", " disabled because the number of observations exceeds ", limit, ".\n", "To enable adjustments, add the argument '", pkg, ".limit = ", objN, "' (or larger)\n", "[or, globally, 'set emm_options(", pkg, ".limit = ", objN, ")' or larger];\n", "but be warned that this may result in large computation time and memory use.") } # pick the lowest-hanging apples first if (mode == "kenward-roger") { if (disable.pbkrtest || tooBig.k || !.requireNS("pbkrtest", "Cannot use mode = \"kenward-roger\" because *pbkrtest* package is not installed", fail = message)) mode = "satterthwaite" if (!disable.pbkrtest && tooBig.k) tooBigMsg("pbkrtest", pbkrtest.limit) } if (mode == "satterthwaite") { if (disable.lmerTest || tooBig.s || !.requireNS("lmerTest", "Cannot use mode = \"satterthwaite\" because *lmerTest* package is not installed", fail = message)) mode = ifelse(!disable.pbkrtest && !tooBig.k && .requireNS("pbkrtest", fail = .nothing), "kenward-roger", "asymptotic") if (!disable.lmerTest && tooBig.s) tooBigMsg("lmerTest", lmerTest.limit) } # if my logic isn't flawed, we are guaranteed that mode is both desired and possible if (mode == "kenward-roger") { if (missing(vcov.)) { dfargs = list(unadjV = V, adjV = pbkrtest::vcovAdj.lmerMod(object, 0)) V = as.matrix(dfargs$adjV) tst = try(pbkrtest::Lb_ddf) if(class(tst) != "try-error") dffun = function(k, dfargs) pbkrtest::Lb_ddf (k, dfargs$unadjV, dfargs$adjV) else { mode = "asymptotic" warning("Failure in loading pbkrtest routines", " - reverted to \"asymptotic\"") } } else { message("Kenward-Roger method can't be used with user-supplied covariances") mode = "satterthwaite" } } if (mode == "satterthwaite") { dfargs = list(object = object) dffun = function(k, dfargs) suppressMessages(lmerTest::calcSatterth(dfargs$object, k)$denom) } if (mode == "asymptotic") { dffun = function(k, dfargs) Inf } attr(dffun, "mesg") = mode } else if (lme4::isGLMM(object)) { dffun = function(k, dfargs) Inf misc = .std.link.labels(family(object), misc) } else stop("Can't handle a nonlinear mixed model") contrasts = attr(object@pp$X, "contrasts") m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) X = model.matrix(trms, m, contrasts.arg = contrasts) bhat = lme4::fixef(object) if (length(bhat) < ncol(X)) { # Newer versions of lmer can handle rank deficiency, but we need to do a couple of # backflips to put the pieces together right, # First, figure out which columns were retained kept = match(names(bhat), dimnames(X)[[2]]) # Now re-do bhat with NAs in the right places bhat = NA * X[1, ] bhat[kept] = lme4::fixef(object) # we have to reconstruct the model matrix modmat = model.matrix(trms, object@frame, contrasts.arg=contrasts) nbasis = estimability::nonest.basis(modmat) } else nbasis=estimability::all.estble mm = .cmpMM(object@pp$X, object@pp$Xwts^2, attr(object@pp$X, "assign")) list(X=X, bhat=bhat, nbasis=nbasis, V=V, dffun=dffun, dfargs=dfargs, misc=misc, model.matrix = mm) } #-------------------------------------------------------------- ### lme objects (nlme package) #' @export recover_data.lme = function(object, data, ...) { fcall = object$call if (!is.null(fcall$weights)) { # painful -- we only get weights for complete cases if (!is.null(object$na.action)) { w = nlme::varWeights(object$modelStruct) wts = rep(0, length(w) + length(object$na.action)) wts[-object$na.action] = w fcall$weights = wts } else fcall$weights = nlme::varWeights(object$modelStruct) } dat = recover_data(fcall, delete.response(object$terms), object$na.action, data = data, ...) 
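    # ("pass.it.on" asks ref_grid() to attach this recovered data to the fitted
    #  model object, so that emm_basis.lme() can retrieve it later via
    #  attr(object, "data") when it builds the submodel matrix.)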
attr(dat, "pass.it.on") = TRUE dat } #' @export emm_basis.lme = function(object, trms, xlev, grid, mode = c("containment", "satterthwaite", "appx-satterthwaite", "auto", "boot-satterthwaite", "asymptotic"), sigmaAdjust = TRUE, options, extra.iter = 0, ...) { mode = match.arg(mode) if (mode == "boot-satterthwaite") mode = "appx-satterthwaite" # backward compatibility if (mode == "asymptotic") options$df = Inf if (!is.null(options$df)) # if we're gonna override the df anyway, keep it simple! mode = "fixed" if (mode == "auto") mode = ifelse(is.null(object$apVar), "containment", "appx-satterthwaite") if (is.null(object$apVar)) mode = "containment" contrasts = object$contrasts m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) X = model.matrix(trms, m, contrasts.arg = contrasts) bhat = nlme::fixef(object) V = .my.vcov(object, ...) if (sigmaAdjust && object$method == "ML") V = V * object$dims$N / (object$dims$N - nrow(V)) misc = list() if (!is.null(object$family)) { misc = .std.link.labels(object$family, misc) } nbasis = estimability::all.estble if (mode == "fixed") { # hack to just put in df from options dfargs = list(df = options$df) dffun = function(k, dfargs) dfargs$df } else if (mode %in% c("satterthwaite", "appx-satterthwaite")) { mode = "appx-satterthwaite" G = try(gradV.kludge(object, extra.iter = extra.iter), silent = TRUE) ###! not yet, doesn't work G = try(lme_grad(object, object$call, object$data, V)) if (inherits(G, "try-error")) stop("Unable to estimate Satterthwaite parameters") dfargs = list(V = V, A = object$apVar, G = G) dffun = function(k, dfargs) { est = tcrossprod(crossprod(k, dfargs$V), k) g = sapply(dfargs$G, function(M) tcrossprod(crossprod(k, M), k)) varest = tcrossprod(crossprod(g, dfargs$A), g) 2 * est^2 / varest } } else { # containment df dfx = object$fixDF$X if (names(bhat[1]) == "(Intercept)") dfx[1] = length(levels(object$groups[[1]])) - 1 ### Correct apparent error in lme containment algorithm dffun = function(k, dfargs) { idx = which(abs(k) > 1e-4) ifelse(length(idx) > 0, min(dfargs$dfx[idx]), NA) } dfargs = list(dfx = dfx) } attr(dffun, "mesg") = mode # submodel support (not great -- omits any weights) m = model.frame(trms, attr(object, "data"), na.action = na.pass, xlev = xlev) mm = model.matrix(trms, m, contrasts.arg = contrasts) mm = .cmpMM(mm, assign = attr(mm, "assign")) list(X = X, bhat = bhat, nbasis = nbasis, V = V, dffun = dffun, dfargs = dfargs, misc = misc, model.matrix = mm) } # Here is a total hack, but it works pretty well # We estimate the gradient of the V matrix by fitting the # model with a few random perturbations of y, then # regressing the changes in V against the changes in the # covariance parameters gradV.kludge = function(object, Vname = "varFix", call = object$call$fixed, data = object$data, extra.iter = 0) { # check consistency of contrasts #### This code doesn't work with coerced factors. Hardly seems messing with, so I commented it out # cnm = names(object$contrasts) # cdiff = sapply(cnm, function(.) max(abs(contrasts(data[[.]]) - object$contrasts[[.]]))) # if (max(cdiff) > 1e-6) { # message("Contrasts don't match those used when the model was fitted. 
Fix this and re-run") # stop() # } A = object$apVar theta = attr(A, "Pars") V = object[[Vname]] sig = .01 * object$sigma #data = object$data yname = all.vars(eval(call))[1] y = data[[yname]] n = length(y) dat = t(replicate(2 + extra.iter + length(theta), { data[[yname]] = y + sig * rnorm(n) mod = update(object, data = data) c(attr(mod$apVar, "Pars") - theta, as.numeric(mod[[Vname]] - V)) })) dimnames(dat) = c(NULL, NULL) xcols = seq_along(theta) B = lm.fit(dat[, xcols], dat[,-xcols])$coefficients grad = lapply(seq_len(nrow(B)), function(i) matrix(B[i, ], nrow=nrow(V))) grad } # ### new way to get gradients for lme models # # (not ready for primetime...) # lme_grad = function(object, call, data, V) { # obj = object$modelStruct # conLin = object # class(conLin) = class(obj) # X = model.matrix(eval(call$fixed), data = data) # y = data[[all.vars(call)[1]]] # conLin$Xy = cbind(X, y) # conLin$fixedSigma = FALSE # grps = object$groups # May have to re-order these fancily? # MEest = get("MEestimate", getNamespace("nlme")) ## workaround its not being exported # func = function(x) { # coef(obj) = x # tmp = MEest(obj, grps, conLin) # crossprod(tmp$sigma * tmp$varFix) # } # res = numDeriv::jacobian(func, coef(obj)) # G = lapply(seq_len(ncol(res)), function(j) matrix(res[, j], ncol = ncol(V))) # G[[1 + length(G)]] = 2 * V # gradient wrt log sigma # G # } #-------------------------------------------------------------- ### new way to get jacobians for gls models gls_grad = function(object, call, data, V) { obj = object$modelStruct conLin = object class(conLin) = class(obj) X = model.matrix(eval(call$model), data = data) y = data[[all.vars(call)[1]]] conLin$Xy = cbind(X, y) conLin$fixedSigma = FALSE func = function(x) { obj = nlme::`coef<-`(obj, value = x) tmp = nlme::glsEstimate(obj, conLin) .get.lt(crossprod(tmp$sigma * tmp$varBeta)) # lower triangular form } res = numDeriv::jacobian(func, coef(obj)) G = lapply(seq_len(ncol(res)), function(j) .lt2mat(res[, j])) G[[1 + length(G)]] = 2 * V # gradient wrt log sigma G } ### gls objects (nlme package) recover_data.gls = function(object, data, ...) { fcall = object$call if (!is.null(wts <- fcall$weights)) { wts = nlme::varWeights(object$modelStruct) fcall$weights = NULL } trms = delete.response(terms(nlme::getCovariateFormula(object))) result = recover_data.call(fcall, trms, object$na.action, data = data, ...) if (!is.null(wts)) result[["(weights)"]] = wts if (!missing(data)) attr(result, "misc") = list(data = data) attr(result, "pass.it.on") = TRUE result } emm_basis.gls = function(object, trms, xlev, grid, mode = c("auto", "df.error", "satterthwaite", "appx-satterthwaite", "boot-satterthwaite", "asymptotic"), extra.iter = 0, options, misc, ...) { contrasts = object$contrasts m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) X = model.matrix(trms, m, contrasts.arg = contrasts) bhat = coef(object) V = .my.vcov(object, ...) nbasis = estimability::all.estble mode = match.arg(mode) if (mode == "boot-satterthwaite") mode = "appx-satterthwaite" # backward compatibility if (!is.null(options$df)) # if we're gonna override the df anyway, keep it simple! 
mode = "df.error" if (mode == "auto") mode = ifelse(is.null(object$apVar), "df.error", "satterthwaite") if (!is.matrix(object$apVar)) mode = "df.error" if (mode %in% c("satterthwaite", "appx-satterthwaite")) { data = if(is.null(misc$data)) eval(object$call$data, parent.frame(2)) else misc$data misc = list() chk = attr(object$apVar, "Pars") if(max(abs(coef(object$modelStruct) - chk[-length(chk)])) > .001) { message("Analytical Satterthwaite method not available; using appx-satterthwaite") mode = "appx-satterthwaite" } if (mode == "appx-satterthwaite") { G = try(gradV.kludge(object, "varBeta", call = object$call$model, data = data, extra.iter = extra.iter), silent = TRUE) } else G = try(gls_grad(object, object$call, data, V)) if (inherits(G, "try-error")) { sugg = ifelse(mode == "satterthwaite", "appx-satterthwaite", "df.error") stop("Can't estimate Satterthwaite parameters.\n", " Try adding the argument 'mode = \"", sugg, "\"'", call. = FALSE) } dfargs = list(V = V, A = object$apVar, G = G) dffun = function(k, dfargs) { est = tcrossprod(crossprod(k, dfargs$V), k) g = sapply(dfargs$G, function(M) tcrossprod(crossprod(k, M), k)) varest = tcrossprod(crossprod(g, dfargs$A), g) 2 * est^2 / varest } } else if (mode %in% c("df.error", "asymptotic")) { df = ifelse(mode == "asymptotic", Inf, object$dims$N - object$dims$p - length(attr(object$apVar, "Pars"))) dfargs = list(df = df) dffun = function(k, dfargs) dfargs$df } attr(dffun, "mesg") = mode # submodel support (not great because I don't know how to retrieve the weights) m = model.frame(trms, attr(object, "data"), na.action = na.pass, xlev = xlev) mm = model.matrix(trms, m, contrasts.arg = contrasts) mm = .cmpMM(mm, assign = attr(mm, "assign")) list(X=X, bhat=bhat, nbasis=nbasis, V=V, dffun=dffun, dfargs=dfargs, misc=misc, model.matrix = mm) } #-------------------------------------------------------------- ### polr objects (MASS package) recover_data.polr = function(object, ...) recover_data.lm(object, ...) emm_basis.polr = function(object, trms, xlev, grid, mode = c("latent", "linear.predictor", "cum.prob", "exc.prob", "prob", "mean.class"), rescale = c(0,1), ...) { mode = match.arg(mode) contrasts = object$contrasts m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) X = model.matrix(trms, m, contrasts.arg = contrasts) # Strip out the intercept (borrowed code from predict.polr) xint = match("(Intercept)", colnames(X), nomatch = 0L) if (xint > 0L) X = X[, -xint, drop = FALSE] bhat = c(coef(object), object$zeta) V = .my.vcov(object, ...) k = length(object$zeta) if (mode == "latent") { X = rescale[2] * cbind(X, matrix(- 1/k, nrow = nrow(X), ncol = k)) bhat = c(coef(object), object$zeta - rescale[1] / rescale[2]) misc = list(offset.mult = rescale[2]) } else { j = matrix(1, nrow=k, ncol=1) J = matrix(1, nrow=nrow(X), ncol=1) X = cbind(kronecker(-j, X), kronecker(diag(1,k), J)) link = object$method if (link == "logistic") link = "logit" misc = list(ylevs = list(cut = names(object$zeta)), tran = link, inv.lbl = "cumprob", offset.mult = -1) if (mode != "linear.predictor") { # just use the machinery we already have for the 'ordinal' package misc$mode = mode misc$postGridHook = ".clm.postGrid" } } misc$respName = as.character.default(terms(object))[2] nbasis = estimability::all.estble dffun = function(...) Inf list(X=X, bhat=bhat, nbasis=nbasis, V=V, dffun=dffun, dfargs=list(), misc=misc) } #-------------------------------------------------------------- ### survreg objects (survival package) recover_data.survreg = function(object, ...) 
{ fcall = object$call trms = delete.response(terms(object)) # I'm gonna delete any terms involving cluster(), or frailty() -- keep strata() mod.elts = dimnames(attr(trms, "factor"))[[2]] tmp = grep("cluster\\(|frailty\\(", mod.elts) if (length(tmp)) trms = trms[-tmp] recover_data(fcall, trms, object$na.action, ...) } # Seems to work right in a little testing. # However, it fails sometimes if I update the model # with a subset argument. Workaround: just fitting a new model emm_basis.survreg = function(object, trms, xlev, grid, ...) { # Much of this code is adapted from predict.survreg bhat = object$coefficients k = length(bhat) - sum(is.na(bhat)) V = .my.vcov(object, ...)[seq_len(k), seq_len(k), drop=FALSE] # ??? not used... is.fixeds = (k == ncol(object$var)) ### zap-out factors in xlev not needed by model.frame xlev[setdiff(names(xlev), rownames(attr(trms, "factors")))] = NULL m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) # X = model.matrix(object, m) # This is what predict.survreg does # But I have manipulated trms, so need to make sure things are consistent X = model.matrix(trms, m, contrasts.arg = object$contrasts) usecols = intersect(colnames(X), names(bhat)) bhat = bhat[usecols] # in case ref_grid code excluded some levels... nbasis = estimability::nonest.basis(model.matrix(object)) dfargs = list(df = object$df.residual) dffun = function(k, dfargs) dfargs$df if (object$dist %in% c("exponential","weibull","loglogistic","loggaussian","lognormal")) misc = list(tran = "log", inv.lbl = "response") else misc = list() misc$postGridHook = .notran2 # removes "Surv()" as response transformation list(X=X, bhat=bhat, nbasis=nbasis, V=V, dffun=dffun, dfargs=dfargs, misc=misc) } #-------------------------------------------------------------- ### coxph objects (survival package) recover_data.coxph = function(object, ...) recover_data.survreg(object, ...) emm_basis.coxph = function (object, trms, xlev, grid, ...) { object$dist = "doesn't matter" result = emm_basis.survreg(object, trms, xlev, grid, ...) result$dfargs$df = Inf nms = colnames(result$X) # delete columns for intercept and main effects of strata zaps = which(nms %in% setdiff(nms, names(result$bhat))) result$X = result$X[, -zaps, drop = FALSE] ### result$X = result$X - rep(object$means, each = nrow(result$X)) result$misc$tran = "log" result$misc$inv.lbl = "hazard" result } .notran2 = function(object, ...) { for (nm in c("tran", "tran2")) if(!is.null(object@misc[[nm]]) && object@misc[[nm]] == "Surv") object@misc[[nm]] = NULL object } # Note: Very brief experimentation suggests coxph.penal also works. # This is an extension of coxph #-------------------------------------------------------------- ### coxme objects #### ### Greatly revised 6-15-15 (after version 2.18) recover_data.coxme = function(object, ...) recover_data.survreg(object, ...) emm_basis.coxme = function(object, trms, xlev, grid, ...) 
{ bhat = coxme::fixef(object) k = length(bhat) V = .my.vcov(object, ...)[seq_len(k), seq_len(k), drop = FALSE] m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) X = model.matrix(trms, m) X = X[, -1, drop = FALSE] # remove the intercept # scale the linear predictor for (j in seq_along(X[1, ])) X[, j] = (X[, j] - object$means[j]) ### / object$scale[j] nbasis = estimability::all.estble dffun = function(k, dfargs) Inf misc = list(tran = "log", inv.lbl = "hazard") misc$postGridHook = .notran2 # removes "Surv()" as response transformation list(X = X, bhat = bhat, nbasis = nbasis, V = V, dffun = dffun, dfargs = list(), misc = misc) } ### special vcov prototype for cases where there are several vcov options ### e.g., gee, geeglm, geese .named.vcov = function(object, method, ...) UseMethod(".named.vcov") # default has optional idx of same length as valid and if so, idx indicating # which elt of valid to use if matched # Ex: valid = c("mammal", "fish", "rat", "dog", "trout", "perch") # idx = c( 1, 2, 1, 1, 2, 2) # -- so ultimately results can only be "mammal" or "fish" # nonmatches revert to 1st elt. .named.vcov.default = function(object, method, valid, idx = seq_along(valid), ...) { if (!is.character(method)) { # in case vcov. arg was matched by vcov.method { V = .my.vcov(object, method) method = "user-supplied" } else { i = pmatch(method, valid, 1) method = valid[idx[i]] V = object[[method]] } attr(V, "methMesg") = paste("Covariance estimate used:", method) V } # general-purpose emm_basis function for GEEs .emmb.geeGP = function(object, trms, xlev, grid, vcov.method, valid, idx = seq_along(valid), ...) { m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) X = model.matrix(trms, m, contrasts.arg = object$contrasts) bhat = coef(object) V = .named.vcov(object, vcov.method, valid, idx, ...) if (sum(is.na(bhat)) > 0) nbasis = estimability::nonest.basis(object$qr) else nbasis = estimability::all.estble misc = .std.link.labels(object$family, list()) misc$initMesg = attr(V, "methMesg") dffun = function(k, dfargs) Inf dfargs = list() list(X=X, bhat=bhat, nbasis=nbasis, V=V, dffun=dffun, dfargs=dfargs, misc=misc) } #--------------------------------------------------------------- ### gee objects #### recover_data.gee = function(object, ...) recover_data.lm(object, frame = NULL, ...) emm_basis.gee = function(object, trms, xlev, grid, vcov.method = "robust.variance", ...) .emmb.geeGP(object, trms, xlev, grid, vcov.method, valid = c("robust.variance", "naive.variance")) ### geepack objects #### recover_data.geeglm = function(object, ...) recover_data.lm(object, ...) emm_basis.geeglm = function(object, trms, xlev, grid, vcov.method = "vbeta", ...) { m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) X = model.matrix(trms, m, contrasts.arg = object$contrasts) bhat = coef(object) V = .named.vcov(object$geese, vcov.method, valid = c("vbeta", "vbeta.naiv","vbeta.j1s","vbeta.fij","robust","naive"), idx = c(1,2,3,4,1,2)) if (sum(is.na(bhat)) > 0) nbasis = estimability::nonest.basis(object$qr) else nbasis = estimability::all.estble misc = .std.link.labels(object$family, list()) misc$initMesg = attr(V, "methMesg") dffun = function(k, dfargs) Inf dfargs = list() list(X=X, bhat=bhat, nbasis=nbasis, V=V, dffun=dffun, dfargs=dfargs, misc=misc) } recover_data.geese = function(object, ...) 
{
    fcall = object$call
    # what a pain - we need to reconstruct the terms component
    args = as.list(fcall[-1])
    na.action = object$na.action
    #trms = terms.formula(fcall$formula)
    if (!is.null(args$data)) {
        data = eval(args$data, parent.frame())
        trms = terms(model.frame(fcall$formula, data = data))
    }
    else {
        trms = terms(model.frame(fcall$formula))
    }
    recover_data(fcall, delete.response(trms), na.action, ...)
}

emm_basis.geese = function(object, trms, xlev, grid, vcov.method = "vbeta", ...) {
    m = model.frame(trms, grid, na.action = na.pass, xlev = xlev)
    X = model.matrix(trms, m, contrasts.arg = object$contrasts)
    bhat = object$beta
    V = .named.vcov(object, vcov.method,
                    valid = c("vbeta", "vbeta.naiv","vbeta.j1s","vbeta.fij","robust","naive"),
                    idx = c(1,2,3,4,1,2))
    # We don't have the qr component - I'm gonna punt for now
    if (sum(is.na(bhat)) > 0)
        warning("There are non-estimable functions, but estimability is NOT being checked")
    #     nbasis = estimability::nonest.basis(object$qr)
    # else
    nbasis = estimability::all.estble
    misc = list()
    if (!is.null(fam <- object$call$family))
        misc = .std.link.labels(eval(fam)(), misc)
    misc$initMesg = attr(V, "methMesg")
    dffun = function(k, dfargs) Inf
    dfargs = list()
    list(X=X, bhat=bhat, nbasis=nbasis, V=V, dffun=dffun, dfargs=dfargs, misc=misc)
}

### survey package
# svyglm class
recover_data.svyglm = function(object, data = NULL, ...) {
    if (is.null(data)) {
        env = environment(terms(object))
        des = eval(object$call$design, envir = env)
        data = eval(des$call$data, envir = env)
    }
    recover_data.lm(object, data = data, frame = object$model, ...)
}
# inherited emm_basis.lm method works fine

### ----- Auxiliary routines -------------------------

# Provide for vcov. argument in ref_grid call, which could be a function or a matrix
.statsvcov = function(object, ...)
    stats::vcov(object, complete = FALSE, ...)

#' @export
.my.vcov = function(object, vcov. = .statsvcov, ...) {
    if (is.function(vcov.))
        vcov. = vcov.(object, ...)
    else if (!is.matrix(vcov.))
        stop("vcov. must be a function or a square matrix")
    vcov.
}

# Call this to do the standard stuff with link labels
# Returns a modified misc
#' @export
.std.link.labels = function(fam, misc) {
    if (is.null(fam) || !is.list(fam))
        return(misc)
    if (fam$link == "identity")
        return(misc)
    misc$tran = fam$link
    misc$inv.lbl = "response"
    if (length(grep("binomial", fam$family)) == 1)
        misc$inv.lbl = "prob"
    else if (length(grep("poisson", fam$family)) == 1)
        misc$inv.lbl = "rate"
    misc
}
emmeans/R/contrast.R0000644000176200001440000006340214150775124014073 0ustar liggesusers##############################################################################
#    Copyright (c) 2012-2019 Russell V. Lenth                                #
#                                                                            #
#    This file is part of the emmeans package for R (*emmeans*)             #
#                                                                            #
#    *emmeans* is free software: you can redistribute it and/or modify      #
#    it under the terms of the GNU General Public License as published by   #
#    the Free Software Foundation, either version 2 of the License, or      #
#    (at your option) any later version.                                    #
#                                                                            #
#    *emmeans* is distributed in the hope that it will be useful,           #
#    but WITHOUT ANY WARRANTY; without even the implied warranty of         #
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the          #
#    GNU General Public License for more details.                           #
#                                                                            #
#    You should have received a copy of the GNU General Public License      #
#    along with R and *emmeans*.  If not, see                               #
#    <https://www.r-project.org/Licenses/> and/or                           #
#    <https://www.gnu.org/licenses/>.
# ############################################################################## # contrast() and related functions (previously in with emmeans code) ### 'contrast' S3 generic and method #' Contrasts and linear functions of EMMs #' #' These methods provide for follow-up analyses of \code{emmGrid} objects: #' Contrasts, pairwise comparisons, tests, and confidence intervals. They may #' also be used to compute arbitrary linear functions of predictions or EMMs. #' #' @export contrast = function(object, ...) UseMethod("contrast") #' @rdname contrast #' @param object An object of class \code{emmGrid} #' @param method Character value giving the root name of a contrast method (e.g. #' \code{"pairwise"} -- see \link{emmc-functions}). Alternatively, a function #' of the same form, or a named \code{list} of coefficients (for a contrast or #' linear function) that must each conform to the number of results in each #' \code{by} group. In a multi-factor situation, the factor levels are #' combined and treated like a single factor. #' @param interaction Character vector, logical value, or list. If this is specified, #' \code{method} is ignored. See the \dQuote{Interaction contrasts} section #' below for details. #' @param by Character names of variable(s) to be used for ``by'' groups. The #' contrasts or joint tests will be evaluated separately for each combination #' of these variables. If \code{object} was created with by groups, those are #' used unless overridden. Use \code{by = NULL} to use no by groups at all. #' @param offset,scale Numeric vectors of the same length as each \code{by} group. #' The \code{scale} values, if supplied, multiply their respective linear estimates, and #' any \code{offset} values are added. Scalar values are also allowed. #' (These arguments are ignored when \code{interaction} is specified.) #' @param name Character name to use to override the default label for contrasts #' used in table headings or subsequent contrasts of the returned object. #' @param options If non-\code{NULL}, a named \code{list} of arguments to pass #' to \code{\link{update.emmGrid}}, just after the object is constructed. #' @param type Character: prediction type (e.g., \code{"response"}) -- added to #' \code{options} #' @param adjust Character: adjustment method (e.g., \code{"bonferroni"}) -- #' added to \code{options} #' @param simple Character vector or list: Specify the factor(s) \emph{not} in #' \code{by}, or a list thereof. See the section below on simple contrasts. #' @param combine Logical value that determines what is returned when #' \code{simple} is a list. See the section on simple contrasts. #' @param ratios Logical value determining how log and logit transforms are #' handled. These transformations are exceptional cases in that there is a #' valid way to back-transform contrasts: differences of logs are logs of #' ratios, and differences of logits are odds ratios. If \code{ratios = TRUE} #' and summarized with \code{type = "response"}, \code{contrast} results are #' back-transformed to ratios whenever we have true contrasts (coefficients #' sum to zero). For other transformations, there is no natural way to #' back-transform contrasts, so even when summarized with \code{type = "response"}, #' contrasts are computed and displayed on the linear-predictor scale. Similarly, #' if \code{ratios = FALSE}, log and logit transforms are treated in the same way as #' any other transformation. #' @param parens character or \code{NULL}. 
If a character value, the labels for levels #' being contrasted are parenthesized if they match the regular expression in #' \code{parens[1]} (via \code{\link{grep}}). The default is \code{emm_option("parens")}. #' Optionally, \code{parens} may contain second and third elements specifying #' what to use for left and right parentheses (default \code{"("} and \code{")"}). #' Specify \code{parens = NULL} or \code{parens = "a^"} (which won't match anything) #' to disable all parenthesization. #' @param ... Additional arguments passed to other methods #' #' @return \code{contrast} and \code{pairs} return an object of class #' \code{emmGrid}. Its grid will correspond to the levels of the contrasts and #' any \code{by} variables. The exception is that an \code{\link{emm_list}} #' object is returned if \code{simple} is a list and \code{complete} is #' \code{FALSE}. #' #' @section Pairs method: The call \code{pairs(object)} is equivalent to #' \code{contrast(object, method = "pairwise")}; and \code{pairs(object, #' reverse = TRUE)} is the same as \code{contrast(object, method = #' "revpairwise")}. #' #' @section Interaction contrasts: When \code{interaction} is specified, #' interaction contrasts are computed. Specifically contrasts are generated #' for each factor separately, one at a time; and these contrasts are applied #' to the object (the first time around) or to the previous result #' (subsequently). (Any factors specified in \code{by} are skipped.) The final #' result comprises contrasts of contrasts, or, equivalently, products of #' contrasts for the factors involved. Any named elements of \code{interaction} #' are assigned to contrast methods; others are assigned in order of #' appearance in \code{object@levels}. The contrast factors in the resulting #' \code{emmGrid} object are ordered the same as in \code{interaction}. #' #' \code{interaction} may be a character vector or list of valid contrast #' methods (as documented for the \code{method} argument). If the vector or #' list is shorter than the number needed, it is recycled. Alternatively, if #' the user specifies \code{contrast = TRUE}, the contrast specified in #' \code{method} is used for all factors involved. #' #' @section Simple contrasts: #' \code{simple} is essentially the complement of \code{by}: When #' \code{simple} is a character vector, \code{by} is set to all the factors in #' the grid \emph{except} those in \code{simple}. If \code{simple} is a list, #' each element is used in turn as \code{simple}, and assembled in an #' \code{"emm_list"}. To generate \emph{all} simple main effects, use #' \code{simple = "each"} (this works unless there actually is a factor named #' \code{"each"}). Note that a non-missing \code{simple} will cause \code{by} #' to be ignored. #' #' Ordinarily, when \code{simple} is a list or \code{"each"}, the return value #' is an \code{\link{emm_list}} object with each entry in correspondence with #' the entries of \code{simple}. However, with \code{combine = TRUE}, the #' elements are all combined into one family of contrasts in a single #' \code{\link[=emmGrid-class]{emmGrid}} object using #' \code{\link{rbind.emmGrid}}.. In that case, the \code{adjust} argument sets #' the adjustment method for the combined set of contrasts. #' #' @note When \code{object} has a nesting structure (this can be seen via #' \code{str(object)}), then any grouping factors involved are forced into #' service as \code{by} variables, and the contrasts are thus computed #' separately in each nest. 
This in turn may lead to an irregular grid in the #' returned \code{emmGrid} object, which may not be valid for subsequent #' \code{emmeans} calls. #' #' @method contrast emmGrid #' @export #' #' @examples #' warp.lm <- lm(breaks ~ wool*tension, data = warpbreaks) #' warp.emm <- emmeans(warp.lm, ~ tension | wool) #' contrast(warp.emm, "poly") # inherits 'by = "wool"' from warp.emm #' pairs(warp.emm) # ditto #' contrast(warp.emm, "eff", by = NULL) # contrasts of the 6 factor combs #' pairs(warp.emm, simple = "wool") # same as pairs(warp.emm, by = "tension") #' #' # Do all "simple" comparisons, combined into one family #' pairs(warp.emm, simple = "each", combine = TRUE) #' #' \dontrun{ #' #' ## Note that the following are NOT the same: #' contrast(warp.emm, simple = c("wool", "tension")) #' contrast(warp.emm, simple = list("wool", "tension")) #' ## The first generates contrasts for combinations of wool and tension #' ## (same as by = NULL) #' ## The second generates contrasts for wool by tension, and for #' ## tension by wool, respectively. #' } #' #' # An interaction contrast for tension:wool #' tw.emm <- contrast(warp.emm, interaction = c(tension = "poly", wool = "consec"), #' by = NULL) #' tw.emm # see the estimates #' coef(tw.emm) # see the contrast coefficients #' #' # Use of scale and offset #' # an unusual use of the famous stack-loss data... #' mod <- lm(Water.Temp ~ poly(stack.loss, degree = 2), data = stackloss) #' (emm <- emmeans(mod, "stack.loss", at = list(stack.loss = 10 * (1:4)))) #' # Convert results from Celsius to Fahrenheit: #' confint(contrast(emm, "identity", scale = 9/5, offset = 32)) #' contrast.emmGrid = function(object, method = "eff", interaction = FALSE, by, offset = NULL, scale = NULL, name = "contrast", options = get_emm_option("contrast"), type, adjust, simple, combine = FALSE, ratios = TRUE, parens, ...) { if (!missing(type)) options = as.list(c(options, predict.type = type)) if(!missing(simple)) return(.simcon(object, method = method, interaction = interaction, offset = offset, scale = scale, name = name, options = options, type = type, simple = simple, combine = combine, adjust = adjust, parens = parens, ...)) if(missing(by)) by = object@misc$by.vars if(length(by) == 0) # character(0) --> NULL by = NULL nesting = object@model.info$nesting if (!is.null(nesting) || !is.null(object@misc$display)) return (.nested_contrast(rgobj = object, method = method, interaction = interaction, by = by, adjust = adjust, type = type, offset = offset, ...)) orig.grid = object@grid[, , drop = FALSE] orig.grid[[".wgt."]] = orig.grid[[".offset."]] = NULL if (is.logical(interaction) && interaction) interaction = method if (!is.logical(interaction)) { # i.e., interaction is not FALSE if(missing(adjust)) adjust = "none" vars = names(object@levels) k = length(vars) if(!is.null(by)) { vars = c(setdiff(vars, by), by) k = k - length(by) } nms = names(interaction) interaction = as.list(rep(interaction, k)[1:k]) names(interaction) = c(nms, rep("", k))[1:k] nms = vars[1:k] if (is.null(names(interaction))) names(interaction) = nms else { unnamed = which(!(names(interaction) %in% nms)) names(interaction)[unnamed] = setdiff(nms, names(interaction)) nms = names(interaction) } # if (!is.character(interaction)) # stop("interaction requires named contrast function(s)") ### by = NULL why was this here before ??? 
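    # The loop below peels off one factor at a time (in reverse order): each
    # pass applies that factor's contrast method with all remaining factors
    # serving as 'by' variables, and 'tcm' accumulates the product of the
    # coefficient matrices -- so the final result is the product
    # (interaction) contrast.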
tcm = NULL for (i in k:1) { if (is.character(interaction[[i]])) nm = paste(nms[i], interaction[[i]], sep = "_") else nm = paste(nms[i], "custom", sep = "_") pos = which(vars == nms[i]) object = contrast.emmGrid(object, interaction[[i]], by = vars[-pos], name = nm, ...) if(is.null(tcm)) tcm = object@misc$con.coef else tcm = object@misc$con.coef %*% tcm vars[pos] = nm } object = update(object, by = by, adjust = adjust, silent = TRUE) ### removed `...` here Nov 2019 because a `mode` arg gets matched with `model.info` ### when passed via formula lhs in `emmeans()` object@misc$is.new.rg = NULL object@misc$orig.grid = orig.grid object@misc$con.coef = tcm object = .update.options(object, options, ...) return(object) } # else we have a regular contrast (not interaction) linfct = object@linfct[, , drop = FALSE] args = g = object@grid[, , drop = FALSE] args[[".offset."]] = NULL args[[".wgt."]] = NULL # ignore auxiliary stuff in labels, etc. if (!is.null(by)) { by.rows = .find.by.rows(args, by) ulen = unique(sapply(by.rows, length)) if (length(ulen) > 1) stop ("`by` groups are of irregular size;\n currently not supported except with nested structures") bylevs = args[, by, drop=FALSE] all.args = args args = args[by.rows[[1]], , drop=FALSE] for (nm in by) { args[[nm]] = NULL all.args[[nm]] = NULL } all.levs = do.call("paste", c(unname(all.args), sep = get_emm_option("sep"))) # keep all levels in case we have permutations of them } args = unname(args) args$sep = get_emm_option("sep") levs = do.call("paste", args) # NOTE - these are levels for the first (or only) by-group if (length(levs) == 0) # prevent error when there are no levels to contrast method = "eff" if(is.null(by)) all.levs = levs # parenthesize levels if they contain spaces or operators rawlevs = levs # save orig ones if(missing(parens)) parens = get_emm_option("parens") if(is.character(parens) && length(idx <- grep(parens[1], all.levs)) > 0) { if(length(parens) < 3) parens = c(parens, "(", ")") all.levs[idx] = paste0(parens[2], all.levs[idx], parens[3]) idx = grep(parens[1], levs) if (length(idx) > 0) levs[idx] = paste0(parens[2], levs[idx], parens[3]) } attr(levs, "raw") = rawlevs # so we can recover original levels when needed if (is.list(method)) { cmat = as.data.frame(method, optional = TRUE) # I have no clue why they named that argument 'optional', # but setting it to TRUE keeps it from messing up the names method = function(levs, ...) cmat } else if (is.character(method)) { fn = paste(method, "emmc", sep=".") method = if (exists(fn, mode="function")) get(fn) else stop(paste("Contrast function '", fn, "' not found", sep="")) } # case like in old lsmeans, contr = list else if (!is.function(method)) stop("'method' must be a list, function, or the basename of an '.emmc' function") # Get the contrasts; this should be a data.frame cmat = method(levs, ...) if (!is.data.frame(cmat)) stop("Contrast function must provide a data.frame") else if(ncol(cmat) == 0) { cmat = data.frame(`(nothing)` = rep(NA, nrow(args)), check.names = FALSE) adjust = "none" } # warning("No contrasts were generated! Perhaps only one emmean is involved.\n", # " This can happen, for example, when your predictors are not factors.") else if (nrow(cmat) != nrow(args)) stop("Nonconforming number of contrast coefficients") tcmat = t(cmat) if (!is.null(scale)) { if (length(scale) %in% c(1, nrow(tcmat))) tcmat = tcmat * scale else stop("'scale' length of ", length(scale), " does not conform with ", nrow(tcmat), " contrasts", call. 
= FALSE) } # do some bookkeeping whereby we get NAs only when NA rows get nonzero weight NAflag = linfct[, 1] * 0 NAflag[is.na(linfct[, 1])] = 1 linfct[is.na(linfct)] = 0 # now we don't have any NAs but we know where to put them back later .NArows = function(mat, flag) # utility for flagging bad rows (any nonzero coef applied to an NA) apply(mat, 1, function(x, f) any(x * f != 0), flag) if (is.null(by)) { linfct = tcmat %*% linfct linfct[.NArows(tcmat, NAflag), ] = NA # put back the NAs where they belong grid = data.frame(.contrast.=names(cmat)) if (hasName(object@grid, ".offset.")) grid[[".offset."]] = t(cmat) %*% object@grid[[".offset."]] by.rows = list(seq_along(object@linfct[ , 1])) } # NOTE: The kronecker thing here depends on the grid being regular. # Irregular grids are handled by .nested_contrast else { tcmat = kronecker(.diag(rep(1,length(by.rows))), tcmat) linfct = tcmat %*% linfct[unlist(by.rows), , drop = FALSE] linfct[.NArows(tcmat, NAflag[unlist(by.rows)]), ] = NA tmp = expand.grid(con = names(cmat), by = seq_len(length(by.rows)), stringsAsFactors = FALSE)###unique(by.id)) # check if levs have different orderings in subsequent by groups for (i in 1 + seq_along(by.rows[-1])) { j = by.rows[[i]] if (any(all.levs[j] != levs)) { cm = method(all.levs[j], ...) tmp$con[seq_along(cm) + length(cm)*(i-1)] = names(cm) } } grid = data.frame(.contrast. = factor(tmp$con, levels = unique(tmp$con))) n.each = ncol(cmat) row.1st = sapply(by.rows, function(x) x[1]) xlevs = list() for (v in by) xlevs[[v]] = rep(bylevs[row.1st, v], each=n.each) grid = cbind(grid, data.frame(xlevs, check.names = FALSE)) if (hasName(object@grid, ".offset.")) grid[[".offset."]] = tcmat %*% object@grid[unlist(by.rows), ".offset."] } # Rename the .contrast. column -- ordinarily to "contrast", # but otherwise a unique variation thereof con.pat = paste("^", name, "[0-p]?", sep = "") n.prev.con = length(grep(con.pat, names(grid))) con.col = grep("\\.contrast\\.", names(grid)) con.name = paste(name, ifelse(n.prev.con == 0, "", n.prev.con), sep="") names(grid)[con.col] = con.name row.names(linfct) = NULL misc = object@misc misc$initMesg = NULL misc$estName = "estimate" if (!is.null(et <- attr(cmat, "type"))) misc$estType = et else { is.con = all(abs(sapply(cmat, sum)) < .001) misc$estType = ifelse(is.con, "contrast", "prediction") } misc$methDesc = attr(cmat, "desc") misc$famSize = ifelse(is.null(fs <- attr(cmat, "famSize")), length(by.rows[[1]]), fs) misc$pri.vars = setdiff(names(grid), c(by, ".offset.",".wgt.")) if (missing(adjust)) adjust = attr(cmat, "adjust") if (is.null(adjust)) adjust = "none" if (!is.null(attr(cmat, "offset"))) offset = attr(cmat, "offset") if (!is.null(offset)) { if(!hasName(grid, ".offset.")) grid[[".offset."]] = 0 grid[[".offset."]] = grid[[".offset."]] + rep(offset, length(by.rows)) } if (!is.null(fs <- attr(cmat, "famSize"))) misc$famSize = fs misc$is.new.rg = FALSE misc$adjust = adjust misc$infer = c(FALSE, TRUE) misc$by.vars = by if(!is.na(misc$estType) && misc$estType == "pairs") # internal flag to keep track of original by vars for paired comps misc$.pairby = paste(c("", by), collapse = ",") # save contrast coefs by.cols = seq_len(ncol(tcmat)) if(!is.null(by.rows)) by.cols[unlist(by.rows)] = by.cols # gives us inverse of by.rows order misc$orig.grid = orig.grid # save original grid misc$con.coef = tcmat[ , by.cols, drop = FALSE] # save contrast coefs # test that each set of coefs sums to 0 true.con = all(abs(apply(cmat, 2, function(.) sum(.) 
/ (1.0e-6 + max(abs(.))))) < 1.0e-6)
    if(is.na(true.con))   # prevent error when there are no contrasts
        true.con = FALSE
    if(true.con)
        misc$sigma = NULL   # sigma surely no longer makes sense
    # zap the transformation info except in special cases
    if (!is.null(misc$tran)) {
        misc$orig.tran = .fmt.tran(misc)
        if (ratios && true.con && misc$tran %in% c("log", "log2", "log10",   ### REMOVED "genlog",
                                                   "logit", "log.o.r.")) {
            misc$log.contrast = TRUE   # remember how we got here; used by summary
            misc$orig.inv.lbl = misc$inv.lbl
            if (misc$tran == "logit") {
                misc$inv.lbl = "odds.ratio"
                misc$tran = "log.o.r."
                misc$tran.mult = misc$tran.offset = NULL
            }
            else if (misc$tran == "log.o.r.") {
                # leave everything as-is. Once an odds ratio, always an odds ratio
            }
            else {
                misc$inv.lbl = "ratio"   ### (stays at log, log2, log10)
                misc$tran = "log"
                misc$tran.mult = misc$tran.offset = NULL
            }
        }
        else {
            misc$initMesg = c(misc$initMesg,
                paste("Note: contrasts are still on the", misc$orig.tran, "scale"))
            message("Note: Use 'contrast(regrid(object), ...)' to obtain contrasts of back-transformed estimates")
            misc$tran = misc$tran.mult = misc$tran.offset = NULL
        }
    }
    # ensure we don't inherit inappropriate settings
    misc$null = misc$delta = misc$side = misc$calc = NULL
    object@roles$predictors = "contrast"
    levels = list()
    for (nm in setdiff(names(grid), ".offset."))
        levels[[nm]] = unique(grid[[nm]])
    ### bypass new as we're not re-classing    result = new("emmGrid", object, linfct = linfct, levels = levels, grid = grid, misc = misc)
    result = as(object, "emmGrid")
    result@linfct = linfct
    result@levels = levels
    result@grid = grid
    result@misc = misc
    result@roles$predictors = setdiff(names(result@levels), result@roles$multresp)
    .update.options(result, options, ...)
}

# Add-on to support `simple` argument - the complement of `by`
# e.g., with factors A, B, C,  simple = "A"  <==>  by = c("B", "C")
# Note `by` is an argument so that it can be ignored and never duplicated
# If `simple` is a list, we run this on each element
# If simple = "each", we run it on each factor in the grid
# We handle `adjust` ourselves rather than passing it to `contrast`
.simcon = function(object, ..., simple, by, combine = FALSE, adjust) {
    if (is.list(simple)) {
        if(is.null(names(simple)))
            names(simple) = sapply(simple, function(.)
                paste("simple contrasts for", paste0(., collapse = "*")))
        result = lapply(simple, function(.)
            .simcon(object, ..., simple = .))
        class(result) = c("emm_list", "list")
        if(combine)
            result = do.call(rbind.emmGrid, result)
    }
    else if ((length(simple) == 1) && (simple == "each") &&
             !("each" %in% names(object@levels))) {
        result = .simcon(object, ..., simple = as.list(names(object@levels)),
                         combine = combine)
    }
    else {
        facs = names(object@levels)
        by = setdiff(facs, simple)
        if (length(by) == 0)
            by = NULL
        result = contrast.emmGrid(object, ..., by = by)
    }
    if (!missing(adjust)) {
        if (is.list(result)) {
            for (i in seq_along(result))
                result[[i]] = update(result[[i]], adjust = adjust)
        }
        else
            result = update(result, adjust = adjust)
    }
    result
}

# pairs method
#' @rdname contrast
#' @param x An \code{emmGrid} object
#' @param reverse Logical value - determines whether to use \code{"pairwise"}
#'   (if \code{TRUE}) or \code{"revpairwise"} (if \code{FALSE}).
#' @importFrom graphics pairs
#' @export
pairs.emmGrid = function(x, reverse = FALSE, ...) {
    object = x # for my sanity
    if (reverse)
        contrast(object, method = "revpairwise", ...)
    else
        contrast(object, method = "pairwise", ...)
} # coef method - returns contrast coefficients #' @rdname contrast #' @return \code{coef} returns a \code{data.frame} containing the object's grid, along with columns named \code{c.1, c.2, ...} containing the contrast coefficients. If no #' contrast coefficients are available, a message is displayed and \code{NULL} is returned. #' @export #' @importFrom stats coef #' @method coef emmGrid coef.emmGrid = function(object, ...) { if (is.null(cc <- object@misc$con.coef)) { message("No contrast coefficients are available") return (NULL) } cc = as.data.frame(t(cc)) names(cc) = paste("c", seq_len(ncol(cc)), sep = ".") cbind(object@misc$orig.grid, cc) } emmeans/R/xtable-method.R0000644000176200001440000001431614137062735014775 0ustar liggesusers############################################################################## # Copyright (c) 2012-2017 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # <https://www.gnu.org/licenses/> and/or # <https://www.r-project.org/Licenses/>. # ############################################################################## ### xtable method # Modified from xtableLSMeans function provided by David Scott #' Using \code{xtable} for EMMs #' #' These methods provide support for the \pkg{xtable} package, enabling #' polished presentations of tabular output from \code{\link{emmeans}} #' and other functions. #' #' The methods actually use \code{\link[xtable]{xtableList}}, #' because of its ability to display messages such as those for P-value #' adjustments. These methods return an object of class \code{"xtable_emm"} -- #' an extension of \code{"xtableList"}. Unlike other \code{xtable} methods, the #' number of digits defaults to 4; and degrees of freedom and \emph{t} ratios #' are always formatted independently of \code{digits}. The \code{print} method #' uses \code{\link[xtable:xtableList]{print.xtableList}}, and any \code{\dots} arguments are #' passed there. #' #' @param x Object of class \code{emmGrid} #' @param caption Passed to \code{\link[xtable]{xtableList}} #' @param label Passed to \code{xtableList} #' @param align Passed to \code{xtableList} #' @param digits Passed to \code{xtableList} #' @param display Passed to \code{xtableList} #' @param auto Passed to \code{xtableList} #' @param ... Arguments passed to \code{\link{summary.emmGrid}} #' #' @return The \code{xtable} methods return an \code{xtable_emm} #' object, whose print method is \code{print.xtable_emm}. #' #' @method xtable emmGrid #' @importFrom xtable xtable #' @importFrom xtable xtableList #' @export #' @examples #' pigsint.lm <- lm(log(conc) ~ source * factor(percent), data = pigs) #' pigsint.emm <- emmeans(pigsint.lm, ~ percent | source) #' xtable::xtable(pigsint.emm, type = "response") xtable.emmGrid = function(x, caption = NULL, label = NULL, align = NULL, digits = 4, display = NULL, auto = FALSE, ...)
{ xtable.summary_emm(summary(x, ...), caption = caption, label = label, align = align, digits = digits, display = display, auto = auto) } #' @rdname xtable.emmGrid #' @method xtable summary_emm #' @export xtable.summary_emm = function (x, caption = NULL, label = NULL, align = NULL, digits = 4, display = NULL, auto = FALSE, ...) { if (!is.null(x$df)) x$df = round(x$df, 2) if (!is.null(x$t.ratio)) x$t.ratio = round(x$t.ratio, 3) if (!is.null(x$z.ratio)) x$z.ratio = round(x$z.ratio, 3) if (!is.null(x$p.value)) { fp = x$p.value = format(round(x$p.value,4), nsmall=4, sci=FALSE) x$p.value[fp=="0.0000"] = "<.0001" } if (!is.null(byv <- attr(x, "by.vars"))) { byc = which(names(x) %in% byv) xList = split(as.data.frame(x), f = x[, byc]) labs = rep("", length(xList)) for (i in 1:length(xList)) { levs = sapply(xList[[i]][1, byc], as.character) labs[i] = paste(paste(byv, levs, sep = " = "), collapse = ", ") xList[[i]] = as.data.frame(xList[[i]][, -byc, drop = FALSE]) } attr(xList, "subheadings") = labs } else { xList = list(as.data.frame(x)) } attr(xList, "message") = attr(x, "mesg") result = xtable::xtableList(xList, caption = caption, label = label, align = align, digits = digits, display = display, auto = auto, ...) digits = xtable::digits(result[[1]]) # format df and t ratios i = which(names(x) == "df") if (length(i) > 0) { dfd = ifelse(all(zapsmall(x$df - round(x$df)) == 0), 0, 2) digits[i + 1 - length(byv)] = ifelse(is.na(dfd), 0, dfd) } i = which(names(x) %in% c("t.ratio", "z.ratio")) if (length(i) > 0) digits[i + 1 - length(byv)] = 3 for (i in seq_along(result)) xtable::digits(result[[i]]) = digits class(result) = c("xtable_emm", "xtableList") result } #' @rdname xtable.emmGrid #' @param type Passed to \code{\link[xtable]{print.xtable}} #' @param include.rownames Passed to \code{print.xtable} #' @param sanitize.message.function Passed to \code{print.xtable} #' @method print xtable_emm #' @export print.xtable_emm = function(x, type = getOption("xtable.type", "latex"), include.rownames = FALSE, sanitize.message.function = footnotesize, ...) { footnotesize = switch(type, html = function(x) paste0("<font size = -1>", x, "</font>"), latex = function(x) paste0("{\\footnotesize ", x, "}"), function(x) x ) invisible(xtable::print.xtableList(x, type = type, include.rownames = include.rownames, sanitize.message.function = sanitize.message.function, ...)) } emmeans/R/rbind.R0000644000176200001440000001634714137062735013342 0ustar liggesusers############################################################################## # Copyright (c) 2012-2016 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # <https://www.gnu.org/licenses/> and/or # <https://www.r-project.org/Licenses/>.
# ############################################################################## # rbind method for emmGrid objects #' Combine or subset \code{emmGrid} objects #' #' These functions provide methods for \code{\link[base:cbind]{rbind}} and #' \code{\link[base:Extract]{[}} that may be used to combine \code{emmGrid} objects #' together, or to extract a subset of cases. The primary reason for #' doing this would be to obtain multiplicity-adjusted results for smaller #' or larger families of tests or confidence intervals. #' #' @param ... Additional arguments: #' In \code{rbind}, object(s) of class \code{emmGrid}. #' In \code{"["}, it is ignored. #' In \code{subset}, it is passed to \code{[.emmGrid]} #' @param deparse.level (required but not used) #' @param adjust Character value passed to \code{\link{update.emmGrid}} #' #' @note \code{rbind} throws an error if there are incompatibilities in #' the objects' coefficients, covariance structures, etc. But they #' are allowed to have different factors; a missing level \code{'.'} #' is added to factors as needed. #' #' @return A revised object of class \code{emmGrid} #' @order 1 #' @method rbind emmGrid #' @export #' @examples #' warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks) #' warp.rg <- ref_grid(warp.lm) #' #' # Do all pairwise comparisons within rows or within columns, #' # all considered as one family of tests: #' w.t <- pairs(emmeans(warp.rg, ~ wool | tension)) #' t.w <- pairs(emmeans(warp.rg, ~ tension | wool)) #' rbind(w.t, t.w, adjust = "mvt") #' update(w.t + t.w, adjust = "fdr") ## same as above except for adjustment rbind.emmGrid = function(..., deparse.level = 1, adjust = "bonferroni") { objs = list(...) if (!all(sapply(objs, inherits, "emmGrid"))) stop("All objects must inherit from 'emmGrid'") bhats = lapply(objs, function(o) o@bhat) bhat = bhats[[1]] if(!all(sapply(bhats, function(b) (length(b) == length(bhat)) && (sum((b - bhat)^2, na.rm = TRUE) == 0)))) stop("All objects must have the same fixed effects") Vs = lapply(objs, function(o) o@V) V = Vs[[1]] if(!all(sapply(Vs, function(v) sum((v - V)^2) == 0))) stop("All objects must have the same covariances") obj = objs[[1]] linfcts = lapply(objs, function(o) o@linfct) obj@linfct = do.call(rbind, linfcts) bnms = unlist(lapply(objs, function(o) o@misc$by.vars)) grids = lapply(objs, function(o) o@grid) gnms = unique(c(bnms, unlist(lapply(grids, names)))) gnms = setdiff(gnms, c(".wgt.", ".offset.")) # exclude special names grid = data.frame(.tmp. = seq_len(n <- nrow(obj@linfct))) for (g in gnms) grid[[g]] = rep(".", n) grid[[".wgt."]] = grid[[".offset."]] = 0 grid$.tmp. = NULL n.before = 0 for (g in grids) { rows = n.before + seq_along(g[[1]]) n.before = max(rows) for (nm in setdiff(names(g), c(".wgt.", ".offset."))) grid[rows, nm] = as.character(g[[nm]]) if (!is.null(g$.wgt.)) grid[rows, ".wgt."] = g$.wgt. if (!is.null(g$.offset.)) grid[rows, ".offset."] = g$.offset. } if (all(grid$.wgt. == 0)) grid$.wgt. = 1 if (all(grid$.offset. == 0)) grid$.offset.
= NULL avgd.over = unique(unlist(lapply(objs, function(o) o@misc$avgd.over))) attr(avgd.over, "qualifier") = " some or all of" obj@grid = grid obj@levels = lapply(gnms, function(nm) unique(grid[[nm]])) names(obj@levels) = gnms obj@roles$predictors = setdiff(names(obj@levels), obj@roles$multresp) obj@misc$con.coef = obj@misc$orig.grid = obj@misc$.pairby = NULL update(obj, pri.vars = gnms, by.vars = NULL, adjust = adjust, estType = "rbind", famSize = n, # instead of round((1 + sqrt(1 + 8*n)) / 2, 3), avgd.over = avgd.over) } #' @rdname rbind.emmGrid #' @order 4 #' @param e1 An \code{emmGrid} object #' @param e2 Another \code{emmGrid} object #' @return The result of \code{e1 + e2} is the same as \code{rbind(e1, e2)} #' @method + emmGrid #' @export "+.emmGrid" = function(e1, e2) { if(!is(e2, "emmGrid")) stop("'+.emmGrid' works only when all objects are class `emmGrid`", call. = FALSE) rbind(e1, e2) } ### Subset a reference grid # if drop = TRUE, the levels of factors are reduced #' @rdname rbind.emmGrid #' @order 5 #' @param x An \code{emmGrid} object to be subsetted #' @param i Integer vector of indexes #' @param drop.levels Logical value. If \code{TRUE}, the \code{"levels"} slot in #' the returned object is updated to hold only the predictor levels that actually occur #' #' @method [ emmGrid #' @export #' @examples #' #' # Show only 3 of the 6 cases #' summary(warp.rg[c(2, 4, 5)]) "[.emmGrid" = function(x, i, adjust, drop.levels = TRUE, ...) { x@linfct = x@linfct[i, , drop = FALSE] x@grid = x@grid[i, , drop = FALSE] x = update(x, pri.vars = names(x@grid), famSize = length(i), estType = "[") x@misc$orig.grid = x@misc$con.coef = x@misc$.pairby = NULL x@misc$by.vars = NULL if(!missing(adjust)) x@misc$adjust = adjust if(!is.null(disp <- x@misc$display)) x@misc$display = disp[i] if (drop.levels) { for (nm in names(x@levels)) x@levels[[nm]] = unique(x@grid[[nm]]) } x } #' @rdname rbind.emmGrid #' @order 7 #' @param subset logical expression indicating which rows of the grid to keep #' @method subset emmGrid #' @export #' @examples #' #' # After-the-fact 'at' specification #' subset(warp.rg, wool == "A") ## or warp.rg |> subset(wool == "A") #' subset.emmGrid = function(x, subset, ...) { sel = eval(substitute(subset), envir = x@grid) x[sel, ...] } emmeans/R/glht-support.R0000644000176200001440000002161614157435510014706 0ustar liggesusers############################################################################## # Copyright (c) 2012-2017 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . 
# ############################################################################## ### Code for an enhancement of 'glht' in 'multcomp' package ### Provides for using 'emm' in similar way to 'mcp' ### This is implemented via the class "emmlf" -- linear functions for emmeans ## NOTE: Registration of S3 methods for glht is done dynamically in zzz.R # emm(specs) will be used as 'linfct' argument in glht # all we need to do is class it and save the arguments #' Support for \code{multcomp::glht} #' #' These functions and methods provide an interface between \pkg{emmeans} and #' the \code{multcomp::glht} function for simultaneous inference provided #' by the \pkg{multcomp} package. #' #' \code{emm} is meant to be called only \emph{from} \code{"glht"} as its second #' (\code{linfct}) argument. It works similarly to \code{multcomp::mcp}, #' except with \code{specs} (and optionally \code{by} and \code{contr} #' arguments) provided as in a call to \code{\link{emmeans}}. #' #' @rdname glht-support #' @aliases glht-support glht.emmGrid glht.emmlf modelparm.emmwrap #' @param ... In \code{emm}, the \code{specs}, \code{by}, and \code{contr} #' arguments you would normally supply to \code{\link{emmeans}}. Only #' \code{specs} is required. Otherwise, arguments that are passed to other #' methods. #' #' @return \code{emm} returns an object of an intermediate class for which #' there is a \code{multcomp::glht} method. #' @export emm <- function(...) { result <- list(...) class(result) <- "emmlf" result } # New S3 method for emmlf objects glht.emmlf <- function(model, linfct, ...) { # Pass the arguments we should pass to ref_grid: args = linfct args[[1]] = model names(args)[1] = "object" # Now pass the ref_grid to emmeans: linfct$object <- do.call("ref_grid", args) emmo <- do.call("emmeans", linfct) if (is.list(emmo)) emmo = emmo[[length(emmo)]] # Then call the method for emmo objject glht.emmGrid(model, emmo, ...) } # S3 method for an emmGrid object # Note: model is redundant, really, so can be omitted # See related roxygen stuff just before glht.emmlf glht.emmGrid <- function(model, linfct, by, ...) { .requireNS("multcomp", sQuote("glht")," requires ", dQuote("multcomp"), " to be installed", call. = FALSE) object = linfct # so I don't get confused if (missing(model)) model = .cls.list("emmwrap", object = object) args = list(model = model, ...) # add a df value if not supplied if (is.null(args$df)) { df = summary(linfct)$df df[is.infinite(df)] = NA if(any(!is.na(df))) { args$df = max(1, as.integer(mean(df, na.rm=TRUE) + .25)) if (any(args$df != df)) message("Note: df set to ", args$df) } } if (missing(by)) by = object@misc$by.vars nms = setdiff(names(object@grid), c(by, ".offset.", ".freq.", ".wgt.")) if (is.null(object@misc$estHook)) lf = object@linfct else # custom estimation setup - use the grid itself as the parameterization lf = diag(1, nrow(object@linfct)) dimnames(lf)[[1]] = as.character(interaction(object@grid[, nms], sep=", ")) if (is.null(by)) { args$linfct = lf return(do.call(multcomp::glht, args)) } # (else...) by.rows = .find.by.rows(object@grid, by) result = lapply(by.rows, function(r) { args$linfct = lf[r, , drop=FALSE] do.call(multcomp::glht, args) }) bylevs = lapply(by, function(byv) unique(object@grid[[byv]])) names(bylevs) = by bygrid = do.call("expand.grid", bylevs) levlbls = unname(lapply(by, function(byv) paste(byv, "=", bygrid[[byv]]))) levlbls$sep = ", " names(result) = do.call("paste", levlbls) class(result) = c("glht_list", "list") result } ### as. 
glht -- convert my object to glht object #' @rdname glht-support #' @param object An object of class \code{emmGrid} or \code{emm_list} #' #' @return \code{as.glht} returns an object of class \code{glht} or \code{glht_list} #' according to whether \code{object} is of class \code{emmGrid} or \code{emm_list}. #' See Details below for more on \code{glht_list}s. #' #' @section Details: #' A \code{glht_list} object is simply a \code{list} of \code{glht} objects. #' It is created as needed -- for example, when there is a \code{by} variable. #' Appropriate convenience methods \code{coef}, #' \code{confint}, \code{plot}, \code{summary}, and \code{vcov} are provided, #' which simply apply the corresponding \code{glht} methods to each member. #' #' @note The multivariate-\eqn{t} routines used by \code{glht} require that all #' estimates in the family have the same integer degrees of freedom. In cases #' where that is not true, a message is displayed that shows what df is used. #' The user may override this via the \code{df} argument. #' #' @examples #' if(require(multcomp, quietly = TRUE)) withAutoprint({ # --- multcomp must be installed #' warp.lm <- lm(breaks ~ wool*tension, data = warpbreaks) #' # Using 'emm' #' summary(glht(warp.lm, emm(pairwise ~ tension | wool))) #' # Same, but using an existing 'emmeans' result #' warp.emm <- emmeans(warp.lm, ~ tension | wool) #' summary(as.glht(pairs(warp.emm))) #' # Same contrasts, but treat as one family #' summary(as.glht(pairs(warp.emm), by = NULL)) #' }, spaced = TRUE) #' @export as.glht <- function(object, ...) { UseMethod("as.glht") } #' @method as.glht default #' @export as.glht.default <- function(object, ...) stop("Cannot convert an object of class ", sQuote(class(object)[1]), " to a ", sQuote("glht"), " object") #' @rdname glht-support #' @method as.glht emmGrid #' @export as.glht.emmGrid <- function(object, ...) glht.emmGrid( , object, ...) # 1st arg not necessary #' @method as.glht emm_list #' @export as.glht.emm_list <- function(object, ..., which = 1) as.glht(object[[which]], ...) # S3 modelparm method for emmwrap (S3 wrapper for an emmGrid obj - see glht.emmGrid) #--- dynamically registered in zzz.R --- #' @export modelparm.emmwrap <- function(model, coef., vcov., df, ...) { object = model$object if (is.null(object@misc$estHook)) { bhat = object@bhat V = object@V } else { # Have custom vcov and est methods. Use the grid itself as parameterization bhat = predict(object) V = vcov(object) } if(missing(df) || is.na(df) || is.infinite(df)) df = 0 .cls.list("modelparm", coef = bhat, vcov = V, df = df, estimable = !is.na(bhat)) # This is NOT what we mean by 'estimable', but it is what glht wants... } # S3 methods for glht_list ### Doesn't work so excluded... # cld.glht_list = function(object, ...) # lapply(object, cld, ...) #' @method coef glht_list #' @export coef.glht_list = function(object, ...) lapply(object, coef, ...) #' @method confint glht_list #' @export confint.glht_list = function(object, ...) lapply(object, confint, ...) #' @method plot glht_list #' @export plot.glht_list = function(x, ...) lapply(x, plot, ...) #' @method summary glht_list #' @export summary.glht_list = function(object, ...) lapply(object, summary, ...) #' @method vcov glht_list #' @export vcov.glht_list = function(object, ...) lapply(object, vcov, ...) emmeans/R/emtrends.R0000644000176200001440000003123314164700054014047 0ustar liggesusers############################################################################## # Copyright (c) 2012-2017 Russell V. 
Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## ### Code for emtrends ### emtrends function #' Estimated marginal means of linear trends #' #' The \code{emtrends} function is useful when a fitted model involves a #' numerical predictor \eqn{x} interacting with another predictor \code{a} #' (typically a factor). Such models specify that \eqn{x} has a different trend #' depending on \eqn{a}; thus, it may be of interest to estimate and compare #' those trends. Analogous to the \code{\link{emmeans}} setting, we construct a #' reference grid of these predicted trends, and then possibly average them over #' some of the predictors in the grid. #' #' The function works by constructing reference grids for \code{object} with #' various values of \code{var}, and then calculating difference quotients of predictions #' from those reference grids. Finally, \code{\link{emmeans}} is called with #' the given \code{specs}, thus computing marginal averages as needed of #' the difference quotients. Any \code{...} arguments are passed to the #' \code{ref_grid} and \code{\link{emmeans}}; examples of such optional #' arguments include optional arguments (often \code{mode}) that apply to #' specific models; \code{ref_grid} options such as \code{data}, \code{at}, #' \code{cov.reduce}, \code{mult.names}, \code{nesting}, or \code{transform}; #' and \code{emmeans} options such as \code{weights} (but please avoid #' \code{trend} or \code{offset}. #' #' #' @param object A supported model object (\emph{not} a reference grid) #' @param specs Specifications for what marginal trends are desired -- as in #' \code{\link{emmeans}}. If \code{specs} is missing or \code{NULL}, #' \code{emmeans} is not run and the reference grid for specified trends #' is returned. #' @param var Character value giving the name of a variable with respect to #' which a difference quotient of the linear predictors is computed. In order #' for this to be useful, \code{var} should be a numeric predictor that #' interacts with at least one factor in \code{specs}. Then instead of #' computing EMMs, we compute and compare the slopes of the \code{var} trend #' over levels of the specified other predictor(s). As in EMMs, marginal #' averages are computed for the predictors in \code{specs} and \code{by}. #' See also the \dQuote{Generalizations} section below. #' @param delta.var The value of \emph{h} to use in forming the difference #' quotient \eqn{(f(x+h) - f(x))/h}. Changing it (especially changing its #' sign) may be necessary to avoid numerical problems such as logs of negative #' numbers. The default value is 1/1000 of the range of \code{var} over the #' dataset. #' @param max.degree Integer value. The maximum degree of trends to compute (this #' is capped at 5). 
If greater than 1, an additional factor \code{degree} is #' added to the grid, with corresponding numerical derivatives of orders #' \code{1, 2, ..., max.degree} as the estimates. #' @param ... Additional arguments passed to \code{\link{ref_grid}} or #' \code{\link{emmeans}} as appropriate. See Details. #' #' @section Generalizations: #' Instead of a single predictor, the user may specify some monotone function of #' one variable, e.g., \code{var = "log(dose)"}. If so, the chain rule is #' applied. Note that, in this example, if \code{object} contains #' \code{log(dose)} as a predictor, we will be comparing the slopes estimated by #' that model, whereas specifying \code{var = "dose"} would perform a #' transformation of those slopes, making the predicted trends vary depending on #' \code{dose}. #' #' @return An \code{emmGrid} or \code{emm_list} object, according to \code{specs}. #' See \code{\link{emmeans}} for more details on when a list is returned. #' #' @seealso \code{\link{emmeans}}, \code{\link{ref_grid}} #' #' @note #' In earlier versions of \code{emtrends}, the first argument was named #' \code{model} rather than \code{object}. (The name was changed because of #' potential mis-matching with a \code{mode} argument, which is an option for #' several types of models.) For backward compatibility, \code{model} still works #' \emph{provided all arguments are named}. #' #' @note #' It is important to understand that trends computed by \code{emtrends} are #' \emph{not} equivalent to polynomial contrasts in a parallel model where #' \code{var} is regarded as a factor. That is because the model \code{object} #' here is assumed to fit a smooth function of \code{var}, and the estimated #' trends reflect \emph{local} behavior at particular value(s) of \code{var}; #' whereas when \code{var} is modeled as a factor and polynomial contrasts are #' computed, those contrasts represent the \emph{global} pattern of changes over #' \emph{all} levels of \code{var}. #' #' See the \code{pigs.poly} and \code{pigs.fact} examples below for an #' illustration. The linear and quadratic trends depend on the value of #' \code{percent}, but the cubic trend is constant (because that is true of #' a cubic polynomial, which is the underlying model). The cubic contrast #' in the factorial model has the same P value as for the cubic trend, #' again because the cubic trend is the same everywhere. #' #' @export #' #' @examples #' fiber.lm <- lm(strength ~ diameter*machine, data=fiber) #' # Obtain slopes for each machine ... #' ( fiber.emt <- emtrends(fiber.lm, "machine", var = "diameter") ) #' # ... and pairwise comparisons thereof #' pairs(fiber.emt) #' #' # Suppose we want trends relative to sqrt(diameter)... #' emtrends(fiber.lm, ~ machine | diameter, var = "sqrt(diameter)", #' at = list(diameter = c(20, 30))) #' #' # Obtaining a reference grid #' mtcars.lm <- lm(mpg ~ poly(disp, degree = 2) * (factor(cyl) + factor(am)), data = mtcars) #' #' # Center trends at mean disp for each no. of cylinders #' mtcTrends.rg <- emtrends(mtcars.lm, var = "disp", #' cov.reduce = disp ~ factor(cyl)) #' summary(mtcTrends.rg) # estimated trends at grid nodes #' emmeans(mtcTrends.rg, "am", weights = "prop") #' #' #' ### Higher-degree trends ... 
#' #' pigs.poly <- lm(conc ~ poly(percent, degree = 3), data = pigs) #' emt <- emtrends(pigs.poly, ~ degree | percent, "percent", max.degree = 3, #' at = list(percent = c(9, 13.5, 18))) #' # note: 'degree' is an extra factor created by 'emtrends' #' #' summary(emt, infer = c(TRUE, TRUE)) #' #' # Compare above results with poly contrasts when 'percent' is modeled as a factor ... #' pigs.fact <- lm(conc ~ factor(percent), data = pigs) #' emm <- emmeans(pigs.fact, "percent") #' #' contrast(emm, "poly") #' # Some P values are comparable, some aren't! See Note in documentation emtrends = function(object, specs, var, delta.var=.001*rng, max.degree = 1, ...) { estName = paste(var, "trend", sep=".") # Do now as I may replace var later # construct our first call to ref_grid() to get the data... rgargs = list(object = object, ...) if (is.null(rgargs$options)) rgargs$options = list() # backward compatibility for when 1st argument was "model" if (missing(object) && ("model" %in% names(rgargs))) { names(rgargs)[names(rgargs) == "model"] = "object" } rgargs$options$just.data = TRUE data = do.call("ref_grid", c(rgargs)) rgargs$options$just.data = rgargs$data = NULL x = data[[var]] fcn = NULL # differential if (is.null(x)) { fcn = var var = .all.vars(as.formula(paste("~", var))) if (length(var) > 1) stop("Can only support a function of one variable") else { x = data[[var]] if (is.null(x)) stop("Variable '", var, "' is not in the dataset") } } rng = diff(range(x)) if (delta.var == 0) stop("Provide a nonzero value of 'delta.var'") max.degree = max(1, min(5, as.integer(max.degree + .1))) # create a vector of delta values, such that a middle one has value 0 delts = delta.var * (0:max.degree) idx.base = as.integer((2 + max.degree)/2) delts = delts - delts[idx.base] # set up call for ref_grid # ref_grid hook -- expand grid by these increments of var rgargs$options = c(rgargs$options, list(var = var, delts = delts)) bigRG = do.call("ref_grid", c(rgargs, data = quote(data))) ### create var.subs -- list of indexes for each value of delts ## Since var was created as slowest-varying, only multivariate factors will vary slower gdim = nrow(bigRG@grid) / length(delts) mdim = 1 # added code: only consider mult levels that come AFTER var (post-processing could affect sort order) levnms = names(bigRG@levels) vi = which(levnms == var) for (v in bigRG@roles$multresp) if (which(levnms == v) > vi) mdim = mdim * length(bigRG@levels[[v]]) arr = array(seq_len(nrow(bigRG@grid)), c(gdim / mdim, length(delts), mdim)) var.subs = lapply(seq_along(delts), function(i) as.numeric(arr[,i,])) RG = orig.rg = bigRG[var.subs[[idx.base]]] # the subset that corresponds to reference values row.names(RG@grid) = seq_along(RG@grid[[1]]) linfct = lapply(seq_along(delts), function(i) bigRG@linfct[var.subs[[i]], , drop = FALSE]) if (!is.null(fcn)) { # need a different "h" when diff wrt a function tmp = sapply(seq_along(delts), function(i) eval(parse(text = fcn), envir = bigRG@grid[var.subs[[i]], , drop = FALSE])) delta.var = apply(tmp, 1, function(.) 
mean(diff(.))) } newlf = numeric(0) h = 1 for (i in 1:max.degree) { # successively difference linfct linfct = lapply(seq_along(linfct)[-1], function(j) linfct[[j]] - linfct[[j-1]]) h = h * delta.var * i what = as.integer((length(linfct) + 1) / 2) # pick out one in the middle newlf = rbind(newlf, linfct[[what]] / h) } # Now replace linfct w/ difference quotient(s) RG@linfct = newlf RG@roles$trend = var if (max.degree > 1) { degnms = c("linear", "quadratic", "cubic", "quartic", "quintic") RG@grid$degree = degnms[1] g = RG@grid for (j in 2:max.degree) { g$degree = degnms[j] RG@grid = rbind(RG@grid, g) } RG@roles$predictors = c(RG@roles$predictors, "degree") RG@levels$degree = degnms[1:max.degree] } RG@grid$.offset. = NULL # offset never applies after differencing RG@misc$tran = RG@misc$tran.mult = NULL RG@misc$estName = estName RG@misc$methDesc = "emtrends" .save.ref_grid(RG) # save in .Last.ref_grid, if enabled if (missing(specs) || is.null(specs)) return (RG) # args for emmeans calls args = list(object = NULL, specs = specs, ...) args$at = args$cov.reduce = args$mult.levs = args$vcov. = args$data = args$trend = args$transform = NULL if(max.degree > 1) { chk = union(all.vars(specs), args$by) if (!("degree" %in% chk)) args$by = c("degree", args$by) } args$object = RG do.call("emmeans", args) } emmeans/R/betareg.support.R0000644000176200001440000001325214137062735015362 0ustar liggesusers############################################################################## # Copyright (c) 2012-2017 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## # Support for 'betareg' class # mode is same as 'type' in predict.betareg, PLUS # mode = "phi.link" refers to link function before back-transforming to "precision" recover_data.betareg = function(object, mode = c("response", "link", "precision", "phi.link", "variance", "quantile"), ...) { fcall = object$call mode = match.arg(mode) if (mode %in% c("response", "link")) mode = "mean" if (mode == "phi.link") mode = "precision" if(mode %in% c("mean", "precision")) trms = delete.response(terms(object, model = mode)) else trms = delete.response(object$terms$full) # Make sure there's an offset function available env = new.env(parent = attr(trms, ".Environment")) env$offset = function(x) x attr(trms, ".Environment") = env recover_data(fcall, trms, object$na.action, ...) } # PRELIMINARY... # Currently works correctly only for "resp", "link", "precision", "phi" modes emm_basis.betareg = function(object, trms, xlev, grid, mode = c("response", "link", "precision", "phi.link", "variance", "quantile"), quantile = .5, ...) 
{ mode = match.arg(mode) # if (mode %in% c("variance", "quantile")) # stop(paste0('"', mode, '" mode is not yet supported.')) # figure out which parameters we need model = if (mode %in% c("response", "link")) "mean" else if (mode %in% c("precision", "phi.link")) "precision" else "full" V = .pscl.vcov(object, model = model) # borrowed from pscl methods bhat = coef(object, model = model) nbasis = estimability::all.estble dffun = function(k, dfargs) Inf dfargs = list() if (mode %in% c("response", "link", "precision", "phi.link")) { m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) X = model.matrix(trms, m, contrasts.arg = object$contrasts[[model]]) misc = list(tran = object$link[[model]]$name) if (mode %in% c("response", "precision")) { misc$postGridHook = ".betareg.pg" } } else { ### (mode %in% c("variance", "quantile")) m.trms = delete.response(terms(object, "mean")) m.m = model.frame(m.trms, grid, na.action = na.pass, xlev = xlev) X = model.matrix(m.trms, m.m, contrasts.arg = object$contrasts$mean) m.idx = seq_len(ncol(X)) m.lp = as.numeric(X %*% bhat[m.idx] + .get.offset(m.trms, grid)) mu = object$link$mean$linkinv(m.lp) p.trms = delete.response(terms(object, "precision")) p.m = model.frame(m.trms, grid, na.action = na.pass, xlev = xlev) Z = model.matrix(p.trms, p.m, contrasts.arg = object$contrasts$precision) p.lp = as.numeric(Z %*% bhat[-m.idx] + .get.offset(p.trms, grid)) phi = object$link$precision$linkinv(p.lp) if (mode == "variance") { bhat = mu * (1 - mu) / (1 + phi) dbhat.dm = (1 - 2 * mu) / (1 + phi) dbhat.dp = -bhat / (1 + phi) delta = cbind(diag(dbhat.dm) %*% X, diag(dbhat.dp) %*% Z) V = delta %*% tcrossprod(V, delta) misc = list() } else { ### (mode = "quantile") bhat = as.numeric(sapply(quantile, function(q) stats::qbeta(q, phi * mu, phi * (1 - mu)))) V = matrix(NA, nrow = length(bhat), ncol = length(bhat)) misc = list(ylevs = list(quantile = quantile)) } X = diag(1, length(bhat)) } list(X=X, bhat=bhat, nbasis=nbasis, V=V, dffun=dffun, dfargs=dfargs, misc=misc) } # Post-grid hook for simple back-transforming .betareg.pg = function(object, ...) { object@misc$postGridHook = NULL regrid(object, transform = TRUE) } ### predict methods # link: X%*%beta + off_m # response: mu = h_m(link) # # phi.link: Z%*%gamma + off_p # precision: phi = h_p(phi.link) # # variance: mu*(1 - mu) / (1 + phi) # quantile: qbeta(p, mu*phi, (1 - mu)*phi) # # Defns: # phi = a + b # mu = a / (a + b) # so that phi*mu = a and phi*(1 - mu) = b, # Variance = ab / [(a + b)^2 * (a + b + 1)] emmeans/R/qdrg.R0000644000176200001440000002365214157436051013176 0ustar liggesusers############################################################################## # Copyright (c) 2018 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . 
# ############################################################################## ### Quick-and-dirty support for otherwise unsupported models #' Quick and dirty reference grid #' #' This function may make it possible to compute a reference grid for a model #' object that is otherwise not supported. #' #' If \code{object} is specified, it is used to try to obtain certain #' other arguments, as detailed below. The user should ensure that these defaults #' will work. The default values for the arguments are as follows: #' \itemize{ #' \item{\code{formula}: }{Required unless obtainable via \code{formula(object)}} #' \item{\code{data}: }{Required if variables are not in \code{parent.frame()} or #' obtainable via \code{object$data}} #' \item{\code{coef}: }{\code{coef(object)}} #' \item{\code{vcov}: }{\code{vcov(object)}} #' \item{\code{df}: }{Set to \code{Inf} if not available in \code{object$df.residual}} #' \item{\code{mcmc}: }{\code{object$sample}} #' \item{\code{subset}: }{\code{NULL} (so that all observations in \code{data} are used)} #' \item{\code{contrasts}: }{\code{object$contrasts}} #' } #' #' The functions \code{\link{qdrg}} and \code{emmobj} are close cousins, in that #' they both produce \code{emmGrid} objects. When starting with summary #' statistics for an existing grid, \code{emmobj} is more useful, while #' \code{qdrg} is more useful when starting from a fitted model. #' #' @param formula Formula for the fixed effects #' @param data Dataset containing the variables in the model #' @param coef Fixed-effect regression coefficients (must conform to formula) #' @param vcov Variance-covariance matrix of the fixed effects #' @param df Error degrees of freedom #' @param mcmc Posterior sample of fixed-effect coefficients #' @param object Optional model object. If provided, it is used to set #' certain other arguments, if not specified. See Details. #' @param subset Subset of \code{data} used in fitting the model #' @param weights Weights used in fitting the model #' @param contrasts List of contrasts specified in fitting the model #' @param link Link function (character or list) used, if a generalized linear model. #' (Note: response transformations are auto-detected from \code{formula}) #' @param qr QR decomposition of the model matrix; needed only if there are \code{NA}s #' in \code{coef}. #' @param ordinal.dim Integer number of levels in an ordinal response. If not #' missing, the intercept terms are modified appropriate to predicting the latent #' response (see \code{vignette("models")}, Section O. In this case, we expect #' the first \code{ordinal.dim - 1} elements of \code{coef} to be the #' estimated threshold parameters, followed by the coefficients for the #' linear predictor.) #' @param ... Optional arguments passed to \code{\link{ref_grid}} #' #' @return An \code{emmGrid} object constructed from the arguments #' #' @seealso \code{\link{emmobj}} for an alternative way to construct an \code{emmGrid}. 
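#' # A minimal direct call, when no fitted model object is available, might look
#' # like the following sketch (here 'dat', 'b', 'V', and 'nu' are placeholders):
#' # qdrg(y ~ trt + x, data = dat, coef = b, vcov = V, df = nu)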
#' #' @export #' @examples #' if (require(biglm, quietly = TRUE)) withAutoprint({ #' # Post hoc analysis of a "biglm" object -- not supported by emmeans #' bigmod <- biglm(log(conc) ~ source + factor(percent), data = pigs) #' rg1 <- qdrg(log(conc) ~ source + factor(percent), data = pigs, #' coef = coef(bigmod), vcov = vcov(bigmod), df = bigmod$df.residual) #' emmeans(rg1, "source", type = "response") #' ## But in this particular case, we could have done it the easy way: #' ## rg1 <- qdrg(object = bigmod, data = pigs) #' }, spaced = TRUE) #' if(require(coda, quietly = TRUE) && require(lme4, quietly = TRUE)) #' withAutoprint({ #' # Use a stored example having a posterior sample #' # Model is based on the data in lme4::cbpp #' post <- readRDS(system.file("extdata", "cbpplist", package = "emmeans"))$post.beta #' rg2 <- qdrg(~ size + period, data = lme4::cbpp, mcmc = post, link = "logit") #' summary(rg2, type = "response") #' }, spaced = TRUE) #' if(require(ordinal, quietly = TRUE)) withAutoprint({ #' wine.clm <- clm(rating ~ temp * contact, data = wine) #' ref_grid(wine.clm) #' # verify that we get the same thing via: #' qdrg(object = wine.clm, data = wine, ordinal.dim = 5) #' }, spaced = TRUE) #' qdrg = function(formula, data, coef, vcov, df, mcmc, object, subset, weights, contrasts, link, qr, ordinal.dim, ...) { result = match.call(expand.dots = FALSE) if (!missing(object)) { if (missing(formula)) result$formula = stats::formula(object) if (missing(data)) { data = object$data if (is.null(data)) data = parent.frame() result$data = data } if (missing(coef)) result$coef = stats::coef(object) if (missing(mcmc)) result$mcmc = object$sample if (missing(vcov)) result$vcov = stats::vcov(object) if(missing(df)) result$df = object$df.residual if(missing(contrasts)) result$contrasts = object$contrasts if(any(is.na(result$coef))) result$qr = object$qr } else { if(missing(formula)) stop("When 'object' is missing, must at least provide 'formula'") result$formula = formula if(missing(data)) result$data = parent.frame() else result$data = data if (!missing(coef)) result$coef = coef if (!missing(vcov)) result$vcov = vcov if(!missing(df)) result$df = df } if(is.null(result$df)) result$df = Inf if(!missing(mcmc)) result$mcmc = mcmc if(!missing(subset)) result$subset = subset if(!missing(weights)) result$weights = weights if(!missing(contrasts)) result$contrasts = contrasts if(!missing(link)) result$link = link if(!missing(qr) && any(is.na(result$coef))) result$qr = qr if(!missing(ordinal.dim)) result$ordinal.dim = ordinal.dim # make sure "formula" exists, has a LHS and is is 2nd element so that # response transformation can be found if (is.null(result$formula)) stop("No formula; cannot construct a reference grid") if(length(result$formula) < 3) result$formula = update(result$formula, .dummy ~ .) fpos = grep("formula", names(result))[1] result = result[c(1, fpos, seq_along(result)[-c(1, fpos)])] class(result) = c("qdrg", "call") ref_grid(result, ...) } recover_data.qdrg = function(object, ...) { recover_data.call(object, delete.response(terms(object$formula)), object$na.action, ...) } vcov.qdrg = function(object, ...) object$vcov emm_basis.qdrg = function(object, trms, xlev, grid, ...) { bhat = object$coef m = suppressWarnings(model.frame(trms, grid, na.action = na.pass, xlev = xlev)) X = model.matrix(trms, m, contrasts.arg = object$contrasts) V = .my.vcov(object, ...) 
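# Note: if a posterior sample was supplied via 'mcmc', the fallbacks below use
# its column means and sample covariance whenever 'coef'/'vcov' were omitted,
# so a Bayesian fit need not supply either one.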
if (!is.null(object$mcmc)) { if (is.null(object$coef)) bhat = apply(object$mcmc, 2, mean) if (is.null(object$vcov)) V = cov(object$mcmc) } # If ordinal, add extra avgd, subtracted intercepts -- for latent mode if(!is.null(od <- object$ordinal.dim)) { intcpt = matrix(-1 / (od - 1), nrow = nrow(X), ncol = od - 1) X = cbind(intcpt, X[, -1, drop = FALSE]) } nbasis = estimability::all.estble if (sum(is.na(bhat)) > 0) { if(!is.null(object$qr)) nbasis = estimability::nonest.basis(object$qr) else warning("Non-estimable cases can't be determined.\n", "To rectify, provide appropriate 'qr' in call to qdrg()") } misc = list() # check multivariate situation if (is.matrix(bhat)) { X = kronecker (diag(ncol(bhat)), X) nbasis = kronecker(rep(1, ncol(bhat)), nbasis) nms = colnames(bhat) if (is.null(nms)) nms = seq_len(ncol(bhat)) misc$ylevs = list(rep.meas = nms) bhat = as.numeric(bhat) } if (!is.null(object$link)) { misc = .std.link.labels(eval(list(link = object$link)), misc) dffun = function(k, dfargs) Inf dfargs = list() } else { dfargs = list(df = object$df) dffun = function(k, dfargs) dfargs$df } list(X=X, bhat=bhat, nbasis=nbasis, V=V, dffun=dffun, dfargs=dfargs, misc=misc, post.beta=object$mcmc) } emmeans/R/multiv.R0000644000176200001440000001670514137062735013554 0ustar liggesusers############################################################################## # Copyright (c) 2012-2021 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # <https://www.gnu.org/licenses/> and/or # <https://www.r-project.org/Licenses/>. # ############################################################################## #' Multivariate contrasts #' #' This function displays tests of multivariate comparisons or contrasts. #' The contrasts are constructed at each level of the variable in \code{mult.name}, #' and then we do a multivariate test that the vector of estimates is equal to #' \code{null} (zero by default). The \emph{F} statistic and degrees #' of freedom are determined via the Hotelling distribution. That is, if there are #' \eqn{m} error degrees of freedom and multivariate dimensionality \eqn{d}, then #' the resulting \eqn{F} statistic has degrees of freedom \eqn{(d, m - d + 1)} #' as shown in Hotelling (1931). #' #' #' @param object An object of class \code{emmGrid} #' @param method A contrast method, per \code{\link{contrast.emmGrid}} #' @param mult.name Character vector of names of the factors whose levels #' define the multivariate means to contrast. If the model itself has a #' multivariate response, that is what is used. Otherwise, \code{mult.name} #' \emph{must} be specified. #' @param null Scalar or conformable vector of null-hypothesis values to test against #' @param by Any \code{by} variable(s). These should not include the primary #' variables to be contrasted. For convenience, the \code{by} variable is #' nulled-out if it would result in no primary factors being contrasted.
#' @param adjust Character value of a multiplicity adjustment method #' (\code{"none"} for no adjustment). The available adjustment methods are #' more limited than in \code{contrast}, and any default adjustment returned #' via \code{method} is ignored. #' @param show.ests Logical flag determining whether the multivariate means #' are displayed #' @param ... Additional arguments passed to \code{contrast} #' #' @return An object of class \code{summary_emm} containing the multivariate #' test results; or a list of the estimates and the tests if \code{show.ests} #' is \code{TRUE}. The test results include the Hotelling \eqn{T^2} statistic, #' \eqn{F} ratios, degrees of freedom, and \eqn{P} values. #' @note #' If some interactions among the primary and \code{mult.name} factors are #' absent, the covariance of the multivariate means is singular; this situation #' is accommodated, but the result has reduced degrees of freedom and a message #' is displayed. If there are other abnormal conditions such as non-estimable #' results, estimates are shown as \code{NA}. #' #' While designed primarily for testing contrasts, multivariate tests of the #' mean vector itself can be implemented via \code{method = "identity"} (see #' the examples). #' #' @references Hotelling, Harold (1931) "The generalization of Student's ratio", #' \emph{Annals of Mathematical Statistics} 2(3), 360–378. doi:10.1214/aoms/1177732979 #' #' @export #' #' @examples #' MOats.lm <- lm(yield ~ Variety + Block, data = MOats) #' MOats.emm <- emmeans(MOats.lm, ~ Variety | rep.meas) #' mvcontrast(MOats.emm, "consec", show.ests = TRUE) # mult.name defaults to rep.meas #' #' # Test each mean against a specified null vector #' mvcontrast(MOats.emm, "identity", name = "Variety", #' null = c(80, 100, 120, 140), adjust = "none") #' # (Note 'name' is passed to contrast() and overrides default name "contrast") #' #' # 'mult.name' need not refer to a multivariate response #' mvcontrast(MOats.emm, "trt.vs.ctrl1", mult.name = "Variety") #' mvcontrast = function(object, method = "eff", mult.name = object@roles$multresp, null = 0, by = object@misc$by.vars, adjust = c("sidak", p.adjust.methods), show.ests = FALSE, ...) { if (is.null(mult.name) || length(mult.name) == 0) stop("Must specify at least one factor in 'mult.name'") if(length(setdiff(names(object@levels), union(by, mult.name))) == 0) by = NULL # avoid the case where we're left with no variables con = contrast(object, method = method, by = union(by, mult.name), ...)
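# Overview of what follows: within each by-group, the vector d of contrast
# estimates (with covariance V) yields Hotelling's T^2 = d' V^- d, where V^- is
# a generalized inverse obtained via the QR decomposition; this is converted to
# F = (T^2 / df1) * (df2 / rawdf), with df1 = rank(V), rawdf = the mean error
# d.f., and df2 = rawdf - df1 + 1.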
mvnm = paste(mult.name, collapse = " ") con = contrast(con, "identity", simple = mult.name, name = mvnm) # just re-orders it ese = .est.se.df(con) est = ese$est df = ese$df V = vcov(con) rows = .find.by.rows(con@grid, con@misc$by.vars) red.rank = FALSE # flag for red.rank cases result = lapply(rows, function(r) { QR = try(qr(V[r, r, drop = FALSE]), silent = TRUE) if (inherits(QR, "try-error")) T2 = df1 = df2 = F = NA else { df1 = QR$rank if(df1 < length(r)) red.rank <<- TRUE rawdf = mean(df[r]) df2 = rawdf - df1 + 1 qe = qr.coef(QR, est[r] - null) qe[is.na(qe)] = 0 T2 = sum(qe * (est[r] - null)) F = T2 / df1 * (df2 / rawdf) } data.frame(T.square = T2, df1 = df1, df2 = df2, F.ratio = F) }) result = cbind(con@grid[sapply(rows, function(r) r[1]), ], do.call(rbind, result)) result[[mvnm]] = NULL class(result) = c("summary_emm", "data.frame") by = setdiff(by, mult.name) if (length(by) == 0) by = NULL rows = .find.by.rows(result, by) adjust = match.arg(adjust) result$p.value = NA for (r in rows) { pv = with(result[r, ], pf(F.ratio, df1, df2, lower.tail = FALSE)) result$p.value[r] = switch(adjust, sidak = 1 - (1 - pv)^length(r), p.adjust(pv, adjust)) } attr(result, "estName") = "F.ratio" attr(result, "by.vars") = by if (adjust == "none") mesg = NULL else mesg = paste("P value adjustment:", adjust) if (red.rank) mesg = c(mesg, "NOTE: Some or all d.f. are reduced due to singularities") if(any(is.na(result$T.square))) mesg = c(mesg, "NAs indicate non-estimable cases or other errors") attr(result, "mesg") = mesg if (show.ests) list(estimates = con, tests = result) else result }emmeans/R/plot.emm.R0000644000176200001440000006467014147511200013766 0ustar liggesusers############################################################################## # Copyright (c) 2012-2017 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # <https://www.gnu.org/licenses/> and/or # <https://www.r-project.org/Licenses/>. # ############################################################################## # S3 plot method for emmGrid objects # ... are arguments sent to update() # DELETED: #' @import ggplot2 #' @rdname plot #' @importFrom graphics plot #' @method plot emmGrid #' @export plot.emmGrid = function(x, y, type, CIs = TRUE, PIs = FALSE, comparisons = FALSE, colors = c("black", "blue", "blue", "red"), alpha = .05, adjust = "tukey", int.adjust = "none", intervals, frequentist, ...)
{ if(!missing(intervals)) CIs = intervals nonlin.scale = FALSE if(!missing(type)) { # If we say type = "scale", set it to "response" and set a flag if (nonlin.scale <- (type %.pin% "scale")) ##(pmatch(type, "scale", 0) == 1)) type = "response" object = update(x, predict.type = type, ..., silent = TRUE) } else object = update(x, ..., silent = TRUE) ptype = ifelse(is.null(object@misc$predict.type), "lp", object@misc$predict.type) # when we want comparisons, we have a transformation, and we want non-link scale, it's mandatory to regrid first: ### Nope, we are bypassing that as we will ALWAYS do things on the link scale # if(comparisons && !is.null(object@misc$tran) && # !(ptype %in% c("link", "lp", "linear.predictor"))) # object = regrid(object, transform = ptype) if (missing(int.adjust)) { int.adjust = object@misc$adjust if (is.null(int.adjust)) int.adjust = "none" } # we will do everything on link scale and back-transform later in .plot.srg... summ = summary(object, infer = c(TRUE, FALSE), adjust = int.adjust, frequentist = frequentist, type = "lp", ...) if (is.null(attr(summ, "pri.vars"))) { ## new ref_grid - use all factors w/ > 1 level pv = names(x@levels) len = sapply(x@levels, length) if (max(len) > 1) pv = pv[len > 1] attr(summ, "pri.vars") = pv } if (PIs) { prd = predict(object, interval = "pred", frequentist = frequentist, type = "lp", ...) summ$lpl = prd$lower.PL summ$upl = prd$upper.PL } estName = attr(summ, "estName") extra = NULL bayes = if(comparisons) { if ((!is.na(object@post.beta[1])) && (missing(frequentist) || !frequentist)) stop("Comparison intervals are not implemented for Bayesian analyses") extra = object extra@misc$comp.alpha = alpha extra@misc$comp.adjust = adjust } if (nonlin.scale) scale = .make.scale(object@misc) else scale = NULL .plot.srg(x = summ, CIs = CIs, PIs = PIs, colors = colors, extra = extra, backtran = !(ptype %in% c("link", "lp", "linear.predictor")), link = attr(.est.se.df(object), "link"), scale = scale, ...) } # May use in place of plot.emmGrid but no control over level etc. # extra is a placeholder for comparison-interval stuff #' Plot an \code{emmGrid} or \code{summary_emm} object #' #' Methods are provided to plot EMMs as side-by-side CIs, and optionally to display #' \dQuote{comparison arrows} for displaying pairwise comparisons. #' #' @rdname plot #' @param x Object of class \code{emmGrid} or \code{summary_emm} #' @param y (Required but ignored) #' @param horizontal Logical value specifying whether the intervals should be #' plotted horizontally or vertically #' @param xlab Character label for horizontal axis #' @param ylab Character label for vertical axis #' @param layout Numeric value passed to \code{\link[lattice:xyplot]{dotplot}} #' when \code{engine == "lattice"}. #' @param type Character value specifying the type of prediction desired #' (matching \code{"linear.predictor"}, \code{"link"}, or \code{"response"}). #' See details under \code{\link{summary.emmGrid}}. #' In addition, the user may specify \code{type = "scale"}, in which case #' a transformed scale (e.g., a log scale) is displayed based on the transformation #' or link function used. Additional customization of this scale is available through #' including arguments to \code{ggplot2::scale_x_continuous} in \code{...} . #' @param scale Object of class \code{trans} (in the \pkg{scales} package) to #' specify a nonlinear scale. This is used in lieu of \code{type = "scale"} when #' plotting a \code{summary_emm} object created with \code{type = "response"}. 
#' This is ignored with other types of summaries. #' @param CIs Logical value. If \code{TRUE}, confidence intervals are #' plotted for each estimate. #' @param PIs Logical value. If \code{TRUE}, prediction intervals are #' plotted for each estimate. If \code{object} is a Bayesian model, #' this requires \code{frequentist = TRUE} and \code{sigma =} (some value). #' Note that the \code{PIs} option is \emph{not} available with #' \code{summary_emm} objects -- only for \code{emmGrid} objects. #' Also, prediction intervals are not available #' with \code{engine = "lattice"}. #' @param comparisons Logical value. If \code{TRUE}, \dQuote{comparison arrows} #' are added to the plot, in such a way that the degree to which arrows #' overlap reflects as much as possible the significance of the comparison of #' the two estimates. (A warning is issued if this can't be done.) #' Note that comparison arrows are not available with \code{summary_emm} objects. #' @param colors Character vector of color names to use for estimates, CIs, PIs, #' and comparison arrows, respectively. CIs and PIs are rendered with some #' transparency, and colors are recycled if the length is less than four; #' so all plot elements are visible even if a single color is specified. #' @param alpha The significance level to use in constructing comparison arrows #' @param adjust Character value: Multiplicity adjustment method for comparison arrows \emph{only}. #' @param int.adjust Character value: Multiplicity adjustment method for the plotted confidence intervals \emph{only}. #' @param intervals If specified, it is used to set \code{CIs}. This is the previous #' argument name for \code{CIs} and is provided for backward compatibility. #' @param frequentist Logical value. If there is a posterior MCMC sample and #' \code{frequentist} is non-missing and \code{TRUE}, a frequentist summary is used for #' obtaining the plot data, rather than the posterior point estimate and HPD #' intervals. This argument is ignored when it is not a Bayesian model. #' @param plotit Logical value. If \code{TRUE}, a graphical object is returned; #' if \code{FALSE}, a data.frame is returned containing all the values #' used to construct the plot. #' @param ... Additional arguments passed to \code{\link{update.emmGrid}}, #' \code{\link{predict.emmGrid}}, or #' \code{\link[lattice:xyplot]{dotplot}} #' #' @return If \code{plotit = TRUE}, a graphical object is returned. #' #' If \code{plotit = FALSE}, a \code{data.frame} is returned with the table of #' EMMs that would be plotted. In the latter case, the estimate being plotted #' is named \code{the.emmean}, and any factors involved have the same names as #' in the object. Confidence limits are named \code{lower.CL} and #' \code{upper.CL}, prediction limits are named \code{lpl} and \code{upl}, and #' comparison-arrow limits are named \code{lcmpl} and \code{ucmpl}. #' There is also a variable named \code{pri.fac} which contains the factor #' combinations that are \emph{not} among the \code{by} variables. #' @section Details: #' If any \code{by} variables are in force, the plot is divided into separate #' panels. For #' \code{"summary_emm"} objects, the \code{\dots} arguments in \code{plot} #' are passed \emph{only} to \code{dotplot}, whereas for \code{"emmGrid"} #' objects, the object is updated using \code{\dots} before summarizing and #' plotting.
#' #' In plots with \code{comparisons = TRUE}, the resulting arrows are only #' approximate, and in some cases may fail to accurately reflect the pairwise #' comparisons of the estimates -- especially when estimates having large and #' small standard errors are intermingled in just the wrong way. Note that the #' maximum and minimum estimates have arrows only in one direction, since there #' is no need to compare them with anything higher or lower, respectively. See #' the \href{../doc/xplanations.html#arrows}{\code{vignette("xplanations", #' "emmeans")}} for details on how these are derived. #' #' If \code{adjust} or \code{int.adjust} are not supplied, they default to the #' internal \code{adjust} setting saved in \code{pairs(x)} and \code{x} #' respectively (see \code{\link{update.emmGrid}}). #' #' @note In order to play nice with the plotting functions, #' any variable names that are not syntactically correct (e.g., contain spaces) #' are altered using \code{\link{make.names}}. #' #' @importFrom graphics plot #' #' @method plot summary_emm #' @export #' #' @examples #' warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks) #' warp.emm <- emmeans(warp.lm, ~ tension | wool) #' plot(warp.emm) #' plot(warp.emm, by = NULL, comparisons = TRUE, adjust = "mvt", #' horizontal = FALSE, colors = "darkgreen") #' #' ### Using a transformed scale #' pigs.lm <- lm(log(conc + 2) ~ source * factor(percent), data = pigs) #' pigs.emm <- emmeans(pigs.lm, ~ percent | source) #' plot(pigs.emm, type = "scale", breaks = seq(20, 100, by = 10)) #' #' # Based on a summary. #' # To get a transformed axis, must specify 'scale'; but it does not necessarily #' # have to be the same as the actual response transformation #' pigs.ci <- confint(pigs.emm, type = "response") #' plot(pigs.ci, scale = scales::log10_trans()) plot.summary_emm = function(x, y, horizontal = TRUE, CIs = TRUE, xlab, ylab, layout, scale = NULL, colors = c("black", "blue", "blue", "red"), intervals, plotit = TRUE, ...) { if(!missing(intervals)) CIs = intervals if(attr(x, "type") != "response") # disable scale when no response transformation scale = NULL .plot.srg (x, y, horizontal, xlab, ylab, layout, scale = scale, CIs = CIs, colors = colors, plotit = plotit, ...) } # Workhorse for plot.summary_emm .plot.srg = function(x, y, horizontal = TRUE, xlab, ylab, layout, colors, engine = get_emm_option("graphics.engine"), CIs = TRUE, PIs = FALSE, extra = NULL, plotit = TRUE, backtran = FALSE, link, scale = NULL, ...) { engine = match.arg(engine, c("ggplot", "lattice")) if (engine == "ggplot") .requireNS("ggplot2", "The 'ggplot' engine requires the 'ggplot2' package be installed.") if (engine == "lattice") .requireNS("lattice", "The 'lattice' engine requires the 'lattice' package be installed.") summ = x # so I don't get confused estName = "the.emmean" names(summ)[which(names(summ) == attr(summ, "estName"))] = estName clNames = attr(summ, "clNames") if (is.null(clNames)) { warning("No information available to display confidence limits") lcl = ucl = summ[[estName]] } else { lcl = summ[[clNames[1]]] ucl = summ[[clNames[2]]] } # ensure all names are syntactically valid summ = .validate.names(summ) if (engine == "lattice") { # ---------- lattice-specific stuff ---------- # Panel functions... prepanel.ci = function(x, y, horizontal=TRUE, CIs=TRUE, lcl, ucl, subscripts, ...) 
{ x = as.numeric(x) lcl = as.numeric(lcl[subscripts]) ucl = as.numeric(ucl[subscripts]) if (!CIs) # no special scaling needed list() else if (horizontal) list(xlim = range(x, ucl, lcl, finite = TRUE)) else list(ylim = range(y, ucl, lcl, finite = TRUE)) } panel.ci <- function(x, y, horizontal=TRUE, CIs=TRUE, lcl, ucl, lcmpl, rcmpl, subscripts, pch = 16, lty = dot.line$lty, lwd = dot.line$lwd, col = dot.symbol$col, col.line = dot.line$col, ...) { dot.line <- lattice::trellis.par.get("dot.line") dot.symbol <- lattice::trellis.par.get("dot.symbol") x = as.numeric(x) y = as.numeric(y) lcl = as.numeric(lcl[subscripts]) ucl = as.numeric(ucl[subscripts]) compare = !is.null(lcmpl) if(compare) { lcmpl = as.numeric(lcmpl[subscripts]) rcmpl = as.numeric(rcmpl[subscripts]) } if(horizontal) { lattice::panel.abline(h = unique(y), col = col.line, lty = lty, lwd = lwd) if(CIs) lattice::panel.arrows(lcl, y, ucl, y, col = col, length = .6, unit = "char", angle = 90, code = 3) if(compare) { s = (x > min(x)) lattice::panel.arrows(lcmpl[s], y[s], x[s], y[s], length = .5, unit = "char", code = 1, col = "red", type = "closed", fill="red") s = (x < max(x)) lattice::panel.arrows(rcmpl[s], y[s], x[s], y[s], length = .5, unit = "char", code = 1, col = "red", type = "closed", fill="red") } } else { lattice::panel.abline(v = unique(x), col = col.line, lty = lty, lwd = lwd) if(CIs) lattice::panel.arrows(x, lcl, x, ucl, col=col, length = .6, unit = "char", angle = 90, code = 3) if(compare) { s = (y > min(y)) lattice::panel.arrows(x[s], lcmpl[s], x[s], y[s], length = .5, unit = "char", code = 1, col = "red", type = "closed", fill="red") s = (y < max(y)) lattice::panel.arrows(x[s], rcmpl[s], x[s], y[s], length = .5, unit = "char", code = 1, col = "red", type = "closed", fill="red") } } lattice::panel.xyplot(x, y, pch=16, ...) } my.strip = lattice::strip.custom(strip.names = c(TRUE,TRUE), strip.levels = c(TRUE,TRUE), sep = " = ") } # ---------- end lattice-specific ----------- sep = get_emm_option("sep") priv = attr(summ, "pri.vars") pf = do.call(paste, c(unname(summ[priv]), sep = sep)) if (length(pf) == 0) pf = "1" summ$pri.fac = factor(pf, levels=unique(pf)) chform = ifelse(horizontal, paste("pri.fac ~", estName), paste(estName, "~ pri.fac")) byv = attr(summ, "by.vars") if (!is.null(byv) && length(byv) > 0) { chform = paste(chform, "|", paste(byv, collapse="*")) lbv = do.call("paste", c(unname(summ[byv]), sep = sep)) # strings for matching by variables ubv = unique(lbv) } else { lbv = rep(1, nrow(summ)) ubv = 1 } # Obtain comparison limits if (!is.null(extra)) { ### We will ALWAYS be working on LP scale now... 
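        # Overview of the computation below: within each 'by' group, left and
        # right arrow lengths are estimated by weighted least squares so that,
        # as nearly as possible, two arrows just barely overlap exactly when
        # that pairwise comparison is borderline significant at level 'alpha'.
        # If the solution is degenerate (negative lengths), we abort with an error.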
# # we need to work on the linear predictor scale # # typeid = 1 -> response, 2 -> other # typeid = pmatch(extra@misc$predict.type, "response", nomatch = 2) # if(length(typeid) < 1) typeid = 2 # if (typeid == 1) # est = predict(extra, type = "lp") # else est = summ[[estName]] alpha = extra@misc$comp.alpha adjust = extra@misc$comp.adjust psumm = suppressMessages(confint(pairs(extra), level = 1 - alpha, type = "lp", adjust = adjust)) psumm = .validate.names(psumm) k = ncol(psumm) del = (psumm[[k]] - psumm[[k-1]]) / 4 # half the halfwidth, on lp scale diff = psumm[[attr(psumm, "estName")]] overlap = apply(psumm[ ,(k-1):k], 1, function(x) 2*min(-x[1],x[2])/(x[2]-x[1])) # figure out by variables and indexes (lbv, ubv already defined) if(is.null(byv)) pbv = rep(1, nrow(psumm)) else pbv = do.call("paste", c(unname(psumm[byv]), sep = sep)) neach = length(lbv) / length(ubv) # indexes for pairs results -- est[id1] - est[id2] id1 = rep(seq_len(neach-1), rev(seq_len(neach-1))) id2 = unlist(sapply(seq_len(neach-1), function(x) x + seq_len(neach-x))) # list of psumm row numbers involved in each summ row involved = lapply(seq_len(neach), function(x) union(which(id2==x), which(id1==x))) # initialize arrays mind = numeric(length(lbv)) # for minima of del llen = rlen = numeric(neach) # for left and right arrow lengths npairs = length(id1) iden = diag(rep(1, 2*neach)) for (by in ubv) { d = del[pbv == by] rows = which(lbv == by) for(i in seq_len(neach)) mind[rows[i]] = min(d[involved[[i]]]) # Set up regression equations to match arrow overlaps with interval overlaps # We'll add rows later (with weights 1) to match with mind values lmat = rmat = matrix(0, nrow = npairs, ncol = neach) y = numeric(npairs) v1 = 1 - overlap[pbv == by] dif = diff[pbv == by] for (i in which(!is.na(v1))) { #wgt = 6 * max(0, ifelse(v1[i] < 1, v1[i], 2-v1[i])) wgt = 3 + 20 * max(0, .5 - (1 - v1[i])^2) # really this is sqrt of weight if (dif[i] > 0) # id2 <-----> id1 lmat[i, id1[i]] = rmat[i, id2[i]] = wgt*v1[i] else # id1 <-----> id2 rmat[i, id1[i]] = lmat[i, id2[i]] = wgt*v1[i] y[i] = wgt * abs(dif[i]) } X = rbind(cbind(lmat, rmat),iden) y = c(y, rep(mind[rows], 2)) y[is.na(y)] = 0 soln = qr.coef(qr(X), y) soln[is.na(soln)] = 0 ll = llen[rows] = soln[seq_len(neach)] rl = rlen[rows] = soln[neach + seq_len(neach)] # Abort if negative lengths if (any(c(rl, ll) < 0)) { stop("Aborted -- Some comparison arrows have negative length!\n", "(in group \"", by, "\")", call. = FALSE) } # Overlap check for (i in which(!is.na(v1))) { v = 1 - v1[i] obsv = 1 - abs(dif[i]) / ifelse(dif[i] > 0, ll[id1[i]] + rl[id2[i]], rl[id1[i]] + ll[id2[i]]) if (v*obsv < 0) warning("Comparison discrepancy in group \"", by, "\", ", psumm[i, 1], ":\n Target overlap = ", round(v, 4), ", overlap on graph = ", round(obsv, 4), call. 
= FALSE) } # shorten arrows that go past the data range estby = est[rows] rng = suppressWarnings(range(estby, na.rm = TRUE)) diffr = diff(rng) ii = which(estby - ll < rng[1]) llen[rows][ii] = estby[ii] - rng[1] + .02 * diffr ii = which(estby + rl > rng[2]) rlen[rows][ii] = rng[2] - estby[ii] + .02 * diffr # remove arrows completely from extremes llen[rows][estby < rng[1] + .0001 * diffr] = NA rlen[rows][estby > rng[2] - .0001 * diffr] = NA } ### Unneeded now as we are always on LP scale # invtran = I # if (typeid == 1) { # tran = extra@misc$tran # if(is.character(tran)) { # link = try(make.link(tran), silent=TRUE) # if (!inherits(link, "try-error")) # invtran = link$linkinv # } # else if (is.list(tran)) # invtran = tran$linkinv # } lcmpl = summ$lcmpl = as.numeric(est - llen) rcmpl = summ$rcmpl = as.numeric(est + rlen) } else lcmpl = rcmpl = NULL if(backtran && !missing(link) && !is.null(link)) { ### we need to back-transform stuff... link must be non-missing summ$the.emmean = with(link, linkinv(summ$the.emmean)) summ[[clNames[1]]] = lcl = with(link, linkinv(lcl)) summ[[clNames[2]]] = ucl = with(link, linkinv(ucl)) if (PIs) { summ$lpl = with(link, linkinv(summ$lpl)) summ$upl = with(link, linkinv(summ$upl)) } if(!is.null(extra)) { summ$lcmpl = lcmpl = with(link, linkinv(summ$lcmpl)) summ$rcmpl = rcmpl = with(link, linkinv(summ$rcmpl)) } } if(!plotit) return(as.data.frame(summ)) facName = paste(priv, collapse=":") if(length(colors) < 4) colors = rep(colors, 4) dot.col = colors[1] CI.col = colors[2] PI.col = colors[3] comp.col = colors[4] if (engine == "lattice") { if (missing(layout)) { layout = c(1, length(ubv)) if(!horizontal) layout = rev(layout) } form = as.formula(chform) if (horizontal) { if (missing(xlab)) xlab = attr(summ, "estName") if (missing(ylab)) ylab = facName lattice::dotplot(form, prepanel=prepanel.ci, panel=panel.ci, strip = my.strip, horizontal = TRUE, ylab = ylab, xlab = xlab, data = summ, CIs = CIs, lcl=lcl, ucl=ucl, lcmpl=lcmpl, rcmpl=rcmpl, layout = layout, ...) } else { if (missing(xlab)) xlab = facName if (missing(ylab)) ylab = attr(summ, "estName") lattice::dotplot(form, prepanel=prepanel.ci, panel=panel.ci, strip = my.strip, horizontal = FALSE, xlab = xlab, ylab = ylab, data = summ, CIs = CIs, lcl=lcl, ucl=ucl, lcmpl=lcmpl, rcmpl=rcmpl, layout = layout, ...) 
} } # --- end lattice plot else { ## ggplot method summ$lcl = lcl summ$ucl = ucl # construct horizontal plot - flip coords later if necessary grobj = ggplot2::ggplot(summ, ggplot2::aes_(x = ~the.emmean, y = ~pri.fac)) + ggplot2::geom_point(color = CI.col, alpha = .0) # invisible points to get grouping order right if (PIs) grobj = grobj + ggplot2::geom_segment(ggplot2::aes_(x = ~lpl, xend = ~upl, y = ~pri.fac, yend = ~pri.fac), color = PI.col, lwd = 2.5, alpha = .15) if (CIs) grobj = grobj + ggplot2::geom_segment(ggplot2::aes_(x = ~lcl, xend = ~ucl, y = ~pri.fac, yend = ~pri.fac), color = CI.col, lwd = 4, alpha = .25) if (!is.null(extra)) { grobj = grobj + ggplot2::geom_segment(ggplot2::aes_(x = ~the.emmean, xend = ~lcmpl, y = ~pri.fac, yend = ~pri.fac), arrow = ggplot2::arrow(length = ggplot2::unit(.07, "inches"), type = "closed"), color = comp.col, data = summ[!is.na(summ$lcmpl), ]) + ggplot2::geom_segment(ggplot2::aes_(x = ~the.emmean, xend = ~rcmpl, y = ~pri.fac, yend = ~pri.fac), arrow = ggplot2::arrow(length = ggplot2::unit(.07, "inches"), type = "closed"), color = comp.col, data = summ[!is.na(summ$rcmpl), ]) } if (length(byv) > 0) grobj = grobj + ggplot2::facet_grid(as.formula(paste(paste(byv, collapse = "+"), " ~ .")), labeller = "label_both") grobj = grobj + ggplot2::geom_point(color = dot.col, size = 2) if(!is.null(scale)) { args = list(...) pass = (names(args) %.pin% names(as.list(args(ggplot2::scale_x_continuous)))) args = c(list(trans = scale), args[pass]) grobj = grobj + do.call(ggplot2::scale_x_continuous, args) } if (missing(xlab)) xlab = attr(summ, "estName") if (missing(ylab)) ylab = facName if(!horizontal) grobj = grobj + ggplot2::coord_flip() grobj + ggplot2::labs(x = xlab, y = ylab) } } emmeans/R/brms-support.R0000644000176200001440000000305314137062735014711 0ustar liggesusers### Rudimentary support for brms. ### Obviously this is way less than is needed, but it does support simpler models #xxxx' @importFrom brms parse_bf recover_data.brmsfit = function(object, data, ...) { bt = brms::parse_bf(formula(object)) if (class(bt) != "brmsterms") stop("This model is currently not supported.") mt = attr(model.frame(bt$dpars$mu$fe, data = object$data), "terms") # form = bt$dpars$mu$fe # for (att in c("predvars", "dataClasses")) # attr(trms, att) = attr(mt, att) ### we don't have a call component so I'll just put in NULL recover_data.call(NULL, mt, "na.omit", data = object$data, ...) } emm_basis.brmsfit = function(object, trms, xlev, grid, vcov., ...) { m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) contr = lapply(object$data, function(.) attr(., "contrasts")) contr = contr[!sapply(contr, is.null)] X = model.matrix(trms, m, contrasts.arg = contr) ###nm = gsub("(Intercept)", "Intercept", dimnames(X)[[2]], fixed = TRUE) nm = eval(parse(text = "brms:::rename(colnames(X))")) V = vcov(object)[nm, nm, drop = FALSE] nbasis = estimability::all.estble dfargs = list() dffun = function(k, dfargs) Inf misc = .std.link.labels(brms::parse_bf(formula(object))$dpars$mu$family, list()) post.beta = as.matrix(object, pars = paste0("b_", nm), exact = TRUE) bhat = apply(post.beta, 2, mean) list(X=X, bhat=bhat, nbasis=nbasis, V=V, dffun=dffun, dfargs=dfargs, misc=misc, post.beta=post.beta) } emmeans/R/gam-support.R0000644000176200001440000002345614156726526014531 0ustar liggesusers############################################################################## # Copyright (c) 2012-2017 Russell V. 
Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## ### Support for 'gam' objects # Note: This is a mess, because both packages 'gam' and 'mgcv' produce these, # and they are different. Both inherit from glm and lm, though, so recover_data.lm # still serves for these (I hope) # gam::Gam objects... # We have two args: # nboot # of bootstrap reps to get variances of smooths emm_basis.Gam = function(object, trms, xlev, grid, nboot = 800, ...) { result = emm_basis.lm(object, trms, xlev, grid, ...) old.smooth = object$smooth if (is.null(old.smooth)) # "just an ordinary glm" (My Fair Lady) return(result) # else we need to add-in some smoothers smooth.frame = model.frame(trms, grid, na.action = na.pass, xlev = xlev) data = object$smooth.frame labs = names(data) w = object$weights resid = object$residuals for (i in seq_along(labs)) { lab = labs[i] sig = apply(smooth.frame[, i, drop=FALSE], 1, paste, collapse = ":") usig = unique(sig) rows = lapply(usig, function(s) which(sig == s)) xeval = smooth.frame[sapply(rows, "[", 1), lab] bsel = matrix(0, nrow = length(sig), ncol = length(usig)) for (j in seq_along(rows)) bsel[rows[[j]], j] = 1 cl = attr(data[[i]], "call") cl$xeval = substitute(xeval) z = resid + old.smooth[, lab] bh = as.numeric(eval(cl)) m = length(bh) n = length(result$bhat) result$bhat = c(result$bhat, bh) result$X = cbind(result$X, bsel) boot = replicate(nboot, { z = sample(resid, replace = TRUE) + old.smooth[, lab] as.numeric(eval(cl)) }) covar = if(m == 1) var(boot) else cov(t(boot)) result$V = rbind(cbind(result$V, matrix(0, nrow = n, ncol = m)), cbind(matrix(0, nrow = m, ncol = n), covar)) } result } ### This addition contributed by Hannes Riebl (#303) .emm_basis.gam_multinom = function(object, trms, xlev, grid, freq = FALSE, unconditional = FALSE, mode = c("prob", "latent"), ...) { mode = match.arg(mode) X = mgcv::predict.gam(object, newdata = grid, type = "lpmatrix", newdata.guaranteed = TRUE) k = length(attr(X, "lpi")) nbhat = vapply(attr(X, "lpi"), length, FUN.VALUE = integer(1)) pat = (rbind(0, diag(k + 1, k)) - 1) / (k + 1) X = apply(pat, 1, function(row) { y = rep.int(row, times = nbhat) out = apply(X, 1, "*", y = y, simplify = FALSE) do.call(rbind, out) }, simplify = FALSE) X = do.call(rbind, X) bhat = as.numeric(coef(object)) V = .my.vcov(object, freq = freq, unconditional = unconditional, ...) 
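    # everything is estimable here, so we stack k copies of the trivial
    # (1 x 1 NA) basis -- one for each latent linear predictor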
nbasis = kronecker(rep.int(1, times = k), estimability::all.estble) dfargs = list(df = sum(object$edf)) dffun = function(k, dfargs) dfargs$df misc = list(tran = "log", inv.lbl = "e^y") ylevs = list(class = seq.int(0, k)) names(ylevs) = as.character(object$formula[[1]][[2]]) misc$ylevs = ylevs if (mode == "prob") { misc$postGridHook = .multinom.postGrid } list(X = X, bhat = bhat, nbasis = nbasis, V = V, dffun = dffun, dfargs = dfargs, misc = misc) } ### emm_basis method for mgcv::gam objects ### extra arg `unconditional` and `freq` as in `vcov.gam` emm_basis.gam = function(object, trms, xlev, grid, freq = FALSE, unconditional = FALSE, what = c("location", "scale", "shape", "rate", "prob.gt.0"), ...) { bhat = object$coefficients X = mgcv::predict.gam(object, newdata = grid, type = "lpmatrix", newdata.guaranteed = TRUE) bhat = as.numeric(bhat) V = .my.vcov(object, freq = freq, unconditional = unconditional, ...) fam_name = object$family$family what_num = what if (fam_name == "multinom") { return(.emm_basis.gam_multinom(object, trms, xlev, grid, freq, unconditional, ...)) } else if (fam_name == "mvn") { if (!is.numeric(what)) { stop("Family 'mvn' requires a numeric argument 'what'") } } else if (is.character(what)) { what = match.arg(what) if (fam_name == "ziplss") { what_num = switch(what, location = 1, rate = 1, prob.gt.0 = 2) } else { what_num = switch(what, location = 1, scale = 2, shape = 3) } } select = attr(X, "lpi") if (is.null(select)) select = list(seq_along(bhat)) select = try(select[[what_num]], silent = TRUE) if (inherits(select, "try-error")) { stop("Model does not have a linear predictor 'what = ", what, "'") } bhat = bhat[select] X = X[, select, drop = FALSE] V = V[select, select, drop = FALSE] nbasis = estimability::all.estble link = object$family$link[what_num] if(link == "identity") # they may be lying link = switch(fam_name, ocat = "logit", ziP = "log", cox.ph = "log", ## ??? ziplss = c("log", "cloglog")[what_num], gevlss = c("identity", "log", "logit")[what_num], "identity") misc = .std.link.labels(list(link = link, family = fam_name), list()) if (!is.null(misc$tran) && misc$tran == "logb") # the way this is documented is truly bizarre but I think this is right misc$tran = make.tran("genlog", - environment(object$family$linfo[[what_num]]$linkfun)$b) dfargs = list(df = object$df.residual) dffun = function(k, dfargs) dfargs$df list(X=X, bhat=bhat, nbasis=nbasis, V=V, dffun=dffun, dfargs=dfargs, misc=misc) } ### mgcv::gamm objects... recover_data.gamm = function(object, data = NULL, call = object$gam$call, ...) { gam = object$gam class(gam) = c("gam", "glm", "lm") if (!is.null(data)) { gam$call = quote(gamm()) return(recover_data(gam, data = data, ...)) } else { if (is.null(call)) return("Must supply either 'data' or 'call' with gamm objects") gam$call = call recover_data(gam, ...) } } emm_basis.gamm = function(object, ...) emm_basis(object$gam, ...) ###=================================================================== # Support for gamlss objects # 'what' parameter mimics predict.gamlss recover_data.gamlss = function(object, what = c("mu", "sigma", "nu", "tau"), ...) { fcall = object$call what = match.arg(what) trms = terms(formula(object, what = what)) recover_data(fcall, delete.response(trms), object$na.action, ...) } emm_basis.gamlss = function(object, trms, xlev, grid, what = c("mu", "sigma", "nu", "tau"), vcov., ...) 
{
    what = match.arg(what)
    smo.mat = object[[paste0(what, ".s")]]
    if (!is.null(smo.mat))
        stop("gamlss models with smoothing are not yet supported in 'emmeans'", call. = NULL)
    object$coefficients = object[[paste0(what, ".coefficients")]]
    if (missing(vcov.)) { # tedious code to pull needed vcov elements
        # Gotta do this before messing up the object
        V = suppressWarnings(vcov(object))
        len = sapply(object$parameters, function(p) length(object[[paste0(p, ".coefficients")]]))
        before = which(object$parameters == what) - 1
        if (before > 0) before = sum(len[seq_len(before)])
        idx = before + seq_along(object$coefficients)
        vcov. = V[idx, idx, drop = FALSE]
    }
    if (!is.null(link <- object[[paste0(what, ".link")]])) {
        # Decide whether to use d.f. or not
        use.df = c("BCCG", "BCPE", "BCT", "GA", "GT", "NO", "NOF", "TF")
        fam = ifelse((what == "mu") && (object$family[1] %in% use.df), "gaussian", "other")
        object$family = list(family = fam, link = link)
    }
    ###object$qr = object[[paste0(what, ".qr")]]
    ###NextMethod("emm_basis", vcov. = vcov., ...)
    emm_basis.lm(object, trms, xlev, grid, vcov. = vcov., ...)
}
emmeans/R/S4-classes.R0000644000176200001440000001644114137062735014162 0ustar liggesusers
##############################################################################
# Copyright (c) 2012-2017 Russell V. Lenth #
# #
# This file is part of the emmeans package for R (*emmeans*) #
# #
# *emmeans* is free software: you can redistribute it and/or modify #
# it under the terms of the GNU General Public License as published by #
# the Free Software Foundation, either version 2 of the License, or #
# (at your option) any later version. #
# #
# *emmeans* is distributed in the hope that it will be useful, #
# but WITHOUT ANY WARRANTY; without even the implied warranty of #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
# GNU General Public License for more details. #
# #
# You should have received a copy of the GNU General Public License #
# along with R and *emmeans*. If not, see #
# and/or #
# . #
##############################################################################

### S4 class definitions for emmeans package

### emmGrid object - For reference grids, emmeans results, etc.
#' The \code{emmGrid} class
#'
#' The \code{emmGrid} class encapsulates linear functions of regression
#' parameters, defined over a grid of predictors. This includes reference
#' grids and grids of marginal means thereof (aka estimated marginal means).
#' Objects of class \code{emmGrid} may be used independently of the underlying model
#' object. Instances are created primarily by \code{\link{ref_grid}} and
#' \code{\link{emmeans}}, and several related functions.
#'
#' @rdname emmGrid-class
#' @slot model.info list. Contains the elements \code{call} (the call that
#' produced the model), \code{terms} (its \code{terms} object), and
#' \code{xlev} (factor-level information)
#' @slot roles list. Contains at least the elements \code{predictors},
#' \code{responses}, and \code{multresp}. Each is a character vector of names
#' of these variables.
#' @slot grid data.frame. Contains the combinations of the variables that define
#' the reference grid. In addition, there is an auxiliary column named
#' \code{".wgt."} holding the observed frequencies or weights for each factor
#' combination (excluding covariates). If the model has one or more
#' \code{\link{offset}()} calls, there is another auxiliary column named
#' \code{".offset."}. Auxiliary columns are not considered part of the
#' reference grid.
(However, any variables included in \code{offset} calls #' \emph{are} in the reference grid.) #' @slot levels list. Each entry is a character vector with the distinct levels #' of each variable in the reference grid. Note that \code{grid} is obtained #' by applying the function \code{\link{expand.grid}} to this list #' @slot matlevs list. Like \code{levels} but has the levels of any matrices in #' the original dataset. Matrix columns are always concatenated and treated as #' a single variable for purposes of the reference grid #' @slot linfct matrix. Each row consists of the linear function of the #' regression coefficients for predicting its corresponding element of the #' reference grid. The rows of this matrix go in one-to-one correspondence #' with the rows of \code{grid}, and the columns with elements of \code{bhat}. #' @slot bhat numeric. The regression coefficients. If there is a multivariate #' response, the matrix of coefficients is flattened to a single vector, and #' \code{linfct} and \code{V} redefined appropriately. Important: \code{bhat} #' must \emph{include} any \code{NA} values produced as a result of #' collinearity in the predictors. These are taken care of later in the #' estimability check. #' @slot nbasis matrix. The basis for the non-estimable functions of the #' regression coefficients. Every EMM will correspond to a linear combination #' of rows of \code{linfct}, and that result must be orthogonal to all the #' columns of \code{nbasis} in order to be estimable. If everything is #' estimable, \code{nbasis} should be a 1 x 1 matrix of \code{NA}. #' @slot V matrix. The symmetric variance-covariance matrix of \code{bhat} #' @slot dffun function having two arguments. \code{dffun(k, dfargs)} should #' return the degrees of freedom for the linear function \code{sum(k*bhat)}, #' or \code{NA} if unavailable #' @slot dfargs list. Used to hold any additional information needed by #' \code{dffun}. #' @slot misc list. Additional information used by methods. These include at #' least the following: \code{estName} (the label for the estimates of linear #' functions), and the default values of \code{infer}, \code{level}, and #' \code{adjust} to be used in the \code{\link{summary.emmGrid}} method. Elements in #' this slot may be modified if desired using the \code{\link{update.emmGrid}} #' method. #' @slot post.beta matrix. A sample from the posterior distribution of the #' regression coefficients, if MCMC methods were used; or a 1 x 1 matrix of #' \code{NA} otherwise. When it is non-trivial, the \code{\link{as.mcmc.emmGrid}} #' method returns \code{post.beta \%*\% t(linfct)}, which is a sample from the #' posterior distribution of the EMMs. #' #' @section Methods: #' All methods for these objects are S3 methods except for \code{show}. 
#' They include \code{\link{[.emmGrid}}, \code{\link{as.glht.emmGrid}}, #' \code{\link{as.mcmc.emmGrid}}, \code{\link{as.mcmc.list.emmGrid}} (see \pkg{coda}), #' \code{\link{cld.emmGrid}} (see \pkg{multcomp}), #' \code{\link{coef.emmGrid}}, \code{\link{confint.emmGrid}}, #' \code{\link{contrast.emmGrid}}, \code{\link{pairs.emmGrid}}, #' \code{\link{plot.emmGrid}}, \code{\link{predict.emmGrid}}, \code{\link{print.emmGrid}}, #' \code{\link{rbind.emmGrid}}, \code{show.emmGrid}, \code{\link{str.emmGrid}}, #' \code{\link{summary.emmGrid}}, \code{\link{test.emmGrid}}, #' \code{\link{update.emmGrid}}, \code{\link{vcov.emmGrid}}, and #' \code{\link{xtable.emmGrid}} #' #' @export setClass("emmGrid", slots = c( model.info = "list", roles = "list", grid = "data.frame", levels = "list", matlevs = "list", linfct = "matrix", bhat = "numeric", nbasis = "matrix", V = "matrix", dffun = "function", dfargs = "list", misc = "list", post.beta = "matrix" )) # Note: misc will hold various extra params, # including at least the following req'd by the summary method # estName: column name for the estimate in the summary ["prediction"] # infer: booleans (CIs?, tests?) [(FALSE,FALSE)] # level: default conf level [.95] # adjust: default adjust method ["none"] # famSize: number of means in family # NOTE: Old ref.grid and lsmobj classes moved to deprecated.Remmeans/R/rms-support.R0000644000176200001440000001157414137062735014556 0ustar liggesusers############################################################################## # Copyright (c) 2012-2017 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## # Support for objects in the *rms* package recover_data.rms = function(object, ...) { fcall = object$call recover_data(fcall, delete.response(terms(object)), object$na.action$omit, ...) } # TODO: # 1. If multivariate - like mlm method? # 2. orm cases? emm_basis.rms = function(object, trms, xlev, grid, mode = c("middle", "latent", "linear.predictor", "cum.prob", "exc.prob", "prob", "mean.class"), vcov., ...) { mode = match.arg(mode) bhat = coef(object) if (missing(vcov.)) V = vcov(object, intercepts = "all") else V = .my.vcov(object, vcov.) 
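    # Note: predict(object, type = "x") omits intercept columns, so the leading
    # nint elements of bhat are the intercept(s); they are spliced back into X
    # below, depending on 'mode'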
    misc = list()
    X = predict(object, newdata = grid, type = "x")
    #xnames = dimnames(X)[[2]]
    #intcpts = setdiff(names(bhat), xnames)
    nint = length(bhat) - ncol(X)
    intcpts = names(bhat)[seq_len(nint)]
    xnames = setdiff(names(bhat), intcpts)
    if (length(intcpts) == 1)
        mode = "single" # stealth mode for ordinary single-intercept case
    if (mode %in% c("single", "middle", "latent")) {
        X = cbind(1, X)
        mididx = ifelse(mode != "middle", 1, as.integer((1 + length(intcpts)) / 2))
        dimnames(X)[[2]][1] = switch(mode, single = intcpts, middle = intcpts[mididx], latent = "avg.intercept")
        if (mode == "middle") {
            nms = c(intcpts[mididx], xnames)
            bhat = bhat[nms]
            V = V[nms, nms, drop = FALSE]
        } else if (mode == "latent") {
            bhat = c(mean(bhat[intcpts]), bhat[xnames])
            nx = length(xnames)
            J1 = rbind(rep(1/nint, nint), matrix(0, nrow = nx, ncol = nint))
            J2 = rbind(0, diag(1, nx))
            J = cbind(J1, J2)
            V = J %*% V %*% t(J)
        }
        ### else mode == "single" and all is OK as it is
    } else { # mode %in% c("linear.predictor", "cum.prob", "exc.prob", "prob", "mean.class")
        misc$ylevs = list(cut = intcpts)
        I = diag(1, nint)
        J = matrix(1, nrow = nrow(X))
        JJ = matrix(1, nrow=nint)
        X = cbind(kronecker(I, J), kronecker(JJ, X)) # Note V is correct as-is
        dimnames(X)[[2]] = c(intcpts, xnames)
        if (mode != "linear.predictor") {
            misc$mode = mode
            misc$postGridHook = .clm.postGrid
            misc$respName = as.character.default(object$terms)[2]
        }
    }
    # I think rms does not allow rank deficiency...
    nbasis = estimability::all.estble
    if (!is.null(object$family)) {
        if (!is.character(object$family))
            misc = .std.link.labels(object$family, misc)
        else {
            misc$tran = object$family
            if (misc$tran == "logistic") misc$tran = "logit"
            misc$inv.lbl = switch(class(object)[1], orm = "exc.prob", lrm = ifelse(nint == 1, "prob", "exc.prob"), "response")
        }
        dffun = function(k, dfargs) Inf
        dfargs = list()
    } else {
        dfargs = list(df = object$df.residual)
        if (is.null(dfargs$df)) dfargs$df = Inf
        dffun = function(k, dfargs) dfargs$df
    }
    list(X=X, bhat=bhat, nbasis=nbasis, V=V, dffun=dffun, dfargs=dfargs, misc=misc)
}
emmeans/R/nested.R0000644000176200001440000005542414164627214013522 0ustar liggesusers
##############################################################################
# Copyright (c) 2012-2017 Russell V. Lenth #
# #
# This file is part of the emmeans package for R (*emmeans*) #
# #
# *emmeans* is free software: you can redistribute it and/or modify #
# it under the terms of the GNU General Public License as published by #
# the Free Software Foundation, either version 2 of the License, or #
# (at your option) any later version. #
# #
# *emmeans* is distributed in the hope that it will be useful, #
# but WITHOUT ANY WARRANTY; without even the implied warranty of #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
# GNU General Public License for more details. #
# #
# You should have received a copy of the GNU General Public License #
# along with R and *emmeans*. If not, see #
# and/or #
# . #
##############################################################################

# Code supporting nested models
# This code relies on nested structures specified in a named list like
# list(a = "b", c = c("d", "e"))
# ...
to denote a %in% b, c %in% d*e ### Create a grouping factor and add it to a ref grid #' Add a grouping factor #' #' This function adds a grouping factor to an existing reference grid or other #' \code{emmGrid} object, such that the levels of one or more existing factors (call them the #' reference factors) are mapped to a smaller number of levels of the new #' grouping factor. The reference factors are then nested in the new grouping factor. #' This facilitates obtaining marginal means of the grouping factor, and #' contrasts thereof. #' #' @param object An \code{emmGrid} object #' @param newname Character name of grouping factor to add (different from any #' existing factor in the grid) #' @param refname Character name(s) of the reference factor(s) #' @param newlevs Character vector or factor of the same length as that of the (combined) levels for #' \code{refname}. The grouping factor \code{newname} will have the unique #' values of \code{newlevs} as its levels. The order of levels in \code{newlevs} #' is the same as the order of the level combinations produced by #' \code{\link{expand.grid}} applied to the levels of \code{refname} -- that is, the #' first factor's levels change the fastest and the last one's vary the slowest. #' #' @return A revised \code{emmGrid} object having an additional factor named #' \code{newname}, and a new nesting structure with each \code{refname \%in\% newname} #' #' @note By default, the levels of \code{newname} will be ordered #' alphabetically. To dictate a different ordering of levels, supply #' \code{newlevs} as a \code{factor} having its levels in the desired order. #' #' @note When \code{refname} specifies more than one factor, this can #' fundamentally (and permanently) change what is meant by the levels of those #' individual factors. For instance, in the \code{gwrg} example below, there #' are two levels of \code{wool} nested in each \code{prod}; and that implies #' that we now regard these as four different kinds of wool. Similarly, there #' are five different tensions (L, M, H in prod 1, and L, M in prod 2). 
#' #' @export #' #' @examples #' fiber.lm <- lm(strength ~ diameter + machine, data = fiber) #' ( frg <- ref_grid(fiber.lm) ) #' #' # Suppose the machines are two different brands #' brands <- factor(c("FiberPro", "FiberPro", "Acme"), levels = c("FiberPro", "Acme")) #' ( gfrg <- add_grouping(frg, "brand", "machine", brands) ) #' #' emmeans(gfrg, "machine") #' #' emmeans(gfrg, "brand") #' #' ### More than one reference factor #' warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks) #' gwrg <- add_grouping(ref_grid(warp.lm), #' "prod", c("tension", "wool"), c(2, 1, 1, 1, 2, 1)) #' # level combinations: LA MA HA LB MB HB #' #' emmeans(gwrg, ~ wool * tension) # some NAs due to impossible combinations #' #' emmeans(gwrg, "prod") #' add_grouping = function(object, newname, refname, newlevs) { if(!is.null(object@model.info$nesting[[refname]])) stop("'", refname, "' is already nested in another factor; cannot re-group it") if(newname %in% object@roles$predictors) stop("'", newname, "' is already the name of an existing predictor") rlevs = do.call(paste, do.call(expand.grid, object@levels[refname])) if (length(newlevs) != length(rlevs)) stop("Length of 'newlevs' doesn't match # levels of '", refname, "'") newlevs = factor(newlevs) glevs = levels(newlevs) k = length(glevs) one = matrix(1, nrow = k, ncol = 1) object@linfct = kronecker(one, object@linfct) object@levels[[newname]] = glevs object@roles$predictors = c(object@roles$predictors, newname) ref = do.call(paste, object@grid[refname]) # obs levels of rlevs wgt = object@grid$.wgt. if (is.null(wgt)) wgt = rep(1, nrow(object@grid)) offset = object@grid$.offset. ogrid = object@grid[setdiff(names(object@grid), c(".wgt.", ".offset."))] grid = data.frame() valid = logical(0) # flag for rows that make sense for (i in 1:k) { g = ogrid g[[newname]] = glevs[i] g$.wgt. = wgt g$.offset. = offset grid = rbind(grid, g) alevs = rlevs[newlevs == glevs[i]] valid = c(valid, ref %in% alevs) } # screen out invalid rows grid[!valid, ".wgt."] = 0 object@linfct[!valid, ] = NaN object@misc$pri.vars = c(object@misc$pri.vars, newname) if(is.null(disp <- object@misc$display)) object@misc$display = valid else object@misc$display = disp & valid object@grid = grid # update nesting structure nesting = object@model.info$nesting if (is.null(nesting)) nesting = list() for (nm in names(nesting)) if (any(refname %in% nesting[[nm]])) nesting[[nm]] = c(nesting[[nm]], newname) for (nm in refname) nesting[[nm]] = newname ### ??? should it be c(nesting[[nm]], newname) object@model.info$nesting = nesting object } ### ----- Rest of this file is used only internally --------- # Internal function to deal with nested structures. # rgobj -- an emmGrid object # specs, ... -- arguments for emmeans # nesting -- a named list of nesting info # This function works by subsetting rgobj as needed, and applying emmeans # to each subsetted object # This is a servant to emmeans.character.emmGrid, so we can assume specs is character .nested_emm = function(rgobj, specs, by = NULL, ..., nesting) { # # Trap something not supported for these... This doesn't work # dots = list(...) # if("weights" %in% dots) # if(!is.na(pmatch(dots$weights, "show.levels"))) # stop('weights = "show.levels" is not supported for nested models.') orig.by = by # save original 'by' vars #### Two issues to worry about.... # (1) specs contains nested factors. 
We need to include their grouping factors xspecs = intersect(union(specs, by), names(nesting)) if (length(xspecs) > 0) { xgrps = unlist(nesting[xspecs]) specs = union(union(xspecs, xgrps), specs) # expanded specs with flagged ones first by = setdiff(by, xspecs) # can't use nested factors for grouping } # (2) If we average over any nested factors, we need to do it separately avg.over = setdiff(names(rgobj@levels), union(specs, by)) afacs = intersect(names(nesting), avg.over) ### DUH!names(nesting)[names(nesting) %in% avg.over] rgobj@misc$display = NULL ## suppress warning messages from emmeans if (length(afacs) == 0) { # no nesting issues; just use emmeans result = emmeans(rgobj, specs, by = by, ...) } else { # we need to handle each group separately sz = sapply(afacs, function(nm) length(nesting[[nm]])) # use highest-order one first: potentially, we end up calling this recursively afac = afacs[rev(order(sz))][1] otrs = setdiff(afacs, afac) # other factors than afac grpfacs = union(nesting[[afac]], otrs) gspecs = union(specs, union(by, grpfacs)) grpids = as.character(interaction(rgobj@grid[, grpfacs])) grps = do.call(expand.grid, rgobj@levels[grpfacs]) # all combinations of group factors result = NULL rg = rgobj actually.avgd.over = character(0) # keep track of this from emmeans calls for (i in seq_len(nrow(grps))) { sig = as.character(interaction(grps[i, ])) rows = which(grpids == sig) grd = rgobj@grid[rows, , drop = FALSE] lf = rgobj@linfct[rows, , drop = FALSE] # Reduce grid to infacs that actually appear in this group nzg = grd[grd$.wgt. > 0, , drop = FALSE] rows = integer(0) # focus on levels of afac that exist in this group levs = unique(nzg[[afac]]) rg@levels[[afac]] = levs rows = union(rows, which(grd[[afac]] %in% levs)) rg@grid = grd[rows, , drop = FALSE] rg@linfct = lf[rows, , drop = FALSE] for (j in seq_along(grpfacs)) rg@levels[[grpfacs[j]]] = grps[i, j] emmGrid = suppressMessages(emmeans(rg, gspecs, ...)) actually.avgd.over = union(actually.avgd.over, emmGrid@misc$avgd.over) if (is.null(result)) result = emmGrid else { result@grid = rbind(result@grid, emmGrid@grid) result@linfct = rbind(result@linfct, emmGrid@linfct) } } for (j in seq_along(grpfacs)) result@levels[grpfacs[j]] = rgobj@levels[grpfacs[j]] result@misc$avgd.over = setdiff(actually.avgd.over, gspecs) result@misc$display = NULL nkeep = intersect(names(nesting), names(result@levels)) if (length(nkeep) > 0) result@model.info$nesting = nesting[nkeep] else result@model.info$nesting = NULL # Note: if any nesting remains, this next call recurs back to this function result = emmeans(result, specs, by = by, ...) } if (length(xspecs) > 0) result@misc$display = .find.nonempty.nests(result, xspecs, nesting) # preserve any nesting that still exists nesting = nesting[names(nesting) %in% names(result@levels)] result@model.info$nesting = if (length(nesting) > 0) nesting else NULL # resolve 'by' by = orig.by if (length(xspecs <- intersect(by, names(nesting)))) by = union(unlist(nesting[xspecs]), by) result@misc$by.vars = by result } ### contrast function for nested structures .nested_contrast = function(rgobj, method = "eff", interaction = FALSE, by = NULL, adjust, ...) 
{ nesting = rgobj@model.info$nesting # Prevent meaningless cases -- if A %in% B, we can't have A in 'by' without B # Our remedy will be to EXPAND the by list for (nm in intersect(by, names(nesting))) if (!all(nesting[[nm]] %in% by)) { by = union(by, nesting[[nm]]) message("Note: Grouping factor(s) for '", nm, "' have been added to the 'by' list.") } if(!is.character(method)) stop ("Non-character contrast methods are not supported with nested objects") testcon = try(get(paste0(method, ".emmc"))(1:3), silent = TRUE) if(inherits(testcon, "try-error")) testcon = NULL if(missing(adjust)) adjust = attr(testcon, "adjust") estType = attr(testcon, "type") wkrg = rgobj # working copy facs = setdiff(names(wkrg@levels), by) # these are the factors we'll combine & contrast if (length(facs) == 0) stop("There are no factor levels left to contrast. Try taking nested factors out of 'by'.") if (!is.null(display <- wkrg@misc$display)) wkrg = wkrg[which(display), drop.levels = TRUE] wkrg@model.info$nesting = wkrg@misc$display = NULL by.rows = .find.by.rows(wkrg@grid, by) if(length(by.rows) == 1) result = contrast.emmGrid(wkrg, method = method, interaction = interaction, by = by, adjust = adjust, ...) else { result = lapply(by.rows, function(rows) { contrast.emmGrid(wkrg[rows, drop.levels = TRUE], method = method, interaction = interaction, by = by, adjust = adjust, ...) }) # set up coef matrix comb.nms = unique(do.call(paste, wkrg@grid[facs])) ncon = sapply(result, function(x) nrow(x@grid)) con.idx = rep(seq_along(by.rows), ncon) ## looks like 1,1, 2,2,2, 3, 4,4 ... con.coef = matrix(0, nrow = sum(ncon), ncol = nrow(wkrg@grid)) # Have to define .wgt. for nested emmGrid. Use average weight - seems most sensible for (i in seq_along(by.rows)) { result[[i]]@grid$.wgt. = mean(wkrg@grid[[".wgt."]][by.rows[[i]]]) con.coef[con.idx == i, by.rows[[i]]] = result[[i]]@misc$con.coef } result$adjust = ifelse(is.null(adjust), "none", adjust) result = do.call(rbind.emmGrid, result) result@misc$con.coef = con.coef result@misc$orig.grid = wkrg@grid[names(wkrg@levels)] result = update(result, by = by, estType = ifelse(is.null(estType), "contrast", estType)) cname = setdiff(names(result@levels), by) if (!is.null(result@model.info$nesting)) for (nm in cname) result@model.info$nesting[[nm]] = by } ## result@misc$orig.grid = result@misc$con.coef = NULL # we now provide these for (nm in by) { if (nm %in% names(nesting)) result@model.info$nesting[[nm]] = intersect(nesting[[nm]], by) } if(!is.na(ET <- result@misc$estType) && (ET == "pairs")) # internal flag to keep track of original by vars for paired comps result@misc$.pairby = paste(c("", by), collapse = ",") result } # Internal function to find nonempty cells in nested structures in rgobj for xfacs # Returns logical vector, FALSE are rows of the grid we needn't display .find.nonempty.nests = function(rgobj, xfacs = union(names(nesting), unlist(nesting)), nesting = rgobj@model.info$nesting) { grid = rgobj@grid keep = rep(TRUE, nrow(grid)) for (x in xfacs) { facs = union(x, nesting[[x]]) combs = do.call(expand.grid, rgobj@levels[facs]) levs = as.character(interaction(combs)) glevs = as.character(interaction(grid[facs])) for (lev in levs) { idx = which(glevs == lev) if (all(grid$.wgt.[idx] == 0)) { keep[idx] = FALSE levs[levs==lev] = "" } } } keep } # Fill-in extra elements to make a grid regular #' @rdname rbind.emmGrid #' @order 9 #' @param object an object of class \code{emmGrid} #' @return \code{force_regular} adds extra (invisible) rows to an \code{emmGrid} object #' to make it a 
regular grid (all combinations of factors). This regular structure is #' needed by \code{emmeans}. An object can become irregular by, for example, #' subsetting rows, or by obtaining contrasts of a nested structure. #' @export #' @examples #' #' ### Irregular object #' tmp <- warp.rg[-1] #' ## emmeans(tmp, "tension") # will fail because tmp is irregular #' emmeans(force_regular(tmp), "tension") # will show some results force_regular = function(object) { newgrid = do.call(expand.grid, object@levels) newkey = do.call(paste, newgrid) newlf = matrix(NA, nrow = nrow(newgrid), ncol = ncol(object@linfct)) colnames(newlf) = colnames(object@linfct) newdisp = rep(FALSE, nrow(newgrid)) oldgrid = object@grid oldkey = do.call(paste, oldgrid[setdiff(names(oldgrid), c(".wgt.", ".offset."))]) if (wtd <- (".wgt." %in% names(oldgrid))) newgrid$.wgt. = 0 if (ofs <- (".offset." %in% names(oldgrid))) newgrid$.offset. = NA for (j in seq_along(oldkey)) { key = oldkey[j] i = which(newkey == key) newlf[i, ] = object@linfct[j, ] newdisp[i] = TRUE if(wtd) newgrid$.wgt.[i] = oldgrid$.wgt.[j] if(ofs) newgrid$.offset.[i] = oldgrid$.offset.[j] } object@grid = newgrid object@linfct = newlf if(is.null(object@model.info$nesting) && all(newdisp)) # remove 'display` if all TRUE` newdisp = NULL object@misc$display = newdisp object } # Internal function to find nesting # We look at two things: # (1) structural nesting - i.e., any combinations of # factors A and B for which each level of A occurs with one and only one # level of B. If so, we deem A %in% B. # (2) Model-term nesting - cases where a factor appears not as a main effect # but only in higher-order terms. This is discovered using the 1s and 2s in # trms$factors # The function returns a named list, e.g., list(A = "B") # If none found, an empty list is returned. # # Added ver 1.5.3+ : if trms is NULL, we skip that part .find_nests = function(grid, trms, coerce, levels) { result = list() # only consider cases where levels has length > 1 lng = sapply(levels, length) nms = names(levels[lng > 1]) if (length(nms) < 2) return (result) g = grid[grid$.wgt. 
> 0, nms, drop = FALSE] for (nm in nms) { x = levels[[nm]] # exclude other factors this is already nested in excl = sapply(names(result), function(lnm) ifelse(nm %in% result[[lnm]], lnm, "")) otrs = setdiff(nms[!(nms == nm)], excl) max.levs = sapply(otrs, function(n) { max(sapply(x, function(lev) length(unique(g[[n]][g[[nm]] == lev])))) }) if (any(max.levs == 1)) result[[nm]] = otrs[max.levs == 1] } if(!is.null(trms)) { # Now look at factors attribute fac = attr(trms, "factors") if (length(fac) > 0) { if (!is.null(coerce)) for (stg in coerce) { subst = paste(.all.vars(stats::reformulate(stg)), collapse = ":") for (i in 1:2) dimnames(fac)[[i]] = gsub(stg, subst, dimnames(fac)[[i]], fixed = TRUE) } fac = fac[intersect(nms, row.names(fac)), , drop = FALSE] ### new code nms = row.names(fac) cols = dimnames(fac)[[2]] pert = setdiff(nms, intersect(nms, cols)) # pertinent - no main effect in model for (nm in pert) { pfac = fac[, fac[nm, ] == 1, drop = FALSE] # cols where nm appears if (ncol(pfac) == 0) { # case where there is no 1 in a row pfac = fac[, fac[nm, ] == 2, drop = FALSE] pfac[nm, ] = 1 # make own entry 1 so it isn't nested in self } nst = .strip.supersets(apply(pfac, 2, function(col) nms[col == 2])) if (length(nst) > 0) result[[nm]] = union(result[[nm]], nst) } } } # [end if(!is.null(trms))] # include nesting factors that are themselves nested for (nm in names(result)) result[[nm]] = union(unlist(result[result[[nm]]]), result[[nm]]) # omit any entries where factors are nested in themselves (we get such in a model like y ~ A : B) for (nm in names(result)) if (nm %in% result[[nm]]) result[[nm]] = NULL result } # strip supersets from a list and condense down to a character vector # e.g., lst = list("a", c("A", "B")) --> "A" .strip.supersets = function(lst) { if (is.list(lst) && (length(lst) > 1)) { lst = lst[order(sapply(lst, length))] # order by length for (i in length(lst):2) { tst = sapply(lst[1:(i-1)], function(x) all(x %in% lst[[i]])) if (any(tst)) lst[[i]] = NULL } } unique(unlist(lst)) } # internal function to format a list of nested levels .fmt.nest = function(nlist) { if (length(nlist) == 0) "none" else { tmp = lapply(nlist, function(x) if (length(x) == 1) x else paste0("(", paste(x, collapse = "*"), ")") ) paste(sapply(names(nlist), function (nm) paste0(nm, " %in% ", tmp[[nm]])), collapse = ", ") } } # internal function to parse a nesting string & return a list # spec can be a named list, character vector ####, or formula .parse_nest = function(spec) { if (is.null(spec)) return(NULL) if (is.list(spec)) return (spec) result = list() # break up any comma delimiters spec = trimws(unlist(strsplit(spec, ","))) for (s in spec) { parts = strsplit(s, "[ ]+%in%[ ]+")[[1]] grp = .all.vars(stats::reformulate(parts[2])) result[[parts[[1]]]] = grp } if(length(result) == 0) result = NULL result } # ### I'm removing this because I now think it creates more problems than it solves # # # # courtesy function to create levels for a nested structure factor %in% nest # # factor: factor (or interaction() result) # # ...: factors in nest # # SAS: if (FALSE|TRUE), reference level in each nest is (first|last) # nested = function(factor, ..., SAS = FALSE) { # nfacs = list(...) 
# if (length(nfacs) == 0) # return(factor) # nfacs$drop = TRUE # nest = do.call(interaction, nfacs) # result = as.character(interaction(factor, nest, sep = ".in.")) # ores = unique(sort(result)) # nlev = levels(nest) # flev = levels(factor) # refs = lapply(nlev, function(nst) { # r = ores[ores %in% paste0(flev, ".in.", nst)] # ifelse (SAS, rev(r)[1], r[1]) # }) # result[result %in% refs] = ".nref." # ores[ores %in% refs] = ".nref." # ores = setdiff(ores, ".nref.") # if (SAS) # factor(result, levels = c(ores, ".nref.")) # else # factor(result, levels = c(".nref.", ores)) # } emmeans/R/multiple-models.R0000644000176200001440000001134714154277657015367 0ustar liggesusers############################################################################## # Copyright (c) 2012-2019 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## ### Support for objects involving several models (e.g., model averaging, multiple imputation) ### MuMIn::averaging ### This is just bare-bones. Need to provide data, ### and if applicable, df. ### Optional argument tran sets the tran property # $data is NOT a standard member, but if it's there, we'll use it # Otherwise, we need to provide data or its name in the call recover_data.averaging = function(object, data, ...) { ml = attr(object, "modelList") if (is.null(ml)) return(paste0("emmeans support for 'averaging' models requires a 'modelList' attribute.\n", "Re-fit the model from a model list or with fit = TRUE")) if (is.null(object$formula)) { lhs = as.formula(paste(formula(ml[[1]])[[2]], "~.")) rhs = sapply(ml, function(m) {f = formula(m); f[[length(f)]]}) object$formula = update(as.formula(paste("~", paste(rhs, collapse = "+"))), lhs) } if (is.null(data)) data = ml[[1]]$call$data trms = attr(model.frame(object$formula, data = data), "terms") fcall = call("model.avg", formula = object$formula, data = data) recover_data(fcall, delete.response(trms), na.action = NULL, ...) } emm_basis.averaging = function(object, trms, xlev, grid, ...) { bhat = coef(object, full = TRUE) V = .my.vcov(object, function(., ...) vcov(., full = TRUE), ...) 
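    # (full = TRUE yields MuMIn's "full-averaged" coefficients and covariance,
    # in which terms absent from a given submodel contribute zeros)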
    m = suppressWarnings(model.frame(trms, grid, na.action = na.pass, xlev = xlev))
    X = model.matrix(trms, m, contrasts.arg = object$contrasts)
    nbasis = estimability::all.estble
    ml = attr(object, "modelList")
    ml1 = ml[[1]]
    misc = list()
    if (!is.null(fam <- family(ml1)))
        misc = .std.link.labels(fam, misc)
    if (!is.null(df.residual(ml1))) {
        dffun = function(k, dfargs) dfargs$df
        dfargs = list(df = min(sapply(ml, df.residual)))
    } else {
        dffun = function(k, dfargs) Inf
        dfargs = list()
    }
    list(X = X, bhat = bhat, nbasis = nbasis, V = V, dffun = dffun, dfargs = dfargs, misc = misc)
}

### mice::mira support -----------------------------------------
# Here we rely on the methods already in place for elements of $analyses
recover_data.mira = function(object, ...) {
    rdlist = lapply(object$analyses, recover_data, ...)
    rd = rdlist[[1]]
    # we'll average the numeric columns...
    numcols = which(sapply(rd, is.numeric))
    for (j in numcols)
        rd[, j] = apply(sapply(rdlist, function(.) .[, j]), 1, mean)
    rd
}

emm_basis.mira = function(object, trms, xlev, grid, ...) {
    bas = emm_basis(object$analyses[[1]], trms, xlev, grid, ...)
    k = length(object$analyses)
    # we just average the V and bhat elements...
    V = 1/k * bas$V
    allb = cbind(bas$bhat, matrix(0, nrow = length(bas$bhat), ncol = k - 1))
    for (i in 1 + seq_len(k - 1)) {
        basi = emm_basis(object$analyses[[i]], trms, xlev, grid, ...)
        V = V + 1/k * basi$V
        allb[, i] = basi$bhat
    }
    bas$bhat = apply(allb, 1, mean) # averaged coef
    notna = which(!is.na(bas$bhat))
    bas$V = V + (k + 1)/k * cov(t(allb[notna, , drop = FALSE])) # pooled via Rubin's rules
    bas
}
emmeans/R/wrappers.R0000644000176200001440000001227114137062735014101 0ustar liggesusers
##############################################################################
# Copyright (c) 2012-2017 Russell V. Lenth #
# #
# This file is part of the emmeans package for R (*emmeans*) #
# #
# *emmeans* is free software: you can redistribute it and/or modify #
# it under the terms of the GNU General Public License as published by #
# the Free Software Foundation, either version 2 of the License, or #
# (at your option) any later version. #
# #
# *emmeans* is distributed in the hope that it will be useful, #
# but WITHOUT ANY WARRANTY; without even the implied warranty of #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
# GNU General Public License for more details. #
# #
# You should have received a copy of the GNU General Public License #
# along with R and *emmeans*. If not, see #
# and/or #
# . #
##############################################################################

### Wrappers for those who want other (familiar) terminology: lsmeans and pmmeans

### general-purpose wrapper for creating pmxxxxx and lsxxxxx functions
### use subst arg to specify e.g. "ls" or "pm"
.emwrap = function(emmfcn, subst, ...) {
    result = emmfcn(...)
    if (inherits(result, "emmGrid"))
        result = .sub.em(result, subst)
    else if(inherits(result, "emm_list")) {
        for (i in seq_along(result))
            result[[i]] = .sub.em(result[[i]], subst)
        names(result) = gsub("^em", subst, names(result))
    }
    result
}

# returns an updated emmGrid object with estName "em..." replaced by "xx..."
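# e.g., .sub.em(object, "ls") changes an estName of "emmean" to "lsmean"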
.sub.em = function(object, subst) { nm = object@misc$estName update(object, estName = gsub("^em", subst, nm)) } ### Exported implementations # lsmeans family #' Wrappers for alternative naming of EMMs #' #' These are wrappers for \code{\link{emmeans}} and related functions to provide #' backward compatibility, or for users who may prefer to #' use other terminology than \dQuote{estimated marginal means} -- namely #' \dQuote{least-squares means} or \dQuote{predicted marginal means}. #' #' For each function with \code{ls}\emph{xxxx} or \code{pm}\emph{xxxx} in its name, #' the same function named \code{em}\emph{xxxx} is called. Any estimator names or #' list items beginning with \dQuote{em} are replaced with \dQuote{ls} or #' \dQuote{pm} before the results are returned #' #' @param ... Arguments passed to the corresponding \code{em}\emph{xxxx} function #' #' @return The result of the call to \code{em}\emph{xxxx}, suitably modified. #' @rdname wrappers #' @aliases wrappers #' @seealso \code{\link{emmeans}}, \code{\link{emtrends}}, \code{\link{emmip}}, #' \code{\link{emm}}, \code{\link{emmobj}}, \code{\link{emm_options}}, #' \code{\link{get_emm_option}} #' @export #' @examples #' pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs) #' lsmeans(pigs.lm, "source") lsmeans = function(...) .emwrap(emmeans, subst = "ls", ...) #' @rdname wrappers #' @export pmmeans = function(...) .emwrap(emmeans, subst = "pm", ...) #' @rdname wrappers #' @export lstrends = function(...) .emwrap(emtrends, subst = "ls", ...) #' @rdname wrappers #' @export pmtrends = function(...) .emwrap(emtrends, subst = "pm", ...) #' @rdname wrappers #' @export lsmip = function(...) emmip(...) #' @rdname wrappers #' @export pmmip = function(...) emmip(...) #' @rdname wrappers #' @export lsm = function(...) emm(...) #' @rdname wrappers #' @export pmm = function(...) emm(...) #' @rdname wrappers #' @export lsmobj = function(...) .emwrap(emmobj, subst = "ls", ...) #' @rdname wrappers #' @export pmmobj = function(...) .emwrap(emmobj, subst = "pm", ...) #' @rdname wrappers #' @export lsm.options = function(...) { .Deprecated("emm_options") args = list(...) nms = names(args) nms = gsub("ref.grid", "ref_grid", nms) nms = gsub("lsmeans", "emmeans", nms) names(args) = nms do.call(emm_options, args) } #' @rdname wrappers #' @param x Character name of desired option #' @param default default value to return if \code{x} not found #' #' @return \code{get.lsm.option} and \code{lsm.options} remap options from #' and to corresponding options in the \pkg{lsmeans} options system. #' @export get.lsm.option = function(x, default = emm_defaults[[x]]) { .Deprecated("get_emm_option") if(x == "ref.grid") x = "ref_grid" if(x == "lsmeans") x = "emmeans" get_emm_option(x, default = default) } emmeans/R/countreg-support.R0000644000176200001440000003321714137062735015601 0ustar liggesusers############################################################################## # Copyright (c) 2012-2017 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## # Support for zeroinfl and hurdle models (pscl) # We'll support two optional arguments: # mode -- type of result required # lin.pred -- TRUE: keep linear predictor and link # FALSE: back-transform (default) # # With lin.pred = FALSE and mode %in% c("response", "count", "zero"), we # will return comparable results to predict(..., type = mode) # with mode = "prob0", same results as predict(..., type = "prob")[, 1] # # lin.pred only affects results for mode %in% c("count", "zero"). # When lin.pred = TRUE, we get the actual linear predictor and link function # for that part of the model. # ----- zeroinfl objects ----- recover_data.zeroinfl = function(object, mode = c("response", "count", "zero", "prob0"), ...) { fcall = object$call mode = match.arg(mode) if (mode %in% c("count", "zero")) trms = delete.response(terms(object, model = mode)) else ### mode = %in% c("response", "prob0") trms = delete.response(object$terms$full) # Make sure there's an offset function available env = new.env(parent = attr(trms, ".Environment")) env$offset = function(x) x attr(trms, ".Environment") = env recover_data(fcall, trms, object$na.action, ...) } emm_basis.zeroinfl = function(object, trms, xlev, grid, mode = c("response", "count", "zero", "prob0"), lin.pred = FALSE, ...) { mode = match.arg(mode) m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) if (mode %in% c("count", "zero")) { contr = object$contrasts[[mode]] X = model.matrix(trms, m, contrasts.arg = contr) bhat = coef(object, model = mode) V = .pscl.vcov(object, model = mode, ...) 
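        # From here, the single-component modes ("count", "zero") either keep the
        # linear predictor or are back-transformed via the delta method below;
        # the combined modes ("response", "prob0") always use both components.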
if (mode == "count") misc = list(tran = "log", inv.lbl = "count") else misc = list(tran = object$link, inv.lbl = "prob") if (!lin.pred) { # back-transform the results lp = as.numeric(X %*% bhat + .get.offset(trms, grid)) lnk = make.link(misc$tran) bhat = lnk$linkinv(lp) delta = .diag(lnk$mu.eta(lp)) %*% X V = delta %*% tcrossprod(V, delta) X = diag(1, length(bhat)) misc = list(offset.mult = 0) } } else { ## "response", "prob0" trms1 = delete.response(terms(object, model = "count")) off1 = .get.offset(trms1, grid) contr1 = object$contrasts[["count"]] X1 = model.matrix(trms1, m, contrasts.arg = contr1) b1 = coef(object, model = "count") lp1 = as.numeric(X1 %*% b1 + off1) mu1 = exp(lp1) trms2 = delete.response(terms(object, model = "zero")) off2 = .get.offset(trms2, grid) contr2 = object$contrasts[["zero"]] X2 = model.matrix(trms2, m, contrasts.arg = contr2) b2 = coef(object, model = "zero") lp2 = as.numeric(X2 %*% b2) + off2 mu2 = object$linkinv(lp2) mu2prime = stats::make.link(object$link)$mu.eta(lp2) if(mode == "response") { delta = .diag(mu1) %*% cbind(.diag(1 - mu2) %*% X1, .diag(-mu2prime) %*% X2) bhat = (1 - mu2) * mu1 } else { # mode = "prob0" p0 = 1 - .prob.gt.0(object$dist, mu1, object$theta) dp0 = - .dprob.gt.0(object$dist, mu1, object$theta, "log", lp1) bhat = (1 - mu2) * p0 + mu2 delta = cbind(.diag((1 - mu2) * dp0) %*% X1, .diag(mu2prime * (1 - p0)) %*% X2) } V = delta %*% tcrossprod(.pscl.vcov(object, model = "full", ...), delta) X = diag(1, length(bhat)) misc = list(offset.mult = 0) } nbasis = estimability::all.estble dffun = function(k, dfargs) Inf dfargs = list() list(X = X, bhat = bhat, nbasis = nbasis, V = V, dffun = dffun, dfargs = dfargs, misc = misc) } #### Support for hurdle models recover_data.hurdle = function(object, mode = c("response", "count", "zero", "prob0"), ...) { fcall = object$call mode = match.arg(mode) if (mode %in% c("count", "zero")) trms = delete.response(terms(object, model = mode)) else ### mode = "mean" or "prob.ratio" trms = delete.response(object$terms$full) # Make sure there's an offset function available env = new.env(parent = attr(trms, ".Environment")) env$offset = function(x) x attr(trms, ".Environment") = env recover_data(fcall, trms, object$na.action, ...) } # see expl notes afterward for notations in some of this emm_basis.hurdle = function(object, trms, xlev, grid, mode = c("response", "count", "zero", "prob0"), lin.pred = FALSE, ...) { mode = match.arg(mode) m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) if ((lin.pred && mode %in% c("count", "zero")) || (!lin.pred && mode %in% c("count", "prob0"))) { model = ifelse(mode == "count", "count", "zero") contr = object$contrasts[[model]] X = model.matrix(trms, m, contrasts.arg = contr) bhat = coef(object, model = model) V = .pscl.vcov(object, model = model, ...) 
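        # The zero-hurdle part may be a binomial GLM (use its link) or a censored
        # count model (log scale); the switch below labels the scale accordingly.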
misc = switch(object$dist[[model]], binomial = list(tran = object$link, inv.lbl = "prob"), list(tran = "log", inv.lbl = "count")) if (!lin.pred) { # back-transform lp = as.numeric(X %*% bhat + .get.offset(trms, grid)) lnk = make.link(misc$tran) bhat = lnk$linkinv(lp) if (mode != "prob0") { delta = .diag(lnk$mu.eta(lp)) %*% X } else { bhat = 1 - .prob.gt.0(object$dist$zero, bhat, object$theta["zero"]) db = - .dprob.gt.0(object$dist$zero, bhat, object$theta["zero"], misc$tran, lp) delta = .diag(db) %*% X } V = delta %*% tcrossprod(V, delta) X = diag(1, length(bhat)) misc = list(offset.mult = 0) } } else { ### "zero" or "response" with implied lin.pred = FALSE trms1 = delete.response(terms(object, model = "count")) off1 = .get.offset(trms1, grid) contr1 = object$contrasts[["count"]] X1 = model.matrix(trms1, m, contrasts.arg = contr1) b1 = coef(object, model = "count") mu1 = as.numeric(exp(X1 %*% b1 + off1)) theta1 = object$theta["count"] p1 = .prob.gt.0(object$dist$count, mu1, theta1) dp1 = .dprob.gt.0(object$dist$count, mu1, theta1, "", 0) # binomial won't happen trms2 = delete.response(terms(object, model = "zero")) off2 = .get.offset(trms2, grid) contr2 = object$contrasts[["zero"]] X2 = model.matrix(trms2, m, contrasts.arg = contr2) b2 = coef(object, model = "zero") lp2 = as.numeric(X2 %*% b2 + off2) mu2 = switch(object$dist$zero, binomial = object$linkinv(lp2), exp(lp2) ) theta2 = object$theta["zero"] p2 = .prob.gt.0(object$dist$zero, mu2, theta2) dp2 = .dprob.gt.0(object$dist$zero, mu2, theta2, object$link, lp2) if (mode == "response") { bhat = p2 * mu1 / p1 delta = cbind(.diag(bhat*(1 - mu1 * dp1 / p1)) %*% X1, .diag(mu1 * dp2 / p1) %*% X2) } else { ## mode == "zero" bhat = p2 / p1 delta = cbind(.diag(-p2 * dp1 / p1^2) %*% X1, .diag(dp2 / p1) %*% X2) } V = delta %*% tcrossprod(.pscl.vcov(object, model = "full", ...), delta) X = .diag(1, length(bhat)) misc = list(estName = mode, offset.mult = 0) } nbasis = estimability::all.estble dffun = function(k, dfargs) dfargs$df dfargs = list(df = object$df.residual) list(X = X, bhat = bhat, nbasis = nbasis, V = V, dffun = dffun, dfargs = dfargs, misc = misc) } # utility for prob (Y > 0 | dist, mu, theta) .prob.gt.0 = function(dist, mu, theta) { switch(dist, binomial = mu, poisson = 1 - exp(-mu), negbin = 1 - (theta / (mu + theta))^theta, geometric = 1 - 1 / (1 + mu) ) } # utility for d/d(eta) prob (Y > 0 | dist, mu, theta) .dprob.gt.0 = function(dist, mu, theta, link, lp) { switch(dist, binomial = stats::make.link(link)$mu.eta(lp), poisson = mu * exp(-mu), negbin = mu * (theta /(mu + theta))^(1 + theta), geometric = mu / (1 + mu)^2 ) } # special version of .my.vcov that accepts (and requires!) model argument .pscl.vcov = function(object, model, vcov. = stats::vcov, ...) { if (is.function(vcov.)) vcov. = vcov.(object, model = model) else if (!is.matrix(vcov.)) stop("vcov. must be a function or a square matrix") vcov. } # Explanatory notes for hurdle models # ----------------------------------- # We have a linear predictor eta = X%*%beta + offset # mu = h(eta) where h is inverse link (usually exp but not always) # Define p = P(Y > 0 | mu). This comes out to... # binomial: mu # poisson: 1 - exp(-mu) # negbin: 1 - (theta/(mu+theta))^theta # geometric: 1 - 1/(mu+1) # Define dp = dp/d(eta). 
Note - when h(mu)=exp(mu) we have dp = mu*dp/d(mu)
#    binomial: h'(eta)
#    poisson: mu*exp(-mu)
#    negbin: mu*(theta/(mu+theta))^(theta+1)
#    geometric: mu/(mu+1)^2
#
# This gives us what we need to find the estimates and apply the delta method
# In the code we index these notations with 1 (count model) and 2 (zero model)
# And we treat theta1 and theta2 as constants
#
#!!! In theory, above seems correct, and estimates match those from predict.hurdle.
#!!! But SEs don't seem right.
#!!! They do seem right though if I omit the factor of mu in dp
#!!! when link is log

### Simulation-based approach for ZI and hurdle models
### This returns a list of the same form as an emm_basis() method
## X is model matrix for both models  (X.count | X.zi)
## ncoef.c  # cols in X.count
## bhat is all regression coefs -- OR a matrix w/ posterior samples
## V is combined vcov matrix (ignored if bhat is a matrix)
## links is vector (or list) of links for the 2 parts of the model
## fams is list of family names
## hurdle is a named list(famc, thetac, famz, thetaz) or empty if ZI
## N.sim is # simulations desired
## keep.sim set to TRUE to return sims in post.beta
## df is d.f. to return
## NOTE if bhat is a matrix, keep.sim is set to TRUE and N.sim is ignored
#' @export
.zi.simulate = function(X, ncoef.c, bhat, V, links, hurdle = list(),
                        N.sim = 1000, keep.sim = FALSE, df = Inf, misc = list()) {
    if(is.matrix(bhat)) {
        keep.sim = TRUE
        B = bhat
        bhat = NA
    }
    else {
        if (length(bhat) != ncol(V))
            stop("Non-estimable cases not yet supported for zero-inflation calculations")
        B = mvtnorm::rmvnorm(N.sim, bhat, V)
    }
    back.tran = function(idx, link) {
        W = B[, idx] %*% t(X[, idx])
        if (is.character(link))
            link = make.link(link)
        link$linkinv(W)
    }
    if(!is.na(bhat[1]))
        B = rbind(bhat, B)   # put our pt est in 1st row
    ic = seq_len(ncoef.c)
    iz = setdiff(seq_len(ncol(B)), ic)
    C = back.tran(ic, links[[1]])
    Z = back.tran(iz, links[[2]])
    if (length(hurdle) >= 4)
        R = C * .prob.gt.0(hurdle$famz, Z, hurdle$thetaz) /
            .prob.gt.0(hurdle$famc, C, hurdle$thetac)
    else
        R = (1 - Z) * C
    if(!is.na(bhat[1])){
        bhat = R[1, ]
        R = R[-1, ]
    }
    else
        bhat = apply(R, 2, mean)
    V = cov(R)
    dffun = function(k, dfargs) dfargs$df
    dfargs = list(df = df)
    if(is.null(misc$est.name))
        misc$estName = "emmean"
    post.beta = if(keep.sim) R else matrix(NA)
    list(X = diag(length(bhat)), bhat = bhat, nbasis = estimability::all.estble, V = V,
         dffun = dffun, dfargs = dfargs, misc = misc, post.beta = post.beta)
}
emmeans/R/eff-size.R0000644000176200001440000001603314157434710013744 0ustar liggesusers# Cohen's effect sizes

#' Calculate effect sizes and confidence bounds thereof
#'
#' Standardized effect sizes are typically calculated using pairwise differences of estimates,
#' divided by the SD of the population providing the context for those effects.
#' This function calculates effect sizes from an \code{emmGrid} object,
#' and confidence intervals for them, accounting for uncertainty in both the estimated
#' effects and the population SD.
#'
#' Any \code{by} variables specified in \code{object} will remain in force in the returned
#' effects, unless overridden in the optional arguments.
#'
#' For models having a single random effect, such as those fitted using
#' \code{\link{lm}}, the \code{stats::sigma} and
#' \code{stats::df.residual} functions may be useful for specifying \code{sigma}
#' and \code{edf}. For models with more than one random effect, \code{sigma} may
#' be based on some combination of the random-effect variances.
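#' (For example -- a hypothetical sketch along the lines of the mixed-model
#' example below -- with random intercepts for blocks plus residual error, one
#' might take \code{sigma = sqrt(sig.blk^2 + sig.res^2)}, where \code{sig.blk}
#' and \code{sig.res} are placeholders for the estimated SDs of those components.)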
#' #' Specifying \code{edf} can be rather unintuitive but is also relatively #' uncritical; but the smaller the value, the wider the confidence intervals for #' effect size. The value of \code{sqrt(2/edf)} can be interpreted as the #' relative accuracy of \code{sigma}; for example, with \code{edf = 50}, #' \eqn{\sqrt(2/50) = 0.2}, meaning that \code{sigma} is accurate to plus or #' minus 20 percent. Note in an example below, we tried two different \code{edf} #' values as kind of a bracketing/sensitivity-analysis strategy. A value of #' \code{Inf} is allowable, in which case you are assuming that \code{sigma} is #' known exactly. Obviously, this narrows the confidence intervals for the #' effect sizes -- unrealistically if in fact \code{sigma} is unknown. #' #' #' @param object an \code{\link[=emmGrid-class]{emmGrid}} object, #' typically one defining the EMMs to #' be contrasted. If instead, \code{class(object) == "emm_list"}, #' such as is produced by \code{emmeans(model, pairwise ~ treatment)}, #' a message is displayed; the contrasts already therein are used; and #' \code{method} is replaced by \code{"identity"}. #' @param sigma numeric scalar, value of the population SD. #' @param edf numeric scalar that specifies the equivalent degrees of freedom #' for the \code{sigma}. This is a way of specifying the uncertainty in \code{sigma}, #' in that we regard our estimate of \code{sigma^2} as being proportional to #' a chi-square random variable with \code{edf} degrees of freedom. (\code{edf} should #' not be confused with the \code{df} argument that may be passed via \code{...} #' to specify the degrees of freedom to use in \eqn{t} statistics and confidence intervals.) #' @param method the contrast method to use to define the effects. #' This is passed to \code{\link{contrast}} after the elements of \code{object} #' are scaled. #' @param ... Additional arguments passed to \code{contrast} #' #' @return an \code{\link[=emmGrid-class]{emmGrid}} object containing the effect sizes #' #' @section Computation: #' This function uses calls to \code{\link{regrid}} to put the estimated #' marginal means (EMMs) on the log scale. Then an extra element is added to #' this grid for the log of \code{sigma} and its standard error (where we assume #' that \code{sigma} is uncorrelated with the log EMMs). Then a call to #' \code{\link{contrast}} subtracts \code{log{sigma}} from each of the log EMMs, #' yielding values of \code{log(EMM/sigma)}. #' Finally, the results are re-gridded back to the original scale and the #' desired contrasts are computed using \code{method}. In the log-scaling #' part, we actually rescale the absolute values and keep track of the signs. #' #' @note #' The effects are always computed on the scale of the \emph{linear-predictor}; #' any response transformation or link function is completely ignored. If you #' wish to base the effect sizes on the response scale, it is \emph{not} enough #' to replace \code{object} with \code{regrid(object)}, because this #' back-transformation changes the SD required to compute effect sizes. #' #' @note #' \strong{Disclaimer:} There is substantial disagreement among practitioners on #' what is the appropriate \code{sigma} to use in computing effect sizes; or, #' indeed, whether \emph{any} effect-size measure is appropriate for some #' situations. The user is completely responsible for specifying #' appropriate parameters (or for failing to do so). 
#' #' @export #' @note #' The examples here illustrate a sobering message that effect sizes are often not nearly as accurate as you may think. #' #' @examples #' fiber.lm <- lm(strength ~ diameter + machine, data = fiber) #' #' emm <- emmeans(fiber.lm, "machine") #' eff_size(emm, sigma = sigma(fiber.lm), edf = df.residual(fiber.lm)) #' #' # or equivalently: #' eff_size(pairs(emm), sigma(fiber.lm), df.residual(fiber.lm), method = "identity") #' #' #' ### Mixed model example: #' if (require(nlme)) withAutoprint({ #' Oats.lme <- lme(yield ~ Variety + factor(nitro), #' random = ~ 1 | Block / Variety, #' data = Oats) #' # Combine variance estimates #' VarCorr(Oats.lme) #' (totSD <- sqrt(214.4724 + 109.6931 + 162.5590)) #' # I figure edf is somewhere between 5 (Blocks df) and 51 (Resid df) #' emmV <- emmeans(Oats.lme, ~ Variety) #' eff_size(emmV, sigma = totSD, edf = 5) #' eff_size(emmV, sigma = totSD, edf = 51) #' }, spaced = TRUE) #' #' # Multivariate model for the same data: #' MOats.lm <- lm(yield ~ Variety, data = MOats) #' eff_size(emmeans(MOats.lm, "Variety"), #' sigma = sqrt(mean(sigma(MOats.lm)^2)), # RMS of sigma() #' edf = df.residual(MOats.lm)) eff_size = function(object, sigma, edf, method = "pairwise", ...) { if (inherits(object, "emm_list") && ("contrasts" %in% names(object))) { message("Since 'object' is a list, we are using the contrasts already present.") object = object$contrasts method = "identity" } SE.logsigma = sqrt(1 / (2 * edf)) object = update(object, tran = NULL) object = regrid(object, transform = "response") # put on absolute scale emm = object@bhat sgn = sign(emm) emm = abs(emm) sgn[emm < 1e-15] = 0 emm[sgn == 0] = 1 object@bhat = emm # put on log scale and incorporate the SD estimate logobj = regrid(object, transform = "log") logobj@bhat = c(logobj@bhat, log(sigma)) V = rbind(cbind(logobj@V, 0), 0) n = nrow(V) V[n,n] = SE.logsigma^2 logobj@V = V logobj@levels = list(dummy = 1:n) logobj@grid = data.frame(dummy = 1:n) logobj@linfct = diag(n) logobj@misc$by = NULL con = contrast(logobj, "trt.vs.ctrlk", by = NULL) con = regrid(con, transform = "response") object@bhat = con@bhat * sgn object@V = con@V update(contrast(object, method, adjust = "none", ...), estName = "effect.size", infer = c(TRUE, FALSE), initMesg = paste("sigma used for effect sizes:", signif(sigma, digits = 4))) } emmeans/R/MCMC-support.R0000644000176200001440000006511614157433533014475 0ustar liggesusers############################################################################## # Copyright (c) 2012-2016 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . 
# ############################################################################## # Support for MCMCglmm class and possibly more MCMC-based models # Method to create a coda 'mcmc' or 'mcmc.list' object from a ref.grid # (dots not supported, unfortunately) # If sep.chains is TRUE and there is more than one chain, an mcmc.list is returned # NOTE: S3 registration of as.mcmc and as.mcmc.list is done dynamically in zzz.R #' Support for MCMC-based estimation #' #' When a model is fitted using Markov chain Monte Carlo (MCMC) methods, #' its reference grid contains a \code{post.beta} slot. These functions #' transform those posterior samples to posterior samples of EMMs or #' related contrasts. They can then be summarized or plotted using, #' e.g., functions in the \pkg{coda} package. #' #' @rdname mcmc-support #' @aliases mcmc-support #' @param x An object of class \code{emmGrid} #' @param names Logical scalar or vector specifying whether variable names are #' appended to levels in the column labels for the \code{as.mcmc} or #' \code{as.mcmc.list} result -- e.g., column names of \code{treat A} and #' \code{treat B} versus just \code{A} and \code{B}. When there is more than #' one variable involved, the elements of \code{names} are used cyclically. #' @param sep.chains Logical value. If \code{TRUE}, and there is more than one #' MCMC chain available, an \code{\link[coda]{mcmc.list}} object is returned #' by \code{as.mcmc}, with separate EMMs posteriors in each chain. #' @param likelihood Character value or function. If given, simulations are made from #' the corresponding posterior predictive distribution. If not given, we obtain #' the posterior distribution of the parameters in \code{object}. See Prediction #' section below. #' @param NE.include Logical value. If \code{TRUE}, non-estimable columns are #' kept but returned as columns of \code{NA} values (this may create errors or #' warnings in subsequent analyses using, say, \pkg{coda}). If \code{FALSE}, #' non-estimable columns are dropped, and a warning is issued. (If all are #' non-estimable, an error is thrown.) #' @param ... arguments passed to other methods #' #' @return An object of class \code{\link[coda]{mcmc}} or \code{\link[coda]{mcmc.list}}. #' #' @section Details: #' When the object's \code{post.beta} slot is non-trivial, \code{as.mcmc} will #' return an \code{\link[coda]{mcmc}} or \code{\link[coda]{mcmc.list}} object #' that can be summarized or plotted using methods in the \pkg{coda} package. #' In these functions, \code{post.beta} is transformed by post-multiplying it by #' \code{t(linfct)}, creating a sample from the posterior distribution of LS #' means. In \code{as.mcmc}, if \code{sep.chains} is \code{TRUE} and there is in #' fact more than one chain, an \code{mcmc.list} is returned with each chain's #' results. The \code{as.mcmc.list} method is guaranteed to return an #' \code{mcmc.list}, even if it comprises just one chain. #' #' @section Prediction: #' When \code{likelihood} is specified, it is used to simulate values from the #' posterior predictive distribution corresponding to the given likelihood and #' the posterior distribution of parameter values. Denote the likelihood #' function as \eqn{f(y|\theta,\phi)}, where \eqn{y} is a response, \eqn{\theta} #' is the parameter estimated in \code{object}, and \eqn{\phi} comprises zero or #' more additional parameters to be specified. 
If \code{likelihood} is a
#' function, that function should take as its first argument a vector of
#' \eqn{\theta} values (each corresponding to one row of \code{object@grid}).
#' Any \eqn{\phi} values should be specified as additional named function
#' arguments, and passed to \code{likelihood} via \code{...}. This function should
#' simulate values of \eqn{y}.
#'
#' A few standard likelihoods are available by specifying \code{likelihood} as
#' a character value. They are:
#' \describe{
#'   \item{\code{"normal"}}{The normal distribution with mean \eqn{\theta} and
#'     standard deviation specified by additional argument \code{sigma}}
#'   \item{\code{"binomial"}}{The binomial distribution with success probability
#'     \eqn{\theta}, and number of trials specified by \code{trials}}
#'   \item{\code{"poisson"}}{The Poisson distribution with mean \eqn{\theta}
#'     (no additional parameters)}
#'   \item{\code{"gamma"}}{The gamma distribution with scale parameter \eqn{\theta}
#'     and shape parameter specified by \code{shape}}
#' }
#'
#' @method as.mcmc emmGrid
#' @export as.mcmc.emmGrid
#' @examples
#' if(requireNamespace("coda")) {
#'     ### A saved reference grid for a mixed logistic model (see lme4::cbpp)
#'     cbpp.rg <- do.call(emmobj,
#'         readRDS(system.file("extdata", "cbpplist", package = "emmeans")))
#'     # Predictive distribution for herds of size 20
#'     # (perhaps a bias adjustment should be applied; see "sophisticated" vignette)
#'     pred.incidence <- coda::as.mcmc(regrid(cbpp.rg), likelihood = "binomial", trials = 20)
#'     summary(pred.incidence)
#' }
as.mcmc.emmGrid = function(x, names = TRUE, sep.chains = TRUE, likelihood, NE.include = FALSE, ...) {
    if (is.na(x@post.beta[1])) {
        stop("No posterior sample -- can't make an 'mcmc' object")
    }
    # notes on estimability issues:
    # 1. Use @bhat to determine which coefs to use
    # 2. @nbasis as in freq models
    # 3. @post.beta we will EXCLUDE cols corresp to NAs in @bhat
    # See stanreg support for hints/details
    use = which(!is.na(x@bhat))
    est = estimability::is.estble(x@linfct, x@nbasis)
    if (!any(est))
        stop("Aborted -- No estimates in the grid are estimable")
    else if(!all(est) && !NE.include) {
        rows = paste(which(!est), collapse = ", ")
        warning("Cases ", rows, " were dropped due to non-estimability", call. = FALSE)
    }
    mat = x@post.beta %*% t(x@linfct[, use, drop = FALSE])
    if (NE.include)
        mat[, !est] = NA
    else {
        mat = mat[, est, drop = FALSE]
        x@grid = x@grid[est, , drop = FALSE]
    }
    if(!is.null(offset <- x@grid[[".offset."]])) {
        n = nrow(mat)
        mat = mat + matrix(rep(offset, each = n), nrow = n)
    }
    if (!missing(likelihood)) {
        if (is.character(likelihood)) {
            likelihood = match.arg(likelihood, c("normal", "binomial", "poisson", "gamma"))
            likelihood = switch(likelihood,
                normal = function(theta, sigma, ...)
                    rnorm(length(theta), mean = theta, sd = sigma),
                binomial = function(theta, trials, ...)
                    rbinom(length(theta), size = trials, prob = theta),
                poisson = function(theta, ...)
                    rpois(length(theta), lambda = theta),
                gamma = function(theta, shape, ...)
                    rgamma(length(theta), scale = theta, shape = shape)
                #,  stop("There is no predefined likelihood named '", likelihood, "'")
            )
        }
        mat = apply(mat, 2, likelihood, ...)
        ##! TODO: Add "multinomial" support. This will require a flag to observe
        ##! the 'by' variable(s), then we get parameter values from the columns
        ##! corresponding to each 'by' group
    }

    nm = setdiff(names(x@grid), c(".wgt.",".offset."))
    if (any(names)) {
        names = rep(names, length(nm))
        for (i in seq_along(nm))
            if(names[i])
                x@grid[nm[i]] = paste(nm[i], x@grid[[nm[i]]])
    }
    if(is.null(dimnames(mat)))
        dimnames(mat) = list(seq_len(nrow(mat)), seq_len(ncol(mat)))
    dimnames(mat)[[2]] = do.call(paste, c(unname(x@grid[, nm, drop = FALSE]), sep=", "))
    n.chains = attr(x@post.beta, "n.chains")
    if (!sep.chains || is.null(n.chains) || (n.chains == 1))
        coda::mcmc(mat)
    else {
        n = nrow(mat) / n.chains
        seqn = seq_len(n)
        chains = lapply(seq_len(n.chains), function(i)
            coda::mcmc(mat[n*(i - 1) + seqn, , drop = FALSE]))
        coda::mcmc.list(chains)
    }
}

### as.mcmc.list - guaranteed to return a list
#' @rdname mcmc-support
#' @method as.mcmc.list emmGrid
as.mcmc.list.emmGrid = function(x, names = TRUE, ...) {
    result = as.mcmc.emmGrid(x, names = names, sep.chains = TRUE, ...)
    if(!inherits(result, "mcmc.list"))
        result = coda::mcmc.list(result)
    result
}

#' Summarize an emmGrid from a Bayesian model
#'
#' This function computes point estimates and HPD intervals for each
#' factor combination in \code{object@grid}. While this function
#' may be called independently, it is called automatically by the S3 method
#' \code{\link{summary.emmGrid}} when the object is based on a Bayesian model.
#' (Note: the \code{level} argument, or its default, is passed as \code{prob}).
#'
#' @param object an \code{emmGrid} object having a non-missing \code{post.beta} slot
#' @param prob numeric probability content for HPD intervals (note: when not specified,
#'   the current \code{level} option is used; see \code{\link{emm_options}})
#' @param by factors to use as \code{by} variables
#' @param type prediction type as in \code{\link{summary.emmGrid}}
#' @param point.est function to use to compute the point estimates from the
#'   posterior sample for each grid point
#' @param bias.adjust Logical value for whether to adjust for bias in
#'   back-transforming (\code{type = "response"}). This requires a value of
#'   \code{sigma} to exist in the object or be specified.
#' @param sigma Error SD assumed for bias correction (when
#'   \code{type = "response"}). If not specified,
#'   \code{object@misc$sigma} is used, and an error is thrown if it is not found.
#'   \emph{Note:} \code{sigma} may be a vector, as long as it conforms to the
#'   number of observations in the posterior sample.
#' @param ... required but not used
#'
#' @return an object of class \code{summary_emm}
#'
#' @seealso summary.emmGrid
#'
#' @export
#'
#' @examples
#' if(require("coda")) {
#'     # Create an emmGrid object from a system file
#'     cbpp.rg <- do.call(emmobj,
#'         readRDS(system.file("extdata", "cbpplist", package = "emmeans")))
#'     hpd.summary(emmeans(cbpp.rg, "period"))
#' }
#'
hpd.summary = function(object, prob, by, type, point.est = median,
                       bias.adjust = get_emm_option("back.bias.adj"), sigma, ...)
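# Overview of the computation below: obtain the posterior sample of linear
# predictions via as.mcmc.emmGrid(), optionally back-transform (with bias
# adjustment if requested), then summarize each column using point.est()
# and coda::HPDinterval().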
{
    if(!is.null(object@misc$.predFlag))
        stop("Prediction intervals for MCMC models should be done using 'frequentist = TRUE'\n",
             "or using 'as.mcmc(object, ..., likelihood = ...)'")
    .requireNS("coda", "Bayesian summary requires the 'coda' package")
    ### require("coda") ### Nope this is a CRAN no-no

    # Steal some init code from summary.emmGrid:
    opt = get_emm_option("summary")
    if(!is.null(opt)) {
        opt$object = object
        object = do.call("update.emmGrid", opt)
    }
    misc = object@misc
    use.elts = if (is.null(misc$display))
        rep(TRUE, nrow(object@grid))
    else
        misc$display
    grid = object@grid[use.elts, , drop = FALSE]
    if(missing(prob))
        prob = misc$level
    if(missing(by))
        by = misc$by.vars

    if (missing(type))
        type = .get.predict.type(misc)
    else
        type = .validate.type(type)

    # if there are two transformations and we want response, then we need to undo both
    if ((type == "response") && (!is.null(misc$tran2)))
        object = regrid(object, transform = "mu")

    if ((type %in% c("mu", "unlink")) && (!is.null(t2 <- misc$tran2))) {
        if (!is.character(t2))
            t2 = "tran"
        object = update(object, inv.lbl = paste0(t2, "(resp)"))
    }

    link = .get.link(misc)
    inv = (type %in% c("response", "mu", "unlink"))   # flag to inverse-transform
    if (inv && is.null(link))
        inv = FALSE

    ### OK, finally, here is the real stuff
    pe.lbl = as.character(substitute(point.est))
    if(length(pe.lbl) > 1)
        pe.lbl = "user-supplied function"
    mesg = c(misc$initMesg, paste("Point estimate displayed:", pe.lbl))
    mcmc = as.mcmc.emmGrid(object, names = FALSE, sep.chains = FALSE, NE.include = TRUE, ...)
    mcmc = mcmc[, use.elts, drop = FALSE]
    if (inv) {
        if (bias.adjust) {
            if (missing(sigma))
                sigma = object@misc$sigma
            link = .make.bias.adj.link(link, sigma)
        }
        for (j in seq_along(mcmc[1, ]))
            mcmc[, j] = with(link, linkinv(mcmc[, j]))
        mesg = c(mesg, paste("Results are back-transformed from the", link$name, "scale"))
        if(bias.adjust)
            mesg = c(mesg, paste("Bias adjustment applied based on sigma =", .fmt.sigma(sigma)))
    }
    else if(!is.null(link))
        mesg = c(mesg, paste("Results are given on the", link$name, "(not the response) scale."))

    est = !is.na(mcmc[1, ])
    mcmc[, !est] = 0   # temp so we don't get errors
    mesg = c(mesg, paste("HPD interval probability:", prob))
    pt.est = data.frame(apply(mcmc, 2, point.est))
    names(pt.est) = object@misc$estName
    summ = as.data.frame(coda::HPDinterval(mcmc, prob = prob))[c("lower","upper")]
    names(summ) = cnm = paste0(names(summ), ".HPD")
    lblnms = setdiff(names(grid),
                     c(object@roles$responses, ".offset.", ".wgt."))
    lbls = grid[lblnms]
    if (inv) {
        if (!is.null(misc$inv.lbl)) {
            names(pt.est) = misc$estName = misc$inv.lbl
            if (!is.null(misc$log.contrast))   # contrast of logs - relabel as ratios
                for (ell in seq_along(lbls)){
                    lbls[[ell]] = factor(lbls[[ell]])
                    levels(lbls[[ell]]) = gsub(" - ", " / ", levels(lbls[[ell]]))
                }
        }
        else
            names(pt.est) = misc$estName = "response"
    }
    summ[!est, ] = NA
    pt.est[!est, ] = NA
    summ = cbind(lbls, pt.est, summ)
    attr(summ, "estName") = misc$estName
    attr(summ, "clNames") = cnm
    if (is.null(misc$pri.vars) || length(misc$pri.vars) == 0)
        misc$pri.vars = names(object@levels)
    attr(summ, "pri.vars") = setdiff(union(misc$pri.vars, misc$by.vars), by)
    attr(summ, "by.vars") = by
    attr(summ, "mesg") = unique(mesg)
    class(summ) = c("summary_emm", "data.frame")
    summ
}


# Currently, data is required, as call is not stored
recover_data.MCMCglmm = function(object, data, trait, ...)
{ if (is.null(data) && !is.null(object$data)) # allow for including data in object data = eval(object$data) # if a multivariate response, stack the data with `trait` variable yvars = .all.vars(update(object$Fixed$formula, ". ~ 1")) if ("trait" %in% names(data)) { # don't do anything, just use what's provided } else if(length(yvars) > 1) { # for (v in yvars) data[[v]] = NULL dat = data for (i in seq_len(length(yvars) - 1)) data = rbind(data, dat) data$trait = factor(rep(yvars, each = nrow(dat))) } else if(!missing(trait)) { # we'll create a fake "trait" variable with specified variable n = nrow(data) levs = levels(data[[trait]]) attr(data, "misc") = list(resp.levs = levs, trait = trait) data$trait = rep(levs[-1], n)[1:n] # way overkill, but easy coding } attr(data, "call") = object$Fixed attr(data, "terms") = trms = delete.response(terms(object$Fixed$formula)) attr(data, "predictors") = .all.vars(delete.response(trms)) data } # misc may be NULL or a list generated by trait spec emm_basis.MCMCglmm = function(object, trms, xlev, grid, vcov., mode = c("default", "multinomial"), misc, ...) { nobs.MCMCglmm = function(object, ...) 1 # prevents warning about nobs m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) X = model.matrix(trms, m, contrasts.arg = NULL) Sol = as.matrix(object$Sol)[, seq_len(object$Fixed$nfl)] # toss out random effects if included bhat = apply(Sol, 2, mean) if (missing(vcov.)) V = cov(Sol) else V = .my.vcov(object, vcov.) if (is.null(misc)) misc = list() mode = match.arg(mode) if (mode == "multinomial") { misc$postGridHook = .MCMCglmm.multinom.postGrid } else { # try to figure out the link fam = unique(object$family) if (length(fam) > 1) stop("There is more than one 'family' in this model - too complex for emmeans support") link = switch(fam, poisson = "log", multinomial = "log", categorical = "logit", ordinal = "logit") # maybe more later? if (!is.null(link)) misc = .std.link.labels(list(link = link), misc) } list(X = X, bhat = bhat, nbasis = matrix(NA), V = V, dffun = function(k, dfargs) Inf, dfargs = list(), misc = misc, post.beta = Sol) } .MCMCglmm.multinom.postGrid = function(object, ...) { linfct = object@linfct misc = object@misc post.lp = object@post.beta %*% t(linfct) sel = .find.by.rows(object@grid, "trait") k = length(sel) cols = unlist(sel) scal = sqrt(1 + 2 * (16 * sqrt(3) / (15 * pi))^2 / (k + 1)) # scaling const for logistic # I'm assuming here that diag(IJ) = 2 / (k + 1) object@post.beta = post.p = t(apply(post.lp, 1, function(l) { expX = exp(cbind(0, matrix(l[cols], ncol = k)) / scal) as.numeric(apply(expX, 1, function(z) z / sum(z))) })) # These results come out with response levels varying the fastest. object@bhat = apply(post.p, 2, mean) object@V = cov(post.p) preds = c(misc$trait, object@roles$predictors) object@roles$predictors = preds[preds != "trait"] object@levels[["trait"]] = NULL object@levels = c(list(misc$resp.levs), object@levels) names(object@levels)[1] = misc$trait object@grid = do.call(expand.grid, object@levels) misc$postGridHook = misc$tran = misc$inv.lbl = misc$trait = misc$resp.levs = NULL misc$display = object@model.info$nesting = NULL misc$estName = "prob" object@linfct = diag(1, ncol(post.p)) object@misc = misc object } ### Support for MCMCpack , maybe others that produce mcmc objects ### Whether it works depends on: ### 1. if there is a "call" attribute with a formula or fixed member ### 2. 
if it's right, even then
### Alternatively, maybe providing formula and data will do the trick
recover_data.mcmc = function(object, formula, data, ...) {
    if (missing(formula)) {
        cl = attr(object, "call")
        if (is.null(cl$formula))
            cl$formula = cl$fixed
        if (is.null(cl$formula))
            return("No fixed-effects formula found")
        data = NULL
    }
    else {
        if (missing(formula) || missing(data))
            return("Requires both formula and data to proceed")
        cl = call("mcmc.proxy", formula = formula, data = quote(data))
    }
    trms = delete.response(terms(eval(cl$formula, parent.frame())))
    recover_data(cl, trms, NULL, data, ...)
}

emm_basis.mcmc = function(object, trms, xlev, grid, vcov., contrasts.arg = NULL, ...) {
    m = model.frame(trms, grid, na.action = na.pass, xlev = xlev)
    X = model.matrix(trms, m, contrasts.arg = contrasts.arg)
    samp = as.matrix(object)[, seq_len(ncol(X)), drop = FALSE]
    bhat = apply(samp, 2, mean)
    if (missing(vcov.))
        V = cov(samp)
    else
        V = .my.vcov(object, vcov.)
    misc = list()
    list(X = X, bhat = bhat, nbasis = matrix(NA), V = V,
         dffun = function(k, dfargs) Inf, dfargs = list(),
         misc = misc, post.beta = samp)
}

### Support for mcmc.list
recover_data.mcmc.list = function(object, formula, data, ...) {
    recover_data.mcmc(object[[1]], formula, data, ...)
}

emm_basis.mcmc.list = function(object, trms, xlev, grid, vcov., ...) {
    result = emm_basis.mcmc(object[[1]], trms, xlev, grid, vcov., ...)
    cols = seq_len(ncol(result$post.beta))
    for (i in 2:length(object))
        result$post.beta = rbind(result$post.beta,
                                 as.matrix(object[[i]])[, cols, drop = FALSE])
    attr(result$post.beta, "n.chains") = length(object)
    result
}

### support for CARBayes package - currently MUST supply data and have
### default contrasts matching what was used in fitting the model
recover_data.carbayes = function(object, data, ...) {
    if(is.null(data))   # Try to recover data from parent frame
        data = model.frame(object$formula, data = parent.frame())
    cl = call("carbayes.proxy", formula = object$formula, data = quote(data))
    trms = delete.response(terms(eval(object$formula, parent.frame())))
    recover_data(cl, trms, NULL, data, ...)
}

emm_basis.carbayes = function(object, trms, xlev, grid, ...) {
    m = model.frame(trms, grid, na.action = na.pass, xlev = xlev)
    X = model.matrix(trms, m, contrasts.arg = attr(object$X, "contrasts"))
    samp = as.matrix(object$samples$beta)
    bhat = apply(samp, 2, mean)
    V = cov(samp)
    misc = list()
    list(X = X, bhat = bhat, nbasis = matrix(NA), V = V,
         dffun = function(k, dfargs) Inf, dfargs = list(),
         misc = misc, post.beta = samp)
}

### Support for the rstanarm package (stanreg objects) ###
recover_data.stanreg = function(object, ...) {
    recover_data.lm(object, ...)
}

# note: mode and rescale are ignored for some models
emm_basis.stanreg = function(object, trms, xlev, grid, mode, rescale, ...) {
    misc = list()
    if (!is.null(object$family)) {
        if (is.character(object$family))   # work around bug for stan_polr
            misc$tran = object$method
        else
            misc = .std.link.labels(object$family, misc)
    }
    # Previous code...
### m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) ### if(is.null(contr <- object$contrasts)) ### contr = attr(model.matrix(object), "contrasts") ### X = model.matrix(trms, m, contrasts.arg = contr) ### bhat = rstanarm::fixef(object) ### nms = intersect(colnames(X), names(bhat)) ### bhat = bhat[nms] ### V = vcov(object)[nms, nms, drop = FALSE] # Instead, use internal routine in rstanarm to get the model matrix # Later, we'll get bhat and V from the posterior sample because # the vcov(object) doesn't always jibe with fixef(object) if(is.null(object$contrasts)) # old version of rstanarm where contrasts may get lost. object$contrasts = attr(model.matrix(object), "contrasts") pp_data = get("pp_data", envir = getNamespace("rstanarm")) X = pp_data(object, newdata = grid, re.form = ~0, ...)[[1]] nms = colnames(X) if(!is.null(object$zeta)) { # Polytomous regression model if (missing(mode)) mode = "latent" else mode = match.arg(mode, c("latent", "linear.predictor", "cum.prob", "exc.prob", "prob", "mean.class")) xint = match("(Intercept)", nms, nomatch = 0L) if (xint > 0L) X = X[, -xint, drop = FALSE] k = length(object$zeta) if (mode == "latent") { ### if (missing(rescale)) #------ (Disabl rescale) rescale = c(0,1) X = rescale[2] * cbind(X, matrix(- 1/k, nrow = nrow(X), ncol = k)) ### bhat = c(bhat, object$zeta - rescale[1] / rescale[2]) misc = list(offset.mult = rescale[2]) } else { ### bhat = c(bhat, object$zeta) j = matrix(1, nrow=k, ncol=1) J = matrix(1, nrow=nrow(X), ncol=1) X = cbind(kronecker(-j, X), kronecker(diag(1,k), J)) link = object$method if (link == "logistic") link = "logit" misc = list(ylevs = list(cut = names(object$zeta)), tran = link, inv.lbl = "cumprob", offset.mult = -1) if (mode != "linear.predictor") { misc$mode = mode misc$postGridHook = ".clm.postGrid" # we probably need to adapt this } } nms = colnames(X) = c(nms, names(object$zeta)) misc$respName = as.character.default(terms(object))[2] } samp = as.matrix(object$stanfit)[, nms, drop = FALSE] attr(samp, "n.chains") = object$stanfit@sim$chains bhat = apply(samp, 2, mean) V = cov(samp) # estimability... nbasis = estimability::all.estble all.nms = colnames(X) if (length(nms) < length(all.nms)) { if(is.null(contr <- object$contrasts)) contr = attr(model.matrix(object), "contrasts") coef = NA * X[1, ] coef[names(bhat)] = bhat bhat = coef mmat = model.matrix(trms, object$data, contrasts.arg = contr) nbasis = estimability::nonest.basis(mmat) } list(X = X, bhat = bhat, nbasis = nbasis, V = V, dffun = function(k, dfargs) Inf, dfargs = list(), misc = misc, post.beta = samp) } emmeans/R/interfacing.R0000644000176200001440000004477014137062735014540 0ustar liggesusers############################################################################## # Copyright (c) 2012-2017 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . 
# ############################################################################## #' Support functions for model extensions #' #' This documents the methods that \code{\link{ref_grid}} calls. A user #' or package developer may add \pkg{emmeans} support for a model #' class by writing \code{recover_data} and \code{emm_basis} methods #' for that class. (Users in need for a quick way to obtain results for a model #' that is not supported may be better served by the \code{\link{qdrg}} function.) #' ## #' @rdname extending-emmeans #' @name extending-emmeans #' @param object An object of the same class as is supported by a new method. #' @param ... Additional parameters that may be supported by the method. #' #' @section Details: #' To create a reference grid, the \code{ref_grid} function needs to reconstruct #' the data used in fitting the model, and then obtain a matrix of linear #' functions of the regression coefficients for a given grid of predictor #' values. These tasks are performed by calls to \code{recover_data} and #' \code{emm_basis} respectively. A vignette giving details and examples #' is available via \href{../doc/xtending.html}{vignette("xtending", "emmeans")} #' #' To extend \pkg{emmeans}'s support to additional model types, one need only #' write S3 methods for these two functions. The existing methods serve as #' helpful guidance for writing new ones. Most of the work for #' \code{recover_data} can be done by its method for class \code{"call"}, #' providing the \code{terms} component and \code{na.action} data as additional #' arguments. Writing an \code{emm_basis} method is more involved, but the #' existing methods (e.g., \code{emmeans:::emm_basis.lm}) can serve as models. #' Certain \code{recover_data} and \code{emm_basis} methods are exported from #' \pkg{emmeans}. (To find out, do \code{methods("recover_data")}.) If your #' object is based on another model-fitting object, it #' may be that all that is needed is to call one of these exported methods and #' perhaps make modifications to the results. Contact the developer if you need #' others of these exported. #' #' If the model has a multivariate response, \code{bhat} needs to be #' \dQuote{flattened} into a single vector, and \code{X} and \code{V} must be #' constructed consistently. #' #' In models where a non-full-rank result is possible (often, you can tell by #' seeing if there is a \code{singular.ok} argument in the model-fitting #' function), \code{\link{summary.emmGrid}} and its relatives check the #' estimability of each #' prediction, using the \code{\link[estimability]{nonest.basis}} function in #' the \pkg{estimability} package. #' #' The models already supported are detailed in \href{../doc/models.html}{the #' "models" vignette}. Some packages may provide additional \pkg{emmeans} #' support for its object classes. #' #' #' @return The \code{recover_data} method must return a \code{\link{data.frame}} #' containing all the variables that appear as predictors in the model, #' and attributes \code{"call"}, \code{"terms"}, \code{"predictors"}, #' and \code{"responses"}. (\code{recover_data.call} will #' provide these attributes.) #' #' @note Without an explicit \code{data} argument, \code{recover_data} returns #' the \emph{current version} of the dataset. If the dataset has changed #' since the model was fitted, then this will not be the data used to fit #' the model. 
It is especially important to know this in simulation studies #' where the data are randomly generated or permuted, and in cases where #' several datasets are processed in one step (e.g., using \code{dplyr}). #' In those cases, users should be careful to provide the actual data #' used to fit the model in the \code{data} argument. #' #' @seealso \href{../doc/xtending.html}{Vignette on extending emmeans} #' #' @export recover_data = function(object, ...) { # look for outside methods first for (cl in .chk.cls(object)) { rd <- .get.outside.method("recover_data", cl) if(!is.null(rd)) return(rd(object, ...)) } UseMethod("recover_data") } # get classes that are OK for external code to modify # We don't allow overriding certain anchor classes, # nor ones in 3rd place or later in inheritance .chk.cls = function(object) { sacred = c("call", "lm", "glm", "mlm", "aovlist", "lme", "qdrg") setdiff(class(object)[1:2], sacred) } ### My internal method dispatch -- we prefer outside methods .get.outside.method = function(generic, cls) { mth = utils::getAnywhere(paste(generic, cls, sep = ".")) from = sapply(strsplit(mth[[3]], "[ :]"), function(x) rev(x)[1]) if (length(from) == 0) return (NULL) if(any(outside <- (from != "emmeans"))) mth[which(outside)[1]] else NULL } #-------------------------------------------------------------- ### call' objects # This recover_data method serves as the workhorse for the others # For model objects, call this with the object's call and its terms component # Late addition: if data is non-null, use it in place of recovered data # Later addition: na.action arg req'd - vector of row indexes removed due to NAs # na.action is ignored when data is non-NULL #' @rdname extending-emmeans #' @param trms The \code{\link{terms}} component of \code{object} (typically with #' the response deleted, e.g. via \code{\link{delete.response}}) #' @param na.action Integer vector of indices of observations to ignore; or #' \code{NULL} if none #' @param data Data frame. Usually, this is \code{NULL}. However, if non-null, #' this is used in place of the reconstructed dataset. It must have all of the #' predictors used in the model, and any factor levels must match those used #' in fitting the model. #' @param params Character vector giving the names of any variables in the model #' formula that are \emph{not} predictors. For example, a spline model may involve #' a local variable \code{knots} that is not a predictor, but its value is #' needed to fit the model. Names of parameters not actually used are harmless, #' and the default value \code{"pi"} (the only numeric constant in base R) #' is provided in case the model involves it. An example involving splines #' may be found at \url{https://github.com/rvlenth/emmeans/issues/180}. #' @param frame Optional \code{data.frame}. Many model objects contain the #' model frame used when fitting the model. In cases where there are no #' predictor transformations, this model frame has all the original predictor #' values and so is usable for recovering the data. Thus, if \code{frame} is #' non-missing and \code{data} is \code{NULL}, a check is made on \code{trms} #' and if there are no function calls, we use \code{data = frame}. This #' can be helpful because it provides a modicum of security against the #' possibility that the original data used when fitting the model has been #' altered or removed. #' #' @method recover_data call #' @export recover_data.call = function(object, trms, na.action, data = NULL, params = "pi", frame, ...) 
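# Hypothetical illustration of the 'params' argument (all names made up):
# for a model fitted as  lm(y ~ splines::ns(x, knots = kts), data = dat),
# one would pass params = "kts" so that 'kts' is treated as a constant
# rather than as a predictor to be recovered.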
{ fcall = object # because I'm easily confused vars = setdiff(.all.vars(trms), params) if (!missing(frame) && is.null(data) && !.has.fcns(trms)) data = frame tbl = data if (length(vars) == 0 || vars[1] == "1") { tbl = data.frame(c(1,1)) vars = names(tbl) = 1 } if (is.null(tbl)) { possibly.random = FALSE m = match(c("formula", "data", "subset", "weights"), names(fcall), 0L) fcall = fcall[c(1L, m)] # check to see if there are any function calls to worry about # [e.g., subset = sample(1:n, 50) will give us a # different subset than model used] mm = match(c("data", "subset"), names(fcall), 0L) if(any(mm > 0)) { # Flag cases where there is a function call in data or subset # May indicate a situation where data are randomized fcns = unlist(lapply(fcall[mm], function(x) setdiff(all.names(x), c("::",":::","[[","]]",all.vars(x))))) possibly.random = (max(nchar(c("", fcns))) > 1) } fcall$drop.unused.levels = TRUE fcall[[1L]] = quote(stats::model.frame) fcall$xlev = NULL # we'll ignore xlev if(!is.numeric(na.action)) ### In case na.action is not a vector of indices na.action = NULL # If we have an explicit list of cases to exclude, let everything through now if (!is.null(na.action)) fcall$na.action = na.pass else # exclude incomplete cases fcall$na.action = na.omit form = .reformulate(vars) fcall$formula = update(trms, form) env = environment(trms) if (is.null(env)) env = parent.frame() tbl = try(eval(fcall, env, parent.frame()), silent = TRUE) if(inherits(tbl, "try-error")) return(.rd.error(vars, fcall)) if (possibly.random) { chk = eval(fcall, env, parent.frame()) if (!all(chk == tbl)) stop("Data appear to be randomized -- ", "cannot consistently recover the data\n", "Move the randomization ", "outside of the model-fitting call.") } # Now we can drop na.action's rows if (!is.null(na.action)) tbl = tbl[-(na.action), , drop=FALSE] } else { tbl = tbl[, vars, drop = FALSE] # consider only the variables actually needed tbl = tbl[complete.cases(tbl), , drop=FALSE] } attr(tbl, "call") = object # the original call attr(tbl, "terms") = trms attr(tbl, "predictors") = setdiff(.all.vars(delete.response(trms)), params) attr(tbl, "responses") = setdiff(vars, union(attr(tbl, "predictors"), params)) tbl } # error message for recover_data.call .rd.error = function(vars, fcall) { if ("pi" %in% vars) return("\nTry re-running with 'params = c\"pi\", ...)'") if (is.list(fcall$data)) fcall$data = "(raw data structure)" dataname = as.character(fcall$data)[1] if ((!is.na(dataname)) && (nchar(dataname) > 50)) dataname = paste(substring(dataname, 1, 50), "...") mesg = "We are unable to reconstruct the data.\n" mesg = paste0(mesg, "The variables needed are:\n\t", paste(vars, collapse = " "), "\n", "Are any of these actually constants? (specify via 'params = ')\n") if (is.na(dataname)) mesg = paste(mesg, "Try re-running with 'data = \"\"'\n") else mesg = paste0(mesg, "The dataset name is:\n\t", dataname, "\n", "Does the data still exist? 
Or you can specify a dataset via 'data = '\n") mesg } #---------------------------------------------------------- ### emm_basis methods create a basis for the reference grid # # Required args: # object - the model object # trms - terms component of object # xlev - named list of factor levels (but not the coerced ones) # grid - reference grid # All methods must return a list with these elements: # X - basis for linear fcns over reference grid # bhat - regression coefficients for fixed effects (INCLUDING any NAs) # nbasis - matrix whose columns for a basis for non-estimable functions of beta; all.estble if none # V - estimated covariance matrix of bhat # dffun - function(k, dfargs) to find df for k'bhat having std error se # dfargs - additional arguments, if any, for dffun # misc - Extra info ... # -- if extra levels need to be added (e.g. mlm, polr), # put them in misc$ylevs # -- For transformations or link fcns, use misc$tran # for name (see 'make.link'), and use misc$inv.lbl # for label to use in 'summary' when tran is inverted # (ref_grid looks at lhs of model for tran if none found) # Note: if no df exists, set dffun = function(...) NA and dfargs = list() #-------------------------------------------------------------- #' @rdname extending-emmeans #' @param xlev Named list of factor levels (\emph{excluding} ones coerced to #' factors in the model formula) #' @param grid A \code{data.frame} (provided by \code{ref_grid}) containing #' the predictor settings needed in the reference grid #' #' @return The \code{emm_basis} method should return a \code{list} with the #' following elements: #' \describe{ #' \item{X}{The matrix of linear functions over \code{grid}, having the same #' number of rows as \code{grid} and the number of columns equal to the length #' of \code{bhat}.} #' \item{bhat}{The vector of regression coefficients for fixed effects. This #' should \emph{include} any \code{NA}s that result from rank deficiencies.} #' \item{nbasis}{A matrix whose columns form a basis for non-estimable functions #' of beta, or a 1x1 matrix of \code{NA} if there is no rank deficiency.} #' \item{V}{The estimated covariance matrix of \code{bhat}.} #' \item{dffun}{A function of \code{(k, dfargs)} that returns the degrees of #' freedom associated with \code{sum(k * bhat)}.} #' \item{dfargs}{A \code{list} containing additional arguments needed for #' \code{dffun}}. #' } %%% end of describe #' @export #' #' @section Communication between methods: #' If the \code{recover_data} method generates information needed by \code{emm_basis}, #' that information may be incorporated by creating a \code{"misc"} attribute in the #' returned recovered data. That information is then passed as the \code{misc} #' argument when \code{ref_grid} calls \code{emm_basis}. #' #' @section Optional hooks: #' Some models may need something other than standard linear estimates and #' standard errors. If so, custom functions may be pointed to via the items #' \code{misc$estHook}, \code{misc$vcovHook} and \code{misc$postGridHook}. If #' just the name of the hook function is provided as a character string, then it #' is retrieved using \code{\link{get}}. #' #' The \code{estHook} function should have arguments \samp{(object, do.se, tol, #' ...)} where \code{object} is the \code{emmGrid} object, #' \code{do.se} is a logical flag for whether to return the standard error, and #' \code{tol} is the tolerance for assessing estimability. 
It should return a #' matrix with 3 columns: the estimates, standard errors (\code{NA} when #' \code{do.se==FALSE}), and degrees of freedom (\code{NA} for asymptotic). The #' number of rows should be the same as \samp{object@linfct}. The #' \code{vcovHook} function should have arguments \samp{(object, tol, ...)} as #' described. It should return the covariance matrix for the estimates. Finally, #' \code{postGridHook}, if present, is called at the very end of #' \code{ref_grid}; it takes one argument, the constructed \code{object}, and #' should return a suitably modified \code{emmGrid} object. emm_basis = function(object, trms, xlev, grid, ...) { # look for outside methods first for (cl in .chk.cls(object)) { emb <- .get.outside.method("emm_basis", cl) if(!is.null(emb)) return(emb(object, trms, xlev, grid, ...)) } UseMethod("emm_basis") # lands here only if no outside method found } # Hidden courtesy function that provides access to all recover_data methods #' @rdname extending-emmeans #' @export .recover_data = function(object, ...) recover_data(object, ...) # Hidden courtesy function that provides access to all emm_basis methods #' @rdname extending-emmeans #' @return \code{.recover_data} and \code{.emm_basis} are hidden exported versions of #' \code{recover_data} and \code{emm_basis}, respectively. They run in \pkg{emmeans}'s #' namespace, thus providing access to all existing methods. #' @export .emm_basis = function(object, trms, xlev, grid, ...) emm_basis(object, trms, xlev, grid, ...) #-------------------------------------------------------------- ### DEFAULT METHODS (we hit these when a model is NOT supported) # I'll have it return the message if we caught the error in this way # Then caller can use try() to check for other types of errors, # and just print this message otherwise # NOT @exported recover_data.default = function(object, ...) { paste("Can't handle an object of class ", dQuote(class(object)[1]), "\n", paste(.show_supported(), collapse="")) } # NOT @exported emm_basis.default = function(object, trms, xlev, grid, ...) { stop("Can't handle an object of class", dQuote(class(object)[1]), "\n", .show_supported()) } # Private fcn to get a list of supported objects # does this by looking in namespace [ns] and methods [meth] # then strips that off leaving extensions .show_supported = function(ns = "emmeans", meth = "emm_basis") { "Use help(\"models\", package = \"emmeans\") for information on supported models." } emmeans/R/summary.R0000644000176200001440000016256314164423304013736 0ustar liggesusers############################################################################## # Copyright (c) 2012-2018 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . 
# ############################################################################## ### This file has summary.emmGrid S3 method and related functions # S3 summary method #' Summaries, predictions, intervals, and tests for \code{emmGrid} objects #' #' These are the primary methods for obtaining numerical or tabular results #' from an \code{emmGrid} object. #' \code{summary.emmGrid} is the general function for summarizing \code{emmGrid} objects. #' It also serves as the print method for these objects; so for convenience, #' \code{summary()} arguments may be included in calls to functions such as #' \code{\link{emmeans}} and \code{\link{contrast}} that construct \code{emmGrid} #' objects. Note that by default, summaries for Bayesian models are #' diverted to \code{\link{hpd.summary}}. #' #' \code{confint.emmGrid} is equivalent to \code{summary.emmGrid with #' infer = c(TRUE, FALSE)}. When called with \code{joint = FALSE}, \code{test.emmGrid} #' is equivalent to \code{summary.emmGrid} with \code{infer = c(FALSE, TRUE)}. #' #' With \code{joint = TRUE}, \code{test.emmGrid} calculates the Wald test of the #' hypothesis \code{linfct \%*\% bhat = null}, where \code{linfct} and #' \code{bhat} refer to slots in \code{object} (possibly subsetted according to #' \code{by} or \code{rows}). An error is thrown if any row of \code{linfct} is #' non-estimable. It is permissible for the rows of \code{linfct} to be linearly #' dependent, as long as \code{null == 0}, in which case a reduced set of #' contrasts is tested. Linear dependence and nonzero \code{null} cause an #' error. #' #' @param object An object of class \code{"emmGrid"} (see \link{emmGrid-class}) #' @param infer A vector of one or two logical values. The first determines #' whether confidence intervals are displayed, and the second determines #' whether \emph{t} tests and \emph{P} values are displayed. If only one value #' is provided, it is used for both. #' @param level Numerical value between 0 and 1. Confidence level for confidence #' intervals, if \code{infer[1]} is \code{TRUE}. #' @param adjust Character value naming the method used to adjust \eqn{p} values #' or confidence limits; or to adjust comparison arrows in \code{plot}. See #' the P-value adjustments section below. #' @param by Character name(s) of variables to use for grouping into separate #' tables. This affects the family of tests considered in adjusted \emph{P} #' values. #' @param type Character: type of prediction desired. This only has an effect if #' there is a known transformation or link function. \code{"response"} #' specifies that the inverse transformation be applied. \code{"mu"} (or #' equivalently, \code{"unlink"}) is usually the same as \code{"response"}, #' but in the case where the model has both a link function and a response #' transformation, only the link part is back-transformed. Other valid values #' are \code{"link"}, \code{"lp"}, and \code{"linear.predictor"}; these are #' equivalent, and request that results be shown for the linear predictor, #' with no back-transformation. The default is \code{"link"}, unless the #' \code{"predict.type"} option is in force; see \code{\link{emm_options}}, #' and also the section below on transformations and links. #' @param df Numeric. If non-missing, a constant number of degrees of freedom to #' use in constructing confidence intervals and \emph{P} values (\code{NA} #' specifies asymptotic results). #' @param calc Named list of character value(s) or formula(s). 
#' The expressions in \code{calc} are evaluated and appended to the
#' summary, just after the \code{df} column. The expression may include
#' any names up through \code{df} in the summary, any additional names in
#' \code{object@grid} (such as \code{.wgt.} or \code{.offset.}), or any
#' earlier elements of \code{calc}.
#' @param null Numeric. Null hypothesis value(s), on the linear-predictor scale,
#'   against which estimates are tested. May be a single value used for all, or
#'   a numeric vector of length equal to the number of tests in each family
#'   (i.e., \code{by} group in the displayed table).
#' @param delta Numeric value (on the linear-predictor scale). If zero, ordinary
#'   tests of significance are performed. If positive, this specifies a
#'   threshold for testing equivalence (using the TOST or two-one-sided-test
#'   method), non-inferiority, or non-superiority, depending on \code{side}. See
#'   Details for how the test statistics are defined.
#' @param side Numeric or character value specifying whether the test is
#'   left-tailed (\code{-1}, \code{"-"}, \code{"<"}, \code{"left"}, or
#'   \code{"nonsuperiority"}); right-tailed (\code{1}, \code{"+"}, \code{">"},
#'   \code{"right"}, or \code{"noninferiority"}); or two-sided (\code{0},
#'   \code{2}, \code{"!="}, \code{"two-sided"}, \code{"both"},
#'   \code{"equivalence"}, or \code{"="}). See the special section below for
#'   more details.
#' @param frequentist Ignored except if a Bayesian model was fitted. If missing
#'   or \code{FALSE}, the object is passed to \code{\link{hpd.summary}}. Otherwise,
#'   a logical value of \code{TRUE} will have it return a frequentist summary.
#' @param bias.adjust Logical value for whether to adjust for bias in
#'   back-transforming (\code{type = "response"}). This requires a value of
#'   \code{sigma} to exist in the object or be specified.
#' @param sigma Error SD assumed for bias correction (when
#'   \code{type = "response"} and a transformation
#'   is in effect), or for constructing prediction intervals. If not specified,
#'   \code{object@misc$sigma} is used, and an error is thrown if it is not found.
#'   \emph{Note:} \code{sigma} may be a vector, as long as it conforms to the number of rows
#'   of the reference grid.
#' @param ... Optional arguments such as \code{scheffe.rank}
#'   (see \dQuote{P-value adjustments}).
#'   In \code{as.data.frame.emmGrid}, \code{confint.emmGrid},
#'   \code{predict.emmGrid}, and
#'   \code{test.emmGrid}, these arguments are passed to
#'   \code{summary.emmGrid}.
#'
#' @return \code{summary.emmGrid}, \code{confint.emmGrid}, and
#'   \code{test.emmGrid} return an object of class \code{"summary_emm"}, which
#'   is an extension of \code{\link{data.frame}} but with a special \code{print}
#'   method that displays it with custom formatting. For models fitted using
#'   MCMC methods, the call is diverted to \code{\link{hpd.summary}} (with
#'   \code{prob} set to \code{level}, if specified); one may
#'   alternatively use general MCMC summarization tools with the
#'   results of \code{as.mcmc}.
#'
#' @section Defaults:
#'   The \code{misc} slot in \code{object} may contain default values for
#'   \code{by}, \code{calc}, \code{infer}, \code{level}, \code{adjust},
#'   \code{type}, \code{null}, \code{side}, and \code{delta}.
#'   These defaults vary depending
#'   on the code that created the object. The \code{\link{update}} method may be
#'   used to change these defaults. In addition, any options set using
#'   \samp{emm_options(summary = ...)} will trump those stored in the object's
#'   \code{misc} slot.
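#'
#'   For example (a sketch, assuming an \code{emmGrid} object such as the
#'   \code{warp.emm} created in the examples below), one could make 90\%
#'   intervals the default for that object via
#'   \code{warp.emm <- update(warp.emm, level = 0.90)}, or override the
#'   stored defaults session-wide via
#'   \samp{emm_options(summary = list(level = 0.90))}.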
#' #' @section Transformations and links: #' With \code{type = "response"}, the transformation assumed can be found in #' \samp{object@misc$tran}, and its label, for the summary is in #' \samp{object@misc$inv.lbl}. Any \eqn{t} or \eqn{z} tests are still performed #' on the scale of the linear predictor, not the inverse-transformed one. #' Similarly, confidence intervals are computed on the linear-predictor scale, #' then inverse-transformed. #' #' When \code{bias.adjust} is \code{TRUE}, then back-transformed estimates #' are adjusted by adding #' \eqn{0.5 h''(u)\sigma^2}, where \eqn{h} is the inverse transformation and #' \eqn{u} is the linear predictor. This is based on a second-order Taylor #' expansion. There are better or exact adjustments for certain specific #' cases, and these may be incorporated in future updates. #' #' @section P-value adjustments: #' The \code{adjust} argument specifies a multiplicity adjustment for tests or #' confidence intervals. This adjustment always is applied \emph{separately} #' to each table or sub-table that you see in the printed output (see #' \code{\link{rbind.emmGrid}} for how to combine tables). #' #' The valid values of \code{adjust} are as follows: #' \describe{ #' \item{\code{"tukey"}}{Uses the Studentized range distribution with the number #' of means in the family. (Available for two-sided cases only.)} #' \item{\code{"scheffe"}}{Computes \eqn{p} values from the \eqn{F} #' distribution, according to the Scheffe critical value of #' \eqn{\sqrt{rF(\alpha; r, d)}}{sqrt[r*qf(alpha, r, d)]}, where \eqn{d} is #' the error degrees of freedom and \eqn{r} is the rank of the set of linear #' functions under consideration. By default, the value of \code{r} is #' computed from \code{object@linfct} for each by group; however, if the #' user specifies an argument matching \code{scheffe.rank}, its value will #' be used instead. Ordinarily, if there are \eqn{k} means involved, then #' \eqn{r = k - 1} for a full set of contrasts involving all \eqn{k} means, and #' \eqn{r = k} for the means themselves. (The Scheffe adjustment is available #' for two-sided cases only.)} #' \item{\code{"sidak"}}{Makes adjustments as if the estimates were independent #' (a conservative adjustment in many cases).} #' \item{\code{"bonferroni"}}{Multiplies \eqn{p} values, or divides significance #' levels by the number of estimates. This is a conservative adjustment.} #' \item{\code{"dunnettx"}}{Uses our own\emph{ad hoc} approximation to the #' Dunnett distribution for a family of estimates having pairwise #' correlations of \eqn{0.5} (as is true when comparing treatments with a #' control with equal sample sizes). The accuracy of the approximation #' improves with the number of simultaneous estimates, and is much faster #' than \code{"mvt"}. (Available for two-sided cases only.)} #' \item{\code{"mvt"}}{Uses the multivariate \eqn{t} distribution to assess the #' probability or critical value for the maximum of \eqn{k} estimates. This #' method produces the same \eqn{p} values and intervals as the default #' \code{summary} or \code{confint} methods to the results of #' \code{\link{as.glht}}. In the context of pairwise comparisons or comparisons #' with a control, this produces \dQuote{exact} Tukey or Dunnett adjustments, #' respectively. However, the algorithm (from the \pkg{mvtnorm} package) uses a #' Monte Carlo method, so results are not exactly repeatable unless the same #' random-number seed is used (see \code{\link[base:Random]{set.seed}}). 
As the family
#'   size increases, the required computation time will become noticeable or even
#'   intolerable, making the \code{"tukey"}, \code{"dunnettx"}, or others more
#'   attractive.}
#' \item{\code{"none"}}{Makes no adjustments to the \eqn{p} values.}
#' } %%%%%%%%%%%%%%%% end \describe {}
#'
#' For tests, not confidence intervals, the Bonferroni-inequality-based adjustment
#' methods in \code{\link{p.adjust}} are also available (currently, these
#' include \code{"holm"}, \code{"hochberg"}, \code{"hommel"},
#' \code{"bonferroni"}, \code{"BH"}, \code{"BY"}, \code{"fdr"}, and
#' \code{"none"}). If a \code{p.adjust.methods} method other than
#' \code{"bonferroni"} or \code{"none"} is specified for confidence limits, the
#' straight Bonferroni adjustment is used instead. Also, if an adjustment method
#' is not appropriate (e.g., using \code{"tukey"} with one-sided tests, or with
#' results that are not pairwise comparisons), a more appropriate method
#' (usually \code{"sidak"}) is substituted.
#'
#' In some cases, confidence and \eqn{p}-value adjustments are only approximate
#' -- especially when the degrees of freedom or standard errors vary greatly
#' within the family of tests. The \code{"mvt"} method is always the correct
#' one-step adjustment, but it can be very slow. One may use
#' \code{\link{as.glht}} with methods in the \pkg{multcomp} package to obtain
#' non-conservative multi-step adjustments to tests.
#'
#' \emph{Warning:} Non-estimable cases are \emph{included} in the family to which adjustments
#' are applied. You may wish to subset the object using the \code{[]} operator
#' to work around this problem.
#'
#' @section Tests of significance, nonsuperiority, noninferiority, or equivalence:
#' When \code{delta = 0}, test statistics are the usual tests of significance.
#' They are of the form
#' \samp{(estimate - null)/SE}. Notationally:
#' \describe{
#' \item{Significance}{\eqn{H_0: \theta = \theta_0} versus \cr
#'   \eqn{H_1: \theta < \theta_0} (left-sided), or\cr
#'   \eqn{H_1: \theta > \theta_0} (right-sided), or\cr
#'   \eqn{H_1: \theta \ne \theta_0} (two-sided)\cr
#'   The test statistic is\cr
#'   \eqn{t = (Q - \theta_0)/SE}\cr
#'   where \eqn{Q} is our estimate of \eqn{\theta};
#'   then left, right, or two-sided \eqn{p} values are produced,
#'   depending on \code{side}.}
#' }
#' When \code{delta} is positive, the test statistic depends on \code{side} as
#' follows.
#' \describe{
#' \item{Left-sided (nonsuperiority)}{\eqn{H_0: \theta \ge \theta_0 + \delta}
#'   versus \eqn{H_1: \theta < \theta_0 + \delta}\cr
#'   \eqn{t = (Q - \theta_0 - \delta)/SE}\cr
#'   The \eqn{p} value is the lower-tail probability.}
#' \item{Right-sided (noninferiority)}{\eqn{H_0: \theta \le \theta_0 - \delta}
#'   versus \eqn{H_1: \theta > \theta_0 - \delta}\cr
#'   \eqn{t = (Q - \theta_0 + \delta)/SE}\cr
#'   The \eqn{p} value is the upper-tail probability.}
#' \item{Two-sided (equivalence)}{\eqn{H_0: |\theta - \theta_0| \ge \delta}
#'   versus \eqn{H_1: |\theta - \theta_0| < \delta}\cr
#'   \eqn{t = (|Q - \theta_0| - \delta)/SE}\cr
#'   The \eqn{p} value is the \emph{lower}-tail probability.\cr
#'   Note that \eqn{t} is the maximum of \eqn{t_{nonsup}} and \eqn{-t_{noninf}}.
#'   This is equivalent to choosing the less
#'   significant result in the two-one-sided-test (TOST) procedure.}
#' } %%%%%%%%%%%% end \describe{}
#'
#'
#' @section Non-estimable cases:
#' When the model is rank-deficient, each row \code{x} of \code{object}'s
#' \code{linfct} slot is checked for estimability.
If \code{sum(x*bhat)} #' is found to be non-estimable, then the string \code{NonEst} is displayed for the #' estimate, and associated statistics are set to \code{NA}. #' The estimability check is performed #' using the orthonormal basis \code{N} in the \code{nbasis} slot for the null #' space of the rows of the model matrix. Estimability fails when #' \eqn{||Nx||^2 / ||x||^2} exceeds \code{tol}, which by default is #' \code{1e-8}. You may change it via \code{\link{emm_options}} by setting #' \code{estble.tol} to the desired value. #' #' See the warning above that non-estimable cases are still included when #' determining the family size for \emph{P}-value adjustments. #' #' @section Warning about potential misuse of P values: #' Some in the statistical and scientific community argue that #' the term \dQuote{statistical significance} should be completely abandoned, and #' that criteria such as \dQuote{p < 0.05} never be used to assess the #' importance of an effect. These practices can be too misleading and are prone to abuse. #' See \href{../doc/basics.html#pvalues}{the \dQuote{basics} vignette} for more #' discussion. #' #' #' @note In doing testing and a transformation and/or link is in force, any #' \code{null} and/or \code{delta} values specified must always be on the #' scale of the linear predictor, regardless of the setting for `type`. If #' \code{type = "response"}, the null value displayed in the summary table #' will be back-transformed from the value supplied by the user. But the #' displayed \code{delta} will not be changed, because there (often) is #' not a natural way to back-transform it. #' #' @note When we have \code{type = "response"}, and \code{bias.adj = TRUE}, #' the \code{null} value displayed in the output is both back-transformed #' and bias-adjusted, leading to a rather non-intuitive-looking null value. #' However, since the tests themselves are performed on the link scale, #' this is the response value at which a *P* value of 1 would be obtained. #' #' @note The default \code{show} method for \code{emmGrid} objects (with the #' exception of newly created reference grids) is \code{print(summary())}. #' Thus, with ordinary usage of \code{\link{emmeans}} and such, it is #' unnecessary to call \code{summary} unless there is a need to #' specify other than its default options. #' #' @seealso \code{\link{hpd.summary}} #' #' @method summary emmGrid #' @export #' @order 1 #' #' @examples #' warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks) #' warp.emm <- emmeans(warp.lm, ~ tension | wool) #' warp.emm # implicitly runs 'summary' #' #' confint(warp.emm, by = NULL, level = .90) #' #' # -------------------------------------------------------------- #' pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs) #' pigs.emm <- emmeans(pigs.lm, "percent", type = "response") #' summary(pigs.emm) # (inherits type = "response") #' summary(pigs.emm, calc = c(n = ".wgt.")) # Show sample size #' #' # For which percents is EMM non-inferior to 35, based on a 10% threshold? 
#' # Note the test is done on the log scale even though we have type = "response" #' test(pigs.emm, null = log(35), delta = log(1.10), side = ">") #' #' con <- contrast(pigs.emm, "consec") #' test(con) #' #' test(con, joint = TRUE) #' #' # default Scheffe adjustment - rank = 3 #' summary(con, infer = c(TRUE, TRUE), adjust = "scheffe") #' #' # Consider as some of many possible contrasts among the six cell means #' summary(con, infer = c(TRUE, TRUE), adjust = "scheffe", scheffe.rank = 5) #' summary.emmGrid <- function(object, infer, level, adjust, by, type, df, calc, null, delta, side, frequentist, bias.adjust = get_emm_option("back.bias.adj"), sigma, ...) { if(missing(sigma)) sigma = object@misc$sigma if(missing(frequentist) && !is.null(object@misc$frequentist)) frequentist = object@misc$frequentist if(missing(bias.adjust)) { if (!is.null(object@misc$bias.adjust)) bias.adjust = object@misc$bias.adjust else bias.adjust = get_emm_option("back.bias.adj") } if(!is.na(object@post.beta[1]) && (missing(frequentist) || !frequentist)) return (hpd.summary(object, prob = level, by = by, type = type, bias.adjust = bias.adjust, sigma = sigma, ...)) # Any "summary" options override built-in opt = get_emm_option("summary") if(!is.null(opt)) { opt$object = object object = do.call("update.emmGrid", opt) } if(!missing(sigma)) object = update(object, sigma = sigma) # we'll keep sigma in misc misc = object@misc use.elts = .reconcile.elts(object) grid = object@grid[use.elts, , drop = FALSE] ### For missing arguments, get from misc, else default if(missing(infer)) infer = misc$infer if(missing(level)) level = misc$level if(missing(adjust)) adjust = misc$adjust if(missing(by)) by = misc$by.vars # Disable Tukey if by vars don't match those used in construction if((!is.na(misc$estType) && misc$estType == "pairs") && (paste(c("", by), collapse = ",") != misc$.pairby)) misc$estType = object@misc$estType = "contrast" if (missing(type)) type = .get.predict.type(misc) else type = .validate.type(type) # if there are two transformations and we want response, then we need to undo both if ((type == "response") && (!is.null(misc$tran2))) { tmp = match.call() tmp$type = "unlink" summ.unlink = eval(tmp) #this is summary with type = "unlink" object = regrid(object, transform = "mu") two.trans = TRUE } else two.trans = FALSE if ((type %in% c("mu", "unlink")) && (!is.null(t2 <- misc$tran2))) { if (!is.character(t2)) t2 = "tran" object = update(object, inv.lbl = paste0(t2, "(resp)")) } if(missing(df)) df = misc$df if(!is.null(df)) { object@dffun = function(k, dfargs) df[1] attr(object@dffun, "mesg") = "user-specified" } # for missing args that default to zero unless provided or in misc slot .nul.eq.zero = function(val) { if(is.null(val)) 0 else val } if(missing(null)) null = .nul.eq.zero(misc$null) if(missing(delta)) delta = .nul.eq.zero(misc$delta) if(missing(side)) side = .nul.eq.zero(misc$side) # reconcile all the different ways we could specify the alternative # ... and map each to one of the first 3 subscripts side.opts = c("left","both","right","two-sided","noninferiority","nonsuperiority","equivalence","superiority","inferiority","0","2","-1","1","+1","<",">","!=","=") side.map = c( 1, 2, 3, 2, 3, 1, 2, 3, 1, 2, 2, 1, 3, 3, 1, 3, 2, 2) side = side.map[pmatch(side, side.opts, 2)[1]] - 2 delta = abs(delta) result = .est.se.df(object, ...) 
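    # (result is a data frame with one row per displayed grid point and
    #  columns for the estimate, SE, and df; its "link" attribute, if any,
    #  carries the info needed for back-transformation)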
    lblnms = setdiff(names(grid),
                     c(object@roles$responses, ".offset.", ".wgt."))
    lbls = grid[lblnms]
    zFlag = (all(is.na(result$df) | is.infinite(result$df)))
    inv = (type %in% c("response", "mu", "unlink")) # flag to inverse-transform
    link = attr(result, "link")
    if (inv && is.null(link))
        inv = FALSE

    if ((length(infer) == 0) || !is.logical(infer))
        infer = c(FALSE, FALSE)
    if(length(infer) == 1)
        infer = c(infer, infer)

    if(inv && !is.null(misc$tran)) {
        if (!is.null(misc$inv.lbl)) {
            names(result)[1] = misc$inv.lbl
            if (!is.null(misc$log.contrast))  # contrast of logs - relabel as ratios
                for (ell in seq_along(lbls)){
                    lbls[[ell]] = factor(lbls[[ell]])
                    levels(lbls[[ell]]) = gsub(" - ", " / ", levels(lbls[[ell]]))
                }
        }
        else
            names(result)[1] = "response"
    }

    attr(result, "link") = NULL
    estName = names(result)[1]

    mesg = misc$initMesg

    # Look for "mesg" attribute in dffun
    if (!is.null(dfm <- attr(object@dffun, "mesg")))
        mesg = c(mesg, paste("Degrees-of-freedom method:", dfm))

    ### Add an annotation when we show results on lp scale and
    ### there is a transformation
    if (!inv && !is.null(link)) {
        mesg = c(mesg, paste("Results are given on the", link$name, "(not the response) scale."))
    }
    if (inv && !is.null(link$unknown)) {
        mesg = c(mesg, paste0('Unknown transformation "', link$name, '": no transformation done'))
        inv = FALSE
        link = NULL
    }
    if(inv && bias.adjust && !is.null(link))
        link = .make.bias.adj.link(link, sigma)

    # et = 1 if a prediction, 2 if a contrast (or unmatched or NULL), 3 if pairs
    et = pmatch(c(misc$estType, "c"), c("prediction", "contrast", "pairs"), nomatch = 2)[1]

    by.size = nrow(grid)
    by.rows = .find.by.rows(grid, by)
    if (!is.null(by)) {
        if (length(unique(sapply(by.rows, length))) > 1) {
            by.size = misc$famSize = "(varies)"
        }
        else for (nm in by)
            by.size = by.size / length(unique(object@levels[[nm]]))
    }
    fam.info = c(misc$famSize, by.size, et)
    cnm = NULL

    adjust = tolower(adjust)
    if (adjust %in% c("bh", "by")) # special cases in p.adjust.methods
        adjust = toupper(adjust)

    # get vcov matrix only if needed (adjust == "mvt")
    corrmat = sch.rank = NULL
    if (adjust %.pin% "mvt") { ##(!is.na(pmatch(adjust, "mvt"))) {
        corrmat = cov2cor(vcov(object))
        attr(corrmat, "by.rows") = by.rows
    }
    else if (adjust %.pin% "scheffe") { ##(!is.na(pmatch(adjust, "scheffe"))) {
        if(is.null(sch.rank <- .match.dots("scheffe.rank", ...)))
            sch.rank = sapply(by.rows, function(.)
qr(zapsmall(object@linfct[., , drop = FALSE]))$rank) if(length(unique(sch.rank)) > 1) fam.info[1] = "uneven" # This forces ragged.by = TRUE in .adj functions } # Add calculated columns if(!missing(calc) || !is.null(calc <- misc$calc)) { env = c(result, grid[setdiff(names(grid), names(result))]) for (v in names(calc)) { elt = rev(as.character(calc[[v]]))[1] # pick out rhs if a formula val = try(eval(parse(text = elt), envir = env), silent = TRUE) if(!inherits(val, "try-error")) result[[v]] = env[[v]] = val else warning("The column '", v, "' could not be calculated, ", " so it is omitted") } } # add linkname attribute if (two.trans) linkname = paste0(link$name, "[", attr(summ.unlink, "linkname"), "]") else linkname = link$name if(infer[1]) { # add CIs acv = .adj.critval(level, result$df, adjust, fam.info, side, corrmat, by.rows, sch.rank) ###adjust = acv$adjust # in older versions, I forced same adj method for tests cv = acv$cv cv = switch(side + 2, cbind(-Inf, cv), cbind(-cv, cv), cbind(-cv, Inf)) cnm = if (zFlag) c("asymp.LCL", "asymp.UCL") else c("lower.CL","upper.CL") if(!is.null(misc$.predFlag)) { cnm = c("lower.PL", "upper.PL") sigma = misc$sigma mesg = c(mesg, paste0( "Prediction intervals and SEs are based on an error SD of ", .fmt.sigma(sigma))) estName = names(result)[1] = "prediction" } if (!two.trans) { result[[cnm[1]]] = result[[1]] + cv[, 1]*result$SE result[[cnm[2]]] = result[[1]] + cv[, 2]*result$SE } else { result[cnm] = summ.unlink[cnm] } mesg = c(mesg, paste("Confidence level used:", level), acv$mesg) if (inv) { clims = with(link, cbind(linkinv(result[[cnm[1]]]), linkinv(result[[cnm[2]]]))) tmp = apply(clims, 1, function(x) { z = diff(x); ifelse(is.na(z), 0, z) }) idx = if (all(tmp >= 0)) 1:2 else 2:1 result[[cnm[1]]] = clims[, idx[1]] result[[cnm[2]]] = clims[, idx[2]] mesg = c(mesg, paste("Intervals are back-transformed from the", linkname, "scale")) } } if(infer[2]) { # add tests tnm = ifelse (zFlag, "z.ratio", "t.ratio") tail = ifelse(side == 0, -sign(abs(delta)), side) if(!two.trans) { result[["null"]] = null if (inv && !is.null(link)) result[["null"]] = link$linkinv(null) if (all(result$null == 0)) result[["null"]] = NULL if (side == 0) { if (delta == 0) # two-sided sig test t.ratio = result[[tnm]] = (result[[1]] - null) / result$SE else t.ratio = result[[tnm]] = (abs(result[[1]] - null) - delta) / result$SE } else { t.ratio = result[[tnm]] = (result[[1]] - null + side * delta) / result$SE } apv = .adj.p.value(t.ratio, result$df, adjust, fam.info, tail, corrmat, by.rows, sch.rank) adjust = apv$adjust # in case it was abbreviated result$p.value = apv$pval mesg = c(mesg, apv$mesg) } else { result$null = ifelse(is.null(summ.unlink$null), link$linkinv(0), link$linkinv(summ.unlink$null)) result[[tnm]] = summ.unlink[[tnm]] result$p.value = summ.unlink$p.value apv = .adj.p.value(0, result$df, adjust, fam.info, tail, corrmat, by.rows, sch.rank) # we ignore everything about apv except the message mesg = c(mesg, apv$mesg) } if (delta > 0) mesg = c(mesg, paste("Statistics are tests of", c("nonsuperiority","equivalence","noninferiority")[side+2], "with a threshold of", signif(delta, 5))) if(tail != 0) mesg = c(mesg, paste("P values are ", ifelse(tail<0,"left-","right-"),"tailed", sep="")) if (inv) mesg = c(mesg, paste("Tests are performed on the", linkname, "scale")) } if (inv) { result[["SE"]] = with(link, abs(mu.eta(result[[1]]) * result[["SE"]])) result[[1]] = with(link, linkinv(result[[1]])) if(bias.adjust) mesg = c(mesg, paste("Bias adjustment applied based on sigma =", 
.fmt.sigma(sigma))) } if (length(misc$avgd.over) > 0) { qual = attr(misc$avgd.over, "qualifier") if (is.null(qual)) qual = "" mesg = c(paste0("Results are averaged over", qual, " the levels of: ", paste(misc$avgd.over, collapse = ", ")), mesg) } summ = cbind(lbls, result) attr(summ, "estName") = estName attr(summ, "clNames") = cnm # will be NULL if infer[1] is FALSE if (is.null(misc$pri.vars) || length(misc$pri.vars) == 0) misc$pri.vars = names(object@levels) attr(summ, "pri.vars") = setdiff(union(misc$pri.vars, misc$by.vars), c(by, ".wgt.", ".offset.")) attr(summ, "by.vars") = by attr(summ, "adjust") = adjust attr(summ, "side") = side attr(summ, "delta") = delta attr(summ, "type") = type attr(summ, "mesg") = unique(mesg) attr(summ, "linkname") = linkname class(summ) = c("summary_emm", "data.frame") summ } # S3 predict method #' @rdname summary.emmGrid #' @order 4 #' @method predict emmGrid #' @param interval Type of interval desired (partial matching is allowed): #' \code{"none"} for no intervals, #' otherwise confidence or prediction intervals with given arguments, #' via \code{\link{confint.emmGrid}}. #' #' @export #' @return \code{predict} returns a vector of predictions for each row of \code{object@grid}. predict.emmGrid <- function(object, type, interval = c("none", "confidence", "prediction"), level = 0.95, bias.adjust = get_emm_option("back.bias.adj"), sigma, ...) { # update with any "summary" options opt = get_emm_option("summary") if(!is.null(opt)) { opt$object = object object = do.call("update.emmGrid", opt) } interval = match.arg(interval) if (interval %in% c( "confidence", "prediction")) { if (interval == "prediction") object@misc$.predFlag = TRUE return(confint.emmGrid(object, type = type, level = level, bias.adjust = bias.adjust, sigma = sigma, ...)) } if (missing(type)) type = .get.predict.type(object@misc) else type = .validate.type(type) # if there are two transformations and we want response, then we need to undo both if ((type == "response") && (!is.null(object@misc$tran2))) object = regrid(object, transform = "mu") pred = .est.se.df(object, do.se = FALSE, ...) result = pred[[1]] if (type %in% c("response", "mu", "unlink")) { link = attr(pred, "link") if (!is.null(link)) { if (bias.adjust) { if(missing(sigma)) sigma = object@misc$sigma link = .make.bias.adj.link(link, sigma) } result = link$linkinv(result) if (is.logical(link$unknown) && link$unknown) warning("Unknown transformation: \"", link$name, "\" -- no transformation applied.") } } result } # as.data.frame method #' @rdname summary.emmGrid #' @order 5 #' @param x object of the given class #' @param row.names passed to \code{\link{as.data.frame}} #' @param optional required argument, but ignored in \code{as.data.frame.emmGrid} #' @param check.names passed to \code{\link{data.frame}} #' @return The \code{as.data.frame} method returns a plain data frame, #' equivalent to \code{as.data.frame(summary(.))}. #' @note The \code{as.data.frame} method is intended primarily to allow for #' \code{emmGrid} objects to be coerced to a data frame as needed internally. #' However, we recommend \emph{against} users #' routinely using \code{as.data.frame}; instead, use \code{summary}, #' \code{confint}, or \code{test}, which already return a special #' \code{data.frame} with added annotations. Those annotations display #' important information such as adjustment methods and confidence levels. 
#'   If you need to see more digits, use \code{print(summary(object), digits = ...)};
#'   and if you \emph{always} want to see more digits, use
#'   \code{emm_options(opt.digits = FALSE)}.
#' @method as.data.frame emmGrid
#' @export
#' @examples
#' # Show estimates to more digits
#' print(test(con), digits = 7)
as.data.frame.emmGrid = function(x, row.names = NULL, optional,
                                 check.names = TRUE, ...) {
    as.data.frame(summary(x, ...),
                  row.names = row.names, check.names = check.names)
}

#' @rdname summary.emmGrid
#' @order 6
#' @method [ summary_emm
#' @param as.df Logical value. With \code{x[..., as.df = TRUE]}, the result is
#'   coerced to an ordinary \code{\link{data.frame}}; otherwise, it is left as a
#'   \code{summary_emm} object.
#' @export
"[.summary_emm" = function(x, ..., as.df = TRUE) {
    if (as.df)
        as.data.frame(x)[...]
    else
        base::`[.data.frame`(x, ...)
}

# Computes the quadratic form y'Xy after subsetting for the nonzero elements of y
.qf.non0 = function(X, y) {
    ii = (zapsmall(y) != 0)
    if (any(ii))
        sum(y[ii] * (X[ii, ii, drop = FALSE] %*% y[ii]))
    else 0
}

# Workhorse for getting estimates etc.
.est.se.df = function(object, do.se=TRUE, tol = get_emm_option("estble.tol"), ...) {
    if (nrow(object@grid) == 0) {
        result = data.frame(NA, NA, NA)
        names(result) = c(object@misc$estName, "SE", "df")
        return(result[-1, ])
    }
    misc = object@misc
    use.elts = .reconcile.elts(object)
    if (!is.null(hook <- misc$estHook)) {
        if (is.character(hook))
            hook = get(hook)
        result = hook(object, do.se=do.se, tol=tol, ...)
    }
    else {
        active = which(!is.na(object@bhat))
        bhat = object@bhat[active]
        result = t(apply(object@linfct[use.elts, , drop = FALSE], 1, function(x) {
            if (!any(is.na(x)) && estimability::is.estble(x, object@nbasis, tol)) {
                x = x[active]
                est = sum(bhat * x)
                if(do.se) {
                    se = sqrt(.qf.non0(object@V, x))
                    df = object@dffun(x, object@dfargs)
                    ### Brute force:
                    if (is.na(df)) df = Inf   ### Added
                }
                else # if these unasked-for results are used, we're bound to get an error!
                    se = df = 0
                c(est, se, df)
            }
            else c(NA,NA,NA)
        }))
        if (!is.null(object@grid$.offset.))
            result[, 1] = result[, 1] + object@grid$.offset.[use.elts]
    }
    result[1] = as.numeric(result[1]) # silly bit of code to avoid getting a data.frame of logicals if all are NA
    result = as.data.frame(result)
    names(result) = c(misc$estName, "SE", "df")
    if (!is.null(misc$.predFlag)) {
        if (is.null(misc$sigma))
            stop("No 'sigma' is available for obtaining Prediction intervals.", call.
= FALSE) result$SE = sqrt(result$SE^2 + misc$sigma^2) } if (!is.null(misc$tran)) { attr(result, "link") = .get.link(misc) if(is.character(misc$tran) && (misc$tran == "none")) attr(result, "link") = NULL } result } # workhorse for nailing down the link function .get.link = function(misc) { link = if(is.character(misc$tran)) .make.link(misc$tran) else if (is.list(misc$tran)) misc$tran else NULL if (is.list(link)) { # See if multiple of link is requested, or variable is offset if (!is.null(misc$tran.mult) || !is.null(misc$tran.offset)) { name = link$name mult = link$mult = ifelse(is.null(misc$tran.mult), 1, misc$tran.mult) off = link$offset = ifelse(is.null(misc$tran.offset), 0, misc$tran.offset) link = with(link, list( linkinv = function(eta) linkinv(eta / mult) - offset, mu.eta = function(eta) mu.eta(eta / mult) / mult)) if(mult != 1) name = paste0(round(mult, 3), "*", name) if(off != 0) name = paste0(name, "(mu + ", round(off, 3), ")") link$name = name } } if (!is.null(link) && is.null(link$name)) link$name = "linear-predictor" link } # patch-in alternative back-transform stuff for bias adjustment # Currently, we just use a 2nd-order approx for everybody: # E(h(nu + E)) ~= h(nu) + 0.5*h"(nu)*var(E) .make.bias.adj.link = function(link, sigma) { if (is.null(sigma)) stop("Must specify 'sigma' to obtain bias-adjusted back transformations", call. = FALSE) link$inv = link$linkinv link$der = link$mu.eta link$sigma22 = sigma^2 / 2 link$der2 = function(eta) with(link, 1000 * (der(eta + .0005) - der(eta - .0005))) link$linkinv = function(eta) with(link, inv(eta) + sigma22 * der2(eta)) link$mu.eta = function(eta) with(link, der(eta) + 1000 * sigma22 * (der2(eta + .0005) - der2(eta - .0005))) link } ####!!!!! TODO: Re-think whether we are handling Scheffe adjustments correctly ####!!!!! if/when we shift around 'by' specs, etc. 
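# Illustrative sketch, for exposition only (not used by the package): for a
# log transformation the exact bias adjustment is known, so we can check the
# second-order formula h(eta) + 0.5 * h''(eta) * sigma^2 implemented above.
if (FALSE) {   # not run
    h     <- exp    # inverse of the log transformation
    eta   <- 1.5    # a value on the linear-predictor scale
    sigma <- 0.4    # error SD
    d2 <- (h(eta + 5e-4) - 2 * h(eta) + h(eta - 5e-4)) / 5e-4^2  # numerical h''
    h(eta) + 0.5 * d2 * sigma^2    # second-order adjusted back-transform
    h(eta) * exp(sigma^2 / 2)      # exact lognormal mean, for comparison
}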
### utility for changing adjustments .chg.adjust = function(old, new, reason) { message("Note: adjust = \"", old, "\" was changed to \"", new, "\"\nbecause \"", old, "\" is ", reason) new } # utility to compute an adjusted p value # tail is -1, 0, 1 for left, two-sided, or right # Note fam.info is c(famsize, ncontr, estTypeIndex) # lsmeans >= 2.14: added corrmat arg, dunnettx & mvt adjustments # emmeans > 1.3.4: we have sch.rank of same length as by.rows # NOTE: corrmat is NULL unless adjust == "mvt" .adj.p.value = function(t, DF, adjust, fam.info, tail, corrmat, by.rows, sch.rank) { fam.size = fam.info[1] n.contr = fam.info[2] et = as.numeric(fam.info[3]) # Force no adjustment when just one test, unless we're using scheffe if ((n.contr == 1) && !(adjust %.pin% "scheffe")) ##(pmatch(adjust, "scheffe", 0) != 1)) adjust = "none" # do a pmatch of the adjust method adj.meths = c("sidak", "tukey", "scheffe", "dunnettx", "mvt", p.adjust.methods) k = pmatch(adjust, adj.meths) if(is.na(k)) stop("Adjust method '", adjust, "' is not recognized or not valid") adjust = adj.meths[k] if ((tail != 0) && (adjust %in% c("tukey", "scheffe", "dunnettx"))) # meth not approp for 1-sided adjust = .chg.adjust(adjust, "sidak", "not appropriate for one-sided inferences") if ((et != 3) && adjust == "tukey") # not pairwise adjust = .chg.adjust(adjust, "sidak", "only appropriate for one set of pairwise comparisons") ragged.by = (is.character(fam.size) || adjust %in% c("mvt", p.adjust.methods)) # flag that we need to do groups separately if (!ragged.by) by.rows = list(seq_along(t)) # not ragged, we can do all as one by group # asymptotic results when df is NA DF[is.na(DF)] = Inf # if estType is "prediction", use #contrasts + 1 as family size # (produces right Scheffe CV; Tukey ones are a bit strange) # deprecated - used to try to keep track of scheffe.adj = ifelse(et == 1, 0, - 1) if (tail == 0) p.unadj = 2*pt(abs(t), DF, lower.tail=FALSE) else p.unadj = pt(t, DF, lower.tail = (tail<0)) # pvals = lapply(by.rows, function(rows) { pval = numeric(length(t)) for(jj in seq_along(by.rows)) { ####(rows in by.rows) { rows = by.rows[[jj]] unadj.p = p.unadj[rows] abst = abs(t[rows]) df = DF[rows] if (ragged.by) { n.contr = length(rows) fam.size = (1 + sqrt(1 + 8*n.contr)) / 2 # tukey family size - e.g., 6 pairs -> family of 4 } if (adjust %in% p.adjust.methods) { if (n.contr == length(unadj.p)) pval[rows] = p.adjust(unadj.p, adjust, n = n.contr) else # only will happen when by.rows is length 1 pval[rows] = as.numeric(apply(matrix(unadj.p, nrow=n.contr), 2, function(pp) p.adjust(pp, adjust, n=sum(!is.na(pp))))) } else pval[rows] = switch(adjust, sidak = 1 - (1 - unadj.p)^n.contr, # NOTE: tukey, scheffe, dunnettx all assumed 2-sided! 
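                       # (for reference: |t| converts to a studentized range
                       #  value via q = sqrt(2)*|t|, and the scheffe branch
                       #  compares t^2/r with an F(r, df) distribution,
                       #  where r = sch.rank)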
tukey = ptukey(sqrt(2)*abst, fam.size, zapsmall(df), lower.tail=FALSE), scheffe = pf(t[rows]^2 / (sch.rank[jj]), sch.rank[jj], df, lower.tail = FALSE), dunnettx = 1 - .pdunnx(abst, n.contr, df), mvt = 1 - .my.pmvt(t[rows], df, corrmat[rows,rows,drop=FALSE], -tail) # tricky - reverse the tail because we're subtracting from 1 ) } chk.adj = match(adjust, c("none", "tukey", "scheffe"), nomatch = 99) if (ragged.by) { nc = max(sapply(by.rows, length)) fs = (1 + sqrt(1 + 8*nc)) / 2 scheffe.dim = "(varies)" } else { nc = n.contr fs = fam.size scheffe.dim = sch.rank[1] } do.msg = (chk.adj > 1) && !((fs < 3) && (chk.adj < 10)) ### WAS (chk.adj > 1) && (nc > 1) && !((fs < 3) && (chk.adj < 10)) if (do.msg) { # xtra = if(chk.adj < 10) paste("a family of", fam.size, "tests") # else paste(n.contr, "tests") xtra = switch(adjust, tukey = paste("for comparing a family of", fam.size, "estimates"), scheffe = paste("with rank", scheffe.dim), paste("for", n.contr, "tests") ) mesg = paste("P value adjustment:", adjust, "method", xtra) } else mesg = NULL list(pval = pval, mesg = mesg, adjust = adjust) } # Code needed for an adjusted critical value # returns a list similar to .adj.p.value # lsmeans >= 2.14: Added tail & corrmat args, dunnettx & mvt adjustments # emmeans > 1.3.4: Added sch.rank # NOTE: corrmat is NULL unless adjust == "mvt" .adj.critval = function(level, DF, adjust, fam.info, tail, corrmat, by.rows, sch.rank) { mesg = NULL fam.size = fam.info[1] n.contr = fam.info[2] et = as.numeric(fam.info[3]) ragged.by = (is.character(fam.size)) # flag that we need to do groups separately if (!ragged.by && (n.contr == 1) && !(adjust %.pin% "scheffe")) ##(pmatch(adjust, "scheffe", 0) != 1)) # Force no adjustment when just one interval, unless using Scheffe adjust = "none" adj.meths = c("sidak", "tukey", "scheffe", "dunnettx", "mvt", "bonferroni", "none") k = pmatch(adjust, adj.meths) if(is.na(k)) k = which(adj.meths == "bonferroni") adjust = adj.meths[k] if (!ragged.by && (adjust != "mvt") && (length(unique(DF)) == 1)) by.rows = list(seq_along(DF)) # not ragged, we can do all as one by group if ((tail != 0) && (adjust %in% c("tukey", "scheffe", "dunnettx"))) # meth not approp for 1-sided adjust = .chg.adjust(adjust, "sidak", "not appropriate for one-sided inferences") if ((et != 3) && adjust == "tukey") # not pairwise adjust = .chg.adjust(adjust, "sidak", "only appropriate for one set of pairwise comparisons") # asymptotic results when df is NA DF[is.na(DF)] = Inf #### No longer used scheffe.adj = ifelse(et == 1, 0, - 1) chk.adj = match(adjust, c("none", "tukey", "scheffe"), nomatch = 99) if (ragged.by) { nc = max(sapply(by.rows, length)) fs = (1 + sqrt(1 + 8*nc)) / 2 scheffe.dim = "(varies)" } else { nc = n.contr fs = fam.size scheffe.dim = sch.rank[1] } do.msg = (chk.adj > 1) && ### (nc > 1) && !((fs < 3) && (chk.adj < 10)) if (do.msg) { # xtra = if(chk.adj < 10) paste("a family of", fam.size, "estimates") # else paste(n.contr, "estimates") xtra = switch(adjust, tukey = paste("for comparing a family of", fam.size, "estimates"), scheffe = paste("with rank", scheffe.dim), paste("for", n.contr, "estimates") ) mesg = paste("Conf-level adjustment:", adjust, "method", xtra) } adiv = ifelse(tail == 0, 2, 1) # divisor for alpha where needed ###cvs = lapply(by.rows, function(rows) { cv = numeric(sum(sapply(by.rows, length))) for (jj in seq_along(by.rows)) { ####(rows in by.rows) { rows = by.rows[[jj]] df = DF[rows] if (ragged.by) { n.contr = length(rows) fam.size = (1 + sqrt(1 + 8*n.contr)) / 2 # tukey family size - 
e.g., 6 pairs -> family of 4 } cv[rows] = switch(adjust, none = -qt((1-level)/adiv, df), sidak = -qt((1 - level^(1/n.contr))/adiv, df), bonferroni = -qt((1-level)/n.contr/adiv, df), tukey = qtukey(level, fam.size, df) / sqrt(2), scheffe = sqrt((sch.rank[jj]) * qf(level, sch.rank[jj], df)), dunnettx = .qdunnx(level, n.contr, df), mvt = .my.qmvt(level, df, corrmat[rows, rows, drop = FALSE], tail) ) } list(cv = cv, mesg = mesg, adjust = adjust) } ### My own functions to ease access to mvt functions ### These use one argument at a time and expands each (lower, upper) or p to a k-vector ### Use tailnum = -1, 0, or 1 ### NOTE: corrmat needs "by.rows" attribute to tell which rows ### belong to which submatrix. .my.pmvt = function(x, df, corrmat, tailnum) { lower = switch(tailnum + 2, -Inf, -abs(x), x) upper = switch(tailnum + 2, x, abs(x), Inf) by.rows = attr(corrmat, "by.rows") if (is.null(by.rows)) by.rows = list(seq_len(length(x))) by.sel = numeric(length(x)) for (i in seq_along(by.rows)) by.sel[by.rows[[i]]] = i df = .fix.df(df) apply(cbind(lower, upper, df, by.sel), 1, function(z) { idx = by.rows[[z[4]]] k = length(idx) pval = try(mvtnorm::pmvt(rep(z[1], k), rep(z[2], k), df = as.integer(z[3]), corr = corrmat[idx, idx]), silent = TRUE) if (inherits(pval, "try-error")) NA else pval }) } # Vectorized for df but needs p to be scalar .my.qmvt = function(p, df, corrmat, tailnum) { tail = c("lower.tail", "both.tails", "lower.tail")[tailnum + 2] df = .fix.df(df) by.rows = attr(corrmat, "by.rows") if (is.null(by.rows)) by.rows = list(seq_len(length(df))) by.sel = numeric(length(df)) for (i in seq_along(by.rows)) by.sel[by.rows[[i]]] = i # If df all equal, compute just once for each by group eq.df = (diff(range(df)) == 0) i1 = if (eq.df) sapply(by.rows, function(r) r[1]) else seq_along(df) result = apply(cbind(p, df[i1], by.sel[i1]), 1, function(z) { idx = by.rows[[z[3]]] cv = try(mvtnorm::qmvt(z[1], tail = tail, df = as.integer(z[2]), corr = corrmat[idx, idx])$quantile, silent = TRUE) if (inherits(cv, "try-error")) NA else cv }) if (eq.df) { res = result result = numeric(length(df)) for(i in seq_along(by.rows)) result[by.rows[[i]]] = res[i] } result } # utility to get appropriate integer df .fix.df = function(df) { sapply(df, function(d) { if (d > 0) d = max(1, d) if (is.infinite(d) || (d > 9999)) d = 0 floor(d + .25) # tends to round down }) } ### My approximate dunnett distribution ### - a mix of the Tukey cdf and Sidak-corrected t .pdunnx = function(x, k, df, twt = (k - 1)/k) { tukey = ptukey(sqrt(2)*x, (1 + sqrt(1 + 8*k))/2, df) sidak = (pf(x^2, 1, df))^k twt*tukey + (1 - twt)*sidak } # Uses linear interpolation to get quantile .qdunnx = function(p, k, df, ...) { if (k < 1.005) return(qt(1 - .5*(1 - p), df)) xtuk = qtukey(p, (1 + sqrt(1 + 8*k))/2, df) / sqrt(2) xsid = sqrt(qf(p^(1/k), 1, df)) fcn = function(x, d) .pdunnx(x, k, d, ...) 
- p apply(cbind(xtuk, xsid, df), 1, function(r) { if (abs(diff(r[1:2])) < .0005) return (r[1]) x = try(uniroot(fcn, r[1:2], tol = .0005, d = r[3]), silent = TRUE) if (inherits(x, "try-error")) { warning("Root-finding failed; using qtukey approximation for Dunnett quantile") return(xtuk) } else x$root }) } ### Support for different prediction types ### # Valid values for type arg or predict.type option # "link", "lp", "linear" are all legal but equivalent # "mu" and "response" are usually equivalent -- but in a GLM with a response transformation, # "mu" (or "unlink") would back-transform the link only, "response" would do both .valid.types = c("link","lp","linear", "response", "mu", "unlink") # get "predict.type" option from misc, and make sure it's legal .get.predict.type = function(misc) { type = misc$predict.type if (is.null(type)) .valid.types[1] else .validate.type(type) } # check a "type" arg to make it legal # NOTE: if not matched, returns "link", i.e., no back-transformation will be done .validate.type = function (type) { type = .valid.types[pmatch(type, .valid.types, 1)] if (length(type) > 1) { type = type[1] warning("You specified more than one prediction type. Only type = \"", type, "\" was used") } type } # left-or right-justify column labels for m depending on "l" or "R" in just .just.labs = function(m, just) { nm = dimnames(m)[[2]] for (j in seq_len(length(nm))) { if(just[nm[j]] == "L") nm[j] = format(nm[j], width = nchar(m[1,j]), just="left") } dimnames(m) = list(rep("", nrow(m)), nm) m } # Format a data.frame produced by summary.emmGrid #' @method print summary_emm #' @export print.summary_emm = function(x, ..., digits=NULL, quote=FALSE, right=TRUE, export = FALSE) { test.stat.names = c("t.ratio", "z.ratio", "F.ratio", "T.square") # format these w 3 dec places x.save = x if(export) x.save = list() for(i in which(sapply(x, is.matrix))) x[[i]] = NULL # hide matrices for (i in seq_along(names(x))) # zapsmall the numeric columns if (is.numeric(x[[i]])) x[[i]] = zapsmall(x[[i]]) just = sapply(x, function(col) if(is.numeric(col)) "R" else "L") ### was later if (!is.null(x$df)) x$df = round(x$df, 2) for (nm in test.stat.names) if(!is.null(x[[nm]])) x[[nm]] = format(round(x[[nm]], 3), nsmall = 3, sci = FALSE) if (!is.null(x$p.value)) { fp = x$p.value = format(round(x$p.value, 4), nsmall = 4, sci = FALSE) x$p.value[fp=="0.0000"] = "<.0001" } estn = attr(x, "estName") ### moved up just = sapply(x, function(col) if(is.numeric(col)) "R" else "L") est = x[[estn]] if (get_emm_option("opt.digits") && is.null(digits)) { if (!is.null(x[["SE"]])) tmp = est + x[["SE"]] * cbind(rep(-2, nrow(x)), 0, 2) else if (!is.null(x[["lower.HPD"]])) tmp = x[, c("lower.HPD", estn, "upper.HPD"), drop = FALSE] else tmp = NULL if (!is.null(tmp)) digits = max(apply(tmp, 1, .opt.dig)) } if (any(is.na(est))) { x[[estn]] = format(est, digits=digits) x[[estn]][is.na(est)] = "nonEst" } xc = as.matrix(format.data.frame(x, digits=digits, na.encode=FALSE)) m = apply(rbind(just, names(x), xc), 2, function(x) { w = max(sapply(x, nchar)) if (x[1] == "R") format(x[-seq_len(2)], width = w, justify="right") else format(x[-seq_len(2)], width = w, justify="left") }) if(!is.matrix(m)) m = t(as.matrix(m)) by.vars = attr(x, "by.vars") if (is.null(by.vars)) { m = .just.labs(m, just) if (export) x.save$summary = m else { print(m, quote=FALSE, right=TRUE) cat("\n") } } else { # separate listing for each by variable m = .just.labs(m[, setdiff(names(x), by.vars), drop = FALSE], just) if(export) x.save$summary = list() pargs = 
unname(as.list(x[,by.vars, drop=FALSE])) pargs$sep = ", " lbls = do.call(paste, pargs) for (lb in unique(lbls)) { rows = which(lbls==lb) levs = paste(by.vars, "=", xc[rows[1], by.vars]) levs = paste(levs, collapse=", ") if(export) x.save$summary[[levs]] = m[rows, , drop = FALSE] else { cat(paste(levs, ":\n", sep="")) print(m[rows, , drop=FALSE], ..., quote=quote, right=right) cat("\n") } } } msg = unique(attr(x, "mesg")) if (!is.null(msg) && !export) for (j in seq_len(length(msg))) cat(paste(msg[j], "\n")) else (x.save$annotations = msg) invisible(x.save) } #' @method update summary_emm #' @export #' @rdname update.emmGrid #' @order 9 #' @param by.vars,mesg Attributes that can be altered in \code{update.summary_emm} #' @section Method for \code{summary_emm} objects: #' This method exists so that we can change the way a summary is displayed, #' by changing the by variables or the annotations. #' #' @examples #' ### Compactify results with a by variable #' update(joint_tests(pigs.rg, by = "source"), by = NULL) update.summary_emm = function(object, by.vars, mesg, ...) { args = match.call()[-1] args$object = NULL for (nm in names(args)) attr(object, nm) = args[[nm]] object } # determine optimum digits to display based on a conf or cred interval # (but always at least 3) .opt.dig = function(x) { z = range(x) / max(abs(x)) dz = zapsmall(c(diff(z), z), digits = 8)[1] zz = round(1.51 - log(dz, 10)) # approx 1 - log(diff(z/3)) zz[is.infinite(zz)] = 3 # we get z = Inf when SE is 0 max(zz, 3, na.rm = TRUE) } # Utility -- When misc$display present, reconcile which elements to use. # Needed if we messed with previous nesting .reconcile.elts = function(object) { display = object@misc$display nrows = nrow(object@grid) use.elts = rep(TRUE, nrows) if (!is.null(display) && (length(display) == nrows)) use.elts = display use.elts } emmeans/R/0nly-internal.R0000644000176200001440000004212114156434413014724 0ustar liggesusers############################################################################## # Copyright (c) 2012-2019 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## # Functions used only internally and of fairly general use # More of these with specific uses only within a certain context remain there. ## %in%-style operator with partial matching ## e.g., ("bonf" %.pin% p.adjust.methods) is TRUE "%.pin%" = function (x, table) pmatch(x, table, nomatch = 0L) > 0L ## Alternative to all.vars, but keeps vars like foo$x and foo[[1]] as-is ## Passes ... to all.vars #' @export .all.vars = function(expr, retain = c("\\$", "\\[\\[", "\\]\\]", "'", '"'), ...) 
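# (e.g., .all.vars(y ~ df$x + log(z)) gives c("y", "df$x", "z"),
#  where base all.vars() would give c("y", "df", "x", "z"))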
{ if (is.null(expr) || length(expr) == 0) return(character(0)) if (!inherits(expr, "formula")) { expr = try(eval(expr), silent = TRUE) if(inherits(expr, "try-error")) { return(character(0)) } } repl = paste(".Av", seq_along(retain), ".", sep = "") for (i in seq_along(retain)) expr = gsub(retain[i], repl[i], expr) subs = switch(length(expr), 1, c(1,2), c(2,1,3)) vars = all.vars(as.formula(paste(expr[subs], collapse = "")), ...) retain = gsub("\\\\", "", retain) for (i in seq_along(retain)) vars = gsub(repl[i], retain[i], vars) if(length(vars) == 0) vars = "1" # no vars ---> intercept vars } ### returns TRUE iff there is one or more function call in a formula .has.fcns = function(form) { fcns = setdiff(.all.vars(form, functions = TRUE), c("~", "+", "-", "*", "/", ":", "(", "|", .all.vars(form))) length(fcns) > 0 } ### parse a formula of the form lhs ~ rhs | by into a list ### of variable names in each part ### Returns character(0) for any missing pieces .parse.by.formula = function(form) { allv = .all.vars(form) ridx = ifelse(length(form) == 2, 2, 3) allrhs = as.vector(form, "character")[ridx] allrhs = gsub("\\|", "+ .by. +", allrhs) # '|' --> '.by.' allrhs = .all.vars(stats::reformulate(allrhs)) bidx = grep(".by.", allrhs, fixed = TRUE) if (length(bidx) == 0) { # no '|' in formula by = character(0) rhs = allrhs } else { rhs = allrhs[seq_len(bidx[1] - 1)] by = setdiff(allrhs, c(rhs, ".by.")) } lhs = setdiff(allv, allrhs) list(lhs = lhs, rhs = rhs, by = by) } # Utility to pick out the args that can be passed to a function .args.for.fcn = function(fcn, args) { oknames = names(as.list(args(fcn))) mat = pmatch(names(args), oknames) args = args[!is.na(mat)] mat = mat[!is.na(mat)] names(args) = oknames[mat] args } # Create a list and give it class class.name .cls.list <- function(class.name, ...) { result <- list(...) class(result) <- c(class.name, "list") result } ### Not-so-damn-smart replacement of diag() that will ### not be so quick to assume I want an identity matrix ### returns matrix(x) when x is a scalar #' @export .diag = function(x, nrow, ncol) { if(is.matrix(x)) diag(x) else if((length(x) == 1) && missing(nrow) && missing(ncol)) matrix(x) else diag(x, nrow, ncol) } # Utility that returns TRUE if getOption("emmeans")[[opt]] is TRUE .emmGrid.is.true = function(opt) { x = get_emm_option(opt) if (is.logical(x)) x else FALSE } # return list of row indexes in tbl for each combination of by # tbl should be a data.frame .find.by.rows = function(tbl, by) { if (is.null(by)) return(list(seq_len(nrow(tbl)))) if (any(is.na(match(by, names(tbl))))) stop("'by' variables are not all in the grid") bylevs = tbl[ , by, drop = FALSE] by.id = do.call("paste", unname(bylevs)) uids = unique(by.id) result = lapply(uids, function(id) which(by.id == id)) names(result) = uids result } # calculate the offset for the given grid #' @export .get.offset = function(terms, grid) { off.idx = attr(terms, "offset") offset = rep(0, nrow(grid)) tvars = attr(terms, "variables") for (i in off.idx) offset = offset + eval(tvars[[i+1]], grid) offset } # combine variables in several `terms` objects #' @export .combine.terms = function(...) { trms = list(...) 
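    # pool the variable names from all the supplied terms objects, then
    # rebuild a single terms object from a formula containing their union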
vars = unlist(lapply(trms, .all.vars)) terms(.reformulate(vars, env = environment(trms[[1]]))) } ###################################################################### ### Contributed by Jonathon Love, https://github.com/jonathon-love ### ### and adapted by RVL to exclude terms like df$trt or df[["trt"]] ### ###################################################################### # reformulate for us internally in emmeans # same as stats::reformulate, except it surrounds term labels with backsticks # # RVL note: I renamed it .reformulate to avoid certain issues. # For example I need reformulate() sometimes to strip off function calls # and this .reformulate works quite differently. # .reformulate <- function (termlabels, response = NULL, intercept = TRUE, env = parent.frame()) { if (!is.character(termlabels) || !length(termlabels)) stop("'termlabels' must be a character vector of length at least one") has.resp = !is.null(response) termlabels = sapply(trimws(termlabels), function(x) if (length(grep("\\$|\\[\\[", x)) > 0) x else paste0("`", x, "`")) termtext = paste(if (has.resp) "response", "~", paste(termlabels, collapse = "+"), collapse = "") # prev version: paste0("`", trimws(termlabels), "`", collapse = "+"), collapse = "") if (!intercept) termtext = paste(termtext, "- 1") rval = eval(parse(text = termtext, keep.source = FALSE)[[1L]]) if (has.resp) rval[[2L]] = if (is.character(response)) as.symbol(response) else response environment(rval) = env rval } ### Find variable names of the form df$x or df[["x"]] # Returns indexes. In addition, return value has attribute # "details" matrix with 1st row being dataset names, second row is variable names .find.comp.names = function(vars) { comp = grep("\\$|\\[\\[", vars) # untick vars containing "$" if (length(comp) > 0) { attr(comp, "details") = gsub("\"", "", sapply(strsplit(vars[comp], "\\$|\\[\\[|\\]\\]"), function(.) .[1:2])) } comp } ### Find `arg` in `...`. If pmatched, return its value, else NULL ### If arg is a vector, runs .match.dots.list .match.dots = function(arg, ..., lst) { if(missing(lst)) lst = list(...) if (length(arg) > 1) return (.match.dots.list(arg, lst = lst)) m = pmatch(names(lst), arg) idx = which(!is.na(m)) if(length(idx) == 1) lst[[idx]] else NULL } # like .match.dots, but returns a list of all matched args .match.dots.list = function(args, ..., lst) { if(missing(lst)) lst = list(...) rtn = list() for (a in args) rtn[[a]] = .match.dots(a, lst = lst) rtn } # return a list from ..., omitting any that pmatch omit .zap.args = function(..., omit) { args = list(...) args[!is.na(pmatch(names(args), omit))] = NULL args } # return updated object with option list AND dot list # optionally may exclude any opts in 'exclude' .update.options = function(object, options, ..., exclude) { if (!is.list(options)) options = as.list(options) dot.opts = .match.dots.list(.valid.misc, ...) if (!missing(exclude)) for (nm in exclude) dot.opts[[nm]] = NULL # entries in both lists are overridden by those in ... for (nm in names(dot.opts)) options[[nm]] = dot.opts[[nm]] options[["object"]] = object do.call(update.emmGrid, options) } # my own model.frame function. Intercepts compound names # and fixes up the data component accordingly. We do this # by creating data.frames within data having required variables of simple names model.frame = function(formula, data, ...) 
{ if (is.null(data)) return (stats::model.frame(formula, ...)) idx = .find.comp.names(names(data)) if (length(idx) > 0) { nm = names(data)[idx] others = names(data[-idx]) details = attr(idx, "details") num = suppressWarnings(as.numeric(details[2, ])) data = as.list(data) for (dfnm in unique(details[1, ])) { w = which(details[1, ] == dfnm) data[[dfnm]] = as.data.frame(data[nm[w]]) names(data[[dfnm]]) = details[2, w] data[[dfnm]] = .reorder.cols(data[[dfnm]], num[w]) } data[nm] = NULL # toss out stuff we don't need # save subst table in environ if (get_emm_option("simplify.names")) { details[1, ] = nm # top row is now fancy names, bottom is plain names all.nms = c(details[2, ], others) dup.cnt = sapply(details[2, ], function(.) sum(. == all.nms)) dup.cnt[!is.na(num)] = 3 # don't simplify numeric ones details = details[, dup.cnt == 1, drop = FALSE] if (ncol(details) > 0) environment(formula)$.simplify.names. = details } } stats::model.frame(formula, data = data, ...) } # Utility to simplify names. each elt of top row of tbl is changed to bottom row .simplify.names = function(nms, tbl) { for (j in seq_along(tbl[1,])) nms[nms == tbl[1, j]] = tbl[2, j] nms } # utility to make all names in a summary syntactically valid .validate.names = function(object) { for (a in c("names", "pri.vars", "by.vars")) if (!is.null(att <- attr(object, a))) attr(object, a) = make.names(att) object } # reorder columns of data frame to match numeric positions in num (if any) .reorder.cols = function(data, num) { if (all(is.na(num))) return(data) m = max(num[!is.na(num)]) nm = names(data) if (m > length(nm)) { k = length(nm) data = cbind(data, matrix(NA, nrow = nrow(data), ncol = m - k)) nm = names(data) = c(nm, paste0(".xtra.", seq_len(m - k), ".")) } for (i in num[!is.na(num)]) { j = which(nm == as.character(i)) if (i != j) { tmp = nm[i] nm[i] = nm[j] nm[j] = tmp } } data[, nm, drop = FALSE] } # format sigma for use in messages .fmt.sigma = function(sigma) { if (length(sigma) == 1) round(sigma, 4 - floor(log10(sigma))) else "(various values)" } # format a transformation for messages .fmt.tran = function(misc) { tran = misc$tran if (is.list(tran)) tran = ifelse(is.null(tran$name), "custom", tran$name) if (!is.null(mul <- misc$tran.mult)) tran = paste0(mul, "*", tran) if(!is.null(off <- misc$tran.offset)) tran = paste0(tran, "(mu + ", off, ")") tran } # My own utility for requiring a namespace and handling case where it is not available # pkg package name # ... passed to fail # fail function to call if namespace not found # quietly passed to requireNamespace() ## I can't decide definitively if I want to suppress S3 masking messages or not... .requireNS = function(pkg, ..., fail = stop, quietly = TRUE) { ### result = suppressMessages(requireNamespace(pkg, quietly = quietly)) result = requireNamespace(pkg, quietly = TRUE) if (!result) fail(...) result } # of possible use as fail in .requireNS .nothing = function(...) invisible() ### Utilities for converting symm matrices to and from lower-triangle storage mode .get.lt = function(X) { rtn = X[lower.tri(X, diag = TRUE)] attr(rtn, "nrow") = nrow(X) rtn } .lt2mat = function(lt) { if (is.null(n <- attr(lt, "nrow"))) n = (sqrt(8 * length(lt) + 1) - 1) / 2 X = matrix(NA, nrow = n, ncol = n) lti = which(lower.tri(X, diag = TRUE)) X[lti] = lt X = t(X) X[lti] = lt X } ### submodel support... # Compact a model matrix # This returns just the R part of X's QR decomposition. 
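# (Informally: if X = QR with Q having orthonormal columns, then
#  crossprod(X) == crossprod(R), so the p x p matrix R carries the same
#  information as the n x p matrix X for the computations below.)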
# This is sufficient in lieu of the whole model matrix # Ideally, X is already a qr object in which case weights is assumed already incorporated # assign should be right if input is generated by model.matrix() or is $qr slot of lm #' @export .cmpMM = function(X, weights = rep(1, nrow(X)), assign = attr(X$qr, "assign")) { if (!get_emm_option("enable.submodel")) return(NULL) if(!is.qr(X)) { if(any(is.na(X))) return(NULL) X = try({ X = sweep(X, 1, sqrt(weights), "*") # scale rows by sqrt(weights) qr(X) }, silent = TRUE) if (inherits(X, "try-error")) return(NULL) } R = qr.R(X, complete = FALSE) R[, X$pivot] = R colnames(R)[X$pivot] = colnames(R) attr(R, "assign") = assign R } # Get 'factors' table and simplify it (remove function calls from term labels) .smpFT = function(trms) { tbl = attr(trms, "factors") rn = rownames(tbl) newrn = sapply(rn, function(x) all.vars(as.formula(paste("~", x)))[1]) if (any(newrn != rn)) { rownames(tbl) = newrn colnames(tbl) = apply(tbl, 2, function(x) paste(newrn[x > 0], collapse = ":")) } tbl } # Alias matrix. Goal here is to find indexes # i1 = indices of columns of smaller model # i2 = indices of all other terms # Then for a given linear hypothesis L_R %*% bhat_R (_R subscripts smaller model) # we have L_R %*% bhat_R = L_R %*% (A %*% bhat_F) = (L_R %*% A) %*% bhat_F # (where _F subscripts full model). # Moreover, L_R is just columns i1 of the linfct for the effect of interest. .alias.matrix = function(object, submodel) { X = object@model.info$model.matrix # assumed to have attributes "factors" and "assign" if (is.character(X)) { # model.matrix is a message if (nchar(X) > 0) warning(X, call. = FALSE) return(NULL) } assign = attr(X, "assign") if (is.null(assign)) { ### either missing model matrix or assign attribute warning("submodel information is not available for this object", call. = FALSE) return(NULL) } tbl = attr(X, "factors") if (is.character(submodel)) { type2 = pmatch(submodel[1], "type2", nomatch = 0) # 1 if type2, 0 otherwise submodel = as.formula(paste("~", paste(names(object@levels), collapse="*"))) } else type2 = 0 # now submodel is a formula # create term labels in compatible factor order subtbl = attr(terms(update(object@model.info$terms, submodel)), "factors") com = intersect(rownames(tbl), rownames(subtbl)) if(length(com) == 0) { # No matching factors at all com = 1; subtbl = matrix(0) } sublab = apply(subtbl[com, , drop = FALSE], 2, function(x) paste(com[x > 0], collapse = ":")) usecols = intersect(colnames(tbl), sublab) incl = c(0, which(colnames(tbl) %in% usecols)) if (type2) { incl = max(incl) # just the last one overlap = apply(tbl, 2, function(x) sum(x * tbl[, incl])) zaplist = which(overlap < sum(tbl[, incl])) # don't contain our term i0 = which(assign %in% c(0, zaplist)) } else i0 = integer(0) rcols = which(assign %in% incl) if (length(i0) > 0) { X[, rcols] = qr.resid(qr(X[, i0, drop = FALSE]), X[, rcols, drop = FALSE]) X[, i0] = 0 } A = qr.coef(qr(X[, rcols, drop = FALSE]), X) # alias matrix A[is.na(A)] = 0 ## NA coefs are really ones constrained to zero attr(A, "rcols") = rcols # cols in submodel attr(A, "submodstr") = paste(colnames(tbl)[incl], collapse = " + ") A } emmeans/R/sommer-support.R0000644000176200001440000000642614137062735015257 0ustar liggesusers############################################################################## # Copyright (c) 2012-2019 Russell V. 
Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## # sommer package support recover_data.mmer = function(object, data, ...) { if (is.null(data)) data = object$data fcall = call("mmer", formula = object$call$fixed, data = data) emmeans::recover_data(fcall, delete.response(terms(object$call$fixed)), object$call$na.method.V, ...) } emm_basis.mmer = function(object, trms, xlev, grid, ...) { cf = object$Beta bhat = cf$Estimate m = suppressWarnings(model.frame(trms, grid, na.action = na.pass, xlev = xlev)) # if we can get contrasts from the object, fix next line X = model.matrix(trms, m, contrasts.arg = NULL) V = .my.vcov(object, vcov. = function(., ...) .$VarBeta) if ((k <- length(bhat)) < ncol(X)) { # we have rank deficiencies QR = qr(model.matrix(trms, object$data, contrasts.arg = NULL)) bhat = rep(NA, ncol(X)) bhat[QR$pivot[1:k]] = cf$Estimate nbasis = estimability::nonest.basis(QR) } else nbasis = estimability::all.estble misc = list() # soup-up following if (1) glms allowed or (2) d.f. available dfargs = list(df = object$df.residual) dffun = function(k, dfargs) Inf bas = list(X = X, bhat = bhat, nbasis = nbasis, V = V, dffun = dffun, dfargs = dfargs, misc = misc) # check for multiv resp k = length(levels(cf$Trait)) if (k > 1) { bas$misc$ylevs = list(Trait = levels(cf$Trait)) bas$X = kronecker(diag(rep(1, k)), bas$X) # reorder coefs to go one trait at a time ord = as.integer(matrix(seq_along(bas$bhat), ncol = k, byrow = TRUE)) bas$bhat = bas$bhat[ord] bas$V = bas$V[ord, ord, drop = FALSE] } bas }emmeans/R/nonlin-support.R0000644000176200001440000001374314150204043015233 0ustar liggesusers############################################################################## # Copyright (c) 2012-2021 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## # experimental support for nls, nlme objects recover_data.nls = function(object, ...) { fcall = object$call trms = terms(.reformulate(names(object$dataClasses))) recover_data(fcall, trms, object$na.action, ...) } emm_basis.nls = function(object, trms, xlev, grid, ...) 
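# Hypothetical usage sketch (comments only; this model is illustrative, not
# part of the package):
#   fit <- nls(circumference ~ SSlogis(age, Asym, xmid, scal), data = Orange)
#   emmeans(fit, ~ age, at = list(age = c(400, 1000)))
# The code below evaluates the model's gradient at each grid point and
# propagates the coefficient covariance through it (delta method), so the
# resulting SEs are approximate.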
{ Vbeta = .my.vcov(object, ...) env = object$m$getEnv() for (nm in names(grid)) env[[nm]] = grid[[nm]] pars = object$m$getAllPars() DD = deriv(object$m$formula(), names(pars)) ests = eval(DD, env) bhat = as.numeric(ests) grad = attr(ests, "gradient") V = grad %*% Vbeta %*% t(grad) X = diag(1, nrow(grid)) list(X=X, bhat=bhat, nbasis=all.estble, V=V, dffun=function(k, dfargs) Inf, dfargs=list(), misc=list()) } ### For nlme objects, we can do stuff with the fixed part of the model ### Additional REQUIRED argument is 'param' - parameter name to explore recover_data.nlme = function(object, param, ...) { if(missing(param)) return("'param' argument is required for nlme objects") fcall = object$call if (!is.null(fcall$weights)) fcall$weights = nlme::varWeights(object$modelStruct) fixed = fcall$fixed if (is.call(fixed)) fixed = eval(fixed, envir = parent.frame()) if(!is.list(fixed)) fixed = list(fixed) form = NULL for (x in fixed) if (param %in% all.names(x)) form = x if (is.null(form)) return(paste("Can't find '", param, "' among the fixed parameters", sep = "")) fcall$weights = NULL trms = delete.response(terms(form)) if (length(.all.vars(trms)) == 0) return(paste("No predictors for '", param, "' in fixed model", sep = "")) recover_data(fcall, trms, object$na.action, ...) } emm_basis.nlme = function(object, trms, xlev, grid, param, ...) { idx = object$map$fmap[[param]] V = object$varFix[idx, idx, drop = FALSE] bhat = object$coefficients$fixed[idx] contr = attr(object$plist[[param]]$fixed, "contrasts") m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) X = model.matrix(trms, m, contrasts.arg = contr) dfx = object$fixDF$X[idx] dfx[1] = min(dfx) # I'm assuming 1st one is intercept dffun = function(k, dfargs) { # containment df idx = which(abs(k) > 1e-6) ifelse(length(idx) > 0, min(dfargs$dfx[idx]), NA) } list(X = X, bhat = bhat, nbasis = estimability::all.estble, V = V, dffun = dffun, dfargs = list(dfx = dfx), misc = list(estName = param)) } # Support for gnls - contributed by Fernando Miguez recover_data.gnls = function(object, param, data, ...) { fcall = object$call$params if (is.null(fcall)) return("Models fitted without a 'params' specification are not supported.") if(missing(param)) return("'param' argument is required for gnls objects") plist = object$plist pnames = names(plist) params = eval(object$call$params) if (!is.list(params)) params = list(params) params = unlist(lapply(params, function(pp) { if (is.name(pp[[2]])){ list(pp) } else { ## multiple parameters on left hand side eval(parse(text = paste("list(", paste(paste(all.vars(pp[[2]]), deparse(pp[[3]]), sep = "~"), collapse = ","), ")"))) } }), recursive = FALSE) names(params) = pnames form = params[[param]] trms = delete.response(terms(eval(form, envir = environment(formula(object))))) if(is.null(data)) data = eval(object$call$data, envir = environment(formula(object))) recover_data(fcall, trms, object$na.action, data = data, ...) } emm_basis.gnls = function(object, trms, xlev, grid, param, ...) 
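# Sketch of intended use (comments only; the model is hypothetical): for a
# fit like gnls(y ~ A * exp(-k * x), params = list(A ~ trt, k ~ 1), ...),
# one would call emmeans(fit, ~ trt, param = "A"); 'param' selects the
# corresponding block of coefficients via object$pmap below.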
{ idx = object$pmap[[param]] V = object$varBeta[idx, idx, drop = FALSE] bhat = object$coefficients[idx] contr = attr(object$plist[[param]], "contrasts") m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) X = model.matrix(trms, m, contrasts.arg = contr) dfx = with(attributes(logLik(object)), nobs - df) dffun = function(k, dfargs) dfargs$dfx list(X = X, bhat = bhat, nbasis = estimability::all.estble, V = V, dffun = dffun, dfargs = list(dfx = dfx), misc = list(estName = param)) } emmeans/R/emm-list.R0000644000176200001440000001403014140532461013750 0ustar liggesusers############################################################################## # Copyright (c) 2012-2017 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## # Methods for emm_list objects # First, here is documentation for the emm_list class #' The \code{emm_list} class #' #' An \code{emm_list} object is simply a list of #' \code{\link[=emmGrid-class]{emmGrid}} objects. Such a list is returned, #' for example, by \code{\link{emmeans}} with a two-sided formula or a list as its #' \code{specs} argument. #' #' Methods for \code{emm_list} objects include \code{summary}, #' \code{coef}, \code{confint}, \code{contrast}, \code{pairs}, \code{plot}, #' \code{print}, and #' \code{test}. These are all the same as those methods for \code{emmGrid} #' objects, with an additional \code{which} argument (integer) to specify which #' members of the list to use. The default is \code{which = seq_along(object)}; #' i.e., the method is applied to every member of the \code{emm_list} object. #' The exception is \code{plot}, where only the \code{which[1]}th element is #' plotted. #' #' As an example, #' to summarize a single member -- say the second one -- of an \code{emm_list}, #' one may use \code{summary(object, which = 2)}, but it is probably preferable #' to directly summarize it using \code{summary(object[[2]])}. #' #' @note No \code{export} option is provided for printing an \code{emm_list} #' (see \code{\link{print.emmGrid}}). If you wish to export these objects, you #' must do so separately for each element in the list. #' #' #' @rdname emm_list-object #' @name emm_list NULL #' @export #' @method str emm_list str.emm_list = function(object, ...) { for(nm in names(object)) { cat(paste("$", nm, "\n", sep="")) str(object[[nm]]) cat("\n") } } # summary.emm_list et al. take an argument 'which' that allows doing a subset # Each returns a regular 'list' #' @export #' @method summary emm_list summary.emm_list <- function(object, ..., which = seq_along(object)) lapply(object[which], function(x) { if (inherits(x, "summary.emmGrid")) x else summary.emmGrid(x, ...) }) #' @export #' @method print emm_list print.emm_list = function(x, ...)
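# Typical usage sketch (comments only; 'warp.lm' is a hypothetical model):
#   emm <- emmeans(warp.lm, list(pairwise ~ tension))  # returns an emm_list
#   print(emm)  # prints the summary of every element, via the method below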
{ print(summary(x, ...)) } #' @export #' @method contrast emm_list contrast.emm_list = function(object, ... , which = seq_along(object)) { lapply(object[which], contrast, ...) } #' @export #' @method pairs emm_list pairs.emm_list = function(x, ..., which = seq_along(x)) { lapply(x[which], pairs, ...) } #' @export #' @method test emm_list test.emm_list = function(object, ..., which = seq_along(object)) { lapply(object[which], test, ...) } #' @export #' @method confint emm_list confint.emm_list = function(object, ..., which = seq_along(object)) { lapply(object[which], confint, ...) } #' @export #' @method coef emm_list coef.emm_list = function(object, ..., which = seq_along(object)) { lapply(object[which], coef, ...) } # plot just plots one #' @export #' @method plot emm_list plot.emm_list = function(x, ..., which = 1) { plot.emmGrid(x[[which[1]]], ...) } #' @rdname rbind.emmGrid #' @order 3 #' @param which Integer vector of subset of elements to use; if missing, all are combined #' @return The \code{rbind} method for \code{emm_list} objects simply combines #' the \code{emmGrid} objects comprising the first element of \code{...}. #' @export #' @method rbind emm_list #' @examples #' #' ### Working with 'emm_list' objects #' mod <- lm(conc ~ source + factor(percent), data = pigs) #' all <- emmeans(mod, list(src = pairwise ~ source, pct = consec ~ percent)) #' rbind(all, which = c(2, 4), adjust = "mvt") rbind.emm_list = function(..., which, adjust = "bonferroni") { elobj = list(...)[[1]] if(!missing(which)) elobj = elobj[which] class(elobj) = c("emm_list", "list") update(do.call(rbind.emmGrid, elobj), adjust = adjust) } #' @export #' @method as.data.frame emm_list as.data.frame.emm_list = function(x, ...) { data.frame(rbind(x, ..., check.names = FALSE)) } #' @export #' @method as.list emm_list as.list.emm_list = function(x, ...) { rtn = list() for (nm in names(x)) rtn[[nm]] = as.list.emmGrid(x[[nm]]) attr(rtn, "emm_list") = TRUE rtn } #' @export #' @return \code{as.emm_list} returns an object of class \code{emm_list}. #' #' @rdname as.emmGrid #' @order 3 as.emm_list = function(object, ...) { if (is.null(attr(object, "emm_list"))) as.emmGrid(object, ...) else lapply(object, as.emmGrid, ...) } emmeans/R/multinom-support.R0000644000176200001440000001241114137062735015610 0ustar liggesusers############################################################################## # Copyright (c) 2012-2017 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . 
# ############################################################################## ### Multinomial modeling ### Example for testing ### From: http://www.ats.ucla.edu/stat/r/dae/mlogit.htm # library(foreign) # ml <- read.dta("http://www.ats.ucla.edu/stat/data/hsbdemo.dta") # library(nnet) # ml$prog2 <- relevel(ml$prog, ref = "academic") # test <- multinom(prog2 ~ ses + write, data = ml) # # same as recover_data.lm recover_data.multinom = function(object, ...) { fcall = object$call recover_data(fcall, delete.response(terms(object)), object$na.action, ...) } emm_basis.multinom = function(object, trms, xlev, grid, mode = c("prob", "latent"), ...) { mode = match.arg(mode) bhat = t(coef(object)) V = .my.vcov(object, ...) # NOTE: entries in vcov(object) come out in same order as # in as.numeric(bhat), even though latter has been transposed k = ifelse(is.matrix(coef(object)), ncol(bhat), 1) m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) X = model.matrix(trms, m, contrasts.arg = object$contrasts) # recenter for latent predictions pat = (rbind(0, diag(k + 1, k)) - 1) / (k + 1) X = kronecker(pat, X) nbasis = estimability::all.estble nbasis = kronecker(rep(1,k), nbasis) misc = list(tran = "log", inv.lbl = "e^y") dfargs = list(df = object$edf) dffun = function(k, dfargs) dfargs$df ylevs = list(class = object$lev) if (is.null(ylevs)) ylevs = list(class = seq_len(k)) names(ylevs) = as.character.default(eval(object$call$formula, environment(trms))[[2]]) misc$ylevs = ylevs if (mode == "prob") misc$postGridHook = .multinom.postGrid list(X = X, bhat = as.numeric(bhat), nbasis = nbasis, V = V, dffun = dffun, dfargs = dfargs, misc = misc) } # post-processing of ref_grid for "prob" mode # also allows simulated outcomes .multinom.postGrid = function(object, N.sim, ...) { linfct = object@linfct misc = object@misc # grid will have multresp as slowest-varying factor... idx = matrix(seq_along(linfct[, 1]), ncol = length(object@levels[[object@roles$multresp]])) bhat = as.numeric(idx) # right length, contents will be replaced if(sim <- !missing(N.sim)) { message("Simulating a sample of size ", N.sim) bsamp = mvtnorm::rmvnorm(N.sim, object@bhat, object@V) postb = matrix(0, nrow = N.sim, ncol = length(bhat)) } for (i in 1:nrow(idx)) { rows = idx[i, ] exp.psi = exp(linfct[rows, , drop = FALSE] %*% object@bhat) p = as.numeric(exp.psi / sum(exp.psi)) bhat[rows] = p if (sim) { ex = exp(linfct[rows, , drop = FALSE] %*% t(bsamp)) # p x N px = t(apply(ex, 2, function(x) x / sum(x))) postb[, rows] = px } A = .diag(p) - outer(p, p) # partial derivs linfct[rows, ] = A %*% linfct[rows, ] } misc$postGridHook = misc$tran = misc$inv.lbl = NULL misc$estName = "prob" object@bhat = bhat object@V = linfct %*% tcrossprod(object@V, linfct) object@linfct = diag(1, length(bhat)) object@misc = misc if (sim) object@post.beta = postb object } ### Support for mclogit::mblogit models??? emm_basis.mblogit = function(object, ..., vcov.) { object$coefficients = object$coefmat object$lev = levels(object$model[[1]]) object$edf = Inf # we have to arrange the vcov elements in row-major order if(missing(vcov.)) vcov. = vcov(object) perm = matrix(seq_along(as.numeric(object$coefmat)), ncol = ncol(object$coefmat)) perm = as.numeric(t(perm)) vcov. = vcov.[perm, perm] emm_basis.multinom(object, ..., vcov. = vcov.) } emmeans/R/emm-contr.R0000644000176200001440000004222114137062735014135 0ustar liggesusers############################################################################## # Copyright (c) 2012-2019 Russell V. 
Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## ### functions to implement different families of contrasts ### All return a matrix or data frame whose columns are the desired contrasts coefs ### with appropriate row and column names ### Also they have two attributes: ### "desc" is an expanded description of the family, ### "adjust" is the default multiplicity adjustment (used if adjust="auto" in emmeans) #' Contrast families #' #' Functions with an extension of \code{.emmc} provide for named contrast #' families. One of the standard ones documented here may be used, or the user #' may write such a function. #' #' Each standard contrast family has a default multiple-testing adjustment as #' noted below. These adjustments are often only approximate; for a more #' exacting adjustment, use the interfaces provided to \code{glht} in the #' \pkg{multcomp} package. #' #' \code{pairwise.emmc}, \code{revpairwise.emmc}, and \code{tukey.emmc} generate #' contrasts for all pairwise comparisons among estimated marginal means at the #' levels in levs. The distinction is in which direction they are subtracted. #' For factor levels A, B, C, D, \code{pairwise.emmc} generates the comparisons #' A-B, A-C, A-D, B-C, B-D, and C-D, whereas \code{revpairwise.emmc} generates #' B-A, C-A, C-B, D-A, D-B, and D-C. \code{tukey.emmc} invokes #' \code{pairwise.emmc} or \code{revpairwise.emmc} depending on \code{reverse}. #' The default multiplicity adjustment method is \code{"tukey"}, which is only #' approximate when the standard errors differ. #' #' \code{poly.emmc} generates orthogonal polynomial contrasts, assuming #' equally-spaced factor levels. These are derived from the #' \code{\link[stats]{poly}} function, but an \emph{ad hoc} algorithm is used to #' scale them to integer coefficients that are (usually) the same as in #' published tables of orthogonal polynomial contrasts. The default multiplicity #' adjustment method is \code{"none"}. #' #' \code{trt.vs.ctrl.emmc} and its relatives generate contrasts for comparing #' one level (or the average over specified levels) with each of the other #' levels. The argument \code{ref} should be the index(es) (not the labels) of #' the reference level(s). \code{trt.vs.ctrl1.emmc} is the same as #' \code{trt.vs.ctrl.emmc} with a reference value of 1, and #' \code{trt.vs.ctrlk.emmc} is the same as \code{trt.vs.ctrl} with a reference #' value of \code{length(levs)}. \code{dunnett.emmc} is the same as #' \code{trt.vs.ctrl}. The default multiplicity adjustment method is #' \code{"dunnettx"}, a close approximation to the Dunnett adjustment. #' \emph{Note} in all of these functions, it is illegal to have any overlap #' between the \code{ref} levels and the \code{exclude} levels. If any is found, #' an error is thrown. 
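#' For example, \code{trt.vs.ctrl.emmc(c("A", "B", "C", "D"))} generates the
#' contrasts B - A, C - A, and D - A.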
#' #' \code{consec.emmc} and \code{mean_chg.emmc} are useful for contrasting #' treatments that occur in sequence. For a factor with levels A, B, C, D, #' \code{consec.emmc} generates the comparisons B-A, C-B, and D-C, while #' \code{mean_chg.emmc} generates the contrasts (B+C+D)/3 - A, (C+D)/2 - #' (A+B)/2, and D - (A+B+C)/3. With \code{reverse = TRUE}, these differences go #' in the opposite direction. #' #' \code{eff.emmc} and \code{del.eff.emmc} generate contrasts that compare each #' level with the average over all levels (in \code{eff.emmc}) or over all other #' levels (in \code{del.eff.emmc}). These differ only in how they are scaled. #' For a set of k EMMs, \code{del.eff.emmc} gives weight 1 to one EMM and weight #' -1/(k-1) to the others, while \code{eff.emmc} gives weights (k-1)/k and -1/k #' respectively, as in subtracting the overall EMM from each EMM. The default #' multiplicity adjustment method is \code{"fdr"}. This is a Bonferroni-based #' method and is slightly conservative; see \code{\link[stats]{p.adjust}}. #' #' \code{identity.emmc} simply returns the identity matrix (as a data frame), #' minus any columns specified in \code{exclude}. It is potentially useful in #' cases where a contrast function must be specified, but none is desired. #' #' @rdname emmc-functions #' @aliases emmc-functions #' @param levs Vector of factor levels #' @param exclude integer vector of indices, or character vector of levels to #' exclude from consideration. These levels will receive weight 0 in all #' contrasts. Character levels must exactly match elements of \code{levs}. #' @param include integer or character vector of levels to include (the #' complement of \code{exclude}). An error will result if the user specifies #' both \code{exclude} and \code{include}. #' @param ... Additional arguments, passed to related methods as appropriate #' #' @return A data.frame, each column containing contrast coefficients for levs. #' The "desc" attribute is used to label the results in emmeans, and the #' "adjust" attribute gives the default adjustment method for multiplicity. #' #' @note Caution is needed in cases where the user alters the ordering of #' results (e.g., using the \code{"[...]"} operator), because the #' contrasts generated depend on the order of the levels provided. For #' example, suppose \code{trt.vs.ctrl1} contrasts are applied to two \code{by} #' groups with levels ordered (Ctrl, T1, T2) and (T1, T2, Ctrl) respectively, #' then the contrasts generated will be for (T1 - Ctrl, T2 - Ctrl) in the #' first group and (T2 - T1, Ctrl - T1) in the second group, because the first #' level in each group is used as the reference level. #' #' @examples #' warp.lm <- lm(breaks ~ wool*tension, data = warpbreaks) #' warp.emm <- emmeans(warp.lm, ~ tension | wool) #' contrast(warp.emm, "poly") #' contrast(warp.emm, "trt.vs.ctrl", ref = "M") #' #' # Compare only low and high tensions #' # Note pairs(emm, ...) calls contrast(emm, "pairwise", ...) #' pairs(warp.emm, exclude = 2) #' # (same results using exclude = "M" or include = c("L","H") or include = c(1,3)) #' #' ### Setting up a custom contrast function #' helmert.emmc <- function(levs, ...)
{ #' M <- as.data.frame(contr.helmert(levs)) #' names(M) <- paste(levs[-1],"vs earlier") #' attr(M, "desc") <- "Helmert contrasts" #' M #' } #' contrast(warp.emm, "helmert") #' \dontrun{ #' # See what is used for polynomial contrasts with 6 levels #' emmeans:::poly.emmc(1:6) #' } #' @name contrast-methods pairwise.emmc = function(levs, exclude = integer(0), include, ...) { exclude = .get.excl(levs, exclude, include) k = length(levs) M = data.frame(levs=levs) for (i in setdiff(seq_len(k-1), exclude)) { for (j in setdiff(i + seq_len(k-i), exclude)) { ###for (j in (i+1):k) { con = rep(0,k) con[i] = 1 con[j] = -1 nm = paste(levs[i], levs[j], sep = " - ") M[[nm]] = con } } row.names(M) = levs M = M[-1] attr(M, "desc") = "pairwise differences" attr(M, "adjust") = "tukey" attr(M, "type") = "pairs" attr(M, "famSize") = k - length(exclude) if(length(exclude) > 0) attr(M, "famSize") = length(levs) - length(exclude) M } # all pairwise trt[j] - trt[i], j > i #' @rdname emmc-functions revpairwise.emmc = function(levs, exclude = integer(0), include, ...) { exclude = .get.excl(levs, exclude, include) k = length(levs) M = data.frame(levs=levs) for (i in setdiff(1 + seq_len(k - 1), exclude)) { for (j in setdiff(seq_len(i-1), exclude)) { con = rep(0,k) con[i] = 1 con[j] = -1 nm = paste(levs[i], levs[j], sep = " - ") M[[nm]] = con } } row.names(M) = levs M = M[-1] attr(M, "desc") = "pairwise differences" attr(M, "adjust") = "tukey" attr(M, "type") = "pairs" if(length(exclude) > 0) attr(M, "famSize") = length(levs) - length(exclude) M } # pseudonym #' @rdname emmc-functions #' @param reverse Logical value to determine the direction of comparisons tukey.emmc = function(levs, reverse = FALSE, ...) { if (reverse) revpairwise.emmc(levs, ...) else pairwise.emmc(levs, ...) } # Poly contrasts - scaled w/ integer levels like most tables # ad hoc scaling works for up to 13 levels #' @rdname emmc-functions #' @param max.degree Integer specifying the maximum degree of polynomial contrasts poly.emmc = function(levs, max.degree = min(6, k-1), ...) { nm = c("linear", "quadratic", "cubic", "quartic", paste("degree",5:20)) k = length(levs) M = as.data.frame(poly(seq_len(k), min(20,max.degree))) for (j in seq_len(ncol(M))) { con = M[, j] pos = which(con > .01) con = con / min(con[pos]) z = max(abs(con - round(con))) while (z > .05) { con = con / z z = max(abs(con - round(con))) } M[ ,j] = round(con) } row.names(M) = levs names(M) = nm[seq_len(ncol(M))] attr(M, "desc") = "polynomial contrasts" attr(M, "adjust") = "none" M } # All comparisons with a control; ref = index of control group # New version -- allows more than one control group (ref is a vector) #' @rdname emmc-functions #' @param ref Integer(s) or character(s) specifying which level(s) to use #' as the reference. Character values must exactly match elements of \code{levs}. trt.vs.ctrl.emmc = function(levs, ref = 1, reverse = FALSE, exclude = integer(0), include, ...) { ref = .num.key(levs, ref) exclude = .get.excl(levs, exclude, include) if (length(ref) == 0 || (min(ref) < 1) || (max(ref) > length(levs))) stop("In trt.vs.ctrl.emmc(), 'ref' levels are out of range", call. 
= FALSE) k = length(levs) cnm = ifelse(length(ref)==1, levs[ref], paste("avg(", paste(levs[ref], collapse=","), ")", sep="")) templ = rep(0, length(levs)) templ[ref] = -1 / length(ref) M = data.frame(levs=levs) if (length(intersect(exclude, ref)) > 0) stop("'exclude' set cannot overlap with 'ref'") skip = c(ref, exclude) for (i in seq_len(k)) { if (i %in% skip) next con = templ con[i] = 1 if (reverse) nm = paste(cnm, levs[i], sep = " - ") else nm = paste(levs[i], cnm, sep = " - ") M[[nm]] = con } row.names(M) = levs M = M[-1] if (reverse) M = -M attr(M, "desc") = "differences from control" attr(M, "adjust") = "dunnettx" if(length(exclude) > 0) attr(M, "famSize") = length(levs) - length(exclude) M } # control is 1st level #' @rdname emmc-functions trt.vs.ctrl1.emmc = function(levs, ref = 1, ...) { trt.vs.ctrl.emmc(levs, ref = ref, ...) } # control is last level #' @rdname emmc-functions trt.vs.ctrlk.emmc = function(levs, ref = length(levs), ...) { trt.vs.ctrl.emmc(levs, ref = ref, ...) } # pseudonym for trt.vs.ctrl #' @rdname emmc-functions dunnett.emmc = function(levs, ref = 1, ...) { trt.vs.ctrl.emmc(levs, ref = ref, ...) } # effects contrasts. Each mean versus the average of all #' @rdname emmc-functions eff.emmc = function(levs, exclude = integer(0), include, ...) { exclude = .get.excl(levs, exclude, include) k = length(levs) kk = k - length(exclude) start = rep(-1/kk, k) start[exclude] = 0 M = data.frame(levs=levs) for (i in setdiff(seq_len(k), exclude)) { con = start con[i] = (kk - 1)/kk nm = paste(levs[i], "effect") M[[nm]] = con } row.names(M) = levs M = M[-1] attr(M, "desc") = "differences from grand mean" attr(M, "adjust") = "fdr" if(length(exclude) > 0) attr(M, "famSize") = length(levs) - length(exclude) M } # "deleted" effects contrasts. # Each mean versus the average of all others #' @rdname emmc-functions del.eff.emmc = function(levs, exclude = integer(0), include, ...) { exclude = .get.excl(levs, exclude, include) k = length(levs) - length(exclude) M = as.matrix(eff.emmc(levs, exclude = exclude, ...)) * k / (k-1) M = as.data.frame(M) attr(M, "desc") = "differences from mean of others" attr(M, "adjust") = "fdr" if(length(exclude) > 0) attr(M, "famSize") = length(levs) - length(exclude) M } # Contrasts to compare consecutive levels: # (-1,1,0,0,...), (0,-1,1,0,...), ..., (0,...0,-1,1) #' @rdname emmc-functions consec.emmc = function(levs, reverse = FALSE, exclude = integer(0), include, ...) { exclude = .get.excl(levs, exclude, include) sgn = ifelse(reverse, -1, 1) tmp = rep(0, length(levs)) k = length(levs) - length(exclude) active.rows = setdiff(seq_along(levs), exclude) M = data.frame(levs=levs) nms = levs[active.rows] for (i in seq_len(k-1)) { con = rep(0, k) con[i] = -sgn con[i+1] = sgn tmp[active.rows] = con nm = ifelse(reverse, paste(nms[i], "-", nms[i+1]), paste(nms[i+1], "-", nms[i])) M[[nm]] = tmp } row.names(M) = levs M = M[-1] attr(M, "desc") = "changes between consecutive levels" attr(M, "adjust") = "mvt" if(length(exclude) > 0) attr(M, "famSize") = length(levs) - length(exclude) M } # Mean after minus mean before # e.g., (-1, 1/3,1/3,1/3), (-1/2,-1/2, 1/2,1/2), (-1/3,-1/3,-1/3, 1) #' @rdname emmc-functions mean_chg.emmc = function(levs, reverse = FALSE, exclude = integer(0), include, ...) 
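# (Comments only: with levs = c("A", "B", "C", "D"), the columns produced
#  below are named "A|B", "B|C", and "C|D", marking the split point between
#  the "before" and "after" groups for each contrast.)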
{ exclude = .get.excl(levs, exclude, include) sgn = ifelse(reverse, -1, 1) k = length(levs) - length(exclude) tmp = rep(0, length(levs)) M = data.frame(levs=levs) active.rows = setdiff(seq_along(levs), exclude) nms = levs[active.rows] for (i in seq_len(k-1)) { kmi = k - i con = rep(c(-sgn/i, sgn/kmi), c(i, kmi)) nm = paste(nms[i], nms[i+1], sep="|") tmp[active.rows] = con M[[nm]] = tmp } row.names(M) = levs M = M[-1] attr(M, "desc") = "mean after minus mean before" attr(M, "adjust") = "mvt" if(length(exclude) > 0) attr(M, "famSize") = length(levs) - length(exclude) M } # Non-contrasts -- just pass thru, possibly excluding some levels #' @rdname emmc-functions identity.emmc = function(levs, exclude = integer(0), include, ...) { exclude = .get.excl(levs, exclude, include) k = length(levs) - length(exclude) M = as.data.frame(diag(length(levs))) names(M) = levs if(length(exclude) > 0) M = M[ , -exclude, drop = FALSE] attr(M, "desc") = "Identity" attr(M, "famSize") = k attr(M, "adjust") = "none" M } ### utility to translate character keys to index keys #' @export .num.key = function(levs, key) { if(!is.null(raw <- attr(levs, "raw"))) levs = raw orig.key = key if (is.character(key)) key = match(key, levs) # if (any(is.na(key))) # warning("One or more of: '", paste(orig.key, collapse = "','"), "' not found in '", # paste(levs, collapse = "','"), "'", call. = FALSE) # if (any(key > length(levs)) || any(key < 1)) # stop("Numeric index not in 1 : length(levs)") key[key %in% seq_along(levs)] # I think I'll just silently remove unmatched levels } ### utility to find exclude levels from either exclude or include ### Also returns numeric version #' @export .get.excl = function(levs, exc, inc) { if (!missing(inc)) { if(length(exc) > 0) stop("Cannot specify both 'exclude' and 'include'", call. = FALSE) inc = .num.key(levs, inc) exc = setdiff(seq_along(levs), inc) } .num.key(levs, exc) }emmeans/R/ref-grid.R0000644000176200001440000014043614164702161013734 0ustar liggesusers############################################################################## # Copyright (c) 2012-2020 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## # Reference grid code # Change to cov.reduce specification: can be... # a function: is applied to all covariates # named list of functions: applied to those covariates (else mean is used) # TRUE - same as mean # FALSE - same as function(x) sort(unique(x)) #' Create a reference grid from a fitted model #' #' Using a fitted model object, determine a reference grid for which estimated #' marginal means are defined. The resulting \code{ref_grid} object encapsulates #' all the information needed to calculate EMMs and make inferences on them. 
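#' For example, for a model with a factor \code{machine} and a numeric
#' predictor \code{diameter} (like \code{fiber.lm} in the Examples below), the
#' default grid consists of each machine combined with the mean diameter.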
#' #' To users, the \code{ref_grid} function itself is important because most of #' its arguments are in effect arguments of \code{\link{emmeans}} and related #' functions, in that those functions pass their \code{...} arguments to #' \code{ref_grid}. #' #' The reference grid consists of combinations of independent variables over #' which predictions are made. Estimated marginal means are defined as these #' predictions, or marginal averages thereof. The grid is determined by first #' reconstructing the data used in fitting the model (see #' \code{\link{recover_data}}), or by using the \code{data.frame} provided in #' \code{data}. The default reference grid is determined by the observed levels #' of any factors, the ordered unique values of character-valued predictors, and #' the results of \code{cov.reduce} for numeric predictors. These may be #' overridden using \code{at}. See also the section below on #' recovering/overriding model information. #' #' #' @param object An object produced by a supported model-fitting function, such #' as \code{lm}. Many models are supported. See #' \href{../doc/models.html}{\code{vignette("models", "emmeans")}}. #' @param at Optional named list of levels for the corresponding variables #' @param cov.reduce A function, logical value, or formula; or a named list of #' these. Each covariate \emph{not} specified in \code{cov.keep} or \code{at} #' is reduced according to these specifications. See the section below on #' \dQuote{Using \code{cov.reduce} and \code{cov.keep}}. #' @param cov.keep Character vector: names of covariates that are \emph{not} #' to be reduced; these are treated as factors and used in weighting calculations. #' \code{cov.keep} may also include integer value(s), and if so, the maximum #' of these is used to set a threshold such that any covariate having no more #' than that many unique values is automatically included in \code{cov.keep}. #' @param mult.names Character value: the name(s) to give to the #' pseudo-factor(s) whose levels delineate the elements of a multivariate #' response. If this is provided, it overrides the default name(s) used for #' \code{class(object)} when it has a multivariate response (e.g., the default #' is \code{"rep.meas"} for \code{"mlm"} objects). #' @param mult.levs A named list of levels for the dimensions of a multivariate #' response. If there is more than one element, the combinations of levels are #' used, in \code{\link{expand.grid}} order. The (total) number of levels must #' match the number of dimensions. If \code{mult.names} is specified, this #' argument is ignored. #' @param options If non-\code{NULL}, a named \code{list} of arguments to pass #' to \code{\link{update.emmGrid}}, just after the object is constructed. #' @param data A \code{data.frame} to use to obtain information about the #' predictors (e.g. factor levels). If missing, then #' \code{\link{recover_data}} is used to attempt to reconstruct the data. See #' the note with \code{\link{recover_data}} for an important precaution. #' @param df Numeric value. This is equivalent to specifying \code{options(df = #' df)}. See \code{\link{update.emmGrid}}. #' @param type Character value. If provided, this is saved as the #' \code{"predict.type"} setting. See \code{\link{update.emmGrid}} and the #' section below on prediction types and transformations. #' @param transform Character, logical, or list. If non-missing, the reference #' grid is reconstructed via \code{\link{regrid}} with the given #' \code{transform} argument.
See the section below on prediction types and #' transformations. #' @param nesting If the model has nested fixed effects, this may be specified #' here via a character vector or named \code{list} specifying the nesting #' structure. Specifying \code{nesting} overrides any nesting structure that #' is automatically detected. See the section below on Recovering or Overriding #' Model Information. #' @param offset Numeric scalar value (if a vector, only the first element is #' used). This may be used to add an offset, or override offsets based on the #' model. A common usage would be to specify \code{offset = 0} for a Poisson #' regression model, so that predictions from the reference grid become rates #' relative to the offset that had been specified in the model. #' @param sigma Numeric value to use for subsequent predictions or #' back-transformation bias adjustments. If not specified, we use #' \code{sigma(object)}, if available, and \code{NULL} otherwise. #' @param nuisance,non.nuisance,wt.nuis If \code{nuisance} is a vector of predictor names, #' those predictors are omitted from the reference grid. Instead, the result #' will be as if we had averaged over the levels of those factors, with either #' equal or proportional weights as specified in \code{wt.nuis} (see the #' \code{weights} argument in \code{\link{emmeans}}). The factors in #' \code{nuisance} must not interact with other factors, not even other #' nuisance factors. Specifying nuisance factors can save considerable #' storage and computation time, and help avoid exceeding the maximum #' reference-grid size (\code{get_emm_option("rg.limit")}). #' @param rg.limit Integer limit on the number of reference-grid rows to allow #' (checked before any multivariate responses are included). #' @param ... Optional arguments passed to \code{\link{summary.emmGrid}}, #' \code{\link{emm_basis}}, and #' \code{\link{recover_data}}, such as \code{params}, \code{vcov.} (see #' \bold{Covariance matrix} below), or options such as \code{mode} for #' specific model types (see \href{../doc/models.html}{vignette("models", #' "emmeans")}). #' #' @section Using \code{cov.reduce} and \code{cov.keep}: #' The \code{cov.keep} argument was not available in \pkg{emmeans} versions #' 1.4.1 and earlier. Any covariates named in this list are treated as if they #' are factors: all the unique levels are kept in the reference grid. The user #' may also specify an integer value, in which case any covariate having no more #' than that number of unique values is implicitly included in \code{cov.keep}. #' The default for \code{cov.keep} is set and retrieved via the #' \code{\link{emm_options}} framework, and the system default is \code{"2"}, #' meaning that covariates having only two unique values are automatically #' treated as two-level factors. See also the Note below on backward compatibility. #' #' There is a subtle distinction between including a covariate in \code{cov.keep} #' and specifying its values manually in \code{at}: Covariates included in #' \code{cov.keep} are treated as factors for purposes of weighting, while #' specifying levels in \code{at} will not include the covariate in weighting. #' See the \code{mtcars.lm} example below for an illustration. #' #' \code{cov.reduce} may be a function, #' logical value, formula, or a named list of these. #' If a single function, it is applied to each covariate. #' If logical and \code{TRUE}, \code{mean} is used. If logical and #' \code{FALSE}, it is equivalent to including all covariates in #' \code{cov.keep}.
Use of \samp{cov.reduce = FALSE} is inadvisable because it #' can result in a huge reference grid; it is far better to use #' \code{cov.keep}. #' #' If a formula (which must be two-sided), then a model is fitted to that #' formula using \code{\link{lm}}; then in the reference grid, its response #' variable is set to the results of \code{\link{predict}} for that model, #' with the reference grid as \code{newdata}. (This is done \emph{after} the #' reference grid is determined.) A formula is appropriate here when you think #' experimental conditions affect the covariate as well as the response. #' #' To allow for situations where a simple \code{lm()} call as described above won't #' be adequate, a formula of the form \code{ext ~ fcnname} is also supported, #' where the left-hand side may be \code{ext}, \code{extern}, or #' \code{external} (and must \emph{not} be a predictor name) and the #' right-hand side is the name of an existing function. The function is called #' with one argument, a data frame with columns for each variable in the #' reference grid. The function is expected to use that frame as new data to #' be used to obtain predictions for one or more models; and it should return #' a named list or data frame with replacement values for one or more of the #' covariates. #' #' If \code{cov.reduce} is a named list, then the above criteria are used to #' determine what to do with covariates named in the list. (However, formula #' elements do not need to be named, as those names are determined from the #' formulas' left-hand sides.) Any unresolved covariates are reduced using #' \code{"mean"}. #' #' Any \code{cov.reduce} or \code{cov.keep} specification for a covariate #' also named in \code{at} is ignored. #' #' #' @section Interdependent covariates: Care must be taken when covariate values #' depend on one another. For example, when a polynomial model was fitted #' using predictors \code{x}, \code{x2} (equal to \code{x^2}), and \code{x3} #' (equal to \code{x^3}), the reference grid will by default set \code{x2} and #' \code{x3} to their means, which is inconsistent. The user should instead #' use the \code{at} argument to set these to the square and cube of #' \code{mean(x)}. Better yet, fit the model using a formula involving #' \code{poly(x, 3)} or \code{I(x^2)} and \code{I(x^3)}; then there is only #' \code{x} appearing as a covariate; it will be set to its mean, and the #' model matrix will have the correct corresponding quadratic and cubic terms. #' #' @section Matrix covariates: Support for covariates that appear in the dataset #' as matrices is very limited. If the matrix has but one column, it is #' treated like an ordinary covariate. Otherwise, with more than one column, #' each column is reduced to a single reference value -- the result of #' applying \code{cov.reduce} to each column (averaged together if that #' produces more than one value); you may not specify values in \code{at}; and #' they are not treated as variables in the reference grid, except for #' purposes of obtaining predictions. #' #' @section Recovering or overriding model information: Ability to support a #' particular class of \code{object} depends on the existence of #' \code{recover_data} and \code{emm_basis} methods -- see #' \link{extending-emmeans} for details. The call #' \code{methods("recover_data")} will help identify these. #' #' \bold{Data.} In certain models (e.g., results of #' \code{\link[lme4]{glmer.nb}}), it is not possible to identify the original #' dataset.
In such cases, we can work around this by setting \code{data} #' equal to the dataset used in fitting the model, or a suitable subset. Only #' the complete cases in \code{data} are used, so it may be necessary to #' exclude some unused variables. Using \code{data} can also help save #' computing, especially when the dataset is large. In any case, \code{data} #' must represent all factor levels used in fitting the model. It #' \emph{cannot} be used as an alternative to \code{at}. (Note: If there is a #' pattern of \code{NAs} that caused one or more factor levels to be excluded #' when fitting the model, then \code{data} should also exclude those levels.) #' #' \bold{Covariance matrix.} By default, the variance-covariance matrix for #' the fixed effects is obtained from \code{object}, usually via its #' \code{\link{vcov}} method. However, the user may override this via a #' \code{vcov.} argument, specifying a matrix or a function. If a matrix, it #' must be square and of the same dimension and parameter order as the fixed #' effects. If a function, it must return a suitable matrix when it is called #' with arguments \code{(object, ...)}. Be careful with possible #' unintended conflicts with arguments in \code{...}; for example, #' \code{sandwich::vcovHAC()} has optional arguments \code{adjust} and \code{weights} #' that may be intended for \code{emmeans()} but will also be passed to \code{vcov.()}. #' #' \bold{Nested factors.} Having a nesting structure affects marginal #' averaging in \code{emmeans} in that it is done separately for each level #' (or combination thereof) of the grouping factors. \code{ref_grid} tries to #' discern which factors are nested in other factors, but it is not always #' obvious, and if it misses some, the user must specify this structure via #' \code{nesting}; or later using \code{\link{update.emmGrid}}. The #' \code{nesting} argument may be a character vector, a named \code{list}, #' or \code{NULL}. #' If a \code{list}, each name should be the name of a single factor in the #' grid, and its entry a character vector of the name(s) of its grouping #' factor(s). \code{nesting} may also be a character value of the form #' \code{"factor1 \%in\% (factor2*factor3)"} (the parentheses are optional). #' If there is more than one such specification, they may be appended #' separated by commas, or as separate elements of a character vector. For #' example, these specifications are equivalent: \code{nesting = list(state = #' "country", city = c("state", "country"))}, \code{nesting = "state \%in\% #' country, city \%in\% (state*country)"}, and \code{nesting = c("state \%in\% #' country", "city \%in\% state*country")}. #' #' @section Predictors with subscripts and data-set references: When the fitted #' model contains subscripts or explicit references to data sets, the #' reference grid may optionally be post-processed to simplify the variable #' names, depending on the \code{simplify.names} option (see #' \code{\link{emm_options}}), which by default is \code{TRUE}. For example, #' if the model formula is \code{data1$resp ~ data1$trt + data2[[3]] + #' data2[["cov"]]}, the simplified predictor names (for use, e.g., in the #' \code{specs} for \code{\link{emmeans}}) will be \code{trt}, #' \code{data2[[3]]}, and \code{cov}. Numerical subscripts are not simplified; #' nor are variables having simplified names that coincide, such as if #' \code{data2$trt} were also in the model. #' #' Please note that this simplification is performed \emph{after} the #' reference grid is constructed.
Thus, non-simplified names must be used in #' the \code{at} argument (e.g., \code{at = list(`data2["cov"]` = 2:4)}. #' #' If you don't want names simplified, use \code{emm_options(simplify.names = #' FALSE)}. #' #' #' #' @section Prediction types and transformations: #' Transformations can exist because of a link function in a generalized linear model, #' or as a response transformation, or even both. In many cases, they are auto-detected, #' for example a model formula of the form \code{sqrt(y) ~ ...}. Even transformations #' containing multiplicative or additive constants, such as \code{2*sqrt(y + pi) ~ ...}, #' are auto-detected. A response transformation of \code{y + 1 ~ ...} is \emph{not} #' auto-detected, but \code{I(y + 1) ~ ...} is interpreted as \code{identity(y + 1) ~ ...}. #' A warning is issued if it gets too complicated. #' Complex transformations like the Box-Cox transformation are not auto-detected; but see #' the help page for \code{\link{make.tran}} for information on some advanced methods. #' #' There is a subtle difference #' between specifying \samp{type = "response"} and \samp{transform = #' "response"}. While the summary statistics for the grid itself are the same, #' subsequent use in \code{\link{emmeans}} will yield different results if #' there is a response transformation or link function. With \samp{type = #' "response"}, EMMs are computed by averaging together predictions on the #' \emph{linear-predictor} scale and then back-transforming to the response #' scale; while with \samp{transform = "response"}, the predictions are #' already on the response scale so that the EMMs will be the arithmetic means #' of those response-scale predictions. To add further to the possibilities, #' \emph{geometric} means of the response-scale predictions are obtainable via #' \samp{transform = "log", type = "response"}. See also the help page for #' \code{\link{regrid}}. #' #' @section Optional side effect: If the \code{save.ref_grid} option is set to #' \code{TRUE} (see \code{\link{emm_options}}), #' The most recent result of \code{ref_grid}, whether #' called directly or indirectly via \code{\link{emmeans}}, #' \code{\link{emtrends}}, or some other function that calls one of these, is #' saved in the user's environment as \code{.Last.ref_grid}. This facilitates #' checking what reference grid was used, or reusing the same reference grid #' for further calculations. This automatic saving is disabled by default, but #' may be enabled via \samp{emm_options(save.ref_grid = TRUE)}. #' #' @return An object of the S4 class \code{"emmGrid"} (see #' \code{\link{emmGrid-class}}). These objects encapsulate everything needed #' to do calculations and inferences for estimated marginal means, and contain #' nothing that depends on the model-fitting procedure. #' #' @seealso Reference grids are of class \code{\link[=emmGrid-class]{emmGrid}}, #' and several methods exist for them -- for example #' \code{\link{summary.emmGrid}}. Reference grids are fundamental to #' \code{\link{emmeans}}. Supported models are detailed in #' \href{../doc/models.html}{\code{vignette("models", "emmeans")}}. #' See \code{\link{update.emmGrid}} for details of arguments that can be in #' \code{options} (or in \code{...}). #' #' @note The system default for \code{cov.keep} causes models #' containing indicator variables to be handled differently than in #' \pkg{emmeans} version 1.4.1 or earlier. To replicate older #' analyses, change the default via #' \samp{emm_options(cov.keep = character(0))}. 
#' #' @note Some earlier versions of \pkg{emmeans} offer a \code{covnest} argument. #' This is now obsolete; if \code{covnest} is specified, it is harmlessly #' ignored. Cases where it was needed are now handled appropriately via the #' code associated with \code{cov.keep}. #' #' @export #' #' @examples #' fiber.lm <- lm(strength ~ machine*diameter, data = fiber) #' ref_grid(fiber.lm) #' #' ref_grid(fiber.lm, at = list(diameter = c(15, 25))) #' #' \dontrun{ #' # We could substitute the sandwich estimator vcovHAC(fiber.lm) #' # as follows: #' summary(ref_grid(fiber.lm, vcov. = sandwich::vcovHAC)) #' } #' #' # If we thought that the machines affect the diameters #' # (admittedly not plausible in this example), then we should use: #' ref_grid(fiber.lm, cov.reduce = diameter ~ machine) #' #' ### Model with indicator variables as predictors: #' mtcars.lm <- lm(mpg ~ disp + wt + vs * am, data = mtcars) #' (rg.default <- ref_grid(mtcars.lm)) #' (rg.nokeep <- ref_grid(mtcars.lm, cov.keep = character(0))) #' (rg.at <- ref_grid(mtcars.lm, at = list(vs = 0:1, am = 0:1))) #' #' # Two of these have the same grid but different weights: #' rg.default@grid #' rg.at@grid #' #' ### Using cov.reduce formulas... #' # Above suggests we can vary disp indep. of other factors - unrealistic #' rg.alt <- ref_grid(mtcars.lm, at = list(wt = c(2.5, 3, 3.5)), #' cov.reduce = disp ~ vs * wt) #' rg.alt@grid #' #' # Alternative to above where we model sqrt(disp) #' disp.mod <- lm(sqrt(disp) ~ vs * wt, data = mtcars) #' disp.fun <- function(dat) #' list(disp = predict(disp.mod, newdata = dat)^2) #' rg.alt2 <- ref_grid(mtcars.lm, at = list(wt = c(2.5, 3, 3.5)), #' cov.reduce = external ~ disp.fun) #' rg.alt2@grid #' #' #' # Multivariate example #' MOats.lm = lm(yield ~ Block + Variety, data = MOats) #' ref_grid(MOats.lm, mult.names = "nitro") #' # Silly illustration of how to use 'mult.levs' to make comb's of two factors #' ref_grid(MOats.lm, mult.levs = list(T=LETTERS[1:2], U=letters[1:2])) #' #' # Using 'params' #' require("splines") #' my.knots = c(2.5, 3, 3.5) #' mod = lm(Sepal.Length ~ Species * ns(Sepal.Width, knots = my.knots), data = iris) #' ## my.knots is not a predictor, so need to name it in 'params' #' ref_grid(mod, params = "my.knots") #' ref_grid <- function(object, at, cov.reduce = mean, cov.keep = get_emm_option("cov.keep"), mult.names, mult.levs, options = get_emm_option("ref_grid"), data, df, type, transform, nesting, offset, sigma, nuisance = character(0), non.nuisance, wt.nuis = "equal", rg.limit = get_emm_option("rg.limit"), ...) { ### transform = match.arg(transform) if (!missing(df)) { if(is.null(options)) options = list() options$df = df } if(missing(sigma)) { # Get 'sigma(object)' if available, else NULL sigma = suppressWarnings(try(stats::sigma(object), silent = TRUE)) if (inherits(sigma, "try-error")) sigma = NULL } # recover the data if (missing(data)) { data = try(recover_data (object, data = NULL, ...)) if (inherits(data, "try-error")) stop("Perhaps a 'data' or 'params' argument is needed") } else # attach needed attributes to given data data = recover_data(object, data = as.data.frame(data), ...) 
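    # (Illustrative aside: 'recover_data' and 'emm_basis' are the two generics
    # that other packages implement so that ref_grid() can support their model
    # classes. A minimal sketch for a hypothetical class "mymod" -- see the
    # "xtending" vignette for the real protocol -- delegates to the built-in
    # method for call objects:
    #     recover_data.mymod = function(object, ...)
    #         recover_data(object$call, delete.response(terms(object)),
    #                      object$na.action, ...)
    # The 'data' obtained above is a data.frame carrying attributes such as
    # "call", "terms", "predictors", and "responses", which the code below uses.)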
if(is.character(data)) # 'data' is in fact an error message stop(data) ## undocumented hook to use ref_grid as slave to get data if (!is.null(options$just.data)) return(data) trms = attr(data, "terms") # find out if any variables are coerced to factors or vice versa coerced = .find.coerced(trms, data) # now list with members 'factors' and 'covariates' # convenience function sort.unique = function(x) sort(unique(x)) # Get threshold for max #levels to keep a covariate if (is.null(cov.keep)) cov.keep = character(0) cov.thresh = max(c(1, suppressWarnings(as.integer(cov.keep))), na.rm = TRUE) if (is.logical(cov.reduce)) { if (!cov.reduce) cov.thresh = 99 # big enough! cov.reduce = mean } # Ensure cov.reduce is a function or list thereof dep.x = list() # list of formulas to fit later fix.cr = function(cvr) { if (inherits(cvr, "formula")) { if (length(cvr) < 3) stop("Formulas in 'cov.reduce' must be two-sided") lhs = .all.vars(cvr)[1] dep.x[[lhs]] <<- cvr cvr = mean } else if (!inherits(cvr, c("function","list"))) stop("Invalid 'cov.reduce' argument") cvr } # IMPORTANT: following stmts may also affect x.dep if (is.list(cov.reduce)) cov.reduce = lapply(cov.reduce, fix.cr) else cov.reduce = fix.cr(cov.reduce) # zap any formulas that are also in 'at' if (!missing(at)) for (xnm in names(at)) dep.x[[xnm]] = NULL # local cov.reduce function that works with function or named list cr = function(x, nm) { if (is.function(cov.reduce)) cov.reduce(x) else if (hasName(cov.reduce, nm)) cov.reduce[[nm]](x) else mean(x) } # initialize empty lists ref.levels = matlevs = xlev = chrlev = list() for (nm in attr(data, "responses")) { y = data[[nm]] if (is.matrix(y)) matlevs[[nm]] = apply(y, 2, mean) else ref.levels[[nm]] = mean(y) } for (nm in attr(data, "predictors")) { x = data[[nm]] if (is.matrix(x) && ncol(x) == 1) # treat 1-column matrices as covariates x = as.numeric(x) # Save the original levels of factors, no matter what if (is.factor(x) && !(nm %in% coerced$covariates)) xlev[[nm]] = levels(factor(x)) # (applying factor drops any unused levels) else if (is.character(x)) # just like a factor xlev[[nm]] = sort(unique(x)) # Now go thru and find reference levels... # mentioned in 'at' list but not coerced factor if (!(nm %in% coerced$factors) && !missing(at) && (hasName(at, nm))) ref.levels[[nm]] = at[[nm]] # factors not in 'at' else if (is.factor(x) && !(nm %in% coerced$covariates)) ref.levels[[nm]] = levels(factor(x)) else if (is.character(x) || is.logical(x)) ref.levels[[nm]] = chrlev[[nm]] = sort.unique(x) # matrices else if (is.matrix(x)) { # Matrices -- reduce columns thereof, but don't add to baselevs matlevs[[nm]] = apply(x, 2, cr, nm) # if cov.reduce returns a vector, average its columns if (is.matrix(matlevs[[nm]])) matlevs[[nm]] = apply(matlevs[[nm]], 2, mean) } # covariate coerced, or not mentioned in 'at' else { # single numeric pred but coerced to a factor - use unique values # even if in 'at' list. 
We'll fix this up later if (nm %in% coerced$factors) ref.levels[[nm]] = sort.unique(x) # Ordinary covariates - summarize based on cov.keep else { if ((length(uval <- sort.unique(x)) > cov.thresh) && !(nm %in% cov.keep)) ref.levels[[nm]] = cr(as.numeric(x), nm) else { ref.levels[[nm]] = uval cov.keep = c(cov.keep, nm) } } } } if (!missing(non.nuisance)) nuisance = setdiff(names(ref.levels), non.nuisance) # Now create the reference grid if(no.nuis <- (length(nuisance) == 0)) { .check.grid(ref.levels, rg.limit) grid = do.call(expand.grid, ref.levels) } else { nuis.info = .setup.nuis(nuisance, ref.levels, trms, rg.limit) grid = nuis.info$grid } # undocumented hook to expand grid by increments of 'var' (needed by emtrends) if (!is.null(delts <- options$delts)) { var = options$var n.orig = nrow(grid) # remember how many rows we had grid = grid[rep(seq_len(n.orig), length(delts)), , drop = FALSE] options$var = options$delts = NULL # grid[[var]] = grid[[var]] + rep(delts, each = n.orig) # (we have to wait until after covariate calcs to do this) } # add any matrices for (nm in names(matlevs)) { tmp = matrix(rep(matlevs[[nm]], each=nrow(grid)), nrow=nrow(grid)) dimnames(tmp) = list(NULL, names(matlevs[[nm]])) grid[[nm]] = tmp } # resolve any covariate formulas for (xnm in names(dep.x)) { if ((xnm %in% c("ext", "extern", "external")) && !(xnm %in% names(grid))) { # use external fcn fun = get(as.character(dep.x[[xnm]][[3]]), inherits = TRUE) rslts = fun(grid) # should be some thing that supports names() and [[]], e.g., a list or d.f. for (nm in intersect(names(rslts), names(grid))) { grid[[nm]] = rslts[[nm]] ref.levels[[nm]] = NULL } } else if (!all(.all.vars(dep.x[[xnm]]) %in% names(grid))) stop("Formulas in 'cov.reduce' must predict covariates actually in the model") else { # Use lm() to predict this covariate xmod = lm(dep.x[[xnm]], data = data) grid[[xnm]] = predict(xmod, newdata = grid) ref.levels[[xnm]] = NULL } } # finish-up our hook for expanding the grid if (!is.null(delts)) # add increments if any grid[[var]] = grid[[var]] + rep(delts, each = n.orig) if (!is.null(attr(data, "pass.it.on"))) # a hook needed by emm_basis.gamlss et al. attr(object, "data") = data ###!! Prevent a warning like in https://stackoverflow.com/questions/68969384/emmeans-warning-in-model-frame-defaultformula-data-data-variable-gr/68990172#68990172 xl = xlev modnm = rownames(attr(trms, "factors")) chk = sapply(modnm, function(mn) mn %in% names(xl)) for(i in which(!chk)) { # replace names in xl - e.g., as.factor(trt) where trt already a factor fn = all.vars(reformulate(modnm[i])) if (length(fn) == 1) names(xl)[names(xl) == fn] = modnm[i] } ###!! If we remove this code, also need to change xl back to xlev in 'basis =' call below # we've added args `misc` and `options` so emm_basis methods can access and use these if they want basis = emm_basis(object, trms, xl, grid, misc = attr(data, "misc"), options = options, ...) environment(basis$dffun) = baseenv() # releases unnecessary storage if(length(basis$bhat) != ncol(basis$X)) stop("Something went wrong:\n", " Non-conformable elements in reference grid.", call. = TRUE) if(!no.nuis) { basis = .basis.nuis(basis, nuis.info, wt.nuis, ref.levels, data, grid, ref.levels) grid = basis$grid nuisance = ref.levels[nuis.info$nuis] # now nuisance has the levels info ref.levels = basis$ref.levels } misc = basis$misc ### Figure out if there is a response transformation... # next stmt assumes that model formula is 1st argument (2nd element) in call. 
# if not, we probably get an error or something that isn't a formula # and it is silently ignored frm = try(formula(eval(attr(data, "call")[[2]])), silent = TRUE) if (inherits(frm, "formula")) { # response may be transformed lhs = frm[-3] tran = setdiff(.all.vars(lhs, functions = TRUE), c(.all.vars(lhs), "~", "cbind", "+", "-", "*", "/", "^", "%%", "%/%")) if(length(tran) > 0) { if (tran[1] == "scale") { # we'll try to set it up based on terms component pv = try(attr(terms(object), "predvars"), silent = TRUE) if (!inherits(pv, "try-error")) { pv = c(lapply(pv, as.character), "foo") # make sure it isn't empty scal = which(sapply(pv, function(x) x[1] == "scale")) if(length(scal) > 0) { par = as.numeric(pv[[scal[1]]][3:4]) tran = make.tran("scale", y = 0, center = par[1], scale = par[2]) } } if (is.character(tran)) { # didn't manage to find params tran = NULL message("NOTE: Unable to recover scale() parameters. See '? make.tran'") } } else if (tran[1] == "linkfun") tran = as.list(environment(trms))[c("linkfun","linkinv","mu.eta","valideta","name")] else { if (tran[1] == "I") tran = "identity" tran = paste(tran, collapse = ".") # length > 1: Almost certainly unsupported, but facilitates a more informative error message const.warn = "There are unevaluated constants in the response formula\nAuto-detection of the response transformation may be incorrect" # Look for a multiplier, e.g. 2*sqrt(y) tst = strsplit(strsplit(as.character(lhs[2]), "\\(")[[1]][1], "\\*")[[1]] if(length(tst) > 1) { mul = try(eval(parse(text = tst[1])), silent = TRUE) if(!inherits(mul, "try-error")) { misc$tran.mult = mul tran = gsub("\\*\\.", "", tran) } else warning(const.warn) } # look for added constant, e.g. log(y + 1) tst = strsplit(as.character(lhs[2]), "\\(|\\)|\\+")[[1]] if (length(tst) > 2) { const = try(eval(parse(text = tst[3])), silent = TRUE) if (!inherits(const, "try-error") && (length(tst) == 3)) misc$tran.offset = const else warning(const.warn) } } if(is.null(misc[["tran"]])) misc$tran = tran else misc$tran2 = tran misc$inv.lbl = "response" } } # Take care of multivariate response multresp = character(0) ### ??? was list() ylevs = misc$ylevs if(!is.null(ylevs)) { # have a multivariate situation if (missing(mult.levs)) mult.levs = ylevs if(!missing(mult.names)) { k = seq_len(min(length(ylevs), length(mult.names))) names(mult.levs)[k] = mult.names[k] } if (length(ylevs) > 1) ylevs = list(seq_len(prod(sapply(mult.levs, length)))) k = prod(sapply(mult.levs, length)) if (k != length(ylevs[[1]])) stop("supplied 'mult.levs' is of different length ", "than that of multivariate response") for (nm in names(mult.levs)) ref.levels[[nm]] = mult.levs[[nm]] multresp = names(mult.levs) MF = do.call("expand.grid", mult.levs) grid = merge(grid, MF) } # add any matrices for (nm in names(matlevs)) grid[[nm]] = matrix(rep(matlevs[[nm]], each=nrow(grid)), nrow=nrow(grid)) # Here's a complication. 
If a numeric predictor was coerced to a factor, we had to # include all its levels in the reference grid, even if altered in 'at' # Moreover, whatever levels are in 'at' must be a subset of the unique values # So we now need to subset the rows of the grid and linfct based on 'at' problems = if (!missing(at)) intersect(c(multresp, coerced$factors), names(at)) else character(0) if (length(problems) > 0) { incl.flags = rep(TRUE, nrow(grid)) for (nm in problems) { if (is.numeric(ref.levels[[nm]])) { at[[nm]] = round(at[[nm]], 3) ref.levels[[nm]] = round(ref.levels[[nm]], 3) } # get only "legal" levels at[[nm]] = at[[nm]][at[[nm]] %in% ref.levels[[nm]]] # Now which of those are left out? excl = setdiff(ref.levels[[nm]], at[[nm]]) for (x in excl) incl.flags[grid[[nm]] == x] = FALSE ref.levels[[nm]] = at[[nm]] } if (!any(incl.flags)) stop("Reference grid is empty due to mismatched levels in 'at'") grid = grid[incl.flags, , drop=FALSE] basis$X = basis$X[incl.flags, , drop=FALSE] } # Any offsets??? (misc$offset.mult might specify removing or reversing the offset) if (!missing(offset)) { # For safety, we always treat it as scalar if (offset[1] != 0) grid[[".offset."]] = offset[1] } else if(!is.null(attr(trms,"offset"))) { om = 1 if (!is.null(misc$offset.mult)) om = misc$offset.mult if (any(om != 0)) grid[[".offset."]] = om * .get.offset(trms, grid) } ### --- Determine weights for each grid point if (!hasName(data, "(weights)")) data[["(weights)"]] = 1 cov.keep = intersect(unique(cov.keep), names(ref.levels)) nms = union(union(union(names(xlev), names(chrlev)), coerced$factors), cov.keep) nms = intersect(nms, names(grid)) #### Old code... # if (!covnest) # nms = union(union(names(xlev), names(chrlev)), coerced$factors) # only factors, no covariates or mult.resp # else # nms = setdiff(names(ref.levels)[sapply(ref.levels, length) > 1], multresp) # all names (except multiv) for which there is > 1 level if (length(nms) == 0) wgt = rep(1, nrow(grid)) # all covariates; give each weight 1 else { id = .my.id(data[, nms, drop = FALSE]) uid = !duplicated(id) key = do.call(paste, unname(data[uid, nms, drop = FALSE])) key = key[order(id[uid])] tgt = do.call(paste, unname(grid[, nms, drop = FALSE])) wgt = rep(0, nrow(grid)) for (i in seq_along(key)) wgt[tgt == key[i]] = sum(data[["(weights)"]][id==i]) } grid[[".wgt."]] = wgt model.info = list(call = attr(data,"call"), terms = trms, xlev = xlev) if (!is.null(mm <- basis$model.matrix)) { # submodel support attr(mm, "factors") = .smpFT(trms) model.info$model.matrix = mm } # Detect any nesting structures nst = .find_nests(grid, trms, coerced$orig, ref.levels) if (length(nst) > 0) model.info$nesting = nst misc$is.new.rg = TRUE misc$ylevs = NULL # No longer needed misc$estName = "prediction" misc$estType = "prediction" misc$infer = c(FALSE,FALSE) misc$level = .95 misc$adjust = "none" misc$famSize = nrow(grid) if(is.null(misc$avgd.over)) misc$avgd.over = character(0) misc$sigma = sigma post.beta = basis$post.beta if (is.null(post.beta)) post.beta = matrix(NA) predictors = intersect(attr(data, "predictors"), names(grid)) simp.tbl = environment(trms)$.simplify.names. if (! 
is.null(simp.tbl)) { names(grid) = .simplify.names(names(grid), simp.tbl) predictors = .simplify.names(predictors, simp.tbl) names(ref.levels) = .simplify.names(names(ref.levels), simp.tbl) if (!is.null(post.beta)) names(post.beta) = .simplify.names(names(post.beta), simp.tbl) if (!is.null(model.info$nesting)) { model.info$nesting = lapply(model.info$nesting, .simplify.names, simp.tbl) names(model.info$nesting) = .simplify.names(names(model.info$nesting), simp.tbl) } environment(trms)$.simplify.names. = NULL } result = new("emmGrid", model.info = model.info, roles = list(predictors = predictors, responses = attr(data, "responses"), multresp = multresp, nuisance = nuisance), grid = grid, levels = ref.levels, matlevs = matlevs, linfct = basis$X, bhat = basis$bhat, nbasis = basis$nbasis, V = basis$V, dffun = basis$dffun, dfargs = basis$dfargs, misc = misc, post.beta = post.beta) if (!missing(type)) { if (is.null(options)) options = list() options$predict.type = type } if (!missing(nesting)) { result@model.info$nesting = lst = .parse_nest(nesting) if(!is.null(lst)) { nms = union(names(lst), unlist(lst)) if(!all(nms %in% names(result@grid))) stop("Nonexistent variables specified in 'nesting'") result@misc$display = .find.nonempty.nests(result, nms) } } else if (!is.null(nst <- result@model.info$nesting)) { result@misc$display = .find.nonempty.nests(result) if (get_emm_option("msg.nesting")) message("NOTE: A nesting structure was detected in the ", "fitted model:\n ", .fmt.nest(nst)) } result = .update.options(result, options, ...) if(!is.null(hook <- misc$postGridHook)) { if (is.character(hook)) hook = get(hook) result@misc$postGridHook = NULL result = hook(result, ...) } if(!missing(transform)) result = regrid(result, transform = transform, sigma = sigma, ...) .save.ref_grid(result) result } #### End of ref_grid ------------------------------------------ # local utility to save each newly constructed ref_grid, if enabled # Goes into global environment unless .Last.ref_grid is found further up .save.ref_grid = function(object) { if (is.logical(isnewrg <- object@misc$is.new.rg)) if(isnewrg && get_emm_option("save.ref_grid")) assign(".Last.ref_grid", object, inherits = TRUE) } # This function figures out which covariates in a model # have been coerced to factors. And also which factors have been coerced # to be covariates .find.coerced = function(trms, data) { if (ncol(data) == 0) return(list(factors = integer(0), covariates = integer(0))) isfac = sapply(data, function(x) inherits(x, "factor")) # Character vectors of factors and covariates in the data... facs.d = names(data)[isfac] covs.d = names(data)[!isfac] lbls = attr(trms, "term.labels") M = model.frame(trms, utils::head(data, 2)) #### just need a couple rows isfac = sapply(M, function(x) inherits(x, "factor")) # Character vector of terms in the model frame that are factors ... 
facs.m = names(M)[as.logical(isfac)] covs.m = setdiff(names(M), facs.m) # Exclude the terms that are already factors # What's left will be things like "factor(dose)", "interact(dose,treat)", etc # we're saving these in orig orig = cfac = setdiff(facs.m, facs.d) if(length(cfac) != 0) { cvars = lapply(cfac, function(x) .all.vars(stats::reformulate(x))) # Strip off the function calls cfac = intersect(unique(unlist(cvars)), covs.d) # Exclude any variables that are already factors } # Do same with covariates ccov = setdiff(covs.m, covs.d) orig = c(orig, ccov) if(length(ccov) > 0) { cvars = lapply(ccov, function(x) .all.vars(stats::reformulate(x))) ccov = intersect(unique(unlist(cvars)), facs.d) } list(factors = cfac, covariates = ccov, orig = orig) } # My replacement for plyr::id(, drop = TRUE) .my.id = function(data){ p = do.call(paste, data) u = unique(p) match(p, u) } # Utility to error-out when potential reference grid size is too big .check.grid = function(levs, limit = get_emm_option("rg.limit")) { size = prod(sapply(levs, length)) if (size > limit) stop("The rows of your requested reference grid would be ", size, ", which exceeds\n", "the limit of ", limit, " (not including any multivariate responses).\n", "Your options are:\n", " 1. Specify some (or more) nuisance factors using the 'nuisance' argument\n", " (see ?ref_grid). These must be factors that do not interact with others.\n", " 2. Add the argument 'rg.limit = ' to the call. Be careful,\n", " because this could cause excessive memory use and performance issues.\n", " Or, change the default via 'emm_options(rg.limit = )'.\n", call. = FALSE) } # Utility to set up the grid for nuisance factors. This consists of two or more # data.frames rbinded together: # * the expanded grid for all factors *not* in nuis, with the nuis factors set # at their first levels # * for each factor in f nuis, a set of rows for (levs$f), with the other # factors at their first levels (this is arbitrary) # In addition, we return a character vector 'row.assign' corresponding to the rows # in the grid, telling which is what: ".main.grid." for the first part of the grid, # otherwise factor names from nuis. # We also return 'nuis' itself - which may be reduced since illegal # entries are silently removed .setup.nuis = function(nuis, levs, trms, rg.limit) { firsts = args = lapply(levs, function(x) x[1]) nuis = intersect(nuis, names(levs)) # sanity checks on terms, and term indexes fsum = rep(99, length(nuis)) tbl = attr(trms, "factors") rn = row.names(tbl) = sapply(row.names(tbl), function(nm) paste(all.vars(reformulate(nm)), collapse = ",")) for (i in seq_along(nuis)) { f = nuis[i] if(f %in% rn) fsum[i] = sum(tbl[f, ]) } nuis = nuis[fsum == 1] # silently remove any unfound or interacting factors # top part... non.nuis = setdiff(names(levs), nuis) for (n in non.nuis) args[[n]] = levs[[n]] .check.grid(args, rg.limit) grid = do.call(expand.grid, args) ra = rep(".main.grid.", nrow(grid)) # bottom parts for (f in nuis) { args = firsts args[[f]] = levs[[f]] grid = rbind(grid, do.call(expand.grid, args)) ra = c(ra, rep(f, length(levs[[f]]))) } list(grid = grid, row.assign = ra, nuis = nuis) } # Do the required post-processing for nuisance factors... # After we get the model matrix for this grid, we'll average each set of rows in the # bottom part, and substitute those averages in the required columns in the top part # of the model matrix. .basis.nuis = function(basis, info, wt, levs, data, grid, ref.levels) { ra = info$row.assign r. 
= rep(".", length(ra)) # fillers X = basis$X n = sum(ra == ".main.grid.") k = nrow(X) / length(ra) # multivariate dimension nuis = info$nuis wts = lapply(nuis, function(f) { if (wt == "equal") w = rep(1, length(levs[[f]])) else { x = data[[f]] w = sapply(levs[[f]], function(lev) sum(x == lev)) } w / sum(w) }) names(wts) = nuis # In a multivariate case, we have to repeat the same operations for each block of X rows for (m in 1:k) { RA = c(rep(r., m - 1), ra, rep(r., k - m)) for (f in nuis) { subX = X[RA == f, , drop = FALSE] cols = which(apply(subX, 2, function(x) diff(range(x)) > 0)) subX = sweep(subX[, cols, drop = FALSE], 1, wts[[f]], "*") avg = apply(subX, 2, sum) avg = matrix(rep(avg, each = n), nrow = n) # now several copies X[RA == ".main.grid.", cols] = avg } } basis$misc$nuis = nuis basis$misc$avgd.over = paste(length(nuis), "nuisance factors") RA = rep(ra, k) basis$X = X[RA == ".main.grid.", , drop = FALSE] non.nuis = setdiff(names(ref.levels), info$nuis) basis$ref.levels = ref.levels[non.nuis] basis$grid = grid[1:n, non.nuis, drop = FALSE] basis } emmeans/R/emmeans-package.R0000644000176200001440000001635214137062735015260 0ustar liggesusers############################################################################## # Copyright (c) 2012-2017 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## #' Estimated marginal means (aka Least-squares means) #' #' This package provides methods for obtaining estimated marginal means (EMMs, also #' known as least-squares means) for factor combinations in a variety of models. #' Supported models include [generalized linear] models, models for counts, #' multivariate, multinomial and ordinal responses, survival models, GEEs, and #' Bayesian models. For the latter, posterior samples of EMMs are provided. #' The package can compute contrasts or linear #' combinations of these marginal means with various multiplicity adjustments. #' One can also estimate and contrast slopes of trend lines. #' Some graphical displays of these results are provided. #' #' @section Overview: #' \describe{ #' \item{Vignettes}{A number of vignettes are provided to help the user get #' acquainted with the \pkg{emmeans} package and see some examples.} #' #' \item{Concept}{Estimated marginal means (see Searle \emph{et al.} 1980 are #' popular for summarizing linear models that include factors. For balanced #' experimental designs, they are just the marginal means. For unbalanced data, #' they in essence estimate the marginal means you \emph{would} have observed #' that the data arisen from a balanced experiment. Earlier developments #' regarding these techniques were developed in a least-squares context and are #' sometimes referred to as \dQuote{least-squares means}. 
Since its early #' development, the concept has expanded far beyond least-squares settings.} #' #' \item{Reference grids}{ The implementation in \pkg{emmeans} relies on our own #' concept of a \emph{reference grid}, which is an array of factor and predictor #' levels. Predictions are made on this grid, and estimated marginal means (or #' EMMs) are defined as averages of these predictions over zero or more #' dimensions of the grid. The function \code{\link{ref_grid}} explicitly #' creates a reference grid that can subsequently be used to obtain #' least-squares means. The object returned by \code{ref_grid} is of class #' \code{"emmGrid"}, the same class as is used for estimated marginal means (see #' below). #' #' Our reference-grid framework expands slightly upon Searle \emph{et al.}'s #' definitions of EMMs, in that it is possible to include multiple levels of #' covariates in the grid. } #' #' \item{Models supported}{As is mentioned in the package description, many #' types of models are supported by the package. #' See \href{../doc/models.html}{vignette("models", "emmeans")} for full details. #' Some models may require other packages be #' installed in order to access all of the available features. #' For models not explicitly supported, it may still be possible to do basic #' post hoc analyses of them via the \code{\link{qdrg}} function.} #' #' \item{Estimated marginal means}{ #' The \code{\link{emmeans}} function computes EMMs given a fitted model (or a #' previously constructed \code{emmGrid} object), using a specification indicating #' what factors to include. The \code{\link{emtrends}} function creates the same #' sort of results for estimating and comparing slopes of fitted lines. Both #' return an \code{emmGrid} object.} #' #' \item{Summaries and analysis}{ #' The \code{\link{summary.emmGrid}} method may be used to display an \code{emmGrid} #' object. Special-purpose summaries are available via \code{\link{confint.emmGrid}} and #' \code{\link{test.emmGrid}}, the latter of which can also do a joint test of several #' estimates. The user may specify by variables, multiplicity-adjustment #' methods, confidence levels, etc., and if a transformation or link function is #' involved, may reverse-transform the results to the response scale.} #' #' \item{Contrasts and comparisons}{ #' The \code{\link{contrast}} method for \code{emmGrid} objects is used to obtain #' contrasts among the estimates; several standard contrast families are #' available such as deviations from the mean, polynomial contrasts, and #' comparisons with one or more controls. Another \code{emmGrid} object is returned, #' which can be summarized or further analyzed. For convenience, a \code{pairs.emmGrid} #' method is provided for the case of pairwise comparisons. #' } #' \item{Graphs}{The \code{\link{plot.emmGrid}} method will display #' side-by-side confidence intervals for the estimates, and/or #' \dQuote{comparison arrows} whereby the *P* values of pairwise differences #' can be observed by how much the arrows overlap. The \code{\link{emmip}} function #' displays estimates like an interaction plot, multi-paneled if there are by #' variables. These graphics capabilities require the \pkg{lattice} package be #' installed.} #' #' \item{MCMC support}{When a model is fitted using MCMC methods, the posterior #' chains(s) of parameter estimates are retained and converted into posterior #' samples of EMMs or contrasts thereof. 
These may then be summarized or plotted #' like any other MCMC results, using tools in, say \pkg{coda} or #' \pkg{bayesplot}.} #' #' \item{\pkg{multcomp} interface}{The \code{\link{as.glht}} function and #' \code{glht} method for \code{emmGrid}s provide an interface to the #' \code{glht} function in the \pkg{multcomp} package, thus #' providing for more exacting simultaneous estimation or testing. The package #' also provides an \code{\link{emm}} function that works as an alternative to #' \code{mcp} in a call to \code{glht}. #' } #' } %%% end describe #' #' @import estimability #' @import mvtnorm #' @import stats #' @importFrom graphics pairs plot #' @importFrom methods as is new slot slot<- slotNames #' @importFrom utils getS3method hasName installed.packages methods str #' @name emmeans-package NULL emmeans/R/pwpp.R0000644000176200001440000006032214137062735013224 0ustar liggesusers############################################################################## # Copyright (c) 2012-2020 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## #' Pairwise P-value plot #' #' Constructs a plot of P values associated with pairwise comparisons of #' estimated marginal means. #' #' Factor levels (or combinations thereof) are plotted on the vertical scale, and P values #' are plotted on the horizontal scale. Each P value is plotted twice -- at #' vertical positions corresponding to the levels being compared -- and connected by #' a line segment. Thus, it is easy to visualize which P values are small and large, #' and which levels are compared. In addition, factor levels are color-coded, and the points #' and half-line segments appear in the color of the other level. #' The P-value scale is nonlinear, so as to stretch-out smaller P values and #' compress larger ones. #' P values smaller than 0.0004 are altered and plotted in a way that makes #' them more distinguishable from one another. #' #' If \code{xlab}, \code{ylab}, and \code{xsub} are not provided, reasonable labels #' are created. \code{xsub} is used to note special features; e.g., equivalence #' thresholds or one-sided tests. #' #' @param emm An \code{emmGrid} object #' @param method Character or list. Passed to \code{\link{contrast}}, and defines #' the contrasts to be displayed. Any contrast method may be used, #' provided that each contrast includes one coefficient of \code{1}, #' one coefficient of \code{-1}, and the rest \code{0}. That is, calling #' \code{contrast(object, method)} produces a set of comparisons, each with #' one estimate minus another estimate. #' @param by Character vector of variable(s) in the grid to condition on. These will #' create different panels, one for each level or level-combination. 
#' Grid factors not in \code{by} are the \emph{primary} factors: #' whose levels or level combinations are compared pairwise. #' @param sort Logical value. If \code{TRUE}, levels of the factor combinations are #' ordered by their marginal means. If \code{FALSE}, they appear in #' order based on the existing ordering of the factor levels involved. #' Note that the levels are ordered the same way in all panels, and in #' many cases this implies that the means in any particular panel #' will \emph{not} be ordered even when \code{sort = TRUE}. #' @param values Logical value. If \code{TRUE}, the values of the EMMs are included #' in the plot. When there are several side-by-side panels due #' to \code{by} variable(s), the labels showing values start #' stealing a lot of space from the plotting area; in those cases, #' it may be desirable to specify \code{FALSE} or use \code{rows} #' so that some panels are vertically stacked. #' @param rows Character vector of which \code{by} variable(s) are used to define #' rows of the panel layout. Those variables in \code{by} not included in #' \code{rows} define columns in the array of panels. #' A \code{"."} indicates that only one row #' is used, so all panels are stacked side-by-side. #' @param xlab Character label to use in place of the default for the P-value axis. #' @param ylab Character label to use in place of the default for the primary-factor axis. #' @param xsub Character label used as caption at the lower right of the plot. #' @param plim numeric vector of value(s) between 0 and 1. These are included #' among the observed p values so that the range of tick marks includes at #' least the range of \code{plim}. Choosing \code{plim = c(0,1)} will ensure #' the widest possible range. #' @param add.space Numeric value to adjust amount of space used for value labels. Positioning #' of value labels is tricky, and depends on how many panels and the #' physical size of the plotting region. This parameter allows the user to #' adjust the position. Changing it by one unit should shift the position by #' about one character width (right if positive, left if negative). #' Note that this interacts with \code{aes$label} below. #' @param aes optional named list of lists. Entries considered are \code{point}, #' \code{segment}, and \code{label}, and contents are passed to the respective #' \code{ggplot2::geom_xxx()} functions. These affect rendering of points, #' line segments joining them, and value labels. #' Defaults are \code{point = list(size = 2)}, #' \code{segment = list()}, and \code{label = list(size = 2.5)}. #' @param ... Additional arguments passed to \code{contrast} and \code{\link{summary.emmGrid}}, #' as well as to \code{geom_segment} and \code{geom_label} #' #' #' @note The \pkg{ggplot2} and \pkg{scales} packages must be installed in order #' for \code{pwpp} to work. 
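#' @note A sketch of stacking panels vertically via \code{rows}, assuming an
#' object like \code{emm} in the examples below (whose only \code{by} variable
#' is \code{source}):
#' \preformatted{
#' pwpp(emm, rows = "source")   # one panel per source level, stacked in rows
#' }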
#' @note Additional plot aesthetics are available by adding them to the returned object; #' see the examples #' #' @seealso A numerical display of essentially the same results is available #' from \code{\link{pwpm}} #' @export #' @examples #' pigs.lm <- lm(log(conc) ~ source * factor(percent), data = pigs) #' emm = emmeans(pigs.lm, ~ percent | source) #' pwpp(emm) #' pwpp(emm, method = "trt.vs.ctrl1", type = "response", side = ">") #' #' # custom aesthetics: #' my.aes <- list(point = list(shape = "square"), #' segment = list(linetype = "dashed", color = "red"), #' label = list(family = "serif", fontface = "italic")) #' my.pal <- c("darkgreen", "blue", "magenta", "orange") #' pwpp(emm, aes = my.aes) + ggplot2::scale_color_manual(values = my.pal) #' pwpp = function(emm, method = "pairwise", by, sort = TRUE, values = TRUE, rows = ".", xlab, ylab, xsub = "", plim = numeric(0), add.space = 0, aes, ...) { if(missing(by)) by = emm@misc$by.vars ### set up aesthetics if(missing(aes)) aes = list() # defaults if other than system ones... daes = list(point = list(size = 2), segment = list(), label = list(size = 2.5)) # fill aes w/ defaults if not present, at either level for(a in names(daes)) { if(is.null(aes[[a]])) aes[[a]] = daes[[a]] else for (b in names(daes[[a]])) if (is.null(aes[[a]][[b]])) aes[[a]][[b]] = daes[[a]][[b]] } if(rows != "." && !(rows %in% by)) stop("'rows' must be a subset of the 'by' variables") args = list(object = emm, method = method, by = by, ...) args$interaction = args$simple = args$offset = NULL con = do.call(contrast, args) args = list(object = emm, infer = c(FALSE, FALSE), by = by, ...) emm.summ = do.call(summary.emmGrid, args) args = list(object = con, infer = c(FALSE, TRUE), ...) args$null = NULL con.summ = do.call(summary.emmGrid, args) if(missing(xlab)) { adjust = .cap(attr(con.summ, "adjust")) delta = attr(con.summ, "delta") side = attr(con.summ, "side") xlab = "P value" if (adjust != "None") xlab = paste0(adjust, "-adjusted ", xlab) if (delta != 0) xsub = paste(c("Nonsuperiority", "Equivalence", "Noninferiority")[side + 2], "test with threshold", delta) else xsub = c("Left-sided tests", "", "Right-sided tests")[side + 2] } sep = get_emm_option("sep") if(missing(ylab)) ylab = paste(attr(emm.summ, "pri.vars"), collapse = ":") # figure out levels being compared cf = coef(con) use = setdiff(names(cf), names(con@misc$orig.grid)) idx = apply(as.matrix(cf[use]), 2, function(x) { if(!all(range(x) == c(-1,1)) || (sum(abs(x)) != 2)) stop("Each contrast must be a comparison of two estimates") c(which(x == 1), which(x == -1)) }) primv = setdiff(attr(emm.summ, "pri.vars"), by) pf = do.call(paste, c(unname(emm.summ[primv]), sep = sep)) pemm = suppressMessages(emmeans(emm, primv)) levs = do.call(paste, c(unname(pemm@grid[primv]), sep = sep)) if(sort) ord = order(predict(pemm)) else ord = seq_along(pf) pf = emm.summ$pri.fac = factor(pf, levels = levs[ord]) con.summ$plus = pf[idx[1, ]] con.summ$minus = pf[idx[2, ]] estName = attr(emm.summ, "estName") ########## The rest should probably be done in a separate function ################ .requireNS("ggplot2", "pwpp requires the 'ggplot2' package be installed.", call. 
= FALSE) # granulate values in each group so they won't overlap # do this on the transformed (plotted) scale byr = .find.by.rows(con.summ, by) for (r in byr) { pv = con.summ$p.value[r] con.summ$p.value[r] = gran(pv) } # form the reverse half & get midpoints con.summ$midpt = (as.numeric(con.summ$plus) + as.numeric(con.summ$minus)) / 2 tmp = con.summ tmp$plus = con.summ$minus tmp$minus = con.summ$plus con.summ = rbind(con.summ, tmp) # find ranges to ensure we get tick marks: exmaj = c(0, .pvmaj.brk) pvtmp = c(plim, con.summ$p.value) pvtmp = pvtmp[!is.na(pvtmp)] tick.min = max(exmaj[exmaj <= min(pvtmp)]) tick.max = min(exmaj[exmaj >= max(pvtmp)]) # args for geom_segment sarg = c(list(mapping = quote(ggplot2::aes_(xend = ~p.value, yend = ~midpt)), data = NULL, stat = "identity", position = "identity"), aes$segment) grobj = ggplot2::ggplot(data = con.summ, ggplot2::aes_(x = ~p.value, y = ~plus, color = ~minus, group = ~minus)) + do.call(ggplot2::geom_point, aes$point) + do.call(ggplot2::geom_segment, sarg) + ggplot2::geom_point(ggplot2::aes(x = tick.min, y = 1), alpha = 0) + ggplot2::geom_point(ggplot2::aes(x = tick.max, y = 1), alpha = 0) if (!is.null(by)) { cols = setdiff(by, rows) if (length(cols) > 0) ncols = length(unique(do.call(paste, unname(con.summ[cols])))) else { ncols = 1 cols = "." } grobj = grobj + ggplot2::facet_grid( as.formula(paste( paste(rows, collapse = "+"), "~", paste(cols, collapse = "+"))), labeller = "label_both") } else ncols = 1 if (values) { emm.summ$minus = emm.summ$pri.fac # for consistency in grouping/labeling dig = .opt.dig(emm.summ[, estName]) emm.summ$fmtval = format(emm.summ[[estName]], digits = dig) tminp = .pval.tran(min(c(tick.min, con.summ$p.value), na.rm=TRUE)) pos = .pval.inv(tminp - .025) lpad = .012 * (add.space + max(nchar(emm.summ$fmtval))) * ncols # how much space needed for labels rel to (0,1) lpad = lpad * (1.1 - tminp) # scale closer to actual width of scales lpos = .pval.inv(tminp - lpad) # pvalue at left end of label larg = c(list(mapping = quote(ggplot2::aes_(x = pos, y = ~minus, label = ~fmtval, hjust = "right")), data = emm.summ, stat = "identity", position = "identity"), aes$label) grobj = grobj + do.call(ggplot2::geom_label, larg) + ggplot2::geom_point(ggplot2::aes_(x = lpos, y = 1), alpha = 0) # invisible point to stake out space } else lpad = 0 .requireNS("scales", "pwpp requires the 'scales' package be installed.", call. 
= FALSE) .pvtrans = scales::trans_new("Scaled P value", transform = function(x) .pval.tran(x), inverse = function(p) .pval.inv(p), format = function(x) format(x, drop0trailing = TRUE, scientific = FALSE), domain = c(0,1) ) grobj = grobj + ggplot2::scale_x_continuous(trans = .pvtrans, breaks = .pvmaj.brk, minor_breaks = .pvmin.brk) + #### I plotted an extra point instead of expanding scale #### expand = ggplot2::expand_scale(add = c(.025 + lpad, .025))) + ggplot2::guides(color = "none") grobj + ggplot2::labs(x = xlab, y = ylab, caption = xsub) } # capitalize .cap = function(s) paste0(toupper(substring(s, 1, 1)), substring(s, 2)) ### Scale-transformation code: We stretch out small P values without stretching ### extremely small ones too much -- via a combination of log and normal cdf functions .tran.ctr = -2.5 # params of normal cdf transf of log(.value) .tran.sd = 3 .tran.div = pnorm(0, .tran.ctr, .tran.sd) .pvmaj.brk = c(.001, .01, .05, .1, .2, .5, 1) #.pvmin.brk = c(.0001, .0005, seq(.001, .009, by = .001), seq(.02 ,.09, by = .01), .2, .3, .4, .6, .7, .8, .9) .pvmin.brk = c(.0005, .001, .005, seq(.01 ,.1, by = .01), .2, .3, .4, .5, 1) # transforms x in (0, 1] to t in (0,1], while any x's < 0 or > 1 are preserved as is # This allows me to position marginal labels etc. where I want .pval.tran = function(x) { xx = sapply(x, function(.) min(max(., .00005), 1)) rtn = pnorm(log(xx), .tran.ctr, .tran.sd) / .tran.div spec = which((x < .00005) | (x > 1)) rtn[spec] = x[spec] rtn } .pval.inv = function(p) { pp = sapply(p, function(.) min(max(., .00005), .99995)) rtn = exp(qnorm(.tran.div * pp, .tran.ctr, .tran.sd)) spec = which((p < .00005) | (p > .99995)) rtn[spec] = p[spec] rtn } # For scale_x_continuous(): (moved to body of fcn so I can check w/ requyireNamespace) # .pvtrans = scales::trans_new("Scaled P value", # transform = function(x) .pval.tran(x), # inverse = function(p) .pval.inv(p), # format = function(x) format(x, drop0trailing = TRUE, scientific = FALSE), # domain = c(0,1) ) ### quick & dirty algorithm to Stretch out P values so more distinguishable on transformed scale gran = function(x, min_incr = .01) { savex = x x = x[!is.na(x)] if ((length(x) <= 3) || (diff(range(x)) == 0)) return(savex) kink = function(xx) { # linear spline basis; call with knot subtracted xx[xx < 0] = 0 xx } ### x[x < .00009] = .00009 # forces granulation of extremely small P # spread-out the p values less than .0004 rnk = rank(x) sm = which(x < .0004) if(length(sm) > 0) x[sm] = .0004 - .0003 * rnk[sm] / length(sm) ord = order(x) tx = log(x[ord]) df = diff(tx) incr = sapply(df, function(.) max(., min_incr)) ttx = cumsum(c(tx[1], incr)) if (length(x) > 5) tx = predict(lm(tx ~ ttx + kink(ttx + 1.5) + kink(ttx + 3) + kink(ttx + 4.5))) else tx = ttx tx[tx > 0] = 0 x[ord] = exp(tx) savex[!is.na(savex)] = x savex } #' Pairwise P-value matrix (plus other statistics) #' #' This function presents results from \code{emmeans} and pairwise comparisons #' thereof in a compact way. It displays a matrix (or matrices) of estimates, #' pairwise differences, and P values. The user may opt to exclude any of these #' via arguments \code{means}, \code{diffs}, and \code{pvals}, respectively. #' To control the direction of the pairwise differences, use \code{reverse}; #' and to control what appears in the upper and lower triangle(s), use \code{flip}. #' Optional arguments are passed to \code{contrast.emmGrid} and/or #' \code{summary.emmGrid}, making it possible to control what estimates #' and tests are displayed. 
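#'
#' For instance (a sketch; \code{warp.emm} is created in the examples below):
#' \preformatted{
#' pwpm(warp.emm, flip = TRUE)                  # differences above the diagonal, P values below
#' pwpm(warp.emm, means = FALSE, diffs = FALSE) # P values only
#' }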
#'
#' @param emm An \code{emmGrid} object
#' @param by Character vector of variable(s) in the grid to condition on. These
#' will create different matrices, one for each level or level-combination.
#' If missing, \code{by} is set to \code{emm@misc$by.vars}.
#' Grid factors not in \code{by} are the \emph{primary} factors,
#' whose levels or level combinations are compared pairwise.
#' @param reverse Logical value passed to \code{\link{pairs.emmGrid}}.
#' Thus, \code{FALSE} specifies \code{"pairwise"} comparisons
#' (earlier vs. later), and \code{TRUE} specifies \code{"revpairwise"}
#' comparisons (later vs. earlier).
#' @param pvals Logical value. If \code{TRUE}, the P values of the pairwise
#' comparisons are included in each matrix according to \code{flip}.
#' @param means Logical value. If \code{TRUE}, the estimated marginal means
#' (EMMs) from \code{emm} are included in the matrix diagonal(s).
#' @param diffs Logical value. If \code{TRUE}, the pairwise differences
#' of the EMMs are included in each matrix according to \code{flip}.
#' @param flip Logical value that determines where P values and differences
#' are placed. \code{FALSE} places the P values in the upper triangle
#' and differences in the lower, and \code{TRUE} does just the opposite.
#' @param digits Integer. Number of digits to display. If missing,
#' an optimal number of digits is determined.
#' @param ... Additional arguments passed to \code{\link{contrast.emmGrid}} and
#' \code{\link{summary.emmGrid}}. You should \emph{not} include \code{method}
#' here, because pairwise comparisons are always used.
#'
#' @return A matrix or `list` of matrices, one for each `by` level.
#'
#' @seealso A graphical display of essentially the same results is available
#' from \code{\link{pwpp}}
#' @export
#'
#' @examples
#' warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks)
#' warp.emm <- emmeans(warp.lm, ~ tension | wool)
#'
#' pwpm(warp.emm)
#'
#' # use dot options to specify noninferiority tests
#' pwpm(warp.emm, by = NULL, side = ">", delta = 5, adjust = "none")
pwpm = function(emm, by, reverse = FALSE,
                pvals = TRUE, means = TRUE, diffs = TRUE,
                flip = FALSE, digits, ...)
{
    if(missing(by))
        by = emm@misc$by.vars
    emm = update(emm, by = by)
    pri = paste(emm@misc$pri.vars, collapse = ":")
    mns = confint(emm, ...)
    mns$lbls = do.call(paste, c(unname(mns[attr(mns, "pri.vars")]),
                                sep = get_emm_option("sep")))
    estName = attr(mns, "estName")
    prs = test(pairs(emm, reverse = reverse, ...), ...)
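    # At this point, 'mns' holds the EMMs (with composite row/column labels)
    # and 'prs' holds the pairwise-comparison tests. Below, we record display
    # metadata, then build one square character matrix per 'by' group, filling
    # the triangles and diagonal according to 'flip', 'means', 'diffs', 'pvals'.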
diffName = attr(prs, "estName") null.hyp = "0" if (!reverse) trifcn = lower.tri else { flip = !flip trifcn = upper.tri } if (!is.null(prs$null)) { null.hyp = as.character(signif(unique(prs$null), digits = 5)) if (length(null.hyp) > 1) null.hyp = "(various values)" } mby = .find.by.rows(mns, by) pby = .find.by.rows(prs, by) if(opt.dig <- missing(digits)) { tmp = mns[[estName]] + mns[["SE"]] * cbind(rep(-2, nrow(mns)), 0, 2) digits = max(apply(tmp, 1, .opt.dig)) opt.dig = TRUE } result = lapply(seq_along(mby), function(i) { if(opt.dig) { pv = prs$p.value[pby[[i]]] fpv = sprintf("%6.4f", pv) fpv[pv < 0.0001] = "<.0001" } else fpv = format(prs$p.value[pby[[i]]], digits = digits) fmn = format(mns[mby[[i]], estName], digits = digits) fdiff = format(prs[pby[[i]], diffName], digits = digits) lbls = mns$lbls[mby[[i]]] n = length(lbls) mat = matrix("", nrow = n, ncol = n, dimnames = list(lbls, lbls)) if(pvals) { mat[trifcn(mat)] = fpv mat = t(mat) } if (diffs) mat[trifcn(mat)] = fdiff if (means) diag(mat) = fmn else { # trim off empty row and col idx = seq_len(n - 1) if (pvals && !diffs) mat = mat[idx, 1 + idx] if (!pvals && diffs) mat = mat[1 + idx, idx] } if (flip) t(mat) else mat }) if (reverse) flip = !flip names(result) = paste(paste(by, collapse = ", "), "=", names(mby)) if (length(result) == 1) result = result[[1]] class(result) = c("pwpm", "list") attr(result, "parms") = c(pvals = pvals, diffs = diffs, means = means, pri = pri, estName = estName, diffName = diffName, reverse = reverse, flip = flip, type = attr(mns, "type"), adjust = attr(prs, "adjust"), side = attr(prs, "side"), delta = attr(prs, "delta"), null = null.hyp) result } #' @export print.pwpm = function(x, ...) { parms = attr(x, "parms") attr(x, "class") = attr(x, "parms") = NULL if ((islist <- !is.matrix(x))) entries = seq_along(x) else { entries = 1 m = x } for (i in entries) { if (islist) { cat(paste0("\n", names(x)[i], "\n")) m = x[[i]] } if (parms["means"]) diag(m) = paste0("[", diag(m), "]") print(m, quote = FALSE, right = TRUE, na.print = "nonEst") } # print a parm and its name if present unless it's in excl # optional subst is NAMED vector where each possibilitty MUST be present catparm = function(f, excl = "0", delim = " ", quote = TRUE, subst) { if(!is.na(pf <- parms[f]) && !(pf %in% excl)) { if (!missing(subst)) pf = subst[pf] if (quote) pf = dQuote(pf) cat(paste0(delim, f, " = ", pf)) } } cat(paste0("\nRow and column labels: ", parms["pri"], "\n")) if (parms["pvals"]) { cat(paste0(ifelse(parms["flip"], "Lower", "Upper"), " triangle: P values ")) catparm("null", quote = FALSE) catparm("side", subst = c("-1" = "<", "1" = ">")) catparm("delta", quote = FALSE) catparm("adjust", "none") cat("\n") } if (parms["means"]) { cat(paste0("Diagonal: [Estimates] (", parms["estName"], ") ")) catparm("type", "link") cat("\n") } if (parms["diffs"]) { cat(paste0(ifelse(parms["flip"], "Upper", "Lower"), " triangle: Comparisons (", parms["diffName"], ") ")) if (parms["reverse"]) cat("later vs. earlier\n") else cat("earlier vs. later\n") } invisible(x) } emmeans/R/transformations.R0000644000176200001440000004515414137062735015475 0ustar liggesusers############################################################################## # Copyright (c) 2012-2016 Russell V. 
Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## # Code to implement transformations my way # Implementation of additional transformations, typically ones with parameters # Returns a list like stats::make.link, but often with an additional "param" member # types: # glog: log(mu + param) #' Response-transformation extensions #' #' The \code{make.tran} function creates the needed information to perform #' transformations of the response #' variable, including inverting the transformation and estimating variances of #' back-transformed predictions via the delta method. \code{make.tran} is #' similar to \code{\link{make.link}}, but it covers additional transformations. #' The result can be used as an environment in which the model is fitted, or as #' the \code{tran} argument in \code{\link{update.emmGrid}} (when the given #' transformation was already applied in an existing model). #' #' The functions \code{\link{emmeans}}, \code{\link{ref_grid}}, and related ones #' automatically detect response transformations that are recognized by #' examining the model formula. These are \code{log}, \code{log2}, \code{log10}, #' \code{log1p}, #' \code{sqrt}, \code{logit}, \code{probit}, \code{cauchit}, \code{cloglog}; as #' well as (for a response variable \code{y}) \code{asin(sqrt(y))}, #' \code{asinh(sqrt(y))}, and \code{sqrt(y) + sqrt(y+1)}. In addition, any #' constant multiple of these (e.g., \code{2*sqrt(y)}) is auto-detected and #' appropriately scaled (see also the \code{tran.mult} argument in #' \code{\link{update.emmGrid}}). #' #' A few additional character strings may be supplied as the \code{tran} #' argument in \code{\link{update.emmGrid}}: \code{"identity"}, #' \code{"1/mu^2"}, \code{"inverse"}, \code{"reciprocal"}, \code{"log10"}, \code{"log2"}, \code{"asin.sqrt"}, #' and \code{"asinh.sqrt"}. #' #' More general transformations may be provided as a list of functions and #' supplied as the \code{tran} argument as documented in #' \code{\link{update.emmGrid}}. The \code{make.tran} function returns a #' suitable list of functions for several popular transformations. Besides being #' usable with \code{update}, the user may use this list as an enclosing #' environment in fitting the model itself, in which case the transformation is #' auto-detected when the special name \code{linkfun} (the transformation #' itself) is used as the response transformation in the call. See the examples #' below. #' #' Most of the transformations available in "make.tran" require a parameter, #' specified in \code{param}; in the following discussion, we use \eqn{p} to #' denote this parameter, and \eqn{y} to denote the response variable. 
#' The \code{type} argument specifies the following transformations:
#' \describe{
#' \item{\code{"genlog"}}{Generalized logarithmic transformation: \eqn{log(y +
#' p)}, where \eqn{y > -p}}
#' \item{\code{"power"}}{Power transformation: \eqn{y^p}, where \eqn{y > 0}.
#' When \eqn{p = 0}, \code{"log"} is used instead}
#' \item{\code{"boxcox"}}{The Box-Cox transformation (unscaled by the geometric
#' mean): \eqn{(y^p - 1) / p}, where \eqn{y > 0}. When \eqn{p = 0}, \eqn{log(y)}
#' is used.}
#' \item{\code{"sympower"}}{A symmetrized power transformation on the whole real
#' line:
#' \eqn{abs(y)^p * sign(y)}. There are no restrictions on \eqn{y}, but we
#' require \eqn{p > 0} in order for the transformation to be monotone and
#' continuous.}
#' \item{\code{"asin.sqrt"}}{Arcsin-square-root transformation:
#' \eqn{sin^{-1}(\sqrt{y/p})}. Typically, the parameter \eqn{p} is equal to 1 for
#' a fraction, or 100 for a percentage.}
#' \item{\code{"bcnPower"}}{Box-Cox with negatives allowed, as described for the
#' \code{bcnPower} function in the \pkg{car} package. It is defined as the Box-Cox
#' transformation \eqn{(z^p - 1) / p} of the variable \eqn{z = y + (y^2+g^2)^{1/2}}.
#' This requires \code{param} to have two elements:
#' the power \eqn{p} and the offset \eqn{g > 0}.}
#' \item{\code{"scale"}}{This one is a little different from the others, in that
#' \code{param} is ignored; instead, \code{param} is determined by calling
#' \code{scale(y, ...)}. The user should give as \code{y} the response variable in the
#' model to be fitted to its scaled version.}
#' }
#' The user may include a second element in \code{param} to specify an
#' alternative origin (other than zero) for the \code{"power"}, \code{"boxcox"},
#' or \code{"sympower"} transformations. For example, \samp{type = "power",
#' param = c(1.5, 4)} specifies the transformation \eqn{(y - 4)^{1.5}}.
#' In the \code{"genlog"} transformation, a second \code{param} element may be
#' used to specify a base other than the default natural logarithm. For example,
#' \samp{type = "genlog", param = c(.5, 10)} specifies the \eqn{log10(y + .5)}
#' transformation. In the \code{"bcnPower"} transformation, the second element
#' is required and must be positive.
#'
#' For purposes of back-transformation, the \samp{sqrt(y) + sqrt(y+1)}
#' transformation is treated exactly the same way as \samp{2*sqrt(y)}, because
#' both are regarded as estimates of \eqn{2\sqrt{\mu}}.
#'
#' @param type The name of the transformation. See Details.
#' @param param Numeric parameter needed for the transformation. Optionally, it
#' may be a vector of two numeric values; the second element specifies an
#' alternative base or origin for certain transformations. See Details.
#' @param y,... Used only with \code{type = "scale"}. These parameters are
#' passed to \code{\link{scale}} to determine \code{param}.
#'
#' @return A \code{list} having at least the same elements as those returned by
#' \code{\link{make.link}}. The \code{linkfun} component is the transformation
#' itself.
#'
#' @note The \code{genlog} transformation is technically unneeded, because
#' a response transformation of the form \code{log(y + c)} is now auto-detected
#' by \code{\link{ref_grid}}.
#' @note We modify certain \code{\link{make.link}} results in transformations
#' where there is a restriction on valid prediction values, so that reasonable
#' inverse predictions are obtained, no matter what.
For example, if a #' \code{sqrt} transformation was used but a predicted value is negative, the #' inverse transformation is zero rather than the square of the prediction. A #' side effect of this is that it is possible for one or both confidence #' limits, or even a standard error, to be zero. #' @export #' #' @examples #' # Fit a model using an oddball transformation: #' bctran <- make.tran("boxcox", 0.368) #' warp.bc <- with(bctran, #' lm(linkfun(breaks) ~ wool * tension, data = warpbreaks)) #' # Obtain back-transformed LS means: #' emmeans(warp.bc, ~ tension | wool, type = "response") #' #' ### Using a scaled response... #' # Case where it is auto-detected: #' fib.lm <- lm(scale(strength) ~ diameter + machine, data = fiber) #' ref_grid(fib.lm) #' #' # Case where scaling is not auto-detected -- and what to do about it: #' fib.aov <- aov(scale(strength) ~ diameter + Error(machine), data = fiber) #' fib.rg <- suppressWarnings(ref_grid(fib.aov, at = list(diameter = c(20, 30)))) #' #' # Scaling was not retrieved, so we can do: #' fib.rg = update(fib.rg, tran = make.tran("scale", y = fiber$strength)) #' emmeans(fib.rg, "diameter") #' #' \dontrun{ #' ### An existing model 'mod' was fitted with a y^(2/3) transformation... #' ptran = make.tran("power", 2/3) #' emmeans(mod, "treatment", tran = ptran) #' } make.tran = function(type = c("genlog", "power", "boxcox", "sympower", "asin.sqrt", "bcnPower", "scale"), param = 1, y, ...) { type = match.arg(type) origin = 0 mu.lbl = "mu" if (length(param) > 1) { origin = param[2] param = param[1] mu.lbl = paste0("(mu - ", round(origin, 3), ")") } if(type == "scale") { sy = scale(y, ...) if(is.null(origin <- attr(sy, "scaled:center"))) origin = 0 if(is.null(param <- attr(sy, "scaled:scale"))) param = 1 remove(list = c("y", "sy")) # remove baggage from env } switch(type, genlog = { if((origin < 0) || (origin == 1)) stop('"genlog" transformation must have a positive base != 1') logbase = ifelse(origin == 0, 1, log(origin)) xlab = ifelse(origin == 0, "", paste0(" (base ", round(origin, 3), ")")) list(linkfun = function(mu) log(pmax(mu + param, 0)) / logbase, linkinv = function(eta) pmax(exp(logbase * eta), .Machine$double.eps) - param, mu.eta = function(eta) logbase * pmax(exp(logbase * eta), .Machine$double.eps), valideta = function(eta) TRUE, param = c(param, origin), name = paste0("log(mu + ", round(param,3), ")", xlab) ) }, power = { if (param == 0) { if(origin == 0) make.link("log") else make.tran("genlog", -origin) } else list( linkfun = function(mu) pmax(mu - origin, 0)^param, linkinv = function(eta) origin + pmax(eta, 0)^(1/param), mu.eta = function(eta) pmax(eta, 0)^(1/param - 1) / param, valideta = function(eta) all(eta > 0), param = c(param, origin), name = ifelse(param > 0, paste0(mu.lbl, "^", round(param,3)), paste0(mu.lbl, "^(", round(param,3), ")")) ) }, boxcox = { if (param == 0) { result = if(origin == 0) make.link("log") else make.tran("genlog", -origin) return (result) } min.eta = ifelse(param > 0, -1 / param, -Inf) xlab = ifelse(origin == 0, "", paste0(" with origin at ", round(origin, 3))) list( linkfun = function(mu) ((mu - origin)^param - 1) / param, linkinv = function(eta) origin + (1 + param * pmax(eta, min.eta))^(1/param), mu.eta = function(eta) (1 + param * pmax(eta, min.eta))^(1/param - 1), valideta = function(eta) all(eta > min.eta), param = c(param, origin), name = paste0("Box-Cox (lambda = ", round(param, 3), ")", xlab) ) }, sympower = { if (param <= 0) stop('"sympower" transformation requires positive param') if (origin == 0) 
mu.lbl = paste0("(", mu.lbl, ")") absmu.lbl = gsub("\\(|\\)", "|", mu.lbl) list(linkfun = function(mu) sign(mu - origin) * abs(mu - origin)^param, linkinv = function(eta) origin + sign(eta) * abs(eta)^(1/param), mu.eta = function(eta) (abs(eta))^(1/param - 1), valideta = function(eta) all(eta > min.eta), param = c(param, origin), name = paste0(absmu.lbl, "^", round(param,3), " * sign", mu.lbl) ) }, asin.sqrt = { mu.lbl = ifelse(param == 1, "mu", paste0("mu/", round(param,3))) list(linkfun = function(mu) asin(sqrt(mu/param)), linkinv = function(eta) param * sin(pmax(pmin(eta, pi/2), 0))^2, mu.eta = function(eta) param * sin(2*pmax(pmin(eta, pi/2), 0)), valideta = function(eta) all(eta <= pi/2) && all(eta >= 0), name = paste0("asin(sqrt(", mu.lbl, "))") ) }, bcnPower = { if(origin <= 0) stop ("The second parameter for 'bcnPower' must be strictly positive.") list( linkfun = function(mu) { s = sqrt(mu^2 + origin^2) if (abs(param) < 1e-10) log(.5*(mu + s)) else ((0.5 * (mu + s))^param - 1) / param }, linkinv = function(eta) { q = if (abs(param) < 1e-10) 2 * exp(eta) else 2 * (param * eta + 1) ^ (1/param) (q^2 - origin^2) / (2 * q) }, mu.eta = function(eta) { if (abs(param) < 1e-10) { q = 2 * exp(eta); dq = q } else { q = 2 * (param * eta + 1) ^ (1/param) dq = 2 * (param * eta + 1)^(1/param - 1) } 0.5 * (1 + (origin/q)^2) * dq }, valideta = function(eta) all(eta > 0), param = c(param, origin), name = paste0("bcnPower(", signif(param,3), ", ", signif(origin,3), ")") ) }, scale = list( linkfun = function(mu) (mu - origin) / param, linkinv = function(eta) param * eta + origin, mu.eta = function(eta) rep(param, length(eta)), valideta = function(eta) TRUE, name = paste0("scale(", signif(origin, 3), ", ", signif(param, 3), ")"), param = c(param, origin) ) ) } ### My modification/expansion of stats:make.link() ### Also, if not found, returns make.link("identity") modified with ## unknown = TRUE, name = link ## In addition, I make all links truly monotone on (-Inf, Inf) in ## lieu of valideta ## ## Extensions to make.link results: ## unknown: set to TRUE if link is unknown ## mult: scalar multiple of transformation ## .make.link = function(link) { if (link %in% c("logit", "probit", "cauchit", "cloglog", "identity", "log")) result = stats::make.link(link) else result = switch(link, sqrt = { tmp = make.link("sqrt") tmp$linkinv = function(eta) pmax(0, eta)^2 tmp$mu.eta = function(eta) 2*pmax(0, eta) tmp }, `1/mu^2` = { tmp = make.link("1/mu^2") tmp$linkinv = function(eta) 1/sqrt(pmax(0, eta)) tmp$mu.eta = function(eta) -1/(2*pmax(0, eta)^1.5) tmp }, inverse = { tmp = make.link("inverse") tmp$linkinv = function(eta) 1/pmax(0, eta) tmp$mu.eta = function(eta) -1/pmax(0, eta)^2 tmp }, `/` = .make.link("inverse"), reciprocal = .make.link("inverse"), log10 = list( linkfun = log10, linkinv = function(eta) 10^eta, mu.eta = function(eta) 10^eta * log(10), name = "log10" ), log2 = list( linkfun = log2, linkinv = function(eta) 2^eta, mu.eta = function(eta) 2^eta * log(2), name = "log2" ), log1p = list( linkfun = log1p, linkinv = expm1, mu.eta = exp, name = "log1p" ), asin.sqrt = make.tran("asin.sqrt"), `asin.sqrt./` = make.tran("asin.sqrt", 100), asinh.sqrt = list( linkinv = function(eta) sinh(eta)^2, mu.eta = function(eta) sinh(2 * eta), name = "asinh(sqrt(mu))" ), exp = list( linkinv = function(eta) log(eta), mu.eta = function(eta) 1/eta, name = "exp" ), `+.sqrt` = { tmp = .make.link("sqrt") tmp$mult = 2 tmp }, log.o.r. 
= { tmp = make.link("log") tmp$name = "log odds ratio" tmp }, { # default if not included, flags it as unknown tmp = stats::make.link("identity") tmp$unknown = TRUE tmp$name = link tmp } ) result } ### Internal routine to make a scales::trans object .make.scale = function(misc) { if (!requireNamespace("scales", quietly = TRUE)) stop("type = \"scale\" requires the 'scales' package to be installed") tran = misc$tran if (is.character(tran)) { # is it a canned scale? if ((length(intersect(names(misc), c("tran.mult", "tran.offset"))) == 0) && tran %in% c("log", "log1p", "log2", "log10", "sqrt", "logit", "probit", "exp", "identity")) return(get(paste(tran, "trans", sep = "_"), envir = asNamespace("scales"))()) # not built-in, so let's get a list tran = .make.link(tran) } # tran is a list. we'll incorporate any scaling tran$mult = ifelse(is.null(misc$tran.mult), 1, misc$tran.mult) tran$offset = ifelse(is.null(misc$tran.offset), 0, misc$tran.offset) with(tran, scales::trans_new(name, function(x) mult * linkfun(x + offset), function(z) linkinv(z / mult) - offset)) } emmeans/R/emmeans.R0000644000176200001440000010475014142775742013674 0ustar liggesusers############################################################################## # Copyright (c) 2012-2017 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # <https://www.r-project.org/Licenses/> and/or # # <https://www.gnu.org/licenses/>. # ############################################################################## # emmeans and related functions # emmeans utility for a list of specs emmeans.list = function(object, specs, ...) { result = list() nms = names(specs) # Format a string describing the results .make.desc = function(meth, pri, by) { pri = paste(pri, collapse = ", ") desc = paste(meth, "of", pri) if (!is.null(by)) { by = paste(by, collapse = ", ") desc = paste(desc, "|", by) } desc } name.arg = match.call()$name for (i in seq_len(length(specs))) { nm = nms[i] # We'll rename the contrasts if spec is named if (is.null(name.arg)) { contr.name = ifelse(nm == "", "contrast", paste0(nm, ".contrast")) res = emmeans(object=object, specs = specs[[i]], name = contr.name, ...) } else res = emmeans(object=object, specs = specs[[i]], ...) if (is.data.frame(res)) { # happens e.g. when cld is used if (is.null(nm)) nm = .make.desc("summary", attr(res, "pri.vars"), attr(res, "by.vars")) result[[nm]] = res } else if (is.list(res)) { for (j in seq_len(length(res))) { m = res[[j]]@misc if (is.null(nm)) names(res)[j] = .make.desc(m$methDesc, res[[1]]@misc$pri.vars, m$by.vars) else names(res)[j] = paste(nm, m$methDesc) } result = c(result,res) } else{ if (is.null(nm)) nm = .make.desc(res@misc$methDesc, res@misc$pri.vars, res@misc$by.vars) result[[nm]] = res } } class(result) = c("emm_list", "list") result } # # Generic for after we've gotten specs in character form # emmeans.character = function(object, specs, ...)
{ # UseMethod("emmeans.character") # } # # # Needed for model objects # emmeans.character.default = function(object, specs, trend, ...) { # if (!missing(trend)) { # warning("The `trend` argument is being deprecated. Use `emtrends()` instead.") # emtrends(object, specs, var = trend, ...) # } # else # emmeans.default(object, specs, ...) # } # Here's our flagship function! #' Estimated marginal means (Least-squares means) #' #' Compute estimated marginal means (EMMs) for specified factors #' or factor combinations in a linear model; and optionally, comparisons or #' contrasts among them. EMMs are also known as least-squares means. #' #' Users should also consult the documentation for \code{\link{ref_grid}}, #' because many important options for EMMs are implemented there, via the #' \code{...} argument. #' #' @param object An object of class \code{emmGrid}; or a fitted model object #' that is supported, such as the result of a call to \code{lm} or #' \code{lmer}. Many fitted-model objects are supported; see #' \href{../doc/models.html}{\code{vignette("models", "emmeans")}} for details. #' @param specs A \code{character} vector specifying the names of the predictors #' over which EMMs are desired. \code{specs} may also be a \code{formula} #' or a \code{list} (optionally named) of valid \code{spec}s. Use of formulas #' is described in the Overview section below. #' @param by A character vector specifying the names of predictors to condition on. #' @param fac.reduce A function that combines the rows of a matrix into a single #' vector. This implements the ``marginal averaging'' aspect of EMMs. #' The default is the mean of the rows. Typically if it is overridden, #' it would be some kind of weighted mean of the rows. If \code{fac.reduce} is #' nonlinear, bizarre results are likely, and EMMs will not be #' interpretable. NOTE: If the \code{weights} argument is non-missing, #' \code{fac.reduce} is ignored. #' @param contr A character value or \code{list} specifying contrasts to be #' added. See \code{\link{contrast}}. NOTE: \code{contr} is ignored when #' \code{specs} is a formula. #' @param options If non-\code{NULL}, a named \code{list} of arguments to pass #' to \code{\link{update.emmGrid}}, just after the object is constructed. #' (Options may also be included in \code{...}; see the \sQuote{options} #' section below.) #' @param weights Character value, numeric vector, or numeric matrix specifying #' weights to use in averaging predictions. See \dQuote{Weights} section below. #' Also, if \code{object} is not already a reference grid, \code{weights} #' (if it is character) is passed to \code{ref_grid} as \code{wt.nuis} in case #' nuisance factors are specified. We can override this by specifying #' \code{wt.nuis} explicitly. #' This more-or-less makes the weighting of nuisance factors consistent with #' that of primary factors. #' @param offset Numeric vector or scalar. If specified, this adds an offset to #' the predictions, or overrides any offset in the model or its #' reference grid. If it is a vector whose length differs from the number of rows in #' the result, it is subsetted or cyclically recycled. #' @param trend This is now deprecated. Use \code{\link{emtrends}} instead. #' @param ... When \code{object} is not already an \code{"emmGrid"} #' object, these arguments are passed to \code{\link{ref_grid}}. Common #' examples are \code{at}, \code{cov.reduce}, \code{data}, \code{type}, #' \code{transform}, \code{df}, \code{nesting}, and \code{vcov.}.
#' Model-type-specific options (see #' \href{../doc/models.html}{\code{vignette("models", "emmeans")}}), commonly #' \code{mode}, may be used here as well. In addition, if the model formula #' contains references to variables that are not predictors, you must provide #' a \code{params} argument with a list of their names. #' #' These arguments may also be used in lieu of \code{options}. See the #' \sQuote{Options} section below. #' @param tran Placeholder to prevent it from being included in \code{...}. #' If non-missing, it is added to `options`. See the \sQuote{Options} #' section. #' #' @return When \code{specs} is a \code{character} vector or one-sided formula, #' an object of class \code{"emmGrid"}. A number of methods #' are provided for further analysis, including #' \code{\link{summary.emmGrid}}, \code{\link{confint.emmGrid}}, #' \code{\link{test.emmGrid}}, \code{\link{contrast.emmGrid}}, #' and \code{\link{pairs.emmGrid}}. #' When \code{specs} is a \code{list} or a \code{formula} having a left-hand #' side, the return value is an \code{\link{emm_list}} object, which is simply a #' \code{list} of \code{emmGrid} objects. #' #' @section Overview: #' Estimated marginal means or EMMs (sometimes called least-squares means) are #' predictions from a linear model over a \emph{reference grid}; or marginal #' averages thereof. The \code{\link{ref_grid}} function identifies/creates the #' reference grid upon which \code{emmeans} is based. #' #' For those who prefer the terms \dQuote{least-squares means} or #' \dQuote{predicted marginal means}, functions \code{lsmeans} and #' \code{pmmeans} are provided as wrappers. See \code{\link{wrappers}}. #' #' If \code{specs} is a \code{formula}, it should be of the form \code{~ specs}, #' \code{~ specs | by}, \code{contr ~ specs}, or \code{contr ~ specs | by}. The #' formula is parsed and the variables therein are used as the arguments #' \code{specs}, \code{by}, and \code{contr} as indicated. The left-hand side is #' optional, but if specified it should be the name of a contrast family (e.g., #' \code{pairwise}). Operators like #' \code{*} or \code{:} are needed in the formula to delineate names, but #' otherwise are ignored. #' #' In the special case where the mean (or weighted mean) of all the predictions #' is desired, specify \code{specs} as \code{~ 1} or \code{"1"}. #' #' A number of standard contrast families are provided. They can be identified #' as functions having names ending in \code{.emmc} -- see the documentation #' for \code{\link{emmc-functions}} for details -- including how to write your #' own \code{.emmc} function for custom contrasts. #' #' @section Weights: #' If \code{weights} is a vector, its length must equal #' the number of predictions to be averaged to obtain each EMM. #' If a matrix, each row of the matrix is used in turn, wrapping back to the #' first row as needed. When in doubt about what is being averaged (or how #' many), first call \code{emmeans} with \code{weights = "show.levels"}. #' #' If \code{weights} is a string, it should partially match one of the following: #' \describe{ #' \item{\code{"equal"}}{Use an equally weighted average.} #' \item{\code{"proportional"}}{Weight in proportion to the frequencies (in the #' original data) of the factor combinations that are averaged over.} #' \item{\code{"outer"}}{Weight in proportion to each individual factor's #' marginal frequencies. 
Thus, the weights for a combination of factors are the #' outer product of the one-factor margins} #' \item{\code{"cells"}}{Weight according to the frequencies of the cells being #' averaged.} #' \item{\code{"flat"}}{Give equal weight to all cells with data, and ignore #' empty cells.} #' \item{\code{"show.levels"}}{This is a convenience feature for understanding #' what is being averaged over. Instead of a table of EMMs, this causes the #' function to return a table showing the levels that are averaged over, in the #' order that they appear.} #' } #' Outer weights are like the 'expected' counts in a chi-square test of #' independence, and will yield the same results as those obtained by #' proportional averaging with one factor at a time. All except \code{"cells"} #' use the same set of weights for each mean. In a model where the predicted #' values are the cell means, cell weights will yield the raw averages of the #' data for the factors involved. Using \code{"flat"} is similar to #' \code{"cells"}, except nonempty cells are weighted equally and empty cells #' are ignored. #' #' #' @section Offsets: #' Unlike in \code{ref_grid}, an offset need not be scalar. If not enough values #' are supplied, they are cyclically recycled. For a vector of offsets, it is #' important to understand that the ordering of results goes with the first #' name in \code{specs} varying fastest. If there are any \code{by} factors, #' those vary slower than all the primary ones, but the first \code{by} variable #' varies the fastest within that hierarchy. See the examples. #' #' @section Options and \code{...}: #' Arguments that could go in \code{options} may instead be included in \code{...}, #' typically, arguments such as \code{type}, \code{infer}, etc. that in essence #' are passed to \code{\link{summary.emmGrid}}. If an argument appears in both places, #' the one in \code{...} is used. #' #' There is a danger that \code{...} arguments could partially match those used #' by both \code{ref_grid} and \code{update.emmGrid}, creating a conflict. #' If such a conflict occurs, usually it can be resolved by providing complete (or at least #' longer) argument names; or by isolating non-\code{ref_grid} arguments in #' \code{options}; or by calling \code{ref_grid} separately and passing the #' result as \code{object}. See a not-run example below. #' #' Also, when \code{specs} is a two-sided formula, or \code{contr} is specified, #' there is potential confusion concerning which \code{...} arguments #' apply to the means, and which to the contrasts. When such confusion is possible, #' we suggest doing things separately #' (a call to \code{emmeans} with no contrasts, followed by a call to #' \code{\link{contrast}}). We do treat #' \code{adjust} as a special case: it is applied to the \code{emmeans} results #' \emph{only} if there are #' no contrasts specified; otherwise, it is passed to \code{contrast}. #' #' @export #' #' @seealso \code{\link{ref_grid}}, \code{\link{contrast}}, #' \href{../doc/models.html}{vignette("models", "emmeans")} #' #' @examples #' warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks) #' emmeans (warp.lm, ~ wool | tension) #' # or equivalently emmeans(warp.lm, "wool", by = "tension") #' #' # 'adjust' argument ignored in emmeans, passed to contrast part... #' emmeans (warp.lm, poly ~ tension | wool, adjust = "sidak") #' #' \dontrun{ #' # 'adjust' argument NOT ignored ...
#' emmeans (warp.lm, ~ tension | wool, adjust = "sidak") #' } #' #' #' \dontrun{ #' ### Offsets: Consider a silly example: #' emmeans(warp.lm, ~ tension | wool, offset = c(17, 23, 47)) @ grid #' # note that offsets are recycled so that each level of tension receives #' # the same offset for each wool. #' # But using the same offsets with ~ wool | tension will probably not #' # be what you want because the ordering of combinations is different. #' #' ### Conflicting arguments... #' # This will error because 'tran' is passed to both ref_grid and update #' emmeans(some.model, "treatment", tran = "log", type = "response") #' #' # Use this if the response was a variable that is the log of some other variable #' # (Keep 'tran' from being passed to ref_grid) #' emmeans(some.model, "treatment", options = list(tran = "log"), type = "response") #' #' # This will re-grid the result as if the response had been log-transformed #' # ('transform' is passed only to ref_grid, not to update) #' emmeans(some.model, "treatment", transform = "log", type = "response") #' } emmeans = function(object, specs, by = NULL, fac.reduce = function(coefs) apply(coefs, 2, mean), contr, options = get_emm_option("emmeans"), weights, offset, trend, ..., tran) { if(!is(object, "emmGrid")) { args = .zap.args(object = object, ..., omit = "submodel") if (is.null(args$wt.nuis)) # pass weights as wt.nuis args$wt.nuis = ifelse(!missing(weights) && is.character(weights), weights, "equal") object = do.call(ref_grid, args) } if (is.list(specs)) { return (emmeans.list(object, specs, by = by, contr = contr, weights = weights, offset = offset, trend = trend, ...)) } if (inherits(specs, "formula")) { spc = .parse.by.formula(specs) specs = spc$rhs if (length(spc$by) > 0) by = setdiff(union(spc$by, by), spc$rhs) if (length(spc$lhs) > 0) contr = spc$lhs } if (!missing(trend)) { stop("The 'trend' argument has been deprecated. Use 'emtrends()' instead.") } if (!missing(tran)) { options $tran = tran } # This was added in 1.47, but causes problems # if((length(specs) == 1) && (specs == "1")) # specs = character(0) if(is.null(nesting <- object@model.info$nesting)) { RG = object facs = union(specs, by) # Check that grid is complete # This isn't a 100% reliable check, but... 
if(nrow(RG@grid) != prod(sapply(RG@levels, length))) stop("Irregular reference grid: Marginal means cannot be determined.\n", "You can possibly fix this with the 'force_regular' function.") if (!is.null(RG@misc$display)) { RG@misc$display = NULL warning("emmeans() results may be corrupted by removal of a nesting structure") } # Ensure object is in standard order ord = .std.order(RG@grid, RG@levels) ###do.call(order, unname(RG@grid[rev(names(RG@levels))])) if(any(ord != seq_along(ord))) RG = RG[ord] # xxx if ((length(facs) == 1) && (facs == "1")) { ### just want grand mean if("1" %in% facs) { RG@levels[["1"]] = "overall" RG@grid[ ,"1"] = 1 } # Figure out the structure of the grid wgt = RG@grid[[".wgt."]] if(!is.null(wgt) && all(zapsmall(wgt) == 0)) wgt = wgt + 1 ### repl all zero wgts with 1 dims = sapply(RG@levels, length) row.idx = array(seq_len(nrow(RG@linfct)), dims) use.mars = match(facs, names(RG@levels)) # which margins to use avgd.mars = setdiff(seq_along(dims)[dims>1], use.mars) # margins that we average over # Reconcile weights, if there are any margins left if ((length(avgd.mars) > 0) && !missing(weights)) { if (is.character(weights)) { if (is.null(wgt)) warning("'weights' requested but no weighting information is available") else { wopts = c("equal","proportional","outer","cells","flat","show.levels","invalid") weights = switch(wopts[pmatch(weights, wopts, 7)], equal = rep(1, prod(dims[avgd.mars])), proportional = as.numeric(apply(row.idx, avgd.mars, function(idx) sum(wgt[idx]))), outer = { ftbl = apply(row.idx, avgd.mars, function(idx) sum(wgt[idx])) # Fix up the dimensions ftbl = array(ftbl, dim(row.idx)[avgd.mars]) w = N = sum(ftbl) for (d in seq_along(dim(ftbl))) w = outer(w, apply(ftbl, d, sum) / N) as.numeric(w) }, cells = "fq", flat = "fl", show.levels = { cat("emmeans are obtained by averaging over these factor combinations\n") return(do.call(expand.grid, RG@levels[avgd.mars])) }, invalid = stop("Invalid 'weights' option: '", weights, "'") ) } } if (is.matrix(weights)) { wtrow = 0 fac.reduce = function(coefs) { wtmat = .diag(weights[wtrow+1, ]) / sum(weights[wtrow+1, ]) ans = apply(wtmat %*% coefs, 2, sum) wtrow <<- (1 + wtrow) %% nrow(weights) ans } } else if (is.numeric(weights)) { wtmat = .diag(weights) wtsum = sum(weights) if (wtsum <= 1e-8) wtsum = NA fac.reduce = function(coefs) { if (nrow(coefs) != nrow(wtmat)) stop("Nonconforming number of weights -- need ", nrow(coefs)) apply(wtmat %*% coefs, 2, sum) / wtsum } } } # Get the required factor combs levs = list() for (f in facs) { levs[[f]] = RG@levels[[f]] if (!hasName(levs, f)) stop(paste("No variable named", f, "in the reference grid")) } combs = do.call("expand.grid", levs) if (!missing(weights) && is.character(weights) && (weights %in% c("fq", "fl"))) K = apply(row.idx, use.mars, function(idx) { fq = RG@grid[[".wgt."]][idx] if (weights == "fl") fq = 0 + (fq > 0) # fq = 1 if > 0, else 0 apply(.diag(fq) %*% RG@linfct[idx, , drop=FALSE], 2, sum) / sum(fq) }) else K = apply(row.idx, use.mars, function(idx) { fac.reduce(RG@linfct[idx, , drop=FALSE]) }) linfct = t(matrix(K, nrow = ncol(RG@linfct), dimnames = list(colnames(RG@linfct), NULL))) if(.some.term.contains(union(facs, RG@roles$trend), RG@model.info$terms)) if(get_emm_option("msg.interaction")) message("NOTE: Results may be misleading due ", "to involvement in interactions") # Figure offset, if any if (hasName(RG@grid, ".offset.")) { combs[[".offset."]] = as.numeric(apply(row.idx, use.mars, function(idx) fac.reduce(as.matrix(RG@grid[idx, ".offset.", drop = 
FALSE])))) } avgd.over = names(RG@levels[avgd.mars]) # add/override .offset. column if requested if(!missing(offset)) { combs[[".offset."]] = rep(offset, nrow(combs))[seq_len(nrow(combs))] } # Update .wgt column of grid, if it exists if (!is.null(wgt)) { combs[[".wgt."]] = as.numeric(apply(row.idx, use.mars, function(idx) sum(wgt[idx]))) } RG@roles$responses = character() RG@misc$is.new.rg = NULL RG@misc$famSize = nrow(linfct) if(RG@misc$estName == "prediction") RG@misc$estName = "emmean" RG@misc$adjust = "none" RG@misc$infer = c(TRUE,FALSE) RG@misc$pri.vars = setdiff(facs, by) RG@misc$by.vars = by RG@misc$avgd.over = union(RG@misc$avgd.over, avgd.over) RG@misc$methDesc = "emmeans" RG@roles$predictors = names(levs) ### Pass up 'new' as we're not changing its class result = new("emmGrid", RG, linfct = linfct, levels = levs, grid = combs) result = as.emmGrid(RG) result@linfct = linfct result@levels = levs result@grid = combs result = if (missing(contr)) .update.options(result, options, ...) else .update.options(result, options, ..., exclude = "adjust") } else { # handle a nested structure object@model.info$nesting = NULL result = .nested_emm(object, specs, by = by, fac.reduce = fac.reduce, options = options, weights = weights, offset = offset, nesting = nesting) if(!is.null(type <- list(...)$type)) result = update(result, type = type) } if(!missing(contr)) { # return a list with emmeans and contrasts # NULL-out a bunch of arguments to not pass. dontpass = c("data", "avgd.over", "by.vars", "df", "initMesg", "estName", "estType", "famSize", "inv.lbl", "methDesc", "nesting", "pri.vars", "tran", "tran.mult", "tran.offset", "tran2", "is.new.rg") args = .zap.args(object = result, method = contr, by = by, ..., omit = dontpass) ctrs = do.call(contrast, args) result = .cls.list("emm_list", emmeans = result, contrasts = ctrs) if(!is.null(lbl <- object@misc$methDesc)) names(result)[1] = lbl } result } # Construct a new emmGrid object with given arguments #' Construct an \code{emmGrid} object from scratch #' #' This allows the user to incorporate results obtained by some analysis #' into an \code{emmGrid} object, enabling the use of \code{emmGrid} methods #' to perform related follow-up analyses. #' #' The arguments must be conformable. This includes that the length of #' \code{bhat}, the number of columns of \code{linfct}, and the number of #' columns of \code{post.beta} must all be equal. And that the product of #' lengths in \code{levels} must be equal to the number of rows of #' \code{linfct}. The \code{grid} slot of the returned object is generated #' by \code{\link{expand.grid}} using \code{levels} as its arguments. So the #' rows of \code{linfct} should be in corresponding order. #' #' The functions \code{qdrg} and \code{\link{emmobj}} are close cousins, in that #' they both produce \code{emmGrid} objects. When starting with summary #' statistics for an existing grid, \code{emmobj} is more useful, while #' \code{qdrg} is more useful when starting from an unsupported fitted model. #' #' #' @param bhat Numeric. Vector of regression coefficients #' @param V Square matrix. Covariance matrix of \code{bhat} #' @param levels Named list or vector. Levels of factor(s) that define the #' estimates defined by \code{linfct}. If not a list, we assume one factor #' named \code{"level"} #' @param linfct Matrix. Linear functions of \code{bhat} for each combination #' of \code{levels}. #' @param df Numeric value or function with arguments \code{(x, dfargs)}. 
If a #' number, that is used for the degrees of freedom. If a function, it should #' return the degrees of freedom for \code{sum(x*bhat)}, with any additional #' parameters in \code{dfargs}. #' @param dffun Overrides \code{df} if specified. This is a convenience #' to match the slot names of the returned object. #' @param dfargs List containing arguments for \code{df}. #' This is ignored if df is numeric. #' @param post.beta Matrix whose columns comprise a sample from the posterior #' distribution of the regression coefficients (so that typically, the column #' averages will be \code{bhat}). A 1 x 1 matrix of \code{NA} indicates that #' such a sample is unavailable. #' @param nesting Nesting specification as in \code{\link{ref_grid}}. This is #' ignored if \code{model.info} is supplied. #' @param ... Arguments passed to \code{\link{update.emmGrid}} #' #' @seealso \code{\link{qdrg}}, an alternative that is useful when starting #' with a fitted model not supported in \pkg{emmeans}. #' #' @return An \code{emmGrid} object #' @export #' #' @examples #' # Given summary statistics for 4 cells in a 2 x 2 layout, obtain #' # marginal means and comparisons thereof. Assume heteroscedasticity #' # and use the Satterthwaite method #' levels <- list(trt = c("A", "B"), dose = c("high", "low")) #' ybar <- c(57.6, 43.2, 88.9, 69.8) #' s <- c(12.1, 19.5, 22.8, 43.2) #' n <- c(44, 11, 37, 24) #' se2 = s^2 / n #' Satt.df <- function(x, dfargs) #' sum(x * dfargs$v)^2 / sum((x * dfargs$v)^2 / (dfargs$n - 1)) #' #' expt.rg <- emmobj(bhat = ybar, V = diag(se2), #' levels = levels, linfct = diag(c(1, 1, 1, 1)), #' df = Satt.df, dfargs = list(v = se2, n = n), estName = "mean") #' plot(expt.rg) #' #' ( trt.emm <- emmeans(expt.rg, "trt") ) #' ( dose.emm <- emmeans(expt.rg, "dose") ) #' #' rbind(pairs(trt.emm), pairs(dose.emm), adjust = "mvt") emmobj = function(bhat, V, levels, linfct = diag(length(bhat)), df = NA, dffun, dfargs = list(), post.beta = matrix(NA), nesting = NULL, ...) { if ((nrow(V) != ncol(V)) || (nrow(V) != ncol(linfct)) || (length(bhat) != ncol(linfct))) stop("bhat, V, and linfct are incompatible") if (!is.list(levels)) levels = list(level = levels) grid = do.call(expand.grid, levels) if (nrow(grid) != nrow(linfct)) stop("linfct should have ", nrow(grid), "rows") pri.vars = names(grid) dotargs = list(...) 
model.info = dotargs$model.info if(is.null(model.info)) model.info = list(call = str2lang("emmobj"), xlev = levels, nesting = .parse_nest(nesting)) roles = list(predictors= names(grid), responses=character(0), multresp=character(0)) for (nm in names(dotargs$extras)) grid[[nm]] = dotargs$extras[[nm]] if (!missing(dffun)) df = dffun if (is.function(df)) { dffun = df } else { dffun = function(x, dfargs) dfargs$df dfargs = list(df = df) } misc = list(estName = "estimate", estType = "prediction", infer = c(TRUE,FALSE), level = .95, adjust = "none", famSize = nrow(linfct), avgd.over = character(0), pri.vars = pri.vars, methDesc = "emmobj", display = dotargs$display) result = new("emmGrid", model.info=model.info, roles=roles, grid=grid, levels = levels, matlevs=list(), linfct=linfct, bhat=bhat, nbasis=all.estble, V=V, dffun=dffun, dfargs=dfargs, misc=misc, post.beta=post.beta) dotargs$model.info = dotargs$extras = dotargs$display = NULL do.call(update, c(object = result, dotargs, silent = TRUE)) } #' Convert to and from \code{emmGrid} objects #' #' These are useful utility functions for creating a compact version of an #' \code{emmGrid} object that may be saved and later reconstructed, or for #' converting old \code{ref.grid} or \code{lsmobj} objects into \code{emmGrid} #' objects. #' #' An \code{emmGrid} object is an S4 object, and as such cannot be saved in a #' text format or without a lot of overhead. By using \code{as.list}, #' the essential parts of the object are converted to a list format that can be #' easily and compactly saved for use, say, in another session or by another user. #' Providing this list as the arguments for \code{\link{emmobj}} allows the user #' to restore a working \code{emmGrid} object. #' #' @param object Object to be converted to class \code{emmGrid}. It may #' be a \code{list} returned by \code{as.list.emmGrid}, or a \code{ref.grid} #' or \code{lsmobj} object created by \pkg{emmeans}'s predecessor, the #' \pkg{lsmeans} package. An error is thrown if \code{object} cannot #' be converted. #' @param ... In \code{as.emmGrid}, additional arguments passed to #' \code{\link{update.emmGrid}} before returning the object. This #' argument is ignored in \code{as.list.emmGrid}. #' #' @return \code{as.emmGrid} returns an object of class \code{emmGrid}. #' In fact, both \code{as.emmGrid} and \code{as.emm_list} check for an #' attribute in \code{object} to decide whether to return an \code{emmGrid} #' or \code{emm_list} object. #' #' @seealso \code{\link{emmobj}} #' @export #' #' @examples #' pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs) #' pigs.sav <- as.list(ref_grid(pigs.lm)) #' #' pigs.anew <- as.emmGrid(pigs.sav) #' emmeans(pigs.anew, "source") as.emmGrid = function(object, ...) { if ((cls <- class(object)[1]) %in% c("ref.grid", "lsmobj")) { object = as.list.emmGrid(object) if (is.null(object$misc$is.new.rg)) object$misc$is.new.rg = (cls == "ref.grid") } # above keeps us from having to define these classes in emmeans if (is.list(object)) { if (!is.null(attr(object, "emm_list"))) return(as.emm_list(object)) else result = do.call(emmobj, object) } else { result = try(as(object, "emmGrid", strict = FALSE), silent = TRUE) if (inherits(result, "try-error")) stop("Object cannot be coerced to class 'emmGrid'") } update(result, ...) } #' @rdname as.emmGrid #' @order 2 #' @param x An \code{emmGrid} object #' @param model.info.slot Logical value: Include the \code{model.info} slot?
#' Set this to \code{TRUE} if you want to preserve the original call and #' information needed by the \code{submodel} option. #' If \code{FALSE}, only the nesting information (if any) is saved #' @return \code{as.list.emmGrid} returns an object of class \code{list}. #' @method as.list emmGrid #' @export as.list.emmGrid = function(x, model.info.slot = FALSE, ...) { slots = c("bhat", "V", "levels", "linfct", "dffun", "dfargs", "post.beta") result = lapply(slots,function(nm) slot(x, nm)) names(result) = slots result = c(result, x@misc) if(model.info.slot) result$model.info = x@model.info else result$nesting = x@model.info$nesting nm = intersect(names(x@grid), c(".wgt.", ".offset.")) if (length(nm) > 0) result$extras = x@grid[nm] result$pri.vars = NULL result } #### --- internal stuff used only by emmeans ------------- # Check if model contains a term containing all elts of facs # Note: if an lstrends call, we want to include trend var in facs # terms is terms() component of model .some.term.contains = function(facs, terms) { for (trm in attr(terms, "term.labels")) { if(all(sapply(facs, function(f) length(grep(f,trm))>0))) if (length(.all.vars(as.formula(paste("~",trm)))) > length(facs)) return(TRUE) } return(FALSE) } ### Sort grid in standard order according to ordering of entries in levels. ### Thus .std.order(do.call(expand.grid, levels), levels) --> 1,2,...,nrow .std.order = function(grid, levels) { tmp = lapply(rev(names(levels)), function(nm) { x = grid[[nm]] if (inherits(x, "factor")) as.integer(x) else as.integer(factor(x, levels = as.character(levels[[nm]]))) # Note: need as.character(levels) here so we handle such as Date vectors correctly }) do.call(order, tmp) } emmeans/R/lqm-support.R0000644000176200001440000001050714137062735014541 0ustar liggesusers############################################################################## # Copyright (c) 2012-2020 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## # lqmm and lqm support recover_data.lqmm = function(object, data = object$mfArgs$data, ...) { fcall = object$call trms = delete.response(terms(eval(fcall$fixed))) recover_data(fcall, trms, object$mfArgs$na.action, data = data, ...) } emm_basis.lqmm = function(object, trms, xlev, grid, tau = 0.5, ...) { taudiff = abs(object$tau - tau) col = which(taudiff < 0.0001) if (length(col) == 0) stop("No coefficients available for tau = ", tau) bhat = coef(object) # Very touchy here because their boot() function doesn't take dots... 
nm = intersect(names(list(...)), c("method", "R", "seed", "startQR")) vargs = c(list(object = object, covariance = TRUE), list(...)[nm]) V = do.call("summary", vargs)$Cov if (length(taudiff) > 1) { bhat = bhat[, col[1]] V = V[, , col] } m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) X = model.matrix(trms, m, contrasts.arg = object$contrasts) nbasis = estimability::all.estble dfargs = list(df = object$rdf) dffun = function(k, dfargs) dfargs$df list(X = X, bhat = bhat, nbasis = nbasis, V = V, dffun = dffun, dfargs = dfargs, misc = list()) } # Use same functions for lqm objects recover_data.lqm = function(object, ...) { recover_data.lm(object, frame = NULL, ...) } emm_basis.lqm = function(object, ...) emm_basis.lqmm(object, ...) #### rq objects (quantreg) recover_data.rq = function(object, ...) { recover_data.lm(object, frame = object$model, ...) } emm_basis.rq = function(object, trms, xlev, grid, tau = 0.5, ...) { taudiff = abs(object$tau - tau) col = which(taudiff < 0.0001) if (length(col) == 0) stop("No coefficients available for tau = ", tau) bhat = object$coefficients summ = summary(object, covariance = TRUE, ...) if (length(taudiff) == 1) { V = summ$cov df = summ$rdf } else { bhat = bhat[, col[1]] V = summ[[col]] $ cov df = summ[[col]] $ rdf } nm = if(is.null(names(bhat))) row.names(bhat) else names(bhat) m = suppressWarnings(model.frame(trms, grid, na.action = na.pass, xlev = xlev)) X = model.matrix(trms, m, contrasts.arg = object$contrasts) assign = attr(X, "assign") X = X[, nm, drop = FALSE] bhat = as.numeric(bhat) nbasis = estimability::all.estble misc = list() dfargs = list(df = df) dffun = function(k, dfargs) dfargs$df list(X = X, bhat = bhat, nbasis = nbasis, V = V, dffun = dffun, dfargs = dfargs, misc = misc) } # we just reroute rqs objects to emm_basis.rq, as pretty similar recover_data.rqs = recover_data.rq emm_basis.rqs = function(object, ...) emm_basis.rq(object, ...) emmeans/R/datasets.R0000644000176200001440000003760614137062735014057 0ustar liggesusers############################################################################## # Copyright (c) 2012-2017 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## # datasets (provided as .rda files in data/) -- this file is for documentation ### auto.noise ### #' Auto Pollution Filter Noise #' #' Three-factor experiment comparing pollution-filter noise for two filters, #' three sizes of cars, and two sides of the car. #' #' The data are from a statement by Texaco, Inc., to the Air and Water Pollution #' Subcommittee of the Senate Public Works Committee on June 26, 1973. #' Mr. John McKinley, President of Texaco, cited an automobile filter developed #' by Associated Octel Company as effective in reducing pollution. 
However, #' questions had been raised about the effects of filters on vehicle performance, #' fuel consumption, exhaust gas back pressure, and silencing. On the last #' question, he referred to the data included here as evidence that the silencing #' properties of the Octel filter were at least equal to those of standard silencers. #' #' @format A data frame with 36 observations on the following 4 variables. #' \describe{ #' \item{\code{noise}}{Noise level in decibels (but see note) - a numeric vector.} #' \item{\code{size}}{The size of the vehicle - an ordered factor with #' levels \code{S}, \code{M}, \code{L}.} #' \item{\code{type}}{Type of anti-pollution filter - a factor with levels #' \code{Std} and \code{Octel}.} #' \item{\code{side}}{The side of the car where measurement was taken -- a #' factor with levels \code{L} and \code{R}.} #' } #' @source The dataset was obtained from the Data and Story Library (DASL) #' at Carnegie-Mellon University. Apparently it has since been removed. The #' original dataset was altered by assigning meaningful names to the factors #' and sorting the observations in random order as if this were the run order #' of the experiment. #' @note While the data source claims that \code{noise} is measured in decibels, #' the values are implausible. I believe that these measurements are actually #' in tenths of dB (centibels?). Looking at the values in the dataset, note #' that every measurement ends in 0 or 5, and it is reasonable to believe that #' measurements are accurate to the nearest half of a decibel. #' %%% Thanks to an email communication from a speech/hearing scientist #' @examples #' # (Based on belief that noise/10 is in decibel units) #' noise.lm <- lm(noise/10 ~ size * type * side, data = auto.noise) #' #' # Interaction plot of predictions #' emmip(noise.lm, type ~ size | side) #' #' # Confidence intervals #' plot(emmeans(noise.lm, ~ size | side*type)) #' "auto.noise" # This is where it used to be... # \url{http://lib.stat.cmu.edu/DASL/Datafiles/airpullutionfiltersdat.html} ### feedlot ### #' Feedlot data #' #' This is an unbalanced analysis-of-covariance example, where one covariate is #' affected by a factor. Feeder calves from various herds enter a feedlot, where #' they are fed one of three diets. The weight of the animal at entry is the #' covariate, and the weight at slaughter is the response. #' #' The data arise from a Western Regional Research Project conducted at New #' Mexico State University. Calves born in 1975 in commercial herds entered a #' feedlot as yearlings. Both diets and herds are of interest as factors. The #' covariate, \code{ewt}, is thought to be dependent on \code{herd} due to #' different genetic backgrounds, breeding history, etc. The levels of #' \code{herd} are ordered according to similarity of genetic background. #' #' Note: There are some empty cells in the cross-classification of #' \code{herd} and \code{diet}.
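#'
#' For instance, a quick sketch for locating the empty cells is to tabulate
#' the two factors; zero counts mark the cells with no data:
#' \preformatted{
#' with(feedlot, table(herd, diet))
#' }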
#' @format A data frame with 67 observations and 4 variables: #' \describe{ #' \item{\code{herd}}{a factor with levels \code{9} \code{16} \code{3} #' \code{32} \code{24} \code{31} \code{19} \code{36} \code{34} \code{35} #' \code{33}, designating the herd that a feeder calf came from.} #' \item{\code{diet}}{a factor with levels \code{Low} \code{Medium} #' \code{High}: the energy level of the diet given the animal.} #' \item{\code{swt}}{a numeric vector: the weight of the animal at slaughter.} #' \item{\code{ewt}}{a numeric vector: the weight of the animal at entry to the feedlot.} #' } #' @source Urquhart NS (1982) Adjustment in covariates when one factor affects #' the covariate. \emph{Biometrics} 38, 651-660. #' @examples #' feedlot.lm <- lm(swt ~ ewt + herd*diet, data = feedlot) #' #' # Obtain EMMs with a separate reference value of ewt for each #' # herd. This reproduces the last part of Table 2 in the reference #' emmeans(feedlot.lm, ~ diet | herd, cov.reduce = ewt ~ herd) #' "feedlot" ### fiber ### #' Fiber data #' #' Fiber data from Montgomery Design (8th ed.), p.656 (Table 15.10). Useful as a #' simple analysis-of-covariance example. #' #' The goal of the experiment is to compare the mean breaking strength of fibers #' produced by the three machines. When testing this, the technician also #' measured the diameter of each fiber, and this measurement may be used as a #' concomitant variable to improve precision of the estimates. #' @format A data frame with 15 observations and 3 variables: #' \describe{ #' \item{\code{machine}}{a factor with levels \code{A} \code{B} \code{C}. #' This is the primary factor of interest.} #' \item{\code{strength}}{a numeric vector. The response variable.} #' \item{\code{diameter}}{a numeric vector. A covariate.} #' } #' @source Montgomery, D. C. (2013) \emph{Design and Analysis of Experiments} #' (8th ed.). John Wiley and Sons, ISBN 978-1-118-14692-7. #' @examples #' fiber.lm <- lm(strength ~ diameter + machine, data=fiber) #' ref_grid(fiber.lm) #' #' # Covariate-adjusted means and comparisons #' emmeans(fiber.lm, pairwise ~ machine) #' "fiber" ### MOats ### #' Oats data in multivariate form #' #' This is the \code{Oats} dataset provided in the \pkg{nlme} package, but it is #' rearranged as one multivariate observation per plot. #' #' These data arise from a split-plot experiment reported by Yates (1935) and #' used as an example in Pinheiro and Bates (2000) and other texts. Six blocks #' were divided into three whole plots, randomly assigned to the three varieties #' of oats. The whole plots were each divided into 4 split plots and randomized #' to the four concentrations of nitrogen. #' @format A data frame with 18 observations and 3 variables #' \describe{ #' \item{\code{Variety}}{a factor with levels \code{Golden Rain}, #' \code{Marvellous}, \code{Victory}} #' \item{\code{Block}}{an ordered factor with levels \code{VI} < \code{V} < #' \code{III} < \code{IV} < \code{II} < \code{I}} #' \item{\code{yield}}{a matrix with 4 columns, giving the yields with #' nitrogen concentrations of 0, .2, .4, and .6.} #' } #' @source The dataset \code{\link[nlme]{Oats}} in the \pkg{nlme} package. #' @references #' Pinheiro, J. C. and Bates D. M. (2000) \emph{Mixed-Effects Models in S and #' S-PLUS}, Springer, New York. (Appendix A.15) #' #' Yates, F. (1935) Complex experiments, \emph{Journal of the Royal Statistical #' Society} Suppl. 
2, 181-247 #' @examples #' MOats.lm <- lm (yield ~ Block + Variety, data = MOats) #' MOats.rg <- ref_grid (MOats.lm, mult.name = "nitro") #' emmeans(MOats.rg, ~ nitro | Variety) "MOats" ### neuralgia ### #' Neuralgia data #' #' These data arise from a study of analgesic effects of treatments of elderly #' patients who have neuralgia. Two treatments and a placebo are compared. The #' response variable is whether the patient reported pain or not. Researchers #' recorded the age and gender of 60 patients along with the duration of #' complaint before the treatment began. #' #' @format A data frame with 60 observations and 5 variables: #' \describe{ #' \item{\code{Treatment}}{Factor with 3 levels \code{A}, \code{B}, and \code{P}. #' The latter is placebo} #' \item{\code{Sex}}{Factor with two levels \code{F} and \code{M}} #' \item{\code{Age}}{Numeric covariate -- patient's age in years} #' \item{\code{Duration}}{Numeric covariate -- duration of the condition before #' beginning treatment} #' \item{\code{Pain}}{Binary response factor with levels \code{No} and \code{Yes}} #' } #' @source Cai, Weijie (2014) \emph{Making Comparisons Fair: How LS-Means Unify #' the Analysis of Linear Models}, SAS Institute, Inc. Technical paper 142-2014, #' page 12, #' \url{http://support.sas.com/resources/papers/proceedings14/SAS060-2014.pdf} #' @examples #' # Model and analysis shown in the SAS report: #' neuralgia.glm <- glm(Pain ~ Treatment * Sex + Age, family = binomial(), #' data = neuralgia) #' pairs(emmeans(neuralgia.glm, ~ Treatment, at = list(Sex = "F")), #' reverse = TRUE, type = "response", adjust = "bonferroni") #' "neuralgia" ### nutrition ### #' Nutrition data #' #' This observational dataset involves three factors, but where several factor #' combinations are missing. It is used as a case study in Milliken and Johnson, #' Chapter 17, p.202. (You may also find it in the second edition, p.278.) #' #' A survey was conducted by home economists ``to study how much #' lower-socioeconomic-level mothers knew about nutrition and to judge the #' effect of a training program designed to increase their knowledge of #' nutrition.'' This is a messy dataset with several empty cells. #' @format A data frame with 107 observations and 4 variables: #' \describe{ #' \item{\code{age}}{a factor with levels \code{1}, \code{2}, \code{3}, #' \code{4}. Mother's age group.} #' \item{\code{group}}{a factor with levels \code{FoodStamps}, \code{NoAid}. #' Whether or not the family receives food stamp assistance.} #' \item{\code{race}}{a factor with levels \code{Black}, \code{Hispanic}, #' \code{White}. Mother's race.} #' \item{\code{gain}}{a numeric vector (the response variable). Gain score #' (posttest minus pretest) on knowledge of nutrition.} #' } #' @source Milliken, G. A. and Johnson, D. E. (1984) #' \emph{Analysis of Messy Data -- Volume I: Designed Experiments}. #' Van Nostrand, ISBN 0-534-02713-7. #' @examples #' nutr.aov <- aov(gain ~ (group + age + race)^2, data = nutrition) #' #' # Summarize predictions for age group 3 #' nutr.emm <- emmeans(nutr.aov, ~ race * group, at = list(age="3")) #' #' emmip(nutr.emm, race ~ group) #' #' # Hispanics seem exceptional; but this doesn't test out due to very sparse data #' pairs(nutr.emm, by = "group") #' pairs(nutr.emm, by = "race") "nutrition" ### oranges ### #' Sales of oranges #' #' This example dataset on sales of oranges has two factors, two covariates, and #' two responses. There is one observation per factor combination. 
#' @format A data frame with 36 observations and 6 variables: #' \describe{ #' \item{\code{store}}{a factor with levels \code{1} \code{2} \code{3} #' \code{4} \code{5} \code{6}. The store that was observed.} #' \item{\code{day}}{a factor with levels \code{1} \code{2} \code{3} #' \code{4} \code{5} \code{6}. The day the observation was taken (same for #' each store).} #' \item{\code{price1}}{a numeric vector. Price of variety 1.} #' \item{\code{price2}}{a numeric vector. Price of variety 2.} #' \item{\code{sales1}}{a numeric vector. Sales (per customer) of variety 1.} #' \item{\code{sales2}}{a numeric vector. Sales (per customer) of variety 2.} #' } #' @source This is (or once was) available as a SAS sample dataset. #' @references #' Littell, R., Stroup W., Freund, R. (2002) \emph{SAS For Linear Models} (4th #' edition). SAS Institute. ISBN 1-59047-023-0. #' @examples #' # Example on p.244 of Littell et al. #' oranges.lm <- lm(sales1 ~ price1*day, data = oranges) #' emmeans(oranges.lm, "day") #' #' # Example on p.246 of Littell et al. #' emmeans(oranges.lm, "day", at = list(price1 = 0)) #' #' # A more sensible model to consider, IMHO (see vignette("interactions")) #' org.mlm <- lm(cbind(sales1, sales2) ~ price1 * price2 + day + store, #' data = oranges) "oranges" ### pigs ### #' Effects of dietary protein on free plasma leucine concentration in pigs #' #' A two-factor experiment with some observations lost #' #' @format A data frame with 29 observations and 3 variables: #' \describe{ #' \item{source}{Source of protein in the diet (factor with 3 levels: #' fish meal, soybean meal, dried skim milk)} #' \item{percent}{Protein percentage in the diet (numeric with 4 values: #' 9, 12, 15, and 18)} #' \item{conc}{Concentration of free plasma leucine, in mcg/ml} #' } #' @source Windels HF (1964) PhD thesis, Univ. of Minnesota. (Reported as #' Problem 10.8 in Oehlert G (2000) \emph{A First Course in Design and #' Analysis of Experiments}, licensed under Creative Commons, #' \url{http://users.stat.umn.edu/~gary/Book.html}.) Observations 7, 22, 23, #' 31, 33, and 35 have been omitted, creating a more notable imbalance. #' @examples #' pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs) #' emmeans(pigs.lm, "source") "pigs" ### ubds ### #' Unbalanced dataset #' #' This is a simulated unbalanced dataset with three factors #' and two numeric variables. There are true relationships among these variables. #' This dataset can be useful in testing or illustrating messy-data situations. #' There are no missing data, and there is at least one observation for every #' factor combination; however, the \code{"cells"} attribute makes it simple #' to construct subsets that have empty cells. #' #' @format A data frame with 100 observations, 5 variables, #' and a special \code{"cells"} attribute: #' \describe{ #' \item{A}{Factor with levels 1, 2, and 3} #' \item{B}{Factor with levels 1, 2, and 3} #' \item{C}{Factor with levels 1, 2, and 3} #' \item{x}{A numeric variable} #' \item{y}{A numeric variable} #' } #' In addition, \code{attr(ubds, "cells")} consists of a named list of length 27 with the row numbers for #' each combination of \code{A, B, C}. For example, #' \code{attr(ubds, "cells")[["213"]]} has the row numbers corresponding #' to levels \code{A == 2, B == 1, C == 3}. The entries are ordered by #' length, so the first entry is the cell with the lowest frequency. 
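#'
#' For instance, a minimal sketch of working with this attribute:
#' \preformatted{
#' freqs <- sapply(attr(ubds, "cells"), length)   # frequency of each cell
#' head(freqs)   # entries are ordered by length, so the sparsest cells come first
#' }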
#' @examples #' # Omit the three lowest-frequency cells #' low3 <- unlist(attr(ubds, "cells")[1:3]) #' messy.lm <- lm(y ~ (x + A + B + C)^3, data = ubds, subset = -low3) #' "ubds"emmeans/R/cld-emm.R0000644000176200001440000002266514157422260013557 0ustar liggesusers############################################################################## # Copyright (c) 2012-2017 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## # Runs the function multcompLetters from the multcompView package # returns an error if not installed .mcletters = function(..., Letters=c("1234567890",LETTERS,letters), reversed=FALSE) { .requireNS("multcompView", "The 'multcompView' package must be installed to use CLD methods") # Expand strings to individual letters Letters = as.character(unlist(sapply(Letters, function(stg) { sapply(seq_len(nchar(stg)), function(i) substr(stg, i, i)) }))) result = multcompView::multcompLetters(..., Letters=Letters, reversed=reversed) if (is.null(result$monospacedLetters)) result$monospacedLetters = result$Letters result } ### Lingering support for multcomp::cld -- registered dynamically in zzz.R ### NOTE: MUST KEEP the rdname of CLD.emmGrid ### because it's referenced by augmentedRCBD package #' Compact letter displays #' #' A method for \code{multcomp::cld()} is provided for users desiring to produce #' compact-letter displays (CLDs). #' This method uses the Piepho (2004) algorithm (as implemented in the #' \pkg{multcompView} package) to generate a compact letter display of all #' pairwise comparisons of estimated marginal means. The function obtains (possibly #' adjusted) P values for all pairwise comparisons of means, using the #' \code{\link{contrast}} function with \code{method = "pairwise"}. When a P #' value exceeds \code{alpha}, then the two means have at least one letter in #' common. #' #' @rdname CLD.emmGrid #' @order 1 #' @param object An object of class \code{emmGrid} #' @param details Logical value determining whether detailed information on tests of #' pairwise comparisons is displayed #' @param sort Logical value determining whether the EMMs are sorted before the comparisons #' are produced. When \code{TRUE}, the results are displayed according to #' \code{reversed}. #' @param by Character value giving the name or names of variables by which separate #' families of comparisons are tested. If NULL, all means are compared. #' If missing, the object's \code{by.vars} setting, if any, is used. #' @param alpha Numeric value giving the significance level for the comparisons #' @param Letters Character vector of letters to use in the display. Any strings of #' length greater than 1 are expanded into individual characters #' @param reversed Logical value (passed to \code{multcompView::multcompLetters}.) 
#' If \code{TRUE}, the order of use of the letters is reversed. #' In addition, if both \code{sort} and \code{reversed} are TRUE, the sort #' order of results is reversed. #' @param ... Arguments passed to \code{\link{contrast}} (for example, #' an \code{adjust} method) #' @references Piepho, Hans-Peter (2004) An algorithm for a letter-based #' representation of all pairwise comparisons, #' Journal of Computational and Graphical Statistics, #' 13(2), 456-466. #' #' @note #' We warn that such displays encourage a poor #' practice in interpreting significance tests. CLDs are misleading because they #' visually group means with comparisons \emph{P} > \code{alpha} as though they #' are equal, when in fact we have only failed to prove that they differ. #' As alternatives, consider \code{\link{pwpp}} (graphical display of \emph{P} #' values) or \code{\link{pwpm}} (matrix display). #' #' @method cld emmGrid #' @examples #' if(requireNamespace("multcomp")) { #' pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs) #' pigs.emm <- emmeans(pigs.lm, "percent", type = "response") #' multcomp::cld(pigs.emm, alpha = 0.10, Letters = LETTERS) #' } cld.emmGrid = function(object, details=FALSE, sort=TRUE, by, alpha=.05, Letters = c("1234567890",LETTERS,letters), reversed=FALSE, ...) { if (!is.na(object@post.beta)[1]) { message("NOTE: Summary and groupings are based on frequentist results") object@post.beta = matrix(NA) } emmtbl = summary(object, ...) if(missing(by)) by = object@misc$by.vars if (sort) { args = list() for (nm in by) args[[nm]] = emmtbl[[nm]] args$.emmGrid. = emmtbl[[attr(emmtbl, "estName")]] ord = do.call("order", unname(args)) emmtbl = emmtbl[ord, , as.df = FALSE] if (!is.null(object@misc$display)) { use = which(object@misc$display) object@linfct = object@linfct[use, , drop = FALSE] object@grid = object@grid[use, , drop = FALSE] object@misc$display = NULL } object@grid = object@grid[ord, , drop = FALSE] object@linfct = object@linfct[ord, , drop = FALSE] } attr(emmtbl, "by.vars") = by object@misc$by.vars = by prwise = contrast(object, "revpairwise", by=by) pwtbl = test(prwise, ...) p.boo = (pwtbl$p.value < alpha) if(is.null(by)) { by.rows = list(seq_len(nrow(pwtbl))) by.out = list(seq_len(nrow(emmtbl))) } else { by.rows = .find.by.rows(pwtbl, by) by.out = .find.by.rows(emmtbl, by) } ### This code moved to inside the loop # Create comps matrix reflecting order generated by pairwise.emmc # icol = jcol = numeric(0) # create fake row indexes in revpairwise order for use by .mcletters # k = length(by.out[[1]]) # for (i in 2:k) { # icol = c(icol, seq_len(i-1)) # jcol = c(jcol, rep(i, i-1)) # } ltrs = rep("", nrow(emmtbl)) na.p = which(is.na(p.boo)) # Take care of non-est cases. This is surprisingly complicated, # because it's possible we have some emmeans that are non-est # but comparisons are est'ble. 
So cases to exclude must be missing in # the table of means, AND appear somewhere in the indexes of NA p values # All that said, it still messes up because I didn't track the indexes correctly # excl.rows = intersect(which(is.na(emmtbl$SE)), union(icol[na.p], jcol[na.p])) # So I'll just go with which est's are missing excl.rows = which(is.na(emmtbl$SE)) p.boo[na.p] = FALSE for (i in seq_len(length(by.rows))) { # Create comps matrix reflecting order generated by pairwise.emmc icol = jcol = numeric(0) k = length(by.out[[i]]) for (j in 2:k) { icol = c(icol, seq_len(j-1)) jcol = c(jcol, rep(j, j-1)) } labs = paste(icol, jcol, sep="-") pb = p.boo[by.rows[[i]]] names(pb) = labs mcl = .mcletters(pb, Letters = Letters, reversed = reversed)$monospacedLetters ltrs[by.out[[i]]] = paste0(" ", mcl[seq_along(by.out[[i]])]) } # any missing estimates get blanks... ltrs[excl.rows] = "" emmtbl[[".group"]] = ltrs if(sort && reversed) for (i in seq_len(length(by.out))) { r = by.out[[i]] emmtbl[r, ] = emmtbl[rev(r), ] } dontusemsg = paste0("NOTE: Compact letter displays can be misleading\n", " because they show NON-findings rather than findings.\n", " Consider using 'pairs()', 'pwpp()', or 'pwpm()' instead.") attr(emmtbl, "mesg") = c(attr(emmtbl,"mesg"), attr(pwtbl, "mesg"), paste("significance level used: alpha =", alpha), dontusemsg) if (details) list(emmeans = emmtbl, comparisons = pwtbl) else emmtbl } ### Registered dynamically in zzz.R ### NOTE: MUST KEEP the rdname of CLD.emmGrid ### because it's referenced by augmentedRCBD package #' @rdname CLD.emmGrid #' @order 2 #' @method cld emm_list #' @param which Which element of the \code{emm_list} object to process #' (If length exceeds one, only the first one is used) cld.emm_list = function(object, ..., which = 1) { multcomp::cld(object[[which[1]]], ...) } emmeans/R/test.R0000644000176200001440000003537014137062735013222 0ustar liggesusers############################################################################## # Copyright (c) 2012-2017 Russell V. Lenth # # # # This file is part of the emmeans package for R (*emmeans*) # # # # *emmeans* is free software: you can redistribute it and/or modify # # it under the terms of the GNU General Public License as published by # # the Free Software Foundation, either version 2 of the License, or # # (at your option) any later version. # # # # *emmeans* is distributed in the hope that it will be useful, # # but WITHOUT ANY WARRANTY; without even the implied warranty of # # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # # GNU General Public License for more details. # # # # You should have received a copy of the GNU General Public License # # along with R and *emmeans*. If not, see # # and/or # # . # ############################################################################## # confint and test methods # confint method #' @rdname summary.emmGrid #' @order 2 #' @param parm (Required argument for \code{confint} methods, but not used) #' @method confint emmGrid #' @export confint.emmGrid = function(object, parm, level = .95, ...) { summary(object, infer = c(TRUE, FALSE), level = level, ...) } #' @rdname summary.emmGrid #' @order 3 #' @export test = function(object, null, ...) { UseMethod("test") } #' @rdname summary.emmGrid #' @order 3 #' @param joint Logical value. If \code{FALSE}, the arguments are passed to #' \code{\link{summary.emmGrid}} with \code{infer=c(FALSE, TRUE)}.
If \code{joint = #' TRUE}, a joint test of the hypothesis L beta = null is performed, where L #' is \code{object@linfct} and beta is the vector of fixed effects estimated #' by \code{object@betahat}. This will be either an \emph{F} test or a #' chi-square (Wald) test depending on whether degrees of freedom are #' available. See also \code{\link{joint_tests}}. #' @param verbose Logical value. If \code{TRUE} and \code{joint = TRUE}, a table #' of the effects being tested is printed. #' @param rows Integer values. The rows of L to be tested in the joint test. If #' missing, all rows of L are used. If not missing, \code{by} variables are #' ignored. #' @param status logical. If \code{TRUE}, a \code{note} column showing status #' flags (for rank deficiencies and estimability issues) is displayed even #' when empty. If \code{FALSE}, the column is included only if there are #' such issues. #' @method test emmGrid #' @export test.emmGrid = function(object, null = 0, joint = FALSE, verbose = FALSE, rows, by, status = FALSE, ...) { # if joint = FALSE, this is a courtesy method for 'contrast' # else it computes the F test or Wald test of H0: L*beta = null # where L = object@linfct if (!joint) { if (missing(by)) summary(object, infer=c(FALSE,TRUE), null = null, ...) else summary(object, infer=c(FALSE,TRUE), null = null, by = by, ...) } else { if(verbose) { cat("Joint test of the following linear predictions\n") print(cbind(object@grid, equals = null)) } L = object@linfct bhat = object@bhat estble.idx = which(!is.na(object@bhat)) bhat = bhat[estble.idx] est.flag = !is.na(object@nbasis[1]) if(est.flag) { est.tol = get_emm_option("estble.tol") nbasis = zapsmall(object@nbasis) } if (!missing(rows)) by.rows = list(sel.rows = rows) else { by.rows = list(all = seq_len(nrow(L))) if(missing(by)) by = object@misc$by.vars if (!is.null(by)) by.rows = .find.by.rows(object@grid, by) } result = lapply(by.rows, function(rows) { LL = L[rows, , drop = FALSE] # discard any rows that have NAs narows = apply(LL, 1, function(x) any(is.na(x))) LL = LL[!narows, , drop = FALSE] rrflag = 0 + 2 * any(narows) ## flag for estimability issue if(est.flag) { if (any(!estimability::is.estble(LL, nbasis, est.tol))) { LL = estimability::estble.subspace(zapsmall(LL), nbasis) rrflag = bitwOr(rrflag, 2) } LL = LL[, estble.idx, drop = FALSE] } # Check rank qrLt = qr(t(LL)) # this will work even if LL has 0 rows r = qrLt$rank if (r == 0) return(c(df1 = 0, df2 = NA, F.ratio = NA, p.value = NA, note = 3)) if (r < nrow(LL)) { if(!all(null == 0)) stop("Rows are linearly dependent - cannot do the test when 'null' != 0") rrflag = bitwOr(rrflag, 1) } tR = t(qr.R(qrLt))[1:r, 1:r, drop = FALSE] tQ = t(qr.Q(qrLt))[1:r, , drop = FALSE] if(length(null) < r) null = rep(null, r) z = tQ %*% bhat - solve(tR, null[1:r]) zcov = tQ %*% object@V %*% t(tQ) F = try(sum(z * solve(zcov, z)) / r) if (inherits(F, "try-error")) c(df1 = r, df2 = NA, F.ratio = NA, p.value = NA, note = 1) else { df2 = min(apply(tQ, 1, function(.) 
object@dffun(., object@dfargs))) if (is.na(df2)) p.value = pchisq(F*r, r, lower.tail = FALSE) else p.value = pf(F, r, df2, lower.tail = FALSE) c(round(c(df1 = r, df2 = df2), 2), F.ratio = round(F, 3), p.value = p.value, note = rrflag) } }) result = as.data.frame(t(as.data.frame(result))) if (!missing(by)) { fbr = sapply(by.rows, "[", 1) result = cbind(object@grid[fbr, by, drop = FALSE], result) } class(result) = c("summary_emm", "data.frame") attr(result, "estName") = "F.ratio" if (!status && all(result$note == 0)) result$note = NULL else { if (any(result$note %in% c(1,3))) attr(result, "mesg") = .dep.msg if (any(result$note %in% c(2,3))) attr(result, "mesg") = c(attr(result, "mesg"), .est.msg) result$note = sapply(result$note, function(x) switch(x + 1, "", " d", " e", " d e")) } result } } # messages (also needed by joint_tests()) .dep.msg = "d: df1 reduced due to linear dependence" .est.msg = "e: df1 reduced due to non-estimability" # Do all joint tests of contrasts. by, ... passed to emmeans() calls #' Compute joint tests of the terms in a model #' #' This function produces an analysis-of-variance-like table based on linear #' functions of predictors in a model or \code{emmGrid} object. Specifically, #' the function constructs, for each combination of factors (or covariates #' reduced to two or more levels), a set of (interaction) contrasts via #' \code{\link{contrast}}, and then tests them using \code{\link{test}} with #' \code{joint = TRUE}. Optionally, one or more of the predictors may be used as #' \code{by} variable(s), so that separate tables of tests are produced for #' each combination of them. #' #' In models with only factors, no covariates, we believe these tests correspond #' to \dQuote{type III} tests a la \pkg{SAS}, as long as equal-weighted #' averaging is used and there are no estimability issues. When covariates are #' present and interact with factors, the results depend on how the covariate is #' handled in constructing the reference grid. See the example at the end of #' this documentation. The point that one must always remember is that #' \code{joint_tests} always tests contrasts among EMMs, in the context of the #' reference grid, whereas type III tests are tests of model coefficients -- #' which may or may not have anything to do with EMMs or contrasts. #' #' @param object,cov.reduce \code{object} is a fitted model or an \code{emmGrid}. #' If a fitted model, it is #' replaced by \code{ref_grid(object, cov.reduce = cov.reduce, ...)} #' @param by character names of \code{by} variables. Separate sets of tests are #' run for each combination of these. #' @param show0df logical value; if \code{TRUE}, results with zero numerator #' degrees of freedom are displayed, if \code{FALSE} they are skipped #' @param ... additional arguments passed to \code{ref_grid} and \code{emmeans} #' #' @return a \code{summary_emm} object (same as is produced by #' \code{\link{summary.emmGrid}}). All effects for which there are no #' estimable contrasts are omitted from the results. 
#' #' @seealso \code{\link{test}} #' @export #' #' @examples #' pigs.lm <- lm(log(conc) ~ source * factor(percent), data = pigs) #' #' joint_tests(pigs.lm) ## will be same as type III ANOVA #' #' joint_tests(pigs.lm, weights = "outer") ## differently weighted #' #' joint_tests(pigs.lm, by = "source") ## separate joint tests of 'percent' #' #' ### Comparisons with type III tests #' toy = data.frame( #' treat = rep(c("A", "B"), c(4, 6)), #' female = c(1, 0, 0, 1, 0, 0, 0, 1, 1, 0 ), #' resp = c(17, 12, 14, 19, 28, 26, 26, 34, 33, 27)) #' toy.fac = lm(resp ~ treat * factor(female), data = toy) #' toy.cov = lm(resp ~ treat * female, data = toy) #' # (These two models have identical fitted values and residuals) #' #' joint_tests(toy.fac) #' joint_tests(toy.cov) # female is regarded as a 2-level factor by default #' #' joint_tests(toy.cov, at = list(female = 0.5)) #' joint_tests(toy.cov, cov.keep = 0) # i.e., female = mean(toy$female) #' joint_tests(toy.cov, at = list(female = 0)) #' #' # -- Compare with SAS output -- female as factor -- #' ## Source DF Type III SS Mean Square F Value Pr > F #' ## treat 1 488.8928571 488.8928571 404.60 <.0001 #' ## female 1 78.8928571 78.8928571 65.29 0.0002 #' ## treat*female 1 1.7500000 1.7500000 1.45 0.2741 #' # #' # -- Compare with SAS output -- female as covariate -- #' ## Source DF Type III SS Mean Square F Value Pr > F #' ## treat 1 252.0833333 252.0833333 208.62 <.0001 #' ## female 1 78.8928571 78.8928571 65.29 0.0002 #' ## female*treat 1 1.7500000 1.7500000 1.45 0.2741 joint_tests = function(object, by = NULL, show0df = FALSE, cov.reduce = range, ...) { if (!inherits(object, "emmGrid")) { args = .zap.args(object = object, cov.reduce = cov.reduce, ..., omit = "submodel") object = do.call(ref_grid, args) } facs = setdiff(names(object@levels), c(by, "1")) if(length(facs) == 0) stop("There are no factors to test") # Use "factors" attr if avail to screen-out interactions not in model # For any factors not in model (created by emmeans fcns), assume they interact w/ everything trmtbl = attr(object@model.info$terms, "factors") if (is.null(trmtbl) || (length(trmtbl) == 0)) trmtbl = matrix(1, nrow = length(facs), dimnames = list(facs, NULL)) else row.names(trmtbl) = sapply(row.names(trmtbl), function(x) .all.vars(reformulate(x))) xtras = setdiff(facs, row.names(trmtbl)) if (length(xtras) > 0) { xt = matrix(1, nrow = length(xtras), ncol = ncol(trmtbl), dimnames = list(xtras, NULL)) trmtbl = rbind(trmtbl, xt) } nesting = object@model.info$nesting for (nst in names(nesting)) { # make sure all complete nests are in the table trmtbl = cbind(trmtbl, 0) n = ncol(trmtbl) trmtbl[c(nst, nesting[[nst]]), n] = 1 } do.test = function(these, facs, result, ...) { if ((k <- length(these)) > 0) { if(any(apply(trmtbl[these, , drop = FALSE], 2, prod) != 0)) { # term is in model nesters = NULL if (!is.null(nesting)) { nst = intersect(these, names(nesting)) if (length(nst) > 0) nesters = unlist(nesting[nst]) # proceed only if these includes all nesters } if (is.null(nesting) || length(setdiff(nesters, these)) == 0) { emm = emmeans(object, these, by = by, ...) tst = test(contrast(emm, interaction = "consec", by = union(by, nesters)), by = by, joint = TRUE, status = TRUE) tst = cbind(ord = k, `model term` = paste(these, collapse = ":"), tst) result = rbind(result, tst) } } last = max(match(these, facs)) } else last = 0 if (last < (n <- length(facs))) for (i in last + seq_len(n - last)) result = do.test(c(these, facs[i]), facs, result, ...) 
result } result = suppressMessages(do.test(character(0), facs, NULL, ...)) result = result[order(result[[1]]), -1, drop = FALSE] if(!show0df) result = result[result$df1 > 0, , drop = FALSE] class(result) = c("summary_emm", "data.frame") attr(result, "estName") = "F.ratio" attr(result, "by.vars") = by if (any(result$note != "")) { msg = character(0) if (any(result$note %in% c(" d", " d e"))) msg = .dep.msg if (any(result$note %in% c(" e", " d e"))) msg = c(msg, .est.msg) attr(result, "mesg") = msg } else result$note = NULL result } # provide for displaying in standard 'anova' format (with astars etc.) # I'm not g # #' @export # as.anova = function(object, ...) # UseMethod("as.anova") # # as.anova.summary_emm = function(object, ...) { # class(object) = c("anova", "data.frame") # row.names(object) = as.character(object[[1]]) # names(object) = gsub("p.value", "Pr(>F)", names(object)) # object[-1] # } emmeans/NEWS.md0000644000176200001440000011673414165066563013015 0ustar liggesusers--- title: "NEWS for the emmeans package" --- ## emmeans 1.7.2 * Improvements to `averaging` support (#319) * Fixed bug in comparison arrows when `by = NULL` (#321) (this bug was a subtle byproduct of the name-checking in #305) Note this fixes visible errors in the vignettes for ver 1.7.1-1 * Patch for `gamlss` support (#323) * Added `withAutoprint()` to documentation examples with `require()` clauses, so we see interactive-style results * Correction to a logic error in adjustment corrections in `summary.emmGrid` (#31) * Revised `summary.emmGrid()` so that if we have both a response transformation and a link function, then both transformations are followed through with `type = "response"`. Previously, I took the lazy way out and used `summary(regrid(object, transform = "unlink"), type = "response")` (see #325) * Fix to `force_regular()` which caused an unintended warning (#326) * Fixes to issues in `emtrends()` (#327) ## emmeans 1.7.1 * Support for multinomial models in mgcv::gam (#303) thanks to Hannes Riebl * Bug fix for spaces in `by` variable names (#305). Related to this are: - `plot.emmGrid()` now forces all names to be syntactically valid - In `as.data.frame.emmGrid()`, we changed the `optional` argument to `check.names` (defaulting to `TRUE`), and it actually has an effect. So by default, the result will have syntactically valid names; this is a change, but only because `optional` did not work right (because it is an argument for `as.data.frame.list()`). * Fix for missing column names in `linfct` from `emmeans()` (#308) * Added `gnls` support (#313, #314, thanks to Fernando Miguez) * Modified `glm` support so that `df.residual` is used when the family is gaussian or gamma. Thus, e.g., we match `lm` results when the model is fitted with a Gaussian family. Previously we ignored the d.f. for all `glm` objects. * New vignette example with percentage differences * More graceful handling of comparisons when there is only one mean; and a related FAQ ## emmeans 1.7.0 #### Notable changes * New `rg.limit` option (and argument for `ref_grid()`) to limit the number of rows in the reference grid (#282, #292). **This change could affect existing code that used to work** -- but only in fairly extreme situations. Some users report extreme performance issues that can be traced to the size of the reference grid being in the billions, causing memory to be paged, etc. So providing this limit really is necessary. The default is 10,000 rows. I hope that most existing users don't bump up against that too often.
The `nuisance` (or `non.nuisance`) argument in `ref_grid()` (see below) can help work around this limit. * New `nuisance` option in `ref_grid()`, by which we can specify names of factors to exclude from the reference grid (accommodating them by averaging) (#282, #292). These must be factors that don't interact with anything, even other nuisance factors. This provides a remedy for excessive grid sizes. * Improvements to and broadening of `qdrg()`: - Changed the order of arguments to something a bit more natural - Default for `contrasts` now `object$contrasts` when `object` is specified - Detection of multivariate situations - Added `ordinal.dim` argument to support ordinal models * New `force_regular()` function adds invisible rows to an irregular `emmGrid` to make it regular (i.e., covers all factor combinations) #### Bug fixes and tweaks * Removed dependency on **plyr** package (#298) * Fix to bug in `regrid()` with nested structures (#287) * Fix bug in `rbind()` which mishandled `@grid$.offset.` * Major repairs to `clm` and `clmm` support to fix issues related to rank deficiency and nested models, particularly with `mode = "prob"` (#300) * Allow `type` to be passed in `emmeans()` when `object` is already an `emmGrid` (incidentally noticed in #287) * Code to prevent a warning when an existing factor is coerced to a factor in the model formula -- see [SO question](https://stackoverflow.com/questions/68969384) * Add documentation note for `add_grouping` with multiple reference factors (#291) ## emmeans 1.6.3 * Clarification of documentation of `ref_grid(object, vcov. = ...)` (#283) * Fix to `emtrends()` with covariate formulas (#284) * Improved parts of "Basics" vignette - removed "back story", revised guidance on $P$ values and models * Allow for > 1 reference factor in `add_grouping()` (#286) * Repairs to `contrast()` to avoid all-`nonEst` results in irregular nested structures ## emmeans 1.6.2 * Fixed navigation error in vignette index * Discouraging message added to `cld()` results. Also am providing a `cld()` method for `emm_list` objects. * Added `mvcontrast()` function (#281) and assoc vignette material * Added `update.summary_emm()` ## emmeans 1.6.1 * Fixed a bug in parsing a response transformation (#274) * Changed handling of `contrast()` so that `log2` and `log10` transformations are handled just like `log`. (#273) Also disabled making ratios with `genlog` as it seems ill-advised. * Added support for `log1p` transformation * Improved detection of cases where Tukey adjustment is [in]appropriate (#275) * Added `type = "scale"` argument to `plot.emmGrid()` and `emmip()`. This is the same as `type = "response"` except the scale itself is transformed (i.e., a log scale if the log transformation was used). Since the same transformation is used, the appearance of the plot will be the same as with `type = "lp"`, but with an altered axis scale. Currently this is implemented only with `engine = "ggplot"`. * Fixed bug whereby Scheffe is ignored when there is only one contrast, even though `scheffe.rank` > 1 was specified. (#171) * Added a `subset()` method for `emmGrid` objects * Bug fixes for `mcmc` and `mcmc.list` objects (#278, #279) emmeans 1.6.0 ------------- This version has some changes that affect all users, e.g., not saving `.Last.ref_grid`, so we incremented the sub-version number.
* Changed handling of logit transformations in `contrast()`, so that the odds-ratio transformation persists into subsequent `contrast()` calls e.g., interaction contrasts. * We also made `contrast(..., type = ...)` work correctly * Bug fix so that all `p.adjust.methods` work (#267) * Support for `mblogit` extended to work with `mmblogit` models (#268) (However, this is moot, since the **mclogit** package incorporates its own interface) * Added `export` option in `print.emmGrid()` and `print.summary_emm()` * Changed default for `emm_options(save.ref_grid = FALSE)`. Years ago, it seemed potentially useful to save the last reference grid, but this is extra overhead, and writes in the user's global environment. The option remains if you want it. * Added a note advising against using `as.data.frame` (because we lose potentially important annotations), and information/example on how to see more digits (which I guess is why I'm seeing users do this). * Further refinement to nesting detection. A model like `y ~ A:B` detected `A %in% B` and `B %in% A`, and hence `A %in% A*B` and `B %in% A*B` due to a change in 1.4.6. Now we omit cases where factors are nested in themselves! * Expansion of `cov.reduce` formulas to allow use of custom models for predicting mediating covariates emmeans 1.5.5 ------------- * The `multinom` "correction" in version 1.5.4 was actually an "incorrection." It was right before, and I made it wrong! **If analyzing `multinom` models, use a version *other* than 1.5.4** * Repairs to support for `mblogit` models * Bug fix for `survreg` support (#258) -- `survreg()` doesn't handle missing factor levels the same way as `lm()`. This also affects results from `coxph()`, `AER::tobit()`, ... * Addition of a note in the help for the `auto.noise` dataset, and changing that example and vignette example to have `noise/10` as the response variable. (Thanks to speech and hearing professor Stuart Rosen for pointing out this issue in an e-mail comment.) * Bug fix for `appx-satterthwaite` mode in `gls`/`lme` models (#263) * Added `mode = "asymptotic"` for `gls`/`lme` models. * Added `facetlab` argument to `emmip_ggplot()` so user can control how facets are labeled (#261) * Efficiency improvements in `joint_tests()` (#265) * Bug fixes in `joint_tests()` and interaction contrasts for nested models (#266) * Improvement to `multinom` support suggested by this [SO question](https://stackoverflow.com/questions/66675697) emmeans 1.5.4 ------------- * Fix to bug in `rbind.emm_list()` regarding the default for `which` * Fix for a glitch in recovering data for `gee` models (#249) * Support for `svyglm` objects (#248) * Better support for `lqm`, `lqmm`, and added support for `rq` & `rqs` objects (**quantreg** package). User may pass `summary` or `boot` arguments such as `method`, `se`, `R`, ... (#250) * Correction to `multinom` objects (SEs were previously incorrect) and addition of support for related `mclogit::mblogit` objects. If at all possible, users should re-run any pre-1.5.4 analyses of multinomial models
**Note: This correction was wrong!** If using multinomial models, you should use some version *other than* 1.5.4! * Change to less misleading messages and documentation related to the `N.sim` argument of `regrid()`. We are no longer calling this a posterior sample because this is not really a Bayesian method, it is just a simulated set of regression coefficients. emmeans 1.5.3 ------------- * Per long-time threats, we really are removing `CLD()` once and for all. We tried in version 1.5.0, but were forced to cave due to downstream problems. * Addition of `levels<-` method that maps to `update(... levels =)` (#237) * Fix `cld()` so it works with nested cases (#239) * Enable `coef()` method to work with contrasts of nested models. This makes it possible for `pwpp()` to work (#239) * Fixed a coding error in `plot()` that occurs if we use `type = "response"` but there is in fact no transformation ([reported on StackOverflow](https://stackoverflow.com/questions/64962094)) * Added `"log10"` and `"log2"` as legal transformations in `regrid()` * Revised vignette example for MCMC models, added example with **bayestestR** * Expanded support for ordinal models to all link functions available in **ordinal** (errors-out if **ordinal** not installed and link not available in `stats::make.link()`) * Cleaned-up `emmip()` to route plot output to rendering functions `emmip_ggplot()` and `emmip_lattice()`. These functions allow more customization to the plot and can also be called independently. (To do later, maybe next update: the same for `plot.emmGrid()`. What to name rendering functions?? -- suggestions?) * Cleaned up code for `.emmc` functions so that parenthesization of levels does not get in the way of `ref`, `exclude`, or `include` arguments (#246) * Fix to bug in `emtrends()` when `data` is specified (#247) * Tries harder to recover original data when available in the object (#247). In particular, sometimes this is available, e.g., in `$model` slot in a `lm` object, *as long as there are no predictor transformations*. This provides a little bit more safety in cases where the data have been removed or altered. * Tweaks to `rbind.emm_list()` to allow subsetting. (Also documentation & example) emmeans 1.5.2 ------------- * Change to `plot.emmGrid(... comparisons = TRUE)` where we determine arrow bounds and unnecessary-arrow deletions *separately* in each `by` group. See also [Stack Overflow posting](https://stackoverflow.com/questions/63713439) * `emmeans()` with contrasts specified ignores `adjust` and passes to `contrast()` instead. Associated documentation improved (I hope) * Bug-fix for missing cases in `plot(..., comparisons = TRUE)` (#228) * Robustified `plot.emmGrid()` so that comparison arrows work correctly with back-transformations. (Previously we used `regrid()` in that case, causing different CIs and PIs depending on `comparisons`) (#230) * Bug fixes in support for `stan_polr` models. * Bug fix for incorrect (and relatively harmless) warning in several models (#234) * Lower object size via removing unnecessary environment deps (#232) * Repairs to `as.list()` and `as.emmGrid()` to fully support nesting and submodels. emmeans 1.5.1 ------------- * Additional checking for potential errors (e.g. memory overload) connected with `submodel` support. Also, much more memory-efficient code therein (#218, #219) * A new option `enable.submodel` so user can switch off `submodel` support when unwanted or to save memory.
* `multinom` support for `N.sim` option * Modification to internal dispatching of `recover_data` and `emm_basis` so that an external package's methods are always found and given priority whether or not they are registered (#220) * Patches to `gamlss` support. Smoothers are not supported but other aspects are more reliable. See [CV posting](https://stats.stackexchange.com/questions/484886) * Improvement to auto-detection of transformations (#223) * Added `aes` argument in `pwpp()` for more control over rendering (#178) * Fix to a situation in `plot.emmGrid()` where ordering of factor levels could change depending on `CIs` and `PIs` (#225) emmeans 1.5.0 ------------- * Changed help page for `joint_tests()` to reflect `cov.keep` (ver. 1.4.2) * `emm_options()` gains a `disable` argument to use for setting aside any existing options. Useful for reproducible bug reporting. * In `emmeans()` with a `contr` argument or two-sided formula, we now suppress several particular `...` arguments from being passed on to `contrast()` when they should apply only to the construction of the EMMs (#214) * More control of what `...` arguments are passed to methods * `CLD()` was deprecated in version 1.3.4. THIS IS THE LAST VERSION where it will continue to be available. Users should use `multcomp::cld()` instead, for which an `emmGrid` method will continue to exist. * Experimental `submodel` option * Bug fix therein (#217) * Enhancements to `mgcv::gam` support (#216) * New `ubds` dataset for testing with messy situations * Added minimal support for `lqm` and `lqmm` models (#213) * Interim support for user-supplied contrasts for `stanreg` models (#212) emmeans 1.4.8 ------------- * Bug fix and smoother support for `stanreg` objects (#202) * Fix to `emmip()` to be consistent between one curve and several, in whether points are displayed (`style` option) * Added `"scale"` option to `make.tran()` * Auto-detection of standardized response transformation * Fix to a scoping issue in `emtrends()` (#201) * Bug fix for #197 created a new issue #206. Both now fixed. * Non-existent reference levels in `trt.vs.ctrl.emmc()` now throw an error (#208) * Added a default for `linfct` (the identity) to `emmobj` * Provisions for more flexible and consistent labeling/naming of results. This includes added `emm_options` `"sep"` and `"parens"`, and a `parens` argument in `contrast()`. `sep` controls how factor levels are combined when plotted or contrasted, and `parens` sets whether, what, and how labels are parenthesized in `contrast()`. In constructing contrasts of contrasts, for example, labels like `A - B - C - D` are now `(A - B) - (C - D)`, by default. To reproduce old labeling, do `emm_options(sep = ",", parens = "a^")` emmeans 1.4.7 ------------- * Repairs to `pwpp()` so it plays nice with nonestimable cases * Added `"xplanations"` vignette with additional documentation on methods used. (comparison arrows, for starters) * Touch-ups to `plot()`, especially regarding comparison arrows * Bug fix for `stanreg` models (#196) * Fixed error in `emmeans(obj, "1", by = "something")` (#197) * `eff_size()` now supports `emm_list` objects with a `$contrasts` component, using those contrasts. This helps those who specify `pairwise ~ treatment`. * Labels in `contrast()` for factor combinations with `by` groups were wacky (#199) * `emtrends()` screwed up with multivariate models (#200). * Added a new argument `calc` to `summary()`. For example, `calc = c(n = ~.wgt.)` will add a column of sample sizes to the summary.
emmeans 1.4.6 ------------- * Improvements to `coxph` support for models with strata * `emmeans()` with `specs` of class `list` now passes any `offset` and `trend` arguments (#179) * Added `plim` argument to `pwpp()` to allow controlling the scale * More documentation on using `params` (#180) * Robustified support for `gls` objects when data are incomplete (#181) * Fixed bug in `joint_tests()` and `test(..., joint = TRUE)` that can occur with nontrivial `@dffun()` slots (#184) * Improved support for Satterthwaite-based methods in `gls` (#185) and renamed `boot-satterthwaite` to `appx-satterthwaite` (#176) * Further repairs to nesting-related code (#186) * Fix `transform` argument in `ref_grid()` so it is same as in `regrid()` (#188) * Added `pwpm()` function for displaying estimates, pairwise comparisons, and *P* values in matrix form emmeans 1.4.5 ------------- * Change to `.all.vars()` that addresses #170 * Addition of hidden argument `scheffe.rank` in `summary.emmGrid()` to manually specify the desired dimensionality of a Scheffe adjustment (#171) * Provided for `...` to be included in `options` in calls to `emmeans()` and `contrast()`. This allows passing any `summary()` argument more easily, e.g., `emmeans(..., type = "response", bias.adjust = TRUE, infer = c(TRUE, TRUE))` (Before, we would have had to wrap this in `summary()`) * Added a `plotit` argument to `plot.emmGrid()` that works similarly to that in `emmip()`. * Removed startup message for behavior change in 1.4.2; it's been long enough. * Fixed bug with `character` predictors in `at` (#175) emmeans 1.4.4 --------------- * Fixed bug in `emmeans()` associated with non-factors such as `Date` (#162) * Added `nesting.order` option to `emmip()` (#163) * New `style` argument for `emmip()` allows plotting on a numeric scale * More robust detection of response transformations (#166) * Ensure `pwpp()` has tick marks on P-value axis (#167) * Bug fix for an error in `regrid()` when estimates exceed bounds * Bug fix in auto-detecting nesting (#169) to make it less "enthusiastic" * Fixes to formula operations needed because `formula.tools:::as.character.formula` messes me up (thanks to Berwin Turlach, UWA, for alerting me) * Making `qdrg()` more visible in the documentation (because it's often useful) * Added more methods for `emm_list` objects, e.g. `rbind()` and `as.data.frame()`, `as.list()`, and `as.emm_list()` emmeans 1.4.3.01 ---------------- * Fixed bug in post-grid support that affects, e.g., the **ggeffects** package (#161) emmeans 1.4.3 ------------- * Added `"bcnPower"` option to `make.tran()` (per `car::bcnPower()`) * Scoping correction for `emtrends()` (#153) * Allow passing `...` to hook functions (need exposed by #154) * Addition to `regrid()` whereby we can fake any response transformation -- not just `"log"` (again inspired by #154) * Informative message when **pbkrtest** or **lmerTest** is not found (affects `merMod` objects) (#157) * Change in `pwpp()` to make extremely small P values more distinguishable emmeans 1.4.2 ------------- * First argument of `emtrends()` is now `object`, not `model`, to avoid potential mis-matching of the latter with optional `mode` argument * `emtrends()` now uses more robust and efficient code whereby a single reference grid is constructed containing all needed values of `var`. The old version could fail, e.g., in cases where the reference grid involves post-processing.
(#145) * Added `scale` argument to `contrast()` * Added new `"identity"` contrast method * New `eff_size()` function for Cohen effect sizes * Expanded capabilities for interaction contrasts (#146) * New `cov.keep` argument in `ref_grid()` for specifying covariates to be treated just like factors (#148). A side effect is that the system default for indicator variables as covariates is to treat them like 2-level factors. *This could change the results obtained from some analyses using earlier versions*. To replicate old analyses, set `emm_options(cov.keep = character(0))`. * Added merMod-related options as convenience arguments (#150) * Bug fixes: `regrid` ignored offsets with Bayesian models; `emtrends()` did not supply `options` and `misc` arguments to `emm_basis()` (#143) emmeans 1.4.1 ------------- * Added non-estimability infrastructure for Bayesian models, `stanreg` in particular (#114) * Added `max.degree` argument in `emtrends()` making it possible to obtain higher-order trends (#133). Plus minor tuneups, e.g., smaller default increment for difference quotients * Made `emmeans()` more forgiving with `by` variables; e.g., `emmeans(model, ~ dose | treat, by = "route")` will find both `by` variables whereas previously `"route"` would be ignored. * Temporary fix for glitch in gls support where Satterthwaite isn't always right. * Attempt to make annotations clearer and more consistent regarding degrees-of-freedom methods. * Provisions whereby externally provided `emm_basis()` and `recover_data()` methods are used in preference to internal ones - so package developers can provide improvements over what I've cobbled together. * Tried to produce more informative message when `recover_data()` fails * Fixed bug in `contrast()` in identifying true contrasts (#134) * Fixed a bug in `plot.summary_emm()` regarding `CIs` and `intervals` (#137) * Improved support for response transformations. Models with formulas like `log(y + 1) ~ ...` and `2*sqrt(y + 0.5) ~ ...` are now auto-detected. [This may cause discrepancies with examples in past usages, but if so, that would be because the response transformation was previously incorrectly interpreted.] * Added a `ratios` argument to `contrast()` to decide how to handle `log` and `logit` * Added message/annotation when contrasts are summarized with `type = "response"` but there is no way to back-transform them (or we opted out with `ratios = FALSE`) emmeans 1.4 ----------- * Added a courtesy function `.emm_register()` to make it easier for other packages to register their **emmeans** support methods * Clarified the "confidence intervals" vignette discussion of `infer`, explaining that Bayesian models are handled differently (#128) * Added `PIs` option to `plot.emmGrid()` and `emmip()` (#131). Also, in `plot.emmGrid()`, the `intervals` argument has been changed to `CIs` for sake of consistency and less confusion; `intervals` is still supported for backward compatibility. * `plot.emmGrid` gains a `colors` argument so we can customize colors used. * Bug fix for `glht` support (#132 contributed by Balazs Banfai) * `regrid` gains `sim` and `N.sim` arguments whereby we can generate a fake posterior sample from a frequentist model. emmeans 1.3.5.1 ------------- * Bug fix for `gls` objects with non-matrix `apVar` member (#119) * Repairs faulty links in 1.3.5 vignettes emmeans 1.3.5 ------------- * First steps to take prediction seriously.
This includes * Addition of a `sigma` argument to `ref_grid()` (defaults to `sigma(object)` if available) * Addition of an `interval` argument in `predict.emmGrid()` * Addition of a `likelihood` argument in `as.mcmc` to allow for simulating from the posterior predictive distribution * Crude provisions for bias adjustment when back-transforming. This is not really prediction, but it is made possible by availability of `sigma` in object * Further steps to lower the profile of `cld()` and `CLD()` * Family size for Tukey adjustment was wrong when using `exclude` (#107) * Provided for direct passing of info from `recover_data` to `emm_basis` * Attempts to broaden `MCMCglmm` support emmeans 1.3.4 ------------- * Un-naming a lot of arguments in `do.call(paste, ...)` and `do.call(order, ...)`, to prevent problems with factor names like `method` that are argument names for these functions (#94) * Fix to a logic error in `summary.emmGrid()` whereby transformations of class `list` were ignored. * Enhancement to `update.emmGrid(..., levels = levs)` whereby we can easily relabel the reference grid and ensure that the `grid` and `roles` slots stay consistent. Added vignette example. * Clarified ordering rules used by `emmeans()`. We now ensure that the original order of the reference grid is preserved. Previously, the grid was re-ordered if any numeric or character levels occurred out of order, per `order()` * Curbing use of "statistical significance" language. This includes additional vignette material and plans to deprecate `CLD()` due to its misleading display of pairwise-comparison tests. * Bug fix for `betareg` objects, where the wrong `terms` component was sometimes used. * Correction to logic error that affected multiplicity adjustments when `by` variables are present (#98). * Addition of `pwpp()` function to plot *P* values of comparisons * Improvement to `summary(..., adjust = "scheffe")`. We now actually compute and use the rank of the matrix of linear functions to obtain the *F* numerator d.f., rather than trying to guess the likely correct value. * Removal of vignette on transitioning from **lsmeans** -- it's been a long enough time now. emmeans 1.3.3 ------------- * Fix to unintended consequence of #71 that caused incorrect ordering of `contrast()` results if they are later used by `emmeans()`. This was first noticed with ordinal models in `prob` mode (#83). * Improved checking of conformability of parameters -- for models with rank deficiency not handled same way as lm()'s NA convention * Added basic support for `sommer::mmer`, `MuMIn::averaging`, and `mice::mira` objects * Fix in `nnet::multinom` support when there are 2 outcomes (#19) * Added Satterthwaite d.f. to `gls` objects * `famSize` now correct when `exclude` or `include` is used in a contrast function (see #68) * Stronger warnings of possible bias with `aovList` objects, in part due to the popularity of `afex::aov_ez()` which uses these models. * Updates to FAQs vignette emmeans 1.3.2 ------------- * I decided to enable "optimal digits" display by default. In summaries, we try to show enough---but not too much---precision in estimates and confidence intervals. 
If you don't like this and want to revert to the old (exaggerated precision) behavior, do `emm_options(opt.digits = FALSE)` * Added `include` argument to most `.emmc` functions (#67) * Now allow character values for `ref`, `exclude`, and `include` in `.emmc` functions (#68) * Better handling of matrix predictors (#66) * Fixed over-zealous choice to not pass `...` arguments in `emmeans()` when two-sided formulas are present * Fix to `clm` support when model is rank-deficient * Fix to `regrid(..., transform = "log")` error when there are existing non-estimable cases (issue #65) * Improvements to `brmsfit` support (#43) * Added support for `mgcv::gam` and `mgcv::gamm` models * `.my.vcov()` now passes `...` to clients * Removed **glmmADMB** support. This package appears to be dormant * Fixed ordering bug for nested models (#71) * Support for `manova` object no longer requires `data` keyword (#72) * Added support for multivariate response in `aovlist` models (#73) * Documentation clarification (#76) * Fix to `CLD` fatal error when `sort = TRUE` (#77) * Fix to issue with weights and incomplete cases with `lme` objects (#75) * Nested fixed-effects yielded NonEsts when two factors are nested in the same factor(s) (#79) emmeans 1.3.1 ------------- * `"mvt"` adjustment ignored `by` grouping * `contrast()` mis-labeled estimates when levels varied among `by` groups (most prominently this happened in `CLD(..., details = TRUE)`) * Changed `aovlist` support so it re-fits the model when non-sum-to-zero contrasts were used * `print.summary_emm()` now cleans up numeric columns with `zapsmall()` * More robust handling of `nesting` in `ref_grid()` and `update()`, and addition of `covnest` argument for whether to include covariates when auto-detecting nesting * Revision of some vignettes * Fixed bug in `hpd.summary()` and handoff to it from `summary()` * Fixed bug where `ref_grid()` ignored `mult.levs` * Fixes in emmeans where it passes `...` where it shouldn't * `CLD()` now works for MCMC models (uses frequentist summary) * Addition of `opt.digits` option emmeans 1.3.0 ------------- * Deprecated functions like `ref.grid()` put to final rest, and we no longer support packages that provide `recover.data` or `lsm.basis` methods * Courtesy exports `.recover_data()` and `.emm_basis()` to provide access for extension developers to all available methods * Streamlining of a stored example in `inst/extdata` * Fix to `.all.vars()` that could cause errors when response variable has a function call with character constants. * Relabeling of differences as ratios when appropriate in `regrid()` (so results match `summary()` labeling with `type = "response"`). * `plot.emmGrid(..., comparisons = TRUE, type = "response")` produced incorrect comparison arrows; now fixed emmeans 1.2.4 ------------- * Support for model formulas such as `df$y ~ df$treat + df[["cov"]]`. This had failed previously for two obscure reasons, but now works correctly. * New `simplify.names` option for above types of models * `emm_options()` with no arguments now returns all options in force, including the defaults. This makes it more consistent with `options()` * Bug fix for `emtrends()`; produced incorrect results in models with offsets. * Separated the help pages for `update.emmGrid()` and `emm_options()` * New `qdrg()` function (quick and dirty reference grid) for help with unsupported model objects emmeans 1.2.3 ------------- * S3 methods involving packages **multcomp** and **coda** are now dynamically registered, not merely exported as functions. 
This passes checks when S3 methods are required to be registered. * `cld()` has been deprecated in favor of `CLD()`. This had been a headache. **multcomp** is the wrong place for the generic to be; it is too fancy a dance to export `cld` with or without having **multcomp** installed. * Added vignette caution regarding interdependent covariates * Improved **glmmADMB** support to recover contrasts correctly emmeans 1.2.2 ------------- * Moved **ggplot2**, **multcomp**, and **coda** to Suggests -- thus vastly reducing dependencies * Added a FAQ to the FAQs vignette * Modified advice in `xtending.Rmd` vignette on how to export methods * Fixes to `revpairwise.emmc` and `cld` regarding comparing only 1 EMM * `cld.emm_list` now returns results only for `object[[ which[1] ]]`, along with a warning message. * Deprecated `emmeans` specs like `cld ~ group`, a vestige of **lsmeans** as it did not work correctly (and was already undocumented) emmeans 1.2.1 ------------- * Moved **brms** to `Suggests` (dozens and dozens fewer dependencies) emmeans 1.2 ----------- * Index of vignette topics added * New, improved (to my taste) vignette formats * Fixed df bug in regrid (#29) * Fixed annotation bug for nested models (#30) * Better documentation for `lme` models in "models" vignette * Additional fixes for arguments passed to `.emmc` functions (#22) * Support added for logical predictors (who knew we could have those? not me) * Replaced tex/pdf "Extending" vignette with Rmd/html * Overhauled the faulty logic for df methods in emm_basis.merMod * Added Henrik to contributors list (long-standing oversight) * Added `exclude` argument to most `.emmc` functions: allows user to omit certain levels when computing contrasts * New `hpd.summary()` function for Bayesian models to show HPD intervals rather than frequentist summary. Note: `summary()` automatically reroutes to it. Also `plot()` and `emmip()` play along. * Rudimentary support for **brms** package * *Ad hoc* Satterthwaite method for `nlme::lme` models emmeans 1.1.3 ------------- * Formatting corrections in documentation * Fixed bug for survival models where `Surv()` was interpreted as a response transformation. * Fixed bug (issue #19) in multinom support * Fixed bug (issue #22) in optional arguments with interaction contrasts * Fixed bug (issue #23) in weighting with character predictors * Clarifying message when `cld()` is applied to an `emm_list` (issue #24) * Added `offset` argument to `ref_grid()` (scalar offset only) and to `emmeans()` (vector offset allowed) -- (issue #18) * New optional argument for `[.summary_emm` to choose whether to retain its class or coerce to a `data.frame` (relates to issue #14) * Added `reverse` option for `trt.vs.ctrl` and relatives (#27) emmeans 1.1.2 ------------- * Changed the way `terms` is accessed with `lme` objects to make it more robust * `emmeans:::convert_scripts()` renames output file more simply * Added `[` method for class `summary_emm` * Added `simple` argument for `contrast` - essentially the complement of `by` * Improved estimability handling in `joint_tests()` * Made `ref_grid()` accept `ylevs` list of length > 1; also slight argument change: `mult.name` -> `mult.names` * Various bug fixes, bullet-proofing * Fixes to make Markdown files render better emmeans 1.1 ----------- * Fixed a bug in `emmeans()` wherein `weights` was ignored when `specs` is a `list` * Coerce `data` argument, if supplied, to a data.frame (`recover_data()` doesn't like tibbles...)
* Added `as.data.frame` method for `emmGrid` objects, making it often possible to pass it directly to other functions as a `data` argument. * Fixed bug in `contrast()` where `by` was ignored for interaction contrasts * Fixed bug in `as.glht()` where it choked on `df = Inf` * Fixed bug occurring when a model call has no `data` or `subset` * New `joint_tests()` function tests all [interaction] contrasts emmeans 1.0 ----------- * Added preliminary support for `gamlss` objects (but doesn't support smoothing). Additional argument is `what = c("mu", "sigma", "nu", "tau")`. It seems to be flaky when the model of interest is just `~ 1`. * Improved support for models with fancy variable names (containing spaces and such) * Fixed a bug whereby `emmeans()` might pass `data` to `contrast()` * Added some missing documentation for `summary.emmGrid()` * Repaired handling of `emm_options(summary = ...)` to work as advertised. * Changed many object names in examples and vignettes from xxx.emmGrid to xxx.emm (result of overdoing the renaming of the object class itself) * Changed `emmGrid()` function to `emm()` as had been intended as alternative to `mcp()` in `multcomp::glht()` (result of ditto). * Fixed error in exporting `cld.emm_list()` * Fixed a bug whereby all CIs were computed using the first estimate's degrees of freedom. * Now using `Inf` to display d.f. for asymptotic (z) tests. (`NA` will still work too but `Inf` is a better choice for consistency and meaning.) * Bug fix in nesting-detection code when model has only an intercept emmeans 0.9.1 ------------- * Documentation corrections (broken links, misspellings, mistakes) * More sophisticated check for randomized data in `recover_data()` now throws an error when it finds recovered data not reproducible * Added support for gam::gam objects * Fixes to `vcov()` calls to comply with recent R-devel changes emmeans 0.9 ----------- This is the initial major version that replaces the **lsmeans** package. Changes shown below are changes made to the last real release of **lsmeans** (version 2.27-2). **lsmeans** versions greater than that are transitional to that package being retired. * We now emphasize the terminology "estimated marginal means" rather than "least-squares means" * The flagship functions are now `emmeans()`, `emtrends()`, `emmip()`, etc. But `lsmeans()`, `lstrends()`, etc. as well as `pmmeans()` etc. are mapped to their corresponding `emxxxx()` functions. * In addition, we are trying to avoid names that could get confused as S3 methods. So, `ref.grid -> ref_grid`, `lsm.options -> emm_options`, etc. * Classes `ref.grid` and `lsmobj` are gone. Both are replaced by class `emmGrid`. An `as.emmGrid()` function is provided to convert old objects to class `emmGrid`. * I decided to revert back to "kenward-roger" as the default degrees-of-freedom method for `lmerMod` models. Also added options `disable.lmerTest` and `lmerTest.limit`, similar to those for **pbkrtest**. * Documentation and NAMESPACE are now "ROxygenated" * Additional `neuralgia` and `pigs` datasets * Dispatching of `emmeans()` methods is now top-down rather than convoluted intermingling of S3 methods * Improved display of back-transformed contrasts when log or logit transformation was used: We change any ` - `s in labels to ` / `s to emphasize that these results are ratios. * A message is now displayed when nesting is auto-detected in `ref_grid`.
(Can be disabled via `emm_options()`) * Options were added for several messages that users may want to suppress, e.g., ones about interactions and nesting. * Greatly overhauled help page for models. It is now a vignette, with a quick reference chart linked to details, and is organized by similarities instead of packages. * Support for 'mer' objects (lme4.0 package) removed. * A large number of smaller interlinked vignettes replaces the one big one on using the package. Several vignettes are linked in the help pages. * Graphics methods `plot()` and `emmip()` are now **ggplot2**-based. Old **lattice**-based functionality is still available too, and there is a `graphics.engine` option to choose the default. * Non-exported utilities convert_workspace() and convert_scripts() to help with transition * Moved `Suggests` pkgs to `Enhances` when not needed for building/testing ### NOTE: **emmeans** is a continuation of the **lsmeans** package. New developments will take place in **emmeans**, and **lsmeans** will remain static and eventually will be archived. emmeans/MD50000644000176200001440000002122414165107726012221 0ustar liggesusers6768814a14c615d9afafd5c89f0e5428 *DESCRIPTION 996ab623b54a9e51b1d209d8f017a0c6 *NAMESPACE 4c10c3dc0f145e1a823d91fa0a764552 *NEWS.md 5b2465ce3313d2e0f694d617edc0bf2c *R/0nly-internal.R b92d508a11637771f9778eb117dfb5c3 *R/MCMC-support.R cc5d274526252ba8fa1f0003d8de9fde *R/S4-classes.R 62b1398be346f0b006c5e5aa86317ba7 *R/aovlist-support.R 3a65a08f86878a7a72639f679a71df5c *R/betareg.support.R 8d995e61519738294cc2c60acc5d84dc *R/brms-support.R fcaf663aad21a6bb8a519c19e1c6efa2 *R/cld-emm.R 7db4b8835bfdf45d48827ab2818ee294 *R/contrast.R b41de83365a0c6bd88ddb0ef9e549f3d *R/countreg-support.R a1aab34ef85dc1ee374a19eeb52baeb2 *R/datasets.R 2959650a58d3505b26850cafbdcbd253 *R/eff-size.R 6be98990a676724546d709ef5e3b87b7 *R/emm-contr.R 80e3b56525fc9cd27cbe3cb1fa4cd33b *R/emm-list.R 1e99c911b5159b1df094e85ba78d1aa6 *R/emmGrid-methods.R fc3199c45bb867e024401d189fbee370 *R/emmeans-package.R 6ac529fc5cd76cd54e3d2cbd0e47187e *R/emmeans.R 9d7a4c450baeff3e1ad6383edab1ae28 *R/emmip.R af7e3aa2d49ac894e996cf738700b092 *R/emtrends.R 15f8cc00e161c47db491adc51fd30fd9 *R/gam-support.R 88e8f5c128dae1adb51715e51dea5d97 *R/glht-support.R ea5498376036bab6d755a43182b16cb3 *R/helpers.R 12c25d76ad658082d0f3e892e7954bdc *R/interfacing.R 098558c7854f85c88233644c69a3f9e1 *R/lqm-support.R 677f27b19a2d339a6a65e5b4f6c46ccc *R/models.R f040dd6cda7e9c3d2f3038470730be76 *R/multinom-support.R 2a39a20b5333be43487ef6d535cc17f8 *R/multiple-models.R 385f4ace4290e18f0a9ffa788c58c058 *R/multiv.R 12f7d139391bab8d47423a6efe72e8f8 *R/nested.R 04d2e6a0bf84d71efe18ae1b658723c1 *R/nonlin-support.R f1dd65215874592701b40178096f48b0 *R/ordinal-support.R ee51c12576443bc378031ceea482ebdf *R/plot.emm.R 6e879224fe4e49b1cc483865341b8bbc *R/pwpp.R 0608106edf002b193e7dd21d66b07d3d *R/qdrg.R 1c2126361fdba2f8201fd2cd0e41e220 *R/rbind.R 425cee67e69abb690904f35957756727 *R/ref-grid.R b8833cde9f0ba7791c76c6df51ed0dd3 *R/rms-support.R 6bda66c4ffd654a5ee2698c28acc7f3a *R/sommer-support.R 5512efa6d2e7cf507b449c563912a60c *R/summary.R e23d3487c198e1bc462a6600e17a5de4 *R/test.R 984ea7eb20780d6b27fec69033558155 *R/transformations.R 09a2c6ae53a4dbbb498ef56f776fc399 *R/wrappers.R d5da807c7f73b236fe74fa0fdeade92f *R/xtable-method.R 3c4829af0abdf2e9ef47ea14a0c882b8 *R/zzz.R c264fd2137ae53b4fe4e4905a4e7f78d *README.md f8a26e204f6fe83a46ac219a32a19c30 *build/vignette.rds dd8a9e9e6180f65bff84a40f25943636 *data/MOats.RData 
2d7a1d217def5e4ab094c7f019a7d27d *data/autonoise.RData 991422f691ec5cdee38ab0d613e2a89c *data/feedlot.RData f204c1431e151e2459178e8f4fc8f0e8 *data/fiber.RData f5bb40e88a8879e008e86a08202a1466 *data/neuralgia.rda 298a5159a77e201ffe1410cf80282eff *data/nutrition.RData 2b03398ef2214b51371430e875d964cc *data/oranges.RData 913189bb7ee175c6e16deef8f53c2c05 *data/pigs.rda 73a66cd691d5b6d96ba54742c361f18e *data/ubds.RData 782efc97ea1fdebe5f28a5da97148559 *inst/css/clean-simple.css c8030e025af4124d64466126267648e5 *inst/doc/FAQs.R 43d37a99c9b184f6694cf815972b2629 *inst/doc/FAQs.Rmd 407e47796c5a413fc66d6bf7bae835f2 *inst/doc/FAQs.html aee24e495957066729e537111d82ba90 *inst/doc/basics.R 4567c08599d58fd4abe6b92c09dce552 *inst/doc/basics.Rmd d7a2369bf00f8bf41d2f2b793628d06d *inst/doc/basics.html ee377f63146ffe7cde526ebd8c8c3ec8 *inst/doc/comparisons.R 67f06dc86146a9142f46d3e4c1d9f27a *inst/doc/comparisons.Rmd 39246d426cd1e2b77284a553f6e68190 *inst/doc/comparisons.html a4ecd8b1e561d59a63a9eec6904e6225 *inst/doc/confidence-intervals.R 5e259ccf3ea9b13751429db5283bb917 *inst/doc/confidence-intervals.Rmd 13b1ab48b674a0c6d92c331dd121c9af *inst/doc/confidence-intervals.html 18a50f2ff34e1fa999cbfbbbee33962c *inst/doc/interactions.R 366bec92630dce76260034317e6344e7 *inst/doc/interactions.Rmd ed027422b737a7a681da6fb31e30ffe1 *inst/doc/interactions.html 907cd7b83bda8294b3bd5d48fd7c3964 *inst/doc/messy-data.R 2cb9b6f4e74ee27b192b0823b26d219f *inst/doc/messy-data.Rmd a51f40f9ab2d27dd52675034942634e8 *inst/doc/messy-data.html 370507cd90cecd640575cc7c71fab3f6 *inst/doc/models.Rmd 8cec9907e9afab8b4fd9f5e7cd912330 *inst/doc/models.html d16af40cfcb655e99dda6fc4f6e20649 *inst/doc/predictions.R 93fc9584e2ce49dd9a384406f615ee18 *inst/doc/predictions.Rmd 53c6ca29fef37df4ddf9715334008a63 *inst/doc/predictions.html d4d9023543c5ea6c7db1a75b0b78f080 *inst/doc/sophisticated.R 999c1f7f2c0a53793af6b19bcb7dbc9c *inst/doc/sophisticated.Rmd 9dfe7f8cd2c6fc22943adc8c13abd176 *inst/doc/sophisticated.html 40d2b8368a8d74fdf3a54da87b15c81d *inst/doc/transformations.R 4cea517e2a56de0f711dfed05a525b45 *inst/doc/transformations.Rmd 56a8d3ef538472072e0db255d0aeb115 *inst/doc/transformations.html 2059082a2cf9f2bd44e3fbf82221538a *inst/doc/utilities.R 7ac17898cab9c93c082b96dd196c70a8 *inst/doc/utilities.Rmd 3bb136b3f186cb733da61bd068fdcbb9 *inst/doc/utilities.html c7659fd423e38bbe27bc3af366d13bda *inst/doc/vignette-topics.Rmd de546fd715e35a5871dac98a44d91d6f *inst/doc/vignette-topics.html a192d49f4da4d58ec8fd6862276f4e85 *inst/doc/xplanations.R 6f223c166b7ba59c3c660d92c8209223 *inst/doc/xplanations.Rmd 7ec3692d85479db45416c9677035283c *inst/doc/xplanations.html 8f429d4157e346f3948732faec5451fc *inst/doc/xtending.R 00f49790353b266fe734d1dd9b0a37cf *inst/doc/xtending.Rmd 93d7344112479dfbabf7e3af9232400d *inst/doc/xtending.html 7b8cd42116a0f48bab21497ca906cd88 *inst/extdata/cbpplist 406e3dc1e6186935a4a27de7b9504970 *inst/extdata/cbpppriorrglist e0050219ed153a384beffdd35d9ca7c1 *inst/extdata/cbpprglist 6543a8d1793773e95418d14e5f69738c *inst/extdata/cbppsigma 36dd7bafc4faa1d2a730e5a217d74bd4 *man/CLD.emmGrid.Rd aae866046e88015710e8f7e93af368cd *man/MOats.Rd 4aaca900c27aab06eaacb22780d0b79f *man/add_grouping.Rd bef1dbd2df575df24140ba51886756ee *man/as.emmGrid.Rd 541ac8af53b9ad1b8602ae9fbd42c209 *man/auto.noise.Rd 19ed39119280572e18b4669ae37c30d4 *man/contrast.Rd c33975d95af9a38d6d29c186aeb854df *man/eff_size.Rd 3c33e59fdb25c915d5155aeb015c6c0b *man/emmGrid-class.Rd 408b1bd096766f593b8b46cd3ae1c34c *man/emmGrid-methods.Rd 
984ad80691e91b88429ccf767034071a *man/emm_list-object.Rd 0262a4f46f4977a24f79211328652971 *man/emm_options.Rd a58786944d0ed0b64fb7ac94a6acbe12 *man/emmc-functions.Rd e11c42db91b145558e501e9bc6694872 *man/emmeans-package.Rd 76918ade5739574315e5150d88886f5d *man/emmeans.Rd 96c57032430522abb6343a966efa381b *man/emmip.Rd 5ab06634fe087f428a01da6d99dee0d0 *man/emmobj.Rd 562eae5962fc6b82c746221bf8f45f3e *man/emtrends.Rd 8cb236cb650506c92ed158496947eff6 *man/extending-emmeans.Rd 113ba6de508b28d398037dd059f608c2 *man/feedlot.Rd 7311386c6c6ec360a31cbc85a42dc11c *man/fiber.Rd 7523139417ee796497bd526a0371cef6 *man/glht-support.Rd 9774b9abaa9e9b081727d7a34ceeacf9 *man/hpd.summary.Rd 353b3c1364c3f1a48bf924e834570d7c *man/joint_tests.Rd 51a34ab35df9ac6871d9a93dff99a509 *man/make.tran.Rd ba2ea5ae5f731698b90eb9164b89c958 *man/mcmc-support.Rd 1ba90c6af7fed64648527ed84ee082e6 *man/models.Rd 9d4dec708f7505d99132285bce7cb442 *man/mvcontrast.Rd 7a081c8af22c249f4ae84dbebb48ed38 *man/neuralgia.Rd db31ec28aa26ad81bb250b2f92245f19 *man/nutrition.Rd 3d66ca80081f8e5ac0f9151400b52d4c *man/oranges.Rd 6cf7125e7c06e58f6473f2b9fd5fb432 *man/pigs.Rd e9a6b828be4af80cf0f52dc97a0c96d4 *man/plot.Rd c21c52194882a4d0343c6f6dea76ee10 *man/pwpm.Rd 36a2e74ae82273a8f263ccb9ac90deef *man/pwpp.Rd f728226b6e261dc9917258bfea161fec *man/qdrg.Rd a2a96924fe9420cbc3f982610dec21ca *man/rbind.emmGrid.Rd e0979163301939dc9b02ee59301facfd *man/ref_grid.Rd 6ce8206ba94383320ae230e32e2613c3 *man/regrid.Rd ecae5b32c75ef09348c923f05d424e8d *man/summary.emmGrid.Rd 1dc29fe89a1251ee68f5eb556d7b5204 *man/ubds.Rd 7cfe0c2d02f0da7f9c83598b1df08504 *man/update.emmGrid.Rd d878308e8698ae369e3693e0c645dc1a *man/wrappers.Rd cf7e49d26b701657dc29e24363e00ec4 *man/xtable.emmGrid.Rd 5f7a31f3566231d8f586eab956aa7cde *tests/testthat.R 91a2039503542c87cfeb53f7608756d7 *tests/testthat/test-contrast.R fc837aed66f0f2acaa05c7e2ee30ddfb *tests/testthat/test-emmeans.R 67f10d59f3e191c2d4675c58bbcf7427 *tests/testthat/test-emtrends.R 9433d1c6044ba24185da97b2f23fb65a *tests/testthat/test-nested.R 9ae8a4b9239cdc7cdddc1abf12ae813a *tests/testthat/test-ref_grid.R 43d37a99c9b184f6694cf815972b2629 *vignettes/FAQs.Rmd 4567c08599d58fd4abe6b92c09dce552 *vignettes/basics.Rmd 67f06dc86146a9142f46d3e4c1d9f27a *vignettes/comparisons.Rmd 5e259ccf3ea9b13751429db5283bb917 *vignettes/confidence-intervals.Rmd 366bec92630dce76260034317e6344e7 *vignettes/interactions.Rmd 2cb9b6f4e74ee27b192b0823b26d219f *vignettes/messy-data.Rmd 370507cd90cecd640575cc7c71fab3f6 *vignettes/models.Rmd 93fc9584e2ce49dd9a384406f615ee18 *vignettes/predictions.Rmd 999c1f7f2c0a53793af6b19bcb7dbc9c *vignettes/sophisticated.Rmd 4cea517e2a56de0f711dfed05a525b45 *vignettes/transformations.Rmd 7ac17898cab9c93c082b96dd196c70a8 *vignettes/utilities.Rmd c7659fd423e38bbe27bc3af366d13bda *vignettes/vignette-topics.Rmd 6f223c166b7ba59c3c660d92c8209223 *vignettes/xplanations.Rmd 00f49790353b266fe734d1dd9b0a37cf *vignettes/xtending.Rmd emmeans/inst/0000755000176200001440000000000014165066776012676 5ustar liggesusersemmeans/inst/doc/0000755000176200001440000000000014165066776013443 5ustar liggesusersemmeans/inst/doc/comparisons.Rmd0000644000176200001440000004044014137062735016434 0ustar liggesusers--- title: "Comparisons and contrasts in emmeans" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{Comparisons and contrasts} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, results = "hide", message = FALSE} 
require("emmeans") knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") ``` ## Contents This vignette covers techniques for comparing EMMs at levels of a factor predictor, and other related analyses. 1. [Pairwise comparisons](#pairwise) 2. [Other contrasts](#contrasts) 3. [Formula interface](#formulas) 4. [Custom contrasts and linear functions](#linfcns) 5. [Special behavior with log transformations](#logs) 6. Interaction contrasts (in ["interactions" vignette](interactions.html#contrasts)) 7. Multivariate contrasts (in ["interactions" vignette](interactions.html#multiv)) [Index of all vignette topics](vignette-topics.html) ## Pairwise comparisons {#pairwise} The most common follow-up analysis for models having factors as predictors is to compare the EMMs with one another. This may be done simply via the `pairs()` method for `emmGrid` objects. In the code below, we obtain the EMMs for `source` for the `pigs` data, and then compare the sources pairwise. ```{r} pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs) pigs.emm.s <- emmeans(pigs.lm, "source") pairs(pigs.emm.s) ``` In its out-of-the-box configuration, `pairs()` sets two defaults for [`summary()`](confidence-intervals.html#summary): `adjust = "tukey"` (multiplicity adjustment), and `infer = c(FALSE, TRUE)` (test statistics, not confidence intervals). You may override these, of course, by calling `summary()` on the result with different values for these. In the example above, EMMs for later factor levels are subtracted from those for earlier levels; if you want the comparisons to go in the other direction, use `pairs(pigs.emm.s, reverse = TRUE)`. Also, in multi-factor situations, you may specify `by` factor(s) to perform the comparisons separately at the levels of those factors. ### Matrix displays {#pwpm} The numerical main results associated with pairwise comparisons can be presented compactly in matrix form via the `pwpm()` function. We simply hand it the `emmGrid` object to use in making the comparisons: ```{r} pwpm(pigs.emm.s) ``` This matrix shows the EMMs along the diagonal, $P$ values in the upper triangle, and the differences in the lower triangle. Options exist to switch off any one of these and to switch which triangle is used for the latter two. Also, optional arguments are passed. For instance, we can reverse the direction of the comparisons, suppress the display of EMMs, swap where the $P$ values go, and perform noninferiority tests with a threshold of 0.05 as follows: ```{r} pwpm(pigs.emm.s, means = FALSE, flip = TRUE, # args for pwpm() reverse = TRUE, # args for pairs() side = ">", delta = 0.05, adjust = "none") # args for test() ``` With all three *P* values so small, we have fish, soy, and skim in increasing order of noninferiority based on the given threshold. When more than one factor is present, an existing or newly specified `by` variables() can split the results into l list of matrices. ### Effect size Some users desire standardized effect-size measures. Most popular is probably Cohen's *d*, which is defined as the observed difference, divided by the population SD; and obviously Cohen effect sizes are close cousins of pairwise differences. They are available via the `eff_size()` function, where the user must specify the `emmGrid` object with the means to be compared, the estimated population SD `sigma`, and its degrees of freedom `edf`. 
This is illustrated with the current example: ```{r} eff_size(pigs.emm.s, sigma = sigma(pigs.lm), edf = 23) ``` The confidence intervals shown take into account the error in estimating `sigma` as well as the error in the differences. Note that the intervals are narrower if we claim that we know `sigma` perfectly (i.e., infinite degrees of freedom): ```{r} eff_size(pigs.emm.s, sigma = sigma(pigs.lm), edf = Inf) ``` Note that `eff_size()` expects the object with the means, not the differences. If you want to use the differences, use the `method` argument to specify that you don't want to compute pairwise differences again; e.g., ```{r, eval = FALSE} eff_size(pairs(pigs.emm.s), sigma = sigma(pigs.lm), edf = 23, method = "identity") ``` (results are identical to the first effect sizes shown). ### Graphical comparisons {#graphical} Comparisons may be summarized graphically via the `comparisons` argument in `plot.emmGrid()`: ```{r fig.height = 1.5} plot(pigs.emm.s, comparisons = TRUE) ``` The blue bars are confidence intervals for the EMMs, and the red arrows are for the comparisons among them. If an arrow from one mean overlaps an arrow from another group, the difference is not "significant," based on the `adjust` setting (which defaults to `"tukey"`) and the value of `alpha` (which defaults to 0.05). See the ["xplanations" supplement](xplanations.html#arrows) for details on how these are derived. *Note:* Don't *ever* use confidence intervals for EMMs to perform comparisons; they can be very misleading. Use the comparison arrows instead; or better yet, use `pwpp()`. *A caution:* it really is not good practice to draw a bright distinction based on whether or not a *P* value exceeds some cutoff. This display does dim such distinctions somewhat by allowing the viewer to judge whether a *P* value is close to `alpha` one way or the other; but a better strategy is to simply obtain all the *P* values using `pairs()`, and look at them individually. #### Pairwise *P*-value plots {#pwpp} In trying to develop an alternative to compact-letter displays (see next subsection), we devised the "pairwise *P*-value plot" displaying all the *P* values in pairwise comparisons: ```{r} pwpp(pigs.emm.s) ``` Each comparison is associated with a vertical line segment that joins the scale positions of the two EMMs being compared, and whose horizontal position is determined by the *P* value of that comparison. This kind of plot can get quite "busy" as the number of means being compared goes up. For example, suppose we include the interactions in the model for the pigs data, and compare all 12 cell means: ```{r, fig.width = 9} pigs.lmint <- lm(log(conc) ~ source * factor(percent), data = pigs) pigs.cells <- emmeans(pigs.lmint, ~ source * percent) pwpp(pigs.cells, type = "response") ``` While this plot has a lot of stuff going on, consider looking at it row-by-row. Next to each EMM, we can visualize the *P* values of all 11 comparisons with each other EMM (along with their color codes). Also, note that we can include arguments that are passed to `summary()`; in this case, to display the back-transformed means. If we are willing to forgo the diagonal comparisons (where neither factor has a common level), we can make this a lot less cluttered via a `by` specification: ```{r, fig.width = 6} pwpp(pigs.cells, by = "source", type = "response") ``` In this latter plot we can see that the comparisons with `skim` as the source tend to be statistically stronger.
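To see those *P* values in numeric form, we can run the corresponding comparisons directly -- a sketch, not evaluated here, that uses the same by-`source` families as the three-paneled plot:
```{r, eval = FALSE}
pairs(pigs.cells, by = "source", type = "response")
```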
This is also an opportunity to remind the user that multiplicity adjustments are made relative to each `by` group. For example, comparing `skim:9` versus `skim:15` has a Tukey-adjusted *P* value somewhat greater than 0.1 when all are in one family of 12 means, but about 0.02 relative to a smaller family of 4 means as depicted in the three-paneled plot. #### Compact letter displays {#CLD} Another way to depict comparisons is by *compact letter displays*, whereby two EMMs sharing one or more grouping symbols are not "significantly" different. These may be generated by the `multcomp::cld()` function. I really recommend against this kind of display, though, and decline to illustrate it. These displays promote visually the idea that two means that are "not significantly different" are to be judged as being equal; and that is a very wrong interpretation. In addition, they draw an artificial "bright line" between *P* values on either side of `alpha`, even ones that are very close. [Back to Contents](#contents) ## Other contrasts {#contrasts} Pairwise comparisons are an example of linear functions of EMMs. You may use `coef()` to see the coefficients of these linear functions: ```{r} coef(pairs(pigs.emm.s)) ``` The pairwise comparisons correspond to columns of the above results. For example, the first pairwise comparison, `fish - soy`, gives coefficients of 1, -1, and 0 to fish, soy, and skim, respectively. In cases such as this one, where each column of coefficients sums to zero, the linear functions are termed *contrasts*. The `contrast()` function provides for general contrasts (and linear functions, as well) of factor levels. Its second argument, `method`, specifies the contrast method to be used. In this section we describe the built-in methods, which are specified simply by name. Consider, for example, the factor `percent` in the model `pigs.lm`. It is treated as a factor in the model, but it corresponds to equally-spaced values of a numeric variable. In such cases, users often want to compute orthogonal polynomial contrasts: ```{r} pigs.emm.p <- emmeans(pigs.lm, "percent") ply <- contrast(pigs.emm.p, "poly") ply coef(ply) ``` We obtain tests for the linear, quadratic, and cubic trends. The coefficients are those that can be found in tables in many experimental-design texts. It is important to understand that the estimated linear contrast is *not* the slope of a line fitted to the data. It is simply a contrast having coefficients that increase linearly. It *does* test the linear trend, however. There are a number of other named contrast methods, for example `"trt.vs.ctrl"`, `"eff"`, and `"consec"`. The `"pairwise"` and `"revpairwise"` methods in `contrast()` are the same as `pairs()` and `pairs(..., reverse = TRUE)`. See `help("contrast-methods")` for details. [Back to Contents](#contents) ## Formula interface {#formulas} If you already know what contrasts you will want before calling `emmeans()`, a quick way to get them is to specify the method as the left-hand side of the formula in its second argument. For example, with the `oranges` dataset provided in the package, ```{r} org.aov <- aov(sales1 ~ day + Error(store), data = oranges, contrasts = list(day = "contr.sum")) org.emml <- emmeans(org.aov, consec ~ day) org.emml ``` The contrasts shown are the day-to-day changes. This two-sided formula technique is quite convenient, but it can also create confusion.
For one thing, the result is not an `emmGrid` object anymore; it is a `list` of `emmGrid` objects, called an `emm_list`. You may need to be cognizant of that if you are to do further contrasts or other analyses. For example, if you want `"eff"` contrasts as well, you need to do `contrast(org.emml[[1]], "eff")` or `contrast(org.emml, "eff", which = 1)`. Another issue is that it may be unclear which part of the results is affected by certain options. For example, if you were to add `adjust = "bonf"` to the `org.emml` call above, would the Bonferroni adjustment be applied to the EMMs, or to the contrasts? (See the documentation if interested; but the best practice is to avoid such dilemmas.) [Back to Contents](#contents) ## Custom contrasts and linear functions {#linfcns} The user may write a custom contrast function for use in `contrast()`. What's needed is a function having the desired name with `".emmc"` appended, that generates the needed coefficients as a list or data frame. The function should take a vector of levels as its first argument, and any optional parameters as additional arguments. For example, suppose we want to compare every third level of a treatment. The following function provides for this: ```{r} skip_comp.emmc <- function(levels, skip = 1, reverse = FALSE) { if((k <- length(levels)) < skip + 1) stop("Need at least ", skip + 1, " levels") coef <- data.frame() coef <- as.data.frame(lapply(seq_len(k - skip - 1), function(i) { sgn <- ifelse(reverse, -1, 1) sgn * c(rep(0, i - 1), 1, rep(0, skip), -1, rep(0, k - i - skip - 1)) })) names(coef) <- sapply(coef, function(x) paste(which(x == 1), "-", which(x == -1))) attr(coef, "adjust") = "fdr" # default adjustment method coef } ``` To test it, try 5 levels: ```{r} skip_comp.emmc(1:5) skip_comp.emmc(1:5, skip = 0, reverse = TRUE) ``` (The latter is the same as `"consec"` contrasts.) Now try it with the `oranges` example we had previously: ```{r} contrast(org.emml[[1]], "skip_comp", skip = 2, reverse = TRUE) ``` ####### {#linfct} The `contrast()` function may in fact be used to compute arbitrary linear functions of EMMs. Suppose for some reason we want to estimate the quantities $\lambda_1 = \mu_1+2\mu_2-7$ and $\lambda_2 = 3\mu_2-2\mu_3+1$, where the $\mu_j$ are the population values of the `source` EMMs in the `pigs` example. This may be done by providing the coefficients in a list, and the added constants in the `offset` argument: ```{r} LF <- contrast(pigs.emm.s, list(lambda1 = c(1, 2, 0), lambda2 = c(0, 3, -2)), offset = c(-7, 1)) confint(LF, adjust = "bonferroni") ``` [Back to Contents](#contents) ## Special properties of log (and logit) transformations {#logs} Suppose we obtain EMMs for a model having a response transformation or link function. In most cases, when we compute contrasts of those EMMs, there is no natural way to express those contrasts on anything other than the transformed scale. For example, in a model fitted using `glm()` with the `Gamma()` family, the default link function is the inverse. Predictions on such a model are estimates of $1/\mu_j$ for various $j$. Comparisons of predictions will be estimates of $1/\mu_j - 1/\mu_{k}$ for $j \ne k$. There is no natural way to back-transform these differences to some other interpretable scale. However, logs are an exception, in that $\log\mu_j - \log\mu_k = \log(\mu_j/\mu_k)$. Accordingly, when `contrast()` (or `pairs()`) notices that the response is on the log scale, it back-transforms contrasts to ratios when results are to be of `response` type.
For example: ```{r} pairs(pigs.emm.s, type = "lp") pairs(pigs.emm.s, type = "response") ``` As is true of EMM summaries with `type = "response"`, the tests and confidence intervals are done before back-transforming. The ratios estimated here are actually ratios of *geometric* means. In general, a model with a log response is in fact a model for *relative* effects of any of its linear predictors, and this back-transformation to ratios goes hand-in-hand with that. In generalized linear models, this behavior occurs in two common cases: Poisson or count regression, for which the usual link is the log; and logistic regression, because logits are logs of odds ratios. [Back to Contents](#contents) [Index of all vignette topics](vignette-topics.html) emmeans/inst/doc/predictions.html0000644000176200001440000005644414165066764016656 0ustar liggesusers Prediction in emmeans

Prediction in emmeans

emmeans package, Version 1.7.2

In this vignette, we discuss emmeans’s rudimentary capabilities for constructing prediction intervals.

Focus on reference grids

Prediction is not the central purpose of the emmeans package. Even its name refers to the idea of obtaining marginal averages of fitted values; and it is a rare situation where one would want to make a prediction of the average of several observations. We can certainly do that if it is truly desired, but almost always, predictions should be based on the reference grid itself (i.e., not the result of an emmeans() call), inasmuch as a reference grid comprises combinations of model predictors.
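As a minimal sketch of that workflow (using the standard warpbreaks data rather than the example developed below), predictions are formed for the rows of the reference grid:

warp.lm <- lm(breaks ~ wool * tension, data = warpbreaks)
warp.rg <- ref_grid(warp.lm)               ## one row per wool:tension combination
predict(warp.rg, interval = "prediction")  ## interval predictions for those rows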

Need for an SD estimate

A prediction interval requires an estimate of the error standard deviation, because we need to account for both the uncertainty of our point predictions and the uncertainty of outcomes centered on those estimates. By its current design, we save the value (if any) returned by stats::sigma(object) when a reference grid is constructed for a model object. Not all models provide a sigma() method, in which case an error is thrown if the error SD is not manually specified. Also, in many cases, there may be a sigma() method, but it does not return the appropriate value(s) in the context of the needed predictions. (In an object returned by lme4::glmer(), for example, sigma() seems to always return 1.0.) Indeed, as will be seen in the example that follows, one usually needs to construct a manual SD estimate when the model is a mixed-effects model.

So it is essentially always important to think very specifically about whether we are using an appropriate value. You may check the value being assumed by looking at the misc slot in the reference grid:

rg <- ref_grid(model)
rg@misc$sigma

Finally, sigma may be a vector, as long as it is conformable with the estimates in the reference grid. This would be appropriate, for example, with a model fitted by nlme::gls() with some kind of non-homogeneous error structure. It may take some effort, as well as a clear understanding of the model and its structure, to obtain suitable SD estimates. It was suggested to me that the function insight::get_variance() may be helpful – especially when working with an unfamiliar model class. Personally, I prefer to make sure I understand the structure of the model object and/or its summary to ensure I am not going astray.
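For instance, here is a sketch with a hypothetical nlme::gls() fit mod.gls and made-up SD values (sigma must be a single value, or one value per row of the reference grid):

rg <- ref_grid(mod.gls)                    ## mod.gls is hypothetical, not shown here
rg <- update(rg, sigma = c(2.5, 4.0, 3.2)) ## illustrative per-row SDs, not real estimates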

Back to Contents

Feedlot example

To illustrate, consider the feedlot dataset provided with the package. Here we have several herds of feeder cattle that are sent to feed lots and given one of three diets. The weights of the cattle are measured at time of entry (ewt) and at time of slaughter (swt). Different herds have possibly different entry weights, based on breed and ranching practices, so we will center each herd’s ewt measurements, then use that as a covariate in a mixed model:

feedlot = transform(feedlot, adj.ewt = ewt - predict(lm(ewt ~ herd)))
require(lme4)
feedlot.lmer <- lmer(swt ~ adj.ewt + diet + (1|herd), data = feedlot)
feedlot.rg <- ref_grid(feedlot.lmer, at = list(adj.ewt = 0))
summary(feedlot.rg)  ## point predictions
##  adj.ewt diet   prediction   SE   df
##        0 Low          1029 25.5 12.0
##        0 Medium        998 26.4 13.7
##        0 High         1031 29.4 19.9
## 
## Degrees-of-freedom method: kenward-roger

Now, as advised, let’s look at the SDs involved in this model:

lme4::VarCorr(feedlot.lmer)  ## for the model
##  Groups   Name        Std.Dev.
##  herd     (Intercept) 77.087  
##  Residual             57.832
feedlot.rg@misc$sigma  ## default in the ref. grid
## [1] 57.83221

So the residual SD will be assumed in our prediction intervals if we don’t specify something else. And we do want something else, because in order to predict the slaughter weight of an arbitrary animal, without regard to its herd, we need to account for the variation among herds too, which is seen to be considerable. The two SDs reported by VarCorr() are assumed to represent independent sources of variation, so they may be combined into a total SD using the Pythagorean Theorem. We will update the reference grid with the new value:

feedlot.rg <- update(feedlot.rg, sigma = sqrt(77.087^2 + 57.832^2))
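As a quick check (inspecting the misc slot as before), the stored value is now the combined SD:

feedlot.rg@misc$sigma   ## now 96.369 -- the combined value just computed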

We are now ready to form prediction intervals. To do so, simply call the predict() function with an interval argument:

predict(feedlot.rg, interval = "prediction")
##  adj.ewt diet   prediction    SE   df lower.PL upper.PL
##        0 Low          1029  99.7 12.0      812     1247
##        0 Medium        998  99.9 13.7      783     1213
##        0 High         1031 100.7 19.9      821     1241
## 
## Degrees-of-freedom method: kenward-roger 
## Prediction intervals and SEs are based on an error SD of 96.369 
## Confidence level used: 0.95

These results may also be displayed graphically:

plot(feedlot.rg, PIs = TRUE)

The inner intervals are confidence intervals, and the outer ones are the prediction intervals.

Note that the SEs for prediction are considerably greater than the SEs for estimation in the original summary of feedlot.rg. Also, as a sanity check, observe that these prediction intervals cover about the same ground as the original data:

range(feedlot$swt)
## [1]  816 1248

By the way, we could have specified the desired sigma value as an additional sigma argument in the predict() call, rather than updating the feedlot.rg object.
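That is, had we not updated the reference grid, the following call would have produced the same intervals:

predict(feedlot.rg, interval = "prediction", sigma = sqrt(77.087^2 + 57.832^2))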

Back to Contents

Predictions on particular strata

Suppose, in our example, we want to predict swt for one or more particular herds. Then the total SD we computed is not appropriate for that purpose, because that includes variation among herds.

But more to the point, if we are talking about particular herds, then we are really regarding herd as a fixed effect of interest; so the expedient thing to do is to fit a different model where herd is a fixed effect:

feedlot.lm <- lm(swt ~ adj.ewt + diet + herd, data = feedlot)

So to predict slaughter weight for herds 9 and 19:

newrg <- ref_grid(feedlot.lm, at = list(adj.ewt = 0, herd = c("9", "19")))
predict(newrg, interval = "prediction", by = "herd")
## herd = 9:
##  adj.ewt diet   prediction   SE df lower.PL upper.PL
##        0 Low           867 63.6 53      740      995
##        0 Medium        835 64.1 53      707      964
##        0 High          866 66.3 53      733      999
## 
## herd = 19:
##  adj.ewt diet   prediction   SE df lower.PL upper.PL
##        0 Low          1069 62.1 53      945     1194
##        0 Medium       1037 62.8 53      911     1163
##        0 High         1068 64.0 53      940     1197
## 
## Prediction intervals and SEs are based on an error SD of 57.782 
## Confidence level used: 0.95

This is an instance where the default sigma was already correct (being the only error SD we have available). The SD value is comparable to the residual SD in the previous model, and the prediction SEs are smaller than those for predicting over all herds.
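A quick check with the sigma() method for lm fits confirms the default used here:

sigma(feedlot.lm)   ## 57.782, the error SD reported in the table above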

Back to Contents

Predictions with Bayesian models

For models fitted using Bayesian methods, these kinds of prediction intervals are available only by forcing a frequentist analysis (frequentist = TRUE).

However, a better and more flexible approach with Bayesian models is to simulate observations from the posterior predictive distribution. This is done via as.mcmc() and specifying a likelihood argument. An example is given in the “sophisticated models” vignette.
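In outline, such a simulation is a sketch like the following (bayes.mod stands for a hypothetical Bayesian fit, e.g. from rstanarm; the likelihood value is illustrative and must match the model family -- see the help for as.mcmc.emmGrid for the accepted specifications):

bayes.rg <- ref_grid(bayes.mod)            ## hypothetical Bayesian model fit
as.mcmc(bayes.rg, likelihood = "normal")   ## draws from the posterior predictive distribution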

Back to Contents

Index of all vignette topics

emmeans/inst/doc/basics.R0000644000176200001440000001511614165066753015031 0ustar liggesusers## ---- echo = FALSE, results = "hide", message = FALSE------------------------- require("emmeans") knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") ## ---- echo = FALSE------------------------------------------------------------ par(mar = .1 + c(4, 4, 1, 1)) # reduce head space ## ----------------------------------------------------------------------------- with(pigs, interaction.plot(percent, source, conc)) ## ----------------------------------------------------------------------------- with(pigs, tapply(conc, percent, mean)) ## ----------------------------------------------------------------------------- cell.means <- matrix(with(pigs, tapply(conc, interaction(source, percent), mean)), nrow = 3) cell.means ## ----------------------------------------------------------------------------- apply(cell.means, 2, mean) ## ----------------------------------------------------------------------------- with(pigs, table(source, percent)) ## ----------------------------------------------------------------------------- sum(c(3, 1, 1) * cell.means[, 4]) / 5 ## ----------------------------------------------------------------------------- pigs.lm1 <- lm(log(conc) ~ source + factor(percent), data = pigs) ## ----------------------------------------------------------------------------- ref_grid(pigs.lm1) ## ----------------------------------------------------------------------------- ref_grid(pigs.lm1) @ grid ## ----------------------------------------------------------------------------- pigs.lm2 <- lm(log(conc) ~ source + percent, data = pigs) ref_grid(pigs.lm2) ## ----------------------------------------------------------------------------- ref_grid(pigs.lm2) @ grid ## ----------------------------------------------------------------------------- pigs.pred1 <- matrix(predict(ref_grid(pigs.lm1)), nrow = 3) pigs.pred1 ## ----------------------------------------------------------------------------- apply(pigs.pred1, 1, mean) ### EMMs for source apply(pigs.pred1, 2, mean) ### EMMs for percent ## ----------------------------------------------------------------------------- predict(ref_grid(pigs.lm2)) ## ----------------------------------------------------------------------------- emmeans(pigs.lm1, "percent") ## ----------------------------------------------------------------------------- ref_grid(pigs.lm2, cov.keep = "percent") ## ----------------------------------------------------------------------------- ref_grid(pigs.lm2, cov.reduce = range) ## ----------------------------------------------------------------------------- mtcars.lm <- lm(mpg ~ disp * cyl, data = mtcars) ref_grid(mtcars.lm) ## ----------------------------------------------------------------------------- mtcars.rg <- ref_grid(mtcars.lm, cov.keep = 3, at = list(disp = c(100, 200, 300))) mtcars.rg ## ----------------------------------------------------------------------------- mtcars.1 <- lm(mpg ~ factor(cyl) + disp + I(disp^2), data = mtcars) emmeans(mtcars.1, "cyl") ## ----------------------------------------------------------------------------- mtcars <- transform(mtcars, Cyl = factor(cyl), dispsq = disp^2) mtcars.2 <- lm(mpg ~ Cyl + disp + dispsq, data = mtcars) emmeans(mtcars.2, "Cyl") ## ----------------------------------------------------------------------------- ref_grid(mtcars.1) ref_grid(mtcars.2) ## ----------------------------------------------------------------------------- emmeans(mtcars.2, "Cyl", at = list(dispsq = 230.72^2)) 
## ---- eval = FALSE------------------------------------------------------------ # deg <- 2 # mod <- lm(y ~ treat * poly(x, degree = deg), data = mydata) ## ---- eval = FALSE------------------------------------------------------------ # emmeans(mod, ~ treat | x, at = list(x = 1:3), params = "deg") ## ----------------------------------------------------------------------------- emmip(pigs.lm1, source ~ percent) emmip(ref_grid(pigs.lm2, cov.reduce = FALSE), source ~ percent) ## ----------------------------------------------------------------------------- plot(mtcars.rg, by = "disp") ## ----------------------------------------------------------------------------- mtcars.rg_d.c <- ref_grid(mtcars.lm, at = list(cyl = c(4,6,8)), cov.reduce = disp ~ cyl) mtcars.rg_d.c @ grid ## ----fig.height = 1.5--------------------------------------------------------- plot(mtcars.rg_d.c) ## ----------------------------------------------------------------------------- require("ggplot2") emmip(pigs.lm1, ~ percent | source, CIs = TRUE) + geom_point(aes(x = percent, y = log(conc)), data = pigs, pch = 2, color = "blue") ## ---- eval = FALSE------------------------------------------------------------ # ci <- confint(mtcars.rg_d.c, level = 0.90, adjust = "scheffe") # xport <- print(ci, export = TRUE) # cat("\n") # knitr::kable(xport$summary, align = "r") # for (a in xport$annotations) cat(paste(a, "
")) # cat("
\n") ## ---- results = "asis", echo = FALSE------------------------------------------ ci <- confint(mtcars.rg_d.c, level = 0.90, adjust = "scheffe") xport <- print(ci, export = TRUE) cat("\n") knitr::kable(xport$summary, align = "r") for (a in xport$annotations) cat(paste(a, "
")) cat("
\n") ## ----------------------------------------------------------------------------- emmeans(pigs.lm1, "percent", weights = "cells") ## ----------------------------------------------------------------------------- pigs.lm3 <- lm(log(conc) ~ factor(percent), data = pigs) emmeans(pigs.lm3, "percent") ## ----------------------------------------------------------------------------- MOats.lm <- lm (yield ~ Block + Variety, data = MOats) ref_grid (MOats.lm, mult.name = "nitro") ## ----------------------------------------------------------------------------- pigs.rg <- ref_grid(pigs.lm1) class(pigs.rg) pigs.emm.s <- emmeans(pigs.rg, "source") class(pigs.emm.s) ## ----------------------------------------------------------------------------- pigs.rg pigs.emm.s ## ----------------------------------------------------------------------------- str(pigs.emm.s) ## ----------------------------------------------------------------------------- # equivalent to summary(emmeans(pigs.lm1, "percent"), level = 0.90, infer = TRUE)) emmeans(pigs.lm1, "percent", level = 0.90, infer = TRUE) ## ----------------------------------------------------------------------------- class(summary(pigs.emm.s)) emmeans/inst/doc/vignette-topics.Rmd0000644000176200001440000006300414165066711017223 0ustar liggesusers--- title: "Index of vignette topics" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{Index of vignette topics} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} ---
### Jump to: [A](#a) [B](#b) [C](#c) [D](#d) [E](#e) [F](#f) [G](#g) [H](#h) [I](#i) [J](#j) [K](#k) [L](#l) [M](#m) [N](#n) [O](#o) [P](#p) [Q](#q) [R](#r) [S](#s) [T](#t) [U](#u) [V](#v) [W](#w) [X](#x) [Z](#z) {#topnav} ### A {#a} * [`add_grouping()`](utilities.html#groups) * `adjust` * [in *comparisons: pairwise*](comparisons.html#pairwise) * [in *confidence-intervals: adjust*](confidence-intervals.html#adjust) * [`afex_aov` objects](models.html#V) * [Alias matrix](xplanations.html#submodels) * [Analysis of subsets of data](FAQs.html#model) * Analysis of variance * [versus *post hoc* comparisons](FAQs.html#anova) * [Type III](confidence-intervals.html#joint_tests) * [`aovList` objects](models.html#V) * [`appx-satterthwaite` method](models.html#K) * [`as.data.frame()`](utilities.html#data) * `as.mcmc()` * [in *models: S*](models.html#S) * [in *sophisticated: bayesxtra*](sophisticated.html#bayesxtra) * [ASA Statement on *P* values](basics.html#pvalues) * [Asymptotic tests](sophisticated.html#dfoptions) * [ATOM](basics.html#pvalues) * [`averaging` models](models.html#I) [Back to top](#topnav) ### B {#b} * [Bayes factor](sophisticated.html#bayesxtra) * Bayesian models * [in *models: S*](models.html#S) * [in *sophisticated: mcmc*](sophisticated.html#mcmc) * [**bayesplot** package](sophisticated.html#bayesxtra) * [**bayestestR** package](sophisticated.html#bayesxtra) * [Beta regression](models.html#B) * [`betareg` models](models.html#B) * Bias adjustment * [For link functions vs. response transformations](transformations.html#link-bias) * [in Bayesian models](sophisticated.html#bias-adj-mcmc) * [In GLMMs and GEE models](transformations.html#cbpp) * [When back-transforming](transformations.html#bias-adj) * [When *not* to use](transformations.html#insects) * [Bonferroni adjustment](confidence-intervals.html#adjmore) * [`boot-satterthwaite` method](models.html#K) * [Brackets (`[ ]` and `[[ ]]` operators)](utilities.html#brackets) * [`brmsfit` objects](models.html#S) * [`by` groups](confidence-intervals.html#byvars) * [Identical comparisons](FAQs.html#additive) [Back to top](#topnav) ### C {#c} * [`cld()`](comparisons.html#CLD) * [`clm` models](models.html#O) * [**coda** package](sophisticated.html#bayesxtra) * [`coef()`](comparisons.html#contrasts) * [Cohen's *d*](comparisons.html#pwpm) * [Compact letter displays](comparisons.html#CLD) * Comparison arrows * [Derivation](xplanations.html#arrows) * Comparisons * [Back-transforming](comparisons.html#logs) * [Displaying as groups](comparisons.html#CLD) * [Displaying *P* values](comparisons.html#pwpp) * [Graphical](comparisons.html#graphical) * [How arrows are determined](xplanations.html#arrows) * [with logs](comparisons.html#logs) * [with overlapping CIs](FAQs.html#CIerror) * [Comparisons result in `(nothing)`](FAQs.html#nopairs) * [Confidence intervals](confidence-intervals.html#summary) * [Overlapping](FAQs.html#CIerror) * [`confint()`](confidence-intervals.html#summary) * [`consec` contrasts](comparisons.html#contrasts) * [Constrained marginal means](xplanations.html#submodels) * [Consultants](basics.html#recs3) * [Containment d.f.](models.html#K) * [`contrast()`](comparisons.html#contrasts) * [`adjust`](comparisons.html#linfcns) * [Changing defaults](utilities.html#defaults) * [`combine`](interactions.html#simple) * [`interaction`](interactions.html#contrasts) * [Linear functions](comparisons.html#linfct) * `simple` * [in *confidence-intervals: simple*](confidence-intervals.html#simple) * [in *interactions: simple*](interactions.html#simple) * 
Contrasts * [of other contrasts](interactions.html#contrasts) * [Custom](comparisons.html#linfcns) * [Formula](comparisons.html#formulas) * [Multivariate](interactions.html#multiv) * [Pairwise](comparisons.html#pairwise) * [Polynomial](comparisons.html#contrasts) * Tests of * [with transformations](comparisons.html#logs) * [Count regression](models.html#C) * [`cov.reduce`](messy-data.html#med.covred) * Covariates * [Adjusted](messy-data.html#adjcov) * [`cov.keep`](FAQs.html#numeric) * [`cov.reduce`](FAQs.html#numeric) * [Derived](basics.html#depcovs) * [`emmeans()` doesn't work](FAQs.html#numeric) * [Interacting with factors](FAQs.html#trends) * [Mediating](messy-data.html#mediators) [Back to top](#topnav) ### D {#d} * [Degrees of freedom](sophisticated.html#dfoptions) * [Infinite](FAQs.html#asymp) * [Digits, optimizing](utilities.html#digits) * [Dunnett method](comparisons.html#contrasts) [Back to top](#topnav) ### E {#e} * [`eff` contrasts](comparisons.html#contrasts) * [`eff_size()`](comparisons.html#pwpm) * [Effect size](comparisons.html#pwpm) * [`emm_basis()`](xtending.html#intro) * [Arguments and returned value](xtending.html#ebreqs) * [Communicating with `recover_data()`](xtending.html#communic) * [Dispatching](xtending.html#dispatch) * [Hook functions](xtending.html#hooks) * [for `lqs` objects](xtending.html#eblqs) * [for `rsm` objects](xtending.html#ebrsm) * [`emm_list` object](comparisons.html#formulas) * [`emm_options()`](utilities.html#defaults) * [`.emmc` functions](comparisons.html#linfcns) * **emmeans** package * [Exporting extensions to](xtending.html#exporting) * [`emmeans()`](basics.html#emmeans) * [And the underlying model](FAQs.html#nowelch) * [Changing defaults](utilities.html#defaults) * [Surprising results from](FAQs.html#transformations) * `weights` * [in *basics: weights*](basics.html#weights) * [in *messy-data: weights*](messy-data.html#weights) * [With transformations](transformations.html#regrid) * [`emmGrid` objects](basics.html#emmobj) * [Accessing data](utilities.html#data) * [Combining and subsetting](utilities.html#rbind) * [Modifying](utilities.html#update) * [Setting defaults for](utilities.html#defaults) * `emmip()` * [in *basics: plots*](basics.html#plots) * [in *interactions: factors*](interactions.html#factors) * [nested factors](messy-data.html#cows) * [EMMs](basics.html#EMMdef) * [Appropriateness of](basics.html#eqwts) * [Projecting to a submodel](xplanations.html#submodels) * [What are they?](FAQs.html#what) * `emtrends()` * [in *interactions: covariates*](interactions.html#covariates) * [in *interactions: oranges*](interactions.html#oranges) * [`estHook`](xtending.html#hooks) * [Estimability](messy-data.html#nonestex) * [Estimability issues](FAQs.html#NAs) * [Estimated marginal means](basics.html#EMMdef) * [Defined](basics.html#emmeans) * Examples * [`auto.noise`](interactions.html#factors) * [Bayesian model](sophisticated.html#mcmc) * `cbpp` * [in *sophisticated: mcmc*](sophisticated.html#mcmc) * [in *transformations: cbpp*](transformations.html#cbpp) * [`cows`](messy-data.html#cows) * [`feedlot`](predictions.html#feedlot) * `fiber` * [in *interactions: covariates*](interactions.html#covariates) * [in *transformations: stdize*](transformations.html#stdize) * [`framing`](messy-data.html#mediators) * [Gamma regression](transformations.html#tranlink) * [`InsectSprays`](transformations.html#insects) * [Insurance claims (SAS)](sophisticated.html#offsets) * [Logistic regression](transformations.html#links) * [`lqs` objects](xtending.html#lqs) * 
[`MOats`](basics.html#multiv) * `mtcars` * [in *basics: altering*](basics.html#altering) * [in *messy-data: nuis.example*](messy-data.html#nuis.example) * [Multivariate](basics.html#multiv) * [Nested fixed effects](messy-data.html#cows) * `neuralgia` * [in *transformations: links*](transformations.html#links) * [in *transformations: trangraph*](transformations.html#trangraph) * [`nutrition`](messy-data.html#nutrex) * [`submodel`](messy-data.html#submodels) * [`weights`](messy-data.html#weights) * [`Oats`](sophisticated.html#lmer) * `oranges` * [in *comparisons: formulas*](comparisons.html#formulas) * [in *interactions: oranges*](interactions.html#oranges) * [Ordinal model](sophisticated.html#ordinal) * `pigs` * [in *basics: motivation*](basics.html#motivation) * [in *confidence-intervals: summary*](confidence-intervals.html#summary) * [in *transformations: altscale*](transformations.html#altscale) * [in *transformations: overview*](transformations.html#overview) * [in *transformations: pigs-biasadj*](transformations.html#pigs-biasadj) * [`rlm` objects](xtending.html#rlm) * [Robust regression](xtending.html#rlm) * [Split-plot experiment](sophisticated.html#lmer) * [Unbalanced data](basics.html#motivation) * `warpbreaks` * [in *transformations: tranlink*](transformations.html#tranlink) * [in *utilities: relevel*](utilities.html#relevel) * [Welch's *t* comparisons](utilities.html#relevel) * [`wine`](sophisticated.html#ordinal) * [Exporting output](basics.html#formatting) * Extending **emmeans** * [Exports useful to developers](xtending.html#exported) * [Restrictions](xtending.html#dispatch) [Back to top](#topnav) ### F {#f} * *F* test * [vs. pairwise comparisons](FAQs.html#anova) * [Role in *post hoc* tests](basics.html#recs1) * Factors * [Mediating](messy-data.html#weights) * [Formatting results](basics.html#formatting) * [Frequently asked questions](FAQs.html) [Back to top](#topnav) ### G {#g} * [`gam` models](models.html#G) * [`gamlss` models](models.html#H) * [GEE models](models.html#E) * [Generalized additive models](models.html#G) * [Generalized linear models](models.html#G) * [Geometric means](transformations.html#bias-adj) * [Get the model right first](FAQs.html#nowelch) * [`get_emm_option()`](utilities.html#options) * **ggplot2** package * [in *basics: ggplot*](basics.html#ggplot) * [in *messy-data: cows*](messy-data.html#cows) * [`glm`*xxx* models](models.html#G) * [`gls` models](models.html#K) * [Graphical displays](basics.html#plots) * [Grouping factors](utilities.html#groups) * [Grouping into separate sets](confidence-intervals.html#byvars) [Back to top](#topnav) ### H {#h} * [Hook functions](xtending.html#hooks) * [Hotelling's $T^2$](interactions.html#multiv) * [`hpd.summary()`](sophisticated.html#mcmc) * [`hurdle` models](models.html#C) [Back to top](#topnav) ### I {#i} * [Infinite degrees of freedom](FAQs.html#asymp) * Interactions * [Analysis](interactions.html) * [Contrasts](interactions.html#contrasts) * [Covariate with factors](interactions.html#covariates) * [Implied](interactions.html#oranges) * [Plotting](interactions.html#factors) * [Possible inappropriateness of marginal means](interactions.html#factors) [Back to top](#topnav) ### J {#j} * [`joint`](confidence-intervals.html#joint) * `joint_tests()` * [in *confidence-intervals: joint_tests*](confidence-intervals.html#joint_tests) * [in *interactions: contrasts*](interactions.html#contrasts) * [with `submodel = "type2"`](messy-data.html#type2submodel) [Back to top](#topnav) ### K {#k} * 
[`kable`](basics.html#formatting) * [Kenward-Roger d.f.](models.html#L) [Back to top](#topnav) ### L {#l} * Labels * [Changing](utilities.html#relevel) * [Large models](messy-data.html#nuisance) * [Least-squares means](FAQs.html#what) * Levels * [Changing](utilities.html#relevel) * [Linear functions](comparisons.html#linfct) * [Link functions](transformations.html#links) * [`lme` models](models.html#K) * `lmerMod` models * [in *models: L*](models.html#L) * [in *sophisticated: lmer*](sophisticated.html#lmer) * [System options for](sophisticated.html#lmerOpts) * Logistic regression * [Odds ratios](transformations.html#oddsrats) * [Surprising results](FAQs.html#transformations) * LSD * [protected](basics.html#recs2) [Back to top](#topnav) ### M {#m} * [`make.tran()`](transformations.html#special) * [`mcmc` objects](models.html#S) * Means * [Cell](basics.html#motivation) * [Generalized](transformations.html#bias-adj) * [Marginal](basics.html#motivation) * [Based on a model](basics.html#EMMdef) * [of cell means](basics.html#eqwts) * Weighted * [in *basics: eqwts*](basics.html#eqwts) * [in *basics: weights*](basics.html#weights) * [Mediating covariates](messy-data.html#mediators) * [Memory usage](messy-data.html#nuisance) * [Limiting](messy-data.html#rg.limit) * [`mira` models](models.html#I) * [`misc` attribute and argument](xtending.html#communic) * [`mlm` models](models.html#N) * [`mmer` models](models.html#G) * Model * [Get it right first](basics.html#recs2) * [Importance of](FAQs.html#fastest) * [Importance of getting it right](FAQs.html#nowelch) * [Model averaging](models.html#I) * Models * [Constrained](messy-data.html#submodels) * [Large](messy-data.html#nuisance) * [Quick reference](models.html#quickref) * [Unsupported](FAQs.html#qdrg) * [Multi-factor studies](FAQs.html#interactions) * [Multinomial models](models.html#N) * [Multiple imputation](models.html#I) * [Multiplicity adjustments](confidence-intervals.html#adjust) * [Multivariate contrasts](interactions.html#multiv) * Multivariate models * [in *basics: multiv*](basics.html#multiv) * [in *interactions: oranges*](interactions.html#oranges) * [in *models: M*](models.html#M) * [with `submodel`](xplanations.html#mult.submodel) * [Multivariate *t* (`"mvt"`) adjustment](confidence-intervals.html#adjmore) * [`mvcontrast()`](interactions.html#multiv) * [**mvtnorm** package](confidence-intervals.html#adjmore) [Back to top](#topnav) ### N {#n} * [`NA`s in the output](FAQs.html#NAs) * [Nesting](messy-data.html#nesting) * [Auto-detection](messy-data.html#nest-trap) * Nesting factors * [Creating](utilities.html#groups) * [Non-estimability](messy-data.html#nonestex) * [`non.nuisance`](messy-data.html#nuisance) * [`NonEst` values](FAQs.html#NAs) * [`(nothing)` in output](FAQs.html#nopairs) * [`nuisance`](messy-data.html#nuisance) * [Nuisance factors](messy-data.html#nuisance) [Back to top](#topnav) ### O {#o} * [Observational data](messy-data.html#issues) * [Odds ratios](transformations.html#oddsrats) * [Offsets](sophisticated.html#offsets) * [`opt.digits` option](utilities.html#digits) * [Options](utilities.html#options) * [Startup](utilities.html#startup) * Ordinal models * [Latent scale](sophisticated.html#ordinal) * [Linear-predictor scale](sophisticated.html#ordlp) * [in *models: O*](models.html#O) * [`prob` and `mean.class`](sophisticated.html#ordprob) * [in *sophisticated: ordinal*](sophisticated.html#ordinal) [Back to top](#topnav) ### P {#p} * *P* values * [Adjusted](basics.html#recs1) * [Adjustment is ignored](FAQs.html#noadjust) * 
[Interpreting](basics.html#pvalues) * [`pairs()`](comparisons.html#pairwise) * [Pairwise comparisons](comparisons.html#pairwise) * [Matrix displays](comparisons.html#pwpm) * [`pairwise` contrasts](comparisons.html#contrasts) * [Pairwise *P*-value plots](comparisons.html#pwpp) * [`params`](basics.html#params) * [Percentage differences](transformations.html#altscale) * `plot()` * [nested factors](messy-data.html#cows) * [`plot.emmGrid()`](basics.html#plot.emmGrid) * Plots * [of confidence intervals](basics.html#plot.emmGrid) * [of EMMs](basics.html#plots) * [Interaction-style](basics.html#plots) * [`+` operator](utilities.html#rbind) * Poisson regression * [Surprising results](FAQs.html#transformations) * [`polreg` models](models.html#O) * [Polynomial regression](basics.html#depcovs) * Pooled *t* * [Instead of Welch's *t*](FAQs.html#nowelch) * [`postGridHook`](xtending.html#hooks) * [Practices, recommended](basics.html#recs1) * Predictions * Bayesian models * [in *predictions: bayes*](predictions.html#bayes) * [in *sophisticated: predict-mcmc*](sophisticated.html#predict-mcmc) * [Error SD](predictions.html#sd-estimate) * [graphics](predictions.html#feedlot) * [on Particular strata](predictions.html#strata) * [Posterior predictive distribution](sophisticated.html#predict-mcmc) * [Reference grid](predictions.html#ref-grid) * [Total SD](predictions.html#feedlot) * [`print.summary_emm()`](basics.html#emmobj) * [`pwpm()`](comparisons.html#pwpm) * [`pwpp()`](comparisons.html#pwpp) [Back to top](#topnav) ### Q {#q} * [`qdrg()`](FAQs.html#qdrg) * [Quadratic terms](basics.html#depcovs) * [Quick start](FAQs.html#fastest) [Back to top](#topnav) ### R {#r} * [`rbind()`](utilities.html#rbind) * [Re-labeling](utilities.html#relevel) * [Recommended practices](basics.html#recs1) * [`recover_data()`](xtending.html#intro) * [Communicating with `emm_basis()`](xtending.html#communic) * [`data` and `params` arguments](xtending.html#rdargs) * [Dispatching](xtending.html#dispatch) * [Error handling](xtending.html#rderrs) * [for `lqs` objects](xtending.html#rd.lqs) * [for `rsm` objects](xtending.html#rdrsm) * `recover_data.call()` * [`frame` argument](xtending.html#rdargs) * [`ref_grid()`](basics.html#ref_grid) * [`at`](basics.html#altering) * [`cov.keep`](basics.html#altering) * [`cov.reduce`](basics.html#altering) * [`mult.name`](basics.html#multiv) * [`nesting`](messy-data.html#nest-trap) * [`offset`](sophisticated.html#offsets) * [Reference grids](basics.html#ref_grid) * [Altering](basics.html#altering) * [Prediction on](predictions.html#ref-grid) * [Region of practical equivalence](sophisticated.html#bayesxtra) * [Registering `recover_data` and `emm_basis` methods](xtending.html#exporting) * [`regrid()`](transformations.html#regrid) * [`transform = "log"`](transformations.html#logs) * [`transform` vs. `type`](transformations.html#regrid) * [Response scale](confidence-intervals.html#tran) * [`revpairwise` contrasts](comparisons.html#contrasts) * [`rg.limit` option](messy-data.html#rg.limit) * [RMarkdown](basics.html#formatting) * [ROPE](sophisticated.html#bayesxtra) * [**rsm** package](xtending.html#rsm) * [`rstanarm`](sophisticated.html#mcmc) [Back to top](#topnav) ### S {#s} * [Sample size, displaying](confidence-intervals.html#summary) * Satterthwaite d.f. 
* [in *models: K*](models.html#K) * [in *models: L*](models.html#L) * [`"scale"` type](transformations.html#trangraph) * [`scale()`](transformations.html#stdize) * [Selecting results](utilities.html#brackets) * [Sidak adjustment](confidence-intervals.html#adjust) * Significance * [Assessing](basics.html#pvalues) * [`simple = "each"`](confidence-intervals.html#simple) * Simple comparisons * [in *confidence-intervals: simple*](confidence-intervals.html#simple) * [in *FAQs: interactions*](FAQs.html#interactions) * [in *interactions: simple*](interactions.html#simple) * `specs` * [Formula](comparisons.html#formulas) * [Standardized response](transformations.html#stdize) * [`stanreg` objects](models.html#S) * [* gazing (star gazing)](interactions.html#factors) * [Startup options](utilities.html#startup) * [Statistical consultants](basics.html#recs3) * [Statistics is hard](basics.html#recs3) * [`str()`](basics.html#emmobj) * `submodel` * [`"minimal"` and `"type2"`](messy-data.html#type2submodel) * [in a multivariate model](xplanations.html#mult.submodel) * [in *messy-data: submodels*](messy-data.html#submodels) * [in *xplanations: submodels*](xplanations.html#submodels) * [Subsets of data](FAQs.html#model) * `summary()` * [`adjust`](comparisons.html#pairwise) * [in *basics: emmobj*](basics.html#emmobj) * Bayesian models * [in *confidence-intervals: summary*](confidence-intervals.html#summary) * [in *models: S*](models.html#S) * [Calculated columns](confidence-intervals.html#summary) * [in *confidence-intervals: summary*](confidence-intervals.html#summary) * [HPD intervals](sophisticated.html#mcmc) * [`hpd.summary()`](confidence-intervals.html#summary) * `infer` * [in *comparisons: pairwise*](comparisons.html#pairwise) * [in *confidence-intervals: summary*](confidence-intervals.html#summary) * [Show sample size](confidence-intervals.html#summary) * [`type = "unlink"`](transformations.html#tranlink) * [`summary_emm` object](basics.html#emmobj) * [As a data frame](utilities.html#data) [Back to top](#topnav) ### T {#t} * [*t* tests vs. 
*z* tests](FAQs.html#asymp) * [`test()`](confidence-intervals.html#summary) * [`delta`](confidence-intervals.html#equiv) * [`joint = TRUE`](confidence-intervals.html#joint) * Tests * [Equivalence](confidence-intervals.html#equiv) * [Noninferiority](confidence-intervals.html#equiv) * [Nonzero null](confidence-intervals.html#summary) * [One- and two-sided](confidence-intervals.html#summary) * Transformations * [Adding after the fact](transformations.html#after) * [Auto-detected](transformations.html#auto) * [Back-transforming](confidence-intervals.html#tran) * [Bias adjustment](transformations.html#bias-adj) * [Custom](transformations.html#special) * [faking](transformations.html#faking) * [Faking a log transformation](transformations.html#logs) * [Graphical display](transformations.html#trangraph) * [with link function](transformations.html#tranlink) * [Log](comparisons.html#logs) * [Overview](transformations.html#overview) * [Percent difference](transformations.html#altscale) * [Re-gridding](transformations.html#regrid) * [Response versus link functions](transformations.html#link-bias) * [`scale()`](transformations.html#stdize) * [Standardizing](transformations.html#stdize) * [Timing is everything](transformations.html#timing) * Trends * [Estimating and comparing](interactions.html#oranges) * [`trt.vs.ctrl` contrasts](comparisons.html#contrasts) * [Tukey adjustment](confidence-intervals.html#adjust) * [Ignored or changed](FAQs.html#notukey) * [`type`](confidence-intervals.html#tran) * [`type = "scale"`](transformations.html#trangraph) * [Type II analysis](messy-data.html#type2submodel) * Type III tests * [in *confidence-intervals: joint*](confidence-intervals.html#joint) * [in *confidence-intervals: joint_tests*](confidence-intervals.html#joint_tests) [Back to top](#topnav) ### U {#u} * [Unadjusted tests](confidence-intervals.html#adjmore) * [`update()`](utilities.html#update) * [`tran`](transformations.html#after) * [Using results](utilities.html#data) [Back to top](#topnav) ### V {#v} * [Variables that are not predictors](basics.html#params) * [`vcovHook`](xtending.html#hooks) * Vignettes * [Basics](basics.html) * [Comparisons](comparisons.html) * [Confidence intervals and tests](confidence-intervals.html) * [Explanations supplement](xplanations.html) * [Extending **emmeans**](xtending.html) * [FAQS](FAQs.html) * [Interactions](interactions.html) * [Messy data](messy-data.html) * [Models](models.html) * [Predictions](predictions.html) * [Sophisticated models](sophisticated.html) * [Transformations and link functions](transformations.html) * [Utilities and options](utilities.html) [Back to top](#topnav) ### W {#w} * [`weights`](messy-data.html#weights) * [Welch's *t* comparisons](FAQs.html#nowelch) * [Example](utilities.html#relevel) * [`wt.nuis`](messy-data.html#nuisance) [Back to top](#topnav) ### X {#x} * [`xtable` method](basics.html#formatting) [Back to top](#topnav) ### Z {#z} * [*z* tests](sophisticated.html#dfoptions) * [vs. 
*t* tests](FAQs.html#asymp) * [`zeroinfl` models](models.html#C) [Back to top](#topnav) *Index generated by the [vigindex](https://github.com/rvlenth/vigindex) package.* emmeans/inst/doc/interactions.Rmd0000644000176200001440000004347514137062735016614 0ustar liggesusers--- title: "Interaction analysis in emmeans" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{Interaction analysis in emmeans} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, results = "hide", message = FALSE} require("emmeans") options(show.signif.stars = FALSE) knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro", class.message = "re") ``` Models in which predictors interact seem to create a lot of confusion concerning what kinds of *post hoc* methods should be used. It is hoped that this vignette will be helpful in shedding some light on how to use the **emmeans** package effectively in such situations. ## Contents {#contents} 1. [Interacting factors](#factors) a. [Simple contrasts](#simple) 2. [Interaction contrasts](#contrasts) 3. [Multivariate contrasts](#multiv) 4. [Interactions with covariates](#covariates) 9. [Summary](#summary) [Index of all vignette topics](vignette-topics.html) ## Interacting factors {#factors} As an example for this topic, consider the `auto.noise` dataset included with the package. This is a balanced 3x2x2 experiment with three replications. The response -- noise level -- is evaluated with different sizes of cars, types of anti-pollution filters, on each side of the car being measured.[^1] [^1]: I sure wish I could ask some questions about how how these data were collected; for example, are these independent experimental runs, or are some cars measured more than once? The model is based on the independence assumption, but I have my doubts. Let's fit a model and obtain the ANOVA table (because of the scale of the data, we believe that the response is recorded in tenths of decibels; so we compensate for this by scaling the response): ```{r} noise.lm <- lm(noise/10 ~ size * type * side, data = auto.noise) anova(noise.lm) ``` There are statistically strong 2- and 3-way interactions. One mistake that a lot of people seem to make is to proceed too hastily to estimating marginal means (even in the face of all these interactions!). They would go straight to analyses like this: ```{r} emmeans(noise.lm, pairwise ~ size) ``` The analyst-in-a-hurry would thus conclude that the noise level is higher for medium-sized cars than for small or large ones. But as is seen in the message before the output, `emmeans()` valiantly tries to warn you that it may not be a good idea to average over factors that interact with the factor of interest. It isn't *always* a bad idea to do this, but sometimes it definitely is. What about this time? I think a good first step is always to try to visualize the nature of the interactions before doing any statistical comparisons. The following plot helps. ```{r} emmip(noise.lm, type ~ size | side) ``` Examining this plot, we see that the "medium" mean is not always higher; so the marginal means, and the way they compare, does not represent what is always the case. Moreover, what is evident in the plot is that the peak for medium-size cars occurs for only one of the two filter types. So it seems more useful to do the comparisons of size separately for each filter type. 
This is easily done, simply by conditioning on `type`: ```{r} emm_s.t <- emmeans(noise.lm, pairwise ~ size | type) emm_s.t ``` Not too surprisingly, the statistical comparisons are all different for standard filters, but with Octel filters, there isn't much of a difference between small and medium size. For comparing the levels of other factors, similar judgments must be made. It may help to construct other interaction plots with the factors in different roles. In my opinion, almost all meaningful statistical analysis should be grounded in evaluating the practical impact of the estimated effects *first*, and seeing if the statistical evidence backs it up. Those who put all their attention on how many asterisks (I call these people "`*` gazers") are ignoring the fact that these don't measure the sizes of the effects on a practical scale.[^2] An effect can be practically negligible and still have a very small *P* value -- or practically important but have a large *P* value -- depending on sample size and error variance. Failure to describe what is actually going on in the data is a failure to do an adequate analysis. Use lots of plots, and *think* about the results. For more on this, see the discussion of *P* values in the ["basics" vignette](basics.html#pvalues). [^2]: You may have noticed that there are no asterisks in the ANOVA table in this vignette. I habitually opt out of star-gazing by including `options(show.signif.stars = FALSE)` in my `.Rprofile` file. ### Simple contrasts {#simple} An alternative way to specify conditional contrasts or comparisons is through the use of the `simple` argument to `contrast()` or `pairs()`, which amounts to specifying which factors are *not* used as `by` variables. For example, consider: ```{r} noise.emm <- emmeans(noise.lm, ~ size * side * type) ``` Then `pairs(noise.emm, simple = "size")` is the same as `pairs(noise.emm, by = c("side", "type"))`. One may specify a list for `simple`, in which case separate runs are made with each element of the list. Thus, `pairs(noise.emm, simple = list("size", c("side", "type")))` returns two sets of contrasts: comparisons of `size` for each combination of the other two factors; and comparisons of `side*type` combinations for each `size`. A shortcut that generates all simple main-effect comparisons is to use `simple = "each"`. In this example, the result is the same as obtained using `simple = list("size", "side", "type")`. Ordinarily, when `simple` is a list (or equal to `"each"`), a list of contrast sets is returned. However, if the additional argument `combine` is set to `TRUE`, they are all combined into one family: ```{r} contrast(noise.emm, "consec", simple = "each", combine = TRUE, adjust = "mvt") ``` The dots (`.`) in this result correspond to which simple effect is being displayed. If we re-run this same call with `combine = FALSE` or omitted, these twenty comparisons would be displayed in three broad sets of contrasts, each broken down further by combinations of `by` variables, each separately multiplicity-adjusted (a total of 16 different tables).
Here are estimates of those contrasts: ```{r} contrast(emm_s.t[[1]], "poly") ## 'by = "type"' already in previous result ``` The comparison of these contrasts may be done using the `interaction` argument in `contrast()` as follows: ```{r} IC_st <- contrast(emm_s.t[[1]], interaction = c("poly", "consec"), by = NULL) IC_st ``` (Using `by = NULL` restores `type` to a primary factor in these contrasts.) The practical meaning of this is that there isn't a statistical difference in the linear trends, but the quadratic trend for Octel is greater than for standard filter types. (Both quadratic trends are negative, so in fact it is the standard filters that have more pronounced *downward* curvature, as is seen in the plot.) In case you need to understand more clearly what contrasts are being estimated, the `coef()` method helps: ```{r} coef(IC_st) ``` Note that the 4th through 6th contrast coefficients are the negatives of the 1st through 3rd -- thus a comparison of two contrasts. By the way, "type III" tests of interaction effects can be obtained via interaction contrasts: ```{r} test(IC_st, joint = TRUE) ``` This result is exactly the same as the *F* test of `size:type` in the `anova` output. The three-way interaction may be explored via interaction contrasts too: ```{r} contrast(emmeans(noise.lm, ~ size*type*side), interaction = c("poly", "consec", "consec")) ``` One interpretation of this is that the comparison by `type` of the linear contrasts for `size` is different on the left side than on the right side; but the comparison of that comparison of the quadratic contrasts, not so much. Refer again to the plot, and this can be discerned as a comparison of the interaction in the left panel versus the interaction in the right panel. Finally, **emmeans** provides a `joint_tests()` function that obtains and tests the interaction contrasts for all effects in the model and compiles them in one Type-III-ANOVA-like table: ```{r} joint_tests(noise.lm) ``` You may even add `by` variable(s) to obtain separate ANOVA tables for the remaining factors: ```{r} joint_tests(noise.lm, by = "side") ``` [Back to Contents](#contents) ## Multivariate contrasts {#multiv} In the preceding sections, the way we addressed interacting factors was to do comparisons or contrasts of some factor(s) separately at levels of other factor(s). This leads to a lot of estimates and associated tests. Another approach is to compare things in a multivariate way. In the auto-noise example, for example, we have four means (corresponding to the four combinations of `type` and `side`) with each size of car, and we could consider comparing these *sets* of means. Such multivariate comparisons can be done via the *Mahalanobis distance* (a kind of standardized distance measure) between one set of four means and another. This is facilitated by the `mvcontrast()` function: ```{r} mvcontrast(noise.emm, "pairwise", mult.name = c("type", "side")) ``` In this output, the `T.square` values are Hotelling's $T^2$ statistics, which are the squared Mahalanobis distances among the sets of four means. These results thus accomplish a similar objective to the initial comparisons presented in this vignette, but are not complicated by the issue that the factors interact. (Instead, we lose the directionality of the comparisons.) While all comparisons are "significant," the `T.square` values indicate that large cars are statistically most different from the other sizes. We may still break things down using `by` variables.
Suppose, for example, we wish to compare the two filter types for each size of car, without regard to which side: ```{r} update(mvcontrast(noise.emm, "consec", mult.name = "side", by = "size"), by = NULL) ``` One detail to note about multivariate comparisons: in order to make complete sense, all the factors involved must interact. Suppose we were to repeat the initial multivariate comparison after removing all interactions: ```{r} mvcontrast(update(noise.emm, submodel = ~ side + size + type), "pairwise", mult.name = c("type", "side")) ``` Note that each $F$ ratio now has 1 d.f. Also, note that `T.square = F.ratio`, and you can verify that these values are equal to the squares of the `t.ratio`s in the initial example in this vignette ($(-6.147)^2 = 37.786$, etc.). That is, if we ignore all interactions, the multivariate tests are exactly equivalent to the univariate tests of the marginal means. [Back to Contents](#contents) ## Interactions with covariates {#covariates} When a covariate and a factor interact, we typically don't want EMMs themselves, but rather estimates of *slopes* of the covariate trend for each level of the factor. As a simple example, consider the `fiber` dataset, and fit a model including the interaction between `diameter` (a covariate) and `machine` (a factor): ```{r} fiber.lm <- lm(strength ~ diameter*machine, data = fiber) ``` This model comprises fitting, for each machine, a separate linear trend for `strength` versus `diameter`. Accordingly, we can estimate and compare the slopes of those lines via the `emtrends()` function: ```{r} emtrends(fiber.lm, pairwise ~ machine, var = "diameter") ``` We see the three slopes, but no two of them test as being statistically different. To visualize the lines themselves, you may use ```{r fig.height = 2} emmip(fiber.lm, machine ~ diameter, cov.reduce = range) ``` The `cov.reduce = range` argument is passed to `ref_grid()`; it is needed because by default, each covariate is reduced to only one value (see the ["basics" vignette](basics.html)). Instead, we call the `range()` function to obtain the minimum and maximum diameter. ######### {#oranges} For a more sophisticated example, consider the `oranges` dataset included with the package. These data concern the sales of two varieties of oranges. The prices (`price1` and `price2`) were experimentally varied in different stores and different days, and the responses `sales1` and `sales2` were observed. Let's consider three multivariate models for these data, with additive effects for days and stores, and different levels of fitting on the prices: ```{r} org.quad <- lm(cbind(sales1, sales2) ~ poly(price1, price2, degree = 2) + day + store, data = oranges) org.int <- lm(cbind(sales1, sales2) ~ price1 * price2 + day + store, data = oranges) org.add <- lm(cbind(sales1, sales2) ~ price1 + price2 + day + store, data = oranges) ``` Being a multivariate model, **emmeans** methods will distinguish the responses as if they were levels of a factor, which we will name "variety". Moreover, separate effects are estimated for each multivariate response, so there is an *implied interaction* between `variety` and each of the predictors involving `price1` and `price2`. (In `org.int`, there is an implied three-way interaction.) 
An interesting way to view these models is to look at how they predict sales of each variety at each observed value of the prices: ```{r} emmip(org.quad, price2 ~ price1 | variety, mult.name = "variety", cov.reduce = FALSE) ``` The trends portrayed here are quite sensible: In the left panel, as we increase the price of variety 1, sales of that variety will tend to decrease -- and the decrease will be faster when the other variety of oranges is low-priced. In the right panel, as price of variety 1 increases, sales of variety 2 will increase when it is low-priced, but could decrease also at high prices because oranges in general are just too expensive. A plot like this for `org.int` will be similar but all the curves will be straight lines; and the one for `org.add` will have all lines parallel. In all models, though, there are implied `price1:variety` and `price2:variety` interactions, because we have different regression coefficients for the two responses. Which model should we use? They are nested models, so they can be compared by `anova()`: ```{r} anova(org.quad, org.int, org.add) ``` It seems like the full-quadratic model has little advantage over the interaction model. There truly is nothing magical about a *P* value of 0.05, and we have enough data that over-fitting is not a hazard; so I like `org.int`. However, what follows could be done with any of these models. To summarize and test the results compactly, it makes sense to obtain estimates of a representative trend in each of the left and right panels, and perhaps to compare them. In turn, that can be done by obtaining the slope of the curve (or line) at the average value of `price2`. The `emtrends()` function is designed for exactly this kind of purpose. It uses a difference quotient to estimate the slope of a line fitted to a given variable. It works just like `emmeans()` except for requiring the variable to use in the difference quotient. Using the `org.int` model: ```{r} emtrends(org.int, pairwise ~ variety, var = "price1", mult.name = "variety") ``` From this, we can say that, starting with `price1` and `price2` both at their average values, we expect `sales1` to decrease by about .75 per unit increase in `price1`; meanwhile, there is a suggestion of a slight increase of `sales2`, but without much statistical evidence. Marginally, the first variety has a 0.89 disadvantage relative to sales of the second variety. Other analyses (not shown) with `price2` set at a higher value will reduce these effects, while setting `price2` lower will exaggerate all these effects. If the same analysis is done with the quadratic model, the trends are curved, and so the results will depend somewhat on the setting for `price1`. The graph above gives an indication of the nature of those changes. Similar results hold when we analyze the trends for `price2`: ```{r} emtrends(org.int, pairwise ~ variety, var = "price2", mult.name = "variety") ``` At the averages, increasing the price of variety 2 has the effect of decreasing sales of variety 2 while slightly increasing sales of variety 1 -- a marginal difference of about .92. [Back to Contents](#contents) ## Summary {#summary} Interactions, by nature, make things more complicated. One must resist pressures and inclinations to try to produce simple bottom-line conclusions. Interactions require more work and more patience; they require presenting more cases -- more than are presented in the examples in this vignette -- in order to provide a complete picture.
[Index of all vignette topics](vignette-topics.html) emmeans/inst/doc/comparisons.R0000644000176200001440000000720614165066754016124 0ustar liggesusers## ---- echo = FALSE, results = "hide", message = FALSE------------------------- require("emmeans") knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") ## ----------------------------------------------------------------------------- pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs) pigs.emm.s <- emmeans(pigs.lm, "source") pairs(pigs.emm.s) ## ----------------------------------------------------------------------------- pwpm(pigs.emm.s) ## ----------------------------------------------------------------------------- pwpm(pigs.emm.s, means = FALSE, flip = TRUE, # args for pwpm() reverse = TRUE, # args for pairs() side = ">", delta = 0.05, adjust = "none") # args for test() ## ----------------------------------------------------------------------------- eff_size(pigs.emm.s, sigma = sigma(pigs.lm), edf = 23) ## ----------------------------------------------------------------------------- eff_size(pigs.emm.s, sigma = sigma(pigs.lm), edf = Inf) ## ---- eval = FALSE------------------------------------------------------------ # eff_size(pairs(pigs.emm.s), sigma = sigma(pigs.lm), edf = 23, method = "identity") ## ----fig.height = 1.5--------------------------------------------------------- plot(pigs.emm.s, comparisons = TRUE) ## ----------------------------------------------------------------------------- pwpp(pigs.emm.s) ## ---- fig.width = 9----------------------------------------------------------- pigs.lmint <- lm(log(conc) ~ source * factor(percent), data = pigs) pigs.cells <- emmeans(pigs.lmint, ~ source * percent) pwpp(pigs.cells, type = "response") ## ---- fig.width = 6----------------------------------------------------------- pwpp(pigs.cells, by = "source", type = "response") ## ----------------------------------------------------------------------------- coef(pairs(pigs.emm.s)) ## ----------------------------------------------------------------------------- pigs.emm.p <- emmeans(pigs.lm, "percent") ply <- contrast(pigs.emm.p, "poly") ply coef(ply) ## ----------------------------------------------------------------------------- org.aov <- aov(sales1 ~ day + Error(store), data = oranges, contrasts = list(day = "contr.sum")) org.emml <- emmeans(org.aov, consec ~ day) org.emml ## ----------------------------------------------------------------------------- skip_comp.emmc <- function(levels, skip = 1, reverse = FALSE) { if((k <- length(levels)) < skip + 1) stop("Need at least ", skip + 1, " levels") coef <- data.frame() coef <- as.data.frame(lapply(seq_len(k - skip - 1), function(i) { sgn <- ifelse(reverse, -1, 1) sgn * c(rep(0, i - 1), 1, rep(0, skip), -1, rep(0, k - i - skip - 1)) })) names(coef) <- sapply(coef, function(x) paste(which(x == 1), "-", which(x == -1))) attr(coef, "adjust") = "fdr" # default adjustment method coef } ## ----------------------------------------------------------------------------- skip_comp.emmc(1:5) skip_comp.emmc(1:5, skip = 0, reverse = TRUE) ## ----------------------------------------------------------------------------- contrast(org.emml[[1]], "skip_comp", skip = 2, reverse = TRUE) ## ----------------------------------------------------------------------------- LF <- contrast(pigs.emm.s, list(lambda1 = c(1, 2, 0), lambda2 = c(0, 3, -2)), offset = c(-7, 1)) confint(LF, adjust = "bonferroni") ## ----------------------------------------------------------------------------- pairs(pigs.emm.s, 
type = "lp") pairs(pigs.emm.s, type = "response") emmeans/inst/doc/confidence-intervals.R0000644000176200001440000000473214165066755017673 0ustar liggesusers## ---- echo = FALSE, results = "hide", message = FALSE------------------------- require("emmeans") knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") ## ----------------------------------------------------------------------------- pigs.lm1 <- lm(log(conc) ~ source + factor(percent), data = pigs) pigs.rg <- ref_grid(pigs.lm1) pigs.emm.s <- emmeans(pigs.rg, "source") ## ----------------------------------------------------------------------------- test(pigs.emm.s) ## ----------------------------------------------------------------------------- test(pigs.emm.s, null = log(40), side = ">") ## ----------------------------------------------------------------------------- confint(pigs.emm.s, calc = c(n = ~.wgt.)) ## ----------------------------------------------------------------------------- test(pigs.emm.s, null = log(40), side = ">", type = "response") ## ----------------------------------------------------------------------------- confint(pigs.emm.s, side = ">", level = .90, type = "response") ## ----------------------------------------------------------------------------- confint(pigs.emm.s, adjust = "tukey") ## ----------------------------------------------------------------------------- test(pigs.emm.s, null = log(40), side = ">", adjust = "bonferroni") ## ----------------------------------------------------------------------------- confint(pigs.rg, by = "source") ## ----eval = FALSE------------------------------------------------------------- # emmeans(pigs.lm, ~ percent | source) ### same results as above # summary(.Last.value, by = percent) ### grouped the other way ## ----------------------------------------------------------------------------- pigsint.lm <- lm(log(conc) ~ source * factor(percent), data = pigs) pigsint.rg <- ref_grid(pigsint.lm) contrast(pigsint.rg, "consec", simple = "percent") ## ----------------------------------------------------------------------------- pigs.prs.s <- pairs(pigs.emm.s) pigs.prs.s ## ----------------------------------------------------------------------------- test(pigs.prs.s, joint = TRUE) ## ----------------------------------------------------------------------------- joint_tests(pigsint.rg) ## ----------------------------------------------------------------------------- joint_tests(pigsint.rg, by = "source") ## ----------------------------------------------------------------------------- test(pigs.prs.s, delta = log(1.25), adjust = "none") emmeans/inst/doc/xplanations.R0000644000176200001440000000337014165066775016130 0ustar liggesusers## ---- echo = FALSE, results = "hide", message = FALSE--------------------------------------------- require("emmeans") knitr::opts_chunk$set(fig.width = 4.5, fig.height = 2.0, class.output = "ro", class.message = "re", class.error = "re", class.warning = "re") ###knitr::opts_chunk$set(fig.width = 4.5, fig.height = 2.0) ## ---- message = FALSE----------------------------------------------------------------------------- m = c(6.1, 4.5, 5.4, 6.3, 5.5, 6.7) se2 = c(.3, .4, .37, .41, .23, .48)^2 lev = list(A = c("a1","a2","a3"), B = c("b1", "b2")) foo = emmobj(m, diag(se2), levels = lev, linfct = diag(6)) plot(foo, CIs = FALSE, comparisons = TRUE) ## ---- message = FALSE----------------------------------------------------------------------------- mkmat <- function(V, rho = 0, indexes = list(1:3, 4:6)) { sd = sqrt(diag(V)) for (i in indexes) V[i,i] = (1 - 
rho)*diag(sd[i]^2) + rho*outer(sd[i], sd[i]) V } # Intraclass correlation = 0.3 foo3 = foo foo3@V <- mkmat(foo3@V, 0.3) plot(foo3, CIs = FALSE, comparisons = TRUE) ## ---- message = FALSE----------------------------------------------------------------------------- foo6 = foo foo6@V <- mkmat(foo6@V, 0.6) plot(foo6, CIs = FALSE, comparisons = TRUE) ## ---- message = FALSE, error = TRUE--------------------------------------------------------------- foo8 = foo foo8@V <- mkmat(foo8@V, 0.8) plot(foo8, CIs = FALSE, comparisons = TRUE) ## ---- message = FALSE----------------------------------------------------------------------------- plot(foo8, CIs = FALSE, comparisons = TRUE, by = "B") ## ------------------------------------------------------------------------------------------------- pwpp(foo6, sort = FALSE) pwpm(foo6) emmeans/inst/doc/comparisons.html0000644000176200001440000041716014165066754016673 0ustar liggesusers Comparisons and contrasts in emmeans

Comparisons and contrasts in emmeans

emmeans package, Version 1.7.2

Contents

This vignette covers techniques for comparing EMMs at levels of a factor predictor, and other related analyses.

  1. Pairwise comparisons
  2. Other contrasts
  3. Formula interface
  4. Custom contrasts and linear functions
  5. Special behavior with log transformations
  6. Interaction contrasts (in “interactions” vignette)
  7. Multivariate contrasts (in “interactions” vignette)

Index of all vignette topics

Pairwise comparisons

The most common follow-up analysis for models having factors as predictors is to compare the EMMs with one another. This may be done simply via the pairs() method for emmGrid objects. In the code below, we obtain the EMMs for source for the pigs data, and then compare the sources pairwise.

pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs)
pigs.emm.s <- emmeans(pigs.lm, "source")
pairs(pigs.emm.s)
##  contrast    estimate     SE df t.ratio p.value
##  fish - soy    -0.273 0.0529 23  -5.153  0.0001
##  fish - skim   -0.402 0.0542 23  -7.428  <.0001
##  soy - skim    -0.130 0.0530 23  -2.442  0.0570
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale. 
## P value adjustment: tukey method for comparing a family of 3 estimates

In its out-of-the-box configuration, pairs() sets two defaults for summary(): adjust = "tukey" (multiplicity adjustment), and infer = c(FALSE, TRUE) (test statistics, not confidence intervals). You may override these, of course, by calling summary() on the result with different values for these.
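
For example, this sketch (not part of the original example) overrides both defaults, yielding unadjusted confidence intervals together with the tests:

summary(pairs(pigs.emm.s), infer = c(TRUE, TRUE), adjust = "none")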

In the example above, EMMs for later factor levels are subtracted from those for earlier levels; if you want the comparisons to go in the other direction, use pairs(pigs.emm.s, reverse = TRUE). Also, in multi-factor situations, you may specify by factor(s) to perform the comparisons separately at the levels of those factors.
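
For instance, here is a minimal sketch of both options; the two-factor grid pigs.emm.sp is introduced here only for illustration:

pairs(pigs.emm.s, reverse = TRUE)            # differences taken in the reverse direction
pigs.emm.sp <- emmeans(pigs.lm, ~ source * percent)
pairs(pigs.emm.sp, by = "percent")           # compare sources separately at each percent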

Matrix displays

The numerical main results associated with pairwise comparisons can be presented compactly in matrix form via the pwpm() function. We simply hand it the emmGrid object to use in making the comparisons:

pwpm(pigs.emm.s)
##        fish    soy   skim
## fish [3.39] <.0001 <.0001
## soy  -0.273 [3.67] 0.0570
## skim -0.402 -0.130 [3.80]
## 
## Row and column labels: source
## Upper triangle: P values   adjust = "tukey"
## Diagonal: [Estimates] (emmean) 
## Lower triangle: Comparisons (estimate)   earlier vs. later

This matrix shows the EMMs along the diagonal, \(P\) values in the upper triangle, and the differences in the lower triangle. Options exist to switch off any one of these and to switch which triangle is used for the latter two. Also, optional arguments are passed along to pairs() and test(). For instance, we can reverse the direction of the comparisons, suppress the display of EMMs, swap where the \(P\) values go, and perform noninferiority tests with a threshold of 0.05 as follows:

pwpm(pigs.emm.s, means = FALSE, flip = TRUE,     # args for pwpm()
     reverse = TRUE,                             # args for pairs()
     side = ">", delta = 0.05, adjust = "none")  # args for test()
##        fish    soy  skim
## fish         0.273 0.402
## soy  <.0001        0.130
## skim <.0001 0.0013      
## 
## Row and column labels: source
## Lower triangle: P values   side = ">"  delta = 0.05
## Upper triangle: Comparisons (estimate)   later vs. earlier

With all three P values so small, we have fish, soy, and skim in increasing order of noninferiority based on the given threshold.

When more than one factor is present, an existing or newly specified by variable(s) can split the results into a list of matrices.
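
For instance, using the two-factor pigs.cells grid constructed later in this vignette, a sketch would be:

pwpm(pigs.cells, by = "source")    # a list of matrices, one per source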

Effect size

Some users desire standardized effect-size measures. Most popular is probably Cohen’s d, which is defined as the observed difference, divided by the population SD; and obviously Cohen effect sizes are close cousins of pairwise differences. They are available via the eff_size() function, where the user must specify the emmGrid object with the means to be compared, the estimated population SD sigma, and its degrees of freedom edf. This is illustrated with the current example:

eff_size(pigs.emm.s, sigma = sigma(pigs.lm), edf = 23)
##  contrast    effect.size    SE df lower.CL upper.CL
##  fish - soy        -2.37 0.577 23    -3.56   -1.175
##  fish - skim       -3.49 0.698 23    -4.94   -2.051
##  soy - skim        -1.12 0.490 23    -2.14   -0.112
## 
## Results are averaged over the levels of: percent 
## sigma used for effect sizes: 0.1151 
## Confidence level used: 0.95

The confidence intervals shown take into account the error in estimating sigma as well as the error in the differences. Note that the intervals are narrower if we claim that we know sigma perfectly (i.e., infinite degrees of freedom):

eff_size(pigs.emm.s, sigma = sigma(pigs.lm), edf = Inf)
##  contrast    effect.size    SE df lower.CL upper.CL
##  fish - soy        -2.37 0.460 23    -3.32   -1.418
##  fish - skim       -3.49 0.470 23    -4.47   -2.521
##  soy - skim        -1.12 0.461 23    -2.08   -0.172
## 
## Results are averaged over the levels of: percent 
## sigma used for effect sizes: 0.1151 
## Confidence level used: 0.95

Note that eff_size() expects the object with the means, not the differences. If you want to use the differences, use the method argument to specify that you don’t want to compute pairwise differences again; e.g.,

eff_size(pairs(pigs.emm.s), sigma = sigma(pigs.lm), edf = 23, method = "identity")

(results are identical to the first effect sizes shown).

Graphical comparisons

Comparisons may be summarized graphically via the comparisons argument in plot.emmGrid():

plot(pigs.emm.s, comparisons = TRUE)

The blue bars are confidence intervals for the EMMs, and the red arrows are for the comparisons among them. If an arrow from one mean overlaps an arrow from another group, the difference is not “significant,” based on the adjust setting (which defaults to "tukey") and the value of alpha (which defaults to 0.05). See the “xplanations” supplement for details on how these are derived.

Note: Don’t ever use confidence intervals for EMMs to perform comparisons; they can be very misleading. Use the comparison arrows instead; or better yet, use pwpp().

A caution: it really is not good practice to draw a bright distinction based on whether or not a P value exceeds some cutoff. This display does dim such distinctions somewhat by allowing the viewer to judge whether a P value is close to alpha one way or the other; but a better strategy is to simply obtain all the P values using pairs(), and look at them individually.

Pairwise P-value plots

In trying to develop an alternative to compact-letter displays (see next subsection), we devised the “pairwise P-value plot” displaying all the P values in pairwise comparisons:

pwpp(pigs.emm.s)

Each comparison is associated with a vertical line segment that joins the scale positions of the two EMMs being compared, and whose horizontal position is determined by the P value of that comparison.

This kind of plot can get quite “busy” as the number of means being compared goes up. For example, suppose we include the interactions in the model for the pigs data, and compare all 12 cell means:

pigs.lmint <- lm(log(conc) ~ source * factor(percent), data = pigs)
pigs.cells <- emmeans(pigs.lmint, ~ source * percent)
pwpp(pigs.cells, type = "response")

While this plot has a lot of stuff going on, consider looking at it row-by-row. Next to each EMM, we can visualize the P values of all 11 comparisons with each other EMM (along with their color codes). Also, note that we can include arguments that are passed to summary(); in this case, to display the back-transformed means.

If we are willing to forgo the diagonal comparisons (where neither factor has a common level), we can make this a lot less cluttered via a by specification:

pwpp(pigs.cells, by = "source", type = "response")

In this latter plot we can see that the comparisons with skim as the source tend to be statistically stronger. This is also an opportunity to remind the user that multiplicity adjustments are made relative to each by group. For example, comparing skim:9 versus skim:15 has a Tukey-adjusted P value somewhat greater than 0.1 when all are in one family of 12 means, but about 0.02 relative to a smaller family of 4 means as depicted in the three-paneled plot.

Compact letter displays

Another way to depict comparisons is by compact letter displays, whereby two EMMs sharing one or more grouping symbols are not “significantly” different. These may be generated by the multcomp::cld() function. I really recommend against this kind of display, though, and decline to illustrate it. These displays promote visually the idea that two means that are “not significantly different” are to be judged as being equal; and that is a very wrong interpretation. In addition, they draw an artificial “bright line” between P values on either side of alpha, even ones that are very close.

Back to Contents

Other contrasts

Pairwise comparisons are an example of linear functions of EMMs. You may use coef() to see the coefficients of these linear functions:

coef(pairs(pigs.emm.s))
##      source c.1 c.2 c.3
## fish   fish   1   1   0
## soy     soy  -1   0   1
## skim   skim   0  -1  -1

The pairwise comparisons correspond to columns of the above results. For example, the first pairwise comparison, fish - soy, gives coefficients of 1, -1, and 0 to fish, soy, and skim, respectively. In cases such as this one, where each column of coefficients sums to zero, the linear functions are termed contrasts.

The contrast() function provides for general contrasts (and linear functions, as well) of factor levels. Its second argument, method, is used to specify what method is to be used. In this section we describe the built-in ones, where we simply provide the name of the built-in method. Consider, for example, the factor percent in the model pigs.lm. It is treated as a factor in the model, but it corresponds to equally-spaced values of a numeric variable. In such cases, users often want to compute orthogonal polynomial contrasts:

pigs.emm.p <- emmeans(pigs.lm, "percent")
ply <- contrast(pigs.emm.p, "poly")
ply
##  contrast  estimate     SE df t.ratio p.value
##  linear      0.9374 0.2106 23   4.452  0.0002
##  quadratic  -0.0971 0.0883 23  -1.099  0.2830
##  cubic       0.1863 0.1877 23   0.992  0.3313
## 
## Results are averaged over the levels of: source 
## Results are given on the log (not the response) scale.
coef(ply)
##    percent c.1 c.2 c.3
## 9        9  -3   1  -1
## 12      12  -1  -1   3
## 15      15   1  -1  -3
## 18      18   3   1   1

We obtain tests for the linear, quadratic, and cubic trends. The coefficients are those that can be found in tables in many experimental-design texts. It is important to understand that the estimated linear contrast is not the slope of a line fitted to the data. It is simply a contrast having coefficients that increase linearly. It does test the linear trend, however.

There are a number of other named contrast methods, for example "trt.vs.ctrl", "eff", and "consec". The "pairwise" and "revpairwise" methods in contrast() are the same as pairs() and pairs(..., reverse = TRUE). See help("contrast-methods") for details.
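
For instance, two quick sketches (output omitted):

contrast(pigs.emm.s, "trt.vs.ctrl")   # compare soy and skim with fish, the first level
contrast(pigs.emm.p, "consec")        # consecutive comparisons: 12 - 9, 15 - 12, 18 - 15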

Back to Contents

Formula interface

If you already know what contrasts you will want before calling emmeans(), a quick way to get them is to specify the method as the left-hand side of the formula in its second argument. For example, with the oranges dataset provided in the package,

org.aov <- aov(sales1 ~ day + Error(store), data = oranges,
               contrasts = list(day = "contr.sum"))
org.emml <- emmeans(org.aov, consec ~ day)
org.emml
## $emmeans
##  day emmean   SE   df lower.CL upper.CL
##  1     7.87 2.77 29.2     2.21     13.5
##  2     7.10 2.77 29.2     1.43     12.8
##  3    13.76 2.77 29.2     8.09     19.4
##  4     8.04 2.77 29.2     2.37     13.7
##  5    12.92 2.77 29.2     7.26     18.6
##  6    11.60 2.77 29.2     5.94     17.3
## 
## Warning: EMMs are biased unless design is perfectly balanced 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast estimate   SE df t.ratio p.value
##  2 - 1      -0.772 3.78 25  -0.204  0.9997
##  3 - 2       6.658 3.78 25   1.763  0.3246
##  4 - 3      -5.716 3.78 25  -1.513  0.4681
##  5 - 4       4.882 3.78 25   1.293  0.6127
##  6 - 5      -1.321 3.78 25  -0.350  0.9965
## 
## P value adjustment: mvt method for 5 tests

The contrasts shown are the day-to-day changes.

This two-sided formula technique is quite convenient, but it can also create confusion. For one thing, the result is not an emmGrid object anymore; it is a list of emmGrid objects, called an emm_list. You may need to be cognizant of that if you are to do further contrasts or other analyses. For example, if you want "eff" contrasts as well, you need to do contrast(org.emml[[1]], "eff") or contrast(org.emml, "eff", which = 1).

Another issue is that it may be unclear which part of the results is affected by certain options. For example, if you were to add adjust = "bonf" to the org.emml call above, would the Bonferroni adjustment be applied to the EMMs, or to the contrasts? (See the documentation if interested; but the best practice is to avoid such dilemmas.)

Back to Contents

Custom contrasts and linear functions

The user may write a custom contrast function for use in contrast(). What’s needed is a function having the desired name with ".emmc" appended, that generates the needed coefficients as a list or data frame. The function should take a vector of levels as its first argument, and any optional parameters as additional arguments. For example, suppose we want to compare every third level of a treatment. The following function provides for this:

skip_comp.emmc <- function(levels, skip = 1, reverse = FALSE) {
    if((k <- length(levels)) < skip + 1)
        stop("Need at least ", skip + 1, " levels")
    coef <- data.frame()
    coef <- as.data.frame(lapply(seq_len(k - skip - 1), function(i) {
        sgn <- ifelse(reverse, -1, 1)
        sgn * c(rep(0, i - 1), 1, rep(0, skip), -1, rep(0, k - i - skip - 1))
    }))
    names(coef) <- sapply(coef, function(x)
        paste(which(x == 1), "-", which(x == -1)))
    attr(coef, "adjust") = "fdr"   # default adjustment method
    coef
}

To test it, try 5 levels:

skip_comp.emmc(1:5)
##   1 - 3 2 - 4 3 - 5
## 1     1     0     0
## 2     0     1     0
## 3    -1     0     1
## 4     0    -1     0
## 5     0     0    -1
skip_comp.emmc(1:5, skip = 0, reverse = TRUE)
##   2 - 1 3 - 2 4 - 3 5 - 4
## 1    -1     0     0     0
## 2     1    -1     0     0
## 3     0     1    -1     0
## 4     0     0     1    -1
## 5     0     0     0     1

(The latter is the same as "consec" contrasts.) Now try it with the oranges example we had previously:

contrast(org.emml[[1]], "skip_comp", skip = 2, reverse = TRUE)
##  contrast estimate   SE df t.ratio p.value
##  4 - 1        0.17 3.78 25   0.045  0.9645
##  5 - 2        5.82 3.78 25   1.542  0.4069
##  6 - 3       -2.15 3.78 25  -0.571  0.8601
## 
## P value adjustment: fdr method for 3 tests

The contrast() function may in fact be used to compute arbitrary linear functions of EMMs. Suppose for some reason we want to estimate the quantities \(\lambda_1 = \mu_1+2\mu_2-7\) and \(\lambda_2 = 3\mu_2-2\mu_3+1\), where the \(\mu_j\) are the population values of the source EMMs in the pigs example. This may be done by providing the coefficients in a list, and the added constants in the offset argument:

LF <- contrast(pigs.emm.s, 
               list(lambda1 = c(1, 2, 0), lambda2 = c(0, 3, -2)),
               offset = c(-7, 1))
## Note: Use 'contrast(regrid(object), ...)' to obtain contrasts of back-transformed estimates
confint(LF, adjust = "bonferroni")
##  contrast estimate     SE df lower.CL upper.CL
##  lambda1      3.73 0.0827 23     3.53     3.93
##  lambda2      4.41 0.1341 23     4.09     4.73
## 
## Results are averaged over the levels of: percent 
## Note: contrasts are still on the log scale 
## Confidence level used: 0.95 
## Conf-level adjustment: bonferroni method for 2 estimates

Back to Contents

Special properties of log (and logit) transformations

Suppose we obtain EMMs for a model having a response transformation or link function. In most cases, when we compute contrasts of those EMMs, there is no natural way to express those contrasts on anything other than the transformed scale. For example, in a model fitted using glm() with the Gamma() family, the default link function is the inverse. Predictions on such a model are estimates of \(1/\mu_j\) for various \(j\). Comparisons of predictions will be estimates of \(1/\mu_j - 1/\mu_{k}\) for \(j \ne k\). There is no natural way to back-transform these differences to some other interpretable scale.

However, logs are an exception, in that \(\log\mu_j - \log\mu_k = \log(\mu_j/\mu_k)\). Accordingly, when contrast() (or pairs()) notices that the response is on the log scale, it back-transforms contrasts to ratios when results are to be of response type. For example:

pairs(pigs.emm.s, type = "lp")
##  contrast    estimate     SE df t.ratio p.value
##  fish - soy    -0.273 0.0529 23  -5.153  0.0001
##  fish - skim   -0.402 0.0542 23  -7.428  <.0001
##  soy - skim    -0.130 0.0530 23  -2.442  0.0570
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale. 
## P value adjustment: tukey method for comparing a family of 3 estimates
pairs(pigs.emm.s, type = "response")
##  contrast    ratio     SE df null t.ratio p.value
##  fish / soy  0.761 0.0403 23    1  -5.153  0.0001
##  fish / skim 0.669 0.0362 23    1  -7.428  <.0001
##  soy / skim  0.879 0.0466 23    1  -2.442  0.0570
## 
## Results are averaged over the levels of: percent 
## P value adjustment: tukey method for comparing a family of 3 estimates 
## Tests are performed on the log scale

As is true of EMM summaries with type = "response", the tests and confidence intervals are done before back-transforming. The ratios estimated here are actually ratios of geometric means. In general, a model with a log response is in fact a model for relative effects of any of its linear predictors, and this back-transformation to ratios goes hand-in-hand with that.
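
For example, confidence intervals for these ratios of geometric means are obtained in the usual way; a sketch (output omitted):

confint(pairs(pigs.emm.s), type = "response")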

In generalized linear models, this behavior will occur in two common cases: Poisson or count regression, for which the usual link is the log; and logistic regression, because logits are logs of odds ratios.

Back to Contents

Index of all vignette topics

emmeans/inst/doc/messy-data.html0000644000176200001440000042237414165066763016410 0ustar liggesusers Working with messy data

Working with messy data

emmeans package, Version 1.7.2

Issues with observational data

In experiments, we control the conditions under which observations are made. Ideally, this leads to balanced datasets and clear inferences about the effects of those experimental conditions. In observational data, factor levels are observed rather than controlled, and in the analysis we control for those factors and covariates. It is possible that some factors and covariates lie in the causal path for other predictors. Observational studies can be designed in ways to mitigate some of these issues; but often we are left with a mess. Using EMMs does not solve the inherent problems in messy, undesigned studies; but they do give us ways to compensate for imbalance in the data, and allow us to estimate meaningful effects after carefully considering the ways in which they can be confounded.

As an illustration, consider the nutrition dataset provided with the package. These data are used as an example in Milliken and Johnson (1992), Analysis of Messy Data, and contain the results of an observational study on nutrition education. Low-income mothers are classified by race, age category, and whether or not they received food stamps (the group factor); and the response variable is a gain score (post minus pre scores) after completing a nutrition training program. First, let’s fit a model that includes all main effects and 2-way interactions, and obtain its “type II” ANOVA:

nutr.lm <- lm(gain ~ (age + group + race)^2, data = nutrition) 
car::Anova(nutr.lm)
## Note: model has aliased coefficients
##       sums of squares computed by model comparison
## Anova Table (Type II tests)
## 
## Response: gain
##             Sum Sq Df F value    Pr(>F)
## age          82.37  3  0.9614    0.4145
## group       658.13  1 23.0441 6.105e-06
## race         11.17  2  0.1956    0.8227
## age:group    91.58  3  1.0688    0.3663
## age:race     87.30  3  1.0189    0.3880
## group:race  113.70  2  1.9906    0.1424
## Residuals  2627.47 92

There is definitely a group effect and a hint of an interaction with race. Here are the EMMs for those two factors, along with their counts:

emmeans(nutr.lm, ~ group * race, calc = c(n = ".wgt."))
##  group      race     emmean   SE df  n lower.CL upper.CL
##  FoodStamps Black      4.71 2.37 92  7  0.00497     9.41
##  NoAid      Black     -2.19 2.49 92 14 -7.13690     2.76
##  FoodStamps Hispanic nonEst   NA NA  1       NA       NA
##  NoAid      Hispanic nonEst   NA NA  2       NA       NA
##  FoodStamps White      3.61 1.16 92 52  1.31252     5.90
##  NoAid      White      2.26 2.39 92 31 -2.48897     7.00
## 
## Results are averaged over the levels of: age 
## Confidence level used: 0.95

Hmmmm. The EMMs when race is “Hispanic” are not given; instead they are flagged as non-estimable. What does that mean? Well, when using a model to make predictions, it is impossible to do that beyond the linear space of the data used to fit the model. And we have no data for three of the age groups in the Hispanic population:

with(nutrition, table(race, age))
##           age
## race        1  2  3  4
##   Black     2  7 10  2
##   Hispanic  0  0  3  0
##   White     5 16 51 11

We can’t make predictions for all the cases we are averaging over in the above EMMs, and that is why some of them are non-estimable. The bottom line is that we simply cannot include Hispanics in the mix when comparing factor effects. That’s a limitation of this study that cannot be overcome without collecting additional data. Our choices for further analysis are to focus only on Black and White populations; or to focus only on age group 3. For example (the latter):

summary(emmeans(nutr.lm, pairwise ~ group | race, at = list(age = "3")), 
    by = NULL)
## Note: adjust = "tukey" was changed to "sidak"
## because "tukey" is only appropriate for one set of pairwise comparisons
## $emmeans
##  group      race     emmean   SE df lower.CL upper.CL
##  FoodStamps Black      7.50 2.67 92     2.19   12.807
##  NoAid      Black     -3.67 2.18 92    -8.00    0.666
##  FoodStamps Hispanic   0.00 5.34 92   -10.61   10.614
##  NoAid      Hispanic   2.50 3.78 92    -5.01   10.005
##  FoodStamps White      5.42 0.96 92     3.51    7.326
##  NoAid      White     -0.20 1.19 92    -2.57    2.173
## 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast           race     estimate   SE df t.ratio p.value
##  FoodStamps - NoAid Black       11.17 3.45 92   3.237  0.0050
##  FoodStamps - NoAid Hispanic    -2.50 6.55 92  -0.382  0.9739
##  FoodStamps - NoAid White        5.62 1.53 92   3.666  0.0012
## 
## P value adjustment: sidak method for 3 tests

(We used trickery with providing a by variable, and then taking it away, to make the output more compact.) Evidently, the training program has been beneficial to the Black and White groups in that age category. There is no conclusion for the Hispanic group – for which we have very little data.

Back to Contents

Mediating covariates

The framing data in the mediation package has the results of an experiment conducted by Brader et al. (2008) where subjects were given the opportunity to send a message to Congress regarding immigration. However, before being offered this, some subjects (treat = 1) were first shown a news story that portrays Latinos in a negative way. Besides the binary response (whether or not they elected to send a message), the experimenters also measured emo, the subjects’ emotional state after the treatment was applied. There are various demographic variables as well. Let’s fit a logistic regression model, after changing the labels for educ to shorter strings.

framing <- mediation::framing 
## Registered S3 methods overwritten by 'lme4':
##   method                          from
##   cooks.distance.influence.merMod car 
##   influence.merMod                car 
##   dfbeta.influence.merMod         car 
##   dfbetas.influence.merMod        car
levels(framing$educ) <- c("NA","Ref","< HS", "HS", "> HS","Coll +") 
framing.glm <- glm(cong_mesg ~ age + income + educ + emo + gender * factor(treat), 
    family = binomial, data = framing)

The conventional way to handle covariates like emo is to set them at their means and use those means for purposes of predictions and EMMs. These adjusted means are shown in the following plot.

emmip(framing.glm, treat ~ educ | gender, type = "response") 

This plot gives the impression that the effect of treat is reversed between male and female subjects; and also that the effect of education is not monotone. Both of these are counter-intuitive.

However, note that the covariate emo is measured post-treatment. That suggests that in fact treat (and perhaps other factors) could affect the value of emo; and if that is true (as is in fact established by mediation analysis techniques), we should not pretend that emo can be set independently of treat as was done to obtain the EMMs shown above. Instead, let emo depend on treat and the other predictors – easily done using cov.reduce – and we obtain an entirely different impression:

emmip(framing.glm, treat ~ educ | gender, type = "response", 
    cov.reduce = emo ~ treat*gender + age + educ + income)

The reference grid underlying this plot has different emo values for each factor combination. The plot suggests that, after taking emotional response into account, male (but not female) subjects exposed to the negative news story are more likely to send the message than are females or those not seeing the negative news story. Also, the effect of educ is now nearly monotone.

By the way, the results in this plot are the same as what you would obtain by refitting the model with an adjusted covariate

emo.adj <- resid(lm(emo ~ treat*gender + age + educ + income, data = framing))

… and then using ordinary covariate-adjusted means at the means of emo.adj. This is a technique that is often recommended.
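
Here is a sketch of that approach; the names emo.adj (as a new column) and framing.adj are hypothetical, introduced only for illustration:

framing$emo.adj <- resid(lm(emo ~ treat*gender + age + educ + income, data = framing))
framing.adj <- glm(cong_mesg ~ age + income + educ + emo.adj + gender * factor(treat),
    family = binomial, data = framing)
emmip(framing.adj, treat ~ educ | gender, type = "response")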

If there is more than one mediating covariate, their settings may be defined in sequence; for example, if x1, x2, and x3 are all mediating covariates, we might use

emmeans(..., cov.reduce = list(x1 ~ trt, x2 ~ trt + x1, x3 ~ trt + x1 + x2))

(or possibly with some interactions included as well).

Back to Contents

Mediating factors and weights

A mediating covariate is one that is in the causal path; likewise, it is possible to have a mediating factor. For mediating factors, the moral equivalent of the cov.reduce technique described above is to use weighted averages in lieu of equally-weighted ones in computing EMMs. The weights used in these averages should depend on the frequencies of mediating factor(s). Usually, the "cells" weighting scheme described later in this section is the right approach. In complex situations, it may be necessary to compute EMMs in stages.
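
For instance, with the nutr.lm model above, a sketch of cell-weighted EMMs for group would be:

emmeans(nutr.lm, "group", weights = "cells")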

As described in the “basics” vignette, EMMs are usually defined as equally-weighted means of reference-grid predictions. However, there are several built-in alternative weighting schemes that are available by specifying a character value for weights in a call to emmeans() or related function. The options are "equal" (the default), "proportional", "outer", "cells", and "flat".

The "proportional" (or "prop" for short) method weights proportionally to the frequencies (or model weights) of each factor combination that is averaged over. The "outer" method uses the outer product of the marginal frequencies of each factor that is being averaged over. To explain the distinction, suppose the EMMs for A involve averaging over two factors B and C. With "prop", we use the frequencies for each combination of B and C; whereas for "outer", first obtain the marginal frequencies for B and for C and weight proportionally to the product of these for each combination of B and C. The latter weights are like the “expected” counts used in a chi-square test for independence. Put another way, outer weighting is the same as proportional weighting applied one factor at a time; the following two would yield the same results:

emmeans(model, "A", weights = "outer") 
emmeans(emmeans(model, c("A", "B"), weights = "prop"),  weights = "prop") 

Using "cells" weights gives each prediction the same weight as occurs in the model; applied to a reference grid for a model with all interactions, "cells"-weighted EMMs are the same as the ordinary marginal means of the data. With "flat" weights, equal weights are used, except zero weight is applied to any factor combination having no data. Usually, "cells" or "flat" weighting will not produce non-estimable results, because we exclude empty cells. (That said, if covariates are linearly dependent with factors, we may still encounter non-estimable cases.)

Here is a comparison of predictions for nutr.lm defined above, using different weighting schemes:

sapply(c("equal", "prop", "outer", "cells", "flat"), function(w)
    predict(emmeans(nutr.lm, ~ race, weights = w)))
##         equal     prop    outer     cells      flat
## [1,] 1.258929 1.926554 2.546674 0.3809524 0.6865079
## [2,]       NA       NA       NA 1.6666667 1.2500000
## [3,] 2.932008 2.522821 3.142940 2.7951807 1.6103407

On the other hand, if we do group * race EMMs, only one factor (age) is averaged over; thus, the results for "prop" and "outer" weights will be identical in that case.
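
A quick sketch to verify that claim (both calls should produce identical predictions, since only age is averaged over):

predict(emmeans(nutr.lm, ~ group * race, weights = "prop"))
predict(emmeans(nutr.lm, ~ group * race, weights = "outer"))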

Back to Contents

Nuisance factors

Consider a situation where we have a model with 15 factors, each at 5 levels. Regardless of how simple or complex the model is, the reference grid consists of all combinations of these factors – and there are \(5^{15}\) of these, or over 30 billion. If there are, say, 100 regression coefficients in the model, then just the linfct slot in the reference grid requires \(100\times5^{15}\times8\) bytes of storage, or almost 23,000 gigabytes. Suppose in addition the model has a multivariate response with 5 levels. That multiplies both the rows and columns in linfct, increasing the storage requirements by a factor of 25. Either way, your computer can’t store that much – so this definitely qualifies as a messy situation!

The ref_grid() function now provides some relief, in the way of specifying some of the factors as “nuisance” factors. The reference grid is then constructed with those factors already averaged-out. So, for example with the same scenario, if only three of those 15 factors are of primary interest, and we specify the other 12 as nuisance factors to be averaged, that leaves us with only \(5^3=125\) rows in the reference grid, and hence \(125\times100\times8=100,000\) bytes of storage required for linfct. If there is a 5-level multivariate response, we’ll have 625 rows in the reference grid and \(25\times100,000=2,500,000\) bytes in linfct. Suddenly a horribly unmanageable situation becomes quite manageable!
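
As a sanity check on that arithmetic, here is the plain-R computation (shown only for illustration):

5^15                      # rows in the full reference grid: about 30.5 billion
100 * 5^15 * 8 / 2^30     # gigabytes needed for linfct: about 22,700
5^3                       # rows with 12 of the 15 factors as nuisance factors
5^3 * 100 * 8             # bytes needed for linfct: 100,000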

But of course, there is a restriction: nuisance factors must not interact with any other factors – not even other nuisance factors. And a multivariate response (or an implied multivariate response, e.g., in an ordinal model) can never be a nuisance factor. Under that condition, the average effects of a nuisance factor are the same regardless of the levels of other factors, making it possible to pre-average them by considering just one case.

We specify nuisance factors by listing their names in a nuisance argument to ref_grid() (in emmeans(), this argument is passed to ref_grid()). Often, it is much more convenient to give the factors that are not nuisance factors, via a non.nuisance argument. If you do specify a nuisance factor that does interact with others, or doesn’t exist, it is quietly excluded from the nuisance list.

Time for an example. Consider the mtcars dataset standard in R, and the model

mtcars.lm <- lm(mpg ~ factor(cyl)*am + disp + hp + drat + log(wt) + vs + 
                  factor(gear) + factor(carb), data = mtcars)

And let’s construct two different reference grids:

rg.usual <- ref_grid(mtcars.lm)
rg.usual
## 'emmGrid' object with variables:
##     cyl = 4, 6, 8
##     am = 0, 1
##     disp = 230.72
##     hp = 146.69
##     drat = 3.5966
##     wt = 3.2172
##     vs = 0, 1
##     gear = 3, 4, 5
##     carb = 1, 2, 3, 4, 6, 8
nrow(rg.usual@linfct)
## [1] 216
rg.nuis = ref_grid(mtcars.lm, non.nuisance = "cyl")
rg.nuis
## 'emmGrid' object with variables:
##     cyl = 4, 6, 8
##     am = 0, 1
## Nuisance factors that have been collapsed by averaging:
##     disp(1), hp(1), drat(1), wt(1), vs(2), gear(3), carb(6)
nrow(rg.nuis@linfct)
## [1] 6

Notice that we left am out of non.nuisance and hence included it in nuisance. However, it interacts with cyl, so it was not allowed as a nuisance factor. But rg.nuis requires 1/36 as much storage. There’s really nothing else to show, other than to demonstrate that we get the same EMMs either way, with slightly different annotations:

emmeans(rg.usual, ~ cyl * am)
##  cyl am emmean   SE df lower.CL upper.CL
##    4  0   19.0 4.29 14    9.823     28.2
##    6  0   19.7 3.32 14   12.556     26.8
##    8  0   29.0 5.98 14   16.130     41.8
##    4  1   15.4 4.29 14    6.206     24.6
##    6  1   27.3 4.90 14   16.741     37.8
##    8  1   11.2 5.56 14   -0.718     23.1
## 
## Results are averaged over the levels of: vs, gear, carb 
## Confidence level used: 0.95
emmeans(rg.nuis, ~ cyl * am)
##  cyl am emmean   SE df lower.CL upper.CL
##    4  0   19.0 4.29 14    9.823     28.2
##    6  0   19.7 3.32 14   12.556     26.8
##    8  0   29.0 5.98 14   16.130     41.8
##    4  1   15.4 4.29 14    6.206     24.6
##    6  1   27.3 4.90 14   16.741     37.8
##    8  1   11.2 5.56 14   -0.718     23.1
## 
## Results are averaged over the levels of: 7 nuisance factors 
## Confidence level used: 0.95

By default, the pre-averaging is done with equal weights. If we specify wt.nuis as anything other than "equal", they are averaged proportionally. As described above, this really amounts to "outer" weights since they are averaged separately. Let’s try it to see how the estimates differ:

predict(emmeans(mtcars.lm, ~ cyl * am, non.nuis = c("cyl", "am"), 
                wt.nuis = "prop"))
## [1] 16.51254 17.17869 26.45709 12.90600 24.75053  8.70546
predict(emmeans(mtcars.lm, ~ cyl * am, weights = "outer"))
## [1] 16.51254 17.17869 26.45709 12.90600 24.75053  8.70546

These are the same as each other, but different from the equally-weighted EMMs we obtained before. By the way, to help make things consistent, if weights is character, emmeans() passes wt.nuis = weights to ref_grid (if it is called), unless wt.nuis is also specified.

There is a trick for getting emmeans() to use the smallest possible reference grid: pass the variables in the specs argument as non.nuisance. But we have to quote the expression to delay its evaluation until ref_grid() is called, and also use all.vars() if (and only if) specs is a formula:

emmeans(mtcars.lm, ~ gear | am, non.nuis = quote(all.vars(specs)))
## am = 0:
##  gear emmean   SE df lower.CL upper.CL
##     3   15.2 2.65 14     9.56     20.9
##     4   22.4 3.36 14    15.20     29.6
##     5   30.0 4.54 14    20.28     39.7
## 
## am = 1:
##  gear emmean   SE df lower.CL upper.CL
##     3   10.6 4.49 14     1.02     20.3
##     4   17.8 2.62 14    12.19     23.4
##     5   25.4 4.05 14    16.72     34.1
## 
## Results are averaged over the levels of: 6 nuisance factors, cyl 
## Confidence level used: 0.95

Observe that cyl was passed over as a nuisance factor because it interacts with another factor.

Limiting the size of the reference grid

We have just seen how easily the size of a reference grid can get out of hand. The rg.limit option (set via emm_options() or as an optional argument in ref_grid() or emmeans()) serves to guard against excessive memory demands. It specifies the maximum number of rows allowed in the reference grid. But because of the way ref_grid() works, this check is made before any multivariate-response levels are taken into account. If the limit is exceeded, an error is thrown:

ref_grid(mtcars.lm, rg.limit = 200)
## Error: The rows of your requested reference grid would be 216, which exceeds
## the limit of 200 (not including any multivariate responses).
## Your options are:
##   1. Specify some (or more) nuisance factors using the 'nuisance' argument
##      (see ?ref_grid). These must be factors that do not interact with others.
##   2. Add the argument 'rg.limit = <new limit>' to the call. Be careful,
##      because this could cause excessive memory use and performance issues.
##      Or, change the default via 'emm_options(rg.limit = <new limit>)'.

The default rg.limit is 10,000. With this limit, and if we have 1,000 columns in the model matrix, the size of linfct is limited to about 80 MB. If, in addition, there is a 5-level multivariate response, the limit is 2 GB – darn big, but perhaps manageable. Even so, I suspect that the 10,000-row default may be too loose to guard against some users getting into a tight situation.
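
For the record, here is the back-of-the-envelope arithmetic behind those figures (a sketch; I assume 8 bytes per double, and that a 5-level multivariate response multiplies both the rows and the number of coefficients by 5):

10000 * 1000 * 8 / 1e6              # about 80 MB for linfct
(10000 * 5) * (1000 * 5) * 8 / 1e9  # about 2 GB with the multivariate response

# a tighter default can be set globally, e.g.
emm_options(rg.limit = 5000)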

Back to Contents

Sub-models

We have just seen that we can assign different weights to the levels of containing factors. Another option is to constrain the effects of those containing factors to zero. In essence, that means fitting a different model without those containing effects; however, for certain models (not all), an emmGrid may be updated with a submodel specification so as to impose such a constraint. For illustration, return again to the nutrition example, and consider the analysis of group and race as before, after removing interactions involving age:

summary(emmeans(nutr.lm, pairwise ~ group | race, submodel = ~ age + group*race), 
        by = NULL)
## Note: adjust = "tukey" was changed to "sidak"
## because "tukey" is only appropriate for one set of pairwise comparisons
## $emmeans
##  group      race     emmean    SE df lower.CL upper.CL
##  FoodStamps Black      4.91 2.061 92    0.817    9.003
##  NoAid      Black     -3.01 1.581 92   -6.148    0.133
##  FoodStamps Hispanic  -1.18 5.413 92  -11.935    9.567
##  NoAid      Hispanic   1.32 3.876 92   -6.382    9.014
##  FoodStamps White      4.10 0.901 92    2.308    5.886
##  NoAid      White     -1.44 1.114 92   -3.654    0.771
## 
## Results are averaged over the levels of: age 
## submodel: ~ age + group + race + group:race 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast           race     estimate   SE df t.ratio p.value
##  FoodStamps - NoAid Black        7.92 2.62 92   3.021  0.0098
##  FoodStamps - NoAid Hispanic    -2.50 6.55 92  -0.382  0.9739
##  FoodStamps - NoAid White        5.54 1.27 92   4.364  0.0001
## 
## Results are averaged over the levels of: age 
## P value adjustment: sidak method for 3 tests

If you like, you may confirm that we would obtain exactly the same estimates had we fitted that sub-model to the data; the difference is that tests and confidence intervals continue to use the residual variance from the full model. Without the interactions with age, all of the marginal means become estimable. The results are somewhat different from those obtained earlier, where we narrowed the scope to just age 3. These new estimates include all ages, averaging over them equally, but with the constraint that the interaction effects involving age are all zero.
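
Here is a sketch of that confirmation (assuming, as in the rest of this example, that nutr.lm models the gain response in the nutrition dataset). The EMMs should match those above, while the SEs and degrees of freedom differ because the sub-model supplies its own residual variance:

nutr.sub <- lm(gain ~ age + group * race, data = nutrition)
emmeans(nutr.sub, ~ group * race)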

There are two special character values that may be used with submodel. Specifying "minimal" creates a submodel with only the active factors:

emmeans(nutr.lm, ~ group * race, submodel = "minimal")
##  group      race     emmean    SE df lower.CL upper.CL
##  FoodStamps Black     5.000 2.020 92    0.988    9.012
##  NoAid      Black    -1.929 1.428 92   -4.765    0.908
##  FoodStamps Hispanic  0.000 5.344 92  -10.614   10.614
##  NoAid      Hispanic  2.500 3.779 92   -5.005   10.005
##  FoodStamps White     4.769 0.741 92    3.297    6.241
##  NoAid      White    -0.516 0.960 92   -2.422    1.390
## 
## Results are averaged over the levels of: age 
## submodel: ~ group + race + group:race 
## Confidence level used: 0.95

This submodel constrains all effects involving age to be zero. Another interesting option is "type2", whereby we in essence analyze the residuals of the model with all contained or overlapping effects, then constrain the containing effects to be zero. What is left, then, is only the interaction effects of the factors involved. This is most useful with joint_tests():

joint_tests(nutr.lm, submodel = "type2")
##  model term df1 df2 F.ratio p.value note
##  age          3  92   0.961  0.4145     
##  group        1  92  23.044  <.0001     
##  race         2  92   0.196  0.8227     
##  age:group    3  92   1.069  0.3663     
##  age:race     3  92   1.019  0.3880  d e
##  group:race   2  92   1.991  0.1424     
## 
## d: df1 reduced due to linear dependence 
## e: df1 reduced due to non-estimability

These results are identical to the type II ANOVA obtained at the beginning of this example.
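
If you wish to verify that, and assuming the car package is available, a call along these lines should produce matching F ratios:

car::Anova(nutr.lm, type = 2)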

More details on how submodel works may be found in vignette("xplanations").

Back to Contents

Nested fixed effects

A factor A is nested in another factor B if the levels of A have a different meaning in one level of B than in another. Often, nested factors are random effects; for example, subjects in an experiment may be randomly assigned to treatments, in which case subjects are nested in treatments. If we model them as random effects, these nested effects are not among the fixed effects and pose no issue for emmeans. But sometimes we have fixed nested factors.

Here is an example of a fictional study of five fictional treatments for some disease in cows. Two of the treatments are administered by injection, and the other three are administered orally. There are varying numbers of observations for each drug. The data and model follow:

cows <- data.frame (
    route = factor(rep(c("injection", "oral"), c(5, 9))),
    drug = factor(rep(c("Bovineumab", "Charloisazepam", 
              "Angustatin", "Herefordmycin", "Mollycoddle"), c(3,2,  4,2,3))),
    resp = c(34, 35, 34,   44, 43,      36, 33, 36, 32,   26, 25,   25, 24, 24)
)
cows.lm <- lm(resp ~ route + drug, data = cows)

The ref_grid function finds a nested structure in this model:

cows.rg <- ref_grid(cows.lm)
cows.rg
## 'emmGrid' object with variables:
##     route = injection, oral
##     drug = Angustatin, Bovineumab, Charloisazepam, Herefordmycin, Mollycoddle
## Nesting structure:  drug %in% route

When there is nesting, emmeans computes averages separately in each group

route.emm <- emmeans(cows.rg, "route")
route.emm
##  route     emmean    SE df lower.CL upper.CL
##  injection   38.9 0.591  9     37.6     40.3
##  oral        28.0 0.449  9     27.0     29.0
## 
## Results are averaged over the levels of: drug 
## Confidence level used: 0.95

… and insists on carrying along any grouping factors that a factor is nested in:

drug.emm <- emmeans(cows.rg, "drug")
drug.emm
##  drug           route     emmean    SE df lower.CL upper.CL
##  Bovineumab     injection   34.3 0.747  9     32.6     36.0
##  Charloisazepam injection   43.5 0.915  9     41.4     45.6
##  Angustatin     oral        34.2 0.647  9     32.8     35.7
##  Herefordmycin  oral        25.5 0.915  9     23.4     27.6
##  Mollycoddle    oral        24.3 0.747  9     22.6     26.0
## 
## Confidence level used: 0.95

Here are the associated pairwise comparisons:

pairs(route.emm, reverse = TRUE)
##  contrast         estimate    SE df t.ratio p.value
##  oral - injection    -10.9 0.742  9 -14.671  <.0001
## 
## Results are averaged over the levels of: drug
pairs(drug.emm, by = "route", reverse = TRUE)
## route = injection:
##  contrast                    estimate    SE df t.ratio p.value
##  Charloisazepam - Bovineumab     9.17 1.182  9   7.757  <.0001
## 
## route = oral:
##  contrast                    estimate    SE df t.ratio p.value
##  Herefordmycin - Angustatin     -8.75 1.121  9  -7.805  0.0001
##  Mollycoddle - Angustatin       -9.92 0.989  9 -10.030  <.0001
##  Mollycoddle - Herefordmycin    -1.17 1.182  9  -0.987  0.6026
## 
## P value adjustment: tukey method for comparing a family of 3 estimates

In the latter result, the contrast itself becomes a nested factor in the returned emmGrid object. That would not be the case if there had been no by variable.
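
One way to see this, sketched here, is via the str() method for emmGrid objects; with the by variable, the listing should include a nesting component for the contrast:

str(pairs(drug.emm, by = "route"))   # nesting component: contrast %in% route
str(pairs(drug.emm))                 # no 'by', so the contrast is not nested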

Graphs with nesting

It can be very helpful to take advantage of special features of ggplot2 when graphing results with nested factors. For example, the default plot for the cows example is not ideal:

emmip(cows.rg, ~ drug | route)

We can remove route from the call and instead handle it with ggplot2 code that uses separate x scales:

require(ggplot2)
emmip(cows.rg, ~ drug) + facet_wrap(~ route, scales = "free_x")

Similarly with plot.emmGrid():

plot(drug.emm, PIs = TRUE) + 
    facet_wrap(~ route, nrow = 2, scales = "free_y")

Auto-identification of nested factors – avoid being trapped!

ref_grid() and emmeans() try to discover and accommodate nested structures in the fixed effects. They do this in two ways: first, by identifying factors whose levels appear in combination with only one level of another factor; and second, by examining the terms attribute of the fixed effects. In the latter approach, if an interaction A:B appears in the model but A is not present as a main effect, then A is deemed to be nested in B. Note that this can create a trap: some users take shortcuts by omitting some fixed effects, knowing that this won’t affect the fitted values. But such shortcuts do affect the interpretation of model parameters, ANOVA tables, etc., and I advise against ever taking them. Here are some ways you may notice mistakenly-identified nesting:

  • A message is displayed when nesting is detected
  • A str() listing of the emmGrid object shows a nesting component
  • An emmeans() summary unexpectedly includes one or more factors that you didn’t specify
  • EMMs obtained using by factors don’t seem to behave right, or give the same results with different specifications

To override the auto-detection of nested effects, use the nesting argument in ref_grid() or emmeans(). Specifying nesting = NULL will ignore all nesting. Incorrectly-discovered nesting can be overcome by specifying something akin to nesting = "A %in% B, C %in% (A * B)" or, equivalently, nesting = list(A = "B", C = c("A", "B")).
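
For instance, in the cows example, either of the following sketches would work; the first discards the detected nesting, and the second declares it explicitly:

ref_grid(cows.lm, nesting = NULL)                # ignore all nesting
ref_grid(cows.lm, nesting = "drug %in% route")   # specify the nesting directly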

Back to Contents

Index of all vignette topics

emmeans/inst/doc/utilities.R0000644000176200001440000000745014165066774015605 0ustar liggesusers## ---- echo = FALSE, results = "hide", message = FALSE--------------------------------------------- require("emmeans") emm_options(opt.digits = TRUE) knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") ## ------------------------------------------------------------------------------------------------- pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs) pigs.emm <- emmeans(pigs.lm, "source") pigs.emm ## ------------------------------------------------------------------------------------------------- pigs.emm.s <- update(pigs.emm, infer = c(TRUE, TRUE), null = log(35), calc = c(n = ".wgt.")) pigs.emm.s ## ----eval = FALSE--------------------------------------------------------------------------------- # emmeans(pigs.lm, "source", infer = c(TRUE, TRUE), null = log(35), # calc = c(n = ".wgt.")) ## ------------------------------------------------------------------------------------------------- get_emm_option("emmeans") ## ------------------------------------------------------------------------------------------------- get_emm_option("contrast") ## ------------------------------------------------------------------------------------------------- get_emm_option("ref_grid") ## ------------------------------------------------------------------------------------------------- emm_options(emmeans = list(type = "response"), contrast = list(infer = c(TRUE, TRUE))) ## ------------------------------------------------------------------------------------------------- pigs.anal.p <- emmeans(pigs.lm, consec ~ percent) pigs.anal.p ## ------------------------------------------------------------------------------------------------- options(emmeans = NULL) ## ------------------------------------------------------------------------------------------------- emm_options(opt.digits = FALSE) pigs.emm emm_options(opt.digits = TRUE) # revert to optimal digits ## ----eval = FALSE--------------------------------------------------------------------------------- # options(emmeans = list(lmer.df = "satterthwaite", # contrast = list(infer = c(TRUE, FALSE)))) ## ------------------------------------------------------------------------------------------------- rbind(pairs(pigs.emm.s), pigs.anal.p[[2]]) ## ------------------------------------------------------------------------------------------------- update(pigs.anal.p[[2]] + pairs(pigs.emm.s), adjust = "mvt") ## ------------------------------------------------------------------------------------------------- pigs.emm[2:3] ## ------------------------------------------------------------------------------------------------- transform(pigs.emm, CI.width = upper.CL - lower.CL) ## ------------------------------------------------------------------------------------------------- pigs.emm.ss <- add_grouping(pigs.emm.s, "type", "source", c("animal", "vegetable", "animal")) str(pigs.emm.ss) ## ------------------------------------------------------------------------------------------------- emmeans(pigs.emm.ss, pairwise ~ type) ## ---- message = FALSE----------------------------------------------------------------------------- warp <- transform(warpbreaks, treat = interaction(wool, tension)) library(nlme) warp.gls <- gls(breaks ~ treat, weights = varIdent(form = ~ 1|treat), data = warp) ( warp.emm <- emmeans(warp.gls, "treat") ) ## ------------------------------------------------------------------------------------------------- warp.fac <- update(warp.emm, levels = list( 
wool = c("A", "B"), tension = c("L", "M", "H"))) str(warp.fac) ## ------------------------------------------------------------------------------------------------- contrast(warp.fac, "consec", by = "wool") emmeans/inst/doc/xtending.Rmd0000644000176200001440000007617314137062735015733 0ustar liggesusers--- title: "For developers: Extending **emmeans**" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{For developers: Extending emmeans} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, results = "hide", message = FALSE} require("emmeans") knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") set.seed(271828) ``` ## Contents {#contents} This vignette explains how developers may incorporate **emmeans** support in their packages. If you are a user looking for a quick way to obtain results for an unsupported model, you are probably better off trying to use the `qdrg()` function. 1. [Introduction](#intro) 2. [Data example](#dataex) 3. [Supporting `rlm` objects](#rlm) 4. [Supporting `lqs` objects](#lqs) 5. [Communication between methods](#communic) 5. [Hook functions](#hooks) 6. [Exported methods from **emmeans**](#exported) 7. [Existing support for `rsm` objects](#rsm) 7. [Dispatching and restrictions](#dispatch) 8. [Exporting and registering your methods](#exporting) 9. [Conclusions](#concl) [Index of all vignette topics](vignette-topics.html) ## Introduction {#intro} Suppose you want to use **emmeans** for some type of model that it doesn't (yet) support. Or, suppose you have developed a new package with a fancy model-fitting function, and you'd like it to work with **emmeans**. What can you do? Well, there is hope because **emmeans** is designed to be extended. The first thing to do is to look at the help page for extending the package: ```{r eval=FALSE} help("extending-emmeans", package="emmeans") ``` It gives details about the fact that you need to write two S3 methods, `recover_data` and `emm_basis`, for the class of object that your model-fitting function returns. The `recover_data` method is needed to recreate the dataset so that the reference grid can be identified. The `emm_basis` method then determines the linear functions needed to evaluate each point in the reference grid and to obtain associated information---such as the variance-covariance matrix---needed to do estimation and testing. These methods must also be exported from your package so that they are available to users. See the section on [exporting the methods](#exporting) for details and suggestions. This vignette presents an example where suitable methods are developed, and discusses a few issues that arise. [Back to Contents](#contents) ## Data example {#dataex} The **MASS** package contains various functions that do robust or outlier-resistant model fitting. We will cobble together some **emmeans** support for these. But first, let's create a suitable dataset (a simulated two-factor experiment) for testing. ```{r} fake = expand.grid(rep = 1:5, A = c("a1","a2"), B = c("b1","b2","b3")) fake$y = c(11.46,12.93,11.87,11.01,11.92,17.80,13.41,13.96,14.27,15.82, 23.14,23.75,-2.09,28.43,23.01,24.11,25.51,24.11,23.95,30.37, 17.75,18.28,17.82,18.52,16.33,20.58,20.55,20.77,21.21,20.10) ``` The `y` values were generated using predetermined means and Cauchy-distributed errors. There are some serious outliers in these data. 
## Supporting `rlm` {#rlm} The **MASS** package provides an `rlm` function that fits robust-regression models using *M* estimation. We'll fit a model using the default settings for all tuning parameters: ```{r} library(MASS) fake.rlm = rlm(y ~ A * B, data = fake) library(emmeans) emmeans(fake.rlm, ~ B | A) ``` The first lesson to learn about extending **emmeans** is that sometimes, it already works! It works here because `rlm` objects inherit from `lm`, which is supported by the **emmeans** package, and `rlm` objects aren't enough different to create any problems. [Back to Contents](#contents) ## Supporting `lqs` objects {#lqs} The **MASS** resistant-regression functions `lqs`, `lmsreg`, and `ltsreg` are another story, however. They create `lqs` objects that are not extensions of any other class, and have other issues, including not even having a `vcov` method. So for these, we really do need to write new methods for `lqs` objects. First, let's fit a model. ```{r} fake.lts = ltsreg(y ~ A * B, data = fake) ``` ### The `recover_data` method {#rd.lqs} It is usually an easy matter to write a `recover_data` method. Look at the one for `lm` objects: ```{r} emmeans:::recover_data.lm ``` Note that all it does is obtain the `call` component and call the method for class `call`, with additional arguments for its `terms` component and `na.action`. It happens that we can access these attributes in exactly the same way as for `lm` objects; so: ```{r} recover_data.lqs = emmeans:::recover_data.lm ``` Let's test it: ```{r} rec.fake = recover_data(fake.lts) head(rec.fake) ``` Our recovered data excludes the response variable `y` (owing to the `delete.response` call), and this is fine. #### Special arguments {#rdargs} By the way, there are two special arguments `data` and `params` that may be handed to `recover_data` via `ref_grid` or `emmeans` or a related function; and you may need to provide for if you don't use the `recover_data.call` function. The `data` argument is needed to cover a desperate situation that occurs with certain kinds of models where the underlying data information is not saved with the object---e.g., models that are fitted by iteratively modifying the data. In those cases, the only way to recover the data is to for the user to give it explicitly, and `recover_data` just adds a few needed attributes to it. The `params` argument is needed when the model formula refers to variables besides predictors. For example, a model may include a spline term, and the knots are saved in the user's environment as a vector and referred to in the call to fit the model. In trying to recover the data, we try to construct a data frame containing all the variables present on the right-hand side of the model, but if some of those are scalars or of different lengths than the number of observations, an error occurs. So you need to exclude any names in `params` when reconstructing the data. Many model objects contain the model frame as a slot; for example, a model fitted with `lm(..., model = TRUE)` has a member `$model` containing the model frame. This can be useful for recovering the data, provided none of the predictors are transformed (when predictors are transformed, the original predictor values are not in the model frame so it's harder to recover them). Therefore, when the model frame is available in the model object, it should be provided in the `frame` argument of `recover_data.call()`; then when `data = NULL`, a check is made on `trms`, and if it has no function calls, then `data` is set to `frame`. 
Of course, in the rarer case where the original data are available in the model object, specify that as `data`. #### Error handling {#rderrs} If you check for any error conditions in `recover_data`, simply have it return a character string with the desired message, rather than invoking `stop`. This provides a cleaner exit. The reason is that whenever `recover_data` throws an error, an informative message suggesting that `data` or `params` be provided is displayed. But a character return value is tested for and throws a different error with your string as the message. ### The `emm_basis` method {#ebreqs} The `emm_basis` method has four required arguments: ```{r} args(emmeans:::emm_basis.lm) ``` These are, respectively, the model object, its `terms` component (at least for the right-hand side of the model), a `list` of levels of the factors, and the grid of predictor combinations that specify the reference grid. The function must obtain six things and return them in a named `list`. They are the matrix `X` of linear functions for each point in the reference grid, the regression coefficients `bhat`; the variance-covariance matrix `V`; a matrix `nbasis` for non-estimable functions; a function `dffun(k,dfargs)` for computing degrees of freedom for the linear function `sum(k*bhat)`; and a list `dfargs` of arguments to pass to `dffun`. Optionally, the returned list may include a `model.matrix` element (the model matrix for the data or a compact version thereof obtained via `.cmpMM()`), which, if included, enables the `submodel` option. To write your own `emm_basis` function, examining some of the existing methods can help; but the best resource is the `predict` method for the object in question, looking carefully to see what it does to predict values for a new set of predictors (e.g., `newdata` in `predict.lm`). Following this advice, let's take a look at it: ```{r} MASS:::predict.lqs ``` ###### {#eblqs} Based on this, here is a listing of an `emm_basis` method for `lqs` objects: ```{r} emm_basis.lqs = function(object, trms, xlev, grid, ...) { m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) X = model.matrix(trms, m, contrasts.arg = object$contrasts) bhat = coef(object) Xmat = model.matrix(trms, data=object$model) # 5 V = rev(object$scale)[1]^2 * solve(t(Xmat) %*% Xmat) nbasis = matrix(NA) dfargs = list(df = nrow(Xmat) - ncol(Xmat)) dffun = function(k, dfargs) dfargs$df list(X = X, bhat = bhat, nbasis = nbasis, V = V, #10 dffun = dffun, dfargs = dfargs) } ``` Before explaining it, let's verify that it works: ```{r} emmeans(fake.lts, ~ B | A) ``` Hooray! Note the results are comparable to those we had for `fake.rlm`, albeit the standard errors are quite a bit smaller. (In fact, the SEs could be misleading; a better method for estimating covariances should probably be implemented, but that is beyond the scope of this vignette.) [Back to Contents](#contents) ### Dissecting `emm_basis.lqs` Let's go through the listing of this method, line-by-line: * Lines 2--3: Construct the linear functions, `X`. This is a pretty standard two-step process: First obtain a model frame, `m`, for the grid of predictors, then pass it as data to `model.matrix` to create the associated design matrix. As promised, this code is essentially identical to what you find in `predict.lqs`. * Line 4: Obtain the coefficients, `bhat`. Most model objects have a `coef` method. * Lines 5--6: Obtain the covariance matrix, `V`, of `bhat`. In many models, this can be obtained using the object's `vcov` method. 
But not in this case. Instead, I cobbled one together using the inverse of the **X'X** matrix as in ordinary regression, and the variance estimate found in the last element of the `scale` element of the object. This probably under-estimates the variances and distorts the covariances, because robust estimators have some efficiency loss. * Line 7: Compute the basis for non-estimable functions. This applies only when there is a possibility of rank deficiency in the model. But `lqs` methods don't allow rank deficiencies, so it we have fitted such a model, we can be sure that all linear functions are estimable; we signal that by setting `nbasis` equal to a 1 x 1 matrix of `NA`. If rank deficiency were possible, the **estimability** package (which is required by **emmeans**) provides a `nonest.basis` function that makes this fairly painless---I would have coded `nbasis = estimability::nonest.basis(Xmat)`. There some subtleties you need to know regarding estimability. Suppose the model is rank-deficient, so that the design matrix **X** has *p* columns but rank *r* < *p*. In that case, `bhat` should be of length *p* (not *r*), and there should be *p* - *r* elements equal to `NA`, corresponding to columns of **X** that were excluded from the fit. Also, `X` should have all *p* columns. In other words, do not alter or throw-out columns of `X` or their corresponding elements of `bhat`---even those with `NA` coefficients---as they are essential for assessing estimability. `V` should be *r* x *r*, however---the covariance matrix for the non-excluded predictors. * Lines 8--9: Obtain `dffun` and `dfargs`. This is a little awkward because it is designed to allow support for mixed models, where approximate methods may be used to obtain degrees of freedom. The function `dffun` is expected to have two arguments: `k`, the vector of coefficients of `bhat`, and `dfargs`, a list containing any additional arguments. In this case (and in many other models), the degrees of freedom are the same regardless of `k`. We put the required degrees of freedom in `dfargs` and write `dffun` so that it simply returns that value. (Note: If asymptotic tests and CIs are desired, return `Inf` degrees of freedom.) * Line 10: Return these results in a named list. [Back to Contents](#contents) ## Communication between methods {#communic} If you need to pass information obtained in `recover_data()` to the `emm_basis()` method, simply incorporate it as `attr(data, "misc")` where `data` is the dataset returned by `recover_data()`. Subsequently, that attribute is available in `emm_grid()` by adding a `misc` argument. ## Hook functions {#hooks} Most linear models supported by **emmeans** have straightforward structure: Regression coefficients, their covariance matrix, and a set of linear functions that define the reference grid. However, a few are more complex. An example is the `clm` class in the **ordinal** package, which allows a scale model in addition to the location model. When a scale model is used, the scale parameters are included in the model matrix, regression coefficients, and covariance matrix, and we can't just use the usual matrix operations to obtain estimates and standard errors. 
To facilitate using custom routines for these tasks, the `emm_basis.clm` function function provided in **emmeans** includes, in its `misc` part, the names (as character constants) of two "hook" functions: `misc$estHook` has the name of the function to call when computing estimates, standard errors, and degrees of freedom (for the `summary` method); and `misc$vcovHook` has the name of the function to call to obtain the covariance matrix of the grid values (used by the `vcov` method). These functions are called in lieu of the usual built-in routines for these purposes, and return the appropriately sized matrices. In addition, you may want to apply some form of special post-processing after the reference grid is constructed. To provide for this, give the name of your function to post-process the object in `misc$postGridHook`. Again, `clm` objects (as well as `polr` in the **MASS** package) serve as an example. They allow a `mode` specification that in two cases, calls for post-processing. The `"cum.prob"` mode uses the `regrid` function to transform the linear predictor to the cumulative-probability scale. And the `"prob"` mode performs this, as well as applying the contrasts necessary to convert the cumulative probabilities into the class probabilities. [Back to Contents](#contents) ## Exported methods from **emmeans** {#exported} For package developers' convenience, **emmeans** exports some of its S3 methods for `recover_data` and/or `emm_basis`---use `methods("recover_data")` and `methods("emm_basis")` to discover which ones. It may be that all you need is to invoke one of those methods and perhaps make some small changes---especially if your model-fitting algorithm makes heavy use of an existing model type supported by **emmeans**. For those methods that are not exported, use `recover_data()` and `.emm_basis()`, which run in **emmeans**'s namespace, thus providing access to all available methods.. A few additional functions are exported because they may be useful to developers. They are as follows: * `emmeans::.all.vars(expr, retain)` Some users of your package may include `$` or `[[]]` operators in their model formulas. If you need to get the variable names, `base::all.vars` will probably not give you what you need. For example, if `form = ~ data$x + data[[5]]`, then `base::all.vars(form)` returns the names `"data"` and `"x"`, whereas `emmeans::.all.vars(form)` returns the names `"data$x"` and `"data[[5]]"`. The `retain` argument may be used to specify regular expressions for patterns to retain as parts of variable names. * `emmeans::.diag(x, nrow, ncol)` The base `diag` function has a booby trap whereby, for example, `diag(57.6)` returns a 57 x 57 identity matrix rather than a 1 x 1 matrix with 57.6 as its only element. But `emmeans::.diag(57.6)` will return the latter. The function works identically to `diag` except for its tail run around the identity-matrix trap. * `emmeans::.aovlist.dffun(k, dfargs)` This function is exported because it is needed for computing degrees of freedom for models fitted using `aov`, but it may be useful for other cases where Satterthwaite degrees-of-freedom calculations are needed. It requires the `dfargs` slot to contain analogous contents. * `emmeans::.get.offset(terms, grid)` If `terms` is a model formula containing an `offset` call, this is will compute that offset in the context of `grid` (a `data.frame`). 
* `emmeans::.my.vcov(object, ...)` In a call to `ref_grid`, `emmeans`, etc., the user may use `vcov.` to specify an alternative function or matrix to use as the covariance matrix of the fixed-effects coefficients. This function supports that feature. Calling `.my.vcov` in place of the `vcov` method will substitute the user's `vcov.` when it is specified. * `emmeans::.std.link.labels(fam, misc)` This is useful in `emm_basis` methods for generalized linear models. Call it with `fam` equal to the `family` object for your model, and `misc` either an existing list, or just `list()` if none. It returns a new `misc` list containing the link function and, in some cases, extra features that are used for certain types of link functions (e.g., for a log link, the setups for returning ratio comparisons with `type = "response"`). * `emmeans::.num.key(levs, key)` Returns integer indices of elements of `key` in `levs` when `key` is a character vector; or just returns integer values if already integer. Also throws an error if levels are mismatched or indices exceed legal range. This is useful in custom contrast functions (`.emmc` functions). * `emmeans::.get.excl(levs, exclude, include)` This is support for the `exclude` and `include` arguments of contrast functions. It checks legality and returns an integer vector of `exclude` indices in `levs`, given specified integer or character arguments `exclude` and `include`. In your `.emmc` function, `exclude` should default to `integer(0)` and `include` should have no default. * `emmeans::.cmpMM(X, weights, assign)` creates a compact version of the model matrix `X` (or, preferably, its QR decomposition). This is useful if we want an `emm_basis()` method to return a `model.matrix` element. The returned result is just the R portion of the QR decomposition of `diag(sqrt(weights)) %*% X`, with the `assign` attribute added. If `X` is a `qr` object, we assume the weights are already incorporated, as is true of the `qr` slot of a `lm` object. [Back to Contents](#contents) ## Existing support for `rsm` objects {#rsm} As a nontrivial example of how an existing package supports **emmeans**, we show the support offered by the **rsm** package. Its `rsm` function returns an `rsm` object which is an extension of the `lm` class. Part of that extension has to do with `coded.data` structures whereby, as is typical in response-surface analysis, models are fitted to variables that have been linearly transformed (coded) so that the scope of each predictor is represented by plus or minus 1 on the coded scale. Without any extra support in **rsm**, `emmeans` will work just fine with `rsm` objects; but if the data are coded, it becomes awkward to present results in terms of the original predictors on their original, uncoded scale. The `emmeans`-related methods in **rsm** provide a `mode` argument that may be used to specify whether we want to work with coded or uncoded data. The possible values for `mode` are `"asis"` (ignore any codings, if present), `"coded"` (use the coded scale), and `"decoded"` (use the decoded scale). The first two are actually the same in that no decoding is done; but it seems clearer to provide separate options because they represent two different situations. ### The `recover_data` method {#rdrsm} Note that coding is a *predictor* transformation, not a response transformation (we could have that, too, as it's already supported by the **emmeans** infrastructure). 
So, to handle the `"decode"` mode, we will need to actually decode the predictors used to construct he reference grid. That means we need to make `recover_data` a lot fancier! Here it is: ```{r} recover_data.rsm = function(object, data, mode = c("asis", "coded", "decoded"), ...) { mode = match.arg(mode) cod = rsm::codings(object) fcall = object$call if(is.null(data)) # 5 data = emmeans::recover_data(fcall, delete.response(terms(object)), object$na.action, ...) if (!is.null(cod) && (mode == "decoded")) { pred = cpred = attr(data, "predictors") trms = attr(data, "terms") #10 data = rsm::decode.data(rsm::as.coded.data(data, formulas = cod)) for (form in cod) { vn = all.vars(form) if (!is.na(idx <- grep(vn[1], pred))) { pred[idx] = vn[2] #15 cpred = setdiff(cpred, vn[1]) } } attr(data, "predictors") = pred new.trms = update(trms, reformulate(c("1", cpred))) #20 attr(new.trms, "orig") = trms attr(data, "terms") = new.trms attr(data, "misc") = cod } data } ``` Lines 2--7 ensure that `mode` is legal, retrieves the codings from the object, and obtain the results we would get from `recover_data` had it been an `lm` object. If `mode` is not `"decoded"`, *or* if no codings were used, that's all we need. Otherwise, we need to return the decoded data. However, it isn't quite that simple, because the model equation is still defined on the coded scale. Rather than to try to translate the model coefficients and covariance matrix to the decoded scale, we elected to remember what we will need to do later to put things back on the coded scale. In lines 9--10, we retrieve the attributes of the recovered data that provide the predictor names and `terms` object on the coded scale. In line 11, we replace the recovered data with the decoded data. By the way, the codings comprise a list of formulas with the coded name on the left and the original variable name on the right. It is possible that only some of the predictors are coded (for example, blocking factors will not be). In the `for` loop in lines 12--18, the coded predictor names are replaced with their decoded names. For technical reasons to be discussed later, we also remove these coded predictor names from a copy, `cpred`, of the list of all predictors in the coded model. In line 19, the `"predictors"` attribute of `data` is replaced with the modified version. Now, there is a nasty technicality. The `ref_grid` function in **emmeans** has a few lines of code after `recover_data` is called that determine if any terms in the model convert covariates to factors or vice versa; and this code uses the model formula. That formula involves variables on the coded scale, and those variables are no longer present in the data, so an error will occur if it tries to access them. Luckily, if we simply take those terms out of the formula, it won't hurt because those coded predictors would not have been converted in that way. So in line 20, we update `trms` with a simpler model with the coded variables excluded (the intercept is explicitly included to ensure there will be a right-hand side even is `cpred` is empty). We save that as the `terms` attribute, and the original terms as a new `"orig"` attribute to be retrieved later. The `data` object, modified or not, is returned. If data have been decoded, `ref_grid` will construct its grid using decoded variables. In line 23, we save the codings as the `"misc"` attribute, to be accessed later by `emm_basis()`. ### The `emm_basis` method {#ebrsm} Now comes the `emm_basis` method that will be called after the grid is defined. 
It is listed below: ```{r} emm_basis.rsm = function(object, trms, xlev, grid, mode = c("asis", "coded", "decoded"), misc, ...) { mode = match.arg(mode) cod = misc if(!is.null(cod) && mode == "decoded") { # 5 grid = rsm::coded.data(grid, formulas = cod) trms = attr(trms, "orig") } m = model.frame(trms, grid, na.action = na.pass, xlev = xlev) #10 X = model.matrix(trms, m, contrasts.arg = object$contrasts) bhat = as.numeric(object$coefficients) V = emmeans::.my.vcov(object, ...) if (sum(is.na(bhat)) > 0) #15 nbasis = estimability::nonest.basis(object$qr) else nbasis = estimability::all.estble dfargs = list(df = object$df.residual) dffun = function(k, dfargs) dfargs$df #20 list(X = X, bhat = bhat, nbasis = nbasis, V = V, dffun = dffun, dfargs = dfargs, misc = list()) } ``` This is much simpler. The coding formulas are obtained from `misc` (line 4) so that we don't have to re-obtain them from the object. All we have to do is determine if decoding was done (line 5); and, if so, convert the grid back to the coded scale (line 6) and recover the original `terms` attribute (line 7). The rest is borrowed directly from the `emm_basis.lm` method in **emmeans**. Note that line 13 uses one of the exported functions we described in the preceding section. Lines 15--18 use functions from the **estimability** package to handle the possibility that the model is rank-deficient. ### A demonstration {#demo} Here's a demonstration of this **rsm** support. The standard example for `rsm` fits a second-order model `CR.rs2` to a dataset organized in two blocks and with two coded predictors. ```{r results = "hide", warning = FALSE, message = FALSE} library("rsm") example("rsm") ### (output is not shown) ### ``` First, let's look at some results on the coded scale---which are the same as for an ordinary `lm` object. ```{r} emmeans(CR.rs2, ~ x1 * x2, mode = "coded", at = list(x1 = c(-1, 0, 1), x2 = c(-2, 2))) ``` Now, the coded variables `x1` and `x2` are derived from these coding formulas for predictors `Time` and `Temp`: ```{r} codings(CR.rs1) ``` Thus, for example, a coded value of `x1 = 1` corresponds to a time of 85 + 1 x 5 = 90. Here are some results working with decoded predictors. Note that the `at` list must now be given in terms of `Time` and `Temp`: ```{r} emmeans(CR.rs2, ~ Time * Temp, mode = "decoded", at = list(Time = c(80, 85, 90), Temp = c(165, 185))) ``` Since the supplied settings are the same on the decoded scale as were used on the coded scale, the EMMs are identical to those in the previous output. ## Dispatching and restrictions {#dispatch} The **emmeans** package has internal support for a number of model classes. When `recover_data()` and `emm_basis()` are dispatched, a search is made for external methods for a given class; and if found, those methods are used instead of the internal ones. However, certain restrictions apply when you aim to override an existing internal method: 1. The class name being extended must appear in the first or second position in the results of `class(object)`. That is, you may have a base class for which you provide `recover_data()` and `emm_basis()` methods, and those will also work for *direct* descendants thereof; but any class in third place or later in the inheritance is ignored. 2. Certain classes vital to the correct operation of the package, e.g., `"lm"`, `"glm"`, etc., may not be overridden. If there are no existing internal methods for the class(es) you provide methods for, there are no restrictions on them. 
## Exporting and registering your methods {#exporting} To make the methods available to users of your package, the methods must be exported. R and CRAN are evolving in a way that having S3 methods in the registry is increasingly important; so it is a good idea to provide for that. The problem is not all of your package users will have **emmeans** installed. Thus, registering the methods must be done conditionally. We provide a courtesy function `.emm_register()` to make this simple. Suppose that your package offers two model classes `foo` and `bar`, and it includes the corresponding functions `recover_data.foo`, `recover_data.bar`, `emm_basis.foo`, and `emm_basis.bar`. Then to register these methods, add or modify the `.onLoad` function in your package (traditionally saved in the source file `zzz.R`): ```r .onLoad <- function(libname, pkgname) { if (requireNamespace("emmeans", quietly = TRUE)) emmeans::.emm_register(c("foo", "bar"), pkgname) } ``` You should also add `emmeans (>= 1.4)` and `estimability` (which is required by **emmeans**) to the `Suggests` field of your `DESCRIPTION` file. [Back to Contents](#contents) ## Conclusions {#concl} It is relatively simple to write appropriate methods that work with **emmeans** for model objects it does not support. I hope this vignette is helpful for understanding how. Furthermore, if you are the developer of a package that fits linear models, I encourage you to include `recover_data` and `emm_basis` methods for those classes of objects, so that users have access to **emmeans** support. [Back to Contents](#contents) [Index of all vignette topics](vignette-topics.html) emmeans/inst/doc/confidence-intervals.html0000644000176200001440000005761314165066755020444 0ustar liggesusers Confidence intervals and tests in emmeans

Confidence intervals and tests in emmeans

emmeans package, Version 1.7.2

summary(), confint(), and test()

The most important method for emmGrid objects is summary(). For one thing, it is called by default when you display an emmeans() result. The summary() function has a lot of options, and the detailed documentation via help("summary.emmGrid") is worth a look.

For ongoing illustrations, let’s re-create some of the objects in the “basics” vignette for the pigs example:

pigs.lm1 <- lm(log(conc) ~ source + factor(percent), data = pigs)
pigs.rg <- ref_grid(pigs.lm1)
pigs.emm.s <- emmeans(pigs.rg, "source")

Just summary(<object>) by itself will produce a summary that varies somewhat according to context. It does this by setting different defaults for the infer argument, which consists of two logical values, specifying confidence intervals and tests, respectively. [The exception is models fitted using MCMC methods, where summary() is diverted to the hpd.summary() function, a preferable summary for many Bayesians.]

The summary of a newly made reference grid will show just estimates and standard errors, but not confidence intervals or tests (that is, infer = c(FALSE, FALSE)). The summary of an emmeans() result, as we see above, will have intervals, but no tests (i.e., infer = c(TRUE, FALSE)); and the result of a contrast() call (see comparisons and contrasts) will show test statistics and P values, but not intervals (i.e., infer = c(FALSE, TRUE)). There are courtesy methods confint() and test() that just call summary() with the appropriate infer setting; for example,

test(pigs.emm.s)
##  source emmean     SE df t.ratio p.value
##  fish     3.39 0.0367 23  92.540  <.0001
##  soy      3.67 0.0374 23  97.929  <.0001
##  skim     3.80 0.0394 23  96.407  <.0001
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale.
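
Equivalently, we could call summary() with an explicit infer setting; for instance, this should reproduce the test() results above:

summary(pigs.emm.s, infer = c(FALSE, TRUE))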

It is not particularly useful, though, to test these EMMs against the default of zero – which is why tests are not usually shown. It makes a lot more sense to test them against some target concentration, say 40. And suppose we want to do a one-sided test to see if the concentration is greater than 40. Remembering that the response is log-transformed in this model,

test(pigs.emm.s, null = log(40), side = ">")
##  source emmean     SE df null t.ratio p.value
##  fish     3.39 0.0367 23 3.69  -8.026  1.0000
##  soy      3.67 0.0374 23 3.69  -0.577  0.7153
##  skim     3.80 0.0394 23 3.69   2.740  0.0058
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale. 
## P values are right-tailed

It is also possible to add calculated columns to the summary, via the calc argument. The calculations can include any columns up through df in the summary, as well as any variable in the object’s grid slot. Among the latter are usually weights in a column named .wgt., and we can use that to include sample size in the summary:

confint(pigs.emm.s, calc = c(n = ~.wgt.))
##  source emmean     SE df  n lower.CL upper.CL
##  fish     3.39 0.0367 23 10     3.32     3.47
##  soy      3.67 0.0374 23 10     3.59     3.74
##  skim     3.80 0.0394 23  9     3.72     3.88
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95

Back to Contents

Back-transforming

Transformations and link functions are supported in several ways in emmeans, making this a complex topic worthy of its own vignette. Here, we show just the most basic approach. Namely, specifying the argument type = "response" will cause the displayed results to be back-transformed to the response scale, when a transformation or link function is incorporated in the model. For example, let’s try the preceding test() call again:

test(pigs.emm.s, null = log(40), side = ">", type = "response")
##  source response   SE df null t.ratio p.value
##  fish       29.8 1.09 23   40  -8.026  1.0000
##  soy        39.1 1.47 23   40  -0.577  0.7153
##  skim       44.6 1.75 23   40   2.740  0.0058
## 
## Results are averaged over the levels of: percent 
## P values are right-tailed 
## Tests are performed on the log scale

Note what changes and what doesn’t change. In the test() call, we still use the log of 40 as the null value; null must always be specified on the linear-prediction scale, in this case the log. In the output, the displayed estimates, as well as the null value, are shown back-transformed. As well, the standard errors are altered (using the delta method). However, the t ratios and P values are identical to the preceding results. That is, the tests themselves are still conducted on the linear-predictor scale (as is noted in the output).

Similar statements apply to confidence intervals on the response scale:

confint(pigs.emm.s, side = ">", level = .90, type = "response")
##  source response   SE df lower.CL upper.CL
##  fish       29.8 1.09 23     28.4      Inf
##  soy        39.1 1.47 23     37.3      Inf
##  skim       44.6 1.75 23     42.3      Inf
## 
## Results are averaged over the levels of: percent 
## Confidence level used: 0.9 
## Intervals are back-transformed from the log scale

With side = ">", a lower confidence limit is computed on the log scale, then that limit is back-transformed to the response scale. (We have also illustrated how to change the confidence level.)

Back to Contents

Multiplicity adjustments

Both tests and confidence intervals may be adjusted for simultaneous inference. Such adjustments ensure that the confidence coefficient for a whole set of intervals is at least the specified level, or control for multiplicity in a whole family of tests. This is done via the adjust argument. For ref_grid() and emmeans() results, the default is adjust = "none". For most contrast() results, adjust is something else, depending on the type of contrasts created. For example, pairwise comparisons default to adjust = "tukey", i.e., the Tukey HSD method. The summary() function sometimes changes adjust if it is inappropriate. For example, with

confint(pigs.emm.s, adjust = "tukey")
## Note: adjust = "tukey" was changed to "sidak"
## because "tukey" is only appropriate for one set of pairwise comparisons
##  source emmean     SE df lower.CL upper.CL
##  fish     3.39 0.0367 23     3.30     3.49
##  soy      3.67 0.0374 23     3.57     3.76
##  skim     3.80 0.0394 23     3.70     3.90
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95 
## Conf-level adjustment: sidak method for 3 estimates

the adjustment is changed to the Sidak method because the Tukey adjustment is inappropriate unless you are doing pairwise comparisons.

An adjustment method that is usually appropriate is Bonferroni; however, it can be quite conservative. Using adjust = "mvt" is the closest to an “exact” all-around “single-step” method, as it uses the multivariate t distribution (and the mvtnorm package) with the same covariance structure as the estimates to determine the adjustment. However, this comes at high computational expense, as the computations are done using simulation techniques. For a large set of tests (and especially confidence intervals), the computational lag becomes noticeable, if not intolerable.
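
For instance, the following requests mvt-adjusted intervals for our three EMMs. (Because the computation involves simulation, the adjusted limits may vary slightly from run to run.)

confint(pigs.emm.s, adjust = "mvt")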

For tests, adjust increases the P values over those otherwise obtained with adjust = "none". Compare the following adjusted tests with the unadjusted ones previously computed.

test(pigs.emm.s, null = log(40), side = ">", adjust = "bonferroni")
##  source emmean     SE df null t.ratio p.value
##  fish     3.39 0.0367 23 3.69  -8.026  1.0000
##  soy      3.67 0.0374 23 3.69  -0.577  1.0000
##  skim     3.80 0.0394 23 3.69   2.740  0.0175
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale. 
## P value adjustment: bonferroni method for 3 tests 
## P values are right-tailed

Back to Contents

“By” variables

Sometimes you want to break a summary down into smaller pieces; for this purpose, the by argument in summary() is useful. For example,

confint(pigs.rg, by = "source")
## source = fish:
##  percent prediction     SE df lower.CL upper.CL
##        9       3.22 0.0536 23     3.11     3.33
##       12       3.40 0.0493 23     3.30     3.50
##       15       3.44 0.0548 23     3.32     3.55
##       18       3.52 0.0547 23     3.41     3.63
## 
## source = soy:
##  percent prediction     SE df lower.CL upper.CL
##        9       3.49 0.0498 23     3.39     3.60
##       12       3.67 0.0489 23     3.57     3.77
##       15       3.71 0.0507 23     3.61     3.82
##       18       3.79 0.0640 23     3.66     3.93
## 
## source = skim:
##  percent prediction     SE df lower.CL upper.CL
##        9       3.62 0.0501 23     3.52     3.73
##       12       3.80 0.0494 23     3.70     3.90
##       15       3.84 0.0549 23     3.73     3.95
##       18       3.92 0.0646 23     3.79     4.06
## 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95

If there is also an adjust in force when by variables are used, the adjustment is made separately on each by group; e.g., in the above, we would be adjusting for sets of 4 intervals, not all 12 together.
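
For instance, in the following sketch, the Bonferroni adjustment is applied to each source’s four intervals as a separate family, not to all 12 at once:

confint(pigs.rg, by = "source", adjust = "bonferroni")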

There can be a by specification in emmeans() (or equivalently, a | in the formula); and if so, it is passed on to summary() and used unless overridden by another by. Here are examples, not run:

emmeans(pigs.lm1, ~ percent | source)    ### same results as above
summary(.Last.value, by = "percent")     ### grouped the other way

Specifying by = NULL will remove all grouping.

Simple comparisons

There is also a simple argument for contrast() that is in essence the inverse of by; the contrasts are run using everything except the specified variables as by variables. To illustrate, let’s consider the model for pigs that includes the interaction (so that the levels of one factor compare differently at levels of the other factor).

pigsint.lm <- lm(log(conc) ~ source * factor(percent), data = pigs)
pigsint.rg <- ref_grid(pigsint.lm)
contrast(pigsint.rg, "consec", simple = "percent")
## source = fish:
##  contrast estimate     SE df t.ratio p.value
##  12 - 9     0.1849 0.1061 17   1.742  0.2361
##  15 - 12    0.0045 0.1061 17   0.042  0.9999
##  18 - 15    0.0407 0.1061 17   0.383  0.9626
## 
## source = soy:
##  contrast estimate     SE df t.ratio p.value
##  12 - 9     0.1412 0.0949 17   1.487  0.3592
##  15 - 12   -0.0102 0.0949 17  -0.108  0.9992
##  18 - 15    0.0895 0.1342 17   0.666  0.8572
## 
## source = skim:
##  contrast estimate     SE df t.ratio p.value
##  12 - 9     0.2043 0.0949 17   2.152  0.1178
##  15 - 12    0.1398 0.1061 17   1.317  0.4520
##  18 - 15    0.1864 0.1424 17   1.309  0.4567
## 
## Results are given on the log (not the response) scale. 
## P value adjustment: mvt method for 3 tests

In fact, we may do all one-factor comparisons by specifying simple = "each". This typically produces a lot of output, so use it with care.
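
A sketch of such a call follows (output omitted here, as it is lengthy); the combine = TRUE option gathers all the families into one so that a single multiplicity adjustment is applied:

contrast(pigsint.rg, "consec", simple = "each", combine = TRUE)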

Back to Contents

Joint tests

From the above, we already know how to test individual results. For pairwise comparisons (details in the “comparisons” vignette), we might do

pigs.prs.s <- pairs(pigs.emm.s)
pigs.prs.s
##  contrast    estimate     SE df t.ratio p.value
##  fish - soy    -0.273 0.0529 23  -5.153  0.0001
##  fish - skim   -0.402 0.0542 23  -7.428  <.0001
##  soy - skim    -0.130 0.0530 23  -2.442  0.0570
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale. 
## P value adjustment: tukey method for comparing a family of 3 estimates

But suppose we want an omnibus test that all these comparisons are zero. Easy enough, using the joint argument in test (note: the joint argument is not available in summary(); only in test()):

test(pigs.prs.s, joint = TRUE)
##  df1 df2 F.ratio p.value note
##    2  23  28.849  <.0001  d  
## 
## d: df1 reduced due to linear dependence

Notice that there are three comparisons, but only 2 d.f. for the test, as cautioned in the message.

The test produced with joint = TRUE is a “type III” test (assuming the default equal weights are used to obtain the EMMs). See more on these types of tests for higher-order effects in the “interactions” vignette section on contrasts.

For convenience, there is also a joint_tests() function that performs joint tests of contrasts among each term in a model or emmGrid object.

joint_tests(pigsint.rg)
##  model term     df1 df2 F.ratio p.value
##  source           2  17  30.256  <.0001
##  percent          3  17   8.214  0.0013
##  source:percent   6  17   0.926  0.5011

The tests of main effects are of families of contrasts; those for interaction effects are for interaction contrasts. These results are essentially the same as a “Type-III ANOVA”, but may differ in situations where there are empty cells or other non-estimability issues, or when generalizations such as unequal weighting are present. (Another distinction is that sums of squares and mean squares are not shown; that is because these really are tests of contrasts among predictions, and they may or may not correspond to model sums of squares.)

One may use by variables with joint_tests. For example:

joint_tests(pigsint.rg, by = "source")
## source = fish:
##  model term df1 df2 F.ratio p.value
##  percent      3  17   1.712  0.2023
## 
## source = soy:
##  model term df1 df2 F.ratio p.value
##  percent      3  17   1.290  0.3097
## 
## source = skim:
##  model term df1 df2 F.ratio p.value
##  percent      3  17   6.676  0.0035

In some models, it is possible to specify submodel = "type2", thereby obtaining something akin to a Type II analysis of variance. See the messy-data vignette for an example.

Back to Contents

Testing equivalence, noninferiority, and nonsuperiority

The delta argument in summary() or test() allows the user to specify a threshold value to use in a test of equivalence, noninferiority, or nonsuperiority. An equivalence test is kind of a backwards significance test, where small P values are associated with small differences relative to a specified threshold value delta. The help page for summary.emmGrid gives the details of these tests. Suppose in the present example, we consider two sources to be equivalent if they are within 25% of each other. We can test this as follows:

test(pigs.prs.s, delta = log(1.25), adjust = "none")
##  contrast    estimate     SE df t.ratio p.value
##  fish - soy    -0.273 0.0529 23   0.937  0.8209
##  fish - skim   -0.402 0.0542 23   3.308  0.9985
##  soy - skim    -0.130 0.0530 23  -1.765  0.0454
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale. 
## Statistics are tests of equivalence with a threshold of 0.22314 
## P values are left-tailed

By our 25% standard, the P value is quite small for comparing soy and skim, providing statistical evidence that their difference is small enough, relative to the threshold, to consider them equivalent.
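The related one-sided tests are also available: keeping the same delta, a sketch (not run) of a nonsuperiority test uses the side argument:

test(pigs.prs.s, delta = log(1.25), side = "<", adjust = "none")   ### side = ">" gives noninferiority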

Back to Contents

Graphics

Graphical displays of emmGrid objects are described in the “basics” vignette.

Index of all vignette topics

emmeans/inst/doc/predictions.R0000644000176200001440000000362014165066764016107 0ustar liggesusers## ---- echo = FALSE, results = "hide", message = FALSE--------------------------------------------- require("emmeans") options(show.signif.stars = FALSE, width = 100) knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") ## ----eval = FALSE--------------------------------------------------------------------------------- # rg <- ref_grid(model) # rg@misc$sigma ## ---- message = FALSE----------------------------------------------------------------------------- feedlot = transform(feedlot, adj.ewt = ewt - predict(lm(ewt ~ herd))) require(lme4) feedlot.lmer <- lmer(swt ~ adj.ewt + diet + (1|herd), data = feedlot) feedlot.rg <- ref_grid(feedlot.lmer, at = list(adj.ewt = 0)) summary(feedlot.rg) ## point predictions ## ------------------------------------------------------------------------------------------------- lme4::VarCorr(feedlot.lmer) ## for the model feedlot.rg@misc$sigma ## default in the ref. grid ## ------------------------------------------------------------------------------------------------- feedlot.rg <- update(feedlot.rg, sigma = sqrt(77.087^2 + 57.832^2)) ## ------------------------------------------------------------------------------------------------- predict(feedlot.rg, interval = "prediction") ## ---- fig.height = 2------------------------------------------------------------------------------ plot(feedlot.rg, PIs = TRUE) ## ------------------------------------------------------------------------------------------------- range(feedlot$swt) ## ------------------------------------------------------------------------------------------------- feedlot.lm <- lm(swt ~ adj.ewt + diet + herd, data = feedlot) ## ------------------------------------------------------------------------------------------------- newrg <- ref_grid(feedlot.lm, at = list(adj.ewt = 0, herd = c("9", "19"))) predict(newrg, interval = "prediction", by = "herd") emmeans/inst/doc/FAQs.html0000644000176200001440000006717314165066750015131 0ustar liggesusers FAQs for emmeans

FAQs for emmeans

emmeans package, Version 1.7.2

This vignette contains answers to questions received from users or posted on discussion boards like Cross Validated and Stack Overflow.

What are EMMs/lsmeans?

Estimated marginal means (EMMs), a.k.a. least-squares means, are predictions on a reference grid of predictor settings, or marginal averages thereof. See details in the “basics” vignette.

What is the fastest way to obtain EMMs and pairwise comparisons?

There are two answers to this (i.e., be careful what you wish for):

  1. Don’t think; just fit the first model that comes to mind and run emmeans(model, pairwise ~ treatment). This is the fastest way; however, the results have a good chance of being invalid.
  2. Do think: Make sure you fit a model that really explains the responses. Do diagnostic residual plots, include appropriate interactions, account for heteroscedasticity if necessary, etc. This is the fastest way to obtain appropriate estimates and comparisons.

The point here is that emmeans() summarizes the model, not the data directly. If you use a bad model, you will get bad results. And if you use a good model, you will get appropriate results. It’s up to you: it’s your research—is it important?

Back to Contents

I wanted comparisons, but all I get is (nothing)

This happens when you have only one estimate; and you can’t compare it with itself! This in turn can happen when you have a situation like this: you have fitted

mod <- lm(RT ~ treat, data = mydata)

and treat is coded in your dataset with numbers 1, 2, 3, … . Since treat is a numeric predictor, emmeans() just reduces it to a single number, its mean, rather than separate values for each treatment. Also, please note that this is almost certainly NOT the model you want, because it forces an assumption that the treatment effects all fall on a straight line. You should fit a model like

mod <- lm(RT ~ factor(treat), data = mydata)

then you will have much better luck with comparisons.

The model I fitted is not supported by emmeans

You may still be able to get results using qdrg() (quick and dirty reference grid). See ?qdrg for details and examples.
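A minimal sketch (not run; mymod, mydata, and treat are hypothetical names), assuming the unsupported fit can at least supply coefficients and their covariance matrix:

rg <- qdrg(formula = ~ factor(treat), data = mydata,
           coef = coef(mymod), vcov = vcov(mymod), df = 20)
emmeans(rg, pairwise ~ treat)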

I have three (or two or four) factors that interact

Perhaps your question has to do with interacting factors, and you want to do some kind of post hoc analysis comparing levels of one (or more) of the factors on the response. Some specific versions of this question…

  • Perhaps you tried to do a simple comparison for one treatment and got a warning message you don’t understand
  • You do pairwise comparisons of factor combinations and it’s just too much – want just some of them
  • How do I even approach this?

My first answer is: plots almost always help. If you have factors A, B, and C, try something like emmip(model, A ~ B | C), which creates an interaction-style plot of the predictions against B, for each A, with separate panels for each C. This will help visualize what effects stand out in a practical way. This can guide you in what post-hoc tests would make sense. See the “interactions” vignette for more discussion and examples.

Back to Contents

I have covariate(s) that interact(s) with factor(s)

This is a situation where it may well be appropriate to compare the slopes of trend lines, rather than the EMMs. See help("emtrends") and the discussion of this topic in the “interactions” vignette.

I have covariate(s) and am fitting a polynomial model

You need to be careful to define the reference grid consistently. For example, if you use covariates x and xsq (equal to x^2) to fit a quadratic curve, the default reference grid uses the mean of each covariate – and mean(xsq) is usually not the same as mean(x)^2. So you need to use at to ensure that the covariates are set consistently with respect to the model. See this subsection of the “basics” vignette for an example.
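For example, a sketch (not run; mod, treat, x, and xsq are hypothetical names):

emmeans(mod, "treat", at = list(x = 2.5, xsq = 2.5^2))   ### xsq set consistently with x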

Some “significant” comparisons have overlapping confidence intervals

That can happen because it is just plain wrong to use [non-]overlapping CIs for individual means to do comparisons. Look at the printed results from something like emmeans(mymodel, pairwise ~ treatment). In particular, note that the SE values are not the same, and may even have different degrees of freedom. Means are one thing statistically, and differences of means are quite another thing. Don’t ever mix them up, and don’t ever use a CI display for comparing means.

I’ll add that making hard-line decisions about “significant” and “non-significant” is in itself a poor practice. See the discussion in the “basics” vignette.

All my pairwise comparisons have the same P value

This will happen if you fitted a model where the treatments you want to compare were put in as a numeric predictor; for example dose, with values of 1, 2, and 3. If dose is modeled as numeric, you will be fitting a linear trend in those dose values, rather than a model that allows those doses to differ in arbitrary ways. Go back and fit a different model using factor(dose) instead; it will make all the difference. This is closely related to the next topic.
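A sketch of the fix (not run; names are hypothetical):

mod2 <- lm(response ~ factor(dose), data = mydata)   ### doses may now differ arbitrarily
emmeans(mod2, pairwise ~ dose)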

emmeans() doesn’t work as expected

Equivalently, users ask how to get post hoc comparisons when we have covariates rather than factors. Yes, it does work, but you have to tell it the appropriate reference grid.

But before saying more, I have a question for you: Are you sure your model is meaningful?

  • If your question concerns only two-level predictors such as sex (coded 1 for female, 2 for male), no problem. The model will produce the same predictions as you’d get if you’d used these as factors.
  • If any of the predictors has 3 or more levels, you may have fitted a nonsense model, in which case you need to fit a different model that does make sense before doing any kind of post hoc analysis. For instance, if the model contains a covariate brand (coded 1 for Acme, 2 for Ajax, and 3 for Al’s), it is implying that the difference between Acme and Ajax is exactly equal to the difference between Ajax and Al’s, owing to the fact that a linear trend in brand has been fitted. If you had instead coded 1 for Ajax, 2 for Al’s, and 3 for Acme, the model would produce different fitted values. Ask yourself if it makes sense to have brand = 2.319. If not, you need to fit another model using factor(brand) in place of brand.

Assuming that the appropriateness of the model is settled, the current version of emmeans automatically casts two-value covariates as factors, but not covariates having higher numbers of unique values. Suppose your model has a covariate dose which was experimentally varied over four levels, but can sensibly be interpreted as a numerical predictor. If you want to include the separate values of dose rather than the mean dose, you can do that using something like emmeans(model, "dose", at = list(dose = 1:4)), or emmeans(model, "dose", cov.keep = "dose"), or emmeans(model, "dose", cov.keep = "4"). There are small differences between these. The last one regards any covariate having 4 or fewer unique values as a factor.

See “altering the reference grid” in the “basics” vignette for more discussion.

Back to Contents

All or some of the results are NA

The emmeans package uses tools in the estimability package to determine whether its results are uniquely estimable. For example, in a two-way model with interactions included, if there are no observations in a particular cell (factor combination), then we cannot estimate the mean of that cell.

When some of the EMMs are estimable and others are not, that is information about missing information in the data. If it’s possible to remove some terms from the model (particularly interactions), that may make more things estimable if you re-fit with those terms excluded; but don’t delete terms that are really needed for the model to fit well.

When all of the estimates are non-estimable, it could be symptomatic of something else. Some possibilities include:

  • An overly ambitious model; for example, in a Latin square design, interaction effects are confounded with main effects; so if any interactions are included in the model, you will render main effects inestimable.
  • Possibly you have a nested structure that needs to be included in the model or specified via the nesting argument. Perhaps the levels that B can have depend on which level of A is in force. Then B is nested in A and the model should specify A + A:B, with no main effect for B.
  • Modeling factors as numeric predictors (see also the related section on covariates). To illustrate, suppose you have data on particular state legislatures, and the model includes the predictors state_name as well as dem_gov which is coded 1 if the governor is a Democrat and 0 otherwise. If the model was fitted with state_name as a factor or character variable, but dem_gov as a numeric predictor, then, chances are, emmeans() will return non-estimable results. If instead, you use factor(dem_gov) in the model, then the fact that state_name is nested in dem_gov will be detected, causing EMMs to be computed separately for each party’s states, thus making things estimable.
  • Some other things may in fact be estimable. For illustration, it’s easy to construct an example where all the EMMs are non-estimable, but pairwise comparisons are estimable:
pg <- transform(pigs, x = rep(1:3, c(10, 10, 9)))
pg.lm <- lm(log(conc) ~ x + source + factor(percent), data = pg)
emmeans(pg.lm, consec ~ percent)
## $emmeans
##  percent emmean SE df asymp.LCL asymp.UCL
##        9 nonEst NA NA        NA        NA
##       12 nonEst NA NA        NA        NA
##       15 nonEst NA NA        NA        NA
##       18 nonEst NA NA        NA        NA
## 
## Results are averaged over the levels of: source 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast estimate     SE df t.ratio p.value
##  12 - 9     0.1796 0.0561 23   3.202  0.0110
##  15 - 12    0.0378 0.0582 23   0.650  0.8614
##  18 - 15    0.0825 0.0691 23   1.194  0.5200
## 
## Results are averaged over the levels of: source 
## Results are given on the log (not the response) scale. 
## P value adjustment: mvt method for 3 tests

The “messy-data” vignette has more examples and discussion.

Back to Contents

If I analyze subsets of the data separately, I get different results

Estimated marginal means summarize the model that you fitted to the data – not the data themselves. Many of the most common models rely on several simplifying assumptions – that certain effects are linear, that the error variance is constant, etc. – and those assumptions are passed forward into the emmeans() results. Doing separate analyses on subsets usually amounts to departing from that overall model, so of course the results are different.

My lsmeans/EMMs are way off from what I expected

First step: Carefully read the annotations below the output. Do they say something like “results are on the log scale, not the response scale”? If so, that explains it. A Poisson or logistic model involves a link function, and by default, emmeans() produces its results on that same scale. You can add type = "response" to the emmeans() call and it will put the results on the scale you expect. But that is not always the best approach. The “transformations” vignette has examples and discussion.

Why do I get Inf for the degrees of freedom?

This is simply the way that emmeans labels asymptotic results (that is, estimates that are tested against the standard normal distribution – z tests – rather than the t distribution). Note that obtaining quantiles or probabilities from the t distribution with infinite degrees of freedom is the same as obtaining the corresponding values from the standard normal. For example:

qt(c(.9, .95, .975), df = Inf)
## [1] 1.281552 1.644854 1.959964
qnorm(c(.9, .95, .975))
## [1] 1.281552 1.644854 1.959964

so when you see infinite d.f., that just means it’s a z test or a z confidence interval.

Back to Contents

I get exactly the same comparisons for each “by” group

As mentioned elsewhere, EMMs summarize a model, not the data. If your model does not include any interactions between the by variables and the factors for which you want EMMs, then by definition, the effects for the latter will be exactly the same regardless of the by variable settings. So of course the comparisons will all be the same. If you think they should be different, then you are saying that your model should include interactions between the factors of interest and the by factors.
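A sketch of the remedy (not run; mod, treat, and group are hypothetical names):

mod2 <- update(mod, . ~ . + treat:group)   ### allow treat effects to differ by group
emmeans(mod2, pairwise ~ treat | group)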

My ANOVA F is significant, but no pairwise comparisons are

First of all, you should not be making binary decisions of “significant” or “nonsignificant.” This is a simplistic view of P values that assigns an unmerited magical quality to the value 0.05. It is suggested that you just report the P values actually obtained, and let your readers decide how significant your findings are in their scientific context.

But to answer the question: This is a common misunderstanding of ANOVA. If F has a particular P value, this implies only that some contrast among the means (or effects) has the same P value, after applying the Scheffe adjustment. That contrast may be very much unlike a pairwise comparison, especially when there are several means being compared. Having an F statistic with a P value of, say, 0.06, does not imply that any pairwise comparison will have a P value of 0.06 or smaller. Again referring to the paragraph above, just report the P value for each pairwise comparison, and don’t try to relate them to the F statistic.

Another consideration is that by default, P values for pairwise comparisons are adjusted using the Tukey method, and the adjusted P values can be quite a bit larger than the unadjusted ones. (But I definitely do not advocate using no adjustment to “repair” this problem.)

I asked for Tukey adjustments, but that’s not what I got

There are two reasons this could happen:

  1. There is only one comparison in each by group (see next topic).
  2. A Tukey adjustment is inappropriate. The Tukey adjustment is appropriate for pairwise comparisons of means. When you have some other set of contrasts, the Tukey method is deemed unsuitable and the Sidak method is used instead. A suggestion is to use "mvt" adjustment (which is exact); we don’t default to this because it can require a lot of computing time for a large set of contrasts or comparisons.

emmeans() completely ignores my P-value adjustments

This happens when there are only two means (or only two in each by group). Thus there is only one comparison. When there is only one thing to test, there is no multiplicity issue, and hence no multiplicity adjustment to the P values.

If you wish to apply a P-value adjustment to all tests across all groups, you need to null-out the by variable and summarize, as in the following:

EMM <- emmeans(model, ~ treat | group)   # where treat has 2 levels
pairs(EMM, adjust = "sidak")   # adjustment is ignored - only 1 test per group
summary(pairs(EMM), by = NULL, adjust = "sidak")   # all are in one group now

Note that if you put by = NULL inside the call to pairs(), then this causes all treat,group combinations to be compared.
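For example (not run):

pairs(EMM, by = NULL)   ### compares every treat,group combination with every other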

Back to Contents

emmeans() gives me pooled t tests, but I expected Welch’s t

It is important to note that emmeans() and its relatives produce results based on the model object that you provide – not the data. So if your sample SDs are wildly different, a model fitted using lm() or aov() is not a good model, because those R functions use a statistical model that presumes that the errors have constant variance. That is, the problem isn’t in emmeans(), it’s in handing it an inadequate model object.

Here is a simple illustrative example. Consider a simple one-way experiment and the following model:

mod1 <- aov(response ~ treat, data = mydata)
emmeans(mod1, pairwise ~ treat)

This code will estimate means and comparisons among treatments. All standard errors, confidence intervals, and t statistics are based on the pooled residual SD with N - k degrees of freedom (assuming N observations and k treatments). These results are useful only if the underlying assumptions of mod1 are correct – including the assumption that the error SD is the same for all treatments.

Alternatively, you could fit the following model using generalized least-squares:

mod2 = nlme::gls(response ~ treat, data = mydata,
                 weights = nlme::varIdent(form = ~1 | treat))
emmeans(mod2, pairwise ~ treat)

This model specifies that the error variance depends on the levels of treat. This would be a much better model to use when you have wildly different sample SDs. The results of the emmeans() call will reflect this improvement in the modeling. The standard errors of the EMMs will depend on the individual sample variances, and the t tests of the comparisons will be in essence the Welch t statistics with Satterthwaite degrees of freedom.

To obtain appropriate post hoc estimates, contrasts, and comparisons, one must first find a model that successfully explains the peculiarities in the data. This point cannot be emphasized enough. If you give emmeans() a good model, you will obtain correct results; if you give it a bad model, you will obtain incorrect results. Get the model right first.

Back to Contents

Index of all vignette topics

emmeans/inst/doc/interactions.html0000644000176200001440000025772314165066760017044 0ustar liggesusers Interaction analysis in emmeans

Interaction analysis in emmeans

emmeans package, Version 1.7.2

Models in which predictors interact seem to create a lot of confusion concerning what kinds of post hoc methods should be used. It is hoped that this vignette will be helpful in shedding some light on how to use the emmeans package effectively in such situations.

Interacting factors

As an example for this topic, consider the auto.noise dataset included with the package. This is a balanced 3x2x2 experiment with three replications. The response – noise level – is measured for three sizes of cars and two types of anti-pollution filters, on each side of the car.1

Let’s fit a model and obtain the ANOVA table (because of the scale of the data, we believe that the response is recorded in tenths of decibels; so we compensate for this by scaling the response):

noise.lm <- lm(noise/10 ~ size * type * side, data = auto.noise)
anova(noise.lm)
## Analysis of Variance Table
## 
## Response: noise/10
##                Df  Sum Sq Mean Sq  F value    Pr(>F)
## size            2 260.514 130.257 893.1905 < 2.2e-16
## type            1  10.563  10.563  72.4286 1.038e-08
## side            1   0.007   0.007   0.0476 0.8291042
## size:type       2   8.042   4.021  27.5714 6.048e-07
## size:side       2  12.931   6.465  44.3333 8.730e-09
## type:side       1   0.174   0.174   1.1905 0.2860667
## size:type:side  2   3.014   1.507  10.3333 0.0005791
## Residuals      24   3.500   0.146

There are statistically strong 2- and 3-way interactions.

One mistake that a lot of people seem to make is to proceed too hastily to estimating marginal means (even in the face of all these interactions!). They would go straight to analyses like this:

emmeans(noise.lm, pairwise ~ size)
## NOTE: Results may be misleading due to involvement in interactions
## $emmeans
##  size emmean     SE df lower.CL upper.CL
##  S     82.42 0.1102 24    82.19    82.64
##  M     83.38 0.1102 24    83.15    83.60
##  L     77.25 0.1102 24    77.02    77.48
## 
## Results are averaged over the levels of: type, side 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast estimate    SE df t.ratio p.value
##  S - M      -0.958 0.156 24  -6.147  <.0001
##  S - L       5.167 0.156 24  33.140  <.0001
##  M - L       6.125 0.156 24  39.287  <.0001
## 
## Results are averaged over the levels of: type, side 
## P value adjustment: tukey method for comparing a family of 3 estimates

The analyst-in-a-hurry would thus conclude that the noise level is higher for medium-sized cars than for small or large ones.

But as is seen in the message before the output, emmeans() valiantly tries to warn you that it may not be a good idea to average over factors that interact with the factor of interest. It isn’t always a bad idea to do this, but sometimes it definitely is.

What about this time? I think a good first step is always to try to visualize the nature of the interactions before doing any statistical comparisons. The following plot helps.

emmip(noise.lm, type ~ size | side)

Examining this plot, we see that the “medium” mean is not always higher; so the marginal means, and the way they compare, do not represent what is always the case. Moreover, what is evident in the plot is that the peak for medium-size cars occurs for only one of the two filter types. So it seems more useful to do the comparisons of size separately for each filter type. This is easily done, simply by conditioning on type:

emm_s.t <- emmeans(noise.lm, pairwise ~ size | type)
## NOTE: Results may be misleading due to involvement in interactions
emm_s.t
## $emmeans
## type = Std:
##  size emmean     SE df lower.CL upper.CL
##  S     82.58 0.1559 24    82.26    82.91
##  M     84.58 0.1559 24    84.26    84.91
##  L     77.50 0.1559 24    77.18    77.82
## 
## type = Octel:
##  size emmean     SE df lower.CL upper.CL
##  S     82.25 0.1559 24    81.93    82.57
##  M     82.17 0.1559 24    81.84    82.49
##  L     77.00 0.1559 24    76.68    77.32
## 
## Results are averaged over the levels of: side 
## Confidence level used: 0.95 
## 
## $contrasts
## type = Std:
##  contrast estimate   SE df t.ratio p.value
##  S - M     -2.0000 0.22 24  -9.071  <.0001
##  S - L      5.0833 0.22 24  23.056  <.0001
##  M - L      7.0833 0.22 24  32.127  <.0001
## 
## type = Octel:
##  contrast estimate   SE df t.ratio p.value
##  S - M      0.0833 0.22 24   0.378  0.9245
##  S - L      5.2500 0.22 24  23.812  <.0001
##  M - L      5.1667 0.22 24  23.434  <.0001
## 
## Results are averaged over the levels of: side 
## P value adjustment: tukey method for comparing a family of 3 estimates

Not too surprisingly, all three sizes differ statistically with standard filters; but with Octel filters, there isn’t much of a difference between the small and medium sizes.

For comparing the levels of other factors, similar judgments must be made. It may help to construct other interaction plots with the factors in different roles. In my opinion, almost all meaningful statistical analysis should be grounded in evaluating the practical impact of the estimated effects first, and seeing if the statistical evidence backs it up. Those who put all their attention on how many asterisks (I call these people “* gazers”) are ignoring the fact that these don’t measure the sizes of the effects on a practical scale.2 An effect can be practically negligible and still have a very small P value – or practically important but have a large P value – depending on sample size and error variance. Failure to describe what is actually going on in the data is a failure to do an adequate analysis. Use lots of plots, and think about the results. For more on this, see the discussion of P values in the “basics” vignette.

Simple contrasts

An alternative way to specify conditional contrasts or comparisons is through the use of the simple argument to contrast() or pairs(), which amounts to specifying which factors are not used as by variables. For example, consider:

noise.emm <- emmeans(noise.lm, ~ size * side * type)

Then pairs(noise.emm, simple = "size") is the same as pairs(noise.emm, by = c("side", "type")).

One may specify a list for simple, in which case separate runs are made with each element of the list. Thus, pairs(noise.emm, simple = list("size", c("side", "type"))) returns two sets of contrasts: comparisons of size for each combination of the other two factors; and comparisons of side*type combinations for each size.

A shortcut that generates all simple main-effect comparisons is to use simple = "each". In this example, the result is the same as obtained using simple = list("size", "side", "type").

Ordinarily, when simple is a list (or equal to "each"), a list of contrast sets is returned. However, if the additional argument combine is set to TRUE, they are all combined into one family:

contrast(noise.emm, "consec", simple = "each", combine = TRUE, adjust = "mvt")
##  side type  size contrast    estimate    SE df t.ratio p.value
##  L    Std   .    M - S          1.500 0.312 24   4.811  0.0012
##  L    Std   .    L - M         -8.667 0.312 24 -27.795  <.0001
##  R    Std   .    M - S          2.500 0.312 24   8.018  <.0001
##  R    Std   .    L - M         -5.500 0.312 24 -17.639  <.0001
##  L    Octel .    M - S         -0.333 0.312 24  -1.069  0.9768
##  L    Octel .    L - M         -5.667 0.312 24 -18.174  <.0001
##  R    Octel .    M - S          0.167 0.312 24   0.535  0.9999
##  R    Octel .    L - M         -4.667 0.312 24 -14.967  <.0001
##  .    Std   S    R - L         -1.833 0.312 24  -5.880  0.0001
##  .    Std   M    R - L         -0.833 0.312 24  -2.673  0.1715
##  .    Std   L    R - L          2.333 0.312 24   7.483  <.0001
##  .    Octel S    R - L         -0.500 0.312 24  -1.604  0.7747
##  .    Octel M    R - L          0.000 0.312 24   0.000  1.0000
##  .    Octel L    R - L          1.000 0.312 24   3.207  0.0559
##  L    .     S    Octel - Std   -1.000 0.312 24  -3.207  0.0563
##  L    .     M    Octel - Std   -2.833 0.312 24  -9.087  <.0001
##  L    .     L    Octel - Std    0.167 0.312 24   0.535  0.9999
##  R    .     S    Octel - Std    0.333 0.312 24   1.069  0.9768
##  R    .     M    Octel - Std   -2.000 0.312 24  -6.414  <.0001
##  R    .     L    Octel - Std   -1.167 0.312 24  -3.742  0.0165
## 
## P value adjustment: mvt method for 20 tests

The dots (.) in this result correspond to which simple effect is being displayed. If we re-run this same call with combine = FALSE or omitted, these twenty comparisons would be displayed in three broad sets of contrasts, each broken down further by combinations of by variables, each separately multiplicity-adjusted (a total of 16 different tables).

Back to Contents

Interaction contrasts

An interaction contrast is a contrast of contrasts. For instance, in the auto-noise example, we may want to obtain the linear and quadratic contrasts of size separately for each type, and compare them. Here are estimates of those contrasts:

contrast(emm_s.t[[1]], "poly")   ## 'by = "type"' already in previous result 
## type = Std:
##  contrast  estimate    SE df t.ratio p.value
##  linear       -5.08 0.220 24 -23.056  <.0001
##  quadratic    -9.08 0.382 24 -23.786  <.0001
## 
## type = Octel:
##  contrast  estimate    SE df t.ratio p.value
##  linear       -5.25 0.220 24 -23.812  <.0001
##  quadratic    -5.08 0.382 24 -13.311  <.0001
## 
## Results are averaged over the levels of: side

The comparison of these contrasts may be done using the interaction argument in contrast() as follows:

IC_st <- contrast(emm_s.t[[1]], interaction = c("poly", "consec"), by = NULL)
IC_st
##  size_poly type_consec estimate    SE df t.ratio p.value
##  linear    Octel - Std   -0.167 0.312 24  -0.535  0.5979
##  quadratic Octel - Std    4.000 0.540 24   7.407  <.0001
## 
## Results are averaged over the levels of: side

(Using by = NULL restores type to a primary factor in these contrasts.) The practical meaning of this is that there isn’t a statistical difference in the linear trends, but the quadratic trend for Octel is greater than for standard filter types. (Both quadratic trends are negative, so in fact it is the standard filters that have more pronounced downward curvature, as is seen in the plot.) In case you need to understand more clearly what contrasts are being estimated, the coef() method helps:

coef(IC_st)
##   size  type c.1 c.2
## 1    S   Std   1  -1
## 2    M   Std   0   2
## 3    L   Std  -1  -1
## 4    S Octel  -1   1
## 5    M Octel   0  -2
## 6    L Octel   1   1

Note that the 4th through 6th contrast coefficients are the negatives of the 1st through 3rd – thus a comparison of two contrasts.

By the way, “type III” tests of interaction effects can be obtained via interaction contrasts:

test(IC_st, joint = TRUE)
##  df1 df2 F.ratio p.value
##    2  24  27.571  <.0001

This result is exactly the same as the F test of size:type in the anova output.

The three-way interaction may be explored via interaction contrasts too:

contrast(emmeans(noise.lm, ~ size*type*side),
         interaction = c("poly", "consec", "consec"))
##  size_poly type_consec side_consec estimate    SE df t.ratio p.value
##  linear    Octel - Std R - L          -2.67 0.624 24  -4.276  0.0003
##  quadratic Octel - Std R - L          -1.67 1.080 24  -1.543  0.1359

One interpretation of this is that the comparison by type of the linear contrasts for size is different on the left side than on the right side; but the analogous comparison of the quadratic contrasts, not so much. Refer again to the plot, and this can be discerned as a comparison of the interaction in the left panel versus the interaction in the right panel.

Finally, emmeans provides a joint_tests() function that obtains and tests the interaction contrasts for all effects in the model and compiles them in one Type-III-ANOVA-like table:

joint_tests(noise.lm)
##  model term     df1 df2 F.ratio p.value
##  size             2  24 893.190  <.0001
##  type             1  24  72.429  <.0001
##  side             1  24   0.048  0.8291
##  size:type        2  24  27.571  <.0001
##  size:side        2  24  44.333  <.0001
##  type:side        1  24   1.190  0.2861
##  size:type:side   2  24  10.333  0.0006

You may even add by variable(s) to obtain separate ANOVA tables for the remaining factors:

joint_tests(noise.lm, by = "side")
## side = L:
##  model term df1 df2 F.ratio p.value
##  size         2  24 651.714  <.0001
##  type         1  24  46.095  <.0001
##  size:type    2  24  23.524  <.0001
## 
## side = R:
##  model term df1 df2 F.ratio p.value
##  size         2  24 285.810  <.0001
##  type         1  24  27.524  <.0001
##  size:type    2  24  14.381  0.0001

Back to Contents

Multivariate contrasts

In the preceding sections, the way we addressed interacting factors was to do comparisons or contrasts of some factor(s) separately at levels of other factor(s). This leads to a lot of estimates and associated tests.

Another approach is to compare things in a multivariate way. In the auto-noise example, for example, we have four means (corresponding to the four combinations of type and size) with each size of car, and we could consider comparing these sets of means. Such multivariate comparisons can be done via the Mahalanobis distance (a kind of standardized distance measure) between one set of four means and another. This is facilitated by the mvcontrast() function:

mvcontrast(noise.emm, "pairwise", mult.name = c("type", "side"))
##  contrast T.square df1 df2 F.ratio p.value
##  S - M      88.857   4  21  19.438  <.0001
##  S - L    1199.429   4  21 262.375  <.0001
##  M - L    1638.000   4  21 358.312  <.0001
## 
## P value adjustment: sidak

In this output, the T.square values are Hotelling’s \(T^2\) statistics, which are the squared Mahalanobis distances among the sets of four means. These results thus accomplish a similar objective as the initial comparisons presented in this vignette, but are not complicated by the issue that the factors interact. (Instead, we lose the directionality of the comparisons.) While all comparisons are “significant,” the T.square values indicate that large cars are statistically most different from the other sizes.
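As a consistency check, these values follow the usual conversion between Hotelling’s \(T^2\) and F: with p = 4 response dimensions and ν = 24 error d.f. (so that df2 = ν − p + 1 = 21), F.ratio = T.square × (ν − p + 1) / (p × ν). For the first row:

88.857 * (24 - 4 + 1) / (4 * 24)
## [1] 19.43747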

We may still break things down using by variables. Suppose, for example, we wish to compare the two filter types for each size of car, without regard to which side:

update(mvcontrast(noise.emm, "consec", mult.name = "side", by = "size"), 
       by = NULL)
##  contrast    size T.square df1 df2 F.ratio p.value
##  Octel - Std S      11.429   2  23   5.476  0.0113
##  Octel - Std M     123.714   2  23  59.280  <.0001
##  Octel - Std L      14.286   2  23   6.845  0.0047
## 
## P value adjustment: sidak

One detail to note about multivariate comparisons: in order to make complete sense, all the factors involved must interact. Suppose we were to repeat the initial multivariate comparison after removing all interactions:

mvcontrast(update(noise.emm, submodel = ~ side + size + type), 
           "pairwise", mult.name = c("type", "side"))
##  contrast T.square df1 df2  F.ratio p.value
##  S - M      37.786   1  24   37.786  <.0001
##  S - L    1098.286   1  24 1098.286  <.0001
##  M - L    1543.500   1  24 1543.500  <.0001
## 
## P value adjustment: sidak 
## NOTE: Some or all d.f. are reduced due to singularities

Note that each \(F\) ratio now has 1 d.f. Also, note that T.square = F.ratio, and you can verify that these values are equal to the squares of the t.ratios in the initial example in this vignette (\((-6.147)^2 = 37.786\), etc.). That is, if we ignore all interactions, the multivariate tests are exactly equivalent to the univariate tests of the marginal means.

Back to Contents

Interactions with covariates

When a covariate and a factor interact, we typically don’t want EMMs themselves, but rather estimates of slopes of the covariate trend for each level of the factor. As a simple example, consider the fiber dataset, and fit a model including the interaction between diameter (a covariate) and machine (a factor):

fiber.lm <- lm(strength ~ diameter*machine, data = fiber)

This model comprises fitting, for each machine, a separate linear trend for strength versus diameter. Accordingly, we can estimate and compare the slopes of those lines via the emtrends() function:

emtrends(fiber.lm, pairwise ~ machine, var = "diameter")
## $emtrends
##  machine diameter.trend    SE df lower.CL upper.CL
##  A                1.104 0.194  9    0.666     1.54
##  B                0.857 0.224  9    0.351     1.36
##  C                0.864 0.208  9    0.394     1.33
## 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast estimate    SE df t.ratio p.value
##  A - B     0.24714 0.296  9   0.835  0.6919
##  A - C     0.24008 0.284  9   0.845  0.6863
##  B - C    -0.00705 0.306  9  -0.023  0.9997
## 
## P value adjustment: tukey method for comparing a family of 3 estimates

We see the three slopes, but no two of them test as being statistically different.

To visualize the lines themselves, you may use

emmip(fiber.lm, machine ~ diameter, cov.reduce = range)

The cov.reduce = range argument is passed to ref_grid(); it is needed because by default, each covariate is reduced to only one value (see the “basics” vignette). Instead, we call the range() function to obtain the minimum and maximum diameter.
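An equivalent sketch (not run) specifies those two values explicitly:

emmip(fiber.lm, machine ~ diameter, at = list(diameter = range(fiber$diameter)))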

For a more sophisticated example, consider the oranges dataset included with the package. These data concern the sales of two varieties of oranges. The prices (price1 and price2) were experimentally varied in different stores and different days, and the responses sales1 and sales2 were observed. Let’s consider three multivariate models for these data, with additive effects for days and stores, and different levels of fitting on the prices:

org.quad <- lm(cbind(sales1, sales2) ~ poly(price1, price2, degree = 2)
                                       + day + store, data = oranges)
org.int <- lm(cbind(sales1, sales2) ~ price1 * price2 + day + store, data = oranges)
org.add <- lm(cbind(sales1, sales2) ~ price1 + price2 + day + store, data = oranges)

Since these are multivariate models, emmeans methods will distinguish the responses as if they were levels of a factor, which we will name “variety”. Moreover, separate effects are estimated for each multivariate response, so there is an implied interaction between variety and each of the predictors involving price1 and price2. (In org.int, there is an implied three-way interaction.) An interesting way to view these models is to look at how they predict sales of each variety at each observed value of the prices:

emmip(org.quad, price2 ~ price1 | variety, mult.name = "variety", cov.reduce = FALSE)

The trends portrayed here are quite sensible: In the left panel, as we increase the price of variety 1, sales of that variety will tend to decrease – and the decrease will be faster when the other variety of oranges is low-priced. In the right panel, as price of variety 1 increases, sales of variety 2 will increase when it is low-priced, but could decrease also at high prices because oranges in general are just too expensive. A plot like this for org.int will be similar but all the curves will be straight lines; and the one for org.add will have all lines parallel. In all models, though, there are implied price1:variety and price2:variety interactions, because we have different regression coefficients for the two responses.

Which model should we use? They are nested models, so they can be compared by anova():

anova(org.quad, org.int, org.add)
## Analysis of Variance Table
## 
## Model 1: cbind(sales1, sales2) ~ poly(price1, price2, degree = 2) + day + 
##     store
## Model 2: cbind(sales1, sales2) ~ price1 * price2 + day + store
## Model 3: cbind(sales1, sales2) ~ price1 + price2 + day + store
##   Res.Df Df Gen.var.   Pillai approx F num Df den Df Pr(>F)
## 1     20      22.798                                       
## 2     22  2   21.543 0.074438  0.38658      4     40 0.8169
## 3     23  1   23.133 0.218004  2.64840      2     19 0.0967

It seems like the full-quadratic model has little advantage over the interaction model. There truly is nothing magical about a P value of 0.05, and we have enough data that over-fitting is not a hazard; so I like org.int. However, what follows could be done with any of these models.

To summarize and test the results compactly, it makes sense to obtain estimates of a representative trend in each of the left and right panels, and perhaps to compare them. In turn, that can be done by obtaining the slope of the curve (or line) at the average value of price2. The emtrends() function is designed for exactly this kind of purpose. It uses a difference quotient to estimate the slope of a line fitted to a given variable. It works just like emmeans() except for requiring the variable to use in the difference quotient. Using the org.int model:

emtrends(org.int, pairwise ~ variety, var = "price1", mult.name = "variety")
## $emtrends
##  variety price1.trend    SE df lower.CL upper.CL
##  sales1        -0.749 0.171 22   -1.104   -0.394
##  sales2         0.138 0.214 22   -0.306    0.582
## 
## Results are averaged over the levels of: day, store 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast        estimate   SE df t.ratio p.value
##  sales1 - sales2   -0.887 0.24 22  -3.690  0.0013
## 
## Results are averaged over the levels of: day, store

From this, we can say that, starting with price1 and price2 both at their average values, we expect sales1 to decrease by about .75 per unit increase in price1; meanwhile, there is a suggestion of a slight increase of sales2, but without much statistical evidence. Marginally, the first variety has a 0.89 disadvantage relative to sales of the second variety.

Other analyses (not shown) with price2 set at a higher value will reduce these effects, while setting price2 lower will exaggerate all these effects. If the same analysis is done with the quadratic model, the trends are curved, and so the results will depend somewhat on the setting for price1. The graph above gives an indication of the nature of those changes.

Similar results hold when we analyze the trends for price2:

emtrends(org.int, pairwise ~ variety, var = "price2", mult.name = "variety")
## $emtrends
##  variety price2.trend    SE df lower.CL upper.CL
##  sales1         0.172 0.102 22  -0.0404    0.384
##  sales2        -0.745 0.128 22  -1.0099   -0.480
## 
## Results are averaged over the levels of: day, store 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast        estimate    SE df t.ratio p.value
##  sales1 - sales2    0.917 0.143 22   6.387  <.0001
## 
## Results are averaged over the levels of: day, store

At the averages, increasing the price of variety 2 has the effect of decreasing sales of variety 2 while slightly increasing sales of variety 1 – a marginal difference of about .92.

Back to Contents

Summary

Interactions, by nature, make things more complicated. One must resist pressures and inclinations to try to produce simple bottom-line conclusions. Interactions require more work and more patience; they require presenting more cases – more than are presented in the examples in this vignette – in order to provide a complete picture.

Index of all vignette topics


  1. I sure wish I could ask some questions about how these data were collected; for example, are these independent experimental runs, or are some cars measured more than once? The model is based on the independence assumption, but I have my doubts.↩︎

  2. You may have noticed that there are no asterisks in the ANOVA table in this vignette. I habitually opt out of star-gazing by including options(show.signif.stars = FALSE) in my .Rprofile file.↩︎

emmeans/inst/doc/predictions.Rmd0000644000176200001440000001712314137062735016424 0ustar liggesusers--- title: "Prediction in **emmeans**" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{Prediction in emmeans} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, results = "hide", message = FALSE} require("emmeans") options(show.signif.stars = FALSE, width = 100) knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") ``` In this vignette, we discuss **emmeans**'s rudimentary capabilities for constructing prediction intervals. ## Contents {#contents} 1. [Focus on reference grids](#ref-grid) 2. [Need for an SD estimate](#sd-estimate) 3. [Feedlot example](#feedlot) 4. [Predictions on particular strata](#strata) 5. [Predictions with Bayesian models](#bayes) [Index of all vignette topics](vignette-topics.html) ## Focus on reference grids {#ref-grid} Prediction is not the central purpose of the **emmeans** package. Even its name refers to the idea of obtaining marginal averages of fitted values; and it is a rare situation where one would want to make a prediction of the average of several observations. We can certainly do that if it is truly desired, but almost always, predictions should be based on the reference grid itself (i.e., *not* the result of an `emmeans()` call), inasmuch as a reference grid comprises combinations of model predictors. ## Need for an SD estimate {#sd-estimate} A prediction interval requires an estimate of the error standard deviation, because we need to account for both the uncertainty of our point predictions and the uncertainty of outcomes centered on those estimates. By its current design, we save the value (if any) returned by `stats::sigma(object)` when a reference grid is constructed for a model `object`. Not all models provide a `sigma()` method, in which case an error is thrown if the error SD is not manually specified. Also, in many cases, there may be a `sigma()` method, but it does not return the appropriate value(s) in the context of the needed predictions. (In an object returned by `lme4::glmer(), for example, `sigma()` seems to always returns 1.0.) Indeed, as will be seen in the example that follows, one usually needs to construct a manual SD estimate when the model is a mixed-effects model. So it is essentially always important to think very specifically about whether we are using an appropriate value. You may check the value being assumed by looking at the `misc` slot in the reference grid: ```{r eval = FALSE} rg <- ref_grid(model) rg@misc$sigma ``` Finally, `sigma` may be a vector, as long as it is conformable with the estimates in the reference grid. This would be appropriate, for example, with a model fitted by `nlme::gls()` with some kind of non-homogeneous error structure. It may take some effort, as well as a clear understanding of the model and its structure, to obtain suitable SD estimates. It was suggested to me that the function `insight::get_variance()` may be helpful -- especially when working with an unfamiliar model class. Personally, I prefer to make sure I understand the structure of the model object and/or its summary to ensure I am not going astray. [Back to Contents](#contents) ## Feedlot example {#feedlot} To illustrate, consider the `feedlot` dataset provided with the package. Here we have several herds of feeder cattle that are sent to feed lots and given one of three diets. 
The weights of the cattle are measured at time of entry (`ewt`) and at time of slaughter (`swt`). Different herds have possibly different entry weights, based on breed and ranching practices, so we will center each herd's `ewt` measurements, then use that as a covariate in a mixed model: ```{r, message = FALSE} feedlot = transform(feedlot, adj.ewt = ewt - predict(lm(ewt ~ herd))) require(lme4) feedlot.lmer <- lmer(swt ~ adj.ewt + diet + (1|herd), data = feedlot) feedlot.rg <- ref_grid(feedlot.lmer, at = list(adj.ewt = 0)) summary(feedlot.rg) ## point predictions ``` Now, as advised, let's look at the SDs involved in this model: ```{r} lme4::VarCorr(feedlot.lmer) ## for the model feedlot.rg@misc$sigma ## default in the ref. grid ``` So the residual SD will be assumed in our prediction intervals if we don't specify something else. And we *do* want something else, because in order to predict the slaughter weight of an arbitrary animal, without regard to its herd, we need to account for the variation among herds too, which is seen to be considerable. The two SDs reported by `VarCorr()` are assumed to represent independent sources of variation, so they may be combined into a total SD using the Pythagorean Theorem. We will update the reference grid with the new value: ```{r} feedlot.rg <- update(feedlot.rg, sigma = sqrt(77.087^2 + 57.832^2)) ``` We are now ready to form prediction intervals. To do so, simply call the `predict()` function with an `interval` argument: ```{r} predict(feedlot.rg, interval = "prediction") ``` These results may also be displayed graphically: ```{r, fig.height = 2} plot(feedlot.rg, PIs = TRUE) ``` The inner intervals are confidence intervals, and the outer ones are the prediction intervals. Note that the SEs for prediction are considerably greater than the SEs for estimation in the original summary of `feedlot.rg`. Also, as a sanity check, observe that these prediction intervals cover about the same ground as the original data: ```{r} range(feedlot$swt) ``` By the way, we could have specified the desired `sigma` value as an additional `sigma` argument in the `predict()` call, rather than updating the `feedlot.rg` object. [Back to Contents](#contents) ## Predictions on particular strata {#strata} Suppose, in our example, we want to predict `swt` for one or more particular herds. Then the total SD we computed is not appropriate for that purpose, because that includes variation among herds. But more to the point, if we are talking about particular herds, then we are really regarding `herd` as a fixed effect of interest; so the expedient thing to do is to fit a different model where `herd` is a fixed effect: ```{r} feedlot.lm <- lm(swt ~ adj.ewt + diet + herd, data = feedlot) ``` So to predict slaughter weight for herds `9` and `19`: ```{r} newrg <- ref_grid(feedlot.lm, at = list(adj.ewt = 0, herd = c("9", "19"))) predict(newrg, interval = "prediction", by = "herd") ``` This is an instance where the default `sigma` was already correct (being the only error SD we have available). The SD value is comparable to the residual SD in the previous model, and the prediction SEs are smaller than those for predicting over all herds. [Back to Contents](#contents) ## Predictions with Bayesian models {#bayes} For models fitted using Bayesian methods, these kinds of prediction intervals are available only by forcing a frequentist analysis (`frequentist = TRUE`). 
However, a better and more flexible approach with Bayesian models is to simulate observations from the posterior predictive distribution. This is done via `as.mcmc()` and specifying a `likelihood` argument. An example is given in the ["sophisticated models" vignette](sophisticated.html#predict-mcmc). [Back to Contents](#contents) [Index of all vignette topics](vignette-topics.html) emmeans/inst/doc/FAQs.Rmd0000644000176200001440000005154714151001674014672 0ustar liggesusers--- title: "FAQs for emmeans" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{FAQs for emmeans} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, results = "hide", message = FALSE} require("emmeans") knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") options(show.signif.stars = FALSE) ``` This vignette contains answers to questions received from users or posted on discussion boards like [Cross Validated](https://stats.stackexchange.com) and [Stack Overflow](https://stackoverflow.com/) ## Contents {#contents} 1. [What are EMMs/lsmeans?](#what) 2. [What is the fastest way to obtain EMMs and pairwise comparisons?](#fastest) 2. [I wanted comparisons, but all I get is (nothing)](#nopairs) 2. [The model I fitted is not supported by **emmeans**](#qdrg) 2. [I have three (or two or four) factors that interact](#interactions) 3. [I have covariate(s) that interact(s) with factor(s)](#trends) 3. [I have covariate(s) and am fitting a polynomial model](#polys) 3. [Some "significant" comparisons have overlapping confidence intervals](#CIerror) 4. [All my pairwise comparisons have the same *P* value](#notfactor) 5. [emmeans() doesn't work as expected](#numeric) 6. [All or some of the results are NA](#NAs) 6. [If I analyze subsets of the data separately, I get different results](#model) 7. [My lsmeans/EMMs are way off from what I expected](#transformations) 8. [Why do I get `Inf` for the degrees of freedom?](#asymp) 10. [I get exactly the same comparisons for each "by" group](#additive) 11. [My ANOVA *F* is significant, but no pairwise comparisons are](#anova) 12. [I asked for a Tukey adjustments, but that's not what I got](#notukey) 12. [`emmeans()` completely ignores my P-value adjustments](#noadjust) 13. [`emmeans()` gives me pooled *t* tests, but I expected Welch's *t*](#nowelch) [Index of all vignette topics](vignette-topics.html) ## What are EMMs/lsmeans? {#what} Estimated marginal means (EMMs), a.k.a. least-squares means, are predictions on a reference grid of predictor settings, or marginal averages thereof. See details in [the "basics" vignette](basics.html). ## What is the fastest way to obtain EMMs and pairwise comparisons? {#fastest} There are two answers to this (i.e., be careful what you wish for): 1. Don't think; just fit the first model that comes to mind and run `emmeans(model, pairwise ~ treatment)`. This is the fastest way; however, the results have a good chance of being invalid. 2. *Do* think: Make sure you fit a model that really explains the responses. Do diagnostic residual plots, include appropriate interactions, account for heteroscadesticity if necessary, etc. This is the fastest way to obtain *appropriate* estimates and comparisons. The point here is that `emmeans()` summarizes the *model*, not the data directly. If you use a bad model, you will get bad results. And if you use a good model, you will get appropriate results. It's up to you: it's your research---is it important? 
[Back to Contents](#contents) ## I wanted comparisons, but all I get is (nothing) {#nopairs} This happens when you have only one estimate; and you can't compare it with itself! This is turn can happen when you have a situation like this: you have fitted ``` mod <- lm(RT ~ treat, data = mydata) ``` and `treat` is coded in your dataset with numbers 1, 2, 3, ... . Since `treat` is a numeric predictor, `emmeans()` just reduces it to a single number, its mean, rather than separate values for each treatment. Also, please note that this is almost certainly NOT the model you want, because it forces an assumption that the treatment effects all fall on a straight line. You should fit a model like ``` mod <- lm(RT ~ factor(treat), data = mydata) ``` then you will have much better luck with comparisons. ## The model I fitted is not supported by **emmeans** {#qdrg} You may still be able to get results using `qdrg()` (quick and dirty reference grid). See `?qdrg` for details and examples. ## I have three (or two or four) factors that interact {#interactions} Perhaps your question has to do with interacting factors, and you want to do some kind of *post hoc* analysis comparing levels of one (or more) of the factors on the response. Some specific versions of this question... * Perhaps you tried to do a simple comparison for one treatment and got a warning message you don't understand * You do pairwise comparisons of factor combinations and it's just too much -- want just some of them * How do I even approach this? My first answer is: plots almost always help. If you have factors A, B, and C, try something like `emmip(model, A ~ B | C)`, which creates an interaction-style plot of the predictions against B, for each A, with separate panels for each C. This will help visualize what effects stand out in a practical way. This can guide you in what post-hoc tests would make sense. See the ["interactions" vignette](interactions.html) for more discussion and examples. [Back to Contents](#contents) ## I have covariate(s) that interact(s) with factor(s) {#trends} This is a situation where it may well be appropriate to compare the slopes of trend lines, rather than the EMMs. See the `help("emtrends"()`")` and the discussion of this topic in [the "interactions" vignette](interactions.html#covariates) ## I have covariate(s) and am fitting a polynomial model {#polys} You need to be careful to define the reference grid consistently. For example, if you use covariates `x` and `xsq` (equal to `x^2`) to fit a quadratic curve, the default reference grid uses the mean of each covariate -- and `mean(xsq)` is usually not the same as `mean(x)^2`. So you need to use `at` to ensure that the covariates are set consistently with respect to the model. See [this subsection of the "basics" vignette](basics.html#depcovs) for an example. ## 3. Some "significant" comparisons have overlapping confidence intervals {#CIerror} That can happen because *it is just plain wrong to use [non-]overlapping CIs for individual means to do comparisons*. Look at the printed results from something like `emmeans(mymodel, pairwise ~ treatment)`. In particular, note that the `SE` values are *not* the same*, and may even have different degrees of freedom. Means are one thing statistically, and differences of means are quite another thing. Don't ever mix them up, and don't ever use a CI display for comparing means. I'll add that making hard-line decisions about "significant" and "non-significant" is in itself a poor practice. 
See [the discussion in the "basics" vignette](basics.html#pvalues).

## All my pairwise comparisons have the same *P* value {#notfactor}
This will happen if you fitted a model where the treatments you want to compare were put in as a numeric predictor; for example `dose`, with values of 1, 2, and 3. If `dose` is modeled as numeric, you will be fitting a linear trend in those dose values, rather than a model that allows those doses to differ in arbitrary ways. Go back and fit a different model using `factor(dose)` instead; it will make all the difference. This is closely related to the next topic.

## emmeans() doesn't work as expected {#numeric}
Equivalently, users ask how to get *post hoc* comparisons when we have covariates rather than factors. Yes, it does work, but you have to tell it the appropriate reference grid.

But before saying more, I have a question for you: *Are you sure your model is meaningful?*

  * If your question concerns *only* two-level predictors such as `sex` (coded 1 for female, 2 for male), no problem. The model will produce the same predictions as you'd get if you'd used these as factors.
  * If *any* of the predictors has 3 or more levels, you may have fitted a nonsense model, in which case you need to fit a different model that does make sense before doing any kind of *post hoc* analysis. For instance, if the model contains a covariate `brand` (coded 1 for Acme, 2 for Ajax, and 3 for Al's), it implies that the difference between Acme and Ajax is exactly equal to the difference between Ajax and Al's, because a linear trend in `brand` has been fitted. If you had instead coded 1 for Ajax, 2 for Al's, and 3 for Acme, the model would produce different fitted values. Ask yourself if it makes sense to have `brand = 2.319`. If not, you need to fit another model using `factor(brand)` in place of `brand`.

Assuming that the appropriateness of the model is settled, the current version of **emmeans** automatically casts two-value covariates as factors, but not covariates having higher numbers of unique values. Suppose your model has a covariate `dose` which was experimentally varied over four levels, but can sensibly be interpreted as a numerical predictor. If you want to include the separate values of `dose` rather than the mean `dose`, you can do that using something like `emmeans(model, "dose", at = list(dose = 1:4))`, or `emmeans(model, "dose", cov.keep = "dose")`, or `emmeans(model, "dose", cov.keep = "4")`. There are small differences between these. The last one regards any covariate having 4 or fewer unique values as a factor.

See "altering the reference grid" in the ["basics" vignette](basics.html#altering) for more discussion.

[Back to Contents](#contents)

## All or some of the results are NA {#NAs}
The **emmeans** package uses tools in the **estimability** package to determine whether its results are uniquely estimable. For example, in a two-way model with interactions included, if there are no observations in a particular cell (factor combination), then we cannot estimate the mean of that cell. When *some* of the EMMs are estimable and others are not, that tells you there is missing information in the data. If it's possible to remove some terms from the model (particularly interactions), that may make more things estimable if you re-fit with those terms excluded; but don't delete terms that are really needed for the model to fit well. When *all* of the estimates are non-estimable, it could be symptomatic of something else.
Some possibilities include:

  * An overly ambitious model; for example, in a Latin square design, interaction effects are confounded with main effects; so if any interactions are included in the model, you will render main effects inestimable.
  * Possibly you have a nested structure that needs to be included in the model or specified via the `nesting` argument. Perhaps the levels that B can have depend on which level of A is in force. Then B is nested in A and the model should specify `A + A:B`, with no main effect for `B`.
  * Modeling factors as numeric predictors (see also the [related section on covariates](#numeric)). To illustrate, suppose you have data on particular state legislatures, and the model includes the predictors `state_name` as well as `dem_gov` which is coded 1 if the governor is a Democrat and 0 otherwise. If the model was fitted with `state_name` as a factor or character variable, but `dem_gov` as a numeric predictor, then, chances are, `emmeans()` will return non-estimable results. If instead you use `factor(dem_gov)` in the model, then the fact that `state_name` is nested in `dem_gov` will be detected, causing EMMs to be computed separately for each party's states, thus making things estimable.
  * Some other things may in fact be estimable. For illustration, it's easy to construct an example where all the EMMs are non-estimable, but pairwise comparisons are estimable:

```{r}
pg <- transform(pigs, x = rep(1:3, c(10, 10, 9)))
pg.lm <- lm(log(conc) ~ x + source + factor(percent), data = pg)
emmeans(pg.lm, consec ~ percent)
```

The ["messy-data" vignette](messy-data.html) has more examples and discussion.

[Back to Contents](#contents)

## If I analyze subsets of the data separately, I get different results {#model}
Estimated marginal means summarize the *model* that you fitted to the data -- not the data themselves. Many of the most common models rely on several simplifying assumptions -- that certain effects are linear, that the error variance is constant, etc. -- and those assumptions are passed forward into the `emmeans()` results. Doing separate analyses on subsets usually amounts to departing from that overall model, so of course the results are different.

## My lsmeans/EMMs are way off from what I expected {#transformations}
First step: Carefully read the annotations below the output. Do they say something like "results are on the log scale, not the response scale"? If so, that explains it. A Poisson or logistic model involves a link function, and by default, `emmeans()` produces its results on that same scale. You can add `type = "response"` to the `emmeans()` call and it will put the results on the scale you expect. But that is not always the best approach. The ["transformations" vignette](transformations.html) has examples and discussion.

## Why do I get `Inf` for the degrees of freedom? {#asymp}
This is simply the way that **emmeans** labels asymptotic results (that is, estimates that are tested against the standard normal distribution -- *z* tests -- rather than the *t* distribution). Note that obtaining quantiles or probabilities from the *t* distribution with infinite degrees of freedom is the same as obtaining the corresponding values from the standard normal. For example:
```{r}
qt(c(.9, .95, .975), df = Inf)
qnorm(c(.9, .95, .975))
```
so when you see infinite d.f., that just means it's a *z* test or a *z* confidence interval.
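If you would rather see *t*-based results, the `df` argument to `summary()` and relatives lets you impose a fixed degrees of freedom (a sketch; `my.emm` and the value 30 are hypothetical):
```
summary(my.emm, df = 30)   # t tests/CIs with 30 d.f. in place of the z versions
```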
[Back to Contents](#contents)

## I get exactly the same comparisons for each "by" group {#additive}
As mentioned elsewhere, EMMs summarize a *model*, not the data. If your model does not include any interactions between the `by` variables and the factors for which you want EMMs, then by definition, the effects for the latter will be exactly the same regardless of the `by` variable settings. So of course the comparisons will all be the same. If you think they should be different, then you are saying that your model should include interactions between the factors of interest and the `by` factors.

## My ANOVA *F* is significant, but no pairwise comparisons are {#anova}
First of all, you should not be making binary decisions of "significant" or "nonsignificant." This is a simplistic view of *P* values that assigns an unmerited magical quality to the value 0.05. It is suggested that you just report the *P* values actually obtained, and let your readers decide how significant your findings are in their scientific context.

But to answer the question: This is a common misunderstanding of ANOVA. If *F* has a particular *P* value, this implies only that *some contrast* among the means (or effects) has the same *P* value, after applying the Scheffe adjustment. That contrast may be very much unlike a pairwise comparison, especially when there are several means being compared. Having an *F* statistic with a *P* value of, say, 0.06, does *not* imply that any pairwise comparison will have a *P* value of 0.06 or smaller. Again referring to the paragraph above, just report the *P* value for each pairwise comparison, and don't try to relate them to the *F* statistic.

Another consideration is that by default, *P* values for pairwise comparisons are adjusted using the Tukey method, and the adjusted *P* values can be quite a bit larger than the unadjusted ones. (But I definitely do *not* advocate using no adjustment to "repair" this problem.)

## I asked for Tukey adjustments, but that's not what I got {#notukey}
There are two reasons this could happen:

  1. There is only one comparison in each `by` group (see next topic).
  2. A Tukey adjustment is inappropriate. The Tukey adjustment is appropriate for pairwise comparisons of means. When you have some other set of contrasts, the Tukey method is deemed unsuitable and the Sidak method is used instead. A suggestion is to use `"mvt"` adjustment (which is exact); we don't default to this because it can require a lot of computing time for a large set of contrasts or comparisons.

## `emmeans()` completely ignores my P-value adjustments {#noadjust}
This happens when there are only two means (or only two in each `by` group). Thus there is only one comparison. When there is only one thing to test, there is no multiplicity issue, and hence no multiplicity adjustment to the *P* values.

If you wish to apply a *P*-value adjustment to all tests across all groups, you need to null-out the `by` variable and summarize, as in the following:
```r
EMM <- emmeans(model, ~ treat | group)   # where treat has 2 levels
pairs(EMM, adjust = "sidak")   # adjustment is ignored - only 1 test per group
summary(pairs(EMM), by = NULL, adjust = "sidak")   # all are in one group now
```
Note that if you put `by = NULL` *inside* the call to `pairs()`, then this causes all `treat`,`group` combinations to be compared.
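For instance (a sketch continuing the code above):
```r
pairs(EMM, by = NULL, adjust = "sidak")   # all treat:group combinations compared,
                                          # not just treat within each group
```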
[Back to Contents](#contents) ## `emmeans()` gives me pooled *t* tests, but I expected Welch's *t* {#nowelch} It is important to note that `emmeans()` and its relatives produce results based on the *model object* that you provide -- not the data. So if your sample SDs are wildly different, a model fitted using `lm()` or `aov()` is not a good model, because those R functions use a statistical model that presumes that the errors have constant variance. That is, the problem isn't in `emmeans()`, it's in handing it an inadequate model object. Here is a simple illustrative example. Consider a simple one-way experiment and the following model: ``` mod1 <- aov(response ~ treat, data = mydata) emmeans(mod1, pairwise ~ treat) ``` This code will estimate means and comparisons among treatments. All standard errors, confidence intervals, and *t* statistics are based on the pooled residual SD with *N - k* degrees of freedom (assuming *N* observations and *k* treatments). These results are useful *only* if the underlying assumptions of `mod1` are correct -- including the assumption that the error SD is the same for all treatments. Alternatively, you could fit the following model using generalized least-squares: ``` mod2 = nlme::gls(response ~ treat, data = mydata, weights = varIdent(form = ~1 | treat)) emmeans(mod2, pairwise ~ treat) ``` This model specifies that the error variance depends on the levels of `treat`. This would be a much better model to use when you have wildly different sample SDs. The results of the `emmeans()` call will reflect this improvement in the modeling. The standard errors of the EMMs will depend on the individual sample variances, and the *t* tests of the comparisons will be in essence the Welch *t* statistics with Satterthwaite degrees of freedom. To obtain appropriate *post hoc* estimates, contrasts, and comparisons, one must first find a model that successfully explains the peculiarities in the data. This point cannot be emphasized enough. If you give `emmeans()` a good model, you will obtain correct results; if you give it a bad model, you will obtain incorrect results. Get the model right *first*. [Back to Contents](#contents) [Index of all vignette topics](vignette-topics.html) emmeans/inst/doc/xplanations.Rmd0000644000176200001440000003421414137062735016441 0ustar liggesusers--- title: "Explanations supplement" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{Explanations supplement} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, results = "hide", message = FALSE} require("emmeans") knitr::opts_chunk$set(fig.width = 4.5, fig.height = 2.0, class.output = "ro", class.message = "re", class.error = "re", class.warning = "re") ###knitr::opts_chunk$set(fig.width = 4.5, fig.height = 2.0) ``` This vignette provides additional documentation for some methods implemented in the **emmeans** package. [Index of all vignette topics](vignette-topics.html) ## Contents {#contents} 1. [Sub-models](#submodels) 2. [Comparison arrows](#arrows) ## Sub-models {#submodels} Estimated marginal means (EMMs) and other statistics computed by the **emmeans** package are *model-based*: they depend on the model that has been fitted to the data. In this section we discuss a provision whereby a different underlying model may be considered. 
The `submodel` option in `update()` can project EMMs and other statistics to an alternative universe where a simpler version of the model has been fitted to the data. Another way of looking at this is that it constrains certain external effects to be zero -- as opposed to averaging over them as is otherwise done for marginal means. Two things to know before getting into details: 1. The `submodel` option uses information from the fixed-effects portion of the model matrix 2. Not all model classes are supported for the `submodel` option. Now some details. Suppose that we have a fixed-effects model matrix $X$, and let $X_1$ denote a sub-matrix of $X$ whose columns correspond to a specified sub-model. (Note: if there are weights, use $X = W^{1/2}X^*$, where $X^*$ is the model matrix without the weights incorporated.) The trick we use is what is called the *alias matrix*: $A = (X_1'X_1)^-X_1'X$ where $Z^-$ denotes a generalized inverse of $Z$. It can be shown that $(X_1'X_1)^-X_1' = A(X'X)^-X'$; thus, in an ordinary fixed-effects regression model, $b_1 = Ab$ where $b_1$ and $b$ denote the regression coefficients for the sub-model and full model, respectively. Thus, given a matrix $L$ such that $Lb$ provides estimates of interest for the full model, the corresponding estimates for the sub-model are $L_1b_1$, where $L_1$ is the sub-matrix of $L$ consisting of the columns corresponding to the columns of $X_1$. Moreover, $L_1b_1 = L_1(Ab) = (L_1A)b$; that is, we can replace $L$ by $L_1A$ to obtain estimates from the sub-model. That's all that `update(..., submodel = ...)` does. Here are some intuitive observations: 1. Consider the excluded effects, $X_2$, consisting of the columns of $X$ other than $X_1$. The corresponding columns of the alias matrix are regression coefficients treating $X_2$ as the response and $X_1$ as the predictors. 2. Thus, when we obtain predictions via these aliases, we are predicting the effects of $X_2$ based on $X_1$. 3. The columns of the new linear predictor $\tilde L = L_1A$ depend only on the columns of $L_1$, and hence not on other columns of $L$. These three points provide three ways of saying nearly the same thing, namely that we are excluding the effects in $X_2$. Note that in a rank-deficient situation, there are different possible generalized inverses, and so in (1), $A$ is not unique. However, the predictions in (2) are unique. In ordinary regression models, (1), (2), and (3) all apply and will be the same as predictions from re-fitting the model with model matrix $X_1$; however, in generalized linear models, mixed models, etc., re-fitting will likely produce somewhat different results. That is because fitting such models involves iterative weighting, and the re-fitted models will probably not have the same weights. However, point (3) will still hold: the predictions obtained with a submodel will involve only the columns of $L_1$ and hence constrain all effects outside of the sub-model to be zero. Therefore, when it really matters to get the correct estimates from the stated sub-model, the user should actually fit that sub-model unless the full model is an ordinary linear regression. A technicality: Most writers define the alias matrix as $(X_1'X_1)^-X_1'X_2$, where $X_2$ denotes that part of $X$ that excludes the columns of $X_1$. We are including all columns of $X$ here just because it makes the notation very simple; the $X_1$ portion of $X$ just reduces to the identity (at least in the case where $X_1$ is full-rank). 
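To make the algebra concrete, here is a small numerical check of the identity $b_1 = Ab$ in an ordinary regression, using the standard `mtcars` data (just an illustrative sketch; this is not how the package organizes the computation):
```{r}
X  <- model.matrix(~ factor(cyl) + wt, data = mtcars)   # full model matrix
X1 <- X[, 1:3]                                          # columns for sub-model ~ factor(cyl)
b  <- solve(crossprod(X), crossprod(X, mtcars$mpg))     # full-model coefficients
A  <- solve(crossprod(X1), crossprod(X1, X))            # alias matrix (X1'X1)^- X1'X
b1 <- solve(crossprod(X1), crossprod(X1, mtcars$mpg))   # sub-model coefficients
all.equal(as.numeric(A %*% b), as.numeric(b1))          # TRUE
```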
A word on computation: Like many matrix expressions, we do not compute $A$ directly as shown. Instead, we use the QR decomposition of $X_1$, obtainable via the R call `Z <- qr(X1)`. Then the alias matrix is computed via `A <- qr.coef(Z, X)`. In fact, nothing changes if we use just the $R$ portion of $X = QR$, saving us both memory and computational effort. The exported function `.cmpMM()` extracts this $R$ matrix, taking care of any pivoting that might have occurred. And in an `lm` object, the QR decomposition of $X$ is already saved as a slot. The `qr.coef()` function works just fine in both the full-rank and rank-deficient cases, but in the latter situation, some elements of `A` will be `NA`; those correspond to "excluded" predictors, but that is another way of saying that we are constraining their regression coefficients to be zero. Thus, we can easily clean that up via `A[is.na(A)] <- 0`.

If we specify `submodel = "minimal"`, the software figures out the sub-model by extracting terms involving only factors that have not already been averaged over. If the user specifies `submodel = "type2"`, an additional step is performed: Let $X_1$ have only the highest-order effect in the minimal model, and let $X_0$ denote the matrix of all columns of $X$ that do not contain the effect in $X_1$. We then replace $Z$ by the QR decomposition of $[I - X_0(X_0'X_0)^-X_0']X_1$. This projects $X_1$ onto the null space of $X_0$. The net result is that we obtain estimates of just the $X_1$ effects, after adjusting for all effects that don't contain it (including the intercept if present). Such estimates have very limited use in data description, but provide a kind of "Type II" analysis when used in conjunction with `joint_tests()`. The `"type2"` calculations parallel those [documented by SAS](https://documentation.sas.com/?docsetId=statug&docsetTarget=statug_introglmest_sect011.htm&docsetVersion=14.3&locale=en) for obtaining type II estimable functions in SAS `PROC GLM`. However, we (as well as `car::Anova()`) define "contained" effects differently from SAS, treating covariates no differently than factors.

### A note on multivariate models {#mult.submodel}

Recall that **emmeans** generates a constructed factor for the levels of a multivariate response. That factor (or factors) is completely ignored in any sub-model calculations. The $X$ and $X_1$ matrices described above involve only the predictors in the right-hand side of the model equation. The multivariate response "factor" implicitly interacts with everything in the right-hand-side model; and the same is true of any sub-model. So it is not possible to consider sub-models where terms are omitted from among those multivariate interactions (note that it is also impossible to fit a multivariate sub-model that excludes those interactions). The only way to remove consideration of multivariate effects is to average over them via a call to `emmeans()`.

[Back to Contents](#contents)

## Comparison arrows {#arrows}

The `plot()` method for `emmGrid` objects offers the option `comparisons = TRUE`. If used, the software attempts to construct "comparison arrows" whereby two estimated marginal means (EMMs) differ significantly if, and only if, their respective comparison arrows do not overlap. In this section, we explain how these arrows are obtained.

First, please understand these comparison arrows are decidedly *not* the same as confidence intervals.
Confidence intervals for EMMs are based on the statistical properties of the individual EMMs, whereas comparison arrows are based on the statistical properties of *differences* of EMMs. Let the EMMs be denoted $m_1, m_2, ..., m_k$. For simplicity, let us assume that these are ordered: $m_1 \le m_2 \le \cdots \le m_k$. Let $d_{ij} = m_j - m_i$ denote the difference between the $i$th and $j$th EMM. Then the $(1 - \alpha)$ confidence interval for the true difference $\delta_{ij} = \mu_j - \mu_i$ is $$ d_{ij} - e_{ij}\quad\mbox{to}\quad d_{ij} + e_{ij} $$ where $e_{ij}$ is the "margin of error" for the difference; i.e., $e_{ij} = t\cdot SE(d_{ij})$ for some critical value $t$ (equal to $t_{\alpha/2}$ when no multiplicity adjustment is used). Note that $d_{ij}$ is statistically significant if, and only if, $d_{ij} > e_{ij}$. Now, how to get the comparison arrows? These arrows are plotted with origins at the $m_i$; we have an arrow of length $L_i$ pointing to the left, and an arrow of length $R_i$ pointing to the right. To compare EMMs $m_i$ and $m_j$ (and remembering that we are supposing that $m_i \le m_j$), we propose to look to see if the arrows extending right from $m_i$ and left from $m_j$ overlap or not. So, ideally, if we want overlap to be identified with statistical non-significance, we want $$ R_i + L_j = e_{ij} \quad\mbox{for all } i < j $$ If we can do that, then the two arrows will overlap if, and only if, $d_{ij} < e_{ij}$. This is easy to accomplish if all the $e_{ij}$ are equal: just set all $L_i = R_j = \frac12e_{12}$. But with differing $e_{ij}$ values, it may or may not even be possible to obtain suitable arrow lengths. The code in **emmeans** uses an *ad hoc* weighted regression method to solve the above equations. We give greater weights to cases where $d_{ij}$ is close to $e_{ij}$, because those are the cases where it is more critical that we get the lengths of the arrows right. Once the regression equations are solved, we test to make sure that $R_i + L_j < d_{ij}$ when the difference is significant, and $\ge d_{ij}$ when it is not. If one or more of those checks fails, a warning is issued. That's the essence of the algorithm. Note, however, that there are a few complications that need to be handled: * For the lowest EMM $m_1$, $L_1$ is completely arbitrary because there are no right-pointing arrows with which to compare it; in fact, we don't even need to display that arrow. The same is true of $R_k$ for the largest EMM $m_k$. Moreover, there could be additional unneeded arrows when other $m_i$ are equal to $m_1$ or $m_k$. * Depending on the number $k$ of EMMs and the number of tied minima and maxima, the system of equations could be under-determined, over-determined, or just right. * It is possible that the solution could result in some $L_i$ or $R_j$ being negative. That would result in an error. In summary, the algorithm does not always work (in fact it is possible to construct cases where no solution is possible). But we try to do the best we can. The main reason for trying to do this is to encourage people to not ever use confidence intervals for the $m_i$ as a means of testing the comparisons $d_{ij}$. That is almost always incorrect. What is better yet is to simply avoid using comparison arrows altogether and use `pwpp()` or `pwpm()` to display the *P* values directly. 
### Examples and tests Here is a constructed example with specified means and somewhat unequal SEs ```{r, message = FALSE} m = c(6.1, 4.5, 5.4, 6.3, 5.5, 6.7) se2 = c(.3, .4, .37, .41, .23, .48)^2 lev = list(A = c("a1","a2","a3"), B = c("b1", "b2")) foo = emmobj(m, diag(se2), levels = lev, linfct = diag(6)) plot(foo, CIs = FALSE, comparisons = TRUE) ``` This came out pretty well. But now let's keep the means and SEs the same but make them correlated. Such correlations happen, for example, in designs with subject effects. The function below is used to set a specified intra-class correlation, treating `A` as a within-subjects (or split-plot) factor and `B` as a between-subjects (whole-plot) factor. We'll start with a correlation of 0.3. ```{r, message = FALSE} mkmat <- function(V, rho = 0, indexes = list(1:3, 4:6)) { sd = sqrt(diag(V)) for (i in indexes) V[i,i] = (1 - rho)*diag(sd[i]^2) + rho*outer(sd[i], sd[i]) V } # Intraclass correlation = 0.3 foo3 = foo foo3@V <- mkmat(foo3@V, 0.3) plot(foo3, CIs = FALSE, comparisons = TRUE) ``` Same with intraclass correlation of 0.6: ```{r, message = FALSE} foo6 = foo foo6@V <- mkmat(foo6@V, 0.6) plot(foo6, CIs = FALSE, comparisons = TRUE) ``` Now we have a warning that some arrows don't overlap, but should. We can make it even worse by upping the correlation to 0.8: ```{r, message = FALSE, error = TRUE} foo8 = foo foo8@V <- mkmat(foo8@V, 0.8) plot(foo8, CIs = FALSE, comparisons = TRUE) ``` Now the solution actually leads to negative arrow lengths. What is happening here is we are continually reducing the SE of within-B comparisons while keeping the others the same. These all work out if we use `B` as a `by` variable: ```{r, message = FALSE} plot(foo8, CIs = FALSE, comparisons = TRUE, by = "B") ``` Note that the lengths of the comparison arrows are relatively equal within the levels of `B`. Or, we can use `pwpp()` or `pwpm()` to show the *P* values for all comparisons among the six means: ```{r} pwpp(foo6, sort = FALSE) pwpm(foo6) ``` [Back to Contents](#contents) [Index of all vignette topics](vignette-topics.html)emmeans/inst/doc/vignette-topics.html0000644000176200001440000011776614165066774017475 0ustar liggesusers Index of vignette topics

[The file `vignette-topics.html` is an alphabetized index (A through Z) of topics in all **emmeans** vignettes, version 1.7.2, generated by the vigindex package; its HTML navigation markup is omitted here.]

emmeans/inst/doc/messy-data.Rmd0000644000176200001440000005563414137062735016155 0ustar liggesusers---
title: "Working with messy data"
author: "emmeans package, Version `r packageVersion('emmeans')`"
output: emmeans::.emm_vignette
vignette: >
  %\VignetteIndexEntry{Working with messy data}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, echo = FALSE, results = "hide", message = FALSE}
require("emmeans")
require("ggplot2")
options(show.signif.stars = FALSE)
knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro")
```

## Contents {#contents}

 1. [Issues with observational data](#issues)
 2. [Mediating covariates](#mediators)
 3. [Mediating factors and weights](#weights)
 4. [Nuisance factors](#nuisance)
 5. [Sub-models](#submodels)
 6. [Nested fixed effects](#nesting)
    a. [Avoiding mis-identified nesting](#nest-trap)

[Index of all vignette topics](vignette-topics.html)

## Issues with observational data {#issues}
In experiments, we control the conditions under which observations are made. Ideally, this leads to balanced datasets and clear inferences about the effects of those experimental conditions. In observational data, factor levels are observed rather than controlled, and in the analysis we control *for* those factors and covariates. It is possible that some factors and covariates lie in the causal path for other predictors. Observational studies can be designed in ways to mitigate some of these issues; but often we are left with a mess. Using EMMs does not solve the inherent problems in messy, undesigned studies; but EMMs do give us ways to compensate for imbalance in the data, and allow us to estimate meaningful effects after carefully considering the ways in which they can be confounded.

####### {#nutrex}
As an illustration, consider the `nutrition` dataset provided with the package. These data are used as an example in Milliken and Johnson (1992), *Analysis of Messy Data*, and contain the results of an observational study on nutrition education. Low-income mothers are classified by race, age category, and whether or not they received food stamps (the `group` factor); and the response variable is a gain score (post minus pre scores) after completing a nutrition training program. First, let's fit a model that includes all main effects and 2-way interactions, and obtain its "type II" ANOVA:
```{r}
nutr.lm <- lm(gain ~ (age + group + race)^2, data = nutrition)
car::Anova(nutr.lm)
```
There is definitely a `group` effect and a hint of an interaction with `race`. Here are the EMMs for those two factors, along with their counts:
```{r}
emmeans(nutr.lm, ~ group * race, calc = c(n = ".wgt."))
```

####### {#nonestex}
Hmmmm. The EMMs when `race` is "Hispanic" are not given; instead they are flagged as non-estimable. What does that mean? Well, when using a model to make predictions, it is impossible to do that beyond the linear space of the data used to fit the model. And we have no data for three of the age groups in the Hispanic population:
```{r}
with(nutrition, table(race, age))
```
We can't make predictions for all the cases we are averaging over in the above EMMs, and that is why some of them are non-estimable. The bottom line is that we simply cannot include Hispanics in the mix when comparing factor effects. That's a limitation of this study that cannot be overcome without collecting additional data. Our choices for further analysis are to focus only on Black and White populations; or to focus only on age group 3.
For example (the latter):
```{r}
summary(emmeans(nutr.lm, pairwise ~ group | race, at = list(age = "3")),
        by = NULL)
```
(We used trickery with providing a `by` variable, and then taking it away, to make the output more compact.) Evidently, the training program has been beneficial to the Black and White groups in that age category. There is no conclusion for the Hispanic group -- for which we have very little data.

[Back to Contents](#contents)

## Mediating covariates {#mediators}

The `framing` data in the **mediation** package has the results of an experiment conducted by Brader et al. (2008) where subjects were given the opportunity to send a message to Congress regarding immigration. However, before being offered this, some subjects (`treat = 1`) were first shown a news story that portrays Latinos in a negative way. Besides the binary response (whether or not they elected to send a message), the experimenters also measured `emo`, the subjects' emotional state after the treatment was applied. There are various demographic variables as well. Let's fit a logistic regression model, after changing the labels for `educ` to shorter strings.
```{r}
framing <- mediation::framing
levels(framing$educ) <- c("NA","Ref","< HS", "HS", "> HS","Coll +")
framing.glm <- glm(cong_mesg ~ age + income + educ + emo + gender * factor(treat),
                   family = binomial, data = framing)
```
The conventional way to handle covariates like `emo` is to set them at their means and use those means for purposes of predictions and EMMs. These adjusted means are shown in the following plot.
```{r}
emmip(framing.glm, treat ~ educ | gender, type = "response")
```
This plot gives the impression that the effect of `treat` is reversed between male and female subjects; and also that the effect of education is not monotone. Both of these are counter-intuitive.

###### {#med.covred}
However, note that the covariate `emo` is measured *post*-treatment. That suggests that in fact `treat` (and perhaps other factors) could affect the value of `emo`; and if that is true (as is in fact established by mediation analysis techniques), we should not pretend that `emo` can be set independently of `treat` as was done to obtain the EMMs shown above. Instead, let `emo` depend on `treat` and the other predictors -- easily done using `cov.reduce` -- and we obtain an entirely different impression:
```{r}
emmip(framing.glm, treat ~ educ | gender, type = "response",
      cov.reduce = emo ~ treat*gender + age + educ + income)
```
The reference grid underlying this plot has different `emo` values for each factor combination. The plot suggests that, after taking emotional response into account, male (but not female) subjects exposed to the negative news story are more likely to send the message than are females or those not seeing the negative news story. Also, the effect of `educ` is now nearly monotone.

###### {#adjcov}
By the way, the results in this plot are the same as what you would obtain by refitting the model with an adjusted covariate
```{r eval = FALSE}
emo.adj <- resid(lm(emo ~ treat*gender + age + educ + income, data = framing))
```
... and then using ordinary covariate-adjusted means at the means of `emo.adj`. This is a technique that is often recommended.
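To spell that recipe out (a sketch, not evaluated here; `framing2` and `framing2.glm` are hypothetical names, and if the claim above holds, the final plot should match the previous one):
```{r eval = FALSE}
framing2 <- transform(framing,
    emo.adj = resid(lm(emo ~ treat*gender + age + educ + income, data = framing)))
framing2.glm <- glm(cong_mesg ~ age + income + educ + emo.adj + gender * factor(treat),
                    family = binomial, data = framing2)
emmip(framing2.glm, treat ~ educ | gender, type = "response")
```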
If there is more than one mediating covariate, their settings may be defined in sequence; for example, if `x1`, `x2`, and `x3` are all mediating covariates, we might use
```{r eval = FALSE}
emmeans(..., cov.reduce = list(x1 ~ trt, x2 ~ trt + x1, x3 ~ trt + x1 + x2))
```
(or possibly with some interactions included as well).

[Back to Contents](#contents)

## Mediating factors and weights {#weights}

A mediating covariate is one that is in the causal path; likewise, it is possible to have a mediating *factor*. For mediating factors, the moral equivalent of the `cov.reduce` technique described above is to use *weighted* averages in lieu of equally-weighted ones in computing EMMs. The weights used in these averages should depend on the frequencies of mediating factor(s). Usually, the `"cells"` weighting scheme described later in this section is the right approach. In complex situations, it may be necessary to compute EMMs in stages.

As described in [the "basics" vignette](basics.html#emmeans), EMMs are usually defined as *equally-weighted* means of reference-grid predictions. However, there are several built-in alternative weighting schemes that are available by specifying a character value for `weights` in a call to `emmeans()` or related function. The options are `"equal"` (the default), `"proportional"`, `"outer"`, `"cells"`, and `"flat"`.

The `"proportional"` (or `"prop"` for short) method weights proportionally to the frequencies (or model weights) of each factor combination that is averaged over. The `"outer"` method uses the outer product of the marginal frequencies of each factor that is being averaged over. To explain the distinction, suppose the EMMs for `A` involve averaging over two factors `B` and `C`. With `"prop"`, we use the frequencies for each combination of `B` and `C`; whereas for `"outer"`, we first obtain the marginal frequencies for `B` and for `C`, and then weight proportionally to the product of these for each combination of `B` and `C`. The latter weights are like the "expected" counts used in a chi-square test for independence. Put another way, outer weighting is the same as proportional weighting applied one factor at a time; the following two would yield the same results:
```{r eval = FALSE}
emmeans(model, "A", weights = "outer")
emmeans(emmeans(model, c("A", "B"), weights = "prop"), weights = "prop")
```

Using `"cells"` weights gives each prediction the same weight as occurs in the model; applied to a reference grid for a model with all interactions, `"cells"`-weighted EMMs are the same as the ordinary marginal means of the data. With `"flat"` weights, equal weights are used, except zero weight is applied to any factor combination having no data. Usually, `"cells"` or `"flat"` weighting will *not* produce non-estimable results, because we exclude empty cells. (That said, if covariates are linearly dependent with factors, we may still encounter non-estimable cases.)

Here is a comparison of predictions for `nutr.lm` defined [above](#issues), using different weighting schemes:
```{r message = FALSE}
sapply(c("equal", "prop", "outer", "cells", "flat"), function(w)
    predict(emmeans(nutr.lm, ~ race, weights = w)))
```
On the other hand, if we do `group * race` EMMs, only one factor (`age`) is averaged over; thus, the results for `"prop"` and `"outer"` weights will be identical in that case.

[Back to Contents](#contents)

## Nuisance factors {#nuisance}

Consider a situation where we have a model with 15 factors, each at 5 levels.
Regardless of how simple or complex the model is, the reference grid consists of all combinations of these factors -- and there are $5^{15}$ of these, or over 30 billion. If there are, say, 100 regression coefficients in the model, then just the `linfct` slot in the reference grid requires $100\times5^{15}\times8$ bytes of storage, or almost 23,000 gigabytes. Suppose in addition the model has a multivariate response with 5 levels. That multiplies *both* the rows and columns in `linfct`, increasing the storage requirements by a factor of 25. Either way, your computer can't store that much -- so this definitely qualifies as a messy situation!

The `ref_grid()` function now provides some relief, in the way of specifying some of the factors as "nuisance" factors. The reference grid is then constructed with those factors already averaged-out. So, for example with the same scenario, if only three of those 15 factors are of primary interest, and we specify the other 12 as nuisance factors to be averaged, that leaves us with only $5^3=125$ rows in the reference grid, and hence $125\times100\times8=100{,}000$ bytes of storage required for `linfct`. If there is a 5-level multivariate response, we'll have 625 rows in the reference grid and $25\times100{,}000=2{,}500{,}000$ bytes in `linfct`. Suddenly a horribly unmanageable situation becomes quite manageable!

But of course, there is a restriction: nuisance factors must not interact with any other factors -- not even other nuisance factors. And a multivariate response (or an implied multivariate response, e.g., in an ordinal model) can never be a nuisance factor. Under that condition, the average effects of a nuisance factor are the same regardless of the levels of other factors, making it possible to pre-average them by considering just one case.

We specify nuisance factors by listing their names in a `nuisance` argument to `ref_grid()` (in `emmeans()`, this argument is passed to `ref_grid()`). Often, it is much more convenient to give the factors that are *not* nuisance factors, via a `non.nuisance` argument. If you do specify a nuisance factor that does interact with others, or doesn't exist, it is quietly excluded from the nuisance list.

###### {#nuis.example}
Time for an example. Consider the `mtcars` dataset standard in R, and the model
```{r}
mtcars.lm <- lm(mpg ~ factor(cyl)*am + disp + hp + drat + log(wt) + vs +
                    factor(gear) + factor(carb), data = mtcars)
```
And let's construct two different reference grids:
```{r}
rg.usual <- ref_grid(mtcars.lm)
rg.usual
nrow(rg.usual@linfct)
rg.nuis = ref_grid(mtcars.lm, non.nuisance = "cyl")
rg.nuis
nrow(rg.nuis@linfct)
```
Notice that we left `am` out of `non.nuisance` and hence included it in `nuisance`. However, it interacts with `cyl`, so it was not allowed as a nuisance factor. But `rg.nuis` requires 1/36 as much storage. There's really nothing else to show, other than to demonstrate that we get the same EMMs either way, with slightly different annotations:
```{r}
emmeans(rg.usual, ~ cyl * am)
emmeans(rg.nuis, ~ cyl * am)
```

By default, the pre-averaging is done with equal weights. If we specify `wt.nuis` as anything other than `"equal"`, they are averaged proportionally. As described above, this really amounts to `"outer"` weights since they are averaged separately.
Let's try it to see how the estimates differ:
```{r}
predict(emmeans(mtcars.lm, ~ cyl * am, non.nuis = c("cyl", "am"),
                wt.nuis = "prop"))
predict(emmeans(mtcars.lm, ~ cyl * am, weights = "outer"))
```
These are the same as each other, but different from the equally-weighted EMMs we obtained before.

By the way, to help make things consistent, if `weights` is character, `emmeans()` passes `wt.nuis = weights` to `ref_grid` (if it is called), unless `wt.nuis` is also specified.

There is a trick to get `emmeans` to use the smallest possible reference grid: Pass the `specs` argument to `ref_grid()` as `non.nuisance`. But we have to quote it to delay evaluation, and also use `all.vars()` if (and only if) `specs` is a formula:
```{r}
emmeans(mtcars.lm, ~ gear | am, non.nuis = quote(all.vars(specs)))
```
Observe that `cyl` was passed over as a nuisance factor because it interacts with another factor.

### Limiting the size of the reference grid {#rg.limit}
We have just seen how easily the size of a reference grid can get out of hand. The `rg.limit` option (set via `emm_options()` or as an optional argument in `ref_grid()` or `emmeans()`) serves to guard against excessive memory demands. It specifies the number of allowed rows in the reference grid. But because of the way `ref_grid()` works, this check is made *before* any multivariate-response levels are taken into account. If the limit is exceeded, an error is thrown:
```{r, error = TRUE}
ref_grid(mtcars.lm, rg.limit = 200)
```
The default `rg.limit` is 10,000. With this limit, and if we have 1,000 columns in the model matrix, then the size of `linfct` is limited to about 80MB. If in addition, there is a 5-level multivariate response, the limit is 2GB -- darn big, but perhaps manageable. Even so, I suspect that the 10000-row default may be too loose to guard against some users getting into a tight situation.

[Back to Contents](#contents)

## Sub-models {#submodels}

We have just seen that we can assign different weights to the levels of containing factors. Another option is to constrain the effects of those containing factors to zero. In essence, that means fitting a different model without those containing effects; however, for certain models (not all), an `emmGrid` may be updated with a `submodel` specification so as to impose such a constraint. For illustration, return again to the nutrition example, and consider the analysis of `group` and `race` as before, after removing interactions involving `age`:
```{r}
summary(emmeans(nutr.lm, pairwise ~ group | race,
                submodel = ~ age + group*race),
        by = NULL)
```
If you like, you may confirm that we would obtain exactly the same estimates if we had fitted that sub-model to the data, except we continue to use the residual variance from the full model in tests and confidence intervals. Without the interactions with `age`, all of the marginal means become estimable. The results are somewhat different from those obtained earlier where we narrowed the scope to just age 3. These new estimates include all ages, averaging over them equally, but with constraints that the interaction effects involving `age` are all zero.

###### {#type2submodel}
There are two special character values that may be used with `submodel`. Specifying `"minimal"` creates a submodel with only the active factors:
```{r}
emmeans(nutr.lm, ~ group * race, submodel = "minimal")
```
This submodel constrains all effects involving `age` to be zero.
Another interesting option is `"type2"`, whereby we in essence analyze the residuals of the model with all contained or overlapping effects, then constrain the containing effects to be zero. So what is left is only the interaction effects of the factors involved. This is most useful with `joint_tests()`:
```{r}
joint_tests(nutr.lm, submodel = "type2")
```
These results are identical to the type II ANOVA obtained [at the beginning of this example](#nutrex).

More details on how `submodel` works may be found in [`vignette("xplanations")`](xplanations.html#submodels).

[Back to Contents](#contents)

## Nested fixed effects {#nesting}

A factor `A` is nested in another factor `B` if the levels of `A` have a different meaning in one level of `B` than in another. Often, nested factors are random effects---for example, subjects in an experiment may be randomly assigned to treatments, in which case subjects are nested in treatments---and if we model them as random effects, these random nested effects are not among the fixed effects and are not an issue to `emmeans`. But sometimes we have fixed nested factors.

###### {#cows}
Here is an example of a fictional study of five fictional treatments for some disease in cows. Two of the treatments are administered by injection, and the other three are administered orally. There are varying numbers of observations for each drug. The data and model follow:
```{r}
cows <- data.frame (
    route = factor(rep(c("injection", "oral"), c(5, 9))),
    drug = factor(rep(c("Bovineumab", "Charloisazepam",
                        "Angustatin", "Herefordmycin", "Mollycoddle"), c(3,2, 4,2,3))),
    resp = c(34, 35, 34, 44, 43, 36, 33, 36, 32, 26, 25, 25, 24, 24)
)
cows.lm <- lm(resp ~ route + drug, data = cows)
```
The `ref_grid` function finds a nested structure in this model:
```{r message = FALSE}
cows.rg <- ref_grid(cows.lm)
cows.rg
```
When there is nesting, `emmeans` computes averages separately in each group...
```{r}
route.emm <- emmeans(cows.rg, "route")
route.emm
```
... and insists on carrying along any grouping factors that a factor is nested in:
```{r}
drug.emm <- emmeans(cows.rg, "drug")
drug.emm
```
Here are the associated pairwise comparisons:
```{r}
pairs(route.emm, reverse = TRUE)
pairs(drug.emm, by = "route", reverse = TRUE)
```
In the latter result, the contrast itself becomes a nested factor in the returned `emmGrid` object. That would not be the case if there had been no `by` variable.

#### Graphs with nesting
It can be very helpful to take advantage of special features of **ggplot2** when graphing results with nested factors. For example, the default plot for the `cows` example is not ideal:
```{r, fig.width = 5.5}
emmip(cows.rg, ~ drug | route)
```
We can remove `route` from the call and instead handle it with **ggplot2** code to use separate *x* scales:
```{r, fig.width = 5.5}
require(ggplot2)
emmip(cows.rg, ~ drug) + facet_wrap(~ route, scales = "free_x")
```
Similarly with `plot.emmGrid()`:
```{r, fig.height = 2.5, fig.width = 5.5}
plot(drug.emm, PIs = TRUE) + facet_wrap(~ route, nrow = 2, scales = "free_y")
```

### Auto-identification of nested factors -- avoid being trapped! {#nest-trap}

`ref_grid()` and `emmeans()` try to discover and accommodate nested structures in the fixed effects. They do this in two ways: first, by identifying factors whose levels appear in combination with only one level of another factor; and second, by examining the `terms` attribute of the fixed effects.
In the latter approach, if an interaction `A:B` appears in the model but `A` is not present as a main effect, then `A` is deemed to be nested in `B`. Note that this can create a trap: some users take shortcuts by omitting some fixed effects, knowing that this won't affect the fitted values. But such shortcuts *do* affect the interpretation of model parameters, ANOVA tables, etc., and I advise against ever taking such shortcuts. Here are some ways you may notice mistakenly-identified nesting:

  * A message is displayed when nesting is detected
  * A `str()` listing of the `emmGrid` object shows a nesting component
  * An `emmeans()` summary unexpectedly includes one or more factors that you didn't specify
  * EMMs obtained using `by` factors don't seem to behave right, or give the same results with different specifications

To override the auto-detection of nested effects, use the `nesting` argument in `ref_grid()` or `emmeans()`. Specifying `nesting = NULL` will ignore all nesting. Incorrectly-discovered nesting can be overcome by specifying something akin to `nesting = "A %in% B, C %in% (A * B)"` or, equivalently, `nesting = list(A = "B", C = c("A", "B"))`.

[Back to Contents](#contents)

[Index of all vignette topics](vignette-topics.html)

emmeans/inst/doc/transformations.Rmd0000644000176200001440000007421014150770352017326 0ustar liggesusers---
title: "Transformations and link functions in emmeans"
author: "emmeans package, Version `r packageVersion('emmeans')`"
output: emmeans::.emm_vignette
vignette: >
  %\VignetteIndexEntry{Transformations and link functions}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, echo = FALSE, results = "hide", message = FALSE}
require("emmeans")
knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro")
```

## Contents {#contents}
This vignette covers the intricacies of transformations and link functions in **emmeans**.

 1. [Overview](#overview)
 2. [Re-gridding](#regrid)
 3. [Link functions](#links)
 4. [Graphing transformations and links](#trangraph)
 5. [Both a response transformation and a link](#tranlink)
 6. [Special transformations](#special)
 7. [Specifying a transformation after the fact](#after)
 8. [Auto-detected transformations](#auto)
 9. [Standardized response](#stdize)
10. [Faking a log transformation](#logs)
    a. [Faking other transformations](#faking)
    b. [Alternative scale](#altscale)
11. [Bias adjustment](#bias-adj)

[Index of all vignette topics](vignette-topics.html)

## Overview {#overview}

Consider the same example with the `pigs` dataset that is used in many of these vignettes:
```{r}
pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs)
```
This model has two factors, `source` and `percent` (coerced to a factor), as predictors; and log-transformed `conc` as the response. Here we obtain the EMMs for `source`, examine its structure, and finally produce a summary, including a test against a null value of log(35):
```{r}
pigs.emm.s <- emmeans(pigs.lm, "source")
str(pigs.emm.s)
```
```{r}
summary(pigs.emm.s, infer = TRUE, null = log(35))
```
Now suppose that we want the EMMs expressed on the same scale as `conc`. This can be done by adding `type = "response"` to the `summary()` call:
```{r}
summary(pigs.emm.s, infer = TRUE, null = log(35), type = "response")
```
Note: Looking ahead, this output is compared later in this vignette with a [bias-adjusted version](#pigs-biasadj).

### Timing is everything {#timing}
Dealing with transformations in **emmeans** is somewhat complex, due to the large number of possibilities.
But the key is understanding what happens, when. These results come from a sequence of steps. Here is what happens (and doesn't happen) at each step: 1. The reference grid is constructed for the `log(conc)` model. The fact that a log transformation is used is recorded, but nothing else is done with that information. 2. The predictions on the reference grid are averaged over the four `percent` levels, for each `source`, to obtain the EMMs for `source` -- *still* on the `log(conc)` scale. 3. The standard errors and confidence intervals for these EMMs are computed -- *still* on the `log(conc)` scale. 4. Only now do we do back-transformation... a. The EMMs are back-transformed to the `conc` scale. b. The endpoints of the confidence intervals are back-transformed. c. The *t* tests and *P* values are left as-is. d. The standard errors are converted to the `conc` scale using the delta method. These SEs were *not* used in constructing the tests and confidence intervals. ### The model is our best guide This choice of timing is based on the idea that *the model is right*. In particular, the fact that the response is transformed suggests that the transformed scale is the best scale to be working with. In addition, the model specifies that the effects of `source` and `percent` are *linear* on the transformed scale; inasmuch as marginal averaging to obtain EMMs is a linear operation, that averaging is best done on the transformed scale. For those two good reasons, back-transforming to the response scale is delayed until the very end by default. [Back to Contents](#contents) ## Re-gridding {#regrid} As well-advised as it is, some users may not want the default timing of things. The tool for changing when back-transformation is performed is the `regrid()` function -- which, with default settings of its arguments, back-transforms an `emmGrid` object and adjusts everything in it appropriately. For example: ```{r} str(regrid(pigs.emm.s)) summary(regrid(pigs.emm.s), infer = TRUE, null = 35) ``` Notice that the structure no longer includes the transformation. That's because it is no longer relevant; the reference grid is on the `conc` scale, and how we got there is now forgotten. Compare this `summary()` result with the preceding one, and note the following: * It no longer has annotations concerning transformations. * The estimates and SEs are identical. * The confidence intervals, *t* ratios, and *P* values are *not* identical. This is because, this time, the SEs shown in the table are the ones actually used to construct the tests and intervals. Understood, right? But think carefully about how these EMMs were obtained. They are back-transformed from `pigs.emm.s`, in which *the marginal averaging was done on the log scale*. If we want to back-transform *before* doing the averaging, we need to call `regrid()` after the reference grid is constructed but before the averaging takes place: ```{r} pigs.rg <- ref_grid(pigs.lm) pigs.remm.s <- emmeans(regrid(pigs.rg), "source") summary(pigs.remm.s, infer = TRUE, null = 35) ``` These results all differ from either of the previous two summaries -- again, because the averaging is done on the `conc` scale rather than the `log(conc)` scale. ###### {#regrid} Note: For those who want to routinely back-transform before averaging, the `transform` argument in `ref_grid()` simplifies this. 
The first two steps above could have been done more easily as follows: ```{r eval = FALSE} pigs.remm.s <- emmeans(pigs.lm, "source", transform = "response") ``` But don't get `transform` and `type` confused. The `transform` argument is passed to `regrid()` after the reference grid is constructed, whereas the `type` argument is simply remembered and used by `summary()`. So a similar-looking call: ```{r eval = FALSE} emmeans(pigs.lm, "source", type = "response") ``` will compute the results we have seen for `pigs.emm.s` -- back-transformed *after* averaging on the log scale. Remember again: When it comes to transformations, timing is everything. [Back to Contents](#contents) ## Link functions {#links} Exactly the same ideas we have presented for response transformations apply to generalized linear models having non-identity link functions. As far as **emmeans** is concerned, there is no difference at all. To illustrate, consider the `neuralgia` dataset provided in the package. These data come from an experiment reported in a SAS technical report where different treatments for neuralgia are compared. The patient's sex is an additional factor, and their age is a covariate. The response is `Pain`, a binary variable on whether or not the patient reports neuralgia pain after treatment. The model suggested in the SAS report is equivalent to the following. We use it to obtain estimated probabilities of experiencing pain: ```{r} neuralgia.glm <- glm(Pain ~ Treatment * Sex + Age, family = binomial(), data = neuralgia) neuralgia.emm <- emmeans(neuralgia.glm, "Treatment", type = "response") neuralgia.emm ``` ###### {#oddsrats} (The note about the interaction is discussed shortly.) Note that the averaging over `Sex` is done on the logit scale, *before* the results are back-transformed for the summary. We may use `pairs()` to compare these estimates; note that logits are logs of odds; so this is another instance where log-differences are back-transformed -- in this case to odds ratios: ```{r} pairs(neuralgia.emm, reverse = TRUE) ``` So there is evidence of considerably more pain being reported with placebo (treatment `P`) than with either of the other two treatments. The estimated odds of pain with `B` are about half that for `A`, but this finding is not statistically significant. (The odds that this is a made-up dataset seem quite high, but that finding is strictly this author's impression.) Observe that there is a note in the output for `neuralgia.emm` that the results may be misleading. It is important to take it seriously, because if two factors interact, it may be the case that marginal averages of predictions don't reflect what is happening at any level of the factors being averaged over. To find out, look at an interaction plot of the fitted model: ```{r} emmip(neuralgia.glm, Sex ~ Treatment) ``` There is no practical difference between females and males in the patterns of response to `Treatment`; so I think most people would be quite comfortable with the marginal results that are reported earlier. [Back to Contents](#contents) ## Graphing transformations and links {#trangraph} There are a few options for displaying transformed results graphically. First, the `type` argument works just as it does in displaying a tabular summary. 
Following through with the `neuralgia` example, let us display the marginal `Treatment` EMMs on both the link scale and the response scale (we are opting to do the averaging on the link scale):

```{r, fig.height = 1.5}
neur.Trt.emm <- suppressMessages(emmeans(neuralgia.glm, "Treatment"))
plot(neur.Trt.emm)    # Link scale by default
plot(neur.Trt.emm, type = "response")
```

Besides whether or not we see response values, there is a dramatic difference in the symmetry of the intervals.

For `emmip()` and `plot()` *only* (and currently only with the "ggplot" engine), there is also the option of specifying `type = "scale"`, which causes the response values to be calculated but plotted on a nonlinear scale corresponding to the transformation or link:

```{r, fig.height = 1.5}
plot(neur.Trt.emm, type = "scale")
```

Notice that the interior part of this plot is identical to the plot on the link scale. Only the horizontal axis is different. That is because the response values are transformed using the link function to determine the plotting positions of the graphical elements -- putting them back where they started.

As is the case here, nonlinear scales can be confusing to read, and it is very often true that you will want to display more scale divisions, and even add minor ones. This is done by adding arguments for the function `ggplot2::scale_x_continuous()` (see its documentation):

```{r, fig.height = 1.5}
plot(neur.Trt.emm, type = "scale", breaks = seq(0.10, 0.90, by = 0.10),
     minor_breaks = seq(0.05, 0.95, by = 0.05))
```

When using the `"ggplot"` engine, you always have the option of using **ggplot2** to incorporate a transformed scale -- and it doesn't even have to be the same as the transformation used in the model. For example, here we display the same results on an arcsin-square-root scale.

```{r, fig.height = 1.5}
plot(neur.Trt.emm, type = "response") +
    ggplot2::scale_x_continuous(trans = scales::asn_trans(),
                                breaks = seq(0.10, 0.90, by = 0.10))
```

This comes across as a compromise: not as severe as the logit scaling, and not as distorted as the linear scaling of response values.

Again, the same techniques can be used with `emmip()`, except it is the vertical scale that is affected.

[Back to Contents](#contents)

## Models having both a response transformation and a link function {#tranlink}

It is possible to have a generalized linear model with a non-identity link *and* a response transformation. Here is an example, with the built-in `warpbreaks` dataset:

```{r}
warp.glm <- glm(sqrt(breaks) ~ wool*tension, family = Gamma, data = warpbreaks)
ref_grid(warp.glm)
```

The canonical link for a gamma model is the reciprocal (or inverse); and there is the square-root response transformation besides. If we choose `type = "response"` in summarizing, we undo *both* transformations:

```{r}
emmeans(warp.glm, ~ tension | wool, type = "response")
```

What happened here is that first the linear predictor was back-transformed from the link (inverse) scale; then the squares were obtained to back-transform the rest of the way. It is possible to undo the link, but not the response transformation:

```{r}
emmeans(warp.glm, ~ tension | wool, type = "unlink")
```

It is *not* possible to undo the response transformation and leave the link in place, because the response was transformed first and the link was applied afterward; we have to undo those in reverse order for the results to make sense.

One may also use `"unlink"` as a `transform` argument in `regrid()` or through `ref_grid()`.
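For example, to undo just the link *before* the marginal averaging (rather than after it, as `type = "unlink"` does), we can specify the transformation when the reference grid is constructed. Here is a sketch (results not shown); keep in mind that, per the timing discussion above, this version does its averaging on the `sqrt(breaks)` scale:

```{r, eval = FALSE}
warp.rg <- ref_grid(warp.glm, transform = "unlink")
emmeans(warp.rg, ~ tension | wool)
```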
[Back to Contents](#contents)

## Special transformations {#special}

The `make.tran()` function provides several special transformations and sets things up so they can be handled in **emmeans** with relative ease. (See `help("make.tran", "emmeans")` for descriptions of what is available.) `make.tran()` works much like `stats::make.link()` in that it returns a list of functions `linkfun()`, `linkinv()`, etc. that serve in managing results on a transformed scale. The difference is that most transformations with `make.tran()` require additional arguments. To use this capability in `emmeans()`, it is convenient to first obtain the `make.tran()` result, and then to use it as the enclosing environment for fitting the model, with `linkfun` as the transformation. For example, suppose the response variable is a percentage and we want to use the response transformation $\sin^{-1}\sqrt{y/100}$. Then proceed like this:

```{r eval = FALSE}
tran <- make.tran("asin.sqrt", 100)
my.model <- with(tran, 
    lmer(linkfun(percent) ~ treatment + (1|Block), data = mydata))
```

Subsequent calls to `ref_grid()`, `emmeans()`, `regrid()`, etc. will then be able to access the transformation information correctly. The help page for `make.tran()` has an example like this using a Box-Cox transformation.

[Back to Contents](#contents)

## Specifying a transformation after the fact {#after}

It is not at all uncommon to fit a model using statements like the following:

```{r eval = FALSE}
mydata <- transform(mydata, logy.5 = log(yield + 0.5))
my.model <- lmer(logy.5 ~ treatment + (1|Block), data = mydata)
```

In this case, there is no way for `ref_grid()` to figure out that a response transformation was used. What can be done is to update the reference grid with the required information:

```{r eval = FALSE}
my.rg <- update(ref_grid(my.model), tran = make.tran("genlog", .5))
```

Subsequently, use `my.rg` in place of `my.model` in any `emmeans()` analyses, and the transformation information will be there. For standard transformations (those in `stats::make.link()`), just give the name of the transformation; e.g.,

```{r eval = FALSE}
model.rg <- update(ref_grid(model), tran = "sqrt")
```

## Auto-detected response transformations {#auto}

As can be seen in the initial `pigs.lm` example in this vignette, certain straightforward response transformations such as `log`, `sqrt`, etc. are automatically detected when `emmeans()` (really, `ref_grid()`) is called on the model object. In fact, scaling and shifting are supported too; so the preceding example with `my.model` could have been done more easily by specifying the transformation directly in the model formula:

```r
my.better.model <- lmer(log(yield + 0.5) ~ treatment + (1|Block), data = mydata)
```

The transformation would be auto-detected, saving you the trouble of adding it later. Similarly, a response transformation of `2 * sqrt(y + 1)` would be correctly auto-detected. A model with a linearly transformed response, e.g. `4*(y - 1)`, would *not* be auto-detected, but `4*I(y + -1)` would be interpreted as `4*identity(y + -1)`. Parsing is such that the response expression must be of the form `mult * fcn(resp + const)`; operators of `-` and `/` are not recognized.

[Back to Contents](#contents)

## Faking a log transformation {#logs}

The `regrid()` function makes it possible to fake a log transformation of the response. Why would you want to do this? So that you can make comparisons using ratios instead of differences.
Consider the `pigs` example once again, but suppose we had fitted a model with a square-root transformation instead of a log:

```{r}
pigroot.lm <- lm(sqrt(conc) ~ source + factor(percent), data = pigs)
piglog.emm.s <- regrid(emmeans(pigroot.lm, "source"), transform = "log")
confint(piglog.emm.s, type = "response")
pairs(piglog.emm.s, type = "response")
```

These results are not identical to, but are very similar to, the back-transformed confidence intervals [above](#timing) for the EMMs and the [pairwise ratios in the "comparisons" vignette](comparisons.html#logs), where the fitted model actually used a log response.

### Faking other transformations {#faking}

It is possible to fake transformations other than the log. Just use the same method, e.g.

```{r, eval = FALSE}
regrid(emm, transform = "probit")
```

would re-grid the existing `emm` to the probit scale. Note that any estimates in `emm` outside of the interval $(0,1)$ will be flagged as non-estimable.

The [section on standardized responses](#stdize) gives an example of reverse-engineering a standardized response transformation in this way.

### Alternative scale {#altscale}

It is possible to create a report on an alternative scale by updating the `tran` component. For example, suppose we want percent differences instead of ratios in the preceding example with the `pigs` dataset. This is possible by modifying the inverse transformation: since the usual inverse transformation is a ratio of the form $r = a/b$, we have that the percentage difference between $a$ and $b$ is $100(a-b)/b = 100(r-1)$. Thus,

```{r}
pct.diff.tran <- list(
    linkfun = function(mu) log(mu/100 + 1),
    linkinv = function(eta) 100 * (exp(eta) - 1),
    mu.eta = function(eta) 100 * exp(eta),
    name = "log(pct.diff)"
)

update(pairs(piglog.emm.s, type = "response"), 
       tran = pct.diff.tran, inv.lbl = "pct.diff")
```

## Standardized response {#stdize}

In some disciplines, it is common to fit a model to a standardized response variable. R's base function `scale()` makes this easy to do; but it is important to notice that `scale(y)` is more complicated than, say, `sqrt(y)`, because `scale(y)` requires all the values of `y` in order to determine the centering and scaling parameters. The `ref_grid()` function (called by `emmeans()` and others) tries to detect the scaling parameters. To illustrate:

```{r, message = FALSE}
fiber.lm <- lm(scale(strength) ~ machine * scale(diameter), data = fiber)
emmeans(fiber.lm, "machine")   # on the standardized scale
emmeans(fiber.lm, "machine", type = "response")   # strength scale
```

More interesting (and complex) is what happens with `emtrends()`. Without anything fancy added, we have

```{r}
emtrends(fiber.lm, "machine", var = "diameter")
```

These slopes are (change in `scale(strength)`) / (change in `diameter`); that is, we didn't do anything to undo the response transformation, but the trend is based on exactly the variable specified, `diameter`. To get (change in `strength`) / (change in `diameter`), we need to undo the response transformation, and that is done via `transform` (which invokes `regrid()` after the reference grid is constructed):

```{r}
emtrends(fiber.lm, "machine", var = "diameter", transform = "response")
```

What if we want slopes for (change in `scale(strength)`) / (change in `scale(diameter)`)? This can be done, but it is necessary to manually specify the scaling parameters for `diameter`.
```{r} with(fiber, c(mean = mean(diameter), sd = sd(diameter))) emtrends(fiber.lm, "machine", var = "scale(diameter, 24.133, 4.324)") ``` This result is the one most directly related to the regression coefficients: ```{r} coef(fiber.lm)[4:6] ``` There is a fourth possibility, (change in `strength`) / (change in `scale(diameter)`), that I leave to the reader. ### What to do if auto-detection fails Auto-detection of standardized responses is a bit tricky, and doesn't always succeed. If it fails, a message is displayed and the transformation is ignored. In cases where it doesn't work, we need to explicitly specify the transformation using `make.tran()`. The methods are exactly as shown earlier in this vignette, so we show the code but not the results for a hypothetical example. One method is to fit the model and then add the transformation information later. In this example, `some.fcn` is a model-fitting function which for some reason doesn't allow the scaling information to be detected. ```{r, eval = FALSE} mod <- some.fcn(scale(RT) ~ group + (1|subject), data = mydata) emmeans(mod, "group", type = "response", tran = make.tran("scale", y = mydata$RT)) ``` The other, equivalent, method is to create the transformation object first and use it in fitting the model: ```{r, eval = FALSE} mod <- with(make.tran("scale", y = mydata$RT), some.fcn(linkfun(RT) ~ group + (1|subject), data = mydata)) emmeans(mod, "group", type = "response") ``` ### Reverse-engineering a standardized response An interesting twist on all this is the reverse situation: Suppose we fitted the model *without* the standardized response, but we want to know what the results would be if we had standardized. Here we reverse-engineer the `fiber.lm` example above: ```{r, message = FALSE} fib.lm <- lm(strength ~ machine * diameter, data = fiber) # On raw scale: emmeans(fib.lm, "machine") # On standardized scale: tran <- make.tran("scale", y = fiber$strength) emmeans(fib.lm, "machine", transform = tran) ``` In the latter call, the `transform` argument causes `regrid()` to be called after the reference grid is constructed. [Back to Contents](#contents) ## Bias adjustment {#bias-adj} So far, we have discussed ideas related to back-transforming results as a simple way of expressing results on the same scale as the response. In particular, means obtained in this way are known as *generalized means*; for example, a log transformation of the response is associated with geometric means. When the goal is simply to make inferences about which means are less than which other means, and a response transformation is used, it is often acceptable to present estimates and comparisons of these generalized means. However, sometimes it is important to report results that actually do reflect expected values of the untransformed response. An example is a financial study, where the response is in some monetary unit. It may be convenient to use a response transformation for modeling purposes, but ultimately we may want to make financial projections in those same units. In such settings, we need to make a bias adjustment when we back-transform, because any nonlinear transformation biases the expected values of statistical quantities. More specifically, suppose that we have a response $Y$ and the transformed response is $U$. To back-transform, we use $Y = h(U)$; and using a Taylor approximation, $Y \approx h(\eta) + h'(\eta)(U-\eta) + \frac12h''(\eta)(U-\eta)^2$, so that $E(Y) \approx h(\eta) + \frac12h''(\eta)Var(U)$. 
This shows that the amount of needed bias adjustment is approximately $\frac12h''(\eta)\sigma^2$, where $\sigma$ is the error SD in the model for $U$. It depends on $\sigma$, and the larger $\sigma$ is, the greater the bias adjustment that is needed. This second-order bias adjustment is what is currently used in the **emmeans** package when bias-adjustment is requested. There are better or exact adjustments for certain cases, and future updates may incorporate some of those.

### Pigs example revisited {#pigs-biasadj}

Let us compare the estimates in [the overview](#overview) after we apply a bias adjustment. First, note that an estimate of the residual SD is available via the `sigma()` function:

```{r}
sigma(pigs.lm)
```

This estimate is used by default. The bias-adjusted EMMs for the sources are:

```{r}
summary(pigs.emm.s, type = "response", bias.adj = TRUE)
```

These estimates (and also their SEs) are slightly larger than we had without bias adjustment. They are estimates of the *arithmetic* mean responses, rather than the *geometric* means shown in the overview. Had the value of `sigma` been larger, the adjustment would have been greater. You can experiment with this by adding a `sigma =` argument to the above call.

### Response transformations vs. link functions {#link-bias}

At this point, it is important to point out that the above discussion focuses on *response* transformations, as opposed to link functions used in generalized linear models (GLMs). In an ordinary GLM, no bias adjustment is needed, nor is it appropriate, because the link function is just used to define a nonlinear relationship between the actual response mean $\eta$ and the linear predictor. That is, the back-transformed parameter is already the mean.

#### InsectSprays example {#insects}

To illustrate this, consider the `InsectSprays` data in the **datasets** package. The response variable is a count, and there is one treatment, the spray that is used. Let us model the count as a Poisson variable with (by default) a log link, and obtain the EMMs with and without a bias adjustment:

```{r}
ismod <- glm(count ~ spray, data = InsectSprays, family = poisson())
emmeans(ismod, "spray", type = "response", bias.adj = FALSE)
emmeans(ismod, "spray", type = "response", bias.adj = TRUE)
```

These are substantially different! Which is right? Well, due to the simple structure of this dataset, the estimates should be well in line with the simple observed mean counts:

```{r}
with(InsectSprays, tapply(count, spray, mean))
```

This illustrates that it is the *non*-bias-adjusted results that are appropriate. Again, the point here is that a GLM does not have an additive error term, and that the model is already formulated in terms of the mean, not some generalized mean. Users must be very careful with this! There is no way to automatically do the right thing.

Note that, in a generalized linear *mixed* model, including generalized estimating equations and such, there *are* additive random components involved, and then bias adjustment becomes appropriate.

#### CBPP example {#cbpp}

Consider an example adapted from the help page for `lme4::cbpp`. Contagious bovine pleuropneumonia (CBPP) is a disease in African cattle, and the dataset contains data on incidence of CBPP in several herds of cattle over four time periods.
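Before fitting a model, it may help to glance at the layout of the dataset. A quick sketch (output not shown):

```{r, eval = FALSE}
head(lme4::cbpp)   # columns herd, incidence, size, period
```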
We will fit a mixed model that accounts for herd variations as well as overdispersion (variations larger than expected with a simple binomial model): ```{r, message = FALSE} require(lme4) cbpp <- transform(cbpp, unit = 1:nrow(cbpp)) cbpp.glmer <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd) + (1 | unit), family = binomial, data = cbpp) emm <- emmeans(cbpp.glmer, "period") summary(emm, type = "response") ``` The above summary reflects the back-transformed estimates, with no bias adjustment. However, the model estimates two independent sources of random variation that probably should be taken into account: ```{r} lme4::VarCorr(cbpp.glmer) ``` Notably, the over-dispersion SD is considerably greater than the herd SD. Suppose we want to estimate the marginal probabilities of CBPP incidence, averaged over herds and over-dispersion variations. For this purpose, we need the combined effect of these variations; so we compute the overall SD via the Pythagorean theorem: ```{r} total.SD = sqrt(0.89107^2 + 0.18396^2) ``` Accordingly, here are the bias-adjusted estimates of the marginal probabilities: ```{r} summary(emm, type = "response", bias.adjust = TRUE, sigma = total.SD) ``` These estimates are somewhat larger than the unadjusted estimates (actually, any estimates greater than 0.5 would have been adjusted downward). These adjusted estimates are more appropriate for describing the marginal incidence of CBPP for all herds. In fact, these estimates are fairly close to those obtained directly from the incidences in the data: ```{r} cases <- with(cbpp, tapply(incidence, period, sum)) trials <- with(cbpp, tapply(size, period, sum)) cases / trials ``` Left as an exercise: Revisit the `InsectSprays` example, but (using similar methods to the above) create a `unit` variable and fit an over-dispersion model. Compare the results with and without bias adjustment, and evaluate these results against the earlier results. This is simpler than the CBPP example because there is only one random effect. [Back to Contents](#contents) [Index of all vignette topics](vignette-topics.html) emmeans/inst/doc/sophisticated.Rmd0000644000176200001440000004326714137062735016754 0ustar liggesusers--- title: "Sophisticated models in emmeans" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{Sophisticated models in emmeans} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, results = "hide", message = FALSE} require("emmeans") options(show.signif.stars = FALSE, width = 100) knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") ``` This vignette gives a few examples of the use of the **emmeans** package to analyze other than the basic types of models provided by the **stats** package. Emphasis here is placed on accessing the optional capabilities that are typically not needed for the more basic models. A reference for all supported models is provided in the ["models" vignette](models.html). ## Contents {#contents} 1. [Linear mixed models (lmer)](#lmer) a. [System options for lmerMod models](#lmerOpts) 2. [Models with offsets](#offsets) 3. [Ordinal models](#ordinal) 4. [Models fitted using MCMC methods](#mcmc) [Index of all vignette topics](vignette-topics.html) ## Linear mixed models (lmer) {#lmer} Linear mixed models are really important in statistics. Emphasis here is placed on those fitted using `lme4::lmer()`, but **emmeans** also supports other mixed-model packages such as **nlme**. 
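For instance, a model fitted with `nlme::lme()` is handled in just the same way. A sketch, where the model and the dataset `dat` are hypothetical:

```{r, eval = FALSE}
mod.lme <- nlme::lme(y ~ treatment, random = ~ 1 | subject, data = dat)
emmeans(mod.lme, "treatment")
```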
To illustrate, consider the `Oats` dataset in the **nlme** package. It has the results of a balanced split-plot experiment: experimental blocks are divided into plots that are randomly assigned to oat varieties, and the plots are subdivided into subplots that are randomly assigned to amounts of nitrogen within each plot. We will consider a linear mixed model for these data, excluding interaction (which is justified in this case). For sake of illustration, we will exclude a few observations.

```{r}
Oats.lmer <- lme4::lmer(yield ~ Variety + factor(nitro) + (1|Block/Variety), 
                        data = nlme::Oats, subset = -c(1,2,3,5,8,13,21,34,55))
```

Let's look at the EMMs for `nitro`:

```{r}
Oats.emm.n <- emmeans(Oats.lmer, "nitro")
Oats.emm.n
```

You will notice that the degrees of freedom are fractional: that is due to the fact that whole-plot and subplot variations are combined when standard errors are estimated. Different degrees-of-freedom methods are available. By default, the Kenward-Roger method is used, and that's why you see a message about the **pbkrtest** package being loaded, as it implements that method. We may specify a different degrees-of-freedom method via the optional argument `lmer.df`:

```{r}
emmeans(Oats.lmer, "nitro", lmer.df = "satterthwaite")
```

###### {#dfoptions}

This latest result uses the Satterthwaite method, which is implemented in the **lmerTest** package. Note that, with this method, not only are the degrees of freedom slightly different, but so are the standard errors. That is because the Kenward-Roger method also entails making a bias adjustment to the covariance matrix of the fixed effects; that is the principal difference between the methods. A third possibility is `"asymptotic"`:

```{r}
emmeans(Oats.lmer, "nitro", lmer.df = "asymptotic")
```

This just sets all the degrees of freedom to `Inf` -- that's **emmeans**'s way of using *z* statistics rather than *t* statistics. The asymptotic methods tend to make confidence intervals a bit too narrow and P values a bit too low, but they involve much, much less computation. Note that the SEs are the same as obtained using the Satterthwaite method.

Comparisons and contrasts are pretty much the same as with other models. As `nitro` has quantitative levels, we might want to test polynomial contrasts:

```{r}
contrast(Oats.emm.n, "poly")
```

The interesting thing here is that the degrees of freedom are much larger than they are for the EMMs. The reason is that `nitro` is a within-plot factor, so inter-plot variations have little role in estimating contrasts among `nitro` levels. On the other hand, `Variety` is a whole-plot factor, and there is not much of a bump in degrees of freedom for comparisons:

```{r}
emmeans(Oats.lmer, pairwise ~ Variety)
```

### System options for lmerMod models {#lmerOpts}

The computations required for the adjusted covariance matrix and degrees of freedom may become cumbersome. Some user options (i.e., `emm_options()` calls) make it possible to streamline these computations through default methods and limitations on them. First, the option `lmer.df`, which may have values of `"kenward-roger"`, `"satterthwaite"`, or `"asymptotic"` (partial matches are OK!), specifies the default degrees-of-freedom method.

The options `disable.pbkrtest` and `disable.lmerTest` may be `TRUE` or `FALSE`, and comprise another way of controlling which method is used (e.g., the Kenward-Roger method will not be used if `get_emm_option("disable.pbkrtest") == TRUE`).
Finally, the options `pbkrtest.limit` and `lmerTest.limit`, which should be set to numeric values, enable the given package only when the number of data rows does not exceed the given limit. The factory default is 3000 for both limits.

[Back to Contents](#contents)

## Models with offsets {#offsets}

If a model is fitted and its formula includes an `offset()` term, then by default, the offset is computed and included in the reference grid. To illustrate, consider a hypothetical dataset on insurance claims (used as an [example in SAS's documentation](https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_genmod_sect006.htm)). There are classes of cars of varying counts (`n`), sizes (`size`), and ages (`age`), and we record the number of insurance claims (`claims`). We fit a Poisson model to `claims` as a function of `size` and `age`. An offset of `log(n)` is included so that `n` functions as an "exposure" variable.

```{r}
ins <- data.frame(
    n = c(500, 1200, 100, 400, 500, 300),
    size = factor(rep(1:3,2), labels = c("S","M","L")),
    age = factor(rep(1:2, each = 3)),
    claims = c(42, 37, 1, 101, 73, 14))
ins.glm <- glm(claims ~ size + age + offset(log(n)), 
               data = ins, family = "poisson")
```

First, let's look at the reference grid obtained by default:

```{r}
ref_grid(ins.glm)
```

Note that `n` is included in the reference grid and that its average value of 500 is used for all predictions. Thus, if we obtain EMMs for, say, `size`, these results are based on a pool of 500 cars:

```{r}
emmeans(ins.glm, "size", type = "response")
```

However, many users would like to ignore the offset for this kind of model, because then the estimates we obtain are rates per unit value of the (logged) offset. This may be accomplished by specifying an `offset` parameter in the call:

```{r}
emmeans(ins.glm, "size", type = "response", offset = 0)
```

You may verify that the above estimates are 1/500th of the previous ones. You may also verify that the above results are identical to those obtained by setting `n` equal to 1:

```{r eval = FALSE}
emmeans(ins.glm, "size", type = "response", at = list(n = 1))
```

However, those who use these types of models will be more comfortable directly setting the offset to zero. By the way, you may set some other reference value for the rates. For example, if you want estimates of claims per 100 cars, simply use (results not shown):

```{r eval = FALSE}
emmeans(ins.glm, "size", type = "response", offset = log(100))
```

[Back to Contents](#contents)

## Ordinal models {#ordinal}

Ordinal-response models comprise an example where several options are available for obtaining EMMs. To illustrate, consider the `wine` data in the **ordinal** package. The response is a rating of bitterness on a five-point scale. We will consider a probit model in two factors during fermentation: `temp` (temperature) and `contact` (contact with grape skins), with the judge making the rating as a scale predictor:

```{r}
require("ordinal")
wine.clm <- clm(rating ~ temp + contact, scale = ~ judge,
                data = wine, link = "probit")
```

(In earlier modeling, we found little interaction between the factors.)
Here are the EMMs for each factor using default options:

```{r}
emmeans(wine.clm, list(pairwise ~ temp, pairwise ~ contact))
```

These results are on the "latent" scale; the idea is that there is a continuous random variable (in this case normal, due to the probit link) having a mean that depends on the predictors; and that the ratings are a discretization of the latent variable based on a fixed set of cut points (which are estimated). In this particular example, we also have a scale model that says that the variance of the latent variable depends on the judges. The latent results are quite a bit like those for measurement data, making them easy to interpret. The only catch is that they are not uniquely defined: we could apply a linear transformation to them, and the same linear transformation to the cut points, and the results would be the same.

###### {#ordlp}

The `clm` function actually fits the model using an ordinary probit model but with different intercepts for each cut point. We can get detailed information for this model by specifying `mode = "linear.predictor"`:

```{r}
tmp <- ref_grid(wine.clm, mode = "lin")
tmp
```

Note that this reference grid involves an additional constructed predictor named `cut` that accounts for the different intercepts in the model. Let's obtain EMMs for `temp` on the linear-predictor scale:

```{r}
emmeans(tmp, "temp")
```

These are just the negatives of the latent results obtained earlier (the sign is changed to make the comparisons go the right direction). Closely related to this are `mode = "cum.prob"` and `mode = "exc.prob"`, which simply transform the linear predictor to cumulative probabilities and exceedance (1 - cumulative) probabilities. These modes give us access to the details of the fitted model but are cumbersome to use for describing results. Where they become useful is when you want to work in terms of a particular cut point. Let's look at `temp` again in terms of the probability that the rating will be at least 4:

```{r}
emmeans(wine.clm, ~ temp, mode = "exc.prob", at = list(cut = "3|4"))
```

###### {#ordprob}

There are yet more modes! With `mode = "prob"`, we obtain estimates of the probability distribution of each rating. Its reference grid includes a factor with the same name as the model response -- in this case `rating`. We usually want to use that as the primary factor, and the factors of interest as `by` variables:

```{r}
emmeans(wine.clm, ~ rating | temp, mode = "prob")
```

Using `mode = "mean.class"` obtains the average of these probability distributions as probabilities of the integers 1--5:

```{r}
emmeans(wine.clm, "temp", mode = "mean.class")
```

And there is a mode for the scale model too. In this example, the scale model involves only judges, and that is the only factor in the grid:

```{r}
summary(ref_grid(wine.clm, mode = "scale"), type = "response")
```

Judge 8's ratings don't vary much, relative to the others. The scale model is in terms of log(SD). Again, these are not uniquely identifiable, and the first level's estimate is set to log(1) = 0. So, actually, each estimate shown is a comparison with judge 1.

[Back to Contents](#contents)

## Models fitted using MCMC methods {#mcmc}

To illustrate **emmeans**'s support for models fitted using MCMC methods, consider the `example_model` available in the **rstanarm** package. The example concerns CBPP, a serious disease of cattle in Ethiopia. A generalized linear mixed model was fitted to the data using the code below.
(This is a Bayesian equivalent of the frequentist model we considered in the ["Transformations" vignette](transformations.html#cbpp).) In fitting the model, we first set the contrast coding to `bayestestR::contr.bayes` because this equalizes the priors across different treatment levels (a correction from an earlier version of this vignette). We subsequently obtain the reference grids for these models in the usual way. For later use, we also fit the same model with just the prior information.

```{r eval = FALSE}
cbpp <- transform(lme4::cbpp, unit = 1:56)
require("bayestestR")
options(contrasts = c("contr.bayes", "contr.poly"))
cbpp.rstan <- rstanarm::stan_glmer(
    cbind(incidence, size - incidence) ~ period + (1|herd) + (1|unit),
    data = cbpp, family = binomial,
    prior = student_t(df = 5, location = 0, scale = 2, autoscale = FALSE),
    chains = 2, cores = 1, seed = 2021.0120, iter = 1000)
cbpp_prior.rstan <- update(cbpp.rstan, prior_PD = TRUE)
cbpp.rg <- ref_grid(cbpp.rstan)
cbpp_prior.rg <- ref_grid(cbpp_prior.rstan)
```

```{r echo = FALSE}
cbpp.rg <- do.call(emmobj, 
    readRDS(system.file("extdata", "cbpprglist", package = "emmeans")))
cbpp_prior.rg <- do.call(emmobj, 
    readRDS(system.file("extdata", "cbpppriorrglist", package = "emmeans")))
cbpp.sigma <- readRDS(system.file("extdata", "cbppsigma", package = "emmeans"))
```

Here is the structure of the reference grid:

```{r}
cbpp.rg
```

So here are the EMMs (no averaging needed in this simple model):

```{r}
summary(cbpp.rg)
```

The summary for EMMs of Bayesian models shows the median of the posterior distribution of each estimate, along with highest posterior density (HPD) intervals. Under the hood, the posterior sample of parameter estimates is used to compute a corresponding sample of posterior EMMs, and it is those that are summarized. (Technical note: the summary is actually rerouted to the `hpd.summary()` function.)

###### {#bayesxtra}

We can access the posterior EMMs via the `as.mcmc` method for `emmGrid` objects. This gives us an object of class `mcmc` (defined in the **coda** package), which can be summarized and explored as we please.

```{r}
require("coda")
summary(as.mcmc(cbpp.rg))
```

Note that `as.mcmc` will actually produce an `mcmc.list` when there is more than one chain present, as in this example. The 2.5th and 97.5th quantiles are similar, but not identical, to the 95% confidence intervals in the frequentist summary.

The **bayestestR** package provides `emmGrid` methods for most of its description and testing functions. For example:

```{r}
bayestestR::bayesfactor_parameters(pairs(cbpp.rg), prior = pairs(cbpp_prior.rg))
bayestestR::p_rope(pairs(cbpp.rg), range = c(-0.25, 0.25))
```

Both of these sets of results suggest that period 1 is different from the others. For more information on these methods, refer to [the CRAN page for **bayestestR**](https://cran.r-project.org/package=bayestestR) and its vignettes, e.g., the one on Bayes factors.

### Bias-adjusted incidence probabilities {#bias-adj-mcmc}

Next, let us consider the back-transformed results. As is discussed with the [frequentist model](transformations.html#cbpp), there are random effects present, and if we want to think in terms of marginal probabilities across all herds and units, we should correct for bias; and to do that, we need the standard deviations of the random effects. The model object has MCMC results for the random effects of each herd and each unit, but after those, there are also summary results for the posterior SDs of the two random effects.
(I used the `colnames` function to find that they are in the 78th and 79th columns.) ```{r eval = FALSE} cbpp.sigma = as.matrix(cbpp.rstan$stanfit)[, 78:79] ``` Here are the first few: ```{r} head(cbpp.sigma) ``` So to obtain bias-adjusted marginal probabilities, obtain the resultant SD and regrid with bias correction: ```{r} totSD <- sqrt(apply(cbpp.sigma^2, 1, sum)) cbpp.rgrd <- regrid(cbpp.rg, bias.adjust = TRUE, sigma = totSD) summary(cbpp.rgrd) ``` Here is a plot of the posterior incidence probabilities, back-transformed: ```{r} bayesplot::mcmc_areas(as.mcmc(cbpp.rgrd)) ``` ... and here are intervals for each period compared with its neighbor: ```{r} contrast(cbpp.rgrd, "consec", reverse = TRUE) ``` The only interval that excludes zero is the one that compares periods 1 and 2. ### Bayesian prediction {#predict-mcmc} To predict from an MCMC model, just specify the `likelihood` argument in `as.mcmc`. Doing so causes the function to simulate data from the posterior predictive distribution. For example, if we want to predict the CBPP incidence in future herds of 25 cattle, we can do: ```{r} set.seed(2019.0605) cbpp.preds <- as.mcmc(cbpp.rgrd, likelihood = "binomial", trials = 25) bayesplot::mcmc_hist(cbpp.preds, binwidth = 1) ``` [Back to Contents](#contents) [Index of all vignette topics](vignette-topics.html) emmeans/inst/doc/interactions.R0000644000176200001440000000764614165066757016304 0ustar liggesusers## ---- echo = FALSE, results = "hide", message = FALSE------------------------- require("emmeans") options(show.signif.stars = FALSE) knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro", class.message = "re") ## ----------------------------------------------------------------------------- noise.lm <- lm(noise/10 ~ size * type * side, data = auto.noise) anova(noise.lm) ## ----------------------------------------------------------------------------- emmeans(noise.lm, pairwise ~ size) ## ----------------------------------------------------------------------------- emmip(noise.lm, type ~ size | side) ## ----------------------------------------------------------------------------- emm_s.t <- emmeans(noise.lm, pairwise ~ size | type) emm_s.t ## ----------------------------------------------------------------------------- noise.emm <- emmeans(noise.lm, ~ size * side * type) ## ----------------------------------------------------------------------------- contrast(noise.emm, "consec", simple = "each", combine = TRUE, adjust = "mvt") ## ----------------------------------------------------------------------------- contrast(emm_s.t[[1]], "poly") ## 'by = "type"' already in previous result ## ----------------------------------------------------------------------------- IC_st <- contrast(emm_s.t[[1]], interaction = c("poly", "consec"), by = NULL) IC_st ## ----------------------------------------------------------------------------- coef(IC_st) ## ----------------------------------------------------------------------------- test(IC_st, joint = TRUE) ## ----------------------------------------------------------------------------- contrast(emmeans(noise.lm, ~ size*type*side), interaction = c("poly", "consec", "consec")) ## ----------------------------------------------------------------------------- joint_tests(noise.lm) ## ----------------------------------------------------------------------------- joint_tests(noise.lm, by = "side") ## ----------------------------------------------------------------------------- mvcontrast(noise.emm, "pairwise", mult.name = c("type", "side")) ## 
----------------------------------------------------------------------------- update(mvcontrast(noise.emm, "consec", mult.name = "side", by = "size"), by = NULL) ## ----------------------------------------------------------------------------- mvcontrast(update(noise.emm, submodel = ~ side + size + type), "pairwise", mult.name = c("type", "side")) ## ----------------------------------------------------------------------------- fiber.lm <- lm(strength ~ diameter*machine, data = fiber) ## ----------------------------------------------------------------------------- emtrends(fiber.lm, pairwise ~ machine, var = "diameter") ## ----fig.height = 2----------------------------------------------------------- emmip(fiber.lm, machine ~ diameter, cov.reduce = range) ## ----------------------------------------------------------------------------- org.quad <- lm(cbind(sales1, sales2) ~ poly(price1, price2, degree = 2) + day + store, data = oranges) org.int <- lm(cbind(sales1, sales2) ~ price1 * price2 + day + store, data = oranges) org.add <- lm(cbind(sales1, sales2) ~ price1 + price2 + day + store, data = oranges) ## ----------------------------------------------------------------------------- emmip(org.quad, price2 ~ price1 | variety, mult.name = "variety", cov.reduce = FALSE) ## ----------------------------------------------------------------------------- anova(org.quad, org.int, org.add) ## ----------------------------------------------------------------------------- emtrends(org.int, pairwise ~ variety, var = "price1", mult.name = "variety") ## ----------------------------------------------------------------------------- emtrends(org.int, pairwise ~ variety, var = "price2", mult.name = "variety") emmeans/inst/doc/FAQs.R0000644000176200001440000000110114165066750014341 0ustar liggesusers## ---- echo = FALSE, results = "hide", message = FALSE------------------------- require("emmeans") knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") options(show.signif.stars = FALSE) ## ----------------------------------------------------------------------------- pg <- transform(pigs, x = rep(1:3, c(10, 10, 9))) pg.lm <- lm(log(conc) ~ x + source + factor(percent), data = pg) emmeans(pg.lm, consec ~ percent) ## ----------------------------------------------------------------------------- qt(c(.9, .95, .975), df = Inf) qnorm(c(.9, .95, .975)) emmeans/inst/doc/xtending.html0000644000176200001440000012523614165066776016162 0ustar liggesusers For developers: Extending emmeans

For developers: Extending emmeans

emmeans package, Version 1.7.2

Contents

This vignette explains how developers may incorporate emmeans support in their packages. If you are a user looking for a quick way to obtain results for an unsupported model, you are probably better off trying to use the qdrg() function.

  1. Introduction
  2. Data example
  3. Supporting rlm objects
  4. Supporting lqs objects
  5. Communication between methods
  6. Hook functions
  7. Exported methods from emmeans
  8. Existing support for rsm objects
  9. Dispatching and restrictions
  10. Exporting and registering your methods
  11. Conclusions

Index of all vignette topics

Introduction

Suppose you want to use emmeans for some type of model that it doesn’t (yet) support. Or, suppose you have developed a new package with a fancy model-fitting function, and you’d like it to work with emmeans. What can you do? Well, there is hope because emmeans is designed to be extended.

The first thing to do is to look at the help page for extending the package:

help("extending-emmeans", package="emmeans")

It gives details about the fact that you need to write two S3 methods, recover_data and emm_basis, for the class of object that your model-fitting function returns. The recover_data method is needed to recreate the dataset so that the reference grid can be identified. The emm_basis method then determines the linear functions needed to evaluate each point in the reference grid and to obtain associated information—such as the variance-covariance matrix—needed to do estimation and testing.

These methods must also be exported from your package so that they are available to users. See the section on exporting the methods for details and suggestions.

This vignette presents an example where suitable methods are developed, and discusses a few issues that arise.

Back to Contents

Data example

The MASS package contains various functions that do robust or outlier-resistant model fitting. We will cobble together some emmeans support for these. But first, let’s create a suitable dataset (a simulated two-factor experiment) for testing.

fake = expand.grid(rep = 1:5, A = c("a1","a2"), B = c("b1","b2","b3"))
fake$y = c(11.46,12.93,11.87,11.01,11.92,17.80,13.41,13.96,14.27,15.82,
           23.14,23.75,-2.09,28.43,23.01,24.11,25.51,24.11,23.95,30.37,
           17.75,18.28,17.82,18.52,16.33,20.58,20.55,20.77,21.21,20.10)

The y values were generated using predetermined means and Cauchy-distributed errors. There are some serious outliers in these data.
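
A sketch of how data like these could be simulated (the cell means, scale, and seed below are made up for illustration, not the ones actually used):

set.seed(1)
mu = rep(c(12, 15, 24, 25, 18, 21), each = 5)  # hypothetical cell means
fake$ysim = mu + rcauchy(30, scale = 0.5)      # heavy-tailed errors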

Supporting rlm

The MASS package provides an rlm function that fits robust-regression models using M estimation. We’ll fit a model using the default settings for all tuning parameters:

library(MASS)
fake.rlm = rlm(y ~ A * B, data = fake)

library(emmeans)
emmeans(fake.rlm, ~ B | A)
## A = a1:
##  B  emmean    SE df asymp.LCL asymp.UCL
##  b1   11.8 0.477 NA      10.9      12.8
##  b2   23.3 0.477 NA      22.4      24.2
##  b3   17.8 0.477 NA      16.9      18.7
## 
## A = a2:
##  B  emmean    SE df asymp.LCL asymp.UCL
##  b1   14.7 0.477 NA      13.7      15.6
##  b2   24.7 0.477 NA      23.8      25.6
##  b3   20.6 0.477 NA      19.7      21.6
## 
## Confidence level used: 0.95

The first lesson to learn about extending emmeans is that sometimes, it already works! It works here because rlm objects inherit from lm, which is supported by the emmeans package, and rlm objects aren’t enough different to create any problems.

Back to Contents

Supporting lqs objects

The MASS resistant-regression functions lqs, lmsreg, and ltsreg are another story, however. They create lqs objects that are not extensions of any other class, and have other issues, including not even having a vcov method. So for these, we really do need to write new methods for lqs objects. First, let’s fit a model.

fake.lts = ltsreg(y ~ A * B, data = fake)

The recover_data method

It is usually an easy matter to write a recover_data method. Look at the one for lm objects:

emmeans:::recover_data.lm
## function (object, frame = object$model, ...) 
## {
##     fcall = object$call
##     recover_data(fcall, delete.response(terms(object)), object$na.action, 
##         frame = frame, ...)
## }
## <bytecode: 0x000000002497dc70>
## <environment: namespace:emmeans>

Note that all it does is obtain the call component and call the method for class call, with additional arguments for its terms component and na.action. It happens that we can access these attributes in exactly the same way as for lm objects; so:

recover_data.lqs = emmeans:::recover_data.lm

Let’s test it:

rec.fake = recover_data(fake.lts)
head(rec.fake)
##    A  B
## 1 a1 b1
## 2 a1 b1
## 3 a1 b1
## 4 a1 b1
## 5 a1 b1
## 6 a2 b1

Our recovered data excludes the response variable y (owing to the delete.response call), and this is fine.

Special arguments

By the way, there are two special arguments data and params that may be handed to recover_data via ref_grid or emmeans or a related function; and you may need to provide for them if you don’t use the recover_data.call function. The data argument is needed to cover a desperate situation that occurs with certain kinds of models where the underlying data information is not saved with the object—e.g., models that are fitted by iteratively modifying the data. In those cases, the only way to recover the data is for the user to give it explicitly, and recover_data just adds a few needed attributes to it.

The params argument is needed when the model formula refers to variables besides predictors. For example, a model may include a spline term, and the knots are saved in the user’s environment as a vector and referred to in the call to fit the model. In trying to recover the data, we try to construct a data frame containing all the variables present on the right-hand side of the model, but if some of those are scalars or of different lengths than the number of observations, an error occurs. So you need to exclude any names in params when reconstructing the data.
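
For instance, a model like the following (a sketch; the data frame df and the knot vector xk are hypothetical) would need params so that xk is not mistaken for a predictor:

xk = c(1.5, 3.0, 4.5)                 # knots stored in the user's environment
mod = lm(y ~ splines::ns(x, knots = xk), data = df)
rg = ref_grid(mod, params = "xk")     # "xk" is excluded from the recovered data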

Many model objects contain the model frame as a slot; for example, a model fitted with lm(..., model = TRUE) has a member $model containing the model frame. This can be useful for recovering the data, provided none of the predictors are transformed (when predictors are transformed, the original predictor values are not in the model frame so it’s harder to recover them). Therefore, when the model frame is available in the model object, it should be provided in the frame argument of recover_data.call(); then when data = NULL, a check is made on trms, and if it has no function calls, then data is set to frame. Of course, in the rarer case where the original data are available in the model object, specify that as data.

Error handling

If you check for any error conditions in recover_data, simply have it return a character string with the desired message, rather than invoking stop. This provides a cleaner exit. The reason is that whenever recover_data throws an error, an informative message suggesting that data or params be provided is displayed. But a character return value is tested for and throws a different error with your string as the message.
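
For example, a method might begin like this (a sketch; the class mymod is hypothetical):

recover_data.mymod = function(object, ...) {
    if (is.null(object$call))
        return("Sorry, no call component was saved, so the data cannot be recovered")
    recover_data(object$call, delete.response(terms(object)),
                 object$na.action, ...)
}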

The emm_basis method

The emm_basis method has four required arguments:

args(emmeans:::emm_basis.lm)
## function (object, trms, xlev, grid, ...) 
## NULL

These are, respectively, the model object, its terms component (at least for the right-hand side of the model), a list of levels of the factors, and the grid of predictor combinations that specify the reference grid.

The function must obtain six things and return them in a named list. They are the matrix X of linear functions for each point in the reference grid, the regression coefficients bhat; the variance-covariance matrix V; a matrix nbasis for non-estimable functions; a function dffun(k,dfargs) for computing degrees of freedom for the linear function sum(k*bhat); and a list dfargs of arguments to pass to dffun. Optionally, the returned list may include a model.matrix element (the model matrix for the data or a compact version thereof obtained via .cmpMM()), which, if included, enables the submodel option.

To write your own emm_basis function, examining some of the existing methods can help; but the best resource is the predict method for the object in question, looking carefully to see what it does to predict values for a new set of predictors (e.g., newdata in predict.lm). Following this advice, let’s take a look at it:

MASS:::predict.lqs
## function (object, newdata, na.action = na.pass, ...) 
## {
##     if (missing(newdata)) 
##         return(fitted(object))
##     Terms <- delete.response(terms(object))
##     m <- model.frame(Terms, newdata, na.action = na.action, xlev = object$xlevels)
##     if (!is.null(cl <- attr(Terms, "dataClasses"))) 
##         .checkMFClasses(cl, m)
##     X <- model.matrix(Terms, m, contrasts = object$contrasts)
##     drop(X %*% object$coefficients)
## }
## <bytecode: 0x000000002626ea00>
## <environment: namespace:MASS>

Based on this, here is a listing of an emm_basis method for lqs objects:

emm_basis.lqs = function(object, trms, xlev, grid, ...) { 
    m = model.frame(trms, grid, na.action = na.pass, xlev = xlev)
    X = model.matrix(trms, m, contrasts.arg = object$contrasts) 
    bhat = coef(object) 
    Xmat = model.matrix(trms, data=object$model)                      # 5
    V = rev(object$scale)[1]^2 * solve(t(Xmat) %*% Xmat)
    nbasis = matrix(NA) 
    dfargs = list(df = nrow(Xmat) - ncol(Xmat))
    dffun = function(k, dfargs) dfargs$df
    list(X = X, bhat = bhat, nbasis = nbasis, V = V,                  #10
         dffun = dffun, dfargs = dfargs)
}

Before explaining it, let’s verify that it works:

emmeans(fake.lts, ~ B | A)
## A = a1:
##  B  emmean    SE df lower.CL upper.CL
##  b1   11.9 0.228 24     11.4     12.3
##  b2   23.1 0.228 24     22.6     23.6
##  b3   17.8 0.228 24     17.3     18.2
## 
## A = a2:
##  B  emmean    SE df lower.CL upper.CL
##  b1   13.9 0.228 24     13.4     14.4
##  b2   24.1 0.228 24     23.6     24.5
##  b3   20.5 0.228 24     20.0     21.0
## 
## Confidence level used: 0.95

Hooray! Note the results are comparable to those we had for fake.rlm, albeit the standard errors are quite a bit smaller. (In fact, the SEs could be misleading; a better method for estimating covariances should probably be implemented, but that is beyond the scope of this vignette.)

Back to Contents

Dissecting emm_basis.lqs

Let’s go through the listing of this method, line-by-line:

  • Lines 2–3: Construct the linear functions, X. This is a pretty standard two-step process: First obtain a model frame, m, for the grid of predictors, then pass it as data to model.matrix to create the associated design matrix. As promised, this code is essentially identical to what you find in predict.lqs.

  • Line 4: Obtain the coefficients, bhat. Most model objects have a coef method.

  • Lines 5–6: Obtain the covariance matrix, V, of bhat. In many models, this can be obtained using the object’s vcov method. But not in this case. Instead, I cobbled one together using the inverse of the X’X matrix as in ordinary regression, and the variance estimate found in the last element of the scale element of the object. This probably under-estimates the variances and distorts the covariances, because robust estimators have some efficiency loss.

  • Line 7: Compute the basis for non-estimable functions. This applies only when there is a possibility of rank deficiency in the model. But lqs methods don’t allow rank deficiencies, so if we have fitted such a model, we can be sure that all linear functions are estimable; we signal that by setting nbasis equal to a 1 x 1 matrix of NA. If rank deficiency were possible, the estimability package (which is required by emmeans) provides a nonest.basis function that makes this fairly painless—I would have coded nbasis = estimability::nonest.basis(Xmat).

    There are some subtleties you need to know regarding estimability. Suppose the model is rank-deficient, so that the design matrix X has p columns but rank r < p. In that case, bhat should be of length p (not r), and there should be p - r elements equal to NA, corresponding to columns of X that were excluded from the fit. Also, X should have all p columns. In other words, do not alter or throw out columns of X or their corresponding elements of bhat—even those with NA coefficients—as they are essential for assessing estimability. V should be r x r, however—the covariance matrix for the non-excluded predictors.

  • Lines 8–9: Obtain dffun and dfargs. This is a little awkward because it is designed to allow support for mixed models, where approximate methods may be used to obtain degrees of freedom. The function dffun is expected to have two arguments: k, the vector of coefficients of bhat, and dfargs, a list containing any additional arguments. In this case (and in many other models), the degrees of freedom are the same regardless of k. We put the required degrees of freedom in dfargs and write dffun so that it simply returns that value. (Note: If asymptotic tests and CIs are desired, return Inf degrees of freedom.)

  • Line 10: Return these results in a named list.

Back to Contents

Communication between methods

If you need to pass information obtained in recover_data() to the emm_basis() method, simply incorporate it as attr(data, "misc") where data is the dataset returned by recover_data(). Subsequently, that attribute is available in emm_basis() by adding a misc argument to that method.
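
For instance (a sketch; the class mymod and the payload special.info are hypothetical):

recover_data.mymod = function(object, ...) {
    dat = recover_data(object$call, delete.response(terms(object)),
                       object$na.action, ...)
    attr(dat, "misc") = list(special.info = object$special.info)  # the payload
    dat
}

## ... then declare a misc argument in the emm_basis method to receive it:
emm_basis.mymod = function(object, trms, xlev, grid, misc = list(), ...) {
    # misc is the list attached above; use it while building the usual
    # list(X, bhat, nbasis, V, dffun, dfargs) as described earlier (skeleton)
}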

Hook functions

Most linear models supported by emmeans have straightforward structure: Regression coefficients, their covariance matrix, and a set of linear functions that define the reference grid. However, a few are more complex. An example is the clm class in the ordinal package, which allows a scale model in addition to the location model. When a scale model is used, the scale parameters are included in the model matrix, regression coefficients, and covariance matrix, and we can’t just use the usual matrix operations to obtain estimates and standard errors. To facilitate using custom routines for these tasks, the emm_basis.clm function provided in emmeans includes, in its misc part, the names (as character constants) of two “hook” functions: misc$estHook has the name of the function to call when computing estimates, standard errors, and degrees of freedom (for the summary method); and misc$vcovHook has the name of the function to call to obtain the covariance matrix of the grid values (used by the vcov method). These functions are called in lieu of the usual built-in routines for these purposes, and return the appropriately sized matrices.

In addition, you may want to apply some form of special post-processing after the reference grid is constructed. To provide for this, give the name of your function to post-process the object in misc$postGridHook. Again, clm objects (as well as polr in the MASS package) serve as an example. They allow a mode specification that in two cases, calls for post-processing. The "cum.prob" mode uses the regrid function to transform the linear predictor to the cumulative-probability scale. And the "prob" mode performs this, as well as applying the contrasts necessary to convert the cumulative probabilities into the class probabilities.
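
Here is a sketch of how such a hook might be wired in; the names mymod and .mymod.postGrid are hypothetical, and the hook signature shown is an assumption based on how the built-in hooks are invoked:

emm_basis.mymod = function(object, trms, xlev, grid, ...) {
    result = emmeans:::emm_basis.lm(object, trms, xlev, grid, ...)
    result$misc$postGridHook = ".mymod.postGrid"  # run after ref_grid() finishes
    result
}
.mymod.postGrid = function(object, ...) {
    # object is the freshly constructed emmGrid; modify as needed and return it
    object
}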

Back to Contents

Exported methods from emmeans

For package developers’ convenience, emmeans exports some of its S3 methods for recover_data and/or emm_basis—use methods("recover_data") and methods("emm_basis") to discover which ones. It may be that all you need is to invoke one of those methods and perhaps make some small changes—especially if your model-fitting algorithm makes heavy use of an existing model type supported by emmeans. For those methods that are not exported, use .recover_data() and .emm_basis(), which run in emmeans’s namespace, thus providing access to all available methods.

A few additional functions are exported because they may be useful to developers. They are as follows:

  • emmeans::.all.vars(expr, retain) Some users of your package may include $ or [[]] operators in their model formulas. If you need to get the variable names, base::all.vars will probably not give you what you need. For example, if form = ~ data$x + data[[5]], then base::all.vars(form) returns the names "data" and "x", whereas emmeans::.all.vars(form) returns the names "data$x" and "data[[5]]". The retain argument may be used to specify regular expressions for patterns to retain as parts of variable names.

  • emmeans::.diag(x, nrow, ncol) The base diag function has a booby trap whereby, for example, diag(57.6) returns a 57 x 57 identity matrix rather than a 1 x 1 matrix with 57.6 as its only element. But emmeans::.diag(57.6) will return the latter. The function works identically to diag except for its end run around the identity-matrix trap (illustrated after this list).

  • emmeans::.aovlist.dffun(k, dfargs) This function is exported because it is needed for computing degrees of freedom for models fitted using aov, but it may be useful for other cases where Satterthwaite degrees-of-freedom calculations are needed. It requires the dfargs slot to contain analogous contents.

  • emmeans::.get.offset(terms, grid) If terms is a model formula containing an offset call, this will compute that offset in the context of grid (a data.frame).

  • emmeans::.my.vcov(object, ...) In a call to ref_grid, emmeans, etc., the user may use vcov. to specify an alternative function or matrix to use as the covariance matrix of the fixed-effects coefficients. This function supports that feature. Calling .my.vcov in place of the vcov method will substitute the user’s vcov. when it is specified.

  • emmeans::.std.link.labels(fam, misc) This is useful in emm_basis methods for generalized linear models. Call it with fam equal to the family object for your model, and misc either an existing list, or just list() if none. It returns a new misc list containing the link function and, in some cases, extra features that are used for certain types of link functions (e.g., for a log link, the setups for returning ratio comparisons with type = "response").

  • emmeans::.num.key(levs, key) Returns integer indices of elements of key in levs when key is a character vector; or just returns integer values if already integer. Also throws an error if levels are mismatched or indices exceed legal range. This is useful in custom contrast functions (.emmc functions).

  • emmeans::.get.excl(levs, exclude, include) This is support for the exclude and include arguments of contrast functions. It checks legality and returns an integer vector of exclude indices in levs, given specified integer or character arguments exclude and include. In your .emmc function, exclude should default to integer(0) and include should have no default.

  • emmeans::.cmpMM(X, weights, assign) creates a compact version of the model matrix X (or, preferably, its QR decomposition). This is useful if we want an emm_basis() method to return a model.matrix element. The returned result is just the R portion of the QR decomposition of diag(sqrt(weights)) %*% X, with the assign attribute added. If X is a qr object, we assume the weights are already incorporated, as is true of the qr slot of a lm object.
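
To illustrate two of these, here are the .all.vars() and .diag() behaviors just described:

form = ~ data$x + data[[5]]
base::all.vars(form)       # "data" "x"
emmeans::.all.vars(form)   # "data$x" "data[[5]]"

diag(57.6)                 # a 57 x 57 identity matrix -- the booby trap
emmeans::.diag(57.6)       # a 1 x 1 matrix containing 57.6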

Back to Contents

Existing support for rsm objects

As a nontrivial example of how an existing package supports emmeans, we show the support offered by the rsm package. Its rsm function returns an rsm object which is an extension of the lm class. Part of that extension has to do with coded.data structures whereby, as is typical in response-surface analysis, models are fitted to variables that have been linearly transformed (coded) so that the scope of each predictor is represented by plus or minus 1 on the coded scale.

Without any extra support in rsm, emmeans will work just fine with rsm objects; but if the data are coded, it becomes awkward to present results in terms of the original predictors on their original, uncoded scale. The emmeans-related methods in rsm provide a mode argument that may be used to specify whether we want to work with coded or uncoded data. The possible values for mode are "asis" (ignore any codings, if present), "coded" (use the coded scale), and "decoded" (use the decoded scale). The first two are actually the same in that no decoding is done; but it seems clearer to provide separate options because they represent two different situations.

The recover_data method

Note that coding is a predictor transformation, not a response transformation (we could have that, too, as it’s already supported by the emmeans infrastructure). So, to handle the "decoded" mode, we will need to actually decode the predictors used to construct the reference grid. That means we need to make recover_data a lot fancier! Here it is:

recover_data.rsm = function(object, data, mode = c("asis", "coded", "decoded"), ...) {
    mode = match.arg(mode)
    cod = rsm::codings(object)
    fcall = object$call
    if(is.null(data))                                                 # 5
        data = emmeans::recover_data(fcall, 
                   delete.response(terms(object)), object$na.action, ...)
    if (!is.null(cod) && (mode == "decoded")) {
        pred = cpred = attr(data, "predictors")
        trms = attr(data, "terms")                                    #10
        data = rsm::decode.data(rsm::as.coded.data(data, formulas = cod))
        for (form in cod) {
            vn = all.vars(form)
            if (!is.na(idx <- grep(vn[1], pred))) { 
                pred[idx] = vn[2]                                     #15
                cpred = setdiff(cpred, vn[1])
            }
        }
        attr(data, "predictors") = pred
        new.trms = update(trms, reformulate(c("1", cpred)))           #20
        attr(new.trms, "orig") = trms
        attr(data, "terms") = new.trms
        attr(data, "misc") = cod
    }
    data
}

Lines 2–7 ensure that mode is legal, retrieve the codings from the object, and obtain the results we would get from recover_data had it been an lm object. If mode is not "decoded", or if no codings were used, that’s all we need. Otherwise, we need to return the decoded data. However, it isn’t quite that simple, because the model equation is still defined on the coded scale. Rather than try to translate the model coefficients and covariance matrix to the decoded scale, we elected to remember what we will need to do later to put things back on the coded scale. In lines 9–10, we retrieve the attributes of the recovered data that provide the predictor names and terms object on the coded scale. In line 11, we replace the recovered data with the decoded data.

By the way, the codings comprise a list of formulas with the coded name on the left and the original variable name on the right. It is possible that only some of the predictors are coded (for example, blocking factors will not be). In the for loop in lines 12–18, the coded predictor names are replaced with their decoded names. For technical reasons to be discussed later, we also remove these coded predictor names from a copy, cpred, of the list of all predictors in the coded model. In line 19, the "predictors" attribute of data is replaced with the modified version.

Now, there is a nasty technicality. The ref_grid function in emmeans has a few lines of code after recover_data is called that determine if any terms in the model convert covariates to factors or vice versa; and this code uses the model formula. That formula involves variables on the coded scale, and those variables are no longer present in the data, so an error will occur if it tries to access them. Luckily, if we simply take those terms out of the formula, it won’t hurt because those coded predictors would not have been converted in that way. So in line 20, we update trms with a simpler model with the coded variables excluded (the intercept is explicitly included to ensure there will be a right-hand side even if cpred is empty). We save that as the terms attribute, and the original terms as a new "orig" attribute to be retrieved later. The data object, modified or not, is returned. If data have been decoded, ref_grid will construct its grid using decoded variables.

In line 23, we save the codings as the "misc" attribute, to be accessed later by emm_basis().

The emm_basis method

Now comes the emm_basis method that will be called after the grid is defined. It is listed below:

emm_basis.rsm = function(object, trms, xlev, grid, 
                         mode = c("asis", "coded", "decoded"), misc, ...) {
    mode = match.arg(mode)
    cod = misc
    if(!is.null(cod) && mode == "decoded") {                          # 5
        grid = rsm::coded.data(grid, formulas = cod)
        trms = attr(trms, "orig")
    }
    
    m = model.frame(trms, grid, na.action = na.pass, xlev = xlev)     #10
    X = model.matrix(trms, m, contrasts.arg = object$contrasts)
    bhat = as.numeric(object$coefficients) 
    V = emmeans::.my.vcov(object, ...)
    
    if (sum(is.na(bhat)) > 0)                                         #15
        nbasis = estimability::nonest.basis(object$qr)
    else
        nbasis = estimability::all.estble
    dfargs = list(df = object$df.residual)
    dffun = function(k, dfargs) dfargs$df                             #20

    list(X = X, bhat = bhat, nbasis = nbasis, V = V, 
         dffun = dffun, dfargs = dfargs, misc = list())
}

This is much simpler. The coding formulas are obtained from misc (line 4) so that we don’t have to re-obtain them from the object. All we have to do is determine if decoding was done (line 5); and, if so, convert the grid back to the coded scale (line 6) and recover the original terms attribute (line 7). The rest is borrowed directly from the emm_basis.lm method in emmeans. Note that line 13 uses one of the exported functions we described in the preceding section. Lines 15–18 use functions from the estimability package to handle the possibility that the model is rank-deficient.

A demonstration

Here’s a demonstration of this rsm support. The standard example for rsm fits a second-order model CR.rs2 to a dataset organized in two blocks and with two coded predictors.

library("rsm")
example("rsm")   ### (output is not shown) ###

First, let’s look at some results on the coded scale—which are the same as for an ordinary lm object.

emmeans(CR.rs2, ~ x1 * x2, mode = "coded", 
        at = list(x1 = c(-1, 0, 1), x2 = c(-2, 2)))
##  x1 x2 emmean    SE df lower.CL upper.CL
##  -1 -2   75.0 0.298  7     74.3     75.7
##   0 -2   77.0 0.240  7     76.4     77.5
##   1 -2   76.4 0.298  7     75.6     77.1
##  -1  2   76.8 0.298  7     76.1     77.5
##   0  2   79.3 0.240  7     78.7     79.9
##   1  2   79.2 0.298  7     78.5     79.9
## 
## Results are averaged over the levels of: Block 
## Confidence level used: 0.95

Now, the coded variables x1 and x2 are derived from these coding formulas for predictors Time and Temp:

codings(CR.rs1)
## $x1
## x1 ~ (Time - 85)/5
## 
## $x2
## x2 ~ (Temp - 175)/5

Thus, for example, a coded value of x1 = 1 corresponds to a time of 85 + 1 x 5 = 90. Here are some results working with decoded predictors. Note that the at list must now be given in terms of Time and Temp:

emmeans(CR.rs2, ~ Time * Temp, mode = "decoded", 
        at = list(Time = c(80, 85, 90), Temp = c(165, 185)))
##  Time Temp emmean    SE df lower.CL upper.CL
##    80  165   75.0 0.298  7     74.3     75.7
##    85  165   77.0 0.240  7     76.4     77.5
##    90  165   76.4 0.298  7     75.6     77.1
##    80  185   76.8 0.298  7     76.1     77.5
##    85  185   79.3 0.240  7     78.7     79.9
##    90  185   79.2 0.298  7     78.5     79.9
## 
## Results are averaged over the levels of: Block 
## Confidence level used: 0.95

Since the supplied settings are the same on the decoded scale as were used on the coded scale, the EMMs are identical to those in the previous output.

Dispatching and restrictions

The emmeans package has internal support for a number of model classes. When recover_data() and emm_basis() are dispatched, a search is made for external methods for a given class; and if found, those methods are used instead of the internal ones. However, certain restrictions apply when you aim to override an existing internal method:

  1. The class name being extended must appear in the first or second position in the results of class(object). That is, you may have a base class for which you provide recover_data() and emm_basis() methods, and those will also work for direct descendants thereof; but any class in third place or later in the inheritance is ignored.
  2. Certain classes vital to the correct operation of the package, e.g., "lm", "glm", etc., may not be overridden.

If there are no existing internal methods for the class(es) you provide methods for, there are no restrictions on them.

Exporting and registering your methods

To make the methods available to users of your package, the methods must be exported. R and CRAN are evolving in a way that having S3 methods in the registry is increasingly important; so it is a good idea to provide for that. The problem is that not all of your package users will have emmeans installed.

Thus, registering the methods must be done conditionally. We provide a courtesy function .emm_register() to make this simple. Suppose that your package offers two model classes foo and bar, and it includes the corresponding functions recover_data.foo, recover_data.bar, emm_basis.foo, and emm_basis.bar. Then to register these methods, add or modify the .onLoad function in your package (traditionally saved in the source file zzz.R):

.onLoad <- function(libname, pkgname) {
    if (requireNamespace("emmeans", quietly = TRUE))
        emmeans::.emm_register(c("foo", "bar"), pkgname)
}

You should also add emmeans (>= 1.4) and estimability (which is required by emmeans) to the Suggests field of your DESCRIPTION file.
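
For example, the relevant DESCRIPTION entries might look like this (alongside whatever else you already suggest):

Suggests: emmeans (>= 1.4), estimability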

Back to Contents

Conclusions

It is relatively simple to write appropriate methods that work with emmeans for model objects it does not support. I hope this vignette is helpful for understanding how. Furthermore, if you are the developer of a package that fits linear models, I encourage you to include recover_data and emm_basis methods for those classes of objects, so that users have access to emmeans support.

Back to Contents

Index of all vignette topics

emmeans/inst/doc/models.html

Models supported by emmeans

emmeans package, Version 1.7.2

Here we document what model objects may be used with emmeans, and some special features of some of them that may be accessed by passing additional arguments through ref_grid or emmeans().

Certain objects are affected by optional arguments to functions that construct emmGrid objects, including ref_grid(), emmeans(), emtrends(), and emmip(). When “arguments” are mentioned in the subsequent quick reference and object-by-object documentation, we are talking about arguments in these constructors.

If a model type is not included here, users may be able to obtain usable results via the qdrg() function; see its help page. Package developers may support their models by writing appropriate recover_data and emm_basis methods. See the package documentation for extending-emmeans and vignette("xtending") for details.

Index of all vignette topics

Quick reference for supported objects and options

Here is an alphabetical list of model classes that are supported, and the arguments that apply. Detailed documentation follows, with objects grouped by the code in the “Group” column. Scroll down or follow the links to those groups for more information.

Object.class Package Group Arguments / notes
aov stats A
aovlist stats V Best with balanced designs, orthogonal coding
averaging MuMIn I
betareg betareg B mode = c("link", "precision", "phi.link",
"variance", "quantile")
brmsfit brms P Supported in brms package
carbayes CARBayes S data is required
clm ordinal O mode = c("latent", "linear.predictor", "cum.prob",
"exc.prob", "prob", "mean.class", "scale")
clmm ordinal O Like clm but no "scale" mode
coxme coxme G
coxph survival G
gam mgcv G freq = FALSE, unconditional = FALSE,
what = c("location", "scale", "shape", "rate", "prob.gt.0")
gamm mgcv G call = object$gam$call
Gam gam G nboot = 800
gamlss gamlss H what = c("mu", "sigma", "nu", "tau")
gee gee E vcov.method = c("naive", "robust")
geeglm geepack E vcov.method = c("vbeta", "vbeta.naiv", "vbeta.j1s",
"vbeta.fij", "robust", "naive") or a matrix
geese geepack E Like geeglm
glm stats G
glm.nb MASS G Requires data argument
glmerMod lme4 G
glmmadmb glmmADMB No longer supported
glmmPQL MASS G inherits lm support
glmmTMB glmmTMB P Supported in glmmTMB package (dev. version only?)
gls nlme K mode = c("auto", "df.error", "satterthwaite", "asymptotic")
gnls nlme A Supports params part. Requires param = "<name>"
hurdle pscl C mode = c("response", "count", "zero", "prob0"),
lin.pred = c(FALSE, TRUE)
lm stats A Several other classes inherit from this and may be supported
lme nlme K sigmaAdjust = c(TRUE, FALSE),
mode = c("auto", "containment", "satterthwaite", "asymptotic"),
extra.iter = 0
lmerMod lme4 L lmer.df = c("kenward-roger", "satterthwaite", "asymptotic"),
pbkrtest.limit = 3000, disable.pbkrtest = FALSE.
emm_options(lmer.df =, pbkrtest.limit =, disable.pbkrtest =)
lqm,lqmm lqmm Q tau = "0.5" (must match an entry in object$tau)
Optional: method, R, seed, startQR (must be fully spelled-out)
manova stats M mult.name, mult.levs
maov stats M mult.name, mult.levs
mblogit mclogit P Supported in mclogit (overrides previous minimal support here)
mcmc mcmc S May require formula, data
MCMCglmm MCMCglmm S (see also M) mult.name, mult.levs, trait,
mode = c("default", "multinomial"); data is required
mira mice I Optional arguments per class of $analyses elements
mixed afex P Supported in afex package
mlm stats M mult.name, mult.levs
mmer sommer G
multinom nnet N mode = c("prob", "latent")
Always include response in specs for emmeans()
nauf nauf.xxx P Supported in nauf package
nlme nlme A Supports fixed part. Requires param = "<name>"
polr MASS O mode = c("latent", "linear.predictor", "cum.prob",
"exc.prob", "prob", "mean.class")
rlm MASS A inherits lm support
rms rms O mode = c("middle", "latent", "linear.predictor",
"cum.prob", "exc.prob", "prob", "mean.class")
rq,rqs quantreg Q tau = "0.5" (must match an entry in object$tau)
Optional: se, R, bsmethod, etc.
rlmerMod robustlmm P Supported in robustlmm package
rsm rsm P Supported in rsm package
stanreg rstanarm S Args for stanreg_xxx similar to those for xxx
survreg survival A
svyglm survey A
zeroinfl pscl C mode = c("response", "count", "zero", "prob0"),
lin.pred = c(FALSE, TRUE)

Group A – “Standard” or minimally supported models

Models in this group, such as lm, do not have unusual features that need special support; hence no extra arguments are needed. Some may require data in the call.

Group B – Beta regression

The additional mode argument for betareg objects has possible values of "response", "link", "precision", "phi.link", "variance", and "quantile", which have the same meaning as the type argument in predict.betareg – with the addition that "phi.link" is like "link", but for the precision portion of the model. When mode = "quantile" is specified, the additional argument quantile (a numeric scalar or vector) specifies which quantile(s) to compute; the default is 0.5 (the median). Also in "quantile" mode, an additional variable quantile is added to the reference grid, and its levels are the values supplied.
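
For instance, with a hypothetical betareg fit beta.fit and factor trt, one could estimate the three quartiles of the response (a quantile variable is then added to the grid):

emmeans(beta.fit, "trt", mode = "quantile", quantile = c(0.25, 0.5, 0.75))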

Back to quick reference

Group C – Count models

Two optional arguments – mode and lin.pred – are provided. The mode argument has possible values "response" (the default), "count", "zero", or "prob0". lin.pred is logical and defaults to FALSE.

With lin.pred = FALSE, the results are comparable to those returned by predict(..., type = "response"), predict(..., type = "count"), predict(..., type = "zero"), or predict(..., type = "prob")[, 1]. See the documentation for predict.hurdle and predict.zeroinfl.

The option lin.pred = TRUE only applies to mode = "count" and mode = "zero". The results returned are on the linear-predictor scale, with the same transformation as the link function in that part of the model. The predictions for a reference grid with mode = "count", lin.pred = TRUE, and type = "response" will be the same as those obtained with lin.pred = FALSE and mode = "count"; however, any EMMs derived from these grids will be different, because the averaging is done on the log-count scale and the actual count scale, respectively – thereby producing geometric means versus arithmetic means of the predictions.
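
As a hypothetical illustration with a hurdle fit hurd.fit and factor trt, the following two calls differ only in how the EMMs average over the grid (log-count scale versus count scale):

emmeans(hurd.fit, "trt", mode = "count", lin.pred = TRUE, type = "response")
emmeans(hurd.fit, "trt", mode = "count")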

If the vcov. argument is used (see details in the documentation for ref_grid), it must yield a matrix of the same size as would be obtained using vcov.hurdle or vcov.zeroinfl with its model argument set to ("full", "count", "zero") in respective correspondence with mode of ("response", "count", "zero"). If vcov. is a function, it must support the model argument.

Back to quick reference

Group E – GEE models

These models all have more than one covariance estimate available, and it may be selected by supplying a string as the vcov.method argument. It is partially matched with the available choices shown in the quick reference. In geese and geeglm, the aliases "robust" (for "vbeta") and "naive" (for "vbeta.naiv") are also accepted.

If a matrix or function is supplied as vcov.method, it is interpreted as a vcov. specification as described for ... in the documentation for ref_grid.
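
For example, with a hypothetical geeglm fit gee.fit and factor trt, either of these selects the robust covariance estimate:

emmeans(gee.fit, "trt", vcov.method = "robust")   # alias for "vbeta"
emmeans(gee.fit, "trt", vcov.method = "vbeta")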

Group G – Generalized linear models and relatives

Most models in this group receive only standard support as in Group A, but typically the tests and confidence intervals are asymptotic. Thus the df column for tabular results will be Inf.

Some objects in this group require that the original or reference dataset be provided when calling ref_grid() or emmeans().

In the case of mgcv::gam objects, there are optional freq and unconditional arguments as is detailed in the documentation for mgcv::vcov.gam(). Both default to FALSE. The value of unconditional matters only if freq = FALSE and object$Vc is non-null.

For mgcv::gamm objects, emmeans() results are based on the object$gam part. Unfortunately, that is missing its call component, so the user must supply it in the call argument (e.g., call = quote(gamm(y ~ s(x), data = dat))) or give the dataset in the data argument. Alternatively (and recommended), you may first set object$gam$call to the quoted call ahead of time. The what arguments are used to select which model formula to use: "location", "scale" apply to gaulss and gevlss families, "shape" applies only to gevlss, and "rate", "prob.gt.0" apply to ziplss.
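
A minimal sketch of the recommended workaround (model, dataset, and factor trt are all hypothetical):

fit <- mgcv::gamm(y ~ s(x) + trt, data = dat)
fit$gam$call <- quote(gamm(y ~ s(x) + trt, data = dat))   # supply the missing call
emmeans(fit, "trt")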

With gam::Gam objects, standard errors are estimated using a bootstrap method when there are any smoothers involved. Accordingly, there is an optional nboot argument that sets the number of bootstrap replications used to estimate the variances and covariances of the smoothing portions of the model. Generally, it is better to use models fitted via mgcv::gam() rather than gam::gam().

Back to quick reference

Group H – gamlss models

The what argument has possible values of "mu" (default), "sigma", "nu", or "tau" depending on which part of the model you want results for. Currently, there is no support when the selected part of the model contains a smoothing method like pb().

Group I – Multiple models (via imputation or averaging)

These objects are the results of fitting several models with different predictor subsets or imputed values. The bhat and V slots are obtained via averaging and, in the case of multiple imputation, adding a multiple of the between-imputation covariance per Rubin’s rules.

Support for MuMIn::averaging objects may be somewhat dodgy, as it is not clear that all supported model classes will work. The object must have a "modelList" attribute (obtained by constructing the object explicitly from a model list or by including fit = TRUE in the call). And each model should be fitted with data as a named argument in the call; or else provide a data argument in the call to emmeans() or ref_grid(). No estimability checking is done at present: if/when it is added, a linear function will be estimable only if it is estimable in all models included in the averaging.

Group K – gls and lme models

The sigmaAdjust argument is a logical value that defaults to TRUE. It is comparable to the adjustSigma option in nlme::summary.lme (the name-mangling is to avoid conflicts with the often-used adjust argument), and determines whether or not a degrees-of-freedom adjustment is performed with models fitted using the ML method.

The optional mode argument affects the degrees of freedom. The mode = "satterthwaite" option determines degrees of freedom via the Satterthwaite method: If s^2 is the estimate of some variance, then its Satterthwaite d.f. is 2*s^4 / Var(s^2). In case our numerical methods for this fail, we also offer mode = "appx-satterthwaite" as a backup, by which quantities related to Var(s^2) are obtained by randomly perturbing the response values. Currently, only "appx-satterthwaite" is available for lme objects, and it is used if "satterthwaite" is requested. Because appx-satterthwaite is simulation-based, results may vary if the same analysis is repeated. An extra.iter argument may be added to request additional simulation runs (at [possibly considerable] cost of repeating the model-fitting that many more times). (Note: Previously, "appx-satterthwaite" was termed "boot-satterthwaite"; that name is still accepted for backward compatibility. The "boot" label was abandoned because this is really an approximation method, not a bootstrap method in the usual sense of resampling the data.)

An alternative method is "df.error" (for gls) and "containment" (for lme). The df.error method is just the error degrees of freedom for the model, minus the number of extra random effects estimated; it generally over-estimates the degrees of freedom. The "asymptotic" mode simply sets the degrees of freedom to infinity. The "containment" mode (for lme models) determines the degrees of freedom for the coarsest grouping involved in the contrast or linear function involved, so it tends to under-estimate the degrees of freedom. The default is mode = "auto", which uses Satterthwaite if there are estimated random effects and the non-Satterthwaite option otherwise.

The extra.iter argument is ignored unless the d.f. method is (or defaults to) appx-satterthwaite.
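
For instance, with a hypothetical lme fit lme.fit and factor trt:

emmeans(lme.fit, "trt", mode = "containment")    # coarsest-grouping d.f.
emmeans(lme.fit, "trt", mode = "appx-satterthwaite", extra.iter = 20)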

Back to quick reference

Group L – lmerMod models

There is an optional lmer.df argument that defaults to get_emm_option("lmer.df") (which in turn defaults to "kenward-roger"). The possible values are "kenward-roger", "satterthwaite", and "asymptotic" (these are partially matched and case-insensitive). With "kenward-roger", d.f. are obtained using code from the pbkrtest package, if installed. With "satterthwaite", d.f. are obtained using code from the lmerTest package, if installed. With "asymptotic", or if the needed package is not installed, d.f. are set to Inf. (For backward compatibility, the user may specify mode in lieu of lmer.df.)

A by-product of the Kenward-Roger method is that the covariance matrix is adjusted using pbkrtest::vcovAdj(). This can require considerable computation; so to avoid that overhead, the user should opt for the Satterthwaite or asymptotic method; or, for backward compatibility, may disable the use of pbkrtest via emm_options(disable.pbkrtest = TRUE) (this does not disable the pbkrtest package entirely, just its use in emmeans). The computation time required depends roughly on the number of observations, N, in the design matrix (because a major part of the computation involves inverting an N x N matrix). Thus, pbkrtest is automatically disabled if N exceeds the value of get_emm_option("pbkrtest.limit"), for which the factory default is 3000. (The user may also specify pbkrtest.limit or disable.pbkrtest as an argument in the call to emmeans() or ref_grid().)
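
For example (model lmer.fit and factor trt hypothetical), either set a session-wide default or override it in a single call:

emm_options(lmer.df = "satterthwaite")            # avoid the pbkrtest overhead by default
emmeans(lmer.fit, "trt", lmer.df = "asymptotic")  # d.f. set to Inf for this call only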

Similarly to the above, the disable.lmerTest and lmerTest.limit options or arguments affect whether Satterthwaite methods can be implemented.

The df argument may be used to specify some other degrees of freedom. Note that if df and method = "kenward-roger" are both specified, the covariance matrix is adjusted but the K-R degrees of freedom are not used.

Finally, note that a user-specified covariance matrix (via the vcov. argument) will also disable the Kenward-Roger method; in that case, the Satterthwaite method is used in place of Kenward-Roger.

Back to quick reference

Group M – Multivariate models

When there is a multivariate response, the different responses are treated as if they were levels of a factor – named rep.meas by default. The mult.name argument may be used to change this name. The mult.levs argument may specify a named list of one or more sets of levels. If this has more than one element, then the multivariate levels are expressed as combinations of the named factor levels via the function base::expand.grid.
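
As a hypothetical sketch for a multivariate model mlm.fit whose response columns correspond to doses:

ref_grid(mlm.fit, mult.name = "dose")   # pseudo-factor named dose instead of rep.meas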

Group N – Multinomial responses

The reference grid includes a pseudo-factor with the same name and levels as the multinomial response. There is an optional mode argument which should match "prob" or "latent". With mode = "prob", the reference-grid predictions consist of the estimated multinomial probabilities. The "latent" mode returns the linear predictor, recentered so that it averages to zero over the levels of the response variable (similar to sum-to-zero contrasts). Thus each latent variable can be regarded as the log probability at that level minus the average log probability over all levels.

Please note that, because the probabilities sum to 1 (and the latent values sum to 0) over the multivariate-response levels, all sensible results from emmeans() must involve that response as one of the factors. For example, if resp is a response with k levels, emmeans(model, ~ resp | trt) will yield the estimated multinomial distribution for each trt; but emmeans(model, ~ trt) will just yield the average probability of 1/k for each trt.

Back to quick reference

Group O – Ordinal responses

The reference grid for ordinal models will include all variables that appear in the main model as well as those in the scale or nominal models (if provided). There are two optional arguments: mode (a character string) and rescale (which defaults to c(0, 1)). mode should match one of "latent" (the default), "linear.predictor", "cum.prob", "exc.prob", "prob", "mean.class", or "scale" – see the quick reference and note which are supported.

With mode = "latent", the reference-grid predictions are made on the scale of the latent variable implied by the model. The scale and location of this latent variable are arbitrary, and may be altered via rescale. The predictions are multiplied by rescale[2], then added to rescale[1]. Keep in mind that the scaling is related to the link function used in the model; for example, changing from a probit link to a logistic link will inflate the latent values by around \(\pi/\sqrt{3}\), all other things being equal. rescale has no effect for other values of mode.

With mode = "linear.predictor", mode = "cum.prob", and mode = "exc.prob", the boundaries between categories (i.e., thresholds) in the ordinal response are included in the reference grid as a pseudo-factor named cut. The reference-grid predictions are then of the cumulative probabilities at each threshold (for mode = "cum.prob"), exceedance probabilities (one minus cumulative probabilities, for mode = "exc.prob"), or the link function thereof (for mode = "linear.predictor").

With mode = "prob", a pseudo-factor with the same name as the model’s response variable is created, and the grid predictions are of the probabilities of each class of the ordinal response. With "mean.class", the returned results are means of the ordinal response, interpreted as a numeric value from 1 to the number of classes, using the "prob" results as the estimated probability distribution for each case.

With mode = "scale", and the fitted object incorporates a scale model, EMMs are obtained for the factors in the scale model (with a log response) instead of the response model. The grid is constructed using only the factors in the scale model.

Any grid point that is non-estimable by either the location or the scale model (if present) is set to NA, and any EMMs involving such a grid point will also be non-estimable. A consequence of this is that if there is a rank-deficient scale model, then all latent responses become non-estimable because the predictions are made using the average log-scale estimate.

rms models have an additional mode. With mode = "middle" (this is the default), the middle intercept is used, comparable to the default for rms::Predict(). This is quite similar in concept to mode = "latent", where all intercepts are averaged together.

Back to quick reference

Group P – Other packages

Models in this group have their emmeans support provided by the package that implements the model-fitting procedure. Users should refer to the package documentation for details on emmeans support. In some cases, a package’s models may have been supported here in emmeans; if so, the other package’s support overrides it.

Group Q – Quantile regression

The argument tau should match (within a very small margin) one of the quantiles actually specified in fitting the model; otherwise an error results. In these models, the covariance matrix is obtained via the model’s summary() method with covariance = TRUE. The user may specify one or more of the other arguments for summary or to be passed to, say, a bootstrap routine. If so, those optional arguments must be spelled-out completely (e.g., start will not be matched to startQR).

Group S – Sampling (MCMC) methods

Models fitted using MCMC methods contain a sample from the posterior distribution of fixed-effect coefficients. In some cases (e.g., results of MCMCpack::MCMCregress() and MCMCpack::MCMCpoisson()), the object may include a "call" attribute that emmeans() can use to reconstruct the data and obtain a basis for the EMMs. If not, formula and data arguments may be provided to help produce the right results. In addition, the contrasts specifications are not necessarily recoverable from the object, so the system default must match what was actually used in fitting the model.

The summary.emmGrid() method provides credibility intervals (HPD intervals) of the results, and ignores the frequentist-oriented arguments (infer, adjust, etc.). An as.mcmc() method is provided that creates an mcmc object that can be summarized or plotted using the coda package (or others that support those objects). It provides a posterior sample of EMMs, or contrasts thereof, for the given reference grid, based on the posterior sample of the fixed effects from the model object.

In MCMCglmm objects, the data argument is required; however, if you save it as a member of the model object (e.g., object$data = quote(mydata)), that removes the necessity of specifying it in each call. The special keyword trait is used in some models. When the response is multivariate and numeric, trait is generated automatically as a factor in the reference grid, and the argument mult.levs can be used to name its levels. In other models such as a multinomial model, use the mode argument to specify the type of model, and trait = <factor name> to specify the name of the data column that contains the levels of the factor response.

The brms package, version 2.13 and later, has its own emmeans support. Refer to the documentation in that package.

Back to quick reference

Group V – aovlist objects (also used with afex_aov objects)

Support for these objects is limited. To avoid strong biases in the predictions, it is strongly recommended that when fitting the model, the contrasts attribute of all factors should be of a type that sums to zero – for example, "contr.sum", "contr.poly", or "contr.helmert" but not "contr.treatment". If that is found not to be the case, the model is re-fitted using sum-to-zero contrasts (thus requiring additional computation). Doing so does not remove all bias in the EMMs unless the design is perfectly balanced, and an annotation is added to warn of that. This bias cancels out when doing comparisons and contrasts.
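
For example (dataset and variables hypothetical), one might ensure suitable contrasts before fitting:

options(contrasts = c("contr.sum", "contr.poly"))   # sum-to-zero, as recommended
fit <- aov(strength ~ trt * day + Error(subject/day), data = dat)
emmeans(fit, "trt")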

Only intra-block estimates of covariances are used. That is, if a factor appears in more than one error stratum, only the covariance structure from its lowest stratum is used in estimating standard errors. Degrees of freedom are obtained using the Satterthwaite method. In general, aovList support is best with balanced designs, with due caution in the use of contrasts. If a vcov. argument is supplied, it must yield a single covariance matrix for the unique fixed effects (not a set of them for each error stratum). In that case, the degrees of freedom are set to NA.

Back to quick reference

Index of all vignette topics

emmeans/inst/doc/utilities.Rmd

---
title: "Utilities and options for emmeans"
author: "emmeans package, Version `r packageVersion('emmeans')`"
output: emmeans::.emm_vignette
vignette: >
  %\VignetteIndexEntry{Utilities and options}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, echo = FALSE, results = "hide", message = FALSE}
require("emmeans")
emm_options(opt.digits = TRUE)
knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro")
```

## Contents {#contents}

1. [Updating an `emmGrid` object](#update)
2. [Setting options](#options)
    a. [Setting and viewing defaults](#defaults)
    b. [Optimal digits to display](#digits)
    c. [Startup options](#startup)
3. [Combining and subsetting `emmGrid` objects](#rbind)
4. [Accessing results to use elsewhere](#data)
5. [Adding grouping factors](#groups)
6. [Re-labeling and re-leveling an `emmGrid`](#relevel)

[Index of all vignette topics](vignette-topics.html)

## Updating an `emmGrid` object {#update}

Several internal settings are saved when functions like `ref_grid()`, `emmeans()`, `contrast()`, etc. are run. Those settings can be manipulated via the `update()` method for `emmGrid`s. To illustrate, consider the `pigs` dataset and model yet again:

```{r}
pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs)
pigs.emm <- emmeans(pigs.lm, "source")
pigs.emm
```

We see confidence intervals but not tests, by default. This happens as a result of internal settings in `pigs.emm` that are passed to `summary()` when the object is displayed. If we are going to work with this object a lot, we might want to change its internal settings rather than having to rely on explicitly calling `summary()` with several arguments. If so, just update the internal settings to what is desired; for example:

```{r}
pigs.emm.s <- update(pigs.emm, infer = c(TRUE, TRUE), null = log(35),
                     calc = c(n = ".wgt."))
pigs.emm.s
```

Note that by adding `calc`, we have set a default to calculate and display the sample size when the object is summarized. See `help("update.emmGrid")` for details on the keywords that can be changed. Mostly, they are the same as the names of arguments in the functions that construct these objects.

Of course, we can always get what we want via calls to `test()`, `confint()` or `summary()` with appropriate arguments. But the `update()` function is more useful in sophisticated manipulations of objects, or called implicitly via the `...` or `options` argument in `emmeans()` and other functions. Those options are passed to `update()` just before the object is returned. For example, we could have done the above update within the `emmeans()` call as follows (results are not shown because they are the same as before):

```{r eval = FALSE}
emmeans(pigs.lm, "source", infer = c(TRUE, TRUE), null = log(35),
        calc = c(n = ".wgt."))
```

[Back to contents](#contents)

## Setting options {#options}

Speaking of the `options` argument, note that the default in `emmeans()` is `options = get_emm_option("emmeans")`. Let's see what that is:

```{r}
get_emm_option("emmeans")
```

So, by default, confidence intervals, but not tests, are displayed when the result is summarized. The reverse is true for results of `contrast()` (and also the default for `pairs()` which calls `contrast()`):

```{r}
get_emm_option("contrast")
```

There are also defaults for a newly constructed reference grid:

```{r}
get_emm_option("ref_grid")
```

The default is to display neither intervals nor tests when summarizing.
In addition, the flag `is.new.rg` is set to `TRUE`, and that is why one sees a `str()` listing rather than a summary as the default when the object is simply shown by typing its name at the console.

### Setting and viewing defaults {#defaults}

The user may have other preferences. She may want to see both intervals and tests whenever contrasts are produced; and perhaps she also wants to always default to the response scale when transformations or links are present. We can change the defaults by setting the corresponding options; and that is done via the `emm_options()` function:

```{r}
emm_options(emmeans = list(type = "response"),
            contrast = list(infer = c(TRUE, TRUE)))
```

Now, new `emmeans()` results and contrasts follow the new defaults:

```{r}
pigs.anal.p <- emmeans(pigs.lm, consec ~ percent)
pigs.anal.p
```

Observe that the contrasts "inherited" the `type = "response"` default from the EMMs.

NOTE: Setting the above options does *not* change how existing `emmGrid` objects are displayed; it only affects ones constructed in the future.

There is one more option -- `summary` -- that overrides all other display defaults for both existing and future objects. For example, specifying `emm_options(summary = list(infer = c(TRUE, TRUE)))` will result in both intervals and tests being displayed, regardless of their internal defaults, unless `infer` is explicitly specified in a call to `summary()`.

To temporarily revert to factory defaults in a single call to `emmeans()` or `contrast()` or `pairs()`, specify `options = NULL` in the call. To reset everything to factory defaults (which we do presently), null-out all of the **emmeans** package options:

```{r}
options(emmeans = NULL)
```

### Optimal digits to display {#digits}

When an `emmGrid` object is summarized and displayed, the factory default is to display it with just enough digits as is justified by the standard errors or HPD intervals of the estimates displayed. You may use the `"opt.digits"` option to change this. If it is `TRUE` (the default), we display only enough digits as is justified (but at least 3). If it is set to `FALSE`, the number of digits is set using the R system's default, `getOption("digits")`; this is often much more precision than is justified. To illustrate, here is the summary of `pigs.emm` displayed without optimizing digits. Compare it with the first summary in this vignette.

```{r}
emm_options(opt.digits = FALSE)
pigs.emm
emm_options(opt.digits = TRUE)  # revert to optimal digits
```

By the way, setting this option does *not* round the calculated values computed by `summary.emmGrid()` or saved in a `summary_emm` object; it simply controls the precision displayed by `print.summary_emm()`.

### Startup options {#startup}

The options accessed by `emm_options()` and `get_emm_option()` are stored in a list named `emmeans` within R's options environment. Therefore, if you desire options other than the defaults provided on a regular basis, this can be easily arranged by specifying them in your startup script for R. For example, if you want to default to Satterthwaite degrees of freedom for `lmer` models, and display confidence intervals rather than tests for contrasts, your `.Rprofile` file could contain the line

```{r eval = FALSE}
options(emmeans = list(lmer.df = "satterthwaite",
                       contrast = list(infer = c(TRUE, FALSE))))
```

[Back to contents](#contents)

## Combining and subsetting `emmGrid` objects {#rbind}

Two or more `emmGrid` objects may be combined using the `rbind()` or `+` methods.
The most common reason (or perhaps the only good reason) to do this is to combine EMMs or contrasts into one family for purposes of applying a multiplicity adjustment to tests or intervals. A user may want to combine the three pairwise comparisons of sources with the three comparisons above of consecutive percents into a single family of six tests with a suitable multiplicity adjustment. This is done quite simply:

```{r}
rbind(pairs(pigs.emm.s), pigs.anal.p[[2]])
```

The default adjustment is `"bonferroni"`; we could have specified something different via the `adjust` argument. An equivalent way to combine `emmGrid`s is via the addition operator. Any options may be provided by `update()`. Below, we combine the same results into a family but ask for the "exact" multiplicity adjustment.

```{r}
update(pigs.anal.p[[2]] + pairs(pigs.emm.s), adjust = "mvt")
```

Also evident in comparing these results is that settings are obtained from the first object combined. So in the second output, where they are combined in reverse order, we get both confidence intervals and tests, and transformation to the response scale.

###### {#brackets}
To subset an `emmGrid` object, just use the subscripting operator `[]`. For instance,

```{r}
pigs.emm[2:3]
```

## Accessing results to use elsewhere {#data}

Sometimes, users want to use the results of an analysis (say, an `emmeans()` call) in other computations. The `summary()` method creates a `summary_emm` object that inherits from the `data.frame` class; so one may use the variables therein just as those in a data frame. Another way is to use the `as.data.frame()` method for `emmGrid` objects. This is provided to implement the standard way to coerce an object to a data frame. For illustration, let's compute the widths of the confidence intervals in our example.

```{r}
transform(pigs.emm, CI.width = upper.CL - lower.CL)
```

This implicitly converted `pigs.emm` to a data frame by passing it to the `as.data.frame()` method, then performed the required computation. But sometimes you have to explicitly call `as.data.frame()`. [Note that the `opt.digits` option is ignored here, because this is a regular data frame, not the summary of an `emmGrid`.]

[Back to contents](#contents)

## Adding grouping factors {#groups}

Sometimes, users want to group levels of a factor into a smaller number of groups. Those groups may then be, say, averaged separately and compared, or used as a `by` factor. The `add_grouping()` function serves this purpose. The function takes four arguments: the object, the name of the grouping factor to be created, the name of the reference factor that is being grouped, and a vector of level names of the grouping factor corresponding to levels of the reference factor. Suppose for example that we want to distinguish animal and non-animal sources of protein in the `pigs` example:

```{r}
pigs.emm.ss <- add_grouping(pigs.emm.s, "type", "source",
                            c("animal", "vegetable", "animal"))
str(pigs.emm.ss)
```

Note that the new object has a nesting structure (see more about this in the ["messy-data" vignette](messy-data.html#nesting)), with the reference factor nested in the new grouping factor. Now we can obtain means and comparisons for each group

```{r}
emmeans(pigs.emm.ss, pairwise ~ type)
```

[Back to contents](#contents)

## Re-labeling or re-leveling an `emmGrid` {#relevel}

Sometimes it is desirable to re-label the rows of an `emmGrid`, or cast it in terms of other factor(s). This can be done via the `levels` argument in `update()`.
As an example, sometimes a fitted model has a treatment factor that comprises combinations of other factors. In subsequent analysis, we may well want to break it down into the individual factors' contributions. Consider, for example, the `warpbreaks` data provided with R. We will define a single factor and fit a non-homogeneous-variance model:

```{r, message = FALSE}
warp <- transform(warpbreaks, treat = interaction(wool, tension))
library(nlme)
warp.gls <- gls(breaks ~ treat, weights = varIdent(form = ~ 1|treat), data = warp)
( warp.emm <- emmeans(warp.gls, "treat") )
```

But now we want to re-cast this `emmGrid` into one that has separate factors for `wool` and `tension`. We can do this as follows:

```{r}
warp.fac <- update(warp.emm, levels = list(
                wool = c("A", "B"), tension = c("L", "M", "H")))
str(warp.fac)
```

So now we can do various contrasts involving the separate factors:

```{r}
contrast(warp.fac, "consec", by = "wool")
```

Note: When re-leveling to more than one factor, you have to be careful to anticipate that the levels will be expanded using `expand.grid()`: the first factor in the list varies the fastest and the last varies the slowest. That was the case in our example, but in others, it may not be. Had the levels of `treat` been ordered as `A.L, A.M, A.H, B.L, B.M, B.H`, then we would have had to specify the levels of `tension` first and the levels of `wool` second.

[Back to contents](#contents)

[Index of all vignette topics](vignette-topics.html)

emmeans/inst/doc/basics.html

Basics of estimated marginal means

emmeans package, Version 1.7.2

Why we need EMMs

Consider the pigs dataset provided with the package (help("pigs") provides details). These data come from an unbalanced experiment where pigs are given different percentages of protein (percent) from different sources (source) in their diet, and later we measure the concentration (conc) of leucine. Here’s an interaction plot showing the mean conc at each combination of the other factors.

with(pigs, interaction.plot(percent, source, conc))

This plot suggests that with each source, conc tends to go up with percent, but that the mean differs with each source.

Now, suppose that we want to assess, numerically, the marginal results for percent. The natural thing to do is to obtain the marginal means:

with(pigs, tapply(conc, percent, mean))
##        9       12       15       18 
## 32.70000 38.01111 40.12857 39.94000

Looking at the plot, it seems a bit surprising that the last three means are all about the same, with the one for 15 percent being the largest.

Hmmmm, so let’s try another approach – actually averaging together the values we see in the plot. First, we need the means that are shown there:

cell.means <- matrix(with(pigs, 
    tapply(conc, interaction(source, percent), mean)), 
    nrow = 3)
cell.means
##          [,1]     [,2]     [,3]     [,4]
## [1,] 25.75000 30.93333 31.15000 32.33333
## [2,] 34.63333 39.63333 39.23333 42.90000
## [3,] 35.40000 43.46667 50.45000 59.80000

Confirm that the rows of this matrix match the plotted values for fish, soy, and skim, respectively. Now, average each column:

apply(cell.means, 2, mean)
## [1] 31.92778 38.01111 40.27778 45.01111

These results are decidedly different from the ordinary marginal means we obtained earlier. What’s going on? The answer is that some observations were lost, making the data unbalanced:

with(pigs, table(source, percent))
##       percent
## source 9 12 15 18
##   fish 2  3  2  3
##   soy  3  3  3  1
##   skim 3  3  2  1

We can reproduce the marginal means by weighting the cell means with these frequencies. For example, in the last column:

sum(c(3, 1, 1) * cell.means[, 4]) / 5
## [1] 39.94

The big discrepancy between the ordinary mean for percent = 18 and the marginal mean from cell.means is due to the fact that the lowest value receives 3 times the weight as the other two values.

The point

The point is that the marginal means of cell.means give equal weight to each cell. In many situations (especially with experimental data), that is a much fairer way to compute marginal means, in that they are not biased by imbalances in the data. We are, in a sense, estimating what the marginal means would be, had the experiment been balanced. Estimated marginal means (EMMs) serve that need.

All this said, there are certainly situations where equal weighting is not appropriate. Suppose, for example, we have data on sales of a product given different packaging and features. The data could be unbalanced because customers are more attracted to some combinations than others. If our goal is to understand scientifically what packaging and features are inherently more profitable, then equally weighted EMMs may be appropriate; but if our goal is to predict or maximize profit, the ordinary marginal means provide better estimates of what we can expect in the marketplace.

Back to Contents

What exactly are EMMs?

Model and reference grid

Estimated marginal means are based on a model – not directly on data. The basis for them is what we call the reference grid for a given model. To obtain the reference grid, consider all the predictors in the model. Here are the default rules for constructing the reference grid:

  • For each predictor that is a factor, use its levels (dropping unused ones)
  • For each numeric predictor (covariate), use its average.

The reference grid is then a regular grid of all combinations of these reference levels.

As a simple example, consider again the pigs dataset (see help("pigs") for details). Examination of residual plots from preliminary models suggests that it is a good idea to work in terms of log concentration.

If we treat the predictor percent as a factor, we might fit the following model:

pigs.lm1 <- lm(log(conc) ~ source + factor(percent), data = pigs)

The reference grid for this model can be found via the ref_grid function:

ref_grid(pigs.lm1)
## 'emmGrid' object with variables:
##     source = fish, soy, skim
##     percent =  9, 12, 15, 18
## Transformation: "log"

(Note: Many of the calculations that follow are meant to illustrate what is inside this reference-grid object; you don’t need to do such calculations yourself in routine analysis; just use the emmeans() (or possibly ref_grid()) function as we do later.)

In this model, both predictors are factors, and the reference grid consists of the 3 × 4 = 12 combinations of these factor levels. It can be seen explicitly by looking at the grid slot of this object:

ref_grid(pigs.lm1) @ grid
##    source percent .wgt.
## 1    fish       9     2
## 2     soy       9     3
## 3    skim       9     3
## 4    fish      12     3
## 5     soy      12     3
## 6    skim      12     3
## 7    fish      15     2
## 8     soy      15     3
## 9    skim      15     2
## 10   fish      18     3
## 11    soy      18     1
## 12   skim      18     1

Note that other information is retained in the reference grid, e.g., the transformation used on the response, and the cell counts as the .wgt. column.

Now, suppose instead that we treat percent as a numeric predictor. This leads to a different model – and a different reference grid.

pigs.lm2 <- lm(log(conc) ~ source + percent, data = pigs)
ref_grid(pigs.lm2)
## 'emmGrid' object with variables:
##     source = fish, soy, skim
##     percent = 12.931
## Transformation: "log"

This reference grid has the levels of source, but only one percent value, its average. Thus, the grid has only three elements:

ref_grid(pigs.lm2) @ grid
##   source  percent .wgt.
## 1   fish 12.93103    10
## 2    soy 12.93103    10
## 3   skim 12.93103     9

Back to Contents

Estimated marginal means

Once the reference grid is established, we can consider using the model to estimate the mean at each point in the reference grid. (Curiously, the convention is to call this “prediction” rather than “estimation”). For pigs.lm1, we have

pigs.pred1 <- matrix(predict(ref_grid(pigs.lm1)), nrow = 3)
pigs.pred1
##          [,1]     [,2]     [,3]     [,4]
## [1,] 3.220292 3.399846 3.437691 3.520141
## [2,] 3.493060 3.672614 3.710459 3.792909
## [3,] 3.622569 3.802124 3.839968 3.922419

Estimated marginal means (EMMs) are defined as equally weighted means of these predictions at specified margins:

apply(pigs.pred1, 1, mean) ### EMMs for source
## [1] 3.394492 3.667260 3.796770
apply(pigs.pred1, 2, mean) ### EMMs for percent
## [1] 3.445307 3.624861 3.662706 3.745156

For the other model, pigs.lm2, we have only one point in the reference grid for each source level; so the EMMs for source are just the predictions themselves:

predict(ref_grid(pigs.lm2))
## [1] 3.379865 3.652693 3.783120

These are slightly different from the previous EMMs for source, emphasizing the fact that EMMs are model-dependent. In models with covariates, EMMs are often called adjusted means.

The emmeans function computes EMMs, accompanied by standard errors and confidence intervals. For example,

emmeans(pigs.lm1, "percent")
##  percent emmean     SE df lower.CL upper.CL
##        9   3.45 0.0409 23     3.36     3.53
##       12   3.62 0.0384 23     3.55     3.70
##       15   3.66 0.0437 23     3.57     3.75
##       18   3.75 0.0530 23     3.64     3.85
## 
## Results are averaged over the levels of: source 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95

In these examples, all the results are presented on the log(conc) scale (and the annotations in the output warn of this). It is possible to convert them back to the conc scale by back-transforming. This topic is discussed in the vignette on transformations.

An additional note: There is an exception to the definition of EMMs given here. If the model has a nested structure in the fixed effects, then averaging is performed separately in each nesting group. See the section on nesting in the “messy-data” vignette for an example.

Back to Contents

Altering the reference grid

It is possible to alter the reference grid. We might, for example, want to define a reference grid for pigs.lm2 that is comparable to the one for pigs.lm1.

ref_grid(pigs.lm2, cov.keep = "percent")
## 'emmGrid' object with variables:
##     source = fish, soy, skim
##     percent =  9, 12, 15, 18
## Transformation: "log"

Using cov.keep = "percent" specifies that, instead of using the mean, the reference grid should use all the unique values of the covariate percent.

Another option is to specify a cov.reduce function that is used in place of the mean; e.g.,

ref_grid(pigs.lm2, cov.reduce = range)
## 'emmGrid' object with variables:
##     source = fish, soy, skim
##     percent =  9, 18
## Transformation: "log"

Another option is to use the at argument. Consider this model for the built-in mtcars dataset:

mtcars.lm <- lm(mpg ~ disp * cyl, data = mtcars)
ref_grid(mtcars.lm)
## 'emmGrid' object with variables:
##     disp = 230.72
##     cyl = 6.1875

Since both predictors are numeric, the default reference grid has only one point. For purposes of describing the fitted model, you might want to obtain predictions at a grid of points, like this:

mtcars.rg <- ref_grid(mtcars.lm, cov.keep = 3,
                      at = list(disp = c(100, 200, 300)))
mtcars.rg
## 'emmGrid' object with variables:
##     disp = 100, 200, 300
##     cyl = 4, 6, 8

This illustrates two things: a new use of cov.keep and the at argument. cov.keep = 3 specifies that any covariate having 3 or fewer unique values is treated like a factor (the system default is cov.keep = "2"). The at specification gives three values of disp, overriding the default behavior of using the mean of disp. Another use of at is to focus on only some of the levels of a factor, as sketched below. Note that at does not need to specify every predictor; those not mentioned in at are handled by cov.reduce, cov.keep, or the default methods. Also, covariate values in at need not be values that actually occur in the data, whereas cov.keep will use only values that are achieved.
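
For instance, here is a sketch of using at to restrict a factor to a subset of its levels (results not shown):

ref_grid(pigs.lm1, at = list(source = c("fish", "soy")))   # keep only two source levels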

Back to Contents

Derived covariates

You need to be careful when one covariate depends on the value of another. To illustrate in the mtcars example, suppose we want to use cyl as a factor and include a quadratic term for disp:

mtcars.1 <- lm(mpg ~ factor(cyl) + disp + I(disp^2), data = mtcars)
emmeans(mtcars.1, "cyl")
##  cyl emmean   SE df lower.CL upper.CL
##    4   19.3 2.66 27     13.9     24.8
##    6   17.2 1.36 27     14.4     20.0
##    8   18.8 1.47 27     15.7     21.8
## 
## Confidence level used: 0.95

Some users may not like function calls in the model formula, so they instead do something like this:

mtcars <- transform(mtcars, 
                    Cyl = factor(cyl),
                    dispsq = disp^2)
mtcars.2 <- lm(mpg ~ Cyl + disp + dispsq, data = mtcars)
emmeans(mtcars.2, "Cyl")
##  Cyl emmean   SE df lower.CL upper.CL
##  4     20.8 2.05 27     16.6     25.0
##  6     18.7 1.19 27     16.3     21.1
##  8     20.2 1.77 27     16.6     23.9
## 
## Confidence level used: 0.95

Wow! Those are really different results – even though the models are equivalent. Why is this? To understand, look at the reference grids:

ref_grid(mtcars.1)
## 'emmGrid' object with variables:
##     cyl = 4, 6, 8
##     disp = 230.72
ref_grid(mtcars.2)
## 'emmGrid' object with variables:
##     Cyl = 4, 6, 8
##     disp = 230.72
##     dispsq = 68113

For both models, the reference grid uses the disp mean of 230.72. But for mtcars.2, we also set dispsq to its mean of 68113. This is not right, because dispsq should be the square of disp (about 53232, not 68113) in order to be consistent. If we use that value of dispsq, we get the same results (modulo rounding error) as for mtcars.1:

emmeans(mtcars.2, "Cyl", at = list(dispsq = 230.72^2))
##  Cyl emmean   SE df lower.CL upper.CL
##  4     19.3 2.66 27     13.9     24.8
##  6     17.2 1.36 27     14.4     20.0
##  8     18.8 1.47 27     15.7     21.8
## 
## Confidence level used: 0.95

In summary, for polynomial models and others where some covariates depend on others in nonlinear ways, include that dependence in the model formula (as in mtcars.1) using I() or poly() expressions, or alter the reference grid so that the dependency among covariates is correct.
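
For example, here is a sketch of an equivalent fit using poly() with raw = TRUE, so that the coefficients parameterize disp and disp^2 directly; because the quadratic dependence lives inside the formula, the reference grid involves only disp itself:

mtcars.3 <- lm(mpg ~ factor(cyl) + poly(disp, 2, raw = TRUE), data = mtcars)
emmeans(mtcars.3, "cyl")   # should agree with the mtcars.1 results (not shown)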

Non-predictor variables

Reference grids are derived using the variables in the right-hand side of the model formula. But sometimes, these variables are not actually predictors. For example:

deg <- 2
mod <- lm(y ~ treat * poly(x, degree = deg), data = mydata)

If we call ref_grid() or emmeans() with this model, it will try to construct a grid of values of treat, x, and deg – causing an error because deg is not a predictor in this model. To get things to work correctly, you need to name deg in a params argument, e.g.,

emmeans(mod, ~ treat | x, at = list(x = 1:3), params = "deg")

Back to Contents

Graphical displays

The results of ref_grid() or emmeans() (these are objects of class emmGrid) may be plotted in two different ways. One is an interaction-style plot, using emmip(). In the following, let’s use it to compare the predictions from pigs.lm1 and pigs.lm2:

emmip(pigs.lm1, source ~ percent)

emmip(ref_grid(pigs.lm2, cov.reduce = FALSE), source ~ percent)

Notice that emmip() may also be used on a fitted model. The formula specification needs the x variable on the right-hand side and the “trace” factor (what is used to define the different curves) on the left. This is a good time to yet again emphasize that EMMs are based on a model. Neither of these plots is an interaction plot of the data; they are interaction plots of model predictions; and since neither model includes an interaction, no interaction at all is evident in the plots.

The other graphics option offered is the plot() method for emmGrid objects. In the following, we display the estimates and 95% confidence intervals for mtcars.rg in separate panels for each disp.

plot(mtcars.rg, by = "disp")

This plot illustrates, as much as anything else, how silly it is to try to predict mileage for a 4-cylinder car having high displacement, or an 8-cylinder car having low displacement. The widths of the intervals give us a clue that we are extrapolating. A better idea is to acknowledge that displacement largely depends on the number of cylinders. So here is yet another way to use cov.reduce to modify the reference grid:

mtcars.rg_d.c <- ref_grid(mtcars.lm, at = list(cyl = c(4,6,8)),
                          cov.reduce = disp ~ cyl)
mtcars.rg_d.c @ grid
##        disp cyl .wgt.
## 1  93.78673   4     1
## 2 218.98458   6     1
## 3 344.18243   8     1

The ref_grid call specifies that disp depends on cyl; so a linear model is fitted with the given formula and its fitted values are used as the disp values – only one for each cyl. If we plot this grid, the results are sensible, reflecting what the model predicts for typical cars with each number of cylinders:

plot(mtcars.rg_d.c)
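
As a check on what cov.reduce did here, the disp values in the grid are just the fitted values from regressing disp on cyl; a minimal sketch:

predict(lm(disp ~ cyl, data = mtcars),
        newdata = data.frame(cyl = c(4, 6, 8)))   # ~ 93.79, 218.98, 344.18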

Wizards with the ggplot2 package can further enhance these plots if they like. For example, we can add the data to an interaction plot – this time we opt to include confidence intervals and put the three sources in separate panels:

require("ggplot2")
## Loading required package: ggplot2
emmip(pigs.lm1, ~ percent | source, CIs = TRUE) +
    geom_point(aes(x = percent, y = log(conc)), data = pigs, pch = 2, color = "blue")

Formatting results

If you want to include emmeans() results in a report, you might want to have them in a nicer format than just the printed output. We provide a little bit of help for this, especially if you are using RMarkdown or Sweave to prepare the report. There is an xtable method for exporting these results, which we do not illustrate here, but it works similarly to xtable() in other contexts. Also, the export option of the print() method allows the user to save exactly what is seen in the printed output as text, to be saved or formatted as the user likes (see the documentation for print.emmGrid for details). Here is an example using one of the objects above:

ci <- confint(mtcars.rg_d.c, level = 0.90, adjust = "scheffe")
xport <- print(ci, export = TRUE)
cat("<font color = 'blue'>\n")
knitr::kable(xport$summary, align = "r")
for (a in xport$annotations) cat(paste(a, "<br>"))
cat("</font>\n")

 disp cyl prediction    SE df lower.CL upper.CL
 93.8   4       27.7 0.858 28     25.5     30.0
219.0   6       17.6 1.066 28     14.8     20.4
344.2   8       15.4 0.692 28     13.5     17.2

Confidence level used: 0.9
Conf-level adjustment: scheffe method with rank 3

Back to Contents

Using weights

It is possible to override the equal-weighting method for computing EMMs. Using weights = "cells" in the call will weight the predictions according to their cell frequencies (recall this information is retained in the reference grid). This produces results comparable to ordinary marginal means:

emmeans(pigs.lm1, "percent", weights = "cells")
##  percent emmean     SE df lower.CL upper.CL
##        9   3.47 0.0407 23     3.39     3.56
##       12   3.62 0.0384 23     3.55     3.70
##       15   3.67 0.0435 23     3.58     3.76
##       18   3.66 0.0515 23     3.55     3.76
## 
## Results are averaged over the levels of: source 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95

Note that, as in the ordinary means in the motivating example, the highest estimate is for percent = 15 rather than percent = 18. It is interesting to compare this with the results for a model that includes only percent as a predictor.

pigs.lm3 <- lm(log(conc) ~ factor(percent), data = pigs)
emmeans(pigs.lm3, "percent")
##  percent emmean     SE df lower.CL upper.CL
##        9   3.47 0.0731 25     3.32     3.62
##       12   3.62 0.0689 25     3.48     3.77
##       15   3.67 0.0782 25     3.51     3.83
##       18   3.66 0.0925 25     3.46     3.85
## 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95

The EMMs in these two tables are identical, but their standard errors are considerably different. That is because the model pigs.lm1 accounts for variations due to source. The lesson here is that it is possible to obtain statistics comparable to ordinary marginal means, while still accounting for variations due to the factors that are being averaged over.
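
In fact, because pigs.lm3 has percent as its only predictor, its EMMs coincide with the ordinary marginal means of the logged response; a quick check (results not shown):

with(pigs, tapply(log(conc), percent, mean))   # matches the pigs.lm3 EMMs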

Back to Contents

Multivariate responses

The emmeans package supports various multivariate models. When there is a multivariate response, the dimensions of that response are treated as if they were levels of a factor. For example, the MOats dataset provided in the package has predictors Block and Variety, and a four-dimensional response yield giving yields observed with varying amounts of nitrogen added to the soil. Here is a model and reference grid:

MOats.lm <- lm (yield ~ Block + Variety, data = MOats)
ref_grid (MOats.lm, mult.name = "nitro")
## 'emmGrid' object with variables:
##     Block = VI, V, III, IV, II, I
##     Variety = Golden Rain, Marvellous, Victory
##     nitro = multivariate response levels: 0, 0.2, 0.4, 0.6

So, nitro is regarded as a factor having 4 levels corresponding to the 4 dimensions of yield. We can subsequently obtain EMMs for any of the factors Block, Variety, nitro, or combinations thereof. The argument mult.name = "nitro" is optional; if it had been excluded, the multivariate levels would have been named rep.meas.
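
For instance, here is a sketch of obtaining EMMs for the multivariate dimension itself; the mult.name argument is simply passed through to ref_grid() (results not shown):

emmeans(MOats.lm, "nitro", mult.name = "nitro")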

Back to Contents

Objects, structures, and methods

The ref_grid() and emmeans() functions were introduced previously. These functions, and a few related ones, return an object of class emmGrid:

pigs.rg <- ref_grid(pigs.lm1)
class(pigs.rg)
## [1] "emmGrid"
## attr(,"package")
## [1] "emmeans"
pigs.emm.s <- emmeans(pigs.rg, "source")
class(pigs.emm.s)
## [1] "emmGrid"
## attr(,"package")
## [1] "emmeans"

If you simply show these objects, you get different-looking results:

pigs.rg
## 'emmGrid' object with variables:
##     source = fish, soy, skim
##     percent =  9, 12, 15, 18
## Transformation: "log"
pigs.emm.s
##  source emmean     SE df lower.CL upper.CL
##  fish     3.39 0.0367 23     3.32     3.47
##  soy      3.67 0.0374 23     3.59     3.74
##  skim     3.80 0.0394 23     3.72     3.88
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95

This is based on guessing what users most need to see when displaying the object. You can override these defaults; for example to just see a quick summary of what is there, do

str(pigs.emm.s)
## 'emmGrid' object with variables:
##     source = fish, soy, skim
## Transformation: "log"

The most important method for emmGrid objects is summary(). It is used as the print method for displaying an emmeans() result. For this reason, arguments for summary() may also be specified within most functions that produce these kinds of emmGrid objects. For example:

# equivalent to summary(emmeans(pigs.lm1, "percent"), level = 0.90, infer = TRUE))
emmeans(pigs.lm1, "percent", level = 0.90, infer = TRUE)
##  percent emmean     SE df lower.CL upper.CL t.ratio p.value
##        9   3.45 0.0409 23     3.38     3.52  84.262  <.0001
##       12   3.62 0.0384 23     3.56     3.69  94.456  <.0001
##       15   3.66 0.0437 23     3.59     3.74  83.757  <.0001
##       18   3.75 0.0530 23     3.65     3.84  70.716  <.0001
## 
## Results are averaged over the levels of: source 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.9

The summary() method for emmGrid objects actually produces a data.frame, but with extra bells and whistles:

class(summary(pigs.emm.s))
## [1] "summary_emm" "data.frame"

This can be useful to know because if you want to actually use emmeans() results in other computations, you should save its summary, and then you can access those results just like you would access data in a data frame. The emmGrid object itself is not so accessible. There is a print.summary_emm() function that is what actually produces the output you see above – a data frame with extra annotations.
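
For example, here is a minimal sketch of extracting columns from the summary just as from any data frame:

pigs.sum <- summary(pigs.emm.s)
pigs.sum$emmean        # the estimates, on the log scale
exp(pigs.sum$emmean)   # back-transformed by hand to the conc scale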

Back to Contents

P values, “significance”, and recommendations

There is some debate among statisticians and researchers about the appropriateness of P values, and about how the term “statistical significance” can be misleading. If you have a small P value, it only means that the effect being tested is unlikely to be explained by chance variation alone, in the context of the current study and the current statistical model underlying the test. If you have a large P value, it only means that the observed effect could plausibly be due to chance alone: it is wrong to conclude that there is no effect.

The American Statistical Association has for some time been advocating very cautious use of P values (Wasserstein and Lazar 2016) because they are too often misinterpreted and too often used carelessly. Wasserstein et al. (2019) even go so far as to advise against ever using the term “statistically significant”. The 43 articles that accompany it in the same issue of TAS recommend a number of alternatives. I do not agree with all that is said in the main article, and there are portions that are too cutesy or wander off-topic. Further, it is quite dizzying to try to digest all the accompanying articles, and to reconcile their disagreeing viewpoints.

For some time I included a summary of Wasserstein et al.’s recommendations and their ATOM paradigm (Acceptance of uncertainty, Thoughtfulness, Openness, Modesty). But in the meantime, I have handled a large number of user questions, and many of those have made it clear to me that there are more important fish to fry in a vignette section like this. It is just a fact that P values are used, and are useful. So I have my own set of recommendations regarding them.

A set of comparisons or well-chosen contrasts is more useful and interpretable than an omnibus F test

F tests are useful for model selection, but don’t tell you anything specific about the nature of an effect. If F has a small P value, it suggests that there is some effect, somewhere. It doesn’t even necessarily imply that any two means differ statistically.

Use adjusted P values

When you run a bunch of tests, there is a risk of making too many type-I errors, and adjusted P values (e.g., the Tukey adjustment for pairwise comparisons) keep you from making too many mistakes. That said, it is possible to go overboard; and it’s usually reasonable to regard each “by” group as a separate family of tests for purposes of adjustment.
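
For example, pairwise comparisons obtained with pairs() are Tukey-adjusted by default, and other adjustments may be requested; a sketch (results not shown):

pairs(pigs.emm.s)                          # Tukey adjustment by default
pairs(pigs.emm.s, adjust = "bonferroni")   # a more conservative choice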

It is not necessary to have a significant F test as a prerequisite to doing comparisons or contrasts

… as long as an appropriate adjustment is used. There do exist rules such as the “protected LSD” by which one is given license to do unadjusted comparisons provided the \(F\) statistic is “significant.” However, this is a very weak form of protection for which the justification is, basically, “if \(F\) is significant, you can say absolutely anything you want.”

Get the model right first

Everything the emmeans package does is an interpretation of the model that you fitted to the data. If the model is bad, you will get bad results from emmeans() and other functions. Every single limitation of your model, be it presuming constant error variance, omitting interaction terms, etc., becomes a limitation of the results emmeans() produces. So do a responsible job of fitting the model. And if you don’t know what’s meant by that…

Consider seeking the advice of a statistical consultant

Statistics is hard. It is a lot more than just running programs and copying output. It is your research; is it important that it be done right? Many academic statistics and biostatistics departments can refer you to someone who can help.

Back to Contents

Summary of main points

  • EMMs are derived from a model. A different model for the same data may lead to different EMMs.
  • EMMs are based on a reference grid consisting of all combinations of factor levels, with each covariate set to its average (by default).
  • For purposes of defining the reference grid, dimensions of a multivariate response are treated as levels of a factor.
  • EMMs are then predictions on this reference grid, or marginal averages thereof (equally weighted by default).
  • Reference grids may be modified using at or cov.reduce; the latter may be logical, a function, or a formula.
  • Reference grids and emmeans() results may be plotted via plot() (for parallel confidence intervals) or emmip() (for an interaction-style plot).
  • Be cautious with the terms “significant” and “nonsignificant”, and don’t ever interpret a “nonsignificant” result as saying that there is no effect.
  • Follow good practices such as getting the model right first, and using adjusted P values for appropriately chosen families of comparisons or contrasts.

Back to Contents

References

Wasserstein RL, Lazar NA (2016) “The ASA’s Statement on p-Values: Context, Process, and Purpose,” The American Statistician, 70, 129–133, https://doi.org/10.1080/00031305.2016.1154108

Wasserstein RL, Schirm AL, Lazar NA (2019) “Moving to a World Beyond ‘p < 0.05’,” The American Statistician, 73, 1–19, https://doi.org/10.1080/00031305.2019.1583913

Further reading

The reader is referred to the package’s other vignettes for more details and advanced use; each can be accessed via vignette("name", "emmeans").

Back to Contents

Index of all vignette topics


  1. In newer versions of emmeans, however, covariates having only two distinct values are by default treated as two-level factors, though there is an option to reduce them to their mean.↩︎

emmeans/inst/doc/sophisticated.R: R code extracted from the “sophisticated” vignette; the same code appears chunk-by-chunk in the rendered vignette below.

emmeans/inst/doc/sophisticated.html

Sophisticated models in emmeans

emmeans package, Version 1.7.2

This vignette gives a few examples of the use of the emmeans package to analyze models other than the basic types provided by the stats package. Emphasis here is placed on accessing the optional capabilities that are typically not needed for the more basic models. A reference for all supported models is provided in the “models” vignette.

Linear mixed models (lmer)

Linear mixed models are really important in statistics. Emphasis here is placed on those fitted using lme4::lmer(), but emmeans also supports other mixed-model packages such as nlme.

To illustrate, consider the Oats dataset in the nlme package. It has the results of a balanced split-plot experiment: experimental blocks are divided into plots that are randomly assigned to oat varieties, and the plots are subdivided into subplots that are randomly assigned to amounts of nitrogen within each plot. We will consider a linear mixed model for these data, excluding interaction (which is justified in this case). For sake of illustration, we will exclude a few observations.

Oats.lmer <- lme4::lmer(yield ~ Variety + factor(nitro) + (1|Block/Variety),
                        data = nlme::Oats, subset = -c(1,2,3,5,8,13,21,34,55))

Let’s look at the EMMs for nitro:

Oats.emm.n <- emmeans(Oats.lmer, "nitro")
Oats.emm.n
##  nitro emmean   SE   df lower.CL upper.CL
##    0.0   78.9 7.29 7.78     62.0     95.8
##    0.2   97.0 7.14 7.19     80.3    113.8
##    0.4  114.2 7.14 7.19     97.4    131.0
##    0.6  124.1 7.07 6.95    107.3    140.8
## 
## Results are averaged over the levels of: Variety 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95

You will notice that the degrees of freedom are fractional: that is due to the fact that whole-plot and subplot variations are combined when standard errors are estimated. Different degrees-of-freedom methods are available. By default, the Kenward-Roger method is used, and that’s why you see a message about the pbkrtest package being loaded, as it implements that method. We may specify a different degrees-of-freedom method via the optional argument lmer.df:

emmeans(Oats.lmer, "nitro", lmer.df = "satterthwaite")
##  nitro emmean   SE   df lower.CL upper.CL
##    0.0   78.9 7.28 7.28     61.8       96
##    0.2   97.0 7.13 6.72     80.0      114
##    0.4  114.2 7.13 6.72     97.2      131
##    0.6  124.1 7.07 6.49    107.1      141
## 
## Results are averaged over the levels of: Variety 
## Degrees-of-freedom method: satterthwaite 
## Confidence level used: 0.95

This latest result uses the Satterthwaite method, which is implemented in the lmerTest package. Note that, with this method, not only are the degrees of freedom slightly different, but so are the standard errors. That is because the Kenward-Roger method also entails making a bias adjustment to the covariance matrix of the fixed effects; that is the principal difference between the methods. A third possibility is "asymptotic":

emmeans(Oats.lmer, "nitro", lmer.df = "asymptotic")
##  nitro emmean   SE  df asymp.LCL asymp.UCL
##    0.0   78.9 7.28 Inf      64.6      93.2
##    0.2   97.0 7.13 Inf      83.1     111.0
##    0.4  114.2 7.13 Inf     100.2     128.2
##    0.6  124.1 7.07 Inf     110.2     137.9
## 
## Results are averaged over the levels of: Variety 
## Degrees-of-freedom method: asymptotic 
## Confidence level used: 0.95

This just sets all the degrees of freedom to Inf – that’s emmeans’s way of using z statistics rather than t statistics. The asymptotic methods tend to make confidence intervals a bit too narrow and P values a bit too low; but they involve much, much less computation. Note that the SEs are the same as obtained using the Satterthwaite method.

Comparisons and contrasts are pretty much the same as with other models. As nitro has quantitative levels, we might want to test polynomial contrasts:

contrast(Oats.emm.n, "poly")
##  contrast  estimate    SE   df t.ratio p.value
##  linear      152.69 15.58 43.2   9.802  <.0001
##  quadratic    -8.27  6.95 44.2  -1.190  0.2402
##  cubic        -6.32 15.21 42.8  -0.415  0.6800
## 
## Results are averaged over the levels of: Variety 
## Degrees-of-freedom method: kenward-roger

The interesting thing here is that the degrees of freedom are much larger than they are for the EMMs. The reason is that nitro is a within-plot factor, so inter-plot variations have little role in estimating contrasts among nitro levels. On the other hand, Variety is a whole-plot factor, and there is not much of a bump in degrees of freedom for comparisons:

emmeans(Oats.lmer, pairwise ~ Variety)
## $emmeans
##  Variety     emmean   SE   df lower.CL upper.CL
##  Golden Rain  105.2 7.53 8.46     88.0      122
##  Marvellous   108.5 7.48 8.28     91.3      126
##  Victory       96.9 7.64 8.81     79.6      114
## 
## Results are averaged over the levels of: nitro 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast                 estimate   SE   df t.ratio p.value
##  Golden Rain - Marvellous    -3.23 6.55 9.56  -0.493  0.8764
##  Golden Rain - Victory        8.31 6.71 9.80   1.238  0.4595
##  Marvellous - Victory        11.54 6.67 9.80   1.729  0.2431
## 
## Results are averaged over the levels of: nitro 
## Degrees-of-freedom method: kenward-roger 
## P value adjustment: tukey method for comparing a family of 3 estimates

System options for lmerMod models

The computations required for the adjusted covariance matrix and degrees of freedom may become cumbersome. Some user options (i.e., emm_options() calls) make it possible to streamline these computations through default methods and limitations on them. First, the option lmer.df, which may have values of "kenward-roger", "satterthwaite", or "asymptotic" (partial matches are OK!), specifies the default degrees-of-freedom method.

The options disable.pbkrtest and disable.lmerTest may be TRUE or FALSE, and comprise another way of controlling which method is used (e.g., the Kenward-Roger method will not be used if get_emm_option("disable.pbkrtest") == TRUE). Finally, the options pbkrtest.limit and lmerTest.limit, which should be set to numeric values, enable the given package conditionally on whether the number of data rows does not exceed the given limit. The factory default is 3000 for both limits.
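
For example, here is a sketch of setting these defaults for the current session (the particular values are arbitrary choices, not recommendations):

emm_options(lmer.df = "satterthwaite",   # default d.f. method
            pbkrtest.limit = 4000)       # allow Kenward-Roger up to 4000 rows
get_emm_option("lmer.df")                # verify the current setting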

Back to Contents

Models with offsets

If a model is fitted and its formula includes an offset() term, then by default, the offset is computed and included in the reference grid. To illustrate, consider a hypothetical dataset on insurance claims (used as an example in SAS’s documentation). There are classes of cars of varying counts (n), sizes (size), and ages (age), and we record the number of insurance claims (claims). We fit a Poisson model to claims as a function of size and age. An offset of log(n) is included so that n functions as an “exposure” variable.

ins <- data.frame(
    n = c(500, 1200, 100, 400, 500, 300),
    size = factor(rep(1:3,2), labels = c("S","M","L")),
    age = factor(rep(1:2, each = 3)),
    claims = c(42, 37, 1, 101, 73, 14))
ins.glm <- glm(claims ~ size + age + offset(log(n)), 
               data = ins, family = "poisson")

First, let’s look at the reference grid obtained by default:

ref_grid(ins.glm)
## 'emmGrid' object with variables:
##     size = S, M, L
##     age = 1, 2
##     n = 500
## Transformation: "log"

Note that n is included in the reference grid and that its average value of 500 is used for all predictions. Thus, if we obtain EMMs for, say, size, these results are based on a pool of 500 cars:

emmeans(ins.glm, "size", type = "response")
##  size rate   SE  df asymp.LCL asymp.UCL
##  S    69.3 6.25 Inf     58.03      82.7
##  M    34.6 3.34 Inf     28.67      41.9
##  L    11.9 3.14 Inf      7.07      19.9
## 
## Results are averaged over the levels of: age 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale

However, many users would like to ignore the offset for this kind of model, because then the estimates we obtain are rates per unit value of the (logged) offset. This may be accomplished by specifying an offset parameter in the call:

emmeans(ins.glm, "size", type = "response", offset = 0)
##  size   rate      SE  df asymp.LCL asymp.UCL
##  S    0.1385 0.01250 Inf    0.1161    0.1653
##  M    0.0693 0.00669 Inf    0.0573    0.0837
##  L    0.0237 0.00627 Inf    0.0141    0.0398
## 
## Results are averaged over the levels of: age 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale

You may verify that the above estimates are 1/500th of the previous ones. You may also verify that the above results are identical to those obtained by setting n equal to 1:

emmeans(ins.glm, "size", type = "response", at = list(n = 1))

However, those who use these types of models will be more comfortable directly setting the offset to zero.
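
For the record, here is the arithmetic behind that verification, using the rounded estimates from the two tables above:

c(69.3, 34.6, 11.9) / 500   # ~ 0.1386, 0.0692, 0.0238, as in the second table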

By the way, you may set some other reference value for the rates. For example, if you want estimates of claims per 100 cars, simply use (results not shown):

emmeans(ins.glm, "size", type = "response", offset = log(100))

Back to Contents

Ordinal models

Ordinal-response models comprise an example where several options are available for obtaining EMMs. To illustrate, consider the wine data in the ordinal package. The response is a rating of bitterness on a five-point scale. We will consider a probit model in two factors during fermentation: temp (temperature) and contact (contact with grape skins), with the judge making the rating as a scale predictor:

require("ordinal")
## Loading required package: ordinal
## 
## Attaching package: 'ordinal'
## The following objects are masked from 'package:lme4':
## 
##     VarCorr, ranef
wine.clm <- clm(rating ~ temp + contact, scale = ~ judge,
                data = wine, link = "probit")

(In earlier modeling, we found little interaction between the factors.) Here are the EMMs for each factor using default options:

emmeans(wine.clm, list(pairwise ~ temp, pairwise ~ contact))
## $`emmeans of temp`
##  temp emmean    SE  df asymp.LCL asymp.UCL
##  cold -0.884 0.290 Inf    -1.452    -0.316
##  warm  0.601 0.225 Inf     0.161     1.041
## 
## Results are averaged over the levels of: contact, judge 
## Confidence level used: 0.95 
## 
## $`pairwise differences of temp`
##  1           estimate    SE  df z.ratio p.value
##  cold - warm    -1.07 0.422 Inf  -2.547  0.0109
## 
## Results are averaged over the levels of: contact, judge 
## 
## $`emmeans of contact`
##  contact emmean    SE  df asymp.LCL asymp.UCL
##  no      -0.614 0.298 Inf   -1.1990   -0.0297
##  yes      0.332 0.201 Inf   -0.0632    0.7264
## 
## Results are averaged over the levels of: temp, judge 
## Confidence level used: 0.95 
## 
## $`pairwise differences of contact`
##  1        estimate    SE  df z.ratio p.value
##  no - yes   -0.684 0.304 Inf  -2.251  0.0244
## 
## Results are averaged over the levels of: temp, judge

These results are on the “latent” scale; the idea is that there is a continuous random variable (in this case normal, due to the probit link) having a mean that depends on the predictors; and that the ratings are a discretization of the latent variable based on a fixed set of cut points (which are estimated). In this particular example, we also have a scale model that says that the variance of the latent variable depends on the judges. The latent results are quite a bit like those for measurement data, making them easy to interpret. The only catch is that they are not uniquely defined: we could apply a linear transformation to them, and the same linear transformation to the cut points, and the results would be the same.

The clm function actually fits an ordinary probit model, but with different intercepts for each cut point. We can get detailed information for this model by specifying mode = "linear.predictor":

tmp <- ref_grid(wine.clm, mode = "lin")
tmp
## 'emmGrid' object with variables:
##     temp = cold, warm
##     contact = no, yes
##     judge = 1, 2, 3, 4, 5, 6, 7, 8, 9
##     cut = multivariate response levels: 1|2, 2|3, 3|4, 4|5
## Transformation: "probit"

Note that this reference grid involves an additional constructed predictor named cut that accounts for the different intercepts in the model. Let’s obtain EMMs for temp on the linear-predictor scale:

emmeans(tmp, "temp")
##  temp emmean    SE  df asymp.LCL asymp.UCL
##  cold  0.884 0.290 Inf     0.316     1.452
##  warm -0.601 0.225 Inf    -1.041    -0.161
## 
## Results are averaged over the levels of: contact, judge, cut 
## Results are given on the probit (not the response) scale. 
## Confidence level used: 0.95

These are just the negatives of the latent results obtained earlier (the sign is changed to make the comparisons go the right direction). Closely related to this are mode = "cum.prob" and mode = "exc.prob", which simply transform the linear predictor to cumulative probabilities and exceedance (1 - cumulative) probabilities. These modes give us access to the details of the fitted model, but they are cumbersome to use for describing results. They become useful when you want to work in terms of a particular cut point. Let’s look at temp again in terms of the probability that the rating will be at least 4:

emmeans(wine.clm, ~ temp, mode = "exc.prob", at = list(cut = "3|4"))
##  temp exc.prob     SE  df asymp.LCL asymp.UCL
##  cold   0.0748 0.0318 Inf    0.0124     0.137
##  warm   0.4069 0.0706 Inf    0.2686     0.545
## 
## Results are averaged over the levels of: contact, judge 
## Confidence level used: 0.95

There are yet more modes! With mode = "prob", we obtain estimates of the probability distribution of each rating. Its reference grid includes a factor with the same name as the model response – in this case rating. We usually want to use that as the primary factor, and the factors of interest as by variables:

emmeans(wine.clm, ~ rating | temp, mode = "prob")
## temp = cold:
##  rating   prob     SE  df asymp.LCL asymp.UCL
##  1      0.1292 0.0625 Inf   0.00667    0.2518
##  2      0.4877 0.0705 Inf   0.34948    0.6259
##  3      0.3083 0.0594 Inf   0.19186    0.4248
##  4      0.0577 0.0238 Inf   0.01104    0.1043
##  5      0.0171 0.0127 Inf  -0.00768    0.0419
## 
## temp = warm:
##  rating   prob     SE  df asymp.LCL asymp.UCL
##  1      0.0156 0.0129 Inf  -0.00961    0.0408
##  2      0.1473 0.0448 Inf   0.05959    0.2350
##  3      0.4302 0.0627 Inf   0.30723    0.5532
##  4      0.2685 0.0625 Inf   0.14593    0.3910
##  5      0.1384 0.0506 Inf   0.03923    0.2376
## 
## Results are averaged over the levels of: contact, judge 
## Confidence level used: 0.95

Using mode = "mean.class" obtains the mean of each of these probability distributions, treating the ratings as the integers 1–5:

emmeans(wine.clm, "temp", mode = "mean.class")
##  temp mean.class    SE  df asymp.LCL asymp.UCL
##  cold       2.35 0.144 Inf      2.06      2.63
##  warm       3.37 0.146 Inf      3.08      3.65
## 
## Results are averaged over the levels of: contact, judge 
## Confidence level used: 0.95

And there is a mode for the scale model too. In this example, the scale model involves only judges, and that is the only factor in the grid:

summary(ref_grid(wine.clm, mode = "scale"), type = "response")
##  judge response    SE  df
##  1        1.000 0.000 Inf
##  2        1.043 0.570 Inf
##  3        1.053 0.481 Inf
##  4        0.710 0.336 Inf
##  5        0.663 0.301 Inf
##  6        0.758 0.341 Inf
##  7        1.071 0.586 Inf
##  8        0.241 0.179 Inf
##  9        0.533 0.311 Inf

Judge 8’s ratings don’t vary much, relative to the others. The scale model is in terms of log(SD). Again, these are not uniquely identifiable, and the first level’s estimate is set to log(1) = 0; so, in effect, each estimate shown is a comparison with judge 1.

Back to Contents

Models fitted using MCMC methods

To illustrate emmeans’s support for models fitted using MCMC methods, consider the example_model available in the rstanarm package. The example concerns CBPP, a serious disease of cattle in Ethiopia. A generalized linear mixed model was fitted to the data using the code below. (This is a Bayesian equivalent of the frequentist model we considered in the “Transformations” vignette.) In fitting the model, we first set the contrast coding to bayestestR::contr.bayes because this equalizes the priors across different treatment levels (a correction from an earlier version of this vignette). We subsequently obtain the reference grids for these models in the usual way. For later use, we also fit the same model with just the prior information.

cbpp <- transform(lme4::cbpp, unit = 1:56)
require("bayestestR")
options(contrasts = c("contr.bayes", "contr.poly"))
cbpp.rstan <- rstanarm::stan_glmer(
    cbind(incidence, size - incidence) ~ period + (1|herd) + (1|unit),
    data = cbpp, family = binomial,
    prior = student_t(df = 5, location = 0, scale = 2, autoscale = FALSE),
    chains = 2, cores = 1, seed = 2021.0120, iter = 1000)
cbpp_prior.rstan <- update(cbpp.rstan, prior_PD = TRUE)
cbpp.rg <- ref_grid(cbpp.rstan)
cbpp_prior.rg <- ref_grid(cbpp_prior.rstan)

Here is the structure of the reference grid:

cbpp.rg
## 'emmGrid' object with variables:
##     period = 1, 2, 3, 4
## Transformation: "logit"

So here are the EMMs (no averaging needed in this simple model):

summary(cbpp.rg)
##  period prediction lower.HPD upper.HPD
##  1           -1.60     -2.26    -0.987
##  2           -2.77     -3.65    -1.974
##  3           -2.90     -3.77    -2.040
##  4           -3.32     -4.43    -2.385
## 
## Point estimate displayed: median 
## Results are given on the logit (not the response) scale. 
## HPD interval probability: 0.95

The summary for EMMs of Bayesian models shows the median of the posterior distribution of each estimate, along with highest posterior density (HPD) intervals. Under the hood, the posterior sample of parameter estimates is used to compute a corresponding sample of posterior EMMs, and it is those that are summarized. (Technical note: the summary is actually rerouted to the hpd.summary() function.)
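
For instance, here is a sketch of calling that function directly to request 90% HPD intervals (results not shown):

hpd.summary(cbpp.rg, prob = 0.9)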

We can access the posterior EMMs via the as.mcmc method for emmGrid objects. This gives us an object of class mcmc (defined in the coda package), which can be summarized and explored as we please.

require("coda")
## Loading required package: coda
summary(as.mcmc(cbpp.rg))
## 
## Iterations = 1:500
## Thinning interval = 1 
## Number of chains = 2 
## Sample size per chain = 500 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##            Mean     SD Naive SE Time-series SE
## period 1 -1.595 0.3333  0.01054        0.01279
## period 2 -2.790 0.4327  0.01368        0.01367
## period 3 -2.916 0.4491  0.01420        0.01706
## period 4 -3.379 0.5384  0.01703        0.01845
## 
## 2. Quantiles for each variable:
## 
##            2.5%    25%    50%    75%   97.5%
## period 1 -2.283 -1.823 -1.597 -1.357 -0.9929
## period 2 -3.707 -3.038 -2.766 -2.511 -2.0197
## period 3 -3.834 -3.196 -2.899 -2.610 -2.0915
## period 4 -4.502 -3.725 -3.318 -3.022 -2.4079

Note that as.mcmc will actually produce an mcmc.list when there is more than one chain present, as in this example. The 2.5th and 97.5th quantiles are similar, but not identical, to the 95% confidence intervals in the frequentist summary.

The bayestestR package provides emmGrid methods for most of its description and testing functions. For example:

bayestestR::bayesfactor_parameters(pairs(cbpp.rg), prior = pairs(cbpp_prior.rg))
## Loading required namespace: logspline
## Bayes Factor (Savage-Dickey density ratio)
## 
## Parameter |    BF
## -----------------
## 1 - 2     |  3.01
## 1 - 3     |  5.13
## 1 - 4     | 14.26
## 2 - 3     | 0.173
## 2 - 4     | 0.268
## 3 - 4     | 0.221
## 
## * Evidence Against The Null: 0
bayestestR::p_rope(pairs(cbpp.rg), range = c(-0.25, 0.25))
## Proportion of samples inside the ROPE [-0.25, 0.25]
## 
## Parameter | p (ROPE)
## --------------------
## 1 - 2     |    0.021
## 1 - 3     |    0.015
## 1 - 4     |    0.004
## 2 - 3     |    0.367
## 2 - 4     |    0.184
## 3 - 4     |    0.290

Both of these sets of results suggest that period 1 is different from the others. For more information on these methods, refer to the CRAN page for bayestestR and its vignettes, e.g., the one on Bayes factors.

Bias-adjusted incidence probabilities

Next, let us consider the back-transformed results. As is discussed with the frequentist model, there are random effects present, and if we want to think in terms of marginal probabilities across all herds and units, we should correct for bias; and to do that, we need the standard deviations of the random effects. The model object has MCMC results for the random effects of each herd and each unit, but after those, there are also summary results for the posterior SDs of the two random effects. (I used the colnames function to find that they are in the 78th and 79th columns.)

cbpp.sigma = as.matrix(cbpp.rstan$stanfit)[, 78:79]

Here are the first few:

head(cbpp.sigma)
##           parameters
## iterations Sigma[unit:(Intercept),(Intercept)] Sigma[herd:(Intercept),(Intercept)]
##       [1,]                            1.154694                         0.167807505
##       [2,]                            1.459379                         0.040318460
##       [3,]                            1.482619                         0.006198847
##       [4,]                            1.236694                         0.206057981
##       [5,]                            1.460472                         0.088491844
##       [6,]                            1.412277                         0.070334431
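
As an aside, here is a sketch of locating those columns by name rather than by position, which is less fragile across package versions; it assumes the column names match the pattern shown above:

sel <- grep("Sigma[", colnames(as.matrix(cbpp.rstan$stanfit)), fixed = TRUE)
cbpp.sigma <- as.matrix(cbpp.rstan$stanfit)[, sel]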

So to obtain bias-adjusted marginal probabilities, obtain the resultant SD and regrid with bias correction:

totSD <- sqrt(apply(cbpp.sigma^2, 1, sum))
cbpp.rgrd <- regrid(cbpp.rg, bias.adjust = TRUE, sigma = totSD)
summary(cbpp.rgrd)
##  period   prob lower.HPD upper.HPD
##  1      0.2192    0.1207     0.377
##  2      0.0856    0.0285     0.176
##  3      0.0747    0.0288     0.163
##  4      0.0506    0.0136     0.120
## 
## Point estimate displayed: median 
## HPD interval probability: 0.95

Here is a plot of the posterior incidence probabilities, back-transformed:

bayesplot::mcmc_areas(as.mcmc(cbpp.rgrd))

… and here are intervals for each period compared with its neighbor:

contrast(cbpp.rgrd, "consec", reverse = TRUE)
##  contrast estimate lower.HPD upper.HPD
##  1 - 2     0.13379    0.0258     0.274
##  2 - 3     0.00864   -0.0714     0.111
##  3 - 4     0.02317   -0.0564     0.114
## 
## Point estimate displayed: median 
## HPD interval probability: 0.95

The only interval that excludes zero is the one that compares periods 1 and 2.

Bayesian prediction

To predict from an MCMC model, just specify the likelihood argument in as.mcmc. Doing so causes the function to simulate data from the posterior predictive distribution. For example, if we want to predict the CBPP incidence in future herds of 25 cattle, we can do:

set.seed(2019.0605)
cbpp.preds <- as.mcmc(cbpp.rgrd, likelihood = "binomial", trials = 25)
bayesplot::mcmc_hist(cbpp.preds, binwidth = 1)

Back to Contents

Index of all vignette topics

emmeans/inst/doc/basics.Rmd0000644000176200001440000007024314137062735015347 0ustar liggesusers--- title: "Basics of estimated marginal means" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{Basics of EMMs} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, results = "hide", message = FALSE} require("emmeans") knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") ``` ## Contents {#contents} 1. [Motivating example](#motivation) 2. [EMMs defined](#EMMdef) a. [Reference grids](#ref_grid) b. [Estimated marginal means](#emmeans) c. [Altering the reference grid](#altering) d. [Derived covariates](#depcovs) d. [Non-predictor variables](#params) e. [Graphical displays](#plots) e. [Formatting results](#formatting) f. [Weighting](#weights) g. [Multivariate models](#multiv) 3. [Objects, structures, and methods](#emmobj) 4. [P values, "significance", and recommendations](#pvalues) 5. [Summary](#summary) 6. [Further reading](#more) [Index of all vignette topics](vignette-topics.html) ## Why we need EMMs {#motivation} Consider the `pigs` dataset provided with the package (`help("pigs")` provides details). These data come from an unbalanced experiment where pigs are given different percentages of protein (`percent`) from different sources (`source`) in their diet, and later we measure the concentration (`conc`) of leucine. Here's an interaction plot showing the mean `conc` at each combination of the other factors. ```{r, echo = FALSE} par(mar = .1 + c(4, 4, 1, 1)) # reduce head space ``` ```{r} with(pigs, interaction.plot(percent, source, conc)) ``` This plot suggests that with each `source`, `conc` tends to go up with `percent`, but that the mean differs with each `source`. Now, suppose that we want to assess, numerically, the marginal results for `percent`. The natural thing to do is to obtain the marginal means: ```{r} with(pigs, tapply(conc, percent, mean)) ``` Looking at the plot, it seems a bit surprising that the last three means are all about the same, with the one for 15 percent being the largest. Hmmmm, so let's try another approach -- actually averaging together the values we see in the plot. First, we need the means that are shown there: ```{r} cell.means <- matrix(with(pigs, tapply(conc, interaction(source, percent), mean)), nrow = 3) cell.means ``` Confirm that the rows of this matrix match the plotted values for fish, soy, and skim, respectively. Now, average each column: ```{r} apply(cell.means, 2, mean) ``` These results are decidedly different from the ordinary marginal means we obtained earlier. What's going on? The answer is that some observations were lost, making the data unbalanced: ```{r} with(pigs, table(source, percent)) ``` We can reproduce the marginal means by weighting the cell means with these frequencies. For example, in the last column: ```{r} sum(c(3, 1, 1) * cell.means[, 4]) / 5 ``` The big discrepancy between the ordinary mean for `percent = 18` and the marginal mean from `cell.means` is due to the fact that the lowest value receives 3 times the weight as the other two values. ### The point {#eqwts} The point is that the marginal means of `cell.means` give *equal weight* to each cell. In many situations (especially with experimental data), that is a much fairer way to compute marginal means, in that they are not biased by imbalances in the data. We are, in a sense, estimating what the marginal means *would* be, had the experiment been balanced. 
Estimated marginal means (EMMs) serve that need. All this said, there are certainly situations where equal weighting is *not* appropriate. Suppose, for example, we have data on sales of a product given different packaging and features. The data could be unbalanced because customers are more attracted to some combinations than others. If our goal is to understand scientifically what packaging and features are inherently more profitable, then equally weighted EMMs may be appropriate; but if our goal is to predict or maximize profit, the ordinary marginal means provide better estimates of what we can expect in the marketplace. [Back to Contents](#contents) ## What exactly are EMMs? {#EMMdef} ### Model and reference grid {#ref_grid} Estimated marginal means are based on a *model* -- not directly on data. The basis for them is what we call the *reference grid* for a given model. To obtain the reference grid, consider all the predictors in the model. Here are the default rules for constructing the reference grid * For each predictor that is a *factor*, use its levels (dropping unused ones) * For each numeric predictor (covariate), use its average.[^1] The reference grid is then a regular grid of all combinations of these reference levels. As a simple example, consider again the `pigs` dataset (see `help("fiber")` for details). Examination of residual plots from preliminary models suggests that it is a good idea to work in terms of log concentration. If we treat the predictor `percent` as a factor, we might fit the following model: ```{r} pigs.lm1 <- lm(log(conc) ~ source + factor(percent), data = pigs) ``` The reference grid for this model can be found via the `ref_grid` function: ```{r} ref_grid(pigs.lm1) ``` (*Note:* Many of the calculations that follow are meant to illustrate what is inside this reference-grid object; You don't need to do such calculations yourself in routine analysis; just use the `emmeans()` (or possibly `ref_grid()`) function as we do later.) In this model, both predictors are factors, and the reference grid consists of the $3\times4 = 12$ combinations of these factor levels. It can be seen explicitly by looking at the `grid` slot of this object: ```{r} ref_grid(pigs.lm1) @ grid ``` Note that other information is retained in the reference grid, e.g., the transformation used on the response, and the cell counts as the `.wgt.` column. Now, suppose instead that we treat `percent` as a numeric predictor. This leads to a different model -- and a different reference grid. ```{r} pigs.lm2 <- lm(log(conc) ~ source + percent, data = pigs) ref_grid(pigs.lm2) ``` This reference grid has the levels of `source`, but only one `percent` value, its average. Thus, the grid has only three elements: ```{r} ref_grid(pigs.lm2) @ grid ``` [^1]: In newer versions of **emmeans**, however, covariates having only two distinct values are by default treated as two-level factors, though there is an option to reduce them to their mean. [Back to Contents](#contents) ### Estimated marginal means {#emmeans} Once the reference grid is established, we can consider using the model to estimate the mean at each point in the reference grid. (Curiously, the convention is to call this "prediction" rather than "estimation"). 
For `pigs.lm1`, we have ```{r} pigs.pred1 <- matrix(predict(ref_grid(pigs.lm1)), nrow = 3) pigs.pred1 ``` Estimated marginal means (EMMs) are defined as equally weighted means of these predictions at specified margins: ```{r} apply(pigs.pred1, 1, mean) ### EMMs for source apply(pigs.pred1, 2, mean) ### EMMs for percent ``` For the other model, `pigs.lm2`, we have only one point in the reference grid for each `source` level; so the EMMs for `source` are just the predictions themselves: ```{r} predict(ref_grid(pigs.lm2)) ``` These are slightly different from the previous EMMs for `source`, emphasizing the fact that EMMs are model-dependent. In models with covariates, EMMs are often called *adjusted means*. The `emmeans` function computes EMMs, accompanied by standard errors and confidence intervals. For example, ```{r} emmeans(pigs.lm1, "percent") ``` In these examples, all the results are presented on the `log(conc)` scale (and the annotations in the output warn of this). It is possible to convert them back to the `conc` scale by back-transforming. This topic is discussed in [the vignette on transformations](transformations.html). An additional note: There is an exception to the definition of EMMs given here. If the model has a nested structure in the fixed effects, then averaging is performed separately in each nesting group. See the [section on nesting in the "messy-data" vignette](messy-data.html#nesting) for an example. [Back to Contents](#contents) ### Altering the reference grid {#altering} It is possible to alter the reference grid. We might, for example, want to define a reference grid for `pigs.lm2` that is comparable to the one for `pigs.lm1`. ```{r} ref_grid(pigs.lm2, cov.keep = "percent") ``` Using `cov.keep = "percent"` specifies that, instead of using the mean, the reference grid should use all the unique values of `each covariate`"percent"`. Another option is to specify a `cov.reduce` function that is used in place of the mean; e.g., ```{r} ref_grid(pigs.lm2, cov.reduce = range) ``` Another option is to use the `at` argument. Consider this model for the built-in `mtcars` dataset: ```{r} mtcars.lm <- lm(mpg ~ disp * cyl, data = mtcars) ref_grid(mtcars.lm) ``` Since both predictors are numeric, the default reference grid has only one point. For purposes of describing the fitted model, you might want to obtain predictions at a grid of points, like this: ```{r} mtcars.rg <- ref_grid(mtcars.lm, cov.keep = 3, at = list(disp = c(100, 200, 300))) mtcars.rg ``` This illustrates two things: a new use of `cov.keep` and the `at` argument. `cov.keep = "3"` specifies that any covariates having 3 or fewer unique values is treated like a factor (the system default is `cov.keep = "2"`). The `at` specification gives three values of `disp`, overriding the default behavior to use the mean of `disp`. Another use of `at` is to focus on only some of the levels of a factor. Note that `at` does not need to specify every predictor; those not mentioned in `at` are handled by `cov.reduce`, `cov.keep`, or the default methods. Also, covariate values in `at` need not be values that actually occur in the data, whereas `cov.keep` will use only values that are achieved. [Back to Contents](#contents) ### Derived covariates {#depcovs} You need to be careful when one covariate depends on the value of another. 
To illustrate in the `mtcars` example, suppose we want to use `cyl` as a factor and include a quadratic term for `disp`:

```{r}
mtcars.1 <- lm(mpg ~ factor(cyl) + disp + I(disp^2), data = mtcars)
emmeans(mtcars.1, "cyl")
```

Some users may not like function calls in the model formula, so they instead do something like this:

```{r}
mtcars <- transform(mtcars, Cyl = factor(cyl), dispsq = disp^2)
mtcars.2 <- lm(mpg ~ Cyl + disp + dispsq, data = mtcars)
emmeans(mtcars.2, "Cyl")
```

Wow! Those are really different results -- even though the models are equivalent. Why is this? To understand, look at the reference grids:

```{r}
ref_grid(mtcars.1)
ref_grid(mtcars.2)
```

For both models, the reference grid uses the `disp` mean of 230.72. But for `mtcars.2`, we also set `dispsq` to its mean of 68113. This is not right, because `dispsq` should be the square of `disp` (about 53232, not 68113) in order to be consistent. If we use that value of `dispsq`, we get the same results (modulo rounding error) as for `mtcars.1`:

```{r}
emmeans(mtcars.2, "Cyl", at = list(dispsq = 230.72^2))
```

In summary, for polynomial models and others where some covariates depend on others in nonlinear ways, include that dependence in the model formula (as in `mtcars.1`) using `I()` or `poly()` expressions, or alter the reference grid so that the dependency among covariates is correct.

### Non-predictor variables {#params}

Reference grids are derived using the variables in the right-hand side of the model formula. But sometimes, these variables are not actually predictors. For example:

```{r, eval = FALSE}
deg <- 2
mod <- lm(y ~ treat * poly(x, degree = deg), data = mydata)
```

If we call `ref_grid()` or `emmeans()` with this model, it will try to construct a grid of values of `treat`, `x`, and `deg` -- causing an error because `deg` is not a predictor in this model. To get things to work correctly, you need to name `deg` in a `params` argument, e.g.,

```{r, eval = FALSE}
emmeans(mod, ~ treat | x, at = list(x = 1:3), params = "deg")
```

[Back to Contents](#contents)

### Graphical displays {#plots}

The results of `ref_grid()` or `emmeans()` (these are objects of class `emmGrid`) may be plotted in two different ways. One is an interaction-style plot, using `emmip()`. In the following, let's use it to compare the predictions from `pigs.lm1` and `pigs.lm2`:

```{r}
emmip(pigs.lm1, source ~ percent)
emmip(ref_grid(pigs.lm2, cov.reduce = FALSE), source ~ percent)
```

Notice that `emmip()` may also be used on a fitted model. The formula specification needs the *x* variable on the right-hand side and the "trace" factor (what is used to define the different curves) on the left. This is a good time to yet again emphasize that EMMs are based on a *model*. Neither of these plots is an interaction plot of the *data*; they are interaction plots of model predictions; and since neither model includes an interaction, no interaction at all is evident in the plots.

###### {#plot.emmGrid}
The other graphics option offered is the `plot()` method for `emmGrid` objects. In the following, we display the estimates and 95% confidence intervals for `mtcars.rg` in separate panels for each `disp`.

```{r}
plot(mtcars.rg, by = "disp")
```

This plot illustrates, as much as anything else, how silly it is to try to predict mileage for a 4-cylinder car having high displacement, or an 8-cylinder car having low displacement. The widths of the intervals give us a clue that we are extrapolating.
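Before pursuing a remedy, it is worth confirming how strongly these two predictors are related in the raw data (a quick check on `mtcars` itself; nothing here is specific to **emmeans**):

```{r}
with(mtcars, cor(cyl, disp))
```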
A better idea is to acknowledge that displacement largely depends on the number of cylinders. So here is yet another way to use `cov.reduce` to modify the reference grid:

```{r}
mtcars.rg_d.c <- ref_grid(mtcars.lm, at = list(cyl = c(4,6,8)),
                          cov.reduce = disp ~ cyl)
mtcars.rg_d.c @ grid
```

The `ref_grid` call specifies that `disp` depends on `cyl`; so a linear model is fitted with the given formula and its fitted values are used as the `disp` values -- only one for each `cyl`. If we plot this grid, the results are sensible, reflecting what the model predicts for typical cars with each number of cylinders:

```{r fig.height = 1.5}
plot(mtcars.rg_d.c)
```

###### {#ggplot}
Wizards with the **ggplot2** package can further enhance these plots if they like. For example, we can add the data to an interaction plot -- this time we opt to include confidence intervals and put the three sources in separate panels:

```{r}
require("ggplot2")
emmip(pigs.lm1, ~ percent | source, CIs = TRUE) +
    geom_point(aes(x = percent, y = log(conc)), data = pigs, pch = 2, color = "blue")
```

### Formatting results {#formatting}

If you want to include `emmeans()` results in a report, you might want to have it in a nicer format than just the printed output. We provide a little bit of help for this, especially if you are using R Markdown or Sweave to prepare the report. There is an `xtable` method for exporting these results, which we do not illustrate here but it works similarly to `xtable()` in other contexts. Also, the `export` option in the `print()` method allows the user to save exactly what is seen in the printed output as text, to be saved or formatted as the user likes (see the documentation for `print.emmGrid` for details). Here is an example using one of the objects above:

```{r, eval = FALSE}
ci <- confint(mtcars.rg_d.c, level = 0.90, adjust = "scheffe")
xport <- print(ci, export = TRUE)
cat("\n")
knitr::kable(xport$summary, align = "r")
for (a in xport$annotations) cat(paste(a, "<br>"))
cat("<br>\n")
```

```{r, results = "asis", echo = FALSE}
ci <- confint(mtcars.rg_d.c, level = 0.90, adjust = "scheffe")
xport <- print(ci, export = TRUE)
cat("\n")
knitr::kable(xport$summary, align = "r")
for (a in xport$annotations) cat(paste(a, "<br>"))
cat("<br>\n")
```

[Back to Contents](#contents)

### Using weights {#weights}

It is possible to override the equal-weighting method for computing EMMs. Using `weights = "cells"` in the call will weight the predictions according to their cell frequencies (recall this information is retained in the reference grid). This produces results comparable to ordinary marginal means:

```{r}
emmeans(pigs.lm1, "percent", weights = "cells")
```

Note that, as in the ordinary means in [the motivating example](#motivation), the highest estimate is for `percent = 15` rather than `percent = 18`. It is interesting to compare this with the results for a model that includes only `percent` as a predictor.

```{r}
pigs.lm3 <- lm(log(conc) ~ factor(percent), data = pigs)
emmeans(pigs.lm3, "percent")
```

The EMMs in these two tables are identical, but their standard errors are considerably different. That is because the model `pigs.lm1` accounts for variations due to `source`. The lesson here is that it is possible to obtain statistics comparable to ordinary marginal means, while still accounting for variations due to the factors that are being averaged over.

[Back to Contents](#contents)

### Multivariate responses {#multiv}

The **emmeans** package supports various multivariate models. When there is a multivariate response, the dimensions of that response are treated as if they were levels of a factor. For example, the `MOats` dataset provided in the package has predictors `Block` and `Variety`, and a four-dimensional response `yield` giving yields observed with varying amounts of nitrogen added to the soil. Here is a model and reference grid:

```{r}
MOats.lm <- lm (yield ~ Block + Variety, data = MOats)
ref_grid (MOats.lm, mult.name = "nitro")
```

So, `nitro` is regarded as a factor having 4 levels corresponding to the 4 dimensions of `yield`. We can subsequently obtain EMMs for any of the factors `Block`, `Variety`, `nitro`, or combinations thereof. The argument `mult.name = "nitro"` is optional; if it had been excluded, the multivariate levels would have been named `rep.meas`.

[Back to Contents](#contents)

## Objects, structures, and methods {#emmobj}

The `ref_grid()` and `emmeans()` functions are introduced previously. These functions, and a few related ones, return an object of class `emmGrid`:

```{r}
pigs.rg <- ref_grid(pigs.lm1)
class(pigs.rg)
pigs.emm.s <- emmeans(pigs.rg, "source")
class(pigs.emm.s)
```

If you simply show these objects, you get different-looking results:

```{r}
pigs.rg
pigs.emm.s
```

This is based on guessing what users most need to see when displaying the object. You can override these defaults; for example, to just see a quick summary of what is there, do

```{r}
str(pigs.emm.s)
```

The most important method for `emmGrid` objects is `summary()`. It is used as the print method for displaying an `emmeans()` result. For this reason, arguments for `summary()` may also be specified within most functions that produce `emmGrid` objects. For example:

```{r}
# equivalent to summary(emmeans(pigs.lm1, "percent"), level = 0.90, infer = TRUE)
emmeans(pigs.lm1, "percent", level = 0.90, infer = TRUE)
```

This `summary()` method for `emmGrid` objects actually produces a `data.frame`, but with extra bells and whistles:

```{r}
class(summary(pigs.emm.s))
```

This can be useful to know because if you want to actually *use* `emmeans()` results in other computations, you should save its summary, and then you can access those results just like you would access data in a data frame.
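For example, here is a small sketch using the `pigs.emm.s` object created above (the column names `emmean` and `source` are the ones shown in the printed summary):

```{r, eval = FALSE}
sum.emm <- summary(pigs.emm.s)        # a data frame of the displayed results
sum.emm$emmean                        # the estimates, as a numeric vector
sum.emm[sum.emm$source == "fish", ]   # row subsetting, as in any data frame
```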
The `emmGrid` object itself is not so accessible. There is a `print.summary_emm()` function that is what actually produces the output you see above -- a data frame with extra annotations.

[Back to Contents](#contents)

## P values, "significance", and recommendations {#pvalues}

There is some debate among statisticians and researchers about the appropriateness of *P* values, and about how the term "statistical significance" can be misleading. If you have a small *P* value, it *only* means that the effect being tested is unlikely to be explained by chance variation alone, in the context of the current study and the current statistical model underlying the test. If you have a large *P* value, it *only* means that the observed effect could plausibly be due to chance alone: it is *wrong* to conclude that there is no effect.

The American Statistical Association has for some time been advocating very cautious use of *P* values (Wasserstein and Lazar 2016) because they are too often misinterpreted, and too often used carelessly. Wasserstein *et al.* (2019) even go so far as to advise against *ever* using the term "statistically significant". The 43 articles it accompanies in the same issue of *TAS* recommend a number of alternatives. I do not agree with all that is said in the main article, and there are portions that are too cutesy or wander off-topic. Further, it is quite dizzying to try to digest all the accompanying articles, and to reconcile their disagreeing viewpoints.

For some time I included a summary of Wasserstein *et al.*'s recommendations and their *ATOM* paradigm (Acceptance of uncertainty, Thoughtfulness, Openness, Modesty). But in the meantime, I have handled a large number of user questions, and many of those have made it clear to me that there are more important fish to fry in a vignette section like this. It is just a fact that *P* values are used, and are useful. So I have my own set of recommendations regarding them.

#### A set of comparisons or well-chosen contrasts is more useful and interpretable than an omnibus *F* test {#recs1}

*F* tests are useful for model selection, but don't tell you anything specific about the nature of an effect. If *F* has a small *P* value, it suggests that there is some effect, somewhere. It doesn't even necessarily imply that any two means differ statistically.

#### Use *adjusted* *P* values

When you run a bunch of tests, there is a risk of making too many Type I errors, and adjusted *P* values (e.g., the Tukey adjustment for pairwise comparisons) keep you from making too many mistakes. That said, it is possible to go overboard; and it's usually reasonable to regard each "by" group as a separate family of tests for purposes of adjustment.

#### It is *not* necessary to have a significant *F* test as a prerequisite to doing comparisons or contrasts {#recs2}

... as long as an appropriate adjustment is used. There do exist rules such as the "protected LSD" by which one is given license to do unadjusted comparisons provided the $F$ statistic is "significant." However, this is a very weak form of protection for which the justification is, basically, "if $F$ is significant, you can say absolutely anything you want."

#### Get the model right first

Everything the **emmeans** package does is an interpretation of the model that you fitted to the data. If the model is bad, you will get bad results from `emmeans()` and other functions.
Every single limitation of your model, be it presuming constant error variance, omitting interaction terms, etc., becomes a limitation of the results `emmeans()` produces. So do a responsible job of fitting the model. And if you don't know what's meant by that...

#### Consider seeking the advice of a statistical consultant {#recs3}

Statistics is hard. It is a lot more than just running programs and copying output. It is *your* research; isn't it important that it be done right? Many academic statistics and biostatistics departments can refer you to someone who can help.

[Back to Contents](#contents)

## Summary of main points {#summary}

  * EMMs are derived from a *model*. A different model for the same data may lead to different EMMs.
  * EMMs are based on a *reference grid* consisting of all combinations of factor levels, with each covariate set to its average (by default).
  * For purposes of defining the reference grid, dimensions of a multivariate response are treated as levels of a factor.
  * EMMs are then predictions on this reference grid, or marginal averages thereof (equally weighted by default).
  * Reference grids may be modified using `at` or `cov.reduce`; the latter may be logical, a function, or a formula.
  * Reference grids and `emmeans()` results may be plotted via `plot()` (for parallel confidence intervals) or `emmip()` (for an interaction-style plot).
  * Be cautious with the terms "significant" and "nonsignificant", and don't ever interpret a "nonsignificant" result as saying that there is no effect.
  * Follow good practices such as getting the model right first, and using adjusted *P* values for appropriately chosen families of comparisons or contrasts.

[Back to Contents](#contents)

### References

Wasserstein RL, Lazar NA (2016) "The ASA's Statement on *p*-Values: Context, Process, and Purpose," *The American Statistician*, **70**, 129--133, https://doi.org/10.1080/00031305.2016.1154108

Wasserstein RL, Schirm AL, Lazar NA (2019) "Moving to a World Beyond 'p < 0.05'," *The American Statistician*, **73**, 1--19, https://doi.org/10.1080/00031305.2019.1583913

## Further reading {#more}

The reader is referred to other vignettes for more details and advanced use.
The strings linked below are the names of the vignettes; i.e., they can also be accessed via `vignette("`*name*`", "emmeans")` * Models that are supported in **emmeans** (there are lots of them) ["models"](models.html) * Confidence intervals and tests: ["confidence-intervals"](confidence-intervals.html) * Often, users want to compare or contrast EMMs: ["comparisons"](comparisons.html) * Working with response transformations and link functions: ["transformations"](transformations.html) * Multi-factor models with interactions: ["interactions"](interactions.html) * Working with messy data and nested effects: ["messy-data"](messy-data.html) * Making predictions from your model: ["predictions"](predictions.html) * Examples of more sophisticated models (e.g., mixed, ordinal, MCMC) ["sophisticated"](sophisticated.html) * Utilities for working with `emmGrid` objects: ["utilities"](utilities.html) * Frequently asked questions: ["FAQs"](FAQs.html) * Adding **emmeans** support to your package: ["xtending"](xtending.html) [Back to Contents](#contents) [Index of all vignette topics](vignette-topics.html)emmeans/inst/doc/transformations.R0000644000176200001440000002137014165066772017016 0ustar liggesusers## ---- echo = FALSE, results = "hide", message = FALSE--------------------------------------------- require("emmeans") knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") ## ------------------------------------------------------------------------------------------------- pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs) ## ------------------------------------------------------------------------------------------------- pigs.emm.s <- emmeans(pigs.lm, "source") str(pigs.emm.s) ## ------------------------------------------------------------------------------------------------- summary(pigs.emm.s, infer = TRUE, null = log(35)) ## ------------------------------------------------------------------------------------------------- summary(pigs.emm.s, infer = TRUE, null = log(35), type = "response") ## ------------------------------------------------------------------------------------------------- str(regrid(pigs.emm.s)) summary(regrid(pigs.emm.s), infer = TRUE, null = 35) ## ------------------------------------------------------------------------------------------------- pigs.rg <- ref_grid(pigs.lm) pigs.remm.s <- emmeans(regrid(pigs.rg), "source") summary(pigs.remm.s, infer = TRUE, null = 35) ## ----eval = FALSE--------------------------------------------------------------------------------- # pigs.remm.s <- emmeans(pigs.lm, "source", transform = "response") ## ----eval = FALSE--------------------------------------------------------------------------------- # emmeans(pigs.lm, "source", type = "response") ## ------------------------------------------------------------------------------------------------- neuralgia.glm <- glm(Pain ~ Treatment * Sex + Age, family = binomial(), data = neuralgia) neuralgia.emm <- emmeans(neuralgia.glm, "Treatment", type = "response") neuralgia.emm ## ------------------------------------------------------------------------------------------------- pairs(neuralgia.emm, reverse = TRUE) ## ------------------------------------------------------------------------------------------------- emmip(neuralgia.glm, Sex ~ Treatment) ## ---- fig.height = 1.5---------------------------------------------------------------------------- neur.Trt.emm <- suppressMessages(emmeans(neuralgia.glm, "Treatment")) plot(neur.Trt.emm) # Link scale by default plot(neur.Trt.emm, type = 
"response") ## ---- fig.height = 1.5---------------------------------------------------------------------------- plot(neur.Trt.emm, type = "scale") ## ---- fig.height = 1.5---------------------------------------------------------------------------- plot(neur.Trt.emm, type = "scale", breaks = seq(0.10, 0.90, by = 0.10), minor_breaks = seq(0.05, 0.95, by = 0.05)) ## ---- fig.height = 1.5---------------------------------------------------------------------------- plot(neur.Trt.emm, type = "response") + ggplot2::scale_x_continuous(trans = scales::asn_trans(), breaks = seq(0.10, 0.90, by = 0.10)) ## ------------------------------------------------------------------------------------------------- warp.glm <- glm(sqrt(breaks) ~ wool*tension, family = Gamma, data = warpbreaks) ref_grid(warp.glm) ## ------------------------------------------------------------------------------------------------- emmeans(warp.glm, ~ tension | wool, type = "response") ## ------------------------------------------------------------------------------------------------- emmeans(warp.glm, ~ tension | wool, type = "unlink") ## ----eval = FALSE--------------------------------------------------------------------------------- # tran <- make.tran("asin.sqrt", 100) # my.model <- with(tran, # lmer(linkfun(percent) ~ treatment + (1|Block), data = mydata)) ## ----eval = FALSE--------------------------------------------------------------------------------- # mydata <- transform(mydata, logy.5 = log(yield + 0.5)) # my.model <- lmer(logy.5 ~ treatment + (1|Block), data = mydata) ## ----eval = FALSE--------------------------------------------------------------------------------- # my.rg <- update(ref_grid(my.model), tran = make.tran("genlog", .5)) ## ----eval = FALSE--------------------------------------------------------------------------------- # model.rg <- update(ref_grid(model), tran = "sqrt") ## ------------------------------------------------------------------------------------------------- pigroot.lm <- lm(sqrt(conc) ~ source + factor(percent), data = pigs) piglog.emm.s <- regrid(emmeans(pigroot.lm, "source"), transform = "log") confint(piglog.emm.s, type = "response") pairs(piglog.emm.s, type = "response") ## ---- eval = FALSE-------------------------------------------------------------------------------- # regrid(emm, transform = "probit") ## ------------------------------------------------------------------------------------------------- pct.diff.tran <- list( linkfun = function(mu) log(mu/100 + 1), linkinv = function(eta) 100 * (exp(eta) - 1), mu.eta = function(eta) 100 * exp(eta), name = "log(pct.diff)" ) update(pairs(piglog.emm.s, type = "response"), tran = pct.diff.tran, inv.lbl = "pct.diff") ## ---- message = FALSE----------------------------------------------------------------------------- fiber.lm <- lm(scale(strength) ~ machine * scale(diameter), data = fiber) emmeans(fiber.lm, "machine") # on the standardized scale emmeans(fiber.lm, "machine", type = "response") # strength scale ## ------------------------------------------------------------------------------------------------- emtrends(fiber.lm, "machine", var = "diameter") ## ------------------------------------------------------------------------------------------------- emtrends(fiber.lm, "machine", var = "diameter", transform = "response") ## ------------------------------------------------------------------------------------------------- with(fiber, c(mean = mean(diameter), sd = sd(diameter))) emtrends(fiber.lm, "machine", var = "scale(diameter, 24.133, 
4.324)") ## ------------------------------------------------------------------------------------------------- coef(fiber.lm)[4:6] ## ---- eval = FALSE-------------------------------------------------------------------------------- # mod <- some.fcn(scale(RT) ~ group + (1|subject), data = mydata) # emmeans(mod, "group", type = "response", # tran = make.tran("scale", y = mydata$RT)) ## ---- eval = FALSE-------------------------------------------------------------------------------- # mod <- with(make.tran("scale", y = mydata$RT), # some.fcn(linkfun(RT) ~ group + (1|subject), data = mydata)) # emmeans(mod, "group", type = "response") ## ---- message = FALSE----------------------------------------------------------------------------- fib.lm <- lm(strength ~ machine * diameter, data = fiber) # On raw scale: emmeans(fib.lm, "machine") # On standardized scale: tran <- make.tran("scale", y = fiber$strength) emmeans(fib.lm, "machine", transform = tran) ## ------------------------------------------------------------------------------------------------- sigma(pigs.lm) ## ------------------------------------------------------------------------------------------------- summary(pigs.emm.s, type = "response", bias.adj = TRUE) ## ------------------------------------------------------------------------------------------------- ismod <- glm(count ~ spray, data = InsectSprays, family = poisson()) emmeans(ismod, "spray", type = "response", bias.adj = FALSE) emmeans(ismod, "spray", type = "response", bias.adj = TRUE) ## ------------------------------------------------------------------------------------------------- with(InsectSprays, tapply(count, spray, mean)) ## ---- message = FALSE----------------------------------------------------------------------------- require(lme4) cbpp <- transform(cbpp, unit = 1:nrow(cbpp)) cbpp.glmer <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd) + (1 | unit), family = binomial, data = cbpp) emm <- emmeans(cbpp.glmer, "period") summary(emm, type = "response") ## ------------------------------------------------------------------------------------------------- lme4::VarCorr(cbpp.glmer) ## ------------------------------------------------------------------------------------------------- total.SD = sqrt(0.89107^2 + 0.18396^2) ## ------------------------------------------------------------------------------------------------- summary(emm, type = "response", bias.adjust = TRUE, sigma = total.SD) ## ------------------------------------------------------------------------------------------------- cases <- with(cbpp, tapply(incidence, period, sum)) trials <- with(cbpp, tapply(size, period, sum)) cases / trials emmeans/inst/doc/transformations.html0000644000176200001440000031032514165066773017563 0ustar liggesusers Transformations and link functions in emmeans

Transformations and link functions in emmeans

emmeans package, Version 1.7.2

Overview

Consider the same example with the pigs dataset that is used in many of these vignettes:

pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs)

This model has two factors, source and percent (coerced to a factor), as predictors; and log-transformed conc as the response. Here we obtain the EMMs for source, examine its structure, and finally produce a summary, including a test against a null value of log(35):

pigs.emm.s <- emmeans(pigs.lm, "source")
str(pigs.emm.s)
## 'emmGrid' object with variables:
##     source = fish, soy, skim
## Transformation: "log"
summary(pigs.emm.s, infer = TRUE, null = log(35))
##  source emmean     SE df lower.CL upper.CL null t.ratio p.value
##  fish     3.39 0.0367 23     3.32     3.47 3.56  -4.385  0.0002
##  soy      3.67 0.0374 23     3.59     3.74 3.56   2.988  0.0066
##  skim     3.80 0.0394 23     3.72     3.88 3.56   6.130  <.0001
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95

Now suppose that we want the EMMs expressed on the same scale as conc. This can be done by adding type = "response" to the summary() call:

summary(pigs.emm.s, infer = TRUE, null = log(35), type = "response")
##  source response   SE df lower.CL upper.CL null t.ratio p.value
##  fish       29.8 1.09 23     27.6     32.1   35  -4.385  0.0002
##  soy        39.1 1.47 23     36.2     42.3   35   2.988  0.0066
##  skim       44.6 1.75 23     41.1     48.3   35   6.130  <.0001
## 
## Results are averaged over the levels of: percent 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale 
## Tests are performed on the log scale

Note: Looking ahead, this output is compared later in this vignette with a bias-adjusted version.

Timing is everything

Dealing with transformations in emmeans is somewhat complex, due to the large number of possibilities. But the key is understanding what happens, when. These results come from a sequence of steps. Here is what happens (and doesn’t happen) at each step:

  1. The reference grid is constructed for the log(conc) model. The fact that a log transformation is used is recorded, but nothing else is done with that information.
  2. The predictions on the reference grid are averaged over the four percent levels, for each source, to obtain the EMMs for source – still on the log(conc) scale.
  3. The standard errors and confidence intervals for these EMMs are computed – still on the log(conc) scale.
  4. Only now do we do back-transformation…
    1. The EMMs are back-transformed to the conc scale.
    2. The endpoints of the confidence intervals are back-transformed.
    3. The t tests and P values are left as-is.
    4. The standard errors are converted to the conc scale using the delta method (illustrated just below). These SEs were not used in constructing the tests and confidence intervals.
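
To illustrate the delta-method conversion in the last step above: the SE on the log scale is multiplied by the derivative of the back-transformation, which for the log is just the back-transformed mean itself. For the fish EMM shown earlier,

exp(3.39) * 0.0367   # about 1.09

which reproduces, up to rounding of the displayed values, the SE of 1.09 shown on the conc scale.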

The model is our best guide

This choice of timing is based on the idea that the model is right. In particular, the fact that the response is transformed suggests that the transformed scale is the best scale to be working with. In addition, the model specifies that the effects of source and percent are linear on the transformed scale; inasmuch as marginal averaging to obtain EMMs is a linear operation, that averaging is best done on the transformed scale. For those two good reasons, back-transforming to the response scale is delayed until the very end by default.

Back to Contents

Re-gridding

As well-advised as it is, some users may not want the default timing of things. The tool for changing when back-transformation is performed is the regrid() function – which, with default settings of its arguments, back-transforms an emmGrid object and adjusts everything in it appropriately. For example:

str(regrid(pigs.emm.s))
## 'emmGrid' object with variables:
##     source = fish, soy, skim
summary(regrid(pigs.emm.s), infer = TRUE, null = 35)
##  source response   SE df lower.CL upper.CL null t.ratio p.value
##  fish       29.8 1.09 23     27.5     32.1   35  -4.758  0.0001
##  soy        39.1 1.47 23     36.1     42.2   35   2.827  0.0096
##  skim       44.6 1.75 23     40.9     48.2   35   5.446  <.0001
## 
## Results are averaged over the levels of: percent 
## Confidence level used: 0.95

Notice that the structure no longer includes the transformation. That’s because it is no longer relevant; the reference grid is on the conc scale, and how we got there is now forgotten. Compare this summary() result with the preceding one, and note the following:

  • It no longer has annotations concerning transformations.
  • The estimates and SEs are identical.
  • The confidence intervals, t ratios, and P values are not identical. This is because, this time, the SEs shown in the table are the ones actually used to construct the tests and intervals.

Understood, right? But think carefully about how these EMMs were obtained. They are back-transformed from pigs.emm.s, in which the marginal averaging was done on the log scale. If we want to back-transform before doing the averaging, we need to call regrid() after the reference grid is constructed but before the averaging takes place:

pigs.rg <- ref_grid(pigs.lm)
pigs.remm.s <- emmeans(regrid(pigs.rg), "source")
summary(pigs.remm.s, infer = TRUE, null = 35)
##  source response   SE df lower.CL upper.CL null t.ratio p.value
##  fish       30.0 1.10 23     27.7     32.2   35  -4.585  0.0001
##  soy        39.4 1.49 23     36.3     42.5   35   2.927  0.0076
##  skim       44.8 1.79 23     41.1     48.5   35   5.486  <.0001
## 
## Results are averaged over the levels of: percent 
## Confidence level used: 0.95

These results all differ from either of the previous two summaries – again, because the averaging is done on the conc scale rather than the log(conc) scale.

Note: For those who want to routinely back-transform before averaging, the transform argument in ref_grid() simplifies this. The first two steps above could have been done more easily as follows:

pigs.remm.s <- emmeans(pigs.lm, "source", transform = "response")

But don’t get transform and type confused. The transform argument is passed to regrid() after the reference grid is constructed, whereas the type argument is simply remembered and used by summary(). So a similar-looking call:

emmeans(pigs.lm, "source", type = "response")

will compute the results we have seen for pigs.emm.s – back-transformed after averaging on the log scale.

Remember again: When it comes to transformations, timing is everything.

Back to Contents

Graphing transformations and links

There are a few options for displaying transformed results graphically. First, the type argument works just as it does in displaying a tabular summary. Following through with the neuralgia example, let us display the marginal Treatment EMMs on both the link scale and the response scale (we are opting to do the averaging on the link scale):

neur.Trt.emm <- suppressMessages(emmeans(neuralgia.glm, "Treatment"))
plot(neur.Trt.emm)   # Link scale by default

plot(neur.Trt.emm, type = "response")

Besides whether or not we see response values, there is a dramatic difference in the symmetry of the intervals.

For emmip() and plot() only (and currently only with the “ggplot” engine), there is also the option of specifying type = "scale", which causes the response values to be calculated but plotted on a nonlinear scale corresponding to the transformation or link:

plot(neur.Trt.emm, type = "scale")

Notice that the interior part of this plot is identical to the plot on the link scale. Only the horizontal axis is different. That is because the response values are transformed using the link function to determine the plotting positions of the graphical elements – putting them back where they started.

As is the case here, nonlinear scales can be confusing to read, and it is very often true that you will want to display more scale divisions, and even add minor ones. This is done via adding arguments for the function ggplot2::scale_x_continuous() (see its documentation):

plot(neur.Trt.emm, type = "scale", breaks = seq(0.10, 0.90, by = 0.10),
     minor_breaks = seq(0.05, 0.95, by = 0.05))

When using the "ggplot" engine, you always have the option of using ggplot2 to incorporate a transformed scale – and it doesn’t even have to be the same as the transformation used in the model. For example, here we display the same results on an arcsin-square-root scale.

plot(neur.Trt.emm, type = "response") +
  ggplot2::scale_x_continuous(trans = scales::asn_trans(),
                              breaks = seq(0.10, 0.90, by = 0.10))

This comes across as a compromise: not as severe as the logit scaling, and not as distorted as the linear scaling of response values.

Again, the same techniques can be used with emmip(), except it is the vertical scale that is affected.

Back to Contents

Special transformations

The make.tran() function provides several special transformations and sets things up so they can be handled in emmeans with relative ease. (See help("make.tran", "emmeans") for descriptions of what is available.) make.tran() works much like stats::make.link() in that it returns a list of functions linkfun(), linkinv(), etc. that serve in managing results on a transformed scale. The difference is that most transformations with make.tran() require additional arguments.

To use this capability in emmeans(), it is advantageous to first obtain the make.tran() result, and then to use it as the enclosing environment for fitting the model, with linkfun as the transformation. For example, suppose the response variable is a percentage and we want to use the response transformation \(\sin^{-1}\sqrt{y/100}\). Then proceed like this:

tran <- make.tran("asin.sqrt", 100)
my.model <- with(tran, 
    lmer(linkfun(percent) ~ treatment + (1|Block), data = mydata))

Subsequent calls to ref_grid(), emmeans(), regrid(), etc. will then be able to access the transformation information correctly.
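
For example (a sketch continuing this hypothetical model), the back-transformed EMMs would then be obtained in the usual way:

emmeans(my.model, "treatment", type = "response")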

The help page for make.tran() has an example like this using a Box-Cox transformation.

Back to Contents

Specifying a transformation after the fact

It is not at all uncommon to fit a model using statements like the following:

mydata <- transform(mydata, logy.5 = log(yield + 0.5))
my.model <- lmer(logy.5 ~ treatment + (1|Block), data = mydata)

In this case, there is no way for ref_grid() to figure out that a response transformation was used. What can be done is to update the reference grid with the required information:

my.rg <- update(ref_grid(my.model), tran = make.tran("genlog", .5))

Subsequently, use my.rg in place of my.model in any emmeans() analyses, and the transformation information will be there.

For standard transformations (those in stats::make.link()), just give the name of the transformation; e.g.,

model.rg <- update(ref_grid(model), tran = "sqrt")

Auto-detected response transformations

As can be seen in the initial pigs.lm example in this vignette, certain straightforward response transformations such as log, sqrt, etc. are automatically detected when emmeans() (really, ref_grid()) is called on the model object. In fact, scaling and shifting are supported too; so the preceding example with my.model could have been done more easily by specifying the transformation directly in the model formula:

my.better.model <- lmer(log(yield + 0.5) ~ treatment + (1|Block), data = mydata)

The transformation would be auto-detected, saving you the trouble of adding it later. Similarly, a response transformation of 2 * sqrt(y + 1) would be correctly auto-detected. A model with a linearly transformed response, e.g. 4*(y - 1), would not be auto-detected, but 4*I(y + -1) would be interpreted as 4*identity(y + -1). Parsing is such that the response expression must be of the form mult * fcn(resp + const); operators of - and / are not recognized.
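
To restate those parsing rules as code (a hypothetical sketch; dat, x, and y are made-up names):

lm(2 * sqrt(y + 1) ~ x, data = dat)   # auto-detected: mult * fcn(resp + const)
lm(4 * (y - 1) ~ x, data = dat)       # not auto-detected: '-' is not recognized
lm(4 * I(y + -1) ~ x, data = dat)     # interpreted as 4 * identity(y + -1)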

Back to Contents

Faking a log transformation

The regrid() function makes it possible to fake a log transformation of the response. Why would you want to do this? So that you can make comparisons using ratios instead of differences.

Consider the pigs example once again, but suppose we had fitted a model with a square-root transformation instead of a log:

pigroot.lm <- lm(sqrt(conc) ~ source + factor(percent), data = pigs)
piglog.emm.s <- regrid(emmeans(pigroot.lm, "source"), transform = "log")
confint(piglog.emm.s, type = "response")
##  source response   SE df lower.CL upper.CL
##  fish       29.8 1.32 23     27.2     32.7
##  soy        39.2 1.54 23     36.2     42.6
##  skim       45.0 1.74 23     41.5     48.7
## 
## Results are averaged over the levels of: percent 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale
pairs(piglog.emm.s, type = "response")
##  contrast    ratio     SE df null t.ratio p.value
##  fish / soy  0.760 0.0454 23    1  -4.591  0.0004
##  fish / skim 0.663 0.0391 23    1  -6.965  <.0001
##  soy / skim  0.872 0.0469 23    1  -2.548  0.0457
## 
## Results are averaged over the levels of: percent 
## P value adjustment: tukey method for comparing a family of 3 estimates 
## Tests are performed on the log scale

These results are not identical, but very similar to the back-transformed confidence intervals above for the EMMs and the pairwise ratios in the “comparisons” vignette, where the fitted model actually used a log response.

Faking other transformations

It is possible to fake transformations other than the log. Just use the same method, e.g.

regrid(emm, transform = "probit")

would re-grid the existing emm to the probit scale. Note that any estimates in emm outside of the interval \((0,1)\) will be flagged as non-estimable.

The section on standardized responses gives an example of reverse-engineering a standardized response transformation in this way.

Alternative scale

It is possible to create a report on an alternative scale by updating the tran component. For example, suppose we want percent differences instead of ratios in the preceding example with the pigs dataset. This is possible by modifying the inverse transformation: since the usual inverse transformation is a ratio of the form \(r = a/b\), we have that the percentage difference between \(a\) and \(b\) is \(100(a-b)/b = 100(r-1)\). Thus,

pct.diff.tran <- list(
    linkfun = function(mu) log(mu/100 + 1),
    linkinv = function(eta) 100 * (exp(eta) - 1),
    mu.eta = function(eta) 100 * exp(eta),
    name = "log(pct.diff)"
)

update(pairs(piglog.emm.s, type = "response"), 
       tran = pct.diff.tran, inv.lbl = "pct.diff")
##  contrast    pct.diff   SE df t.ratio p.value
##  fish / soy     -24.0 4.54 23  -4.591  0.0004
##  fish / skim    -33.7 3.91 23  -6.965  <.0001
##  soy / skim     -12.8 4.69 23  -2.548  0.0457
## 
## Results are averaged over the levels of: percent 
## P value adjustment: tukey method for comparing a family of 3 estimates 
## Tests are performed on the log(pct.diff) scale

Standardized response

In some disciplines, it is common to fit a model to a standardized response variable. R’s base function scale() makes this easy to do; but it is important to notice that scale(y) is more complicated than, say, sqrt(y), because scale(y) requires all the values of y in order to determine the centering and scaling parameters. The ref_grid() function (called by emmeans() and others) tries to detect the scaling parameters. To illustrate:

fiber.lm <- lm(scale(strength) ~ machine * scale(diameter), data = fiber)
emmeans(fiber.lm, "machine")   # on the standardized scale
##  machine   emmean    SE df lower.CL upper.CL
##  A        0.00444 0.156  9   -0.349    0.358
##  B        0.28145 0.172  9   -0.109    0.672
##  C       -0.33473 0.194  9   -0.774    0.105
## 
## Results are given on the scale(40.2, 4.97) (not the response) scale. 
## Confidence level used: 0.95
emmeans(fiber.lm, "machine", type = "response")   # strength scale
##  machine response    SE df lower.CL upper.CL
##  A           40.2 0.777  9     38.5     42.0
##  B           41.6 0.858  9     39.7     43.5
##  C           38.5 0.966  9     36.3     40.7
## 
## Confidence level used: 0.95 
## Intervals are back-transformed from the scale(40.2, 4.97) scale

More interesting (and complex) is what happens with emtrends(). Without anything fancy added, we have

emtrends(fiber.lm, "machine", var = "diameter")
##  machine diameter.trend     SE df lower.CL upper.CL
##  A                0.222 0.0389  9   0.1339    0.310
##  B                0.172 0.0450  9   0.0705    0.274
##  C                0.174 0.0418  9   0.0791    0.268
## 
## Confidence level used: 0.95

These slopes are (change in scale(strength)) / (change in diameter); that is, we didn’t do anything to undo the response transformation, but the trend is based on exactly the variable specified, diameter. To get (change in strength) / (change in diameter), we need to undo the response transformation, and that is done via transform (which invokes regrid() after the reference grid is constructed):

emtrends(fiber.lm, "machine", var = "diameter", transform = "response")
##  machine diameter.trend    SE df lower.CL upper.CL
##  A                1.104 0.194  9    0.666     1.54
##  B                0.857 0.224  9    0.351     1.36
##  C                0.864 0.208  9    0.394     1.33
## 
## Confidence level used: 0.95

What if we want slopes for (change in scale(strength)) / (change in scale(diameter))? This can be done, but it is necessary to manually specify the scaling parameters for diameter.

with(fiber, c(mean = mean(diameter), sd = sd(diameter)))
##      mean        sd 
## 24.133333  4.323799
emtrends(fiber.lm, "machine", var = "scale(diameter, 24.133, 4.324)")
##  machine scale(diameter, 24.133, 4.324).trend    SE df lower.CL upper.CL
##  A                                      0.960 0.168  9    0.579     1.34
##  B                                      0.745 0.195  9    0.305     1.19
##  C                                      0.751 0.181  9    0.342     1.16
## 
## Confidence level used: 0.95

This result is the one most directly related to the regression coefficients:

coef(fiber.lm)[4:6]
##          scale(diameter) machineB:scale(diameter) machineC:scale(diameter) 
##                0.9598846               -0.2148202               -0.2086880

There is a fourth possibility, (change in strength) / (change in scale(diameter)), that I leave to the reader.
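
(One plausible approach, offered as an untested sketch combining the two devices just illustrated, the scaled covariate in var together with the transform argument:

emtrends(fiber.lm, "machine", var = "scale(diameter, 24.133, 4.324)", transform = "response")

but working out whether that gives the intended slopes is part of the exercise.)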

What to do if auto-detection fails

Auto-detection of standardized responses is a bit tricky, and doesn’t always succeed. If it fails, a message is displayed and the transformation is ignored. In cases where it doesn’t work, we need to explicitly specify the transformation using make.tran(). The methods are exactly as shown earlier in this vignette, so we show the code but not the results for a hypothetical example.

One method is to fit the model and then add the transformation information later. In this example, some.fcn is a model-fitting function which for some reason doesn’t allow the scaling information to be detected.

mod <- some.fcn(scale(RT) ~ group + (1|subject), data = mydata)
emmeans(mod, "group", type = "response",
        tran = make.tran("scale", y = mydata$RT))

The other, equivalent, method is to create the transformation object first and use it in fitting the model:

mod <- with(make.tran("scale", y = mydata$RT),
            some.fcn(linkfun(RT) ~ group + (1|subject), data = mydata))
emmeans(mod, "group", type = "response")

Reverse-engineering a standardized response

An interesting twist on all this is the reverse situation: Suppose we fitted the model without the standardized response, but we want to know what the results would be if we had standardized. Here we reverse-engineer the fiber.lm example above:

fib.lm <- lm(strength ~ machine * diameter, data = fiber)

# On raw scale:
emmeans(fib.lm, "machine")
##  machine emmean    SE df lower.CL upper.CL
##  A         40.2 0.777  9     38.5     42.0
##  B         41.6 0.858  9     39.7     43.5
##  C         38.5 0.966  9     36.3     40.7
## 
## Confidence level used: 0.95
# On standardized scale:
tran <- make.tran("scale", y = fiber$strength)
emmeans(fib.lm, "machine", transform = tran)
##  machine   emmean    SE df lower.CL upper.CL
##  A        0.00444 0.156  9   -0.349    0.358
##  B        0.28145 0.172  9   -0.109    0.672
##  C       -0.33473 0.194  9   -0.774    0.105
## 
## Results are given on the scale(40.2, 4.97) (not the response) scale. 
## Confidence level used: 0.95

In the latter call, the transform argument causes regrid() to be called after the reference grid is constructed.

Back to Contents

Bias adjustment

So far, we have discussed ideas related to back-transforming results as a simple way of expressing results on the same scale as the response. In particular, means obtained in this way are known as generalized means; for example, a log transformation of the response is associated with geometric means. When the goal is simply to make inferences about which means are less than which other means, and a response transformation is used, it is often acceptable to present estimates and comparisons of these generalized means. However, sometimes it is important to report results that actually do reflect expected values of the untransformed response. An example is a financial study, where the response is in some monetary unit. It may be convenient to use a response transformation for modeling purposes, but ultimately we may want to make financial projections in those same units.

In such settings, we need to make a bias adjustment when we back-transform, because any nonlinear transformation biases the expected values of statistical quantities. More specifically, suppose that we have a response \(Y\) and the transformed response is \(U\). To back-transform, we use \(Y = h(U)\); and using a Taylor approximation, \(Y \approx h(\eta) + h'(\eta)(U-\eta) + \frac12h''(\eta)(U-\eta)^2\), so that \(E(Y) \approx h(\eta) + \frac12h''(\eta)Var(U)\). This shows that the amount of needed bias adjustment is approximately \(\frac12h''(\eta)\sigma^2\) where \(\sigma\) is the error SD in the model for \(U\). It depends on \(\sigma\), and the larger this is, the greater the bias adjustment that is needed. This second-order bias adjustment is what is currently used in the emmeans package when bias-adjustment is requested. There are better or exact adjustments for certain cases, and future updates may incorporate some of those.
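
For example, with a log transformation we have \(h(u) = e^u\), so \(h''(\eta) = e^\eta\) and \(E(Y) \approx e^\eta(1 + \sigma^2/2)\); that is, the back-transformed estimate is simply inflated by the factor \(1 + \sigma^2/2\). With \(\sigma \approx 0.115\) as in the example below, that factor is about 1.0066, consistent with the small increases seen in the bias-adjusted estimates.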

Pigs example revisited

Let us compare the estimates in the overview after we apply a bias adjustment. First, note that an estimate of the residual SD is available via the sigma() function:

sigma(pigs.lm)
## [1] 0.115128

This estimate is used by default. The bias-adjusted EMMs for the sources are:

summary(pigs.emm.s, type = "response", bias.adj = TRUE)
##  source response   SE df lower.CL upper.CL
##  fish       30.0 1.10 23     27.8     32.4
##  soy        39.4 1.48 23     36.5     42.6
##  skim       44.9 1.77 23     41.3     48.7
## 
## Results are averaged over the levels of: percent 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale 
## Bias adjustment applied based on sigma = 0.11513

These estimates (and also their SEs) are slightly larger than we had without bias adjustment. They are estimates of the arithmetic mean responses, rather than the geometric means shown in the overview. Had the value of sigma been larger, the adjustment would have been greater. You can experiment with this by adding a sigma = argument to the above call.
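
For instance, a deliberately exaggerated value (hypothetical, purely to see the effect) makes the adjustment much more pronounced:

summary(pigs.emm.s, type = "response", bias.adj = TRUE, sigma = 0.5)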

emmeans/inst/doc/messy-data.R0000644000176200001440000001220614165066763015632 0ustar liggesusers## ---- echo = FALSE, results = "hide", message = FALSE------------------------- require("emmeans") require("ggplot2") options(show.signif.stars = FALSE) knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") ## ----------------------------------------------------------------------------- nutr.lm <- lm(gain ~ (age + group + race)^2, data = nutrition) car::Anova(nutr.lm) ## ----------------------------------------------------------------------------- emmeans(nutr.lm, ~ group * race, calc = c(n = ".wgt.")) ## ----------------------------------------------------------------------------- with(nutrition, table(race, age)) ## ----------------------------------------------------------------------------- summary(emmeans(nutr.lm, pairwise ~ group | race, at = list(age = "3")), by = NULL) ## ----------------------------------------------------------------------------- framing <- mediation::framing levels(framing$educ) <- c("NA","Ref","< HS", "HS", "> HS","Coll +") framing.glm <- glm(cong_mesg ~ age + income + educ + emo + gender * factor(treat), family = binomial, data = framing) ## ----------------------------------------------------------------------------- emmip(framing.glm, treat ~ educ | gender, type = "response") ## ----------------------------------------------------------------------------- emmip(framing.glm, treat ~ educ | gender, type = "response", cov.reduce = emo ~ treat*gender + age + educ + income) ## ----eval = FALSE------------------------------------------------------------- # emo.adj <- resid(lm(emo ~ treat*gender + age + educ + income, data = framing)) ## ----eval = FALSE------------------------------------------------------------- # emmeans(..., cov.reduce = list(x1 ~ trt, x2 ~ trt + x1, x3 ~ trt + x1 + x2)) ## ----eval = FALSE------------------------------------------------------------- # emmeans(model, "A", weights = "outer") # emmeans(emmeans(model, c("A", "B"), weights = "prop"), weights = "prop") ## ----message = FALSE---------------------------------------------------------- sapply(c("equal", "prop", "outer", "cells", "flat"), function(w) predict(emmeans(nutr.lm, ~ race, weights = w))) ## ----------------------------------------------------------------------------- mtcars.lm <- lm(mpg ~ factor(cyl)*am + disp + hp + drat + log(wt) + vs + factor(gear) + factor(carb), data = mtcars) ## ----------------------------------------------------------------------------- rg.usual <- ref_grid(mtcars.lm) rg.usual nrow(rg.usual@linfct) rg.nuis = ref_grid(mtcars.lm, non.nuisance = "cyl") rg.nuis nrow(rg.nuis@linfct) ## ----------------------------------------------------------------------------- emmeans(rg.usual, ~ cyl * am) emmeans(rg.nuis, ~ cyl * am) ## ----------------------------------------------------------------------------- predict(emmeans(mtcars.lm, ~ cyl * am, non.nuis = c("cyl", "am"), wt.nuis = "prop")) predict(emmeans(mtcars.lm, ~ cyl * am, weights = "outer")) ## ----------------------------------------------------------------------------- emmeans(mtcars.lm, ~ gear | am, non.nuis = quote(all.vars(specs))) ## ---- error = TRUE------------------------------------------------------------ ref_grid(mtcars.lm, rg.limit = 200) ## ----------------------------------------------------------------------------- summary(emmeans(nutr.lm, pairwise ~ group | race, submodel = ~ age + group*race), by = NULL) ## ----------------------------------------------------------------------------- 
emmeans(nutr.lm, ~ group * race, submodel = "minimal") ## ----------------------------------------------------------------------------- joint_tests(nutr.lm, submodel = "type2") ## ----------------------------------------------------------------------------- cows <- data.frame ( route = factor(rep(c("injection", "oral"), c(5, 9))), drug = factor(rep(c("Bovineumab", "Charloisazepam", "Angustatin", "Herefordmycin", "Mollycoddle"), c(3,2, 4,2,3))), resp = c(34, 35, 34, 44, 43, 36, 33, 36, 32, 26, 25, 25, 24, 24) ) cows.lm <- lm(resp ~ route + drug, data = cows) ## ----message = FALSE---------------------------------------------------------- cows.rg <- ref_grid(cows.lm) cows.rg ## ----------------------------------------------------------------------------- route.emm <- emmeans(cows.rg, "route") route.emm ## ----------------------------------------------------------------------------- drug.emm <- emmeans(cows.rg, "drug") drug.emm ## ----------------------------------------------------------------------------- pairs(route.emm, reverse = TRUE) pairs(drug.emm, by = "route", reverse = TRUE) ## ---- fig.width = 5.5--------------------------------------------------------- emmip(cows.rg, ~ drug | route) ## ---- fig.width = 5.5--------------------------------------------------------- require(ggplot2) emmip(cows.rg, ~ drug) + facet_wrap(~ route, scales = "free_x") ## ---- fig.height = 2.5, fig.width = 5.5--------------------------------------- plot(drug.emm, PIs = TRUE) + facet_wrap(~ route, nrow = 2, scales = "free_y") emmeans/inst/doc/xplanations.html0000644000176200001440000025621614165066775016704 0ustar liggesusers Explanations supplement

Explanations supplement

emmeans package, Version 1.7.2

This vignette provides additional documentation for some methods implemented in the emmeans package.

Index of all vignette topics

Sub-models

Estimated marginal means (EMMs) and other statistics computed by the emmeans package are model-based: they depend on the model that has been fitted to the data. In this section we discuss a provision whereby a different underlying model may be considered. The submodel option in update() can project EMMs and other statistics to an alternative universe where a simpler version of the model has been fitted to the data. Another way of looking at this is that it constrains certain external effects to be zero – as opposed to averaging over them as is otherwise done for marginal means.

Two things to know before getting into details:

  1. The submodel option uses information from the fixed-effects portion of the model matrix
  2. Not all model classes are supported for the submodel option.

Now some details. Suppose that we have a fixed-effects model matrix \(X\), and let \(X_1\) denote a sub-matrix of \(X\) whose columns correspond to a specified sub-model. (Note: if there are weights, use \(X = W^{1/2}X^*\), where \(X^*\) is the model matrix without the weights incorporated.) The trick we use is what is called the alias matrix: \(A = (X_1'X_1)^-X_1'X\) where \(Z^-\) denotes a generalized inverse of \(Z\). It can be shown that \((X_1'X_1)^-X_1' = A(X'X)^-X'\); thus, in an ordinary fixed-effects regression model, \(b_1 = Ab\) where \(b_1\) and \(b\) denote the regression coefficients for the sub-model and full model, respectively. Thus, given a matrix \(L\) such that \(Lb\) provides estimates of interest for the full model, the corresponding estimates for the sub-model are \(L_1b_1\), where \(L_1\) is the sub-matrix of \(L\) consisting of the columns corresponding to the columns of \(X_1\). Moreover, \(L_1b_1 = L_1(Ab) = (L_1A)b\); that is, we can replace \(L\) by \(L_1A\) to obtain estimates from the sub-model. That’s all that update(..., submodel = ...) does.

Here are some intuitive observations:

  1. Consider the excluded effects, \(X_2\), consisting of the columns of \(X\) other than \(X_1\). The corresponding columns of the alias matrix are regression coefficients treating \(X_2\) as the response and \(X_1\) as the predictors.
  2. Thus, when we obtain predictions via these aliases, we are predicting the effects of \(X_2\) based on \(X_1\).
  3. The columns of the new linear predictor \(\tilde L = L_1A\) depend only on the columns of \(L_1\), and hence not on other columns of \(L\).

These three points provide three ways of saying nearly the same thing, namely that we are excluding the effects in \(X_2\). Note that in a rank-deficient situation, there are different possible generalized inverses, and so in (1), \(A\) is not unique. However, the predictions in (2) are unique. In ordinary regression models, (1), (2), and (3) all apply and will be the same as predictions from re-fitting the model with model matrix \(X_1\); however, in generalized linear models, mixed models, etc., re-fitting will likely produce somewhat different results. That is because fitting such models involves iterative weighting, and the re-fitted models will probably not have the same weights. However, point (3) will still hold: the predictions obtained with a submodel will involve only the columns of \(L_1\) and hence constrain all effects outside of the sub-model to be zero. Therefore, when it really matters to get the correct estimates from the stated sub-model, the user should actually fit that sub-model unless the full model is an ordinary linear regression.

A technicality: Most writers define the alias matrix as \((X_1'X_1)^-X_1'X_2\), where \(X_2\) denotes that part of \(X\) that excludes the columns of \(X_1\). We are including all columns of \(X\) here just because it makes the notation very simple; the \(X_1\) portion of \(X\) just reduces to the identity (at least in the case where \(X_1\) is full-rank).

A word on computation: Like many matrix expressions, we do not compute \(A\) directly as shown. Instead, we use the QR decomposition of \(X_1\), obtainable via the R call Z <- qr(X1). Then the alias matrix is computed via A <- qr.coef(Z, X). In fact, nothing changes if we use just the \(R\) portion of \(X = QR\), saving us both memory and computational effort. The exported function .cmpMM() extracts this \(R\) matrix, taking care of any pivoting that might have occurred. And in an lm object, the QR decomposition of \(X\) is already saved as a slot. The qr.coef() function works just fine in both the full-rank and rank-deficient cases, but in the latter situation, some elements of A will be NA; those correspond to “excluded” predictors, but that is another way of saying that we are constraining their regression coefficients to be zero. Thus, we can easily clean that up via A[is.na(A)] <- 0.
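
Assembling the pieces from this note (and re-using the X and X1 matrices from the earlier sketch), the computation amounts to something like:

Z <- qr(X1)           # QR decomposition of the sub-model matrix
A <- qr.coef(Z, X)    # alias matrix; has NAs in rank-deficient cases
A[is.na(A)] <- 0      # constrain the excluded coefficients to zero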

If we specify submodel = "minimal", the software figures out the sub-model by extracting terms involving only factors that have not already been averaged over. If the user specifies submodel = "type2", an additional step is performed: Let \(X_1\) have only the highest-order effect in the minimal model, and let \(X_0\) denote the matrix of all columns of \(X\) that do not contain the effect in \(X_1\). We then replace \(Z\) by the QR decomposition of \([I - X_0(X_0'X_0)^-X_0']X_1\). This projects \(X_1\) onto the null space of \(X_0\). The net result is that we obtain estimates of just the \(X_1\) effects, after adjusting for all effects that do not contain it (including the intercept if present). Such estimates have very limited use in data description, but provide a kind of “Type II” analysis when used in conjunction with joint_tests(). The "type2" calculations parallel those documented by SAS for obtaining type II estimable functions in SAS PROC GLM. However, we (as well as car::Anova()) define “contained” effects differently from SAS, treating covariates no differently than factors.
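
For example, with the pigs data, a sketch of requesting sub-model estimates and a type-II analysis from a model with interaction might look like this:

pigsint.lm <- lm(log(conc) ~ source * factor(percent), data = pigs)
emmeans(pigsint.lm, "source", submodel = ~ source + factor(percent))
joint_tests(pigsint.lm, submodel = "type2")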

A note on multivariate models

Recall that emmeans generates a constructed factor for the levels of a multivariate response. That factor (or factors) is completely ignored in any sub-model calculations. The \(X\) and \(X_1\) matrices described above involve only the predictors in the right-hand side of the model equation. The multivariate response “factor” implicitly interacts with everything in the right-hand-side model; and the same is true of any sub-model. So it is not possible to consider sub-models where terms are omitted from among those multivariate interactions (note that it is also impossible to fit a multivariate sub-model that excludes those interactions). The only way to remove consideration of multivariate effects is to average over them via a call to emmeans().
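
For example, here is a sketch using the MOats dataset provided in emmeans, in which yield is a 4-column multivariate response and the constructed factor gets the default name rep.meas:

MOats.lm <- lm(yield ~ Variety + Block, data = MOats)
emmeans(MOats.lm, ~ Variety | rep.meas)   # keeps the multivariate "factor"
emmeans(MOats.lm, ~ Variety)              # averages over it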

Back to Contents

Comparison arrows

The plot() method for emmGrid objects offers the option comparisons = TRUE. If used, the software attempts to construct “comparison arrows” whereby two estimated marginal means (EMMs) differ significantly if, and only if, their respective comparison arrows do not overlap. In this section, we explain how these arrows are obtained.

First, please understand these comparison arrows are decidedly not the same as confidence intervals. Confidence intervals for EMMs are based on the statistical properties of the individual EMMs, whereas comparison arrows are based on the statistical properties of differences of EMMs.

Let the EMMs be denoted \(m_1, m_2, ..., m_k\). For simplicity, let us assume that these are ordered: \(m_1 \le m_2 \le \cdots \le m_k\). Let \(d_{ij} = m_j - m_i\) denote the difference between the \(i\)th and \(j\)th EMM. Then the \((1 - \alpha)\) confidence interval for the true difference \(\delta_{ij} = \mu_j - \mu_i\) is \[ d_{ij} - e_{ij}\quad\mbox{to}\quad d_{ij} + e_{ij} \] where \(e_{ij}\) is the “margin of error” for the difference; i.e., \(e_{ij} = t\cdot SE(d_{ij})\) for some critical value \(t\) (equal to \(t_{\alpha/2}\) when no multiplicity adjustment is used). Note that \(d_{ij}\) is statistically significant if, and only if, \(d_{ij} > e_{ij}\).

Now, how to get the comparison arrows? These arrows are plotted with origins at the \(m_i\); we have an arrow of length \(L_i\) pointing to the left, and an arrow of length \(R_i\) pointing to the right. To compare EMMs \(m_i\) and \(m_j\) (and remembering that we are supposing that \(m_i \le m_j\)), we propose to look to see if the arrows extending right from \(m_i\) and left from \(m_j\) overlap or not. So, ideally, if we want overlap to be identified with statistical non-significance, we want \[ R_i + L_j = e_{ij} \quad\mbox{for all } i < j \]

If we can do that, then the two arrows will overlap if, and only if, \(d_{ij} < e_{ij}\).

This is easy to accomplish if all the \(e_{ij}\) are equal: just set all \(L_i = R_j = \frac12e_{12}\). But with differing \(e_{ij}\) values, it may or may not even be possible to obtain suitable arrow lengths.

The code in emmeans uses an ad hoc weighted regression method to solve the above equations. We give greater weights to cases where \(d_{ij}\) is close to \(e_{ij}\), because those are the cases where it is more critical that we get the lengths of the arrows right. Once the regression equations are solved, we test to make sure that \(R_i + L_j < d_{ij}\) when the difference is significant, and \(\ge d_{ij}\) when it is not. If one or more of those checks fails, a warning is issued.
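
Here is a rough sketch of that idea (not the actual emmeans code), with one plausible, hypothetical choice of weights that emphasizes pairs where \(d_{ij}\) is close to \(e_{ij}\):

# Solve R_i + L_j ~= e[i, j] for all i < j by weighted least squares,
# given sorted means m and a matrix e of margins of error
arrow_lengths <- function(m, e) {
    k <- length(m)
    X <- NULL; y <- w <- numeric(0)
    for (i in 1:(k - 1)) for (j in (i + 1):k) {
        row <- numeric(2 * k); row[i] <- row[k + j] <- 1   # R_i + L_j
        X <- rbind(X, row)
        y <- c(y, e[i, j])
        w <- c(w, 1 / (abs((m[j] - m[i]) - e[i, j]) + 1e-6))  # near-critical pairs dominate
    }
    X <- X[, c(1:(k - 1), (k + 2):(2 * k)), drop = FALSE]  # drop unneeded R_k and L_1
    lm.wfit(X, y, w)$coefficients   # first k-1 entries are R's, the rest are L's
}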

That’s the essence of the algorithm. Note, however, that there are a few complications that need to be handled:

  • For the lowest EMM \(m_1\), \(L_1\) is completely arbitrary because there are no right-pointing arrows with which to compare it; in fact, we don’t even need to display that arrow. The same is true of \(R_k\) for the largest EMM \(m_k\). Moreover, there could be additional unneeded arrows when other \(m_i\) are equal to \(m_1\) or \(m_k\).
  • Depending on the number \(k\) of EMMs and the number of tied minima and maxima, the system of equations could be under-determined, over-determined, or just right.
  • It is possible that the solution could result in some \(L_i\) or \(R_j\) being negative. That would result in an error.

In summary, the algorithm does not always work (in fact, it is possible to construct cases where no solution exists). But we try to do the best we can. The main reason for attempting it at all is to discourage people from ever using confidence intervals for the \(m_i\) as a means of testing the comparisons \(d_{ij}\); that is almost always incorrect.

Better yet, avoid comparison arrows altogether and use pwpp() or pwpm() to display the P values directly.

Examples and tests

Here is a constructed example with specified means and somewhat unequal SEs:

m = c(6.1, 4.5, 5.4,    6.3, 5.5, 6.7)
se2 = c(.3, .4, .37,  .41, .23, .48)^2
lev = list(A = c("a1","a2","a3"), B = c("b1", "b2"))
foo = emmobj(m, diag(se2), levels = lev, linfct = diag(6))
plot(foo, CIs = FALSE, comparisons = TRUE)

This came out pretty well. But now let’s keep the means and SEs the same but make them correlated. Such correlations happen, for example, in designs with subject effects. The function below is used to set a specified intra-class correlation, treating A as a within-subjects (or split-plot) factor and B as a between-subjects (whole-plot) factor. We’ll start with a correlation of 0.3.

mkmat <- function(V, rho = 0, indexes = list(1:3, 4:6)) {
    sd = sqrt(diag(V))
    for (i in indexes)
        V[i,i] = (1 - rho)*diag(sd[i]^2) + rho*outer(sd[i], sd[i])
    V
}
# Intraclass correlation = 0.3
foo3 = foo
foo3@V <- mkmat(foo3@V, 0.3)
plot(foo3, CIs = FALSE, comparisons = TRUE)

Same with intraclass correlation of 0.6:

foo6 = foo
foo6@V <- mkmat(foo6@V, 0.6)
plot(foo6, CIs = FALSE, comparisons = TRUE)
## Warning: Comparison discrepancy in group "1", a1 b1 - a2 b2:
##     Target overlap = 0.443, overlap on graph = -0.2131

Now we have a warning that some arrows don’t overlap, but should. We can make it even worse by upping the correlation to 0.8:

foo8 = foo
foo8@V <- mkmat(foo8@V, 0.8)
plot(foo8, CIs = FALSE, comparisons = TRUE)
## Error: Aborted -- Some comparison arrows have negative length!
## (in group "1")

Now the solution actually leads to negative arrow lengths.

What is happening here is that we are continually reducing the SE of within-B comparisons while keeping the others the same. Everything works out if we use B as a by variable:

plot(foo8, CIs = FALSE, comparisons = TRUE, by = "B")

Note that the lengths of the comparison arrows are relatively equal within the levels of B. Or, we can use pwpp() or pwpm() to show the P values for all comparisons among the six means:

pwpp(foo6, sort = FALSE)

pwpm(foo6)
##       a1 b1  a2 b1  a3 b1  a1 b2  a2 b2  a3 b2
## a1 b1 [6.1] <.0001 0.1993 0.9988 0.6070 0.8972
## a2 b1   1.6  [4.5] 0.0958 0.0208 0.2532 0.0057
## a3 b1   0.7   -0.9  [5.4] 0.5788 0.9999 0.2641
## a1 b2  -0.2   -1.8   -0.9  [6.3] 0.1439 0.9204
## a2 b2   0.6   -1.0   -0.1    0.8  [5.5] 0.0245
## a3 b2  -0.6   -2.2   -1.3   -0.4   -1.2  [6.7]
## 
## Row and column labels: A:B
## Upper triangle: P values   adjust = "tukey"
## Diagonal: [Estimates] (estimate) 
## Lower triangle: Comparisons (estimate)   earlier vs. later

Back to Contents

Index of all vignette topics

emmeans/inst/doc/utilities.html

Utilities and options for emmeans

emmeans package, Version 1.7.2

Updating an emmGrid object

Several internal settings are saved when functions like ref_grid(), emmeans(), contrast(), etc. are run. Those settings can be manipulated via the update() method for emmGrids. To illustrate, consider the pigs dataset and model yet again:

pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs)
pigs.emm <- emmeans(pigs.lm, "source")
pigs.emm
##  source emmean     SE df lower.CL upper.CL
##  fish     3.39 0.0367 23     3.32     3.47
##  soy      3.67 0.0374 23     3.59     3.74
##  skim     3.80 0.0394 23     3.72     3.88
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95

We see confidence intervals but not tests, by default. This happens as a result of internal settings in pigs.emm that are passed to summary() when the object is displayed. If we are going to work with this object a lot, we might want to change its internal settings rather than having to rely on explicitly calling summary() with several arguments. If so, just update the internal settings to what is desired; for example:

pigs.emm.s <- update(pigs.emm, infer = c(TRUE, TRUE), null = log(35),
                     calc = c(n = ".wgt."))
pigs.emm.s
##  source emmean     SE df  n lower.CL upper.CL null t.ratio p.value
##  fish     3.39 0.0367 23 10     3.32     3.47 3.56  -4.385  0.0002
##  soy      3.67 0.0374 23 10     3.59     3.74 3.56   2.988  0.0066
##  skim     3.80 0.0394 23  9     3.72     3.88 3.56   6.130  <.0001
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95

Note that by adding calc, we have set a default to calculate and display the sample size when the object is summarized. See help("update.emmGrid") for details on the keywords that can be changed. Mostly, they are the same as the names of arguments in the functions that construct these objects.

Of course, we can always get what we want via calls to test(), confint() or summary() with appropriate arguments. But the update() function is more useful in sophisticated manipulations of objects, or called implicitly via the ... or options argument in emmeans() and other functions. Those options are passed to update() just before the object is returned. For example, we could have done the above update within the emmeans() call as follows (results are not shown because they are the same as before):

emmeans(pigs.lm, "source", infer = c(TRUE, TRUE), null = log(35),
        calc = c(n = ".wgt."))

Back to contents

Setting options

Speaking of the options argument, note that the default in emmeans() is options = get_emm_option("emmeans"). Let’s see what that is:

get_emm_option("emmeans")
## $infer
## [1]  TRUE FALSE

So, by default, confidence intervals, but not tests, are displayed when the result is summarized. The reverse is true for results of contrast() (and also the default for pairs() which calls contrast()):

get_emm_option("contrast")
## $infer
## [1] FALSE  TRUE

There are also defaults for a newly constructed reference grid:

get_emm_option("ref_grid")
## $is.new.rg
## [1] TRUE
## 
## $infer
## [1] FALSE FALSE

The default is to display neither intervals nor tests when summarizing. In addition, the flag is.new.rg is set to TRUE, and that is why one sees a str() listing rather than a summary as the default when the object is simply shown by typing its name at the console.
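
For example, a sketch of both display styles, using the model fitted above:

pigs.rg <- ref_grid(pigs.lm)
pigs.rg            # str()-style listing, because is.new.rg is TRUE
summary(pigs.rg)   # explicitly requests the tabular summary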

Setting and viewing defaults

The user may have other preferences. She may want to see both intervals and tests whenever contrasts are produced; and perhaps she also wants to always default to the response scale when transformations or links are present. We can change the defaults by setting the corresponding options; and that is done via the emm_options() function:

emm_options(emmeans = list(type = "response"),
            contrast = list(infer = c(TRUE, TRUE)))

Now, new emmeans() results and contrasts follow the new defaults:

pigs.anal.p <- emmeans(pigs.lm, consec ~ percent)
pigs.anal.p
## $emmeans
##  percent response   SE df lower.CL upper.CL
##        9     31.4 1.28 23     28.8     34.1
##       12     37.5 1.44 23     34.7     40.6
##       15     39.0 1.70 23     35.6     42.7
##       18     42.3 2.24 23     37.9     47.2
## 
## Results are averaged over the levels of: source 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale 
## 
## $contrasts
##  contrast ratio     SE df lower.CL upper.CL null t.ratio p.value
##  12 / 9    1.20 0.0671 23    1.038     1.38    1   3.202  0.0112
##  15 / 12   1.04 0.0604 23    0.896     1.20    1   0.650  0.8613
##  18 / 15   1.09 0.0750 23    0.911     1.29    1   1.194  0.5202
## 
## Results are averaged over the levels of: source 
## Confidence level used: 0.95 
## Conf-level adjustment: mvt method for 3 estimates 
## Intervals are back-transformed from the log scale 
## P value adjustment: mvt method for 3 tests 
## Tests are performed on the log scale

Observe that the contrasts “inherited” the type = "response" default from the EMMs.

NOTE: Setting the above options does not change how existing emmGrid objects are displayed; it only affects ones constructed in the future.

There is one more option – summary – that overrides all other display defaults for both existing and future objects. For example, specifying emm_options(summary = list(infer = c(TRUE, TRUE))) will result in both intervals and tests being displayed, regardless of their internal defaults, unless infer is explicitly specified in a call to summary().
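
For instance, a sketch of setting that override and then removing it:

emm_options(summary = list(infer = c(TRUE, TRUE)))
pigs.emm                      # now displays both intervals and tests
emm_options(summary = NULL)   # remove the override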

To temporarily revert to factory defaults in a single call to emmeans() or contrast() or pairs(), specify options = NULL in the call. To reset everything to factory defaults (which we do presently), null-out all of the emmeans package options:

options(emmeans = NULL)

Optimal digits to display

When an emmGrid object is summarized and displayed, the factory default is to show just enough digits to be justified by the standard errors or HPD intervals of the displayed estimates. You may use the "opt.digits" option to change this. If it is TRUE (the default), only as many digits are displayed as are justified (but at least 3). If it is set to FALSE, the number of digits is set using the R system’s default, getOption("digits"); this is often much more precision than is justified. To illustrate, here is the summary of pigs.emm displayed without optimizing digits. Compare it with the first summary in this vignette.

emm_options(opt.digits = FALSE)
pigs.emm
##  source   emmean         SE df lower.CL upper.CL
##  fish   3.394492 0.03668122 23 3.318612 3.470373
##  soy    3.667260 0.03744798 23 3.589793 3.744727
##  skim   3.796770 0.03938283 23 3.715300 3.878240
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95
emm_options(opt.digits = TRUE)  # revert to optimal digits

By the way, setting this option does not round the calculated values computed by summary.emmGrid() or saved in a summary_emm object; it simply controls the precision displayed by print.summary_emm().

Startup options

The options accessed by emm_options() and get_emm_option() are stored in a list named emmeans within R’s options environment. Therefore, if you regularly want options other than the defaults, you can easily arrange that by specifying them in your startup script for R. For example, if you want to default to Satterthwaite degrees of freedom for lmer models, and display confidence intervals rather than tests for contrasts, your .Rprofile file could contain the line

options(emmeans = list(lmer.df = "satterthwaite", 
                       contrast = list(infer = c(TRUE, FALSE))))

Back to contents

Combining and subsetting emmGrid objects

Two or more emmGrid objects may be combined using the rbind() or + methods. The most common reason (or perhaps the only good reason) to do this is to combine EMMs or contrasts into one family for purposes of applying a multiplicity adjustment to tests or intervals. A user may want to combine the three pairwise comparisons of sources with the three comparisons of consecutive percents obtained above into a single family of six tests with a suitable multiplicity adjustment. This is done quite simply:

rbind(pairs(pigs.emm.s), pigs.anal.p[[2]])
##  contrast    estimate     SE df t.ratio p.value
##  fish - soy   -0.2728 0.0529 23  -5.153  0.0002
##  fish - skim  -0.4023 0.0542 23  -7.428  <.0001
##  soy - skim   -0.1295 0.0530 23  -2.442  0.1364
##  12 - 9        0.1796 0.0561 23   3.202  0.0238
##  15 - 12       0.0378 0.0582 23   0.650  1.0000
##  18 - 15       0.0825 0.0691 23   1.194  1.0000
## 
## Results are averaged over some or all of the levels of: percent, source 
## Results are given on the log (not the response) scale. 
## P value adjustment: bonferroni method for 6 tests

The default adjustment is "bonferroni"; we could have specified something different via the adjust argument. An equivalent way to combine emmGrids is via the addition operator. Any options may be provided by update(). Below, we combine the same results into a family but ask for the “exact” multiplicity adjustment.

update(pigs.anal.p[[2]] + pairs(pigs.emm.s), adjust = "mvt")
##  contrast    ratio     SE df lower.CL upper.CL null t.ratio p.value
##  12 / 9      1.197 0.0671 23    1.022    1.402    1   3.202  0.0214
##  15 / 12     1.039 0.0604 23    0.881    1.224    1   0.650  0.9681
##  18 / 15     1.086 0.0750 23    0.894    1.320    1   1.194  0.7305
##  fish / soy  0.761 0.0403 23    0.656    0.884    1  -5.153  0.0002
##  fish / skim 0.669 0.0362 23    0.574    0.779    1  -7.428  <.0001
##  soy / skim  0.879 0.0466 23    0.756    1.020    1  -2.442  0.1109
## 
## Results are averaged over some or all of the levels of: source, percent 
## Confidence level used: 0.95 
## Conf-level adjustment: mvt method for 6 estimates 
## Intervals are back-transformed from the log scale 
## P value adjustment: mvt method for 6 tests 
## Tests are performed on the log scale

Also evident in comparing these results is that settings are obtained from the first object combined. So in the second output, where they are combined in reverse order, we get both confidence intervals and tests, and transformation to the response scale.

To subset an emmGrid object, just use the subscripting operator []. For instance,

pigs.emm[2:3]
##  source emmean     SE df lower.CL upper.CL
##  soy      3.67 0.0374 23     3.59     3.74
##  skim     3.80 0.0394 23     3.72     3.88
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95

Accessing results to use elsewhere

Sometimes, users want to use the results of an analysis (say, an emmeans() call) in other computations. The summary() method creates a summary_emm object that inherits from the data.frame class; so one may use the variables therein just as those in a data frame.

Another way is to use the as.data.frame() method for emmGrid objects. This is provided to implement the standard way to coerce an object to a data frame. For illustration, let’s compute the widths of the confidence intervals in our example.

transform(pigs.emm, CI.width = upper.CL - lower.CL)
##   source   emmean         SE df lower.CL upper.CL  CI.width
## 1   fish 3.394492 0.03668122 23 3.318612 3.470373 0.1517618
## 2    soy 3.667260 0.03744798 23 3.589793 3.744727 0.1549341
## 3   skim 3.796770 0.03938283 23 3.715300 3.878240 0.1629392

This implicitly converted pigs.emm to a data frame by passing it to the as.data.frame() method, then performed the required computation. But sometimes you have to explicitly call as.data.frame(). [Note that the opt.digits option is ignored here, because this is a regular data frame, not the summary of an emmGrid.]
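
For instance, an explicit coercion (a sketch reproducing the interval widths computed above):

pigs.df <- as.data.frame(pigs.emm)
pigs.df$upper.CL - pigs.df$lower.CL   # ordinary data-frame arithmetic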

Back to contents

Adding grouping factors

Sometimes, users want to group levels of a factor into a smaller number of groups. Those groups may then be, say, averaged separately and compared, or used as a by factor. The add_grouping() function serves this purpose. The function takes four arguments: the object, the name of the grouping factor to be created, the name of the reference factor that is being grouped, and a vector of level names of the grouping factor corresponding to levels of the reference factor. Suppose for example that we want to distinguish animal and non-animal sources of protein in the pigs example:

pigs.emm.ss <- add_grouping(pigs.emm.s, "type", "source",
                            c("animal", "vegetable", "animal"))
str(pigs.emm.ss)
## 'emmGrid' object with variables:
##     source = fish, soy, skim
##     type = animal, vegetable
## Nesting structure:  source %in% type
## Transformation: "log"

Note that the new object has a nesting structure (see more about this in the “messy-data” vignette), with the reference factor nested in the new grouping factor. Now we can obtain means and comparisons for each group

emmeans(pigs.emm.ss, pairwise ~ type)
## $emmeans
##  type      emmean     SE df  n lower.CL upper.CL
##  animal      3.60 0.0267 23 19     3.54     3.65
##  vegetable   3.67 0.0374 23 10     3.59     3.74
## 
## Results are averaged over the levels of: percent, source 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast           estimate     SE df t.ratio p.value
##  animal - vegetable  -0.0716 0.0455 23  -1.573  0.1295
## 
## Results are averaged over the levels of: percent, source 
## Results are given on the log (not the response) scale.

Back to contents

Re-labeling or re-leveling an emmGrid

Sometimes it is desirable to re-label the rows of an emmGrid, or cast it in terms of other factor(s). This can be done via the levels argument in update().

As an example, sometimes a fitted model has a treatment factor that comprises combinations of other factors. In subsequent analysis, we may well want to break it down into the individual factors’ contributions. Consider, for example, the warpbreaks data provided with R. We will define a single factor and fit a model with non-homogeneous variances:

warp <- transform(warpbreaks, treat = interaction(wool, tension))
library(nlme)
warp.gls <- gls(breaks ~ treat, weights = varIdent(form = ~ 1|treat), data = warp)
( warp.emm <- emmeans(warp.gls, "treat") )
##  treat emmean   SE   df lower.CL upper.CL
##  A.L     44.6 6.03 8.01     30.6     58.5
##  B.L     28.2 3.29 8.00     20.6     35.8
##  A.M     24.0 2.89 8.00     17.3     30.7
##  B.M     28.8 3.14 8.00     21.5     36.0
##  A.H     24.6 3.42 8.00     16.7     32.5
##  B.H     18.8 1.63 8.00     15.0     22.5
## 
## Degrees-of-freedom method: satterthwaite 
## Confidence level used: 0.95

But now we want to re-cast this emmGrid into one that has separate factors for wool and tension. We can do this as follows:

warp.fac <- update(warp.emm, levels = list(
                wool = c("A", "B"), tension = c("L", "M", "H")))
str(warp.fac)
## 'emmGrid' object with variables:
##     wool = A, B
##     tension = L, M, H

So now we can do various contrasts involving the separate factors:

contrast(warp.fac, "consec", by = "wool")
## wool = A:
##  contrast estimate   SE   df t.ratio p.value
##  M - L     -20.556 6.69 11.5  -3.074  0.0203
##  H - M       0.556 4.48 15.6   0.124  0.9899
## 
## wool = B:
##  contrast estimate   SE   df t.ratio p.value
##  M - L       0.556 4.55 16.0   0.122  0.9881
##  H - M     -10.000 3.54 12.0  -2.824  0.0269
## 
## Degrees-of-freedom method: satterthwaite 
## P value adjustment: mvt method for 2 tests

Note: When re-leveling to more than one factor, you have to be careful to anticipate that the levels will be expanded using expand.grid(): the first factor in the list varies the fastest and the last varies the slowest. That was the case in our example, but in others, it may not be. Had the levels of treat been ordered as A.L, A.M, A.H, B.L, B.M, B.H, then we would have had to specify the levels of tension first and the levels of wool second.
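
In that situation, the levels would have to be given in the opposite order, as in this hypothetical call:

update(warp.emm, levels = list(
    tension = c("L", "M", "H"), wool = c("A", "B")))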

Back to contents

Index of all vignette topics

emmeans/inst/doc/confidence-intervals.Rmd0000644000176200001440000003067114137062735020206 0ustar liggesusers--- title: "Confidence intervals and tests in emmeans" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{Confidence intervals and tests} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, results = "hide", message = FALSE} require("emmeans") knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro") ``` ## Contents {#contents} This vignette describes various ways of summarizing `emmGrid` objects. 1. [`summary()`, `confint()`, and `test()`](#summary) 2. [Back-transforming to response scale](#tran) (See also the ["transformations" vignette](transformations.html)) 3. [Multiplicity adjustments](#adjust) 4. [Using "by" variables](#byvars) 5. [Joint (omnibus) tests](#joint) 6. [Testing equivalence, noninferiority, nonsuperiority](#equiv) 7. Graphics (in ["basics" vignette](basics.html#plots)) [Index of all vignette topics](vignette-topics.html) ## `summary()`, `confint()`, and `test()` {#summary} The most important method for `emmGrid` objects is `summary()`. For one thing, it is called by default when you display an `emmeans()` result. The `summary()` function has a lot of options, and the detailed documentation via `help("summary.emmGrid")` is worth a look. For ongoing illustrations, let's re-create some of the objects in the ["basics" vignette](basics.html) for the `pigs` example: ```{r} pigs.lm1 <- lm(log(conc) ~ source + factor(percent), data = pigs) pigs.rg <- ref_grid(pigs.lm1) pigs.emm.s <- emmeans(pigs.rg, "source") ``` Just `summary()` by itself will produce a summary that varies somewhat according to context. It does this by setting different defaults for the `infer` argument, which consists of two logical values, specifying confidence intervals and tests, respectively. [The exception is models fitted using MCMC methods, where `summary()` is diverted to the `hpd.summary()` function, a preferable summary for many Bayesians.] The summary of a newly made reference grid will show just estimates and standard errors, but not confidence intervals or tests (that is, `infer = c(FALSE, FALSE)`). The summary of an `emmeans()` result, as we see above, will have intervals, but no tests (i.e., `infer = c(TRUE, FALSE)`); and the result of a `contrast()` call (see [comparisons and contrasts](comparisons.html)) will show test statistics and *P* values, but not intervals (i.e., `infer = c(FALSE, TRUE)`). There are courtesy methods `confint()` and `test()` that just call `summary()` with the appropriate `infer` setting; for example, ```{r} test(pigs.emm.s) ``` It is not particularly useful, though, to test these EMMs against the default of zero -- which is why tests are not usually shown. It makes a lot more sense to test them against some target concentration, say 40. And suppose we want to do a one-sided test to see if the concentration is greater than 40. Remembering that the response is log-transformed in this model, ```{r} test(pigs.emm.s, null = log(40), side = ">") ``` It is also possible to add calculated columns to the summary, via the `calc` argument. The calculations can include any columns up through `df` in the summary, as well as any variable in the object's `grid` slot. 
Among the latter are usually weights in a column named `.wgt.`, and we can use that to include sample size in the summary: ```{r} confint(pigs.emm.s, calc = c(n = ~.wgt.)) ``` [Back to Contents](#contents) ## Back-transforming {#tran} Transformations and link functions are supported an several ways in **emmeans**, making this a complex topic worthy of [its own vignette](transformations.html). Here, we show just the most basic approach. Namely, specifying the argument `type = "response"` will cause the displayed results to be back-transformed to the response scale, when a transformation or link function is incorporated in the model. For example, let's try the preceding `test()` call again: ```{r} test(pigs.emm.s, null = log(40), side = ">", type = "response") ``` Note what changes and what doesn't change. In the `test()` call, we *still* use the log of 40 as the null value; `null` must always be specified on the linear-prediction scale, in this case the log. In the output, the displayed estimates, as well as the `null` value, are shown back-transformed. As well, the standard errors are altered (using the delta method). However, the *t* ratios and *P* values are identical to the preceding results. That is, the tests themselves are still conducted on the linear-predictor scale (as is noted in the output). Similar statements apply to confidence intervals on the response scale: ```{r} confint(pigs.emm.s, side = ">", level = .90, type = "response") ``` With `side = ">"`, a *lower* confidence limit is computed on the log scale, then that limit is back-transformed to the response scale. (We have also illustrated how to change the confidence level.) [Back to Contents](#contents) ## Multiplicity adjustments {#adjust} Both tests and confidence intervals may be adjusted for simultaneous inference. Such adjustments ensure that the confidence coefficient for a whole set of intervals is at least the specified level, or to control for multiplicity in a whole family of tests. This is done via the `adjust` argument. For `ref_grid()` and `emmeans()` results, the default is `adjust = "none"`. For most `contrast()` results, `adjust` is often something else, depending on what type of contrasts are created. For example, pairwise comparisons default to `adjust = "tukey"`, i.e., the Tukey HSD method. The `summary()` function sometimes *changes* `adjust` if it is inappropriate. For example, with ```{r} confint(pigs.emm.s, adjust = "tukey") ``` the adjustment is changed to the Sidak method because the Tukey adjustment is inappropriate unless you are doing pairwise comparisons. ####### {#adjmore} An adjustment method that is usually appropriate is Bonferroni; however, it can be quite conservative. Using `adjust = "mvt"` is the closest to being the "exact" all-around method "single-step" method, as it uses the multivariate *t* distribution (and the **mvtnorm** package) with the same covariance structure as the estimates to determine the adjustment. However, this comes at high computational expense as the computations are done using simulation techniques. For a large set of tests (and especially confidence intervals), the computational lag becomes noticeable if not intolerable. For tests, `adjust` increases the *P* values over those otherwise obtained with `adjust = "none"`. Compare the following adjusted tests with the unadjusted ones previously computed. 
```{r} test(pigs.emm.s, null = log(40), side = ">", adjust = "bonferroni") ``` [Back to Contents](#contents) ## "By" variables {#byvars} Sometimes you want to break a summary down into smaller pieces; for this purpose, the `by` argument in `summary()` is useful. For example, ```{r} confint(pigs.rg, by = "source") ``` If there is also an `adjust` in force when `by` variables are used, the adjustment is made *separately* on each `by` group; e.g., in the above, we would be adjusting for sets of 4 intervals, not all 12 together. There can be a `by` specification in `emmeans()` (or equivalently, a `|` in the formula); and if so, it is passed on to `summary()` and used unless overridden by another `by`. Here are examples, not run: ```{r eval = FALSE} emmeans(pigs.lm, ~ percent | source) ### same results as above summary(.Last.value, by = percent) ### grouped the other way ``` Specifying `by = NULL` will remove all grouping. ### Simple comparisons {#simple} There is also a `simple` argument for `contrast()` that is in essence the inverse of `by`; the contrasts are run using everything *except* the specified variables as `by` variables. To illustrate, let's consider the model for `pigs` that includes the interaction (so that the levels of one factor compare differently at levels of the other factor). ```{r} pigsint.lm <- lm(log(conc) ~ source * factor(percent), data = pigs) pigsint.rg <- ref_grid(pigsint.lm) contrast(pigsint.rg, "consec", simple = "percent") ``` In fact, we may do *all* one-factor comparisons by specifying `simple = "each"`. This typically produces a lot of output, so use it with care. [Back to Contents](#contents) ## Joint tests {#joint} From the above, we already know how to test individual results. For pairwise comparisons (details in [the "comparisons" vignette](comparisons.html)), we might do ```{r} pigs.prs.s <- pairs(pigs.emm.s) pigs.prs.s ``` But suppose we want an *omnibus* test that all these comparisons are zero. Easy enough, using the `joint` argument in `test` (note: the `joint` argument is *not* available in `summary()`; only in `test()`): ```{r} test(pigs.prs.s, joint = TRUE) ``` Notice that there are three comparisons, but only 2 d.f. for the test, as cautioned in the message. The test produced with `joint = TRUE` is a "type III" test (assuming the default equal weights are used to obtain the EMMs). See more on these types of tests for higher-order effects in the ["interactions" vignette section on contrasts](interactions.html#contrasts). ####### {#joint_tests} For convenience, there is also a `joint_tests()` function that performs joint tests of contrasts among each term in a model or `emmGrid` object. ```{r} joint_tests(pigsint.rg) ``` The tests of main effects are of families of contrasts; those for interaction effects are for interaction contrasts. These results are essentially the same as a "Type-III ANOVA", but may differ in situations where there are empty cells or other non-estimability issues, or if generalizations are present such as unequal weighting. (Another distinction is that sums of squares and mean squares are not shown; that is because these really are tests of contrasts among predictions, and they may or may not correspond to model sums of squares.) One may use `by` variables with `joint_tests`. For example: ```{r} joint_tests(pigsint.rg, by = "source") ``` In some models, it is possible to specify `submodel = "type2"`, thereby obtaining something akin to a Type II analysis of variance. 
See the [messy-data vignette](messy-data.html#type2submodel) for an example. [Back to Contents](#contents) ## Testing equivalence, noninferiority, and nonsuperiority {#equiv} The `delta` argument in `summary()` or `test()` allows the user to specify a threshold value to use in a test of equivalence, noninferiority, or nonsuperiority. An equivalence test is kind of a backwards significance test, where small *P* values are associated with small differences relative to a specified threshold value `delta`. The help page for `summary.emmGrid` gives the details of these tests. Suppose in the present example, we consider two sources to be equivalent if they are within 25% of each other. We can test this as follows: ```{r} test(pigs.prs.s, delta = log(1.25), adjust = "none") ``` By our 25% standard, the *P* value is quite small for comparing soy and skim, providing statistical evidence that their difference is enough smaller than the threshold to consider them equivalent. [Back to Contents](#contents) ## Graphics {#graphics} Graphical displays of `emmGrid` objects are described in the ["basics" vignette](basics.html#plots) [Index of all vignette topics](vignette-topics.html) emmeans/inst/doc/models.Rmd0000644000176200001440000007104714147507455015375 0ustar liggesusers--- title: "Models supported by emmeans" author: "emmeans package, Version `r packageVersion('emmeans')`" output: emmeans::.emm_vignette vignette: > %\VignetteIndexEntry{Models supported by emmeans} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- Here we document what model objects may be used with **emmeans**, and some special features of some of them that may be accessed by passing additional arguments through `ref_grid` or `emmeans()`. Certain objects are affected by optional arguments to functions that construct `emmGrid` objects, including `ref_grid()`, `emmeans()`, `emtrends()`, and `emmip()`. When "*arguments*" are mentioned in the subsequent quick reference and object-by-object documentation, we are talking about arguments in these constructors. If a model type is not included here, users may be able to obtain usable results via the `qdrg()` function; see its help page. Package developers may support their models by writing appropriate `recover_data` and `emm_basis` methods. See the package documentation for `extending-emmeans` and `vignette("xtending")` for details. [Index of all vignette topics](vignette-topics.html) ## Quick reference for supported objects and options {#quickref} Here is an alphabetical list of model classes that are supported, and the arguments that apply. Detailed documentation follows, with objects grouped by the code in the "Group" column. Scroll down or follow the links to those groups for more information. 
|Object.class |Package |Group |Arguments / notes | |:------------|:--------|:-------:|:------------------------------------------------------------| |aov |stats |[A](#A) | | |aovList |stats |[V](#V) |Best with balanced designs, orthogonal coding | |averaging |MuMIn |[I](#I) | | |betareg |betareg |[B](#B) |`mode = c("link", "precision", "phi.link",` | | | | |` "variance", "quantile")` | |brmsfit |brms |[P](#P) |Supported in **brms** package | |carbayes |CARBayes |[S](#S) |`data` is required | |clm |ordinal |[O](#O) |`mode = c("latent", "linear.predictor", "cum.prob",` | | | | |` "exc.prob", "prob", "mean.class", "scale")` | |clmm |ordinal |[O](#O) |Like `clm` but no `"scale"` mode | |coxme |coxme |[G](#G) | | |coxph |survival |[G](#G) | | |gam |mgcv |[G](#G) |`freq = FALSE`, `unconditional = FALSE`, | | | | |`what = c("location", "scale", "shape", "rate", "prob.gt.0")`| |gamm |mgcv |[G](#G) |`call = object$gam$call` | |Gam |gam |[G](#G) |`nboot = 800` | |gamlss |gamlss |[H](#H) |`what = c("mu", "sigma", "nu", "tau")` | |gee |gee |[E](#E) |`vcov.method = c("naive", "robust")` | |geeglm |geepack |[E](#E) |`vcov.method = c("vbeta", "vbeta.naiv", "vbeta.j1s",` | | | | |`"vbeta.fij", "robust", "naive")` or a matrix | |geese |geepack |[E](#E) |Like `geeglm` | |glm |stats |[G](#G) | | |glm.nb |MASS |[G](#G) |Requires `data` argument | |glmerMod |lme4 |[G](#G) | | |glmmadmb |glmmADMB | |No longer supported | |glmmPQL |MASS |[G](#G) |inherits `lm` support | |glmmTMB |glmmTMB |[P](#P) |Supported in **glmmTMB** package (dev. version only?) | |gls |nlme |[K](#K) |`mode = c("auto", "df.error", "satterthwaite", "asymptotic")`| |gnls |nlme |[A](#A) |Supports `params` part. Requires `param = ""` | |hurdle |pscl |[C](#C) |`mode = c("response", "count", "zero", "prob0"),` | | | | |`lin.pred = c(FALSE, TRUE)` | |lm |stats |[A](#A) |Several other classes inherit from this and may be supported | |lme |nlme |[K](#K) |`sigmaAdjust = c(TRUE, FALSE),` | | | | |`mode = c("auto", containment", "satterthwaite", "asymptotic"),`| | | | |`extra.iter = 0` | |lmerMod |lme4 |[L](#L) |`lmer.df = c("kenward-roger", "satterthwaite", "asymptotic")`, | | | | |`pbkrtest.limit = 3000`, `disable.pbkrtest = FALSE`. | | | | |`emm_options(lmer.df =, pbkrtest.limit =, disable.pbkrtest =)` | |lqm,lqmm |lqmm |[Q](#Q) |`tau = "0.5"` (must match an entry in `object$tau`) | | | | |Optional: `method`, `R`, `seed`, `startQR` (must be fully spelled-out) | |manova |stats |[M](#M) |`mult.name`, `mult.levs` | |maov |stats |[M](#M) |`mult.name`, `mult.levs` | |mblogit |mclogit |[P](#P) |Supported in **mclogit** (overrides previous minimal support here) | |mcmc |mcmc |[S](#S) |May require `formula`, `data` | |MCMCglmm |MCMCglmm |[S](#S) |(see also [M](#M#)) `mult.name`, `mult.levs`, `trait`, | | | | |`mode = c("default", "multinomial")`; `data` is required | |mira |mice |[I](#I) |Optional arguments per class of `$analyses` elements | |mixed |afex |[P](#P) |Supported in **afex** package | |mlm |stats |[M](#M) |`mult.name`, `mult.levs` | |mmer |sommer |[G](#G) | | |multinom |nnet |[N](#N) |`mode = c("prob", "latent")` | | | | |Always include response in specs for `emmeans()` | |nauf |nauf.*xxx* |[P](#P) |Supported in **nauf** package | |nlme |nlme |[A](#A) |Supports fixed part. 
Requires `param = ""` | |polr |MASS |[O](#O) |`mode = c("latent", "linear.predictor", "cum.prob",` | | | | |`"exc.prob", "prob", "mean.class")` | |rlm |MASS |[A](#A) |inherits `lm` support | |rms |rms |[O](#O) |`mode = ("middle", "latent", "linear.predictor",` | | | | |`"cum.prob", "exc.prob", "prob", "mean.class")` | |rq,rqs |quantreg |[Q](#Q) |`tau = "0.5"` (must match an entry in `object$tau`) | | | | |Optional: `se`, `R`, `bsmethod`, etc. | |rlmerMod |robustlmm|[P](#P) |Supported in **robustlmm** package | |rsm |rsm |[P](#P) |Supported in **rsm** package | |stanreg |rstanarm |[S](#S) |Args for `stanreg_`*xxx* similar to those for *xxx* | |survreg |survival |[A](#A) | | |svyglm |survey |[A](#A) | | |zeroinfl |pscl |[C](#C) |`mode = c("response", "count", "zero", "prob0")`, | | | | |`lin.pred = c(FALSE, TRUE)` | ## Group A -- "Standard" or minimally supported models {#A} Models in this group, such as `lm`, do not have unusual features that need special support; hence no extra arguments are needed. Some may require `data` in the call. ## B -- Beta regression {#B} The additional `mode` argument for `betareg` objects has possible values of `"response"`, `"link"`, `"precision"`, `"phi.link"`, `"variance"`, and `"quantile"`, which have the same meaning as the `type` argument in `predict.betareg` -- with the addition that `"phi.link"` is like `"link"`, but for the precision portion of the model. When `mode = "quantile"` is specified, the additional argument `quantile` (a numeric scalar or vector) specifies which quantile(s) to compute; the default is 0.5 (the median). Also in `"quantile"` mode, an additional variable `quantile` is added to the reference grid, and its levels are the values supplied. [Back to quick reference](#quickref) ## Group C -- Count models {#C} Two optional arguments -- `mode` and `lin.pred` -- are provided. The `mode` argument has possible values `"response"` (the default), `"count"`, `"zero"`, or `"prob0"`. `lin.pred` is logical and defaults to `FALSE`. With `lin.pred = FALSE`, the results are comparable to those returned by `predict(..., type = "response")`, `predict(..., type = "count")`, `predict(..., type = "zero")`, or `predict(..., type = "prob")[, 1]`. See the documentation for `predict.hurdle` and `predict.zeroinfl`. The option `lin.pred = TRUE` only applies to `mode = "count"` and `mode = "zero"`. The results returned are on the linear-predictor scale, with the same transformation as the link function in that part of the model. The predictions for a reference grid with `mode = "count"`, `lin.pred = TRUE`, and `type = "response"` will be the same as those obtained with `lin.pred = FALSE` and `mode = "count"`; however, any EMMs derived from these grids will be different, because the averaging is done on the log-count scale and the actual count scale, respectively -- thereby producing geometric means versus arithmetic means of the predictions. If the `vcov.` argument is used (see details in the documentation for `ref_grid`), it must yield a matrix of the same size as would be obtained using `vcov.hurdle` or `vcov.zeroinfl` with its `model` argument set to `("full", "count", "zero")` in respective correspondence with `mode` of `("mean", "count", "zero")`. If `vcov.` is a function, it must support the `model` argument. [Back to quick reference](#quickref) ## Group E -- GEE models {#E} These models all have more than one covariance estimate available, and it may be selected by supplying a string as the `vcov.method` argument. 
It is partially matched with the available choices shown in the quick reference. In `geese` and `geeglm`, the aliases `"robust"` (for `"vbeta"`) and `"naive"` (for `"vbeta.naiv"` are also accepted. If a matrix or function is supplied as `vcov.method`, it is interpreted as a `vcov.` specification as described for `...` in the documentation for `ref_grid`. ## Group G -- Generalized linear models and relatives {#G} Most models in this group receive only standard support as in [Group A](#A), but typically the tests and confidence intervals are asymptotic. Thus the `df` column for tabular results will be `Inf`. Some objects in this group *require* that the original or reference dataset be provided when calling `ref_grid()` or `emmeans()`. In the case of `mgcv::gam` objects, there are optional `freq` and `unconditional` arguments as is detailed in the documentation for `mgcv::vcov.gam()`. Both default to `FALSE`. The value of `unconditional` matters only if `freq = FALSE` and `object$Vc` is non-null. For `mgcv::gamm` objects, `emmeans()` results are based on the `object$gam` part. Unfortunately, that is missing its `call` component, so the user must supply it in the `call` argument (e.g., `call = quote(gamm(y ~ s(x), data = dat))`) or give the dataset in the `data` argument. Alternatively (and recommended), you may first set `object$gam$call` to the quoted call ahead of time. The `what` arguments are used to select which model formula to use: `"location", "scale"` apply to `gaulss` and `gevlss` families, `"shape"` applies only to `gevlss`, and `"rate", "prob.gt.0"` apply to `ziplss`. With `gam::Gam` objects, standard errors are estimated using a bootstrap method when there are any smoothers involved. Accordingly, there is an optional `nboot` argument that sets the number of bootstrap replications used to estimate the variances and covariances of the smoothing portions of the model. Generally, it is better to use models fitted via `mgcv::gam()` rather than `gam::gam()`. [Back to quick reference](#quickref) ## Group H -- `gamlss` models {#H} The `what` argument has possible values of `"mu"` (default), `"sigma"`, `"nu"`, or `"tau"` depending on which part of the model you want results for. Currently, there is no support when the selected part of the model contains a smoothing method like `pb()`. ## Group I -- Multiple models (via imputation or averaging) {#I} These objects are the results of fitting several models with different predictor subsets or imputed values. The `bhat` and `V` slots are obtained via averaging and, in the case of multiple imputation, adding a multiple of the between-imputation covariance per Rubin's rules. Support for `MuMIn::averaging` objects may be somewhat dodgy, as it is not clear that all supported model classes will work. The object *must* have a `"modelList"` attribute (obtained by constructing the object explicitly from a model list or by including `fit = TRUE` in the call). And each model should be fitted with `data` as a **named** argument in the call; or else provide a `data` argument in the call to `emmeans()` or `ref_grid()`. No estimability checking is done at present: if/when it is added, a linear function will be estimable only if it is estimable in *all* models included in the averaging. ## Group K -- `gls` and `lme` models {#K} The `sigmaAdjust` argument is a logical value that defaults to `TRUE`. 
It is comparable to the `adjustSigma` option in `nlme::summary.lme` (the name-mangling is to avoid conflicts with the often-used `adjust` argument), and determines whether or not a degrees-of-freedom adjustment is performed with models fitted using the ML method. The optional `mode` argument affects the degrees of freedom. The `mode = "satterthwaite"` option determines degrees of freedom via the Satterthwaite method: If `s^2` is the estimate of some variance, then its Satterthwaite d.f. is `2*s^4 / Var(s^2)`. In case our numerical methods for this fail, we also offer `mode = "appx-satterthwaite"` as a backup, by which quantities related to `Var(s^2)` are obtained by randomly perturbing the response values. Currently, only `"appx-satterthwaite"` is available for `lme` objects, and it is used if `"satterthwaite"` is requested. Because `appx-satterthwaite` is simulation-based, results may vary if the same analysis is repeated. An `extra.iter` argument may be added to request additional simulation runs (at [possibly considerable] cost of repeating the model-fitting that many more times). (Note: Previously, `"appx-satterthwaite"` was termed `"boot-satterthwaite"`; this is still supported for backward compatibility. The "boot" was abandoned because it is really an approximation method, not a bootstrap method in the sense as many statistical methods.) An alternative method is `"df.error"` (for `gls`) and `"containment"` (for `lme`). `df.error` is just the error degrees of freedom for the model, minus the number of extra random effects estimated; it generally over-estimates the degrees of freedom. The `asymptotic` mode simply sets the degrees of freedom to infinity. "containment"` mode (for `lme` models) determines the degrees of freedom for the coarsest grouping involved in the contrast or linear function involved, so it tends to under-estimate the degrees of freedom. The default is `mode = "auto"`, which uses Satterthwaite if there are estimated random effects and the non-Satterthwaite option otherwise. The `extra.iter` argument is ignored unless the d.f. method is (or defaults to) `appx-satterthwaite`. [Back to quick reference](#quickref) ## Group L -- `lmerMod` models {#L} There is an optional `lmer.df` argument that defaults to `get_EMM_option("lmer.df")` (which in turn defaults to `"kenward-roger"`). The possible values are `"kenward-roger"`, `"satterthwaite"`, and `"asymptotic"` (these are partially matched and case-insensitive). With `"kenward-roger"`, d.f. are obtained using code from the **pbkrtest** package, if installed. With `"satterthwaite"`, d.f. are obtained using code from the **lmerTest** package, if installed. With `"asymptotic"`, or if the needed package is not installed, d.f. are set to `Inf`. (For backward compatibility, the user may specify `mode` in lieu of `lmer.df`.) A by-product of the Kenward-Roger method is that the covariance matrix is adjusted using `pbkrtest::vcovAdj()`. This can require considerable computation; so to avoid that overhead, the user should opt for the Satterthwaite or asymptotic method; or, for backward compatibility, may disable the use of **pbkrtest** via `emm_options(disable.pbkrtest = TRUE)` (this does not disable the **pbkrtest** package entirely, just its use in **emmeans**). The computation time required depends roughly on the number of observations, *N*, in the design matrix (because a major part of the computation involves inverting an *N* x *N* matrix). 
Thus, **pbkrtest** is automatically disabled if *N* exceeds the value of `get_emm_option("pbkrtest.limit")`, for which the factory default is 3000. (The user may also specify `pbkrtest.limit` or `disable.pbkrtest` as an argument in the call to `emmeans()` or `ref_grid()`) Similarly to the above, the `disable.lmerTest` and `lmerTest.limit` options or arguments affect whether Satterthwaite methods can be implemented. The `df` argument may be used to specify some other degrees of freedom. Note that if `df` and `method = "kenward-roger"` are both specified, the covariance matrix is adjusted but the K-R degrees of freedom are not used. Finally, note that a user-specified covariance matrix (via the `vcov.` argument) will also disable the Kenward-Roger method; in that case, the Satterthwaite method is used in place of Kenward-Roger. [Back to quick reference](#quickref) ## Group M -- Multivariate models {#M} When there is a multivariate response, the different responses are treated as if they were levels of a factor -- named `rep.meas` by default. The `mult.name` argument may be used to change this name. The `mult.levs` argument may specify a named list of one or more sets of levels. If this has more than one element, then the multivariate levels are expressed as combinations of the named factor levels via the function `base::expand.grid`. ## N - Multinomial responses {#N} The reference grid includes a pseudo-factor with the same name and levels as the multinomial response. There is an optional `mode` argument which should match `"prob"` or `"latent"`. With `mode = "prob"`, the reference-grid predictions consist of the estimated multinomial probabilities. The `"latent"` mode returns the linear predictor, recentered so that it averages to zero over the levels of the response variable (similar to sum-to-zero contrasts). Thus each latent variable can be regarded as the log probability at that level minus the average log probability over all levels. There are two optional arguments: `mode` and `rescale` (which defaults to `c(0, 1)`). Please note that, because the probabilities sum to 1 (and the latent values sum to 0) over the multivariate-response levels, all sensible results from `emmeans()` must involve that response as one of the factors. For example, if `resp` is a response with *k* levels, `emmeans(model, ~ resp | trt)` will yield the estimated multinomial distribution for each `trt`; but `emmeans(model, ~ trt)` will just yield the average probability of 1/*k* for each `trt`. [Back to quick reference](#quickref) ## Group O - Ordinal responses {#O} The reference grid for ordinal models will include all variables that appear in the main model as well as those in the `scale` or `nominal` models (if provided). There are two optional arguments: `mode` (a character string) and `rescale` (which defaults to `c(0, 1)`). `mode` should match one of `"latent"` (the default), `"linear.predictor"`, `"cum.prob"`, `"exc.prob"`, `"prob"`, `"mean.class"`, or `"scale"` -- see the quick reference and note which are supported. With `mode = "latent"`, the reference-grid predictions are made on the scale of the latent variable implied by the model. The scale and location of this latent variable are arbitrary, and may be altered via `rescale`. The predictions are multiplied by `rescale[2]`, then added to `rescale[1]`. 
Keep in mind that the scaling is related to the link function used in the model; for example, changing from a probit link to a logistic link will inflate the latent values by around $\pi/\sqrt{3}$, all other things being equal. `rescale` has no effect for other values of `mode`. With `mode = "linear.predictor"`, `mode = "cum.prob"`, and `mode = "exc.prob"`, the boundaries between categories (i.e., thresholds) in the ordinal response are included in the reference grid as a pseudo-factor named `cut`. The reference-grid predictions are then of the cumulative probabilities at each threshold (for `mode = "cum.prob"`), exceedance probabilities (one minus cumulative probabilities, for `mode = "exc.prob"`), or the link function thereof (for `mode = "linear.predictor"`). With `mode = "prob"`, a pseudo-factor with the same name as the model's response variable is created, and the grid predictions are of the probabilities of each class of the ordinal response. With `"mean.class"`, the returned results are means of the ordinal response, interpreted as a numeric value from 1 to the number of classes, using the `"prob"` results as the estimated probability distribution for each case. With `mode = "scale"`, and the fitted object incorporates a scale model, EMMs are obtained for the factors in the scale model (with a log response) instead of the response model. The grid is constructed using only the factors in the scale model. Any grid point that is non-estimable by either the location or the scale model (if present) is set to `NA`, and any EMMs involving such a grid point will also be non-estimable. A consequence of this is that if there is a rank-deficient `scale` model, then *all* latent responses become non-estimable because the predictions are made using the average log-scale estimate. `rms` models have an additional `mode`. With `mode = "middle"` (this is the default), the middle intercept is used, comparable to the default for `rms::Predict()`. This is quite similar in concept to `mode = "latent"`, where all intercepts are averaged together. [Back to quick reference](#quickref) ## P -- Other packages {#P} Models in this group have their **emmeans** support provided by the package that implements the model-fitting procedure. Users should refer to the package documentation for details on **emmeans** support. In some cases, a package's models may have been supported here in **emmeans**; if so, the other package's support overrides it. ## Q -- Quantile regression {#Q} The argument `tau` should match (within a very small margin) one of the quantiles actually specified in fitting the model; otherwise an error results. In these models, the covariance matrix is obtained via the model's `summary()` method with `covariance = TRUE`. The user may specify one or more of the other arguments for `summary` or to be passed to, say, a bootstrap routine. If so, those optional arguments must be spelled-out completely (e.g., `start` will *not* be matched to `startQR`). ## S -- Sampling (MCMC) methods {#S} Models fitted using MCMC methods contain a sample from the posterior distribution of fixed-effect coefficients. In some cases (e.g., results of `MCMCpack::MCMCregress()` and `MCMCpack::MCMCpoisson()`), the object may include a `"call"` attribute that `emmeans()` can use to reconstruct the data and obtain a basis for the EMMs. If not, a `formula` and `data` argument are provided that may help produce the right results. 
In addition, the `contrasts` specifications are not necessarily recoverable from the object, so the system default must match what was actually used in fitting the model.

The `summary.emmGrid()` method provides credibility intervals (HPD intervals) of the results, and ignores the frequentist-oriented arguments (`infer`, `adjust`, etc.). An `as.mcmc()` method is provided that creates an `mcmc` object that can be summarized or plotted using the **coda** package (or others that support those objects). It provides a posterior sample of EMMs, or contrasts thereof, for the given reference grid, based on the posterior sample of the fixed effects from the model object.

In `MCMCglmm` objects, the `data` argument is required; however, if you save it as a member of the model object (e.g., `object$data = quote(mydata)`), that removes the need to specify it in each call. The special keyword `trait` is used in some models. When the response is multivariate and numeric, `trait` is generated automatically as a factor in the reference grid, and the `mult.levs` argument can be used to name its levels. In other models, such as a multinomial model, use the `mode` argument to specify the type of model, and `trait =` to specify the name of the data column that contains the levels of the factor response.

The **brms** package (version 2.13 and later) has its own **emmeans** support; refer to the documentation in that package.

[Back to quick reference](#quickref)

## Group V -- `aovlist` objects (also used with `afex_aov` objects) {#V}

Support for these objects is limited. To avoid strong biases in the predictions, it is strongly recommended that, when fitting the model, the `contrasts` attribute of all factors be of a type that sums to zero -- for example, `"contr.sum"`, `"contr.poly"`, or `"contr.helmert"`, but *not* `"contr.treatment"`. If that is found not to be the case, the model is re-fitted using sum-to-zero contrasts (thus requiring additional computation); even so, doing so does *not* remove all bias in the EMMs unless the design is perfectly balanced, and an annotation is added to warn of that. The bias does cancel out when doing comparisons and contrasts. A fitting sketch appears after this section.

Only intra-block estimates of covariances are used. That is, if a factor appears in more than one error stratum, only the covariance structure from its lowest stratum is used in estimating standard errors. Degrees of freedom are obtained using the Satterthwaite method. In general, `aovlist` support is best with balanced designs, and due caution is needed in the use of contrasts. If a `vcov.` argument is supplied, it must yield a single covariance matrix for the unique fixed effects (not a set of them for each error stratum); in that case, the degrees of freedom are set to `NA`.
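Here is a minimal sketch of that recommendation, using the classic split-plot `Oats` data from the **nlme** package (the particular model specification is illustrative):

```r
library(emmeans)
data(Oats, package = "nlme")

# Set sum-to-zero contrasts *before* fitting, per the recommendation
# above, so that emmeans() need not re-fit the model:
old <- options(contrasts = c("contr.sum", "contr.poly"))
oats.aov <- aov(yield ~ Variety * nitro + Error(Block / Variety), data = Oats)
options(old)   # restore the previous contrasts setting

emmeans(oats.aov, ~ Variety)
```

Setting the contrasts option up front avoids both the extra re-fit and the associated annotation in the output.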
[Back to quick reference](#quickref)

[Index of all vignette topics](vignette-topics.html)

emmeans/inst/doc/xtending.R0000644000176200001440000001305114165066775015405 0ustar liggesusers## ---- echo = FALSE, results = "hide", message = FALSE---------------------------------------------
require("emmeans")
knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro")
set.seed(271828)

## ----eval=FALSE-----------------------------------------------------------------------------------
# help("extending-emmeans", package="emmeans")

## -------------------------------------------------------------------------------------------------
fake = expand.grid(rep = 1:5, A = c("a1","a2"), B = c("b1","b2","b3"))
fake$y = c(11.46,12.93,11.87,11.01,11.92,17.80,13.41,13.96,14.27,15.82,
           23.14,23.75,-2.09,28.43,23.01,24.11,25.51,24.11,23.95,30.37,
           17.75,18.28,17.82,18.52,16.33,20.58,20.55,20.77,21.21,20.10)

## -------------------------------------------------------------------------------------------------
library(MASS)
fake.rlm = rlm(y ~ A * B, data = fake)

library(emmeans)
emmeans(fake.rlm, ~ B | A)

## -------------------------------------------------------------------------------------------------
fake.lts = ltsreg(y ~ A * B, data = fake)

## -------------------------------------------------------------------------------------------------
emmeans:::recover_data.lm

## -------------------------------------------------------------------------------------------------
recover_data.lqs = emmeans:::recover_data.lm

## -------------------------------------------------------------------------------------------------
rec.fake = recover_data(fake.lts)
head(rec.fake)

## -------------------------------------------------------------------------------------------------
args(emmeans:::emm_basis.lm)

## -------------------------------------------------------------------------------------------------
MASS:::predict.lqs

## -------------------------------------------------------------------------------------------------
emm_basis.lqs = function(object, trms, xlev, grid, ...) {
    m = model.frame(trms, grid, na.action = na.pass, xlev = xlev)
    X = model.matrix(trms, m, contrasts.arg = object$contrasts)
    bhat = coef(object)
    Xmat = model.matrix(trms, data = object$model)                    # 5
    V = rev(object$scale)[1]^2 * solve(t(Xmat) %*% Xmat)
    nbasis = matrix(NA)
    dfargs = list(df = nrow(Xmat) - ncol(Xmat))
    dffun = function(k, dfargs) dfargs$df
    list(X = X, bhat = bhat, nbasis = nbasis, V = V,                  #10
         dffun = dffun, dfargs = dfargs)
}

## -------------------------------------------------------------------------------------------------
emmeans(fake.lts, ~ B | A)

## -------------------------------------------------------------------------------------------------
recover_data.rsm = function(object, data, mode = c("asis", "coded", "decoded"), ...) {
    mode = match.arg(mode)
    cod = rsm::codings(object)
    fcall = object$call
    if (is.null(data))                                                # 5
        data = emmeans::recover_data(fcall,
                   delete.response(terms(object)), object$na.action, ...)
    if (!is.null(cod) && (mode == "decoded")) {
        pred = cpred = attr(data, "predictors")
        trms = attr(data, "terms")                                    #10
        data = rsm::decode.data(rsm::as.coded.data(data, formulas = cod))
        for (form in cod) {
            vn = all.vars(form)
            if (!is.na(idx <- grep(vn[1], pred))) {
                pred[idx] = vn[2]                                     #15
                cpred = setdiff(cpred, vn[1])
            }
        }
        attr(data, "predictors") = pred
        new.trms = update(trms, reformulate(c("1", cpred)))           #20
        attr(new.trms, "orig") = trms
        attr(data, "terms") = new.trms
        attr(data, "misc") = cod
    }
    data
}

## -------------------------------------------------------------------------------------------------
emm_basis.rsm = function(object, trms, xlev, grid,
                         mode = c("asis", "coded", "decoded"), misc, ...) {
    mode = match.arg(mode)
    cod = misc
    if (!is.null(cod) && mode == "decoded") {                         # 5
        grid = rsm::coded.data(grid, formulas = cod)
        trms = attr(trms, "orig")
    }

    m = model.frame(trms, grid, na.action = na.pass, xlev = xlev)     #10
    X = model.matrix(trms, m, contrasts.arg = object$contrasts)
    bhat = as.numeric(object$coefficients)
    V = emmeans::.my.vcov(object, ...)

    if (sum(is.na(bhat)) > 0)                                         #15
        nbasis = estimability::nonest.basis(object$qr)
    else
        nbasis = estimability::all.estble
    dfargs = list(df = object$df.residual)
    dffun = function(k, dfargs) dfargs$df                             #20

    list(X = X, bhat = bhat, nbasis = nbasis, V = V,
         dffun = dffun, dfargs = dfargs, misc = list())
}

## ----results = "hide", warning = FALSE, message = FALSE-------------------------------------------
library("rsm")
example("rsm")   ### (output is not shown) ###

## -------------------------------------------------------------------------------------------------
emmeans(CR.rs2, ~ x1 * x2, mode = "coded",
        at = list(x1 = c(-1, 0, 1), x2 = c(-2, 2)))

## -------------------------------------------------------------------------------------------------
codings(CR.rs1)

## -------------------------------------------------------------------------------------------------
emmeans(CR.rs2, ~ Time * Temp, mode = "decoded",
        at = list(Time = c(80, 85, 90), Temp = c(165, 185)))
emmeans/inst/extdata/0000755000176200001440000000000014137062735014316 5ustar liggesusersemmeans/inst/extdata/cbpplist0000644000176200001440000004675214137062735016073 0ustar liggesusers
emmeans/inst/extdata/cbppsigma0000644000176200001440000003577114137062735016213 0ustar liggesusers
emmeans/inst/extdata/cbpppriorrglist0000644000176200001440000054034314137062735017467 0ustar liggesusers
'SGI"'''Fٮ&y<9DLvwkf]uňbOw|U;}_41|ewf48IrX Qϱ Z-cNpC \3yGl"WNBž6=4UxKWt1HNhFT}HOz,k0+B>U_~ɳ`d umnbږO mŗ L$DkbBc QֲJ 2&GUo:*Mߕ:2iDHk4!Q ;LL[vW?!Y훵 &|P9ǎzFrzNFwG1]Af?2<jFFx9ڀf3܉եh f'(=Z^K4K^9b÷ò1O.d_f<BǶϿC%"`~UfHg@ $6!{urin/t50j 7T;ݾ;W;0WdBF~I|TxBod<B_1<޼RɹNJa݇0kS1+j:Ͼ P{7E(z>sޕ  h\E :=ČCO5\2VU䞜0G Y!/eLפ) [(睻5ps̮u($_ M%w:K%zO1ӛH\~ ·~*'j-':ol+G~Էʭe;3&Uɪ 6f^LxS~宐$ţ`gR#sKrep'9EMO!E3E !ut7]J?>.5a^2 `sG&mNH7_p!Z0랿+V5Nx_+u!7Ѻe,M f`_w=hwO(9=#E3'8L˯.9aͥ7b`kUq0EosaZhK;rit9TPL` La}Fz˺TK!.>9474Ϭ:mn -FDWDaW9-ngqމՍ`włuh-T$ طci]A`%6iͶHY{SȖu0-D V 2Dpe/TL8p,)=Mφ$;@AH˩g[bEf};}H#_&Y~-"\P7ޫwV6#.p'jU4S2X2m0&TU}QzX6XJzd" 3\\KJVm J .)+;Q3!_41&D7\ڮ|| 0烇M<ҕ>^k|Dٱ[wupvUN84!DLVM9h zd/l;#{NC6K?O72C ¬1`uоʗ__/ـ'y]n[WL sӏ?ֈ 1mdUSۛ`g <@EL =Bfu2Wm+pWiE N4z@PHtGOr H3wi $ ᴣ$Lek%gowd؟/kS!sA;lD+B"8C6_ ɘǤj=`j,=ohQq =Lŗu¾h1Ikv0^ o.+᧷ |gCyH d}7>u W 2Jqc5H4}s]Z\dCձ[G)"Wb2e 4Zb!VŠmݳ^9Ǿ'"x1ʨ<KOmu>tz#nX6Z,^fMboPkEHD{V]+\DH0q%pBx1'h+lE #;YCg1|Jl^6?4Br*Jۍ w)G#m儴c |T[)BZnHw|? А,ö,q4uyߗ̾\|6#v~Y>ql4مiwzi΄rm3Gf4,]7nkjOAȵ'.򚠱6;3DcWG-U_Dl`F{B< H:$dtPV ?7A r\b޳xѬF"f|v1etzy@+ :̶ܑSO!MC-|iȡ4yk7J&jAb!=[&L|֕-ܻe4<3NO1 C)!08=^o: r,d٩×#uC [_9Dqb׸BC~@}']ķYm!?4e=H 9_\hV9exXer`uĨV5&6}%bFKq=S7!q;JZTQ>JiepiV*4iKBU&{NX5zyh)* LnY|82ÏS1SX45n|v<&v^#_V>VsB+L$],Ik< G7A%L`); &F%to6k%Tԍ F rbU w#;FF2~S1Ɛ#H6_!  3Q'Qg%h"u-ȨOa~ʮ@wP>ckJ)=xˇ^Ju:selڼb|7>{Q}&ۆ_Bb'gl$(PMNٸcz.|Y0wgE/Dxepc$YM6RيS`L HęUv^T#}vU~o0~*pl6޲ژ^s G2j /#>FAbuϋ1iC+m9uj|P%Af'8ވ$3eKM@` d MqKc6JFDyDILTML]Si妗GgdKCoi42+թx9%S=K+'CĘ; Mio]J~=it?0ǃGp&myv2w&=z:9wm9rZk^ &CkgKѢdٝu ѐenDy|,8殻{8D7V~]9 ^v^846!4;tM I% nUhxVsL00gf] d=5m: M~6&zyTQdݬM$l]P~JnɕA:LJoH'ea5y: Few.DVCH%jo%(Z+W,̆ru\W8>OZH|S|@)~#(m(_Ed0f39'#Un:#ٟՖآ츔Gzʸ|lC<4BZ;R>O Gl ~~c_\*^`+h:Zٳ)R{rܪ223?yB)ЀU!d[Mh3kIu Kv)@'/@f_2Ȏ4q(rc]RH_CJHFanh3! 8"j=x5F?3:ΎIuv JQa%]K#hVG53c*4bCcvg|/ N}14!X8lr҇F?E'YC:QQlb %~e `Xz:eV>ʊTELNeeH".T` dL/hxӜzdq+O AZFnZ[ - aNs4ۖvn<~Ծ76w_b?'&8I+bd{+LRm zWwg?pCP{Pt æ,N@iO5;0O7hq-jV܏~sXQRXYCW?Y`s#Iқ :NLzsT_c(>?Ϳ^͎ V3"|:"&QU%sG!,U;RiB٢~J`)腣I=;YgK4CPPhG3~кR-묏g?u{&|=MhdndN.ƴaZ^B͕˱ 7ۢΣbR}G=Zs. 
mk6;:Ao!cg-HN Bswx5RUF/56%o B[^A})$eez#}yMm˜cU[ c>?+NH@qw0>ۊ!͜E㚨-]]e)ta|Fzk[ `OliC{ E}`u]F#pE,hg_9YۉzDB1c ɂiwx-?`T;̜h9,Y=F&ˬÌ`in9QإM'*M$y0woӀ£o4umfFQŮ 7eߛ=߂v#@tVu/@y?:J#ޔ=;*hiIz_a^sRƄsr]`rF+*imF*+<>o-<⯩KeALw])Msa!ypǪ^̩|V1x{pO/Z dQ :&{\; K0JcQ7԰@%Ho][ޅr"4Nzu#CODASw?#kgJoР_3ԐnÆFi|h20b̜$,Ij'n)_UirE0)X&՟p7+ .3!ip ?xX`f<)bI k&Qޛ6:>-ТɂhYN| |334/K?>XTZ 2+TǹЄ%>LH| z0kmTq4n*w9+Om~ .UQ?evJ+` w" j`^lp%dI`} |Ȥe9P$3CywZ {[Ն} ɛu~Q.m)h I|֐C !vTmWn-q-9LV :۬!5nm͐^Y+_($1,Gvn4{#)/u4CLQJqGUYᓝH3ôV_Z\yo3`ͣw_?J)ch8)v)utݵ6M*ZwU/ @!,PX0S]{ (k=`ĝyI1L,|:Pg˪LN5~p7gܴdBː!Yhrln-u_62_2@ ԗ5%@CTUvSQ3 :Y\aL,ڋ`4^2'ZkP-@AQQv[HPt/ښ`baϕ_ٛFeh~ghч?ʦ Yq68H4eHN[!8.=ZXgu̴thqғ}i;}Ow Lٚ옉::T7ڂomuv.Uk V/!a-4&"D`@GG̹ZT' G8KC NF Awӭ64A0 ?~ )| ;_<@.V>QV-}jv}tA#I>MʐM~aeD#N_Wk1wM>f1AJw`@!}18;TM oćt 7\NO`Ak!-hmAAu 3≇_YK4_PFN὆6.fvCK^>qF+t c:&W3^V<ۃL$Vf?Ic LwSvN53@ D\ ρ?%Qv8w0O4YP~G>|u~4jթAu o ejwmߡ6^kld7]»歹N3݂OקQh W>oDuS;B鎖r0|X`E=whQ $XR@'lPhs`RӇcwGNB\ȵ?aУ.2ˍpB-~0d[ryC7{Hekc1|Dԫf\_JJdVf)$(^{ ߄t~!ܐ([&^WzhE[)3h`i3냛u'A֗]S>W9{iZ~Q|5S ]^"3*ʿ-~`VqUZ/ ('9v!z`t[H$(+f''X"]SKᖲރ^UrasZ`ݺ H_=Ӈ׹ʝlbZtPp` FtD oOaҩY1ERA9RǩFl{4."?X|.KPN{{5UŊк6;+.HЬfvqV!1;1ʹpL m^M~>ӴJn,_@,$AO`jy2M4ɤ]Dޙ/0/4xтQr͆Q <>smւ** Eg~Tcn˂YC1*o<$6;ʄW Xc*4yөnhْ4q #w--o po0@xEyx(v]d6Q)ޯuV}*n0dYclف&15xJ[H5 :~WyR $jK:0MK`Ru#$s]u%`ZhtQ 4,Qweў C 4zk&U A-*ӗgޓ$!2Z)A-4#eޡ'H^س ]Ot=b,`j|R-|[1e@|K3f왳k`塛藟hyL42; uaS ۂby0.\.&\IFSoqإ"Ͼ^[qE;T=\c pyn8N@kMAh,wİD^=}ʜdf+EP\7HlM7=ͦ>`A&NZ\$ENxI_O*//*5CiWa9&Ȥ^7cO^c?^||Pw)aDf4W굣es.NegԯWD,cRB5xΙ*6+dE f,YS X.hA0w֋8z~ 4 \fbZcYNJ/ 9 a}mTyfW|ŵQ4-\7XW){֎c$ǥ] +]groY"_[@i˹]}Dԫ{ ~UB+:P=}5@ !^ܓUNAz!뉞RR_v3z, E}ɮZ.@?A.x߰>ѳq:,^x_f18y&Է78s =6,qu٢rLbChPWsW|dʉݴ)+}Hr4+*O^c%OqH2֭tT/߼3Z…-WLO`yfuyeЂK**P4ZY`WSw!ZiZ fW;1cB`X8{9,+@46gI 0aFuTKAB2b3︵q#{~^|73d{9OI5l+6iI|"B &yD`{:(|L.%E[S{oޱmzƦ1-O9or( Z߂hg0!OV1?^Lp2NO:| ipi Ps WɦƄgS@4|&W3f҃1I\?Q0Z[f[ޠǁ?L%BT%Vo)FLhݻUbӨ+[pֻ ]n.G%JXJwtrlX8MTĈ;`R#Lڐ9A6hꌠqmu 3^1͐9Kl[e -Rncr7=P`NhwM5lgzcN쐓W:̄֏dVS<`+K3lhv6tg=gQAU_`Ajfr+U\I>Bw(TT|3m+t!W"0$>fOFcmi ;CPb9p7q*quudhħ5>cLL9J?bA'^m8iVwqOhF0 XP A}[ާS(t\4M@Y7cs1x.@3F hvaN[,x(ǂCiG,>[v%O: M`DlF~}.Ymv:;I5{HpalԇW!wT6&\~5rmqB,]/&>t@1}Cio4B[ _r?]߂fAry|ҤR&݆#OPY7AeC -ЗŅcBaԑYzY cNDF1ӘkE_kr'0UbHR8?h(F]Ę_ G0]Xf-δ Nt?^%BlS6\(WsHJJêj%DŽ3FJ}´o}hPϟ Wdyijk5$u<ɳk'ݷ}1fdwSVKw~A6@1rlfoLDEy>ТyѾ'`%CR>vw;j;^顉tj#5ukV$0q&a$q:P#ut"hsc׼}*R}SFXzi\WҼcrՃObIO; !G\Z7:).?h翘39g/&=%d I:4OF`z{5pnPn<9X\a$=ѻ[hZ6m ;D 8Ȍ_4z#rh?}U:& d2yY*2'ysZ\*Txr&?)ЍWt|`kև+1$ /頑FnhW=1)z=seAƾBƹ 3W\15쬙 YUSi,K$ }_9ba]P4fSA#u?mЬgIi:4Τ9!ϱq(Z݁8 1:0x뎯h )̞{Kʚ6[BXCvi_'ALP؎zvMgw[UV{h=[ *C\Є/e}f+C+"Sls CWIʮHbp^,H{*XF xWh?נ:O4N%-;3 nttV URQwT`crSN`/êQ?n@o7{r&]S %1jil_wuZYX ;~d ڹ,Q(f#}\fcVg`*;5[~N _F340nCzco_)ɼ=)[ط 8S\;T߳ ȟb4sDG&e%gCl“KWלB>'hFҏK" .*F横}@Wt!F]лwhDԩ~4O,oPo>pA{˞|]Y3;_< Д'“gL_aٍ09kRd@;3ł=MP@Y&'D_6^>C:^_32Dii"*+j ;͗u$37hWch1ȴOeP1f>ӫCEusN/~=S>*hq:/l8 Z;{HP|͊sTk`BON֢qҮR+yY EW5 6wEПqS`#ֻs^ Dmϳc Tlhxw򨇜w_\4* mIН;-* ]_LQ֋`y/.YТjW(3]fHokh~3q0=yV,zu;A ՙwǣ/fSAh~[<6l,l&Reh9)BZ{NEPg`a~Y '>]O6M/$=YPd%.e " oykelGR QLp!m 5 qf4-Gy?6# k}A|"2aQKS:/o7ޘ] ~ 4/H0@boDkWęq̘&%s_3'FgOM 3,+~u oay*a~8cD}Ib2Q#w\`$K.t]8>FK{ԃ_+Fje0%vL&0mx2ʤa].ϧbO_Y權Ȩ6 ږ24t8 j-ݗJ!La[f9oN8omkA^񷂓/5 g-Kn[/:.|0OVS0 d+ϥIUV.OduS+L<); sme69 9,A df5?; yrB_0^\ I1WG5Q=tGgȪEv9~KZкAx\B3$͙=㠔RLk ٖێwRaZ/D8mH0*,`lC.ftl^ j_Ԡai"}&G f4s)'!uh`VNa76T"07k1.b2z=&}nfOxEm?OPq3AI[ثi/gAAL^0 ,Z';l+w|*Ə ֗)RO#Ƚt'Y y@}>ƭdGqpdxmZ-`D ե&L273`5r5T7Uue G`A&7v͎7J f%{%K ڟOw1Mj˧r3$lx Zˮ1C%RF0ɑ!ïDiE/FQЯ|tT?GAfyO_;w2.@ڻ&!:S[:K*3@3A<7oƁWıffV뭥] W ߜ v4$qXeI*-b 6hPX=c2P O_v_T42\`=oWם4{uvEH>$F QEO߇etM)eZg9)I:xT4v LAҕ܄нr:4GGG4;i+.1Z kR($>y.yVuQ1-w=F t"n'#߷Mx8yqGZTQo$$aeiV4+V9{kUѤ.bR |g˼WB#3kyS&!yq׹ˈi z*ٌ) NYf;w'ϡ9g B_ 
"֚QsWF*i[*wn6.إbx5d/1XUY`MDWeWz+@T϶Ԗ^\'[>lXqX 1LiCZ5OϷe˕C!wZ+_/wzzqƮ<۫a#ؿ,(QwO @ cgFH]GF\XB# VWu^) t3Cj`[r4Q Lz* jao4fNJNMϽch3lcN'@k1.jL֕tŴm!odP&7}")KX<>}L,(8 I99/`2&WޡKks9FSէWL\uMp$Z}6 i`Yⶈ/Zqle0_;G*+[D{GgY틧."s`N5,!_]mtJ,}_uT'1ydKȟ5loyTG7O<|Qp?4~=/%)Nj@ZN/dA=WgeS},),FD#1q6X~w#6KМ'O~[V ]}X *R}\Xfx&޻t\/8r6 ̧pxe~jOk_ֻ(s }%5-,w9E+.wgn*"۹nİJ'q4%ӌ|ۇt4c+v %ͯ'y]gL|hYmYdb]Z)=?O2X|VR;Ѣl+`|{r~)#s gYl}*sc`r;G^z o a:_zg'G4$߯a`fwÁYښJ#oQ?c v=:[{Ujkemmeans/inst/extdata/cbpprglist0000644000176200001440000053721514137062735016427 0ustar liggesusers}xu[(,bII$bo"TH#p Op݁ENIk\ve;QXv;[{X7ofggw}|Λ޼ySvf5lb .[xb 5vX!{qw<ߎxk<}ȕk>HQjl<˦hCzݙG+&l)f\zЛgsI5pwkw'y*H}`'9>.X~޵S><ϼ׼.>_ѱkDZ5':W :___[9FhmGl)ePEc4UJ-T2MK$D"kcOwJ_P̗]|dxG㾼hqۤ5zw_̚__^|6l9j(ү{4};ɿ||Oɑ放jOO EnZ:2|>qCP zb\woxCX9Ph" ڵJ{RO =ĭYH؍b>](Z%bDL|{>i+ej²Eo  vlWe %1rG]ał+ |aAގ/\1me8u0םYЛ(,bAGPk7۱kA;7l0ۛۮ(OV|@'M'RښJ.3te{Jf*RP:_Фgt}n:l+SJ/$Y6,uڍ5G0*B,p썵a i}N}E40J;莡W8eEVH7xR>8V3Ϧ4cTf̵{1Jr|_w;&179)$byJi6`\jJf|.ydM0t;s.a[q (\Hxm9ٰɹ\f[p<0y6pY勼.b|156pP<ʷƆOS٭b @'9>Kl_FpxE~[Vew xF3N%]nI嵃 gt/<[4ڻn$^n7{RΡ^I2qd(2 ڌۘHbMS02Gx`k!xHN^oY}QaU$޽2 ?"4tɭB.Đr:DNG|# >"v8Esϵ*sbp8+rn\9]|p2"eDiNi NRnS"w;]<BZI lL߄\ R8Kp~x?V [6?f8M.RlzݠRHZM֔ΖexYj8ʻP+ղ_7sa jZ9<lytp,gRtC- ae;!ly:|LaʑmB9Nr◉Np:3 yx<[ys++ee[מYJ#yJ^ꩌ5e=.t]$]@aU"~=$_M aޒ  ZZ⹉HV_0P WqQ&'p79M#=񹁞R\pHC@/6ٟCf<{ Iϣf ѝA ?c>/V#C0me8!w6ўDq5!))EShN7C;FZ'Z&PW}śtQAU6VfW,(ՙ˧)kjXˣ?-E3ՙuȄP\{wʹa i2DlVFwc+H²ٸ\Er>Tbj7cϺږq;)dZ9$GҤE1$*h5-@_A_VR20ikȧ~¨$VjP6ثoIk0ӧX˵a,#r-L s=r=c'B+7}P,LAoh=^vR>}3GU:ԒWaK֨ns#֔cqb(Ѷ8u,qs{ .ŪAeI88lcmiv_R ܐ|VsYݧTY@1: \sR>G/NQrF>AO1DW׍!bɡ m@{;n!QXQ?#ۿ57;չtDp#6Mlju6U[\3bƍh&ķT4+h. Q)$F΢)tGS(ǝsxy6xyr--*W (;OB/߮,xudmA_j_v9Ou9KEbX{#3Y}Q>\1ya;gԦVYW>q_| +&S9h>vt9\iY4dM@ٛ@l)c.n~ 1p#x3YϯZe%Ɇr !騥}-K &!;ճʷm孞=[w|MwJbs6iNxL;SڡNq OBEϰ6(b8&SkFo 6)fz:u"o#ujB6a3!cx4 _hVaF[ZuQm|@-Ϩ BjK9-۟AΥo :Ӊi8‡$ Fz=.LR= z8bH ߒ< hj|r%Gezy|ߨSh1k4Vr؏-֮̆Mj 9ZaZʍt&tQAIzB[UK{L06]?+>@5Q#68iBs$"O+t>Ⱦb1h|^ ~Ub Bwb504*5Tk|&]tʈkĝ|2cx{#_=5*sM\y깵`:`ǜ z9>8æ͹|_\e6&VH]֔9,9j2NJ6g”ѕ?j7PLuڴ}%OGDnjMNl(m8!EA4M#뻏E\R8hcQXA-ɛtIs$*GR :NiZD5;׭eBa`={QC{K>aй(eMęt@OP꘼!J)aX9܃0~,a? Ia)WX')* 6Xpe!/,ws_C]sݙ܁‚+tdelھ}ӺPaSۮ(O%sV =oG#ʦ BṊ6J>ɘ@Q+!u3ږ-pᢅ . |^z"aıLӑcB/F!#4ٝ6S}\@BL\ǁ7V"d)N XHPKة(ZvN`8ÖeI5WqoeS,ᢸTUlow`+P];l7 iZyDCГ9WQ.f|ȳlީ3[2JBCP'jbo*,?SznY.x.2ɫ,3L05m ^(~9tAIQY)޷B_ww:4P̱rh >m(N=NSD|E6WJfKXm|hpPOy>5y"sF5|5ZS~*I5@2t زUVˠ:ohOIEɷ ,}-,D9ˆA2S' n xT=_8}:ZH6L, < h&hRZ9@'@K G6@ˡI7I9R풹"Mh3ofiπNNnHLްꄓ:&1hѺYILC1IiPfzMHL; ]҈ .!J$H|Φ(sGʆT4343x; bM ;ahu+ 6`uUK:'*M$:#hl,-TL|w}j(.41<5G\5WLnNڔ K3yqjVyfa\sJ2Fijz82;߅v-e>^Kw(5H9xw(q^\0=j25Н>!@_/+@W*<TBXR'DibǾ^I׉NNi l;q@H^+!LYGeuN^I'֌t-ņ[4]f䢽Ѥ֡T@T?K9vd=h%r?.*U&Jx%ZŢlX(-,/? kH9]#PyץX'jTAFcZY{iwzK4+0q֤֨Ds `Z`:Lވmق9C$fjzֻJP@2# lk҄BvMhzymۼe/UÙ͸Ō ~D|? 
= :Í+ u:s-j\HK^/W6STʌ]8/wТ4$]Anصl-U3eg?Ay=F;E"7^ەVSesuRi )Œ- k%}ZRqM Nck o-J%Uz{ӷY6o$|co>W~6;=/A>wdaMI0_ך~t(ݕ Ynoϔ ݭC9F)mv68c_};/d]>UK!֯K{djOF ,]V9pCQtգIԠPk|*nhBj+'t-{w5,v}S#ֆH ?GCmB;tmpb } X\OF.׷٘o:u}e5dhEíwɦ[ySkXT+cTWhܭ*@%-S{T%Zdb і3yfa6_|ޯ6# 1Щ裫$; J+SS48P+ M+FShDZf"ɠ:Ĺ XtܒGim=CnvM 0{mjL}Z9›rݼe͆`?VljRg-Uqĝ337 'fMbKkV:5h ;/w!.I4 EǙ oJq|K ,m,(ׂWg;\;`s/xfx6+mɴŊl I%L%]q )0@?~V#c+)-'ULUL?kIu6z2EDi6S Ot mbKlMM$N /< ~mqѾ†ڰhwco!U29j^[@tmfhd媙Ox KL78ؘN43zvkuuxpt6sĎ?AG;{*n]E&y>};An4ݔy@Ѽex1vy$׶5# $>J8ǘ.+Jr;qWqդ2j ׀W^~-}Kf0pPv}`]~-}D*])[( 'lx N4o4X老"֏ C{I&}m뺦Aз0d|uC~BNs~x~mNю_ Vnnj.b4(44{'ߏB r1 ?N>r OR>EJ- 9J|'Lvm6ٻVpR N_^3Dک O3W|}S#ּ6MTؖc} @z!GvO\~cp].yx}/|}FsA'$/ҷ|5%}|8pm?2w%4/E߲_i4Ey~NXڀry`"ߟ $Ÿ acb܏t/gW4=X{еj>VC';ByXګ<$[9 4VW6p¡cG^Z̈́R~m_,h۹%dÿ1=XAY4oƓLo%׾O&1(^ IYnl~ROV ݚB>\'Yl_ה?t6ךIuj4tinY=氿gn?;]#y<}n2j%qkۃn mT ?^\cҭ=뤳 Lɺ|2*Ŕ{n\'l+ ͬ\ho]DiW*E7b\N;̓p|wi,͉P·u/W7+un^O/xJtPnd3˞[<9;խ)tpE&zs;Qnhg@2?}CuJ7@Z/Rx_Ki۞C侁Ҥ)Oioiou;g:I6:@l_{*uy@z d y~nQ*QÑ鞎t2o䡛#ә*t^TԘzͥr#> R Mwui}Fj!EUOѐ)ѡN+K27k0~U/ .Larb5^9; 7T/3xBVS]A dYfy*R"ܗ^xgWەtxSv&y6F*[H䊩RSs L*ڟ.dۉP.o-F=׽3roF}T ZhvwOTFC]9#絟V"ꓶ^qʩG +995߮'u^حs~Q\^f)v@},&@4I pQ$Lȵc" r9Zjl΁V쒞;:.:s}=#|´l{W2n3^]3SmsShQ^9*eB3x0[mV44~+AHs>5W&7ˬ;UPYߜ5ϕsji@ GLqYCW=% 鲘"] R-F+* Q "8`Z1W.3 VK]/ϋ"nRō頩애/laWnR!B)^k SǵL{B]kà(_o;VͽXzũOh BQ_P!)in;,e䇧X Q'7"?BpWY0^8[n6va$|,o$%G·ڃ]_}[S|-Jbi1yF-}ټ=JNŃ"~ X향{BJ'Wi b u|oskȃ5 5 Z`u}NnUv!kVAeՒskO%dBpV({]~n3v^ӎNveS,/T;e|u-U#]`MᯕX'I=wܴ!s?#rbʤVKNU ^/OWXUkQ,=Qu}:s'[~XW}9 K0oAցQ^H(=f20xrEN+HWr/Қqi\lKJsI{>gGIjY{5rڐ_k.&cq,g6ׇqSkC0WCt=Э~Ϋ؉)]>U(7](f.RU|)Mdͼ.,? 0[C 8B 4D=-?@Iȵ;nt M 9 :π,␋?$~bj:U>Q|L7(ʰlUfuքg &t|̘]Tq9gQ;Q 97W9^իۆ'\rXN:~&˟v]y>ge|srqetedweRe•=NHY#y*M{roLyY;Ok7}2{|r?'~07ЯƩr:_5[Fv{IP|xn=t`7\_WGet~.c='GC|\Ke;R|nmD}ExxVI]e B^v_fM)n7F}Hfiʵۆfm0G4u:G~ߕz_=Gɨxehd +g@(ǿ^Q@V@tDT8H5n'!iUԀW+xf#WL7ӁS) *5S.i]yLV^Ui?[ /d_ )_UXc,D1|Wk2H[|lt`Zm JXrC+s@&zvEfHzj wFafʨQ-䔾 !N2k*%|դ(?EP.;< rϚJXǤ?|㍿*6 )#R[֨4& *H> ֤o;)]ƪ3Tu{1Ɣ~ON4ems]|\# 3+ǯv\"x)j[dԟ[@8`~[![4 +V4 N`|x1 h4ǯTF#=|#~f&0;BLSlL1 NN%S,]r12>ܖ4 Ї;"nCifG@^%[[P_g,q_Х(1jYNEm,h\=t-ZDn1gT-r_';ϿgG8)*עHw(HdLq+7bY9L`iTOӇ } G}L?NXAkkHw*3)At gQx.s)Cl ׳|p\9Ͽ ?8 _N^ ~(\xϢK)V5:_Z}k[Q ou(;!އ;aPp=.&/8@MnCw9WqOL褐ctE8Qx?RSxf!M=D:1PW-:5Z˝kns}y+J+╋[_qʥ|\PޗQd0_A&Qϫ{UC:S>^F׼D}c^ uC^JhsWMuK^r*8|#莊.wEc9¿q DR JLρo!AM!: :wP' x:rl[[iOn_3@fZw0=]}l orK/-E"S^`[(}P~Y+[N{g>^E~ZCISO>IQi ?Fg(,mU~q-j9}!V趹Wz,;_>Kᗣ U7"PU ?*&i:(ܺN7)6q R= +~@~<7Ϡ g7Do"C 7AS3)O#xӟZOM4?vdh{R'γ{REeN_ ,E@xWK0 ewJwV8jyOP78}KV3O?<ۏ)٠oh#ht(;6:Іv3Y, 7_o\yu VcVO?V3#$nÚzy`!?KΥ[ y?RQ %< g i6˙'w%Zt$q*p&G7+ {0x(65 |:5|c>g҅S ,BD/; |ؠ\* KOGq0Z:6i` %bv9e|-jcҒ<ʽ!_k+/|RŗA}42- P4!#f0.fLf3as p18Ma)%[H|G Va-:"~= Wa3o+- .!"JWBkhJ޸n:\OpcLA;  t@΅@xua0( n%ow1;{~׎<>3x$^1"x9WAP_EJ^M ^Kz LiH7fo od O!>3@DiD=x2>B !"4Ţ"q3~TgB_k|_bEgwba"+ J5 |:1z !۷|@<W ~Q~SBޟ?g ~@AG< N[ԷF#W '/ aSt ^YJT`mf l5 .e n#ءe!=W26#31 2@r-ai "V#u|G]ۜ-Op:mj\k#eڋڢLzB|;Jv{+#m3\|09Sk{4%=ᙞC|iBzneYA*Xͮ_K ;ܶ+|_{/1Q( "(Y lB/na 9|0j>ͤEuj<W!=}]]zO$J*\I01 _j~}[iBIlӤQzHn:UQew'G @>cN9D{ZR+TQ3Ń/vtRU;y|ǓhCV^L,wi3o& F{OWNy>D+^)-q6{Qϼ3ܯ`/xyVP=?G|3NkxWֲr79RRpB:KESj⍕~R#O:5K1$dfs7g^+x2d*E>= m۵dTyw(, uu6Fy?}5dru]6i<ڸUm'MpC 4zW+Gr~`.۹k5UoHZ%%mD=/Zl%g򯞤{rmh1AحU,UF[9#dӸTVL,VbT*fsޖ&5@~VysEYIY>s$^N%[\+]KXb L%9j̇VU8-k@ƪa_Ʒ YYMyqmΙ\Nnt~J[v_PNx%FʠW*VSoZYZ=j ւ5lِ{Fw%^*KkņoNlUDMXb^vLec$+pIVsLWb^˫eY䌱ZY0,Iد~T˛EZ\+lj Meۯdubeu*iuCb߰vlV(5ݔ7ksΈ4DiNtuuuUHE쾠%T-AY5f6aoT1z@N03)&ǔah(ähFչ_Qek5Jg7ĩv6iYH 53w@P+,WVrfZO_k1_~+ X۬eh;p]f*IVnEW=?XES,iH Xӌcb9Mz6l(~&J\nT+Yˤ Uqe}pN̯#f 4U0 Amyԇ\&${Ԁ}9՗ƍBt Ue_OkdtqTL[x T}=(`eFRtKڙ;| ە&yY+n+9C-VWnHb%%mD=/bl%gŒ^gZr%حn\2^e3‚*:Lb4k$3XA_^@|d"N56v,K[e*Аn|4dGMqkәhuWӁ4T90*oꞙӭbnsd|~2љ; X&Z^BK%Y&v&UMVbTݪMibIv(/zZFn|VRE*\BSREqϭIW@_iC.` 
ߏߴ?RblnfR>&HNߔi/Λ60mn٩lvt!UTh*ۙ-)%*)}ŕRsfX4&;Ylޮb*K{5/ceu[G9-!SZX~ѕ;6nڼuǦny95tCgʈV1gW~զrw9 "k; I5U|SLfDɌfDQVr_u꘍D_=$F0oWG0jgPk`q7#4L_t|UEE%׹\'rLv]ZaWR0/B80 HIFNB1^gCWf7\ӆ. j.I>+W,!2B#:e EէN$>ai"yxsLrssR})?Oa3pEԖh(?ZPk煭=#A{h8XtU5[d 2,jF^ɓ o )+ԕ\uDoï!!,uLd~gQsFb򳲢4[ ^j-ҳApc1ZP6f:=XCn,ak z-1砐3LCjPMsn,:2#Z`-R uAܙܙA$5$g~(ՋF\6osP1NHzaXeEz5zVr֟F"S#*ZOFFsl{\lQw%Zdws+Qd$(lqrudqp|3>b anKMkMyt4eJ]Ae :9N-!!`x?``,8 6~@ofp8`U/*+dc wdA)gɷRV"WjY LQlۨKbDg6CKWO.NK |+-vkpnH8+:S%\:֑:R],ZF)Z_vVA;6}~XF37u_[ڐ%;gNyuV Ά.I"!=!d NDw՜s\b !xh)cm[w?v]X{i:FObx*[8?yFYw٘oby3I?F "No2/ЕA}H`0%>3]ne͙ѲכSe&Ë~Fb$:_j_.CƟb2PxҶ1]LGO a 7tgQa?6Zr_[$jd챕_sΡU-(:n%=IQW>Χvއ-&U̥^nTg.ڵܺ xP&7n7OWt2TO豽]}LG =Uh( y[w4"(6A/C҅]K5ۺ3uvOTe:J]`2Z u\VeJ0f)<ֺL<՛WbuK}o]x&3YV},u줥wc^3cOV=ݡʧWДbWSܨ]h;gL[W\k)˄j\MXu; z(SY+w ۞we4U(ma Z><"Ix+sD}X apξ#"mfze+z{]E7c]Ŭ Pʹg݌f@DȰmB_l*;]gEֆQA(N:}o[u@v./\m3ȡ5Gdՙy k[<|Yֱ]>NxE妻l4{vֺXcٕIwe:Ҵ`5,쨞\OH;8s?FI;@>׫Sܖɤ~V91K=Е=6e1tRKBKNnX7g6ٮt{FÝ]fMB[@0-=6ԡ-ˆra܂ R]whg^B޵T2PEB[RWE&4Pq֧PJJ{nUGC/z'a7 V^5%޼N򆀞5틾xeM`&lCy&`ڲB3!j 8IM)\sd'k4+ q.L6κ!pL wwk^'*RܲF؃PZuNHr-?a1LZp3n~ҍioEn*pvKg#Z?{%)i6%3qC8fÍ8[RJ#[p,>; o })Fm WCqω$'w:3Co 'x|>_ر0{*Bl ~.vu9T{M̖ۻΉsxo,2Zձl!O9tn 9bgEu.[otڦp)5)iY^J-;|w5VjFً+kk8=D?H;JnMtN*>阗_Nrcݩ#Yc!u {v }buG*v;\.W!Yn9"Lf_2d|@!> e7cr6@j3*/ MbD &0=!e$Qc Ɨc 0Ho|sAWЇ]PE2}FO3iTJrnw$q発U:öNHz3\Bp?R=mZaTau|z1{j]G2Ԯ3+6Xl`Nt\roSaO)~z1e6ujۧ@=ktjԣʵ%Rg҅6Q"Z,7Uty43ݔUpnPlρf+L"!/'Y)'-a4>TTϒd:6gmu=WОvJZבRpyW_\+uO;B"Kܶ[)S *5eZDz8 'iCmyn#S G n[v  hcPsJ_%]TN)(.O=iVRB.zhu;1ޥЪ9CЍagA pU{pW߆0z8)nKDH3E|\|2ɬ|`ɬ|@Fl|?J^W/3~t:U[Og?ψrpKJ:."!j2"ly@]¾/vçP=*=(edu nW2,JIhӷ [Vb5RG~L|j8)x3-2 K6%cnLtSa$QMnɓ_pj&&t{ao-FWCjD9YeÎ=id!חolPoߙ1:-\NuښU3b w{5d>'Q&x ĤZS&[~ 3U(cYa(_[*Nl8H=8\5tRBX'{H3M&qE@Ʒ^@U" 9P-nYHЯTg񥱐J1܅T:Fo-yoᵦ~}{L'ޤ:O__V[oׂ|栕ZXV0ߓA^Le{qoy׫P,A ƍ 'P_>WbWhb Jq>ij o:77mE75ѐBxQ{G֧OX]ꝲej.$(ri`ZN_{IN.yQ\-*71vc8}ﻛ:lRZ@4- >VE=e5 Ze]ni2TV6O"r=GF9p`\G:7GG1D{4KAiS~Oq yid&vx^, ηY!6Ŕv u _TFiY,ߜr‹(\E|VhdYC%$:>\o[||ZFhn"mtZn%K p?So6F{_F^NEtv +)«nDy גx=nTzl!s;x2M'.h&f-!q,Ek|>,Vnunw֙^ o!9uYHa7C >p+˶Nvq'w21=8߭p#oOs?>LI0M/Ǡ?$._ٻi?6iFOtG1sP6uPu,x*^+ Mϔee[UM >JCTU_\frt[.t{fgX.%UӆizvrwS`i&IJfőZˠyX~EW6guoKDu .8VjY0M(:WPܨx aGGH}ɳ1rjuX(wjqR.J@5X٫TnR7}YyjXvmBkLlOGfoMĤ*JvR''rNߜY3pFN ~h~l?ǠzP+?un}{aht?].&yg%LN*,WֶNy }kˉklyg5́H&a%CQέ_6.n1_r.y#ó6d!1/Ub)Li>6cc4[j=D hr| +b9XʰXD%ry sGƥ1u 2>_Ajg{XtncGM>gUY/TDԐ\708-|hg'}<+AomFʿHx| #j!.HS>-iGmߏ֭X5 L֤UޕkY7vU!GKZ-~`\k\ Ngj4)y,@`9Kb&k2 _\I./}me[y#:SNyWHvD0M~>5`9sYMy_A>+ҽnjvK=;u=Eu~"W} A9)D!sn g[<٢*>ttyE3b5rYlyn3="k*O PI-!1sk)KnN8xnI( 'u?hDEkJg9EoĔ/7EG )\GpdENR\jMjo iBY2Ԧفrח۫%8ϥA!CmRN+=2V`ep=o4A=Ħ;GC\ƣٹۅd⡟KK7{K.N:QMJ0cboE˃U7jcJJ#`u8򭮰|k5ukS*Xp>'Ze"#WPGDǺ?q֓]FY#gL йr$Sex[s1 S`| R".e VĀ!L D>` . Θ!t;&0 aH D; ͠!àS.`p@кI B p( 06'"!=Gholh:Ѫ؜|_ Zm"O\h.m KIa|1{U#U\ƫU+iU7G[hKXƂwG8jCp$)6~"w{+'"܋HgVtXth#Z?ղu|#m<܃)-DK,!Υp2 1'?|:s%ryy <\k硠=ӰQλRyo D~YMλR9>q~#8\NTBvheWfyzNn V. ]!7F!ETYL2tj‹R왉8xRio,DިȆAM{A\Y}[ypeP:#?6N9.l9.@!] 
X/1 )^}'gcgtW(>өUNNP?r 6h^6TZ$[Q &9sΟP Bo3JMFZf%MZ N`h9޷xv#}IDjhAgW 9OYR1 `>:a0;A1ѻ ^Ӈ2\>*Lr=T_OO;K )}Pΐ<P+Sۘ J?fqi :]M1):NOF+Nif}f?/|?(R' 0W>TPkPS9,jl|ͱKdϙ+S3)2ީ\nDJh%(,Ȼ77yr%=zuHTl`q2xx0 A @68΂np A& '$B;٘ ޹ ~W 琀wzS[# y4 w8Zt{Zԝ#%p|x90\Ļ>0/o=p/̕L`:a|^޾e /o1kK,z'/"o8􈰏Wk%jW&TTܾa{(.{ 8N.M2ݩu;N;ayk{5K߽yW T ܇HoS/OL7{tf.#w|<۩=Gzre"d?;?r-cr5k5ezMٿdmvB 5̣ߛXcZo^qͷX<'wq{{+K v+wQ>C{)|ҼK<>hخAv)p"ahG>l R'p~= OOgᣚ|~Q1J ')QPY ?GƧIf( ޡ|_|}%;?2_,11H r{9+nk<[M ڝoSawh2&tOzkj0lR@D~B~LxRD?VDo[i逜)u\*b_̃|=uN^;Uʩk*gB7gLm6;kՋCOݫ)g Oewoon+5c_9jzh, tr*GT9PGSSǻʩS|PC9&8S ?WNT+Qf!@NMȨ*xCwPpD"49^Mc3$|!5^>;?}.:`S|ɅBT+GV5SUw䤮௴ג@'r(,Tm`,j7iAE3yl,2wt :C]"?WKcȋ1OWb,_%/ɿ|d̗c :V1r̗?eoqHwEʑr'&l>juC3Gz/*9^)vl{֘OX}P/C^7Q{Q7}&A!b0w|V}%= O=,c||k'[MCmtA\g/{O,=ι)]G.SHaȱCd9]R?y ܟry/s澸 0._뾰;$WnjJەmO3!u̮sW .^Ssv%V^#ݫ2#ߣ_ FK+E]jh]'ykؑZ wc`ޥskꍠXɔ|_CWۥxk!7~n?Jwe;J-tJNXlNR(hHt M=Ow kt:&05;ccg6_(vg1c ZiYf2%FZnɺ#fpEI#WW>NcRx.Q !OY ec@Dh+ZieTdxK; ߦWe^({~UZϬ!0 xߕGxl_TofZusu,[R6|Dlv?k2wЋ.wE?\c`l!V[2~ǝQ[=`\'jr\ߕ,鳯 SFujL% uՠͦ,}7 fu?gwt~0~!af{ >*]23=!Zj9rNL'_ Oɽ'{,S9 E^%BF繴Q%"ڍ.DX7؂ϸ%…,wqJBzT܋nbTAy3#̖hr넠w+[?ˢa>N"~C׽OL\25{@}Plj } BDq.R .  3XՃ|gQ&Ӂ(z/eC]gkʢh|P-.rmW/j("ٓRu6N$_K|ǒӤJEztmZM9C%Z4 EKT+< N_w>xY,M]p6z1d|(>&˯lqHj_(?ʵ"f>VT0Ni?5|r.r+"Mz51^'L  *O?7Y͟rvY3X ޭinp3xa‹)B%^JVG/Ii$dd*Yj.2 wQxOwaw D _c7W19$nۿqAw1y/qAJ %NBZR*۵Нx*Z:WWUs7[5zL3,:M: I~9C}%?x#7PCey~5f=kܳkSAmrZvq뛸)IZҢׯy0ΕjSXN(oX ?DOy=e+{:MDym$ 6Ҫ"3 & (^}Ρz C<ǯjK+Qpoh%\V+a ܷPA/Rx£Lmŭ3px<ۊ8v*WnHSx(g EMwr|05n rǢ0H.A|‰|oL;z1գAO7h'[F&>N=MS+G>+R¯SBEw/w5V0iȟ S0Qp{w?\ }a Z 鞌hA4`f23abpC483wU#c {BE ,adw2.kr՝*5X`}lR`>f RAnd @ϗT`W\` ~1Q`?H ! D1z ! D(28B oep[p"1M pap',&>>g{?r\`DPZ1 3<^U Fm>3xcDx\ O0Z< >`.|$"0D fx"¿2c§" UiD5O|1 cw jT@<( Ydr9VS,@9?2ɘ#B]A ?㼤X\mltFw X5 %E ug7Q•ǹh⹕hfsq =w:,I@}b-)%k;_D[kAq?Zlzfp-u>!2~kwr-zK5ʵ*\u'RmcrzFINFYWw6=<]k[e= וJפב`κ/%WGP{(\Ht¯>KW~krx̤\%\v]Q>NW^[ 'x|.<@caT(M3ms؁虯MZ1yC=VĻ'NxԤ4FIγCyvO`7;KmIsĵr:=p[< qs+-zlFizCb ސqH!+S0PPE@aL,&1E'Sbyh:GjIJzȴ ,:?xf;||:#r#Yj.mx,x&3]-țNR(9fgCQ oxH ȧhK t_ 3h2@ b^F2cˀ &_%20?[\WsUSEBu\ia}w4W~ǽbkŕ [ja.Ԍoi @xb v+bDׂ j.نKH(M`OmMCGb%$,sY~%lHcnzm|C8\DYJ>ɳ. <_NV0+HuU5_j81zNWG7^7^$C a]s:JGIg[Hw+%,o;㿓_N/#+Xޮ`y҆Sګ(5TFsWB%ZWSy/ $ o372iZ)n;%įTFim+UMyQFꝲj{2YeuRYVw~?TVʛ;-üz_%%Z#n;vPVc;TVCW;TVCbʫw9}N1_ e;ŜdfsyIJ=bN1s'GS̩F~d'*s: 0@Od3cdTs̀!p 8M^)qx^@/d |;ⶏA B}J6@Mz7?ҝG3yEL ;nEGv8D!r 8$y#(uF+z5ɸ謵):;eŞ&YbtPn^=MHro&Y0nauSv{ ?e(e6:?^^AyMᕔ]*g#a-Iq ½DcѰS:F Nźtׇ0CħƝd]E40-!̈́Er6^wBU#ynMyʨH9Je7aߎ޾#)V"=YP#1p$ӝ M^y㤻6w0=^w>vޕ{An?(!^f$ xS'I7/&ٛ,Pm;W}l D֍SEW*^wDZkקjla  kF1ŲXcZo^qͷX<'wq{{+K v+wQ>C{)|ҼK<>hخAv)p"ahG>l RK38r?=Ijq E('_Fi>Cg)'&Ox}Eϗ>~}̳LX0o#|lMkv瘮wl7?hwMiAڣ>?PR>MxI5? 1JɫDi_q_}kGj{#I)u 1C_cʱA߅o1AeNw)ɫ[9^;z =$>l;"">"j;"7RƏ}DnohGƘ)GCȍ$Nΐ#rQB9h`Ŏ)GC)G}D|Dn_#rN#KȨ*x#rSCD&ǫ}:M}t9>H $O>EX#yX}G䣃?'5 ?!}$O9RAHZON ur͏60I DO <6tsp!.+Wޥ1E|ڧc 1/$.CTWztAg|1rЗ?<_HwE9]m'&,+٦U2)/k^L!DeAz礛>ݠSڑrvNߴ0_]hS$f{SOΩ/ v9xu[{w2\9_z2]މ'b/k`*r0gv%lܐ{`l7svsۍod:=;$WJu^z|5D:zb#U+'n.F&WeGIu-O{e娝v):3[3/ǎ}E>]Ʈ֗X ɽK0WxQ4~9j|w!7~.< Jwe;JUt9WJ`6'Pp)4$tf;y⵸ly:v[p, ]D}P6uBQE:M̙?5P'-fRe|&!oi dt|:9f9_P3龸t̖X?"&AG. Ato v9^OOQcYſ3t/*c`V2~Q=`\'jR?rP풪7^`2B!B_[yOMACYQ:)~_SAStMX3.D |'e"]/%I;\T 5|-"ݥ6G W'/;[l>/9`cCƄ⃛H'T0ƑZ3+*8T/[TvJszlXҫn-w a*U&VWPn}wE; ".Z.p#x`(.Y3[lKb7!WFbT&XyγJy."s싐v@_T>vӇ[_vtG!QOSx(ѧ_#ǯd«\v5 #*4b:5^^4==<} ȳЃh jAv0-dE^W{ E~U[<;E_y>yu^wfrJr9;,F ota7aS k0@ݑ{ ޳QA0EmrZ^q[3WZҢׯyg0ƎjSXN(oX ?DOy=e+{:MDym$ 6Ҫ"3 & (^}Ρ^Hy_Ֆ8V<JrsWV¢0@oyx_0G>A1ǐ>QVy)C1<܎B v ow:,AhrUPnƷݭWnX9(O8#{P/z4|ri |v[)׵WPqq%"(|~IuQ%Aڱ! "9 S^a ! +8zd/ _~sɘz?YA9NoNAF0h&h-Eq"L s`:V0a0@<[y <BE VRk .$+,6{jXc &lf@<ʷ vdp e x`B*[&>ǀ"m{t2 ȆB @%ȇ@j/w0.;!Mw{7  e 乏YϠ2<T{VG ( 2/W2xUDx_@:o0 Womx3io!;D oc 3x+x~'wG0KAB&! 
>xe?a >EE/1x6|Z7|_'F :]A{eZUР0$*HNEھ; W;rGӊOO܏hs=颣NBD~}~HVҁqH ,ӛOA3سN {%񝑌V!iA*!f 암=96I<ߵT_Bw!Jǒn"70Tj"]8cٙD(Lbh&]gsI$G"G@?>jۈ~R.gdhELTHIe^J:+L2[Kl^D:$"乒v9ѼѼ_PPPyl+OA{mmo}TO}hd8]λRY 4qfyzNn V. ]!xG߇A(?<9dd"Մuۼ'K#mEK4Q7*^ދj`_q_ޜzyL7DQ/Grt![?Wꥨ!(t{$r_#d|eM>{zlswU/Hr+~M4 Gcx^ތ_ @n&!ZuSce|&1Bg\:K'ɜ qa5*r7zTL;ࡢT'9pOR \S$9H@=bLmc2mس*9-5 ؆J fׇO7q}/ 0׹G`,j=_+2b}nb9+S3)2ޥMŃDs!́xolR U|,NOT\y`udHkfΏ5'jB9>k)r\p]>L[Rj .?Qr>5Տ xu٢/V:H\-ɧLGD- rzU1!QisYN_,%_XYgsPw8eU^([ ~INa}ڕ5 V r Z⹆!{\sL7ǹ4RLp#(l-U-NH'q3^Oq8ȧz((>MN\Y0 ^b/c؟*~5ׯS[EuKYNo!8a,>dְ Np"8|V0La©Np:(gAe}Ϥp O_aEod39j)dϽpiLǜ#T:ǜņmohWn3,DgRٽ@suo'5r:=-Αyx)\ķ )\I2[ȣnu!W00>p5Q5Q|^3P_rÏVK}x]G!'ez7zD؊+}AlqN+MawIqn_F1~MfW:`1a+Z Ġ@`$h&c xg+08pD; `:D;aOEmqAlG DF( q񼘁_@/%1[g/ )V2PLX%zZQM ^o K@U\j#2弗U:Dz4#J:&~Bv񾃁-@M'wp}$)20գnt+pr&[LX3n+2HǞ{ٿo;k{I:#6t_Wy'Sv JBwݘk~IxC oO3z8x7X 7L':M\Xd| /\O$y[nq$O"܉ޮYDw,xqtw@>M'z7?ҝG3y=kr!-&<_J17a(8;/g$+vSU uFwvz5䝝Hg]dOkݛXgАFro>'$fe /+]tƎ ̆2 n ">=GZZ֓DJ{-{;Gus7S)!|:(~;;ckuXS: v7jn {W;VkK[y!*Duxv7aDz{ ^$oI 3j:{3ʻmjT\Erx9ѲucTdџp1鵚@26F!FGPGߛXcZo^qͷX<'wq{{+K v+wQ>C{)|ҼK<>hخAv)p"ahG>l R{R9V{@d$G58cկOR4~O 'P@C@"Kv~ re >Ycc7ڥ_^w¿x|͎5&uη);H{4xu~B:ߧ =|CI5O6 ?t?t?&_)yEp v-WۮޗZ0~6Μ?ceO<}=oO?75U^;Uʮ᫭:нkiyį;E>+ Z<(3T~e| Ƙ߯W\ 'H>v}UT}+ü;}UU9]arh,rzʮ|uoS*e|Wiyc0)r<:Ug^L˻Oվ}t9># :XIY`vyݯg\H/dI90~>w_=P )zrRWOWBKٗ60I DO <6t_p!.+Wޥ1E|eڧc 1њ9?$v_R=h߰zJOo*_e_/ }1NL9d?T|~x_AO:NŶ/kٙC^7Q{/I7}&A!b0o|V}%=O=,;c||k%[MCJ;7r==iNp/g\n.c/nƷVKWo/ߕR{<:W_ I'zLtt-O{e娝v):3`Gj\}]/׃={ `7Jӷ(]_We#9I#uAM 1t 6Ow- kqt:-&04e^ﴃӻY%Z#mꄺ#㫋't3^kN~[ ? 2m:|U>E/rt_r:f}X]uDME5V9^]|AW"r<]P|L0U;cA߻;g-} wxB?+;NAKz{k" GZ 9 |~4䛕rޤd?P {Gkr>h\|/FGų/Ep/;2qRqb=A_wOF繴#`F^F /q9*np._CTM1i<'sK4s uBmvһntϭ]a G|KX'~?ơ^ "n= ޏR|6x:B b xC yDI|!TGARV?@URV’]aт->8yx79߫7?wEcZ|^I_: W/[M >KbciG%"N:6-&t衒|-"ݥ݌G W'/;?l>/9`cCƄ⃛B'T0՛K\+bcEv/[ ,eܠrI:{R1VwEL[]A)j˽K\]SNrkÍn: Sx1[(K)5-8};dLzLe2G=<*t]/BޡI^._ɷy.}Sv@=Pcepcg<}eAwD8~G!>.2 wQxOw1G _c7W19$ۿ}w1y/qYAJL4BLLUkYZ+x*Z:WWUsZwL3,:M: ;3~9C}%|#7PCey~5f=kƻQAmrZ^q[3WZҢׯyg0ƎjSXN(oX ?DOy=e+{:MDym$ 6Ҫ"3 & (^}Ρ^Hy_Ֆ8V<JrsWV¢0@oyx_0G>!}ӷ:tߴCq SnGrPx;Qx;X4t*(w[C+|,tr'˽S=d9M|4|vV >A;dt~o LC+8|}>@$:(Ttǒ Xw\ho_0Gʿw2E;ڐɘz?YA9NoNAF0h&h-!88LPp&9~tvwlb&:,1^zBE ,a{\;  {W&P_c &lf@<_`.et;v;/c ;W1Bp5x'(@mNBx"}W^x"}_!Ϸ2 hw&08n @sQųAe.dy0"`Qe ^dJ^6u `1"O| "qO3g8 |ɏ!sH9߿VSs_sڝsxQׇAO<|!LF%C87?xGY`%~I9D?B}S._9_%'CC^7I<f='C4m2sxq%C!h]lUpxI<@y--b\H<8ω%x!w\!D0׿`.r|9yN<0x;)˗xq!sH9$_XJ

!Y9ĩAo7n4͐x5X`}lsH<A9nH< U< oC9#!$C< S S=fTg: \OdOAcQoI|*U'.Í{9*+uZ٭ G=Yփutւ:jrwJI/ؤWyQK;Cj'/' U'lwۀTk/Mʼjܢ2}ty+ *ųYSˠ .^ˮ e(FdCgC"GVVi'O[]Mg#Vi+'O[@i |IHTAyO[z/x}T/[KvGi+x_3$WUK8{ɏ s9H߿VSs9_9ڝ98QׇAO| LF%A8s7?q8GY`s%~ID?qBs}S.__%'AC^7If='A4m298q%A sh]lUp8I@y--qb\H8ωs%8!w\ D0׿`.r|yN08;)˗8q s9H$q_XJP> YĩA/0n4͐8s5X`}l9HAnH U< soC#!$A< sp=N;1xsH9'C!]3V9$xIx듐xAx!J50O9D#(3sg3B&'C!z9 u<c=Ibo*!K<8dsH1oFY?լs&Mxq!sH9$M !(%C9!D0k9wxBO<8ω]!|'e!N|9D$C!JY_9$2($_Ix!K) < !e 9$_sH t;sHIH~v9J8OsH9͐x_U4V/Ÿ~9C}%!nC72HmK k:5)l}O=`9$¯.$C(8ω Sx!'xI<$CԣPKG9$Z;K<83ƍ!s kMxA9H<  GZsH9$_)sH9DԯOB9N9DO<(C 'CX>|&~!DgYsћdxI<#,0s?$xs|/௒!!De$CV!x6O9ĉO<8!sH.6 *_O<$Cx1.J$CsDO<;X{q0 9><'C~tJ6A9a $C r7$C*s7F!x !_s xy)[*ߓ"]Q=;>!-8 p.?Bpw;$3#bFwx[ 7~egP>zF;+*VJU_t7{'ߟ}jYFg(M8/ʴDW3Lw@+\2fUjy{ &zZբr{9*8QTkfBACܕpxP6S*h~F"`j,{zoxw0c2bi' GQMnɃ# .z˙nSR({:-L>^n-qrُ^;qr8%Y ɐ8 K#1H\>IHgoBRl%$N.? ѐ8:F%N.xe(qr7K%(3%g3qrI&'N.'z9 uqrio*![a_WGГN/ktK5~%jsR&I\:K^2qrij\}]mR't"B-OKS'mORK?,'N.!Z8$s9'sG {Gk3K\B9H\p8]f˨m| 9'N%]J1`p8,#'"&N.Pa'N.KFM&O2ʣlw'~@y?8L\&N. KUEcoU'Y3W+qr)v8t'ԦʱPްXS{Vtډ;K\Vԅɥ9qrIa 2}4B2qrY~$N.'%~z*c)qrIxeFu 4_ 98\c &H\"lɥ'$N.3 qr)qH\~$ɥgcW!qrɁ9.$ݩ=erRߝ.Ge|TUk)C]IҼܤX|luε Sd2K4pz?x4kH[(~O;iIj|'_'v(;xo[6D5^N5^8Ⱥ|+wx1ݙVC9!>WOǩuYVYFYZ1ݳe5]\8C p gD)~ -CgUi"chesYƉ)Ro5ቼLf;೉L,;K+ L>!6Ŕv Wk俍ҞO|_/'+)UgF5$ZYMyWfߢs=D{3D2m!ڸBn%K W8N6F{_F^NEtvσx$Ӹ«(ZFz6DW]%zܨ]J+;v+^IwG37Cuwܖ8Ң5>gi+8unw֙^ o!9uYHaN9(8+BGH.Z;Iۉǝ w d{tp9Nwkp% Or>d>D#sd; L:cLEIw"ϴ|>SfS('ͳ)2RLbʬ7Sf[Q;˔gezX_R2!z2dϪr5rL#`'O9~%<4n1j[o1,# *e%'!h3ꭢΒӫɿjojc?NWN.SƃL7ɇV,gۋCٞlρ}='IA&9J`*SX_)`%t'(xr?<? c0(]=Oc,ĭ)3u*!&.A_3^GgLAsoI}rE -l~s R kq3.#߹K 1}}SgCoɎgfԣoʯ7H h7|s1קrXZ"Ǜ?N/࿨2.e Ptm;6jܑ|F 9 \V?Bn dCz)_hՃ/_AU۞3kz;x*9?02]AxD[>lDuTHx&| "N"/q*ZP]YpkZI./}me[jۢ)<[20M~tuŲ_ʍr4E9~;zǽ;M;RoUo;B8oU(|M/s ~1NWh, dyB2ǯT:\@)t{X?T 'ӹEMg11+86|Hor|O V>'Q8s5=#vWX(w͈)׌ eQC)A<τ6rjDi}'NKU| &崎tXJh ΎY7_;M|HO^G~d9sLUN|0=zQ=oOo(=S0z"xjÅ.pݍ~ )[L"';D\OjojT"ӹ9*! \3[Z?w>;|@R u繴^'dh ~R SwE/t'W,ɍ&u66OZ*puC4pT oo#$| v\ϳ(X[Rq҉2hRʠAwu%/4*5w]1N\?j`dVWXu,8U|}yq>h}uDtۈNMS )CiCDQƑ FY#0q6  I Lg01`3t R"[:8,c"u fg )70PO 79. Θ!t;& n`/kSC9gf@ȐaЩQ08H hݤƍnǀ>#n8[5@p"18)n*G@]cC|h5>Xsس㨣qĀ˝(C xSf)~$Ag6D:Y&nĂ|w_nDBnd{\7괜%z yuSyI릙{v<-BP̧_I& 4s({i[2=|ovEy{ItdC۬ӱGbD:2XƲfp6U1Nzzw/ o( cxDo8Mb07NG3F:Ml.+LFs=IgXh) |(;GΣt< "&B[Lx-c+ϻw^O%}ޱ^$o($ӋKԼeyn۹g;U@(_\ 9^NlX?Udu"YyuG'\eL}}z_)7̶ͮ@72ёfy{+wL-뿃="n zov)῕N`.{(|/}wS3sm>HON=@?N[> ȇmz]Jx'TvU'Yy3QM>?N(x듔(g(,D$ 3P/PHw|¯ym$^/x'.t;x曌A;ߦ4 eMF |()&`<٤~URA;cK=4~ܼTJYڣGk_9Iؗ(Ke)K`ʥ}j|:ڼKpԥ92IW.%9N7cMT .si6;k(ݫSg7>hצ/5c_ Bp O'r**_r K ǫގ)Q4Ya}PC{"Twrz)d_A%[N:v?O'n*FFUB@SCD&ǫ}qrr <ϐx $_9g?+Ju~]>Bz[#lz'Ukuki~a)>Sm`,j7iAE3yl,2wAtȻD\yƐj1XK_Py5wEHGI!趕!NwyHB7+oZ("d|vw**jzeڗ|d|uN/&c׍~ n椛>ݠvt17m>LwYϾ~usKS1_d>r!tOO+dRҷT3Վ'R3 lO'sx)]G.SHȃb#U<.:;d9njOv%!!20nncՋ1 oCwo^'U`WCz<+۞.f:R.r3v}U^#ݫ2#ߣ_ FS^Y9j]8bʺNK#5.>]Ʈ֗X ɽK0Wx|\We#9I#uAM`fP+Q\i̜+A%0( :]vBts<Z\<pK -MY_g, ]D}P6uBQE:M̙?5P'-fRe_[6*N΢YT9z/n9>,/VIQ9*ǫ/ µto v9^OQcYſ3t/*c`V2~W.=`\'jR?rP풪7^`2B!B._4 f3FGꤜ-M (CyL=5a905tJGųh Fx\ "qFܛ9}y|H<7:ϥMbh!i7z<qؙrᆐ87n,\ңzވS cQkGZuWPna<҉W&}T/[smSмz:ג5|-"ݥ]=G W'/;k^s*<\/ 7q}!*N` )GVNJ #Y/[޵veV~I:;1VwX[]A)5`wYܝ Rnpx׍ax1[(K)5ۘšwqv6>d+zyU_˹a\T~3^8On}yCN_|[t9Q+}E=OpJv)!.> WCp7~2>N#S \<9Nc<ӷ< =H隉gM8J{W5Zݶ4:}H}Xm!vGػ;Y>e)ۄ2ΐ0B. 
B_H_ ` ߯UݳїmmwUիW.%S v) x|kcx.ȯ**{;K'q8;^MNz'w+3TrL;lwcxC0jo|~ۅTkc'nZ~L׍xDpoɮ8+ߟ֊UjFWc:UNK~TgoT<&=e4[Gr[Gڙ/b}{q3NH83G`8t<?V翪炈T~LW("ߴV"+&I Oy=~'"J*cq"br;-*Wnb'0|Xg|w|sK ${ԇ^/Iz~\\=)sc(3ݟJt2tVV):A+d' |1 &_}Ǖ ) I0;%5XǮyO ٫A>HwO}#qXХwœ)$$}@&h4e@+n)`K 0+H .%X\B@JAOXa!\N2CJz9H`f `+ DPc\o^=߀AoD`7!pbm_%x= N}?y0B!T:HBPwDzQ1g 85(~ Y=O(/OGle'^Czog&x "wDӂ_!Ѓn_eC@}T~EP$>J@?F!I0Y*S~O3 |bg#D1+O1xW 5ǀ; Afuy "Q2qhd0[ qT;Bjwkw"jwR#޿ŧkwQ'vG;;B;B !ß^^3g7HvGHXuvG_.zR#ė~bo*!"o =&!rݩ˧vGcsx:!2vGW#vGH펐h]lFU_#vG(1:%jw!,_#?$Z#>&wVq=vG|B㗯!:ujw! S#vGH++R#d\QIovG]1b|P#.[a펐2tص;Bl osDA{sGɣvGH펐!k;BWBUrL;ljw(o펐qSGEyfQQfM%zcS z!~m:P#DE\#Ó&h;Bjw_>;BjwSϋXF4h;B![ %jw;BE펐GE 3vGwDE펐O'E__;B(yG6qÅ MR&}̺_+7?LtVh\ /!"}X|?d'^Gy<;4>,Om VZw-ZiƼ=z׬s_k:z| cY{j6C)3;/|2ԭ}3=IcSܟf9p~0=p(_|z՜?Wj{"?̧T<>Bܟo=pGډ @څr(FF9_=l{R=b/3{>;@~·\Ofzqzi==?|¬«I%?~xWy5z`ћG`{*S862*J Z^߻KT;ΞFZ%Y$˱!\"M*#NT..rRM_/噰Ep5$8Uaz~V}yQ!$/$Y3ʤg[bkkAnh{"\xQ^>QfRÙB*{+u0y;صKQShsjHgoïUC6Ƥ{ ={mأ^?zC~~7:HG'U2i^BL9a|=a 6[+cƲE#cu@С# 4ANEuJccAJ r xpf|.gxtjhq0nIYQp'pJ7 롣/+XnqOTXSwcwcAm~֘r:rȏ]mbt]vŏˠ~(QrX9\0 ]q΁`R *F?M?cA]+;{&+ }{ Ԟ>}|~6#lRa6|kA{~{8"r%# 9B`/BB6T# ={C`pTRDpji_A `<,8 kp!V?7,iKgG1xQ^zeX]=C5z{=f wf ke_OLB{WIpucsD3.og=AewI5݉x/ckG8V=i~?^0˻]dxMr&}^S0>ȞWK'|-2["7JP|S|`Y~6Gku g}Mwy!X;֞B !{\LBg2{w&hF=*A&ٛs6,.*pk:,:8#m#h[H4r1x$'vA=4a>ݝP,ebkk'V?*X,^%,jQO!yh9~GBxb'M 8yFӿcAw/_P;P\_^ʳraZ xÃ?"[ݑgc=A3.*#: !/:Pl|sEytA~l?XB`Y Xq}UO5Y:H犥jryUh`,jGgwgڨE>|֤ 3ur'Tj'FX?k7LnvtVu!lul$}ܬ >Popgauv jN" ݕ{ܤ0uw~OzpbrRng%[MY%t ~Є^Xj;~VIM$-B?I ij*Ow`޷ 30 J1$BDF^vJ_w{'?reoHz&q$9ןJStzyyzZ—P6=Z,#ynso ^ޣ~ zZe&=xJ?՟P.#|Y[|m>evj-POdEۚn[ BtڝoT@oc>)6) &%A?wテ˫fv4'qJ3եOǟrN #B)fޔzs4O>$]:^Ϙj=~Ą7(`zt0o0=;IԅUE\ivE 9`(Q٦y\īqd:5aiqBHZzH:As~~Sq?M̛6ڏLzů>Y2 s,' anOkx)=3:Ǘxzڅ!_9Xwitus.j}^c߯BQN~.gX'PIuGA-1i ? z4}ruo;;ds{[;' v w32Ses{샟. Ρpx<9]?Gl(;]?ù>CS`8_򾐔Q7feWt,y_I)stܓAb\x/&bx i W4%gw"*Q9S+IpeEgmLy}QIoi.ϫu#cA76(]1b|p~,1RSbs"&Tn$wm"*N` o?!f>6T0N=X|1J;N7\$W a*MnYp+(@þ`I9z#D(ߊ wbx50܎5Z p0㥛9=DɹϾGWzY;;#s?y'G__PxNЕ7 0Jy{<"o2*߷ }jEJ :Ax`WE]T1"WϏs1rn bңqWOƂn*E'h7>봿OVPqt>Ӓ2 f]&K5W0 Ӑ?>{UVg0|6Rݳ`}2tݜc|@<@E8@@W7 g7؍p3zkApA}G (^!B@S@5PN}?%("#hwy>>F'>gI喹ۃ 3^0k6^O  o6_zx^$"wZ+zލ T{Z=O>>B`G (|">qF#A} |&"1"YLlD"—|9|~)B*5g#(,?;0A 1o!i ٣CB/KBiB6١GwU^mFuk4u0VS;DT'rBIf"i40D{olR݉wxY.!npoYL  != ['t1|i»E@ EY*,C! tڀyt߃O}c ~ ý/1+I)* 7b le lF^ yvf.%N}5܅=i4O"XO`g1sH =Bߠ >ϞG1:x~1_t9u0ɩ9_-:ߟa|LG<n*@_趙P~Ar\)UF~TceT|O~8G~7J>>h&/fU{Zog|5O* cY\ A4_vxnðS/R!t2s|z xճZR_+U\~ Nݓfѓs/Y~fU屠 oQC@y60+):Dʉ.${0/wv, xSyz^D uۿ.`ګn(mT~F PI %1s/]h6ApBy:?zL}C:v|7YpLJVOA Wo>R\ƣZ`ѫs;Ueb\/%݉BRB)&": cAUTWŔUcXn.%3u!.DWs.L7ePI߃t`?]v<ȳ&a%{QmRߩ2hae$PObFHIx5ǯG)708tc/o}?Ӷ o-2ζdvD+(#ēx$`Mg˛,-mTua׍2$7,D#XKBT9p킩MW^GOJ.~=~W+;4$@R:CpG8B58 !wH7 B}K4m q %Hw*^ݞ0 E,& }| (ʯE;5n;4_;7!~!~,Ef 2%K0ݥ7 X w'!ݱIӯr7Y7jn~}92Lwn66b$fRV`~NB9}P;v .M,fĿӧQf=HZkIޯt{ `$oI`?7᷷`yBn\7cvN w BOnun—e'1n/յH[bP#e1/HgAL/f[b-c0,`J#2. o2T_;Mm%mO#[40L o>(;A1 k ?.焫~JڅYYo^"Uh?³a?MϞCww6;JKT%NOYx;)CTPٶ b{עf ۥ;r_^`V=O~xۄtMe%o5v0a,SOi:LqN[OϧaΆc9sT=~݌ۭ}&lw#$6#Cz}S-??>&BS=}ڞEsϧ9x|ZXy7җ-=N¯]i:\6kbh07 dPC3'W2X_`;4mP^LvďɆ2mx$M϶2/&9rJҴ JsWS|n`t6Nxg%~l(*A+݂OFnl}Q]Vml(Ǘ1/#N2̓ Ms."gswVdvGYt?sYm,.?O<yqP|z?Ox:Cvpp)AC2d$$g ӤA+OKIU4!;`ޣZ,ܝ yv8_*wLHv=tN{$_F>^=Cr6Q/VufmFf5)oYZej%K}6Ui7<6t:n΀ np')krwIw#f6;C<t› ULMFї,8*Iw)AwcAm~֘r:rȏ]EHFo]x w,O+.gKB. 
wʸR\ wK7xG48 O]w쳺>4}u3q $!#VjD @;N.NJ@ɂa: f[jzG`,W10}X***j_ŸguЯ`z^@(W/-6#lBoAޫkAC@R׉h+#p z D8$JN7G s^>ltGJxo "J_\_N@R{' Cx0ѮS8 kp!4۶l[ݟ-Nda%w~DރGN?"u=+XzD-k!B_7$=&|`@o݆' w]tpת%t`^Ex|m-\лO'yRym:µ#_ J3cxoT< w-6 'X !ܛ ]*=ER_]߭ D| ?MZ]z=/ځpIz_%EPiA W+[V>߆߁yKt,k|S(۝nB_nBo?>|X4>jx(Ix'ЇA;q _?za 귒.%R^<|'|~#ן|?~q}B/b~dBcÿ>DƠc7O]HV牬h[mkABjwwpu~<>xm7&%!ۤ?Gw0aygjL Ʉ0.$ IB=AIa^lW'=@)>O>>NlX~y=wBx lj֦ٓ'Lϳ'CO{OZgah~{Һ={؞=iSEp Ϗϐ{=ilgOۋW=il^4go{{{+'m07{R12'm~=zAN~r٢}Ѷ/[|\LǤ#D;u\\ n1z oPHkz67&eB5iISZOF{:$¬wkFb3¬G>n3[x|V rC'2~[{vE 9`(Q٦y\īqd:5aiq4w/ԉ44G<;%$ϼ9a37ѿL'~AOLR!}QO_zΟ%3p>Grz"n9ǟ=z|߯s+F|)BzPX$JCX?!>b3N0O? Uf;+ǟ?7oܸ^ ?NW )}v FOMiCIt3=rc_Xڠwp+cA].c,:Pc*Al_؄jM3MDs̟~kC|l`>SIopg\B.Xh+dž0&$6WPN} $='Jr==Buv[g@xpςI'|.,Ť<^{>Qg9h|}>Q#@ os8Q :t0𠔧̀dqÛOvG)o'T[c6BwS DTF 0"^.*_ yWS呒wbx2C/K1B|51|7S|B,w&F \xter97r18٫D'cA7heB}bߛ Y hu'+(8z\I iI^SϋXRڊw+i*+M, g+8rg]zpr?_A>O?O 4`"F4hEh/Z t N8 0s `?,` 5]J`1zȫP#iV x!\._AJz&Qxߌl%=v1Xp5Qߋa?kP Mv#܌w+APQcP!'h #B@Sߏ@:T^=C("#h?>F'>gI喹ۃ 3^0k6^O  o6_zx^$"wZ+zލ T{Z=O>>B`G (|">qF#A} |&"1"YLlD"—|9|~)B*5g#(,?;0A l}݀blP) W ݭ.W*]!BjwBjw@{ ~L\Bjw SBjw(Q+ڝ]!Z]!Mkwc DLvWfvWm_+vW?q\,:^+/F=KvW 17 7kwS+1˹vWO !W:Bd|OKR+ 2.$]F~W+\1rRF @ǰvWHt:]!z9"vWx#Q+vWH]! *}96_B~vW8թrZ<樨~=1)ۅO=`6NZ|I Oy4]!B/]!Bjw ٩EEW, | VjwH zq ]!KE-J`{Bv]! vWȣvWVyQ+D;"‹vW'BvW/ǀ]!*pW|}!bɥ_Qv;Hvv;Hv{/ҁޏ v;Hv A[|ʾv;Hv_"j@SDK_}z|va, >unA<|nA|kn/Qk¨'A|n ] 2&=}vc"ם|j =f9n|]ө"kH|AjnfYAjjZvSv;|OACXAc¬gk D>~w^v;i ?Aj4" AE j u^+WAȟ n)CN]Ė_6GW>nޱq?߬Ϗ.kѿݪUMaXH u9L+?$a;u U5U f'h1@B^:t 0az0l"|F`>!@g&ŀ0;l{IܱmcP?dO'u[zDA'K^f׋|.~u:+Cjr?Q׎iC6Wy*(?耡u9w ZBԘp+Nš~op[VʢBkr߃6!?y![5$LxnF~[m5jޱ5}&̗}U7WXl_Fma,٦rQtd:;g e!%BcX6%_t|^K]y܋ه|{$=so;EdOB˃CވyE:7Yc?j/![0q܊8,$:0흘0w :f1}9j]Nx݄8[刐'@Z! h݇z̀V1Hc!,ϓXF%i,CھA}Si߾=(mbL}آ Ϫ UiN]xyz᤺L;{3Lh,Che<}ӿ$oaۈBEài0v*r z!`,oH'71<_@<@z님/Y𯄷21 ]HV牬h[mkABjwwpu~<>xm7&%!ۤ?Gw0ayU(ߥu9..'`f.-~bqxܨ\\sxb.(ϊF=>=_?J9]}zGsѢ&zfz:ߊl7`V ;a+э7n3'9}2?t%T\㓭|z.kWoLJ|+=|>_ +FN~>S?~$z|k([tsۄQv]8Suaݩife>#U"^Ww0v%tH3p=kE=~9~||1鉸tB[zJ N%ϝ}|)BzPGl$JCX?!>b3N0O? Uf;+ǟ54A%.7~ ~}BīO#9!BJ)\>_4Շ &(u!yt8^S9kQBԐw^5~i+{r|wz^ܫ#K{X1yl{X|r}&4|P=x{=Cj/Q~3, ?'S{P䧥$xY8}K=E,M_A[qv2znqnO= ©2I-:{Ҧ $$a%( I5,Ӭ ""UV /o 0\"9eey~)i+^a2MIY%"yR:W.[jsژX*ɣt6]giWC@\1r8?t^)B`S~b 7ue6'U0h#׆P|D8TcA*?{@:ݬrB\ xXۄt: oK=oİ[1܁N p1^urq뉼|~,Ť<^|5wgs__=O>SqkWVGї<pU!%0N9NeU>?Fo(Th݊8@OuB?Zv) x|kcx.ȯ**{;K'q8;^M3V܉]" /0`҅( ֠ǀOL ꫙?jLx%Y]qV?GM~A(J# X2^u(Ϭ9*߬DOyLx{vAh'~qA{;䄷39_rf:pfp{1~ b8,XK "Rm1M_ioKV(J8|֯Z[H > GW'1,T3v;SA9ݥn=CIs\=?.Ŕ˹1Iƙ^%? A+2~cDWbOF>YAJ|O`OKz$z^TtŒDV,Eȿc<_'OCUY#B>Hwz#qXM6{>Wēēz HV^Z t#L!03XZY0W_",!FK ,FPϗP/%/GPV!FXE`-u W" #F6#ld[ l] V \M`'z\GzE؇ zC&nFPo ; oB(1\tP !ҩGDzeCT=ag (Jxs?OPU' [抗#o2xx!x=7D3x~Jx;"‹ iAx7¯2P!k>_z?"PDe%xR#$ |,])g'>g1_" ^Bcp :<D67 -3sbmp1{ys?w8(z|i5{/+֩J@ 4X^K+ӷPuͅQž`}sgőlQhruiF]b{B(rϺxI%B(NdFS@/7t,U:M g 87R(XW(6p~NOyO-Me F\1]r%x*e'Gr|M|ٲt*uȢ$?) E~(ۄ,XZYaѱz1'>O0 ~ ETUfٵdQ#_F rw_9u.6^beMDqԣܣ\ t[|ߧPP_O|ΐ,ǾS/_!%e/n#ՋhsܻPvJȠTAP Sɻٺ$d|J*s0U;npY[աY}aKɏ dSVcpNџ- ~mA6 gqF12³u,ɷ(l^R>Br-e!:*эJ`VoSyRVP!ӗzdc8Y>WFwز%02섬:1ۆYOMg'rRbrR`9VVr١l!ק۠dK]Zf+:n%.eu-A~٦<$[IV٣iLe d{KGj+TY}@@!oպR. 
rJalr̺KkZ}VRVdd3<`çYÇw>LmܛfV3O&[,fKv#Êb=,g-¢"W 9jcŹEg3dY(TiGKH)v[ᘣ"Rz:V:eIk_ ]zͷq蔺:ҒcP[[3%cot| >0/Vѕ2CB?*ʄ4YxFndƪ#C`fH EE6+1"$葓VtDigs)l:Ze YV" @YmHnSDYj4MzTOU uXU.V^:Pfx+ĴJ0)]Y5\4QsZSe76F~laT.#p}F>+m3&;ͤI];4$FjsNu;RW@ZrUvH6ջFsQ*Z&..Kf(BYM2@\Szui3d.}#m [MLwLaQ>l&[(#r8K$ݮ,xB[rA|m/i6IM^mΩ)='rZ'ˏTwg(G4c{E[Go%N|ͅ\1tjkvHNުOsŃAYu8]]Z{Z]{}H42i>\mf/ZP6WdT0]Vk!ɱJmnKi@ewgɻ9`7NoUɝ9\3 S뽥w-mYH.NXRa(̀&F(uEj(/ 5՚#VC0-N5ˌk-}y[DNGFJ͑cYN NE alz;H3E#܆S{ }#N70:FK>?aa(jg K;U¢ȇW3ER.~V1rpYv\%N>+y:"ΖtɲXE퐤e9A+H,cvvRNYvZH4 wZӚm60eҒ>Vk֣VA l*Z͞+: W 8L:jˮ α07<ƺПzp^[ymiireɨ94no-̳N!'3sAcb:O@,u[IԟvTN {͖Ud+fLO産 aDG鼴#KZs7-SeJ-c̕/]+MV~{:''!L2)8 qIhP6Sݟt`RrXCSV#*6DsHy̔޵)Y2֝3SWi1G[J}H2Ie'lTd/?brn+ҙo$w4'5ӬlMg7Cmf5=@巧pPxG̭v k;ss\bكCy-z^UcO aOX[ض&Vpf!:}cHWVt(:&aA|`Qi'ё3w?fBI ~)rƒOUypܐ3yC![Lng#<}o__uϻo:6Ӄg%[ҌiܱU-s^e/ΆtpؒL r4a%FnTjc~^'Yʡ<~3@Tڈ8}l`P*S䭼`#ݜn'ʹviR±:f;ݟ+J:k:\LЊslKYҕ\4b/8C?Ikc̺猡'9acsZ)Fa9 $G|^Had.S4784nSShVqLa$_"NlɩNAJƼGj2zNsp;fb;s۩9R(NسLI * MĹ^űAg~dT?1zh󃓴`a q8*趥^:TmnΜCW6|D4"=J5\>VCOr.fotRLAɖ>TFD ;:jk4*tܰf힡ia;4hSM] U-Et&PyZpK;Z&U5=l(HnS/'̥N0WtsppKE:E' \Q%yܘjΟ1",6TRg)wzA(inԸKoQo&=[9!):Ks[Гu:矛j7왔L{nqlԉ8'fY5JquhR&pY`\hr*'; )\o:K[N]9fed],%b˦}JPԻSeQ \c0ɲl訹==y(OkBT'6;g7=nՋ \P9Hwu6i iLKzqSKک?F7%I;4=b>UEӎ~U7Lp\Hkf{T6M%s]?\@ y̞g[2uKy6eMud#mk XP+Q2zsGvGVZmi649]T>47Y3Mss%tП-1:Ԩ _.',a GV["nu]tQ>v} ]iKՑVŒiH<YjUEm@hتxZroKi7&&w>BA|;AkZ.HЍC+6;]s搽Nļ=ky |НfPHsdW 9- Taaϊq$b:.Ȟ LJo24Ye]܋ˇn+bg<#pEr:2Da`j]gV90}R|gi+ ܊@MLнS9iUnǍ=۱>:a&=@VwgKE_v'ߐBV:؝0Wr#3o߇ǯjY}cKj2o4Czy}>֯зPi=Ęx'ϠSכʶX(.H; OA ; *3 vL?o"\äɻ-=n 1x>?oU]med\=_Of.dɞ\Q$-l2M6(|Ln9DIQmG~WZMBNȂ!-N&fPm5Y9R>0ެ|hϢ?LoT#wJU֗蔜FVqu?٨kHȀ ɂ_?dO8j3j5*[+˭qn;oQ~ }6goGS׎(e=tjnyJm߭gΠ9aUs*uo\;wʹ5җ4%1ѻZznN3ﵟ*kAiP*7$OX1|1ÙΙWɹˤex5G{Dr+īr%>Gp^sȉd#Ҭk"85;NsZ'crP0f< 6 dO9-/SV{i_fWOQR̡|!52$OR{A}V2(t9l}ƣךkrr /dx270F%%@l8&8k ?3JeR:9ZMK/OeR)b.K7S0pvJӟOZ/o]y<x0hE|0.tsqH*X f3وࣧIT xLXEQVvrJS' w؃8'n3;Lk3DⳈ̥(wB589DʶTie%r/s?G \ <B;BNa={漘CE06R=vTr_ oa3u$_v_o*~!q~.I3/'c'U*?-ZD}XQfC#?;R\Xrrz{w) cڍ,/`-dw:=zn^oMԓv?\Pz\i+Q8!:0kzW QؚNr/WUiWg&\ڑүZSr3+xj}2?KZ p\1V?:v?՛)\ꯔjXX e`{-q: 5փj&UѩtGr>5w>na=Yj#UW{} W lpc QU~h}d9 x4"S=̹,.hoWEwS^-dҽ~v>ݯOw- ɑ0;&]wCϝzjry7K]s2=&VB1kB}r9B_L6(|m,N&g"(#~Ue3 F(Xڨs$.h܋* ij&-ixTOqwXr)~4vPN髈aI1ǞLI7Q }@Zݫ^G13hjG\l9ĥph`F]V 9G!ZNfm,"$W܋mje @NN :x#e,k@T(ذ`%EjL0=Cٓ=RVX:㙡&}UT*Ƚ@KI3My ev=SUowٹ+I)_{ˇzAAFҼtּ H> 8yt<&jևF>lYKD\J{h߯Ex/x Q+:ҧN)z"Q7Cu" B=u sŇImh(SC 1~M!/>"A^ĸnBL4Tx|Gx~3}/0/JQJF}$eDk=_nUCU\Hn#3N`BHz(FBڏ:q# /-Hߩ3*/uzl /z]L4[uv9pbu{kB|s6czqOBm"Vg]mLZi1!"zQBo~(ܞ.6&bH^Fl \tRȡJ]F5w%T~M"]RR3xGHU0_S UЈ09T:` qb mf~YVxNg6cmN*E1jZy6nD^!Ӑ9v1p-h f7F qNo'U܌ZE^yg}<^#:?.c_>%i %R~Y޳س,qCy Ih93EwܞM-S6[:!bQjw&LlE7W0N?fCTM|Iʝi2?8W3c"uD(W*8;yw 36 SD"G{^eM?tC< =&8iLiK^pwYqfwVճhb:b[$8Ad߾%.Fox ."\34xUe56Q5yνaU8)o\Su,\Y oE,p,wfm&<7#t nQނgTx+70z AW ǟILy!mlW%x lOMFc>A3t@;[:sV5&P$_yQ63 pK[&?Oh#s 4Y\eF.1yP tFyFLlVj6ZFhkm4_BYPaB7Mvށ_R ]F_]߭ D| ?0Le&&FW[)Nje7u7]$ǽ(}~ G?,6ISt@6$tፘ^s?&.\Ԓx q܂ކVa'сi41[Hww0Q+r&߂n/@.G,m6pi gؐ@>ni#:4F_HŧSoG6;/s܉E}} =N E~{Kү k_70~aBcEc/飘%m Fܟ.B-|H]+d{q,JHyi|揄_)~9 cg w_` K Jx/A:w!Zq'm_ :7]e | `ݷ1Nxl ~wtU\v|)v91g3sijr1~-N+W1]\1g7sAF=)?dm>鲭BW6 c|LJ^^[q]fz8=~u=u?ί~!T\sϧ8zlc!uL3Os>fʓmS i梚=& s;O_ǰTr'Ey}!{2n߷@g-=o4c/~<~s9e]F۾l5coS~r1sArݥIGLxBo0ףk͇I׮5olp9xs<uN+٫7r_ߗFέp6w({r )]>+/ ~rcǃQ7K;R!0כ<z'PHn?M¬7z<dx g_wV.Y+n!EGz;'zMoׅ;U_֝*f6<:}SZOF{:$¬wkFb3¬G>n3[x|V pM]HsΪG"(:>(dyTDowDŽFj\ǯx _)J+_CX?!>b3N0O? Uf;+ǟ?7oܸ^ ?NW )Dp~T6䛔?ʱJ)aSOp?瓟+aBԐw^5~i+{r|wz^=6eMّ֣<7g{!g@w!3%p<Âg>Bӫ>gt,{awT/ 琶oh|+ oY!ϩv 'N `8U?%^gOt gtO0%?p,Hl $< |H&!I  KQ(2j e2? 
/"2Peʊbu HzN-p|^_rA x)`pLSRxVH:{ƩÕ VbZ~6<־J$]F~MEYՐC%&}00"&?WΏ]w9WJ:Pc*Al.~QTODs̟/?!f>6T0X|1J;N7\$WE a*MnYp+(@þ`IF YPj wak0܍^)a[mɾIy4 $J;EWû?<}Cm;*:A肜nt0A0ax_OEp75>Jy{<"o2*߷ }zkEJ :&6R \o vQ_]_UTvN&pwEgΎ5N Wfw||71.F1]O=|}fU'D|݈QK~` /Ț9o(_ MFoTye0ƊSGEyfQQfM%zcS z@S= ۹u$'uə"֗7ԁ3stދL#na}^W\jۏiJ{[E@Vᛶ~J$pDWa8"߿ޗ0aICƷݙ .%tHQzOWG$qs|.\΍\Lz4t*X ZY{&}0|2| 7W.{ç0|ZsnE~^TtŒDV,Eȿc<_'OCUYkb `lK9w?z2t )$$}@&h4=UߦZ t#L!03XZY0@*Xd%Ht)*T}*%r!\AJz9H`f `+ DPc\o^=߀AoD`7!pbm_%x= N}?y0B!T:HBPx x D8#ſ>E >g,&|6"|K@w!|K_Cz o\gȀ܆ľe}68ȘKȟPsFcy}+?y@Ϗrƾ:B!#Gʹ*3?d`P]N_ve+t<̀%SoS軳CB/\=2K0Pvt/J j{EeCk'^{R5ERě!>y9|OVag^L7bM^qADrDݨQBTULy%zUrGfT/rKX1 !KKU U`h7IRUh Yw#{>Lk y J`:U?:0ZiuҲsz*R2Oenv/:,N5ΗQG>Hye)_{M¡>DB8{&%3nljX@vx W9eL-Q*z/&mPߡC%'d}wuN$MA0vDH*F)̇oZ"(3Dz,yBOn<rʶ!e aRG Nh7!Q:l Owǰ'2SB͛XD0+=;1ޯ|삝0a yM|Wh6a?⻵)PYx] .đ.ÔˁXƺG5ݛ`?Kłg%[vv{&6yOO/9|K`}jN7[DO2gFUL@<ڝ ZLV@^Ϊff/sM궽ׅ7PT3 z fZm5 uzs34&MF fFYm܎8Z^ׅi7Sy@鈣|;q3Ilq <=½T-oaBT&[yzf+F2 aI6y(?[ g =GP2dڅQ=F_7L&ھE~[xH"lɛL wH4 *u[.D_mzx. ]Z1U-L;æfZZ M[tYfKmݳцQұ@2[iJLtQ`͂s7⩄W)ר@[20@wiN &z"I_&4y2QPU_NBD@)._|OPi>>""͑qy!ÒsMeԛpP2԰ǗsUqxjZн_YD[HHMw]+k#:]'~_wu'@=O':1=wK?"ߣf.I4f} н P ?>$ܵJw*%= Ey"Y2,81c9~)SB1~pJxN4ӝFvUw8"7g_T$R{ܨK6+5|$mY2GrF#TQ5whۊ/X&֫Bhx&|\Ša%304Őc0"7DX犼>V#)]m&ɒa ԗgJ"a5"GCߦωӉ {BnsXwx ;J(( " s+7lؤ@03PN91 `$撜~Xz?C?o9ۣ?(tf s7[7k[C)˖R==R}Ŭjiҹ}52Ҝc̻S*GtO~>5%a=j+s6n wDm# _"n>r&qtf/6JsWe3Kq__A҄a$(B*jA.4tP30)ܩ/08$g PAмU&m,'!wu')٢u}2 x>({O*ooSǕ­G¬Yu`Vgij3>l1BGTybzqψjǛni| SR*g Y4iix,9Z#W\\J]$u]oI VaBwPs*j~*6:?E7MйNDWϳ LJw󄻲qPճw y" K@ů.wʘM\0g诶=d̩&E9ՔE SQksVW#~>I=~Č`PśbY^}eGTN߳^l.i3tL!W7lj4qЗ8l 3lJa_6J?v[*L :B(;^ve&;zD=3.GTÝ.r}޵HOuT~|t 8lM/\ 0賡&~N][!zȿF>dZ^0!6=_dl=fg/Lk,aSֿPNbiJMC \[2}{xwn]܄9˂α\7q$Wsb]w9qvڌr'i:"tw nvv{S d7Cpm0gN>-REq.^c>= y"tOZ{,5筎IKv&jf(NwU6*SuT83a1f|;=xN5v\ͤvIK8a'  g`_9_?W唕T9fVx i(?o#I(7x&oq m.{Xu+t=UZ咧 kiyMwE !t9[&m LGwlsTNʊf'exd9٣= <v *VlvV1U33|ZJl'w5Ƅ[&b͜yVٳ3c詴<0IssAs`j8woXC˅vySl3[ rd3ya9-fԑ@~XnXO+Gm`8L<lsF/(LYdbdbdaY+[Ebr*\xDf`$+<5me=FVrH̠g3^C/(L/V fN[mF5S86='@Oc[ra9yc1JI^J5EOϲrP8p:9 d dSS,Vf?=<O4*Υ) GS}#bj(?2:n%*eWljK1[S\f[; Ӎ<*MwtԼթQX\.l(;֬'&% o~3-|B!*sTe=/"yHzmzz/~^ RF[ ce࿋9jEHgjYMEOT4(vyK0]>nLTu o3IoJ\ W70?k[cYMd #!K#/ L=+ccʰAQ??rf!rÐlUJ9V?zQVۍxzzɔs1S;|&ʚ Ӵ; ;w6q4ɿ{0b6k&]״`Aji)oJdzLL)u$zl::@,[h gIB_2eLcނ:NꍴLJ-XY[U*d (Op li ^^7RCU <՛15<-fQBktz\%<|x,S o4¨oo _ Q/Fhoo\Tɻ+%rv䢢jZg 爙(a&ڣa&R3k"z,켨X[y[wimv6;/||mv6;? h7@|v^OjRv Cp.Kx<"gP{wvΊ3=5P93ۂHh-HG13vr5tFo 7즮Xkve#ˆfw]USY L)۟bhat0p~&;N6b QWrrux*1rbl56p zl/_~x( l9#Ll((!G6| oP 𖱪t Zƪk҃=Uo躮7<~Y%Obi\ΙyWȫN|~V^jqʯqL9 aeLnuyIBd2R8tCYRy u ͕e֛M#F3 U43U%1oF[[8DotkY~j>cV[X^>jk)07*-5Sl*٣rLPEj"jۨ16P`lS/ܱ ЈA c$c1H-U]؊>>wt`tZ, w<_uf1-k=,~6B:3=5h.HYp'☏|,o"3]Xa&[C|]8mݧI#ꘓHC dgc$62Ffo3ѝZN+U}JHZ#J[ eLy`ϱC1} !O]vc)ߩն8קS}WnODJ9+A|C*>&q0a.@җ]5Clb<;5e~%jX책<)0ۍjj׶!v~c<]3w˨& ܨ/U%Y'/-UqIDuD΄]S:L"d`3dACCĝ5L w:c5bV{%uDHs%d5>[Zo?ˊv(*H/Ok Ť'+l.jk|o5c=ȟxM60M&1{rJUdRy fz,OTe }T[zBf #0$VnV aL6O3;!3jQ-UPAk;6gj+o\|l |o 0|g!ɋZѺ!`~##1~3I Dֈ^&陹\],Ϟ.kx+e !LyNra*gCzfTK3i&'Hh~'65j,K :*^L9,jƑ z܉!S- +UW>y_ǗvpJt4 6LBaC;&r?뷚5($0AN'0GC`5"]_+R+}{ ٣;6Urx [qzd7$yi8Ʊszz59;t ii2U)Q R˷QS`hڅ4| jy>7l5H ,Uد%hji9׭WyE݋ND]ԫQG>űص۽[]M>㝗:Vq!NJ3d,p7 j^"c\t*jwE\5rY5>b"FD{ 8߃  >9y>L4$S%6ϗdzyYm'uQ!L=D^uu*=m屩zhgpAS! 
?)&mh*gi<=*4ZlWȍL,)ns_KSjv U]Y pJL3 l_/nɏ_U8O+ʚ,MrF4EDjᲲ[V~eV}>mI"WQcMRQ#?so'O{ ӜQk;N09дdN{, rf&{T>yzhj/_X;ŷ|`^k6 _UM"vQO//1]e_X;U;4y2QPU_NBD@28f44n1՘)zY9JUԡzgA2/ݛT[C[HHMw]+k#:]'~_wuw`=O':1=wK?"ߣf.I4f} н P =$Dw*%½%[y"Y2,ñ){B!S#b':gh()bcEu!27ol%Q)kn 5bOe䎍GjR{ [e[&֫Bh =o]{r[[oR,|v3燗9t>ss;gO?,t=ǟ7ӜQR:3bW7v-]Կ!UeKEŞ b?4_:-7FsyHwg(dH&zDp%OdryC_JϧgJl_#̺@D׍ۗ/emH΃/ x$nDŽ+e7c&COc:z VOlbC=BBJ5iJ9љ(MPfd_,!~}crDjh WXL"TR>~>zr)<^0]B"P=M@jLGoI{"5}Ml`FzL7$P~QM=m)Y@8K.̓U&m,'!^({O*ooSǕ­G¬Yu`Vgij3>l1BGTybzqi&ІUTψjǛni| SR*g Y4iixY#W\\J]$u]}뷍@+B;5qE@MlI dNC;S :<.F=~|wE]leW;JQLSvvg&6SBT`{IpV ^&8W#~myހWJύf#Ewu5AVޙۂL|֬ /M>CG|f)}~3y gqCsOyŰyOyټc(q)/0|5{ۅowkEoDJl>KsGZ)LQHQ4U7GEGST$y-CL_U3 ӧ.=^ 2U_T'(*?Ϗ)ljM'LLMi*~X#ES9}3mn?%&΍!> };gwZ]˄^7tL.J`符5ىY(ۮWg>5XyH=xEpgSm>6qUP9ٙHA:|&Dcy$房dg&uxWe|)ɤ8UgJeN3cʷ'TW l$jATd]*(Yv nvs~X6Ɏ/N%Slo%鎇|A624`]m2חfm )#ywPϺO'tT>8rHy= j)-j|,ǀw쁘H)@=Ӊ_a~ף%@,t0bNWB?ըMƧU|p\XS`|as<>ﶚK=+=(m7CEJ N<#6U̥dɀ t.W x+VQO9"aS;”4Oy<_ǿ@E2*B^=/ eԉKT%sUs|T/v.p$=ޕ1\. Wc/KrQ7PԳv*z"0tz']!0G ]\Ҫ RC\rtPWX+<]r*\WyUq[l[tiWZS5ia/x:;(y$pHyUq7%nb0م8m"m0K1_I+6 ~0pXUb<]حt[[)ir`k0c]p=^1׌^[+&_+Q}}\+U<":]f[ pfj 릊^#/ Tqk:\PM}GX/ Z@d'*)b0W\w]\'lJWdL5'{q,[:8=z4W*{jJY{Gao1!w M;:uhB&JMYHV)yMeP39-6);WH $֔Zw44N7$:ae}@ Yom#쩾~[i+͌AVC5~eAc .38<-[B9n?:JӇ3}٣ALB3cnD3dWLzFm=> YϖldqNRǢ jw+P><;ewQE8VR.KoG,7TWXyX&+jC*+XSs{@yb6t9&թPcrՕU{JZ29|$ckJ\2&zc?|6/7q7bRH5I~ Rn12o jff7*cpR/-4ƔIWbv$?2y e:pc4c(4:{G'3cp]Eo=n}~g,Mq8Z1+8Νځm,hMϟXgX8="W#߲Wށ fJܩkUluz*X c27F1nW5cY } 8UGhECjn?u1m}.RZE˚C*zoeN^AՍ˪ӄ<+9XQTq0o1GQ/UЛYC!h 1s9/1xvL]BE?=oq3,2 ND9q2j!f.)֩>v_j,?&ٞ1Vi߉!NnEY%;cl۽ǫM zhQSe3~23o?ֻHǏXAc၁1o[#1LlE }8Vx*idcҬ2wɖ5u˭7cÜ:D+pLMLrE4.u{U8kl}R $DPSYAjҊZr`aY9X%!+Sɉ1==mk4ߪa%KbtߛĒ/|ߌ?r=e<%>VwJ$hssV{o̲UZ z,ا,+qypU_pڢ _?WZK9auX>#]|FhtU1dd?LomAd,2oYBllV1Xw,C|<yr|A K[u U?-F 3Nw(`N&9n Zq*a9[BN&!_g%$( ZܱqV|OrH-)0^RD{XcuF6FzrvF4zKϣQ_o3vv׬^"]Fb O`neթx&#̇SaTqA' {dn_yd%Q6q5?XMt"Vz!fq},˦OƱ/ qo?h\ B|S 6 _ko4KpMSxnk !!tt)ta ǛbFĚ-kE_ =&a&{NZ)(}EpԖaYKzBuuXa8OuuqK#J*yw\}־Gh:`}|<&nϴo$ᒫ^(ZuB;.]\Zۭ,j׃EI ;Ia;k#Os\@n޸X=`955]D/Nn0GRԦi>Xˍ!Ckgߴ-OͶک[bh67{Vp0Z%VDҧ?e!.FVNd֧ɦ7+ՈڿYdjY_u!^qvF ۷btւHSY+XSF&o޶gGts/YQ$pu^T#(,ջvXŝs,JgG,Lʬ3!7,cqU1$9*R "z >U o ;Z6gI!f-w9J}OZZ "d4 oK=B\H UY#,N7ʚXme~dc8UZy}}nO /9η,ñ|RS_gǝq&0gq6'FXح;Y<7WnpF6ʸgTy\Qo|;|[ {Z:}BŤZD5y:Pg~oZnSz?iaZ]rd(MsJPشToMFT!e{Q3ώ SSn𩩉஽!&)d&GVhs9+, E]F/_L-"\g {'>WND_*4q7MBd9Ww~Kqv}1dJbw+'XŠ Cƙ/)˿y&"-[4q5;`{Tĭ{kwɐû2rE]1}VF^[i1kۜ,"ncB,9blJoq}p ?U!fI1G|ud8FYvnt<1 us]wyB;|IYVٓs|)nٌV pHyCT1CL-JA/]9/ $*x=W:0Y܅R\'SWXA3Ki_57;;WqɁ ^iR=mjռr3VMvm1/ ӒW7[v.ؤ}ɸ ފ~珞d/[qunM$Ggn&q?NJ5Lيwul%ܸIGtz}$ݼtF^=|wˑ.7nO[!BI}:V+[ uY>M>V\7i+1P֬wpMWvh'?f{Izu5j>O*q]nq:GVUw׾_tqI&q7뚄Nߍ0̓S:?ZqLc}}og0qvnR6s,[.{Z=d?m=uf|XkI ieD^tuH_J/H*]o Ē`nk8G) ?ML+<bt/ӕVb0µ.<$OޭM9Z:%Ѫxf R=ůdrf~񬛅A-:Rk>Tf׀6yXuO?y6@ێfyoiM40Gk+Gq:\9i\ЯsGZj~cUzd^k=h\em<r:7Bk2=I]kĻ.sM<ǽĽ9nIL^Xֹ:1`>]ګMރ+O`|u)^.z҅#L}H󠭩r!p1ł:j dji35`:83GHsMP4X[ww$>?bj%tn֎dYK ] ~:ī0A_nGH{Ugٕ>\uPp*bR8!""NQش4 Le2G&Jo9-g2ȣɣJmlJŐJ]F/"37z&'0aYL\ɵ-\UsHIϩ\TFZK%Q:- !VPp}k;WDqr^N^:=ԑT L*k)d3wD&⿅KdĝOuB/CX߉-(zW~;I*yL !NLcMH5Pf-A7Y-m50ľ>߹={*ۧ*Uё+lB-5Gu]ҳSʟq@8 ֞5.!`T YT !α-md7O[[/[1]/!r+zPO_w{!H%s8X&^ӟƓK{^ʞ~|O2;hI7ϿOڞ*L`zvvBH>^FwzvzcB>UqPL$0 @se!nKv[2`(j5kEr*Iպ'q2rc/ I O3si oh}7{ُ&mQWtViH*TŦ,޷mv:|8|,&L6v-j]q+uŵUꊣOu2d#%x2.v\ѯ+InE䶙)o /&6'\z8-b`PunU,FC6]+_*j n!m8 HCY[{g.n [8"@n9Yn.~o!IF|=ۅ b K"򽢿qĴ1\ⷫ;ii@|!>sـU?ބfLoaJieP$k+5pwPez u bZJWT6V ԿtPR@oK>2~X@uN0.8[a`63Dribз$D5 wVr2^gL+\y@hqZ Ҹ+o 6lΪdpKφp 93uJL_ [8P6od9_/U.J1y4KRf9Hi4yN0 [ߊ,LP]/ #Cqq/V3۴`[+ؠj7 WTC'b^w;:٪rb)#ܨO'I+J(\%d;U02p2@ݛxn-}k5^K[SUةڷGyަ7~2]6]|bUWfpvUyMrIY>$A0ĊkH}"[o =['UwǛ纷8I_Q+ׁ*Eрqsa\ʅnJl[ʈkk ^v^!PX@b1fr;iԉ,[D(}pR)%ǓtO6KE50O* v/P`͋qU8iu#*:)Ig=s dz^ZuoTϠt54多tWIe_ !.Y$a=ޔ2n{1 
eϲa(/:>|;IOOOOғNw!=-%H$D2eޓ2,?{J*dL=?/}K y߸l E~zBVd-{S!i{tY׏bNd1>&W%q|Tt%MhJ[>w%Z2Z;Z!Mc#LK >%z-攁]Tzs'B5UYaJ)ozy&kpWَ5(4(ĵ,˪`% ˭: F>Ze؅\"\fdG֚B^-S%]9EO-Qv/ِų7u/"! ?z4Z3-n`=>GASHM! NY;Y e& H"b.P qlP$,ZR~AdK2U6(yPl+N]C#"9: "o-Cf=׎=!4o |G@.4'ek4:b:fV~ x7!T7ݨ c9: Ho!ʼce.2xl6&W֢M&R0UW |W l7I/ԱeYwgH-M_Nچn5(l#,6%/•x>@L3/k FKǕmw3X?xmTq1' }6;q,SАH&1IO3sD1SKiJkk[0S)D^~K \9]Y23qv#̓\X/2|Mk%f1k9ީ/w$f%_F2Wvi3d)%ME[,ttڃ=(*(h\։auy:Yw|C, w-~.(HXɽ. ~>ȐISeS tB-lZI_WK2TY <׿8e(A吴i(𬎬vPWGֺ'yסd?z>nDnr7g iDђ5Jm -)UO k;ctA݈%OqRb0H/q! t=0z2qHTԗu/RAYbVm9tM}q ޏie"<!@&`& EB]@B]GRq8c<&`LdH7 M2&.NEDh4`\/K&(z8c0ABW?(z8j@ LW6݇;רAh]) wS棒cO$Js ;^A0dupy&k4{v|8I/ Æ3_džmV18ɛf#z0OGV5Sȉ2OIDQn *!4cHBJ BG2 fe5aK#a>v$+j~R~ϝ+FR:v`)eSUdGN*ixaWdJdZm@QmiU,DBK 4;<h&@!_?i(8-bdm!@!Ӏr]"sv!N Pup"(4 LB?& nN[,#Ѐnsv!a#Ѐъr ЂӀX NKB{^$0s1 H~tko>!i?cb')$,4`p2dلeFxԗI\?(.$:,&f>:4.fb&ʟNV =(ZxQ>wxS\&}MݺEHgZg`{ONrImgroQNj{sS{ӟa_ +3m!}^$φ[R6ovI*( iR0r9-QfJLvᕠHXHh@GΊNVt]R?UKs ׺]mx $mNi,\pt!عhٶ<>Õ p.,"7F.Y3XQW  n^.cGIwB¨EZ[COV:)imsf% uɎemR!= Ϗۀ&*%릁e[b- rBLg.[ʒ:,.&6wtxSXP$e"?t{vM${sp>X &ٮz9Aeka5KNHޟ(& /ĆGoD|Pt(S\DTCs,!1WI K9)aii8XIELqq1q#l0+L]ŅՆ=Ahu5e&d1c_SӃHpa-1 '@!Ce1 v0+cv;أȵEڽ'JZmH|l, RHˆ7gϘmtEfRC:! !!@a9b),H@yBCcCc#p?ޙ Ln $v2 24Q1\ʶVHi79뽄ט1"h^@"+1y8nJ5ه ^c )sZY_Ĝ"N~uF}ȃ XDr$ Xl k%1*}IelײW%l׹VӊI^ZMdm׎JeHK";ͣ2ؑkYD{&\sum$a~tf}J'&1믎ґ@XK(3/.roR׫:WƓVHi0g9CV\7`r<uCh _B雾nմw%jῧW<Ⱦ|n9E/hx݄!Q94h#~4C+`i\~M(5յWi}D Y_́J=>[0C|לC<گ9vBe#{A\'V'#ҶeJf9t0t]MvYZO4I^wq~O\M53u̚чۋXfɩ̒@ΜHԙQG RP6]ٜ\Y8$_Sh)%;?MwHi;!坽 }503ĶKS)]˛\2yy`7¼+dYIy!)|-|K&$[O m9AJz?(|-|{R$gbɠ\) 1^=tWu5#ǿq0H"X BdHDJr *ਫ਼"guEN$>%2H H @NOA b};'"Cn b#98 c)'8EOE'@t8C'x@!JWdY5?;H# T6PAQ/V.?V֎. T!"p0&O"~r|!fk`?V=_b~08AyN&瓀^]/./'.qlB{!!c`O2EEՎs^T$܏y 5$>">k\T|89J"3,Kmd!,2Gc^E` , i'?QY6u}BcYOl)f? ,]*.L a?)vJ>)Yq,)ԇyo C֝H=u >3&Jxgzˉ?<;DB<'5OuHY4g%q@*N 89 NnTzQx- 0/A_ ZhyK+VZV5 nZO| 6ׁtևu!;44?i iCfT5!ko^}-n+<(Gg] z <:2Ix nw7FBzEQ# ~:>@e<aD>!C!aPf8QHڮChdm$39n3!>Mx΀|#Y@gt.П yC q1zKx"*.•4V_;.h\[֠7@kdm}?Ek[I>Vt\'B+LW6sSEP2[;,6G%p$\]G0#i|#&IQh/?QUA7cD 0PŖAI6 k5DhiQ\'B3L Pȵp"4 Ǵh7 NFIvW l%-.£-N| XKS.S.Y,Q9[5?%5-qaY[ʗZ*l&>{z=)⩯'ĵJdK#+aD'1J]&ynye f2NM}TT:A!Y!eX[r *[,9VG$kSj(Ϟj?4yؙ}ʀc[?4rf-.2 :5"O #IY8Bc"KębR6 9&.,":$RFhG,hT*YF;\>{Bq[UtH{VIW"ye96rwa'uf+z)df;*H'g3㨥'EFD j7bp(Uhsjk)Wf$]$~Qv62,hcOŐ&E~T:uX||:g[le{-nqI|Q9_ huEsaD3KJR}_ޞ\",wƴ̑Υ{G3H8k7hnyg!sӔT%9JSYE;Y]tdD\ߒE-սDbcfLjTqcqXBKEuCZH|嗑Dj~رV9K+xy#=(jE+\ !>&s~IJl|4u쿧)@zxhbݲ\JqJxчؖA#=H\nNS1бyNA/<4COdxnyWޕkj.y;Z?VTGJR-^fѽ n}ϻ|M|Z忧>?MZ,}*zxцGMY@Ջ tM~CCV?{R`じ+槨DIY.@ɩi{sF2R;n/^ggyp[]I @C>NHl:ȃݚ{W\iŎVVܥVp/4_ [u.F M_qwq2'H[;2cs{Ubm$7%} ?Eҋ=8VtM?nGy*dRE~d2*ؐLj(PcWL5E|ЭGޕkR|B=XN+rfȃ/mS|o'ޛD:^p z$]զWU" jOjhLm-`]r]цGCv$_Su& iِc׹:Ӗ?*6eLmf_,b15j<ȺSoGW%liERdJяs*c}$[bNu;s{mVXI dyWN +r~LO~M%K+UBSix]٣tkqŻ|EvVO+vrZӊVhbgc"7hx34$d (*Ih%#q2!NA>J*QrFmBX=Cz6Ji*(.OP1bCLb{X=s&{ I]9ůdC/ Le|GS qz@:\4puz:\g,p@d8 PPf2k/B Ȑ7ғy+>dF] 2 EcDaLE!|їׇ:#+du{=ro]l 4H?tzk'=IATz>: q8 CHw S$z8. IOM'ptXRPAz$e'vFw'RV=݈[TGu" Ы.!ؗ_\bt"Ym$1s mf q}}+dym'ɇ} <%S@;dkw?mHk]jkH* 루8Ijֻ|-dWqF}gu",$[kA*UfJ 2g]d8Y N)=*'ȁ[x^xɋ$lphJ; `O^a;<i2 RQrT߬ؠ">?=M,B9A]qP4҈eTz u dzI靤kBLYIĩCS?w+S.{25 v)e_͑dn?.ĝ>PRϝ4|hGH @9:#Մ%&or*iPe%A/݁Fn [J&J,'Y*ե91}4ViL{A8.b:ROڌ= -ЁP~`;D︼8)PV=-UtJ銵%Z㶯>x {S@._\9USu.)9܉\3@ہd? 
>Q5IڄO?߀$E@#Zy-e4!?3kW@Րf-YrA1uvCюόfȗ,DұhJlX\P;ngLqo!p7@B޺lDҾ}4@- s=N3ȼǾ.+S'\W!eFu ~5<}B\gj|EA%#S7~׉p j7p6d爲F%E}(]C!Wd6s Ⱥq-xPpr@쿨 8|$y=stIAy PUA!/P.5?SM#O؝\90/%xe^c_Q/m':~꜐sdY+2jOISpϬ3'1VM쳪5]'V8z;j3DW1HW>pg`wA>WMֱe T}{" cA]tKš- *JE}2kNI;ikV׆|+<.Cޱmɼݡ#I;WE u#d'S65򶁰vf>#{>2Sug7K<.')3@G=MZnu(fbF,{[O*ߦ*ǐMG#k-{D~.󹈤\,s+6 ph_$qP'(gA@C JH7 yI@9s4oQ_2)7J'k q {k|5@|z}'':K|M/c`b\L<+1H{Z7 YPPK!oq[`ȱL^k0Rya2݄̃m+Pu[( e">-; NUxyt{DQ-I௪O,>U#>Pú?ϤqIO=SuqG||7ٞ.YoT9%L I}'{y>S>5%5lJ}TΪOJ=s;H i`{Kȁ}UMw[9,e&I%MLQ?I{/ gz3|A^lxl}b?P3CaLS 1LděBl~hr ?lG0j7ә| [?Oc>e~aEjol>,sWX a uk Ie9-ij/kG99f4_w26a|u]5yK$yg_|qK(=e: ̧u54=V.CҴf i@<&٧q 񘴢MK+>܅xtZ}yX'ϗxO⋒Ծ&*2ym aςo $L)3e۶-۷7 ,`X&dG!q.׉Iɥm]Q#@v!v)YUmO$OkU|bLA1vy묋H+14waiᅥ%q._:''c{EFSeD=t&d"f L}JMt!) o[ok'&.[P7ch4,RXdET怣J|oZ8|qTl2ge*ƟR.o x!Cc6{y'Sv$ߣJK9})-cj>!֬~>*F0}'{r87QpN-OR3sQ`~Cg=ךL xzeإQtY9 Nfx Ns=   P p",p@Bg?(z8*4\NW:q\1[SY NX#>Nցn PUy=@ɫ/ N P/DhN\'BL (8Y>Ehx( -YJɖ, Jnd+ ggҢGj;C# 9FcD+Ye( b J]#M}cYiI]H^+6smR."Hvc3sC"hq:3Eqzy 'DnB {(z815V/z/J"!ZOqilx IJ#ǏWSRVnSJ*C.t9te֜&:*IZ9M=Hz̪TU?lfcL3§LVe)D$tD鼾ёUrq3"kl`YJ+yWA3GU$g&?ՑY9nk]uOdf-5]`8,@ JZcj8"@!KzSQ ^NdtL|quw{,;ג*Dq42˼%UXul'lx=twF˨IcoLNOvQ]ce(&&RwY5ErcE$?vlaeHdV="9$QlKBs};|(8cj,@!G)"~5~p]1gLRo8@nzth佚pu64=걽ֆul۳᭑㎚Xh>mPN|x포dl2ںv2HXb-b-&ư~HG)~Rֶ)WY-YyHk:du §l*Ϯh5=+Z2r_/ w[&i8NU;jH~ttn^۫BN@ 1wrz֦`EpZL^{UfMd]>HHI|m%'x}GrF/ o#&#Sw$oByc="$^ 󈼮':xy|IdH;6,$d=ѱa{lMoOR"C}NCT1xj{[ҪFpQ;*z#󳧐oe q*)eedqPNA 39*^P/TS&/r1SzK +A5H,,BE#PksWBWtFBzG=`o(j P\!n1>*bj$ /@kvg. F.>?(./;!{~AaAthNuQFmHS4v$*-pLJFarn29qIHZ6|4=?Ap/VIynuڿK`ޱ!q8Sޑ!qL}C{[ c{%MBM }# /G'Mgr4r]Gf:9v[ "Nƺ(1$22($b8Y~x7OO@r.M;pjgVÉRY:6kIA30o s4nx/@!W_Jh/<ȉQ#3 9aa邍+dEjS(WN)2r v*4!6w iRl6MGFaP4?" \fHhěb]ТVժmO7S_o/\~?hΈƞ5?&]}4Y\);7%\p6^\i?IY/)?Vey$[/pg dԛ7_V\mJ9@F>PnkGr"l>A=hW/vR%p"J¥O4L99rl$|4jm5ˆ~L_^-ܑq}$WQgnS`WMd=AeB]R"ai2~BVۖ %S5+p%:pԘBBـ+{Ɣr ّ]GFw4 ¤gw$Alj$7 lU^VvKk 8%!Ki HGۄz |\UCoR$c3ͮ-4lV%2ˏNݏmHEnG$"׏Ev$D;6 7'2=$nE9oG{\& f`;-ὣCI@|c/ CXO"ʐʘ:|-P60K,bxL0 ş>)$_S6K8# WVFl2%t )0$02">EcěȒ=y\DZ@@;⊒!9'$ldI[#J+"r0G8B u ^slۀ cbDza/NtpEnT~UZm|/r>!ܳRYjݝO~GDw+ nHؼov mgbUKX:ОAɮ>43~[VtDQXd1]9i+)0wTWSj/_7#B#L r\st v~KW% Z$_|H~c\Hs9 Pj#$k>7#9W\,QW|" y4]}!>{گCWB5ƞ ϑl- Ǐ ^%uАHSHSt?b[, a1x/ ]-5(+_K?(>PYdk$qeg=F 'lj2j!D*|2\9] Y( R4t^( n(E-"7AO+ Bgv=d;$V'A!5ֆ kķ@}&)|þCaOH~Xwņ!񍤳4p^wyl:Wp}ͭ:(2e%#F^ZIʟ gPʗᄆ3ݰ<A,96AV"('d|lQ]B̀+Հ+\WBMScM ,AI޷1aP1P1Fr\ nl,@!NВ>DB+Ʀr\#*Wn`K36W̵+p%pBBj$V^qH`#>soй1S1YzDB{F tAa!77VL`i٘9*F$p.M Wn1Nk̪z@X9S"/6$:>Ų?T8Mun֯Ȩ'kRF[ -|B\DtBBb\U1'!94*$jk7*H^Uz$ky6^$;{M% 3RQw*7/^GBk;0G 2NB7udljHի3I;{kTPT%=I gbnw"˴ʉLob3Tf*DT@ 5k[JhMJ|`bKB@UJZ} QͺCO2$͎ YG*$,Wk#H3oIs,o{>s# ύBo>G|hXyHn/JzZ趘 wG;{edNa|u,B'@ѻjæg qӦ4|Ha`4wWTVTd Τ.&-{3'^8.F\Lߠ%$ɵɝw҃-=p:t4*dI#B-i(۞Kt]?SUY/pi|v6`}G;iC+\r3's8r~ugYG /xi^g'=-Ю>GhS+B38xt_IKMOFD$8bNYk% gw ?+6C[[IJ0{3/G&w'KSL~!@!~`dC#ۇ==ix; s] OA]7{c&+;,jd\'J~4I׆/`'.EFY%mG m &]YيHiX̻<:䐗jCǜ^/&kcj1p$=CZ$<7pHpOv %6h _di+:s䎣PdqA}B,YHO#̱.$]4\Q)1F!AW"Gz:/=K=[ևm])iLII8sJ’ZRVr{7%'/s\Gp$9s \ m\15ϙ|zsfΫ  , .+Yfm19nMX'Z7 sl ż9ARGc$]6/ѳ/TdK?@q/՞ "L\TnxΟ:rN${⺈po{1Ƿqi%ٻTIQŪ2!ǒD);oC\}Z5 w3'mq7##ZL/sG*0ʬ3erTt3rp>@f+}MPI?./!βK˚/.kJ-dž, u6ܥ T5O* 5ZVgH7_9m|&Pý77>h7f}^RŲ29 y m*7R7Jt+4#n}oFKB$lt\ {|gK/C-*Ďpw{$?I?|o*K?pz q!.!Qf V*8A>=}_}(@!׹u5C; cW %]BX1q݂à3dNC& J99,9#u0"Șn2nh:2e"ƵLil"fH<?ONS ː7<1JA5̹I9cS F~&ŴL<-7Uh4z l߀7栃~ʊl!?$Rf۟EGv &@!мSt .\3xw}3{M-M[  IH#wS1AqAdDAQ#5A= n}sW\w =lƻl4d_^[Qܡr2Ka"s(l9d?F<0JJhp8OB+U+,^mIMJN)[.IȾ(&TX 1 o=㙿KEd)-X'u9µ 2O,a8Q=L  5!$"RELx`אОEOKΑW#M܃AQ/{`9ޑLI_*3E&[*'t#혖䁼pHi٘s\;$ wnDmt$qo=O6ʬQ|vPd_`&b*bO[TIG͜+m:& >Q쳽\ZF/\>(͟9+@qxa"L>%M͞ncԴ ·drTL{6ai>G 6_I1GJu`_.8|'JO@r?eN:rVcGr]@sR 
g=rNÜjdT#:F}wUj5iͣR}ҊXj?.jR;O\v65vv#cj3j9br:O!)*cQ-lY*HpB6SPnUxLr*\Q«}MÞ;G+ŘҞ15}Zh&G h ԰_AĚ,mrLg 7BL hc)Cv[K5`ҖZR[7/gE*CR!JRiu ~[S n@+h,hRS`5@ªAm";)1\[Jl:h7h SA POzIh +ʡ@kK=dhL/ f -P {ayN8hRҽO$ Ґ~*{GI!G=b>C+?ESs:^z&G5m'V|4Í%ceFW hC&/AZcj8*@akuŦ+dSJF~YM0%- $D6VSWg|H84Tyt*pm?c;lSYԢ39HbOpS_*tLm|4v ?j(@ %h= y,LI'u3ږ@I1TPg;2$.0>bo``ho|+:!`tLB`|ؘ~؄ MV|B\Єq&GV頜ro((!O,h&G=H(J[{)Ĝ29 'BBN PP@'h!馻1>*q"pmلf9LBσ']CJLh(1x=*),!Sɓ5S'nJτbCpFLqԤqTSf  ӄ#٢R#t|I5 *IK=ÎhMhmJY? 0 {pSQFx0a)EUNAR eB3 5 ޜzWɴIiGR)}J(=UػB!z؆.Dro#ˇ\f !4(SZ ӻ|MR'0&jcQa tcp$ B05 2| !UF$j܌r\tc(:bt #S&) O'P#эɴr!V |B(my)kbnɆZݕqSҵ= ee ˀ.D|l)P#֪OU1Z Jkm+WwXc;]ϔ#+PL)( ~Vm[u+}o#Ӿ+&eڻ>:vZ"d3`mU>z (Yd Y>gL) [}a/2 |M^)f;R@i0S#iLh6P[ J?þydm9w/k(i.(#;|s"iD8p[1~xNq?#qfZZO1/ZO2$_ȋiZpK"x -ZU9լRɡ"|9~B+N˴KMWI3 pĜ||Wd ^!߽Czr3B3,اxWȎlP;UAEQE*$^cTVTV ߈1+Fv/TU2J5W1wthIQQڢnjF'(#&eXg5h-3%C&qz/=)\gpj4cBEfT?d[JczG q%ۊ!$[ODGqTE#94S@[|Q1azcj O(l1q M]CM!Q板!t/H&"BP(c^|,ޕkj+8_dJ_JO>G_ YN4zHe$@/!FWv 0ظ>a $fNZ-*ʿRpݎSZB_T[/Iru a#kV׷hMnQӶv%Xw$OiЛ@"nHИ𘸨nѽw*v1StBr}FB3MXT7J5U,*SPHƶFEkS>@(H'H k[r8u[vjYTHtٔf =v J&:;?m^DR=i^Cr(9S$痘?s/З:_|hH)_`)[Bw2ى!0,o^qRxYM<>%vMdq^yUOl"_O?␁ZBݪLH:hh SmBs`j,/@AN}Tr"8D찂/N-\ yX:ѝ78,Pj[~=X*Wsj05V&=^kZY }L9SF/)݋GNyk$7.fJګGhdj'AVe SH$ | s7|.70\ldz 3^kM޲k)u3(5ՠԘkPЫ$JAԳkC_[ qrOɔ9PScM_oV mfJ/ְm|i*}h'sA+Iռ5Sy :'j/=XQGwPQoQ\E5$,t@M4#]+L-u]y2 G/}9A#1(K6Yfɹ j'@aTv^ bͰ-B_ZNIOZGXW$喟 =3甤υ$s6U~#>b>!׹h*=|CSs>D.qq|_M:9.[7I a! ۲#I I:^yO*7s0 oE;E6u/^IP!gZɝ28N=, Hh?GU̫|M|_,uQ:x]B'%A |}|I`RG\'%NOUW o{*y}^I-!(eq]<[k)`"V.%&ƽz$f>8L;C MT>NA1RLN] $Fs%9s*вȝzW*LPRW`y+0 !hSc I,,83[(L_mM3R3噚 !l“BdW_$ۈܕ*5;mo -Ugwژ5WvHxLmE5j%v7^τ]AƄ~"}MĪJ߾X؎vBIAВ+~wC-j*Y-J5՞MkMՔ蘊96кz-Q@;PԵgu1mh+:b)u,$_coacNX6acC!{ɧ$a#dU'eYIcO-z\-}uCjkj-"pXq35 5Hh.AL]('m\ߊ`P[b#|+JE|Ln5 #>PW 4v.H7|B)"B[gdY^PXB#Pvc 19ANf05 PRqNT.o;鍦󩘴VA+s6]pO+R8T)75 0ю2KGwL=:Bi^d8#$u 3ŇB]Hy$TITfdks4Ch[d= J6FZrB.[H顄!.|WDwJ._lCG g(L7JC2dηV4AEF(+JE6lEzǢ(Ѣv1wHhΩu"5(*U=U{s=RZP܍۵2ө nvղu$S6 NH]ORk')$mqjuQluA}xmR6M%C)mOVmJ`4qikae05P0tl}}HXfϜĈed d+@,@er99bjIJUyx#0i+*·@t0p$G Kj٪uw%چA@%V]`HsԤ%"]#V m=0C=I7&c}`45'r03 ;:):cPM/@4EZc@f%bj @ٍsK{RIׄkh  J?V wKRkK.G5Smmqژ6Xl"U 2d%%'4S  J>ɭk0E!rMeJ>JNhN(9q(B#iޕksmq*̔~//(l=2Sd3N: RrJ Zc>B_nW$_CJ\ޢX +(RA*ˏ\tgvBFٚq"]Eً[Mِۓ!ȞD<>^Kॉ&)z#Gvgwۖe봗gYܠwL(hk[BMv%Eb5\KkfZ4qX/uiۿjGkbhbh`1;$q# v"ƄŅ$D0G -$'8M^f˱ܢݨ(Yb-4Gv"O@s\5R+Лۼi,Z4qf:qe'_*g ,*#6(mD0feZ"4 Sj{m EÖRtk%@>v29*\'!>FhV.*l``=Z30$Нlqwb#ϖL&&uswA];[ Bh@d:nWB5Ոq$zylo 'Ze8S1]{B  */0!&0kBHD4e1]CB{bSXotZ^Oq%ݽ7t4tG;| XM.#ݓ>Tn'ˏ\sOz,R),?ڲ|PET>rqtT|I5-RY[;FttÞ˹i;"oqXCIɃ❑r[yK?X<&Bu)6CmY-$_ӈ1j㙑pBvB߅X%@stɍR$5!1Z4*PȈA+Gy hg9v,MP}Y^;>Ȥx V򤵳 ChzdD s=0_iGMh Ju#w41Y+@fιdIt\g8p@T09/0A6M`j\'C<9v}γґ.!ǎ\y_c. :ZȽ W56k^%yyjkrSNž9#jSg석EZjSzk@?P%*p'=[<{YmF(~ JO=󵫤7fK(=,#l|Ms6 "z*Z G`FBN@. y9AA@!@a E0PB @RDQ wE j2yq)54 ^qShRH YnIA/^ m hÀЁBG@g ]! B&@(nˠ'<($,PHp^ЗB\a S#(H cbYq 0B8`G)t1 c:A@ s) d]Da@^ \K) C~ XJjb;^Ca@5F7 &vHZRp 8ql:A@s`/8 \ ^q:&ܠ@ < nQxp yFAWg,W޹A ˭vZ+QTUɭ}˽dVӤ[!,{)7vRSH(dqCCU9(q\ n 1} ⽢2Ya_W[!_UR s]B,J[aNSP K|UTwZ8@}@C )4rf;8"ъBk@@+)vBK)=N:;@Cu07'-09AO .""( 2Qw}q A!zA:H âр1ƺDk1?TL Ma.gM= R ]* k@ _ܛdV]m. 죰a?)u'bpp8l(<06N5b'm)$+ ] o#P)h8&jkq |>: RuwB@!dpg ْVX|Vgy)(laWBq*%)Zԥ)PF(TA%@TPP P1,k=ou @M@-ȻQ젮48nBІA{+ľL݉Ag@@W Aݭpڿi KEA[bV'y#{+,i{hTVik1  Fxχֽ&% V6IHG! L2Y!@ 9 7  FP e@aɀ&[$.).b:ـfS1TE{ͥ0xo`%*Z uY+@ vR`笚9 :#V,8,OcNp qIsp R}%5u'fMq9#c'xF! 
Qxx xEb5(kٳ#0(sP[!xvNI\Dt8 t\w0&'Sfa,s,"^=_0n~xL95)\@jEds9@E]D /DM R_&Cz>.[1$ e"&iIX_3̓/mC!lB-_ 3I$jp+ 5|aaL&l3 CX?[@<,S:w  A}݌|do,0&i^{#o$y =!rR,Ə8um׊٩~F=5u4i=Ogo`k.zR!1JB㭶}#E$39JeJ/ɿ@Ʀ,m|.+Cs舄hY(w_&gN"4e)Jɟ]8i>=~<|d{7=DN!⩁w*&s7 d";9*Kai ,5<3k(Rj5;\Ӕ(r@uMYb|y?eyoƞJ[)DY)Rj1^g5|*ilZ)={=} 'hdbJ +=:7A~ZCreeih/4# A#YFtcs`fKP^~j@"i/s 9L!rDs|I(rJ__BIE-AK@lPy)!pmn Ӛ;?K)Ol"_c-kFo z :YO I212H%MF_f ryd:Ϻ[gYV5;5= SSLΖn E[ƛE; 7Dv2_Y0rhњ ֬ڨyQ_DlP$)aR|޳THSHdDH8O{.eI:tTJT3Ҭ%Z;j0`ץU5h~ltm>9RfP_/`,+8skX Vְ^ϐY>šAma}Voh Ku̯։ai|aV;N2]WAoy4ªcK@XS*C 7ºe]kA=)Inc 3L*. kG_ְT1t!0GIRQa-ֲUpda4~u~sޞw61Iy9"۩_HU,ҝFr[m7*_ܦM3?SpWRuGF/>vh wCgរ0ހk#ޓ-u#L2͘Ԫ%evKNe/c{v3_iVGZ”ԁE.)=pm҂ ܹQՆ57ZfAd݆Z9[C7ܢ nloMn] J:ȭs#jEY}N^ӌqjzG-6D>y$1U|0ű$M7&4AkLS&pl6”Tg`̢x6K\BcbŇ7M|ߌ]}m h$-jH"iBN9oyߧ7>~CV.h_a!ckJ,{{^r~;zū_}1^ߡ{tdyFSl/ߵlJ^W[)x7tCL7^Wo8^zxmi_BU hƫ3~5!( Z=l/v3&r5f}$OOuuI붳fu!+YזM\{ui?cskyj+y"#}>}Eㆴ]Q?V6ҩ{u^W$]{7/,ܒׇv\8.݆+xqAH,iS}JLw~V5OPe*m{^3ڟy]E>_:Ր׼?]-Xwֺ5O56["~^;wAGn5sw^Co+/cy9T4u9/ȗl.^uwѶizykxMj>`W^>s5yۭ luYsTIuJn1kX߬ky}[x]ہլk*}=o %ZvF>+du#GYܜ׾jXiKߏs-/קn//u[h80iꂼE[5bcyR*dzs6eTOX}^3ލ?L({^3*͗sJtes\)܆ 5漮qӆco4siM{xMX郁 uG~6^s°twnbh1ސ7'3]{ˏ75틫q:_CSE\|^}7mKG^_6Kڗװ}h_oO^Zވ,rpjxvB/Z4[\6q柱 n%&?TJ^㞦Oy֝,Z~4U(yݪm:\kՊOljS?>ٌ 3uqk5+}*L\Uj#7] N7YP)ilӔ׎PE_ks>Y\KlGܕTĨlO<~}KUWׇ#m𺃩Ouڵ<С_Ν}*MuoLkTҶkEqG;]*W9a/Ӟ-*i=,oCVWmΟsZqoyӄ.v|y:}~eknu/vKv3}QoJoskR>[?[xIKprH+xmn󷹞𾚢~uѓkn}my<+;[ԄqQxM24  519>|?=wFlTR]~Eȓ!u}mk}>xIn1=F]YmzN3^wݒ&AZ^yx۪͟)W |ާꈻMٟb7%3=2S7<oY/ ^kozoβ3M>k鯎./-00c)7>Gu` eHfk0)OXܙ7-:+9 ѳ>/jՇ{zi [s2{_tųK][5/W76᧕Q̃C.卟ty7N]|qn}'kNuۼ!"J5Y[7?Z]^k߁5 ?zPT$;^ӰLc?.qsO_>ϒ;~֔+cז|} ذQ#Kx-[}VVJuʷjq'_\oӴzyU5kةλg>6ul؟f w==ռݼѨʒ&38߂Zuj~>n =GQ^߹ʏpl}h!~CBym[Ӥ'nLk?Q/m[(ݙeze*^=G/Y>}\x?jR! JWo[};nK_7QZqrp0wcq;ozߚɼO7K[ϯL=xUk4>|~uGW?T5.8[~&Lx{4hnT=Wv- Zv톓x;rix 'Z^؄gZLUs<.)zExejֿy_ sC'WǵSi':T`mnpZ}u{߻ސ{ >d-7[t4wD̝W}sgͷ}ԍ_䚁o|Smޕǰ=8r~zXUE7͐tlQK|ѳ)izR]3M'^oi1H?>˰>v,omޫc 򆒽>׮xev\C^Ie'i"lǼ)s^9yݩÚw˨eϟ?mu4w?>_e\7dLKx'K7m^v/է?g(yΫW y>[1^jY~TϫN4l*ͯצE{sxtiڗ\[ť#|U]->[jze; ue_--wo|oq(#7v^쿌i vVYפ6&OI\3yC×iW5U|'ƫ}.Nw. ֏=#3ǷakM:L]Px>]#{c?^͎М֌3sfcx]:`;ڐny]9>jΫǏON^_g o8xpE_lsѼz녗um | ؟-ioxv 9m92-NFof~~v{0|B?Mk7?>׷8>w5Y%^7㯑G vts='4<5=j,ӿ*xcZxmi^7 S67oh x_~i_i7]xwkG$CSJ'[k_WN~ϢKWc>5s`gȞ mFMr; +ap~u+.5K +ƜkCLܯk_+v\^qK'wތݡxPc' +농Nv#dOy]ذYtyC/v7o%p?J)o=Q״v<.нq>}vMTז[^+?-n3ȓp<^Ԕx>PrvE?aܧC5//խLP ;j6^7n;W1 nY<8>Ko/vf~:OhȒ zmx=<^}`ϝ5~*3{z?WLo!1x=TCz>YyçRW*M{kR9-\kOi2c?4)]o)f}({%}[#ob\LU5w} j Yp^i^ bCo?vo󻵇v}YksuA7ϭ 8z"7!o^lZei^?fM6It|ChzX\^_Կì.Uk/ڊV-K?R^[jv Zvڻ܎|Oxm\P<_{Ş4ǷT7kEp2<M^쥛cy3"녿\3;^/vZ=xt}޹I~?5? 7.3cx"l\c$V8{׆w8*-=Јxm=l挼V73bxw~T _Ue b_x~/TVm̯m͐ojdlM \MS_|DQ>k'TO}m%^qlD"yvEL۶СKϷ E' [? }gUu^߱gqF eSk.uW|4Mےs~a=u-yTyhɈU3KuSFujXw ïO~z׾Z}gvַnL55茸70rN m! xCa?7l3'W^_E}_wLnCKv|tʱKdl }˼N* :^yc꽋ᵫ5cYk8TJ#Xyښiʼ6E=S-yo4a~;k 賧sջy͛5ˍ574_Y[^`9S5i5fT(3fw|O'piͨHܼqb_dUW8>oL3x }c?yճُ5R޼#%jIl}ԅN嵷Vv-2K~]^?3C<~Fhb c%d,#R\V9~Ywl3ol:}GWb;|d^4u)x<`^#p= ٰce\*blbNx/1~7}wr@T*>H^6/Dz }.sv}:זC?(v?[tOEyQ3Ɖ/gϺː/Ӥqn^T "xݦx'?6^\:$%o 7rg>[K~μ޸3FexMuayݻ\9rcFO5|2UNZֽ竵'/ m~q1y ++cpeGv'Wխ8/0ɫn7N6|N%G:]i j1oT2]?prs9?]k4p'z.XRrCw=+qcvf҈c~]Oe}SvȞ'}Ϗu'7swbq͝G֗FGvĝ.>jZEOns6-MS8y!|{. V"pgZ ~Wd'ܓ=̏#ҟtd[s'dзuɚ/65-S/Y+V >>|YI rwBL땲2wݘMopљ^9V3 w+ ñFjŸ;+𰭣|xJ\Sm^ܲ?hEƧpKWBSvniwt=qG,a:Ξ|gݽ~53Y /ܽO{6{?߄r72 záS _]c2ݎM<[1,Zܣ}/X|Z߉Zhxo탷Q-܃0-ͪOV]]!׸)ΛP&O4pYwd<}^} qj\ *鲉[3p⡭gSr&Ιг.wnEκ!j<̝zu~Y˸Tg.4%z&M[grg&&ߝw"g9/$TnR ͗rkjpgbۚiw๚?LkZ/ͫw^mnUoLNx2d]_9Q !v0NW۫1{U<8(qIu0xM@ >x^ٸiZ&Ȼ~q;Uu4= kO3#Gvgf?xAӷW올 >1!_/8/VΝGuV?N[&Vq-cܝ{O|ۨDVV;x-Grܪ_"b|UÅUpwtۀ5#E#JrOoY>}5;Cis7L\ kr7|ϓC. 
wOYSqy?7L}b]ܾ ~0sv|r3>1l՝{T-}~sF v΃ YzL])l9Ψ-S!O[=rvNPp7"Ӆ{ poa8{>hp^)7;9;kU+N̟W6|܈ܹ{W1rw03u}WzdM\=3V{pז3*O-eWnP^]voWƵ_{|٨q{\spC#Ο:k!w@6_/3wa>U.}ŝNe>qJG^i0wX|ø+[kw ]_0\.ܹƽ<wLs=*We2˟x485fxaPrprzgni=w}"Vlp7wWVQMY{3 Z3'.|lݫ̠w>¨G_sFMf`XA<u`°5SN UJpEhΝ+ף&O]=4T>mӔ[%#6p_mY\<{y;wAi7}{K|4~ឌ[u ݈|\x=ف䩏y9ѣ{˽s_+J0^y% z.=o} )ĝ{ZCW̃v<={\Spܔ?Uw>s_|gceZsb&:EͽS5n6WO{I|=Rt_*r-0{\<[y r6۹{rx=sMן.f2+y?ވmJ0p; ;r|kפM,p)P6qdi[ܔm)R::"Up㶾=,A?>9JX"׳4(lV/?Zg~1~Rc w.3HK)k}U;_jsge :=xԽ78]QƌOl]?Xe?=4aqܽ. 㲸A'zl8C"Akyy62r| zfn]Pfb[l\^SkwtVnSYjVq>k8!xҼ? ~0@]5ڳ*dnUG&,NX39qwFŭ?[Oqi6\j%&O>]GߨRVC nEǤ ^l͝xtTq'{{u\M?LoO]u$,;e^؂7|G#%pSpOڗ+ݙZko=_P*ǹ2x[˸9?vJKqG>|Gr$ڳ)wề n[ʜ=T`Mʯ;ON ʔO(ͽ~hӂW.\nPvxslnmZ{=sɁi'f>mG榹k_垶w1x߬,%=9p.,ˈ?rׇW;As8zHtܡ w#9vlL[i㱘M|AmcUg>Qes2.hbted9|&}9cd~p*!*'wAR/:6=f}ֵ=+q|KsbF S4opwFgXixm􄆩&]v@vȀK߀;_~ÊW\3LiX+a }HϽ|<}{>^kZs9ƸPBkVڟ#yі ?/:2oC2t7h4;x9u3E6mk$2ouv*^gD`f<v.&7lݿh JH |8cI'? j'[Ҽ:C9J9ƑEcnYXtPa)RJtPk? ^(SGqgiT!40Y^}(߸0YlM9 i3וUn\:؍Vާ,;2^$ HfeGEV?Eєx iҸ9#ED9FFUQ[xխOv%i4G;~Q6 ,A}/,8Dx Gu0S?*5b0F2C}Eȡؿh1뽺#p}_U*LG?E_t=A8ڟ9esn-Sa=Ɵw;\r34n;PlOƢZɮ2&N.p "<ӄrDKzuJ#_׉.CfzpKşPi ~REg2 q4:s_SYn:)gq'͓,4ɰD펺SBm"$+qkD9V'hC_.uwV<%t]i /f2Tw]->CW3ttlBcJfg$Q=/h`|yIx~' G=0hEoCnz"s 1 Uy3/\K4d6j;=w(f!sg (;p f͊.@'Y=܄k=1/'BQ3+/Zۓ`p?jdK\fDNZ9MSk8XtМ]nGt)5&*o@1g䎳 l@ߜ/#T?q{)47`Fw^s2P16%ƶCL v_1NJAfAP{fpp0iL`NAEKӣyi3&vo.GKA_~Jz\ /UlvӍѤ[uЈ j#41tr K7C ,;$Kv#L"ͨQWӉ:ƻtss`<vg47C\}28%Yo?{w a!]h7lzńmjyE0^ml t|ن  _H^&P7sMgPٸZm0iOrW~75o'~;Չ̩Ƙi)7wfF} ($ǣZ#F -hDaXTH~ 6sͣ+B2;6m,4v49C)*%`OB`=: t-%avRiIDC6ԍL,y.H%[e;Dsc0܋ P1EOChNiҵDd+ f ʋ'TQ[n4@>|Vy WE&} 긢ݡnq@K%LL'G'IזXfSNXxNev4>cRI>DvCjm6(߂h'~%#CU|4e7sTl?z'+ϯ[Fy)L[%/G3Vd51?wpVB%qdls'GQ:'̓n}U-@K__{s_\xYhQKh@rs,8,͙Vgbܟ;Fْĉ:\> nȱAnhM전d)ZA"fZcay>oEe{7nCjђǤ|soU-NIFv@eq@ͲeŸihǕ>x_J,|uf94r`(Q.j2 ׸//"Xw9ҾeNߚDr%Rչn!l e 4J}h E;H*blYPÂ9[C&Olcوch/,h|5KФTM'6P:wg <{]0)K[1Px>M1luYI24m|C5/;C;дt w#4?ChZWbV ;^n۹ţ~Shy&2oZ%@A?{DA(6'lek+Oh9 ^yF݄ۙ/J& W0z`5r0|nX^jZ]>_w a?]61\O̾v5wк$1*3#ѸWىKGRߩlwIfCeh0xSqZ3>pY.d~8#tr'8MAD׭Þ.z#B< EMmbӈhIU86S$*<?l*A [Q{F Tw = 2It> ۅF%ē¨M-t8أωQmp^[qBjmQhӺW_ʃZ_X9%Oضl $)ן<%i!Tm%-F;x7:Ѡ\ԳaY9J٧G壍;K4I^:/%cW8!_=ۀ3T$# ݄0Y⃓ l}4,P~?ju*m s pAx(jHM~ a>;#~SW e]5P:{`("U|w1FUn&h(g )|3 < &&0.#4r֘':r͞w|Ö":~M.ҩw}vhea P+C#"E}Z-e! zS?^y_*.}`kEbBdцm{~%!aQ 3Q55pwcbpy|yf Ө\% Ua7# Pa#aZ;> Ž__FcO|Pǽޖ+@Cwهwonpk} oA@CG'OۥO'iuM4rNY$p4RА|J[D-PϠjr,B>;xŨZU\CeG9ίAYI sU3l@)oyxi@iI`J)XTEBN=-Gj5 h=".=u %SYaGwi7Q{-T]٦ $JOC|7^]4M,:V(wD+gKKGQ 9 _GsǺI o +O)D..F5MzXp0C X 7O^`m26fK:`G l4c uo!A}-- GU3٤X+#)Et.9`c|URS}g7cBtHg=@#KyWm @X ~݅ !U@~,SrV5X/Մ0I|h/@>#Do&O1Gc7Hl55N(!hrW<̰~y~V .?>LA#2u~SHq0ֽ(-^M,k@ 41~r>7Kz8bLz?=d!_g 8`ŒgP$^?.|6 v7aQ1 {XlT ]I X3$Oy<- k{)@`9ppO {X-UvH0x؎) ><<4ӣp=̱w}-؞ڼxʌ¦JC :{YI Ć%nOҹg.><ˋs Ff{(S'C ;aaӸ]5<%eG´[9pQ\EacR, nŊwJW1q&X5)T3cr㣗2`k|^'Kl_I + w\`V7g vKJZ1! 
')x7M)eـ\)܄eϣaI_=q!ՙ(s'XDc0k]ac}Ȓ K;_.|Hn&[.bz AaUf[(YbvvB9`7 >4}:?baA.~s :YakMÛ@' ;tWHj ז&7ߩkB°ZvǹNʵG'ŰAGOUK UXTv )>-jk}&eUd哽@L2O2MҼ#ɑϧ7Qs |ydJǦfY>forpyN@pZ΍ 괱xhJoy%`J( ?X֗7ϭ@ԫ`M|lX19|K \{ >!ɞ5aa>6{>_ y'yd"I'U#o a>P{>,Ǘp~lXIi= xV?6 *&De>{^n박{C-{W+U`uG ~:__5 Vھ睁evXԇAuJ#]Xd lYwLjjXT(~@Ow ֮HoV=`MK[6XϦ[ {@pt">|F>*դbÚb#*`a 7u~[,^* s@)62KHVuOeX'Ǐ$|b4v܉3Q+:\ 4֘og[vFO |^Z}C 5 dnF<_iߕn3 Tˤs V =4 0W?Z)?=|#`{d~(׷[EɅC(<gz΃ .bd?ܟ\ĖlYX-N` 2jk^`bjw:ll [k\U):.Sx ,7>#y?{]]vm@sZVfRWd H/'=G|myrs)`?\6gv;@ o  3Ǿ3_[} 0wC?wau*L\ͮ@t V|켉$|ZOy#ohI%;Sk|y>$=Z:W@\~ l@[Wf(}k"T^+p7s,/vnWTS;K5'n,[!X:{:+:lRƐ]3gY/Vcwkr5`ؑاtP kMoݬ`3kd1a{HK]5 \"bEӀ_lF0wȧiEpI>* k&9DON)8R`m̈́b) ?QkSU9`o꿁-os aL嫀ytV,:0.SS$п ðst挢I.$TqwHwXa F$~$6o6,<[fKuUӿ khJ1GWv,L{07?̀X{KXIQOU,MhoX6Q/}˯@LIKGr%}NbXc[w獹G a)ZjLj^D"[(qwYay ؛5,[:+)Gg G_&eo~muv%x'*=a i \O!&v7kMΙ:a n%Iro[!8Xk,D^$Y+R9fH>S^,>< 3WCUKF&.k!+ }si4m 6O AFFć 9 /%?QVW}``IdՃJ6kDXas;4X~09`ŧf$,RE9ނ*\0;U}$2 Cg{uOa>c!FU^gE*a `+6Î$%rbsѦWo;ؑ! ]sI&"4L\mΓ~sjXcԉi X+:8(.1PvBz]P&|XN8^P~WLP8$ނDـcyޠ<[xk `t'Q ݅֬6QyR_iFw`GpoU|Ip =`~ӵ N-Kٸ'aS-,x qm Xw"j'NEl8t 6k'jس߃W^`o^B1L-9[JOH|_,Xx}ΥHj`LJh^܍g$aK87ăZEzFT ;#n[`m@H966Ly[a2V{Ls7K%\fFfaW1{ ' 5AM`;|'2ͤP p˵3 * [ $K`QVgX Mgg7:Il Xnf޿0ŤZQ2iHX Fҍ/OÖO#&0K}-;_ƥjR5[_aagi^J ۷5J6}gqh:RF@T*#y9ZX% j XG0;Ջ`o4ln=|.3s&So{iϜqfrX~ؖMR'WA>-?Ulꀉ7{Mwl:A WIM%De恵G"$'xR?:OށOQv6XWCi`WFf*ŚNZ|kϛ |f \So-V!͟]XuuXN1+`R4U),0g܁1%$|[`ޭF'g_4k1Ϥm8i1 \6i*lO|q%1NVwz=9LW JY6XR R5K?$ pRj@0~/OF`/P]= mX[}slV&2ݫ sm x}[9ǹdX1%(ٝ-N7I[?9iςÁ.ˏ#拰6R0TX|9VE^<<o(`> 9TtQ֔yu"tV2xMt.WƒL>uV.K *NAB0z `o55Rtgv&{]8azur\& i b/\a:9׈csGX3}Vܰ÷wXy-}S$?a~VeK\#ŵyVMʼ˙b| `3nC,\vY{0)Hy 7_n`!#?~j;wU A[Y GlNPv(1I1#9Sf{+X]UgL.t,)u)+2 ,iqYF>@D<]aXflG>Ր7k$|M?,Ӕ"l "IvUtEE=ЩLe4#Jvq@қ3c^ 2|&zRX_$Su^$zpN*<_FK>=@N/剽 :6%t,aM>+Ub?%И)_f]kսG?ezd 0-u2Ht ,2jB‹bFsEv?=րMWq\8yC}oPӇuVC(YU`G;RO Qҏ>OX³}pMm)I80_*VljGZ*!Q[miX#tG`2~9׭/`10p.N@һ۠RAO5:j݃7xHr Сl?QȯN6'D OǂvG2| x? C[$~}Cp$ D3X𾅕u`[}0d6߭W ۟UJ0ta$!"X} Xv~^I'2Q8nK)>X 6ʢST|U" +E|$K.Z>8$Vank;1,Jb(Hvfk݌z 4 W`~[ ^9ctJW%=#3ʠ{_sAE'SWyg-jY,lj}X 5/Q|cVX]rz?@H#^tէܥwt'h`nKmU0ȺK(9X.0wԇrm*0؊_$)G&x5,va¹ pHW+} ` ; k:rPC(oy| K иc,Jv(Q0*7ɝ~v| xjqM+Iwrj%F(*Ckm)9{;z`w&_&e,hX~xy٤-^@3b I l5jCõ ʦwˡ&(4k~~"! `^_ji`ӗA j>`ᇭG 85[n&1Poe -Ȝ3"1vN|x9^ː{d<7-Na"3􉰞Ew )-tH,ԅ1 MMusWCY pF۠Vn ˧oއEFKcjTHKJ+蝾OrtXLJ(B?' өa6Wd2 ]Z\lia0{@`eW oSCkL_aޙ(K@ܻdƻrCJOqr.OM7gY0ھop%j9^]U@?2**u#^ѨKwsc*Ĩ*ݵk B~smQ[^r+Cl G $[OVK)k"L"5"vJ:~-nX@|IGKHhp>Df}B '(Kkjz5~R8Dv a2Ps_96KGA[c2HMy#uy P_ހ4l:m 0U2 *j!8arv)|lf!Ꮁ]^{L:csd>'\?SRwoѶCeXt(T*G9@;X\|$eT~ t7˜Ш=Bws}0?mG9%lNoPMbTDh^,ǖ S(K,(C6cPisg㏨7KC]XG(c Rb̤X@mpLZ7s祂yIYyb߽dq n0Pnj?=͐̐jǓIs7a.!-Fh*{آESVZ2i:dzUz5P{w?Ųɓ~qQgp;ٛ2KGb> Pdp Y{[|c?פByۯqC7/pp ]0lK ]s ݅ArGRggs%T% M6ACMa0p).r EEt>h9ΠoyEohѻ:&/G]g|?9*2o940a]0nvjE1hiWcBy+y?@J!xv0+$)mAEV`fsc?*M`)E2]29j Q;5(4v2*o')Ni\6ö˿"J=a\冮%-+j/x˨.P1]qܿO6@/ˠhS+SrXS/}-hMٛ&2 A ̓qS ?{A_i$ښ.c AGw'RވMs)Fv#7,?F^9v*\D !z!gt cm YU/Kn?omgXyψ~W;AnMΤuAt蛿Դ">3X <2HU|c 5䠥/!ųzFF:A3hr>`znw;~TfD|k!L&?GɁ84uv t_|)נ#%ԛ+bi9S)BuYiNB=T~XqE(=q8,1%FV[݆(jd2 7*[PMAb.E)Y^Ao%h|UO3~E%3aAQ tZDVʠ8<@{m> Hey$U>*= tbSyk(.KVftY5aG'8V?Ό=e3bkyh5+;k+E kҰSveɈNiܱ/PmqrTAXU>_7C{Kcg ~W-0NH2oYXYr{AnC}:6!&8.580Lb4]Z/sZlIoW>LʓSs(qLezo &m7/g Ku j޺'%N~@t0 Ldn)i\κ*cWx5)c70QBKȲEq .fQwY0VXa>&~%U3~>I)J:_Y)>ߜJ$<{ %0!y5x<`"HWźE;Ws=u-  -ʬœ$ȓ38яT/[ʷBgA9$CL ]ڡ{'dwL㞡*clU}tbh 2#f:ًҜP3.:$Y2L/YZ&Z_ ݫG]"顒h+ypwFjy释PJQa#}39^A)kiNE݁{]EEGThu )fށ_srLD~&4&B[D;zA15}F=mMt4z@Z}UXS' Tci;%kgt։1N ,N3=1Y c8(*S!JFhl~Qky m+C6 S=w`rT.w! 
emmeans/inst/css/0000755000176200001440000000000014137062735013454 5ustar liggesusers
emmeans/inst/css/clean-simple.css0000644000176200001440000000161614137062735016543 0ustar liggesusers
body {
    font-size: 11pt;
    font-family: "Palatino Linotype", "Book Antiqua", Palatino, serif;
    margin: 30px 50px 30px 50px;
}
h1, h2, h3, h4, h5, h6 {
    font-family: Arial, Helvetica, Sans-serif;
}
a { text-decoration: none; }
a:link { color: darkblue; }
a:visited { color: darkblue; }
a:hover { color: dodgerblue; }
a:active { color: dodgerblue; }
code {
    color: #602000;
    font-family: "Lucida Console", Monaco, monospace;
    font-size: 90%;
}
.r {  /* class for R code chunks */
    color: darkred;
}
.ro { /* class for R output */
    color: darkgreen;
    background-color: #eeeeee;
}
.re { /* class for errors and warnings */
    color: red;
}
.r code, a code, .ro code, .re code { color: inherit; }
.vigindex ul { list-style-type: none; }
.vigindex ul li { list-style: none; }
.vigindex a code { color: inherit; }
.vigindex li code { color: inherit; }
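For context, these rules style the HTML produced when the package's vignettes are rendered: the stylesheet's own comments identify `.r` as R code chunks, `.ro` as R output, and `.re` as errors and warnings, while the `.vigindex` rules strip bullets from the vignette index. The fragment below is a hypothetical sketch, not taken from the package; only the class names and the file name come from clean-simple.css above, and the markup and chunk contents are assumptions for illustration.

<!-- Hypothetical HTML fragment; only the class names and the stylesheet
     file name are taken from clean-simple.css above -->
<link rel="stylesheet" href="clean-simple.css" />

<pre class="r"><code>emmeans(fit, "treatment")</code></pre>      <!-- .r: code chunk, dark red -->
<pre class="ro"><code>treatment  emmean   SE  df</code></pre>    <!-- .ro: output, green on gray -->
<pre class="re"><code>Warning: example warning text</code></pre> <!-- .re: warnings/errors, red -->

<div class="vigindex">  <!-- .vigindex: index list rendered without bullets -->
  <ul>
    <li><a href="basics.html"><code>basics</code></a></li>
  </ul>
</div>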