estimability/0000755000176200001440000000000014565033213012753 5ustar liggesusersestimability/NAMESPACE0000644000176200001440000000130414564547711014204 0ustar liggesusers# Imports from non-base packages importFrom("stats", "delete.response", "model.frame", "model.matrix", "na.pass", "predict", "terms", "update") # Exports from estimability package export(all.estble) export(epredict) export(eupdate) export(is.estble) export(legacy.nonest.basis) export(nonest.basis) export(estble.subspace) S3method(epredict, lm) S3method(epredict, mlm) S3method(epredict, glm) S3method(eupdate, lm) S3method(nonest.basis, default) S3method(nonest.basis, qr) S3method(nonest.basis, matrix) S3method(nonest.basis, lm) S3method(nonest.basis, svd) export(nonest.basis.svd) estimability/README.md0000644000176200001440000000444114565007742014245 0ustar liggesusers--- title: "estimability" output: html_document date: '2022-07-03' --- R package **estimability**: Support for determining estimability of linear functions ==== [![cran version](https://www.r-pkg.org/badges/version/estimability)](https://cran.r-project.org/package=estimability) [![downloads](https://cranlogs.r-pkg.org/badges/estimability)](https://cranlogs.r-pkg.org/badges/estimability) [![total downloads](https://cranlogs.r-pkg.org/badges/grand-total/estimability)](https://cranlogs.r-pkg.org/badges/grand-total/estimability) [![Research software impact](http://depsy.org/api/package/cran/estimability/badge.svg)](http://depsy.org/package/r/estimability/) ## Features * A `nonest.basis()` function is provided that determines a basis for the null space of a matrix. This may be used in conjunction with `is.estble()` to determine the estimability (within a tolerance) of a given linear function of the regression coefficients in a linear model. * A set of `epredict()` methods are provided for `lm`, `glm`, and `mlm` objects. These work just like `predict()`, except an `NA` is returned for any cases that are not estimable. 
This is a useful alternative to the generic warning that "predictions from rank-deficient models are unreliable." * A function `estble.subspace()` that projects a set of linear functions onto an estimable subspace (possibly of smaller dimension). This can be useful in creating a set of estimable contrasts for joint testing. * Package developers may wish to import this package and incorporate estimability checks for their `predict` methods. ## Installation * To install latest version from CRAN, run ``` install.packages("estimability") ``` Release notes for the latest CRAN version are found at [https://cran.r-project.org/package=estimability/NEWS](https://cran.r-project.org/package=estimability/NEWS) -- or do `news(package = "estimability")` for notes on the version you have installed. * To install the latest development version from Github, have the newest **devtools** package installed, then run ``` devtools::install_github("rvlenth/estimability", dependencies = TRUE) ``` For latest release notes on this development version, see the [NEWS file](https://github.com/rvlenth/estimability/blob/master/inst/NEWS) estimability/man/0000755000176200001440000000000014262633552013534 5ustar liggesusersestimability/man/epredict.lm.Rd0000644000176200001440000001434414564555572016251 0ustar liggesusers% Copyright (c) 2015-2024 Russell V. Lenth # \name{epredict} \alias{epredict} \alias{epredict.lm} \alias{epredict.glm} \alias{epredict.mlm} \alias{eupdate} \alias{eupdate.lm} \title{ Estimability Enhancements for \code{lm} and Relatives } \description{ These functions call the corresponding S3 \code{predict} methods in the \pkg{stats} package, but with a check for estimability of new predictions, and with appropriate actions for non-estimable cases. 
} \usage{ \S3method{epredict}{lm}(object, newdata, ..., type = c("response", "terms", "matrix", "estimability"), nonest.tol = 1e-8, nbasis = object$nonest) \S3method{epredict}{glm}(object, newdata, ..., type = c("link", "response", "terms", "matrix", "estimability"), nonest.tol = 1e-8, nbasis = object$nonest) \S3method{epredict}{mlm}(object, newdata, ..., type = c("response", "matrix", "estimability"), nonest.tol = 1e-8, nbasis = object$nonest) eupdate(object, ...) } \arguments{ \item{object}{An object inheriting from \code{lm}} \item{newdata}{A \code{data.frame} containing predictor combinations for new predictions} \item{\dots}{Arguments passed to \code{\link{predict}} or \code{\link{update}}} \item{nonest.tol}{Tolerance used by \code{\link{is.estble}} to check estimability of new predictions} \item{type}{Character string specifying the desired result. See Details.} \item{nbasis}{Basis for the null space, e.g., a result of a call to \code{\link{nonest.basis}}. If \code{nbasis} is \code{NULL}, a basis is constructed from \code{object}.} } \details{ If \code{newdata} is missing or \code{object} is not rank-deficient, this method passes its arguments directly to the same method in the \pkg{stats} library. In rank-deficient cases with \code{newdata} provided, each row of \code{newdata} is tested for estimability against the null basis provided in \code{nbasis}. Any non-estimable cases found are replaced with \code{NA}s. The \code{type} argument is passed to \code{\link[stats]{predict}} when it is one of \code{"response"}, \code{"link"}, or \code{"terms"}. With \code{newdata} present and \code{type = "matrix"}, the model matrix for \code{newdata} is returned, with an attribute \code{"estble"} that is a logical vector of length \samp{nrow(newdata)} indicating whether each row is estimable. With \code{type = "estimability"}, just the logical vector is returned. 
If you anticipate making several \code{epredict} calls with new data, it
improves efficiency either to obtain the null basis and provide it in the
call, or to add it to \code{object} with the name \code{"nonest"} (perhaps
via a call to \code{eupdate}).

\code{eupdate} is an S3 generic function with a method provided for
\code{"lm"} objects. It updates the object according to any arguments in
\code{...}, then obtains the updated object's nonestimable basis and
stores it in \code{object$nonest} of the returned object.
}
\value{
The same as the result of a call to the \code{predict} method in the
\pkg{stats} package, except that rows or elements corresponding to
non-estimable predictor combinations are set to \code{NA}. The value when
\code{type} is \code{"matrix"} or \code{"estimability"} is explained under
Details.}
\author{
Russell V. Lenth
}
\note{
The capabilities of the \code{epredict} function for \code{lm} objects
are provided in R 4.3.0 and later by \code{\link[stats]{predict.lm}}
with \code{rankdeficient = "NA"}; however, \code{epredict} uses
\pkg{estimability}'s own criteria to determine which predictions are
set to \code{NA}. An advantage of using \code{epredict} is one of efficiency:
we can compute the null basis once and for all and have it available for
additional predictions, whereas \code{predict.lm} will re-compute it each time.

To see a message explaining why \code{NA}s were displayed,
set \samp{options(estimability.verbose = TRUE)}.
}
\seealso{
\code{\link[stats]{predict.lm}} in the \pkg{stats} package;
\code{\link{nonest.basis}}.
} \examples{ require("estimability") # Fake data where x3 and x4 depend on x1, x2, and intercept x1 <- -4:4 x2 <- c(-2,1,-1,2,0,2,-1,1,-2) x3 <- 3*x1 - 2*x2 x4 <- x2 - x1 + 4 y <- 1 + x1 + x2 + x3 + x4 + c(-.5,.5,.5,-.5,0,.5,-.5,-.5,.5) # Different orderings of predictors produce different solutions mod1234 <- lm(y ~ x1 + x2 + x3 + x4) mod4321 <- eupdate(lm(y ~ x4 + x3 + x2 + x1)) # (Estimability checking with mod4321 will be more efficient because # it will not need to recreate the basis) mod4321$nonest # test data: testset <- data.frame( x1 = c(3, 6, 6, 0, 0, 1), x2 = c(1, 2, 2, 0, 0, 2), x3 = c(7, 14, 14, 0, 0, 3), x4 = c(2, 4, 0, 4, 0, 4)) # Look at predictions when we don't check estimability suppressWarnings( # Disable the warning from stats::predict.lm rbind(p1234 = predict(mod1234, newdata = testset), p4321 = predict(mod4321, newdata = testset))) # Compare with results when we do check: rbind(p1234 = epredict(mod1234, newdata = testset), p4321 = epredict(mod4321, newdata = testset)) # now stats::predict has same capability for lm objects stats::predict(mod1234, newdata = testset, rankdeficient = "NA") # Note that estimable cases have the same predictions # change mod1234 and include nonest basis mod134 <- eupdate(mod1234, . ~ . - x2, subset = -c(3, 7)) mod134$nonest # When row spaces are the same, bases are interchangeable # so long as you account for the ordering of parameters: epredict(mod4321, newdata = testset, type = "estimability", nbasis = nonest.basis(mod1234)[c(1,5:2), ]) # Comparison with predict.lm stats::predict(mod4321, newdata = testset, rankdeficient = "NA") \dontrun{ ### Additional illustration example(nonest.basis) ## creates model objects warp.lm1 and warp.lm2 # The two models have different contrast specs. 
# But the empty cell is correctly identified in both:
fac.cmb <- expand.grid(wool = c("A", "B"), tension = c("L", "M", "H"))
cbind(fac.cmb, pred1 = epredict(warp.lm1, newdata = fac.cmb),
               pred2 = epredict(warp.lm2, newdata = fac.cmb))
} % end of \dontrun
}

\keyword{ models }
\keyword{ regression }
estimability/man/estble-subspace.Rd0000644000176200001440000000453014137063330017076 0ustar liggesusers
% Copyright (c) 2015-2018 Russell V. Lenth

\name{estble.subspace}
\alias{estble.subspace}

\title{Find an estimable subspace}
\description{
Determine a transformation \code{B} of the rows of a matrix \code{L}
such that \code{B \%*\% L} is estimable.
A practical example is jointly testing a set of contrasts \code{L}
in a linear model, where we need to restrict to the part of the row space
of \code{L} that is estimable.
}
\usage{
estble.subspace(L, nbasis, tol = 1e-8)
}
\arguments{
  \item{L}{A matrix of dimensions \emph{k} by \emph{p}}
  \item{nbasis}{A \emph{p} by \emph{b} matrix whose columns form a
    basis for non-estimable linear functions -- such as is returned
    by \code{\link{nonest.basis}}}
  \item{tol}{Numeric tolerance for assessing nonestimability. See
    \code{\link{is.estble}}.}
}
\details{
We require \code{B} such that all the rows of \code{M = B \%*\% L}
are estimable, i.e. orthogonal to the columns of \code{nbasis}.
Thus, we need \code{B \%*\% L \%*\% nbasis} to be zero, or equivalently,
\code{t(B)} must be in the null space of \code{t(L \%*\% nbasis)}.
This can be found using \code{\link{nonest.basis}}.
}
\value{
An \emph{r} by \emph{p} matrix \code{M = B \%*\% L}
whose rows are all orthogonal to the columns of
\code{nbasis}. The matrix \code{B} is attached as \code{attr(M, "B")}.
Note that if any rows of \code{L} were non-estimable, then \emph{r}
will be less than \emph{k}. In fact, if there are no estimable functions
in the row space of \code{L}, then \emph{r} = 0.
}
\author{
Russell V.
Lenth } \examples{ ### Find a set of estimable interaction contrasts for a 3 x 4 design ### with two empty cells. des <- expand.grid(A = factor(1:3), B = factor(1:4)) des <- des[-c(5, 12), ] # cells (2,2) and (3,4) are empty X <- model.matrix(~ A * B, data = des) N <- nonest.basis(X) L <- cbind(matrix(0, nrow = 6, ncol = 6), diag(6)) # i.e., give nonzero weight only to interaction effects estble.subspace(L, N) # Tougher demo: create a variation where all rows of L are non-estimable LL <- matrix(rnorm(36), ncol = 6) \%*\% L estble.subspace(LL, N) } % Add one or more standard keywords, see file 'KEYWORDS' in the % R documentation directory. \keyword{ models } \keyword{ regression } estimability/man/estimability-package.Rd0000644000176200001440000000400014137063330020075 0ustar liggesusers% Copyright (c) 2015-2016 Russell V. Lenth # \name{estimability-package} \alias{estimability-package} \alias{estimability} \docType{package} \title{ Estimability Tools for Linear Models } \description{ Provides tools for determining estimability of linear functions of regression coefficients, and alternative \code{epredict} methods for \code{lm}, \code{glm}, and \code{mlm} objects that handle non-estimable cases correctly. } \details{ \tabular{ll}{ Package: \tab estimability\cr Type: \tab Package\cr Details: \tab See DESCRIPTION file\cr } When a linear model is not of full rank, the regression coefficients are not uniquely estimable. However, the predicted values are unique, as are other linear combinations where the coefficients lie in the row space of the data matrix. Thus, estimability of a linear function of regression coefficients can be determined by testing whether the coefficients lie in this row space -- or equivalently, are orthogonal to the corresponding null space. This package provides functions \code{\link{nonest.basis}} and \code{\link{is.estble}} to facilitate such an estimability test. 
Package developers may find these useful for incorporating in their \code{predict} methods when new predictor settings are involved. The function \code{\link{estble.subspace}} is useful for projecting a matrix onto an estimable subspace whose rows are all estimable. The package also provides \code{\link{epredict}} methods -- alternatives to the \code{\link{predict}} methods in the \pkg{stats} package for \code{"lm"}, \code{"glm"}, and \code{"mlm"} objects. When the \code{newdata} argument is specified, estimability of each new prediction is checked and any non-estimable cases are replaced by \code{NA}. } \author{ Russell V. Lenth } \references{ Monahan, John F. (2008) \emph{A Primer on Linear Models}, CRC Press. (Chapter 3) } \keyword{ package } \keyword{ models } \keyword{ regression } estimability/man/nonest.basis.Rd0000644000176200001440000001326414564560310016433 0ustar liggesusers% Copyright (c) 2015-2016 Russell V. Lenth \name{nonest.basis} \alias{nonest.basis} \alias{legacy.nonest.basis} \alias{nonest.basis.qr} \alias{nonest.basis.matrix} \alias{nonest.basis.lm} \alias{nonest.basis.svd} \alias{nonest.basis.default} \alias{all.estble} \alias{is.estble} \title{Estimability Tools} \description{ This documents the functions needed to test estimability of linear functions of regression coefficients. } \usage{ nonest.basis(x, ...) \S3method{nonest.basis}{default}(x, ...) \S3method{nonest.basis}{qr}(x, ...) \S3method{nonest.basis}{matrix}(x, ...) \S3method{nonest.basis}{lm}(x, ...) \S3method{nonest.basis}{svd}(x, tol = 5e-8, ...) legacy.nonest.basis(x, ...) all.estble is.estble(x, nbasis, tol = 1e-8) } %- maybe also 'usage' for other objects documented here. \arguments{ \item{x}{For \code{nonest.basis}, an object of a class in \samp{methods("nonest.basis")}. 
Or, in \code{is.estble}, a numeric vector or matrix for assessing
estimability of \samp{sum(x * beta)}, where \code{beta} is the vector of
regression coefficients.}
  \item{nbasis}{Matrix whose columns span the null space of the model matrix.
Such a matrix is returned by \code{nonest.basis}.}
  \item{tol}{Numeric tolerance for assessing rank or nonestimability.
For determining rank, singular values less than \code{tol} times the largest
singular value are regarded as zero.
For determining estimability with a nonzero \eqn{x}, \eqn{\beta'x} is judged
estimable when \eqn{||N'x||^2 < \tau ||x||^2}, where \eqn{N} and
\eqn{\tau} denote \code{nbasis} and \code{tol}, respectively.}
  \item{\dots}{Additional arguments passed to other methods.}
}
\details{
Consider a linear model \eqn{y = X\beta + E}. If \eqn{X} is not of full rank,
it is not possible to estimate \eqn{\beta} uniquely. However, \eqn{X\beta}
\emph{is} uniquely estimable, and so is \eqn{a'X\beta} for any conformable
vector \eqn{a}. Since \eqn{a'X} comprises a linear combination of the rows of
\eqn{X}, it follows that we can estimate any linear function where the
coefficients lie in the row space of \eqn{X}. Equivalently, we can check to
ensure that the coefficients are orthogonal to the null space of \eqn{X}.

The \code{nonest.basis} method for class \code{'svd'} is not really functional
as a method because there is no \code{"svd"} class (at least in R <= 4.2.0).
But the function \code{nonest.basis.svd} is exported and may be called
directly; it works with results of \code{\link{svd}} or \code{\link{La.svd}}.
We \emph{require} \code{x$v} to be the complete matrix of right singular
vectors; but we do not need \code{x$u} at all.

The \code{default} method does serve as an \code{svd} method, in that it only
works if \code{x} has the required elements of an SVD result, in which case
it passes it to \code{nonest.basis.svd}.
The \code{matrix} method runs
\code{nonest.basis.svd(svd(x, nu = 0, nv = ncol(x)))}.
The \code{lm} method runs the \code{qr} method on \code{x$qr}. The function \code{legacy.nonest.basis} is the original default method in early versions of the \pkg{estimability} package. It may be called with \code{x} being either a matrix or a \code{qr} object, and after obtaining the \code{R} matrix, it uses an additional QR decomposition of \code{t(R)} to obtain the needed basis. (The current \code{nonest.basis} method for \code{qr} objects is instead based on the singular-value decomposition of R, and requires much simpler code.) The constant \code{all.estble} is simply a 1 x 1 matrix of \code{NA}. This specifies a trivial non-estimability basis, and using it as \code{nbasis} will cause everything to test as estimable. } \value{ When \eqn{X} is not full-rank, the methods for \code{nonest.basis} return a basis for the null space of \eqn{X}. The number of rows is equal to the number of regression coefficients (\emph{including} any \code{NA}s); and the number of columns is equal to the rank deficiency of the model matrix. The columns are orthonormal. If the model is full-rank, then \code{nonest.basis} returns \code{all.estble}. The \code{matrix} method uses \eqn{X} itself, the \code{qr} method uses the \eqn{QR} decomposition of \eqn{X}, and the \code{lm} method recovers the required information from the object. The function \code{is.estble} returns a logical value (or vector, if \code{x} is a matrix) that is \code{TRUE} if the function is estimable and \code{FALSE} if not. } \references{ Monahan, John F. (2008) \emph{A Primer on Linear Models}, CRC Press. (Chapter 3) } \author{ Russell V. 
Lenth } \examples{ require(estimability) X <- cbind(rep(1,5), 1:5, 5:1, 2:6) ( nb <- nonest.basis(X) ) SVD <- svd(X, nu = 0) # we don't need the U part of UDV' nonest.basis.svd(SVD) # same result as above # Test estimability of some linear functions for this X matrix lfs <- rbind(c(1,4,2,5), c(2,3,9,5), c(1,2,2,1), c(0,1,-1,1)) is.estble(lfs, nb) # Illustration on 'lm' objects: warp.lm1 <- lm(breaks ~ wool * tension, data = warpbreaks, subset = -(26:38), contrasts = list(wool = "contr.treatment", tension = "contr.treatment")) zapsmall(warp.nb1 <- nonest.basis(warp.lm1)) warp.lm2 <- update(warp.lm1, contrasts = list(wool = "contr.sum", tension = "contr.helmert")) zapsmall(warp.nb2 <- nonest.basis(warp.lm2)) # These bases look different, but they both correctly identify the empty cell wcells = with(warpbreaks, expand.grid(wool = levels(wool), tension = levels(tension))) epredict(warp.lm1, newdata = wcells, nbasis = warp.nb1) epredict(warp.lm2, newdata = wcells, nbasis = warp.nb2) } % Add one or more standard keywords, see file 'KEYWORDS' in the % R documentation directory. \keyword{ models } \keyword{ regression } estimability/DESCRIPTION0000644000176200001440000000164114565033213014463 0ustar liggesusersPackage: estimability Type: Package Title: Tools for Assessing Estimability of Linear Predictions Version: 1.5 Date: 2024-02-18 Authors@R: c(person("Russell", "Lenth", role = c("aut", "cre", "cph"), email = "russell-lenth@uiowa.edu")) Depends: stats, R(>= 4.3.0) Suggests: knitr, rmarkdown Description: Provides tools for determining estimability of linear functions of regression coefficients, and 'epredict' methods that handle non-estimable cases correctly. Estimability theory is discussed in many linear-models textbooks including Chapter 3 of Monahan, JF (2008), "A Primer on Linear Models", Chapman and Hall (ISBN 978-1-4200-6201-4). 
ByteCompile: yes
License: GPL (>= 3)
VignetteBuilder: knitr
NeedsCompilation: no
Packaged: 2024-02-20 02:36:58 UTC; rlenth
Author: Russell Lenth [aut, cre, cph]
Maintainer: Russell Lenth
Repository: CRAN
Date/Publication: 2024-02-20 05:20:11 UTC
estimability/build/0000755000176200001440000000000014565010112014043 5ustar liggesusers
estimability/build/vignette.rds0000644000176200001440000000035514565010112016405 0ustar liggesusers
estimability/vignettes/0000755000176200001440000000000014565010112014754 5ustar liggesusers
estimability/vignettes/add-est-check.Rmd0000644000176200001440000000641414564715422020040 0ustar liggesusers
---
title: "How to add estimability checking to your model's `predict` method"
author: "estimability package, Version `r packageVersion('estimability')`"
output: html_vignette
vignette: >
  %\VignetteIndexEntry{Adding estimability checking to your predict method}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, echo = FALSE, results = "hide", message = FALSE}
require("estimability")
knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro")
```

The goal of this short vignette is to show how you can easily add estimability
checking to your package's `predict()` methods.

Suppose that you have developed a model class that has elements
`$coefficients`, `$formula`, etc. Suppose it also has an `$env` element, an
environment that can hold miscellaneous information. This is not absolutely
necessary, but handy if it exists. Your model class involves some kind of
linear predictor.
We are concerned with models that:

  * Allow rank deficiencies (where some predictors may be excluded)
  * Allow predictions for new data

For any such model, it is important to add estimability checking to your
`predict` method, because the regression coefficients are not unique -- and
hence predictions may not be unique. It can be shown that predictions on new
data are unique only for cases that fall within the row space of the model
matrix. The **estimability** package is designed to check for this.

The recommended design for accommodating rank-deficient models is to follow
the example of `stats::lm` objects, where any predictors that are excluded
have a corresponding regression coefficient of `NA`. Please note that this
`NA` code does not actually mean that the coefficient is missing; it is a code
indicating that the coefficient has been constrained to be zero. In what
follows, we assume that this convention is used.

First note that estimability checking is not needed unless you are predicting
for new data. So that's where you need to incorporate estimability checking.
The `predict` method should be coded something like this:

```
predict.mymod <- function(object, newdata, ...) {
    # ... some setup code ...
    if (!missing(newdata)) {
        X <- # ... code to set up the model matrix for newdata ...
        b <- coef(object)
        if (any(is.na(b))) {  # we have rank deficiency so test estimability
            if (is.null (nbasis <- object$env$nbasis))
                # cache the basis in the environment so later calls can reuse it
                nbasis <- object$env$nbasis <-
                    estimability::nonest.basis(model.matrix(object))
            b[is.na(b)] <- 0
            pred <- X %*% b
            pred[!estimability::is.estble(X, nbasis)] <- NA
        }
        else
            pred <- X %*% coef(object)
    }
    # ... perhaps more code ...
    pred
}
```

That's it -- and this is the fancy version, where we can save `nbasis` for use
with possible future predictions. Any non-estimable cases are flagged as `NA`
in the `pred` vector.

An alternative way to code this would be to exclude the columns of `X` and
elements of `b` that correspond to `NA`s in `b`.
But be careful, because you need *all* the columns in `X` in order to check
estimability.

The only other thing you need to do is add `estimability` to the `Imports`
list in your `DESCRIPTION` file.
estimability/R/0000755000176200001440000000000014564544340013163 5ustar liggesusers
estimability/R/estble-subsp.R0000644000176200001440000000417414564544340015724 0ustar liggesusers
##############################################################################
#    Copyright (c) 2015-2018 Russell V. Lenth                                #
#                                                                            #
#    This file is part of the estimability package for R (*estimability*)    #
#                                                                            #
#    *estimability* is free software: you can redistribute it and/or modify  #
#    it under the terms of the GNU General Public License as published by    #
#    the Free Software Foundation, either version 2 of the License, or       #
#    (at your option) any later version.                                     #
#                                                                            #
#    *estimability* is distributed in the hope that it will be useful,       #
#    but WITHOUT ANY WARRANTY; without even the implied warranty of          #
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the            #
#    GNU General Public License for more details.                            #
#                                                                            #
#    A copy of the GNU General Public License is available at                #
#                                                                            #
##############################################################################

# Obtain an estimable subspace from the rows of a matrix L
#     i.e., B %*% L such that B %*% L %*% N = 0 (where N = nbasis)
#     Thus, (LN)'B' = 0, i.e., B' is in null space of (LN)'
#     We are tooled-up to find that!
#
# The function returns BL, with B as an attribute
estble.subspace = function(L, nbasis, tol = 1e-8) {
    if (all(apply(L, 1, is.estble, nbasis, tol)))
        B = diag(nrow(L))
    else {
        LN = L %*% nbasis
        LN[abs(LN) <= tol] = 0   # don't be jerked around by small values
        B = t(nonest.basis(t(LN)))
    }
    if (is.na(B[1])) # nothing is estimable
        result = matrix(0, nrow = 0, ncol = ncol(L))
    else
        result = B %*% L
    attr(result, "B") = B
    result
}
estimability/R/estimability.R0000644000176200001440000001106614564547160016014 0ustar liggesusers
##############################################################################
#    Copyright (c) 2015-2024 Russell V. Lenth                                #
#                                                                            #
#    This file is part of the estimability package for R (*estimability*)    #
#                                                                            #
#    *estimability* is free software: you can redistribute it and/or modify  #
#    it under the terms of the GNU General Public License as published by    #
#    the Free Software Foundation, either version 2 of the License, or       #
#    (at your option) any later version.                                     #
#                                                                            #
#    *estimability* is distributed in the hope that it will be useful,       #
#    but WITHOUT ANY WARRANTY; without even the implied warranty of          #
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the            #
#    GNU General Public License for more details.                            #
#                                                                            #
#    A copy of the GNU General Public License is available at                #
#                                                                            #
##############################################################################

# Obtain an orthonormal basis for nonestimable functions

# Generic
nonest.basis = function(x, ...)
    UseMethod("nonest.basis")

# Legacy code for now-deprecated case of a matrix or qr decomposition
legacy.nonest.basis = function(x, ...)
{
    if(!is.qr(x)) {
        if (!is.matrix(x))
            stop("legacy.nonest.basis requires a matrix or qr object")
        x = qr(x)
    }
    rank = x$rank
    tR = t(qr.R(x))
    p = nrow(tR)
    if (rank == p)
        return (all.estble)

    # null space of X is same as null space of R in QR decomp
    if (ncol(tR) < p) # add columns if not square
        tR = cbind(tR, matrix(0, nrow=p, ncol=p-ncol(tR)))

    # last few rows are zero -- add a diagonal of 1s
    extras = rank + seq_len(p - rank)
    tR[extras, extras] = diag(1, p - rank)

    # nbasis is last p - rank cols of Q in QR decomp of tR
    nbasis = qr.Q(qr(tR))[ , extras, drop = FALSE]

    # permute the rows via pivot
    nbasis[x$pivot, ] = nbasis

    nbasis
}

# Main method -- for class "qr"
# revised to use the svd of R
nonest.basis.qr = function(x, ...) {
    R = qr.R(x)
    cols = which(seq_along(R[1, ]) > x$rank)
    tmp = nbasis = svd(R, nu = 0, nv = ncol(R))$v[, cols, drop = FALSE]
    nbasis[x$pivot, ] = tmp
    nbasis
}

nonest.basis.matrix = function(x, ...)
    nonest.basis.svd(svd(x, nu = 0, nv = ncol(x)), ...)
    ##nonest.basis(qr(x), ...)

nonest.basis.lm = function(x, ...) {
    if (is.null(x$qr))
        x = update(x, method = "qr", qr = TRUE)
    nonest.basis(x$qr)
}

# method for svd class, were it to exist
nonest.basis.svd = function(x, tol = 5e-8, ...) {
    # Note we don't need the 'u' slot at all.
    if(!is.null(x$vt)) # result of La.svd()
        x$v = t(x$vt)
    if (is.null(x$v) || ncol(x$v) < length(x$d))
        stop("We need 'v' to be complete to obtain the basis\n",
             "Run svd() again and exclude the 'nv' argument")
    # we need d to be as long as ncol(v)
    if((deficit <- ncol(x$v) - length(x$d)) > 0)
        x$d = c(x$d, rep(0, deficit))
    w = which(x$d < x$d[1] * tol)
    if (length(w) == 0)
        return(all.estble)
    x$v[, w, drop = FALSE]
}

# default method really designed to suss out an svd() result
nonest.basis.default = function(x, ...)
{
    if (!is.null(x$d)) {
        if (!is.null(x$vt) && is.matrix(x$vt)) ## apparently from La.svd
            x$v = t(x$vt)
        if(is.matrix(x$v) && ncol(x$v) == length(x$d))
            return(nonest.basis.svd(x, ...))
    }
    stop("Requires an 'svd()' or 'La.svd()' result")
}

# utility to check estimability of x'beta, given nonest.basis
is.estble = function(x, nbasis, tol = 1e-8) {
    if (is.matrix(x))
        return(apply(x, 1, is.estble, nbasis, tol))
    if(is.na(nbasis[1]))
        TRUE
    else {
        x[is.na(x)] = 0
        chk = as.numeric(crossprod(nbasis, x))
        ssqx = sum(x*x) # BEFORE subsetting x
        # If x really small, don't scale chk'chk
        if (ssqx < tol) ssqx = 1
        sum(chk*chk) < tol * ssqx
    }
}

# nonestimability basis that makes everything estimable
all.estble = matrix(NA)
estimability/R/epredict.lm.R0000644000176200001440000001202714564552445015523 0ustar liggesusers
##############################################################################
#    Copyright (c) 2015-2016 Russell V. Lenth                                #
#                                                                            #
#    This file is part of the estimability package for R (*estimability*)    #
#                                                                            #
#    *estimability* is free software: you can redistribute it and/or modify  #
#    it under the terms of the GNU General Public License as published by    #
#    the Free Software Foundation, either version 2 of the License, or       #
#    (at your option) any later version.                                     #
#                                                                            #
#    *estimability* is distributed in the hope that it will be useful,       #
#    but WITHOUT ANY WARRANTY; without even the implied warranty of          #
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the            #
#    GNU General Public License for more details.
#
#                                                                            #
#    A copy of the GNU General Public License is available at                #
#                                                                            #
##############################################################################

# Patch for predict.lm, predict.glm, predict.mlm
#
# If newdata present, and fit is rank-deficient,
# we check estimability and replace any non-est cases with NA
# Use options(estimability.verbose = TRUE) to show a message when this happens

# Main workhorse -- call with stats-library predict function
.patch.predict = function(object, newdata, type, nonest.tol = 1e-8,
                          nbasis = object$nonest, ...) {

    if(missing(newdata))
        predict(object = object, type = type, ...)

    else {
        type = match.arg(type, c("response", "link", "terms", "matrix", "estimability"))
        # full-rank fit: nothing to check unless the model matrix is requested
        if (all(!is.na(object$coefficients)) && (type != "matrix"))
            if (type == "estimability")
                return (rep(TRUE, nrow(newdata)))
            else
                return (predict(object = object, newdata = newdata, type = type, ...))

        if(is.null(nbasis)) {
            if (!is.null(qr <- object$qr))
                nbasis = nonest.basis(qr)
            else
                nbasis = nonest.basis(model.matrix(object))
        }
        trms = delete.response(terms(object))
        m = model.frame(trms, newdata, na.action = na.pass, xlev = object$xlevels)
        X = model.matrix(trms, m, contrasts.arg = object$contrasts)
        nonest = !is.estble(X, nbasis, nonest.tol)

        if (type == "estimability")
            return (!nonest)
        else if (type == "matrix") {
            attr(X, "estble") = !nonest
            return (X)
        }
        # (else) we have a type anticipated by stats::predict
        w.handler <- function(w){ # suppress the incorrect warning
            if (!is.na(pmatch("prediction from a rank-deficient", w$message)))
                invokeRestart("muffleWarning")
        }
        result = withCallingHandlers(
            suppressWarnings(predict(object = object, newdata = newdata, type = type,
                                     rankdeficient = "simple", ...)),
            warning = w.handler)

        if (any(nonest)) {
            if (is.matrix(result))
                result[nonest, ] = NA
            else if (is.list(result)) {
                result$fit[nonest] = NA
                result$se.fit[nonest] = NA
            }
            else
                result[nonest] = NA
            if(getOption("estimability.verbose", FALSE))
message("Note: Non-estimable cases were replaced by 'NA'") } result } } # Generic for epredict epredict = function(object, ...) UseMethod("epredict") epredict.lm = function(object, newdata, ..., type = c("response", "terms", "matrix", "estimability"), nonest.tol = 1e-8, nbasis = object$nonest) .patch.predict(object, newdata, type[1], nonest.tol, nbasis, ...) epredict.glm = function(object, newdata, ..., type = c("link", "response", "terms", "matrix", "estimability"), nonest.tol = 1e-8, nbasis = object$nonest) .patch.predict(object, newdata, type[1], nonest.tol, nbasis, ...) epredict.mlm = function(object, newdata, ..., type = c("response", "matrix", "estimability"), nonest.tol = 1e-8, nbasis = object$nonest) .patch.predict(object, newdata, type[1], nonest.tol, nbasis, ...) # Generic for eupdate -- adds nonest basis to object eupdate = function(object, ...) UseMethod("eupdate") eupdate.lm = function(object, ...) { if (length(list(...)) > 0) object = do.call("update", list(object = object, ...)) object$nonest = nonest.basis(object) object } estimability/MD50000644000176200001440000000152514565033213013266 0ustar liggesusers8de7a52d96e7ef17caefce2cd54153ce *DESCRIPTION e49efd927ce07e4456fc7bf07e2dc84c *NAMESPACE 1bc5e855854bf867eb2ce558781d7a9b *R/epredict.lm.R db937ec225a2778864ac557a0bd09511 *R/estble-subsp.R f3dbce94e409e178092559fc2c32520e *R/estimability.R 5d49adba856d769c31fe07b9ec0aeed4 *README.md 1c0274f9488abd9ea14a5116da76661f *build/vignette.rds 42d0a1c390b96cd612b7afc0760fbe62 *inst/NEWS 3a2776bf799c758a54df482e4e2ffeb5 *inst/doc/add-est-check.R 863aa7dc8ec506ef87335a9e58651647 *inst/doc/add-est-check.Rmd d56432b5828cc10e5ab657d7e057211c *inst/doc/add-est-check.html 2ad4acbf4b00f5aad9f3531b87663a68 *man/epredict.lm.Rd 76344948cf755a7caf150250283fedd2 *man/estble-subspace.Rd aae9f4ed984b658a3c55772610998377 *man/estimability-package.Rd bf0dc0dcd85ea65346e78f137235e45c *man/nonest.basis.Rd 863aa7dc8ec506ef87335a9e58651647 *vignettes/add-est-check.Rmd 
estimability/inst/0000755000176200001440000000000014565010112013721 5ustar liggesusers
estimability/inst/doc/0000755000176200001440000000000014565010112014466 5ustar liggesusers
estimability/inst/doc/add-est-check.Rmd0000644000176200001440000000641414564715422017552 0ustar liggesusers
---
title: "How to add estimability checking to your model's `predict` method"
author: "estimability package, Version `r packageVersion('estimability')`"
output: html_vignette
vignette: >
  %\VignetteIndexEntry{Adding estimability checking to your predict method}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, echo = FALSE, results = "hide", message = FALSE}
require("estimability")
knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro")
```

The goal of this short vignette is to show how you can easily add estimability checking to your package's `predict()` methods. Suppose that you have developed a model class that has elements `$coefficients`, `$formula`, etc. Suppose it also has an `$env` element, an environment that can hold miscellaneous information. This is not absolutely necessary, but handy if it exists. Your model class involves some kind of linear predictor.

We are concerned with models that:

  * Allow rank deficiencies (where some predictors may be excluded)
  * Allow predictions for new data

For any such model, it is important to add estimability checking to your predict method, because the regression coefficients are not unique, and hence the predictions may not be unique either. It can be shown that predictions on new data are unique only for cases that fall within the row space of the model matrix. The **estimability** package is designed to check for this.

The recommended design for accommodating rank-deficient models is to follow the example of `stats::lm` objects, where any predictors that are excluded have a corresponding regression coefficient of `NA`. Please note that this `NA` code does not actually mean the coefficient is missing; it is a code meaning that the coefficient has been constrained to be zero. In what follows, we assume that this convention is used.

First note that estimability checking is not needed unless you are predicting for new data. So that's where you need to incorporate estimability checking. The `predict` method should be coded something like this:

```
predict.mymod <- function(object, newdata, ...) {
    # ... some setup code ...
    if (!missing(newdata)) {
        X <-  # ... code to set up the model matrix for newdata ...

        b <- coef(object)
        if (any(is.na(b))) {  # we have rank deficiency so test estimability
            if (is.null (nbasis <- object$env$nbasis))
                # save the basis in the environment so that it persists
                nbasis <- object$env$nbasis <-
                    estimability::nonest.basis(model.matrix(object))
            b[is.na(b)] <- 0
            pred <- X %*% b
            pred[!estimability::is.estble(X, nbasis)] <- NA
        }
        else
            pred <- X %*% coef(object)
    }
    # ... perhaps more code ...
    pred
}
```

That's it -- and this is the fancy version, where we can save `nbasis` for use with possible future predictions. Any non-estimable cases are flagged as `NA` in the `pred` vector.

An alternative way to code this would be to exclude the columns of `X` and elements of `b` that correspond to `NA`s in `b`. But be careful, because you need *all* the columns in `X` in order to check estimability.

The only other thing you need to do is add `estimability` to the `Imports` list in your `DESCRIPTION` file.
estimability/inst/doc/add-est-check.html0000644000176200001440000002027714565010112017760 0ustar liggesusers How to add estimability checking to your model’s predict method

How to add estimability checking to your model’s predict method

estimability package, Version 1.5

The goal of this short vignette is to show how you can easily add estimability checking to your package’s predict() methods. Suppose that you have developed a model class that has elements $coefficients, $formula, etc. Suppose it also has an $env element, an environment that can hold miscellaneous information. This is not absolutely necessary, but handy if it exists. Your model class involves some kind of linear predictor.

We are concerned with models that:

  • Allow rank deficiencies (where some predictors may be excluded)
  • Allow predictions for new data

For any such model, it is important to add estimability checking to your predict method, because the regression coefficients are not unique, and hence the predictions may not be unique either. It can be shown that predictions on new data are unique only for cases that fall within the row space of the model matrix. The estimability package is designed to check for this.
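
To make the row-space condition concrete, here is a minimal base-R sketch. It is not the package's implementation: the helper name `in_row_space`, the toy matrix, and the tolerance are assumptions made for illustration, and the null-space basis is taken from `svd()` for a matrix with at least as many rows as columns. The idea is that x'beta is estimable when x is orthogonal to the null space of the model matrix.

```r
# Toy rank-deficient model matrix: the third column is the sum of the first two
X <- cbind(x1 = c(1, 1, 0, 0),
           x2 = c(0, 0, 1, 1),
           x3 = c(1, 1, 1, 1))

# Null-space basis: right singular vectors whose singular values are
# numerically zero (this simple version assumes nrow(X) >= ncol(X))
s <- svd(X, nu = 0, nv = ncol(X))
null_basis <- s$v[, s$d < 1e-8 * max(s$d), drop = FALSE]

# Hypothetical helper: TRUE when x is (numerically) orthogonal to the
# null space, i.e. when x lies in the row space of X
in_row_space <- function(x, N, tol = 1e-8) {
    chk <- crossprod(N, x)
    sum(chk^2) < tol * sum(x^2)
}

in_row_space(c(1, 0, 1), null_basis)   # a row of X, so its prediction is unique
in_row_space(c(1, 0, 0), null_basis)   # not in the row space
```

In package code, `nonest.basis()` and `is.estble()` take the place of the two hand-written steps above.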

The recommended design for accommodating rank-deficient models is to follow the example of stats::lm objects, where any predictors that are excluded have a corresponding regression coefficient of NA. Please note that this NA code does not actually mean the coefficient is missing; it is a code meaning that the coefficient has been constrained to be zero. In what follows, we assume that this convention is used.
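
The NA convention can be seen directly with stats::lm. The small data set below is made up for illustration; x3 is constructed as an exact linear combination of x1 and x2, so lm() excludes it and reports NA for its coefficient.

```r
# Made-up data in which x3 = x1 + x2, so the model matrix is rank-deficient
d <- data.frame(x1 = c(1, 2, 3, 4, 5, 6),
                x2 = c(1, 0, 1, 0, 2, 1),
                y  = c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2))
d$x3 <- d$x1 + d$x2

fit <- lm(y ~ x1 + x2 + x3, data = d)
coef(fit)          # the coefficient for x3 is NA
fit$rank           # 3, although four coefficients are listed

# Refitting without x3 reproduces the non-NA coefficients, which is what
# "constrained to be zero" means in practice
fit0 <- lm(y ~ x1 + x2, data = d)
all.equal(coef(fit0), coef(fit)[!is.na(coef(fit))])
```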

First note that estimability checking is not needed unless you are predicting for new data. So that’s where you need to incorporate estimability checking. The predict method should be coded something like this:

predict.mymod <- function(object, newdata, ...) {
    # ... some setup code ...
    if (!missing(newdata)) {
        X <-  # ... code to set up the model matrix for newdata ...
        
        b <- coef(object)
        if (any(is.na(b))) {  # we have rank deficiency so test estimability
            if (is.null (nbasis <- object$env$nbasis))
                # save the basis in the environment so that it persists
                nbasis <- object$env$nbasis <-
                    estimability::nonest.basis(model.matrix(object))
            b[is.na(b)] <- 0
            pred <- X %*% b
            pred[!estimability::is.estble(X, nbasis)] <- NA
        }
        else
            pred <- X %*% coef(object)
    }
    # ... perhaps more code ...
    pred
}

That’s it – and this is the fancy version, where we can save nbasis for use with possible future predictions. Any non-estimable cases are flagged as NA in the pred vector.
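
Saving nbasis for later calls depends on where it is stored. The sketch below uses a hypothetical object and two hypothetical helper functions to contrast the options: a value assigned to a list element inside a function changes only that function's local copy, whereas a value assigned into an environment element is still there after the function returns.

```r
# Hypothetical model object: a list with an environment element
obj <- list(coefficients = c(a = 1, b = NA), env = new.env())

# Assigning to a list element modifies only the local copy of 'object'
save_in_list <- function(object) {
    object$nbasis <- "cached basis"
    invisible(NULL)
}

# Assigning into the environment is visible to the caller afterwards
save_in_env <- function(object) {
    object$env$nbasis <- "cached basis"
    invisible(NULL)
}

save_in_list(obj)
is.null(obj$nbasis)    # TRUE: nothing was saved

save_in_env(obj)
obj$env$nbasis         # "cached basis"
```

This is why an $env element is handy: without one, the basis is simply recomputed on each call, which is still correct but repeats the work.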

An alternative way to code this would be to exclude the columns of X and elements of b that correspond to NAs in b. But be careful, because you need all the columns in X in order to check estimability.
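
As a small check that the two codings agree, the sketch below uses made-up coefficients and a made-up two-row X. Zeroing the NA coefficients and dropping the matching columns give the same predictions; only the prediction step may use the reduced matrix.

```r
# Made-up coefficients following the NA convention, and new-data rows
b <- c("(Intercept)" = 1.5, x1 = 2, x2 = -1, x3 = NA)
X <- cbind("(Intercept)" = 1, x1 = c(1, 2), x2 = c(0, 1), x3 = c(1, 3))

# Coding used in the predict method above: treat NA coefficients as zero
b0 <- b
b0[is.na(b0)] <- 0
pred_zero <- drop(X %*% b0)

# Alternative coding: drop the excluded columns and coefficients
keep <- !is.na(b)
pred_drop <- drop(X[, keep, drop = FALSE] %*% b[keep])

all.equal(pred_zero, pred_drop)   # TRUE
# The estimability check itself must still be given the full X,
# including the x3 column.
```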

The only other thing you need to do is add estimability to the Imports list in your DESCRIPTION file.

estimability/inst/doc/add-est-check.R0000644000176200001440000000025214565010112017204 0ustar liggesusers
## ---- echo = FALSE, results = "hide", message = FALSE-------------------------
require("estimability")
knitr::opts_chunk$set(fig.width = 4.5, class.output = "ro")

estimability/inst/NEWS0000644000176200001440000000321714564716105014441 0ustar liggesusers
Update history for **estimability**

** NOTE: If you have v1.4 installed, please update to a newer version! **
   (Or an older one, for that matter)

1.5     We now require R >= 4.3.0.
        Plays along with the changes in 'predict.lm' that came with R 4.3.0.
        We re-coded the 'nonest.basis.qr' to something much simpler, and the
        old version is kept available as 'legacy.nonest.basis'.
        And a vignette was added to help developers add estimability checking
        to their package.

1.4.1   Correction to version 1.4. The new svd-based methods worked correctly
        only for n x p matrices with n >= p. Otherwise things go badly awry.
        And this is a big problem because I replaced the default
        nonest.basis() method with the svd method.

1.4     Added support for results of svd(), via 'nonest.basis.svd' function
        and 'default' method.
        The 'matrix' method now uses the SVD instead of the QR decomposition.

1.3     Added 'estble.subspace' function

1.2-1   Moved codebase to github repository rvlenth/estimability

1.2     Modified license to make it more compatible with dependents

1.1-1   Added imports of non-base packages that are referenced

1.1     Design improvements to aid in potential scope and usability:
          * Made 'nonest.basis' a generic, with provided methods for
            "qr", "matrix", and "lm"
          * Added 'eupdate' generic and 'lm' method for updating a model
            object and including its nonestimability basis as part of
            the object
        Added 'type = "matrix"' and 'type = "estimability"' options for
        'epredict'

1.0-2   Initial version on CRAN