gamm4/0000755000176200001440000000000013641707272011270 5ustar liggesusersgamm4/NAMESPACE0000644000176200001440000000041113066222254012474 0ustar liggesusersimport(Matrix) import(lme4) import(mgcv) import(methods) #importFrom(Matrix,t,crossprod,solve,rowSums) export(gamm4) importFrom("stats", "as.formula", "delete.response", "deviance", "gaussian", "reformulate", "residuals") importFrom("utils", "packageVersion") gamm4/ChangeLog0000644000176200001440000000723613641660451013047 0ustar liggesusersISSUES: * gamm4 isn't handling NA's properly with formulae like y~factor(z). When inserting z into dataframe, it fails to drop. * gamm4 (and indeed gamm) will fail if the fixed effects are not identifiable. This can happen, quite easily. e.g. s(x,by=fac1) + s(x,by=fac2) means that the columns x:fac1 and x:fac2 are not independent. * The computation of the covariance matrix of the response/pseudodata is very resource intensive if the random effects are in only a few groups. When there are only few random effects an alternative computation would be much better. 0.2-6 * Removal of code branch compatible with lme<1.0 to avoid warnings. 0.2-5 * Fix of problem dealing with long random effect formulae. Thanks to Iain Malcolm. * Added control ability to pass control arguments to underlying lme4 fitting functions. 0.2-4 * start up message changed to use packageStartupMessage. * Various imports in NAMESPACE. 0.2-3 * modified to extract REML using REMLcrit to avoid lme4 warning * argument drop.unused.levels added to gamm4, to facilitate mrf smoothing when some regions are data-less. 0.2-2 * Some tidying and modification to avoid cran check problems for code back compatible with old lme4. * Fix of erroroneous passing of '...' to modular lme4 fitting functions. 0.2-1 * Some restructuring - initial model setup in common for both old and new lme4, but actual fitting work then branches depending on version. 0.2-0 * Updated for new lme4. gamm4 tests for lme4 version installed, and behaves accordingly. Will generate notes on testing with old lme4 - unavoidable. Some testing still to do - awaiting some lme4 fixes. * fix to variance computation - prior weights were not used correctly - could cause incorrect interval estimation for smooths when prior weights used (e.g. binomial with n > 1, or weighted gaussian case). 0.1-6 * added return of R factor from QR of WX. * cov matrix computation modified to use LAPACK and pivoting properly. * some changes to replace direct slot access with getME, but incomplete (y, pWt and var still direct). 0.1-5 * gamm4 return object now of class "gamm4" 0.1-4 * upgrade to gamm4.setup to make smooth to r.e. conversion object oriented and hence cleaner. * gamm4 can now deal with "sf" class smooth factor interactions. This gives an efficient way to handle subject specific random smooths. 0.1-3 * Incorrect pivoting of covariance matrix of data/pseudodata could lead to incorrect covariance matrix for coefficients and incorrect EDF computation. Pivoting now corrected, and results tested. 0.1-2 * Bad bug fix: I'd failed to track an internal lme4 change, so that gamm4 had stopped extracting random effect variances correctly. This meant that gamm4 standard errors were typically too low. Fixed and checks added to test suite to detect this sort of problem. 0.1-1 * Allow for centering of smooth model matrix columns, when there is an intercept, but columns are not centered by constraint. 0.1-0 * Upgraded to use `t2' type tensor product smooths * bug fix so that s(...,fx=TRUE) works * workaround in gamm4 so that g/lmer handles offset properly. 0.0-4 * covariance matrix calculation was still not robust enough. Improved further. 0.0-3 * solving for the coefficient covariance matrix could fail under heavy smoothing --- now made more robust. * `gamm4' can now be supplied with prior weights. * The `cbind(success,failure)' form for a binomial response now works properly. * help file has been updated for mgcv_1.6-2, and to avoid running to many slow `gamm' calls in checking. 0.0-2 * gamm4 now returns a `scale.estimated' field in its `gam' object part. gamm4/man/0000755000176200001440000000000013641661735012046 5ustar liggesusersgamm4/man/gamm4.Rd0000644000176200001440000003734713136643424013352 0ustar liggesusers\name{gamm4} \alias{gamm4} %- Also NEED an `\alias' for EACH other topic documented here. \title{Generalized Additive Mixed Models using lme4 and mgcv} \description{ Fits the specified generalized additive mixed model (GAMM) to data, by making use of the \code{modular} fitting functions provided by lme4 (new version). For earlier lme4 versions modelling fitting is via a call to \code{lmer} in the normal errors identity link case, or by a call to \code{glmer} otherwise (see \code{\link[lme4]{lmer}}). Smoothness selection is by REML in the Gaussian additive case and (Laplace approximate) ML otherwise. \code{gamm4} is based on \code{\link[mgcv]{gamm}} from package \code{mgcv}, but uses \code{lme4} rather than \code{nlme} as the underlying fitting engine via a trick due to Fabian Scheipl. \code{gamm4} is more robust numerically than \code{\link[mgcv]{gamm}}, and by avoiding PQL gives better performance for binary and low mean count data. Its main disadvantage is that it can not handle most multi-penalty smooths (i.e. not \code{\link[mgcv]{te}} type tensor products or adaptive smooths) and there is no facilty for \code{nlme} style correlation structures. Tensor product smoothing is available via \code{\link[mgcv]{t2}} terms (Wood, Scheipl and Faraway, 2013). For fitting generalized additive models without random effects, \code{gamm4} is much slower than \code{\link[mgcv]{gam}} and has slightly worse MSE performance than \code{\link[mgcv]{gam}} with REML smoothness selection. For fitting GAMMs with modest numbers of i.i.d. random coefficients then \code{gamm4} is slower than \code{\link[mgcv]{gam}} (or \code{\link[mgcv]{bam}} for large data sets). \code{gamm4} is most useful when the random effects are not i.i.d., or when there are large numbers of random coeffecients (more than several hundred), each applying to only a small proportion of the response data. To use this function effectively it helps to be quite familiar with the use of \code{\link[mgcv]{gam}} and \code{\link[lme4]{lmer}}. } \usage{ gamm4(formula,random=NULL,family=gaussian(),data=list(),weights=NULL, subset=NULL,na.action,knots=NULL,drop.unused.levels=TRUE, REML=TRUE,control=NULL,start=NULL,verbose=0L,...) } \arguments{ \item{formula}{ A GAM formula (see also \code{\link[mgcv]{formula.gam}} and \code{\link[mgcv]{gam.models}}). This is like the formula for a \code{\link{glm}} except that smooth terms (\code{\link[mgcv]{s}} and \code{\link[mgcv]{t2}} but not \code{\link[mgcv]{te}}) can be added to the right hand side of the formula. Note that \code{id}s for smooths and fixed smoothing parameters are not supported.} \item{random}{An optional formula specifying the random effects structure in \code{\link[lme4]{lmer}} style. See example below.} \item{family}{A \code{family} as used in a call to \code{\link{glm}} or \code{\link[mgcv]{gam}}.} \item{data}{ A data frame or list containing the model response variable and covariates required by the formula. By default the variables are taken from \code{environment(formula)}, typically the environment from which \code{gamm4} is called.} \item{weights}{a vector of prior weights on the observations. \code{NULL} is equivalent to a vector of 1s. Used, in particular, to supply the number-of-trials for binomial data, when the response is proportion of successes. } \item{subset}{ an optional vector specifying a subset of observations to be used in the fitting process.} \item{na.action}{ a function which indicates what should happen when the data contain `NA's. The default is set by the `na.action' setting of `options', and is `na.fail' if that is unset. The ``factory-fresh'' default is `na.omit'.} \item{knots}{this is an optional list containing user specified knot values to be used for basis construction. Different terms can use different numbers of knots, unless they share a covariate.} \item{drop.unused.levels}{by default unused levels are dropped from factors before fitting. For some smooths involving factor variables you might want to turn this off. Only do so if you know what you are doing.} \item{REML}{passed on to \code{\link{lmer}} fitting routines (but not \code{\link{glmer}} fitting routines) to control whether REML or ML is used.} \item{control}{\code{\link{lmerControl}} or \code{\link{glmerControl}} list as appropriate (\code{NULL} means defaults are used).} \item{start}{starting value list as used by \code{\link{lmer}} or \code{\link{glmer}}.} \item{verbose}{passed on to fitting \code{lme4} fitting routines.} \item{...}{further arguments for passing on to model setup routines.} } %- maybe also `usage' for other objects documented here. \details{A generalized additive mixed model is a generalized linear mixed model in which the linear predictor depends linearly on unknown smooth functions of some of the covariates (`smooths' for short). \code{gamm4} follows the approach taken by package \code{mgcv} and represents the smooths using penalized regression spline type smoothers, of moderate rank. For estimation purposes the penalized component of each smooth is treated as a random effect term, while the unpenalized component is treated as fixed. The wiggliness penalty matrix for the smooth is in effect the precision matrix when the smooth is treated as a random effect. Estimating the degree of smoothness of the term amounts to estimating the variance parameter for the term. \code{gamm4} uses the same reparameterization trick employed by \code{\link[mgcv]{gamm}} to allow any single quadratic penalty smoother to be used (see Wood, 2004, or 2006 for details). Given the reparameterization then the modular fitting approach employed in \code{\link[lme4]{lmer}} can be used to fit a GAMM. Estimation is by Maximum Likelihood in the generalized case, and REML in the gaussian additive model case. \code{gamm4} allows the random effects specifiable with \code{\link[lme4]{lmer}} to be combined with any number of any of the (single penalty) smooth terms available in \code{\link[mgcv]{gam}} from package \code{mgcv} as well as \code{\link[mgcv]{t2}} tensor product smooths. Note that the model comparison on the basis of the (Laplace approximate) log likelihood is possible with GAMMs fitted by \code{gamm4}. As in \code{\link[mgcv]{gamm}} the smooth estimates are assumed to be of interest, and a covariance matrix is returned which enables Bayesian credible intervals for the smooths to be constructed, which treat all the terms in \code{random} as random. For details on how to condition smooths on factors, set up varying coefficient models, do signal regression or set up terms involving linear functionals of smooths, see \code{\link[mgcv]{gam.models}}, but note that \code{te} type tensor product and adaptive smooths are not available with \code{gamm4}. } \value{ Returns a list with two items: \item{gam}{an object of class \code{gam}. At present this contains enough information to use \code{predict}, \code{plot}, \code{summary} and \code{print} methods and \code{vis.gam}, from package \code{mgcv} but not to use e.g. the \code{anova} method function to compare models.} \item{mer}{the fitted model object returned by \code{\link[lme4]{lmer}} or \code{glmer}. Extra random and fixed effect terms will appear relating to the estimation of the smooth terms. Note that unlike \code{lme} objects returned by \code{\link[mgcv]{gamm}}, everything in this object always relates to the fitted model itself, and never to a PQL working approximation: hence the usual methods of model comparison are entirely legitimate. } } \references{ Bates D., M. Maechler, B. Bolker & S. Walker (2013). lme4: Linear mixed-effects models using Eigen and S4. \url{https://cran.r-project.org/package=lme4} Wood S.N., Scheipl, F. and Faraway, J.J. (2013/2011 online) Straightforward intermediate rank tensor product smoothing in mixed models. Statistics and Computing 23(3): 341-360 Wood, S.N. (2004) Stable and efficient multiple smoothing parameter estimation for generalized additive models. Journal of the American Statistical Association. 99:673-686 Wood S.N. (2006) Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC Press. For more GAMM references see \code{\link[mgcv]{gamm}} \url{http://www.maths.bris.ac.uk/~sw15190/} } \author{ Simon N. Wood \email{simon.wood@r-project.org}} \section{WARNINGS }{ If you don't need random effects in addition to the smooths, then \link[mgcv]{gam} is substantially faster, gives fewer convergence warnings, and slightly better MSE performance (based on simulations). Models must contain at least one random effect: either a smooth with non-zero smoothing parameter, or a random effect specified in argument \code{random}. Note that the \code{gam} object part of the returned object is not complete in the sense of having all the elements defined in \link[mgcv]{gamObject} and does not inherit from \code{glm}: hence e.g. multi-model \code{anova} calls will not work. Linked smoothing parameters, adaptive smoothing and te terms are not supported. This routine is obviously less well tested than \link[mgcv]{gamm}. } \seealso{\link[mgcv]{gam}, \link[mgcv]{gamm}, \link[mgcv]{gam.models}, \link[lme4]{lmer}, \link[mgcv]{predict.gam}, \link[mgcv]{plot.gam}, \link[mgcv]{summary.gam}, \link[mgcv]{s}, \link[mgcv]{vis.gam} } \examples{ ## NOTE: most examples are flagged as 'do not run' simply to ## save time in package checking on CRAN. ################################### ## A simple additive mixed model... ################################### library(gamm4) set.seed(0) dat <- gamSim(1,n=400,scale=2) ## simulate 4 term additive truth ## Now add 20 level random effect `fac'... dat$fac <- fac <- as.factor(sample(1:20,400,replace=TRUE)) dat$y <- dat$y + model.matrix(~fac-1)\%*\%rnorm(20)*.5 br <- gamm4(y~s(x0)+x1+s(x2),data=dat,random=~(1|fac)) plot(br$gam,pages=1) summary(br$gam) ## summary of gam summary(br$mer) ## underlying mixed model anova(br$gam) ## compare gam fit of the same bg <- gam(y~s(x0)+x1+s(x2)+s(fac,bs="re"), data=dat,method="REML") plot(bg,pages=1) gam.vcomp(bg) ########################## ## Poisson example GAMM... ########################## ## simulate data... x <- runif(100) fac <- sample(1:20,100,replace=TRUE) eta <- x^2*3 + fac/20; fac <- as.factor(fac) y <- rpois(100,exp(eta)) ## fit model and examine it... bp <- gamm4(y~s(x),family=poisson,random=~(1|fac)) plot(bp$gam) bp$mer \dontrun{ ################################################################# ## Add a factor to the linear predictor, to be modelled as random ## and make response Poisson. Again compare `gamm' and `gamm4' ################################################################# set.seed(6) dat <- gamSim(1,n=400,scale=2) ## simulate 4 term additive truth ## add random effect... g <- as.factor(sample(1:20,400,replace=TRUE)) dat$f <- dat$f + model.matrix(~ g-1)\%*\%rnorm(20)*2 dat$y <- rpois(400,exp(dat$f/7+1)) b2<-gamm(y~s(x0)+s(x1)+s(x2)+s(x3),family=poisson, data=dat,random=list(g=~1)) plot(b2$gam,pages=1) b2r<-gamm4(y~s(x0)+s(x1)+s(x2)+s(x3),family=poisson, data=dat,random = ~ (1|g)) plot(b2r$gam,pages=1) rm(dat) vis.gam(b2r$gam,theta=35) ################################## # Multivariate varying coefficient # With crossed and nested random # effects. ################################## ## Start by simulating data... f0 <- function(x, z, sx = 0.3, sz = 0.4) { (pi^sx * sz) * (1.2 * exp(-(x - 0.2)^2/sx^2 - (z - 0.3)^2/sz^2) + 0.8 * exp(-(x - 0.7)^2/sx^2 - (z - 0.8)^2/sz^2)) } f1 <- function(x2) 2 * sin(pi * x2) f2 <- function(x2) exp(2 * x2) - 3.75887 f3 <- function (x2) 0.2 * x2^11 * (10 * (1 - x2))^6 + 10 * (10 * x2)^3 * (1 - x2)^10 n <- 1000 ## first set up a continuous-within-group effect... g <- factor(sample(1:50,n,replace=TRUE)) ## grouping factor x <- runif(n) ## continuous covariate X <- model.matrix(~g-1) mu <- X\%*\%rnorm(50)*.5 + (x*X)\%*\%rnorm(50) ## now add nested factors... a <- factor(rep(1:20,rep(50,20))) b <- factor(rep(rep(1:25,rep(2,25)),rep(20,50))) Xa <- model.matrix(~a-1) Xb <- model.matrix(~a/b-a-1) mu <- mu + Xa\%*\%rnorm(20) + Xb\%*\%rnorm(500)*.5 ## finally simulate the smooth terms v <- runif(n);w <- runif(n);z <- runif(n) r <- runif(n) mu <- mu + f0(v,w)*z*10 + f3(r) y <- mu + rnorm(n)*2 ## response data ## First compare gamm and gamm4 on a reduced model br <- gamm4(y ~ s(v,w,by=z) + s(r,k=20,bs="cr"),random = ~ (1|a/b)) ba <- gamm(y ~ s(v,w,by=z) + s(r,k=20,bs="cr"),random = list(a=~1,b=~1),method="REML") par(mfrow=c(2,2)) plot(br$gam) plot(ba$gam) ## now fit the full model br <- gamm4(y ~ s(v,w,by=z) + s(r,k=20,bs="cr"),random = ~ (x+0|g) + (1|g) + (1|a/b)) br$mer br$gam plot(br$gam) ## try a Poisson example, based on the same linear predictor... lp <- mu/5 y <- rpois(exp(lp),exp(lp)) ## simulated response ## again compare gamm and gamm4 on reduced model br <- gamm4(y ~ s(v,w,by=z) + s(r,k=20,bs="cr"),family=poisson,random = ~ (1|a/b)) ba <- gamm(y ~ s(v,w,by=z) + s(r,k=20,bs="cr"),family=poisson,random = list(a=~1,b=~1)) par(mfrow=c(2,2)) plot(br$gam) plot(ba$gam) ## and now fit full version (very slow)... br <- gamm4(y ~ s(v,w,by=z) + s(r,k=20,bs="cr"),family=poisson,random = ~ (x|g) + (1|a/b)) br$mer br$gam plot(br$gam) #################################### # Different smooths of x2 depending # on factor `fac'... #################################### dat <- gamSim(4) br <- gamm4(y ~ fac+s(x2,by=fac)+s(x0),data=dat) plot(br$gam,pages=1) summary(br$gam) #################################### # Timing comparison with `gam'... # #################################### dat <- gamSim(1,n=600,dist="binary",scale=.33) system.time(lr.fit0 <- gam(y~s(x0)+s(x1)+s(x2), family=binomial,data=dat,method="ML")) system.time(lr.fit <- gamm4(y~s(x0)+s(x1)+s(x2), family=binomial,data=dat)) lr.fit0;lr.fit$gam cor(fitted(lr.fit0),fitted(lr.fit$gam)) ## plot model components with truth overlaid in red op <- par(mfrow=c(2,2)) fn <- c("f0","f1","f2","f3");xn <- c("x0","x1","x2","x3") for (k in 1:3) { plot(lr.fit$gam,select=k) ff <- dat[[fn[k]]];xx <- dat[[xn[k]]] ind <- sort.int(xx,index.return=TRUE)$ix lines(xx[ind],(ff-mean(ff))[ind]*.33,col=2) } par(op) } ###################################### ## A "signal" regression example, in ## which a univariate response depends ## on functional predictors. ###################################### ## simulate data first.... rf <- function(x=seq(0,1,length=100)) { ## generates random functions... m <- ceiling(runif(1)*5) ## number of components f <- x*0; mu <- runif(m,min(x),max(x));sig <- (runif(m)+.5)*(max(x)-min(x))/10 for (i in 1:m) f <- f+ dnorm(x,mu[i],sig[i]) f } x <- seq(0,1,length=100) ## evaluation points ## example functional predictors... par(mfrow=c(3,3));for (i in 1:9) plot(x,rf(x),type="l",xlab="x") ## simulate 200 functions and store in rows of L... L <- matrix(NA,200,100) for (i in 1:200) L[i,] <- rf() ## simulate the functional predictors f2 <- function(x) { ## the coefficient function (0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10)/10 } f <- f2(x) ## the true coefficient function y <- L\%*\%f + rnorm(200)*20 ## simulated response data ## Now fit the model E(y) = L\%*\%f(x) where f is a smooth function. ## The summation convention is used to evaluate smooth at each value ## in matrix X to get matrix F, say. Then rowSum(L*F) gives E(y). ## create matrix of eval points for each function. Note that ## `smoothCon' is smart and will recognize the duplication... X <- matrix(x,200,100,byrow=TRUE) ## compare `gam' and `gamm4' this time b <- gam(y~s(X,by=L,k=20),method="REML") br <- gamm4(y~s(X,by=L,k=20)) par(mfrow=c(2,1)) plot(b,shade=TRUE);lines(x,f,col=2) plot(br$gam,shade=TRUE);lines(x,f,col=2) } \keyword{models} \keyword{smooth} \keyword{regression}%-- one or more .. gamm4/DESCRIPTION0000644000176200001440000000102513641707272012774 0ustar liggesusersPackage: gamm4 Version: 0.2-6 Author: Simon Wood, Fabian Scheipl Maintainer: Simon Wood Title: Generalized Additive Mixed Models using 'mgcv' and 'lme4' Description: Estimate generalized additive mixed models via a version of function gamm() from 'mgcv', using 'lme4' for estimation. Depends: R (>= 2.9.0), methods, Matrix, lme4 (>= 1.0), mgcv (>= 1.7-23) License: GPL (>= 2) NeedsCompilation: no Packaged: 2020-04-03 16:27:09 UTC; sw283 Repository: CRAN Date/Publication: 2020-04-03 19:30:02 UTC gamm4/R/0000755000176200001440000000000013641661735011474 5ustar liggesusersgamm4/R/gamm4.r0000644000176200001440000004604013641661731012664 0ustar liggesusers## Version of gamm using lme4 as fit engine. (c) Simon N. Wood 2009-20 ## Reparameterization trick as Wood (2004,2006). ## fooling lmer using Fabian Scheipl's trick (now adapted for lme4 >1.0). gamm4.setup<-function(formula,pterms, data=stop("No data supplied to gamm.setup"),knots=NULL) ## set up the model matrix, penalty matrices and auxilliary information about the smoothing bases ## needed for a gamm4 fit. ## There is an implicit assumption that any rank deficient penalty does not penalize ## the constant term in a basis. ## 1. Calls gam.setup, as for a gam to produce object G suitable for estimating a gam. ## 2. Works through smooth list, G$smooth, modifying so that... ## i) Smooths are reparameterized to have a sequence of (portion of) identity matrix ## penalties. ## ii) 'random' list is accumulated containing random effect model matrices for terms. ## iii) Sparse version of full model matrix in original parameterization is also accumulated ## iv) Various indices are created for moving between the parameterizations. { ## first simply call `gam.setup'.... G <- mgcv:::gam.setup(formula,pterms, data=data,knots=knots,sp=NULL, min.sp=NULL,H=NULL,absorb.cons=TRUE,sparse.cons=0,gamm.call=TRUE) if (!is.null(G$L)) stop("gamm can not handle linked smoothing parameters (probably from use of `id' or adaptive smooths)") # now perform re-parameterization... first.f.para <- G$nsdf+1 random <- list() if (G$nsdf>0) ind <- 1:G$nsdf else ind <- rep(0,0) X <- G$X[,ind,drop=FALSE] # accumulate fixed effects into here xlab <- rep("",0) G$Xf <- as(X,"dgCMatrix") ## sparse version of full matrix, treating smooths as fixed first.para <- G$nsdf+1 used.names <- names(data) ## keep track of all variable names already used if (G$m) for (i in 1:G$m) { ## work through the smooths sm <- G$smooth[[i]] sm$X <- G$X[,sm$first.para:sm$last.para,drop=FALSE] rasm <- mgcv::smooth2random(sm,used.names,type=2) ## convert smooth to random effect and fixed effects used.names <- c(used.names,names(rasm$rand)) sm$fixed <- rasm$fixed ## deal with creation of sparse full model matrix if (!is.null(sm$fac)) { flev <- levels(sm$fac) ## grouping factor for smooth n.lev <- length(flev) for (k in 1:n.lev) { G$Xf <- cbind2(G$Xf,as(sm$X*as.numeric(sm$fac==flev[k]),"dgCMatrix")) } } else { n.lev <- 1 G$Xf <- cbind2(G$Xf,as(sm$X,"dgCMatrix")) } ## now append random effects to main list n.para <- 0 ## count random coefficients #rinc <- rind <- rep(0,0) if (!sm$fixed) { for (k in 1:length(rasm$rand)) n.para <- n.para + ncol(rasm$rand[[k]]) sm$lmer.name <- names(rasm$rand) random <- c(random,rasm$rand) sm$trans.D <- rasm$trans.D sm$trans.U <- rasm$trans.U ## matrix mapping fit coefs back to original } ## ensure stored first and last para relate to G$Xf in expanded version sm$last.para <- first.para + ncol(rasm$Xf) + n.para - 1 sm$first.para <- first.para first.para <- sm$last.para + 1 if (ncol(rasm$Xf)) { Xfnames <- rep("",ncol(rasm$Xf)) k <- length(xlab)+1 for (j in 1:ncol(rasm$Xf)) { xlab[k] <- Xfnames[j] <- new.name(paste(sm$label,"Fx",j,sep=""),xlab) k <- k + 1 } colnames(rasm$Xf) <- Xfnames } X <- cbind(X,rasm$Xf) # add fixed model matrix to overall fixed X sm$first.f.para <- first.f.para first.f.para <- first.f.para + ncol(rasm$Xf) sm$last.f.para <- first.f.para - 1 ## note less than sm$first.f.para => no fixed ## store indices of random parameters in smooth specific array sm$rind <- rasm$rind sm$rinc <- rasm$rinc sm$pen.ind <- rasm$pen.ind ## pen.ind==i TRUE for coef penalized by ith penalty sm$n.para <- n.para sm$X <- NULL ## delete model matrix G$smooth[[i]] <- sm ## replace smooth object with extended version } G$random <- random ## named list of random effect matrices G$X <- X ## fixed effects model matrix G } ## end of gamm4.setup gamm4 <- function(formula,random=NULL,family=gaussian(),data=list(),weights=NULL, subset=NULL,na.action,knots=NULL,drop.unused.levels=TRUE,REML=TRUE, control=NULL,start=NULL,verbose=0L,...) { # Routine to fit a GAMM to some data. Fixed and smooth terms are defined in the formula, but the wiggly # parts of the smooth terms are treated as random effects. The onesided formula random defines additional # random terms. if (!is.null(random)) { if (!inherits(random,"formula")) stop("gamm4 requires `random' to be a formula") random.vars <- all.vars(random) } else random.vars <- NULL # create model frame..... gp <- interpret.gam(formula) # interpret the formula mf <- match.call(expand.dots=FALSE) mf$formula <- gp$fake.formula mf$REML <- mf$verbose <- mf$control <- mf$start <- mf$family <- mf$scale <- mf$knots <- mf$random <- mf$... <-NULL ## mf$weights? mf$drop.unused.levels <- drop.unused.levels mf[[1]] <- as.name("model.frame") pmf <- mf gmf <- eval(mf, parent.frame()) # the model frame now contains all the data, for the gam part only gam.terms <- attr(gmf,"terms") # terms object for `gam' part of fit -- need this for prediction to work properly if (length(random.vars)) { mf$formula <- as.formula(paste(paste(deparse(gp$fake.formula, backtick = TRUE), collapse = ""), "+", paste(random.vars, collapse = "+"))) mf <- eval(mf, parent.frame()) } else mf <- gmf rm(gmf) if (nrow(mf)<2) stop("Not enough (non-NA) data to do anything meaningful") ## summarize the *raw* input variables ## note can't use get_all_vars here -- buggy with matrices vars <- all.vars(gp$fake.formula[-2]) ## drop response here inp <- parse(text = paste("list(", paste(vars, collapse = ","),")")) dl <- eval(inp, data, parent.frame()) names(dl) <- vars ## list of all variables needed var.summary <- mgcv:::variable.summary(gp$pf,dl,nrow(mf)) ## summarize the input data ## lmer offset handling work around... mvars <- vars[!vars%in%names(mf)] ## variables not in mf raw -- can cause lmer problem if (length(mvars)>0) for (i in 1:length(mvars)) mf[[mvars[i]]] <- dl[[mvars[i]]] ## append raw versions to mf rm(dl) ## save space pmf$formula <- gp$pf pmf <- eval(pmf, parent.frame()) # pmf contains all data for non-smooth part pTerms <- attr(pmf,"terms") if (is.character(family)) family<-eval(parse(text=family)) if (is.function(family)) family <- family() if (is.null(family$family)) stop("family not recognized") if (family$family == "gaussian" && family$link == "identity") linear <- TRUE else linear <- FALSE # now call gamm4.setup G <- gamm4.setup(gp,pterms=pTerms,data=mf,knots=knots) G$var.summary <- var.summary n.sr <- length(G$random) # number of random smooths (i.e. s(...,fx=FALSE,...) terms) if (is.null(random)&&n.sr==0) stop("gamm4 models must have at least 1 smooth with unknown smoothing parameter or at least one other random effect") offset.name <- attr(mf,"names")[attr(attr(mf,"terms"),"offset")] yname <- new.name("y",names(mf)) eval(parse(text=paste("mf$",yname,"<-G$y",sep=""))) Xname <- new.name("X",names(mf)) eval(parse(text=paste("mf$",Xname,"<-G$X",sep=""))) lme4.formula <- paste(yname,"~",Xname,"-1") if (length(offset.name)) { lme4.formula <- paste(lme4.formula,"+",offset.name) } ## Basic trick is to call (g)lFormula to set up model, with simple i.i.d. dummy random effects for the ## penalized component of each smooth. This results in columns of Z being produced for these dummy's, ## which can be over-written with the right thing. NOTE: that lambdat could also be modified, I think!! ## Add the random effect dummy variables for the smooth r.name <- names(G$random) if (n.sr) for (i in 1:n.sr) # adding the constructed variables to the model frame avoiding name duplication { mf[[r.name[i]]] <- factor(rep(1:ncol(G$random[[i]]),length=nrow(G$random[[i]]))) lme4.formula <- paste(lme4.formula,"+ (1|",r.name[i],")") } if (!is.null(random)) { ## append the regular random effects lme4.formula <- paste(lme4.formula,"+", substring(paste(deparse(random,backtick=TRUE),collapse=""),first=2)) } lme4.formula <- as.formula(lme4.formula) if (is.null(control)) control <- if (linear) lmerControl() else glmerControl() ## NOTE: further arguments should be passed here... b <- if (linear) lFormula(lme4.formula,data=mf,weights=G$w,REML=REML,control=control,...) else glFormula(lme4.formula,data=mf,family=family,weights=G$w,control=control,...) if (n.sr) { ## Fabian Scheipl's trick of overwriting dummy slots revised for new structure tn <- names(b$reTrms$cnms) ## names associated with columns of Z (same order as Gp) ind <- 1:length(tn) sn <- names(G$random) ## names of smooth random components for (i in 1:n.sr) { ## loop through random effect smooths k <- ind[sn[i]==tn] ## which term should contain G$random[[i]] ii <- (b$reTrms$Gp[k]+1):b$reTrms$Gp[k+1] b$reTrms$Zt[ii,] <- as(t(G$random[[i]]),"dgCMatrix") b$reTrms$cnms[[k]] <- attr(G$random[[i]],"s.label") } } ## now do the actual fitting... ret <- list() #arg <- list(...) #arg <- arg[!(names(arg) %in% names(b))] #b <- c(b,arg) ## add '...' arguments for use with do.call b$control <- control; b$verbose=verbose; b$start=start if (linear) { ## Create the deviance function to be optimized: devfun <- do.call(mkLmerDevfun, b) ## Optimize the deviance function: opt <- optimizeLmer(devfun,start=start,verbose=verbose,control=control$optCtrl) ## previously bobyqa optimizer set, but now default ## Package up the results: ret$mer <- mkMerMod(environment(devfun), opt, b$reTrms, fr = b$fr) } else { ## generalized case... ## Create the deviance function for optimizing over theta: devfun <- do.call(mkGlmerDevfun, b) ## Optimize over theta using a rough approximation (i.e. nAGQ = 0): opt <- optimizeGlmer(devfun,start=start,verbose=verbose,control=control$optCtrl) ## Update the deviance function for optimizing over theta and beta: devfun <- updateGlmerDevfun(devfun, b$reTrms) ## Optimize over theta and beta: opt <- optimizeGlmer(devfun, stage=2,start=start,verbose=verbose,control=control$optCtrl) ## Package up the results: ret$mer <- mkMerMod(environment(devfun), opt, b$reTrms, fr = b$fr) } rm(b) ### .... fitting finished ## now fake a gam object object<-list(model=mf,formula=formula,smooth=G$smooth,nsdf=G$nsdf,family=family, df.null=nrow(G$X),y=getME(ret$mer,"y"), terms=gam.terms,pterms=G$pterms,xlevels=G$xlevels, contrasts=G$contrasts,assign=G$assign,na.action=attr(mf,"na.action"), cmX=G$cmX,var.summary=G$var.summary) pvars <- all.vars(delete.response(object$terms)) object$pred.formula <- if (length(pvars)>0) reformulate(pvars) else NULL ## to unpack coefficients look at names(ret$lme$flist), ret$lme@Zt, ranef(), fixef() ## let the GAM coefficients in the original parameterization be beta, ## and let them be beta' in the fitting parameterization. ## Then beta = B beta'. B and B^{-1} can be efficiently accumulated ## and are useful for stable computation of the covariance matrix ## etc... B <- Matrix(0,ncol(G$Xf),ncol(G$Xf)) diag(B) <- 1 Xfp <- G$Xf ## Transform parameters back to the original space.... bf <- as.numeric(lme4::fixef(ret$mer)) ## the fixed effects br <- lme4::ranef(ret$mer) ## a named list if (G$nsdf) p <- bf[1:G$nsdf] else p <- array(0,0) ## fixed parametric componet if (G$m>0) for (i in 1:G$m) { fx <- G$smooth[[i]]$fixed first <- G$smooth[[i]]$first.f.para; last <- G$smooth[[i]]$last.f.para if (first <=last) beta <- bf[first:last] else beta <- array(0,0) if (fx) b <- beta else { ## not fixed so need to undo transform of random effects etc. b <- rep(0,0) for (k in 1:length(G$smooth[[i]]$lmer.name)) ## collect all coefs associated with this smooth b <- c(b,as.numeric(br[[G$smooth[[i]]$lmer.name[k]]][[1]])) b <- b[G$smooth[[i]]$rind] ## make sure coefs are in order expected by smooth b <- c(b,beta) b <- G$smooth[[i]]$trans.D*b if (!is.null(G$smooth[[i]]$trans.U)) b <- G$smooth[[i]]$trans.U%*%b ## transform back to original } p <- c(p,b) ## now fill in B... ind <- G$smooth[[i]]$first.para:G$smooth[[i]]$last.para if (!fx) { D <- G$smooth[[i]]$trans.D if (is.null(G$smooth[[i]]$trans.U)) B[ind,ind] <- Diagonal(length(D),D) else B[ind,ind] <- t(D*t(G$smooth[[i]]$trans.U)) } ## and finally transform G$Xf into fitting parameterization... Xfp[,ind] <- G$Xf[,ind,drop=FALSE]%*%B[ind,ind,drop=FALSE] } object$coefficients <- p ## need to drop smooths from Zt and then ## form Z'phiZ + I \sigma^2 vr <- lme4::VarCorr(ret$mer) ## list of ranef variance components in the same order as Gp scale <- as.numeric(attr(vr,"sc"))^2 ## get the scale parameter if (!is.finite(scale) || scale==1) { ## NOTE: better test??? scale <- 1 object$scale.estimated <- FALSE } else object$scale.estimated <- TRUE sp <- rep(-1,n.sr) Zt <- Matrix(0,0,ncol(getME(ret$mer,"Zt"))) if (n.sr==0) sn <- NULL ## names by which smooths are known in mer rn <- names(vr) ind <- rep(0,0) ## index the non-smooth random effects among the random effects for (i in 1:length(vr)) { if (is.null(sn)||!rn[i]%in%sn) { ## append non smooth r.e.s to Zt Gp <- getME(ret$mer,"Gp") ## group index ends ind <- c(ind,(Gp[i]+1):Gp[i+1]) } else if (!is.null(sn)) { ## extract smoothing parameters for smooth r.e.s k <- (1:n.sr)[rn[i]==sn] ## where in original smooth ordering is current smoothing param if (as.numeric(vr[[i]]>0)) sp[k] <- scale/as.numeric(vr[[i]]) else sp[k] <- 1e10 } } if (length(ind)) { ## extract columns corresponding to non-smooth r.e.s Zt <- getME(ret$mer,"Zt")[ind,] ## extracting random effects model matrix root.phi <- getME(ret$mer,"Lambdat")[ind,ind] ## and corresponding sqrt of cov matrix (phi) } object$prior.weights <- G$w if (linear) { object$weights <- object$prior.weights V <- Diagonal(n=length(object$weights),x=scale/object$weights) } else { # mu <- getME(ret$mer,"mu") # eta <- family$linkfun(mu) object$weights <- ret$mer@resp$sqrtWrkWt()^2 ## object$prior.weights*family$mu.eta(eta)^2/family$variance(mu) V <- Diagonal(x=1/object$weights)*scale #V <- Diagonal(x=scale*family$variance(mu)/object$prior.weights) } if (nrow(Zt)>0) V <- V + crossprod(root.phi%*%Zt)*scale ## data or pseudodata cov matrix, treating smooths as fixed now ## NOTE: Cholesky probably better in the following - then pivoting ## automatic when solving.... R <- Matrix::chol(V,pivot=TRUE);piv <- attr(R,"pivot") G$Xf <- as(G$Xf,"dgCMatrix") Xfp <- as(Xfp,"dgCMatrix") if (is.null(piv)) { WX <- as(solve(t(R),Xfp),"matrix") ## V^{-.5}Xp -- fit parameterization XVX <- as(solve(t(R),G$Xf),"matrix") ## same in original parameterization } else { WX <- as(solve(t(R),Xfp[piv,]),"matrix") ## V^{-.5}Xp -- fit parameterization XVX <- as(solve(t(R),G$Xf[piv,]),"matrix") ## same in original parameterization } qrz <- qr(XVX,LAPACK=TRUE) object$R <- qr.R(qrz);object$R[,qrz$pivot] <- object$R XVX <- crossprod(object$R) ## X'V^{-1}X original parameterization object$sp <- sp colx <- ncol(G$Xf) Sp <- matrix(0,colx,colx) # penalty matrix - fit param first <- G$nsdf+1 k <- 1 if (G$m>0) for (i in 1:G$m) { # Accumulate the total penalty matrix if (!object$smooth[[i]]$fixed) { ii <- object$smooth[[i]]$first.para:object$smooth[[i]]$last.para ## index this smooth's params for (j in 1:length(object$smooth[[i]]$S)) { ## work through penalty list ind <- ii[object$smooth[[i]]$pen.ind == j] ## index of currently penalized diag(Sp)[ind] <- sqrt(object$sp[k]) ## diagonal penalty k <- k+1 } } first <- last + 1 } ## Alternative cov matrix calculation. Basic ## idea is that cov matrix is computed stably in ## fitting parameterization, and then transformed to ## original parameterization. qrx <- qr(rbind(WX,Sp/sqrt(scale)),LAPACK=TRUE) Ri <- backsolve(qr.R(qrx),diag(ncol(WX))) ind <- qrx$pivot;ind[ind] <- 1:length(ind)## qrx$pivot Ri <- Ri[ind,] ## unpivoted square root of cov matrix in fitting parameterization Ri Ri' = cov Vb <- B%*%Ri; Vb <- Vb%*%t(Vb) object$edf<-rowSums(Vb*t(XVX)) object$df.residual <- length(object$y) - sum(object$edf) object$sig2 <- scale if (linear) { object$method <- "lmer.REML" } else { object$method <- "glmer.ML"} object$Vp <- as(Vb,"matrix") object$Ve <- as(Vb%*%XVX%*%Vb,"matrix") class(object) <- "gam" ## Restore original smooth list, if it was split to deal with t2 terms... if (!is.null(G$original.smooth)) { object$smooth <- G$smooth <- G$original.smooth } ## If prediction parameterization differs from fit parameterization, transform now... ## (important for t2 smooths, where fit constraint is not good for component wise ## prediction s.e.s) if (!is.null(G$P)) { object$coefficients <- G$P %*% object$coefficients object$Vp <- G$P %*% object$Vp %*% t(G$P) object$Ve <- G$P %*% object$Ve %*% t(G$P) } object$linear.predictors <- predict.gam(object,type="link") object$fitted.values <- object$family$linkinv(object$linear.predictors) object$residuals <- residuals(ret$mer) if (G$nsdf>0) term.names<-colnames(G$X)[1:G$nsdf] else term.names<-array("",0) n.smooth<-length(G$smooth) if (n.smooth) for (i in 1:n.smooth) { k<-1 for (j in object$smooth[[i]]$first.para:object$smooth[[i]]$last.para) { term.names[j]<-paste(object$smooth[[i]]$label,".",as.character(k),sep="") k<-k+1 } } names(object$coefficients) <- term.names # note - won't work on matrices!! names(object$edf) <- term.names names(object$sp) <- names(G$sp) object$gcv.ubre <- if (isREML(ret$mer)) REMLcrit(ret$mer) else deviance(ret$mer) if (!is.null(G$Xcentre)) object$Xcentre <- G$Xcentre ## any column centering applied to smooths ret$gam<-object class(gamm4) <- c("gamm4","list") ret } ## end of gamm4 print.gamm4.version <- function() { library(help=gamm4)$info[[1]] -> version version <- version[pmatch("Version",version)] um <- strsplit(version," ")[[1]] version <- um[nchar(um)>0][2] hello <- paste("This is gamm4 ",version,"\n",sep="") packageStartupMessage(hello) } .onAttach <- function(...) { print.gamm4.version() } .onUnload <- function(libpath) {} gamm4/MD50000644000176200001440000000034113641707272011576 0ustar liggesusersb9c8c115e3842b0d071ce7dbcfa86350 *ChangeLog a0ef1282ccc667525c500f293febecd3 *DESCRIPTION 9db19f47495b33803744e11eaf3ca90a *NAMESPACE 6d9c12eb5a03c5fd94bdaf0bdd8e13e3 *R/gamm4.r 4dca194b5e0f8fba6d3e8d1d1b957a5f *man/gamm4.Rd