==> mitml/inst/doc/Introduction.Rmd <==
---
title: "Introduction"
output:
  rmarkdown::html_vignette:
    css: "css/vignette.css"
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{Introduction}
  %\VignetteEncoding{UTF-8}
---
---

```{r setup, include=FALSE, cache=FALSE}
library(knitr)
set.seed(123)
options(width=87)
opts_chunk$set(background="#ffffff", comment="#", collapse=FALSE,
               fig.width=9, fig.height=9, warning=FALSE, message=FALSE)
```
This vignette provides a first introduction to the R package `mitml` for generating and analyzing multiple imputations for multilevel missing data.
A typical application of the package consists of the following steps.

1. Imputation
2. Assessment of convergence
3. Completion of the data
4. Analysis
5. Pooling

The `mitml` package offers a set of tools to facilitate each of these steps.
This vignette offers a step-by-step illustration of the basic features of `mitml`.
Further information can be found in the other [vignettes](https://github.com/simongrund1/mitml/wiki) and the package [documentation](https://cran.r-project.org/package=mitml/mitml.pdf).

## Example data

For the purposes of this vignette, we employ a simple example that makes use of the `studentratings` data set, which is provided with `mitml`.
To use it, the `mitml` package and the data set must be loaded as follows.

```{r}
library(mitml)
data(studentratings)
```

More information about the variables in the data set can be obtained from its `summary`.

```{r}
summary(studentratings)
```

In addition, the correlations between variables (based on pairwise observations) may be useful for identifying possible sources of information that may be used during the treatment of missing data.

```{r, echo=FALSE}
round(cor(studentratings[,-(1:3)], use="pairwise"),3)
```

This illustrates (a) that most variables in the data set are affected by missing data but also (b) that substantial relations exist between variables.
For simplicity, we focus on only a subset of these variables.

## Model of interest

For the present example, we focus on the two variables `ReadDis` (disciplinary problems in reading class) and `ReadAchiev` (reading achievement).

Assume we are interested in the relation between these variables.
Specifically, we may be interested in the following analysis model

$$
\mathit{ReadAchiev}_{ij} = \gamma_{00} + \gamma_{10} \mathit{ReadDis}_{ij} + u_{0j} + e_{ij}
$$

On the basis of the syntax used in the R package `lme4`, this model may be written as follows.

```{r, results="hide"}
ReadAchiev ~ 1 + ReadDis + (1|ID)
```

In this model, the relation between `ReadDis` and `ReadAchiev` is represented by a single fixed effect of `ReadDis`, and a random intercept is included to account for the clustered structure of the data and the group-level variance in `ReadAchiev` that is not explained by `ReadDis`.

## Generating imputations

The `mitml` package includes wrapper functions for the R packages `pan` (`panImpute`) and `jomo` (`jomoImpute`).
Here, we will use the first option.
To generate imputations with `panImpute`, the user must specify (at least):

1. an imputation model
2. the number of iterations and imputations
The easiest way of specifying the imputation model is to use the `formula` argument of `panImpute`.
Generally speaking, the imputation model should include all variables that are either (a) part of the model of interest, (b) related to the variables in the model, or (c) related to whether the variables in the model are missing.

In this simple example, we include only `ReadDis` and `ReadAchiev` as the main target variables and `SchClimate` as an auxiliary variable.

```{r}
fml <- ReadAchiev + ReadDis + SchClimate ~ 1 + (1|ID)
```

Note that, in this specification of the imputation model, all variables are included on the left-hand side of the model, whereas the right-hand side is left "empty".
This model allows for all relations between variables at Level 1 and 2 and is thus suitable for most applications of the multilevel random intercept model (for further discussion, see also Grund, Lüdtke, & Robitzsch, 2016, in press).

The imputation procedure is then run for 5,000 iterations (burn-in), after which 100 imputations are drawn every 100 iterations.

```{r, results="hide"}
imp <- panImpute(studentratings, formula=fml, n.burn=5000, n.iter=100, m=100)
```

This step may take a few seconds.
Once the process is completed, the imputations are saved in the `imp` object.

## Assessing convergence

In `mitml`, there are two options for assessing the convergence of the imputation procedure.
First, the `summary` function calculates the "potential scale reduction factor" ($\hat{R}$) for each parameter in the imputation model.
If this value is noticeably larger than 1 for some parameters (say $>1.05$), a longer burn-in period may be required.

```{r}
summary(imp)
```

Second, diagnostic plots can be requested with the `plot` function.
These plots consist of a trace plot, an autocorrelation plot, and some additional information about the posterior distribution.
Convergence can be assumed if the trace plot is stationary (i.e., does not "drift"), and the autocorrelation is within reasonable bounds for the chosen number of iterations between imputations.

For this example, we examine only the plot for the parameter `Beta[1,2]` (i.e., the intercept of `ReadDis`).

```{r conv, echo=FALSE}
plot(imp, trace="all", print="beta", pos=c(1,2),
     export="png", dev.args=list(width=720, height=380, pointsize=16))
```
```{r, eval=FALSE}
plot(imp, trace="all", print="beta", pos=c(1,2))
```

![](mitmlPlots/BETA_ReadDis_ON_Intercept.png)

Taken together, both $\hat{R}$ and the diagnostic plots indicate that the imputation model converged, setting the basis for the analysis of the imputed data sets.

## Completing the data

In order to work with and analyze the imputed data sets, the data sets must be completed with the imputations generated in the previous steps.
To do so, `mitml` provides the function `mitmlComplete`.

```{r}
implist <- mitmlComplete(imp, "all")
```

The resulting object is a list that contains the 100 completed data sets.

## Analysis and pooling

In order to obtain estimates for the model of interest, the model must be fit separately to each of the completed data sets, and the results must be pooled into a final set of estimates and inferences.
The `mitml` package offers the `with` function to fit various statistical models to a list of completed data sets.

In this example, we use the `lmer` function from the R package `lme4` to fit the model of interest.
```{r, message=FALSE}
library(lme4)
fit <- with(implist, lmer(ReadAchiev ~ 1 + ReadDis + (1|ID)))
```

The resulting object is a list containing the 100 fitted models.
To pool the results of these models into a set of final estimates and inferences, `mitml` offers the `testEstimates` function.

```{r}
testEstimates(fit, var.comp=TRUE)
```

The estimates can be interpreted in a manner similar to the estimates from the corresponding complete-data procedure.
In addition, the output includes diagnostic quantities such as the fraction of missing information (FMI), which can be helpful for interpreting the results and understanding problems with the imputation procedure.

###### References

Grund, S., Lüdtke, O., & Robitzsch, A. (2016). Multiple imputation of multilevel missing data: An introduction to the R package pan. *SAGE Open*, *6*(4), 1–17. doi: 10.1177/2158244016668220 ([Link](https://doi.org/10.1177/2158244016668220))

Grund, S., Lüdtke, O., & Robitzsch, A. (in press). Multiple imputation of missing data for multilevel models: Simulations and recommendations. *Organizational Research Methods*. doi: 10.1177/1094428117703686 ([Link](https://doi.org/10.1177/1094428117703686))

---

```{r, echo=F}
cat("Author: Simon Grund (grund@ipn.uni-kiel.de)\nDate: ", as.character(Sys.Date()))
```

==> mitml/inst/doc/Analysis.Rmd <==
---
title: "Analysis of Multiply Imputed Data Sets"
output:
  rmarkdown::html_vignette:
    css: "css/vignette.css"
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{Analysis of multiply imputed data sets}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE, cache=FALSE}
library(knitr)
set.seed(123)
options(width=87)
opts_chunk$set(background="#ffffff", comment="#", collapse=FALSE,
               fig.width=9, fig.height=9, warning=FALSE, message=FALSE)
```
This vignette provides an overview of the analysis of multiply imputed data sets with `mitml`.
Specifically, this vignette addresses the following topics:

1. Working with multiply imputed data sets
2. Rubin's rules for pooling individual parameters
3. Model comparisons
4. Parameter constraints

Further information can be found in the other [vignettes](https://github.com/simongrund1/mitml/wiki) and the package [documentation](https://cran.r-project.org/package=mitml/mitml.pdf).

## Example data (`studentratings`)

For the purposes of this vignette, we make use of the `studentratings` data set, which contains simulated data from 750 students in 50 schools, including scores on reading and math achievement, socioeconomic status (SES), and ratings on school and classroom environment.

The package and the data set can be loaded as follows.

```{r}
library(mitml)
library(lme4)
data(studentratings)
```

As evident from its `summary`, most variables in the data set contain missing values.

```{r}
summary(studentratings)
```

In the present example, we investigate the differences in mathematics achievement that can be attributed to differences in SES when controlling for students' sex.
Specifically, we are interested in the following model.

$$
\mathit{MA}_{ij} = \gamma_{00} + \gamma_{10} \mathit{Sex}_{ij} + \gamma_{20} (\mathit{SES}_{ij}-\overline{\mathit{SES}}_{\bullet j}) + \gamma_{01} \overline{\mathit{SES}}_{\bullet j} + u_{0j} + e_{ij}
$$

Note that this model also employs group-mean centering to separate the individual and group-level effects of SES.

## Generating imputations

In the present example, we generate 20 imputations from the following imputation model.
```{r, results="hide"} fml <- ReadDis + SES ~ 1 + Sex + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=5000, n.iter=200, m=20) ``` The completed data are then extracted with `mitmlComplete`. ```{r} implist <- mitmlComplete(imp, "all") ``` ## Transforming the imputed data sets In empirical research, the raw data rarely enter the analyses but often require to be transformed beforehand. For this purpose, the `mitml` package provides the `within` function, which applies a given transformation directly to each data set. In the following, we use this to (a) calculate the group means of SES and (b) center the individual scores around their group means. ```{r} implist <- within(implist,{ G.SES <- clusterMeans(SES,ID) # calculate group means I.SES <- SES - G.SES # center around group means }) ``` This method can be used to apply arbitrary transformations to all of the completed data sets simultaneously. > **Note regarding** `dplyr`**:** > Due to how it is implemented, `within` cannot be used directly with `dplyr`. > Instead, users may use `with` instead of `within` with the following workaround. >```{r, eval=FALSE} implist <- with(implist,{ df <- data.frame(as.list(environment())) df <- ... # dplyr commands df }) implist <- as.mitml.list(implist) ``` > Advanced users may also consider using `lapply` for a similar workaround.` ## Fitting the analysis model In order to analyze the imputed data, each data set is analyzed using regular complete-data techniques. For this purpose, `mitml` offers the `with` function. In the present example, we use it to fit the model of interest with the R package `lme4`. ```{r} fit <- with(implist,{ lmer(MathAchiev ~ 1 + Sex + I.SES + G.SES + (1|ID)) }) ``` This results in a list of fitted models, one for each of the imputed data sets. ## Pooling The results obtained from the imputed data sets must be pooled in order to obtain a set of final parameter estimates and inferences. In the following, we employ a number of different pooling methods that can be used to address common statistical tasks, for example, for (a) estimating and testing individual parameters, (b) model comparisons, and (c) tests of constraints about one or several parameters. #### Parameter estimates Individual parameters are commonly pooled with the rules developed by Rubin (1987). In `mitml`, Rubin's rules are implemented in the `testEstimates` function. ```{r} testEstimates(fit) ``` In addition, the argument `var.comp=TRUE` can be used to obtain pooled estimates of variance components, and `df.com` can be used to specify the complete-data degrees of freedom, which provides more appropriate (i.e., conservative) inferences in smaller samples. For example, using a conservative value for the complete-data degrees of freedom for the fixed effects in the model of interest (Snijders & Bosker, 2012), the output changes as follows. ```{r} testEstimates(fit, var.comp=TRUE, df.com=46) ``` #### Multiple parameters and model comparisons Oftentimes, statistical inference concerns more than one parameter at a time. For example, the combined influence of SES (within and between groups) on mathematics achievement is represented by two parameters in the model of interest. Multiple pooling methods for Wald and likelihood ratio tests (LRTs) are implemented in the `testModels` function. This function requires the specification of a full model and a restricted model, which are then compared using (pooled) Wald tests or LRTs. 
Specifically, `testModels` allows users to pool Wald tests ($D_1$), $\chi^2$ test statistics ($D_2$), and LRTs ($D_3$; for a comparison of these methods, see also Grund, Lüdtke, & Robitzsch, 2016b).

To examine the combined influence of SES on mathematics achievement, the following restricted model can be specified and compared with the model of interest (using $D_1$).

```{r}
fit.null <- with(implist,{
  lmer(MathAchiev ~ 1 + Sex + (1|ID))
})

testModels(fit, fit.null)
```

> **Note regarding the order of arguments:**
> Please note that `testModels` expects that the first argument contains the full model, and the second argument contains the restricted model.
> If the order of the arguments is reversed, the results will not be interpretable.

Similar to the test for individual parameters, smaller samples can be accommodated with `testModels` (with method $D_1$) by specifying the complete-data degrees of freedom for the denominator of the $F$ statistic.

```{r}
testModels(fit, fit.null, df.com=46)
```

The pooling method used by `testModels` is determined by the `method` argument.
For example, to calculate the pooled LRT corresponding to the Wald test above (i.e., $D_3$), the following command can be issued.

```{r}
testModels(fit, fit.null, method="D3")
```

#### Constraints on parameters

Finally, it is often useful to investigate functions (or constraints) of the parameters in the model of interest.
In complete data sets, this can be achieved with a test of linear hypotheses or the delta method.
The `mitml` package implements a pooled version of the delta method in the `testConstraints` function.

For example, the combined influence of SES on mathematics achievement can also be tested without model comparisons by testing the constraint that the parameters pertaining to `I.SES` and `G.SES` are both zero.
This constraint is defined and tested as follows.

```{r}
c1 <- c("I.SES", "G.SES")
testConstraints(fit, constraints=c1)
```

This test is identical to the Wald test given in the previous section.
Arbitrary constraints on the parameters can be specified and tested in this manner, where each character string denotes an expression to be tested against zero.

In the present example, we are also interested in the *contextual* effect of SES on mathematics achievement (e.g., Snijders & Bosker, 2012).
The contextual effect is simply the difference between the coefficients pertaining to `G.SES` and `I.SES` and can be tested as follows.

```{r}
c2 <- c("G.SES - I.SES")
testConstraints(fit, constraints=c2)
```

Similar to model comparisons, constraints can be tested with different methods ($D_1$ and $D_2$) and can accommodate smaller samples by specifying a value for `df.com`.
Further examples for the analysis of multiply imputed data sets with `mitml` are given by Enders (2016) and Grund, Lüdtke, and Robitzsch (2016a).

###### References

Enders, C. K. (2016). Multiple imputation as a flexible tool for missing data handling in clinical research. *Behaviour Research and Therapy*. doi: 10.1016/j.brat.2016.11.008 ([Link](https://doi.org/10.1016/j.brat.2016.11.008))

Grund, S., Lüdtke, O., & Robitzsch, A. (2016a). Multiple imputation of multilevel missing data: An introduction to the R package pan. *SAGE Open*, *6*(4), 1–17. doi: 10.1177/2158244016668220 ([Link](https://doi.org/10.1177/2158244016668220))

Grund, S., Lüdtke, O., & Robitzsch, A. (2016b). Pooling ANOVA results from multiply imputed datasets: A simulation study. *Methodology*, *12*, 75–88.
doi: 10.1027/1614-2241/a000111 ([Link](https://doi.org/10.1027/1614-2241/a000111))

Rubin, D. B. (1987). *Multiple imputation for nonresponse in surveys*. Hoboken, NJ: Wiley.

Snijders, T. A. B., & Bosker, R. J. (2012). *Multilevel analysis: An introduction to basic and advanced multilevel modeling*. Thousand Oaks, CA: Sage.

---

```{r, echo=F}
cat("Author: Simon Grund (grund@ipn.uni-kiel.de)\nDate: ", as.character(Sys.Date()))
```

==> mitml/inst/doc/Analysis.R <==
## ----setup, include=FALSE, cache=FALSE-----------------------------------------------
library(knitr)
set.seed(123)
options(width=87)
opts_chunk$set(background="#ffffff", comment="#", collapse=FALSE,
               fig.width=9, fig.height=9, warning=FALSE, message=FALSE)

## ------------------------------------------------------------------------------------
library(mitml)
library(lme4)
data(studentratings)

## ------------------------------------------------------------------------------------
summary(studentratings)

## ---- results="hide"-----------------------------------------------------------------
fml <- ReadDis + SES ~ 1 + Sex + (1|ID)
imp <- panImpute(studentratings, formula=fml, n.burn=5000, n.iter=200, m=20)

## ------------------------------------------------------------------------------------
implist <- mitmlComplete(imp, "all")

## ------------------------------------------------------------------------------------
implist <- within(implist,{
  G.SES <- clusterMeans(SES,ID) # calculate group means
  I.SES <- SES - G.SES          # center around group means
})

## ---- eval=FALSE---------------------------------------------------------------------
# implist <- with(implist,{
#   df <- data.frame(as.list(environment()))
#   df <- ... # dplyr commands
#   df
# })
# implist <- as.mitml.list(implist)

## ------------------------------------------------------------------------------------
fit <- with(implist,{
  lmer(MathAchiev ~ 1 + Sex + I.SES + G.SES + (1|ID))
})

## ------------------------------------------------------------------------------------
testEstimates(fit)

## ------------------------------------------------------------------------------------
testEstimates(fit, var.comp=TRUE, df.com=46)

## ------------------------------------------------------------------------------------
fit.null <- with(implist,{
  lmer(MathAchiev ~ 1 + Sex + (1|ID))
})

testModels(fit, fit.null)

## ------------------------------------------------------------------------------------
testModels(fit, fit.null, df.com=46)

## ------------------------------------------------------------------------------------
testModels(fit, fit.null, method="D3")

## ------------------------------------------------------------------------------------
c1 <- c("I.SES", "G.SES")
testConstraints(fit, constraints=c1)

## ------------------------------------------------------------------------------------
c2 <- c("G.SES - I.SES")
testConstraints(fit, constraints=c2)

## ---- echo=F--------------------------------------------------------------------------
cat("Author: Simon Grund (grund@ipn.uni-kiel.de)\nDate: ", as.character(Sys.Date()))

==> mitml/inst/doc/Analysis.html (rendered vignette, shown as text) <==

Analysis of Multiply Imputed Data Sets

This vignette provides an overview of the analysis of multiply imputed data sets with mitml. Specifically, this vignette addresses the following topics:

  1. Working with multiply imputed data sets
  2. Rubin’s rules for pooling individual parameters
  3. Model comparisons
  4. Parameter constraints

Further information can be found in the other vignettes and the package documentation.

Example data (studentratings)

For the purposes of this vignette, we make use of the studentratings data set, which contains simulated data from 750 students in 50 schools including scores on reading and math achievement, socioeconomic status (SES), and ratings on school and classroom environment.

The package and the data set can be loaded as follows.

library(mitml)
library(lme4)
data(studentratings)

As evident from its summary, most variables in the data set contain missing values.

summary(studentratings)
#        ID       FedState     Sex              MathAchiev       MathDis      
#  Min.   :1001   B :375   Length:750         Min.   :225.0   Min.   :0.2987  
#  1st Qu.:1013   SH:375   Class :character   1st Qu.:440.7   1st Qu.:1.9594  
#  Median :1513            Mode  :character   Median :492.7   Median :2.4350  
#  Mean   :1513                               Mean   :495.4   Mean   :2.4717  
#  3rd Qu.:2013                               3rd Qu.:553.2   3rd Qu.:3.0113  
#  Max.   :2025                               Max.   :808.1   Max.   :4.7888  
#                                             NA's   :132     NA's   :466     
#       SES          ReadAchiev       ReadDis        CognAbility      SchClimate     
#  Min.   :-9.00   Min.   :191.1   Min.   :0.7637   Min.   :28.89   Min.   :0.02449  
#  1st Qu.:35.00   1st Qu.:427.4   1st Qu.:2.1249   1st Qu.:43.80   1st Qu.:1.15338  
#  Median :46.00   Median :490.2   Median :2.5300   Median :48.69   Median :1.65636  
#  Mean   :46.55   Mean   :489.9   Mean   :2.5899   Mean   :48.82   Mean   :1.73196  
#  3rd Qu.:59.00   3rd Qu.:558.4   3rd Qu.:3.0663   3rd Qu.:53.94   3rd Qu.:2.24018  
#  Max.   :93.00   Max.   :818.5   Max.   :4.8554   Max.   :71.29   Max.   :4.19316  
#  NA's   :281                     NA's   :153                      NA's   :140

In the present example, we investigate the differences in mathematics achievement that can be attributed to differences in SES when controlling for students’ sex. Specifically, we are interested in the following model.

\[ \mathit{MA}_{ij} = \gamma_{00} + \gamma_{10} \mathit{Sex}_{ij} + \gamma_{20} (\mathit{SES}_{ij}-\overline{\mathit{SES}}_{\bullet j}) + \gamma_{01} \overline{\mathit{SES}}_{\bullet j} + u_{0j} + e_{ij} \]

Note that this model also employs group-mean centering to separate the individual and group-level effects of SES.

Generating imputations

In the present example, we generate 20 imputations from the following imputation model.

fml <- ReadDis + SES ~ 1 + Sex + (1|ID)
imp <- panImpute(studentratings, formula=fml, n.burn=5000, n.iter=200, m=20)

The completed data are then extracted with mitmlComplete.

implist <- mitmlComplete(imp, "all")

Transforming the imputed data sets

In empirical research, the raw data rarely enter the analyses directly but often need to be transformed beforehand. For this purpose, the mitml package provides the within function, which applies a given transformation directly to each data set.

In the following, we use this to (a) calculate the group means of SES and (b) center the individual scores around their group means.

implist <- within(implist,{
  G.SES <- clusterMeans(SES,ID) # calculate group means
  I.SES <- SES - G.SES          # center around group means
})

This method can be used to apply arbitrary transformations to all of the completed data sets simultaneously.

Note regarding dplyr: Due to how it is implemented, within cannot be used directly with dplyr. Instead, users may use with instead of within with the following workaround.
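implist <- with(implist,{
  df <- data.frame(as.list(environment()))
  df <- ... # dplyr commands
  df
})
implist <- as.mitml.list(implist)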

Advanced users may also consider using lapply for a similar workaround.
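For instance, a minimal sketch of such an lapply-based workaround might look as follows; the group-mean transformation shown here is only an illustration (using base R's ave), and dplyr code could take its place.

implist <- as.mitml.list(lapply(implist, function(d){
  d$G.SES <- ave(d$SES, d$ID)  # example transformation; dplyr commands could go here
  d
}))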

Fitting the analysis model

In order to analyze the imputed data, each data set is analyzed using regular complete-data techniques. For this purpose, mitml offers the with function. In the present example, we use it to fit the model of interest with the R package lme4.

fit <- with(implist,{
  lmer(MathAchiev ~ 1 + Sex + I.SES + G.SES + (1|ID))
})

This results in a list of fitted models, one for each of the imputed data sets.

Pooling

The results obtained from the imputed data sets must be pooled in order to obtain a set of final parameter estimates and inferences. In the following, we employ a number of different pooling methods that can be used to address common statistical tasks, for example, (a) estimating and testing individual parameters, (b) model comparisons, and (c) tests of constraints about one or several parameters.

Parameter estimates

Individual parameters are commonly pooled with the rules developed by Rubin (1987). In mitml, Rubin’s rules are implemented in the testEstimates function.
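testEstimates(fit)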

# 
# Call:
# 
# testEstimates(model = fit)
# 
# Final parameter estimates and inferences obtained from 20 imputed data sets.
# 
#               Estimate  Std.Error    t.value         df    P(>|t|)        RIV        FMI 
# (Intercept)    433.015     28.481     15.203   1081.280      0.000      0.153      0.134 
# SexGirl          3.380      7.335      0.461 279399.841      0.645      0.008      0.008 
# I.SES            0.692      0.257      2.690    233.427      0.008      0.399      0.291 
# G.SES            1.296      0.597      2.173   1096.956      0.030      0.152      0.133 
# 
# Unadjusted hypothesis test as appropriate in larger samples.

In addition, the argument var.comp=TRUE can be used to obtain pooled estimates of variance components, and df.com can be used to specify the complete-data degrees of freedom, which provides more appropriate (i.e., conservative) inferences in smaller samples.

For example, using a conservative value for the complete-data degrees of freedom for the fixed effects in the model of interest (Snijders & Bosker, 2012), the output changes as follows.
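testEstimates(fit, var.comp=TRUE, df.com=46)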

# 
# Call:
# 
# testEstimates(model = fit, var.comp = TRUE, df.com = 46)
# 
# Final parameter estimates and inferences obtained from 20 imputed data sets.
# 
#              Estimate Std.Error   t.value        df   P(>|t|)       RIV       FMI 
# (Intercept)   433.015    28.481    15.203    36.965     0.000     0.153     0.134 
# SexGirl         3.380     7.335     0.461    43.752     0.647     0.008     0.008 
# I.SES           0.692     0.257     2.690    27.781     0.012     0.399     0.291 
# G.SES           1.296     0.597     2.173    37.022     0.036     0.152     0.133 
# 
#                         Estimate 
# Intercept~~Intercept|ID  168.506 
# Residual~~Residual      8092.631 
# ICC|ID                     0.020 
# 
# Hypothesis test adjusted for small samples with df=[46]
# complete-data degrees of freedom.

Multiple parameters and model comparisons

Oftentimes, statistical inference concerns more than one parameter at a time. For example, the combined influence of SES (within and between groups) on mathematics achievement is represented by two parameters in the model of interest.

Multiple pooling methods for Wald and likelihood ratio tests (LRTs) are implemented in the testModels function. This function requires the specification of a full model and a restricted model, which are then compared using (pooled) Wald tests or LRTs. Specifically, testModels allows users to pool Wald tests (\(D_1\)), \(\chi^2\) test statistics (\(D_2\)), and LRTs (\(D_3\); for a comparison of these methods, see also Grund, Lüdtke, & Robitzsch, 2016b).

To examine the combined influence of SES on mathematics achievement, the following restricted model can be specified and compared with the model of interest (using \(D_1\)).
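fit.null <- with(implist,{
  lmer(MathAchiev ~ 1 + Sex + (1|ID))
})

testModels(fit, fit.null)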

# 
# Call:
# 
# testModels(model = fit, null.model = fit.null)
# 
# Model comparison calculated from 20 imputed data sets.
# Combination method: D1 
# 
#    F.value     df1     df2   P(>F)     RIV 
#      6.095       2 674.475   0.002   0.275 
# 
# Unadjusted hypothesis test as appropriate in larger samples.

Note regarding the order of arguments: Please note that testModels expects that the first argument contains the full model, and the second argument contains the restricted model. If the order of the arguments is reversed, the results will not be interpretable.

Similar to the test for individual parameters, smaller samples can be accommodated with testModels (with method \(D_1\)) by specifying the complete-data degrees of freedom for the denominator of the \(F\) statistic.
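testModels(fit, fit.null, df.com=46)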

# 
# Call:
# 
# testModels(model = fit, null.model = fit.null, df.com = 46)
# 
# Model comparison calculated from 20 imputed data sets.
# Combination method: D1 
# 
#    F.value     df1     df2   P(>F)     RIV 
#      6.095       2  40.687   0.005   0.275 
# 
# Hypothesis test adjusted for small samples with df=[46]
# complete-data degrees of freedom.

The pooling method used by testModels is determined by the method argument. For example, to calculate the pooled LRT corresponding to the Wald test above (i.e., \(D_3\)), the following command can be issued.
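testModels(fit, fit.null, method="D3")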

# 
# Call:
# 
# testModels(model = fit, null.model = fit.null, method = "D3")
# 
# Model comparison calculated from 20 imputed data sets.
# Combination method: D3 
# 
#    F.value     df1     df2   P(>F)     RIV 
#      5.787       2 519.143   0.003   0.328 
# 
# Models originally fit with REML were automatically refit using ML.

Constraints on parameters

Finally, it is often useful to investigate functions (or constraints) of the parameters in the model of interest. In complete data sets, this can be achieved with a test of linear hypotheses or the delta method. The mitml package implements a pooled version of the delta method in the testConstraints function.

For example, the combined influence of SES on mathematics achievement can also be tested without model comparisons by testing the constraint that the parameters pertaining to I.SES and G.SES are both zero. This constraint is defined and tested as follows.
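c1 <- c("I.SES", "G.SES")
testConstraints(fit, constraints=c1)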

# 
# Call:
# 
# testConstraints(model = fit, constraints = c1)
# 
# Hypothesis test calculated from 20 imputed data sets. The following
# constraints were specified:
# 
#             Estimate Std. Error 
#    I.SES:      0.692      0.245 
#    G.SES:      1.296      0.628 
# 
# Combination method: D1 
# 
#    F.value     df1     df2   P(>F)     RIV 
#      6.095       2 674.475   0.002   0.275 
# 
# Unadjusted hypothesis test as appropriate in larger samples.

This test is identical to the Wald test given in the previous section. Arbitrary constraints on the parameters can be specified and tested in this manner, where each character string denotes an expression to be tested against zero.

In the present example, we are also interested in the contextual effect of SES on mathematics achievement (e.g., Snijders & Bosker, 2012). The contextual effect is simply the difference between the coefficients pertaining to G.SES and I.SES and can be tested as follows.
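c2 <- c("G.SES - I.SES")
testConstraints(fit, constraints=c2)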

# 
# Call:
# 
# testConstraints(model = fit, constraints = c2)
# 
# Hypothesis test calculated from 20 imputed data sets. The following
# constraints were specified:
# 
#                     Estimate Std. Error 
#    G.SES - I.SES:      0.605      0.644 
# 
# Combination method: D1 
# 
#    F.value     df1     df2   P(>F)     RIV 
#      0.881       1 616.380   0.348   0.166 
# 
# Unadjusted hypothesis test as appropriate in larger samples.

Similar to model comparisons, constraints can be tested with different methods (\(D_1\) and \(D_2\)) and can accommodate smaller samples by specifying a value for df.com. Further examples for the analysis of multiply imputed data sets with mitml are given by Enders (2016) and Grund, Lüdtke, and Robitzsch (2016a).

References

Enders, C. K. (2016). Multiple imputation as a flexible tool for missing data handling in clinical research. Behaviour Research and Therapy. doi: 10.1016/j.brat.2016.11.008 (Link)

Grund, S., Lüdtke, O., & Robitzsch, A. (2016a). Multiple imputation of multilevel missing data: An introduction to the R package pan. SAGE Open, 6(4), 1–17. doi: 10.1177/2158244016668220 (Link)

Grund, S., Lüdtke, O., & Robitzsch, A. (2016b). Pooling ANOVA results from multiply imputed datasets: A simulation study. Methodology, 12, 75–88. doi: 10.1027/1614-2241/a000111 (Link)

Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. Hoboken, NJ: Wiley.

Snijders, T. A. B., & Bosker, R. J. (2012). Multilevel analysis: An introduction to basic and advanced multilevel modeling. Thousand Oaks, CA: Sage.


# Author: Simon Grund (grund@ipn.uni-kiel.de)
# Date:   2019-01-02
==> mitml/inst/doc/Level2.R <==
## ----setup, include=FALSE, cache=FALSE-----------------------------------------------
library(knitr)
set.seed(123)
options(width=87)
opts_chunk$set(background="#ffffff", comment="#", collapse=FALSE,
               fig.width=9, fig.height=9, warning=FALSE, message=FALSE)

## ------------------------------------------------------------------------------------
library(mitml)
data(leadership)

## ------------------------------------------------------------------------------------
summary(leadership)

## ---- echo=FALSE----------------------------------------------------------------------
leadership[73:78,]

## ------------------------------------------------------------------------------------
fml <- list( JOBSAT + NEGLEAD + WLOAD ~ 1 + (1|GRPID) , # Level 1
             COHES ~ 1 )                                # Level 2

## ---- results="hide"-----------------------------------------------------------------
imp <- jomoImpute(leadership, formula=fml, n.burn=5000, n.iter=250, m=20)

## ------------------------------------------------------------------------------------
summary(imp)

## ------------------------------------------------------------------------------------
implist <- mitmlComplete(imp, "all")

## ---- echo=FALSE----------------------------------------------------------------------
implist[[1]][73:78,]

## ---- echo=F--------------------------------------------------------------------------
cat("Author: Simon Grund (grund@ipn.uni-kiel.de)\nDate: ", as.character(Sys.Date()))

==> mitml/inst/doc/Level2.html (rendered vignette, shown as text) <==

Imputation of Missing Data at Level 2


This vignette illustrates the use of mitml for the treatment of missing data at Level 2. Specifically, the vignette addresses the following topics:

  1. Specification of the two-level imputation model for missing data at both Level 1 and 2
  2. Running the imputation procedure

Further information can be found in the other vignettes and the package documentation.

Example data

For the purposes of this vignette, we make use of the leadership data set, which contains simulated data from 750 employees in 50 groups, including ratings on job satisfaction, leadership style, and work load (Level 1) as well as cohesion (Level 2).

The package and the data set can be loaded as follows.

library(mitml)
data(leadership)

In the summary of the data, it becomes visible that all variables are affected by missing data.

summary(leadership)
#      GRPID          JOBSAT             COHES            NEGLEAD          WLOAD    
#  Min.   : 1.0   Min.   :-7.32934   Min.   :-3.4072   Min.   :-3.13213   low :416  
#  1st Qu.:13.0   1st Qu.:-1.61932   1st Qu.:-0.4004   1st Qu.:-0.70299   high:248  
#  Median :25.5   Median :-0.02637   Median : 0.2117   Median : 0.08027   NA's: 86  
#  Mean   :25.5   Mean   :-0.03168   Mean   : 0.1722   Mean   : 0.04024             
#  3rd Qu.:38.0   3rd Qu.: 1.64571   3rd Qu.: 1.1497   3rd Qu.: 0.79111             
#  Max.   :50.0   Max.   :10.19227   Max.   : 2.5794   Max.   : 3.16116             
#                 NA's   :69         NA's   :30        NA's   :92

The following data segment illustrates this fact, including cases with missing data at Level 1 (e.g., job satisfaction) and 2 (e.g., cohesion).

#    GRPID      JOBSAT     COHES     NEGLEAD WLOAD
# 73     5 -1.72143400 0.9023198  0.83025589  high
# 74     5          NA 0.9023198  0.15335056  high
# 75     5 -0.09541178 0.9023198  0.21886272   low
# 76     6  0.68626611        NA -0.38190591  high
# 77     6          NA        NA          NA   low
# 78     6 -1.86298201        NA -0.05351001  high

In the following, we will employ a two-level model to address missing data at both levels simultaneously.

Specifying the imputation model

The specification of the two-level model involves two components, one pertaining to the variables at each level of the sample (Goldstein, Carpenter, Kenward, & Levin, 2009; for further discussion, see also Enders, Mistler, & Keller, 2016; Grund, Lüdtke, & Robitzsch, in press).

Specifically, the imputation model is specified as a list with two components, where the first component denotes the model for the variables at Level 1, and the second component denotes the model for the variables at Level 2.

For example, using the formula interface, an imputation model targeting all variables in the data set can be written as follows.

fml <- list( JOBSAT + NEGLEAD + WLOAD ~ 1 + (1|GRPID) , # Level 1
             COHES ~ 1 )                                # Level 2

The first component of this list includes the three target variables at Level 1 and a fixed (1) as well as a random intercept (1|GRPID). The second component includes the target variable at Level 2 with a fixed intercept (1).

From a statistical point of view, this specification corresponds to the following model \[ \begin{aligned} \mathbf{y}_{1ij} &= \boldsymbol\mu_{1} + \mathbf{u}_{1j} + \mathbf{e}_{ij} \\ \mathbf{y}_{2j} &= \boldsymbol\mu_{2} + \mathbf{u}_{2j} \; , \end{aligned} \] where \(\mathbf{y}_{1ij}\) denotes the target variables at Level 1, \(\mathbf{y}_{2j}\) the target variables at Level 2, and the right-hand side of the model contains the fixed effects, random effects, and residual terms as mentioned above.

Note that, even though the two components of the model appear to be separated, they define a single (joint) model for all target variables at both Level 1 and 2. Specifically, this model employs a two-level covariance structure, which allows for relations between variables at both Level 1 (i.e., correlated residuals at Level 1) and 2 (i.e., correlated random effects residuals at Level 2).

Generating imputations

Because the data contain missing values at both levels, imputations will be generated with jomoImpute (and not panImpute). Except for the specification of the two-level model, the syntax is the same as in applications with missing data only at Level 1.

Here, we will run 5,000 burn-in iterations and generate 20 imputations, each 250 iterations apart.

imp <- jomoImpute(leadership, formula=fml, n.burn=5000, n.iter=250, m=20)

By looking at the summary, we can then review the imputation procedure and verify that the imputation model converged.

summary(imp)
# 
# Call:
# 
# jomoImpute(data = leadership, formula = fml, n.burn = 5000, n.iter = 250, 
#     m = 20)
# 
# Level 1:
#  
# Cluster variable:         GRPID 
# Target variables:         JOBSAT NEGLEAD WLOAD 
# Fixed effect predictors:  (Intercept) 
# Random effect predictors: (Intercept) 
# 
# Level 2:
#                 
# Target variables:         COHES 
# Fixed effect predictors:  (Intercept) 
# 
# Performed 5000 burn-in iterations, and generated 20 imputed data sets,
# each 250 iterations apart. 
# 
# Potential scale reduction (Rhat, imputation phase):
#  
#          Min   25%  Mean Median   75%   Max
# Beta:  1.001 1.001 1.001  1.001 1.001 1.001
# Beta2: 1.001 1.001 1.001  1.001 1.001 1.001
# Psi:   1.000 1.001 1.003  1.001 1.003 1.009
# Sigma: 1.000 1.003 1.004  1.004 1.006 1.009
# 
# Largest potential scale reduction:
# Beta: [1,3], Beta2: [1,1], Psi: [4,3], Sigma: [3,1]
# 
# Missing data per variable:
#     GRPID JOBSAT NEGLEAD WLOAD COHES
# MD% 0     9.2    12.3    11.5  4.0

Due to the greater complexity of the two-level model, the output includes more information than in applications with missing data only at Level 1. For example, the output features the model specification for variables at both Level 1 and 2. Furthermore, it provides convergence statistics for the additional regression coefficients for the target variables at Level 2 (i.e., Beta2).

Finally, it also becomes visible that the two-level model indeed allows for relations between target variables at Level 1 and 2. This can be seen from the fact that the potential scale reduction factor (\(\hat{R}\)) for the covariance matrix at Level 2 (Psi) was largest for Psi[4,3], which is the covariance between cohesion and the random intercept of work load.

Completing the data

The completed data sets can then be extracted with mitmlComplete.

implist <- mitmlComplete(imp, "all")

When inspecting the completed data, it is easy to verify that the imputations for variables at Level 2 are constant within groups as intended, thus preserving the two-level structure of the data.

#    GRPID      JOBSAT     NEGLEAD WLOAD      COHES
# 73     5 -1.72143400  0.83025589  high  0.9023198
# 74     5 -2.80749991  0.15335056  high  0.9023198
# 75     5 -0.09541178  0.21886272   low  0.9023198
# 76     6  0.68626611 -0.38190591  high -1.0275552
# 77     6  1.52825873 -1.11035850   low -1.0275552
# 78     6 -1.86298201 -0.05351001  high -1.0275552
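As a quick check, the constancy of the Level-2 imputations can also be verified programmatically. The following sketch (illustrative, using base R) counts the distinct values of COHES within each group of the first completed data set; each count should equal 1.

# number of distinct COHES values per group (should all be 1)
with(implist[[1]], tapply(COHES, GRPID, function(x) length(unique(x))))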
References

Enders, C. K., Mistler, S. A., & Keller, B. T. (2016). Multilevel multiple imputation: A review and evaluation of joint modeling and chained equations imputation. Psychological Methods, 21, 222–240. doi: 10.1037/met0000063 (Link)

Goldstein, H., Carpenter, J. R., Kenward, M. G., & Levin, K. A. (2009). Multilevel models with multivariate mixed response types. Statistical Modelling, 9, 173–197. doi: 10.1177/1471082X0800900301 (Link)

Grund, S., Lüdtke, O., & Robitzsch, A. (in press). Multiple imputation of missing data for multilevel models: Simulations and recommendations. Organizational Research Methods. doi: 10.1177/1094428117703686 (Link)


# Author: Simon Grund (grund@ipn.uni-kiel.de)
# Date:   2019-01-02
==> mitml/inst/doc/Introduction.R <==
## ----setup, include=FALSE, cache=FALSE-----------------------------------------------
library(knitr)
set.seed(123)
options(width=87)
opts_chunk$set(background="#ffffff", comment="#", collapse=FALSE,
               fig.width=9, fig.height=9, warning=FALSE, message=FALSE)

## ------------------------------------------------------------------------------------
library(mitml)
data(studentratings)

## ------------------------------------------------------------------------------------
summary(studentratings)

## ---- echo=FALSE----------------------------------------------------------------------
round(cor(studentratings[,-(1:3)], use="pairwise"),3)

## ---- results="hide"-----------------------------------------------------------------
ReadAchiev ~ 1 + ReadDis + (1|ID)

## ------------------------------------------------------------------------------------
fml <- ReadAchiev + ReadDis + SchClimate ~ 1 + (1|ID)

## ---- results="hide"-----------------------------------------------------------------
imp <- panImpute(studentratings, formula=fml, n.burn=5000, n.iter=100, m=100)

## ------------------------------------------------------------------------------------
summary(imp)

## ----conv, echo=FALSE-----------------------------------------------------------------
plot(imp, trace="all", print="beta", pos=c(1,2),
     export="png", dev.args=list(width=720, height=380, pointsize=16))

## ---- eval=FALSE----------------------------------------------------------------------
# plot(imp, trace="all", print="beta", pos=c(1,2))

## ------------------------------------------------------------------------------------
implist <- mitmlComplete(imp, "all")

## ---- message=FALSE--------------------------------------------------------------------
library(lme4)
fit <- with(implist, lmer(ReadAchiev ~ 1 + ReadDis + (1|ID)))

## ------------------------------------------------------------------------------------
testEstimates(fit, var.comp=TRUE)

## ---- echo=F--------------------------------------------------------------------------
cat("Author: Simon Grund (grund@ipn.uni-kiel.de)\nDate: ", as.character(Sys.Date()))

==> mitml/inst/doc/Introduction.html (rendered vignette, shown as text) <==

Introduction


This vignette provides a first introduction to the R package mitml for generating and analyzing multiple imputations for multilevel missing data. A typical application of the package consists of the following steps.

  1. Imputation
  2. Assessment of convergence
  3. Completion of the data
  4. Analysis
  5. Pooling

The mitml package offers a set of tools to facilitate each of these steps. This vignette offers a step-by-step illustration of the basic features of mitml. Further information can be found in the other vignettes and the package documentation.

Example data

For the purposes of this vignette, we employ a simple example that makes use of the studentratings data set, which is provided with mitml. To use it, the mitml package and the data set must be loaded as follows.

library(mitml)
data(studentratings)

More information about the variables in the data set can be obtained from its summary.

summary(studentratings)
#        ID       FedState     Sex              MathAchiev       MathDis      
#  Min.   :1001   B :375   Length:750         Min.   :225.0   Min.   :0.2987  
#  1st Qu.:1013   SH:375   Class :character   1st Qu.:440.7   1st Qu.:1.9594  
#  Median :1513            Mode  :character   Median :492.7   Median :2.4350  
#  Mean   :1513                               Mean   :495.4   Mean   :2.4717  
#  3rd Qu.:2013                               3rd Qu.:553.2   3rd Qu.:3.0113  
#  Max.   :2025                               Max.   :808.1   Max.   :4.7888  
#                                             NA's   :132     NA's   :466     
#       SES          ReadAchiev       ReadDis        CognAbility      SchClimate     
#  Min.   :-9.00   Min.   :191.1   Min.   :0.7637   Min.   :28.89   Min.   :0.02449  
#  1st Qu.:35.00   1st Qu.:427.4   1st Qu.:2.1249   1st Qu.:43.80   1st Qu.:1.15338  
#  Median :46.00   Median :490.2   Median :2.5300   Median :48.69   Median :1.65636  
#  Mean   :46.55   Mean   :489.9   Mean   :2.5899   Mean   :48.82   Mean   :1.73196  
#  3rd Qu.:59.00   3rd Qu.:558.4   3rd Qu.:3.0663   3rd Qu.:53.94   3rd Qu.:2.24018  
#  Max.   :93.00   Max.   :818.5   Max.   :4.8554   Max.   :71.29   Max.   :4.19316  
#  NA's   :281                     NA's   :153                      NA's   :140

In addition, the correlations between variables (based on pairwise observations) may be useful for identifying possible sources of information that may be used during the treatment of missing data.

#             MathAchiev MathDis    SES ReadAchiev ReadDis CognAbility SchClimate
# MathAchiev       1.000  -0.106  0.260      0.497  -0.080       0.569     -0.206
# MathDis         -0.106   1.000 -0.206     -0.189   0.613      -0.203      0.412
# SES              0.260  -0.206  1.000      0.305  -0.153       0.138     -0.176
# ReadAchiev       0.497  -0.189  0.305      1.000  -0.297       0.413     -0.320
# ReadDis         -0.080   0.613 -0.153     -0.297   1.000      -0.162      0.417
# CognAbility      0.569  -0.203  0.138      0.413  -0.162       1.000     -0.266
# SchClimate      -0.206   0.412 -0.176     -0.320   0.417      -0.266      1.000

This illustrates (a) that most variables in the data set are affected by missing data but also (b) that substantial relations exist between variables. For simplicity, we focus on only a subset of these variables.

Model of interest

For the present example, we focus on the two variables ReadDis (disciplinary problems in reading class) and ReadAchiev (reading achievement).

Assume we are interested in the relation between these variables. Specifically, we may be interested in the following analysis model

\[ \mathit{ReadAchiev}_{ij} = \gamma_{00} + \gamma_{10} \mathit{ReadDis}_{ij} + u_{0j} + e_{ij} \]

On the basis of the syntax used in the R package lme4, this model may be written as follows.

ReadAchiev ~ 1 + ReadDis + (1|ID)

In this model, the relation between ReadDis and ReadAchiev is represented by a single fixed effect of ReadDis, and a random intercept is included to account for the clustered structure of the data and the group-level variance in ReadAchiev that is not explained by ReadDis.

Generating imputations

The mitml package includes wrapper functions for the R packages pan (panImpute) and jomo (jomoImpute). Here, we will use the first option. To generate imputations with panImpute, the user must specify (at least):

  1. an imputation model
  2. the number of iterations and imputations

The easiest way of specifying the imputation model is to use the formula argument of panImpute. Generally speaking, the imputation model should include all variables that are either (a) part of the model of interest, (b) related to the variables in the model, or (c) related to whether the variables in the model are missing.

In this simple example, we include only ReadDis and ReadAchiev as the main target variables and SchClimate as an auxiliary variable.

fml <- ReadAchiev + ReadDis + SchClimate ~ 1 + (1|ID)

Note that, in this specification of the imputation model, all variables are included on the left-hand side of the model, whereas the right-hand side is left “empty”. This model allows for all relations between variables at Level 1 and 2 and is thus suitable for most applications of the multilevel random intercept model (for further discussion, see also Grund, Lüdtke, & Robitzsch, 2016, in press).

The imputation procedure is then run for 5,000 iterations (burn-in), after which 100 imputations are drawn every 100 iterations.

imp <- panImpute(studentratings, formula=fml, n.burn=5000, n.iter=100, m=100)

This step may take a few seconds. Once the process is completed, the imputations are saved in the imp object.

Assessing convergence

In mitml, there are two options for assessing the convergence of the imputation procedure. First, the summary function calculates the “potential scale reduction factor” (\(\hat{R}\)) for each parameter in the imputation model. If this value is noticeably larger than 1 for some parameters (say \(>1.05\)), a longer burn-in period may be required.

summary(imp)
# 
# Call:
# 
# panImpute(data = studentratings, formula = fml, n.burn = 5000, 
#     n.iter = 100, m = 100)
# 
# Cluster variable:         ID 
# Target variables:         ReadAchiev ReadDis SchClimate 
# Fixed effect predictors:  (Intercept) 
# Random effect predictors: (Intercept) 
# 
# Performed 5000 burn-in iterations, and generated 100 imputed data sets,
# each 100 iterations apart. 
# 
# Potential scale reduction (Rhat, imputation phase):
#  
#          Min   25%  Mean Median   75%   Max
# Beta:  1.000 1.001 1.001  1.001 1.002 1.003
# Psi:   1.000 1.001 1.001  1.001 1.001 1.002
# Sigma: 1.000 1.000 1.000  1.000 1.000 1.001
# 
# Largest potential scale reduction:
# Beta: [1,3], Psi: [2,1], Sigma: [2,1]
# 
# Missing data per variable:
#     ID ReadAchiev ReadDis SchClimate FedState Sex MathAchiev MathDis SES  CognAbility
# MD% 0  0          20.4    18.7       0        0   17.6       62.1    37.5 0

Second, diagnostic plots can be requested with the plot function. These plots consist of a trace plot, an autocorrelation plot, and some additional information about the posterior distribution. Convergence can be assumed if the trace plot is stationary (i.e., does not “drift”), and the autocorrelation is within reasonable bounds for the chosen number of iterations between imputations.

For this example, we examine only the plot for the parameter Beta[1,2] (i.e., the intercept of ReadDis).

plot(imp, trace="all", print="beta", pos=c(1,2))

Taken together, both \(\hat{R}\) and the diagnostic plots indicate that the imputation model converged, setting the basis for the analysis of the imputed data sets.

Completing the data

In order to work with and analyze the imputed data sets, the data sets must be completed with the imputations generated in the previous steps. To do so, mitml provides the function mitmlComplete.

implist <- mitmlComplete(imp, "all")

The resulting object is a list containing the 100 completed data sets.
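Individual data sets can be accessed with regular list indexing; for example, the following (illustrative) command displays the first rows of the first completed data set.

head(implist[[1]])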

Analysis and pooling

In order to obtain estimates for the model of interest, the model must be fit separately to each of the completed data sets, and the results must be pooled into a final set of estimates and inferences. The mitml package offers the with function to fit various statistical models to a list of completed data sets.

In this example, we use the lmer function from the R package lme4 to fit the model of interest.

library(lme4)
fit <- with(implist, lmer(ReadAchiev ~ 1 + ReadDis + (1|ID)))

The resulting object is a list containing the 100 fitted models. To pool the results of these models into a set of final estimates and inferences, mitml offers the testEstimates function.

testEstimates(fit, var.comp=TRUE)
# 
# Call:
# 
# testEstimates(model = fit, var.comp = TRUE)
# 
# Final parameter estimates and inferences obtained from 100 imputed data sets.
# 
#              Estimate Std.Error   t.value        df   P(>|t|)       RIV       FMI 
# (Intercept)   582.186    14.501    40.147  4335.314     0.000     0.178     0.152 
# ReadDis       -35.689     5.231    -6.822  3239.411     0.000     0.212     0.175 
# 
#                         Estimate 
# Intercept~~Intercept|ID  902.868 
# Residual~~Residual      6996.303 
# ICC|ID                     0.114 
# 
# Unadjusted hypothesis test as appropriate in larger samples.

The estimates can be interpreted in a manner similar to the estimates from the corresponding complete-data procedure. In addition, the output includes diagnostic quantities such as the fraction of missing information (FMI), which can be helpful for interpreting the results and understanding problems with the imputation procedure.

References

Grund, S., Lüdtke, O., & Robitzsch, A. (2016). Multiple imputation of multilevel missing data: An introduction to the R package pan. SAGE Open, 6(4), 1–17. doi: 10.1177/2158244016668220 (Link)

Grund, S., Lüdtke, O., & Robitzsch, A. (in press). Multiple imputation of missing data for multilevel models: Simulations and recommendations. Organizational Research Methods. doi: 10.1177/1094428117703686 (Link)


# Author: Simon Grund (grund@ipn.uni-kiel.de)
# Date:   2019-01-02
==> mitml/inst/doc/Level2.Rmd <==
---
title: "Imputation of Missing Data at Level 2"
output:
  rmarkdown::html_vignette:
    css: "css/vignette.css"
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{Imputation of missing data at level 2}
  %\VignetteEncoding{UTF-8}
---
---

```{r setup, include=FALSE, cache=FALSE}
library(knitr)
set.seed(123)
options(width=87)
opts_chunk$set(background="#ffffff", comment="#", collapse=FALSE,
               fig.width=9, fig.height=9, warning=FALSE, message=FALSE)
```
This vignette illustrates the use of `mitml` for the treatment of missing data at Level 2.
Specifically, the vignette addresses the following topics:

1. Specification of the two-level imputation model for missing data at both Level 1 and 2
2. Running the imputation procedure

Further information can be found in the other [vignettes](https://github.com/simongrund1/mitml/wiki) and the package [documentation](https://cran.r-project.org/package=mitml/mitml.pdf).

## Example data

For the purposes of this vignette, we make use of the `leadership` data set, which contains simulated data from 750 employees in 50 groups, including ratings on job satisfaction, leadership style, and work load (Level 1) as well as cohesion (Level 2).

The package and the data set can be loaded as follows.

```{r}
library(mitml)
data(leadership)
```

In the `summary` of the data, it becomes visible that all variables are affected by missing data.

```{r}
summary(leadership)
```

The following data segment illustrates this fact, including cases with missing data at Level 1 (e.g., job satisfaction) and 2 (e.g., cohesion).

```{r, echo=FALSE}
leadership[73:78,]
```

In the following, we will employ a two-level model to address missing data at both levels simultaneously.

## Specifying the imputation model

The specification of the two-level model involves two components, one pertaining to the variables at each level of the sample (Goldstein, Carpenter, Kenward, & Levin, 2009; for further discussion, see also Enders, Mistler, & Keller, 2016; Grund, Lüdtke, & Robitzsch, in press).

Specifically, the imputation model is specified as a list with two components, where the first component denotes the model for the variables at Level 1, and the second component denotes the model for the variables at Level 2.

For example, using the `formula` interface, an imputation model targeting all variables in the data set can be written as follows.

```{r}
fml <- list( JOBSAT + NEGLEAD + WLOAD ~ 1 + (1|GRPID) , # Level 1
             COHES ~ 1 )                                # Level 2
```

The first component of this list includes the three target variables at Level 1 and a fixed (`1`) as well as a random intercept (`1|GRPID`).
The second component includes the target variable at Level 2 with a fixed intercept (`1`).

From a statistical point of view, this specification corresponds to the following model

$$
\begin{aligned}
\mathbf{y}_{1ij} &= \boldsymbol\mu_{1} + \mathbf{u}_{1j} + \mathbf{e}_{ij} \\
\mathbf{y}_{2j} &= \boldsymbol\mu_{2} + \mathbf{u}_{2j} \; ,
\end{aligned}
$$

where $\mathbf{y}_{1ij}$ denotes the target variables at Level 1, $\mathbf{y}_{2j}$ the target variables at Level 2, and the right-hand side of the model contains the fixed effects, random effects, and residual terms as mentioned above.

Note that, even though the two components of the model appear to be separated, they define a single (joint) model for all target variables at both Level 1 and 2.
## Generating imputations

Because the data contain missing values at both levels, imputations will be generated with `jomoImpute` (and not `panImpute`). Except for the specification of the two-level model, the syntax is the same as in applications with missing data only at Level 1. Here, we will run 5,000 burn-in iterations and generate 20 imputations, each 250 iterations apart.

```{r, results="hide"}
imp <- jomoImpute(leadership, formula=fml, n.burn=5000, n.iter=250, m=20)
```

By looking at the `summary`, we can then review the imputation procedure and verify that the imputation model converged.

```{r}
summary(imp)
```

Due to the greater complexity of the two-level model, the output includes more information than in applications with missing data only at Level 1. For example, the output features the model specification for variables at both Level 1 and 2. Furthermore, it provides convergence statistics for the additional regression coefficients for the target variables at Level 2 (i.e., `Beta2`).

Finally, it also becomes visible that the two-level model indeed allows for relations between target variables at Level 1 and 2. This can be seen from the fact that the potential scale reduction factor ($\hat{R}$) for the covariance matrix at Level 2 (`Psi`) was largest for `Psi[4,3]`, which is the covariance between cohesion and the random intercept of work load.

## Completing the data

The completed data sets can then be extracted with `mitmlComplete`.

```{r}
implist <- mitmlComplete(imp, "all")
```

When inspecting the completed data, it is easy to verify that the imputations for variables at Level 2 are constant within groups as intended, thus preserving the two-level structure of the data.

```{r, echo=FALSE}
implist[[1]][73:78,]
```
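This can also be confirmed programmatically. The following sketch checks that the (partially imputed) `COHES` takes exactly one value per group in the first completed data set and should return `TRUE`.

```{r, eval=FALSE}
# sketch: imputations at Level 2 should be constant within groups
all(tapply(implist[[1]]$COHES, implist[[1]]$GRPID,
           function(z) length(unique(z))) == 1)
```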
###### References

Enders, C. K., Mistler, S. A., & Keller, B. T. (2016). Multilevel multiple imputation: A review and evaluation of joint modeling and chained equations imputation. *Psychological Methods*, *21*, 222–240. doi: 10.1037/met0000063 ([Link](https://doi.org/10.1037/met0000063))

Goldstein, H., Carpenter, J. R., Kenward, M. G., & Levin, K. A. (2009). Multilevel models with multivariate mixed response types. *Statistical Modelling*, *9*, 173–197. doi: 10.1177/1471082X0800900301 ([Link](https://doi.org/10.1177/1471082X0800900301))

Grund, S., Lüdtke, O., & Robitzsch, A. (in press). Multiple imputation of missing data for multilevel models: Simulations and recommendations. *Organizational Research Methods*. doi: 10.1177/1094428117703686 ([Link](https://doi.org/10.1177/1094428117703686))

---

```{r, echo=F}
cat("Author: Simon Grund (grund@ipn.uni-kiel.de)\nDate: ", as.character(Sys.Date()))
```
mitml/NAMESPACE0000644000176200001440000000277113142555645012624 0ustar liggesusersimport(stats)
import(jomo)
import(pan)

importFrom(graphics,abline,layout,par,plot,plot.new,axTicks,axis,lines,text,title)
importFrom(grDevices,dev.new,dev.off,devAskNewPage)
importFrom(utils,flush.console,tail,write.table)

export(panImpute, jomoImpute, mitmlComplete, clusterMeans, plot.mitml,
       read.mitml, summary.mitml, multilevelR2, testEstimates, testModels,
       testConstraints, with.mitml.list, within.mitml.list, long2mitml.list,
       jomo2mitml.list, mids2mitml.list, amelia2mitml.list, as.mitml.list,
       is.mitml.list, c.mitml.list, cbind.mitml.list, rbind.mitml.list,
       sort.mitml.list, subset.mitml.list, anova.mitml.result,
       confint.mitml.testEstimates, write.mitml, write.mitmlMplus,
       write.mitmlSAV, write.mitmlSPSS
)

S3method(print,mitml)
S3method(plot,mitml)
S3method(summary,mitml)
S3method(print,mitml.summary)
S3method(c,mitml.list)
S3method(cbind,mitml.list)
S3method(rbind,mitml.list)
S3method(sort,mitml.list)
S3method(subset,mitml.list)
S3method(with,mitml.list)
S3method(within,mitml.list)
S3method(anova,mitml.result)
S3method(print,mitml.anova)
S3method(print,mitml.testEstimates)
S3method(summary,mitml.testEstimates)
S3method(confint,mitml.testEstimates)
S3method(print,mitml.testModels)
S3method(summary,mitml.testModels)
S3method(print,mitml.testConstraints)
S3method(summary,mitml.testConstraints)
mitml/NEWS0000644000176200001440000001544513321124624012102 0ustar liggesusers# * RELEASE HISTORY OF THE 'mitml' PACKAGE: --

# Version 0.3-6 (2018-07-10) --

* confint: new function, calculating confidence intervals for pooled estimates (applicable to testEstimates() objects)
* jomoImpute, panImpute: added features (option to save fewer parameters with 'keep.chains')
* jomoImpute: added features (support for single-level imputation models)
* testEstimates, testConstraints, testModels: added features (support for Cox-PH models using the 'survival' package)
* testConstraints: added features (pooled estimates and SEs of specified constraints)
* mitmlComplete: bugfix (fixes row ordering issue)
* jomoImpute, panImpute: bugfix (fixes erroneous removal of global seed)
* other: added vignettes (analysis, level-2 imputation)

# Version 0.3-5 (2017-03-14) --

* testEstimates: now prints the two-tailed p-value (as opposed to one-tailed in earlier versions), revised label for p-values, improved output with constrained variance components
* testModels: revised label for p-values
* testEstimates: added features (support for GEEs using the 'geepack' package)
* testModels: added features (support for GEEs using the 'geepack' package)
* testConstraints: added features (support for GEEs using the 'geepack' package)
* c.mitml.list, rbind.mitml.list, cbind.mitml.list: new functions, combining lists of multiply imputed data sets by data set (c.mitml.list), row (rbind.mitml.list), or column (cbind.mitml.list)
* sort.mitml.list: new function, sorting lists of multiply imputed data sets by one or several variables (similar to '?order')
* subset.mitml.list: new function, generating subsets for lists of multiply imputed data sets (similar to '?subset')
* amelia2mitml.list: new function, converting imputations generated by the 'Amelia' package to 'mitml.list'
* justice: updated data set (added categorical variable, missing data at Level 2)

# Version 0.3-4 (2016-09-12) --

* mitmlComplete: changed default arguments ('print' now defaults to 'all', returning list of completed data sets)
* jomoImpute: bugfix (fixes error in which jomoImpute() rejected correctly specified priors when 'group=NULL')
* mitmlComplete: bugfix (fixes error with categorical target variables when there are no missing data)
* plot: adjusted warning message for 'pos' argument to include 'beta2'.

# Version 0.3-3 (2016-07-04) --

* jomoImpute: added features (support for imputation of cluster-level variables, i.e., the two-level procedures "jomo2...")
* print/summary: revised appearance in two-level models (model summary is displayed separately by level for two-level imputation procedures)
* plot: additional value for print argument ("beta2", denoting the regression coefficients of the cluster-level imputation model)
* jomoImpute: bugfix (fixes error in the usage of starting values in cases with only continuous/no categorical data)
* plot: revised formatting of the plot title (order of variables)

# Version 0.3-2 (2016-05-10) --

* plot: added features (requesting single parameters, thinning of the chain prior to plotting)
* summary: added features (summary of autocorrelation)
* plot: revised appearance and behavior (burn-in printed in grey, included Rhat and autocorrelation at lag k in the posterior summary; for trace="burnin", the summary is now calculated for the burn-in phase, not the imputation phase)

# Version 0.3-1 (2016-05-10) --

* anova: new function based on objects of class 'mitml.result', facilitating comparisons for a sequence of models
* long2mitml.list: new function, converting multiple imputations from "long" format to 'mitml.list'
* jomo2mitml.list: new function, converting imputations generated by the 'jomo' package to 'mitml.list'
* multilevelR2: new function, calculating measures of explained variance (R^2) for multilevel models and based on observed or multiply imputed data sets
* justice: new data set, including re-simulated data based on the study of procedural justice, justice climate, and organizational satisfaction by Liao and Rupp (2005)
* plot: renamed export directory ("mitmlPlots", formerly "panPlots")
* testModels: added automatic refit using ML for REML fitted models
* mitmlComplete: bugfix (fixes error with mixed categorical and continuous variables)
* plot: bugfix (fixes handling of latent background variables for categorical variables)

# Version 0.3-0 (2016-03-15) --

* jomoImpute: new function, providing an interface to the jomo package for imputation of missing values at level 1
  - includes adjustments in mitml.complete as well as the summary, print, and plot methods
  - includes minor changes in the interpretation of the formula and type arguments
  - adds a few slots to the 'mitml' object class
* summary: bugfix (fixes behavior when applied to fixed parameters with zero variance)
* as.mitml.list: bugfix (fixes order of class attributes)

# Version 0.2-4 (2015-10-19) --

* clusterMeans: code improvements
* panImpute: code improvements
* testConstraints: added features (model-independent input)
* testEstimates: added features (model-independent input)
* testModels: comparisons of REML fits through D2 are now permitted
* summary: bugfix (n.Rhat now takes effect), added features (goodness of approximation)

# Version 0.2-3 (2015-07-09) --

* panImpute: added features (silent mode), bugfix (ordering of variables with nonstandard priors)
* summary: added features (details on PSR)
* plot: revised layout, added features (trend line, posterior summary), bugfix (plot labels)
* testModels: bugfix (structural zeros in lmer objects)
* studentratings: renamed variable (data set)

# Version 0.2-2 (2015-05-23) --

* initial release
mitml/data/0000755000176200001440000000000013172633034012307 5ustar liggesusersmitml/data/justice.rda0000644000176200001440000002407413043611417014452 0ustar liggesusers
[binary contents of mitml/data/justice.rda and mitml/data/studentratings.rda omitted]
[further binary data omitted; the fragment below is the tail of an R source file whose header and beginning were lost]
("\nImputations were carried out separately within", sprintf("%.0f",ngr), "groups.\n")},"\n")

  invisible(NULL)
}
mitml/R/with.mitml.list.R0000644000176200001440000000040312710440045014763 0ustar liggesuserswith.mitml.list <- function(data, expr, ...){
# evaluates an expression for a list of data sets

  expr <- substitute(expr)
  parent <- parent.frame()
  out <- lapply(data, function(x) eval(expr, x, parent))

  class(out) <- c("mitml.result","list")
  out
}
mitml/R/confint.mitml.testEstimates.R0000644000176200001440000112013142552223017331 0ustar liggesusersconfint.mitml.testEstimates <- function(object, parm, level=0.95, ...){
# calculate confidence intervals from pooled estimates

  est <- object$estimates
  pnames <- rownames(est)
  if(missing(parm)) parm <- pnames
  if(is.numeric(parm)) parm <- pnames[parm]

  cf <- est[parm,1]
  se <- est[parm,2]
  df <- est[parm,4]

  a <- (1-level)/2
  fac <- qt(1-a, df)
  pct <- paste(format(100*c(a,1-a), trim=TRUE, scientific=FALSE, digits=3), "%")

  ci <- matrix(NA_real_, length(parm), 2, dimnames=list(parm,pct))
  ci[,1] <- cf - se*fac
  ci[,2] <- cf + se*fac

  return(ci)
}
mitml/R/print.mitml.anova.R0000644000176200001440000242412712373763015305 0ustar liggesusersprint.mitml.anova <- function(x,...){
# print method for anova method

  cl <- x$call
  test <- x$test
  fml <- x$formula
  mth <- x$method
  use <- x$use
  reml <- x$reml
  m <- x$test[[1]]$m

  # header
  cat("\nCall:\n", paste(deparse(cl)), sep="\n")
  cat("\nModel comparison calculated from",m,"imputed data sets.")
  cat("\nCombination method:",mth, if(mth=="D2"){paste("(",use,")",sep="")},"\n")

  # model formulas
  cat("\n")
  for(mm in 1:length(fml)) cat("Model ",mm,": ",fml[mm],"\n", sep="")

  # check for very large values
  out <- sapply(test, function(z) z$test)
  fmt <- c("%.3f","%.0f","%.3f","%.3f","%.3f")
  ln <- apply(out, 1, function(z) any(z>=10^5))
  fmt[ln] <- "%.3e"
  out <- apply(out, 2, function(z) sprintf(fmt,z))

  # model comparisons
  cat("\n")
  nt <- length(test)
  out <- matrix(out,ncol=nt)
  comp <- paste0(1:nt, " vs ", 2:(nt+1),":")
  out <- rbind(comp, out)
  w <- max(sapply(c(out,colnames(test[[1]]$test)),nchar))
  cat("",format(c("",colnames(test[[1]]$test)),justify="right",width=w),"\n")
  for(mm in 1:ncol(out)) cat("",format(out[,mm],justify="right",width=w),"\n")

  if(reml){
    cat("\nModels originally fit with REML were automatically refit using ML.\n")
  }
  cat("\n")

  invisible()
}
mitml/R/amelia2mitml.list.R0000644000176200001440000002221304410645 1015244 0ustar liggesusersamelia2mitml.list <- function(x){
# convert amelia to mitml.list

  out <- unname(x$imputations)
  class(out) <- c("mitml.list","list")
  out
}
mitml/R/internal-convergence.R0000644000176200001440000347012736447562016044 0ustar liggesusers# Gelman-Rubin (1992) criterion for convergence (Rhat)

.GelmanRubin <- function(x,m){

  # check NA
  if(all(is.na(x))) return(NA)
  # convert vector
  if(is.vector(x)) x <- matrix(x,1,length(x))

  iter <- ncol(x)
  mod <- iter %% m
  n <- rep( (iter-mod)/m , m )
  nmat <- matrix(c(cumsum(n)-n+1, cumsum(n)), nrow=m)
  n <- n[1]

  Rhat <- numeric(nrow(x))
  for(ii in 1:nrow(x)){

    # values per chain
    chs <- apply(nmat, 1, function(j) x[ii,j[1]:j[2]])
    mns <- apply(chs,2,mean)
    vrs <- apply(chs,2,var)

    Bdivn <- 
var(mns) W <- mean(vrs) muhat <- mean(chs) sighat2 <- (n-1)/n * W + Bdivn # sampling distribution Vhat <- sighat2 + Bdivn/m var.Vhat <- ((n-1)/n)^2*(1/m)*var(vrs) + ((m+1)/(m*n))^2*2/(m-1)*(Bdivn*n)^2 + 2*((m+1)*(n-1)/(m*n^2)) * (n/m)*(cov(vrs,mns^2)-2*muhat*cov(vrs,mns)) df <- 2*Vhat^2 / var.Vhat # compute Rhat if(Bdivn==0 & identical(vrs,rep(0,m))){ # for zero variance defined as 1 Rhat[ii] <- 1 }else{ Rhat[ii] <- sqrt( (Vhat/W)*df/(df-2) ) } } Rhat } # criterion for goodness of approximation (Hoff, 2009) .SDprop <- function(x){ # check NA if(all(is.na(x))) return(NA) # convert vector if(is.vector(x)) x <- matrix(x,1,length(x)) np <- nrow(x) v <- apply(x, 1, var) # variance of chain v0 <- v==0 sdp <- sp0 <- neff <- numeric(np) for(i in 1:np){ arp <- try( ar(x[i,], aic=TRUE), silent=T ) if(!v0[i]) sp0[i] <- arp$var.pred/(1 - sum(arp$ar))^2 # spectral density at frequency 0 } n <- ncol(x) mcmc.v <- sp0/n # true variance of the mean (correcting for autocorrelation) neff[!v0] <- (v/mcmc.v)[!v0] # effective sample size neff[v0] <- n # proportion of variance due to sampling inefficiency sdp[!v0] <- sqrt(mcmc.v / v)[!v0] attr(sdp,"n.eff") <- neff sdp } mitml/R/internal-getVC.R0000644000176200001440000001171313044065066014551 0ustar liggesusers# *** # Functions to extract variance components from supported classes # of statistical models # # *** lmer method .getVC.lmer <- function(model){ if(!requireNamespace("lme4", quietly=TRUE)) stop("The 'lme4' package must be installed in order to use this function.") m <- length(model) vlist <- addp <- NULL # variance components vlist <- list() vc <- lapply(model,lme4::VarCorr) clus <- names(vc[[1]]) for(vv in clus){ q <- dim(vc[[1]][[vv]])[1] v.cl <- vapply(vc, function(z) z[[vv]], FUN.VALUE=matrix(0,q,q)) if(is.null(dim(v.cl))) dim(v.cl) <- c(1,1,m) dimnames(v.cl)[1:2] <- lapply(dimnames(vc[[1]][[vv]]), function(z) sub("^[(]Intercept[)]$","Intercept",z)) vlist[[paste("|",vv,sep="")]] <- v.cl } # residual variance (if model uses scale) usesc <- attr(vc[[1]], "useSc") if(usesc){ rv <- sapply(vc, function(z) attr(z,"sc")^2) dim(rv) <- c(1,1,m) dimnames(rv) <- list("Residual","Residual",NULL) vlist <- c(vlist, list(rv)) } # additional parameters # 1. ICC (only single clustering) if(usesc & length(clus)==1){ if("(Intercept)"%in%colnames(vc[[1]][[clus]])){ iv <- sapply(vc, function(z) z[[clus]]["(Intercept)","(Intercept)"]) icc <- iv / (iv + rv[1,1,]) addp <- c(addp, mean(icc)) names(addp) <- paste("ICC|",clus,sep="") } } list(vlist=vlist,addp=addp) } # *** nlme method .getVC.nlme <- function(model){ if(!requireNamespace("nlme", quietly=TRUE)) stop("The 'nlme' package must be installed in order to use this function.") m <- length(model) vlist <- addp <- NULL # variance components (single clustering, limited for multiple clustering) cls <- class(model[[1]])[1] clus <- names(model[[1]]$coefficients$random) vlist <- list() # single cluster variable if(cls=="lme" & length(clus)==1){ vc <- lapply(model,nlme::getVarCov) clus <- attr(vc[[1]],"group.levels") q <- dim(vc[[1]])[1] v.cl <- vapply(vc, identity, FUN.VALUE=matrix(0,q,q)) if(is.null(dim(v.cl))) dim(v.cl) <- c(1,1,m) dimnames(v.cl)[1:2] <- lapply(dimnames(vc[[1]]), function(z) sub("^[(]Intercept[)]$","Intercept",z)) vlist[[paste("|",clus,sep="")]] <- v.cl }else{ vc <- lapply(model,nlme::VarCorr) # by variable for(vv in clus){ q <- dim(model[[1]]$coefficients$random[[vv]])[2] if(length(clus)==1){ rind <- 1:q }else{ rind <- grep( paste0("^",vv," =$"), rownames(vc[[1]])) rind <- (rind+1):(rind+q) } # ... 
by term for(qq in rind){ v.cl <- sapply(vc, function(x) as.numeric(x[qq,1])) if(is.null(dim(v.cl))) dim(v.cl) <- c(1,1,m) dimnames(v.cl)[1:2] <- list(sub("^[(]Intercept[)]$", "Intercept", rownames(vc[[1]])[qq])) vlist[[paste("|",vv,sep="")]] <- v.cl } } } # residual variance (if estimated) fixsigma <- attr(model[[1]]$modelStruct,"fixedSigma") if(!fixsigma){ rv <- sapply(model, function(z) z$sigma^2) dim(rv) <- c(1,1,m) dimnames(rv) <- list("Residual","Residual",NULL) vlist <- c(vlist, list(rv)) } # additional parameters # 1. ICC (only lme, single clustering) if(!fixsigma & cls=="lme" & length(clus)==1){ if("(Intercept)"%in%colnames(vc[[1]])){ iv <- sapply(vc, function(z) z["(Intercept)","(Intercept)"]) icc <- iv / (iv + rv[1,1,]) addp <- c(addp, mean(icc)) names(addp) <- paste("ICC|",clus,sep="") } } list(vlist=vlist,addp=addp) } # *** geeglm method .getVC.geeglm <- function(model){ if(!requireNamespace("geepack", quietly=TRUE)) stop("The 'geepack' package must be installed in order to use this function.") m <- length(model) vlist <- addp <- NULL # variance components (currently not used) # vlist <- list() # additional parameters # 1. scale parameter (gamma) isfix <- model[[1]]$geese$model$scale.fix if(!isfix){ gamma <- sapply(model, function(x) x$geese$gamma) if(is.null(dim(gamma))){ dim(gamma) <- c(1,m) rownames(gamma) <- names(model[[1]]$geese$gamma) } addp <- c(addp,rowMeans(gamma)) nms <- gsub("^[(]Intercept[)]$", "Intercept", names(addp)) names(addp) <- paste0("Scale:",nms) } # 2. correlation parameters (alpha) corstr <- model[[1]]$geese$model$corstr isfix <- corstr%in%c("fixed","userdefined") if(!isfix){ alpha <- sapply(model, function(x) x$geese$alpha) if(is.null(dim(alpha))){ dim(alpha) <- c(1,m) rownames(alpha) <- names(model[[1]]$geese$alpha) } rownames(alpha) <- paste0("Correlation:",rownames(alpha)) addp <- c(addp,rowMeans(alpha)) } list(vlist=vlist,addp=addp) } # *** lm method .getVC.lm <- function(model,ML=FALSE){ m <- length(model) vlist <- addp <- NULL if(ML){ # SiG 16-04-2016 rv <- sapply(model, function(z) sum(resid(z)^2)/length(resid(z)) ) }else{ rv <- sapply(model, function(z) sum(resid(z)^2)/df.residual(z) ) } dim(rv) <- c(1,1,m) dimnames(rv) <- list("Residual","Residual",NULL) vlist <- c(vlist, list(rv)) list(vlist=vlist,addp=addp) } mitml/R/internal-getLR.R0000644000176200001440000001463213100131327014544 0ustar liggesusers# *** # Functions to extract LR statistic for supported classes of statistical models, # possibly given user-defined values for model parameters # # *** lmer method .getLR.lmer <- function(model, null.model=NULL, psi=NULL, null.psi=NULL){ if(!requireNamespace("lme4", quietly=TRUE)) stop("The 'lme4' package must be installed in order to use this function.") requireNamespace("lme4", quietly=TRUE) if(is.null(null.model)){ # TODO: global model fit (experimental, for D2) k <- attr(logLik(model[[1]]),"df") reml <- lme4::isREML(model[[1]]) logL1 <- sapply(model,logLik) dW <- -2*logL1 }else{ k <- attr(logLik(model[[1]]),"df") - attr(logLik(null.model[[1]]),"df") reml <- any(lme4::isREML(model[[1]]), lme4::isREML(null.model[[1]])) if(is.null(psi) & is.null(null.psi)){ logL0 <- sapply(null.model,logLik) logL1 <- sapply(model,logLik) }else{ logL0 <- sapply(null.model,.logL.lmer,psi=null.psi) logL1 <- sapply(model,.logL.lmer,psi=psi) } dW <- -2*(logL0-logL1) } attr(dW,"df") <- k attr(dW,"REML") <- reml dW } # *** nlme method .getLR.nlme <- function(model, null.model=NULL, psi=NULL, null.psi=NULL){ if(!requireNamespace("nlme", quietly=TRUE)) stop("The 'nlme' 
package must be installed in order to use this function.") requireNamespace("nlme", quietly=TRUE) if(is.null(null.model)){ # TODO: global model fit (experimental, for D2) k <- attr(logLik(model[[1]]),"df") reml <- model[[1]]$method=="REML" logL1 <- sapply(model,logLik) dW <- -2*logL1 }else{ k <- attr(logLik(model[[1]]),"df") - attr(logLik(null.model[[1]]),"df") reml <- any(model[[1]]$method=="REML", null.model[[1]]$method=="REML") if(is.null(psi) & is.null(null.psi)){ logL0 <- sapply(null.model,logLik) logL1 <- sapply(model,logLik) }else{ logL0 <- sapply(null.model,.logL.nlme,psi=null.psi) logL1 <- sapply(model,.logL.nlme,psi=psi) } dW <- -2*(logL0-logL1) } attr(dW,"df") <- k attr(dW,"REML") <- reml dW } # *** coxph method .getLR.coxph <- function(model, null.model=NULL, psi=NULL, null.psi=NULL){ ll.null <- function(object,...){ out <- object$loglik[1] attr(out, "df") <- 0 out } if(is.null(null.model)){ # TODO: global model fit (experimental, D2) cls0 <- class(model[[1]])[1]=="coxph.null" k <- attr(logLik(model[[1]]),"df") logL1 <- sapply(model, if(cls0) ll.null else logLik) dW <- -2*logL1 }else{ cls0 <- class(null.model[[1]])[1]=="coxph.null" k <- attr(logLik(model[[1]]),"df") - (if(cls0) 0 else attr(logLik(null.model[[1]]),"df")) if(is.null(psi) & is.null(null.psi)){ logL0 <- sapply(null.model, if(cls0) ll.null else logLik) logL1 <- sapply(model,logLik) }else{ # not yet supported (caught at higher level) } dW <- -2*(logL0-logL1) } attr(dW,"df") <- k dW } # *** default method .getLR.default <- function(model,null.model=NULL, psi=NULL, null.psi=NULL){ if(is.null(null.model)){ # TODO: global model fit (experimental, D2) k <- .tryResidualDf(model[[1]]) logL1 <- sapply(model,logLik) dW <- -2*logL1 }else{ k <- .tryResidualDf(null.model[[1]]) - .tryResidualDf(model[[1]]) if(is.null(psi) & is.null(null.psi)){ logL0 <- sapply(null.model,logLik) logL1 <- sapply(model,logLik) }else{ # only lm supported logL0 <- sapply(null.model,.logL.lm,psi=null.psi) logL1 <- sapply(model,.logL.lm,psi=psi) } dW <- -2*(logL0-logL1) } attr(dW,"df") <- k dW } # *** # Likelihood functions for supported models # .logL.lmer <- function(object, psi=NULL){ if(is.null(psi)){ beta <- lme4::fixef(object) D <- lme4::VarCorr(object)[[1]] sig2 <- attr(lme4::VarCorr(object),"sc")^2 }else{ beta <- psi$beta D <- psi$D sig2 <- psi$sigma2 } cls <- lme4::getME(object,"flist")[[1]] p <- length(beta) q <- dim(D)[1] y <- split(lme4::getME(object,"y"),cls) X <- split(lme4::getME(object,"X"),cls) # testing: via mmList, for single level of clustering Z <- split(lme4::getME(object,"mmList")[[1]],cls) # outdated: via sparse model matrix (produces bug due to zero elements) # Z <- split(matrix(lme4::getME(object,"Zt")@x, nrow=length(cls), byrow=T),cls) L <- numeric(nlevels(cls)) for(i in levels(cls)){ yi <- y[[i]] ni <- length(yi) Xi <- matrix(X[[i]], ncol=p) Ri <- yi - Xi%*%beta Zi <- matrix(Z[[i]], ncol=q) V <- diag(sig2,ni) + Zi %*% D %*% t(Zi) Vinv <- chol2inv(chol(V)) dV <- determinant(V,logarithm=TRUE) dV <- dV$modulus*dV$sign L[i] <- dV + t(Ri) %*% Vinv %*% (Ri) } -sum(L)/2 } .logL.nlme <- function(object, psi=NULL){ if(is.null(psi)){ beta <- nlme::fixef(object) D <- nlme::getVarCov(object) sig2 <- object$sigma^2 }else{ beta <- psi$beta D <- psi$D sig2 <- psi$sigma2 } # error check if(is.null(nlme::getData(object))) stop("No data sets found in 'lme' fit. 
See '?testModels' for an example.")

  cls <- nlme::getGroups(object)
  p <- length(beta)
  q <- dim(D)[1]
  y <- split(nlme::getResponse(object),cls)
  fe <- object$terms
  X <- split(model.matrix(fe,nlme::getData(object)),cls)
  re <- attr(object$modelStruct$reStruct[[1]],"formula")
  Z <- split(model.matrix(re,nlme::getData(object)),cls)

  L <- numeric(nlevels(cls))
  for(i in levels(cls)){
    yi <- y[[i]]
    ni <- length(yi)
    Xi <- matrix(X[[i]], ncol=p)
    Ri <- yi - Xi%*%beta
    Zi <- matrix(Z[[i]], ncol=q)
    V <- diag(sig2,ni) + Zi %*% D %*% t(Zi)
    Vinv <- chol2inv(chol(V))
    dV <- determinant(V,logarithm=TRUE)
    dV <- dV$modulus*dV$sign
    L[i] <- dV + t(Ri) %*% Vinv %*% (Ri)
  }

  -sum(L)/2
}

.logL.lm <- function(object, psi=NULL){

  if(is.null(psi)){
    beta <- coef(object)
    # sig2 <- sum(resid(object)^2)/df.residual(object)
    r <- resid(object)    # SiG 19-04-2016
    sig2 <- sum(r^2)/length(r)
  }else{
    beta <- psi$beta
    sig2 <- psi$sigma2
  }

  ytrm <- attr(object$terms,"variables")[-1][attr(object$terms,"response")]
  y <- as.matrix(object$model[as.character(ytrm)])
  X <- model.matrix(object$terms,object$model)
  r <- y-X%*%beta
  n <- length(y)

  L <- -(n/2)*log(sig2) - (1/(2*sig2)) * t(r) %*% r
  as.numeric(L)
}

.tryResidualDf <- function(object){
  k <- NULL
  if(is.null(k)) k <- tryCatch( df.residual(object), error=function(f) NULL )
  if(is.null(k)) k <- tryCatch( tail(anova(object),1), error=function(f) NULL )
  k
}
mitml/R/write.mitmlSPSS.R0000644000176200001440000575512765742062014721 0ustar liggesuserswrite.mitmlSPSS <- function(x, filename, sep="\t", dec=".", na.value=-999, syntax=TRUE, locale=NULL){
# write text file to be read into SPSS

  if(!"mitml"%in%class(x) & !"mitml.list"%in%class(x))
    stop("'x' must be of class 'mitml' or 'mitml.list'.")
  if(!dec%in%c(",",".")) stop("Only a dot '.' or a comma ',' may be specified as decimal separator.")

  if("mitml"%in%class(x)){
    il <- mitmlComplete(x,0:x$iter$m)
  }else{
    il <- x
  }

  for(ii in 1:length(il)){
    il[[ii]] <- cbind(ii-1,il[[ii]])
    colnames(il[[ii]])[1] <- "Imputation_"
  }
  out <- do.call(rbind,il)

  num <- sapply(out,is.numeric)
  chr <- sapply(out,is.character)
  fac <- sapply(out,is.factor)
  ord <- sapply(out,is.ordered)

  # convert factors
  conv <- as.list(which(fac))
  for(ff in which(fac)){
    out[,ff] <- as.factor(out[,ff])
    conv[[colnames(out)[ff]]] <- matrix(c(levels(out[,ff]),1:nlevels(out[,ff])),ncol=2)
    out[,ff] <- as.numeric(out[,ff])
  }

  ds <- paste(filename, ".dat", sep="")
  out[is.na(out)] <- na.value
  write.table(out, file=ds, sep=sep, dec=dec, col.names=T, row.names=F, quote=F)

  # generate syntax
  if(syntax){

    sf <- paste(filename, ".sps", sep="")
    if(dec==".") d <- "DOT" else d <- "COMMA"
    cat(file=sf,"SET DECIMAL", d, ".\n")
    if(!is.null(locale)) cat(file=sf, "SET LOCALE",locale,".\n", append=T)
    cat(file=sf, "\n", append=T)

    cat(file=sf, append=T,
      "GET DATA\n",
      "/TYPE=TXT\n",
      paste("/FILE=\"",ds,"\"\n",sep=""),
      "/DELCASE=LINE\n",
      paste("/DELIMITERS=\"",sub("\t","\\\\t",sep),"\"\n",sep=""),
      "/ARRANGEMENT=DELIMITED\n",
      "/FIRSTCASE=2\n",
      "/IMPORTCASE=ALL\n",
      "/VARIABLES="
    )

    # class specific format
    width <- sapply(as.matrix(out)[1,], nchar, type="width")
    width[chr] <- sapply(out[,chr,drop=FALSE], function(z) max(nchar(z,type="width")))
    fmt <- data.frame(v=colnames(out),f=character(ncol(out)),stringsAsFactors=F)
    fmt[num|fac|ord,"f"] <- paste("F",width[num|fac|ord]+3,".2",sep="")
    fmt[chr,"f"] <- paste("A",width[chr],sep="")
    fmt[num,"l"] <- "SCALE"
    fmt[fac|chr,"l"] <- "NOMINAL"
    fmt[ord,"l"] <- "ORDINAL"
    fmt[1,"l"] <- "NOMINAL"

    cat(file=sf, "\n ", append=T)
    cat(file=sf, paste(fmt$v,fmt$f, collapse="\n "), ".\n\n", append=T)
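    # (the GET DATA block written above lists each variable with its format;
    #  the statements below finish the syntax file by naming the data set,
    #  adding value labels for converted factors, and declaring the
    #  missing-value code)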
cat(file=sf, append=T, sep="", "CACHE .\n", "EXECUTE .\n", "DATASET NAME panImpute1 WINDOW=FRONT .\n\n" ) # value labels cat(file=sf, "VALUE LABELS", append=T) for(cc in 1:length(conv)){ cat(file=sf, "\n", paste("/",names(conv)[cc],sep=""), append=T) for(rr in 1:nrow(conv[[cc]])){ cat(file=sf, "\n", conv[[cc]][rr,2], paste("\'",conv[[cc]][rr,1],"\'",sep=""), append=T) } } cat(file=sf, " .\n\n", append=T) # missing values cat(file=sf, append=T, "MISSING VALUES\n", paste(fmt$v[num|fac|ord], collapse=" "), paste("(",na.value,")",sep=""),"\n", paste(fmt$v[chr], collapse=" "), paste("(\"",na.value,"\")",sep=""), ".\n" ) } invisible() } mitml/R/internal-model.R0000644000176200001440000001724613321120501014630 0ustar liggesusers# prepare model input by formula .model.byFormula <- function(data, formula, group, group.original, method=c("pan","jomo","jomo.matrix")){ # check model, separate equations formula <- .check.model(formula) isML <- attr(formula,"is.ML") isL2 <- attr(formula,"is.L2") if(isL2){ formula.L2 <- formula[[2]] formula <- formula[[1]] } method <- match.arg(method) # *** evaluate L1 model # ft <- terms(formula) tl <- attr(ft,"term.labels") vrs <- attr(ft,"variables")[-1] nms <- colnames(data) # responses yvrs <- as.character(vrs)[attr(ft,"response")] yvrs <- gsub("[\r\n]","",yvrs) y.fml <- as.formula(paste0("~",yvrs)) yvrs <- attr(terms(y.fml), "term.labels") # check for untransformed yvrs err <- !(yvrs %in% nms) if(any(err)) stop("Could not find: ", paste0(yvrs[err],collapse=", "), "). Target variables must be contained in the data set 'as is', and transformations must be applied beforehand.") # cluster id clt <- tl[grep("\\|",tl)] if(method=="pan" & !isML) stop("Cluster indicator not found in formula\n\n", .formula2char(formula), "\n\nPlease specify the cluster indicator and at least one random term using the '|' operator. 
Single-level imputation is supported by jomoImpute().") # extract and reorder if(isML){ clt <- strsplit( clt, split="[[:blank:]]*\\|[[:blank:]]*" )[[1]] clname <- clt[2] # order data and grouping data <- data[ order(group,data[,clname]), ] group.original <- group.original[ order(group) ] group <- group[ order(group) ] }else{ clname <- NULL } # predictors: fixed pvrs <- c(if(attr(ft,"intercept")){"(Intercept)"}, tl[!grepl("\\|",tl)]) fe.fml <- c(if(attr(ft,"intercept")){"1"}else{"0"}, tl[!grepl("\\|",tl)]) fe.fml <- as.formula(paste0("~", paste0(fe.fml,collapse="+"))) # predictors: random if(isML){ cl.fml <- as.formula(paste("~",clt[1])) cl.ft <- terms(cl.fml) qvrs <- c(if(attr(cl.ft,"intercept")){"(Intercept)"}, attr(cl.ft,"term.labels")) }else{ cl.fml <- ~0 qvrs <- NULL } # model matrix for fe and cl attr(data,"na.action") <- identity mmp <- suppressWarnings( model.matrix(fe.fml, data=data) ) mmq <- suppressWarnings( model.matrix(cl.fml, data=data) ) pnames <- colnames(mmp) qnames <- colnames(mmq) psave <- setdiff( c(pnames,qnames), c("(Intercept)",nms) ) switch( method , # panImpute (matrix input) pan={ y <- data.matrix(data[yvrs]) ycat <- NULL }, # jomoImpute, for higher-level functions (data frames, uses jomo for preprocessing) jomo={ y <- data[yvrs] ycat <- NULL }, # jomoImpute, for higher- and lower-level versions (preprocessed matrix input) jomo.matrix={ y <- data.matrix(data[yvrs]) cvrs <- sapply(data[,yvrs,drop=F], is.factor) ycat <- y[,cvrs,drop=F] y <- y[,!cvrs,drop=F] } ) clus <- if(isML) data[,clname] else NULL pred <- cbind(mmp, mmq[,!(qnames%in%pnames),drop=F]) xcol <- which(colnames(pred)%in%pnames) zcol <- which(colnames(pred)%in%qnames) # assign to parent.frame inp <- list( y=y, ycat=ycat, clus=clus, pred=pred, xcol=xcol, zcol=zcol, data=data, group=group, group.original=group.original, psave=psave, clname=clname, yvrs=yvrs, pvrs=pvrs, qvrs=qvrs, pnames=pnames, qnames=qnames ) for(i in names(inp)) assign(i, inp[[i]], pos=parent.frame()) # *** evaluate L2 model # if(isL2){ ft <- terms(formula.L2) tl <- attr(ft,"term.labels") vrs <- attr(ft,"variables")[-1] # responses yvrs <- as.character(vrs)[attr(ft,"response")] yvrs <- gsub("[\r\n]","",yvrs) y.fml <- as.formula(paste0("~",yvrs)) yvrs <- attr(terms(y.fml), "term.labels") # check for untransformed yvrs err <- !(yvrs %in% nms) if(any(err)) stop("Could not find: ", paste0(yvrs[err],collapse=", "), "). 
Target variables must be contained in the data set 'as is', and transformations must be applied beforehand.") # predictors: fixed only at L2 pvrs <- c(if(attr(ft,"intercept")){"(Intercept)"}, tl[!grepl("\\|",tl)]) fe.fml <- c(if(attr(ft,"intercept")){"1"}else{"0"}, tl[!grepl("\\|",tl)]) fe.fml <- as.formula(paste0("~", paste0(fe.fml,collapse="+"))) # model matrix for FE only attr(data,"na.action") <- identity mmp <- suppressWarnings( model.matrix(fe.fml, data=data) ) pnames <- colnames(mmp) psave <- c( psave, setdiff( c(pnames), c("(Intercept)",nms) ) ) switch( method , jomo={ # jomoImpute, for higher-level functions (data input) y <- data[yvrs] ycat <- NULL }, jomo.matrix={ # jomoImpute, for lower-level versions (matrix input) y <- data.matrix(data[yvrs]) cvrs <- sapply(data[,yvrs,drop=F], is.factor) ycat <- y[,cvrs,drop=F] y <- y[,!cvrs,drop=F] } ) pred <- mmp xcol <- which(colnames(pred) %in% pnames) # assign to parent.frame inp <- list( y.L2=y, ycat.L2=ycat, pred.L2=pred, xcol.L2=xcol, yvrs.L2=yvrs, pvrs.L2=pvrs, pnames.L2=pnames, psave=psave ) for(i in names(inp)) assign(i, inp[[i]], pos=parent.frame()) } invisible(NULL) } # convert formula to character .formula2char <- function(x){ chr <- as.character(x) paste(chr[c(3,1,2)]) } .check.model <- function(x){ # check model type and number of levels xnew <- x # ensure proper list format if(is.list(x) & length(x) > 2) stop("Cannot determine the number of levels. The 'formula' or 'type' argument must indicate either a single-level model, a model for responses at level 1, or two models for responses at level 1 and 2.") if(!is.list(x)) x <- list(x) # check cluster specification and model type clt <- lapply(x, function(z){ if(is.language(z)){ tl <- attr(terms(z), "term.labels") tl[grep("\\|",tl)] }else{ which(z==-2) } }) isML <- length(clt[[1]]) > 0 isL2 <- length(x) == 2 if(isL2 & !isML) stop("No cluster variable found. 
Imputation models for responses at level 1 and 2 require the specification of a cluster variable in the level-1 equation.") attr(xnew,"is.ML") <- isML attr(xnew,"is.L2") <- isL2 xnew } .check.variablesL2 <- function(x,clus){ # check for variables at L2 (constant at L1) apply(x, 2, function(a) all( abs(a-clusterMeans(a,clus)) < sqrt(.Machine$double.eps), na.rm=T)) } # convert type to formula .type2formula <- function(data, type){ # L2: separate model equations type <- .check.model(type) isML <- attr(type,"is.ML") isL2 <- attr(type,"is.L2") if(isL2){ type.L2 <- type[[2]] type <- type[[1]] } nms <- colnames(data) # grouping grp <- if(any(type==-1)) nms[type==-1] else NULL if(isL2 & is.null(grp)){ if(any(type.L2==-1)) grp <- nms[type.L2==-1] } # L1 model if(ncol(data)!=length(type)) stop("Length of 'type' must be equal to the number of colums in 'data'.") if(sum(type==-2)>1) stop("Only one cluster indicator may be specified.") cls <- nms[type==-2] yvrs <- paste( nms[type==1], collapse="+" ) pvrs <- paste( c(1,nms[type%in%c(2,3)]), collapse="+" ) qvrs <- if(isML) paste( c(1,nms[type==3]), collapse="+" ) else NULL # build L1 formula cls.fml <- if(isML) paste("+ (", qvrs, "|", cls, ")") else NULL fml <- formula( paste(yvrs, "~", pvrs, cls.fml) ) # L2 model if(isL2){ if(ncol(data)!=length(type.L2)) stop("Length of 'type' must be equal to the number of colums in 'data'.") yvrs <- paste( nms[type.L2==1], collapse="+" ) pvrs <- paste( c(1,nms[type.L2%in%c(2,3)]), collapse="+" ) # build formula (make list) fml <- list( fml, formula( paste(yvrs, "~", pvrs) ) ) } attr(fml,"group") <- grp attr(fml,"is.ML") <- isML attr(fml,"is.L2") <- isL2 return(fml) } mitml/R/mitmlComplete.R0000644000176200001440000000335213321400743014533 0ustar liggesusersmitmlComplete <- function(x, print="all", force.list=FALSE){ if(sum(print<=0)>1) stop("Only one negative or zero value is allowed in 'print'.") dat <- x$data srt <- order( attr(x$data,"sort") ) labs <- attr(x$data,"labels") method <- class(x)[2] m <- x$iter$m ind <- x$index.mat rpm <- x$replacement.mat if(class(print)%in%c("integer","numeric")){ if(length(print)==1){ if(print>0){ com <- .completeOne(dat,print,ind,rpm,method) out <- com[srt,] }else{ out <- .stripDataAttributes(dat[srt,]) } if(force.list) out <- list(out) }else{ out <- list() for(ii in print){ if(ii>0){ com <- .completeOne(dat,ii,ind,rpm,method) out <- c(out,list(com[srt,])) }else{ out <- c(out,list(.stripDataAttributes(dat[srt,]))) } } } }else{ if(!print%in%c("list","all")) stop("Invalid 'print' argument.") out <- list() for(ii in 1:m){ com <- .completeOne(dat,ii,ind,rpm,method) out <- c(out,list(com[srt,])) } } if(class(out)=="list") class(out) <- c("mitml.list","list") out } .completeOne <- function(x,i,ind,rpm,method){ if(method=="jomo"){ fac <- which(colnames(x) %in% names(attr(x,"labels"))) nofac <- !(ind[,2] %in% fac) if(any(nofac)) x[ ind[nofac,,drop=F] ] <- rpm[nofac,i,drop=F] for(ff in fac){ fi <- which(ind[,2]==ff) lev <- attr(x,"labels")[[colnames(x)[ff]]] if(length(fi)>0) x[ ind[fi,,drop=F] ] <- lev[rpm[fi,i]] } }else{ x[ind] <- rpm[,i] } .stripDataAttributes(x) } .stripDataAttributes <- function(x){ attr(x,"sort") <- NULL attr(x,"group") <- NULL attr(x,"levels") <- NULL attr(x,"labels") <- NULL x } mitml/R/write.mitmlMplus.R0000644000176200001440000000375012765745751015244 0ustar liggesuserswrite.mitmlMplus <- function(x, filename, suffix="list", sep="\t", dec=".", na.value=-999){ # write text files that can be read into Mplus if(!"mitml"%in%class(x) & !"mitml.list"%in%class(x)) 
stop("'x' must be of class 'mitml' or 'mitml.list'.") if("mitml"%in%class(x)){ il <- mitmlComplete(x,1:x$iter$m) m <- x$iter$m }else{ il <- x m <- length(x) } if(!"list"%in%class(il)) il <- list(il) dnames <- paste(filename, 1:m, ".dat", sep="") lname <- paste(filename, suffix, ".dat", sep="") write.table(dnames, file=lname, col.names=FALSE, row.names=FALSE, quote=FALSE) for(ii in 1:length(il)){ out <- il[[ii]] # convert factors notnum <- which(sapply(out, function(z) !is.numeric(z))) conv <- as.list(notnum) for(nn in notnum){ out[,nn] <- as.factor(out[,nn]) conv[[colnames(out)[nn]]] <- matrix(c(levels(out[,nn]),1:nlevels(out[,nn])),ncol=2) out[,nn] <- as.numeric(out[,nn]) } # write out[is.na(out)] <- na.value write.table(out, file=dnames[ii], sep=sep, dec=dec, col.names=F, row.names=F, quote=FALSE) } # log file cname <- paste(filename,".log",sep="") cat(file=cname, "The data set featured the following variables:") cat(file=cname, "\n\n", paste(colnames(out),collapse=" "), sep="", append=T) if(length(conv)>0){ cat(file=cname, "\n\n", "Factors were converted to numeric values as follows:\n ",sep="", append=T) for(cc in 1:length(conv)){ cat(file=cname, "\n", names(conv[cc]), ":\n", sep="", append=T) write.table(conv[[cc]],file=cname, row.names=F,col.names=F, sep=" = ", quote=F, append=T) } } # input file iname <- paste(filename,".inp",sep="") cat(file=iname, sep="", "TITLE:\n This Mplus input file for multiply imputed data sets was generated by mitml in R.\n", "DATA:\n file = ",lname,";\n", " type = imputation;\n", "VARIABLE:\n names = ",paste(colnames(out),collapse=" "),";\n", " missing = all (",na.value,");" ) invisible() } mitml/R/plot.mitml.R0000644000176200001440000005517213321120501014015 0ustar liggesusersplot.mitml <- function(x, print=c("beta","beta2","psi","sigma"), pos=NULL, group="all", trace=c("imputation","burnin","all"), thin=1, smooth=3, n.Rhat=3, export=c("none","png","pdf"), dev.args=list(), ...){ # plot method for objects of class "mitml" # retrieve data and variable names (predictors) vrs <- x$model clus <- x$model$clus pvrs <- seq_along(attr(vrs,"full.names")$pvrs) qvrs <- seq_along(attr(vrs,"full.names")$qvrs) names(pvrs) <- attr(vrs,"full.names")$pvrs names(qvrs) <- attr(vrs,"full.names")$qvrs isML <- attr(x$model,"is.ML") isL2 <- attr(x$model,"is.L2") if(isL2){ pvrs.L2 <- seq_along(attr(vrs,"full.names")$pvrs.L2) names(pvrs.L2) <- attr(vrs,"full.names")$pvrs.L2 } # match arguments print <- match.arg(print,several.ok=TRUE) trace <- match.arg(trace) export <- match.arg(export) # check for random L1 rl1 <- x$random.L1=="full" # parameter chains (for backwards compatibility) kc <- x$keep.chains if(is.null(kc)) kc <- "full" # check print and position for selected parameters if(!is.null(pos) & length(print)>1){ pos <- NULL warning("The 'pos' argument may only be used when 'print' is cleary defined as one of 'beta', 'beta2', 'psi', or 'sigma' (see '?plot').") } # grouping grp.labels <- unique(attr(x$data,"group")) if(class(group)=="numeric") grp.labels <- grp.labels[group] grp <- length(grp.labels) # export, graphical parameters if(export!="none"){ wd <- getwd() out <- file.path(wd,"mitmlPlots") if(!file.exists(out)) dir.create(out) }else{ do.call(dev.new,dev.args) devAskNewPage(ask=FALSE) } oldpar <- par(no.readonly=TRUE) # *** # start plotting # for(gg in 1:grp){ # grouping if(grp>1){ glab <- paste(",Group:",grp.labels[gg],sep="") gfile <- paste("Group-",grp.labels[gg],"_",sep="") }else{ glab <- gfile <- "" } # expand targets for multiple categories yvrs <- vrs$yvrs 
    yvrs.L2 <- vrs$yvrs.L2

    # ... level 1
    cvrs <- intersect(yvrs,attr(x$data,"cvrs"))
    nc <- length(cvrs)
    if(length(cvrs)>=1){
      yvrs <- c(yvrs[!yvrs%in%cvrs], cvrs)
      for(cc in 1:nc){
        cv <- cvrs[cc]
        ci <- which(yvrs==cv)
        yi <- 1:length(yvrs)
        nlev <- attr(x$data,"levels")[gg,cc]
        if(nlev>2){
          newy <- paste0(cv, 1:(nlev-1))
        }else{
          newy <- cv
        }
        sel0 <- yi[yi<ci]
        sel1 <- yi[yi>ci]
        yvrs <- c(yvrs[sel0],newy,yvrs[sel1])
      }
    }
    ynam <- yvrs
    yvrs <- seq_along(yvrs)
    names(yvrs) <- ynam

    # ... level 2
    if(isL2){
      cvrs.L2 <- intersect(yvrs.L2,attr(x$data,"cvrs"))
      nc.L2 <- length(cvrs.L2)
      if(length(cvrs.L2)>=1){
        yvrs.L2 <- c(yvrs.L2[!yvrs.L2%in%cvrs.L2], cvrs.L2)
        for(cc in 1:nc.L2){
          cv <- cvrs.L2[cc]
          ci <- which(yvrs.L2==cv)
          yi <- 1:length(yvrs.L2)
          nlev <- attr(x$data,"levels")[gg,nc+cc]
          if(nlev>2){
            newy <- paste0(cv, 1:(nlev-1))
          }else{
            newy <- cv
          }
          sel0 <- yi[yi<ci]
          sel1 <- yi[yi>ci]
          yvrs.L2 <- c(yvrs.L2[sel0],newy,yvrs.L2[sel1])
        }
      }
      ynam <- yvrs.L2
      yvrs.L2 <- seq_along(yvrs.L2)
      names(yvrs.L2) <- ynam
    }

    # number of iterations
    n <- dim(x$par.burnin[["beta"]])[3]+dim(x$par.imputation[["beta"]])[3]
    nb <- dim(x$par.burnin[["beta"]])[3]
    ni <- dim(x$par.imputation[["beta"]])[3]
    niter <- x$iter[["iter"]]

    # thinned-sample indicators
    s <- seq.int(thin,n,by=thin)
    sb <- seq.int(thin,nb,by=thin)
    si <- seq.int(thin,ni,by=thin)
    lag <- ceiling(niter/thin)

    # *** plots for fixed regression coefficients at level 1
    #

    if("beta" %in% print){

    # check if pos is badly defined
    if(!is.null(pos)){
      if(pos[1] > max(pvrs) | pos[1] < min(pvrs) | pos[2] > max(yvrs) | pos[2] < min(yvrs)){
        .restoreDevice(oldpar,export,close=TRUE)
        stop("There is no entry [",pos[1],",",pos[2],"] in 'beta'.")
      }
    }

    for(ic in yvrs){
    for(ir in pvrs){

      # skip if individual parameters requested
      if(!is.null(pos)){ if(!(pos[1]==ir & pos[2]==ic)) next }

      if(export!="none"){
        filename <- paste("BETA_",gfile,names(yvrs[ic]),"_ON_",names(pvrs[ir]),".",export,sep="")
        filename <- gsub("[(),]","",filename)
        filename <- gsub("[[:space:]]","-",filename)
        out.args <- c(list(file=file.path(out,filename)),dev.args)
        do.call(export, out.args)
      }
      layout(matrix(c(1,2,3,4),2,2), c(5,1), c(1.13,1))

      # choose section of trace
      switch(trace,
        imputation={ trc <- x$par.imputation[["beta"]][ir,ic,,gg][si] },
        burnin={ trc <- x$par.burnin[["beta"]][ir,ic,,gg][sb] },
        all={ trc <- c(x$par.burnin[["beta"]][ir,ic,,gg],
                       x$par.imputation[["beta"]][ir,ic,,gg])[s] }
      )

      # trace plot
      par(mar=c(3,3,2,0)+0.5, mgp=c(2,1,0), font.lab=2)
      ymin <- min(trc)
      ymax <- max(trc)
      yr <- ymax-ymin
      plot(trc, type=ifelse(trace=="all","n","l"), ylab="Trace", xlab="Iteration",
           xaxt="n", ylim=c(ymin-yr*.03, ymax+yr*.03), ...)
      if(trace=="all"){
        lines(which(s<=nb), trc[s<=nb], col="grey75")
        lines(which(s>=nb), trc[s>=nb], col="black")
      }
      axt <- axTicks(1)
      title(main=paste("Beta [",ir,",",ic,glab,"]: ",names(yvrs[ic])," ON ",names(pvrs[ir]),sep=""), cex.main=1)
      if(trace=="imputation"){
        axl <- sprintf("%d", thin*(axt+nb))
      }else{
        axl <- sprintf("%d", thin*axt)
      }
      axis(side=1, at=axt, labels=axl)

      # trend line for trace (moving window average)
      if(all(is.numeric(smooth),smooth>0)){
        B <- floor(niter/(smooth*thin))
        mwa <- .movingAverage(trc,B,fill=TRUE)
        lines(mwa, col="grey60")
      }

      # blue line
      if(trace=="all") abline(v=ceiling(nb/thin), col="blue")

      # further plots
      if(trace=="burnin"){
        drw <- x$par.burnin[["beta"]][ir,ic,sb,gg]
      }else{
        drw <- x$par.imputation[["beta"]][ir,ic,si,gg]
      }

      # autocorrelation plot
      par(mar=c(3,3,1,0)+0.5)
      ac <- acf(drw, lag.max=lag+2, plot=F)
      plot(ac[1:lag], ylim=c(-.1,1), yaxt="n", main=NULL, ylab="ACF", ci=0, ...)
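      # NOTE (illustration): the ACF window above is tied to the number of
      # iterations between imputations (lag <- ceiling(niter/thin)); with,
      # e.g., n.iter = 100 and thin = 1, the plot shows autocorrelations up
      # to lag 100, i.e., whether draws that are one imputation apart are
      # approximately independent. A minimal stand-alone sketch of the same
      # computation (values are hypothetical):
      #
      #   ac <- acf(drw, lag.max = 102, plot = FALSE)  # lag + 2, as above
      #   ac$acf[101]                                  # lag-k value (k = 100)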
axis(side=2, at=c(0,.5,1)) abline(h=c(-.1,.1), col="blue") # kernel density plot par(mar=c(3,0,2,0)+0.5, mgp=c(2,0,0)) ddrw <- density(drw) plot(x=ddrw$y, y=ddrw$x, type="l", xaxt="n", yaxt="n", xlab="", ylab="", ylim=c(ymin-yr*.03, ymax+yr*.03), ...) # posterior summary par(mar=c(1,-0.5,0,-0.5)+0.5) plot.new() text(0,0.5,paste("EAP: ", sprintf(fmt="%.3f", mean(drw)), "\n", "MAP: ", sprintf(fmt="%.3f", ddrw$x[which.max(ddrw$y)]), "\n", "SD: ", sprintf(fmt="%.3f", sd(drw)), "\n", "2.5%: ", sprintf(fmt="%.3f", quantile(drw,.025)), "\n", "97.5%: ", sprintf(fmt="%.3f", quantile(drw,.975)), "\n", "Rhat: ", sprintf(fmt="%.3f", .GelmanRubin(t(drw),n.Rhat)), "\n", "ACF-k: ", sprintf(fmt="%.3f", .smoothedACF(ac,k=lag,sd=.5)), "\n", sep=""), adj=c(0,.5), cex=.8, family="mono", font=2, ...) if(export!="none"){ dev.off() }else{ devAskNewPage(ask=TRUE) } }}} # *** plots for fixed regression coefficients at level 2 # if(isL2 & "beta2" %in% print){ # check if pos is badly defined if(!is.null(pos)){ if(pos[1] > max(pvrs.L2) | pos[1] < min(pvrs.L2) | pos[2] > max(yvrs.L2) | pos[2] < min(yvrs.L2)){ .restoreDevice(oldpar,export,close=TRUE) stop("There is no entry [",pos[1],",",pos[2],"] in 'beta2'.") } } for(ic in yvrs.L2){ for(ir in pvrs.L2){ # skip if individual parameters requested if(!is.null(pos)){ if(!(pos[1]==ir & pos[2]==ic)) next } if(export!="none"){ filename <- paste("BETA2_",gfile,names(yvrs.L2[ic]),"_ON_",names(pvrs.L2[ir]),".",export,sep="") filename <- gsub("[(),]","",filename) filename <- gsub("[[:space:]]","-",filename) out.args <- c(list(file=file.path(out,filename)),dev.args) do.call(export, out.args) } layout(matrix(c(1,2,3,4),2,2), c(5,1), c(1.13,1)) # choose section of trace switch(trace, imputation={ trc <- x$par.imputation[["beta2"]][ir,ic,,gg][si] }, burnin={ trc <- x$par.burnin[["beta2"]][ir,ic,,gg][sb] }, all={ trc <- c(x$par.burnin[["beta2"]][ir,ic,,gg], x$par.imputation[["beta2"]][ir,ic,,gg])[s] } ) # trace plot par(mar=c(3,3,2,0)+0.5, mgp=c(2,1,0), font.lab=2) ymin <- min(trc) ymax <- max(trc) yr <- ymax-ymin plot(trc, type=ifelse(trace=="all","n","l"), ylab="Trace", xlab="Iteration", xaxt="n", ylim=c(ymin-yr*.03, ymax+yr*.03), ...) if(trace=="all"){ lines(which(s<=nb), trc[s<=nb], col="grey75") lines(which(s>=nb), trc[s>=nb], col="black") } axt <- axTicks(1) title(main=paste("Beta2 [",ir,",",ic,glab,"]: ",names(yvrs.L2[ic])," ON ",names(pvrs.L2[ir]),sep=""), cex.main=1) if(trace=="imputation"){ axl <- sprintf("%d", thin*(axt+nb)) }else{ axl <- sprintf("%d", thin*axt) } axis(side=1, at=axt, labels=axl) # trend line for trace (moving window average) if(all(is.numeric(smooth),smooth>0)){ B <- floor(niter/(smooth*thin)) mwa <- .movingAverage(trc,B,fill=TRUE) lines(mwa, col="grey60") } # blue line if(trace=="all") abline(v=ceiling(nb/thin), col="blue") # further plots if(trace=="burnin"){ drw <- x$par.burnin[["beta2"]][ir,ic,sb,gg] }else{ drw <- x$par.imputation[["beta2"]][ir,ic,si,gg] } # autocorrelation plot par(mar=c(3,3,1,0)+0.5) ac <- acf(drw, lag.max=lag+2, plot=F) plot(ac[1:lag], ylim=c(-.1,1), yaxt="n", main=NULL, ylab="ACF", ci=0, ...) axis(side=2, at=c(0,.5,1)) abline(h=c(-.1,.1), col="blue") # kernel density plot par(mar=c(3,0,2,0)+0.5, mgp=c(2,0,0)) ddrw <- density(drw) plot(x=ddrw$y, y=ddrw$x, type="l", xaxt="n", yaxt="n", xlab="", ylab="", ylim=c(ymin-yr*.03, ymax+yr*.03), ...) 
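      # NOTE (illustration): the posterior summary printed next takes the EAP
      # as the mean of the retained draws and the MAP as the mode of the kernel
      # density estimate computed above, i.e., roughly:
      #
      #   eap <- mean(drw)
      #   map <- ddrw$x[which.max(ddrw$y)]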
# posterior summary par(mar=c(1,-0.5,0,-0.5)+0.5) plot.new() text(0,0.5,paste("EAP: ", sprintf(fmt="%.3f", mean(drw)), "\n", "MAP: ", sprintf(fmt="%.3f", ddrw$x[which.max(ddrw$y)]), "\n", "SD: ", sprintf(fmt="%.3f", sd(drw)), "\n", "2.5%: ", sprintf(fmt="%.3f", quantile(drw,.025)), "\n", "97.5%: ", sprintf(fmt="%.3f", quantile(drw,.975)), "\n", "Rhat: ", sprintf(fmt="%.3f", .GelmanRubin(t(drw),n.Rhat)), "\n", "ACF-k: ", sprintf(fmt="%.3f", .smoothedACF(ac,k=lag,sd=.5)), "\n", sep=""), adj=c(0,.5), cex=.8, family="mono", font=2, ...) if(export!="none"){ dev.off() }else{ devAskNewPage(ask=TRUE) } }}} # *** plots for random effects' variance components # if(isML & "psi" %in% print){ # joint set of variables at level 1 and 2 yvrs.comb <- c(yvrs, if(isL2) yvrs.L2+length(yvrs)) # index matrix bvec <- t(expand.grid(qvrs, yvrs)) if(isL2) bvec <- cbind(bvec, t(expand.grid(1,yvrs.L2+length(yvrs)))) # attempt to fix pos if badly defined if(!is.null(pos)){ pos0 <- pos if(pos[2]>pos[1]){ # fix if pos is redundant/transposed pos[1] <- pos0[2] pos[2] <- pos0[1] } if(any(pos0 > max(yvrs.comb)) | any(pos0 < min(yvrs.comb))){ .restoreDevice(oldpar,export,close=TRUE) stop("There is no entry [",pos0[1],",",pos0[2],"] in 'psi'.") } if(!identical(pos,pos0)) warning("Could not use entry [",pos0[1],",",pos0[2],"] in 'psi'. Used [",pos[1],",",pos[2],"] instead.") } dpsi <- length(yvrs)*length(qvrs) if(isL2) dpsi <- dpsi+length(yvrs.L2) # if only "diagonal" entries, fix max. column index to 1 cpsi <- if(kc=="diagonal") 1 else dpsi for(ic in 1:cpsi){ for(ir in ic:dpsi){ # skip if different individual parameters requested if(!is.null(pos)){ if(!(pos[1]==ir & pos[2]==ic)) next } # if only "diagonal" entries, use ir for all labels ic2 <- if(kc=="diagonal") ir else ic # check for residual at L2 icL2 <- ic > (length(yvrs)*length(qvrs)) irL2 <- ir > (length(yvrs)*length(qvrs)) if(export!="none"){ filename <- paste0("PSI_", gfile, names(yvrs.comb[bvec[2,ir]]), if(!irL2) paste0("_ON_", names(qvrs[bvec[1,ir]])), "_WITH_", names(yvrs.comb[bvec[2,ic2]]), if(!icL2) paste0("_ON_", names(qvrs[bvec[1,ic2]])), ".", export) filename <- gsub("[(),]","",filename) filename <- gsub("[[:space:]]","-",filename) out.args <- c(list(file=file.path(out,filename)),dev.args) do.call(export, out.args) } layout(matrix(c(1,2,3,4),2,2), c(5,1), c(1.13,1)) switch(trace, imputation={ trc <- x$par.imputation[["psi"]][ir,ic,,gg][si] }, burnin={ trc <- x$par.burnin[["psi"]][ir,ic,,gg][sb] }, all={ trc <- c(x$par.burnin[["psi"]][ir,ic,,gg], x$par.imputation[["psi"]][ir,ic,,gg])[s] } ) # trace plot par(mar=c(3,3,2,0)+0.5, mgp=c(2,1,0), font.lab=2) ymin <- min(trc) ymax <- max(trc) yr <- ymax-ymin plot(trc, type=ifelse(trace=="all","n","l"), ylab="Trace", xlab="Iteration", xaxt="n", ylim=c(ymin-yr*.03, ymax+yr*.03), ...) 
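      # NOTE (illustration): the grey trend line added below is a centered
      # moving-window average over 2*B + 1 draws, with B = floor(niter/(smooth*thin)).
      # With the defaults smooth = 3 and thin = 1 and niter = 100, B = 33. A rough
      # equivalent (without the endpoint filling done by .movingAverage) would be:
      #
      #   B <- floor(100/(3*1))
      #   mwa <- stats::filter(trc, rep(1/(2*B + 1), 2*B + 1), sides = 2)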
if(trace=="all"){ lines(which(s<=nb), trc[s<=nb], col="grey75") lines(which(s>=nb), trc[s>=nb], col="black") } title(main=paste0("Psi [",ir,",",ic,glab,"]: ", if(!irL2) "(", names(yvrs.comb[bvec[2,ir]]), if(!irL2) paste0(" ON ", names(qvrs[bvec[1,ir]]), ")"), " WITH ", if(!icL2) "(", names(yvrs.comb[bvec[2,ic2]]), if(!icL2) paste0(" ON ", names(qvrs[bvec[1,ic2]]), ")") ), cex.main=1) axt <- axTicks(1) if(trace=="imputation"){ axl <- sprintf("%d", thin*(axt+nb)) }else{ axl <- sprintf("%d", thin*axt) } axis(side=1, at=axt, labels=axl) # trend line for trace (moving window average) if(all(is.numeric(smooth),smooth>0)){ B <- floor(niter/(smooth*thin)) mwa <- .movingAverage(trc,B,fill=TRUE) lines(mwa, col="grey60") } # blue line if(trace=="all") abline(v=ceiling(nb/thin), col="blue") # further plots if(trace=="burnin"){ drw <- x$par.burnin[["psi"]][ir,ic,sb,gg] }else{ drw <- x$par.imputation[["psi"]][ir,ic,si,gg] } # autocorrelation plot par(mar=c(3,3,1,0)+0.5) ac <- acf(drw, lag.max=lag+2, plot=F) plot(ac[1:lag], ylim=c(-.1,1), yaxt="n", main=NULL, ylab="ACF", ci=0, ...) axis(side=2, at=c(0,.5,1)) abline(h=c(-.1,.1), col="blue") # kernel density plot par(mar=c(3,0,2,0)+0.5, mgp=c(2,0,0)) ddrw <- density(drw) plot(x=ddrw$y, y=ddrw$x, type="l", xaxt="n", yaxt="n", xlab="", ylab="", ylim=c(ymin-yr*.03, ymax+yr*.03), ...) # posterior summary par(mar=c(1,-0.5,0,-0.5)+0.5) plot.new() text(0,0.5,paste("EAP: ", sprintf(fmt="%.3f", mean(drw)), "\n", "MAP: ", sprintf(fmt="%.3f", ddrw$x[which.max(ddrw$y)]), "\n", "SD: ", sprintf(fmt="%.3f", sd(drw)), "\n", "2.5%: ", sprintf(fmt="%.3f", quantile(drw,.025)), "\n", "97.5%: ", sprintf(fmt="%.3f", quantile(drw,.975)), "\n", "Rhat: ", sprintf(fmt="%.3f", .GelmanRubin(t(drw),n.Rhat)), "\n", "ACF-k: ", sprintf(fmt="%.3f", .smoothedACF(ac,k=lag,sd=.5)), "\n", sep=""), adj=c(0,.5), cex=.8, family="mono", font=2, ...) if(export!="none"){ dev.off() }else{ devAskNewPage(ask=TRUE) } }}} # *** plots for residual variance components # if("sigma" %in% print){ # cluster-specific covariance matrices stacked in rows gind <- attr(x$data,"group")==grp.labels[gg] clus2 <- unique(x$data[gind,clus]) clus3 <- if(rl1) seq_along(clus2) else 1 # attempt to fix pos if badly defined if(!is.null(pos)){ pos0 <- pos dims <- dim(x$par.imputation$sigma) if(pos[2] > length(yvrs)){ # fix if pos is transposed pos[1] <- pos0[2] pos[2] <- pos0[1] pos0 <- pos } if(pos[2] > ((pos[1]-1)%%length(yvrs))+1){ # fix if pos is redundant pos[1] <- pos0[1] - pos0[1]%%length(yvrs) + pos0[2] pos[2] <- pos0[1]%%length(yvrs) } if(all(pos0 > max(yvrs)) | any(pos0 < min(yvrs)) | max(pos0) > dims[1]){ .restoreDevice(oldpar,export,close=TRUE) stop("There is no entry [",pos0[1],",",pos0[2],"] in 'sigma'.") } if(!identical(pos,pos0)) warning("Could not use entry [",pos0[1],",",pos0[2],"] in 'sigma'. Used [",pos[1],",",pos[2],"] instead.") } # if only "diagonal" entries, fix max. 
column index to 1 csig <- if(kc=="diagonal") 1 else length(yvrs) for(icl in clus3){ for(ic in 1:csig){ for(ir in ic:length(yvrs)){ # adjust row index for cluster-specific covariance matrices ir2 <- ir+(icl-1)*length(yvrs) # if only "diagonal" entries, use ir for all labels ic2 <- if(kc=="diagonal") ir else ic # skip if individual parameters requested if(!is.null(pos)){ if(!(pos[1]==ir2 & pos[2]==ic)) next } if(export!="none"){ filename <- paste0("SIGMA_", gfile, names(yvrs[ir]), "_WITH_", names(yvrs[ic2]), if(rl1) paste0("_",clus,clus2[icl]), ".",export) filename <- gsub("[(),]","",filename) filename <- gsub("[[:space:]]","-",filename) out.args <- c(list(file=file.path(out,filename)),dev.args) do.call(export, out.args) } layout(matrix(c(1,2,3,4),2,2), c(5,1), c(1.13,1)) switch(trace, imputation={ trc <- x$par.imputation[["sigma"]][ir2,ic,,gg][si] }, burnin={ trc <- x$par.burnin[["sigma"]][ir2,ic,,gg][sb] }, all={ trc <- c(x$par.burnin[["sigma"]][ir2,ic,,gg], x$par.imputation[["sigma"]][ir2,ic,,gg])[s] } ) # trace plots par(mar=c(3,3,2,0)+0.5, mgp=c(2,1,0), font.lab=2) ymin <- min(trc) ymax <- max(trc) yr <- ymax-ymin plot(trc, type=ifelse(trace=="all","n","l"), ylab="Trace", xlab="Iteration", xaxt="n", ylim=c(ymin-yr*.03, ymax+yr*.03), ...) if(trace=="all"){ lines(which(s<=nb), trc[s<=nb], col="grey75") lines(which(s>=nb), trc[s>=nb], col="black") } title(main=paste0("Sigma [",ir2,",",ic,glab,"]: ", names(yvrs[ir]), " WITH ", names(yvrs[ic2]), if(rl1) paste0(" [",clus,":",clus2[icl],"]")), cex.main=1) axt <- axTicks(1) if(trace=="imputation"){ axl <- sprintf("%d", thin*(axt+nb)) }else{ axl <- sprintf("%d", thin*axt) } axis(side=1, at=axt, labels=axl) # trend line for trace (moving window average) if(all(is.numeric(smooth),smooth>0)){ B <- floor(niter/(smooth*thin)) mwa <- .movingAverage(trc,B,fill=TRUE) lines(mwa, col="grey60") } # blue line if(trace=="all") abline(v=ceiling(nb/thin), col="blue") # further plots if(trace=="burnin"){ drw <- x$par.burnin[["sigma"]][ir2,ic,sb,gg] }else{ drw <- x$par.imputation[["sigma"]][ir2,ic,si,gg] } # autocorrelation plot par(mar=c(3,3,1,0)+0.5) ac <- acf(drw, lag.max=lag+2, plot=F) plot(ac[1:lag], ylim=c(-.1,1), yaxt="n", main=NULL, ylab="ACF", ci=0, ...) axis(side=2, at=c(0,.5,1)) abline(h=c(-.1,.1), col="blue") # kernel density plot par(mar=c(3,0,2,0)+0.5, mgp=c(2,0,0)) ddrw <- density(drw) plot(x=ddrw$y, y=ddrw$x, type="l", xaxt="n", yaxt="n", xlab="", ylab="", ylim=c(ymin-yr*.03, ymax+yr*.03), ...) # posterior summary par(mar=c(1,-0.5,0,-0.5)+0.5) plot.new() text(0,0.5,paste("EAP: ", sprintf(fmt="%.3f", mean(drw)), "\n", "MAP: ", sprintf(fmt="%.3f", ddrw$x[which.max(ddrw$y)]), "\n", "SD: ", sprintf(fmt="%.3f", sd(drw)), "\n", "2.5%: ", sprintf(fmt="%.3f", quantile(drw,.025)), "\n", "97.5%: ", sprintf(fmt="%.3f", quantile(drw,.975)), "\n", "Rhat: ", sprintf(fmt="%.3f", .GelmanRubin(t(drw),n.Rhat)), "\n", "ACF-k: ", sprintf(fmt="%.3f", .smoothedACF(ac,k=lag,sd=.5)), "\n", sep=""), adj=c(0,.5), cex=.8, family="mono", font=2, ...) 
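      # NOTE (illustration): 'Rhat' above is the potential scale reduction
      # factor, obtained here by splitting the single chain of draws into
      # n.Rhat segments and comparing between- and within-segment variance;
      # values close to 1 (e.g., < 1.05) suggest convergence. A minimal
      # sketch, assuming 'segs' is a matrix with one segment of length n
      # per column:
      #
      #   W <- mean(apply(segs, 2, var))
      #   B <- n * var(colMeans(segs))
      #   Rhat <- sqrt(((n - 1)/n * W + B/n) / W)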
if(export!="none"){ dev.off() }else{ devAskNewPage(ask=TRUE) } }}}} } plot.new() par(oldpar) if(export=="none") devAskNewPage(ask=FALSE) dev.off() invisible() } # restore and shut down parameters upon error .restoreDevice <- function(pars, export, close=TRUE){ par(pars) if(export=="none") devAskNewPage(ask=FALSE) if(close) dev.off() invisible() } # moving window average for time series .movingAverage <- function(x, B, fill=TRUE){ x1 <- cumsum(x) N <- length(x) y <- rep(NA,N) i <- seq(B+1 , N-B) xdiff <- x1[ -seq(1,B) ] - x1[ -seq(N-B+1,N) ] xdiff <- xdiff[ - seq(1,B) ] y[i] <- ( x1[i] + xdiff - c(0,x1[ -seq(N-2*B,N) ]) ) / (2*B+1) # fill NAs at beginning and end of time series if(fill){ j <- seq(0,B-1) ybeg <- sapply(j, function(z) sum( x[ seq(1,(2*z+1)) ]) / (2*z+1) ) yend <- sapply(rev(j), function(z) sum( x[ seq(N-2*z,N) ] ) / (2*z+1) ) y[j+1] <- ybeg y[rev(N-j)] <- yend } y } # lag-k autocorrelation smoothed by values of a normal density .smoothedACF <- function(x, k, sd=.5){ x0 <- x$ac[-1,1,1] n <- length(x0) add <- n-k x0 <- x0[(k-add):n] # weights based on normal density w <- dnorm(-add:add, 0, sd) y <- sum( x0 * (w/sum(w)) ) y } mitml/R/mids2mitml.list.R0000644000176200001440000000053312522110736014753 0ustar liggesusersmids2mitml.list <- function(x){ # convert mids to mitml.list if(!requireNamespace("mice", quietly=TRUE)) stop("The 'mice' package must be installed in order to use this function.") m <- x$m out <- list() length(out) <- m for(ii in 1:m){ out[[ii]] <- mice::complete(x,action=ii) } class(out) <- c("mitml.list","list") out } mitml/R/testConstraints.R0000644000176200001440000001217613321120501015122 0ustar liggesuserstestConstraints <- function(model, qhat, uhat, constraints, method=c("D1","D2"), df.com=NULL){ # test constraints with multiply imputed data if(missing(model)==(missing(qhat)|missing(uhat))) stop("Either 'model' or both 'qhat' and 'uhat' must be supplied.") # match methods if(!missing(method) & length(method)>1) stop("Only one of 'D1' or 'D2' may be supplied as 'method'.") method <- match.arg(method) cons <- gsub("\\(Intercept\\)","Intercept",constraints) # warnings for ignored arguments if(!is.null(df.com) & method=="D2") warning("Setting complete-data degrees of freedom is only available for 'D1' and will be ignored with 'D2'.") if(!missing(qhat)){ coef.method <- "default" if(missing(uhat)) stop("Either 'model' or both 'qhat' and 'uhat' must be supplied.") if(length(dim(qhat))==3) qhat <- apply(qhat,3,identity) if(is.list(qhat)){ Qhat <- sapply(qhat, identity) Uhat <- sapply(uhat, identity, simplify="array") }else{ Qhat <- qhat Uhat <- uhat } if(is.null(dim(Qhat))){ dim(Qhat) <- c(1,length(qhat)) nms <- if(is.list(qhat)) names(qhat[[1]]) else if(is.matrix(qhat)) rownames(qhat) else names(qhat)[1] dimnames(Qhat) <- list(nms, NULL) } if(is.null(dim(Uhat))) dim(Uhat) <- c(1,1,length(qhat)) if(is.null(rownames(Qhat))){ nms <- !sapply(dimnames(Uhat),is.null) nms <- dimnames(Uhat)[[min(which(nms))]] rownames(Qhat) <- nms } m <- dim(Qhat)[2] p <- dim(Qhat)[1] k <- q <- length(cons) } if(!missing(model)){ if(!"list"%in%class(model)) stop("The 'model' argument must be a list of fitted statistical models.") # *** # select extraction methods cls <- class(model[[1]]) # default method coef.method <- "default" # merMod (lme4) if(any(grepl("merMod",cls)) & coef.method=="default"){ if(!requireNamespace("lme4", quietly=TRUE)) stop("The 'lme4' package must be installed to handle 'merMod' class objects.") coef.method <- "lmer" } # lme (nlme) if(any(grepl("^.?lme$",cls)) & 
coef.method=="default"){ if(!requireNamespace("nlme", quietly=TRUE)) stop("The 'nlme' package must be installed to handle '(n)lme' class objects.") coef.method <- "nlme" } # geeglm (geepack) if(any(grepl("geeglm",cls)) & coef.method=="default"){ if(!requireNamespace("geepack", quietly=TRUE)) stop("The 'geepack' package must be installed in order to handle 'geeglm' class objects.") coef.method <- "geeglm" } fe <- switch(coef.method, lmer=.getCOEF.lmer(model), nlme=.getCOEF.nlme(model), geeglm=.getCOEF.geeglm(model), default=.getCOEF.default(model) ) m <- length(model) Qhat <- fe$Qhat Uhat <- fe$Uhat if(is.null(dim(Qhat))) dim(Qhat) <- c(1,m) if(is.null(dim(Uhat))) dim(Uhat) <- c(1,1,m) p <- dim(Qhat)[1] k <- q <- length(cons) } newQhat <- array(NA, dim=c(q,m)) newUhat <- array(NA, dim=c(q,q,m)) # *** delta method for(ii in 1:m){ theta <- Qhat[,ii] Sigma <- Uhat[,,ii] names(theta) <- gsub("\\(Intercept\\)","Intercept",names(theta)) g <- parse(text=cons) env.g <- new.env() for(pp in 1:p) assign(names(theta)[pp],theta[pp],pos=env.g) # new parameter estimates newtheta <- numeric(length(g)) for(qq in 1:q) newtheta[qq] <- eval(g[qq],envir=env.g) # derivative, new covariance matrix gdash.theta <- matrix(NA,q,p) for(qq in 1:q){ tmp <- numericDeriv(g[[qq]],names(theta),env.g) gdash.theta[qq,] <- attr(tmp,"gradient") } newSigma <- gdash.theta %*% Sigma %*% t(gdash.theta) newQhat[,ii] <- newtheta newUhat[,,ii] <- newSigma } # *** aggregation # # common part (based on D1) Qbar <- apply(newQhat,1,mean) Ubar <- apply(newUhat,c(1,2),mean) B <- cov(t(newQhat)) r <- (1+m^(-1))*sum(diag(B%*%solve(Ubar)))/k # based on D1 Ttilde <- (1 + r)*Ubar if(method=="D1"){ # D1 (Li, Raghunathan and Rubin, 1991) val <- t(Qbar) %*% solve(Ttilde) %*% Qbar / k t <- k*(m-1) if(!is.null(df.com)){ a <- r*t/(t-2) vstar <- ( (df.com+1) / (df.com+3) ) * df.com v <- 4 + ( (vstar-4*(1+a))^(-1) + (t-4)^(-1) * ((a^2*(vstar-2*(1+a))) / ((1+a)^2*(vstar-4*(1+a)))) )^(-1) } else { if (t>4){ v <- 4 + (t-4) * (1 + (1 - 2*t^(-1)) * (r^(-1)))^2 }else{ v <- t * (1 + k^(-1)) * ((1 + r^(-1))^2) / 2 } } p <- 1-pf(val, k, v) } if(method=="D2"){ dW <- sapply(1:m, function(z) t(newQhat[,z]) %*% solve(newUhat[,,z]) %*% newQhat[,z]) # D2 (Li, Meng et al., 1991) dWbar <- mean(dW) r <- (1+m^(-1)) * var(sqrt(dW)) val <- (dWbar/k - (m+1)/(m-1) * r) / (1+r) v <- k^(-3/m) * (m-1) * (1+r^(-1))^2 p <- 1-pf(val, k, v) } out <- matrix(c(val,k,v,p,r),ncol=5) colnames(out) <- c("F.value","df1","df2","P(>F)","RIV") # new label for p-value, SiG 2017-02-09 out <- list( call=match.call(), constraints=cons, test=out, Qbar=Qbar, T=Ttilde, m=m, adj.df=!is.null(df.com), df.com=df.com, method=method ) class(out) <- "mitml.testConstraints" out } mitml/R/print.mitml.summary.R0000644000176200001440000001102313321120501015652 0ustar liggesusersprint.mitml.summary <- function(x,...){ # print method for objects of class "summary.mitml" cl <- x$call vrs <- x$model itr <- x$iter ngr <- x$ngr mdr <- x$missing.rates conv <- x$conv isML <- attr(x$model,"is.ML") isL2 <- attr(x$model,"is.L2") # print general information cat("\nCall:\n", paste(deparse(cl)), sep="\n") cat("\n") if(isL2) cat("Level 1:\n", collapse="\n") if(isML) cat(formatC("Cluster variable:",width=-25), vrs$clus, sep=" ", collapse="\n") cat(formatC("Target variables:",width=-25), vrs$yvrs, collapse="\n") cat(formatC("Fixed effect predictors:",width=-25), vrs$pvrs, collapse="\n") if(isML) cat(formatC("Random effect predictors:",width=-25), vrs$qvrs, collapse="\n") if(isL2){ cat("\n") cat(formatC("Level 2:\n",width=-25), 
      collapse="\n")
    cat(formatC("Target variables:",width=-25), vrs$yvrs.L2, collapse="\n")
    cat(formatC("Fixed effect predictors:",width=-25), vrs$pvrs.L2, collapse="\n")
  }

  cat("\nPerformed", sprintf("%.0f",itr$burn), "burn-in iterations, and generated",
      sprintf("%.0f",itr$m), "imputed data sets,\neach", sprintf("%.0f",itr$iter),
      "iterations apart.",
      if(ngr>1){c("\nImputations were carried out separately within",
      sprintf("%.0f",ngr), "groups.")},"\n")

  # print convergence diagnostics
  if(!is.null(conv)){

    # note for reduced chains
    if(x$keep.chains!="full"){
      cat("\nNote: Convergence criteria were calculated from a reduced set of\nparameters (setting: ", x$keep.chains, ").\n", sep="")
    }

    for(cc in attr(conv,"stats")){

      # summary for Rhat and SDprop
      if(cc=="Rhat"|cc=="SDprop"){

        cout <- matrix(c(
          sapply(conv, function(z) min(z[,cc])),
          sapply(conv, function(z) quantile(z[,cc],.25)),
          sapply(conv, function(z) mean(z[,cc])),
          sapply(conv, function(z) median(z[,cc])),
          sapply(conv, function(z) quantile(z[,cc],.75)),
          sapply(conv, function(z) max(z[,cc]))
        ), ncol=6 )
        rownames(cout) <- c("Beta:",if(isL2) "Beta2:",if(isML) "Psi:","Sigma:")
        colnames(cout) <- c("Min","25%","Mean","Median","75%","Max")

        clab <- switch(cc,
          Rhat="\nPotential scale reduction (Rhat, imputation phase):\n",
          SDprop="\nGoodness of approximation (imputation phase):\n")
        cat(clab,"\n")
        print.table(round(cout,3))

        clab <- switch(cc,
          Rhat="\nLargest potential scale reduction:\n",
          SDprop="\nPoorest approximation:\n")
        cat(clab)
        maxval <- lapply(conv, function(a) a[which.max(a[,cc]),1:2])
        cat("Beta: [", paste(maxval$beta,collapse=",") ,"], ",
            if(isL2) paste0("Beta2: [", paste(maxval$beta2,collapse=",") ,"], "),
            if(isML) paste0("Psi: [", paste(maxval$psi,collapse=",") ,"], "),
            "Sigma: [", paste(maxval$sigma,collapse=",") ,"]\n", sep="")

      }

      # summary for ACF
      if(cc=="ACF"){

        cout <- c(
          sapply(conv, function(z) mean(z[,"lag-1"])),
          sapply(conv, function(z) mean(z[,"lag-k"])),
          sapply(conv, function(z) mean(z[,"lag-2k"])),
          sapply(conv, function(z) max(z[,"lag-1"])),
          sapply(conv, function(z) max(z[,"lag-k"])),
          sapply(conv, function(z) max(z[,"lag-2k"]))
        )
        neg <- cout<0
        cout <- sprintf(cout,fmt="%.3f")
        cout[neg] <- gsub("^-0","-",cout[neg])
        cout[!neg] <- gsub("^0"," ",cout[!neg])
        cout <- matrix(cout, 2+isML+isL2, 6)
        cout <- rbind(c(" Lag1"," Lagk","Lag2k"," Lag1"," Lagk","Lag2k"), cout)
        rownames(cout) <- c("","Beta:",if(isL2) "Beta2:",if(isML) "Psi:","Sigma:")
        colnames(cout) <- c(" Mean","",""," Max","","")

        cat("\nAutocorrelation (ACF, imputation phase):\n\n")
        print.table(cout)

        cat("\nLargest autocorrelation at lag k:\n")
        maxval <- lapply(conv, function(a) a[which.max(abs(a[,"lag-k"])),1:2])
        cat("Beta: [", paste(maxval$beta,collapse=",") ,"], ",
            if(isL2) paste0("Beta2: [", paste(maxval$beta2,collapse=",") ,"], "),
            if(isML) paste0("Psi: [", paste(maxval$psi,collapse=",") ,"], "),
            "Sigma: [", paste(maxval$sigma,collapse=",") ,"]\n", sep="")

      }

    }

  }

  # missing data rates
  mdrout <- t(as.matrix(mdr))
  rownames(mdrout) <- "MD%"
  cat("\nMissing data per variable:\n")
  print.table(mdrout)
  cat("\n")

  invisible(NULL)

}
mitml/R/anova.mitml.result.R0000644000176200001440000000474313100117100015462 0ustar liggesusersanova.mitml.result <- function(object, ...){

  # ***
  # select method for object class
  #

  cls <- class(object[[1]])

  # default
  lr.method <- "default"

  # lm
  if(cls[1]=="lm") lr.method <- "lm"

  # merMod (lme4)
  if(any(grepl("^l?merMod$",cls[1]))){
    if(!requireNamespace("lme4", quietly=TRUE))
      stop("The 'lme4' package must be installed to handle 'merMod' class objects.")
    lr.method <-
"lmer" } # lme (nlme) if(any(grepl("^.?lme$",cls[1]))){ if(!requireNamespace("nlme", quietly=TRUE)) stop("The 'nlme' package must be installed to handle 'lme' class objects.") lr.method <- "nlme" } # coxph (survival) if(any(grepl("coxph",cls)) & lr.method=="default"){ if(!requireNamespace("survival", quietly=TRUE)) stop("The 'survival' package must be installed in order to handle 'coxph' class objects.") lr.method <- "coxph" } # check use of D3 if(lr.method%in%c("lm","lmer","nlme")){ method <- "D3" }else{ warning("The 'D3' method is currently not supported for models of class '",cls[1],"'. Switching to 'D2'.") method <- "D2" } # *** testModels # modlist <- list(object,...) # order models nm <- length(modlist) df <- integer(nm) for(mm in 1:nm){ df[mm] <- switch(lr.method, lmer=attr(logLik(modlist[[mm]][[1]]),"df"), nlme=attr(logLik(modlist[[mm]][[1]]),"df"), coxph=if(class(modlist[[mm]][[1]])[1]=="coxph.null") 0 else attr(logLik(modlist[[mm]][[1]]),"df"), lm=.tryResidualDf(modlist[[mm]][[1]]), default=.tryResidualDf(modlist[[mm]][[1]]) ) } if(lr.method%in%c("lm","default")) df <- abs(df-max(df)) modlist <- modlist[ order(df,decreasing=T) ] # stepwise comparison outlist <- as.list(1:(nm-1)) for(mm in 1:(nm-1)){ if(method=="D3"){ outlist[[mm]] <- testModels(model=modlist[[mm]], null.model=modlist[[mm+1]], method="D3") }else{ outlist[[mm]] <- testModels(model=modlist[[mm]], null.model=modlist[[mm+1]], method="D2", use="likelihood") } } # get model formulas fml <- character(nm) for(mm in 1:nm){ fml[mm] <- tryCatch( gsub("[[:space:]]", "", Reduce(paste, deparse( formula(modlist[[mm]][[1]]) ))), error=function(f) NULL ) } # check for REML reml <- any(sapply(outlist, function(x) x$reml)) out <- list( call=match.call(), test=outlist, formula=fml, method=method, use="likelihood", reml=reml ) class(out) <- "mitml.anova" out } mitml/R/write.mitml.R0000644000176200001440000000031212477323013014171 0ustar liggesuserswrite.mitml <- function(x, filename, drop=FALSE){ # write mitml class object to file if(drop){ x <- x[!names(x)%in%c("par.burnin","par.imputation")] } save(x,file=filename) invisible() } mitml/R/testEstimates.R0000644000176200001440000001256013047036113014560 0ustar liggesuserstestEstimates <- function(model, qhat, uhat, var.comp=FALSE, df.com=NULL){ # combine scalar estimates from the analysis of multiply imputed data if(missing(model)==(missing(qhat)|missing(uhat))) stop("Either 'model' or both 'qhat' and 'uhat' must be supplied.") # *** # combine from matrix or list # if(!missing(qhat)){ coef.method <- "default" if(missing(uhat)) stop("Either 'model' or both 'qhat' and 'uhat' must be supplied.") if(is.list(qhat)){ Qhat <- sapply(qhat, identity) Uhat <- sapply(uhat, identity) }else{ Qhat <- qhat Uhat <- uhat } if(is.null(dim(Qhat))){ dim(Qhat) <- c(1,length(qhat)) nms <- if(is.list(qhat)) names(qhat[[1]]) else if(is.matrix(qhat)) rownames(qhat) else NULL dimnames(Qhat) <- list(nms, NULL) } if(is.null(dim(Uhat))) dim(Uhat) <- dim(Qhat) m <- ncol(Qhat) if(is.null(rownames(Qhat))) rownames(Qhat) <- paste0("Parameter.",1:nrow(Qhat)) Qbar <- apply(Qhat,1,mean) Ubar <- apply(Uhat,1,mean) B <- apply(Qhat,1,var) T <- Ubar + (1+m^(-1)) * B r <- (1+m^(-1))*B/Ubar v <- vm <- (m-1)*(1+r^(-1))^2 fmi <- (r+2/(v+3))/(r+1) se <- sqrt(T) t <- Qbar/se if(!is.null(df.com)){ lam <- r/(r+1) vobs <- (1-lam)*((df.com+1)/(df.com+3))*df.com v <- (vm^(-1)+vobs^(-1))^(-1) } p <- 2*(1-pt(abs(t),df=v)) # two-tailed p-value, SiG 2017-02-09 out <- matrix(c(Qbar,se,t,v,p,r,fmi),ncol=7) colnames(out) <- 
c("Estimate","Std.Error","t.value","df","P(>|t|)","RIV","FMI") # two-tailed p-value, SiG 2017-02-09 rownames(out) <- names(Qbar) # print vout for missing U uind <- is.na(Ubar) vout <- if(any(uind)) out[uind,1,drop=FALSE] else NULL out <- if(all(uind)) NULL else out[!uind,,drop=FALSE] } # *** # combine through model recognition # if(!missing(model)){ # * identify procedures for model class cls <- class(model[[1]]) coef.method <- vc.method <- "default" if(cls[1]=="lm") vc.method <- "lm" # merMod (lme4) if(any(grepl("merMod",cls)) & coef.method=="default"){ if(!requireNamespace("lme4", quietly=TRUE)) stop("The 'lme4' package must be installed in order to handle 'merMod' class objects.") coef.method <- vc.method <- "lmer" } # lme (nlme) if(any(grepl("^.?lme$",cls)) & coef.method=="default"){ if(!requireNamespace("nlme", quietly=TRUE)) stop("The 'nlme' package must be installed in order to handle 'lme' class objects.") coef.method <- vc.method <- "nlme" } # geeglm (geepack) if(any(grepl("geeglm",cls)) & coef.method=="default"){ if(!requireNamespace("geepack", quietly=TRUE)) stop("The 'geepack' package must be installed in order to handle 'geeglm' class objects.") coef.method <- vc.method <- "geeglm" } # * combine fixed coefficients fe <- switch(coef.method, lmer=.getCOEF.lmer(model,diagonal=TRUE), nlme=.getCOEF.nlme(model,diagonal=TRUE), geeglm=.getCOEF.geeglm(model,diagonal=TRUE), default=.getCOEF.default(model,diagonal=TRUE) ) m <- length(model) Qhat <- fe$Qhat Uhat <- fe$Uhat if(is.null(dim(Qhat))){ dim(Qhat) <- c(1,m) dim(Uhat) <- c(1,m) dimnames(Qhat) <- dimnames(Uhat) <- list(fe$nms, NULL) } Qbar <- apply(Qhat,1,mean) Ubar <- apply(Uhat,1,mean) B <- apply(Qhat,1,var) T <- Ubar + (1+m^(-1)) * B r <- (1+m^(-1))*B/Ubar v <- vm <- (m-1)*(1+r^(-1))^2 fmi <- (r+2/(v+3))/(r+1) se <- sqrt(T) t <- Qbar/se if(!is.null(df.com)){ lam <- r/(r+1) vobs <- (1-lam)*((df.com+1)/(df.com+3))*df.com v <- (vm^(-1)+vobs^(-1))^(-1) } p <- 2*(1-pt(abs(t),df=v)) # two-tailed p-value, SiG 2017-02-09 out <- matrix(c(Qbar,se,t,v,p,r,fmi),ncol=7) colnames(out) <- c("Estimate","Std.Error","t.value","df","P(>|t|)","RIV","FMI") # two-tailed p-value, SiG 2017-02-09 rownames(out) <- names(Qbar) # * combine variance components vout <- NULL if(var.comp){ vc <- switch(vc.method, lmer=.getVC.lmer(model), nlme=.getVC.nlme(model), geeglm=.getVC.geeglm(model), lm=.getVC.lm(model), default=list(vlist=NULL,addp=NULL) ) if(vc.method=="default") warning("Computation of variance components not supported for objects of class '", cls[1], "' (see ?with.mitml.list for manual calculation).") vlist <- vc$vlist addp <- vc$addp if(!is.null(vlist)){ vlist <- lapply(vlist, function(z) apply(z,1:2,mean) ) ln <- names(vlist) nms <- vout <- c() for(vv in 1:length(vlist)){ vc <- vlist[[vv]] rn <- rownames(vc) cn <- colnames(vc) for(rr in 1:nrow(vc)){ for(cc in 1:ncol(vc)){ if(cc>=rr){ vout <- c(vout, vc[rr,cc]) nms <- c(nms, paste(rn[rr],"~~",cn[cc],ln[vv],sep="")) } }} } } if(!is.null(vout)){ vout <- matrix(vout,ncol=1) rownames(vout) <- nms colnames(vout) <- "Estimate" } if(!is.null(addp)){ vout <- rbind(vout, as.matrix(addp)) colnames(vout) <- "Estimate" } } } out <- list( call=match.call(), estimates=out, var.comp=vout, m=m, adj.df=!is.null(df.com), df.com=df.com, cls.method=coef.method ) class(out) <- "mitml.testEstimates" out } mitml/R/subset.mitml.list.R0000644000176200001440000000154113057752752015336 0ustar liggesuserssubset.mitml.list <- function(x, subset, select, ...){ # subset list of multiply imputed data sets # NOTE: code adapted 
from subset.data.frame (by Peter Dalgaard and Brian Ripley) rind <- if (missing(subset)) { lapply(x, function(i) rep(TRUE, nrow(i))) } else { ss <- substitute(subset) rind <- lapply(x, function(i) eval(ss, i, parent.frame())) if (!is.logical(unlist(rind))) stop("'subset' must be logical") lapply(rind, function(i) i & !is.na(i)) } cind <- if (missing(select)) { lapply(x, function(i) TRUE) } else { nl <- lapply(x, function(i){ l <- as.list(seq_along(i)) names(l) <- names(i) l }) se <- substitute(select) lapply(nl, function(i) eval(se, i, parent.frame())) } res <- lapply(seq_along(x), function(i) x[[i]][rind[[i]], cind[[i]], drop=FALSE]) as.mitml.list(res) } mitml/R/panImpute.R0000644000176200001440000001315213321120501013650 0ustar liggesuserspanImpute <- function(data, type, formula, n.burn=5000, n.iter=100, m=10, group=NULL, prior=NULL, seed=NULL, save.pred=FALSE, keep.chains=c("full","diagonal"), silent=FALSE){ # wrapper function for the Gibbs sampler in the pan package # *** checks if(!missing(type) && !missing(formula)) stop("Only one of 'type' or 'formula' may be specified.") if(save.pred && !missing(type)){ warning("Option 'save.pred' is ignored if 'type' is specified") save.pred=FALSE } keep.chains <- match.arg(keep.chains) # convert type if(!missing(type)){ formula <- .type2formula(data,type) group <- attr(formula, "group") } # empty objects to assign to clname <- yvrs <- y <- ycat <- zcol <- xcol <- pred <- clus <- psave <- pvrs <- qvrs <- pnames <- qnames <- NULL # preserve original order if(!is.data.frame(data)) as.data.frame(data) data <- cbind(data, original.order=1:nrow(data)) # address additional grouping grname <- group if(is.null(group)){ group <- rep(1,nrow(data)) }else{ group <- data[,group] if(length(group)!=nrow(data)) stop("Argument 'group' is not correctly specified.") } group.original <- group group <- as.numeric(factor(group,levels=unique(group))) # *** # model input # populate local frame .model.byFormula(data, formula, group, group.original, method="pan") # check model input if(any(is.na(group))) stop("Grouping variable must not contain missing data.") if(any(is.na(pred))) stop("Predictor variables must not contain missing data.") if(sum(is.na(y))==0) stop("Target variables do not contain any missing data.") if(any(!sapply(y,is.numeric))) stop("Target variables must be numeric. 
You may either convert them or use jomoImpute() instead.") if(any(duplicated(yvrs))) stop("Found duplicate target variables.") # reorder colums cc <- which(colnames(data) %in% c(clname,grname,yvrs)) data.ord <- cbind(data[c(clname,grname,yvrs)],data[-cc]) # *** # pan setup if(is.null(prior)){ prior <- list( a=ncol(y), Binv=diag(1,ncol(y)), c=ncol(y)*length(zcol), Dinv=diag(1,ncol(y)*length(zcol)) ) } if(is.null(seed)){ set.seed(as.integer(runif(1,0,10^6))) }else{ set.seed(as.integer(seed)) } rns <- sapply(unique(group), function(x,m) as.integer(runif(m+1,0,10^6)), m=m) # prepare output ind <- which(is.na(data.ord), arr.ind=TRUE, useNames=FALSE) ind <- ind[ ind[,2] %in% which(colnames(data.ord)%in%colnames(y)),,drop=FALSE ] rpm <- matrix(NA, nrow(ind), m) # standard dimensions ng <- length(unique(group)) np <- length(xcol) nq <- length(zcol) nr <- ncol(y) # reduced dimensions dpsi <- nr*nq dsig <- nr if(keep.chains=="diagonal"){ dpsi <- dsig <- 1 } bpar <- list(beta=array( NA, c(np,nr,n.burn,ng) ), psi=array( NA, c(nr*nq,dpsi,n.burn,ng) ), sigma=array( NA, c(nr,dsig,n.burn,ng) )) ipar <- list(beta=array( NA, c(np,nr,n.iter*m,ng) ), psi=array( NA, c(nr*nq,dpsi,n.iter*m,ng) ), sigma=array( NA, c(nr,dsig,n.iter*m,ng) )) # burn-in if(!silent){ cat("Running burn-in phase ...\n") flush.console() } glast <- as.list(unique(group)) for(gg in unique(group)){ gi <- group==gg gy <- y[gi,] gpred <- pred[gi,] gclus <- clus[gi] # sort 1, ..., k gclus <- match(gclus, unique(gclus)) cur <- pan::pan(gy, subj=gclus, gpred, xcol, zcol, prior, seed=rns[1,gg], iter=n.burn) glast[[gg]] <- cur$last # save parameter chains bpar[["beta"]][,,,gg] <- cur$beta if(keep.chains=="diagonal"){ bpar[["psi"]][,,,gg] <- .adiag( cur$psi ) bpar[["sigma"]][,,,gg] <-.adiag( cur$sigma ) }else{ bpar[["psi"]][,,,gg] <- cur$psi bpar[["sigma"]][,,,gg] <- cur$sigma } } # imputation for(ii in 1:m){ if(!silent){ cat("Creating imputed data set (",ii,"/",m,") ...\n") flush.console() } gy.imp <- as.list(unique(group)) for(gg in unique(group)){ gi <- group==gg gy <- y[gi,] gpred <- pred[gi,] gclus <- clus[gi] # sort 1, ..., k gclus <- match(gclus, unique(gclus)) cur <- pan::pan(gy, subj=gclus, gpred, xcol, zcol, prior, seed=rns[ii+1,gg], iter=n.iter, start=glast[[gg]]) glast[[gg]] <- cur$last # save imputations gy.imp[[gg]] <- cur$y # save parameter chains i0 <- seq.int(n.iter*(ii-1)+1, n.iter*ii) ipar[["beta"]][,,i0,gg] <- cur$beta if(keep.chains=="diagonal"){ ipar[["psi"]][,,i0,gg] <- .adiag( cur$psi ) ipar[["sigma"]][,,i0,gg] <- .adiag( cur$sigma ) }else{ ipar[["psi"]][,,i0,gg] <- cur$psi ipar[["sigma"]][,,i0,gg] <- cur$sigma } } y.imp <- do.call(rbind,gy.imp) rpm[,ii] <- y.imp[is.na(y)] } if(!silent){ cat("Done!\n") } # clean up srt <- data.ord[,ncol(data.ord)] data.ord <- data.ord[,-ncol(data.ord)] # prepare output data if( save.pred && !missing(formula) ) data.ord <- cbind(data.ord,pred[,psave,drop=F]) # ordering attr(data.ord,"sort") <- srt attr(data.ord,"group") <- group.original # model summary model <- list(clus=clname, yvrs=yvrs, pvrs=pvrs, qvrs=qvrs) attr(model,"is.ML") <- TRUE attr(model,"is.L2") <- FALSE attr(model,"full.names") <- list(pvrs=pnames, qvrs=qnames) out <- list( data=data.ord, replacement.mat=rpm, index.mat=ind, call=match.call(), model=model, random.L1="none", prior=prior, iter=list(burn=n.burn, iter=n.iter, m=m), keep.chains=keep.chains, par.burnin=bpar, par.imputation=ipar ) class(out) <- c("mitml","pan") out } mitml/R/internal-getCOEF.R0000644000176200001440000000763413077701223014762 0ustar liggesusers# *** # 
Functions to extract (fixed) coefficients and SEs/Covariance from # supported classes of statistical models # # *** lmer method .getCOEF.lmer <- function(model,null.model=NULL,diagonal=FALSE){ if(!requireNamespace("lme4", quietly=TRUE)) stop("The 'lme4' package must be installed in order to use this function.") if(is.null(null.model)){ if(diagonal){ nms <- names(lme4::fixef(model[[1]])) Qhat <- sapply(model,lme4::fixef) if(is.null(dim(Qhat))){ Uhat <- sapply(model, function(z) vcov(summary(z))@x ) }else{ Uhat <- sapply(model, function(z) diag( matrix(vcov(summary(z))@x,nrow(Qhat)) )) } }else{ nms <- names(lme4::fixef(model[[1]])) p <- length(nms) Qhat <- sapply(model,lme4::fixef) Uhat <- vapply(model, function(z) vcov(summary(z))@x, FUN.VALUE=matrix(0,p,p)) } }else{ par0 <- names(lme4::fixef(null.model[[1]])) par1 <- names(lme4::fixef(model[[1]])) dpar <- setdiff(par1,par0) nms <- NULL p <- length(par1) i <- which(par1%in%dpar) Qhat <- sapply(model,lme4::fixef)[dpar,] Uhat <- vapply(model, function(z) vcov(summary(z))@x, FUN.VALUE=matrix(0,p,p))[i,i,] } out <- list(Qhat=Qhat,Uhat=Uhat,nms=nms) out } # *** nlme method .getCOEF.nlme <- function(model,null.model=NULL,diagonal=FALSE){ if(!requireNamespace("nlme", quietly=TRUE)) stop("The 'nlme' package must be installed in order to use this function.") if(is.null(null.model)){ if(diagonal){ nms <- names(nlme::fixef(model[[1]])) Qhat <- sapply(model,nlme::fixef) if(is.null(dim(Qhat))){ Uhat <- sapply(model, function(z) vcov(summary(z)) ) }else{ Uhat <- sapply(model, function(z) diag( vcov(summary(z)) ) ) } }else{ nms <- names(nlme::fixef(model[[1]])) p <- length(nms) Qhat <- sapply(model,nlme::fixef) Uhat <- vapply(model, function(z) vcov(summary(z)), FUN.VALUE=matrix(0,p,p)) } }else{ par0 <- names(nlme::fixef(null.model[[1]])) par1 <- names(nlme::fixef(model[[1]])) dpar <- setdiff(par1,par0) nms <- NULL p <- length(par1) i <- which(par1%in%dpar) Qhat <- sapply(model,nlme::fixef)[dpar,] Uhat <- vapply(model, function(z) vcov(summary(z)), FUN.VALUE=matrix(0,p,p))[i,i,] } out <- list(Qhat=Qhat,Uhat=Uhat,nms=nms) out } # *** geeglm method (geepack) .getCOEF.geeglm <- function(model,null.model=NULL,diagonal=FALSE){ if(!requireNamespace("geepack", quietly=TRUE)) stop("The 'geepack' package must be installed in order to use this function.") vcov.geeglm <- function(x) summary(x)$cov.scaled .getCOEF.default(model, null.model=null.model, diagonal=diagonal, vcov.func=vcov.geeglm) } # *** default method .getCOEF.default <- function(model, null.model=NULL, diagonal=FALSE, coef.func=NULL, vcov.func=NULL){ # use predefined methods if(!is.null(coef.func)) coef <- coef.func if(!is.null(vcov.func)) vcov <- vcov.func if(is.null(null.model)){ if(diagonal){ nms <- names(coef(model[[1]])) Qhat <- sapply(model,coef) if(is.null(dim(Qhat))){ Uhat <- sapply(model, vcov ) }else{ Uhat <- sapply(model,function(z) diag(vcov(z))) } }else{ nms <- names(coef(model[[1]])) p <- length(nms) Qhat <- sapply(model,coef) Uhat <- vapply(model,vcov, FUN.VALUE=matrix(0,p,p)) } }else{ par0 <- names(coef(null.model[[1]])) if(is.null(par0)) par0 <- character(0) par1 <- names(coef(model[[1]])) dpar <- setdiff(par1,par0) nms <- NULL p <- length(par1) i <- which(par1%in%dpar) Qhat <- sapply(model,coef) Uhat <- vapply(model,vcov, FUN.VALUE=matrix(0,p,p)) if(is.null(dim(Qhat))){ dim(Qhat) <- c(1,length(model)) dim(Uhat) <- c(1,1,length(model)) rownames(Qhat) <- par1 } Qhat <- Qhat[dpar,,drop=F] Uhat <- Uhat[i,i,,drop=F] } out <- list(Qhat=Qhat,Uhat=Uhat,nms=nms) out } 
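# NOTE (illustration): each extractor above returns a list with the stacked
# coefficient estimates ('Qhat'), their (co)variances ('Uhat'), and the
# parameter names ('nms'). A hypothetical usage sketch for a list of 'lm'
# fits (variable and object names are placeholders, not part of the package):
#
#   fit <- lapply(implist, function(d) lm(y ~ x, data = d))
#   est <- .getCOEF.default(fit)
#   dim(est$Qhat)   # p x m matrix of coefficients
#   dim(est$Uhat)   # p x p x m array of covariance matrices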
mitml/R/testModels.R0000644000176200001440000002413013100066111014030 0ustar liggesuserstestModels <- function(model, null.model, method=c("D1","D2","D3"), use=c("wald","likelihood"), df.com=NULL){ # model comparison and hypothesis tests for k-dimensional estimands # *** # general errors if(!"list"%in%class(model) & !"list"%in%class(null.model) & !is.null(null.model)) stop("The 'model' and 'null.model' arguments must be lists of fitted statistical models.") if(!"list"%in%class(model) & is.null(null.model)) stop("The 'model' argument must be a list of fitted statistical models.") if(!missing(method) & length(method)>1) stop("Only one 'method' may be supplied.") if(!missing(use) & length(use)>1) stop("Only one of 'wald' or 'likelihood' may be supplied.") method <- match.arg(method) use <- match.arg(use) # *** # warnings if(!is.null(df.com) & method!="D1") warning("Complete-data degrees of freedom are not available for use with '",method,"', and thus were ignored.") if(use=="likelihood" & method!="D2") warning("The 'likelihood' option is not available with method '",method,"', and thus was ignored.") # *** # select extraction methods cls <- class(model[[1]]) # default method (lm) coef.method <- vc.method <- lr.method <- "default" if(cls[1]=="lm") vc.method <- lr.method <- "lm" # merMod (lme4) if(any(grepl("merMod",cls)) & coef.method=="default"){ if(!requireNamespace("lme4", quietly=TRUE)) stop("The 'lme4' package must be installed to handle 'merMod' class objects.") coef.method <- vc.method <- lr.method <- "lmer" } # lme (nlme) if(any(grepl("^.?lme$",cls)) & coef.method=="default"){ if(!requireNamespace("nlme", quietly=TRUE)) stop("The 'nlme' package must be installed to handle 'lme' class objects.") coef.method <- vc.method <- lr.method <- "nlme" } # geeglm (geepack) if(any(grepl("geeglm",cls)) & coef.method=="default"){ if(!requireNamespace("geepack", quietly=TRUE)) stop("The 'geepack' package must be installed in order to handle 'geeglm' class objects.") coef.method <- "geeglm" } # coxph (survival) if(any(grepl("coxph",cls)) & coef.method=="default"){ if(!requireNamespace("survival", quietly=TRUE)) stop("The 'survival' package must be installed in order to handle 'coxph' class objects.") lr.method <- "coxph" } # *** #! if(method=="D1"){ fe <- switch(coef.method, lmer=.getCOEF.lmer(model,null.model), nlme=.getCOEF.nlme(model,null.model), geeglm=.getCOEF.geeglm(model,null.model), default=.getCOEF.default(model,null.model) ) m <- length(model) Qhat <- fe$Qhat Uhat <- fe$Uhat if(is.null(dim(Qhat))) dim(Qhat) <- c(1,m) if(is.null(dim(Uhat))) dim(Uhat) <- c(1,1,m) k <- dim(Qhat)[1] Qbar <- apply(Qhat,1,mean) Ubar <- apply(Uhat,c(1,2),mean) B <- cov(t(Qhat)) r <- (1+m^(-1))*sum(diag(B%*%solve(Ubar)))/k Ttilde <- (1 + r)*Ubar # D1 (Li, Raghunathan and Rubin, 1991) val <- t(Qbar) %*% solve(Ttilde) %*% Qbar / k t <- k*(m-1) if(!is.null(df.com)){ a <- r*t/(t-2) vstar <- ( (df.com+1) / (df.com+3) ) * df.com v <- 4 + ( (vstar-4*(1+a))^(-1) + (t-4)^(-1) * ((a^2*(vstar-2*(1+a))) / ((1+a)^2*(vstar-4*(1+a)))) )^(-1) } else { if (t>4){ v <- 4 + (t-4) * (1 + (1 - 2*t^(-1)) * (r^(-1)))^2 }else{ v <- t * (1 + k^(-1)) * ((1 + r^(-1))^2) / 2 } } p <- 1-pf(val, k, v) out <- matrix(c(val,k,v,p,r),ncol=5) colnames(out) <- c("F.value","df1","df2","P(>F)","RIV") # new label for p-value, SiG 2017-02-09 out <- list( call=match.call(), test=out, m=m, adj.df=!is.null(df.com), df.com=df.com, method="D1", use="wald", reml=FALSE ) } # *** #! 
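  # NOTE (illustration): D1 above pools the k-dimensional Wald test by
  # comparing the pooled estimate against its total variance, using an
  # average relative increase in variance across parameters, i.e.:
  #
  #   r <- (1 + 1/m) * sum(diag(B %*% solve(Ubar))) / k
  #   F <- t(Qbar) %*% solve((1 + r) * Ubar) %*% Qbar / k
  #
  # as in Li, Raghunathan, and Rubin (1991).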
if(method=="D2"){ if(use=="wald"){ reml <- FALSE fe <- switch(coef.method, lmer=.getCOEF.lmer(model,null.model), nlme=.getCOEF.nlme(model,null.model), geeglm=.getCOEF.geeglm(model,null.model), default=.getCOEF.default(model,null.model) ) m <- length(model) Qhat <- fe$Qhat Uhat <- fe$Uhat if(is.null(dim(Qhat))) dim(Qhat) <- c(1,m) if(is.null(dim(Uhat))) dim(Uhat) <- c(1,1,m) k <- dim(Qhat)[1] dW <- sapply(1:m, function(z) t(Qhat[,z]) %*% solve(Uhat[,,z]) %*% Qhat[,z]) } # TODO: likelihood test for (single) model fit (with null.model=NULL) if(use=="likelihood"){ # check for REML and refit reml1 <- sapply(model, .is.REML, lr.method=lr.method) reml0 <- sapply(null.model, .is.REML, lr.method=lr.method) reml <- ( any(reml0) | any(reml1) ) if(reml){ model[reml1] <- lapply(model[reml1], .update.ML, lr.method=lr.method) null.model[reml0] <- lapply(null.model[reml0], .update.ML, lr.method=lr.method) } dW <- switch(lr.method, lmer=.getLR.lmer(model,null.model), nlme=.getLR.nlme(model,null.model), coxph=.getLR.coxph(model,null.model), lm=.getLR.default(model,null.model), default=.getLR.default(model,null.model) ) m <- length(model) k <- attr(dW,"df") if(is.null(k)) stop("Degrees of freedom for the model comparison could not be detected.") } # D2 (Li, Meng et al., 1991) dWbar <- mean(dW) r <- (1+m^(-1)) * var(sqrt(dW)) val <- (dWbar/k - (m+1)/(m-1) * r) / (1+r) v <- k^(-3/m) * (m-1) * (1+r^(-1))^2 p <- 1-pf(val, k, v) out <- matrix(c(val,k,v,p,r),ncol=5) colnames(out) <- c("F.value","df1","df2","P(>F)","RIV") # new label for p-value, SiG 2017-02-09 out <- list( call=match.call(), test=out, k=k, m=m, adj.df=FALSE, df.com=NULL, method="D2", use=use, reml=reml ) } # *** #! if(method=="D3"){ # error checking if(!lr.method%in%c("lm","lmer","nlme")) stop("The 'D3' method is currently not supported for models of class '", cls[1],"'.") if(!grepl("^lme$",cls[1]) & lr.method=="nlme") stop("The 'D3' method is currently only supported for linear mixed-effects models.") if(!grepl("^l?merMod$",cls[1]) & lr.method=="lmer") stop("The 'D3' method is currently only supported for linear mixed-effects models.") # check for REML and refit reml1 <- sapply(model, .is.REML, lr.method=lr.method) reml0 <- sapply(null.model, .is.REML, lr.method=lr.method) reml <- ( any(reml0) | any(reml1) ) if(reml){ model[reml1] <- lapply(model[reml1], .update.ML, lr.method=lr.method) null.model[reml0] <- lapply(null.model[reml0], .update.ML, lr.method=lr.method) } # LR at fit-specific estimates dL <- switch( lr.method, lmer=.getLR.lmer(model,null.model), nlme=.getLR.nlme(model,null.model), lm=.getLR.default(model,null.model)) fe0 <- switch( coef.method, lmer=.getCOEF.lmer(null.model), nlme=.getCOEF.nlme(null.model), default=.getCOEF.default(null.model) ) vc0 <- switch( vc.method, lmer=.getVC.lmer(null.model), nlme=.getVC.nlme(null.model), lm=.getVC.lm(null.model,ML=TRUE) ) fe1 <- switch( coef.method, lmer=.getCOEF.lmer(model), nlme=.getCOEF.nlme(model), default=.getCOEF.default(model) ) vc1 <- switch( vc.method, lmer=.getVC.lmer(model), nlme=.getVC.nlme(model), lm=.getVC.lm(model,ML=TRUE) ) dLbar <- mean(dL) m <- length(model) k <- attr(dL,"df") # LR at average estimates switch( lr.method, lmer={ if(length(vc0$vlist)>2) stop("The 'D3' method is only supported for models of class 'merMod' with a single cluster variable.") Q0 <- fe0$Qhat Q1 <- fe1$Qhat if(is.null(dim(Q0))) dim(Q0) <- c(1,m) if(is.null(dim(Q1))) dim(Q1) <- c(1,m) V0 <- lapply(vc0$vlist, function(z) unname(apply(z,1:2,mean)) ) V1 <- lapply(vc1$vlist, function(z) 
          unname(apply(z,1:2,mean)) )
        psi0bar <- list(beta=rowMeans(Q0),D=V0[[1]],sigma2=V0[[2]][1,1])
        psi1bar <- list(beta=rowMeans(Q1),D=V1[[1]],sigma2=V1[[2]][1,1])
        dLt <- .getLR.nlme(model,null.model,psi=psi1bar,null.psi=psi0bar)
      },
      lm={
        Q0 <- fe0$Qhat
        Q1 <- fe1$Qhat
        if(is.null(dim(Q0))) dim(Q0) <- c(1,m)
        if(is.null(dim(Q1))) dim(Q1) <- c(1,m)
        V0 <- lapply(vc0$vlist, function(z) unname(apply(z,1:2,mean)) )
        V1 <- lapply(vc1$vlist, function(z) unname(apply(z,1:2,mean)) )
        psi0bar <- list(beta=rowMeans(Q0),sigma2=V0[[1]][1,1])
        psi1bar <- list(beta=rowMeans(Q1),sigma2=V1[[1]][1,1])
        dLt <- .getLR.default(model,null.model,psi=psi1bar,null.psi=psi0bar)
      }
    )

    # D3 (Meng & Rubin, 1992)
    dLtilde <- mean(dLt)
    r <- (m+1) * (k*(m-1))^(-1) * (dLbar-dLtilde)
    val <- dLtilde / (k*(1+r))

    t <- k*(m-1)
    if( t>4 ){
      v <- 4 + (t-4) * (1 + (1-2*t^(-1)) * r^(-1))^2
    }else{
      v <- t * (1+k^(-1)) * (1+r^(-1))^2 / 2
    }
    p <- 1 - pf(val, k, v)

    out <- matrix(c(val,k,v,p,r),ncol=5)
    colnames(out) <- c("F.value","df1","df2","P(>F)","RIV")   # new label for p-value, SiG 2017-02-09

    out <- list(
      call=match.call(),
      test=out,
      m=m,
      adj.df=FALSE,
      df.com=NULL,
      method="D3",
      use="likelihood",
      reml=reml
    )

  }

  class(out) <- "mitml.testModels"
  out

}

.is.REML <- function(x, lr.method){
  reml <- FALSE
  if(lr.method=="lmer") reml <- lme4::isREML(x)
  if(lr.method=="nlme") reml <- x$method=="REML"
  reml
}

.update.ML <- function(x, lr.method){
  if(lr.method=="lmer") x <- update(x, REML=FALSE)
  if(lr.method=="nlme") x <- update(x, data=x$data, method="ML")
  x
}
mitml/R/jomoImpute.R0000644000176200001440000004005613370070244014055 0ustar liggesusersjomoImpute <- function(data, type, formula, random.L1=c("none","mean","full"),
  n.burn=5000, n.iter=100, m=10, group=NULL, prior=NULL, seed=NULL,
  save.pred=FALSE, keep.chains=c("full","diagonal"), silent=FALSE){
# wrapper function for the different samplers of the jomo package

  # checks arguments
  if(!missing(type) & !missing(formula)) stop("Only one of 'type' or 'formula' may be specified.")
  if(save.pred & !missing(type)){
    warning("Option 'save.pred' is ignored if 'type' is specified")
    save.pred=FALSE
  }
  random.L1 <- match.arg(random.L1)
  keep.chains <- match.arg(keep.chains)

  # convert type
  if(!missing(type)){
    if(!is.null(group)){
      gv <- match(group, colnames(data))
      if(is.list(type)){
        type[[1]][gv] <- -1
      }else{
        type[gv] <- -1
      }
      warning("The 'group' argument is intended only for 'formula'. Setting 'type' of '", colnames(data)[gv],"' to '-1'.")
    }
    formula <- .type2formula(data,type)
    group <- attr(formula, "group")
  }

  # check for number of model equations
  formula <- .check.model( formula )
  isML <- attr(formula,"is.ML")
  isL2 <- attr(formula,"is.L2")
  if(!isML && random.L1!="none") stop("No cluster variable found. 
Random covariance matrices (random.L1) are not supported for single-level MI and require the specification of a cluster variable.") # objects to assign to clname <- yvrs <- y <- ycat <- zcol <- xcol <- pred <- clus <- psave <- pvrs <- qvrs <- pnames <- qnames <- yvrs.L2 <- y.L2 <- ycat.L2 <- xcol.L2 <- pred.L2 <- pvrs.L2 <- pnames.L2 <- NULL # preserve original order if(!is.data.frame(data)) as.data.frame(data) data <- cbind(data, original.order=1:nrow(data)) # address additional grouping grname <- group if(is.null(group)){ group <- rep(1,nrow(data)) }else{ if(length(group) > 1) stop("Multiple 'group' variables found. There can be only one!") if(!group%in%colnames(data)) stop("Argument 'group' is not correctly specified.") group <- data[,group] } group.original <- group group <- as.numeric(factor(group,levels=unique(group))) # *** # model input # populate local frame .model.byFormula(data, formula, group, group.original, method="jomo.matrix") # check model input if(any(is.na(group))) stop("Grouping variable must not contain missing data.") if(any(is.na(pred))) stop("Predictor variables must not contain missing data.") if(any(!sapply(data[yvrs], function(a) is.factor(a) || is.numeric(a)))) stop("Target variables must either be numeric or factors.") if((sum(is.na(y)) + sum(is.na(ycat)) + ifelse(isL2, sum(is.na(y.L2))+sum(is.na(ycat.L2)), 0))==0) stop("Target variables do not contain any missing data.") if(any(duplicated(c(yvrs,yvrs.L2)))) stop("Found duplicate target variables.") if(isL2){ if(any(is.na(pred.L2))) stop("Predictor variables must not contain missing data.") if(any(!sapply(data[yvrs.L2], function(a) is.factor(a) || is.numeric(a)))) stop("Target variables must either be numeric or factors.") } # check for L1 variables in L2 models if(isL2){ y.L1 <- !.check.variablesL2(y.L2,clus) x.L1 <- !.check.variablesL2(pred.L2,clus) if(any(y.L1)) stop("Target variables at level 1 are not allowed in level-2 equation.") if(any(x.L1)){ for(i in which(x.L1)) pred.L2[,i] <- clusterMeans(pred.L2[,i],clus) message("NOTE: Predictor variables at level 1 were found in level-2 equation and were replaced with cluster means (", paste0(pvrs.L2[x.L1], collapse=", "), ").") } } # reorder colums cc <- which(colnames(data) %in% c(clname,grname,yvrs,yvrs.L2)) data.ord <- cbind(data[c(clname,grname,yvrs,yvrs.L2)],data[-cc]) # *** jomo setup # ycat.labels <- lapply(data[,c(colnames(ycat),colnames(ycat.L2)),drop=F], levels) # select function func <- if(ncol(ycat)==0) "con" else if(ncol(y)==0) "cat" else "mix" func <- paste0(ifelse(!isML, "jomo1", ifelse(!isL2,"jomo1ran", "jomo2")), if(!isL2) func, if(isL2 & random.L1=="none") "com", if(random.L1!="none") "hr", ".MCMCchain") func <- get(func, asNamespace("jomo")) # standard dimensions and data properties ng <- length(unique(group)) np <- length(xcol) nq <- length(zcol) ncon <- ncol(y) ncat <- ncol(ycat) nr <- ncon + ncat # combined con + cat (variables) ynumcat <- matrix(0,ng,ncat) nc <- nr2 <- integer(ng) if(isL2){ np.L2 <- length(xcol.L2) ncon.L2 <- ncol(y.L2) ncat.L2 <- ncol(ycat.L2) nr.L2 <- ncon.L2 + ncat.L2 # combined con + cat (variables) ynumcat.L2 <- matrix(0,ng,ncat.L2) nc.L2 <- nr2.L2 <- integer(ng) }else{ nr2.L2 <- integer(ng) # zero counts for compatibility ncon.L2 <- ncat.L2 <- 0 # of shared code } # ... 
manage categories groupwise for(gg in unique(group)){ ynumcat[gg,] <- apply(ycat[group==gg,,drop=F], 2, FUN=function(x) length(unique(x[!is.na(x)]))) nc[gg] <- length(unique(clus[group==gg])) nr2[gg] <- ncon+sum(ynumcat[gg,])-length(ynumcat[gg,]) # combined con + cat (indicators) if(isL2){ ynumcat.L2[gg,] <- apply(ycat.L2[group==gg,,drop=F], 2, FUN=function(x) length(unique(x[!is.na(x)]))) nc.L2[gg] <- length(unique(clus[group==gg])) nr2.L2[gg] <- ncon.L2+sum(ynumcat.L2[gg,])-length(ynumcat.L2[gg,]) } } # reduced dimensions dpsi <- max(nr2)*nq+max(nr2.L2) dsig1 <- ifelse(random.L1=="full", max(nr2)*max(nc), max(nr2)) dsig2 <- max(nr2) if(keep.chains=="diagonal"){ dpsi <- dsig2 <- 1 } # * * * * * * * * * * * * * * * * * * * * # save original seed (if seed is provided) original.seed <- NULL if(!is.null(seed)){ if(exists(".Random.seed", .GlobalEnv)) original.seed <- .Random.seed set.seed(seed) } # priors if(is.null(prior)){ prior <- as.list(unique(group)) for(gg in unique(group)){ prior[[gg]] <- list( Binv=diag(1,nr2[gg]), Dinv=diag(1,nq*nr2[gg]+nr2.L2[gg]) ) if(random.L1!="none") prior[[gg]]$a <- nr2[gg] if(!isML) prior[[gg]]$Dinv <- NULL } }else{ # check if prior is given as simple list if(!is.list(prior[[1]])) prior <- rep(list(prior),ng) } # prepare output ind <- which(is.na(data.ord), arr.ind=TRUE, useNames=FALSE) ind <- ind[ ind[,2] %in% which(colnames(data.ord)%in%c(yvrs,yvrs.L2)),,drop=FALSE ] rpm <- matrix(NA, nrow(ind), m) bpar <- c( list(beta=array( NA, c(np,max(nr2),n.burn,ng) )), if(isL2) list(beta2=array( NA, c(np.L2,max(nr2.L2),n.burn,ng) )), if(isML) list(psi=array( NA, c(max(nr2)*nq+max(nr2.L2),dpsi,n.burn,ng) )), list(sigma=array( NA, c(dsig1,dsig2,n.burn,ng) )) ) ipar <- c( list(beta=array( NA, c(np,max(nr2),n.iter*m,ng) )), if(isL2) list(beta2=array( NA, c(np.L2,max(nr2.L2),n.iter*m,ng) )), if(isML) list(psi=array( NA, c(max(nr2)*nq+max(nr2.L2),dpsi,n.iter*m,ng) )), list(sigma=array( NA, c(dsig1,dsig2,n.iter*m,ng) )) ) # burn-in if(!silent){ cat("Running burn-in phase ...\n") flush.console() } glast <- as.list(unique(group)) for(gg in unique(group)){ gi <- group==gg gprior <- prior[[gg]] # function arguments (group specific) gclus <- clus[gi] gclus <- matrix( match(gclus, unique(gclus))-1, ncol=1 ) func.args <- list( Y=if(ncon>0 & ncat==0 & !isL2) y[gi,,drop=F] else NULL, Y.con=if(ncon>0 & (ncat>0 | isL2)) y[gi,,drop=F] else NULL, Y.cat=if(ncat>0) ycat[gi,,drop=F] else NULL, Y.numcat=if(ncat>0) ynumcat[gg,] else NULL, Y2.con=if(ncon.L2>0) y.L2[gi,,drop=F] else NULL, Y2.cat=if(ncat.L2>0) ycat.L2[gi,,drop=F] else NULL, Y2.numcat=if(ncat.L2>0) ynumcat.L2[gg,] else NULL, X=pred[gi,xcol,drop=F], X2=if(isL2) pred.L2[gi,xcol.L2,drop=F] else NULL, Z=if(isML) pred[gi,zcol,drop=F] else NULL, clus=if(isML) gclus else NULL, beta.start=matrix(0,np,nr2[gg]), l2.beta.start=if(isL2) matrix(0,np.L2,nr2.L2[gg]) else NULL, u.start=if(isML) matrix(0,nc[gg],nq*nr2[gg]+nr2.L2[gg]) else NULL, l1cov.start=if(random.L1!="none"){ matrix(diag(1,nr2[gg]),nr2[gg]*nc[gg],nr2[gg],byrow=T) }else{ diag(1,nr2[gg]) }, l2cov.start=if(isML) diag(1,nq*nr2[gg]+nr2.L2[gg]) else NULL, start.imp=NULL, l2.start.imp=NULL, l1cov.prior=gprior$Binv, l2cov.prior=gprior$Dinv, a=gprior$a, meth=if(random.L1!="none") "random" else NULL, nburn=n.burn, output=0 ) func.args <- func.args[!sapply(func.args,is.null)] cur <- do.call( func, func.args ) glast[[gg]] <- cur # current parameter dimensions (group-specific) bdim <- dim(cur$collectbeta)[1:2] pdim <- dim(cur$collectcovu)[1:2] sdim <- dim(cur$collectomega)[1:2] # save 
chains for beta bpar[["beta"]][1:bdim[1],1:bdim[2],,gg] <- cur$collectbeta # ... covariance matrix at L2 if(isML){ if(keep.chains=="diagonal"){ bpar[["psi"]][1:pdim[1],1,,gg] <- .adiag(cur$collectcovu) }else{ bpar[["psi"]][1:pdim[1],1:pdim[2],,gg] <- cur$collectcovu } } # ... random covariance matrices at L1 if(random.L1=="mean"){ tmp <- cur$collectomega dim(tmp) <- c(nr2[gg],nc[gg],nr2[gg],n.burn) if(keep.chains=="diagonal"){ bpar[["sigma"]][1:sdim[2],1,,gg] <- .adiag(apply(tmp,c(1,3,4),mean)) }else{ bpar[["sigma"]][1:sdim[2],1:sdim[2],,gg] <- apply(tmp,c(1,3,4),mean) } }else{ if(keep.chains=="diagonal"){ bpar[["sigma"]][1:sdim[1],1,,gg] <- .adiag(cur$collectomega, stacked=(random.L1=="full")) }else{ bpar[["sigma"]][1:sdim[1],1:sdim[2],,gg] <- cur$collectomega } } # ... L2 model if(isL2){ bdim2 <- dim(cur$collect.l2.beta)[1:2] bpar[["beta2"]][1:bdim2[1],1:bdim2[2],,gg] <- cur$collect.l2.beta } } # imputation for(ii in 1:m){ if(!silent){ cat("Creating imputed data set (",ii,"/",m,") ...\n") flush.console() } gy.imp <- as.list(unique(group)) for(gg in unique(group)){ gi <- group==gg gprior <- prior[[gg]] # last state (imp) last.imp <- if(isL2 | ncat>0) glast[[gg]]$finimp.latnorm else glast[[gg]]$finimp if(ncon>0 & ncat==0 & !isL2){ last.imp <- last.imp[(nrow(y[gi,,drop=F])+1):nrow(last.imp), 1:ncon, drop=F] } last.imp.L2 <- if(isL2) glast[[gg]]$l2.finimp.latnorm else NULL # function arguments (group specific) gclus <- clus[gi] gclus <- matrix( match(gclus, unique(gclus))-1, ncol=1 ) it <- dim(glast[[gg]]$collectbeta)[3] func.args <- list( Y=if(ncon>0 & ncat==0 & !isL2) y[gi,,drop=F] else NULL, Y.con=if(ncon>0 & (ncat>0 | isL2)) y[gi,,drop=F] else NULL, Y.cat=if(ncat>0) ycat[gi,,drop=F] else NULL, Y.numcat=if(ncat>0) ynumcat[gg,] else NULL, Y2.con=if(ncon.L2>0) y.L2[gi,,drop=F] else NULL, Y2.cat=if(ncat.L2>0) ycat.L2[gi,,drop=F] else NULL, Y2.numcat=if(ncat.L2>0) ynumcat.L2[gg,] else NULL, X=pred[gi,xcol,drop=F], X2=if(isL2) pred.L2[gi,xcol.L2,drop=F] else NULL, Z=if(isML) pred[gi,zcol,drop=F] else NULL, clus=if(isML) gclus else NULL, beta.start=.extractMatrix(glast[[gg]]$collectbeta,it), l2.beta.start=.extractMatrix(glast[[gg]]$collect.l2.beta,it), u.start=.extractMatrix(glast[[gg]]$collectu,it), l1cov.start=.extractMatrix(glast[[gg]]$collectomega,it), l2cov.start=.extractMatrix(glast[[gg]]$collectcovu,it), start.imp=last.imp, l2.start.imp=last.imp.L2, l1cov.prior=gprior$Binv, l2cov.prior=gprior$Dinv, a=gprior$a, meth=if(random.L1!="none") "random" else NULL, nburn=n.iter, output=0 ) func.args <- func.args[!sapply(func.args,is.null)] cur <- do.call( func, func.args ) glast[[gg]] <- cur # save imputations ri <- (sum(gi)+1):nrow(cur$finimp) ci <- which(colnames(cur$finimp) %in% c(yvrs,yvrs.L2)) gy.imp[[gg]] <- cur$finimp[ri,ci,drop=F] # current parameter dimensions (group-specific) bdim <- dim(cur$collectbeta)[1:2] pdim <- dim(cur$collectcovu)[1:2] sdim <- dim(cur$collectomega)[1:2] # save chains for beta iind <- (n.iter*(ii-1)+1):(n.iter*ii) ipar[["beta"]][1:bdim[1],1:bdim[2],iind,gg] <- cur$collectbeta # ... covariance matrix at L2 if(isML){ if(keep.chains=="diagonal"){ ipar[["psi"]][1:pdim[1],1,iind,gg] <- .adiag(cur$collectcovu) }else{ ipar[["psi"]][1:pdim[1],1:pdim[2],iind,gg] <- cur$collectcovu } } # ... 
random covariance matrices at L1
      if(random.L1=="mean"){
        tmp <- cur$collectomega
        dim(tmp) <- c(nr2[gg],nc[gg],nr2[gg],n.iter)
        if(keep.chains=="diagonal"){
          ipar[["sigma"]][1:sdim[2],1,iind,gg] <- .adiag(apply(tmp,c(1,3,4),mean))
        }else{
          ipar[["sigma"]][1:sdim[2],1:sdim[2],iind,gg] <- apply(tmp,c(1,3,4),mean)
        }
      }else{
        if(keep.chains=="diagonal"){
          ipar[["sigma"]][1:sdim[1],1,iind,gg] <- .adiag(cur$collectomega, stacked=(random.L1=="full"))
        }else{
          ipar[["sigma"]][1:sdim[1],1:sdim[2],iind,gg] <- cur$collectomega
        }
      }

      # ... L2 model
      if(isL2){
        bdim2 <- dim(cur$collect.l2.beta)[1:2]
        ipar[["beta2"]][1:bdim2[1],1:bdim2[2],iind,gg] <- cur$collect.l2.beta
      }
    }

    y.imp <- data.matrix(do.call(rbind,gy.imp))
    rpm[,ii] <- y.imp[,c(yvrs,yvrs.L2)][is.na(data.ord[,c(yvrs,yvrs.L2),drop=F])]
  }
  if(!silent){
    cat("Done!\n")
  }

  # clean up
  srt <- data.ord[,ncol(data.ord)]
  data.ord <- data.ord[,-ncol(data.ord)]

  # restore original seed (if seed was provided)
  if(!is.null(seed)){
    if(is.null(original.seed)){
      rm(".Random.seed", envir = .GlobalEnv)
    }else{
      assign(".Random.seed", original.seed, envir=.GlobalEnv)
    }
  }

  # *** prepare output
  #

  # save pred (the level-2 part must be assigned, otherwise it is silently dropped)
  if( save.pred & !missing(formula) ){
    ps1 <- colnames(pred) %in% psave
    ps2 <- (colnames(pred.L2) %in% psave) & !(colnames(pred.L2) %in% colnames(pred)[ps1])
    data.ord <- cbind(data.ord, pred[,ps1,drop=F])
    if(isL2) data.ord <- cbind(data.ord, pred.L2[,ps2,drop=F])
  }

  # ordering
  attr(data.ord,"sort") <- srt
  attr(data.ord,"group") <- group.original

  # categorical variables
  if(ncat>0 | ncat.L2>0){
    attr(data.ord,"cvrs") <- names(ycat.labels)
    attr(data.ord,"levels") <- cbind(ynumcat,if(isL2) ynumcat.L2)
    attr(data.ord,"labels") <- ycat.labels
  }

  # model summary
  model <- list(clus=clname, yvrs=yvrs, pvrs=pvrs, qvrs=qvrs, yvrs.L2=if(isL2) yvrs.L2 else NULL, pvrs.L2=if(isL2) pvrs.L2 else NULL)
  attr(model,"is.ML") <- isML
  attr(model,"is.L2") <- isL2
  attr(model,"full.names") <- list(pvrs=pnames, qvrs=qnames, pvrs.L2=if(isL2) pnames.L2 else NULL)

  out <- list(
    data=data.ord,
    replacement.mat=rpm,
    index.mat=ind,
    call=match.call(),
    model=model,
    random.L1=random.L1,
    prior=prior,
    iter=list(burn=n.burn, iter=n.iter, m=m),
    keep.chains=keep.chains,
    par.burnin=bpar,
    par.imputation=ipar
  )
  class(out) <- c("mitml","jomo")
  return(out)
}
mitml/R/long2mitml.list.R0000644000176200001440000000060612705460231014760 0ustar liggesuserslong2mitml.list <- function(x, split, exclude=NULL){
  # convert data set in "long" format to mitml.list

  i1 <- which(colnames(x)==split)
  f <- x[,i1]
  if(!is.null(exclude)){
    i2 <- if(length(exclude)==1) f!=exclude else !f%in%exclude
    x <- x[i2,,drop=F]
    f <- f[i2]
  }
  out <- split(x[,-i1,drop=F], f=f)
  names(out) <- NULL
  class(out) <- c("mitml.list","list")
  out
}
mitml/R/multilevelR2.R0000644000176200001440000000720212765741676014324 0ustar liggesusersmultilevelR2 <- function(model, print=c("RB1","RB2","SB","MVP")){

  # print argument case insensitive
  print <- toupper(print)
  print <- match.arg(print, several.ok=TRUE)
  method <- NULL

  # select method
  cls <- ifelse(is.list(model), class(model[[1]]), class(model))
  if(any(grepl("^l?merMod$",cls))) method <- "lmer"
  if(any(grepl("^lme$",cls))) method <- "nlme"
  if(is.null(method)) stop(paste0("Calculation of R-squared statistics is not supported for models of class '", cls, "'."))

  # calculate R-squared
  if(is.list(model)){
    out <- sapply(model, .getRsquared, print=print, method=method)
    if(is.null(dim(out))) out <- matrix(out,nrow=1)
    out <- rowMeans(out)
  }else{
    out <- .getRsquared(model, print, method)
  }
  out
}

.getRsquared <- function(model, print, method){
  # R squared for a single model fit (lme4 or nlme)
  # check if refit is necessary
  refit <- any(c("RB1","RB2","SB")%in%print)

  if(method=="lmer"){

    # model terms
    trm <- terms(model)
    if(!as.logical(attr(trm,"intercept"))) stop("Model must contain intercept.")
    yvr <- as.character(attr(trm,"variables")[-1])[attr(trm,"response")]
    cvr <- names(lme4::getME(model,"flist"))
    if(length(cvr)>1) stop("Calculation of R-squared is only supported for models with a single cluster variable.")
    cvr <- cvr[1]

    if(refit){
      # fit null model
      fml0 <- formula(paste0(yvr,"~1+(1|",cvr,")"))
      model0 <- update(model, fml0)
      # variance components under null
      vc0 <- lme4::VarCorr(model0)
      s0 <- attr(vc0,"sc")^2
      t0.0 <- vc0[[cvr]][1,1]
    }

    # alternative model components
    beta <- lme4::fixef(model)[-1]
    X <- lme4::getME(model,"X")[,-1,drop=F]
    Z <- lme4::getME(model,"mmList")[[1]][,-1,drop=F]
    muX <- colMeans(X)
    muZ <- colMeans(Z)
    vZ <- cov(Z)

    # predicted and total variance
    vc1 <- lme4::VarCorr(model)
    t0.1 <- vc1[[cvr]][1,1]
    t10.1 <- vc1[[cvr]][1,-1]
    t11.1 <- vc1[[cvr]][-1,-1,drop=F]
    s1 <- attr(vc1,"sc")^2
  }

  if(method=="nlme"){

    # model terms
    trm <- terms(model)
    if(!as.logical(attr(trm,"intercept"))) stop("Model must contain intercept.")
    yvr <- as.character(attr(trm,"variables")[-1])[attr(trm,"response")]
    cvr <- attr(nlme::getGroups(model),"label")
    if(length(nlme::getGroupsFormula(model,asList=T))>1) stop("Calculation of R-squared is only supported for models with a single cluster variable.")

    if(refit){
      # fit null model
      ffml0 <- formula(paste0(yvr,"~1"))
      rfml0 <- formula(paste0("~1|",cvr,""))
      if(is.null(nlme::getData(model))) stop("No data sets found in 'lme' fit. See '?testModels' for an example.")
      model0 <- update(model, fixed=ffml0, random=rfml0, data=model$data)
      # variance components under null
      vc0 <- nlme::getVarCov(model0)
      s0 <- model0$sigma^2
      t0.0 <- vc0[1,1]
    }

    # alternative model components
    beta <- nlme::fixef(model)[-1]
    fe <- model$terms
    X <- model.matrix(fe,nlme::getData(model))[,-1,drop=F]
    re <- attr(model$modelStruct$reStruct[[1]],"formula")
    Z <- model.matrix(re,nlme::getData(model))[,-1,drop=F]
    muX <- colMeans(X)
    muZ <- colMeans(Z)
    vZ <- cov(Z)

    # predicted and total variance
    vc1 <- nlme::getVarCov(model)
    t0.1 <- vc1[1,1]
    t10.1 <- vc1[1,-1]
    t11.1 <- vc1[-1,-1,drop=F]
    s1 <- model$sigma^2
  }

  # calculate R2
  vyhat <- var( X %*% beta )
  vy <- vyhat + t0.1 + 2*(muZ %*% t10.1) + muZ%*%t11.1%*%muZ + sum(diag(t11.1%*%vZ)) + s1

  if(refit){
    rb1 <- 1 - s1/s0
    rb2 <- 1 - t0.1/t0.0
    sb <- 1 - (s1+t0.1)/(s0+t0.0)
  }else{
    rb1 <- rb2 <- sb <- NA
  }
  mvp <- as.vector(vyhat/vy)

  c(RB1=rb1, RB2=rb2, SB=sb, MVP=mvp)[print]
}
mitml/R/clusterMeans.R0000644000176200001440000000215412765741454014405 0ustar liggesusersclusterMeans <- function(x, cluster, adj=FALSE, group=NULL){
  # calculate cluster means

  # get objects if names are given
  isname <- c(length(x)==1, length(cluster)==1, length(group)==1) & c(is.character(x), is.character(cluster), is.character(group))
  if(any(isname)){
    parent <- parent.frame()
    if(isname[1]) x <- eval(parse(text=x),parent)
    if(isname[2]) cluster <- eval(parse(text=cluster),parent)
    if(isname[3]) group <- eval(parse(text=group),parent)
  }

  # prepare group
  if(!is.null(group)) {
    if(is.character(group)) group <- as.factor(group)
    if(is.factor(group)) group <- as.integer(group)
    ngr <- length(unique(group))
  }

  # format cluster (and groups)
  if(!is.numeric(cluster)) cluster <- as.integer(cluster)
  if(!is.null(group)) cluster <- cluster + group/(ngr+1)
  cluster <- match(cluster, unique(cluster))

  n.obs <- rowsum(as.integer(!is.na(x)), cluster)
  gm <- rowsum(x, cluster, na.rm = T)/n.obs
  gm[is.nan(gm)] <- NA
  gm <-
gm[cluster] if(adj){ n.obs <- n.obs[cluster] ((n.obs * gm) - x)/(n.obs - 1) }else{ gm } } mitml/R/summary.mitml.R0000644000176200001440000000631013321120501014522 0ustar liggesuserssummary.mitml <- function(object, n.Rhat=3, goodness.of.appr=FALSE, autocorrelation=FALSE, ...){ # summary method for objects of class "mitml" inc <- object$data ngr <- length(unique(attr(object$data,"group"))) prm <- object$par.imputation iter <- dim(prm[[1]])[3] k <- object$iter$iter isML <- attr(object$model,"is.ML") isL2 <- attr(object$model,"is.L2") # parameter chains (for backwards compatibility) if(is.null(object$keep.chains)) object$keep.chains <- "full" # percent missing mdr <- sapply(inc, FUN=function(x){mean(is.na(x))}) mdr[] <- sprintf(mdr*100,fmt="%.1f") mdr <- gsub("^0.0$","0",mdr) # convergence for imputation phase conv <- NULL Rhat <- ifelse(is.null(n.Rhat), FALSE, n.Rhat >= 2) SDprop <- goodness.of.appr ACF <- autocorrelation if(Rhat|SDprop|ACF){ conv <- c(list(beta=NULL), if(isL2) list(beta2=NULL), if(isML) list(psi=NULL), list(sigma=NULL)) for(pp in names(conv)){ ni <- dim(prm[[pp]])[1] nj <- dim(prm[[pp]])[2] nl <- dim(prm[[pp]])[4] cmat <- matrix(NA_real_, ni*nj*nl, 3+Rhat+SDprop+3*ACF) cmat[,1] <- rep(1:ni,nj*nl) cmat[,2] <- rep(1:nj,each=ni,times=nl) cmat[,3] <- rep(1:nl,each=ni*nj) colnames(cmat) <- c("i1", "i2", "grp", if(Rhat) "Rhat", if(SDprop) "SDprop", if(ACF) c("lag-1","lag-k","lag-2k")) for(ll in 1:nl){ # by group for(jj in 1:nj){ for(ii in 1:ni){ # check for redundant entries if(pp=="psi"){ if(jj > ii) next } if(pp=="sigma"){ if(jj > ((ii-1)%%nj)+1) next } ind <- ( cmat[,1]==ii & cmat[,2]==jj & cmat[,3]==ll ) chn <- matrix(prm[[pp]][ii,jj,,ll], 1, iter) # potential scale reduction (Rhat) if(Rhat) cmat[ind,"Rhat"] <- .GelmanRubin(chn,n.Rhat) # goodness of approximation if(SDprop) cmat[ind,"SDprop"] <- .SDprop(chn) # autocorrelation if(ACF){ cmat[ind,"lag-1"] <- .reducedACF(chn, lag=1, smooth=0) cmat[ind,"lag-k"] <- .reducedACF(chn, lag=k, smooth=2, sd=.5) cmat[ind,"lag-2k"] <- .reducedACF(chn, lag=2*k, smooth=2, sd=.5) } } } } conv[[pp]] <- cmat[!apply(cmat,1,function(x) any(is.na(x))),,drop=F] } attr(conv,"stats") <- c("Rhat","SDprop","ACF")[c(Rhat,SDprop,ACF)] } smr <- list( call=object$call, model=object$model, prior=object$prior, iter=object$iter, keep.chains=object$keep.chains, ngr=ngr, missing.rates=mdr, conv=conv ) class(smr) <- "mitml.summary" smr } .reducedACF <- function(x, lag, smooth=0, sd=.5){ # check NA if(all(is.na(x))) return(NA) n <- length(x) lag0 <- lag lag <- lag + (-smooth:smooth) ac <- numeric(length(lag)) y <- x - mean(x) ss.y <- sum(y^2) for(li in 1:length(lag)){ ll <- lag[li] # leave at 0 for constant value ac[li] <- if(ss.y>0) sum( y[1:(n-ll)] * y[1:(n-ll)+ll] ) / ss.y else 0 } if(smooth>0){ # weights based on normal density w <- dnorm(-smooth:smooth, 0, sd) ac <- sum( ac * (w/sum(w)) ) } ac } mitml/R/is.mitml.list.R0000644000176200001440000000044212521666537014443 0ustar liggesusersis.mitml.list <- function(x){ # checks if the argument is a list of class "mitml.list" l <- any(class(x)=="mitml.list") & is.list(x) if(!l){ return(FALSE) }else{ if(any(!sapply(x,is.data.frame))) warning("Does not appear to be a list of data frames.") return(TRUE) } } mitml/R/print.mitml.testEstimates.R0000644000176200001440000000306412765741757017062 0ustar liggesusersprint.mitml.testEstimates <- function(x,...){ # print method for MI estimates cl <- x$call est <- x$estimates vc <- x$var.comp m <- x$m adj <- x$adj.df dfc <- x$df.com # header cat("\nCall:\n", 
paste(deparse(cl)), sep="\n") cat("\nFinal parameter estimates and inferences obtained from",m,"imputed data sets.\n") cat("\n") # print results if(!is.null(est)){ val <- sprintf("%.3f",est) w <- max(sapply(c(colnames(est),val),nchar)) out <- matrix("",nrow(est)+1,ncol(est)+1) out[,1] <- format(c("",rownames(est))) out[1,-1] <- format(colnames(est),justify="right",width=w) out[-1,-1] <- format(val,justify="right",width=w) for(i in 1:nrow(out)) cat(out[i,],"\n") } if(!is.null(vc)){ if(!is.null(est)) cat("\n") val <- sprintf("%.3f",vc) w <- max(sapply(c("Estimate",val),nchar)) out <- matrix("",nrow(vc)+1,2) out[,1] <- format(c("",rownames(vc))) out[1,-1] <- format("Estimate",justify="right",width=w) out[-1,-1] <- format(val,justify="right",width=w) for(i in 1:nrow(out)) cat(out[i,],"\n") } cat(if(adj){c("\nHypothesis test adjusted for small samples with", paste("df=[",paste(dfc,collapse=","),"]\ncomplete-data degrees of freedom.",sep="")) }else{"\nUnadjusted hypothesis test as appropriate in larger samples."},"\n") cat("\n") invisible() } summary.mitml.testEstimates <- function(object,...){ # summary method for objects of class mitml.testEstimates print.mitml.testEstimates(object,...) } mitml/R/write.mitmlSAV.R0000644000176200001440000000103212765742055014556 0ustar liggesuserswrite.mitmlSAV <- function(x, filename){ # write to native SPSS format if(!"mitml"%in%class(x) & !"mitml.list"%in%class(x)) stop("'x' must be of class 'mitml' or 'mitml.list'.") if(!grepl(".sav$",filename)) filename <- paste(filename,".sav",sep="") if("mitml"%in%class(x)){ il <- mitmlComplete(x,0:x$iter$m) }else{ il <- x } for(ii in 1:length(il)){ il[[ii]] <- cbind(ii-1,il[[ii]]) colnames(il[[ii]])[1] <- "Imputation_" } out <- do.call(rbind, il) haven::write_sav(out, filename) invisible() } mitml/R/cbind.mitml.list.R0000644000176200001440000000020713011132435015062 0ustar liggesuserscbind.mitml.list <- function(...){ # merges two objects of class "mitml.list" by appending columns of list entries Map(cbind, ...) } mitml/R/print.mitml.testConstraints.R0000644000176200001440000000345313321120501017374 0ustar liggesusersprint.mitml.testConstraints <- function(x,...){ # print method for MI estimates cl <- x$call test <- x$test cons <- x$constraints mth <- x$method m <- x$m adj <- x$adj.df dfc <- x$df.com # header cat("\nCall:\n", paste(deparse(cl)), sep="\n") cat("\nHypothesis test calculated from",m,"imputed data sets. The following\nconstraints were specified:\n\n") # print constraint table est <- matrix(c(x$Qbar, sqrt(diag(x$T))), ncol=2) colnames(est) <- c("Estimate", "Std. Error") rownames(est) <- paste0(cons, ":") out <- .formatTable.helper(est) for(i in 1:nrow(out)) cat(" ", out[i,],"\n") cat("\nCombination method:",mth,"\n") # print test table fmt <- c("%.3f","%.0f","%.3f","%.3f","%.3f") fmt[test>=10^5] <- "%.3e" # large values out <- sprintf(fmt,test) # table cat("\n") w <- max(sapply(c(out,colnames(test)),nchar)) cat(" ",format(colnames(test),justify="right",width=w),"\n") cat(" ",format(out,justify="right",width=w),"\n") if(mth=="D1"){ cat(if(adj){c("\nHypothesis test adjusted for small samples with", paste("df=[",paste(dfc,collapse=","),"]\ncomplete-data degrees of freedom.",sep="")) }else{"\nUnadjusted hypothesis test as appropriate in larger samples."},"\n") } cat("\n") invisible() } summary.mitml.testConstraints <- function(object,...){ # summary method for objects of class mitml.testConstraints print.mitml.testConstraints(object,...) 
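  # (the summary method is a thin alias: it simply forwards to the print
  # method above and passes through its invisible() return value)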
} .formatTable.helper <- function(x){ f <- sprintf("%.3f",x) w <- max(sapply(c(colnames(x),f),nchar)) out <- matrix("",nrow(x)+1,ncol(x)+1) out[,1] <- format(c("",rownames(x))) out[1,-1] <- format(colnames(x),justify="right",width=w) out[-1,-1] <- format(f,justify="right",width=w) return(out) } mitml/R/within.mitml.list.R0000644000176200001440000000110413011131050015272 0ustar liggesuserswithin.mitml.list <- function(data, expr, ignore=NULL, ...){ # evaluate an expression for a list of data sets, then return altered data sets expr <- substitute(expr) parent <- parent.frame() out <- lapply(data, function(x){ e <- evalq(environment(), x, parent) eval(expr, e) l <- as.list(e) l <- l[!sapply(l, is.null)] l[ignore] <- NULL nD <- length(del <- setdiff(names(x), (nl <- names(l)))) x[nl] <- l if(nD){ x[del] <- if(nD==1){ NULL } else { vector("list", nD) } } x }) class(out) <- c("mitml.list","list") out } mitml/R/print.mitml.testModels.R0000644000176200001440000000247512712374016016332 0ustar liggesusersprint.mitml.testModels <- function(x,...){ # print method for MI estimates cl <- x$call test <- x$test mth <- x$method use <- x$use reml <- x$reml m <- x$m adj <- x$adj.df dfc <- x$df.com # header cat("\nCall:\n", paste(deparse(cl)), sep="\n") cat("\nModel comparison calculated from",m,"imputed data sets.") cat("\nCombination method:",mth, if(mth=="D2"){paste("(",use,")",sep="")},"\n") # check for large values fmt <- c("%.3f","%.0f","%.3f","%.3f","%.3f") fmt[test>=10^5] <- "%.3e" out <- sprintf(fmt,test) # print table cat("\n") w <- max(sapply(c(out,colnames(test)),nchar)) cat(" ",format(colnames(test),justify="right",width=w),"\n") cat(" ",format(out,justify="right",width=w),"\n") if(mth=="D1"){ cat(if(adj){c("\nHypothesis test adjusted for small samples with", paste("df=[",paste(dfc,collapse=","),"]\ncomplete-data degrees of freedom.",sep="")) }else{"\nUnadjusted hypothesis test as appropriate in larger samples."},"\n") } if(reml){ cat("\nModels originally fit with REML were automatically refit using ML.\n") } cat("\n") invisible() } summary.mitml.testModels <- function(object,...){ # summary method for objects of class mitml.testModels print.mitml.testModels(object,...) } mitml/R/read.mitml.R0000644000176200001440000000026212477323026013762 0ustar liggesusersread.mitml <- function(filename){ # read mitml objects from file env <- new.env(parent=parent.frame()) load(filename, env) obj <- ls(env) eval(parse(text=obj), env) } mitml/R/jomo2mitml.list.R0000644000176200001440000000020212705456540014765 0ustar liggesusersjomo2mitml.list <- function(x){ # convert jomo imputations to mitml.list long2mitml.list(x, split="Imputation", exclude=0) } mitml/R/c.mitml.list.R0000644000176200001440000000023213011132447014226 0ustar liggesusersc.mitml.list <- function(...){ # merges two objects of class "mitml.list" by appending list entries as.mitml.list(unlist(list(...), recursive=FALSE)) } mitml/R/zzz.R0000644000176200001440000000025413047041061012552 0ustar liggesusers.onAttach <- function(libname, pkgname) { packageStartupMessage("*** This is beta software. Please report any bugs!\n*** See the NEWS file for recent changes.") } mitml/R/rbind.mitml.list.R0000644000176200001440000000020413011132464015100 0ustar liggesusersrbind.mitml.list <- function(...){ # merges two objects of class "mitml.list" by appending rows of list entries Map(rbind, ...) 
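  # (note: like cbind.mitml.list(), this returns the plain list produced by
  # Map(); as.mitml.list() can be applied to the result if the 'mitml.list'
  # class attribute is needed)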
} mitml/R/as.mitml.list.R0000644000176200001440000000051512671754704014434 0ustar liggesusersas.mitml.list <- function(x){ # adds a class attribute "mitml.list" to its argument if(!is.list(x)) stop("Argument must be a 'list'.") if(any(!sapply(x,is.data.frame))){ x <- lapply(x,as.data.frame) cat("Note: List entries were converted to class 'data.frame'.\n") } class(x) <- c("mitml.list",class(x)) x } mitml/R/sort.mitml.list.R0000644000176200001440000000056513057752622015021 0ustar liggesuserssort.mitml.list <- function(x, decreasing=FALSE, by, ...){ # sort list of multiply imputed data sets expr <- substitute(by) args0 <- list(decreasing=decreasing, ...) res <- lapply(x, function(i){ args <- eval(expr,i,parent.frame()) if(!is.list(args)) args <- list(args) ind <- do.call("order", c(args,args0)) i[ind,] }) as.mitml.list(res) } mitml/R/internal-zzz.R0000644000176200001440000000136513321120501014360 0ustar liggesusers.extractMatrix <- function(x, ...){ # extract submatrix from array (indexed by ...) if(is.null(dim(x))) return(x) out <- `[`(x,,,...) dim(out) <- dim(x)[1:2] dimnames(out) <- dimnames(x)[1:2] out } .adiag <- function(x, stacked=FALSE){ # extract diagonal elements of first two dimensions in three-dimensional array # containing either square (default) or stacked matrices d <- dim(x) # indices for diagonal entries (square or stacked-square) if(stacked){ i <- seq_len(d[2]) + d[1]*(seq_len(d[2])-1) i <- outer(i,(seq_len(d[1]/d[2])-1)*d[2],`+`) i <- outer(i,(seq_len(d[3])-1)*d[1]*d[2],`+`) }else{ i <- seq_len(d[1]) + d[1]*(seq_len(d[1])-1) i <- outer(i,(seq_len(d[3])-1)*d[1]^2,`+`) } x[as.vector(i)] } mitml/vignettes/0000755000176200001440000000000013413110673013403 5ustar liggesusersmitml/vignettes/Introduction.Rmd0000644000176200001440000001736313321347222016542 0ustar liggesusers--- title: "Introduction" output: rmarkdown::html_vignette: css: "css/vignette.css" vignette: > %\VignetteEngine{knitr::rmarkdown} %\VignetteIndexEntry{Introduction} %\VignetteEncoding{UTF-8} --- --- ```{r setup, include=FALSE, cache=FALSE} library(knitr) set.seed(123) options(width=87) opts_chunk$set(background="#ffffff", comment="#", collapse=FALSE, fig.width=9, fig.height=9, warning=FALSE, message=FALSE) ``` This vignette is intended to provide a first introduction to the R package `mitml` for generating and analyzing multiple imputations for multilevel missing data. A usual application of the package may consist of the following steps. 1. Imputation 2. Assessment of convergence 3. Completion of the data 4. Analysis 5. Pooling The `mitml` package offers a set of tools to facilitate each of these steps. This vignette is intended as a step-by-step illustration of the basic features of `mitml`. Further information can be found in the other [vignettes](https://github.com/simongrund1/mitml/wiki) and the package [documentation](https://cran.r-project.org/package=mitml/mitml.pdf). ## Example data For the purposes of this vignette, we employ a simple example that makes use of the `studentratings` data set, which is provided with `mitml`. To use it, the `mitml` package and the data set must be loaded as follows. ```{r} library(mitml) data(studentratings) ``` More information about the variables in the data set can be obtained from its `summary`. ```{r} summary(studentratings) ``` In addition, the correlations between variables (based on pairwise observations) may be useful for identifying possible sources of information that may be used during the treatment of missing data. 
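For reference, the matrix shown below can be obtained with a single call to base R; the same computation is run (with the code hidden) in the next chunk:

```{r, eval=FALSE}
round(cor(studentratings[,-(1:3)], use="pairwise"), 3)
```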
```{r, echo=FALSE} round(cor(studentratings[,-(1:3)], use="pairwise"),3) ``` This illustrates that (a) most variables in the data set are affected by missing data, but also (b) that substantial relations exist between variables. For simplicity, we focus on only a subset of these variables. ## Model of interest For the present example, we focus on the two variables `ReadDis` (disciplinary problems in reading class) and `ReadAchiev` (reading achievement). Assume we are interested in the relation between these variables. Specifically, we may be interested in the following analysis model $$ \mathit{ReadAchiev}_{ij} = \gamma_{00} + \gamma_{10} \mathit{ReadDis}_{ij} + u_{0j} + e_{ij} $$ On the basis of the syntax used in the R package `lme4`, this model may be written as follows. ```{r, results="hide"} ReadAchiev ~ 1 + ReadDis + (1|ID) ``` In this model, the relation between `ReadDis` and `ReadAchiev` is represented by a single fixed effect of `ReadDis`, and a random intercept is included to account for the clustered structure of the data and the group-level variance in `ReadAchiev` that is not explained by `ReadDis`. ## Generating imputations The `mitml` package includes wrapper functions for the R packages `pan` (`panImpute`) and `jomo` (`jomoImpute`). Here, we will use the first option. To generate imputations with `panImpute`, the user must specify (at least): 1. an imputation model 2. the number of iterations and imputations The easiest way of specifying the imputation model is to use the `formula` argument of `panImpute`. Generally speaking, the imputation model should include all variables that are either (a) part of the model of interest, (b) related to the variables in the model, or (c) related to whether the variables in the model are missing. In this simple example, we include only `ReadDis` and `ReadAchiev` as the main target variables and `SchClimate` as an auxiliary variable. ```{r} fml <- ReadAchiev + ReadDis + SchClimate ~ 1 + (1|ID) ``` Note that, in this specification of the imputation model. all variables are included on the left-hand side of the model, whereas the right-hand side is left "empty". This model allows for all relations between variables at Level 1 and 2 and is thus suitable for most applications of the multilevel random intercept model (for further discussion, see also Grund, Lüdtke, & Robitzsch, 2016, in press). The imputation procedure is then run for 5,000 iterations (burn-in), after which 100 imputations are drawn every 100 iterations. ```{r, results="hide"} imp <- panImpute(studentratings, formula=fml, n.burn=5000, n.iter=100, m=100) ``` This step may take a few seconds. Once the process is completed, the imputations are saved in the `imp` object. ## Assessing convergence In `mitml`, there are two options for assessing the convergence of the imputation procedure. First, the `summary` calculates the "potential scale reduction factor" ($\hat{R}$) for each parameter in the imputation model. If this value is noticeably larger than 1 for some parameters (say $>1.05$), a longer burn-in period may be required. ```{r} summary(imp) ``` Second, diagnostic plots can be requested with the `plot` function. These plots consist of a trace plot, an autocorrelation plot, and some additional information about the posterior distribution. Convergence can be assumed if the trace plot is stationary (i.e., does not "drift"), and the autocorrelation is within reasonable bounds for the chosen number of iterations between imputations. 
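The `summary` function also accepts optional arguments for additional diagnostics (see `?summary.mitml`). For example, the following call (an optional sketch, not run here) additionally requests autocorrelation estimates for each parameter:

```{r, eval=FALSE}
summary(imp, autocorrelation=TRUE)
```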
For this example, we examine only the plot for the parameter `Beta[1,2]` (i.e., the intercept of `ReadDis`). ```{r conv, echo=FALSE} plot(imp, trace="all", print="beta", pos=c(1,2), export="png", dev.args=list(width=720, height=380, pointsize=16)) ``` ```{r, eval=FALSE} plot(imp, trace="all", print="beta", pos=c(1,2)) ``` ![](mitmlPlots/BETA_ReadDis_ON_Intercept.png) Taken together, both $\hat{R}$ and the diagnostic plots indicate that the imputation model converged, setting the basis for the analysis of the imputed data sets. ## Completing the data In order to work with and analyze the imputed data sets, the data sets must be completed with the imputations generated in the previous steps. To do so, `mitml` provides the function `mitmlComplete`. ```{r} implist <- mitmlComplete(imp, "all") ``` This resulting object is a list that contains the 100 completed data sets. ## Analysis and pooling In order to obtain estimates for the model of interest, the model must be fit separately to each of the completed data sets, and the results must be pooled into a final set of estimates and inferences. The `mitml` package offers the `with` function to fit various statistical models to a list of completed data sets. In this example, we use the `lmer` function from the R package `lme4` to fit the model of interest. ```{r, message=FALSE} library(lme4) fit <- with(implist, lmer(ReadAchiev ~ 1 + ReadDis + (1|ID))) ``` The resulting object is a list containing the 100 fitted models. To pool the results of these models into a set of final estimates and inferences, `mitml` offers the `testEstimates` function. ```{r} testEstimates(fit, var.comp=TRUE) ``` The estimates can be interpreted in a manner similar to the estimates from the corresponding complete-data procedure. In addition, the output includes diagnostic quantities such as the fraction of missing information (FMI), which can be helpful for interpreting the results and understanding problems with the imputation procedure. ###### References Grund, S., Lüdtke, O., & Robitzsch, A. (2016). Multiple imputation of multilevel missing data: An introduction to the R package pan. *SAGE Open*, *6*(4), 1–17. doi: 10.1177/2158244016668220 ([Link](https://doi.org/10.1177/2158244016668220)) Grund, S., Lüdtke, O., & Robitzsch, A. (in press). Multiple imputation of missing data for multilevel models: Simulations and recommendations. *Organizational Research Methods*. doi: 10.1177/1094428117703686 ([Link](https://doi.org/10.1177/1094428117703686)) --- ```{r, echo=F} cat("Author: Simon Grund (grund@ipn.uni-kiel.de)\nDate: ", as.character(Sys.Date())) ``` mitml/vignettes/Analysis.Rmd0000644000176200001440000002274613321375504015651 0ustar liggesusers--- title: "Analysis of Multiply Imputed Data Sets" output: rmarkdown::html_vignette: css: "css/vignette.css" vignette: > %\VignetteEngine{knitr::rmarkdown} %\VignetteIndexEntry{Analysis of multiply imputed data sets} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE, cache=FALSE} library(knitr) set.seed(123) options(width=87) opts_chunk$set(background="#ffffff", comment="#", collapse=FALSE, fig.width=9, fig.height=9, warning=FALSE, message=FALSE) ``` This vignette is intended to provide an overview of the analysis of multiply imputed data sets with `mitml`. Specifically, this vignette addresses the following topics: 1. Working with multiply imputed data sets 2. Rubin's rules for pooling individual parameters 3. Model comparisons 4. 
Parameter constraints

Further information can be found in the other [vignettes](https://github.com/simongrund1/mitml/wiki) and the package [documentation](https://cran.r-project.org/package=mitml/mitml.pdf).

## Example data (`studentratings`)

For the purposes of this vignette, we make use of the `studentratings` data set, which contains simulated data from 750 students in 50 schools including scores on reading and math achievement, socioeconomic status (SES), and ratings on school and classroom environment.

The package and the data set can be loaded as follows.

```{r}
library(mitml)
library(lme4)
data(studentratings)
```

As evident from its `summary`, most variables in the data set contain missing values.

```{r}
summary(studentratings)
```

In the present example, we investigate the differences in mathematics achievement that can be attributed to differences in SES when controlling for students' sex. Specifically, we are interested in the following model.

$$
\mathit{MA}_{ij} = \gamma_{00} + \gamma_{10} \mathit{Sex}_{ij} + \gamma_{20} (\mathit{SES}_{ij}-\overline{\mathit{SES}}_{\bullet j}) + \gamma_{01} \overline{\mathit{SES}}_{\bullet j} + u_{0j} + e_{ij}
$$

Note that this model also employs group-mean centering to separate the individual and group-level effects of SES.

## Generating imputations

In the present example, we generate 20 imputations from the following imputation model.

```{r, results="hide"}
fml <- ReadDis + SES ~ 1 + Sex + (1|ID)
imp <- panImpute(studentratings, formula=fml, n.burn=5000, n.iter=200, m=20)
```

The completed data are then extracted with `mitmlComplete`.

```{r}
implist <- mitmlComplete(imp, "all")
```

## Transforming the imputed data sets

In empirical research, the raw data rarely enter the analyses directly but often need to be transformed beforehand. For this purpose, the `mitml` package provides the `within` function, which applies a given transformation directly to each data set. In the following, we use this to (a) calculate the group means of SES and (b) center the individual scores around their group means.

```{r}
implist <- within(implist,{
  G.SES <- clusterMeans(SES,ID) # calculate group means
  I.SES <- SES - G.SES          # center around group means
})
```

This method can be used to apply arbitrary transformations to all of the completed data sets simultaneously.

> **Note regarding** `dplyr`**:**
> Due to how it is implemented, `within` cannot be used directly with `dplyr`.
> Instead, users may use `with` in place of `within`, with the following workaround.
>```{r, eval=FALSE}
implist <- with(implist,{
  df <- data.frame(as.list(environment()))
  df <- ... # dplyr commands
  df
})
implist <- as.mitml.list(implist)
```
> Advanced users may also consider using `lapply` for a similar workaround.

## Fitting the analysis model

To analyze the imputed data, each data set is analyzed using regular complete-data techniques. For this purpose, `mitml` offers the `with` function. In the present example, we use it to fit the model of interest with the R package `lme4`.

```{r}
fit <- with(implist,{
  lmer(MathAchiev ~ 1 + Sex + I.SES + G.SES + (1|ID))
})
```

This results in a list of fitted models, one for each of the imputed data sets.

## Pooling

The results obtained from the imputed data sets must be pooled in order to obtain a set of final parameter estimates and inferences.
In the following, we employ a number of different pooling methods that can be used to address common statistical tasks, for example, for (a) estimating and testing individual parameters, (b) model comparisons, and (c) tests of constraints about one or several parameters. #### Parameter estimates Individual parameters are commonly pooled with the rules developed by Rubin (1987). In `mitml`, Rubin's rules are implemented in the `testEstimates` function. ```{r} testEstimates(fit) ``` In addition, the argument `var.comp=TRUE` can be used to obtain pooled estimates of variance components, and `df.com` can be used to specify the complete-data degrees of freedom, which provides more appropriate (i.e., conservative) inferences in smaller samples. For example, using a conservative value for the complete-data degrees of freedom for the fixed effects in the model of interest (Snijders & Bosker, 2012), the output changes as follows. ```{r} testEstimates(fit, var.comp=TRUE, df.com=46) ``` #### Multiple parameters and model comparisons Oftentimes, statistical inference concerns more than one parameter at a time. For example, the combined influence of SES (within and between groups) on mathematics achievement is represented by two parameters in the model of interest. Multiple pooling methods for Wald and likelihood ratio tests (LRTs) are implemented in the `testModels` function. This function requires the specification of a full model and a restricted model, which are then compared using (pooled) Wald tests or LRTs. Specifically, `testModels` allows users to pool Wald tests ($D_1$), $\chi^2$ test statistics ($D_2$), and LRTs ($D_3$; for a comparison of these methods, see also Grund, Lüdtke, & Robitzsch, 2016b). To examine the combined influence of SES on mathematics achievement, the following restricted model can be specified and compared with the model of interest (using $D_1$). ```{r} fit.null <- with(implist,{ lmer(MathAchiev ~ 1 + Sex + (1|ID)) }) testModels(fit, fit.null) ``` > **Note regarding the order of arguments:** > Please note that `testModels` expects that the first argument contains the full model, and the second argument contains the restricted model. > If the order of the arguments is reversed, the results will not be interpretable. Similar to the test for individual parameters, smaller samples can be accommodated with `testModels` (with method $D_1$) by specifying the complete-data degrees of freedom for the denominator of the $F$ statistic. ```{r} testModels(fit, fit.null, df.com=46) ``` The pooling method used by `testModels` is determined by the `method` argument. For example, to calculate the pooled LRT corresponding to the Wald test above (i.e., $D_3$), the following command can be issued. ```{r} testModels(fit, fit.null, method="D3") ``` #### Constraints on parameters Finally, it is often useful to investigate functions (or constraints) of the parameters in the model of interest. In complete data sets, this can be achieved with a test of linear hypotheses or the delta method. The `mitml` package implements a pooled version of the delta method in the `testConstraints` function. For example, the combined influence of SES on mathematics achievement can also be tested without model comparisons by testing the constraint that the parameters pertaining to `I.SES` and `G.SES` are both zero. This constraint is defined and tested as follows. ```{r} c1 <- c("I.SES", "G.SES") testConstraints(fit, constraints=c1) ``` This test is identical to the Wald test given in the previous section. 
Arbitrary constraints on the parameters can be specified and tested in this manner, where each character string denotes an expression to be tested against zero.

In the present example, we are also interested in the *contextual* effect of SES on mathematics achievement (e.g., Snijders & Bosker, 2012). The contextual effect is simply the difference between the coefficients pertaining to `G.SES` and `I.SES` and can be tested as follows.

```{r}
c2 <- c("G.SES - I.SES")
testConstraints(fit, constraints=c2)
```

Similar to model comparisons, constraints can be tested with different methods ($D_1$ and $D_2$) and can accommodate smaller samples by specifying a value for `df.com`. Further examples for the analysis of multiply imputed data sets with `mitml` are given by Enders (2016) and Grund, Lüdtke, and Robitzsch (2016a).

###### References

Enders, C. K. (2016). Multiple imputation as a flexible tool for missing data handling in clinical research. *Behaviour Research and Therapy*. doi: 10.1016/j.brat.2016.11.008 ([Link](https://doi.org/10.1016/j.brat.2016.11.008))

Grund, S., Lüdtke, O., & Robitzsch, A. (2016a). Multiple imputation of multilevel missing data: An introduction to the R package pan. *SAGE Open*, *6*(4), 1–17. doi: 10.1177/2158244016668220 ([Link](https://doi.org/10.1177/2158244016668220))

Grund, S., Lüdtke, O., & Robitzsch, A. (2016b). Pooling ANOVA results from multiply imputed datasets: A simulation study. *Methodology*, *12*, 75–88. doi: 10.1027/1614-2241/a000111 ([Link](https://doi.org/10.1027/1614-2241/a000111))

Rubin, D. B. (1987). *Multiple imputation for nonresponse in surveys*. Hoboken, NJ: Wiley.

Snijders, T. A. B., & Bosker, R. J. (2012). *Multilevel analysis: An introduction to basic and advanced multilevel modeling*. Thousand Oaks, CA: Sage.

---

```{r, echo=F}
cat("Author: Simon Grund (grund@ipn.uni-kiel.de)\nDate: ", as.character(Sys.Date()))
```
mitml/vignettes/css/0000755000176200001440000000000013321365165014201 5ustar liggesusersmitml/vignettes/css/vignette.css0000644000176200001440000000741513321364322016541 0ustar liggesusersbody {
  background-color: #fff;
  max-width: 720px;
  overflow: visible;
  padding-left: 2em;
  padding-right: 2em;
  font-family: "Helvetica Neue", Helvetica, Arial, sans-serif;
  font-size: 16px;
  font-weight: 500;
  line-height: 1.65;
  text-align: justify;
  text-justify: inter-word;
  margin: 2em auto;
}
#header { text-align: center; }
#TOC {
  clear: both;
  margin: 0 0 10px 10px;
  padding: 4px;
  width: 400px;
  border: 1px solid #CCCCCC;
  border-radius: 5px;
  background-color: #f6f6f6;
  font-size: 16px;
  line-height: 1.5;
  text-align: left;
}
#TOC .toctitle { font-weight: bold; font-size: 18px; margin-left: 5px; }
#TOC ul { padding-left: 40px; margin-left: -1.5em; margin-top: 5px; margin-bottom: 5px; }
#TOC ul ul { margin-left: -2em; }
#TOC li { line-height: 16px; }
p { margin: 0.6em 0; }
blockquote { border-left:3px dotted #e5e5e5; background-color: #fff; padding: 0 1em; margin: 0.9em 0; }
hr { border-style: solid; border: none; border-top: 1px solid #777; margin: 28px 0; }
dl { margin-left: 0; }
dl dd { margin-bottom: 13px; margin-left: 13px; }
dl dt { font-weight: bold; }
ul, ol { text-align: left; }
ul { margin-top: 0; }
ul li { list-style: circle outside; }
ul ul { margin-bottom: 0; }
pre, code {
  background-color: #f7f7f7;
  line-height: 1.2;
  border-radius: 3px;
  color: #333;
  padding: 0px;
  white-space: pre; /* or: pre-wrap */
  overflow-x: auto;
}
pre { border-radius: 3px; margin: 5px 0px 10px 0px; padding: 10px; }
code { font-family: Consolas, monospace; font-size: 85%; }
p > code,
li > code { padding: 2px 2px; } h1, h2, h3, h4, h5, h6 { text-align: left; line-height: 1.2; } h1 { font-size: 2em; font-weight: 600; } h2 { color: #191919; font-size: 1.5em; font-weight: 600; } h3, h4, h5 { color: #292929; font-weight: 600; } /* Reference list */ h6 { color:#191919; font-size: 1.5em; font-weight: 600; margin-top: 0.83em; margin-bottom: 0.83em; } h6 ~ p { text-align: left; } a { color: #777; text-decoration: none; } a:hover { color: #aaa; text-decoration: underline; } /* a:visited { color: #777; } a:visited:hover { color: #aaa; text-decoration: underline; } */ /* tables */ table, table th, table td { border-left-style: none; border-right-style: none; } table { margin-top: 25px; margin-bottom: 25px; margin-left: auto; margin-right: auto; border-collapse: collapse; border-spacing: 0; } th { padding:5px 10px; border: 1px solid #b2b2b2; } td { padding:5px 10px; border: 1px solid #e5e5e5; } dt { color:#444; font-weight:500; } th { color:#444; } table thead, table tr.even { background-color: #f7f7f7; } /* images */ img { display: block; margin-left: auto; margin-right: auto; max-width:100%; } div.figure { text-align: center; } /* hovering behavior for images (e.g., play/pause GIFs) */ .gif_play, #gif:hover .gif_stop{ display:none } .gif_stop, #gif:hover .gif_play{ display:block } /* code highlighting */ pre code { color: #707070; } /* General Code w/o Class */ pre code.r { color: #333333; } /* General Code */ code span.kw { color: #558200; font-weight: normal; } /* Keyword */ code span.co { color: #707070; font-style: normal; } /* Comment */ code span.dt { color: #333333; } /* Data Type */ code span.fu { color: #558200; } /* Function calls */ code span.dv { color: #007878; } /* Decimal Values */ code span.bn { color: #007878; } /* Base N */ code span.fl { color: #007878; } /* Float */ code span.ch { color: #985b00; } /* Character */ code span.st { color: #985b00; } /* String */ code span.ot { color: #007878; } /* Other Token */ code span.al { color: #a61717; font-weight: bold; } /* Alert Token */ code span.er { color: #a61717; background-color: #e3d2d2; } /* Error Token */ mitml/vignettes/Level2.Rmd0000644000176200001440000001441013321350422015174 0ustar liggesusers--- title: "Imputation of Missing Data at Level 2" output: rmarkdown::html_vignette: css: "css/vignette.css" vignette: > %\VignetteEngine{knitr::rmarkdown} %\VignetteIndexEntry{Imputation of missing data at level 2} %\VignetteEncoding{UTF-8} --- --- ```{r setup, include=FALSE, cache=FALSE} library(knitr) set.seed(123) options(width=87) opts_chunk$set(background="#ffffff", comment="#", collapse=FALSE, fig.width=9, fig.height=9, warning=FALSE, message=FALSE) ``` This vignette illustrates the use of `mitml` for the treatment of missing data at Level 2. Specifically, the vignette addresses the following topics: 1. Specification of the two-level imputation model for missing data at both Level 1 and 2 2. Running the imputation procedure Further information can be found in the other [vignettes](https://github.com/simongrund1/mitml/wiki) and the package [documentation](https://cran.r-project.org/package=mitml/mitml.pdf). ## Example data For purposes of this vignette, we make use of the `leadership` data set, which contains simulated data from 750 employees in 50 groups including ratings on job satisfaction, leadership style, and work load (Level 1) as well as cohesion (Level 2). The package and the data set can be loaded as follows. 
```{r} library(mitml) data(leadership) ``` In the `summary` of the data, it becomes visible that all variables are affected by missing data. ```{r} summary(leadership) ``` The following data segment illustrates this fact, including cases with missing data at Level 1 (e.g., job satisfaction) and 2 (e.g., cohesion). ```{r, echo=FALSE} leadership[73:78,] ``` In the following, we will employ a two-level model to address missing data at both levels simultaneously. ## Specifying the imputation model The specification of the two-level model, involves two components, one pertaining to the variables at each level of the sample (Goldstein, Carpenter, Kenward, & Levin, 2009; for further discussion, see also Enders, Mister, & Keller, 2016; Grund, Lüdtke, & Robitzsch, in press). Specifically, the imputation model is specified as a list with two components, where the first component denotes the model for the variables at Level 1, and the second component denotes the model for the variables at Level 2. For example, using the `formula` interface, an imputation model targeting all variables in the data set can be written as follows. ```{r} fml <- list( JOBSAT + NEGLEAD + WLOAD ~ 1 + (1|GRPID) , # Level 1 COHES ~ 1 ) # Level 2 ``` The first component of this list includes the three target variables at Level 1 and a fixed (`1`) as well as a random intercept (`1|GRPID`). The second component includes the target variable at Level 2 with a fixed intercept (`1`). From a statistical point of view, this specification corresponds to the following model $$ \begin{aligned} \mathbf{y}_{1ij} &= \boldsymbol\mu_{1} + \mathbf{u}_{1j} + \mathbf{e}_{ij} \\ \mathbf{y}_{2j} &= \boldsymbol\mu_{2} + \mathbf{u}_{1j} \; , \end{aligned} $$ where $\mathbf{y}_{1ij}$ denotes the target variables at Level 1, $\mathbf{y}_{2j}$ the target variables at Level 2, and the right-hand side of the model contains the fixed effects, random effects, and residual terms as mentioned above. Note that, even though the two components of the model appear to be separated, they define a single (joint) model for all target variables at both Level 1 and 2. Specifically, this model employs a two-level covariance structure, which allows for relations between variables at both Level 1 (i.e., correlated residuals at Level 1) and 2 (i.e., correlated random effects residuals at Level 2). ## Generating imputations Because the data contain missing values at both levels, imputations will be generated with `jomoImpute` (and not `panImpute`). Except for the specification of the two-level model, the syntax is the same as in applications with missing data only at Level 1. Here, we will run 5,000 burn-in iterations and generate 20 imputations, each 250 iterations apart. ```{r, results="hide"} imp <- jomoImpute(leadership, formula=fml, n.burn=5000, n.iter=250, m=20) ``` By looking at the `summary`, we can then review the imputation procedure and verify that the imputation model converged. ```{r} summary(imp) ``` Due to the greater complexity of the two-level model, the output includes more information than in applications with missing data only at Level 1. For example, the output features the model specification for variables at both Level 1 and 2. Furthermore, it provides convergence statistics for the additional regression coefficients for the target variables at Level 2 (i.e., `Beta2`). Finally, it also becomes visible that the two-level model indeed allows for relations between target variables at Level 1 and 2. 
This can be seen from the fact that the potential scale reduction factor ($\hat{R}$) for the covariance matrix at Level 2 (`Psi`) was largest for `Psi[4,3]`, which is the covariance between cohesion and the random intercept of work load. ## Completing the data The completed data sets can then be extracted with `mitmlComplete`. ```{r} implist <- mitmlComplete(imp, "all") ``` When inspecting the completed data, it is easy to verify that the imputations for variables at Level 2 are constant within groups as intended, thus preserving the two-level structure of the data. ```{r, echo=FALSE} implist[[1]][73:78,] ``` ###### References Enders, C. K., Mistler, S. A., & Keller, B. T. (2016). Multilevel multiple imputation: A review and evaluation of joint modeling and chained equations imputation. *Psychological Methods*, *21*, 222–240. doi: 10.1037/met0000063 ([Link](https://doi.org/10.1037/met0000063)) Goldstein, H., Carpenter, J. R., Kenward, M. G., & Levin, K. A. (2009). Multilevel models with multivariate mixed response types. *Statistical Modelling*, *9*, 173–197. doi: 10.1177/1471082X0800900301 ([Link](https://doi.org/10.1177/1471082X0800900301)) Grund, S., Lüdtke, O., & Robitzsch, A. (in press). Multiple imputation of missing data for multilevel models: Simulations and recommendations. *Organizational Research Methods*. doi: 10.1177/1094428117703686 ([Link](https://doi.org/10.1177/1094428117703686)) --- ```{r, echo=F} cat("Author: Simon Grund (grund@ipn.uni-kiel.de)\nDate: ", as.character(Sys.Date())) ``` mitml/README.md0000644000176200001440000000360613321120733012653 0ustar liggesusers# mitml #### Tools for multiple imputation in multilevel modeling This [R](https://www.r-project.org/) package provides tools for multiple imputation of missing data in multilevel modeling. It includes a user-friendly interface to the packages `pan` and `jomo`, and several functions for visualization, data management, and the analysis of multiply imputed data sets. The purpose of `mitml` is to provide users with a set of effective and user-friendly tools for multiple imputation of multilevel data without requiring advanced knowledge of its statistical underpinnings. Examples and additional information can be found in the official [documentation](https://cran.r-project.org/package=mitml/mitml.pdf) of the package and in the [Wiki](https://github.com/simongrund1/mitml/wiki) pages on GitHub. If you use `mitml` and have suggestions for improvement, please email me (see [here](https://cran.r-project.org/package=mitml)) or file an [issue](https://github.com/simongrund1/mitml/issues) at the GitHub repository. #### CRAN version The official version of `mitml` is hosted on CRAN and may be found [here](https://cran.r-project.org/package=mitml). The CRAN version can be installed from within R using: ```r install.packages("mitml") ``` [![CRAN release](http://www.r-pkg.org/badges/version/mitml)](https://cran.r-project.org/package=mitml) [![CRAN downloads](http://cranlogs.r-pkg.org/badges/mitml)](https://cran.r-project.org/package=mitml) #### GitHub version The version hosted here is the development version of `mitml`, allowing better tracking of [issues](https://github.com/simongrund1/mitml/issues) and possibly containing features and changes in advance. 
The GitHub version can be installed using `devtools` as: ```r install.packages("devtools") devtools::install_github("simongrund1/mitml") ``` ![Github commits](https://img.shields.io/github/commits-since/simongrund1/mitml/latest.svg?colorB=green) mitml/MD50000644000176200001440000001176513414712773011727 0ustar liggesusers636eba5ade03b6931455d8c8cd093ba3 *DESCRIPTION 08f30f5a35bfe7d786631a2f667f2abb *NAMESPACE e3d5d9eace1eb37ba70b55c729775ce9 *NEWS befa2cc2a4f4c48c7b6aef2319fceb64 *R/amelia2mitml.list.R d15fbc76a31015bd8baf0e670052bb87 *R/anova.mitml.result.R 90c02200226e1c8c3c019c968347494f *R/as.mitml.list.R acea5820a61330ec3536cefe78f7b7f1 *R/c.mitml.list.R 58b8f41ea2e94fa9f6df8ff8452ae7cb *R/cbind.mitml.list.R a83261a9af057203cd4927d5ca4ef979 *R/clusterMeans.R 63dab6af85b618cd84b4535d81293ec9 *R/confint.mitml.testEstimates.R d4c7a80fe01a93e76f7e8c24aad551a4 *R/internal-convergence.R 7826f5fd65c1f87bb145808ba5c8ff02 *R/internal-getCOEF.R f5749ff084be5df16f8cb2ec3b175208 *R/internal-getLR.R 4826e3ff163120ac0bdd15c4f284b871 *R/internal-getVC.R 8e021793e4afb249bea585e923b8a834 *R/internal-model.R b7b858a97caf6c3945a773522aef6c80 *R/internal-zzz.R 219b121125e206ae606e274007cabeec *R/is.mitml.list.R 0f9b694c9816ba683b08dbfd43c5243f *R/jomo2mitml.list.R c9050bd5bf568a2854e889fb9f520da0 *R/jomoImpute.R b6f1eef2a49b9c7228323a80d3433f30 *R/long2mitml.list.R 0196b69d662cbd876e153b61b1367213 *R/mids2mitml.list.R d9c95097f672f7de6eaee9528fbf42fd *R/mitmlComplete.R 852419166afeb5b9cdea0981d505234f *R/multilevelR2.R 973eb505a5cb86425a677c5c2c7495df *R/panImpute.R b4d9d9bab1e37dcecc9edc3b4a4dcbf4 *R/plot.mitml.R 6872a84ceb05880b3809b81e8ece1301 *R/print.mitml.R 9fe22528311a6fadfdb17ee1ba243ead *R/print.mitml.anova.R feca082fb7051ef2ed88f717d03b7e7a *R/print.mitml.summary.R 70ac92f0d8a7a8f536cbf773fa9274b8 *R/print.mitml.testConstraints.R 1d8062421ebb191ce8c36a074ec04895 *R/print.mitml.testEstimates.R 3f86842b2e0bcfca0d0b0ffffa2a6d48 *R/print.mitml.testModels.R 0fb3e85de5c36bfea7744753486da854 *R/rbind.mitml.list.R 5cac45da0f789c172de7ea7d2aeda893 *R/read.mitml.R 2c453f0bc9bd6f77369f3849a0781d8b *R/sort.mitml.list.R ffc9e3fd876907094c5023d314ec370b *R/subset.mitml.list.R 430507a2473fa0ebc361d47baca904aa *R/summary.mitml.R 710f2d419ef4f48a106e6d985c9a8bf0 *R/testConstraints.R f4efb7e3790fa6872f5e49b684be9957 *R/testEstimates.R 66efc9bafffdcd3236c1aff5b3db43e4 *R/testModels.R 6d8afc2556102104300f1c60283830f3 *R/with.mitml.list.R 334cb0d12f9d35ca2eb078184681ffad *R/within.mitml.list.R 404db79ac5c244bc0ef6708323a9e0d7 *R/write.mitml.R 6655ebe8753a2915da6b68109ce1657f *R/write.mitmlMplus.R f12c0bc5d27eac356983d9121de1c853 *R/write.mitmlSAV.R d1656c98171c84fdba9e0f7c15286fbf *R/write.mitmlSPSS.R 4ce345668716e8a375f0e88dd0463e03 *R/zzz.R 1e859c273f3e2c8873a285ee19fb60a2 *README.md 37100bc51ab24c5fd39099f1632d155e *build/vignette.rds 7333385f16ef213729fcaae2bb48a5f2 *data/justice.rda 85f879a82e8215f2709c0d1e917b1a0d *data/leadership.rda 1856395651bfbea14af44ce56f1fa37e *data/studentratings.rda f1b93c3f871477dc74f119be95b9df75 *inst/doc/Analysis.R 5a62c29d219190993669a816dcd5a2da *inst/doc/Analysis.Rmd 8ec97b30d32dc5bbe324997891d4f2e0 *inst/doc/Analysis.html b437deb1fa4a9ea0b8539f25f40d5d4f *inst/doc/Introduction.R 4aaf362cd780906c9d2c20c98e9289bb *inst/doc/Introduction.Rmd f031224affcbe2758f816558d6f753c0 *inst/doc/Introduction.html 65358008c959ed1b31eb6c733bf29092 *inst/doc/Level2.R 450ba6512f382f358f13c4f3a5f9c641 *inst/doc/Level2.Rmd 60b63aaccfd761dbd9b3d4d8c5eb4245 *inst/doc/Level2.html 
77b07ab2f9e2acc2acd0402874977edb *man/amelia2mitml.list.Rd 6224063580545fe07c4b6190a0f2755c *man/anova.mitml.result.Rd bfadab70bc0cd5dcd1a84206c0304a1e *man/as.mitml.list.Rd 3811df5dd83071ce076d658d4bf2c5c9 *man/c.mitml.list.Rd bd74202bf80cf7573e53fec37a9e934c *man/clusterMeans.Rd 2194c12808d083550a51afa5b5c0bebe *man/confint.mitml.testEstimates.Rd dce2b0bfd73cf8e5404b262982a4a5f9 *man/is.mitml.list.Rd e0dfc81befc44872b635d65a92f995b0 *man/jomoImpute.Rd 5d8ebab538d0d83a2f8446c699742cf2 *man/justice.Rd 2308bf8434b36568b2c9d510e4ac50ae *man/leadership.Rd 5849be25acc0bbd636af772d24f16fad *man/long2mitml.list.Rd f299596d09f4b6396a1395dc7d926631 *man/mids2mitml.list.Rd 40b7122ddcb05d65ab1a3e10915990f0 *man/mitml-package.Rd baf86b9b24d1f14586849fe4e5a15978 *man/mitmlComplete.Rd c63ce88c30385de9f50ef4d9265250aa *man/multilevelR2.Rd 6e167c3f265a2dd73957a675cf5ff36d *man/panImpute.Rd 62ace16832570e826bbfec41c828c2f4 *man/plot.mitml.Rd 1217b5ab2e107aee05e10b78eda10019 *man/read.mitml.Rd 920b7a2fde8929b23108fe4fc4c193ac *man/sort.mitml.list.Rd 5e09c366db0ee1caa870e96e16de8db8 *man/studentratings.Rd e98fd455c5a2c71766ed7924da8cae4c *man/subset.mitml.list.Rd 72771c9eb62d543e9e965b90e2f0a420 *man/summary.mitml.Rd c60cd7348b4cdaa55ae7c2d33153adfc *man/testConstraints.Rd f8d834bc59b5383b39feb82c7b70da20 *man/testEstimates.Rd 9718094a500e0bcb36fba1461330e86a *man/testModels.Rd f689f1438f2d52ff48dd10082501a08d *man/with.mitml.list.Rd e75006e2b36501cb40eeb5cad1ac2a3b *man/write.mitml.Rd 57886e57a51028d907b526ad016d1cac *man/write.mitmlMplus.Rd dc8bedd444f6933ed9f5ee8fbb0158f2 *man/write.mitmlSAV.Rd 65750fb55dd69f697930a39067f71638 *man/write.mitmlSPSS.Rd 5a62c29d219190993669a816dcd5a2da *vignettes/Analysis.Rmd 4aaf362cd780906c9d2c20c98e9289bb *vignettes/Introduction.Rmd 450ba6512f382f358f13c4f3a5f9c641 *vignettes/Level2.Rmd dea4ee80cd16d43382e0c184efa8b251 *vignettes/css/vignette.css mitml/build/0000755000176200001440000000000013413110673012472 5ustar liggesusersmitml/build/vignette.rds0000644000176200001440000000044413413110673015033 0ustar liggesusersRN0u3J 1Z l'l8sz.qĐ;{O愐$)/HFfp^Z+m9bquɛj#1]`F9YJ]5Npʙc gۭ!t]uh (5CMBamitml/DESCRIPTION0000644000176200001440000000162713414712773013121 0ustar liggesusersPackage: mitml Type: Package Title: Tools for Multiple Imputation in Multilevel Modeling Version: 0.3-7 Date: 2019-01-02 Author: Simon Grund [aut,cre], Alexander Robitzsch [aut], Oliver Luedtke [aut] Maintainer: Simon Grund BugReports: https://github.com/simongrund1/mitml/issues Imports: pan, jomo, haven, grDevices, graphics, stats, utils Suggests: mice, miceadds, Amelia, lme4, nlme, geepack, survival, knitr, rmarkdown LazyData: true LazyLoad: true Description: Provides tools for multiple imputation of missing data in multilevel modeling. Includes a user-friendly interface to the packages 'pan' and 'jomo', and several functions for visualization, data management and the analysis of multiply imputed data sets. License: GPL (>= 2) VignetteBuilder: knitr NeedsCompilation: no Packaged: 2019-01-02 10:26:03 UTC; simon Repository: CRAN Date/Publication: 2019-01-07 18:10:03 UTC mitml/man/0000755000176200001440000000000013321120501012133 5ustar liggesusersmitml/man/testModels.Rd0000644000176200001440000001532513044067021014564 0ustar liggesusers\name{testModels} \alias{testModels} \title{Test multiple parameters and compare nested models} \description{ Performs multi-parameter hypothesis tests for a vector of statistical parameters and compares nested statistical models obtained from multiply imputed data sets. 
} \usage{ testModels(model, null.model, method=c("D1","D2","D3"), use=c("wald","likelihood"), df.com=NULL) } \arguments{ \item{model}{A list of fitted statistical models (see examples).} \item{null.model}{A list of fitted (more restrictive) statistical models.} \item{method}{A character string denoting the method by which the test is performed. Can be either \code{"D1"}, \code{"D2"} or \code{"D3"} (see details). Default is \code{"D1"}.} \item{use}{A character string denoting Wald- or likelihood-based based tests. Can be either \code{"wald"} or \code{"likelihood"}. Only used if \code{method="D2"}.} \item{df.com}{(optional) A single number or a numeric vector denoting the complete-data degrees of freedom for the hypothesis test. Only used if \code{method="D1"}.} } \details{ This function compares two nested statistical models which differ by one or more parameters. In other words, the function performs Wald-like and likelihood-ratio hypothesis tests for the statistical parameters by which the two models differ. The general approach to Wald-like inference for multi-dimensional estimands was introduced Rubin (1987) and further developed by Li, Raghunathan and Rubin (1991). This procedure is commonly referred to as \eqn{D_1} and can be used by setting \code{method="D1"}. \eqn{D_1} is the multi-parameter equivalent of \code{\link{testEstimates}}, that is, it tests multiple parameters simultaneously. For \eqn{D_1}, the complete-data degrees of freedom are assumed to be infinite, but they can be adjusted for smaller samples by supplying \code{df.com} (Reiter, 2007). An alternative method for Wald-like hypothesis tests was suggested by Li, Meng, Raghunathan and Rubin (1991). The procedure is often called \eqn{D_2} and can be used by setting \code{method="D2"}. \eqn{D_2} calculates the Wald-test directly for each data set and then aggregates the resulting \eqn{\chi^2} values. The source of these values is specified by the \code{use} argument. If \code{use="wald"} (the default), then a Wald-like hypothesis test similar to \eqn{D_1} is performed. If \code{use="likelihood"}, then the two models are compared through their likelihood. A third method relying on likelihood-based comparisons was suggested by Meng and Rubin (1992). This procedure is referred to as \eqn{D_3} and can be used by setting \code{method="D3"}. \eqn{D_3} compares the two models by aggregating the likelihood-ratio test across multiply imputed data sets. In general, Wald-like hypothesis tests (\eqn{D_1} and \eqn{D_2}) are appropriate if the parameters can be assumed to follow a multivariate normal distribution (e.g., regression coefficients, fixed effects in multilevel models). Likelihood-based comparisons (\eqn{D_2} and \eqn{D_3}) are also appropriate in such cases and may also be used for variance components. The function supports different classes of statistical models depending on which \code{method} is chosen. \eqn{D_1} supports quite general models as long as they define \code{coef} and \code{vcov} methods (or similar) for extracting the parameter estimates and their estimated covariance matrix. \eqn{D_2} can be used for the same models (if \code{use="wald"}, or alternatively, for models that define a \code{logLik} method (if \code{use="likelihood"}). Finally, \eqn{D_3} supports linear models and linear mixed-effects models with a single cluster variable as estimated by \code{lme4} or \code{nlme} (see Laird, Lange, & Stram, 1987). Support for other statistical models may be added in future releases. 
} \value{ Returns a list containing the results of the model comparison, and the relative increase in variance due to nonresponse (Rubin, 1987). A \code{print} method is used for better readable console output. } \note{ The \eqn{D_3} method and the likelihood-based \eqn{D_2} assume that models were fit using maximum likelihood (ML). Models fit using REML are automatically refit using ML. } \references{ Meng, X.-L., & Rubin, D. B. (1992). Performing likelihood ratio tests with multiply-imputed data sets. \emph{Biometrika, 79}, 103-111. Laird, N., Lange, N., & Stram, D. (1987). Maximum likelihood computations with repeated measures: Application of the em algorithm. \emph{Journal of the American Statistical Association, 82}, 97-105. Li, K.-H., Meng, X.-L., Raghunathan, T. E., & Rubin, D. B. (1991). Significance levels from repeated p-values with multiply-imputed data. \emph{Statistica Sinica, 1}, 65-92. Li, K. H., Raghunathan, T. E., & Rubin, D. B. (1991). Large-sample significance levels from multiply imputed data using moment-based statistics and an F reference distribution. \emph{Journal of the American Statistical Association, 86}, 1065-1073. Reiter, J. P. (2007). Small-sample degrees of freedom for multi-component significance tests with multiple imputation for missing data. \emph{Biometrika, 94}, 502-508. Rubin, D. B. (1987). \emph{Multiple imputation for nonresponse in surveys}. Hoboken, NJ: Wiley. } \author{Simon Grund} \seealso{\code{\link{anova.mitml.result}}, \code{\link{testEstimates}}, \code{\link{testConstraints}}, \code{\link{with.mitml.list}}} \examples{ data(studentratings) fml <- ReadDis + SES ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) implist <- mitmlComplete(imp, print=1:5) # * Example 1: multiparameter hypothesis test for 'ReadDis' and 'SES' # This tests the hypothesis that both effects are zero. require(lme4) fit0 <- with(implist, lmer(ReadAchiev ~ (1|ID), REML=FALSE)) fit1 <- with(implist, lmer(ReadAchiev ~ ReadDis + (1|ID), REML=FALSE)) # apply Rubin's rules testEstimates(fit1) # multiparameter hypothesis test using D1 (default) testModels(fit1, fit0) # ... adjusting for finite samples testModels(fit1, fit0, df.com=47) # ... using D2 ("wald", using estimates and covariance-matrix) testModels(fit1, fit0, method="D2") # ... using D2 ("likelihood", using likelihood-ratio test) testModels(fit1, fit0, method="D2", use="likelihood") # ... using D3 (likelihood-ratio test, requires ML fit) testModels(fit1, fit0, method="D3") \dontrun{ # * Example 2: multiparameter test using D3 with nlme # for D3 to be calculable, the 'data' argument for 'lme' must be # can be constructed manually require(nlme) fit0 <- with(implist, lme(ReadAchiev~1, random=~1|ID, data=data.frame(ReadAchiev,ID), method="ML")) fit1 <- with(implist, lme(ReadAchiev ~ 1 + ReadDis, random=~ 1|ID, data=data.frame(ReadAchiev,ReadDis,ID), method="ML")) # multiparameter hypothesis test using D3 testModels(fit1, fit0, method="D3") } } mitml/man/panImpute.Rd0000644000176200001440000002521413163177460014414 0ustar liggesusers\name{panImpute} \alias{panImpute} \title{Impute multilevel missing data using \code{pan}} \description{This function provides an interface to the \code{pan} package for multiple imputation of multilevel data (Schafer & Yucel, 2002). 
Imputations can be generated using \code{type} or \code{formula}, which offer different options for model specification.} \usage{ panImpute(data, type, formula, n.burn=5000, n.iter=100, m=10, group=NULL, prior=NULL, seed=NULL, save.pred=FALSE, keep.chains=c("full","diagonal"), silent=FALSE) } \arguments{ \item{data}{A data frame containing incomplete and auxiliary variables, the cluster indicator variable, and any other variables that should be present in the imputed datasets.} \item{type}{An integer vector specifying the role of each variable in the imputation model (see details).} \item{formula}{A formula specifying the role of each variable in the imputation model. The basic model is constructed by \code{model.matrix}, thus allowing to include derived variables in the imputation model using \code{I()} (see details and examples).} \item{n.burn}{The number of burn-in iterations before any imputations are drawn. Default is to 5,000.} \item{n.iter}{The number of iterations between imputations. Default is to 100.} \item{m}{The number of imputed data sets to generate.} \item{group}{(optional) A character string denoting the name of an additional grouping variable to be used with the \code{formula} argument. When specified, the imputation model is run separately within each of these groups.} \item{prior}{(optional) A list with components \code{a}, \code{Binv}, \code{c}, and \code{Dinv} for specifying prior distributions for the covariance matrix of random effects and the covariance matrix of residuals (see details). Default is to using least-informative priors.} \item{seed}{(optional) An integer value initializing \code{pan}'s random number generator for reproducible results. Default is to using random seeds.} \item{save.pred}{(optional) Logical flag indicating if variables derived using \code{formula} should be included in the imputed data sets. Default is to \code{FALSE}.} \item{keep.chains}{(optional) A character string denoting which parameter chains to save. Default is to save all chains (see details).} \item{silent}{(optional) Logical flag indicating if console output should be suppressed. Default is to \code{FALSE}.} } \details{ This function serves as an interface to the \code{pan} algorithm. The imputation model can be specified using either the \code{type} or the \code{formula} argument. The \code{type} interface is designed to provide quick-and-easy imputations using \code{pan}. The \code{type} argument must be an integer vector denoting the role of each variable in the imputation model: \itemize{ \item{\code{1}: target variables containing missing data} \item{\code{2}: predictors with fixed effect on all targets (completely observed)} \item{\code{3}: predictors with random effect on all targets (completely observed)} \item{\code{-1}: grouping variable within which the imputation is run separately} \item{\code{-2}: cluster indicator variable} \item{\code{0}: variables not featured in the model} } At least one target variable and the cluster indicator must be specified. The intercept is automatically included both as a fixed and random effect. If a variable of type \code{-1} is found, then separate imputations are performed within each level of that variable. The \code{formula} argument is intended as more flexible and feature-rich interface to \code{pan}. Specifying the \code{formula} argument is similar to specifying other formulae in R. 
Given below is a list of operators that \code{panImpute} currently understands: \itemize{ \item{\code{~}: separates the target (left-hand) and predictor (right-hand) side of the model} \item{\code{+}: adds target or predictor variables to the model} \item{\code{*}: adds an interaction term of two or more predictors} \item{\code{|}: denotes cluster-specific random effects and specifies the cluster indicator (e.g., \code{1|ID})} \item{\code{I()}: defines functions to be interpreted by \code{model.matrix}} } Predictors are allowed to have fixed effects, random effects, or both on all target variables. The intercept is automatically included both as a fixed and a random effect, but it can be constrained if necessary (see examples). Note that, when specifying random effects other than the intercept, these will \emph{not} be automatically added as fixed effects and must be included explicitly. Any predictors defined by \code{I()} will be used for imputation but not included in the data set unless \code{save.pred=TRUE}. In order to run separate imputations for each level of an additional grouping variable, the \code{group} argument may be used. The name of the grouping variable must be given in quotes. As a default prior, \code{panImpute} uses "least informative" inverse-Wishart priors for the covariance matrix of random effects and the covariance matrix of residuals, that is, with minimum degrees of freedom (largest dispersion) and identity matrices for scale. For better control, the \code{prior} argument may be used for specifying alternative prior distributions. These must be supplied as a list containing the following components: \itemize{ \item{\code{a}: degrees of freedom for the covariance matrix of residuals} \item{\code{Binv}: scale matrix for the covariance matrix of residuals} \item{\code{c}: degrees of freedom for the covariance matrix of random effects} \item{\code{Dinv}: scale matrix for the covariance matrix of random effects} } A sensible choice for a diffuse non-default prior is to set the degrees of freedom to the lowest value possible, and the scale matrices according to a prior guess of the corresponding covariance matrices (see Schafer & Yucel, 2002). In imputation models with many parameters, the number of parameter chains being saved can be reduced with the \code{keep.chains} argument. If set to \code{full} (the default), all chains are saved. If set to \code{diagonal}, only chains pertaining to fixed effects and the diagonal entries of the covariance matrices are saved. This setting influences the storage mode of parameters (e.g., dimensions and indices of arrays) and should be used with caution. } \value{ Returns an object of class \code{mitml}, containing the following components: \item{data}{The original (incomplete) data set, sorted according to the cluster variable and (if given) the grouping variable, with several attributes describing the original row order (\code{"sort"}) and grouping (\code{"group"}.} \item{replacement.mat}{A matrix containing the multiple replacements (i.e., imputations) for each missing value. The replacement matrix contains one row for each missing value and one one column for each imputed data set.} \item{index.mat}{A matrix containing the row and column index for each missing value. 
The index matrix is used to \emph{link} the missing values in the data set with their corresponding rows in the replacement matrix.} \item{call}{The matched function call.} \item{model}{A list containing the names of the cluster variable, the target variables, and the predictor variables with fixed and random effects, respectively.} \item{random.L1}{A character string denoting the handling of random residual covariance matrices (not used here; see \code{jomoImpute}).} \item{prior}{The prior parameters used in the imputation model.} \item{iter}{A list containing the number of burn-in iterations, the number of iterations between imputations, and the number of imputed data sets.} \item{par.burnin}{A multi-dimensional array containing the parameters of the imputation model from the burn-in phase.} \item{par.imputation}{A multi-dimensional array containing the parameters of the imputation model from the imputation phase.} } \note{ For objects of class \code{mitml}, methods for the generic functions \code{print}, \code{summary}, and \code{plot} have been defined. \code{mitmlComplete} is used for extracting the imputed data sets. } \references{ Schafer, J. L., and Yucel, R. M. (2002). Computational strategies for multivariate linear mixed-effects models with missing values. \emph{Journal of Computational and Graphical Statistics, 11}, 437-457. } \author{Simon Grund, Alexander Robitzsch, Oliver Luedtke} \seealso{\code{\link{jomoImpute}}, \code{\link{mitmlComplete}}, \code{\link{summary.mitml}}, \code{\link{plot.mitml}}} \examples{ # NOTE: The number of iterations in these examples is much lower than it # should be! This is done in order to comply with CRAN policies, and more # iterations are recommended for applications in practice! data(studentratings) # *** ................................ # the 'type' interface # # * Example 1.1: 'ReadDis' and 'SES', predicted by 'ReadAchiev' and # 'CognAbility', with random slope for 'ReadAchiev' type <- c(-2,0,0,0,0,0,3,1,2,0) names(type) <- colnames(studentratings) type imp <- panImpute(studentratings, type=type, n.burn=1000, n.iter=100, m=5) # * Example 1.2: 'ReadDis' and 'SES' groupwise for 'FedState', # and predicted by 'ReadAchiev' type <- c(-2,-1,0,0,0,0,2,1,0,0) names(type) <- colnames(studentratings) type imp <- panImpute(studentratings, type=type, n.burn=1000, n.iter=100, m=5) # *** ................................ # the 'formula' interface # # * Example 2.1: imputation of 'ReadDis', predicted by 'ReadAchiev' # (random intercept) fml <- ReadDis ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) # ... the intercept can be suppressed using '0' or '-1' (here for fixed intercept) fml <- ReadDis ~ 0 + ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) # * Example 2.2: imputation of 'ReadDis', predicted by 'ReadAchiev' # (random slope) fml <- ReadDis ~ ReadAchiev + (1+ReadAchiev|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) # * Example 2.3: imputation of 'ReadDis', predicted by 'ReadAchiev', # groupwise for 'FedState' fml <- ReadDis ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, group="FedState", n.burn=1000, n.iter=100, m=5) # * Example 2.4: imputation of 'ReadDis', predicted by 'ReadAchiev' # including the cluster mean of 'ReadAchiev' as an additional predictor fml <- ReadDis ~ ReadAchiev + I(clusterMeans(ReadAchiev,ID)) + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) # ... 
using 'save.pred' to save the calculated cluster means in the data set fml <- ReadDis ~ ReadAchiev + I(clusterMeans(ReadAchiev,ID)) + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5, save.pred=TRUE) head(mitmlComplete(imp,1)) } \keyword{models} mitml/man/read.mitml.Rd0000644000176200001440000000160712765552055014512 0ustar liggesusers\name{read.mitml} \alias{read.mitml} \title{Read \code{mitml} objects from file} \description{ This function loads \code{mitml} class objects from R binary formats (similar to \code{?load}), usually produced by \code{write.mitml}. } \usage{ read.mitml(filename) } \arguments{ \item{filename}{Name of the file to read, to be specified with file extension (e.g., \code{.R}, \code{.Rdata}).} } \value{ Returns the saved \code{mitml} class object. } \author{Simon Grund} \seealso{\code{\link{panImpute}}, \code{\link{jomoImpute}}, \code{\link{write.mitml}}} \examples{ data(studentratings) fml <- ReadDis + SES ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) # write 'mitml' object write.mitml(imp, filename="imputation.R") # read previously saved 'mitml' object previous.imp <- read.mitml("imputation.R") class(previous.imp) previous.imp } mitml/man/write.mitmlMplus.Rd0000644000176200001440000000366512765547275015770 0ustar liggesusers\name{write.mitmlMplus} \alias{write.mitmlMplus} \title{Write \code{mitml} objects to Mplus format} \description{ Saves objects of class \code{mitml} as a series of text files which can be processed by the statistical software M\emph{plus} (Muthen & Muthen, 2012). } \usage{ write.mitmlMplus(x, filename, suffix="list", sep="\t", dec=".", na.value=-999) } \arguments{ \item{x}{An object of class \code{mitml} or \code{mitml.list} as produced by \code{panImpute}, \code{jomoImpute}, \code{mitmlComplete}, or similar).} \item{filename}{Basic file name for the text files containing the imputed data sets, to be specified without file extension.} \item{suffix}{File name suffix for the index file.} \item{sep}{The field separator.} \item{dec}{The decimal separator.} \item{na.value}{A numeric value coding the missing data in the resulting data files.} } \details{ The M\emph{plus} format for multiply imputed data sets comprises a series of text files, each containing one imputed data set, and an index file containing the names of all data files. During export, factors and character variables are converted to numeric. Therefore, \code{write.mitmlMplus} produces a log file which contains information about the data set and the factors that have been converted. In addition, a basic M\emph{plus} input file is generated that can be used for setting up subsequent analysis models. } \value{ None (invisible \code{NULL}). } \references{ Muthen, L. K., & Muthen, B. O. (2012). \emph{Mplus User's Guide. Seventh Edition.} Los Angeles, CA: Muthen & Muthen. 
} \author{Simon Grund} \seealso{\code{\link{panImpute}}, \code{\link{jomoImpute}}, \code{\link{mitmlComplete}}} \examples{ data(studentratings) fml <- ReadDis + SES ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) # write imputation files, index file, and log file write.mitmlMplus(imp, filename="imputation", suffix="list", na.value=-999) } mitml/man/long2mitml.list.Rd0000644000176200001440000000323613057753556015520 0ustar liggesusers\name{long2mitml.list} \alias{long2mitml.list} \alias{jomo2mitml.list} \title{Convert imputations from long format to \code{mitml.list}} \description{These functions convert data sets containing multiple imputations in long format to objects of class \code{mitml.list}. The resulting object can be used in further analyses.} \usage{ long2mitml.list(x, split, exclude=NULL) jomo2mitml.list(x) } \arguments{ \item{x}{A data frame in long format containing multiple imputations (see details).} \item{split}{A character string denoting the column in \code{x} that identifies different imputations (see details).} \item{exclude}{A vector denoting values of \code{split} which should be excluded from the list.} } \details{ The function \code{long2mitml.list} is intedended for converting data frames from the long format to \code{mitml.list} (i.e., a list of imputed data sets). In this format, imputations are enclosed in a single data frame, and a \code{split} variable is used to identify different imputations. Similarly, \code{jomo2mitml.list} is a special case of \code{long2mitml.list} intended for converting imputations that have been generated with \code{jomo} directly. } \value{ A list of imputed data sets with an additional class attribute \code{mitml.list}. } \author{Simon Grund} \seealso{\code{\link{mitmlComplete}}} \examples{ data(studentratings) require(jomo) # impute data using jomo (native functions) clus <- studentratings[,"ID"] Y <- studentratings[,c("ReadAchiev","ReadDis")] imp <- jomo(Y=Y, clus=clus, nburn=1000, nbetween=100, nimp=5) # split imputations impList <- long2mitml.list(imp, split="Imputation", exclude=0) impList <- jomo2mitml.list(imp) } mitml/man/mids2mitml.list.Rd0000644000176200001440000000133212765527470015507 0ustar liggesusers\name{mids2mitml.list} \alias{mids2mitml.list} \title{Convert objects of class \code{mids} to \code{mitml.list}} \description{This function converts a \code{mids} class object (as produced by the \code{mice} package) to \code{mitml.list}. The resulting object may be used in further analyses.} \usage{ mids2mitml.list(x) } \arguments{ \item{x}{An object of class \code{mids} as produced by \code{mice} (see the \code{mice} package).} } \value{ A list of imputed data sets with an additional class attribute \code{mitml.list}. } \author{Simon Grund} \seealso{\code{\link{mitmlComplete}}} \examples{ data(studentratings) # imputation using mice require(mice) imp <- mice(studentratings) implist <- mids2mitml.list(imp) } mitml/man/leadership.Rd0000644000176200001440000000260412742203327014562 0ustar liggesusers\name{leadership} \alias{leadership} \docType{data} \title{Example data set on leadership style and job satisfaction} \description{ This data set is a slightly altered version of the data set simulated by Paul D. Bliese as described in Klein et al. (2000). The data set consists of 750 employees, nested within 50 work groups. The data set features employees' ratings on negative leadership style, job satisfaction, and workload as well as a measure for each work group's cohesion. 
The original data set is available in the \code{multilevel} package and was altered by (a) transforming workload into a categorical variable, (b) transforming cohesion into a group-level variable, and (c) by inducing missing values. } \usage{data(leadership)} \format{A data frame containing 750 observations on 5 variables.} \references{ Bliese, P. D. (2013). multilevel: Multilevel functions (Version 2.5) [Computer software]. Retrieved from \code{http://CRAN.R-project.org/package=multilevel} Klein, K. J., Bliese, P. D., Kozlowski, S. W. J., Dansereau, F., Gavin, M. B., Griffin, M. A., ... Bligh, M. C. (2000). Multilevel analytical techniques: Commonalities, differences, and continuing questions. In K. J. Klein & S. W. J. Kozlowski (Eds.), \emph{Multilevel theory, research, and methods in organizations: Foundations, extensions, and new directions} (pp. 512-553). San Francisco, CA: Jossey-Bass. } \keyword{datasets} mitml/man/write.mitmlSAV.Rd0000644000176200001440000000316212765552552015303 0ustar liggesusers\name{write.mitmlSAV} \alias{write.mitmlSAV} \title{Write \code{mitml} objects to native SPSS format} \description{ Saves objects of class \code{mitml} in the \code{.sav} format used by the statistical software SPSS (IBM Corp., 2013). The function serves as a front-end for \code{write_sav} from the \code{haven} package. } \usage{ write.mitmlSAV(x, filename) } \arguments{ \item{x}{An object of class \code{mitml} or \code{mitml.list} as produced by \code{panImpute}, \code{jomoImpute}, \code{mitmlComplete}, or similar).} \item{filename}{Name of the destination file, to be specified with or without file extension. The file extension (\code{.sav}) is appended if necessary.} } \details{ This function exports multiply imputed data sets to a single \code{.sav} file, in which an \code{Imputation_} variable separates the original data and the various imputed data sets. Thus, \code{write.mitmlSAV} exports directly to the native SPSS format. Alternatively, \code{\link{write.mitmlSPSS}} may be used for creating separate text and SPSS syntax files; an option that offers more control over the data format. } \value{ None (invisible \code{NULL}). } \references{ IBM Corp. (2013). \emph{IBM SPSS Statistics for Windows, Version 22.0}. Armonk, NY: IBM Corp } \author{Simon Grund} \seealso{\code{\link{panImpute}}, \code{\link{jomoImpute}}, \code{\link{mitmlComplete}}, \code{\link{write.mitmlSPSS}}} \examples{ data(studentratings) fml <- ReadDis + SES ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) # write data file and SPSS syntax write.mitmlSAV(imp, filename="imputation") } mitml/man/plot.mitml.Rd0000644000176200001440000001430512765551765014563 0ustar liggesusers\name{plot.mitml} \alias{plot.mitml} \title{Print diagnostic plots} \description{ Generates diagnostic plots for assessing the convergence and autocorrelation behavior of \code{pan}'s and \code{jomo}'s MCMC algorithms. } \usage{ \method{plot}{mitml}(x, print=c("beta","beta2","psi","sigma"), pos=NULL, group="all", trace=c("imputation","burnin","all"), thin=1, smooth=3, n.Rhat=3, export=c("none","png","pdf"), dev.args=list(), ...) } \arguments{ \item{x}{An object of class \code{mitml} as produced by \code{panImpute} and \code{jomoImpute}.} \item{print}{A character vector containing one or several of \code{"beta"}, \code{"beta2"}, \code{"psi"} or \code{"sigma"} denoting which parameters should be plotted. 
Default is to plot all parameters.} \item{pos}{Either \code{NULL} or an integer vector denoting a specific entry in either \code{"beta"}, \code{"beta2"}, \code{"psi"} or \code{"sigma"}. Default is to \code{NULL} plotting all entries.} \item{group}{Either \code{"all"} or an integer denoting for which group plots should be generated. Used only when group-wise imputation was used. Default is to \code{"all"}.} \item{trace}{One of \code{"imputation"}, \code{"burnin"} or \code{"all"} denoting which part of the parameter chain should be used for the trace plot. Default is to \code{"imputation"}, which plots only the iterations after burn-in.} \item{thin}{An integer denoting the thinning factor that is applied before plotting. Default is to \code{1}, plotting the full chain.} \item{smooth}{A numeric value denoting the smoothing factor for the trend line in trace plots. Higher values correspond to less smoothing. Default is \code{3}. If set to \code{0} or \code{NULL}, the trend line is suppressed.} \item{n.Rhat}{An integer denoting the number of sequences used for calculating the potential scale reduction factor. Default is \code{3}.} \item{export}{(optional) A character string specifying if plots should be exported to file. If \code{"png"} or \code{"pdf"}, then plots are printed into a folder named "mitmlPlots" in the current directory using either the \code{png} or \code{pdf} device. Default is to \code{"none"}, which does not export files.} \item{dev.args}{(optional) A named list containing additional arguments that are passed to the graphics device.} \item{\dots}{Parameters passed to the plotting functions.} } \details{ The \code{plot} method generates a series of plots for the parameters of the imputation model which can be used for diagnostic purposes. In addition, a short summary of the parameter chain is displayed. Setting \code{print} to \code{"beta"}, \code{"beta2"}, \code{"psi"} and \code{"sigma"} will plot the fixed effects, the variances and covariances of random effects, and the variances and covariances of residuals, respectively. Here, \code{"beta2"} refers to the fixed effects for target variables at level 2 and is only used when imputations were carried out using a two-level model (\code{\link{jomoImpute}}). Each plotting window contains a trace plot (upper left), an autocorrelation plot (lower left), a kernel density approximation of the posterior distribution (upper right), and a posterior summary (lower right). The summary includes the following quantities: \describe{ \item{\code{EAP}:}{Expected value a posteriori (i.e., the mean of the parameter chain)} \item{\code{MAP}:}{Mode a posteriori (i.e., the mode of the parameter chain)} \item{\code{SD}:}{Standard deviation of the parameter chain} \item{\code{2.5\%}:}{The 2.5\% quantile of parameter values} \item{\code{97.5\%}:}{The 97.5\% quantile of parameter values} \item{\code{Rhat}:}{Estimated potential scale reduction factor (\eqn{\hat{R}})} \item{\code{ACF-k}:}{Smoothed autocorrelation at lag \eqn{k}, where \eqn{k} is the number of iterations between imputations (see \code{\link{summary.mitml}})} } The \code{trace} and \code{smooth} arguments can be used to influence how the trace plot is drawn and what part of the chain should be used for it. The \code{thin} argument can be used for thinning the chain before plotting, in which case the number of data points is reduced in the trace plot, and the autocorrelation is calculated up to lag \eqn{k/}\code{thin} (see above). 
The \code{n.Rhat} argument controls the number of sequences that are used for calculating the potential scale reduction factor (\eqn{\hat{R}}) in each plot (see \code{summary.mitml}). Further aguments to the graphics device are supplied using the \code{dev.args} argument. The \code{plot} function calculates and displays diagnostic information primarily for the imputation phase (i.e., for iterations after burn-in). This is the default in the \code{plot} function and the recommended method for most users. However, note that, when overriding the default using \code{trace="burnin"}, the posterior summary and the trace plots do not convey the necessary information to establish convergence. When \code{trace="all"}, the full chain is displayed with emphasis on the imputation phase, and the posterior summary is calculated based only on iterations after burn-in as recommended. } \note{ The plots are presented on-screen one at a time. To proceed with the next plot, the user may left-click in the plotting window or press the "enter" key while in the R console, depending on the operating system. No plots are displayed when exporting to file. } \value{ None (invisible \code{NULL}). } \author{Simon Grund} \seealso{\code{\link{panImpute}}, \code{\link{jomoImpute}}, \code{\link{summary.mitml}}} \examples{ \dontrun{ data(studentratings) # * Example 1: simple imputation fml <- ReadDis + SES ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) # plot all parameters (default) plot(imp) # plot fixed effects only plot(imp, print="beta") # export plots to file (using pdf device) plot(imp, export="pdf", dev.args=list(width=9, height=4, pointsize=12)) # * Example 2: groupwise imputation fml <- ReadDis + SES ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, group=FedState, n.burn=1000, n.iter=100, m=5) # plot fixed effects for all groups (default for 'group') plot(imp, print="beta", group="all") # plot fixed effects for first group only plot(imp, print="beta", group=1) } } \keyword{methods} mitml/man/anova.mitml.result.Rd0000644000176200001440000000520712765530245016215 0ustar liggesusers\name{anova.mitml.result} \alias{anova.mitml.result} \title{Compare several nested models} \description{ Performs model comparisons for a series of nested statistical models fitted using \code{with.mitml.list}. } \usage{ \method{anova}{mitml.result}(object, ...) } \arguments{ \item{object}{An object of class \code{mitml.result} as produced by \code{with.mitml.list}.} \item{\dots}{Additional objects of class \code{mitml.result} to be included in the comparison.} } \details{ This function performs several model comparisons between models fitted using \code{with.mitml.list}. If possible, the models are compared using the \eqn{D_3} statistic (Meng & Rubin, 1992). If this method is unavailable, the \eqn{D_2} statistic is used instead (Li, Meng, Raghunathan, & Rubin, 1991). The \eqn{D_3} method currently supports linear models and linear mixed-effects models with a single cluster variable as estimated by \code{lme4} or \code{nlme} (see Laird, Lange, & Stram, 1987). This function is essentially a wrapper for \code{\link{testModels}} with the advantage that several models can be compared simultaneously. All model comparisons are likelihood-based. For further options for model comparisons (e.g., Wald-based procedures) and finer control, see \code{testModels}. } \value{ Returns a list containing the results of each model comparison. 
A \code{print} method is used for better readable console output. } \author{Simon Grund} \seealso{\code{\link{with.mitml.list}}, \code{\link{testModels}}} \examples{ require(lme4) data(studentratings) fml <- ReadDis + SES ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) implist <- mitmlComplete(imp, print=1:5) # simple comparison (same as testModels) fit0 <- with(implist, lmer(ReadAchiev ~ (1|ID), REML=FALSE)) fit1 <- with(implist, lmer(ReadAchiev ~ ReadDis + (1|ID), REML=FALSE)) anova(fit1,fit0) \dontrun{ # multiple comparisons fit2 <- with(implist, lmer(ReadAchiev ~ ReadDis + (1+ReadDis|ID), REML=FALSE)) anova(fit2,fit1,fit0) } } \references{ Meng, X.-L., & Rubin, D. B. (1992). Performing likelihood ratio tests with multiply-imputed data sets. \emph{Biometrika, 79}, 103-111. Laird, N., Lange, N., & Stram, D. (1987). Maximum likelihood computations with repeated measures: Application of the em algorithm. \emph{Journal of the American Statistical Association, 82}, 97-105. Li, K. H., Raghunathan, T. E., & Rubin, D. B. (1991). Large-sample significance levels from multiply imputed data using moment-based statistics and an F reference distribution. \emph{Journal of the American Statistical Association, 86}, 1065-1073. } \keyword{methods} mitml/man/write.mitmlSPSS.Rd0000644000176200001440000000531512765552570015444 0ustar liggesusers\name{write.mitmlSPSS} \alias{write.mitmlSPSS} \title{Write \code{mitml} objects to SPSS compatible format} \description{ Saves objects of class \code{mitml} as a text and a syntax file which can be processed by the statistical software SPSS (IBM Corp., 2013). } \usage{ write.mitmlSPSS(x, filename, sep="\t", dec=".", na.value=-999, syntax=TRUE, locale=NULL) } \arguments{ \item{x}{An object of class \code{mitml} or \code{mitml.list} as produced by \code{panImpute}, \code{jomoImpute}, \code{mitmlComplete}, or similar).} \item{filename}{Basic file name of the data and syntax files, to be specified without file extension.} \item{sep}{The field separator.} \item{dec}{The decimal separator.} \item{na.value}{A numeric value coding the missing data in the resulting data file.} \item{syntax}{A logical flag indicating if an SPSS syntax file should be generated. This file contains instructions for SPSS for reading in the data file. Default is to \code{TRUE}.} \item{locale}{(optional) A character string specifying the localization to be used in SPSS (e.g., \code{"en_US"}, \code{"de_DE"}). This argument may be specified if SPSS reads the data incorrectly due to conflicting locale settings.} } \details{ Multiply imputed data sets in SPSS are contained in a single file, in which an \code{Imputation_} variable separates the original data and the various imputed data sets. During export, factors are converted to numeric, whereas character variables are left "as is". By default, \code{write.mitmlSPSS} generates a raw text file containing the data, along with a syntax file containing instructions for SPSS. This syntax file mimics SPSS's functionality to read text files but circumvents certain problems that may occur when using the GUI. In order to read in the data, the syntax file must be opened and executed using SPSS. The syntax file may be altered manually if problems occur, for example, if the file path of the data file is not correctly represented in the syntax. Alternatively, \code{\link{write.mitmlSAV}} may be used for exporting directly to the SPSS native \code{.sav} format. 
However, this may offer less control over the data format. } \value{ None (invisible \code{NULL}). } \references{ IBM Corp. (2013). \emph{IBM SPSS Statistics for Windows, Version 22.0}. Armonk, NY: IBM Corp } \author{Simon Grund} \seealso{\code{\link{panImpute}}, \code{\link{jomoImpute}}, \code{\link{mitmlComplete}}, \code{\link{write.mitmlSAV}}} \examples{ data(studentratings) fml <- ReadDis + SES ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) # write data file and SPSS syntax write.mitmlSPSS(imp, filename="imputation", sep="\t", dec=".", na.value=-999, locale="en_US") } mitml/man/confint.mitml.testEstimates.Rd0000644000176200001440000000316613321120501020046 0ustar liggesusers\name{confint.mitml.testEstimates} \alias{confint.mitml.testEstimates} \title{Compute confidence intervals} \description{ Computes confidence intervals on the basis of the final parameter estimates and inferences given by \code{\link{testEstimates}}. } \usage{ \method{confint}{mitml.testEstimates}(object, parm, level=0.95, ...) } \arguments{ \item{object}{An object of class \code{mitml.testEstimates} as produced by \code{testEstimates}.} \item{parm}{(optional) A reference to the parameters for which to calculate confidence intervals. Can be a character or integer vector denoting names or position of parameters, respectively. If missing, all parameters are considered (the default).} \item{level}{The confidence level. Default is to \code{0.95} (i.e., 95\%).} \item{\dots}{Not being used.} } \details{ This function calculates confidence intervals with the given confidence level for the specified parameters on the basis of a $t$-distribution, with estimates, standard errors, and degrees of freedom as returned by \code{testEstimates}. } \value{ A matrix containing the lower and upper bounds of the confidence intervals. } \author{Simon Grund} \seealso{\code{\link{testEstimates}}} \examples{ data(studentratings) fml <- ReadDis ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=500, n.iter=100, m=5) implist <- mitmlComplete(imp, print=1:5) # fit regression model fit <- with(implist, lm(ReadDis ~ 1 + ReadAchiev)) est <- testEstimates(fit) # calculate confidence intervals confint(est) # ... with different confidence levels confint(est, level=0.90) confint(est, level=0.999) } mitml/man/testEstimates.Rd0000644000176200001440000000753413321120501015271 0ustar liggesusers\name{testEstimates} \alias{testEstimates} \title{Compute final estimates and inferences} \description{ Computes final parameter estimates and inferences from multiply imputed data sets. } \usage{ testEstimates(model, qhat, uhat, var.comp=FALSE, df.com=NULL) } \arguments{ \item{model}{A list of fitted statistical models (see examples).} \item{qhat, uhat}{Two matrices or lists containing point and variances estimates, respectively, for each imputed data set (see examples).} \item{var.comp}{A logical flag indicating if estimates for variance components should be calculated. Default is to \code{FALSE}.} \item{df.com}{(optional) A numeric vector denoting the complete-data degrees of freedom for the hypothesis test.} } \details{ This function calculates final parameter estimates and inferences as suggested by Rubin (1987, "Rubin's rules") for each parameter of the fitted model. In other words, \code{testEstimates} aggregates estimates and standard errors across multiply imputed data sets. 
The parameters and standard errors can either be supplied as fitted statistical models (\code{model}), or as two matrices or lists (\code{qhat} and \code{uhat}, see examples). Rubin's original method assumes that the complete-data degrees of freedom are infinite, which is reasonable in larger samples. Alternatively, the degrees of freedom can be adjusted for smaller samples by specifying \code{df.com} (Barnard & Rubin, 1999). The \code{df.com} argument can either be a single number if the degrees of freedom are equal for all tests, or a numeric vector with one element per test. Using the \code{var.comp} argument, final estimates for variance components and related parameters can be requested. These will be shown as a separate table within the console output. Accessing variance components is highly dependent on the model being estimated and not implemented for all models. Users may prefer calculating these estimates manually using \code{\link{with.mitml.list}} (see Example 3). No inferences are calculated for variance components. Currently, the procedure supports statistical models that define \code{coef} and \code{vcov} methods (e.g., \code{lm}), multilevel models estimated with \code{lme4} or \code{nlme}, and GEEs estimated with \code{geepack}. The arguments \code{qhat} and \code{uhat} allow for more general aggregation of parameter estimates regardless of model class. Support for further models may be added in future releases. } \value{ Returns a list containing the final parameter and inferences, the relative increase in variance due to nonresponse, and the fraction of missing information (Rubin, 1987). A \code{print} method is used for better readable console output. } \references{ Barnard, J., & Rubin, D. B. (1999). Small-sample degrees of freedom with multiple imputation. \emph{Biometrika, 86}, 948-955. Rubin, D. B. (1987). \emph{Multiple imputation for nonresponse in surveys}. Hoboken, NJ: Wiley. } \author{Simon Grund} \seealso{\code{\link{with.mitml.list}}, \code{\link{confint.mitml.testEstimates}}} \examples{ data(studentratings) fml <- ReadDis + SES ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) implist <- mitmlComplete(imp, print=1:5) # fit multilevel model using lme4 require(lme4) fit.lmer <- with(implist, lmer(SES ~ (1|ID))) # * Example 1: combine estimates using model recognition # final estimates and inferences sperately for each parameter (Rubin's rules) testEstimates(fit.lmer) # ... adjusted df for finite samples testEstimates(fit.lmer, df.com=49) # ... with additional table for variance components and ICCs testEstimates(fit.lmer, var.comp=TRUE) # * Example 2: combine estimates using matrices or lists fit.lmer <- with(implist, lmer(SES ~ ReadAchiev + (1|ID))) qhat <- sapply(fit.lmer, fixef) uhat <- sapply(fit.lmer, function(x) diag(vcov(x))) testEstimates(qhat=qhat, uhat=uhat) } mitml/man/amelia2mitml.list.Rd0000644000176200001440000000141213062001406015753 0ustar liggesusers\name{amelia2mitml.list} \alias{amelia2mitml.list} \title{Convert objects of class \code{amelia} to \code{mitml.list}} \description{This function converts a \code{amelia} class object (as produced by the \code{Amelia} package) to \code{mitml.list}. The resulting object may be used in further analyses.} \usage{ amelia2mitml.list(x) } \arguments{ \item{x}{An object of class \code{amelia} as produced by \code{amelia} (see the \code{Amelia} package).} } \value{ A list of imputed data sets with an additional class attribute \code{mitml.list}. 
} \author{Simon Grund} \seealso{\code{\link{mitmlComplete}}} \examples{ data(studentratings) require(Amelia) imp <- amelia(x=studentratings[,c("ID","MathAchiev","ReadAchiev")], cs="ID") implist <- amelia2mitml.list(imp) } mitml/man/subset.mitml.list.Rd0000644000176200001440000000342713057751366016061 0ustar liggesusers\name{subset.mitml.list} \alias{subset.mitml.list} \title{Subset a list of imputed data sets} \description{ The functions can be used for creating subsets for a list of multiply imputed data sets. } \usage{ \method{subset}{mitml.list}(x, subset, select, ...) } \arguments{ \item{x}{A list of imputed data sets with class \code{mitml.list} as produced by \code{mitmlComplete} (or similar).} \item{subset}{An R expression by which to subset each data set.} \item{select}{An R expression by which to select columns.} \item{\dots}{Not being used.} } \details{ This function can be used to create subsets and select variables for a list of multiply imputed data sets according to the R expressions given in the \code{subset} and \code{select} arguments. The function is similar to and adapted from the \code{subset} function for regular data sets. Note that subsetting is performed individually for each data set. Thus, the cases included may differ across data sets if the variables used for subsetting contain different values. } \value{ A list of imputed data sets with an additional class attribute \code{mitml.list}. } \author{Simon Grund} \examples{ data(studentratings) fml <- ReadDis + SES ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) implist <- mitmlComplete(imp,"all") # * Example 1: subset by SES, select variables by name subset(implist, SES < 25, select = c(ID, FedState, Sex, SES, ReadAchiev, ReadDis)) # * Example 2: subset by FedState, select variables by column number subset(implist, FedState == "SH", select = -c(6:7,9:10)) \dontrun{ # * Example 3: subset by ID and Sex subset(implist, ID %in% 1001:1005 & Sex == "Boy") # * Example 4: select variables by name range subset(implist, select = ID:Sex) } } \keyword{methods} mitml/man/is.mitml.list.Rd0000644000176200001440000000123012765521231015144 0ustar liggesusers\name{is.mitml.list} \alias{is.mitml.list} \title{Check if an object is of class \code{mitml.list}} \description{This function checks if its argument is a list of class \code{mitml.list}.} \usage{ is.mitml.list(x) } \arguments{ \item{x}{An R object.} } \value{ Either \code{TRUE} or \code{FALSE}. A warning message is displayed if the contents of \code{x} do not appear to be data frames. } \author{Simon Grund} \seealso{\code{\link{as.mitml.list}}} \examples{ l <- list(data.frame(x=rnorm(20))) l <- as.mitml.list(l) is.mitml.list(l) # TRUE l <- as.list(1:10) is.mitml.list(l) # FALSE class(l) <- "mitml.list" is.mitml.list(l) # TRUE, with a warning } mitml/man/write.mitml.Rd0000644000176200001440000000213712765536623014733 0ustar liggesusers\name{write.mitml} \alias{write.mitml} \title{Write \code{mitml} objects to file} \description{ This function saves objects of class \code{mitml} in R binary formats (similar to \code{?save}). } \usage{ write.mitml(x, filename, drop=FALSE) } \arguments{ \item{x}{An object of class \code{mitml} as produced by \code{panImpute} and \code{jomoImpute}.} \item{filename}{Name of the destination file, to be specified with file extension (e.g., \code{.R}, \code{.Rdata}).} \item{drop}{Logical flag indicating if the parameters of the imputation model should be dropped in favor for lower file size. 
Default is to \code{FALSE}.} } \value{ None (invisible \code{NULL}). } \author{Simon Grund} \seealso{\code{\link{panImpute}}, \code{\link{jomoImpute}}, \code{\link{read.mitml}}} \examples{ data(studentratings) fml <- ReadDis + SES ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) # write full 'mitml' object (default) write.mitml(imp, filename="imputation.R") # drop parameters of the imputation model write.mitml(imp, filename="imputation.R", drop=TRUE) } mitml/man/with.mitml.list.Rd0000644000176200001440000000730613102371425015510 0ustar liggesusers\name{with.mitml.list} \alias{with.mitml.list} \alias{within.mitml.list} \title{Evaluate an expression in a list of imputed data sets} \description{ The functions \code{with} and \code{within} evaluate R expressions in a list of multiply imputed data sets. } \usage{ \method{with}{mitml.list}(data, expr, ...) \method{within}{mitml.list}(data, expr, ignore=NULL, ...) } \arguments{ \item{data}{A list of imputed data sets with class \code{mitml.list} as produced by \code{mitmlComplete}.} \item{expr}{An R expression to be evaluated for each data set.} \item{ignore}{A character vector denoting objects not to be saved.} \item{\dots}{Not being used.} } \details{ The two functions are defined as \code{with} and \code{within} methods for objects of class \code{mitml.list}. Both \code{with} and \code{within} evaluate an R expression in each of the imputed data sets. However, the two functions return different values: \code{with} returns the evaluated expression, whereas \code{within} returns the resuling data sets. The \code{ignore} argument may be used to declare objects that are not to be saved within \code{within}. } \value{ % \itemize{ % \item{\code{with}}{: Returns the evaluated expression from each data set as a list (class \code{mitml.result}. This is useful for fitting statistical models to multiply imputed data. The list of fitted models can be analyzed using \code{testEstimates}, \code{testModels}, \code{testConstraints}, or \code{anova}.} % \item{\code{within}}{: Evaluates the R expression for each data set and returns the altered data sets as a list \code{mitml.list}. This is useful for manipulating the data prior to data analysis (e.g., centering, calculating cluster means, etc.).} % } \code{with}: Returns the evaluated expression from each data set as a list (class \code{mitml.result}). This is useful for fitting statistical models to multiply imputed data. The list of fitted models can be analyzed using \code{\link{testEstimates}}, \code{\link{testModels}}, \code{\link{testConstraints}}, or \code{\link{anova}}. \code{within}: Evaluates the R expression for each data set and returns the altered data sets as a list \code{mitml.list}. This is useful for manipulating the data prior to data analysis (e.g., centering, calculating cluster means, etc.). 
} \author{Simon Grund} \seealso{\code{\link{mitmlComplete}}, \code{\link{anova.mitml.result}}, \code{\link{testEstimates}}, \code{\link{testModels}}, \code{\link{testConstraints}}} \examples{ data(studentratings) fml <- ReadDis + SES ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) implist <- mitmlComplete(imp, print=1:5) # * Example 1: data manipulation # calculate and save cluster means new1.implist <- within(implist, Means.ReadAchiev <- clusterMeans(ReadAchiev, ID)) # center variables, calculate interaction terms, ignore byproducts new2.implist <- within(implist,{ M.SES <- mean(SES) M.CognAbility <- mean(CognAbility) C.SES <- SES - M.SES C.CognAbility <- CognAbility - M.CognAbility SES.CognAbility <- C.SES * C.CognAbility }, ignore=c("M.SES", "M.CognAbility")) # * Example 2: fitting statistical models # fit regression model fit.lm <- with(implist, lm(ReadAchiev ~ ReadDis)) # fit multilevel model using lme4 require(lme4) fit.lmer <- with(implist, lmer(ReadAchiev ~ ReadDis + (1|ID))) # * Example 3: manual extraction of variance estimates require(lme4) fit.lmer <- with(implist, lmer(SES ~ (1|ID))) # extract level-1 and level-2 variances var.l1 <- sapply(fit.lmer, function(z) attr(VarCorr(z),"sc")^2) var.l2 <- sapply(fit.lmer, function(z) VarCorr(z)$ID[1,1]) # calculate final estimate of the intraclass correlation ICC <- mean( var.l2 / (var.l2+var.l1) ) } \keyword{methods} mitml/man/as.mitml.list.Rd0000644000176200001440000000151713321120501015124 0ustar liggesusers\name{as.mitml.list} \alias{as.mitml.list} \title{Convert a list of data sets to \code{mitml.list}} \description{This function adds a \code{mitml.list} class attribute to a list of data frames. The resulting object can be used in further analyses.} \usage{ as.mitml.list(x) } \arguments{ \item{x}{A list of data frames.} } \value{ The original list with an additional class attribute \code{mitml.list}. The list entries are converted to \code{data.frame} if necessary, in which case a note is printed. } \author{Simon Grund} \seealso{\code{\link{is.mitml.list}}, \code{\link{long2mitml.list}}} \examples{ # data frame with 'imputation' indicator dat <- data.frame(imputation=rep(1:10,each=20), x=rnorm(200)) # split into a list and convert to 'mitml.list' l <- split(dat, dat$imputation) l <- as.mitml.list(l) is.mitml.list(l) # TRUE } mitml/man/sort.mitml.list.Rd0000644000176200001440000000306213057752135015531 0ustar liggesusers\name{sort.mitml.list} \alias{sort.mitml.list} \title{Sort a list of imputed data sets} \description{ The functions sorts a list of multiply imputed data sets according to an R expression. } \usage{ \method{sort}{mitml.list}(x, decreasing=FALSE, by, ...) } \arguments{ \item{x}{A list of imputed data sets with class \code{mitml.list} as produced by \code{mitmlComplete} (or similar).} \item{decreasing}{Logical flag indicating if data sets should be sorted in decreasing (i.e., reversed) order. Default is `\code{FALSE}`.} \item{by}{An R expression or a list of multiple expressions by which to sort the imputed data sets (see Examples).} \item{\dots}{Further arguments to `\code{order}' (see Details).} } \details{ This function sorts a list of imputed data sets according to the R expression given in the \code{by} argument. The function is similar to the \code{order} function for regular data sets and uses it internally. Note that sorting is performed individually for each data set. 
Thus, the order of cases may differ across data sets if the variables used for sorting contain different values. } \value{ A list of imputed data sets with an additional class attribute \code{mitml.list}. } \author{Simon Grund} \examples{ data(studentratings) fml <- ReadDis + SES ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) implist <- mitmlComplete(imp,"all") # * Example 1: sort by ID sort(implist, by=ID) # * Example 2: sort by combination of variables sort(implist, by=list(FedState,ID,-SES)) } \keyword{methods} mitml/man/multilevelR2.Rd0000644000176200001440000000512512765530342015036 0ustar liggesusers\name{multilevelR2} \alias{multilevelR2} \title{Calculate R-squared measures for multilevel models} \description{ Calculates several measures for the proportion of explained variance in a fitted linear mixed-effects (i.e.,) multilevel model (or a list of fitted models). } \usage{ multilevelR2(model, print=c("RB1","RB2","SB","MVP")) } \arguments{ \item{model}{Either a fitted linear mixed-effects model as produced by \code{lme4} or \code{nlme}, or a list of fitted models as produced by \code{with.mitml.list}.} \item{print}{A character vector denoting which measures should be calculated (see details). Default is to printing all measures.} } \details{ This function calculates several measures of explained variance (\eqn{R^2}) for linear-mixed effects models. It can be used with a single model, as produced by the packages \code{lme4} or \code{nlme}, or a list of fitted models produced by \code{with.mitml.list}. In the latter case, the \eqn{R^2} measures are calculated separately for each imputed data set and then averaged across data sets. Different \eqn{R^2} measures can be requested using the \code{print} argument. Specifying \code{RB1} and \code{RB2} will return the explained variance at level 1 and level 2, respectively, according to Raudenbush and Bryk (2002, pp. 74 and 79). Specifying \code{SB} will return the total variance explained according to Snijders and Bosker (2012, p. 112). Specifying \code{MVP} will return the total variance explained based on ``multilevel variance partitioning'' as proposed by LaHuis, Hartman, Hakoyama, and Clark (2014). } \value{ Returns a numeric vector containing the \eqn{R^2} measures requested in \code{print}. } \note{ Calculating \eqn{R^2} measures is currently only supported for two-level models with a single cluster variable. } \author{Simon Grund} \references{ LaHuis, D. M., Hartman, M. J., Hakoyama, S., & Clark, P. C. (2014). Explained variance measures for multilevel models. \emph{Organizational Research Methods}, 17, 433-451. Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Thousand Oaks, CA: Sage. Snijders, T. A. B., & Bosker, R. J. (2012). Multilevel analysis: An introduction to basic and advanced multilevel modeling. Thousand Oaks, CA: Sage. 
} \examples{ require(lme4) data(studentratings) fml <- MathAchiev + ReadAchiev + CognAbility ~ 1 + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) implist <- mitmlComplete(imp, print=1:5) fit <- with(implist, lmer(MathAchiev ~ 1 + CognAbility + (1|ID))) multilevelR2(fit) }
mitml/man/justice.Rd0000644000176200001440000000231613044107142014102 0ustar liggesusers\name{justice} \alias{justice} \docType{data} \title{Example data set on employees' justice perceptions and satisfaction} \description{ This data set contains simulated data for employees nested within organizations, featuring employees' sex, ratings on individual justice orientation, and ratings on job satisfaction. In addition, the data set features scores for justice climate in each organization (defined at the level of organizations, level 2). Different organizations are denoted by the variable \code{id}. The data were simulated based on the results by Liao and Rupp (2005), as well as the secondary analyses of the same data given in Mathieu, Aguinis, Culpepper, and Chen (2012). } \format{A data frame containing 1400 observations on 4 variables.} \usage{data(justice)} \references{ Liao, H., & Rupp, D. E. (2005). The impact of justice climate and justice orientation on work outcomes: A cross-level multifoci framework. \emph{Journal of Applied Psychology}, 90, 242-256. Mathieu, J. E., Aguinis, H., Culpepper, S. A., & Chen, G. (2012). Understanding and estimating the power to detect cross-level interaction effects in multilevel modeling. \emph{Journal of Applied Psychology}, 97, 951-966. } \keyword{datasets}
mitml/man/testConstraints.Rd0000644000176200001440000001227013321120501015643 0ustar liggesusers\name{testConstraints} \alias{testConstraints} \title{Test functions and constraints of model parameters} \description{ Performs hypothesis tests for arbitrary functions of a parameter vector using the Delta method. } \usage{ testConstraints(model, qhat, uhat, constraints, method=c("D1","D2"), df.com=NULL) } \arguments{ \item{model}{A list of fitted statistical models (see examples).} \item{qhat, uhat}{Two matrices/arrays or lists containing estimates of the parameter vector and its covariance matrix, respectively, for each imputed data set (see examples).} \item{constraints}{A character vector specifying constraints or functions of the vector of model parameters to be tested.} \item{method}{A character string denoting the method by which the test is performed. Can be either \code{"D1"} or \code{"D2"} (see details). Default is \code{"D1"}.} \item{df.com}{(optional) A single number or a numeric vector denoting the complete-data degrees of freedom for the hypothesis test. Only used if \code{method="D1"}.} } \details{ This function is similar in functionality to \code{\link{testModels}} but extended to arbitrary functions (or constraints) of the model parameters. The function is based on the Delta method (e.g., Casella & Berger, 2002), according to which any function of the parameters can be tested using Wald-like methods, assuming that their sampling distribution is approximately normal. It is assumed that the parameters can be extracted using \code{coef} and \code{vcov} methods from the fitted models (or similar; e.g., regression coefficients, fixed effects in multilevel models). In cases where this is not possible, hypothesis tests can be carried out using user-supplied matrices/arrays or lists (\code{qhat} and \code{uhat}, see examples).
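To illustrate the idea behind the Delta method (a simplified sketch only, not the internal code of \code{testConstraints}), a Wald-like test of the linear constraint \eqn{\theta_1 - \theta_2 = 0} can be written in a few lines of plain R, assuming that pooled estimates \code{theta} and their covariance matrix \code{V} are given: \preformatted{
# illustration only: Delta-method Wald test of 'theta1 - theta2'
# 'theta' (estimates) and 'V' (covariance matrix) are assumed to be given
A <- matrix(c(1, -1), nrow=1)    # gradient of the constraint function
est <- A \%*\% theta             # constrained estimate
se <- sqrt(A \%*\% V \%*\% t(A)) # standard error by the Delta method
W <- (est / se)^2                # Wald-like statistic (1 df)
} For nonlinear constraints, \code{A} contains the gradient of the constraint function evaluated at the parameter estimates.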
Constraints and functions of the model parameters can be specified in the \code{constraints} argument. The constraints must be supplied as a character vector, where each string denotes a function or a constraint to be tested (see examples). The Wald-like tests that are carried out by \code{testConstraints} can be aggregated across data sets with the two methods \eqn{D_1} (Li, Raghunathan & Rubin, 1991) and \eqn{D_2} (Li, Meng, Raghunathan & Rubin, 1991), where \eqn{D_1} operates on the constrained estimates and standard errors, and \eqn{D_2} operates on the Wald statistics (for an explanation, see \code{testModels}). The pooled estimates and standard errors reported in the output are always based on \eqn{D_1}. For \eqn{D_1}, the complete-data degrees of freedom can be adjusted for smaller samples by specifying \code{df.com}. Currently, the procedure supports statistical models that define \code{coef} and \code{vcov} methods (e.g., \code{lm}), multilevel models estimated with \code{lme4} or \code{nlme}, and GEEs estimated with \code{geepack}. The arguments \code{qhat} and \code{uhat} allow for more general hypothesis tests regardless of model class. Support for further models may be added in future releases. } \value{ Returns a list containing the results of the model comparison, the constrained estimates and standard errors, and the relative increase in variance due to nonresponse (Rubin, 1987). A \code{print} method is used for more readable console output. } \references{ Casella, G., & Berger, R. L. (2002). \emph{Statistical inference (2nd. Ed.)}. Pacific Grove, CA: Duxbury. Li, K.-H., Meng, X.-L., Raghunathan, T. E., & Rubin, D. B. (1991). Significance levels from repeated p-values with multiply-imputed data. \emph{Statistica Sinica, 1}, 65-92. Li, K. H., Raghunathan, T. E., & Rubin, D. B. (1991). Large-sample significance levels from multiply imputed data using moment-based statistics and an F reference distribution. \emph{Journal of the American Statistical Association, 86}, 1065-1073. Rubin, D. B. (1987). \emph{Multiple imputation for nonresponse in surveys}. New York, NY: Wiley. } \author{Simon Grund} \seealso{\code{\link{testModels}}, \code{\link{with.mitml.list}}} \examples{ data(studentratings) fml <- MathDis + ReadDis + SchClimate ~ (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) implist <- mitmlComplete(imp, print=1:5) # fit simple regression model fit.lm <- with(implist, lm(SchClimate ~ ReadDis + MathDis)) # apply Rubin's rules testEstimates(fit.lm) # * Example 1: test 'identity' function of two parameters # multi-parameter hypothesis test, equivalent to model comparison cons <- c("ReadDis","MathDis") testConstraints(fit.lm, constraints=cons) # ... adjusting for finite samples testConstraints(fit.lm, constraints=cons, df.com=749) # ...
using D2 testConstraints(fit.lm, constraints=cons, method="D2") # * Example 2: test for equality of two effects # tests the hypothesis that the effects of 'ReadDis' and 'MathDis' # are equal (ReadDis=MathDis) cons <- c("ReadDis-MathDis") testConstraints(fit.lm, constraints=cons) # * Example 3: test against a fixed value # tests the hypothesis that the effect of 'ReadDis' is one (ReadDis=1) cons <- c("ReadDis-1") testConstraints(fit.lm, constraints=cons) # * Example 4: test 'identity' using arrays and lists fit.lm <- with(implist, lm(SchClimate ~ ReadDis + MathDis)) cons <- c("ReadDis","MathDis") qhat <- sapply(fit.lm, coef) uhat <- sapply(fit.lm, function(x) vcov(x), simplify="array") testConstraints(qhat=qhat, uhat=uhat, constraints=cons) }
mitml/man/studentratings.Rd0000644000176200001440000000156412765533545015525 0ustar liggesusers\name{studentratings} \alias{studentratings} \docType{data} \title{Example data set on students' ratings and achievement} \description{ This data set contains simulated data for students nested within schools, featuring students' ratings of their teachers' behavior (i.e., disciplinary problems in mathematics and reading class) and their general learning environment (school climate), as well as mathematics and reading achievement scores, and scores for socio-economic status and cognitive ability. In addition, the data set features the ID of 50 different schools (i.e., clusters), the biological sex of all students, and a broad, additional grouping factor. Different amounts of missing data have been inserted into the data set in a completely random fashion. } \usage{data(studentratings)} \format{A data frame containing 750 observations on 10 variables.} \keyword{datasets}
mitml/man/mitml-package.Rd0000644000176200001440000000560313321120501015151 0ustar liggesusers\name{mitml-package} \alias{mitml-package} \docType{package} \title{mitml: Tools for multiple imputation in multilevel modeling} \description{Provides tools for multiple imputation of missing data in multilevel modeling. This package includes a user-friendly interface to the algorithms implemented in the R packages \code{pan} and \code{jomo}, as well as several functions for visualizing, managing, and analyzing multiply imputed data sets. The main interface to \code{pan} is the function \code{\link{panImpute}}, which allows specification of imputation models for continuous variables with missing data at level 1. In addition, the function \code{\link{jomoImpute}} provides an interface to \code{jomo}, which extends the functionality of \code{pan} to continuous and categorical variables with missing data at level 1 and level 2 and also allows for the specification of single-level imputation models. Imputations and parameter chains are stored in objects of class \code{mitml}. To obtain the completed (i.e., imputed) data sets, \code{\link{mitmlComplete}} is used, producing a list of imputed data sets of class \code{mitml.list} that can be used in further analyses. Several additional functions allow for convenient analysis of multiply imputed data sets, especially when using the R packages \code{lme4} and \code{nlme}. The functions \code{\link[=with.mitml.list]{within}}, \code{\link[=sort.mitml.list]{sort}}, and \code{\link[=subset.mitml.list]{subset}} can be used to manage and manipulate multiply imputed data sets. For model fitting, \code{\link[=with.mitml.list]{with}} is used. Final parameter estimates can be extracted using \code{\link{testEstimates}}.
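A typical workflow may thus be sketched as follows (using the \code{studentratings} example data and an arbitrary analysis model): \preformatted{
# impute, complete, analyze, and pool (minimal sketch)
data(studentratings)
fml <- ReadDis + SES ~ ReadAchiev + (1|ID)
imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5)
implist <- mitmlComplete(imp, print="all")
fit <- with(implist, lm(ReadDis ~ SES))
testEstimates(fit)
}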
Single- and multi-parameter hypothesis tests can be performed using the functions \code{\link{testConstraints}} and \code{\link{testModels}}. In addition, the \code{\link{anova}} method provides a simple interface to model comparisons with automatic refitting of statistical models. Data sets can be imported from and exported to other statistical software packages. Currently, \code{\link{mids2mitml.list}}, \code{\link{amelia2mitml.list}}, \code{\link{jomo2mitml.list}}, and \code{\link{long2mitml.list}} can be used for importing imputations from other packages in R. In addition, \code{\link{write.mitmlMplus}} exports data sets to M\emph{plus}, and \code{\link{write.mitmlSAV}} and \code{\link{write.mitmlSPSS}} export data sets to SPSS. Finally, the package provides tools for summarizing and visualizing imputation models, which is useful for the assessment of convergence and the reporting of results. The data sets contained in this package are published under the same license as the package itself. They contain simulated data and may be used by anyone free of charge as long as reference to this package is given. } \author{ Authors: Simon Grund, Alexander Robitzsch, Oliver Luedtke Maintainer: Simon Grund } \keyword{package}
mitml/man/jomoImpute.Rd0000644000176200001440000003310013321120501014557 0ustar liggesusers\name{jomoImpute} \alias{jomoImpute} \title{Impute single-level and multilevel missing data using \code{jomo}} \description{This function provides an interface to the \code{jomo} package, which uses the MCMC algorithms presented in Carpenter & Kenward (2013). Through this wrapper function, single-level and multilevel imputations can be generated for (mixed) categorical and continuous variables (Goldstein et al., 2009). The multilevel procedures support imputation of missing data at level 1 and 2 as well as imputation using random (residual) covariance matrices (Yucel, 2011). Imputations can be generated using \code{type} or \code{formula}, which offer different options for model specification.} \usage{ jomoImpute(data, type, formula, random.L1=c("none","mean","full"), n.burn=5000, n.iter=100, m=10, group=NULL, prior=NULL, seed=NULL, save.pred=FALSE, keep.chains=c("full","diagonal"), silent=FALSE) } \arguments{ \item{data}{A data frame containing incomplete and auxiliary variables, the cluster indicator variable, and any other variables that should be present in the imputed data sets.} \item{type}{An integer vector specifying the role of each variable in the imputation model or a list of two vectors specifying a two-level model (see details).} \item{formula}{A formula specifying the role of each variable in the imputation model or a list of two formulas specifying a two-level model. The basic model is constructed by \code{model.matrix}, which allows including derived variables in the imputation model using \code{I()} (see details and examples).} \item{random.L1}{A character string denoting if the covariance matrix of residuals should vary across groups and how the values of these matrices are stored (see details). Default is \code{"none"}, assuming a common covariance matrix across clusters.} \item{n.burn}{The number of burn-in iterations before any imputations are drawn. Default is 5,000.} \item{n.iter}{The number of iterations between imputations. Default is 100.} \item{m}{The number of imputed data sets to generate. Default is 10.} \item{group}{(optional) A character string denoting the name of an additional grouping variable to be used with the \code{formula} argument.
When specified, the imputation model is run separately within each of these groups.} \item{prior}{(optional) A list with components \code{Binv}, \code{Dinv}, and \code{a} for specifying prior distributions for the covariance matrix of random effects and the covariance matrix of residuals (see details). Default is to use least-informative priors.} \item{seed}{(optional) An integer value initializing R's random number generator for reproducible results. Default is to use the global seed.} \item{save.pred}{(optional) Logical flag indicating if variables derived using \code{formula} should be included in the imputed data sets. Default is \code{FALSE}.} \item{keep.chains}{(optional) A character string denoting which parameter chains to save. Default is to save all chains (see details).} \item{silent}{(optional) Logical flag indicating if console output should be suppressed. Default is \code{FALSE}.} } \details{ This function serves as an interface to the \code{jomo} package and supports imputation of single-level and multilevel continuous and categorical data. In order for categorical variables to be detected correctly, these must be formatted as \code{factor} variables (see examples). The imputation model can be specified using either the \code{type} or the \code{formula} argument. The \code{type} interface is designed to provide quick-and-easy imputations using \code{jomo}. The \code{type} argument must be an integer vector denoting the role of each variable in the imputation model: \itemize{ \item{\code{1}: target variables containing missing data} \item{\code{2}: predictors with a fixed effect on all targets (completely observed)} \item{\code{3}: predictors with a random effect on all targets (completely observed)} \item{\code{-1}: grouping variable within which the imputation is run separately} \item{\code{-2}: cluster indicator variable} \item{\code{0}: variables not featured in the model} } At least one target variable and, for multilevel imputation, the cluster indicator must be specified. If the cluster indicator is omitted, single-level imputation will be performed. The intercept is automatically included both as a fixed and random effect. If a variable of type \code{-1} is found, then separate imputations are performed within each level of that variable. The \code{formula} argument is intended as a more flexible and feature-rich interface to \code{jomo}. Specifying the \code{formula} argument is similar to specifying other formulae in R. Given below is a list of operators that \code{jomoImpute} currently understands: \itemize{ \item{\code{~}: separates the target (left-hand) and predictor (right-hand) side of the model} \item{\code{+}: adds target or predictor variables to the model} \item{\code{*}: adds an interaction term of two or more predictors} \item{\code{|}: denotes cluster-specific random effects and specifies the cluster indicator (e.g., \code{1|ID})} \item{\code{I()}: defines functions to be interpreted by \code{model.matrix}} } If the cluster indicator is omitted, single-level imputation will be run. For multilevel imputation, predictors are allowed to have fixed effects, random effects, or both on all target variables. The intercept is automatically included both as a fixed and, for multilevel imputation, a random effect. Both can be constrained if necessary (see \code{\link{panImpute}}). Note that, when specifying random effects other than the intercept, these will \emph{not} be automatically added as fixed effects and must be included explicitly.
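For example, a model in which \code{ReadAchiev} (from the \code{studentratings} example data) has both a fixed and a random slope on the targets may be sketched as: \preformatted{
# 'ReadAchiev' appears twice: once for the fixed and once for the random effect
fml <- ReadDis + SES ~ ReadAchiev + (1 + ReadAchiev|ID)
} Omitting \code{ReadAchiev} from the fixed part of the formula would leave only its random effect in the model.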
Any predictors defined by \code{I()} will be used for imputation but not included in the data set unless \code{save.pred=TRUE}. If missing data occur at both levels of the sample (level 1 and level 2), then a list of two \code{formula}s or \code{type}s may be provided. The first element of this list denotes the imputation model for variables at level 1. The second element denotes the imputation model for variables at level 2. In such a case, missing values are imputed jointly at both levels (see examples; see also Carpenter & Kenward, 2013; Goldstein et al., 2009). It is possible to model the covariance matrix of residuals at level 1 as random across clusters (Yucel, 2011; Carpenter & Kenward, 2013). The \code{random.L1} argument determines this behavior and how the values of these matrices are stored. If set to \code{"none"}, a common covariance matrix is assumed across groups (similar to \code{panImpute}). If set to \code{"mean"}, the covariance matrices are random, but only the average covariance matrix is stored at each iteration. If set to \code{"full"}, the covariance matrices are random, and all variances and covariances from all clusters are stored. In order to run separate imputations for each level of an additional grouping variable, the \code{group} argument may be used. The name of the grouping variable must be given in quotes. As a default prior, \code{jomoImpute} uses "least informative" inverse-Wishart priors for the covariance matrix of random effects (and residuals at level 2) and the covariance matrix of residuals at level 1, that is, with minimum degrees of freedom (largest dispersion) and identity matrices for scale. For better control, the \code{prior} argument may be used for specifying alternative prior distributions. These must be supplied as a list containing the following components: \itemize{ \item{\code{Binv}: scale matrix for the covariance matrix of residuals at level 1} \item{\code{Dinv}: scale matrix for the covariance matrix of random effects and residuals at level 2} \item{\code{a}: starting value for the degrees of freedom of random covariance matrices of residuals (only used with \code{random.L1="mean"} or \code{random.L1="full"})} } Note that \code{jomo} does not allow for the degrees of freedom for the inverse-Wishart prior to be specified by the user. These are always set to the lowest value possible (largest dispersion) or determined iteratively if the residuals at level 1 are modeled as random (see above). For single-level imputation, only \code{Binv} is relevant. In imputation models with many parameters, the number of parameter chains being saved can be reduced with the \code{keep.chains} argument (see \code{\link{panImpute}}). This setting influences the storage mode of parameters (e.g., dimensions and indices of arrays) and should be used with caution. } \value{ Returns an object of class \code{mitml}, containing the following components: \item{data}{The original (incomplete) data set, sorted according to the cluster variable and (if given) the grouping variable, with several attributes describing the original order (\code{"sort"}), grouping (\code{"group"}) and factor levels of categorical variables.} \item{replacement.mat}{A matrix containing the multiple replacements (i.e., imputations) for each missing value. The replacement matrix contains one row for each missing value and one column for each imputed data set.} \item{index.mat}{A matrix containing the row and column index for each missing value.
The index matrix is used to \emph{link} the missing values in the data set with their corresponding rows in the replacement matrix.} \item{call}{The matched function call.} \item{model}{A list containing the names of the cluster variable, the target variables, and the predictor variables with fixed and random effects, at level 1 and level 2, respectively.} \item{random.L1}{A character string denoting the handling of the (random) covariance matrix of residuals at level 1 (see details).} \item{prior}{The prior parameters used in the imputation model.} \item{iter}{A list containing the number of burn-in iterations, the number of iterations between imputations, and the number of imputed data sets.} \item{par.burnin}{A multi-dimensional array containing the parameters of the imputation model from the burn-in phase.} \item{par.imputation}{A multi-dimensional array containing the parameters of the imputation model from the imputation phase.} } \note{ For objects of class \code{mitml}, methods for the generic functions \code{print}, \code{summary}, and \code{plot} have been defined. \code{mitmlComplete} is used for extracting the imputed data sets. } \references{ Carpenter, J. R., & Kenward, M. G. (2013). \emph{Multiple imputation and its application}. Hoboken, NJ: Wiley. Goldstein, H., Carpenter, J., Kenward, M. G., & Levin, K. A. (2009). Multilevel models with multivariate mixed response types. \emph{Statistical Modelling}, 9, 173-197. Yucel, R. M. (2011). Random covariances and mixed-effects models for imputing multivariate multilevel continuous data. \emph{Statistical Modelling}, 11, 351-370. } \author{Simon Grund, Alexander Robitzsch, Oliver Luedtke} \seealso{\code{\link{panImpute}}, \code{\link{mitmlComplete}}, \code{\link{summary.mitml}}, \code{\link{plot.mitml}}} \examples{ # NOTE: The number of iterations in these examples is much lower than it # should be! This is done in order to comply with CRAN policies, and more # iterations are recommended for applications in practice! data(studentratings) data(leadership) # *** # for further examples, see "panImpute" # ?panImpute # *** ................................ # the 'type' interface # # * Example 1.1 (studentratings): 'ReadDis' and 'SES', predicted by 'ReadAchiev' # (random slope) type <- c(-2,0,0,0,0,1,3,1,0,0) names(type) <- colnames(studentratings) type imp <- jomoImpute(studentratings, type=type, n.burn=100, n.iter=10, m=5) # * Example 1.2 (leadership): all variables (mixed continuous and categorical # data with missing values at level 1 and level 2) type.L1 <- c(-2,1,0,1,1) # imputation model at level 1 type.L2 <- c(-2,0,1,0,0) # imputation model at level 2 names(type.L1) <- names(type.L2) <- colnames(leadership) type <- list(type.L1, type.L2) type imp <- jomoImpute(leadership, type=type, n.burn=100, n.iter=10, m=5) # * Example 1.3 (studentratings): 'ReadDis', 'ReadAchiev', and 'SES' predicted # with empty model, groupwise for 'FedState' (single-level imputation) type <- c(0,-1,0,0,0,1,1,1,0,0) names(type) <- colnames(studentratings) type imp <- jomoImpute(studentratings, type=type, group="FedState", n.burn=100, n.iter=10, m=5) # *** ................................ 
# the 'formula' interface # # * Example 2.1 (studentratings): 'ReadDis' and 'SES' predicted by 'ReadAchiev' # (random slope) fml <- ReadDis + SES ~ ReadAchiev + (1+ReadAchiev|ID) imp <- jomoImpute(studentratings, formula=fml, n.burn=100, n.iter=10, m=5) # * Example 2.2 (studentratings): 'ReadDis' predicted by 'ReadAchiev' and # the cluster mean of 'ReadAchiev' fml <- ReadDis ~ ReadAchiev + I(clusterMeans(ReadAchiev,ID)) + (1|ID) imp <- jomoImpute(studentratings, formula=fml, n.burn=100, n.iter=10, m=5) # * Example 2.3 (studentratings): 'ReadDis' predicted by 'ReadAchiev', groupwise # for 'FedState' fml <- ReadDis ~ ReadAchiev + (1|ID) imp <- jomoImpute(studentratings, formula=fml, group="FedState", n.burn=100, n.iter=10, m=5) # * Example 2.4 (leadership): all variables (mixed continuous and categorical # data with missing values at level 1 and level 2) fml <- list( JOBSAT + NEGLEAD + WLOAD ~ 1 + (1|GRPID) , COHES ~ 1 ) imp <- jomoImpute(leadership, formula=fml, n.burn=100, n.iter=10, m=5) # * Example 2.5 (studentratings): 'ReadDis', 'ReadAchiev', and 'SES' predicted # with empty model, groupwise for 'FedState' (single-level imputation) fml <- ReadDis + ReadAchiev + SES ~ 1 imp <- jomoImpute(studentratings, formula=fml, group="FedState", n.burn=100, n.iter=10, m=5) } \keyword{models}
mitml/man/clusterMeans.Rd0000644000176200001440000000504312765551601015115 0ustar liggesusers\name{clusterMeans} \alias{clusterMeans} \title{Calculate cluster means} \description{ Calculates the mean of a given variable within each cluster, possibly conditioning on an additional grouping variable. } \usage{ clusterMeans(x, cluster, adj=FALSE, group=NULL) } \arguments{ \item{x}{A numeric vector for which cluster means should be calculated. Can also be supplied as a character string denoting a variable in the current environment (see details).} \item{cluster}{A numeric vector or a factor denoting the cluster membership of each unit in \code{x}. Can also be supplied as a character string (see details).} \item{adj}{Logical flag indicating if person-adjusted group means should be calculated. If \code{TRUE}, the cluster mean for each unit is calculated with that unit excluded. Default is \code{FALSE}.} \item{group}{(optional) A grouping factor or a variable that can be interpreted as such. If specified, then cluster means are calculated conditionally on the grouping variable, that is, separately within sub-groups. Can also be supplied as a character string (see details).} } \details{ This function calculates the mean of a variable within each level of a cluster variable. Any \code{NA} values are omitted during calculation. The three main arguments of the function can also be supplied as (single) character strings, denoting the name of the respective variables in the current environment. This is especially useful for calculating several cluster means simultaneously, for example using \code{\link{within.mitml.list}} (see Example 2 below). } \value{ Returns a numeric vector with the same length as \code{x} containing the cluster mean for all units. } \author{Simon Grund, Alexander Robitzsch} \seealso{\code{\link{within.mitml.list}}} \examples{ data(studentratings) fml <- ReadDis + SES ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) implist <- mitmlComplete(imp, print=1:5) # * Example 1: single cluster means # calculate cluster means (for each data set) with(implist, clusterMeans(ReadAchiev, ID)) # ...
person-adjusted cluster means with(implist, clusterMeans(ReadAchiev, ID, adj=TRUE)) # ... groupwise cluster means with(implist, clusterMeans(ReadAchiev, ID, group=Sex)) # * Example 2: automated cluster means using 'for' and 'assign' # calculate multiple cluster means within multiply imputed data sets within(implist,{ vars <- c("ReadAchiev","MathAchiev","CognAbility") for(i in vars) assign(paste(i,"Mean",sep="."), clusterMeans(i,ID)) rm(i,vars) }) }
mitml/man/mitmlComplete.Rd0000644000176200001440000000322412765551665015264 0ustar liggesusers\name{mitmlComplete} \alias{mitmlComplete} \title{Extract imputed data sets} \description{This function extracts imputed data sets from \code{mitml} class objects as produced by \code{panImpute} and \code{jomoImpute}.} \usage{ mitmlComplete(x, print="all", force.list=FALSE) } \arguments{ \item{x}{An object of class \code{mitml} as produced by \code{panImpute} and \code{jomoImpute}.} \item{print}{Either an integer vector, \code{"list"}, or \code{"all"} denoting which data sets to extract. If set to \code{"list"} or \code{"all"}, then all imputed data sets will be returned as a list. Negative values and zero will return the original (incomplete) data set. Default is \code{"all"}.} \item{force.list}{(optional) Logical flag indicating if single data sets should be enclosed in a list. Default is \code{FALSE}.} } \value{ Single data sets are returned as a data frame unless \code{force.list=TRUE}. If several data sets are extracted, the result is always a list of data sets with an additional class attribute \code{mitml.list}. } \author{Simon Grund} \seealso{\code{\link{panImpute}}, \code{\link{jomoImpute}}} \examples{ data(studentratings) fml <- ReadDis + SES ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) # extract original (incomplete) data set mitmlComplete(imp, print=0) # extract first imputed data set (returned as mitml.list) mitmlComplete(imp, print=1, force.list=TRUE) # extract all imputed data sets at once implist <- mitmlComplete(imp, print="all") \dontrun{ # ... alternatives with same results implist <- mitmlComplete(imp, print=1:5) implist <- mitmlComplete(imp, print="list") } }
mitml/man/summary.mitml.Rd0000644000176200001440000000576212765552121015270 0ustar liggesusers\name{summary.mitml} \alias{summary.mitml} \title{Summary measures for imputation models} \description{ Provides summary statistics and additional information on imputations in objects of class \code{mitml}. } \usage{ \method{summary}{mitml}(object, n.Rhat=3, goodness.of.appr=FALSE, autocorrelation=FALSE, ...) } \arguments{ \item{object}{An object of class \code{mitml} as produced by \code{panImpute} and \code{jomoImpute}.} \item{n.Rhat}{(optional) An integer denoting the number of sequences used for calculating the potential scale reduction factor. Default is \code{3}.} \item{goodness.of.appr}{(optional) A logical flag indicating if the goodness of approximation should be printed. Default is \code{FALSE} (see details).} \item{autocorrelation}{(optional) A logical flag indicating if the autocorrelation should be printed. Default is \code{FALSE} (see details).} \item{\dots}{Not used.} } \details{ The \code{summary} method calculates summary statistics for objects of class \code{mitml} as produced by \code{\link{panImpute}} and \code{\link{jomoImpute}}. The output includes the potential scale reduction factor (PSRF, or \eqn{\hat{R}}) and (optionally) the goodness of approximation and autocorrelation.
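For example, both optional statistics can be requested in addition to the PSRF (a minimal sketch, assuming an object \code{imp} as produced by \code{panImpute} or \code{jomoImpute}): \preformatted{
# request goodness of approximation and autocorrelation in the summary
summary(imp, goodness.of.appr=TRUE, autocorrelation=TRUE)
}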
The PSRF is calculated for each parameter of the imputation model and may be interpreted as a measure of convergence (Gelman and Rubin, 1992). Calculation of the PSRFs can be suppressed by setting \code{n.Rhat=NULL}. The PSRFs are not computed from different chains, but by dividing each chain from the imputation phase into a number of sequences as denoted by \code{n.Rhat}. This is slightly different from the original method proposed by Gelman and Rubin. The goodness of approximation indicates what proportion of the posterior standard deviation is due to simulation error. For multiple imputation, the goodness of approximation is not essential; it should be considered only if posterior summaries, such as the EAP, are of interest. The autocorrelation includes estimates of the autocorrelation in the parameter chains at lag 1 (i.e., for consecutive draws) and for lags \eqn{k} and \eqn{2k}, where \eqn{k} is the number of iterations between imputations. For lags \eqn{k} and \eqn{2k}, the autocorrelation is slightly smoothed to reduce the influence of noise on the estimates (see \code{\link{plot.mitml}}). } \value{ Returns an object of class \code{summary.mitml}. A print method is used for more readable console output. } \references{ Gelman, A., and Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. \emph{Statistical Science, 7}, 457-472. Hoff, P. D. (2009). \emph{A first course in Bayesian statistical methods}. New York, NY: Springer. } \author{Simon Grund} \seealso{\code{\link{panImpute}}, \code{\link{jomoImpute}}, \code{\link{plot.mitml}}} \examples{ data(studentratings) fml <- ReadDis + SES ~ ReadAchiev + (1|ID) imp <- panImpute(studentratings, formula=fml, n.burn=1000, n.iter=100, m=5) # print summary summary(imp) } \keyword{methods}
mitml/man/c.mitml.list.Rd0000644000176200001440000000342013057741221014756 0ustar liggesusers\name{c.mitml.list} \alias{c.mitml.list} \alias{rbind.mitml.list} \alias{cbind.mitml.list} \title{Concatenate lists of imputed data sets} \description{ These functions allow concatenating lists of imputed data sets by data set, row, or column. } \usage{ \method{c}{mitml.list}(...) \method{rbind}{mitml.list}(...) \method{cbind}{mitml.list}(...) } \arguments{ \item{\dots}{One or several lists of imputed data sets with class \code{mitml.list} as produced by \code{mitmlComplete} (or similar).} } \details{ These functions allow concatenating multiple lists of imputed data sets. The function \code{c} concatenates by data set (i.e., by appending additional data sets to the list), \code{rbind} concatenates by row (i.e., by appending additional rows to each data set), and \code{cbind} concatenates by column (i.e., by appending additional columns to each data set). These functions are intended for experienced users and should be used with caution. Appending rows or columns from multiple imputation procedures is usually unsafe except in special applications (see Examples). } \value{ A list of imputed data sets with an additional class attribute \code{mitml.list}.
} \author{Simon Grund} \examples{ # Example 1: manual imputation by grouping variable data(studentratings) fml <- ReadDis + SES ~ ReadAchiev + (1|ID) imp <- panImpute(subset(studentratings, FedState=="SH"), formula=fml, n.burn=1000, n.iter=100, m=5) implist <- mitmlComplete(imp, print="all") imp2 <- panImpute(subset(studentratings, FedState=="B"), formula=fml, n.burn=1000, n.iter=100, m=5) implist2 <- mitmlComplete(imp2, print="all") rbind(implist, implist2) # Example 2: predicted values from linear model pred <- with(implist, predict(lm(ReadDis~ReadAchiev))) cbind(implist, pred.ReadDis=pred) } \keyword{methods}