ggstats/0000755000176200001440000000000014527062732011736 5ustar liggesusersggstats/NAMESPACE0000644000176200001440000000175414466120077013163 0ustar liggesusers# Generated by roxygen2: do not edit by hand export("%>%") export(PositionLikert) export(PositionLikertCount) export(StatCross) export(StatProp) export(StatWeightedMean) export(augment_chisq_add_phi) export(geom_stripped_cols) export(geom_stripped_rows) export(ggcoef_compare) export(ggcoef_model) export(ggcoef_multicomponents) export(ggcoef_multinom) export(ggcoef_plot) export(ggcoef_table) export(gglikert) export(gglikert_data) export(gglikert_stacked) export(ggsurvey) export(label_number_abs) export(label_percent_abs) export(position_likert) export(position_likert_count) export(signif_stars) export(stat_cross) export(stat_prop) export(stat_weighted_mean) export(weighted.median) export(weighted.quantile) import(ggplot2) importFrom(dplyr,.data) importFrom(dplyr,sym) importFrom(ggplot2,after_stat) importFrom(lifecycle,deprecate_soft) importFrom(lifecycle,deprecated) importFrom(magrittr,"%>%") importFrom(stats,weighted.mean) importFrom(stats,weights) ggstats/README.md0000644000176200001440000001111114526731562013213 0ustar liggesusers # `ggstats`: extension to `ggplot2` for plotting stats [![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable) [![R-CMD-check](https://github.com/larmarange/ggstats/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/larmarange/ggstats/actions/workflows/R-CMD-check.yaml) [![Codecov test coverage](https://codecov.io/gh/larmarange/ggstats/branch/main/graph/badge.svg)](https://app.codecov.io/gh/larmarange/ggstats?branch=main) [![CRAN status](https://www.r-pkg.org/badges/version/ggstats)](https://CRAN.R-project.org/package=ggstats) [![DOI](https://zenodo.org/badge/547360047.svg)](https://zenodo.org/badge/latestdoi/547360047) The `ggstats` package provides new statistics, new geometries and new positions for `ggplot2` and a suite of functions to facilitate the creation of statistical plots. ## Installation & Documentation To install **stable version**: ``` r install.packages("ggstats") ``` Documentation of stable version: To install **development version**: ``` r remotes::install_github("larmarange/ggstats") ``` Documentation of development version: ## Plot model coefficients ``` r library(ggstats) mod1 <- lm(Fertility ~ ., data = swiss) ggcoef_model(mod1) ``` ``` r ggcoef_table(mod1) ``` ## Comparing several models ``` r mod2 <- step(mod1, trace = 0) mod3 <- lm(Fertility ~ Agriculture + Education * Catholic, data = swiss) models <- list( "Full model" = mod1, "Simplified model" = mod2, "With interaction" = mod3 ) ggcoef_compare(models, type = "faceted") ``` ## Compute custom proportions ``` r library(ggplot2) ggplot(as.data.frame(Titanic)) + aes(x = Class, fill = Survived, weight = Freq, by = Class) + geom_bar(position = "fill") + geom_text(stat = "prop", position = position_fill(.5)) + facet_grid(~Sex) ``` ## Compute weighted mean ``` r data(tips, package = "reshape") ggplot(tips) + aes(x = day, y = total_bill, fill = sex) + stat_weighted_mean(geom = "bar", position = "dodge") + ylab("Mean total bill per day and sex") ``` ## Compute cross-tabulation statistics ``` r ggplot(as.data.frame(Titanic)) + aes( x = Class, y = Survived, weight = Freq, size = after_stat(observed), fill = after_stat(std.resid) ) + stat_cross(shape = 22) + scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) + scale_size_area(max_size = 20) ``` ## Plot survey objects taking into account weights ``` r library(survey, quietly = TRUE) #> #> Attachement du package : 'survey' #> L'objet suivant est masqué depuis 'package:graphics': #> #> dotchart dw <- svydesign( ids = ~1, weights = ~Freq, data = as.data.frame(Titanic) ) ggsurvey(dw) + aes(x = Class, fill = Survived) + geom_bar(position = "fill") + ylab("Weighted proportion of survivors") ``` ## Plot Likert-type items ``` r library(dplyr) #> #> Attachement du package : 'dplyr' #> Les objets suivants sont masqués depuis 'package:stats': #> #> filter, lag #> Les objets suivants sont masqués depuis 'package:base': #> #> intersect, setdiff, setequal, union likert_levels <- c( "Strongly disagree", "Disagree", "Neither agree nor disagree", "Agree", "Strongly agree" ) set.seed(42) df <- tibble( q1 = sample(likert_levels, 150, replace = TRUE), q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1), q3 = sample(likert_levels, 150, replace = TRUE, prob = 1:5), q4 = sample(likert_levels, 150, replace = TRUE, prob = 1:5), q5 = sample(c(likert_levels, NA), 150, replace = TRUE), q6 = sample(likert_levels, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0)) ) %>% mutate(across(everything(), ~ factor(.x, levels = likert_levels))) gglikert(df) ``` ggstats/man/0000755000176200001440000000000014504775450012514 5ustar liggesusersggstats/man/stat_weighted_mean.Rd0000644000176200001440000001152014415736053016632 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/stat_weighted_mean.R \docType{data} \name{stat_weighted_mean} \alias{stat_weighted_mean} \alias{StatWeightedMean} \title{Compute weighted y mean} \usage{ stat_weighted_mean( mapping = NULL, data = NULL, geom = "point", position = "identity", ..., na.rm = FALSE, orientation = NA, show.legend = NA, inherit.aes = TRUE ) } \arguments{ \item{mapping}{Set of aesthetic mappings created by \code{\link[ggplot2:aes]{aes()}}. If specified and \code{inherit.aes = TRUE} (the default), it is combined with the default mapping at the top level of the plot. You must supply \code{mapping} if there is no plot mapping.} \item{data}{The data to be displayed in this layer. There are three options: If \code{NULL}, the default, the data is inherited from the plot data as specified in the call to \code{\link[ggplot2:ggplot]{ggplot()}}. A \code{data.frame}, or other object, will override the plot data. All objects will be fortified to produce a data frame. See \code{\link[ggplot2:fortify]{fortify()}} for which variables will be created. A \code{function} will be called with a single argument, the plot data. The return value must be a \code{data.frame}, and will be used as the layer data. A \code{function} can be created from a \code{formula} (e.g. \code{~ head(.x, 10)}).} \item{geom}{Override the default connection with \code{\link[ggplot2:geom_point]{ggplot2::geom_point()}}.} \item{position}{Position adjustment, either as a string naming the adjustment (e.g. \code{"jitter"} to use \code{position_jitter}), or the result of a call to a position adjustment function. Use the latter if you need to change the settings of the adjustment.} \item{...}{Other arguments passed on to \code{\link[ggplot2:layer]{layer()}}. These are often aesthetics, used to set an aesthetic to a fixed value, like \code{colour = "red"} or \code{size = 3}. They may also be parameters to the paired geom/stat.} \item{na.rm}{If \code{FALSE}, the default, missing values are removed with a warning. If \code{TRUE}, missing values are silently removed.} \item{orientation}{The orientation of the layer. The default (\code{NA}) automatically determines the orientation from the aesthetic mapping. In the rare event that this fails it can be given explicitly by setting \code{orientation} to either \code{"x"} or \code{"y"}. See the \emph{Orientation} section for more detail.} \item{show.legend}{logical. Should this layer be included in the legends? \code{NA}, the default, includes if any aesthetics are mapped. \code{FALSE} never includes, and \code{TRUE} always includes. It can also be a named logical vector to finely select the aesthetics to display.} \item{inherit.aes}{If \code{FALSE}, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. \code{\link[ggplot2:borders]{borders()}}.} } \value{ A \code{ggplot2} plot with the added statistic. } \description{ This statistic will compute the mean of \strong{y} aesthetic for each unique value of \strong{x}, taking into account \strong{weight} aesthetic if provided. } \section{Computed variables}{ \describe{ \item{y}{weighted y (numerator / denominator)} \item{numerator}{numerator} \item{denominator}{denominator} } } \examples{ \dontshow{if (requireNamespace("reshape")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} \dontshow{\}) # examplesIf} library(ggplot2) data(tips, package = "reshape") ggplot(tips) + aes(x = day, y = total_bill) + geom_point() ggplot(tips) + aes(x = day, y = total_bill) + stat_weighted_mean() \donttest{ ggplot(tips) + aes(x = day, y = total_bill, group = 1) + stat_weighted_mean(geom = "line") ggplot(tips) + aes(x = day, y = total_bill, colour = sex, group = sex) + stat_weighted_mean(geom = "line") ggplot(tips) + aes(x = day, y = total_bill, fill = sex) + stat_weighted_mean(geom = "bar", position = "dodge") # computing a proportion on the fly if (requireNamespace("scales")) { ggplot(tips) + aes(x = day, y = as.integer(smoker == "Yes"), fill = sex) + stat_weighted_mean(geom = "bar", position = "dodge") + scale_y_continuous(labels = scales::percent) } } library(ggplot2) # taking into account some weights if (requireNamespace("scales")) { d <- as.data.frame(Titanic) ggplot(d) + aes( x = Class, y = as.integer(Survived == "Yes"), weight = Freq, fill = Sex ) + geom_bar(stat = "weighted_mean", position = "dodge") + scale_y_continuous(labels = scales::percent) + labs(y = "Survived") } } \seealso{ \code{vignette("stat_weighted_mean")} } \keyword{datasets} ggstats/man/label_number_abs.Rd0000644000176200001440000000216614527052140016251 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/label_number_abs.R \name{label_number_abs} \alias{label_number_abs} \alias{label_percent_abs} \title{Label absolute values} \usage{ label_number_abs(..., hide_below = NULL) label_percent_abs(..., hide_below = NULL) } \arguments{ \item{...}{arguments passed to \code{\link[scales:label_number]{scales::label_number()}} or \code{\link[scales:label_percent]{scales::label_percent()}}} \item{hide_below}{if provided, values below \code{hide_below} will be masked (i.e. an empty string \code{""} will be returned)} } \value{ A "labelling" function, , i.e. a function that takes a vector and returns a character vector of same length giving a label for each input value. } \description{ Label absolute values } \examples{ x <- c(-0.2, -.05, 0, .07, .25, .66) scales::label_number()(x) label_number_abs()(x) scales::label_percent()(x) label_percent_abs()(x) label_percent_abs(hide_below = .1)(x) } \seealso{ \code{\link[scales:label_number]{scales::label_number()}}, \code{\link[scales:label_percent]{scales::label_percent()}} } ggstats/man/ggstats-package.Rd0000644000176200001440000000143114467450345016050 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/ggstats-package.R \docType{package} \name{ggstats-package} \alias{ggstats} \alias{ggstats-package} \title{ggstats: Extension to 'ggplot2' for Plotting Stats} \description{ Provides new statistics, new geometries and new positions for 'ggplot2' and a suite of functions to facilitate the creation of statistical plots. } \seealso{ Useful links: \itemize{ \item \url{https://larmarange.github.io/ggstats/} \item \url{https://github.com/larmarange/ggstats} \item Report bugs at \url{https://github.com/larmarange/ggstats/issues} } } \author{ \strong{Maintainer}: Joseph Larmarange \email{joseph@larmarange.net} (\href{https://orcid.org/0000-0001-7097-700X}{ORCID}) } \keyword{internal} ggstats/man/ggcoef_model.Rd0000644000176200001440000004406414505244316015416 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/ggcoef_model.R \name{ggcoef_model} \alias{ggcoef_model} \alias{ggcoef_table} \alias{ggcoef_compare} \alias{ggcoef_multinom} \alias{ggcoef_multicomponents} \alias{ggcoef_plot} \title{Plot model coefficients} \usage{ ggcoef_model( model, tidy_fun = broom.helpers::tidy_with_broom_or_parameters, tidy_args = NULL, conf.int = TRUE, conf.level = 0.95, exponentiate = FALSE, variable_labels = NULL, term_labels = NULL, interaction_sep = " * ", categorical_terms_pattern = "{level}", add_reference_rows = TRUE, no_reference_row = NULL, intercept = FALSE, include = dplyr::everything(), add_pairwise_contrasts = FALSE, pairwise_variables = broom.helpers::all_categorical(), keep_model_terms = FALSE, pairwise_reverse = TRUE, emmeans_args = list(), significance = 1 - conf.level, significance_labels = NULL, show_p_values = TRUE, signif_stars = TRUE, return_data = FALSE, ... ) ggcoef_table( model, tidy_fun = broom.helpers::tidy_with_broom_or_parameters, tidy_args = NULL, conf.int = TRUE, conf.level = 0.95, exponentiate = FALSE, variable_labels = NULL, term_labels = NULL, interaction_sep = " * ", categorical_terms_pattern = "{level}", add_reference_rows = TRUE, no_reference_row = NULL, intercept = FALSE, include = dplyr::everything(), add_pairwise_contrasts = FALSE, pairwise_variables = broom.helpers::all_categorical(), keep_model_terms = FALSE, pairwise_reverse = TRUE, emmeans_args = list(), significance = 1 - conf.level, significance_labels = NULL, show_p_values = FALSE, signif_stars = FALSE, table_stat = c("estimate", "ci", "p.value"), table_header = NULL, table_text_size = 3, table_stat_label = NULL, ci_pattern = "{conf.low}, {conf.high}", table_witdhs = c(3, 2), plot_title = NULL, ... ) ggcoef_compare( models, type = c("dodged", "faceted"), tidy_fun = broom.helpers::tidy_with_broom_or_parameters, tidy_args = NULL, conf.int = TRUE, conf.level = 0.95, exponentiate = FALSE, variable_labels = NULL, term_labels = NULL, interaction_sep = " * ", categorical_terms_pattern = "{level}", add_reference_rows = TRUE, no_reference_row = NULL, intercept = FALSE, include = dplyr::everything(), add_pairwise_contrasts = FALSE, pairwise_variables = broom.helpers::all_categorical(), keep_model_terms = FALSE, pairwise_reverse = TRUE, emmeans_args = list(), significance = 1 - conf.level, significance_labels = NULL, return_data = FALSE, ... ) ggcoef_multinom( model, type = c("dodged", "faceted", "table"), y.level_label = NULL, tidy_fun = broom.helpers::tidy_with_broom_or_parameters, tidy_args = NULL, conf.int = TRUE, conf.level = 0.95, exponentiate = FALSE, variable_labels = NULL, term_labels = NULL, interaction_sep = " * ", categorical_terms_pattern = "{level}", add_reference_rows = TRUE, no_reference_row = NULL, intercept = FALSE, include = dplyr::everything(), significance = 1 - conf.level, significance_labels = NULL, return_data = FALSE, table_stat = c("estimate", "ci", "p.value"), table_header = NULL, table_text_size = 3, table_stat_label = NULL, ci_pattern = "{conf.low}, {conf.high}", table_witdhs = c(3, 2), ... ) ggcoef_multicomponents( model, type = c("dodged", "faceted", "table"), component_col = "component", component_label = NULL, tidy_fun = broom.helpers::tidy_with_broom_or_parameters, tidy_args = NULL, conf.int = TRUE, conf.level = 0.95, exponentiate = FALSE, variable_labels = NULL, term_labels = NULL, interaction_sep = " * ", categorical_terms_pattern = "{level}", add_reference_rows = TRUE, no_reference_row = NULL, intercept = FALSE, include = dplyr::everything(), significance = 1 - conf.level, significance_labels = NULL, return_data = FALSE, table_stat = c("estimate", "ci", "p.value"), table_header = NULL, table_text_size = 3, table_stat_label = NULL, ci_pattern = "{conf.low}, {conf.high}", table_witdhs = c(3, 2), ... ) ggcoef_plot( data, x = "estimate", y = "label", exponentiate = FALSE, point_size = 2, point_stroke = 2, point_fill = "white", colour = NULL, colour_guide = TRUE, colour_lab = "", colour_labels = ggplot2::waiver(), shape = "significance", shape_values = c(16, 21), shape_guide = TRUE, shape_lab = "", errorbar = TRUE, errorbar_height = 0.1, errorbar_coloured = FALSE, stripped_rows = TRUE, strips_odd = "#11111111", strips_even = "#00000000", vline = TRUE, vline_colour = "grey50", dodged = FALSE, dodged_width = 0.8, facet_row = "var_label", facet_col = NULL, facet_labeller = "label_value" ) } \arguments{ \item{model}{a regression model object} \item{tidy_fun}{option to specify a custom tidier function} \item{tidy_args}{Additional arguments passed to \code{\link[broom.helpers:tidy_plus_plus]{broom.helpers::tidy_plus_plus()}} and to \code{tidy_fun}} \item{conf.int}{should confidence intervals be computed? (see \code{\link[broom:reexports]{broom::tidy()}})} \item{conf.level}{the confidence level to use for the confidence interval if \code{conf.int = TRUE}; must be strictly greater than 0 and less than 1; defaults to 0.95, which corresponds to a 95 percent confidence interval} \item{exponentiate}{if \code{TRUE} a logarithmic scale will be used for x-axis} \item{variable_labels}{a named list or a named vector of custom variable labels} \item{term_labels}{a named list or a named vector of custom term labels} \item{interaction_sep}{separator for interaction terms} \item{categorical_terms_pattern}{a \link[glue:glue]{glue pattern} for labels of categorical terms with treatment or sum contrasts (see \code{\link[broom.helpers:model_list_terms_levels]{model_list_terms_levels()}})} \item{add_reference_rows}{should reference rows be added?} \item{no_reference_row}{variables (accepts \link[dplyr:select]{tidyselect} notation) for those no reference row should be added, when \code{add_reference_rows = TRUE}} \item{intercept}{should the intercept(s) be included?} \item{include}{variables to include. Accepts \link[dplyr:select]{tidyselect} syntax. Use \code{-} to remove a variable. Default is \code{everything()}. See also \code{\link[broom.helpers:all_continuous]{all_continuous()}}, \code{\link[broom.helpers:all_categorical]{all_categorical()}}, \code{\link[broom.helpers:all_dichotomous]{all_dichotomous()}} and \code{\link[broom.helpers:all_interaction]{all_interaction()}}} \item{add_pairwise_contrasts}{apply \code{\link[broom.helpers:tidy_add_pairwise_contrasts]{tidy_add_pairwise_contrasts()}}? \ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#experimental}{\figure{lifecycle-experimental.svg}{options: alt='[Experimental]'}}}{\strong{[Experimental]}}} \item{pairwise_variables}{variables to add pairwise contrasts (accepts \link[dplyr:select]{tidyselect} notation)} \item{keep_model_terms}{keep original model terms for variables where pairwise contrasts are added? (default is \code{FALSE})} \item{pairwise_reverse}{determines whether to use \code{"pairwise"} (if \code{TRUE}) or \code{"revpairwise"} (if \code{FALSE}), see \code{\link[emmeans:contrast]{emmeans::contrast()}}} \item{emmeans_args}{list of additional parameter to pass to \code{\link[emmeans:emmeans]{emmeans::emmeans()}} when computing pairwise contrasts} \item{significance}{level (between 0 and 1) below which a coefficient is consider to be significantly different from 0 (or 1 if \code{exponentiate = TRUE}), \code{NULL} for not highlighting such coefficients} \item{significance_labels}{optional vector with custom labels for significance variable} \item{show_p_values}{if \code{TRUE}, add p-value to labels} \item{signif_stars}{if \code{TRUE}, add significant stars to labels} \item{return_data}{if \code{TRUE}, will return the data.frame used for plotting instead of the plot} \item{...}{parameters passed to \code{\link[=ggcoef_plot]{ggcoef_plot()}}} \item{table_stat}{statistics to display in the table, use any column name returned by the tidier or \code{"ci"} for confidence intervals formatted according to \code{ci_pattern}} \item{table_header}{optional custom headers for the table} \item{table_text_size}{text size for the table} \item{table_stat_label}{optional named list of labeller functions for the displayed statistic (see examples)} \item{ci_pattern}{glue pattern for confidence intervals in the table} \item{table_witdhs}{relative widths of the forest plot and the coefficients table} \item{plot_title}{an optional plot title} \item{models}{named list of models} \item{type}{a dodged plot, a faceted plot or multiple table plots?} \item{y.level_label}{an optional named vector for labeling \code{y.level} (see examples)} \item{component_col}{name of the component column} \item{component_label}{an optional named vector for labeling components} \item{data}{a data frame containing data to be plotted, typically the output of \code{ggcoef_model()}, \code{ggcoef_compare()} or \code{ggcoef_multinom()} with the option \code{return_data = TRUE}} \item{x, y}{variables mapped to x and y axis} \item{point_size}{size of the points} \item{point_stroke}{thickness of the points} \item{point_fill}{fill colour for the points} \item{colour}{optional variable name to be mapped to colour aesthetic} \item{colour_guide}{should colour guide be displayed in the legend?} \item{colour_lab}{label of the colour aesthetic in the legend} \item{colour_labels}{labels argument passed to \code{\link[ggplot2:scale_colour_discrete]{ggplot2::scale_colour_discrete()}} and \code{\link[ggplot2:discrete_scale]{ggplot2::discrete_scale()}}} \item{shape}{optional variable name to be mapped to the shape aesthetic} \item{shape_values}{values of the different shapes to use in \code{\link[ggplot2:scale_manual]{ggplot2::scale_shape_manual()}}} \item{shape_guide}{should shape guide be displayed in the legend?} \item{shape_lab}{label of the shape aesthetic in the legend} \item{errorbar}{should error bars be plotted?} \item{errorbar_height}{height of error bars} \item{errorbar_coloured}{should error bars be colored as the points?} \item{stripped_rows}{should stripped rows be displayed in the background?} \item{strips_odd}{color of the odd rows} \item{strips_even}{color of the even rows} \item{vline}{should a vertical line be drawn at 0 (or 1 if \code{exponentiate = TRUE})?} \item{vline_colour}{colour of vertical line} \item{dodged}{should points be dodged (according to the colour aesthetic)?} \item{dodged_width}{width value for \code{\link[ggplot2:position_dodge]{ggplot2::position_dodge()}}} \item{facet_row}{variable name to be used for row facets} \item{facet_col}{optional variable name to be used for column facets} \item{facet_labeller}{labeller function to be used for labeling facets; if labels are too long, you can use \code{\link[ggplot2:labellers]{ggplot2::label_wrap_gen()}} (see examples), more information in the documentation of \code{\link[ggplot2:facet_grid]{ggplot2::facet_grid()}}} } \value{ A \code{ggplot2} plot or a \code{tibble} if \code{return_data = TRUE}. } \description{ \code{ggcoef_model()}, \code{ggcoef_table()}, \code{ggcoef_multinom()}, \code{ggcoef_multicomponents()} and \code{ggcoef_compare()} use \code{\link[broom.helpers:tidy_plus_plus]{broom.helpers::tidy_plus_plus()}} to obtain a \code{tibble} of the model coefficients, apply additional data transformation and then pass the produced \code{tibble} to \code{ggcoef_plot()} to generate the plot. } \details{ For more control, you can use the argument \code{return_data = TRUE} to get the produced \code{tibble}, apply any transformation of your own and then pass your customized \code{tibble} to \code{ggcoef_plot()}. } \section{Functions}{ \itemize{ \item \code{ggcoef_table()}: a variation of \code{\link[=ggcoef_model]{ggcoef_model()}} adding a table with estimates, confidence intervals and p-values \item \code{ggcoef_compare()}: designed for displaying several models on the same plot. \item \code{ggcoef_multinom()}: a variation of \code{\link[=ggcoef_model]{ggcoef_model()}} adapted to multinomial logistic regressions performed with \code{\link[nnet:multinom]{nnet::multinom()}}. \item \code{ggcoef_multicomponents()}: a variation of \code{\link[=ggcoef_model]{ggcoef_model()}} adapted to multi-component models such as zero-inflated models or beta regressions. \code{\link[=ggcoef_multicomponents]{ggcoef_multicomponents()}} has been tested with \code{pscl::zeroinfl()}, \code{pscl::hurdle()} and \code{betareg::betareg()} \item \code{ggcoef_plot()}: plot a tidy \code{tibble} of coefficients }} \examples{ mod <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris) ggcoef_model(mod) ggcoef_table(mod) \donttest{ ggcoef_table(mod, table_stat = c("estimate", "ci")) ggcoef_table( mod, table_stat_label = list( estimate = scales::label_number(.001) ) ) ggcoef_table(mod, table_text_size = 5, table_witdhs = c(1, 1)) # a logistic regression example d_titanic <- as.data.frame(Titanic) d_titanic$Survived <- factor(d_titanic$Survived, c("No", "Yes")) mod_titanic <- glm( Survived ~ Sex * Age + Class, weights = Freq, data = d_titanic, family = binomial ) # use 'exponentiate = TRUE' to get the Odds Ratio ggcoef_model(mod_titanic, exponentiate = TRUE) ggcoef_table(mod_titanic, exponentiate = TRUE) # display intercepts ggcoef_model(mod_titanic, exponentiate = TRUE, intercept = TRUE) # customize terms labels ggcoef_model( mod_titanic, exponentiate = TRUE, show_p_values = FALSE, signif_stars = FALSE, add_reference_rows = FALSE, categorical_terms_pattern = "{level} (ref: {reference_level})", interaction_sep = " x " ) + ggplot2::scale_y_discrete(labels = scales::label_wrap(15)) # display only a subset of terms ggcoef_model(mod_titanic, exponentiate = TRUE, include = c("Age", "Class")) # do not change points' shape based on significance ggcoef_model(mod_titanic, exponentiate = TRUE, significance = NULL) # a black and white version ggcoef_model( mod_titanic, exponentiate = TRUE, colour = NULL, stripped_rows = FALSE ) # show dichotomous terms on one row ggcoef_model( mod_titanic, exponentiate = TRUE, no_reference_row = broom.helpers::all_dichotomous(), categorical_terms_pattern = "{ifelse(dichotomous, paste0(level, ' / ', reference_level), level)}", show_p_values = FALSE ) } \dontshow{if (requireNamespace("reshape")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} \donttest{ data(tips, package = "reshape") mod_simple <- lm(tip ~ day + time + total_bill, data = tips) ggcoef_model(mod_simple) # custom variable labels # you can use the labelled package to define variable labels # before computing model if (requireNamespace("labelled")) { tips_labelled <- tips \%>\% labelled::set_variable_labels( day = "Day of the week", time = "Lunch or Dinner", total_bill = "Bill's total" ) mod_labelled <- lm(tip ~ day + time + total_bill, data = tips_labelled) ggcoef_model(mod_labelled) } # you can provide custom variable labels with 'variable_labels' ggcoef_model( mod_simple, variable_labels = c( day = "Week day", time = "Time (lunch or dinner ?)", total_bill = "Total of the bill" ) ) # if labels are too long, you can use 'facet_labeller' to wrap them ggcoef_model( mod_simple, variable_labels = c( day = "Week day", time = "Time (lunch or dinner ?)", total_bill = "Total of the bill" ), facet_labeller = ggplot2::label_wrap_gen(10) ) # do not display variable facets but add colour guide ggcoef_model(mod_simple, facet_row = NULL, colour_guide = TRUE) # works also with with polynomial terms mod_poly <- lm( tip ~ poly(total_bill, 3) + day, data = tips, ) ggcoef_model(mod_poly) # or with different type of contrasts # for sum contrasts, the value of the reference term is computed if (requireNamespace("emmeans")) { mod2 <- lm( tip ~ day + time + sex, data = tips, contrasts = list(time = contr.sum, day = contr.treatment(4, base = 3)) ) ggcoef_model(mod2) } } \dontshow{\}) # examplesIf} \donttest{ # Use ggcoef_compare() for comparing several models on the same plot mod1 <- lm(Fertility ~ ., data = swiss) mod2 <- step(mod1, trace = 0) mod3 <- lm(Fertility ~ Agriculture + Education * Catholic, data = swiss) models <- list( "Full model" = mod1, "Simplified model" = mod2, "With interaction" = mod3 ) ggcoef_compare(models) ggcoef_compare(models, type = "faceted") # you can reverse the vertical position of the point by using a negative # value for dodged_width (but it will produce some warnings) ggcoef_compare(models, dodged_width = -.9) } \dontshow{if (requireNamespace("nnet")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} \donttest{ # specific function for nnet::multinom models mod <- nnet::multinom(Species ~ ., data = iris) ggcoef_multinom(mod, exponentiate = TRUE) ggcoef_multinom(mod, type = "faceted") ggcoef_multinom( mod, type = "faceted", y.level_label = c("versicolor" = "versicolor\n(ref: setosa)") ) } \dontshow{\}) # examplesIf} \dontshow{if (requireNamespace("pscl")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} \donttest{ library(pscl) data("bioChemists", package = "pscl") mod <- zeroinfl(art ~ fem * mar | fem + mar, data = bioChemists) ggcoef_multicomponents(mod) ggcoef_multicomponents(mod, type = "f") ggcoef_multicomponents(mod, type = "t") ggcoef_multicomponents( mod, type = "t", component_label = c(conditional = "Count", zero_inflated = "Zero-inflated") ) mod2 <- zeroinfl(art ~ fem + mar | 1, data = bioChemists) ggcoef_multicomponents(mod2, type = "t") } \dontshow{\}) # examplesIf} } \seealso{ \code{vignette("ggcoef_model")} } ggstats/man/gglikert.Rd0000644000176200001440000001760414504775450014623 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/gglikert.R \name{gglikert} \alias{gglikert} \alias{gglikert_data} \alias{gglikert_stacked} \title{Plotting Likert-type items} \usage{ gglikert( data, include = dplyr::everything(), weights = NULL, y = ".question", variable_labels = NULL, sort = c("none", "ascending", "descending"), sort_method = c("prop", "mean", "median"), sort_prop_include_center = totals_include_center, exclude_fill_values = NULL, add_labels = TRUE, labels_size = 3.5, labels_color = "black", labels_accuracy = 1, labels_hide_below = 0.05, add_totals = TRUE, totals_size = labels_size, totals_color = "black", totals_accuracy = labels_accuracy, totals_fontface = "bold", totals_include_center = FALSE, totals_hjust = 0.1, y_reverse = TRUE, y_label_wrap = 50, reverse_likert = FALSE, width = 0.9, facet_rows = NULL, facet_cols = NULL, facet_label_wrap = 50 ) gglikert_data( data, include = dplyr::everything(), weights = NULL, variable_labels = NULL, sort = c("none", "ascending", "descending"), sort_method = c("prop", "mean", "median"), sort_prop_include_center = TRUE, exclude_fill_values = NULL ) gglikert_stacked( data, include = dplyr::everything(), weights = NULL, y = ".question", variable_labels = NULL, sort = c("none", "ascending", "descending"), sort_method = c("prop", "mean", "median"), sort_prop_include_center = FALSE, add_labels = TRUE, labels_size = 3.5, labels_color = "black", labels_accuracy = 1, labels_hide_below = 0.05, add_median_line = FALSE, y_reverse = TRUE, y_label_wrap = 50, reverse_fill = TRUE, width = 0.9 ) } \arguments{ \item{data}{a data frame} \item{include}{variables to include, accept \link[dplyr:select]{tidy-select} syntax} \item{weights}{optional variable name of a weighting variable, accept \link[dplyr:select]{tidy-select} syntax} \item{y}{name of the variable to be plotted on \code{y} axis (relevant when \code{.question} is mapped to "facets, see examples), accept \link[dplyr:select]{tidy-select} syntax} \item{variable_labels}{a named list or a named vector of custom variable labels} \item{sort}{should variables be sorted?} \item{sort_method}{method used to sort the variables: \code{"prop"} sort according to the proportion of answers higher than the centered level, \code{"mean"} considers answer as a score and sort according to the mean score, \code{"median"} used the median and the majority judgment rule for tie-breaking.} \item{sort_prop_include_center}{when sorting with \code{"prop"} and if the number of levels is uneven, should half of the central level be taken into account to compute the proportion?} \item{exclude_fill_values}{Vector of values that should not be displayed (but still taken into account for computing proportions), see \code{\link[=position_likert]{position_likert()}}} \item{add_labels}{should percentage labels be added to the plot?} \item{labels_size}{size of the percentage labels} \item{labels_color}{color of the percentage labels} \item{labels_accuracy}{accuracy of the percentages, see \code{\link[scales:label_percent]{scales::label_percent()}}} \item{labels_hide_below}{if provided, values below will be masked, see \code{\link[=label_percent_abs]{label_percent_abs()}}} \item{add_totals}{should the total proportions of negative and positive answers be added to plot? \strong{This option is not compatible with facets!}} \item{totals_size}{size of the total proportions} \item{totals_color}{color of the total proportions} \item{totals_accuracy}{accuracy of the total proportions, see \code{\link[scales:label_percent]{scales::label_percent()}}} \item{totals_fontface}{font face of the total proportions} \item{totals_include_center}{if the number of levels is uneven, should half of the center level be added to the total proportions?} \item{totals_hjust}{horizontal adjustment of totals labels on the x axis} \item{y_reverse}{should the y axis be reversed?} \item{y_label_wrap}{number of characters per line for y axis labels, see \code{\link[scales:label_wrap]{scales::label_wrap()}}} \item{reverse_likert}{if \code{TRUE}, will reverse the default stacking order, see \code{\link[=position_likert]{position_likert()}}} \item{width}{bar width, see \code{\link[ggplot2:geom_bar]{ggplot2::geom_bar()}}} \item{facet_rows, facet_cols}{A set of variables or expressions quoted by \code{\link[ggplot2:vars]{ggplot2::vars()}} and defining faceting groups on the rows or columns dimension (see examples)} \item{facet_label_wrap}{number of characters per line for facet labels, see \code{\link[ggplot2:labellers]{ggplot2::label_wrap_gen()}}} \item{add_median_line}{add a vertical line at 50\%?} \item{reverse_fill}{if \code{TRUE}, will reverse the default stacking order, see \code{\link[ggplot2:position_stack]{ggplot2::position_fill()}}} } \value{ A \code{ggplot2} plot or a \code{tibble}. } \description{ \ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#experimental}{\figure{lifecycle-experimental.svg}{options: alt='[Experimental]'}}}{\strong{[Experimental]}} } \details{ Combines several factor variables using the same list of ordered levels (e.g. Likert-type scales) into a unique data frame and generates a centered bar plot. You could use \code{gglikert_data()} to just produce the dataset to be plotted. If variable labels have been defined (see \code{\link[labelled:var_label]{labelled::var_label()}}), they will be considered. You can also pass custom variables labels with the \code{variable_labels} argument. } \examples{ library(ggplot2) library(dplyr) likert_levels <- c( "Strongly disagree", "Disagree", "Neither agree nor disagree", "Agree", "Strongly agree" ) set.seed(42) df <- tibble( q1 = sample(likert_levels, 150, replace = TRUE), q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1), q3 = sample(likert_levels, 150, replace = TRUE, prob = 1:5), q4 = sample(likert_levels, 150, replace = TRUE, prob = 1:5), q5 = sample(c(likert_levels, NA), 150, replace = TRUE), q6 = sample(likert_levels, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0)) ) \%>\% mutate(across(everything(), ~ factor(.x, levels = likert_levels))) gglikert(df) gglikert(df, include = q1:3) gglikert(df, sort = "ascending") \donttest{ gglikert(df, sort = "ascending", sort_prop_include_center = TRUE) gglikert(df, sort = "ascending", sort_method = "mean") gglikert(df, reverse_likert = TRUE) gglikert(df, add_totals = FALSE, add_labels = FALSE) gglikert( df, totals_include_center = TRUE, totals_hjust = .25, totals_size = 4.5, totals_fontface = "italic", totals_accuracy = .01, labels_accuracy = 1, labels_size = 2.5, labels_hide_below = .25 ) gglikert(df, exclude_fill_values = "Neither agree nor disagree") if (require("labelled")) { df \%>\% set_variable_labels( q1 = "First question", q2 = "Second question" ) \%>\% gglikert( variable_labels = c( q4 = "a custom label", q6 = "a very very very very very very very very very very long label" ), y_label_wrap = 25 ) } # Facets df_group <- df df_group$group <- sample(c("A", "B"), 150, replace = TRUE) gglikert(df_group, q1:q6, facet_rows = vars(group)) gglikert(df_group, q1:q6, facet_cols = vars(group)) gglikert(df_group, q1:q6, y = "group", facet_rows = vars(.question)) } gglikert_stacked(df, q1:q6) gglikert_stacked(df, q1:q6, add_median_line = TRUE, sort = "asc") \donttest{ gglikert_stacked(df_group, q1:q6, y = "group", add_median_line = TRUE) + facet_grid(rows = vars(.question)) } } \seealso{ \code{vignette("gglikert")}, \code{\link[=position_likert]{position_likert()}}, \code{\link[=stat_prop]{stat_prop()}} } ggstats/man/figures/0000755000176200001440000000000014526731562014160 5ustar liggesusersggstats/man/figures/README-unnamed-chunk-7-1.png0000644000176200001440000001232514526731560020661 0ustar liggesusersPNG  IHDRMR/ PLTE:f:f:f333::::f:f:f:MMMMMnMMMnMnMff:fff:f::ff:fffffffnMMnMnnMnnnnnnMMMnMnMM::::fff۶nMnnnMȫff:fffې۶ȎMې:ېfnvmfȎې pHYsodpIDATx흍V@[R3>Ru$bC8/AO=ֻXۤE=}Ke'Ai: B!P4 &2m n uZu;%z P[J RJuRBE(-@QGo)%N :zK)!P[J RJuRBE(-@QGo)%:zK)!P[J &؄@#mJ@c!Фh,@BhR4M BI!X4) &@c!Фh,@BhR4M BI!X4) &@c!Фh,bh,J[J hRB(Fo)%J[J hRB(Fo)%J[J hRt˲s" @kꓣlp/;}ƾH%Z=wUۻz{M_IZc|>+,{6 +QVJ07"PVWЫ7S4D:}?z *8 4@}6 jjM?Ӿc󠂣ڀ@S!$ݕh*T)J4MCw% զn ơMjS7@] @Bq讄@SAM!8tWB Pmh+!T6uS4ݕh*TZpSզ%@wxI;%z Pmj^NTZpSզ%@wxI;%z Pmj^NTZpSզ%@wxI;%z Pmj^NTZpSզ%@wxI;%z Pmj^NTZpSզ%@wxI;%z Pmj^NTZpS@7GĕtD-%`զ%@wxI;%z Pmj^NTZpSզ%@wxI;%z Pmj^NTZpSզ%@wxI;%z Pmj^NTZpSզ%@QVQTQTQTK:G'Pj@/?Nҁfރ(Uj@5m8ZV]JԲ@ՌʦZԲZabqTyzNE]?'l&Ӽx哪LF^vnWm;vz u@IQo#< Hh8 4w {uMu0b]˟)%4-wozh9* q;͔^^㝷@-Á&P5&P5&P5{uMtlB/UCҁf`^&ZV]m%P{a(ZV:z~0 ~@- ^#EfT6ҁfcw˨=j@/X]LVBk:]Lk & _@+g_XR:,~{@uH@G\ţ-˟/(ţ =\>;CGeuS+h{:rs떽fM!8ģ7|\ Xqz Բz@@X 1G4 /WPw?ݛ@QVhcum5M5M5Mu0жMPWEZV]'PԢ:O_dn@Qwo=?;ɮs{M^=j^-hqCYvQÜHP $z jYݱC@F^SL , ]fL 2cEXF!PԵfT6բr uTͨlrUgFeS %>tV QVK:ߞ^(ZVw 5cP+4_AC/'Pj@ԶZ6IeVyPUjjYMjFeSMjFeSMjFeSMjFeS@Of> PNzGjY@vC|>̹jAN_-՝\A ne5Mu'=9ˮ4΃>=Z|@ Բ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@Ռʦ@ՌʦZAwIP %`E][`%PjU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*jU3*nz9[|@՝ p/;}@-;ۓl@-;YvQ9 |gcſlpSӪ-+cc_cT6y6*7)i-'!5ſ >oUnnSӪ-kmUnnSӪ[ 6k'WMNnD:zK)!P[J RJuRBE(-@QGo)%:zK)!P[J RJuRz"u!Pԝ@Qw@@i: B!P4)~ysC$\n q}tiwfTjw]e5))H= gK ͔[U-բ~ }+苽Xh>dxQȋ7=S0t.'DZ>_L+zTm@E! Cr+@N(޹,幘@`(}sZ -L/׳bXKzE 8?t0ٓ%oފWINCi: B!P4 @2,Wօ@E кZAyzv6?7ޒ[@kp9|9a?vl(2֠8a.ԋ'S@5?,{> ZE<|-G4h C|ą:##|IkQC6qOaqi?{*#|=qtagw K5_2MN;p7Ȩvz8+ڏ?(r\> SSxTs_VG7߳gNe=g6[=sގ;ឿ}7 I/]?6 bkd5Z4d@'~dA~-@h"ց^[7A)z_ųM2:bص}qf^5: -u>. PN$@*: PN$@*: PN$@*: PNԞWr]=bn:] @KzKXl94dkP{gg;Flhy-Ay.%[;F(@E' PITt(@E' PITt(@E' PITt(@E' P^߇4d{@l]S>v7.h$Mzv7.&v2."Лv2.P>_=>ܵ~]ŻF(@E' PITt(@E' PITt(@E' PITt(@E'[cFl94d@]z {|\z 蒭uz1@Ke;O=htɖנOhtv>%cFl 4$@*: PN$@*: PN$@*: PN$@*:2qmjZ:.,?ÝZ@KH] @Kz%黸^ dw]=i\{MEnD"@7z"[>/@]N @}{u8u^@9s7=A;;^uKX \qӃn>Yɚeqjp-k꾗6w Mgo ɣO+Á]~YMWoOq/^զ<-6zcT:ˏfo$xnDb@M^E&^ bJ.\cT=O_'Yt}/Y荛:H+v zO_5ҳwz#5* h?ncoƋޚ+/³J-nݫgG֩S+5hŶ6>|lwmGy쁪-c;\h@MzsW>GN^-yzu:k}P%Sh@zjW$=:}{u8uI(gun>՟. W@OėiaxXɕ+@NZZOzb $@*: PN$@*: PN$@*: PNd@x{P @#LOcsɃok{y <)p{`$cvFf\ӻ7-cJtZ\y7[j\V,o0.֔q_YJifna"ث\=3*$]+5{ۃaqd{40j(Iě}O{(7[ŻCgV^r4Fl ɸS*gj5D7?I$]/̤w>j/sW<5;>L$=krr?. PN$@*: PNGu4M9 PN$@*: PN]y1Qu%@Iz)4-(IE&%D$="@U@8xvȓ>C}U2hp5~wzn?k9D)HC @W5&F n/ `A mV2bPKp]<ihJ y5(@H$nd8@ mAv*]PJNSMAJ4d@:;hHɈ;<(n pt+Mvh>ԛx^ <1u>*93Ъ)HCK@6I"8ͧ&*{hTt2b7@M TuРqW9% }d@.PxVOAZ I΋ )ЍSH|3d@VgfQ~.МouFlե[9 =7P|2b|3d@Vgxꋄ4d@YFh^L(?,E2^SHlH 2u'@MM|ɀ g t2]׷9^c{σ EZNsS֙v !uջկ[y.v2$/~|{$qaj^g+L -]nԫ0^:hpܵJf а0 n^ܦNF T=.E\<ih+L*'x P~`"@PPh"Z PMD @JChh×hf(@E' PITt(@E' PITt(@E' PITt(@E' PITtA~Zy,4d@+1Jqh(@Iz(4-(IE&%D$="@PPh"Z PMD @Wշvcƒ 0~Uh,ɐN>O.I@ _7߫(>@#If_n<v}12׺^@:sF`ghVm$ ЍL(~x6@uFt$@*: PNuwsn |t$ ٍ @ڄA@+?(@f &{4&Uf;^y8 ƞ ڦgY7<g=hpNZt12'je V28샦 ިϾ̎=SHTy4d@@+ @K$@*: PN$@*: PN$@*: PN$@*:M$]?M9 PN$@*: PN$@*: PN$@*: PN$@*: PN$@*:ЍNl kMtXh"Z PMD @JCh(@Iz(4-(IE&%D$="@PPh"Z PMD @Wg[<d|ȓwqg4;Ý.{2A<@cOtz?y4ʧzvؓjj+e@cOƪ5h"pNcHV2(^7g{QG)$:=ȶX2 % PITt(@E'h>"Z$EJRt$E7eD@PF=xnɻrz{tׯu6Oi*{bzkW'' 4TΞ6T_שZ7T󞫒\aYS3V5'~|7KSR/&A'/ErW|^U'+禳J A:tAse^5F֠z]DFJ0=>|뿿g7)65s獯AszLǧ禓P g }PϳkP(~^~tR z??sTzQ|Ѿjjj16LPF=eD@PFT e+2@X_rut;#==Ӄn:UXvL۳Tyt-m ޞ[M(7`<@o5hC~{PK)Aͥ&^߭Ϙ7/%3GfSߡ/o+չA.jݭX4v_2/|'Igƴ%=eD@PF=eD@PF=eD:/fIENDB`ggstats/man/figures/README-unnamed-chunk-10-1.png0000644000176200001440000002571114526731562020740 0ustar liggesusersPNG  IHDRMR/PLTE ,-28:Mbf?',F.m55Q8:::f:KKaMYR_-_qbffr?rqQaqq,M.mRMss%%% %%%' %6%622222,F2M22mY2mk33,38888b888bb88::::f::::f:::f:f:CCC CEH.H.,HMMMMMnMMMnMMMYYY}\R\]]]Tbb8bff:fff:f::f:ff:ffff:ffffffnMMnMnnMnnnnns,nnnv'va~,~,~}Mm͊ͦ888MMMnMnMn6Ea::::fff:۶Mm}E TaaanMnnnMȫbff::fې۶mY}ȎMҊ8үې:ېf۶fېߋ2ߧFYk}nbҊfȎې۶.Q pHYsod IDATxݏ{U'?QAk7ѹיuwfG۰袄QGu532"%7,Aww#6;Cra83$M'TwWǩ:~OnT|9σ0$3}EbEbEbEbEb7!Ȍ 0o*N5E48phEqЈTũ;@#.S w8F\p(N5WkFKk!Ө!ь :J{QPc(α^@TXi/ sP9Vڋ(+E@u :J{QPc(α^@TXi/ sP9Vڋ(+E@u :J{QPc(α^@TXi/ sP9Vڋ(+E@uA8j^3r]E{X#{{V8g~$DȰ<02/Fw/ іɜ_;@ߵ)nٚLn|3etKQ\q| q<rz'K$E@Esn(]w?3Xsƃs[?}o A>O:_s ۆo>>O /]eƮu]cf>mAwJK|J4"h~Zz46sPo lʽ(A4P969cupJ4:NsʄJN~pҘmjtЉfCDU9Q8?qyZ-]{R)ЁkxITXaT )]R"/sSMrB™˥%rGl WiFP?׺/W}ʶEG t/wN]ߥyj2 Z" ,2_/ u|SR/ǖ҅wJ;o:ɨw ٥hzⴎf2+ Ai ϑByK mo4E 4?@?94sN2hf.:ɨ'jiibth.:M*Ÿ*[$(A)ќb|W]ܳg^ozJp A$rliceAm&@cAO䳶2!U栆jj֐g`lkxu$[lae֑|˔ sOe!xqJNQ[B+yQhq~nW$dwT]x~-@*2 {rXi/ s,:J{QPc(α^@TXi/ sP9Vڋ(+E@u :J{QPc(α^@TXi/ sP9Vڋ(+E@u :J{QPc(α^@TXi/ sUâqQj48phEqЈTũ;|:M4^{}&?wB@g2ebW@:V+ jPU(Z*vE@c ֱ]QPXŮ(ubW@:V+ jPU(Z*vE@c ֱ]QPXŮ(ubW@:V+ jPU(Z*vE@c ֱ]QPXŮh0Ekq:!j:M~⇽{~r暠>==|w}E#g_6LNzizddWf#^5v`66ЗdUkwAsrK<ٵ?hʋ윟G.:pHoIaT 4sP\|mL.nN9+}\*YF,^.f P5fÙP)Ko|e\>Ǘ-—xnjo}U-:ԲgI#:m ]y3HMFtTMFnjzٹ.u/ȯЅ^qckfaT A/rO>d%@Smo4P~siP?esNkpc&V;4$Vo7M*Pr&TH:!Vv%<5;߻A9O17zR"ܿ / AcK3tw ;':8:̈́p!dvwh iO"HӿqIꟃҲzQisxM;FHNAetߗ&uAU4lzT(7ER>A/(d,T2ޓUPN h(Z*vE@c ֱ]QPXŮ(ubW@:V+ jPU(Z*vE@c ֱ]QPXŮ(ubW@:V+ jPU(Z*vE@c ֱ]AQNgoϩhEqЈTũ;@#.S w8F\p(N5jިN绠xuzgP}@4LP}@4LP}@4LP}@4LP}@4LP}@4LP}@4LP}@4L~ӓ%;uTm15#a f3FhU=ȑmw|⪵ΧuB4%%z!,篔 q+#PN.^Icg%rZfjo>4]w?Dfг_8_s ۆo>>խ^x#ͅw:gݵN~1?v4<^{Fl:ɨWcd4"^|uX:v⎛XZ7iM(لyew"dd8ADž95fWgTMAeAǖfwPēNuX'j o,h. !YO!)y{߂9heʭ9l破,;;uxJ"ƘorErH(P?@=&-^LUgw(( a(( a(( a(( a(( a(( a(( aũ;@#.S w8F\p(N5qQj48ph17y47fnj (( a( h>&Z (( a( h>&Z (( a( h>&Z (( a( h>&Aj c\Q=5#a,z̀w\}yX:ɜ|'N!ZL|J\!,DS2 be4sW6.b z.LC{J)MtӋt~~g@~sm7Z^x#3f.k~nd'\;Zy&>}]F:c~bzb:o@s]uCʄ8hI\+5R\=0Ut@jɏ=%~=t҆Kzp4wo ɲq+1uMH0 |}q?'g=hnQzx\^FuPK"PAou:¹+iI:ht\yeB@/b3.0)?\JM N^@6R69unn*=i.997ʩ(A5PWEMB@'?8iL3FA׏LJ,J9v\*yY$wenۘ\*4MrS."sPPK;n\R/JZO W=hxݗ>ZeۢgEt/݀RΛj2(ɴ\l (A:o^ 9tΖXR4<^{FӬzIF=UۋNk("^|uX(FIVhD P'Bej~I9y?4?ËFZOq#o_u(>&Z (( a( h>&Z (( a( h>&Z (( a( h>&Z ((UâqQj48phEqЈTũ;O7$=&=7tf@@4x@4D@@@5@v@P PP}@4L@Ts@m 0PPP'P@@@5@v@P PP}@4L@Ts@mf\4FvĦ׌\WG$g}}Gbg~$Dȹx`dd@8s߁@?,D|ȑL|! NqdrM;.B,ilFqƅ2~_$؎q:!r#t'cIU7;]۟]d*8̓ϟCL\㯃}ud;F|̏}aG^jˌ]\$s]cf>mAw|}cgRW_Lo,vвeJ]ԼꥁA~>uA]=l^:SuR:!UԹN?ou'T ->b"gk>"i '<6sЏ'⊍rZ@M>yꩁ=u+wԭAwJ'}~ghЉlHԼJ7*CˀiF[u_hmUg=)YW.ͩ(^DxgD[JzuP ]yӹXMF].-@u|4cVT5 :|<˥߅c7(F W?Ս%;\4jЋN2|uX=]()$n7Nk,J1I'gm&MD/Vgg\Iy9뻃.0OMcK3;(\X5Ф̔n3z'J4"Å*Yꠣ--J7wWG|;TQs*os3SYT΀sPvzu5/b E6eQT(7/h=]돘 /u0i^nd'\;Zy&>g I(Ǯ!Be@'oyp'e&-:WO5: ƪX'pǽ"!Z'?(X҃{kP6lk]|h1ݼF9ou::@858B(}~Ǹ%ejz@|Ru+~"9hyinx귁zΫddr꧁ PI(-6**/ 9ϔj/w׏LJt%砅IKfxi\|mL.M44˥NT9Pj8r蠓W (5WJUm3 TAsG}-Aӥj2 Piڹ|{Dȼ΄I'WnxCzj=n{)@zYZ7M'ЋN($ziib;nsiyv"dZ'm~zdBN\V9Fvu6/PQzu*9ϪTح+_T8ΖTyV˧Z@osuuuuuuuuuuuuuuuuuuuuuuu3h'iUV @TK4j@5 x@h<f܁N QodW;(o 4o!@)?=S@It{ND@'iuC8 wN%4k}O֭ +tPg T S?WW~G`%U:i N,~J&\nUsu]i|{:h@'U}l*A=`f(/NOqOZn]|6:no(:kuO1jP-(L_uuuuuuuuuuuuuuu?X'wzYIENDB`ggstats/man/figures/lifecycle-defunct.svg0000644000176200001440000000170414357760261020270 0ustar liggesuserslifecyclelifecycledefunctdefunct ggstats/man/figures/lifecycle-maturing.svg0000644000176200001440000000170614357760261020470 0ustar liggesuserslifecyclelifecyclematuringmaturing ggstats/man/figures/lifecycle-archived.svg0000644000176200001440000000170714357760261020430 0ustar liggesusers lifecyclelifecyclearchivedarchived ggstats/man/figures/README-unnamed-chunk-6-1.png0000644000176200001440000002171614526731560020664 0ustar liggesusersPNG  IHDRMR/PLTE,:Nf=++,+n0M:LLfk,kNnn?b?b?b333888,88=8C^::::f:::::????b?b?b?MMMMMnMMMnMnMbb?bbb?b??bb?bbbbbbccc,c=cTmff:fff:ffnMMnMnnMnnnnnn????bbbٌemMMMnMnM::::f۶ېbb?bٽ٫nMnnnMȫ0e=vmffې?bٟȎMȎnCvmٟbٽٟٽې:nT,e=vMv^vmfȎېm" pHYsod IDATx흏y DX )vF[96Ii)iS0&iӒ65N%mF"QCTڙٙݝ{{z{}43&}őp hu@PD"ZꀠVGw}"tDR< hp"tDR< hp"tDR< hp"tDR< hp"tDR< hp"tDR< hp"tDR< hp"tDR< hp"tDRK"=KDR# 2㯽YM*'HP)B"X#&?y?+dcWE7)hzПŋ# by?Oi+}{k"5wAKw byW*%q"5dn*BgP}ЖI1A|U›Ɣ=e0TGC. VH$cg.k|I_r~n*p * (kDRv۠ נWtw{=cO/ď_3o i/bY/r8jxLW3BpA*bwbI t1~%h+ətEv^]9q_OSk9.fir]G|s(o (As]17AmAE/4.́H3163cN1+6V9p%A'ǎ&Rcw{!蠀a4-P]!CCP?CP~ A4 Ai:4p>xVCKF<%hqAՇϽ\t3{@PTdN)ҲQ M&aXOWiaA?uFr1' 镎x:ym] A =A [ 6 A 5A*=#A&ʜTH9tў1 Aue4K#;F<7AM.I9؁ݓ^*/)HiAզ A AWg^˿;'UUN.5M/^{7%hx/+$luK7%hx jJ$7yCPA40uh9`Osԡ!!CCP?CP~ A4 Ai:40uh9`Osԡ!!CCP?CP~nJP>Y^ ?7TZ4{4Ƞa42h3(u+bdU4%hECP!]4%hECP!]4%hECP!]4%hECP!]4%hECP!]4%hECP!]4%hECP!]4%hECP!]4%hECP!]4%hECP!]4%hECP!]4%hECP!]4%hECP!]4%hECP!]4%hECP!]4%hECP!]4%hECP!]4%hECP!]4%hECP!]4%hECP!]4[W'ɕy|cBB|nJ26K_CP"trN[-?X( )/UT*ND/e~ӎkt'u#ZnS̢ȠnE32h DdסԭvdT]hB6 zsIAo\4.płrxxLEo|Pj'*>ӯugyBAW+&>8HAÇVvJ1]4¿A}*@!hvEsm_O5ZMOVڹYSR]AU'I$$8۹YP2U Gw*ceN*IA\GAЕD%7oaVr.zdU}#G6τZUCPL*N*{&lժZEjP8ql'lժZES#=qKi'lժZEaS!]4%hECP!]4%hECP!]4%hECP!]4%hECP!]4%hECP!]tk"rb0vlt 8^x@^2].tbQ\,N.. A!(Aǂ(|4Z/. A!(AG^>l-.t$ؗ@KKCPhJБ2_/. A!(Aǁ3]ECPA1K{ݝ. A!(AǁWtv\)c.yUPj!q) +. A!(A/ }}Õvz;]ECP/א[C^{!]4%h" vjJ!(A^xvjJ!(Ag/6W{>ZE^{j/={A/qUb;&;G/J`k*/誔Qrn*hv6 *p_si^;V^A^Sj|VUq ʀ2b̛~ R,ACU4! N69r?\Kn=$'grAXRdy/uڇRh잜}0RdTxמ^K>9{B8RA)<DF̧c++ӣA!}( =?(͞rQX^&1Vс0K󞿛 *̿4 v jǂ v(V@E4,a\OVu/ՂnN T{Ƽ*:v^A>yB=ҷJ)x#h /=cjCPhyPяe鵃ZT=&R٣$~x Ƽ*:v~AUIu$TݻW]|yILn>QJyUt A ADK 9`j>9>yZԭvd@0izp@P"QAu+"!($CPy*I> VD`;CP2OU{U-:: A ZZd-A }Z V A!h^n lNRErdaBP:!]n S|ue'oyhZOP5ód RA,+ݓZ}cBкt)~|ڲ4lTPqaI]SMgr4{u=HPI@7/o'nweX91hO;@A][n6I; 2d anV#t["DzmcqJ.zf S<^yT4ooAO/&b69gc3^̎l}(} dBP\hXnB͙1wՕ4n1=2o\ԝvynrRJ7gO噠vǂ*;:۞2gu/q@ʠ$4S@ӎhz!%D1N&S"N$u!hUAS<h~RY݅O^LN3е,w(UANhBP3,eΎw&EjkYRPj#RoR&5~r1}$o }Ij@ЛGgCPa ] w( Z 6S0CPAh%:D@P ApÂ?$hljZ4ܬr-zS9MeެpC :G6`=hg"yj3zI4n^hghg*yO(̞7OpGﳁatC,qhc NQ;T'o}7Grbj SzC=cawUN9]NI̚atC:hz= UW^8X( 4qW휯71ڙHZ)"y[ٕgj' 0ڹhg"yO_'zIRNngw^vL$*O)|H8D/>p'IvF;Shg"y^ Ob@0)A`*yO./yCP[v<4w !!CCP?CPY2 A ) h A!` A4 Ai:40uh9`Osԡ!!CCP?CP~ A4 Af1hO;7H F#=BP"Ah A0uh9`Osԡ!!CCP?CP~ A4 A;< CP8 Ap Z01j:< CP8 Ap Z01j:< CP8 Ap Z01j:< CP8 Ap Z01j:< CP8 Ap Z01j:< CP8 Ap Z01j:< CP8 Ap Z01j:< CP8 Ap Z01j#+֕{jmnw_vo,CPZݾ1_;%׶[o_/FJ_Э2anHM?X(FJ_ TLCS:/ <AЖ :s#1nSb9G\xTg6S=~H :=^0.ΎGUP}T$QySv#A[)䧤yVŽufƄ:S;" :3N2eVñ+PPǓO%V1͠ɱo%PZ8ۻStFvfRCg#$:Hc6U4؞T:9c)S/'.y ><bJA'Fc^x0pxc4ztL1:A̙!q]&әquS0A$i6 IޠWO @'iÃ, Z] h$mIuFR< hvT,Z] h$mIuFR< hvT,Z] h$#:!_f?a BP2 :q3NJC (%#a]ĭ4:1a BP2^*A-:1a BP2 q[*I\ [AP 7^Lz A!(qOdZ'lAPJF\AŹ8qA!(M'j/}ЎwdD8QDP߄- AɈI/Y8 (%c?Ut ABfaJd@7A!(6V90uh9`Osԡ!!CCP?CP~VA<4o?ӎĞɠ Kf utidPnk+u&Ƴ =l4uhcIlBCԡ!g #3A Ahpvjc-gM At,x<)0Kz utiOG'C' Գ;;KCP~:AoN  uuy~9_7@P/{|"}W'z- U^MP4*#(A.j'TމGЪנX4*#Y/bڋX֋ƢQf'AmO3X4*$(-ƢQf&èSAaT٩ɀ0*d@`avj2 h00;54FE  "NM Qf&èSAaT٩ɀ0*d Z)j7hp/UA: hcTC_t?*bhVPf@PD"ZꀠVGS苫UeCFF^^uXp9Q/tP9Cw [kYX>y:&RAEԌf]z'1?+ɃbCQCݡG_}W\NGXM* ٨X)^~WG<ܙ*i틪BĢ;l2pVcMX TDhTƆm;_5䥗2-dh޷7/`$TDh$u [W.M(ti!yEga}'x[W~i˨Zt/^2uG_] 8K/ ?7O@Eԋ&3öK,]߲KksyIDAT&!`y@޻W@EԋF3蒺}cZWƼiv^׺t(XS8v]x論$ ꀠVE: (Ahu@Сb5Iǯ^~+$ w "nJ?߅z<9{άJA"'\D Z= iS*]W>~Yr=z<~V.oV;.$9*%"f@aAu]⅜O&*,.<I5s,A"ǯH&MZ/ :T,|$ȘR4["bE: (Ahu@PD"ZꀠVE: (K!] ZIENDB`ggstats/man/figures/lifecycle-questioning.svg0000644000176200001440000000171414357760261021206 0ustar liggesuserslifecyclelifecyclequestioningquestioning ggstats/man/figures/lifecycle-superseded.svg0000644000176200001440000000171314357760261021003 0ustar liggesusers lifecyclelifecyclesupersededsuperseded ggstats/man/figures/README-unnamed-chunk-4-1.png0000644000176200001440000001500214526731555020655 0ustar liggesusersPNG  IHDRMR/MPLTE:f:::f:fff}::::f:::::::f:::ff:f:f::MMMMMnMMMnMnMff:fff:f::ffff:ffffnMMnMnnMnnnnMMMnMnMn::::fff:ff:fېnMnnnnȫff::ffې۶ȎMې:ېf۶f۶ېnkvmfȎې۶Cj" pHYsod[IDATxWG[nAvf3DXIXd!bφceA.ow땄ѥ~hPDg9{]ʠ#P2AG oVdTt%W1m$Aʀ@%!PJ(B: (J ,ՍS='f}xNjټ#"5>ж)xz^_5IYr<k6KooT fJ̕hwb>lSyn`tl#,B9cTW,:0a>7 P?@{fhr[rz ecת@x ׬:q\;ھqqTک:hifGEj\~_kx~hw^hZkNVLUy7"ND'USZ%(IdO߽|+.xLgfkNf%!PJ(B: (J ,(X#P eQx2 #P e@TGʀ@BԫWF'P$У:{x`|Tk]|y}r,:tt_F<@u@Skv#:Nz|5RZ7W;;A1;_? fNtNH<^KsҺǧwNkF 1RݭNڇt@ٓx =]Y[?'m|G:p3MSbt Izz ^kPA?| *@Kεk%@IC*>x8rv >xGp>a3Q"P"P{י>69Y(@r< f P e@TGʀ@BE:@u TBYE9* TB2 #PPD`@"PE:P(@QJ(B:P@u* ,((X#PTG`@%E PAʀ@%!PJ(B"eEJF@@Y#P2AG :ehLQ-_s"PiCr<TB2 #P e@TG`E@"Pݸ@;UcLu'f}`ϋ#v."-R]y}y]DEjBrk'ҟΐowN55<3hg,EERIgҟryNLmZYޓ jwoRK+8"15hОs}{EzbVufm;V'@]w|"",B9_wݭǻM7M Τc#}mE (J5 ̠rp JEA8F/}`G tTǿ$E:P(@QJ(B:P@u* ,((X#PTG`@%E PAʀ@%!PJ(B:]`O\(zQz1Y[=[q~;g O%t k,t݈,Ku7 h`QD Բn}Un|m^ 3zNxT|ǜAMaAu@4l%|(x@__΍rנn@BA4t!Wu.<t}iBW>[=Z-o͸_DKs>"P,P n"% eQx:"Aʀ@%!P%@u ,(@r<TB2 #P e@TG`E@"PE:@u*, PP@u* TB(X(@Q"PTG`@"PJ((ǃ@%!PJ(B:P@uEʊj"F :e(@tʠ+мa&Z$S/i4 P@u* TB(X(@Q"PTG`n\1rяyq?:1ٽ)#"5>Еx2^lqE(cRmމYN;k ~o1v;Z^s->"1!Ps%y%)lZjes@xLt ZqE_q{^ P؉juh gPm]n^7(Ei C33(JTMi' ک5(N&fXG7`7Fv^WoOD:%IBYE9* TB2 #PPD`@"PE:P(@QJ(B:P@u* ,((X)@'n߾=dq Thc`Gwzܿ8{xE_9X~,CfTt݈m%;Nkw~I}Ot`Qpr4~#n}Uo}n2xw{}Y}{}0@/ݩMGΫl@[?n_ph28m6;a=]=V#LxΠYD%D8 :l@נ(|54@'m*YD:IC*~##trIGǫxWx/F:W&p9eP(@QP+lW%P@u@@@x2 #PPD`@"PE:P(@QJ(B:P@u* ,((X#PTG`@%E PAʀ@%!PJ(B: (J ,(X#P eQx2 #P e@TGʀ@YQ-PVAG :e(@tʠ+м߸c&Z$N_4 P@u* TB(X(@Q"PTG`nlgƼ8~ԩb26UZkV~}Sj0MIYr<jJzMc*S :kfϠۗgE(c\m<;o^A+?7桽7u7<hqZaԊOvk7P=fx.=jok %(Mz 8&"͠(nU"SիGEjAÿj46Cx^)%IBYE9* TB2 #PPD`@"PE:P(@QJ(B:P@u* ,((X [z(=ݬoɷZv _>9@Mؼ 7Ln(̤Ftto+ޱp;Ⳇ&"Բn}UnA%.}#_H oh( C\=>{pZs5ܔ}0ʖ|'z{xN-P_4\@9X@FϏQt-&Q_~3 h1 fPg(_р{ r<(ˮAó Pk{~& ]lo$#[Li]AuX@*}xy͙CQ3Q_t`@9>ZfS/I(( hhP@u* ,((X#PTG`@%E PAʀ@%!PJ(B: (J ,(X#P eQx2 #P e@TGʀ@BE:@u TBYE9* TB2 +"PVdT 1e(@tʠ#P]4"ԧyT%W1m(@QJ(B:P@u* ,((XY71Em>Ny4DEjrʡ^b=@9v'-RT^+=5)T5QǙ=1+gvRR|Z; }Z2_iP(@QD@+?7aV[K vugnwKzyOpw;ƜBMt1w+d}s]ެ[vv]hnW,!Pt@;եǻ@Wv֖OP3*O˻s%,B9u)~[ȟ Ϡּ.AH*TS5 t~T4d譪j=@{ͰWnfwOxjw Wt=#P]NI盡Z6KZsM(ǃP@u* TB(X(@Q"PTG`@"PJ((ǃ@%!PJ(B:P@u  P,8/_?hc`Gwzܿ\˃9X~]Ȑ -?52.ѽtGtz|56fQwڝxGwNҭ>;|7e4!  @݃pwSVͣݭwc."c׷i,S=GJv}t3}l]>\h5*ggP9_~ ?ޟ (J(@%Uk{}F# %W>[۞nT_θG h΁!ycv̦:_"i4DYr<TB2 #PPD`@"PE:P(@QJ(B:P@u* ,((X#PTG`@%E PAʀ@%!PJ(B: (J ,(X#P eQx2 #P e@TGʀ@YQ-BYuTBC{7(Î@tʠ#P24|SwGþ'Zn<܇sh3ҏ9ao*vӁB&= Iιp4|/}7~N%Ctf70$0נG#rc aHҏv=|(Đ}PQU<(@tʠ#P2AG׍t0WO 5fo=[-8^jo?J:@Fm-oi׿eR+7E\n~s@KA9T~~S|'LE@SjuZh3>\Τ5OoxocgNCعkes^KAERmT}/p(x1wyY5lj*??3¿!J :e(@tʠ#P2AG :e(@tʠ#P2AG :e(@tʠ#P2AG :e(@tʠ#P#Sԁ,$7nyW_ 1fg "61^s=j/oÓc>!P.Nu{V_GϜ>I=!R.ZlNf?Fh[_g:c?!R.d3;N*'ي'% 3O.נLLf:ih{.G_z1(]`U3]zd t}5'@ڨy}ٟpGk$nw.t~q'ͽ?|pƅ7+Lm9zޱ%ͽJlifecyclelifecyclestablestable ggstats/man/figures/lifecycle-experimental.svg0000644000176200001440000000171614357760261021340 0ustar liggesuserslifecyclelifecycleexperimentalexperimental ggstats/man/figures/README-unnamed-chunk-4-2.png0000644000176200001440000001561714526731555020672 0ustar liggesusersPNG  IHDRMR/PLTE:f:::f:fff}7`777`7777`777::::f:::::::f:::ff:f:f::MMMMMnMMMnnMnMnM``7```ff:fff:f::ffff:ffffnMMnMnnMnnnnnnnnn7777`MMMnMnMnnnnȎ::::fff:ff:fې```nMnnȫ﫫ﳳff:ff:ffې۶ȎMȎnȫnȫȫΈ7ې:ېf۶f۶ېn䫎Ȏȫk`ΈvmfȎې۶WO pHYsodIIDATx흋yG˪7 4A@ +S &Ƭ[br6ł^m uڦ\NnZcbWHIh5y=O#;0AiAiAiAiAiAiAiAiAFƏog^;ٸ=f|t?sz嗛A|GF?}A7?p?ϟOo{ny7YjUgoi4''k 9׎3FSvjgk?=8#h 5U0(a+<A?}*ϣ2|1ǹ}e .` %kUo_6I?ui֠_ i$w.AOANů.ZO z_(m8ÿ3=96X h -:8L o~8> O>6X`#+~;*k8Omm"h5haDP3 eJࠂ5hxY]Mӑ/{dx>!s$<-/?ŏkԲyA~eA_}a&)ksA(#H[9I_QxifB(WFOwi[Т(  :f΀Y 34IQ.*\UAsT$9APT*rT$y.*\UAsT$9APT*rT$y.*\UAsT@Ny߽咠)AiAiAi\TZQwoe9%U5{TVz vAJQuo#*EPT <AI MЯ{W{_v᱌sAP9_n- M گ)Խ S!9}z09fFa ?mѾ J"hx7:{^j`hڏ j6oNCU"hJPvYφ]Q.֠m^h A}!8Agj*QjFEA tY.<>ٝQA} נR4%YhuvL^9e 3m:#JAPT*@$EPT <A@Py.*@$A@P9APT*pMgoIrM= ԓ矟ih hn6\PTA/9P0xa4#.<?S[,~%(A #}?o|k֕o=;)yG_6f^9^Y*=e : r j!4>iIC|PٙPP遱]寠]}a4]KIO6.LJj˅4-衯A N ts)>GRРo>m=Pӑ?8n4I+ur틟b${HJ:Z9mO[W_=wR`7c* 6o;>VS4}RܳHJ:4t *?-hQ"|@L4 uAaIrT"*EPT $APT*rT $sT"*EPT $APT*rT $sT"*EPT rM,H4 N4 N4 N㒠7֊{/)A ^ A[ z-zGPT <A@Py.*@$A@P9_<Ҳ~1{.0APgXZНpڲr4HЎx?UN;um껦-s=/- (ADoXPS3F葝NկWkZ .- r4AZ󬋑G4brpW_>K|ʷ5gB#G>3[Тy-~'?| G_#njgO߼8мS +~2+C+ NVŕX\AAb|s%D_??4Wf2~!,w)>>i¨DB[ hEA A͌/!z뷮9{mނ{)te$MLrAL"?xcA(}Љ6G.&iR2+!j(4'OfL/1,h LsC,J/*샰ނ?,+-4 uAA@PgXoAi\UAsT"4IrT UA 4I\UAsT"4IrT UA 4I\UAsT"/{/%AR (8 (8 (8 Ӹ$ rJP~B練#CP((AT"*EPT $APT&y{OJkzCx#'vF6fa& t[NC<49Θ0u9V91lz;uro/nvh4[FśꕺTǪ΂K˂&)EVjGoo[ zO}Wg6t;`wk?(o7t .- r4A ,⇯FG^.~S>|YL_'kwbò)a (8 (8 (8 hg<վ hHPۙs{j hneU%tT M jF̗^qg|]ҾMA}os}(/־n ^~ۯoOၠ )BꅦVZ `f|=\4A$u2 5e?d4A$hT~}pcjۦwY6 ^FMRk՟Fn`$藺f?W{yn2 N4 N4 N4 N4 N4 N4 N4 N4 N4 N4 N4 N4 N4 N4 N4!Q7+F蟸-"]`wQ'7x>Ņ*mߵ<6Sf8DA Ah6OPX+AiAiAiAiAiAiAiAiAiAiAiAi=H'@)IENDB`ggstats/man/figures/lifecycle-deprecated.svg0000644000176200001440000000171214357760261020737 0ustar liggesuserslifecyclelifecycledeprecateddeprecated ggstats/man/figures/README-unnamed-chunk-5-1.png0000644000176200001440000001540714526731557020671 0ustar liggesusersPNG  IHDRMR/>PLTE:f:::f:fff8::::f:::::::f:::ff:f:f::MMMMMnMMMnMaff:fff:f::f:fffff:ffffnMMnMnnMnnnMMMnM::::fff:ff:fېېnMȫff::ffې۶ȎMې:ېf۶f۶ې۶ndvmfȎې۶+e{ pHYsodoIDATx q4YJP"65SMiöi2([zvLInE'@5b yYb??LX j:\@P  DAh ( @4@P  DAh ( @4)BiFѢ|˴{>DuF_Du>:pz4.}lуB;UAG[Oo|k4u&L3r9lߏh NYЅN~KAzӤ s՛7A>2"vA$/ Fo&:|ñRt 6q5ݟN΃?1,o%/IIG_ߎ}W?\?s]!tEXaeL2{I29q yH`#dv&y`_)=ʺ3?/&Ҿʾ؆\6*4/SeN_:A0W,̼f6Y薠Hv88Oׂ`gwUdP|gK*huzue~MOΎ!hܾ{[Fd- l&i#Sn'E/!4Mzu͝7!(QK"IҬ FF ̜*rt6\P ԅ-ҪՋe\Q3S+m ڟ;7֠:f_>{T y-BzmnaI^tT;kмBrh _Yu?쿰1۹gv rm@P  D#^']MGQJ|A+~7#ឳ?x\a#-B(!h J@Pv"%u.A=̥'NA= OTz84h6عqfa?r A/LٻQw <$rf;/Yh?Gw=Q.!P4~v~{7}J諒o&A筩@K"iI~&_p#%uHո?/>5[ :RP9?\$41VP@'h^*g'tP4s1R$}C (qrڿt&a!hsgJd (qr._?$}Ss]ӗr A]n&. J@Pv"ZϣIbA A{!_"T9|[[ɂn*XK9MAI:Pv5I5@t3rMA* dAs A-.PA g^ůC$6r&A9:('NDo\P5fjD (Aى(EM&r A|@@P> h"TzMv5I"s A]@PDK]z&@ND@P> h (4 *r=&E$. J@Pv"%uqF@t6E(@P  DAh (xAQBm::PR m_U#ឳ?q**"Ԉ A:e'\BP\zԓ8IAA+h(:jtiϫ;ODOp+h=&E&MA6ՅE:t}47Gir\ocR7Ɋc&j&ٴ{PL zܽca>L %֠>s+]"]|93#68AkA 5T}`OJ6՛MqMZ9]~^zAt5~pOc$zA 5.FGu{=5a6x%"M<$YDd.! r A]KPO$hعn.b % (;)% (;WRUmC]10 h"tcŽO,h&):Z?u ejAw~V KꢽV.稠eZy4I8}Au@~ e'<$jD (Aى(EM&r A|@@P> h"TzMv5I"s A]@PDK]z&@ND@P> h (4 *r=&E$. J@Pv"%uqF@t6E(@P  DAh (xAQBm::PR ZI ɥC ibDIAA@PQ.!v g.=q IH݂zrBÁ|@4MRٜv/Ǽ-Ԉ/-"%eg>ASPUDG"PT:S u{΃,{XK]AD9ED. RP]3ZքS9LҤzo1p "%uLP$h\=ק$=6F]^r A]ƾY) &-AqMqMReMTPO (qrSoBfXV45(=JxAu]|>]}s/?lA$r=&EnK:e'\BP4I$v"%eqFI~924YR7,[}7t# q*%@МjA *M=Iڎ,$m zFE dVf^bC-4n3ɑuy\G~4*hm8+K4#4 eA9uPv"~H䂊7hW$2uA ND.%'h4IDKA"h\IK:e'\BP4I$v"%eAA\Py4I.$%uA@PQ.!4 (@I/(h7@P  DAh Վ:DjQԁ*_or^ +TQFTz% (;hx-'I*= @mAWOzq,M6/H#c2RWո|IgBOl2VzM@MRߋDuJNM:,=WIQAr A]tF35ɦ [A\'/҈r A]TKi7VΤ8F]i w(E OwO# $4S|JKVMKk^ JԿ:;tVН.^}Ϙr A7I"h\f. J@Pv"%u.A=ADIb'\BPF'͛79]ࡲ &ږ nZ8&ہY#&*USny?Am|BmI]Mo.o jゖBC[J@Pv 8Ly4I9]|h!(NDo\P5fjD (Aى(EM&r A|@@P> h"TzMv5I"s A]@PDK]z&@ND@P> h (4 *r=&E$. J@Pv"%uqF@t6E(@P  DAh (xAQBm::PR ZI ɥC ibDIAA@PQ.!v g.=q IH݂zrBÁ|@tF'3>βl#7fz8eA|r8C>2$(\ϱ:n#r=&E&I˕&;Vc/ygr0=WV񭤓vx^Aѽ|9V#c>LE^0i=śɵFd.!S]-1̺SW׳|?;|-h t."hOw= +Y JׂƃJ4<$F'10?_4 Jׂ=o-Ac%e䚂ћV;KN A A A;z8uAu޹Wt5OśʪNAkUu=J_ h=&E7Tϯ*E^r A]@PDGu۬>BP~凖/RO\^=|i^_{A< 3R4I*ʷ!7R* TiBoנa. u+uf(M#".$%uA@PQ.!v $M;2A|@D.<$jD (Aى(EM&r A|@@P> h"TzMv5I"s A]@PDKdBm::PR @4@P 6}хtu*ܻG~=g UTE%>;uA ND.%')'a#u q AAPW4QJu&fsڽ~P׾7WOz49_\r A/h>AAwV'Vr#"ZϣIrI*TG݋:=MUAHjTvLM ncpvnunbٗPI!a).sdzkBόcf3Iq^kx~agT<7 z&>@| 7HТ&S pD/9GK"l5*JAg֥4=n 7w-+zs#'hm*+(qcJ\ԛ4Mh zYP%N%5vw[.!(#^kP]l=VhT^~FYl:ϓjV'!9=A'<$YDd.! r A]KPO$|MwߡIʁp #=@P>'w7NA "ZϷI~f~%]M@ i2S)Ǘ (&]z&@Aف7I@PNp?(;h\fƉ(uA ND.%'h4IDKA"h\IK:e'\BP4I$v"%eAA\Py4I.$%uA@PQ.!4 (@I/h5O"dH# iH# 5rAA[@4@P XWwO?[;w~4[H# u#H㝿7~ao?6~2[H# u38/O;_uͻ$'҈B݌4NA}?s[ܳK4PO_?,p.-xZF_޹cNr~7]\4:A߯rɻt6Ƃ"(ԍHO_Psj8ry.&҈B݌ {w w^JiAh ( @4@P9,й?o:A@P9Jmrؙ.@$L &?Tot|AP{Y6z+׉yҽh:&r(נӤ<{3^ԖA@WIgv&Vիg| 5Nj MuMss5mKAAÔkA$f   Aߟn맪` @P  DAh ( @4@P  DAh ( @4@P  DAh ( @4@P  DAh ( ժUQQaDqbj;=.-?E[zF3כP:4 ނƣl6ܚVEe;ikN(=n7dy"KbIOS^y=ބ⁠A'?_سjlN1ͳ'/_yŌVzkR=le9t hנ0v)XV:b{wkЪykN(nU)~gɘtw5hҁ괗A~­`5>+|ˌ[U=KyN(zL?+M[_[%~ռ' =&J0yA @P  DAh ( @4@P  DAh ( @4@P  DPhszf]WIENDB`ggstats/man/figures/README-unnamed-chunk-9-1.png0000644000176200001440000001410614526731561020663 0ustar liggesusersPNG  IHDRMR/PLTE:f:f333::::f:::::MMMMMnMMMnMnMff:fff:ffnMMnMnnMnnnnnnMMMnMnM::::f۶ېnMnnnMȫffېȎMȎnې:nvmfȎېt,% pHYsodIDATx {YvEmp;A0{Nl:3vfԭ*NYi اTI8[!˂$ (I:J$4KϵNM)ޥsAPpQZ'ǹ S8(s{Sw\)Ak։)Żq.5 v\ޔ]8u G;|uboJ.} :#h>:7x>AFk؛RK炠N!A7 ^^>doF7 52nt# x2xR<˳%rRa-!A_fV7-{VOu }P嬔lx<_~x σ*gd-tY4 Y)~#yVJj8gd-Y)~#yVJj8gd-Y)~#yVJj8gd-Y)~#yVJj8gd-Y)~#yVJj8gd-Y)~#yVJj8gd-Y)~#yVJj8gd-Y)~#yVJj8gd-Y)~#yVJj8gd-Y)~#yVJj8gd-Y)~#yVJ) g oa ڞxkPMd#}VJj8gd-Y)~#yVJj8gd-Y)~#yVJj8gd-Y)~#yVJj8gd-Y)~#yVJj8gd-Y)~#yVJj8gd-Y)~#yVJj8gd-Y)~#yVJj8gd-Y)~#yVJj8gd-Y)~#yVJj8gd-[лݽݭ'_,?i肞d'_;D4JvlANveZ"hp%[ [</†pbrPܾ:Go~G4y>Z8W跮sW,W :Ojy${|~GpoW/ /Ui<,xسJx|s4Q Z<e \Ɏ͆g5uygi؂n)O3M γ,lGG=F4oqyа<W : [D4JvtAa3iWc Zfl<+%oȂo[<$5b-]A4J6gd-YPgMdޟ&\Ɏ.hӭ#"hp%[!hS΃c-.t 7e-[P=CTJvlAg?"\Ɏ-h hp%;æ>"\Ɏ*h hp%AR,(/u6c-YP^l[</u6a-.4lRg}Fl# K~Gt+i؂m[<]Md74c-YZA4J6gd-YP6~G[^[\"hve:G4JJP6~k=e ZxdA$ZxdAkAӀ+jxlAlc zGacAMd4_>O3-_E4;rR+1σ4d^>> zC4 [֎<=&W Z'\FP] WAW•lEЕp%At%\Ɏ*hqTWAW•쨂Cs*+qm؂ W Zlmߤiާ>4 [j]NJdM|处++.  c~ARI W Z gF4 T,'%\^!hS\l[>|nwk_处+KͷX; 濬Gñ.p%{!δ{/U+nrR•Uf_ǂ۰U7)(bGa9)JAбs3軏,3rK#ś7 (4lk :O`A 씂.=m5cP3ߴS<˳b :͎Wf_N AO_dWK.06Z~=瓰<4 j ニ&_UDeI0>ivfÝ.g.(SA>zk<WaE-ξݮ<֙˓SA-ހ\(7,ۃ5:AA-p%;lAP|&])hMd#(+]̺ۮbirl9)JvlA9p9)J6.O2I Wc Z~e I Wc :z? \Ɏ-h hp%A'处+-͆rR•悖ʽCǜd9)Jv AE6zw{8rR•6޽ofAN>4\Gd4ߝˊv.d-lnC^(I,'%\@ߌh':XNJ^CAN|_|处+oöQ<]NJJPc05/'%\Ɏ-(p9)JvlAkAӀ+<t}|s}?4d#hٔWl1Z=A۳lʇl19%A{ ڳ h{6mFc=A{ g#h ZSHdG4z-nMC%;ٌ= [TYlIRaokf cfxr6N3g#hAl1sI3~8AӐDɎ.?Wg^MC%;w{LAk h{6o(A,8Up=A{ oiFc&oX[Z5h؂֠~x?AӀ4+2M4͏LJG:FXfaOt#Q|g?i9Aæ~Y G,⊗HЂ|y V磌agwʫm ~Oq7^Y|e5hV]NA+n|7Ra(#AV.מأ17 ~PYqV8Q `$qx&&:Zpm|W( Ax%$%IAIAPt$%IAke}e7?*N5pktAAk4+ s7yo_dtgj-t>R\%} \title{Objects exported from other packages} \keyword{internal} \description{ These objects are imported from other packages. Follow the links below to see their documentation. \describe{ \item{magrittr}{\code{\link[magrittr:pipe]{\%>\%}}} }} ggstats/man/position_likert.Rd0000644000176200001440000001077614505240761016225 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/position_likert.R \docType{data} \name{position_likert} \alias{position_likert} \alias{position_likert_count} \alias{PositionLikert} \alias{PositionLikertCount} \title{Stack objects on top of each another and center them around 0} \usage{ position_likert(vjust = 1, reverse = FALSE, exclude_fill_values = NULL) position_likert_count(vjust = 1, reverse = FALSE, exclude_fill_values = NULL) } \arguments{ \item{vjust}{Vertical adjustment for geoms that have a position (like points or lines), not a dimension (like bars or areas). Set to \code{0} to align with the bottom, \code{0.5} for the middle, and \code{1} (the default) for the top.} \item{reverse}{If \code{TRUE}, will reverse the default stacking order. This is useful if you're rotating both the plot and legend.} \item{exclude_fill_values}{Vector of values from the variable associated with the \code{fill} aesthetic that should not be displayed (but still taken into account for computing proportions)} } \description{ \ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#experimental}{\figure{lifecycle-experimental.svg}{options: alt='[Experimental]'}}}{\strong{[Experimental]}} } \details{ \code{position_likert()} stacks proportion bars on top of each other and center them around zero (the same number of modalities are displayed on each side). This type of presentation is commonly used to display Likert-type scales. \code{position_likert_count()} uses counts instead of proportions. It is recommended to use \code{position_likert()} with \code{stat_prop()} and its \code{complete} argument (see examples). } \examples{ library(ggplot2) ggplot(diamonds) + aes(y = clarity, fill = cut) + geom_bar(position = "fill") + scale_x_continuous(label = scales::label_percent()) + scale_fill_brewer(palette = "PiYG") + xlab("proportion") ggplot(diamonds) + aes(y = clarity, fill = cut) + geom_bar(position = "likert") + scale_x_continuous(label = label_percent_abs()) + scale_fill_brewer(palette = "PiYG") + xlab("proportion") ggplot(diamonds) + aes(y = clarity, fill = cut) + geom_bar(position = "stack") + scale_fill_brewer(palette = "PiYG") ggplot(diamonds) + aes(y = clarity, fill = cut) + geom_bar(position = "likert_count") + scale_x_continuous(label = label_number_abs()) + scale_fill_brewer(palette = "PiYG") \donttest{ # Reverse order ------------------------------------------------------------- ggplot(diamonds) + aes(y = clarity, fill = cut) + geom_bar(position = position_likert(reverse = TRUE)) + scale_x_continuous(label = label_percent_abs()) + scale_fill_brewer(palette = "PiYG", direction = -1) + xlab("proportion") # Missing items ------------------------------------------------------------- # example with a level not being observed for a specific value of y d <- diamonds d <- d[!(d$cut == "Premium" & d$clarity == "I1"), ] d <- d[!(d$cut \%in\% c("Fair", "Good") & d$clarity == "SI2"), ] # by default, the two lowest bar are not properly centered ggplot(d) + aes(y = clarity, fill = cut) + geom_bar(position = "likert") + scale_fill_brewer(palette = "PiYG") # use stat_prop() with `complete = "fill"` to fix it ggplot(d) + aes(y = clarity, fill = cut) + geom_bar(position = "likert", stat = "prop", complete = "fill") + scale_fill_brewer(palette = "PiYG") # Add labels ---------------------------------------------------------------- custom_label <- function(x) { p <- scales::percent(x, accuracy = 1) p[x < .075] <- "" p } ggplot(diamonds) + aes(y = clarity, fill = cut) + geom_bar(position = "likert") + geom_text( aes(by = clarity, label = custom_label(after_stat(prop))), stat = "prop", position = position_likert(vjust = .5) ) + scale_x_continuous(label = label_percent_abs()) + scale_fill_brewer(palette = "PiYG", direction = -1) + xlab("proportion") # Do not display specific fill values --------------------------------------- # (but taken into account to compute proportions) ggplot(diamonds) + aes(y = clarity, fill = cut) + geom_bar(position = position_likert(exclude_fill_values = "Very Good")) + scale_x_continuous(label = label_percent_abs()) + scale_fill_brewer(palette = "PiYG") + xlab("proportion") } } \seealso{ See \code{\link[ggplot2:position_stack]{ggplot2::position_stack()}} and \code{\link[ggplot2:position_stack]{ggplot2::position_fill()}} } \keyword{datasets} ggstats/man/geom_stripped_rows.Rd0000644000176200001440000001052514357760261016721 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/geom_stripped_rows.R \name{geom_stripped_rows} \alias{geom_stripped_rows} \alias{geom_stripped_cols} \title{Alternating Background Color} \usage{ geom_stripped_rows( mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., show.legend = NA, inherit.aes = TRUE, xfrom = -Inf, xto = Inf, width = 1, nudge_y = 0 ) geom_stripped_cols( mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., show.legend = NA, inherit.aes = TRUE, yfrom = -Inf, yto = Inf, width = 1, nudge_x = 0 ) } \arguments{ \item{mapping}{Set of aesthetic mappings created by \code{\link[ggplot2:aes]{aes()}}. If specified and \code{inherit.aes = TRUE} (the default), it is combined with the default mapping at the top level of the plot. You must supply \code{mapping} if there is no plot mapping.} \item{data}{The data to be displayed in this layer. There are three options: If \code{NULL}, the default, the data is inherited from the plot data as specified in the call to \code{\link[ggplot2:ggplot]{ggplot()}}. A \code{data.frame}, or other object, will override the plot data. All objects will be fortified to produce a data frame. See \code{\link[ggplot2:fortify]{fortify()}} for which variables will be created. A \code{function} will be called with a single argument, the plot data. The return value must be a \code{data.frame}, and will be used as the layer data. A \code{function} can be created from a \code{formula} (e.g. \code{~ head(.x, 10)}).} \item{stat}{The statistical transformation to use on the data for this layer, either as a \code{ggproto} \code{Geom} subclass or as a string naming the stat stripped of the \code{stat_} prefix (e.g. \code{"count"} rather than \code{"stat_count"})} \item{position}{Position adjustment, either as a string naming the adjustment (e.g. \code{"jitter"} to use \code{position_jitter}), or the result of a call to a position adjustment function. Use the latter if you need to change the settings of the adjustment.} \item{...}{Other arguments passed on to \code{\link[ggplot2:layer]{layer()}}. These are often aesthetics, used to set an aesthetic to a fixed value, like \code{colour = "red"} or \code{size = 3}. They may also be parameters to the paired geom/stat.} \item{show.legend}{logical. Should this layer be included in the legends? \code{NA}, the default, includes if any aesthetics are mapped. \code{FALSE} never includes, and \code{TRUE} always includes. It can also be a named logical vector to finely select the aesthetics to display.} \item{inherit.aes}{If \code{FALSE}, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. \code{\link[ggplot2:borders]{borders()}}.} \item{xfrom, xto}{limitation of the strips along the x-axis} \item{width}{width of the strips} \item{yfrom, yto}{limitation of the strips along the y-axis} \item{nudge_x, nudge_y}{horizontal or vertical adjustment to nudge strips by} } \value{ A \code{ggplot2} plot with the added geometry. } \description{ Add alternating background color along the y-axis. The geom takes default aesthetics \code{odd} and \code{even} that receive color codes. } \examples{ \dontshow{if (requireNamespace("reshape")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} data(tips, package = "reshape") library(ggplot2) p <- ggplot(tips) + aes(x = time, y = day) + geom_count() + theme_light() p p + geom_stripped_rows() p + geom_stripped_cols() p + geom_stripped_rows() + geom_stripped_cols() \donttest{ p <- ggplot(tips) + aes(x = total_bill, y = day) + geom_count() + theme_light() p p + geom_stripped_rows() p + geom_stripped_rows() + scale_y_discrete(expand = expansion(0, 0.5)) p + geom_stripped_rows(xfrom = 10, xto = 35) p + geom_stripped_rows(odd = "blue", even = "yellow") p + geom_stripped_rows(odd = "blue", even = "yellow", alpha = .1) p + geom_stripped_rows(odd = "#00FF0022", even = "#FF000022") p + geom_stripped_cols() p + geom_stripped_cols(width = 10) p + geom_stripped_cols(width = 10, nudge_x = 5) } \dontshow{\}) # examplesIf} } ggstats/man/stat_cross.Rd0000644000176200001440000001232714415736053015171 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/stat_cross.R \docType{data} \name{stat_cross} \alias{stat_cross} \alias{StatCross} \title{Compute cross-tabulation statistics} \usage{ stat_cross( mapping = NULL, data = NULL, geom = "point", position = "identity", ..., na.rm = TRUE, show.legend = NA, inherit.aes = TRUE, keep.zero.cells = FALSE ) } \arguments{ \item{mapping}{Set of aesthetic mappings created by \code{\link[ggplot2:aes]{aes()}}. If specified and \code{inherit.aes = TRUE} (the default), it is combined with the default mapping at the top level of the plot. You must supply \code{mapping} if there is no plot mapping.} \item{data}{The data to be displayed in this layer. There are three options: If \code{NULL}, the default, the data is inherited from the plot data as specified in the call to \code{\link[ggplot2:ggplot]{ggplot()}}. A \code{data.frame}, or other object, will override the plot data. All objects will be fortified to produce a data frame. See \code{\link[ggplot2:fortify]{fortify()}} for which variables will be created. A \code{function} will be called with a single argument, the plot data. The return value must be a \code{data.frame}, and will be used as the layer data. A \code{function} can be created from a \code{formula} (e.g. \code{~ head(.x, 10)}).} \item{geom}{Override the default connection with \code{\link[ggplot2:geom_point]{ggplot2::geom_point()}}.} \item{position}{Position adjustment, either as a string naming the adjustment (e.g. \code{"jitter"} to use \code{position_jitter}), or the result of a call to a position adjustment function. Use the latter if you need to change the settings of the adjustment.} \item{...}{Other arguments passed on to \code{\link[ggplot2:layer]{layer()}}. These are often aesthetics, used to set an aesthetic to a fixed value, like \code{colour = "red"} or \code{size = 3}. They may also be parameters to the paired geom/stat.} \item{na.rm}{If \code{TRUE}, the default, missing values are removed with a warning. If \code{TRUE}, missing values are silently removed.} \item{show.legend}{logical. Should this layer be included in the legends? \code{NA}, the default, includes if any aesthetics are mapped. \code{FALSE} never includes, and \code{TRUE} always includes. It can also be a named logical vector to finely select the aesthetics to display.} \item{inherit.aes}{If \code{FALSE}, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. \code{\link[ggplot2:borders]{borders()}}.} \item{keep.zero.cells}{If \code{TRUE}, cells with no observations are kept.} } \value{ A \code{ggplot2} plot with the added statistic. } \description{ Computes statistics of a 2-dimensional matrix using \link[broom:augment.htest]{broom::augment.htest}. } \section{Aesthetics}{ \code{stat_cross()} requires the \strong{x} and the \strong{y} aesthetics. } \section{Computed variables}{ \describe{ \item{observed}{number of observations in x,y} \item{prop}{proportion of total} \item{row.prop}{row proportion} \item{col.prop}{column proportion} \item{expected}{expected count under the null hypothesis} \item{resid}{Pearson's residual} \item{std.resid}{standardized residual} \item{row.observed}{total number of observations within row} \item{col.observed}{total number of observations within column} \item{total.observed}{total number of observations within the table} \item{phi}{phi coefficients, see \code{\link[=augment_chisq_add_phi]{augment_chisq_add_phi()}}} } } \examples{ library(ggplot2) d <- as.data.frame(Titanic) # plot number of observations ggplot(d) + aes(x = Class, y = Survived, weight = Freq, size = after_stat(observed)) + stat_cross() + scale_size_area(max_size = 20) # custom shape and fill colour based on chi-squared residuals ggplot(d) + aes( x = Class, y = Survived, weight = Freq, size = after_stat(observed), fill = after_stat(std.resid) ) + stat_cross(shape = 22) + scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) + scale_size_area(max_size = 20) \donttest{ # custom shape and fill colour based on phi coeffients ggplot(d) + aes( x = Class, y = Survived, weight = Freq, size = after_stat(observed), fill = after_stat(phi) ) + stat_cross(shape = 22) + scale_fill_steps2(show.limits = TRUE) + scale_size_area(max_size = 20) # plotting the number of observations as a table ggplot(d) + aes( x = Class, y = Survived, weight = Freq, label = after_stat(observed) ) + geom_text(stat = "cross") # Row proportions with standardized residuals ggplot(d) + aes( x = Class, y = Survived, weight = Freq, label = scales::percent(after_stat(row.prop)), size = NULL, fill = after_stat(std.resid) ) + stat_cross(shape = 22, size = 30) + geom_text(stat = "cross") + scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) + facet_grid(Sex ~ .) + labs(fill = "Standardized residuals") + theme_minimal() } } \seealso{ \code{vignette("stat_cross")} } \keyword{datasets} ggstats/man/weighted.median.Rd0000644000176200001440000000435614527052140016034 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/weighted_quantile.R \name{weighted.median} \alias{weighted.median} \alias{weighted.quantile} \title{Weighted Median and Quantiles} \source{ These functions are adapted from their homonyms developed by Adrian Baddeley in the \code{spatstat} package. } \usage{ weighted.median(x, w, na.rm = TRUE, type = 2) weighted.quantile(x, w, probs = seq(0, 1, 0.25), na.rm = TRUE, type = 4) } \arguments{ \item{x}{a numeric vector of values} \item{w}{a numeric vector of weights} \item{na.rm}{a logical indicating whether to ignore \code{NA} values} \item{type}{Integer specifying the rule for calculating the median or quantile, corresponding to the rules available for \code{stats:quantile()}. The only valid choices are type=1, 2 or 4. See Details.} \item{probs}{probabilities for which the quantiles should be computed, a numeric vector of values between 0 and 1} } \value{ A numeric vector. } \description{ Compute the median or quantiles a set of numbers which have weights associated with them. } \details{ The \code{i}th observation \code{x[i]} is treated as having a weight proportional to \code{w[i]}. The weighted median is a value \code{m} such that the total weight of data less than or equal to \code{m} is equal to half the total weight. More generally, the weighted quantile with probability \code{p} is a value \code{q} such that the total weight of data less than or equal to \code{q} is equal to \code{p} times the total weight. If there is no such value, then \itemize{ \item if \code{type = 1}, the next largest value is returned (this is the right-continuous inverse of the left-continuous cumulative distribution function); \item if \code{type = 2}, the average of the two surrounding values is returned (the average of the right-continuous and left-continuous inverses); \item if \code{type = 4}, linear interpolation is performed. } Note that the default rule for \code{weighted.median()} is \code{type = 2}, consistent with the traditional definition of the median, while the default for \code{weighted.quantile()} is \code{type = 4}. } \examples{ x <- 1:20 w <- runif(20) weighted.median(x, w) weighted.quantile(x, w) } ggstats/man/augment_chisq_add_phi.Rd0000644000176200001440000000226514357760261017307 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/stat_cross.R \name{augment_chisq_add_phi} \alias{augment_chisq_add_phi} \title{Augment a chi-squared test and compute phi coefficients} \usage{ augment_chisq_add_phi(x) } \arguments{ \item{x}{a chi-squared test as returned by \code{\link[stats:chisq.test]{stats::chisq.test()}}} } \value{ A \code{tibble}. } \description{ Augment a chi-squared test and compute phi coefficients } \details{ Phi coefficients are a measurement of the degree of association between two binary variables. \itemize{ \item A value between -1.0 to -0.7 indicates a strong negative association. \item A value between -0.7 to -0.3 indicates a weak negative association. \item A value between -0.3 to +0.3 indicates a little or no association. \item A value between +0.3 to +0.7 indicates a weak positive association. \item A value between +0.7 to +1.0 indicates a strong positive association. } } \examples{ tab <- xtabs(Freq ~ Sex + Class, data = as.data.frame(Titanic)) augment_chisq_add_phi(chisq.test(tab)) } \seealso{ \code{\link[=stat_cross]{stat_cross()}}, \code{GDAtools::phi.table()} or \code{psych::phi()} } ggstats/man/signif_stars.Rd0000644000176200001440000000165114357760261015501 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/signif_stars.R \name{signif_stars} \alias{signif_stars} \title{Significance Stars} \usage{ signif_stars(x, three = 0.001, two = 0.01, one = 0.05, point = 0.1) } \arguments{ \item{x}{numeric values that will be compared to the \code{point}, \code{one}, \code{two}, and \code{three} values} \item{three}{threshold below which to display three stars} \item{two}{threshold below which to display two stars} \item{one}{threshold below which to display one star} \item{point}{threshold below which to display one point (\code{NULL} to deactivate)} } \value{ Character vector containing the appropriate number of stars for each \code{x} value. } \description{ Calculate significance stars } \examples{ x <- c(0.5, 0.1, 0.05, 0.01, 0.001) signif_stars(x) signif_stars(x, one = .15, point = NULL) } \author{ Joseph Larmarange } ggstats/man/ggsurvey.Rd0000644000176200001440000000312214357760261014654 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/ggsurvey.R \name{ggsurvey} \alias{ggsurvey} \title{Easy ggplot2 with survey objects} \usage{ ggsurvey(design = NULL, mapping = NULL, ...) } \arguments{ \item{design}{A survey design object, usually created with \code{\link[survey:svydesign]{survey::svydesign()}}} \item{mapping}{Default list of aesthetic mappings to use for plot, to be created with \code{\link[ggplot2:aes]{ggplot2::aes()}}.} \item{...}{Other arguments passed on to methods. Not currently used.} } \value{ A \code{ggplot2} plot. } \description{ A function to facilitate \code{ggplot2} graphs using a survey object. It will initiate a ggplot and map survey weights to the corresponding aesthetic. } \details{ Graphs will be correct as long as only weights are required to compute the graph. However, statistic or geometry requiring correct variance computation (like \code{\link[ggplot2:geom_smooth]{ggplot2::geom_smooth()}}) will be statistically incorrect. } \examples{ \dontshow{if (requireNamespace("survey")) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf} data(api, package = "survey") dstrat <- survey::svydesign( id = ~1, strata = ~stype, weights = ~pw, data = apistrat, fpc = ~fpc ) ggsurvey(dstrat) + ggplot2::aes(x = cnum, y = dnum) + ggplot2::geom_count() d <- as.data.frame(Titanic) dw <- survey::svydesign(ids = ~1, weights = ~Freq, data = d) ggsurvey(dw) + ggplot2::aes(x = Class, fill = Survived) + ggplot2::geom_bar(position = "fill") \dontshow{\}) # examplesIf} } ggstats/man/stat_prop.Rd0000644000176200001440000001312414521172741015010 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/stat_prop.R \docType{data} \name{stat_prop} \alias{stat_prop} \alias{StatProp} \title{Compute proportions according to custom denominator} \usage{ stat_prop( mapping = NULL, data = NULL, geom = "bar", position = "fill", ..., width = NULL, na.rm = FALSE, orientation = NA, show.legend = NA, inherit.aes = TRUE, complete = NULL ) } \arguments{ \item{mapping}{Set of aesthetic mappings created by \code{\link[ggplot2:aes]{aes()}}. If specified and \code{inherit.aes = TRUE} (the default), it is combined with the default mapping at the top level of the plot. You must supply \code{mapping} if there is no plot mapping.} \item{data}{The data to be displayed in this layer. There are three options: If \code{NULL}, the default, the data is inherited from the plot data as specified in the call to \code{\link[ggplot2:ggplot]{ggplot()}}. A \code{data.frame}, or other object, will override the plot data. All objects will be fortified to produce a data frame. See \code{\link[ggplot2:fortify]{fortify()}} for which variables will be created. A \code{function} will be called with a single argument, the plot data. The return value must be a \code{data.frame}, and will be used as the layer data. A \code{function} can be created from a \code{formula} (e.g. \code{~ head(.x, 10)}).} \item{geom}{Override the default connection with \code{\link[ggplot2:geom_bar]{ggplot2::geom_bar()}}.} \item{position}{Position adjustment, either as a string naming the adjustment (e.g. \code{"jitter"} to use \code{position_jitter}), or the result of a call to a position adjustment function. Use the latter if you need to change the settings of the adjustment.} \item{...}{Other arguments passed on to \code{\link[ggplot2:layer]{layer()}}. These are often aesthetics, used to set an aesthetic to a fixed value, like \code{colour = "red"} or \code{size = 3}. They may also be parameters to the paired geom/stat.} \item{width}{Bar width. By default, set to 90\% of the \code{\link[ggplot2:resolution]{resolution()}} of the data.} \item{na.rm}{If \code{FALSE}, the default, missing values are removed with a warning. If \code{TRUE}, missing values are silently removed.} \item{orientation}{The orientation of the layer. The default (\code{NA}) automatically determines the orientation from the aesthetic mapping. In the rare event that this fails it can be given explicitly by setting \code{orientation} to either \code{"x"} or \code{"y"}. See the \emph{Orientation} section for more detail.} \item{show.legend}{logical. Should this layer be included in the legends? \code{NA}, the default, includes if any aesthetics are mapped. \code{FALSE} never includes, and \code{TRUE} always includes. It can also be a named logical vector to finely select the aesthetics to display.} \item{inherit.aes}{If \code{FALSE}, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. \code{\link[ggplot2:borders]{borders()}}.} \item{complete}{Name (character) of an aesthetic for those statistics should be completed for unobserved values (see example)} } \value{ A \code{ggplot2} plot with the added statistic. } \description{ \code{stat_prop()} is a variation of \code{\link[ggplot2:geom_bar]{ggplot2::stat_count()}} allowing to compute custom proportions according to the \strong{by} aesthetic defining the denominator (i.e. all proportions for a same value of \strong{by} will sum to 1). The \strong{by} aesthetic should be a factor. If \strong{by} is not specified, proportions of the total will be computed. } \section{Aesthetics}{ \code{stat_prop()} understands the following aesthetics (required aesthetics are in bold): \itemize{ \item \strong{x \emph{or} y} \item by (this aesthetic should be a \strong{factor}) \item group \item weight } } \section{Computed variables}{ \describe{ \item{count}{number of points in bin} \item{prop}{computed proportion} } } \examples{ library(ggplot2) d <- as.data.frame(Titanic) p <- ggplot(d) + aes(x = Class, fill = Survived, weight = Freq, by = Class) + geom_bar(position = "fill") + geom_text(stat = "prop", position = position_fill(.5)) p p + facet_grid(~Sex) ggplot(d) + aes(x = Class, fill = Survived, weight = Freq) + geom_bar(position = "dodge") + geom_text( aes(by = Survived), stat = "prop", position = position_dodge(0.9), vjust = "bottom" ) \donttest{ if (requireNamespace("scales")) { ggplot(d) + aes(x = Class, fill = Survived, weight = Freq, by = 1) + geom_bar() + geom_text( aes(label = scales::percent(after_stat(prop), accuracy = 1)), stat = "prop", position = position_stack(.5) ) } # displaying unobserved levels with complete d <- diamonds \%>\% dplyr::filter(!(cut == "Ideal" & clarity == "I1")) \%>\% dplyr::filter(!(cut == "Very Good" & clarity == "VS2")) \%>\% dplyr::filter(!(cut == "Premium" & clarity == "IF")) p <- ggplot(d) + aes(x = clarity, fill = cut, by = clarity) + geom_bar(position = "fill") p + geom_text(stat = "prop", position = position_fill(.5)) p + geom_text(stat = "prop", position = position_fill(.5), complete = "fill") } } \seealso{ \code{vignette("stat_prop")}, \code{\link[ggplot2:geom_bar]{ggplot2::stat_count()}}. For an alternative approach, see \url{https://github.com/tidyverse/ggplot2/issues/5505#issuecomment-1791324008}. } \keyword{datasets} ggstats/DESCRIPTION0000644000176200001440000000242414527062732013446 0ustar liggesusersPackage: ggstats Type: Package Title: Extension to 'ggplot2' for Plotting Stats Version: 0.5.1 Authors@R: c( person("Joseph", "Larmarange", , "joseph@larmarange.net", role = c("aut", "cre"), comment = c(ORCID = "0000-0001-7097-700X")) ) Description: Provides new statistics, new geometries and new positions for 'ggplot2' and a suite of functions to facilitate the creation of statistical plots. License: GPL (>= 3) URL: https://larmarange.github.io/ggstats/, https://github.com/larmarange/ggstats BugReports: https://github.com/larmarange/ggstats/issues Imports: broom.helpers (>= 1.14.0), cli, dplyr, forcats, ggplot2 (>= 3.4.0), lifecycle, magrittr, patchwork, purrr, rlang, scales, stats, stringr, tidyr Suggests: betareg, broom, emmeans, glue, knitr, labelled (>= 2.11.0), reshape, rmarkdown, nnet, parameters, pscl, testthat (>= 3.0.0), spelling, survey, survival, vdiffr Encoding: UTF-8 RoxygenNote: 7.2.3 Config/testthat/edition: 3 Language: en-US VignetteBuilder: knitr NeedsCompilation: no Packaged: 2023-11-21 07:02:14 UTC; josep Author: Joseph Larmarange [aut, cre] () Maintainer: Joseph Larmarange Repository: CRAN Date/Publication: 2023-11-21 08:10:02 UTC ggstats/build/0000755000176200001440000000000014527052762013037 5ustar liggesusersggstats/build/vignette.rds0000644000176200001440000000061014527052762015373 0ustar liggesusersAO0 `#F b% labelled::set_variable_labels( day = "Day of the week", time = "Lunch or Dinner", total_bill = "Bill's total" ) mod_labelled <- lm(tip ~ day + time + total_bill, data = tips_labelled) vdiffr::expect_doppelganger( "ggcoef_model() mod labelled", ggcoef_model(mod_labelled) ) } vdiffr::expect_doppelganger( "ggcoef_model() mod simple with variable labels", ggcoef_model( mod_simple, variable_labels = c( day = "Week day", time = "Time (lunch or dinner ?)", total_bill = "Total of the bill" ) ) ) # if labels are too long, you can use 'facet_labeller' to wrap them vdiffr::expect_doppelganger( "ggcoef_model() mod simple facet_labeller", ggcoef_model( mod_simple, variable_labels = c( day = "Week day", time = "Time (lunch or dinner ?)", total_bill = "Total of the bill" ), facet_labeller = ggplot2::label_wrap_gen(10) ) ) # do not display variable facets but add colour guide vdiffr::expect_doppelganger( "ggcoef_model() mod simple no variable facets", ggcoef_model( mod_simple, facet_row = NULL, colour_guide = TRUE ) ) # a logistic regression example d_titanic <- as.data.frame(Titanic) d_titanic$Survived <- factor(d_titanic$Survived, c("No", "Yes")) mod_titanic <- glm( Survived ~ Sex * Age + Class, weights = Freq, data = d_titanic, family = binomial ) vdiffr::expect_doppelganger( "ggcoef_model() logistic regression", ggcoef_model(mod_titanic, exponentiate = TRUE) ) # display intercept vdiffr::expect_doppelganger( "ggcoef_model() logistic regression with intercept", ggcoef_model(mod_titanic, exponentiate = TRUE, intercept = TRUE) ) # display only a subset of terms vdiffr::expect_doppelganger( "ggcoef_model() logistic regression subset", ggcoef_model(mod_titanic, exponentiate = TRUE, include = c("Age", "Class")) ) # do not change points' shape based on significance vdiffr::expect_doppelganger( "ggcoef_model() logistic regression no significance", ggcoef_model(mod_titanic, exponentiate = TRUE, significance = NULL) ) # a black and white version vdiffr::expect_doppelganger( "ggcoef_model() logistic regression black and white", ggcoef_model( mod_titanic, exponentiate = TRUE, colour = NULL, stripped_rows = FALSE ) ) # show dichotomous terms on one row vdiffr::expect_doppelganger( "ggcoef_model() logistic regression no reference row", ggcoef_model( mod_titanic, exponentiate = TRUE, no_reference_row = broom.helpers::all_dichotomous(), categorical_terms_pattern = "{ifelse(dichotomous, paste0(level, ' / ', reference_level), level)}", show_p_values = FALSE ) ) # works also with with polynomial terms mod_poly <- lm( tip ~ poly(total_bill, 3) + day, data = tips, ) vdiffr::expect_doppelganger( "ggcoef_model() polynomial terms", ggcoef_model(mod_poly) ) # or with different type of contrasts # for sum contrasts, the value of the reference term is computed if (requireNamespace("emmeans")) { mod2 <- lm( tip ~ day + time + sex, data = tips, contrasts = list(time = contr.sum, day = contr.treatment(4, base = 3)) ) vdiffr::expect_doppelganger( "ggcoef_model() different types of contrasts", ggcoef_model(mod2) ) } }) test_that("ggcoef_compare()", { skip_if_not_installed("broom.helpers") skip_on_cran() # Use ggcoef_compare() for comparing several models on the same plot mod1 <- lm(Fertility ~ ., data = swiss) mod2 <- step(mod1, trace = 0) mod3 <- lm(Fertility ~ Agriculture + Education * Catholic, data = swiss) models <- list( "Full model" = mod1, "Simplified model" = mod2, "With interaction" = mod3 ) vdiffr::expect_doppelganger( "ggcoef_compare() dodged", ggcoef_compare(models) ) vdiffr::expect_doppelganger( "ggcoef_compare() faceted", ggcoef_compare(models, type = "faceted") ) d <- as.data.frame(Titanic) m1 <- glm(Survived ~ Sex + Age, family = binomial, data = d, weights = Freq) m2 <- glm( Survived ~ Sex + Age + Class, family = binomial, data = d, weights = Freq ) models <- list("Model 1" = m1, "Model 2" = m2) vdiffr::expect_doppelganger( "ggcoef_compare() titanic dodged", ggcoef_compare(models) ) vdiffr::expect_doppelganger( "ggcoef_compare() titanic faceted", ggcoef_compare(models, type = "faceted") ) rd <- ggcoef_compare(models, return_data = TRUE) expect_equal( levels(rd$label), c("Male", "Female", "Child", "Adult", "1st", "2nd", "3rd", "Crew") ) expect_error( ggcoef_compare(models, add_reference_rows = FALSE), NA ) }) test_that("ggcoef_multinom()", { skip_if_not_installed("broom.helpers") skip_if_not_installed("nnet") skip_on_cran() library(nnet) hec <- as.data.frame(HairEyeColor) mod <- multinom( Hair ~ Eye + Sex, data = hec, weights = hec$Freq ) vdiffr::expect_doppelganger( "ggcoef_multinom() dodged", ggcoef_multinom(mod, exponentiate = TRUE) ) vdiffr::expect_doppelganger( "ggcoef_multinom() faceted", ggcoef_multinom(mod, type = "faceted") ) vdiffr::expect_doppelganger( "ggcoef_multinom() table", ggcoef_multinom(mod, type = "table", exponentiate = TRUE) ) vdiffr::expect_doppelganger( "ggcoef_multinom() faceted custom y level label", ggcoef_multinom( mod, type = "faceted", y.level_label = c("Brown" = "Brown\n(ref: Black)"), exponentiate = TRUE ) ) }) test_that("ggcoef_model() works with tieders not returning p-values", { skip_if_not_installed("broom.helpers") skip_on_cran() mod <- lm(Sepal.Width ~ Species, iris) my_tidier <- function(x, ...) { x %>% broom::tidy(...) %>% dplyr::select(-dplyr::all_of("p.value")) } vdiffr::expect_doppelganger( "ggcoef_model() no p values", ggcoef_model(mod, tidy_fun = my_tidier) ) }) test_that("ggcoef_compare() complete NA respecting variables order", { skip_if_not_installed("broom.helpers") m1 <- lm(Fertility ~ Education + Catholic, data = swiss) m2 <- lm(Fertility ~ Education + Catholic + Agriculture, data = swiss) m3 <- lm( Fertility ~ Education + Catholic + Agriculture + Infant.Mortality, data = swiss ) res <- ggcoef_compare(models = list(m1, m2, m3), return_data = TRUE) expect_equal( res$variable[1:4], structure(1:4, .Label = c( "Education", "Catholic", "Agriculture", "Infant.Mortality" ), class = "factor") ) }) test_that("ggcoef_compare() does not produce an error with an include", { skip_if_not_installed("survival") skip_if_not_installed("broom.helpers") skip_on_cran() m1 <- survival::coxph( survival::Surv(time, status) ~ prior + age, data = survival::veteran ) m2 <- survival::coxph( survival::Surv(time, status) ~ prior + celltype, data = survival::veteran ) models <- list("Model 1" = m1, "Model 2" = m2) vdiffr::expect_doppelganger( "ggcoef_compare() with include", ggcoef_compare(models, include = broom.helpers::starts_with("p")) ) }) test_that("ggcoef_model() works with pairwise contratst", { skip_if_not_installed("broom.helpers") mod <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris) expect_error( ggcoef_model(mod, add_pairwise_contrasts = TRUE), NA ) expect_error( ggcoef_model( mod, add_pairwise_contrasts = TRUE, pairwise_variables = dplyr::starts_with("Sp"), keep_model_terms = TRUE ), NA ) mod2 <- lm(Sepal.Length ~ Species, data = iris) expect_error( ggcoef_compare(list(mod, mod2), add_pairwise_contrasts = TRUE), NA ) }) test_that("tidy_args is supported", { mod <- lm(Sepal.Length ~ Sepal.Width, data = iris) custom <- function(x, force = 1, ...) { broom::tidy(x, ...) %>% dplyr::mutate(estimate = force) } res <- ggcoef_model( mod, tidy_fun = custom, tidy_args = list(force = 3), return_data = TRUE ) expect_equal(res$estimate, 3) }) test_that("ggcoef_table()", { skip_on_cran() skip_if_not_installed("broom.helpers") skip_if_not_installed("reshape") data(tips, package = "reshape") mod_simple <- lm(tip ~ day + time + total_bill, data = tips) vdiffr::expect_doppelganger( "ggcoef_table() mod simple", ggcoef_table(mod_simple) ) vdiffr::expect_doppelganger( "ggcoef_table() table_stat", ggcoef_table(mod_simple, table_stat = c("p.value", "ci")) ) vdiffr::expect_doppelganger( "ggcoef_table() table_header", ggcoef_table(mod_simple, table_header = c("A", "B", "C")) ) expect_error( ggcoef_table(mod_simple, table_header = c("A", "B", "C", "D")) ) vdiffr::expect_doppelganger( "ggcoef_table() table_text_size", ggcoef_table(mod_simple, table_text_size = 5) ) vdiffr::expect_doppelganger( "ggcoef_table() table_stat_label ", ggcoef_table( mod_simple, table_stat_label = list( estimate = scales::label_percent(.1) ) ) ) vdiffr::expect_doppelganger( "ggcoef_table() ci_pattern", ggcoef_table(mod_simple, ci_pattern = "{conf.low} to {conf.high}") ) vdiffr::expect_doppelganger( "ggcoef_table() table_widths", ggcoef_table(mod_simple, table_witdhs = c(1, 2)) ) vdiffr::expect_doppelganger( "ggcoef_table() stripped_rows", ggcoef_table(mod_simple, stripped_rows = FALSE) ) vdiffr::expect_doppelganger( "ggcoef_table() show_p_values & signif_stars", ggcoef_table(mod_simple, show_p_values = TRUE, signif_stars = TRUE) ) vdiffr::expect_doppelganger( "ggcoef_table() show_p_values only", ggcoef_table(mod_simple, show_p_values = TRUE, signif_stars = FALSE) ) vdiffr::expect_doppelganger( "ggcoef_table() signif_stars only", ggcoef_table(mod_simple, show_p_values = FALSE, signif_stars = TRUE) ) vdiffr::expect_doppelganger( "ggcoef_table() customized statistics", ggcoef_table( mod_simple, table_stat = c("label", "estimate", "std.error", "ci"), ci_pattern = "{conf.low} to {conf.high}", table_stat_label = list( estimate = scales::label_number(accuracy = .01), conf.low = scales::label_number(accuracy = .1), conf.high = scales::label_number(accuracy = .1), std.error = scales::label_number(accuracy = .001), label = toupper ), table_header = c("Term", "Coef.", "SE", "CI"), table_witdhs = c(2, 3) ) ) }) test_that("ggcoef_multicomponents()", { skip_on_cran() skip_if_not_installed("broom.helpers") skip_if_not_installed("pscl") library(pscl) data("bioChemists", package = "pscl") mod <- zeroinfl(art ~ fem * mar | fem + mar, data = bioChemists) vdiffr::expect_doppelganger( "ggcoef_multicomponents() dodged", ggcoef_multicomponents(mod, tidy_fun = broom.helpers::tidy_zeroinfl) ) vdiffr::expect_doppelganger( "ggcoef_multicomponents() faceted", ggcoef_multicomponents( mod, tidy_fun = broom.helpers::tidy_zeroinfl, type = "f" ) ) vdiffr::expect_doppelganger( "ggcoef_multicomponents() faceted exponentiated", ggcoef_multicomponents( mod, tidy_fun = broom.helpers::tidy_zeroinfl, type = "f", exponentiate = TRUE ) ) vdiffr::expect_doppelganger( "ggcoef_multicomponents() table", ggcoef_multicomponents( mod, tidy_fun = broom.helpers::tidy_zeroinfl, type = "t" ) ) vdiffr::expect_doppelganger( "ggcoef_multicomponents() table exponentiated", ggcoef_multicomponents( mod, tidy_fun = broom.helpers::tidy_zeroinfl, type = "t", exponentiate = TRUE ) ) expect_s3_class( ggcoef_multicomponents( mod, tidy_fun = broom.helpers::tidy_zeroinfl, type = "t", return_data = TRUE ), "tbl" ) vdiffr::expect_doppelganger( "ggcoef_multicomponents() table component_label", ggcoef_multicomponents( mod, tidy_fun = broom.helpers::tidy_zeroinfl, type = "t", component_label = c(conditional = "Count", zero_inflated = "Zero-inflated") # nolint ) ) vdiffr::expect_doppelganger( "ggcoef_multicomponents() faceted component_label", ggcoef_multicomponents( mod, tidy_fun = broom.helpers::tidy_zeroinfl, type = "f", component_label = c(conditional = "Count", zero_inflated = "Zero-inflated") # nolint ) ) # message if unfound values for component_label expect_message( ggcoef_multicomponents( mod, tidy_fun = broom.helpers::tidy_zeroinfl, type = "t", component_label = c(c = "Count", zi = "Zero-inflated") ) ) # error if unnamed values for component_label expect_error( ggcoef_multicomponents( mod, tidy_fun = broom.helpers::tidy_zeroinfl, type = "t", component_label = c("Count", zi = "Zero-inflated") ) ) mod2 <- zeroinfl(art ~ fem + mar | 1, data = bioChemists) vdiffr::expect_doppelganger( "ggcoef_multicomponents() mod2 table", ggcoef_multicomponents( mod2, tidy_fun = broom.helpers::tidy_zeroinfl, type = "t" ) ) skip_if_not_installed("betareg") skip_if_not_installed("parameters") library(betareg) data("GasolineYield", package = "betareg") m1 <- betareg(yield ~ batch + temp, data = GasolineYield) m2 <- betareg(yield ~ batch + temp | temp + pressure, data = GasolineYield) vdiffr::expect_doppelganger( "ggcoef_multicomponents() betareg m1 table", ggcoef_multicomponents( m1, type = "t", tidy_fun = broom.helpers::tidy_parameters, tidy_args = list(component = "all") ) ) vdiffr::expect_doppelganger( "ggcoef_multicomponents() betareg m2 table", ggcoef_multicomponents( m2, type = "t", tidy_fun = broom.helpers::tidy_parameters, tidy_args = list(component = "all") ) ) modlm <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris) vdiffr::expect_doppelganger( "ggcoef_multicomponents() linear model table", ggcoef_multicomponents(modlm, type = "t") ) vdiffr::expect_doppelganger( "ggcoef_multicomponents() linear model faceted", ggcoef_multicomponents(modlm, type = "f") ) }) ggstats/tests/testthat/test-stat_weighted_mean.R0000644000176200001440000000345214357760262021703 0ustar liggesuserstest_that("stat_weighted_mean()", { skip_on_cran() skip_if_not_installed("reshape") library(ggplot2) data(tips, package = "reshape") vdiffr::expect_doppelganger( "stat_weighted_mean() point", ggplot(tips) + aes(x = day, y = total_bill) + geom_point() ) vdiffr::expect_doppelganger( "stat_weighted_mean() geom-default", ggplot(tips) + aes(x = day, y = total_bill) + stat_weighted_mean() ) vdiffr::expect_doppelganger( "stat_weighted_mean() geom-line", ggplot(tips) + aes(x = day, y = total_bill, group = 1) + stat_weighted_mean(geom = "line") ) vdiffr::expect_doppelganger( "stat_weighted_mean() geom-line-grouped", ggplot(tips) + aes(x = day, y = total_bill, colour = sex, group = sex) + stat_weighted_mean(geom = "line") ) vdiffr::expect_doppelganger( "stat_weighted_mean() geom-bar-dodge", ggplot(tips) + aes(x = day, y = total_bill, fill = sex) + stat_weighted_mean(geom = "bar", position = "dodge") ) # computing a proportion on the fly vdiffr::expect_doppelganger( "stat_weighted_mean() geom-bar-dodge-percent", ggplot(tips) + aes(x = day, y = as.integer(smoker == "Yes"), fill = sex) + stat_weighted_mean(geom = "bar", position = "dodge") + scale_y_continuous(labels = scales::percent) ) # taking into account some weights d <- as.data.frame(Titanic) vdiffr::expect_doppelganger( "stat_weighted_mean() titanic", ggplot(d) + aes( x = Class, y = as.integer(Survived == "Yes"), weight = Freq, fill = Sex ) + geom_bar(stat = "weighted_mean", position = "dodge") + scale_y_continuous(labels = scales::percent) + labs(y = "Survived") ) }) ggstats/tests/testthat/test-utilities.R0000644000176200001440000000040514357760262020056 0ustar liggesuserstest_that("signif_stars() works", { x <- c(0.5, 0.1, 0.05, 0.01, 0.001) expect_equal( signif_stars(x), c("", ".", "*", "**", "***") ) expect_equal( signif_stars(x, one = .15, point = NULL), c("", "*", "*", "**", "***") ) }) ggstats/tests/testthat/test-gglikert.R0000644000176200001440000001273714504775450017665 0ustar liggesuserstest_that("gglikert)", { skip_on_cran() skip_if_not_installed("labelled") skip_if_not_installed("ggplot2") skip_if_not_installed("dplyr") likert_levels <- c( "Strongly disagree", "Disagree", "Neither agree nor disagree", "Agree", "Strongly agree" ) set.seed(42) df <- dplyr::tibble( q1 = sample(likert_levels, 150, replace = TRUE), q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1), q3 = sample(likert_levels, 150, replace = TRUE, prob = 1:5), q4 = sample(likert_levels, 150, replace = TRUE, prob = 1:5), q5 = sample(c(likert_levels, NA), 150, replace = TRUE), q6 = sample(likert_levels, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0)) ) %>% dplyr::mutate(dplyr::across( dplyr::everything(), ~ factor(.x, levels = likert_levels) )) likert_levels_dk <- c( "Strongly disagree", "Disagree", "Neither agree nor disagree", "Agree", "Strongly agree", "Don't know" ) df_dk <- dplyr::tibble( q1 = sample(likert_levels_dk, 150, replace = TRUE), q2 = sample(likert_levels_dk, 150, replace = TRUE, prob = 6:1), q3 = sample(likert_levels_dk, 150, replace = TRUE, prob = 1:6), q4 = sample(likert_levels_dk, 150, replace = TRUE, prob = 1:6), q5 = sample(c(likert_levels_dk, NA), 150, replace = TRUE), q6 = sample( likert_levels_dk, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0, 1) ) ) %>% dplyr::mutate(dplyr::across( dplyr::everything(), ~ factor(.x, levels = likert_levels_dk) )) vdiffr::expect_doppelganger( "gglikert() mod simple", gglikert(df) ) expect_error( d <- gglikert_data(df), NA ) expect_equal(levels(d$.answer), likert_levels) vdiffr::expect_doppelganger( "gglikert() include and width", gglikert(df, include = q1:q3, width = .5) ) vdiffr::expect_doppelganger( "gglikert() variable_labels", gglikert(df, variable_labels = c(q2 = "second question")) ) vdiffr::expect_doppelganger( "gglikert() sort prop asc", gglikert(df, sort = "asc") ) vdiffr::expect_doppelganger( "gglikert() sort prop desc", gglikert(df, sort = "desc") ) vdiffr::expect_doppelganger( "gglikert() sort mean asc", gglikert(df, sort = "asc", sort_method = "mean") ) vdiffr::expect_doppelganger( "gglikert() sort mean desc", gglikert(df, sort = "desc", sort_method = "mean") ) vdiffr::expect_doppelganger( "gglikert() sort median asc", gglikert(df, sort = "asc", sort_method = "median") ) vdiffr::expect_doppelganger( "gglikert() sort median desc", gglikert(df, sort = "desc", sort_method = "median") ) vdiffr::expect_doppelganger( "gglikert() sort prop asc include_center", gglikert(df, sort = "asc", sort_prop_include_center = TRUE) ) vdiffr::expect_doppelganger( "gglikert() exclude_fill_values", gglikert(df, exclude_fill_values = "Neither agree nor disagree") ) vdiffr::expect_doppelganger( "gglikert() add_labels", gglikert(df, add_labels = FALSE) ) vdiffr::expect_doppelganger( "gglikert() customize labels", gglikert(df, labels_size = 5, labels_hide_below = .3, labels_accuracy = .1) ) vdiffr::expect_doppelganger( "gglikert() add_totals", gglikert(df, add_totals = FALSE) ) vdiffr::expect_doppelganger( "gglikert() customize totals", gglikert( df, totals_size = 5, totals_fontface = "italic", totals_include_center = TRUE, totals_hjust = 0 ) ) vdiffr::expect_doppelganger( "gglikert() colors", gglikert(df, labels_color = "red", totals_color = "blue") ) vdiffr::expect_doppelganger( "gglikert() reverse", gglikert(df, y_reverse = TRUE, reverse_likert = TRUE) ) vdiffr::expect_doppelganger( "gglikert() variable labels and y_label_wrap", df %>% labelled::set_variable_labels( q1 = "first question", q2 = "second question", q3 = "third question with a very very very veru very very long label" ) %>% gglikert( variable_labels = c( q2 = "question 2", q4 = "another question with a long long long long long long label" ), y_label_wrap = 20 ) ) vdiffr::expect_doppelganger( "gglikert_stacked()", gglikert_stacked(df) ) vdiffr::expect_doppelganger( "gglikert_stacked() add_median_line", gglikert_stacked(df, add_median_line = TRUE) ) vdiffr::expect_doppelganger( "gglikert_stacked() labels_color", gglikert_stacked(df, labels_color = "red") ) df_group <- df df_group$group1 <- sample(c("A", "B"), 150, replace = TRUE) df_group$group2 <- sample(c("a", "b", "c"), 150, replace = TRUE) vdiffr::expect_doppelganger( "gglikert() facet_cols", gglikert(df_group, q1:q6, facet_cols = vars(group1)) ) vdiffr::expect_doppelganger( "gglikert() facet_rows", gglikert(df_group, q1:q2, facet_rows = vars(group1, group2)) ) vdiffr::expect_doppelganger( "gglikert() facet_rows and facet_cols", gglikert( df_group, q3:q6, facet_cols = vars(group1), facet_rows = vars(group2) ) ) vdiffr::expect_doppelganger( "gglikert() facet_rows with group on y", gglikert(df_group, q1:q6, y = "group1", facet_rows = vars(.question)) ) }) ggstats/tests/testthat/test-stat_cross.R0000644000176200001440000000433114357760262020231 0ustar liggesuserstest_that("stat_cross()", { skip_on_cran() library(ggplot2) d <- as.data.frame(Titanic) # plot number of observations p <- ggplot(d) + aes(x = Class, y = Survived, weight = Freq, size = after_stat(observed)) + stat_cross() + scale_size_area(max_size = 20) vdiffr::expect_doppelganger("stat_cross() n obs", p) # custom shape and fill colour based on chi-squared residuals p <- ggplot(d) + aes( x = Class, y = Survived, weight = Freq, size = after_stat(observed), fill = after_stat(std.resid) ) + stat_cross(shape = 22) + scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) + scale_size_area(max_size = 20) vdiffr::expect_doppelganger("stat_cross() shape-22", p) # custom shape and fill colour based phi coefficients p <- ggplot(d) + aes( x = Class, y = Survived, weight = Freq, size = after_stat(observed), fill = after_stat(phi) ) + stat_cross(shape = 22) + scale_fill_steps2(show.limits = TRUE) + scale_size_area(max_size = 20) vdiffr::expect_doppelganger("stat_cross() phi coefficients", p) # plotting the number of observations as a table p <- ggplot(d) + aes( x = Class, y = Survived, weight = Freq, label = after_stat(observed) ) + geom_text(stat = "cross") vdiffr::expect_doppelganger("stat_cross() table", p) # Row proportions with standardized residuals p <- ggplot(d) + aes( x = Class, y = Survived, weight = Freq, label = scales::percent(after_stat(row.prop)), size = NULL, fill = after_stat(std.resid) ) + stat_cross(shape = 22, size = 30) + geom_text(stat = "cross") + scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) + facet_grid(Sex ~ .) + labs(fill = "Standardized residuals") vdiffr::expect_doppelganger("stat_cross() residuals", p) }) test_that("phi coefficients", { res <- Titanic %>% as.data.frame() %>% xtabs(Freq ~ Sex + Class, data = .) %>% chisq.test() %>% augment_chisq_add_phi() %>% dplyr::mutate(.phi = round(.data$.phi, digits = 3)) expect_equal( res$.phi, c(-0.236, 0.236, -0.149, 0.149, -0.107, 0.107, 0.375, -0.375) ) }) ggstats/tests/testthat/test-geom_stripped.R0000644000176200001440000000063614357760262020712 0ustar liggesuserstest_that("geom_stripped_cols() and geom_stripped_rows() works", { skip_on_cran() library(ggplot2) p <- ggplot(iris) + aes(x = Species, y = Petal.Length) + geom_count() vdiffr::expect_doppelganger( "stripped rows and cols", p + geom_stripped_rows( odd = "blue", even = "yellow", alpha = .1, nudge_y = .5 ) + geom_stripped_cols() ) }) ggstats/tests/testthat/test-position_likert.R0000644000176200001440000000431114462704254021255 0ustar liggesuserstest_that("position_likert()", { skip_on_cran() library(ggplot2) p <- ggplot(diamonds) + aes(y = clarity, fill = cut) + geom_bar(position = "likert") + scale_x_continuous(label = label_percent_abs()) + scale_fill_brewer(palette = "PiYG") + xlab("proportion") vdiffr::expect_doppelganger( "position_likert() base", p ) vdiffr::expect_doppelganger( "position_likert() facet", p + facet_grid(~ price > 2500) ) p <- ggplot(diamonds) + aes(y = clarity, fill = cut) + geom_bar(position = "likert_count") + scale_x_continuous(label = label_number_abs()) + scale_fill_brewer(palette = "PiYG") vdiffr::expect_doppelganger( "position_likert_count() base", p ) vdiffr::expect_doppelganger( "position_likert_count() facet", p + facet_grid(~ price > 2500) ) p <- ggplot(diamonds) + aes(y = clarity, fill = cut) + geom_bar(position = position_likert(reverse = TRUE)) vdiffr::expect_doppelganger( "position_likert() reverse", p ) p <- ggplot(diamonds) + aes(y = clarity, fill = cut) + geom_bar(position = position_likert_count(reverse = TRUE)) vdiffr::expect_doppelganger( "position_likert_count() reverse", p ) custom_label <- function(x) { p <- scales::percent(x, accuracy = 1) p[x < .075] <- "" p } p <- ggplot(diamonds) + aes(y = clarity, fill = cut) + geom_bar(position = "likert") + geom_text( aes(by = clarity, label = custom_label(after_stat(prop))), stat = "prop", position = position_likert(vjust = .5) ) + scale_x_continuous(label = label_percent_abs()) + scale_fill_brewer(palette = "PiYG", direction = -1) + xlab("proportion") vdiffr::expect_doppelganger( "position_likert() vjust", p ) p <- ggplot(diamonds) + aes(y = clarity, fill = cut) + geom_bar(position = position_likert(exclude_fill_values = "Very Good")) + scale_x_continuous(label = label_percent_abs()) + scale_fill_brewer(palette = "PiYG") + xlab("proportion") vdiffr::expect_doppelganger( "position_likert() exclude_fill_values", p ) }) ggstats/tests/testthat/test-stat_prop.R0000644000176200001440000000367714415526427020072 0ustar liggesuserstest_that("stat_prop()", { skip_on_cran() library(ggplot2) d <- as.data.frame(Titanic) p <- ggplot(d) + aes(x = Class, fill = Survived, weight = Freq, by = Class) + geom_bar(position = "fill") + geom_text(stat = "prop", position = position_fill(.5)) vdiffr::expect_doppelganger( "stat_prop() titanic", p ) vdiffr::expect_doppelganger( "stat_prop() direct call", ggplot(d) + aes(x = Class, fill = Survived, weight = Freq, by = Class) + stat_prop(geom = "bar") ) vdiffr::expect_doppelganger( "stat_prop() titanic-facet", p + facet_grid(~Sex) ) vdiffr::expect_doppelganger( "stat_prop() titanic-dodge", ggplot(d) + aes(x = Class, fill = Survived, weight = Freq) + geom_bar(position = "dodge") + geom_text( aes(by = Survived), stat = "prop", position = position_dodge(0.9), vjust = "bottom" ) ) vdiffr::expect_doppelganger( "stat_prop() titanic-dodge (not specifying by)", ggplot(d) + aes(x = Class, fill = Survived, weight = Freq) + geom_bar(position = "dodge") + geom_text( stat = "prop", position = position_dodge(0.9), vjust = "bottom" ) ) vdiffr::expect_doppelganger( "stat_prop() titanic-stack", ggplot(d) + aes(x = Class, fill = Survived, weight = Freq, by = 1) + geom_bar() + geom_text( aes(label = scales::percent(after_stat(prop), accuracy = 1)), stat = "prop", position = position_stack(.5) ) ) }) test_that("stat_prop() works with an y aesthetic", { library(ggplot2) skip_on_cran() d <- as.data.frame(Titanic) p <- ggplot(d) + aes(y = Class, fill = Survived, weight = Freq, by = Class) + geom_bar(position = "fill") + geom_text(stat = "prop", position = position_fill(.5)) vdiffr::expect_doppelganger("stat_prop() y-aes", p) }) ggstats/tests/testthat/test_ggsurvey.R0000644000176200001440000000130714357760262020002 0ustar liggesuserstest_that("ggsurvey works correctly", { skip_on_cran() skip_if_not_installed("survey") skip_if_not_installed("ggplot2") library(ggplot2) data(api, package = "survey") dstrat <- survey::svydesign( id = ~1, strata = ~stype, weights = ~pw, data = apistrat, fpc = ~fpc ) vdiffr::expect_doppelganger( "ggsurvey() dstrat", ggsurvey(dstrat) + aes(x = cnum, y = dnum) + geom_count() ) d <- as.data.frame(Titanic) dw <- survey::svydesign(ids = ~1, weights = ~Freq, data = d) vdiffr::expect_doppelganger( "ggsurvey() titanic", ggsurvey(dw) + aes(x = Class, fill = Survived) + geom_bar(position = "fill") ) }) ggstats/tests/testthat.R0000644000176200001440000000060214357760261015064 0ustar liggesusers# This file is part of the standard setup for testthat. # It is recommended that you do not modify it. # # Where should you do additional test configuration? # Learn more about the roles of various files in: # * https://r-pkgs.org/tests.html # * https://testthat.r-lib.org/reference/test_package.html#special-files library(testthat) library(ggstats) test_check("ggstats") ggstats/vignettes/0000755000176200001440000000000014527052766013754 5ustar liggesusersggstats/vignettes/gglikert.Rmd0000644000176200001440000001652214504775450016233 0ustar liggesusers--- title: "Plot Likert-type items with `gglikert()`" author: Joseph Larmarange output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Plot Likert-type items with `gglikert()`} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(ggstats) library(dplyr) library(ggplot2) ``` The purpose of `gglikert()` is to generate a centered bar plot comparing the answers of several questions sharing a common Likert-type scale. ## Generating an example dataset ```{r} likert_levels <- c( "Strongly disagree", "Disagree", "Neither agree nor disagree", "Agree", "Strongly agree" ) set.seed(42) df <- tibble( q1 = sample(likert_levels, 150, replace = TRUE), q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1), q3 = sample(likert_levels, 150, replace = TRUE, prob = 1:5), q4 = sample(likert_levels, 150, replace = TRUE, prob = 1:5), q5 = sample(c(likert_levels, NA), 150, replace = TRUE), q6 = sample(likert_levels, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0)) ) %>% mutate(across(everything(), ~ factor(.x, levels = likert_levels))) likert_levels_dk <- c( "Strongly disagree", "Disagree", "Neither agree nor disagree", "Agree", "Strongly agree", "Don't know" ) df_dk <- tibble( q1 = sample(likert_levels_dk, 150, replace = TRUE), q2 = sample(likert_levels_dk, 150, replace = TRUE, prob = 6:1), q3 = sample(likert_levels_dk, 150, replace = TRUE, prob = 1:6), q4 = sample(likert_levels_dk, 150, replace = TRUE, prob = 1:6), q5 = sample(c(likert_levels_dk, NA), 150, replace = TRUE), q6 = sample( likert_levels_dk, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0, 1) ) ) %>% mutate(across(everything(), ~ factor(.x, levels = likert_levels_dk))) ``` ## Quick plot Simply call `gglikert()`. ```{r} gglikert(df) ``` The list of variables to plot (all by default) could by specify with `include`. This argument accepts tidy-select syntax. ```{r} gglikert(df, include = q1:q3) ``` ## Customizing the plot The generated plot is a standard `ggplot2` object. You can therefore use `ggplot2` functions to custom many aspects. ```{r} gglikert(df) + ggtitle("A Likert-type items plot", subtitle = "generated with gglikert()") + scale_fill_brewer(palette = "RdYlBu") ``` ### Sorting the questions You can sort the plot with `sort`. ```{r} gglikert(df, sort = "ascending") ``` By default, the plot is sorted based on the proportion being higher than the center level, i.e. in this case the proportion of answers equal to "Agree" or "Strongly Agree". Alternatively, the questions could be transformed into a score and sorted accorded to their mean. ```{r} gglikert(df, sort = "ascending", sort_method = "mean") ``` ### Sorting the answers You can reverse the order of the answers with `reverse_likert`. ```{r} gglikert(df, reverse_likert = TRUE) ``` ### Proportion labels Proportion labels could be removed with `add_labels = FALSE`. ```{r} gglikert(df, add_labels = FALSE) ``` or customized. ```{r} gglikert( df, labels_size = 3, labels_accuracy = .1, labels_hide_below = .2, labels_color = "white" ) ``` ### Totals on each side By default, totals are added on each side of the plot. In case of an uneven number of answer levels, the central level is not taken into account for computing totals. With `totals_include_center = TRUE`, half of the proportion of the central level will be added on each side. ```{r} gglikert( df, totals_include_center = TRUE, sort = "descending", sort_prop_include_center = TRUE ) ``` Totals could be customized. ```{r} gglikert( df, totals_size = 4, totals_color = "blue", totals_fontface = "italic", totals_hjust = .20 ) ``` Or removed. ```{r} gglikert(df, add_totals = FALSE) ``` ## Variable labels If you are using variable labels (see `labelled::set_variable_labels()`), they will be taken automatically into account by `gglikert()`. ```{r} if (require(labelled)) { df <- df %>% set_variable_labels( q1 = "first question", q2 = "second question", q3 = "this is the third question with a quite long variable label" ) } gglikert(df) ``` You can also provide custom variable labels with `variable_labels`. ```{r} gglikert( df, variable_labels = c( q1 = "alternative label for the first question", q6 = "another custom label" ) ) ``` You can control how variable labels are wrapped with `y_label_wrap`. ```{r} gglikert(df, y_label_wrap = 20) gglikert(df, y_label_wrap = 200) ``` ## Removing certain values Sometimes, the dataset could contain certain values that you should not be displayed. ```{r} gglikert(df_dk) ``` A first option could be to convert the don't knows into `NA`. In such case, the proportions will be computed on non missing. ```{r} df_dk %>% mutate(across(everything(), ~ factor(.x, levels = likert_levels))) %>% gglikert() ``` Or, you could use `exclude_fill_values` to not display specific values, but still counting them in the denominator for computing proportions. ```{r} df_dk %>% gglikert(exclude_fill_values = "Don't know") ``` ## Facets To define facets, use `facet_rows` and/or `facet_cols`. ```{r message=FALSE} df_group <- df df_group$group1 <- sample(c("A", "B"), 150, replace = TRUE) df_group$group2 <- sample(c("a", "b", "c"), 150, replace = TRUE) gglikert(df_group, q1:q6, facet_cols = vars(group1), labels_size = 3 ) gglikert(df_group, q1:q2, facet_rows = vars(group1, group2), labels_size = 3 ) gglikert(df_group, q3:q6, facet_cols = vars(group1), facet_rows = vars(group2), labels_size = 3 ) + scale_x_continuous( labels = label_percent_abs(), expand = expansion(0, .2) ) ``` To compare answers by subgroup, you can alternatively map `.question` to facets, and define a grouping variable for `y`. ```{r} gglikert(df_group, q1:q4, y = "group1", facet_rows = vars(.question), labels_size = 3, facet_label_wrap = 15 ) ``` ## Stacked plot For a more classical stacked bar plot, you can use `gglikert_stacked()`. ```{r} gglikert_stacked(df) gglikert_stacked( df, sort = "asc", add_median_line = TRUE, add_labels = FALSE ) gglikert_stacked( df_group, include = q1:q4, y = "group2" ) + facet_grid( rows = vars(.question), labeller = label_wrap_gen(15) ) ``` ## Long format dataset Internally, `gglikert()` is calling `gglikert_data()` to generate a long format dataset combining all questions into two columns, `.question` and `.answer`. ```{r} gglikert_data(df) %>% head() ``` Such dataset could be useful for other types of plot, for example for a classic stacked bar plot. ```{r} ggplot(gglikert_data(df)) + aes(y = .question, fill = .answer) + geom_bar(position = "fill") ``` ## Weighted data `gglikert()`, `gglikert_stacked()` and `gglikert_data()` accepts a `weights` argument, allowing to specify statistical weights. ```{r} df$sampling_weights <- runif(nrow(df)) gglikert(df, q1:q4, weights = sampling_weights) ``` ## See also The function `position_likert()` used to center bars. ggstats/vignettes/stat_prop.Rmd0000644000176200001440000001017314462704257016432 0ustar liggesusers--- title: "Compute custom proportions with `stat_prop()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Compute custom proportions with `stat_prop()`} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(ggstats) library(ggplot2) ``` `stat_prop()` is a variation of `ggplot2::stat_count()` allowing to compute custom proportions according to the **by** aesthetic defining the denominator (i.e. all proportions for a same value of **by** will sum to 1). The **by** aesthetic should be a factor. Therefore, `stat_prop()` requires the **by** aesthetic and this **by** aesthetic should be a factor. ## Adding labels on a percent stacked bar plot When using `position = "fill"` with `geom_bar()`, you can produce a percent stacked bar plot. However, the proportions corresponding to the **y** axis are not directly accessible using only `ggplot2`. With `stat_prop()`, you can easily add them on the plot. In the following example, we indicated `stat = "prop"` to `ggplot2::geom_text()` to use `stat_prop()`, we defined the **by** aesthetic (here we want to compute the proportions separately for each value of **x**), and we also used `ggplot2::position_fill()` when calling `ggplot2::geom_text()`. ```{r} d <- as.data.frame(Titanic) p <- ggplot(d) + aes(x = Class, fill = Survived, weight = Freq, by = Class) + geom_bar(position = "fill") + geom_text(stat = "prop", position = position_fill(.5)) p ``` Note that `stat_prop()` has properly taken into account the **weight** aesthetic. `stat_prop()` is also compatible with faceting. In that case, proportions are computed separately in each facet. ```{r} p + facet_grid(cols = vars(Sex)) ``` ## Displaying proportions of the total If you want to display proportions of the total, simply map the **by** aesthetic to `1`. Here an example using a stacked bar chart. ```{r} ggplot(d) + aes(x = Class, fill = Survived, weight = Freq, by = 1) + geom_bar() + geom_text( aes(label = scales::percent(after_stat(prop), accuracy = 1)), stat = "prop", position = position_stack(.5) ) ``` ## A dodged bar plot to compare two distributions A dodged bar plot could be used to compare two distributions. ```{r} ggplot(d) + aes(x = Class, fill = Sex, weight = Freq, by = Sex) + geom_bar(position = "dodge") ``` On the previous graph, it is difficult to see if first class is over- or under-represented among women, due to the fact they were much more men on the boat. `stat_prop()` could be used to adjust the graph by displaying instead the proportion within each category (i.e. here the proportion by sex). ```{r} ggplot(d) + aes(x = Class, fill = Sex, weight = Freq, by = Sex, y = after_stat(prop)) + geom_bar(stat = "prop", position = "dodge") + scale_y_continuous(labels = scales::percent) ``` The same example with labels: ```{r} ggplot(d) + aes(x = Class, fill = Sex, weight = Freq, by = Sex, y = after_stat(prop)) + geom_bar(stat = "prop", position = "dodge") + scale_y_continuous(labels = scales::percent) + geom_text( mapping = aes( label = scales::percent(after_stat(prop), accuracy = .1), y = after_stat(0.01) ), vjust = "bottom", position = position_dodge(.9), stat = "prop" ) ``` ## Displaying unobserved levels With the `complete` argument, it is possible to indicate an aesthetic for those statistics should be completed for unobserved values. ```{r} d <- diamonds %>% dplyr::filter(!(cut == "Ideal" & clarity == "I1")) %>% dplyr::filter(!(cut == "Very Good" & clarity == "VS2")) %>% dplyr::filter(!(cut == "Premium" & clarity == "IF")) p <- ggplot(d) + aes(x = clarity, fill = cut, by = clarity) + geom_bar(position = "fill") p + geom_text( stat = "prop", position = position_fill(.5) ) ``` Adding `complete = "fill"` will generate "0.0%" labels where relevant. ```{r} p + geom_text( stat = "prop", position = position_fill(.5), complete = "fill" ) ``` ggstats/vignettes/ggcoef_model.Rmd0000644000176200001440000002212514466120077017025 0ustar liggesusers--- title: "Plot model coefficients with `ggcoef_model()`" author: Joseph Larmarange output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Plot model coefficients with `ggcoef_model()`} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(ggstats) ``` The purpose of `ggcoef_model()` is to quickly plot the coefficients of a model. It is an updated and improved version of `GGally::ggcoef()` based on `broom.helpers::tidy_plus_plus()`. For displaying a nicely formatted table of the same models, look at `gtsummary::tbl_regression()`. ## Quick coefficients plot To work automatically, this function requires the `{broom.helpers}`. Simply call `ggcoef_model()` with a model object. It could be the result of `stats::lm`, `stats::glm` or any other model covered by `{broom.helpers}`. ```{r ggcoef-reg} data(tips, package = "reshape") mod_simple <- lm(tip ~ day + time + total_bill, data = tips) ggcoef_model(mod_simple) ``` In the case of a logistic regression (or any other model for which coefficients are usually exponentiated), simply indicated `exponentiate = TRUE`. Note that a logarithmic scale will be used for the x-axis. ```{r ggcoef-titanic} d_titanic <- as.data.frame(Titanic) d_titanic$Survived <- factor(d_titanic$Survived, c("No", "Yes")) mod_titanic <- glm( Survived ~ Sex * Age + Class, weights = Freq, data = d_titanic, family = binomial ) ggcoef_model(mod_titanic, exponentiate = TRUE) ``` ## Customizing the plot ### Variable labels You can use the `{labelled}` package to define variable labels. They will be automatically used by `ggcoef_model()`. Note that variable labels should be defined before computing the model. ```{r} library(labelled) tips_labelled <- tips %>% set_variable_labels( day = "Day of the week", time = "Lunch or Dinner", total_bill = "Bill's total" ) mod_labelled <- lm(tip ~ day + time + total_bill, data = tips_labelled) ggcoef_model(mod_labelled) ``` You can also define custom variable labels directly by passing a named vector to the `variable_labels` option. ```{r} ggcoef_model( mod_simple, variable_labels = c( day = "Week day", time = "Time (lunch or dinner ?)", total_bill = "Total of the bill" ) ) ``` If variable labels are to long, you can pass `ggplot2::label_wrap_gen()` or any other labeller function to `facet_labeller.` ```{r} ggcoef_model( mod_simple, variable_labels = c( day = "Week day", time = "Time (lunch or dinner ?)", total_bill = "Total of the bill" ), facet_labeller = ggplot2::label_wrap_gen(10) ) ``` Use `facet_row = NULL` to hide variable names. ```{r} ggcoef_model(mod_simple, facet_row = NULL, colour_guide = TRUE) ``` ### Term labels Several options allows you to customize term labels. ```{r} ggcoef_model(mod_titanic, exponentiate = TRUE) ggcoef_model( mod_titanic, exponentiate = TRUE, show_p_values = FALSE, signif_stars = FALSE, add_reference_rows = FALSE, categorical_terms_pattern = "{level} (ref: {reference_level})", interaction_sep = " x " ) + ggplot2::scale_y_discrete(labels = scales::label_wrap(15)) ``` By default, for categorical variables using treatment and sum contrasts, reference rows will be added and displayed on the graph. ```{r} mod_titanic2 <- glm( Survived ~ Sex * Age + Class, weights = Freq, data = d_titanic, family = binomial, contrasts = list(Sex = contr.sum, Class = contr.treatment(4, base = 3)) ) ggcoef_model(mod_titanic2, exponentiate = TRUE) ``` Continuous variables with polynomial terms defined with `stats::poly()` are also properly managed. ```{r} mod_poly <- lm(Sepal.Length ~ poly(Petal.Width, 3) + Petal.Length, data = iris) ggcoef_model(mod_poly) ``` Use `no_reference_row` to indicate which variables should not have a reference row added. ```{r} ggcoef_model( mod_titanic2, exponentiate = TRUE, no_reference_row = "Sex" ) ggcoef_model( mod_titanic2, exponentiate = TRUE, no_reference_row = broom.helpers::all_dichotomous() ) ggcoef_model( mod_titanic2, exponentiate = TRUE, no_reference_row = broom.helpers::all_categorical(), categorical_terms_pattern = "{level}/{reference_level}" ) ``` ### Elements to display Use `intercept = TRUE` to display intercepts. ```{r} ggcoef_model(mod_simple, intercept = TRUE) ``` You can remove confidence intervals with `conf.int = FALSE`. ```{r} ggcoef_model(mod_simple, conf.int = FALSE) ``` By default, significant terms (i.e. with a p-value below 5%) are highlighted using two types of dots. You can control the level of significance with `significance` or remove it with `significance = NULL`. ```{r} ggcoef_model(mod_simple, significance = NULL) ``` By default, dots are colored by variable. You can deactivate this behavior with `colour = NULL`. ```{r} ggcoef_model(mod_simple, colour = NULL) ``` You can display only a subset of terms with **include**. ```{r} ggcoef_model(mod_simple, include = c("time", "total_bill")) ``` It is possible to use `tidyselect` helpers. ```{r} ggcoef_model(mod_simple, include = dplyr::starts_with("t")) ``` You can remove stripped rows with `stripped_rows = FALSE`. ```{r} ggcoef_model(mod_simple, stripped_rows = FALSE) ``` Do not hesitate to consult the help file of `ggcoef_model()` to see all available options. ### ggplot2 elements The plot returned by `ggcoef_model()` is a classic `ggplot2` plot. You can therefore apply `ggplot2` functions to it. ```{r} ggcoef_model(mod_simple) + ggplot2::xlab("Coefficients") + ggplot2::ggtitle("Custom title") + ggplot2::scale_color_brewer(palette = "Set1") + ggplot2::theme(legend.position = "right") ``` ## Forest plot with a coefficient table `ggcoef_table()` is a variant of `ggcoef_model()` displaying a coefficient table on the right of the forest plot. ```{r} ggcoef_table(mod_simple) ggcoef_table(mod_titanic, exponentiate = TRUE) ``` You can easily customize the columns to be displayed. ```{r} ggcoef_table( mod_simple, table_stat = c("label", "estimate", "std.error", "ci"), ci_pattern = "{conf.low} to {conf.high}", table_stat_label = list( estimate = scales::label_number(accuracy = .001), conf.low = scales::label_number(accuracy = .01), conf.high = scales::label_number(accuracy = .01), std.error = scales::label_number(accuracy = .001), label = toupper ), table_header = c("Term", "Coef.", "SE", "CI"), table_witdhs = c(2, 3) ) ``` ## Multinomial models For multinomial models, simply use `ggcoef_multinom()`. Three types of visualizations are available: `"dodged"`, `"faceted"` and `"table"`. ```{r} library(nnet) hec <- as.data.frame(HairEyeColor) mod <- multinom( Hair ~ Eye + Sex, data = hec, weights = hec$Freq ) ggcoef_multinom( mod, exponentiate = TRUE ) ggcoef_multinom( mod, exponentiate = TRUE, type = "faceted" ) ``` ```{r, fig.height=9, fig.width=6} ggcoef_multinom( mod, exponentiate = TRUE, type = "table" ) ``` You can use `y.level_label` to customize the label of each level. ```{r} ggcoef_multinom( mod, type = "faceted", y.level_label = c("Brown" = "Brown\n(ref: Black)"), exponentiate = TRUE ) ``` ## Multi-components models Multi-components models such as zero-inflated Poisson or beta regression generate a set of terms for each of their components. You can use `ggcoef_multicomponents()` which is similar to `ggcoef_multinom()`. ```{r} library(pscl) data("bioChemists", package = "pscl") mod <- zeroinfl(art ~ fem * mar | fem + mar, data = bioChemists) ggcoef_multicomponents(mod) ggcoef_multicomponents(mod, type = "f") ``` ```{r, fig.height=7, fig.width=6} ggcoef_multicomponents(mod, type = "t") ggcoef_multicomponents( mod, type = "t", component_label = c(conditional = "Count", zero_inflated = "Zero-inflated") ) ``` ## Comparing several models You can easily compare several models with `ggcoef_compare()`. To be noted, `ggcoef_compare()` is not compatible with multinomial or multi-components models. ```{r} mod1 <- lm(Fertility ~ ., data = swiss) mod2 <- step(mod1, trace = 0) mod3 <- lm(Fertility ~ Agriculture + Education * Catholic, data = swiss) models <- list( "Full model" = mod1, "Simplified model" = mod2, "With interaction" = mod3 ) ggcoef_compare(models) ggcoef_compare(models, type = "faceted") ``` ## Advanced users Advanced users could use their own dataset and pass it to `ggcoef_plot()`. Such dataset could be produced by `ggcoef_model()`, `ggcoef_compare()` or `ggcoef_multinom()` with the option `return_data = TRUE` or by using `broom::tidy()` or `broom.helpers::tidy_plus_plus()`. ## Supported models ```{r, echo=FALSE} broom.helpers::supported_models %>% knitr::kable() ``` Note: this list of models has been tested. `{broom.helpers}`, and therefore `ggcoef_model()`, may or may not work properly or partially with other types of models. ggstats/vignettes/stat_cross.Rmd0000644000176200001440000000601514357760262016604 0ustar liggesusers--- title: "Compute cross-tabulation statistics with `stat_cross()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Compute cross-tabulation statistics with `stat_cross()`} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(ggstats) library(ggplot2) ``` This statistic is intended to be used with two discrete variables mapped to **x** and **y** aesthetics. It will compute several statistics of a cross-tabulated table using `broom::tidy.test()` and `stats::chisq.test()`. More precisely, the computed variables are: - **observed**: number of observations in x,y - **prop**: proportion of total - **row.prop**: row proportion - **col.prop**: column proportion - **expected**: expected count under the null hypothesis - **resid**: Pearson's residual - **std.resid**: standardized residual - **row.observed**: total number of observations within row - **col.observed**: total number of observations within column - **total.observed**: total number of observations within the table - **phi**: phi coefficients, see `augment_chisq_add_phi()` By default, `stat_cross()` is using `ggplot2::geom_points()`. If you want to plot the number of observations, you need to map `after_stat(observed)` to an aesthetic (here **size**): ```{r} d <- as.data.frame(Titanic) ggplot(d) + aes(x = Class, y = Survived, weight = Freq, size = after_stat(observed)) + stat_cross() + scale_size_area(max_size = 20) ``` Note that the **weight** aesthetic is taken into account by `stat_cross()`. We can go further using a custom shape and filling points with standardized residual to identify visually cells who are over- or underrepresented. ```{r fig.height=6, fig.width=6} ggplot(d) + aes( x = Class, y = Survived, weight = Freq, size = after_stat(observed), fill = after_stat(std.resid) ) + stat_cross(shape = 22) + scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) + scale_size_area(max_size = 20) ``` We can easily recreate a cross-tabulated table. ```{r} ggplot(d) + aes(x = Class, y = Survived, weight = Freq) + geom_tile(fill = "white", colour = "black") + geom_text(stat = "cross", mapping = aes(label = after_stat(observed))) + theme_minimal() ``` Even more complicated, we want to produce a table showing column proportions and where cells are filled with standardized residuals. Note that `stat_cross()` could be used with facets. In that case, computation is done separately in each facet. ```{r} ggplot(d) + aes( x = Class, y = Survived, weight = Freq, label = scales::percent(after_stat(col.prop), accuracy = .1), fill = after_stat(std.resid) ) + stat_cross(shape = 22, size = 30) + geom_text(stat = "cross") + scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) + facet_grid(rows = vars(Sex)) + labs(fill = "Standardized residuals") + theme_minimal() ``` ggstats/vignettes/stat_weighted_mean.Rmd0000644000176200001440000000526514357760262020261 0ustar liggesusers--- title: "Compute weighted mean with `stat_weighted_mean()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Compute weighted mean with `stat_weighted_mean()`} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(ggstats) library(ggplot2) ``` `stat_weighted_mean()` computes mean value of **y** (taking into account any **weight** aesthetic if provided) for each value of **x**. More precisely, it will return a new data frame with one line per unique value of **x** with the following new variables: - **y**: mean value of the original **y** (i.e. **numerator**/**denominator**) - **numerator** - **denominator** Let's take an example. The following plot shows all tips received according to the day of the week. ```{r} data(tips, package = "reshape") ggplot(tips) + aes(x = day, y = tip) + geom_point() ``` To plot their mean value per day, simply use `stat_weighted_mean()`. ```{r} ggplot(tips) + aes(x = day, y = tip) + stat_weighted_mean() ``` We can specify the geometry we want using `geom` argument. Note that for lines, we need to specify the **group** aesthetic as well. ```{r} ggplot(tips) + aes(x = day, y = tip, group = 1) + stat_weighted_mean(geom = "line") ``` An alternative is to specify the statistic in `ggplot2::geom_line()`. ```{r} ggplot(tips) + aes(x = day, y = tip, group = 1) + geom_line(stat = "weighted_mean") ``` Of course, it could be use with other geometries. Here a bar plot. ```{r} p <- ggplot(tips) + aes(x = day, y = tip, fill = sex) + stat_weighted_mean(geom = "bar", position = "dodge") + ylab("mean tip") p ``` It is very easy to add facets. In that case, computation will be done separately for each facet. ```{r} p + facet_grid(rows = vars(smoker)) ``` `stat_weighted_mean()` could be also used for computing proportions as a proportion is technically a mean of binary values (0 or 1). ```{r} ggplot(tips) + aes(x = day, y = as.integer(smoker == "Yes"), fill = sex) + stat_weighted_mean(geom = "bar", position = "dodge") + scale_y_continuous(labels = scales::percent) + ylab("proportion of smoker") ``` Finally, you can use the **weight** aesthetic to indicate weights to take into account for computing means / proportions. ```{r} d <- as.data.frame(Titanic) ggplot(d) + aes(x = Class, y = as.integer(Survived == "Yes"), weight = Freq, fill = Sex) + geom_bar(stat = "weighted_mean", position = "dodge") + scale_y_continuous(labels = scales::percent) + labs(y = "Proportion who survived") ``` ggstats/R/0000755000176200001440000000000014526731212012132 5ustar liggesusersggstats/R/weighted_quantile.R0000644000176200001440000000675514527052132015772 0ustar liggesusers#' Weighted Median and Quantiles #' #' Compute the median or quantiles a set of numbers which have weights #' associated with them. #' #' @param x a numeric vector of values #' @param w a numeric vector of weights #' @param probs probabilities for which the quantiles should be computed, a #' numeric vector of values between 0 and 1 #' @param na.rm a logical indicating whether to ignore `NA` values #' @param type Integer specifying the rule for calculating the median or #' quantile, corresponding to the rules available for `stats:quantile()`. #' The only valid choices are type=1, 2 or 4. See Details. #' @details #' The `i`th observation `x[i]` is treated as having a weight proportional to #' `w[i]`. #' #' The weighted median is a value `m` such that the total weight of data less #' than or equal to `m` is equal to half the total weight. More generally, the #' weighted quantile with probability `p` is a value `q` such that the total #' weight of data less than or equal to `q` is equal to `p` times the total #' weight. #' #' If there is no such value, then #' #' - if `type = 1`, the next largest value is returned (this is the #' right-continuous inverse of the left-continuous cumulative distribution #' function); #' - if `type = 2`, the average of the two surrounding values is returned #' (the average of the right-continuous and left-continuous inverses); #' - if `type = 4`, linear interpolation is performed. #' #' Note that the default rule for `weighted.median()` is `type = 2`, consistent #' with the traditional definition of the median, while the default for #' `weighted.quantile()` is `type = 4`. #' @source These functions are adapted from their homonyms developed by Adrian #' Baddeley in the `spatstat` package. #' @returns A numeric vector. #' @export #' @examples #' x <- 1:20 #' w <- runif(20) #' weighted.median(x, w) #' weighted.quantile(x, w) weighted.median <- function(x, w, na.rm = TRUE, type = 2) { unname(weighted.quantile(x, probs = 0.5, w = w, na.rm = na.rm, type = type )) } #' @export #' @rdname weighted.median weighted.quantile <- function(x, w, probs = seq(0, 1, 0.25), na.rm = TRUE, type = 4) { x <- as.numeric(as.vector(x)) w <- as.numeric(as.vector(w)) if (length(x) == 0) { stop("No data given") } stopifnot(length(x) == length(w)) if (is.na(m <- match(type, c(1, 2, 4)))) { stop("Argument 'type' must equal 1, 2 or 4", call. = FALSE) } type <- c(1, 2, 4)[m] if (anyNA(x) || anyNA(w)) { ok <- !(is.na(x) | is.na(w)) x <- x[ok] w <- w[ok] } if (length(x) == 0) { stop("At least one non-NA value is required") } stopifnot(all(w >= 0)) if (all(w == 0)) { stop("All weights are zero", call. = FALSE) } oo <- order(x) x <- x[oo] w <- w[oo] Fx <- cumsum(w) / sum(w) if (length(x) > 1) { out <- switch(as.character(type), `1` = stats::approx(Fx, x, xout = probs, ties = "ordered", rule = 2, method = "constant", f = 1 ), `2` = stats::approx(Fx, x, xout = probs, ties = "ordered", rule = 2, method = "constant", f = 1 / 2 ), `4` = stats::approx(Fx, x, xout = probs, ties = "ordered", rule = 2, method = "linear" ) ) result <- out$y } else { result <- rep.int(x, length(probs)) } names(result) <- paste0( format(100 * probs, trim = TRUE), "%" ) return(result) } ggstats/R/ggcoef_model.R0000644000176200001440000012726314526731233014705 0ustar liggesusers#' Plot model coefficients #' #' `ggcoef_model()`, `ggcoef_table()`, `ggcoef_multinom()`, #' `ggcoef_multicomponents()` and `ggcoef_compare()` #' use [broom.helpers::tidy_plus_plus()] #' to obtain a `tibble` of the model coefficients, #' apply additional data transformation and then pass the #' produced `tibble` to `ggcoef_plot()` to generate the plot. #' #' For more control, you can use the argument `return_data = TRUE` to #' get the produced `tibble`, apply any transformation of your own and #' then pass your customized `tibble` to `ggcoef_plot()`. #' @inheritParams broom.helpers::tidy_plus_plus #' @param tidy_args Additional arguments passed to #' [broom.helpers::tidy_plus_plus()] and to `tidy_fun` #' @param model a regression model object #' @param conf.level the confidence level to use for the confidence #' interval if `conf.int = TRUE`; must be strictly greater than 0 #' and less than 1; defaults to 0.95, which corresponds to a 95 #' percent confidence interval #' @param show_p_values if `TRUE`, add p-value to labels #' @param signif_stars if `TRUE`, add significant stars to labels #' @param significance level (between 0 and 1) below which a #' coefficient is consider to be significantly different from 0 #' (or 1 if `exponentiate = TRUE`), `NULL` for not highlighting #' such coefficients #' @param significance_labels optional vector with custom labels #' for significance variable #' @param return_data if `TRUE`, will return the data.frame used #' for plotting instead of the plot #' @param ... parameters passed to [ggcoef_plot()] #' @return A `ggplot2` plot or a `tibble` if `return_data = TRUE`. #' @export #' @examples #' mod <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris) #' ggcoef_model(mod) #' #' ggcoef_table(mod) #' #' #' \donttest{ #' ggcoef_table(mod, table_stat = c("estimate", "ci")) #' #' ggcoef_table( #' mod, #' table_stat_label = list( #' estimate = scales::label_number(.001) #' ) #' ) #' #' ggcoef_table(mod, table_text_size = 5, table_witdhs = c(1, 1)) #' #' # a logistic regression example #' d_titanic <- as.data.frame(Titanic) #' d_titanic$Survived <- factor(d_titanic$Survived, c("No", "Yes")) #' mod_titanic <- glm( #' Survived ~ Sex * Age + Class, #' weights = Freq, #' data = d_titanic, #' family = binomial #' ) #' #' # use 'exponentiate = TRUE' to get the Odds Ratio #' ggcoef_model(mod_titanic, exponentiate = TRUE) #' #' ggcoef_table(mod_titanic, exponentiate = TRUE) #' #' # display intercepts #' ggcoef_model(mod_titanic, exponentiate = TRUE, intercept = TRUE) #' #' # customize terms labels #' ggcoef_model( #' mod_titanic, #' exponentiate = TRUE, #' show_p_values = FALSE, #' signif_stars = FALSE, #' add_reference_rows = FALSE, #' categorical_terms_pattern = "{level} (ref: {reference_level})", #' interaction_sep = " x " #' ) + #' ggplot2::scale_y_discrete(labels = scales::label_wrap(15)) #' #' # display only a subset of terms #' ggcoef_model(mod_titanic, exponentiate = TRUE, include = c("Age", "Class")) #' #' # do not change points' shape based on significance #' ggcoef_model(mod_titanic, exponentiate = TRUE, significance = NULL) #' #' # a black and white version #' ggcoef_model( #' mod_titanic, #' exponentiate = TRUE, #' colour = NULL, stripped_rows = FALSE #' ) #' #' # show dichotomous terms on one row #' ggcoef_model( #' mod_titanic, #' exponentiate = TRUE, #' no_reference_row = broom.helpers::all_dichotomous(), #' categorical_terms_pattern = #' "{ifelse(dichotomous, paste0(level, ' / ', reference_level), level)}", #' show_p_values = FALSE #' ) #' } #' @examplesIf requireNamespace("reshape") #' #' \donttest{ #' data(tips, package = "reshape") #' mod_simple <- lm(tip ~ day + time + total_bill, data = tips) #' ggcoef_model(mod_simple) #' #' # custom variable labels #' # you can use the labelled package to define variable labels #' # before computing model #' if (requireNamespace("labelled")) { #' tips_labelled <- tips %>% #' labelled::set_variable_labels( #' day = "Day of the week", #' time = "Lunch or Dinner", #' total_bill = "Bill's total" #' ) #' mod_labelled <- lm(tip ~ day + time + total_bill, data = tips_labelled) #' ggcoef_model(mod_labelled) #' } #' #' # you can provide custom variable labels with 'variable_labels' #' ggcoef_model( #' mod_simple, #' variable_labels = c( #' day = "Week day", #' time = "Time (lunch or dinner ?)", #' total_bill = "Total of the bill" #' ) #' ) #' # if labels are too long, you can use 'facet_labeller' to wrap them #' ggcoef_model( #' mod_simple, #' variable_labels = c( #' day = "Week day", #' time = "Time (lunch or dinner ?)", #' total_bill = "Total of the bill" #' ), #' facet_labeller = ggplot2::label_wrap_gen(10) #' ) #' #' # do not display variable facets but add colour guide #' ggcoef_model(mod_simple, facet_row = NULL, colour_guide = TRUE) #' #' # works also with with polynomial terms #' mod_poly <- lm( #' tip ~ poly(total_bill, 3) + day, #' data = tips, #' ) #' ggcoef_model(mod_poly) #' #' # or with different type of contrasts #' # for sum contrasts, the value of the reference term is computed #' if (requireNamespace("emmeans")) { #' mod2 <- lm( #' tip ~ day + time + sex, #' data = tips, #' contrasts = list(time = contr.sum, day = contr.treatment(4, base = 3)) #' ) #' ggcoef_model(mod2) #' } #' } #' ggcoef_model <- function( model, tidy_fun = broom.helpers::tidy_with_broom_or_parameters, tidy_args = NULL, conf.int = TRUE, conf.level = .95, exponentiate = FALSE, variable_labels = NULL, term_labels = NULL, interaction_sep = " * ", categorical_terms_pattern = "{level}", add_reference_rows = TRUE, no_reference_row = NULL, intercept = FALSE, include = dplyr::everything(), add_pairwise_contrasts = FALSE, pairwise_variables = broom.helpers::all_categorical(), keep_model_terms = FALSE, pairwise_reverse = TRUE, emmeans_args = list(), significance = 1 - conf.level, significance_labels = NULL, show_p_values = TRUE, signif_stars = TRUE, return_data = FALSE, ...) { data <- ggcoef_data( model = model, tidy_fun = tidy_fun, tidy_args = {{ tidy_args }}, conf.int = conf.int, conf.level = conf.level, exponentiate = exponentiate, variable_labels = variable_labels, term_labels = term_labels, interaction_sep = interaction_sep, categorical_terms_pattern = categorical_terms_pattern, add_reference_rows = add_reference_rows, no_reference_row = {{ no_reference_row }}, intercept = intercept, include = {{ include }}, add_pairwise_contrasts = add_pairwise_contrasts, pairwise_variables = {{ pairwise_variables }}, keep_model_terms = keep_model_terms, pairwise_reverse = pairwise_reverse, emmeans_args = emmeans_args, significance = significance, significance_labels = significance_labels ) if (show_p_values && signif_stars) { data$add_to_label <- paste0(data$p_value_label, data$signif_stars) } if (show_p_values && !signif_stars) { data$add_to_label <- data$p_value_label } if (!show_p_values && signif_stars) { data$add_to_label <- data$signif_stars } if (show_p_values || signif_stars) { data$label <- forcats::fct_inorder( factor( paste0( data$label, ifelse( data$add_to_label == "", "", paste0(" (", data$add_to_label, ")") ) ) ) ) data$label_light <- forcats::fct_inorder( factor( paste0( data$label_light, ifelse( data$add_to_label == "", "", paste0(" (", data$add_to_label, ")") ) ) ) ) } if (return_data) { return(data) } args <- list(...) args$data <- data args$exponentiate <- exponentiate if (!"y" %in% names(args) && !"facet_row" %in% names(args)) { args$y <- "label_light" } if (!"colour" %in% names(args) && !all(is.na(data$var_label))) { args$colour <- "var_label" if (!"colour_guide" %in% names(args)) { args$colour_guide <- FALSE } } do.call(ggcoef_plot, args) } #' @describeIn ggcoef_model a variation of [ggcoef_model()] adding a table #' with estimates, confidence intervals and p-values #' @param table_stat statistics to display in the table, use any column name #' returned by the tidier or `"ci"` for confidence intervals formatted #' according to `ci_pattern` #' @param table_header optional custom headers for the table #' @param table_text_size text size for the table #' @param table_stat_label optional named list of labeller functions for the #' displayed statistic (see examples) #' @param ci_pattern glue pattern for confidence intervals in the table #' @param table_witdhs relative widths of the forest plot and the coefficients #' table #' @param plot_title an optional plot title #' @export ggcoef_table <- function( model, tidy_fun = broom.helpers::tidy_with_broom_or_parameters, tidy_args = NULL, conf.int = TRUE, conf.level = .95, exponentiate = FALSE, variable_labels = NULL, term_labels = NULL, interaction_sep = " * ", categorical_terms_pattern = "{level}", add_reference_rows = TRUE, no_reference_row = NULL, intercept = FALSE, include = dplyr::everything(), add_pairwise_contrasts = FALSE, pairwise_variables = broom.helpers::all_categorical(), keep_model_terms = FALSE, pairwise_reverse = TRUE, emmeans_args = list(), significance = 1 - conf.level, significance_labels = NULL, show_p_values = FALSE, signif_stars = FALSE, table_stat = c("estimate", "ci", "p.value"), table_header = NULL, table_text_size = 3, table_stat_label = NULL, ci_pattern = "{conf.low}, {conf.high}", table_witdhs = c(3, 2), plot_title = NULL, ...) { args <- list(...) # undocumented feature, we can pass directly `data` # used by ggcoef_multicomponents() if (is.null(args$data)) { data <- ggcoef_data( model = model, tidy_fun = tidy_fun, tidy_args = {{ tidy_args }}, conf.int = conf.int, conf.level = conf.level, exponentiate = exponentiate, variable_labels = variable_labels, term_labels = term_labels, interaction_sep = interaction_sep, categorical_terms_pattern = categorical_terms_pattern, add_reference_rows = add_reference_rows, no_reference_row = {{ no_reference_row }}, intercept = intercept, include = {{ include }}, add_pairwise_contrasts = add_pairwise_contrasts, pairwise_variables = {{ pairwise_variables }}, keep_model_terms = keep_model_terms, pairwise_reverse = pairwise_reverse, emmeans_args = emmeans_args, significance = significance, significance_labels = significance_labels ) } else { data <- args$data } if (show_p_values && signif_stars) { data$add_to_label <- paste0(data$p_value_label, data$signif_stars) } if (show_p_values && !signif_stars) { data$add_to_label <- data$p_value_label } if (!show_p_values && signif_stars) { data$add_to_label <- data$signif_stars } if (show_p_values || signif_stars) { data$label <- forcats::fct_inorder( factor( paste0( data$label, ifelse( data$add_to_label == "", "", paste0(" (", data$add_to_label, ")") ) ) ) ) data$label_light <- forcats::fct_inorder( factor( paste0( data$label_light, ifelse( data$add_to_label == "", "", paste0(" (", data$add_to_label, ")") ) ) ) ) } args$data <- data args$exponentiate <- exponentiate if (!"y" %in% names(args) && !"facet_row" %in% names(args)) { args$y <- "label_light" } if (!"colour" %in% names(args) && !all(is.na(data$var_label))) { args$colour <- "var_label" if (!"colour_guide" %in% names(args)) { args$colour_guide <- FALSE } } if (!"y" %in% names(args)) args$y <- "label" if (!"facet_row" %in% names(args)) args$facet_row <- "var_label" if (!"stripped_rows" %in% names(args)) args$stripped_rows <- TRUE if (!"strips_odd" %in% names(args)) args$strips_odd <- "#11111111" if (!"strips_even" %in% names(args)) args$strips_even <- "#00000000" coef_plot <- do.call(ggcoef_plot, args) if (!is.null(plot_title)) { coef_plot <- coef_plot + ggplot2::ggtitle(plot_title) + ggplot2::theme( plot.title = ggplot2::element_text(face = "bold"), plot.title.position = "plot" ) } if (args$stripped_rows) { if (!"term" %in% names(data)) { data$term <- data[[args$y]] } data <- data %>% dplyr::mutate(.fill = dplyr::if_else( as.integer(.in_order(.data$term)) %% 2L == 1, args$strips_even, args$strips_odd )) } # building the coefficient table ---- tbl_data <- data if (!"estimate" %in% names(table_stat_label)) { table_stat_label$estimate <- scales::label_number(accuracy = .1) } if (!"conf.low" %in% names(table_stat_label)) { table_stat_label$conf.low <- scales::label_number(accuracy = .1) } if (!"conf.high" %in% names(table_stat_label)) { table_stat_label$conf.high <- scales::label_number(accuracy = .1) } if (!"p.value" %in% names(table_stat_label)) { table_stat_label$p.value <- scales::label_pvalue(add_p = FALSE) } for (v in names(table_stat_label)) { tbl_data[[v]] <- table_stat_label[[v]](tbl_data[[v]]) tbl_data[[v]][is.na(tbl_data[[v]])] <- "" } tbl_data$ci <- stringr::str_glue_data(tbl_data, ci_pattern) tbl_data$ci[is.na(data$conf.low) & is.na(data$conf.high)] <- " " tbl_data <- tbl_data %>% tidyr::pivot_longer( dplyr::any_of(table_stat), names_to = "stat", values_to = "value", values_transform = as.character ) tbl_data$stat <- factor(tbl_data$stat, levels = table_stat) if (!is.null(table_header) && length(table_header) != length(table_stat)) { cli::cli_abort("{.arg table_header} should have the same length as {.arg table_stat}.") # nolint } if (is.null(table_header)) { table_header <- table_stat if ("estimate" %in% table_header) { table_header[table_header == "estimate"] <- attr(data, "coefficients_label") } if ("ci" %in% table_header) { table_header[table_header == "ci"] <- paste(scales::percent(conf.level), "CI") } if ("p.value" %in% table_header) { table_header[table_header == "p.value"] <- "p" } } table_plot <- ggplot2::ggplot(tbl_data) + ggplot2::aes( x = .data[["stat"]], y = .data[[args$y]], label = .data[["value"]] ) + ggplot2::geom_text(hjust = .5, vjust = .5, size = table_text_size) + ggplot2::scale_x_discrete(position = "top", labels = table_header) + ggplot2::scale_y_discrete( limits = rev, expand = ggplot2::expansion(mult = 0, add = .5) ) + ggplot2::facet_grid( rows = args$facet_row, scales = "free_y", space = "free_y", switch = "y" ) + ggplot2::theme_light() + ggplot2::theme( axis.text.x = ggplot2::element_text(face = "bold", hjust = .5), axis.text.y = ggplot2::element_blank(), axis.title = ggplot2::element_blank(), strip.text = ggplot2::element_blank(), panel.grid = ggplot2::element_blank(), axis.ticks = ggplot2::element_blank() ) if (args$stripped_rows) { table_plot <- table_plot + geom_stripped_rows( mapping = ggplot2::aes( odd = .data[[".fill"]], even = .data[[".fill"]], colour = NULL, linetype = NULL ) ) } # join the plots patchwork::wrap_plots(coef_plot, table_plot, nrow = 1, widths = table_witdhs) } #' @describeIn ggcoef_model designed for displaying several models on the same #' plot. #' @export #' @param models named list of models #' @param type a dodged plot, a faceted plot or multiple table plots? #' @examples #' \donttest{ #' # Use ggcoef_compare() for comparing several models on the same plot #' mod1 <- lm(Fertility ~ ., data = swiss) #' mod2 <- step(mod1, trace = 0) #' mod3 <- lm(Fertility ~ Agriculture + Education * Catholic, data = swiss) #' models <- list( #' "Full model" = mod1, #' "Simplified model" = mod2, #' "With interaction" = mod3 #' ) #' #' ggcoef_compare(models) #' ggcoef_compare(models, type = "faceted") #' #' # you can reverse the vertical position of the point by using a negative #' # value for dodged_width (but it will produce some warnings) #' ggcoef_compare(models, dodged_width = -.9) #' } ggcoef_compare <- function( models, type = c("dodged", "faceted"), tidy_fun = broom.helpers::tidy_with_broom_or_parameters, tidy_args = NULL, conf.int = TRUE, conf.level = .95, exponentiate = FALSE, variable_labels = NULL, term_labels = NULL, interaction_sep = " * ", categorical_terms_pattern = "{level}", add_reference_rows = TRUE, no_reference_row = NULL, intercept = FALSE, include = dplyr::everything(), add_pairwise_contrasts = FALSE, pairwise_variables = broom.helpers::all_categorical(), keep_model_terms = FALSE, pairwise_reverse = TRUE, emmeans_args = list(), significance = 1 - conf.level, significance_labels = NULL, return_data = FALSE, ...) { data <- lapply( X = models, FUN = ggcoef_data, tidy_fun = tidy_fun, tidy_args = {{ tidy_args }}, conf.int = conf.int, conf.level = conf.level, exponentiate = exponentiate, variable_labels = variable_labels, term_labels = term_labels, interaction_sep = interaction_sep, categorical_terms_pattern = categorical_terms_pattern, add_reference_rows = add_reference_rows, no_reference_row = {{ no_reference_row }}, intercept = intercept, add_pairwise_contrasts = add_pairwise_contrasts, pairwise_variables = {{ pairwise_variables }}, keep_model_terms = keep_model_terms, pairwise_reverse = pairwise_reverse, emmeans_args = emmeans_args, significance = significance, significance_labels = significance_labels ) data <- dplyr::bind_rows(data, .id = "model") coefficients_label <- attr(data, "coefficients_label") data$model <- .in_order(data$model) data$term <- .in_order(data$term) data$var_label <- .in_order(data$var_label) data$variable <- .in_order(data$variable) data$label <- .in_order(data$label) # include should be applied after lapply data <- data %>% broom.helpers::tidy_select_variables( include = {{ include }}, model = models[[1]] # just need to pass 1 model for the function to work ) %>% broom.helpers::tidy_detach_model() # Add NA values for unobserved combinations # (i.e. for a term present in one model but not in another) data <- data %>% tidyr::complete( .data$model, tidyr::nesting( !!sym("var_label"), !!sym("variable"), !!sym("var_class"), !!sym("var_type"), !!sym("contrasts"), !!sym("label"), !!sym("label_light"), !!sym("term") ) ) %>% # order lost after nesting dplyr::arrange(.data$model, .data$variable, .data$term) attr(data, "coefficients_label") <- coefficients_label if (return_data) { return(data) } type <- match.arg(type) args <- list(...) args$data <- data args$exponentiate <- exponentiate if (!"y" %in% names(args) && !"facet_row" %in% names(args)) { args$y <- "label_light" } if (type == "dodged") { if (!"dodged " %in% names(args)) { args$dodged <- TRUE } if (!"colour" %in% names(args)) { args$colour <- "model" } if (!"errorbar_coloured" %in% names(args)) { args$errorbar_coloured <- TRUE } } else { if (!"facet_col" %in% names(args)) { args$facet_col <- "model" } if (!"colour" %in% names(args) && !all(is.na(data$var_label))) { args$colour <- "var_label" if (!"colour_guide" %in% names(args)) { args$colour_guide <- FALSE } } } do.call(ggcoef_plot, args) } #' @describeIn ggcoef_model a variation of [ggcoef_model()] adapted to #' multinomial logistic regressions performed with [nnet::multinom()]. #' @param y.level_label an optional named vector for labeling `y.level` #' (see examples) #' @export #' @examplesIf requireNamespace("nnet") #' #' \donttest{ #' # specific function for nnet::multinom models #' mod <- nnet::multinom(Species ~ ., data = iris) #' ggcoef_multinom(mod, exponentiate = TRUE) #' ggcoef_multinom(mod, type = "faceted") #' ggcoef_multinom( #' mod, #' type = "faceted", #' y.level_label = c("versicolor" = "versicolor\n(ref: setosa)") #' ) #' } ggcoef_multinom <- function( model, type = c("dodged", "faceted", "table"), y.level_label = NULL, tidy_fun = broom.helpers::tidy_with_broom_or_parameters, tidy_args = NULL, conf.int = TRUE, conf.level = .95, exponentiate = FALSE, variable_labels = NULL, term_labels = NULL, interaction_sep = " * ", categorical_terms_pattern = "{level}", add_reference_rows = TRUE, no_reference_row = NULL, intercept = FALSE, include = dplyr::everything(), significance = 1 - conf.level, significance_labels = NULL, return_data = FALSE, table_stat = c("estimate", "ci", "p.value"), table_header = NULL, table_text_size = 3, table_stat_label = NULL, ci_pattern = "{conf.low}, {conf.high}", table_witdhs = c(3, 2), ...) { type <- match.arg(type) attr(model, "component_label_arg") <- "y.level_label" ggcoef_multicomponents( model = model, type = type, component_col = "y.level", component_label = y.level_label, tidy_fun = tidy_fun, tidy_args = tidy_args, conf.int = conf.int, conf.level = conf.level, exponentiate = exponentiate, variable_labels = variable_labels, term_labels = term_labels, interaction_sep = interaction_sep, categorical_terms_pattern = categorical_terms_pattern, add_reference_rows = add_reference_rows, no_reference_row = {{ no_reference_row }}, intercept = intercept, include = {{ include }}, significance = significance, significance_labels = significance_labels, return_data = return_data, table_stat = table_stat, table_header = table_header, table_text_size = table_text_size, table_stat_label = table_stat_label, ci_pattern = ci_pattern, table_witdhs = table_witdhs, ... ) } #' @describeIn ggcoef_model a variation of [ggcoef_model()] adapted to #' multi-component models such as zero-inflated models or beta regressions. #' [ggcoef_multicomponents()] has been tested with `pscl::zeroinfl()`, #' `pscl::hurdle()` and `betareg::betareg()` #' @param component_col name of the component column #' @param component_label an optional named vector for labeling components #' @export #' @examplesIf requireNamespace("pscl") #' \donttest{ #' library(pscl) #' data("bioChemists", package = "pscl") #' mod <- zeroinfl(art ~ fem * mar | fem + mar, data = bioChemists) #' ggcoef_multicomponents(mod) #' #' ggcoef_multicomponents(mod, type = "f") #' #' ggcoef_multicomponents(mod, type = "t") #' #' ggcoef_multicomponents( #' mod, #' type = "t", #' component_label = c(conditional = "Count", zero_inflated = "Zero-inflated") #' ) #' #' mod2 <- zeroinfl(art ~ fem + mar | 1, data = bioChemists) #' ggcoef_multicomponents(mod2, type = "t") #' } ggcoef_multicomponents <- function( model, type = c("dodged", "faceted", "table"), component_col = "component", component_label = NULL, tidy_fun = broom.helpers::tidy_with_broom_or_parameters, tidy_args = NULL, conf.int = TRUE, conf.level = .95, exponentiate = FALSE, variable_labels = NULL, term_labels = NULL, interaction_sep = " * ", categorical_terms_pattern = "{level}", add_reference_rows = TRUE, no_reference_row = NULL, intercept = FALSE, include = dplyr::everything(), significance = 1 - conf.level, significance_labels = NULL, return_data = FALSE, table_stat = c("estimate", "ci", "p.value"), table_header = NULL, table_text_size = 3, table_stat_label = NULL, ci_pattern = "{conf.low}, {conf.high}", table_witdhs = c(3, 2), ...) { type <- match.arg(type) if (return_data && type == "table") type <- "faceted" if (type %in% c("dodged", "faceted")) { res <- ggcoef_multi_d_f( model = model, type = type, component_col = component_col, component_label = component_label, tidy_fun = tidy_fun, tidy_args = tidy_args, conf.int = conf.int, conf.level = conf.level, exponentiate = exponentiate, variable_labels = variable_labels, term_labels = term_labels, interaction_sep = interaction_sep, categorical_terms_pattern = categorical_terms_pattern, add_reference_rows = add_reference_rows, no_reference_row = {{ no_reference_row }}, intercept = intercept, include = {{ include }}, significance = significance, significance_labels = significance_labels, return_data = return_data, ... ) } else { res <- ggcoef_multi_t( model = model, type = type, component_col = component_col, component_label = component_label, tidy_fun = tidy_fun, tidy_args = tidy_args, conf.int = conf.int, conf.level = conf.level, exponentiate = exponentiate, variable_labels = variable_labels, term_labels = term_labels, interaction_sep = interaction_sep, categorical_terms_pattern = categorical_terms_pattern, add_reference_rows = add_reference_rows, no_reference_row = {{ no_reference_row }}, intercept = intercept, include = {{ include }}, significance = significance, significance_labels = significance_labels, ... ) } res } # dodged & faceted version ggcoef_multi_d_f <- function( model, type = c("dodged", "faceted"), component_col = "component", component_label = NULL, tidy_fun = broom.helpers::tidy_with_broom_or_parameters, tidy_args = NULL, conf.int = TRUE, conf.level = .95, exponentiate = FALSE, variable_labels = NULL, term_labels = NULL, interaction_sep = " * ", categorical_terms_pattern = "{level}", add_reference_rows = TRUE, no_reference_row = NULL, intercept = FALSE, include = dplyr::everything(), significance = 1 - conf.level, significance_labels = NULL, return_data = FALSE, ...) { component_label_arg <- attr(model, "component_label_arg") if (is.null(component_label_arg)) component_label_arg <- "component_label" data <- ggcoef_data( model, tidy_fun = tidy_fun, tidy_args = {{ tidy_args }}, conf.int = conf.int, conf.level = conf.level, exponentiate = exponentiate, variable_labels = variable_labels, term_labels = term_labels, interaction_sep = interaction_sep, categorical_terms_pattern = categorical_terms_pattern, add_reference_rows = add_reference_rows, no_reference_row = {{ no_reference_row }}, intercept = intercept, include = {{ include }}, significance = significance, significance_labels = significance_labels ) if (!component_col %in% names(data)) data[[component_col]] <- " " data[[component_col]] <- .in_order(data[[component_col]]) if (!is.null(component_label)) { if ( is.null(names(component_label)) || any(names(component_label) == "") ) { cli::cli_abort( "All elements of {.arg {component_label_arg}} should be named." ) } keep <- names(component_label) %in% levels(data[[component_col]]) drop <- component_label[!keep] if (length(drop) > 0) { cli::cli_alert_warning(c( "Error in {.arg {component_label_arg}}:\n", "value{?s} {.strong {drop}} not found in the data and ignored." )) } component_label <- component_label[keep] missing_levels <- setdiff( levels(.in_order(data[[component_col]])), names(component_label) ) names(missing_levels) <- missing_levels data[[component_col]] <- factor( data[[component_col]], levels = c(names(component_label), missing_levels), labels = c(component_label, missing_levels) ) } if (return_data) { return(data) } type <- match.arg(type) args <- list(...) args$data <- data args$exponentiate <- exponentiate if (!"y" %in% names(args) && !"facet_row" %in% names(args)) { args$y <- "label_light" } if (type == "dodged") { if (!"dodged " %in% names(args)) { args$dodged <- TRUE } if (!"colour" %in% names(args)) { args$colour <- component_col } if (!"errorbar_coloured" %in% names(args)) { args$errorbar_coloured <- TRUE } } else { if (!"facet_col" %in% names(args)) { args$facet_col <- component_col } if (!"colour" %in% names(args) && !all(is.na(data$var_label))) { args$colour <- "var_label" if (!"colour_guide" %in% names(args)) { args$colour_guide <- FALSE } } } do.call(ggcoef_plot, args) } # table version ggcoef_multi_t <- function( model, type = c("table"), component_col = "component", component_label = NULL, tidy_fun = broom.helpers::tidy_with_broom_or_parameters, tidy_args = NULL, conf.int = TRUE, conf.level = .95, exponentiate = FALSE, variable_labels = NULL, term_labels = NULL, interaction_sep = " * ", categorical_terms_pattern = "{level}", add_reference_rows = TRUE, no_reference_row = NULL, intercept = FALSE, include = dplyr::everything(), significance = 1 - conf.level, significance_labels = NULL, table_stat = c("estimate", "ci", "p.value"), table_header = NULL, table_text_size = 3, table_stat_label = NULL, ci_pattern = "{conf.low}, {conf.high}", table_witdhs = c(3, 2), ...) { type <- match.arg(type) component_label_arg <- attr(model, "component_label_arg") if (is.null(component_label_arg)) component_label_arg <- "component_label" data <- ggcoef_data( model = model, tidy_fun = tidy_fun, tidy_args = {{ tidy_args }}, conf.int = conf.int, conf.level = conf.level, exponentiate = exponentiate, variable_labels = variable_labels, term_labels = term_labels, interaction_sep = interaction_sep, categorical_terms_pattern = categorical_terms_pattern, add_reference_rows = add_reference_rows, no_reference_row = {{ no_reference_row }}, intercept = intercept, include = {{ include }}, significance = significance, significance_labels = significance_labels ) if (!component_col %in% names(data)) data[[component_col]] <- " " data[[component_col]] <- .in_order(data[[component_col]]) if (!is.null(component_label)) { if ( is.null(names(component_label)) || any(names(component_label) == "") ) { cli::cli_abort( "All elements of {.arg {component_label_arg}} should be named." ) } keep <- names(component_label) %in% levels(data[[component_col]]) drop <- component_label[!keep] if (length(drop) > 0) { cli::cli_alert_warning(c( "Error in {.arg {component_label_arg}}:\n", "value{?s} {.strong {drop}} not found in the data and ignored." )) } component_label <- component_label[keep] missing_levels <- setdiff( levels(.in_order(data[[component_col]])), names(component_label) ) names(missing_levels) <- missing_levels data[[component_col]] <- factor( data[[component_col]], levels = c(names(component_label), missing_levels), labels = c(component_label, missing_levels) ) } res <- levels(data[[component_col]]) %>% purrr::map( ~ ggcoef_table( data = dplyr::filter(data, .data[[component_col]] == .x), plot_title = .x, model = model, tidy_fun = tidy_fun, tidy_args = tidy_args, conf.int = conf.int, conf.level = conf.level, exponentiate = exponentiate, variable_labels = variable_labels, term_labels = term_labels, interaction_sep = interaction_sep, categorical_terms_pattern = categorical_terms_pattern, add_reference_rows = add_reference_rows, no_reference_row = {{ no_reference_row }}, intercept = intercept, include = {{ include }}, significance = significance, significance_labels = significance_labels, show_p_values = FALSE, signif_stars = FALSE, table_stat = table_stat, table_header = table_header, table_text_size = table_text_size, table_stat_label = table_stat_label, ci_pattern = ci_pattern, table_witdhs = table_witdhs ) ) patchwork::wrap_plots(res, ncol = 1) } # not exporting ggcoef_data ggcoef_data <- function( model, tidy_fun = broom.helpers::tidy_with_broom_or_parameters, tidy_args = NULL, conf.int = TRUE, conf.level = .95, exponentiate = FALSE, variable_labels = NULL, term_labels = NULL, interaction_sep = " * ", categorical_terms_pattern = "{level}", add_reference_rows = TRUE, no_reference_row = NULL, intercept = FALSE, include = dplyr::everything(), add_pairwise_contrasts = FALSE, pairwise_variables = broom.helpers::all_categorical(), keep_model_terms = FALSE, pairwise_reverse = TRUE, emmeans_args = list(), significance = conf.level, significance_labels = NULL) { rlang::check_installed("broom.helpers") if (length(significance) == 0) { significance <- NULL } data <- rlang::inject(broom.helpers::tidy_plus_plus( model = model, tidy_fun = tidy_fun, conf.int = conf.int, conf.level = conf.level, exponentiate = exponentiate, variable_labels = variable_labels, term_labels = term_labels, interaction_sep = interaction_sep, categorical_terms_pattern = categorical_terms_pattern, add_reference_rows = add_reference_rows, no_reference_row = {{ no_reference_row }}, add_pairwise_contrasts = add_pairwise_contrasts, pairwise_variables = {{ pairwise_variables }}, keep_model_terms = keep_model_terms, pairwise_reverse = pairwise_reverse, emmeans_args = emmeans_args, add_estimate_to_reference_rows = TRUE, add_header_rows = FALSE, intercept = intercept, include = {{ include }}, keep_model = FALSE, !!!tidy_args )) if (!"p.value" %in% names(data)) { data$p.value <- NA_real_ significance <- NULL } if (!is.null(significance)) { if (is.null(significance_labels)) { significance_labels <- paste(c("p <=", "p >"), significance) } data$significance <- factor( !is.na(data$p.value) & data$p.value <= significance, levels = c(TRUE, FALSE), labels = significance_labels ) } data$signif_stars <- signif_stars(data$p.value, point = NULL) data$p_value_label <- ifelse( is.na(data$p.value), "", scales::pvalue(data$p.value, add_p = TRUE) ) # keep only rows with estimate data <- data[!is.na(data$estimate), ] data$term <- .in_order(data$term) data$var_label <- .in_order(data$var_label) data$variable <- .in_order(data$variable) data$label <- .in_order(data$label) data$label_light <- dplyr::if_else( as.character(data$label) == as.character(data$var_label) & ((!grepl("^nmatrix", data$var_class)) | is.na(data$var_class)), "", as.character(data$label) ) %>% .in_order() data } #' @describeIn ggcoef_model plot a tidy `tibble` of coefficients #' @param data a data frame containing data to be plotted, #' typically the output of `ggcoef_model()`, `ggcoef_compare()` #' or `ggcoef_multinom()` with the option `return_data = TRUE` #' @param x,y variables mapped to x and y axis #' @param exponentiate if `TRUE` a logarithmic scale will #' be used for x-axis #' @param point_size size of the points #' @param point_stroke thickness of the points #' @param point_fill fill colour for the points #' @param colour optional variable name to be mapped to #' colour aesthetic #' @param colour_guide should colour guide be displayed #' in the legend? #' @param colour_lab label of the colour aesthetic in the legend #' @param colour_labels labels argument passed to #' [ggplot2::scale_colour_discrete()] and #' [ggplot2::discrete_scale()] #' @param shape optional variable name to be mapped to the #' shape aesthetic #' @param shape_values values of the different shapes to use in #' [ggplot2::scale_shape_manual()] #' @param shape_guide should shape guide be displayed in the legend? #' @param shape_lab label of the shape aesthetic in the legend #' @param errorbar should error bars be plotted? #' @param errorbar_height height of error bars #' @param errorbar_coloured should error bars be colored as the points? #' @param stripped_rows should stripped rows be displayed in the background? #' @param strips_odd color of the odd rows #' @param strips_even color of the even rows #' @param vline should a vertical line be drawn at 0 (or 1 if #' `exponentiate = TRUE`)? #' @param vline_colour colour of vertical line #' @param dodged should points be dodged (according to the colour aesthetic)? #' @param dodged_width width value for [ggplot2::position_dodge()] #' @param facet_row variable name to be used for row facets #' @param facet_col optional variable name to be used for column facets #' @param facet_labeller labeller function to be used for labeling facets; #' if labels are too long, you can use [ggplot2::label_wrap_gen()] (see #' examples), more information in the documentation of [ggplot2::facet_grid()] #' @seealso `vignette("ggcoef_model")` #' @export ggcoef_plot <- function( data, x = "estimate", y = "label", exponentiate = FALSE, point_size = 2, point_stroke = 2, point_fill = "white", colour = NULL, colour_guide = TRUE, colour_lab = "", colour_labels = ggplot2::waiver(), shape = "significance", shape_values = c(16, 21), shape_guide = TRUE, shape_lab = "", errorbar = TRUE, errorbar_height = .1, errorbar_coloured = FALSE, stripped_rows = TRUE, strips_odd = "#11111111", strips_even = "#00000000", vline = TRUE, vline_colour = "grey50", dodged = FALSE, dodged_width = .8, facet_row = "var_label", facet_col = NULL, facet_labeller = "label_value") { data[[y]] <- forcats::fct_rev(.in_order(data[[y]])) if (!is.null(facet_row)) { data[[facet_row]] <- .in_order(data[[facet_row]]) } if (stripped_rows) { if (!"term" %in% names(data)) { data$term <- data[[y]] } data <- data %>% dplyr::mutate(.fill = dplyr::if_else( as.integer(.in_order(.data$term)) %% 2L == 1, strips_even, strips_odd )) } # mapping mapping <- ggplot2::aes(x = .data[[x]], y = .data[[y]]) errorbar <- errorbar & all(c("conf.low", "conf.high") %in% names(data)) if (errorbar) { mapping$xmin <- ggplot2::aes(xmin = .data[["conf.low"]])$xmin mapping$xmax <- ggplot2::aes(xmax = .data[["conf.high"]])$xmax } if (!is.null(shape) && shape %in% names(data)) { mapping$shape <- ggplot2::aes(shape = .data[[shape]])$shape } if (!is.null(colour) && colour %in% names(data)) { mapping$colour <- ggplot2::aes(colour = .data[[colour]])$colour mapping$group <- ggplot2::aes(group = .data[[colour]])$group } # position if (dodged) { position <- ggplot2::position_dodge(dodged_width) } else { position <- ggplot2::position_identity() } # plot p <- ggplot2::ggplot(data = data, mapping = mapping) if (stripped_rows) { p <- p + geom_stripped_rows( mapping = ggplot2::aes( odd = .data[[".fill"]], even = .data[[".fill"]], colour = NULL, linetype = NULL ) ) } if (vline) { p <- p + ggplot2::geom_vline( xintercept = ifelse(exponentiate, 1, 0), colour = vline_colour ) } if (errorbar) { if (!is.null(colour) && errorbar_coloured) { p <- p + ggplot2::geom_errorbarh( na.rm = TRUE, height = errorbar_height, position = position ) } else { p <- p + ggplot2::geom_errorbarh( mapping = ggplot2::aes(colour = NULL), na.rm = TRUE, height = errorbar_height, colour = "black", position = position ) } } if (!is.null(facet_col) && is.character(facet_col)) { facet_col <- ggplot2::vars(!!sym(facet_col)) } if (!is.null(facet_row) && is.character(facet_row)) { facet_row <- ggplot2::vars(!!sym(facet_row)) } p <- p + ggplot2::geom_point( size = point_size, stroke = point_stroke, fill = point_fill, position = position, na.rm = TRUE ) + ggplot2::facet_grid( rows = facet_row, cols = facet_col, labeller = facet_labeller, scales = "free_y", space = "free_y", switch = "y" ) + ggplot2::ylab("") + ggplot2::scale_y_discrete(expand = ggplot2::expansion(mult = 0, add = .5)) + ggplot2::theme_light() + ggplot2::theme( legend.position = "bottom", legend.box = "vertical", strip.placement = "outside", strip.text.y.left = ggplot2::element_text( face = "bold", angle = 0, colour = "black", hjust = 0, vjust = 1 ), strip.text.x = ggplot2::element_text(face = "bold", colour = "black"), strip.background = ggplot2::element_blank(), panel.grid.minor = ggplot2::element_blank(), panel.grid.major.y = ggplot2::element_blank(), panel.grid.major.x = ggplot2::element_line(linetype = "dashed"), axis.title.x = ggplot2::element_text(face = "bold"), axis.ticks.y = ggplot2::element_blank() ) if (!is.null(colour) && colour %in% names(data)) { if (colour_guide) { colour_guide <- ggplot2::guide_legend() } else { colour_guide <- "none" } p <- p + ggplot2::scale_colour_discrete( guide = colour_guide, labels = colour_labels ) + ggplot2::labs(colour = colour_lab) } if (!is.null(shape) && shape %in% names(data)) { if (shape_guide) { shape_guide <- ggplot2::guide_legend() } else { shape_guide <- "none" } p <- p + ggplot2::scale_shape_manual( values = shape_values, drop = FALSE, guide = shape_guide, na.translate = FALSE ) + ggplot2::labs(shape = shape_lab) } if (exponentiate) { p <- p + ggplot2::scale_x_log10() } if (!is.null(attr(data, "coefficients_label"))) { p <- p + ggplot2::xlab(attr(data, "coefficients_label")) } p } .in_order <- function(x) { # droping unobserved value if needed forcats::fct_inorder(as.character(x)) } ggstats/R/stat_cross.R0000644000176200001440000001575214462704247014462 0ustar liggesusers#' Compute cross-tabulation statistics #' #' Computes statistics of a 2-dimensional matrix using [broom::augment.htest]. #' #' @inheritParams ggplot2::stat_identity #' @param geom Override the default connection with #' [ggplot2::geom_point()]. #' @param na.rm If `TRUE`, the default, missing values are #' removed with a warning. #' If `TRUE`, missing values are silently removed. #' @param keep.zero.cells If `TRUE`, cells with no observations are kept. #' @section Aesthetics: #' `stat_cross()` requires the **x** and the **y** aesthetics. #' @section Computed variables: #' \describe{ #' \item{observed}{number of observations in x,y} #' \item{prop}{proportion of total} #' \item{row.prop}{row proportion} #' \item{col.prop}{column proportion} #' \item{expected}{expected count under the null hypothesis} #' \item{resid}{Pearson's residual} #' \item{std.resid}{standardized residual} #' \item{row.observed}{total number of observations within row} #' \item{col.observed}{total number of observations within column} #' \item{total.observed}{total number of observations within the table} #' \item{phi}{phi coefficients, see [augment_chisq_add_phi()]} #' } #' #' @export #' @return A `ggplot2` plot with the added statistic. #' @seealso `vignette("stat_cross")` #' @examples #' library(ggplot2) #' d <- as.data.frame(Titanic) #' #' # plot number of observations #' ggplot(d) + #' aes(x = Class, y = Survived, weight = Freq, size = after_stat(observed)) + #' stat_cross() + #' scale_size_area(max_size = 20) #' #' # custom shape and fill colour based on chi-squared residuals #' ggplot(d) + #' aes( #' x = Class, y = Survived, weight = Freq, #' size = after_stat(observed), fill = after_stat(std.resid) #' ) + #' stat_cross(shape = 22) + #' scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) + #' scale_size_area(max_size = 20) #' #' \donttest{ #' # custom shape and fill colour based on phi coeffients #' ggplot(d) + #' aes( #' x = Class, y = Survived, weight = Freq, #' size = after_stat(observed), fill = after_stat(phi) #' ) + #' stat_cross(shape = 22) + #' scale_fill_steps2(show.limits = TRUE) + #' scale_size_area(max_size = 20) #' #' #' # plotting the number of observations as a table #' ggplot(d) + #' aes( #' x = Class, y = Survived, weight = Freq, label = after_stat(observed) #' ) + #' geom_text(stat = "cross") #' #' # Row proportions with standardized residuals #' ggplot(d) + #' aes( #' x = Class, y = Survived, weight = Freq, #' label = scales::percent(after_stat(row.prop)), #' size = NULL, fill = after_stat(std.resid) #' ) + #' stat_cross(shape = 22, size = 30) + #' geom_text(stat = "cross") + #' scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) + #' facet_grid(Sex ~ .) + #' labs(fill = "Standardized residuals") + #' theme_minimal() #' } stat_cross <- function(mapping = NULL, data = NULL, geom = "point", position = "identity", ..., na.rm = TRUE, show.legend = NA, inherit.aes = TRUE, keep.zero.cells = FALSE) { params <- list( na.rm = na.rm, keep.zero.cells = keep.zero.cells, ... ) layer( data = data, mapping = mapping, stat = StatCross, geom = geom, position = position, show.legend = show.legend, inherit.aes = inherit.aes, params = params ) } #' @rdname stat_cross #' @format NULL #' @usage NULL #' @export StatCross <- ggplot2::ggproto( "StatCross", ggplot2::Stat, required_aes = c("x", "y"), default_aes = ggplot2::aes(weight = 1), setup_params = function(data, params) { params }, extra_params = c("na.rm"), compute_panel = function(self, data, scales, keep.zero.cells = FALSE) { if (is.null(data$weight)) { data$weight <- rep(1, nrow(data)) } # compute cross statistics panel <- augment_chisq_add_phi( chisq.test(xtabs(weight ~ y + x, data = data)) ) panel_names <- names(panel) for (to_name in c( "observed", "prop", "row.prop", "col.prop", "expected", "resid", "std.resid", "row.observed", "col.observed", "total.observed", "phi" )) { from_name <- paste0(".", to_name) panel_names[which(panel_names == from_name)] <- to_name } names(panel) <- panel_names # to handle the fact that ggplot2 could transform factors into integers # before computation of the statistic if (is.numeric(data$x)) panel$x <- as.numeric(panel$x) if (is.numeric(data$y)) panel$y <- as.numeric(panel$y) # keeping first value of other aesthetics in data panel <- merge( panel, dplyr::select(data, -dplyr::all_of("PANEL")), by = c("x", "y"), all.x = TRUE ) panel <- panel %>% dplyr::distinct(.data$x, .data$y, .keep_all = TRUE) if (!keep.zero.cells) { panel <- panel[panel$observed != 0, ] } panel } ) # Compute phi coefficients # see psych::phi() and GDAtools::phi.table() .compute_phi <- function(.prop, .row.observed, .col.observed, .total.observed) { rp <- .row.observed / .total.observed cp <- .col.observed / .total.observed (.prop - rp * cp) / sqrt(rp * (1 - rp) * cp * (1 - cp)) } #' Augment a chi-squared test and compute phi coefficients #' @details #' Phi coefficients are a measurement of the degree of association #' between two binary variables. #' #' - A value between -1.0 to -0.7 indicates a strong negative association. #' - A value between -0.7 to -0.3 indicates a weak negative association. #' - A value between -0.3 to +0.3 indicates a little or no association. #' - A value between +0.3 to +0.7 indicates a weak positive association. #' - A value between +0.7 to +1.0 indicates a strong positive association. #' @export #' @param x a chi-squared test as returned by [stats::chisq.test()] #' @return A `tibble`. #' @seealso [stat_cross()], `GDAtools::phi.table()` or `psych::phi()` #' @examples #' tab <- xtabs(Freq ~ Sex + Class, data = as.data.frame(Titanic)) #' augment_chisq_add_phi(chisq.test(tab)) augment_chisq_add_phi <- function(x) { if (!inherits(x, "htest") && names(x$statistic) != "X-squared") { cli::cli_abort(paste( "{.arg x} should be the result of a chi-squared test", "(see {.fn stats::chisq.test})." )) } broom::augment(x) %>% dplyr::group_by(dplyr::across(1)) %>% dplyr::mutate(.row.observed = sum(.data$.observed)) %>% dplyr::group_by(dplyr::across(2)) %>% dplyr::mutate(.col.observed = sum(.data$.observed)) %>% dplyr::ungroup() %>% dplyr::mutate( .total.observed = sum(.data$.observed), .phi = .compute_phi( .data$.prop, .data$.row.observed, .data$.col.observed, .data$.total.observed ) ) } ggstats/R/signif_stars.R0000644000176200001440000000207414357760261014763 0ustar liggesusers#' Significance Stars #' #' Calculate significance stars #' #' @param x numeric values that will be compared to the `point`, #' `one`, `two`, and `three` values #' @param three threshold below which to display three stars #' @param two threshold below which to display two stars #' @param one threshold below which to display one star #' @param point threshold below which to display one point #' (`NULL` to deactivate) #' @return Character vector containing the appropriate number of #' stars for each `x` value. #' @author Joseph Larmarange #' @export #' @examples #' x <- c(0.5, 0.1, 0.05, 0.01, 0.001) #' signif_stars(x) #' signif_stars(x, one = .15, point = NULL) signif_stars <- function(x, three = 0.001, two = 0.01, one = 0.05, point = 0.1) { res <- rep_len("", length.out = length(x)) if (!is.null(point)) { res[x <= point] <- "." } if (!is.null(one)) { res[x <= one] <- "*" } if (!is.null(two)) { res[x <= two] <- "**" } if (!is.null(three)) { res[x <= three] <- "***" } res } ggstats/R/ggsurvey.R0000644000176200001440000000334314415524646014143 0ustar liggesusers#' Easy ggplot2 with survey objects #' #' A function to facilitate `ggplot2` graphs using a survey object. #' It will initiate a ggplot and map survey weights to the #' corresponding aesthetic. #' #' Graphs will be correct as long as only weights are required #' to compute the graph. However, statistic or geometry requiring #' correct variance computation (like [ggplot2::geom_smooth()]) will #' be statistically incorrect. #' #' @param design A survey design object, usually created with #' [survey::svydesign()] #' @param mapping Default list of aesthetic mappings to use for plot, #' to be created with [ggplot2::aes()]. #' @param ... Other arguments passed on to methods. Not currently used. #' @importFrom stats weights #' @return A `ggplot2` plot. #' @export #' @examplesIf requireNamespace("survey") #' data(api, package = "survey") #' dstrat <- survey::svydesign( #' id = ~1, strata = ~stype, #' weights = ~pw, data = apistrat, #' fpc = ~fpc #' ) #' ggsurvey(dstrat) + #' ggplot2::aes(x = cnum, y = dnum) + #' ggplot2::geom_count() #' #' d <- as.data.frame(Titanic) #' dw <- survey::svydesign(ids = ~1, weights = ~Freq, data = d) #' ggsurvey(dw) + #' ggplot2::aes(x = Class, fill = Survived) + #' ggplot2::geom_bar(position = "fill") ggsurvey <- function(design = NULL, mapping = NULL, ...) { if (!inherits(design, "survey.design")) { cli::cli_abort("{.var design} should be a {.cls survey.design} object.") } rlang::check_installed("survey") data <- design$variables data$.weights <- weights(design) if (is.null(mapping)) { mapping <- ggplot2::aes() } mapping$weight <- ggplot2::aes(weight = .data[[".weights"]])$weight ggplot2::ggplot(data, mapping, ...) } ggstats/R/label_number_abs.R0000644000176200001440000000223314527052120015524 0ustar liggesusers#' Label absolute values #' #' @param ... arguments passed to [scales::label_number()] or #' [scales::label_percent()] #' @param hide_below if provided, values below `hide_below` will be masked #' (i.e. an empty string `""` will be returned) #' @returns A "labelling" function, , i.e. a function that takes a vector and #' returns a character vector of same length giving a label for each input value. #' @seealso [scales::label_number()], [scales::label_percent()] #' @export #' @examples #' x <- c(-0.2, -.05, 0, .07, .25, .66) #' #' scales::label_number()(x) #' label_number_abs()(x) #' #' scales::label_percent()(x) #' label_percent_abs()(x) #' label_percent_abs(hide_below = .1)(x) label_number_abs <- function(..., hide_below = NULL) { function(x) { res <- scales::label_number(...)(abs(x)) if (!is.null(hide_below)) { res[abs(x) < hide_below] <- "" } res } } #' @rdname label_number_abs #' @export label_percent_abs <- function(..., hide_below = NULL) { function(x) { res <- scales::label_percent(...)(abs(x)) if (!is.null(hide_below)) { res[abs(x) < hide_below] <- "" } res } } ggstats/R/gglikert.R0000644000176200001440000004754614505240662014106 0ustar liggesusers#' Plotting Likert-type items #' #' `r lifecycle::badge("experimental")` #' #' Combines several factor variables using the same list of ordered levels #' (e.g. Likert-type scales) into a unique data frame and generates a centered #' bar plot. #' #' You could use `gglikert_data()` to just produce the dataset to be plotted. #' #' If variable labels have been defined (see [labelled::var_label()]), they will #' be considered. You can also pass custom variables labels with the #' `variable_labels` argument. #' #' @param data a data frame #' @param include variables to include, accept [tidy-select][dplyr::select] #' syntax #' @param weights optional variable name of a weighting variable, #' accept [tidy-select][dplyr::select] syntax #' @param y name of the variable to be plotted on `y` axis (relevant when #' `.question` is mapped to "facets, see examples), #' accept [tidy-select][dplyr::select] syntax #' @param variable_labels a named list or a named vector of custom variable #' labels #' @param sort should variables be sorted? #' @param sort_method method used to sort the variables: `"prop"` sort according #' to the proportion of answers higher than the centered level, `"mean"` #' considers answer as a score and sort according to the mean score, `"median"` #' used the median and the majority judgment rule for tie-breaking. #' @param sort_prop_include_center when sorting with `"prop"` and if the number #' of levels is uneven, should half of the central level be taken into account #' to compute the proportion? #' @param exclude_fill_values Vector of values that should not be displayed #' (but still taken into account for computing proportions), #' see [position_likert()] #' @param add_labels should percentage labels be added to the plot? #' @param labels_size size of the percentage labels #' @param labels_color color of the percentage labels #' @param labels_accuracy accuracy of the percentages, see #' [scales::label_percent()] #' @param labels_hide_below if provided, values below will be masked, see #' [label_percent_abs()] #' @param add_totals should the total proportions of negative and positive #' answers be added to plot? **This option is not compatible with facets!** #' @param totals_size size of the total proportions #' @param totals_color color of the total proportions #' @param totals_accuracy accuracy of the total proportions, see #' [scales::label_percent()] #' @param totals_fontface font face of the total proportions #' @param totals_include_center if the number of levels is uneven, should half #' of the center level be added to the total proportions? #' @param totals_hjust horizontal adjustment of totals labels on the x axis #' @param y_reverse should the y axis be reversed? #' @param y_label_wrap number of characters per line for y axis labels, see #' [scales::label_wrap()] #' @param reverse_likert if `TRUE`, will reverse the default stacking order, #' see [position_likert()] #' @param width bar width, see [ggplot2::geom_bar()] #' @param facet_rows,facet_cols A set of variables or expressions quoted by #' [ggplot2::vars()] and defining faceting groups on the rows or columns #' dimension (see examples) #' @param facet_label_wrap number of characters per line for facet labels, see #' [ggplot2::label_wrap_gen()] #' @return A `ggplot2` plot or a `tibble`. #' @seealso `vignette("gglikert")`, [position_likert()], [stat_prop()] #' @export #' @examples #' library(ggplot2) #' library(dplyr) #' #' likert_levels <- c( #' "Strongly disagree", #' "Disagree", #' "Neither agree nor disagree", #' "Agree", #' "Strongly agree" #' ) #' set.seed(42) #' df <- #' tibble( #' q1 = sample(likert_levels, 150, replace = TRUE), #' q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1), #' q3 = sample(likert_levels, 150, replace = TRUE, prob = 1:5), #' q4 = sample(likert_levels, 150, replace = TRUE, prob = 1:5), #' q5 = sample(c(likert_levels, NA), 150, replace = TRUE), #' q6 = sample(likert_levels, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0)) #' ) %>% #' mutate(across(everything(), ~ factor(.x, levels = likert_levels))) #' #' gglikert(df) #' #' gglikert(df, include = q1:3) #' #' gglikert(df, sort = "ascending") #' #' \donttest{ #' gglikert(df, sort = "ascending", sort_prop_include_center = TRUE) #' #' gglikert(df, sort = "ascending", sort_method = "mean") #' #' gglikert(df, reverse_likert = TRUE) #' #' gglikert(df, add_totals = FALSE, add_labels = FALSE) #' #' gglikert( #' df, #' totals_include_center = TRUE, #' totals_hjust = .25, #' totals_size = 4.5, #' totals_fontface = "italic", #' totals_accuracy = .01, #' labels_accuracy = 1, #' labels_size = 2.5, #' labels_hide_below = .25 #' ) #' #' gglikert(df, exclude_fill_values = "Neither agree nor disagree") #' #' if (require("labelled")) { #' df %>% #' set_variable_labels( #' q1 = "First question", #' q2 = "Second question" #' ) %>% #' gglikert( #' variable_labels = c( #' q4 = "a custom label", #' q6 = "a very very very very very very very very very very long label" #' ), #' y_label_wrap = 25 #' ) #' } #' #' # Facets #' df_group <- df #' df_group$group <- sample(c("A", "B"), 150, replace = TRUE) #' #' gglikert(df_group, q1:q6, facet_rows = vars(group)) #' #' gglikert(df_group, q1:q6, facet_cols = vars(group)) #' #' gglikert(df_group, q1:q6, y = "group", facet_rows = vars(.question)) #' } gglikert <- function(data, include = dplyr::everything(), weights = NULL, y = ".question", variable_labels = NULL, sort = c("none", "ascending", "descending"), sort_method = c("prop", "mean", "median"), sort_prop_include_center = totals_include_center, exclude_fill_values = NULL, add_labels = TRUE, labels_size = 3.5, labels_color = "black", labels_accuracy = 1, labels_hide_below = .05, add_totals = TRUE, totals_size = labels_size, totals_color = "black", totals_accuracy = labels_accuracy, totals_fontface = "bold", totals_include_center = FALSE, totals_hjust = .1, y_reverse = TRUE, y_label_wrap = 50, reverse_likert = FALSE, width = .9, facet_rows = NULL, facet_cols = NULL, facet_label_wrap = 50) { data <- gglikert_data( data, {{ include }}, weights = {{ weights }}, variable_labels = variable_labels, sort = sort, sort_method = sort_method, sort_prop_include_center = sort_prop_include_center, exclude_fill_values = exclude_fill_values ) y <- broom.helpers::.select_to_varnames( select = {{ y }}, data = data, arg_name = "y", select_single = TRUE ) if (!is.factor(data[[y]])) { data[[y]] <- factor(data[[y]]) } if (y_reverse) { data[[y]] <- data[[y]] %>% forcats::fct_rev() } p <- ggplot(data) + aes( y = .data[[y]], fill = .data[[".answer"]], by = .data[[y]], weight = .data[[".weights"]] ) + geom_bar( position = position_likert( reverse = reverse_likert, exclude_fill_values = exclude_fill_values ), stat = StatProp, complete = "fill", width = width ) if (add_labels) { p <- p + geom_text( mapping = aes( label = label_percent_abs( hide_below = labels_hide_below, accuracy = labels_accuracy )(after_stat(prop)) ), stat = StatProp, complete = "fill", position = position_likert( vjust = .5, reverse = reverse_likert, exclude_fill_values = exclude_fill_values ), size = labels_size, color = labels_color ) } if (add_totals) { dtot <- data %>% dplyr::group_by(.data[[y]], !!!facet_rows, !!!facet_cols) %>% dplyr::summarise( prop_lower = .prop_lower( .data$.answer, .data$.weights, include_center = TRUE, exclude_fill_values = exclude_fill_values ), prop_higher = .prop_higher( .data$.answer, .data$.weights, include_center = TRUE, exclude_fill_values = exclude_fill_values ), label_lower = .prop_lower( .data$.answer, .data$.weights, include_center = totals_include_center, exclude_fill_values = exclude_fill_values ), label_higher = .prop_higher( .data$.answer, .data$.weights, include_center = totals_include_center, exclude_fill_values = exclude_fill_values ) ) %>% dplyr::ungroup() %>% dplyr::mutate( label_lower = label_percent_abs(accuracy = totals_accuracy)(.data$label_lower), label_higher = label_percent_abs(accuracy = totals_accuracy)(.data$label_higher), x_lower = -1 * max(.data$prop_lower) - totals_hjust, x_higher = max(.data$prop_higher) + totals_hjust ) %>% dplyr::group_by(!!!facet_rows, !!!facet_cols) dtot <- dplyr::bind_rows( dtot %>% dplyr::select( dplyr::all_of(c(y, x = "x_lower", label = "label_lower")), dplyr::group_cols() ), dtot %>% dplyr::select( dplyr::all_of(c(y, x = "x_higher", label = "label_higher")), dplyr::group_cols() ) ) p <- p + geom_text( mapping = aes( y = .data[[y]], x = .data[["x"]], label = .data[["label"]], fill = NULL, by = NULL, weight = NULL ), data = dtot, size = totals_size, color = totals_color, fontface = totals_fontface ) } p <- p + labs(x = NULL, y = NULL, fill = NULL) + scale_x_continuous(labels = label_percent_abs()) + scale_y_discrete(labels = scales::label_wrap(y_label_wrap)) + theme_light() + theme( legend.position = "bottom", panel.grid.major.y = element_blank() ) if (length(levels(data$.answer)) <= 11) { p <- p + scale_fill_brewer(palette = "BrBG") } p + facet_grid( rows = facet_rows, cols = facet_cols, labeller = ggplot2::label_wrap_gen(facet_label_wrap) ) } #' @rdname gglikert #' @export gglikert_data <- function(data, include = dplyr::everything(), weights = NULL, variable_labels = NULL, sort = c("none", "ascending", "descending"), sort_method = c("prop", "mean", "median"), sort_prop_include_center = TRUE, exclude_fill_values = NULL) { rlang::check_installed("broom.helpers") rlang::check_installed("labelled") sort <- match.arg(sort) sort_method <- match.arg(sort_method) variables <- broom.helpers::.select_to_varnames( select = {{ include }}, data = data, arg_name = "include" ) weights_var <- broom.helpers::.select_to_varnames( select = {{ weights }}, data = data, arg_name = "weights", select_single = TRUE ) if (is.null(weights_var)) { data$.weights <- 1 } else { data$.weights <- data[[weights_var]] } if (!is.numeric(data$.weights)) { cli::cli_abort("{.arg weights} should correspond to a numerical variable.") } if (is.list(variable_labels)) { variable_labels <- unlist(variable_labels) } data_labels <- data %>% labelled::var_label(unlist = TRUE, null_action = "fill") if (!is.null(variable_labels)) { data_labels[names(variable_labels)] <- variable_labels } data_labels <- data_labels[variables] data <- data %>% dplyr::mutate( dplyr::across(dplyr::all_of(variables), labelled::to_factor) ) data <- data %>% dplyr::mutate( dplyr::bind_cols(forcats::fct_unify(data[, variables])) ) %>% tidyr::pivot_longer( cols = dplyr::all_of(variables), names_to = ".question", values_to = ".answer" ) data$.question <- data_labels[data$.question] %>% forcats::fct_inorder() if (sort == "ascending" && sort_method == "prop") { data$.question <- data$.question %>% forcats::fct_reorder2( data$.answer, data$.weights, .fun = .prop_higher, include_center = sort_prop_include_center, exclude_fill_values = exclude_fill_values, .na_rm = FALSE, .desc = FALSE ) } if (sort == "descending" && sort_method == "prop") { data$.question <- data$.question %>% forcats::fct_reorder2( data$.answer, data$.weights, .fun = .prop_higher, include_center = sort_prop_include_center, exclude_fill_values = exclude_fill_values, .na_rm = FALSE, .desc = TRUE ) } if (sort == "ascending" && sort_method == "mean") { data$.question <- data$.question %>% forcats::fct_reorder2( data$.answer, data$.weights, .fun = .sort_mean, exclude_fill_values = exclude_fill_values, .na_rm = FALSE, .desc = FALSE ) } if (sort == "descending" && sort_method == "mean") { data$.question <- data$.question %>% forcats::fct_reorder2( data$.answer, data$.weights, .fun = .sort_mean, exclude_fill_values = exclude_fill_values, .na_rm = FALSE, .desc = TRUE ) } if (sort == "ascending" && sort_method == "median") { data$.question <- data$.question %>% forcats::fct_reorder2( data$.answer, data$.weights, .fun = .sort_median, exclude_fill_values = exclude_fill_values, .na_rm = FALSE, .desc = FALSE ) } if (sort == "descending" && sort_method == "median") { data$.question <- data$.question %>% forcats::fct_reorder2( data$.answer, data$.weights, .fun = .sort_median, exclude_fill_values = exclude_fill_values, .na_rm = FALSE, .desc = TRUE ) } data } # Compute the proportion being higher than the center # Option to include the centre (if yes, only half taken into account) .prop_higher <- function(x, w, include_center = TRUE, exclude_fill_values = NULL) { N <- sum(as.integer(!is.na(x)) * w) if (!is.factor(x)) x <- factor(x) if (!is.null(exclude_fill_values)) { l <- levels(x) l <- l[!l %in% exclude_fill_values] x <- factor(x, levels = l) } m <- length(levels(x)) / 2 + 1 / 2 x <- as.numeric(x) ic <- ifelse(include_center, 1 / 2, 0) sum(w * as.integer(x > m), w * ic * as.integer(x == m), na.rm = TRUE) / N } # Compute the proportion being higher than the center # Option to include the centre (if yes, only half taken into account) .prop_lower <- function(x, w, include_center = TRUE, exclude_fill_values = NULL) { N <- sum(as.integer(!is.na(x)) * w) if (!is.factor(x)) x <- factor(x) if (!is.null(exclude_fill_values)) { l <- levels(x) l <- l[!l %in% exclude_fill_values] x <- factor(x, levels = l) } m <- length(levels(x)) / 2 + 1 / 2 x <- as.numeric(x) ic <- ifelse(include_center, 1 / 2, 0) sum(w * as.integer(x < m), ic * w * as.integer(x == m), na.rm = TRUE) / N } #' @importFrom stats weighted.mean .sort_mean <- function(x, w, exclude_fill_values = NULL) { if (!is.factor(x)) x <- factor(x) if (!is.null(exclude_fill_values)) { l <- levels(x) l <- l[!l %in% exclude_fill_values] x <- factor(x, levels = l) } x <- as.integer(x) stats::weighted.mean(x, w, na.rm = TRUE) } .sort_median <- function(x, w, exclude_fill_values = NULL) { if (!is.factor(x)) x <- factor(x) if (!is.null(exclude_fill_values)) { l <- levels(x) l <- l[!l %in% exclude_fill_values] x <- factor(x, levels = l) } x <- as.integer(x) med <- weighted.median(x, w, na.rm = TRUE) med + stats::weighted.mean(x > med, w, na.rm = TRUE) - stats::weighted.mean(x < med, w, na.rm = TRUE) } #' @rdname gglikert #' @param add_median_line add a vertical line at 50%? #' @param reverse_fill if `TRUE`, will reverse the default stacking order, #' see [ggplot2::position_fill()] #' @export #' @examples #' gglikert_stacked(df, q1:q6) #' #' gglikert_stacked(df, q1:q6, add_median_line = TRUE, sort = "asc") #' #' \donttest{ #' gglikert_stacked(df_group, q1:q6, y = "group", add_median_line = TRUE) + #' facet_grid(rows = vars(.question)) #' } gglikert_stacked <- function(data, include = dplyr::everything(), weights = NULL, y = ".question", variable_labels = NULL, sort = c("none", "ascending", "descending"), sort_method = c("prop", "mean", "median"), sort_prop_include_center = FALSE, add_labels = TRUE, labels_size = 3.5, labels_color = "black", labels_accuracy = 1, labels_hide_below = .05, add_median_line = FALSE, y_reverse = TRUE, y_label_wrap = 50, reverse_fill = TRUE, width = .9) { data <- gglikert_data( data, {{ include }}, weights = {{ weights }}, variable_labels = variable_labels, sort = sort, sort_method = sort_method, sort_prop_include_center = sort_prop_include_center, exclude_fill_values = NULL ) y <- broom.helpers::.select_to_varnames( select = {{ y }}, data = data, arg_name = "y", select_single = TRUE ) if (!is.factor(data[[y]])) { data[[y]] <- factor(data[[y]]) } if (y_reverse) { data[[y]] <- data[[y]] %>% forcats::fct_rev() } p <- ggplot(data) + aes( y = .data[[y]], fill = .data[[".answer"]], by = .data[[y]], weight = .data[[".weights"]] ) + geom_bar( position = position_fill(reverse = reverse_fill), stat = StatProp, complete = "fill", width = width ) if (add_labels) { p <- p + geom_text( mapping = aes( label = label_percent_abs( hide_below = labels_hide_below, accuracy = labels_accuracy )(after_stat(prop)) ), stat = StatProp, complete = "fill", position = position_fill( vjust = .5, reverse = reverse_fill ), size = labels_size, color = labels_color ) } if (add_median_line) { p <- p + ggplot2::geom_vline(xintercept = .5) } p <- p + labs(x = NULL, y = NULL, fill = NULL) + scale_x_continuous(labels = label_percent_abs()) + scale_y_discrete(labels = scales::label_wrap(y_label_wrap)) + theme_light() + theme( legend.position = "bottom", panel.grid.major.y = element_blank() ) if (length(levels(data$.answer)) <= 11) { p <- p + scale_fill_brewer(palette = "BrBG") } p } ggstats/R/ggstats-package.R0000644000176200001440000000146514462704245015335 0ustar liggesusers#' @keywords internal "_PACKAGE" # The following block is used by usethis to automatically manage # roxygen namespace tags. Modify with care! ## usethis namespace: start #' @importFrom lifecycle deprecate_soft #' @importFrom lifecycle deprecated #' @importFrom dplyr .data sym #' @importFrom ggplot2 after_stat ## usethis namespace: end NULL utils::globalVariables(c("prop")) # \lifecycle{experimental} # \lifecycle{maturing} # \lifecycle{stable} # \lifecycle{superseded} # \lifecycle{questioning} # \lifecycle{soft-deprecated} # \lifecycle{deprecated} # \lifecycle{defunct} # \lifecycle{archived} #' @importFrom magrittr %>% #' @export magrittr::`%>%` # from ggplot2 (but not exported by ggplot2) `%||%` <- function(a, b) { if (!is.null(a)) { a } else { b } } ggstats/R/geom_stripped_rows.R0000644000176200001440000001442014357760261016201 0ustar liggesusers#' Alternating Background Color #' #' Add alternating background color along the y-axis. The geom takes default #' aesthetics `odd` and `even` that receive color codes. #' #' @inheritParams ggplot2::layer #' @inheritParams ggplot2::geom_rect #' @param xfrom,xto limitation of the strips along the x-axis #' @param width width of the strips #' @param nudge_x,nudge_y horizontal or vertical adjustment to nudge strips by #' @export #' @return A `ggplot2` plot with the added geometry. #' @examplesIf requireNamespace("reshape") #' data(tips, package = "reshape") #' #' library(ggplot2) #' p <- ggplot(tips) + #' aes(x = time, y = day) + #' geom_count() + #' theme_light() #' #' p #' p + geom_stripped_rows() #' p + geom_stripped_cols() #' p + geom_stripped_rows() + geom_stripped_cols() #' #' \donttest{ #' p <- ggplot(tips) + #' aes(x = total_bill, y = day) + #' geom_count() + #' theme_light() #' p #' p + geom_stripped_rows() #' p + geom_stripped_rows() + scale_y_discrete(expand = expansion(0, 0.5)) #' p + geom_stripped_rows(xfrom = 10, xto = 35) #' p + geom_stripped_rows(odd = "blue", even = "yellow") #' p + geom_stripped_rows(odd = "blue", even = "yellow", alpha = .1) #' p + geom_stripped_rows(odd = "#00FF0022", even = "#FF000022") #' #' p + geom_stripped_cols() #' p + geom_stripped_cols(width = 10) #' p + geom_stripped_cols(width = 10, nudge_x = 5) #' } geom_stripped_rows <- function(mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., show.legend = NA, inherit.aes = TRUE, xfrom = -Inf, xto = Inf, width = 1, nudge_y = 0) { ggplot2::layer( data = data, mapping = mapping, stat = stat, geom = GeomStrippedRows, position = position, show.legend = show.legend, inherit.aes = inherit.aes, params = list( xfrom = xfrom, xto = xto, width = width, nudge_y = nudge_y, ... ) ) } GeomStrippedRows <- ggplot2::ggproto("GeomStrippedRows", ggplot2::Geom, required_aes = c("y"), default_aes = ggplot2::aes( odd = "#11111111", even = "#00000000", alpha = NA, colour = NA, linetype = "solid", linewidth = .5 ), draw_key = ggplot2::draw_key_rect, draw_panel = function(data, panel_params, coord, xfrom, xto, width = 1, nudge_y = 0) { ggplot2::GeomRect$draw_panel( data %>% dplyr::mutate( y = round_any(.data$y, width), ymin = .data$y - width / 2 + nudge_y, ymax = .data$y + width / 2 + nudge_y, xmin = xfrom, xmax = xto ) %>% dplyr::select(dplyr::all_of(c( "xmin", "xmax", "ymin", "ymax", "odd", "even", "alpha", "colour", "linetype", "linewidth" ))) %>% dplyr::distinct(.data$ymin, .keep_all = TRUE) %>% dplyr::arrange(.data$ymin) %>% dplyr::mutate( .n = dplyr::row_number(), fill = dplyr::if_else( .data$.n %% 2L == 1L, true = .data$odd, false = .data$even ) ) %>% dplyr::select(-dplyr::all_of(c(".n", "odd", "even"))), panel_params, coord ) } ) #' @rdname geom_stripped_rows #' @param yfrom,yto limitation of the strips along the y-axis #' @export geom_stripped_cols <- function(mapping = NULL, data = NULL, stat = "identity", position = "identity", ..., show.legend = NA, inherit.aes = TRUE, yfrom = -Inf, yto = Inf, width = 1, nudge_x = 0) { ggplot2::layer( data = data, mapping = mapping, stat = stat, geom = GeomStrippedCols, position = position, show.legend = show.legend, inherit.aes = inherit.aes, params = list( yfrom = yfrom, yto = yto, width = width, nudge_x = nudge_x, ... ) ) } GeomStrippedCols <- ggplot2::ggproto("GeomStrippedCols", ggplot2::Geom, required_aes = c("y"), default_aes = ggplot2::aes( odd = "#11111111", even = "#00000000", alpha = NA, colour = NA, linetype = "solid", linewidth = .5 ), draw_key = ggplot2::draw_key_rect, draw_panel = function(data, panel_params, coord, yfrom, yto, width = 1, nudge_x = 0) { ggplot2::GeomRect$draw_panel( data %>% dplyr::mutate( x = round_any(.data$x, width), xmin = .data$x - width / 2 + nudge_x, xmax = .data$x + width / 2 + nudge_x, ymin = yfrom, ymax = yto ) %>% dplyr::select(dplyr::all_of(c( "xmin", "xmax", "ymin", "ymax", "odd", "even", "alpha", "colour", "linetype", "linewidth" ))) %>% dplyr::distinct(.data$xmin, .keep_all = TRUE) %>% dplyr::arrange(.data$xmin) %>% dplyr::mutate( .n = dplyr::row_number(), fill = dplyr::if_else( .data$.n %% 2L == 1L, true = .data$odd, false = .data$even ) ) %>% dplyr::select(-dplyr::all_of(c(".n", "odd", "even"))), panel_params, coord ) } ) # Copied from plyr # Round to multiple of any number. # # @param x numeric or date-time (POSIXct) vector to round # @param accuracy number to round to; for POSIXct objects, a number of seconds # @param f rounding function: \code{\link{floor}}, \code{\link{ceiling}} or # \code{\link{round}} round_any <- function(x, accuracy, f = round) { UseMethod("round_any") } round_any.numeric <- function(x, accuracy, f = round) { f(x / accuracy) * accuracy } round_any.POSIXct <- function(x, accuracy, f = round) { tz <- format(x[1], "%Z") xr <- round_any(as.numeric(x), accuracy, f) as.POSIXct(xr, origin = "1970-01-01 00:00.00 UTC", tz = tz) } ggstats/R/position_likert.R0000644000176200001440000002050114505240725015472 0ustar liggesusers#' Stack objects on top of each another and center them around 0 #' #' `r lifecycle::badge("experimental")` #' #' `position_likert()` stacks proportion bars on top of each other and #' center them around zero (the same number of modalities are displayed on #' each side). This type of presentation is commonly used to display #' Likert-type scales. #' `position_likert_count()` uses counts instead of proportions. #' #' It is recommended to use `position_likert()` with `stat_prop()` #' and its `complete` argument (see examples). #' #' @param vjust Vertical adjustment for geoms that have a position #' (like points or lines), not a dimension (like bars or areas). Set to #' `0` to align with the bottom, `0.5` for the middle, #' and `1` (the default) for the top. #' @param reverse If `TRUE`, will reverse the default stacking order. #' This is useful if you're rotating both the plot and legend. #' @param exclude_fill_values Vector of values from the variable associated with #' the `fill` aesthetic that should not be displayed (but still taken into #' account for computing proportions) #' @seealso See [ggplot2::position_stack()] and [ggplot2::position_fill()] #' @export #' @examples #' library(ggplot2) #' #' ggplot(diamonds) + #' aes(y = clarity, fill = cut) + #' geom_bar(position = "fill") + #' scale_x_continuous(label = scales::label_percent()) + #' scale_fill_brewer(palette = "PiYG") + #' xlab("proportion") #' #' ggplot(diamonds) + #' aes(y = clarity, fill = cut) + #' geom_bar(position = "likert") + #' scale_x_continuous(label = label_percent_abs()) + #' scale_fill_brewer(palette = "PiYG") + #' xlab("proportion") #' #' ggplot(diamonds) + #' aes(y = clarity, fill = cut) + #' geom_bar(position = "stack") + #' scale_fill_brewer(palette = "PiYG") #' #' ggplot(diamonds) + #' aes(y = clarity, fill = cut) + #' geom_bar(position = "likert_count") + #' scale_x_continuous(label = label_number_abs()) + #' scale_fill_brewer(palette = "PiYG") #' #' \donttest{ #' # Reverse order ------------------------------------------------------------- #' #' ggplot(diamonds) + #' aes(y = clarity, fill = cut) + #' geom_bar(position = position_likert(reverse = TRUE)) + #' scale_x_continuous(label = label_percent_abs()) + #' scale_fill_brewer(palette = "PiYG", direction = -1) + #' xlab("proportion") #' #' # Missing items ------------------------------------------------------------- #' # example with a level not being observed for a specific value of y #' d <- diamonds #' d <- d[!(d$cut == "Premium" & d$clarity == "I1"), ] #' d <- d[!(d$cut %in% c("Fair", "Good") & d$clarity == "SI2"), ] #' #' # by default, the two lowest bar are not properly centered #' ggplot(d) + #' aes(y = clarity, fill = cut) + #' geom_bar(position = "likert") + #' scale_fill_brewer(palette = "PiYG") #' #' # use stat_prop() with `complete = "fill"` to fix it #' ggplot(d) + #' aes(y = clarity, fill = cut) + #' geom_bar(position = "likert", stat = "prop", complete = "fill") + #' scale_fill_brewer(palette = "PiYG") #' #' # Add labels ---------------------------------------------------------------- #' #' custom_label <- function(x) { #' p <- scales::percent(x, accuracy = 1) #' p[x < .075] <- "" #' p #' } #' #' ggplot(diamonds) + #' aes(y = clarity, fill = cut) + #' geom_bar(position = "likert") + #' geom_text( #' aes(by = clarity, label = custom_label(after_stat(prop))), #' stat = "prop", #' position = position_likert(vjust = .5) #' ) + #' scale_x_continuous(label = label_percent_abs()) + #' scale_fill_brewer(palette = "PiYG", direction = -1) + #' xlab("proportion") #' #' # Do not display specific fill values --------------------------------------- #' # (but taken into account to compute proportions) #' #' ggplot(diamonds) + #' aes(y = clarity, fill = cut) + #' geom_bar(position = position_likert(exclude_fill_values = "Very Good")) + #' scale_x_continuous(label = label_percent_abs()) + #' scale_fill_brewer(palette = "PiYG") + #' xlab("proportion") #' } position_likert <- function(vjust = 1, reverse = FALSE, exclude_fill_values = NULL) { ggplot2::ggproto( NULL, PositionLikert, vjust = vjust, reverse = reverse, exclude_fill_values = exclude_fill_values ) } #' @export #' @rdname position_likert position_likert_count <- function(vjust = 1, reverse = FALSE, exclude_fill_values = NULL) { ggplot2::ggproto( NULL, PositionLikertCount, vjust = vjust, reverse = reverse, exclude_fill_values = exclude_fill_values ) } #' @rdname position_likert #' @format NULL #' @usage NULL #' @export PositionLikert <- ggplot2::ggproto("PositionLikert", Position, type = NULL, vjust = 1, fill = TRUE, exclude_fill_values = NULL, reverse = FALSE, setup_params = function(self, data) { flipped_aes <- ggplot2::has_flipped_aes(data) data <- ggplot2::flip_data(data, flipped_aes) list( var = self$var %||% likert_var(data), fill = self$fill, vjust = self$vjust, reverse = self$reverse, exclude_fill_values = self$exclude_fill_values, flipped_aes = flipped_aes ) }, setup_data = function(self, data, params) { data <- ggplot2::flip_data(data, params$flipped_aes) if (is.null(params$var)) { return(data) } if (!"ymin" %in% names(data)) data$ymin <- 0 data$ymax <- switch(params$var, y = data$y, ymax = as.numeric(ifelse(data$ymax == 0, data$ymin, data$ymax)) ) data <- ggplot2::remove_missing( data, vars = c("x", "xmin", "xmax", "y"), name = "position_likert" ) ggplot2::flip_data(data, params$flipped_aes) }, compute_panel = function(data, params, scales) { data <- ggplot2::flip_data(data, params$flipped_aes) if (is.null(params$var)) { return(data) } negative <- data$ymax < 0 negative[is.na(negative)] <- FALSE if (any(negative)) { cli::cli_abort("{.fn position_liker} does not work with negative values") } data <- data %>% tidyr::nest(.by = "x", .key = "d") %>% dplyr::mutate( d = purrr::map( .data$d, function(x) { pos_likert( x, vjust = params$vjust, fill = params$fill, reverse = params$reverse, exclude_fill_values = params$exclude_fill_values ) } ) ) %>% tidyr::unnest(cols = "d") ggplot2::flip_data(data, params$flipped_aes) } ) pos_likert <- function(df, vjust = 1, fill = FALSE, reverse = FALSE, exclude_fill_values = NULL) { if (reverse) { df <- df[nrow(df):1, ] # nolint } if (fill) { df$y <- df$y / sum(abs(df$y), na.rm = TRUE) } # Values to be excluded after computation of proportions if (!is.null(exclude_fill_values) && "fill" %in% names(df)) { exclude <- df$fill %in% exclude_fill_values df <- df[!exclude, ] } n <- nrow(df) + 1 y <- ifelse(is.na(df$y), 0, df$y) heights <- c(0, cumsum(y)) df$ymin <- pmin(heights[-n], heights[-1]) df$ymax <- pmax(heights[-n], heights[-1]) df$y <- (1 - vjust) * df$ymin + vjust * df$ymax # Now, we have to center the results if (nrow(df) %% 2 == 0) { y_adjust <- df[nrow(df) / 2, ]$ymax } else { y_adjust <- mean(c( df[nrow(df) %/% 2, ]$ymax, df[nrow(df) %/% 2 + 1, ]$ymax )) } df$y <- df$y - y_adjust df$ymin <- df$ymin - y_adjust df$ymax <- df$ymax - y_adjust df } #' @rdname position_likert #' @format NULL #' @usage NULL #' @export PositionLikertCount <- ggproto("PositionLikertCount", PositionLikert, fill = FALSE ) likert_var <- function(data) { if (!is.null(data$ymax)) { "ymax" } else if (!is.null(data$y)) { "y" } else { cli::cli_warn(c( "Stacking requires either the {.field ymin} {.emph and} {.field ymin}", "or the {.field y} aesthetics", "i" = "Maybe you want {.code position = \"identity\"}?" )) NULL } } ggstats/R/stat_prop.R0000644000176200001440000001405314521172727014300 0ustar liggesusers#' Compute proportions according to custom denominator #' #' `stat_prop()` is a variation of [ggplot2::stat_count()] allowing to #' compute custom proportions according to the **by** aesthetic defining #' the denominator (i.e. all proportions for a same value of **by** will #' sum to 1). The **by** aesthetic should be a factor. If **by** is not #' specified, proportions of the total will be computed. #' #' @inheritParams ggplot2::stat_count #' @param geom Override the default connection with [ggplot2::geom_bar()]. #' @param complete Name (character) of an aesthetic for those statistics should #' be completed for unobserved values (see example) #' @section Aesthetics: #' `stat_prop()` understands the following aesthetics #' (required aesthetics are in bold): #' #' - **x *or* y** #' - by (this aesthetic should be a **factor**) #' - group #' - weight #' @section Computed variables: #' \describe{ #' \item{count}{number of points in bin} #' \item{prop}{computed proportion} #' } #' @seealso `vignette("stat_prop")`, [ggplot2::stat_count()]. For an alternative #' approach, see #' . #' #' @import ggplot2 #' @return A `ggplot2` plot with the added statistic. #' @export #' @examples #' library(ggplot2) #' d <- as.data.frame(Titanic) #' #' p <- ggplot(d) + #' aes(x = Class, fill = Survived, weight = Freq, by = Class) + #' geom_bar(position = "fill") + #' geom_text(stat = "prop", position = position_fill(.5)) #' p #' p + facet_grid(~Sex) #' #' ggplot(d) + #' aes(x = Class, fill = Survived, weight = Freq) + #' geom_bar(position = "dodge") + #' geom_text( #' aes(by = Survived), #' stat = "prop", #' position = position_dodge(0.9), vjust = "bottom" #' ) #' \donttest{ #' if (requireNamespace("scales")) { #' ggplot(d) + #' aes(x = Class, fill = Survived, weight = Freq, by = 1) + #' geom_bar() + #' geom_text( #' aes(label = scales::percent(after_stat(prop), accuracy = 1)), #' stat = "prop", #' position = position_stack(.5) #' ) #' } #' #' # displaying unobserved levels with complete #' d <- diamonds %>% #' dplyr::filter(!(cut == "Ideal" & clarity == "I1")) %>% #' dplyr::filter(!(cut == "Very Good" & clarity == "VS2")) %>% #' dplyr::filter(!(cut == "Premium" & clarity == "IF")) #' p <- ggplot(d) + #' aes(x = clarity, fill = cut, by = clarity) + #' geom_bar(position = "fill") #' p + geom_text(stat = "prop", position = position_fill(.5)) #' p + geom_text(stat = "prop", position = position_fill(.5), complete = "fill") #' } stat_prop <- function(mapping = NULL, data = NULL, geom = "bar", position = "fill", ..., width = NULL, na.rm = FALSE, orientation = NA, show.legend = NA, inherit.aes = TRUE, complete = NULL) { params <- list( na.rm = na.rm, orientation = orientation, width = width, complete = complete, ... ) if (!is.null(params$y)) { cli::cli_abort( "{.fn stat_prop} must not be used with a {.arg y} aesthetic.", call. = FALSE ) } layer( data = data, mapping = mapping, stat = StatProp, geom = geom, position = position, show.legend = show.legend, inherit.aes = inherit.aes, params = params ) } #' @rdname stat_prop #' @format NULL #' @usage NULL #' @export StatProp <- ggplot2::ggproto("StatProp", ggplot2::Stat, required_aes = c("x|y"), default_aes = ggplot2::aes( x = after_stat(count), y = after_stat(count), weight = 1, label = scales::percent(after_stat(prop), accuracy = .1), by = 1 ), setup_params = function(data, params) { params$flipped_aes <- ggplot2::has_flipped_aes( data, params, main_is_orthogonal = FALSE ) has_x <- !(is.null(data$x) && is.null(params$x)) has_y <- !(is.null(data$y) && is.null(params$y)) if (!has_x && !has_y) { cli::cli_abort( "{.fn stat_prop} requires an {.arg x} or {.arg y} aesthetic.", call. = FALSE ) } if (has_x && has_y) { cli::cli_abort( "{.fn stat_prop} can only have an {.arg x} or an {.arg y} aesthetic.", call. = FALSE ) } # there is an unresolved bug when by is a character vector. To be explored. if (is.character(data$by)) { cli::cli_abort( "The {.arg by} aesthetic should be a factor instead of a character.", call. = FALSE ) } params }, extra_params = c("na.rm"), compute_panel = function(self, data, scales, width = NULL, flipped_aes = FALSE, complete = NULL) { data <- ggplot2::flip_data(data, flipped_aes) data$weight <- data$weight %||% rep(1, nrow(data)) data$by <- data$by %||% rep(1, nrow(data)) width <- width %||% (ggplot2::resolution(data$x) * 0.9) # sum weights for each combination of by and aesthetics # the use of . allows to consider all aesthetics defined in data panel <- stats::aggregate(weight ~ ., data = data, sum, na.rm = TRUE) names(panel)[which(names(panel) == "weight")] <- "count" panel$count[is.na(panel$count)] <- 0 if (!is.null(complete)) { panel <- panel %>% dplyr::select(-dplyr::all_of("group")) cols <- names(panel) cols <- cols[!cols %in% c("count", complete)] panel <- panel %>% tidyr::complete( tidyr::nesting(!!!syms(cols)), .data[[complete]], fill = list(count = 0) ) %>% dplyr::mutate(group = seq_len(dplyr::n())) } # compute proportions by by sum_abs <- function(x) { sum(abs(x)) } panel$prop <- panel$count / ave(panel$count, panel$by, FUN = sum_abs) panel$width <- width panel$flipped_aes <- flipped_aes ggplot2::flip_data(panel, flipped_aes) } ) ggstats/R/stat_weighted_mean.R0000644000176200001440000000704114415736022016113 0ustar liggesusers#' Compute weighted y mean #' #' This statistic will compute the mean of **y** aesthetic for #' each unique value of **x**, taking into account **weight** #' aesthetic if provided. #' #' @section Computed variables: #' \describe{ #' \item{y}{weighted y (numerator / denominator)} #' \item{numerator}{numerator} #' \item{denominator}{denominator} #' } #' #' @inheritParams ggplot2::stat_bin #' @param geom Override the default connection with [ggplot2::geom_point()]. #' @seealso `vignette("stat_weighted_mean")` #' @export #' @return A `ggplot2` plot with the added statistic. #' @examplesIf requireNamespace("reshape") #' @examples #' library(ggplot2) #' #' data(tips, package = "reshape") #' #' ggplot(tips) + #' aes(x = day, y = total_bill) + #' geom_point() #' #' ggplot(tips) + #' aes(x = day, y = total_bill) + #' stat_weighted_mean() #' #' \donttest{ #' ggplot(tips) + #' aes(x = day, y = total_bill, group = 1) + #' stat_weighted_mean(geom = "line") #' #' ggplot(tips) + #' aes(x = day, y = total_bill, colour = sex, group = sex) + #' stat_weighted_mean(geom = "line") #' #' ggplot(tips) + #' aes(x = day, y = total_bill, fill = sex) + #' stat_weighted_mean(geom = "bar", position = "dodge") #' #' # computing a proportion on the fly #' if (requireNamespace("scales")) { #' ggplot(tips) + #' aes(x = day, y = as.integer(smoker == "Yes"), fill = sex) + #' stat_weighted_mean(geom = "bar", position = "dodge") + #' scale_y_continuous(labels = scales::percent) #' } #' } #' @examples #' library(ggplot2) #' #' # taking into account some weights #' if (requireNamespace("scales")) { #' d <- as.data.frame(Titanic) #' ggplot(d) + #' aes( #' x = Class, y = as.integer(Survived == "Yes"), #' weight = Freq, fill = Sex #' ) + #' geom_bar(stat = "weighted_mean", position = "dodge") + #' scale_y_continuous(labels = scales::percent) + #' labs(y = "Survived") #' } stat_weighted_mean <- function(mapping = NULL, data = NULL, geom = "point", position = "identity", ..., na.rm = FALSE, orientation = NA, show.legend = NA, inherit.aes = TRUE) { layer( data = data, mapping = mapping, stat = StatWeightedMean, geom = geom, position = position, show.legend = show.legend, inherit.aes = inherit.aes, params = list( na.rm = na.rm, orientation = orientation, ... ) ) } #' @rdname stat_weighted_mean #' @format NULL #' @usage NULL #' @export StatWeightedMean <- ggplot2::ggproto( "StatSummary", ggplot2::Stat, required_aes = c("x", "y"), extra_params = c("na.rm", "orientation"), setup_params = function(data, params) { params$flipped_aes <- ggplot2::has_flipped_aes(data, params) params }, compute_panel = function(data, scales, na.rm = FALSE, flipped_aes = FALSE) { data <- ggplot2::flip_data(data, flipped_aes) if (is.null(data$weight)) { data$weight <- rep(1, nrow(data)) } summarised <- aggregate( cbind(numerator = y * weight, denominator = weight) ~ ., data, FUN = sum, na.rm = TRUE ) summarised$y <- summarised$numerator / summarised$denominator summarised$flipped_aes <- flipped_aes ggplot2::flip_data(summarised, flipped_aes) } ) ggstats/NEWS.md0000644000176200001440000000460514526740307013041 0ustar liggesusers# ggstats 0.5.1 **Bug fixes** * fix in `ggcoef_model()` and other similar functions: Unicode character removed in significance labels (#49) # ggstats 0.5.0 **Improvements** * New options `labels_color` and `totals_color` in `gglikert()` and `gglikert_stacked()` (#43) **Bug fixes** * fix in `ggcoef_multicomponents()` when `type = "table"` and `exponentiate = TRUE` * fix in `gglikert()`: the function could be called directly with `ggstats::gglikert()` without requiring the full package to be loaded (#47) # ggstats 0.4.0 **New features** * new function `ggcoef_table()` displaying a coefficient table at the right of the forest plot (#32) * new function `ggcoef_multicomponents()` for multi-components models such as zero-inflated Poisson or beta regressions (#38) * new type `"table"` for `ggcoef_multinom()` **Improvements** * `gglikert()` now aligns total proportions when faceting (#28) * new `weights` argument for `gglikert()`, `gglikert_stacked()` and `gglikert_data()` (#29) * new `y` argument for `gglikert()` and `gglikert_stacked()` (#31) * new `facet_label_wrap` argument for `gglikert()` (#31) **New helpers** * `weighted.median()` and `weighted.quantile()` functions # ggstats 0.3.0 **New features** * New functions `gglikert()`, `gglikert_stacked()` and `gglikert_data()` (#25) * New positions `position_likert()` and `position_likert_count()` (#25) * New `complete` argument for `stat_prop()` (#25) **Bug fixes** * Bug fix in `ggcoef_compare()` to preserve the order of model terms and to avoid an error with `add_reference_rows = FALSE` (#23) # ggstats 0.2.1 * Bug fix in `geom_stripped_rows()` and `geom_stripped_cols()` (#20) # ggstats 0.2.0 * Support for pairwise contrasts (#14) * New argument `tidy_args` in `ggcoef_*()` to pass additional arguments to `broom.helpers::tidy_plus_plus()` and to `tidy_fun` (#17) * Now requires `ggplot2` version 3.4.0 or more (#15) * Following change in `geom_rect()`, the `size` aesthetic is now deprecated in `geom_stripped_cols()` and `geom_stripped_rows()`: please use the `linewidth` aesthetic instead (#15) # ggstats 0.1.1 * Examples relying on Internet resources have been removed (#11) # ggstats 0.1.0 * First version, based on dev version of GGally * Fix in `ggcoef_multinom()` to display y levels not listed in `y.level_label` * `stat_cross()` now returns phi coefficients (see also `augment_chisq_add_phi()`) (#6) ggstats/MD50000644000176200001440000001120714527062732012247 0ustar liggesusers3bc2f2bf8ab750b72c2147fdba6d0e4b *DESCRIPTION 65b488e04e44bb301a08b373cb3c9693 *NAMESPACE bc04526d7f0468039a9551433ff4c895 *NEWS.md cf34415121e0aa57f259dba4e06866aa *R/geom_stripped_rows.R 5c243e7c0df515f9b4f2fbef989d3ea8 *R/ggcoef_model.R 81976595603f18d281aa9322e0764783 *R/gglikert.R 785baff7f28ac10882a96390415d3fde *R/ggstats-package.R 953e73724fb8b0057a745c342d9d7b22 *R/ggsurvey.R a95a65e0203d3997a670412fe89b241e *R/label_number_abs.R f1c23d46fc1b87deed7ec449e43f7ef0 *R/position_likert.R da97f039976af617e0731565c2ab13f3 *R/signif_stars.R 99b353be4443b25205a7855c5033cdfb *R/stat_cross.R b464b1e75a3fa7fbcbc703290f67a5d2 *R/stat_prop.R c272fb908cff0cc0d37625c706c55173 *R/stat_weighted_mean.R e66c066897522ee3c467fb77654268ba *R/weighted_quantile.R dbb0fa2cb83c82745280a406ca8a8e35 *README.md 25cb8442b2e61bfedeba0966fce11c6c *build/vignette.rds 2b5fe1a6da1dd2c209e88374640a5be5 *inst/WORDLIST c1d15f67c74fdf723141ae00b11f8b67 *inst/doc/ggcoef_model.R a14187f86a3bf91b478fd324035be31b *inst/doc/ggcoef_model.Rmd b977a714199ba118b4c22b7fcd89d6f2 *inst/doc/ggcoef_model.html 50a81ba9e4d71c1608de099ccae55d19 *inst/doc/gglikert.R 0174086e4de965ef43115efd225cc158 *inst/doc/gglikert.Rmd e6b4b232847e4555d1f45c1e48a35951 *inst/doc/gglikert.html 1b8081a7ad40a9dd366a906798cc7300 *inst/doc/stat_cross.R bceaf8ba6563ed5198d7e8cd325ec352 *inst/doc/stat_cross.Rmd 65df1cb843d8383a8d0314de414650dc *inst/doc/stat_cross.html 0b49f043d3eb28e00a6ad3382fd7aabb *inst/doc/stat_prop.R 6f355e1456b3a32fa5804c21f7dc79c9 *inst/doc/stat_prop.Rmd 3ef36016cb193dd3cf04532110690d9a *inst/doc/stat_prop.html 06e26e6b115620dae874f56b5ff93f16 *inst/doc/stat_weighted_mean.R 8b364b651c16d66c845af52161f585b4 *inst/doc/stat_weighted_mean.Rmd 3a62117a67287bb71fb173276c8b964a *inst/doc/stat_weighted_mean.html 613bdbc4f6d83370ef17432b0635bdb6 *man/augment_chisq_add_phi.Rd d4b31c59a09aaacf79b5d6e2f6f54410 *man/figures/README-unnamed-chunk-10-1.png efd1211d099a4efe2fa538769a7ca5ee *man/figures/README-unnamed-chunk-4-1.png 2a3632c8ce22c9ede4449226b6b451dc *man/figures/README-unnamed-chunk-4-2.png 575c0ef96464cbb8407d5a4a9fc09460 *man/figures/README-unnamed-chunk-5-1.png 26bc986ca05c849525c9f65927f70c48 *man/figures/README-unnamed-chunk-6-1.png 0f3bbcb02296489326e03a6976510f08 *man/figures/README-unnamed-chunk-7-1.png a0af3bdceb9785f64fb2564300968aa8 *man/figures/README-unnamed-chunk-8-1.png 62d00e2afb8419746cb89eac6ee19636 *man/figures/README-unnamed-chunk-9-1.png cb1e46f469cfbbbde29c8b5113e1d789 *man/figures/lifecycle-archived.svg c0d2e5a54f1fa4ff02bf9533079dd1f7 *man/figures/lifecycle-defunct.svg a1b8c987c676c16af790f563f96cbb1f *man/figures/lifecycle-deprecated.svg c3978703d8f40f2679795335715e98f4 *man/figures/lifecycle-experimental.svg 952b59dc07b171b97d5d982924244f61 *man/figures/lifecycle-maturing.svg 27b879bf3677ea76e3991d56ab324081 *man/figures/lifecycle-questioning.svg 53b3f893324260b737b3c46ed2a0e643 *man/figures/lifecycle-stable.svg 1c1fe7a759b86dc6dbcbe7797ab8246c *man/figures/lifecycle-superseded.svg 65602434b4aca85dd5858954c95ea53c *man/geom_stripped_rows.Rd adc0e72cf676f74b0ff29d014300fa85 *man/ggcoef_model.Rd b075e197a0ac641114df7d650c0aeba3 *man/gglikert.Rd 50314b608b6fe16a9bfdc8c272728f37 *man/ggstats-package.Rd f2624df8c40b05c304bbb80250b12770 *man/ggsurvey.Rd bef0fc568741dd32e6550f8ac1365e25 *man/label_number_abs.Rd 4e82698cc427fc21bf87b8df43faf1f0 *man/position_likert.Rd c7c574b02920d4ec80f79de0242d68e4 *man/reexports.Rd ed4752eb6f36bd29490e9b14dc376a21 *man/signif_stars.Rd a55203404d80c8154181bcd0ef3fd1a9 *man/stat_cross.Rd 169a9177433d873febc19692c0a3bdaf *man/stat_prop.Rd 817b2cbcc995a2c02b333f565e022601 *man/stat_weighted_mean.Rd ca70a231c2ab7cd2bfb3ef3ebc63fe4f *man/weighted.median.Rd 52e4cfc6848ac432bf7398e4f7b41889 *tests/spelling.R c7758487796fdefd18c2ffda1429ca23 *tests/testthat.R ec7aca3f4fb8c5b0db00a135c302d4d1 *tests/testthat/test-geom_stripped.R 68157038f752797d2a4f4bd71852b2c3 *tests/testthat/test-ggcoef_model.R a486f81629cdf968ff9389418ddcbb59 *tests/testthat/test-gglikert.R 21147e88a12e864d26c765fc07fbad6c *tests/testthat/test-position_likert.R 54aea6d92dcb6888b7e1d85b47e27c41 *tests/testthat/test-stat_cross.R a32e3bd0ff6f3079e4eed343fb08a98d *tests/testthat/test-stat_prop.R a850c441de4f629f540d2bffe68711cd *tests/testthat/test-stat_weighted_mean.R 0119550466225fe3d5699f0eea2125e2 *tests/testthat/test-utilities.R 86c282eb2902de49b32827e44ab5113a *tests/testthat/test_ggsurvey.R a14187f86a3bf91b478fd324035be31b *vignettes/ggcoef_model.Rmd 0174086e4de965ef43115efd225cc158 *vignettes/gglikert.Rmd bceaf8ba6563ed5198d7e8cd325ec352 *vignettes/stat_cross.Rmd 6f355e1456b3a32fa5804c21f7dc79c9 *vignettes/stat_prop.Rmd 8b364b651c16d66c845af52161f585b4 *vignettes/stat_weighted_mean.Rmd ggstats/inst/0000755000176200001440000000000014527052762012715 5ustar liggesusersggstats/inst/doc/0000755000176200001440000000000014527052762013462 5ustar liggesusersggstats/inst/doc/ggcoef_model.R0000644000176200001440000001551014527052730016214 0ustar liggesusers## ----include = FALSE---------------------------------------------------------- knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ## ----setup-------------------------------------------------------------------- library(ggstats) ## ----ggcoef-reg--------------------------------------------------------------- data(tips, package = "reshape") mod_simple <- lm(tip ~ day + time + total_bill, data = tips) ggcoef_model(mod_simple) ## ----ggcoef-titanic----------------------------------------------------------- d_titanic <- as.data.frame(Titanic) d_titanic$Survived <- factor(d_titanic$Survived, c("No", "Yes")) mod_titanic <- glm( Survived ~ Sex * Age + Class, weights = Freq, data = d_titanic, family = binomial ) ggcoef_model(mod_titanic, exponentiate = TRUE) ## ----------------------------------------------------------------------------- library(labelled) tips_labelled <- tips %>% set_variable_labels( day = "Day of the week", time = "Lunch or Dinner", total_bill = "Bill's total" ) mod_labelled <- lm(tip ~ day + time + total_bill, data = tips_labelled) ggcoef_model(mod_labelled) ## ----------------------------------------------------------------------------- ggcoef_model( mod_simple, variable_labels = c( day = "Week day", time = "Time (lunch or dinner ?)", total_bill = "Total of the bill" ) ) ## ----------------------------------------------------------------------------- ggcoef_model( mod_simple, variable_labels = c( day = "Week day", time = "Time (lunch or dinner ?)", total_bill = "Total of the bill" ), facet_labeller = ggplot2::label_wrap_gen(10) ) ## ----------------------------------------------------------------------------- ggcoef_model(mod_simple, facet_row = NULL, colour_guide = TRUE) ## ----------------------------------------------------------------------------- ggcoef_model(mod_titanic, exponentiate = TRUE) ggcoef_model( mod_titanic, exponentiate = TRUE, show_p_values = FALSE, signif_stars = FALSE, add_reference_rows = FALSE, categorical_terms_pattern = "{level} (ref: {reference_level})", interaction_sep = " x " ) + ggplot2::scale_y_discrete(labels = scales::label_wrap(15)) ## ----------------------------------------------------------------------------- mod_titanic2 <- glm( Survived ~ Sex * Age + Class, weights = Freq, data = d_titanic, family = binomial, contrasts = list(Sex = contr.sum, Class = contr.treatment(4, base = 3)) ) ggcoef_model(mod_titanic2, exponentiate = TRUE) ## ----------------------------------------------------------------------------- mod_poly <- lm(Sepal.Length ~ poly(Petal.Width, 3) + Petal.Length, data = iris) ggcoef_model(mod_poly) ## ----------------------------------------------------------------------------- ggcoef_model( mod_titanic2, exponentiate = TRUE, no_reference_row = "Sex" ) ggcoef_model( mod_titanic2, exponentiate = TRUE, no_reference_row = broom.helpers::all_dichotomous() ) ggcoef_model( mod_titanic2, exponentiate = TRUE, no_reference_row = broom.helpers::all_categorical(), categorical_terms_pattern = "{level}/{reference_level}" ) ## ----------------------------------------------------------------------------- ggcoef_model(mod_simple, intercept = TRUE) ## ----------------------------------------------------------------------------- ggcoef_model(mod_simple, conf.int = FALSE) ## ----------------------------------------------------------------------------- ggcoef_model(mod_simple, significance = NULL) ## ----------------------------------------------------------------------------- ggcoef_model(mod_simple, colour = NULL) ## ----------------------------------------------------------------------------- ggcoef_model(mod_simple, include = c("time", "total_bill")) ## ----------------------------------------------------------------------------- ggcoef_model(mod_simple, include = dplyr::starts_with("t")) ## ----------------------------------------------------------------------------- ggcoef_model(mod_simple, stripped_rows = FALSE) ## ----------------------------------------------------------------------------- ggcoef_model(mod_simple) + ggplot2::xlab("Coefficients") + ggplot2::ggtitle("Custom title") + ggplot2::scale_color_brewer(palette = "Set1") + ggplot2::theme(legend.position = "right") ## ----------------------------------------------------------------------------- ggcoef_table(mod_simple) ggcoef_table(mod_titanic, exponentiate = TRUE) ## ----------------------------------------------------------------------------- ggcoef_table( mod_simple, table_stat = c("label", "estimate", "std.error", "ci"), ci_pattern = "{conf.low} to {conf.high}", table_stat_label = list( estimate = scales::label_number(accuracy = .001), conf.low = scales::label_number(accuracy = .01), conf.high = scales::label_number(accuracy = .01), std.error = scales::label_number(accuracy = .001), label = toupper ), table_header = c("Term", "Coef.", "SE", "CI"), table_witdhs = c(2, 3) ) ## ----------------------------------------------------------------------------- library(nnet) hec <- as.data.frame(HairEyeColor) mod <- multinom( Hair ~ Eye + Sex, data = hec, weights = hec$Freq ) ggcoef_multinom( mod, exponentiate = TRUE ) ggcoef_multinom( mod, exponentiate = TRUE, type = "faceted" ) ## ----fig.height=9, fig.width=6------------------------------------------------ ggcoef_multinom( mod, exponentiate = TRUE, type = "table" ) ## ----------------------------------------------------------------------------- ggcoef_multinom( mod, type = "faceted", y.level_label = c("Brown" = "Brown\n(ref: Black)"), exponentiate = TRUE ) ## ----------------------------------------------------------------------------- library(pscl) data("bioChemists", package = "pscl") mod <- zeroinfl(art ~ fem * mar | fem + mar, data = bioChemists) ggcoef_multicomponents(mod) ggcoef_multicomponents(mod, type = "f") ## ----fig.height=7, fig.width=6------------------------------------------------ ggcoef_multicomponents(mod, type = "t") ggcoef_multicomponents( mod, type = "t", component_label = c(conditional = "Count", zero_inflated = "Zero-inflated") ) ## ----------------------------------------------------------------------------- mod1 <- lm(Fertility ~ ., data = swiss) mod2 <- step(mod1, trace = 0) mod3 <- lm(Fertility ~ Agriculture + Education * Catholic, data = swiss) models <- list( "Full model" = mod1, "Simplified model" = mod2, "With interaction" = mod3 ) ggcoef_compare(models) ggcoef_compare(models, type = "faceted") ## ----echo=FALSE--------------------------------------------------------------- broom.helpers::supported_models %>% knitr::kable() ggstats/inst/doc/gglikert.Rmd0000644000176200001440000001652214504775450015745 0ustar liggesusers--- title: "Plot Likert-type items with `gglikert()`" author: Joseph Larmarange output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Plot Likert-type items with `gglikert()`} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(ggstats) library(dplyr) library(ggplot2) ``` The purpose of `gglikert()` is to generate a centered bar plot comparing the answers of several questions sharing a common Likert-type scale. ## Generating an example dataset ```{r} likert_levels <- c( "Strongly disagree", "Disagree", "Neither agree nor disagree", "Agree", "Strongly agree" ) set.seed(42) df <- tibble( q1 = sample(likert_levels, 150, replace = TRUE), q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1), q3 = sample(likert_levels, 150, replace = TRUE, prob = 1:5), q4 = sample(likert_levels, 150, replace = TRUE, prob = 1:5), q5 = sample(c(likert_levels, NA), 150, replace = TRUE), q6 = sample(likert_levels, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0)) ) %>% mutate(across(everything(), ~ factor(.x, levels = likert_levels))) likert_levels_dk <- c( "Strongly disagree", "Disagree", "Neither agree nor disagree", "Agree", "Strongly agree", "Don't know" ) df_dk <- tibble( q1 = sample(likert_levels_dk, 150, replace = TRUE), q2 = sample(likert_levels_dk, 150, replace = TRUE, prob = 6:1), q3 = sample(likert_levels_dk, 150, replace = TRUE, prob = 1:6), q4 = sample(likert_levels_dk, 150, replace = TRUE, prob = 1:6), q5 = sample(c(likert_levels_dk, NA), 150, replace = TRUE), q6 = sample( likert_levels_dk, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0, 1) ) ) %>% mutate(across(everything(), ~ factor(.x, levels = likert_levels_dk))) ``` ## Quick plot Simply call `gglikert()`. ```{r} gglikert(df) ``` The list of variables to plot (all by default) could by specify with `include`. This argument accepts tidy-select syntax. ```{r} gglikert(df, include = q1:q3) ``` ## Customizing the plot The generated plot is a standard `ggplot2` object. You can therefore use `ggplot2` functions to custom many aspects. ```{r} gglikert(df) + ggtitle("A Likert-type items plot", subtitle = "generated with gglikert()") + scale_fill_brewer(palette = "RdYlBu") ``` ### Sorting the questions You can sort the plot with `sort`. ```{r} gglikert(df, sort = "ascending") ``` By default, the plot is sorted based on the proportion being higher than the center level, i.e. in this case the proportion of answers equal to "Agree" or "Strongly Agree". Alternatively, the questions could be transformed into a score and sorted accorded to their mean. ```{r} gglikert(df, sort = "ascending", sort_method = "mean") ``` ### Sorting the answers You can reverse the order of the answers with `reverse_likert`. ```{r} gglikert(df, reverse_likert = TRUE) ``` ### Proportion labels Proportion labels could be removed with `add_labels = FALSE`. ```{r} gglikert(df, add_labels = FALSE) ``` or customized. ```{r} gglikert( df, labels_size = 3, labels_accuracy = .1, labels_hide_below = .2, labels_color = "white" ) ``` ### Totals on each side By default, totals are added on each side of the plot. In case of an uneven number of answer levels, the central level is not taken into account for computing totals. With `totals_include_center = TRUE`, half of the proportion of the central level will be added on each side. ```{r} gglikert( df, totals_include_center = TRUE, sort = "descending", sort_prop_include_center = TRUE ) ``` Totals could be customized. ```{r} gglikert( df, totals_size = 4, totals_color = "blue", totals_fontface = "italic", totals_hjust = .20 ) ``` Or removed. ```{r} gglikert(df, add_totals = FALSE) ``` ## Variable labels If you are using variable labels (see `labelled::set_variable_labels()`), they will be taken automatically into account by `gglikert()`. ```{r} if (require(labelled)) { df <- df %>% set_variable_labels( q1 = "first question", q2 = "second question", q3 = "this is the third question with a quite long variable label" ) } gglikert(df) ``` You can also provide custom variable labels with `variable_labels`. ```{r} gglikert( df, variable_labels = c( q1 = "alternative label for the first question", q6 = "another custom label" ) ) ``` You can control how variable labels are wrapped with `y_label_wrap`. ```{r} gglikert(df, y_label_wrap = 20) gglikert(df, y_label_wrap = 200) ``` ## Removing certain values Sometimes, the dataset could contain certain values that you should not be displayed. ```{r} gglikert(df_dk) ``` A first option could be to convert the don't knows into `NA`. In such case, the proportions will be computed on non missing. ```{r} df_dk %>% mutate(across(everything(), ~ factor(.x, levels = likert_levels))) %>% gglikert() ``` Or, you could use `exclude_fill_values` to not display specific values, but still counting them in the denominator for computing proportions. ```{r} df_dk %>% gglikert(exclude_fill_values = "Don't know") ``` ## Facets To define facets, use `facet_rows` and/or `facet_cols`. ```{r message=FALSE} df_group <- df df_group$group1 <- sample(c("A", "B"), 150, replace = TRUE) df_group$group2 <- sample(c("a", "b", "c"), 150, replace = TRUE) gglikert(df_group, q1:q6, facet_cols = vars(group1), labels_size = 3 ) gglikert(df_group, q1:q2, facet_rows = vars(group1, group2), labels_size = 3 ) gglikert(df_group, q3:q6, facet_cols = vars(group1), facet_rows = vars(group2), labels_size = 3 ) + scale_x_continuous( labels = label_percent_abs(), expand = expansion(0, .2) ) ``` To compare answers by subgroup, you can alternatively map `.question` to facets, and define a grouping variable for `y`. ```{r} gglikert(df_group, q1:q4, y = "group1", facet_rows = vars(.question), labels_size = 3, facet_label_wrap = 15 ) ``` ## Stacked plot For a more classical stacked bar plot, you can use `gglikert_stacked()`. ```{r} gglikert_stacked(df) gglikert_stacked( df, sort = "asc", add_median_line = TRUE, add_labels = FALSE ) gglikert_stacked( df_group, include = q1:q4, y = "group2" ) + facet_grid( rows = vars(.question), labeller = label_wrap_gen(15) ) ``` ## Long format dataset Internally, `gglikert()` is calling `gglikert_data()` to generate a long format dataset combining all questions into two columns, `.question` and `.answer`. ```{r} gglikert_data(df) %>% head() ``` Such dataset could be useful for other types of plot, for example for a classic stacked bar plot. ```{r} ggplot(gglikert_data(df)) + aes(y = .question, fill = .answer) + geom_bar(position = "fill") ``` ## Weighted data `gglikert()`, `gglikert_stacked()` and `gglikert_data()` accepts a `weights` argument, allowing to specify statistical weights. ```{r} df$sampling_weights <- runif(nrow(df)) gglikert(df, q1:q4, weights = sampling_weights) ``` ## See also The function `position_likert()` used to center bars. ggstats/inst/doc/stat_cross.R0000644000176200001440000000317314527052753015775 0ustar liggesusers## ----include = FALSE---------------------------------------------------------- knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ## ----setup-------------------------------------------------------------------- library(ggstats) library(ggplot2) ## ----------------------------------------------------------------------------- d <- as.data.frame(Titanic) ggplot(d) + aes(x = Class, y = Survived, weight = Freq, size = after_stat(observed)) + stat_cross() + scale_size_area(max_size = 20) ## ----fig.height=6, fig.width=6------------------------------------------------ ggplot(d) + aes( x = Class, y = Survived, weight = Freq, size = after_stat(observed), fill = after_stat(std.resid) ) + stat_cross(shape = 22) + scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) + scale_size_area(max_size = 20) ## ----------------------------------------------------------------------------- ggplot(d) + aes(x = Class, y = Survived, weight = Freq) + geom_tile(fill = "white", colour = "black") + geom_text(stat = "cross", mapping = aes(label = after_stat(observed))) + theme_minimal() ## ----------------------------------------------------------------------------- ggplot(d) + aes( x = Class, y = Survived, weight = Freq, label = scales::percent(after_stat(col.prop), accuracy = .1), fill = after_stat(std.resid) ) + stat_cross(shape = 22, size = 30) + geom_text(stat = "cross") + scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) + facet_grid(rows = vars(Sex)) + labs(fill = "Standardized residuals") + theme_minimal() ggstats/inst/doc/stat_prop.Rmd0000644000176200001440000001017314462704257016144 0ustar liggesusers--- title: "Compute custom proportions with `stat_prop()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Compute custom proportions with `stat_prop()`} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(ggstats) library(ggplot2) ``` `stat_prop()` is a variation of `ggplot2::stat_count()` allowing to compute custom proportions according to the **by** aesthetic defining the denominator (i.e. all proportions for a same value of **by** will sum to 1). The **by** aesthetic should be a factor. Therefore, `stat_prop()` requires the **by** aesthetic and this **by** aesthetic should be a factor. ## Adding labels on a percent stacked bar plot When using `position = "fill"` with `geom_bar()`, you can produce a percent stacked bar plot. However, the proportions corresponding to the **y** axis are not directly accessible using only `ggplot2`. With `stat_prop()`, you can easily add them on the plot. In the following example, we indicated `stat = "prop"` to `ggplot2::geom_text()` to use `stat_prop()`, we defined the **by** aesthetic (here we want to compute the proportions separately for each value of **x**), and we also used `ggplot2::position_fill()` when calling `ggplot2::geom_text()`. ```{r} d <- as.data.frame(Titanic) p <- ggplot(d) + aes(x = Class, fill = Survived, weight = Freq, by = Class) + geom_bar(position = "fill") + geom_text(stat = "prop", position = position_fill(.5)) p ``` Note that `stat_prop()` has properly taken into account the **weight** aesthetic. `stat_prop()` is also compatible with faceting. In that case, proportions are computed separately in each facet. ```{r} p + facet_grid(cols = vars(Sex)) ``` ## Displaying proportions of the total If you want to display proportions of the total, simply map the **by** aesthetic to `1`. Here an example using a stacked bar chart. ```{r} ggplot(d) + aes(x = Class, fill = Survived, weight = Freq, by = 1) + geom_bar() + geom_text( aes(label = scales::percent(after_stat(prop), accuracy = 1)), stat = "prop", position = position_stack(.5) ) ``` ## A dodged bar plot to compare two distributions A dodged bar plot could be used to compare two distributions. ```{r} ggplot(d) + aes(x = Class, fill = Sex, weight = Freq, by = Sex) + geom_bar(position = "dodge") ``` On the previous graph, it is difficult to see if first class is over- or under-represented among women, due to the fact they were much more men on the boat. `stat_prop()` could be used to adjust the graph by displaying instead the proportion within each category (i.e. here the proportion by sex). ```{r} ggplot(d) + aes(x = Class, fill = Sex, weight = Freq, by = Sex, y = after_stat(prop)) + geom_bar(stat = "prop", position = "dodge") + scale_y_continuous(labels = scales::percent) ``` The same example with labels: ```{r} ggplot(d) + aes(x = Class, fill = Sex, weight = Freq, by = Sex, y = after_stat(prop)) + geom_bar(stat = "prop", position = "dodge") + scale_y_continuous(labels = scales::percent) + geom_text( mapping = aes( label = scales::percent(after_stat(prop), accuracy = .1), y = after_stat(0.01) ), vjust = "bottom", position = position_dodge(.9), stat = "prop" ) ``` ## Displaying unobserved levels With the `complete` argument, it is possible to indicate an aesthetic for those statistics should be completed for unobserved values. ```{r} d <- diamonds %>% dplyr::filter(!(cut == "Ideal" & clarity == "I1")) %>% dplyr::filter(!(cut == "Very Good" & clarity == "VS2")) %>% dplyr::filter(!(cut == "Premium" & clarity == "IF")) p <- ggplot(d) + aes(x = clarity, fill = cut, by = clarity) + geom_bar(position = "fill") p + geom_text( stat = "prop", position = position_fill(.5) ) ``` Adding `complete = "fill"` will generate "0.0%" labels where relevant. ```{r} p + geom_text( stat = "prop", position = position_fill(.5), complete = "fill" ) ``` ggstats/inst/doc/gglikert.R0000644000176200001440000001372714527052751015425 0ustar liggesusers## ----include = FALSE---------------------------------------------------------- knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ## ----setup-------------------------------------------------------------------- library(ggstats) library(dplyr) library(ggplot2) ## ----------------------------------------------------------------------------- likert_levels <- c( "Strongly disagree", "Disagree", "Neither agree nor disagree", "Agree", "Strongly agree" ) set.seed(42) df <- tibble( q1 = sample(likert_levels, 150, replace = TRUE), q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1), q3 = sample(likert_levels, 150, replace = TRUE, prob = 1:5), q4 = sample(likert_levels, 150, replace = TRUE, prob = 1:5), q5 = sample(c(likert_levels, NA), 150, replace = TRUE), q6 = sample(likert_levels, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0)) ) %>% mutate(across(everything(), ~ factor(.x, levels = likert_levels))) likert_levels_dk <- c( "Strongly disagree", "Disagree", "Neither agree nor disagree", "Agree", "Strongly agree", "Don't know" ) df_dk <- tibble( q1 = sample(likert_levels_dk, 150, replace = TRUE), q2 = sample(likert_levels_dk, 150, replace = TRUE, prob = 6:1), q3 = sample(likert_levels_dk, 150, replace = TRUE, prob = 1:6), q4 = sample(likert_levels_dk, 150, replace = TRUE, prob = 1:6), q5 = sample(c(likert_levels_dk, NA), 150, replace = TRUE), q6 = sample( likert_levels_dk, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0, 1) ) ) %>% mutate(across(everything(), ~ factor(.x, levels = likert_levels_dk))) ## ----------------------------------------------------------------------------- gglikert(df) ## ----------------------------------------------------------------------------- gglikert(df, include = q1:q3) ## ----------------------------------------------------------------------------- gglikert(df) + ggtitle("A Likert-type items plot", subtitle = "generated with gglikert()") + scale_fill_brewer(palette = "RdYlBu") ## ----------------------------------------------------------------------------- gglikert(df, sort = "ascending") ## ----------------------------------------------------------------------------- gglikert(df, sort = "ascending", sort_method = "mean") ## ----------------------------------------------------------------------------- gglikert(df, reverse_likert = TRUE) ## ----------------------------------------------------------------------------- gglikert(df, add_labels = FALSE) ## ----------------------------------------------------------------------------- gglikert( df, labels_size = 3, labels_accuracy = .1, labels_hide_below = .2, labels_color = "white" ) ## ----------------------------------------------------------------------------- gglikert( df, totals_include_center = TRUE, sort = "descending", sort_prop_include_center = TRUE ) ## ----------------------------------------------------------------------------- gglikert( df, totals_size = 4, totals_color = "blue", totals_fontface = "italic", totals_hjust = .20 ) ## ----------------------------------------------------------------------------- gglikert(df, add_totals = FALSE) ## ----------------------------------------------------------------------------- if (require(labelled)) { df <- df %>% set_variable_labels( q1 = "first question", q2 = "second question", q3 = "this is the third question with a quite long variable label" ) } gglikert(df) ## ----------------------------------------------------------------------------- gglikert( df, variable_labels = c( q1 = "alternative label for the first question", q6 = "another custom label" ) ) ## ----------------------------------------------------------------------------- gglikert(df, y_label_wrap = 20) gglikert(df, y_label_wrap = 200) ## ----------------------------------------------------------------------------- gglikert(df_dk) ## ----------------------------------------------------------------------------- df_dk %>% mutate(across(everything(), ~ factor(.x, levels = likert_levels))) %>% gglikert() ## ----------------------------------------------------------------------------- df_dk %>% gglikert(exclude_fill_values = "Don't know") ## ----message=FALSE------------------------------------------------------------ df_group <- df df_group$group1 <- sample(c("A", "B"), 150, replace = TRUE) df_group$group2 <- sample(c("a", "b", "c"), 150, replace = TRUE) gglikert(df_group, q1:q6, facet_cols = vars(group1), labels_size = 3 ) gglikert(df_group, q1:q2, facet_rows = vars(group1, group2), labels_size = 3 ) gglikert(df_group, q3:q6, facet_cols = vars(group1), facet_rows = vars(group2), labels_size = 3 ) + scale_x_continuous( labels = label_percent_abs(), expand = expansion(0, .2) ) ## ----------------------------------------------------------------------------- gglikert(df_group, q1:q4, y = "group1", facet_rows = vars(.question), labels_size = 3, facet_label_wrap = 15 ) ## ----------------------------------------------------------------------------- gglikert_stacked(df) gglikert_stacked( df, sort = "asc", add_median_line = TRUE, add_labels = FALSE ) gglikert_stacked( df_group, include = q1:q4, y = "group2" ) + facet_grid( rows = vars(.question), labeller = label_wrap_gen(15) ) ## ----------------------------------------------------------------------------- gglikert_data(df) %>% head() ## ----------------------------------------------------------------------------- ggplot(gglikert_data(df)) + aes(y = .question, fill = .answer) + geom_bar(position = "fill") ## ----------------------------------------------------------------------------- df$sampling_weights <- runif(nrow(df)) gglikert(df, q1:q4, weights = sampling_weights) ggstats/inst/doc/stat_prop.html0000644000176200001440000036654314527052760016402 0ustar liggesusers Compute custom proportions with stat_prop()

Compute custom proportions with stat_prop()

library(ggstats)
library(ggplot2)

stat_prop() is a variation of ggplot2::stat_count() allowing to compute custom proportions according to the by aesthetic defining the denominator (i.e. all proportions for a same value of by will sum to 1). The by aesthetic should be a factor. Therefore, stat_prop() requires the by aesthetic and this by aesthetic should be a factor.

Adding labels on a percent stacked bar plot

When using position = "fill" with geom_bar(), you can produce a percent stacked bar plot. However, the proportions corresponding to the y axis are not directly accessible using only ggplot2. With stat_prop(), you can easily add them on the plot.

In the following example, we indicated stat = "prop" to ggplot2::geom_text() to use stat_prop(), we defined the by aesthetic (here we want to compute the proportions separately for each value of x), and we also used ggplot2::position_fill() when calling ggplot2::geom_text().

d <- as.data.frame(Titanic)
p <- ggplot(d) +
  aes(x = Class, fill = Survived, weight = Freq, by = Class) +
  geom_bar(position = "fill") +
  geom_text(stat = "prop", position = position_fill(.5))
p

Note that stat_prop() has properly taken into account the weight aesthetic.

stat_prop() is also compatible with faceting. In that case, proportions are computed separately in each facet.

p + facet_grid(cols = vars(Sex))

Displaying proportions of the total

If you want to display proportions of the total, simply map the by aesthetic to 1. Here an example using a stacked bar chart.

ggplot(d) +
  aes(x = Class, fill = Survived, weight = Freq, by = 1) +
  geom_bar() +
  geom_text(
    aes(label = scales::percent(after_stat(prop), accuracy = 1)),
    stat = "prop",
    position = position_stack(.5)
  )

A dodged bar plot to compare two distributions

A dodged bar plot could be used to compare two distributions.

ggplot(d) +
  aes(x = Class, fill = Sex, weight = Freq, by = Sex) +
  geom_bar(position = "dodge")

On the previous graph, it is difficult to see if first class is over- or under-represented among women, due to the fact they were much more men on the boat. stat_prop() could be used to adjust the graph by displaying instead the proportion within each category (i.e. here the proportion by sex).

ggplot(d) +
  aes(x = Class, fill = Sex, weight = Freq, by = Sex, y = after_stat(prop)) +
  geom_bar(stat = "prop", position = "dodge") +
  scale_y_continuous(labels = scales::percent)

The same example with labels:

ggplot(d) +
  aes(x = Class, fill = Sex, weight = Freq, by = Sex, y = after_stat(prop)) +
  geom_bar(stat = "prop", position = "dodge") +
  scale_y_continuous(labels = scales::percent) +
  geom_text(
    mapping = aes(
      label = scales::percent(after_stat(prop), accuracy = .1),
      y = after_stat(0.01)
    ),
    vjust = "bottom",
    position = position_dodge(.9),
    stat = "prop"
  )

Displaying unobserved levels

With the complete argument, it is possible to indicate an aesthetic for those statistics should be completed for unobserved values.

d <- diamonds %>%
  dplyr::filter(!(cut == "Ideal" & clarity == "I1")) %>%
  dplyr::filter(!(cut == "Very Good" & clarity == "VS2")) %>%
  dplyr::filter(!(cut == "Premium" & clarity == "IF"))
p <- ggplot(d) +
  aes(x = clarity, fill = cut, by = clarity) +
  geom_bar(position = "fill")
p +
  geom_text(
    stat = "prop",
    position = position_fill(.5)
  )

Adding complete = "fill" will generate “0.0%” labels where relevant.

p +
  geom_text(
    stat = "prop",
    position = position_fill(.5),
    complete = "fill"
  )

ggstats/inst/doc/ggcoef_model.html0000644000176200001440000106314514527052732016771 0ustar liggesusers Plot model coefficients with ggcoef_model()

Plot model coefficients with ggcoef_model()

Joseph Larmarange

library(ggstats)

The purpose of ggcoef_model() is to quickly plot the coefficients of a model. It is an updated and improved version of GGally::ggcoef() based on broom.helpers::tidy_plus_plus(). For displaying a nicely formatted table of the same models, look at gtsummary::tbl_regression().

Quick coefficients plot

To work automatically, this function requires the {broom.helpers}. Simply call ggcoef_model() with a model object. It could be the result of stats::lm, stats::glm or any other model covered by {broom.helpers}.

data(tips, package = "reshape")
mod_simple <- lm(tip ~ day + time + total_bill, data = tips)
ggcoef_model(mod_simple)

In the case of a logistic regression (or any other model for which coefficients are usually exponentiated), simply indicated exponentiate = TRUE. Note that a logarithmic scale will be used for the x-axis.

d_titanic <- as.data.frame(Titanic)
d_titanic$Survived <- factor(d_titanic$Survived, c("No", "Yes"))
mod_titanic <- glm(
  Survived ~ Sex * Age + Class,
  weights = Freq,
  data = d_titanic,
  family = binomial
)
ggcoef_model(mod_titanic, exponentiate = TRUE)

Customizing the plot

Variable labels

You can use the {labelled} package to define variable labels. They will be automatically used by ggcoef_model(). Note that variable labels should be defined before computing the model.

library(labelled)
tips_labelled <- tips %>%
  set_variable_labels(
    day = "Day of the week",
    time = "Lunch or Dinner",
    total_bill = "Bill's total"
  )
mod_labelled <- lm(tip ~ day + time + total_bill, data = tips_labelled)
ggcoef_model(mod_labelled)

You can also define custom variable labels directly by passing a named vector to the variable_labels option.

ggcoef_model(
  mod_simple,
  variable_labels = c(
    day = "Week day",
    time = "Time (lunch or dinner ?)",
    total_bill = "Total of the bill"
  )
)

If variable labels are to long, you can pass ggplot2::label_wrap_gen() or any other labeller function to facet_labeller.

ggcoef_model(
  mod_simple,
  variable_labels = c(
    day = "Week day",
    time = "Time (lunch or dinner ?)",
    total_bill = "Total of the bill"
  ),
  facet_labeller = ggplot2::label_wrap_gen(10)
)

Use facet_row = NULL to hide variable names.

ggcoef_model(mod_simple, facet_row = NULL, colour_guide = TRUE)

Term labels

Several options allows you to customize term labels.

ggcoef_model(mod_titanic, exponentiate = TRUE)

ggcoef_model(
  mod_titanic,
  exponentiate = TRUE,
  show_p_values = FALSE,
  signif_stars = FALSE,
  add_reference_rows = FALSE,
  categorical_terms_pattern = "{level} (ref: {reference_level})",
  interaction_sep = " x "
) +
  ggplot2::scale_y_discrete(labels = scales::label_wrap(15))
#> Scale for y is already present.
#> Adding another scale for y, which will replace the existing scale.

By default, for categorical variables using treatment and sum contrasts, reference rows will be added and displayed on the graph.

mod_titanic2 <- glm(
  Survived ~ Sex * Age + Class,
  weights = Freq,
  data = d_titanic,
  family = binomial,
  contrasts = list(Sex = contr.sum, Class = contr.treatment(4, base = 3))
)
ggcoef_model(mod_titanic2, exponentiate = TRUE)

Continuous variables with polynomial terms defined with stats::poly() are also properly managed.

mod_poly <- lm(Sepal.Length ~ poly(Petal.Width, 3) + Petal.Length, data = iris)
ggcoef_model(mod_poly)

Use no_reference_row to indicate which variables should not have a reference row added.

ggcoef_model(
  mod_titanic2,
  exponentiate = TRUE,
  no_reference_row = "Sex"
)

ggcoef_model(
  mod_titanic2,
  exponentiate = TRUE,
  no_reference_row = broom.helpers::all_dichotomous()
)

ggcoef_model(
  mod_titanic2,
  exponentiate = TRUE,
  no_reference_row = broom.helpers::all_categorical(),
  categorical_terms_pattern = "{level}/{reference_level}"
)

Elements to display

Use intercept = TRUE to display intercepts.

ggcoef_model(mod_simple, intercept = TRUE)

You can remove confidence intervals with conf.int = FALSE.

ggcoef_model(mod_simple, conf.int = FALSE)

By default, significant terms (i.e. with a p-value below 5%) are highlighted using two types of dots. You can control the level of significance with significance or remove it with significance = NULL.

ggcoef_model(mod_simple, significance = NULL)

By default, dots are colored by variable. You can deactivate this behavior with colour = NULL.

ggcoef_model(mod_simple, colour = NULL)

You can display only a subset of terms with include.

ggcoef_model(mod_simple, include = c("time", "total_bill"))

It is possible to use tidyselect helpers.

ggcoef_model(mod_simple, include = dplyr::starts_with("t"))

You can remove stripped rows with stripped_rows = FALSE.

ggcoef_model(mod_simple, stripped_rows = FALSE)

Do not hesitate to consult the help file of ggcoef_model() to see all available options.

ggplot2 elements

The plot returned by ggcoef_model() is a classic ggplot2 plot. You can therefore apply ggplot2 functions to it.

ggcoef_model(mod_simple) +
  ggplot2::xlab("Coefficients") +
  ggplot2::ggtitle("Custom title") +
  ggplot2::scale_color_brewer(palette = "Set1") +
  ggplot2::theme(legend.position = "right")
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.

Forest plot with a coefficient table

ggcoef_table() is a variant of ggcoef_model() displaying a coefficient table on the right of the forest plot.

ggcoef_table(mod_simple)

ggcoef_table(mod_titanic, exponentiate = TRUE)

You can easily customize the columns to be displayed.

ggcoef_table(
  mod_simple,
  table_stat = c("label", "estimate", "std.error", "ci"),
  ci_pattern = "{conf.low} to {conf.high}",
  table_stat_label = list(
    estimate = scales::label_number(accuracy = .001),
    conf.low = scales::label_number(accuracy = .01),
    conf.high = scales::label_number(accuracy = .01),
    std.error = scales::label_number(accuracy = .001),
    label = toupper
  ),
  table_header = c("Term", "Coef.", "SE", "CI"),
  table_witdhs = c(2, 3)
)

Multinomial models

For multinomial models, simply use ggcoef_multinom(). Three types of visualizations are available: "dodged", "faceted" and "table".

library(nnet)
hec <- as.data.frame(HairEyeColor)
mod <- multinom(
  Hair ~ Eye + Sex,
  data = hec,
  weights = hec$Freq
)
#> # weights:  24 (15 variable)
#> initial  value 820.686262 
#> iter  10 value 669.061500
#> iter  20 value 658.888977
#> final  value 658.885327 
#> converged
ggcoef_multinom(
  mod,
  exponentiate = TRUE
)

ggcoef_multinom(
  mod,
  exponentiate = TRUE,
  type = "faceted"
)

ggcoef_multinom(
  mod,
  exponentiate = TRUE,
  type = "table"
)

You can use y.level_label to customize the label of each level.

ggcoef_multinom(
  mod,
  type = "faceted",
  y.level_label = c("Brown" = "Brown\n(ref: Black)"),
  exponentiate = TRUE
)

Multi-components models

Multi-components models such as zero-inflated Poisson or beta regression generate a set of terms for each of their components. You can use ggcoef_multicomponents() which is similar to ggcoef_multinom().

library(pscl)
#> Classes and Methods for R developed in the
#> Political Science Computational Laboratory
#> Department of Political Science
#> Stanford University
#> Simon Jackman
#> hurdle and zeroinfl functions by Achim Zeileis
data("bioChemists", package = "pscl")
mod <- zeroinfl(art ~ fem * mar | fem + mar, data = bioChemists)

ggcoef_multicomponents(mod)
#> ℹ <zeroinfl> model detected.
#> ✔ `tidy_zeroinfl()` used instead.
#> ℹ Add `tidy_fun = broom.helpers::tidy_zeroinfl` to quiet these messages.

ggcoef_multicomponents(mod, type = "f")
#> ℹ <zeroinfl> model detected.
#> ✔ `tidy_zeroinfl()` used instead.
#> ℹ Add `tidy_fun = broom.helpers::tidy_zeroinfl` to quiet these messages.

ggcoef_multicomponents(mod, type = "t")
#> ℹ <zeroinfl> model detected.
#> ✔ `tidy_zeroinfl()` used instead.
#> ℹ Add `tidy_fun = broom.helpers::tidy_zeroinfl` to quiet these messages.

ggcoef_multicomponents(
  mod,
  type = "t",
  component_label = c(conditional = "Count", zero_inflated = "Zero-inflated")
)
#> ℹ <zeroinfl> model detected.
#> ✔ `tidy_zeroinfl()` used instead.
#> ℹ Add `tidy_fun = broom.helpers::tidy_zeroinfl` to quiet these messages.

Comparing several models

You can easily compare several models with ggcoef_compare(). To be noted, ggcoef_compare() is not compatible with multinomial or multi-components models.

mod1 <- lm(Fertility ~ ., data = swiss)
mod2 <- step(mod1, trace = 0)
mod3 <- lm(Fertility ~ Agriculture + Education * Catholic, data = swiss)
models <- list(
  "Full model" = mod1,
  "Simplified model" = mod2,
  "With interaction" = mod3
)

ggcoef_compare(models)

ggcoef_compare(models, type = "faceted")

Advanced users

Advanced users could use their own dataset and pass it to ggcoef_plot(). Such dataset could be produced by ggcoef_model(), ggcoef_compare() or ggcoef_multinom() with the option return_data = TRUE or by using broom::tidy() or broom.helpers::tidy_plus_plus().

Supported models

model notes
betareg::betareg() Use tidy_parameters() as tidy_fun with component argument to control with coefficients to return. broom::tidy() does not support theexponentiateargument for betareg models, usetidy_parameters() instead.| |biglm::bigglm()| | |biglmm::bigglm()| | |brms::brm()|broom.mixedpackage required | |cmprsk::crr()|Limited support. It is recommended to usetidycmprsk::crr()instead. | |fixest::feglm()|May fail with R <= 4.0. | |fixest::femlm()|May fail with R <= 4.0. | |fixest::feNmlm()|May fail with R <= 4.0. | |fixest::feols()|May fail with R <= 4.0. | |gam::gam()| | |geepack::geeglm()| | |glmmTMB::glmmTMB()|broom.mixedpackage required | |lavaan::lavaan()|Limited support for categorical variables | |lfe::felm()| | |lme4::glmer.nb()|broom.mixedpackage required | |lme4::glmer()|broom.mixedpackage required | |lme4::lmer()|broom.mixedpackage required | |logitr::logitr()|Requires logitr >= 0.8.0 | |MASS::glm.nb()| | |MASS::polr()| | |mgcv::gam()|Use default tidierbroom::tidy()for smooth terms only, orgtsummary::tidy_gam()to include parametric terms | |mice::mira|Limited support. Ifmodis amiraobject, usetidy_plus_plus(mod, tidy_fun = function(x, …) mice::pool(x) %>% mice::tidy(…))| |multgee::nomLORgee()|Experimental support. Usetidy_multgee()astidy_fun. | |multgee::ordLORgee()|Experimental support. Usetidy_multgee()astidy_fun. | |nnet::multinom()| | |ordinal::clm()|Limited support for models with nominal predictors. | |ordinal::clmm()|Limited support for models with nominal predictors. | |parsnip::model_fit|Supported as long as the type of model and the engine is supported. | |plm::plm()| | |pscl::hurdle()|Usetidy_zeroinfl()astidy_fun. | |pscl::zeroinfl()|Usetidy_zeroinfl()astidy_fun. | |rstanarm::stan_glm()|broom.mixedpackage required | |stats::aov()|Reference rows are not relevant for such models. | |stats::glm()| | |stats::lm()| | |stats::nls()|Limited support | |survey::svycoxph()| | |survey::svyglm()| | |survey::svyolr()| | |survival::clogit()| | |survival::coxph()| | |survival::survreg()| | |tidycmprsk::crr()| | |VGAM::vglm()|Limited support. It is recommended to usetidy_parameters()astidy_fun`.

Note: this list of models has been tested. {broom.helpers}, and therefore ggcoef_model(), may or may not work properly or partially with other types of models.

ggstats/inst/doc/gglikert.html0000644000176200001440000155247614527052751016202 0ustar liggesusers Plot Likert-type items with gglikert()

Plot Likert-type items with gglikert()

Joseph Larmarange

library(ggstats)
library(dplyr)
#> 
#> Attachement du package : 'dplyr'
#> Les objets suivants sont masqués depuis 'package:stats':
#> 
#>     filter, lag
#> Les objets suivants sont masqués depuis 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(ggplot2)

The purpose of gglikert() is to generate a centered bar plot comparing the answers of several questions sharing a common Likert-type scale.

Generating an example dataset

likert_levels <- c(
  "Strongly disagree",
  "Disagree",
  "Neither agree nor disagree",
  "Agree",
  "Strongly agree"
)
set.seed(42)
df <-
  tibble(
    q1 = sample(likert_levels, 150, replace = TRUE),
    q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1),
    q3 = sample(likert_levels, 150, replace = TRUE, prob = 1:5),
    q4 = sample(likert_levels, 150, replace = TRUE, prob = 1:5),
    q5 = sample(c(likert_levels, NA), 150, replace = TRUE),
    q6 = sample(likert_levels, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0))
  ) %>%
  mutate(across(everything(), ~ factor(.x, levels = likert_levels)))

likert_levels_dk <- c(
  "Strongly disagree",
  "Disagree",
  "Neither agree nor disagree",
  "Agree",
  "Strongly agree",
  "Don't know"
)
df_dk <-
  tibble(
    q1 = sample(likert_levels_dk, 150, replace = TRUE),
    q2 = sample(likert_levels_dk, 150, replace = TRUE, prob = 6:1),
    q3 = sample(likert_levels_dk, 150, replace = TRUE, prob = 1:6),
    q4 = sample(likert_levels_dk, 150, replace = TRUE, prob = 1:6),
    q5 = sample(c(likert_levels_dk, NA), 150, replace = TRUE),
    q6 = sample(
      likert_levels_dk, 150,
      replace = TRUE, prob = c(1, 0, 1, 1, 0, 1)
    )
  ) %>%
  mutate(across(everything(), ~ factor(.x, levels = likert_levels_dk)))

Quick plot

Simply call gglikert().

gglikert(df)

The list of variables to plot (all by default) could by specify with include. This argument accepts tidy-select syntax.

gglikert(df, include = q1:q3)

Customizing the plot

The generated plot is a standard ggplot2 object. You can therefore use ggplot2 functions to custom many aspects.

gglikert(df) +
  ggtitle("A Likert-type items plot", subtitle = "generated with gglikert()") +
  scale_fill_brewer(palette = "RdYlBu")
#> Scale for fill is already present.
#> Adding another scale for fill, which will replace the existing scale.

Sorting the questions

You can sort the plot with sort.

gglikert(df, sort = "ascending")

By default, the plot is sorted based on the proportion being higher than the center level, i.e. in this case the proportion of answers equal to “Agree” or “Strongly Agree”. Alternatively, the questions could be transformed into a score and sorted accorded to their mean.

gglikert(df, sort = "ascending", sort_method = "mean")

Sorting the answers

You can reverse the order of the answers with reverse_likert.

gglikert(df, reverse_likert = TRUE)

Proportion labels

Proportion labels could be removed with add_labels = FALSE.

gglikert(df, add_labels = FALSE)

or customized.

gglikert(
  df,
  labels_size = 3,
  labels_accuracy = .1,
  labels_hide_below = .2,
  labels_color = "white"
)

Totals on each side

By default, totals are added on each side of the plot. In case of an uneven number of answer levels, the central level is not taken into account for computing totals. With totals_include_center = TRUE, half of the proportion of the central level will be added on each side.

gglikert(
  df,
  totals_include_center = TRUE,
  sort = "descending",
  sort_prop_include_center = TRUE
)

Totals could be customized.

gglikert(
  df,
  totals_size = 4,
  totals_color = "blue",
  totals_fontface = "italic",
  totals_hjust = .20
)

Or removed.

gglikert(df, add_totals = FALSE)

Variable labels

If you are using variable labels (see labelled::set_variable_labels()), they will be taken automatically into account by gglikert().

if (require(labelled)) {
  df <- df %>%
    set_variable_labels(
      q1 = "first question",
      q2 = "second question",
      q3 = "this is the third question with a quite long variable label"
    )
}
gglikert(df)

You can also provide custom variable labels with variable_labels.

gglikert(
  df,
  variable_labels = c(
    q1 = "alternative label for the first question",
    q6 = "another custom label"
  )
)

You can control how variable labels are wrapped with y_label_wrap.

gglikert(df, y_label_wrap = 20)

gglikert(df, y_label_wrap = 200)

Removing certain values

Sometimes, the dataset could contain certain values that you should not be displayed.

gglikert(df_dk)

A first option could be to convert the don’t knows into NA. In such case, the proportions will be computed on non missing.

df_dk %>%
  mutate(across(everything(), ~ factor(.x, levels = likert_levels))) %>%
  gglikert()

Or, you could use exclude_fill_values to not display specific values, but still counting them in the denominator for computing proportions.

df_dk %>% gglikert(exclude_fill_values = "Don't know")

Facets

To define facets, use facet_rows and/or facet_cols.

df_group <- df
df_group$group1 <- sample(c("A", "B"), 150, replace = TRUE)
df_group$group2 <- sample(c("a", "b", "c"), 150, replace = TRUE)

gglikert(df_group,
  q1:q6,
  facet_cols = vars(group1),
  labels_size = 3
)

gglikert(df_group,
  q1:q2,
  facet_rows = vars(group1, group2),
  labels_size = 3
)

gglikert(df_group,
  q3:q6,
  facet_cols = vars(group1),
  facet_rows = vars(group2),
  labels_size = 3
) +
  scale_x_continuous(
    labels = label_percent_abs(),
    expand = expansion(0, .2)
  )

To compare answers by subgroup, you can alternatively map .question to facets, and define a grouping variable for y.

gglikert(df_group,
  q1:q4,
  y = "group1",
  facet_rows = vars(.question),
  labels_size = 3,
  facet_label_wrap = 15
)

Stacked plot

For a more classical stacked bar plot, you can use gglikert_stacked().

gglikert_stacked(df)


gglikert_stacked(
  df,
  sort = "asc",
  add_median_line = TRUE,
  add_labels = FALSE
)


gglikert_stacked(
  df_group,
  include = q1:q4,
  y = "group2"
) +
  facet_grid(
    rows = vars(.question),
    labeller = label_wrap_gen(15)
  )

Long format dataset

Internally, gglikert() is calling gglikert_data() to generate a long format dataset combining all questions into two columns, .question and .answer.

gglikert_data(df) %>%
  head()
#> # A tibble: 6 × 3
#>   .weights .question                                                   .answer  
#>      <dbl> <fct>                                                       <fct>    
#> 1        1 first question                                              Strongly…
#> 2        1 second question                                             Disagree 
#> 3        1 this is the third question with a quite long variable label Agree    
#> 4        1 q4                                                          Disagree 
#> 5        1 q5                                                          Strongly…
#> 6        1 q6                                                          Strongly…

Such dataset could be useful for other types of plot, for example for a classic stacked bar plot.

ggplot(gglikert_data(df)) +
  aes(y = .question, fill = .answer) +
  geom_bar(position = "fill")

Weighted data

gglikert(), gglikert_stacked() and gglikert_data() accepts a weights argument, allowing to specify statistical weights.

df$sampling_weights <- runif(nrow(df))
gglikert(df, q1:q4, weights = sampling_weights)

See also

The function position_likert() used to center bars.

ggstats/inst/doc/ggcoef_model.Rmd0000644000176200001440000002212514466120077016537 0ustar liggesusers--- title: "Plot model coefficients with `ggcoef_model()`" author: Joseph Larmarange output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Plot model coefficients with `ggcoef_model()`} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(ggstats) ``` The purpose of `ggcoef_model()` is to quickly plot the coefficients of a model. It is an updated and improved version of `GGally::ggcoef()` based on `broom.helpers::tidy_plus_plus()`. For displaying a nicely formatted table of the same models, look at `gtsummary::tbl_regression()`. ## Quick coefficients plot To work automatically, this function requires the `{broom.helpers}`. Simply call `ggcoef_model()` with a model object. It could be the result of `stats::lm`, `stats::glm` or any other model covered by `{broom.helpers}`. ```{r ggcoef-reg} data(tips, package = "reshape") mod_simple <- lm(tip ~ day + time + total_bill, data = tips) ggcoef_model(mod_simple) ``` In the case of a logistic regression (or any other model for which coefficients are usually exponentiated), simply indicated `exponentiate = TRUE`. Note that a logarithmic scale will be used for the x-axis. ```{r ggcoef-titanic} d_titanic <- as.data.frame(Titanic) d_titanic$Survived <- factor(d_titanic$Survived, c("No", "Yes")) mod_titanic <- glm( Survived ~ Sex * Age + Class, weights = Freq, data = d_titanic, family = binomial ) ggcoef_model(mod_titanic, exponentiate = TRUE) ``` ## Customizing the plot ### Variable labels You can use the `{labelled}` package to define variable labels. They will be automatically used by `ggcoef_model()`. Note that variable labels should be defined before computing the model. ```{r} library(labelled) tips_labelled <- tips %>% set_variable_labels( day = "Day of the week", time = "Lunch or Dinner", total_bill = "Bill's total" ) mod_labelled <- lm(tip ~ day + time + total_bill, data = tips_labelled) ggcoef_model(mod_labelled) ``` You can also define custom variable labels directly by passing a named vector to the `variable_labels` option. ```{r} ggcoef_model( mod_simple, variable_labels = c( day = "Week day", time = "Time (lunch or dinner ?)", total_bill = "Total of the bill" ) ) ``` If variable labels are to long, you can pass `ggplot2::label_wrap_gen()` or any other labeller function to `facet_labeller.` ```{r} ggcoef_model( mod_simple, variable_labels = c( day = "Week day", time = "Time (lunch or dinner ?)", total_bill = "Total of the bill" ), facet_labeller = ggplot2::label_wrap_gen(10) ) ``` Use `facet_row = NULL` to hide variable names. ```{r} ggcoef_model(mod_simple, facet_row = NULL, colour_guide = TRUE) ``` ### Term labels Several options allows you to customize term labels. ```{r} ggcoef_model(mod_titanic, exponentiate = TRUE) ggcoef_model( mod_titanic, exponentiate = TRUE, show_p_values = FALSE, signif_stars = FALSE, add_reference_rows = FALSE, categorical_terms_pattern = "{level} (ref: {reference_level})", interaction_sep = " x " ) + ggplot2::scale_y_discrete(labels = scales::label_wrap(15)) ``` By default, for categorical variables using treatment and sum contrasts, reference rows will be added and displayed on the graph. ```{r} mod_titanic2 <- glm( Survived ~ Sex * Age + Class, weights = Freq, data = d_titanic, family = binomial, contrasts = list(Sex = contr.sum, Class = contr.treatment(4, base = 3)) ) ggcoef_model(mod_titanic2, exponentiate = TRUE) ``` Continuous variables with polynomial terms defined with `stats::poly()` are also properly managed. ```{r} mod_poly <- lm(Sepal.Length ~ poly(Petal.Width, 3) + Petal.Length, data = iris) ggcoef_model(mod_poly) ``` Use `no_reference_row` to indicate which variables should not have a reference row added. ```{r} ggcoef_model( mod_titanic2, exponentiate = TRUE, no_reference_row = "Sex" ) ggcoef_model( mod_titanic2, exponentiate = TRUE, no_reference_row = broom.helpers::all_dichotomous() ) ggcoef_model( mod_titanic2, exponentiate = TRUE, no_reference_row = broom.helpers::all_categorical(), categorical_terms_pattern = "{level}/{reference_level}" ) ``` ### Elements to display Use `intercept = TRUE` to display intercepts. ```{r} ggcoef_model(mod_simple, intercept = TRUE) ``` You can remove confidence intervals with `conf.int = FALSE`. ```{r} ggcoef_model(mod_simple, conf.int = FALSE) ``` By default, significant terms (i.e. with a p-value below 5%) are highlighted using two types of dots. You can control the level of significance with `significance` or remove it with `significance = NULL`. ```{r} ggcoef_model(mod_simple, significance = NULL) ``` By default, dots are colored by variable. You can deactivate this behavior with `colour = NULL`. ```{r} ggcoef_model(mod_simple, colour = NULL) ``` You can display only a subset of terms with **include**. ```{r} ggcoef_model(mod_simple, include = c("time", "total_bill")) ``` It is possible to use `tidyselect` helpers. ```{r} ggcoef_model(mod_simple, include = dplyr::starts_with("t")) ``` You can remove stripped rows with `stripped_rows = FALSE`. ```{r} ggcoef_model(mod_simple, stripped_rows = FALSE) ``` Do not hesitate to consult the help file of `ggcoef_model()` to see all available options. ### ggplot2 elements The plot returned by `ggcoef_model()` is a classic `ggplot2` plot. You can therefore apply `ggplot2` functions to it. ```{r} ggcoef_model(mod_simple) + ggplot2::xlab("Coefficients") + ggplot2::ggtitle("Custom title") + ggplot2::scale_color_brewer(palette = "Set1") + ggplot2::theme(legend.position = "right") ``` ## Forest plot with a coefficient table `ggcoef_table()` is a variant of `ggcoef_model()` displaying a coefficient table on the right of the forest plot. ```{r} ggcoef_table(mod_simple) ggcoef_table(mod_titanic, exponentiate = TRUE) ``` You can easily customize the columns to be displayed. ```{r} ggcoef_table( mod_simple, table_stat = c("label", "estimate", "std.error", "ci"), ci_pattern = "{conf.low} to {conf.high}", table_stat_label = list( estimate = scales::label_number(accuracy = .001), conf.low = scales::label_number(accuracy = .01), conf.high = scales::label_number(accuracy = .01), std.error = scales::label_number(accuracy = .001), label = toupper ), table_header = c("Term", "Coef.", "SE", "CI"), table_witdhs = c(2, 3) ) ``` ## Multinomial models For multinomial models, simply use `ggcoef_multinom()`. Three types of visualizations are available: `"dodged"`, `"faceted"` and `"table"`. ```{r} library(nnet) hec <- as.data.frame(HairEyeColor) mod <- multinom( Hair ~ Eye + Sex, data = hec, weights = hec$Freq ) ggcoef_multinom( mod, exponentiate = TRUE ) ggcoef_multinom( mod, exponentiate = TRUE, type = "faceted" ) ``` ```{r, fig.height=9, fig.width=6} ggcoef_multinom( mod, exponentiate = TRUE, type = "table" ) ``` You can use `y.level_label` to customize the label of each level. ```{r} ggcoef_multinom( mod, type = "faceted", y.level_label = c("Brown" = "Brown\n(ref: Black)"), exponentiate = TRUE ) ``` ## Multi-components models Multi-components models such as zero-inflated Poisson or beta regression generate a set of terms for each of their components. You can use `ggcoef_multicomponents()` which is similar to `ggcoef_multinom()`. ```{r} library(pscl) data("bioChemists", package = "pscl") mod <- zeroinfl(art ~ fem * mar | fem + mar, data = bioChemists) ggcoef_multicomponents(mod) ggcoef_multicomponents(mod, type = "f") ``` ```{r, fig.height=7, fig.width=6} ggcoef_multicomponents(mod, type = "t") ggcoef_multicomponents( mod, type = "t", component_label = c(conditional = "Count", zero_inflated = "Zero-inflated") ) ``` ## Comparing several models You can easily compare several models with `ggcoef_compare()`. To be noted, `ggcoef_compare()` is not compatible with multinomial or multi-components models. ```{r} mod1 <- lm(Fertility ~ ., data = swiss) mod2 <- step(mod1, trace = 0) mod3 <- lm(Fertility ~ Agriculture + Education * Catholic, data = swiss) models <- list( "Full model" = mod1, "Simplified model" = mod2, "With interaction" = mod3 ) ggcoef_compare(models) ggcoef_compare(models, type = "faceted") ``` ## Advanced users Advanced users could use their own dataset and pass it to `ggcoef_plot()`. Such dataset could be produced by `ggcoef_model()`, `ggcoef_compare()` or `ggcoef_multinom()` with the option `return_data = TRUE` or by using `broom::tidy()` or `broom.helpers::tidy_plus_plus()`. ## Supported models ```{r, echo=FALSE} broom.helpers::supported_models %>% knitr::kable() ``` Note: this list of models has been tested. `{broom.helpers}`, and therefore `ggcoef_model()`, may or may not work properly or partially with other types of models. ggstats/inst/doc/stat_cross.Rmd0000644000176200001440000000601514357760262016316 0ustar liggesusers--- title: "Compute cross-tabulation statistics with `stat_cross()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Compute cross-tabulation statistics with `stat_cross()`} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(ggstats) library(ggplot2) ``` This statistic is intended to be used with two discrete variables mapped to **x** and **y** aesthetics. It will compute several statistics of a cross-tabulated table using `broom::tidy.test()` and `stats::chisq.test()`. More precisely, the computed variables are: - **observed**: number of observations in x,y - **prop**: proportion of total - **row.prop**: row proportion - **col.prop**: column proportion - **expected**: expected count under the null hypothesis - **resid**: Pearson's residual - **std.resid**: standardized residual - **row.observed**: total number of observations within row - **col.observed**: total number of observations within column - **total.observed**: total number of observations within the table - **phi**: phi coefficients, see `augment_chisq_add_phi()` By default, `stat_cross()` is using `ggplot2::geom_points()`. If you want to plot the number of observations, you need to map `after_stat(observed)` to an aesthetic (here **size**): ```{r} d <- as.data.frame(Titanic) ggplot(d) + aes(x = Class, y = Survived, weight = Freq, size = after_stat(observed)) + stat_cross() + scale_size_area(max_size = 20) ``` Note that the **weight** aesthetic is taken into account by `stat_cross()`. We can go further using a custom shape and filling points with standardized residual to identify visually cells who are over- or underrepresented. ```{r fig.height=6, fig.width=6} ggplot(d) + aes( x = Class, y = Survived, weight = Freq, size = after_stat(observed), fill = after_stat(std.resid) ) + stat_cross(shape = 22) + scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) + scale_size_area(max_size = 20) ``` We can easily recreate a cross-tabulated table. ```{r} ggplot(d) + aes(x = Class, y = Survived, weight = Freq) + geom_tile(fill = "white", colour = "black") + geom_text(stat = "cross", mapping = aes(label = after_stat(observed))) + theme_minimal() ``` Even more complicated, we want to produce a table showing column proportions and where cells are filled with standardized residuals. Note that `stat_cross()` could be used with facets. In that case, computation is done separately in each facet. ```{r} ggplot(d) + aes( x = Class, y = Survived, weight = Freq, label = scales::percent(after_stat(col.prop), accuracy = .1), fill = after_stat(std.resid) ) + stat_cross(shape = 22, size = 30) + geom_text(stat = "cross") + scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) + facet_grid(rows = vars(Sex)) + labs(fill = "Standardized residuals") + theme_minimal() ``` ggstats/inst/doc/stat_weighted_mean.html0000644000176200001440000012720214527052762020207 0ustar liggesusers Compute weighted mean with stat_weighted_mean()

Compute weighted mean with stat_weighted_mean()

library(ggstats)
library(ggplot2)

stat_weighted_mean() computes mean value of y (taking into account any weight aesthetic if provided) for each value of x. More precisely, it will return a new data frame with one line per unique value of x with the following new variables:

  • y: mean value of the original y (i.e. numerator/denominator)
  • numerator
  • denominator

Let’s take an example. The following plot shows all tips received according to the day of the week.

data(tips, package = "reshape")
ggplot(tips) +
  aes(x = day, y = tip) +
  geom_point()

To plot their mean value per day, simply use stat_weighted_mean().

ggplot(tips) +
  aes(x = day, y = tip) +
  stat_weighted_mean()

We can specify the geometry we want using geom argument. Note that for lines, we need to specify the group aesthetic as well.

ggplot(tips) +
  aes(x = day, y = tip, group = 1) +
  stat_weighted_mean(geom = "line")

An alternative is to specify the statistic in ggplot2::geom_line().

ggplot(tips) +
  aes(x = day, y = tip, group = 1) +
  geom_line(stat = "weighted_mean")

Of course, it could be use with other geometries. Here a bar plot.

p <- ggplot(tips) +
  aes(x = day, y = tip, fill = sex) +
  stat_weighted_mean(geom = "bar", position = "dodge") +
  ylab("mean tip")
p

It is very easy to add facets. In that case, computation will be done separately for each facet.

p + facet_grid(rows = vars(smoker))

stat_weighted_mean() could be also used for computing proportions as a proportion is technically a mean of binary values (0 or 1).

ggplot(tips) +
  aes(x = day, y = as.integer(smoker == "Yes"), fill = sex) +
  stat_weighted_mean(geom = "bar", position = "dodge") +
  scale_y_continuous(labels = scales::percent) +
  ylab("proportion of smoker")

Finally, you can use the weight aesthetic to indicate weights to take into account for computing means / proportions.

d <- as.data.frame(Titanic)
ggplot(d) +
  aes(x = Class, y = as.integer(Survived == "Yes"), weight = Freq, fill = Sex) +
  geom_bar(stat = "weighted_mean", position = "dodge") +
  scale_y_continuous(labels = scales::percent) +
  labs(y = "Proportion who survived")

ggstats/inst/doc/stat_prop.R0000644000176200001440000000502714527052757015630 0ustar liggesusers## ----include = FALSE---------------------------------------------------------- knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ## ----setup-------------------------------------------------------------------- library(ggstats) library(ggplot2) ## ----------------------------------------------------------------------------- d <- as.data.frame(Titanic) p <- ggplot(d) + aes(x = Class, fill = Survived, weight = Freq, by = Class) + geom_bar(position = "fill") + geom_text(stat = "prop", position = position_fill(.5)) p ## ----------------------------------------------------------------------------- p + facet_grid(cols = vars(Sex)) ## ----------------------------------------------------------------------------- ggplot(d) + aes(x = Class, fill = Survived, weight = Freq, by = 1) + geom_bar() + geom_text( aes(label = scales::percent(after_stat(prop), accuracy = 1)), stat = "prop", position = position_stack(.5) ) ## ----------------------------------------------------------------------------- ggplot(d) + aes(x = Class, fill = Sex, weight = Freq, by = Sex) + geom_bar(position = "dodge") ## ----------------------------------------------------------------------------- ggplot(d) + aes(x = Class, fill = Sex, weight = Freq, by = Sex, y = after_stat(prop)) + geom_bar(stat = "prop", position = "dodge") + scale_y_continuous(labels = scales::percent) ## ----------------------------------------------------------------------------- ggplot(d) + aes(x = Class, fill = Sex, weight = Freq, by = Sex, y = after_stat(prop)) + geom_bar(stat = "prop", position = "dodge") + scale_y_continuous(labels = scales::percent) + geom_text( mapping = aes( label = scales::percent(after_stat(prop), accuracy = .1), y = after_stat(0.01) ), vjust = "bottom", position = position_dodge(.9), stat = "prop" ) ## ----------------------------------------------------------------------------- d <- diamonds %>% dplyr::filter(!(cut == "Ideal" & clarity == "I1")) %>% dplyr::filter(!(cut == "Very Good" & clarity == "VS2")) %>% dplyr::filter(!(cut == "Premium" & clarity == "IF")) p <- ggplot(d) + aes(x = clarity, fill = cut, by = clarity) + geom_bar(position = "fill") p + geom_text( stat = "prop", position = position_fill(.5) ) ## ----------------------------------------------------------------------------- p + geom_text( stat = "prop", position = position_fill(.5), complete = "fill" ) ggstats/inst/doc/stat_cross.html0000644000176200001440000013077314527052753016547 0ustar liggesusers Compute cross-tabulation statistics with stat_cross()

Compute cross-tabulation statistics with stat_cross()

library(ggstats)
library(ggplot2)

This statistic is intended to be used with two discrete variables mapped to x and y aesthetics. It will compute several statistics of a cross-tabulated table using broom::tidy.test() and stats::chisq.test(). More precisely, the computed variables are:

  • observed: number of observations in x,y
  • prop: proportion of total
  • row.prop: row proportion
  • col.prop: column proportion
  • expected: expected count under the null hypothesis
  • resid: Pearson’s residual
  • std.resid: standardized residual
  • row.observed: total number of observations within row
  • col.observed: total number of observations within column
  • total.observed: total number of observations within the table
  • phi: phi coefficients, see augment_chisq_add_phi()

By default, stat_cross() is using ggplot2::geom_points(). If you want to plot the number of observations, you need to map after_stat(observed) to an aesthetic (here size):

d <- as.data.frame(Titanic)
ggplot(d) +
  aes(x = Class, y = Survived, weight = Freq, size = after_stat(observed)) +
  stat_cross() +
  scale_size_area(max_size = 20)

Note that the weight aesthetic is taken into account by stat_cross().

We can go further using a custom shape and filling points with standardized residual to identify visually cells who are over- or underrepresented.

ggplot(d) +
  aes(
    x = Class, y = Survived, weight = Freq,
    size = after_stat(observed), fill = after_stat(std.resid)
  ) +
  stat_cross(shape = 22) +
  scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) +
  scale_size_area(max_size = 20)

We can easily recreate a cross-tabulated table.

ggplot(d) +
  aes(x = Class, y = Survived, weight = Freq) +
  geom_tile(fill = "white", colour = "black") +
  geom_text(stat = "cross", mapping = aes(label = after_stat(observed))) +
  theme_minimal()

Even more complicated, we want to produce a table showing column proportions and where cells are filled with standardized residuals. Note that stat_cross() could be used with facets. In that case, computation is done separately in each facet.

ggplot(d) +
  aes(
    x = Class, y = Survived, weight = Freq,
    label = scales::percent(after_stat(col.prop), accuracy = .1),
    fill = after_stat(std.resid)
  ) +
  stat_cross(shape = 22, size = 30) +
  geom_text(stat = "cross") +
  scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) +
  facet_grid(rows = vars(Sex)) +
  labs(fill = "Standardized residuals") +
  theme_minimal()

ggstats/inst/doc/stat_weighted_mean.R0000644000176200001440000000363614527052762017450 0ustar liggesusers## ----include = FALSE---------------------------------------------------------- knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ## ----setup-------------------------------------------------------------------- library(ggstats) library(ggplot2) ## ----------------------------------------------------------------------------- data(tips, package = "reshape") ggplot(tips) + aes(x = day, y = tip) + geom_point() ## ----------------------------------------------------------------------------- ggplot(tips) + aes(x = day, y = tip) + stat_weighted_mean() ## ----------------------------------------------------------------------------- ggplot(tips) + aes(x = day, y = tip, group = 1) + stat_weighted_mean(geom = "line") ## ----------------------------------------------------------------------------- ggplot(tips) + aes(x = day, y = tip, group = 1) + geom_line(stat = "weighted_mean") ## ----------------------------------------------------------------------------- p <- ggplot(tips) + aes(x = day, y = tip, fill = sex) + stat_weighted_mean(geom = "bar", position = "dodge") + ylab("mean tip") p ## ----------------------------------------------------------------------------- p + facet_grid(rows = vars(smoker)) ## ----------------------------------------------------------------------------- ggplot(tips) + aes(x = day, y = as.integer(smoker == "Yes"), fill = sex) + stat_weighted_mean(geom = "bar", position = "dodge") + scale_y_continuous(labels = scales::percent) + ylab("proportion of smoker") ## ----------------------------------------------------------------------------- d <- as.data.frame(Titanic) ggplot(d) + aes(x = Class, y = as.integer(Survived == "Yes"), weight = Freq, fill = Sex) + geom_bar(stat = "weighted_mean", position = "dodge") + scale_y_continuous(labels = scales::percent) + labs(y = "Proportion who survived") ggstats/inst/doc/stat_weighted_mean.Rmd0000644000176200001440000000526514357760262017773 0ustar liggesusers--- title: "Compute weighted mean with `stat_weighted_mean()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Compute weighted mean with `stat_weighted_mean()`} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(ggstats) library(ggplot2) ``` `stat_weighted_mean()` computes mean value of **y** (taking into account any **weight** aesthetic if provided) for each value of **x**. More precisely, it will return a new data frame with one line per unique value of **x** with the following new variables: - **y**: mean value of the original **y** (i.e. **numerator**/**denominator**) - **numerator** - **denominator** Let's take an example. The following plot shows all tips received according to the day of the week. ```{r} data(tips, package = "reshape") ggplot(tips) + aes(x = day, y = tip) + geom_point() ``` To plot their mean value per day, simply use `stat_weighted_mean()`. ```{r} ggplot(tips) + aes(x = day, y = tip) + stat_weighted_mean() ``` We can specify the geometry we want using `geom` argument. Note that for lines, we need to specify the **group** aesthetic as well. ```{r} ggplot(tips) + aes(x = day, y = tip, group = 1) + stat_weighted_mean(geom = "line") ``` An alternative is to specify the statistic in `ggplot2::geom_line()`. ```{r} ggplot(tips) + aes(x = day, y = tip, group = 1) + geom_line(stat = "weighted_mean") ``` Of course, it could be use with other geometries. Here a bar plot. ```{r} p <- ggplot(tips) + aes(x = day, y = tip, fill = sex) + stat_weighted_mean(geom = "bar", position = "dodge") + ylab("mean tip") p ``` It is very easy to add facets. In that case, computation will be done separately for each facet. ```{r} p + facet_grid(rows = vars(smoker)) ``` `stat_weighted_mean()` could be also used for computing proportions as a proportion is technically a mean of binary values (0 or 1). ```{r} ggplot(tips) + aes(x = day, y = as.integer(smoker == "Yes"), fill = sex) + stat_weighted_mean(geom = "bar", position = "dodge") + scale_y_continuous(labels = scales::percent) + ylab("proportion of smoker") ``` Finally, you can use the **weight** aesthetic to indicate weights to take into account for computing means / proportions. ```{r} d <- as.data.frame(Titanic) ggplot(d) + aes(x = Class, y = as.integer(Survived == "Yes"), weight = Freq, fill = Sex) + geom_bar(stat = "weighted_mean", position = "dodge") + scale_y_continuous(labels = scales::percent) + labs(y = "Proportion who survived") ``` ggstats/inst/WORDLIST0000644000176200001440000000020214423723176014100 0ustar liggesusersBaddeley CMD Codecov DOI GGally Lifecycle Likert ORCID behaviour colour dev geoms ggplot htest labeller resid th