diffobj/0000755000176200001440000000000014126775131011664 5ustar liggesusers
diffobj/NAMESPACE0000755000176200001440000000465214123062122013077 0ustar liggesusers
# Generated by roxygen2: do not edit by hand
S3method(as.character,diffobj_ogewlhgiadfl2)
S3method(as.character,diffobj_ogewlhgiadfl3)
S3method(print,diffobj_ogewlhgiadfl)
S3method(print,ses_dat)
export(AlignThreshold)
export(Pager)
export(PagerBrowser)
export(PagerOff)
export(PagerSystem)
export(PagerSystemLess)
export(PaletteOfStyles)
export(Rdiff_chr)
export(Rdiff_obj)
export(Style)
export(StyleAnsi)
export(StyleAnsi256DarkRgb)
export(StyleAnsi256DarkYb)
export(StyleAnsi256LightRgb)
export(StyleAnsi256LightYb)
export(StyleAnsi8NeutralRgb)
export(StyleAnsi8NeutralYb)
export(StyleFuns)
export(StyleHtml)
export(StyleHtmlLightRgb)
export(StyleHtmlLightYb)
export(StyleRaw)
export(StyleText)
export(auto_context)
export(console_lines)
export(cont_f)
export(diffChr)
export(diffCsv)
export(diffDeparse)
export(diffFile)
export(diffObj)
export(diffPrint)
export(diffStr)
export(diffobj_css)
export(diffobj_js)
export(diffobj_set_def_opts)
export(div_f)
export(finalizeHtml)
export(gdo)
export(guidesChr)
export(guidesDeparse)
export(guidesFile)
export(guidesPrint)
export(guidesStr)
export(has_Rdiff)
export(make_blocking)
export(nchar_html)
export(pager_is_less)
export(ses)
export(ses_dat)
export(span_f)
export(tag_f)
export(trimChr)
export(trimDeparse)
export(trimFile)
export(trimPrint)
export(trimStr)
export(view_or_browse)
exportClasses(AlignThreshold)
exportClasses(Diff)
exportClasses(PagerOff)
exportClasses(PagerSystem)
exportClasses(PagerSystemLess)
exportClasses(PaletteOfStyles)
exportClasses(Style)
exportClasses(StyleAnsi)
exportClasses(StyleAnsi256DarkRgb)
exportClasses(StyleAnsi256DarkYb)
exportClasses(StyleAnsi256LightRgb)
exportClasses(StyleAnsi256LightYb)
exportClasses(StyleAnsi8NeutralRgb)
exportClasses(StyleAnsi8NeutralYb)
exportClasses(StyleFuns)
exportClasses(StyleHtml)
exportClasses(StyleHtmlLightRgb)
exportClasses(StyleHtmlLightYb)
exportClasses(StyleRaw)
exportClasses(StyleSummary)
exportClasses(StyleSummaryHtml)
exportClasses(StyleText)
exportMethods("[")
exportMethods(diffObj)
exportMethods(head)
exportMethods(summary)
exportMethods(tail)
import(crayon)
import(methods)
importFrom(grDevices,rgb)
importFrom(stats,ave)
importFrom(stats,frequency)
importFrom(stats,is.ts)
importFrom(stats,setNames)
importFrom(tools,Rdiff)
importFrom(utils,browseURL)
importFrom(utils,capture.output)
importFrom(utils,file_test)
importFrom(utils,packageVersion)
importFrom(utils,read.csv)
useDynLib(diffobj, .registration=TRUE, .fixes="DIFFOBJ_")
diffobj/README.md0000755000176200001440000001075014123062122013133 0ustar liggesusers
# diffobj - Diffs for R Objects

[![R build status](https://github.com/brodieG/diffobj/workflows/R-CMD-check/badge.svg)](https://github.com/brodieG/diffobj/actions)
[![](https://codecov.io/github/brodieG/diffobj/coverage.svg?branch=master)](https://codecov.io/github/brodieG/diffobj?branch=master)
[![](http://www.r-pkg.org/badges/version/diffobj)](https://cran.r-project.org/package=diffobj)
[![Dependencies direct/recursive](https://tinyverse.netlify.app/badge/diffobj)](https://tinyverse.netlify.app/)

Generate a colorized diff of two R objects for an intuitive visualization of their differences.

> See the [introductory vignette for details][1].

## Output

If your terminal supports formatting through ANSI escape sequences, `diffobj` will output colored diffs to the terminal. Otherwise, output will be colored with HTML/CSS and sent to the IDE viewport or to your browser. `diffobj` comes with several built-in color schemes that can be further customized. Some examples:

![Output Examples](https://raw.githubusercontent.com/brodieG/diffobj/master/cliandrstudio.png)

## Installation

This package is available on [CRAN](https://cran.r-project.org/package=diffobj).

```
install.packages("diffobj")
browseVignettes("diffobj")
```

## Related Software

* [tools::Rdiff][2].
* [Daff](https://cran.r-project.org/package=daff) diff, patch and merge for data.frames.
* [GNU diff](https://www.gnu.org/software/diffutils/).
* [waldo](https://cran.r-project.org/package=waldo), which internally uses `diffobj` for diffs but takes a more hands-on approach to detailing object differences.

## Acknowledgements

* R Core for developing and maintaining such a wonderful language.
* CRAN maintainers, for patiently shepherding packages onto CRAN and maintaining the repository, and Uwe Ligges in particular for maintaining [Winbuilder](http://win-builder.r-project.org/).
* The users who have reported bugs and possible fixes, and/or made feature requests (see NEWS.md).
* [Gábor Csárdi](https://github.com/gaborcsardi) for [crayon](https://github.com/r-lib/crayon).
* [Jim Hester](https://github.com/jimhester) for [covr](https://cran.r-project.org/package=covr), and, with Rstudio, for [r-lib/actions](https://github.com/r-lib/actions).
* [Dirk Eddelbuettel](https://github.com/eddelbuettel) and [Carl Boettiger](https://github.com/cboettig) for the [rocker](https://github.com/rocker-org/rocker) project, and [Gábor Csárdi](https://github.com/gaborcsardi) and the [R-consortium](https://www.r-consortium.org/) for [Rhub](https://github.com/r-hub), without which testing bugs on R-devel and other platforms would be a nightmare.
* [Hadley Wickham](https://github.com/hadley/) and [Peter Danenberg](https://github.com/klutometis) for [roxygen2](https://cran.r-project.org/package=roxygen2).
* [Yihui Xie](https://github.com/yihui) for [knitr](https://cran.r-project.org/package=knitr) and [J.J. Allaire](https://github.com/jjallaire) et al. for [rmarkdown](https://cran.r-project.org/package=rmarkdown), and by extension John MacFarlane for [pandoc](https://pandoc.org/).
* Olaf Mersmann for [microbenchmark](https://cran.r-project.org/package=microbenchmark), because microseconds matter, and [Joshua Ulrich](https://github.com/joshuaulrich) for making it lightweight and maintaining it.
* [Tomas Kalibera](https://github.com/kalibera) for [rchk](https://github.com/kalibera/rchk) and the accompanying vagrant image, and rcnst to help detect errors in compiled code.
* [Winston Chang](https://github.com/wch) for the [r-debug](https://hub.docker.com/r/wch1/r-debug/) docker container, in particular because of the valgrind level 2 instrumented version of R.
* [Gábor Csárdi](https://github.com/gaborcsardi), the [R-consortium](https://www.r-consortium.org/), et al. for [revdepcheck](https://github.com/r-lib/revdepcheck) to simplify reverse dependency checks.
* All open source developers out there who make their work freely available for others to use.
* [Github](https://github.com/), [Codecov](https://about.codecov.io/), [Vagrant](https://www.vagrantup.com/), [Docker](https://www.docker.com/), [Ubuntu](https://ubuntu.com/), [Brew](https://brew.sh/) for providing infrastructure that greatly simplifies open source development.
* [Free Software Foundation](https://www.fsf.org/) for developing the GPL license and promoting the free software movement.

[1]: https://cran.r-project.org/package=diffobj/vignettes/diffobj.html
[2]: https://stat.ethz.ch/R-manual/R-devel/library/tools/html/Rdiff.html
diffobj/man/0000755000176200001440000000000014123062122012421 5ustar liggesusersdiffobj/man/Extract_PaletteOfStyles.Rd0000755000176200001440000000300613656314536017516 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/styles.R \name{[<-,PaletteOfStyles-method} \alias{[<-,PaletteOfStyles-method} \alias{[,PaletteOfStyles,ANY,ANY,ANY-method} \alias{[[,PaletteOfStyles-method} \title{Extract/Replace a Style Class or Object from PaletteOfStyles} \usage{ \S4method{[}{PaletteOfStyles}(x, i, j, ...) 
<- value \S4method{[}{PaletteOfStyles,ANY,ANY,ANY}(x, i, j, ..., drop = FALSE) \S4method{[[}{PaletteOfStyles}(x, i, j, ..., exact = TRUE) } \arguments{ \item{x}{a \code{\link{PaletteOfStyles}} object} \item{i}{numeric, or character corresponding to a valid style \code{format}} \item{j}{numeric, or character corresponding to a valid style \code{brightness}} \item{...}{pass a numeric or character corresponding to a valid \code{color.mode}} \item{value}{a \emph{list} of \code{\link{Style}} class or \code{\link{Style}} objects} \item{drop}{TRUE or FALSE, whether to drop dimensions, defaults to FALSE, which differs from the generic} \item{exact}{passed on to generic} } \value{ a \code{\link{Style}} \code{ClassRepresentation} object or \code{\link{Style}} object for \code{[[}, and a list of the same for \code{[} } \description{ Extract/Replace a Style Class or Object from PaletteOfStyles } \examples{ pal <- PaletteOfStyles() pal[["ansi256", "light", "rgb"]] pal["ansi256", "light", ] pal["ansi256", "light", "rgb"] <- list(StyleAnsi8NeutralRgb()) } \seealso{ \code{\link{diffPrint}} for explanations of \code{format}, \code{brightness}, and \code{color.mode} } diffobj/man/strip_hz_control.Rd0000755000176200001440000000223613420351310016317 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/text.R \name{strip_hz_control} \alias{strip_hz_control} \title{Replace Horizontal Spacing Control Characters} \usage{ strip_hz_control(txt, stops = 8L, sgr.supported) } \arguments{ \item{txt}{character to convert} \item{stops}{integer, what tab stops to use} \item{sgr.supported}{logical whether the current display device supports ANSI CSI SGR. See \code{\link[=diffPrint]{diff*}}'s \code{sgr.supported} parameter.} } \value{ character, `txt` with horizontal control sequences replaced. } \description{ Removes tabs, newlines, and manipulates the text so that it looks the same as it did with those horizontal control characters embedded. 
Currently carriage returns are also processed, but in the future they no longer will be. This function is used when the \code{convert.hz.white.space} parameter to the \code{\link[=diffPrint]{diff*}} methods is active. The term \dQuote{strip} is a misnomer that remains for legacy reasons and laziness. } \details{ This is an internal function with exposed documentation because it is referenced in an external function's documentation. } \keyword{internal} diffobj/man/dimnames-PaletteOfStyles-method.Rd0000755000176200001440000000071213656314536021076 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/styles.R \name{dimnames,PaletteOfStyles-method} \alias{dimnames,PaletteOfStyles-method} \title{Retrieve Dimnames for PaletteOfStyles Objects} \usage{ \S4method{dimnames}{PaletteOfStyles}(x) } \arguments{ \item{x}{a \code{\link{PaletteOfStyles}} object} } \value{ list the dimension names dimnames(PaletteOfStyles()) } \description{ Retrieve Dimnames for PaletteOfStyles Objects } diffobj/man/Diff-class.Rd0000755000176200001440000000054513201325222014671 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/s4.R \docType{class} \name{Diff-class} \alias{Diff-class} \title{Diff Result Object} \description{ Return value for the \code{\link[=diffPrint]{diff*}} methods. Has \code{show}, \code{as.character}, \code{summary}, \code{[}, \code{head}, \code{tail}, and \code{any} methods. } diffobj/man/as.character-DiffSummary-method.Rd0000755000176200001440000000121713656314536020777 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/summmary.R \name{as.character,DiffSummary-method} \alias{as.character,DiffSummary-method} \title{Generate Character Representation of DiffSummary Object} \usage{ \S4method{as.character}{DiffSummary}(x, ...) 
} \arguments{ \item{x}{a \code{DiffSummary} object} \item{...}{not used, for compatibility with generic} } \value{ the summary as a character vector intended to be \code{cat}ed to terminal } \description{ Generate Character Representation of DiffSummary Object } \examples{ as.character( summary(diffChr(letters, letters[-c(5, 15)], format="raw", pager="off")) ) } diffobj/man/as.character-MyersMbaSes-method.Rd0000755000176200001440000000101513656314536020737 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/core.R \name{as.character,MyersMbaSes-method} \alias{as.character,MyersMbaSes-method} \title{Generate a character representation of Shortest Edit Sequence} \usage{ \S4method{as.character}{MyersMbaSes}(x, ...) } \arguments{ \item{x}{S4 object of class \code{MyersMbaSes}} \item{...}{unused} } \value{ character vector } \description{ Generate a character representation of Shortest Edit Sequence } \seealso{ \code{\link{ses}} } \keyword{internal} diffobj/man/diffFile.Rd0000755000176200001440000004435514123062122014436 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/diff.R \name{diffFile} \alias{diffFile} \alias{diffFile,ANY-method} \title{Diff Files} \usage{ diffFile(target, current, ...) 
\S4method{diffFile}{ANY}( target, current, mode = gdo("mode"), context = gdo("context"), format = gdo("format"), brightness = gdo("brightness"), color.mode = gdo("color.mode"), word.diff = gdo("word.diff"), pager = gdo("pager"), guides = gdo("guides"), trim = gdo("trim"), rds = gdo("rds"), unwrap.atomic = gdo("unwrap.atomic"), max.diffs = gdo("max.diffs"), disp.width = gdo("disp.width"), ignore.white.space = gdo("ignore.white.space"), convert.hz.white.space = gdo("convert.hz.white.space"), tab.stops = gdo("tab.stops"), line.limit = gdo("line.limit"), hunk.limit = gdo("hunk.limit"), align = gdo("align"), style = gdo("style"), palette.of.styles = gdo("palette"), frame = par_frame(), interactive = gdo("interactive"), term.colors = gdo("term.colors"), tar.banner = NULL, cur.banner = NULL, strip.sgr = gdo("strip.sgr"), sgr.supported = gdo("sgr.supported"), extra = list() ) } \arguments{ \item{target}{character(1L) or file connection with read capability; if character should point to a text file} \item{current}{like \code{target}} \item{...}{unused, for compatibility of methods with generics} \item{mode}{character(1L), one of: \itemize{ \item \dQuote{unified}: diff mode used by \code{git diff} \item \dQuote{sidebyside}: line up the differences side by side \item \dQuote{context}: show the target and current hunks in their entirety; this mode takes up a lot of screen space but makes it easier to see what the objects actually look like \item \dQuote{auto}: default mode; pick one of the above, will favor \dQuote{sidebyside} unless \code{getOption("width")} is less than 80, or in \code{diffPrint} and objects are dimensioned and do not fit side by side, or in \code{diffChr}, \code{diffDeparse}, \code{diffFile} and output does not fit in side by side without wrapping }} \item{context}{integer(1L) how many lines of context are shown on either side of differences (defaults to 2). Set to \code{-1L} to allow as many as there are. 
Set to \dQuote{auto} to display as many as 10 lines or as few as 1 depending on whether total screen lines fit within the number of lines specified in \code{line.limit}. Alternatively pass the return value of \code{\link{auto_context}} to fine tune the parameters of the auto context calculation.} \item{format}{character(1L), controls the diff output format, one of: \itemize{ \item \dQuote{auto}: to select output format based on terminal capabilities; will attempt to use one of the ANSI formats if they appear to be supported, and if not or if you are in the Rstudio console it will attempt to use HTML and browser output if in interactive mode. \item \dQuote{raw}: plain text \item \dQuote{ansi8}: color and format diffs using basic ANSI escape sequences \item \dQuote{ansi256}: like \dQuote{ansi8}, except using the full range of ANSI formatting options \item \dQuote{html}: color and format using HTML markup; the resulting string is processed with \code{\link{enc2utf8}} when output as a full web page (see docs for \code{html.output} under \code{\link{Style}}). } Defaults to \dQuote{auto}. See \code{palette.of.styles} for details on customization, \code{\link{style}} for full control of output format. See `pager` parameter for more discussion of Rstudio behavior.} \item{brightness}{character, one of \dQuote{light}, \dQuote{dark}, \dQuote{neutral}, useful for adjusting color scheme to light or dark terminals. \dQuote{neutral} by default. See \code{\link{PaletteOfStyles}} for details and limitations. Advanced: you may specify brightness as a function of \code{format}. For example, if you typically wish to use a \dQuote{dark} color scheme, except for when in \dQuote{html} format when you prefer the \dQuote{light} scheme, you may use \code{c("dark", html="light")} as the value for this parameter. This is particularly useful if \code{format} is set to \dQuote{auto} or if you want to specify a default value for this parameter via options. 
Any names you use should correspond to a \code{format}. You must have one unnamed value which will be used as the default for all \code{format}s that are not explicitly specified.} \item{color.mode}{character, one of \dQuote{rgb} or \dQuote{yb}. Defaults to \dQuote{yb}. \dQuote{yb} stands for \dQuote{Yellow-Blue} for color schemes that rely primarily on those colors to style diffs. Those colors can be easily distinguished by individuals with limited red-green color sensitivity. See \code{\link{PaletteOfStyles}} for details and limitations. Also offers the same advanced usage as the \code{brightness} parameter.} \item{word.diff}{TRUE (default) or FALSE, whether to run a secondary word diff on the in-hunk differences. For atomic vectors setting this to FALSE could make the diff \emph{slower} (see the \code{unwrap.atomic} parameter). For other uses, particularly with \code{\link{diffChr}} setting this to FALSE can substantially improve performance.} \item{pager}{one of \dQuote{auto} (default), \dQuote{on}, \dQuote{off}, a \code{\link{Pager}} object, or a list; controls whether and how a pager is used to display the diff output. If you require a particular pager behavior you must use a \code{\link{Pager}} object, or \dQuote{off} to turn off the pager. All other settings will interact with other parameters such as \code{format}, \code{style}, as well as with your system capabilities in order to select the pager expected to be most useful. \dQuote{auto} and \dQuote{on} are the same, except that in non-interactive mode \dQuote{auto} is equivalent to \dQuote{off}. \dQuote{off} will always send output to the console. If \dQuote{on}, whether the output actually gets routed to the pager depends on the pager \code{threshold} setting (see \code{\link{Pager}}). The default behavior is to use the pager associated with the \code{Style} object. The \code{Style} object is itself determined by the \code{format} or \code{style} parameters. 
Depending on your system configuration different styles and corresponding pagers will get selected, unless you specify a \code{Pager} object directly. On a system with a system pager that supports ANSI CSI SGR colors, the pager will only trigger if the output is taller than one window. If the system pager is not known to support ANSI colors then the output will be sent as HTML to the IDE viewer if available or to the web browser if not. Even though Rstudio now supports ANSI CSI SGR at the console, output is still formatted as HTML and sent to the IDE viewer. Partly this is for continuity of behavior, but also because the default Rstudio pager does not support ANSI CSI SGR, at least as of this writing. If \code{pager} is a list, then the same as with \dQuote{on}, except that the \code{Pager} object associated with the selected \code{Style} object is re-instantiated with the union of the list elements and the existing settings of that \code{Pager}. The list should contain named elements that correspond to the \code{\link{Pager}} instantiation parameters. The names must be specified in full as partial parameter matching will not be carried out because the pager is re-instantiated with \code{\link{new}}. See \code{\link{Pager}}, \code{\link{Style}}, and \code{\link{PaletteOfStyles}} for more details and for instructions on how to modify the default behavior.} \item{guides}{TRUE (default), FALSE, or a function that accepts at least two arguments and requires no more than two arguments. Guides are additional context lines that are not strictly part of a hunk, but provide important contextual data (e.g. column headers). If TRUE, the context lines are shown in addition to the normal diff output, typically in a different color to indicate they are not part of the hunk. If a function, the function should accept as the first argument the object being diffed, and the second the character representation of the object. 
The function should return the indices of the elements of the character representation that should be treated as guides. See \code{\link{guides}} for more details.} \item{trim}{TRUE (default), FALSE, or a function that accepts at least two arguments and requires no more than two arguments. Function should compute for each line in captured output what portion of those lines should be diffed. By default, this is used to remove row meta data differences (e.g. \code{[1,]}) so they alone do not show up as differences in the diff. See \code{\link{trim}} for more details.} \item{rds}{TRUE (default) or FALSE, if TRUE will check whether \code{target} and/or \code{current} point to a file that can be read with \code{\link{readRDS}} and if so, loads the R object contained in the file and carries out the diff on the object instead of the original argument. Currently there is no mechanism for specifying additional arguments to \code{readRDS}} \item{unwrap.atomic}{TRUE (default) or FALSE. Relevant primarily for \code{diffPrint}, if TRUE, and \code{word.diff} is also TRUE, and both \code{target} and \code{current} are \emph{unnamed} one-dimension atomics, the vectors are unwrapped and diffed element by element, and then re-wrapped. Since \code{diffPrint} is fundamentally a line diff, the re-wrapped lines are lined up in a manner that is as consistent as possible with the unwrapped diff. Lines that contain the location of the word differences will be paired up. Since the vectors may well be wrapped with different periodicities this will result in lines that are paired up that look like they should not be paired up, though the locations of the differences should be. It is entirely possible that setting this parameter to FALSE will result in a slower diff. This happens if two vectors are actually fairly similar, but their line representations are not. 
For example, in comparing \code{1:100} to \code{c(100, 1:99)}, there is really only one difference at the \dQuote{word} level, but every screen line is different. \code{diffChr} will also do the unwrapping if it is given a character vector that contains output that looks like the atomic vectors described above. This is a bug, but as the functionality could be useful when diffing e.g. \code{capture.output} data, we now declare it a feature.} \item{max.diffs}{integer(1L), number of \emph{differences} (default 50000L) after which we abandon the \code{O(n^2)} diff algorithm in favor of a naive \code{O(n)} one. Set to \code{-1L} to stick to the original algorithm up to the maximum allowed (~INT_MAX/4).} \item{disp.width}{integer(1L) number of display columns to take up; note that in \dQuote{sidebyside} \code{mode} the effective display width is half this number (set to 0L to use default widths which are \code{getOption("width")} for normal styles and \code{80L} for HTML styles). Future versions of \code{diffobj} may change this to larger values for two dimensional objects for better diffs (see details).} \item{ignore.white.space}{TRUE or FALSE, whether to ignore differences in horizontal whitespace (i.e. spaces and tabs) (defaults to TRUE).} \item{convert.hz.white.space}{TRUE or FALSE, whether to modify input strings that contain tabs and carriage returns in such a way that they display as they would \bold{with} those characters, but without using those characters (defaults to TRUE). The conversion assumes that tab stops are spaced evenly eight characters apart on the terminal. If this is not the case you may specify the tab stops explicitly with \code{tab.stops}.} \item{tab.stops}{integer, what tab stops to use when converting hard tabs to spaces. If not integer will be coerced to integer (defaults to 8L). You may specify more than one tab stop. 
If display width exceeds that addressable by your tab stops the last tab stop will be repeated.} \item{line.limit}{integer(2L) or integer(1L), if length 1 how many lines of output to show, where \code{-1} means no limit. If length 2, the first value indicates the threshold of screen lines to begin truncating output, and the second the number of lines to truncate to, which should be fewer than the threshold. Note that this parameter is implemented on a best-efforts basis and should not be relied on to produce the exact number of lines requested. In particular do not expect it to work well for values small enough that the banner portion of the diff would have to be trimmed. If you want a specific number of lines use \code{[} or \code{head} / \code{tail}. One advantage of \code{line.limit} over these other options is that you can combine it with \code{context="auto"} and auto \code{max.level} selection (the latter for \code{diffStr}), which allows the diff to dynamically adjust to make best use of the available display lines. \code{[}, \code{head}, and \code{tail} just subset the text of the output.} \item{hunk.limit}{integer(2L) or integer (1L), how many diff hunks to show. Behaves similarly to \code{line.limit}. How many hunks are in a particular diff is a function of how many differences, and also how much \code{context} is used since context can cause two hunks to bleed into each other and become one.} \item{align}{numeric(1L) between 0 and 1, proportion of words in a line of \code{target} that must be matched in a line of \code{current} in the same hunk for those lines to be paired up when displayed (defaults to 0.25), or an \code{\link{AlignThreshold}} object. Set to \code{1} to turn off alignment which will cause all lines in a hunk from \code{target} to show up first, followed by all lines from \code{current}. 
Note that in order to be aligned lines must meet the threshold and have at least 3 matching alphanumeric characters (see \code{\link{AlignThreshold}} for details).} \item{style}{\dQuote{auto}, a \code{\link{Style}} object, or a list. \dQuote{auto} by default. If a \code{Style} object, will override the \code{format}, \code{brightness}, and \code{color.mode} parameters. The \code{Style} object provides full control of diff output styling. If a list, then the same as \dQuote{auto}, except that if the auto-selected \code{Style} requires instantiation (see \code{\link{PaletteOfStyles}}), then the list contents will be used as arguments when instantiating the style object. See \code{\link{Style}} for more details, in particular the examples.} \item{palette.of.styles}{\code{\link{PaletteOfStyles}} object; advanced usage, contains all the \code{\link{Style}} objects or \dQuote{classRepresentation} objects extending \code{\link{Style}} that are selected by specifying the \code{format}, \code{brightness}, and \code{color.mode} parameters. See \code{\link{PaletteOfStyles}} for more details.} \item{frame}{an environment to use as the evaluation frame for the \code{print/show/str} calls, and for \code{diffObj}, the evaluation frame for the \code{diffPrint} / \code{diffStr} calls. Defaults to the return value of \code{\link{par_frame}}.} \item{interactive}{TRUE or FALSE whether the function is being run in interactive mode, defaults to the return value of \code{\link{interactive}}. If in interactive mode, pager will be used if \code{pager} is \dQuote{auto}, and if ANSI styles are not supported and \code{style} is \dQuote{auto}, output will be sent to viewer/browser as HTML.} \item{term.colors}{integer(1L) how many ANSI colors are supported by the terminal. This variable is provided for when \code{\link[=num_colors]{crayon::num_colors}} does not properly detect how many ANSI colors are supported by your terminal. 
Defaults to return value of \code{\link[=num_colors]{crayon::num_colors}} and should be 8 or 256 to allow ANSI colors, or any other number to disallow them. This only impacts output format selection when \code{style} and \code{format} are both set to \dQuote{auto}.} \item{tar.banner}{character(1L), language, or NULL, used to generate the text to display ahead of the diff section representing the target output. If NULL will use the deparsed \code{target} expression, if language, will use the language as it would the \code{target} expression, if character(1L), will use the string with no modifications. The language mode is provided because \code{diffStr} modifies the expression prior to display (e.g. by wrapping it in a call to \code{str}). Note that it is possible in some cases that the substituted value of \code{target} actually is character(1L), but if you provide a character(1L) value here it will be assumed you intend to use that value literally.} \item{cur.banner}{character(1L) like \code{tar.banner}, but for \code{current}} \item{strip.sgr}{TRUE, FALSE, or NULL (default), whether to strip ANSI CSI SGR sequences prior to comparison and for display of diff. If NULL, resolves to TRUE if `style` resolves to an ANSI formatted diff, and FALSE otherwise. The default behavior is to avoid confusing diffs where the original SGR and the SGR added by the diff are mixed together.} \item{sgr.supported}{TRUE, FALSE, or NULL (default), whether to assume the standard output device supports ANSI CSI SGR sequences. If TRUE, strings will be manipulated accounting for the SGR sequences. If NULL, resolves to TRUE if `style` resolves to an ANSI formatted diff, and to `crayon::has_color()` otherwise. This only controls how the strings are manipulated, not whether SGR is added to format the diff, which is controlled by the `style` parameter. 
This parameter is exposed for the rare cases where you might wish to control string manipulation behavior directly.} \item{extra}{list additional arguments to pass on to the functions used to create text representation of the objects to diff (e.g. \code{print}, \code{str}, etc.)} } \value{ a \code{Diff} object; see \code{\link{diffPrint}}. } \description{ Reads text files with \code{\link{readLines}} and performs a diff on the resulting character vectors. } \examples{ \dontrun{ url.base <- "https://raw.githubusercontent.com/wch/r-source" f1 <- file.path(url.base, "29f013d1570e1df5dc047fb7ee304ff57c99ea68/README") f2 <- file.path(url.base, "daf0b5f6c728bd3dbcd0a3c976a7be9beee731d9/README") diffFile(f1, f2) } } \seealso{ \code{\link{diffPrint}} for details on the \code{diff*} functions, \code{\link{diffObj}}, \code{\link{diffStr}}, \code{\link{diffChr}} to compare character vectors directly, \code{\link{ses}} for a minimal and fast diff } diffobj/man/auto_context.Rd0000755000176200001440000000177113656314536015460 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/set.R \name{auto_context} \alias{auto_context} \title{Configure Automatic Context Calculation} \usage{ auto_context( min = getOption("diffobj.context.auto.min"), max = getOption("diffobj.context.auto.max") ) } \arguments{ \item{min}{integer(1L), positive, set to zero to allow any context} \item{max}{integer(1L), set to negative to allow any context} } \value{ S4 object containing configuration parameters, for use as the \code{context} parameter value in \code{\link[=diffPrint]{diff*}} methods } \description{ Helper function to define parameters for selecting an appropriate \code{context} value. 
} \examples{ ## `pager="off"` for CRAN compliance; you may omit in normal use diffChr(letters, letters[-13], context=auto_context(0, 3), pager="off") diffChr(letters, letters[-13], context=auto_context(0, 10), pager="off") diffChr( letters, letters[-13], context=auto_context(0, 10), line.limit=3L, pager="off" ) } diffobj/man/StyleFuns.Rd0000755000176200001440000000501113201325222014643 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/styles.R \docType{class} \name{StyleFuns-class} \alias{StyleFuns-class} \alias{StyleFuns} \title{Functions Used for Styling Diff Components} \arguments{ \item{container}{function used primarily by HTML styles to generate an outermost \code{DIV} that allows for CSS targeting of its contents (see \code{\link{cont_f}} for a function generator appropriate for use here)} \item{line}{function} \item{line.insert}{function} \item{line.delete}{function} \item{line.match}{function} \item{line.guide}{function formats guide lines (see \code{\link{guides}})} \item{text}{function} \item{text.insert}{function} \item{text.delete}{function} \item{text.match}{function} \item{text.guide}{function formats guide lines (see \code{\link{guides}})} \item{gutter}{function} \item{gutter.insert}{function} \item{gutter.delete}{function} \item{gutter.match}{function} \item{gutter.guide}{function} \item{gutter.pad}{function} \item{header}{function to format each hunk header with} \item{banner}{function to format entire banner} \item{banner.insert}{function to format insertion banner} \item{banner.delete}{function to format deletion banner} \item{meta}{function to format meta information lines} \item{context.sep}{function to format the separator used to visually distinguish the A and B hunks in \dQuote{context} \code{mode}} } \value{ a StyleFuns S4 object } \description{ Except for \code{container} every function specified here should be vectorized and apply formatting to each element in a character vector. 
The functions must accept at least one argument and require no more than one argument. The text to be formatted will be passed as a character vector as the first argument to each function. } \details{ These functions are applied in post processing steps. The \code{diff*} methods do not do any of the formatting. Instead, the formatting is done only if the user requests to \code{show} the object. Internally, \code{show} first converts the object to a character vector using \code{as.character}, which applies every formatting function defined here except for \code{container}. Then \code{show} applies \code{container} before forwarding the result to the screen or pager. } \note{ the slots are set to class \dQuote{ANY} to allow classed functions such as those defined in the \code{crayon} package. Despite this seemingly permissive slot definition, only functions are allowed in the slots by the validation functions. } \seealso{ \code{\link{Style}} } diffobj/man/diffChr.Rd0000755000176200001440000004444714123062122014275 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/diff.R \name{diffChr} \alias{diffChr} \alias{diffChr,ANY-method} \title{Diff Character Vectors Element By Element} \usage{ diffChr(target, current, ...) 
\S4method{diffChr}{ANY}( target, current, mode = gdo("mode"), context = gdo("context"), format = gdo("format"), brightness = gdo("brightness"), color.mode = gdo("color.mode"), word.diff = gdo("word.diff"), pager = gdo("pager"), guides = gdo("guides"), trim = gdo("trim"), rds = gdo("rds"), unwrap.atomic = gdo("unwrap.atomic"), max.diffs = gdo("max.diffs"), disp.width = gdo("disp.width"), ignore.white.space = gdo("ignore.white.space"), convert.hz.white.space = gdo("convert.hz.white.space"), tab.stops = gdo("tab.stops"), line.limit = gdo("line.limit"), hunk.limit = gdo("hunk.limit"), align = gdo("align"), style = gdo("style"), palette.of.styles = gdo("palette"), frame = par_frame(), interactive = gdo("interactive"), term.colors = gdo("term.colors"), tar.banner = NULL, cur.banner = NULL, strip.sgr = gdo("strip.sgr"), sgr.supported = gdo("sgr.supported"), extra = list() ) } \arguments{ \item{target}{the reference object} \item{current}{the object being compared to \code{target}} \item{...}{unused, for compatibility of methods with generics} \item{mode}{character(1L), one of: \itemize{ \item \dQuote{unified}: diff mode used by \code{git diff} \item \dQuote{sidebyside}: line up the differences side by side \item \dQuote{context}: show the target and current hunks in their entirety; this mode takes up a lot of screen space but makes it easier to see what the objects actually look like \item \dQuote{auto}: default mode; pick one of the above, will favor \dQuote{sidebyside} unless \code{getOption("width")} is less than 80, or in \code{diffPrint} and objects are dimensioned and do not fit side by side, or in \code{diffChr}, \code{diffDeparse}, \code{diffFile} and output does not fit in side by side without wrapping }} \item{context}{integer(1L) how many lines of context are shown on either side of differences (defaults to 2). Set to \code{-1L} to allow as many as there are. 
Set to \dQuote{auto} to display as many as 10 lines or as few as 1 depending on whether total screen lines fit within the number of lines specified in \code{line.limit}. Alternatively pass the return value of \code{\link{auto_context}} to fine tune the parameters of the auto context calculation.} \item{format}{character(1L), controls the diff output format, one of: \itemize{ \item \dQuote{auto}: to select output format based on terminal capabilities; will attempt to use one of the ANSI formats if they appear to be supported, and if not or if you are in the Rstudio console it will attempt to use HTML and browser output if in interactive mode. \item \dQuote{raw}: plain text \item \dQuote{ansi8}: color and format diffs using basic ANSI escape sequences \item \dQuote{ansi256}: like \dQuote{ansi8}, except using the full range of ANSI formatting options \item \dQuote{html}: color and format using HTML markup; the resulting string is processed with \code{\link{enc2utf8}} when output as a full web page (see docs for \code{html.output} under \code{\link{Style}}). } Defaults to \dQuote{auto}. See \code{palette.of.styles} for details on customization, \code{\link{style}} for full control of output format. See `pager` parameter for more discussion of Rstudio behavior.} \item{brightness}{character, one of \dQuote{light}, \dQuote{dark}, \dQuote{neutral}, useful for adjusting color scheme to light or dark terminals. \dQuote{neutral} by default. See \code{\link{PaletteOfStyles}} for details and limitations. Advanced: you may specify brightness as a function of \code{format}. For example, if you typically wish to use a \dQuote{dark} color scheme, except for when in \dQuote{html} format when you prefer the \dQuote{light} scheme, you may use \code{c("dark", html="light")} as the value for this parameter. This is particularly useful if \code{format} is set to \dQuote{auto} or if you want to specify a default value for this parameter via options. 
Any names you use should correspond to a \code{format}. You must have one unnamed value which will be used as the default for all \code{format}s that are not explicitly specified.} \item{color.mode}{character, one of \dQuote{rgb} or \dQuote{yb}. Defaults to \dQuote{yb}. \dQuote{yb} stands for \dQuote{Yellow-Blue} for color schemes that rely primarily on those colors to style diffs. Those colors can be easily distinguished by individuals with limited red-green color sensitivity. See \code{\link{PaletteOfStyles}} for details and limitations. Also offers the same advanced usage as the \code{brightness} parameter.} \item{word.diff}{TRUE (default) or FALSE, whether to run a secondary word diff on the in-hunk differences. For atomic vectors setting this to FALSE could make the diff \emph{slower} (see the \code{unwrap.atomic} parameter). For other uses, particularly with \code{\link{diffChr}} setting this to FALSE can substantially improve performance.} \item{pager}{one of \dQuote{auto} (default), \dQuote{on}, \dQuote{off}, a \code{\link{Pager}} object, or a list; controls whether and how a pager is used to display the diff output. If you require a particular pager behavior you must use a \code{\link{Pager}} object, or \dQuote{off} to turn off the pager. All other settings will interact with other parameters such as \code{format}, \code{style}, as well as with your system capabilities in order to select the pager expected to be most useful. \dQuote{auto} and \dQuote{on} are the same, except that in non-interactive mode \dQuote{auto} is equivalent to \dQuote{off}. \dQuote{off} will always send output to the console. If \dQuote{on}, whether the output actually gets routed to the pager depends on the pager \code{threshold} setting (see \code{\link{Pager}}). The default behavior is to use the pager associated with the \code{Style} object. The \code{Style} object is itself determined by the \code{format} or \code{style} parameters.
Depending on your system configuration different styles and corresponding pagers will get selected, unless you specify a \code{Pager} object directly. On a system with a system pager that supports ANSI CSI SGR colors, the pager will only trigger if the output is taller than one window. If the system pager is not known to support ANSI colors then the output will be sent as HTML to the IDE viewer if available or to the web browser if not. Even though Rstudio now supports ANSI CSI SGR at the console, output is still formatted as HTML and sent to the IDE viewer. Partly this is for continuity of behavior, but also because the default Rstudio pager does not support ANSI CSI SGR, at least as of this writing. If \code{pager} is a list, then the same as with \dQuote{on}, except that the \code{Pager} object associated with the selected \code{Style} object is re-instantiated with the union of the list elements and the existing settings of that \code{Pager}. The list should contain named elements that correspond to the \code{\link{Pager}} instantiation parameters. The names must be specified in full as partial parameter matching will not be carried out because the pager is re-instantiated with \code{\link{new}}. See \code{\link{Pager}}, \code{\link{Style}}, and \code{\link{PaletteOfStyles}} for more details and for instructions on how to modify the default behavior.} \item{guides}{TRUE (default), FALSE, or a function that accepts at least two arguments and requires no more than two arguments. Guides are additional context lines that are not strictly part of a hunk, but provide important contextual data (e.g. column headers). If TRUE, the context lines are shown in addition to the normal diff output, typically in a different color to indicate they are not part of the hunk. If a function, the function should accept as the first argument the object being diffed, and the second the character representation of the object.
The function should return the indices of the elements of the character representation that should be treated as guides. See \code{\link{guides}} for more details.} \item{trim}{TRUE (default), FALSE, or a function that accepts at least two arguments and requires no more than two arguments. Function should compute for each line in captured output what portion of those lines should be diffed. By default, this is used to remove row meta data differences (e.g. \code{[1,]}) so they alone do not show up as differences in the diff. See \code{\link{trim}} for more details.} \item{rds}{TRUE (default) or FALSE, if TRUE will check whether \code{target} and/or \code{current} point to a file that can be read with \code{\link{readRDS}} and if so, loads the R object contained in the file and carries out the diff on the object instead of the original argument. Currently there is no mechanism for specifying additional arguments to \code{readRDS}} \item{unwrap.atomic}{TRUE (default) or FALSE. Relevant primarily for \code{diffPrint}, if TRUE, and \code{word.diff} is also TRUE, and both \code{target} and \code{current} are \emph{unnamed} one-dimension atomics, the vectors are unwrapped and diffed element by element, and then re-wrapped. Since \code{diffPrint} is fundamentally a line diff, the re-wrapped lines are lined up in a manner that is as consistent as possible with the unwrapped diff. Lines that contain the location of the word differences will be paired up. Since the vectors may well be wrapped with different periodicities this will result in lines that are paired up that look like they should not be paired up, though the locations of the differences should be. It is entirely possible that setting this parameter to FALSE will result in a slower diff. This happens if two vectors are actually fairly similar, but their line representations are not.
For example, in comparing \code{1:100} to \code{c(100, 1:99)}, there is really only one difference at the \dQuote{word} level, but every screen line is different. \code{diffChr} will also do the unwrapping if it is given a character vector that contains output that looks like the atomic vectors described above. This is a bug, but as the functionality could be useful when diffing e.g. \code{capture.output} data, we now declare it a feature.} \item{max.diffs}{integer(1L), number of \emph{differences} (default 50000L) after which we abandon the \code{O(n^2)} diff algorithm in favor of a naive \code{O(n)} one. Set to \code{-1L} to stick to the original algorithm up to the maximum allowed (~INT_MAX/4).} \item{disp.width}{integer(1L) number of display columns to take up; note that in \dQuote{sidebyside} \code{mode} the effective display width is half this number (set to 0L to use default widths which are \code{getOption("width")} for normal styles and \code{80L} for HTML styles). Future versions of \code{diffobj} may change this to larger values for two dimensional objects for better diffs (see details).} \item{ignore.white.space}{TRUE or FALSE, whether to ignore differences in horizontal whitespace (i.e. spaces and tabs) when comparing lines (defaults to TRUE).} \item{convert.hz.white.space}{TRUE or FALSE, whether to modify input strings that contain tabs and carriage returns in such a way that they display as they would \bold{with} those characters, but without using those characters (defaults to TRUE). The conversion assumes that tab stops are spaced evenly eight characters apart on the terminal. If this is not the case you may specify the tab stops explicitly with \code{tab.stops}.} \item{tab.stops}{integer, what tab stops to use when converting hard tabs to spaces. If not integer will be coerced to integer (defaults to 8L). You may specify more than one tab stop.
If display width exceeds that addressable by your tab stops the last tab stop will be repeated.} \item{line.limit}{integer(2L) or integer(1L), if length 1 how many lines of output to show, where \code{-1} means no limit. If length 2, the first value indicates the threshold of screen lines to begin truncating output, and the second the number of lines to truncate to, which should be fewer than the threshold. Note that this parameter is implemented on a best-efforts basis and should not be relied on to produce the exact number of lines requested. In particular do not expect it to work well for values small enough that the banner portion of the diff would have to be trimmed. If you want a specific number of lines use \code{[} or \code{head} / \code{tail}. One advantage of \code{line.limit} over these other options is that you can combine it with \code{context="auto"} and auto \code{max.level} selection (the latter for \code{diffStr}), which allows the diff to dynamically adjust to make best use of the available display lines. \code{[}, \code{head}, and \code{tail} just subset the text of the output.} \item{hunk.limit}{integer(2L) or integer(1L), how many diff hunks to show. Behaves similarly to \code{line.limit}. How many hunks are in a particular diff is a function of how many differences, and also how much \code{context} is used since context can cause two hunks to bleed into each other and become one.} \item{align}{numeric(1L) between 0 and 1, proportion of words in a line of \code{target} that must be matched in a line of \code{current} in the same hunk for those lines to be paired up when displayed (defaults to 0.25), or an \code{\link{AlignThreshold}} object. Set to \code{1} to turn off alignment which will cause all lines in a hunk from \code{target} to show up first, followed by all lines from \code{current}.
Note that in order to be aligned lines must meet the threshold and have at least 3 matching alphanumeric characters (see \code{\link{AlignThreshold}} for details).} \item{style}{\dQuote{auto}, a \code{\link{Style}} object, or a list. \dQuote{auto} by default. If a \code{Style} object, will override the \code{format}, \code{brightness}, and \code{color.mode} parameters. The \code{Style} object provides full control of diff output styling. If a list, then the same as \dQuote{auto}, except that if the auto-selected \code{Style} requires instantiation (see \code{\link{PaletteOfStyles}}), then the list contents will be used as arguments when instantiating the style object. See \code{\link{Style}} for more details, in particular the examples.} \item{palette.of.styles}{\code{\link{PaletteOfStyles}} object; advanced usage, contains all the \code{\link{Style}} objects or \dQuote{classRepresentation} objects extending \code{\link{Style}} that are selected by specifying the \code{format}, \code{brightness}, and \code{color.mode} parameters. See \code{\link{PaletteOfStyles}} for more details.} \item{frame}{an environment to use as the evaluation frame for the \code{print/show/str} calls, and for \code{diffObj}, the evaluation frame for the \code{diffPrint} / \code{diffStr} calls. Defaults to the return value of \code{\link{par_frame}}.} \item{interactive}{TRUE or FALSE whether the function is being run in interactive mode, defaults to the return value of \code{\link{interactive}}. If in interactive mode, pager will be used if \code{pager} is \dQuote{auto}, and if ANSI styles are not supported and \code{style} is \dQuote{auto}, output will be sent to viewer/browser as HTML.} \item{term.colors}{integer(1L) how many ANSI colors are supported by the terminal. This variable is provided for when \code{\link[=num_colors]{crayon::num_colors}} does not properly detect how many ANSI colors are supported by your terminal.
Defaults to return value of \code{\link[=num_colors]{crayon::num_colors}} and should be 8 or 256 to allow ANSI colors, or any other number to disallow them. This only impacts output format selection when \code{style} and \code{format} are both set to \dQuote{auto}.} \item{tar.banner}{character(1L), language, or NULL, used to generate the text to display ahead of the diff section representing the target output. If NULL will use the deparsed \code{target} expression, if language, will use the language as it would the \code{target} expression, if character(1L), will use the string with no modifications. The language mode is provided because \code{diffStr} modifies the expression prior to display (e.g. by wrapping it in a call to \code{str}). Note that it is possible in some cases that the substituted value of \code{target} actually is character(1L), but if you provide a character(1L) value here it will be assumed you intend to use that value literally.} \item{cur.banner}{character(1L) like \code{tar.banner}, but for \code{current}} \item{strip.sgr}{TRUE, FALSE, or NULL (default), whether to strip ANSI CSI SGR sequences prior to comparison and for display of diff. If NULL, resolves to TRUE if `style` resolves to an ANSI formatted diff, and FALSE otherwise. The default behavior is to avoid confusing diffs where the original SGR and the SGR added by the diff are mixed together.} \item{sgr.supported}{TRUE, FALSE, or NULL (default), whether to assume the standard output device supports ANSI CSI SGR sequences. If TRUE, strings will be manipulated accounting for the SGR sequences. If NULL, resolves to TRUE if `style` resolves to an ANSI formatted diff, and to `crayon::has_color()` otherwise. This only controls how the strings are manipulated, not whether SGR is added to format the diff, which is controlled by the `style` parameter. 
This parameter is exposed for the rare cases where you might wish to control string manipulation behavior directly.} \item{extra}{list additional arguments to pass on to the functions used to create text representation of the objects to diff (e.g. \code{print}, \code{str}, etc.)} } \value{ a \code{Diff} object; see \code{\link{diffPrint}}. } \description{ Will perform the diff on the actual string values of the character vectors instead of capturing the printed screen output. Each vector element is treated as a line of text. NA elements are treated as the string \dQuote{NA}. Non character inputs are coerced to character and attributes are dropped with \code{\link{c}}. } \examples{ ## `pager="off"` for CRAN compliance; you may omit in normal use diffChr(LETTERS[1:5], LETTERS[2:6], pager="off") } \seealso{ \code{\link{diffPrint}} for details on the \code{diff*} functions, \code{\link{diffObj}}, \code{\link{diffStr}}, \code{\link{diffDeparse}} to compare deparsed objects, \code{\link{ses}} for a minimal and fast diff } diffobj/man/console_lines.Rd0000755000176200001440000000055513201325222015553 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/set.R \name{console_lines} \alias{console_lines} \title{Attempt to Compute Console Height in Text Lines} \usage{ console_lines() } \value{ integer(1L) } \description{ Returns the value of the \code{LINES} system variable if it is reasonable, 48 otherwise. } \examples{ console_lines() } diffobj/man/summary-PaletteOfStyles-method.Rd0000755000176200001440000000110513656314536020773 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/styles.R \name{summary,PaletteOfStyles-method} \alias{summary,PaletteOfStyles-method} \title{Display a Summarized Version of a PaletteOfStyles} \usage{ \S4method{summary}{PaletteOfStyles}(object, ...) 
} \arguments{ \item{object}{a \code{\link{PaletteOfStyles}} object} \item{...}{unused, for compatibility with generic} } \value{ character representation showing classes and/or objects in PaletteOfStyles } \examples{ summary(PaletteOfStyles()) } \description{ Display a Summarized Version of a PaletteOfStyles } diffobj/man/summary-MyersMbaSes-method.Rd0000755000176200001440000000130113656314536020074 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/core.R \name{summary,MyersMbaSes-method} \alias{summary,MyersMbaSes-method} \title{Summary Method for Shortest Edit Path} \usage{ \S4method{summary}{MyersMbaSes}(object, with.match = FALSE, ...) } \arguments{ \item{object}{the \code{diff_myers} object to display} \item{with.match}{logical(1L) whether to show what text the edit command refers to} \item{...}{forwarded to the data frame print method used to actually display the data} } \value{ whatever the data frame print method returns } \description{ Displays the data required to generate the shortest edit path for comparison between two strings. } \keyword{internal} diffobj/man/diff_myers.Rd0000755000176200001440000000404414123062122015044 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/core.R \name{diff_myers} \alias{diff_myers} \title{Diff two character vectors} \usage{ diff_myers(a, b, max.diffs = -1L, warn = FALSE) } \arguments{ \item{a}{character} \item{b}{character} \item{max.diffs}{integer(1L) how many differences before giving up; set to -1 to allow as many as there are up to the maximum allowed (~INT_MAX/4).} \item{warn}{TRUE or FALSE, whether to warn if we hit `max.diffs`.} } \value{ MyersMbaSes object } \description{ Implementation of Myers' diff algorithm with linear space refinement originally implemented by Mike B. Allen as part of \href{http://www.ioplex.com/~miallen/libmba/}{libmba} version 0.9.1.
This implementation is a heavily modified version of the original C code and is not compatible with the \code{libmba} library. The C code is simplified by using fixed size arrays instead of variable ones for tracking the longest reaching paths and for recording the shortest edit scripts. Additionally all error handling and memory allocation calls have been moved to the internal R functions designed to handle those things. A failover result is provided in the case where max diffs allowed is exceeded. Ability to provide custom comparison functions is removed. } \details{ The result format indicates operations required to convert \code{a} into \code{b} in a precursor format to the GNU diff shortest edit script. The operations are \dQuote{Match} (do nothing), \dQuote{Insert} (insert one or more values of \code{b} into \code{a}), and \dQuote{Delete} (remove one or more values from \code{a}). The \code{length} slot dictates how many values to advance along, insert into, or delete from \code{a}. The \code{offset} slot changes meaning depending on the operation. For \dQuote{Match} and \dQuote{Delete}, it is the starting index of that operation in \code{a}. For \dQuote{Insert}, it is the starting index in \code{b} of the values to insert into \code{a}; the index in \code{a} to insert at is implicit in previous operations. 
} \keyword{internal} diffobj/man/show-DiffSummary-method.Rd0000755000176200001440000000067413656314536017427 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/summmary.R \name{show,DiffSummary-method} \alias{show,DiffSummary-method} \title{Display DiffSummary Objects} \usage{ \S4method{show}{DiffSummary}(object) } \arguments{ \item{object}{a \code{DiffSummary} object} } \value{ NULL, invisibly } \examples{ show( summary(diffChr(letters, letters[-c(5, 15)], format="raw", pager="off")) ) } \description{ Display DiffSummary Objects } diffobj/man/Rdiff_chr.Rd0000755000176200001440000000401613656314536014625 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/rdiff.R \name{Rdiff_chr} \alias{Rdiff_chr} \alias{Rdiff_obj} \title{Run Rdiff Directly on R Objects} \usage{ Rdiff_chr(from, to, silent = FALSE, minimal = FALSE, nullPointers = TRUE) Rdiff_obj(from, to, silent = FALSE, minimal = FALSE, nullPointers = TRUE) } \arguments{ \item{from}{character or object coercible to character for \code{Rdiff_chr}, any R object with \code{Rdiff_obj}, or a file pointing to an RDS object} \item{to}{character same as \code{from}} \item{silent}{TRUE or FALSE, whether to display output to screen} \item{minimal}{TRUE or FALSE, whether to exclude the lines that show the actual differences and show only the edit script commands} \item{nullPointers}{passed to \code{tools::Rdiff}} } \value{ the Rdiff output, invisibly if \code{silent} is FALSE } \examples{ Rdiff_chr(letters[1:5], LETTERS[1:5]) Rdiff_obj(letters[1:5], LETTERS[1:5]) } \description{ These functions are here for reference and testing purposes. They are wrappers to \code{tools::Rdiff} and rely on an existing system diff utility. You should be using \code{\link{ses}} or \code{\link{diffChr}} instead of \code{Rdiff_chr} and \code{\link{diffPrint}} instead of \code{Rdiff_obj}. See limitations in note.
} \details{ \code{Rdiff_chr} runs diffs on character vectors or objects coerced to character vectors, where each value in the vectors is treated as a line in a file. \code{Rdiff_chr} always runs with the \code{useDiff} and \code{Log} parameters set to \code{TRUE}. \code{Rdiff_obj} runs diffs on the \code{print}ed representation of the provided objects. For each of \code{from}, \code{to}, will check if they are 1 length character vectors referencing an RDS file, and will use the contents of that RDS file as the object to compare. } \note{ These functions will try to use the system \code{diff} utility. This will fail in systems that do not have that utility available (e.g. Windows installations without Rtools). } \seealso{ \code{\link{ses}}, \code{\link[=diffPrint]{diff*}} } diffobj/man/pager_is_less.Rd0000755000176200001440000000215213327367306015553 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/pager.R \name{pager_is_less} \alias{pager_is_less} \title{Check Whether System Has less as Pager} \usage{ pager_is_less() } \value{ TRUE or FALSE } \description{ If \code{getOption("pager")} is set to the default value, checks whether \code{Sys.getenv("PAGER")} appears to be \code{less} by trying to run the pager with the \dQuote{version} flag and parsing the output. If \code{getOption("pager")} is not the default value, then checks whether it points to the \code{less} program by the same mechanism. } \details{ Some systems may have \code{less} pagers installed that do not respond to the \code{$LESS} environment variable. For example, \code{more} on at least some versions of OS X is \code{less}, but does not actually respond to \code{$LESS}. If such a pager is the system pager you will likely end up seeing gibberish in the pager. If this is your use case you will need to set up a custom pager configuration object that sets the correct system variables (see \code{\link{Pager}}).
} \examples{ pager_is_less() } \seealso{ \code{\link{Pager}} } diffobj/man/show-Style-method.Rd0000755000176200001440000000152413656314536016274 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/styles.R \name{show,Style-method} \alias{show,Style-method} \alias{show,StyleHtml-method} \title{Show Method for Style Objects} \usage{ \S4method{show}{Style}(object) \S4method{show}{StyleHtml}(object) } \arguments{ \item{object}{a \code{Style} S4 object} } \value{ NULL, invisibly } \description{ Display a small sample diff with the Style object styles applied. For ANSI light and dark styles, will also temporarily set the background and foreground colors to ensure they are compatible with the style, even though this is not done in normal output (i.e. if you intend on using a \dQuote{light} style, you should set your terminal background color to be light or expect sub-optimal rendering). } \examples{ show(StyleAnsi256LightYb()) # assumes ANSI colors supported } diffobj/man/diffCsv.Rd0000755000176200001440000004503714123062122014310 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/diff.R \name{diffCsv} \alias{diffCsv} \alias{diffCsv,ANY-method} \title{Diff CSV Files} \usage{ diffCsv(target, current, ...) 
\S4method{diffCsv}{ANY}( target, current, mode = gdo("mode"), context = gdo("context"), format = gdo("format"), brightness = gdo("brightness"), color.mode = gdo("color.mode"), word.diff = gdo("word.diff"), pager = gdo("pager"), guides = gdo("guides"), trim = gdo("trim"), rds = gdo("rds"), unwrap.atomic = gdo("unwrap.atomic"), max.diffs = gdo("max.diffs"), disp.width = gdo("disp.width"), ignore.white.space = gdo("ignore.white.space"), convert.hz.white.space = gdo("convert.hz.white.space"), tab.stops = gdo("tab.stops"), line.limit = gdo("line.limit"), hunk.limit = gdo("hunk.limit"), align = gdo("align"), style = gdo("style"), palette.of.styles = gdo("palette"), frame = par_frame(), interactive = gdo("interactive"), term.colors = gdo("term.colors"), tar.banner = NULL, cur.banner = NULL, strip.sgr = gdo("strip.sgr"), sgr.supported = gdo("sgr.supported"), extra = list() ) } \arguments{ \item{target}{character(1L) or file connection with read capability; if character should point to a CSV file} \item{current}{like \code{target}} \item{...}{unused, for compatibility of methods with generics} \item{mode}{character(1L), one of: \itemize{ \item \dQuote{unified}: diff mode used by \code{git diff} \item \dQuote{sidebyside}: line up the differences side by side \item \dQuote{context}: show the target and current hunks in their entirety; this mode takes up a lot of screen space but makes it easier to see what the objects actually look like \item \dQuote{auto}: default mode; pick one of the above, will favor \dQuote{sidebyside} unless \code{getOption("width")} is less than 80, or in \code{diffPrint} and objects are dimensioned and do not fit side by side, or in \code{diffChr}, \code{diffDeparse}, \code{diffFile} and output does not fit in side by side without wrapping }} \item{context}{integer(1L) how many lines of context are shown on either side of differences (defaults to 2). Set to \code{-1L} to allow as many as there are. 
Set to \dQuote{auto} to display as many as 10 lines or as few as 1 depending on whether total screen lines fit within the number of lines specified in \code{line.limit}. Alternatively pass the return value of \code{\link{auto_context}} to fine tune the parameters of the auto context calculation.} \item{format}{character(1L), controls the diff output format, one of: \itemize{ \item \dQuote{auto}: to select output format based on terminal capabilities; will attempt to use one of the ANSI formats if they appear to be supported, and if not or if you are in the Rstudio console it will attempt to use HTML and browser output if in interactive mode. \item \dQuote{raw}: plain text \item \dQuote{ansi8}: color and format diffs using basic ANSI escape sequences \item \dQuote{ansi256}: like \dQuote{ansi8}, except using the full range of ANSI formatting options \item \dQuote{html}: color and format using HTML markup; the resulting string is processed with \code{\link{enc2utf8}} when output as a full web page (see docs for \code{html.output} under \code{\link{Style}}). } Defaults to \dQuote{auto}. See \code{palette.of.styles} for details on customization, \code{\link{style}} for full control of output format. See `pager` parameter for more discussion of Rstudio behavior.} \item{brightness}{character, one of \dQuote{light}, \dQuote{dark}, \dQuote{neutral}, useful for adjusting color scheme to light or dark terminals. \dQuote{neutral} by default. See \code{\link{PaletteOfStyles}} for details and limitations. Advanced: you may specify brightness as a function of \code{format}. For example, if you typically wish to use a \dQuote{dark} color scheme, except for when in \dQuote{html} format when you prefer the \dQuote{light} scheme, you may use \code{c("dark", html="light")} as the value for this parameter. This is particularly useful if \code{format} is set to \dQuote{auto} or if you want to specify a default value for this parameter via options. 
Any names you use should correspond to a \code{format}. You must have one unnamed value which will be used as the default for all \code{format}s that are not explicitly specified.} \item{color.mode}{character, one of \dQuote{rgb} or \dQuote{yb}. Defaults to \dQuote{yb}. \dQuote{yb} stands for \dQuote{Yellow-Blue} for color schemes that rely primarily on those colors to style diffs. Those colors can be easily distinguished by individuals with limited red-green color sensitivity. See \code{\link{PaletteOfStyles}} for details and limitations. Also offers the same advanced usage as the \code{brightness} parameter.} \item{word.diff}{TRUE (default) or FALSE, whether to run a secondary word diff on the in-hunk differences. For atomic vectors setting this to FALSE could make the diff \emph{slower} (see the \code{unwrap.atomic} parameter). For other uses, particularly with \code{\link{diffChr}} setting this to FALSE can substantially improve performance.} \item{pager}{one of \dQuote{auto} (default), \dQuote{on}, \dQuote{off}, a \code{\link{Pager}} object, or a list; controls whether and how a pager is used to display the diff output. If you require a particular pager behavior you must use a \code{\link{Pager}} object, or \dQuote{off} to turn off the pager. All other settings will interact with other parameters such as \code{format}, \code{style}, as well as with your system capabilities in order to select the pager expected to be most useful. \dQuote{auto} and \dQuote{on} are the same, except that in non-interactive mode \dQuote{auto} is equivalent to \dQuote{off}. \dQuote{off} will always send output to the console. If \dQuote{on}, whether the output actually gets routed to the pager depends on the pager \code{threshold} setting (see \code{\link{Pager}}). The default behavior is to use the pager associated with the \code{Style} object. The \code{Style} object is itself determined by the \code{format} or \code{style} parameters. 
Depending on your system configuration different styles and corresponding pagers will get selected, unless you specify a \code{Pager} object directly. On a system with a system pager that supports ANSI CSI SGR colors, the pager will only trigger if the output is taller than one window. If the system pager is not known to support ANSI colors then the output will be sent as HTML to the IDE viewer if available or to the web browser if not. Even though Rstudio now supports ANSI CSI SGR at the console, output is still formatted as HTML and sent to the IDE viewer. Partly this is for continuity of behavior, but also because the default Rstudio pager does not support ANSI CSI SGR, at least as of this writing. If \code{pager} is a list, then the same as with \dQuote{on}, except that the \code{Pager} object associated with the selected \code{Style} object is re-instantiated with the union of the list elements and the existing settings of that \code{Pager}. The list should contain named elements that correspond to the \code{\link{Pager}} instantiation parameters. The names must be specified in full as partial parameter matching will not be carried out because the pager is re-instantiated with \code{\link{new}}. See \code{\link{Pager}}, \code{\link{Style}}, and \code{\link{PaletteOfStyles}} for more details and for instructions on how to modify the default behavior.} \item{guides}{TRUE (default), FALSE, or a function that accepts at least two arguments and requires no more than two arguments. Guides are additional context lines that are not strictly part of a hunk, but provide important contextual data (e.g. column headers). If TRUE, the context lines are shown in addition to the normal diff output, typically in a different color to indicate they are not part of the hunk. If a function, the function should accept as the first argument the object being diffed, and the second the character representation of the object. 
The function should return the indices of the elements of the character representation that should be treated as guides. See \code{\link{guides}} for more details.} \item{trim}{TRUE (default), FALSE, or a function that accepts at least two arguments and requires no more than two arguments. Function should compute for each line in captured output what portion of those lines should be diffed. By default, this is used to remove row meta data differences (e.g. \code{[1,]}) so they alone do not show up as differences in the diff. See \code{\link{trim}} for more details.} \item{rds}{TRUE (default) or FALSE, if TRUE will check whether \code{target} and/or \code{current} point to a file that can be read with \code{\link{readRDS}} and if so, loads the R object contained in the file and carries out the diff on the object instead of the original argument. Currently there is no mechanism for specifying additional arguments to \code{readRDS}} \item{unwrap.atomic}{TRUE (default) or FALSE. Relevant primarily for \code{diffPrint}, if TRUE, and \code{word.diff} is also TRUE, and both \code{target} and \code{current} are \emph{unnamed} one-dimension atomics, the vectors are unwrapped and diffed element by element, and then re-wrapped. Since \code{diffPrint} is fundamentally a line diff, the re-wrapped lines are lined up in a manner that is as consistent as possible with the unwrapped diff. Lines that contain the location of the word differences will be paired up. Since the vectors may well be wrapped with different periodicities this will result in lines that are paired up that look like they should not be paired up, though the locations of the differences should be. It is entirely possible that setting this parameter to FALSE will result in a slower diff. This happens if two vectors are actually fairly similar, but their line representations are not. 
For example, in comparing \code{1:100} to \code{c(100, 1:99)}, there is really only one difference at the \dQuote{word} level, but every screen line is different. \code{diffChr} will also do the unwrapping if it is given a character vector that contains output that looks like the atomic vectors described above. This is a bug, but as the functionality could be useful when diffing e.g. \code{capture.output} data, we now declare it a feature.} \item{max.diffs}{integer(1L), number of \emph{differences} (default 50000L) after which we abandon the \code{O(n^2)} diff algorithm in favor of a naive \code{O(n)} one. Set to \code{-1L} to stick to the original algorithm up to the maximum allowed (~INT_MAX/4).} \item{disp.width}{integer(1L) number of display columns to take up; note that in \dQuote{sidebyside} \code{mode} the effective display width is half this number (set to 0L to use default widths which are \code{getOption("width")} for normal styles and \code{80L} for HTML styles). Future versions of \code{diffobj} may change this to larger values for two dimensional objects for better diffs (see details).} \item{ignore.white.space}{TRUE or FALSE, whether to consider differences in horizontal whitespace (i.e. spaces and tabs) as differences (defaults to TRUE).} \item{convert.hz.white.space}{TRUE or FALSE, whether to modify input strings that contain tabs and carriage returns in such a way that they display as they would \bold{with} those characters, but without using those characters (defaults to TRUE). The conversion assumes that tab stops are spaced evenly eight characters apart on the terminal. If this is not the case you may specify the tab stops explicitly with \code{tab.stops}.} \item{tab.stops}{integer, what tab stops to use when converting hard tabs to spaces. If not integer will be coerced to integer (defaults to 8L). You may specify more than one tab stop. 
If display width exceeds that addressable by your tab stops the last tab stop will be repeated.} \item{line.limit}{integer(2L) or integer(1L), if length 1 how many lines of output to show, where \code{-1} means no limit. If length 2, the first value indicates the threshold of screen lines to begin truncating output, and the second the number of lines to truncate to, which should be fewer than the threshold. Note that this parameter is implemented on a best-efforts basis and should not be relied on to produce the exact number of lines requested. In particular do not expect it to work well for values small enough that the banner portion of the diff would have to be trimmed. If you want a specific number of lines use \code{[} or \code{head} / \code{tail}. One advantage of \code{line.limit} over these other options is that you can combine it with \code{context="auto"} and auto \code{max.level} selection (the latter for \code{diffStr}), which allows the diff to dynamically adjust to make best use of the available display lines. \code{[}, \code{head}, and \code{tail} just subset the text of the output.} \item{hunk.limit}{integer(2L) or integer(1L), how many diff hunks to show. Behaves similarly to \code{line.limit}. How many hunks are in a particular diff is a function of how many differences, and also how much \code{context} is used since context can cause two hunks to bleed into each other and become one.} \item{align}{numeric(1L) between 0 and 1, proportion of words in a line of \code{target} that must be matched in a line of \code{current} in the same hunk for those lines to be paired up when displayed (defaults to 0.25), or an \code{\link{AlignThreshold}} object. Set to \code{1} to turn off alignment which will cause all lines in a hunk from \code{target} to show up first, followed by all lines from \code{current}. 
Note that in order to be aligned lines must meet the threshold and have at least 3 matching alphanumeric characters (see \code{\link{AlignThreshold}} for details).} \item{style}{\dQuote{auto}, a \code{\link{Style}} object, or a list. \dQuote{auto} by default. If a \code{Style} object, will override the \code{format}, \code{brightness}, and \code{color.mode} parameters. The \code{Style} object provides full control of diff output styling. If a list, then the same as \dQuote{auto}, except that if the auto-selected \code{Style} requires instantiation (see \code{\link{PaletteOfStyles}}), then the list contents will be used as arguments when instantiating the style object. See \code{\link{Style}} for more details, in particular the examples.} \item{palette.of.styles}{\code{\link{PaletteOfStyles}} object; advanced usage, contains all the \code{\link{Style}} objects or \dQuote{classRepresentation} objects extending \code{\link{Style}} that are selected by specifying the \code{format}, \code{brightness}, and \code{color.mode} parameters. See \code{\link{PaletteOfStyles}} for more details.} \item{frame}{an environment to use as the evaluation frame for the \code{print/show/str} calls, and for \code{diffObj}, the evaluation frame for the \code{diffPrint} / \code{diffStr} calls. Defaults to the return value of \code{\link{par_frame}}.} \item{interactive}{TRUE or FALSE whether the function is being run in interactive mode, defaults to the return value of \code{\link{interactive}}. If in interactive mode, pager will be used if \code{pager} is \dQuote{auto}, and if ANSI styles are not supported and \code{style} is \dQuote{auto}, output will be sent to viewer/browser as HTML.} \item{term.colors}{integer(1L) how many ANSI colors are supported by the terminal. This variable is provided for when \code{\link[=num_colors]{crayon::num_colors}} does not properly detect how many ANSI colors are supported by your terminal. 
Defaults to return value of \code{\link[=num_colors]{crayon::num_colors}} and should be 8 or 256 to allow ANSI colors, or any other number to disallow them. This only impacts output format selection when \code{style} and \code{format} are both set to \dQuote{auto}.} \item{tar.banner}{character(1L), language, or NULL, used to generate the text to display ahead of the diff section representing the target output. If NULL will use the deparsed \code{target} expression, if language, will use the language as it would the \code{target} expression, if character(1L), will use the string with no modifications. The language mode is provided because \code{diffStr} modifies the expression prior to display (e.g. by wrapping it in a call to \code{str}). Note that it is possible in some cases that the substituted value of \code{target} actually is character(1L), but if you provide a character(1L) value here it will be assumed you intend to use that value literally.} \item{cur.banner}{character(1L) like \code{tar.banner}, but for \code{current}} \item{strip.sgr}{TRUE, FALSE, or NULL (default), whether to strip ANSI CSI SGR sequences prior to comparison and for display of diff. If NULL, resolves to TRUE if `style` resolves to an ANSI formatted diff, and FALSE otherwise. The default behavior is to avoid confusing diffs where the original SGR and the SGR added by the diff are mixed together.} \item{sgr.supported}{TRUE, FALSE, or NULL (default), whether to assume the standard output device supports ANSI CSI SGR sequences. If TRUE, strings will be manipulated accounting for the SGR sequences. If NULL, resolves to TRUE if `style` resolves to an ANSI formatted diff, and to `crayon::has_color()` otherwise. This only controls how the strings are manipulated, not whether SGR is added to format the diff, which is controlled by the `style` parameter. 
This parameter is exposed for the rare cases where you might wish to control string manipulation behavior directly.} \item{extra}{list additional arguments to pass on to the functions used to create text representation of the objects to diff (e.g. \code{print}, \code{str}, etc.)} } \value{ a \code{Diff} object; see \code{\link{diffPrint}}. } \description{ Reads CSV files with \code{\link{read.csv}} and passes the resulting data frames onto \code{\link{diffPrint}}. \code{extra} values are passed as arguments to both \code{read.csv} and \code{print}. To the extent you wish to use different \code{extra} arguments for each of those functions you will need to \code{read.csv} the files and pass them to \code{diffPrint} yourself. } \examples{ iris.2 <- iris iris.2$Sepal.Length[5] <- 99 f1 <- tempfile() f2 <- tempfile() write.csv(iris, f1, row.names=FALSE) write.csv(iris.2, f2, row.names=FALSE) ## `pager="off"` for CRAN compliance; you may omit in normal use diffCsv(f1, f2, pager="off") unlink(c(f1, f2)) } \seealso{ \code{\link{diffPrint}} for details on the \code{diff*} functions, \code{\link{diffObj}}, \code{\link{diffStr}}, \code{\link{diffChr}} to compare character vectors directly, \code{\link{ses}} for a minimal and fast diff } diffobj/man/Pager.Rd0000755000176200001440000002666213656314536014010 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/pager.R \docType{class} \name{Pager} \alias{Pager} \alias{PagerOff,} \alias{PagerSystem,} \alias{PagerSystemLess,} \alias{PagerBrowser} \alias{PagerOff-class} \alias{PagerOff} \alias{PagerSystem-class} \alias{PagerSystem} \alias{PagerSystemLess-class} \alias{PagerSystemLess} \title{Objects for Specifying Pager Settings} \usage{ Pager( pager = function(x) writeLines(readLines(x)), file.ext = "", threshold = 0L, ansi = FALSE, file.path = NA_character_, make.blocking = FALSE ) PagerOff(...) PagerSystem(pager = file.show, threshold = -1L, file.ext = "", ...) 
PagerSystemLess( pager = file.show, threshold = -1L, flags = "R", file.ext = "", ansi = TRUE, ... ) PagerBrowser( pager = view_or_browse, threshold = 0L, file.ext = "html", make.blocking = NA, ... ) } \arguments{ \item{pager}{a function that accepts at least one parameter and does not require a parameter other than the first parameter. This function will be called with a file path passed as the first argument. The referenced file will contain the text of the diff. By default this is a temporary file that will be deleted as soon as the pager function completes evaluation. \code{PagerSystem} and \code{PagerSystemLess} use \code{\link{file.show}} by default, and \code{PagerBrowser} uses \code{\link{view_or_browse}} for HTML output. For asynchronous pagers such as \code{view_or_browse} it is important to make the pager function blocking by setting the \code{make.blocking} parameter to TRUE, or to specify a pager file path explicitly with \code{file.path}.} \item{file.ext}{character(1L) an extension to append to file path passed to \code{pager}, \emph{without} the period. For example, \code{PagerBrowser} uses \dQuote{html} to cause \code{\link{browseURL}} to launch the web browser. This parameter will be overridden if \code{file.path} is used.} \item{threshold}{integer(1L) number of lines of output that triggers the use of the pager; negative values lead to using \code{\link{console_lines} + 1}, and zero leads to always using the pager irrespective of how many lines the output has.} \item{ansi}{TRUE or FALSE, whether the pager supports ANSI CSI SGR sequences.} \item{file.path}{character(1L), if not NA the diff will be written to this location, ignoring the value of \code{file.ext}. If NA_character_ (default), a temporary file is used and removed after the pager function completes evaluation. If not NA, the file is preserved. Beware that the file will be overwritten if it already exists.} \item{make.blocking}{TRUE, FALSE, or NA. 
Whether to wrap \code{pager} with \code{\link{make_blocking}} prior to calling it. This suspends R code execution until there is user input so that temporary diff files are not deleted before the pager has a chance to read them. This typically defaults to FALSE, except for \code{PagerBrowser} where it defaults to NA, which resolves to \code{is.na(file.path)} (i.e. it is TRUE if the diff is being written to a temporary file, and FALSE otherwise).} \item{...}{additional arguments to pass on to \code{new} that are passed on to parent classes.} \item{flags}{character(1L), only for \code{PagerSystemLess}, what flags to set with the \code{LESS} system environment variable. By default the \dQuote{R} flag is set to ensure ANSI escape sequences are interpreted if it appears your terminal supports ANSI escape sequences. If you want to leave the output on the screen after you exit the pager you can use \dQuote{RX}. You should only provide the flag letters (e.g. \dQuote{"RX"}, not \code{"-RX"}). The system variable is only modified for the duration of the evaluation and is reset / unset afterwards. \emph{Note:} you must specify this slot via the constructor as in the example. If you set the slot directly it will not have any effect.} } \description{ Initializers for pager configuration objects that modify pager behavior. These objects can be used as the \code{pager} argument to the \code{\link[=diffPrint]{diff*}} methods, or as the \code{pager} slot for \code{\link{Style}} objects. In this documentation we use the \dQuote{pager} term loosely and intend it to refer to any device other than the terminal that can be used to render output. } \section{Default Output Behavior}{ \code{\link[=diffPrint]{diff*}} methods use \dQuote{pagers} to help manage large outputs and also to provide an alternative colored diff when the terminal does not support them directly. 
For OS X and *nix systems where \code{less} is the pager and the terminal supports ANSI escape sequences, output is colored with ANSI escape sequences. If the output exceeds one screen height in size (as estimated by \code{\link{console_lines}}) it is sent to the pager. If the terminal does not support ANSI escape sequences, or if the system pager is not \code{less} as detected by \code{\link{pager_is_less}}, then the output is rendered in HTML and sent to the IDE viewer (\code{getOption("viewer")}) if defined, or to the browser with \code{\link{browseURL}} if not. This behavior may seem sub-optimal for systems that have ANSI aware terminals and ANSI aware pagers other than \code{less}, but these should be rare and it is possible to configure \code{diffobj} to produce the correct output for them (see examples). } \section{Pagers and Styles}{ There is a close relationship between pagers and \code{\link{Style}}. The \code{Style} objects control whether the output is raw text, formatted with ANSI escape sequences, or marked up with HTML. In order for these different types of outputs to render properly, they need to be sent to the right device. For this reason \code{\link{Style}} objects come with a \code{Pager} configuration object pre-assigned so the output can render correctly. The exact \code{Pager} configuration object depends on the \code{\link{Style}} as well as the system configuration. In any call to the \code{\link[=diffPrint]{diff*}} methods you can always specify both the \code{\link{Style}} and \code{Pager} configuration object directly for full control of output formatting and rendering. We have tried to set-up sensible defaults for most likely use cases, but given the complex interactions involved it is possible you may need to configure things explicitly. 
Should you need to define explicit configurations you can save them as option values with \code{options(diffobj.pager=..., diffobj.style=...)} so that you do not need to specify them each time you use \code{diffobj}. } \section{Pager Configuration Objects}{ The \code{Pager} configuration objects allow you to specify what device to use as the pager and under what circumstances the pager should be used. Several pre-defined pager configuration objects are available via constructor functions: \itemize{ \item \code{Pager}: Generic pager just outputs directly to terminal; not useful unless the default parameters are modified. \item \code{PagerOff}: Turn off pager \item \code{PagerSystem}: Use the system pager as invoked by \code{\link{file.show}} \item \code{PagerSystemLess}: Like \code{PagerSystem}, but provides additional configuration options if the system pager is \code{less}. Note this object does not change the system pager; it only allows you to configure it via the \code{$LESS} environment variable which will have no effect unless the system pager is set to be \code{less}. \item \code{PagerBrowser}: Use \code{getOption("viewer")} if defined, or \code{\link{browseURL}} if not } The default configuration for \code{PagerSystem} and \code{PagerSystemLess} leads to output being sent to the pager if it exceeds the estimated window size, whereas \code{PagerBrowser} always sends output to the pager. This behavior can be configured via the \code{threshold} parameter. \code{PagerSystemLess}'s primary role is to correctly configure the \code{$LESS} system variable so that \code{less} renders the ANSI escape sequences as intended. On OS X \code{more} is a faux-alias to \code{less}, except it does not appear to read the \code{$LESS} system variable. 
Should you configure your system pager to be the \code{more} version of \code{less}, \code{\link{pager_is_less}} will be tricked into thinking you are using a \dQuote{normal} version of \code{less} and you will likely end up seeing gibberish in the pager. If this is your use case you will need to set-up a custom pager configuration object that sets the correct system variables. } \section{Custom Pager Configurations}{ In most cases the simplest way to generate new pager configurations is to use a list specification in the \code{\link[=diffPrint]{diff*}} call. Alternatively you can start with an existing \code{Pager} object and change the defaults. Both these cases are covered in the examples. You can change what system pager is used by \code{PagerSystem} by changing it with \code{options(pager=...)} or by changing the \code{$PAGER} environment variable. You can also explicitly set a function to act as the pager when you instantiate the \code{Pager} configuration object (see examples). If you wish to define your own pager object you should do so by extending any of the \code{Pager} classes. If the function you use to handle the actual paging is non-blocking (i.e. allows R code evaluation to continue after it is spawned), you should set the \code{make.blocking} parameter to TRUE to pause execution prior to deleting the temporary file that contains the diff. } \examples{ ## We `dontrun` these examples as they involve pagers that should only be run ## in interactive mode \dontrun{ ## Specify Pager parameters via list; this lets the `diff*` functions pick ## their preferred pager based on format and other output parameters, but ## allows you to modify the pager behavior. 
f <- tempfile() diffChr(1:200, 180:300, format='html', pager=list(file.path=f)) head(readLines(f)) # html output unlink(f) ## Assuming system pager is `less` and terminal supports ANSI ESC sequences ## Equivalent to running `less -RFX` diffChr(1:200, 180:300, pager=PagerSystemLess(flags="RFX")) ## If the auto-selected pager would be the system pager, we could ## equivalently use: diffChr(1:200, 180:300, pager=list(flags="RFX")) ## System pager is not less, but it supports ANSI escape sequences diffChr(1:200, 180:300, pager=PagerSystem(ansi=TRUE)) ## Use a custom pager, in this case we make up a trivial one and configure it to ## always page (`threshold=0L`) page.fun <- function(x) cat(paste0("| ", readLines(x)), sep="\n") page.conf <- PagerSystem(pager=page.fun, threshold=0L) diffChr(1:200, 180:300, pager=page.conf, disp.width=getOption("width") - 2) ## Set-up the custom pager as the default pager options(diffobj.pager=page.conf) diffChr(1:200, 180:300) ## A blocking pager (this is effectively very similar to what `PagerBrowser` ## does); need to block b/c otherwise temp file with diff could be deleted ## before the device has a chance to read it since `browseURL` is not ## blocking itself. On OS X we need to specify the extension so the correct ## program opens it (in this case `TextEdit`): page.conf <- Pager(pager=browseURL, file.ext="txt", make.blocking=TRUE) diffChr(1:200, 180:300, pager=page.conf, format='raw') ## An alternative to a blocking pager is to disable the ## auto-file deletion; here we also specify a file location ## explicitly so we can recover the diff text. 
f <- paste0(tempfile(), ".html") # must specify .html diffChr(1:5, 2:6, format='html', pager=list(file.path=f)) tail(readLines(f)) unlink(f) } } \seealso{ \code{\link{Style}}, \code{\link{pager_is_less}} } diffobj/man/diffobj_s4method_doc.Rd0000755000176200001440000000071613656314536017001 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/s4.R, R/core.R, R/tochar.R \name{diffobj_s4method_doc} \alias{diffobj_s4method_doc} \alias{show,MyersMbaSes-method} \alias{as.character,Diff-method} \title{Dummy Doc File for S4 Methods with Existing Generics} \usage{ \S4method{show}{MyersMbaSes}(object) \S4method{as.character}{Diff}(x, ...) } \description{ Dummy Doc File for S4 Methods with Existing Generics } \keyword{internal} diffobj/man/trim.Rd0000755000176200001440000000710413656314536013713 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/trim.R \name{trim} \alias{trim} \alias{trimPrint,} \alias{trimStr,} \alias{trimChr,} \alias{trimDeparse,} \alias{trimFile} \alias{trimPrint} \alias{trimPrint,ANY,character-method} \alias{trimStr} \alias{trimStr,ANY,character-method} \alias{trimChr} \alias{trimChr,ANY,character-method} \alias{trimDeparse} \alias{trimDeparse,ANY,character-method} \alias{trimFile,ANY,character-method} \title{Methods to Remove Unsemantic Text Prior to Diff} \usage{ trimPrint(obj, obj.as.chr) \S4method{trimPrint}{ANY,character}(obj, obj.as.chr) trimStr(obj, obj.as.chr) \S4method{trimStr}{ANY,character}(obj, obj.as.chr) trimChr(obj, obj.as.chr) \S4method{trimChr}{ANY,character}(obj, obj.as.chr) trimDeparse(obj, obj.as.chr) \S4method{trimDeparse}{ANY,character}(obj, obj.as.chr) trimFile(obj, obj.as.chr) \S4method{trimFile}{ANY,character}(obj, obj.as.chr) } \arguments{ \item{obj}{the object} \item{obj.as.chr}{character the \code{print}ed representation of the object} } \value{ a \code{length(obj.as.chr)} row and 2 column integer matrix with the start (first column) and end 
(second column) character positions of the sub string to run diffs on. } \description{ \code{\link[=diffPrint]{diff*}} methods, in particular \code{diffPrint}, modify the text representation of an object prior to running the diff to reduce the incidence of spurious mismatches caused by unsemantic differences. For example, we look to remove matrix row indices and atomic vector indices (i.e. the \samp{[1,]} or \samp{[1]} strings at the beginning of each display line). } \details{ Consider: \preformatted{ > matrix(10:12) [,1] [1,] 10 [2,] 11 [3,] 12 > matrix(11:12) [,1] [1,] 11 [2,] 12 } In this case, the line by line diff would find all rows of the matrix to be mismatched because where the data matches (rows containing 11 and 12) the indices do not. By trimming out the row indices before the diff, the diff can recognize that row 2 and 3 from the first matrix should be matched to row 1 and 2 of the second. These methods follow a similar interface as the \code{\link[=guides]{guide*}} methods, with one available for each \code{diff*} method except for \code{diffCsv} since that one uses \code{diffPrint} internally. The unsemantic differences are added back after the diff for display purposes, and are colored in grey to indicate they are ignored in the diff. Currently only \code{trimPrint} and \code{trimStr} do anything meaningful. \code{trimPrint} removes row index headers provided that they are of the default un-named variety. If you add row names, or if numeric row indices are not ascending from 1, they will not be stripped as those have meaning. \code{trimStr} removes the \samp{..$}, \samp{..-}, and \samp{..@} tokens to minimize spurious matches. You can modify how text is trimmed by providing your own functions to the \code{trim} argument of the \code{diff*} methods, or by defining \code{trim*} methods for your objects. Note that the return value for these functions is the start and end columns of the text that should be \emph{kept} and used in the diff. 
As with guides, trimming is on a best efforts basis and may fail with \dQuote{pathological} display representations. Since the diff still works even with failed trimming this is considered an acceptable compromise. Trimming is more likely to fail with nested recursive structures. } \note{ \code{obj.as.chr} will be as processed by \code{\link{strip_hz_control}} and as such will not be identical to the captured output if it contains tabs, newlines, or carriage returns. } diffobj/man/Style.Rd0000755000176200001440000003432313656314536014043 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/styles.R \docType{class} \name{Style-class} \alias{Style-class} \alias{Style} \alias{StyleRaw-class} \alias{StyleRaw} \alias{StyleAnsi-class} \alias{StyleAnsi} \alias{StyleAnsi8NeutralRgb-class} \alias{StyleAnsi8NeutralRgb} \alias{StyleAnsi8NeutralYb-class} \alias{StyleAnsi8NeutralYb} \alias{StyleAnsi256LightRgb-class} \alias{StyleAnsi256LightRgb} \alias{StyleAnsi256LightYb-class} \alias{StyleAnsi256LightYb} \alias{StyleAnsi256DarkRgb-class} \alias{StyleAnsi256DarkRgb} \alias{StyleAnsi256DarkYb-class} \alias{StyleAnsi256DarkYb} \alias{StyleHtml-class} \alias{StyleHtml} \alias{StyleHtmlLightRgb-class} \alias{StyleHtmlLightRgb} \alias{StyleHtmlLightYb-class} \alias{StyleHtmlLightYb} \title{Customize Appearance of Diff} \arguments{ \item{funs}{a \code{\link{StyleFuns}} object that contains all the functions represented above} \item{text}{a \code{\link{StyleText}} object that contains the non-content text used by the diff (e.g. 
\code{gutter.insert.txt})} \item{summary}{a \code{\link{StyleSummary}} object that contains formatting functions and other meta data for rendering summaries} \item{pad}{TRUE or FALSE, whether text should be right padded} \item{pager}{what type of \code{\link{Pager}} to use} \item{nchar.fun}{function to use to count characters; intended mostly for internal use (used only for gutters as of version 0.2.0).} \item{wrap}{TRUE or FALSE, whether text should be hard wrapped at \code{disp.width}} \item{na.sub}{what character value to substitute for NA elements; NA elements are generated when lining up side by side diffs by adding padding rows; by default the text styles replace these with a blank character string, and the HTML styles leave them as NA for the HTML formatting functions to deal with} \item{blank}{sub what character value to replace blanks with; needed in particular for HTML rendering (uses \code{" "}) to prevent lines from collapsing} \item{disp.width}{how many columns the text representation of the objects to diff is allowed to take up before it is hard wrapped (assuming \code{wrap} is TRUE). See param \code{disp.width} for \code{\link{diffPrint}}.} \item{finalizer}{function that accepts at least two parameters and requires no more than two parameters, will receive as the first parameter the object to render (either a \code{Diff} or a \code{DiffSummary} object), and the text representation of that object as the second argument. This allows final modifications to the character output so that it is displayed correctly by the pager. For example, \code{StyleHtml} objects use it to generate HTML headers if the \code{Diff} is destined to be displayed in a browser. 
The object themselves are passed along to provide information about the paging device and other contextual data to the function.} \item{html.output}{(\code{StyleHtml} objects only) one of: \itemize{ \item \dQuote{page}: Include all HTML/CSS/JS required to create a stand-alone web page with the diff; in this mode the diff string will be re-encoded with \code{\link{enc2utf8}} and the HTML page encoding will be declared as UTF-8. \item \dQuote{diff.w.style}: The CSS and HTML, but without any of the outer tags that would make it a proper HTML page (i.e. no \code{/} tags or the like) and without the JS; note that technically this is illegal HTML since we have \code{", css.txt) } } if(html.output == "diff.w.style") { tpl <- "%s%s" } else if (html.output == "page") { x.chr <- enc2utf8(x.chr) charset <- '' tpl <- sprintf(" %s\n %%s\n
\n%%s\n
", charset, js ) } else if (html.output == "diff.only") { css <- "" tpl <- "%s%s" } else stop("Internal Error: unexpected html.output; contact maintainer.")# nocov sprintf(tpl, css, paste0(x.chr, collapse="")) } ) diffobj/R/pager.R0000755000176200001440000004670013777704534013332 0ustar liggesusers# Copyright (C) 2021 Brodie Gaslam # # This file is part of "diffobj - Diffs for R Objects" # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # Go to for a copy of the license. #' Objects for Specifying Pager Settings #' #' Initializers for pager configuration objects that modify pager behavior. #' These objects can be used as the \code{pager} argument to the #' \code{\link[=diffPrint]{diff*}} methods, or as the \code{pager} slot for #' \code{\link{Style}} objects. In this documentation we use the \dQuote{pager} #' term loosely and intend it to refer to any device other than the terminal #' that can be used to render output. #' #' @section Default Output Behavior: #' #' \code{\link[=diffPrint]{diff*}} methods use \dQuote{pagers} to help #' manage large outputs and also to provide an alternative colored diff when the #' terminal does not support them directly. #' #' For OS X and *nix systems where \code{less} is the pager and the #' terminal supports ANSI escape sequences, output is colored with ANSI escape #' sequences. If the output exceeds one screen height in size (as estimated by #' \code{\link{console_lines}}) it is sent to the pager. 
#' #' If the terminal does not support ANSI escape sequences, or if the system #' pager is not \code{less} as detected by \code{\link{pager_is_less}}, then the #' output is rendered in HTML and sent to the IDE viewer #' (\code{getOption("viewer")}) if defined, or to the browser with #' \code{\link{browseURL}} if not. This behavior may seem sub-optimal for #' systems that have ANSI aware terminals and ANSI aware pagers other than #' \code{less}, but these should be rare and it is possible to configure #' \code{diffobj} to produce the correct output for them (see examples). #' #' @section Pagers and Styles: #' #' There is a close relationship between pagers and \code{\link{Style}}. The #' \code{Style} objects control whether the output is raw text, formatted #' with ANSI escape sequences, or marked up with HTML. In order for these #' different types of outputs to render properly, they need to be sent to the #' right device. For this reason \code{\link{Style}} objects come with a #' \code{Pager} configuration object pre-assigned so the output can render #' correctly. The exact \code{Pager} configuration object depends on the #' \code{\link{Style}} as well as the system configuration. #' #' In any call to the \code{\link[=diffPrint]{diff*}} methods you can always #' specify both the \code{\link{Style}} and \code{Pager} configuration object #' directly for full control of output formatting and rendering. We have tried #' to set-up sensible defaults for most likely use cases, but given the complex #' interactions involved it is possible you may need to configure things #' explicitly. Should you need to define explicit configurations you can save #' them as option values with #' \code{options(diffobj.pager=..., diffobj.style=...)} so that you do not need #' to specify them each time you use \code{diffobj}. 
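#'
#' As a minimal sketch of the option-based set-up described above (assuming
#' the system pager is \code{less} and the terminal supports ANSI escape
#' sequences; the specific flags and style are illustrative choices, not
#' package defaults): \preformatted{
#' options(
#'   diffobj.pager=PagerSystemLess(flags="RFX"),
#'   diffobj.style=StyleAnsi8NeutralYb()
#' )
#' diffChr(1:200, 180:300)
#' }
#' Subsequent \code{diff*} calls should then pick up the stored pager and
#' style without further arguments.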
#' #' @section Pager Configuration Objects: #' #' The \code{Pager} configuration objects allow you to specify what device to #' use as the pager and under what circumstances the pager should be used. #' Several pre-defined pager configuration objects are available via #' constructor functions: #' \itemize{ #' \item \code{Pager}: Generic pager just outputs directly to terminal; not #' useful unless the default parameters are modified. #' \item \code{PagerOff}: Turn off pager #' \item \code{PagerSystem}: Use the system pager as invoked by #' \code{\link{file.show}} #' \item \code{PagerSystemLess}: Like \code{PagerSystem}, but provides #' additional configuration options if the system pager is \code{less}. #' Note this object does not change the system pager; it only allows you to #' configure it via the \code{$LESS} environment variable which will have #' no effect unless the system pager is set to be \code{less}. #' \item \code{PagerBrowser}: Use \code{getOption("viewer")} if defined, or #' \code{\link{browseURL}} if not #' } #' The default configuration for \code{PagerSystem} and \code{PagerSystemLess} #' leads to output being sent to the pager if it exceeds the estimated window #' size, whereas \code{PagerBrowser} always sends output to the pager. This #' behavior can be configured via the \code{threshold} parameter. #' #' \code{PagerSystemLess}'s primary role is to correctly configure the #' \code{$LESS} system variable so that \code{less} renders the ANSI escape #' sequences as intended. On OS X \code{more} is a faux-alias to \code{less}, #' except it does not appear to read the \code{$LESS} system variable. #' Should you configure your system pager to be the \code{more} version of #' \code{less}, \code{\link{pager_is_less}} will be tricked into thinking you #' are using a \dQuote{normal} version of \code{less} and you will likely end up #' seeing gibberish in the pager. 
If this is your use case you will need to #' set up a custom pager configuration object that sets the correct system #' variables. #' #' @section Custom Pager Configurations: #' #' In most cases the simplest way to generate new pager configurations is to use #' a list specification in the \code{\link[=diffPrint]{diff*}} call. #' Alternatively you can start with an existing \code{Pager} object and change #' the defaults. Both these cases are covered in the examples. #' #' You can change what system pager is used by \code{PagerSystem} by changing it #' with \code{options(pager=...)} or by changing the \code{$PAGER} environment #' variable. You can also explicitly set a function to act as the pager when #' you instantiate the \code{Pager} configuration object (see examples). #' #' If you wish to define your own pager object you should do so by extending #' any of the \code{Pager} classes. If the function you use to handle the #' actual paging is non-blocking (i.e. allows R code evaluation to continue #' after it is spawned), you should set the \code{make.blocking} parameter to #' TRUE to pause execution prior to deleting the temporary file that contains #' the diff. #' #' @param pager a function that accepts at least one parameter and does not #' require a parameter other than the first parameter. This function will be #' called with a file path passed as the first argument. The referenced file #' will contain the text of the diff. By default this is a temporary file that #' will be deleted as soon as the pager function completes evaluation. #' \code{PagerSystem} and \code{PagerSystemLess} use \code{\link{file.show}} #' by default, and \code{PagerBrowser} uses #' \code{\link{view_or_browse}} for HTML output. For asynchronous pagers such #' as \code{view_or_browse} it is important to make the pager function #' blocking by setting the \code{make.blocking} parameter to TRUE, or to #' specify a pager file path explicitly with \code{file.path}. 
#' @param file.ext character(1L) an extension to append to file path passed to #' \code{pager}, \emph{without} the period. For example, \code{PagerBrowser} #' uses \dQuote{html} to cause \code{\link{browseURL}} to launch the web #' browser. This parameter will be overridden if \code{file.path} is used. #' @param threshold integer(1L) number of lines of output that triggers the use #' of the pager; negative values lead to using #' \code{\link{console_lines} + 1}, and zero leads to always using the pager #' irrespective of how many lines the output has. #' @param ansi TRUE or FALSE, whether the pager supports ANSI CSI SGR sequences. #' @param flags character(1L), only for \code{PagerSystemLess}, what flags to #' set with the \code{LESS} system environment variable. By default the #' \dQuote{R} flag is set to ensure ANSI escape sequences are interpreted if #' it appears your terminal supports ANSI escape sequences. If you want to #' leave the output on the screen after you exit the pager you can use #' \dQuote{RX}. You should only provide the flag letters (e.g. \dQuote{"RX"}, #' not \code{"-RX"}). The system variable is only modified for the duration #' of the evaluation and is reset / unset afterwards. \emph{Note:} you must #' specify this slot via the constructor as in the example. If you set the #' slot directly it will not have any effect. #' @param file.path character(1L), if not NA the diff will be written to this #' location, ignoring the value of \code{file.ext}. If NA_character_ #' (default), a temporary file is used and removed after the pager function #' completes evaluation. If not NA, the file is preserved. Beware that the #' file will be overwritten if it already exists. #' @param make.blocking TRUE, FALSE, or NA. Whether to wrap \code{pager} with #' \code{\link{make_blocking}} prior to calling it. This suspends R code #' execution until there is user input so that temporary diff files are not #' deleted before the pager has a chance to read them. 
This typically #' defaults to FALSE, except for \code{PagerBrowser} where it defaults to NA, #' which resolves to \code{is.na(file.path)} (i.e. it is TRUE if the diff is #' being written to a temporary file, and FALSE otherwise). #' @param ... additional arguments to pass on to \code{new} that are passed on #' to parent classes. #' #' @aliases PagerOff, PagerSystem, PagerSystemLess, PagerBrowser #' @importFrom utils browseURL #' @include options.R #' @rdname Pager #' @name Pager #' @seealso \code{\link{Style}}, \code{\link{pager_is_less}} #' @examples #' ## We `dontrun` these examples as they involve pagers that should only be run #' ## in interactive mode #' \dontrun{ #' ## Specify Pager parameters via list; this lets the `diff*` functions pick #' ## their preferred pager based on format and other output parameters, but #' ## allows you to modify the pager behavior. #' #' f <- tempfile() #' diffChr(1:200, 180:300, format='html', pager=list(file.path=f)) #' head(readLines(f)) # html output #' unlink(f) #' #' ## Assuming system pager is `less` and terminal supports ANSI ESC sequences #' ## Equivalent to running `less -RFX` #' #' diffChr(1:200, 180:300, pager=PagerSystemLess(flags="RFX")) #' #' ## If the auto-selected pager would be the system pager, we could #' ## equivalently use: #' #' diffChr(1:200, 180:300, pager=list(flags="RFX")) #' #' ## System pager is not less, but it supports ANSI escape sequences #' #' diffChr(1:200, 180:300, pager=PagerSystem(ansi=TRUE)) #' #' ## Use a custom pager, in this case we make up a trivial one and configure it #' ## always page (`threshold=0L`) #' #' page.fun <- function(x) cat(paste0("| ", readLines(x)), sep="\n") #' page.conf <- PagerSystem(pager=page.fun, threshold=0L) #' diffChr(1:200, 180:300, pager=page.conf, disp.width=getOption("width") - 2) #' #' ## Set-up the custom pager as the default pager #' #' options(diffobj.pager=page.conf) #' diffChr(1:200, 180:300) #' #' ## A blocking pager (this is effectively very similar 
to what `PagerBrowser` #' ## does); need to block b/c otherwise temp file with diff could be deleted #' ## before the device has a chance to read it since `browseURL` is not #' ## blocking itself. On OS X we need to specify the extension so the correct #' ## program opens it (in this case `TextEdit`): #' #' page.conf <- Pager(pager=browseURL, file.ext="txt", make.blocking=TRUE) #' diffChr(1:200, 180:300, pager=page.conf, format='raw') #' #' ## An alternative to a blocking pager is to disable the #' ## auto-file deletion; here we also specify a file location #' ## explicitly so we can recover the diff text. #' #' f <- paste0(tempfile(), ".html") # must specify .html #' diffChr(1:5, 2:6, format='html', pager=list(file.path=f)) #' tail(readLines(f)) #' unlink(f) #' } setClass( "Pager", slots=c( pager="function", file.ext="character", threshold="numeric", ansi="logical", file.path="character", make.blocking="logical" ), prototype=list( pager=function(x) writeLines(readLines(x)), file.ext="", threshold=0L, ansi=FALSE, file.path=NA_character_, make.blocking=FALSE ), validity=function(object) { if(!is.chr.1L(object@file.ext)) return("Invalid `file.ext` slot") if(!is.int.1L(object@threshold)) return("Invalid `threshold` slot") if(!is.TF(object@ansi)) return("Invalid `ansi` slot") if(!is.logical(object@make.blocking) || length(object@make.blocking) != 1L) return("Invalid `make.blocking` slot") if(!is.character(object@file.path) || length(object@file.path) != 1L) return("Invalid `file.path` slot") TRUE } ) setMethod("initialize", "Pager", function(.Object, ...) { dots <- list(...) 
if("file.path" %in% names(dots)) { file.path <- dots[['file.path']] if(length(file.path) != 1L) stop("Argument `file.path` must be length 1.") if(is.na(file.path)) file.path <- NA_character_ if(!is.character(file.path)) stop("Argument `file.path` must be character.") dots[['file.path']] <- file.path } do.call(callNextMethod, c(list(.Object), dots)) } ) #' @export #' @rdname Pager Pager <- function( pager=function(x) writeLines(readLines(x)), file.ext="", threshold=0L, ansi=FALSE, file.path=NA_character_, make.blocking=FALSE ) { new( 'Pager', pager=pager, file.ext=file.ext, threshold=threshold, ansi=ansi, file.path=file.path, make.blocking=make.blocking ) } #' @export #' @rdname Pager setClass("PagerOff", contains="Pager") #' @export #' @rdname Pager PagerOff <- function(...) new("PagerOff", ...) #' @export #' @rdname Pager setClass( "PagerSystem", contains="Pager", prototype=list(pager=file.show, threshold=-1L, file.ext="") ) #' @export #' @rdname Pager PagerSystem <- function(pager=file.show, threshold=-1L, file.ext="", ...) new("PagerSystem", pager=pager, threshold=threshold, file.ext=file.ext, ...) #' @export #' @rdname Pager setClass( "PagerSystemLess", contains="PagerSystem", slots=c("flags"), prototype=list(flags="R") ) #' @export #' @rdname Pager PagerSystemLess <- function( pager=file.show, threshold=-1L, flags="R", file.ext="", ansi=TRUE, ... ) new( "PagerSystemLess", pager=pager, threshold=threshold, flags=flags, ansi=ansi, file.ext=file.ext, ... ) # Must use initialize so that the pager function can access the flags slot setMethod("initialize", "PagerSystemLess", function(.Object, ...) { dots <- list(...) 
flags <- if("flags" %in% names(dots)) { if(!is.chr.1L(dots[['flags']])) stop("Argument `flags` must be character(1L) and not NA") dots[['flags']] } else "" pager.old <- dots[['pager']] pager <- function(x) { old.less <- set_less_var(flags) on.exit(reset_less_var(old.less), add=TRUE) pager.old(x) } dots[['flags']] <- flags dots[['pager']] <- pager do.call(callNextMethod, c(list(.Object), dots)) } ) #' Create a Blocking Version of a Function #' #' Wraps \code{fun} in a function that runs \code{fun} and then issues a #' \code{readline} prompt to prevent further R code evaluation until user #' presses a key. #' #' @export #' @param fun a function #' @param msg character(1L) a message to use as the \code{readline} prompt #' @param invisible.res whether to return the result of \code{fun} invisibly #' @return \code{fun}, wrapped in a function that does the blocking. #' @examples #' make_blocking(sum, invisible.res=FALSE)(1:10) make_blocking <- function( fun, msg="Press ENTER to continue...", invisible.res=TRUE ) { if(!is.function(fun)) stop("Argument `fun` must be a function") if(!is.chr.1L(msg)) stop("Argument `msg` must be character(1L) and not NA") if(!is.TF(invisible.res)) stop("Argument `invisible.res` must be TRUE or FALSE") res <- function(...) { res <- fun(...) readline(msg) if(invisible.res) invisible(res) else res } res } #' Invoke IDE Viewer If Available, browseURL If Not #' #' Use \code{getOption("viewer")} to view HTML output if it is available as #' per \href{https://support.rstudio.com/hc/en-us/articles/202133558-Extending-RStudio-with-the-Viewer-Pane}{RStudio}. Fallback to \code{\link{browseURL}} #' if not available. 
#' #' @export #' @param url character(1L) a location containing a file to display #' @return the return value of \code{getOption("viewer")} if it is a function, or #' of \code{\link{browseURL}} if the viewer is not available view_or_browse <- function(url) { viewer <- getOption("viewer") view.success <- FALSE if(is.function(viewer)) { view.try <- try(res <- viewer(url), silent=TRUE) if(inherits(view.try, "try-error")) { warning( "IDE viewer failed with error ", conditionMessage(attr(view.try, "condition")), "; falling back to `browseURL`" ) } else view.success <- TRUE } if(!view.success) { res <- utils::browseURL(url) } res } setClass( "PagerBrowser", contains="Pager", prototype=list(threshold=0L, file.ext='html', make.blocking=NA) ) #' @export #' @rdname Pager PagerBrowser <- function( pager=view_or_browse, threshold=0L, file.ext="html", make.blocking=NA, ... ) new( "PagerBrowser", pager=pager, threshold=threshold, file.ext=file.ext, make.blocking=make.blocking, ... ) # Helper function to determine whether pager will be used or not use_pager <- function(pager, len) { if(!is(pager, "Pager")) stop("Logic Error: expecting `Pager` arg; contact maintainer.") # nocov if(!is(pager, "PagerOff")) { threshold <- if(pager@threshold < 0L) { console_lines() } else pager@threshold !threshold || len > threshold } else FALSE } #' Check Whether System Has less as Pager #' #' If \code{getOption("pager")} is set to the default value, checks whether #' \code{Sys.getenv("PAGER")} appears to be \code{less} by trying to run the #' pager with \code{--version} and parsing the output. If #' \code{getOption("pager")} is not the default value, then checks whether it #' points to the \code{less} program by the same mechanism. #' #' Some systems may have \code{less} pagers installed that do not respond to the #' \code{$LESS} environment variable. For example, \code{more} on at least some #' versions of OS X is \code{less}, but does not actually respond to #' \code{$LESS}. 
If such a pager is the system pager you will likely end up #' seeing gibberish in the pager. If this is your use case you will need to #' set up a custom pager configuration object that sets the correct system #' variables (see \code{\link{Pager}}). #' #' @seealso \code{\link{Pager}} #' @return TRUE or FALSE #' @export #' @examples #' pager_is_less() pager_is_less <- function() { pager.opt <- getOption("pager") if(pager_opt_default(pager.opt)) { file_is_less(Sys.getenv("PAGER")) } else if (is.character(pager.opt)) { file_is_less(head(pager.opt, 1L)) } else FALSE } pager_opt_default <- function(x=getOption("pager")) { is.character(x) && !is.na(x[1L]) && normalizePath(x[1L], mustWork=FALSE) == normalizePath(file.path(R.home(), "bin", "pager"), mustWork=FALSE) } ## Helper Function to Check if a File is Likely to be less Pager file_is_less <- function(x) { if(is.chr.1L(x) && file_test("-x", x)) { res <- tryCatch( system2(x, "--version", stdout=TRUE, stderr=TRUE), warning=function(e) NULL, error=function(e) NULL ) length(res) && grepl("^less \\d+", res[1L]) } else FALSE } diffobj/R/trim.R0000755000176200001440000005010213777704534013176 0ustar liggesusers# Copyright (C) 2021 Brodie Gaslam # # This file is part of "diffobj - Diffs for R Objects" # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # Go to for a copy of the license. 
# Detect and remove atomic headers .pat.atom <- "^\\s*\\[[1-9][0-9]*\\]\\s" .pat.mat <- "^\\s*\\[[1-9]+[0-9]*,\\]\\s" # dfs/tables colon for data.table, SGR for tibble, starting to get # dangerously broad; we should really split out the tibble business into its own # method. .pat.tbl <- "^(?:\033\\[[^m]*m)?\\s*[1-9]+[0-9]*:?(?:\033\\[[^m]*m)?\\s" .pat.attr <- "^attr\\(,\"(\\\\\"|[^\"])*\")$" # Find first attribute and drop everything after it up_to_attr <- function(x) { attr.id <- grep(.pat.attr, x) if(length(attr.id) && attr.id[1L] > 1L) { y <- head(x, attr.id[1L] - 1L) } else { y <- x } y } # Get atomic content on a best-efforts basis # Note that functionality for named vectors is turned off since they become # fairly pathological when wrap periodicities are not the same (Issue #43); which_atomic_cont <- function(x.chr, x) { # Limit to everything before attribute y <- up_to_attr(x.chr) res <- if(!is.null(nm <- names(x))) { integer(0L) # # name mode; find all lines from output that contain only names # nm.tar <- unlist(strsplit(names(x), "\\s+")) # y.split <- strsplit(sub("^\\s+", "", y), "\\s+") # only.nm <- vapply(y.split, function(z) all(z %in% nm.tar), logical(1L)) # # Look for TF pattern starting with first TRUE # if(any(only.nm)) { # first.t <- min(which(only.nm)) # only.nm.sub <- if(first.t > 1L) { # tail(only.nm, -(first.t - 1L)) # } else only.nm # only.nm.check <- # only.nm.sub == rep(c(TRUE, FALSE), length.out=length(only.nm.sub)) # last.t <- which(!only.nm.check) # last.t <- if(!length(last.t)) length(only.nm.check) + 1L else min(last.t) # # Modulo check makes sure we have full T,F repeats # if(length(last.t) && last.t %% 2L) { # # Ensure that all names are present in the order they are supposed to be # tar.seq <- first.t:(last.t + first.t - 2L) # if(all(unlist(y.split[tar.seq][c(TRUE, FALSE)]) == nm.tar)) { # tar.seq # } else integer(0L) # } else integer(0L) # } else integer(0L) } else which_atomic_rh(x.chr) res } # Identify elements that contain row 
headers, these are guaranteed to be # sequential incrementing with no gaps, or zero length. which_atomic_rh <- function(x) { stopifnot(is.character(x), !anyNA(x)) # Now find the row headers if any prior to the attributes y <- up_to_attr(x) w.pat <- grepl(.pat.atom, y) # Grab first set that matches for checking, there could be more particularly # if the object in question has attributes, but we explicitly rule out # attributes w.pat.rle <- rle(w.pat) res <- if(any(w.pat.rle$values)) { # First get the indices of the patterns that match first.block <- min(which(w.pat.rle$values)) w.pat.start <- sum(head(w.pat.rle$lengths, first.block - 1L), 0L) + 1L w.pat.ind <- seq(from=w.pat.start, length.out=w.pat.rle$lengths[first.block], by=1L) # Re extract those and run checks on them to make sure they truly are # what we think they are: width of headers is the same, and numbers # increment in equal increments starting at 1 r.h.rows <- y[w.pat.ind] r.h.vals <- regmatches(r.h.rows, regexpr(.pat.atom, r.h.rows)) r.h.lens.u <- length(unique(nchar(r.h.vals))) r.h.nums <- sub(".*?([0-9]+).*", "\\1", r.h.vals, perl=TRUE) r.h.nums.u <- length(unique(diff(as.numeric(r.h.nums)))) if( r.h.nums.u <= 1L && r.h.lens.u == 1L && r.h.nums[[1L]] == "1" && (length(w.pat.ind) < 2L || all(diff(w.pat.ind) == 1L)) ) { w.pat.ind } else integer(0L) } else integer(0L) } strip_atomic_rh <- function(x) { stopifnot(is.character(x), !anyNA(x)) w.r.h <- which_atomic_rh(x) x[w.r.h] <- sub(.pat.atom, "", x[w.r.h]) x } # Detect table row headers; a bit lazy, combining all table like into one # function when in reality more subtlety is warranted; also, we only care about # numeric row headers. 
# # Matrices used to be done here as well, but then got split off so the `pat` # argument is legacy wtr_help <- function(x, pat) { # Should expect to find pattern repeated some number of times, and then whole # pattern possibly repeated the same number of times separated by the same # gap each time if the table is too wide and wraps. w.pat <- grepl(pat, x) w.pat.rle <- rle(w.pat) # It must be the case that the first block of matches occurs after non-matches # since the first header should happen first res <- integer(0L) if( any(w.pat.rle$values) && length(w.pat.rle$values) > 1L && w.pat.rle$values[2L] ) { tar.len <- w.pat.rle$lengths[2L] match.blocks <- w.pat.rle$values & w.pat.rle$lengths == tar.len # Only take matches if they alternate T/F match.break <- match.blocks != rep(c(FALSE, TRUE), length.out=length(match.blocks)) match.valid <- if(any(match.break)) { # actually very difficult to test this; need a df like structure that is # wrapped and has some irregularity that crops up later, and we're not # actually able to generate these with vanilla structures head(match.blocks, min(which(match.break)) - 1L) } else match.blocks # Make sure that all interstitial blocks are same length and that they all # start with at least one space interstitial <- which( !match.valid & seq_along(match.valid) > 1L & seq_along(match.valid) != length(match.valid) ) if( !length(interstitial) || ( length(interstitial) && length(unique(w.pat.rle$lengths[interstitial])) == 1L && all(grepl("^\\s", x[unlist(rle_sub(w.pat.rle, interstitial))])) ) ) { # Make sure row headers are the same for each repeating block; start by # extracting the actual headers; need to get a list of each sequence of # headers max.valid <- max(which(match.valid)) ranges <- rle_sub( w.pat.rle, seq_along(w.pat.rle$lengths) <= max.valid & w.pat.rle$values ) heads.l <- regmatches(x, regexec(pat, x)) heads <- character(length(heads.l)) heads[w.pat] <- as.character(heads.l[w.pat]) heads.num <- as.integer( 
sub(".*?(?:\033\\[[^m]*m.*?)*([0-9]+).*", "\\1", heads, perl=TRUE) ) head.ranges <- lapply(ranges, function(x) heads.num[x]) all.identical <- all(vapply(head.ranges, identical, logical(1L), head.ranges[[1L]])) all.one.apart <- all(vapply(head.ranges, function(x) all(diff(x) == 1L), logical(1L))) if(all.identical && all.one.apart && head.ranges[[1L]][1L] == 1L) { res <- unlist(ranges) } } } res } which_table_rh <- function(x) { stopifnot(is.character(x), !anyNA(x)) res <- wtr_help(x, .pat.tbl) if(length(res)) attr(res, "pat") <- .pat.tbl res } strip_table_rh <- function(x) { w <- which_table_rh(x) if(!length(w)) { x } else { pat <- attr(w, "pat") if(!is.chr.1L(pat)) # nocov start stop("Logic Error: unexpected row header pattern; contact maintainer.") # nocov end x[w] <- sub(pat, "", x[w]) x } } # Matrices; should really try to leverage logic in wtr_help, but not quite the # same which_matrix_rh <- function(x, dim.names.x) { guides <- detect_matrix_guides(x, dim.names.x) res <- integer(0L) if(length(guides)) { pieces <- split_by_guides(x, guides) if(!length(pieces)) stop("Logic Error: no matrix pieces") # nocov # Get all rows matching the matrix row header so long as they are adjacent; # this is only really different if there is an attribute in the last piece pat.ind <- lapply( pieces, function(y) { pat.match <- grep(.pat.mat, y) if(length(pat.match) > 1) pat.match[c(TRUE, !cumsum(diff(pat.match) != 1L))] else pat.match } ) if( all(vapply(pat.ind, identical, logical(1L), pat.ind[[1L]])) && (length(pat.ind[[1L]]) == 1L || all(diff(pat.ind[[1L]]) == 1L)) ) { piece.nums <- as.integer( sub(".*?([0-9]+).*", "\\1", pieces[[1L]][pat.ind[[1L]]], perl=TRUE) ) if( length(piece.nums) && piece.nums[1L] == 1L && (length(piece.nums) == 1L || all(diff(piece.nums) == 1L)) ) { res <- unlist( lapply(seq_along(pieces), function(i) attr(pieces[[i]], "idx")[pat.ind[[i]]] ) ) } } } res } strip_matrix_rh <- function(x, dim.names.x) { to.rep <- which_matrix_rh(x, dim.names.x) res <- x 
res[to.rep] <- sub(.pat.mat, "", x[to.rep]) res } # Handle arrays which_array_rh <- function(x, dim.names.x) { arr.h <- detect_array_guides(x, dim.names.x) dat <- split_by_guides(x, arr.h) # Look for the stuff between array guides; those should be matrix like # and have the same rows in each one m.h <- lapply(dat, which_matrix_rh, head(dim.names.x, 2L)) res <- integer(0L) if(length(m.h) && all(vapply(m.h, identical, logical(1L), m.h[[1L]]))) { res <- unlist(Map(function(y, z) attr(y, "idx")[z], dat, m.h)) } res } strip_array_rh <- function(x, dim.names.x) { inds <- which_array_rh(x, dim.names.x) res <- x res[inds] <- sub(.pat.mat, "", x[inds]) res } # Lists, need to recurse through the various list components # # This is not done super rigorously; the main point of failure is if sub-objects # produce patterns that match list sub-object headers which may cause confusion # # Super inefficient currently since we keep switching back and forth between # index and trimmed formats so we can re-use `trimPrint`... # # Also, right now we are passing the list components with all the trailing # new lines, and it isn't completely clear that is the right thing to do # # Note that we're not actually trimming the list headers themselves since unlike # in atomics and matrices, etc, the list headers are on their own line and won't # affect the matching diff of the actual contents of the list strip_list_rh <- function(x, obj) { if(!length(obj)) { # empty list, nothing to do, and also if it is nested causes problems later x } else { # Split output into each list component list.h <- detect_list_guides(x) dat <- split_by_guides(x, list.h, drop.leading=FALSE) elements <- flatten_list(obj) # Special case where first element in list is deeper than one value, which # means there will be leading non-data elements in `dat` that we have to # reconstruct; note that if no len then rendered as `list()` so it doesn't # get a guide. 
offset <- if( is.list(obj[[1L]]) && !is.object(obj[[1L]]) && length(obj[[1L]]) ) 1L else 0L if(length(elements) != length(dat) - offset) { # Something went wrong here, so return as is? x } else { # Use `trimPrint` to get indices, and trim back to stuff without row # header if(offset) { hd <- dat[[1L]] tl <- tail(dat, -offset) } else { hd <- NULL tl <- dat } dat.trim <- Map(trimPrint, elements, tl) dat.w.o.rh <- Map( function(chr, ind) substr(chr, ind[, 1], ind[, 2]), tl, dat.trim ) unlist( c( list(hd), c( as.list(x[list.h]), dat.w.o.rh )[order(rep(seq_along(list.h), 2))] ) ) } } } # Very similar logic to lists strip_s4_rh <- function(x, obj) { stopifnot(isS4(obj)) if(!length(slotNames(obj))) { # Not possible to have object without slots (would be virtual class) stop("Internal Error: s4 object w/o slots; contact maintainer") # nocov } else { # Split output into each list component s4.h <- detect_s4_guides(x, obj) dat <- split_by_guides(x, s4.h, drop.leading=FALSE) elements <- lapply(slotNames(obj), slot, object=obj) dat.trim <- Map(trimPrint, elements, dat) dat.w.o.rh <- unlist( Map( function(chr, ind) substr(chr, ind[, 1], ind[, 2]), dat, dat.trim ) ) if(length(dat.w.o.rh) + length(s4.h) == length(x)) { res <- character(length(x)) res[s4.h] <- x[s4.h] res[!seq_along(res) %in% s4.h] <- dat.w.o.rh res } else { # This should not happen, only a warning because operating without trimmed # S4 guides is not a huge issue # nocov start warning('Unable to detect S4 object guides.') x # nocov end } } } #' Methods to Remove Unsemantic Text Prior to Diff #' #' \code{\link[=diffPrint]{diff*}} methods, in particular \code{diffPrint}, #' modify the text representation of an object prior to running the diff to #' reduce the incidence of spurious mismatches caused by unsemantic differences. #' For example, we look to remove matrix row indices and atomic vector indices #' (i.e. the \samp{[1,]} or \samp{[1]} strings at the beginning of each display #' line). 
#' #' Consider: \preformatted{ #' > matrix(10:12) #' [,1] #' [1,] 10 #' [2,] 11 #' [3,] 12 #' > matrix(11:12) #' [,1] #' [1,] 11 #' [2,] 12 #' } #' In this case, the line by line diff would find all rows of the matrix to #' be mismatched because where the data matches (rows containing #' 11 and 12) the indices do not. By trimming out the row indices before #' the diff, the diff can recognize that row 2 and 3 from the first matrix #' should be matched to row 1 and 2 of the second. #' #' These methods follow a similar interface as the \code{\link[=guides]{guide*}} #' methods, with one available for each \code{diff*} method except for #' \code{diffCsv} since that one uses \code{diffPrint} internally. The #' unsemantic differences are added back after the diff for display purposes, #' and are colored in grey to indicate they are ignored in the diff. #' #' Currently only \code{trimPrint} and \code{trimStr} do anything meaningful. #' \code{trimPrint} removes row index headers provided that they are of the #' default un-named variety. If you add row names, or if numeric row indices #' are not ascending from 1, they will not be stripped as those have meaning. #' \code{trimStr} removes the \samp{..$}, \samp{..-}, and \samp{..@} tokens #' to minimize spurious matches. #' #' You can modify how text is trimmed by providing your own functions to the #' \code{trim} argument of the \code{diff*} methods, or by defining #' \code{trim*} methods for your objects. Note that the return value for these #' functions is the start and end columns of the text that should be #' \emph{kept} and used in the diff. #' #' As with guides, trimming is on a best efforts basis and may fail with #' \dQuote{pathological} display representations. Since the diff still works #' even with failed trimming this is considered an acceptable compromise. #' Trimming is more likely to fail with nested recursive structures. 
#' #' @note \code{obj.as.chr} will be as processed by #' \code{\link{strip_hz_control}} and as such will not be identical to the #' captured output if it contains tabs, newlines, or carriage returns. #' @rdname trim #' @name trim #' @aliases trimPrint, trimStr, trimChr, trimDeparse, trimFile #' @param obj the object #' @param obj.as.chr character the \code{print}ed representation of the object #' @return a \code{length(obj.as.chr)} row and 2 column integer matrix with the #' start (first column) and end (second column) character positions of the sub #' string to run diffs on. NULL #' @export #' @rdname trim setGeneric("trimPrint", function(obj, obj.as.chr) standardGeneric("trimPrint") ) #' @rdname trim setMethod( "trimPrint", c("ANY", "character"), function(obj, obj.as.chr) { # Remove the stuff we don't want stripped <- if(is.matrix(obj)) { strip_matrix_rh(obj.as.chr, dimnames(obj)) } else if( length(dim(obj)) == 2L || (is.ts(obj) && frequency(obj) > 1) ) { strip_table_rh(obj.as.chr) } else if (is.array(obj)) { strip_array_rh(obj.as.chr, dimnames(obj)) } else if(is.atomic(obj)) { strip_atomic_rh(obj.as.chr) } else if(is.list(obj) && !is.object(obj)) { strip_list_rh(obj.as.chr, obj) } else if(isS4(obj) && is_default_show_obj(obj)) { strip_s4_rh(obj.as.chr, obj) } else obj.as.chr trim_sub(obj.as.chr, stripped) } ) #' @export #' @rdname trim setGeneric("trimStr", function(obj, obj.as.chr) standardGeneric("trimStr") ) #' @rdname trim setMethod( "trimStr", c("ANY", "character"), function(obj, obj.as.chr) { # Remove the stuff we don't want pat <- "^ (?: \\.\\.)*(?:\\$|-|@) " stripped <- gsub(pat, "", obj.as.chr, perl=TRUE) # Figure out the indices that correspond to what we want, knowing that all # removals should have occured at front of string trim_sub(obj.as.chr, stripped) } ) # Helper function; returns untrimmed objects trim_identity <- function(obj, obj.as.chr) cbind(rep(1L, length(obj.as.chr)), nchar(obj.as.chr)) #' @export #' @rdname trim setGeneric( "trimChr", 
function(obj, obj.as.chr) standardGeneric("trimChr") ) #' @rdname trim setMethod("trimChr", c("ANY", "character"), trim_identity) #' @export #' @rdname trim setGeneric( "trimDeparse", function(obj, obj.as.chr) standardGeneric("trimDeparse") ) #' @rdname trim setMethod("trimDeparse", c("ANY", "character"), trim_identity) #' @export #' @rdname trim setGeneric( "trimFile", function(obj, obj.as.chr) standardGeneric("trimFile") ) #' @rdname trim setMethod("trimFile", c("ANY", "character"), trim_identity) # Helper fun used by trim functions that remove front of strings and rely on # string comparison to determine trim indices trim_sub <- function(obj.as.chr, obj.stripped) { if(length(obj.as.chr) != length(obj.stripped)) # nocov start stop( "Logic Error: trimmed string does not have same number of elements as ", "original; contact maintainer" ) # nocov end stripped.chars <- nchar(obj.stripped) char.diff <- nchar(obj.as.chr) - stripped.chars sub.start <- char.diff + 1L sub.end <- sub.start - 1L + stripped.chars if(!all(substr(obj.as.chr, sub.start, sub.end) == obj.stripped)) # nocov start stop( "Logic Error: trimmed string is not a substring of orginal, ", "contact maintainer" ) # nocov end cbind(sub.start, sub.end) } # Re-insert the trimmed stuff back into the original string, note that we # use normal string funs, not ANSI aware ones, because the row header stuff is # done in an ANSI unaware manner. untrim <- function(dat, word.c, etc) { fun <- etc@style@funs@trim res <- with( dat, paste0( fun(substr(raw, 0, trim.ind.start - 1L)), word.c, fun(substr(raw, trim.ind.end + 1L, nchar(raw) + 1L)) ) ) # substitute blanks res[!nzchar(dat$raw)] <- etc@style@blank.sub res } valid_trim_ind <- function(x) if( !is.integer(x) || !is.matrix(x) || anyNA(x) || !ncol(x) == 2L ) { "must be a two column integer matrix with no NAs" } else TRUE apply_trim <- function(obj, obj.as.chr, trim_fun) { if(!isTRUE(two.arg <- is.two.arg.fun(trim_fun))) stop( "Invalid trim function (", two.arg, "). 
If you did not customize the ", "trim function contact maintainer; see `?trim`" ) trim <- try(trim_fun(obj, obj.as.chr)) msg.extra <- paste0( "If you did not specify a `trim` function or define custom `trim*` ", "methods contact maintainer (see `?trim`). Proceeding without trimming." ) if(inherits(trim, "try-error")) { warning( "`trim*` method produced an error when attempting to trim ; ", msg.extra ) trim <- cbind(rep(1L, length(obj.as.chr)), nchar(obj.as.chr)) } if(!isTRUE(trim.check <- valid_trim_ind(trim))) stop("`trim*` method return value ", trim.check, "; ", msg.extra) if(nrow(trim) != length(obj.as.chr)) stop( "`trim*` method output matrix must have as many rows as object ", "character representation has elements; ", msg.extra ) trim } diffobj/R/text.R0000755000176200001440000003520614123062122013167 0ustar liggesusers# Copyright (C) 2021 Brodie Gaslam # # This file is part of "diffobj - Diffs for R Objects" # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # Go to for a copy of the license. 
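
# Illustrative sketch only, not part of diffobj: a hypothetical custom trim
# function that follows the contract `apply_trim` checks for `trim*` return
# values (a two column integer matrix, no NAs, one row per element of
# `obj.as.chr`, holding the start and end columns of the text to *keep*).
# The name `trim_leading_ws_sketch` and the choice to ignore leading white
# space are assumptions made for the example.

trim_leading_ws_sketch <- function(obj, obj.as.chr) {
  # Count of leading blanks/tabs per element; zero length matches yield 0L
  lead <- attr(regexpr("^[ \t]*", obj.as.chr), "match.length")
  # Keep from the first non-blank column through the end of each string
  cbind(lead + 1L, nchar(obj.as.chr))
}
# Per `?trim` such a function would presumably be supplied through the `trim`
# argument of the `diff*` methods, e.g. `trim=trim_leading_ws_sketch`.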
# borrowed from crayon, will lobby to get it exported

ansi_regex <- paste0("(?:(?:\\x{001b}\\[)|\\x{009b})",
                     "(?:(?:[0-9]{1,3})?(?:(?:;[0-9]{0,3})*)?[A-M|f-m])",
                     "|\\x{001b}[A-M]")

# Function to split a character vector by newlines; handles some special cases

split_new_line <- function(x, sgr.supported) {
  y <- x
  y[!nzchar(x)] <- "\n"
  unlist(strsplit2(y, "\n", sgr.supported=sgr.supported))
}
html_ent_sub <- function(x, style) {
  if(is(style, "StyleHtml") && style@escape.html.entities) {
    x <- gsub("&", "&amp;", x, fixed=TRUE)
    x <- gsub("<", "&lt;", x, fixed=TRUE)
    x <- gsub(">", "&gt;", x, fixed=TRUE)
    x <- gsub("\n", "<br />", x, fixed=TRUE)
    # x <- gsub(" ", "&nbsp;", x, fixed=TRUE)
  }
  x
}
# Helper function for align_eq; splits up a vector into matched elements and
# interstitial elements, including possibly empty interstitial elements when
# two matches are abutting

align_split <- function(v, m) {
  match.len <- sum(!!m)
  res.len <- match.len * 2L + 1L
  splits <- cumsum(
    c(
      if(length(m)) 1L,
      (!!diff(m) < 0L & !tail(m, -1L)) | (head(m, -1L) & tail(m, -1L))
  ) )
  m.all <- match(m, sort(unique(m[!!m])), nomatch=0L)  # normalize
  m.all[!m.all] <- -ave(m.all, splits, FUN=max)[!m.all]
  m.all[!m.all] <- -match.len - 1L  # trailing zeros
  m.fin <- ifelse(m.all < 0, -m.all * 2 - 1, m.all * 2)
  if(any(diff(m.fin) < 0L))
    stop("Logic Error: non monotonic alignments; contact maintainer") # nocov
  res <- replicate(res.len, character(0L), simplify=FALSE)
  res[unique(m.fin)] <- unname(split(v, m.fin))
  res
}
# Align lists based on equalities on other vectors
#
# This is used for hunks that are word diffed.  Once the word differences are
# accounted for, the remaining strings (A.eq/B.eq) are compared to try to align
# them with a naive algorithm on a line basis.  This works best when lines as a
# whole are equal except for a few differences.  There can be funny situations
# where matched words are on one line in e.g. A, but spread over multiple lines
# in B.  This isn't really handled well currently.
#
# See issue #37.
#
# The A/B vecs will be split up into matched elements, and non-matched elements.
# Each matching element will be surrounded by (possibly empty) non-matching
# elements.
# # Need to reconcile the padding that happens as a result of alignment as well # as the padding that happens with atomic vectors align_eq <- function(A, B, x, context) { stopifnot( is.integer(A), is.integer(B), !anyNA(c(A, B)), is(x, "Diff") ) A.fill <- get_dat(x, A, "fill") B.fill <- get_dat(x, B, "fill") A.fin <- get_dat(x, A, "fin") B.fin <- get_dat(x, B, "fin") if(context) { # Nothing to align if this is context hunk A.chunks <- list(A.fin) B.chunks <- list(B.fin) } else { etc <- x@etc A.eq <- get_dat(x, A, "eq") B.eq <- get_dat(x, B, "eq") # Cleanup so only relevant stuff is allowed to match A.tok.ratio <- get_dat(x, A, "tok.rat") B.tok.ratio <- get_dat(x, B, "tok.rat") if(etc@align@count.alnum.only) { A.eq.trim <- gsub("[^[:alnum:]]", "", A.eq, perl=TRUE) B.eq.trim <- gsub("[^[:alnum:]]", "", B.eq, perl=TRUE) } else { A.eq.trim <- A.eq B.eq.trim <- B.eq } # TBD whether nchar here should be ansi-aware; probably if in alnum only # mode... A.valid <- which( nchar2(A.eq.trim, sgr.supported=etc@sgr.supported) >= etc@align@min.chars & A.tok.ratio >= etc@align@threshold ) B.valid <- which( nchar2(B.eq.trim, sgr.supported=etc@sgr.supported) >= etc@align@min.chars & B.tok.ratio >= etc@align@threshold ) B.eq.seq <- seq_along(B.eq.trim) align <- integer(length(A.eq)) min.match <- 0L # Need to match each element in A.eq to B.eq, though each match consumes the # match so we can't use `match`; unfortunately this is slow; for context # hunks the match is one to one for each line; also, this whole matching # needs to be improved (see issue #37) if(length(A.valid) & length(B.valid)) { B.max <- length(B.valid) B.eq.val <- B.eq.trim[B.valid] for(i in A.valid) { if(min.match >= B.max) break B.match <- which( A.eq.trim[[i]] == if(min.match) tail(B.eq.val, -min.match) else B.eq.val ) if(length(B.match)) { align[[i]] <- B.valid[B.match[[1L]] + min.match] min.match <- B.match[[1L]] + min.match } } } # Group elements together. 
    # We number the interstitial buckets as the
    # negative of the next match.  There are always matches together, split
    # by possibly empty interstitial elements

    align.b <- seq_along(B.eq)
    align.b[!align.b %in% align] <- 0L
    A.chunks <- align_split(A.fin, align)
    B.chunks <- align_split(B.fin, align.b)
  }
  if(length(A.chunks) != length(B.chunks))
    # nocov start
    stop("Logic Error: aligned chunks unequal length; contact maintainer.")
    # nocov end

  list(A=A.chunks, B=B.chunks, A.fill=A.fill, B.fill=B.fill)
}
# Calculate how many lines of screen space are taken up by the diff hunks
#
# `disp.width` should be the available display width, this function computes
# the net real estate accounting for mode, padding, etc.

nlines <- function(txt, disp.width, mode, etc) {
  # stopifnot(is.character(txt), all(!is.na(txt)))
  capt.width <- calc_width_pad(disp.width, mode)
  pmax(
    1L,
    as.integer(
      ceiling(
        nchar2(txt, sgr.supported=etc@sgr.supported) / capt.width
  ) ) )
}
# Gets rid of tabs and carriage returns
#
# Assumes each line is one screen line
# @param stops may be a single positive integer value, or a vector of values
#   whereby the last value will be repeated as many times as necessary

strip_hz_c_int <- function(txt, stops, sgr.supported) {

  # remove trailing and leading CRs (need to record if trailing remains to add
  # back at end?
no, not really since by structure next thing must be a newline w.chr <- nzchar(txt) # corner case with strsplit and zero length strings txt <- gsub("^\r+|\r+$", "", txt) has.tabs <- grep("\t", txt, fixed=TRUE) has.crs <- grep("\r", txt, fixed=TRUE) txt.s <- as.list(txt) txt.s[has.crs] <- if(!any(has.crs)) list() else strsplit2(txt[has.crs], "\r+", sgr.supported=sgr.supported) # Assume \r resets tab stops as it would on a type writer; so now need to # generate the set maximum set of possible tab stops; approximate here by # using largest stop if(length(has.tabs)) { max.stop <- max(stops) width.w.tabs <- max( vapply( txt.s[has.tabs], function(x) { # add number of chars and number of tabs times max tab length sum( nchar2(x, sgr.supported=sgr.supported) + ( vapply( strsplit2(x, "\t", sgr.supported=sgr.supported), length, integer(1L) ) + grepl("\t$", x) - 1L ) * max.stop ) }, integer(1L) ) ) extra.chars <- width.w.tabs - sum(stops) extra.stops <- ceiling(extra.chars / tail(stops, 1L)) stop.vec <- cumsum(c(stops, rep(tail(stops, 1L), extra.stops))) # For each line, assess effects of tabs txt.s[has.tabs] <- lapply(txt.s[has.tabs], function(x) { if(length(h.t <- grep("\t", x, fixed=T))) { # workaround for strsplit dropping trailing tabs x.t <- sub("\t$", "\t\t", x[h.t]) x.s <- strsplit2(x.t, "\t", sgr.supported=sgr.supported) # Now cycle through each line with tabs and replace them with # spaces res <- vapply(x.s, function(y) { topad <- head(y, -1L) rest <- tail(y, 1L) chrs <- nchar2(topad, sgr.supported=sgr.supported) pads <- character(length(topad)) txt.len <- 0L for(i in seq_along(topad)) { txt.len <- chrs[i] + txt.len tab.stop <- head(which(stop.vec > txt.len), 1L) if(!length(tab.stop)) # nocov start stop( "Logic Error: failed trying to find tab stop; contact ", "maintainer" ) # nocov end tab.len <- stop.vec[tab.stop] pads[i] <- paste0(rep(" ", tab.len - txt.len), collapse="") txt.len <- tab.len } paste0(paste0(topad, pads, collapse=""), rest) }, character(1L) ) x[h.t] 
          <- res
        }
        x
      }
    )
  }
  # Simulate the effect of \r by collapsing every \r separated element on top
  # of each other with some special handling for ansi escape seqs

  txt.fin <- txt.s
  txt.fin[has.crs] <- vapply(
    txt.s[has.crs],
    function(x) {
      if(length(x) > 1L) {
        chrs <- nchar2(x, sgr.supported=sgr.supported)
        max.disp <- c(tail(rev(cummax(rev(chrs))), -1L), 0L)
        res <- paste0(
          rev(
            substr2(x, max.disp + 1L, chrs, sgr.supported=sgr.supported)
          ),
          collapse=""
        )
        # add back every ANSI esc sequence from last line to very end
        # to ensure that we leave in correct ANSI escaped state

        if(grepl(ansi_regex, res, perl=TRUE)) {
          res <- paste0(
            res,
            gsub(paste0(".*", ansi_regex, ".*"), "\\1", tail(x, 1L), perl=TRUE)
          )
        }
        res
      } else x # nocov has.cr elements can't have length zero after split...
    },
    character(1L)
  )
  # txt.fin should only have length one char vectors as elements
  if(!length(txt.fin)) txt else {
    # handle strsplit corner case where splitting empty string
    txt.fin[!nzchar(txt)] <- ""
    unlist(txt.fin)
  }
}
#' Replace Horizontal Spacing Control Characters
#'
#' Removes tabs, newlines, and manipulates the text so that
#' it looks the same as it did with those horizontal control
#' characters embedded.  Currently carriage returns are also processed, but
#' in the future they no longer will be.  This function is used when the
#' \code{convert.hz.white.space} parameter to the
#' \code{\link[=diffPrint]{diff*}} methods is active.  The term \dQuote{strip}
#' is a misnomer that remains for legacy reasons and laziness.
#'
#' This is an internal function with exposed documentation because it is
#' referenced in an external function's documentation.
#'
#' @keywords internal
#' @param txt character to convert
#' @param stops integer, what tab stops to use
#' @param sgr.supported logical whether the current display device supports
#'   ANSI CSI SGR.  See \code{\link[=diffPrint]{diff*}}'s \code{sgr.supported}
#'   parameter.
#' @return character, `txt` with horizontal control sequences
#'   replaced.
strip_hz_control <- function(txt, stops=8L, sgr.supported) {
  # stopifnot(
  #   is.character(txt), !anyNA(txt),
  #   is.integer(stops), length(stops) >= 1L, !anyNA(stops), all(stops > 0L)
  # )

  # for speed in case no special chars, just skip; obviously this adds a penalty
  # for other cases but it is small

  if(!any(grepl("\n|\t|\r", txt, perl=TRUE))) {
    txt
  } else {
    if(length(has.n <- grep("\n", txt, fixed=TRUE))) {
      txt.l <- as.list(txt)
      txt.l.n <- strsplit2(txt[has.n], "\n", sgr.supported=sgr.supported)
      txt.l[has.n] <- txt.l.n
      txt <- unlist(txt.l)
    }
    has.ansi <- grepl(ansi_regex, txt, perl=TRUE)
    w.ansi <- which(has.ansi)
    wo.ansi <- which(!has.ansi)

    # since for the time being the crayon funs are a bit slow, only use them on
    # strings that are known to have ansi escape sequences

    strip_hz_c_int(txt, stops, sgr.supported=sgr.supported)
  }
}
# Normalize strings so whitespace differences don't show up as differences

normalize_whitespace <- function(txt)
  gsub(" ([[:punct:]])", "\\1", gsub("(\t| )+", " ", trimws(txt)))

# Simple text manip functions

chr_trim <- function(text, width, sgr.supported) {
  stopifnot(all(width > 2L))
  ifelse(
    nchar2(text, sgr.supported=sgr.supported) > width,
    paste0(substr2(text, 1L, width - 2L, sgr.supported=sgr.supported), ".."),
    text
  )
}
rpad <- function(text, width, pad.chr=" ", sgr.supported) {
  stopifnot(is.character(pad.chr), length(pad.chr) == 1L, nchar(pad.chr) == 1L)
  pad.count <- width - nchar2(text, sgr.supported=sgr.supported)
  pad.count[pad.count < 0L] <- 0L
  pad.chrs <- vapply(
    pad.count, function(x) paste0(rep(pad.chr, x), collapse=""), character(1L)
  )
  paste0(text, pad.chrs)
}
# Breaks long character vectors into vectors of length width
#
# Right pads them to full length if requested.
Only attempt to wrap if # longer than width since wrapping is pretty expensive # # Returns a list of split vectors wrap_int <- function(txt, width, sgr.supported) { nchars <- nchar2(txt, sgr.supported=sgr.supported) res <- as.list(txt) too.wide <- which(nchars > width) res[too.wide] <- lapply( too.wide, function(i) { split.end <- seq( from=width, by=width, length.out=ceiling(nchars[[i]] / width) ) split.start <- split.end - width + 1L substr2( rep(txt[[i]], length(split.start)), split.start, split.end, sgr.supported=sgr.supported ) } ) res } wrap <- function(txt, width, pad=FALSE, sgr.supported) { if(length(grep("\n", txt, fixed=TRUE))) # nocov start stop("Logic error: wrap input contains newlines; contact maintainer.") # nocov end # If there are ansi escape sequences, account for them; either way, create # a vector of character positions after which we should split our character # vector has.na <- is.na(txt) has.chars <- nchar2(txt, sgr.supported=sgr.supported) & !has.na w.chars <- which(has.chars) wo.chars <- which(!has.chars & !has.na) txt.sub <- txt[has.chars] # Wrap differently depending on whether contains ansi or not, exclude zero # length char elements res.l <- vector("list", length(txt)) res.l[has.na] <- NA_character_ res.l[wo.chars] <- "" res.l[w.chars] <- wrap_int(txt.sub, width, sgr.supported=sgr.supported) # pad if requested if(pad) res.l[!has.na] <- lapply(res.l[!has.na], rpad, width=width, sgr.supported=sgr.supported) res.l } diffobj/R/summmary.R0000755000176200001440000002604613777704534014107 0ustar liggesusers# Copyright (C) 2021 Brodie Gaslam # # This file is part of "diffobj - Diffs for R Objects" # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 2 of the License, or # (at your option) any later version. 
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# Go to for a copy of the license.

#' @include s4.R

NULL

setClass("DiffSummary",
  slots=c(
    max.lines="integer", width="integer", etc="Settings", diffs="matrix",
    all.eq="character", scale.threshold="numeric"
  ),
  validity=function(object) {
    if(
      !is.integer(object@diffs) &&
      !identical(rownames(object@diffs), c("match", "delete", "add"))
    )
      return("Invalid diffs object")
    TRUE
  }
)
#' Summary Method for Diff Objects
#'
#' Provides high level count of insertions, deletions, and matches, as well as a
#' \dQuote{map} of where the differences are.
#'
#' Sequences of single operations (e.g. "DDDDD") are compressed provided that
#' compressing them does not distort the relative size of the sequence relative
#' to the longest such sequence in the map by more than \code{scale.threshold}.
#' Since length 1 sequences cannot be further compressed \code{scale.threshold}
#' does not apply to them.
#'
#' @param object a \code{Diff} object
#' @param scale.threshold numeric(1L) between 0 and 1, how much distortion to
#'   allow when creating the summary map, where 0 is none and 1 is as much as
#'   needed to fit under \code{max.lines}, defaults to 0.1
#' @param max.lines integer(1L) how many lines to allow for the summary map,
#'   defaults to 50
#' @param width integer(1L) how many columns wide the output should be, defaults
#'   to \code{getOption("width")}
#' @param ... unused, for compatibility with generic
#' @return a \code{DiffSummary} object
#' @examples
#' ## `pager="off"` for CRAN compliance; you may omit in normal use
#' summary(diffChr(letters, letters[-c(5, 15)], format="raw", pager="off"))

setMethod("summary", "Diff",
  function(
    object, scale.threshold=0.1, max.lines=50L, width=getOption("width"),
    ...
) { if(!is.int.1L(max.lines) || max.lines < 1L) stop("Argument `max.lines` must be integer(1L) and strictly positive") max.lines <- as.integer(max.lines) if(!is.int.1L(width) || width < 0L) stop("Argument `width` must be integer(1L) and positive") if(width < 10L) width <- 10L if( !is.numeric(scale.threshold) || length(scale.threshold) != 1L || is.na(scale.threshold) || !scale.threshold %bw% c(0, 1) ) stop("Argument `scale.threshold` must be numeric(1L) between 0 and 1") diffs.c <- count_diffs_detail(object@diffs) # remove context hunks that are duplicated match.seq <- rle(!!diffs.c["match", ]) match.keep <- unlist( lapply( match.seq$lengths, function(x) if(x == 2L) c(TRUE, FALSE) else TRUE ) ) diffs <- diffs.c[, match.keep, drop=FALSE] all.eq <- all.equal(object@target, object@current) new( "DiffSummary", max.lines=max.lines, width=width, etc=object@etc, diffs=diffs, all.eq=if(isTRUE(all.eq)) character(0L) else all.eq, scale.threshold=scale.threshold ) } ) #' @rdname finalizeHtml setMethod("finalizeHtml", c("DiffSummary"), function(x, x.chr, ...) { js <- "" callNextMethod(x, x.chr, js=js, ...) } ) #' Generate Character Representation of DiffSummary Object #' #' @param x a \code{DiffSummary} object #' @param ... not used, for compatibility with generic #' @return the summary as a character vector intended to be \code{cat}ed to #' terminal #' @examples #' as.character( #' summary(diffChr(letters, letters[-c(5, 15)], format="raw", pager="off")) #' ) setMethod("as.character", "DiffSummary", function(x, ...) 
{ etc <- x@etc style <- etc@style hunks <- sum(!x@diffs["match", ]) res <- c(apply(x@diffs, 1L, sum)) scale.threshold <- x@scale.threshold # something seems wrong with next condition res <- if(!hunks || !sum(x@diffs[c("delete", "add"), ])) { style@summary@body( if(length(x@all.eq)) { eq.txt <- paste0("- ", x@all.eq) paste0( c( "No visible differences, but objects are not `all.equal`:", eq.txt ), collapse=style@text@line.break ) } else { "Objects are `all.equal`" } ) } else { pad <- 2L width <- x@width - pad head <- paste0( paste0( strwrap( sprintf( "Found differences in %d hunk%s:", hunks, if(hunks != 1L) "s" else "" ), width=width ), collapse=style@text@line.break ), style@summary@detail( paste0( strwrap( sprintf( "%d insertion%s, %d deletion%s, %d match%s (lines)", res[["add"]], if(res[["add"]] == 1L) "" else "s", res[["delete"]], if(res[["delete"]] == 1L) "" else "s", res[["match"]], if(res[["match"]] == 1L) "" else "es" ), width=width ), collapse=style@text@line.break ) ), collapse="" ) # Compute character screen display max.chars <- x@max.lines * width diffs <- x@diffs scale.threshold <- x@scale.threshold # Helper fun to determine if the scale skewed our data too much scale_err <- function(orig, scaled, threshold, width) { if((width - sum(scaled)) / width > threshold) { TRUE } else { zeroes <- !orig orig.nz <- orig[!zeroes] scaled.nz <- scaled[!zeroes] orig.norm <- orig.nz / max(orig.nz) scaled.norm <- scaled.nz / max(scaled.nz) any(abs(orig.norm - scaled.norm) > threshold) } } # Scale the data down as small as possible provided we don't violate # tolerance. 
diffs.gz <- diffs > 1L diffs.nz <- diffs[diffs.gz] safety <- 10000L tol <- width / 4 diffs.scale <- diffs lo.bound <- lo <- length(diffs.nz) hi.bound <- hi <- sum(diffs.nz) if(sum(diffs.scale) > width) { repeat { mp <- ceiling((hi.bound - lo.bound) / 2) + lo.bound safety <- safety - 1L if(safety < 0L) # nocov start stop("Logic Error: likely infinite loop; contact maintainer.") # nocov end # Need to scale down; we know we need at least one char per value diffs.nz.s <- pmax( round(diffs.nz * (mp - lo) / (hi - lo)), 1L ) diffs.scale[diffs.gz] <- diffs.nz.s scale.err <- scale_err(diffs, diffs.scale, scale.threshold, width) break.cond <- floor(mp / width) <= floor(lo.bound / width) || mp >= hi.bound if(scale.err) { # error, keep increasing lines lo.bound <- mp } else { # no error, check if we can generate an error with a smaller value # note hi.bound is always guaranteed to not produce error if(break.cond) break hi.bound <- mp } } } diffs.fin <- diffs.scale # Compute scaling factors for display to user scale.one <- diffs.scale == 1 scale.gt.one <- diffs.scale > 1 s.o.txt <- if(any(scale.one)) { s.o.r <- unique(range(diffs[scale.one])) if(length(s.o.r) == 1L) sprintf("%d:1 for single chars", s.o.r) else sprintf("%d-%d:1 for single chars", s.o.r[1L], s.o.r[2L]) } s.gt.o.txt <- if(any(scale.gt.one)) { s.gt.o.r <- unique( range(round(diffs[scale.gt.one] / diffs.scale[scale.gt.one])) ) if(length(s.gt.o.r) == 1L) sprintf("%d:1 for char seqs", s.gt.o.r) else sprintf("%d-%d:1 for char seqs", s.gt.o.r[1L], s.gt.o.r[2L]) } map.txt <- sprintf( "Diff map (line:char scale is %s%s%s):", if(!is.null(s.o.txt)) s.o.txt else "", if(is.null(s.o.txt) && !is.null(s.gt.o.txt)) "" else ", ", if(!is.null(s.gt.o.txt)) s.gt.o.txt else "" ) body <- if(style@wrap) strwrap(map.txt, width=x@width) else map.txt # Render actual map diffs.txt <- character(length(diffs.fin)) attributes(diffs.txt) <- attributes(diffs.fin) symb <- c(match=".", add="I", delete="D") use.ansi <- FALSE for(i in names(symb)) { 
        test <- diffs.txt[i, ] <- vapply(
          diffs.fin[i, ],
          function(x) paste0(rep(symb[[i]], x), collapse=""),
          character(1L)
        )
      }
      # Trim text down to what is displayable in the allowed lines

      txt <- do.call(paste0, as.list(c(diffs.txt)))
      txt <- substr2(txt, 1, max.chars, sgr.supported=etc@sgr.supported)
      txt.w <- unlist(
        if(style@wrap) wrap(txt, width, sgr.supported=etc@sgr.supported)
        else txt
      )
      # Apply ansi styles if warranted

      if(is(style, "StyleAnsi")) {
        old.crayon.opt <- options(crayon.enabled=TRUE)
        on.exit(options(old.crayon.opt), add=TRUE)
      }
      s.f <- style@funs
      txt.w <- gsub(
        symb[["add"]], s.f@word.insert(symb[["add"]]),
        gsub(
          symb[["delete"]], s.f@word.delete(symb[["delete"]]),
          txt.w, fixed=TRUE
        ),
        fixed=TRUE
      )
      extra <- if(sum(diffs.fin) > max.chars) {
        diffs.omitted <- diffs.fin
        diffs.under <- cumsum(diffs.omitted) <= max.chars
        diffs.omitted[diffs.under] <- 0L
        res.om <- apply(diffs.omitted, 1L, sum)
        # pluralize "match" with "es" so a single match reads "1 match"
        sprintf(
          paste0(
            "omitting %d deletion%s, %d insertion%s, and %d match%s; ",
            "increase `max.lines` to %d to show full map"
          ),
          res.om[["delete"]], if(res.om[["delete"]] != 1L) "s" else "",
          res.om[["add"]], if(res.om[["add"]] != 1L) "s" else "",
          res.om[["match"]], if(res.om[["match"]] != 1L) "es" else "",
          ceiling(sum(diffs.scale) / width)
        )
      } else character(0L)

      map <- txt.w
      if(length(extra) && style@wrap) extra <- strwrap(extra, width=width)
      c(
        style@summary@body(
          paste0(
            c(head, body),
            collapse=style@text@line.break
        ) ),
        style@summary@map(c(map, extra))
      )
    }
    fin <- style@funs@container(style@summary@container(res))
    finalize(
      fin, x,
      length(unlist(gregexpr(style@text@line.break, fin, fixed=TRUE))) +
        length(fin)
    )
  }
)
#' Display DiffSummary Objects
#'
#' @param object a \code{DiffSummary} object
#' @return NULL, invisibly
#' @examples
#' show(
#'   summary(diffChr(letters, letters[-c(5, 15)], format="raw", pager="off"))
#' )

setMethod("show", "DiffSummary",
  function(object) {
    show_w_pager(as.character(object), object@etc@style@pager)
    invisible(NULL)
  }
)
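
# Illustrative sketch only, not part of diffobj: how a count matrix shaped like
# the `diffs` slot of `DiffSummary` (rows "match", "delete", "add", one column
# per hunk) maps to the unscaled "diff map" string built in the `as.character`
# method above.  `diff_map_sketch` is a made-up name, and the sketch
# deliberately skips the scaling, truncation, wrapping and styling steps.

diff_map_sketch <- function(diffs) {
  symb <- c(match=".", delete="D", add="I")
  # One symbol per matrix cell, in column-major order so that each hunk
  # contributes its rows in sequence
  cell.symb <- symb[rownames(diffs)[c(row(diffs))]]
  # Repeat each symbol by its count (zero counts yield "") and concatenate
  paste0(strrep(cell.symb, c(diffs)), collapse="")
}
# Hypothetical usage: two matching lines, then one deletion and two
# insertions, then three matching lines.
#
# diff_map_sketch(
#   matrix(
#     c(2L, 0L, 0L,  0L, 1L, 2L,  3L, 0L, 0L), nrow=3L,
#     dimnames=list(c("match", "delete", "add"), NULL)
#   )
# )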
diffobj/R/hunks.R0000755000176200001440000005745213777704534013372 0ustar liggesusers# Copyright (C) 2021 Brodie Gaslam # # This file is part of "diffobj - Diffs for R Objects" # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # Go to for a copy of the license. # Convert ses data into raw hunks that include both match hunks as well as # actual hunks # # These hunks are then processed into hunk groups in a separate step # (see `group_hunks`). # # @return a list of atomic hunks, each containing integer vectors A and B where # positive numbers reference character lines from target and negative ones # from current. For "context" and "sidebyside" mode the A vector will contain # the lines from target, and the B vector the lines from current. For # "unified" only the A vector is populated. In addition to the A and B # vectors some other meta data is tracked, such as the range of the hunks is # also stored as tar.rng and cur.rng; mostly inferrable from the actual data # in the hunks, except that in unified mode we no longer have the actual # context strings from the `current` vector. # # starting to have second thoughts about removing all the non index data from # hunks, particularly because it makes the line length calc a pita. setGeneric("as.hunks", function(x, etc, ...) standardGeneric("as.hunks")) setMethod("as.hunks", c("MyersMbaSes", "Settings"), function( x, etc, ... 
) { # Split our data into sections that have either deletes/inserts or matches dat <- as.matrix(x) sects <- unique(dat[, "section"]) j <- 0L res.l <- if(!nrow(dat)) { # Minimum one empty hunk if nothing; make this a context hunk to indicate # that there are no differences. This used to be a non-context hunk list( list( id=1L, A=integer(0L), B=integer(0L), context=TRUE, guide=FALSE, tar.rng=integer(2L), cur.rng=integer(2L), tar.rng.sub=integer(2L), cur.rng.sub=integer(2L), tar.rng.trim=integer(2L), cur.rng.trim=integer(2L), completely.empty=TRUE ) ) } else { lapply( seq_along(sects), function(i) { s <- sects[i] d <- dat[which(dat[, "section"] == s), , drop=FALSE] d.del <- d[which(.edit.map[d[, "type"]] == "Delete"), ,drop=FALSE] d.ins <- d[which(.edit.map[d[, "type"]] == "Insert"), ,drop=FALSE] d.mtc <- d[which(.edit.map[d[, "type"]] == "Match"), ,drop=FALSE] # R 3.3.3 had sum(integer(0)) == 1! del.len <- if(nrow(d.del)) sum(d.del[, "len"]) else 0L ins.len <- if(nrow(d.ins)) sum(d.ins[, "len"]) else 0L mtc.len <- if(nrow(d.mtc)) sum(d.mtc[, "len"]) else 0L tar.len <- del.len + mtc.len cur.len <- ins.len + mtc.len # atomic hunks may only be del/ins or match, not both if((del.len || ins.len) && mtc.len || !(del.len + ins.len + mtc.len)) stop("Logic Error: unknown edit types; contact maintainer.") # nocov # Figure out where previous hunk left off del.last <- if(nrow(d.del)) d.del[1L, "last.a"] else d[1L, "last.a"] ins.last <- if(nrow(d.ins)) d.ins[1L, "last.b"] else d[1L, "last.b"] A.start <- unname(del.last) B.start <- unname(ins.last) # record `cur` indices as negatives tar <- seq_len(tar.len) + A.start cur <- -(seq_len(cur.len) + B.start) context <- !!mtc.len A <- switch( etc@mode, context=tar, unified=c(tar, if(!context) cur), sidebyside=tar, stop("Logic Error: unknown mode; contact maintainer.") ) B <- switch( etc@mode, context=cur, unified=integer(), sidebyside=cur, stop("Logic Error: unknown mode; contact maintainer.") ) # compute ranges tar.rng <- cur.rng <- 
integer(2L) if(tar.len) tar.rng <- c(A.start + 1L, A.start + tar.len) if(cur.len) cur.rng <- c(B.start + 1L, B.start + cur.len) list( id=i, A=A, B=B, context=context, guide=FALSE, tar.rng=tar.rng, cur.rng=cur.rng, tar.rng.sub=tar.rng, cur.rng.sub=cur.rng, tar.rng.trim=tar.rng, cur.rng.trim=cur.rng, completely.empty=FALSE ) } ) } res.l } ) # Group hunks together based on context, in "auto" mode we find the context # that maximizes lines displayed while adhering to line and hunk limits # Definitely not very efficient since we re-run code multiple times we # probably don't need to. # # Important: context atomic hunks are duplicated anytime there is enough # context that we only show part of the context hunk. # # @return a list containing lists of atomic hunks. Each of these sub-lists # of atomic hunks is treated as a "hunk", but is really a combination of # context and hunks which we will refer to as "hunk group". In each hunk # group, There may be as little as one hunk with no context, or many hunks and # context if the context between hunks is not sufficient to meet the requested # context, in which case the hunks bleed together forming these hunk groups. group_hunks <- function(hunks, etc, tar.capt, cur.capt) { context <- etc@context line.limit <- etc@line.limit ctx.val <- if(is(context, "AutoContext")) { len <- diff_line_len( p_and_t_hunks(hunks, ctx.val=context@max, etc=etc), etc=etc, tar.capt=tar.capt, cur.capt=cur.capt ) len.min <- diff_line_len( p_and_t_hunks(hunks, ctx.val=context@min, etc=etc), etc=etc, tar.capt=tar.capt, cur.capt=cur.capt ) if(line.limit[[1L]] < 0L) { context@max } else if(len.min > line.limit[[1L]]) { context@min } else { ctx.max <- ctx.hi <- ctx <- context@max ctx.lo <- context@min safety <- 0L repeat { if((safety <- safety + 1L) > ctx.max) # nocov start stop( "Logic Error: stuck trying to find auto-context; contact ", "maintainer." 
          ) # nocov end
        if(len > line.limit[[1L]] && ctx - ctx.lo > 1L) {
          ctx.hi <- ctx
          ctx <- as.integer((ctx - ctx.lo) / 2)
        } else if (len < line.limit[[1L]] && ctx.hi - ctx > 1L) {
          ctx.lo <- ctx
          ctx <- ctx + as.integer(ceiling(ctx.hi - ctx) / 2)
        } else if (len > line.limit[[1L]]) {
          # unable to get something small enough, but we know min context
          # works from initial test
          ctx <- context@min
          break
        } else if (len <= line.limit[[1L]]) {
          break
        }
        len <- diff_line_len(
          p_and_t_hunks(hunks, ctx.val=ctx, etc=etc), etc=etc,
          tar.capt=tar.capt, cur.capt=cur.capt
        )
      }
      ctx
    }
  } else context

  res <- process_hunks(hunks, ctx.val=ctx.val, etc=etc)
  res
}
# process the hunks and also drop off groups that exceed limit
#
# used exclusively when we are trying to auto-calculate context

p_and_t_hunks <- function(hunks.raw, ctx.val, etc) {
  c.all <- process_hunks(hunks.raw, ctx.val, etc)
  hunk.limit <- etc@hunk.limit
  # keep the first `hunk.limit[[2L]]` hunk groups when over the limit
  if(hunk.limit[[1L]] >= 0L && length(c.all) > hunk.limit[[1L]])
    c.all <- c.all[seq_len(hunk.limit[[2L]])]
  c.all
}
# Subset hunks; should only ever be subsetting context hunks

hunk_sub <- function(hunk, op, n) {
  stopifnot(
    op %in% c("head", "tail"), hunk$context,
    all(hunk$tar.rng.sub),
    length(hunk$tar.rng.sub) == length(hunk$cur.rng.sub),
    diff(hunk$tar.rng.sub) == diff(hunk$cur.rng.sub),
    length(hunk$tar.rng.sub) == 2L
  )
  hunk.len <- diff(hunk$tar.rng.sub) + 1L
  len.diff <- hunk.len - n
  if(len.diff >= 0) {
    nm <- c("A", "B", "A.tok.ratio", "B.tok.ratio")
    hunk[nm] <- lapply(hunk[nm], op, n)

    # Need to recompute ranges

    if(n) {
      if(op == "tail") {
        hunk$tar.rng.trim[[1L]] <- hunk$tar.rng.sub[[1L]] <-
          hunk$tar.rng.sub[[1L]] + len.diff
        hunk$cur.rng.trim[[1L]] <- hunk$cur.rng.sub[[1L]] <-
          hunk$cur.rng.sub[[1L]] + len.diff
      } else {
        hunk$tar.rng.trim[[2L]] <- hunk$tar.rng.sub[[2L]] <-
          hunk$tar.rng.sub[[2L]] - len.diff
        hunk$cur.rng.trim[[2L]] <- hunk$cur.rng.sub[[2L]] <-
          hunk$cur.rng.sub[[2L]] - len.diff
      }
    } else {
      hunk$tar.rng.trim <- hunk$cur.rng.trim <-
        hunk$tar.rng.sub <- hunk$cur.rng.sub <-
integer(2L) } } hunk } # Figure Out Context for Each Chunk # # If a hunk bleeds into another due to context then it becomes part of the # other hunk. # # This will group atomic hunks into hunk groups with matching line in excess of # context removed. process_hunks <- function(x, ctx.val, etc) { context <- ctx.val ctx.vec <- vapply(x, "[[", logical(1L), "context") if(!all(abs(diff(ctx.vec)) == 1L)) # nocov start stop( "Logic Error: atomic hunks not interspersing context; contact maintainer." ) # nocov end hunk.len <- length(x) # Special cases, including only one hunk or forcing only one hunk group, or # no differences if(context < 0L || hunk.len < 2L || !any(ctx.vec)) { res.l <- list(x) } else { # Normal cases; allocate maximum possible number of elements, may need fewer # if hunks bleed into each other res.l <- vector("list", sum(!ctx.vec)) # Jump through every second value as those are the mismatch hunks, though # first figure out if first hunk is mismatching, and merge hunks. This # is likely not super efficient as we keep growing a list, though the only # thing we are actually re-allocating is the list index really, at least if # R is being smart about not copying the list contents (which as of 3.1 I # think it is...) 
i <- if(ctx.vec[[1L]]) 2L else 1L j <- 1L while(i <= hunk.len) { # Merge left res.l[[j]] <- if(i - 1L) list(hunk_sub(x[[i - 1L]], "tail", context), x[[i]]) else x[i] # Merge right if(i < hunk.len) { # Hunks bleed into next hunk due to context; note that i + 1L will always # be a context hunk, so $A is fully representative while( i < hunk.len && length(x[[i + 1L]]$A) <= context * 2 && i + 1L < length(x) ) { res.l[[j]] <- append(res.l[[j]], x[i + 1L]) if(i < hunk.len - 1L) res.l[[j]] <- append(res.l[[j]], x[i + 2L]) i <- i + 2L } # Context enough to cause a break if(i < hunk.len) { res.l[[j]] <- append( res.l[[j]], list(hunk_sub(x[[i + 1L]], "head", context)) ) } } j <- j + 1L i <- i + 2L } length(res.l) <- j - 1L } # Add back the guide hunks if needed they didn't make it in as part of the # context or differences. It should be the case that the only spot that could # have missing hunk guides is the first hunk in a hunk group if it is a # context hunk # First, determine which guides if any need to be added back; need to do it # first because it is possible that a guide is present at the end context # of the prior hunk group # Helper fun to pull out indices of guide.lines get_guides <- function(hunk, rows, mode) { stopifnot(hunk$context) rng <- hunk[[sprintf("%s.rng", mode)]] rng.sub <- hunk[[sprintf("%s.rng.sub", mode)]] h.rows <- rows[which(!rows %bw% rng.sub & rows %bw% rng)] # If context hunk already contains guide row and there is a non guide at # beginning of hunk, then we don't need to return a guide row if(any(rows %bw% rng.sub) && !rng.sub[[1L]] %in% rows) { integer(0L) } else { # special case where the first row in the subbed hunk is a context row; # note we need to look at the first non-blank row; since this has to be # a context hunk we can just look at A.chr first.is.guide <- FALSE if(rng.sub[[1L]] %in% rows) { first.is.guide <- TRUE h.rows <- c(h.rows, rng.sub[[1L]]) } # we want all guide.lines that abut the last matched guide row if(length(h.rows)) { 
        h.fin <-
          h.rows[seq(to=max(h.rows), length.out=length(h.rows)) == h.rows]
        if(first.is.guide) h.fin <- head(h.fin, -1L)
        # convert back to indices relative to hunk
        h.fin - rng[[1L]] + 1L
      } else integer(0L)
    }
  }
  for(k in seq_along(res.l)) {
    if(length(res.l[[k]]) && res.l[[k]][[1L]]$context) {
      h <- res.l[[k]][[1L]]
      h.o <- x[[res.l[[k]][[1L]]$id]]  # retrieve original untrimmed hunk

      if(!
        identical(
          h$tar.rng.sub, h$cur.rng.sub - h$cur.rng.sub[1L] + h$tar.rng.sub[1L]
        )
      )
        stop("Logic Error: unequal context hunks; contact maintainer") # nocov

      # since in a context hunk, everything in tar and cur is the same, we
      # just need to recompute the `cur` guidelines relative to tar indices
      # since the guidelines need not be the same (e.g., in lists that are
      # mostly the same, but deeper in one object, guideline will be deepest
      # index entry, which will be different).

      tar.cand.guides <- intersect(
        etc@guide.lines@target, seq(h$tar.rng[1L], h$tar.rng[2L], by=1L)
      )
      cur.cand.guides <- intersect(
        etc@guide.lines@current, seq(h$cur.rng[1L], h$cur.rng[2L], by=1L)
      ) - h$cur.rng[1L] + h$tar.rng[1L]

      h.guides <- get_guides(
        h, unique(c(tar.cand.guides, cur.cand.guides)), "tar"
      )
      if(length(h.guides)) {
        h.h <- hunk_sub(h.o, "head", max(h.guides))
        tail.ind <-
          if(length(h.guides) == 1L) 1L else diff(range(h.guides)) + 1L
        h.fin <- hunk_sub(h.h, "tail", tail.ind)
        h.fin$guide <- TRUE
        res.l[[k]] <- c(list(h.fin), res.l[[k]])
      }
    }
  }
  # Finalize, including sizing correctly, and setting the ids to the right
  # values since we potentially duplicated some context hunks

  res.fin <- res.l
  k <- 1L
  for(i in seq_along(res.fin)) {
    for(j in seq_along(res.fin[[i]])) {
      res.fin[[i]][[j]][["id"]] <- k
      k <- k + 1L
    }
  }
  res.fin
}
# Account for overhead / side by sideness in width calculations
# Internal funs

hunk_len <- function(hunk.id, hunks, tar.capt, cur.capt, etc) {
  disp.width <- etc@disp.width
  mode <- etc@mode
  hunk <- hunks[[hunk.id]]
  A.lines <-
    nlines(get_dat_raw(hunk$A, tar.capt, cur.capt), disp.width, mode, etc)
B.lines <- nlines(get_dat_raw(hunk$B, tar.capt, cur.capt), disp.width, mode, etc) # Depending on each mode, figure out how to set up the lines; # straightforward except for context where we need to account for the # fact that all the A of a hunk group are shown first, and then all # the B are shown lines.out <- switch( mode, context=c(A.lines, if(!hunk$guide) -B.lines), unified=c(A.lines), sidebyside={ max.len <- max(length(A.lines), length(B.lines)) length(A.lines) <- length(B.lines) <- max.len c(pmax(A.lines, B.lines, na.rm=TRUE)) }, stop("Logic Error: unknown mode '", mode, "' contact maintainer") ) # Make sure that line.id refers to the position of the line in either # original A or B vector l.o.len <- length(lines.out) line.id <- integer(l.o.len) l.gt.z <- lines.out > 0L l.gt.z.w <- which(l.gt.z) line.id[l.gt.z.w] <- seq_along(l.gt.z.w) l.lt.z.w <- which(!l.gt.z) line.id[l.lt.z.w] <- seq_along(l.lt.z.w) cbind( hunk.id=if(length(lines.out)) hunk.id else integer(), line.id=unname(line.id), len=lines.out ) } hunk_grp_len <- function( hunk.grp.id, hunk.grps, etc, tar.capt, cur.capt ) { mode <- etc@mode hunks <- hunk.grps[[hunk.grp.id]] hunks.proc <- lapply( seq_along(hunks), hunk_len, hunks=hunks, etc=etc, tar.capt=tar.capt, cur.capt=cur.capt ) res.tmp <- do.call(rbind, hunks.proc) res <- cbind(grp.id=if(nrow(res.tmp)) hunk.grp.id else integer(0L), res.tmp) # Need to make sure all positives are first, and all negatives second, if # there are negatives (context mode); also, if the first hunk in a hunk # group, add a line for the hunk header, though hunk header itself is added # later extra <- if(length(hunks)) 1L else 0L if(identical(mode, "context")) res <- res[order(res[, "len"] < 0L), , drop=FALSE] if( identical(mode, "context") && length(negs <- which(res[, "len"] < 0L)) && length(poss <- which(res[, "len"] > 0L)) ) { # Add one for hunk header, one for context separator; remember, that lengths # in the B hunk are counted negatively res[1L, "len"] <- res[1L, 
"len"] + extra res[negs[[1L]], "len"] <- res[negs[[1L]], "len"] - extra } else if(nrow(res)) { res[1L, "len"] <- res[1L, "len"] + extra } res } # Compute how many lines the display version of the diff will take, meta # lines (used for hunk guides) are denoted by negatives # # count lines for each remaining hunk and figure out if we need to cut some # hunks off; note that "negative" lengths indicate the lines being counted # originated from the B hunk in context mode get_hunk_chr_lens <- function(hunk.grps, etc, tar.capt, cur.capt) { mode <- etc@mode disp.width <- etc@disp.width # Generate a matrix with hunk group id, hunk id, and wrapped length of each # line that we can use to figure out what to show do.call( rbind, lapply( seq_along(hunk.grps), hunk_grp_len, etc=etc, tar.capt=tar.capt, cur.capt=cur.capt, hunk.grps=hunk.grps ) ) } # Compute total diff length in lines diff_line_len <- function(hunk.grps, etc, tar.capt, cur.capt) { max( 0L, cumsum( get_hunk_chr_lens( hunk.grps, etc=etc, tar.capt=tar.capt, cur.capt=cur.capt )[, "len"] ) ) + banner_len(etc@mode) } # completely.empty used to highlight difference between hunks that technically # contain a header and no data vs those that can't even contain a header; # unfortunately a legacy of poor design choice in how headers are handled empty_hunk_grp <- function(h.g) { for(j in seq_along(h.g)) { h.g[[j]][c("tar.rng.trim", "cur.rng.trim")] <- list(integer(2L), integer(2L)) h.g[[j]]$completely.empty <- TRUE } h.g } # Remove hunk groups and atomic hunks that exceed the line limit # # Return value is a hunk group list, with an attribute indicating how many # hunks and lines were trimmed trim_hunk <- function(hunk, type, line.id) { stopifnot(type %in% c("tar", "cur")) rng.idx <- sprintf("%s.rng.trim", type) hunk[[rng.idx]] <- if(!line.id) integer(2L) else { if(all(hunk[[rng.idx]])) { c( hunk[[rng.idx]][[1L]], min(hunk[[rng.idx]][[1L]] + line.id - 1L, hunk[[rng.idx]][[2L]]) ) } else integer(2L) } hunk } trim_hunks <- 
function(hunk.grps, etc, tar.raw, cur.raw) { stopifnot(is(etc, "Settings")) mode <- etc@mode disp.width <- etc@disp.width hunk.limit <- etc@hunk.limit line.limit <- etc@line.limit diffs.orig <- count_diffs(hunk.grps) hunk.grps.count <- length(hunk.grps) if(hunk.limit[[1L]] < 0L) hunk.limit <- rep(hunk.grps.count, 2L) hunk.limit.act <- if(hunk.grps.count > hunk.limit[[1L]]) hunk.limit[[2L]] hunk.grps.omitted <- max(0L, hunk.grps.count - hunk.limit.act) hunk.grps.used <- min(hunk.grps.count, hunk.limit.act) hunk.grps <- hunk.grps[seq_len(hunk.grps.used)] lines <- get_hunk_chr_lens( hunk.grps, etc=etc, tar.capt=tar.raw, cur.capt=cur.raw ) cum.len <- cumsum(abs(lines[, "len"])) cut.off <- -1L lines.omitted <- 0L lines.total <- max(0L, tail(cum.len, 1L)) if(line.limit[[1L]] < 0L) { cut.off <- max(0L, cum.len) } else if(any(cum.len > line.limit[[1L]])) { cut.off <- max(0L, cum.len[cum.len <= line.limit[[2L]]]) } if(cut.off > 0) { lines.omitted <- lines.total - cut.off cut.dat <- lines[max(which(cum.len <= cut.off)), ] grp.cut <- cut.dat[["grp.id"]] hunk.cut <- cut.dat[["hunk.id"]] line.cut <- cut.dat[["line.id"]] line.neg <- cut.dat[["len"]] < 0 # completely trim hunks that will not be shown grps.to.cut <- setdiff(seq_along(hunk.grps), seq_len(grp.cut)) for(i in grps.to.cut) hunk.grps[[i]] <- empty_hunk_grp(hunk.grps[[i]]) hunk.grps.used <- grp.cut hunk.grps.omitted <- max(0L, hunk.grps.count - grp.cut) # Remove excess lines from the atomic hunks based on the limits; we don't # update the ranges as those should still indicate what the original # untrimmed range was # special case for first hunk in group since we need to account for hunk # header that takes up a line; this is not ideal, hunk header should be # made part of hunks eventually if(mode == "context") { # Context tricky because every atomic hunk B data is displayed after all # the A data for(i in seq_along(hunk.grps[[grp.cut]])) { hunk.atom <- hunk.grps[[grp.cut]][[i]] if(!line.neg) { # means all B blocks must 
be dropped hunk.atom <- trim_hunk(hunk.atom, "cur", 0L) if(i > hunk.cut) { hunk.atom <- trim_hunk(hunk.atom, "tar", 0L) } else if (i == hunk.cut) { hunk.atom <- trim_hunk(hunk.atom, "tar", line.cut) } } else { if(i > hunk.cut) { hunk.atom <- trim_hunk(hunk.atom, "cur", 0L) } else if (i == hunk.cut) { hunk.atom <- trim_hunk(hunk.atom, "cur", line.cut) } } hunk.grps[[grp.cut]][[i]] <- hunk.atom } } else { hunk.atom <- hunk.grps[[grp.cut]][[hunk.cut]] hunk.atom <- trim_hunk(hunk.atom, "tar", line.cut) if(mode == "unified") { # Need to share lines between tar and cur in unified mode line.cut <- max( 0L, line.cut - if(any(hunk.atom$tar.rng)) diff(hunk.atom$tar.rng) + 1L else 0L ) } hunk.atom <- trim_hunk(hunk.atom, "cur", line.cut) hunk.grps[[grp.cut]][[hunk.cut]] <- hunk.atom null.hunks <- seq_len(length(hunk.grps[[grp.cut]]) - hunk.cut) + hunk.cut hunk.grps[[grp.cut]][null.hunks] <- lapply( hunk.grps[[grp.cut]][null.hunks], function(h.a) { h.a <- trim_hunk(h.a, "cur", 0L) h.a <- trim_hunk(h.a, "tar", 0L) h.a } ) } } else if (!cut.off && length(cum.len)) { lines.omitted <- lines.total hunk.grps.omitted <- hunk.grps.count for(i in seq_along(hunk.grps)) hunk.grps[[i]] <- empty_hunk_grp(hunk.grps[[i]]) } diffs.trim <- count_diffs(hunk.grps) attr(hunk.grps, "meta") <- list( lines=as.integer(c(lines.omitted, lines.total)), hunks=as.integer(c(hunk.grps.omitted, hunk.grps.count)), diffs=as.integer(c(diffs.orig - diffs.trim, diffs.orig)) ) hunk.grps } # Helper fun line_count <- function(rng) if(rng[[1L]]) rng[[2L]] - rng[[1L]] + 1L else 0L # Count how many "lines" of differences there are in the hunks # # Counts original diff lines, not lines left after trim. This is because # we are checking for 'str' folding, and 'str' folding should only happen # if the folded results fits fully within limit. 
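#
# Illustrative sketch only (not part of the package logic): the counting rule
# described above, applied to a hand-built hunk group list. The names
# `demo_line_count`, `demo_count_diffs`, `demo.grps` and `demo.n` are
# hypothetical and exist only for this example. It assumes, as in the code
# above, that a range of c(0L, 0L) means "no lines" and that context hunks
# contribute nothing to the count.

demo_line_count <- function(rng)
  if(rng[[1L]]) rng[[2L]] - rng[[1L]] + 1L else 0L

demo_count_diffs <- function(x)
  sum(
    vapply(
      unlist(x, recursive=FALSE),   # flatten hunk groups into atomic hunks
      function(y)
        if(y$context) 0L
        else demo_line_count(y$tar.rng) + demo_line_count(y$cur.rng),
      integer(1L)
  ) )

demo.grps <- list(
  list(  # group 1: two matching lines, then 2 target lines vs 1 current line
    list(context=TRUE, tar.rng=c(1L, 2L), cur.rng=c(1L, 2L)),
    list(context=FALSE, tar.rng=c(3L, 4L), cur.rng=c(3L, 3L))
  ),
  list(  # group 2: pure insertion of three lines in current
    list(context=FALSE, tar.rng=c(0L, 0L), cur.rng=c(7L, 9L))
  )
)
# (2 + 1) lines from the first group plus (0 + 3) from the second
demo.n <- demo_count_diffs(demo.grps)
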
# # param x should be a hunk group list count_diffs <- function(x) { sum( vapply( unlist(x, recursive=FALSE), function(y) if(y$context) 0L else line_count(y$tar.rng) + line_count(y$cur.rng), integer(1L) ) ) } # More detailed counting of differences; note that context counting is messed # up b/c context's are duplicated around each hunk. This is primarily used for # the summary method count_diffs_detail <- function(x) { x.flat <- unlist(x, recursive=FALSE) guides <- vapply(x.flat, "[[", logical(1L), "guide") vapply( x.flat[!guides], function(y) if(y$context) c(match=line_count(y$tar.rng), delete=0L, add=0L) else c(match=0L, delete=line_count(y$tar.rng), add=line_count(y$cur.rng)), integer(3L) ) } count_diff_hunks <- function(x) sum(!vapply(unlist(x, recursive=FALSE), "[[", logical(1L), "context")) diffobj/R/misc.R0000755000176200001440000003006714126712540013147 0ustar liggesusers# Copyright (C) 2021 Brodie Gaslam # # This file is part of "diffobj - Diffs for R Objects" # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # Go to for a copy of the license. # Used so that `with_mock` will work since these are primitives, for testing interactive <- function() base::interactive() readline <- function(...) if(interactive()) base::readline(...) 
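
# Illustrative sketch only (hypothetical names, not package code): how runs
# in a run-length encoding map back to positions in the original vector,
# which is the idea behind the `rle_sub` helper in this file. It assumes the
# runs of interest are selected with a logical index on the run lengths.

demo.rle <- rle(c("a", "a", "b", "c", "c", "c"))
demo.ends <- cumsum(demo.rle$lengths)            # last position of each run
demo.starts <- c(1L, head(demo.ends, -1L) + 1L)  # first position of each run
demo.runs <- Map(seq, from=demo.starts, to=demo.ends, by=1L)
demo.long <- demo.runs[demo.rle$lengths >= 2L]   # positions of runs of 2+
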
# nocov # Returns the indices of the original rle object that correspond to the # ind rle values rle_sub <- function(rle, ind) { ind <- if(is.numeric(ind)) { as.integer(ind) } else if(is.logical(ind)) { which(ind) } else stop("Internal Error: unexpected `ind` input") # nocov if(!all(ind) > 0 || !all(diff(ind) > 0)) stop("Internal Error: `ind` should be monotonically increasing") # nocov len.cum <- cumsum(rle$lengths) all.ind <- Map( seq, from=c(1L, head(len.cum, -1L) + 1L), to=len.cum, by=1L ) all.ind[ind] } # concatenate method for factors c.factor <- function(..., recursive=FALSE) { dots <- list(...) dots.n.n <- dots[!vapply(dots, is.null, logical(1L))] if(!length(dots)) factor(character()) else { if( !all(vapply(dots.n.n, is, logical(1L), "factor")) || length(unique(lapply(dots.n.n, levels))) != 1L ) { NextMethod() } else { int.f <- unlist(lapply(dots.n.n, as.integer)) lvl <- levels(dots[[1L]]) factor(lvl[int.f], levels=lvl) } } } # Pull out the names of the functions in a sys.call stack stack_funs <- function(s.c) { if(!length(s.c)) stop("Internal Error: call stack empty; contact maintainer.") #nocov vapply( s.c, function(call) paste0(deparse(call), collapse="\n"), character(1L) ) } .internal.call <- quote(.local(target, current, ...)) # Pull out the first call reading back from sys.calls that is likely to be # be the top level call to the diff* methods. 
This is somewhat fragile # unfortunately, but there doesn't seem to be a systematic way to figure this # out which_top <- function(s.c) { if(!length(s.c)) # nocov start stop("Internal Error: stack should have at least one call, contact maintainer") # nocov end funs <- stack_funs(s.c) fun.ref <- stack_funs(list(.internal.call)) # find .local call fun.ref.loc <- match(fun.ref, funs, nomatch=0L) f.rle <- rle(funs) val.calls <- f.rle$lengths == 2 # default if failed to find a value is last call on stack res <- length(s.c) if(any(val.calls) && fun.ref.loc) { # return first index of last pairs of identical calls in the call stack # that is followed by a correct .internal call, and also that are not # calls to `eval`. rle.elig <- rle_sub(f.rle, which(val.calls)) rle.elig.max <- vapply(rle.elig, max, integer(1L)) rle.followed <- which( rle.elig.max < max(fun.ref.loc) & !grepl("eval\\(", funs[rle.elig.max]) ) if(length(rle.followed)) { # can't find correct one res <- rle.elig[[max(rle.followed)]][1L] } } res } get_fun <- function(name, env) { get.fun <- if(is.name(name) || (is.character(name) && length(name) == 1L)) { try(get(as.character(name), envir=env), silent=TRUE) } else if( is.call(name) && ( identical(as.character(name[[1L]]), "::") || identical(as.character(name[[1L]]), ":::") ) && length(name) == 3L ) { get.fun <- try(eval(name, env)) } else function(...) NULL if(is.function(get.fun)) get.fun else { warning( "Unable to find function `", deparse(name), "` to ", "match call with." ) NULL } } extract_call <- function(s.c, par.env) { idx <- which_top(s.c) found.call <- s.c[[idx]] no.match <- list(call=NULL, tar=NULL, cur=NULL) get.fun <- get_fun(found.call[[1L]], env=par.env) res <- no.match if(is.function(get.fun)) { found.call.m <- try( # this creates an environment where `...` is available so we don't # get a "... used in a situation it does not exist error" (issue 134) (function(...) 
{ match.call(definition=get.fun, call=found.call, envir=environment()) })() ) if(!inherits(found.call.m, "try-error")) { if(length(found.call.m) < 3L) { found.call.ml <- as.list(found.call.m) length(found.call.ml) <- 3L # found.call.ml[[3L]] <- quote(list(x=))[[2L]] found.call.m <- as.call(found.call.ml) } res <- list(call=found.call.m, tar=found.call.m[[2L]], cur=found.call.m[[3L]]) } else { # nocov start # not sure if it's possible to get here, seems like not, maybe we can # get rid of try, but don't want to risk breaking stuff that used to work warning( "Failed trying to recover tar/cur expressions for display, see ", "previous errors." ) # nocov end } } res } #' Get Parent Frame of S4 Call Stack #' #' Implementation of the \code{function(x=parent.frame()) ...} pattern for the #' \code{\link[=diffPrint]{diff*}} methods since the normal pattern does not #' work with S4 methods. Works by looking through the call stack and #' identifying what call likely initiated the S4 dispatch. #' #' The function is not exported and intended only for use as the default value #' for the \code{frame} argument for the \code{\link[=diffPrint]{diff*}} #' methods. #' #' Matching is done purely by looking for the last repeated call followed #' by \code{.local(target, current, ...)} that is not a call to \code{eval}. #' This pattern seems to match the correct call most of the time. #' Since methods can be renamed by the user we make no attempt to verify method #' names. This method could potentially be tricked if you implement custom #' \code{\link[=diffPrint]{diff*}} methods that somehow #' issue two identical sequential calls before calling \code{callNextMethod}. #' Failure in this case means the wrong \code{frame} will be returned. 
#' #' @return an environment par_frame <- function() { s.c <- head(sys.calls(), -1L) top <- which_top(s.c) par <- head(sys.parents(), -1L)[top] if(par) { head(sys.frames(), -1L)[[par]] } else .GlobalEnv # can't figure out how to cause this branch } # check whether running in knitr # in_knitr <- function() isTRUE(getOption('knitr.in.progress')) make_err_fun <- function(call) function(...) stop(simpleError(do.call(paste0, list(...)), call=call)) make_warn_fun <- function(call) function(...) warning(simpleWarning(do.call(paste0, list(...)), call=call)) # Function used to match against `str` calls since the existing function # does not actually define `max.level`; note it never is actually called # nocov start str_tpl <- function(object, max.level, comp.str, indent.str, ...) NULL # nocov end # utility fun to deparse into chr1L dep <- function(x) paste0(deparse(x, width.cutoff=500L), collapse="") # Reports how many levels deep each line of a `str` screen output is str_levels <- function(str.txt, wrap=FALSE) { if(length(str.txt) < 2L) { integer(length(str.txt)) } else { # annoying `wrap` kills leading whitespace, so we need separate patterns sub.pat <- if(wrap) { "^(\\.\\. 
)*\\.\\.[@$\\-]" } else { "^ ( \\.\\.)*[@$\\-]" } tl.pat <- if(wrap) "^(\\$|-)" else "^ (\\$|-)" subs <- character(length(str.txt)) subs.rg <- regexpr(sub.pat, str.txt, perl=TRUE) subs[subs.rg > 0] <- regmatches(str.txt, subs.rg) subs.fin <- regmatches(subs, gregexpr("\\.\\.", subs, perl=TRUE)) level <- vapply(subs.fin, length, integer(1L)) top.level <- grepl(tl.pat, str.txt) level[!!level & !top.level] <- level[!!level & !top.level] + 1L level[1L] <- 0L level[top.level] <- 1L # handle potential wrapping; need to detect which sections of the text # are at level 0, and if they are, give them the depth of the previous # section if(wrap) { sects <- c( 0L, cumsum(xor(head(level, -1L) == 0L, tail(level, -1L) == 0L)) ) level.s <- split(level, sects) if(length(level.s) > 1L) { for(i in 2L:length(level.s)){ if(!any(level.s[[i]])) level.s[[i]][] <- tail(level.s[[i - 1L]], 1L) } # could just unlist since sections are supposed to be monotonic in vec level <- unsplit(level.s, sects) } } level } } # Calculate how many lines the banner will take up banner_len <- function(mode) if(mode == "sidebyside") 1L else 2L # Compute display width in characters # # Note this does not account for the padding required .pad <- list(context=2L, sidebyside=2L, unified=2L) .min.width <- 6L calc_width <- function(width, mode) { # stopifnot( # is.numeric(width), length(width) == 1L, !is.na(width), is.finite(width), # width >= 0L, # is.character(mode), mode %in% c("context", "unified", "sidebyside") # ) width <- as.integer(width) width.tmp <- if(mode == "sidebyside") as.integer(floor((width - 2)/ 2)) else width as.integer(max(.min.width, width.tmp)) } calc_width_pad <- function(width, mode) { # stopifnot( # is.character(mode), mode %in% c("context", "unified", "sidebyside") # ) width.tmp <- calc_width(width, mode) width.tmp - .pad[[mode]] } # Helper function to retrieve a palette parameter get_pal_par <- function(format, param) { if(is.chr.1L(param) && is.null(names(param))) { param } else if(format 
%in% names(param)) { param[format] } else if (wild.match <- match("", names(param), nomatch=0L)) { param[wild.match] } else # nocov start stop("Internal Error: malformed palette parameter; contact maintainer.") # nocov end } # check whether argument list contains non-default formals has_non_def_formals <- function(arg.list) { stopifnot(is.pairlist(arg.list) || is.list(arg.list)) any( vapply( arg.list, function(x) is.name(x) && !nzchar(as.character(x)), logical(1L) ) ) } # Between `%bw%` <- function(x, y) { stopifnot(length(y) == 2L) if(y[[1L]] < y[[2L]]) { low <- y[[1L]] hi <- y[[2L]] } else { hi <- y[[1L]] low <- y[[2L]] } x >= low & x <= hi } flatten_list <- function(l) if(is.list(l) && !is.object(l) && length(l)) do.call(c, lapply(l, flatten_list)) else list(l) trimws2 <- function(x, which=c("both", "left", "right")) { if( !is.character(which) || !isTRUE(which[[1]] %in% c("both", "left", "right")) ) stop("Argument which is wrong") switch(which[[1]], both=gsub("^[ \t\r\n]*|[ \t\r\n]*$", "", x), left=gsub("^[ \t\r\n]*", "", x), right=gsub("[ \t\r\n]*$", "", x) ) } # this gets overwritten in .onLoad if needed (i.e. R version < 3.2) trimws <- NULL # Placeholders until we are able to use fansi versions substr2 <- function(x, start, stop, sgr.supported) { len.x <- length(x) if( (length(start) != 1L && length(start) != len.x) || (length(stop) != 1L && length(stop) != len.x) ) stop("`start` and `stop` must be length 1 or the same length as `x`.") res <- substr(x, start, stop) if(sgr.supported) { has.ansi <- grep("\033[", x, fixed=TRUE) if(length(has.ansi)) { res[has.ansi] <- crayon::col_substr( x[has.ansi], if(length(start) != 1L) start[has.ansi] else start, if(length(stop) != 1L) stop[has.ansi] else stop ) } } res } strsplit2 <- function(x, ..., sgr.supported) { res <- strsplit(x, ...) if(sgr.supported) { has.ansi <- grep("\033[", x, fixed=TRUE) if(length(has.ansi)) res[has.ansi] <- crayon::col_strsplit(x[has.ansi], ...) 
} res } nchar2 <- function(x, ..., sgr.supported) { if(sgr.supported) crayon::col_nchar(x, ...) else nchar(x, ...) } # These are internal methods for testing #' @export print.diffobj_ogewlhgiadfl <- function(x, ...) stop('failure') #' @export as.character.diffobj_ogewlhgiadfl2 <- function(x, ...) stop('failure2') #' @export as.character.diffobj_ogewlhgiadfl3 <- function(x, ...) x diffobj/R/s4.R0000755000176200001440000004307313777704534012562 0ustar liggesusers# Copyright (C) 2021 Brodie Gaslam # This file is part of "diffobj - Diffs for R Objects" # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # Go to for a copy of the license. #' @include misc.R #' @include styles.R #' @include pager.R NULL # S4 class definitions setClassUnion("charOrNULL", c("character", "NULL")) #' Dummy Doc File for S4 Methods with Existing Generics #' #' @keywords internal #' @name diffobj_s4method_doc #' @rdname diffobj_s4method_doc NULL #' Controls How Lines Within a Diff Hunk Are Aligned #' #' @slot threshold numeric(1L) between 0 and 1, what proportion of words #' in the lines must match in order to align them. Set to 1 to effectively #' turn aligning off. Defaults to 0.25. #' @slot min.chars integer(1L) positive, minimum number of characters that must #' match across lines in order to align them. This requirement is in addition #' to \code{threshold} and helps minimize spurious alignments. Defaults to #' 3. #' @slot count.alnum.only logical(1L) modifier for \code{min.chars}, whether to #' count alpha numeric characters only. 
#'   Helps reduce spurious alignment
#'   caused by meta character sequences such as \dQuote{[[1]]} that would
#'   otherwise meet the \code{min.chars} limit
#' @export AlignThreshold
#' @exportClass AlignThreshold
#' @examples
#' a1 <- AlignThreshold(threshold=0)
#' a2 <- AlignThreshold(threshold=1)
#' a3 <- AlignThreshold(threshold=0, min.chars=2)
#' ## Note how "e f g" is aligned
#' diffChr(c("a b c e", "d e f g"), "D e f g", align=a1, pager="off")
#' ## But now it is not
#' diffChr(c("a b c e", "d e f g"), "D e f g", align=a2, pager="off")
#' ## "e f" are not enough chars to align
#' diffChr(c("a b c", "d e f"), "D e f", align=a1, pager="off")
#' ## Override with min.chars, so now they align
#' diffChr(c("a b c", "d e f"), "D e f", align=a3, pager="off")

AlignThreshold <- setClass("AlignThreshold",
  slots=c(
    threshold="numeric",
    min.chars="integer",
    count.alnum.only="logical"
  ),
  validity=function(object) {
    if(
      length(object@threshold) != 1L || is.na(object@threshold) ||
      !object@threshold %bw% c(0, 1)
    )
      return("Slot `threshold` must be numeric(1L) between 0 and 1")
    if(!is.int.1L(object@min.chars) || object@min.chars < 0L)
      return("Slot `min.chars` must be integer(1L) and positive")
    if(!is.TF(object@count.alnum.only))
      return("Slot `count.alnum.only` must be TRUE or FALSE")
  }
)
setMethod(
  "initialize", "AlignThreshold",
  function(
    .Object, threshold=gdo("align.threshold"),
    min.chars=gdo("align.min.chars"),
    count.alnum.only=gdo("align.count.alnum.only"),
    ...
  ) {
    if(is.numeric(min.chars)) min.chars <- as.integer(min.chars)
    callNextMethod(
      .Object, threshold=threshold, min.chars=min.chars,
      count.alnum.only=count.alnum.only, ...
) } ) setClass("AutoContext", slots=c( min="integer", max="integer" ), validity=function(object) { if(!is.int.1L(object@min) || object@min < 0L) return("Slot `min` must be integer(1L), positive, and not NA") if(!is.int.1L(object@max)) return("Slot `max` must be integer(1L), and not NA") if(object@max > 0L && object@min > object@max) return("Slot `max` must be negative, or greater than slot `min`") TRUE } ) setClassUnion("doAutoCOrInt", c("AutoContext", "integer")) # pre-computed gutter data GuideLines <- setClass( "GuideLines", slots=c(target="integer", current="integer"), validity=function(object) { vals <- c(object@target, object@current) if(anyNA(vals) || any(vals < 1L)) return("Object may only contain strictly positive integer values") TRUE } ) setClass("StripRowHead", slots=c(target="ANY", current="ANY"), validity=function(object) { if(!isTRUE(err <- is.one.arg.fun(object@target))) return(err) if(!isTRUE(err <- is.one.arg.fun(object@current))) return(err) TRUE } ) setClass("Gutter", slots= c( insert="character", insert.ctd="character", delete="character", delete.ctd="character", match="character", match.ctd="character", guide="character", guide.ctd="character", fill="character", fill.ctd="character", context.sep="character", context.sep.ctd="character", pad="character", width="integer" ) ) setClass("Settings", slots=c( mode="character", # diff output mode context="doAutoCOrInt", line.limit="integer", style="Style", hunk.limit="integer", max.diffs="integer", word.diff="logical", unwrap.atomic="logical", align="AlignThreshold", ignore.white.space="logical", convert.hz.white.space="logical", strip.sgr="logical", sgr.supported="logical", frame="environment", tab.stops="integer", tar.exp="ANY", cur.exp="ANY", tar.banner="charOrNULL", cur.banner="charOrNULL", guides="ANY", guide.lines="GuideLines", trim="ANY", strip.row.head="StripRowHead", disp.width="integer", line.width="integer", text.width="integer", line.width.half="integer", text.width.half="integer",
gutter="Gutter", err="function", warn="function" ), prototype=list( disp.width=0L, text.width=0L, line.width=0L, text.width.half=0L, line.width.half=0L, guides=function(obj, obj.as.chr) integer(0L), trim=function(obj, obj.as.chr) cbind(1L, nchar(obj.as.chr)), ignore.white.space=TRUE, convert.hz.white.space=TRUE, word.diff=TRUE, unwrap.atomic=TRUE, strip.sgr=TRUE, sgr.supported=TRUE, err=stop, warn=warning ), validity=function(object){ int.1L.and.pos <- c( "disp.width", "line.width", "text.width", "line.width.half", "text.width.half" ) for(i in int.1L.and.pos) if(!is.int.1L(slot(object, i)) || slot(object, i) < 0L) return(sprintf("Slot `%s` must be integer(1L) and positive", i)) TF <- c( "ignore.white.space", "convert.hz.white.space", "word.diff", "unwrap.atomic", "strip.sgr" ) for(i in TF) if(!is.TF(slot(object, i)) || slot(object, i) < 0L) return(sprintf("Slot `%s` must be TRUE or FALSE", i)) if(!is.TF(object@guides) && !is.function(object@guides)) return("Slot `guides` must be TRUE, FALSE, or a function") if( is.function(object@guides) && !isTRUE(v.g <- is.two.arg.fun(object@guides)) ) return(sprintf("Slot `guides` is not a valid guide function (%s)", v.g)) if(!is.TF(object@trim) && !is.function(object@trim)) return("Slot `trim` must be TRUE, FALSE, or a function") if( is.function(object@trim) && !isTRUE(v.t <- is.two.arg.fun(object@trim)) ) return(sprintf("Slot `trim` is not a valid trim function (%s)", v.t)) TRUE } ) setMethod("initialize", "Settings", function(.Object, ...) { if(is.numeric(.Object@disp.width)) .Object@disp.width <- as.integer(.Object@disp.width) return(callNextMethod(.Object, ...)) } ) setGeneric("sideBySide", function(x, ...) standardGeneric("sideBySide")) setMethod("sideBySide", "Settings", function(x, ...) 
{ x@mode <- "sidebyside" x@text.width <- x@text.width.half x@line.width <- x@line.width.half x } ) .diff.dat.cols <- c( "orig", "raw", "trim", "trim.ind.start", "trim.ind.end", "comp", "eq", "fin", "fill", "word.ind", "tok.rat" ) # Validate the *.dat slots of the Diff objects # # We stopped using this one because it was too expensive computationally. # Saving the code just in case. # valid_dat <- function(x) { # char.cols <- c("orig", "raw", "trim", "eq", "comp", "fin") # list.cols <- c("word.ind") # zerotoone.cols <- "tok.rat" # integer.cols <- c("trim.ind.start", "trim.ind.end") # # if(!is.list(x)) { # "should be a list" # } else if(!identical(names(x), .diff.dat.cols)) { # paste0("should have names ", dep(.diff.dat.cols)) # } else if( # length( # unique( # vapply( # x[c(char.cols, list.cols, zerotoone.cols, integer.cols)], # length, integer(1L) # ) # ) ) != 1L # ) { # "should have equal length components" # } else { # if( # length( # not.char <- which(!vapply(x[char.cols], is.character, logical(1L))) # ) # ){ # sprintf("element `%s` should be character", char.cols[not.char][[1L]]) # } else if ( # length( # not.int <- which(!vapply(x[integer.cols], is.integer, logical(1L))) # ) # ) { # sprintf("element `%s` should be integer", integer.cols[not.int][[1L]]) # } else if ( # length( # not.list <- which(!vapply(x[list.cols], is.list, logical(1L))) # ) # ) { # sprintf("element `%s` should be list", list.cols[not.list][[1L]]) # } else if ( # !all( # vapply( # x$word.ind, # function(y) # is.integer(y) && is.integer(attr(y, "match.length")) && # length(y) == length(attr(y, "match.length")), # logical(1L) # ) ) # ) { # "element `word.ind` is not in expected format" # } else if ( # !is.numeric(x$tok.rat) || anyNA(x$tok.rat) || !all(x$tok.rat %bw% c(0, 1)) # ) { # "element `tok.rat` should be numeric with all values between 0 and 1" # } else if (!is.logical(x$fill) || anyNA(x$fill)) { # "element `fill` should be logical and not contain NAs" # } # else TRUE # } # } #' Diff 
Result Object #' #' Return value for the \code{\link[=diffPrint]{diff*}} methods. Has #' \code{show}, \code{as.character}, \code{summary}, \code{[}, \code{head}, #' \code{tail}, and \code{any} methods. #' #' @export setClass("Diff", slots=c( target="ANY", # Actual object tar.dat="list", # see line_diff() for details current="ANY", cur.dat="list", diffs="list", trim.dat="list", # result of trimming sub.index="integer", sub.head="integer", sub.tail="integer", capt.mode="character", # whether in print or str mode hit.diffs.max="logical", diff.count.full="integer", # only really used by diffStr when folding hunk.heads="list", etc="Settings" ), prototype=list( capt.mode="print", trim.dat=list(lines=integer(2L), hunks=integer(2L), diffs=integer(2L)), hit.diffs.max=FALSE, diff.count.full=-1L ), validity=function(object) { # Most of the validation is done by `check_args` if( !is.chr.1L(object@capt.mode) || ! object@capt.mode %in% c("print", "str", "chr", "deparse", "file") ) return("slot `capt.mode` must be one of \"print\", \"str\", \"chr\", \"deparse\", or \"file\"") not.list.3 <- !is.list(object@trim.dat) || length(object@trim.dat) != 3L not.names <- !identical(names(object@trim.dat), c("lines", "hunks", "diffs")) not.comp.1 <- !all(vapply(object@trim.dat, is.integer, logical(1L))) not.comp.2 <- !all(vapply(object@trim.dat, length, integer(1L)) == 2L) if(not.list.3) return( paste0( "slot `trim.dat` is not a length 3 list (", typeof(object@trim.dat), ", ", length(object@trim.dat), ")" ) ) if(not.names) return( paste0( "slot `trim.dat` has wrong names ", deparse(names(object@trim.dat))[1] ) ) if(not.comp.1) return( paste0( "slot `trim.dat` has non-integer components ", deparse(vapply(object@trim.dat, typeof, character(1L)))[1] ) ) if(not.comp.2) return("slot `trim.dat` has components of length != 2") ## too expensive computationally # if(!isTRUE(tar.dat.val <- valid_dat(object@tar.dat))) # return(paste0("slot `tar.dat` not valid: ", tar.dat.val)) # if(!isTRUE(cur.dat.val <- valid_dat(object@cur.dat))) #
return(paste0("slot `cur.dat` not valid: ", cur.dat.val)) if(!is.TF(object@hit.diffs.max)) return("slot `hit.diffs.max` must be TRUE or FALSE") TRUE } ) #' @rdname finalizeHtml setMethod("finalizeHtml", c("Diff"), function(x, x.chr, ...) { style <- x@etc@style html.output <- style@html.output if(html.output == "auto") { html.output <- if(is(style@pager, "PagerBrowser")) "page" else "diff.only" } if(html.output == "page") { x.chr <- c( make_dummy_row(x), sprintf("
%s
", x.chr), sprintf( " ", if(style@scale) "true" else "false" ) ) rez.fun <- if(style@scale) "resize_diff_out_scale" else "resize_diff_out_no_scale" js <- try(readLines(style@js), silent=TRUE) if(inherits(js, "try-error")) { cond <- attr(js, "condition") warning( "Unable to read provided js file \"", style@js, "\" (error: ", paste0(conditionMessage(cond), collapse=""), ")." ) js <- "" } else { js <- paste0( c( js, sprintf( "window.addEventListener('resize', %s, true);\n %s();", rez.fun, rez.fun ) ), collapse="\n" ) } } else js <- "" callNextMethod(x, x.chr, js=js, ...) } ) # Helper fun used by `show` for Diff and DiffSummary objects show_w_pager <- function(txt, pager) { use.pager <- use_pager(pager, attr(txt, "len")) file.keep <- !is.na(pager@file.path) # Finalize and output if(use.pager) { disp.f <- if(!is.na(pager@file.path)) pager@file.path else paste0(tempfile(), ".", pager@file.ext) if(!file.keep) on.exit(add=TRUE, unlink(disp.f)) writeLines(txt, disp.f) if( isTRUE(pager@make.blocking) || (is.na(pager@make.blocking) && !file.keep) ) make_blocking(pager@pager)(disp.f) else pager@pager(disp.f) } else { cat(txt, sep="\n") } } setMethod("show", "Diff", function(object) { txt <- as.character(object) show_w_pager(txt, object@etc@style@pager) invisible(NULL) } ) # Compute what fraction of the lines in target and current actually end up # in the diff; some of the complexity is driven by repeated context hunks setGeneric("lineCoverage", function(x) standardGeneric("lineCoverage")) setMethod("lineCoverage", "Diff", function(x) { lines_in_hunk <- function(z, ind) if(z[[ind]][[1L]]) z[[ind]][[1L]]:z[[ind]][[2L]] hunks.f <- unlist(x@diffs, recursive=FALSE) lines.tar <- length( unique(unlist(lapply(hunks.f, lines_in_hunk, "tar.rng.sub"))) ) lines.cur <- length( unique(unlist(lapply(hunks.f, lines_in_hunk, "cur.rng.sub"))) ) min( 1, (lines.tar + lines.cur) / ( length(x@tar.dat$raw) + length(x@cur.dat$raw)) ) } ) #' Determine if Diff Object Has Differences #' #' @param x a 
\code{Diff} object #' @param ... unused, for compatibility with generic #' @param na.rm unused, for compatibility with generic #' @return TRUE if there are differences, FALSE if not, FALSE with warning if #' there are no visible differences but objects are not \code{\link{all.equal}} #' @examples #' any(diffChr(letters, letters)) #' any(diffChr(letters, letters[-c(1, 5, 8)])) setMethod("any", "Diff", function(x, ..., na.rm = FALSE) { dots <- list(...) if(length(dots)) stop("`any` method for `Diff` supports only one argument", call. = FALSE) res <- any( which( !vapply( unlist(x@diffs, recursive=FALSE), "[[", logical(1L), "context" ) ) ) if(!res && !isTRUE(all.equal(x@target, x@current))) warning( "No visible differences, but objects are NOT `all.equal`.", call.=FALSE ) res } ) # See diff_myers for explanation of slots setClass( "MyersMbaSes", slots=c( a="character", b="character", type="factor", length="integer", offset="integer", diffs="integer" ), prototype=list( type=factor(character(), levels=c("Match", "Insert", "Delete")) ), validity=function(object) { if(!identical(levels(object@type), c("Match", "Insert", "Delete"))) return("Slot `type` levels incorrect") if(any(is.na(c(object@a, object@b)))) return("Slots `a` and `b` may not contain NA values") if(any(is.na(c(object@type, object@length, object@offset)))) return("Slots `type`, `length`, or `offset` may not contain NA values") if(any(c(object@type, object@length, object@offset) < 0)) return( paste0( "Slots `type`, `length`, and `offset` must have values greater ", "than or equal to zero" ) ) if(!is.int.1L(object@diffs)) return("Slot `diffs` must be integer(1L) and not NA") TRUE } ) # Run validity on S4 objects # # Intended for use within check_args; unfortunately can't use complete=TRUE # because we are using ANY slots with S3 objects there-in, which causes # the complete check to freak out with "trying to get slot 'package' from..."
# # @param x object to test # @param err.tpl a string used with sprintf, must contain two \dQuote{%s} for # respectively \code{arg.name} and the class name # @param arg.name argument the object is supposed to come from # @param err error reporting function valid_object <- function( x, arg.name, err, err.tpl="Argument `%s` is an invalid `%s` object because:" ) { if(isS4(x)) { if(!isTRUE(test <- validObject(x, test=TRUE))) { err( paste( sprintf(err.tpl, arg.name, class(x)[[1L]]), strwrap(test, initial="- ", prefix=" "), collapse="\n" ) ) } } } diffobj/R/rdiff.R0000755000176200001440000001253513777704534013325 0ustar liggesusers# Copyright (C) 2021 Brodie Gaslam # # This file is part of "diffobj - Diffs for R Objects" # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # Go to for a copy of the license. #' Run Rdiff Directly on R Objects #' #' These functions are here for reference and testing purposes. They are #' wrappers to \code{tools::Rdiff} and rely on an existing system diff utility. #' You should be using \code{\link{ses}} or \code{\link{diffChr}} instead of #' \code{Rdiff_chr} and \code{\link{diffPrint}} instead of \code{Rdiff_obj}. #' See limitations in note. #' #' \code{Rdiff_chr} runs diffs on character vectors or objects coerced to #' character vectors, where each value in the vectors is treated as a line in a #' file. \code{Rdiff_chr} always runs with the \code{useDiff} and \code{Log} #' parameters set to \code{TRUE}. #' #' \code{Rdiff_obj} runs diffs on the \code{print}ed representation of #' the provided objects. 
For each of \code{from} and \code{to}, it checks whether the value #' is a length 1 character vector referencing an RDS file, and if so uses the #' contents of that RDS file as the object to compare. #' #' @note These functions will try to use the system \code{diff} utility. This #' will fail in systems that do not have that utility available (e.g. Windows #' installation without Rtools). #' @importFrom tools Rdiff #' @export #' @seealso \code{\link{ses}}, \code{\link[=diffPrint]{diff*}} #' @param from character or object coercible to character for \code{Rdiff_chr}, #' any R object with \code{Rdiff_obj}, or a file pointing to an RDS object #' @param to character same as \code{from} #' @param nullPointers passed to \code{tools::Rdiff} #' @param silent TRUE or FALSE, whether to suppress output to screen #' @param minimal TRUE or FALSE, whether to exclude the lines that show the #' actual differences and keep only the edit script commands #' @return the Rdiff output, invisibly if \code{silent} is FALSE #' @examples #' \dontrun{ #' Rdiff_chr(letters[1:5], LETTERS[1:5]) #' Rdiff_obj(letters[1:5], LETTERS[1:5]) #' } Rdiff_chr <- function(from, to, silent=FALSE, minimal=FALSE, nullPointers=TRUE) { A <- try(as.character(from)) if(inherits(A, "try-error")) stop("Unable to coerce `from` to character.") B <- try(as.character(to)) if(inherits(B, "try-error")) stop("Unable to coerce `to` to character.") af <- tempfile() bf <- tempfile() writeLines(A, af) writeLines(B, bf) on.exit(unlink(c(af, bf))) Rdiff_run( silent=silent, minimal=minimal, from=af, to=bf, nullPointers=nullPointers ) } #' @export #' @rdname Rdiff_chr Rdiff_obj <- function(from, to, silent=FALSE, minimal=FALSE, nullPointers=TRUE) { dummy.env <- new.env() # used b/c unique object files <- try( vapply( list(from, to), function(x) { if( is.character(x) && length(x) == 1L && !is.na(x) && file_test("-f", x) ) { rdstry <- tryCatch(readRDS(x), error=function(x) dummy.env) if(!identical(rdstry, dummy.env)) x <- rdstry } f <- tempfile() on.exit(unlink(f))
capture.output(if(isS4(x)) show(x) else print(x), file=f) on.exit() f }, character(1L) ) ) if(inherits(files, "try-error")) stop("Unable to store text representation of objects") on.exit(unlink(files)) Rdiff_run( from=files[[1L]], to=files[[2L]], silent=silent, minimal=minimal, nullPointers=nullPointers ) } # Internal use only: BEWARE, will unlink from, to Rdiff_run <- function(from, to, nullPointers, silent, minimal) { stopifnot( isTRUE(silent) || identical(silent, FALSE), isTRUE(minimal) || identical(minimal, FALSE) ) res <- tryCatch( Rdiff( from=from, to=to, useDiff=TRUE, Log=TRUE, nullPointers=nullPointers )$out, warning=function(e) stop( "`tools::Rdiff` returned a warning; this likely means you are running ", "without a `diff` utility accessible to R" ) ) if(!is.character(res)) # nocov start stop("Internal Error: Unexpected tools::Rdiff output, contact maintainer") # nocov end res <- if(minimal) res[!grepl("^[<>-]", res)] else res if(silent) res else { cat(res, sep="\n") invisible(res) } } #' Attempt to Detect Whether diff Utility is Available #' #' Checks whether \code{\link[=Rdiff]{tools::Rdiff}} issues a warning when #' running with \code{useDiff=TRUE} and if it does assumes this is because the #' diff utility is not available. Intended primarily for testing purposes. 
#' #' @export #' @return TRUE or FALSE #' @param test.with function to test for diff presence with, typically Rdiff #' @examples #' has_Rdiff() has_Rdiff <- function(test.with=tools::Rdiff) { f.a <- tempfile() f.b <- tempfile() on.exit(unlink(c(f.a, f.b))) writeLines(letters[1:3], f.a) writeLines(LETTERS, f.b) tryCatch( { test.with( from=f.a, to=f.b, useDiff=TRUE, Log=TRUE, nullPointers=FALSE ) TRUE }, warning=function(e) FALSE ) } diffobj/R/rds.R0000755000176200001440000000171713777704534013023 0ustar liggesusers# Copyright (C) 2021 Brodie Gaslam # # This file is part of "diffobj - Diffs for R Objects" # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # Go to for a copy of the license. # Check Whether Input Could Be Reference to RDS File and Load if it Is get_rds <- function(x) { tryCatch( if( (is.chr.1L(x) && Encoding(x) != "bytes" && file_test("-f", x)) || inherits(x, "connection") ) { suppressWarnings(readRDS(x)) } else x, error=function(e) x ) } diffobj/R/get.R0000755000176200001440000000314313777704532013003 0ustar liggesusers# Copyright (C) 2021 Brodie Gaslam # # This file is part of "diffobj - Diffs for R Objects" # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 2 of the License, or # (at your option) any later version. 
# # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # Go to for a copy of the license. # Retrieves data from the data elements of the Diff object based on the index # values provided in ind. Positive values draw from `tar` elements, and # negative draw from `cur` elements. # # returns a list with the elements. If type is length 1, you will probably # want to unlist the return value get_dat <- function(x, ind, type) { stopifnot( is(x, "Diff"), is.integer(ind), is.chr.1L(type) && type %in% .diff.dat.cols ) # Need to figure out what zero indices are; previously would return # NA_character_, but now since we're getting a whole bunch of different # stuff not sure what the right return value should be, or even if we produce # zero indices anymore get_dat_raw(ind, x@tar.dat[[type]], x@cur.dat[[type]]) } get_dat_raw <- function(ind, tar, cur) { template <- tar[0L] length(template) <- length(ind) template[which(ind < 0L)] <- cur[abs(ind[ind < 0L])] template[which(ind > 0L)] <- tar[abs(ind[ind > 0L])] template } diffobj/R/core.R0000755000176200001440000006761114123062122013140 0ustar liggesusers# Copyright (C) 2021 Brodie Gaslam # # This file is part of "diffobj - Diffs for R Objects" # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # Go to for a copy of the license. 
#' @include s4.R NULL #' Generate a character representation of Shortest Edit Sequence #' #' @keywords internal #' @seealso \code{\link{ses}} #' @param x S4 object of class \code{MyersMbaSes} #' @param ... unused #' @return character vector setMethod("as.character", "MyersMbaSes", function(x, ...) { dat <- as.data.frame(x) # Split our data into sections that have either deletes or inserts and get # rid of the matches dat <- dat[dat$type != "Match", ] d.s <- split(dat, dat$section) # For each section, compute whether we should display, change, insert, # delete, or both, and based on that append to the ses string ses_rng <- function(off, len) paste0(off, if(len > 1L) paste0(",", off + len - 1L)) vapply( unname(d.s), function(d) { del <- sum(d$len[d$type == "Delete"]) ins <- sum(d$len[d$type == "Insert"]) if(del) { del.first <- which(d$type == "Delete")[[1L]] del.off <- d$off[del.first] } if(ins) { ins.first <- which(d$type == "Insert")[[1L]] ins.off <- d$off[ins.first] } if(del && ins) { paste0(ses_rng(del.off, del), "c", ses_rng(ins.off, ins)) } else if (del) { paste0(ses_rng(del.off, del), "d", d$last.b[[1L]]) } else if (ins) { paste0(d$last.a[[1L]], "a", ses_rng(ins.off, ins)) } else { stop("Logic Error: unexpected edit type; contact maintainer.") # nocov } }, character(1L) ) } ) # Used for mapping edit actions to numbers so we can use numeric matrices # absolutely must be used to create the @type factor in the MBA object. # # DO NOT CHANGE LIGHTLY; SOME CODE MIGHT RELY ON THE UNDERLYING INTEGER POSITIONS .edit.map <- c("Match", "Insert", "Delete") setMethod("as.matrix", "MyersMbaSes", function(x, row.names=NULL, optional=FALSE, ...) 
{ # map del/ins/match to numbers len <- length(x@type) matches <- x@type == "Match" section <- cumsum(matches + c(0L, head(matches, -1L))) # Track what the max offset observed so far for elements of the `a` string # so that if we have an insert command we can get the insert position in # `a` last.a <- c( if(len) 0L, head( cummax( ifelse(x@type != "Insert", x@offset + x@length, 1L) ) - 1L, -1L ) ) # Do same thing with `b`, complicated because the matching entries are all # in terms of `a` last.b <- c( if(len) 0L, head(cumsum(ifelse(x@type != "Delete", x@length, 0L)), -1L) ) cbind( type=as.integer(x@type), len=x@length, off=x@offset, section=section, last.a=last.a, last.b = last.b ) } ) setMethod("as.data.frame", "MyersMbaSes", function(x, row.names=NULL, optional=FALSE, ...) { len <- length(x@type) mod <- c("Insert", "Delete") dat <- data.frame(type=x@type, len=x@length, off=x@offset) matches <- dat$type == "Match" dat$section <- cumsum(matches + c(0L, head(matches, -1L))) # Track what the max offset observed so far for elements of the `a` string # so that if we have an insert command we can get the insert position in # `a` dat$last.a <- c( if(nrow(dat)) 0L, head( cummax(ifelse(dat$type != "Insert", dat$off + dat$len, 1L)) - 1L, -1L ) ) # Do same thing with `b`, complicated because the matching entries are all # in terms of `a` dat$last.b <- c( if(nrow(dat)) 0L, head(cumsum(ifelse(dat$type != "Delete", dat$len, 0L)), -1L) ) dat } ) #' Shortest Edit Script #' #' Computes shortest edit script to convert \code{a} into \code{b} by removing #' elements from \code{a} and adding elements from \code{b}. Intended primarily #' for debugging or for other applications that understand that particular #' format. See \href{http://www.gnu.org/software/diffutils/manual/diffutils.html#Detailed-Normal}{GNU diff docs} #' for how to interpret the symbols. 
#' #' \code{ses} will be much faster than any of the #' \code{\link[=diffPrint]{diff*}} methods, particularly for large inputs with #' limited numbers of differences. #' #' NAs are treated as the string \dQuote{NA}. Non-character inputs are coerced #' to character. #' #' \code{ses_dat} provides a semi-processed \dQuote{machine-readable} version of #' precursor data to \code{ses} that may be useful for those desiring to use the #' raw diff data and not the printed output of \code{diffobj}, but do not wish #' to manually parse the \code{ses} output. Whether it is faster than #' \code{ses} or not depends on the ratio of matching to non-matching values as #' \code{ses_dat} includes matching values whereas \code{ses} does not. #' \code{ses_dat} objects have a print method that makes it easy to interpret #' the diff, but are actually data.frames. You can see the underlying data by #' using \code{as.data.frame}, removing the "ses_dat" class, etc. #' #' @export #' @param a character #' @param b character #' @param extra TRUE (default) or FALSE, whether to also return the indices in #' \code{a} and \code{b} the diff values are taken from. Set to FALSE for a #' small performance gain. #' @inheritParams diffPrint #' @param warn TRUE (default) or FALSE, whether to warn if we hit #' \code{max.diffs}. #' @return character shortest edit script, or a machine readable version of it #' as a \code{ses_dat} object, which is a \code{data.frame} with columns #' \code{op} (factor, values \dQuote{Match}, \dQuote{Insert}, or #' \dQuote{Delete}), \code{val} character corresponding to the value taken #' from either \code{a} or \code{b}, and if \code{extra} is TRUE, integer #' columns \code{id.a} and \code{id.b} corresponding to the indices in #' \code{a} or \code{b} that \code{val} was taken from. See Details.
#' @examples #' a <- letters[1:6] #' b <- c('b', 'CC', 'DD', 'd', 'f') #' ses(a, b) #' (dat <- ses_dat(a, b)) #' str(dat) # data.frame with a print method #' #' ## use `ses_dat` output to construct a minimal diff #' ## color with ANSI CSI SGR #' diff <- dat[['val']] #' del <- dat[['op']] == 'Delete' #' ins <- dat[['op']] == 'Insert' #' if(any(del)) #' diff[del] <- paste0("\033[33m- ", diff[del], "\033[m") #' if(any(ins)) #' diff[ins] <- paste0("\033[34m+ ", diff[ins], "\033[m") #' if(any(!ins & !del)) #' diff[!ins & !del] <- paste0(" ", diff[!ins & !del]) #' writeLines(diff) #' #' ## We can recover `a` and `b` from the data #' identical(subset(dat, op != 'Insert', val)[[1]], a) #' identical(subset(dat, op != 'Delete', val)[[1]], b) ses <- function(a, b, max.diffs=gdo("max.diffs"), warn=gdo("warn")) { args <- ses_prep(a=a, b=b, max.diffs=max.diffs, warn=warn) as.character( diff_myers( args[['a']], args[['b']], max.diffs=args[['max.diffs']], warn=args[['warn']] ) ) } #' @export #' @rdname ses ses_dat <- function( a, b, extra=TRUE, max.diffs=gdo("max.diffs"), warn=gdo("warn") ) { args <- ses_prep(a=a, b=b, max.diffs=max.diffs, warn=warn) if(!is.TF(extra)) stop("Argument `extra` must be TRUE or FALSE.") mba <- diff_myers( args[['a']], args[['b']], max.diffs=args[['max.diffs']], warn=args[['warn']] ) # reorder so that deletes are before (lack of foresight in setting factor # levels...) 
inserts in each section sec <- cumsum(mba@type == 'Match') o <- order(sec, c(1L,3L,2L)[as.integer(mba@type)]) type <- mba@type[o] len <- mba@length[o] off <- mba@offset[o] # offsets are indices in `a` for 'Match' and 'Delete', and in `b` for insert # see `diff_myers` for details id <- rep(seq_along(type), len) type2 <- type[id] off2 <- off[id] id2 <- sequence(len) + off2 - 1L use.a <- type2 %in% c('Match', 'Delete') use.b <- !use.a values <- character(length(id)) values[use.a] <- a[id2[use.a]] values[use.b] <- b[id2[use.b]] res <- if(extra) { id.a <- id.b <- rep(NA_integer_, length(values)) id.a[use.a] <- id2[use.a] id.b[use.b] <- id2[use.b] data.frame( op=type2, val=values, id.a=id.a, id.b=id.b, stringsAsFactors=FALSE ) } else { data.frame(op=type2, val=values, stringsAsFactors=FALSE) } structure(res, class=c('ses_dat', class(res))) } #' @export print.ses_dat <- function(x, quote=FALSE, ...) { op <- x[['op']] diff <- matrix( "", 3, nrow(x), dimnames=list(c('D:', 'M:', 'I:'), character(nrow(x))) ) d <- op == 'Delete' m <- op == 'Match' i <- op == 'Insert' diff[1, d] <- x[['val']][d] diff[2, m] <- x[['val']][m] diff[3, i] <- x[['val']][i] writeLines( sprintf( "\"ses_dat\" object (Match: %d, Delete: %d, Insert: %d):", sum(m), sum(d), sum(i) ) ) print(diff, quote=quote, ...) 
invisible(x) } # Internal validation fun for ses_* ses_prep <- function(a, b, max.diffs, warn) { if(!is.character(a)) { a <- try(as.character(a)) if(inherits(a, "try-error")) stop("Argument `a` is not character and could not be coerced to such") } if(!is.character(b)) { b <- try(as.character(b)) if(inherits(b, "try-error")) stop("Argument `b` is not character and could not be coerced to such") } if(is.numeric(max.diffs)) max.diffs <- as.integer(max.diffs) if(!is.int.1L(max.diffs)) stop("Argument `max.diffs` must be scalar integer.") if(!is.TF(warn)) stop("Argument `warn` must be TRUE or FALSE.") if(anyNA(a)) a[is.na(a)] <- "NA" if(anyNA(b)) b[is.na(b)] <- "NA" list(a=a, b=b, max.diffs=max.diffs, warn=warn) } #' Diff two character vectors #' #' Implementation of Myers' diff algorithm with linear space refinement #' originally implemented by Mike B. Allen as part of #' \href{http://www.ioplex.com/~miallen/libmba/}{libmba} #' version 0.9.1. This implementation is a heavily modified version of the #' original C code and is not compatible with the \code{libmba} library. #' The C code is simplified by using fixed size arrays instead of variable #' ones for tracking the longest reaching paths and for recording the shortest #' edit scripts. Additionally all error handling and memory allocation calls #' have been moved to the internal R functions designed to handle those things. #' A failover result is provided in the case where max diffs allowed is #' exceeded. Ability to provide custom comparison functions is removed. #' #' The result format indicates operations required to convert \code{a} into #' \code{b} in a precursor format to the GNU diff shortest edit script. The #' operations are \dQuote{Match} (do nothing), \dQuote{Insert} (insert one or #' more values of \code{b} into \code{a}), and \dQuote{Delete} (remove one or #' more values from \code{a}). The \code{length} slot dictates how #' many values to advance along, insert into, or delete from \code{a}.
The #' \code{offset} slot changes meaning depending on the operation. For #' \dQuote{Match} and \dQuote{Delete}, it is the starting index of that #' operation in \code{a}. For \dQuote{Insert}, it is the starting index in #' \code{b} of the values to insert into \code{a}; the index in \code{a} to #' insert at is implicit in previous operations. #' #' @keywords internal #' @param a character #' @param b character #' @param max.diffs integer(1L) how many differences before giving up; set to #' -1 to allow as many as there are up to the maximum allowed (~INT_MAX/4). #' @param warn TRUE or FALSE, whether to warn if we hit `max.diffs`. #' @return MyersMbaSes object #' @useDynLib diffobj, .registration=TRUE, .fixes="DIFFOBJ_" diff_myers <- function(a, b, max.diffs=-1L, warn=FALSE) { stopifnot( is.character(a), is.character(b), all(!is.na(c(a, b))), is.int.1L(max.diffs), is.TF(warn) ) a <- enc2utf8(a) b <- enc2utf8(b) res <- .Call(DIFFOBJ_diffobj, a, b, max.diffs) res <- setNames(res, c("type", "length", "offset", "diffs")) types <- .edit.map # silly that we have to generate a factor when we have the integer vector and # levels... Two unnecessary hashes. res$type <- factor(types[res$type], levels=types) res$offset <- res$offset + 1L # C 0-indexing originally res.s4 <- try(do.call("new", c(list("MyersMbaSes", a=a, b=b), res))) if(inherits(res.s4, "try-error")) # nocov start stop( "Logic Error: unable to instantiate shortest edit script object; contact ", "maintainer." ) # nocov end if(isTRUE(warn) && res$diffs < 0) { warning( "Exceeded `max.diffs`: ", abs(res$diffs), " vs ", max.diffs, " allowed. ", "Diff is probably suboptimal."
) } res.s4 } # Print Method for Shortest Edit Path # # Bare bones display of shortest edit path using GNU diff conventions # # @param object object to display # @return character the shortest edit path character representation, invisibly # @rdname diffobj_s4method_doc #' @rdname diffobj_s4method_doc setMethod("show", "MyersMbaSes", function(object) { res <- as.character(object) cat(res, sep="\n") invisible(res) } ) #' Summary Method for Shortest Edit Path #' #' Displays the data required to generate the shortest edit path for comparison #' between two strings. #' #' @export #' @keywords internal #' @param object the \code{diff_myers} object to display #' @param with.match logical(1L) whether to show what text the edit command #' refers to #' @param ... forwarded to the data frame print method used to actually display #' the data #' @return whatever the data frame print method returns setMethod("summary", "MyersMbaSes", function(object, with.match=FALSE, ...) { what <- vapply( seq_along(object@type), function(y) { t <- object@type[[y]] o <- object@offset[[y]] l <- object@length[[y]] vec <- if(t == "Insert") object@b else object@a paste0(vec[o:(o + l - 1L)], collapse="") }, character(1L) ) res <- data.frame( type=object@type, string=what, len=object@length, offset=object@offset, stringsAsFactors=FALSE ) if(!with.match) res <- res[-2L] print(res, ...) } ) # mode is display mode (sidebyside, etc.) # diff.mode is whether we are doing the first pass line diff, or doing the # in-hunk or word-wrap versions # warn is to allow us to suppress warnings after first hunk warning char_diff <- function(x, y, context=-1L, etc, diff.mode, warn) { stopifnot( diff.mode %in% c("line", "hunk", "wrap"), isTRUE(warn) || identical(warn, FALSE) ) max.diffs <- etc@max.diffs # probably shouldn't generate S4, but easier... 
diff <- diff_myers(x, y, max.diffs, warn=FALSE) hunks <- as.hunks(diff, etc=etc) hit.diffs.max <- FALSE if(diff@diffs < 0L) { hit.diffs.max <- TRUE diff@diffs <- -diff@diffs diff.msg <- c( line="overall", hunk="in-hunk word", wrap="atomic wrap-word" ) if(warn) warning( "Exceeded diff limit during diff computation (", diff@diffs, " vs. ", max.diffs, " allowed); ", diff.msg[diff.mode], " diff is likely not optimal", call.=FALSE ) } # used to be a `DiffDiffs` object, but too slow list(hunks=hunks, hit.diffs.max=hit.diffs.max) } # Compute the character representation of a hunk header make_hh <- function(h.g, mode, tar.dat, cur.dat, ranges.orig) { h.ids <- vapply(h.g, "[[", integer(1L), "id") h.head <- vapply(h.g, "[[", logical(1L), "guide") # exclude header hunks from contributing to range, and adjust ranges for # possible fill lines added to the data h.ids.nh <- h.ids[!h.head] tar.rng <- find_rng(h.ids.nh, ranges.orig[1:2, , drop=FALSE], tar.dat$fill) tar.rng.f <- cumsum(!tar.dat$fill)[tar.rng] cur.rng <- find_rng(h.ids.nh, ranges.orig[3:4, , drop=FALSE], cur.dat$fill) cur.rng.f <- cumsum(!cur.dat$fill)[cur.rng] hh.a <- paste0(rng_as_chr(tar.rng.f)) hh.b <- paste0(rng_as_chr(cur.rng.f)) if(mode == "sidebyside") sprintf("@@ %s @@", c(hh.a, hh.b)) else { sprintf("@@ %s / %s @@", hh.a, hh.b) } } # Do not allow `useBytes=TRUE` if there are any matches with `useBytes=FALSE` # # Clean up word.ind to avoid issues where we have mixed UTF-8 and non # UTF-8 strings in different hunks, and gregexpr is trying to optimize # buy using useBytes=TRUE in ASCII only strings without knowing that in a # different hunk there are UTF-8 strings fix_word_ind <- function(x) { matches <- vapply(x, function(y) length(y) > 1L || y != -1L, logical(1L)) useBytes <- vapply(x, function(y) isTRUE(attr(y, "useBytes")), logical(1L)) if(!all(useBytes[matches])) x <- lapply(x, `attr<-`, "useBytes", NULL) x } # Variation on `char_diff` used for the overall diff where we don't need # to worry about 
overhead from creating the `Diff` object line_diff <- function( target, current, tar.capt, cur.capt, context, etc, warn=TRUE, strip=TRUE ) { if(!is.valid.guide.fun(etc@guides)) # nocov start stop( "Logic Error: guides are not a valid guide function; contact maintainer" ) # nocov end etc@guide.lines <- make_guides(target, tar.capt, current, cur.capt, etc@guides) # Need to remove new lines as the processed captures do that anyway and we # end up with mismatched lengths if we don't if(any(nzchar(tar.capt))) tar.capt <- split_new_line(tar.capt, sgr.supported=etc@sgr.supported) if(any(nzchar(cur.capt))) cur.capt <- split_new_line(cur.capt, sgr.supported=etc@sgr.supported) # Some debate as to whether we want to do this first, or last. First has # many benefits so that everything is consistent, width calcs can work fine, # etc., but only issue is that user provided trim functions might not expect # the transformation of the data; this needs to be documented with the trim # docs. tar.capt.p <- tar.capt cur.capt.p <- cur.capt if(etc@convert.hz.white.space) { tar.capt.p <- strip_hz_control( tar.capt, stops=etc@tab.stops, sgr.supported=etc@sgr.supported ) cur.capt.p <- strip_hz_control( cur.capt, stops=etc@tab.stops, sgr.supported=etc@sgr.supported ) } # Remove whitespace and CSI SGR if warranted if(etc@strip.sgr) { if(has.style.1 <- any(crayon::has_style(tar.capt.p))) tar.capt.p <- crayon::strip_style(tar.capt.p) if(has.style.2 <- any(crayon::has_style(cur.capt.p))) cur.capt.p <- crayon::strip_style(cur.capt.p) if(has.style.1 || has.style.2) etc@warn( "`target` or `current` contained ANSI CSI SGR when rendered; these ", "were stripped. Use `strip.sgr=FALSE` to preserve them in the diffs." 
) } # Apply trimming to remove row heads, etc, but only if something gets trimmed # from both elements tar.trim.ind <- apply_trim(target, tar.capt.p, etc@trim) tar.trim <- do.call( substr, list(tar.capt.p, tar.trim.ind[, 1L], tar.trim.ind[, 2L]) ) cur.trim.ind <- apply_trim(current, cur.capt.p, etc@trim) cur.trim <- do.call( substr, list(cur.capt.p, cur.trim.ind[, 1L], cur.trim.ind[, 2L]) ) if(identical(tar.trim, tar.capt.p) || identical(cur.trim, cur.capt.p)) { # didn't trim in both, so go back to original tar.trim <- tar.capt.p tar.trim.ind <- cbind( rep(1L, length(tar.capt.p)), nchar(tar.capt.p) ) cur.trim <- cur.capt.p cur.trim.ind <- cbind( rep(1L, length(cur.capt.p)), nchar(cur.capt.p) ) } tar.comp <- tar.trim cur.comp <- cur.trim if(etc@ignore.white.space) { tar.comp <- normalize_whitespace(tar.comp) cur.comp <- normalize_whitespace(cur.comp) } # Word diff is done in three steps: create an empty template vector structured # as the result of a call to `gregexpr` without matches, if dealing with # compliant atomic vectors in print mode, then update with the word diff # matches, finally, update with in-hunk word diffs for hunks that don't have # any existing word diffs: # Set up data lists with all relevant info; need to pass to diff_word so it # can be modified. # - orig: the very original string # - raw: the original captured text line by line, with strip_hz applied # - trim: as above, but with row meta data removed # - trim.ind: the indices used to re-insert `trim` into `raw` # - comp: the strings that will have the line diffs run on, these can be # modified to force a particular outcome, e.g. 
by word_to_line_map # - eq: the portion of `trim` that is equal post word-diff # - fin: the final character string for display to user # - word.ind: for use by `regmatches<-` to re-insert colored words # - tok.rat: for use by `align_eq` when lining up lines within hunks tar.dat <- list( orig=tar.capt, raw=tar.capt.p, trim=tar.trim, trim.ind.start=tar.trim.ind[, 1L], trim.ind.end=tar.trim.ind[, 2L], comp=tar.comp, eq=tar.comp, fin=tar.capt.p, fill=logical(length(tar.capt.p)), word.ind=replicate(length(tar.capt.p), .word.diff.atom, simplify=FALSE), tok.rat=rep(1, length(tar.capt.p)) ) cur.dat <- list( orig=cur.capt, raw=cur.capt.p, trim=cur.trim, trim.ind.start=cur.trim.ind[, 1L], trim.ind.end=cur.trim.ind[, 2L], comp=cur.comp, eq=cur.comp, fin=cur.capt.p, fill=logical(length(cur.capt.p)), word.ind=replicate(length(cur.capt.p), .word.diff.atom, simplify=FALSE), tok.rat=rep(1, length(cur.capt.p)) ) # Word diffs in wrapped form is atomic; note this will potentially change # the length of the vectors. tar.wrap.diff <- integer(0L) cur.wrap.diff <- integer(0L) tar.dat.w <- tar.dat cur.dat.w <- cur.dat if( is.atomic(target) && is.atomic(current) && is.null(dim(target)) && is.null(dim(current)) && length(tar.rh <- which_atomic_cont(tar.capt.p, target)) && length(cur.rh <- which_atomic_cont(cur.capt.p, current)) && is.null(names(target)) && is.null(names(current)) && etc@unwrap.atomic && etc@word.diff ) { # For historical compatibility we allow `diffChr` to get into this step if # the text format is right, even though it is arguable whether it should be # allowed or not. if(!all(diff(tar.rh) == 1L) || !all(diff(cur.rh)) == 1L){ # nocov start stop("Logic Error, row headers must be sequential; contact maintainer.") # nocov end } # Only do this for the portion of the data that actually matches up with # the atomic row headers. 
diff.word <- diff_word2( tar.dat, cur.dat, tar.ind=tar.rh, cur.ind=cur.rh, diff.mode="wrap", warn=warn, etc=etc ) warn <- !diff.word$hit.diffs.max tar.dat.w <- diff.word$tar.dat cur.dat.w <- diff.word$cur.dat # Mark the lines that were wrapped diffed; necessary b/c tar/cur.rh are # defined even if other conditions to get in this loop are not, and also # because the addition of the fill lines moves everything around # (effectively tar/cur.wrap.diff are the fill-offset versions of tar/cur.rh) tar.wrap.diff <- seq_along(tar.dat.w$fill)[!tar.dat.w$fill][tar.rh] cur.wrap.diff <- seq_along(cur.dat.w$fill)[!cur.dat.w$fill][cur.rh] } # Actual line diff diffs <- char_diff( tar.dat.w$comp, cur.dat.w$comp, etc=etc, diff.mode="line", warn=warn ) warn <- !diffs$hit.diffs.max hunks.flat <- diffs$hunks # For each of those hunks, run the word diffs and store the results in the # word.diffs list; bad part here is that we keep overwriting the overall # diff data for each hunk, which might be slow tar.dat.ww <- tar.dat.w cur.dat.ww <- cur.dat.w if(etc@word.diff) { # Word diffs on hunks, excluding all values that have already been wrap # diffed as in tar.rh and cur.rh / tar.wrap.diff and cur.wrap.diff for(h.a in hunks.flat) { if(h.a$context) next h.a.ind <- c(h.a$A, h.a$B) h.a.tar.ind <- setdiff(h.a.ind[h.a.ind > 0], tar.wrap.diff) h.a.cur.ind <- setdiff(abs(h.a.ind[h.a.ind < 0]), cur.wrap.diff) h.a.w.d <- diff_word2( tar.dat.ww, cur.dat.ww, h.a.tar.ind, h.a.cur.ind, diff.mode="hunk", warn=warn, etc=etc ) tar.dat.ww <- h.a.w.d[['tar.dat']] cur.dat.ww <- h.a.w.d[['cur.dat']] warn <- warn || !h.a.w.d[['hit.diffs.max']] } # Compute the token ratios tok_ratio_compute <- function(z) vapply( z, function(y) if(is.null(wc <- attr(y, "word.count"))) 1 else max(0, (wc - length(y)) / wc), numeric(1L) ) tar.dat.ww$tok.rat <- tok_ratio_compute(tar.dat.ww$word.ind) cur.dat.ww$tok.rat <- tok_ratio_compute(cur.dat.ww$word.ind) # Deal with mixed UTF/plain strings tar.dat.ww$word.ind <- 
fix_word_ind(tar.dat.ww$word.ind) cur.dat.ww$word.ind <- fix_word_ind(cur.dat.ww$word.ind) # Remove different words to make equal strings tar.dat.ww$eq <- with(tar.dat.ww, `regmatches<-`(trim, word.ind, value="")) cur.dat.ww$eq <- with(cur.dat.ww, `regmatches<-`(trim, word.ind, value="")) } # Instantiate result hunk.grps.raw <- group_hunks( hunks.flat, etc=etc, tar.capt=tar.dat.ww$raw, cur.capt=cur.dat.ww$raw ) gutter.dat <- etc@gutter max.w <- etc@text.width # Recompute line limit accounting for banner len, needed for correct trim etc.group <- etc if(etc.group@line.limit[[1L]] >= 0L) { etc.group@line.limit <- pmax(integer(2L), etc@line.limit - banner_len(etc@mode)) } # Trim hunks to the extent needed to make sure we fit in lines hunk.grps <- trim_hunks(hunk.grps.raw, etc.group, tar.dat.ww$raw, cur.dat.ww$raw) hunks.flat <- unlist(hunk.grps, recursive=FALSE) # Compact to width of widest element, so retrieve all char values; also # need to generate all the hunk headers b/c we need to use them in width # computation as well; under no circumstances are hunk headers allowed to # wrap as they are always assumed to take one line. 
# # Note: this used to be done after trimming / subbing, which is technically # better since we might have trimmed away long rows, but we need to do it # here so that we can can record the new text width in the outgoing object; # also, logic a bit circuitous b/c this was originally done elsewhere; might # be faster to use tar.dat and cur.dat directly chr.ind <- unlist(lapply(hunks.flat, "[", c("A", "B"))) chr.dat <- get_dat_raw(chr.ind, tar.dat.ww$raw, cur.dat.ww$raw) chr.size <- integer(length(chr.dat)) ranges <- vapply( hunks.flat, function(h.a) c(h.a$tar.rng.trim, h.a$cur.rng.trim), integer(4L) ) # compute ranges excluding fill lines rng_non_fill <- function(rng, fill) { if(!rng[[1L]]) rng else { rng.seq <- seq(rng[[1L]], rng[[2L]], by=1L) seq.not.fill <- rng.seq[!rng.seq %in% fill] if(!length(seq.not.fill)) { integer(2L) } else { range(seq.not.fill) } } } ranges.orig <- vapply( hunks.flat, function(h.a) { with( h.a, c( rng_non_fill(tar.rng.sub, which(tar.dat.ww$fill)), rng_non_fill(cur.rng.sub, which(cur.dat.ww$fill)) ) ) }, integer(4L) ) # We need a version of ranges that adjust for the fill lines that are counted # in the ranges but don't represent actual lines of output. 
This does mean # that adjusted ranges are not necessarily contiguous hunk.heads <- lapply(hunk.grps, make_hh, etc@mode, tar.dat.ww, cur.dat.ww, ranges.orig) h.h.chars <- nchar2( chr_trim( unlist(hunk.heads), etc@line.width, sgr.supported=etc@sgr.supported ), sgr.supported=etc@sgr.supported ) chr.size <- nchar2(chr.dat, sgr.supported=etc@sgr.supported) max.col.w <- max( max(0L, chr.size, .min.width + gutter.dat@width), h.h.chars ) max.w <- if(max.col.w < max.w) max.col.w else max.w # future calculations should assume narrower display etc@text.width <- max.w etc@line.width <- max.w + gutter.dat@width new( "Diff", diffs=hunk.grps, target=target, current=current, hit.diffs.max=!warn, tar.dat=tar.dat.ww, cur.dat=cur.dat.ww, etc=etc, hunk.heads=hunk.heads, trim.dat=attr(hunk.grps, 'meta') ) } diffobj/R/html.R0000755000176200001440000001037313777704532013173 0ustar liggesusers# Copyright (C) 2021 Brodie Gaslam # # This file is part of "diffobj - Diffs for R Objects" # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # Go to for a copy of the license. #' @include misc.R NULL #' Make Functions That Wrap Text in HTML Tags #' #' Helper functions to generate functions to use as slots for the #' \code{StyleHtml@funs} classes. These are functions that return #' \emph{functions}. #' #' \code{tag_f} and related functions (\code{div_f}, \code{span_f}) produce #' functions that are vectorized and will apply opening and closing tags to #' each element of a character vector. 
#' \code{cont_f} on the other hand
#' produces a function that will collapse a character vector into length 1,
#' and only then applies the tags.  Additionally, \code{cont_f} already comes
#' with the \dQuote{diffobj-container} class specified.
#'
#' @note inputs are assumed to be valid class names or CSS styles.
#'
#' @export
#' @param tag character(1L) a name of an HTML tag
#' @param class character the CSS class(es)
#' @param style named character inline styles, where the name is the CSS
#'   property and the value is the property value.
#' @return a function that accepts a character parameter.  If applied, each
#'   element in the character vector will be wrapped in the specified tags
#' @aliases div_f, span_f, cont_f
#' @examples
#' ## Assuming class 'ex1' has CSS styles defined elsewhere
#' tag_f("div", "ex1")(LETTERS[1:5])
#' ## Use convenience function, and add some inline styles
#' div_f("ex2", c(color="green", `font-family`="arial"))(LETTERS[1:5])
#' ## Notice how this is a div with pre-specified class,
#' ## and only one div is created around the entire data
#' cont_f()(LETTERS[1:5])

tag_f <- function(tag, class=character(), style=character()) {
  stopifnot(is.chr.1L(tag), is.character(class), is.character(style))
  function(x) {
    if(!is.character(x)) stop("Argument `x` must be character.")
    if(!length(x)) character(0L) else
      paste0(
        "<", tag,
        if(length(class)) paste0(" class='", paste0(class, collapse=" "), "'"),
        if(length(style))
          paste0(
            " style='",
            paste(names(style), style, sep=": ", collapse="; "), ";'"
          ),
        ">", x, "</", tag, ">"
      )
  }
}
#' @export
#' @rdname tag_f

div_f <- function(class=character(), style=character())
  tag_f("div", class, style)

#' @export
#' @rdname tag_f

span_f <- function(class=character(), style=character())
  tag_f("span", class, style)

#' @export
#' @rdname tag_f

cont_f <- function(class=character()) {
  stopifnot(is.character(class))
  function(x) {
    if(!is.character(x)) stop("Argument `x` must be character.")
    sprintf(
      paste0(
        "<div class='diffobj-container%s'><pre>",
        "%s</pre></div>"
      ),
      if(length(class)) paste0(" ", class, collapse="") else "",
      paste0(x, collapse="")
    )
  }
}
#' Count Text Characters in HTML
#'
#' Very simple implementation that will fail if there are any \dQuote{>} in the
#' HTML that are not closing tags, and assumes that HTML entities are all one
#' character wide.  Also, spaces are counted as one width each because the
#' HTML output is intended to be displayed inside \code{<pre>} tags.
#'
#' @export
#' @param x character
#' @param ... unused, present for compatibility with internal use
#' @return integer(length(x)) with number of characters of each element
#' @examples
#' nchar_html("hello")

nchar_html <- function(x, ...) {
  stopifnot(is.character(x) && !anyNA(x))
  tag.less <- gsub("<[^>]*>", "", x) 
  # Thanks ridgerunner for html entity removal regex
  # http://stackoverflow.com/users/433790/ridgerunner
  # http://stackoverflow.com/a/8806462/2725969
  ent.less <-
    gsub("&(?:[a-z\\d]+|#\\d+|#x[a-f\\d]+);", "X", tag.less, perl=TRUE)
  nchar(ent.less)
}
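
# The counting rule described above can be sketched in isolation.  The helper
# name `count_html_chars` is hypothetical (it is not part of the package); it
# only mirrors the two substitutions used by `nchar_html`: tags are dropped,
# and each HTML entity is treated as one character wide.

```r
# Hypothetical stand-alone copy of the counting logic, for illustration only
count_html_chars <- function(x) {
  stopifnot(is.character(x), !anyNA(x))
  # Drop anything that looks like a tag
  tag.less <- gsub("<[^>]*>", "", x)
  # Replace each entity (named, decimal, or hex) with a single placeholder
  ent.less <-
    gsub("&(?:[a-z\\d]+|#\\d+|#x[a-f\\d]+);", "X", tag.less, perl=TRUE)
  nchar(ent.less)
}
html.counts <- count_html_chars(c("<b>hi</b>", "a &amp; b", "<i>x &lt; y</i>"))
```

# As the documentation warns, a literal ">" outside of a tag would likely
# confuse the tag-stripping step, so this is only suitable for simple markup.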
# diffobj/R/capt.R
# Copyright (C) 2021 Brodie Gaslam
#
# This file is part of "diffobj - Diffs for R Objects"
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# Go to  for a copy of the license.

# Capture output of print/show/str; unfortunately doesn't have superb handling
# of errors during print/show call, though hopefully these are rare
#
# x is a quoted call to evaluate

capture <- function(x, etc, err) {
  capt.width <- etc@text.width
  if(capt.width) {
    opt.set <- try(width.old <- options(width=capt.width), silent=TRUE)
    if(inherits(opt.set, "try-error")) {
      warning(
        "Unable to set desired width ", capt.width, ", (",
        conditionMessage(attr(opt.set, "condition")), "); ",
        "proceeding with existing setting."
      )
    } else on.exit(options(width.old))
  }
  # Note, we use `tempfile` for capture as that appears much faster than normal
  # capture without a file

  capt.file <- tempfile()
  on.exit(unlink(capt.file), add=TRUE)
  res <- try({
    capture.output(eval(x, etc@frame), file=capt.file)
    obj.out <- readLines(capt.file)
  })
  if(inherits(res, "try-error"))
    err(
      "Failed attempting to get text representation of object: ",
      conditionMessage(attr(res, "condition"))
    )
  html_ent_sub(res, etc@style)
}
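
# A minimal sketch of the capture pattern used above: temporarily set the
# console width, send printed output to a temporary file, and read it back
# line by line.  `capture_at_width` is a hypothetical helper written for this
# illustration; it leaves out the error handling and HTML entity substitution
# that `capture` performs.

```r
# Hypothetical helper, for illustration only
capture_at_width <- function(expr.quoted, width, envir=parent.frame()) {
  old.opts <- options(width=width)
  on.exit(options(old.opts))           # restore the previous width
  capt.file <- tempfile()
  on.exit(unlink(capt.file), add=TRUE) # clean up the temporary file
  utils::capture.output(eval(expr.quoted, envir), file=capt.file)
  readLines(capt.file)
}
narrow.out <- capture_at_width(quote(print(1:30)), 40L)
```

# Every element of `narrow.out` should fit within the requested width, which
# is what makes later side by side layout calculations possible.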
# capture normal prints, along with default prints to make sure that if we
# do try to wrap an atomic vector print it is very likely to be in a format
# we are familiar with and not affected by a non-default print method

capt_print <- function(target, current, etc, err, extra){
  dots <- extra
  # What about S4?
  if(getRversion() >= "3.2.0") {
    print.match <- try(
      match.call(
        get("print", envir=etc@frame, mode='function'),
        as.call(c(list(quote(print), x=NULL), dots)),
        envir=etc@frame
    ) )
  } else {
    # this may be sub-optimal, but match.call does not support the envir arg
    # prior to this
    # nocov start
    print.match <- try(
      match.call(
        get("print", envir=etc@frame),
        as.call(c(list(quote(print), x=NULL), dots))
    ) )
    # nocov end
  }
  if(inherits(print.match, "try-error"))
    err("Unable to compose `print` call")

  names(print.match)[[2L]] <- ""
  tar.call <- cur.call <- print.match

  if(length(dots)) {
    if(!is.null(etc@tar.exp)) tar.call[[2L]] <- etc@tar.exp
    if(!is.null(etc@cur.exp)) cur.call[[2L]] <- etc@cur.exp
    etc@tar.banner <- deparse(tar.call)[[1L]]
    etc@cur.banner <- deparse(cur.call)[[1L]]
  }
  tar.call.q <- if(is.call(target) || is.symbol(target))
    call("quote", target) else target
  cur.call.q <- if(is.call(current) || is.symbol(current))
    call("quote", current) else current

  if(!is.null(target)) tar.call[[2L]] <- tar.call.q
  if(!is.null(current)) cur.call[[2L]] <- cur.call.q

  # If dimensioned object, and in auto-mode, switch to side by side if stuff is
  # narrow enough to fit

  if((!is.null(dim(target)) || !is.null(dim(current)))) {
    cur.capt <- capture(cur.call, etc, err)
    tar.capt <- capture(tar.call, etc, err)
    etc <- set_mode(etc, tar.capt, cur.capt)
  } else {
    etc <- if(etc@mode == "auto") sideBySide(etc) else etc
    cur.capt <- capture(cur.call, etc, err)
    tar.capt <- capture(tar.call, etc, err)
  }
  if(isTRUE(etc@guides)) etc@guides <- guidesPrint
  if(isTRUE(etc@trim)) etc@trim <- trimPrint

  diff.out <- line_diff(target, current, tar.capt, cur.capt, etc=etc, warn=TRUE)
  diff.out@capt.mode <- "print"
  diff.out
}
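
# The `call("quote", ...)` wrapping in `capt_print` matters when the object
# being diffed is itself a call or a symbol: inserted unquoted into the
# `print` call it would be evaluated rather than printed.  A small sketch of
# that idea follows; the variable names are chosen for this example only.

```r
obj <- quote(a + b)                # language object; `a` and `b` need not exist
print.call <- quote(print(NULL))   # template call with a placeholder argument
# Quote language objects so that evaluating the call prints them as-is
print.call[[2L]] <-
  if(is.call(obj) || is.symbol(obj)) call("quote", obj) else obj
quoted.out <- utils::capture.output(eval(print.call))
```

# Without the wrapping, evaluating the call would try to compute `a + b` and
# fail if those symbols are not defined.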
# Tries various different `str` settings to get the best possible output

capt_str <- function(target, current, etc, err, extra){
  # Match original call and manage dots, in particular with respect to the
  # `max.level` arg
  dots <- extra
  frame <- etc@frame
  line.limit <- etc@line.limit
  if("object" %in% names(dots))
    err("You may not specify `object` as part of `extra`")

  if(getRversion() < "3.2.0") {
    # nocov start
    str.match <- match.call(
      str_tpl,
      call=as.call(c(list(quote(str), object=NULL), dots))
    )
    # nocov end
  } else {
    str.match <- match.call(
      str_tpl,
      call=as.call(c(list(quote(str), object=NULL), dots)), envir=etc@frame
    )
  }
  names(str.match)[[2L]] <- ""

  # Handle auto mode (side by side always for `str`)

  if(etc@mode == "auto") etc <- sideBySide(etc)

  # Utility function; defining in body so it has access to `err`

  eval_try <- function(match.list, index, envir)
    tryCatch(
      eval(match.list[[index]], envir=envir),
      error=function(e)
        err("Error evaluating `", index, "` arg: ", conditionMessage(e))
    )
  # Setup / process extra args

  auto.mode <- FALSE
  max.level.supplied <- FALSE
  if(
    max.level.pos <- match("max.level", names(str.match), nomatch=0L)
  ) {
    # max.level specified in call; check for special 'auto' case
    max.level.eval <- eval_try(str.match, "max.level", etc@frame)
    if(identical(max.level.eval, "auto")) {
      auto.mode <- TRUE
      str.match[["max.level"]] <- NA
    } else {
      max.level.supplied <- TRUE
    }
  } else {
    str.match[["max.level"]] <- NA
    auto.mode <- TRUE
    max.level.pos <- length(str.match)
    max.level.supplied <- FALSE
  }
  # Was wrap specified in strict width mode?  Not sure this is correct any more;
  # should probably be looking at extra args.

  wrap <- FALSE
  if("strict.width" %in% names(str.match)) {
    res <- eval_try(str.match, "strict.width", etc@frame)
    wrap <- is.character(res) && length(res) == 1L && !is.na(res) &&
      nzchar(res) && identical(res, substr("wrap", 1L, nchar(res)))
  }
  if(auto.mode) {
    msg <-
      "Specifying `%s` may cause `str` output level folding to be incorrect"
    if("comp.str" %in% names(str.match)) warning(sprintf(msg, "comp.str"))
    if("indent.str" %in% names(str.match)) warning(sprintf(msg, "indent.str"))
  }
  # don't want to evaluate target and current more than once, so can't eval
  # tar.exp/cur.exp, so instead run call with actual object

  tar.call <- cur.call <- str.match

  tar.call.q <- if(is.call(target) || is.symbol(target))
    call("quote", target) else target
  cur.call.q <- if(is.call(current) || is.symbol(current))
    call("quote", current) else current

  if(!is.null(target)) tar.call[[2L]] <- tar.call.q
  if(!is.null(current)) cur.call[[2L]] <- cur.call.q

  # Run str

  capt.width <- etc@text.width
  has.diff <- has.diff.prev <- FALSE

  # we used to strip_hz_control here, but shouldn't have to since handled by
  # line_diff

  tar.capt <- capture(tar.call, etc, err)
  tar.lvls <- str_levels(tar.capt, wrap=wrap)
  cur.capt <- capture(cur.call, etc, err)
  cur.lvls <- str_levels(cur.capt, wrap=wrap)

  prev.lvl.hi <- lvl <- max.depth <- max(tar.lvls, cur.lvls)
  prev.lvl.lo <- 0L
  first.loop <- TRUE
  safety <- 0L
  warn <- TRUE

  if(isTRUE(etc@guides)) etc@guides <- guidesStr
  if(isTRUE(etc@trim)) etc@trim <- trimStr

  tar.str <- tar.capt
  cur.str <- cur.capt

  diff.obj <- diff.obj.full <- line_diff(
    target, current, tar.str, cur.str, etc=etc, warn=warn
  )
  if(!max.level.supplied) {
    repeat{
      if((safety <- safety + 1L) > max.depth && !first.loop)
        # nocov start
        stop(
          "Logic Error: exceeded list depth when comparing structures; contact ",
          "maintainer."
        )
        # nocov end
      if(!first.loop) {
        tar.str <- tar.capt[tar.lvls <= lvl]
        cur.str <- cur.capt[cur.lvls <= lvl]

        diff.obj <- line_diff(
          target, current, tar.str, cur.str, etc=etc, warn=warn
        )
      }
      if(diff.obj@hit.diffs.max) warn <- FALSE
      has.diff <- suppressWarnings(any(diff.obj))

      # If there are no differences reducing levels isn't going to help to
      # find one; additionally, if not in auto.mode we should not be going
      # through this process

      if(first.loop && !has.diff) break
      first.loop <- FALSE

      if(line.limit[[1L]] < 1L) break

      line.len <- diff_line_len(
        diff.obj@diffs, etc=etc, tar.capt=tar.str, cur.capt=cur.str
      )
      # We need a higher level if we don't have diffs

      if(!has.diff && prev.lvl.hi - lvl > 1L) {
        prev.lvl.lo <- lvl
        lvl <- lvl + as.integer((prev.lvl.hi - lvl) / 2)
        tar.call[[max.level.pos]] <- lvl
        cur.call[[max.level.pos]] <- lvl
        next
      } else if(!has.diff) {
        diff.obj <- diff.obj.full
        lvl <- NULL
        break
      }
      # If we have diffs, need to check whether we should try to reduce lines
      # to get under line limit

      if(line.len <= line.limit[[1L]]) {
        # We fit, nothing else to do
        break
      }
      if(lvl - prev.lvl.lo > 1L) {
        prev.lvl.hi <- lvl
        lvl <- lvl - as.integer((lvl - prev.lvl.lo) / 2)
        tar.call[[max.level.pos]] <- lvl
        cur.call[[max.level.pos]] <- lvl
        next
      }
      # Couldn't get under limit, so use first run results

      diff.obj <- diff.obj.full
      lvl <- NULL
      break
    }
  } else {
    tar.str <- tar.capt[tar.lvls <= max.level.eval]
    cur.str <- cur.capt[cur.lvls <= max.level.eval]

    lvl <- max.level.eval
    diff.obj <- line_diff(target, current, tar.str, cur.str, etc=etc, warn=warn)
  }
  if(auto.mode && !is.null(lvl) && lvl < max.depth) {
    str.match[[max.level.pos]] <- lvl
  } else if (!max.level.supplied || is.null(lvl)) {
    str.match[[max.level.pos]] <- NULL
  }
  tar.call <- cur.call <- str.match
  if(!is.null(etc@tar.exp)) tar.call[[2L]] <- etc@tar.exp
  if(!is.null(etc@cur.exp)) cur.call[[2L]] <- etc@cur.exp
  if(is.null(etc@tar.banner))
    diff.obj@etc@tar.banner <- deparse(tar.call)[[1L]]
  if(is.null(etc@cur.banner))
    diff.obj@etc@cur.banner <- deparse(cur.call)[[1L]]

  # Track total differences in fully expanded view so we can report hidden
  # diffs when folding levels

  diff.obj@diff.count.full <- count_diffs(diff.obj.full@diffs)
  diff.obj@capt.mode <- "str"
  diff.obj
}
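
# The level search in `capt_str` is essentially a bisection over `max.level`:
# go lower while the output exceeds the line limit, go higher again if the
# differences disappear.  The sketch below is a simplified stand-in and not the
# package's implementation; `find_level`, `has_diff` and `line_count` are
# hypothetical names, with the last two standing in for re-running the diff at
# a given level.  A `NULL` result means "fall back to the fully expanded diff".

```r
# Hypothetical simplification of the bisection, for illustration only
find_level <- function(max.depth, has_diff, line_count, line.limit) {
  lvl <- hi <- max.depth
  lo <- 0L
  repeat {
    if(!has_diff(lvl)) {
      # Folded too far: move back up if there is room, otherwise give up
      if(hi - lvl > 1L) {
        lo <- lvl
        lvl <- lvl + as.integer((hi - lvl) / 2)
        next
      }
      return(NULL)
    }
    if(line_count(lvl) <= line.limit) return(lvl)  # fits, done
    if(lvl - lo > 1L) {
      # Too many lines: try a lower level
      hi <- lvl
      lvl <- lvl - as.integer((lvl - lo) / 2)
      next
    }
    return(NULL)                                   # cannot get under limit
  }
}
# Toy inputs: differences only visible from level 3 up, 10 lines per level
lvl.fit <- find_level(8L, function(l) l >= 3L, function(l) 10L * l, 45L)
lvl.none <- find_level(8L, function(l) l >= 3L, function(l) 10L * l, 25L)
```

# With these toy inputs the first search settles on a level that both shows
# differences and fits, while the second cannot satisfy the limit and signals
# a fall back.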
capt_chr <- function(target, current, etc, err, extra){
  tar.capt <- if(!is.character(target))
    do.call(as.character, c(list(target), extra), quote=TRUE) else target
  cur.capt <- if(!is.character(current))
    do.call(as.character, c(list(current), extra), quote=TRUE) else current

  # technically possible to have a character method that doesn't return a
  # character object...

  if((tt <- typeof(tar.capt)) != 'character')
    stop("Coercion of `target` did not produce character object (", tt, ").")
  if((tc <- typeof(cur.capt)) != 'character')
    stop("Coercion of `current` did not produce character object (", tc, ").")

  # drop attributes

  tar.capt <- c(tar.capt)
  cur.capt <- c(cur.capt)

  if(anyNA(tar.capt)) tar.capt[is.na(tar.capt)] <- "NA"
  if(anyNA(cur.capt)) cur.capt[is.na(cur.capt)] <- "NA"

  etc <- set_mode(etc, tar.capt, cur.capt)
  if(isTRUE(etc@guides)) etc@guides <- guidesChr
  if(isTRUE(etc@trim)) etc@trim <- trimChr

  diff.out <- line_diff(
    target, current, html_ent_sub(tar.capt, etc@style),
    html_ent_sub(cur.capt, etc@style), etc=etc
  )
  diff.out@capt.mode <- "chr"
  diff.out
}
capt_deparse <- function(target, current, etc, err, extra){
  dep.try <- try({
    tar.capt <- do.call(deparse, c(list(target), extra), quote=TRUE)
    cur.capt <- do.call(deparse, c(list(current), extra), quote=TRUE)
  })
  if(inherits(dep.try, "try-error"))
    err("Error attempting to deparse object(s)")

  etc <- set_mode(etc, tar.capt, cur.capt)
  if(isTRUE(etc@guides)) etc@guides <- guidesDeparse
  if(isTRUE(etc@trim)) etc@trim <- trimDeparse

  diff.out <- line_diff(
    target, current, html_ent_sub(tar.capt, etc@style),
    html_ent_sub(cur.capt, etc@style), etc=etc
  )
  diff.out@capt.mode <- "deparse"
  diff.out
}
capt_file <- function(target, current, etc, err, extra) {
  tar.capt <- try(do.call(readLines, c(list(target), extra), quote=TRUE))
  if(inherits(tar.capt, "try-error")) err("Unable to read `target` file.")
  cur.capt <- try(do.call(readLines, c(list(current), extra), quote=TRUE))
  if(inherits(cur.capt, "try-error")) err("Unable to read `current` file.")

  etc <- set_mode(etc, tar.capt, cur.capt)
  if(isTRUE(etc@guides)) etc@guides <- guidesFile
  if(isTRUE(etc@trim)) etc@trim <- trimFile

  diff.out <- line_diff(
    tar.capt, cur.capt, html_ent_sub(tar.capt, etc@style),
    html_ent_sub(cur.capt, etc@style), etc=etc
  )
  diff.out@capt.mode <- "file"
  diff.out
}
capt_csv <- function(target, current, etc, err, extra){
  tar.df <- try(do.call(read.csv, c(list(target), extra), quote=TRUE))
  if(inherits(tar.df, "try-error")) err("Unable to read `target` file.")
  if(!is.data.frame(tar.df))
    err("`target` file did not produce a data frame when read")   # nocov
  cur.df <- try(do.call(read.csv, c(list(current), extra), quote=TRUE))
  if(inherits(cur.df, "try-error")) err("Unable to read `current` file.")
  if(!is.data.frame(cur.df))
    err("`current` file did not produce a data frame when read")  # nocov

  capt_print(tar.df, cur.df, etc, err, extra)
}
# Sets mode to "unified" if anything is too wide to fit side by side without
# wrapping; otherwise sets it to "sidebyside"

set_mode <- function(etc, tar.capt, cur.capt) {
  stopifnot(is(etc, "Settings"), is.character(tar.capt), is.character(cur.capt))
  if(etc@mode == "auto") {
    if(
      any(
        nchar2(cur.capt, sgr.supported=etc@sgr.supported) > etc@text.width.half
      ) ||
      any(
        nchar2(tar.capt, sgr.supported=etc@sgr.supported) > etc@text.width.half
      )
    ) {
      etc@mode <- "unified"
  } }
  if(etc@mode == "auto") etc <- sideBySide(etc)
  etc
}
# diffobj/R/diff.R
# Copyright (C) 2021 Brodie Gaslam
#
# This file is part of "diffobj - Diffs for R Objects"
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# Go to  for a copy of the license.

#' Diffs for R Objects
#'
#' Generate a colorized diff of two R objects for an intuitive visualization of
#' their differences.  See `vignette(package="diffobj", "diffobj")` for details.
#'
#' @import crayon
#' @import methods
#' @importFrom utils capture.output file_test packageVersion read.csv
#' @importFrom stats ave frequency is.ts setNames
#' @importFrom grDevices rgb
#' @name diffobj-package
#' @docType package

NULL

# Because all these functions are so similar, we have constructed them with a
# function factory.  This allows us to easily maintain consistent formals during
# initial development process when they have not been set in stone yet.

make_diff_fun <- function(capt_fun) {
  # nocov start
  function(
    target, current,
    mode=gdo("mode"),
    context=gdo("context"),
    format=gdo("format"),
    brightness=gdo("brightness"),
    color.mode=gdo("color.mode"),
    word.diff=gdo("word.diff"),
    pager=gdo("pager"),
    guides=gdo("guides"),
    trim=gdo("trim"),
    rds=gdo("rds"),
    unwrap.atomic=gdo("unwrap.atomic"),
    max.diffs=gdo("max.diffs"),
    disp.width=gdo("disp.width"),
    ignore.white.space=gdo("ignore.white.space"),
    convert.hz.white.space=gdo("convert.hz.white.space"),
    tab.stops=gdo("tab.stops"),
    line.limit=gdo("line.limit"),
    hunk.limit=gdo("hunk.limit"),
    align=gdo("align"),
    style=gdo("style"),
    palette.of.styles=gdo("palette"),
    frame=par_frame(),
    interactive=gdo("interactive"),
    term.colors=gdo("term.colors"),
    tar.banner=NULL,
    cur.banner=NULL,
    strip.sgr=gdo("strip.sgr"),
    sgr.supported=gdo("sgr.supported"),
    extra=list()
  ) {
  # nocov end
    frame    # force frame so that `par_frame` called in this context
    call.dat <- extract_call(sys.calls(), frame)
    target   # force target/current so if one missing we get an error here
    current  # and not later

    # Check args and evaluate all the auto-selection arguments

    etc.proc <- check_args(
      call=call.dat$call, tar.exp=call.dat$tar, cur.exp=call.dat$cur,
      mode=mode, context=context, line.limit=line.limit, format=format,
      brightness=brightness, color.mode=color.mode, pager=pager,
      ignore.white.space=ignore.white.space, max.diffs=max.diffs,
      align=align, disp.width=disp.width,
      hunk.limit=hunk.limit, convert.hz.white.space=convert.hz.white.space,
      tab.stops=tab.stops, style=style, palette.of.styles=palette.of.styles,
      frame=frame, tar.banner=tar.banner, cur.banner=cur.banner, guides=guides,
      rds=rds, trim=trim, word.diff=word.diff, unwrap.atomic=unwrap.atomic,
      extra=extra, interactive=interactive, term.colors=term.colors,
      strip.sgr=strip.sgr, sgr.supported=sgr.supported,
      call.match=match.call()
    )
    # If in rds mode, try to see if either target or current reference an RDS

    if(rds) {
      target <- get_rds(target)
      current <- get_rds(current)
    }
    # Force crayon to whatever ansi status we chose; note we must do this after
    # touching vars in case someone passes `options(crayon.enabled=...)` as one
    # of the arguments

    # old.crayon.opt <- options(
    #   crayon.enabled=
    #     is(etc.proc@style, "StyleAnsi") ||
    #     (!is(etc.proc@style, "StyleHtml") && etc.proc@sgr.supported)
    # )
    # on.exit(options(old.crayon.opt), add=TRUE)
    err <- make_err_fun(sys.call())

    # Compute gutter values so that we know correct widths to use for capture,
    # etc. If not a base text type style, assume gutter and column padding are
    # zero even though that may not always be correct

    etc.proc@gutter <- gutter_dat(etc.proc)

    col.pad.width <-
      nchar2(etc.proc@style@text@pad.col, sgr.supported=etc.proc@sgr.supported)
    gutt.width <- etc.proc@gutter@width

    half.width <- as.integer((etc.proc@disp.width - col.pad.width) / 2)
    etc.proc@line.width <-
      max(etc.proc@disp.width, .min.width + gutt.width)
    etc.proc@text.width <- etc.proc@line.width - gutt.width
    etc.proc@line.width.half <- max(half.width, .min.width + gutt.width)
    etc.proc@text.width.half <- etc.proc@line.width.half - gutt.width

    # If in side by side mode already then we know we want half-width, and if
    # width is less than 80 we know we want unified

    if(etc.proc@mode == "auto" && etc.proc@disp.width < 80L)
      etc.proc@mode <- "unified"
    if(etc.proc@mode == "sidebyside") etc.proc <- sideBySide(etc.proc)

    # Capture and diff

    diff <- capt_fun(target, current, etc=etc.proc, err=err, extra)
    diff
  }
}
#' Diff \code{print}ed Objects
#'
#' Runs the diff between the \code{print} or \code{show} output produced by
#' \code{target} and \code{current}.  Given the extensive parameter list, this
#' documentation page is intended as a reference for all the \code{diff*}
#' methods.  For a high level introduction see \code{vignette("diffobj")}.
#'
#' Almost all aspects of how the diffs are computed and displayed are
#' controllable through the \code{diff*} methods parameters.  This results in a
#' lengthy parameter list, but in practice you should rarely need to adjust
#' anything past the \code{color.mode} parameter.  Default values are specified
#' as options so that users may configure diffs in a persistent manner.
#' \code{\link{gdo}} is a shorthand function to access \code{diffobj} options.
#'
#' Parameter order after \code{color.mode} is not guaranteed.  Future versions
#' of \code{diffobj} may add parameters and re-order existing parameters past
#' \code{color.mode}.
#'
#' This and other \code{diff*} functions are S4 generics that dispatch on the
#' \code{target} and \code{current} parameters.  Methods with signature
#' \code{c("ANY", "ANY")} are defined and act as the default methods.  You can
#' use this to set up methods to pre-process or set specific parameters for
#' selected classes that can then \code{callNextMethod} for the actual diff.
#' Note that while the generics include \code{...} as an argument, none of the
#' methods do.
#'
#' Strings are re-encoded to UTF-8 with \code{\link{enc2utf8}} prior to
#' comparison to avoid encoding-only differences.
#'
#' The text representation of `target` and `current` should each have no more
#' than ~INT_MAX/4 lines.
#'
#' @section Matrices and Data Frames:
#'
#' While \code{diffPrint} attempts to handle the default R behavior that wraps
#' wide tables, the results are often sub-optimal.  A better approach is to set
#' the \code{disp.width} parameter to a large enough value such that wrapping is
#' not necessary, and to use a browser-based \code{pager}.  In the future we
#' will add the capability to specify different capture widths and wrap widths
#' so that this is an option for terminal output (see
#' \href{https://github.com/brodieG/diffobj/issues/109}{issue 109}).
#'
#' One thing to keep in mind is that \code{diffPrint} is not designed to work
#' with very large data frames.
#'
#' @export
#' @seealso \code{\link{diffObj}}, \code{\link{diffStr}},
#'   \code{\link{diffChr}} to compare character vectors directly,
#'   \code{\link{diffDeparse}} to compare deparsed objects, \code{\link{ses}}
#'   for a minimal and fast diff
#' @param target the reference object
#' @param current the object being compared to \code{target}
#' @param mode character(1L), one of:
#'   \itemize{
#'     \item \dQuote{unified}: diff mode used by \code{git diff}
#'     \item \dQuote{sidebyside}: line up the differences side by side
#'     \item \dQuote{context}: show the target and current hunks in their
#'       entirety; this mode takes up a lot of screen space but makes it easier
#'       to see what the objects actually look like
#'     \item \dQuote{auto}: default mode; pick one of the above, will favor
#'       \dQuote{sidebyside} unless \code{getOption("width")} is less than 80,
#'       or in \code{diffPrint} and objects are dimensioned and do not fit side
#'       by side, or in \code{diffChr}, \code{diffDeparse}, \code{diffFile} and
#'       output does not fit in side by side without wrapping
#'   }
#' @param context integer(1L) how many lines of context are shown on either side
#'   of differences (defaults to 2).  Set to \code{-1L} to allow as many as
#'   there are.  Set to \dQuote{auto}  to display as many as 10 lines or as few
#'   as 1 depending on whether total screen lines fit within the number of lines
#'   specified in \code{line.limit}.  Alternatively pass the return value of
#'   \code{\link{auto_context}} to fine tune the parameters of the auto context
#'   calculation.
#' @param format character(1L), controls the diff output format, one of:
#'   \itemize{
#'     \item \dQuote{auto}: to select output format based on terminal
#'       capabilities; will attempt to use one of the ANSI formats if they
#'       appear to be supported, and if not or if you are in the Rstudio console
#'       it will attempt to use HTML and browser output if in interactive mode.
#'     \item \dQuote{raw}: plain text
#'     \item \dQuote{ansi8}: color and format diffs using basic ANSI escape
#'       sequences
#'     \item \dQuote{ansi256}: like \dQuote{ansi8}, except using the full range
#'       of ANSI formatting options
#'     \item \dQuote{html}: color and format using HTML markup; the resulting
#'       string is processed with \code{\link{enc2utf8}} when output as a full
#'       web page (see docs for \code{html.output} under \code{\link{Style}}).
#'   }
#'   Defaults to \dQuote{auto}.  See \code{palette.of.styles} for details
#'   on customization, \code{\link{style}} for full control of output format.
#'   See `pager` parameter for more discussion of Rstudio behavior.
#' @param brightness character, one of \dQuote{light}, \dQuote{dark},
#'   \dQuote{neutral}, useful for adjusting color scheme to light or dark
#'   terminals.  \dQuote{neutral} by default.  See \code{\link{PaletteOfStyles}}
#'   for details and limitations.  Advanced: you may specify brightness as a
#'   function of \code{format}.  For example, if you typically wish to use a
#'   \dQuote{dark} color scheme, except for when in \dQuote{html} format when
#'   you prefer the \dQuote{light} scheme, you may use
#'   \code{c("dark", html="light")} as the value for this parameter.  This is
#'   particularly useful if \code{format} is set to \dQuote{auto} or if you
#'   want to specify a default value for this parameter via options.  Any names
#'   you use should correspond to a \code{format}.  You must have one unnamed
#'   value which will be used as the default for all \code{format}s that are
#'   not explicitly specified.
#' @param color.mode character, one of \dQuote{rgb} or \dQuote{yb}.
#'   Defaults to \dQuote{yb}.  \dQuote{yb} stands for \dQuote{Yellow-Blue} for
#'   color schemes that rely primarily on those colors to style diffs.
#'   Those colors can be easily distinguished by individuals with
#'   limited red-green color sensitivity.  See \code{\link{PaletteOfStyles}} for
#'   details and limitations.  Also offers the same advanced usage as the
#'   \code{brightness} parameter.
#' @param word.diff TRUE (default) or FALSE, whether to run a secondary word
#'   diff on the in-hunk differences.  For atomic vectors setting this to
#'   FALSE could make the diff \emph{slower} (see the \code{unwrap.atomic}
#'   parameter).  For other uses, particularly with \code{\link{diffChr}}
#'   setting this to FALSE can substantially improve performance.
#' @param pager one of \dQuote{auto} (default), \dQuote{on},
#'   \dQuote{off}, a \code{\link{Pager}} object, or a list; controls whether and
#'   how a pager is used to display the diff output.  If you require a
#'   particular pager behavior you must use a \code{\link{Pager}}
#'   object, or \dQuote{off} to turn off the pager.  All other settings will
#'   interact with other parameters such as \code{format}, \code{style}, as well
#'   as with your system capabilities in order to select the pager expected to
#'   be most useful.
#'
#'   \dQuote{auto} and \dQuote{on} are the same, except that in non-interactive
#'   mode \dQuote{auto} is equivalent to \dQuote{off}.  \dQuote{off} will always
#'   send output to the console.  If \dQuote{on}, whether the output
#'   actually gets routed to the pager depends on the pager \code{threshold}
#'   setting (see \code{\link{Pager}}).  The default behavior is to use the
#'   pager associated with the \code{Style} object.  The \code{Style} object is
#'   itself determined by the \code{format} or \code{style} parameters.
#'
#'   Depending on your system configuration different styles and corresponding
#'   pagers will get selected, unless you specify a \code{Pager} object
#'   directly.  On a system with a system pager that supports ANSI CSI SGR
#'   colors, the pager will only trigger if the output is taller than one
#'   window.  If the system pager is not known to support ANSI colors then the
#'   output will be sent as HTML to the IDE viewer if available or to the web
#'   browser if not.  Even though Rstudio now supports ANSI CSI SGR at the
#'   console, output is still formatted as HTML and sent to the IDE viewer.
#'   Partly this is for continuity of behavior, but also because the default
#'   Rstudio pager does not support ANSI CSI SGR, at least as of this writing.
#'
#'   If \code{pager} is a list, then the same as with \dQuote{on}, except that
#'   the \code{Pager} object associated with the selected \code{Style} object is
#'   re-instantiated with the union of the list elements and the existing
#'   settings of that \code{Pager}.  The list should contain named elements that
#'   correspond to the \code{\link{Pager}} instantiation parameters.  The names
#'   must be specified in full as partial parameter matching will not be carried
#'   out because the pager is re-instantiated with \code{\link{new}}.
#'
#'   See \code{\link{Pager}}, \code{\link{Style}}, and
#'   \code{\link{PaletteOfStyles}} for more details and for instructions on how
#'   to modify the default behavior.
#' @param guides TRUE (default), FALSE, or a function that accepts at least two
#'   arguments and requires no more than two arguments.  Guides
#'   are additional context lines that are not strictly part of a hunk, but
#'   provide important contextual data (e.g. column headers).  If TRUE, the
#'   context lines are shown in addition to the normal diff output, typically
#'   in a different color to indicate they are not part of the hunk.  If a
#'   function, the function should accept as the first argument the object
#'   being diffed, and the second the character representation of the object.
#'   The function should return the indices of the elements of the
#'   character representation that should be treated as guides.  See
#'   \code{\link{guides}} for more details.
#' @param trim TRUE (default), FALSE, or a function that accepts at least two
#'   arguments and requires no more than two arguments.  Function should compute
#'   for each line in captured output what portion of those lines should be
#'   diffed.  By default, this is used to remove row meta data differences
#'   (e.g. \code{[1,]}) so they alone do not show up as differences in the
#'   diff.  See \code{\link{trim}} for more details.
#' @param rds TRUE (default) or FALSE, if TRUE will check whether
#'   \code{target} and/or \code{current} point to a file that can be read with
#'   \code{\link{readRDS}} and if so, loads the R object contained in the file
#'   and carries out the diff on the object instead of the original argument.
#'   Currently there is no mechanism for specifying additional arguments to
#'   \code{readRDS}
#' @param unwrap.atomic TRUE (default) or FALSE.  Relevant primarily for
#'   \code{diffPrint}, if TRUE, and \code{word.diff} is also TRUE, and both
#'   \code{target} and \code{current} are \emph{unnamed} one-dimensional
#'   atomics, the vectors are unwrapped and diffed element by element, and then
#'   re-wrapped.  Since \code{diffPrint} is fundamentally a line diff, the
#'   re-wrapped lines are lined up in a manner that is as consistent as possible
#'   with the unwrapped diff.  Lines that contain the location of the word
#'   differences will be paired up.  Since the vectors may well be wrapped with
#'   different periodicities this will result in lines that are paired up that
#'   look like they should not be paired up, though the locations of the
#'   differences should be.  It is entirely possible that setting this parameter
#'   to FALSE will result in a slower diff.  This happens if two vectors are
#'   actually fairly similar, but their line representations are not.  For
#'   example, in comparing \code{1:100} to \code{c(100, 1:99)}, there is really
#'   only one difference at the \dQuote{word} level, but every screen line is
#'   different.  \code{diffChr} will also do the unwrapping if it is given a
#'   character vector that contains output that looks like the atomic vectors
#'   described above.  This is a bug, but as the functionality could be useful
#'   when diffing e.g. \code{capture.output} data, we now declare it a feature.
#' @param line.limit integer(2L) or integer(1L), if length 1 how many lines of
#'   output to show, where \code{-1} means no limit.  If length 2, the first
#'   value indicates the threshold of screen lines to begin truncating output,
#'   and the second the number of lines to truncate to, which should be fewer
#'   than the threshold.  Note that this parameter is implemented on a
#'   best-efforts basis and should not be relied on to produce the exact
#'   number of lines requested.  In particular do not expect it to work well
#'   for values small enough that the banner portion of the diff would have to
#'   be trimmed.  If you want a specific number of lines use \code{[} or
#'   \code{head} / \code{tail}.  One advantage of \code{line.limit} over these
#'   other options is that you can combine it with \code{context="auto"} and
#'   auto \code{max.level} selection (the latter for \code{diffStr}), which
#'   allows the diff to dynamically adjust to make best use of the available
#'   display lines.  \code{[}, \code{head}, and \code{tail} just subset the text
#'   of the output.
#' @param hunk.limit integer(2L) or integer(1L), how many diff hunks to show.
#'   Behaves similarly to \code{line.limit}.  How many hunks are in a
#'   particular diff is a function of how many differences, and also how much
#'   \code{context} is used since context can cause two hunks to bleed into
#'   each other and become one.
#' @param max.diffs integer(1L), number of \emph{differences} (default 50000L)
#'   after which we abandon the \code{O(n^2)} diff algorithm in favor of a naive
#'   \code{O(n)} one. Set to \code{-1L} to stick to the original algorithm up to
#'   the maximum allowed (~INT_MAX/4).
#' @param disp.width integer(1L) number of display columns to take up; note that
#'   in \dQuote{sidebyside} \code{mode} the effective display width is half this
#'   number (set to 0L to use default widths which are \code{getOption("width")}
#'   for normal styles and \code{80L} for HTML styles).  Future versions of
#'   \code{diffobj} may change this to larger values for two dimensional objects
#'   for better diffs (see details).
#' @param ignore.white.space TRUE or FALSE, whether to ignore differences in
#'   horizontal whitespace (i.e. spaces and tabs) when computing the diff
#'   (defaults to TRUE).
#' @param convert.hz.white.space TRUE or FALSE, whether to modify input
#'   strings that contain tabs and carriage returns in such a way that they
#'   display as they would \bold{with} those characters, but without using those
#'   characters (defaults to TRUE).  The conversion assumes that tab stops are
#'   spaced evenly eight characters apart on the terminal.  If this is not the
#'   case you may specify the tab stops explicitly with \code{tab.stops}.
#' @param tab.stops integer, what tab stops to use when converting hard tabs to
#'   spaces.  If not integer will be coerced to integer (defaults to 8L).  You
#'   may specify more than one tab stop.  If display width exceeds that
#'   addressable by your tab stops the last tab stop will be repeated.
#' @param align numeric(1L) between 0 and 1, proportion of
#'   words in a line of \code{target} that must be matched in a line of
#'   \code{current} in the same hunk for those lines to be paired up when
#'   displayed (defaults to 0.25), or an \code{\link{AlignThreshold}} object.
#'   Set to \code{1} to turn off alignment which will cause all lines in a hunk
#'   from \code{target} to show up first, followed by all lines from
#'   \code{current}.  Note that in order to be aligned lines must meet the
#'   threshold and have at least 3 matching alphanumeric characters (see
#'   \code{\link{AlignThreshold}} for details).
#' @param style \dQuote{auto}, a \code{\link{Style}} object, or a list.
#'   \dQuote{auto} by default.  If a \code{Style} object, will override the
#'   \code{format}, \code{brightness}, and \code{color.mode} parameters.
#'   The \code{Style} object provides full control of diff output styling.
#'   If a list, then the same as \dQuote{auto}, except that if the auto-selected
#'   \code{Style} requires instantiation (see \code{\link{PaletteOfStyles}}),
#'   then the list contents will be used as arguments when instantiating the
#'   style object.  See \code{\link{Style}} for more details, in particular the
#'   examples.
#' @param palette.of.styles \code{\link{PaletteOfStyles}} object; advanced
#'   usage, contains all the \code{\link{Style}} objects or
#'   \dQuote{classRepresentation} objects extending \code{\link{Style}} that are
#'   selected by specifying the \code{format}, \code{brightness}, and
#'   \code{color.mode} parameters.  See \code{\link{PaletteOfStyles}} for more
#'   details.
#' @param frame an environment to use as the evaluation frame for the
#'   \code{print/show/str} calls, and for \code{diffObj}, the evaluation frame
#'   for the \code{diffPrint} / \code{diffStr} calls.  Defaults to the return
#'   value of \code{\link{par_frame}}.
#' @param interactive TRUE or FALSE whether the function is being run in
#'   interactive mode, defaults to the return value of
#'   \code{\link{interactive}}.  If in interactive mode, pager will be used if
#'   \code{pager} is \dQuote{auto}, and if ANSI styles are not supported and
#'   \code{style} is \dQuote{auto}, output will be sent to viewer/browser as
#'   HTML.
#' @param term.colors integer(1L) how many ANSI colors are supported by the
#'   terminal.  This variable is provided for when
#'   \code{\link[=num_colors]{crayon::num_colors}} does not properly detect how
#'   many ANSI colors are supported by your terminal. Defaults to return value
#'   of \code{\link[=num_colors]{crayon::num_colors}} and should be 8 or 256 to
#'   allow ANSI colors, or any other number to disallow them.  This only
#'   impacts output format selection when \code{style} and \code{format} are
#'   both set to \dQuote{auto}.
#' @param tar.banner character(1L), language, or NULL, used to generate the
#'   text to display ahead of the diff section representing the target output.
#'   If NULL will use the deparsed \code{target} expression, if language, will
#'   use the language as it would the \code{target} expression, if
#'   character(1L), will use the string with no modifications.  The language
#'   mode is provided because \code{diffStr} modifies the expression prior to
#'   display (e.g. by wrapping it in a call to \code{str}).  Note that it is
#'   possible in some cases that the substituted value of \code{target} actually
#'   is character(1L), but if you provide a character(1L) value here it will be
#'   assumed you intend to use that value literally.
#' @param cur.banner character(1L) like \code{tar.banner}, but for
#'   \code{current}
#' @param strip.sgr TRUE, FALSE, or NULL (default), whether to strip ANSI CSI
#'   SGR sequences prior to comparison and for display of diff.  If NULL,
#'   resolves to TRUE if `style` resolves to an ANSI formatted diff, and
#'   FALSE otherwise.  The default behavior is to avoid confusing diffs where
#'   the original SGR and the SGR added by the diff are mixed together.
#' @param sgr.supported TRUE, FALSE, or NULL (default), whether to assume the
#'   standard output device supports ANSI CSI SGR sequences.  If TRUE, strings
#'   will be manipulated accounting for the SGR sequences.  If NULL,
#'   resolves to TRUE if `style` resolves to an ANSI formatted diff, and
#'   to `crayon::has_color()` otherwise.  This only controls how the strings are
#'   manipulated, not whether SGR is added to format the diff, which is
#'   controlled by the `style` parameter.  This parameter is exposed for the
#'   rare cases where you might wish to control string manipulation behavior
#'   directly.
#' @param extra list additional arguments to pass on to the functions used to
#'   create text representation of the objects to diff (e.g. \code{print},
#'   \code{str}, etc.)
#' @param ... unused, for compatibility of methods with generics
#' @return a \code{Diff} object; this object has a \code{show}
#'   method that will display the diff to screen or pager, as well as
#'   \code{summary}, \code{any}, and \code{as.character} methods.
#'   If you store the return value instead of displaying it to screen, and
#'   display it later, it is possible for the display to be thrown off if
#'   there are environment changes (e.g. display width changes) in between
#'   the time you compute the diff and the time you display it.
#' @rdname diffPrint
#' @name diffPrint
#' @export
#' @examples
#' ## `pager="off"` for CRAN compliance; you may omit in normal use
#' diffPrint(letters, letters[-5], pager="off")
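#'
#' ## The calls below are a minimal sketch added for illustration, not part of
#' ## the original examples; they assume the package default options are
#' ## unchanged and only use parameters documented above.
#' ## Force unified mode with a single line of context:
#' diffPrint(letters, letters[-5], mode="unified", context=1L, pager="off")
#' ## Specify `brightness` as a function of `format`, as described in the
#' ## `brightness` parameter documentation:
#' diffPrint(
#'   letters, letters[-5], brightness=c("dark", html="light"), pager="off"
#' )
#' ## Store the `Diff` object and query it instead of displaying it:
#' d <- diffPrint(letters, letters[-5], pager="off")
#' any(d)                  # whether there are any differences
#' head(as.character(d))   # diff text as a character vector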

setGeneric(
  "diffPrint", function(target, current, ...) standardGeneric("diffPrint")
)

#' @rdname diffPrint

setMethod("diffPrint", signature=c("ANY", "ANY"), make_diff_fun(capt_print))

#' Diff Object Structures
#'
#' Compares the \code{str} output of \code{target} and \code{current}.  If
#' the \code{max.level} parameter to \code{str} is left unspecified, will
#' attempt to find the largest \code{max.level} that fits within
#' \code{line.limit} and shows at least one difference.
#'
#' Due to the seemingly inconsistent nature of \code{max.level} when used with
#' objects with nested attributes, and also due to the relative slowness of
#' \code{str}, this function simulates the effect of \code{max.level} by hiding
#' nested lines instead of repeatedly calling \code{str} with varying values of
#' \code{max.level}.
#'
#' @inheritParams diffPrint
#' @seealso \code{\link{diffPrint}} for details on the \code{diff*} functions,
#'   \code{\link{diffObj}}, \code{\link{diffStr}},
#'   \code{\link{diffChr}} to compare character vectors directly,
#'   \code{\link{diffDeparse}} to compare deparsed objects,
#'   \code{\link{ses}} for a minimal and fast diff
#' @return a \code{Diff} object; see \code{\link{diffPrint}}.
#' @rdname diffStr
#' @export
#' @examples
#' ## `pager="off"` for CRAN compliance; you may omit in normal use
#' with(mtcars, diffStr(lm(mpg ~ hp)$qr, lm(mpg ~ disp)$qr, pager="off"))
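#' ## Sketch only, added for illustration: cap the output with `line.limit`
#' ## (see `?diffPrint`) so that the auto `max.level` selection described
#' ## above has a target to fit within; the value 15L is an arbitrary choice
#' with(
#'   mtcars,
#'   diffStr(lm(mpg ~ hp)$qr, lm(mpg ~ disp)$qr, line.limit=15L, pager="off")
#' )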

setGeneric("diffStr", function(target, current, ...) standardGeneric("diffStr"))

#' @rdname diffStr

setMethod("diffStr", signature=c("ANY", "ANY"), make_diff_fun(capt_str))

#' Diff Character Vectors Element By Element
#'
#' Will perform the diff on the actual string values of the character vectors
#' instead of capturing the printed screen output. Each vector element is
#' treated as a line of text.  NA elements are treated as the string
#' \dQuote{NA}.  Non character inputs are coerced to character and attributes
#' are dropped with \code{\link{c}}.
#'
#' @inheritParams diffPrint
#' @seealso \code{\link{diffPrint}} for details on the \code{diff*} functions,
#'   \code{\link{diffObj}}, \code{\link{diffStr}},
#'   \code{\link{diffDeparse}} to compare deparsed objects,
#'   \code{\link{ses}} for a minimal and fast diff
#' @return a \code{Diff} object; see \code{\link{diffPrint}}.
#' @export
#' @rdname diffChr
#' @examples
#' ## `pager="off"` for CRAN compliance; you may omit in normal use
#' diffChr(LETTERS[1:5], LETTERS[2:6], pager="off")
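#' ## Sketch only, added for illustration: skip the secondary word diff, which
#' ## the `word.diff` documentation in `?diffPrint` notes can substantially
#' ## improve `diffChr` performance
#' diffChr(LETTERS[1:5], LETTERS[2:6], word.diff=FALSE, pager="off")
#' ## Treat differences in horizontal white space as differences
#' diffChr(
#'   c("a b", "c"), c("a  b", "c"), ignore.white.space=FALSE, pager="off"
#' )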

setGeneric("diffChr", function(target, current, ...) standardGeneric("diffChr"))

#' @rdname diffChr

setMethod("diffChr", signature=c("ANY", "ANY"), make_diff_fun(capt_chr))

#' Diff Deparsed Objects
#'
#' Perform diff on the character vectors produced by \code{\link{deparse}}ing
#' the objects.  Each element counts as a line.  If an element contains newlines
#' it will be split into new elements at the newlines.
#'
#' @export
#' @inheritParams diffPrint
#' @seealso \code{\link{diffPrint}} for details on the \code{diff*} functions,
#'   \code{\link{diffObj}}, \code{\link{diffStr}},
#'   \code{\link{diffChr}} to compare character vectors directly,
#'   \code{\link{ses}} for a minimal and fast diff
#' @return a \code{Diff} object; see \code{\link{diffPrint}}.
#' @export
#' @rdname diffDeparse
#' @examples
#' ## `pager="off"` for CRAN compliance; you may omit in normal use
#' diffDeparse(matrix(1:9, 3), 1:9, pager="off")
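#' ## Sketch only, added for illustration: a deparse based diff can surface
#' ## differences that `print` output hides, e.g. integer vs. numeric storage
#' diffDeparse(1:3, c(1, 2, 3), pager="off")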

setGeneric(
  "diffDeparse", function(target, current, ...) standardGeneric("diffDeparse")
)
#' @rdname diffDeparse

setMethod("diffDeparse", signature=c("ANY", "ANY"), make_diff_fun(capt_deparse))

#' Diff Files
#'
#' Reads text files with \code{\link{readLines}} and performs a diff on the
#' resulting character vectors.
#'
#' @export
#' @param target character(1L) or file connection with read capability; if
#'   character should point to a text file
#' @param current like \code{target}
#' @inheritParams diffPrint
#' @seealso \code{\link{diffPrint}} for details on the \code{diff*} functions,
#'   \code{\link{diffObj}}, \code{\link{diffStr}},
#'   \code{\link{diffChr}} to compare character vectors directly,
#'   \code{\link{ses}} for a minimal and fast diff
#' @return a \code{Diff} object; see \code{\link{diffPrint}}.
#' @export
#' @rdname diffFile
#' @examples
#' \dontrun{
#' url.base <- "https://raw.githubusercontent.com/wch/r-source"
#' f1 <- file.path(url.base, "29f013d1570e1df5dc047fb7ee304ff57c99ea68/README")
#' f2 <- file.path(url.base, "daf0b5f6c728bd3dbcd0a3c976a7be9beee731d9/README")
#' diffFile(f1, f2)
#' }

setGeneric(
  "diffFile", function(target, current, ...) standardGeneric("diffFile")
)
#' @rdname diffFile

setMethod("diffFile", signature=c("ANY", "ANY"), make_diff_fun(capt_file))

#' Diff CSV Files
#'
#' Reads CSV files with \code{\link{read.csv}} and passes the resulting data
#' frames onto \code{\link{diffPrint}}.  \code{extra} values are passed as
#' arguments to both \code{read.csv} and \code{print}.  To the
#' extent you wish to use different \code{extra} arguments for each of those
#' functions you will need to \code{read.csv} the files and pass them to
#' \code{diffPrint} yourself.
#'
#' @export
#' @param target character(1L) or file connection with read capability;
#'   if character should point to a CSV file
#' @param current like \code{target}
#' @inheritParams diffPrint
#' @seealso \code{\link{diffPrint}} for details on the \code{diff*} functions,
#'   \code{\link{diffObj}}, \code{\link{diffStr}},
#'   \code{\link{diffChr}} to compare character vectors directly,
#'   \code{\link{ses}} for a minimal and fast diff
#' @return a \code{Diff} object; see \code{\link{diffPrint}}.
#' @export
#' @rdname diffCsv
#' @examples
#' iris.2 <- iris
#' iris.2$Sepal.Length[5] <- 99
#' f1 <- tempfile()
#' f2 <- tempfile()
#' write.csv(iris, f1, row.names=FALSE)
#' write.csv(iris.2, f2, row.names=FALSE)
#' ## `pager="off"` for CRAN compliance; you may omit in normal use
#' diffCsv(f1, f2, pager="off")
#' unlink(c(f1, f2))

setGeneric(
  "diffCsv", function(target, current, ...) standardGeneric("diffCsv")
)
#' @rdname diffCsv

setMethod("diffCsv", signature=c("ANY", "ANY"), make_diff_fun(capt_csv))

#' Diff Objects
#'
#' Compare either the \code{print}ed or \code{str} screen representation of
#' R objects depending on which is estimated to produce the most useful
#' diff.  The selection process tries to minimize screen lines while maximizing
#' differences shown subject to display constraints.  The decision algorithm is
#' likely to evolve over time, so do not rely on this function making
#' a particular selection under specific circumstances.  Instead, use
#' \code{\link{diffPrint}} or \code{\link{diffStr}} if you require one or the
#' other output.
#'
#' @inheritParams diffPrint
#' @seealso \code{\link{diffPrint}} for details on the \code{diff*} methods,
#'   \code{\link{diffStr}},
#'   \code{\link{diffChr}} to compare character vectors directly,
#'   \code{\link{diffDeparse}} to compare deparsed objects,
#'   \code{\link{ses}} for a minimal and fast diff
#' @return a \code{Diff} object; see \code{\link{diffPrint}}.
#' @export
#' @examples
#' ## `pager="off"` for CRAN compliance; you may omit in normal use
#' diffObj(letters, c(letters[1:10], LETTERS[11:26]), pager="off")
#' with(mtcars, diffObj(lm(mpg ~ hp)$qr, lm(mpg ~ disp)$qr, pager="off"))

setGeneric("diffObj", function(target, current, ...) standardGeneric("diffObj"))

diff_obj <- make_diff_fun(identity) # we overwrite the body next
body(diff_obj) <- quote({
  if(length(extra))
    stop("Argument `extra` must be empty in `diffObj`.")

  # frame # force frame so that `par_frame` called in this context

  # Need to generate calls inside a new child environment so that we do not
  # pollute the environment and create potential conflicts with ... args.
  # We used to run this inside a `local` call, but issues cropped up with the
  # advent of JIT; can't recall why just storing arguments at first was a
  # problem.

  args <- as.list(environment())
  call.dat <- extract_call(sys.calls(), frame)
  err <- make_err_fun(call.dat$call)

  if(is.null(args$tar.banner)) args$tar.banner <- call("quote", call.dat$tar)
  if(is.null(args$cur.banner)) args$cur.banner <- call("quote", call.dat$cur)

  call.print <- as.call(c(list(quote(diffobj::diffPrint)), args))
  call.str <- as.call(c(list(quote(diffobj::diffStr)), args))
  call.str[["extra"]] <- list(max.level="auto")
  res.print <- try(eval(call.print, frame), silent=TRUE)
  res.str <- try(eval(call.str, frame), silent=TRUE)

  if(inherits(res.str, "try-error"))
    err(
      "Error in calling `diffStr`: ",
      conditionMessage(attr(res.str, "condition"))
    )
  if(inherits(res.print, "try-error"))
    err(
      "Error in calling `diffPrint`: ",
      conditionMessage(attr(res.print, "condition"))
    )

  # Run both the print and str versions, and then decide which to use based
  # on some weighting of various factors including how many lines needed to be
  # omitted vs. how many differences were reported

  diff.p <- count_diff_hunks(res.print@diffs)
  diff.s <- count_diff_hunks(res.str@diffs)
  diff.l.p <- diff_line_len(
    res.print@diffs, res.print@etc, tar.capt=res.print@tar.dat$raw,
    cur.capt=res.print@cur.dat$raw
  )
  diff.l.s <- diff_line_len(
    res.str@diffs, res.str@etc, tar.capt=res.str@tar.dat$raw,
    cur.capt=res.str@cur.dat$raw
  )

  # How many lines of the input are in the diffs, vs how many lines of input

  diff.line.ratio.p <- lineCoverage(res.print)
  diff.line.ratio.s <- lineCoverage(res.str)

  # Only show the one with differences

  res <- if(!diff.s && diff.p) {
    res.print
  } else if(!diff.p && diff.s) {
    res.str

  # If one fits in full and the other doesn't, show the one that fits in full
  } else if(
    !res.str@trim.dat$lines[[1L]] &&
    res.print@trim.dat$lines[[1L]]
  ) {
    res.str
  } else if(
    res.str@trim.dat$lines[[1L]] &&
    !res.print@trim.dat$lines[[1L]]
  ) {
    res.print
  } else if (diff.l.p <= console_lines() / 2) {
    # Always use print if print output is reasonable size
    res.print
  } else {
  # Calculate the trade offs between the two options
    s.score <- diff.s / diff.l.s * diff.line.ratio.s
    p.score <- diff.p / diff.l.p * diff.line.ratio.p
    if(p.score >= s.score) res.print else res.str
  }
  res
})
#' @export
setMethod("diffObj", signature=c("ANY", "ANY"), diff_obj)
diffobj/R/tochar.R
# Copyright (C) 2021 Brodie Gaslam
#
# This file is part of "diffobj - Diffs for R Objects"
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# Go to  for a copy of the license.

# @include S4.R

NULL

# Compute the ranges of a hunk group based on atomic hunk ids
#
# rng.o is a matrix where each column represents `c(tar.rng, cur.rng)`
# and rng.o has the original untrimmed values (ACTUALLY, not clear this is
# what we are doing currently, seems like we're passing the post context
# assessment hunks)
#
# fill indicates which lines were fill lines and should not be picked to
# represent the start or end point of a range (these are added by the atomic
# word diff)

find_rng <- function(ids, rng.o, fill) {
  # first row of rng.o is the start of the hunk
  with.rng <- ids[which(rng.o[1L, ids] > 0L)]
  rng <- if(!length(with.rng)) {
    # Find previous earliest originally existing item we want to insert
    # after; note we need to look at the non-trimmed ranges, and we include
    # the first context atomic hunk in the group as a potential match
    prev <- rng.o[
      2L, seq_len(ncol(rng.o)) <= max(ids[[1L]], 0L) &
        rng.o[1L, ] > 0L
    ]
    if(!length(prev)) integer(2L) else c(max(prev), 0L)
  } else {
    c(min(rng.o[1L, intersect(ids, with.rng)]), max(rng.o[2L, ids]))
  }
}
# Create a text representation of a file line range to use in the hunk header

rng_as_chr <- function(range) {
  if(length(range) < 2L) "0" else {
    a <- range[[1L]]
    b <- if(diff(range))
      paste0(",", if(range[[2L]]) diff(range) + 1L else 0)
    paste0(a, b)
  }
}
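# Illustrative calls for `rng_as_chr`; the inputs are made up and the results
# are worked by hand from the definition above, not taken from package tests:
#
#   rng_as_chr(c(3L, 7L))  # "3,5": starts at line 3 and spans five lines
#   rng_as_chr(c(4L, 4L))  # "4":   single line, so no length is appended
#   rng_as_chr(c(4L, 0L))  # "4,0": empty range positioned after line 4
#   rng_as_chr(integer())  # "0":   fewer than two values supplied
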
# Finalization function should return a list with two character vectors for
# diff contents, and two factor vectors denoting the type of content for
# each of the character vectors; valid types are the factor levels defined in
# `chrt` below (insert, delete, match, header, etc.); chrt is just a helper
# function to generate factors with those possible values

chrt <- function(...)
  factor(
    c(...),
    levels=c(
      "insert", "delete", "match", "header", "context.sep",
      "banner.insert", "banner.delete", "guide", "fill"
    )
  )
hunkl <- function(col.1=NULL, col.2=NULL, type.1=NULL, type.2=NULL)
  c(
    list(
      if(is.null(col.1)) list(dat=character(), type=chrt()) else
        list(dat=col.1, type=type.1)
      ),
    if(!is.null(col.2)) list(list(dat=col.2, type=type.2))
  )

# finalization functions take aligned data and juxtapose it according to
# selected display mode.  Note that _context must operate on all the hunks
# in a hunk group, whereas the other two operate on each hunk atom.  Padding
# is identified in two forms: as actual A.fill and B.fill values when there
# was a wrapped diff, and in side by side mode when the lengths of A and B
# are not the same and end up adding NAs.  Padding is really only meaningful
# for side by side mode so is removed in the other modes

# The A.fill and B.fill business is a bit of a mess, because ideally we would
# want a structure parallel to the data structure instead of just vectors that
# we need to line up with the data lists, but this is all a result of trying
# to shoehorn new functionality in...

fin_fun_context <- function(dat) {
  dat_wo_fill <- function(x, ind) unlist(x[[ind]])[!x[[sprintf("%s.fill", ind)]]]
  A.dat <- lapply(dat, dat_wo_fill, "A")
  B.dat <- lapply(dat, dat_wo_fill, "B")

  A.lens <- vapply(A.dat, function(x) length(unlist(x)), integer(1L))
  B.lens <- vapply(B.dat, function(x) length(unlist(x)), integer(1L))

  A.ul <- unlist(A.dat)
  B.ul <- unlist(B.dat)

  context <- vapply(dat, "[[", logical(1L), "context")
  guide <- vapply(dat, "[[", logical(1L), "guide")
  A.ctx <- rep(context, A.lens)
  A.guide <- rep(guide, A.lens)
  B.ctx <- rep(context, B.lens)
  B.guide <- rep(guide, B.lens)
  A.types <- ifelse(A.guide, "guide", ifelse(A.ctx, "match", "delete"))
  B.types <- ifelse(B.guide, "guide", ifelse(B.ctx, "match", "insert"))

  # return in list so compatible with post `lapply` return values for other
  # finalization functions

  list(
    hunkl(
      col.1=c(A.ul,  if(length(B.ul)) NA, B.ul),
      type.1=chrt(A.types, if(length(B.ul)) "context.sep", B.types)
    )
  )
}
fin_fun_unified <- function(A, B, A.fill, B.fill, context, guide) {
  A.lens <- vapply(A, length, integer(1L))
  B.lens <- vapply(B, length, integer(1L))
  A.ord <- rep(seq_along(A.lens), A.lens)[!A.fill]
  B.ord <- rep(seq_along(B.lens), B.lens)[!B.fill]
  A <- unlist(A)[!A.fill]
  B <- unlist(B)[!B.fill]

  ord <- order(c(A.ord, B.ord))
  types <- c(
    rep(if(guide) "guide" else if(context) "match" else "delete", sum(A.lens)),
    rep(if(guide) "guide" else if(context) "match" else "insert", sum(B.lens))
  )
  hunkl(
    col.1=unlist(c(A, B)[ord]), type.1=chrt(unlist(types[ord]))
  )
}
fin_fun_sidebyside <- function(A, B, A.fill, B.fill, context, guide) {
  for(i in seq_along(A)) {
    A.ch <- A[[i]]
    B.ch <- B[[i]]
    A.l <- length(A.ch)
    B.l <- length(B.ch)
    max.l <- max(A.l, B.l)
    length(A.ch) <- length(B.ch) <- max.l

    A[[i]] <- A.ch
    B[[i]] <- B.ch
  }
  A.ul <- unlist(A)
  B.ul <- unlist(B)
  A.fill.u <- B.fill.u <- !logical(length(A.ul))
  A.fill.u[!is.na(A.ul)] <- A.fill
  B.fill.u[!is.na(B.ul)] <- B.fill

  A.len <- length(A.ul)
  B.len <- length(B.ul)
  hunkl(
    col.1=ifelse(is.na(A.ul), "", A.ul),
    col.2=ifelse(is.na(B.ul), "", B.ul),
    type.1=chrt(
      ifelse(
        rep(guide, A.len), "guide",
        ifelse(A.fill.u, "fill",
          ifelse(context, "match", "delete")
    ) ) ),
    type.2=chrt(
      ifelse(
        rep(guide, B.len), "guide",
        ifelse(B.fill.u, "fill",
          ifelse(context, "match", "insert")
  ) ) ) )
}
# Convert a hunk group into text representation

hunk_atom_as_char <- function(h.a, x) {
  etc <- x@etc
  mode <- x@etc@mode
  if(mode=="context") {
    ghd.mode.1 <- "A"
    ghd.mode.2 <- "B"
    ghd.type.1 <- ghd.type.2 <- "both"
  } else if(mode == "unified") {
    ghd.mode.1 <- ghd.mode.2 <-"A"
    ghd.type.1 <- "pos"
    ghd.type.2 <- "neg"
  } else if(mode == "sidebyside") {
    ghd.mode.1 <- "A"
    ghd.mode.2 <- "B"
    ghd.type.1 <- "pos"
    ghd.type.2 <- "neg"
  }
  A.ind <- get_hunk_ind(h.a, mode=ghd.mode.1, ghd.type.1)
  B.ind <- get_hunk_ind(h.a, mode=ghd.mode.2, ghd.type.2)

  # Align the lines accounting for partial matching post word-diff,
  # each diff style has a different finalization function

  dat.align <- align_eq(A.ind, B.ind, x=x, context=h.a$context)
  list(
    A=dat.align$A, B=dat.align$B,
    A.fill=dat.align$A.fill, B.fill=dat.align$B.fill,
    context=h.a$context, guide=h.a$guide
  )
}
hunk_as_char <- function(h.g, h.h, x) {
  stopifnot(is(x, "Diff"))

  etc <- x@etc
  mode <- etc@mode

  hunk.head <- if(length(h.g) && !h.g[[1L]]$completely.empty) {
    list(
      if(mode == "sidebyside") {
        hunkl(
          col.1=h.h[1L], col.2=h.h[2L],
          type.1=chrt("header"), type.2=chrt("header")
        )
      } else {
        hunkl(col.1=h.h, type.1=chrt("header"))
  } ) }
  # Generate hunk contents in aligned form

  hunk.res <- lapply(h.g, hunk_atom_as_char, x=x)

  # Run finalization functions; context mode is different because we need to
  # re-order across atomic hunks

  fin_fun <- switch(
    mode, unified=fin_fun_unified, sidebyside=fin_fun_sidebyside,
    context=fin_fun_context
  )
  hunk.fin <- if(mode != "context") {
    lapply(hunk.res, function(x) do.call(fin_fun, x))
  } else {
    fin_fun_context(hunk.res)
  }
  # Add header and return; this is a list of lists, though all sub-lists
  # should have same format

  c(hunk.head, hunk.fin)
}
# Helper functions for 'as.character'

# Get trimmed character ranges; positives are originally from target, and
# negatives from current

get_hunk_ind <- function(h.a, mode, type="both") {
  stopifnot(
    mode %in% LETTERS[1:2], length(mode) == 1L,
    is.chr.1L(type), type %in% c("both", "pos", "neg")
  )
  rng.raw <- c(
    if(type %in% c("pos", "both"))
      seq(h.a$tar.rng.trim[[1L]], h.a$tar.rng.trim[[2L]]),
    if(type %in% c("neg", "both"))
      -seq(h.a$cur.rng.trim[[1L]], h.a$cur.rng.trim[[2L]])
  )
  rng.raw[rng.raw %in% h.a[[mode]]]
}
#' @rdname diffobj_s4method_doc

setMethod("as.character", "Diff",
  function(x, ...) {
    old.crayon.opt <- options(crayon.enabled=is(x@etc@style, "StyleAnsi"))
    on.exit(options(old.crayon.opt), add=TRUE)

    hunk.limit <- x@etc@hunk.limit
    line.limit <- x@etc@line.limit
    disp.width <- x@etc@disp.width
    hunk.grps <- x@diffs
    mode <- x@etc@mode
    tab.stops <- x@etc@tab.stops
    ignore.white.space <- x@etc@ignore.white.space
    sgr.supported <- x@etc@sgr.supported

    # legacy from when we had different max diffs for different parts of diff

    max.diffs <- x@etc@max.diffs
    max.diffs.in.hunk <- x@etc@max.diffs
    max.diffs.wrap <- x@etc@max.diffs

    s <- x@etc@style  # shorthand

    len.max <- max(length(x@tar.dat$raw), length(x@cur.dat$raw))

    no.diffs <- if(!suppressWarnings(any(x))) {
      # This needs to account for "trim" effects

      msg <- "No visible differences between objects"
      if(
        (
          ignore.white.space || x@etc@convert.hz.white.space ||
          !identical(x@etc@trim, trim_identity) || x@etc@strip.sgr
        ) &&
        !isTRUE(all.equal(x@tar.dat$orig, x@cur.dat$orig)) &&
        isTRUE(all.equal(x@tar.dat$comp, x@cur.dat$comp))
      ) {
        paste0(
          msg, ", but there are some differences suppressed by ",
          "`ignore.white.space`, `convert.hz.white.space`, `strip.sgr`, ",
          "and/or `trim`. Set all those arguments to FALSE to highlight ",
          "the differences.",
          collapse=""
        )
      } else if (!isTRUE(all.eq <- all.equal(x@target, x@current))) {
        c(
          paste0(
            msg, ", but objects are *not* `all.equal`",
            if(length(all.eq)) ":" else "."
          ),
          if(length(all.eq)) paste0("- ", all.eq)
        )
      } else paste0(msg, ".")
    }
    # Basic width computation and banner size; start by computing gutter so we
    # can figure out what's left

    gutter.dat <- x@etc@gutter

    # Trim hunks to the extent needed to make sure we fit in lines

    hunks.flat <- unlist(hunk.grps, recursive=FALSE)
    ranges <- vapply(
      hunks.flat, function(h.a) c(h.a$tar.rng.trim, h.a$cur.rng.trim),
      integer(4L)
    )
    ranges.orig <- vapply(
      hunks.flat, function(h.a) c(h.a$tar.rng.sub, h.a$cur.rng.sub), integer(4L)
    )
    hunk.heads <- x@hunk.heads
    h.h.chars <- nchar2(
      chr_trim(
        unlist(hunk.heads), x@etc@line.width, sgr.supported=sgr.supported
      ),
      sgr.supported=sgr.supported
    )
    # Make the object banner and compute more detailed widths post trim

    tar.banner <- if(!is.null(x@etc@tar.banner)) x@etc@tar.banner else
      deparse(x@etc@tar.exp)[[1L]]
    cur.banner <- if(!is.null(x@etc@cur.banner)) x@etc@cur.banner else
      deparse(x@etc@cur.exp)[[1L]]
    ban.A.trim <- if(s@wrap)
        chr_trim(tar.banner, x@etc@text.width, sgr.supported=sgr.supported)
      else tar.banner
    ban.B.trim <- if(s@wrap)
        chr_trim(cur.banner, x@etc@text.width, sgr.supported=sgr.supported)
      else cur.banner
    banner.A <- s@funs@word.delete(ban.A.trim)
    banner.B <- s@funs@word.insert(ban.B.trim)

    # Trim banner doesn't currently work, so we just commented the nulling out
    # and updated the docs.  This doesn't seem worth fixing.  The banner portion
    # would still show up with the banners themselves NULLed.

    if(line.limit[[1L]] >= 0) {
      ll2 <- line.limit[[2L]]
      # if(ll2 < 2L && mode != "sidebyside") {
      #   banner.A <- NULL
      # }
      # if(ll2 < 1L) {
      #   banner.B <- banner.A <- NULL
      # }
    }
    if(mode == "sidebyside") {
      line.limit <- pmax(integer(2L), line.limit - 2L)
    } else {
      line.limit <- pmax(integer(2L), line.limit - 1L)
    }
    # Post trim, figure out max lines we could possibly be showing from capture
    # strings; careful with ranges,

    trim.meta <- attr(hunk.grps, "meta")
    if(is.null(trim.meta))
      stop("Internal error: missing trim meta data, contact maintainer") # nocov

    lim.line <- trim.meta$lines
    lim.hunk <- trim.meta$hunks
    ll <- !!lim.line[[1L]]
    lh <- !!lim.hunk[[1L]]
    diff.count <- count_diffs(hunk.grps)
    str.fold.out <- if(x@capt.mode == "str" && x@diff.count.full > diff.count) {
      paste0(
        x@diff.count.full - diff.count,
        " differences are hidden by our use of `max.level`"
      )
    }
    limit.out <- if(ll || lh) {
      if(!is.null(str.fold.out)) {
        # nocov start
        stop(
          "Internal Error: should not be str folding when limited; contact ",
          "maintainer."
        )
        # nocov end
      }
      paste0(
        "... omitted ",
        if(ll) sprintf("%d/%d lines", lim.line[[1L]], lim.line[[2L]]),
        if(ll && lh) ", ",
        if(lh) sprintf("%d/%d hunks", lim.hunk[[1L]], lim.hunk[[2L]])
      )
    }
    tar.max <- max(ranges[2L, ], 0L)
    cur.max <- max(ranges[4L, ], 0L)

    # At this point we need to actually reconstitute the final output string by:
    # - Applying word diffs
    # - Reconstructing untrimmed strings
    # - Substitute appropriate values for empty strings

    f.f <- x@etc@style@funs
    if(x@etc@word.diff) {
      tar.w.c <- word_color(x@tar.dat$trim, x@tar.dat$word.ind, f.f@word.delete)
      cur.w.c <- word_color(x@cur.dat$trim, x@cur.dat$word.ind, f.f@word.insert)
    } else {
      tar.w.c <- x@tar.dat$trim
      cur.w.c <- x@cur.dat$trim
    }
    x@tar.dat$fin <- untrim(x@tar.dat, tar.w.c, x@etc)
    x@cur.dat$fin <- untrim(x@cur.dat, cur.w.c, x@etc)

    # Generate the pre-rendered hunk data as text columns; a bit complicated
    # as we need to unnest stuff; use rbind to make it a little easier.

    pre.render.raw <- unlist(
      Map(hunk_as_char, hunk.grps, hunk.heads, x=list(x)),
      recursive=FALSE
    )
    pre.render.mx <- do.call(rbind, pre.render.raw)
    pre.render.mx.2 <- lapply(
      split(pre.render.mx, col(pre.render.mx)), do.call, what="rbind"
    )
    pre.render <- lapply(
      unname(pre.render.mx.2),
      function(mx) list(
        dat=unlist(mx[, 1L]),
        type=unlist(mx[, 2L], recursive=FALSE)
    ) )
    # Add the banners; banners are rendered exactly like normal text, except
    # for the line level functions

    if(mode == "sidebyside") {
      pre.render[[1L]]$dat <- c(banner.A, pre.render[[1L]]$dat)
      pre.render[[1L]]$type <- c(chrt("banner.delete"), pre.render[[1L]]$type)
      pre.render[[2L]]$dat <- c(banner.B, pre.render[[2L]]$dat)
      pre.render[[2L]]$type <- c(chrt("banner.insert"), pre.render[[2L]]$type)
    } else {
      pre.render[[1L]]$dat <- c(banner.A, banner.B, pre.render[[1L]]$dat)
      pre.render[[1L]]$type <- c(
        chrt("banner.delete", "banner.insert"), pre.render[[1L]]$type
      )
    }
    # Generate wrapped version of the text; if in sidebyside, make sure that
    # all elements are same length

    pre.render.w <- if(s@wrap) {
      pre.render.w <- replicate(
        length(pre.render),
        vector("list", length(pre.render[[1L]]$dat)), simplify=FALSE
      )
      for(i in seq_along(pre.render)) {
        hdr <- pre.render[[i]]$type == "header"
        pre.render.w[[i]][hdr] <- wrap(
          pre.render[[i]]$dat[hdr], x@etc@line.width,
          sgr.supported=sgr.supported
        )
        pre.render.w[[i]][!hdr] <- wrap(
          pre.render[[i]]$dat[!hdr], x@etc@text.width,
          sgr.supported=sgr.supported
        )
      }
      pre.render.w
    } else lapply(pre.render, function(y) as.list(y$dat))

    line.lens <- lapply(pre.render.w, vapply, length, integer(1L))
    types.raw <- lapply(pre.render, "[[", "type")
    types <- lapply(
      types.raw, function(y) sub("^banner\\.", "", as.character(y))
    )
    if(mode == "sidebyside") {
      line.lens.max <- replicate(2L, do.call(pmax, line.lens), simplify=FALSE)
      pre.render.w <- lapply(
        pre.render.w, function(y) {
          Map(
            function(dat, len) {
              length(dat) <- len
              dat
            },
            y, line.lens.max[[1L]]
      ) } )
    } else line.lens.max <- line.lens

    # Substitute NA elements with the appropriate values as dictated by the
    # styles; also record lines NA positions

    lines.na <- lapply(pre.render.w, lapply, is.na)
    pre.render.w <- lapply(
      pre.render.w, lapply,
      function(y) {
        res <- y
        res[is.na(y)] <- x@etc@style@na.sub
        res
    } )

    # Compute gutter, padding, and continuations

    gutters <- render_gutters(
      types=types, lens=line.lens, lens.max=line.lens.max, etc=x@etc
    )
    # Pad text

    pre.render.w.p <- if(s@pad) {
      Map(
        function(col, type) {
          diff.line <- type %in% c("insert", "delete", "match", "guide", "fill")
          col[diff.line] <- lapply(
            col[diff.line], rpad, x@etc@text.width, sgr.supported=sgr.supported
          )
          col[!diff.line] <- lapply(
            col[!diff.line], rpad, x@etc@line.width, sgr.supported=sgr.supported
          )
          col
        },
        pre.render.w, types
      )
    } else pre.render.w

    # Apply text level styles; make sure that all types are defined here
    # otherwise you'll get lines missing in output; note that fill lines were
    # represented by NAs originally and we identify them within each aligned
    # group with `lines.na`

    # NOTE: any changes here need to be reflected in `make_dummy_row`

    # CAN WE MOVE THIS WAY EARLIER SO WE CAN GET THE CORRECT TEXT WIDTHS?
    # SEE #65

    es <- x@etc@style
    funs.ts <- list(
      insert=function(x) es@funs@text(es@funs@text.insert(x)),
      delete=function(x) es@funs@text(es@funs@text.delete(x)),
      match=function(x) es@funs@text(es@funs@text.match(x)),
      guide=function(x) es@funs@text(es@funs@text.guide(x)),
      fill=function(x) es@funs@text(es@funs@text.fill(x)),
      context.sep=function(x)
        es@funs@text(es@funs@context.sep(es@text@context.sep)),
      header=es@funs@header
    )
    pre.render.s <- Map(
      function(dat, type, l.na) {
        res <- vector("list", length(dat))
        for(i in names(funs.ts))  # really need to loop through all?
          res[type == i] <- Map(
            function(y, l.na.i) {
              res.s <- y
              if(any(l.na.i))
                res.s[l.na.i] <- funs.ts$fill(y[l.na.i])
              res.s[!l.na.i | i == "context.sep"] <- funs.ts[[i]](y[!l.na.i])
              res.s
            },
            dat[type == i],
            l.na[type == i]
          )
        res
      },
      pre.render.w.p, types, lines.na
    )
    # Reconstruct 'types.raw' with the appropriate lengths, and replacing
    # types with 'fill' if elements were extended due to wrap

    types.raw.x <- Map(
      function(y, z) {
        Map(
          function(y.s, z.s) {
            res <- rep(y.s, length(z.s))
            res[z.s] <- "fill"
            res
          },
          y, z
      ) },
      types.raw, lines.na
    )
    # Render columns; note here we use 'types.raw' to distinguish banner lines

    cols <- render_cols(
      cols=pre.render.s, gutters=gutters, types=types.raw.x, etc=x@etc
    )
    # Render rows

    rows <- render_rows(cols, etc=x@etc)

    # Collect all the pieces, and for the meta pieces wrap, pad, and format

    pre.fin.l <- list(no.diffs, rows, limit.out, str.fold.out)
    meta.elem <- c(1L, 3:4)
    pre.fin.l[meta.elem] <- lapply(
      pre.fin.l[meta.elem],
      # meta should not have any csi, so plain strwrap is okay
      function(m) es@funs@meta(strwrap(m, width=disp.width))
    )
    pre.fin <- unlist(pre.fin.l)

    # Apply subsetting as needed

    ind <- seq_along(pre.fin)
    ind <- if(length(x@sub.index)) ind[x@sub.index] else ind
    if(length(x@sub.head)) ind <- head(ind, x@sub.head)
    if(length(x@sub.tail)) ind <- tail(ind, x@sub.tail)

    # Do the finalization

    pre.fin <- pre.fin[ind]
    res.len <- length(pre.fin)

    finalize(es@funs@container(pre.fin), x, res.len)
} )

# Finalizing fun used by both Diff and DiffSummary as.character methods

finalize <- function(txt, obj, len) {
  style <- obj@etc@style
  pager <- style@pager
  obj@etc@style@pager <- if(use_pager(pager, len)) pager else PagerOff()

  fin <- style@finalizer(obj, txt)

  attr(fin, "len") <- len
  fin
}
diffobj/R/myerssimple.R
# Copyright (C) 2021 Brodie Gaslam
#
# This file is part of "diffobj - Diffs for R Objects"
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# Go to  for a copy of the license.

# These are deprecated legacy functions from before we incorporated the
# libmba versions of the myers algo

# Alternate implementation of Myers algorithm in R, without linear space
# modification.  Included here mostly for reference purposes and not intended
# for use since the MBA myers implementation should be far superior

myers_simple <- function(target, current) {
  path <- myers_simple_int(target, current)
  diff_path_to_diff(path, target, current)
}
myers_simple_int <- function(A, B) {
  N <- length(A)
  M <- length(B)
  MAX <- M + N + 1L
  OFF <- MAX + 1L  # offset to adjust to R indexing
  Vl <- vector("list", MAX)
  for(D in seq_len(MAX) - 1L) {
    Vl[[D + 1L]] <- if(!D) integer(2L * MAX + 1L) else Vl[[D]]
    for(k in seq(-D, D, by=2L)) {
      # not sure of precedence for || vs &&
      # k == -D means x == 0
      V <- Vl[[D + 1L]]
      if(k == -D || (k != D && V[k - 1L + OFF] < V[k + 1L + OFF])) {
        x <- V[k + 1L + OFF]
      } else {
        x <- V[k - 1L + OFF] + 1L
      }
      y <- x - k

      # Move on diagonal
      while (x < N && y < M && A[x + 1L] == B[y + 1L]) {
        x <- x + 1L
        y <- y + 1L
      }
      # Record last match or end; if a mismatch no longer increment

      Vl[[D + 1L]][k + OFF] <- x
      if(x >= N && y >= M) {
        # Create matrix to hold entire result path; should be longest of
        # A and B plus recorded differences

        path.len <- D + max(N, M)
        res <- matrix(integer(1L), nrow=path.len, ncol=2)
        res[path.len, ] <- c(x, y)
        path.len <- path.len - 1L

        for(d in rev(seq_len(D))) {
          Vp <- Vl[[d]]
          break.out <- FALSE
          repeat {
            # can't match to zero since that is the initialized value
            shift.up <- Vp[k + 1L + OFF] == x && x
            shift.left <- Vp[k - 1L + OFF] == x - 1L && x > 1L
            if(x <= 0L && y <= 0L) {
              break
            } else if(!shift.up && !shift.left) {
              # must be on snake or about to hit 0,0
              x <- max(x - 1L, 0L)
              y <- max(y - 1L, 0L)
            } else {
              if(shift.up) {
                y <- y - 1L
                k <- k + 1L
              } else {
                x <- x - 1L
                k <- k - 1L
              }
              break.out <- TRUE
            }
            res[path.len, ] <- c(x, y)
            path.len <- path.len - 1L
            if(break.out) break
          }
        }
        if(any(res < 0L)) {
          # nocov start
          stop(
            "Logic Error: diff generated illegal coords; contact maintainer."
          )
          # nocov end
        }
        return(res)
      }
    }
  }
  stop("Logic Error, should not get here") # nocov
}
# Translates a diff path produced by the simple Myers Algorithm into the
# standard format we use in the rest of the package

diff_path_to_diff <- function(path, target, current) {
  stopifnot(
    is.character(target), is.character(current),
    is.matrix(path), is.integer(path), ncol(path) == 2,
    all(path[, 1L] %in% c(0L, seq_along(target))),
    all(path[, 2L] %in% c(0L, seq_along(current)))
  )
  # Path specifies 0s as well as duplicate coordinates, which we don't use
  # in our other formats.  For dupes, find first value for each index that is
  # lined up with a real value in the other column

  get_dupe <- function(x) {
    base <- !logical(length(x))
    if(!length(y <- which(x != 0L)))
      base[[1L]] <- FALSE else base[[min(y)]] <- FALSE
    base
  }
  cur.dup <- as.logical(ave(path[, 1L], path[, 2L], FUN=get_dupe))
  tar.dup <- as.logical(ave(path[, 2L], path[, 1L], FUN=get_dupe))

  path[!path] <- NA_integer_
  path[tar.dup, 1L] <- NA_integer_
  path[cur.dup, 2L] <- NA_integer_

  # Now create the character equivalents of the path matrix

  tar.path <- target[path[, 1L]]
  cur.path <- current[path[, 2L]]

  # Mark the equalities in the path matrix by setting them negative

  path[which(tar.path == cur.path), ] <- -path[which(tar.path == cur.path), ]

  # Remaining numbers are the mismatches which we will arbitrarily assign to
  # each other; to do so we first split our data into groups of matches and
  # mismatches and do the mapping there-in.  We also get rid of non-matching
  # entries.

  matched <- ifelse(!is.na(path[, 1]) & path[, 1] < 0L, 1L, 0L)
  splits <- cumsum(abs(diff(c(0, matched))))
  chunks <- split.data.frame(path, splits)
  res.tar <- res.cur <- vector("list", length(chunks))
  mm.count <- 0L  # for tracking matched mismatches

  for(i in seq_along(chunks)) {
    x <- chunks[[i]]
    if((neg <- any(x < 0L, na.rm=TRUE)) && !all(x < 0L, na.rm=TRUE))
      stop("Internal Error: match group error; contact maintainer") # nocov
    if(neg) {
      # Matches, so equal length and set to zero
      res.tar[[i]] <- res.cur[[i]] <- integer(nrow(x))
    } else {
      # Mismatches
      tar.mm <- Filter(Negate(is.na), x[, 1L])
      cur.mm <- Filter(Negate(is.na), x[, 2L])

      x.min.len <- min(length(tar.mm), length(cur.mm))
      res.tar[[i]] <- res.cur[[i]] <- seq_len(x.min.len) + mm.count
      mm.count <- x.min.len + mm.count
      length(res.tar[[i]]) <- length(tar.mm)
      length(res.cur[[i]]) <- length(cur.mm)
    }
  }
  if(!length(res.tar)) res.tar <- integer()
  if(!length(res.cur)) res.cur <- integer()

  return(list(target=unlist(res.tar), current=unlist(res.cur)))
}
diffobj/NEWS.md0000755000176200001440000002617414126712602012771 0ustar  liggesusers# diffobj

## v0.3.5

* Options automatically fall back to factory defaults if they are unset (h/t
  @gadenbui).
* [#158](https://github.com/brodieG/diffobj/issues/158): Calling `diff*` with
  `do.call` now works without warnings.
* [#117](https://github.com/brodieG/diffobj/issues/117): Fix guide detection
  with very wide wrapped data.frames (h/t @bastician, @overvolting).

## v0.3.4

* Add a print method for `ses_dat` return values that makes it easier to
  interpret the diff.
* [#152](https://github.com/brodieG/diffobj/issues/152): Rewrite the
  fall-back "O(n)" algorithm that kicks in when there are `max.diffs`
  differences to be more robust (h/t @hadley, @DanChaltiel, @gadenbui).
* Related to #152: `max.diffs=0` used to mean the same as `max.diffs=-1` (i.e.
  unlimited), but this was undocumented and an error.  `max.diffs=0` will now
  immediately fall back to the "O(n)" algorithm.
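
As a rough illustration of the v0.3.4 entries above, the sketch below compares
`ses` and `ses_dat` on two short vectors and then forces the fall-back
algorithm with `max.diffs=0`.  The input vectors are made up, and passing
`warn=FALSE` to silence the fall-back warning is a choice made for this
example only.

```r
library(diffobj)

a <- c("a", "b", "c")
b <- c("a", "c", "d")

# Shortest edit script as a character vector, in the style of `diff`
ses(a, b)

# The same information as a data frame; auto-printing the object uses
# the print method added in this release
d <- ses_dat(a, b)
d

# max.diffs=0 now goes straight to the fall-back algorithm instead of
# meaning "unlimited"; the resulting edit script may not be minimal
fb <- ses(a, b, max.diffs=0, warn=FALSE)
fb
```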

## v0.3.3

* Implement experimental .Rout / .Rout.save testing.
* Fix `all.equal` test breakages from
  [r79555](https://github.com/r-devel/r-svn/commit/66d016544fe9deb64aa74ae55fa3edfcb721b1c4).

## v0.3.1-2

* [#150](https://github.com/brodieG/diffobj/issues/150): Make tests compatible
  with new `testthat` release (h/t @hadley).
* Remove pre-built vignettes and note `testthat` change to `waldo` release.

## v0.3.0

* [#143](https://github.com/brodieG/diffobj/issues/143): Add `ses_dat` to
  provide a more computable version of `ses` (h/t @hadley).
* [#144](https://github.com/brodieG/diffobj/issues/144): Re-encode strings to
  UTF-8 prior to comparison to avoid spurious encoding-only differences (h/t
  @hadley).
* [#142](https://github.com/brodieG/diffobj/issues/142): Typos in
  `standardGeneric` in trim/guide generic definitions.
* Drop attributes from inputs to `diffChr` (revealed as an issue by #142).
* Banish ghosts of `stringsAsFactors`.

## v0.2.4

* Tests explicitly set `stringsAsFactors=TRUE` so they don't fail with the
  anticipated change in R4.0.
* [#140](https://github.com/brodieG/diffobj/issues/140): Bad link in `?ses`.

## v0.2.3

This is a bugfix release.

* [#136](https://github.com/brodieG/diffobj/issues/136): Documentation for
  `ignore.white.space` (h/t @flying-sheep) and `max.diffs` parameters listed
  incorrect defaults.
* [#135](https://github.com/brodieG/diffobj/issues/135): Incorrect handling of
  potential meta data strings when unwrapping atomics would cause a "wrong sign
  in by argument" error (h/t @flying-sheep).  We also fixed other bugs related
  to the handling of meta data in atomic vectors that were uncovered while
  debugging this issue.
* [#134](https://github.com/brodieG/diffobj/issues/134): Forwarding `...` to
  `diff*` functions no longer breaks substitution of arguments for diff banners
  (h/t @noamross).
* [#133](https://github.com/brodieG/diffobj/issues/133): `diffFile` considers
  files with equal content but different locations to be `all.equal` now (h/t
  @noamross).
* [#132](https://github.com/brodieG/diffobj/issues/132): Duplicate pager slot
  for baseline `Pager` removed (h/t Bill Dunlap).

There are also several other small internal changes that in theory should not
affect user facing behavior.
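
A minimal sketch of the #133 behavior, assuming two temporary files with
identical content; the file names and contents are made up, and
`format="raw"` and `pager="off"` are set only to keep the output plain.

```r
library(diffobj)

# Two files at different locations with the same content
f1 <- tempfile(fileext=".txt")
f2 <- tempfile(fileext=".txt")
writeLines(c("a", "b", "c"), f1)
writeLines(c("a", "b", "c"), f2)

d <- diffFile(f1, f2, format="raw", pager="off")

# `any` reports whether the diff object contains differences
no.diffs <- !any(d)
no.diffs
```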

## v0.2.2

* Set `RNGversion()` due to changes to sampling mechanism.

## v0.2.0-1

### Features

* [#129](https://github.com/brodieG/diffobj/issues/129): Allow pager
  specification via lists rather than full `Pager` objects for easier changes to
  defaults.  As part of this we changed `StyleRaw` objects to use the default
  pager instead of `PagerOff`.
* [#126](https://github.com/brodieG/diffobj/issues/126): Add embedding diffs in
  Shiny to vignette.
* [#119](https://github.com/brodieG/diffobj/issues/119): `ignore.white.space` now
  also ignores white space differences adjoining punctuation.
* [#118](https://github.com/brodieG/diffobj/issues/118): New option to preserve
  temporary diff file output when using pager (see `?Pager`).
* [#114](https://github.com/brodieG/diffobj/issues/114): New options `strip.sgr`
  and `sgr.supported` allow finer control of what happens when input already
  contains ANSI CSI SGR and how ANSI CSI SGR is handled in string manipulations.
  Related to this, `options(crayon.enabled=TRUE)` is no longer set when
  capturing output prior to diff as it used to be.  By default pre-existing ANSI
  CSI SGR is stripped with a warning prior to comparison.

### Bugs

* [#131](https://github.com/brodieG/diffobj/issues/131): Fix missing slot in S4
  class definition (discovered by Bill Dunlap).
* [#127](https://github.com/brodieG/diffobj/issues/127): Width CSS conflicts
  with bootstrap (reported by @eckyu, debugged/fixed by @cpsievert).

## v0.1.11

* [#123](https://github.com/brodieG/diffobj/issues/123): Compatibility with R3.1
  (@krlmlr).
* [#121](https://github.com/brodieG/diffobj/issues/121): Vignette describing how
  to embed diffs in Rmd documents (@JBGruber).
* [#115](https://github.com/brodieG/diffobj/issues/115): Declare HTML page diff
  encoding/charset as UTF-8 (@artemklevtsov).

## v0.1.10

* Comply with CRAN directive to remove references to packages not in
  depends/imports/suggests in tests (these were run optionally before).
* Fix bugs in corner case handling when we hit `max.diffs`.

## v0.1.9

* Fix test failures caused by changes in tibble output

## v0.1.8

* [#111](https://github.com/brodieG/diffobj/issues/111): Fixed guides with
  `row.names=FALSE` (thank you @[Henrik
  Bengtsson](https://github.com/HenrikBengtsson)).
* [#113](https://github.com/brodieG/diffobj/issues/113): Adapt tests to new
  `str` return values (thank you @[Martin
  Mächler](https://github.com/mmaechler)).

## v0.1.7

* Fix tests for next `testthat` release.
* [#107](https://github.com/brodieG/diffobj/issues/107): Diffs on quoted
  language
* [#108](https://github.com/brodieG/diffobj/issues/108): Problems caused by
  copying `crayon` functions
  ([@seulki-choi](https://stackoverflow.com/users/7788015/seulki-choi),
  @gaborcsardi)
* [#100](https://github.com/brodieG/diffobj/issues/100): R_useDynamicSymbols
* [#97](https://github.com/brodieG/diffobj/issues/97): 2D Guidelines fixes for
  data.table, tibble
* [#96](https://github.com/brodieG/diffobj/issues/96): Warnings when comparing
  large data tables.
* [#94](https://github.com/brodieG/diffobj/issues/94): Guide detection problems
  in nested lists.
* [#105](https://github.com/brodieG/diffobj/issues/105): Copyright tweaks.

## v0.1.6

* [#87](https://github.com/brodieG/diffobj/issues/87): `diffobj` is now GPL (>=2)
  instead of GPL-3.
* [#81](https://github.com/brodieG/diffobj/issues/81): Better handling of mixed
  UTF-8 / ASCII strings, reported by [jennybc](https://github.com/jennybc)
* [#88](https://github.com/brodieG/diffobj/issues/88): correctly handle trimming
  when empty lists are involved, reported by [wch](https://github.com/wch)
* [#77](https://github.com/brodieG/diffobj/issues/77): `diffObj` now favors
  dispatching to `diffPrint` unless `diffPrint` output is large
* [#82](https://github.com/brodieG/diffobj/issues/82): `diffChr` and `ses` now
  treat `NA` as "NA" (needed with change in `nchar(NA)` in base R)
* [#85](https://github.com/brodieG/diffobj/issues/85): Improved alignment of
  unwrapped atomic vector diffs
* [#83](https://github.com/brodieG/diffobj/issues/83): Improve pager auto
  detection (note now ANSI output is only allowed by default if terminal
  supports ANSI colors and the system pager is `less`, see `?Pager` for details)
* [#92](https://github.com/brodieG/diffobj/issues/92),
  [#80](https://github.com/brodieG/diffobj/issues/80),
  [#45](https://github.com/brodieG/diffobj/issues/45): basic implementation of
  S4 guidelines and trimming (full resolution eventually with
  [#33](https://github.com/brodieG/diffobj/issues/33))
* [#84](https://github.com/brodieG/diffobj/issues/84): simplify how to call
  `diffChr` for improved performance, including "optimization" of
  `convert.hz.whitespace`.
* [#64](https://github.com/brodieG/diffobj/issues/64): fix line limit in corner
  case
* More robust handling of external `diff*` methods and of how `diffObj` calls
  `diffStr` and `diffPrint`
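
To illustrate the #82 entry above, here is a small sketch with made-up inputs.
If `NA` is treated as the string "NA" as described, the two vectors below
should compare as identical and the edit script should be empty.

```r
library(diffobj)

# NA in the first vector lines up with the string "NA" in the second,
# so no edits are expected
na.ses <- ses(c("a", NA), c("a", "NA"))
na.ses
```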

## v0.1.5

* [#71](https://github.com/brodieG/diffobj/issues/71): Buggy diffs b/w data
  frames when one has sequential row numbers and the other does not, loosely
  related to [#38](https://github.com/brodieG/diffobj/issues/38)
* [#69](https://github.com/brodieG/diffobj/issues/69): Improve performance on
  outputs with large print/show output, and other assorted minor optimizations
* [#72](https://github.com/brodieG/diffobj/issues/72): Warn when `style`
  parameter overrides other user supplied parameters
* [#70](https://github.com/brodieG/diffobj/issues/70): Improve word contrast in YB
  HTML mode
* [#63](https://github.com/brodieG/diffobj/issues/63): Show `all.equal` output
  when objects are not `all.equal` but there are no visible differences
* Add Mean Relative Indifference vignette and update vignette styling

## v0.1.4

* [#67](https://github.com/brodieG/diffobj/issues/67): Fix CRAN Binaries
* Clarified that C code is heavily modified and incompatible with original
  `libmba` implementation

## v0.1.3

* First version on CRAN
* [#51](https://github.com/brodieG/diffobj/issues/51): use RStudio viewport to display HTML diffs when running in RStudio (h/t Noam Ross)
* [#54](https://github.com/brodieG/diffobj/issues/54), [#55](https://github.com/brodieG/diffobj/issues/55): scale HTML output to viewport width (see `?Style`)
* [#53](https://github.com/brodieG/diffobj/issues/53): default term colors computed on run instead of on package load
* [#56](https://github.com/brodieG/diffobj/issues/56): disable wrap for HTML output
* HTML output now captured with default width 80 since there is no explicit relationship between HTML viewport width and `getOption("width")`
* The `style` parameter now accepts lists to use as instantiation arguments for `Style` objects (see `?Style`)
* Fix subtle rendering and formatting application flaws
* Switch Travis shields to SVG per Gábor Csárdi
* Improve in-hunk alignment of partially matching lines
* Compile with `-pedantic`, fix related warnings (h/t [Arun](https://stackoverflow.com/users/559784/arun))
* Improved coverage and more robust testing
* Several internal structure changes to accommodate improvements

## v0.1.2

* [#46](https://github.com/brodieG/diffobj/issues/46): Guide and Trim Problems with Lists
* [#47](https://github.com/brodieG/diffobj/issues/47): Output Format in non-ANSI Terminals Without Browser (reported by [Frank](https://github.com/brodieG/diffobj/issues/47))
* [#48](https://github.com/brodieG/diffobj/issues/48): `make_blocking` Default prompt Confusing (reported by [Frank](https://github.com/brodieG/diffobj/issues/47))
* [#49](https://github.com/brodieG/diffobj/issues/49): In-Hunk Word Diffs Issues when Unwrap-diffing Atomics
* [#50](https://github.com/brodieG/diffobj/issues/50): CSS Lost in Rstudio Server Sessions (reported by [Steven Beaupré](https://chat.stackoverflow.com/users/4064778/steven-beaupre))

## v0.1.1

* Turn off unwrapping for _named_ atomic vectors (see [#43](https://github.com/brodieG/diffobj/issues/43))
* [#44](https://github.com/brodieG/diffobj/issues/44): Proper handling of NULL objects in `diffStr`
* [#41](https://github.com/brodieG/diffobj/issues/41): Compilation Issues in Winbuilder

## v0.1.0

* Initial Release
diffobj/MD50000644000176200001440000005204214126775131012177 0ustar  liggesusers2e535c9911aec20a33313d1ee2dcc8db *DESCRIPTION
86ba5fc5c95a550a25db2f3df704cede *NAMESPACE
2f45de40307898b4042d61f42ffb95e7 *NEWS.md
49d3d45d7f8fbd5039931dec5e0a75ad *R/capt.R
15c25fba7f29c1ca25339d50c9db24e3 *R/check.R
8cc95339d84906b633c287b1c25d3b1e *R/core.R
e24157f15e4880e71e4b6327e1906e3a *R/diff.R
dcce37378f0cff4d10f9a4c132bc2898 *R/finalizer.R
f62077eb26090bdbb0003aee5b405a24 *R/get.R
b8bf43023e234425503bfa0e8dc9adc8 *R/guides.R
984ed8bdef48847c20f7d41939984bf9 *R/html.R
3866a281d271757572686dcd4323e1e9 *R/hunks.R
e7a548158b6bd4ef3079e2157a7f50eb *R/layout.R
9710803489104a33ba502ef4ee11d8e1 *R/misc.R
0a6b2d346328ee8c55e0cea53a36db3b *R/myerssimple.R
d37d716569d95a7d8b1f1eff73f24ad2 *R/options.R
6f3afe561497c8fe8f3dd16dff396552 *R/pager.R
dc315dbe26fbcf9af0c746df1cc7c0d5 *R/rdiff.R
1019d8520b1b2f8367f77341aaf0f1c4 *R/rds.R
201159662ccb306e427391315f5f2517 *R/s4.R
ef478064e67522a8627d4c390149235b *R/set.R
323789973b388f35b29cfbb5d397d44b *R/styles.R
140c8b32cf27466641ba6073c206a69c *R/subset.R
d264d7fd94ee63fc5ba90458cafc19c9 *R/summmary.R
6e41c09de0c77f265ab80103e8eda1bc *R/system.R
69f27658f5dad96c2b6dffac87c8b845 *R/text.R
14dce32be9ed0a23b2e8d8437c925357 *R/tochar.R
c3a42436e52a82143d6a23fc35e94b67 *R/trim.R
91bf02bacd231e90d1d8c8a5130d679a *R/word.R
178748ba5f9b47c21e928b4848561a78 *README.md
c5ffe1166423d4137c6ce3c1f91790ad *build/vignette.rds
320fc37b423b0e309fedfde4061bc906 *inst/COPYRIGHTS
4522b9ebe26e4d989bc7a27f933e3d21 *inst/css/diffobj.css
7076bea0a51f0cc75fcba4f2c8ad4078 *inst/doc/diffobj.R
c50e37261653c146d7542dafab786625 *inst/doc/diffobj.Rmd
42be10d705bffbf64e96521b82a8e71c *inst/doc/diffobj.html
d9b10adda8de36a85376cae8ecb9f552 *inst/doc/embed.R
9e93047aa59973c6475724b83fc6c321 *inst/doc/embed.Rmd
7e6979c508b4e7a4bb0f5f18c25d6c9c *inst/doc/embed.html
58457afa03757c75394dfb619876db3b *inst/script/diffobj.js
31419b3dc6ce97ed3345f193b7656d50 *man/AlignThreshold-class.Rd
93b232974dcc5131b6d2d3614b5e90bc *man/Diff-class.Rd
45e765d94003da598c35f61278cea156 *man/Extract_PaletteOfStyles.Rd
f2e40c1436601fc5c57e0d2fd58405cf *man/Pager.Rd
70f6f6c53819b288f32221a9037ccdd8 *man/PaletteOfStyles-class.Rd
44122be6966715471ea1071ab79f9a8d *man/Rdiff_chr.Rd
dc65bc321349383369bb285cafe8f377 *man/Style.Rd
7432f4de3d7eeac72497bdebce78fa99 *man/StyleFuns.Rd
2c94216969caf6341d74b9ffe80188b7 *man/StyleSummary.Rd
31afdced631cfa0ea225ea80e52f97bb *man/StyleText.Rd
acccbbfbe4b351a6ec2abf0214622580 *man/any-Diff-method.Rd
c6e0d308a833f0f506c82db170574449 *man/as.character-DiffSummary-method.Rd
5520ff92ad10370c9d2e8eaa23ee7301 *man/as.character-MyersMbaSes-method.Rd
9b1924f6d3ed312ae91c66dcaa12c867 *man/auto_context.Rd
42e7cd6e2e963cf61fde825fd0c48b53 *man/console_lines.Rd
70d8fbe6d8a252715546229b7e8a033d *man/diffChr.Rd
7e00f6a3c1b636f73bc29f103e2a550e *man/diffCsv.Rd
977de4f3e0ef7dfced765ae54dfe53b5 *man/diffDeparse.Rd
9e82eaf94e05a4b67ef803aea8e07a88 *man/diffFile.Rd
29916b9909caccb09b666d6a32acbb0c *man/diffObj.Rd
c7221ac49df3ea6b56ca72dcdcee5214 *man/diffPrint.Rd
01a60bfa69539835b8a7b1dbfde0da96 *man/diffStr.Rd
b0f022b9ecc50e339d59999b799bf995 *man/diff_myers.Rd
246ff7707c0b57375841ee6901cc4522 *man/diffobj-package.Rd
e90329bfecacee1de20b87230e5bb8cc *man/diffobj_s4method_doc.Rd
168591e0c6d3f6c40061e5fa5819f724 *man/diffobj_set_def_opts.Rd
504f92f3aa46bf96b5d7ceb9ced1cf39 *man/dimnames-PaletteOfStyles-method.Rd
7925b4c3808f54355ce4a7e27151351b *man/extract-Diff-method.Rd
0152556a38f12fec9826e2486c1b5503 *man/finalizeHtml.Rd
f5950a8e6e9dd92fc80cba49b7e6f753 *man/gdo.Rd
ed1a56df12ea7f95cd3d1a3401b1b761 *man/guides.Rd
e6fe3aeab53db02d9e281856a1895df8 *man/has_Rdiff.Rd
930d757bba96a859bbd528ba95d5056e *man/make_blocking.Rd
00ba016bdea25ea334451371bd616d3c *man/nchar_html.Rd
4f2341110f17057d5df37855d560c188 *man/pager_is_less.Rd
641f6695b008ea98b037ba7542b653f6 *man/par_frame.Rd
cc177a548625a2310c4b1b909f08a6a7 *man/ses.Rd
577dda8d444116bc3d5cd9c76b6dc0b1 *man/show-DiffSummary-method.Rd
6b3cf207b00a94134d319f5bbc1624eb *man/show-PaletteOfStyles-method.Rd
8686528d8af3c214b50ccd8383932e94 *man/show-Style-method.Rd
f6aec61ea0385eb48029f21361b4b36d *man/strip_hz_control.Rd
15e607283f82470ad098506338e6a56e *man/summary-Diff-method.Rd
7b3535e656b8011ff8298db338be16e9 *man/summary-MyersMbaSes-method.Rd
f79eb2ea6757ef2babf636c6222040f6 *man/summary-PaletteOfStyles-method.Rd
089e12b412655819fbd17fba92e02a2c *man/tag_f.Rd
9b8d5ebf572029a14064136e48907d4b *man/trim.Rd
a781aa55007885ab32eafa432f2e0d4d *man/view_or_browse.Rd
9dd8095f89982d2d197f849c9af145d6 *man/webfiles.Rd
4be9e7afc0e49598c44931d45eb41e66 *src/diff.c
2305ce0fbfec00b881bd460e956388d2 *src/diff.h
eea2cb88f3d7f5ab081b4396715c447c *src/diffobj.c
3a4d96a207313f8d062a150984b5b280 *src/diffobj.h
5b752a841b887c1ad5c862eb6679d9f6 *src/init.c
a7b2148e0e6c112c8724679fea3e80e4 *tests/_helper/breakdown.R
fc5f397522504e0bce404a5c2a5246e6 *tests/_helper/check.R
93a11280184e036685a9cd4c24d0ed78 *tests/_helper/commonobjects.R
168a72d71281bdb4c60a4546213dcd1a *tests/_helper/init.R
a45701a8a943fee86e2f7bfea68c554e *tests/_helper/objs/atomic/100.rds
dc3f364393561bd18b20d84286c3b7c7 *tests/_helper/objs/atomic/1000.rds
44c2381df681c732977077f5a46bd25c *tests/_helper/objs/atomic/1100.rds
69f4fa668d1680ac1dccebb25e315e99 *tests/_helper/objs/atomic/1200.rds
0734eca1d4d03680354c5612b93aed88 *tests/_helper/objs/atomic/1250.rds
37c577a82c7968d7997aebe1d3109ca5 *tests/_helper/objs/atomic/1300.rds
34a356139062b436b647f889e3a26a0a *tests/_helper/objs/atomic/1400.rds
e05e4ac9e2ac21a5be7ccf5ba4901412 *tests/_helper/objs/atomic/1425.rds
2432d32ed1495f36c39af5fa71db6a64 *tests/_helper/objs/atomic/1450.rds
ea6141ad68328e262fb39675bebc751b *tests/_helper/objs/atomic/1500.rds
c140a1be8612c04236c9741e044e7f8a *tests/_helper/objs/atomic/1600.rds
b6dc4bc12e54f95aa91224f02ee32bb8 *tests/_helper/objs/atomic/1700.rds
764f2019d89135c633fb547ed4e1a2b0 *tests/_helper/objs/atomic/1800.rds
221588368f319815137723249d330fe7 *tests/_helper/objs/atomic/1900.rds
c2a2f40ec2fb74e3732c84602b3f88f0 *tests/_helper/objs/atomic/200.rds
a74e6e21b9c01c44385949f20db6a701 *tests/_helper/objs/atomic/2000.rds
101109fd4be9f6920e256fa7f4902d42 *tests/_helper/objs/atomic/2100.rds
41fd3b188451f69fd9e1fecb6ca7647d *tests/_helper/objs/atomic/2200.rds
f23316310a5b9a7fc25895065a64c1da *tests/_helper/objs/atomic/2300.rds
2f5942ee3eedc6d18123ffcff4756ce7 *tests/_helper/objs/atomic/2400.rds
f0e8bb15dd4921aec2d0c224c3646234 *tests/_helper/objs/atomic/2500.rds
3f0cd800db66170eef9945861acd15aa *tests/_helper/objs/atomic/2520.rds
fab134028d4d441560b2a6847cae5091 *tests/_helper/objs/atomic/2530.rds
31764b7afd942315de7c9e7436081d93 *tests/_helper/objs/atomic/2540.rds
bfafd46faad18c70b7d7b2e6734ea3a3 *tests/_helper/objs/atomic/2600.rds
da093f2a3ce05c9b4b76c2bb7a9353a1 *tests/_helper/objs/atomic/2700.rds
f4d06103b2b62f64c2b60cfbf4ac4d36 *tests/_helper/objs/atomic/2800.rds
989d1180e3880b221aa73666b60bfd81 *tests/_helper/objs/atomic/2900.rds
7f9785d7bf351f9eee19af77fb04bac2 *tests/_helper/objs/atomic/3000.rds
5d368c9bac8fd272b810a4b15b33fc57 *tests/_helper/objs/atomic/3100.rds
91218647f129ad69cbc6ebd5c7dc172d *tests/_helper/objs/atomic/3200.rds
acb225308bc51c408162ad69a43a6863 *tests/_helper/objs/atomic/3300.rds
c1694e5b8fdec754a52c2b01e747ed7c *tests/_helper/objs/atomic/3400.rds
6ee314755262cba70a6fe4fd2dda4058 *tests/_helper/objs/atomic/400.rds
e5d81a992cfb76c65f2b4676f4cbebf3 *tests/_helper/objs/atomic/500.rds
02e31be899289c97b3a36a6c75ab47a8 *tests/_helper/objs/atomic/600.rds
4fff8918a763c31cd2f3f98e6df53acf *tests/_helper/objs/atomic/700.rds
3e8b6358cb2aa43179319f1bd6228486 *tests/_helper/objs/atomic/800.rds
8357b4e983f364f29582a41bf12ef315 *tests/_helper/objs/atomic/900.rds
48f39fcbb69b8703b42d00c23b2defbd *tests/_helper/objs/common/aaaa.RDS
f6d68e680e822b2808762d45d201b675 *tests/_helper/objs/context/100.rds
5dd71ce40c9679b4c95aaee45c565b37 *tests/_helper/objs/context/100.txt
f198a0849fac7b99820ea6694ed26990 *tests/_helper/objs/context/150.rds
41dc9a5fed17f5ee9d909000195c23e6 *tests/_helper/objs/context/200.rds
54ab37eccc5b0e7fe550ebd5467da5af *tests/_helper/objs/context/200.txt
f28717f7e31485ece3c0f8e2ce032b3f *tests/_helper/objs/context/300.rds
ec95a408b23380dfa98dcd5d44026a0e *tests/_helper/objs/context/400.rds
f00958307411ddfe90d8d45be4a6e917 *tests/_helper/objs/context/500.rds
6ea96d1d9b5da34720d14756cb5aada7 *tests/_helper/objs/diffChr/100.rds
5ea668dd799d006548506ff46c74b738 *tests/_helper/objs/diffChr/100.txt
d85efe2ce8b91e2acde6a2b1363211ca *tests/_helper/objs/diffChr/1000.rds
ca124f8c0b596d37f7a211162f40f018 *tests/_helper/objs/diffChr/1100.rds
f2cbabe3191df5d5e8d74db247af7af3 *tests/_helper/objs/diffChr/1200.rds
805bbf7455cce2d6e79bb5868535d99e *tests/_helper/objs/diffChr/1300.rds
5728b07f03543b472c9e8afe477a4218 *tests/_helper/objs/diffChr/1400.rds
11efd2c5b63c52ebef56eddc809547be *tests/_helper/objs/diffChr/1500.rds
3094895a0ae2a02c3df392b59110131b *tests/_helper/objs/diffChr/200.rds
3f61d4788d843555c01962fe9488f224 *tests/_helper/objs/diffChr/200.txt
c28135ec2ad960124c266f2d8e8f1449 *tests/_helper/objs/diffChr/225.rds
b858eca5629dab83ad71a0ceb4aeaf28 *tests/_helper/objs/diffChr/250.rds
3d5584a86016db6a893cf63a896bbd3d *tests/_helper/objs/diffChr/300.rds
7eec0dc151eacdc33d6fff179ce36d70 *tests/_helper/objs/diffChr/300.txt
a0a3531322d6651912d2b4bfc057fdfb *tests/_helper/objs/diffChr/400.rds
9387a8fc5bcf16bf042f6431a8f3dcc6 *tests/_helper/objs/diffChr/400.txt
14e6ba05956c075bc0dea7a806d1ce11 *tests/_helper/objs/diffChr/500.rds
c20342671fef7075a835980cba0f8878 *tests/_helper/objs/diffChr/500.txt
a7bff3a7d88f4be8a65fecedd1d6707b *tests/_helper/objs/diffChr/600.rds
ad335e2b030a318a911f7ce5b2d81c8c *tests/_helper/objs/diffChr/800.rds
da6d129bc1abde1f32243f14060437e2 *tests/_helper/objs/diffChr/900.rds
a892016365e770595c37c7cfeef280c4 *tests/_helper/objs/diffDeparse/100.rds
a3902e3d7357dc97b3abc4d323046096 *tests/_helper/objs/diffDeparse/200.rds
cfd11512ac2c0336d5f7ae6dcd3952b4 *tests/_helper/objs/diffFile/100.rds
125b7770618cee796f75f11fe42b513f *tests/_helper/objs/diffFile/s.o.30dbe0.R
9752f5d324622b136bad0666714cc101 *tests/_helper/objs/diffFile/s.o.3f1f68.R
1483c33e014566ac7c0307b4f875a190 *tests/_helper/objs/diffObj/100.rds
79eb3fb9a4533231c16a88da7c7e67f1 *tests/_helper/objs/diffObj/200.rds
3d2afca44733f3900578b5139c80ac6e *tests/_helper/objs/diffObj/300.rds
f28316a1fc903d296643e6730589f6de *tests/_helper/objs/diffObj/400.rds
37b17feb793bf07a25b90bea54db8dd5 *tests/_helper/objs/diffPrint/100.rds
85499e94ba78e61b81abb0a18c518aad *tests/_helper/objs/diffPrint/100.txt
60381bf784f270725eaf5f886df2e40f *tests/_helper/objs/diffPrint/1000.rds
c963e95d93e1419329faf4b3e4af8008 *tests/_helper/objs/diffPrint/1100.rds
5e701970a746e1f48a7aabcc27a886e9 *tests/_helper/objs/diffPrint/1200.rds
8f76c594527c84f9c094c41c1e954e15 *tests/_helper/objs/diffPrint/1300.rds
9bd8920b9f7d5e05d30c60d1739e58f5 *tests/_helper/objs/diffPrint/1400.rds
36c801641f202eb148e628da26a0f8bf *tests/_helper/objs/diffPrint/150.rds
90bb138a4f05b20b723cc13de8dfbf2e *tests/_helper/objs/diffPrint/150.txt
07497bf535c796d2b6207f9f0aabb023 *tests/_helper/objs/diffPrint/1500.rds
b2de18480bc61e62e2b4d7e14e00963f *tests/_helper/objs/diffPrint/1600.rds
e4f1b395cb2c812e12de73aaf57eb689 *tests/_helper/objs/diffPrint/1650.rds
31bec83015b12f4212e0a406c5c69923 *tests/_helper/objs/diffPrint/1700.rds
5778a7ce44ff70aa025f16aecb98a047 *tests/_helper/objs/diffPrint/175.rds
0abeac571d16a77cd414bc57c0d528cc *tests/_helper/objs/diffPrint/175.txt
fc035cf2e8ece3effb658e0919d7a9f7 *tests/_helper/objs/diffPrint/1800.rds
5dc5f7ba8f8d8fbc698dc145d0296736 *tests/_helper/objs/diffPrint/1900.rds
eec6a82dfed19356cf0d4f8c6b0dcc8b *tests/_helper/objs/diffPrint/200.rds
73d3abb805eaa7da254c6e46a9369b8d *tests/_helper/objs/diffPrint/200.txt
1be79c5818539f20125ac58eba1dcc44 *tests/_helper/objs/diffPrint/2000.rds
f6969b4e873c2626cba9655fa0ade92f *tests/_helper/objs/diffPrint/2100.rds
f14f24f4387d9ba302727c01c8dca2e6 *tests/_helper/objs/diffPrint/2150.rds
4883764bcc9b13ca249fa1759c2eb0cf *tests/_helper/objs/diffPrint/2200.rds
7acb31232169b734d4fa644446928ab1 *tests/_helper/objs/diffPrint/2250.rds
23e50dc91187ec68ffb8417c962732d2 *tests/_helper/objs/diffPrint/2300.rds
f6969b4e873c2626cba9655fa0ade92f *tests/_helper/objs/diffPrint/2350.rds
fb6e3e28a0fe590afa16470e491b897d *tests/_helper/objs/diffPrint/2370.rds
90f0c8e624026ae3ee09ff6b36af0a7e *tests/_helper/objs/diffPrint/2380.rds
4a8410c083b3f64cc8fce0ff5d3709a3 *tests/_helper/objs/diffPrint/2383.rds
f7adce81326a7bf1b2af4b9db98da099 *tests/_helper/objs/diffPrint/2400.rds
f7adce81326a7bf1b2af4b9db98da099 *tests/_helper/objs/diffPrint/2500.rds
88f94687cb1212e341dd409fef89f60e *tests/_helper/objs/diffPrint/2600.rds
51192ba9a08d8130611189b4b7a99717 *tests/_helper/objs/diffPrint/2700.rds
1462ff7687b98218c6dd5a5d79209fc6 *tests/_helper/objs/diffPrint/2800.rds
0fcf1dbaf973aff32c38642dea659515 *tests/_helper/objs/diffPrint/2900.rds
6428bd1ff765feaf9698cc6693cf1ac0 *tests/_helper/objs/diffPrint/300.rds
7266f76fb2b5f81db8f1c18858399827 *tests/_helper/objs/diffPrint/3000.rds
a35c08e11e8d7c403e3177c13a2d65f9 *tests/_helper/objs/diffPrint/3100.rds
2c958f4965b14694d30335b9f72807bd *tests/_helper/objs/diffPrint/3200.rds
86794d78ccec9b8dc582ab5a346da9b3 *tests/_helper/objs/diffPrint/3300.rds
4d0208a0b608b7d633f662b2092ed896 *tests/_helper/objs/diffPrint/3400.rds
26bb401980e8c6ddc54688448335eb75 *tests/_helper/objs/diffPrint/400.rds
8ba4fe3b4176287a49fc6ae3f6174ea7 *tests/_helper/objs/diffPrint/500.rds
207c5dd85b0c8df263deae230c68a2b2 *tests/_helper/objs/diffPrint/600.rds
f6b17f580422089c84bf2c16cea8579a *tests/_helper/objs/diffPrint/700.rds
cf0a10ed0b7939dfbd33366fbfe353c2 *tests/_helper/objs/diffPrint/800.rds
3732b9b1ed11bd4171627347310ee847 *tests/_helper/objs/diffPrint/900.rds
e1b09bf3fffdaac6189e86f1c49594b0 *tests/_helper/objs/diffStr/100.rds
8f8c0614f6610391dd23827baa721d2b *tests/_helper/objs/diffStr/100.txt
64fc1bd339dd438d0b4dff9fb26c6dbf *tests/_helper/objs/diffStr/1000.rds
d2ca22a1835e41ad7d0b60d1e0b4e252 *tests/_helper/objs/diffStr/1100.rds
0aebb54ccae30a57fa40c9e94d5a0b79 *tests/_helper/objs/diffStr/200.rds
8e82236f5a8dca6179790f610b9162e8 *tests/_helper/objs/diffStr/300.rds
afc2ddbcbb06ad13195f6111d352b141 *tests/_helper/objs/diffStr/400.rds
f16070308563105cd79766fcbe4731ce *tests/_helper/objs/diffStr/500.rds
c3438a71e37e39840f52f78e161941d8 *tests/_helper/objs/diffStr/550.rds
3d2afca44733f3900578b5139c80ac6e *tests/_helper/objs/diffStr/600.rds
60a797bb010593ff574c99320941a5c9 *tests/_helper/objs/diffStr/700.rds
e816514b74dcc98245ed28dcd05b04e6 *tests/_helper/objs/diffStr/800.rds
e816514b74dcc98245ed28dcd05b04e6 *tests/_helper/objs/diffStr/900.rds
81b4d6c860927ce9fe0b389f625f07c8 *tests/_helper/objs/guides/100.rds
9dadb9e30a566f057bb3449ca2e80ea4 *tests/_helper/objs/guides/200.rds
5cc7de0903649ea072c89294f6b8cda2 *tests/_helper/objs/html/100.rds
8b2d8f29e9b5c7e5e4315ec4bc4fa2a3 *tests/_helper/objs/html/200.rds
0143f228fcd903ce6540824af9868cd9 *tests/_helper/objs/html/300.rds
e1c6fae4d07bd0a9baeb86e3fed064b1 *tests/_helper/objs/html/350.rds
9cbccce90aea834d543f4d3742e96aa9 *tests/_helper/objs/html/400.rds
0aa5e953e2838d65352d867d456376e0 *tests/_helper/objs/limit/100.rds
90b9ae355f42a20f7839b2629fa05291 *tests/_helper/objs/limit/1000.rds
d1933542e8052ee44bb0b2ead7dfb387 *tests/_helper/objs/limit/1100.rds
0df077631f4d4c6b14d1479eb952aff5 *tests/_helper/objs/limit/1200.rds
c880da027a5854880a5cc29bb4a0865e *tests/_helper/objs/limit/1300.rds
dbe2410e259a09082784d9180d0e19cb *tests/_helper/objs/limit/200.rds
a95ecffebee4e50cd16d704cf3e3908b *tests/_helper/objs/limit/300.rds
4aaadb0768c247d2e5b10da4c4c9f537 *tests/_helper/objs/limit/500.rds
602b2ac1e4d2c6b523a9b9cf72ca530d *tests/_helper/objs/limit/600.rds
f939bf7d26cef1e59f088084433c10b0 *tests/_helper/objs/limit/700.rds
fb49940c1563c0d7e98eaf2d8acf1655 *tests/_helper/objs/limit/800.rds
df2834bd7e775c21b6c92e385e31d8f4 *tests/_helper/objs/limit/900.rds
162e195ad74eb4044591f87178a884da *tests/_helper/objs/methods/100.rds
a657fc69520a33fd5d0f7a1438b54a73 *tests/_helper/objs/methods/200.rds
ae0c8d3d54ce94d5ae3a529d2f07155f *tests/_helper/objs/pager/100.txt
8ee3d97210bc7915cc9b2dc52381f267 *tests/_helper/objs/pager/200.txt
d41d8cd98f00b204e9800998ecf8427e *tests/_helper/objs/pager/300.txt
d911852c9676e4833578c3bd5f385842 *tests/_helper/objs/style/100.rds
58d7a8816b2b516182879bf23bce19cf *tests/_helper/objs/style/200.rds
b02d528d5d7449feba4d502d68963f5d *tests/_helper/objs/style/300.rds
e78829dfa9acd41cc85022fbddf477f4 *tests/_helper/objs/style/400.rds
784e2b466f53d531e6bcaf9f7f834065 *tests/_helper/objs/style/500.rds
8ae83e9eaf20da0f0cca68fc184226d5 *tests/_helper/objs/summary/100.rds
98dd5589b90754b1c99677b156856280 *tests/_helper/objs/summary/100.txt
963be56fab7538f72248b353af820d9d *tests/_helper/objs/summary/200.rds
54b7fa3e99c02528b6455cf26004d424 *tests/_helper/objs/summary/300.rds
881a4f2d94f2799a11d97b2982fae9de *tests/_helper/objs/summary/400.rds
b8b7c6f80f1df44b42f5be1d379a8d67 *tests/_helper/objs/summary/450.rds
bfa07e2f8ba3c984ded34f47cc4f6441 *tests/_helper/objs/summary/500.rds
2022c580b6d6f555a7ed9e4d4c5d7eef *tests/_helper/objs/summary/600.rds
03683967ab30bcbeb1ab014d91d195ef *tests/_helper/objs/summary/700.rds
e828109ffc70f1d2767db1fa4b40fbdf *tests/_helper/objs/summary/800.rds
466531c9c8c58319d8a123c24cfcfc65 *tests/_helper/objs/summary/900.rds
bffab9bbb20319b1f24862c359640af6 *tests/_helper/objs/trim/100.rds
ab2c9cb8084ef78d1a4ab576f6ef3b77 *tests/_helper/objs/trim/200.rds
720332fc5f81d72cc6cbbd4b01c00bb0 *tests/_helper/objs/trim/300.rds
24167a5223c4095addff43e415899e5d *tests/_helper/objs/trim/50.rds
ef70f577c0ec4ce7513db0b6d5b78e84 *tests/_helper/string-gen.R
92e492f2f93d7e15ad4960863a1950ef *tests/_helper/tools.R
2d807b1c69afa0e80a300abe995341ad *tests/test-atomic.R
5c0756c1d64e779ef985ba6c38129cf2 *tests/test-atomic.Rout.save
2ce53858a1bd4a6affbcc1df5e570979 *tests/test-banner.R
9cdfa96b1f9db77c67bd8f664f479023 *tests/test-banner.Rout.save
ef6090e6b2bfd16295bc1e845d90a288 *tests/test-capture.R
f68fb6459e1b62f9f46dbcf0eade3268 *tests/test-capture.Rout.save
ff8ee6e20161d2b84c58664bea28f8b6 *tests/test-check.R
b052b4296bda795c96bb095752bbe502 *tests/test-check.Rout.save
a0f50f1b2cfd4db3fcdd137e07086803 *tests/test-context.R
033a46f2ae9717814bfc9d0cec370741 *tests/test-context.Rout.save
73ee13c582aeb67c67fd9697ce84f3e3 *tests/test-core.R
3c48c41748b057fbdf006acb2bee7b42 *tests/test-core.Rout.save
7e999d1dfc77cca6dbd57ccf7c5799cc *tests/test-diffChr.R
ce4f5f68765f0ac0239c9d31a4040afa *tests/test-diffChr.Rout.save
df8289564d876b0c3544b5900aae98ba *tests/test-diffDeparse.R
1e9da1500d09c228f6f8c5a0173ebb10 *tests/test-diffDeparse.Rout.save
8eb7cb812f16635b1d9fda98d81ce596 *tests/test-diffObj.R
0623d086cf584a66a56fe2db0c41c9e8 *tests/test-diffObj.Rout.save
4dcb80e502cec9b8ebd7b6a46c027a39 *tests/test-diffPrint.R
d0e9ff2ee8d3ba2e5d639fca96f91197 *tests/test-diffPrint.Rout.save
5a12fff5b0fb6fc0439da47db53fac57 *tests/test-diffStr.R
a2436bf9d2e0ce7df4311931d35c6ae0 *tests/test-diffStr.Rout.save
e0c634149591adda376e29362867b5b4 *tests/test-file.R
af3c4263118fd5b4aecb6cdec3c41138 *tests/test-file.Rout.save
6ef0da46f74a8658f1ed61e32ff98ac3 *tests/test-guide.R
dca94e0dea10450d1050a68d78cc3144 *tests/test-guide.Rout.save
57c322b182dd58a402335ab224d33357 *tests/test-html.R
58e1f53c1f5a7c7e10a20ce7d99fbefe *tests/test-html.Rout.save
e798365dcb28b3eb7bce0764238c8101 *tests/test-limit.R
1c6fbd12c51b2179ac5b1af7331f58d9 *tests/test-limit.Rout.save
92ad4b7e9a893b0e7a9a47a6b9b29271 *tests/test-methods.R
4b3fd6edfdda6ab6e6ea285fa9d98d5f *tests/test-methods.Rout.save
1af2ac3b1e7da77bad705910d328f044 *tests/test-misc.R
3e647672e0516648240e095e55d17eed *tests/test-misc.Rout.save
00ac0f315eb6d3deff28ab18020d7c78 *tests/test-pager.R
e7448b9e0027d9d279d9c9c6900eaccb *tests/test-pager.Rout.save
3aeae69d552f7a1e274964a36f4d3bee *tests/test-rdiff.R
d0677e8a86888c95f451d94122b3c47d *tests/test-rdiff.Rout.save
08c39fb1a5cce37a44c6dc54950a550e *tests/test-s4.R
1c19f34b17256c22ff72ed6c023cb983 *tests/test-s4.Rout.save
6be2b47afa8da5a82e686f67b7804303 *tests/test-scaling.R
fe59891c59b0fce386c7b8e33b032254 *tests/test-ses.R
b41dfd88eb631428064a5267e0688ea0 *tests/test-ses.Rout.save
8768717a5bb91a3743a1a5a380705b78 *tests/test-style.R
01ca69030e0ccb2295c0671d3b0ef4b5 *tests/test-style.Rout.save
940afbdf7933110c7ad5ea55dd9b502b *tests/test-subset.R
3cb6f6da3d85d9e3ef229f5572fc3558 *tests/test-subset.Rout.save
a9259fccca78e0fcab91f4608eb0f6f2 *tests/test-summary.R
e54a8cb65318104247f4d0fecc1d80a7 *tests/test-summary.Rout.save
252573b38413d9a359a49c3d3c03bfb3 *tests/test-text.R
bb8aa18d264f5091a3f505faad0e81e6 *tests/test-text.Rout.save
baea5e403d5277f053053f6a5b7a187d *tests/test-trim.R
dfb718afbd95a673de8a76e7f694c8ab *tests/test-trim.Rout.save
c7d3d4ff89fc984b06596636fab7d469 *tests/test-warnings.R
aae395c15d5d4fa004b7b836da260896 *tests/test-warnings.Rout.save
498f5f1c22d785221998c067e58cc456 *tests/valgrind/mdl-cur-all.txt
3ebdc9637d25ba9ada7fb03ea757a42e *tests/valgrind/mdl-cur.txt
190d0593d9f845f0afa81a7c23789387 *tests/valgrind/mdl-tar-all.txt
8478afde8a5625a16f8f3118ecec2219 *tests/valgrind/mdl-tar.txt
ddad15fc1b4b67a5b3e6cb7e520c2860 *tests/valgrind/tests-valgrind.R
b98a4202bca9af7409da6a63e0622f5d *tests/zz-test-check.R
b2d7b4d93f4eab544203b2d4160e0583 *vignettes/ansi256brightness.png
c50e37261653c146d7542dafab786625 *vignettes/diffobj.Rmd
9e93047aa59973c6475724b83fc6c321 *vignettes/embed.Rmd
53121e5e594fc9814e2f89ebfeeccf28 *vignettes/styles.css
diffobj/inst/0000755000176200001440000000000014126723607012642 5ustar  liggesusersdiffobj/inst/doc/0000755000176200001440000000000014126723607013407 5ustar  liggesusersdiffobj/inst/doc/diffobj.R0000644000176200001440000000563514126723606015145 0ustar  liggesusers## ---- echo=FALSE--------------------------------------------------------------
library(diffobj)
old.opt <- options(
  diffobj.disp.width=80, diffobj.pager="off", diffobj.format="html"
)

## ---- results="asis"----------------------------------------------------------
a <- b <- matrix(1:100, ncol=2)
a <- a[-20,]
b <- b[-45,]
b[c(18, 44)] <- 999
diffPrint(target=a, current=b)

## ---- results="asis", echo=FALSE----------------------------------------------
diffPrint(target=a, current=b)[1]

## ---- results="asis", echo=FALSE----------------------------------------------
diffPrint(target=a, current=b)[2:10]

## ---- results="asis", echo=FALSE----------------------------------------------
diffPrint(target=a, current=b)[3]

## ---- results="asis", echo=FALSE----------------------------------------------
diffPrint(target=a, current=b)[6:9]

## ---- results="asis", echo=FALSE----------------------------------------------
diffPrint(target=a, current=b)[8:9]

## ---- results="asis"----------------------------------------------------------
state.abb2 <- state.abb[-16]
state.abb2[37] <- "Pennsylvania"
diffPrint(state.abb, state.abb2)

## ---- results="asis"----------------------------------------------------------
mdl1 <- lm(Sepal.Length ~ Sepal.Width, iris)
mdl2 <- lm(Sepal.Length ~ Sepal.Width + Species, iris)
diffStr(mdl1$qr, mdl2$qr, line.limit=15)

## ---- results="asis"----------------------------------------------------------
diffChr(letters[1:3], c("a", "B", "c"))

## ---- eval=FALSE--------------------------------------------------------------
#  x <- diffPrint(letters, LETTERS)
#  x   # or equivalently: `show(x)`

## ---- results="asis"----------------------------------------------------------
summary(diffStr(mdl1, mdl2))

## ---- results="asis", eval=FALSE----------------------------------------------
#  x <- y <- letters[24:26]
#  y[2] <- "GREMLINS"
#  diffChr(x, y)

## ---- results="asis", echo=FALSE----------------------------------------------
x <- y <- letters[24:26]
y[2] <- "GREMLINS"
diffChr(x, y, mode="sidebyside")

## ---- results="asis", echo=FALSE----------------------------------------------
x <- y <- letters[24:26]
y[2] <- "GREMLINS"
diffChr(x, y, mode="unified")

## ---- results="asis", echo=FALSE----------------------------------------------
x <- y <- letters[24:26]
y[2] <- "GREMLINS"
diffChr(x, y, mode="context")

## ---- results="asis"----------------------------------------------------------
diffChr(x, y, color.mode="rgb")

## ---- eval=FALSE--------------------------------------------------------------
#  v1 <- 1:5e4
#  v2 <- v1[-sample(v1, 100)]
#  diffChr(v1, v2, word.diff=FALSE)

## ---- eval=FALSE--------------------------------------------------------------
#  diffPrint(v1, v2)

## -----------------------------------------------------------------------------
ses(letters[1:5], letters[c(2:3, 5)])

## ---- echo=FALSE--------------------------------------------------------------
options(old.opt)

diffobj/inst/doc/embed.html0000644000176200001440000007767614126723607015400 0ustar  liggesusers














Embed Diffs in R Markdown Or Shiny

Brodie Gaslam

Rmarkdown

Basic Requirements

Any R chunks that produce diffs should include the results='asis' option, e.g.:

```{r, comment="", results="asis"}
# R code here
```

Embedded CSS

This is what a basic code block should look like:

```{r, comment="", results="asis"}
cat(                                 # output to screen
  as.character(                      # convert diff to character vector
    diffPrint(                       # run diff
      1:5, 2:6,
      format="html",                 # specify html output
      style=list(
        html.output="diff.w.style"   # configure html style
      )
) ) )
```

Here we use this same code as an actual markdown R code block:

@@ 1 @@
@@ 1 @@
<
[1] 1 2 3 4 5
>
[1] 2 3 4 5 6

This is an ugly implementation because it produces illegal HTML. The styles are directly embedded in the body of the document, outside of the HEAD tags. Although this is illegal HTML, it seems to work in most browsers. Another problem is that every diff you use in your document will inject the same CSS code over and over.

External CSS

A better option is to provide the CSS directly by modifying the output portion of the YAML header:

---
output:
    rmarkdown::html_vignette:
        toc: true
        css: !expr diffobj::diffobj_css()
---

In reality you will probably want to specify multiple CSS files, including the original rmarkdown one:

---
output:
    rmarkdown::html_vignette:
        toc: true
        css:
          - !expr diffobj::diffobj_css()
          - !expr system.file("rmarkdown", "templates", "html_vignette", "resources", "vignette.css", package = "rmarkdown")
---

Once you set this up then you can use:
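The chunk this sentence refers to is not visible in this rendering. Going by the vignette source, it is the earlier call with `html.output` switched to `"diff.only"`, and it should sit in a chunk with `results='asis'`. The variable name `html` is only there for illustration:

```r
library(diffobj)

# Same diff as before, but emit only the diff markup: the CSS is expected
# to come from the YAML `css:` entry rather than from the document body.
html <- as.character(
  diffPrint(
    1:5, 2:6,
    format="html",
    style=list(html.output="diff.only")   # notice this changed
  )
)
cat(html)
```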

@@ 1 @@
@@ 1 @@
<
[1] 1 2 3 4 5
>
[1] 2 3 4 5 6

This will omit the CSS, but since we include it via the YAML everything should work as expected.

Use Options

Almost all diffobj parameters can be specified via options:
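The options call is missing from this rendering. A sketch based on the vignette source, which also keeps the previous values so they can be restored later (`old.opts` is the name the source uses for that):

```r
library(diffobj)

# Set the defaults once; `options()` returns the previous values.
old.opts <- options(
  diffobj.format="html",
  diffobj.style=list(html.output="diff.only")
)
# When you are done, `options(old.opts)` puts the previous values back.
```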

Then you can just run the diff as normal:

@@ 1 @@
@@ 1 @@
<
[1] 1 2 3 4 5
>
[1] 2 3 4 5 6

Shiny

Shiny usage is very similar to rmarkdown. In both cases we want to get diffobj to produce HTML output to embed in our document. If we are willing to embed the CSS with each diff, we can use:
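The app code is not shown in this rendering. The sketch below follows the vignette source, with one change that is ours: the app is assigned to `app` so that running the snippet as a script does not launch it.

```r
library(shiny)
library(diffobj)

app <- shinyApp(
  ui=fluidPage(htmlOutput('diffobj_element')),
  server=function(input, output) {
    output$diffobj_element <- renderUI({
      HTML(
        as.character(
          diffPrint(
            1:5, 2:6,
            format="html",
            # embed the CSS together with the diff markup
            style=list(html.output="diff.w.style")
          )
        )
      )
    })
  }
)
# runApp(app) starts the app interactively
```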

If we have many diffs, it may be preferable to use options and an external style sheet:
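Again the code is missing from this rendering. A sketch based on the vignette source: the options set the defaults once, and `includeCSS(diffobj_css())` pulls in the style sheet shipped with the package. Assigning the app to `app` is our addition so the snippet does not launch it:

```r
library(shiny)
library(diffobj)

options(
  diffobj.format="html",
  diffobj.style=list(html.output="diff.only")
)
app <- shinyApp(
  ui=fluidPage(
    includeCSS(diffobj_css()),      # style sheet included once for all diffs
    htmlOutput('diffobj_element')
  ),
  server=function(input, output) {
    output$diffobj_element <- renderUI({
      HTML(as.character(diffPrint(1:5, 2:6)))
    })
  }
)
# runApp(app) starts the app interactively
```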

Unlike with our rmarkdown example, this CSS is included in the body of the HTML document instead of in the header, so it is technically illegal like in our embedded css example.

diffobj/inst/doc/embed.Rmd0000755000176200001440000001013213420351310015107 0ustar liggesusers
---
title: "Embed Diffs in R Markdown Or Shiny"
author: "Brodie Gaslam"
output:
    rmarkdown::html_vignette:
        toc: true
        css:
          - !expr diffobj::diffobj_css()
          - styles.css
vignette: >
  %\VignetteIndexEntry{Embed Diffs in R Markdown Or Shiny}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

```{r echo=FALSE}
library(diffobj)
```

## Rmarkdown

### Basic Requirements

Any R chunks that produce diffs should include the `results='asis'` option, e.g.:

````
```{r, comment="", results="asis"}`r ''`
# R code here
```
````

### Embedded CSS

This is what a basic code block should look like:

````
```{r, comment="", results="asis"}`r ''`
cat(                                 # output to screen
  as.character(                      # convert diff to character vector
    diffPrint(                       # run diff
      1:5, 2:6,
      format="html",                 # specify html output
      style=list(
        html.output="diff.w.style"   # configure html style
      )
) ) )
```
````

Here we use this same code as an actual markdown R code block:

```{r results='asis'}
cat(
  as.character(
    diffPrint(
      1:5, 2:6,
      format="html",
      style=list(html.output="diff.w.style")
) ) )
```

This is an ugly implementation because it produces illegal HTML. The styles are directly embedded in the body of the document, outside of the HEAD tags. Although this is illegal HTML, it seems to work in most browsers. Another problem is that every diff you use in your document will inject the same CSS code over and over.

### External CSS

A better option is to provide the CSS directly by modifying the `output` portion of the [YAML header](https://bookdown.org/yihui/rmarkdown/r-package-vignette.html):

```
---
output:
    rmarkdown::html_vignette:
        toc: true
        css: !expr diffobj::diffobj_css()
---
```

In reality you will probably want to specify multiple CSS files, including the original `rmarkdown` one:

```
---
output:
    rmarkdown::html_vignette:
        toc: true
        css:
          - !expr diffobj::diffobj_css()
          - !expr system.file("rmarkdown", "templates", "html_vignette", "resources", "vignette.css", package = "rmarkdown")
---
```

Once you set this up then you can use:

```{r results='asis'}
cat(
  as.character(
    diffPrint(
      1:5, 2:6,
      format="html",
      style=list(html.output="diff.only")   # notice this changed
) ) )
```

This will omit the CSS, but since we include it via the YAML everything should work as expected.

### Use Options

Almost all `diffobj` parameters can be specified via options:

```{r eval=FALSE}
options(
  diffobj.format="html",
  diffobj.style=list(html.output="diff.only")
)
```
```{r echo=FALSE}
old.opts <- options(
  diffobj.format="html",
  diffobj.style=list(html.output="diff.only")
)
```

Then you can just run the diff as normal:

```{r results='asis'}
cat(as.character(diffPrint(1:5, 2:6)))
```
```{r echo=FALSE}
options(old.opts)
```

## Shiny

Shiny usage is very similar to `rmarkdown`. In both cases we want to get `diffobj` to produce HTML output to embed in our document. If we are willing to embed the CSS with each diff, we can use:

```{r, eval=FALSE}
library(shiny)
shinyApp(
  ui=fluidPage(htmlOutput('diffobj_element')),
  server=function(input, output) {
    output$diffobj_element <- renderUI({
      HTML(
        as.character(
          diffPrint(
            1:5, 2:6,
            format="html",
            style=list(html.output="diff.w.style")
    ) ) )}) }
)
```

If we have many diffs, it may be preferable to use options and an external style sheet:

```{r, eval=FALSE}
options(
  diffobj.format="html",
  diffobj.style=list(html.output="diff.only")
)
shinyApp(
  ui=fluidPage(
    includeCSS(diffobj_css()),
    htmlOutput('diffobj_element')
  ),
  server=function(input, output) {
    output$diffobj_element <- renderUI({
      HTML(as.character(diffPrint(1:5, 2:6)))
    }) }
)
```

Unlike with our [rmarkdown example](#external-css), this CSS is included in the body of the HTML document instead of in the header, so it is technically illegal like in our [embedded css example](#embedded-css).

diffobj/inst/doc/embed.R0000644000176200001440000000365714126723607014621 0ustar liggesusers
## ----echo=FALSE---------------------------------------------------------------
library(diffobj)

## ----results='asis'-----------------------------------------------------------
cat(
  as.character(
    diffPrint(
      1:5, 2:6,
      format="html",
      style=list(html.output="diff.w.style")
) ) )

## ----results='asis'-----------------------------------------------------------
cat(
  as.character(
    diffPrint(
      1:5, 2:6,
      format="html",
      style=list(html.output="diff.only")   # notice this changed
) ) )

## ----eval=FALSE---------------------------------------------------------------
#  options(
#    diffobj.format="html",
#    diffobj.style=list(html.output="diff.only")
#  )

## ----echo=FALSE---------------------------------------------------------------
old.opts <- options(
  diffobj.format="html",
  diffobj.style=list(html.output="diff.only")
)

## ----results='asis'-----------------------------------------------------------
cat(as.character(diffPrint(1:5, 2:6)))

## ----echo=FALSE---------------------------------------------------------------
options(old.opts)

## ---- eval=FALSE--------------------------------------------------------------
#  library(shiny)
#  shinyApp(
#    ui=fluidPage(htmlOutput('diffobj_element')),
#    server=function(input, output) {
#      output$diffobj_element <- renderUI({
#        HTML(
#          as.character(
#            diffPrint(
#              1:5, 2:6,
#              format="html",
#              style=list(html.output="diff.w.style")
#      ) ) )}) }
#  )

## ---- eval=FALSE--------------------------------------------------------------
#  options(
#    diffobj.format="html",
#    diffobj.style=list(html.output="diff.only")
#  )
#  shinyApp(
#    ui=fluidPage(
#      includeCSS(diffobj_css()),
#      htmlOutput('diffobj_element')
#    ),
#    server=function(input, output) {
#      output$diffobj_element <- renderUI({
#        HTML(as.character(diffPrint(1:5, 2:6)))
#      }) }
#  )

diffobj/inst/doc/diffobj.html0000644000176200001440000046140114126723606015705 0ustar liggesusers
diffobj - Diffs for R Objects

diffobj - Diffs for R Objects

Brodie Gaslam

Introduction

diffobj uses the same comparison mechanism used by git diff and diff to highlight differences between rendered R objects:

@@ 17,6 @@
@@ 17,7 @@
~
[,1] [,2]
~
[,1] [,2]
 
[16,] 16 66
 
[16,] 16 66
 
[17,] 17 67
 
[17,] 17 67
<
[18,] 18 68
>
[18,] 999 68
 
[19,] 19 69
 
[19,] 19 69
~
>
[20,] 20 70
 
[20,] 21 71
 
[21,] 21 71
 
[21,] 22 72
 
[22,] 22 72
@@ 42,6 @@
@@ 43,5 @@
 
[41,] 42 92
 
[42,] 42 92
 
[42,] 43 93
 
[43,] 43 93
<
[43,] 44 94
>
[44,] 999 94
<
[44,] 45 95
~
 
[45,] 46 96
 
[45,] 46 96
 
[46,] 47 97
 
[46,] 47 97

diffobj comparisons work best when objects have some similarities, or when they are relatively small. The package was originally developed to help diagnose failed unit tests by comparing test results to reference objects in a human-friendly manner.

If your terminal supports formatting through ANSI escape sequences, diffobj will output colored diffs to the terminal. If not, it will output colored diffs to your IDE viewport if it is supported, or to your browser otherwise.

Interpreting Diffs

Shortest Edit Script

The output from diffobj is a visual representation of the Shortest Edit Script (SES). An SES is the shortest set of deletion and insertion instructions for converting one sequence of elements into another. In our case, the elements are lines of text. We encode the instructions to convert a to b by deleting lines from a (in yellow) and inserting new ones from b (in blue).

Diff Structure

The first line of our diff output acts as a legend to the diff by associating the colors and symbols used to represent differences present in each object with the name of the object:

After the legend come the hunks, which are portions of the objects that have differences with nearby matching lines provided for context:

@@ 17,6 @@
@@ 17,7 @@
~
[,1] [,2]
~
[,1] [,2]
 
[16,] 16 66
 
[16,] 16 66
 
[17,] 17 67
 
[17,] 17 67
<
[18,] 18 68
>
[18,] 999 68
 
[19,] 19 69
 
[19,] 19 69
~
>
[20,] 20 70
 
[20,] 21 71
 
[21,] 21 71
 
[21,] 22 72
 
[22,] 22 72

At the top of the hunk is the hunk header: this tells us that the first displayed hunk (including context lines) starts at line 17 and spans 6 lines for a and 7 for b. These are display lines, not object row indices, which is why the first row shown of the matrix is row 16. You might have also noticed that the line after the hunk header is out of place:

~
[,1] [,2]
~
[,1] [,2]

This is a special context line that is not technically part of the hunk, but is shown nonetheless because it is useful in helping understand the data. The line is styled differently to highlight that it is not part of the hunk. Since it is not part of the hunk, it is not accounted for in the hunk header. See ?guides for more details.

The actual mismatched lines are highlighted in the colors of the legend, with additional visual cues in the gutters:

<
[18,] 18 68
>
[18,] 999 68
 
[19,] 19 69
 
[19,] 19 69
~
>
[20,] 20 70
 
[20,] 21 71
 
[21,] 21 71

diffobj uses a line by line diff to identify which portions of each of the objects are mismatches, so even if only part of a line mismatches it will be considered different. diffobj then runs a word diff within the hunks and further highlights mismatching words.

Let’s examine the last two lines from the previous hunk more closely:

~
>
[20,] 20 70
 
[20,] 21 71
 
[21,] 21 71

Here b has an extra line so diffobj adds an empty line to a to maintain the alignment for subsequent matching lines. This additional line is marked with a tilde in the gutter and is shown in a different color to indicate it is not part of the original text.

If you look closely at the next matching line you will notice that the a and b values are not exactly the same. The row indices are different, but diffobj excludes row indices from the diff so that rows that are identical otherwise are shown as matching. diffobj indicates this is happening by showing the portions of a line that are ignored in the diff in grey.

See ?guides and ?trim for details and limitations on guideline detection and non-semantic metadata trimming.

Atomic Vectors

Since R can display multiple elements in an atomic vector on the same line, and diffPrint is fundamentally a line diff, we use specialized logic when diffing atomic vectors. Consider:

@@ 1,5 @@
@@ 6,5 @@
 
[1] "AL" "AK" "AZ" "AR" "CA" "CO"
 
[11] "HI" "ID"
 
[7] "CT" "DE" "FL" "GA" "HI" "ID"
 
[13] "IL" "IN"
<
[13] "IL" "IN" "IA" "KS" "KY" "LA"
>
[15] "IA" "KY"
 
[19] "ME" "MD" "MA" "MI" "MN" "MS"
 
[17] "LA" "ME"
 
[25] "MO" "MT" "NE" "NV" "NH" "NJ"
 
[19] "MD" "MA"
@@ 6,4 @@
@@ 17,5 @@
~
 
 
[33] "ND" "OH"
 
[31] "NM" "NY" "NC" "ND" "OH" "OK"
 
[35] "OK" "OR"
<
[37] "OR" "PA" "RI" "SC" "SD" "TN"
>
[37] "Pennsylvania" "RI"
 
[43] "TX" "UT" "VT" "VA" "WA" "WV"
 
[39] "SC" "SD"
 
[49] "WI" "WY"
 
[41] "TN" "TX"

Due to the different wrapping frequency no line in the text display of our two vectors matches. Despite this, diffPrint only highlights the lines that actually contain differences. The side effect is that lines that only contain matching elements are shown as matching even though the actual lines may be different. You can turn off this behavior in favor of a normal line diff with the unwrap.atomic argument to diffPrint.
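As a small sketch of that switch, the call below reuses the vectors from the example and asks for a plain line diff. `pager="off"` is added only so the snippet prints directly, and the name `plain` is ours:

```r
library(diffobj)

state.abb2 <- state.abb[-16]
state.abb2[37] <- "Pennsylvania"

# Ordinary line diff of the printed vectors, without the special
# handling of wrapped atomic vectors
plain <- diffPrint(state.abb, state.abb2, unwrap.atomic=FALSE, pager="off")
plain
```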

Currently this only works for unnamed vectors, and even for them some inputs may produce sub-optimal results. Nested vectors inside lists will not be unwrapped. You can also use diffChr (see below) to do a direct element by element comparison.

Other Diff Functions

Method Overview

diffobj defines several S4 generics and default methods to go along with them. Each of them uses a different text representation of the inputs:

  • diffPrint: uses the print/show output; this is the method used in the examples so far
  • diffStr: uses the output of str
  • diffObj: picks between print/show and str depending on which provides the “best” overview of differences
  • diffChr: coerces the inputs to atomic character vectors with as.character, and runs the diff on the character vector
  • diffFile: compares the text content of two files
  • diffCsv: loads two CSV files into data frames and compares the data frames with diffPrint
  • diffDeparse: deparses and compares the character vectors produced by the deparsing
  • ses: computes the element by element shortest edit script on two character vectors

Note the diff* functions use lowerCamelCase in keeping with S4 method name convention, whereas the package name itself is all lower case.

Compare Structure with diffStr

For complex objects it is often useful to compare structures:

@@ 1,9 @@
@@ 1,10 @@
 
List of 5
 
List of 5
<
$ qr : num [1:150, 1:2] -12.2474 0.0816 0.0816 0.0816 0.0816 ...
>
$ qr : num [1:150, 1:4] -12.2474 0.0816 0.0816 0.0816 0.0816 ...
 
..- attr(*, "dimnames")=List of 2
 
..- attr(*, "dimnames")=List of 2
<
..- attr(*, "assign")= int [1:2] 0 1
>
..- attr(*, "assign")= int [1:4] 0 1 2 2
~
>
..- attr(*, "contrasts")=List of 1
<
$ qraux: num [1:2] 1.08 1.02
>
$ qraux: num [1:4] 1.08 1.02 1.05 1.11
<
$ pivot: int [1:2] 1 2
>
$ pivot: int [1:4] 1 2 3 4
 
$ tol : num 1e-07
 
$ tol : num 1e-07
<
$ rank : int 2
>
$ rank : int 4
 
- attr(*, "class")= chr "qr"
 
- attr(*, "class")= chr "qr"
3 differences are hidden by our use of `max.level`

If you specify a line.limit with diffStr it will fold nested levels in order to fit under line.limit so long as there remain visible differences. If you prefer to see all the differences you can leave line.limit unspecified.

Compare Vector Elements with diffChr

Sometimes it is useful to do a direct element by element comparison:

@@ 1,3 @@
@@ 1,3 @@
 
a
 
a
<
b
>
B
 
c
 
c

Notice how we are comparing the contents of the vectors with one line per element.

Why S4?

The diff* functions are defined as S4 generics with default methods (signature c("ANY", "ANY")) so that users can customize behavior for their own objects. For example, a custom method could set many of the default parameters to values more suitable for a particular object. If the objects in question are S3 objects the S3 class will have to be registered with setOldClass.

Return Value

All the diff* methods return a Diff S4 object. It has a show method which is responsible for rendering the Diff and displaying it to the screen. Because of this you can compute and render diffs in two steps:
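The two steps are not shown in this rendering; in the vignette source they look like this:

```r
library(diffobj)

x <- diffPrint(letters, LETTERS)   # compute the diff; nothing is displayed yet
x                                  # render it; equivalent to `show(x)`
```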

This may cause the diff to render funny if you change screen widths, etc., between the two steps.

There are also summary, any, and as.character methods. The summary method provides a high level overview of where the differences are, which can be helpful for large diffs:
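The call behind the summary is missing from this rendering. Based on the vignette source it compares the two `lm` fits used earlier; the intermediate name `d` is ours:

```r
library(diffobj)

mdl1 <- lm(Sepal.Length ~ Sepal.Width, iris)
mdl2 <- lm(Sepal.Length ~ Sepal.Width + Species, iris)

d <- diffStr(mdl1, mdl2)   # structure diff of the two model objects
summary(d)                 # overview of where the differences are
```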

Found differences in 12 hunks:
45 insertions, 39 deletions, 18 matches (lines)

Diff map (line:char scale is 1:1 for single chars, 1-2:1 for char seqs):
DDDIII.DDDIII.DI.DI..DDDIIIII.DI.DDDDDIIIIIII.DDDIII..DDDIII..DDIII.DDDIII..DDII.

any returns TRUE if there are differences, and as.character returns the character representation of the diff.
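A minimal sketch of both methods, using hypothetical variable names (`same`, `changed`, `txt`):

```r
library(diffobj)

same    <- diffChr(letters[1:3], letters[1:3])
changed <- diffChr(letters[1:3], c("a", "B", "c"))

any(same)      # FALSE: the inputs are identical
any(changed)   # TRUE: the second element differs

# Character representation of the rendered diff, e.g. to write to a file
txt <- as.character(changed)
```

Testing `any()` first is a cheap way to decide whether a diff is worth rendering at all.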

Controlling Diffs and Their Appearance

Parameters

The diff* family of methods has an extensive set of parameters that allow you to fine tune how the diff is applied and displayed. We will review some of the major ones in this section. For a full description see ?diffPrint.

While the parameter list is extensive, only the objects being compared are required. All the other parameters have default values, and most of them are for advanced use only. The defaults can all be adjusted via the diffobj.* options.

Display Mode

There are three built-in display modes that are similar to those found in GNU diff: “sidebyside”, “unified”, and “context”. For example, by varying the mode parameter with:
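The calls are not shown in this rendering; the vignette source uses the following inputs and varies only `mode`:

```r
library(diffobj)

x <- y <- letters[24:26]
y[2] <- "GREMLINS"

diffChr(x, y, mode="sidebyside")
diffChr(x, y, mode="unified")
diffChr(x, y, mode="context")
```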

we get:

mode=“sidebyside” mode=“unified” mode=“context”
@@ 1,3 @@
@@ 1,3 @@
 
x
 
x
<
y
>
GREMLINS
 
z
 
z
@@ 1,3 / 1,3 @@
 
x
<
y
>
GREMLINS
 
z
@@ 1,3 / 1,3 @@
 
x
<
y
 
z
~
----------
 
x
>
GREMLINS
 
z

By default diffobj will try to use mode="sidebyside" if reasonable given display width, and otherwise will switch to mode="unified". You can always force a particular display style by specifying it with the mode argument.

Color Mode

The default color mode uses yellow and blue to symbolize deletions and insertions for accessibility to dichromats. If you prefer the more traditional color mode you can specify color.mode="rgb" in the parameter list, or use options(diffobj.color.mode="rgb"):
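In code, reusing the inputs from the display mode example, the per-call form from the vignette source is:

```r
library(diffobj)

x <- y <- letters[24:26]
y[2] <- "GREMLINS"

# Per-call override; `options(diffobj.color.mode="rgb")` would set it globally
rgb.diff <- diffChr(x, y, color.mode="rgb")
rgb.diff
```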

@@ 1,3 @@
@@ 1,3 @@
 
x
 
x
<
y
>
GREMLINS
 
z
 
z

Output Formats

If your terminal supports it diffobj will format the output with ANSI escape sequences. diffobj uses Gábor Csárdi’s crayon package to detect ANSI support and to apply ANSI based formatting. If you are using RStudio or another IDE that supports getOption("viewer"), diffobj will output an HTML/CSS formatted diff to the viewport. In other terminals that do not support ANSI colors, diffobj will attempt to output an HTML/CSS formatted diff to your browser using browseURL.

You can explicitly specify the output format with the format parameter:

  • format="raw" for unformatted diffs
  • format="ansi8" for standard ANSI 8 color formatting
  • format="ansi256" for ANSI 256 color formatting
  • format="html" for HTML/CSS output and styling
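As a sketch of an explicit format, the snippet below requests an unformatted diff and converts it to text, which can be handy for logs or plain-text files. The names `raw.diff` and `out` are ours:

```r
library(diffobj)

x <- y <- letters[24:26]
y[2] <- "GREMLINS"

raw.diff <- diffChr(x, y, format="raw")   # no ANSI or HTML styling
out <- as.character(raw.diff)             # plain text lines of the diff
cat(out, sep="\n")
```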

See Pagers for more details.

Brightness

The brightness parameter allows you to pick a color scheme compatible with the background color of your terminal. The options are:

  • “light”: for use with light tone terminals
  • “dark”: for use with dark tone terminals
  • “neutral”: for use with either light or dark terminals

Here are examples of terminal screen renderings for both “rgb” and “yb” color.mode for the three brightness levels.

The examples for “light” and “dark” have the backgrounds forcefully set to a color compatible with the scheme. In actual use the base background and foreground colors are left unchanged, which will look bad if you use “dark” with light colored backgrounds or vice versa. Since we do not know of a good cross platform way of detecting terminal background color the default brightness value is “neutral”.

At this time the only format that is affected by this parameter is “ansi256”. If you want to specify your own light/dark/neutral schemes you may do so either by specifying a style directly or with Palette of Styles.

Pagers

In interactive mode, if the diff output is very long or if your terminal does not support ANSI colors, diff* methods will pipe output to a pager. This is done by writing the output to a temporary file and passing the file reference to the pager. By default the pager is invoked with file.show if your terminal supports ANSI colors and the pager is known to support them as well (as of this writing, only less is assumed to support ANSI colors). Otherwise diffobj uses getOption("viewer") if available (this outputs to the viewport in RStudio), and falls back to browseURL if not.

You can fine tune when, how, and if a pager is used with the pager parameter. See ?diffPrint and ?Pager for more details.
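For instance, paging can be turned off for a single call or for the whole session. The option form is the one the vignette itself sets in its setup chunk; the name `no.pager` is ours:

```r
library(diffobj)

# Per call: print straight to the console, never through a pager
no.pager <- diffPrint(letters, LETTERS, pager="off")
no.pager

# Session wide, keeping the old value so it can be restored afterwards
old.opt <- options(diffobj.pager="off")
options(old.opt)
```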

Styles

You can control almost all aspects of the diff output formatting via the style parameter. To do so, pass an appropriately configured Style object. See ?Style for more details on how to do this.

The default is to auto pick a style based on the values of the format, color.mode, and brightness parameters. This is done by using the computed values for each of those parameters to subset the PaletteOfStyles object passed as the palette.of.styles parameter. This PaletteOfStyles object contains a Style object for all the possible permutations of the format, brightness, and color.mode parameters. See ?PaletteOfStyles.

If you specify the style parameter the values of the format, brightness, and color.mode parameters will be ignored.

Diff Algorithm

The primary diff algorithm is Myers’ solution to the shortest edit script / longest common subsequence problem with the Hirschberg linear space refinement as described in:

E. Myers, “An O(ND) Difference Algorithm and Its Variations”, Algorithmica 1, 2 (1986), 251-266.

and should be the same algorithm used by GNU diff. The implementation used here is a heavily modified version of Michael B. Allen’s diff program from the libmba C library. Any and all bugs in the C code in this package were most likely introduced by yours truly. Please note that the resulting C code is incompatible with the original libmba library.

Performance Considerations

Diff

The diff algorithm scales with the square of the number of differences. For reasonably small diffs (< 10K differences), the diff itself is unlikely to be the bottleneck.

Capture and Processing

Capture of inputs for diffPrint and diffStr, and processing of output for all diff* methods will account for most of the execution time unless you have large numbers of differences. This input and output processing scales mostly linearly with the input size.

You can improve performance somewhat by using diffChr since that skips the capture part, and by turning off word.diff:
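The faster call is missing from this rendering; in the vignette source it is the following (the result name `fast` is ours):

```r
library(diffobj)

v1 <- 1:5e4
v2 <- v1[-sample(v1, 100)]   # drop 100 random elements

# Element by element diff, skipping the capture step and the word diff
fast <- diffChr(v1, v2, word.diff=FALSE)
```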

will be ~2x as fast as:
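The slower call, again taken from the vignette source, with `v1` and `v2` recreated so that the snippet stands alone (the result name `slow` is ours):

```r
library(diffobj)

v1 <- 1:5e4
v2 <- v1[-sample(v1, 100)]

# Captures the printed output of both vectors before diffing
slow <- diffPrint(v1, v2)
```

Wrapping each call in `system.time()` is a simple way to check the ratio on your own machine.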

Note: turning off word.diff when using diffPrint with unnamed atomic vectors can actually slow down the diff because there may well be fewer element by element differences than line differences as displayed. For example, when comparing 1:1e6 to 2:1e6 there is only one element difference, but every line as displayed is different because of the shift. Using word.diff=TRUE (and unwrap.atomic=TRUE) allows diffPrint to compare element by element rather than line by line. diffChr always compares element by element.

Minimal Diff

If you are looking for the fastest possible diff you can use ses and completely bypass most input and output processing. Inputs will be coerced to character if they are not character.
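The call that produces the output shown here is missing from this rendering; in the vignette source it is the following (the name `edits` is ours):

```r
library(diffobj)

# Edit script in GNU diff style: delete element 1, then delete element 4
edits <- ses(letters[1:5], letters[c(2:3, 5)])
edits
```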

## [1] "1d0" "4d2"

This will be 10-20x faster than diffChr, at the cost of less useful output.

diffobj/inst/doc/diffobj.Rmd0000755000176200001440000003734113777704534015502 0ustar liggesusers--- title: "diffobj - Diffs for R Objects" author: "Brodie Gaslam" output: rmarkdown::html_vignette: toc: true css: - !expr diffobj::diffobj_css() - styles.css vignette: > %\VignetteIndexEntry{Introduction to Diffobj} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- ```{r, echo=FALSE} library(diffobj) old.opt <- options( diffobj.disp.width=80, diffobj.pager="off", diffobj.format="html" ) ``` ## Introduction `diffobj` uses the same comparison mechanism used by `git diff` and `diff` to highlight differences between _rendered_ R objects: ```{r, results="asis"} a <- b <- matrix(1:100, ncol=2) a <- a[-20,] b <- b[-45,] b[c(18, 44)] <- 999 diffPrint(target=a, current=b) ``` `diffobj` comparisons work best when objects have some similarities, or when they are relatively small. The package was originally developed to help diagnose failed unit tests by comparing test results to reference objects in a human-friendly manner. If your terminal supports formatting through ANSI escape sequences, `diffobj` will output colored diffs to the terminal. If not, it will output colored diffs to your IDE viewport if it is supported, or to your browser otherwise. ## Interpreting Diffs ### Shortest Edit Script The output from `diffobj` is a visual representation of the Shortest Edit Script (SES). An SES is the shortest set of deletion and insertion instructions for converting one sequence of elements into another. In our case, the elements are lines of text. We encode the instructions to convert `a` to `b` by deleting lines from `a` (in yellow) and inserting new ones from `b` (in blue). 
### Diff Structure The first line of our diff output acts as a legend to the diff by associating the colors and symbols used to represent differences present in each object with the name of the object: ```{r, results="asis", echo=FALSE} diffPrint(target=a, current=b)[1] ``` After the legend come the hunks, which are portions of the objects that have differences with nearby matching lines provided for context: ```{r, results="asis", echo=FALSE} diffPrint(target=a, current=b)[2:10] ``` At the top of the hunk is the hunk header: this tells us that the first displayed hunk (including context lines), starts at line 17 and spans 6 lines for `a` and 7 for `b`. These are display lines, not object row indices, which is why the first row shown of the matrix is row 16. You might have also noticed that the line after the hunk header is out of place: ```{r, results="asis", echo=FALSE} diffPrint(target=a, current=b)[3] ``` This is a special context line that is not technically part of the hunk, but is shown nonetheless because it is useful in helping understand the data. The line is styled differently to highlight that it is not part of the hunk. Since it is not part of the hunk, it is not accounted for in the hunk header. See `?guideLines` for more details. The actual mismatched lines are highlighted in the colors of the legend, with additional visual cues in the gutters: ```{r, results="asis", echo=FALSE} diffPrint(target=a, current=b)[6:9] ``` `diffobj` uses a line by line diff to identify which portions of each of the objects are mismatches, so even if only part of a line mismatches it will be considered different. `diffobj` then runs a word diff within the hunks and further highlights mismatching words. Let's examine the last two lines from the previous hunk more closely: ```{r, results="asis", echo=FALSE} diffPrint(target=a, current=b)[8:9] ``` Here `b` has an extra line so `diffobj` adds an empty line to `a` to maintain the alignment for subsequent matching lines. 
This additional line is marked with a tilde in the gutter and is shown in a different color to indicate it is not part of the original text. If you look closely at the next matching line you will notice that the `a` and `b` values are not exactly the same. The row indices are different, but `diffobj` excludes row indices from the diff so that rows that are identical otherwise are shown as matching. `diffobj` indicates this is happening by showing the portions of a line that are ignored in the diff in grey. See `?guides` and `?trim` for details and limitations on guideline detection and unsemantic meta data trimming. ### Atomic Vectors Since R can display multiple elements in an atomic vector on the same line, and `diffPrint` is fundamentally a line diff, we use specialized logic when diffing atomic vectors. Consider: ```{r, results="asis"} state.abb2 <- state.abb[-16] state.abb2[37] <- "Pennsylvania" diffPrint(state.abb, state.abb2) ``` Due to the different wrapping frequency no line in the text display of our two vectors matches. Despite this, `diffPrint` only highlights the lines that actually contain differences. The side effect is that lines that only contain matching elements are shown as matching even though the actual lines may be different. You can turn off this behavior in favor of a normal line diff with the `unwrap.atomic` argument to `diffPrint`. Currently this only works for _unnamed_ vectors, and even for them some inputs may produce sub-optimal results. Nested vectors inside lists will not be unwrapped. You can also use `diffChr` (see below) to do a direct element by element comparison. ## Other Diff Functions ### Method Overview `diffobj` defines several S4 generics and default methods to go along with them. 
Each of them uses a different text representation of the inputs:

* `diffPrint`: uses the `print`/`show` output; this is the one used in the examples so far
* `diffStr`: uses the output of `str`
* `diffObj`: picks between `print`/`show` and `str` depending on which provides the "best" overview of differences
* `diffChr`: coerces the inputs to atomic character vectors with `as.character`, and runs the diff on the character vector
* `diffFile`: compares the text content of two files
* `diffCsv`: loads two CSV files into data frames and compares the data frames with `diffPrint`
* `diffDeparse`: deparses and compares the character vectors produced by the deparsing
* `ses`: computes the element by element shortest edit script on two character vectors

Note the `diff*` functions use lowerCamelCase in keeping with S4 method name convention, whereas the package name itself is all lower case.

### Compare Structure with `diffStr`

For complex objects it is often useful to compare structures:

```{r, results="asis"}
mdl1 <- lm(Sepal.Length ~ Sepal.Width, iris)
mdl2 <- lm(Sepal.Length ~ Sepal.Width + Species, iris)
diffStr(mdl1$qr, mdl2$qr, line.limit=15)
```

If you specify a `line.limit` with `diffStr` it will fold nested levels in order to fit under `line.limit` so long as there remain visible differences. If you prefer to see all the differences you can leave `line.limit` unspecified.

### Compare Vector Elements with `diffChr`

Sometimes it is useful to do a direct element by element comparison:

```{r, results="asis"}
diffChr(letters[1:3], c("a", "B", "c"))
```

Notice how we are comparing the contents of the vectors with one line per element.

### Why S4?

The `diff*` functions are defined as S4 generics with default methods (signature `c("ANY", "ANY")`) so that users can customize behavior for their own objects. For example, a custom method could set many of the default parameters to values more suitable for a particular object.
If the objects in question are S3 objects the S3 class will have to be registered with `setOldClass`.

### Return Value

All the `diff*` methods return a `Diff` S4 object. It has a `show` method which is responsible for rendering the `Diff` and displaying it to the screen. Because of this you can compute and render diffs in two steps:

```{r, eval=FALSE}
x <- diffPrint(letters, LETTERS)
x   # or equivalently: `show(x)`
```

This may cause the diff to render funny if you change screen widths, etc., between the two steps.

There are also `summary`, `any`, and `as.character` methods. The `summary` method provides a high level overview of where the differences are, which can be helpful for large diffs:

```{r, results="asis"}
summary(diffStr(mdl1, mdl2))
```

`any` returns TRUE if there are differences, and `as.character` returns the character representation of the diff.

## Controlling Diffs and Their Appearance

### Parameters

The `diff*` family of methods has an extensive set of parameters that allow you to fine tune how the diff is applied and displayed. We will review some of the major ones in this section. For a full description see `?diffPrint`.

While the parameter list is extensive, only the objects being compared are required. All the other parameters have default values, and most of them are for advanced use only. The defaults can all be adjusted via the `diffobj.*` options.

### Display Mode

There are three built-in display modes that are similar to those found in GNU `diff`: "sidebyside", "unified", and "context". For example, by varying the `mode` parameter with:

```{r, results="asis", eval=FALSE}
x <- y <- letters[24:26]
y[2] <- "GREMLINS"
diffChr(x, y)
```

we get:
`mode="sidebyside"`, `mode="unified"`, and `mode="context"`, respectively:

```{r, results="asis", echo=FALSE}
x <- y <- letters[24:26]
y[2] <- "GREMLINS"
diffChr(x, y, mode="sidebyside")
```

```{r, results="asis", echo=FALSE}
x <- y <- letters[24:26]
y[2] <- "GREMLINS"
diffChr(x, y, mode="unified")
```

```{r, results="asis", echo=FALSE}
x <- y <- letters[24:26]
y[2] <- "GREMLINS"
diffChr(x, y, mode="context")
```

By default `diffobj` will try to use `mode="sidebyside"` if reasonable given display width, and otherwise will switch to `mode="unified"`. You can always force a particular display style by specifying it with the `mode` argument.

### Color Mode

The default color mode uses yellow and blue to symbolize deletions and insertions for accessibility to dichromats. If you prefer the more traditional color mode you can specify `color.mode="rgb"` in the parameter list, or use `options(diffobj.color.mode="rgb")`:

```{r, results="asis"}
diffChr(x, y, color.mode="rgb")
```

### Output Formats

If your terminal supports it `diffobj` will format the output with ANSI escape sequences. `diffobj` uses Gábor Csárdi's [`crayon`](https://github.com/r-lib/crayon) package to detect ANSI support and to apply ANSI based formatting. If you are using RStudio or another IDE that supports `getOption("viewer")`, `diffobj` will output an HTML/CSS formatted diff to the viewport. In other terminals that do not support ANSI colors, `diffobj` will attempt to output an HTML/CSS formatted diff to your browser using `browseURL`.

You can explicitly specify the output format with the `format` parameter:

* `format="raw"` for unformatted diffs
* `format="ansi8"` for standard ANSI 8 color formatting
* `format="ansi256"` for ANSI 256 color formatting
* `format="html"` for HTML/CSS output and styling

See [Pagers](#pagers) for more details.

### Brightness

The `brightness` parameter allows you to pick a color scheme compatible with the background color of your terminal. The options are:

* "light": for use with light tone terminals
* "dark": for use with dark tone terminals
* "neutral": for use with either light or dark terminals

Here are examples of terminal screen renderings for both "rgb" and "yb" `color.mode` for the three `brightness` levels.

The examples for "light" and "dark" have the backgrounds forcefully set to a color compatible with the scheme.
In actual use the base background and foreground colors are left unchanged, which will look bad if you use "dark" with light colored backgrounds or vice versa. Since we do not know of a good cross platform way of detecting terminal background color the default `brightness` value is "neutral".

At this time the only `format` that is affected by this parameter is "ansi256". If you want to specify your own light/dark/neutral schemes you may do so either by specifying a [style](#styles) directly or with [Palette of Styles](#styles).

### Pagers

In interactive mode, if the diff output is very long or if your terminal does not support ANSI colors, `diff*` methods will pipe output to a pager. This is done by writing the output to a temporary file and passing the file reference to the pager. The default action is to invoke the pager with `file.show` if both your terminal and the pager are known to support ANSI colors (as of this writing, only `less` is assumed to support them). Otherwise `diffobj` uses `getOption("viewer")` if available (this outputs to the viewport in RStudio), and falls back to `browseURL` if it is not.

You can fine tune when, how, and if a pager is used with the `pager` parameter. See `?diffPrint` and `?Pager` for more details.

### Styles

You can control almost all aspects of the diff output formatting via the `style` parameter. To do so, pass an appropriately configured `Style` object. See `?Style` for more details on how to do this.

The default is to auto pick a style based on the values of the `format`, `color.mode`, and `brightness` parameters. This is done by using the computed values for each of those parameters to subset the `PaletteOfStyles` object passed as the `palette.of.styles` parameter. This `PaletteOfStyles` object contains a `Style` object for all the possible permutations of the `format`, `brightness`, and `color.mode` parameters. See `?PaletteOfStyles`.
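
The two routes described above can be sketched as follows. This is only a sketch: it assumes the exported `StyleRaw` constructor can be called with no arguments to obtain a plain `Style` object, which you should confirm in `?Style` before relying on it.

```r
library(diffobj)

x <- y <- letters[24:26]
y[2] <- "GREMLINS"

# Route 1: let diffobj pick the style from `format`, `color.mode`,
# and `brightness`.
diffChr(x, y, format="raw")

# Route 2: hand over a Style object directly.
# Assumption: StyleRaw() with no arguments yields a usable Style object.
raw.style <- StyleRaw()
diffChr(x, y, style=raw.style)
```
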

If you specify the `style` parameter the values of the `format`, `brightness`, and `color.mode` parameters will be ignored.

## Diff Algorithm

The primary diff algorithm is Myers' solution to the shortest edit script / longest common subsequence problem with the Hirschberg linear space refinement as described in:

> E. Myers, "An O(ND) Difference Algorithm and Its Variations", Algorithmica 1, 2 (1986), 251-266.

and should be the same algorithm used by GNU diff. The implementation used here is a heavily modified version of Michael B. Allen's diff program from the [`libmba`](http://www.ioplex.com/~miallen/libmba/dl/libmba-0.9.1.tar.gz) `C` library. Any and all bugs in the C code in this package were most likely introduced by yours truly. Please note that the resulting C code is incompatible with the original `libmba` library.

## Performance Considerations

### Diff

The diff algorithm scales with the _square_ of the number of _differences_. For reasonably small diffs (< 10K differences), the diff itself is unlikely to be the bottleneck.

### Capture and Processing

Capture of inputs for `diffPrint` and `diffStr`, and processing of output for all `diff*` methods will account for most of the execution time unless you have large numbers of differences. This input and output processing scales mostly linearly with the input size.

You can improve performance somewhat by using `diffChr` since that skips the capture part, and by turning off `word.diff`:

```{r, eval=FALSE}
v1 <- 1:5e4
v2 <- v1[-sample(v1, 100)]
diffChr(v1, v2, word.diff=FALSE)
```

will be ~2x as fast as:

```{r, eval=FALSE}
diffPrint(v1, v2)
```

*Note*: turning off `word.diff` when using `diffPrint` with unnamed atomic vectors can actually _slow down_ the diff because there may well be fewer element by element differences than line differences as displayed. For example, when comparing `1:1e6` to `2:1e6` there is only one element difference, but every line as displayed is different because of the shift.
Using `word.diff=TRUE` (and `unwrap.atomic=TRUE`) allows `diffPrint` to compare element by element rather than line by line. `diffChr` always compares element by element. ### Minimal Diff If you are looking for the fastest possible diff you can use `ses` and completely bypass most input and output processing. Inputs will be coerced to character if they are not character. ```{r} ses(letters[1:5], letters[c(2:3, 5)]) ``` This will be 10-20x faster than `diffChr`, at the cost of less useful output. ```{r, echo=FALSE} options(old.opt) ``` diffobj/inst/script/0000755000176200001440000000000013777704534014157 5ustar liggesusersdiffobj/inst/script/diffobj.js0000755000176200001440000001322313777704534016124 0ustar liggesusers// diffobj - Compare R Objects with a Diff // Copyright (C) 2021 Brodie Gaslam // // This program is free software: you can redistribute it and/or modify // it under the terms of the GNU General Public License as published by // the Free Software Foundation, either version 3 of the License, or // (at your option) any later version. // // This program is distributed in the hope that it will be useful, // but WITHOUT ANY WARRANTY; without even the implied warranty of // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the // GNU General Public License for more details. // // Go to for a copy of the license. /* * Resizes diff by changing font-size using a hidden row of sample output as * a reference * * NOTE: this code is intended to be loaded after the HTML has been rendered * and is assumed to be the only JS on the page. It should only be included * as part of output when in "page" mode and should not be embedded in other * content. For that, use the HTML/CSS only outputs. 
*/ var meta = document.getElementById("diffobj_meta"); var meta_cont = document.getElementById("diffobj_content_meta"); var meta_banner = document.getElementById("diffobj_banner_meta"); var content = document.getElementById("diffobj_content"); var outer = document.getElementById("diffobj_outer"); if( meta == null || content == null || outer == null || meta_cont == null || meta_banner == null ) throw new Error("Unable to find meta and content; contact maintainer."); var row = meta_cont.getElementsByClassName("row"); if(row.length != 1) throw new Error("Unexpected row struct in meta block; contact maintainer."); var lines = meta_cont.getElementsByClassName("line"); if(lines.length != 1 && lines.length != 2) throw new Error("Unexpected lines in meta block; contact maintainer."); var meta_bnr_gutter = document.querySelector("#diffobj_banner_meta .line .gutter"); var meta_bnr_delete = document.querySelector("#diffobj_banner_meta .line .text>.delete"); var meta_bnr_text = document.querySelector("#diffobj_banner_meta .line .text"); var bnr_gutters = document.querySelectorAll("#diffobj_content .line.banner .gutter"); var bnr_text_div = document.querySelectorAll("#diffobj_content .line.banner .text>DIV"); if( meta_bnr_gutter == null || meta_bnr_delete == null || bnr_gutters.length != 2 || bnr_text_div.length != 2 ) throw new Error("Unable to get meta banner objects") // Set the banners to 'fixed'; need to be in auto by default for(i = 0; i < 2; i++) bnr_text_div[i].style.tableLayout = "fixed"; // - Set Min Width ------------------------------------------------------------- // Makes sure that we don't wrap under "native" width // Note we need to pad because scrollWidth appears to truncate floats to int meta.style.display = "block"; var min_width = 0; for(i = 0; i < lines.length; i++) min_width += lines[i].scrollWidth + 1; meta.style.display = "none"; content.style.minWidth = min_width + "px"; function resize_diff_out(scale) { // - Get object refs 
--------------------------------------------------------- // - Get Sizes --------------------------------------------------------------- meta.style.display = "block"; // The getComputedStyle business won't work on IE9 or lower; need to detect // and implement work-around var b_t, b_d_w, b_d_o, b_g; b_g = parseFloat(window.getComputedStyle(meta_bnr_gutter).width); b_d_o = meta_bnr_delete.offsetWidth; b_d_w = parseFloat(window.getComputedStyle(meta_bnr_delete).width); b_t = parseFloat(window.getComputedStyle(meta_bnr_text).width); meta.style.display = "none"; // - Set Sizes --------------------------------------------------------------- for(i = 0; i < 2; i++) { bnr_gutters[i].style.width = b_g + "px"; // for some reason table fixed width computation doesn't properly account // for padding and lines bnr_text_div[i].style.width = b_t - b_d_o + b_d_w + "px"; } var w = document.body.clientWidth; var scale_size = w / min_width; if(scale_size < 1) { if(scale) { content.style.transform = "scale(" + scale_size + ")"; content.style.transformOrigin = "top left"; content.style.webkitTransform = "scale(" + scale_size + ")"; content.style.webkitTransformOrigin = "top left"; content.style.msTransform = "scale(" + scale_size + ")"; content.style.msTransformOrigin = "top left"; content.style.MozTransform = "scale(" + scale_size + ")"; content.style.MozTransformOrigin = "top left"; content.style.oTransform = "scale(" + scale_size + ")"; content.style.oTransformOrigin = "top left"; var cont_rec_h = content.getBoundingClientRect().height; if(cont_rec_h) { outer.style.height = cont_rec_h + "px"; } } var cont_rec_w = content.getBoundingClientRect().width; if(cont_rec_w) { outer.style.width = cont_rec_w + "px"; } } else { content.style.transform = "none"; content.style.MozTransform = "none"; content.style.webkitTransform = "none"; content.style.msTransform = "none"; content.style.oTransform = "none"; outer.style.height = "auto"; outer.style.width = "auto"; } }; /* * Manage resize timeout 
based on how large the object is */ var out_rows = content.getElementsByClassName("row").length; var timeout_time; if(out_rows < 100) { timeout_time = 25; } else { timeout_time = Math.min(25 + (out_rows - 100) / 4, 500) } var timeout; function resize_window(f, scale) { clearTimeout(timeout); timeout = setTimeout(f, timeout_time, scale); } function resize_diff_out_scale() {resize_window(resize_diff_out, true);} function resize_diff_out_no_scale() {resize_window(resize_diff_out, false);} diffobj/inst/COPYRIGHTS0000755000176200001440000000127313777704534014277 0ustar liggesusersThe C implementation of the Myers' Diff Algorithm with Linear Space Refinement was originally written by Michael B. Allen with the following license and copyright: diff - compute a shortest edit script (SES) given two sequences Copyright (C) 2004 Michael B. Allen License: MIT The original source code is available at: http://www.ioplex.com/~miallen/libmba/dl/libmba-0.9.1.tar.gz The adapted and heavily modified code is available in src/diff.c, src/diff.h. See those files for additional details. This package is released under the GPL (>= 2) or greater license: diffobj - Compare R Objects with a Diff Copyright (C) 2021 Brodie Gaslam License: GPL (>= 2) diffobj/inst/css/0000755000176200001440000000000013420351310013412 5ustar liggesusersdiffobj/inst/css/diffobj.css0000755000176200001440000001172113420351310015534 0ustar liggesusers/* Structural CSS ------------------------------------------------------------*/ /* * TBD whether we want a more fully table like structure; some of the visual * cues provided by the current set-up are useful (line wraps, etc.) 
*/ DIV.diffobj-container PRE.diffobj-content { white-space: pre-wrap; margin: 0; } DIV.diffobj-container DIV.diffobj-row { width: 100%; font-family: monospace; display: table; table-layout: fixed; } DIV.diffobj-container DIV.diffobj-line { width: auto; display: table-cell; overflow: hidden; } DIV.diffobj-container DIV.diffobj-line>DIV { width: 100%; display: table; table-layout: auto; } DIV.diffobj-container DIV.diffobj-line.banner>DIV { display: table; table-layout: auto; /* set to fixed in JS */ } DIV.diffobj-container DIV.diffobj-text { display: table-cell; width: 100%; } DIV.diffobj-container DIV.diffobj-gutter { display: table-cell; padding: 0 0.2em; } DIV.diffobj-container DIV.diffobj-gutter DIV { display: table-cell; } #diffobj_content_meta DIV.diffobj-container DIV.diffobj-row { width: auto; } #diffobj_banner_meta DIV.diffobj-container DIV.diffobj-line.banner>DIV { table-layout: auto; } #diffobj_outer { overflow: hidden; } /* Summary -------------------------------------------------------------------*/ DIV.diffobj-container DIV.diffobj-summary DIV.map { word-wrap: break-word; padding-left: 1em; } DIV.diffobj-container DIV.diffobj-summary DIV.detail { padding-left: 1em; } /* Common elements -----------------------------------------------------------*/ DIV.diffobj-container DIV.diffobj-line.banner { font-size: 1.2em; font-weight: bold; overflow: hidden; } /* truncate banners */ DIV.diffobj-container DIV.diffobj-line.banner DIV.diffobj-text DIV{ white-space: nowrap; overflow: hidden; text-overflow: ellipsis; width: 100%; /* need to compute and set in JS */ } DIV.diffobj-container DIV.diffobj-gutter, DIV.diffobj-container DIV.diffobj-guide, DIV.diffobj-container DIV.diffobj-fill, DIV.diffobj-container DIV.context_sep, DIV.diffobj-container SPAN.diffobj-trim { color: #999; } DIV.diffobj-container DIV.diffobj-header { font-size: 1.1em; } DIV.diffobj-container DIV.diffobj-text>DIV.diffobj-match, DIV.diffobj-container DIV.diffobj-text>DIV.diffobj-guide { 
background-color: #ffffff; } DIV.diffobj-container DIV.diffobj-text>DIV.diffobj-fill { background-color: transparent; } DIV.diffobj-container DIV.diffobj-text>DIV { padding-right: 3px; } DIV.diffobj-container DIV.diffobj-text>DIV { border-left: 1px solid #888888; } DIV.diffobj-container DIV.diffobj-line { background-color: #eeeeee; } DIV.diffobj-container DIV.diffobj-text>DIV, DIV.diffobj-container DIV.diffobj-header { padding-left: 0.5em; } DIV.diffobj-container DIV.diffobj-line>DIV.diffobj-match, DIV.diffobj-container DIV.diffobj-line>DIV.diffobj-fill, DIV.diffobj-container DIV.diffobj-line>DIV.diffobj-guide { border-left: 1px solid #888888; } /* github inspired color scheme - default ------------------------------------*/ DIV.diffobj-container.light.rgb SPAN.diffobj-word.insert, DIV.diffobj-container.light.rgb DIV.diffobj-line>DIV.insert { background-color: #a6f3a6; } DIV.diffobj-container.light.rgb SPAN.diffobj-word.delete, DIV.diffobj-container.light.rgb DIV.diffobj-line>DIV.delete { background-color: #f8c2c2; } DIV.diffobj-container.light.rgb DIV.diffobj-text>DIV.insert { background-color: #efffef; } DIV.diffobj-container.light.rgb DIV.diffobj-text>DIV.insert, DIV.diffobj-container.light.rgb DIV.diffobj-line>DIV.insert { border-left: 1px solid #33bb33; } DIV.diffobj-container.light.rgb DIV.diffobj-text>DIV.delete { background-color: #ffefef; } DIV.diffobj-container.light.rgb DIV.diffobj-text>DIV.delete, DIV.diffobj-container.light.rgb DIV.diffobj-line>DIV.delete { border-left: 1px solid #cc6666; } DIV.diffobj-container.light.rgb DIV.diffobj-header { background-color: #e0e6fa; border-left: 1px solid #9894b6; } /* Yellow Blue variation -----------------------------------------------------*/ DIV.diffobj-container.light.yb SPAN.diffobj-word.insert, DIV.diffobj-container.light.yb DIV.diffobj-line>DIV.insert { background-color: #c0cfff; } DIV.diffobj-container.light.yb SPAN.diffobj-word.delete, DIV.diffobj-container.light.yb DIV.diffobj-line>DIV.delete { 
background-color: #e7e780; } DIV.diffobj-container.light.yb DIV.diffobj-text>DIV.insert { background-color: #efefff; } DIV.diffobj-container.light.yb DIV.diffobj-text>DIV.insert, DIV.diffobj-container.light.yb DIV.diffobj-line>DIV.insert { border-left: 1px solid #3333bb; } DIV.diffobj-container.light.yb DIV.diffobj-text>DIV.delete { background-color: #fefee5; } DIV.diffobj-container.light.yb DIV.diffobj-text>DIV.delete, DIV.diffobj-container.light.yb DIV.diffobj-line>DIV.delete { border-left: 1px solid #aaaa55; } DIV.diffobj-container.light.yb DIV.diffobj-header { background-color: #afafaf; border-left: 1px solid #e3e3e3; color: #e9e9e9; } DIV.diffobj-container.light.yb DIV.diffobj-line { background-color: #eeeeee; }