fansi/NAMESPACE

# Generated by roxygen2: do not edit by hand

export("substr2_ctl<-")
export("substr_ctl<-")
export(close_state)
export(dflt_css)
export(dflt_term_cap)
export(fansi_lines)
export(fwl)
export(has_ctl)
export(has_sgr)
export(html_code_block)
export(html_esc)
export(in_html)
export(make_styles)
export(nchar_ctl)
export(nchar_sgr)
export(normalize_state)
export(nzchar_ctl)
export(nzchar_sgr)
export(set_knit_hooks)
export(sgr_256)
export(sgr_to_html)
export(state_at_end)
export(strip_ctl)
export(strip_sgr)
export(strsplit_ctl)
export(strsplit_sgr)
export(strtrim2_ctl)
export(strtrim2_sgr)
export(strtrim_ctl)
export(strtrim_sgr)
export(strwrap2_ctl)
export(strwrap2_sgr)
export(strwrap_ctl)
export(strwrap_sgr)
export(substr2_ctl)
export(substr2_sgr)
export(substr_ctl)
export(substr_sgr)
export(tabs_as_spaces)
export(term_cap_test)
export(to_html)
export(trimws_ctl)
export(unhandled_ctl)
importFrom(grDevices,col2rgb)
importFrom(grDevices,rgb)
importFrom(utils,browseURL)
useDynLib(fansi, .registration=TRUE, .fixes="FANSI_")

fansi/README.md

# fansi - ANSI Control Sequence Aware String Functions

[![R build status](https://github.com/brodieG/fansi/workflows/R-CMD-check/badge.svg)](https://github.com/brodieG/fansi/actions)
[![](https://codecov.io/gh/brodieG/fansi/branch/master/graphs/badge.svg?branch=master)](https://app.codecov.io/github/brodieG/fansi?branch=master)
[![](http://www.r-pkg.org/badges/version/fansi)](https://cran.r-project.org/package=fansi)
[![Dependencies direct/recursive](https://tinyverse.netlify.app/badge/fansi)](https://tinyverse.netlify.app/)

Counterparts to R string manipulation functions that account for the effects of ANSI text formatting control sequences.

## Formatting Strings with Control Sequences

Many terminals will recognize special sequences of characters in strings and change display behavior as a result. For example, on my terminal the sequences `"\033[3?m"` and `"\033[4?m"`, where `"?"` is a digit in 0-7, change the foreground and background colors of text respectively:

    fansi <- "\033[30m\033[41mF\033[42mA\033[43mN\033[44mS\033[45mI\033[m"

![](https://github.com/brodieG/fansi/raw/v1.0-rc/extra/images/fansi-1.png)

This type of sequence is called an ANSI CSI SGR control sequence. Most \*nix terminals support them, and newer versions of Windows and Rstudio consoles do too. You can check whether your display supports them by running `term_cap_test()`.

Whether the `fansi` functions behave as expected depends on many factors, including how your particular display handles Control Sequences. See `?fansi` for details, particularly if you are getting unexpected results.

## Manipulation of Formatted Strings

ANSI control characters and sequences (*Control Sequences* hereafter) break the relationship between byte/character position in a string and display position.

For example, to extract the “ANS” part of our colored “FANSI”, we would need to carefully compute the character positions:

![](https://github.com/brodieG/fansi/raw/v1.0-rc/extra/images/fansi-2.png)

With `fansi` we can select directly based on display position:

![](https://github.com/brodieG/fansi/raw/v1.0-rc/extra/images/fansi-3.png)

If you look closely you’ll notice that the text color for the `substr` version is wrong, as the naïve string extraction loses the initial `"\033[30m"` that sets the foreground color. Additionally, the color from the last letter bleeds out into the next line.
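
For readers who cannot see the screenshots, the comparison can be approximated with the minimal sketch below. It assumes `fansi` is installed and rebuilds the `fansi` string defined earlier. The positions passed to base `substr` (12 and 29) are worked out by hand for this particular string and are an assumption about what the screenshot shows, not a quote from it.

```r
library(fansi)

# The colored string from the first section
fansi <- "\033[30m\033[41mF\033[42mA\033[43mN\033[44mS\033[45mI\033[m"

# Base R counts every character of each escape sequence
nchar(fansi)      # 38 characters in total
nchar_ctl(fansi)  # 5 once Control Sequences are excluded

# Naive extraction: positions 12-29 cover "A", "N", "S" and the background
# sequences in front of them, but not the leading "\033[30m"
naive <- substr(fansi, 12, 29)

# Control Sequence aware extraction: positions are display positions
aware <- substr_ctl(fansi, 2, 4)

# Both contain the same visible text once sequences are stripped
strip_ctl(naive)
strip_ctl(aware)
```

Printing `naive` and `aware` with `writeLines` on a terminal that supports CSI SGR should show the difference in foreground color and the bleed described above.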
## `fansi` Functions `fansi` provides counterparts to the following string functions: - `substr` (and `substr<-`) - `strsplit` - `strtrim` - `strwrap` - `nchar` / `nzchar` - `trimws` These are drop-in replacements that behave (almost) identically to the base counterparts, except for the *Control Sequence* awareness. There are also utility functions such as `strip_ctl` to remove *Control Sequences* and `has_ctl` to detect whether strings contain them. Much of `fansi` is written in C so you should find performance of the `fansi` functions to be slightly slower than the corresponding base functions, with the exception that `strwrap_ctl` is much faster. Operations involving `type = "width"` will be slower still. We have prioritized convenience and safety over raw speed in the C code, but unless your code is primarily engaged in string manipulation `fansi` should be fast enough to avoid attention in benchmarking traces. ## Width Based Substrings `fansi` also includes improved versions of some of those functions, such as `substr2_ctl` which allows for width based substrings. 

To illustrate, let’s create an emoji string made up of two-wide characters:

    pizza.grin <- sprintf("\033[46m%s\033[m", strrep("\U1F355\U1F600", 10))

![](https://github.com/brodieG/fansi/raw/v1.0-rc/extra/images/pizza-grin.png)

And a colorful background made up of one-wide characters:

    raw <- paste0("\033[45m", strrep("FANSI", 40))
    wrapped <- strwrap2_ctl(raw, 41, wrap.always=TRUE)

![](https://github.com/brodieG/fansi/raw/df4019e/extra/images/wrapped-2.png)

When we inject the 2-wide emoji into the 1-wide background their widths are accounted for, as shown by the result remaining rectangular:

    starts <- c(18, 13, 8, 13, 18)
    ends <- c(23, 28, 33, 28, 23)
    substr2_ctl(wrapped, type='width', starts, ends) <- pizza.grin

![](https://github.com/brodieG/fansi/raw/v1.0-rc/extra/images/wrapped-1.png)

`fansi` width calculations use heuristics to account for graphemes, including combining emoji:

    emo <- c(
      "\U1F468",
      "\U1F468\U1F3FD",
      "\U1F468\U1F3FD\u200D\U1F9B3",
      "\U1F468\u200D\U1F469\u200D\U1F467\u200D\U1F466"
    )
    writeLines(
      paste(
        emo,
        paste("base:", nchar(emo, type='width')),
        paste("fansi:", nchar_ctl(emo, type='width'))
      )
    )
    ## 👨 base: 2 fansi: 2
    ## 👨🏽 base: 4 fansi: 2
    ## 👨🏽‍🦳 base: 6 fansi: 2
    ## 👨‍👩‍👧‍👦 base: 8 fansi: 2

## HTML Translation

You can translate ANSI CSI SGR formatted strings into their HTML counterparts with `to_html`:

![Translate to HTML](https://github.com/brodieG/fansi/raw/v1.0-rc/extra/images/sgr_to_html.png)

## Rmarkdown

It is possible to set `knitr` hooks such that R output that contains ANSI CSI SGR is automatically converted to the HTML formatted equivalent and displayed as intended. See the [vignette](https://htmlpreview.github.io/?https://raw.githubusercontent.com/brodieG/fansi/rc/extra/sgr-in-rmd.html) for details.

## Installation

This package is available on CRAN:

    install.packages('fansi')

It has no runtime dependencies.

For the development version use `remotes::install_github('brodieg/fansi@development')` or:

    f.dl <- tempfile()
    f.uz <- tempfile()
    github.url <- 'https://github.com/brodieG/fansi/archive/development.zip'
    download.file(github.url, f.dl)
    unzip(f.dl, exdir=f.uz)
    install.packages(file.path(f.uz, 'fansi-development'), repos=NULL, type='source')
    unlink(c(f.dl, f.uz), recursive=TRUE)  # f.uz is a directory

There is no guarantee that development versions are stable or even working. The master branch typically mirrors CRAN and should be stable.

## Related Packages and References

- [crayon](https://github.com/r-lib/crayon), the library that started it all.
- [ansistrings](https://github.com/r-lib/ansistrings/), which implements similar functionality.
- [ECMA-48 - Control Functions For Coded Character Sets](https://www.ecma-international.org/publications-and-standards/standards/ecma-48/), in particular pages 10-12, and 61.
- [CCITT Recommendation T.416](https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-T.416-199303-I!!PDF-E&type=items)
- [ANSI Escape Code - Wikipedia](https://en.wikipedia.org/wiki/ANSI_escape_code) for a gentler introduction.

## Acknowledgments

- R Core for developing and maintaining such a wonderful language.
- CRAN maintainers, for patiently shepherding packages onto CRAN and maintaining the repository, and Uwe Ligges in particular for maintaining [Winbuilder](https://win-builder.r-project.org/).
- [Gábor Csárdi](https://github.com/gaborcsardi) for getting me started on the journey into ANSI control sequences, and for many of the ideas on how to process them.
- [Jim Hester](https://github.com/jimhester) for [covr](https://cran.r-project.org/package=covr), and with Rstudio for [r-lib/actions](https://github.com/r-lib/actions).
- [Dirk Eddelbuettel](https://github.com/eddelbuettel) and [Carl Boettiger](https://github.com/cboettig) for the [rocker](https://github.com/rocker-org/rocker) project, and [Gábor Csárdi](https://github.com/gaborcsardi) and the [R-consortium](https://www.r-consortium.org/) for [Rhub](https://github.com/r-hub), without which testing bugs on R-devel and other platforms would be a nightmare.
- [Tomas Kalibera](https://github.com/kalibera) for [rchk](https://github.com/kalibera/rchk) and the accompanying vagrant image, and rcnst to help detect errors in compiled code.
- [Winston Chang](https://github.com/wch) for the [r-debug](https://hub.docker.com/r/wch1/r-debug/) docker container, in particular because of the valgrind level 2 instrumented version of R.
- George Nachman et al. for [Iterm2](https://iterm2.com/index.html), a Free terminal emulator that supports truecolor CSI SGR.
- [Hadley Wickham](https://github.com/hadley/) and [Peter Danenberg](https://github.com/klutometis) for [roxygen2](https://cran.r-project.org/package=roxygen2).
- [Yihui Xie](https://github.com/yihui) for [knitr](https://cran.r-project.org/package=knitr) and [J.J. Allaire](https://github.com/jjallaire) et al. for [rmarkdown](https://cran.r-project.org/package=rmarkdown), and by extension John MacFarlane for [pandoc](https://pandoc.org/).
- [Gábor Csárdi](https://github.com/gaborcsardi), the [R-consortium](https://www.r-consortium.org/), et al. for [revdepcheck](https://github.com/r-lib/revdepcheck) to simplify reverse dependency checks.
- Olaf Mersmann for [microbenchmark](https://cran.r-project.org/package=microbenchmark), because microseconds matter, and [Joshua Ulrich](https://github.com/joshuaulrich) for making it lightweight.
- All open source developers out there that make their work freely available for others to use.
- [Github](https://github.com/), [Codecov](https://about.codecov.io/), [Vagrant](https://www.vagrantup.com/), [Docker](https://www.docker.com/), [Ubuntu](https://ubuntu.com/), [Brew](https://brew.sh/) for providing infrastructure that greatly simplifies open source development. - [Free Software Foundation](https://www.fsf.org/) for developing the GPL license and promotion of the free software movement. fansi/man/0000755000176200001440000000000014510300164012117 5ustar liggesusersfansi/man/fansi.Rd0000755000176200001440000003163014510300164013514 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/fansi-package.R \docType{package} \name{fansi} \alias{fansi} \alias{fansi-package} \title{Details About Manipulation of Strings Containing Control Sequences} \description{ Counterparts to R string manipulation functions that account for the effects of some ANSI X3.64 (a.k.a. ECMA-48, ISO-6429) control sequences. } \section{Control Characters and Sequences}{ Control characters and sequences are non-printing inline characters or sequences initiated by them that can be used to modify terminal display and behavior, for example by changing text color or cursor position. We will refer to X3.64/ECMA-48/ISO-6429 control characters and sequences as "\emph{Control Sequences}" hereafter. There are four types of \emph{Control Sequences} that \code{fansi} can treat specially: \itemize{ \item "C0" control characters, such as tabs and carriage returns (we include delete in this set, even though technically it is not part of it). \item Sequences starting in "ESC[", also known as Control Sequence Introducer (CSI) sequences, of which the Select Graphic Rendition (SGR) sequences used to format terminal output are a subset. \item Sequences starting in "ESC]", also known as Operating System Commands (OSC), of which the subset beginning with "8" is used to encode URI based hyperlinks. 
\item Sequences starting in "ESC" and followed by something other than "[" or "]". } \emph{Control Sequences} starting with ESC are assumed to be two characters long (including the ESC) unless they are of the CSI or OSC variety, in which case their length is computed as per the \href{https://www.ecma-international.org/publications-and-standards/standards/ecma-48/}{ECMA-48 specification}, with the exception that \href{#osc-hyperlinks}{OSC hyperlinks} may be terminated with BEL ("\\a") in addition to ST ("ESC\\"). \code{fansi} handles most common \emph{Control Sequences} in its parsing algorithms, but it is not a conforming implementation of ECMA-48. For example, there are non-CSI/OSC escape sequences that may be longer than two characters, but \code{fansi} will (incorrectly) treat them as if they were two characters long. There are many more unimplemented ECMA-48 specifications. In theory it is possible to encode CSI sequences with a single byte introducing character in the 0x40-0x5F range instead of the traditional "ESC[". Since this is rare and it conflicts with UTF-8 encoding, \code{fansi} does not support it. Within \emph{Control Sequences}, \code{fansi} further distinguishes CSI SGR and OSC hyperlinks by recording format specification and URIs into string state, and applying the same to any output strings according to the semantics of the functions in use. CSI SGR and OSC hyperlinks are known together as \emph{Special Sequences}. See the following sections for details. Additionally, all \emph{Control Sequences}, whether special or not, do not count as characters, graphemes, or display width. You can cause \code{fansi} to treat particular \emph{Control Sequences} as regular characters with the \code{ctl} parameter. } \section{CSI SGR Control Sequences}{ \strong{NOTE}: not all displays support CSI SGR sequences; run \code{\link{term_cap_test}} to see whether your display supports them. 

CSI SGR Control Sequences are the subset of CSI sequences that can be used to change text appearance (e.g. color). These sequences begin with "ESC[" and end in "m". \code{fansi} interprets these sequences and writes new ones to the output strings in such a way that the original formatting is preserved. In most cases this should be transparent to the user.

Occasionally there may be mismatches between how \code{fansi} and a display interpret the CSI SGR sequences, which may produce display artifacts. The most likely sources of artifacts are \emph{Control Sequences} that move the cursor or change the display, or that \code{fansi} otherwise fails to interpret, such as:
\itemize{
\item Unknown SGR substrings.
\item "C0" control characters like tabs and carriage returns.
\item Other escape sequences.
}

Another possible source of problems is that different displays parse and interpret control sequences differently. The common CSI SGR sequences that you are likely to encounter in formatted text tend to be treated consistently, but less common ones are not. \code{fansi} tries to hew to the ECMA-48 specification \strong{for CSI SGR control sequences}, but not all terminals do.

The most likely source of problems will be 24-bit CSI SGR sequences. For example, a 24-bit color sequence such as "ESC[38;2;31;42;4" is a single foreground color to a terminal that supports it, or separate foreground, background, faint, and underline specifications for one that does not. \code{fansi} will always interpret the sequences according to ECMA-48, but it will warn you if encountered sequences exceed those specified by the \code{term.cap} parameter or the "fansi.term.cap" global option.

\code{fansi} will also warn if it encounters \emph{Control Sequences} that it cannot interpret. You can turn off warnings via the \code{warn} parameter, which can be set globally via the "fansi.warn" option.
You can work around "C0" tab characters by turning them into spaces first with \code{\link{tabs_as_spaces}} or with the \code{tabs.as.spaces} parameter available in some of the \code{fansi} functions.

\code{fansi} interprets CSI SGR sequences in cumulative "Graphic Rendition Combination Mode". This means new SGR sequences add to rather than replace previous ones, although in some cases the effect is the same as replacement (e.g. if you have a color active and pick another one).
}
\section{OSC Hyperlinks}{

Operating System Commands are interpreted by terminal emulators, typically to engage actions external to the display of text proper, such as setting a window title or changing the active color palette.

\href{https://iterm2.com/documentation-escape-codes.html}{Some terminals} have added support for associating URIs to text with OSCs in a similar way to anchors in HTML, so \code{fansi} interprets them and outputs or terminates them as needed. For example:

\if{html}{\out{
}}\preformatted{"\\033]8;;xy.z\\033\\\\LINK\\033]8;;\\033\\\\" }\if{html}{\out{
}} Might be interpreted as a link to the URI "xy.z". To make the encoding pattern clearer, we replace "\033]" with "<OSC>" and "\033\\\\" with "<ST>" below: \if{html}{\out{
}}\preformatted{<OSC>8;;URI<ST>LINK TEXT<OSC>8;;<ST> }\if{html}{\out{
}} } \section{State Interactions}{ The cumulative nature of state as specified by SGR or OSC hyperlinks means that unterminated strings that are spliced will interact with each other. By extension, a substring does not inherently contain all the information required to recreate its state as it appeared in the source document. The default \code{fansi} configuration terminates extracted substrings and prepends original state to them so they present on a stand-alone basis as they did as part of the original string. To allow state in substrings to affect subsequent strings set \code{terminate = FALSE}, but you will need to manually terminate them or deal with the consequences of not doing so (see "Terminal Quirks"). By default, \code{fansi} assumes that each element in an input character vector is independent, but this is incorrect if the input is a single document with each element a line in it. In that situation state from each line should bleed into subsequent ones. Setting \code{carry = TRUE} enables the "single document" interpretation. To most closely approximate what \code{writeLines(x)} produces on your terminal, where \code{x} is a stateful string, use \code{writeLines(fansi_fun(x, carry=TRUE, terminate=FALSE))}. \code{fansi_fun} is a stand-in for any of the \code{fansi} string manipulation functions. Note that even with a seeming "null-op" such as \code{substr_ctl(x, 1, nchar_ctl(x), carry=TRUE, terminate=FALSE)} the output control sequences may not match the input ones, but the output \emph{should} look the same if displayed to the terminal. \code{fansi} strings will be affected by any active state in strings they are appended to. There are no parameters to control what happens in this case, but \code{fansi} provides functions that can help the user get the desired behavior. 
\code{state_at_end} computes the active state at the end of a string, which can then be prepended onto the \emph{input} of \code{fansi} functions so that they are aware of the active style at the beginning of the string. Alternatively, one could use \code{close_state(state_at_end(...))} and prepend that to the \emph{output} of \code{fansi} functions so they are unaffected by preceding SGR. One could also just prepend "ESC[0m", but in some cases as described in \code{\link[=normalize_state]{?normalize_state}} that is sub-optimal.

If you intend to combine stateful \code{fansi} manipulated strings with your own, it may be best to set \code{normalize = TRUE} for improved compatibility (see \code{\link[=normalize_state]{?normalize_state}}).
}
\section{Terminal Quirks}{

Some terminals (e.g. OS X terminal, ITerm2) will pre-paint the entirety of a new line with the currently active background before writing the contents of the line. If there is a non-default active background color, any unwritten columns in the new line will keep the prior background color even if the new line changes the background color. To avoid this be sure to use \code{terminate = TRUE} or to manually terminate each line with e.g. "ESC[0m". The problem manifests as:

\if{html}{\out{
}}\preformatted{" " = default background
"#" = new background
">" = start new background
"!" = restore default background

+-----------+
| abc\\n     |
|>###\\n     |
|!abc\\n#####| <- trailing "#" after newline are from pre-paint
| abc       |
+-----------+
}\if{html}{\out{
}}

The simplest way to avoid this problem is to split input strings by any newlines they contain, and use \code{terminate = TRUE} (the default). A more complex solution is to pad with spaces to the terminal window width before emitting the newline to ensure the pre-paint is overpainted with the current line's prevailing background color.
}
\section{Encodings / UTF-8}{

\code{fansi} will convert any non-ASCII strings to UTF-8 before processing them, and \code{fansi} functions that return strings will return them encoded in UTF-8. In some cases this will be different to what base R does. For example, \code{substr} re-encodes substrings to their original encoding.

Interpretation of UTF-8 strings is intended to be consistent with base R. There are three ways things may not work out exactly as desired:
\enumerate{
\item \code{fansi}, despite its best intentions, handles a UTF-8 sequence differently to the way R does.
\item R incorrectly handles a UTF-8 sequence.
\item Your display incorrectly handles a UTF-8 sequence.
}

These issues are most likely to occur with invalid UTF-8 sequences, combining character sequences, and emoji. For example, whether special characters such as emoji are considered one or two wide evolves as software implements newer versions of the Unicode databases.

Internally, \code{fansi} computes the width of most UTF-8 character sequences outside of the ASCII range using the native \code{R_nchar} function. This will cause such characters to be processed slower than ASCII characters. Unlike R (at least as of version 4.1), \code{fansi} can account for graphemes.

Because \code{fansi} implements its own internal UTF-8 parsing it is possible that you will see results different from those that R produces even on strings without \emph{Control Sequences}.
}
\section{Overflow}{

The maximum length of input character vector elements allowed by \code{fansi} is the 32 bit INT_MAX, excluding the terminating NULL.
As of R4.1 this is the limit for R character vector elements generally, but is enforced at the C level by \code{fansi} nonetheless. It is possible that during processing strings that are shorter than INT_MAX would become longer than that. \code{fansi} checks for that overflow and will stop with an error if that happens. A work-around for this situation is to break up large strings into smaller ones. The limit is on each element of a character vector, not on the vector as a whole. \code{fansi} will also error on your system if \code{R_len_t}, the R type used to measure string lengths, is less than the processed length of the string. } \section{R < 3.2.2 support}{ Nominally you can build and run this package in R versions between 3.1.0 and 3.2.1. Things should mostly work, but please be aware we do not run the test suite under versions of R less than 3.2.2. One key degraded capability is width computation of wide-display characters. Under R < 3.2.2 \code{fansi} will assume every character is 1 display width. Additionally, \code{fansi} may not always report malformed UTF-8 sequences as it usually does. One exception to this is \code{\link{nchar_ctl}} as that is just a thin wrapper around \code{\link[base:nchar]{base::nchar}}. 
} fansi/man/strsplit_ctl.Rd0000755000176200001440000002156714510300164015152 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/strsplit.R \name{strsplit_ctl} \alias{strsplit_ctl} \title{Control Sequence Aware Version of strsplit} \usage{ strsplit_ctl( x, split, fixed = FALSE, perl = FALSE, useBytes = FALSE, warn = getOption("fansi.warn", TRUE), term.cap = getOption("fansi.term.cap", dflt_term_cap()), ctl = "all", normalize = getOption("fansi.normalize", FALSE), carry = getOption("fansi.carry", FALSE), terminate = getOption("fansi.terminate", TRUE) ) } \arguments{ \item{x}{a character vector, or, unlike \code{\link[base:strsplit]{base::strsplit}} an object that can be coerced to character.} \item{split}{ character vector (or object which can be coerced to such) containing \link[base]{regular expression}(s) (unless \code{fixed = TRUE}) to use for splitting. If empty matches occur, in particular if \code{split} has length 0, \code{x} is split into single characters. If \code{split} has length greater than 1, it is re-cycled along \code{x}. } \item{fixed}{ logical. If \code{TRUE} match \code{split} exactly, otherwise use regular expressions. Has priority over \code{perl}. } \item{perl}{logical. Should Perl-compatible regexps be used?} \item{useBytes}{logical. If \code{TRUE} the matching is done byte-by-byte rather than character-by-character, and inputs with marked encodings are not converted. This is forced (with a warning) if any input is found which is marked as \code{"bytes"} (see \code{\link[base]{Encoding}}).} \item{warn}{TRUE (default) or FALSE, whether to warn when potentially problematic \emph{Control Sequences} are encountered. These could cause the assumptions \code{fansi} makes about how strings are rendered on your display to be incorrect, for example by moving the cursor (see \code{\link[=fansi]{?fansi}}). At most one warning will be issued per element in each input vector. 
Will also warn about some badly encoded UTF-8 strings, but a lack of UTF-8 warnings is not a guarantee of correct encoding (use \code{\link{validUTF8}} for that).} \item{term.cap}{character a vector of the capabilities of the terminal, can be any combination of "bright" (SGR codes 90-97, 100-107), "256" (SGR codes starting with "38;5" or "48;5"), "truecolor" (SGR codes starting with "38;2" or "48;2"), and "all". "all" behaves as it does for the \code{ctl} parameter: "all" combined with any other value means all terminal capabilities except that one. \code{fansi} will warn if it encounters SGR codes that exceed the terminal capabilities specified (see \code{\link{term_cap_test}} for details). In versions prior to 1.0, \code{fansi} would also skip exceeding SGRs entirely instead of interpreting them. You may add the string "old" to any otherwise valid \code{term.cap} spec to restore the pre 1.0 behavior. "old" will not interact with "all" the way other valid values for this parameter do.} \item{ctl}{character, which \emph{Control Sequences} should be treated specially. Special treatment is context dependent, and may include detecting them and/or computing their display/character width as zero. For the SGR subset of the ANSI CSI sequences, and OSC hyperlinks, \code{fansi} will also parse, interpret, and reapply the sequences as needed. You can modify whether a \emph{Control Sequence} is treated specially with the \code{ctl} parameter. \itemize{ \item "nl": newlines. \item "c0": all other "C0" control characters (i.e. 0x01-0x1f, 0x7F), except for newlines and the actual ESC (0x1B) character. \item "sgr": ANSI CSI SGR sequences. \item "csi": all non-SGR ANSI CSI sequences. \item "url": OSC hyperlinks \item "osc": all non-OSC-hyperlink OSC sequences. \item "esc": all other escape sequences. \item "all": all of the above, except when used in combination with any of the above, in which case it means "all but". 
}} \item{normalize}{TRUE or FALSE (default) whether SGR sequence should be normalized out such that there is one distinct sequence for each SGR code. normalized strings will occupy more space (e.g. "\033[31;42m" becomes "\033[31m\033[42m"), but will work better with code that assumes each SGR code will be in its own escape as \code{crayon} does.} \item{carry}{TRUE, FALSE (default), or a scalar string, controls whether to interpret the character vector as a "single document" (TRUE or string) or as independent elements (FALSE). In "single document" mode, active state at the end of an input element is considered active at the beginning of the next vector element, simulating what happens with a document with active state at the end of a line. If FALSE each vector element is interpreted as if there were no active state when it begins. If character, then the active state at the end of the \code{carry} string is carried into the first element of \code{x} (see "Replacement Functions" for differences there). The carried state is injected in the interstice between an imaginary zeroeth character and the first character of a vector element. See the "Position Semantics" section of \code{\link{substr_ctl}} and the "State Interactions" section of \code{\link[=fansi]{?fansi}} for details. Except for \code{\link{strwrap_ctl}} where \code{NA} is treated as the string \code{"NA"}, \code{carry} will cause \code{NA}s in inputs to propagate through the remaining vector elements.} \item{terminate}{TRUE (default) or FALSE whether substrings should have active state closed to avoid it bleeding into other strings they may be prepended onto. This does not stop state from carrying if \code{carry = TRUE}. See the "State Interactions" section of \code{\link[=fansi]{?fansi}} for details.} } \value{ Like \code{\link[base:strsplit]{base::strsplit}}, with \emph{Control Sequences} excluded. } \description{ A drop-in replacement for \code{\link[base:strsplit]{base::strsplit}}. 
} \details{ This function works by computing the position of the split points after removing \emph{Control Sequences}, and uses those positions in conjunction with \code{\link{substr_ctl}} to extract the pieces. This concept is borrowed from \code{crayon::col_strsplit}. An important implication of this is that you cannot split by \emph{Control Sequences} that are being treated as \emph{Control Sequences}. You can however limit which control sequences are treated specially via the \code{ctl} parameters (see examples). } \note{ The split positions are computed after both \code{x} and \code{split} are converted to UTF-8. Non-ASCII strings are converted to and returned in UTF-8 encoding. Width calculations will not work properly in R < 3.2.2. } \section{Control and Special Sequences}{ \emph{Control Sequences} are non-printing characters or sequences of characters. \emph{Special Sequences} are a subset of the \emph{Control Sequences}, and include CSI SGR sequences which can be used to change rendered appearance of text, and OSC hyperlinks. See \code{\link{fansi}} for details. } \section{Output Stability}{ Several factors could affect the exact output produced by \code{fansi} functions across versions of \code{fansi}, \code{R}, and/or across systems. \strong{In general it is best not to rely on exact \code{fansi} output, e.g. by embedding it in tests}. Width and grapheme calculations depend on locale, Unicode database version, and grapheme processing logic (which is still in development), among other things. For the most part \code{fansi} (currently) uses the internals of \code{base::nchar(type='width')}, but there are exceptions and this may change in the future. How a particular display format is encoded in \emph{Control Sequences} is not guaranteed to be stable across \code{fansi} versions. Additionally, which \emph{Special Sequences} are re-encoded vs transcribed untouched may change. In general we will strive to keep the rendered appearance stable. 
To maximize the odds of getting stable output set \code{normalize} to \code{TRUE} and \code{type} to \code{"chars"} in functions that allow it, and set \code{term.cap} to a specific set of capabilities.
}
\section{Bidirectional Text}{

\code{fansi} is unaware of text directionality and operates as if all strings are left to right (LTR). Using \code{fansi} functions with strings that contain mixed direction scripts (i.e. both LTR and RTL) may produce undesirable results.
}
\examples{
strsplit_ctl("\033[31mhello\033[42m world!", " ")

## Splitting by newlines does not work as they are _Control
## Sequences_, but we can use `ctl` to treat them as ordinary
strsplit_ctl("\033[31mhello\033[42m\nworld!", "\n")
strsplit_ctl("\033[31mhello\033[42m\nworld!", "\n", ctl=c("all", "nl"))
}
\seealso{
\code{\link[=fansi]{?fansi}} for details on how \emph{Control Sequences} are interpreted, particularly if you are getting unexpected results, \code{\link{normalize_state}} for more details on what the \code{normalize} parameter does, \code{\link{state_at_end}} to compute active state at the end of strings, \code{\link{close_state}} to compute the sequence required to close active state.
}

fansi/man/make_styles.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/tohtml.R
\name{make_styles}
\alias{make_styles}
\title{Generate CSS Mapping Classes to Colors}
\usage{
make_styles(classes, rgb.mix = diag(3))
}
\arguments{
\item{classes}{a character vector of either 16, 32, or 512 class names. The character vectors are described in \code{\link{to_html}}.}

\item{rgb.mix}{3 x 3 numeric matrix to remix color channels. Given a N x 3 matrix of numeric RGB colors \code{rgb}, the colors used in the style sheet will be \code{rgb \%*\% rgb.mix}. Out of range values are clipped to the nearest bound of the range.}
}
\value{
A character vector that can be used as the contents of a style sheet.
} \description{ Given a set of class names, produce the CSS that maps them to the default 8-bit colors. This is a helper function to generate style sheets for use in examples with either default or remixed \code{fansi} colors. In practice users will create their own style sheets mapping their classes to their preferred styles. } \examples{ ## Generate some class strings; order matters classes <- do.call(paste, c(expand.grid(c("fg", "bg"), 0:7), sep="-")) writeLines(classes[1:4]) ## Some Default CSS css0 <- "span {font-size: 60pt; padding: 10px; display: inline-block}" ## Associated class strings to styles css1 <- make_styles(classes) writeLines(css1[1:4]) ## Generate SGR-derived HTML, mapping to classes string <- "\033[43mYellow\033[m\n\033[45mMagenta\033[m\n\033[46mCyan\033[m" html <- to_html(string, classes=classes) writeLines(html) ## Combine in a page with styles and display in browser \dontrun{ in_html(html, css=c(css0, css1)) } ## Change CSS by remixing colors, and apply to exact same HTML mix <- matrix( c( 0, 1, 0, # red output is green input 0, 0, 1, # green output is blue input 1, 0, 0 # blue output is red input ), nrow=3, byrow=TRUE ) css2 <- make_styles(classes, rgb.mix=mix) ## Display in browser: same HTML but colors changed by CSS \dontrun{ in_html(html, css=c(css0, css2)) } } \seealso{ Other HTML functions: \code{\link{html_esc}()}, \code{\link{in_html}()}, \code{\link{to_html}()} } \concept{HTML functions} fansi/man/sgr_256.Rd0000755000176200001440000000117414507523546013624 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/misc.R \name{sgr_256} \alias{sgr_256} \title{Show 8 Bit CSI SGR Colors} \usage{ sgr_256() } \value{ character vector with SGR codes with background color set as themselves. } \description{ Generates text with each 8 bit SGR code (e.g. 
the "###" in "38;5;###") with the background colored by itself, and the foreground in a contrasting color and interesting color (we sacrifice some contrast for interest as this is intended for demo rather than reference purposes). } \examples{ writeLines(sgr_256()) } \seealso{ \code{\link[=make_styles]{make_styles()}}. } fansi/man/has_sgr.Rd0000755000176200001440000000211614507523546014060 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/sgr.R \name{has_sgr} \alias{has_sgr} \title{Check for Presence of Control Sequences} \usage{ has_sgr(x, warn = getOption("fansi.warn", TRUE)) } \arguments{ \item{x}{a character vector or object that can be coerced to such.} \item{warn}{TRUE (default) or FALSE, whether to warn when potentially problematic \emph{Control Sequences} are encountered. These could cause the assumptions \code{fansi} makes about how strings are rendered on your display to be incorrect, for example by moving the cursor (see \code{\link[=fansi]{?fansi}}). At most one warning will be issued per element in each input vector. Will also warn about some badly encoded UTF-8 strings, but a lack of UTF-8 warnings is not a guarantee of correct encoding (use \code{\link{validUTF8}} for that).} } \value{ logical of same length as \code{x}; NA values in \code{x} result in NA values in return } \description{ This function is deprecated in favor of the \code{\link{has_ctl}}. It checks for CSI SGR and OSC hyperlink sequences. 
} \keyword{internal} fansi/man/in_html.Rd0000755000176200001440000000311114507523546014060 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/tohtml.R \name{in_html} \alias{in_html} \title{Frame HTML in a Web Page And Display} \usage{ in_html(x, css = character(), pre = TRUE, display = TRUE, clean = display) } \arguments{ \item{x}{character vector of html encoded strings.} \item{css}{character vector of css styles.} \item{pre}{TRUE (default) or FALSE, whether to wrap \code{x} in PRE tags.} \item{display}{TRUE or FALSE, whether to display the resulting page in a browser window. If TRUE, will sleep for one second before returning, and will delete the temporary file used to store the HTML.} \item{clean}{TRUE or FALSE, if TRUE and \code{display == TRUE}, will delete the temporary file used for the web page, otherwise will leave it.} } \value{ character(1L) the file location of the page, invisibly, but keep in mind it will have been deleted if \code{clean=TRUE}. } \description{ Helper function that assembles user provided HTML and CSS into a temporary text file, and by default displays it in the browser. Intended for use in examples. } \examples{ txt <- "\033[31;42mHello \033[7mWorld\033[m" writeLines(txt) html <- to_html(txt) \dontrun{ in_html(html) # spawns a browser window } writeLines(readLines(in_html(html, display=FALSE))) css <- "SPAN {text-decoration: underline;}" writeLines(readLines(in_html(html, css=css, display=FALSE))) \dontrun{ in_html(html, css) } } \seealso{ \code{\link[=make_styles]{make_styles()}}. 
Other HTML functions: \code{\link{html_esc}()}, \code{\link{make_styles}()}, \code{\link{to_html}()} } \concept{HTML functions} fansi/man/set_knit_hooks.Rd0000755000176200001440000001216714507523546015464 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/misc.R \name{set_knit_hooks} \alias{set_knit_hooks} \title{Set an Output Hook Convert Control Sequences to HTML in Rmarkdown} \usage{ set_knit_hooks( hooks, which = "output", proc.fun = function(x, class) html_code_block(to_html(html_esc(x)), class = class), class = sprintf("fansi fansi-\%s", which), style = getOption("fansi.css", dflt_css()), split.nl = FALSE, .test = FALSE ) } \arguments{ \item{hooks}{list, this should the be \code{knitr::knit_hooks} object; we require you pass this to avoid a run-time dependency on \code{knitr}.} \item{which}{character vector with the names of the hooks that should be replaced, defaults to 'output', but can also contain values 'message', 'warning', and 'error'.} \item{proc.fun}{function that will be applied to output that contains CSI SGR sequences. Should accept parameters \code{x} and \code{class}, where \code{x} is the output, and \code{class} is the CSS class that should be applied to the
 blocks the output will be placed in.}

\item{class}{character the CSS class to give the output chunks.  Each type of
output chunk specified in \code{which} will be matched position-wise to the
classes specified here.  This vector should be the same length as \code{which}.}

\item{style}{character a vector of CSS styles; these will be output inside
HTML <STYLE> tags as a side effect.  The default value is designed to
ensure that there is no visible gap in background color with lines with
height 1.5 (as is the default setting in \code{rmarkdown} documents v1.1).}

\item{split.nl}{TRUE or FALSE (default), set to TRUE to split input strings
by any newlines they may contain to avoid any newlines inside SPAN tags
created by \code{\link[=to_html]{to_html()}}.  Some markdown->html renderers can be configured
to convert embedded newlines into line breaks, which may lead to a doubling
of line breaks.  With the default \code{proc.fun} the split strings are
recombined by \code{\link[=html_code_block]{html_code_block()}}, but if you provide your own \code{proc.fun}
you'll need to account for the possibility that the character vector it
receives will have a different number of elements than the chunk output.
This argument only has an effect if chunk output contains CSI SGR
sequences.}

\item{.test}{TRUE or FALSE, for internal testing use only.}
}
\value{
named list with the prior output hooks for each of \code{which}.
}
\description{
This is a convenience function designed for use within an \code{rmarkdown}
document.  It overrides the \code{knitr} output hooks by using
\code{knitr::knit_hooks$set}.  It replaces the hooks with ones that convert
\emph{Control Sequences} into HTML.  In addition to replacing the hook functions,
this will output a "),
    "",
    if(pre) "
",
    x,
    if(pre) "
", "", "" ) f <- paste0(tempfile(), ".html") writeLines(html, f) if(display) browseURL(f) # nocov, can't do this in tests if(clean) { Sys.sleep(1) unlink(f) } invisible(f) } FANSI.CLASSES <- do.call( paste, c( expand.grid('fansi', c('color', 'bgcol'), sprintf("%03d", 0:255)), sep="-" ) ) fansi/R/trimws.R0000755000176200001440000000415514510276530013236 0ustar liggesusers## Copyright (C) Brodie Gaslam ## ## This file is part of "fansi - ANSI Control Sequence Aware String Functions" ## ## This program is free software: you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation, either version 2 or 3 of the License. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## Go to for copies of the licenses. #' Control Sequence Aware Version of trimws #' #' Removes any whitespace before the first and/or after the last non-_Control #' Sequence_ character. Unlike with the [`base::trimws`], only the default #' `whitespace` specification is supported. #' #' @export #' @inheritSection substr_ctl Control and Special Sequences #' @inheritSection substr_ctl Output Stability #' @inheritParams base::trimws #' @inheritParams substr_ctl #' @param whitespace must be set to the default value, in the future it may #' become possible to change this parameter. #' @return The input with white space removed as described. 
#' @examples #' trimws_ctl(" \033[31m\thello world\t\033[39m ") trimws_ctl <- function( x, which = c("both", "left", "right"), whitespace = "[ \t\r\n]", warn=getOption('fansi.warn', TRUE), term.cap=getOption('fansi.term.cap', dflt_term_cap()), ctl='all', normalize=getOption('fansi.normalize', FALSE) ) { if(!identical(whitespace, "[ \t\r\n]")) stop("Argument `whitespace` may only be set to \"[ \\t\\r\\n]\".") # modifies/adds vars in env VAL_IN_ENV(x=x, ctl=ctl, warn=warn, term.cap=term.cap, normalize=normalize); valid.which <- c("both", "left", "right") if( !is.character(which) || length(which[1]) != 1 || is.na(which.int <- pmatch(which[1], valid.which)) ) stop( "Argument `which` must partial match one of ", deparse(valid.which), "." ) .Call( FANSI_trimws, x, which.int - 1L, WARN.INT, TERM.CAP.INT, CTL.INT, normalize ) } fansi/R/nchar.R0000755000176200001440000001111714510276532013002 0ustar liggesusers## Copyright (C) Brodie Gaslam ## ## This file is part of "fansi - ANSI Control Sequence Aware String Functions" ## ## This program is free software: you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation, either version 2 or 3 of the License. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## Go to for copies of the licenses. #' Control Sequence Aware Version of nchar #' #' `nchar_ctl` counts all non _Control Sequence_ characters. #' `nzchar_ctl` returns TRUE for each input vector element that has non _Control #' Sequence_ sequence characters. By default newlines and other C0 control #' characters are not counted. 
#' #' `nchar_ctl` and `nzchar_ctl` are implemented in statically compiled code, so #' in particular `nzchar_ctl` will be much faster than the otherwise equivalent #' `nzchar(strip_ctl(...))`. #' #' These functions will warn if either malformed or escape or UTF-8 sequences #' are encountered as they may be incorrectly interpreted. #' #' @inheritParams substr_ctl #' @inheritParams base::nchar #' @inheritParams strip_ctl #' @inheritSection substr_ctl Control and Special Sequences #' @inheritSection substr_ctl Output Stability #' @inheritSection substr_ctl Graphemes #' @inherit base::nchar return #' @return Like [`base::nchar`], with _Control Sequences_ excluded. #' @note The `keepNA` parameter is ignored for R < 3.2.2. #' @export #' @inherit has_ctl seealso #' @examples #' nchar_ctl("\033[31m123\a\r") #' ## with some wide characters #' cn.string <- sprintf("\033[31m%s\a\r", "\u4E00\u4E01\u4E03") #' nchar_ctl(cn.string) #' nchar_ctl(cn.string, type='width') #' #' ## Remember newlines are not counted by default #' nchar_ctl("\t\n\r") #' #' ## The 'c0' value for the `ctl` argument does not include #' ## newlines. 
#' nchar_ctl("\t\n\r", ctl="c0") #' nchar_ctl("\t\n\r", ctl=c("c0", "nl")) #' #' ## The _sgr flavor only treats SGR sequences as zero width #' nchar_sgr("\033[31m123") #' nchar_sgr("\t\n\n123") #' #' ## All of the following are Control Sequences or C0 controls #' nzchar_ctl("\n\033[42;31m\033[123P\a") nchar_ctl <- function( x, type='chars', allowNA=FALSE, keepNA=NA, ctl='all', warn=getOption('fansi.warn', TRUE), strip ) { if(!missing(strip)) { message("Parameter `strip` has been deprecated; use `ctl` instead.") ctl <- strip } ## modifies / creates NEW VARS in fun env if(FANSI.ENV[['r.ver']] >= "3.2.2") { VAL_IN_ENV( x=x, ctl=ctl, warn=warn, type=type, allowNA=allowNA, keepNA=keepNA, valid.types=c('chars', 'width', 'graphemes', 'bytes'), warn.mask=if(isTRUE(allowNA)) get_warn_mangled() else get_warn_worst() ) nchar_ctl_internal( x=x, type.int=TYPE.INT, allowNA=allowNA, keepNA=keepNA, ctl.int=CTL.INT, warn.int=WARN.INT, z=FALSE ) } else { nchar( strip_ctl(x, ctl=ctl, warn=warn), type=type, allowNA=allowNA, keepNA=keepNA ) } } #' @export #' @rdname nchar_ctl nzchar_ctl <- function( x, keepNA=FALSE, ctl='all', warn=getOption('fansi.warn', TRUE) ) { if(FANSI.ENV[['r.ver']] >= "3.2.2") { ## modifies / creates NEW VARS in fun env VAL_IN_ENV( x=x, ctl=ctl, warn=warn, type='chars', keepNA=keepNA, valid.types=c('chars', 'width', 'bytes'), warn.mask=get_warn_mangled() ) nchar_ctl_internal( x=x, type.int=TYPE.INT, allowNA=TRUE, keepNA=keepNA, ctl.int=CTL.INT, warn.int=WARN.INT, z=TRUE ) } else nzchar(strip_ctl(x, ctl=ctl, warn=warn), keepNA=keepNA) } nchar_ctl_internal <- function( x, type.int, allowNA, keepNA, ctl.int, warn.int, z ) { term.cap.int <- 1L res <- .Call( FANSI_nchar_esc, x, type.int, keepNA, allowNA, warn.int, term.cap.int, ctl.int, z ) dim(res) <- dim(x) dimnames(res) <- dimnames(x) names(res) <- names(x) res } #' Control Sequence Aware Version of nchar #' #' These functions are deprecated in favor of the [`nchar_ctl`] and #' [`nzchar_ctl`]. 
#' #' @inheritParams nchar_ctl #' @inherit nchar_ctl return #' @keywords internal #' @export nchar_sgr <- function( x, type='chars', allowNA=FALSE, keepNA=NA, warn=getOption('fansi.warn', TRUE) ) nchar_ctl( x=x, type=type, allowNA=allowNA, keepNA=keepNA, warn=warn, ctl='sgr' ) #' @export #' @rdname nchar_sgr nzchar_sgr <- function(x, keepNA=NA, warn=getOption('fansi.warn', TRUE)) nzchar_ctl(x=x, keepNA=keepNA, warn=warn, ctl='sgr') fansi/R/fansi-package.R0000755000176200001440000003277314510300164014400 0ustar liggesusers## Copyright (C) Brodie Gaslam ## ## This file is part of "fansi - ANSI Control Sequence Aware String Functions" ## ## This program is free software: you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation, either version 2 or 3 of the License. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## Go to for copies of the licenses. #' Details About Manipulation of Strings Containing Control Sequences #' #' Counterparts to R string manipulation functions that account for #' the effects of some ANSI X3.64 (a.k.a. ECMA-48, ISO-6429) control sequences. #' #' @section Control Characters and Sequences: #' #' Control characters and sequences are non-printing inline characters or #' sequences initiated by them that can be used to modify terminal display and #' behavior, for example by changing text color or cursor position. #' #' We will refer to X3.64/ECMA-48/ISO-6429 control characters and sequences as #' "_Control Sequences_" hereafter. #' #' There are four types of _Control Sequences_ that `fansi` can treat #' specially: #' #' * "C0" control characters, such as tabs and carriage returns (we include #' delete in this set, even though technically it is not part of it). 
#' * Sequences starting in "ESC[", also known as Control Sequence #' Introducer (CSI) sequences, of which the Select Graphic Rendition (SGR) #' sequences used to format terminal output are a subset. #' * Sequences starting in "ESC]", also known as Operating System #' Commands (OSC), of which the subset beginning with "8" is used to encode #' URI based hyperlinks. #' * Sequences starting in "ESC" and followed by something other than "[" or #' "]". #' #' _Control Sequences_ starting with ESC are assumed to be two characters #' long (including the ESC) unless they are of the CSI or OSC variety, in which #' case their length is computed as per the [ECMA-48 #' specification](https://www.ecma-international.org/publications-and-standards/standards/ecma-48/), #' with the exception that [OSC hyperlinks](#osc-hyperlinks) may be terminated #' with BEL ("\\a") in addition to ST ("ESC\\"). `fansi` handles most common #' _Control Sequences_ in its parsing algorithms, but it is not a conforming #' implementation of ECMA-48. For example, there are non-CSI/OSC escape #' sequences that may be longer than two characters, but `fansi` will #' (incorrectly) treat them as if they were two characters long. There are many #' more unimplemented ECMA-48 specifications. #' #' In theory it is possible to encode CSI sequences with a single byte #' introducing character in the 0x40-0x5F range instead of the traditional #' "ESC[". Since this is rare and it conflicts with UTF-8 encoding, `fansi` #' does not support it. #' #' Within _Control Sequences_, `fansi` further distinguishes CSI SGR and OSC #' hyperlinks by recording format specification and URIs into string state, and #' applying the same to any output strings according to the semantics of the #' functions in use. CSI SGR and OSC hyperlinks are known together as _Special #' Sequences_. See the following sections for details. 
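#'
#' As a minimal sketch of the distinctions above (the input string is made up
#' for illustration, and exact behavior may depend on the `fansi` version),
#' [`strip_ctl`] can remove every recognized _Control Sequence_, or only the
#' CSI SGR ones when `ctl` is restricted:
#'
#' ```r
#' x <- "\033[31mhello\tworld\033[0m"
#' strip_ctl(x)             # drops the SGR sequences and the C0 tab
#' strip_ctl(x, ctl="sgr")  # drops only the SGR sequences, keeps the tab
#' ```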
#' #' Additionally, all _Control Sequences_, whether special or not, #' do not count as characters, graphemes, or display width. You can cause #' `fansi` to treat particular _Control Sequences_ as regular characters with #' the `ctl` parameter. #' #' @section CSI SGR Control Sequences: #' #' **NOTE**: not all displays support CSI SGR sequences; run #' [`term_cap_test`] to see whether your display supports them. #' #' CSI SGR Control Sequences are the subset of CSI sequences that can be #' used to change text appearance (e.g. color). These sequences begin with #' "ESC[" and end in "m". `fansi` interprets these sequences and writes new #' ones to the output strings in such a way that the original formatting is #' preserved. In most cases this should be transparent to the user. #' #' Occasionally there may be mismatches between how `fansi` and a display #' interpret the CSI SGR sequences, which may produce display artifacts. The #' most likely source of artifacts are _Control Sequences_ that move #' the cursor or change the display, or that `fansi` otherwise fails to #' interpret, such as: #' #' * Unknown SGR substrings. #' * "C0" control characters like tabs and carriage returns. #' * Other escape sequences. #' #' Another possible source of problems is that different displays parse #' and interpret control sequences differently. The common CSI SGR sequences #' that you are likely to encounter in formatted text tend to be treated #' consistently, but less common ones are not. `fansi` tries to hew by the #' ECMA-48 specification **for CSI SGR control sequences**, but not all #' terminals do. #' #' The most likely source of problems will be 24-bit CSI SGR sequences. #' For example, a 24-bit color sequence such as "ESC[38;2;31;42;4" is a #' single foreground color to a terminal that supports it, or separate #' foreground, background, faint, and underline specifications for one that does #' not. 
#' `fansi` will always interpret the sequences according to ECMA-48, but
#' it will warn you if encountered sequences exceed those specified by
#' the `term.cap` parameter or the "fansi.term.cap" global option.
#'
#' `fansi` will also warn if it encounters _Control Sequences_ that it
#' cannot interpret. You can turn off warnings via the `warn` parameter, which
#' can be set globally via the "fansi.warn" option. You can work around "C0"
#' tab characters by turning them into spaces first with [`tabs_as_spaces`] or
#' with the `tabs.as.spaces` parameter available in some of the `fansi`
#' functions.
#'
#' `fansi` interprets CSI SGR sequences in cumulative "Graphic Rendition
#' Combination Mode". This means new SGR sequences add to rather than replace
#' previous ones, although in some cases the effect is the same as replacement
#' (e.g. if you have a color active and pick another one).
#'
#' @section OSC Hyperlinks:
#'
#' Operating System Commands are interpreted by terminal emulators typically to
#' engage actions external to the display of text proper, such as setting a
#' window title or changing the active color palette.
#'
#' [Some terminals](https://iterm2.com/documentation-escape-codes.html) have
#' added support for associating URIs to text with OSCs in a similar way to
#' anchors in HTML, so `fansi` interprets them and outputs or terminates them as
#' needed. For example:
#'
#' ```
#' "\033]8;;xy.z\033\\LINK\033]8;;\033\\"
#' ```
#'
#' Might be interpreted as a link to the URI "xy.z". To make the encoding
#' pattern clearer, we replace "\033]" with "<OSC>" and "\033\\\\" with
#' "<ST>" below:
#'
#' ```
#' <OSC>8;;URI<ST>LINK TEXT<OSC>8;;<ST>
#' ```
#'
#' @section State Interactions:
#'
#' The cumulative nature of state as specified by SGR or OSC hyperlinks means
#' that unterminated strings that are spliced will interact with each other.
#' By extension, a substring does not inherently contain all the information
#' required to recreate its state as it appeared in the source document. The
#' default `fansi` configuration terminates extracted substrings and prepends
#' original state to them so they present on a stand-alone basis as they did as
#' part of the original string.
#'
#' To allow state in substrings to affect subsequent strings set `terminate =
#' FALSE`, but you will need to manually terminate them or deal with the
#' consequences of not doing so (see "Terminal Quirks").
#'
#' By default, `fansi` assumes that each element in an input character vector is
#' independent, but this is incorrect if the input is a single document with
#' each element a line in it. In that situation state from each line should
#' bleed into subsequent ones. Setting `carry = TRUE` enables the "single
#' document" interpretation.
#'
#' To most closely approximate what `writeLines(x)` produces on your terminal,
#' where `x` is a stateful string, use `writeLines(fansi_fun(x, carry=TRUE,
#' terminate=FALSE))`. `fansi_fun` is a stand-in for any of the `fansi` string
#' manipulation functions. Note that even with a seeming "null-op" such as
#' `substr_ctl(x, 1, nchar_ctl(x), carry=TRUE, terminate=FALSE)` the output
#' control sequences may not match the input ones, but the output _should_ look
#' the same if displayed to the terminal.
#'
#' `fansi` strings will be affected by any active state in strings they are
#' appended to. There are no parameters to control what happens in this case,
#' but `fansi` provides functions that can help the user get the desired
#' behavior. `state_at_end` computes the active state at the end of a string,
#' which can then be prepended onto the _input_ of `fansi` functions so that
#' they are aware of the active style at the beginning of the string.
#' Alternatively, one could use `close_state(state_at_end(...))` and pre-pend #' that to the _output_ of `fansi` functions so they are unaffected by preceding #' SGR. One could also just prepend "ESC[0m", but in some cases as #' described in [`?normalize_state`][normalize_state] that is sub-optimal. #' #' If you intend to combine stateful `fansi` manipulated strings with your own, #' it may be best to set `normalize = TRUE` for improved compatibility (see #' [`?normalize_state`][normalize_state].) #' #' @section Terminal Quirks: #' #' Some terminals (e.g. OS X terminal, ITerm2) will pre-paint the entirety of a #' new line with the currently active background before writing the contents of #' the line. If there is a non-default active background color, any unwritten #' columns in the new line will keep the prior background color even if the new #' line changes the background color. To avoid this be sure to use `terminate = #' TRUE` or to manually terminate each line with e.g. "ESC[0m". The #' problem manifests as: #' #' ``` #' " " = default background #' "#" = new background #' ">" = start new background #' "!" = restore default background #' #' +-----------+ #' | abc\n | #' |>###\n | #' |!abc\n#####| <- trailing "#" after newline are from pre-paint #' | abc | #' +-----------+ #' ``` #' #' The simplest way to avoid this problem is to split input strings by any #' newlines they contain, and use `terminate = TRUE` (the default). A more #' complex solution is to pad with spaces to the terminal window width before #' emitting the newline to ensure the pre-paint is overpainted with the current #' line's prevailing background color. #' #' @section Encodings / UTF-8: #' #' `fansi` will convert any non-ASCII strings to UTF-8 before processing them, #' and `fansi` functions that return strings will return them encoded in UTF-8. #' In some cases this will be different to what base R does. For example, #' `substr` re-encodes substrings to their original encoding. 
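#'
#' The difference can be sketched as follows (an illustration only; the
#' latin1 string is made up for this example, and the encodings actually
#' reported may vary by platform and R version):
#'
#' ```r
#' x <- "\xe9t\xe9"                # e-acute, "t", e-acute as latin1 bytes
#' Encoding(x) <- "latin1"
#' Encoding(substr(x, 1, 2))       # base R: declared encoding of the result
#' Encoding(substr_ctl(x, 1, 2))   # fansi: documented to return UTF-8
#' ```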
#'
#' Interpretation of UTF-8 strings is intended to be consistent with base R.
#' There are three ways things may not work out exactly as desired:
#'
#' 1. `fansi`, despite its best intentions, handles a UTF-8 sequence differently
#'    to the way R does.
#' 2. R incorrectly handles a UTF-8 sequence.
#' 3. Your display incorrectly handles a UTF-8 sequence.
#'
#' These issues are most likely to occur with invalid UTF-8 sequences,
#' combining character sequences, and emoji. For example, whether special
#' characters such as emoji are considered one or two wide evolves as software
#' implements newer versions of the Unicode databases.
#'
#' Internally, `fansi` computes the width of most UTF-8 character sequences
#' outside of the ASCII range using the native `R_nchar` function. This will
#' cause such characters to be processed more slowly than ASCII characters.
#' Unlike R (at least as of version 4.1), `fansi` can account for graphemes.
#'
#' Because `fansi` implements its own internal UTF-8 parsing it is possible
#' that you will see results different from those that R produces even on
#' strings without _Control Sequences_.
#'
#' @section Overflow:
#'
#' The maximum length of input character vector elements allowed by `fansi` is
#' the 32 bit INT_MAX, excluding the terminating NUL. As of R 4.1 this is the
#' limit for R character vector elements generally, but is enforced at the C
#' level by `fansi` nonetheless.
#'
#' It is possible that during processing strings that are shorter than INT_MAX
#' would become longer than that. `fansi` checks for that overflow and will
#' stop with an error if that happens. A work-around for this situation is to
#' break up large strings into smaller ones. The limit is on each element of a
#' character vector, not on the vector as a whole. `fansi` will also error on
#' your system if `R_len_t`, the R type used to measure string lengths, is too
#' small to hold the processed length of the string.
#' #' @section R < 3.2.2 support: #' #' Nominally you can build and run this package in R versions between 3.1.0 and #' 3.2.1. Things should mostly work, but please be aware we do not run the test #' suite under versions of R less than 3.2.2. One key degraded capability is #' width computation of wide-display characters. Under R < 3.2.2 `fansi` will #' assume every character is 1 display width. Additionally, `fansi` may not #' always report malformed UTF-8 sequences as it usually does. One #' exception to this is [`nchar_ctl`] as that is just a thin wrapper around #' [`base::nchar`]. #' #' @useDynLib fansi, .registration=TRUE, .fixes="FANSI_" #' @docType package #' @aliases fansi-package #' @name fansi NULL fansi/R/constants.R0000755000176200001440000000255414507512660013731 0ustar liggesusers## Copyright (C) Brodie Gaslam ## ## This file is part of "fansi - ANSI Control Sequence Aware String Functions" ## ## This program is free software: you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation, either version 2 or 3 of the License. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## Go to for copies of the licenses. 
## Order of these is important as typically we convert them to integer codes
## with `match`

## Valid values for the `term.cap` argument

VALID.TERM.CAP <- c('all', 'bright', '256', 'truecolor', 'old')

## Valid values for the `ctl` argument,
##
## * nl: newlines
## * c0: other c0, including del
## * sgr: SGR ANSI CSI
## * csi: ANSI CSI, excluding SGR
## * esc: other \033 escape sequences, we assume they are two characters long
## * url: OSC hyperlinks
## * osc: OSC, excluding hyperlinks
##
## These will eventually be encoded in an integer as powers of 2, except for
## `all` which acts as a negation (see FANSI_ctl_as_int), so "nl" is 2^0, "c0"
## is 2^1, and so on.
##
## REMEMBER TO UPDATE CTL_ALL CONSTANT IF WE MODIFY THIS

VALID.CTL <- c("all", "nl", "c0", "sgr", "csi", "esc", "url", "osc")
fansi/R/strsplit.R0000755000176200001440000001404214510300164013560 0ustar liggesusers
## Copyright (C) Brodie Gaslam
##
## This file is part of "fansi - ANSI Control Sequence Aware String Functions"
##
## This program is free software: you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation, either version 2 or 3 of the License.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## Go to for copies of the licenses.

#' Control Sequence Aware Version of strsplit
#'
#' A drop-in replacement for [`base::strsplit`].
#'
#' This function works by computing the position of the split points after
#' removing _Control Sequences_, and uses those positions in conjunction with
#' [`substr_ctl`] to extract the pieces. This concept is borrowed from
#' `crayon::col_strsplit`. An important implication of this is that you cannot
#' split by _Control Sequences_ that are being treated as _Control Sequences_.
#' You can however limit which control sequences are treated specially via the #' `ctl` parameters (see examples). #' #' @note The split positions are computed after both `x` and `split` are #' converted to UTF-8. #' @export #' @param x a character vector, or, unlike [`base::strsplit`] an object that can #' be coerced to character. #' @inheritParams base::strsplit #' @inheritParams strwrap_ctl #' @inherit substr_ctl seealso #' @inheritSection substr_ctl Control and Special Sequences #' @inheritSection substr_ctl Output Stability #' @inheritSection substr_ctl Bidirectional Text #' @note Non-ASCII strings are converted to and returned in UTF-8 encoding. #' Width calculations will not work properly in R < 3.2.2. #' @return Like [`base::strsplit`], with _Control Sequences_ excluded. #' @examples #' strsplit_ctl("\033[31mhello\033[42m world!", " ") #' #' ## Splitting by newlines does not work as they are _Control #' ## Sequences_, but we can use `ctl` to treat them as ordinary #' strsplit_ctl("\033[31mhello\033[42m\nworld!", "\n") #' strsplit_ctl("\033[31mhello\033[42m\nworld!", "\n", ctl=c("all", "nl")) strsplit_ctl <- function( x, split, fixed=FALSE, perl=FALSE, useBytes=FALSE, warn=getOption('fansi.warn', TRUE), term.cap=getOption('fansi.term.cap', dflt_term_cap()), ctl='all', normalize=getOption('fansi.normalize', FALSE), carry=getOption('fansi.carry', FALSE), terminate=getOption('fansi.terminate', TRUE) ) { ## modifies / creates NEW VARS in fun env VAL_IN_ENV( x=x, warn=warn, term.cap=term.cap, ctl=ctl, normalize=normalize, carry=carry, terminate=terminate, round="start" ) if(is.null(split)) split <- "" split <- enc_to_utf8(as.character(split)) if(!length(split)) split <- "" if(anyNA(split)) stop("Argument `split` may not contain NAs.") if(any(Encoding(split) == "bytes")) stop("Argument `split` may not be \"bytes\" encoded.") if(!is.logical(fixed)) fixed <- as.logical(fixed) if(length(fixed) != 1L || is.na(fixed)) stop("Argument `fixed` must be TRUE or FALSE.") 
if(!is.logical(perl)) perl <- as.logical(perl) if(length(perl) != 1L || is.na(perl)) stop("Argument `perl` must be TRUE or FALSE.") if(!is.logical(useBytes)) useBytes <- as.logical(useBytes) if(length(useBytes) != 1L || is.na(useBytes)) stop("Argument `useBytes` must be TRUE or FALSE.") # Need to handle recycling, complicated by the ability of strsplit to accept # multiple different split arguments x.na <- is.na(x) x.seq <- seq_along(x) s.seq <- seq_along(split) s.x.seq <- rep(s.seq, length.out=length(x)) * (!x.na) matches <- res <- vector("list", length(x)) x.strip <- strip_ctl(x, warn=FALSE, ctl=ctl) chars <- nchar(x.strip) # Find the split locations and widths for(i in s.seq) { to.split <- s.x.seq == i & chars matches[to.split] <- if(!nzchar(split[i])) { # special handling for zero width split lapply( chars[to.split], function(y) structure( seq.int(from=2L, by=1L, length.out=y - 1L), match.length=integer(y - 1L) ) ) } else { gregexpr( split[i], x.strip[to.split], perl=perl, useBytes=useBytes, fixed=fixed ) } } # Use `substr` to select the pieces between the start/end for(i in seq_along(x)) { if(any(matches[[i]] > 0)) { starts <- c(1L, matches[[i]] + attr(matches[[i]], 'match.length')) ends <- c(matches[[i]] - 1L, chars[i]) starts[starts < 1L] <- 1L sub.invalid <- starts > chars[i] if(any(sub.invalid)) { # happens when split goes all way to end of string starts <- starts[!sub.invalid] ends <- ends[!sub.invalid] } res[[i]] <- substr_ctl_internal( x=rep(x[i], length.out=length(starts)), start=starts, stop=ends, type.int=0L, round.int=ROUND.INT, tabs.as.spaces=FALSE, tab.stops=8L, warn.int=WARN.INT, term.cap.int=TERM.CAP.INT, x.len=length(starts), ctl.int=CTL.INT, normalize=normalize, carry=carry, terminate=terminate ) } else { res[[i]] <- x[[i]] } } # lazy fix for zero length strings splitting into nothing; would be better to # fix upstream... 
  res[!chars] <- list(character(0L))
  res[x.na] <- list(NA_character_)
  res
}
#' Control Sequence Aware Version of strsplit
#'
#' This function is deprecated in favor of [`strsplit_ctl`].
#'
#' @inheritParams strsplit_ctl
#' @inherit strsplit_ctl return
#' @keywords internal
#' @export

strsplit_sgr <- function(
  x, split, fixed=FALSE, perl=FALSE, useBytes=FALSE,
  warn=getOption('fansi.warn', TRUE),
  term.cap=getOption('fansi.term.cap', dflt_term_cap()),
  normalize=getOption('fansi.normalize', FALSE),
  carry=getOption('fansi.carry', FALSE),
  terminate=getOption('fansi.terminate', TRUE)
)
  strsplit_ctl(
    x=x, split=split, fixed=fixed, perl=perl, useBytes=useBytes,
    warn=warn, term.cap=term.cap, ctl='sgr', normalize=normalize,
    carry=carry, terminate=terminate
  )
fansi/R/strwrap.R
## Copyright (C) Brodie Gaslam
##
## This file is part of "fansi - ANSI Control Sequence Aware String Functions"
##
## This program is free software: you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation, either version 2 or 3 of the License.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
## GNU General Public License for more details.
##
## Go to for copies of the licenses.

#' Control Sequence Aware Version of strwrap
#'
#' Wraps strings to a specified width accounting for _Control Sequences_.
#' `strwrap_ctl` is intended to emulate `strwrap` closely except with respect to
#' the _Control Sequences_ (see details for other minor differences), while
#' `strwrap2_ctl` adds features and changes the processing of whitespace.
#' `strwrap_ctl` is faster than `strwrap`.
#'
#' `strwrap2_ctl` can convert tabs to spaces, pad strings up to `width`, and
#' hard-break words if single words are wider than `width`.
#' #' Unlike [base::strwrap], both these functions will translate any non-ASCII #' strings to UTF-8 and return them in UTF-8. Additionally, invalid UTF-8 #' always causes errors, and `prefix` and `indent` must be scalar. #' #' When replacing tabs with spaces the tabs are computed relative to the #' beginning of the input line, not the most recent wrap point. #' Additionally,`indent`, `exdent`, `initial`, and `prefix` will be ignored when #' computing tab positions. #' #' @inheritSection substr_ctl Control and Special Sequences #' @inheritSection substr_ctl Graphemes #' @inheritSection substr_ctl Output Stability #' @inheritSection substr_ctl Bidirectional Text #' @inheritParams base::strwrap #' @inheritParams tabs_as_spaces #' @inheritParams substr_ctl #' @inherit substr_ctl seealso #' @return A character vector, or list of character vectors if `simplify` is #' false. #' @note Non-ASCII strings are converted to and returned in UTF-8 encoding. #' Width calculations will not work properly in R < 3.2.2. #' @param wrap.always TRUE or FALSE (default), whether to hard wrap at requested #' width if no word breaks are detected within a line. If set to TRUE then #' `width` must be at least 2. #' @param pad.end character(1L), a single character to use as padding at the #' end of each line until the line is `width` wide. This must be a printable #' ASCII character or an empty string (default). If you set it to an empty #' string the line remains unpadded. #' @param strip.spaces TRUE (default) or FALSE, if TRUE, extraneous white spaces #' (spaces, newlines, tabs) are removed in the same way as [base::strwrap] #' does. When FALSE, whitespaces are preserved, except for newlines as those #' are implicit boundaries between output vector elements. #' @param tabs.as.spaces FALSE (default) or TRUE, whether to convert tabs to #' spaces. This can only be set to TRUE if `strip.spaces` is FALSE. 
#' @note For the `strwrap*` functions the `carry` parameter affects whether #' styles are carried across _input_ vector elements. Styles always carry #' within a single wrapped vector element (e.g. if one of the input elements #' gets wrapped into three lines, the styles will carry through those three #' lines even if `carry=FALSE`, but not across input vector elements). #' @export #' @examples #' hello.1 <- "hello \033[41mred\033[49m world" #' hello.2 <- "hello\t\033[41mred\033[49m\tworld" #' #' strwrap_ctl(hello.1, 12) #' strwrap_ctl(hello.2, 12) #' #' ## In default mode strwrap2_ctl is the same as strwrap_ctl #' strwrap2_ctl(hello.2, 12) #' #' ## But you can leave whitespace unchanged, `warn` #' ## set to false as otherwise tabs causes warning #' strwrap2_ctl(hello.2, 12, strip.spaces=FALSE, warn=FALSE) #' #' ## And convert tabs to spaces #' strwrap2_ctl(hello.2, 12, tabs.as.spaces=TRUE) #' #' ## If your display has 8 wide tab stops the following two #' ## outputs should look the same #' writeLines(strwrap2_ctl(hello.2, 80, tabs.as.spaces=TRUE)) #' writeLines(hello.2) #' #' ## tab stops are NOT auto-detected, but you may provide #' ## your own #' strwrap2_ctl(hello.2, 12, tabs.as.spaces=TRUE, tab.stops=c(6, 12)) #' #' ## You can also force padding at the end to equal width #' writeLines(strwrap2_ctl("hello how are you today", 10, pad.end=".")) #' #' ## And a more involved example where we read the #' ## NEWS file, color it line by line, wrap it to #' ## 25 width and display some of it in 3 columns #' ## (works best on displays that support 256 color #' ## SGR sequences) #' #' NEWS <- readLines(file.path(R.home('doc'), 'NEWS')) #' NEWS.C <- fansi_lines(NEWS, step=2) # color each line #' W <- strwrap2_ctl(NEWS.C, 25, pad.end=" ", wrap.always=TRUE) #' writeLines(c("", paste(W[1:20], W[100:120], W[200:220]), "")) strwrap_ctl <- function( x, width = 0.9 * getOption("width"), indent = 0, exdent = 0, prefix = "", simplify = TRUE, initial = prefix, 
warn=getOption('fansi.warn', TRUE), term.cap=getOption('fansi.term.cap', dflt_term_cap()), ctl='all', normalize=getOption('fansi.normalize', FALSE), carry=getOption('fansi.carry', FALSE), terminate=getOption('fansi.terminate', TRUE) ) { strwrap2_ctl( x=x, width=width, indent=indent, exdent=exdent, prefix=prefix, simplify=simplify, initial=initial, warn=warn, term.cap=term.cap, ctl=ctl, normalize=normalize, carry=carry, terminate=terminate ) } #' @export #' @rdname strwrap_ctl strwrap2_ctl <- function( x, width = 0.9 * getOption("width"), indent = 0, exdent = 0, prefix = "", simplify = TRUE, initial = prefix, wrap.always=FALSE, pad.end="", strip.spaces=!tabs.as.spaces, tabs.as.spaces=getOption('fansi.tabs.as.spaces', FALSE), tab.stops=getOption('fansi.tab.stops', 8L), warn=getOption('fansi.warn', TRUE), term.cap=getOption('fansi.term.cap', dflt_term_cap()), ctl='all', normalize=getOption('fansi.normalize', FALSE), carry=getOption('fansi.carry', FALSE), terminate=getOption('fansi.terminate', TRUE) ) { if(!is.logical(wrap.always)) wrap.always <- as.logical(wrap.always) if(length(wrap.always) != 1L || is.na(wrap.always)) stop("Argument `wrap.always` must be TRUE or FALSE.") if(!is.logical(tabs.as.spaces)) tabs.as.spaces <- as.logical(tabs.as.spaces) if(wrap.always && width < 2L) stop("Width must be at least 2 in `wrap.always` mode.") ## modifies / creates NEW VARS in fun env VAL_IN_ENV ( x=x, warn=warn, term.cap=term.cap, ctl=ctl, normalize=normalize, carry=carry, terminate=terminate, tab.stops=tab.stops, tabs.as.spaces=tabs.as.spaces, strip.spaces=strip.spaces ) if(tabs.as.spaces && strip.spaces) stop("`tabs.as.spaces` and `strip.spaces` should not both be TRUE.") # This changes `width`, so needs to happen after the first width validation VAL_WRAP_IN_ENV(width, indent, exdent, prefix, initial, pad.end) res <- .Call( FANSI_strwrap_csi, x, width, indent, exdent, prefix, initial, wrap.always, pad.end, strip.spaces, tabs.as.spaces, tab.stops, WARN.INT, TERM.CAP.INT, 
FALSE, # first_only CTL.INT, normalize, carry, terminate ) if(simplify) { if(normalize) normalize_state(unlist(res), warn=FALSE, term.cap) else unlist(res) } else { if(normalize) normalize_state_list(res, 0L, TERM.CAP.INT, carry=carry) else res } } #' Control Sequence Aware Version of strwrap #' #' These functions are deprecated in favor of the [`strwrap_ctl`] flavors. #' #' @inheritParams strwrap_ctl #' @inherit strwrap_ctl return #' @keywords internal #' @export strwrap_sgr <- function( x, width = 0.9 * getOption("width"), indent = 0, exdent = 0, prefix = "", simplify = TRUE, initial = prefix, warn=getOption('fansi.warn', TRUE), term.cap=getOption('fansi.term.cap', dflt_term_cap()), normalize=getOption('fansi.normalize', FALSE), carry=getOption('fansi.carry', FALSE), terminate=getOption('fansi.terminate', TRUE) ) strwrap_ctl( x=x, width=width, indent=indent, exdent=exdent, prefix=prefix, simplify=simplify, initial=initial, warn=warn, term.cap=term.cap, ctl='sgr', normalize=normalize, carry=carry, terminate=terminate ) #' @export #' @rdname strwrap_sgr strwrap2_sgr <- function( x, width = 0.9 * getOption("width"), indent = 0, exdent = 0, prefix = "", simplify = TRUE, initial = prefix, wrap.always=FALSE, pad.end="", strip.spaces=!tabs.as.spaces, tabs.as.spaces=getOption('fansi.tabs.as.spaces', FALSE), tab.stops=getOption('fansi.tab.stops', 8L), warn=getOption('fansi.warn', TRUE), term.cap=getOption('fansi.term.cap', dflt_term_cap()), normalize=getOption('fansi.normalize', FALSE), carry=getOption('fansi.carry', FALSE), terminate=getOption('fansi.terminate', TRUE) ) strwrap2_ctl( x=x, width=width, indent=indent, exdent=exdent, prefix=prefix, simplify=simplify, initial=initial, wrap.always=wrap.always, pad.end=pad.end, strip.spaces=strip.spaces, tabs.as.spaces=tabs.as.spaces, tab.stops=tab.stops, warn=warn, term.cap=term.cap, ctl='sgr', normalize=normalize, carry=carry, terminate=terminate ) VAL_WRAP_IN_ENV <- function( width, indent, exdent, prefix, initial, pad.end 
) { call <- sys.call(-1) env <- parent.frame() stop2 <- function(x) stop(simpleError(x, call)) is_scl_int_pos <- function(x, name, strict=FALSE) { x <- as.integer(x) if( !is.numeric(x) || length(x) != 1L || is.na(x) || if(strict) x <= 0 else x < 0 ) stop2( sprintf( "Argument `%s` %s.", name, "must be a positive scalar numeric representable as integer" ) ) x } exdent <- is_scl_int_pos(exdent, 'exdent', strict=FALSE) indent <- is_scl_int_pos(indent, 'indent', strict=FALSE) if(is.numeric(width)) width <- as.integer(min(c(max(c(min(width), 2L)), .Machine$integer.max))) else stop2("Argument `width` must be numeric.") # technically width <- is_scl_int_pos(width, 'width', strict=TRUE) width <- width - 1L if(!is.character(prefix)) prefix <- as.character(prefix) if(length(prefix) != 1L) stop2("Argument `prefix` must be a scalar character.") prefix <- enc_to_utf8(prefix) if(Encoding(prefix) == "bytes") stop2("Argument `prefix` cannot be \"bytes\" encoded.") if(!is.character(initial)) initial <- as.character(initial) if(length(initial) != 1L) stop2("Argument `initial` must be a scalar character.") initial <- enc_to_utf8(initial) if(Encoding(initial) == "bytes") stop2("Argument `initial` cannot be \"bytes\" encoded.") if(!is.character(pad.end)) pad.end <- as.character(pad.end) if(length(pad.end) != 1L) stop2("Argument `pad.end` must be a scalar character.") pad.end <- enc_to_utf8(pad.end) if(Encoding(pad.end) == "bytes") stop2("Argument `pad.end` cannot be \"bytes\" encoded.") if(nchar(pad.end, type='bytes') > 1L) stop2("Argument `pad.end` must be at most one byte long.") list2env( list( width=width, indent=indent, exdent=exdent, prefix=prefix, initial=initial, pad.end=pad.end ), env ) } fansi/R/normalize.R0000755000176200001440000001052714507512660013714 0ustar liggesusers## Copyright (C) Brodie Gaslam ## ## This file is part of "fansi - ANSI Control Sequence Aware String Functions" ## ## This program is free software: you can redistribute it and/or modify ## it under the 
terms of the GNU General Public License as published by ## the Free Software Foundation, either version 2 or 3 of the License. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## Go to for copies of the licenses. #' Normalize CSI and OSC Sequences #' #' Re-encodes SGR and OSC encoded URL sequences into a unique decomposed form. #' Strings containing semantically identical SGR and OSC sequences that are #' encoded differently should compare equal after normalization. #' #' Each compound SGR sequence is broken up into individual tokens, superfluous #' tokens are removed, and the SGR reset sequence "ESC[0m" (or "ESC[m") #' is replaced by the closing codes for whatever SGR styles are active at the #' point in the string in which it appears. #' #' Unrecognized SGR codes will be dropped from the output with a warning. The #' specific order of SGR codes associated with any given SGR sequence is not #' guaranteed to remain the same across different versions of `fansi`, but #' should remain unchanged except for the addition of previously uninterpreted #' codes to the list of interpretable codes. There is no special significance #' to the order the SGR codes are emitted in other than it should be consistent #' for any given SGR state. URLs adjacent to SGR codes are always emitted after #' the SGR codes irrespective of what side they were on originally. #' #' OSC encoded URL sequences are always terminated by "ESC]\\", and those #' between abutting URLs are omitted. Identical abutting URLs are merged. In #' order for URLs to be considered identical both the URL and the "id" parameter #' must be specified and be the same. OSC URL parameters other than "id" are #' dropped with a warning. 
#'
#' The underlying assumption is that each element in the vector is
#' unaffected by SGR or OSC URLs in any other element or elsewhere.  This may
#' lead to surprising outcomes if this assumption does not hold (see examples).
#' You may adjust this assumption with the `carry` parameter.
#'
#' Normalization was implemented primarily for better compatibility with
#' [`crayon`][1] which emits SGR codes individually and assumes that each
#' opening code is paired up with its specific closing code, but it can also be
#' used to reduce the probability that strings processed with future versions of
#' `fansi` will produce different results than the current version.
#'
#' [1]: https://cran.r-project.org/package=crayon
#'
#' @export
#' @inheritParams substr_ctl
#' @inherit has_ctl seealso
#' @return `x`, with all SGRs normalized.
#' @examples
#' normalize_state("hello\033[42;33m world")
#' normalize_state("hello\033[42;33m world\033[m")
#' normalize_state("\033[4mhello\033[42;33m world\033[m")
#'
#' ## Superfluous codes removed
#' normalize_state("\033[31;32mhello\033[m")       # only last color prevails
#' normalize_state("\033[31m\033[32mhello\033[m")  # only last color prevails
#' normalize_state("\033[31mhe\033[49mllo\033[m")  # unused closing
#'
#' ## Equivalent normalized sequences compare identical
#' identical(
#'   normalize_state("\033[31;32mhello\033[m"),
#'   normalize_state("\033[31mhe\033[49mllo\033[m")
#' )
#' ## External SGR will defeat normalization, unless we `carry` it
#' red <- "\033[41m"
#' writeLines(
#'   c(
#'     paste(red, "he\033[0mllo", "\033[0m"),
#'     paste(red, normalize_state("he\033[0mllo"), "\033[0m"),
#'     paste(red, normalize_state("he\033[0mllo", carry=red), "\033[0m")
#' ) )

normalize_state <- function(
  x, warn=getOption('fansi.warn', TRUE),
  term.cap=getOption('fansi.term.cap', dflt_term_cap()),
  carry=getOption('fansi.carry', FALSE)
) {
  ## modifies / creates NEW VARS in fun env
  VAL_IN_ENV(x=x, warn=warn, term.cap=term.cap, carry=carry)
.Call(FANSI_normalize_state, x, WARN.INT, TERM.CAP.INT, carry) } # To reduce overhead of applying this in `strwrap_ctl` normalize_state_list <- function(x, warn.int, term.cap.int, carry) .Call(FANSI_normalize_state_list, x, warn.int, term.cap.int, carry) fansi/R/sgr.R0000755000176200001440000001641414507512660012510 0ustar liggesusers## Copyright (C) Brodie Gaslam ## ## This file is part of "fansi - ANSI Control Sequence Aware String Functions" ## ## This program is free software: you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation, either version 2 or 3 of the License. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## Go to for copies of the licenses. #' Strip Control Sequences #' #' Removes _Control Sequences_ from strings. By default it will #' strip all known _Control Sequences_, including CSI/OSC sequences, two #' character sequences starting with ESC, and all C0 control characters, #' including newlines. You can fine tune this behavior with the `ctl` #' parameter. #' #' The `ctl` value contains the names of **non-overlapping** subsets of the #' known _Control Sequences_ (e.g. "csi" does not contain "sgr", and "c0" does #' not contain newlines). The one exception is "all" which means strip every #' known sequence. If you combine "all" with any other options then everything #' **but** those options will be stripped. #' #' @note Non-ASCII strings are converted to and returned in UTF-8 encoding. #' @inheritParams substr_ctl #' @inherit has_ctl seealso #' @export #' @param ctl character, any combination of the following values (see details): #' * "nl": strip newlines. #' * "c0": strip all other "C0" control characters (i.e. 
x01-x1f, x7F), #' except for newlines and the actual ESC character. #' * "sgr": strip ANSI CSI SGR sequences. #' * "csi": strip all non-SGR csi sequences. #' * "esc": strip all other escape sequences. #' * "all": all of the above, except when used in combination with any of the #' above, in which case it means "all but" (see details). #' @param strip character, deprecated in favor of `ctl`. #' @return character vector of same length as x with ANSI escape sequences #' stripped #' @examples #' string <- "hello\033k\033[45p world\n\033[31mgoodbye\a moon" #' strip_ctl(string) #' strip_ctl(string, c("nl", "c0", "sgr", "csi", "esc")) # equivalently #' strip_ctl(string, "sgr") #' strip_ctl(string, c("c0", "esc")) #' #' ## everything but C0 controls, we need to specify "nl" #' ## in addition to "c0" since "nl" is not part of "c0" #' ## as far as the `strip` argument is concerned #' strip_ctl(string, c("all", "nl", "c0")) strip_ctl <- function(x, ctl='all', warn=getOption('fansi.warn', TRUE), strip) { if(!missing(strip)) { message("Parameter `strip` has been deprecated; use `ctl` instead.") ctl <- strip } ## modifies / creates NEW VARS in fun env VAL_IN_ENV(x=x, ctl=ctl, warn=warn, warn.mask=get_warn_worst()) if(length(ctl)) .Call(FANSI_strip_csi, x, CTL.INT, WARN.INT) else x } #' Strip Control Sequences #' #' This function is deprecated in favor of the [`strip_ctl`]. It #' strips CSI SGR and OSC hyperlink sequences. 
#' #' @inheritParams strip_ctl #' @inherit strip_ctl return #' @keywords internal #' @export #' @examples #' ## convenience function, same as `strip_ctl(ctl=c('sgr', 'url'))` #' string <- "hello\033k\033[45p world\n\033[31mgoodbye\a moon" #' strip_sgr(string) strip_sgr <- function(x, warn=getOption('fansi.warn', TRUE)) { ## modifies / creates NEW VARS in fun env VAL_IN_ENV(x=x, warn=warn, warn.mask=get_warn_worst()) ctl.int <- match(c("sgr", "url"), VALID.CTL) .Call(FANSI_strip_csi, x, ctl.int, WARN.INT) } #' Check for Presence of Control Sequences #' #' `has_ctl` checks for any _Control Sequence_. You can check for different #' types of sequences with the `ctl` parameter. Warnings are only emitted for #' malformed CSI or OSC sequences. #' #' @export #' @seealso [`?fansi`][fansi] for details on how _Control Sequences_ are #' interpreted, particularly if you are getting unexpected results, #' [`unhandled_ctl`] for detecting bad control sequences. #' @inheritParams substr_ctl #' @inheritParams strip_ctl #' @param which character, deprecated in favor of `ctl`. #' @return logical of same length as `x`; NA values in `x` result in NA values #' in return #' @examples #' has_ctl("hello world") #' has_ctl("hello\nworld") #' has_ctl("hello\nworld", "sgr") #' has_ctl("hello\033[31mworld\033[m", "sgr") has_ctl <- function(x, ctl='all', warn=getOption('fansi.warn', TRUE), which) { if(!missing(which)) { message("Parameter `which` has been deprecated; use `ctl` instead.") ctl <- which } ## modifies / creates NEW VARS in fun env VAL_IN_ENV(x=x, ctl=ctl, warn=warn, warn.mask=get_warn_mangled()) if(length(CTL.INT)) { .Call(FANSI_has_csi, x, CTL.INT, WARN.INT) } else rep(FALSE, length(x)) } #' Check for Presence of Control Sequences #' #' This function is deprecated in favor of the [`has_ctl`]. It #' checks for CSI SGR and OSC hyperlink sequences. 
#' #' @inheritParams has_ctl #' @inherit has_ctl return #' @keywords internal #' @export has_sgr <- function(x, warn=getOption('fansi.warn', TRUE)) has_ctl(x, ctl=c("sgr", "url"), warn=warn) #' Utilities for Managing CSI and OSC State In Strings #' #' `state_at_end` reads through strings computing the accumulated SGR and #' OSC hyperlinks, and outputs the active state at the end of them. #' `close_state` produces the sequence that closes any SGR active and OSC #' hyperlinks at the end of each input string. If `normalize = FALSE` #' (default), it will emit the reset code "ESC[0m" if any SGR is present. #' It is more interesting for closing SGRs if `normalize = TRUE`. Unlike #' `state_at_end` and other functions `close_state` has no concept of `carry`: #' it will only emit closing sequences for states explicitly active at the end #' of a string. #' #' @export #' @inheritParams substr_ctl #' @inheritSection substr_ctl Control and Special Sequences #' @inheritSection substr_ctl Output Stability #' @inherit has_ctl seealso #' @return character vector same length as `x`. 
#' @examples #' x <- c("\033[44mhello", "\033[33mworld") #' state_at_end(x) #' state_at_end(x, carry=TRUE) #' (close <- close_state(state_at_end(x, carry=TRUE), normalize=TRUE)) #' writeLines(paste0(x, close, " no style")) state_at_end <- function( x, warn=getOption('fansi.warn', TRUE), term.cap=getOption('fansi.term.cap', dflt_term_cap()), normalize=getOption('fansi.normalize', FALSE), carry=getOption('fansi.carry', FALSE) ) { ## modifies / creates NEW VARS in fun env VAL_IN_ENV(x=x, ctl='sgr', warn=warn, term.cap=term.cap, carry=carry) .Call( FANSI_state_at_end, x, WARN.INT, TERM.CAP.INT, CTL.INT, normalize, carry, "x", TRUE # allowNA ) } # Given an SGR, compute the sequence that closes it #' @export #' @rdname state_at_end close_state <- function( x, warn=getOption('fansi.warn', TRUE), normalize=getOption('fansi.normalize', FALSE) ) { ## modifies / creates NEW VARS in fun env VAL_IN_ENV(x=x, warn=warn, normalize=normalize) .Call(FANSI_close_state, x, WARN.INT, 1L, normalize) } ## Process String by Removing Unwanted Characters ## ## This is to simulate what `strwrap` does, exposed for testing purposes. process <- function(x, ctl="all") .Call( FANSI_process, enc_to_utf8(x), 1L, match(ctl, VALID.CTL) ) fansi/R/misc.R0000755000176200001440000004560614507512660012655 0ustar liggesusers## Copyright (C) Brodie Gaslam ## ## This file is part of "fansi - ANSI Control Sequence Aware String Functions" ## ## This program is free software: you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation, either version 2 or 3 of the License. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## Go to for copies of the licenses. 
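## Illustration only, not part of fansi: a minimal base-R sketch of the tab
## replacement idea documented for `tabs_as_spaces` below.  It assumes plain
## ASCII input with no _Control Sequences_ and one fixed tab stop width;
## `expand_tabs_sketch` and `stop.width` are hypothetical names.  The package
## itself does this work in compiled code and also treats _Control Sequences_
## as zero width, which this sketch does not attempt.

```r
## Hypothetical helper (an assumption for illustration, not a fansi function):
## expand tabs in a single string, with a tab stop every `stop.width` columns.
expand_tabs_sketch <- function(x, stop.width = 8L) {
  chars <- strsplit(x, "", fixed = TRUE)[[1L]]
  out <- character(length(chars))
  col <- 0L                       # display columns emitted so far
  for (i in seq_along(chars)) {
    if (chars[i] == "\t") {
      # number of spaces needed to reach the next tab stop
      pad <- stop.width - (col %% stop.width)
      out[i] <- strrep(" ", pad)
      col <- col + pad
    } else {
      out[i] <- chars[i]
      col <- col + 1L
    }
  }
  paste0(out, collapse = "")
}

expanded <- expand_tabs_sketch("1\t12\t123")
writeLines(c("-------|-------|-------|", expanded))
```

## Each tab is replaced by however many spaces reach the next multiple of
## `stop.width`, so the text following a tab always starts on a stop.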
#' Replace Tabs With Spaces #' #' Finds horizontal tab characters (0x09) in a string and replaces them with the #' spaces that produce the same horizontal offset. #' #' Since we do not know of a reliable cross platform means of detecting tab #' stops you will need to provide them yourself if you are using anything #' outside of the standard tab stop every 8 characters that is the default. #' #' @note Non-ASCII strings are converted to and returned in UTF-8 encoding. The #' `ctl` parameter only affects which _Control Sequences_ are considered zero #' width. Tabs will always be converted to spaces, irrespective of the `ctl` #' setting. #' @inherit has_ctl seealso #' @export #' @inheritParams substr_ctl #' @param x character vector or object coercible to character; any tabs therein #' will be replaced. #' @return character, `x` with tabs replaced by spaces, with elements #' possibly converted to UTF-8. #' @examples #' string <- '1\t12\t123\t1234\t12345678' #' tabs_as_spaces(string) #' writeLines( #' c( #' '-------|-------|-------|-------|-------|', #' tabs_as_spaces(string) #' ) ) #' writeLines( #' c( #' '-|--|--|--|--|--|--|--|--|--|--|', #' tabs_as_spaces(string, tab.stops=c(2, 3)) #' ) ) #' writeLines( #' c( #' '-|--|-------|-------|-------|', #' tabs_as_spaces(string, tab.stops=c(2, 3, 8)) #' ) ) tabs_as_spaces <- function( x, tab.stops=getOption('fansi.tab.stops', 8L), warn=getOption('fansi.warn', TRUE), ctl='all' ) { ## modifies / creates NEW VARS in fun env VAL_IN_ENV( x=x, warn=warn, ctl=ctl, warn.mask=get_warn_worst(), tab.stops=tab.stops ) term.cap.int <- 1L .Call( FANSI_tabs_as_spaces, x, tab.stops, WARN.INT, term.cap.int, CTL.INT ) } #' Test Terminal Capabilities #' #' Outputs ANSI CSI SGR formatted text to screen so that you may visually #' inspect what color capabilities your terminal supports. 
#'
#' The three tested terminal capabilities are:
#'
#' * "bright" for bright colors with SGR codes in 90-97 and 100-107
#' * "256" for colors defined by "38;5;x" and "48;5;x" where x is in 0-255
#' * "truecolor" for colors defined by "38;2;x;y;z" and "48;2;x;y;z" where x,
#'   y, and z are in 0-255
#'
#' Each of the color capabilities your terminal supports should be displayed
#' with a blue background and a red foreground.  For reference the corresponding
#' CSI SGR sequences are displayed as well.
#'
#' You should compare the screen output from this function to
#' `getOption('fansi.term.cap', dflt_term_cap())` to ensure that they are self
#' consistent.
#'
#' By default `fansi` assumes terminals support bright and 256 color
#' modes, and also tests for truecolor support via the $COLORTERM environment
#' variable.
#'
#' Functions with the `term.cap` parameter like `substr_ctl` will warn if they
#' encounter 256 or true color SGR sequences and `term.cap` indicates they are
#' unsupported as such a terminal may misinterpret those sequences.  Bright
#' codes and OSC hyperlinks in terminals that do not support them will likely be
#' silently ignored, so `fansi` functions do not warn about those.
#'
#' @seealso [`dflt_term_cap`], [`has_ctl`].
#' @export
#' @return character the test vector, invisibly
#' @examples
#' term_cap_test()

term_cap_test <- function() {
  types <- format(c("bright", "256", "truecolor"))
  res <- paste0(
    c(
      "\033[91;104m",
      "\033[38;5;196;48;5;21m",
      "\033[38;2;255;0;0;48;2;0;0;255m"
    ),
    types,
    "\033[0m"
  )
  res.esc <- gsub("\033", "\\033", res, fixed=TRUE)
  res.fin <- paste0(res, " -> ", format(res.esc))
  writeLines(res.fin)
  invisible(res)
}
#' Colorize Character Vectors
#'
#' Color each element in input with one of the "256 color" ANSI CSI SGR codes.
#' This is intended for testing and demo purposes.
#' #' @export #' @param txt character vector or object that can be coerced to character vector #' @param step integer(1L) how quickly to step through the color palette #' @return character vector with each element colored #' @examples #' NEWS <- readLines(file.path(R.home('doc'), 'NEWS')) #' writeLines(fansi_lines(NEWS[1:20])) #' writeLines(fansi_lines(NEWS[1:20], step=8)) fansi_lines <- function(txt, step=1) { if(!is.character(txt)) txt <- as.character(txt) if(!is.numeric(step) || length(step) != 1 || is.na(step) || step < 1) stop("Argument `step` must be a strictly positive scalar integer.") step <- as.integer(step) txt.c <- txt bg <- ceiling((seq_along(txt) * step) %% 215 + 1) + 16 fg <- ifelse((((bg - 16) %/% 18) %% 2), 30, 37) tpl <- "\033[%d;48;5;%dm%s\033[39;49m" ## Apply colors to strings and collapse nz <- nzchar(txt) txt.c[nz] <- sprintf(tpl, fg[nz], bg[nz], txt[nz]) txt.c } #' Escape Characters With Special HTML Meaning #' #' Arbitrary text may contain characters with special meaning in HTML, which may #' cause HTML display to be corrupted if they are included unescaped in a web #' page. This function escapes those special characters so they do not #' interfere with the HTML markup generated by e.g. [`to_html`]. #' #' @export #' @family HTML functions #' @param x character vector #' @param what character(1) containing any combination of ASCII characters #' "<", ">", "&", "'", or "\"". These characters are special in HTML contexts #' and will be substituted by their HTML entity code. By default, all #' special characters are escaped, but in many cases "<>&" or even "<>" might #' be sufficient. #' @return `x`, but with the `what` characters replaced by their HTML entity #' codes. #' @note Non-ASCII strings are converted to and returned in UTF-8 encoding. 
#' @examples #' html_esc("day > night") #' html_esc("hello world") html_esc <- function(x, what=getOption("fansi.html.esc", "<>&'\"")) { if(!is.character(x)) stop("Argument `x` must be character, is ", typeof(x), ".") if(!is.character(what)) stop("Argument `what` must be character, is ", typeof(what), ".") .Call(FANSI_esc_html, enc_to_utf8(x), what) } #' Format Character Vector for Display as Code in HTML #' #' This simulates what `rmarkdown` / `knitr` do to the output of an R markdown #' chunk, at least as of `rmarkdown` 1.10. It is useful when we override the #' `knitr` output hooks so that we can have a result that still looks as if it #' was run by `knitr`. #' #' @export #' @param x character vector #' @param class character vectors of classes to apply to the PRE HTML tags. It #' is the users responsibility to ensure the classes are valid CSS class #' names. #' @return character(1L) `x`, with <PRE> and <CODE> HTML tags #' applied and collapsed into one line with newlines as the line separator. #' @examples #' html_code_block(c("hello world")) #' html_code_block(c("hello world"), class="pretty") html_code_block <- function(x, class='fansi-output') { if(!is.character(x)) stop("Argument `x` must be character, is ", typeof(x), ".") if(!is.character(class)) stop("Argument `class` must be character, is ", typeof(class), ".") class.all <- sprintf("class=\"%s\"", paste0(class, collapse=" ")) sprintf( "
<PRE %s><CODE>%s</CODE></PRE>
", class.all, paste0(x, collapse='\n') ) } #' Set an Output Hook Convert Control Sequences to HTML in Rmarkdown #' #' This is a convenience function designed for use within an `rmarkdown` #' document. It overrides the `knitr` output hooks by using #' `knitr::knit_hooks$set`. It replaces the hooks with ones that convert #' _Control Sequences_ into HTML. In addition to replacing the hook functions, #' this will output a <STYLE> HTML block to stdout. These two actions are #' side effects as a result of which R chunks in the `rmarkdown` document that #' contain CSI SGR are shown in their HTML equivalent form. #' #' The replacement hook function tests for the presence of CSI SGR #' sequences in chunk output with [`has_ctl`], and if it is detected then #' processes it with the user provided `proc.fun`. Chunks that do not contain #' CSI SGR are passed off to the previously set hook function. The default #' `proc.fun` will run the output through [`html_esc`], [`to_html`], and #' finally [`html_code_block`]. #' #' If you require more control than this function provides you can set the #' `knitr` hooks manually with `knitr::knit_hooks$set`. If you are seeing your #' output gaining extra line breaks, look at the `split.nl` option. #' #' @note Since we do not formally import the `knitr` functions we do not #' guarantee that this function will always work properly with `knitr` / #' `rmarkdown`. #' #' @export #' @seealso [`has_ctl`], [`to_html`], [`html_esc`], [`html_code_block`], #' [`knitr` output hooks](https://yihui.org/knitr/hooks/#output-hooks), #' [embedding CSS in #' Rmd](https://bookdown.org/yihui/rmarkdown/language-engines.html#javascript-and-css), #' and the vignette `vignette(package='fansi', 'sgr-in-rmd')`. #' @param hooks list, this should the be `knitr::knit_hooks` object; we #' require you pass this to avoid a run-time dependency on `knitr`. 
#' @param which character vector with the names of the hooks that should be
#'   replaced, defaults to 'output', but can also contain values
#'   'message', 'warning', and 'error'.
#' @param class character the CSS class to give the output chunks. Each type of
#'   output chunk specified in `which` will be matched position-wise to the
#'   classes specified here. This vector should be the same length as `which`.
#' @param proc.fun function that will be applied to output that contains
#'   CSI SGR sequences. Should accept parameters `x` and `class`, where `x` is
#'   the output, and `class` is the CSS class that should be applied to
#'   the <PRE><CODE> blocks the output will be placed in.
#' @param style character a vector of CSS styles; these will be output inside
#'   HTML <STYLE> tags as a side effect. The default value is designed to
#'   ensure that there is no visible gap in background color with lines with
#'   height 1.5 (as is the default setting in `rmarkdown` documents v1.1).
#' @param split.nl TRUE or FALSE (default), set to TRUE to split input strings
#'   by any newlines they may contain to avoid any newlines inside SPAN tags
#'   created by [to_html()]. Some markdown->html renderers can be configured
#'   to convert embedded newlines into line breaks, which may lead to a doubling
#'   of line breaks. With the default `proc.fun` the split strings are
#'   recombined by [html_code_block()], but if you provide your own `proc.fun`
#'   you'll need to account for the possibility that the character vector it
#'   receives will have a different number of elements than the chunk output.
#'   This argument only has an effect if chunk output contains CSI SGR
#'   sequences.
#' @param .test TRUE or FALSE, for internal testing use only.
#' @return named list with the prior output hooks for each of `which`.
#' @examples
#' \dontrun{
#' ## The following should be done within an `rmarkdown` document chunk with
#' ## chunk option `results` set to 'asis' and the chunk option `comment` set
#' ## to ''.
#' #' ```{r comment="", results='asis', echo=FALSE} #' ## Change the "output" hook to handle ANSI CSI SGR #' #' old.hooks <- set_knit_hooks(knitr::knit_hooks) #' #' ## Do the same with the warning, error, and message, and add styles for #' ## them (alternatively we could have done output as part of this call too) #' #' styles <- c( #' getOption('fansi.style', dflt_css()), # default style #' "PRE.fansi CODE {background-color: transparent;}", #' "PRE.fansi-error {background-color: #DD5555;}", #' "PRE.fansi-warning {background-color: #DDDD55;}", #' "PRE.fansi-message {background-color: #EEEEEE;}" #' ) #' old.hooks <- c( #' old.hooks, #' fansi::set_knit_hooks( #' knitr::knit_hooks, #' which=c('warning', 'error', 'message'), #' style=styles #' ) ) #' ``` #' ## You may restore old hooks with the following chunk #' #' ## Restore Hooks #' ```{r} #' do.call(knitr::knit_hooks$set, old.hooks) #' ``` #' } set_knit_hooks <- function( hooks, which='output', proc.fun=function(x, class) html_code_block(to_html(html_esc(x)), class=class), class=sprintf("fansi fansi-%s", which), style=getOption("fansi.css", dflt_css()), split.nl=FALSE, .test=FALSE ) { if( !is.list(hooks) || !all(c('get', 'set') %in% names(hooks)) || !is.function(hooks[['get']]) || !is.function(hooks[['set']]) ) stop("Argument `hooks` does not appear to be `knitr::knit_hooks`.") which.vals <- c('output', 'warning', 'error', 'message') if(!is.character(which) || !all(which %in% which.vals)) stop( "Argument `which` must be character containing values in ", deparse(which.vals) ) if(anyDuplicated(which)) stop( "Argument `which` may not contain duplicate values (", which[anyDuplicated(which)], ")." ) if( !is.function(proc.fun) || !all(c('x', 'class') %in% names(formals(proc.fun))) ) stop( "Argument `proc.fun` must be a function with formals named ", "`x` and `class`." ) if(!is.character(class) || (length(class) != length(which))) stop( "Argument `class` should be a character vector the same length as ", "`which`." 
    )
  if(!is.character(style)) stop("Argument `style` must be character.")
  if(!isTRUE(split.nl %in% c(TRUE, FALSE)))
    stop("Argument `split.nl` must be TRUE or FALSE")

  old.hook.list <- vector('list', length(which))
  names(old.hook.list) <- which
  new.hook.list <- vector('list', length(which))
  names(new.hook.list) <- which
  base.err <-
    "are you sure you passed `knitr::knit_hooks` as the `hooks` argument?"

  make_hook <- function(old.hook, class, split.nl) {
    force(old.hook)
    force(class)
    force(split.nl)
    function(x, options) {
      # If the output has SGR in it, then convert to HTML and wrap
      # in PRE/CODE tags

      if(any(has_ctl(x, c('sgr', 'url')))) {
        if(split.nl) x <- unlist(strsplit_sgr(x, '\n', fixed=TRUE))
        res <- try(proc.fun(x=x, class=class))
        if(inherits(res, "try-error"))
          stop(
            "Argument `proc.fun` for `set_knit_hooks` caused an error when ",
            "processing output; see prior error."
          )
        res
      }
      # If output doesn't have SGR, then use the default hook

      else old.hook(x, options)
    }
  }
  for(i in seq_along(which)) {
    hook.name <- which[i]
    old.hook <- try(hooks$get(hook.name))
    base.err.2 <-
      sprintf(" Quitting after setting %d/%d hooks", (i - 1), length(which))

    if(inherits(old.hook, 'try-error')) {
      warning(
        "Failed retrieving '", hook.name, "' hook from the knit hooks; ",
        base.err, base.err.2
      )
      break
    }
    if(!is.function(old.hook)) {
      warning(
        "Retrieved '", hook.name, "' hook is not a function; ",
        base.err, base.err.2
      )
      break
    }
    new.hook.list[[i]] <- make_hook(old.hook, class[[i]], split.nl)
    old.hook.list[[i]] <- old.hook
  }
  if(
    inherits(
      set.res <- try(do.call(hooks[['set']], new.hook.list)), 'try-error'
    )
  )
    warning("Failure while trying to set hooks; see prior error; ", base.err)

  # Emit the CSS as a side effect so the converted chunks are styled
  writeLines(c("<STYLE type='text/css' scoped>", style, "</STYLE>"))
  if(.test) list(old.hooks=old.hook.list, new.hooks=new.hook.list, res=set.res)
  else old.hook.list
}
#' Show 8 Bit CSI SGR Colors
#'
#' Generates text with each 8 bit SGR code (e.g.
the "###" in "38;5;###") with #' the background colored by itself, and the foreground in a contrasting color #' and interesting color (we sacrifice some contrast for interest as this is #' intended for demo rather than reference purposes). #' #' @seealso [make_styles()]. #' @export #' @return character vector with SGR codes with background color set as #' themselves. #' @examples #' writeLines(sgr_256()) sgr_256 <- function() { tpl <- "\033[38;5;%d;48;5;%dm%s\033[m" # Basic, bright, grayscale basic <- paste0(sprintf(tpl, 15, 0:7, format(0:7, width=3)), collapse=" ") bright <- paste0(sprintf(tpl, 0, 8:15, format(8:15, width=3)), collapse=" ") gs1 <- paste0(sprintf(tpl, 15, 232:243, format(232:243, width=3)), collapse=" ") gs2 <- paste0(sprintf(tpl, 0, 244:255, format(244:255, width=3)), collapse=" ") # Color parts fg <- 231:16 bg <- rev(fg) # reverse fg/bg so we can read the numbers } table <- matrix(sprintf(tpl, fg, bg, format(bg)), 36) part.a <- do.call(paste0, c(split(table[1:18,], row(table[1:18,])))) part.b <- do.call(paste0, c(split(table[-(1:18),], row(table[-(1:18),])))) ## Output c( "Standard", basic, "", "High-Intensity", bright, "", "216 Colors (Dark)", part.a, "", "216 Colors (Light)", part.b, "", "Grayscale", gs1, gs2 ) } # To test growable buffer. size_buff <- function(x) .Call(FANSI_size_buff, x) size_buff_prot_test <- function() { raw <- .Call(FANSI_size_buff_prot_test) res <- raw[-1L] names(res) <- c('n', 'prev', 'self') res <- as.data.frame(res) # stringsAsFactors issues res[['prev']] <- as.character(res[['prev']]) res[['self']] <- as.character(res[['self']]) rownames(res) <- raw[[1L]] # remap the addresses so they are consistent across different runs addresses <- do.call(rbind, res[c('prev', 'self')]) res[['prev']] <- match(res[['prev']], addresses) res[['self']] <- match(res[['self']], addresses) res } #' Display Strings to Terminal #' #' Shortcut for [`writeLines`] with an additional terminating "ESC[0m". 
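#'
#' A rough base R equivalent, shown only as a sketch (`fwl_sketch` is a
#' hypothetical name used for illustration, not part of `fansi`):
#'
#' ```
#' ## Hypothetical stand-in for `fwl`: append a closing "ESC[0m" element so
#' ## that active formatting does not bleed into subsequent terminal output.
#' fwl_sketch <- function(..., end = "\033[0m") writeLines(c(..., end))
#' fwl_sketch("\033[42mgreen background")
#' ```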
#' #' @keywords internal #' @export #' @param ... character vectors to display. #' @param end character what to output after the primary inputs. #' @return whatever writeLines returns fwl <- function(..., end='\033[0m') { writeLines(c(..., end)) } #' Default Arg Helper Funs #' #' Terminal capabilities are assumed to include bright and 256 color SGR codes. #' 24 bit color support is detected based on the `COLORTERM` environment #' variable. #' #' Default CSS may exceed or fail to cover the interline distance when two lines #' have background colors. To ensure lines are exactly touching use #' inline-block, although that has its own issues. Otherwise specify your own #' CSS. #' #' @seealso [`term_cap_test`]. #' @export #' @return character to use as default value for `fansi` parameter. dflt_term_cap <- function() { c( if(isTRUE(Sys.getenv('COLORTERM') %in% c('truecolor', '24bit'))) 'truecolor', 'bright', '256' ) } #' @rdname dflt_term_cap #' @export dflt_css <- function() { "PRE.fansi SPAN {padding-top: .25em; padding-bottom: .25em};" } fansi/R/strtrim.R0000755000176200001440000001025614507512660013417 0ustar liggesusers## Copyright (C) Brodie Gaslam ## ## This file is part of "fansi - ANSI Control Sequence Aware String Functions" ## ## This program is free software: you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation, either version 2 or 3 of the License. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## Go to for copies of the licenses. 
#' Control Sequence Aware Version of strtrim #' #' A drop in replacement for [`base::strtrim`], with the difference that all #' C0 control characters such as newlines, carriage returns, etc., are always #' treated as zero width, whereas in base it may vary with platform / R version. #' #' `strtrim2_ctl` adds the option of converting tabs to spaces before trimming. #' This is the only difference between `strtrim_ctl` and `strtrim2_ctl`. #' #' @export #' @note Non-ASCII strings are converted to and returned in UTF-8 encoding. #' Width calculations will not work properly in R < 3.2.2. #' @inheritParams base::strtrim #' @inheritParams strwrap_ctl #' @inherit substr_ctl seealso #' @return Like [`base::strtrim`], except that _Control Sequences_ are treated #' as zero width. #' @examples #' strtrim_ctl("\033[42mHello world\033[m", 6) strtrim_ctl <- function( x, width, warn=getOption('fansi.warn', TRUE), ctl='all', normalize=getOption('fansi.normalize', FALSE), carry=getOption('fansi.carry', FALSE), terminate=getOption('fansi.terminate', TRUE) ) { strtrim2_ctl( x=x, width=width, warn=warn, ctl=ctl, normalize=normalize, carry=carry, terminate=terminate ) } #' @export #' @rdname strtrim_ctl strtrim2_ctl <- function( x, width, warn=getOption('fansi.warn', TRUE), tabs.as.spaces=getOption('fansi.tabs.as.spaces', FALSE), tab.stops=getOption('fansi.tab.stops', 8L), ctl='all', normalize=getOption('fansi.normalize', FALSE), carry=getOption('fansi.carry', FALSE), terminate=getOption('fansi.terminate', TRUE) ) { ## modifies / creates NEW VARS in fun env VAL_IN_ENV( x=x, warn=warn, ctl=ctl, tabs.as.spaces=tabs.as.spaces, tab.stops=tab.stops, normalize=normalize, carry=carry, terminate=terminate ) if(!is.numeric(width) || length(width) != 1L || is.na(width) || width < 0) stop( "Argument `width` must be a positive scalar numeric representable ", "as an integer." 
) # can assume all term cap available for these purposes term.cap.int <- 1L width <- as.integer(width) tab.stops <- as.integer(tab.stops) # a bit inefficient to rely on strwrap, but oh well res <- .Call( FANSI_strwrap_csi, x, width, 0L, 0L, # indent, exdent "", "", # prefix, initial TRUE, "", # wrap always FALSE, # strip spaces tabs.as.spaces, tab.stops, WARN.INT, term.cap.int, TRUE, # first only CTL.INT, normalize, carry, terminate ) if(normalize) normalize_state(res, warn=FALSE) else res } #' Control Sequence Aware Version of strtrim #' #' These functions are deprecated in favor of the [`strtrim_ctl`] flavors. #' #' @inheritParams strtrim_ctl #' @inherit strtrim_ctl return #' @keywords internal #' @export strtrim_sgr <- function( x, width, warn=getOption('fansi.warn', TRUE), normalize=getOption('fansi.normalize', FALSE), carry=getOption('fansi.carry', FALSE), terminate=getOption('fansi.terminate', TRUE) ) strtrim_ctl( x=x, width=width, warn=warn, ctl='sgr', normalize=normalize, carry=carry, terminate=terminate ) #' @export #' @rdname strtrim_sgr strtrim2_sgr <- function(x, width, warn=getOption('fansi.warn', TRUE), tabs.as.spaces=getOption('fansi.tabs.as.spaces', FALSE), tab.stops=getOption('fansi.tab.stops', 8L), normalize=getOption('fansi.normalize', FALSE), carry=getOption('fansi.carry', FALSE), terminate=getOption('fansi.terminate', TRUE) ) strtrim2_ctl( x=x, width=width, warn=warn, tabs.as.spaces=tabs.as.spaces, tab.stops=tab.stops, ctl='sgr', normalize=normalize, carry=carry, terminate=terminate ) fansi/R/internal.R0000755000176200001440000002245514507512660013533 0ustar liggesusers## Copyright (C) Brodie Gaslam ## ## This file is part of "fansi - ANSI Control Sequence Aware String Functions" ## ## This program is free software: you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation, either version 2 or 3 of the License. 
## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## Go to for copies of the licenses. ## Internal environment (mostly just to store version) FANSI.ENV <- new.env() ## Global variables utils::globalVariables( c("TERM.CAP.INT", "WARN.INT", "CTL.INT", "TYPE.INT", "ROUND.INT", "X.LEN") ) ## Internal functions, used primarily for testing ## Testing interface for color code to HTML conversion esc_color_code_to_html <- function(x) { if(!is.matrix(x) || !is.integer(x) || nrow(x) != 5) stop("Argument `x` must be a five row integer matrix.") .Call(FANSI_color_to_html, as.integer(x)) } check_assumptions <- function() .Call(FANSI_check_assumptions) # nocov add_int <- function(x, y) .Call(FANSI_add_int, as.integer(x), as.integer(y)) ## testing interface for low overhead versions of R funs set_int_max <- function(x) .Call(FANSI_set_int_max, as.integer(x)[1]) get_int_max <- function(x) .Call(FANSI_get_int_max) # nocov for debug only set_rlent_max <- function(x) .Call(FANSI_set_rlent_max, as.integer(x)[1]) reset_limits <- function(x) .Call(FANSI_reset_limits) get_warn_all <- function() .Call(FANSI_get_warn_all) get_warn_mangled <- function() .Call(FANSI_get_warn_mangled) get_warn_utf8 <- function() .Call(FANSI_get_warn_utf8) get_warn_worst <- function() bitwOr(get_warn_mangled(), get_warn_utf8()) get_warn_error <- function() .Call(FANSI_get_warn_error) ## For testing version specific code set_rver <- function(x=getRversion()) { old <- FANSI.ENV[['r.ver']] FANSI.ENV[['r.ver']] <- x invisible(old) } ## exposed internals for testing check_enc <- function(x, i) .Call(FANSI_check_enc, x, as.integer(i)[1]) ## make sure `ctl` compression working ctl_as_int <- function(x) .Call(FANSI_ctl_as_int, as.integer(x)) ## testing interface for bridging bridge <- function( end, restart, 
term.cap=getOption("fansi.term.cap", dflt_term_cap()), normalize=getOption('fansi.normalize', FALSE) ) { VAL_IN_ENV(term.cap=term.cap) .Call(FANSI_bridge_state, end, restart, TERM.CAP.INT, normalize) } ## Common argument validation and conversion. Missing args okay. ## ## Converts common arguments to standardized forms if needed. ## ## DANGER: will modify values in calling environment! Also may add variables ## such as CTL.INT, X.LEN, etc. (these should all be in caps). VAL_IN_ENV <- function( ..., valid.types=c('chars', 'width', 'graphemes'), warn.mask=get_warn_all() ) { call <- sys.call(-1) par.env <- parent.frame() stop2 <- function(...) stop(simpleError(paste0(..., collapse=""), call)) args <- list(...) argnm <- names(args) if( !all( argnm %in% c( 'x', 'warn', 'term.cap', 'ctl', 'normalize', 'carry', 'terminate', 'tab.stops', 'tabs.as.spaces', 'strip.spaces', 'round', 'type', 'start', 'stop', 'keepNA', 'allowNA', 'value', # meta parameters (i.e. internal parameters) 'valid.types' # nchar and substr allow different things ) ) ) stop("Internal Error: some arguments to validate unknown") if('x' %in% argnm) { x <- args[['x']] if(!is.character(x)) x <- as.character(args[['x']]) enc <- Encoding(x) x <- enc_to_utf8(x, enc) if(length(which.byte <- which(enc == "bytes"))) stop2( "Argument `x` contains a \"bytes\" encoded string at index [", which.byte[1],"]", if(length(which.byte) > 1) "and others, " else ", ", "which is disallowed." 
) args[['x']] <- x } if('warn' %in% argnm) { warn <- args[['warn']] if(!is.logical(warn)) warn <- as.logical(args[['warn']]) if(length(warn) != 1L || is.na(warn)) stop2("Argument `warn` must be TRUE or FALSE.") args[['warn']] <- warn args[['WARN.INT']] <- if(warn) warn.mask else bitwAnd(warn.mask, get_warn_error()) } if('normalize' %in% argnm) { normalize <- as.logical(args[['normalize']]) if(!isTRUE(normalize %in% c(FALSE, TRUE))) stop2("Argument `normalize` must be TRUE or FALSE.") args[['normalize']] <- as.logical(normalize) } if('term.cap' %in% argnm) { term.cap <- args[['term.cap']] if(!is.character(term.cap)) stop2("Argument `term.cap` must be character.") if(anyNA(term.cap.int <- match(term.cap, VALID.TERM.CAP))) stop2( "Argument `term.cap` may only contain values in ", deparse(VALID.TERM.CAP) ) args[['TERM.CAP.INT']] <- term.cap.int } if('ctl' %in% argnm) { ctl <- args[['ctl']] if(!is.character(ctl)) stop2("Argument `ctl` must be character.") ctl.int <- integer() if(length(ctl)) { # duplicate values in `ctl` are okay, so save a call to `unique` here if(anyNA(ctl.int <- match(ctl, VALID.CTL))) stop2( "Argument `ctl` may contain only values in `", deparse(VALID.CTL), "`" ) } args[['CTL.INT']] <- ctl.int } if('carry' %in% argnm) { carry <- args[['carry']] if(length(carry) != 1L) stop2("Argument `carry` must be scalar.") if(!is.logical(carry) && !is.character(carry)) stop2("Argument `carry` must be logical or character.") if(is.na(carry)) stop2("Argument `carry` may not be NA.") if('value' %in% argnm && !is.logical(carry)) stop2("Argument `carry` must be TRUE or FALSE in replacement mode.") if(is.logical(carry)) if(carry) carry <- "" else carry = NA_character_ args[['carry']] <- carry } if('terminate' %in% argnm) { terminate <- as.logical(args[['terminate']]) if(!isTRUE(terminate %in% c(TRUE, FALSE))) stop2("Argument `terminate` must be TRUE or FALSE") terminate <- as.logical(terminate) } if('tab.stops' %in% argnm) { tab.stops <- args[['tab.stops']] if( 
!is.numeric(tab.stops) || !length(tab.stops) || any(tab.stops < 1) || anyNA(tab.stops) ) stop2( "Argument `tab.stops` must be numeric, strictly positive, and ", "representable as an integer." ) args[['tab.stops']] <- as.integer(tab.stops) } if('tabs.as.spaces' %in% argnm) { tabs.as.spaces <- args[['tabs.as.spaces']] if(!is.logical(tabs.as.spaces)) tabs.as.spaces <- as.logical(tabs.as.spaces) if(length(tabs.as.spaces) != 1L || is.na(tabs.as.spaces)) stop2("Argument `tabs.as.spaces` must be TRUE or FALSE.") args[['tabs.as.spaces']] <- tabs.as.spaces } if('strip.spaces' %in% argnm) { strip.spaces <- args[['strip.spaces']] if(!is.logical(strip.spaces)) strip.spaces <- as.logical(strip.spaces) if(length(strip.spaces) != 1L || is.na(strip.spaces)) stop2("Argument `strip.spaces` must be TRUE or FALSE.") args[['strip.spaces']] <- strip.spaces } if('round' %in% argnm) { # be sure to update FANSI_RND_* defines in C code if this changes valid.round <- c('start', 'stop', 'both', 'neither') round <- args[['round']] if( !is.character(round) || length(round) != 1 || is.na(round.int <- pmatch(round, valid.round)) ) stop2("Argument `round` must partial match one of ", deparse(valid.round)) args[['round']] <- valid.round[round.int] args[['ROUND.INT']] <- round.int } if('type' %in% argnm) { type <- args[['type']] if( !is.character(type) || length(type) != 1 || is.na(type) || is.na(type.int <- pmatch(type, valid.types)) ) stop2("Argument `type` must partial match one of ", deparse(valid.types)) args[['type']] <- valid.types[type.int] args[['TYPE.INT']] <- type.int - 1L } if('start' %in% argnm || 'stop' %in% argnm) { x.len <- length(args[['x']]) # Silently recycle start/stop like substr does. 
Coercion to integer # should be done ahead of VAL_IN_ENV so warnings are reported # correctly start <- rep(as.integer(args[['start']]), length.out=x.len) stop <- rep(as.integer(args[['stop']]), length.out=x.len) args[['start']] <- start args[['stop']] <- stop args[['X.LEN']] <- x.len } if('keepNA' %in% argnm) { keepNA <- as.logical(args[['keepNA']]) if(length(keepNA) != 1L) stop2("Argument `keepNA` must be interpretable as a scalar logical.") args[['keepNA']] <- keepNA } if('allowNA' %in% argnm) { allowNA <- as.logical(args[['allowNA']]) if(length(allowNA) != 1L) stop2("Argument `allowNA` must be interpretable as a scalar logical.") args[['allowNA']] <- isTRUE(allowNA) } # we might not have validated all, so we should be careful list2env(args, par.env) } ## Encode to UTF-8 If needed ## ## Problem is that if native is UTF-8, unknown vectors are re-encoded, ## which will include escaping of bad encoding which hides errors. ## ## Assumes char input enc_to_utf8 <- function(x, enc=Encoding(x)) { if(isTRUE(l10n_info()[['UTF-8']])) { # in theory just "latin1", but just in case other encs added translate <- enc != "unknown" & enc != "UTF-8" x[translate] <- enc2utf8(x[translate]) x } else enc2utf8(x) # nocov tested manually } fansi/R/unhandled.R0000755000176200001440000001304614507512660013655 0ustar liggesusers## Copyright (C) Brodie Gaslam ## ## This file is part of "fansi - ANSI Control Sequence Aware String Functions" ## ## This program is free software: you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation, either version 2 or 3 of the License. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## Go to for copies of the licenses. 
#' Identify Unhandled Control Sequences
#'
#' Will return position and types of unhandled _Control Sequences_ in a
#' character vector. Unhandled sequences may cause `fansi` to interpret strings
#' in a way different to your display. See [fansi] for details. Functions that
#' interpret _Special Sequences_ (CSI SGR or OSC hyperlinks) might omit bad
#' _Special Sequences_ or some of their components in output substrings,
#' particularly if they are leading or trailing. Some functions are more
#' tolerant of bad inputs than others. For example [`nchar_ctl`] will not
#' report unsupported colors because it only cares about counts or widths.
#' `unhandled_ctl` will report all potentially problematic sequences.
#'
#' To work around tabs present in input, you can use [`tabs_as_spaces`] or the
#' `tabs.as.spaces` parameter on functions that have it, or the [`strip_ctl`]
#' function to remove the troublesome sequences. Alternatively, you can use
#' `warn=FALSE` to suppress the warnings.
#'
#' This is a debugging function that is not optimized for speed and the precise
#' output of which might change with `fansi` versions.
#'
#' The return value is a data frame with six columns:
#'
#' * index: integer the index in `x` with the unhandled sequence
#' * start: integer the start position of the sequence (in characters)
#' * stop: integer the end of the sequence (in characters), but note that if
#'   there are multiple ESC sequences abutting each other they will all be
#'   treated as one, even if some of those sequences are valid.
#' * error: the reason why the sequence was not handled:
#'     * unknown-substr: SGR substring with a value that does not correspond
#'       to a known SGR code or OSC hyperlink with unsupported parameters.
#'     * invalid-substr: SGR contains uncommon characters in ":<=>",
#'       intermediate bytes, other invalid characters, or there is an invalid
#'       subsequence (e.g. "ESC[38;2m" which should specify an RGB triplet
#'       but does not).
#'       OSCs contain invalid bytes, or OSC hyperlinks contain
#'       otherwise valid OSC bytes in 0x08-0x0d.
#'     * exceed-term-cap: contains color codes not supported by the terminal
#'       (see [term_cap_test]). Bright colors with color codes in the 90-97 and
#'       100-107 range in terminals that do not support them are not considered
#'       errors, whereas 256 or truecolor codes in terminals that do not support
#'       them are. This is because the latter are often misinterpreted by
#'       terminals that do not support them, whereas the former are typically
#'       silently ignored.
#'     * non-SGR/hyperlink: a non-SGR CSI sequence, or non-hyperlink OSC
#'       sequence.
#'     * CSI/OSC-bad-substr: a CSI or OSC sequence containing invalid
#'       characters.
#'     * malformed-CSI/OSC: a malformed CSI or OSC sequence, typically one that
#'       never encounters its closing sequence before the end of a string.
#'     * non-CSI/OSC: a non-CSI or non-OSC escape sequence, i.e. one where the
#'       ESC is followed by something other than "[" or "]". Since we
#'       assume all non-CSI sequences are only 2 characters long including the
#'       ESC, this type of sequence is the most likely to cause problems as some
#'       are not actually two characters long.
#'     * malformed-ESC: a malformed two byte ESC sequence (i.e. one not ending
#'       in 0x40-0x7e).
#'     * C0: a "C0" control character (e.g. tab, bell, etc.).
#'     * malformed-UTF8: illegal UTF8 encoding.
#'     * non-ASCII: non-ASCII bytes in escape sequences.
#' * translated: whether the string was translated to UTF-8, might be helpful in
#'   odd cases where character offsets change depending on encoding. You should
#'   only worry about this if you cannot tie out the `start`/`stop` values to
#'   the escape sequence shown.
#' * esc: character the unhandled escape sequence
#'
#' @note Non-ASCII strings are converted to UTF-8 encoding.
#' @export #' @inherit has_ctl seealso #' @param x character vector #' @inheritParams substr_ctl #' @return Data frame with as many rows as there are unhandled escape #' sequences and columns containing useful information for debugging the #' problem. See details. #' #' @examples #' string <- c( #' "\033[41mhello world\033[m", "foo\033[22>m", "\033[999mbar", #' "baz \033[31#3m", "a\033[31k", "hello\033m world" #' ) #' unhandled_ctl(string) unhandled_ctl <- function( x, term.cap=getOption('fansi.term.cap', dflt_term_cap()) ) { ## modifies / creates NEW VARS in fun env VAL_IN_ENV(x=x, term.cap=term.cap) res <- .Call(FANSI_unhandled_esc, x, TERM.CAP.INT) names(res) <- c("index", "start", "stop", "error", "translated", "esc") errors <- c( 'unknown-substr', 'invalid-substr', 'exceed-term-cap', 'non-SGR/hyperlink', 'CSI/OSC-bad-substr', 'malformed-CSI/OSC', 'non-CSI/OSC', 'malformed-ESC', 'C0', 'malformed-UTF8', 'non-ASCII' ) res[['error']] <- errors[res[['error']]] as.data.frame(res, stringsAsFactors=FALSE) } fansi/R/substr2.R0000755000176200001440000005260314510300164013305 0ustar liggesusers## Copyright (C) Brodie Gaslam ## ## This file is part of "fansi - ANSI Control Sequence Aware String Functions" ## ## This program is free software: you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation, either version 2 or 3 of the License. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## Go to for copies of the licenses. #' Control Sequence Aware Version of substr #' #' `substr_ctl` is a drop-in replacement for `substr`. Performance is #' slightly slower than `substr`, and more so for `type = 'width'`. 
#' Special _Control Sequences_ will be included in the substrings to reflect
#' their format as it was when part of the source string. `substr2_ctl` adds
#' the ability to extract substrings based on grapheme count or display width in
#' addition to the normal character count, as well as several other options.
#'
#' @section Control and Special Sequences:
#'
#' _Control Sequences_ are non-printing characters or sequences of characters.
#' _Special Sequences_ are a subset of the _Control Sequences_, and include CSI
#' SGR sequences which can be used to change rendered appearance of text, and
#' OSC hyperlinks. See [`fansi`] for details.
#'
#' @section Position Semantics:
#'
#' When computing substrings, _Normal_ (non-control) characters are considered
#' to occupy positions in strings, whereas _Control Sequences_ occupy the
#' interstices between them. The string:
#'
#' ```
#' "hello-\033[31mworld\033[m!"
#' ```
#'
#' is interpreted as:
#'
#' ```
#'                   1 1 1
#' 1 2 3 4 5 6 7 8 9 0 1 2
#' h e l l o -|w o r l d|!
#'            ^         ^
#'            \033[31m  \033[m
#' ```
#'
#' `start` and `stop` reference character positions so they never explicitly
#' select for the interstitial _Control Sequences_. The latter are implicitly
#' selected if they appear in interstices after the first character and before
#' the last. Additionally, because _Special Sequences_ (CSI SGR and OSC
#' hyperlinks) affect all subsequent characters in a string, any active _Special
#' Sequence_, whether opened just before a character or much before, will be
#' reflected in the state `fansi` prepends to the beginning of each substring.
#'
#' It is possible to select _Control Sequences_ at the end of a string by
#' specifying `stop` values past the end of the string, although for _Special
#' Sequences_ this only produces visible results if `terminate` is set to
#' `FALSE`.
Similarly, it is possible to select _Control Sequences_ preceding #' the beginning of a string by specifying `start` values less than one, #' although as noted earlier this is unnecessary for _Special Sequences_ as #' those are output by `fansi` before each substring. #' #' Because exact substrings on anything other than character count cannot be #' guaranteed (e.g. as a result of multi-byte encodings, or double display-width #' characters) `substr2_ctl` must make assumptions on how to resolve provided #' `start`/`stop` values that are infeasible and does so via the `round` #' parameter. #' #' If we use "start" as the `round` value, then any time the `start` #' value corresponds to the middle of a multi-byte or a wide character, then #' that character is included in the substring, while any similar partially #' included character via the `stop` is left out. The converse is true if we #' use "stop" as the `round` value. "neither" would cause all partial #' characters to be dropped irrespective whether they correspond to `start` or #' `stop`, and "both" could cause all of them to be included. See examples. #' #' A number of _Normal_ characters such as combining diacritic marks have #' reported width of zero. These are typically displayed overlaid on top of the #' preceding glyph, as in the case of `"e\u301"` forming "e" with an acute #' accent. Unlike _Control Sequences_, which also have reported width of zero, #' `fansi` groups zero-width _Normal_ characters with the last preceding #' non-zero width _Normal_ character. This is incorrect for some rare #' zero-width _Normal_ characters such as prepending marks (see "Output #' Stability" and "Graphemes"). #' #' @section Output Stability: #' #' Several factors could affect the exact output produced by `fansi` #' functions across versions of `fansi`, `R`, and/or across systems. #' **In general it is best not to rely on exact `fansi` output, e.g. by #' embedding it in tests**. 
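#'
#' As a sketch of what this means for tests, assuming only base R
#' (`strip_sgr_sketch` is a hypothetical helper, not a `fansi` function, and its
#' regular expression only covers simple CSI SGR sequences): compare the text
#' left after removing the sequences rather than the exact encoded strings.
#'
#' ```
#' ## Hypothetical helper: drop simple CSI SGR sequences such as "ESC[31m"
#' strip_sgr_sketch <- function(x) gsub("\033\\[[0-9;]*m", "", x)
#'
#' ## Two encodings of the same rendered text: "ESC[m" and "ESC[0m" both reset
#' a <- "\033[31mworld\033[m"
#' b <- "\033[31mworld\033[0m"
#'
#' identical(a, b)                                      # FALSE
#' identical(strip_sgr_sketch(a), strip_sgr_sketch(b))  # TRUE
#' ```
#'
#' Within `fansi` itself, [`strip_ctl`] is the more natural tool for the same
#' kind of comparison.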
#'
#' Width and grapheme calculations depend on locale, Unicode database
#' version, and grapheme processing logic (which is still in development), among
#' other things.  For the most part `fansi` (currently) uses the internals of
#' `base::nchar(type='width')`, but there are exceptions and this may change in
#' the future.
#'
#' How a particular display format is encoded in _Control Sequences_ is
#' not guaranteed to be stable across `fansi` versions.  Additionally, which
#' _Special Sequences_ are re-encoded vs transcribed untouched may change.
#' In general we will strive to keep the rendered appearance stable.
#'
#' To maximize the odds of getting stable output set `normalize` to
#' `TRUE` and `type` to `"chars"` in functions that allow it, and
#' set `term.cap` to a specific set of capabilities.
#'
#' @section Replacement Functions:
#'
#' Semantics for replacement functions have the additional requirement that the
#' result appear as if it is the input modified in place between the positions
#' designated by `start` and `stop`.  `terminate` only affects the boundaries
#' between the original substring and the spliced one, `normalize` only affects
#' the same boundaries, and `tabs.as.spaces` only affects `value`.  `x` must
#' be ASCII only or marked "UTF-8".
#'
#' `terminate = FALSE` only makes sense in replacement mode if only one of `x`
#' or `value` contains _Control Sequences_.  `fansi` will not account for any
#' interactions of state in `x` and `value`.
#'
#' The `carry` parameter causes state to carry within the original string and
#' the replacement values independently, as if they were columns of text cut
#' from different pages and pasted together.  String values for `carry` are
#' disallowed in replacement mode as it is ambiguous which of `x` or `value`
#' they would modify (see examples).
#'
#' When in `type = 'width'` mode, it is only guaranteed that the result will be
#' no wider than the original `x`.
#' Narrower strings may result if a mixture
#' of narrow and wide graphemes cannot be replaced exactly with the same `width`
#' value, possibly because the provided `start` and `stop` values (or the
#' implicit ones generated for `value`) do not align with grapheme boundaries.
#'
#' @section Graphemes:
#'
#' `fansi` approximates grapheme widths and counts by using heuristics for
#' grapheme breaks that work for most common graphemes, including emoji
#' combining sequences.  The heuristic is known to work incorrectly with
#' invalid combining sequences, prepending marks, and sequence interruptors.
#' `fansi` does not provide a full implementation of grapheme break detection to
#' avoid carrying a copy of the Unicode grapheme breaks table, and also because
#' the hope is that R itself will eventually add the feature.
#'
#' The [`utf8`](https://cran.r-project.org/package=utf8) package provides a
#' conforming grapheme parsing implementation.
#'
#' @section Bidirectional Text:
#'
#' `fansi` is unaware of text directionality and operates as if all strings are
#' left to right (LTR).  Using `fansi` functions with strings that contain mixed
#' direction scripts (i.e. both LTR and RTL) may produce undesirable results.
#'
#' @note Non-ASCII strings are converted to and returned in UTF-8 encoding.
#'   Width calculations will not work properly in R < 3.2.2.
#' @note If `stop` < `start`, the return value is always an empty string.
#' @export
#' @seealso [`?fansi`][fansi] for details on how _Control Sequences_ are
#'   interpreted, particularly if you are getting unexpected results,
#'   [`normalize_state`] for more details on what the `normalize` parameter does,
#'   [`state_at_end`] to compute active state at the end of strings,
#'   [`close_state`] to compute the sequence required to close active state.
#' @param x a character vector or object that can be coerced to such.
#' @param start integer.  The first element to be extracted or replaced.
#' @param stop integer.
#'   The last element to be extracted or replaced.
#' @param type character(1L) partial matching
#'   `c("chars", "width", "graphemes")`, although types other than "chars" only
#'   work correctly with R >= 3.2.2.  See [`?nchar`][base::nchar].
#' @param round character(1L) partial matching
#'   `c("start", "stop", "both", "neither")`, controls how to resolve
#'   ambiguities when a `start` or `stop` value in "width" `type` mode falls
#'   within a wide display character.  See details.
#' @param tabs.as.spaces FALSE (default) or TRUE, whether to convert tabs to
#'   spaces (and suppress tab-related warnings).  This can only be set to TRUE if
#'   `strip.spaces` is FALSE.
#' @param tab.stops integer(1:n) indicating position of tab stops to use
#'   when converting tabs to spaces.  If there are more tabs in a line than
#'   defined tab stops the last tab stop is re-used.  For the purposes of
#'   applying tab stops, each input line is considered a line and the character
#'   count begins from the beginning of the input line.
#' @param ctl character, which _Control Sequences_ should be treated
#'   specially.  Special treatment is context dependent, and may include
#'   detecting them and/or computing their display/character width as zero.  For
#'   the SGR subset of the ANSI CSI sequences, and OSC hyperlinks, `fansi`
#'   will also parse, interpret, and reapply the sequences as needed.  You can
#'   modify whether a _Control Sequence_ is treated specially with the `ctl`
#'   parameter.
#'
#'   * "nl": newlines.
#'   * "c0": all other "C0" control characters (i.e. 0x01-0x1f, 0x7F), except
#'     for newlines and the actual ESC (0x1B) character.
#'   * "sgr": ANSI CSI SGR sequences.
#'   * "csi": all non-SGR ANSI CSI sequences.
#'   * "url": OSC hyperlinks.
#'   * "osc": all non-OSC-hyperlink OSC sequences.
#'   * "esc": all other escape sequences.
#'   * "all": all of the above, except when used in combination with any of the
#'     above, in which case it means "all but".
#' @param warn TRUE (default) or FALSE, whether to warn when potentially
#'   problematic _Control Sequences_ are encountered.  These could cause the
#'   assumptions `fansi` makes about how strings are rendered on your display
#'   to be incorrect, for example by moving the cursor (see [`?fansi`][fansi]).
#'   At most one warning will be issued per element in each input vector.  Will
#'   also warn about some badly encoded UTF-8 strings, but a lack of UTF-8
#'   warnings is not a guarantee of correct encoding (use [`validUTF8`] for
#'   that).
#' @param term.cap a character vector of the capabilities of the terminal, can
#'   be any combination of "bright" (SGR codes 90-97, 100-107), "256" (SGR codes
#'   starting with "38;5" or "48;5"), "truecolor" (SGR codes starting with
#'   "38;2" or "48;2"), and "all".  "all" behaves as it does for the `ctl`
#'   parameter: "all" combined with any other value means all terminal
#'   capabilities except that one.  `fansi` will warn if it encounters SGR codes
#'   that exceed the terminal capabilities specified (see [`term_cap_test`]
#'   for details).  In versions prior to 1.0, `fansi` would also skip exceeding
#'   SGRs entirely instead of interpreting them.  You may add the string "old"
#'   to any otherwise valid `term.cap` spec to restore the pre-1.0 behavior.
#'   "old" will not interact with "all" the way other valid values for this
#'   parameter do.
#' @param normalize TRUE or FALSE (default) whether SGR sequences should be
#'   normalized out such that there is one distinct sequence for each SGR code.
#'   Normalized strings will occupy more space (e.g. "\033[31;42m" becomes
#'   "\033[31m\033[42m"), but will work better with code that assumes each SGR
#'   code will be in its own escape as `crayon` does.
#' @param carry TRUE, FALSE (default), or a scalar string, controls whether to
#'   interpret the character vector as a "single document" (TRUE or string) or
#'   as independent elements (FALSE).
#'   In "single document" mode, active state
#'   at the end of an input element is considered active at the beginning of the
#'   next vector element, simulating what happens with a document with active
#'   state at the end of a line.  If FALSE each vector element is interpreted as
#'   if there were no active state when it begins.  If character, then the
#'   active state at the end of the `carry` string is carried into the first
#'   element of `x` (see "Replacement Functions" for differences there).  The
#'   carried state is injected in the interstice between an imaginary zeroth
#'   character and the first character of a vector element.  See the "Position
#'   Semantics" section of [`substr_ctl`] and the "State Interactions" section
#'   of [`?fansi`][fansi] for details.  Except for [`strwrap_ctl`] where `NA` is
#'   treated as the string `"NA"`, `carry` will cause `NA`s in inputs to
#'   propagate through the remaining vector elements.
#' @param terminate TRUE (default) or FALSE whether substrings should have
#'   active state closed to avoid it bleeding into other strings they may be
#'   prepended onto.  This does not stop state from carrying if `carry = TRUE`.
#'   See the "State Interactions" section of [`?fansi`][fansi] for details.
#' @param value a character vector or object that can be coerced to such.
#' @return A character vector of the same length and with the same attributes as
#'   x (after possible coercion and re-encoding to UTF-8).
#' @examples
#' substr_ctl("\033[42mhello\033[m world", 1, 9)
#' substr_ctl("\033[42mhello\033[m world", 3, 9)
#'
#' ## Positions 2 and 4 are in the middle of the full width W (\uFF37) for
#' ## the `start` and `stop` positions respectively.
#' ## Use `round` to control result:
#' x <- "\uFF37n\uFF37"
#' x
#' substr2_ctl(x, 2, 4, type='width', round='start')
#' substr2_ctl(x, 2, 4, type='width', round='stop')
#' substr2_ctl(x, 2, 4, type='width', round='neither')
#' substr2_ctl(x, 2, 4, type='width', round='both')
#'
#' ## We can specify which escapes are considered special:
#' substr_ctl("\033[31mhello\tworld", 1, 6, ctl='sgr', warn=FALSE)
#' substr_ctl("\033[31mhello\tworld", 1, 6, ctl=c('all', 'c0'), warn=FALSE)
#'
#' ## `carry` allows SGR to carry from one element to the next
#' substr_ctl(c("\033[33mhello", "world"), 1, 3)
#' substr_ctl(c("\033[33mhello", "world"), 1, 3, carry=TRUE)
#' substr_ctl(c("\033[33mhello", "world"), 1, 3, carry="\033[44m")
#'
#' ## We can omit the termination
#' bleed <- substr_ctl(c("\033[41mhello", "world"), 1, 3, terminate=FALSE)
#' writeLines(bleed)      # Style will bleed out of string
#' end <- "\033[0m\n"
#' writeLines(end)        # Stanch bleeding
#'
#' ## Trailing sequences omitted unless `stop` past end.
#' substr_ctl("ABC\033[42m", 1, 3, terminate=FALSE)
#' substr_ctl("ABC\033[42m", 1, 4, terminate=FALSE)
#'
#' ## Replacement functions
#' x0 <- x1 <- x2 <- x3 <- c("\033[42mABC", "\033[34mDEF")
#' substr_ctl(x1, 2, 2) <- "_"
#' substr_ctl(x2, 2, 2) <- "\033[m_"
#' substr_ctl(x3, 2, 2) <- "\033[45m_"
#' writeLines(c(x0, end, x1, end, x2, end, x3, end))
#'
#' ## With `carry = TRUE` strings look like original
#' x0 <- x1 <- x2 <- x3 <- c("\033[42mABC", "\033[34mDEF")
#' substr_ctl(x0, 2, 2, carry=TRUE) <- "_"
#' substr_ctl(x1, 2, 2, carry=TRUE) <- "\033[m_"
#' substr_ctl(x2, 2, 2, carry=TRUE) <- "\033[45m_"
#' writeLines(c(x0, end, x1, end, x2, end, x3, end))
#'
#' ## Work-around to specify carry strings in replacement mode
#' x <- c("ABC", "DEF")
#' val <- "#"
#' x2 <- c("\033[42m", x)
#' val2 <- c("\033[45m", rep_len(val, length(x)))
#' substr_ctl(x2, 2, 2, carry=TRUE) <- val2
#' (x <- x2[-1])   # drop the element that only carried the initial state
substr_ctl <- function(
  x, start, stop,
  warn=getOption('fansi.warn', TRUE),
  term.cap=getOption('fansi.term.cap', dflt_term_cap()),
  ctl='all', normalize=getOption('fansi.normalize', FALSE),
  carry=getOption('fansi.carry', FALSE),
  terminate=getOption('fansi.terminate', TRUE)
)
  substr2_ctl(
    x=x, start=start, stop=stop, warn=warn, term.cap=term.cap, ctl=ctl,
    normalize=normalize, carry=carry, terminate=terminate
  )

#' @rdname substr_ctl
#' @export
substr2_ctl <- function(
  x, start, stop, type='chars', round='start',
  tabs.as.spaces=getOption('fansi.tabs.as.spaces', FALSE),
  tab.stops=getOption('fansi.tab.stops', 8L),
  warn=getOption('fansi.warn', TRUE),
  term.cap=getOption('fansi.term.cap', dflt_term_cap()),
  ctl='all', normalize=getOption('fansi.normalize', FALSE),
  carry=getOption('fansi.carry', FALSE),
  terminate=getOption('fansi.terminate', TRUE)
) {
  ## So warnings are issued here
  start <- as.integer(start)
  stop <- as.integer(stop)
  ## modifies / creates NEW VARS in fun env
  VAL_IN_ENV(
    x=x, warn=warn, term.cap=term.cap, ctl=ctl, normalize=normalize,
    carry=carry, terminate=terminate,
    tab.stops=tab.stops, tabs.as.spaces=tabs.as.spaces, type=type,
    round=round, start=start, stop=stop
  )
  res <- x
  res[] <- substr_ctl_internal(
    x, start=start, stop=stop,
    type.int=TYPE.INT,
    tabs.as.spaces=tabs.as.spaces, tab.stops=tab.stops,
    warn.int=WARN.INT, term.cap.int=TERM.CAP.INT,
    round.int=ROUND.INT, x.len=X.LEN,
    ctl.int=CTL.INT, normalize=normalize,
    carry=carry, terminate=terminate
  )
  res
}

#' @rdname substr_ctl
#' @export
`substr_ctl<-` <- function(
  x, start, stop,
  warn=getOption('fansi.warn', TRUE),
  term.cap=getOption('fansi.term.cap', dflt_term_cap()),
  ctl='all', normalize=getOption('fansi.normalize', FALSE),
  carry=getOption('fansi.carry', FALSE),
  terminate=getOption('fansi.terminate', TRUE),
  value
) {
  substr2_ctl(
    x=x, start=start, stop=stop, warn=warn, term.cap=term.cap, ctl=ctl,
    normalize=normalize, carry=carry, terminate=terminate
  ) <- value
  x
}

#' @rdname substr_ctl
#' @export
`substr2_ctl<-` <- function(
  x, start, stop, type='chars', round='start',
  tabs.as.spaces=getOption('fansi.tabs.as.spaces', FALSE),
  tab.stops=getOption('fansi.tab.stops', 8L),
  warn=getOption('fansi.warn', TRUE),
  term.cap=getOption('fansi.term.cap', dflt_term_cap()),
  ctl='all', normalize=getOption('fansi.normalize', FALSE),
  carry=getOption('fansi.carry', FALSE),
  terminate=getOption('fansi.terminate', TRUE),
  value
) {
  # So warnings are issued here
  start <- as.integer(start)
  stop <- as.integer(stop)
  # modifies / creates NEW VARS in fun env
  x0 <- x
  VAL_IN_ENV(
    x=x, warn=warn, term.cap=term.cap, ctl=ctl, normalize=normalize,
    tab.stops=tab.stops, tabs.as.spaces=tabs.as.spaces, round=round,
    start=start, stop=stop, type=type, carry=carry,
    value=value
  )
  # In replace mode we shouldn't change the encoding
  if(!all(enc.diff <- Encoding(x) == Encoding(x0)))
    stop(
      "`x` may only contain ASCII or marked UTF-8 encoded strings; ",
      "you can use `enc2utf8` to convert `x` prior to use with ",
      "`substr_ctl<-` (replacement form). Illegal value at position [",
      min(which(!enc.diff)), "]."
    )
  value <- as.character(value)
  if(tabs.as.spaces)
    value <- .Call(
      FANSI_tabs_as_spaces, value, tab.stops,
      0L,  # turn off warning, will be reported later
      TERM.CAP.INT, CTL.INT
    )
  value <- rep_len(enc_to_utf8(value), X.LEN)

  res <- .Call(FANSI_substr,
    x, start, stop, value,
    TYPE.INT, ROUND.INT,
    WARN.INT, TERM.CAP.INT,
    CTL.INT, normalize,
    carry, terminate
  )
  attributes(res) <- attributes(x)
  res
}

#' SGR Control Sequence Aware Version of substr
#'
#' These functions are deprecated in favor of the [`substr_ctl`] flavors.
#'
#' @keywords internal
#' @inheritParams substr_ctl
#' @inherit substr_ctl return
#' @export
substr_sgr <- function(
  x, start, stop,
  warn=getOption('fansi.warn', TRUE),
  term.cap=getOption('fansi.term.cap', dflt_term_cap()),
  normalize=getOption('fansi.normalize', FALSE),
  carry=getOption('fansi.carry', FALSE),
  terminate=getOption('fansi.terminate', TRUE)
)
  substr2_ctl(
    x=x, start=start, stop=stop, warn=warn, term.cap=term.cap, ctl='sgr',
    normalize=normalize, carry=carry, terminate=terminate
  )

#' @rdname substr_sgr
#' @export
substr2_sgr <- function(
  x, start, stop, type='chars', round='start',
  tabs.as.spaces=getOption('fansi.tabs.as.spaces', FALSE),
  tab.stops=getOption('fansi.tab.stops', 8L),
  warn=getOption('fansi.warn', TRUE),
  term.cap=getOption('fansi.term.cap', dflt_term_cap()),
  normalize=getOption('fansi.normalize', FALSE),
  carry=getOption('fansi.carry', FALSE),
  terminate=getOption('fansi.terminate', TRUE)
)
  substr2_ctl(
    x=x, start=start, stop=stop, type=type, round=round,
    tabs.as.spaces=tabs.as.spaces,
    tab.stops=tab.stops, warn=warn, term.cap=term.cap, ctl=c('sgr', 'url'),
    normalize=normalize, carry=carry, terminate=terminate
  )

substr_ctl_internal <- function(
  x, start, stop, type.int, round.int, tabs.as.spaces,
  tab.stops, warn.int, term.cap.int, x.len, ctl.int, normalize,
  carry, terminate
) {
  if(tabs.as.spaces)
    x <- .Call(
      FANSI_tabs_as_spaces, x, tab.stops,
      0L,  # turn off warning, will be reported later
      term.cap.int, ctl.int
    )

  .Call(FANSI_substr,
    x,
    start, stop, NULL,
    type.int, round.int,
    warn.int, term.cap.int,
    ctl.int, normalize,
    carry, terminate
  )
}
fansi/NEWS.md0000755000176200001440000002731014510300734012453 0ustar liggesusers
# fansi Release Notes

## v1.0.5

* Address roxygen2 breaking changes:
    * Add explicit alias for `fansi-package` now that it is no longer
      auto-generated by roxygen2 from the
      [`@docType package` directive](https://github.com/r-lib/roxygen2/issues/1491).
    * Work around
      [changed behavior for `@inheritParams`](https://github.com/r-lib/roxygen2/issues/1515).

## v1.0.4

CRAN compiled code warning suppression release.

* Fix void function declarations and definitions.
* Change `sprintf` to `snprintf`.

## v1.0.3

* Address problem uncovered by gcc-12 linters, although the issue itself could
  not manifest due to redundancy of checks in the code.

## v1.0.0-2

This is a major release and includes some behavior changes.

### Features

* New functions:
    * [#26](https://github.com/brodieG/fansi/issues/26) Replacement forms of
      `substr_ctl` (i.e. `substr_ctl<-`).
    * `state_at_end` to compute active state at end of a string.
    * `close_state` to generate a closing sequence given an active state.
    * [#31](https://github.com/brodieG/fansi/issues/31) `trimws_ctl` as an
      equivalent to `trimws`.
    * [#64](https://github.com/brodieG/fansi/issues/64) `normalize_sgr` converts
      compound _Control Sequences_ into normalized form (e.g. "ESC[44;31m"
      becomes "ESC[31mESC[44m") for better compatibility with
      [`crayon`](https://github.com/r-lib/crayon).  Additionally, most functions
      gain a `normalize` parameter so that they may return their output in
      normalized form (h/t @krlmlr).
* [#74](https://github.com/brodieG/fansi/issues/74) `substr_ctl` and related
  functions are now all-C instead of a combination of C offset computations and
  R-level `substr` operations.  This greatly improves performance, particularly
  for vectors with many distinct strings.  Despite documentation claiming
  otherwise, `substr_ctl` was quite slow in that case.
* [#66](https://github.com/brodieG/fansi/issues/66) Improved grapheme support,
  including accounting for them in `type="width"` mode, as well as a
  `type="graphemes"` mode to measure in graphemes instead of characters.
  Implementation is based on heuristics designed to work in most common use
  cases.
* `html_esc` gains a `what` parameter to indicate which HTML special characters
  should be escaped.
* Many functions gain `carry` and `terminate` parameters to control how `fansi`
  generated substrings interact with surrounding formats.
* [#71](https://github.com/brodieG/fansi/issues/71) Functions that write SGR
  and OSC are now more parsimonious (see "Behavior Changes" below).
* [#73](https://github.com/brodieG/fansi/issues/73) Default parameter values
  retrieved with `getOption` now always have explicit fallback values defined
  (h/t @gadenbui).
* Better warnings and error messages, including more granular messages for
  `unhandled_ctl` for adjacent _Control Sequences_.
* `term.cap` parameter now accepts "all" as value, like the `ctl` parameter.

### Deprecated Functions

* All the "sgr" functions (e.g., `substr_sgr`, `strwrap_sgr`) are deprecated.
  They will likely live on indefinitely, but they are of limited usefulness and
  with the added support for OSC hyperlinks their name is misleading.
* `sgr_to_html` is now `to_html` with slight modifications to semantics; the
  old function remains and does not warn about unescaped "<" or ">" in the
  input string.

### Behavior Changes

The major intentional behavior change is to default `fansi` to always recognize
true color CSI SGR sequences (e.g. `"ESC[38;2;128;50;245m"`).  The prior default
was to match the active terminal capabilities, but it is unlikely that the
intent of a user manipulating a string with truecolor sequences is to interpret
them incorrectly, even if their terminal does.  `fansi` will continue to warn in
this case.  To keep the pre-1.0 behavior add `"old"` to the `term.cap`
parameter.
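
As a hedged sketch of the `term.cap` point above, assuming `fansi` >= 1.0 is installed: the sample string and the particular `term.cap` combination are illustrative choices and do not come from these release notes. Exact output bytes are intentionally not shown, since they may differ across versions.

```r
library(fansi)

# Illustrative input: a truecolor foreground followed by plain text.
x <- "\033[38;2;128;50;245mtruecolor\033[m text"

# 1.0 default handling: the truecolor sequence is recognized and interpreted.
# `warn=FALSE` only silences the capability warning for this demonstration.
new <- substr_ctl(x, 1, 4, warn=FALSE)

# Adding "old" to an otherwise valid `term.cap` spec requests the pre-1.0
# handling, in which SGR codes exceeding the listed capabilities are skipped.
old <- substr_ctl(x, 1, 4, term.cap=c("bright", "256", "old"), warn=FALSE)

# Compare the visible text rather than the raw bytes.
strip_ctl(c(new, old), warn=FALSE)
```

Comparing stripped text rather than raw output keeps the check independent of how the sequences happen to be re-encoded.
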

Additionally, `to_html` will now warn if it encounters unescaped HTML special
characters "<" or ">" in the input string.

Finally, the 1.0 release is an extensive refactoring of many parts of the SGR
and OSC hyperlink controls (_Special Sequences_) intake and output algorithms.
In some cases this means that some `fansi` functions will output _Special
Sequences_ slightly differently than they did before.  In almost all cases the
rendering of the output should remain unchanged, although there are some corner
cases with changes (e.g. in `strwrap_ctl` SGRs embedded in whitespace sequences
don't break the sequence).

The changes are a side effect of applying more consistent treatment of corner
cases around leading and trailing control sequences and (partially) invalid
control sequences.  Trailing _Special Sequences_ in the output are now omitted
as they would be immediately closed (assuming `terminate=TRUE`, the default).
Leading SGR is interpreted and re-output.

Normally output consistency alone would not be a reason to change behavior, but
in this case the changes should be almost always undetectable in the
**rendered** output, and maintaining old inconsistent behavior in the midst of a
complete refactoring of the internals was beyond my patience.  I apologize if
these behavior changes adversely affect your programs.

> WARNING: we will strive to keep rendered appearance of `fansi` outputs
> consistent across releases, but the exact bytes used in the output of _Special
> Sequences_ may change.

Other changes:

* Tests may no longer pass with R < 4.0 although the package should still
  function correctly.  This is primarily because of changes to the character
  width Unicode Database that ships with R, and many of the newly added grapheme
  tests touch parts of that database that changed (emoji).
* CSI sequences with more than one "intermediate" byte are now considered
  valid, even though they are likely to be very rare, and CSI sequences consume
  all subsequent bytes until a valid closing byte or end of string is
  encountered.
* `strip_ctl` only warns with malformed CSI and OSC if they are reported as
  supported via the `ctl` parameter.  If CSI and OSC are indicated as not
  supported, but two-byte escapes are, the two initial bytes of CSI and OSCs
  will be stripped.
* "unknown" encoded strings are no longer translated to UTF-8 in UTF-8 locales
  (they are instead assumed to be UTF-8).
* `nchar_ctl` preserves `dim`, `dimnames`, and `names` as the base functions
  do.
* UTF-8 known to be invalid should not be output, even if present in input
  (UTF-8 validation is not complete, only sequences that are obviously wrong are
  detected).

### Bug Fixes

* Fix `tabs_as_spaces` to handle sequential tabs, and to perform better on very
  wide strings.
* Strings with invalid UTF-8 sequences with "unknown" declared encoding in UTF-8
  locales now cause errors instead of being silently translated into
  byte-escaped versions (e.g. "\xf0\xc2" (2 bytes), used to be interpreted as
  "" (four characters)).  These now cause errors as they would have if they had
  had "UTF-8" declared encoding.
* In some cases true colors of form "38;2;x;x;x" and "48;2;x;x;x" would only be
  partially transcribed.

### Internal Changes

* More aggressive UTF-8 validation; also, invalid UTF-8 code points now advance
  only one byte instead of their putative width based on the initial byte.
* Reduce peak memory usage by making some intermediate buffers eligible for
  garbage collection prior to native code returning to R.
* Reworked internals to simplify buffer size computation and synchronization;
  in some cases this might cause slightly reduced performance.  Please report
  any significant performance regressions.
* `nchar_ctl(...)` is no longer a wrapper for `nchar(strip_ctl(...))` so that it
  may correctly support grapheme width calculations.

## v0.5.0

* [#65](https://github.com/brodieG/fansi/issues/65): `sgr_to_html` optionally
  converts CSI SGR to classes instead of inline styles (h/t @hadley).
* [#69](https://github.com/brodieG/fansi/issues/69): `sgr_to_html` is more
  disciplined about not emitting unnecessary HTML (h/t @hadley).
* New functions:
    * `sgr_256`: Display all 256 8-bit colors.
    * `in_html`: Easily output HTML in a web page.
    * `make_styles`: Easily produce CSS that matches 8-bit colors.
* Adjust for changes to `nchar(..., type='width')` for C0-C1 control characters
  in R 4.1.
* Restore tests bypassed in 0.4.2.

## v0.4.2

* Temporarily bypass tests due to R bug introduced in R-devel 79799.

## v0.4.1

* Correctly define/declare global symbols as per WRE 1.6.4.1 (h/t Professor
  Ripley, Joshua Ulrich for example fixes).
* [#59](https://github.com/brodieG/fansi/issues/59): Provide a `split.nl` option
  to `set_knit_hooks` to mitigate white space issues when using blackfriday for
  the markdown->html conversion (@krlmlr).

## v0.4.0

* Systematized which control sequences are handled specially by adding the
  `ctl` parameter to most functions.  Some functions such as `strip_ctl` had
  existing parameters that did the same thing (e.g. `strip`, or `which`), and
  those have been deprecated in favor of `ctl`.  While technically this is a
  change in the API, it is backwards compatible (addresses
  [#56](https://github.com/brodieG/fansi/issues/56) among other things).
* Added `*_sgr` versions of most `*_ctl` functions.
* `nzchar_ctl` gains the `ctl` parameter.
* [#57](https://github.com/brodieG/fansi/issues/57): Correctly detect when CSI
  sequences are not actually SGR (previously would apply styles from some
  non-SGR CSI sequences).
* [#55](https://github.com/brodieG/fansi/issues/55): `strsplit_ctl` can now work
  with `ctl` parameters containing escape sequences provided those sequences are
  excluded by the `ctl` parameter.
* [#54](https://github.com/brodieG/fansi/issues/54): fix `sgr_to_html` so that
  it can handle vector elements with un-terminated SGR sequences (@krlmlr).
* Fix bug in width computation of first line onwards in `strwrap_ctl` when
  indent/exdent/prefix/initial widths vary from first to second line.
* Fix wrapping in `strwrap2_*(..., strip.spaces=FALSE)`, including a bug when
  `wrap.always=TRUE` and a line started at a word-whitespace boundary.
* Add `term.cap` parameter to `unhandled_ctl`.

## v0.3.0

* `fansi::set_knit_hooks` makes it easy to automatically convert ANSI CSI SGR
  sequences to HTML in Rmarkdown documents.  We also add a vignette that
  demonstrates how to do this.
* [#53](https://github.com/brodieG/fansi/issues/53): fix for systems where
  'char' is signed (found and fixed by @QuLogic).
* [#52](https://github.com/brodieG/fansi/issues/52): fix bad compilation under
  ICC (@kazumits).
* [#51](https://github.com/brodieG/fansi/issues/51): documentation improvements
  (@krlmlr).
* [#50](https://github.com/brodieG/fansi/issues/50): run tests on R 3.1 - 3.4
  for the rc branch only (@krlmlr).
* [#48](https://github.com/brodieG/fansi/issues/48): malformed call to error
  in FANSI_check_enc (@msannell).
* [#47](https://github.com/brodieG/fansi/issues/47): compatibility with R
  versions 3.2.0 and 3.2.1 (@andreadega).

## v0.2.3

* [#45](https://github.com/brodieG/fansi/issues/45): add capability to run under
  R 3.1 [hadley](https://github.com/hadley),
  [Gábor Csárdi](https://github.com/gaborcsardi).
* [#44](https://github.com/brodieG/fansi/issues/44): include bright color
  support in HTML conversion (h/t [Will Landau](https://github.com/wlandau)).

Other minor fixes ([#43](https://github.com/brodieG/fansi/issues/43),
[#46](https://github.com/brodieG/fansi/issues/46)).
## v0.2.2 * Remove valgrind uninitialized string errors by avoiding `strsplit`. * Reduce R dependency to >= 3.2.x (@gaborcsardi). * Update tests to handle potential change in `substr` behavior starting with R-3.6. ## v0.2.1 * All string inputs are now encoded to UTF-8, not just those that are used in width calculations. * UTF-8 tests skipped on Solaris. ## v0.2.0 * Add `strsplit_ctl`. ## v0.1.0 Initial release. fansi/MD50000644000176200001440000001524214510601475011671 0ustar liggesusers14bb4b0300c099f37920c7d8b4f95964 *COPYING 083ff9fd55cf923c648285265b14f005 *DESCRIPTION 0c3f04de3f2aa62795bf77b3820de73e *NAMESPACE e25257be68cc7c52544fd53d4a949016 *NEWS.md a1915fec613ccf0cd80de7849299e565 *R/constants.R 97b58ea5941e20d2d98b8b64afd3e82b *R/fansi-package.R 138541d54621918996fcdcba6870cd25 *R/internal.R a35f356575f29b5959184bde04abc95d *R/load.R fc98e612024c02fcf381f866ca17cb05 *R/misc.R 403b2a5ffb4aeaf2bf11fda8b7e9cb7d *R/nchar.R 34e6c64dc7c2a2702d9d54091a12b3b2 *R/normalize.R 49e8d922d49794cf1d0f2797983dc400 *R/sgr.R c140e57fa23107c698a9b13d35481778 *R/strsplit.R 33f160fd11debbcf2f3752a89eedc0fa *R/strtrim.R a7f0d6e3d82d89d94a91f3fcfdb242b1 *R/strwrap.R 9b02671524c571a3e42b74f5fcc455e7 *R/substr2.R 8f47c5dde7697dfcbcb80a484df8391e *R/tohtml.R 38a06554f6076eeb984bca1adcf85c18 *R/trimws.R 9ef5ef71445b6f7725b452e4f815b084 *R/unhandled.R a7c55757a3bd7b5025ad682118d4fb94 *README.md a9817463457f13550bde12ce57747b4f *build/vignette.rds 51c04478fb0ea1e0498a4967966e75ee *inst/doc/sgr-in-rmd.R 368576ff676a26fbc5395f1aeb364cb6 *inst/doc/sgr-in-rmd.Rmd 12c89f10342c673f7557886aeb0808ca *inst/doc/sgr-in-rmd.html 79ad5279091252bcffb795bac32cf70e *man/dflt_term_cap.Rd 78a6d368387d0815605d32dd072dbfcc *man/fansi.Rd 09b9afefad88c42caa18f6bdeff4be2c *man/fansi_lines.Rd 029c21cc1a83767f8452b233ff8b1753 *man/fwl.Rd e99a86f74d92da1720131933fd7c6ed8 *man/has_ctl.Rd b16bf4d8d95bcfedfb82b79383b34661 *man/has_sgr.Rd edd2ae1ba1df57748fed3d5d12d6ac2c *man/html_code_block.Rd 
b0eb7ce61beaadeede8d17039fd7fe86 *man/html_esc.Rd 175aa96f8dad7b5547df86c89676baa1 *man/in_html.Rd ae801c8adf0625dd46d8b4fb43fa5039 *man/make_styles.Rd 01db3ba54729100c753539c6976ad440 *man/nchar_ctl.Rd 603950d472bdc24dc40b39a3b69b3cf0 *man/nchar_sgr.Rd 41bba2079e33a4f682ee2201524e3111 *man/normalize_state.Rd ed5687c4bbe0e68c7a0c9370f8dfd963 *man/set_knit_hooks.Rd 5e08be7234d5a2b78cd9d17f4a076803 *man/sgr_256.Rd 53e08f28961b5b64a58104ace02d3a1c *man/sgr_to_html.Rd de535fe3354cffbae7aa6b65c8d8d88e *man/state_at_end.Rd 61b5f7c7b058fd5440cfbde681389ee4 *man/strip_ctl.Rd b746f9565205da591cb57ba3577cec6c *man/strip_sgr.Rd cb76d654b8c85f4b694b5dc7652eaae9 *man/strsplit_ctl.Rd 8cbaf0eefcc3c95f18fb55d550f61e12 *man/strsplit_sgr.Rd ce3d32ce0a469a0789cbfec6c31c2744 *man/strtrim_ctl.Rd 50f04ca5e30b07432cce4a8f00254c04 *man/strtrim_sgr.Rd 6cd0ebfd0810da58d881cc25def2edc5 *man/strwrap_ctl.Rd a66ea6f569660a40c59b71c34364e0ed *man/strwrap_sgr.Rd 32eb99e6dfb512c635030523bf3da101 *man/substr_ctl.Rd ac47155aa4d5a610b7af139b19380041 *man/substr_sgr.Rd 37ed615a6c4cce008aecb5c77122248d *man/tabs_as_spaces.Rd b75c259163919eb1a0dd3487456fbfec *man/term_cap_test.Rd 61cdc7c7a0a09b485f32267203a25afd *man/to_html.Rd efa5c17df677497ac4f55ec7693fc430 *man/trimws_ctl.Rd 6bb0e66426a2132365e0c8d1bd7be860 *man/unhandled_ctl.Rd 4a6ee9d66b62b2838b5d54e20f04f8b0 *src/Makevars 3aab85ebdb0924f3081ba6ebcb71c405 *src/assumptions.c 33df5ed09c6142a7c45f8c6d7d03b1f2 *src/carry.c 361540aedcb086efc89acaa44053919c *src/fansi-cnst.h 8201420730646ddea744873c24f4c6a7 *src/fansi-ext.h b283fcb3699c91f6c51bfa6b309af023 *src/fansi-struct.h dba9773adbae7e3ba07d4e2fa5664171 *src/fansi-win.def c155797d23782a7d8febc3704ca394f1 *src/fansi.h cfd18b26c12e0cc52d50a2d1be9de4ca *src/has.c 4f8915cec62d37322c812e50f1360511 *src/init.c fc385906eccfbdae0db1d59af2fa8adb *src/nchar.c d60274206deae16dcb2cefec848e3711 *src/normalize.c ad84922aeebb79abbd1c35feb99332be *src/read.c 221fa7cf9472aabbe262915c18c6b53c *src/rnchar.c 
e33f79bc0d340ea5bba58ef28df1f412 *src/state.c 0279c381057625099607cf992c31b1fc *src/strip.c 3cb1020bfe988008987ac4c3234ca14b *src/substr.c 3fc68f4f2360c9504e49e7e7d3b38a9f *src/tabs.c 46d1eb2067265581a41bae57bf58e2ed *src/tohtml.c f75d1a33f0c7350ae19127511393ef16 *src/trimws.c 5187bc3a04d90e275e2d9e9f42a2a079 *src/unhandled.c 3322a2575265d605c5f8d7721494ef5f *src/utf8.c db85d844edded6faa7a1c9814723245c *src/utf8clen.h 5b7aa41c22bb506e4a6d4143a57af5e1 *src/utils.c cd47908773349f0f9226b08102797a7e *src/wrap.c 0c463af8d119bd5a3bde84a1f8e668dc *src/write.c 5667947b8f95f7b16c0bc1c9552192e0 *tests/Rprof.out e048a0af7426ed09ab39013c34ca7abc *tests/run.R 87a1e315ff0016470efff476efea6eb1 *tests/special/_pre/funs.R 430eed4d3b63a95533d10dbf1566b602 *tests/special/_pre/lorem-utf8.R 6a8fb67b50bfb19f5f68d2a6404095eb *tests/special/_pre/strings.R 09d74da088beb864b4c244488ef3d98a *tests/special/emo-graph.R cb74a32c2b659399fc774a5c7e67c982 *tests/special/emo-graph.unitizer/data.rds a112b56890622486da472c46df93f7e5 *tests/special/utf8.R ca9c3755ab8ad61a7646ce5d52a5870c *tests/special/utf8.unitizer/data.rds dd3bd2c1ece09879b53ddb6687500798 *tests/unitizer/_pre/funs.R ba82c5fd6275d7b6f0f3bf1efd6fdbd6 *tests/unitizer/_pre/lorem.R a75d0df3c8bac48c6069dfe78774570e *tests/unitizer/_pre/lorem.data/lorem.cn.phrases.RDS a8eaee7d7eb3d494b11a0d0e5564f962 *tests/unitizer/_pre/strings.R b0d00cfe78acd175bee35b82e0e3d28a *tests/unitizer/has.R c8c0c175900565a6755c1225cc6f9b7c *tests/unitizer/has.unitizer/data.rds 6d87b1d167dcf8afdaa117d0e14ab818 *tests/unitizer/interactions.R 2dc43c9a8b9709593226d05cee8ffbba *tests/unitizer/interactions.unitizer/data.rds efebd5960bf982eedfd35fad8d6bc6c0 *tests/unitizer/misc.R 7c679205c63d2a5e10652969691177cf *tests/unitizer/misc.unitizer/data.rds 47eb83baf9f733428bb9830279d9affb *tests/unitizer/nchar.R 9266b3d825e4f053a0abe71ab3ce887d *tests/unitizer/nchar.unitizer/data.rds ae24c511e8d998e575da37c538c17386 *tests/unitizer/normalize.R 
eef53fd21e84b20678d8c8ac440015b9 *tests/unitizer/normalize.unitizer/data.rds d09ba5073a646fae37d0aca6e895875a *tests/unitizer/overflow.R 162168a40b4695b388e2d71026386236 *tests/unitizer/overflow.unitizer/data.rds fcfb224948a1afb53fa791d35cc74fd7 *tests/unitizer/strip.R 15ba4765eaf7b30b372e8c9f1255041c *tests/unitizer/strip.unitizer/data.rds 54e54f69461b9f0ccd09df8b7db4d1a2 *tests/unitizer/strsplit.R 441c2a390951a76deb028d02788483fe *tests/unitizer/strsplit.unitizer/data.rds 44453730f30cf544dc1d98333f2b7740 *tests/unitizer/substr.R b7d8f2d24288e0083d7b436a0bb315fb *tests/unitizer/substr.unitizer/data.rds 3845fa0319a1ab4f6cf3b659a022bd6b *tests/unitizer/tabs.R 3940bca8582d8d8c317e3ded4b3d57ae *tests/unitizer/tabs.unitizer/data.rds 98122bf796550ccee6b5b6eb1ffb0c46 *tests/unitizer/tohtml.R 93ef1c3dc089e7c99f8528cb3e1847c6 *tests/unitizer/tohtml.unitizer/data.rds 949e7990e73dbe0df0430ab9797e44f6 *tests/unitizer/trimws.R 527e3bcd165dc19c5e684511e51ccbaf *tests/unitizer/trimws.unitizer/data.rds 86f06897eee9a2cf771af27ebea001fb *tests/unitizer/url.R ee5e060a76fb915e4cb8e0a5832d9e10 *tests/unitizer/url.unitizer/data.rds cf7b2aaceec9e907cf2e57383c46248a *tests/unitizer/wrap.R 0612f2adf3150bab6bb7ec211994f50e *tests/unitizer/wrap.unitizer/data.rds 368576ff676a26fbc5395f1aeb364cb6 *vignettes/sgr-in-rmd.Rmd b35b7d6227ab84772153a7c06e2f8dcc *vignettes/styles.css fansi/inst/0000755000176200001440000000000014510302020012312 5ustar liggesusersfansi/inst/doc/0000755000176200001440000000000014510302020013057 5ustar liggesusersfansi/inst/doc/sgr-in-rmd.Rmd0000755000176200001440000001041614361502642015527 0ustar liggesusers--- title: "ANSI CSI SGR Sequences in Rmarkdown" author: "Brodie Gaslam" output: rmarkdown::html_vignette: css: styles.css mathjax: local vignette: > %\VignetteIndexEntry{ANSI CSI SGR Sequences in Rmarkdown} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- ```{r echo=FALSE} library(fansi) knitr::knit_hooks$set(document=function(x, options) 
gsub("\033", "\uFFFD", x)) ``` ### Browsers Do Not Interpret ANSI CSI SGR Sequences Over the past few years color has been gaining traction in the R terminal, particularly since Gábor Csárdi's [crayon](https://github.com/r-lib/crayon) made it easy to format text with [ANSI CSI SGR sequences](https://en.wikipedia.org/wiki/ANSI_escape_code). At the same time the advent of JJ Allaire and Yihui Xie's `rmarkdown` and `knitr` packages, along with John MacFarlane's `pandoc`, made it easy to automatically incorporate R code and output in HTML documents. Unfortunately ANSI CSI SGR sequences are not recognized by web browsers and end up rendering weirdly¹: ```{r} sgr.string <- c( "\033[43;34mday > night\033[0m", "\033[44;33mdawn < dusk\033[0m" ) writeLines(sgr.string) ``` ### Automatically Convert ANSI CSI SGR to HTML `fansi` provides the `to_html` function which converts the ANSI CSI SGR sequences and OSC hyperlinks into HTML markup. When we combine it with `knitr::knit_hooks` we can modify the rendering of the `rmarkdown` document such that ANSI CSI SGR encoding is shown in the equivalent HTML. `fansi::set_knit_hooks` is a convenience function that does just this. You should call it in an `rmarkdown` document with the: * Chunk option `results` set to "asis". * Chunk option `comment` set to "" (empty string). * The `knitr::knit_hooks` object as an argument. 
The corresponding `rmarkdown` chunk should look as follows: ```` ```{r, comment="", results="asis"}`r ''` old.hooks <- fansi::set_knit_hooks(knitr::knit_hooks) ``` ```` ```{r comment="", results="asis", echo=FALSE} old.hooks <- fansi::set_knit_hooks(knitr::knit_hooks) ``` We run this function for its side effects, which cause the output to be displayed as intended: ```{r} writeLines(sgr.string) ``` If you are seeing extra line breaks in your output you may need to use: ```` ```{r, comment="", results="asis"}`r ''` old.hooks <- fansi::set_knit_hooks(knitr::knit_hooks, split.nl=TRUE) ``` ```` If you use `crayon` to generate your ANSI CSI SGR style strings you may need to set `options(crayon.enabled=TRUE)`, as in some cases `crayon` suppresses the SGR markup if it thinks it is not outputting to a terminal. We can also set hooks for the other types of outputs, and add some additional CSS styles. ```` ```{r, comment="", results="asis"}`r ''` styles <- c( getOption("fansi.style", dflt_css()), # default style "PRE.fansi CODE {background-color: transparent;}", "PRE.fansi-error {background-color: #DDAAAA;}", "PRE.fansi-warning {background-color: #DDDDAA;}", "PRE.fansi-message {background-color: #AAAADD;}" ) old.hooks <- c( old.hooks, fansi::set_knit_hooks( knitr::knit_hooks, which=c("warning", "error", "message"), style=styles ) ) ``` ```` ```{r comment="", results="asis", echo=FALSE} styles <- c( getOption("fansi.style", dflt_css()), # default style "PRE.fansi CODE {background-color: transparent;}", "PRE.fansi-error {background-color: #DDAAAA;}", "PRE.fansi-warning {background-color: #DDDDAA;}", "PRE.fansi-message {background-color: #AAAADD;}" ) old.hooks <- c( old.hooks, fansi::set_knit_hooks( knitr::knit_hooks, which=c("warning", "error", "message"), style=styles ) ) ``` ```{r error=TRUE} message(paste0(sgr.string, collapse="\n")) warning(paste0(c("", sgr.string), collapse="\n")) stop(paste0(c("", sgr.string), collapse="\n")) ``` You can restore the old hooks at any time 
in your document with: ```{r} do.call(knitr::knit_hooks$set, old.hooks) writeLines(sgr.string) ``` See `?fansi::set_knit_hooks` for details. ---- 1For illustrative purposes we output raw ANSI CSI SGR sequences in this document. However, because the ESC control character causes problems with some HTML rendering services we replace it with the � symbol. Depending on the browser and process it would normally not be visible at all, or substituted with some other symbol. fansi/inst/doc/sgr-in-rmd.R0000644000176200001440000000271114510302017015170 0ustar liggesusers## ----echo=FALSE--------------------------------------------------------------- library(fansi) knitr::knit_hooks$set(document=function(x, options) gsub("\033", "\uFFFD", x)) ## ----------------------------------------------------------------------------- sgr.string <- c( "\033[43;34mday > night\033[0m", "\033[44;33mdawn < dusk\033[0m" ) writeLines(sgr.string) ## ----comment="", results="asis", echo=FALSE----------------------------------- old.hooks <- fansi::set_knit_hooks(knitr::knit_hooks) ## ----------------------------------------------------------------------------- writeLines(sgr.string) ## ----comment="", results="asis", echo=FALSE----------------------------------- styles <- c( getOption("fansi.style", dflt_css()), # default style "PRE.fansi CODE {background-color: transparent;}", "PRE.fansi-error {background-color: #DDAAAA;}", "PRE.fansi-warning {background-color: #DDDDAA;}", "PRE.fansi-message {background-color: #AAAADD;}" ) old.hooks <- c( old.hooks, fansi::set_knit_hooks( knitr::knit_hooks, which=c("warning", "error", "message"), style=styles ) ) ## ----error=TRUE--------------------------------------------------------------- message(paste0(sgr.string, collapse="\n")) warning(paste0(c("", sgr.string), collapse="\n")) stop(paste0(c("", sgr.string), collapse="\n")) ## ----------------------------------------------------------------------------- do.call(knitr::knit_hooks$set, old.hooks) 
writeLines(sgr.string) fansi/inst/doc/sgr-in-rmd.html0000644000176200001440000004127214510302017015740 0ustar liggesusers ANSI CSI SGR Sequences in Rmarkdown

ANSI CSI SGR Sequences in Rmarkdown

Brodie Gaslam

Browsers Do Not Interpret ANSI CSI SGR Sequences

Over the past few years color has been gaining traction in the R terminal, particularly since Gábor Csárdi’s crayon made it easy to format text with ANSI CSI SGR sequences. At the same time the advent of JJ Allaire and Yihui Xie’s rmarkdown and knitr packages, along with John MacFarlane’s pandoc, made it easy to automatically incorporate R code and output in HTML documents.

Unfortunately ANSI CSI SGR sequences are not recognized by web browsers and end up rendering weirdly¹:

sgr.string <- c(
  "\033[43;34mday > night\033[0m",
  "\033[44;33mdawn < dusk\033[0m"
)
writeLines(sgr.string)
## �[43;34mday > night�[0m
## �[44;33mdawn < dusk�[0m
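
To make the structure of these strings concrete, here is a small sketch using `nchar_ctl` and `strip_ctl` from this package. Each string is 11 visible characters wrapped in an opening SGR sequence (`ESC [ 43;34 m`, 8 characters) and a closing reset (`ESC [ 0 m`, 4 characters); the variable names are only illustrative.

```r
library(fansi)

sgr.string <- c(
  "\033[43;34mday > night\033[0m",
  "\033[44;33mdawn < dusk\033[0m"
)
# Base nchar counts the control sequences as ordinary characters
raw.chars <- nchar(sgr.string)
# nchar_ctl counts only what would be displayed
shown.chars <- nchar_ctl(sgr.string)
# strip_ctl removes the control sequences and leaves the plain text
plain <- strip_ctl(sgr.string)

print(raw.chars)    # 23 23
print(shown.chars)  # 11 11
print(plain)        # "day > night" "dawn < dusk"
```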

Automatically Convert ANSI CSI SGR to HTML

fansi provides the to_html function which converts the ANSI CSI SGR sequences and OSC hyperlinks into HTML markup. When we combine it with knitr::knit_hooks we can modify the rendering of the rmarkdown document such that ANSI CSI SGR encoding is shown in the equivalent HTML.
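
As a minimal sketch of the conversion on its own, outside of any knitr hook, the strings can be passed through `html_esc` and then `to_html`. Escaping first is an assumption about good practice here, since the text contains `<` and `>`, which would otherwise be read as HTML markup; the exact attributes of the generated markup may vary by `fansi` version.

```r
library(fansi)

sgr.string <- c(
  "\033[43;34mday > night\033[0m",
  "\033[44;33mdawn < dusk\033[0m"
)
# Escape HTML special characters, then translate SGR sequences to <span> tags
html <- to_html(html_esc(sgr.string))
writeLines(html)
```

The result contains no ESC characters, so it can be embedded directly in an HTML `<pre>` block.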

fansi::set_knit_hooks is a convenience function that does just this. You should call it in an rmarkdown document with the:

  • Chunk option results set to “asis”.
  • Chunk option comment set to “” (empty string).
  • The knitr::knit_hooks object as an argument.

The corresponding rmarkdown chunk should look as follows:

```{r, comment="", results="asis"}
old.hooks <- fansi::set_knit_hooks(knitr::knit_hooks)
```

We run this function for its side effects, which cause the output to be displayed as intended:

writeLines(sgr.string)
## day > night
## dawn < dusk
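
For intuition about what such a hook does, the following is a hand-written sketch of an output hook. `sgr_output_hook` is a hypothetical name and the `<pre>` wrapper is an assumption; the hook that `set_knit_hooks` actually installs may differ in its details.

```r
library(fansi)

# Hypothetical stand-in for the hook installed by set_knit_hooks
sgr_output_hook <- function(x, options) {
  # knitr passes the chunk output as text; process it line by line
  lines <- unlist(strsplit(x, "\n", fixed = TRUE))
  html <- to_html(html_esc(lines))
  paste0(
    "<pre class=\"fansi fansi-output\"><code>",
    paste(html, collapse = "\n"),
    "</code></pre>\n"
  )
}
res <- sgr_output_hook("\033[43;34mday > night\033[0m\n", list())
cat(res)

# Inside an rmarkdown document it could be registered with:
# knitr::knit_hooks$set(output = sgr_output_hook)
```

Because the hook returns raw HTML, the chunk that registers it needs `results="asis"` for any markup it emits itself to pass through untouched.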

If you are seeing extra line breaks in your output you may need to use:

```{r, comment="", results="asis"}
old.hooks <- fansi::set_knit_hooks(knitr::knit_hooks, split.nl=TRUE)
```

If you use crayon to generate your ANSI CSI SGR style strings you may need to set options(crayon.enabled=TRUE), as in some cases crayon suppresses the SGR markup if it thinks it is not outputting to a terminal.
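
A quick way to check whether `crayon` is emitting SGR markup is sketched below, assuming `crayon` is installed (the code skips the check otherwise). `has_ctl` reports whether a string contains Control Sequences.

```r
# Force crayon to emit SGR sequences even when not attached to a terminal
old.opts <- options(crayon.enabled = TRUE)

styled <- NULL
if (requireNamespace("crayon", quietly = TRUE)) {
  styled <- crayon::red("alert")
  # TRUE when the SGR markup was kept
  print(fansi::has_ctl(styled))
}
# Restore the previous option value
options(old.opts)
```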

We can also set hooks for the other types of outputs, and add some additional CSS styles.

```{r, comment="", results="asis"}
styles <- c(
  getOption("fansi.style", dflt_css()),  # default style
  "PRE.fansi CODE {background-color: transparent;}",
  "PRE.fansi-error {background-color: #DDAAAA;}",
  "PRE.fansi-warning {background-color: #DDDDAA;}",
  "PRE.fansi-message {background-color: #AAAADD;}"
)
old.hooks <- c(
  old.hooks,
  fansi::set_knit_hooks(
    knitr::knit_hooks,
    which=c("warning", "error", "message"),
    style=styles
) )
```
message(paste0(sgr.string, collapse="\n"))
## day > night
## dawn < dusk
warning(paste0(c("", sgr.string), collapse="\n"))
## Warning: 
## day > night
## dawn < dusk
stop(paste0(c("", sgr.string), collapse="\n"))
## Error in eval(expr, envir, enclos): 
## day > night
## dawn < dusk
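
The hooks above style whole output blocks through CSS classes such as `PRE.fansi-error`. As a related sketch, `to_html` can also be asked to emit class names for the colors instead of inline styles via its `classes` argument; the exact class names it generates are not shown here because they depend on the `fansi` version.

```r
library(fansi)

x <- "\033[43;34mday > night\033[0m"
# Default: colors are written as inline style attributes
inline <- to_html(html_esc(x))
# classes=TRUE: colors are written as class attributes, to be styled by CSS
classed <- to_html(html_esc(x), classes = TRUE)
writeLines(c(inline, classed))
```

With classes, the colors are controlled by a style sheet rather than fixed in the markup, which makes them easier to restyle after the fact.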

You can restore the old hooks at any time in your document with:

do.call(knitr::knit_hooks$set, old.hooks)
writeLines(sgr.string)
## �[43;34mday > night�[0m
## �[44;33mdawn < dusk�[0m

See ?fansi::set_knit_hooks for details.


¹ For illustrative purposes we output raw ANSI CSI SGR sequences in this document. However, because the ESC control character causes problems with some HTML rendering services we replace it with the � symbol. Depending on the browser and process it would normally not be visible at all, or substituted with some other symbol.