fansi/ 0000755 0001762 0000144 00000000000 14534506672 011366 5 ustar ligges users fansi/NAMESPACE 0000644 0001762 0000144 00000002007 14213626056 012575 0 ustar ligges users # Generated by roxygen2: do not edit by hand
export("substr2_ctl<-")
export("substr_ctl<-")
export(close_state)
export(dflt_css)
export(dflt_term_cap)
export(fansi_lines)
export(fwl)
export(has_ctl)
export(has_sgr)
export(html_code_block)
export(html_esc)
export(in_html)
export(make_styles)
export(nchar_ctl)
export(nchar_sgr)
export(normalize_state)
export(nzchar_ctl)
export(nzchar_sgr)
export(set_knit_hooks)
export(sgr_256)
export(sgr_to_html)
export(state_at_end)
export(strip_ctl)
export(strip_sgr)
export(strsplit_ctl)
export(strsplit_sgr)
export(strtrim2_ctl)
export(strtrim2_sgr)
export(strtrim_ctl)
export(strtrim_sgr)
export(strwrap2_ctl)
export(strwrap2_sgr)
export(strwrap_ctl)
export(strwrap_sgr)
export(substr2_ctl)
export(substr2_sgr)
export(substr_ctl)
export(substr_sgr)
export(tabs_as_spaces)
export(term_cap_test)
export(to_html)
export(trimws_ctl)
export(unhandled_ctl)
importFrom(grDevices,col2rgb)
importFrom(grDevices,rgb)
importFrom(utils,browseURL)
useDynLib(fansi, .registration=TRUE, .fixes="FANSI_")
fansi/README.md 0000644 0001762 0000144 00000023613 14533476706 012655 0 ustar ligges users
# fansi - ANSI Control Sequence Aware String Functions
Counterparts to R string manipulation functions that account for the
effects of ANSI text formatting control sequences.
## Formatting Strings with Control Sequences
Many terminals will recognize special sequences of characters in strings
and change display behavior as a result. For example, on my terminal the
sequences `"\033[3?m"` and `"\033[4?m"`, where `"?"` is a digit in 0-7,
change the foreground and background colors of text respectively:
fansi <- "\033[30m\033[41mF\033[42mA\033[43mN\033[44mS\033[45mI\033[m"

This type of sequence is called an ANSI CSI SGR control sequence. Most
\*nix terminals support them, and newer versions of Windows and Rstudio
consoles do too. You can check whether your display supports them by
running `term_cap_test()`.
Whether the `fansi` functions behave as expected depends on many
factors, including how your particular display handles Control
Sequences. See `?fansi` for details, particularly if you are getting
unexpected results.
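A minimal sketch of how one might check a display, reusing the `fansi` string defined above. It assumes the package is installed and attached. What you see depends entirely on your terminal, so no expected output is shown.

```r
library(fansi)

# The colored string from the start of this section
fansi <- "\033[30m\033[41mF\033[42mA\033[43mN\033[44mS\033[45mI\033[m"

# On a display that supports CSI SGR each letter gets its own background;
# otherwise the raw escape sequences show up as text.
writeLines(fansi)

# Prints sample output for each capability so you can inspect it visually
term_cap_test()
```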
## Manipulation of Formatted Strings
ANSI control characters and sequences (*Control Sequences* hereafter)
break the relationship between byte/character position in a string and
display position. For example, to extract the “ANS” part of our colored
“FANSI”, we would need to carefully compute the character positions:

With `fansi` we can select directly based on display position:

If you look closely you’ll notice that the text color for the `substr`
version is wrong as the naïve string extraction loses the initial
`"\033[30m"` that sets the foreground color. Additionally, the color from
the last letter bleeds out into the next line.
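The comparison described above can be sketched as follows. The positions 12 and 29 are worked out by hand for this particular string (each `"\033[3?m"` or `"\033[4?m"` sequence is five characters long), and the variable names `ans.base` and `ans.fansi` are illustrative only.

```r
library(fansi)

fansi <- "\033[30m\033[41mF\033[42mA\033[43mN\033[44mS\033[45mI\033[m"

# Base R counts every character of every escape sequence, so "ANS" and the
# sequences between its letters span raw positions 12 through 29.
ans.base <- substr(fansi, 12, 29)

# fansi counts only displayed characters, so "ANS" is simply 2 through 4.
ans.fansi <- substr_ctl(fansi, 2, 4)

# The base version drops the leading foreground color and is left
# unterminated; the fansi version re-applies and closes the active state.
writeLines(c(ans.base, ans.fansi))
```

Both results contain the same visible text; they differ only in which Control Sequences surround it.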
## `fansi` Functions
`fansi` provides counterparts to the following string functions:
- `substr` (and `substr<-`)
- `strsplit`
- `strtrim`
- `strwrap`
- `nchar` / `nzchar`
- `trimws`
These are drop-in replacements that behave (almost) identically to the
base counterparts, except for the *Control Sequence* awareness. There
are also utility functions such as `strip_ctl` to remove *Control
Sequences* and `has_ctl` to detect whether strings contain them.
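As a brief illustration of the utility functions and the `nchar` counterpart, here is a sketch on a made-up two-element vector (`x` is not part of the package):

```r
library(fansi)

x <- c("\033[31mred\033[0m", "plain")

has_ctl(x)     # which elements contain Control Sequences
strip_ctl(x)   # the same text with Control Sequences removed
nchar_ctl(x)   # character counts that ignore Control Sequences
nchar(x)       # base R also counts the characters of the escape sequences
```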
Much of `fansi` is written in C, so you should find the `fansi`
functions only slightly slower than the corresponding base functions,
with the exception of `strwrap_ctl`, which is much faster than `strwrap`.
Operations involving `type = "width"` will be slower still. We have
prioritized convenience and safety over raw speed in the C code, but
unless your code is primarily engaged in string manipulation `fansi`
should be fast enough to avoid attention in benchmarking traces.
## Width Based Substrings
`fansi` also includes improved versions of some of those functions, such
as `substr2_ctl` which allows for width based substrings. To illustrate,
let’s create an emoji string made up of two wide characters:
pizza.grin <- sprintf("\033[46m%s\033[m", strrep("\U1F355\U1F600", 10))

And a colorful background made up of one-wide characters:
raw <- paste0("\033[45m", strrep("FANSI", 40))
wrapped <- strwrap2_ctl(raw, 41, wrap.always=TRUE)

When we inject the 2-wide emoji into the 1-wide background their widths
are accounted for as shown by the result remaining rectangular:
starts <- c(18, 13, 8, 13, 18)
ends <- c(23, 28, 33, 28, 23)
substr2_ctl(wrapped, type='width', starts, ends) <- pizza.grin

`fansi` width calculations use heuristics to account for graphemes,
including combining emoji:
emo <- c(
"\U1F468",
"\U1F468\U1F3FD",
"\U1F468\U1F3FD\u200D\U1F9B3",
"\U1F468\u200D\U1F469\u200D\U1F467\u200D\U1F466"
)
writeLines(
paste(
emo,
paste("base:", nchar(emo, type='width')),
paste("fansi:", nchar_ctl(emo, type='width'))
) )
## 👨 base: 2 fansi: 2
## 👨🏽 base: 4 fansi: 2
## 👨🏽🦳 base: 6 fansi: 2
## 👨👩👧👦 base: 8 fansi: 2
## HTML Translation
You can translate ANSI CSI SGR formatted strings into their HTML
counterparts with `to_html`:
[Figure: Translate to HTML]
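A minimal sketch of the translation, reusing the colored string from the first section. The exact markup `to_html` emits is not guaranteed to be stable across versions, so the example only prints it. Passing the input through `html_esc` first is a precaution for strings that might contain HTML special characters.

```r
library(fansi)

fansi <- "\033[30m\033[41mF\033[42mA\033[43mN\033[44mS\033[45mI\033[m"

# Escape any HTML special characters, then convert SGR to styled spans
html <- to_html(html_esc(fansi))
writeLines(html)

# To view the result in a browser, uncomment:
# in_html(html)
```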
## Rmarkdown
It is possible to set `knitr` hooks such that R output that contains
ANSI CSI SGR is automatically converted to the HTML formatted equivalent
and displayed as intended. See the
[vignette](https://htmlpreview.github.io/?https://raw.githubusercontent.com/brodieG/fansi/rc/extra/sgr-in-rmd.html)
for details.
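As a rough sketch of the setup, assuming `knitr` is installed and that this runs in an Rmarkdown chunk with `results = "asis"` (the call also writes a style block to the output), the hooks might be set like this. `old.hooks` is just an illustrative name for the returned prior hooks.

```r
# Place in an early chunk of the document, with results = "asis"
old.hooks <- fansi::set_knit_hooks(knitr::knit_hooks)
```

See the vignette for the options that control which hooks are set and how the output is styled.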
## Installation
This package is available on CRAN:
install.packages('fansi')
It has no runtime dependencies.
For the development version use
`remotes::install_github('brodieg/fansi@development')` or:
f.dl <- tempfile()
f.uz <- tempfile()
github.url <- 'https://github.com/brodieG/fansi/archive/development.zip'
download.file(github.url, f.dl)
unzip(f.dl, exdir=f.uz)
install.packages(file.path(f.uz, 'fansi-development'), repos=NULL, type='source')
unlink(c(f.dl, f.uz))
There is no guarantee that development versions are stable or even
working. The master branch typically mirrors CRAN and should be stable.
## Related Packages and References
- [crayon](https://github.com/r-lib/crayon), the library that started
it all.
- [ansistrings](https://github.com/r-lib/ansistrings/), which
implements similar functionality.
- [ECMA-48 - Control Functions For Coded Character
Sets](https://ecma-international.org/publications-and-standards/standards/ecma-48/),
in particular pages 10-12, and 61.
- [CCITT Recommendation
T.416](https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-T.416-199303-I!!PDF-E&type=items)
- [ANSI Escape Code -
Wikipedia](https://en.wikipedia.org/wiki/ANSI_escape_code) for a
gentler introduction.
## Acknowledgments
- R Core for developing and maintaining such a wonderful language.
- CRAN maintainers, for patiently shepherding packages onto CRAN and
maintaining the repository, and Uwe Ligges in particular for
maintaining [Winbuilder](https://win-builder.r-project.org/).
- [Gábor Csárdi](https://github.com/gaborcsardi) for getting me
started on the ANSI control sequence journey, and for many of the
ideas on how to process them.
- [Jim Hester](https://github.com/jimhester) for
[covr](https://cran.r-project.org/package=covr), and, together with
Rstudio, for [r-lib/actions](https://github.com/r-lib/actions).
- [Dirk Eddelbuettel](https://github.com/eddelbuettel) and [Carl
Boettiger](https://github.com/cboettig) for the
[rocker](https://github.com/rocker-org/rocker) project, and [Gábor
Csárdi](https://github.com/gaborcsardi) and the
[R-consortium](https://www.r-consortium.org/) for
[Rhub](https://github.com/r-hub), without which testing bugs on
R-devel and other platforms would be a nightmare.
- [Tomas Kalibera](https://github.com/kalibera) for
[rchk](https://github.com/kalibera/rchk) and the accompanying
vagrant image, and rcnst to help detect errors in compiled code.
- [Winston Chang](https://github.com/wch) for the
[r-debug](https://hub.docker.com/r/wch1/r-debug/) docker container,
in particular because of the valgrind level 2 instrumented version
of R.
- George Nachman et al. for [Iterm2](https://iterm2.com/index.html), a
Free terminal emulator that supports truecolor CSI SGR.
- [Hadley Wickham](https://github.com/hadley/) and [Peter
Danenberg](https://github.com/klutometis) for
[roxygen2](https://cran.r-project.org/package=roxygen2).
- [Yihui Xie](https://github.com/yihui) for
[knitr](https://cran.r-project.org/package=knitr) and [J.J.
Allaire](https://github.com/jjallaire) et al. for
[rmarkdown](https://cran.r-project.org/package=rmarkdown), and by
extension John MacFarlane for [pandoc](https://pandoc.org/).
- [Gábor Csárdi](https://github.com/gaborcsardi), the
[R-consortium](https://www.r-consortium.org/), et al. for
[revdepcheck](https://github.com/r-lib/revdepcheck) to simplify
reverse dependency checks.
- Olaf Mersmann for
[microbenchmark](https://cran.r-project.org/package=microbenchmark),
because microseconds matter, and [Joshua
Ulrich](https://github.com/joshuaulrich) for making it lightweight.
- All open source developers out there that make their work freely
available for others to use.
- [Github](https://github.com/), [Codecov](https://about.codecov.io/),
[Vagrant](https://www.vagrantup.com/),
[Docker](https://www.docker.com/), [Ubuntu](https://ubuntu.com/),
[Brew](https://brew.sh/) for providing infrastructure that greatly
simplifies open source development.
- [Free Software Foundation](https://www.fsf.org/) for developing the
GPL license and promoting the free software movement.
fansi/man/ 0000755 0001762 0000144 00000000000 14533476214 012136 5 ustar ligges users fansi/man/fansi.Rd 0000644 0001762 0000144 00000031624 14533743434 013534 0 ustar ligges users % Generated by roxygen2: do not edit by hand
% Please edit documentation in R/fansi-package.R
\docType{package}
\name{fansi}
\alias{fansi}
\alias{fansi-package}
\title{Details About Manipulation of Strings Containing Control Sequences}
\description{
Counterparts to R string manipulation functions that account for
the effects of some ANSI X3.64 (a.k.a. ECMA-48, ISO-6429) control sequences.
}
\section{Control Characters and Sequences}{
Control characters and sequences are non-printing inline characters or
sequences initiated by them that can be used to modify terminal display and
behavior, for example by changing text color or cursor position.
We will refer to X3.64/ECMA-48/ISO-6429 control characters and sequences as
"\emph{Control Sequences}" hereafter.
There are four types of \emph{Control Sequences} that \code{fansi} can treat
specially:
\itemize{
\item "C0" control characters, such as tabs and carriage returns (we include
delete in this set, even though technically it is not part of it).
\item Sequences starting in "ESC[", also known as Control Sequence
Introducer (CSI) sequences, of which the Select Graphic Rendition (SGR)
sequences used to format terminal output are a subset.
\item Sequences starting in "ESC]", also known as Operating System
Commands (OSC), of which the subset beginning with "8" is used to encode
URI based hyperlinks.
\item Sequences starting in "ESC" and followed by something other than "[" or
"]".
}
\emph{Control Sequences} starting with ESC are assumed to be two characters
long (including the ESC) unless they are of the CSI or OSC variety, in which
case their length is computed as per the \href{https://ecma-international.org/publications-and-standards/standards/ecma-48/}{ECMA-48 specification},
with the exception that \href{#osc-hyperlinks}{OSC hyperlinks} may be terminated
with BEL ("\\a") in addition to ST ("ESC\\"). \code{fansi} handles most common
\emph{Control Sequences} in its parsing algorithms, but it is not a conforming
implementation of ECMA-48. For example, there are non-CSI/OSC escape
sequences that may be longer than two characters, but \code{fansi} will
(incorrectly) treat them as if they were two characters long. There are many
more unimplemented ECMA-48 specifications.
In theory it is possible to encode CSI sequences with a single byte
introducing character in the 0x40-0x5F range instead of the traditional
"ESC[". Since this is rare and it conflicts with UTF-8 encoding, \code{fansi}
does not support it.
Within \emph{Control Sequences}, \code{fansi} further distinguishes CSI SGR and OSC
hyperlinks by recording format specification and URIs into string state, and
applying the same to any output strings according to the semantics of the
functions in use. CSI SGR and OSC hyperlinks are known together as \emph{Special
Sequences}. See the following sections for details.
Additionally, all \emph{Control Sequences}, whether special or not,
do not count as characters, graphemes, or display width. You can cause
\code{fansi} to treat particular \emph{Control Sequences} as regular characters with
the \code{ctl} parameter.
}
\section{CSI SGR Control Sequences}{
\strong{NOTE}: not all displays support CSI SGR sequences; run
\code{\link{term_cap_test}} to see whether your display supports them.
CSI SGR Control Sequences are the subset of CSI sequences that can be
used to change text appearance (e.g. color). These sequences begin with
"ESC[" and end in "m". \code{fansi} interprets these sequences and writes new
ones to the output strings in such a way that the original formatting is
preserved. In most cases this should be transparent to the user.
Occasionally there may be mismatches between how \code{fansi} and a display
interpret the CSI SGR sequences, which may produce display artifacts. The
most likely sources of artifacts are \emph{Control Sequences} that move
the cursor or change the display, or that \code{fansi} otherwise fails to
interpret, such as:
\itemize{
\item Unknown SGR substrings.
\item "C0" control characters like tabs and carriage returns.
\item Other escape sequences.
}
Another possible source of problems is that different displays parse
and interpret control sequences differently. The common CSI SGR sequences
that you are likely to encounter in formatted text tend to be treated
consistently, but less common ones are not. \code{fansi} tries to hew to the
ECMA-48 specification \strong{for CSI SGR control sequences}, but not all
terminals do.
The most likely source of problems will be 24-bit CSI SGR sequences.
For example, a 24-bit color sequence such as "ESC[38;2;31;42;4m" is a
single foreground color to a terminal that supports it, or separate
foreground, background, faint, and underline specifications for one that does
not. \code{fansi} will always interpret the sequences according to ECMA-48, but
it will warn you if encountered sequences exceed those specified by
the \code{term.cap} parameter or the "fansi.term.cap" global option.
\code{fansi} will also warn if it encounters \emph{Control Sequences} that it
cannot interpret. You can turn off warnings via the \code{warn} parameter, which
can be set globally via the "fansi.warn" option. You can work around "C0"
tab characters by turning them into spaces first with \code{\link{tabs_as_spaces}} or
with the \code{tabs.as.spaces} parameter available in some of the \code{fansi}
functions.
\code{fansi} interprets CSI SGR sequences in cumulative "Graphic Rendition
Combination Mode". This means new SGR sequences add to rather than replace
previous ones, although in some cases the effect is the same as replacement
(e.g. if you have a color active and pick another one).
}
\section{OSC Hyperlinks}{
Operating System Commands are interpreted by terminal emulators typically to
engage actions external to the display of text proper, such as setting a
window title or changing the active color palette.
\href{https://iterm2.com/documentation-escape-codes.html}{Some terminals} have
added support for associating URIs to text with OSCs in a similar way to
anchors in HTML, so \code{fansi} interprets them and outputs or terminates them as
needed. For example:
\if{html}{\out{
}}
Might be interpreted as a link to the URI "x.z". To make the encoding pattern
clearer, we replace "\033]" with "<OSC>" and "\033\\\\" with
"<ST>" below:
\if{html}{\out{
}}
}
\section{State Interactions}{
The cumulative nature of state as specified by SGR or OSC hyperlinks means
that unterminated strings that are spliced will interact with each other.
By extension, a substring does not inherently contain all the information
required to recreate its state as it appeared in the source document. The
default \code{fansi} configuration terminates extracted substrings and prepends
original state to them so they present on a stand-alone basis as they did as
part of the original string.
To allow state in substrings to affect subsequent strings set \code{terminate = FALSE}, but you will need to manually terminate them or deal with the
consequences of not doing so (see "Terminal Quirks").
By default, \code{fansi} assumes that each element in an input character vector is
independent, but this is incorrect if the input is a single document with
each element a line in it. In that situation state from each line should
bleed into subsequent ones. Setting \code{carry = TRUE} enables the "single
document" interpretation.
To most closely approximate what \code{writeLines(x)} produces on your terminal,
where \code{x} is a stateful string, use \code{writeLines(fansi_fun(x, carry=TRUE, terminate=FALSE))}. \code{fansi_fun} is a stand-in for any of the \code{fansi} string
manipulation functions. Note that even with a seeming "null-op" such as
\code{substr_ctl(x, 1, nchar_ctl(x), carry=TRUE, terminate=FALSE)} the output
control sequences may not match the input ones, but the output \emph{should} look
the same if displayed to the terminal.
\code{fansi} strings will be affected by any active state in strings they are
appended to. There are no parameters to control what happens in this case,
but \code{fansi} provides functions that can help the user get the desired
behavior. \code{state_at_end} computes the active state at the end of a string,
which can then be prepended onto the \emph{input} of \code{fansi} functions so that
they are aware of the active style at the beginning of the string.
Alternatively, one could use \code{close_state(state_at_end(...))} and prepend
that to the \emph{output} of \code{fansi} functions so they are unaffected by preceding
SGR. One could also just prepend "ESC[0m", but in some cases as
described in \code{\link[=normalize_state]{?normalize_state}} that is sub-optimal.
If you intend to combine stateful \code{fansi} manipulated strings with your own,
it may be best to set \code{normalize = TRUE} for improved compatibility (see
\code{\link[=normalize_state]{?normalize_state}}).
}
\section{Terminal Quirks}{
Some terminals (e.g. OS X terminal, ITerm2) will pre-paint the entirety of a
new line with the currently active background before writing the contents of
the line. If there is a non-default active background color, any unwritten
columns in the new line will keep the prior background color even if the new
line changes the background color. To avoid this be sure to use \code{terminate = TRUE} or to manually terminate each line with e.g. "ESC[0m". The
problem manifests as:
\if{html}{\out{
}}\preformatted{" " = default background
"#" = new background
">" = start new background
"!" = restore default background
+-----------+
| abc\\n |
|>###\\n |
|!abc\\n#####| <- trailing "#" after newline are from pre-paint
| abc |
+-----------+
}\if{html}{\out{
}}
The simplest way to avoid this problem is to split input strings by any
newlines they contain, and use \code{terminate = TRUE} (the default). A more
complex solution is to pad with spaces to the terminal window width before
emitting the newline to ensure the pre-paint is overpainted with the current
line's prevailing background color.
}
\section{Encodings / UTF-8}{
\code{fansi} will convert any non-ASCII strings to UTF-8 before processing them,
and \code{fansi} functions that return strings will return them encoded in UTF-8.
In some cases this will be different to what base R does. For example,
\code{substr} re-encodes substrings to their original encoding.
Interpretation of UTF-8 strings is intended to be consistent with base R.
There are three ways things may not work out exactly as desired:
\enumerate{
\item \code{fansi}, despite its best intentions, handles a UTF-8 sequence differently
to the way R does.
\item R incorrectly handles a UTF-8 sequence.
\item Your display incorrectly handles a UTF-8 sequence.
}
These issues are most likely to occur with invalid UTF-8 sequences,
combining character sequences, and emoji. For example, whether special
characters such as emoji are considered one or two wide evolves as software
implements newer versions of the Unicode databases.
Internally, \code{fansi} computes the width of most UTF-8 character sequences
outside of the ASCII range using the native \code{R_nchar} function. This will
cause such characters to be processed more slowly than ASCII characters. Unlike R
(at least as of version 4.1), \code{fansi} can account for graphemes.
Because \code{fansi} implements its own internal UTF-8 parsing it is possible
that you will see results different from those that R produces even on
strings without \emph{Control Sequences}.
}
\section{Overflow}{
The maximum length of input character vector elements allowed by \code{fansi} is
the 32 bit INT_MAX, excluding the terminating NUL. As of R 4.1 this is the
limit for R character vector elements generally, but is enforced at the C
level by \code{fansi} nonetheless.
It is possible that during processing strings that are shorter than INT_MAX
would become longer than that. \code{fansi} checks for that overflow and will
stop with an error if that happens. A work-around for this situation is to
break up large strings into smaller ones. The limit is on each element of a
character vector, not on the vector as a whole. \code{fansi} will also error on
your system if \code{R_len_t}, the R type used to measure string lengths, is less
than the processed length of the string.
}
\section{R < 3.2.2 support}{
Nominally you can build and run this package in R versions between 3.1.0 and
3.2.1. Things should mostly work, but please be aware we do not run the test
suite under versions of R less than 3.2.2. One key degraded capability is
width computation of wide-display characters. Under R < 3.2.2 \code{fansi} will
assume every character is 1 display width. Additionally, \code{fansi} may not
always report malformed UTF-8 sequences as it usually does. One
exception to this is \code{\link{nchar_ctl}} as that is just a thin wrapper around
\code{\link[base:nchar]{base::nchar}}.
}
fansi/man/strsplit_ctl.Rd 0000644 0001762 0000144 00000021567 14533476214 015166 0 ustar ligges users % Generated by roxygen2: do not edit by hand
% Please edit documentation in R/strsplit.R
\name{strsplit_ctl}
\alias{strsplit_ctl}
\title{Control Sequence Aware Version of strsplit}
\usage{
strsplit_ctl(
x,
split,
fixed = FALSE,
perl = FALSE,
useBytes = FALSE,
warn = getOption("fansi.warn", TRUE),
term.cap = getOption("fansi.term.cap", dflt_term_cap()),
ctl = "all",
normalize = getOption("fansi.normalize", FALSE),
carry = getOption("fansi.carry", FALSE),
terminate = getOption("fansi.terminate", TRUE)
)
}
\arguments{
\item{x}{a character vector, or, unlike \code{\link[base:strsplit]{base::strsplit}}, an object that can
be coerced to character.}
\item{split}{
character vector (or object which can be coerced to such)
containing \link[base]{regular expression}(s) (unless \code{fixed = TRUE})
to use for splitting. If empty matches occur, in particular if
\code{split} has length 0, \code{x} is split into single characters.
If \code{split} has length greater than 1, it is re-cycled along
\code{x}.
}
\item{fixed}{
logical. If \code{TRUE} match \code{split} exactly, otherwise
use regular expressions. Has priority over \code{perl}.
}
\item{perl}{logical. Should Perl-compatible regexps be used?}
\item{useBytes}{logical. If \code{TRUE} the matching is done
byte-by-byte rather than character-by-character, and inputs with
marked encodings are not converted. This is forced (with a warning)
if any input is found which is marked as \code{"bytes"}
(see \code{\link[base]{Encoding}}).}
\item{warn}{TRUE (default) or FALSE, whether to warn when potentially
problematic \emph{Control Sequences} are encountered. These could cause the
assumptions \code{fansi} makes about how strings are rendered on your display
to be incorrect, for example by moving the cursor (see \code{\link[=fansi]{?fansi}}).
At most one warning will be issued per element in each input vector. Will
also warn about some badly encoded UTF-8 strings, but a lack of UTF-8
warnings is not a guarantee of correct encoding (use \code{\link{validUTF8}} for
that).}
\item{term.cap}{a character vector of the capabilities of the terminal, can
be any combination of "bright" (SGR codes 90-97, 100-107), "256" (SGR codes
starting with "38;5" or "48;5"), "truecolor" (SGR codes starting with
"38;2" or "48;2"), and "all". "all" behaves as it does for the \code{ctl}
parameter: "all" combined with any other value means all terminal
capabilities except that one. \code{fansi} will warn if it encounters SGR codes
that exceed the terminal capabilities specified (see \code{\link{term_cap_test}}
for details). In versions prior to 1.0, \code{fansi} would also skip exceeding
SGRs entirely instead of interpreting them. You may add the string "old"
to any otherwise valid \code{term.cap} spec to restore the pre 1.0 behavior.
"old" will not interact with "all" the way other valid values for this
parameter do.}
\item{ctl}{character, which \emph{Control Sequences} should be treated
specially. Special treatment is context dependent, and may include
detecting them and/or computing their display/character width as zero. For
the SGR subset of the ANSI CSI sequences, and OSC hyperlinks, \code{fansi}
will also parse, interpret, and reapply the sequences as needed. You can
modify whether a \emph{Control Sequence} is treated specially with the \code{ctl}
parameter.
\itemize{
\item "nl": newlines.
\item "c0": all other "C0" control characters (i.e. 0x01-0x1f, 0x7F), except
for newlines and the actual ESC (0x1B) character.
\item "sgr": ANSI CSI SGR sequences.
\item "csi": all non-SGR ANSI CSI sequences.
\item "url": OSC hyperlinks
\item "osc": all non-OSC-hyperlink OSC sequences.
\item "esc": all other escape sequences.
\item "all": all of the above, except when used in combination with any of the
above, in which case it means "all but".
}}
\item{normalize}{TRUE or FALSE (default) whether SGR sequences should be
normalized out such that there is one distinct sequence for each SGR code.
Normalized strings will occupy more space (e.g. "\033[31;42m" becomes
"\033[31m\033[42m"), but will work better with code that assumes each SGR
code will be in its own escape as \code{crayon} does.}
\item{carry}{TRUE, FALSE (default), or a scalar string, controls whether to
interpret the character vector as a "single document" (TRUE or string) or
as independent elements (FALSE). In "single document" mode, active state
at the end of an input element is considered active at the beginning of the
next vector element, simulating what happens with a document with active
state at the end of a line. If FALSE each vector element is interpreted as
if there were no active state when it begins. If character, then the
active state at the end of the \code{carry} string is carried into the first
element of \code{x} (see "Replacement Functions" for differences there). The
carried state is injected in the interstice between an imaginary zeroth
character and the first character of a vector element. See the "Position
Semantics" section of \code{\link{substr_ctl}} and the "State Interactions" section
of \code{\link[=fansi]{?fansi}} for details. Except for \code{\link{strwrap_ctl}} where \code{NA} is
treated as the string \code{"NA"}, \code{carry} will cause \code{NA}s in inputs to
propagate through the remaining vector elements.}
\item{terminate}{TRUE (default) or FALSE whether substrings should have
active state closed to avoid it bleeding into other strings they may be
prepended onto. This does not stop state from carrying if \code{carry = TRUE}.
See the "State Interactions" section of \code{\link[=fansi]{?fansi}} for details.}
}
\value{
Like \code{\link[base:strsplit]{base::strsplit}}, with \emph{Control Sequences} excluded.
}
\description{
A drop-in replacement for \code{\link[base:strsplit]{base::strsplit}}.
}
\details{
This function works by computing the position of the split points after
removing \emph{Control Sequences}, and uses those positions in conjunction with
\code{\link{substr_ctl}} to extract the pieces. This concept is borrowed from
\code{crayon::col_strsplit}. An important implication of this is that you cannot
split by \emph{Control Sequences} that are being treated as \emph{Control Sequences}.
You can however limit which control sequences are treated specially via the
\code{ctl} parameter (see examples).
}
\note{
The split positions are computed after both \code{x} and \code{split} are
converted to UTF-8.
Non-ASCII strings are converted to and returned in UTF-8 encoding.
Width calculations will not work properly in R < 3.2.2.
}
\section{Control and Special Sequences}{
\emph{Control Sequences} are non-printing characters or sequences of characters.
\emph{Special Sequences} are a subset of the \emph{Control Sequences}, and include CSI
SGR sequences which can be used to change rendered appearance of text, and
OSC hyperlinks. See \code{\link{fansi}} for details.
}
\section{Output Stability}{
Several factors could affect the exact output produced by \code{fansi}
functions across versions of \code{fansi}, \code{R}, and/or across systems.
\strong{In general it is best not to rely on exact \code{fansi} output, e.g. by
embedding it in tests}.
Width and grapheme calculations depend on locale, Unicode database
version, and grapheme processing logic (which is still in development), among
other things. For the most part \code{fansi} (currently) uses the internals of
\code{base::nchar(type='width')}, but there are exceptions and this may change in
the future.
How a particular display format is encoded in \emph{Control Sequences} is
not guaranteed to be stable across \code{fansi} versions. Additionally, which
\emph{Special Sequences} are re-encoded vs transcribed untouched may change.
In general we will strive to keep the rendered appearance stable.
To maximize the odds of getting stable output set \code{normalize} to
\code{TRUE} and \code{type} to \code{"chars"} in functions that allow it, and
set \code{term.cap} to a specific set of capabilities.
}
\section{Bidirectional Text}{
\code{fansi} is unaware of text directionality and operates as if all strings are
left to right (LTR). Using \code{fansi} functions with strings that contain mixed
direction scripts (i.e. both LTR and RTL) may produce undesirable results.
}
\examples{
strsplit_ctl("\033[31mhello\033[42m world!", " ")
## Splitting by newlines does not work as they are _Control
## Sequences_, but we can use `ctl` to treat them as ordinary characters
strsplit_ctl("\033[31mhello\033[42m\nworld!", "\n")
strsplit_ctl("\033[31mhello\033[42m\nworld!", "\n", ctl=c("all", "nl"))
}
\seealso{
\code{\link[=fansi]{?fansi}} for details on how \emph{Control Sequences} are
interpreted, particularly if you are getting unexpected results,
\code{\link{normalize_state}} for more details on what the \code{normalize} parameter does,
\code{\link{state_at_end}} to compute active state at the end of strings,
\code{\link{close_state}} to compute the sequence required to close active state.
}
fansi/man/make_styles.Rd 0000644 0001762 0000144 00000004206 14213626056 014743 0 ustar ligges users % Generated by roxygen2: do not edit by hand
% Please edit documentation in R/tohtml.R
\name{make_styles}
\alias{make_styles}
\title{Generate CSS Mapping Classes to Colors}
\usage{
make_styles(classes, rgb.mix = diag(3))
}
\arguments{
\item{classes}{a character vector of either 16, 32, or 512 class names. The
character vectors are described in \code{\link{to_html}}.}
\item{rgb.mix}{3 x 3 numeric matrix to remix color channels. Given a N x 3
matrix of numeric RGB colors \code{rgb}, the colors used in the style sheet will
be \code{rgb \%*\% rgb.mix}. Out of range values are clipped to the nearest bound
of the range.}
}
\value{
A character vector that can be used as the contents of a style sheet.
}
\description{
Given a set of class names, produce the CSS that maps them to the default
8-bit colors. This is a helper function to generate style sheets for use
in examples with either default or remixed \code{fansi} colors. In practice users
will create their own style sheets mapping their classes to their preferred
styles.
}
\examples{
## Generate some class strings; order matters
classes <- do.call(paste, c(expand.grid(c("fg", "bg"), 0:7), sep="-"))
writeLines(classes[1:4])
## Some Default CSS
css0 <- "span {font-size: 60pt; padding: 10px; display: inline-block}"
## Associate class strings with styles
css1 <- make_styles(classes)
writeLines(css1[1:4])
## Generate SGR-derived HTML, mapping to classes
string <- "\033[43mYellow\033[m\n\033[45mMagenta\033[m\n\033[46mCyan\033[m"
html <- to_html(string, classes=classes)
writeLines(html)
## Combine in a page with styles and display in browser
\dontrun{
in_html(html, css=c(css0, css1))
}
## Change CSS by remixing colors, and apply to exact same HTML
mix <- matrix(
c(
0, 1, 0, # red output is green input
0, 0, 1, # green output is blue input
1, 0, 0 # blue output is red input
),
nrow=3, byrow=TRUE
)
css2 <- make_styles(classes, rgb.mix=mix)
## Display in browser: same HTML but colors changed by CSS
\dontrun{
in_html(html, css=c(css0, css2))
}
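## A base R sketch of the arithmetic described for `rgb.mix` (an
## illustration only, not the package internals): take the matrix product
## of the colors and the mix matrix, then clip out of range values.  The
## 0-255 range and the names `rgb.in` and `brighten` are assumptions made
## for this illustration.
rgb.in <- rbind(red=c(255, 0, 0), grey=c(128, 128, 128))
brighten <- diag(3) * 1.5
## crossprod(t(a), b) is the matrix product of a and b
remixed <- crossprod(t(rgb.in), brighten)
clipped <- pmin(pmax(remixed, 0), 255)
clipped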
}
\seealso{
Other HTML functions:
\code{\link{html_esc}()},
\code{\link{in_html}()},
\code{\link{to_html}()}
}
\concept{HTML functions}
fansi/man/sgr_256.Rd 0000644 0001762 0000144 00000001174 14213626056 013613 0 ustar ligges users
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/misc.R
\name{sgr_256}
\alias{sgr_256}
\title{Show 8 Bit CSI SGR Colors}
\usage{
sgr_256()
}
\value{
character vector with SGR codes with background color set as
themselves.
}
\description{
Generates text with each 8 bit SGR code (e.g. the "###" in "38;5;###") with
the background colored by itself, and the foreground in a contrasting and
interesting color (we sacrifice some contrast for interest as this is
intended for demo rather than reference purposes).
}
\examples{
writeLines(sgr_256())
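## For comparison, a hand-written sequence for a single 8 bit background
## color; the code 208 and the name `one.color` are arbitrary choices for
## illustration, and "48;5;###" is the background counterpart of the
## "38;5;###" foreground form
one.color <- "\033[48;5;208m 208 \033[m"
writeLines(one.color)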
}
\seealso{
\code{\link[=make_styles]{make_styles()}}.
}
fansi/man/has_sgr.Rd 0000644 0001762 0000144 00000002116 14213626164 014047 0 ustar ligges users
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/sgr.R
\name{has_sgr}
\alias{has_sgr}
\title{Check for Presence of Control Sequences}
\usage{
has_sgr(x, warn = getOption("fansi.warn", TRUE))
}
\arguments{
\item{x}{a character vector or object that can be coerced to such.}
\item{warn}{TRUE (default) or FALSE, whether to warn when potentially
problematic \emph{Control Sequences} are encountered. These could cause the
assumptions \code{fansi} makes about how strings are rendered on your display
to be incorrect, for example by moving the cursor (see \code{\link[=fansi]{?fansi}}).
At most one warning will be issued per element in each input vector. Will
also warn about some badly encoded UTF-8 strings, but a lack of UTF-8
warnings is not a guarantee of correct encoding (use \code{\link{validUTF8}} for
that).}
}
\value{
logical of same length as \code{x}; NA values in \code{x} result in NA values
in return
}
\description{
This function is deprecated in favor of \code{\link{has_ctl}}. It
checks for CSI SGR and OSC hyperlink sequences.
}
\keyword{internal}
fansi/man/in_html.Rd 0000644 0001762 0000144 00000003111 14213626056 014047 0 ustar ligges users
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/tohtml.R
\name{in_html}
\alias{in_html}
\title{Frame HTML in a Web Page And Display}
\usage{
in_html(x, css = character(), pre = TRUE, display = TRUE, clean = display)
}
\arguments{
\item{x}{character vector of html encoded strings.}
\item{css}{character vector of css styles.}
\item{pre}{TRUE (default) or FALSE, whether to wrap \code{x} in PRE tags.}
\item{display}{TRUE or FALSE, whether to display the resulting page in a
browser window. If TRUE, will sleep for one second before returning, and
will delete the temporary file used to store the HTML.}
\item{clean}{TRUE or FALSE, if TRUE and \code{display == TRUE}, will delete the
temporary file used for the web page, otherwise will leave it.}
}
\value{
character(1L) the file location of the page, invisibly, but keep in
mind it will have been deleted if \code{clean=TRUE}.
}
\description{
Helper function that assembles user provided HTML and CSS into a temporary
text file, and by default displays it in the browser. Intended for use in
examples.
}
\examples{
txt <- "\033[31;42mHello \033[7mWorld\033[m"
writeLines(txt)
html <- to_html(txt)
\dontrun{
in_html(html) # spawns a browser window
}
writeLines(readLines(in_html(html, display=FALSE)))
css <- "SPAN {text-decoration: underline;}"
writeLines(readLines(in_html(html, css=css, display=FALSE)))
\dontrun{
in_html(html, css)
}
}
\seealso{
\code{\link[=make_styles]{make_styles()}}.
Other HTML functions:
\code{\link{html_esc}()},
\code{\link{make_styles}()},
\code{\link{to_html}()}
}
\concept{HTML functions}
fansi/man/set_knit_hooks.Rd 0000644 0001762 0000144 00000012167 14213626056 015453 0 ustar ligges users
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/misc.R
\name{set_knit_hooks}
\alias{set_knit_hooks}
\title{Set an Output Hook to Convert Control Sequences to HTML in Rmarkdown}
\usage{
set_knit_hooks(
hooks,
which = "output",
proc.fun = function(x, class) html_code_block(to_html(html_esc(x)), class = class),
class = sprintf("fansi fansi-\%s", which),
style = getOption("fansi.css", dflt_css()),
split.nl = FALSE,
.test = FALSE
)
}
\arguments{
\item{hooks}{list, this should be the \code{knitr::knit_hooks} object; we
require you pass this to avoid a run-time dependency on \code{knitr}.}
\item{which}{character vector with the names of the hooks that should be
replaced, defaults to 'output', but can also contain values
'message', 'warning', and 'error'.}
\item{proc.fun}{function that will be applied to output that contains
CSI SGR sequences. Should accept parameters \code{x} and \code{class}, where \code{x} is
the output, and \code{class} is the CSS class that should be applied to the
blocks the output will be placed in.}
\item{class}{character the CSS class to give the output chunks. Each type of
output chunk specified in \code{which} will be matched position-wise to the
classes specified here. This vector should be the same length as \code{which}.}
\item{style}{character a vector of CSS styles; these will be output inside
HTML <STYLE> tags as a side effect. The default value is designed to
ensure that there is no visible gap in background color with lines with
height 1.5 (as is the default setting in \code{rmarkdown} documents v1.1).}
\item{split.nl}{TRUE or FALSE (default), set to TRUE to split input strings
by any newlines they may contain to avoid any newlines inside SPAN tags
created by \code{\link[=to_html]{to_html()}}. Some markdown->html renderers can be configured
to convert embedded newlines into line breaks, which may lead to a doubling
of line breaks. With the default \code{proc.fun} the split strings are
recombined by \code{\link[=html_code_block]{html_code_block()}}, but if you provide your own \code{proc.fun}
you'll need to account for the possibility that the character vector it
receives will have a different number of elements than the chunk output.
This argument only has an effect if chunk output contains CSI SGR
sequences.}
\item{.test}{TRUE or FALSE, for internal testing use only.}
}
\value{
named list with the prior output hooks for each of \code{which}.
}
\description{
This is a convenience function designed for use within an \code{rmarkdown}
document. It overrides the \code{knitr} output hooks by using
\code{knitr::knit_hooks$set}. It replaces the hooks with ones that convert
\emph{Control Sequences} into HTML. In addition to replacing the hook functions,
this will output a "),
"",
if(pre) "