selectr/0000755000176200001440000000000013565165573011733 5ustar liggesusers
selectr/NAMESPACE0000644000176200001440000000056513511557740013150 0ustar liggesusers
import(methods)
importFrom(stringr, str_locate, str_match, str_split_fixed, str_trim)
importFrom(R6, R6Class)

export(css_to_xpath)
export(querySelector)
export(querySelectorAll)
export(querySelectorNS)
export(querySelectorAllNS)

S3method(querySelector, default)
S3method(querySelectorAll, default)
S3method(querySelectorNS, default)
S3method(querySelectorAllNS, default)
selectr/.Rinstignore0000644000176200001440000000002513511557740014224 0ustar liggesusers
Makefile
.travis.yml
selectr/LICENCE0000644000176200001440000000012713511557740012710 0ustar liggesusers
YEAR: 2016
COPYRIGHT HOLDER: Simon Potter, Simon Sapin, Ian Bicking
ORGANIZATION: None
selectr/README.md0000644000176200001440000000426513511557740013211 0ustar liggesusers
# selectr

[![License (3-Clause BSD)](https://img.shields.io/badge/license-BSD%203--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)
[![Build Status](https://travis-ci.org/sjp/selectr.svg)](https://travis-ci.org/sjp/selectr)
[![CRAN version](https://www.r-pkg.org/badges/version/selectr)](https://cran.r-project.org/package=selectr)
[![codecov](https://codecov.io/gh/sjp/selectr/branch/master/graph/badge.svg)](https://codecov.io/gh/sjp/selectr)
![Downloads per month](https://cranlogs.r-pkg.org/badges/last-month/selectr)

selectr is a package which makes working with HTML and XML documents easier. It does this by performing translation of CSS selectors into XPath expressions so that you can query `XML` and `xml2` documents easily.

``` r
library(selectr)

xpath <- css_to_xpath("#selectr")
xpath
#> [1] "descendant-or-self::*[@id = 'selectr']"
```

## Installation

### Install the release version from CRAN

``` r
install.packages("selectr")
```

### Install the development version from GitHub

``` r
# install.packages("devtools")
devtools::install_github("sjp/selectr")
```

## Overview

The key functions in selectr are:

* Translate a CSS selector into an XPath expression with `css_to_xpath()`.
* Query an `XML` or `xml2` document with `querySelector()` and its variants.
  * Find the first matching node with `querySelector()`.
  * Find all matching nodes with `querySelectorAll()`.
  * Find the first matching node in a namespaced document with `querySelectorNS()`.
  * Find all matching nodes in a namespaced document with `querySelectorAllNS()`.

## Examples

Here is a simple example to demonstrate how to query an `XML` or `xml2` document with `querySelector()`.

``` r
library(selectr)

xmlText <- ''

library(XML)
doc <- xmlParse(xmlText)
querySelector(doc, "baz")
#>
querySelectorAll(doc, "baz")
#> [[1]]
#>
#>
#> [[2]]
#>
#>
#> attr(,"class")
#> [1] "XMLNodeSet"

library(xml2)
doc <- read_xml(xmlText)
querySelector(doc, "baz")
#> {xml_node}
#>
querySelectorAll(doc, "baz")
#> {xml_nodeset (2)}
#> [1]
#> [2]
```
selectr/man/0000755000176200001440000000000013511557740012476 5ustar liggesusers
selectr/man/querySelectorAll.Rd0000644000176200001440000001416113511557740016267 0ustar liggesusers
\name{querySelectorAll}
\alias{querySelector}
\alias{querySelectorAll}
\alias{querySelectorNS}
\alias{querySelectorAllNS}
\title{
  Find nodes that match a group of CSS selectors in an XML tree.
}
\description{
  The purpose of these functions is to mimic the functionality of the
  \code{querySelector} and \code{querySelectorAll} functions present in
  Internet browsers. This is so we can succinctly query an XML tree for
  nodes matching a CSS selector.

  Namespaced functions \code{querySelectorNS} and
  \code{querySelectorAllNS} are also provided to search relative to a
  given namespace.
}
\usage{
querySelector(doc, selector, ns = NULL, ...)
querySelectorAll(doc, selector, ns = NULL, ...)
querySelectorNS(doc, selector, ns,
                prefix = "descendant-or-self::", ...)
querySelectorAllNS(doc, selector, ns,
                   prefix = "descendant-or-self::", ...)
}
\arguments{
  \item{doc}{
    The XML document or node to be evaluated against.
  }

  \item{selector}{
    A selector used to query \code{doc}. This must be a single character
    string.
  }

  \item{ns}{
    The namespace that the query will be filtered to. This is a named
    list or vector whose names are namespace prefixes and whose values
    are the corresponding namespace URIs. This can be ignored for the
    un-namespaced functions.
  }

  \item{prefix}{
    The prefix to apply to the resulting XPath expression. The default
    and \code{""} are the most commonly used values.
  }

  \item{...}{
    Parameters to be passed on to \code{css_to_xpath}.
  }
}
\details{
  The \code{querySelectorNS} and \code{querySelectorAllNS} functions are
  convenience functions for working with namespaced documents. They
  filter out all content that does not belong within the given
  namespaces. Note that element names in a selector must carry a
  namespace prefix, e.g. \code{"svg|g"}.

  The namespace argument, \code{ns}, is simply passed on to
  \code{\link[XML]{getNodeSet}} or \code{\link[xml2]{xml_find_all}} if
  it is necessary to use a namespace present within the document. This
  can be ignored for content lacking a namespace, which is usually the
  case when using \code{querySelector} or \code{querySelectorAll}.
}
\value{
  For \code{querySelector}, the result is a single node that represents
  the first matched node from a selector. If no matching nodes are
  found, \code{NULL} is returned.

  For \code{querySelectorAll}, the result is a list of XML nodes. This
  list is empty if no match is found.

  The \code{querySelectorNS} and \code{querySelectorAllNS} functions
  return the same type of content as their un-namespaced counterparts.
}
\references{
  CSS3 Selectors \url{https://www.w3.org/TR/css3-selectors/}, XPath
  \url{https://www.w3.org/TR/xpath/}, querySelectorAll
  \url{https://developer.mozilla.org/en-US/docs/DOM/Document.querySelectorAll}
  and \url{http://www.w3.org/TR/selectors-api/#interface-definitions}.
}
\author{
  Simon Potter
}
\examples{
  hasXML <- require(XML)
  hasxml2 <- require(xml2)

  if (!hasXML && !hasxml2)
    return() # can't demo without XML or xml2 packages present

  parseFn <- if (hasXML) xmlParse else read_xml
  # Demo for working with the XML package (if present, otherwise xml2)
  exdoc <- parseFn('')
  querySelector(exdoc, "#anid")   # Returns the matching node
  querySelector(exdoc, ".aclass") # Returns the matching node
  querySelector(exdoc, "b, c")    # First match from grouped selection
  querySelectorAll(exdoc, "b, c") # Grouped selection
  querySelectorAll(exdoc, "b")    # A list of length one
  querySelector(exdoc, "d")       # No match
  querySelectorAll(exdoc, "d")    # No match

  # Read in a document where two namespaces are being set:
  # SVG and MathML
  svgdoc <- parseFn(system.file("demos/svg-mathml.svg",
                                package = "selectr"))
  # Search for