R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS. To download R, please choose your preferred CRAN mirror.
If you have questions about R like how to download and install the software, or what the license terms are, please read our answers to frequently asked questions before you send an email.
R 3.2.0 (Full of Ingredients) prerelease versions will appear starting March 19. Final release is scheduled for 2015-04-16.
R version 3.1.3 (Smooth Sidewalk) has been released on 2015-03-09.
The R Journal Volume 6/2 is available.
R version 3.1.2 (Pumpkin Helmet) has been released on 2014-10-31.
useR! 2015 will take place at the University of Aalborg, Denmark, June 30 - July 3, 2015.
useR! 2014 took place at the University of California, Los Angeles, USA, June 30 - July 3, 2014.
Modifying existing XML can be done in xml2 by using the replacement functions of the accessors. They all have methods for both individual xml_node objects and xml_nodeset objects. If a vector of values is provided it is applied piecewise over the nodeset; otherwise the value is recycled.
Text modification only happens on text nodes. If a given node has more than one text node, only the first will be affected. If you want to modify additional text nodes you need to select them explicitly with /text().
x <- read_xml("<p>This is some <b>text</b>. This is more.</p>")
xml_text(x)
#> [1] "This is some text. This is more."
xml_text(x) <- "This is some other text."
xml_text(x)
#> [1] "This is some other text.text. This is more."
# You can avoid this by explicitly selecting the text node.
x <- read_xml("<p>This is some text. This is <b>bold!</b></p>")
text_only <- xml_find_all(x, "//text()")
xml_text(text_only) <- c("This is some other text. ", "Still bold!")
xml_text(x)
#> [1] "This is some other text. Still bold!"
xml_structure(x)
#> <p>
#>   {text}
#>   <b>
#>     {text}
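The piecewise-versus-recycled behavior described above can be sketched on a nodeset as follows. The `<ul>`/`<li>` document is a made-up input used only for illustration; the functions are the same `read_xml()`, `xml_find_all()` and `xml_text()` used elsewhere in this text.

```r
library(xml2)

# Hypothetical three-item list, used only for illustration.
x <- read_xml("<ul><li>a</li><li>b</li><li>c</li></ul>")
items <- xml_find_all(x, "//li")

# A single value is recycled across every node in the nodeset.
xml_text(items) <- "same"
recycled <- xml_text(items)

# A vector with one value per node is applied element by element.
xml_text(items) <- c("one", "two", "three")
piecewise <- xml_text(items)
```

Here `recycled` and `piecewise` are just local names for capturing the text after each assignment so the two cases can be compared.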
Attributes and namespace definitions are modified one at a time with xml_attr() or all at once with xml_attrs(). In both cases using NULL as the value will remove the attribute completely.
x <- read_xml("<a href='invalid!'>xml2</a>")
xml_attr(x, "href")
#> [1] "invalid!"
xml_attr(x, "href") <- "https://github.com/r-lib/xml2"
xml_attr(x, "href")
#> [1] "https://github.com/r-lib/xml2"
xml_attrs(x) <- c(id = "xml2", href = "https://github.com/r-lib/xml2")
xml_attrs(x)
#>                            href                              id 
#> "https://github.com/r-lib/xml2"                          "xml2" 
x
#> {xml_document}
#> <a href="https://github.com/r-lib/xml2" id="xml2">
xml_attrs(x) <- NULL
x
#> {xml_document}
#> <a>
# Namespaces are added as an xmlns or xmlns:prefix attribute
xml_attr(x, "xmlns") <- "http://foo"
x
#> {xml_document}
#> <a xmlns="http://foo">
xml_attr(x, "xmlns:bar") <- "http://bar"
x
#> {xml_document}
#> <a xmlns="http://foo" xmlns:bar="http://bar">
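The examples here clear every attribute at once with `xml_attrs(x) <- NULL`. As a small sketch of the single-attribute case, assigning `NULL` through `xml_attr()` should drop only the named attribute. The input document is an illustrative assumption.

```r
library(xml2)

# Illustrative element carrying two attributes.
x <- read_xml("<a href='https://github.com/r-lib/xml2' id='xml2'>xml2</a>")

# Assigning NULL to one attribute removes just that attribute.
xml_attr(x, "id") <- NULL

remaining <- xml_attrs(x)        # only href should be left
has_id <- xml_has_attr(x, "id")  # FALSE once the attribute is gone
```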
Node names are modified with xml_name().
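A minimal sketch of renaming, assuming a small made-up document: the replacement form of `xml_name()` is applied first to a single node and then to a nodeset, where the single value is recycled over every node.

```r
library(xml2)

# Made-up document: a root element with two identical children.
x <- read_xml("<a><b/><b/></a>")

# Rename a single node (here the root element).
xml_name(x) <- "list"

# Rename every node in a nodeset; the one value is recycled.
kids <- xml_children(x)
xml_name(kids) <- "item"

root_name <- xml_name(x)
child_names <- xml_name(kids)
```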
The node modification functions (xml_replace(), xml_add_sibling() and xml_add_child()) all have a .copy argument. If this is set to FALSE they will remove the new node from its location before inserting it into the new location. Otherwise they make a copy of the node before insertion.
x <- read_xml("<parent><child>1</child><child>2<child>3</child></child></parent>")
children <- xml_children(x)
t1 <- children[[1]]
t2 <- children[[2]]
t3 <- xml_children(children[[2]])[[1]]
xml_replace(t1, t3)
#> {xml_node}
#> <child>
x
#> {xml_document}
#> <parent>
#> [1] <child>3</child>
#> [2] <child>2<child>3</child></child>
x <- read_xml("<parent><child>1</child><child>2<child>3</child></child></parent>")
children <- xml_children(x)
t1 <- children[[1]]
t2 <- children[[2]]
t3 <- xml_children(children[[2]])[[1]]
xml_add_sibling(t1, t3)
x
#> {xml_document}
#> <parent>
#> [1] <child>1</child>
#> [2] <child>3</child>
#> [3] <child>2<child>3</child></child>
xml_add_sibling(t3, t1, where = "before")
x
#> {xml_document}
#> <parent>
#> [1] <child>1</child>
#> [2] <child>3</child>
#> [3] <child>2<child>3</child><child>1</child></child>
x <- read_xml("<parent><child>1</child><child>2<child>3</child></child></parent>")
children <- xml_children(x)
t1 <- children[[1]]
t2 <- children[[2]]
t3 <- xml_children(children[[2]])[[1]]
xml_add_child(t1, t3)
x
#> {xml_document}
#> <parent>
#> [1] <child>1<child>3</child></child>
#> [2] <child>2<child>3</child></child>
xml_add_child(t1, read_xml("<test/>"))
x
#> {xml_document}
#> <parent>
#> [1] <child>1<child>3</child><test/></child>
#> [2] <child>2<child>3</child></child>
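The examples above all rely on the default copying behavior. As a hedged sketch of the alternative, passing `.copy = FALSE` to `xml_add_sibling()` should move the existing node rather than duplicate it. The document is the same one used above; the variable names are illustrative.

```r
library(xml2)

x <- read_xml("<parent><child>1</child><child>2<child>3</child></child></parent>")
children <- xml_children(x)
t1 <- children[[1]]
t3 <- xml_children(children[[2]])[[1]]

# Move (rather than copy) the nested <child>3</child> so that it becomes
# the sibling immediately after the first child.
xml_add_sibling(t1, t3, .copy = FALSE)

# The nested node has left its original parent, so the root now has three
# children and the former parent of t3 has none.
top_level_text <- xml_text(xml_children(x))
nested_count <- length(xml_children(xml_children(x)[[3]]))
```

With the default `.copy = TRUE` the nested node would stay in place and the root would gain a copy instead, as in the earlier output.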
xml_remove() can be used to remove a node (and its children) from a tree. The default behavior is to unlink the node from the tree without freeing the memory for the node, so R objects pointing to the node are still valid.
This allows code like the following to work without crashing R:
library(magrittr)  # provides the %>% pipe used below
x <- read_xml("<foo><bar><baz/></bar></foo>")
x1 <- x %>% xml_children() %>% .[[1]]
x2 <- x1 %>% xml_children() %>% .[[1]]
xml_remove(x1)
rm(x1)
gc()
#>          used (Mb) gc trigger (Mb) max used (Mb)
#> Ncells 511223 27.4     940480 50.3   750400 40.1
#> Vcells 994351  7.6    1978995 15.1  1350525 10.4
x2
#> {xml_node}
#> <baz>
If you are not planning on referencing these nodes again this memory is wasted. Calling xml_remove(free = TRUE) will remove the nodes and free the memory used to store them. Note: in this case any node object which previously pointed to the node or its children will instead be pointing to freed memory and may cause R to crash. xml2 can’t figure this out for you, so it’s your responsibility to remove any objects which are no longer valid.
In particular, xml_find_*() results are easy to overlook.
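A sketch of how a stale `xml_find_all()` result can arise, assuming a small made-up document. The stale nodeset is deliberately never printed or indexed after the free; it is dropped and the query is run again.

```r
library(xml2)

# Made-up document containing three <b> elements.
x <- read_xml("<a><b/><b><b/></b></a>")
bees <- xml_find_all(x, "//b")
n_before <- length(bees)

# Remove the first <b> and free its memory.
xml_remove(xml_child(x), free = TRUE)

# bees[[1]] now refers to freed memory and must not be used.
# Drop the stale result and query the document again instead.
rm(bees)
bees <- xml_find_all(x, "//b")
n_after <- length(bees)
```

Re-running the query after any freeing removal is one simple way to make sure no R object still points at a node that no longer exists.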
We want to construct a document with the following namespace layout. (From http://stackoverflow.com/questions/32939229/creating-xml-in-r-with-namespaces/32941524#32941524).
<?xml version = "1.0" encoding="UTF-8"?>
<sld xmlns="http://www.o.net/sld"
     xmlns:ogc="http://www.o.net/ogc"
     xmlns:se="http://www.o.net/se"
     version="1.1.0" >
  <layer>
    <se:Name>My Layer</se:Name>
  </layer>
</sld>
d <- xml_new_root("sld",
    xmlns = "http://www.o.net/sld",
    "xmlns:ogc" = "http://www.o.net/ogc",
    "xmlns:se" = "http://www.o.net/se",
    version = "1.1.0") %>%
  xml_add_child("layer") %>%
  xml_add_child("se:Name", "My Layer") %>%
  xml_root()
d
#> {xml_document}
#> <sld version="1.1.0" xmlns="http://www.o.net/sld" xmlns:ogc="http://www.o.net/ogc" xmlns:se="http://www.o.net/se">
#> [1] <layer>\n <se:Name>My Layer</se:Name>\n</layer>
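One way to check a namespace layout like this is to query it with namespace-aware XPath. The sketch below parses the target XML directly with `read_xml()` instead of reusing `d`, so that it is self-contained. It relies on `xml_ns()`, which should report the default namespace under the generated prefix `d1`.

```r
library(xml2)

# The target layout from above, parsed from a string.
target <- read_xml('
<sld xmlns="http://www.o.net/sld"
     xmlns:ogc="http://www.o.net/ogc"
     xmlns:se="http://www.o.net/se"
     version="1.1.0">
  <layer>
    <se:Name>My Layer</se:Name>
  </layer>
</sld>')

# Collect the namespace definitions found in the document.
ns <- xml_ns(target)

# Elements in the default namespace are addressed through the d1 prefix,
# prefixed elements through their own prefix.
layer <- xml_find_all(target, "//d1:layer", ns)
layer_name <- xml_text(xml_find_first(target, "//se:Name", ns))
```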
```{r}
x <- read_xml("<p>This is some <b>text</b>. This is more.</p>")
xml_text(x)

xml_text(x) <- "This is some other text."
xml_text(x)

# You can avoid this by explicitly selecting the text node.
x <- read_xml("<p>This is some text. This is <b>bold!</b></p>")
text_only <- xml_find_all(x, "//text()")

xml_text(text_only) <- c("This is some other text. ", "Still bold!")
xml_text(x)
xml_structure(x)
```

## Attribute and Namespace Definition Modification ##

Attributes and namespace definitions are modified one at a time with `xml_attr()` or all at once with `xml_attrs()`. In both cases using `NULL` as the value will remove the attribute completely.

```{r}
x <- read_xml("<a href='invalid!'>xml2</a>")
xml_attr(x, "href")

xml_attr(x, "href") <- "https://github.com/r-lib/xml2"
xml_attr(x, "href")

xml_attrs(x) <- c(id = "xml2", href = "https://github.com/r-lib/xml2")
xml_attrs(x)
x

xml_attrs(x) <- NULL
x

# Namespaces are added as an xmlns or xmlns:prefix attribute
xml_attr(x, "xmlns") <- "http://foo"
x

xml_attr(x, "xmlns:bar") <- "http://bar"
x
```

## Name Modification ##

Node names are modified with `xml_name()`.

```{r}
x <- read_xml("<a><b/></a>")
x
xml_name(x)
xml_name(x) <- "c"
x
```

# Node modification #

All of these functions have a `.copy` argument. If this is set to `FALSE` they will remove the new node from its location before inserting it into the new location. Otherwise they make a copy of the node before insertion.

## Replacing existing nodes ##

```{r}
x <- read_xml("