WikidataR/inst/doc/Introduction.Rmd
# WikidataR: the API client library for Wikidata
Wikidata is a wonderful and irreplaceable resource for linked data, containing information on pretty much any subject. If there's a Wikipedia article on it, there's almost certainly a Wikidata item for it.
WikidataR - following the naming scheme of [WikipediR](https://github.com/Ironholds/WikipediR#thanks-and-misc) - is an API client library for Wikidata, written in and accessible from R.
## Items and properties
The two basic component pieces of Wikidata are "items" and "properties". An "item" is a thing - a concept, object or
topic that exists in the real world, such as "Rush". These items each have statements associated with them - for
example, "Rush is an instance of: Rock Band". In that statement, "instance of" is a property: a class or trait
that items can hold, here with the value "Rock Band" (which is itself an item). Wikidata items are organised as descriptors of the item, in various languages, and references to the properties that that item holds.
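As a rough illustration of that layout, the sketch below builds a hand-written stand-in for a single item and looks up one of its claims. The nested field names (`labels`, `descriptions`, `claims`) follow what the package's print methods read; the values, the `mock_item` object and the `claim_or_na` helper are made up for the example and are not part of WikidataR.

```r
# A hand-built stand-in for one Wikidata item (illustrative values only;
# a real item returned by get_item() carries far more detail).
mock_item <- list(
  id = "Q0000",   # placeholder ID, not a real lookup
  type = "item",
  labels = list(en = list(language = "en", value = "Rush"),
                fr = list(language = "fr", value = "Rush")),
  descriptions = list(en = list(language = "en", value = "Canadian rock band")),
  # Claims are keyed by property ID; P31 is "instance of".
  # Real claim data is a nested structure, simplified here to a string.
  claims = list(P31 = "rock band")
)

# Hypothetical helper: return the data stored under one property ID,
# or NA when the item holds no such claim.
claim_or_na <- function(item, property_id) {
  if (property_id %in% names(item$claims)) {
    item$claims[[property_id]]
  } else {
    NA
  }
}

english_label <- mock_item$labels[["en"]]$value
instance_of <- claim_or_na(mock_item, "P31")
missing_claim <- claim_or_na(mock_item, "P9999999")
```

With real data, the exported `extract_claims` function performs the same kind of lookup across several items and properties at once.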
## Retrieving specific items or properties
Items and properties are both identified by numeric IDs, prefaced with "Q" in the case of items,
and "P" in the case of properties. WikidataR can be used to retrieve items or properties with specific
ID numbers, using the get\_item and get\_property functions:
```{r, eval=FALSE}
#Retrieve an item
item <- get_item(id = 1)
#Get information about the property of the first claim it has.
#get_item returns a list with one element per ID, so take the first element.
first_claim <- get_property(id = names(item[[1]]$claims)[1])
#Do we succeed? Dewey!
```
These functions are capable of accepting various forms for the ID, including (as examples), "Q100" or "100"
for items, and "Property:P100", "P100" or "100" for properties. They're also vectorised - pass them as many IDs as you want!
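The ID handling described above can be sketched in a few lines of base R. This loosely mirrors the package's internal input check rather than reproducing it exactly; `normalise_id` and `normalise_property_id` are hypothetical names used only for this example.

```r
# Hypothetical helper: purely numeric IDs get the prefix,
# anything else passes through untouched.
normalise_id <- function(id, prefix) {
  id <- as.character(id)
  bare <- grepl("^\\d+$", id)
  id[bare] <- paste0(prefix, id[bare])
  id
}

# Hypothetical helper for properties, which live in the "Property:" namespace.
normalise_property_id <- function(id) {
  id <- as.character(id)
  # "P100" -> "Property:P100"; IDs already carrying the namespace are left alone
  short <- grepl("^P\\d+$", id)
  id[short] <- paste0("Property:", id[short])
  # "100" -> "Property:P100"
  normalise_id(id, "Property:P")
}

item_ids <- normalise_id(c("100", "Q100"), "Q")
property_ids <- normalise_property_id(c("Property:P100", "P100", "100"))
```

Both helpers work on whole vectors, which is what makes it cheap for the real functions to accept many IDs in one call.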
## Retrieving randomly-selected items or properties
As well as retrieving specific items or properties, Wikidata's API also allows for the retrieval of *random*
elements. With WikidataR, this can be achieved through:
```{r, eval=FALSE}
#Retrieve a random item
rand_item <- get_random_item()
#Retrieve a random property
rand_prop <- get_random_property()
```
These also allow you to retrieve *sets* of random elements - not just one at a time, but say, 50 at a time - by including the "limit" argument:
```{r, eval=FALSE}
#Retrieve 42 random items
rand_item <- get_random_item(limit = 42)
#Retrieve 42 random properties
rand_prop <- get_random_property(limit = 42)
```
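Because a call with `limit` above one hands back a list of items, a common next step is to boil that list down to something tabular. The sketch below does so on two mock items; it assumes each item carries an `id` and a `labels` list keyed by language code, as the package's print method does. The `mock_items` data and the `pick_label` helper are illustrative and not part of WikidataR.

```r
# Mock stand-ins for two returned items (illustrative values only).
mock_items <- list(
  list(id = "Q1", labels = list(en = list(value = "universe"),
                                de = list(value = "Universum"))),
  list(id = "Q2", labels = list(fr = list(value = "Terre")))
)

# Hypothetical helper: prefer the English label, fall back to the first
# label available, and give NA when an item has no labels at all.
pick_label <- function(item) {
  if (!is.null(item$labels[["en"]])) return(item$labels[["en"]]$value)
  if (length(item$labels) > 0) return(item$labels[[1]]$value)
  NA_character_
}

item_labels <- vapply(mock_items, pick_label, character(1))
item_ids <- vapply(mock_items, function(item) item$id, character(1))
summary_table <- data.frame(id = item_ids, label = item_labels,
                            stringsAsFactors = FALSE)
```

Using `[["en"]]` rather than `$en` avoids R's partial matching on list names, which could otherwise pick up a different language code that merely starts with "en".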
## Search
Wikidata's search functionality can also be used, either to find items or to find properties. All you need is
a search string (which is run over the names and descriptions of items or properties) and a language code
(since Wikidata's descriptions can be in many languages):
```{r, eval=FALSE}
#Find item - find defaults to "en" as a language.
aarons <- find_item("Aaron Halfaker")
#Find a property - also defaults to "en"
first_names <- find_property("first name")
```
The resulting search entries have the ID as a key, making it trivial to then retrieve the full corresponding
items or properties:
```{r, eval=FALSE}
#Find item.
all_aarons <- find_item("Aaron Halfaker")
#Grab the ID code for the first entry and retrieve the associated item data.
first_aaron <- get_item(all_aarons[[1]]$id)
```
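When a search returns several candidates, it can help to flatten them before choosing which one to fetch. The sketch below works on mock data shaped like search entries, assuming the `id`, `label` and optional `description` fields that the package's print method reads. The `mock_results` values and the `results_to_df` helper are hypothetical.

```r
# Mock search results (illustrative values; the second entry is a placeholder
# that deliberately lacks a description).
mock_results <- list(
  list(id = "Q42", label = "Douglas Adams", description = "English writer"),
  list(id = "Q0000", label = "Placeholder entry")
)

# Hypothetical helper: one row per result, NA where no description exists.
results_to_df <- function(results) {
  rows <- lapply(results, function(entry) {
    data.frame(
      id = entry$id,
      label = entry$label,
      description = if (is.null(entry$description)) NA_character_ else entry$description,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

results_df <- results_to_df(mock_results)
```

The `id` column can then be passed straight to get\_item, which accepts a vector of IDs.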
## Other and future functionality
If you have ideas for other types of useful Wikidata access, the best approach
is to either [request it](https://github.com/Ironholds/WikidataR/issues) or [add it](https://github.com/Ironholds/WikidataR/pulls)!
WikidataR/tests/testthat.R
library(testthat)
library(WikidataR)

test_check("WikidataR")

WikidataR/tests/testthat/test_search.R
context("Search functions")

test_that("English-language search works",{
  expect_true({find_item("Wonder Girls", "en");TRUE})
})

test_that("Non-English-language search works",{
  expect_true({find_item("Wonder Girls", "es");TRUE})
})

test_that("Search with limit modding works",{
  expect_that(length(find_item("Wonder Girls", "en", 3)), equals(3))
})

test_that("Property search works",{
  expect_true({find_property("Music", "en");TRUE})
})

WikidataR/tests/testthat/test_gets.R
context("Direct Wikidata get functions")

test_that("A specific item can be retrieved with an entire item code", {
  expect_true({get_item("Q100");TRUE})
})

test_that("A specific item can be retrieved with a partial item code", {
  expect_true({get_item("100");TRUE})
})

test_that("A specific property can be retrieved with an entire prop code + namespace", {
  expect_true({get_property("Property:P10");TRUE})
})

test_that("A specific property can be retrieved with an entire prop code", {
  expect_true({get_property("P10");TRUE})
})

test_that("A specific property can be retrieved with a partial prop code", {
  expect_true({get_property("10");TRUE})
})

test_that("A randomly-selected item can be retrieved",{
  expect_true({get_random_item();TRUE})
})

test_that("A randomly-selected property can be retrieved",{
  expect_true({get_random_property();TRUE})
})

WikidataR/tests/testthat/test_geo.R
testthat::context("Geographic queries")

testthat::test_that("Simple entity-based geo lookups work", {
  field_names <- c("item", "name", "latitutde", "longitude", "entity")
  sf_locations <- get_geo_entity("Q62")
  testthat::expect_true(is.data.frame(sf_locations))
  testthat::expect_true(all(field_names == names(sf_locations)))
  testthat::expect_true(unique(sf_locations$entity) == "Q62")
})

testthat::test_that("Language-variant entity-based geo lookups work", {
  field_names <- c("item", "name", "latitutde", "longitude", "entity")
  sf_locations <- get_geo_entity("Q62", language = "fr")
  testthat::expect_true(is.data.frame(sf_locations))
  testthat::expect_true(all(field_names == names(sf_locations)))
  testthat::expect_true(unique(sf_locations$entity) == "Q62")
})

testthat::test_that("Radius restricted entity-based geo lookups work", {
  field_names <- c("item", "name", "latitutde", "longitude", "entity")
  sf_locations <- get_geo_entity("Q62", radius = 1)
  testthat::expect_true(is.data.frame(sf_locations))
  testthat::expect_true(all(field_names == names(sf_locations)))
  testthat::expect_true(unique(sf_locations$entity) == "Q62")
})

testthat::test_that("multi-entity geo lookups work", {
  field_names <- c("item", "name", "latitutde", "longitude", "entity")
  sf_locations <- get_geo_entity(c("Q62", "Q64"), radius = 1)
  testthat::expect_true(is.data.frame(sf_locations))
  testthat::expect_true(all(field_names == names(sf_locations)))
  testthat::expect_equal(length(unique(sf_locations$entity)), 2)
})

testthat::test_that("Simple bounding lookups work", {
  field_names <- c("item", "name", "latitutde", "longitude")
  bruges_box <- get_geo_box("Q12988", "NorthEast", "Q184287", "SouthWest")
  testthat::expect_true(is.data.frame(bruges_box))
  testthat::expect_true(all(field_names == names(bruges_box)))
})

testthat::test_that("Language-variant bounding lookups work", {
  field_names <- c("item", "name", "latitutde", "longitude")
  bruges_box <- get_geo_box("Q12988", "NorthEast", "Q184287", "SouthWest",
                            language = "fr")
  testthat::expect_true(is.data.frame(bruges_box))
  testthat::expect_true(all(field_names == names(bruges_box)))
})

WikidataR/NAMESPACE
# Generated by roxygen2: do not edit by hand

S3method(print,find_item)
S3method(print,find_property)
S3method(print,wikidata)
export(extract_claims)
export(find_item)
export(find_property)
export(get_geo_box)
export(get_geo_entity)
export(get_item)
export(get_property)
export(get_random_item)
export(get_random_property)
importFrom(WikipediR,page_content)
importFrom(WikipediR,query)
importFrom(WikipediR,random_page)
importFrom(httr,user_agent)
importFrom(jsonlite,fromJSON)

WikidataR/NEWS
1.4.0
=================================================
* extract_claims() allows you to, well, extract claims.
* SPARQL syntax bug with some geo queries now fixed (thanks to Mikhail Popov)

1.3.0
=================================================
* get_* functions are now vectorised

1.2.0
=================================================
* geographic data for entities that exist relative to other Wikidata items can now be retrieved
  with get_geo_entity and get_geo_box, courtesy of Serena Signorelli's excellent
  QueryWikidataR package.
* A bug in printing returned objects is now fixed.

1.1.0
=================================================
* You can now retrieve multiple random properties or items with get_random_item and
  get_random_property

1.0.1
=================================================
* Various documentation and metadata improvements.

1.0.0
=================================================
* Fix a bug in get_* functions due to a parameter name mismatch
* Print methods added by Christian Graul

0.5.0
=================================================
* This is the initial release! See the explanatory vignettes.
WikidataR/R/utils.R
#Generic queryin' function for direct Wikidata calls. Wraps around WikipediR::page_content.
wd_query <- function(title, ...){
  result <- WikipediR::page_content(domain = "wikidata.org", page_name = title, as_wikitext = TRUE,
                                    httr::user_agent("WikidataR - https://github.com/Ironholds/WikidataR"),
                                    ...)
  output <- jsonlite::fromJSON(result$parse$wikitext[[1]])
  return(output)
}

#Query for a random item in "namespace" (ns). Essentially a wrapper around WikipediR::random_page.
wd_rand_query <- function(ns, limit, ...){
  result <- WikipediR::random_page(domain = "wikidata.org", as_wikitext = TRUE, namespaces = ns,
                                   httr::user_agent("WikidataR - https://github.com/Ironholds/WikidataR"),
                                   limit = limit, ...)
  output <- lapply(result, function(x){jsonlite::fromJSON(x$wikitext[[1]])})
  class(output) <- "wikidata"
  return(output)
}

#Generic input checker. Needs additional stuff for property-based querying
#because namespaces are weird, yo.
check_input <- function(input, substitution){
  in_fit <- grepl("^\\d+$",input)
  if(any(in_fit)){
    input[in_fit] <- paste0(substitution, input[in_fit])
  }
  return(input)
}

#Generic, direct access to Wikidata's search functionality.
searcher <- function(search_term, language, limit, type, ...){
  result <- WikipediR::query(url = "https://www.wikidata.org/w/api.php", out_class = "list",
                             clean_response = FALSE,
                             query_param = list(
                               action = "wbsearchentities",
                               type = type,
                               language = language,
                               limit = limit,
                               search = search_term
                             ),
                             ...)
  result <- result$search
  return(result)
}

sparql_query <- function(params, ...){
  result <- httr::GET("https://query.wikidata.org/bigdata/namespace/wdq/sparql",
                      query = list(query = params),
                      httr::user_agent("WikidataR - https://github.com/Ironholds/WikidataR"),
                      ...)
  httr::stop_for_status(result)
  return(httr::content(result, as = "parsed", type = "application/json"))
}

#'@title Extract Claims from Returned Item Data
#'@description extract claim information from data returned using
#'\code{\link{get_item}}.
#'
#'@param items a list of one or more Wikidata items returned with
#'\code{\link{get_item}}.
#'
#'@param claims a vector of claims (in the form "P321", "P12") to look for
#'and extract.
#'
#'@return a list containing one sub-list for each entry in \code{items},
#'and (below that) the found data for each claim. In the event a claim
#'cannot be found for an item, an \code{NA} will be returned
#'instead.
#'
#'@examples
#'# Get item data
#'adams_data <- get_item("42")
#'
#'# Get claim data
#'claims <- extract_claims(adams_data, "P31")
#'
#'@export
extract_claims <- function(items, claims){
  output <- lapply(items, function(x, claims){
    return(lapply(claims, function(claim, obj){
      which_match <- which(names(obj$claims) == claim)
      if(!length(which_match)){
        return(NA)
      }
      return(obj$claims[[which_match[1]]])
    }, obj = x))
  }, claims = claims)
  return(output)
}

WikidataR/R/geo.R
clean_geo <- function(results){
  do.call("rbind", lapply(results, function(item){
    point <- unlist(strsplit(gsub(x = item$coord$value, pattern = "(Point\\(|\\))", replacement = ""), " "))
    wd_id <- gsub(x = item$item$value, pattern = "http://www.wikidata.org/entity/", replacement = "", fixed = TRUE)
    return(data.frame(item = wd_id,
                      name = ifelse(item$name$value == wd_id, NA, item$name$value),
                      latitutde = as.numeric(point[1]),
                      longitude = as.numeric(point[2]),
                      stringsAsFactors = FALSE))
  }))
}

#'@title Retrieve geographic information from Wikidata
#'@description \code{get_geo_entity} retrieves the item ID, latitude
#'and longitude of any object with geographic data associated with \emph{another}
#'object with geographic data (example: all the locations around/near/associated with
#'a city).
#'
#'@param entity a Wikidata item (\code{Q...}) or series of items, to check
#'for associated geo-tagged items.
#'
#'@param language the two-letter language code to use for the name
#'of the item. "en" by default, because we're imperialist
#'anglocentric westerners.
#'
#'@param radius optionally, a radius (in kilometers) around \code{entity}
#'to restrict the search to.
#'
#'@param ... further arguments to pass to httr's GET.
#'
#'@return a data.frame of 5 columns:
#'\itemize{
#' \item{item}{ the Wikidata identifier of each object associated with
#' \code{entity}.}
#' \item{name}{ the name of the item, if available, in the requested language. If it
#' is not available, \code{NA} will be returned instead.}
#' \item{latitude}{ the latitude of \code{item}}
#' \item{longitude}{ the longitude of \code{item}}
#' \item{entity}{ the entity the item is associated with (necessary for multi-entity
#' queries).}
#'}
#'
#'@examples
#'# All entities
#'sf_locations <- get_geo_entity("Q62")
#'
#'# Entities with French, rather than English, names
#'sf_locations <- get_geo_entity("Q62", language = "fr")
#'
#'# Entities within 1km
#'sf_close_locations <- get_geo_entity("Q62", radius = 1)
#'
#'# Multiple entities
#'multi_entity <- get_geo_entity(entity = c("Q62", "Q64"))
#'
#'@seealso \code{\link{get_geo_box}} for using a bounding box
#'rather than an unrestricted search or simple radius.
#'
#'@export
get_geo_entity <- function(entity, language = "en", radius = NULL, ...){

  entity <- check_input(entity, "Q")

  if(is.null(radius)){
    query <- paste0("SELECT DISTINCT ?item ?name ?coord ?propertyLabel WHERE { ",
                    "?item wdt:P131* wd:", entity, ". ",
                    "?item wdt:P625 ?coord . ",
                    "SERVICE wikibase:label { ",
                    "bd:serviceParam wikibase:language \"", language, "\" . ",
                    "?item rdfs:label ?name } ",
                    "} ORDER BY ASC (?name)")
  } else {
    query <- paste0("SELECT ?item ?name ?coord WHERE { ",
                    "wd:", entity, " wdt:P625 ?mainLoc . ",
                    "SERVICE wikibase:around { ",
                    "?item wdt:P625 ?coord . ",
                    "bd:serviceParam wikibase:center ?mainLoc . ",
                    "bd:serviceParam wikibase:radius \"", radius, "\" . ",
                    "} SERVICE wikibase:label { ",
                    "bd:serviceParam wikibase:language \"", language, "\" . ",
                    "?item rdfs:label ?name } ",
                    "} ORDER BY ASC (?name)")
  }

  # One query per entity; extra arguments go through MoreArgs so that they are
  # handed to every call whole rather than being iterated over by mapply.
  if(length(query) > 1){
    return(do.call("rbind", mapply(function(query, entity, ...){
      output <- clean_geo(sparql_query(query, ...)$results$bindings)
      output$entity <- entity
      return(output)
    }, query = query, entity = entity, MoreArgs = list(...), SIMPLIFY = FALSE)))
  }
  output <- clean_geo(sparql_query(query, ...)$results$bindings)
  output$entity <- entity
  return(output)
}

#'@title Get geographic entities based on a bounding box
#'@description \code{get_geo_box} retrieves all geographic entities in
#'Wikidata that fall within a bounding box defined by two existing items
#'with geographic attributes (usually cities).
#'
#'@param first_city_code a Wikidata item, or series of items, to use for
#'one corner of the bounding box.
#'
#'@param first_corner the corner of the bounding box that \code{first_city_code}
#'marks (eg "NorthWest", "SouthEast").
#'
#'@param second_city_code a Wikidata item, or series of items, to use for
#'the opposite corner of the bounding box.
#'
#'@param second_corner the corner of the bounding box that \code{second_city_code}
#'marks (eg "NorthWest", "SouthEast").
#'
#'@param language the two-letter language code to use for the name
#'of the item. "en" by default.
#'
#'@param ... further arguments to pass to httr's GET.
#'
#'@return a data.frame of 4 columns:
#'\itemize{
#' \item{item}{ the Wikidata identifier of each object inside the
#' bounding box.}
#' \item{name}{ the name of the item, if available, in the requested language. If it
#' is not available, \code{NA} will be returned instead.}
#' \item{latitude}{ the latitude of \code{item}}
#' \item{longitude}{ the longitude of \code{item}}
#'}
#'
#'@examples
#'# Simple bounding box
#'bruges_box <- WikidataR:::get_geo_box("Q12988", "NorthEast", "Q184287", "SouthWest")
#'
#'# Custom language
#'bruges_box_fr <- WikidataR:::get_geo_box("Q12988", "NorthEast", "Q184287", "SouthWest",
#'                                         language = "fr")
#'
#'@seealso \code{\link{get_geo_entity}} for using an unrestricted search or simple radius,
#'rather than a bounding box.
#'
#'@export
get_geo_box <- function(first_city_code, first_corner, second_city_code, second_corner,
                        language = "en", ...){

  # Input checks
  first_city_code <- check_input(first_city_code, "Q")
  second_city_code <- check_input(second_city_code, "Q")

  # Construct query
  query <- paste0("SELECT ?item ?name ?coord WHERE { ",
                  "wd:", first_city_code, " wdt:P625 ?Firstloc . ",
                  "wd:", second_city_code, " wdt:P625 ?Secondloc . ",
                  "SERVICE wikibase:box { ",
                  "?item wdt:P625 ?coord . ",
                  "bd:serviceParam wikibase:corner", first_corner, " ?Firstloc . ",
                  "bd:serviceParam wikibase:corner", second_corner, " ?Secondloc . ",
                  "} SERVICE wikibase:label { ",
                  "bd:serviceParam wikibase:language \"", language, "\" . ",
                  "?item rdfs:label ?name } ",
                  "}ORDER BY ASC (?name)")

  # Vectorise if necessary, or not if not!
  # Extra arguments go through MoreArgs so each call receives them whole.
  if(length(query) > 1){
    return(do.call("rbind", mapply(function(query, ...){
      output <- clean_geo(sparql_query(query, ...)$results$bindings)
      return(output)
    }, query = query, MoreArgs = list(...), SIMPLIFY = FALSE)))
  }
  output <- clean_geo(sparql_query(query, ...)$results$bindings)
  return(output)
}

WikidataR/R/WikidataR.R
#' @title API client library for Wikidata
#' @description This package serves as an API client for \href{https://www.wikidata.org}{Wikidata}.
#' See the accompanying vignette for more details.
#'
#' @name WikidataR
#' @docType package
#'@seealso \code{\link{get_random}} for selecting a random item or property,
#'\code{\link{get_item}} for a /specific/ item or property, or \code{\link{find_item}}
#'for using search functionality to pull out item or property IDs where the descriptions
#'or aliases match a particular search term.
#' @importFrom WikipediR page_content random_page query
#' @importFrom httr user_agent
#' @importFrom jsonlite fromJSON
#' @aliases WikidataR WikidataR-package
NULL

WikidataR/R/prints.R
#'@title Print method for find_item
#'
#'@description print found items.
#'
#'@param x find_item object with search results
#'@param \dots Arguments to be passed to methods
#'
#'@method print find_item
#'@export
print.find_item <- function(x, ...) {
  cat("\n\tWikidata item search\n\n")

  # number of results
  num_results <- length(x)
  cat("Number of results:\t", num_results, "\n\n")

  # results
  if(num_results > 0) {
    cat("Results:\n")
    for(i in seq_len(num_results)) {
      if(is.null(x[[i]]$description)){
        desc <- "\n"
      } else {
        desc <- paste("-", x[[i]]$description, "\n")
      }
      cat(i, "\t", x[[i]]$label, paste0("(", x[[i]]$id, ")"), desc)
    }
  }
}

#'@title Print method for find_property
#'
#'@description print found properties.
#'
#'@param x find_property object with search results
#'@param \dots Arguments to be passed to methods
#'
#'@method print find_property
#'@export
print.find_property <- function(x, ...) {
  cat("\n\tWikidata property search\n\n")

  # number of results
  num_results <- length(x)
  cat("Number of results:\t", num_results, "\n\n")

  # results
  if(num_results > 0) {
    cat("Results:\n")
    for(i in seq_len(num_results)) {
      if(is.null(x[[i]]$description)){
        desc <- "\n"
      } else {
        desc <- paste("-", x[[i]]$description, "\n")
      }
      cat(i, "\t", x[[i]]$label, paste0("(", x[[i]]$id, ")"), desc)
    }
  }
}

wd_print_base <- function(x, ...){
  cat("\n\tWikidata", x$type, x$id, "\n\n")

  # labels
  num.labels <- length(x$labels)
  if(num.labels>0) {
    lbl <- x$labels[[1]]$value
    if(num.labels==1){
      cat("Label:\t\t", lbl, "\n")
    } else {
      if(!is.null(x$labels$en)) lbl <- x$labels$en$value
      cat("Label:\t\t", lbl, paste0("\t[", num.labels-1, " other languages available]\n"))
    }
  }

  # aliases
  num_aliases <- length(x$aliases)
  if(num_aliases > 0) {
    al <- unique(unlist(lapply(x$aliases, function(xl){return(xl$value)})))
    cat("Aliases:\t", paste(al, collapse = ", "), "\n")
  }

  # descriptions
  num_desc <- length(x$descriptions)
  if(num_desc > 0) {
    desc <- x$descriptions[[1]]$value
    if(num_desc == 1){
      cat("Description:", desc, "\n")
    } else {
      if(!is.null(x$descriptions$en)){
        desc <- x$descriptions$en$value
      }
      cat("Description:", desc, paste0("\t[", (num_desc - 1), " other languages available]\n"))
    }
  }

  # num claims
  num_claims <- length(x$claims)
  if(num_claims > 0){
    cat("Claims:\t\t", num_claims, "\n")
  }

  # num sitelinks
  num_links <- length(x$sitelinks)
  if(num_links > 0){
    cat("Sitelinks:\t", num_links, "\n")
  }
}

#'@title Print method for Wikidata objects
#'
#'@description print found objects generally.
#'
#'@param x wikidata object from get_item, get_random_item, get_property or get_random_property
#'@param \dots Arguments to be passed to methods
#'@seealso get_item, get_random_item, get_property or get_random_property
#'@method print wikidata
#'@export
print.wikidata <- function(x, ...){
  lapply(x, wd_print_base, ...)
  return(invisible())
}

WikidataR/R/gets.R
#'@title Retrieve specific Wikidata items or properties
#'@description \code{get_item} and \code{get_property} allow you to retrieve the data associated
#'with individual Wikidata items and properties, respectively. As with
#'other \code{WikidataR} code, custom print methods are available; use \code{\link{str}}
#'to manipulate and see the underlying structure of the data.
#'
#'@param id the ID number(s) of the item or property you're looking for. This can be in
#'various formats; either a numeric value ("200"), the full name ("Q200") or
#'even with an included namespace ("Property:P10") - the function will format
#'it appropriately. This function is vectorised and will happily accept
#'multiple IDs.
#'
#'@param ... further arguments to pass to httr's GET.
#'
#'@seealso \code{\link{get_random}} for selecting a random item or property,
#'or \code{\link{find_item}} for using search functionality to pull out
#'item or property IDs where the descriptions or aliases match a particular
#'search term.
#'
#'@examples
#'
#'#Retrieve a specific item
#'adams_metadata <- get_item("42")
#'
#'#Retrieve a specific property
#'object_is_child <- get_property("P40")
#'
#'@aliases get_item get_property
#'@rdname get_item
#'@export
get_item <- function(id, ...){
  id <- check_input(id, "Q")
  output <- (lapply(id, wd_query, ...))
  class(output) <- "wikidata"
  return(output)
}

#'@rdname get_item
#'@export
get_property <- function(id, ...){
  has_grep <- grepl("^P(?!r)",id, perl = TRUE)
  id[has_grep] <- paste0("Property:", id[has_grep])
  id <- check_input(id, "Property:P")
  output <- (lapply(id, wd_query, ...))
  class(output) <- "wikidata"
  return(output)
}

#'@title Retrieve randomly-selected Wikidata items or properties
#'@description \code{get_random_item} and \code{get_random_property} allow you to retrieve the data
#'associated with randomly-selected Wikidata items and properties, respectively. As with
#'other \code{WikidataR} code, custom print methods are available; use \code{\link{str}}
#'to manipulate and see the underlying structure of the data.
#'
#'@param limit how many random items to return. 1 by default, but can be higher.
#'
#'@param ... arguments to pass to httr's GET.
#'
#'@seealso \code{\link{get_item}} for selecting a specific item or property,
#'or \code{\link{find_item}} for using search functionality to pull out
#'item or property IDs where the descriptions or aliases match a particular
#'search term.
#'
#'@examples
#'
#'#Random item
#'random_item <- get_random_item()
#'
#'#Random property
#'random_property <- get_random_property()
#'
#'@aliases get_random get_random_item get_random_property
#'@rdname get_random
#'@export
get_random_item <- function(limit = 1, ...){
  return(wd_rand_query(ns = 0, limit = limit, ...))
}

#'@rdname get_random
#'@export
get_random_property <- function(limit = 1, ...){
  return(wd_rand_query(ns = 120, limit = limit, ...))
}

#'@title Search for Wikidata items or properties that match a search term
#'@description \code{find_item} and \code{find_property} allow you to retrieve a set
#'of Wikidata items or properties where the aliases or descriptions match a particular
#'search term. As with other \code{WikidataR} code, custom print methods are available;
#'use \code{\link{str}} to manipulate and see the underlying structure of the data.
#'
#'@param search_term a term to search for.
#'
#'@param language the language to return the labels and descriptions in; this should
#'consist of an ISO language code. Set to "en" by default.
#'
#'@param limit the number of results to return; set to 10 by default.
#'
#'@param ... further arguments to pass to httr's GET.
#'
#'@seealso \code{\link{get_random}} for selecting a random item or property,
#'or \code{\link{get_item}} for selecting a specific item or property.
#'
#'@examples
#'
#'#Check for entries relating to Douglas Adams in some way
#'adams_items <- find_item("Douglas Adams")
#'
#'#Check for properties involving the peerage
#'peerage_props <- find_property("peerage")
#'
#'@aliases find_item find_property
#'@rdname find_item
#'@export
find_item <- function(search_term, language = "en", limit = 10, ...){
  # Forward any extra arguments on to the underlying request.
  res <- searcher(search_term, language, limit, "item", ...)
  class(res) <- "find_item"
  return(res)
}

#'@rdname find_item
#'@export
find_property <- function(search_term, language = "en", limit = 10, ...){
  # Forward any extra arguments on to the underlying request.
  res <- searcher(search_term, language, limit, "property", ...)
  class(res) <- "find_property"
  return(res)
}
WikidataR/README.md
WikidataR
=========
An R API wrapper for the Wikidata store of semantic data.
__Author:__ Oliver Keyes, Serena Signorelli & Christian Graul