WikidataR/0000755000176200001440000000000013161121354012126 5ustar liggesusersWikidataR/inst/0000755000176200001440000000000013161072003013077 5ustar liggesusersWikidataR/inst/doc/0000755000176200001440000000000013161072003013644 5ustar liggesusersWikidataR/inst/doc/Introduction.Rmd0000644000176200001440000000676413106776057017031 0ustar liggesusers # WikidataR: the API client library for Wikidata Wikidata is a wonderful and irreplaceable resource for linked data, containing information on pretty much any subject. If there's a Wikipedia article on it, there's almost certainly a Wikidata item for it. WikidataR - following the naming scheme of [WikipediR](https://github.com/Ironholds/WikipediR#thanks-and-misc) - is an API client library for Wikidata, written in and accessible from R. ## Items and properties The two basic component pieces of Wikidata are "items" and "properties". An "item" is a thing - a concept, object or topic that exists in the real world, such as "Rush". These items each have statements associated with them - for example, "Rush is an instance of: Rock Band". In that statement, "Rock Band" is a property: a class or trait that items can hold. Wikidata items are organised as descriptors of the item, in various languages, and references to the properties that that item holds. ## Retrieving specific items or properties Items and properties are both identified by numeric IDs, prefaced with "Q" in the case of items, and "P" in the case of properties. WikipediR can be used to retrieve items or properties with specific ID numbers, using the get\_item and get\_property functions: ```{r, eval=FALSE} #Retrieve an item item <- get_item(id = 1) #Get information about the property of the first claim it has. first_claim <- get_property(id = names(item$claims)[1]) #Do we succeed? Dewey! ``` These functions are capable of accepting various forms for the ID, including (as examples), "Q100" or "100" for items, and "Property:P100", "P100" or "100" for properties. 
They're also vectorised - pass them as many IDs as you want!

## Retrieving randomly-selected items or properties
As well as retrieving specific items or properties, Wikidata's API also allows for the retrieval of *random* elements. With WikidataR, this can be achieved through:

```{r, eval=FALSE}
#Retrieve a random item
rand_item <- get_random_item()

#Retrieve a random property
rand_prop <- get_random_property()
```

These also allow you to retrieve *sets* of random elements - not just one at a time, but say, 50 at a time - by including the "limit" argument:

```{r, eval=FALSE}
#Retrieve 42 random items
rand_item <- get_random_item(limit = 42)

#Retrieve 42 random properties
rand_prop <- get_random_property(limit = 42)
```

## Search
Wikidata's search functionality can also be used, either to find items or to find properties. All you need is a search string (which is run over the names and descriptions of items or properties) and a language code (since Wikidata's descriptions can be in many languages):

```{r, eval=FALSE}
#Find item - find defaults to "en" as a language.
aarons <- find_item("Aaron Halfaker")

#Find a property - also defaults to "en"
first_names <- find_property("first name")
```

The resulting search entries have the ID as a key, making it trivial to then retrieve the full corresponding items or properties:

```{r, eval=FALSE}
#Find item.
all_aarons <- find_item("Aaron Halfaker")

#Grab the ID code for the first entry and retrieve the associated item data.
first_aaron <- get_item(all_aarons[[1]]$id)
```

## Other and future functionality
If you have ideas for other types of useful Wikidata access, the best approach is to either [request it](https://github.com/Ironholds/WikidataR/issues) or [add it](https://github.com/Ironholds/WikidataR/pulls)!
WikidataR/inst/doc/Introduction.R0000644000176200001440000000235513161072003016455 0ustar liggesusers## ---- eval=FALSE--------------------------------------------------------- # #Retrieve an item # item <- get_item(id = 1) # # #Get information about the property of the first claim it has. # first_claim <- get_property(id = names(item$claims)[1]) # #Do we succeed? Dewey! ## ---- eval=FALSE--------------------------------------------------------- # #Retrieve a random item # rand_item <- get_random_item() # # #Retrieve a random property # rand_prop <- get_random_property() ## ---- eval=FALSE--------------------------------------------------------- # #Retrieve 42 random items # rand_item <- get_random_item(limit = 42) # # #Retrieve 42 random properties # rand_prop <- get_random_property(limit = 42) ## ---- eval=FALSE--------------------------------------------------------- # #Find item - find defaults to "en" as a language. # aarons <- find_item("Aaron Halfaker") # # #Find a property - also defaults to "en" # first_names <- find_property("first name") ## ---- eval=FALSE--------------------------------------------------------- # #Find item. # all_aarons <- find_item("Aaron Halfaker") # # #Grab the ID code for the first entry and retrieve the associated item data. # first_aaron <- get_item(all_aarons[[1]]$id) WikidataR/inst/doc/Introduction.html0000644000176200001440000004171513161072003017223 0ustar liggesusers WikidataR: the API client library for Wikidata

WikidataR: the API client library for Wikidata

Wikidata is a wonderful and irreplaceable resource for linked data, containing information on pretty much any subject. If there's a Wikipedia article on it, there's almost certainly a Wikidata item for it.

WikidataR - following the naming scheme of WikipediR - is an API client library for Wikidata, written in and accessible from R.

Items and properties

The two basic component pieces of Wikidata are “items” and “properties”. An “item” is a thing - a concept, object or topic that exists in the real world, such as “Rush”. These items each have statements associated with them - for example, “Rush is an instance of: Rock Band”. In that statement, “instance of” is the property: a kind of relationship or trait that items can hold, here taking the item “Rock Band” as its value. Wikidata items are organised as descriptors of the item, in various languages, and claims that reference the properties that item holds.
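
As a rough sketch of that organisation, the following builds a hand-made stand-in for a single item as a nested R list. The field names (id, type, labels, descriptions, claims) follow what the package's print methods read from a retrieved item; `mock_item` and all of its values are invented placeholders for illustration, not data from Wikidata.

```r
# A hand-built stand-in for a single Wikidata item. The values are
# placeholders for illustration, not data retrieved from Wikidata.
mock_item <- list(
  id = "Q0",
  type = "item",
  labels = list(en = list(language = "en", value = "Example band")),
  descriptions = list(en = list(language = "en", value = "a placeholder description")),
  claims = list(P31 = list(), P136 = list())
)

# Descriptors are keyed by language code.
mock_item$labels$en$value

# Claims are keyed by the ID of the property they refer to.
names(mock_item$claims)
```

Keying the claims by property ID is what makes it possible to look up a property directly from the names of an item's claims.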

Retrieving specific items or properties

Items and properties are both identified by numeric IDs, prefaced with “Q” in the case of items, and “P” in the case of properties. WikidataR can be used to retrieve items or properties with specific ID numbers, using the get_item and get_property functions:

#Retrieve an item 
item <- get_item(id = 1)

#Get information about the property of the first claim it has.
#get_item returns a list of items, so look inside the first element.
first_claim <- get_property(id = names(item[[1]]$claims)[1])
#Do we succeed? Dewey!

These functions are capable of accepting various forms for the ID, including (as examples), “Q100” or “100” for items, and “Property:P100”, “P100” or “100” for properties. They're also vectorised - pass them as many IDs as you want!
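
To make the ID handling concrete, here is a minimal sketch of the idea, not the package's own code. `normalise_id` is a hypothetical helper written for this illustration: it assumes that a bare number should gain the given prefix and that anything else passes through unchanged.

```r
# Hypothetical helper, written for illustration only: bare numbers gain
# the prefix, anything else is passed through unchanged.
normalise_id <- function(input, prefix) {
  is_bare_number <- grepl("^[0-9]+$", input)
  input[is_bare_number] <- paste0(prefix, input[is_bare_number])
  input
}

# Both forms end up as the same item ID.
normalise_id(c("100", "Q100"), prefix = "Q")

# Because the input is a vector, several IDs can be handled in one call.
normalise_id(c("1", "2", "Q3"), prefix = "Q")
```

Working on whole vectors rather than single strings is what lets one call cover many IDs at once.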

Retrieving randomly-selected items or properties

As well as retrieving specific items or properties, Wikidata's API also allows for the retrieval of random elements. With WikidataR, this can be achieved through:

#Retrieve a random item
rand_item <- get_random_item()

#Retrieve a random property
rand_prop <- get_random_property()

These also allow you to retrieve sets of random elements - not just one at a time, but say, 50 at a time - by including the “limit” argument:

#Retrieve 42 random items
rand_item <- get_random_item(limit = 42)

#Retrieve 42 random properties
rand_prop <- get_random_property(limit = 42)

Search

Wikidata's search functionality can also be used, either to find items or to find properties. All you need is a search string (which is run over the names and descriptions of items or properties) and a language code (since Wikidata's descriptions can be in many languages):

#Find item - find defaults to "en" as a language.
aarons <- find_item("Aaron Halfaker")

#Find a property - also defaults to "en"
first_names <- find_property("first name")

The resulting search entries have the ID as a key, making it trivial to then retrieve the full corresponding items or properties:

#Find item.
all_aarons <- find_item("Aaron Halfaker")

#Grab the ID code for the first entry and retrieve the associated item data.
first_aaron <- get_item(all_aarons[[1]]$id)
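
Since the get functions are vectorised, the same pattern extends to every search result at once. The sketch below runs offline on `mock_results`, an invented stand-in for a search result; it assumes each entry carries an "id", "label" and "description", and the IDs shown are placeholders.

```r
# Invented stand-in for a search result: a list of entries, each with an
# "id", "label" and "description". The IDs here are placeholders.
mock_results <- list(
  list(id = "Q0001", label = "First match", description = "placeholder"),
  list(id = "Q0002", label = "Second match", description = "placeholder")
)

# Pull the ID out of every entry; vapply guarantees a character vector.
all_ids <- vapply(mock_results, function(entry) entry$id, character(1))

# With a live connection, the vectorised getter could then fetch them all:
# all_items <- get_item(all_ids)
```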

Other and future functionality

If you have ideas for other types of useful Wikidata access, the best approach is to either request it or add it!

WikidataR/tests/0000755000176200001440000000000013106773114013276 5ustar liggesusersWikidataR/tests/testthat.R0000644000176200001440000000007613106773114015264 0ustar liggesuserslibrary(testthat) library(WikidataR) test_check("WikidataR") WikidataR/tests/testthat/0000755000176200001440000000000013161121354015130 5ustar liggesusersWikidataR/tests/testthat/test_search.R0000644000176200001440000000067513106773114017575 0ustar liggesuserscontext("Search functions") test_that("English-language search works",{ expect_true({find_item("Wonder Girls", "en");TRUE}) }) test_that("Non-English-language search works",{ expect_true({find_item("Wonder Girls", "es");TRUE}) }) test_that("Search with limit modding works",{ expect_that(length(find_item("Wonder Girls", "en", 3)), equals(3)) }) test_that("Property search works",{ expect_true({find_property("Music", "en");TRUE}) })WikidataR/tests/testthat/test_gets.R0000644000176200001440000000157013106773114017265 0ustar liggesuserscontext("Direct Wikidata get functions") test_that("A specific item can be retrieved with an entire item code", { expect_true({get_item("Q100");TRUE}) }) test_that("A specific item can be retrieved with a partial entire item code", { expect_true({get_item("100");TRUE}) }) test_that("A specific property can be retrieved with an entire prop code + namespace", { expect_true({get_property("Property:P10");TRUE}) }) test_that("A specific property can be retrieved with an entire prop code + namespace", { expect_true({get_property("P10");TRUE}) }) test_that("A specific property can be retrieved with a partial prop code", { expect_true({get_property("10");TRUE}) }) test_that("A randomly-selected item can be retrieved",{ expect_true({get_random_item();TRUE}) }) test_that("A randomly-selected property can be retriveed",{ expect_true({get_random_property();TRUE}) })WikidataR/tests/testthat/test_geo.R0000644000176200001440000000423613106773114017077 0ustar liggesuserstestthat::context("Geographic queries") 
testthat::test_that("Simple entity-based geo lookups work", { field_names <- c("item", "name", "latitutde", "longitude", "entity") sf_locations <- get_geo_entity("Q62") testthat::expect_true(is.data.frame(sf_locations)) testthat::expect_true(all(field_names == names(sf_locations))) testthat::expect_true(unique(sf_locations$entity) == "Q62") }) testthat::test_that("Language-variant entity-based geo lookups work", { field_names <- c("item", "name", "latitutde", "longitude", "entity") sf_locations <- get_geo_entity("Q62", language = "fr") testthat::expect_true(is.data.frame(sf_locations)) testthat::expect_true(all(field_names == names(sf_locations))) testthat::expect_true(unique(sf_locations$entity) == "Q62") }) testthat::test_that("Radius restricted entity-based geo lookups work", { field_names <- c("item", "name", "latitutde", "longitude", "entity") sf_locations <- get_geo_entity("Q62", radius = 1) testthat::expect_true(is.data.frame(sf_locations)) testthat::expect_true(all(field_names == names(sf_locations))) testthat::expect_true(unique(sf_locations$entity) == "Q62") }) testthat::test_that("multi-entity geo lookups work", { field_names <- c("item", "name", "latitutde", "longitude", "entity") sf_locations <- get_geo_entity(c("Q62", "Q64"), radius = 1) testthat::expect_true(is.data.frame(sf_locations)) testthat::expect_true(all(field_names == names(sf_locations))) testthat::expect_equal(length(unique(sf_locations$entity)), 2) }) testthat::test_that("Simple bounding lookups work", { field_names <- c("item", "name", "latitutde", "longitude") bruges_box <- get_geo_box("Q12988", "NorthEast", "Q184287", "SouthWest") testthat::expect_true(is.data.frame(bruges_box)) testthat::expect_true(all(field_names == names(bruges_box))) }) testthat::test_that("Language-variant bounding lookups work", { field_names <- c("item", "name", "latitutde", "longitude") bruges_box <- get_geo_box("Q12988", "NorthEast", "Q184287", "SouthWest", language = "fr") 
testthat::expect_true(is.data.frame(bruges_box)) testthat::expect_true(all(field_names == names(bruges_box))) })WikidataR/NAMESPACE0000644000176200001440000000073613106777553013374 0ustar liggesusers# Generated by roxygen2: do not edit by hand S3method(print,find_item) S3method(print,find_property) S3method(print,wikidata) export(extract_claims) export(find_item) export(find_property) export(get_geo_box) export(get_geo_entity) export(get_item) export(get_property) export(get_random_item) export(get_random_property) importFrom(WikipediR,page_content) importFrom(WikipediR,query) importFrom(WikipediR,random_page) importFrom(httr,user_agent) importFrom(jsonlite,fromJSON) WikidataR/NEWS0000644000176200001440000000220013161071406012621 0ustar liggesusers1.4.0 ================================================= * extract_claims() allows you to, well, extract claims. * SPARQL syntax bug with some geo queries now fixed (thanks to Mikhail Popov) 1.3.0 ================================================= * get_* functions are now vectorised 1.2.0 ================================================= * geographic data for entities that exist relative to other Wikidata items can now be retrieved with get_geo_entity and get_geo_box, courtesy of excellent Serena Signorelli's excellent QueryWikidataR package. * A bug in printing returned objects is now fixed. 1.1.0 ================================================= * You can now retrieve multiple random properties or items with get_random_item and get_random_property 1.0.1 ================================================= * Various documentation and metadata improvements. 1.0.0 ================================================= * Fix a bug in get_* functions due to a parameter name mismatch * Print methods added by Christian Graul 0.5.0 ================================================= * This is the initial release! See the explanatory vignettes. 
WikidataR/R/0000755000176200001440000000000013161071155012332 5ustar liggesusersWikidataR/R/utils.R0000644000176200001440000000647713106777613013646 0ustar liggesusers#Generic queryin' function for direct Wikidata calls. Wraps around WikipediR::page_content. wd_query <- function(title, ...){ result <- WikipediR::page_content(domain = "wikidata.org", page_name = title, as_wikitext = TRUE, httr::user_agent("WikidataR - https://github.com/Ironholds/WikidataR"), ...) output <- jsonlite::fromJSON(result$parse$wikitext[[1]]) return(output) } #Query for a random item in "namespace" (ns). Essentially a wrapper around WikipediR::random_page. wd_rand_query <- function(ns, limit, ...){ result <- WikipediR::random_page(domain = "wikidata.org", as_wikitext = TRUE, namespaces = ns, httr::user_agent("WikidataR - https://github.com/Ironholds/WikidataR"), limit = limit, ...) output <- lapply(result, function(x){jsonlite::fromJSON(x$wikitext[[1]])}) class(output) <- "wikidata" return(output) } #Generic input checker. Needs additional stuff for property-based querying #because namespaces are weird, yo. check_input <- function(input, substitution){ in_fit <- grepl("^\\d+$",input) if(any(in_fit)){ input[in_fit] <- paste0(substitution, input[in_fit]) } return(input) } #Generic, direct access to Wikidata's search functionality. searcher <- function(search_term, language, limit, type, ...){ result <- WikipediR::query(url = "https://www.wikidata.org/w/api.php", out_class = "list", clean_response = FALSE, query_param = list( action = "wbsearchentities", type = type, language = language, limit = limit, search = search_term ), ...) result <- result$search return(result) } sparql_query <- function(params, ...){ result <- httr::GET("https://query.wikidata.org/bigdata/namespace/wdq/sparql", query = list(query = params), httr::user_agent("WikidataR - https://github.com/Ironholds/WikidataR"), ...) 
httr::stop_for_status(result) return(httr::content(result, as = "parsed", type = "application/json")) } #'@title Extract Claims from Returned Item Data #'@description extract claim information from data returned using #'\code{\link{get_item}}. #' #'@param items a list of one or more Wikidata items returned with #'\code{\link{get_item}}. #' #'@param claims a vector of claims (in the form "P321", "P12") to look for #'and extract. #' #'@return a list containing one sub-list for each entry in \code{items}, #'and (below that) the found data for each claim. In the event a claim #'cannot be found for an item, an \code{NA} will be returned #'instead. #' #'@examples #'# Get item data #'adams_data <- get_item("42") #' #'# Get claim data #'claims <- extract_claims(adams_data, "P31") #' #'@export extract_claims <- function(items, claims){ output <- lapply(items, function(x, claims){ return(lapply(claims, function(claim, obj){ which_match <- which(names(obj$claims) == claim) if(!length(which_match)){ return(NA) } return(obj$claims[[which_match[1]]]) }, obj = x)) }, claims = claims) return(output) } WikidataR/R/geo.R0000644000176200001440000001573113161071155013236 0ustar liggesusersclean_geo <- function(results){ do.call("rbind", lapply(results, function(item){ point <- unlist(strsplit(gsub(x = item$coord$value, pattern = "(Point\\(|\\))", replacement = ""), " ")) wd_id <- gsub(x = item$item$value, pattern = "http://www.wikidata.org/entity/", replacement = "", fixed = TRUE) return(data.frame(item = wd_id, name = ifelse(item$name$value == wd_id, NA, item$name$value), latitutde = as.numeric(point[1]), longitude = as.numeric(point[2]), stringsAsFactors = FALSE)) })) } #'@title Retrieve geographic information from Wikidata #'@description \code{get_geo_entity} retrieves the item ID, latitude #'and longitude of any object with geographic data associated with \emph{another} #'object with geographic data (example: all the locations around/near/associated with #'a city). 
#' #'@param entity a Wikidata item (\code{Q...}) or series of items, to check #'for associated geo-tagged items. #' #'@param language the two-letter language code to use for the name #'of the item. "en" by default, because we're imperialist #'anglocentric westerners. #' #'@param radius optionally, a radius (in kilometers) around \code{entity} #'to restrict the search to. #' #'@param ... further arguments to pass to httr's GET. #' #'@return a data.frame of 5 columns: #'\itemize{ #' \item{item}{ the Wikidata identifier of each object associated with #' \code{entity}.} #' \item{name}{ the name of the item, if available, in the requested language. If it #' is not available, \code{NA} will be returned instead.} #' \item{latitude}{ the latitude of \code{item}} #' \item{longitude}{ the longitude of \code{item}} #' \item{entity}{ the entity the item is associated with (necessary for multi-entity #' queries).} #'} #' #'@examples #'# All entities #'sf_locations <- get_geo_entity("Q62") #' #'# Entities with French, rather than English, names #'sf_locations <- get_geo_entity("Q62", language = "fr") #' #'# Entities within 1km #'sf_close_locations <- get_geo_entity("Q62", radius = 1) #' #'# Multiple entities #'multi_entity <- get_geo_entity(entity = c("Q62", "Q64")) #' #'@seealso \code{\link{get_geo_box}} for using a bounding box #'rather than an unrestricted search or simple radius. #' #'@export get_geo_entity <- function(entity, language = "en", radius = NULL, ...){ entity <- check_input(entity, "Q") if(is.null(radius)){ query <- paste0("SELECT DISTINCT ?item ?name ?coord ?propertyLabel WHERE { ?item wdt:P131* wd:", entity, ". ?item wdt:P625 ?coord . SERVICE wikibase:label { bd:serviceParam wikibase:language \"", language, "\" . ?item rdfs:label ?name } } ORDER BY ASC (?name)") } else { query <- paste0("SELECT ?item ?name ?coord WHERE { wd:", entity, " wdt:P625 ?mainLoc . SERVICE wikibase:around { ?item wdt:P625 ?coord . bd:serviceParam wikibase:center ?mainLoc . 
bd:serviceParam wikibase:radius \"", radius, "\" . } SERVICE wikibase:label { bd:serviceParam wikibase:language \"", language, "\" . ?item rdfs:label ?name } } ORDER BY ASC (?name)") } if(length(query) > 1){ return(do.call("rbind", mapply(function(query, entity, ...){ output <- clean_geo(sparql_query(query, ...)$results$bindings) output$entity <- entity return(output) }, query = query, entity = entity, ..., SIMPLIFY = FALSE))) } output <- clean_geo(sparql_query(query)$results$bindings) output$entity <- entity return(output) } #'@title Get geographic entities based on a bounding box #'@description \code{get_geo_box} retrieves all geographic entities in #'Wikidata that fall between a bounding box between two existing items #'with geographic attributes (usually cities). #' #'@param first_city_code a Wikidata item, or series of items, to use for #'one corner of the bounding box. #' #'@param first_corner the direction of \code{first_city_code} relative #'to \code{city} (eg "NorthWest", "SouthEast"). #' #'@param second_city_code a Wikidata item, or series of items, to use for #'one corner of the bounding box. #' #'@param second_corner the direction of \code{second_city_code} relative #'to \code{city} (eg "NorthWest", "SouthEast"). #' #'@param language the two-letter language code to use for the name #'of the item. "en" by default. #' #'@param ... further arguments to pass to httr's GET. #' #'@return a data.frame of 5 columns: #'\itemize{ #' \item{item}{ the Wikidata identifier of each object associated with #' \code{entity}.} #' \item{name}{ the name of the item, if available, in the requested language. 
If it #' is not available, \code{NA} will be returned instead.} #' \item{latitude}{ the latitude of \code{item}} #' \item{longitude}{ the longitude of \code{item}} #' \item{entity}{ the entity the item is associated with (necessary for multi-entity #' queries).} #'} #' #'@examples #'# Simple bounding box #'bruges_box <- WikidataR:::get_geo_box("Q12988", "NorthEast", "Q184287", "SouthWest") #' #'# Custom language #'bruges_box_fr <- WikidataR:::get_geo_box("Q12988", "NorthEast", "Q184287", "SouthWest", #' language = "fr") #' #'@seealso \code{\link{get_geo_entity}} for using an unrestricted search or simple radius, #'rather than a bounding box. #' #'@export get_geo_box <- function(first_city_code, first_corner, second_city_code, second_corner, language = "en", ...){ # Input checks first_city_code <- check_input(first_city_code, "Q") second_city_code <- check_input(second_city_code, "Q") # Construct query query <- paste0("SELECT ?item ?name ?coord WHERE { wd:", first_city_code, " wdt:P625 ?Firstloc . wd:", second_city_code, " wdt:P625 ?Secondloc . SERVICE wikibase:box { ?item wdt:P625 ?coord . bd:serviceParam wikibase:corner", first_corner, " ?Firstloc . bd:serviceParam wikibase:corner", second_corner, " ?Secondloc . } SERVICE wikibase:label { bd:serviceParam wikibase:language \"", language, "\" . ?item rdfs:label ?name } }ORDER BY ASC (?name)") # Vectorise if necessary, or not if not! if(length(query) > 1){ return(do.call("rbind", mapply(function(query, ...){ output <- clean_geo(sparql_query(query, ...)$results$bindings) return(output) }, query = query, ..., SIMPLIFY = FALSE))) } output <- clean_geo(sparql_query(query)$results$bindings) return(output) }WikidataR/R/WikidataR.R0000644000176200001440000000127213106773114014341 0ustar liggesusers#' @title API client library for Wikidata #' @description This package serves as an API client for \href{Wikidata}{https://www.wikidata.org}. #' See the accompanying vignette for more details. 
#' #' @name WikidataR #' @docType package #'@seealso \code{\link{get_random}} for selecting a random item or property, #'\code{\link{get_item}} for a /specific/ item or property, or \code{\link{find_item}} #'for using search functionality to pull out item or property IDs where the descriptions #'or aliases match a particular search term. #' @importFrom WikipediR page_content random_page query #' @importFrom httr user_agent #' @importFrom jsonlite fromJSON #' @aliases WikidataR WikidataR-package NULLWikidataR/R/prints.R0000644000176200001440000000606013106775271014007 0ustar liggesusers#'@title Print method for find_item #' #'@description print found items. #' #'@param x find_item object with search results #'@param \dots Arguments to be passed to methods #' #'@method print find_item #'@export print.find_item <- function(x, ...) { cat("\n\tWikidata item search\n\n") # number of results num_results <- length(x) cat("Number of results:\t", num_results, "\n\n") # results if(num_results > 0) { cat("Results:\n") for(i in 1:num_results) { if(is.null(x[[i]]$description)){ desc <- "\n" } else { desc <- paste("-", x[[i]]$description, "\n") } cat(i, "\t", x[[i]]$label, paste0("(", x[[i]]$id, ")"), desc) } } } #'@title Print method for find_property #' #'@description print found properties. #' #'@param x find_property object with search results #'@param \dots Arguments to be passed to methods #' #'@method print find_property #'@export print.find_property <- function(x, ...) 
{ cat("\n\tWikidata property search\n\n") # number of results num_results <- length(x) cat("Number of results:\t", num_results, "\n\n") # results if(num_results > 0) { cat("Results:\n") for(i in seq_len(num_results)) { if(is.null(x[[i]]$description)){ desc <- "\n" } else { desc <- paste("-", x[[i]]$description, "\n") } cat(i, "\t", x[[i]]$label, paste0("(", x[[i]]$id, ")"), desc) } } } wd_print_base <- function(x, ...){ cat("\n\tWikidata", x$type, x$id, "\n\n") # labels num.labels <- length(x$labels) if(num.labels>0) { lbl <- x$labels[[1]]$value if(num.labels==1) cat("Label:\t\t", lbl, "\n") else { if(!is.null(x$labels$en)) lbl <- x$labels$en$value cat("Label:\t\t", lbl, paste0("\t[", num.labels-1, " other languages available]\n")) } } # aliases num_aliases <- length(x$aliases) if(num_aliases > 0) { al <- unique(unlist(lapply(x$aliases, function(xl){return(xl$value)}))) cat("Aliases:\t", paste(al, collapse = ", "), "\n") } # descriptions num_desc <- length(x$descriptions) if(num_desc > 0) { desc <- x$descriptions[[1]]$value if(num_desc == 1){ cat("Description:", desc, "\n") } else { if(!is.null(x$descriptions$en)){ desc <- x$descriptions$en$value } cat("Description:", desc, paste0("\t[", (num_desc - 1), " other languages available]\n")) } } # num claims num_claims <- length(x$claims) if(num_claims > 0){ cat("Claims:\t\t", num_claims, "\n") } # num sitelinks num_links <- length(x$sitelinks) if(num_links > 0){ cat("Sitelinks:\t", num_links, "\n") } } #'@title Print method for Wikidata objects #' #'@description print found objects generally. #' #'@param x wikidata object from get_item, get_random_item, get_property or get_random_property #'@param \dots Arguments to be passed to methods #'@seealso get_item, get_random_item, get_property or get_random_property #'@method print wikidata #'@export print.wikidata <- function(x, ...){ lapply(x, wd_print_base, ...) 
return(invisible()) }WikidataR/R/gets.R0000644000176200001440000001056613106775601013435 0ustar liggesusers#'@title Retrieve specific Wikidata items or properties #'@description \code{get_item} and \code{get_property} allow you to retrieve the data associated #'with individual Wikidata items and properties, respectively. As with #'other \code{WikidataR} code, custom print methods are available; use \code{\link{str}} #'to manipulate and see the underlying structure of the data. #' #'@param id the ID number(s) of the item or property you're looking for. This can be in #'various formats; either a numeric value ("200"), the full name ("Q200") or #'even with an included namespace ("Property:P10") - the function will format #'it appropriately. This function is vectorised and will happily accept #'multiple IDs. #' #'@param ... further arguments to pass to httr's GET. #' #'@seealso \code{\link{get_random}} for selecting a random item or property, #'or \code{\link{find_item}} for using search functionality to pull out #'item or property IDs where the descriptions or aliases match a particular #'search term. 
#' #'@examples #' #'#Retrieve a specific item #'adams_metadata <- get_item("42") #' #'#Retrieve a specific property #'object_is_child <- get_property("P40") #' #'@aliases get_item get_property #'@rdname get_item #'@export get_item <- function(id, ...){ id <- check_input(id, "Q") output <- (lapply(id, wd_query, ...)) class(output) <- "wikidata" return(output) } #'@rdname get_item #'@export get_property <- function(id, ...){ has_grep <- grepl("^P(?!r)",id, perl = TRUE) id[has_grep] <- paste0("Property:", id[has_grep]) id <- check_input(id, "Property:P") output <- (lapply(id, wd_query, ...)) class(output) <- "wikidata" return(output) } #'@title Retrieve randomly-selected Wikidata items or properties #'@description \code{get_random_item} and \code{get_random_property} allow you to retrieve the data #'associated with randomly-selected Wikidata items and properties, respectively. As with #'other \code{WikidataR} code, custom print methods are available; use \code{\link{str}} #'to manipulate and see the underlying structure of the data. #' #'@param limit how many random items to return. 1 by default, but can be higher. #' #'@param ... arguments to pass to httr's GET. #' #'@seealso \code{\link{get_item}} for selecting a specific item or property, #'or \code{\link{find_item}} for using search functionality to pull out #'item or property IDs where the descriptions or aliases match a particular #'search term. 
#' #'@examples #' #'#Random item #'random_item <- get_random_item() #' #'#Random property #'random_property <- get_random_property() #' #'@aliases get_random get_random_item get_random_property #'@rdname get_random #'@export get_random_item <- function(limit = 1, ...){ return(wd_rand_query(ns = 0, limit = limit, ...)) } #'@rdname get_random #'@export get_random_property <- function(limit = 1, ...){ return(wd_rand_query(ns = 120, limit = limit, ...)) } #'@title Search for Wikidata items or properties that match a search term #'@description \code{find_item} and \code{find_property} allow you to retrieve a set #'of Wikidata items or properties where the aliase or descriptions match a particular #'search term. As with other \code{WikidataR} code, custom print methods are available; #'use \code{\link{str}} to manipulate and see the underlying structure of the data. #' #'@param search_term a term to search for. #' #'@param language the language to return the labels and descriptions in; this should #'consist of an ISO language code. Set to "en" by default. #' #'@param limit the number of results to return; set to 10 by default. #' #'@param ... further arguments to pass to httr's GET. #' #'@seealso \code{\link{get_random}} for selecting a random item or property, #'or \code{\link{get_item}} for selecting a specific item or property. 
#' #'@examples #' #'#Check for entries relating to Douglas Adams in some way #'adams_items <- find_item("Douglas Adams") #' #'#Check for properties involving the peerage #'peerage_props <- find_property("peerage") #' #'@aliases find_item find_property #'@rdname find_item #'@export find_item <- function(search_term, language = "en", limit = 10, ...){ res <- searcher(search_term, language, limit, "item") class(res) <- "find_item" return(res) } #'@rdname find_item #'@export find_property <- function(search_term, language = "en", limit = 10){ res <- searcher(search_term, language, limit, "property") class(res) <- "find_property" return(res) } WikidataR/vignettes/0000755000176200001440000000000013161072003014132 5ustar liggesusersWikidataR/vignettes/Introduction.Rmd0000644000176200001440000000676413106776057017317 0ustar liggesusers # WikidataR: the API client library for Wikidata Wikidata is a wonderful and irreplaceable resource for linked data, containing information on pretty much any subject. If there's a Wikipedia article on it, there's almost certainly a Wikidata item for it. WikidataR - following the naming scheme of [WikipediR](https://github.com/Ironholds/WikipediR#thanks-and-misc) - is an API client library for Wikidata, written in and accessible from R. ## Items and properties The two basic component pieces of Wikidata are "items" and "properties". An "item" is a thing - a concept, object or topic that exists in the real world, such as "Rush". These items each have statements associated with them - for example, "Rush is an instance of: Rock Band". In that statement, "Rock Band" is a property: a class or trait that items can hold. Wikidata items are organised as descriptors of the item, in various languages, and references to the properties that that item holds. ## Retrieving specific items or properties Items and properties are both identified by numeric IDs, prefaced with "Q" in the case of items, and "P" in the case of properties. 
WikidataR can be used to retrieve items or properties with specific ID numbers, using the get\_item and get\_property functions:

```{r, eval=FALSE}
#Retrieve an item
item <- get_item(id = 1)

#Get information about the property of the first claim it has.
first_claim <- get_property(id = names(item$claims)[1])
#Do we succeed? Dewey!
```

These functions are capable of accepting various forms for the ID, including (as examples), "Q100" or "100" for items, and "Property:P100", "P100" or "100" for properties. They're also vectorised - pass them as many IDs as you want!

## Retrieving randomly-selected items or properties

As well as retrieving specific items or properties, Wikidata's API also allows for the retrieval of *random* elements. With WikidataR, this can be achieved through:

```{r, eval=FALSE}
#Retrieve a random item
rand_item <- get_random_item()

#Retrieve a random property
rand_prop <- get_random_property()
```

These also allow you to retrieve *sets* of random elements - not just one at a time, but say, 50 at a time - by including the "limit" argument:

```{r, eval=FALSE}
#Retrieve 42 random items
rand_item <- get_random_item(limit = 42)

#Retrieve 42 random properties
rand_prop <- get_random_property(limit = 42)
```

## Search

Wikidata's search functionality can also be used, either to find items or to find properties. All you need is a search string (which is run over the names and descriptions of items or properties) and a language code (since Wikidata's descriptions can be in many languages):

```{r, eval=FALSE}
#Find item - find defaults to "en" as a language.
aarons <- find_item("Aaron Halfaker")

#Find a property - also defaults to "en"
first_names <- find_property("first name")
```

The resulting search entries have the ID as a key, making it trivial to then retrieve the full corresponding items or properties:

```{r, eval=FALSE}
#Find item.
all_aarons <- find_item("Aaron Halfaker")

#Grab the ID code for the first entry and retrieve the associated item data.
first_aaron <- get_item(all_aarons[[1]]$id)
```

## Other and future functionality

If you have ideas for other types of useful Wikidata access, the best approach is to either [request it](https://github.com/Ironholds/WikidataR/issues) or [add it](https://github.com/Ironholds/WikidataR/pulls)!

WikidataR/README.md

WikidataR
=========

An R API wrapper for the Wikidata store of semantic data.

__Author:__ Oliver Keyes, Serena Signorelli & Christian Graul
__License:__ [MIT](http://opensource.org/licenses/MIT)
__Status:__ Stable

[![Travis-CI Build Status](https://travis-ci.org/Ironholds/WikidataR.svg?branch=master)](https://travis-ci.org/Ironholds/WikidataR)![downloads](http://cranlogs.r-pkg.org/badges/grand-total/WikidataR)

Description
======
WikidataR is a wrapper around the Wikidata API. It is written in and for R, and was inspired by Christian Graul's
[rwikidata](https://github.com/chgrl/rwikidata) project. For details on how to best use it, see the [explanatory
vignette](https://CRAN.R-project.org/package=WikidataR/vignettes/Introduction.html).

Please note that this project is released with a [Contributor Code of Conduct](https://github.com/Ironholds/WikidataR/blob/master/CONDUCT.md).
By participating in this project you agree to abide by its terms.

Installation
======

For the most recent CRAN version:

    install.packages("WikidataR")

For the development version:

    library(devtools)
    devtools::install_github("ironholds/WikidataR")

Dependencies
======
* R. Doy.
* [httr](https://cran.r-project.org/package=httr) and its dependencies.
* [WikipediR](https://cran.r-project.org/package=WikipediR)

WikidataR/MD5

eb02df461c648d4da3f983afc54503d5 *DESCRIPTION
1d9678dbfe1732b5d2c521e07b2ceef0 *LICENSE
8f5819571233c6d8d08d23f9bfc9979b *NAMESPACE
2776dd31c6533290c7fd2cd414a2b4bf *NEWS
e6967d650ab6b6462db1793f0fe5a46b *R/WikidataR.R
5ad80eca5081277b549234400a2dd7a3 *R/geo.R
4229fe3d75d444beb2fa00ae2bdcdba6 *R/gets.R
e588e32737791defc6f982114f39d75c *R/prints.R
c16306d76abfe6d0e78dd62bf77173c7 *R/utils.R
6095c718be80727c886cff790734e9b5 *README.md
a1d7177a65e4773e0c7fae2ccb9d143d *build/vignette.rds
43cc957bbe79bc0b25b62be190705064 *inst/doc/Introduction.R
5ab492a540df058a91940716bf3e9c4f *inst/doc/Introduction.Rmd
3e88344829cf501478fb9e8c841a18b5 *inst/doc/Introduction.html
dea44cd789a89155878f75eb0c430541 *man/WikidataR.Rd
ca32f05afde2f042aa5ce7c799d63976 *man/extract_claims.Rd
d6439bd1505303b2c069a9ec5a482346 *man/find_item.Rd
6486678f64813a107103352d076f2ed3 *man/get_geo_box.Rd
3d485c862e1ab25782c98cf2d2e8c009 *man/get_geo_entity.Rd
d44df4503eefe77a44f15be028f977b7 *man/get_item.Rd
9a902a02739165e2a862571144e74ebd *man/get_random.Rd
aa48f8096742e46ef2f78f0ec960b039 *man/print.find_item.Rd
5f88c4bb32c2b352ff1cab1f9124a982 *man/print.find_property.Rd
294468449f0be62ebd8443bd10e926be *man/print.wikidata.Rd
ced86f667bcd51239f1c0d5d5c1a492b *tests/testthat.R
8f0a71f6693281b0d26afe5532158157 *tests/testthat/test_geo.R
2986dd17d5e90976391d811f9bb3bb1c *tests/testthat/test_gets.R
38bd7da6e5db4b243603b6d0d53e8cdd *tests/testthat/test_search.R
5ab492a540df058a91940716bf3e9c4f *vignettes/Introduction.Rmd

WikidataR/DESCRIPTION
Package: WikidataR
Type: Package
Title: API Client Library for 'Wikidata'
Version: 1.4.0
Date: 2017-09-21
Author: Oliver Keyes [aut, cre], Serena Signorelli [aut, cre], Christian Graul [ctb], Mikhail Popov [ctb]
Maintainer: Oliver Keyes
Description: An API client for the Wikidata store of semantic data.
BugReports: https://github.com/Ironholds/WikidataR/issues
URL: https://github.com/Ironholds/WikidataR/issues
License: MIT + file LICENSE
Imports: httr, jsonlite, WikipediR (>= 1.4.0), utils
Suggests: testthat, knitr, pageviews
VignetteBuilder: knitr
RoxygenNote: 6.0.1
NeedsCompilation: no
Packaged: 2017-09-22 02:22:59 UTC; ironholds
Repository: CRAN
Date/Publication: 2017-09-22 05:43:08 UTC

WikidataR/man/extract_claims.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/utils.R
\name{extract_claims}
\alias{extract_claims}
\title{Extract Claims from Returned Item Data}
\usage{
extract_claims(items, claims)
}
\arguments{
\item{items}{a list of one or more Wikidata items returned with
\code{\link{get_item}}.}

\item{claims}{a vector of claims (in the form "P321", "P12") to look for
and extract.}
}
\value{
a list containing one sub-list for each entry in \code{items},
and (below that) the found data for each claim. In the event a claim cannot
be found for an item, an \code{NA} will be returned
instead.
}
\description{
extract claim information from data returned using
\code{\link{get_item}}.
}
\examples{
# Get item data
adams_data <- get_item("42")
# Get claim data
claims <- extract_claims(adams_data, "P31")

}

WikidataR/man/print.find_property.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/prints.R
\name{print.find_property}
\alias{print.find_property}
\title{Print method for find_property}
\usage{
\method{print}{find_property}(x, ...)
}
\arguments{
\item{x}{find_property object with search results}

\item{\dots}{Arguments to be passed to methods}
}
\description{
print found properties.
}

WikidataR/man/find_item.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/gets.R
\name{find_item}
\alias{find_item}
\alias{find_property}
\title{Search for Wikidata items or properties that match a search term}
\usage{
find_item(search_term, language = "en", limit = 10, ...)

find_property(search_term, language = "en", limit = 10)
}
\arguments{
\item{search_term}{a term to search for.}

\item{language}{the language to return the labels and descriptions in; this should
consist of an ISO language code. Set to "en" by default.}

\item{limit}{the number of results to return; set to 10 by default.}

\item{...}{further arguments to pass to httr's GET.}
}
\description{
\code{find_item} and \code{find_property} allow you to retrieve a set
of Wikidata items or properties where the aliases or descriptions match a particular
search term. As with other \code{WikidataR} code, custom print methods are available;
use \code{\link{str}} to manipulate and see the underlying structure of the data.
}
\examples{

#Check for entries relating to Douglas Adams in some way
adams_items <- find_item("Douglas Adams")

#Check for properties involving the peerage
peerage_props <- find_property("peerage")

}
\seealso{
\code{\link{get_random}} for selecting a random item or property,
or \code{\link{get_item}} for selecting a specific item or property.
}

WikidataR/man/print.wikidata.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/prints.R
\name{print.wikidata}
\alias{print.wikidata}
\title{Print method for Wikidata objects}
\usage{
\method{print}{wikidata}(x, ...)
}
\arguments{
\item{x}{wikidata object from get_item, get_random_item, get_property or get_random_property}

\item{\dots}{Arguments to be passed to methods}
}
\description{
print found objects generally.
}
\seealso{
get_item, get_random_item, get_property or get_random_property
}

WikidataR/man/WikidataR.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/WikidataR.R
\docType{package}
\name{WikidataR}
\alias{WikidataR}
\alias{WikidataR-package}
\title{API client library for Wikidata}
\description{
This package serves as an API client for \href{https://www.wikidata.org}{Wikidata}.
See the accompanying vignette for more details.
}
\seealso{
\code{\link{get_random}} for selecting a random item or property,
\code{\link{get_item}} for a \emph{specific} item or property, or \code{\link{find_item}}
for using search functionality to pull out item or property IDs where the descriptions
or aliases match a particular search term.
}

WikidataR/man/get_item.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/gets.R
\name{get_item}
\alias{get_item}
\alias{get_property}
\title{Retrieve specific Wikidata items or properties}
\usage{
get_item(id, ...)

get_property(id, ...)
}
\arguments{
\item{id}{the ID number(s) of the item or property you're looking for. This can be in
various formats; either a numeric value ("200"), the full name ("Q200") or
even with an included namespace ("Property:P10") - the function will format
it appropriately. This function is vectorised and will happily accept
multiple IDs.}

\item{...}{further arguments to pass to httr's GET.}
}
\description{
\code{get_item} and \code{get_property} allow you to retrieve the data associated
with individual Wikidata items and properties, respectively. As with
other \code{WikidataR} code, custom print methods are available; use \code{\link{str}}
to manipulate and see the underlying structure of the data.
}
\examples{

#Retrieve a specific item
adams_metadata <- get_item("42")

#Retrieve a specific property
object_is_child <- get_property("P40")

}
\seealso{
\code{\link{get_random}} for selecting a random item or property,
or \code{\link{find_item}} for using search functionality to pull out
item or property IDs where the descriptions or aliases match a particular search term.
}

WikidataR/man/get_random.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/gets.R
\name{get_random_item}
\alias{get_random_item}
\alias{get_random}
\alias{get_random_property}
\title{Retrieve randomly-selected Wikidata items or properties}
\usage{
get_random_item(limit = 1, ...)

get_random_property(limit = 1, ...)
}
\arguments{
\item{limit}{how many random items to return.
1 by default, but can be higher.}

\item{...}{arguments to pass to httr's GET.}
}
\description{
\code{get_random_item} and \code{get_random_property} allow you to retrieve the data
associated with randomly-selected Wikidata items and properties, respectively. As with
other \code{WikidataR} code, custom print methods are available; use \code{\link{str}}
to manipulate and see the underlying structure of the data.
}
\examples{

#Random item
random_item <- get_random_item()

#Random property
random_property <- get_random_property()

}
\seealso{
\code{\link{get_item}} for selecting a specific item or property,
or \code{\link{find_item}} for using search functionality to pull out
item or property IDs where the descriptions or aliases match a particular search term.
}

WikidataR/man/get_geo_entity.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/geo.R
\name{get_geo_entity}
\alias{get_geo_entity}
\title{Retrieve geographic information from Wikidata}
\usage{
get_geo_entity(entity, language = "en", radius = NULL, ...)
}
\arguments{
\item{entity}{a Wikidata item (\code{Q...}) or series of items, to check
for associated geo-tagged items.}

\item{language}{the two-letter language code to use for the name
of the item. "en" by default, because we're imperialist
anglocentric westerners.}

\item{radius}{optionally, a radius (in kilometers) around \code{entity}
to restrict the search to.}

\item{...}{further arguments to pass to httr's GET.}
}
\value{
a data.frame of 5 columns:
\itemize{
 \item{item}{ the Wikidata identifier of each object associated with
 \code{entity}.}
 \item{name}{ the name of the item, if available, in the requested language.
 If it is not available, \code{NA} will be returned instead.}
 \item{latitude}{ the latitude of \code{item}}
 \item{longitude}{ the longitude of \code{item}}
 \item{entity}{ the entity the item is associated with (necessary for multi-entity
 queries).}
}
}
\description{
\code{get_geo_entity} retrieves the item ID, latitude
and longitude of any object with geographic data associated with \emph{another}
object with geographic data (example: all the locations around/near/associated with
a city).
}
\examples{
# All entities
sf_locations <- get_geo_entity("Q62")

# Entities with French, rather than English, names
sf_locations <- get_geo_entity("Q62", language = "fr")

# Entities within 1km
sf_close_locations <- get_geo_entity("Q62", radius = 1)

# Multiple entities
multi_entity <- get_geo_entity(entity = c("Q62", "Q64"))

}
\seealso{
\code{\link{get_geo_box}} for using a bounding box
rather than an unrestricted search or simple radius.
}

WikidataR/man/get_geo_box.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/geo.R
\name{get_geo_box}
\alias{get_geo_box}
\title{Get geographic entities based on a bounding box}
\usage{
get_geo_box(first_city_code, first_corner, second_city_code, second_corner,
  language = "en", ...)
}
\arguments{
\item{first_city_code}{a Wikidata item, or series of items, to use for
one corner of the bounding box.}

\item{first_corner}{the direction of \code{first_city_code} relative
to \code{city} (eg "NorthWest", "SouthEast").}

\item{second_city_code}{a Wikidata item, or series of items, to use for
the other corner of the bounding box.}

\item{second_corner}{the direction of \code{second_city_code} relative
to \code{city} (eg "NorthWest", "SouthEast").}

\item{language}{the two-letter language code to use for the name
of the item.
"en" by default.}

\item{...}{further arguments to pass to httr's GET.}
}
\value{
a data.frame of 5 columns:
\itemize{
 \item{item}{ the Wikidata identifier of each object associated with
 \code{entity}.}
 \item{name}{ the name of the item, if available, in the requested language. If it
 is not available, \code{NA} will be returned instead.}
 \item{latitude}{ the latitude of \code{item}}
 \item{longitude}{ the longitude of \code{item}}
 \item{entity}{ the entity the item is associated with (necessary for multi-entity
 queries).}
}
}
\description{
\code{get_geo_box} retrieves all geographic entities in
Wikidata that fall within a bounding box defined by two existing items
with geographic attributes (usually cities).
}
\examples{
# Simple bounding box
bruges_box <- WikidataR:::get_geo_box("Q12988", "NorthEast", "Q184287", "SouthWest")

# Custom language
bruges_box_fr <- WikidataR:::get_geo_box("Q12988", "NorthEast", "Q184287", "SouthWest",
                                         language = "fr")

}
\seealso{
\code{\link{get_geo_entity}} for using an unrestricted search or simple radius,
rather than a bounding box.
}

WikidataR/man/print.find_item.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/prints.R
\name{print.find_item}
\alias{print.find_item}
\title{Print method for find_item}
\usage{
\method{print}{find_item}(x, ...)
}
\arguments{
\item{x}{find_item object with search results}

\item{\dots}{Arguments to be passed to methods}
}
\description{
print found items.
}

WikidataR/LICENSE

YEAR: 2014
COPYRIGHT HOLDER: Oliver Keyes