filehash/0000755000176200001440000000000014371237002012034 5ustar liggesusersfilehash/NAMESPACE0000644000176200001440000000216614367270074013273 0ustar liggesusers# Generated by roxygen2: do not edit by hand export(createQ) export(createS) export(db2env) export(dumpDF) export(dumpEnv) export(dumpImage) export(dumpList) export(dumpObjects) export(filehashFormats) export(filehashOption) export(initQ) export(initS) export(registerFormatDB) exportClasses(filehash) exportClasses(filehashDB1) exportClasses(filehashRDS) exportClasses(queue) exportClasses(stack) exportMethods(`$<-`) exportMethods(`$`) exportMethods(`[[<-`) exportMethods(`[[`) exportMethods(`[`) exportMethods(coerce) exportMethods(dbCreate) exportMethods(dbDelete) exportMethods(dbExists) exportMethods(dbFetch) exportMethods(dbInit) exportMethods(dbInsert) exportMethods(dbLazyLoad) exportMethods(dbList) exportMethods(dbLoad) exportMethods(dbMultiFetch) exportMethods(dbReorganize) exportMethods(dbUnlink) exportMethods(isEmpty) exportMethods(lapply) exportMethods(length) exportMethods(mpush) exportMethods(names) exportMethods(pop) exportMethods(push) exportMethods(show) exportMethods(top) exportMethods(with) import(methods) importFrom(digest,digest) importFrom(methods,new) useDynLib(filehash,.registration = TRUE, .fixes = "C_") filehash/man/0000755000176200001440000000000014370503521012610 5ustar liggesusersfilehash/man/dumpEnv.Rd0000644000176200001440000000412714370503521014521 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/dump.R \name{dumpEnv} \alias{dumpEnv} \alias{dumpImage} \alias{dumpObjects} \alias{dumpDF} \alias{dumpList} \title{Dump Environment} \usage{ dumpEnv(env, dbName) dumpImage(dbName = "Rworkspace", type = NULL) dumpObjects( ..., list = character(0), dbName, type = NULL, envir = parent.frame() ) dumpDF(data, dbName = NULL, type = NULL) dumpList(data, dbName = NULL, type = NULL) } \arguments{ \item{env}{an environment} \item{dbName}{character, 
name of the filehash database} \item{type}{type of filehash database to create} \item{...}{R objects to be dumped to a filehash database} \item{list}{character vector of object names to be dumped} \item{envir}{environment from which objects are dumped} \item{data}{a data frame (\code{dumpDF}) or a list (\code{dumpList})} } \value{ An object of class \code{"filehash"} is returned and a database is created. } \description{ Dump an environment to a filehash database } \details{ The \code{dumpEnv} function takes an environment and stores each element of the environment in a \code{filehash} database. Objects dumped to a database can later be loaded via \code{dbLoad} or can be accessed with \code{dbFetch}, \code{dbList}, etc. Alternatively, the \code{with} method can be used to evaluate code in the context of a database. If a database with name \code{dbName} already exists, objects will be inserted into the existing database (and values for already-existing keys will be overwritten). \code{dumpDF} is different in that each variable in the data frame is stored as a separate object in the database. So each variable can be read from the database separately rather than having to load the entire data frame into memory. \code{dumpList} works in a similar way. } \section{Functions}{ \itemize{ \item \code{dumpImage()}: Dump the Global Environment (analogous to \code{save.image}) \item \code{dumpObjects()}: Dump named objects to a filehash database (analogous to \code{save}) \item \code{dumpDF()}: Dump data frame columns to a filehash database \item \code{dumpList()}: Dump elements of a list to a filehash database }} filehash/man/filehashOption.Rd0000644000176200001440000000100714370503521016051 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/filehash.R \name{filehashOption} \alias{filehashOption} \title{Set Filehash Options} \usage{ filehashOption(...) 
} \arguments{ \item{\dots}{name-value pairs for options} } \value{ \code{filehashOption} returns a list of current settings for all options. } \description{ Set global filehash options } \details{ Currently, the only option that can be set is the default database type (\code{defaultType}) which can be "DB1", "RDS" or "DB". } filehash/man/coerceDB1list.Rd0000644000176200001440000000050414370503521015521 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/coerce.R \name{coerceDB1list} \alias{coerceDB1list} \alias{coerce,filehashDB1,list-method} \title{Coerce a filehash database} \arguments{ \item{from}{a filehashDB1 database object} } \description{ Coerce a filehashDB1 database to a list object } filehash/man/queue-class.Rd0000644000176200001440000000452414370503521015333 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/queue.R \docType{class} \name{queue-class} \alias{queue-class} \alias{createQ} \alias{initQ} \alias{pop} \alias{push} \alias{isEmpty} \alias{top} \alias{show,queue-method} \alias{push,queue-method} \alias{isEmpty,queue-method} \alias{top,queue-method} \alias{pop,queue-method} \title{A Queue Class} \usage{ createQ(filename) initQ(filename) pop(db, ...) push(db, val, ...) isEmpty(db, ...) top(db, ...) \S4method{show}{queue}(object) \S4method{push}{queue}(db, val, ...) \S4method{isEmpty}{queue}(db) \S4method{top}{queue}(db, ...) \S4method{pop}{queue}(db, ...) } \arguments{ \item{filename}{name of queue file} \item{db}{a queue object} \item{...}{arguments passed to other methods} \item{val}{an R object to be added to the tail of the queue} \item{object}{a queue object} } \value{ \code{createQ} and \code{initQ} return a \code{queue} object } \description{ A queue implementation using a \code{filehash} database } \details{ Objects can be created by calls of the form \code{new("queue", ...)} or by calling \code{createQ}. Existing queues can be initialized with \code{initQ}. 
} \section{Methods (by generic)}{ \itemize{ \item \code{show(queue)}: Print a queue object \item \code{push(queue)}: adds an element to the tail ("bottom") of the queue \item \code{isEmpty(queue)}: returns \code{TRUE} if the queue contains no elements and \code{FALSE} otherwise. \item \code{top(queue)}: returns the value of the "top" (i.e. head) of the queue; an error is signaled if the queue is empty \item \code{pop(queue)}: returns the value of the "top" (i.e. head) of the queue and subsequently removes that element from the queue; an error is signaled if the queue is empty }} \section{Functions}{ \itemize{ \item \code{createQ()}: Create a file-based queue object \item \code{initQ()}: Initialize an existing queue object \item \code{pop()}: Return (and remove) the top element of a queue \item \code{push()}: Push an R object on to the tail of a queue \item \code{isEmpty()}: Check if a queue is empty or not \item \code{top()}: Return the top of the queue }} \section{Slots}{ \describe{ \item{\code{queue}}{Object of class \code{"filehashDB1"}} \item{\code{name}}{Object of class \code{"character"}: the name of the queue (default is the file name in which the queue data are stored)} }} filehash/man/dbLoad.Rd0000644000176200001440000000607714370503521014276 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/filehash.R \name{dbLoad} \alias{dbLoad} \alias{dbLoad,filehash-method} \alias{dbLazyLoad} \alias{dbLazyLoad,filehash-method} \alias{db2env} \title{Load a Database} \usage{ dbLoad(db, ...) \S4method{dbLoad}{filehash}(db, env = parent.frame(2), keys = NULL, ...) dbLazyLoad(db, ...) \S4method{dbLazyLoad}{filehash}(db, env = parent.frame(2), keys = NULL, ...) 
db2env(db) } \arguments{ \item{db}{filehash database object} \item{...}{arguments passed to other methods} \item{env}{environment into which objects should be loaded} \item{keys}{specific keys to be loaded (if NULL then all keys are loaded)} } \value{ dbLoad, dbLazyLoad: a character vector is returned (invisibly) containing the keys associated with the values loaded into the environment. db2env: environment containing database keys } \description{ Load entire database into an environment } \details{ \code{dbLoad} loads objects in the database directly into the environment specified, like \code{load} does except with active bindings. \code{dbLoad} takes a second argument \code{env}, which is an environment, and the default for \code{env} is \code{parent.frame()}. The use of \code{makeActiveBinding} in \code{db2env} and \code{dbLoad} allows for potentially large databases to, at least conceptually, be used in R, as long as you don't need simultaneous access to all of the elements in the database. \code{dbLazyLoad} loads objects in the database directly into the environment specified, like \code{load} does except with promises. \code{dbLazyLoad} takes a second argument \code{env}, which is an environment, and the default for \code{env} is \code{parent.frame()}. With \code{dbLazyLoad} database objects are "lazy-loaded" into the environment. Promises to load the objects are created in the environment specified by \code{env}. Upon first access, those objects are copied into the environment and will from then on reside in memory. Changes to the database will not be reflected in the object residing in the environment after first access. Conversely, changes to the object in the environment will not be reflected in the database. This type of loading is useful for read-only databases. \code{db2env} loads the entire database \code{db} into an environment via calls to \code{makeActiveBinding}. 
Therefore, the data themselves are not stored in the environment, but a function pointing to the data in the database is stored. When an element of the environment is accessed, the function is called to retrieve the data from the database. If the data in the database is changed, the changes will be reflected in the environment. } \section{Methods (by class)}{ \itemize{ \item \code{dbLoad(filehash)}: Method for filehash databases \item \code{dbLazyLoad(filehash)}: Method for filehash databases }} \section{Functions}{ \itemize{ \item \code{dbLazyLoad()}: Lazy load a filehash database \item \code{db2env()}: Load active bindings into an environment and return the environment }} \seealso{ \code{\link{dbLoad}}, \code{\link{dbLazyLoad}} } filehash/man/filehashRDS-class.Rd0000644000176200001440000000450014370503521016335 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/filehash-RDS.R \docType{class} \name{filehashRDS-class} \alias{filehashRDS-class} \alias{dbInsert,filehashRDS,character-method} \alias{dbFetch,filehashRDS,character-method} \alias{dbMultiFetch,filehashRDS,character-method} \alias{dbExists,filehashRDS,character-method} \alias{dbList,filehashRDS-method} \alias{dbDelete,filehashRDS,character-method} \alias{dbUnlink,filehashRDS-method} \title{Filehash RDS Class} \usage{ \S4method{dbInsert}{filehashRDS,character}(db, key, value, safe = TRUE, ...) \S4method{dbFetch}{filehashRDS,character}(db, key, ...) \S4method{dbMultiFetch}{filehashRDS,character}(db, key, ...) \S4method{dbExists}{filehashRDS,character}(db, key, ...) \S4method{dbList}{filehashRDS}(db, ...) \S4method{dbDelete}{filehashRDS,character}(db, key, ...) \S4method{dbUnlink}{filehashRDS}(db, ...) 
} \arguments{ \item{db}{a filehashRDS object} \item{key}{character, the name of an R object} \item{value}{an R object} \item{safe}{logical; if \code{TRUE}, the object is written to a temporary file before any existing object is replaced (see Details)} \item{...}{arguments passed to other methods} } \description{ An implementation of filehash databases using directories and separate files } \details{ When \code{safe = TRUE} in \code{dbInsert}, objects are written to a temp file before replacing any existing objects. This way, if the operation is interrupted, the original data are not corrupted. For \code{dbMultiFetch}, \code{key} is a character vector of keys. } \section{Methods (by generic)}{ \itemize{ \item \code{dbInsert(db = filehashRDS, key = character)}: Insert an R object into a filehashRDS database \item \code{dbFetch(db = filehashRDS, key = character)}: Retrieve a value from a filehashRDS database \item \code{dbMultiFetch(db = filehashRDS, key = character)}: Retrieve multiple objects from a filehashRDS database \item \code{dbExists(db = filehashRDS, key = character)}: Determine if a key exists in a filehashRDS database \item \code{dbList(filehashRDS)}: Return a character vector of all keys stored in a database \item \code{dbDelete(db = filehashRDS, key = character)}: Delete a key and its corresponding object from a filehashRDS database \item \code{dbUnlink(filehashRDS)}: Delete an entire filehashRDS database }} \section{Slots}{ \describe{ \item{\code{dir}}{Directory where files are stored (filehashRDS only)} }} filehash/man/filehashDB1-class.Rd0000644000176200001440000000453014370503521016256 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/filehash-DB1.R \docType{class} \name{filehashDB1-class} \alias{filehashDB1-class} \alias{dbInsert,filehashDB1,character-method} \alias{dbFetch,filehashDB1,character-method} \alias{dbMultiFetch,filehashDB1,character-method} \alias{dbExists,filehashDB1,character-method} \alias{dbList,filehashDB1-method} \alias{dbDelete,filehashDB1,character-method} 
\alias{dbUnlink,filehashDB1-method} \alias{dbReorganize,filehashDB1-method} \title{Filehash DB1 Class} \usage{ \S4method{dbInsert}{filehashDB1,character}(db, key, value, ...) \S4method{dbFetch}{filehashDB1,character}(db, key, ...) \S4method{dbMultiFetch}{filehashDB1,character}(db, key, ...) \S4method{dbExists}{filehashDB1,character}(db, key, ...) \S4method{dbList}{filehashDB1}(db, ...) \S4method{dbDelete}{filehashDB1,character}(db, key, ...) \S4method{dbUnlink}{filehashDB1}(db, ...) \S4method{dbReorganize}{filehashDB1}(db, ...) } \arguments{ \item{db}{a filehashDB1 object} \item{key}{character, the name of an R object in the database} \item{value}{an R object} \item{...}{arguments passed to other methods} } \description{ An implementation of filehash databases using a single large file } \details{ For \code{dbMultiFetch}, \code{key} is a character vector of keys. } \section{Methods (by generic)}{ \itemize{ \item \code{dbInsert(db = filehashDB1, key = character)}: Insert an R object into a filehashDB1 database \item \code{dbFetch(db = filehashDB1, key = character)}: Retrieve an object from a filehashDB1 database \item \code{dbMultiFetch(db = filehashDB1, key = character)}: Retrieve multiple objects from a filehashDB1 database \item \code{dbExists(db = filehashDB1, key = character)}: Determine if a key exists in a filehashDB1 database \item \code{dbList(filehashDB1)}: Return a character vector containing all keys in a database \item \code{dbDelete(db = filehashDB1, key = character)}: Delete a key and its corresponding object from a filehashDB1 database \item \code{dbUnlink(filehashDB1)}: Delete an entire filehashDB1 database \item \code{dbReorganize(filehashDB1)}: Reorganize and compact a filehashDB1 database }} \section{Slots}{ \describe{ \item{\code{datafile}}{full path to the database file (filehashDB1 only)} \item{\code{meta}}{list containing an environment for database metadata (filehashDB1 only)} }} 
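The DB1 methods documented above can be exercised together in a short session. The following is a minimal sketch rather than part of the package documentation: the database name "sketch_db" is hypothetical, and the comparison of file sizes assumes that the \code{datafile} slot points at the single file holding the data, as described under Slots.

```r
library(filehash)

# Work in a temporary directory so the sketch leaves nothing behind.
old_wd <- setwd(tempdir())

# "sketch_db" is a hypothetical name chosen for this illustration.
dbCreate("sketch_db", type = "DB1")
db <- dbInit("sketch_db", type = "DB1")

dbInsert(db, "a", 1:10)
dbInsert(db, "a", 10:1)      # same key: the earlier value is overwritten
dbInsert(db, "b", letters)

a_val <- dbFetch(db, "a")                # value stored by the second insert
both  <- dbMultiFetch(db, c("a", "b"))   # list of values, one per key

dbDelete(db, "b")
b_gone <- !dbExists(db, "b")             # TRUE once the key is deleted

# Stale copies of "a" and "b" remain in the file until it is reorganized.
size_before <- file.size(db@datafile)
dbReorganize(db)
db <- dbInit("sketch_db", type = "DB1")  # re-initialize after reorganizing
size_after <- file.size(db@datafile)

# Clean up the database and restore the working directory.
dbUnlink(db)
setwd(old_wd)
```

Comparing \code{size_before} and \code{size_after} is one way to see the effect of \code{dbReorganize}; the database is re-initialized with \code{dbInit} afterwards so that the handle reflects the rewritten file.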
filehash/man/registerFormatDB.Rd0000644000176200001440000000061714370503521016306 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/filehash.R \name{registerFormatDB} \alias{registerFormatDB} \title{Register Database Format} \usage{ registerFormatDB(name, funlist) } \arguments{ \item{name}{character, name of database format} \item{funlist}{list of functions for creating and initializing a database format} } \description{ Register Database Format } filehash/man/coercelist.Rd0000644000176200001440000000046514370503521015240 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/coerce.R \name{coercelist} \alias{coercelist} \alias{coerce,filehash,list-method} \title{Coerce a filehash database} \arguments{ \item{from}{a filehash database object} } \description{ Coerce a filehash database to a list object } filehash/man/filehash-class.Rd0000644000176200001440000001206014370503521015764 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/filehash.R \docType{class} \name{filehash-class} \alias{filehash-class} \alias{show,filehash-method} \alias{dbCreate,ANY-method} \alias{dbCreate} \alias{dbInit,ANY-method} \alias{dbInit} \alias{names,filehash-method} \alias{length,filehash-method} \alias{with,filehash-method} \alias{lapply,filehash-method} \alias{dbMultiFetch} \alias{dbInsert} \alias{dbFetch} \alias{dbExists} \alias{dbList} \alias{dbDelete} \alias{dbReorganize} \alias{dbUnlink} \alias{[[,filehash,character,missing-method} \alias{`[[,filehash,character,missing-method`} \alias{$,filehash-method} \alias{[[<-,filehash,character,missing-method} \alias{$<-,filehash-method} \alias{[,filehash,character,missing,missing-method} \title{Filehash Class} \usage{ \S4method{show}{filehash}(object) \S4method{dbCreate}{ANY}(db, type = NULL, ...) \S4method{dbInit}{ANY}(db, type = NULL, ...) 
\S4method{names}{filehash}(x) \S4method{length}{filehash}(x) \S4method{with}{filehash}(data, expr, ...) \S4method{lapply}{filehash}(X, FUN, ..., keep.names = TRUE) dbMultiFetch(db, key, ...) dbInsert(db, key, value, ...) dbFetch(db, key, ...) dbExists(db, key, ...) dbList(db, ...) dbDelete(db, key, ...) dbReorganize(db, ...) dbUnlink(db, ...) \S4method{[[}{filehash,character,missing}(x, i, j) \S4method{$}{filehash}(x, name) \S4method{[[}{filehash,character,missing}(x, i, j) <- value \S4method{$}{filehash}(x, name) <- value \S4method{[}{filehash,character,missing,missing}(x, i, j, drop) } \arguments{ \item{object}{a filehash object} \item{db}{a filehash object} \item{type}{filehash database type} \item{...}{arguments passed to other methods} \item{x}{a filehash object} \item{data}{a filehash object} \item{expr}{an R expression to be evaluated} \item{X}{a filehash object} \item{FUN}{a function to be applied} \item{keep.names}{Should the key names be returned in the resulting list?} \item{key}{a character vector indicating a key (or keys) to retrieve} \item{value}{an R object} \item{i}{a character index} \item{j}{not used} \item{name}{the name of the element in the filehash database} \item{drop}{should dimensions be dropped? (not used)} } \description{ These functions form the interface for a simple file-based key-value database (i.e. hash table). } \details{ Objects can be created by calls of the form \code{new("filehash", ...)}. 
} \section{Methods (by generic)}{ \itemize{ \item \code{show(filehash)}: Print a filehash object \item \code{dbCreate(ANY)}: Create a filehash database \item \code{dbInit(ANY)}: Initialize an existing filehash database \item \code{names(filehash)}: Return the keys stored in a filehash database \item \code{length(filehash)}: Return the number of objects in a filehash database \item \code{with(filehash)}: Use a filehash database as an evaluation environment \item \code{lapply(filehash)}: Apply a function over the elements of a filehash database \item \code{`[[`(x = filehash, i = character, j = missing)}: Extract elements of a filehash database using character names \item \code{`$`(filehash)}: Extract elements of a filehash database using character names \item \code{`[[`(x = filehash, i = character, j = missing) <- value}: Replace elements of a filehash database \item \code{`$`(filehash) <- value}: Replace elements of a filehash database \item \code{`[`(x = filehash, i = character, j = missing, drop = missing)}: Retrieve multiple elements of a filehash database }} \section{Functions}{ \itemize{ \item \code{dbMultiFetch()}: Retrieve values associated with multiple keys (a list of those values is returned). \item \code{dbInsert()}: Insert a key-value pair into the database. If that key already exists, its associated value is overwritten. For \code{"RDS"} type databases, there is a \code{safe} option (defaults to \code{TRUE}) which allows the user to insert objects somewhat more safely (objects should not be lost in the event of an interrupt). \item \code{dbFetch()}: Retrieve the value associated with a given key. \item \code{dbExists()}: Check to see if a key exists. \item \code{dbList()}: List all keys in the database. \item \code{dbDelete()}: The \code{dbDelete} function is for deleting elements, but for the \code{"DB1"} format all it does is remove the key from the lookup table. The actual data are still in the database (but inaccessible). If you reinsert data for the same key, the new data are simply appended on to the end of the file. 
Therefore, it's possible to have multiple copies of data lying around after a while, potentially making the database file big. The \code{"RDS"} format does not have this problem. \item \code{dbReorganize()}: The \code{dbReorganize} function is there for the purpose of rewriting the database to remove all of the stale entries. Basically, this function creates a new copy of the database and then overwrites the old copy. This function has not been tested extensively and so should be considered \emph{experimental}. \code{dbReorganize} is not needed when using the \code{"RDS"} format. \item \code{dbUnlink()}: Delete an entire database from the disk. }} \section{Slots}{ \describe{ \item{\code{name}}{Object of class \code{"character"}, name of the database.} }} filehash/man/coerceDB1RDS.Rd0000644000176200001440000000051614370503521015201 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/coerce.R \name{coerceDB1RDS} \alias{coerceDB1RDS} \alias{coerce,filehashDB1,filehashRDS-method} \title{Coerce a filehash database} \arguments{ \item{from}{a filehashDB1 database object} } \description{ Coerce a filehashDB1 database to filehashRDS format } filehash/man/stack-class.Rd0000644000176200001440000000400514370503521015306 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/stack.R \docType{class} \name{stack-class} \alias{stack-class} \alias{show,stack-method} \alias{createS} \alias{initS} \alias{push,stack-method} \alias{mpush} \alias{mpush,stack-method} \alias{isEmpty,stack-method} \alias{top,stack-method} \alias{pop,stack-method} \title{Stack Class} \usage{ \S4method{show}{stack}(object) createS(filename) initS(filename) \S4method{push}{stack}(db, val, ...) mpush(db, vals, ...) \S4method{mpush}{stack}(db, vals, ...) \S4method{isEmpty}{stack}(db, ...) \S4method{top}{stack}(db, ...) \S4method{pop}{stack}(db, ...) 
} \arguments{ \item{object}{a stack object} \item{filename}{name of file where stack is stored} \item{db}{a stack object} \item{val}{an R object to be added to the stack} \item{...}{arguments passed to other methods} \item{vals}{a list of R objects} } \value{ a stack object } \description{ A stack implementation using a \code{filehash} database } \details{ Objects can be created by calls of the form \code{new("stack", ...)} or by calling \code{createS}. Existing stacks can be initialized with \code{initS}. } \section{Methods (by generic)}{ \itemize{ \item \code{show(stack)}: Print a stack object. \item \code{push(stack)}: Push an object on to the stack \item \code{mpush(stack)}: Push a list of R objects on to the stack \item \code{isEmpty(stack)}: Indicate whether the stack is empty or not \item \code{top(stack)}: Return the top element of the stack \item \code{pop(stack)}: Return the top element of the stack and remove that element from the stack }} \section{Functions}{ \itemize{ \item \code{createS()}: Create a filehash stack \item \code{initS()}: Initialize an existing filehash stack \item \code{mpush()}: Push multiple R objects on to a stack }} \section{Slots}{ \describe{ \item{\code{stack}}{Object of class \code{"filehashDB1"}} \item{\code{name}}{Object of class \code{"character"}: the name of the stack (default is the file name in which the stack data are stored)} }} filehash/man/filehashFormats.Rd0000644000176200001440000000115714370503521016222 0ustar liggesusers% Generated by roxygen2: do not edit by hand % Please edit documentation in R/filehash.R \name{filehashFormats} \alias{filehashFormats} \title{List and register filehash formats} \usage{ filehashFormats(...) } \arguments{ \item{\dots}{list of functions for registering a new database format} } \value{ A list containing information on the available filehash formats } \description{ List and register filehash backend database formats. 
} \details{ \code{filehashFormats} can be used to register new filehash backend database formats. \code{filehashFormats} called with no arguments lists information on available formats } filehash/DESCRIPTION0000644000176200001440000000222614371237002013544 0ustar liggesusersPackage: filehash Version: 2.4-5 Depends: R (>= 3.0.0) Imports: digest, methods Collate: filehash.R filehash-DB1.R filehash-RDS.R coerce.R dump.R hash.R queue.R stack.R zzz.R Title: Simple Key-Value Database Author: Roger D. Peng Maintainer: Roger D. Peng Description: Implements a simple key-value style database where character string keys are associated with data values that are stored on the disk. A simple interface is provided for inserting, retrieving, and deleting data from the database. Utilities are provided that allow 'filehash' databases to be treated much like environments and lists are already used in R. These utilities are provided to encourage interactive and exploratory analysis on large datasets. Three different file formats for representing the database are currently available and new formats can easily be incorporated by third parties for use in the 'filehash' framework. 
License: GPL (>= 2) URL: https://github.com/rdpeng/filehash RoxygenNote: 7.2.3 Encoding: UTF-8 NeedsCompilation: yes Packaged: 2023-02-09 18:18:57 UTC; rp34949 Repository: CRAN Date/Publication: 2023-02-09 18:40:02 UTC filehash/build/0000755000176200001440000000000014371234421013135 5ustar liggesusersfilehash/build/vignette.rds0000644000176200001440000000032114371234421015470 0ustar liggesusersfilehash/tests/0000755000176200001440000000000013405721240013175 5ustar liggesusersfilehash/tests/versions.Rout.save0000644000176200001440000000406414207505421016663 0ustar liggesusers R version 4.1.2 Patched (2022-01-20 r81529) -- "Bird Hippie" Copyright (C) 2022 The R Foundation for Statistical Computing Platform: x86_64-apple-darwin21.2.0 (64-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. 
> ## Test databases > > suppressMessages(library(filehash)) > > testdblist <- dir(pattern = glob2rx("testdb-v*")) > > for(testname in testdblist) { + msg <- sprintf("DATABASE: %s\n", testname) + cat(paste(rep("=", nchar(msg)), collapse = ""), "\n") + cat(msg) + cat(paste(rep("=", nchar(msg)), collapse = ""), "\n") + db <- dbInit(testname, "DB1") + keys <- dbList(db) + print(keys) + + for(k in keys) { + cat("key:", k, "\n") + val <- dbFetch(db, k) + print(val) + cat("\n") + } + } ====================== DATABASE: testdb-v1.1 ====================== [1] "a" "c" "list" "entry" key: a [1] -0.6264538 0.1836433 -0.8356286 1.5952808 0.3295078 -0.8204684 [7] 0.4874291 0.7383247 0.5757814 -0.3053884 key: c [1] 1 key: list [[1]] [1] 1 [[2]] [1] 2 [[3]] [1] 3 [[4]] [1] 4 [[5]] [1] 5 [[6]] [1] 6 [[7]] [1] "a" key: entry [1] "string" ====================== DATABASE: testdb-v2.0 ====================== [1] "a" "c" "list" "entry" key: a [1] -0.6264538 0.1836433 -0.8356286 1.5952808 0.3295078 -0.8204684 [7] 0.4874291 0.7383247 0.5757814 -0.3053884 key: c [1] 1 key: list [[1]] [1] 1 [[2]] [1] 2 [[3]] [1] 3 [[4]] [1] 4 [[5]] [1] 5 [[6]] [1] 6 [[7]] [1] "a" key: entry [1] "string" > > proc.time() user system elapsed 0.171 0.035 0.195 filehash/tests/misc/0000755000176200001440000000000013405721240014130 5ustar liggesusersfilehash/tests/misc/create-testdb.R0000644000176200001440000000053713007762012017006 0ustar liggesuserslibrary(filehash) name <- sprintf("testdb-v%s", packageDescription("filehash", fields = "Version")) dbCreate(name, "DB1") db <- dbInit(name, "DB1") set.seed(1) dbInsert(db, "a", rnorm(10)) dbInsert(db, "b", runif(7)) dbInsert(db, "list", list(1, 2, 3, 4, 5, 6, "a")) dbInsert(db, "c", 1L) dbInsert(db, "entry", "string") dbDelete(db, "b") filehash/tests/reg-tests.Rout.save0000644000176200001440000002324714207505372016741 0ustar liggesusers R version 4.1.2 Patched (2022-01-20 r81529) -- "Bird Hippie" Copyright (C) 2022 The R Foundation for Statistical Computing Platform: 
x86_64-apple-darwin21.2.0 (64-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. > suppressMessages(library(filehash)) > > ###################################################################### > ## Test 'filehashRDS' class > > dbCreate("mydbRDS", "RDS") [1] TRUE > db <- dbInit("mydbRDS", "RDS") > show(db) 'filehashRDS' database 'mydbRDS' > > ## Put some data into it > set.seed(1000) > dbInsert(db, "a", 1:10) > dbInsert(db, "b", rnorm(100)) > dbInsert(db, "c", 100:1) > dbInsert(db, "d", runif(1000)) > dbInsert(db, "other", "hello") > > dbList(db) [1] "a" "b" "c" "d" "other" > > dbExists(db, "e") [1] FALSE > dbExists(db, "a") [1] TRUE > > env <- db2env(db) > ls(env) [1] "a" "b" "c" "d" "other" > > env$a [1] 1 2 3 4 5 6 7 8 9 10 > env$b [1] -0.44577826 -1.20585657 0.04112631 0.63938841 -0.78655436 -0.38548930 [7] -0.47586788 0.71975069 -0.01850562 -1.37311776 -0.98242783 -0.55448870 [13] 0.12138119 -0.12087232 -1.33604105 0.17005748 0.15507872 0.02493187 [19] -2.04658541 0.21315411 2.67007166 -1.22701601 0.83424733 0.53257175 [25] -0.64682496 0.60316126 -1.78384414 0.33494217 0.56097572 1.22093565 [31] -0.21145359 0.69942953 -0.70643668 -0.46515095 -1.76619861 0.18928860 [37] -0.36618068 1.05760118 -0.74162146 -1.34835905 -0.51730643 1.41173570 [43] 0.18546503 -0.04369144 -0.21591338 1.46377535 0.22966664 0.10762363 [49] -1.37810256 -0.96818288 0.25171138 -1.09469370 0.39764284 -0.99630200 [55] 0.10057801 0.95368028 -1.79032293 0.31170122 2.55398801 -0.86083776 [61] 0.54392844 -0.39233804 1.23544190 1.19608644 -0.49574690 -0.29434122 [67] 
-0.57349748 1.61920873 -0.95692767 0.04123712 -1.49831044 0.66095916 [73] 0.28545762 1.38886629 -0.15934361 -0.46091890 0.16843807 1.39549302 [79] 0.72842626 0.33508995 1.16927649 0.24796682 -0.35814947 1.38349332 [85] 0.41206917 -0.12300786 -0.06622931 -2.32249088 -1.04565650 2.05787502 [91] 1.97153237 -1.92099520 0.46212607 -0.16072406 -0.10421153 0.46783940 [97] 0.44392082 0.82855281 -0.38705012 2.01893816 > env$c [1] 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 [19] 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 [37] 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 [55] 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 [73] 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 [91] 10 9 8 7 6 5 4 3 2 1 > str(env$d) num [1:1000] 0.0854 0.3317 0.5647 0.4989 0.4549 ... > env$other [1] "hello" > > env$b <- rnorm(100) > mean(env$b) [1] -0.02208835 > > env$a[1:5] <- 5:1 > print(env$a) [1] 5 4 3 2 1 6 7 8 9 10 > > dbDelete(db, "c") > > tryCatch(print(env$c), error = function(e) cat(as.character(e))) Error in dbFetch(db, key): unable to obtain value for key 'c' > tryCatch(dbFetch(db, "c"), error = function(e) cat(as.character(e))) Error in dbFetch(db, "c"): unable to obtain value for key 'c' > > ## Check trailing '/' problem > dbCreate("testRDSdb", "RDS") [1] TRUE > db <- dbInit("testRDSdb/", "RDS") > print(db) 'filehashRDS' database 'testRDSdb' > > ###################################################################### > ## test filehashDB1 class > > dbCreate("mydb", "DB1") [1] TRUE > db <- dbInit("mydb", "DB1") > > ## Put some data into it > set.seed(1000) > dbInsert(db, "a", 1:10) > dbInsert(db, "b", rnorm(100)) > dbInsert(db, "c", 100:1) > dbInsert(db, "d", runif(1000)) > dbInsert(db, "other", "hello") > > dbList(db) [1] "a" "b" "other" "c" "d" > > env <- db2env(db) > ls(env) [1] "a" "b" "c" "d" "other" > > env$a [1] 1 2 3 4 5 6 7 8 9 10 > env$b [1] -0.44577826 -1.20585657 0.04112631 0.63938841 -0.78655436 -0.38548930 [7] -0.47586788 
0.71975069 -0.01850562 -1.37311776 -0.98242783 -0.55448870 [13] 0.12138119 -0.12087232 -1.33604105 0.17005748 0.15507872 0.02493187 [19] -2.04658541 0.21315411 2.67007166 -1.22701601 0.83424733 0.53257175 [25] -0.64682496 0.60316126 -1.78384414 0.33494217 0.56097572 1.22093565 [31] -0.21145359 0.69942953 -0.70643668 -0.46515095 -1.76619861 0.18928860 [37] -0.36618068 1.05760118 -0.74162146 -1.34835905 -0.51730643 1.41173570 [43] 0.18546503 -0.04369144 -0.21591338 1.46377535 0.22966664 0.10762363 [49] -1.37810256 -0.96818288 0.25171138 -1.09469370 0.39764284 -0.99630200 [55] 0.10057801 0.95368028 -1.79032293 0.31170122 2.55398801 -0.86083776 [61] 0.54392844 -0.39233804 1.23544190 1.19608644 -0.49574690 -0.29434122 [67] -0.57349748 1.61920873 -0.95692767 0.04123712 -1.49831044 0.66095916 [73] 0.28545762 1.38886629 -0.15934361 -0.46091890 0.16843807 1.39549302 [79] 0.72842626 0.33508995 1.16927649 0.24796682 -0.35814947 1.38349332 [85] 0.41206917 -0.12300786 -0.06622931 -2.32249088 -1.04565650 2.05787502 [91] 1.97153237 -1.92099520 0.46212607 -0.16072406 -0.10421153 0.46783940 [97] 0.44392082 0.82855281 -0.38705012 2.01893816 > env$c [1] 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 [19] 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 [37] 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 [55] 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 [73] 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 [91] 10 9 8 7 6 5 4 3 2 1 > str(env$d) num [1:1000] 0.0854 0.3317 0.5647 0.4989 0.4549 ... 
> env$other [1] "hello" > > env$b <- rnorm(100) > mean(env$b) [1] -0.02208835 > > env$a[1:5] <- 5:1 > print(env$a) [1] 5 4 3 2 1 6 7 8 9 10 > > dbDelete(db, "c") > > tryCatch(print(env$c), error = function(e) cat(as.character(e))) Error in readSingleKey(con, map, key): unable to obtain value for key 'c' > tryCatch(dbFetch(db, "c"), error = function(e) cat(as.character(e))) Error in readSingleKey(con, map, key): unable to obtain value for key 'c' > > numbers <- rnorm(100) > dbInsert(db, "numbers", numbers) > b <- dbFetch(db, "numbers") > stopifnot(all.equal(numbers, b)) > stopifnot(identical(numbers, b)) > > ################################################################################ > ## Other tests > > rm(list = ls()) > > > dbCreate("testLoadingDB", "DB1") [1] TRUE > db <- dbInit("testLoadingDB", "DB1") > > set.seed(234) > > db$a <- rnorm(100) > db$b <- runif(1000) > > dbLoad(db) ## 'a', 'b' > summary(a, digits = 4) Min. 1st Qu. Median Mean 3rd Qu. Max. -3.036000 -0.642100 0.172000 0.004131 0.614100 2.107000 > summary(b, digits = 4) Min. 1st Qu. Median Mean 3rd Qu. Max. 0.004583 0.229900 0.478600 0.482200 0.729200 0.999800 > > rm(list = ls()) > db <- dbInit("testLoadingDB", "DB1") > > dbLazyLoad(db) > > summary(a, digits = 4) Min. 1st Qu. Median Mean 3rd Qu. Max. -3.036000 -0.642100 0.172000 0.004131 0.614100 2.107000 > summary(b, digits = 4) Min. 1st Qu. Median Mean 3rd Qu. Max. 
0.004583 0.229900 0.478600 0.482200 0.729200 0.999800 > > > > ################################################################################ > ## Check dbReorganize > > dbCreate("test_reorg", "DB1") [1] TRUE > db <- dbInit("test_reorg", "DB1") > > set.seed(1000) > dbInsert(db, "a", 1) > dbInsert(db, "a", 1) > dbInsert(db, "a", 1) > dbInsert(db, "a", 1) > dbInsert(db, "b", rnorm(1000)) > dbInsert(db, "b", rnorm(1000)) > dbInsert(db, "b", rnorm(1000)) > dbInsert(db, "b", rnorm(1000)) > dbInsert(db, "c", runif(1000)) > dbInsert(db, "c", runif(1000)) > dbInsert(db, "c", runif(1000)) > dbInsert(db, "c", runif(1000)) > > summary(db$b, digits = 4) Min. 1st Qu. Median Mean 3rd Qu. Max. -2.76800 -0.65520 -0.06100 -0.01269 0.65240 3.73900 > summary(db$c, digits = 4) Min. 1st Qu. Median Mean 3rd Qu. Max. 0.0002346 0.2416000 0.4813000 0.4938000 0.7492000 0.9992000 > > print(file.info(db@datafile)$size) [1] 65304 > > dbReorganize(db) Reorganizing database: 33% (1/3)67% (2/3)100% (3/3) Finished; reload database with 'dbInit' [1] TRUE > > db <- dbInit("test_reorg", "DB1") > > print(file.info(db@datafile)$size) [1] 16326 > > summary(db$b, digits = 4) Min. 1st Qu. Median Mean 3rd Qu. Max. -2.76800 -0.65520 -0.06100 -0.01269 0.65240 3.73900 > summary(db$c, digits = 4) Min. 1st Qu. Median Mean 3rd Qu. Max. 
0.0002346 0.2416000 0.4813000 0.4938000 0.7492000 0.9992000 > > > ################################################################################ > ## Taken from the vignette > > file.remove("mydb") [1] TRUE > > dbCreate("mydb") [1] TRUE > db <- dbInit("mydb") > > set.seed(100) > > dbInsert(db, "a", rnorm(100)) > value <- dbFetch(db, "a") > mean(value) [1] 0.002912563 > > dbInsert(db, "b", 123) > dbDelete(db, "a") > dbList(db) [1] "b" > dbExists(db, "a") [1] FALSE > > file.remove("mydb") [1] TRUE > > ################################################################################ > ## Check queue > > db <- createQ("testq") > push(db, 1) > push(db, 2) > top(db) [1] 1 > > pop(db) [1] 1 > top(db) [1] 2 > > proc.time() user system elapsed 0.215 0.064 0.269 filehash/tests/testdb-v1.10000644000176200001440000000132613007762012015072 0ustar liggesusersX  aX  fX  ܲ?ǁx7 // for NULL #include #include SEXP read_key_map(SEXP filename, SEXP map, SEXP filesize, SEXP pos); SEXP lock_file(SEXP filename); static const R_CallMethodDef CallEntries[] = { {"lock_file", (DL_FUNC) &lock_file, 1}, {"read_key_map", (DL_FUNC) &read_key_map, 4}, {NULL, NULL, 0} }; void R_init_filehash(DllInfo *info) { R_registerRoutines(info, NULL, CallEntries, NULL, NULL); R_useDynamicSymbols(info, FALSE); R_forceSymbols(info, TRUE); } filehash/src/lockfile.c0000644000176200001440000000062013007762012014554 0ustar liggesusers#include #include #include #include SEXP lock_file(SEXP filename) { int fd; SEXP status; if(!isString(filename)) error("'filename' should be character"); PROTECT(status = allocVector(INTSXP, 1)); fd = open(CHAR(STRING_ELT(filename, 0)), O_WRONLY | O_CREAT | O_EXCL, 0666); INTEGER(status)[0] = fd; close(fd); UNPROTECT(1); return status; } filehash/src/readKeyMap.c0000644000176200001440000000407213437516654015033 0ustar liggesusers#define NEED_CONNECTION_PSTREAMS #include #include SEXP read_key_map(SEXP filename, SEXP map, SEXP filesize, SEXP pos) { SEXP key, datalen, sym, dpos; FILE 
*fp; int status, len; struct R_inpstream_st in; if(!isEnvironment(map)) error("'map' should be an environment"); if(!isString(filename)) error("'filename' should be character"); PROTECT(filesize = coerceVector(filesize, INTSXP)); PROTECT(pos = coerceVector(pos, INTSXP)); fp = fopen(CHAR(STRING_ELT(filename, 0)), "rb"); if(INTEGER(pos)[0] > 0) { status = fseek(fp, INTEGER(pos)[0], SEEK_SET); if(status < 0) error("problem with initial file pointer seek"); } /* Initialize the incoming R file stream */ R_InitFileInPStream(&in, fp, R_pstream_any_format, NULL, NULL); while(INTEGER(pos)[0] < INTEGER(filesize)[0]) { PROTECT(key = R_Unserialize(&in)); PROTECT(datalen = R_Unserialize(&in)); len = INTEGER(datalen)[0]; /* calculate the position of file pointer */ INTEGER(pos)[0] = ftell(fp); if(len <= 0) { /* key has been deleted; set pos to NULL */ PROTECT(sym = install(CHAR(STRING_ELT(key, 0)))); defineVar(sym, R_NilValue, map); UNPROTECT(3); continue; } /* create a new entry in the key map */ PROTECT(sym = install(CHAR(STRING_ELT(key, 0)))); PROTECT(dpos = duplicate(pos)); defineVar(sym, dpos, map); /* advance to the next key */ status = fseek(fp, len, SEEK_CUR); if(status < 0) { fclose(fp); error("problem with seek"); } INTEGER(pos)[0] = INTEGER(pos)[0] + len; UNPROTECT(4); } UNPROTECT(2); fclose(fp); return map; } filehash/vignettes/0000755000176200001440000000000014371234421014046 5ustar liggesusersfilehash/vignettes/combined.bib0000644000176200001440000000224013007762012016277 0ustar liggesusers @Manual{ templelang:2002, title = {RObjectTables: User-level attach()'able table support}, author = {Duncan {Temple Lang}}, year = {2002}, note = {{R} package version 0.3-1}, url = {http://www.omegahat.org/RObjectTables} } @Article{ rnews:ripley:2004, author = {Brian D. 
Ripley}, title = {Lazy Loading and Packages in {R} 2.0.0}, journal = {R News}, year = 2004, volume = 4, number = 2, pages = {2--4}, month = {September}, url = http, pdf = rnews2004-2 } @TechReport{ cham:1991, author = {John M. Chambers}, title = {Data Management in {S}}, institution = {AT\&T Bell Laboratories Statistics Research}, year = {1991}, number = {99}, month = {December}, note = {http://stat.bell-labs.com/doc/93.15.ps} } @Book{ cham:1998, author = {John M. Chambers}, title = {Programming with Data: A Guide to the {S} Language}, publisher = {Springer}, year = {1998} } @Article{ brahm:2002, author = {David E. Brahm}, title = {Delayed Data Packages}, journal = {R News}, year = 2002, volume = 2, number = 3, pages = {11--12}, month = {December}, url = {http://CRAN.R-project.org/doc/Rnews/} } filehash/vignettes/filehash.Rnw0000644000176200001440000004463314371232451016334 0ustar liggesusers\documentclass{article} %%\VignetteIndexEntry{The filehash Package} %%\VignetteDepends{filehash} \usepackage{charter} \usepackage{courier} \usepackage[noae]{Sweave} \usepackage[margin=1in]{geometry} \usepackage{natbib} \title{Interacting with Data using the \textbf{filehash} Package for R} \author{Roger D. Peng $<$roger.peng@austin.utexas.edu$>$\\\textit{Department of Statistics and Data Sciences}\\\textit{University of Texas, Austin}} \date{} \newcommand{\pkg}{\textbf} \newcommand{\code}{\texttt} \begin{document} \maketitle \begin{abstract} The \pkg{filehash} package for R implements a simple key-value style database where character string keys are associated with data values that are stored on the disk. A simple interface is provided for inserting, retrieving, and deleting data from the database. Utilities are provided that allow \pkg{filehash} databases to be treated much like environments and lists are already used in R. These utilities are provided to encourage interactive and exploratory analysis on large datasets. 
Two different file formats for representing the database are currently available and new formats can easily be incorporated by third parties for use in the \pkg{filehash} framework.
\end{abstract}

<>=
options(width=60)
@

\section{Overview and Motivation}

Working with large datasets in R can be cumbersome because of the need to keep objects in physical memory. While many might generally see that as a feature of the system, the need to keep whole objects in memory creates challenges for those who might want to work interactively with large datasets. Here we take a simple definition of ``large dataset'' to be any dataset that cannot be loaded into R as a single R object because of memory limitations. For example, a very large data frame might be too large for all of the columns and rows to be loaded at once. In such a situation, one might load only a subset of the rows or columns, if that is possible.

In a key-value database, an arbitrary data object (a ``value'') has a ``key'' associated with it, usually a character string. When one requests the value associated with a particular key, it is the database's job to match up the key with the correct value and return the value to the requester.

The most straightforward example of a key-value database in R is the global environment. Every object in R has a name and a value associated with it. When you execute at the R prompt
<>=
x <- 1
print(x)
@
the first line assigns the value 1 to the name/key ``x''. The second line requests the value of ``x'' and prints out 1 to the console. R handles the task of finding the appropriate value for ``x'' by searching through a series of environments, including the namespaces of the packages on the search list.

In most cases, R stores the values associated with keys in memory, so that the value of \code{x} in the example above was stored in and retrieved from physical memory. However, the idea of a key-value database can be generalized beyond this particular configuration.
For example, as of R 2.0.0, much of the R code for R packages is stored in a lazy-loaded database, where the values are initially stored on disk and loaded into memory on first access~\citep{Rnews:Ripley:2004}. Hence, when R starts up, it uses relatively little memory, while the memory usage increases as more objects are requested. Data could also be stored on other computers (e.g. websites) and retrieved over the network. The general S language concept of a database is described in Chapter 5 of the Green Book~\citep{cham:1998} and earlier in~\cite{cham:1991}. Although the S and R languages have different semantics with respect to how variable names are looked up and bound to values, the general concept of using a key-value database applies to both languages. Duncan Temple Lang has implemented this general database framework for R in the \pkg{RObjectTables} package of Omegahat~\citep{TempleLang:2002}. The \pkg{RObjectTables} package provides an interface for connecting R with arbitrary backend systems, allowing data values to be stored in potentially any format or location. While the package itself does not include a specific implementation, some examples are provided on the package's website. The \pkg{filehash} package provides a full read-write implementation of a key-value database for R. The package does not depend on any external packages (beyond those provided in a standard R installation) or software systems and is written entirely in R, making it readily usable on most platforms. The \pkg{filehash} package can be thought of as a specific implementation of the database concept described in~\cite{cham:1991}, taking a slightly different approach to the problem. Both~\cite{TempleLang:2002} and~\cite{cham:1991} focus on generalizing the notion of ``attach()-ing'' a database in an R/S session so that variable names can be looked up automatically via the search list. 
The \pkg{filehash} package represents a database as an instance of an S4 class and operates directly on the S4 object via various methods. Key-value databases are sometimes called hash tables and indeed, the name of the package comes from the idea of having a ``file-based hash table''. With \pkg{filehash} the values are stored in a file on the disk rather than in memory. When a user requests the values associated with a key, \pkg{filehash} finds the object on the disk, loads the value into R and returns it to the user. The package offers two formats for storing data on the disk: The values can be stored (1) concatenated together in a single file or (2) separately as a directory of files. \section{Related R packages} There are other packages on CRAN designed specifically to help users work with large datasets. Two packages that come immediately to mind are the \pkg{g.data} package by David Brahm~\citep{brahm:2002} and the \pkg{biglm} package by Thomas Lumley. The \pkg{g.data} package takes advantage of the lazy evaluation mechanism in R via the \code{delayedAssign} function. Briefly, objects are loaded into R as promises to load the actual data associated with an object name. The first time an object is requested, the promise is evaluated and the data are loaded. From then on, the data reside in memory. The mechanism used in \pkg{g.data} is similar to the one used by the lazy-loaded databases described in~\cite{Rnews:Ripley:2004}. The \pkg{biglm} package allows users to fit linear models on datasets that are too large to fit in memory. However, the \pkg{biglm} package does not provide methods for dealing with large datasets in general. The \pkg{filehash} package also draws inspiration from Luke Tierney's experimental \pkg{gdbm} package which implements a key-value database via the GNU dbm (GDBM) library. The use of GDBM creates an external dependence since the GDBM C library has to be compiled on each system. 
In addition, I encountered a problem where databases created on 32-bit machines could not be transferred to and read on 64-bit machines (and vice versa). However, with the increasing use of 64-bit machines in the future, it seems this problem will eventually go away. The R Special Interest Group on Databases has developed a number of packages that provide an R interface to commonly used relational database management systems (RDBMS) such as MySQL (\pkg{RMySQL}), PostgreSQL (\pkg{RPgSQL}), and Oracle (\pkg{ROracle}). These packages use the S4 classes and generics defined in the \pkg{DBI} package and have the advantage that they offer much better database functionality, inherited via the use of a true database management system. However, this benefit comes with the cost of having to install and use third-party software. While installing an RDBMS may not be an issue---many systems have them pre-installed and the \pkg{RSQLite} package comes bundled with the source for the RDBMS---the need for the RDBMS and knowledge of structured query language (SQL) nevertheless adds some overhead. This overhead may serve as an impediment for users in need of a database for simpler applications. \section{Creating a filehash database} Databases can be created with \pkg{filehash} using the \code{dbCreate} function. The one required argument is the name of the database, which we call here ``mydb''. <>= library(filehash) dbCreate("mydb") db <- dbInit("mydb") @ You can also specify the \code{type} argument which controls how the database is represented on the backend. We will discuss the different backends in further detail later. For now, we use the default backend which is called ``DB1''. Once the database is created, it must be initialized in order to be accessed. The \code{dbInit} function returns an S4 object inheriting from class ``filehash''. Since this is a newly created database, there are no objects in it. 
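
As a minimal sketch of the \code{type} argument, the following creates a database that uses the ``RDS'' backend instead of the default and confirms that a newly created database holds no keys. The database name ``sketchRDS'' and the object name \code{db2} are chosen purely for illustration, and the code is shown verbatim rather than evaluated.

\begin{verbatim}
library(filehash)

## Create and initialize a database with the directory-based backend
dbCreate("sketchRDS", type = "RDS")
db2 <- dbInit("sketchRDS", type = "RDS")

## A newly created database has no keys
length(dbList(db2))

## Remove the illustrative database from the disk
dbUnlink(db2)
\end{verbatim}

The same \code{type} value should be given to both \code{dbCreate} and \code{dbInit}, since \code{dbInit} needs to know how the database is represented on the disk.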
\section{Accessing a filehash database}

<>=
set.seed(100)
@

The primary interface to filehash databases consists of the functions \code{dbFetch}, \code{dbInsert}, \code{dbExists}, \code{dbList}, and \code{dbDelete}. These functions are all generic---specific methods exist for each type of database backend. They all take as their first argument an object of class ``filehash''. To insert some data into the database we can simply call \code{dbInsert}.
<>=
dbInsert(db, "a", rnorm(100))
@
Here we have associated with the key ``a'' 100 standard normal random variates. We can retrieve those values with \code{dbFetch}.
<>=
value <- dbFetch(db, "a")
mean(value)
@

The function \code{dbList} lists all of the keys that are available in the database, \code{dbExists} tests to see if a given key is in the database, and \code{dbDelete} deletes a key-value pair from the database.
<>=
dbInsert(db, "b", 123)
dbDelete(db, "a")
dbList(db)
dbExists(db, "a")
@

While using functions like \code{dbInsert} and \code{dbFetch} is straightforward, it can often be easier on the fingers to use standard R subset and accessor functions like \code{\$}, \code{[[}, and \code{[}. Filehash databases have methods for these functions so that objects can be accessed in a more compact manner. Similarly, replacement methods for these functions are also available. The \verb+[+ function can be used to access multiple objects from the database, in which case a list is returned.
<>=
db$a <- rnorm(100, 1)
mean(db$a)
mean(db[["a"]])
db$b <- rnorm(100, 2)
dbList(db)
@
For all of the accessor functions, only character indices are allowed. Numeric indices are caught and an error is given.
<>=
e <- local({
    err <- function(e) e
    tryCatch(db[[1]], error = err)
})
conditionMessage(e)
@

Finally, there is a method for the \code{with} generic function which operates much like using \code{with} on lists or environments.

The following three statements all return the same value.
<>= with(db, c(a = mean(a), b = mean(b))) @ When using \code{with}, the values of ``a'' and ``b'' are looked up in the database. <>= sapply(db[c("a", "b")], mean) @ Here, using \code{[} on \code{db} returns a list with the values associated with ``a'' and ``b''. Then \code{sapply} is applied in the usual way on the returned list. <>= unlist(lapply(db, mean)) @ In the last statement we call \code{lapply} directly on the ``filehash'' object. The \pkg{filehash} package defines a method for \code{lapply} that allows the user to apply a function on all the elements of a database directly. The method essentially loops through all the keys in the database, loads each object separately and applies the supplied function to each object. \code{lapply} returns a named list with each element being the result of applying the supplied function to an object in the database. There is an argument \code{keep.names} to the \code{lapply} method which, if set to \code{FALSE}, will drop all the names from the list. <>= dbUnlink(db) rm(list = ls(all = TRUE)) @ \section{Loading filehash databases} <>= set.seed(200) @ An alternative way of working with a filehash database is to load it into an environment and access the element names directly, without having to use any of the accessor functions. The \pkg{filehash} function \code{dbLoad} works much like the standard R \code{load} function except that \code{dbLoad} loads active bindings into a given environment rather than the actual data. The active bindings are created via the \code{makeActiveBinding} function in the \pkg{base} package. \code{dbLoad} takes a filehash database and creates symbols in an environment corresponding to the keys in the database. It then calls \code{makeActiveBinding} to associate with each key a function which loads the data associated with a given key. Conceptually, active bindings are like pointers to the database. 
After calling \code{dbLoad}, anytime an object with an active binding is accessed, the associated function (installed by \code{makeActiveBinding}) loads the data from the database.

We can create a simple database to demonstrate the active binding mechanism.
<>=
dbCreate("testDB")
db <- dbInit("testDB")
db$x <- rnorm(100)
db$y <- runif(100)
db$a <- letters
dbLoad(db)
ls()
@
Notice that we appear to have some additional objects in our workspace. However, the values of these objects are not stored in memory---they are stored in the database. When one of the objects is accessed, the value is automatically loaded from the database.
<>=
mean(y)
sort(a)
@
If I assign a different value to one of these objects, its associated value is updated in the database via the active binding mechanism.
<>=
y <- rnorm(100, 2)
mean(y)
@
If I subsequently clear the workspace and reload the database later, the updated value for ``y'' persists.
<>=
rm(list = ls())
db <- dbInit("testDB")
dbLoad(db)
ls()
mean(y)
@

Perhaps one disadvantage of the active binding approach taken here is that whenever an object is accessed, the data must be reloaded into R. This behavior is distinctly different from the delayed assignment approach taken in \pkg{g.data}, where an object need only be loaded once and subsequently resides in memory. However, when using delayed assignments, if one cycles through all of the objects in the database, one could eventually exhaust the available memory.

<>=
dbUnlink(db)
rm(list = ls(all = TRUE))
@

\section{Other filehash utilities}

There are a few other utilities included with the \pkg{filehash} package. Two of the utilities, \code{dumpObjects} and \code{dumpImage}, are analogues of \code{save} and \code{save.image}. Rather than save objects to an R workspace, \code{dumpObjects} saves the given objects to a ``filehash'' database so that in the future, individual objects can be reloaded if desired. Similarly, \code{dumpImage} saves the entire workspace to a ``filehash'' database.
The function \code{dumpList} takes a list and creates a ``filehash'' database with values from the list. The list must have a non-empty name for every element in order for \code{dumpList} to succeed. \code{dumpDF} creates a ``filehash'' database from a data frame where each column of the data frame is an element in the database. Essentially, \code{dumpDF} converts the data frame to a list and calls \code{dumpList}.

\section{Filehash database backends}

Currently, the \pkg{filehash} package can represent databases in two different formats. The default format is called ``DB1'' and it stores the keys and values in a single file. From experience, this format works well overall but can be a little slow to initialize when there are many thousands of keys. Briefly, the ``filehash'' object in R stores a map which associates keys with a byte location in the database file where the corresponding value is stored. Given the byte location, we can \code{seek} to that location in the file and read the data directly. Before reading in the data, a check is made to make sure that the map is up to date. This format depends critically on having a working \code{ftell} at the system level and a crude check is made when trying to initialize a database of this format.

The second format is called ``RDS'' and it stores objects as separate files on the disk in a directory with the same name as the database. This format is the most straightforward and simple of the available formats. When a request is made for a specific key, \pkg{filehash} finds the appropriate file in the directory and reads the file into R. The only catch is that on operating systems that use case-insensitive file names, objects whose names differ only in case will collide on the filesystem. To work around this, object names with capital letters are stored with mangled names on the disk. An advantage of this format is that most of the organizational work is delegated to the filesystem.
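
The difference between the two formats can be seen directly on the disk. The following is only a sketch: the database names ``demoDB1'' and ``demoRDS'' and the keys ``Alpha'' and ``alpha'' are hypothetical, the code is shown verbatim rather than evaluated, and the exact file names stored by the ``RDS'' backend are an implementation detail that may differ between versions.

\begin{verbatim}
library(filehash)

## One database of each format (illustrative names)
dbCreate("demoDB1", type = "DB1")
dbCreate("demoRDS", type = "RDS")

## "DB1" is a single file; "RDS" is a directory
file.info("demoDB1")$isdir
file.info("demoRDS")$isdir

## Two keys that differ only in case
rds <- dbInit("demoRDS", type = "RDS")
rds$Alpha <- 1
rds$alpha <- 2

## Keys as seen by filehash versus file names on the disk
dbList(rds)
list.files("demoRDS")

## Clean up the illustrative databases
dbUnlink(rds)
file.remove("demoDB1")
\end{verbatim}

Comparing the output of \code{dbList} with that of \code{list.files} should show how keys containing capital letters are stored under mangled file names, so that both keys survive on a case-insensitive filesystem.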
\section{Extending filehash}

The \pkg{filehash} package has a mechanism for developing new backend formats, should the need arise. The function \code{registerFormatDB} can be used to make \pkg{filehash} aware of a new database format that may be implemented in a separate R package or a file. \code{registerFormatDB} takes two arguments: a \code{name} for the new format (like ``DB1'' or ``RDS'') and a list of functions. The list should contain two functions: one function named ``create'' for creating a database, given the database name, and another function named ``initialize'' for initializing the database. In addition, one needs to define methods for \code{dbInsert}, \code{dbFetch}, etc.

A list of available backend formats can be obtained via the \code{filehashFormats} function. Upon registering a new backend format, the new format will be listed when \code{filehashFormats} is called.

The interface for registering new backend formats is still experimental and could change in the future.

\section{Discussion}

The \pkg{filehash} package has been designed to be useful in both a programming setting and an interactive setting. Its main purpose is to allow for simpler interaction with large datasets where simultaneous access to the full dataset is not needed. While the package may not be optimal for all settings, one goal was to write a simple package in pure R that users could install with minimal overhead. In the future I hope to add functionality for interacting with databases stored on remote computers and perhaps incorporate a ``real'' database backend. Some work has already begun on developing a backend based on the \pkg{RSQLite} package.
\bibliographystyle{alpha} \bibliography{combined} \end{document} filehash/R/0000755000176200001440000000000014363026642012244 5ustar liggesusersfilehash/R/queue.R0000644000176200001440000001241314367255465013527 0ustar liggesusers########################################################################## ## Copyright (C) 2006-2023, Roger D. Peng ## ## This program is free software; you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation; either version 2 of the License, or ## (at your option) any later version. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with this program; if not, write to the Free Software ## Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA ## 02110-1301, USA ########################################################################## #' A Queue Class #' #' A queue implementation using a \code{filehash} database #' #' @details Objects can be created by calls of the form \code{new("queue", ...)} or by calling \code{createQ}. Existing queues can be initialized with \code{initQ}. 
#'
#' @slot queue Object of class \code{"filehashDB1"}
#' @slot name Object of class \code{"character"}: the name of the queue (default is the file name in which the queue data are stored)
#' @exportClass queue
setClass("queue",
         representation(queue = "filehashDB1",
                        name = "character")
         )

#' @param filename name of queue file
#'
#' @return \code{createQ} and \code{initQ} return a \code{queue} object
#' @export
#' @describeIn queue Create a file-based queue object
createQ <- function(filename) {
        dbCreate(filename, "DB1")
        queue <- dbInit(filename, "DB1")
        dbInsert(queue, "head", NULL)
        dbInsert(queue, "tail", NULL)
        new("queue", queue = queue, name = filename)
}

#' @export
#' @describeIn queue Initialize an existing queue object
initQ <- function(filename) {
        new("queue",
            queue = dbInit(filename, "DB1"),
            name = filename)
}

## Public

#' @describeIn queue Return (and remove) the top element of a queue
setGeneric("pop", function(db, ...) standardGeneric("pop"))

#' @describeIn queue Push an R object on to the tail of a queue
setGeneric("push", function(db, val, ...) standardGeneric("push"))

#' @describeIn queue Check if a queue is empty or not
setGeneric("isEmpty", function(db, ...) standardGeneric("isEmpty"))

#' @describeIn queue Return the top of the queue
setGeneric("top", function(db, ...) standardGeneric("top"))

#' @exportMethod show
#' @describeIn queue Print a queue object
#' @param object a queue object
setMethod("show", "queue",
          function(object) {
                  cat(gettextf("\n", object@name))
                  invisible(object)
          })

################################################################################
## Methods

setMethod("lockFile", "queue", function(db, ...) {
        paste(db@name, "qlock", sep = ".")
})

#' @exportMethod push
#' @describeIn queue adds an element to the tail ("bottom") of the queue
#' @param db a queue object
#' @param val an R object to be added to the tail of the queue
#' @param ... arguments passed to other methods
setMethod("push", c("queue", "ANY"), function(db, val, ...)
          {
                  ## Create a new tail node
                  node <- list(value = val,
                               nextkey = NULL)
                  key <- sha1(node)

                  createLockFile(lockFile(db))
                  on.exit(deleteLockFile(lockFile(db)))

                  if(isEmpty(db))
                          dbInsert(db@queue, "head", key)
                  else {
                          ## Convert tail node to regular node
                          tailkey <- dbFetch(db@queue, "tail")
                          oldtail <- dbFetch(db@queue, tailkey)
                          oldtail$nextkey <- key
                          dbInsert(db@queue, tailkey, oldtail)
                  }
                  ## Insert new node and point tail to new node
                  dbInsert(db@queue, key, node)
                  dbInsert(db@queue, "tail", key)
          })

#' @exportMethod isEmpty
#' @describeIn queue returns \code{TRUE}/\code{FALSE} depending on whether there are elements in the queue.
setMethod("isEmpty", "queue",
          function(db) {
                  is.null(dbFetch(db@queue, "head"))
          })

#' @exportMethod top
#' @describeIn queue returns the value of the "top" (i.e. head) of the queue; an error is signaled if the queue is empty
setMethod("top", "queue",
          function(db, ...) {
                  createLockFile(lockFile(db))
                  on.exit(deleteLockFile(lockFile(db)))

                  if(isEmpty(db))
                          stop("queue is empty")
                  h <- dbFetch(db@queue, "head")
                  node <- dbFetch(db@queue, h)
                  node$value
          })

#' @exportMethod pop
#' @describeIn queue returns the value of the "top" (i.e. head) of the queue and subsequently removes that element from the queue; an error is signaled if the queue is empty
#' @param db a queue object
setMethod("pop", "queue",
          function(db, ...) {
                  createLockFile(lockFile(db))
                  on.exit(deleteLockFile(lockFile(db)))

                  if(isEmpty(db))
                          stop("queue is empty")
                  h <- dbFetch(db@queue, "head")
                  node <- dbFetch(db@queue, h)
                  dbInsert(db@queue, "head", node$nextkey)
                  dbDelete(db@queue, h)
                  node$value
          })
filehash/R/zzz.R0000644000176200001440000000333114363027424013223 0ustar liggesusers
##########################################################################
## Copyright (C) 2006-2023, Roger D. Peng
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
## 02110-1301, USA
##########################################################################

.onLoad <- function(lib, pkg) {
        assign("defaultType", "DB1", .filehashOptions)

        for(type in c("DB1", "RDS")) {
                cname <- paste("create", type, sep = "")
                iname <- paste("initialize", type, sep = "")
                r <- list(create = get(cname, mode = "function"),
                          initialize = get(iname, mode="function"))
                assign(type, r, .filehashFormats)
        }
}

.onAttach <- function(lib, pkg) {
        dcf <- read.dcf(file.path(lib, pkg, "DESCRIPTION"))
        msg <- gettextf("%s: %s (%s)", dcf[, "Package"], dcf[, "Title"],
                        as.character(dcf[, "Version"]))
        packageStartupMessage(paste(strwrap(msg), collapse = "\n"))
}

.filehashOptions <- new.env()

.filehashFormats <- new.env()
filehash/R/filehash-RDS.R0000644000176200001440000002041014367261154014600 0ustar liggesusers
##########################################################################
## Copyright (C) 2006-2023, Roger D. Peng
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
## 02110-1301, USA
##########################################################################

################################################################################
## Class 'filehashRDS'

#' Filehash RDS Class
#'
#' An implementation of filehash databases using directories and separate files
#'
#' @exportClass filehashRDS
#' @slot dir Directory where files are stored (filehashRDS only)
setClass("filehashRDS",
         representation(dir = "character"),
         contains = "filehash"
         )

setValidity("filehashRDS",
            function(object) {
                    if(length(object@dir) != 1)
                            return("only one directory should be set in 'dir'")
                    if(!file.exists(object@dir))
                            return(gettextf("directory '%s' does not exist",
                                            object@dir))
                    TRUE
            })

createRDS <- function(dbName) {
        if(!file.exists(dbName)) {
                status <- dir.create(dbName)

                if(!status)
                        stop(gettextf("unable to create database directory '%s'",
                                      dbName))
        }
        else
                message(gettextf("database '%s' already exists", dbName))
        TRUE
}

initializeRDS <- function(dbName) {
        ## Trailing '/' causes a problem in Windows?
        dbName <- sub("/$", "", dbName, perl = TRUE)
        new("filehashRDS", dir = normalizePath(dbName),
            name = basename(dbName))
}

## On case-insensitive file systems, objects whose names differ only
## by capitalization might get clobbered. `mangleName()' inserts a
## "@" before each capital letter and `unMangleName()' reverses the
## operation.

mangleName <- function(oname) {
        if(any(grep("@",oname,fixed=TRUE)))
                stop("RDS format cannot cope with objects with @ characters",
                     " in their names")
        gsub("([A-Z])", "@\\1", oname, perl = TRUE)
}

unMangleName <- function(mname) {
        gsub("@", "", mname, fixed = TRUE)
}

## Function for mapping a key to a path on the filesystem
setGeneric("objectFile", function(db, key) standardGeneric("objectFile"))
setMethod("objectFile", signature(db = "filehashRDS", key = "character"),
          function(db, key) {
                  file.path(db@dir, mangleName(key))
          })

################################################################################
## Interface functions

#' @describeIn filehashRDS Insert an R object into a filehashRDS database
#' @exportMethod dbInsert
#' @param db a filehashRDS object
#' @param key character, the name of an R object
#' @param value an R object
#' @param ... arguments passed to other methods
#' @param safe Should the operation be done safely?
#' @details When \code{safe = TRUE} in \code{dbInsert}, objects are written to a temp file before replacing any existing objects. This way, if the operation is interrupted, the original data are not corrupted.
setMethod("dbInsert",
          signature(db = "filehashRDS", key = "character", value = "ANY"),
          function(db, key, value, safe = TRUE, ...)
          {
                  writefile <- if(safe)
                          tempfile()
                  else
                          objectFile(db, key)
                  con <- gzfile(writefile, "wb")

                  writestatus <- tryCatch({
                          serialize(value, con)
                  }, condition = function(cond) {
                          cond
                  }, finally = {
                          close(con)
                  })
                  if(inherits(writestatus, "condition"))
                          stop(gettextf("unable to write object '%s'", key))
                  if(!safe)
                          return(invisible(!inherits(writestatus, "condition")))

                  cpstatus <- file.copy(writefile, objectFile(db, key),
                                        overwrite = TRUE)

                  if(!cpstatus)
                          stop(gettextf("unable to insert object '%s'", key))
                  else {
                          rmstatus <- file.remove(writefile)

                          if(!rmstatus)
                                  warning("unable to remove temporary file")
                  }
                  invisible(cpstatus)
          })

#' @exportMethod dbFetch
#' @describeIn filehashRDS Retrieve a value from a filehashRDS database
setMethod("dbFetch", signature(db = "filehashRDS", key = "character"),
          function(db, key, ...) {
                  ## Create filename from key
                  ofile <- objectFile(db, key)

                  ## Open connection
                  val <- tryCatch({
                          con<-gzfile(ofile)
                          # note it is necessary to split creating and opening
                          # the connection into two steps so that the connection
                          # can be closed/destroyed successfully if ofile does
                          # not exist (avoiding connection leaks).
                          open(con,"rb")
                          ## Read data
                          unserialize(con)
                  }, condition = function(cond) {
                          cond
                  }, finally = {
                          close(con)
                  })
                  if(inherits(val, "condition"))
                          stop(gettextf("unable to obtain value for key '%s'",
                                        key))
                  val
          })

#' @exportMethod dbMultiFetch
#' @describeIn filehashRDS Retrieve multiple objects from a filehashRDS database
#' @details For \code{dbMultiFetch}, \code{key} is a character vector of keys.
setMethod("dbMultiFetch",
          signature(db = "filehashRDS", key = "character"),
          function(db, key, ...) {
                  r <- lapply(key, function(k) dbFetch(db, k))
                  names(r) <- key
                  r
          })

#' @exportMethod dbExists
#' @describeIn filehashRDS Determine if a key exists in a filehashRDS database
setMethod("dbExists", signature(db = "filehashRDS", key = "character"),
          function(db, key, ...)
          {
                  key %in% dbList(db)
          })

#' @exportMethod dbList
#' @describeIn filehashRDS Return a character vector of all keys stored in a database
setMethod("dbList", "filehashRDS",
          function(db, ...) {
                  ## list all keys/files in the database
                  fileList <- dir(db@dir, all.files = TRUE, full.names = TRUE)
                  use <- !file.info(fileList)$isdir
                  fileList <- basename(fileList[use])

                  unMangleName(fileList)
          })

#' @exportMethod dbDelete
#' @describeIn filehashRDS Delete a key and its corresponding object from a filehashRDS database
setMethod("dbDelete", signature(db = "filehashRDS", key = "character"),
          function(db, key, ...) {
                  ofile <- objectFile(db, key)

                  ## remove/delete the file
                  status <- file.remove(ofile)
                  invisible(isTRUE(all(status)))
          })

#' @exportMethod dbUnlink
#' @describeIn filehashRDS Delete an entire filehashRDS database
setMethod("dbUnlink", "filehashRDS",
          function(db, ...) {
                  ## delete the entire database directory
                  d <- db@dir
                  status <- unlink(d, recursive = TRUE)
                  invisible(status)
          })
filehash/R/dump.R0000644000176200001440000001036514367023604013340 0ustar liggesusers
##########################################################################
## Copyright (C) 2006-2023, Roger D. Peng
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
## 02110-1301, USA
##########################################################################

#' Dump Environment
#'
#' Dump an environment to a filehash database
#'
#' @param env an environment
#' @param dbName character, name of the filehash database
#' @param list character vector of object names to be dumped
#' @param data a data frame
#'
#' @details The \code{dumpEnv} function takes an environment and stores each element of the environment in a \code{filehash} database. Objects dumped to a database can later be loaded via \code{dbLoad} or can be accessed with \code{dbFetch}, \code{dbList}, etc. Alternatively, the \code{with} method can be used to evaluate code in the context of a database. If a database with name \code{dbName} already exists, objects will be inserted into the existing database (and values for already-existing keys will be overwritten).
#'
#' @details \code{dumpDF} is different in that each variable in the data frame is stored as a separate object in the database. So each variable can be read from the database separately rather than having to load the entire data frame into memory. \code{dumpList} works in a similar way.
#'
#' @return An object of class \code{"filehash"} is returned and a database is created.
#'
#' @aliases dumpImage dumpObjects dumpDF dumpList
#' @name dumpEnv
#'
#' @export
dumpEnv <- function(env, dbName) {
        keys <- ls(env, all.names = TRUE)
        dumpObjects(list = keys, dbName = dbName, envir = env)
}

#' @export
#' @describeIn dumpEnv Dump the Global Environment (analogous to \code{save.image})
#' @param type type of filehash database to create
dumpImage <- function(dbName = "Rworkspace", type = NULL) {
        dumpObjects(list = ls(envir = globalenv(), all.names = TRUE),
                    dbName = dbName, type = type, envir = globalenv())
}

#' @export
#' @describeIn dumpEnv Dump named objects to a filehash database (analogous to \code{save})
#' @param ... R objects to be dumped to a filehash database
#' @param envir environment from which objects are dumped
dumpObjects <- function(..., list = character(0), dbName, type = NULL,
                        envir = parent.frame()) {
        names <- as.character(substitute(list(...)))[-1]
        list <- c(list, names)
        if(!dbCreate(dbName, type))
                stop("could not create database file")
        db <- dbInit(dbName, type)

        for(i in seq_along(list))
                dbInsert(db, list[i], get(list[i], envir))
        db
}

#' @export
#' @describeIn dumpEnv Dump data frame columns to a filehash database
dumpDF <- function(data, dbName = NULL, type = NULL) {
        if(is.null(dbName))
                dbName <- as.character(substitute(data))
        dumpList(as.list(data), dbName = dbName, type = type)
}

#' @export
#' @describeIn dumpEnv Dump elements of a list to a filehash database
dumpList <- function(data, dbName = NULL, type = NULL) {
        if(!is.list(data))
                stop("'data' must be a list")
        vnames <- names(data)

        if(is.null(vnames) || isTRUE("" %in% vnames))
                stop("list must have non-empty names")
        if(is.null(dbName))
                dbName <- as.character(substitute(data))

        if(!dbCreate(dbName, type))
                stop("could not create database file")
        db <- dbInit(dbName, type)

        for(i in seq_along(vnames))
                dbInsert(db, vnames[i], data[[vnames[i]]])
        db
}
filehash/R/coerce.R0000644000176200001440000000463114367267263013645 0ustar liggesusers
##########################################################################
## Copyright (C) 2006-2023, Roger D. Peng
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
## 02110-1301, USA
##########################################################################

toDBType <- function(from, type, dbpath = NULL) {
        if(is.null(dbpath))
                dbpath <- dbName(from)
        if(!dbCreate(dbpath, type = type))
                stop("could not create ", type, " database")
        db <- dbInit(dbpath, type = type)
        keys <- dbList(from)

        for(key in keys)
                dbInsert(db, key, dbFetch(from, key))
        invisible(db)
}

#' Coerce a filehash database
#'
#' Coerce a filehashDB1 database to filehashRDS format
#'
#' @name coerceDB1RDS
#' @param from a filehashDB1 database object
#' @exportMethod coerce
#' @aliases coerce,filehashDB1,filehashRDS-method
setAs("filehashDB1", "filehashRDS",
      function(from) {
              dbpath <- paste(dbName(from), "RDS", sep = "")
              toDBType(from, "RDS", dbpath)
      })

#' Coerce a filehash database
#'
#' Coerce a filehashDB1 database to a list object
#'
#' @name coerceDB1list
#' @param from a filehashDB1 database object
#' @exportMethod coerce
#' @aliases coerce,filehashDB1,list-method
setAs("filehashDB1", "list",
      function(from) {
              keys <- dbList(from)
              dbMultiFetch(from, keys)
      })

#' Coerce a filehash database
#'
#' Coerce a filehash database to a list object
#'
#' @name coercelist
#' @param from a filehash database object
#' @exportMethod coerce
#' @aliases coerce,filehash,list-method
setAs("filehash", "list",
      function(from) {
              env <- new.env(hash = TRUE)
              dbLoad(from, env)
              as.list(env, all.names = TRUE)
      })
filehash/R/filehash-DB1.R0000644000176200001440000003605214367261241014524 0ustar liggesusers
##########################################################################
## Copyright (C) 2006-2023, Roger D. Peng
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
## 02110-1301, USA
##########################################################################

######################################################################
## Class 'filehashDB1'

## Database entries
##
## File format: [key]        [nbytes data] [data]
##              serialized   serialized    raw bytes (serialized)
##

######################################################################

## 'meta' is a list of functions for updating the file size of the
## database and the file map.

#' Filehash DB1 Class
#'
#' An implementation of filehash databases using a single large file
#'
#' @exportClass filehashDB1
#' @slot datafile full path to the database file (filehashDB1 only)
#' @slot meta list containing an environment for database metadata (filehashDB1 only)
setClass("filehashDB1",
         representation(datafile = "character",
                        meta = "list"),
         contains = "filehash"
         )

setValidity("filehashDB1",
            function(object) {
                    ## Refer to the slot explicitly; a bare 'datafile' is
                    ## not defined inside this function
                    if(!file.exists(object@datafile))
                            return(gettextf("datafile '%s' does not exist",
                                            object@datafile))
                    TRUE
            })

createDB1 <- function(dbName) {
        if(!hasWorkingFtell())
                stop("need working 'ftell()' to use 'DB1' format")
        if(file.exists(dbName)) {
                message(gettextf("database '%s' already exists", dbName))
                return(TRUE)
        }
        status <- file.create(dbName)

        if(!status)
                stop(gettextf("unable to create database file '%s'", dbName))
        TRUE
}

makeMetaEnv <- function(filename) {
        dbmap <- NULL  ## 'NULL' indicates the map needs to be read
        dbfilesize <- file.info(filename)$size

        updatesize <- function(size) {
                dbfilesize <<- size
        }
        updatemap <- function(map) {
                dbmap <<- map
        }
        getsize <- function() {
                dbfilesize
        }
        getmap <- function() {
                dbmap
        }
        list(updatesize = updatesize,
             updatemap = updatemap,
             getmap = getmap,
             getsize = getsize)
}

#' @importFrom methods new
initializeDB1 <- function(dbName) {
        if(!hasWorkingFtell())
                stop("need working 'ftell()' to use DB1 format")
        dbName <- normalizePath(dbName)

        new("filehashDB1",
            datafile = dbName,
            meta = makeMetaEnv(dbName),
            name = basename(dbName)
            )
}

readKeyMap <- function(con, map = NULL, pos = 0) {
        if(is.null(map)) {
                ## using 'hash = TRUE' is critical because it can have a major
                ## impact on performance for large databases
                map <- new.env(hash = TRUE, parent = emptyenv())
                pos <- 0
        }
        if(pos < 0)
                stop("'pos' cannot be negative")
        filename <- path.expand(summary(con)$description)
        filesize <- file.info(filename)$size

        if(pos > filesize)
                stop("'pos' cannot be greater than file size")
        .Call(C_read_key_map, filename, map, filesize, pos)
}
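
## ----------------------------------------------------------------------
## Illustrative sketch only, not part of the package API.  The record
## layout described at the top of this file ([key] [nbytes data] [data])
## can be walked in plain R as shown below.  The helper name
## 'sketchReadKeyMap' is hypothetical, and the logic is an assumption
## about what the compiled 'C_read_key_map' routine computes, inferred
## from how 'writeKeyValue()', 'writeNullKeyValue()' and
## 'readSingleKey()' use the file; the C source is not shown here.
## The result maps each key to the byte offset of its data, with
## deleted keys (byte count of -1) mapped to NULL.
sketchReadKeyMap <- function(filename) {
        map <- new.env(hash = TRUE, parent = emptyenv())
        filesize <- file.info(filename)$size
        con <- file(filename, "rb")
        on.exit(close(con))

        while(seek(con, rw = "read") < filesize) {
                key <- unserialize(con)  ## serialized key
                len <- unserialize(con)  ## serialized byte count

                if(len < 0)
                        ## Deletion marker: no data follow this record
                        assign(key, NULL, envir = map)
                else {
                        ## Data start at the current offset; record the
                        ## offset and skip over the data bytes
                        assign(key, seek(con, rw = "read"), envir = map)
                        seek(con, len, origin = "current", rw = "read")
                }
        }
        map
}
## Usage (hypothetical): map <- sketchReadKeyMap(db@datafile), after
## which 'readSingleKey(con, map, key)' could seek to 'map[[key]]'.
## Later records for the same key overwrite earlier ones, which is why
## appending a new value or a deletion marker is enough to update or
## delete a key.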

readSingleKey <- function(con, map, key) {
        start <- map[[key]]

        if(is.null(start))
                stop(gettextf("unable to obtain value for key '%s'", key))

        seek(con, start, rw = "read")
        unserialize(con)
}

readKeys <- function(con, map, keys) {
        r <- lapply(keys, function(key) readSingleKey(con, map, key))
        names(r) <- keys
        r
}

gotoEndPos <- function(con) {
        ## Move connection to the end
        seek(con, 0, "end")
        seek(con)
}

writeNullKeyValue <- function(con, key) {
        writestart <- gotoEndPos(con)

        handler <- function(cond) {
                ## Rewind the file back to where writing began and truncate at
                ## that position
                seek(con, writestart, "start", "write")
                truncate(con)
                cond
        }
        tryCatch({
                serialize(key, con)

                len <- as.integer(-1)
                serialize(len, con)
        }, interrupt = handler, error = handler, finally = {
                flush(con)
        })
}

writeKeyValue <- function(con, key, value) {
        writestart <- gotoEndPos(con)

        handler <- function(cond) {
                ## Rewind the file back to where writing began and
                ## truncate at that position; this is probably a bad
                ## idea for files > 2GB
                seek(con, writestart, "start", "write")
                truncate(con)
                cond
        }
        tryCatch({
                serialize(key, con)

                byteData <- serialize(value, NULL)
                len <- length(byteData)
                serialize(len, con)
                writeBin(byteData, con)
        }, interrupt = handler, error = handler, finally = {
                flush(con)
        })
}

setMethod("lockFile", "file",
          function(db, ...) {
                  ## Use 3 underscores for lock file
                  sprintf("%s___LOCK", summary(db)$description)
          })

createLockFile <- function(name) {
        if(.Platform$OS.type != "windows")
                status <- .Call(C_lock_file, name)
        else {
                ## TODO: are these optimal values for max.attempts
                ## and sleep.duration?
                max.attempts <- 4
                sleep.duration <- 0.5
                attempts <- 0
                status <- -1

                while ((attempts <= max.attempts) && ! isTRUE(status >= 0)) {
                        attempts <- attempts + 1
                        status <- .Call(C_lock_file, name)

                        if(!isTRUE(status >= 0))
                                Sys.sleep(sleep.duration)
                }
        }
        if(!isTRUE(status >= 0))
                stop("cannot create lock file ", sQuote(name))
        TRUE
}

deleteLockFile <- function(name) {
        if(!file.remove(name))
                stop(paste('cannot remove lock file "', name, '"', sep=''))
        TRUE
}

################################################################################
## Internal utilities

filesize <- gotoEndPos

setGeneric("checkMap", function(db, ...) standardGeneric("checkMap"))

setMethod("checkMap", "filehashDB1",
          function(db, filecon, ...) {
                  old.size <- db@meta$getsize()
                  cur.size <- tryCatch({
                          filesize(filecon)
                  }, error = function(err) {
                          old.size
                  })
                  size.change <- old.size != cur.size
                  map <- getMap(db)
                  map0 <- map

                  if(is.null(map))
                          map <- readKeyMap(filecon)
                  else if(size.change) {
                          ## Modify 'map.old' directly
                          map <- tryCatch({
                                  readKeyMap(filecon, map, old.size)
                          }, error = function(err) {
                                  message(conditionMessage(err))
                                  map0
                          })
                  }
                  else
                          map <- map0
                  if(!identical(map, map0)) {
                          db@meta$updatemap(map)
                          db@meta$updatesize(cur.size)
                  }
                  invisible(db)
          })

setGeneric("getMap", function(db) standardGeneric("getMap"))

setMethod("getMap", "filehashDB1",
          function(db) {
                  db@meta$getmap()
          })

################################################################################
## Interface functions

openDBConn <- function(filename, mode) {
        con <- try({
                file(filename, mode)
        }, silent = TRUE)

        if(inherits(con, "try-error"))
                stop("unable to open connection to database")
        con
}

#' @exportMethod dbInsert
#' @describeIn filehashDB1 Insert an R object into a filehashDB1 database
#' @param db a filehashDB1 object
#' @param key character, the name of an R object in the database
#' @param value an R object
#' @param ... arguments passed to other methods
setMethod("dbInsert",
          signature(db = "filehashDB1", key = "character", value = "ANY"),
          function(db, key, value, ...)
          {
                  con <- openDBConn(db@datafile, "ab")
                  on.exit(close(con))

                  lockname <- lockFile(con)
                  createLockFile(lockname)
                  on.exit(deleteLockFile(lockname), add = TRUE)
                  invisible(writeKeyValue(con, key, value))
          })

#' @exportMethod dbFetch
#' @describeIn filehashDB1 Retrieve an object from a filehash DB1 database
#' @param db a filehashDB1 object
#' @param key character, the name of an R object in the database
setMethod("dbFetch",
          signature(db = "filehashDB1", key = "character"),
          function(db, key, ...) {
                  con <- openDBConn(db@datafile, "rb")
                  on.exit(close(con))

                  lockname <- lockFile(con)
                  createLockFile(lockname)
                  on.exit(deleteLockFile(lockname), add = TRUE)

                  checkMap(db, con)
                  map <- getMap(db)
                  val <- readSingleKey(con, map, key)
                  val
          })

#' @exportMethod dbMultiFetch
#' @describeIn filehashDB1 Retrieve multiple objects from a filehash DB1 database
#' @param db a filehashDB1 object
#' @param key character, the name of an R object in the database
#' @details For \code{dbMultiFetch}, \code{key} is a character vector of keys.
setMethod("dbMultiFetch",
          signature(db = "filehashDB1", key = "character"),
          function(db, key, ...) {
                  con <- openDBConn(db@datafile, "rb")
                  on.exit(close(con))

                  lockname <- lockFile(con)
                  createLockFile(lockname)
                  on.exit(deleteLockFile(lockname), add = TRUE)

                  checkMap(db, con)
                  map <- getMap(db)
                  readKeys(con, map, key)
          })

#' @exportMethod dbExists
#' @describeIn filehashDB1 Determine if a key exists in a filehash DB1 database
setMethod("dbExists", signature(db = "filehashDB1", key = "character"),
          function(db, key, ...) {
                  dbkeys <- dbList(db)
                  key %in% dbkeys
          })

#' @exportMethod dbList
#' @describeIn filehashDB1 Return a character vector containing all keys in a database
setMethod("dbList", "filehashDB1",
          function(db, ...)
          {
                  con <- openDBConn(db@datafile, "rb")
                  on.exit(close(con))

                  lockname <- lockFile(con)
                  createLockFile(lockname)
                  on.exit(deleteLockFile(lockname), add = TRUE)

                  checkMap(db, con)
                  map <- getMap(db)

                  if(length(map) == 0)
                          character(0)
                  else {
                          keys <- as.list(map, all.names = TRUE)
                          use <- !sapply(keys, is.null)
                          names(keys[use])
                  }
          })

#' @exportMethod dbDelete
#' @describeIn filehashDB1 Delete a key and its corresponding object from a filehashDB1 database
setMethod("dbDelete", signature(db = "filehashDB1", key = "character"),
          function(db, key, ...) {
                  con <- openDBConn(db@datafile, "ab")
                  on.exit(close(con))

                  lockname <- lockFile(con)
                  createLockFile(lockname)
                  on.exit(deleteLockFile(lockname), add = TRUE)

                  invisible(writeNullKeyValue(con, key))
          })

#' @exportMethod dbUnlink
#' @describeIn filehashDB1 Delete an entire filehashDB1 database
setMethod("dbUnlink", "filehashDB1",
          function(db, ...) {
                  file.remove(db@datafile)
          })

reorganizeDB <- function(db, ...) {
        datafile <- db@datafile

        ## Find a temporary file name
        tempdata <- paste(datafile, "Tmp", sep = "")
        i <- 0
        while(file.exists(tempdata)) {
                i <- i + 1
                tempdata <- paste(datafile, "Tmp", i, sep = "")
        }
        if(!dbCreate(tempdata, type = "DB1")) {
                warning("could not create temporary database")
                return(FALSE)
        }
        on.exit(file.remove(tempdata))

        tempdb <- dbInit(tempdata, type = "DB1")
        keys <- dbList(db)

        ## Copy all keys to temporary database
        nkeys <- length(keys)
        cat("Reorganizing database: ")

        for(i in seq_along(keys)) {
                key <- keys[i]
                msg <- sprintf("%d%% (%d/%d)", round (100 * i / nkeys),
                               i, nkeys)
                cat(msg)

                dbInsert(tempdb, key, dbFetch(db, key))

                back <- paste(rep("\b", nchar(msg)), collapse = "")
                cat(back)
        }
        cat("\n")
        status <- file.rename(tempdata, datafile)

        if(!isTRUE(status)) {
                on.exit()
                warning("temporary database could not be renamed and is left in ",
                        tempdata)
                return(FALSE)
        }
        on.exit()
        cat("Finished; reload database with 'dbInit'\n")
        TRUE
}

#' @exportMethod dbReorganize
#' @describeIn filehashDB1 Reorganize and compact a filehashDB1 database
setMethod("dbReorganize", "filehashDB1", reorganizeDB)

################################################################################
## Test system's ftell()

hasWorkingFtell <- function() {
        tfile <- tempfile()
        con <- file(tfile, "wb")

        tryCatch({
                bytes <- raw(10)
                begin <- seek(con)

                if(begin != 0)
                        return(FALSE)
                writeBin(bytes, con)
                end <- seek(con)
                offset <- end - begin
                isTRUE(offset == 10)
        }, error = function(e) {
                FALSE
        }, finally = {
                close(con)
                unlink(tfile)
        })
}

######################################################################
filehash/R/hash.R0000644000176200001440000000235414366574566013336 0ustar liggesusers
##########################################################################
## Copyright (C) 2006-2023, Roger D. Peng
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
## 02110-1301, USA
##########################################################################

#' @importFrom digest digest
sha1 <- function(object, skip = 14L) {
        bytes <- serialize(object, NULL)
        digest(bytes, algo = "sha1", skip = skip, serialize = FALSE)
}

#' @importFrom digest digest
sha1_file <- function(filename, skip = 0L) {
        digest(filename, algo = "sha1", file = TRUE)
}
filehash/R/filehash.R0000644000176200001440000004175614367270146014173 0ustar liggesusers
##########################################################################
## Copyright (C) 2006-2023, Roger D. Peng
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
## 02110-1301, USA
##########################################################################

######################################################################
## Class 'filehash'

#' Filehash Class
#'
#' These functions form the interface for a simple file-based key-value database (i.e. hash table).
#'
#' @details Objects can be created by calls of the form \code{new("filehash", ...)}.
#'
#' @slot name Object of class \code{"character"}, name of the database.
#'
#' @useDynLib filehash,.registration = TRUE, .fixes = "C_"
#' @import methods
#' @exportClass filehash
setClass("filehash", representation(name = "character"))

setValidity("filehash", function(object) {
        if(length(object@name) == 0)
                "database name has length 0"
        else
                TRUE
})

setGeneric("dbName", function(db) standardGeneric("dbName"))

setMethod("dbName", "filehash", function(db) db@name)

#' @exportMethod show
#' @param object a filehash object
#' @describeIn filehash Print a filehash object
setMethod("show", "filehash",
          function(object) {
                  if(length(object@name) == 0)
                          stop("database does not have a name")
                  cat(gettextf("'%s' database '%s'\n", as.character(class(object)),
                               object@name))
          })

######################################################################

#' Register Database Format
#'
#' @param name character, name of database format
#' @param funlist list of functions for creating and initializing a database format
#' @export
registerFormatDB <- function(name, funlist) {
        if(!all(c("initialize", "create") %in% names(funlist)))
                stop("need both 'initialize' and 'create' functions in 'funlist'")
        r <- list(list(create = funlist[["create"]],
                       initialize = funlist[["initialize"]]))
        names(r) <- name
        do.call("filehashFormats", r)
        TRUE
}

#' List and register filehash formats
#'
#' List and register filehash backend database formats.
#'
#' @param \dots list of functions for registering a new database format
#'
#' @details \code{filehashFormats} can be used to register new filehash backend database formats. \code{filehashFormats} called with no arguments lists information on available formats
#' @return A list containing information on the available filehash formats
#' @export
filehashFormats <- function(...) {
        args <- list(...)
        n <- names(args)

        for(n in names(args))
                assign(n, args[[n]], .filehashFormats)
        current <- as.list(.filehashFormats)

        if(length(args) == 0)
                current
        else
                invisible(current)
}

######################################################################

## Create necessary database files. On successful creation, return
## TRUE. If the database already exists, don't do anything but return
## TRUE (and print a message). If there's any other strange
## condition, return FALSE.

dbStartup <- function(dbName, type, action = c("initialize", "create")) {
        action <- match.arg(action)
        validFormat <- type %in% names(filehashFormats())

        if(!validFormat)
                stop(gettextf("'%s' not a valid database format", type))
        formatList <- filehashFormats()[[type]]
        doFUN <- formatList[[action]]

        if(!is.function(doFUN))
                stop(gettextf("'%s' function for database format '%s' is not valid",
                              action, type))
        doFUN(dbName)
}

setGeneric("dbCreate", function(db, ...) standardGeneric("dbCreate"))

#' @exportMethod dbCreate
#' @aliases dbCreate
#' @param db a filehash object
#' @param ... arguments passed to other methods
#' @describeIn filehash Create a filehash database
setMethod("dbCreate", "ANY",
          function(db, type = NULL, ...) {
                  if(is.null(type))
                          type <- filehashOption()$defaultType
                  dbStartup(db, type, "create")
          })

setGeneric("dbInit", function(db, ...) standardGeneric("dbInit"))

#' @exportMethod dbInit
#' @describeIn filehash Initialize an existing filehash database
#' @param type filehash database type
#' @aliases dbInit
setMethod("dbInit", "ANY",
          function(db, type = NULL, ...)
          {
                  if(is.null(type))
                          type <- filehashOption()$defaultType
                  dbStartup(db, type, "initialize")
          })

######################################################################
## Set options and retrieve list of options

#' Set Filehash Options
#'
#' Set global filehash options
#'
#' @param \dots name-value pairs for options
#' @details Currently, the only option that can be set is the default database type (\code{defaultType}) which can be "DB1", "RDS" or "DB".
#' @return \code{filehashOption} returns a list of current settings for all options.
#'
#' @export
filehashOption <- function(...) {
        args <- list(...)
        n <- names(args)

        for(n in names(args))
                assign(n, args[[n]], .filehashOptions)
        current <- as.list(.filehashOptions)

        if(length(args) == 0)
                current
        else
                invisible(current)
}

######################################################################
## Load active bindings into an environment

#' Load a Database
#'
#' Load entire database into an environment
#'
#' @param db filehash database object
#' @param ... arguments passed to other methods
#'
#' @details \code{dbLoad} loads objects in the database directly into the
#' environment specified, like \code{load} does except with active bindings.
#' \code{dbLoad} takes a second argument \code{env}, which is an
#' environment, and the default for \code{env} is \code{parent.frame()}.
#'
#' @details The use of \code{makeActiveBinding} in \code{db2env} and
#' \code{dbLoad} allows for potentially large databases to, at least
#' conceptually, be used in R, as long as you don't need simultaneous access to
#' all of the elements in the database.
#'
setGeneric("dbLoad", function(db, ...) standardGeneric("dbLoad"))

#' @exportMethod dbLoad
#' @param env environment into which objects should be loaded
#' @param keys specific keys to be loaded (if NULL then all keys are loaded)
#' @describeIn dbLoad Method for filehash databases
setMethod("dbLoad", "filehash",
          function(db, env = parent.frame(2), keys = NULL, ...)
          {
              if(is.null(keys))
                  keys <- dbList(db)
              else if(!is.character(keys))
                  stop("'keys' should be a character vector")
              active <- sapply(keys, function(k) {
                  exists(k, env, inherits = FALSE)
              })
              if(any(active)) {
                  warning("keys with active/regular bindings ignored: ",
                          paste(sQuote(keys[active]), collapse = ", "))
                  keys <- keys[!active]
              }
              make.f <- function(k) {
                  key <- k
                  function(value) {
                      if(!missing(value)) {
                          dbInsert(db, key, value)
                          invisible(value)
                      } else {
                          obj <- dbFetch(db, key)
                          obj
                      }
                  }
              }
              for(k in keys)
                  makeActiveBinding(k, make.f(k), env)
              invisible(keys)
          })

#' @param db a filehash database object
#' @param ... arguments passed to other methods
#'
#' @details \code{dbLazyLoad} loads objects in the database directly into the
#' environment specified, like \code{load} does except with promises.
#' \code{dbLazyLoad} takes a second argument \code{env}, which is an
#' environment, and the default for \code{env} is \code{parent.frame()}.
#' @details With \code{dbLazyLoad} database objects are "lazy-loaded" into
#' the environment. Promises to load the objects are created in the environment
#' specified by \code{env}. Upon first access, those objects are copied into
#' the environment and will from then on reside in memory. Changes to the
#' database will not be reflected in the object residing in the environment
#' after first access. Conversely, changes to the object in the environment
#' will not be reflected in the database. This type of loading is useful for
#' read-only databases.
#'
#' @return dbLoad, dbLazyLoad: a character vector is returned (invisibly) containing the keys associated with the values loaded into the environment.
#' @describeIn dbLoad Lazy load a filehash database
setGeneric("dbLazyLoad", function(db, ...)
           standardGeneric("dbLazyLoad"))

#' @exportMethod dbLazyLoad
#' @param env environment into which objects should be loaded
#' @param keys specific keys to be loaded (if NULL then all keys are loaded)
#' @describeIn dbLoad Method for filehash databases
setMethod("dbLazyLoad", "filehash",
          function(db, env = parent.frame(2), keys = NULL, ...) {
              if(is.null(keys))
                  keys <- dbList(db)
              else if(!is.character(keys))
                  stop("'keys' should be a character vector")

              wrap <- function(x, env) {
                  key <- x
                  delayedAssign(x, dbFetch(db, key), environment(), env)
              }
              for(k in keys)
                  wrap(k, env)
              invisible(keys)
          })

#' @describeIn dbLoad Load active bindings into an environment and return the environment
#'
#' @param db filehash database object
#'
#' @return db2env: environment containing database keys
#'
#' @details \code{db2env} loads the entire database \code{db} into an
#' environment via calls to \code{makeActiveBinding}. Therefore, the data
#' themselves are not stored in the environment, but a function pointing to
#' the data in the database is stored. When an element of the environment is
#' accessed, the function is called to retrieve the data from the database.
#' If the data in the database is changed, the changes will be reflected in the
#' environment.
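#'
#' @examples
#' ## A minimal sketch of how the active bindings behave.  This example
#' ## is an illustration added to the documentation, not code taken from
#' ## the package sources; the temporary file name is an assumption.
#' dbfile <- tempfile("filehash-example")
#' dbCreate(dbfile)
#' db <- dbInit(dbfile)
#' dbInsert(db, "a", 1:10)
#' env <- db2env(db)
#' mean(env$a)            ## the value is fetched from the database on access
#' dbInsert(db, "a", 1:4)
#' length(env$a)          ## the binding reflects the updated database value
#' dbUnlink(db)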
#'
#'
#' @seealso \code{\link{dbLoad}}, \code{\link{dbLazyLoad}}
#'
#' @export
db2env <- function(db) {
    if(is.character(db))
        db <- dbInit(db)  ## use the default type
    env <- new.env(hash = TRUE)
    dbLoad(db, env)
    env
}

######################################################################
## Other methods

setGeneric("names")

#' @exportMethod names
#' @param x a filehash object
#' @describeIn filehash Return the keys stored in a filehash database
setMethod("names", "filehash", function(x) {
    dbList(x)
})

setGeneric("length")

#' @exportMethod length
#' @param x a filehash object
#' @describeIn filehash Return the number of objects in a filehash database
setMethod("length", "filehash", function(x) {
    length(dbList(x))
})

setGeneric("with")

#' @exportMethod with
#' @param data a filehash object
#' @param expr an R expression to be evaluated
#' @describeIn filehash Use a filehash database as an evaluation environment
setMethod("with", "filehash", function(data, expr, ...) {
    env <- db2env(data)
    eval(substitute(expr), env, enclos = parent.frame())
})

setGeneric("lapply")

#' @exportMethod lapply
#' @param FUN a function to be applied
#' @param X a filehash object
#' @param keep.names Should the key names be returned in the resulting list?
#' @describeIn filehash Apply a function over the elements of a filehash database
setMethod("lapply", signature(X = "filehash"),
          function(X, FUN, ..., keep.names = TRUE) {
              FUN <- match.fun(FUN)
              keys <- dbList(X)
              rval <- vector("list", length = length(keys))

              for(i in seq(along = keys)) {
                  obj <- dbFetch(X, keys[i])
                  rval[[i]] <- FUN(obj, ...)
              }
              if(keep.names)
                  names(rval) <- keys
              rval
          })

######################################################################
## Database interface

#' @describeIn filehash Retrieve values associated with multiple keys (a list of those values is returned).
#' @param db a filehash object
#' @param key a character vector indicating a key (or keys) to retrieve
setGeneric("dbMultiFetch", function(db, key, ...)
{
    standardGeneric("dbMultiFetch")
})

#' @describeIn filehash Insert a key-value pair into the database. If that key already exists, its associated value is overwritten. For \code{"RDS"} type databases, there is a \code{safe} option (defaults to \code{TRUE}) which allows the user to insert objects somewhat more safely (objects should not be lost in the event of an interrupt).
setGeneric("dbInsert", function(db, key, value, ...) {
    standardGeneric("dbInsert")
})

#' @describeIn filehash Retrieve the value associated with a given key.
setGeneric("dbFetch", function(db, key, ...) standardGeneric("dbFetch"))

#' @describeIn filehash Check to see if a key exists.
setGeneric("dbExists", function(db, key, ...) standardGeneric("dbExists"))

#' @describeIn filehash List all keys in the database.
setGeneric("dbList", function(db, ...) standardGeneric("dbList"))

#' @describeIn filehash The \code{dbDelete} function is for deleting elements, but for the \code{"DB1"} format all it does is remove the key from the lookup table. The actual data are still in the database (but inaccessible). If you reinsert data for the same key, the new data are simply appended on to the end of the file. Therefore, it's possible to have multiple copies of data lying around after a while, potentially making the database file big. The \code{"RDS"} format does not have this problem.
setGeneric("dbDelete", function(db, key, ...) standardGeneric("dbDelete"))

#' @describeIn filehash The \code{dbReorganize} function is there for the purpose of rewriting the database to remove all of the stale entries. Basically, this function creates a new copy of the database and then overwrites the old copy. This function has not been tested extensively and so should be considered \emph{experimental}. \code{dbReorganize} is not needed when using the \code{"RDS"} format.
setGeneric("dbReorganize", function(db, ...) standardGeneric("dbReorganize"))

#' @describeIn filehash Delete an entire database from the disk.
setGeneric("dbUnlink", function(db, ...) standardGeneric("dbUnlink"))

## Other

setOldClass(c("file", "connection"))
setGeneric("lockFile", function(db, ...) standardGeneric("lockFile"))

######################################################################
## Extractor/replacement

#' @exportMethod `[[`
#' @param j not used
#' @describeIn filehash Extract elements of a filehash database using character names
#' @aliases `[[,filehash,character,missing-method`
setMethod("[[", signature(x = "filehash", i = "character", j = "missing"),
          function(x, i, j) {
              dbFetch(x, i)
          })

#' @exportMethod `$`
#' @describeIn filehash Extract elements of a filehash database using character names
setMethod("$", signature(x = "filehash"),
          function(x, name) {
              dbFetch(x, name)
          })

#' @exportMethod `[[<-`
#' @param x a filehash object
#' @param i a character index
#' @param value an R object
#' @describeIn filehash Replace elements of a filehash database
setReplaceMethod("[[", signature(x = "filehash", i = "character", j = "missing"),
                 function(x, i, j, value) {
                     dbInsert(x, i, value)
                     x
                 })

#' @exportMethod `$<-`
#' @param x a filehash object
#' @param name the name of the element in the filehash database
#' @param value an R object
#' @describeIn filehash Replace elements of a filehash database
setReplaceMethod("$", signature(x = "filehash"),
                 function(x, name, value) {
                     dbInsert(x, name, value)
                     x
                 })

## Need to define these because they're not automatically caught.
## Don't need this if R >= 2.4.0.

#' @exportMethod `[`
#' @param drop should dimensions be dropped? (not used)
#' @describeIn filehash Retrieve multiple elements of a filehash database
setMethod("[", signature(x = "filehash", i = "character", j = "missing",
                         drop = "missing"),
          function(x, i , j, drop) {
              dbMultiFetch(x, i)
          })
filehash/R/stack.R0000644000176200001440000001145414367255606013511 0ustar liggesusers
##########################################################################
## Copyright (C) 2006-2023, Roger D. Peng
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
## 02110-1301, USA
##########################################################################

#' Stack Class
#'
#' A stack implementation using a \code{filehash} database
#'
#' @details Objects can be created by calls of the form \code{new("stack", ...)} or by calling \code{createS}. Existing stacks can be initialized with \code{initS}.
#'
#' @slot stack Object of class \code{"filehashDB1"}
#' @slot name Object of class \code{"character"}: the name of the stack (default is the file name in which the stack data are stored)
#'
#' @exportClass stack
setClass("stack",
         representation(stack = "filehashDB1",
                        name = "character"))

#' @exportMethod show
#' @describeIn stack Print a stack object.
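#' @examples
#' ## A minimal sketch of the stack interface.  This example is an
#' ## illustration added to the documentation, not code taken from the
#' ## package sources; the temporary file name is hypothetical.
#' s <- createS(tempfile("stack-example"))
#' push(s, "first")
#' push(s, "second")
#' top(s)       ## inspect the most recently pushed value
#' pop(s)       ## remove that value from the stack and return it
#' isEmpty(s)   ## one element is still on the stack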
#' @param object a stack object
setMethod("show", "stack", function(object) {
    cat(gettextf("<stack: %s>\n", object@name))
    invisible(object)
})

#' @param filename name of file to store stack
#' @export
#' @describeIn stack Create a filehash Stack
#' @return a stack object
createS <- function(filename) {
    dbCreate(filename, "DB1")
    stack <- dbInit(filename, "DB1")
    dbInsert(stack, "top", NULL)
    new("stack", stack = stack, name = filename)
}

#' @describeIn stack Initialize an existing filehash stack
#' @param filename name of file where stack is stored
#' @export
initS <- function(filename) {
    new("stack",
        stack = dbInit(filename, "DB1"),
        name = filename)
}

setMethod("lockFile", "stack", function(db, ...) {
    paste(db@name, "slock", sep = ".")
})

#' @exportMethod push
#' @param db a stack object
#' @param val an R object to be added to the stack
#' @param ... arguments passed to other methods
#' @describeIn stack Push an object on to the stack
setMethod("push", c("stack", "ANY"), function(db, val, ...) {
    node <- list(value = val,
                 nextkey = dbFetch(db@stack, "top"))
    topkey <- sha1(node)

    createLockFile(lockFile(db))
    on.exit(deleteLockFile(lockFile(db)))

    dbInsert(db@stack, topkey, node)
    dbInsert(db@stack, "top", topkey)
})

#' @describeIn stack Push multiple R objects on to a stack
setGeneric("mpush", function(db, vals, ...) standardGeneric("mpush"))

#' @exportMethod mpush
#' @param vals a list of R objects
#' @describeIn stack Push a list of R objects on to the stack
setMethod("mpush", c("stack", "ANY"), function(db, vals, ...) {
    if(!is.list(vals))
        vals <- as.list(vals)
    createLockFile(lockFile(db))
    on.exit(deleteLockFile(lockFile(db)))

    topkey <- dbFetch(db@stack, "top")

    for(i in seq_along(vals)) {
        node <- list(value = vals[[i]],
                     nextkey = topkey)
        topkey <- sha1(node)

        dbInsert(db@stack, topkey, node)
        dbInsert(db@stack, "top", topkey)
    }
})

#' @exportMethod isEmpty
#' @describeIn stack Indicate whether the stack is empty or not
setMethod("isEmpty", "stack", function(db, ...)
{
    h <- dbFetch(db@stack, "top")
    is.null(h)
})

#' @exportMethod top
#' @describeIn stack Return the top element of the stack
setMethod("top", "stack", function(db, ...) {
    createLockFile(lockFile(db))
    on.exit(deleteLockFile(lockFile(db)))

    if(isEmpty(db))
        stop("stack is empty")
    h <- dbFetch(db@stack, "top")
    node <- dbFetch(db@stack, h)
    node$value
})

#' @exportMethod pop
#' @describeIn stack Return the top element of the stack and remove that element from the stack
setMethod("pop", "stack", function(db, ...) {
    createLockFile(lockFile(db))
    on.exit(deleteLockFile(lockFile(db)))

    if(isEmpty(db))
        stop("stack is empty")
    h <- dbFetch(db@stack, "top")
    node <- dbFetch(db@stack, h)

    dbInsert(db@stack, "top", node$nextkey)
    dbDelete(db@stack, h)
    node$value
})
filehash/MD50000644000176200001440000000432514371237002012350 0ustar liggesusers
20724f93d384ede97512d0c16d6000a2 *DESCRIPTION
e00630e645e7bcccced0e1c7f076a564 *NAMESPACE
d616b0dd18e8720a826ad26a47822491 *R/coerce.R
9a36d6a2bde06d5bd4ab703b8fefc95f *R/dump.R
c4a3ecf662ac7fd0186bca51ba45a979 *R/filehash-DB1.R
b403c6b0b86cc5956f70032490f763a1 *R/filehash-RDS.R
5e0b9a29697f7f8ca7511bae26e2a7db *R/filehash.R
26de70ca17eba26938be391fdf63673f *R/hash.R
0f7971384ed2968b93bdf7c65371ace1 *R/queue.R
b7c62b4bda3d2ba06aa5a7e6d4e6c643 *R/stack.R
8a7451a692b071134c636d60149babcc *R/zzz.R
355b3d9a3f2fbdd1f201b85ebbc1143f *build/vignette.rds
ecdc7b4c6aa03fb03e692c4aae262f0f *inst/CITATION
b128d2038f8d0c5c554b72de80298e4a *inst/COPYING
44422cadef3c067ffd43b9f00a18aa9d *inst/NEWS
6b5ee8f3a31a761dd38bd0238efbdfc1 *inst/doc/filehash.R
f157f4577bb1a29a560f9b630307ec6e *inst/doc/filehash.Rnw
b234ee585f3036311b02739a2bcf4c82 *inst/doc/filehash.pdf
213020161cc470bf12a962570dc705b9 *man/coerceDB1RDS.Rd
8de7a1f9dd5f2fc8e04eaf2acaa8a4b7 *man/coerceDB1list.Rd
64f34df62cbba4a3248fc6e1fd25af5f *man/coercelist.Rd
597f874788ae775c58106bc43263e506 *man/dbLoad.Rd
b80328a85541042e79d9a841c78df899 *man/dumpEnv.Rd
cab6586625a1cbf28b1cc6fcabd72b23 *man/filehash-class.Rd
c3731d591111c7c43706214355246382 *man/filehashDB1-class.Rd
3c7e362954e53ae6db6df3b46df31950 *man/filehashFormats.Rd
23ce02835c9fe95f85bc7a018c8e4c70 *man/filehashOption.Rd
0d4ee2708ac63960f023c16d6ca60097 *man/filehashRDS-class.Rd
41654bdf028f44f8c23e49ca51cf50ad *man/queue-class.Rd
923b466eece582567e8f15e5754e616c *man/registerFormatDB.Rd
39e55b0882862759c3948d4a4e379136 *man/stack-class.Rd
684cc44d9ed10d09c77dc890021dfce4 *src/init.c
1f69eebc69b381da6e96c802a8c93003 *src/lockfile.c
43efc34c262443523d94f93365daed8f *src/readKeyMap.c
d83a9585d249e9a093cb3c545cf49834 *tests/SHA1SUM
f00793baa7e5a70812957058ec569332 *tests/misc/create-testdb.R
22b196fc72a3d39aa18a6aec0fe7112a *tests/reg-tests.R
52831c2f2906ae2967aff550a096cd37 *tests/reg-tests.Rout.save
5b7464763d85ba9406c9e4dddce80d97 *tests/testdb-v1.1
5b7464763d85ba9406c9e4dddce80d97 *tests/testdb-v2.0
40829c8958672fbc650561dde99ea5a0 *tests/versions.R
b6a1560810de75ac78be89613eaec399 *tests/versions.Rout.save
b2631aaa4f28eae69ba9e6b9b5bbbe80 *vignettes/combined.bib
f157f4577bb1a29a560f9b630307ec6e *vignettes/filehash.Rnw
filehash/inst/0000755000176200001440000000000014371234421013013 5ustar liggesusers
filehash/inst/doc/0000755000176200001440000000000014371234421013560 5ustar liggesusers
filehash/inst/doc/filehash.pdf0000644000176200001440000031435614371234420016051 0ustar liggesusers
)&eڙ"bejy/H3h;Ս?lE?feeWs"2Zi?ASF E!9R'hC0[]@$ I{X UFcE/ݻYa-F,>Uqbt~ł{s9/~sI?@Pvf 0vͣ1:O4];3B/`# J>8(@xCI~p/i7KꖂP =dP$񁉡|kŘ״wY}RjK#(wz t݅ }"8k*лl? ͑CwuDCtK8AFU9:cz%ϮlĢLzT̞N&6DXqƶL^mS57<׈*hQ?/)P_(+s&oYJ /[qCQxpUpܯ]3gy +ET+E Dn֧[XR qOGgZ,c蹻4 qw5"eXNY}v ().Rd*$F.| I;c#lcI1 BHñv,EO >;K) 6>t4 o,eG KqAKN7Vrf3S5g/?GJSUPjkYBXOf6?۷vAIBȧsKLj1VJJ[}'HBBt?&הDc}|j 8J%f{3'{Rfwξ" M`MU`!`N;ؘ8霔A2x\yĭO'[pBOʹrEh"Q\?%"Oogң!fi u-H!\R]}kT~U }:e-\o,[Bm4>^Rx. ݕ jydOaf[KʱՅ1˅_U2k*^v6us"2ba|S9z[Ydt@#rdjZkiEbR>^oQڏ<:o)CLLtzhu< {rba/.#PYن|D3$IJD1iW }~z\ĭ@zI |F#pj8nS SI \j;߅H$/!cfY%L6kɹٶɠkA^ to@>$0$TBoB.yR`DߺVdZV^wJ/cueEZH+p❹pZ=( Q hCxP@_KM]71t}0b> QiPchF|^{)rwt7V 7@G0#~$zGdq.PYAn6qs~)5d /^?ϳ}y= .n]`{G/A:R|LpOoӽhe$1N}2_HXs F8e/aΤsTd'P m߼(3ߢ]-c*:ObH; Q%rKy! 4Pcsp8F ʤW,:ZE/ze\t+w${p5"q~6 5;鑧9.>[ Ңg_/F kQf2fw 2P9e'e+߷G&\H9 ~ٽTOGegH$Pl56N)`Qqvc73kqh9_T|7 qʭ աttx1?l΄1h_y1]&/?S$ ̧u N]ERSw2aN*1Nɓr[K ź¨1'Ǝgeqv,䄋tRhyZI3#Ya+;vm*%R5EvvOdmmyspzNY ܖUGe;ȇS*|UbS#KUm$_՛r$5N?Wԉƀx7wY`{tҳ >$f0l\Ƿ? w:\d02I,E v^i#a1 1{L*B$ j MG9}Vnz>WbļL=$`CHfUH{҆FE[cs\ 8Mj3|尃inlk.1 P6#8Mzo9 P#1qIC֓WmeAe f*%'bW0jh!N#s|F=fWZ% 4v1ySD-jFD95eu^e=sj1,_1)jD݆wk- ɓ3 ]8[C`Dee᷋ȸC7 1?j0hm?kK6OJߜ~-PN\07%0~ocioһ-@d QԬZfjRmփg$^; DmMG< Q[ pi<ŃT S4Ȧ{HZ=ywm| v3հ=nd(II[4FH֯hpOspfV$<w_u,2~ܭ^gn!ᐎ1o(ᯓ|:7 H5)X[Kz9DPA)!VLڿOt9]]w\d%`]ZdRP;њ nWmςQts2.nTCJpi9i~qonUҸ>&guN}z 1C;##gdk'DdS?餳0w!K6g0"dueˁ n.d/ S$9*-- 7Eƙk˝JmަbCkꩰݳPOAk&l3Kc۪7?Ri?#a&49R}Z?<9&=nCBڳ9QFN~4pDTyaRX0v}NL#7TnL:1#|wt8N>p[0Y>݂YGuB.;{#T(t/HXHu+z6Pj2wa]_E qyz _h^A*vĬ9oMl?6&gq h>c2 Qw.xjwkڙM`L'춳! jϬ,/v0E"1M[ϰ{KsEdoVXC<D8If-aqzd[SMJ"NHaKo~xF5[f!GybGv'T= +t=)z˧G9DF;Ez:PʒQJAnOHL+1TJH8;xm;R9*e)V'PۀVE[t[u'|iQ H.egb4Q E|]E3Z{J{#2ê"2SyF< 3E Sb8 s4AM_m*ܢ3ϡ)󅉟u~w׭!)*vBpr3>cL׾E~A9b7ϾvDBܳK9{Q7QcmIC|ՔT.s0.lU̷4k7Qƻw8MʖN>r.bq(́Ѽ-I?C%\nB({_fbpzblOJ\rDcs}U2 u>t\ ěz..? 
ǽ*'=um<P6LX 9Gs6 fTwM+DAf12§×ܠ/JjnNfSvX}]bpY6A)],̺[~\WeG Ҝ3}qkoyM^M1}LwCYl- E Qۆ:؇uqP.|ƸF9uhU6Dk!sY"7sٚAhX{I9-"j ϟ G0O0(w2DPPv$^z| Y廒[%|ޕ.,V#`eđű X[DONߏCrPŗ`*bn]+QL;""ystHoSEZLQb!X&ԅ\nͻ%w;о7).mxTF/\Kqkv\> stream xڭteXے5i<@%C@-hpw`][pwww8Ν;~]k*Y\+1HƑ6vrcT9)["5=lk#fhLb ÃH s;>h10|+c"`33dN?v AS % (ـ썬JNr` D0Xژ*́K`pn W /do vpxfF6=pmVN&%n7;!;{;lx$&<͍~7MlN7N:m Wǿb&`;+#dvprۘ+O{Nooۿog`G)"+{L{l3 "_"mcj `e3}kfޓ02rLlC>Tf@O߉G}N-de`d>X2-c goNl8E ?ޛ"bc. #+?` +D 4Yo lzﶾ;jZ%? ɿ.3J*3wJfG y[<勭+ feD_gy#G{+@߿NzF#n5kr:٘Nߋ\A@ąY[ J1NV_5y9><%LcMn3v/2;Vӟ)r0VhZvi{M˭hsl(<ÒßP:`SڡzpZk!0+shniv\|&bȌB3J8$wt3~xtrSG/f:o >Hv+R#ǕSoO2۱k.`OY:ZQ{.so dDa 2%8E[{L5hw6Kc^m bK<hB*{xQ>=Wtsn \֒]'6𦕝Y((~cweG:,]T$?ͤe~joUO߲CA?nt L*A(r[NO7yi4tF @ZLU'T&|Жfh" (ZDIlx[Lj^xL$PP o8 J6S'6H[@ IysBn07pųd<ā2t&Ƙ|n%KDƮw\J MtϾVF9Nw2ZȎ%gq (HT2Fl̽g{dK64̫|dج];` F}$aTMj-7/ ,$G*6,vwU:U7wV_*g@.6`LBǵUDV8"]ڎM`A@|lPYW Wzr%u|m[OuZUMi9&_ol Ry#Plmk*cz2v7tkxq$֗eģR:LuTLJtk2S 餜$V Ҳ|!~\>e!ijEO GvHO{1Ai/">0Ӿ~ܶ$ nW<a]АKT%j83CoĂXd>4E;5_)pTx Vn`^|EH1O"$>3\cz1r%A`ʱaE"l+~eJ0}8}8ۊJxܭM2d]ZKn K .Y"{!(D&b#bU 5!#A(ny&xM&z^24nuګT$av>&Lk` Ĵގe3 lN-j0}~Vs֭AO BKo)̏Nvβ-#b4:x֔u -E6%/OI㼝eBhgJnX (ru+bvפY<am w7O*F^k0({8]?tqZEiMmj]ǐq0EY˰hk$@f d3O>,K&m΅C-x;,kL!7zoE)꒞\${lnpTRљ+[SB _K@~^Wh7rp3{H׽2)־h0iQo*b+^#O9eӊ o7}mS+kβf_*_.*SμXW,~7nNI.'JSu Юݲ=j"1eK S:dB!'\KʻH)_B̼iQda=?DG^Ə cs;E=DJ@s&>Tx$݌cg{9 .}Z EuQ)* veKnQv}jh B Xֳ[ƏZ IEjna=rcR5R_.grU`^ůp@A'qDvqF <{'w}*tWˮԏY3](+ TЪP~tv*j-R,9E,]м\0 @L:BUh°+j8=ls8@JmU]fV啱-Mcy-3ן8oΖ} V9DSN F8VK&Aq6T0.{C}SD?]?]="])ՖTr xw<=omx:vVeIƽI%^@ ge9mH-!RV^?djcA7fك]ݙ? 䫨!&+tȧx>4rG xGh +Uh`XYw}Q9lr]h_G.OhT-TVL 29rǘC>T[yF)o<(< I*K/pqu&'NoO݇|5kSb4"i}pG[MZIc;2TWq2L9oa`bx[dl %/7'1ƜBʹǻ Ky:N~ %"MTR!'̂SejOw3Т}a1`O^: Lt)0K.=&KiV*?,q!MB;]~qlF @l~YSEɡ5l|''̆Mlń."w3&25e{l3:FͰ48/49#K&$t_ة$5v씱6ML A<.i. 
Pj9D?R 7&Fi׎YDQW Or&[4Cӡ%tX[J%aƷOI4݄!{-4qG;PV81*yD9p(!e|b4qdhhLH)m[էz"tgJ1Zl>RnspmM'{&MC=*G)sQ~j'K٨;E|WQSA輻?W$gkn \a Vjb0oX :md]_+R@_}->!T"𧷌)}X(͗ fG(0#?pHeq0oRbHJ_*`°]|9"z^xޒzs+I5!̌}̰Lla@Z3W3|c(njVͿB&xO ՞yMwꓻ_ _Fr4#}Qn6US-ƾcg68䛪p;,DfB%xQ{z`)DnqlcLePmҒpdž &.LC5iܽ7:YGt[t 4cwiIh/x6ta~ɄĶD-9 u ^q:ePW$Ā.f1%BV|UP.{?~t=E a52}y͓4W`/n窍   e7'sͶXA `}BeNj/QCu>=\'h%6v?թ4Z. eY|2m1O<5fl!22Zq|2`Ry2c*bvUEP M/jx?AN3V @BD$|9Gb"q{(9k-i)Kun̍-QER5s$bBp Fn(jLR 5! q\?pz i]%ajXa>os1²iq"AXpw{ ?"1 錥N>% 1.kb4h&7k(#﨎Á30W#!jbȴFaR9(_]'eb֢V1fgYnњ-)nvKq;oNZ.V{S߼58_PCף황ULV5P }U&j njۧ^2r: =c՝ $U .߾B[{dT$c|^n :SdU;%_Nyfmuw1) gGį2#Cѝ%)qd4.N7ixsߌO⾧)D $O3XFiE!#lxOKYrXQm-4@s_.nH? %FRHv=qޝ! 0@<)Q6>Z+I믛Р? i5$W =^ǀ]iu-4Zx}fWJܝ[sCoLb&7&9߲;F|rZHǸ}A|'yW}o3-m][() iI6!a`6ZKߥ߱"UZ!ӈN͂IǀŁ`8XT'XjPq$ʆ~mncK5.2o}+{m"C( 7~;8SHrHQ6s{?$XyyPdmǡ;E{ֺ -vzTLR&^1ubID55lC&A"@'|Q*'U?S&.)j"*7~ &nlr<> 1/)kg*&cPxb҈V׌ [bO!<=z)$' n,ϺnU?`^c+`/ ţXNu6S˷x&I:9+وfi};s3߲@$zZ3cF\ S;C\.Ehݕ y#IZ蕫qNo_0=oze.z@+ș=1bk=EB~x[+70^2o5)24@x ᡛNgLҼe jCōX:9U@lkb&_x6rd}G{}-w@6foتa %hzbJ|l.#{}9`4,햭Ls&>GBҍ/g1U tWYymT=Dr _C?y}=a^YQfWRܰF]P q㤲% yE3x~ƦY̾YN|ueʰ(RfW6fSP̲U1>PcX(1CG͸?n% :] m]i!xi:dL|=2>.ddB҂p&^w !QxXt~<-f.d(Op9[U^Jm80tw[ncUP\fvD#`G];` tw2Hhgmr"H7 mO#冿<*l }[}Q)堃V˵xU);\JH y ,AR:NȟKuqe8<$ވޝ3' mCN{BPkKqDr ΰ{0mNk̵ m%.+<;m88[e`Uߝ ) fq ZdoE!BjuuL u_0fOo St)'  b&d}ys!ӷXϦW<>og WWVukYUFЯhq TLsot?ƂcLC..cУ.H<)[z^ԧ[Jܯ=AX3"k3fw@P ŏ#;O29sE0I6lK7Aq؎.6gtFd(މ D}>@zS^4ݿJ~6ZAvΐ5@b4mM=%&)F>nsȑdRs.Wܔ6Lޞ>Ɂ0*uzns+/1H ]GIK;a@pMemu}vs9LK}"DrkwKgyk5}VxND}4}OcxӨfr_5wJ'`kO4<5c̲!h] qުD*~C08vzutaBm,%Ա&>n5˧<%ѮX^*o/d7p$3)HM|o*MuXeA89kT3v\54Ā~՗ cRV!M_#>ɀÈ:6.D m̲IҶr)7Us @RiL^IX~؊%VbqEmN84p^w@`.D#TیYˑ Ay a 3sηGN$\w71Ka^-RY[ӕnl]RXT<=\7$*wa\?"X|Md~H-Mi~gޮseuBh.|#HwG4)=;F>g(N di3?&3ewwxY>A:SۼZHͭy u2FBVP9ғŝZ<4I,Dž֔ J<%0DLT'&+\rD,.ԯ"&N-!ܰ*S{h?gd"GRf Ǧ=œK{{_ I\ϫP2G,\dWEf40Pjp[QdcN $A^A+=|︑;v=AG(@_%ZL!W 0ܩܓo ~F}gch f&LJ[8<2Eiٿ4;"/-~ bgooTP/ #n[崱7<|]|cNITtWF҉[۟DfĒ 6sMsvJ<-L7EUaEw#tIx=\s*eݏGGku^#bm3f(yuMq 0rO :2nsAt@M8SX6 ǩ+ QTc+TgVrN߾&5RC7s⢜If\1/E[ gW = S⣉ЩѪv-=5_}Y7Y^fC1@p̒@&-8Nܽ/EK5u-[6ӿKkpd  
T)[y-DRQ+.A(}ILJx]*h*ĵ QvR 헒lNRdtoEol& Dcv f|쨗{A(vGPwtz%);$P^cڌūkQAVZO,ғ[λGr.%zN Q&BDȼ;On 9~\#qjBW헪^̀da!hW1u+=d?4.s .|<_)7;$c幠i+J:њEiui|A.9xQA.+ȭ*!R9@BZ7p˦,џ \V{朮H@lWTP3ۏwCv-ekDұ#탆36vi9Ќ/C^EnN.8Kj-kjy3A5}击a3=a NSZW#A'1=jt^BDd,|d-[l5tjo7ّCV+He-r HTX i 8t}hq壘A _ǵ'RrjT.K%7[$?v `<ld> v)j#zI6uߴ#qha?>6 {}C={b-ה0wܗ >E~C?\z:⡃&-:}vѱ ɤ+nSo1wLZjsGJЊ9P´`s9a:x]qQ4r[k)#hVeE<_UIOQA= !"COڂUIl endstream endobj 46 0 obj << /Length 665 /Filter /FlateDecode >> stream xmTMo0Wxv B Hv[jWHL$Q;onDo3ތ?n~<'i$ͥ+$yh).gW+]9kmwu5Q./vUqLtIĽT;_d-8RcSYs>n1՜qHiQV,gλD˥>b?t.>vU^UwiwBF(݉Qg'ߛ.|NhWS4CCꢥ+nWok*OclBr96[v,rr(CXYhȡQ^s$LD̷aȑ( .$56`>Ƅ*G)jHQ#ر.fvx;tr-^FBOҋ39yh[x1OEiI?[3![1;=b}jU⧏ɩ%󯽟9.E9-a t;13`Ԙ>0ܓ`†BF2c'I`Re=f`x՜kB?s7G2A \aPY:6p \DŽ4s z}Ko-kU6-ϯzF` endstream endobj 47 0 obj << /Length 664 /Filter /FlateDecode >> stream xmTMo0Wxv B Hv[jWHL$Q;onDo3ތ?n~<'i$ͥ+$yh).gW+]9kmwu5Q./vUqL[g $^ֿ/ u}BK)ɱˬ9CwMjNU]vA8BN(b3 ~p]}jRLiVvMuU*nȻ!JDŨ_M]_>Z'4ʫ)pݡ~uRʕXn5R16XGԊ-;99!,}Nf(9O&"[0H Dhbcv#TL5Ҍkn|Wpsc3;vU:Rh/#? bEЙ<-k`q }ʌ4$ٟ-_Ő_}>5*snSƒϜ^"sÖ0]:Sh0kjhyIYh0aCnOg1$ch0)~h0j5F̹#nϠ.۰o|e|lBpl.\cuWEcpťfQiYh|`=n` endstream endobj 48 0 obj << /Length 665 /Filter /FlateDecode >> stream xmTn0CB*D rضj^pZC~Mݮ̼7cϻIZ6;x}s ;~&oã-Ƿxbgqmm] wsiǨ[U_C#n_ɡx*M'Z  B\ lWM Խu]f `t4t*gq yMX ӧ]U۫-w d\?Y1gkEkMiv_n_`!R,6e`wʧclBrՒ5yN҄G&1!ʇ' Ds!FJRLDL "P)jHQ#]Xssc*\FB^G5339syh[jWx0OEd>`[~1E;:oU⦏ɩ\%Ϛ"s0_;0 1CsO ozv}j7ˌ1$N9Aׄ1]açZ7~P˼|BpL-ZcuVErsoZתJ۴?ƏVOOc endstream endobj 49 0 obj << /Length 696 /Filter /FlateDecode >> stream xmTMo0Wx$ ! 
8l[jWHL7IPV=M̼ su;Uٛ=w]yil;<[[j<=?׾+v`&ߴț<^*;~&Q>MS 9_P{=s@dkx;`VY`s4JaQܡn.Uu9\Y6><ٴ.Z.4>Dӗ}~r:-d0VWk,8yLһʮӮђ[*mLr?q 5F8@=@)& 8Rx uD\j2HV0CzL] bctI g$`htы0\F0s jd< I6zg W qȐ+#k .bsrbmXK7ǵH7Gnb>&jؐu1VljOu$՟qWS/%1{\xB!K(hHTЖ枃Jρϯv=k2UKς_:~$/ ~E+7ˢ/ l(/} -+ZXukoԝE?ZK endstream endobj 50 0 obj << /Length 664 /Filter /FlateDecode >> stream xmTMo0WxNB+8l[+ML7RI";onDo3ތ?n~<&yݽIr/ŋ=wWIG77eW]Nm=ij몝m-m3Q/oMq'}vIֿ/ ˺sӵBK)ɱn;A9n1vAxHŢn!XN4$>΃=mc-bB}hjM^Uwww BF˥푊QM]1ʫڞCeݡ}BʥXl6ȶ5R^clFrJՒk ;%9& }8K|y091x&GϹPT#Z%)&!lRvDr䨑\#G|bǚHUʸ4'22| ^Dm=^sS<cLUي_3;S}Ш2?}LN=8g,u..Q/)87l _??q Zqб<4 4谡Цg~ѧ,I 4sY^y?4hv5O#ܵy7S4 &*s0P.9S0׬p~ne8|p\ouqn6|kq_^~& am endstream endobj 51 0 obj << /Length 665 /Filter /FlateDecode >> stream xmTn0CB*D rضj^SpH ;olvR3ތm~<&yݽIr+œG۞m=ģ몝=b[ntC۶z;vʾ6%:svI>77 N!._ M u+$bEw!y1 vxHŢnSX: {Nm]XNDW[״bݹ,,-FVL"~C۷6ZHfٶ )/16X9CjIxļ$Bi#cΓ@l MDϹPT#ZC%)&!lR&TG5k䨑}WLԌ]Uz@K~bo#?қHљ<-+`q}ʂbI2_́Y_%X?Na~ZjGcrj59c+ϳEHDܰ%~WLz9ܓ2ƛFϲ`'I&se?zyxмj5F̹k#niM7>T20P-9SA˰֬p~ne8|p99[ڴw=ߣ& c endstream endobj 53 0 obj << /Producer (pdfTeX-1.40.24) /Creator (TeX) /CreationDate (D:20230209121856-06'00') /ModDate (D:20230209121856-06'00') /Trapped /False /PTEX.Fullbanner (This is pdfTeX, Version 3.141592653-2.6-1.40.24 (TeX Live 2022) kpathsea version 6.3.4) >> endobj 10 0 obj << /Type /ObjStm /N 33 /First 242 /Length 2013 /Filter /FlateDecode >> stream xYr8}WqR@U-8IgRy%Z$:"N\qҙ.E9^1ZHDć* yp ^h-2nJmډи-% ʢ?MaS" u[gébS[# !lЙp&H b!Lx!A0h%|hC<|8WA5kYS *NtZ|X%]8)Fe_}$.x0Eqss_~IO>Ç:Sޝ`\\5ݝjh=<tV|i\W2OX E!vD! =ߋx-F7"k!9@/1<>q-bj=: ؟cjm@9Mჟ8,h KE +`ĦȐ5(. 
filehash/inst/doc/filehash.Rnw0000644000176200001440000004463314371232451016046 0ustar liggesusers
\documentclass{article}

%%\VignetteIndexEntry{The filehash Package}
%%\VignetteDepends{filehash}

\usepackage{charter}
\usepackage{courier}
\usepackage[noae]{Sweave}
\usepackage[margin=1in]{geometry}
\usepackage{natbib}

\title{Interacting with Data using the \textbf{filehash} Package for R}
\author{Roger D. Peng $<$roger.peng@austin.utexas.edu$>$\\\textit{Department of Statistics and Data Sciences}\\\textit{University of Texas, Austin}}
\date{}

\newcommand{\pkg}{\textbf}
\newcommand{\code}{\texttt}

\begin{document}

\maketitle

\begin{abstract}
The \pkg{filehash} package for R implements a simple key-value style
database where character string keys are associated with data values
that are stored on the disk. A simple interface is provided for
inserting, retrieving, and deleting data from the database.
Utilities are provided that allow \pkg{filehash} databases to be used
much as environments and lists are already used in R. These utilities
are provided to encourage interactive and exploratory analysis on
large datasets. Two different file formats for representing the
database are currently available and new formats can easily be
incorporated by third parties for use in the \pkg{filehash} framework.
\end{abstract}

<<options>>=
options(width=60)
@

\section{Overview and Motivation}

Working with large datasets in R can be cumbersome because of the need
to keep objects in physical memory. While many might generally see
that as a feature of the system, the need to keep whole objects in
memory creates challenges to those who might want to work
interactively with large datasets. Here we take a simple definition of
``large dataset'' to be any dataset that cannot be loaded into R as a
single R object because of memory limitations. For example, a very
large data frame might be too large for all of the columns and rows to
be loaded at once. In such a situation, one might load only a subset
of the rows or columns, if that is possible.

In a key-value database, an arbitrary data object (a ``value'') has a
``key'' associated with it, usually a character string. When one
requests the value associated with a particular key, it is the
database's job to match up the key with the correct value and return
the value to the requester.

The most straightforward example of a key-value database in R is the
global environment. Every object in R has a name and a value
associated with it. When you execute at the R prompt
<<exampleGlobalEnv>>=
x <- 1
print(x)
@
the first line assigns the value 1 to the name/key ``x''. The second
line requests the value of ``x'' and prints out 1 to the console. R
handles the task of finding the appropriate value for ``x'' by
searching through a series of environments, including the namespaces
of the packages on the search list.
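
The key-value view of an environment can be made explicit with a few
base R functions. The following is only a minimal sketch: the
environment name \code{kv} is chosen here for illustration and is not
part of \pkg{filehash}.
<<envAsKeyValue>>=
## An ordinary environment used as an in-memory key-value store
kv <- new.env()
assign("x", 1, envir = kv)                  # insert the key "x"
get("x", envir = kv)                        # fetch the value for "x"
exists("x", envir = kv, inherits = FALSE)   # is the key present?
ls(kv)                                      # list all keys
rm("x", envir = kv)                         # delete the key
exists("x", envir = kv, inherits = FALSE)
@
Insert, fetch, exists, list, and delete are the same five operations
that a disk-based key-value database has to provide.
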

In most cases, R stores the values associated with keys in memory, so
that the value of \code{x} in the example above was stored in and
retrieved from physical memory. However, the idea of a key-value
database can be generalized beyond this particular configuration. For
example, as of R 2.0.0, much of the R code for R packages is stored in
a lazy-loaded database, where the values are initially stored on disk
and loaded into memory on first access~\citep{Rnews:Ripley:2004}.
Hence, when R starts up, it uses relatively little memory, while the
memory usage increases as more objects are requested. Data could also
be stored on other computers (e.g. websites) and retrieved over the
network.

The general S language concept of a database is described in Chapter 5
of the Green Book~\citep{cham:1998} and earlier
in~\cite{cham:1991}. Although the S and R languages have different
semantics with respect to how variable names are looked up and bound
to values, the general concept of using a key-value database applies
to both languages. Duncan Temple Lang has implemented this general
database framework for R in the \pkg{RObjectTables} package of
Omegahat~\citep{TempleLang:2002}. The \pkg{RObjectTables} package
provides an interface for connecting R with arbitrary backend systems,
allowing data values to be stored in potentially any format or
location. While the package itself does not include a specific
implementation, some examples are provided on the package's website.

The \pkg{filehash} package provides a full read-write implementation
of a key-value database for R. The package does not depend on any
external packages (beyond those provided in a standard R installation)
or software systems and is written entirely in R, making it readily
usable on most platforms. The \pkg{filehash} package can be thought
of as a specific implementation of the database concept described
in~\cite{cham:1991}, taking a slightly different approach to the
problem.
Both~\cite{TempleLang:2002} and~\cite{cham:1991} focus on
generalizing the notion of ``attach()-ing'' a database in an R/S
session so that variable names can be looked up automatically via the
search list. The \pkg{filehash} package represents a database as an
instance of an S4 class and operates directly on the S4 object via
various methods.

Key-value databases are sometimes called hash tables and indeed, the
name of the package comes from the idea of having a ``file-based hash
table''. With \pkg{filehash} the values are stored in a file on the
disk rather than in memory. When a user requests the values
associated with a key, \pkg{filehash} finds the object on the disk,
loads the value into R and returns it to the user. The package offers
two formats for storing data on the disk: The values can be stored (1)
concatenated together in a single file or (2) separately as a
directory of files.

\section{Related R packages}

There are other packages on CRAN designed specifically to help users
work with large datasets. Two packages that come immediately to mind
are the \pkg{g.data} package by David Brahm~\citep{brahm:2002} and the
\pkg{biglm} package by Thomas Lumley. The \pkg{g.data} package takes
advantage of the lazy evaluation mechanism in R via the
\code{delayedAssign} function. Briefly, objects are loaded into R as
promises to load the actual data associated with an object name. The
first time an object is requested, the promise is evaluated and the
data are loaded. From then on, the data reside in memory. The
mechanism used in \pkg{g.data} is similar to the one used by the
lazy-loaded databases described in~\cite{Rnews:Ripley:2004}.

The \pkg{biglm} package allows users to fit linear models on datasets
that are too large to fit in memory. However, the \pkg{biglm} package
does not provide methods for dealing with large datasets in general.
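
The promise mechanism behind \code{delayedAssign}, mentioned above in
connection with \pkg{g.data}, can be illustrated with a small base R
sketch. This is not the code used by \pkg{g.data}; the names
\code{counter} and \code{big} are hypothetical and serve only to show
that the expression is evaluated once, on first access.
<<delayedAssignSketch>>=
## Count how many times the "loading" expression is evaluated
counter <- new.env()
counter$n <- 0
delayedAssign("big", {
    counter$n <- counter$n + 1   # runs only when the promise is forced
    seq_len(5)                   # stands in for reading data from disk
})
counter$n   # the promise has not been forced yet
sum(big)    # first access forces the promise
sum(big)    # second access reuses the value now held in memory
counter$n   # the expression was evaluated exactly once
@
After the first access the value stays in memory, which is the
behavior described above for objects loaded as promises.
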

The \pkg{filehash} package also draws inspiration from Luke Tierney's
experimental \pkg{gdbm} package which implements a key-value database
via the GNU dbm (GDBM) library. The use of GDBM creates an external
dependence since the GDBM C library has to be compiled on each system.
In addition, I encountered a problem where databases created on 32-bit
machines could not be transferred to and read on 64-bit machines (and
vice versa). However, with the increasing use of 64-bit machines in
the future, it seems this problem will eventually go away.

The R Special Interest Group on Databases has developed a number of
packages that provide an R interface to commonly used relational
database management systems (RDBMS) such as MySQL (\pkg{RMySQL}),
PostgreSQL (\pkg{RPgSQL}), and Oracle (\pkg{ROracle}). These packages
use the S4 classes and generics defined in the \pkg{DBI} package and
have the advantage that they offer much better database functionality,
inherited via the use of a true database management system. However,
this benefit comes with the cost of having to install and use
third-party software. While installing an RDBMS may not be an
issue---many systems have them pre-installed and the \pkg{RSQLite}
package comes bundled with the source for the RDBMS---the need for the
RDBMS and knowledge of structured query language (SQL) nevertheless
adds some overhead. This overhead may serve as an impediment for
users in need of a database for simpler applications.

\section{Creating a filehash database}

Databases can be created with \pkg{filehash} using the \code{dbCreate}
function. The one required argument is the name of the database,
which we call here ``mydb''.
<<create>>=
library(filehash)
dbCreate("mydb")
db <- dbInit("mydb")
@
You can also specify the \code{type} argument which controls how the
database is represented on the backend. We will discuss the different
backends in further detail later. For now, we use the default backend
which is called ``DB1''.
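
As a brief sketch of the \code{type} argument, the following chunk
creates a database with the ``RDS'' backend instead of the default.
It assumes that \code{dbInit} accepts the same \code{type} argument
as \code{dbCreate}; the database name ``mydbRDS'' and the object name
\code{dbRDS} are arbitrary choices made for this example.
<<createRDS>>=
library(filehash)
## Create and open a database that uses the "RDS" backend
dbCreate("mydbRDS", type = "RDS")
dbRDS <- dbInit("mydbRDS", type = "RDS")
class(dbRDS)
## Remove the example database from the disk again
dbUnlink(dbRDS)
@
The object returned here belongs to a backend-specific class, but the
accessor functions used in the rest of this document are meant to
work the same way for either backend.
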

Once the database is created, it must be initialized in order to be
accessed. The \code{dbInit} function returns an S4 object inheriting
from class ``filehash''. Since this is a newly created database,
there are no objects in it.

\section{Accessing a filehash database}

<<setseed1>>=
set.seed(100)
@

The primary interface to filehash databases consists of the functions
\code{dbFetch}, \code{dbInsert}, \code{dbExists}, \code{dbList}, and
\code{dbDelete}. These functions are all generic---specific methods
exist for each type of database backend. They all take as their
first argument an object of class ``filehash''. To insert some data
into the database we can simply call \code{dbInsert}
<<insert>>=
dbInsert(db, "a", rnorm(100))
@
Here we have associated with the key ``a'' 100 standard normal random
variates. We can retrieve those values with \code{dbFetch}.
<<fetch>>=
value <- dbFetch(db, "a")
mean(value)
@

The function \code{dbList} lists all of the keys that are available in
the database, \code{dbExists} tests to see if a given key is in the
database, and \code{dbDelete} deletes a key-value pair from the
database
<<delete>>=
dbInsert(db, "b", 123)
dbDelete(db, "a")
dbList(db)
dbExists(db, "a")
@

While using functions like \code{dbInsert} and \code{dbFetch} is
straightforward, it can often be easier on the fingers to use standard
R subset and accessor functions like \code{\$}, \code{[[}, and
\code{[}. Filehash databases have methods for these functions so that
objects can be accessed in a more compact manner. Similarly,
replacement methods for these functions are also available. The
\verb+[+ function can be used to access multiple objects from the
database, in which case a list is returned.
<<accessors>>=
db$a <- rnorm(100, 1)
mean(db$a)
mean(db[["a"]])
db$b <- rnorm(100, 2)
dbList(db)
@
For all of the accessor functions, only character indices are allowed.
Numeric indices are caught and an error is given.
<<characteronly>>=
e <- local({
    err <- function(e) e
    tryCatch(db[[1]], error = err)
})
conditionMessage(e)
@

Finally, there is a method for the \code{with} generic function which
operates much like using \code{with} on lists or environments.

The following three statements all return the same value.
<<with>>=
with(db, c(a = mean(a), b = mean(b)))
@
When using \code{with}, the values of ``a'' and ``b'' are looked up in
the database.
<<sapply>>=
sapply(db[c("a", "b")], mean)
@
Here, using \code{[} on \code{db} returns a list with the values
associated with ``a'' and ``b''. Then \code{sapply} is applied in the
usual way on the returned list.
<<lapply>>=
unlist(lapply(db, mean))
@
In the last statement we call \code{lapply} directly on the
``filehash'' object. The \pkg{filehash} package defines a method for
\code{lapply} that allows the user to apply a function on all the
elements of a database directly. The method essentially loops through
all the keys in the database, loads each object separately and applies
the supplied function to each object. \code{lapply} returns a named
list with each element being the result of applying the supplied
function to an object in the database. There is an argument
\code{keep.names} to the \code{lapply} method which, if set to
\code{FALSE}, will drop all the names from the list.

<<cleanupMyDB>>=
dbUnlink(db)
rm(list = ls(all = TRUE))
@

\section{Loading filehash databases}

<<setseed2>>=
set.seed(200)
@

An alternative way of working with a filehash database is to load it
into an environment and access the element names directly, without
having to use any of the accessor functions. The \pkg{filehash}
function \code{dbLoad} works much like the standard R \code{load}
function except that \code{dbLoad} loads active bindings into a given
environment rather than the actual data. The active bindings are
created via the \code{makeActiveBinding} function in the \pkg{base}
package. \code{dbLoad} takes a filehash database and creates symbols
in an environment corresponding to the keys in the database.
It then calls \code{makeActiveBinding} to associate with each key a
function which loads the data associated with a given key.
Conceptually, active bindings are like pointers to the database.
After calling \code{dbLoad}, anytime an object with an active binding
is accessed the associated function (installed by
\code{makeActiveBinding}) loads the data from the database.

We can create a simple database to demonstrate the active binding
mechanism.
<<testDB>>=
dbCreate("testDB")
db <- dbInit("testDB")
db$x <- rnorm(100)
db$y <- runif(100)
db$a <- letters
dbLoad(db)
ls()
@
Notice that we appear to have some additional objects in our
workspace. However, the values of these objects are not stored in
memory---they are stored in the database. When one of the objects is
accessed, the value is automatically loaded from the database.
<<accessbinding>>=
mean(y)
sort(a)
@
If I assign a different value to one of these objects, its associated
value is updated in the database via the active binding mechanism.
<<assignvalue>>=
y <- rnorm(100, 2)
mean(y)
@
If I subsequently remove the objects from the workspace and reload
the database later, the updated value for ``y'' persists.
<<removeandload>>=
rm(list = ls())
db <- dbInit("testDB")
dbLoad(db)
ls()
mean(y)
@

Perhaps one disadvantage of the active binding approach taken here is
that whenever an object is accessed, the data must be reloaded into R.
This behavior is distinctly different from the delayed assignment
approach taken in \pkg{g.data} where an object must only be loaded
once and then is subsequently in memory. However, when using delayed
assignments, if one cycles through all of the objects in the database,
one could eventually exhaust the available memory.

<<cleanupTestDB>>=
dbUnlink(db)
rm(list = ls(all = TRUE))
@

\section{Other filehash utilities}

There are a few other utilities included with the \pkg{filehash}
package. Two of the utilities, \code{dumpObjects} and
\code{dumpImage}, are analogues of \code{save} and \code{save.image}.
Rather than save objects to an R workspace, \code{dumpObjects} saves
the given objects to a ``filehash'' database so that in the future,
individual objects can be reloaded if desired. Similarly,
\code{dumpImage} saves the entire workspace to a ``filehash''
database.

The function \code{dumpList} takes a list and creates a ``filehash''
database with values from the list. The list must have a non-empty
name for every element in order for \code{dumpList} to succeed.
\code{dumpDF} creates a ``filehash'' database from a data frame where
each column of the data frame is an element in the database.
Essentially, \code{dumpDF} converts the data frame to a list and calls
\code{dumpList}.

\section{Filehash database backends}

Currently, the \pkg{filehash} package can represent databases in two
different formats. The default format is called ``DB1'' and it stores
the keys and values in a single file. From experience, this format
works well overall but can be a little slow to initialize when there
are many thousands of keys. Briefly, the ``filehash'' object in R
stores a map which associates keys with a byte location in the
database file where the corresponding value is stored. Given the byte
location, we can \code{seek} to that location in the file and read the
data directly. Before reading in the data, a check is made to make
sure that the map is up to date. This format depends critically on
having a working \code{ftell} at the system level and a crude check is
made when trying to initialize a database of this format.

The second format is called ``RDS'' and it stores objects as separate
files on the disk in a directory with the same name as the database.
This format is the most straightforward and simple of the available
formats. When a request is made for a specific key, \pkg{filehash}
finds the appropriate file in the directory and reads the file into R.
The only catch is that on operating systems that use case-insensitive
file names, objects whose names differ only in case will collide on
the filesystem. To work around this, object names with capital
letters are stored with mangled names on the disk. An advantage of
this format is that most of the organizational work is delegated to
the filesystem.

\section{Extending filehash}

The \pkg{filehash} package has a mechanism for developing new backend
formats, should the need arise. The function \code{registerFormatDB}
can be used to make \pkg{filehash} aware of a new database format that
may be implemented in a separate R package or a file.
\code{registerFormatDB} takes two arguments: a \code{name} for the new
format (like ``DB1'' or ``RDS'') and a list of functions. The list
should contain two functions: one function named ``create'' for
creating a database, given the database name, and another function
named ``initialize'' for initializing the database. In addition, one
needs to define methods for \code{dbInsert}, \code{dbFetch}, etc.

A list of available backend formats can be obtained via the
\code{filehashFormats} function. Upon registering a new backend
format, the new format will be listed when \code{filehashFormats} is
called.

The interface for registering new backend formats is still
experimental and could change in the future.

\section{Discussion}

The \pkg{filehash} package has been designed to be useful in both a
programming setting and an interactive setting. Its main purpose is
to allow for simpler interaction with large datasets where
simultaneous access to the full dataset is not needed. While the
package may not be optimal for all settings, one goal was to write a
simple package in pure R that users could install with minimal
overhead. In the future I hope to add functionality for interacting
with databases stored on remote computers and perhaps incorporate a
``real'' database backend.
Some work has already begun on developing a backend based on the
\pkg{RSQLite} package.

\bibliographystyle{alpha}
\bibliography{combined}

\end{document}
filehash/inst/doc/filehash.R0000644000176200001440000000674414371234420015500 0ustar liggesusers
### R code from vignette source 'filehash.Rnw'

###################################################
### code chunk number 1: options
###################################################
options(width=60)


###################################################
### code chunk number 2: exampleGlobalEnv
###################################################
x <- 1
print(x)


###################################################
### code chunk number 3: create
###################################################
library(filehash)
dbCreate("mydb")
db <- dbInit("mydb")


###################################################
### code chunk number 4: setseed1
###################################################
set.seed(100)


###################################################
### code chunk number 5: insert
###################################################
dbInsert(db, "a", rnorm(100))


###################################################
### code chunk number 6: fetch
###################################################
value <- dbFetch(db, "a")
mean(value)


###################################################
### code chunk number 7: delete
###################################################
dbInsert(db, "b", 123)
dbDelete(db, "a")
dbList(db)
dbExists(db, "a")


###################################################
### code chunk number 8: accessors
###################################################
db$a <- rnorm(100, 1)
mean(db$a)
mean(db[["a"]])
db$b <- rnorm(100, 2)
dbList(db)


###################################################
### code chunk number 9: characteronly
###################################################
e <- local({
    err <- function(e) e
    tryCatch(db[[1]], error = err)
})
conditionMessage(e)
###################################################
### code chunk number 10: with
###################################################
with(db, c(a = mean(a), b = mean(b)))

###################################################
### code chunk number 11: sapply
###################################################
sapply(db[c("a", "b")], mean)

###################################################
### code chunk number 12: lapply
###################################################
unlist(lapply(db, mean))

###################################################
### code chunk number 13: cleanupMyDB
###################################################
dbUnlink(db)
rm(list = ls(all = TRUE))

###################################################
### code chunk number 14: setseed2
###################################################
set.seed(200)

###################################################
### code chunk number 15: testDB
###################################################
dbCreate("testDB")
db <- dbInit("testDB")
db$x <- rnorm(100)
db$y <- runif(100)
db$a <- letters
dbLoad(db)
ls()

###################################################
### code chunk number 16: accessbinding
###################################################
mean(y)
sort(a)

###################################################
### code chunk number 17: assignvalue
###################################################
y <- rnorm(100, 2)
mean(y)

###################################################
### code chunk number 18: removeandload
###################################################
rm(list = ls())
db <- dbInit("testDB")
dbLoad(db)
ls()
mean(y)

###################################################
### code chunk number 19: cleanupTestDB
###################################################
dbUnlink(db)
rm(list = ls(all = TRUE))
filehash/inst/CITATION0000644000176200001440000000104214371233670014152 0ustar liggesusers
bibentry(bibtype = "article",
         header = "The reference for the 'filehash' package is:",
         title =
"Interacting with data using the filehash package",
         author = c(person(given = "Roger D.", family = "Peng")),
         journal = "R News",
         year = "2006",
         volume = "6",
         number = "4",
         pages = "19--24",
         url = "https://cran.r-project.org/doc/Rnews/",
         textVersion = paste("Peng RD (2006).",
                             dQuote("Interacting with data using the filehash package,"),
                             "R News, 6 (4), 19--24.")
)
filehash/inst/NEWS0000644000176200001440000000466513007762012013522 0ustar liggesusers
Check the 'filehash' git repository for the latest updates on the package at http://repo.or.cz/w/filehash.git

Version 1.0
-----------

* The 'DB' format has been removed; users should use 'DB1' instead
* Internals of 'DB1' format have changed so that it should be a bit more reliable but perhaps a little slower
* The 'dbDisconnect' generic has been removed since it is no longer necessary for the 'DB1' format (as it was before). It was never needed for the 'RDS' format and one never existed for that format.

Version 0.9
-----------

* For 'filehashRDS' class, the 'dbDir' slot has been renamed to 'dir'.
* An attempt has been made to normalize the error handling to make it consistent.
* The various 'dump' functions have been given a 'type' argument

Version 0.8
-----------

* Added function dbLazyLoad for lazy loading filehash databases.
* dbCreate and dbInit are now generics with a method for character vectors. The behavior should be the same as before, by default.
* dbLoad is generic.
* The second argument to dbMultiFetch is 'key', not 'keys'.
* dbInitialize is deprecated
* 'DB1' and 'RDS' formats use normalizePath() for resolving paths to directories
* There is a vignette now [via vignette("filehash")]

Version 0.6-3
-------------

* Added methods for "[[", "$", "[[<-", and "$<-" for filehash objects. Only character indices are allowed
* filehash-DB functions use the new serialize() from R 2.4.0 so that numeric data will not suffer from rounding error due to previous use of serialize(ascii = TRUE).
* New format filehash-DB1 which stores the key index/map and data in a single file.
* New "filehash" method for lapply so that functions can be applied to database entries.

Version 0.4-1
-------------

* Patch release, changed some internals for the "DB" type databases
* Added test database for regression testing in future releases

Version 0.4
-----------

* Added name mangling scheme to prevent clobbering on case-insensitive OSes like Windows (thanks to Bill Venables and David Brahm)
* Added dumpImage, dumpObjects, dumpDF functions for dumping various things to filehash databases
* Added filehashOption() function for setting global options; right now only the default database type can be set
* dbLoad and db2env are regular functions now rather than generics/methods. dbLoad's default 'env' is the parent frame now
* Added a "filehash" method for 'with'
* Added new generic dbUnlink which deletes a database from the disk
filehash/inst/COPYING0000644000176200001440000000127113007762012014044 0ustar liggesusers
License
=======

`filehash' is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA