reshape/0000755000175100001440000000000013141637363011726 5ustar hornikusersreshape/inst/0000755000175100001440000000000013141121251012663 5ustar hornikusersreshape/inst/CITATION0000644000175100001440000000066513141121125014027 0ustar hornikuserscitHeader("To cite reshape in publications, please use:") citEntry(entry = "article", author = "Hadley Wickham", journal = "Journal of Statistical Software", number = "12", title = "Reshaping data with the reshape package", url = "http://www.jstatsoft.org/v21/i12/paper", volume = "21", year = "2007", textVersion = "H. Wickham. Reshaping data with the reshape package. Journal of Statistical Software, 21(12), 2007." ) reshape/NAMESPACE0000644000175100001440000000132213141121104013120 0ustar hornikusersexportPattern("^[^\\.]") import(plyr) importFrom("stats", "complete.cases", "mad", "median", "sd") importFrom("utils", "str", "type.convert") S3method(all.vars,character) S3method(as.data.frame,cast_df) S3method(as.data.frame,cast_matrix) S3method(as.matrix,cast_df) S3method(as.matrix,cast_matrix) S3method(colsplit,character) S3method(colsplit,factor) S3method(melt,array) S3method(melt,cast_df) S3method(melt,cast_matrix) S3method(melt,data.frame) S3method(melt,default) S3method(melt,list) S3method(melt,matrix) S3method(melt,table) S3method(print,cast_df) S3method(print,cast_matrix) S3method(rescaler,data.frame) S3method(rescaler,default) S3method(rescaler,matrix) S3method(str,cast_df) S3method(str,cast_matrix) reshape/CHANGELOG0000644000175100001440000000227013141121104013116 0ustar hornikusers Reshape 0.8: * melt.array now uses type.convert on dimnames to convert to appropriate type * preserve.na now renamed to na.rm to be consistent with other R functions * raw names for columns * margins now displayed with (all) instead of NA * extend melt.array to deal with case where there are partial dimnames - Thanks to Roberto Ugoccioni * add the Smiths dataset to the package * fixed bug when displaying margins with multiple result 
variables

Reshape 0.7.4
* only display all levels of a categorical variable when requested

Reshape 0.7.2
* display all levels of a categorical variable
* fixes to rescaler function
* added sparseby function contributed by Duncan Murdoch
* add rownames to high-D arrays

Reshape 0.7.1
* default to outputting data.frames
* now compatible with R 2.4
* added fill argument to cast, to specify what value should be used for structural missings
* fun.aggregate will always be applied if specified, even if no aggregation occurs
* margins now work for non-aggregated data
* cast will now accept a list of functions for fun.aggregate
* very long formulas will now work in cast
* fixed bug in rbind.fill
* should be able to melt any cast form
reshape/NEWS0000644000175100001440000000364113141121173012414 0ustar hornikusers
Reshape 0.8.7
------------------------------------------------

* fix outstanding R CMD check problems

Reshape 0.8.6
------------------------------------------------

* fix outstanding R CMD check problems

Reshape 0.8.5
---------------------------------------------------

* fix outstanding R CMD check problems

Reshape 0.8.4
---------------------------------------------------

* fix spelling mistake (indicies -> indices), thanks to Stavros Macrakis

Reshape 0.8.3 (2009-04-27)
---------------------------------------------------

* better rename example
* When removing missing values in melt, look only at measured variables, not id variables
* Fixes to documentation bugs revealed by new parser

Reshape 0.8.2 (2008-11-04)
--------------------------

* fixed bug where missing fill values were not getting correctly filled
* fill value defaults to fun.aggregate applied to zero-length vector. This produces better values in a wide variety of situations, for example missings will be filled with 0's when length or sum is used. This may require setting fill = NA for aggregation functions that previously returned NA, like sd and var.
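
  For orientation, the zero-length results behind this fill default can be
  checked in base R (a sketch only, not code taken from the package):

      length(numeric(0))   # 0
      sum(numeric(0))      # 0
      mean(numeric(0))     # NaN
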
Reshape 0.8.1 (2008-05-01)
--------------------------

Melt
* character dimension names are processed by type.convert
* by default, treat character and factor variables as id variables (i.e. integer variables no longer default to being id vars)
* ... now passed on to melt in melt.list (thanks to Charles Naylor)

Cast
* missing values in subsetting are now correctly dropped to match behaviour of subset()
* tweaks to cast and recast to make it easier to pass in formulas created elsewhere
* allow user to specify column used for values, guessing if necessary, and giving a warning message if value column does not exist
* improve error messages when melt or casting parameters incorrectly specified

General
* now depends on the new plyr package for many of the workhorse functions
reshape/data/0000755000175100001440000000000011440164053012626 5ustar hornikusers
reshape/data/smiths.rda0000644000175100001440000000037312057435674014647 0ustar hornikusers
reshape/data/tips.rda0000644000175100001440000000504412057435674014317 0ustar hornikusers
reshape/R/0000755000175100001440000000000013141121104012104 5ustar hornikusers
reshape/R/melt.r0000644000175100001440000001634313141121104013237 0ustar hornikusers
# Melt
# Melt an object into a form suitable for easy casting.
#
# This is the generic melt function. See the following functions
# for specific details for different data structures:
#
# \itemize{
#   \item \code{\link{melt.data.frame}} for data.frames
#   \item \code{\link{melt.array}} for arrays, matrices and tables
#   \item \code{\link{melt.list}} for lists
# }
#
# @keyword manip
# @arguments Data set to melt
# @arguments Other arguments passed to the specific melt method
melt <- function(data, ...) UseMethod("melt", data)

# Default melt function
# For vectors, make a column of a data frame
#
# @keyword internal
melt.default <- function(data, ...) {
  data.frame(value=data)
}

# Melt a list
# Melting a list recursively melts each component of the list and joins the results together
#
# @keyword internal
#X a <- as.list(1:4)
#X melt(a)
#X names(a) <- letters[1:4]
#X melt(a)
#X attr(a, "varname") <- "ID"
#X melt(a)
#X a <- list(matrix(1:4, ncol=2), matrix(1:6, ncol=2))
#X melt(a)
#X a <- list(matrix(1:4, ncol=2), array(1:27, c(3,3,3)))
#X melt(a)
#X melt(list(1:5, matrix(1:4, ncol=2)))
#X melt(list(list(1:3), 1, list(as.list(3:4), as.list(1:2))))
melt.list <- function(data, ..., level=1) {
  var <- nulldefault(attr(data, "varname"), paste("L", level, sep=""))
  names <- nulldefault(names(data), 1:length(data))
  parts <- lapply(data, melt, level=level+1, ...)

  namedparts <- mapply(function(x, name) {
    x[[var]] <- name
    x
  }, parts, names, SIMPLIFY=FALSE)
  do.call(rbind.fill, namedparts)
}

# Melt a data frame
# Melt a data frame into form suitable for easy casting.
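#
# A base R sketch of the same wide-to-long idea, for orientation only. It is
# not the package's implementation, and the data frame `wide` is made up for
# illustration:
#X wide <- data.frame(id = c("a", "b"), x = 1:2, y = 3:4)
#X long <- data.frame(id = rep(wide$id, 2), stack(wide[c("x", "y")]))
#X long   # one row per id and measured variable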
# # You need to tell melt which of your variables are id variables, and which # are measured variables. If you only supply one of \code{id.vars} and # \code{measure.vars}, melt will assume the remainder of the variables in the # data set belong to the other. If you supply neither, melt will assume # factor and character variables are id variables, and all others are # measured. # # @arguments Data set to melt # @arguments Id variables. If blank, will use all non measure.vars variables. Can be integer (variable position) or string (variable name) # @arguments Measured variables. If blank, will use all non id.vars variables. Can be integer (variable position) or string (variable name) # @arguments Name of the variable that will store the names of the original variables # @arguments Should NA values be removed from the data set? # @arguments Old argument name, now deprecated # @value molten data # @keyword manip # @seealso \url{http://had.co.nz/reshape/} #X head(melt(tips)) #X names(airquality) <- tolower(names(airquality)) #X melt(airquality, id=c("month", "day")) #X names(ChickWeight) <- tolower(names(ChickWeight)) #X melt(ChickWeight, id=2:4) melt.data.frame <- function(data, id.vars, measure.vars, variable_name = "variable", na.rm = !preserve.na, preserve.na = TRUE, ...) { if (!missing(preserve.na)) message("Use of preserve.na is now deprecated, please use na.rm instead") var <- melt_check(data, id.vars, measure.vars) if (length(var$measure) == 0) { return(data[, var$id, drop=FALSE]) } ids <- data[,var$id, drop=FALSE] df <- do.call("rbind", lapply(var$measure, function(x) { data.frame(ids, x, data[, x]) })) names(df) <- c(names(ids), variable_name, "value") df[[variable_name]] <- factor(df[[variable_name]], unique(df[[variable_name]])) if (na.rm) { df <- df[!is.na(df$value), , drop=FALSE] } rownames(df) <- NULL df } # Melt an array # This function melts a high-dimensional array into a form that you can use \code{\link{cast}} with. 
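#
# A base R sketch of the idea, for orientation only (it assumes nothing beyond
# base R and is not the package's implementation): every cell of the array
# becomes one row holding its indices and its value.
#X a <- array(1:24, c(2, 3, 4))
#X idx <- expand.grid(lapply(dim(a), seq_len))   # first index varies fastest
#X flat <- data.frame(idx, value = as.vector(a))
#X nrow(flat) == length(a)
#X all(a[as.matrix(idx)] == flat$value)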
#
# This code is conceptually similar to \code{\link{as.data.frame.table}}
#
# @arguments array to melt
# @arguments variable names to use in molten data.frame
# @keyword manip
# @alias melt.matrix
# @alias melt.table
#X a <- array(1:24, c(2,3,4))
#X melt(a)
#X melt(a, varnames=c("X","Y","Z"))
#X dimnames(a) <- lapply(dim(a), function(x) LETTERS[1:x])
#X melt(a)
#X melt(a, varnames=c("X","Y","Z"))
#X dimnames(a)[1] <- list(NULL)
#X melt(a)
melt.array <- function(data, varnames = names(dimnames(data)), ...) {
  values <- as.vector(data)

  dn <- dimnames(data)
  if (is.null(dn)) dn <- vector("list", length(dim(data)))

  dn_missing <- sapply(dn, is.null)
  dn[dn_missing] <- lapply(dim(data), function(x) 1:x)[dn_missing]

  char <- sapply(dn, is.character)
  dn[char] <- lapply(dn[char], type.convert)
  indices <- do.call(expand.grid, dn)

  names(indices) <- varnames

  data.frame(indices, value=values)
}

melt.table <- melt.array
melt.matrix <- melt.array

# Melt cast data.frames
# Melt the results of a cast
#
# This can be useful when performing complex aggregations - melting
# the result of a cast will do its best to figure out the correct variables
# to use as id and measured.
#
# @keyword internal
melt.cast_df <- function(data, drop.margins=TRUE, ...) {
  molten <- melt.data.frame(as.data.frame(data), id.vars=attr(data, "idvars"))

  cols <- rcolnames(data)
  rownames(cols) <- make.names(rownames(cols))

  molten <- cbind(molten[names(molten) != "variable"], cols[molten$variable, , drop=FALSE])

  if (drop.margins) {
    margins <- !complete.cases(molten[,names(molten) != "value", drop=FALSE])
    molten <- molten[!margins, ]
  }

  molten
}

# Melt cast matrices
# Melt the results of a cast
#
# Converts to a data frame and then uses \code{\link{melt.cast_df}}
#
# @keyword internal
melt.cast_matrix <- function(data, ...) {
  melt(as.data.frame(data))
}

# Melt check.
# Check that input variables to melt are appropriate.
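#
# A base R sketch of the default guess used when neither id.vars nor
# measure.vars is supplied, for orientation only (the data frame `df` is made
# up for illustration):
#X df <- data.frame(g = factor(c("a", "b")), s = c("u", "v"), n = 1:2,
#X                  stringsAsFactors = FALSE)
#X categorical <- sapply(df, function(x) class(x)[1]) %in%
#X   c("factor", "ordered", "character")
#X names(df)[categorical]    # guessed id variables: "g" "s"
#X names(df)[!categorical]   # guessed measured variables: "n"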
#
# If id.vars or measure.vars are missing, \code{melt_check} will do its
# best to impute them. If you only
# supply one of id.vars and measure.vars, melt will assume the remainder of
# the variables in the data set belong to the other. If you supply neither,
# melt will assume character and factor variables are id variables,
# and all others are measured.
#
# @keyword internal
# @arguments data frame
# @arguments Vector of identifying variable names or indexes
# @arguments Vector of measured variable names or indexes
# @value id list of id variable names
# @value measure list of measured variable names
melt_check <- function(data, id.vars, measure.vars) {
  varnames <- names(data)
  if (!missing(id.vars) && is.numeric(id.vars)) id.vars <- varnames[id.vars]
  if (!missing(measure.vars) && is.numeric(measure.vars)) measure.vars <- varnames[measure.vars]

  if (!missing(id.vars)) {
    unknown <- setdiff(id.vars, varnames)
    if (length(unknown) > 0) {
      stop("id variables not found in data: ", paste(unknown, collapse=", "), call. = FALSE)
    }
  }

  if (!missing(measure.vars)) {
    unknown <- setdiff(measure.vars, varnames)
    if (length(unknown) > 0) {
      stop("measure variables not found in data: ", paste(unknown, collapse=", "), call. = FALSE)
    }
  }

  if (missing(id.vars) && missing(measure.vars)) {
    categorical <- sapply(data, function(x) class(x)[1]) %in% c("factor", "ordered", "character")
    id.vars <- varnames[categorical]
    measure.vars <- varnames[!categorical]
    message("Using ", paste(id.vars, collapse=", "), " as id variables")
  }

  if (missing(id.vars)) id.vars <- varnames[!(varnames %in% c(measure.vars))]
  if (missing(measure.vars)) measure.vars <- varnames[!(varnames %in% c(id.vars))]

  list(id = id.vars, measure = measure.vars)
}
reshape/R/margins.r0000644000175100001440000000555313141121104013737 0ustar hornikusers
# Compute margins
# Compute marginal values.
# # @arguments data frame # @arguments margins to compute # @arguments all id variables # @arguments aggregation function # @arguments other argument passed to aggregation function # @keyword internal compute.margins <- function(data, margins, vars, fun.aggregate, ..., df=FALSE) { if (length(margins) == 0) return(data.frame()) if (missing(fun.aggregate) || is.null(fun.aggregate)) { warning("Margins require fun.aggregate: length used as default", call.=FALSE) fun.aggregate <- length } exp <- function(x) { if (df) { out <- condense.df(data, x, fun.aggregate, ...) } else { out <- expand(condense(data, x, fun.aggregate, ...)) } others <- setdiff(unlist(vars), x) out[, others] <- factor("(all)") out[, unlist(vars)] <- lapply(out[, unlist(vars)], factor) out } df <- do.call("rbind",lapply(margins, exp)) cat <- sapply(df, is.factor) fixlevel <- function(x) { factor(x, levels=c(setdiff(levels(x), "(all)"), "(all)")) } df[cat] <- lapply(df[cat], fixlevel) df[, c(which(cat), which(!cat))] } # Margin variables # Works out list of variables to margin over to get desired margins. # # Variables that can't be margined over are dropped silently. # # @arguments column variables # @arguments row variables # @arguments vector of variable names to margin over. 
# @keyword internal
margin.vars <- function(vars = list(NULL, NULL), margins = NULL) {
  rows <- vars[[1]]
  cols <- vars[[2]]

  # identical() keeps this test a single TRUE/FALSE even when margins is a
  # character vector with several names
  if (missing(margins) || is.null(margins) || identical(margins, FALSE)) return(NULL)

  # Nothing to margin over for last variable in column or row
  row.margins <- intersect(rows[-length(rows)], margins)
  if (length(row.margins) == 0 ) row.margins <- NULL
  col.margins <- intersect(cols[-length(cols)], margins)
  if (length(col.margins) == 0 ) col.margins <- NULL

  grand.row <- "grand_row" %in% margins
  grand.col <- "grand_col" %in% margins

  margin.intersect <- function(cols, col.margins, rows, row.margins) {
    unlist(lapply(col.margins, function(col) {
      c(lapply(row.margins, c, col), list(c(col, rows)))
    }), recursive = FALSE)
  }

  margins.all <- c(
    margin.intersect(cols, col.margins, rows, row.margins),
    margin.intersect(rows, row.margins, cols, col.margins)
  )

  if (grand.row && !is.null(rows)) margins.all <- compact(c(margins.all, list(cols), list(col.margins)))
  if (grand.col && !is.null(cols)) margins.all <- compact(c(margins.all, list(rows), list(row.margins)))

  if (
    (grand.col && grand.row && !is.null(rows) && !is.null(cols)) ||
    (grand.row && !is.null(rows) && is.null(cols)) ||
    (grand.col && !is.null(cols) && is.null(rows))
  ) margins.all <- c(margins.all, list(numeric(0)))

  duplicates <- duplicated(lapply(lapply(margins.all,function(x) if(!is.null(x)) sort(x)), paste, collapse=""))

  margins.all[!duplicates]
}
reshape/R/formula.r0000644000175100001440000000434513141121104013742 0ustar hornikusers
# Cast parse formula
# Parse formula for casting
#
# @value row character vector of row names
# @value col character vector of column names
# @value aggregate boolean whether aggregation will occur
# @keyword internal
#
#X cast_parse_formula("a + ...", letters[1:6])
#X cast_parse_formula("a | ...", letters[1:6])
#X cast_parse_formula("a + b ~ c ~ . | ...", letters[1:6])
cast_parse_formula <- function(formula = "... 
~ variable", varnames) { check_formula(formula, varnames) vars <- all.vars.character(formula) remainder <- varnames[!(varnames %in% c(unlist(vars), "value"))] replace.remainder <- function(x) if (any(x == "...")) c(x[x != "..."], remainder) else x list( m = lapply(vars$m, replace.remainder), l = rev(replace.remainder(vars$l)) ) } # Get all variables # All variables in character string of formula. # # Removes . # # @keyword internal # @returns list of variables in each part of formula #X all.vars.character("a + b") #X all.vars.character("a + b | c") #X all.vars.character("a + b") #X all.vars.character(". ~ a + b") #X all.vars.character("a ~ b | c + d + e") all.vars.character <- function(formula, blank.char = ".") { formula <- paste(formula, collapse="") vars <- function(x) { if (is.na(x)) return(NULL) remove.blank(strsplit(gsub("\\s+", "", x), "[*+]")[[1]]) } remove.blank <- function(x) { x <- x[x != blank.char] if(length(x) == 0) NULL else x } parts <- strsplit(formula, "\\|")[[1]] list( m = lapply(strsplit(parts[1], "~")[[1]], vars), l = vars(parts[2]) ) } # Check formula # Checks that formula is a valid reshaping formula. # # \enumerate{ # \item variable names not found in molten data # \item same variable used in multiple places # } # @arguments formula to check # @arguments vector of variable names # @keyword internal check_formula <- function(formula, varnames) { vars <- unlist(all.vars.character(formula)) unknown <- setdiff(vars, c(".", "...","result_variable",varnames)) if (length(unknown) > 0) stop("Casting formula contains variables not found in molten data: ", paste(unknown, collapse=", "), call. = FALSE) vars <- vars[vars != "."] if (length(unique(vars)) < length(vars)) stop("Variable names repeated", call. = FALSE) } reshape/R/stamp.r0000644000175100001440000000436413141121104013422 0ustar hornikusers# Stamp # Stamp is like reshape but the "stamping" function is passed the entire data frame, instead of just a few variables. 
#
# It is very similar to the \code{\link{by}} function except in the form
# of the output which is arranged using the formula as in \code{\link{reshape}}
#
# Note that it's very easy to create objects that R can't print with this
# function. You will probably want to save the results to a variable and
# then extract the results. See the examples.
#
# @arguments data.frame (not molten)
# @arguments formula that describes arrangement of result, columns ~ rows, see \code{\link{reshape}} for more information
# @arguments aggregation function to use, should take a data frame as the first argument
# @arguments arguments passed to the aggregation function
# @arguments margins to compute (character vector, or \code{TRUE} for all margins), can contain \code{grand_row} or \code{grand_col} to include grand row or column margins respectively.
# @arguments logical vector by which to subset the data frame, evaluated in the context of the data frame so you can
# @keyword manip
stamp <- function(data, formula = . ~ ., fun.aggregate, ..., margins=NULL, subset=TRUE, add.missing=FALSE) {
  if (inherits(formula, "formula")) formula <- deparse(substitute(formula))

  cast(data, formula, fun.aggregate, ..., margins=margins, subset=subset, df=TRUE,add.missing=add.missing, value="")
}

# Condense a data frame
# Condense
#
# @arguments data frame
# @arguments character vector of variables to condense over
# @arguments function to condense with
# @arguments arguments passed to condensing function
# @keyword manip
condense.df <- function(data, variables, fun, ...) {
  if (length(variables) == 0 ) {
    df <- data.frame(results = 0)
    df$results <- list(fun(data, ...))
    return(df)
  }

  sorted <- sort_df(data, variables)
  duplicates <- duplicated(sorted[,variables, drop=FALSE])
  index <- cumsum(!duplicates)

  results <- by(sorted, index, fun, ...)
cols <- sorted[!duplicates,variables, drop=FALSE] cols$results <- array(results) cols } # Tidy up stamped data set # @keyword internal tidystamp <- function(x) { bind <- function(i) cbind(x[i, -ncol(x),drop=FALSE], x$value[[i]]) l <- lapply(1:nrow(x), bind) do.call(rbind.fill, l) }reshape/R/factors.r0000644000175100001440000000174313141121104013735 0ustar hornikusers# Combine factor levels # Convenience function to make it easy to combine multiple levels # in a factor into one. # # @arguments factor variable # @arguments either a character vector of levels, or a numeric vector of their positions. See examples for more details. # @arguments label for other level # @keyword manip #X df <- data.frame(a = LETTERS[sample(5, 15, replace=TRUE)], y = rnorm(15)) #X combine_factor(df$a, c(1,2,2,1,2)) #X combine_factor(df$a, c(1:4, 1)) #X (f <- reorder(df$a, df$y)) #X percent <- tapply(abs(df$y), df$a, sum) #X combine_factor(f, c(order(percent)[1:3])) combine_factor <- function(fac, variable=levels(fac), other.label="Other") { n <- length(levels(fac)) if (length(variable) < n) { nvar <- c(seq(1, length(variable)), rep(length(variable)+1, n - length(variable))) factor(nvar[as.numeric(fac)], labels=c(levels(fac)[variable], other.label)) } else { factor(variable[as.numeric(fac)], labels=levels(fac)[!duplicated(variable)]) } } reshape/R/recast.r0000644000175100001440000000162213141121104013551 0ustar hornikusers# Recast # \link{melt} and \link{cast} data in a single step # # This conveniently wraps melting and casting a data frame into # one step. # # @arguments Data set to melt # @arguments Casting formula, see \link{cast} for specifics # @arguments Other arguments passed to \link{cast} # @arguments Identifying variables. If blank, will use all non measure.var variables # @arguments Measured variables. 
If blank, will use all non id.var variables
# @keyword manip
# @seealso \url{http://had.co.nz/reshape/}
#X recast(french_fries, time ~ variable, id.var=1:4)
recast <- function(data, formula, ..., id.var, measure.var) {
  if (any(c("id.vars", "measure.vars") %in% names(list(...)))) stop("its var, not vars\n")

  molten <- melt(data, id.var, measure.var)
  if (is.formula(formula)) formula <- deparse(formula)
  if (!is.character(formula)) formula <- as.character(formula)

  cast(molten, formula, ...)
}
reshape/R/dimnames.r0000644000175100001440000000704613141121104014073 0ustar hornikusers
# Cast matrix.
# Create a new cast matrix
#
# For internal use only
#
# @arguments matrix to turn into cast matrix
# @arguments list of dimension names (as data.frames), row, col, ...
# @value object of type \code{\link{cast_matrix}}
# @keyword internal
cast_matrix <- function(m, dimnames) {
  rdimnames(m) <- dimnames
  class(m) <- c("cast_matrix", class(m))
  dimnames(m) <- lapply(rdimnames(m), rownames)

  m
}

# Dimension names
# These methods provide easy access to the special dimension names
# associated with the output of reshape
#
# Reshape stores dimension names in a slightly different format to
# base R, to allow for (e.g.) multiple levels of column header. These
# accessor functions allow you to get and set them.
# # @alias rdimnames<- # @alias rcolnames # @alias rcolnames<- # @alias rrownames # @alias rrownames<- # @keyword internal rdimnames <- function(x) attr(x, "rdimnames") "rdimnames<-" <- function(x, value) { name <- function(df) { rownames(df) <- do.call("paste", c(df, sep="_")) df } value <- lapply(value, name) attr(x, "rdimnames") <- value attr(x, "idvars") <- colnames(value[[1]]) x } rcolnames <- function(x) rdimnames(x)[[2]] "rcolnames<-" <- function(x, value) { dn <- rdimnames(x) dn[[2]] <- value rdimnames(x) <- dn x } rrownames <- function(x) rdimnames(x)[[1]] "rrownames<-" <- function(x, value) { dn <- rdimnames(x) dn[[1]] <- value rdimnames(x) <- dn x } # as.data.frame.cast\_matrix # Convert cast matrix into a data frame # # Converts a matrix produced by cast into a data frame with # appropriate id columns. # # @argument Reshape matrix # @argument Argument required to match generic # @argument Argument required to match generic # @keyword internal as.data.frame.cast_matrix <- function(x, row.names, optional, ...) { unx <- unclass(x) colnames(unx) <- rownames(rcolnames(x)) r.df <- data.frame(rrownames(x), unx, check.names=FALSE) class(r.df) <- c("cast_df", "data.frame") attr(r.df, "idvars") <- attr(x, "idvars") attr(r.df, "rdimnames") <- attr(x, "rdimnames") rownames(r.df) <- 1:nrow(r.df) r.df } # as.matrix.cast\_df # Convert cast data.frame into a matrix # # Converts a data frame produced by cast into a matrix with # appropriate dimnames. # # @keyword internal as.matrix.cast_df <- function(x, ...) 
{
  ids <- attr(x, "idvars")
  mat <- as.matrix.data.frame(x[, setdiff(names(x), ids)])
  rownames(mat) <- rownames(rrownames(x))
  colnames(mat) <- rownames(rcolnames(x))

  attr(mat, "idvars") <- attr(x, "idvars")
  attr(mat, "rdimnames") <- attr(x, "rdimnames")
  class(mat) <- c("cast_matrix", class(mat))
  mat
}

# as.matrix.cast\_matrix
# Convert cast matrix into a matrix
#
# Strips off cast related attributes so matrix becomes a normal matrix
#
# @keyword internal
as.matrix.cast_matrix <- function(x, ...) {
  class(x) <- class(x)[-1]
  attr(x, "rdimnames") <- NULL
  attr(x, "idvars") <- NULL
  x
}

# as.data.frame.cast\_df
# Convert cast data.frame into a data frame
#
# Strips off cast related attributes so data frame becomes a normal data frame
#
# @keyword internal
as.data.frame.cast_df <- function(x, ...) {
  class(x) <- class(x)[-1]
  x
}

# Print cast objects
# Printing methods
#
# Used for printing.
#
# @keyword internal
# @alias str.cast_df
# @alias print.cast_matrix
# @alias print.cast_df
str.cast_df <- str.cast_matrix <- function(object, ...) {
  str(unclass(object))
}

print.cast_matrix <- print.cast_df <- function(x, ...) {
  class(x) <- class(x)[-1]
  attr(x, "idvars") <- NULL
  attr(x, "rdimnames") <- NULL
  NextMethod(x, ...)
}
reshape/R/utils.r0000644000175100001440000001701313141121104013431 0ustar hornikusers
# Guess value
# Guess name of value column
#
# Strategy:
# \enumerate{
#   \item Is value or (all) column present? If so, use that
#   \item Otherwise, guess that last column is the value column
# }
#
# @arguments Data frame to guess value column from
# @keyword internal
guess_value <- function(df) {
  if ("value" %in% names(df)) return("value")
  if ("(all)" %in% names(df)) return("(all)")

  last <- names(df)[ncol(df)]
  message("Using ", last, " as value column. 
Use the value argument to cast to override this choice") last } # Merge all # Merge together a series of data.frames # # Order of data frames should be from most complete to least complete # # @arguments list of data frames to merge # @seealso \code{\link{merge_recurse}} # @keyword manip merge_all <- function(dfs, ...) { if (length(dfs)==1) return(dfs[[1]]) df <- merge_recurse(dfs, ...) df <- df[, match(names(dfs[[1]]), names(df))] df[do.call("order", df[, -ncol(df), drop=FALSE]), ,drop=FALSE] } # Merge recursively # Recursively merge data frames # # @arguments list of data frames to merge # @seealso \code{\link{merge_all}} # @keyword internal merge_recurse <- function(dfs, ...) { if (length(dfs) == 2) { merge(dfs[[1]], dfs[[2]], all=TRUE, sort=FALSE, ...) } else { merge(dfs[[1]], Recall(dfs[-1]), all=TRUE, sort=FALSE, ...) } } # Expand grid # Expand grid of data frames # # Creates new data frame containing all combination of rows from # data.frames in \code{...} # # @arguments list of data frames (first varies fastest) # @arguments only use unique rows? # @keyword manip #X expand.grid.df(data.frame(a=1,b=1:2)) #X expand.grid.df(data.frame(a=1,b=1:2), data.frame()) #X expand.grid.df(data.frame(a=1,b=1:2), data.frame(c=1:2, d=1:2)) #X expand.grid.df(data.frame(a=1,b=1:2), data.frame(c=1:2, d=1:2), data.frame(e=c("a","b"))) expand.grid.df <- function(..., unique=TRUE) { dfs <- list(...) notempty <- sapply(dfs, ncol) != 0 if (sum(notempty) == 1) return(dfs[notempty][[1]]) if (unique) dfs <- lapply(dfs, unique) indexes <- lapply(dfs, function(x) 1:nrow(x)) grid <- do.call(expand.grid, indexes) df <- do.call(data.frame, mapply(function(df, index) df[index, ,drop=FALSE], dfs, grid)) colnames(df) <- unlist(lapply(dfs, colnames)) rownames(df) <- 1:nrow(df) return(df) } # Sort data frame # Convenience method for sorting a data frame using the given variables. 
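#
# A base R sketch of the ordering step, for orientation only (the data frame
# `df` is made up for illustration): the sort columns are handed to order()
# as separate arguments, so ties in the first column are broken by the second.
#X df <- data.frame(a = c(2, 1, 2), b = c("y", "x", "x"))
#X df[do.call("order", df[, c("a", "b"), drop = FALSE]), , drop = FALSE]
#X # rows come back in the order 2, 3, 1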
#
# Simple wrapper around order
#
# @arguments data frame to sort
# @arguments variables to use for sorting
# @returns sorted data frame
# @keyword manip
sort_df <- function(data, vars=names(data)) {
  if (length(vars) == 0 || is.null(vars)) return(data)
  data[do.call("order", data[,vars, drop=FALSE]), ,drop=FALSE]
}

# Untable a dataset
# Inverse of table
#
# Given a tabulated dataset (or matrix) this will untabulate it
# by repeating each row by the number of times it was repeated
#
# @arguments matrix or data.frame to untable
# @arguments vector of counts (of same length as \code{df})
# @keyword manip
untable <- function(df, num) {
  df[rep(1:nrow(df), num), ]
}

# Unique default
# Convenience function for setting default if not unique
#
# Used by ggplot2
#
# @arguments vector of values
# @arguments default to use if values not unique
# @keyword manip
uniquedefault <- function(values, default) {
  unq <- unique(values)
  # fall back to the supplied default rather than a hard-coded "black"
  if (length(unq) == 1) unq[1] else default
}

# Rename
# Rename an object
#
# The rename function provides an easy way to rename the columns of a
# data.frame or the items in a list.
#
# @arguments object to be renamed
# @arguments named vector specifying new names
# @keyword manip
#X rename(mtcars, c(wt = "weight", cyl = "cylinders"))
#X a <- list(a = 1, b = 2, c = 3)
#X rename(a, c(b = "a", c = "b", a="c"))
#X
#X # Example supplied by Timothy Bates
#X names <- c("john", "tim", "andy")
#X ages <- c(50, 46, 25)
#X mydata <- data.frame(names,ages)
#X names(mydata) #-> "names", "ages"
#X
#X # let's change "ages" to singular.
#X # nb: The operation is not done in place, so you need to set your
#X # data to that returned from rename
#X
#X mydata <- rename(mydata, c(ages="age"))
#X names(mydata) #-> "names", "age"
rename <- function(x, replace) {
  replacement <- replace[names(x)]
  names(x)[!is.na(replacement)] <- replacement[!is.na(replacement)]
  x
}

# Round any
# Round to multiple of any number
#
# Useful when you want to round a number to arbitrary precision
#
# @arguments numeric vector to round
# @arguments number to round to
# @arguments function to use for round (eg. \code{\link{floor}})
# @keyword internal
#X round_any(135, 10)
#X round_any(135, 100)
#X round_any(135, 25)
#X round_any(135, 10, floor)
#X round_any(135, 100, floor)
#X round_any(135, 25, floor)
#X round_any(135, 10, ceiling)
#X round_any(135, 100, ceiling)
#X round_any(135, 25, ceiling)
round_any <- function(x, accuracy, f=round) {
  f(x / accuracy) * accuracy
}

# Update list
# Update a list, but don't create new entries
#
# Don't know what this is used for!
#
# @arguments list to be updated
# @arguments list with updated values
# @keyword internal
updatelist <- function(x, y) {
  common <- intersect(names(x),names(y))
  x[common] <- y[common]
  x
}

# Nested.by function
# Nest series of by statements returning nested list
#
# Work horse for producing cast lists.
#
# @keyword internal
nested.by <- function(data, INDICES, FUN, ...) {
  if (length(compact(INDICES)) == 0 || is.null(INDICES)) return(FUN(data, ...))

  FUNx <- function(x) FUN(data[x, ], ...)
  nd <- nrow(data)

  if (length(INDICES) == 1) {
    return(with(data, tapply(1:nd, INDICES[[1]], FUNx)))
  }

  tapply(1:nd, INDICES[[length(INDICES)]], function(x) {
    nested.by(data[x, ], lapply(INDICES[-length(INDICES)],"[", x), FUN, ...)
  }, simplify=FALSE)
}

# Split a vector into multiple columns
# This function can be used to split up a column that has been pasted together.
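#
# A minimal usage sketch (illustrative values, not from the original source):
# split "letter_number" strings on the underscore into two named columns.
#X colsplit(c("a_1", "b_2", "c_3"), "_", c("letter", "number"))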
# # @arguments character vector or factor to split up # @arguments regular expression to split on # @arguments names for output columns # @keyword manip # @alias colsplit.factor # @alias colsplit.character colsplit <- function(x, split="", names) UseMethod("colsplit", x) colsplit.factor <- function(x, split="", names) colsplit(as.character(x), split, names) colsplit.character <- function(x, split="", names) { vars <- as.data.frame(do.call(rbind, strsplit(x, split))) names(vars) <- names as.data.frame(lapply(vars, function(x) type.convert(as.character(x)))) } # Aggregate multiple functions into a single function # Combine multiple functions to a single function returning a named vector of outputs # # Each function should produce a single number as output # # @arguments functions to combine # @keyword manip #X funstofun(min, max)(1:10) #X funstofun(length, mean, var)(rnorm(100)) funstofun <- function(...) { fnames <- sapply(match.call()[-1], deparse) fs <- list(...) n <- length(fs) function(x, ...) { results <- vector("numeric", length=n) for(i in seq_len(n)) results[[i]] <- fs[[i]](x, ...) names(results) <- fnames results } } # Null default # Use default value when null # # Handy method when argument defaults aren't good enough. 
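#
# A minimal usage sketch (illustrative values, not from the original source):
#X nulldefault(NULL, 10) # x is NULL, so the default is used
#X nulldefault(5, 10)    # x is not NULL, so it is returned unchanged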
#
# @keyword internal
nulldefault <- function(x, default) {
  if (is.null(x)) default else x
}

# Name rows
# Add variable to data frame containing rownames
#
# This is useful when the thing that you want to melt by is the rownames
# of the data frame, not an explicit variable
#
# @arguments data frame
# @arguments name of new column containing rownames
# @keyword manip
namerows <- function(df, col.name = "id") {
  df[[col.name]] = rownames(df)
  df
}
reshape/R/rescale.r0000644000175100001440000000361713141121104013714 0ustar hornikusers
# Rescaler
# Convenient methods for rescaling data
#
# Provides methods for vectors, matrices and data.frames
#
# Currently, five rescaling options are implemented:
#
# \itemize{
#   \item \code{I}: do nothing
#   \item \code{range}: scale to [0, 1]
#   \item \code{rank}: convert values to ranks
#   \item \code{robust}: robust version of \code{sd}, subtract median and divide by median absolute deviation
#   \item \code{sd}: subtract mean and divide by standard deviation
# }
#
# @arguments object to rescale
# @arguments type of rescaling to use (see description for details)
# @arguments other options (only passed to \code{\link{rank}})
# @keyword manip
# @seealso \code{\link{rescaler.default}}
rescaler <- function(x, type="sd", ...) UseMethod("rescaler", x)

# Default rescaler
# See \code{\link{rescaler}} for details
#
# @arguments vector to rescale
# @arguments type of rescaling to apply
# @arguments other arguments passed to rescaler
# @keyword internal
rescaler.default <- function(x, type="sd", ...)
{
  switch(type,
    rank = rank(x, ...),
    var = ,
    sd = (x - mean(x, na.rm=TRUE)) / sd(x, na.rm=TRUE),
    robust = (x - median(x, na.rm=TRUE)) / mad(x, na.rm=TRUE),
    I = x,
    range = (x - min(x, na.rm=TRUE)) / diff(range(x, na.rm=TRUE))
  )
}

# Rescale a data frame
# Rescales data frame by columns
#
# @arguments data.frame to rescale
# @arguments type of rescaling to apply
# @arguments other arguments passed to rescaler
# @keyword internal
rescaler.data.frame <- function(x, type="sd", ...) {
  continuous <- sapply(x, is.numeric)
  x[continuous] <- lapply(x[continuous], rescaler, type=type, ...)
  x
}

# Rescale a matrix
# Rescales matrix by columns
#
# @arguments matrix to rescale
# @arguments type of rescaling to apply
# @arguments other arguments passed to rescaler
# @keyword internal
rescaler.matrix <- function(x, type="sd", ...) {
  apply(x, 2, rescaler, type=type, ...)
}
reshape/R/sparse-by.r0000644000175100001440000000505413141121104014200 0ustar hornikusers
# a version of by for cases where the dataset doesn't cross all the variables
# based on an idea by Hadley Wickham
# This function assumes that the data is in a matrix or data.frame, and returns data in a
# matrix or data frame. It tries not to turn matrices into data frames except when necessary.
# It should be possible to parallelize it....

sparseby <- function (data, INDICES = list(), FUN, ..., GROUPNAMES = TRUE)
{
  cbind2 <- function (...) {
    # sapply returns a logical vector; all() cannot take the list that lapply returns
    if (all(sapply(list(...), is.numeric)))
      cbind(...)
else do.call("cbind.data.frame", list(...)) } if (is.list(INDICES)) IND <- do.call("cbind2", INDICES) else if (is.null(dim(INDICES)) || length(dim(INDICES)) < 2) { IND <- matrix(INDICES, ncol = 1, dimnames = list(NULL, deparse(substitute(INDICES)))) } else if (length(dim(INDICES)) > 2) stop("Cannot handle multi-dimensional indices") else IND = INDICES if (nrow(IND) == 0 ) { result <- rbind(FUN(data, ...)) } else { ncols <- function (x) { if (is.matrix(x) || is.data.frame(x)) return(ncol(x)) else return(length(x)) } if (length(colnames(IND)) == 0) colnames(IND) <- rep("", ncols(IND)) colnames(IND) <- ifelse(colnames(IND) == "", paste("V", 1:ncols(IND), sep=""), colnames(IND)) o <- do.call("order", as.data.frame(IND)) keys <- IND[o,,drop=FALSE] df <- data[o,,drop=FALSE] # duplicates <- duplicated(keys) # Faster way, since we know the keys are sorted: duplicates <- c(FALSE,apply(keys[1:(nrow(keys)-1),,drop=FALSE] != keys[2:nrow(keys),,drop=FALSE],1,sum) == 0) index <- cumsum(!duplicates) FUNx <- function (x) FUN(df[x,,drop=FALSE], ...) result <- tapply(1:nrow(df), index, FUNx, simplify=FALSE) # Drop NULLs from results nulls <- unlist(lapply(result, is.null)) result <- result[!nulls] if (length(result) == 0) return(NULL) lens <- range(lapply(result, ncols)) if (lens[1] != lens[2]) stop("function returns inconsistent lengths") if (GROUPNAMES) { keys[index,] <- keys keys <- keys[(1:length(nulls))[!nulls],,drop=FALSE] if (all(lapply(result, function(x) length(dim(x)) == 2))) keys <- keys[rep(1:length(result), lapply(result, nrow)),,drop=FALSE] else keys <- keys[(1:length(result)),,drop=FALSE] } result <- do.call("rbind", result) if (GROUPNAMES) result <- cbind2(keys, result) } return(result) } reshape/R/condense.r0000644000175100001440000000426513141121104014074 0ustar hornikusers# Condense # Condense a data frame. # # Works very much like by, but keeps data in original data frame format. # Results column is a list, so that each cell may contain an object or a vector etc. 
# Assumes data is in molten format. Aggregating function must return the
# same number of values for every input.
#
# @arguments data frame
# @arguments variables to condense over
# @arguments aggregating function, may return multiple values
# @arguments further arguments passed on to aggregating function
# @keyword manip
# @keyword internal
condense <- function(data, variables, fun, ...) {
  if (length(variables) == 0 ) {
    df <- data.frame(result = 0)
    df$result <- list(fun(data$value, ...))
    return(df)
  }

  sorted <- sort_df(data, variables)[,c(variables, "value"), drop=FALSE]
  duplicates <- duplicated(sorted[,variables, drop=FALSE])
  index <- cumsum(!duplicates)

  results <- tapply(sorted$value, index, fun, ..., simplify = FALSE)

  cols <- sorted[!duplicates,variables, drop=FALSE]
  cols$result <- array(results)
  cols
}

# Expand
# Expand out condensed data frame.
#
# If aggregating function supplied to condense returns multiple values, this
# function "melts" it again, creating a new column called result\_variable.
#
# If the aggregating function returns a named vector, then those names will be
# used, otherwise the values will be numbered X1, X2, ..., Xn etc.
#
# @arguments condensed data frame
# @keyword manip
# @keyword internal
expand <- function(data) {
  lengths <- unique(sapply(data$result, length))
  if (lengths == 1) return(data)

  first <- data[1, "result"][[1]]

  exp <- lapply(1:length(first), function(x) as.vector(unlist(lapply(data$result, "[", x))))
  names(exp) <- if (is.null(names(first))) make.names(1:length(first)) else make.names(names(first))

  x <- melt(data.frame(data[, seq_len(ncol(data) -1), drop=FALSE], exp), m=names(exp),variable_name="result_variable")
  colnames(x)[match("value", colnames(x), FALSE)] <- "result"
  x
}

# Clean variables.
# Clean variable list for reshaping.
#
# @arguments vector of variable names
# @value Vector of "real" variable names (excluding result\_variable etc.)
# @keyword internal clean.vars <- function(vars) {vars[vars != "result_variable"]} reshape/R/cast.r0000644000175100001440000003303213141121104013222 0ustar hornikusers# Cast function # Cast a molten data frame into the reshaped or aggregated form you want # # Along with \code{\link{melt}} and \link{recast}, this is the only function you should ever need to use. # Once you have melted your data, cast will arrange it into the form you desire # based on the specification given by \code{formula}. # # The cast formula has the following format: \code{x_variable + x_2 ~ y_variable + y_2 ~ z_variable ~ ... | list_variable + ... } # The order of the variables makes a difference. The first varies slowest, and the last # fastest. There are a couple of special variables: "..." represents all other variables # not used in the formula and "." represents no variable, so you can do \code{formula=var1 ~ .} # # Creating high-D arrays is simple, and allows a class of transformations that are hard # without \code{\link{apply}} and \code{\link{sweep}} # # If the combination of variables you supply does not uniquely identify one row in the # original data set, you will need to supply an aggregating function, \code{fun.aggregate}. # This function should take a vector of numbers and return a summary statistic(s). It must # return the same number of arguments regardless of the length of the input vector. # If it returns multiple value you can use "result\_variable" to control where they appear. # By default they will appear as the last column variable. # # The margins argument should be passed a vector of variable names, eg. # \code{c("month","day")}. It will silently drop any variables that can not be margined # over. You can also use "grand\_col" and "grand\_row" to get grand row and column margins # respectively. 
# # Subset takes a logical vector that will be evaluated in the context of \code{data}, # so you can do something like \code{subset = variable=="length"} # # All the actual reshaping is done by \code{\link{reshape1}}, see its documentation # for details of the implementation # # @keyword manip # @arguments molten data frame, see \code{\link{melt}} # @arguments casting formula, see details for specifics # @arguments aggregation function # @arguments further arguments are passed to aggregating function # @arguments vector of variable names (can include "grand\_col" and "grand\_row") to compute margins for, or TRUE to computer all margins # @arguments logical vector to subset data set with before reshaping # @arguments argument used internally # @arguments value with which to fill in structural missings, defaults to value from applying \code{fun.aggregate} to 0 length vector # @argument should all missing combinations be displayed? # @argument name of column which stores values, see \code{\link{guess_value}} for default strategies to figure this out # @seealso \code{\link{reshape1}}, \url{http://had.co.nz/reshape/} #X #Air quality example #X names(airquality) <- tolower(names(airquality)) #X aqm <- melt(airquality, id=c("month", "day"), na.rm=TRUE) #X #X cast(aqm, day ~ month ~ variable) #X cast(aqm, month ~ variable, mean) #X cast(aqm, month ~ . 
| variable, mean) #X cast(aqm, month ~ variable, mean, margins=c("grand_row", "grand_col")) #X cast(aqm, day ~ month, mean, subset=variable=="ozone") #X cast(aqm, month ~ variable, range) #X cast(aqm, month ~ variable + result_variable, range) #X cast(aqm, variable ~ month ~ result_variable,range) #X #X #Chick weight example #X names(ChickWeight) <- tolower(names(ChickWeight)) #X chick_m <- melt(ChickWeight, id=2:4, na.rm=TRUE) #X #X cast(chick_m, time ~ variable, mean) # average effect of time #X cast(chick_m, diet ~ variable, mean) # average effect of diet #X cast(chick_m, diet ~ time ~ variable, mean) # average effect of diet & time #X #X # How many chicks at each time? - checking for balance #X cast(chick_m, time ~ diet, length) #X cast(chick_m, chick ~ time, mean) #X cast(chick_m, chick ~ time, mean, subset=time < 10 & chick < 20) #X #X cast(chick_m, diet + chick ~ time) #X cast(chick_m, chick ~ time ~ diet) #X cast(chick_m, diet + chick ~ time, mean, margins="diet") #X #X #Tips example #X cast(melt(tips), sex ~ smoker, mean, subset=variable=="total_bill") #X cast(melt(tips), sex ~ smoker | variable, mean) #X #X ff_d <- melt(french_fries, id=1:4, na.rm=TRUE) #X cast(ff_d, subject ~ time, length) #X cast(ff_d, subject ~ time, length, fill=0) #X cast(ff_d, subject ~ time, function(x) 30 - length(x)) #X cast(ff_d, subject ~ time, function(x) 30 - length(x), fill=30) #X cast(ff_d, variable ~ ., c(min, max)) #X cast(ff_d, variable ~ ., function(x) quantile(x,c(0.25,0.5))) #X cast(ff_d, treatment ~ variable, mean, margins=c("grand_col", "grand_row")) #X cast(ff_d, treatment + subject ~ variable, mean, margins="treatment") cast <- function(data, formula = ... 
~ variable, fun.aggregate=NULL, ..., margins=FALSE, subset=TRUE, df=FALSE, fill=NULL, add.missing=FALSE, value = guess_value(data)) { if (is.formula(formula)) formula <- deparse(formula) if (!is.character(formula)) formula <- as.character(formula) subset <- eval(substitute(subset), data, parent.frame()) subset <- !is.na(subset) & subset data <- data[subset, , drop=FALSE] variables <- cast_parse_formula(formula, names(data)) if (any(names(data) == value)) names(data)[names(data) == value] <- "value" v <- unlist(variables) v <- v[v != "result_variable"] if (add.missing) data[v] <- lapply(data[v], as.factor) if (length(fun.aggregate) > 1) fun.aggregate <- do.call(funstofun, as.list(match.call()[[4]])[-1]) if (!is.null(fun.aggregate) && is.character(fun.aggregate)) fun.aggregate <- match.fun(fun.aggregate) if (!is.null(variables$l)) { res <- nested.by(data, data[variables$l], function(x) { reshape1(x, variables$m, fun.aggregate, margins=margins, df=df, fill=fill, add.missing=add.missing, ...) }) } else { res <- reshape1(data, variables$m, fun.aggregate, margins=margins, df=df,fill=fill, add.missing=add.missing, ...) } #attr(res, "formula") <- formula #attr(res, "data") <- deparse(substitute(data)) res } # Casting workhorse. # Takes data frame and variable list and casts data. 
# # @arguments data frame # @arguments variables to appear in columns # @arguments variables to appear in rows # @arguments aggregation function # @arguments should the aggregating function be supplied with the entire data frame, or just the relevant entries from the values column # @arguments vector of variable names (can include "grand\_col" and "grand\_row") to compute margins for, or TRUE to computer all margins # @arguments value with which to fill in structural missings # @arguments further arguments are passed to aggregating function # @seealso \code{\link{cast}} # @keyword internal #X #X ffm <- melt(french_fries, id=1:4, na.rm = TRUE) #X # Casting lists ---------------------------- #X cast(ffm, treatment ~ rep | variable, mean) #X cast(ffm, treatment ~ rep | subject, mean) #X cast(ffm, treatment ~ rep | time, mean) #X cast(ffm, treatment ~ rep | time + variable, mean) #X names(airquality) <- tolower(names(airquality)) #X aqm <- melt(airquality, id=c("month", "day"), preserve=FALSE) #X #Basic call #X reshape1(aqm, list("month", NULL), mean) #X reshape1(aqm, list("month", "variable"), mean) #X reshape1(aqm, list("day", "month"), mean) #X #X #Explore margins ---------------------------- #X reshape1(aqm, list("month", NULL), mean, "month") #X reshape1(aqm, list("month", NULL) , mean, "grand_col") #X reshape1(aqm, list("month", NULL) , mean, "grand_row") #X #X reshape1(aqm, list(c("month", "day"), NULL), mean, "month") #X reshape1(aqm, list(c("month"), "variable"), mean, "month") #X reshape1(aqm, list(c("variable"), "month"), mean, "month") #X reshape1(aqm, list(c("month"), "variable"), mean, c("month","variable")) #X #X reshape1(aqm, list(c("month"), "variable"), mean, c("grand_row")) #X reshape1(aqm, list(c("month"), "variable"), mean, c("grand_col")) #X reshape1(aqm, list(c("month"), "variable"), mean, c("grand_row","grand_col")) #X #X reshape1(aqm, list(c("variable","day"),"month"), mean,c("variable")) #X reshape1(aqm, list(c("variable","day"),"month"), 
mean,c("variable","grand_row")) #X reshape1(aqm, list(c("month","day"), "variable"), mean, "month") #X #X # Multiple fnction returns ---------------------------- #X reshape1(aqm, list(c("month", "result_variable"), NULL), range) #X reshape1(aqm, list(c("month"),"result_variable") , range) #X reshape1(aqm, list(c("result_variable", "month"), NULL), range) #X #X reshape1(aqm, list(c("month", "result_variable"), "variable"), range, "month") #X reshape1(aqm, list(c("month", "result_variable"), "variable"), range, "variable") #X reshape1(aqm, list(c("month", "result_variable"), "variable"), range, c("variable","month")) #X reshape1(aqm, list(c("month", "result_variable"), "variable"), range, c("grand_col")) #X reshape1(aqm, list(c("month", "result_variable"), "variable"), range, c("grand_row")) #X #X reshape1(aqm, list(c("month"), c("variable")), function(x) diff(range(x))) reshape1 <- function(data, vars = list(NULL, NULL), fun.aggregate=NULL, margins, df=FALSE, fill=NA, add.missing=FALSE, ...) { vars.clean <- lapply(vars, clean.vars) variables <- unlist(vars.clean) if (!missing(margins) && isTRUE(margins)) margins <- c(variables, "grand_row", "grand_col") aggregate <- nrow(unique(data[,variables, drop=FALSE])) < nrow(data) || !is.null(fun.aggregate) if (aggregate) { if (missing(fun.aggregate) || is.null(fun.aggregate)) { message("Aggregation requires fun.aggregate: length used as default") fun.aggregate <- length } if (is.null(fill)) { fill <- suppressWarnings(fun.aggregate(data$value[0])) } if (!df) { data.r <- expand(condense(data, variables, fun.aggregate, ...)) } else { data.r <- condense.df(data, variables, fun.aggregate, ...) 
    }
    if ("result_variable" %in% names(data.r) && !("result_variable" %in% unlist(vars))) {
      vars[[2]] <- c(vars[[2]], "result_variable")
    }
  } else {
    data.r <- data.frame(data[,c(variables), drop=FALSE], result = data$value)
    if (!is.null(fun.aggregate)) data.r$result <- sapply(data.r$result, fun.aggregate)
    if (is.null(fill)) {
      fill <- NA
    }
  }

  if (length(vars.clean) > 2 && margins) {
    # call. = FALSE omits the calling function from the warning message
    warning("Sorry, you currently can't use margins with high-D arrays", call.=FALSE)
    margins <- FALSE
  }

  margins.r <- compute.margins(data, margin.vars(vars.clean, margins), vars.clean, fun.aggregate, ..., df=df)

  if (ncol(margins.r) > 0) {
    need.factorising <- !sapply(data.r, is.factor) & sapply(margins.r, is.factor)
    data.r[need.factorising] <- lapply(data.r[need.factorising], factor)
  }
  result <- sort_df(rbind.fill(data.r, margins.r), unlist(vars))

  if (add.missing) result <- add.missing.levels(result, unlist(vars), fill=fill)
  result <- add.all.combinations(result, vars, fill=fill)

  dimnames <- lapply(vars, function(x) dim_names(result, x))

  r <- if (!df) unlist(result$result) else result$result
  reshaped <- array(r, rev(sapply(dimnames, nrow)))

  reshaped <- aperm(reshaped, length(dim(reshaped)):1)
  dimnames(reshaped) <- lapply(dimnames, function(x) apply(x, 1, paste, collapse="-"))
  names(dimnames(reshaped)) <- lapply(vars, paste, collapse="-")

  if (length(vars.clean) > 2) return(reshaped)
  if (df) return(cast_matrix(reshaped, dimnames))
  as.data.frame(cast_matrix(reshaped, dimnames))
}

# Add all combinations
# Add all combinations of the given rows and columns to the data frames.
#
# This function is used to ensure that we have a matrix of the appropriate
# dimensionality with no missing cells.
# # @arguments data.frame # @arguments variables (list of character vectors) # @arguments value to fill structural missings with # @keyword internal #X rdunif <- #X function(n=20, min=0, max=10) floor(runif(n,min, max)) #X df <- data.frame(a = rdunif(), b = rdunif(),c = rdunif(), result=1:20) #X add.all.combinations(df) #X add.all.combinations(df, list("a", "b")) #X add.all.combinations(df, list("a", "b"), fill=0) #X add.all.combinations(df, list(c("a", "b"))) #X add.all.combinations(df, list("a", "b", "c")) #X add.all.combinations(df, list(c("a", "b"), "c")) #X add.all.combinations(df, list(c("a", "b", "c"))) add.all.combinations <- function(data, vars = list(NULL), fill=NA) { if (sum(sapply(vars, length)) == 0) return(data) all.combinations <- do.call(expand.grid.df, lapply(vars, function(cols) data[, cols, drop=FALSE]) ) result <- merge(data, all.combinations, by = unlist(vars), sort = FALSE, all = TRUE) # fill missings with fill value if (is.list(result$result)) { result$result[sapply(result$result, is.null)] <- fill } else { data_col <- matrix(!names(result) %in% unlist(vars), nrow=nrow(result), ncol=ncol(result), byrow=TRUE) result[is.na(result) & data_col] <- fill } sort_df(result, unlist(vars)) } # Add in any missing values # @keyword internal add.missing.levels <- function(data, vars=NULL, fill=NA) { if (is.null(vars)) return(data) cat <- sapply(data[,vars, drop=FALSE], is.factor) levels <- lapply(data[,vars, drop=FALSE][,cat, drop=FALSE], levels) allcombs <- do.call(expand.grid, levels) current <- unique(data[,vars, drop=FALSE]) extras <- allcombs[!duplicated(rbind(current, allcombs))[-(1:nrow(current))], , drop=FALSE] result <- rbind.fill(data, extras) if (!is.na(fill)) result[is.na(result)] <- fill result } # Dimension names # Convenience method for extracting row and column names # # @arguments data frame # @arguments variables to use # @keyword internal dim_names <- function(data, vars) { if (!is.null(vars) && length(vars) > 0) { 
unique(data[,vars,drop=FALSE]) } else { data.frame(value="(all)") # use fun.aggregate instead of "value"? } } reshape/R/pretty.r0000644000175100001440000000310213141121104013612 0ustar hornikusers# Pretty print # Print reshaped data frame # # This will always work on the direct output from cast, but may not # if you have manipulated (e.g. subsetted) the results. # # @argument Reshaped data frame # @argument Argument required to match generic # @argument Argument required to match generic # @keyword internal prettyprint <- function(x, digits=getOption("digits"), ..., colnames=TRUE) { unx <- x class(unx) <- "data.frame" label.rows <- names(rrownames(x)) labels <- strip.dups(unx[,names(x) %in% label.rows, drop=FALSE]) colnames(labels) <- label.rows[names(x) %in% label.rows] data <- as.matrix((unx[,!(names(x) %in% label.rows), drop=FALSE])) col.labels <- t(strip.dups(rcolnames(x))) bottom <- cbind(labels,data) top <- cbind(matrix("", ncol=ncol(labels)-1, nrow=nrow(col.labels)), names(rcolnames(x)), col.labels) if(colnames) { middle <- colnames(bottom) } else { middle <- c(colnames(labels), rep("", ncol(bottom) - length(colnames(labels)))) } result <- rbind(top, middle, bottom) rownames(result) <- rep("", nrow(result)) colnames(result) <- rep("", ncol(result)) print(result, quote=FALSE, right=TRUE) } # Strip duplicates. # Strips out duplicates from data.frame and replace them with blanks. 
# # @arguments data.frame to modify # @value character matrix # @keyword internal strip.dups <- function(df) { clear.dup <- function(dups,ret=dups) ifelse(duplicated(dups), "", ret) mat <- apply(df, c(1,2), as.character) do.call(cbind, lapply(1:ncol(mat), function(x) clear.dup(mat[,1:x, drop=FALSE], mat[,x, drop=FALSE]))) } reshape/MD50000644000175100001440000001025413141637363012240 0ustar hornikusers4ec210e8ef551307ee7fc870d167fbea *CHANGELOG 86e17092167e4938f2e73510d7d6ffc4 *DESCRIPTION 581827da3959cce1750fb90f1eb98ec8 *LICENSE e9d0e50b592127b51323caa506853e1a *NAMESPACE 396973455bcc91d755fbfab70557a509 *NEWS d29587c851a9369450419492f344693e *R/cast.r 0e207eeffd3dda9124769723c4d3623c *R/condense.r 01bc93013826a532485da25ebc012a59 *R/dimnames.r 391c244786968bc7134cf14e2ce0c02b *R/factors.r 7fd941c6948b0c0f02073219c5e79e8b *R/formula.r c77bd1c6ae03b0dc4543ea5fad51f787 *R/margins.r 6cacfe26ac5424be6775bb1a0681354e *R/melt.r 238526badf752642575dc9e3fb2c0142 *R/pretty.r de4bf763e84fd4a5c72e3d40b9ea4861 *R/recast.r 17125ccff18e80a01e1c53c1af4cefc5 *R/rescale.r d9802ecf746f1015a35400e208f7e821 *R/sparse-by.r df49a27b899e8e1171799eab7a252b00 *R/stamp.r 551d70b3dd600c62acae0388be36012a *R/utils.r 11d6f343f97ca34edc7cb5ad4a174d05 *data/french_fries.rda 931bb9da3bce71ebcb25ba53c5dcd1e5 *data/smiths.rda 6a3f0a74f813cd68547e665f42b8a3cb *data/tips.rda 134c9659aa6e7c8ed4a9e362cdec561f *inst/CITATION 4bfb5681ced65b27a57357f2679e1273 *man/add-all-combinations-dk.rd 195cb4dc67494405bfb9cbe2c2d7f3a0 *man/add-missing-levels-ko.rd aa1c0744c841fe83790e0b92bb4dc386 *man/all-vars-character-rs.rd 9ee513061fbd5ab99be54bef86b27653 *man/as-data-frame-cast-df-v7.rd 437d18a02164c48c9d9e74dcda49e9d3 *man/as-data-frame-cast-matrix-59.rd 56c7046934b5b2b020683e9f83a4379d *man/as-matrix-cast-df-bu.rd 0fa732227d1436d52af618d9825dfcaf *man/as-matrix-cast-matrix-2y.rd 8f91d8bd5ce300f2e22ff561b882e711 *man/cast-9g.rd e76fc8435aeaad96e8d76765be6e58eb *man/cast-matrix-hj.rd 
22e7cd7353a0a1f32423f30153b31382 *man/cast-parse-formula-uw.rd fb9ea5ea038be7bf2dd4c82a1e365d5c *man/check-formula-20.rd cc2e369a5d6a3b781df4908679f63b9d *man/clean-vars-rc.rd f7241f73149eeba7839c9dcce848b5db *man/colsplit-9h.rd 3db587714dd15c57d0821de927d04df6 *man/combine-factor-9x.rd f87fbb8190eada4e0d7de5cfd8268f11 *man/compute-margins-dh.rd 99dbc8cd8fbef091fbc99e01d686ba14 *man/condense-df-34.rd e05e711a82224b66c2a2fc15ba878664 *man/condense-ss.rd 5898aefea07ea9302e1934c75cd762e0 *man/dim-names-fi.rd 8a0d398e2425c8742c407c5b754f93a1 *man/expand-grid-df-fl.rd 4b9cfaf516abeb9fa0b1031cb521e04d *man/expand-kx.rd 46f10536ede1fd678493947b020ea2d0 *man/french-fries.rd 9dab49841aea9ea216659f61c749182b *man/funstofun-gl.rd 8dbdc425561d10412b1ed458aaf2ff3e *man/guess-value-2f.rd 9a3a34be7d066432d4174caec53853f7 *man/margin-vars-rw.rd 500bd5b1e6965386c00c37df816b723a *man/melt-24.rd 2872532c1b996f6ba3fc57c16aa5388f *man/melt-array-e0.rd b3aa9c8f3771baa895227107a42fd165 *man/melt-cast-df-c7.rd 244c66feed95f09fe3ac25976fa07436 *man/melt-cast-matrix-vq.rd 404bc3e3190b73afc4565ded32ef9643 *man/melt-check-j7.rd bf5d7a157be7f697c8fc35fc1c14ba7a *man/melt-data-frame-da.rd 4d5f9b94bc02526f6cb8e1a55f25d921 *man/melt-default-gi.rd 8d17c370057c93e34b4ec0a956ea0833 *man/melt-list-8m.rd d3b9e3407d3412a79a8edf1f6f79021e *man/merge-all-hc.rd 887ae04c6b0973c0ba6fc2d0217d91ab *man/merge-recurse-2d.rd 7c42f3fd65ffa09109e7c771d1b78b2c *man/namerows-6u.rd ec8df9a16be02f26032e974d10fc2119 *man/nested-by-92.rd 5eff7ade5bbdf9b2e46d58a8da82f56a *man/nulldefault-ck.rd cb3325ec61a69a3c73f0afdd3e26a7e2 *man/prettyprint-hy.rd 4b94226b6ef8c369d8e950a3914acb73 *man/rdimnames-11.rd cc0cc180f93f8cd60408d4c76bea9959 *man/recast-ar.rd 51d2e107e60a474bb3bd59644795e13e *man/rename-au.rd 1d547740c9a12863cef4f3ba5354f57c *man/rescaler-40.rd 9596bb019342247f155826fce92b9685 *man/rescaler-data-frame-4u.rd 07485b05390bb5cca32b618e429bdf39 *man/rescaler-default-bl.rd c74c91c2b8656da9a0d322e5d1fb4e45 
*man/rescaler-matrix-xv.rd c961fc641aab549080f96b7667c7bcbd *man/reshape-4u.rd 62da388400ca3843ba71efdcefdd988e *man/round-any-u2.rd 2c38649f8d9c2caf40f14155aa6c0904 *man/smiths.rd 20e95dbf023e0b3178553bb4660bc067 *man/sort-df-aw.rd c14371532f92b804930a4e37c09676aa *man/sparse-by.rd 29eda48d3970206240ad55819b0d5d26 *man/stamp-fw.rd 0b1e787301fe52793f35390850286a30 *man/str-cast-matrix-ez.rd f33a3e8f3f8d43075452b078ff8cba54 *man/strip-dups-7t.rd d4788ec858479dea199cf313b1cdb5ad *man/tidystamp-1p.rd 24f75cf629b4d48b8e16176a04c3673c *man/tips.rd f42156b2fae944ec13450036aeaf11e8 *man/uniquedefault-01.rd e34704e716ffed7b9999e340ba0c6c1a *man/untable-1x.rd c58799b441eed16701f6bc99a84e7f4d *man/updatelist-50.rd reshape/DESCRIPTION0000644000175100001440000000105513141637363013435 0ustar hornikusersPackage: reshape Version: 0.8.7 Title: Flexibly Reshape Data Description: Flexibly restructure and aggregate data using just two functions: melt and cast. Authors@R: person("Hadley", "Wickham", , "hadley@rstudio.com", c("aut", "cre")) URL: http://had.co.nz/reshape Depends: R (>= 2.6.1) Imports: plyr License: MIT + file LICENSE LazyData: true NeedsCompilation: yes Packaged: 2017-08-04 16:36:57 UTC; hadley Author: Hadley Wickham [aut, cre] Maintainer: Hadley Wickham Repository: CRAN Date/Publication: 2017-08-06 16:08:19 UTC reshape/man/0000755000175100001440000000000013141121104012456 5ustar hornikusersreshape/man/melt-list-8m.rd0000644000175100001440000000125413141121104015243 0ustar hornikusers\name{melt.list} \alias{melt.list} \title{Melt a list} \author{Hadley Wickham } \description{ Melting a list recursively melts each component of the list and joins the results together } \usage{\method{melt}{list}(data, ..., level=1)} \arguments{ \item{data}{} \item{...}{other arguments passed down} \item{level}{} } \examples{a <- as.list(1:4) melt(a) names(a) <- letters[1:4] melt(a) attr(a, "varname") <- "ID" melt(a) a <- list(matrix(1:4, ncol=2), matrix(1:6, ncol=2)) melt(a) a <- 
list(matrix(1:4, ncol=2), array(1:27, c(3,3,3))) melt(a) melt(list(1:5, matrix(1:4, ncol=2))) melt(list(list(1:3), 1, list(as.list(3:4), as.list(1:2))))} \keyword{internal} reshape/man/add-missing-levels-ko.rd0000644000175100001440000000045213141121104017104 0ustar hornikusers\name{add.missing.levels} \alias{add.missing.levels} \title{Add in any missing values} \author{Hadley Wickham } \description{ @keyword internal } \usage{add.missing.levels(data, vars=NULL, fill=NA)} \arguments{ \item{data}{} \item{vars}{} \item{fill}{} } \keyword{internal} reshape/man/clean-vars-rc.rd0000644000175100001440000000051313141121104015441 0ustar hornikusers\name{clean.vars} \alias{clean.vars} \title{Clean variables.} \author{Hadley Wickham } \description{ Clean variable list for reshaping. } \usage{clean.vars(vars)} \arguments{ \item{vars}{vector of variable names} } \value{Vector of "real" variable names (excluding result\_variable etc.)} \keyword{internal} reshape/man/as-data-frame-cast-df-v7.rd0000644000175100001440000000060113141121104017255 0ustar hornikusers\name{as.data.frame.cast_df} \alias{as.data.frame.cast_df} \title{as.data.frame.cast\_df} \author{Hadley Wickham } \description{ Convert cast data.frame into a matrix } \usage{\method{as.data.frame}{cast_df}(x, ...)} \arguments{ \item{x}{} \item{...}{} } \details{Strips off cast related attributes so data frame becomes a normal data frame} \keyword{internal} reshape/man/namerows-6u.rd0000644000175100001440000000066213141121104015174 0ustar hornikusers\name{namerows} \alias{namerows} \title{Name rows} \author{Hadley Wickham } \description{ Add variable to data frame containing rownames } \usage{namerows(df, col.name = "id")} \arguments{ \item{df}{data frame} \item{col.name}{name of new column containing rownames} } \details{This is useful when the thing that you want to melt by is the rownames of the data frame, not an explicit variable} \keyword{manip} 
reshape/man/dim-names-fi.rd0000644000175100001440000000045013141121104015252 0ustar hornikusers\name{dim_names} \alias{dim_names} \title{Dimension names} \author{Hadley Wickham } \description{ Convenience method for extracting row and column names } \usage{dim_names(data, vars)} \arguments{ \item{data}{data frame} \item{vars}{variables to use} } \keyword{internal} reshape/man/smiths.rd0000644000175100001440000000043113141121104014312 0ustar hornikusers\name{Smiths} \docType{data} \alias{smiths} \title{Demo data describing the Smiths} \description{ A small demo dataset describing John and Mary Smith. Used in the introductory vignette. } \usage{data(smiths)} \format{A data frame with 2 rows and 5 variables} \keyword{datasets} reshape/man/melt-check-j7.rd0000644000175100001440000000153213141121104015340 0ustar hornikusers\name{melt_check} \alias{melt_check} \title{Melt check.} \author{Hadley Wickham } \description{ Check that input variables to melt are appropriate. } \usage{melt_check(data, id.vars, measure.vars)} \arguments{ \item{data}{data frame} \item{id.vars}{Vector of identifying variable names or indexes} \item{measure.vars}{Vector of measured variable names or indexes} } \value{ \item{id}{list of id variable names} \item{measure}{list of measured variable names} } \details{If id.vars or measure.vars are missing, \code{melt_check} will do its best to impute them. If you only supply one of id.vars and measure.vars, melt will assume the remainder of the variables in the data set belong to the other. If you supply neither, melt will assume character and factor variables are id variables, and all others are measured.} \keyword{internal} reshape/man/compute-margins-dh.rd0000644000175100001440000000072713141121104016516 0ustar hornikusers\name{compute.margins} \alias{compute.margins} \title{Compute margins} \author{Hadley Wickham } \description{ Compute marginal values.
} \usage{compute.margins(data, margins, vars, fun.aggregate, ..., df=FALSE)} \arguments{ \item{data}{data frame} \item{margins}{margins to compute} \item{vars}{all id variables} \item{fun.aggregate}{aggregation function} \item{...}{other argument passed to aggregation function} \item{df}{} } \keyword{internal} reshape/man/melt-data-frame-da.rd0000644000175100001440000000302613141121104016330 0ustar hornikusers\name{melt.data.frame} \alias{melt.data.frame} \title{Melt a data frame} \author{Hadley Wickham } \description{ Melt a data frame into form suitable for easy casting. } \usage{\method{melt}{data.frame}(data, id.vars, measure.vars, variable_name = "variable", na.rm = !preserve.na, preserve.na = TRUE, ...)} \arguments{ \item{data}{Data set to melt} \item{id.vars}{Id variables. If blank, will use all non measure.vars variables. Can be integer (variable position) or string (variable name)} \item{measure.vars}{Measured variables. If blank, will use all non id.vars variables. Can be integer (variable position) or string (variable name)} \item{variable_name}{Name of the variable that will store the names of the original variables} \item{na.rm}{Should NA values be removed from the data set?} \item{preserve.na}{Old argument name, now deprecated} \item{...}{other arguments ignored} } \value{molten data} \details{You need to tell melt which of your variables are id variables, and which are measured variables. If you only supply one of \code{id.vars} and \code{measure.vars}, melt will assume the remainder of the variables in the data set belong to the other. 
If you supply neither, melt will assume factor and character variables are id variables, and all others are measured.} \seealso{\url{http://had.co.nz/reshape/}} \examples{head(melt(tips)) names(airquality) <- tolower(names(airquality)) melt(airquality, id=c("month", "day")) names(ChickWeight) <- tolower(names(ChickWeight)) melt(ChickWeight, id=2:4)} \keyword{manip} reshape/man/condense-df-34.rd0000644000175100001440000000061713141121104015422 0ustar hornikusers\name{condense.df} \alias{condense.df} \title{Condense a data frame} \author{Hadley Wickham } \description{ Condense } \usage{condense.df(data, variables, fun, ...)} \arguments{ \item{data}{data frame} \item{variables}{character vector of variables to condense over} \item{fun}{function to condense with} \item{...}{arguments passed to condensing function} } \keyword{manip} reshape/man/str-cast-matrix-ez.rd0000644000175100001440000000055213141121104016465 0ustar hornikusers\name{str.cast_matrix} \alias{str.cast_matrix} \alias{str.cast_df} \alias{print.cast_matrix} \alias{print.cast_df} \title{Print cast objects} \author{Hadley Wickham } \description{ Printing methods } \usage{\method{str}{cast_matrix}(object, ...)} \arguments{ \item{object}{} \item{...}{} } \details{Used for printing.} \keyword{internal} reshape/man/prettyprint-hy.rd0000644000175100001440000000067313141121104016035 0ustar hornikusers\name{prettyprint} \alias{prettyprint} \title{Pretty print} \author{Hadley Wickham } \description{ Print reshaped data frame } \usage{prettyprint(x, digits=getOption("digits"), ..., colnames=TRUE)} \arguments{ \item{x}{} \item{digits}{} \item{...}{} \item{colnames}{} } \details{This will always work on the direct output from cast, but may not if you have manipulated (e.g. 
subsetted) the results.} \keyword{internal} reshape/man/check-formula-20.rd0000644000175100001440000000051113141121104015741 0ustar hornikusers\name{check_formula} \alias{check_formula} \title{Check formula} \author{Hadley Wickham } \description{ Checks that formula is a valid reshaping formula. } \usage{check_formula(formula, varnames)} \arguments{ \item{formula}{formula to check} \item{varnames}{vector of variable names} } \keyword{internal} reshape/man/cast-matrix-hj.rd0000644000175100001440000000063413141121104015643 0ustar hornikusers\name{cast_matrix} \alias{cast_matrix} \title{Cast matrix.} \author{Hadley Wickham } \description{ Create a new cast matrix } \usage{cast_matrix(m, dimnames)} \arguments{ \item{m}{matrix to turn into cast matrix} \item{dimnames}{list of dimension names (as data.frames), row, col, ...} } \value{object of type \code{\link{cast_matrix}}} \details{For internal use only} \keyword{internal} reshape/man/nested-by-92.rd0000644000175100001440000000054613141121104015134 0ustar hornikusers\name{nested.by} \alias{nested.by} \title{Nested.by function} \author{Hadley Wickham } \description{ Nest a series of by statements, returning a nested list } \usage{nested.by(data, INDICES, FUN, ...)} \arguments{ \item{data}{} \item{INDICES}{} \item{FUN}{} \item{...}{} } \details{Workhorse for producing cast lists.} \keyword{internal} reshape/man/rdimnames-11.rd0000644000175100001440000000104013141121104015176 0ustar hornikusers\name{rdimnames} \alias{rdimnames} \alias{rdimnames<-} \alias{rcolnames} \alias{rcolnames<-} \alias{rrownames} \alias{rrownames<-} \title{Dimension names} \author{Hadley Wickham } \description{ These methods provide easy access to the special dimension names } \usage{rdimnames(x)} \arguments{ \item{x}{} } \details{Reshape stores dimension names in a slightly different format to base R, to allow for (e.g.) multiple levels of column header.
These accessor functions allow you to get and set them.} \keyword{internal} reshape/man/tidystamp-1p.rd0000644000175100001440000000033013141121104015335 0ustar hornikusers\name{tidystamp} \alias{tidystamp} \title{Tidy up stamped data set} \author{Hadley Wickham } \description{ Tidy up stamped data set. } \usage{tidystamp(x)} \arguments{ \item{x}{} } \keyword{internal} reshape/man/rename-au.rd0000644000175100001440000000156113141121104014662 0ustar hornikusers\name{rename} \alias{rename} \title{Rename} \author{Hadley Wickham } \description{ Rename an object } \usage{rename(x, replace)} \arguments{ \item{x}{object to be renamed} \item{replace}{named vector specifying new names} } \details{The rename function provides an easy way to rename the columns of a data.frame or the items in a list.} \examples{rename(mtcars, c(wt = "weight", cyl = "cylinders")) a <- list(a = 1, b = 2, c = 3) rename(a, c(b = "a", c = "b", a="c")) # Example supplied by Timothy Bates names <- c("john", "tim", "andy") ages <- c(50, 46, 25) mydata <- data.frame(names,ages) names(mydata) #-> "names", "ages" # let's change "ages" to singular.
# nb: The operation is not done in place, so you need to set your # data to that returned from rename mydata <- rename(mydata, c(ages="age")) names(mydata) #-> "names", "age"} \keyword{manip} reshape/man/rescaler-data-frame-4u.rd0000644000175100001440000000060513141121104017133 0ustar hornikusers\name{rescaler.data.frame} \alias{rescaler.data.frame} \title{Rescale a data frame} \author{Hadley Wickham } \description{ Rescales data frame by columns } \usage{\method{rescaler}{data.frame}(x, type="sd", ...)} \arguments{ \item{x}{data.frame to rescale} \item{type}{type of rescaling to apply} \item{...}{other arguments passed to rescaler} } \keyword{internal} reshape/man/rescaler-40.rd0000644000175100001440000000147113141121104015031 0ustar hornikusers\name{rescaler} \alias{rescaler} \title{Rescaler} \author{Hadley Wickham } \description{ Convenient methods for rescaling data } \usage{rescaler(x, type="sd", ...)} \arguments{ \item{x}{object to rescale} \item{type}{type of rescaling to use (see details)} \item{...}{other options (only passed to \code{\link{rank}})} } \details{Provides methods for vectors, matrices and data.frames. Currently, five rescaling options are implemented: \itemize{ \item \code{I}: do nothing \item \code{range}: scale to [0, 1] \item \code{rank}: convert values to ranks \item \code{robust}: robust version of \code{sd}, subtract median and divide by median absolute deviation \item \code{sd}: subtract mean and divide by standard deviation }} \seealso{\code{\link{rescaler.default}}} \keyword{manip} reshape/man/untable-1x.rd0000644000175100001440000000065513141121104014773 0ustar hornikusers\name{untable} \alias{untable} \title{Untable a dataset} \author{Hadley Wickham } \description{ Inverse of table } \usage{untable(df, num)} \arguments{ \item{df}{matrix or data.frame to untable} \item{num}{vector of counts (of same length as \code{df})} } \details{Given a tabulated dataset (or matrix) this will untabulate it by repeating each row by
the number of times it was repeated} \keyword{manip} reshape/man/as-matrix-cast-df-bu.rd0000644000175100001440000000056413141121104016642 0ustar hornikusers\name{as.matrix.cast_df} \alias{as.matrix.cast_df} \title{as.matrix.cast\_df} \author{Hadley Wickham } \description{ Convert cast data.frame into a matrix } \usage{\method{as.matrix}{cast_df}(x, ...)} \arguments{ \item{x}{} \item{...}{} } \details{Converts a data frame produced by cast into a matrix with appropriate dimnames.} \keyword{internal} reshape/man/expand-kx.rd0000644000175100001440000000102113141121104014676 0ustar hornikusers\name{expand} \alias{expand} \title{Expand} \author{Hadley Wickham } \description{ Expand out condensed data frame. } \usage{expand(data)} \arguments{ \item{data}{condensed data frame} } \details{If the aggregating function supplied to condense returns multiple values, this function "melts" it again, creating a new column called result\_variable. If the aggregating function returns a named vector, then those names will be used; otherwise the values will be numbered X1, X2, ..., Xn.} \keyword{manip} \keyword{internal} reshape/man/combine-factor-9x.rd0000644000175100001440000000130613141121104016233 0ustar hornikusers\name{combine_factor} \alias{combine_factor} \title{Combine factor levels} \author{Hadley Wickham } \description{ Convenience function to make it easy to combine multiple levels } \usage{combine_factor(fac, variable=levels(fac), other.label="Other")} \arguments{ \item{fac}{factor variable} \item{variable}{either a vector of .
See examples for more details.} \item{other.label}{label for other level} } \examples{df <- data.frame(a = LETTERS[sample(5, 15, replace=TRUE)], y = rnorm(15)) combine_factor(df$a, c(1,2,2,1,2)) combine_factor(df$a, c(1:4, 1)) (f <- reorder(df$a, df$y)) percent <- tapply(abs(df$y), df$a, sum) combine_factor(f, c(order(percent)[1:3]))} \keyword{manip} reshape/man/margin-vars-rw.rd0000644000175100001440000000072613141121104015666 0ustar hornikusers\name{margin.vars} \alias{margin.vars} \title{Margin variables} \author{Hadley Wickham } \description{ Works out list of variables to margin over to get desired margins. } \usage{margin.vars(vars = list(NULL, NULL), margins = NULL)} \arguments{ \item{vars}{list of row and column variables} \item{margins}{vector of variable names to margin over.} } \details{Variables that can't be margined over are dropped silently.} \keyword{internal} reshape/man/all-vars-character-rs.rd0000644000175100001440000000100313141121104017074 0ustar hornikusers\name{all.vars.character} \alias{all.vars.character} \title{Get all variables} \author{Hadley Wickham } \description{ All variables in character string of formula. } \usage{\method{all.vars}{character}(formula, blank.char = ".")} \arguments{ \item{formula}{} \item{blank.char}{} } \details{Removes .} \examples{all.vars.character("a + b") all.vars.character("a + b | c") all.vars.character("a + b") all.vars.character(". ~ a + b") all.vars.character("a ~ b | c + d + e")} \keyword{internal} reshape/man/melt-24.rd0000644000175100001440000000112213141121104014165 0ustar hornikusers\name{melt} \alias{melt} \title{Melt} \author{Hadley Wickham } \description{ Melt an object into a form suitable for easy casting. } \usage{melt(data, ...)} \arguments{ \item{data}{Data set to melt} \item{...}{Other arguments passed to the specific melt method} } \details{This is the generic melt function.
See the following functions for specific details for different data structures: \itemize{ \item \code{\link{melt.data.frame}} for data.frames \item \code{\link{melt.array}} for arrays, matrices and tables \item \code{\link{melt.list}} for lists }} \keyword{manip} reshape/man/as-data-frame-cast-matrix-59.rd0000644000175100001440000000072013141121104020073 0ustar hornikusers\name{as.data.frame.cast_matrix} \alias{as.data.frame.cast_matrix} \title{as.data.frame.cast\_matrix} \author{Hadley Wickham } \description{ Convert cast matrix into a data frame } \usage{\method{as.data.frame}{cast_matrix}(x, row.names, optional, ...)} \arguments{ \item{x}{} \item{row.names}{} \item{optional}{} \item{...}{} } \details{Converts a matrix produced by cast into a data frame with appropriate id columns.} \keyword{internal} reshape/man/uniquedefault-01.rd0000644000175100001440000000055613141121104016104 0ustar hornikusers\name{uniquedefault} \alias{uniquedefault} \title{Unique default} \author{Hadley Wickham } \description{ Convenience function for setting default if not unique } \usage{uniquedefault(values, default)} \arguments{ \item{values}{vector of values} \item{default}{default to use if values not unique} } \details{Used by ggplot2} \keyword{manip} reshape/man/colsplit-9h.rd0000644000175100001440000000073013141121104015154 0ustar hornikusers\name{colsplit} \alias{colsplit} \alias{colsplit.factor} \alias{colsplit.character} \title{Split a vector into multiple columns} \author{Hadley Wickham } \description{ This function can be used to split up a column that has been pasted together.
} \usage{colsplit(x, split="", names)} \arguments{ \item{x}{character vector or factor to split up} \item{split}{regular expression to split on} \item{names}{names for output columns} } \keyword{manip} reshape/man/as-matrix-cast-matrix-2y.rd0000644000175100001440000000056513141121104017502 0ustar hornikusers\name{as.matrix.cast_matrix} \alias{as.matrix.cast_matrix} \title{as.matrix.cast\_matrix} \author{Hadley Wickham } \description{ Convert cast matrix into a matrix } \usage{\method{as.matrix}{cast_matrix}(x, ...)} \arguments{ \item{x}{} \item{...}{} } \details{Strips off cast-related attributes so that the matrix becomes a normal matrix} \keyword{internal} reshape/man/add-all-combinations-dk.rd0000644000175100001440000000177013141121104017367 0ustar hornikusers\name{add.all.combinations} \alias{add.all.combinations} \title{Add all combinations} \author{Hadley Wickham } \description{ Add all combinations of the given rows and columns to the data frame. } \usage{add.all.combinations(data, vars = list(NULL), fill=NA)} \arguments{ \item{data}{data.frame} \item{vars}{variables (list of character vectors)} \item{fill}{value to fill structural missings with} } \details{This function is used to ensure that we have a matrix of the appropriate dimensionality with no missing cells.} \examples{rdunif <- function(n=20, min=0, max=10) floor(runif(n,min, max)) df <- data.frame(a = rdunif(), b = rdunif(),c = rdunif(), result=1:20) add.all.combinations(df) add.all.combinations(df, list("a", "b")) add.all.combinations(df, list("a", "b"), fill=0) add.all.combinations(df, list(c("a", "b"))) add.all.combinations(df, list("a", "b", "c")) add.all.combinations(df, list(c("a", "b"), "c")) add.all.combinations(df, list(c("a", "b", "c")))} \keyword{internal} reshape/man/melt-cast-matrix-vq.rd0000644000175100001440000000056513141121104016632 0ustar hornikusers\name{melt.cast_matrix} \alias{melt.cast_matrix} \title{Melt cast matrices} \author{Hadley Wickham } \description{ Melt the results of a cast
} \usage{\method{melt}{cast_matrix}(data, ...)} \arguments{ \item{data}{} \item{...}{other arguments ignored} } \details{Converts to a data frame and then uses \code{\link{melt.cast_df}}} \keyword{internal} reshape/man/rescaler-default-bl.rd0000644000175100001440000000057413141121104016630 0ustar hornikusers\name{rescaler.default} \alias{rescaler.default} \title{Default rescaler} \author{Hadley Wickham } \description{ See \code{\link{rescaler}} for details } \usage{\method{rescaler}{default}(x, type="sd", ...)} \arguments{ \item{x}{vector to rescale} \item{type}{type of rescaling to apply} \item{...}{other arguments passed to rescaler} } \keyword{internal} reshape/man/nulldefault-ck.rd0000644000175100001440000000047013141121104015720 0ustar hornikusers\name{nulldefault} \alias{nulldefault} \title{Null default} \author{Hadley Wickham } \description{ Use default value when null } \usage{nulldefault(x, default)} \arguments{ \item{x}{} \item{default}{} } \details{Handy method when argument defaults aren't good enough.} \keyword{internal} reshape/man/stamp-fw.rd0000644000175100001440000000254713141121104014553 0ustar hornikusers\name{stamp} \alias{stamp} \title{Stamp} \author{Hadley Wickham } \description{ Stamp is like reshape but the "stamping" function is passed the entire data frame, instead of just a few variables. } \usage{stamp(data, formula = . 
~ ., fun.aggregate, ..., margins=NULL, subset=TRUE, add.missing=FALSE)} \arguments{ \item{data}{data.frame (not molten)} \item{formula}{formula that describes arrangement of result, columns ~ rows, see \code{\link{reshape}} for more information} \item{fun.aggregate}{aggregation function to use, should take a data frame as the first argument} \item{...}{arguments passed to the aggregation function} \item{margins}{margins to compute (character vector, or \code{TRUE} for all margins), can contain \code{grand_row} or \code{grand_col} to include grand row or column margins respectively.} \item{subset}{logical vector by which to subset the data frame, evaluated in the context of the data frame} \item{add.missing}{fill in missing combinations?} } \details{It is very similar to the \code{\link{by}} function except in the form of the output, which is arranged using the formula as in \code{\link{reshape}}. Note that it's very easy to create objects that R can't print with this function. You will probably want to save the results to a variable and then extract the results. See the examples.} \keyword{manip} reshape/man/french-fries.rd0000644000175100001440000000124713141121104015364 0ustar hornikusers\name{French fries} \docType{data} \alias{french_fries} \title{Sensory data from a french fries experiment} \description{ This data was collected from a sensory experiment conducted at Iowa State University in 2004. The investigators were interested in the effect that using three different fryer oils had on the taste of the fries. Variables: \itemize{ \item time in weeks from start of study.
\item treatment (type of oil), \item subject, \item replicate, \item potato-y flavour, \item buttery flavour, \item grassy flavour, \item rancid flavour, \item painty flavour } } \usage{data(french_fries)} \format{A data frame with 696 rows and 9 variables} \keyword{datasets} reshape/man/recast-ar.rd0000644000175100001440000000134713141121104014673 0ustar hornikusers\name{recast} \alias{recast} \title{Recast} \author{Hadley Wickham } \description{ \link{melt} and \link{cast} data in a single step } \usage{recast(data, formula, ..., id.var, measure.var)} \arguments{ \item{data}{Data set to melt} \item{formula}{Casting formula, see \link{cast} for specifics} \item{...}{Other arguments passed to \link{cast}} \item{id.var}{Identifying variables. If blank, will use all non measure.var variables} \item{measure.var}{Measured variables. If blank, will use all non id.var variables} } \details{This conveniently wraps melting and casting a data frame into one step.} \seealso{\url{http://had.co.nz/reshape/}} \examples{recast(french_fries, time ~ variable, id.var=1:4)} \keyword{manip} reshape/man/rescaler-matrix-xv.rd0000644000175100001440000000055513141121104016547 0ustar hornikusers\name{rescaler.matrix} \alias{rescaler.matrix} \title{Rescale a matrix} \author{Hadley Wickham } \description{ Rescales matrix by columns } \usage{\method{rescaler}{matrix}(x, type="sd", ...)} \arguments{ \item{x}{matrix to rescale} \item{type}{type of rescaling to apply} \item{...}{other arguments passed to rescaler} } \keyword{internal} reshape/man/cast-9g.rd0000644000175100001440000001115713141121104014261 0ustar hornikusers\name{cast} \alias{cast} \title{Cast function} \author{Hadley Wickham } \description{ Cast a molten data frame into the reshaped or aggregated form you want } \usage{cast(data, formula = ... 
~ variable, fun.aggregate=NULL, ..., margins=FALSE, subset=TRUE, df=FALSE, fill=NULL, add.missing=FALSE, value = guess_value(data))} \arguments{ \item{data}{molten data frame, see \code{\link{melt}}} \item{formula}{casting formula, see details for specifics} \item{fun.aggregate}{aggregation function} \item{add.missing}{fill in missing combinations?} \item{value}{name of value column} \item{...}{further arguments are passed to aggregating function} \item{margins}{vector of variable names (can include "grand\_col" and "grand\_row") to compute margins for, or TRUE to compute all margins} \item{subset}{logical vector to subset data set with before reshaping} \item{df}{argument used internally} \item{fill}{value with which to fill in structural missings, defaults to value from applying \code{fun.aggregate} to 0 length vector} } \details{Along with \code{\link{melt}} and \link{recast}, this is the only function you should ever need to use. Once you have melted your data, cast will arrange it into the form you desire based on the specification given by \code{formula}. The cast formula has the following format: \code{x_variable + x_2 ~ y_variable + y_2 ~ z_variable ~ ... | list_variable + ... } The order of the variables makes a difference. The first varies slowest, and the last fastest. There are a couple of special variables: "..." represents all other variables not used in the formula and "." represents no variable, so you can do \code{formula=var1 ~ .} Creating high-D arrays is simple, and allows a class of transformations that are hard without \code{\link{apply}} and \code{\link{sweep}}. If the combination of variables you supply does not uniquely identify one row in the original data set, you will need to supply an aggregating function, \code{fun.aggregate}. This function should take a vector of numbers and return one or more summary statistics. It must return the same number of values regardless of the length of the input vector.
If it returns multiple values, you can use "result\_variable" to control where they appear. By default they will appear as the last column variable. The margins argument should be passed a vector of variable names, e.g. \code{c("month","day")}. It will silently drop any variables that cannot be margined over. You can also use "grand\_col" and "grand\_row" to get grand row and column margins respectively. Subset takes a logical vector that will be evaluated in the context of \code{data}, so you can do something like \code{subset = variable=="length"}. All the actual reshaping is done by \code{\link{reshape1}}; see its documentation for details of the implementation.} \seealso{\code{\link{reshape1}}, \url{http://had.co.nz/reshape/}} \examples{#Air quality example names(airquality) <- tolower(names(airquality)) aqm <- melt(airquality, id=c("month", "day"), na.rm=TRUE) cast(aqm, day ~ month ~ variable) cast(aqm, month ~ variable, mean) cast(aqm, month ~ . | variable, mean) cast(aqm, month ~ variable, mean, margins=c("grand_row", "grand_col")) cast(aqm, day ~ month, mean, subset=variable=="ozone") cast(aqm, month ~ variable, range) cast(aqm, month ~ variable + result_variable, range) cast(aqm, variable ~ month ~ result_variable,range) #Chick weight example names(ChickWeight) <- tolower(names(ChickWeight)) chick_m <- melt(ChickWeight, id=2:4, na.rm=TRUE) cast(chick_m, time ~ variable, mean) # average effect of time cast(chick_m, diet ~ variable, mean) # average effect of diet cast(chick_m, diet ~ time ~ variable, mean) # average effect of diet & time # How many chicks at each time?
- checking for balance cast(chick_m, time ~ diet, length) cast(chick_m, chick ~ time, mean) cast(chick_m, chick ~ time, mean, subset=time < 10 & chick < 20) cast(chick_m, diet + chick ~ time) cast(chick_m, chick ~ time ~ diet) cast(chick_m, diet + chick ~ time, mean, margins="diet") #Tips example cast(melt(tips), sex ~ smoker, mean, subset=variable=="total_bill") cast(melt(tips), sex ~ smoker | variable, mean) ff_d <- melt(french_fries, id=1:4, na.rm=TRUE) cast(ff_d, subject ~ time, length) cast(ff_d, subject ~ time, length, fill=0) cast(ff_d, subject ~ time, function(x) 30 - length(x)) cast(ff_d, subject ~ time, function(x) 30 - length(x), fill=30) cast(ff_d, variable ~ ., c(min, max)) cast(ff_d, variable ~ ., function(x) quantile(x,c(0.25,0.5))) cast(ff_d, treatment ~ variable, mean, margins=c("grand_col", "grand_row")) cast(ff_d, treatment + subject ~ variable, mean, margins="treatment") } \keyword{manip} reshape/man/sort-df-aw.rd0000644000175100001440000000056513141121104014776 0ustar hornikusers\name{sort_df} \alias{sort_df} \title{Sort data frame} \author{Hadley Wickham } \description{ Convenience method for sorting a data frame using the given variables. } \usage{sort_df(data, vars=names(data))} \arguments{ \item{data}{data frame to sort} \item{vars}{variables to use for sorting} } \details{Simple wrapper around order} \keyword{manip} reshape/man/strip-dups-7t.rd0000644000175100001440000000046413141121104015453 0ustar hornikusers\name{strip.dups} \alias{strip.dups} \title{Strip duplicates.} \author{Hadley Wickham } \description{ Strips out duplicates from data.frame and replace them with blanks. 
} \usage{strip.dups(df)} \arguments{ \item{df}{data.frame to modify} } \value{character matrix} \keyword{internal} reshape/man/merge-recurse-2d.rd0000644000175100001440000000047413141121104016062 0ustar hornikusers\name{merge_recurse} \alias{merge_recurse} \title{Merge recursively} \author{Hadley Wickham } \description{ Recursively merge data frames } \usage{merge_recurse(dfs, ...)} \arguments{ \item{dfs}{list of data frames to merge} \item{...}{} } \seealso{\code{\link{merge_all}}} \keyword{internal} reshape/man/reshape-4u.rd0000644000175100001440000000602213141121104014762 0ustar hornikusers\name{reshape1} \alias{reshape1} \title{Casting workhorse.} \author{Hadley Wickham } \description{ Takes data frame and variable list and casts data. } \usage{reshape1(data, vars = list(NULL, NULL), fun.aggregate=NULL, margins, df=FALSE, fill=NA, add.missing=FALSE, ...)} \arguments{ \item{data}{data frame} \item{vars}{list of variables to appear in rows and columns} \item{fun.aggregate}{aggregation function} \item{margins}{vector of variable names (can include "grand\_col" and "grand\_row") to compute margins for, or TRUE to compute all margins} \item{df}{should the aggregating function be supplied with the entire data frame, or just the relevant entries from the values column} \item{fill}{value with which to fill in structural missings} \item{add.missing}{fill in missing combinations?} \item{...}{further arguments are passed to aggregating function} } \seealso{\code{\link{cast}}} \examples{ ffm <- melt(french_fries, id=1:4, na.rm = TRUE) # Casting lists ---------------------------- cast(ffm, treatment ~ rep | variable, mean) cast(ffm, treatment ~ rep | subject, mean) cast(ffm, treatment ~ rep | time, mean) cast(ffm, treatment ~ rep | time + variable, mean) names(airquality) <- tolower(names(airquality)) aqm <- melt(airquality, id=c("month", "day"), na.rm=TRUE) #Basic call reshape1(aqm, list("month", NULL), mean) reshape1(aqm, list("month", "variable"), mean) reshape1(aqm, list("day",
"month"), mean) #Explore margins ---------------------------- reshape1(aqm, list("month", NULL), mean, "month") reshape1(aqm, list("month", NULL) , mean, "grand_col") reshape1(aqm, list("month", NULL) , mean, "grand_row") reshape1(aqm, list(c("month", "day"), NULL), mean, "month") reshape1(aqm, list(c("month"), "variable"), mean, "month") reshape1(aqm, list(c("variable"), "month"), mean, "month") reshape1(aqm, list(c("month"), "variable"), mean, c("month","variable")) reshape1(aqm, list(c("month"), "variable"), mean, c("grand_row")) reshape1(aqm, list(c("month"), "variable"), mean, c("grand_col")) reshape1(aqm, list(c("month"), "variable"), mean, c("grand_row","grand_col")) reshape1(aqm, list(c("variable","day"),"month"), mean,c("variable")) reshape1(aqm, list(c("variable","day"),"month"), mean,c("variable","grand_row")) reshape1(aqm, list(c("month","day"), "variable"), mean, "month") # Multiple fnction returns ---------------------------- reshape1(aqm, list(c("month", "result_variable"), NULL), range) reshape1(aqm, list(c("month"),"result_variable") , range) reshape1(aqm, list(c("result_variable", "month"), NULL), range) reshape1(aqm, list(c("month", "result_variable"), "variable"), range, "month") reshape1(aqm, list(c("month", "result_variable"), "variable"), range, "variable") reshape1(aqm, list(c("month", "result_variable"), "variable"), range, c("variable","month")) reshape1(aqm, list(c("month", "result_variable"), "variable"), range, c("grand_col")) reshape1(aqm, list(c("month", "result_variable"), "variable"), range, c("grand_row")) reshape1(aqm, list(c("month"), c("variable")), function(x) diff(range(x))) } \keyword{internal} reshape/man/condense-ss.rd0000644000175100001440000000126513141121104015232 0ustar hornikusers\name{condense} \alias{condense} \title{Condense} \author{Hadley Wickham } \description{ Condense a data frame. 
} \usage{condense(data, variables, fun, ...)} \arguments{ \item{data}{data frame} \item{variables}{variables to condense over} \item{fun}{aggregating function, may return multiple values} \item{...}{further arguments passed on to aggregating function} } \details{Works very much like by, but keeps data in original data frame format. Results column is a list, so that each cell may contain an object or a vector etc. Assumes data is in molten format. Aggregating function must return the same number of values for all inputs.} \keyword{manip} \keyword{internal} reshape/man/funstofun-gl.rd0000644000175100001440000000073113141121104015435 0ustar hornikusers\name{funstofun} \alias{funstofun} \title{Aggregate multiple functions into a single function} \author{Hadley Wickham } \description{ Combine multiple functions into a single function returning a named vector of outputs } \usage{funstofun(...)} \arguments{ \item{...}{functions to combine} } \details{Each function should produce a single number as output} \examples{funstofun(min, max)(1:10) funstofun(length, mean, var)(rnorm(100))} \keyword{manip} reshape/man/melt-default-gi.rd0000644000175100001440000000045413141121104015770 0ustar hornikusers\name{melt.default} \alias{melt.default} \title{Default melt function} \author{Hadley Wickham } \description{ For vectors, make a column of a data frame } \usage{\method{melt}{default}(data, ...)} \arguments{ \item{data}{data frame} \item{...}{arguments} } \keyword{internal} reshape/man/merge-all-hc.rd0000644000175100001440000000064213141121104015244 0ustar hornikusers\name{merge_all} \alias{merge_all} \title{Merge all} \author{Hadley Wickham } \description{ Merge together a series of data.frames } \usage{merge_all(dfs, ...)} \arguments{ \item{dfs}{list of data frames to merge} \item{...}{other arguments passed on to merge} } \details{Order of data frames should be from most complete to least complete} \seealso{\code{\link{merge_recurse}}} \keyword{manip}
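Merging a list of data frames, as merge_all above is documented to do, can be approximated in base R by folding merge over the list. The sketch below is an assumption-laden illustration rather than the package's implementation: merge_frames is a hypothetical name, the two input frames are made up, and all = TRUE (keep unmatched rows) is a choice made for this example.

```r
# Hypothetical helper: merge a list of data frames pairwise with base merge().
# This is NOT merge_all() itself, only a sketch of the same idea.
merge_frames <- function(dfs) {
  # all = TRUE keeps rows that have no match in the other frame (outer join).
  Reduce(function(x, y) merge(x, y, all = TRUE), dfs)
}

# The more complete frame goes first, following the ordering advice above.
a <- data.frame(id = 1:3, x = c(10, 20, 30))
b <- data.frame(id = 2:3, y = c("p", "q"))

merged <- merge_frames(list(a, b))
merged
```

Because merge joins on the shared column names, the frames here are matched on id, and the row with no partner in b carries NA in y.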
reshape/man/sparse-by.rd0000644000175100001440000000440213141121104014712 0ustar hornikusers
\name{sparseby}
\alias{sparseby}
\title{Apply a Function to a Data Frame split by levels of indices}
\description{
  Function \code{sparseby} is a modified version of \code{\link{by}} for
  \code{\link{tapply}} applied to data frames. It always returns
  a new data frame rather than a multi-way array.
}
\usage{
sparseby(data, INDICES = list(), FUN, ..., GROUPNAMES = TRUE)
}
\arguments{
  \item{data}{an \R object, normally a data frame, possibly a matrix.}
  \item{INDICES}{ a variable or list of variables indicating the subgroups of \code{data} }
  \item{FUN}{a function to be applied to data frame subsets of \code{data}.}
  \item{\dots}{further arguments to \code{FUN}.}
  \item{GROUPNAMES}{a logical variable indicating whether the group names should be bound to
  the result}
}
\details{
  A data frame or matrix is split by row into data frames or matrices respectively
  subsetted by the values of one or more factors, and function
  \code{FUN} is applied to each subset in turn.

  \code{sparseby} is much faster and more memory efficient than \code{\link{by}} or \code{\link{tapply}}
  in the situation where the combinations of \code{INDICES} present in the data form
  a sparse subset of all possible combinations.
}
\value{
  A data frame or matrix containing the results of \code{FUN} applied to each subgroup of the matrix.
  The result depends on what is returned from \code{FUN}:

  If \code{FUN} returns \code{NULL} on any subsets, those are dropped.

  If it returns a single value or a vector of values, the length must be consistent across all
  subgroups. These will be returned as values in rows of the resulting data frame or matrix.

  If it returns data frames or matrices, they must all have the same number of columns, and
  they will be bound with \code{\link{rbind}} into a single data frame or matrix.
  Names for the columns will be taken from the names in the list of \code{INDICES} or from
  the results of \code{FUN}, as appropriate.
}
\author{Duncan Murdoch}
\seealso{ \code{\link{tapply}}, \code{\link{by}} }
\examples{
x <- data.frame(index = c(rep(1, 4), rep(2, 3)), value = c(1:7))
x
sparseby(x, x$index, nrow)

# The version below works entirely in matrices
x <- as.matrix(x)
sparseby(x, list(group = x[, "index"]), function(subset) c(mean = mean(subset[, 2])))
}
\keyword{ iteration }
\keyword{ category }

reshape/man/guess-value-2f.rd0000644000175100001440000000062713141121104015557 0ustar hornikusers
\name{guess_value}
\alias{guess_value}
\title{Guess value}
\author{Hadley Wickham}

\description{
Guess name of value column
}
\usage{guess_value(df)}
\arguments{
\item{df}{Data frame to guess value column from}
}

\details{Strategy:
\enumerate{
\item Is value or (all) column present? If so, use that
\item Otherwise, guess that last column is the value column
}}

\keyword{internal}

reshape/man/melt-array-e0.rd0000644000175100001440000000141713141121104015367 0ustar hornikusers
\name{melt.array}
\alias{melt.array}
\alias{melt.matrix}
\alias{melt.table}
\title{Melt an array}
\author{Hadley Wickham}

\description{
This function melts a high-dimensional array into a form that you can use \code{\link{cast}} with.
}
\usage{\method{melt}{array}(data, varnames = names(dimnames(data)), ...)}
\arguments{
\item{data}{array to melt}
\item{varnames}{variable names to use in molten data.frame}
\item{...}{other arguments ignored}
}

\details{This code is conceptually similar to \code{\link{as.data.frame.table}}}

\examples{a <- array(1:24, c(2,3,4))
melt(a)
melt(a, varnames=c("X","Y","Z"))
dimnames(a) <- lapply(dim(a), function(x) LETTERS[1:x])
melt(a)
melt(a, varnames=c("X","Y","Z"))
dimnames(a)[1] <- list(NULL)
melt(a)}
\keyword{manip}

reshape/man/tips.rd0000644000175100001440000000147213141121104013770 0ustar hornikusers
\name{Tips}
\docType{data}
\alias{tips}
\title{Tipping data}
\description{
One waiter recorded information about each tip he received over a
period of a few months working in one restaurant. He collected several
variables:

\itemize{
  \item tip in dollars,
  \item bill in dollars,
  \item sex of the bill payer,
  \item whether there were smokers in the party,
  \item day of the week,
  \item time of day,
  \item size of the party.
}

In all he recorded 244 tips. The data was reported in a collection of
case studies for business statistics (Bryant & Smith 1995).
}
\usage{data(tips)}
\format{A data frame with 244 rows and 7 variables}
\references{
Bryant, P. G. and Smith, M (1995) \emph{Practical Data Analysis: Case
Studies in Business Statistics}. Homewood, IL: Richard D. Irwin
Publishing.
}
\keyword{datasets}

reshape/man/cast-parse-formula-uw.rd0000644000175100001440000000142413141121104017144 0ustar hornikusers
\name{cast_parse_formula}
\alias{cast_parse_formula}
\title{Cast parse formula}
\author{Hadley Wickham}

\description{
Parse formula for casting
}
\usage{cast_parse_formula(formula = "...
~ variable", varnames)}
\arguments{
\item{formula}{casting formula, given as a character string}
\item{varnames}{character vector of available variable names}
}
\value{
  \item{row}{character vector of row names}
  \item{col}{character vector of column names}
  \item{aggregate}{logical, whether aggregation will occur}
}

\examples{cast_parse_formula("a + ...", letters[1:6])
cast_parse_formula("a | ...", letters[1:6])
cast_parse_formula("a + b ~ c ~ . | ...", letters[1:6])}
\keyword{internal}

reshape/man/updatelist-50.rd0000644000175100001440000000051413141121104015405 0ustar hornikusers
\name{updatelist}
\alias{updatelist}
\title{Update list}
\author{Hadley Wickham}

\description{
Update a list, but don't create new entries
}
\usage{updatelist(x, y)}
\arguments{
\item{x}{list to be updated}
\item{y}{list with updated values}
}

\details{Don't know what this is used for!}

\keyword{internal}

reshape/man/round-any-u2.rd0000644000175100001440000000122413141121104015244 0ustar hornikusers
\name{round_any}
\alias{round_any}
\title{Round any}
\author{Hadley Wickham}

\description{
Round to multiple of any number
}
\usage{round_any(x, accuracy, f=round)}
\arguments{
\item{x}{numeric vector to round}
\item{accuracy}{number to round to}
\item{f}{function to use for rounding (e.g.
\code{\link{floor}})}
}

\details{Useful when you want to round a number to a multiple of an arbitrary value}

\examples{round_any(135, 10)
round_any(135, 100)
round_any(135, 25)
round_any(135, 10, floor)
round_any(135, 100, floor)
round_any(135, 25, floor)
round_any(135, 10, ceiling)
round_any(135, 100, ceiling)
round_any(135, 25, ceiling)}
\keyword{internal}

reshape/man/expand-grid-df-fl.rd0000644000175100001440000000124213141121104016174 0ustar hornikusers
\name{expand.grid.df}
\alias{expand.grid.df}
\title{Expand grid}
\author{Hadley Wickham}

\description{
Expand grid of data frames
}
\usage{expand.grid.df(..., unique=TRUE)}
\arguments{
\item{...}{list of data frames (first varies fastest)}
\item{unique}{only use unique rows?}
}

\details{Creates a new data frame containing all combinations of rows from
the data frames in \code{...}}

\examples{expand.grid.df(data.frame(a=1,b=1:2))
expand.grid.df(data.frame(a=1,b=1:2), data.frame())
expand.grid.df(data.frame(a=1,b=1:2), data.frame(c=1:2, d=1:2))
expand.grid.df(data.frame(a=1,b=1:2), data.frame(c=1:2, d=1:2), data.frame(e=c("a","b")))}
\keyword{manip}

reshape/man/melt-cast-df-c7.rd0000644000175100001440000000077413141121104015604 0ustar hornikusers
\name{melt.cast_df}
\alias{melt.cast_df}
\title{Melt cast data.frames}
\author{Hadley Wickham}

\description{
Melt the results of a cast
}
\usage{\method{melt}{cast_df}(data, drop.margins=TRUE, ...)}
\arguments{
\item{data}{result of a cast (a \code{cast_df} object)}
\item{drop.margins}{should margins be dropped before melting?}
\item{...}{other arguments ignored}
}

\details{This can be useful when performing complex aggregations: melting the result
of a cast will do its best to figure out the correct variables to use as id and measured.}

\keyword{internal}

reshape/LICENSE0000644000175100001440000000006113002467640012723 0ustar hornikusers
YEAR: 2008-2016
COPYRIGHT HOLDER: Hadley Wickham