aggregateData {iNZightTools} | R Documentation |
Aggregates a dataframe into summaries of all numeric variables, grouped by the specified categorical variables, and returns the result along with the tidyverse code used to generate it.
aggregateData(
.data,
vars,
summaries,
summary_vars,
varnames = NULL,
quantiles = c(0.25, 0.75),
custom_funs = NULL
)
.data | a dataframe or survey design object to aggregate |
vars | a character vector of categorical variables to group by |
summaries | summaries to generate for each group |
summary_vars | names of variables in the dataset to calculate summaries of |
varnames | name templates for created variables (see details) |
quantiles | if requesting quantiles, specify the desired quantiles here |
custom_funs | a list of custom functions (see details) |
aggregated dataframe containing the summaries with tidyverse code attached
The aggregateData function accepts any R function that returns a single value (such as mean, var, sd, sum, IQR). The default name of each new variable is {var}_{fun}, where {var} is the variable name and {fun} is the summary function used. You may pass new names via the varnames argument, which should be either a vector the same length as summary_vars, or a named list (where the names are the summary functions). In either case, use {var} to represent the variable name, e.g., {var}_mean or min_{var}.
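The naming templates described above can be pictured as plain string substitution. The following is a minimal base R sketch, not the package's implementation; fill_template is a hypothetical helper introduced only for illustration.

```r
# Hypothetical helper (not part of iNZightTools): substitute {var} and {fun}
# placeholders in a name template using literal (fixed) matching.
fill_template <- function(template, var, fun) {
  out <- gsub("{var}", var, template, fixed = TRUE)
  gsub("{fun}", fun, out, fixed = TRUE)
}

# Default-style template
default_name <- fill_template("{var}_{fun}", "Sepal.Length", "mean")

# Custom template that only uses {var}
custom_name <- fill_template("min_{var}", "Sepal.Length", "min")

print(default_name)
print(custom_name)
```

A template without a {fun} placeholder is returned with only {var} replaced, which is why min_{var} works as a per-summary name.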
You can also include the summary missing, which counts the number of missing values in the variable. Its default name is {var}_missing.
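As a rough picture of what a per-group missing count means, here is a small base R sketch. It uses a made-up data frame and tapply rather than aggregateData itself, so the column names and values are assumptions for illustration only.

```r
# Toy data (invented for this sketch): two groups with some missing values
df <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1.5, NA, 2.0, NA, NA)
)

# Count NA values of `value` within each level of `group`
value_missing <- tapply(df$value, df$group, function(x) sum(is.na(x)))

print(value_missing)
```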
For the quantile summary, there is the additional argument quantiles. A new variable is created for each specified quantile p. To name these variables, use {p} in varnames (the default is {var}_q{p}).
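The one-variable-per-quantile behaviour can be sketched in base R as follows. This is an illustration only: how the package renders {p} is not stated above, so the sketch assumes it is filled with the quantile expressed as a percentage (25, 75).

```r
x <- iris$Sepal.Length
probs <- c(0.25, 0.75)

# One value per requested quantile
q <- quantile(x, probs = probs, names = FALSE)

# Build one name per quantile from the default template.
# Assumption: {p} is rendered as a percentage; the package may format it differently.
template <- "{var}_q{p}"
q_names <- vapply(probs, function(p) {
  out <- gsub("{var}", "Sepal.Length", template, fixed = TRUE)
  gsub("{p}", format(p * 100), out, fixed = TRUE)
}, character(1))

names(q) <- q_names
print(q)
```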
Custom functions can be passed via the custom_funs argument. This should be a list, and each element should have a name and either an expr or fun element. Expressions should operate on a variable x. The function should be a function of x and return a single value.
# expression form: width of the range of x
cust_funs <- list(name = '{var}_width', expr = diff(range(x, na.rm = TRUE)))

# function form: standard error of x
cust_funs <- list(
  name = '{var}_stderr',
  fun = function(x) {
    s <- sd(x)
    n <- length(x)
    s / sqrt(n)
  }
)
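To show what "a function of x that returns a single value" produces when applied per group, here is a base R sketch using tapply on the iris data. It does not call aggregateData; stderr_fun and width_fun are illustrative names chosen for this sketch.

```r
# Single-value summary functions of a numeric vector x
stderr_fun <- function(x) sd(x) / sqrt(length(x))
width_fun <- function(x) diff(range(x, na.rm = TRUE))

# Apply each function to Sepal.Length within each Species
se_by_species <- tapply(iris$Sepal.Length, iris$Species, stderr_fun)
width_by_species <- tapply(iris$Sepal.Length, iris$Species, width_fun)

print(se_by_species)
print(width_by_species)
```

Each call yields exactly one number per group, which is the property a custom summary needs in order to fit into a single output column.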
Tom Elliott, Owen Jin
aggregated <-
aggregateData(iris,
vars = c("Species"),
summaries = c("mean", "sd", "iqr")
)
cat(code(aggregated))
head(aggregated)