unstack
(Not Recommended) Unstack dataset array from single variable into multiple variables
The dataset data type is not recommended. To work with heterogeneous data, use the MATLAB® table data type instead. See MATLAB table documentation for more information.
Syntax
A = unstack(B,datavar,indvar)
[A,iB] = unstack(B,datavar,indvar)
A = unstack(B,datavar,indvar,'Parameter',value)
Description
A = unstack(B,datavar,indvar) unstacks a single variable in dataset
array B into multiple variables in A. In general
A contains more variables, but fewer observations, than
B.
datavar specifies the data variable in B to unstack.
indvar specifies an indicator variable in B that
determines which variable in A each value in datavar is
unstacked into. unstack treats the remaining variables in B
as grouping variables. Each unique combination of their values defines a group of observations in
B that will be unstacked into a single observation in
A.
unstack creates m data variables in
A, where m is the number of group levels in
indvar. The values in indvar indicate which of those
m variables receive which values from datavar. The
j-th data variable in A contains the values from
datavar that correspond to observations whose indvar value
was the j-th of the m possible levels. Elements of those
m variables for which no corresponding data value in B
exists contain a default value.
datavar is a positive integer, a character vector, a string scalar, or a
logical vector containing a single true value. indvar is a positive integer, a
variable name, or a logical vector containing a single true value.
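The basic call can be sketched as follows. This is a minimal illustration on made-up data, not an example from the shipped documentation: the variable names Subject, Time, and Score are hypothetical.

```matlab
% Hypothetical toy data: two subjects, scored at up to two time points.
Subject = [1; 1; 2];
Time    = nominal({'t1'; 't2'; 't1'});   % indicator variable with m = 2 levels
Score   = [10; 20; 30];                  % data variable to unstack
B = dataset(Subject, Time, Score);       % variable names are taken from the workspace names

% Subject is the only remaining variable, so it acts as the grouping variable.
A = unstack(B, 'Score', 'Time');

% A has one observation per subject and one data variable per level of Time.
% Subject 2 has no 't2' score, so that element of A.t2 holds a default value.
disp(A)
```

Here B has three observations and A has two, one per value of Subject, which matches the "more variables, fewer observations" pattern described above.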
[A,iB] = unstack(B,datavar,indvar) returns an index vector
iB indicating the correspondence between observations in A
and those in B. For each observation in A,
iB contains the index of the first in the corresponding group of observations
in B.
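A short sketch of how iB might be used, again on hypothetical data (Subject, Time, and Score are illustrative names only):

```matlab
% Hypothetical toy data: each subject has one score per time point.
Subject = [1; 1; 2; 2];
Time    = nominal({'t1'; 't2'; 't1'; 't2'});
Score   = [10; 20; 30; 40];
B = dataset(Subject, Time, Score);

[A, iB] = unstack(B, 'Score', 'Time');

% iB(k) is the index in B of the first observation of the group that
% became observation k of A. Indexing B with iB recovers those rows.
firstRows = B(iB, :);
disp(iB)
disp(firstRows)
```

Indexing B with iB is a convenient way to look up other per-group information that was not carried into A.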
For more information on grouping variables, see Grouping Variables.
Input Arguments
A = unstack(B,datavar,indvar,'Parameter',value) uses the following parameter name/value pairs to control how unstack converts variables in B to variables in A:
'GroupVars' | Grouping variables in B that define groups of observations.
groupvars is a positive integer, a vector of positive integers, a
character vector, a string array, a cell array of character vectors, or a logical vector.
The default is all variables in B not listed in
datavar or indvar. |
'NewDataVarNames' | A string array or cell array of character vectors containing names for the data
variables unstack should create in A. Default is the
group names of the grouping variable specified in indvar. |
'AggregationFun' | A function handle that accepts a subset of values from datavar and
returns a single value. unstack applies this function to observations from
the same group that have the same value of indvar. Because the function
aggregates several data values into a single value, it is not possible in such cases to
recover B from A using stack. The
default is @sum for numeric data variables. For non-numeric variables,
there is no default, and you must specify 'AggregationFun' if multiple
observations in the same group have the same value of indvar. |
'ConstVars' | Variables in B to copy to A without unstacking.
The values for these variables in A are taken from the first observation
in each group in B, so these variables should typically be constant
within each group. ConstVars is a positive integer, a vector of positive
integers, a character vector, a string array, a cell array of character vectors, or a
logical vector. The default is no variables. |
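As a hedged illustration of 'AggregationFun' and 'NewDataVarNames', the sketch below uses made-up data in which one subject has two scores at the same time point. The names Subject, Time, Score, First, and Second are hypothetical.

```matlab
% Hypothetical toy data: Subject 1 has two 't1' scores.
Subject = [1; 1; 1; 2; 2];
Time    = nominal({'t1'; 't1'; 't2'; 't1'; 't2'});
Score   = [10; 30; 5; 7; 9];
B = dataset(Subject, Time, Score);

% Default aggregation for numeric data is @sum: 10 and 30 combine to 40.
Asum = unstack(B, 'Score', 'Time');

% Use the mean instead, and name the new variables explicitly.
Amean = unstack(B, 'Score', 'Time', ...
    'AggregationFun', @mean, ...
    'NewDataVarNames', {'First', 'Second'});

disp(Asum)
disp(Amean)
```

Once two values have been combined this way, stack cannot rebuild the original five observations of B from either result.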
You can also specify more than one data variable in B, each of
which becomes a set of m variables in A. In this case,
specify datavar as a vector of positive integers, a string array or cell array
containing variable names, or a logical vector. You may specify only one variable with
indvar. The names of each set of data variables in A are
the name of the corresponding data variable in B concatenated with the names
specified in 'NewDataVarNames'. The function specified in
'AggregationFun' must return a value with a single row.
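Unstacking two data variables at once might look like the sketch below. The data and the names Subject, Time, Score, and Weight are hypothetical, and the exact form of the generated variable names is not asserted here; inspect A.Properties.VarNames to see them.

```matlab
% Hypothetical toy data with two data variables per observation.
Subject = [1; 1; 2; 2];
Time    = nominal({'t1'; 't2'; 't1'; 't2'});
Score   = [10; 20; 30; 40];
Weight  = [70; 71; 80; 82];
B = dataset(Subject, Time, Score, Weight);

% Each data variable becomes its own set of m = 2 variables in A.
A = unstack(B, {'Score', 'Weight'}, 'Time');

% One grouping variable plus 2 data variables x 2 levels = 5 variables.
disp(A.Properties.VarNames)
disp(size(A))
```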
Examples
Combine several variables for estimated influenza rates into a single variable. Then unstack the estimated influenza rates by date.
load flu
% FLU has a 'Date' variable, and 10 variables for estimated influenza rates
% (in 9 different regions, estimated from Google searches, plus a
% nationwide estimate from the CDC). Combine those 10 variables into an
% array that has a single data variable, 'FluRate', and an indicator
% variable, 'Region', that says which region each estimate is from.
[flu2,iflu] = stack(flu, 2:11, 'NewDataVarName','FluRate', ...
'IndVarName','Region')
% The second observation in FLU is for 10/16/2005. Find the observations
% in FLU2 that correspond to that date.
flu(2,:)
flu2(iflu==2,:)
% Use the 'Date' variable from that array to split 'FluRate' into 52
% separate variables, one for each unique date, each containing the
% estimated influenza rates for that date. The new array has one
% observation for each region. In effect, this is the original array FLU
% "on its side".
dateNames = cellstr(datestr(flu.Date,'mmm_DD_YYYY'));
[flu3,iflu2] = unstack(flu2, 'FluRate', 'Date', ...
'NewDataVarNames',dateNames)
% Since observations in FLU3 represent regions, IFLU2 indicates the first
% occurrence in FLU2 of each region.
flu2(iflu2,:)