Table transform - rows to variable number of columns

2 visualizaciones (últimos 30 días)
Lauren Jones-Lush
Lauren Jones-Lush el 15 de Jun. de 2021
Editada: Pramil el 24 de Mayo de 2024
Trying to transform a table that is one ID to variable number of rows, to one that is one ID to variable number columns:
Original Table: Y x 3 table
ID State Item
A1 MD Pencil
A1 NJ Pen
B2 MD Pen
...
Z26 MD Marker
Z26 NJ Crayon
Z26 DE Pencil
Desired Table: length(unique(ID) x (1+2*N) table
ID State1 Item1 State2 Item2 StateN ItemN
A1 MD Pencil NJ Pen
B2 MD Pen
...
Z26 MD Marker NJ Crayon DE Pencil
without a painful for loop... Thank you for suggestions!
  2 comentarios
the cyclist
the cyclist el 15 de Jun. de 2021
I don't know if there is a slick way to do this without loops, but I am really curious why you think this is a good idea. Your data are in what is known as the tidy format, and this is typically ideal for later analytics.
Do you mind explaining what your next step with the reformatted table is? Maybe there is a way to handle it with the data in their current layout.
Lauren Jones-Lush
Lauren Jones-Lush el 15 de Jun. de 2021
Hehe. I don't actually think it is a good idea ; ) but the goal is to create an end user excel file, with one row per ID, to which user will mannually add any additional column pairs from their own local records, (and which I will then suck back in to tidy format for analytics).

Iniciar sesión para comentar.

Respuestas (1)

Pramil
Pramil el 24 de Mayo de 2024
Editada: Pramil el 24 de Mayo de 2024
Hi Lauren,
To achieve the said transformation without using the “for” loop, you can do the following in MATLAB R2021b:
data = {'A1', 'MD', 'Pencil'; 'A1', 'NJ', 'Pen'; 'B2', 'MD', 'Pen'; 'Z26', 'MD', 'Marker'; 'Z26', 'NJ', 'Crayon'; 'Z26', 'DE', 'Pencil'};
originalTable = cell2table(data, 'VariableNames', {'ID', 'State', 'Item'});
  • Let us assume the data is stored in “OriginalTable” variable. Count the occurrences of each ID” using “groupcounts” method and find the count of “ID” having maximum occurrence using “max” function.
% Count the occurrences of each ID
idCounts = groupcounts(originalTable.ID);
% Find the maximum count to determine the number of columns needed
maxItems = max(idCounts);
The value “maxItems” dictates the number of “state” and “item” columns needed in the final table.
  • Group the original table by “ID” and apply a custom function “groupFun” to each group. This function will organize the grouped data into a format that aligns with the desired table structure. Use “splitapply” for this step, passing “maxItems” as an additional argument to “groupFun”.
% Group by ID and apply the function
groupedData = splitapply(@(varargin) groupFun(varargin, maxItems), originalTable, findgroups(originalTable.ID));
  • Define the custom function “groupFun”.
function out = groupFun(varargin)
maxLength = 2*varargin{2}+1;
currLength = 2*length(varargin{1}{1})+1;
% Preallocate cell array for the maximum number of items
out = cell(1, maxLength);
out(1) = varargin{1}{1}(1);
out(2:2:currLength) = varargin{1}{2}';
out(3:2:currLength) = varargin{1}{3}';
end
  • After applying the custom function to each group, the “groupedData” needs to be structured into a table. Define the variable names for this table, considering the “ID” and the dynamic number of “State” and “Item” columns based on “maxItems”.
% Convert grouped data into a table
varNames = cell(1,2*maxItems+1);
varNames(1) = {'ID'};
varNames(2:2:2*maxItems+1) = cellstr(strcat('State', string(1:maxItems)));
varNames(3:2:2*maxItems+1) = cellstr(strcat('Item', string(1:maxItems)));
desiredTable = cell2table(groupedData, 'VariableNames', varNames)
desiredTable = 3x7 table
ID State1 Item1 State2 Item2 State3 Item3 _______ ______ __________ ____________ ____________ ____________ ____________ {'A1' } {'MD'} {'Pencil'} {'NJ' } {'Pen' } {0x0 double} {0x0 double} {'B2' } {'MD'} {'Pen' } {0x0 double} {0x0 double} {0x0 double} {0x0 double} {'Z26'} {'MD'} {'Marker'} {'NJ' } {'Crayon' } {'DE' } {'Pencil' }
You can check out the functions “groupcounts”, “findgroups” and “splitapply” through the following links to know more about their usage:
Hope it helps.

Categorías

Más información sobre Data Preprocessing en Help Center y File Exchange.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by