Extracting data based on categories

6 visualizaciones (últimos 30 días)
Allie
Allie el 11 de Jul. de 2019
Editada: Guillaume el 12 de Jul. de 2019
I have a table (z) with two columns, year and annual precipitation. I have a separate vector (E) of three categories (0,1,2) that correspond to the years in the first table. The categories are unevenly spaced. I would like to create three new tables of year and annual precipitation, one for each category. How do I extract the correct year and precipitation data for each category and populate new tables?
z=['01-Jan-1973' 0.0114583333333333
'01-Jan-1974' 0.0918518518518519
'01-Jan-1975' 0.0529032258064516
'01-Jan-1976' 0.213571428571429
'01-Jan-1977' 0.0857692307692308]
E=[0 1 0 2 1]
I would like to create three new tables in the same format as z, like below.
A=['01-Jan-1973' 0.0114583333333333
'01-Jan-1975' 0.0529032258064516]
B=['01-Jan-1974' 0.0918518518518519
'01-Jan-1977' 0.0857692307692308]
C=['01-Jan-1976' 0.213571428571429]
My actual tables and vectors are 1x79, making extracting the three new tables challenging to do by hand.
  2 comentarios
Guillaume
Guillaume el 11 de Jul. de 2019
Can you give an example of inputs and desired output. Preferably using valid matlab syntax.
Allie
Allie el 11 de Jul. de 2019
I updated the question. Do you need more information still?

Iniciar sesión para comentar.

Respuesta aceptada

Star Strider
Star Strider el 11 de Jul. de 2019
The splitapply function (R2015b and later releases) is perfect for this!
z = {'01-Jan-1973' 0.0114583333333333
'01-Jan-1974' 0.0918518518518519
'01-Jan-1975' 0.0529032258064516
'01-Jan-1976' 0.213571428571429
'01-Jan-1977' 0.0857692307692308};
E=[0 1 0 2 1];
Out = splitapply(@(x){x}, z, E(:)+1);
A = Out{1}
B = Out{2}
C = Out{3}
producing:
A =
2×2 cell array
{'01-Jan-1973'} {[0.0114583333333333]}
{'01-Jan-1975'} {[0.0529032258064516]}
B =
2×2 cell array
{'01-Jan-1974'} {[0.0918518518518519]}
{'01-Jan-1977'} {[0.0857692307692308]}
C =
1×2 cell array
{'01-Jan-1976'} {[0.213571428571429]}
  4 comentarios
Star Strider
Star Strider el 12 de Jul. de 2019
I did not know your dates were in datetime format, so that is likely the reason the splitapply code failed.
I added the accumarray code later when I realised that you may not have splitapply. I’m glad it worked!
Guillaume
Guillaume el 12 de Jul. de 2019
Editada: Guillaume el 12 de Jul. de 2019
Note: If you are actually using a table (or a timetable, which would be better probably), the best solution might be not to split the table at all and just add the E as an extra table variable.
Matlab has plenty of aggregation functions (rowfun, varfun, splitapply as shown here, groupsummary) which allows you to apply the same function to each group of a table all at once. More often, the code is actually simpler and faster if the data is not split beforehand.
e.g:
%demo table
T = table(datetime({'01-Jan-1973'; '01-Jan-1974'; '01-Jan-1975'; '01-Jan-1976'; '01-Jan-1977'}), ...
[0.0114583333333333; 0.0918518518518519; 0.0529032258064516; 0.213571428571429; 0.0857692307692308], ...
'VariableNames', {'Date', 'Value'});
T.group = [0; 1; 0; 2; 1]
groupsummary(T, 'group', {'mean', 'sum', 'std'}, 'Value')
outputs:
ans =
3×5 table
group GroupCount mean_Value sum_Value std_Value
_____ __________ __________________ __________________ ___________________
0 2 0.0321807795698924 0.0643615591397849 0.0293059645132893
1 2 0.0888105413105414 0.177621082621083 0.00430106261490964
2 1 0.213571428571429 0.213571428571429 0

Iniciar sesión para comentar.

Más respuestas (0)

Categorías

Más información sobre Data Preprocessing en Help Center y File Exchange.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by