Mean data of each month (Data Matrix)
2 visualizaciones (últimos 30 días)
Mostrar comentarios más antiguos
Hello,
I have a long data matrix with data from the year 2000 to 2010 of the following format (see picture below)
row 1= year
row2=month
row3=day of the month
row4=data
How can i calculate the average for each month of each year? (ie. each jan average, feb average, 10 march average.....)
Thanks in advance
2 comentarios
the cyclist
el 20 de Nov. de 2022
Editada: the cyclist
el 20 de Nov. de 2022
Do you want the average of all Januarys together, or January 2000 separate average from January 2001?
Also, when you say "data matrix", how is the data stored? From your screenshot, it looks like it might be a table. Can you upload the data?
Respuestas (1)
the cyclist
el 20 de Nov. de 2022
Here is one way:
% Create a table of pretend data, in your format
Var1 = [2000; 1; 1; 0.12];
Var2 = [2000; 1; 2; 0.13];
Var3 = [2000; 2; 1; 0.14];
Var4 = [2001; 1; 1; 0.15];
Var5 = [2001; 1; 2; 0.16];
Var6 = [2001; 2; 1; 0.17];
T = table(Var1,Var2,Var3,Var4,Var5,Var6);
% Because none of the variable label information is actually encoded in the
% table, convert to array, and then recast into meaningful column names.
% This is known as the "tidy" format.
A = table2array(T);
T2 = array2table(A','VariableNames',{'year','month','day','data'});
% Split-apply to get mean of each month
[G,TID] = findgroups(T2(:,{'year','month'}));
monthMean = splitapply(@mean,T2(:,'data'),G);
% Put it all together in a new table
output = [TID,table(monthMean)]
It is a bit awkward, but the primary reason for that awkwardness is that your data are not "tidy". Therefore, the vast majority of the code is actually getting the data into the format this is expected for most table operations (i.e. each column is a feature label, and each row is an observation).
If you can go back upstream and get your data into the format of my table T2 here, I would do that.
1 comentario
Campion Loong
el 30 de Nov. de 2022
Editada: Campion Loong
el 30 de Nov. de 2022
I agree with @the cyclist above, that it will be much more straightforward with your data in his T2 format. That said, using your initial data (), it will be cleaner with timetable
Resuse @the cyclist's pretended data:
Var1 = [2000; 1; 1; 0.12];
Var2 = [2000; 1; 2; 0.13];
Var3 = [2000; 2; 1; 0.14];
Var4 = [2001; 1; 1; 0.15];
Var5 = [2001; 1; 2; 0.16];
Var6 = [2001; 2; 1; 0.17];
T = table(Var1,Var2,Var3,Var4,Var5,Var6)
Put your timestamped data in a timetable:
% timestamps as datetime from rows in table T
timestamps = datetime(T{1,:}, T{2,:}, T{3,:});
TT = timetable(timestamps', T{4,:}') % note the transposes within
groupsummary(TT, "Time", "month", "mean")
Ver también
Categorías
Más información sobre Language Fundamentals en Help Center y File Exchange.
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!