Splitting up large arrays based on datetimes without using loops

Hi, I have a large dataset of 10-minute samples with an accompanying datetime array spanning many years, and I want to apply certain functions to each month. Is there a way to operate on each individual month without using nested loops? I want to calculate the skewness and kurtosis for every month and every column in the dataset, then store the results so I can run control charts on them and update them at a later date. Thanks in advance!

3 Comments

What's wrong with nested loops? Without knowing how the data are represented in your "dataset", it is hard to suggest code for processing it. I'd expect findgroups and splitapply to solve this problem without creating explicit loops.
Hi Jan. Primarily, nested loops are slow and cumbersome to code. I tried changing the datetime format to "yyyymm" and using accumarray, but it only returns zeros! See the code snippet below:
d = temp_struct.timestamps.(turbines{i_wec});
d.Format = 'yyyyMM';
temp_subs = datenum(d);
temp_vals = temp_struct.(atrib_name{i_atrib}).(turbines{i_wec})';
test = accumarray(temp_subs, temp_vals,[], @kurtosis);
I think this should work, but I'm not sure why it doesn't. Note that temp_vals is a 52704x1 array of doubles. In this instance "test" is a 736696x1 array of doubles, all zero! I'm not sure why it's so much bigger either.
Cedric on 20 Sep 2017
Edited: Cedric on 20 Sep 2017
Apparently you already have a kurtosis function. One way to debug calls to ACCUMARRAY (assuming you have already checked that the indices are fine) is to output a cell array of grouped values:
groups = accumarray(temp_subs, temp_vals,[], @(x){x});
so you can check what is passed to your aggregation function. If all groups are empty, there is an issue with your IND and/or VAL inputs. If the groups make sense, the issue is with your aggregation function.
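A possible explanation for the oversized output, offered as a sketch rather than a confirmed diagnosis: changing d.Format only affects how the datetime is displayed, so datenum(d) still returns serial day numbers (around 736,000 for dates in 2016 and 2017). accumarray allocates one output row per subscript up to the largest value and fills the unused rows with zeros. The example below builds compact group indices with unique instead. The sample data, the names monthKeys, subs, skewFcn and kurtFcn, and the moment-based formulas are assumptions for illustration; the formulas stand in for the toolbox skewness and kurtosis functions.

```matlab
% Hypothetical stand-in data: 10-minute samples over two years.
d = (datetime(2016,1,1):minutes(10):datetime(2017,12,31,23,50,0))';
vals = randn(numel(d),1);

% Year-month key such as 201609, then compact indices 1..nGroups.
key = year(d)*100 + month(d);
[monthKeys, ~, subs] = unique(key);   % subs are consecutive positive integers

% Assumed moment-based (biased) estimators, used here so the sketch
% does not depend on Statistics and Machine Learning Toolbox.
skewFcn = @(x) mean((x-mean(x)).^3) / mean((x-mean(x)).^2)^1.5;
kurtFcn = @(x) mean((x-mean(x)).^4) / mean((x-mean(x)).^2)^2;

% One result per calendar month; row k corresponds to monthKeys(k).
monthlySkew = accumarray(subs, vals, [], skewFcn);
monthlyKurt = accumarray(subs, vals, [], kurtFcn);
```

Because subs runs from 1 to the number of distinct months, the outputs have one row per month rather than one row per serial day number.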


Accepted Answer

Here's how you would do this using a table and varfun:
>> t = table(datetime(2017,1,randi(365,20,1)),randn(20,1),'VariableNames',{'Date' 'Value'})
t =
  20×2 table
       Date         Value
    ___________    ________
    05-Mar-2017      2.1778
    23-May-2017      1.1385
    31-Oct-2017     -2.4969
    21-Oct-2017     0.44133
    23-Jan-2017     -1.3981
    [snip]
>> t.Month = month(t.Date)
t =
  20×3 table
       Date         Value      Month
    ___________    ________    _____
    05-Mar-2017      2.1778       3
    23-May-2017      1.1385       5
    31-Oct-2017     -2.4969      10
    21-Oct-2017     0.44133      10
    23-Jan-2017     -1.3981       1
    [snip]
>> varfun(@mean,t,'GroupingVariable','Month','InputVariables','Value')
ans =
  10×3 table
    Month    GroupCount    mean_Value
    _____    __________    __________
      1          2           -0.3667
      2          1           0.32321
      3          3           0.41779
      4          1          -0.48094
      5          4          -0.12632
      6          3           0.97795
      7          1            0.1644
      8          2           0.65163
     10          2           -1.0278
     12          1          0.085189
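For data spanning several years, note that month returns only 1 to 12, so grouping on Month alone would pool, for example, every January together. A minimal sketch of the same varfun approach extended to group on year and month, and to cover more than one data column, is given below. The table contents, the column names A and B, and the moment-based kurtFcn are assumptions for illustration; with Statistics and Machine Learning Toolbox available, @kurtosis or @skewness could be passed instead.

```matlab
% Hypothetical stand-in data: random dates across 2016-2017, two data columns.
n = 2000;
t = table(datetime(2016,1,randi(730,n,1)), randn(n,1), randn(n,1), ...
    'VariableNames', {'Date','A','B'});

% Separate grouping variables so each calendar month of each year is its own group.
t.Year  = year(t.Date);
t.Month = month(t.Date);

% Assumed moment-based (biased) kurtosis, to avoid a toolbox dependency.
kurtFcn = @(x) mean((x-mean(x)).^4) / mean((x-mean(x)).^2)^2;

% One row per (Year, Month) pair, one result column per input variable.
stats = varfun(kurtFcn, t, ...
    'GroupingVariables', {'Year','Month'}, ...
    'InputVariables', {'A','B'});
```

The resulting table can be saved and appended to later, which may suit the control-chart use case described in the question.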

More Answers (1)

Steven Lord on 20 Sep 2017
If you have your data stored in a timetable, use retime. Specify @skewness or @kurtosis as the aggregation method, assuming you have Statistics and Machine Learning Toolbox available. If you don't, you will need to write your own functions to compute those statistics and specify those as the aggregation method when you call retime.
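As a rough illustration of the retime suggestion, the sketch below aggregates a timetable to monthly values with a function handle. It assumes a release that includes timetable and retime (R2016b or later); the sample data, the variable names A and B, and the moment-based kurtFcn are hypothetical and stand in for the toolbox kurtosis function.

```matlab
% Hypothetical stand-in data: 10-minute samples for January to March 2016.
tm = (datetime(2016,1,1):minutes(10):datetime(2016,3,31,23,50,0))';
TT = timetable(tm, randn(numel(tm),1), randn(numel(tm),1), ...
    'VariableNames', {'A','B'});

% Assumed moment-based (biased) kurtosis, used in place of @kurtosis
% so the sketch runs without Statistics and Machine Learning Toolbox.
kurtFcn = @(x) mean((x-mean(x)).^4) / mean((x-mean(x)).^2)^2;

% Aggregate every variable into calendar-month bins.
monthlyKurt = retime(TT, 'monthly', kurtFcn);
```

Each row of monthlyKurt is stamped with the start of its month, so years are kept apart without building a separate grouping variable.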

1 Comment

@Steven I only have MATLAB 2015, so that solution won't work. :(


Asked: 20 Sep 2017

Answered: 21 Sep 2017
