Borrar filtros
Borrar filtros

Summing over observations in unbalanced panel data

40 visualizaciones (últimos 30 días)
Selcen Cakir
Selcen Cakir el 16 de Ag. de 2024 a las 5:02
Editada: Shishir Reddy el 20 de Ag. de 2024 a las 7:56
Hello, I have an unbalanced panel data set. For each id, I'd like to sum all values of x up to the latest time I observe that id, and record the summation to a new variable. How can I do this? Ideally, I'd like to avoid looping as I have a large dataset and I try to speed up the process.
Thank you in advance!
Selcen
  2 comentarios
the cyclist
the cyclist el 16 de Ag. de 2024 a las 23:01
Can you upload the data, or a small representative sample? You can use the paper clip icon in the INSERT section of the toolbar.
dpb
dpb el 17 de Ag. de 2024 a las 15:05
As @the cyclist notes, w/o a sample dataset we're pretty-much without recourse to a direct reponse, but look at rowfun and/or splitapply

Iniciar sesión para comentar.

Respuestas (1)

Shishir Reddy
Shishir Reddy el 20 de Ag. de 2024 a las 7:56
Editada: Shishir Reddy el 20 de Ag. de 2024 a las 7:56
Hi Selcen
As per my understanding you would like to sum all the values of a specific variable till the latest occurrence of that variable in an unbalanced panel data set and record the summation to a new variable.
Assuming that the data is in a MATLAB table format with at least 3 columns ‘id’, ‘time’, and ‘x’, the following is a sample MATLAB code to achieve this.
% Sample unsorted data
data = table([2; 1; 3; 2; 1; 3; 1], [1; 3; 2; 2; 1; 1; 2], [5; 20; 30; 15; 10; 25; 35], 'VariableNames', {'id', 'time', 'x'});
data = sortrows(data, {'id', 'time'}); %Sort the table by 'id' and 'time'
[~, idx] = unique(data.id, 'last'); %Find the maximum time for each 'id'
data.cumSumX = cumsum(data.x); %Calculate the cumulative sum of 'x' for each 'id'
latestCumulativeSum = data.cumSumX(idx); %Extract the cumulative sum at the latest time for each 'id'
result = table(data.id(idx), latestCumulativeSum, 'VariableNames', {'id', 'LatestCumSumX'});
% Display the result
disp(result);
id LatestCumSumX __ _____________ 1 65 2 85 3 140
For more information regarding the ‘cumSum’ function kindly refer the following documentation https://www.mathworks.com/help/matlab/ref/cumsum.html
I hope this helps.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by