Use splitapply with division

Luca on 27 Sep 2023
Edited: Luca on 28 Sep 2023
Hi,
I have data for a total of 76 stocks over one year. I would like to normalize each stock's data by dividing its whole time series by its first entry.
For a single stock it works like this:
D1990 = D(D.year==1990 & D.gvkey==15497,:);
D1990.pricenorm = D1990{:,"priceadj"}./D1990{1,"priceadj"};
The data looks like this.
Here gvkey is the unique stock ID and priceadj is the price of the stock each day; the other variables are just date variables.
So my idea was to do it with splitapply, but unfortunately I can't get it to work.
[group1, ID] = findgroups(D1990.gvkey);
x = splitapply(@(x,y) x./y, D1990{:,"priceadj"}, D1990{1,"priceadj"} group1);
I think using the ID as the group doesn't work, and I'm also not sure whether I'm using the function in splitapply correctly.
I have also attached the actual file.
Does someone know how to fix it?
Thank you in advance.

Accepted Answer

Mario Malic on 27 Sep 2023
Hey, is this what you are looking for?
load D1990.mat
[group1, ID] = findgroups(D1990new.gvkey);
y = splitapply(@(x) {x./x(1)}, D1990new.priceadj, group1)
D1990new.priceadjNorm = cell2mat(y)
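One caveat worth keeping in mind: cell2mat(y) stacks the groups in group order, so the result lines up with the table rows only if the table is sorted by gvkey with each stock's rows contiguous. A minimal sketch of an order-independent variant follows, assuming hypothetical toy data (the gvkey and priceadj values below are made up, not taken from the attached file). It asks splitapply for one scalar per group and then expands that back to the rows by indexing with the group numbers.

```matlab
% Hypothetical toy data: two stocks whose rows are interleaved (not sorted by gvkey).
gvkey    = [2; 1; 2; 1; 2];
priceadj = [50; 10; 55; 12; 45];
T = table(gvkey, priceadj);

% g(i) is the group number of row i (groups are numbered by sorted gvkey).
g = findgroups(T.gvkey);

% One scalar per group: the first price that appears for that stock.
firstPrice = splitapply(@(p) p(1), T.priceadj, g);

% Index with g to get one divisor per row, then divide element-wise.
T.pricenorm = T.priceadj ./ firstPrice(g);
```

Because the function handle returns a scalar per group, no cell wrapper or cell2mat is needed, and every row keeps its original position.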
1 Comment
Luca on 28 Sep 2023
Thank you very much, it worked.


More Answers (1)

dpb on 27 Sep 2023
Edited: dpb on 28 Sep 2023
@Mario Malic fixed the problem with splitapply: you only wanted to divide by the first element of each group. Because that element is a scalar, the "dot" divide operator isn't required here; using it doesn't hurt anything and is probably best practice, but plain division gives the same result.
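To make the point about the divide operators concrete, here is a small sketch using three of the prices shown in this thread as a stand-in column vector (the variable names are only for illustration). With a scalar divisor, matrix right division and element-wise division are documented to be equivalent.

```matlab
% Column of prices; p(1) is a scalar divisor.
p = [1908.18; 1799.55; 1804.27];

a = p /  p(1);   % matrix right division by a scalar
b = p ./ p(1);   % element-wise division

% Both forms give a column the same size as p with the same values.
sameResult = isequal(a, b);
```

The equivalence holds only because the divisor is a scalar; with a non-scalar divisor the two operators mean different things, which is why ./ is the safer habit.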
An alternative to illustrate some other newer features of tables...
load D1990
tD=D1990new; % get a short name for convenience
clear D1990new
tD=addvars(tD,cell2mat(rowfun(@(p)p/p(1),tD,'GroupingVariables',{'gvkey'},'InputVariables',{'priceadj'}, ...
'OutputVariableName',{'pricenorm'},'OutputFormat','cell')), ...
'After','priceadj','NewVariableNames',{'pricenorm'});
format bank
head(tD)
     gvkey         date        month     year      monthyear    monthyear_1    priceadj    pricenorm
    ________    ___________    _____    _______    _________    ___________    ________    _________
    15497.00    30-Jan-1990     1.00    1990.00    Jan-1990      Jan-1990      1908.18       1.00
    15497.00    13-Feb-1990     2.00    1990.00    Feb-1990      Feb-1990      1908.18       1.00
    15497.00    23-Feb-1990     2.00    1990.00    Feb-1990      Feb-1990      1799.55       0.94
    15497.00    26-Feb-1990     2.00    1990.00    Feb-1990      Feb-1990      1804.27       0.95
    15497.00    28-Feb-1990     2.00    1990.00    Feb-1990      Feb-1990      1799.55       0.94
    15497.00    01-Mar-1990     3.00    1990.00    Mar-1990      Mar-1990      1794.82       0.94
    15497.00    06-Mar-1990     3.00    1990.00    Mar-1990      Mar-1990      1790.10       0.94
    15497.00    07-Mar-1990     3.00    1990.00    Mar-1990      Mar-1990      1794.82       0.94
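Another possible route, not used by either answer, is grouptransform, which applies a function per group and returns the result in the original row order. The sketch below assumes hypothetical toy data and the array form of grouptransform (data vector, grouping vector, function handle); it should be available in the R2023a release this question is tagged with, but treat it as an illustration rather than a tested drop-in for the attached file.

```matlab
% Hypothetical toy data: rows of the two stocks are deliberately interleaved.
gvkey    = [2; 1; 2; 1; 2];
priceadj = [50; 10; 55; 12; 45];
tD = table(gvkey, priceadj);

% Apply p./p(1) within each gvkey group; the output has one value per
% input row, in the same order as the rows of tD.
tD.pricenorm = grouptransform(tD.priceadj, tD.gvkey, @(p) p ./ p(1));
```

This avoids both the cell output and the cell2mat step, and it does not depend on the table being sorted by gvkey.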
1 Comment
Luca on 28 Sep 2023
Edited: Luca on 28 Sep 2023
Thank you very much, this works too. I wasn't aware of the function addvars; it's cool to learn something new.

Release

R2023a
