Speed up code to compare fields in a struct

1 visualización (últimos 30 días)
elisa ewin
elisa ewin el 5 de Mayo de 2016
Comentada: Guillaume el 5 de Mayo de 2016
Hi! I have the struct Trajectories with field uniqueDate, dateAll, label: I want to compare the fields uniqueDate and dateAll and, if there is a correspondence, I will save in label a value from an other struct. I know the movements of different users: I know the dates, in which they stay in a location, and the semantic labels linked to the locations. In the same day a user visit more locations. I have a observation time in which analyze user moviments and all the dates included in this observation time are in uniqueDate; I have also all the moviments of the user that are not included in observation time that are in dateAll. So I want to compare, uniqueDate with dateAll and if there is a corrispondence between them, I save in label the semantic label of the location that is in s.place. I have attached the struct Trajectories in which:
  • uniqueDate contains all the dates in which an user was in a location inclueded in an observation time
  • dateAll contains all the date linked to movements of a user
I have written this code:
% k users number
for k=1:nCols
% Trajectories(1,k).dateAll contains all the movements of user
for j=1:size(Trajectories(1,k).dateAll,1)
% Trajectories(1,k).uniqueDate contains the dates linked to user's movements included in an observation time
for i=1:size(Trajectories(1,k).uniqueDate,1)
% First compare if the month, the day and the year of uniqueDate and dateAll are the same
if (~isempty(s(1,k).places))&&(Trajectories(1,k).dateAll(j,1)==Trajectories(1,k).uniqueDate(i,1))&&(Trajectories(1,k).dateAll(j,2)==Trajectories(1,k).uniqueDate(i,2))&&(Trajectories(1,k).dateAll(j,3)==Trajectories(1,k).uniqueDate(i,3))
% After I compare the hours z:indicated hours from 1 to 24
for z=1:24
if(Trajectories(1,k).dateAll(j,4)==z)&&(size(s(1,k).places.all,2)>=size(Trajectories(1,k).uniqueDate,1))
Trajectories(1,k).label(j)=s(1,k).places.all(z,i);
else if(Trajectories(1,k).dateAll(j,4)==z)&&(size(s(1,k).places.all,2)<size(Trajectories(1,k).uniqueDate,1))
for l=1:size(s(1,k).places.all,2)
Trajectories(1,k).label(l)=s(1,k).places.all(z,l);
end
end
end
end
end
end
end
end
but it's very very slow. How can I modify it to speed up?
  7 comentarios
elisa ewin
elisa ewin el 5 de Mayo de 2016
My code run but it's very slow... s is a database that I don't create. I have taken the data from this database for my applications
Guillaume
Guillaume el 5 de Mayo de 2016
Since Elisa explicitly tests that the field is not empty, it is fine then to assume that it is a structure from then one. Of course, the test could be moved outside the i and j loops since it does not depend on them, saving a lot of processing time.

Iniciar sesión para comentar.

Respuesta aceptada

Guillaume
Guillaume el 5 de Mayo de 2016
First, a piece of advice. Writing code that works and is efficient is only half the battle. Writing code that can be understood easily is just as important. One part of this is using names that have meaning for variables. For example instead of
% k users number
for k=1:nCols
use
for userid = 1:usercount
and instead of
for z=1:24
use
for hour = 1:24
It's immediately clearer what the code does.
Anyway, to answer your question, you would use ismember with the 'rows' option to find which dateAll match uniqueDate. The loop over the users is unavoidable but is not an issue:
for userid = 1 : numel(Trajectories)
if ~isempty(s(userid).places)
[found, udrow] = ismember(Trajectories(userid).dateAll(:, 1:3), Trajectories(userid).uniqueDate, 'rows');
found is logical vector of 0 (not found) and 1 (found) which indicates whether the corresponding dateAll matches a uniqueDate. udrow is the row index of the matching uniqueDate (or 0 if no match).
At this point, I'm not very clear what is going on with your z loop. It certainly is not necessary, you could have used simple indexing even in your original code. The equivalent would be:
allplaces = s(userid).places.all; %shortcut
if size(allplaces, 2) >= size(Trajectories(userid).uniqueDate, 1)
usedhours = Trajectories(userid).dateAll(found, 4);
Trajectories(userid).label(found) = allplaces(sub2ind(size(allplaces), usedhours, udrow(found)));
else
%the l loop was just an expensive way of copying a whole row.
%and it just kept overwriting label for all dateAll that matched a uniqueDate
%so in the end label was just the hour row that corresponded to the last matched dateAll
lasthour = Trajectories(userid).dateAll(find(found, 1, 'last'), 4);
Trajectories(userid).label = allplaces(lasthour, :);
end
end
end
One potential difference between my code and your code is if uniqueDate has repeated rows. From the name I assume it is not the case, but if it is, your original code used the row of the last repeated uniqueDate as an index into places to fill the label whereas my code, will use the index of the first one, since that is what ismember returns as second output.

Más respuestas (1)

Jan
Jan el 5 de Mayo de 2016
Editada: Jan el 5 de Mayo de 2016
At first start with an optical simplification of the code. In current form it is not readable and this impedes recognizing locations to improve the speed:
% k users number
for k = 1:nCols
% Trajectories(1,k).dateAll contains all the movements of user
dateAll = Trajectories(k).dateAll;
uniqueDate = Trajectories(k).uniqueDate;
condition1 = (size(s(k).places.all, 2) < size(uniqueDate,1));
condition2 = ~isempty(s(k).places);
TLabel = Trajectories(k).label; % Does this field exist?
for j = 1:size(dateAll,1)
% uniqueDate contains the dates linked to user's movements included
% in an observation time
for i = 1:size(uniqueDate,1)
% First compare if the month, the day and the year of uniqueDate
% and dateAll are the same
if condition2 && all(dateAll(j,1:3) == uniqueDate(i,1:3))
% After I compare the hours z:indicated hours from 1 to 24
if condition1
for z = 1:24
if dateAll(j,4) == z
n = size(s(k).places.all, 2);
TLabel(1:n) = s(k).places.all(z, :);
end
end
else
for z = 1:24
if dateAll(j,4) == z
TLabel(j) = s(k).places.all(z,i);
end
end
end
end
end
end
Trajectories(1,k).label = TLabel;
end
As a side-effect, using temporary variables might increase the speed a little bit. Please test exhaustively if I've inserted bugs. If it works, the next step is searching for redundant work.
  1 comentario
Guillaume
Guillaume el 5 de Mayo de 2016
As mentioned, most of the loops are entirely unnecessary.

Iniciar sesión para comentar.

Categorías

Más información sobre Dates and Time en Help Center y File Exchange.

Etiquetas

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by