- What are the shape of each field? It looks like dateAll and uniqueDate are n x 3 matrices.
- What do you want to match in each? It looks like any row of dateAll with any row of uniqueDate
- What is the purpose of s and the z loop?
- Is your Trajectories structure a vector or a 2D array? You're using subscript (2D) indexing but only appear to iterate over the columns. Why aren't you using linear indexing?
Speed up code to compare fields in a struct
1 visualización (últimos 30 días)
Mostrar comentarios más antiguos
elisa ewin
el 5 de Mayo de 2016
Comentada: Guillaume
el 5 de Mayo de 2016
Hi! I have the struct Trajectories with field uniqueDate, dateAll, label: I want to compare the fields uniqueDate and dateAll and, if there is a correspondence, I will save in label a value from an other struct. I know the movements of different users: I know the dates, in which they stay in a location, and the semantic labels linked to the locations. In the same day a user visit more locations. I have a observation time in which analyze user moviments and all the dates included in this observation time are in uniqueDate; I have also all the moviments of the user that are not included in observation time that are in dateAll. So I want to compare, uniqueDate with dateAll and if there is a corrispondence between them, I save in label the semantic label of the location that is in s.place. I have attached the struct Trajectories in which:
- uniqueDate contains all the dates in which an user was in a location inclueded in an observation time
- dateAll contains all the date linked to movements of a user
I have written this code:
% k users number
for k=1:nCols
% Trajectories(1,k).dateAll contains all the movements of user
for j=1:size(Trajectories(1,k).dateAll,1)
% Trajectories(1,k).uniqueDate contains the dates linked to user's movements included in an observation time
for i=1:size(Trajectories(1,k).uniqueDate,1)
% First compare if the month, the day and the year of uniqueDate and dateAll are the same
if (~isempty(s(1,k).places))&&(Trajectories(1,k).dateAll(j,1)==Trajectories(1,k).uniqueDate(i,1))&&(Trajectories(1,k).dateAll(j,2)==Trajectories(1,k).uniqueDate(i,2))&&(Trajectories(1,k).dateAll(j,3)==Trajectories(1,k).uniqueDate(i,3))
% After I compare the hours z:indicated hours from 1 to 24
for z=1:24
if(Trajectories(1,k).dateAll(j,4)==z)&&(size(s(1,k).places.all,2)>=size(Trajectories(1,k).uniqueDate,1))
Trajectories(1,k).label(j)=s(1,k).places.all(z,i);
else if(Trajectories(1,k).dateAll(j,4)==z)&&(size(s(1,k).places.all,2)<size(Trajectories(1,k).uniqueDate,1))
for l=1:size(s(1,k).places.all,2)
Trajectories(1,k).label(l)=s(1,k).places.all(z,l);
end
end
end
end
end
end
end
end
but it's very very slow. How can I modify it to speed up?
7 comentarios
Guillaume
el 5 de Mayo de 2016
Since Elisa explicitly tests that the field is not empty, it is fine then to assume that it is a structure from then one. Of course, the test could be moved outside the i and j loops since it does not depend on them, saving a lot of processing time.
Respuesta aceptada
Guillaume
el 5 de Mayo de 2016
First, a piece of advice. Writing code that works and is efficient is only half the battle. Writing code that can be understood easily is just as important. One part of this is using names that have meaning for variables. For example instead of
% k users number
for k=1:nCols
use
for userid = 1:usercount
and instead of
for z=1:24
use
for hour = 1:24
It's immediately clearer what the code does.
Anyway, to answer your question, you would use ismember with the 'rows' option to find which dateAll match uniqueDate. The loop over the users is unavoidable but is not an issue:
for userid = 1 : numel(Trajectories)
if ~isempty(s(userid).places)
[found, udrow] = ismember(Trajectories(userid).dateAll(:, 1:3), Trajectories(userid).uniqueDate, 'rows');
found is logical vector of 0 (not found) and 1 (found) which indicates whether the corresponding dateAll matches a uniqueDate. udrow is the row index of the matching uniqueDate (or 0 if no match).
At this point, I'm not very clear what is going on with your z loop. It certainly is not necessary, you could have used simple indexing even in your original code. The equivalent would be:
allplaces = s(userid).places.all; %shortcut
if size(allplaces, 2) >= size(Trajectories(userid).uniqueDate, 1)
usedhours = Trajectories(userid).dateAll(found, 4);
Trajectories(userid).label(found) = allplaces(sub2ind(size(allplaces), usedhours, udrow(found)));
else
%the l loop was just an expensive way of copying a whole row.
%and it just kept overwriting label for all dateAll that matched a uniqueDate
%so in the end label was just the hour row that corresponded to the last matched dateAll
lasthour = Trajectories(userid).dateAll(find(found, 1, 'last'), 4);
Trajectories(userid).label = allplaces(lasthour, :);
end
end
end
One potential difference between my code and your code is if uniqueDate has repeated rows. From the name I assume it is not the case, but if it is, your original code used the row of the last repeated uniqueDate as an index into places to fill the label whereas my code, will use the index of the first one, since that is what ismember returns as second output.
0 comentarios
Más respuestas (1)
Jan
el 5 de Mayo de 2016
Editada: Jan
el 5 de Mayo de 2016
At first start with an optical simplification of the code. In current form it is not readable and this impedes recognizing locations to improve the speed:
% k users number
for k = 1:nCols
% Trajectories(1,k).dateAll contains all the movements of user
dateAll = Trajectories(k).dateAll;
uniqueDate = Trajectories(k).uniqueDate;
condition1 = (size(s(k).places.all, 2) < size(uniqueDate,1));
condition2 = ~isempty(s(k).places);
TLabel = Trajectories(k).label; % Does this field exist?
for j = 1:size(dateAll,1)
% uniqueDate contains the dates linked to user's movements included
% in an observation time
for i = 1:size(uniqueDate,1)
% First compare if the month, the day and the year of uniqueDate
% and dateAll are the same
if condition2 && all(dateAll(j,1:3) == uniqueDate(i,1:3))
% After I compare the hours z:indicated hours from 1 to 24
if condition1
for z = 1:24
if dateAll(j,4) == z
n = size(s(k).places.all, 2);
TLabel(1:n) = s(k).places.all(z, :);
end
end
else
for z = 1:24
if dateAll(j,4) == z
TLabel(j) = s(k).places.all(z,i);
end
end
end
end
end
end
Trajectories(1,k).label = TLabel;
end
As a side-effect, using temporary variables might increase the speed a little bit. Please test exhaustively if I've inserted bugs. If it works, the next step is searching for redundant work.
1 comentario
Ver también
Categorías
Más información sobre Dates and Time en Help Center y File Exchange.
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!