This question is closed. Reopen it to edit or answer it.
My nested for loops take 40 seconds for 20,000 records. Any way to vectorize or improve?
 % Consolidate Results by Security and Date
    uniqueDate = unique(d.RiskData(:,2));
    uniqueSec = unique(d.RiskData(:,1));
    numSecs = length(unique(d.RiskData(:,1)));
    numDates = length(unique(d.RiskData(:,2)));
    d.RiskDataNew{(numDates * numSecs) + 1,size(d.RiskData,2)} =[];
    d.RiskDataNew(1,1) = {'SECURITY'};
    d.RiskDataNew(1,2) = {'DATE'};
    for i = 1:numFields
        d.RiskDataNew(1,i + 2) = fldList(i);
    end
    %d.RiskDataNew = {};
    tic
    n = 1;
    for x = 1:numSecs 
      for y = 1:numDates 
        idx = find(contains(d.RiskData(1:end,1),uniqueSec(x)) & ...
                   contains(d.RiskData(1:end,2),uniqueDate(y)));
        if ~isempty(idx)
          n = n + 1;
          d.RiskDataNew(n,:)=d.RiskData(idx(1),:);
          if length(idx) > 1
            numFields = size(d.RiskData,2); 
            for j = 2:length(idx)
              for k = 3:numFields % first two fields are defaults: Security and Date
                if (isempty(d.RiskDataNew{n,k}))
                  d.RiskDataNew{n,k} = d.RiskData{idx(j),k};
                end
              end
            end
          end
        end
      end
    end
    toc
2 comments
  dpb on 15 Jun 2020
find is superfluous here and somewhat costly...
It is hard to decipher what is going on. How about an explanation of what you're trying to do and a sample of the raw data to work from?
Is it mandatory to use a struct? They're expensive relative to plain data arrays.
  Walter Roberson on 16 Jun 2020
I have not tested, but I suspect that ismember is faster than contains().
contains(d.RiskData(1:end,1),uniqueSec(x))
That sub-expression can be calculated at the for x level. You can possibly even do
    for x = 1:numSecs
        xidx = find(contains(d.RiskData(1:end,1),uniqueSec(x)));  % rows for this security, found once
      for y = 1:numDates
        idx = xidx(contains(d.RiskData(xidx,2),uniqueDate(y)));   % search dates only within those rows
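A minimal sketch of the two suggestions above (dropping find and hoisting the security match), not tested against the original data. It assumes columns 1 and 2 hold character vectors and that exact matches are intended, so strcmp stands in for contains, which matches substrings. The sample data and the names RiskData, secMask, rowMask and nFound are made up for illustration.

```matlab
% Hypothetical sample data: security, date, then two fields.
RiskData = {'AAA','2020-06-01', 1,  [];
            'AAA','2020-06-01', [], 2;
            'BBB','2020-06-02', 3,  4};
uniqueSec  = unique(RiskData(:,1));
uniqueDate = unique(RiskData(:,2));

nFound = 0;
for x = 1:numel(uniqueSec)
    % Computed once per security instead of once per (security, date) pair.
    secMask = strcmp(RiskData(:,1), uniqueSec{x});
    for y = 1:numel(uniqueDate)
        % A logical mask is enough to test for matches; no find() needed.
        rowMask = secMask & strcmp(RiskData(:,2), uniqueDate{y});
        if any(rowMask)
            nFound = nFound + 1;
        end
    end
end
disp(nFound)   % number of (security, date) pairs present in the data
```

The inner loop still visits every date, so this only removes repeated work; it does not change the overall loop structure.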
Answers (2)
  Sayyed Ahmad on 16 Jun 2020
You have to avoid the nested loops.
Maybe you can use bsxfun to avoid some of the loops.
An example:
A = rand(50); % 50-by-50 matrix of random values between 0 and 1
% method 1: slow and lots of lines of code
tic
meanA = mean(A); % mean of every matrix column: a row vector
% pre-allocate result for speed, remove this for even worse performance
result = zeros(size(A));
for j = 1:size(A,1)
    result(j,:) = A(j,:) - meanA;
end
toc
clear result % make sure method 2 creates its own result
% method 2: fast and only one line of code
tic
result = bsxfun(@minus,A,mean(A));
toc
The output would be:
Elapsed time is 0.015153 seconds.
Elapsed time is 0.007884 seconds.
See the following links for more details.
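As a side note to the example above: in MATLAB R2016b and later, implicit expansion lets the same subtraction be written without bsxfun. A short sketch that checks that the two forms agree (the variable names viaBsxfun and viaExpansion are illustrative):

```matlab
A = rand(50);                                  % same kind of test matrix as above
viaBsxfun    = bsxfun(@minus, A, mean(A));     % explicit singleton expansion
viaExpansion = A - mean(A);                    % implicit expansion, R2016b or later
disp(isequal(viaBsxfun, viaExpansion))
```

Both forms apply to numeric arrays; they do not by themselves vectorize the cell-array string matching in the question.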
0 comments
  Mark McGrath on 16 Jun 2020
      Edited: Mark McGrath on 16 Jun 2020
      3 comments
  dpb on 17 Jun 2020
Are all the "fields" numeric? Then a plain cell array would hold them. The problem remains that a different number of elements per cell will kill just about any vectorized operation, as will a variable number of struct fields or the like.
The most straightforward way, although a little more memory-costly, would be either to define a maximum N and preallocate, or to determine the maximum number in the dataset and allocate an array of that size, using NaN or some other marker for the missing values.
Or you could use a table or timetable with the necessary number of variable columns; code could then be written generically for that scenario, based on the table size and the number of variables/columns.
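One way the NaN-filled array and table ideas could look, as a hedged sketch rather than a drop-in replacement for the original loops. It assumes every field is numeric, that NaN marks a missing value, and that securities and dates are matched exactly. The sample data and the names sec, date, vals, grp and out are hypothetical.

```matlab
% Hypothetical sample data; NaN marks a missing field value.
sec  = {'AAA'; 'AAA'; 'BBB'};
date = {'2020-06-01'; '2020-06-01'; '2020-06-02'};
vals = [1 NaN; NaN 2; 3 4];            % one column per field

% One group number per distinct (security, date) pair.
[~, ~, secId]  = unique(sec);
[~, ~, dateId] = unique(date);
[keys, firstRow, grp] = unique([secId dateId], 'rows');

% For each field, keep the first non-missing value in each group.
out = nan(size(keys,1), size(vals,2));
for k = 1:size(vals,2)                 % loops over fields, not records
    rows = find(~isnan(vals(:,k)));    % rows where this field is present
    [g, first] = unique(grp(rows));    % first present row of each group
    out(g, k) = vals(rows(first), k);
end

result = table(sec(firstRow), date(firstRow), out, ...
               'VariableNames', {'SECURITY','DATE','FIELDS'});
disp(result)
```

The remaining loop runs once per field rather than once per (security, date) pair, so its cost should grow with the number of columns instead of the number of records.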