Index to elements not listed in numeric index?

Andrew Landau on 25 Nov 2018
Commented: Andrew Landau on 25 Nov 2018
Some functions return lists of indices, such as unique and ismember. Let's say I want to index to every element that isn't listed:
A = [1 1 2 2 3 3];
[uA, idxuA] = unique(A); % uA = [1 2 3], idxuA = [1 3 5]
idxDuplicates = true(length(A),1);
idxDuplicates(idxuA) = false;
duplicatesInA = A(idxDuplicates);
But that doesn't seem very efficient, and it would be nice to do something like:
duplicatesInA = A(~idxuA);
I really have two questions for the MATLAB/coding experts:
(1) Is there an efficient and direct way to use '~' on a list of indices?
(2) Is it worth it to optimize this or should I just deal with the extra few lines of code?
  2 comments
Rik on 25 Nov 2018
I don't really consider myself to be an expert, but I'll still add my thoughts on this:
  1. Not that I know of. If it were a logical vector this would indeed be the way to do it, but since linear indices are returned, this might be the only way.
  2. Longer code can actually be more efficient, and more readable. That being said, as long as you are aware of where the bottlenecks in your code are, you are miles ahead of many users. Unless your function is doing this millions of times in a loop, I don't think it is worth the extra effort to optimize this particular issue.
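To illustrate the first point, here is a minimal sketch using the index list from the question written out as a literal (an assumption, so the example does not depend on the defaults of unique). Applying '~' to numeric indices converts every nonzero value to false, so it produces an all-false mask rather than the complement of the list.

```matlab
A = [1 1 2 2 3 3];
idxuA = [1 3 5];            % linear indices, written as a literal for this sketch

% '~' treats each nonzero number as true and negates it, so every entry is false.
negated = ~idxuA;           % logical [0 0 0]
viaNegation = A(negated);   % logical indexing with no true entries selects nothing

% The complement has to be built as a mask the same size as A.
keep = true(size(A));
keep(idxuA) = false;
duplicatesInA = A(keep);    % elements at positions 2, 4 and 6

disp(isempty(viaNegation))
disp(duplicatesInA)
```

So A(~idxuA) runs without an error but silently returns an empty result, which is why the mask has to be built explicitly.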
Stephen23 on 25 Nov 2018
setdiff does the job quite easily.


Accepted Answer

Andrew Landau on 25 Nov 2018
Edited: Andrew Landau on 25 Nov 2018
Thanks everyone. The function I was looking for is the one Matt J suggested: setdiff. However, I did a little profiling to check speeds. Making a true array and setting the indexed elements to false is faster than setdiff by an order of magnitude. So, right you are, Rik: the longer code is more efficient in this case.
Here's the code I used if you want to test it:
% Set up some random data for testing
% ** the result was robust to changing N and K
N = 10000;
K = 500;
data = randn(N,1);
idx = randperm(N,K);
% if anyone has a better way to preallocate cell arrays please tell me!
P = 1000;
timing = cell(1,2);
timing = cellfun(@(c) zeros(P,1), timing, 'uni', 0);
for p = 1:P
    % Fastest by order of magnitude
    tic
    i1 = true(1,N); % define boolean array
    i1(idx) = false; % set all elements from index to false
    d11 = data(i1); % keep everything that wasn't in the index
    timing{1}(p) = toc;
    % Ten times slower
    tic
    i2 = setdiff(1:N,idx); % Get index of everything from 1:N not in idx
    d12 = data(i2); % setdiff(1:N,idx) as argument to data() had comparable timing
    timing{2}(p) = toc;
end
avgtime = cellfun(@mean, timing, 'uni', 1);
fprintf('Boolean array: %.2fµs -- Setdiff: %.2fµs -- Ratio: %.2f\n', avgtime(1)*1000000, avgtime(2)*1000000, avgtime(2)/avgtime(1));
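For anyone who wants to repeat the comparison, the same measurement can also be sketched with timeit, which handles warm-up and repetition itself. This is only a sketch under some assumptions: the helper names keepByMask and keepBySetdiff are hypothetical (they are not from the thread), and the local functions at the end of the script need R2016b or later. No timing figures are claimed here; the numbers depend on the machine and release.

```matlab
% Same test data as in the benchmark above
N = 10000;
K = 500;
data = randn(N,1);
idx = randperm(N,K);

% timeit expects a function handle that takes no inputs
tMask    = timeit(@() keepByMask(data, idx));
tSetdiff = timeit(@() keepBySetdiff(data, idx));
fprintf('Boolean array: %.2f us -- Setdiff: %.2f us -- Ratio: %.2f\n', ...
    tMask*1e6, tSetdiff*1e6, tSetdiff/tMask);

% Both approaches should select exactly the same elements
sameResult = isequal(keepByMask(data, idx), keepBySetdiff(data, idx));

% --- hypothetical helpers (local functions must come last in a script) ---
function out = keepByMask(data, idx)
    keep = true(size(data));   % start with every element selected
    keep(idx) = false;         % deselect the listed indices
    out = data(keep);
end

function out = keepBySetdiff(data, idx)
    out = data(setdiff(1:numel(data), idx));   % indices not listed in idx
end
```

Wrapping each approach in a function also keeps the two code paths comparable, since both pay the same call overhead.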

More Answers (2)

Matt J on 25 Nov 2018
Edited: Matt J on 25 Nov 2018
Your way is probably the most efficient, but an alternative with shorter syntax is,
duplicatesInA = A( setdiff(1:numel(A), idxuA) );
  1 comment
Andrew Landau on 25 Nov 2018
Yeah, the boolean array is 10x faster. Thanks for your input though!



Matt J on 25 Nov 2018
Edited: Matt J on 25 Nov 2018
Is it worth it to optimize this or should I just deal with the extra few lines of code?
There's never a reason to deal with extra lines of code if it's an operation that you do often. That's what M-file functions are for.
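function Ac = complement(A,idx)
% Return the elements of A whose linear indices are NOT listed in idx
Ic = true(numel(A),1); % start with every element selected
Ic(idx) = false;       % deselect the listed indices
Ac = A(Ic);            % index with the mask itself
end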
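A related idiom that was not raised in this thread, offered only as a sketch: assigning [] to the listed positions of a copy deletes them and leaves the complement in one statement. The variable names are illustrative, and no claim is made here about how its speed compares with the mask approach.

```matlab
A = [1 1 2 2 3 3];
idx = [1 3 5];     % linear indices to exclude

Ac = A;            % work on a copy so A itself is left unchanged
Ac(idx) = [];      % delete the listed positions; what remains is the complement

disp(Ac)
```

Like the mask version, this keeps the remaining elements in their original order, and repeated entries in idx do no harm.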

Release

R2017a
