Select neighbours of a vector efficiently

Question

1 vote

a = 1:10;
k = 4;

I have to loop through the elements of a and, for each element n, pick its k neighbours (here k = 4), split equally between the two sides and excluding n itself. Near the boundaries the window shifts inward, so that k neighbours are still returned:

 a   n             neigh
 *             2 3 4 5 
   *         1   3 4 5 
     *       1 2   4 5
       *       2 3   5 6
         *       3 4   6 7
           *       4 5   7 8
             *       5 6   8 9
               *       6 7   9 10
                 *     6 7 8   10
                   *   6 7 8 9

My dummy algorithm:

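nmA = numel(a);
for n = a
    lo = n - k/2;              % tentative window bounds around n
    up = n + k/2;
    if lo < 1                  % window runs off the start: shift it right
        up = up + 1 - lo;
        lo = 1;
    elseif up > nmA            % window runs off the end: shift it left
        lo = lo - up + nmA;
        up = nmA;
    end
    out = setdiff(lo:up, n)    % neighbours, excluding n itself
end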

Any suggestions for an efficient algorithm? (Filters, maybe?)

I am retrieving the neighbours because I need to calculate the trimmed mean and std on the neighbours.

(NOTE: my real a can have as many as 3e6 elements, with k = 60; k is always even)

Thanks in advance

---------------------------------------------------------------------------------------------------------------------------------------------------------

THE FINAL ALGORITHM Just for fun... and thanks to everybody.

I have a dataset which contains intraday financial data. The first part of the dataset has 2 612 670 observations spread among 1061 days. (I have 10 parts)

For each day I have to calculate the moving mean/std, with special conditions at the beginning and end of the day.

At first I looped over the days, but merely indexing the observations belonging to each day took 75% of the total computation time (> 22 sec).

In the end I chose the vectorised solution proposed by Jan and applied the vectorization concept to eliminate the day by day loop.

Now, for k = 20 it takes < 3 sec!
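The cost of indexing each day separately can be avoided by locating the day boundaries once. A minimal sketch on invented serial dates (the names dates, last and numobs are illustrative, and the data are made up):

```matlab
% Toy serial dates: integer part = day, fractional part = time of day.
% (Invented data, for illustration only.)
dates = [730000.40; 730000.55; 730000.70; ...
         730001.41; 730001.60; ...
         730002.39; 730002.50; 730002.65; 730002.72];

% Index of the last observation of each day, preceded by 0.
% fix() drops the time of day; diff() is nonzero where the day changes.
last = [0; find(diff(fix(dates))); numel(dates)];

% Number of observations per day.
numobs = diff(last);

% Observations of day d occupy rows last(d)+1 : last(d+1).
d = 2;
rows = last(d)+1 : last(d+1);
```

Once last is known, selecting a day is a plain range lookup, so no per-day search over the dates is needed.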

% Load part of the dataset (first column serial dates, second column prices)
tmp = load([R.d 'dataset.mat'],fields{part});
k = 20;
k2 = k / 2;
% Last observation of each day, preceded by 0
last(1,1,:) = uint32([0
                      find(diff(fix(tmp.(fields{part})(:,1))))
                      size(tmp.(fields{part})(:,1),1)]);
% Keep only the prices (free some memory)
tmp = tmp.(fields{part})(:,2);
% Beginning of the day
Ini = repmat(uint32(1:k+1).',1,k2);
Ini(1:k+2:k*k2) = [];
Ini = reshape(Ini, k, k2);
Ini = bsxfun(@plus,Ini,last(1:end-1));
% End of the day
Fin = bsxfun(@minus, last(2:end), Ini(k:-1:1,k2:-1:1,1))+1;
% Number of observations per day
numobs = diff(last);
% Create the middle part and concatenate it with Ini and Fin, one day at a
% time (doing all days at once runs out of memory)
last = cat(3,last,size(tmp,1));
pos   = uint32([1:k2, k2+2:k+1].');
mu    = zeros(last(end),1);
sigma = mu;
for n = uint16(1:numel(numobs))
    neigh = sort(tmp([Ini(:,:,n),...
                      bsxfun(@plus, pos, uint32(0:numobs(n)-k-1)),...
                      Fin(:,:,n)]));
    mu(last(n)+1:last(n+1))    = mean(neigh(2:end-1,:));
    sigma(last(n)+1:last(n+1)) = std(neigh(2:end-1,:));
end
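The trimming step (sort the neighbours of each observation, then drop the smallest and the largest before calling mean and std) can be checked on a small vector. A minimal sketch, assuming a single day, k = 4 and made-up prices; the index matrix idx is built with a plain loop here only to keep the example short:

```matlab
p  = [10 12 11 50 13 12 14 9 15 13].';   % made-up prices, one "day"
k  = 4;
nP = numel(p);

% Neighbour indices, one column per observation (k-by-nP), built with the
% shifted-window rule from the question.
idx = zeros(k, nP);
for n = 1:nP
    lo = min(max(n - k/2, 1), nP - k);   % clamp window start to [1, nP-k]
    w  = lo:lo+k;                        % k+1 consecutive positions
    idx(:, n) = w(w ~= n);               % drop the element itself
end

% Sort each column, then drop the first and last row (min and max).
neigh = sort(p(idx));
mu    = mean(neigh(2:end-1, :), 1).';
sigma = std(neigh(2:end-1, :), 0, 1).';
```

For the fourth observation (the outlier 50) the neighbours are 12, 11, 13 and 12; after dropping 11 and 13, the trimmed mean is 12 and the trimmed std is 0.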

2 comments

Jan on 22 Jul 2011

Do you want to create the [out] vector dynamically, or do you need a complete matrix containing the index vectors as rows?

Oleg Komarov on 22 Jul 2011

I will start testing the solutions asap. I will see if it is faster to store the indices and call mean and std once.


Answer 1

Jan on 22 Jul 2011

3 votes

If you want the indices as a matrix, with one row of neighbour indices per element:

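nA = numel(a);
k2 = k / 2;
% Rows for the first k2 elements: 1:k+1 with the element itself removed
Ini = transpose(repmat(1:k+1, k2, 1));
Ini(1:k+2:k*k2) = [];
Ini = transpose(reshape(Ini, k, k2));
% Rows for the middle elements: a symmetric window that slides by one
Mid = bsxfun(@plus, [1:k2, k2+2:k+1], transpose(0:nA-k-1));
% Rows for the last k2 elements: mirror image of Ini
Fin = nA + 1 - Ini(k2:-1:1, k:-1:1);
B = cat(1, Ini, Mid, Fin);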
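One way to gain confidence in the index matrix is to compare each row with the setdiff-based window from the question. A minimal sketch, assuming a = 1:10 and k = 4 as in the question; it rebuilds the matrix so that the block runs on its own:

```matlab
a  = 1:10;
k  = 4;
nA = numel(a);
k2 = k / 2;

% Index matrix, one row of neighbour indices per element.
Ini = transpose(repmat(1:k+1, k2, 1));
Ini(1:k+2:k*k2) = [];
Ini = transpose(reshape(Ini, k, k2));
Mid = bsxfun(@plus, [1:k2, k2+2:k+1], transpose(0:nA-k-1));
Fin = nA + 1 - Ini(k2:-1:1, k:-1:1);
B   = cat(1, Ini, Mid, Fin);

% Reference: a window of k+1 positions clamped to the vector, minus n.
ok = true;
for n = 1:nA
    lo = min(max(n - k2, 1), nA - k);
    ok = ok && isequal(B(n, :), setdiff(lo:lo+k, n));
end

% With the indices in hand, a statistic over the neighbours is one call.
mu = mean(a(B), 2);
```

Because a(B) has the same shape as B, functions such as mean and std can work along the second dimension without a loop.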

3 comments

Andrei Bobrov on 22 Jul 2011

+1

Jan on 23 Jul 2011

@Oleg: I'm using an old laptop with a 13'' LCD, which was brighter in its youth. While these tiny speckles . and ' look as if a fly had left its excreta on my monitor, the massive "transpose" is hard to misinterpret.

And there was a discussion in CSSM about "''x'''.'" compared to "'''x''.'''" - or equivalent. It was rather confusing and I started to use "transpose" and "['string', char(39), 'string']" whenever I post in a forum.

Answer 2

Jan on 22 Jul 2011

2 votes

Some ideas:

"for n = 1:numel(a)" is usually faster than "for n = a"
For indexing, integer types are faster than DOUBLEs. I use DOUBLEs in my example for better readability.
SETDIFF is expensive due to sorting.
Instead of checking for the boundary cases inside the loop, you can hard-code them by splitting the loop into three:

nA = numel(a);
k2 = k / 2;
v = 1:k+1;
for n = 1:k2  % Initial part
  out = v(v ~= n);
end
v = [1:k2, k2+2:k+1];
for n = k2+1:nA-k2
  out = v;
  v   = v + 1;
end
v = nA-k:nA;
for n = nA-k2+1:nA  % Final part
  out = v(v ~= n);
end
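As written, the three loops overwrite out on every iteration. If the index vectors need to be kept, one option is to preallocate a matrix and fill one row per element. A minimal sketch under that assumption (storing out as an nA-by-k array and using uint32 are illustrative choices, not part of the answer itself):

```matlab
a  = 1:10;
k  = 4;
nA = numel(a);
k2 = k / 2;

out = zeros(nA, k, 'uint32');        % preallocate: one row per element

v = uint32(1:k+1);
for n = 1:k2                         % initial part: window fixed at 1:k+1
    out(n, :) = v(v ~= n);
end
v = uint32([1:k2, k2+2:k+1]);
for n = k2+1:nA-k2                   % middle part: window slides by one
    out(n, :) = v;
    v = v + 1;
end
v = uint32(nA-k:nA);
for n = nA-k2+1:nA                   % final part: window fixed at nA-k:nA
    out(n, :) = v(v ~= n);
end
```

The rows of out can then be used directly as indices, for example a(out(n, :)) for the neighbours of element n.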

5 comments

Titus Edelhofer on 22 Jul 2011

Thanks! Now I understand. It's the indexing that is faster. I once tried this, but I replaced the loop counter with integers and it got slower that way. Thanks again! Titus

Oleg Komarov on 22 Jul 2011

Implementing with preallocation to test against your vectorized solution.

Answer 3

Andrei Bobrov on 22 Jul 2011

1 vote

Hi Oleg! My version.

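a = 1:10;
k = 4;
k2 = k/2;
nA = numel(a);
% Middle rows: symmetric window sliding by one
idx = cumsum([[1:k2,k2+2:k+1];ones(nA-k-1,k)]);
% OR idx = bsxfun(@plus,[1:k/2,k/2+2:k+1],(0:nA-k-1)');
% Boundary windows, one row per boundary element (k2 rows each)
a1 = repmat(1:k+1,k2,1);
a2 = repmat(nA-(0:k),k2,1);
% Zero the diagonal to mark the element itself for removal
ii = 1:k2+1:k2^2;
a1(ii)=0;
a2(ii)=0;
aa = [a1;a2]';
aout = reshape(aa(aa~=0),k,[])';
% The end rows are stored from nA backwards, so flip rows and columns
out = a([aout(1:k2,:);idx;aout(end:-1:k2+1,end:-1:1)]);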

9 comments

Andrei Bobrov on 22 Jul 2011

Oleg. Sorry, n = numel(a). Corrected.

Oleg Komarov on 22 Jul 2011

I completely abandoned the day by day loop and cannot verify your version anymore, although I got it working.

Answer 4

Sean de Wolski on 22 Jul 2011

0 votes

I would probably try to do it with a convolution: a 'valid' one with a kernel such as:

kernel = ([1;1;0;1;1]./4); %a is a column vector (example for mean)

std as a function of two convolutions is also doable. Then do the boundaries manually with a for-loop.
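A minimal sketch of this idea for the middle part, assuming a column vector of made-up values and k = 4. The mean is one 'valid' convolution; the sample standard deviation follows from a second convolution on the squared values, using var = (sum(x.^2) - k*mean^2)/(k-1). Note that this gives the plain (untrimmed) mean and std, and that the sum-of-squares formula can lose precision when the values are large relative to their spread:

```matlab
x  = [10 12 11 50 13 12 14 9 15 13].';   % made-up data, column vector
k  = 4;
k2 = k / 2;

% Kernel of ones with a zero in the centre, so the element itself is skipped.
kernel = [ones(k2, 1); 0; ones(k2, 1)];

% 'valid' keeps only positions where the kernel fits entirely inside x,
% i.e. elements k2+1 : numel(x)-k2.
s1 = conv(x,    kernel, 'valid');        % sum of the k neighbours
s2 = conv(x.^2, kernel, 'valid');        % sum of their squares

mu    = s1 / k;
sigma = sqrt(max(s2 - k * mu.^2, 0) / (k - 1));   % sample std (N-1)
```

The first k2 and last k2 elements are not covered by the 'valid' output and still need separate handling.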

1 comment

Oleg Komarov on 22 Jul 2011

Nice idea, will try to substitute the mid part proposed by Jan with the convolution.
