Replacing NaN with nearest neighbor

Hello,
I am trying to replace NaNs in a vector field with the value of the nearest neighbor. I believe I can use knnsearch to find the index of the nearest neighbor of each NaN, but I am running into problems. Here is what I have so far:
u = median_u; %u & v are 47x47x309
v = median_v;
k = [1 0 -1];
s = [47 47];
u_bad = isnan(u);
v_bad = isnan(v);
bad_u = find(u_bad(:)); %linear indices of NaN's
bad_v = find(v_bad(:));
[u_bad_x, u_bad_y] = ind2sub(s, bad_u); %subscript of NaN
[v_bad_x, v_bad_y] = ind2sub(s, bad_v);
idx = knnsearch(u,[u_bad_x,u_bad_y]);
And this is where I run into problems. I get an error saying that Y in knnsearch must be a matrix with 'X' number of columns. I am not sure whether the problem is that u and v have three dimensions, although I have tried running it on a single 47x47 matrix and I get the same error. Thanks in advance for any help.
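For context on the error: knnsearch(X, Y) expects X to hold one point per row and Y to hold one query point per row, with the same number of columns in both. Passing u itself as X gives 47 columns against the 2 columns of [u_bad_x, u_bad_y]. A minimal sketch of one way the call could be arranged for a single 2-D slice is shown below; the small matrix A is hypothetical stand-in data, and knnsearch requires the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical 3x3 slice standing in for one 47x47 slice of u.
A = [1 2 3; 4 NaN 6; 7 8 NaN];

bad = isnan(A);
[rGood, cGood] = find(~bad);   % subscripts of valid entries (column-major order)
[rBad, cBad] = find(bad);      % subscripts of NaN entries (column-major order)

% X = coordinates of valid points, Y = coordinates of NaN points.
% Both have 2 columns, which is what knnsearch checks for.
idx = knnsearch([rGood, cGood], [rBad, cBad]);

goodVals = A(~bad);            % values in the same order as rGood/cGood
A(bad) = goodVals(idx);        % copy the nearest valid value into each NaN
```

When several valid points are equally close, which one is chosen depends on knnsearch's tie handling, so this sketch makes no claim about the exact filled values in that case.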

Accepted Answer

Image Analyst on 18 Oct 2014

0 votes

Why use knn? You have the indices of the nan's and non-nan's. Just loop over all nans using the Pythagorean theorem to find the distances from that nan to all other valid numbers. Then sort them and replace it with the value of the closest. It should be trivial. Let me know if you can't figure it out.

1 comment

Image Analyst on 18 Oct 2014 (edited 18 Oct 2014)
Alright, here's my code. It's well commented so hopefully you'll understand it. Let me know if you don't.
clc; % Clear the command window.
clear; % Clear existing variables.
workspace; % Make sure the workspace panel is showing.
format long g;
format compact;
% Create a 47 by 47 by 309 array of random numbers.
u = rand(47, 47, 309);
% Get 100 random locations that we can make into nan's
randomLocations = randperm(numel(u), 100);
u(randomLocations) = nan;
% Now find the nan's
nanLocations = isnan(u);
nanLinearIndexes = find(nanLocations);
nonNanLinearIndexes = find(~nanLocations);
% Get the x,y,z (row, column, page subscripts) of all other locations that are non nan.
[xGood, yGood, zGood] = ind2sub(size(u), nonNanLinearIndexes);
for index = 1 : length(nanLinearIndexes)
    thisLinearIndex = nanLinearIndexes(index);
    % Get the x,y,z location
    [x, y, z] = ind2sub(size(u), thisLinearIndex);
    % Get distances of this location to all the other locations
    distances = sqrt((x - xGood).^2 + (y - yGood).^2 + (z - zGood).^2);
    % Only the closest non-nan location is needed, so min() is enough (no full sort).
    [~, indexOfClosest] = min(distances);
    % Get the u value there.
    goodValue = u(xGood(indexOfClosest), yGood(indexOfClosest), zGood(indexOfClosest));
    % Replace the bad nan value in u with the good value.
    u(x, y, z) = goodValue;
end
% u should be fixed now - no nans in it.
% Double check. Sum of nans should be zero now.
nanLocations = isnan(u);
numberOfNans = sum(nanLocations(:))


More Answers (2)

Steven Lord on 2 Feb 2018

4 votes

I'm not certain if this will do exactly what you want, but using the fillmissing function with the 'nearest' method may be sufficient for your needs.

3 comments

Rami Abousleiman on 2 Feb 2018
Yes, that is exactly it. The fillmissing function was introduced in R2016b, which is why I didn't know about it. It's also a few thousand times faster than the method above. FYI: you will have to call it twice if you have NaN at the beginning and end of the array.
If you have NaN at the beginning or end of your array and are using the 'nearest' method it should fill those in just fine. If you were using the 'previous' or 'next' methods you'd probably want to use the 'EndValues' option as well to handle NaN at the ends.
>> x = [NaN 2:5 NaN]
x =
NaN 2 3 4 5 NaN
>> fillmissing(x, 'nearest')
ans =
2 2 3 4 5 5
>> fillmissing(x, 'previous')
ans =
NaN 2 3 4 5 5
>> fillmissing(x, 'previous', 'EndValues', 'nearest')
ans =
2 2 3 4 5 5
jie wu on 2 Apr 2020
The 'nearest' element used here is the nearest one within the same row or column as the NaN.
So it may not be the nearest element if we take the whole array into account.
The first answer (the accepted one) deals with the case where the whole array is considered.
Is there a built-in function or a more efficient way to do this? Thanks a lot!
Jie
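One possible whole-array approach, offered as a sketch rather than a definitive answer, is the second output of bwdist, which gives for every element the linear index of the nearest nonzero element of a logical mask. It assumes the Image Processing Toolbox is available and that unit grid spacing in every dimension (including the third) is the distance you want; the small random array is hypothetical stand-in data.

```matlab
% Hypothetical small 3-D array standing in for the 47x47x309 field.
u = rand(5, 5, 4);
u([3 17 60]) = NaN;            % knock out a few entries

bad = isnan(u);
% nearestIdx(k) is the linear index of the valid element closest to element k
% (Euclidean distance over all dimensions of the array).
[~, nearestIdx] = bwdist(~bad);
u(bad) = u(nearestIdx(bad));   % fill each NaN from its nearest valid element
```

If neighbors should only be taken from the same 2-D slice, the same two lines could be applied to each u(:,:,k) inside a loop instead of to the full array.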


Yavor Kamer on 6 Jul 2016 (edited 6 Jul 2016)
A quick and dirty* way is to get the indices of the NaN values and replace them with their immediate neighbor (the code below uses the one to the right). You repeat this until all NaNs are replaced.
*This will not work if the last element is NaN, because nanIDX+1 then indexes past the end of the vector.
nanIDX = find(isnan(vec));
while(~isempty(nanIDX))
vec(nanIDX) = vec(nanIDX+1);
nanIDX = find(isnan(vec));
end
An example input/output would look like this:
vecIn = [1 1 NaN 2 NaN 2 3 3 4 NaN NaN 4 4]
vecOut = [1 1 2 2 2 2 3 3 4 4 4 4 4]

1 comment

Rami Abousleiman on 2 Feb 2018 (edited 2 Feb 2018)
Squeeze your loop between these two additions (the check before it and the second loop after it); I think that will fix it:
if (isnan(b(end)))
b(end) = inf;
end
nanIDX = find(isnan(b));
while(~isempty(nanIDX))
b(nanIDX) = b(nanIDX+1);
nanIDX = find(isnan(b));
end
nanIDX = find(isinf(b));
nanIDX = flipud(nanIDX);
while(~isempty(nanIDX))
b(nanIDX) = b(nanIDX-1);
nanIDX = find(isinf(b));
end
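For the 1-D case, a different technique that avoids the Inf sentinel is nearest-neighbor interpolation over the element positions with interp1. This is only a sketch: it picks the closest valid element in either direction (not always the one to the right, as the loops above do), it needs at least two valid elements, and the vector vec is hypothetical example data.

```matlab
% Hypothetical vector with NaNs at both ends and in the middle.
vec = [NaN 2 NaN NaN 5 NaN];

pos = 1:numel(vec);            % element positions
good = ~isnan(vec);            % mask of valid elements

% Look up, for every NaN position, the value at the nearest valid position.
% 'extrap' lets positions outside the valid range take the nearest end value.
vec(~good) = interp1(pos(good), vec(good), pos(~good), 'nearest', 'extrap');
```

A NaN that lies exactly midway between two valid elements is resolved by interp1's own rounding rule, so check that case if it matters for your data.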


Asked: 17 Oct 2014

Commented: 2 Apr 2020