# Replacing NaN values with the mean of a moving window

Richard on 9 Nov 2012
From the following:
t = transpose(1/24:1/24:40);
data1 = 1+(30-1).*rand(length(t),1);
Randm = floor(1+(length(t)-1).*rand(37,1));
data1(Randm) = nan;
plot(t,data1);
We can see that the vector contains some missing values (NaN). The data show the variation of a certain variable over 40 days, sampled hourly, so the vector holds 40 blocks of 24 values. How could I replace each missing value with the average for its 24-hour period? In addition, if 24 consecutive values were missing, i.e. no values for an entire day, how would I replace them with the average for that week, i.e. the previous 7 days? I'm thinking of applying a moving window; would this be the best method?

Image Analyst on 9 Nov 2012
Here's a fairly straightforward, easy-to-understand vectorized way of doing it:
% Reshape into 40 days (rows) by 24 columns (hours).
% reshape fills column by column, so build 24-by-40 first, then transpose.
dailyData = reshape(data1, [24,40])';
% Get a map of where the NaNs are
nanMap = isnan(dailyData)
% Count the number of valid, non-nan values per day.
validValuesPerDay = sum(~nanMap, 2)
% Need to sum rows but can't sum with NaNs or result is a NaN,
% so replace NaNs with zeros so we can sum the row.
nonNanData = dailyData;
nonNanData(nanMap) = 0;
sumPerDay = sum(nonNanData, 2) % Sum across the row.
% Get a column vector holding the mean of the valid values in each row (day).
% A day with no valid values gives 0/0, so its mean stays NaN.
dailyMeans = sumPerDay ./ validValuesPerDay;
% Expand it to 24 columns.
dailyMeans = repmat(dailyMeans, [1 24]);
% Replace NaNs with the mean
repairedData = dailyData; % Initialize.
repairedData(nanMap) = dailyMeans(nanMap)
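
The question also asks about days with no valid values at all. For such a day the daily mean is 0/0 = NaN, so its gaps would remain after the replacement above. Here is a minimal sketch of one way to fall back on the previous 7 days. It assumes "previous 7 days" means up to seven days immediately before the empty one, averaged over their valid values only. The names `emptyDays`, `prev`, `fillValues` and `repaired` are illustrative, and day 10 is blanked out only to exercise the fallback.

```matlab
% Example data as in the question, plus one fully missing day.
t = transpose(1/24:1/24:40);
data1 = 1 + (30-1).*rand(length(t),1);
data1(floor(1 + (length(t)-1).*rand(37,1))) = nan;
data1((10-1)*24 + (1:24)) = nan;         % Assumption: blank out day 10 for illustration.

% One row per day, one column per hour.
dailyData = reshape(data1, [24,40])';
nanMap = isnan(dailyData);
validPerDay = sum(~nanMap, 2);           % Number of valid values in each day.
zeroed = dailyData;
zeroed(nanMap) = 0;                      % Zeros so the row sums ignore NaNs.
sumPerDay = sum(zeroed, 2);
dailyMeans = sumPerDay ./ validPerDay;   % NaN for days with no valid values.

% For each empty day, use the mean of the valid values in the previous 7 days.
emptyDays = find(validPerDay == 0);
for k = emptyDays'
    prev = max(1, k-7) : k-1;            % Fewer than 7 days near the start.
    nValid = sum(validPerDay(prev));
    if nValid > 0                        % Otherwise leave the day as NaN.
        dailyMeans(k) = sum(sumPerDay(prev)) / nValid;
    end
end

% Fill the gaps and return to the original column-vector layout.
fillValues = repmat(dailyMeans, [1 24]);
repaired = dailyData;
repaired(nanMap) = fillValues(nanMap);
repaired = reshape(repaired', [], 1);
```

The final `reshape` undoes the day-by-hour layout, so `repaired` lines up with `t` and can be plotted with `plot(t, repaired)`. An empty day with no valid data anywhere in its preceding window (for example day 1) is left as NaN in this sketch.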