NARX time delay estimation

Hello, I’m trying to determine the feedback and input delays for a 3-input 1-output NARX network using target-target auto-correlation and input/target cross-correlation, respectively. For each input/target pair, the max. cross-correlation (peak) value occurs at a different lag. Given that all input delays must be the same for a NARX network, which value out of the 3 different lags should I select? The smallest one?
For the target-target auto-correlation, the maximum occurs at lag 0, and the auto-correlation coefficients decrease as the lag increases. How is the proper feedback delay selected in this case?
Sorry if these questions seem trivial; any help would be greatly appreciated. Thank you!

Accepted Answer

Greg Heath on 10 Aug 2013

Consider all lags that are significant.
If your cross-correlation function does not output significance levels, consider my approach for the 95% confidence level:
1. Order the absolute values of the cross-correlation between zscore(t,1) and zscore(randn(1,N),1).
2. Choose the value at index floor(0.95*(2*N-1)) of the ordered list.
3. Repeat 100 or more times and average the result.
4. Design a timedelaynet and only keep as many significant input delays as you need.
5. Design a narnet and only keep as many significant feedback delays as you need.
6. Design a narxnet using the delays obtained in 4 and 5.
If you search using the word NNCORR you will find many of my examples.
Thank you for formally accepting my answer.
Greg
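Steps 4 to 6 above can be sketched roughly as follows. This is only a minimal sketch, not Greg's own code: it assumes the toolbox functions timedelaynet, narnet, narxnet and preparets, borrows simplenarx_dataset purely for illustration, and the candidate delays ID and FD and the hidden-layer size H are hypothetical placeholders for whatever lags pass the significance test.

```matlab
% Sketch of steps 4-6. ID, FD and H are hypothetical placeholders.
rng(0);                                  % assumption: fixed seed for repeatability
[X, T] = simplenarx_dataset;             % illustration data only
ID = 0:2;                                % candidate input delays (may include 0)
FD = 1:2;                                % candidate feedback delays (positive only)
H  = 5;                                  % hidden nodes

% Normalized MSE over all prepared time steps (0 = perfect, 1 = constant model)
nmse = @(ts, ys) mean((cell2mat(ts) - cell2mat(ys)).^2) / var(cell2mat(ts), 1);

% Step 4: input delays only (time delay network)
netI = timedelaynet(ID, H);
netI.trainParam.showWindow = false;
[XsI, XiI, AiI, TsI] = preparets(netI, X, T);
netI = train(netI, XsI, TsI, XiI, AiI);
nmseI = nmse(TsI, netI(XsI, XiI, AiI));

% Step 5: feedback delays only (NAR network)
netF = narnet(FD, H);
netF.trainParam.showWindow = false;
[XsF, XiF, AiF, TsF] = preparets(netF, {}, {}, T);
netF = train(netF, XsF, TsF, XiF, AiF);
nmseF = nmse(TsF, netF(XsF, XiF, AiF));

% Step 6: combine the delays kept in steps 4 and 5 (open-loop NARX network)
netX = narxnet(ID, FD, H);
netX.trainParam.showWindow = false;
[XsX, XiX, AiX, TsX] = preparets(netX, X, {}, T);
netX = train(netX, XsX, TsX, XiX, AiX);
nmseX = nmse(TsX, netX(XsX, XiX, AiX));

fprintf('NMSE: timedelaynet %.4f, narnet %.4f, narxnet %.4f\n', nmseI, nmseF, nmseX);
```

Dropping one delay at a time from ID or FD and re-running shows which of the significant lags are actually needed.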

8 Comments

Rad on 12 Aug 2013
Edited: Rad on 12 Aug 2013
Thank you very much, I'll try this! And thank you for all your other answers related to NARX questions, I found them very useful to better understand how to code a NARX network! Rad
Rad on 13 Aug 2013
Hi Greg, I tried your procedure. Just to make sure I followed it properly, here’s what I did:
1. calculated the cross-correlation coefficients between the target and Gaussian noise generated by randn
2. determined the index in the vector containing the absolute, sorted values of the correlation coefficients calculated previously, corresponding to a 95% confidence level using floor(0.95*(2*N-1))
3. determined the value in the vector corresponding to that index
4. repeated 100 times and average all those 100 values
5. this value was then used as the 95% confidence level value for the target-target autocorrelation coefficients
The value that I obtain at the end is very small compared to the target-target auto-correlation coefficients; this results in an extremely high number of "significant lags". Do you have any suggestions on how to fix this, or is there another procedure I should use?
Also, to make sure I understand why this procedure is to be used: are you saying that if the target-target auto-correlation peaks at lag 0, then this auto-correlation can be approximated with a target-Gaussian noise cross-correlation?
Thank you again for taking the time to answer my questions Rad
Greg Heath on 15 Aug 2013
Edited: Greg Heath on 15 Aug 2013
A qualitative exchange is limiting.
1. Search using nncorr to review some of my previous posts.
2. Enter help nndatasets:
simpleseries_dataset - Simple time-series prediction dataset.
simplenarx_dataset - Simple time-series prediction dataset.
exchanger_dataset - Heat exchanger dataset.
maglev_dataset - Magnetic levitation dataset.
ph_dataset - Solution PH dataset.
pollution_dataset - Pollution mortality dataset.
refmodel_dataset - Reference model dataset.
robotarm_dataset - Robot arm dataset.
valve_dataset - Valve fluid flow dataset.
3. Choose the first in the list.
4. Apply your code, then post code and results.
Greg
Rad on 26 Sep 2013
Hello Greg, Thank you for your help; sorry for the late reply, I was out of the office for a while…
I followed your recommendations: using the simpleseries_dataset, I’m trying to determine the significant target-target auto-correlation lags by correlating the target with a random series and finding the 95% significance threshold. The code is posted below:
clear all;
load simpleseries_dataset.mat;
[X,T] = simpleseries_dataset;
inputs = cell2mat(X);
targets = cell2mat(T);
number_obs = size(targets, 2);
targets_norm = zscore(targets,1);
% Index of the 95th percentile among the 2N-1 sorted correlation values
index_95 = floor(0.95*(2*number_obs-1));
CI_95_vector = zeros(1,100); % preallocate one threshold per trial
for i = 1:100
    % Cross-correlate the normalized target with normalized Gaussian noise
    rand_norm = zscore(randn(1,number_obs),1);
    [Coeff1_norm, Coeff1_norm_lags] = xcorr(targets_norm, rand_norm, 'coeff');
    sorted_Coeff1_norm = sort(abs(Coeff1_norm));
    CI_95 = sorted_Coeff1_norm(index_95);
    CI_95_vector(i) = CI_95;
end
CI_95_average = mean(CI_95_vector)
[TargetTargetCoeff, TargetTargetCoeff_lags] = xcorr(targets_norm, 'coeff');
% Plot the corr. coefficients and the 95% CI
CI_ForGraph = CI_95_average*ones(size(TargetTargetCoeff_lags,2), 1);
plot(TargetTargetCoeff_lags', TargetTargetCoeff', TargetTargetCoeff_lags', CI_ForGraph);
title('Target-target auto-correlation')
According to my results, the significant lag is lag 8 – please see the figure below. Am I right in assuming this significant lag?
However, when I use my data – attached as target_data.xls – I get the following result (please see the code and figure below)
clear all;
targets = xlsread('target_data', 'Sheet1', 'A1:A32063');
number_obs = size(targets, 1);
targets_norm = zscore(targets,1);
% Index of the 95th percentile among the 2N-1 sorted correlation values
index_95 = floor(0.95*(2*number_obs-1));
CI_95_vector = zeros(1,100); % preallocate one threshold per trial
for i = 1:100
    % Cross-correlate the normalized target with normalized Gaussian noise
    rand_norm = zscore(randn(number_obs, 1),1);
    [Coeff1_norm, Coeff1_norm_lags] = xcorr(targets_norm, rand_norm, 'coeff');
    sorted_Coeff1_norm = sort(abs(Coeff1_norm));
    CI_95 = sorted_Coeff1_norm(index_95);
    CI_95_vector(i) = CI_95;
end
CI_95_average = mean(CI_95_vector)
[TargetTargetCoeff, TargetTargetCoeff_lags] = xcorr(targets_norm, 'coeff');
CI_ForGraph = CI_95_average*ones(size(TargetTargetCoeff_lags,2), 1);
figure(1);
plot(TargetTargetCoeff_lags', TargetTargetCoeff', TargetTargetCoeff_lags', CI_ForGraph);
title('Target-target auto-correlation')
According to the results, the significant lag is 7810? Does this result appear correct to you? Thank you for all your help Rad
Greg Heath on 13 Oct 2013
Edited: Greg Heath on 13 Oct 2013
>% Also, to make sure I understand why this procedure is to be used: are you saying that if the target-target auto-correlation peaks at lag 0, then this auto-correlation can be approximated with a target-Gaussian noise cross-correlation?
This makes no sense because EVERY autocorrelation has a maximum value of unity at zero lag.
Statistical 95% significance levels obtained from multiple noise/noise or noise/target trials are interpreted as 95% confidence levels for rejecting the null hypothesis that the correlation coefficient of the statistical distribution from which the data came is zero.
>%According to my results, the significant lag is lag 8 – please see the figure below. Am I right in assuming this significant lag?
No. Significant feedback lags are ALL of the POSITIVE lags for which the ABSOLUTE VALUE is equal to or greater than the significant correlation threshold. Significant input delay lags may include the value 0.
My calculations, using the noise/noise simulation threshold (mean +/- stdv)
sigthresh95 = 0.1414 +/- 0.0195, yield
sigautolag95 = 1 8 9 33 50 % positive
sigcrosslag95 = 0 1 8 9 51 % nonnegative
for the simpleseries_dataset.
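The selection rule just described (every positive auto-correlation lag, and every nonnegative cross-correlation lag, whose absolute value reaches the threshold) might be coded along these lines. This is a hedged sketch rather than the code that produced the numbers above: it uses xcorr with the 'coeff' option as in Rad's post, a noise/noise threshold averaged over 100 trials, and an arbitrary random seed, so the resulting lags need not match the values listed.

```matlab
% Sketch: significant lags from a noise/noise 95% threshold.
rng(0);                                   % assumption: arbitrary fixed seed
[X, T] = simpleseries_dataset;
x  = cell2mat(X);   t  = cell2mat(T);     % 1-by-N row vectors
N  = numel(t);
zx = zscore(x, 1);  zt = zscore(t, 1);    % zero mean, unit (biased) variance

% Threshold: 95th percentile of |correlation| between two noise series
Ntrials = 100;
idx95   = floor(0.95*(2*N-1));
thresh  = zeros(1, Ntrials);
for k = 1:Ntrials
    n1 = zscore(randn(1, N), 1);
    n2 = zscore(randn(1, N), 1);
    s  = sort(abs(xcorr(n1, n2, 'coeff')));
    thresh(k) = s(idx95);
end
sigthresh95 = mean(thresh);

% Auto-correlation of the target and cross-correlation of target with input
[autocorrt, lags] = xcorr(zt, 'coeff');       % lags run from -(N-1) to N-1
crosscorrxt       = xcorr(zt, zx, 'coeff');   % positive lag: target lags input

% Feedback delays: positive lags only. Input delays: lag 0 is allowed.
sigautolag95  = lags(lags >  0 & abs(autocorrt)   >= sigthresh95)
sigcrosslag95 = lags(lags >= 0 & abs(crosscorrxt) >= sigthresh95)
```

Lag 0 is excluded from the feedback delays because the auto-correlation is always 1 there and says nothing about predictability.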
Only use as many of these as you need. For example, start with delays <= 9 and minimize the number of hidden nodes. Then, using that "optimum" value for H, try to minimize the number of needed delays.
Hope this helps.
Greg
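The hidden-node search suggested here could look something like the loop below. It is a sketch under stated assumptions: the delays are the significant lags <= 9 quoted above, while Hmax and the R2goal stopping criterion are hypothetical choices, and R2 is computed over all prepared time steps (training, validation and test together) with a single random initialization per H.

```matlab
% Sketch: smallest hidden-layer size H that reaches a target R^2.
rng(0);                                  % assumption: arbitrary fixed seed
[X, T] = simpleseries_dataset;
ID = [0 1 8 9];                          % significant input delays <= 9 (from above)
FD = [1 8 9];                            % significant feedback delays <= 9 (from above)
Hmax   = 5;                              % hypothetical upper limit on hidden nodes
R2goal = 0.99;                           % hypothetical goal

R2 = zeros(1, Hmax);
for H = 1:Hmax
    net = narxnet(ID, FD, H);            % open-loop NARX network
    net.trainParam.showWindow = false;
    [Xs, Xi, Ai, Ts] = preparets(net, X, {}, T);
    net = train(net, Xs, Ts, Xi, Ai);
    ts  = cell2mat(Ts);
    ys  = cell2mat(net(Xs, Xi, Ai));
    R2(H) = 1 - mean((ts - ys).^2) / var(ts, 1);   % 1 - normalized MSE
end
Hbest = find(R2 >= R2goal, 1);           % empty if the goal was never reached
disp(R2);
```

With Hbest fixed, the same loop can be repeated over subsets of ID and FD to see how many delays can be dropped. Several random initializations per setting would give a more reliable comparison than the single run used here.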
Greg Heath on 13 Oct 2013
Edited: Greg Heath on 13 Oct 2013
I ran my code on your data
N = length(t) % 32063
sigthresh95 = 0.00817 % Extremely low because N is extremely high
length(autocorrt(N+1:end)) = 32062 % Positive lags
length(sigautolag95) = 31187 % Extremely high because of extremely low sigthresh95 (see your plot!)
Elapsedtime = 67.5 minutes % WHEW! (Only 2.5 sec for simpleseries_dataset)
I wouldn't be surprised if the default ID=1:2, FD = 1:2 works well.
Greg
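A rough way to see why the threshold shrinks as N grows: for white noise, the sample correlation coefficient at a single nonzero lag is approximately normal with standard deviation 1/sqrt(N), so a two-sided 95% level is about 1.96/sqrt(N). This textbook rule of thumb is an assumption brought in for illustration. It is not the simulation used above, which ranks all 2N-1 lags, so it only agrees with the quoted 0.00817 in order of magnitude. The values of N other than 32063 are arbitrary.

```matlab
% Large-sample rule of thumb (an assumption, not the simulation used above):
% for white noise, a correlation coefficient at one nonzero lag is roughly
% normal with standard deviation 1/sqrt(N).
approx95 = @(N) 1.96 ./ sqrt(N);

N   = [100 1000 32063];                  % 32063 is the length of the posted data
thr = approx95(N);

% fprintf cycles through the matrix column by column: N(1), thr(1), N(2), ...
fprintf('N = %6d   approx. 95%% threshold = %.4f\n', [N; thr]);
```

With a threshold this small, almost every lag of a smooth, nonstationary series counts as significant, so the test no longer narrows the choice of delays.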
Greg Heath on 13 Oct 2013
Several thoughts (but no conclusions):
1. Is this real-world (RW) data or a made-up example?
2. It is not stationary. From the plot:
a. It looks like a running windowed mean would be quadratic or, at least, a low-order polynomial.
b. An nth-order polynomial is uniquely characterized by n+1 points and can be predicted by a linear, uniformly spaced difference equation.
c. If the residual of a polynomial fit is Fourier analyzed, at least three points per period are needed to characterize each frequency component.
d. A sinusoid can be predicted by a linear, uniformly spaced difference equation.
e. Not sure about the corresponding running windowed variance.
3. If you have nothing to do some evening (e.g., when your date doesn't show up) you might want to Fourier analyze a polynomial + sum of sinusoids model to try to understand how many feedback delays are sufficient.
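Points 2b and 2d can be checked numerically with a small sketch. The frequency, phase and polynomial coefficients below are arbitrary illustrative values, not taken from the posted data. A sinusoid obeys a two-delay linear recursion and a quadratic obeys a three-delay one, which hints at why a handful of feedback delays may be enough for a series of this kind.

```matlab
% Sketch: exact linear prediction of a sinusoid and of a quadratic from past samples.
n = 0:49;                                  % uniformly spaced sample index

% Sinusoid: s(k) = 2*cos(w)*s(k-1) - s(k-2)   (two feedback delays)
w = 2*pi/12;                               % arbitrary frequency (12 samples per period)
s = sin(w*n + 0.3);                        % arbitrary phase
sPred  = 2*cos(w)*s(2:end-1) - s(1:end-2); % predictions of s(3:end)
errSin = max(abs(s(3:end) - sPred));

% Quadratic: third difference is zero, so
% p(k) = 3*p(k-1) - 3*p(k-2) + p(k-3)       (three feedback delays)
p = 0.5*n.^2 - 3*n + 2;                    % arbitrary coefficients
pPred   = 3*p(3:end-1) - 3*p(2:end-2) + p(1:end-3);   % predictions of p(4:end)
errPoly = max(abs(p(4:end) - pPred));

fprintf('max error: sinusoid %.2e, quadratic %.2e\n', errSin, errPoly);
```

Both errors come out at the level of floating-point round-off, so the recursions are exact rather than approximate.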
Rad on 24 Oct 2013
Thank you so much for all your help, it is greatly appreciated. I'll look into your recommendations and I'll post my results when I'm done. Once again, thank you!

Asked: Rad on 7 Aug 2013

Commented: Rad on 24 Oct 2013
