# autocorr

Sample autocorrelation

## Syntax

``autocorr(y)``
``autocorr(y,Name,Value)``
``acf = autocorr(___)``
``[acf,lags,bounds] = autocorr(___)``
``autocorr(ax,___)``
``[acf,lags,bounds,h] = autocorr(___)``

## Description

`autocorr(y)` plots the sample autocorrelation function (ACF) of the univariate, stochastic time series `y` with confidence bounds.

`autocorr(y,Name,Value)` uses additional options specified by one or more name-value pair arguments. For example, `autocorr(y,'NumLags',10,'NumSTD',2)` plots the sample ACF of `y` for `10` lags and displays confidence bounds consisting of `2` standard errors.

`acf = autocorr(___)` returns the sample ACF of `y` using any of the input arguments in the previous syntaxes.

`[acf,lags,bounds] = autocorr(___)` additionally returns the lag numbers that MATLAB® uses to compute the ACF, and also returns the approximate upper and lower confidence bounds.

`autocorr(ax,___)` plots on the axes specified by `ax` instead of the current axes (`gca`). `ax` can precede any of the input argument combinations in the previous syntaxes.

`[acf,lags,bounds,h] = autocorr(___)` plots the sample ACF of `y` and additionally returns handles to plotted graphics objects. Use elements of `h` to modify properties of the plot after you create it.

## Examples

Specify the MA(2) model:

`${y}_{t}={\epsilon }_{t}-0.5{\epsilon }_{t-1}+0.4{\epsilon }_{t-2},$`

where ${\epsilon }_{t}$ is Gaussian with mean 0 and variance 1.

```
rng(1); % For reproducibility
Mdl = arima('MA',{-0.5 0.4},'Constant',0,'Variance',1)
```
```
Mdl = 
  arima with properties:

     Description: "ARIMA(0,0,2) Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
               P: 0
               D: 0
               Q: 2
        Constant: 0
              AR: {}
             SAR: {}
              MA: {-0.5 0.4} at lags [1 2]
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: 1
```

Simulate 1000 observations from `Mdl`.

`y = simulate(Mdl,1000);`

Compute the ACF for 20 lags (the default). Specify that ${y}_{t}$ is an MA(2) process, that is, the ACF is effectively 0 after the second lag.

```
[acf,lags,bounds] = autocorr(y,'NumMA',2);
bounds
```
```
bounds = 2×1

    0.0843
   -0.0843
```

The first element of `bounds` (0.0843) is the upper confidence bound and the second element (-0.0843) is the lower confidence bound.

Plot the ACF.

`autocorr(y)`

The ACF cuts off after the second lag. This behavior is indicative of an MA(2) process.

Specify the multiplicative seasonal ARMA $\left(2,0,1\right)\times {\left(3,0,0\right)}_{12}$ model:

`$\left(1-0.75L-0.15{L}^{2}\right)\left(1-0.9{L}^{12}+0.5{L}^{24}-0.5{L}^{36}\right){y}_{t}=2+{\epsilon }_{t}-0.5{\epsilon }_{t-1},$`

where ${\epsilon }_{t}$ is Gaussian with mean 0 and variance 1.

```
Mdl = arima('AR',{0.75,0.15},'SAR',{0.9,-0.5,0.5},...
    'SARLags',[12,24,36],'MA',-0.5,'Constant',2,...
    'Variance',1);
```

Simulate data from `Mdl`.

```
rng(1); % For reproducibility
y = simulate(Mdl,1000);
```

Plot the default autocorrelation function (ACF).

```
figure
autocorr(y)
```

The default correlogram does not display the dependence structure for higher lags.

Plot the ACF for 40 lags.

```
figure
autocorr(y,'NumLags',40,'NumSTD',3)
```

The correlogram shows the larger correlations at lags 12, 24, and 36.

Although various estimates of the sample autocorrelation function exist, `autocorr` uses the form in Box, Jenkins, and Reinsel, 1994 [1]. In their estimate, they scale the sample autocovariance at each lag by the sample variance (`var(y,1)`) so that the autocorrelation at lag 0 is unity. However, certain applications require rescaling the normalized ACF by another factor.

Simulate 1000 observations from the standard Gaussian distribution.

```
rng(1); % For reproducibility
y = randn(1000,1);
```

Compute the normalized and unnormalized sample ACF.

```
[normalizedACF,lags] = autocorr(y,'NumLags',10);
unnormalizedACF = normalizedACF*var(y,1);
```

Compare the first 10 lags of the sample ACF with and without normalization.

`[lags normalizedACF unnormalizedACF]`
```
ans = 11×3

         0    1.0000    0.9960
    1.0000   -0.0180   -0.0180
    2.0000    0.0536    0.0534
    3.0000   -0.0206   -0.0205
    4.0000   -0.0300   -0.0299
    5.0000   -0.0086   -0.0086
    6.0000   -0.0108   -0.0107
    7.0000   -0.0116   -0.0116
    8.0000    0.0309    0.0307
    9.0000    0.0341    0.0340
      ⋮
```

## Input Arguments

Observed univariate time series for which MATLAB estimates or plots the ACF, specified as a numeric vector. The last element of `y` contains the latest observation.

Specify missing observations using `NaN`. The `autocorr` function treats missing values as missing completely at random.

Data Types: `double`
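The following is a minimal sketch of passing a series with missing observations, as described above. The data and the positions of the `NaN` values are hypothetical and chosen only for illustration.

```matlab
rng(1);                      % For reproducibility
y = randn(200,1);            % Illustrative data, not from this page
y([17 88 150]) = NaN;        % Hypothetical missing observations

% autocorr accepts NaN entries in y and treats them as missing values
[acf,lags] = autocorr(y,'NumLags',5);
[lags acf]
```

Because the function skips missing terms rather than propagating them, the returned `acf` is expected to contain no `NaN` values.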

Axes on which to plot (`ax`), specified as an `Axes` object.

By default, `autocorr` plots to the current axes (`gca`).
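As an illustration of the `autocorr(ax,___)` syntax, the sketch below draws two correlograms in one figure. The use of `subplot` to create the axes and the variable names `ax1` and `ax2` are assumptions made for this example. Any `Axes` object should work the same way.

```matlab
rng(1);                          % For reproducibility
y = randn(200,1);                % Illustrative data, not from this page

figure
ax1 = subplot(2,1,1);            % Upper axes
autocorr(ax1,y)                  % Default number of lags

ax2 = subplot(2,1,2);            % Lower axes
autocorr(ax2,y,'NumLags',40)     % Same series, more lags
```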

### Name-Value Pair Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

Example: `autocorr(y,'NumLags',10,'NumSTD',2)` plots the sample ACF of `y` for `10` lags and displays confidence bounds consisting of `2` standard errors.

Number of lags in the sample ACF, specified as the comma-separated pair consisting of `'NumLags'` and a positive integer. `autocorr` uses lags `0:NumLags` to estimate the ACF.

The default is `min([20,T - 1])`, where `T` is the effective sample size of `y`.

Example: `autocorr(y,'NumLags',10)` plots the sample ACF of `y` for lags `0` through `10`.

Data Types: `double`

Number of lags in a theoretical MA model of `y`, specified as the comma-separated pair consisting of `'NumMA'` and a nonnegative integer less than `NumLags`.

`autocorr` uses `NumMA` to estimate confidence bounds.

• For lags > `NumMA`, `autocorr` uses Bartlett’s approximation [1] to estimate the standard errors under the model assumption.

• If `NumMA` = `0`, then `autocorr` assumes that `y` is a Gaussian white-noise process. Consequently, the standard error is approximately $1/\sqrt{T}$, where T is the effective sample size of `y`.

Example: `autocorr(y,'NumMA',10)` specifies that `y` is an MA(`10`) process, and plots confidence bounds for all lags greater than `10`.

Data Types: `double`

Number of standard errors in the confidence bounds, specified as the comma-separated pair consisting of `'NumSTD'` and a nonnegative scalar. For all lags > `NumMA`, the confidence bounds are 0 ± `NumSTD`·$\hat{\sigma}$, where $\hat{\sigma}$ is the estimated standard error of the sample autocorrelation.

The default is `2`, which yields approximate 95% confidence bounds.

Example: `autocorr(y,'NumSTD',1.5)` plots the ACF of `y` with confidence bounds `1.5` standard errors away from 0.

Data Types: `double`

## Output Arguments

Sample ACF of the univariate time series `y`, returned as a numeric vector of length `NumLags` + `1`.

The elements of `acf` correspond to lags 0,1,2,...,`NumLags` (that is, elements of `lags`). For all time series `y`, the lag 0 autocorrelation `acf(1)` = `1`.

Lag numbers (`lags`) used for ACF estimation, returned as a numeric vector of length `NumLags` + `1`.

Approximate upper and lower autocorrelation confidence bounds (`bounds`) assuming `y` is an MA(`NumMA`) process, returned as a two-element numeric vector. The first element is the upper bound and the second element is the lower bound.

Handles to plotted graphics objects, returned as a graphics array. `h` contains unique plot identifiers, which you can use to query or modify properties of the plot.
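The sketch below inspects the first three outputs for a simulated white-noise series. The data are illustrative. It assumes that `NumMA` and `NumSTD` keep their default values and that `y` has no missing values, so the bounds should reduce to ±2/√T.

```matlab
rng(1);                              % For reproducibility
y = randn(300,1);                    % Illustrative data, not from this page
T = numel(y);

[acf,lags,bounds] = autocorr(y,'NumLags',10);

numel(acf)                           % NumLags + 1 elements
lags'                                % Lags 0 through NumLags
[bounds, 2/sqrt(T)*[1; -1]]          % Returned bounds next to +/- 2/sqrt(T)
```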

## More About

### Autocorrelation Function

The autocorrelation function measures the correlation between ${y}_{t}$ and ${y}_{t+k}$, where $k=0,\dots ,K$ and ${y}_{t}$ is a stochastic process.

According to [1], the autocorrelation for lag k is

`${r}_{k}=\frac{{c}_{k}}{{c}_{0}},$`

where

• ${c}_{k}=\frac{1}{T}\sum _{t=1}^{T-k}\left({y}_{t}-\overline{y}\right)\left({y}_{t+k}-\overline{y}\right).$

• ${c}_{0}$ is the sample variance of the time series.
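To make the definition concrete, the following sketch evaluates ${c}_{k}$ and ${r}_{k}$ directly from the formulas above and compares the result with the output of `autocorr`. The data, the number of lags `K`, and the variable names are assumptions chosen for illustration.

```matlab
rng(1);                              % For reproducibility
y = randn(500,1);                    % Illustrative, fully observed series
T = numel(y);
K = 10;                              % Number of lags to compare (arbitrary)

ybar = mean(y);
c = zeros(K+1,1);                    % c(k+1) stores the lag-k autocovariance c_k
for k = 0:K
    c(k+1) = sum((y(1:T-k) - ybar).*(y(1+k:T) - ybar))/T;
end
r = c/c(1);                          % r_k = c_k/c_0

acf = autocorr(y,'NumLags',K);
maxDiff = max(abs(r - acf))          % Should be at rounding-error level
```

Note that `c(1)` equals `var(y,1)`, the sample variance normalized by T rather than T - 1.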

Suppose that q is the lag beyond which the theoretical ACF is effectively 0. Then, the estimated standard error of the autocorrelation at lag k > q is

`$SE\left({r}_{k}\right)=\sqrt{\frac{1}{T}\left(1+2\sum _{j=1}^{q}{r}_{j}^{2}\right)}.$`

If the series is completely random, then the standard error reduces to $1/\sqrt{T}$.
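The standard-error formula can be checked by hand, as in the sketch below. It simulates an MA(2) series, evaluates the formula with q = 2, and compares the result with the `bounds` output. The multiplier `numSTD = 2` assumes the default `'NumSTD'` value, and the variable names are hypothetical.

```matlab
rng(1);                                        % For reproducibility
Mdl = arima('MA',{-0.5 0.4},'Constant',0,'Variance',1);
y = simulate(Mdl,1000);
T = numel(y);
q = 2;                                         % Assumed MA order
numSTD = 2;                                    % Assumed default number of standard errors

[acf,~,bounds] = autocorr(y,'NumMA',q);

% acf(1) is lag 0, so lags 1,...,q are acf(2:q+1)
se = sqrt((1 + 2*sum(acf(2:q+1).^2))/T);
manualBounds = numSTD*[se; -se];
[bounds manualBounds]                          % The two columns should agree
```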

### Missing Completely at Random

Observations of a random variable are missing completely at random if the tendency of an observation to be missing is independent of both the random variable and the tendency of all other observations to be missing.

## Tips

To plot the ACF without confidence bounds, set `'NumSTD',0`.

## Algorithms

• If `y` is a fully observed series (that is, it does not contain any `NaN` values), then `autocorr` uses a Fourier transform to compute the ACF in the frequency domain, then converts back to the time domain using an inverse Fourier transform.

• If `y` is not fully observed (that is, it contains at least one `NaN` value), `autocorr` computes the ACF at lag k in the time domain, and includes in the sample average only those terms for which the cross product ${y}_{t}{y}_{t+k}$ exists. Consequently, the effective sample size is a random variable.

• `autocorr` plots the ACF when you do not request any output or when you request the fourth output.
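The frequency-domain approach for a fully observed series can be sketched with base MATLAB functions, as shown below. This illustrates the general technique only and is not the source code of `autocorr`. The zero-padding length `nFFT` and all variable names are assumptions.

```matlab
rng(1);                                  % For reproducibility
y = randn(256,1);                        % Illustrative, fully observed series
T = numel(y);
K = 20;                                  % Number of lags (arbitrary)

yc = y - mean(y);                        % Center the series
nFFT = 2^nextpow2(2*T - 1);              % Zero-pad to avoid circular wrap-around
F = fft(yc,nFFT);
c = real(ifft(F.*conj(F)))/T;            % Autocovariances at lags 0,1,2,...
rFFT = c(1:K+1)/c(1);                    % Normalize so that lag 0 equals 1

% Time-domain reference for comparison
cDirect = zeros(K+1,1);
for k = 0:K
    cDirect(k+1) = sum(yc(1:T-k).*yc(1+k:T))/T;
end
rDirect = cDirect/cDirect(1);

maxDiff = max(abs(rFFT - rDirect))       % Should be at rounding-error level
```

Zero-padding to at least 2T - 1 points is what makes the circular correlation returned by the inverse transform match the linear sums in the time-domain definition.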

## References

[1] Box, G. E. P., G. M. Jenkins, and G. C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.

[2] Hamilton, J. D. Time Series Analysis. Princeton, NJ: Princeton University Press, 1994.