# filter

Filter disturbances through vector error-correction (VEC) model

## Syntax

``Y = filter(Mdl,Z)``
``Y = filter(Mdl,Z,Name,Value)``
``[Y,E] = filter(___)``

## Description


`Y = filter(Mdl,Z)` returns the multivariate response series `Y`, which results from filtering the underlying multivariate disturbance series `Z`. The `Z` series are associated with the model innovations process through the fully specified VEC(p – 1) model `Mdl`.


`Y = filter(Mdl,Z,Name,Value)` uses additional options specified by one or more name-value pair arguments. For example, `'X',X,'Scale',false` specifies `X` as exogenous predictor data for the regression component and refrains from scaling the disturbances by the lower triangular Cholesky factor of the model innovations covariance matrix.


`[Y,E] = filter(___)` also returns the multivariate model innovations series `E`, using any of the input argument combinations in the previous syntaxes.

## Examples


Consider a VEC model for the following seven macroeconomic series. Then, fit the model to the data and filter disturbances through the fitted model.

• Gross domestic product (GDP)

• GDP implicit price deflator

• Paid compensation of employees

• Nonfarm business sector hours of all persons

• Effective federal funds rate

• Personal consumption expenditures

• Gross private domestic investment

Suppose that a cointegrating rank of 4 and one short-run term are appropriate, that is, consider a VEC(1) model.

Load the `Data_USEconVECModel` data set.

`load Data_USEconVECModel`

For more information on the data set and variables, enter `Description` at the command line.

Determine whether the data needs to be preprocessed by plotting the series on separate plots.

```
figure;
subplot(2,2,1)
plot(FRED.Time,FRED.GDP);
title('Gross Domestic Product');
ylabel('Index');
xlabel('Date');
subplot(2,2,2)
plot(FRED.Time,FRED.GDPDEF);
title('GDP Deflator');
ylabel('Index');
xlabel('Date');
subplot(2,2,3)
plot(FRED.Time,FRED.COE);
title('Paid Compensation of Employees');
ylabel('Billions of $');
xlabel('Date');
subplot(2,2,4)
plot(FRED.Time,FRED.HOANBS);
title('Nonfarm Business Sector Hours');
ylabel('Index');
xlabel('Date');
```

```
figure;
subplot(2,2,1)
plot(FRED.Time,FRED.FEDFUNDS);
title('Federal Funds Rate');
ylabel('Percent');
xlabel('Date');
subplot(2,2,2)
plot(FRED.Time,FRED.PCEC);
title('Consumption Expenditures');
ylabel('Billions of $');
xlabel('Date');
subplot(2,2,3)
plot(FRED.Time,FRED.GPDI);
title('Gross Private Domestic Investment');
ylabel('Billions of $');
xlabel('Date');
```

Stabilize all series, except the federal funds rate, by applying the log transform. Scale the resulting series by 100 so that all series are on the same scale.

```
FRED.GDP = 100*log(FRED.GDP);
FRED.GDPDEF = 100*log(FRED.GDPDEF);
FRED.COE = 100*log(FRED.COE);
FRED.HOANBS = 100*log(FRED.HOANBS);
FRED.PCEC = 100*log(FRED.PCEC);
FRED.GPDI = 100*log(FRED.GPDI);
```

Create a VEC(1) model using the shorthand syntax. Specify the variable names.

```
Mdl = vecm(7,4,1);
Mdl.SeriesNames = FRED.Properties.VariableNames
```

```
Mdl = 
  vecm with properties:

             Description: "7-Dimensional Rank = 4 VEC(1) Model with Linear Time Trend"
             SeriesNames: "GDP"  "GDPDEF"  "COE"  ... and 4 more
               NumSeries: 7
                    Rank: 4
                       P: 2
                Constant: [7×1 vector of NaNs]
              Adjustment: [7×4 matrix of NaNs]
           Cointegration: [7×4 matrix of NaNs]
                  Impact: [7×7 matrix of NaNs]
   CointegrationConstant: [4×1 vector of NaNs]
      CointegrationTrend: [4×1 vector of NaNs]
                ShortRun: {7×7 matrix of NaNs} at lag [1]
                   Trend: [7×1 vector of NaNs]
                    Beta: [7×0 matrix]
              Covariance: [7×7 matrix of NaNs]
```

`Mdl` is a `vecm` model object. All properties containing `NaN` values correspond to parameters to be estimated given data.

Estimate the model using the entire data set and the default options. By default, `estimate` uses the first p = 2 observations as presample data.

`EstMdl = estimate(Mdl,FRED.Variables)`
```
EstMdl = 
  vecm with properties:

             Description: "7-Dimensional Rank = 4 VEC(1) Model"
             SeriesNames: "GDP"  "GDPDEF"  "COE"  ... and 4 more
               NumSeries: 7
                    Rank: 4
                       P: 2
                Constant: [14.1329 8.77841 -7.20359 ... and 4 more]'
              Adjustment: [7×4 matrix]
           Cointegration: [7×4 matrix]
                  Impact: [7×7 matrix]
   CointegrationConstant: [-28.6082 109.555 -77.0912 ... and 1 more]'
      CointegrationTrend: [4×1 vector of zeros]
                ShortRun: {7×7 matrix} at lag [1]
                   Trend: [7×1 vector of zeros]
                    Beta: [7×0 matrix]
              Covariance: [7×7 matrix]
```

`EstMdl` is an estimated `vecm` model object. It is fully specified because all parameters have known values. By default, `estimate` imposes the constraints of the H1 Johansen VEC model form by removing the cointegrating trend and linear trend terms from the model. Parameter exclusion from estimation is equivalent to imposing equality constraints to zero.

Generate a `numobs`-by-7 series of random Gaussian distributed values, where `numobs` is the number of observations in the data minus p.

```
numobs = size(FRED,1) - Mdl.P;
rng(1) % For reproducibility
Z = randn(numobs,Mdl.NumSeries);
```

To simulate responses, filter the disturbances through the estimated model. Specify the first p = 2 observations as presample data.

`Y = filter(EstMdl,Z,'Y0',FRED{1:2,:});`

`Y` is a 238-by-7 matrix of simulated responses. Columns correspond to the variable names in `EstMdl.SeriesNames`.

Plot the simulated and true responses.

```
% In each plot, the first column is the observed (true) series and the
% second column is the simulated series, so the legend lists them in that order.
figure;
subplot(2,2,1)
plot(FRED.Time(3:end),[FRED.GDP(3:end) Y(:,1)]);
title('Gross Domestic Product');
ylabel('Index (scaled)');
xlabel('Date');
legend('True','Simulation','Location','Best')
subplot(2,2,2)
plot(FRED.Time(3:end),[FRED.GDPDEF(3:end) Y(:,2)]);
title('GDP Deflator');
ylabel('Index (scaled)');
xlabel('Date');
legend('True','Simulation','Location','Best')
subplot(2,2,3)
plot(FRED.Time(3:end),[FRED.COE(3:end) Y(:,3)]);
title('Paid Compensation of Employees');
ylabel('Billions of $ (scaled)');
xlabel('Date');
legend('True','Simulation','Location','Best')
subplot(2,2,4)
plot(FRED.Time(3:end),[FRED.HOANBS(3:end) Y(:,4)]);
title('Nonfarm Business Sector Hours');
ylabel('Index (scaled)');
xlabel('Date');
legend('True','Simulation','Location','Best')
```

```
figure;
subplot(2,2,1)
plot(FRED.Time(3:end),[FRED.FEDFUNDS(3:end) Y(:,5)]);
title('Federal Funds Rate');
ylabel('Percent');
xlabel('Date');
subplot(2,2,2)
plot(FRED.Time(3:end),[FRED.PCEC(3:end) Y(:,6)]);
title('Consumption Expenditures');
ylabel('Billions of $ (scaled)');
xlabel('Date');
subplot(2,2,3)
plot(FRED.Time(3:end),[FRED.GPDI(3:end) Y(:,7)]);
title('Gross Private Domestic Investment');
ylabel('Billions of $ (scaled)');
xlabel('Date');
```

Consider this VEC(1) model for three hypothetical response series.

`$\begin{array}{rcl}\Delta y_t &=& c + AB'y_{t-1} + \Phi_1 \Delta y_{t-1} + \epsilon_t \\ &=& \left[\begin{array}{c}-1\\ -3\\ -30\end{array}\right] + \left[\begin{array}{cc}-0.3& 0.3\\ -0.2& 0.1\\ -1& 0\end{array}\right]\left[\begin{array}{ccc}0.1& -0.2& 0.2\\ -0.7& 0.5& 0.2\end{array}\right]y_{t-1} + \left[\begin{array}{ccc}0& 0.1& 0.2\\ 0.2& -0.2& 0\\ 0.7& -0.2& 0.3\end{array}\right]\Delta y_{t-1} + \epsilon_t.\end{array}$`

The innovations are multivariate Gaussian with a mean of 0 and the covariance matrix

`$\Sigma =\left[\begin{array}{ccc}1.3& 0.4& 1.6\\ 0.4& 0.6& 0.7\\ 1.6& 0.7& 5\end{array}\right].$`

Create variables for the parameter values.

```
Adjustment = [-0.3 0.3; -0.2 0.1; -1 0];
Cointegration = [0.1 -0.7; -0.2 0.5; 0.2 0.2];
ShortRun = {[0 0.1 0.2; 0.2 -0.2 0; 0.7 -0.2 0.3]};

Constant = [-1; -3; -30];
Trend = [0; 0; 0];
Covariance = [1.3 0.4 1.6; 0.4 0.6 0.7; 1.6 0.7 5];
```

Create a `vecm` model object representing the VEC(1) model using the appropriate name-value pair arguments.

```
Mdl = vecm('Adjustment',Adjustment,'Cointegration',Cointegration,...
    'Constant',Constant,'ShortRun',ShortRun,'Trend',Trend,...
    'Covariance',Covariance)
```

```
Mdl = 
  vecm with properties:

             Description: "3-Dimensional Rank = 2 VEC(1) Model"
             SeriesNames: "Y1"  "Y2"  "Y3"
               NumSeries: 3
                    Rank: 2
                       P: 2
                Constant: [-1 -3 -30]'
              Adjustment: [3×2 matrix]
           Cointegration: [3×2 matrix]
                  Impact: [3×3 matrix]
   CointegrationConstant: [2×1 vector of NaNs]
      CointegrationTrend: [2×1 vector of NaNs]
                ShortRun: {3×3 matrix} at lag [1]
                   Trend: [3×1 vector of zeros]
                    Beta: [3×0 matrix]
              Covariance: [3×3 matrix]
```

`Mdl` is, effectively, a fully specified `vecm` model object. Although the cointegration constant and cointegration trend are unknown (`NaN`), they are not needed for simulating observations or forecasting because the overall constant and trend parameters are known.

Generate 1000 paths of 100 observations each from a 3-D standard Gaussian distribution. Here, `numobs` is the number of observations in each path.

```
numobs = 100;
numpaths = 1000;
rng(1);
Z = randn(numobs,Mdl.NumSeries,numpaths);
```

Filter the disturbances through the model. Return the innovations (scaled disturbances).

`[Y,E] = filter(Mdl,Z);`

`Y` and `E` are 100-by-3-by-1000 arrays of filtered responses and scaled disturbances, respectively.
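To make the filtering step concrete, the VEC(1) equation for this example can be written out by hand for a single path. The following is a minimal sketch in base MATLAB, not the toolbox implementation. It assumes zero presample responses (the stated default for nonstationary processes) and assumes that scaling applies the Cholesky factor to each observation, written as `Z*L'` for row-oriented data. The names `yPrev`, `dyPrev`, and `Ymanual` are hypothetical.

```matlab
% Parameter values copied from the three-series example
A     = [-0.3 0.3; -0.2 0.1; -1 0];            % adjustment
B     = [0.1 -0.7; -0.2 0.5; 0.2 0.2];         % cointegration
Phi1  = [0 0.1 0.2; 0.2 -0.2 0; 0.7 -0.2 0.3]; % short-run coefficient
c     = [-1; -3; -30];                         % overall constant
Sigma = [1.3 0.4 1.6; 0.4 0.6 0.7; 1.6 0.7 5];

rng(1);
numobs = 100;
Z = randn(numobs,3);            % one path of disturbances
L = chol(Sigma,'lower');
E = Z*L';                       % assumed scaling: row t of E is (L*z_t)'

yPrev  = zeros(3,1);            % y_{t-1}, assumed zero presample
dyPrev = zeros(3,1);            % Delta y_{t-1}, zero for a zero presample
Ymanual = zeros(numobs,3);
for t = 1:numobs
    % Delta y_t = c + A*B'*y_{t-1} + Phi1*Delta y_{t-1} + e_t
    dy = c + A*(B'*yPrev) + Phi1*dyPrev + E(t,:)';
    yPrev  = yPrev + dy;        % y_t = y_{t-1} + Delta y_t
    dyPrev = dy;
    Ymanual(t,:) = yPrev';
end
```

Each row of `Ymanual` holds the level of the three responses at one time point. Whether these values match the output of `filter` exactly depends on the assumptions above and has not been checked against the toolbox.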

For each time point, compute the mean vector of the filtered responses among all paths.

`MeanFilt = mean(Y,3);`

`MeanFilt` is a 100-by-3 matrix containing the average of the filtered responses at each time point.

Plot the filtered responses and their averages.

```
figure;
for j = 1:Mdl.NumSeries
    subplot(2,2,j)
    plot(squeeze(Y(:,j,:)),'Color',[0.8,0.8,0.8])
    title(Mdl.SeriesNames{j});
    hold on
    plot(MeanFilt(:,j));
    xlabel('Time index')
    hold off
end
```

## Input Arguments


VEC model, specified as a `vecm` model object created by `vecm` or `estimate`. `Mdl` must be fully specified.

Underlying multivariate disturbance series `Z` associated with the model innovations process, specified as a `numobs`-by-`numseries` numeric matrix or a `numobs`-by-`numseries`-by-`numpaths` numeric array.

`numobs` is the sample size. `numseries` is the number of disturbance series (`Mdl.NumSeries`). `numpaths` is the number of disturbance paths.

Rows correspond to sampling times, and the last row contains the latest set of disturbances.

Columns correspond to individual disturbance series for response variables.

Pages correspond to separate, independent paths. For a numeric matrix, `Z` is a single `numseries`-dimensional path of disturbance series. For a 3-D array, each page of `Z` represents a separate `numseries`-dimensional path. Among all pages, disturbances in corresponding rows occur at the same time.

The `'Scale'` name-value pair argument specifies whether to scale the disturbances before `filter` filters them through `Mdl`. For more details, see `Scale`.

Data Types: `double`
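Because the disturbances are supplied by the caller, they do not have to be Gaussian. The following sketch builds a multipath `Z` array from standardized uniform draws. The choice of distribution and the sizes are assumptions made for illustration only.

```matlab
% Assumed sizes for illustration
numobs = 100;
numseries = 3;
numpaths = 1000;

rng(1) % For reproducibility

% Uniform draws on [-0.5,0.5] have mean 0 and variance 1/12, so
% multiplying by sqrt(12) gives mean-zero, unit-variance disturbances.
Z = sqrt(12)*(rand(numobs,numseries,numpaths) - 0.5);

zMean = mean(Z(:));   % close to 0
zVar  = var(Z(:));    % close to 1

% Given a fully specified vecm object Mdl with Mdl.NumSeries == numseries:
% Y = filter(Mdl,Z);
```

Rows of `Z` are time points, columns are series, and pages are paths, which matches the layout described above.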

### Name-Value Pair Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

Example: `'Scale',false,'X',X` does not scale `Z` by the lower triangular Cholesky factor of the model covariance matrix before filtering, and uses the matrix `X` as predictor data in the regression component.

Presample responses that provide initial values for the model `Mdl`, specified as the comma-separated pair consisting of `'Y0'` and a `numpreobs`-by-`numseries` numeric matrix or a `numpreobs`-by-`numseries`-by-`numprepaths` numeric array.

`numpreobs` is the number of presample observations. `numprepaths` is the number of presample response paths.

Rows correspond to presample observations, and the last row contains the latest presample observation. `Y0` must have at least `Mdl.P` rows. If you supply more rows than necessary, `filter` uses the latest `Mdl.P` observations only.

Columns must correspond to the response series in `Y`.

Pages correspond to separate, independent paths.

• If `Y0` is a matrix, then `filter` applies it to each path (page) in `Y`. Therefore, all paths in `Y` derive from common initial conditions.

• Otherwise, `filter` applies `Y0(:,:,j)` to `Y(:,:,j)`. `Y0` must have at least `numpaths` pages, and `filter` uses only the first `numpaths` pages.

Among all pages, observations in a particular row occur at the same time.

By default, `filter` sets any necessary presample observations.

• For stationary VAR processes without regression components, `filter` uses the unconditional mean $\mu ={\Phi }^{-1}\left(L\right)c.$

• For nonstationary processes or models containing a regression component, `filter` sets presample observations to an array composed of zeros.

Data Types: `double`
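The two ways of supplying `Y0` can be compared directly. This sketch, which requires Econometrics Toolbox, reuses the parameter values of the three-series example on this page. The zero presample values and the small sizes are assumptions chosen for illustration.

```matlab
% Three-series VEC(1) model with the parameter values used in the examples
Mdl = vecm('Adjustment',[-0.3 0.3; -0.2 0.1; -1 0], ...
    'Cointegration',[0.1 -0.7; -0.2 0.5; 0.2 0.2], ...
    'ShortRun',{[0 0.1 0.2; 0.2 -0.2 0; 0.7 -0.2 0.3]}, ...
    'Constant',[-1; -3; -30],'Trend',[0; 0; 0], ...
    'Covariance',[1.3 0.4 1.6; 0.4 0.6 0.7; 1.6 0.7 5]);

rng(1);
numobs = 20;
numpaths = 5;
Z = randn(numobs,Mdl.NumSeries,numpaths);

% A matrix Y0 is shared by every path
Y0common = zeros(Mdl.P,Mdl.NumSeries);
Y1 = filter(Mdl,Z,'Y0',Y0common);

% A 3-D Y0 supplies one presample page per path; here all pages are identical
Y0paged = repmat(Y0common,1,1,numpaths);
Y2 = filter(Mdl,Z,'Y0',Y0paged);

maxDiff = max(abs(Y1(:) - Y2(:)));   % expected to be zero up to rounding
```

Supplying different pages in `Y0paged` would instead start each path from its own initial conditions.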

Predictor data for the regression component in the model, specified as the comma-separated pair consisting of `'X'` and a numeric matrix containing `numpreds` columns.

`numpreds` is the number of predictor variables (`size(Mdl.Beta,2)`).

Rows correspond to observations, and the last row contains the latest observation. `X` must have at least as many observations as `Z`. If you supply more rows than necessary, `filter` uses only the latest observations. `filter` does not use the regression component in the presample period.

Columns correspond to individual predictor variables. All predictor variables are present in the regression component of each response equation.

`filter` applies `X` to each path (page) in `Z`; that is, `X` represents one path of observed predictors.

By default, `filter` excludes the regression component, regardless of its presence in `Mdl`.

Data Types: `double`
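When `X` has more rows than `Z`, only the latest rows are used. The trimming is equivalent to the indexing below, shown as a base MATLAB sketch with hypothetical predictor values and sizes.

```matlab
numobs = 5;                       % number of rows in Z (assumed)
X = [(1:8)' 10*(1:8)'];           % 8 observations of 2 hypothetical predictors

% Keep the latest numobs rows so that the last row of X
% lines up with the last row of Z
Xused = X(end-numobs+1:end,:);
```

Here `Xused` contains observations 4 through 8 of `X`.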

Flag indicating whether to scale disturbances by the lower triangular Cholesky factor of the model covariance matrix, specified as the comma-separated pair consisting of `'Scale'` and `true` or `false`.

For each page `j` = 1,...,`numpaths`, `filter` filters the `numobs`-by-`numseries` matrix of innovations `E(:,:,j)` through the VEC(p – 1) model `Mdl`, according to these conditions.

• If `Scale` is `true`, then `E(:,:,j)` = `L*Z(:,:,j)` and `L` = `chol(Mdl.Covariance,'lower')`.

• If `Scale` is `false`, then `E(:,:,j)` = `Z(:,:,j)`.

Example: `'Scale',false`

Data Types: `logical`
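The effect of the scaling can be checked in base MATLAB. This page writes the scaling as `L*Z(:,:,j)`. Because observations are stored in rows, the sketch below assumes this means applying `L` to each observation vector, which in matrix form is `Z*L'`. That reading is an assumption about notation, not a statement about the toolbox internals. The covariance matrix is the one from the three-series example.

```matlab
Sigma = [1.3 0.4 1.6; 0.4 0.6 0.7; 1.6 0.7 5];
L = chol(Sigma,'lower');      % lower triangular factor, L*L' equals Sigma

rng(1);
Z = randn(1e5,3);             % unit-variance, uncorrelated disturbances

% Scale each observation (row) by L: row t of E is (L*z_t)'
E = Z*L';

factorError = norm(L*L' - Sigma);   % near zero
SampleCov = cov(E);                 % approximately Sigma
```

With `'Scale',false`, no such transformation is applied, so the supplied `Z` should already have the covariance structure you want the innovations to have.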

### Note

`NaN` values in `Z`, `Y0`, and `X` indicate missing values. `filter` removes missing values from the data by list-wise deletion.

1. If `Z` is a 3-D array, then `filter` horizontally concatenates the pages of `Z` to form a `numobs`-by-`numpaths*numseries` matrix.

2. If a regression component is present, then `filter` horizontally concatenates `X` to `Z` to form a `numobs`-by-`(numpaths*numseries + numpreds)` matrix. `filter` assumes that the last rows of each series occur at the same time.

3. `filter` removes any row that contains at least one `NaN` from the concatenated data.

4. `filter` applies steps 1 and 3 to the presample paths in `Y0`.

This process ensures that the filtered responses and innovations of each path are the same size and are based on the same observation times. In the case of missing observations, the results obtained from multiple paths of `Z` can differ from the results obtained from each path individually.

This type of data reduction reduces the effective sample size.
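The deletion steps can be mimicked in base MATLAB to see which time points survive. This is a sketch of the described procedure on small hypothetical data, not the toolbox code.

```matlab
numobs = 6; numseries = 2; numpaths = 3;
Z = reshape(1:numobs*numseries*numpaths,numobs,numseries,numpaths);
Z(2,1,1) = NaN;               % missing value in path 1
Z(5,2,3) = NaN;               % missing value in path 3
X = (1:numobs)';              % one hypothetical predictor
X(4) = NaN;                   % missing predictor value

% Step 1: place the pages of Z side by side
Zwide = reshape(Z,numobs,numseries*numpaths);

% Step 2: append the predictor data
D = [Zwide X];

% Step 3: drop every row that contains at least one NaN
keep = ~any(isnan(D),2);
Zclean = Z(keep,:,:);
Xclean = X(keep,:);
```

Rows 2, 4, and 5 are removed from every path, even from paths that have no missing value at those times, which is why multipath results can differ from single-path results.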

## Output Arguments


Filtered multivariate response series, returned as a `numobs`-by-`numseries` numeric matrix or a `numobs`-by-`numseries`-by-`numpaths` numeric array. `Y` represents the continuation of the presample responses in `Y0`.

Multivariate model innovations series, returned as a `numobs`-by-`numseries` numeric matrix or a `numobs`-by-`numseries`-by-`numpaths` numeric array. For details on the value of `E`, see `Scale`.

## Algorithms

• `filter` computes `Y` and `E` using this process for each page `j` in `Z`.

1. If `Scale` is `true`, then `E(:,:,j)` = `L*Z(:,:,j)`, where `L` = `chol(Mdl.Covariance,'lower')`. Otherwise, `E(:,:,j)` = `Z(:,:,j)`. Set $e_t$ = `E(:,:,j)`.

2. `Y(:,:,j)` is $y_t$ in this system of equations.

`$\Delta y_t = \hat{\Phi}^{-1}(L)\left(\hat{c} + \hat{d}t + \hat{A}\hat{B}' y_{t-1} + \hat{\beta} x_t + e_t\right).$`

For variable definitions, see Vector Error-Correction Model.

• `filter` generalizes `simulate`. Both functions filter a disturbance series through a model to produce responses and innovations. However, whereas `simulate` generates a series of mean-zero, unit-variance, independent Gaussian disturbances `Z` to form innovations `E` = `L*Z`, `filter` enables you to supply disturbances from any distribution.

• `filter` uses this process to determine the time origin $t_0$ of models that include linear time trends.

• If you do not specify `Y0`, then $t_0$ = 0.

• Otherwise, `filter` sets $t_0$ to `size(Y0,1)` – `Mdl.P`. Therefore, the times in the trend component are t = $t_0$ + 1, $t_0$ + 2,..., $t_0$ + `numobs`, where `numobs` is the effective sample size (`size(Y,1)` after `filter` removes missing values). This convention is consistent with the default behavior of model estimation in which `estimate` removes the first `Mdl.P` responses, reducing the effective sample size. Although `filter` explicitly uses the first `Mdl.P` presample responses in `Y0` to initialize the model, the total number of observations in `Y0` and `Y` (excluding missing values) determines $t_0$.
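As a small worked example of the time-origin rule, the following base MATLAB sketch computes the trend times for hypothetical sizes, assuming the time origin is the number of rows in `Y0` minus `Mdl.P`.

```matlab
P = 2;            % Mdl.P for a VEC(1) model
numpreobs = 5;    % rows supplied in Y0 (hypothetical)
numobs = 4;       % effective sample size (hypothetical)

t0 = numpreobs - P;        % time origin
t  = t0 + (1:numobs);      % times used in the trend component
```

With these values, `t0` is 3 and the trend is evaluated at times 4 through 7. Without `Y0`, the same sample would use times 1 through 4.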
