nearcorr

Compute nearest correlation matrix by minimizing Frobenius distance

Syntax

Y = nearcorr(A)

Y = nearcorr(___,Name,Value)

Description

Y = nearcorr(A) returns the correlation matrix Y that is nearest to A, measured by the Frobenius distance.

Y = nearcorr(___,Name,Value) specifies options using one or more name-value pair arguments in addition to the input arguments in the previous syntax.
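For reference, the unweighted problem is usually stated as below. This is the standard formulation from the literature (see Higham [1]), not text taken from this page, and the notation is chosen here for illustration: A is the N-by-N input, Y is the output, and the norm is the Frobenius norm.

```latex
\min_{Y}\ \|A - Y\|_F
\quad \text{subject to} \quad
Y = Y^{\mathsf T},\qquad Y_{ii} = 1\ (i = 1,\dots,N),\qquad Y \succeq 0,
\qquad \text{where } \|M\|_F = \Bigl(\sum_{i,j} M_{ij}^2\Bigr)^{1/2}.
```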

Examples

Compute the Nearest Correlation Matrix

Find the nearest correlation matrix in the Frobenius norm for a given matrix that is not positive semidefinite.

Specify an N-by-N symmetric matrix with all elements in the interval [-1, 1] and unit diagonal.

A =  [1.0000   0         0         0   -0.9360 
      0    1.0000   -0.5500   -0.3645   -0.5300 
      0   -0.5500    1.0000   -0.0351    0.0875 
      0   -0.3645   -0.0351    1.0000    0.4557 
     -0.9360   -0.5300    0.0875    0.4557    1.0000];

Compute the eigenvalues of A using eig.

eig(A)

The smallest eigenvalue is less than 0, which indicates that A is not a positive semidefinite matrix.

Compute the nearest correlation matrix using nearcorr with the default Newton algorithm.

B = nearcorr(A)

B = 5×5

    1.0000    0.0372    0.0100   -0.0219   -0.8478
    0.0372    1.0000   -0.5449   -0.3757   -0.4849
    0.0100   -0.5449    1.0000   -0.0381    0.0996
   -0.0219   -0.3757   -0.0381    1.0000    0.4292
   -0.8478   -0.4849    0.0996    0.4292    1.0000

Compute the eigenvalues of B.

eig(B)

All of the eigenvalues are greater than or equal to 0, which means that B is a positive semidefinite matrix.
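Because eig works in floating point, an eigenvalue that is zero in exact arithmetic can come out as a tiny negative number. A minimal sketch of a tolerance-based check follows. The helper name isPSD and the tolerance 1e-8 are illustrative choices made here, not part of nearcorr.

```matlab
% Hypothetical helper (not part of nearcorr): true when every eigenvalue of
% the symmetrized matrix is at least -tol.
isPSD = @(M,tol) min(eig((M + M')/2)) >= -tol;

% Input matrix from this example.
A = [ 1.0000   0         0         0        -0.9360
      0        1.0000   -0.5500   -0.3645   -0.5300
      0       -0.5500    1.0000   -0.0351    0.0875
      0       -0.3645   -0.0351    1.0000    0.4557
     -0.9360  -0.5300    0.0875    0.4557    1.0000];

tol = 1e-8;                    % illustrative tolerance, adjust as needed
aIsPSD = isPSD(A,tol);         % false: A has a clearly negative eigenvalue
iIsPSD = isPSD(eye(5),tol);    % true: the identity matrix is positive semidefinite
```

Applying the same helper to the output of nearcorr gives a programmatic version of the eig inspection used in this example.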

When you use nearcorr, you can select the alternating projections algorithm by setting the name-value argument 'Method' to 'projection'.

nearcorr(A,'Method','projection')

ans = 5×5

    1.0000    0.0372    0.0100   -0.0219   -0.8478
    0.0372    1.0000   -0.5449   -0.3757   -0.4849
    0.0100   -0.5449    1.0000   -0.0381    0.0996
   -0.0219   -0.3757   -0.0381    1.0000    0.4292
   -0.8478   -0.4849    0.0996    0.4292    1.0000

You can also impose elementwise weights by specifying the 'Weights' name-value pair argument. For more information on elementwise weights, see Weights.

W = [0.0000  1.0000  0.1000  0.1500  0.2500
     1.0000  0.0000  0.0500  0.0250  0.1500
     0.1000  0.0500  0.0000  0.2500  1.0000
     0.1500  0.0250  0.2500  0.0000  0.2500
     0.2500  0.1500  1.0000  0.2500  0.0000];
nearcorr(A,'Weights',W)

ans = 5×5

    1.0000    0.0014    0.0287   -0.0222   -0.8777
    0.0014    1.0000   -0.4980   -0.7268   -0.4567
    0.0287   -0.4980    1.0000   -0.0358    0.0878
   -0.0222   -0.7268   -0.0358    1.0000    0.4465
   -0.8777   -0.4567    0.0878    0.4465    1.0000

In addition, you can impose N-by-1 vectorized weights by specifying the 'Weights' name-value pair argument. For more information on vectorized weights, see Weights.

W = linspace(0.1,0.01,5)'

C = nearcorr(A,'Weights', W)

C = 5×5

    1.0000    0.0051    0.0021   -0.0056   -0.8490
    0.0051    1.0000   -0.5486   -0.3684   -0.4691
    0.0021   -0.5486    1.0000   -0.0367    0.1119
   -0.0056   -0.3684   -0.0367    1.0000    0.3890
   -0.8490   -0.4691    0.1119    0.3890    1.0000

Compute the eigenvalues of C.

eig(C)

All of the eigenvalues are greater than or equal to 0, which means that C is a positive semidefinite matrix.

Generate a Correlation Matrix for Stocks with Missing Values

Use nearcorr to obtain a positive semidefinite correlation matrix for stock data that contains missing values.

Assume that you have stock values, some of which are missing (NaN).

Stock_Missing = [59.875 42.734 47.938 60.359 NaN 69.625 61.500 62.125
                53.188 49.000 39.500 64.813 34.750 56.625 83.000 44.500
                55.750 50.000 38.938 62.875 30.188 43.375 NaN 29.938
                65.500 51.063 45.563 69.313 48.250 62.375 85.250 46.875
                69.938 47.000 52.313 71.016 37.500 59.359 61.188 48.219
                61.500 44.188 NaN 57.000 35.313 55.813 51.500 62.188
                59.230 48.210 62.190 61.390 54.310 70.170 61.750 91.080
                NaN 48.700 60.300 68.580 61.250 70.340 61.590 90.350
                52.900 52.690 54.230 61.670 68.170 NaN 57.870 88.640
                57.370 59.040 59.870 62.090 61.620 66.470 65.370 85.840];

Use corr to compute the correlation matrix and then use eig to check if the correlation matrix is positive semidefinite.

A = corr(Stock_Missing, 'Rows','pairwise');
eig(A)

A has eigenvalues that are less than 0, which indicates that the correlation matrix is not positive semidefinite.

Use nearcorr with this correlation matrix to generate a positive semidefinite matrix where all eigenvalues are greater than or equal to 0.

B = nearcorr(A);
eigenvalues = eig(B)

eigenvalues = 8×1

    0.0000
    0.0000
    0.0180
    0.2205
    0.5863
    1.6026
    1.7258
    3.8469

Input Arguments

`A` — Input correlation matrix
matrix

Input correlation matrix, specified as an N-by-N symmetric approximate correlation matrix with all elements in the interval [-1, 1] and unit diagonal. A does not need to be positive semidefinite.

Example: A = [1.0000 0 0 0 -0.9360; 0 1.0000 -0.5500 -0.3645 -0.5300; 0 -0.5500 1.0000 -0.0351 0.0875; 0 -0.3645 -0.0351 1.0000 0.4557; -0.9360 -0.5300 0.0875 0.4557 1.0000]

Data Types: single | double

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: nearcorr(A,'Tolerance',1e-7,'MaxIterations',500,'Method','newton','Weights',weight_vector) returns a nearest correlation matrix by minimizing the Frobenius distance.
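As a sketch of the two calling styles described above, the following calls pass the same options in both forms. It assumes that nearcorr is available on the MATLAB path and that the release is R2021a or later for the Name=Value form. The variable names Y1 and Y2 are illustrative.

```matlab
% Input matrix from the first example on this page.
A = [ 1.0000   0         0         0        -0.9360
      0        1.0000   -0.5500   -0.3645   -0.5300
      0       -0.5500    1.0000   -0.0351    0.0875
      0       -0.3645   -0.0351    1.0000    0.4557
     -0.9360  -0.5300    0.0875    0.4557    1.0000];

% R2021a and later: Name=Value syntax.
Y1 = nearcorr(A,Tolerance=1e-7,MaxIterations=500);

% Any release: quoted names, with commas separating each name and value.
Y2 = nearcorr(A,'Tolerance',1e-7,'MaxIterations',500);
```

Both calls request the same tolerance and iteration limit, so they are expected to return the same matrix.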

`Tolerance` — Termination tolerance for algorithm
`1e-6` (default) | positive scalar

Termination tolerance for the algorithm, specified as the comma-separated pair consisting of 'Tolerance' and a positive scalar.

Example: 'Tolerance',1e-7

Data Types: single | double

`MaxIterations` — Maximum number of solver iterations
`200` (default) | positive integer

Maximum number of solver iterations, specified as the comma-separated pair consisting of 'MaxIterations' and a positive integer.

Example: 'MaxIterations',500

Data Types: single | double

`Method` — Method for solving nearest correlation matrix problem
`'newton'` (default) | `'projection'`

Method for solving nearest correlation matrix problem, specified as the comma-separated pair consisting of 'Method' and one of the values in the following table.

Value	Description
`'newton'`	The Newton algorithm is quadratically convergent. If you specify the `'newton'` method, `Weights` can be either a symmetric matrix or an `N`-by-`1` vector.
`'projection'`	The alternating projections algorithm can reach the nearest correlation matrix with high accuracy, but its convergence is at best linear. If you specify the `'projection'` method, `Weights` must be an `N`-by-`1` vector.

Example: 'Method','projection'

Data Types: char | string
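To make the 'projection' option more concrete, here is a minimal sketch of unweighted alternating projections with Dykstra's correction, following the scheme described by Higham [1]. It is an illustration under stated assumptions, not the code that nearcorr runs: the stopping rule, the variable names, and the reuse of the documented default values for Tolerance and MaxIterations are choices made here.

```matlab
% Sketch only: unweighted alternating projections with Dykstra's correction.
A = [ 1.0000   0         0         0        -0.9360
      0        1.0000   -0.5500   -0.3645   -0.5300
      0       -0.5500    1.0000   -0.0351    0.0875
      0       -0.3645   -0.0351    1.0000    0.4557
     -0.9360  -0.5300    0.0875    0.4557    1.0000];

n = size(A,1);
tol = 1e-6;        % assumed stopping tolerance (mirrors the documented default)
maxIter = 200;     % assumed iteration cap (mirrors the documented default)

Y = A;             % current iterate
dS = zeros(n);     % Dykstra's correction term

for k = 1:maxIter
    Yold = Y;
    R = Y - dS;                    % remove the previous correction
    [V,D] = eig((R + R')/2);       % symmetrize to guard against round-off
    X = V*max(D,0)*V';             % project onto the positive semidefinite cone
    X = (X + X')/2;                % restore exact symmetry
    dS = X - R;                    % update the correction
    Y = X;
    Y(1:n+1:end) = 1;              % project onto matrices with unit diagonal
    if norm(Y - Yold,'fro') <= tol*norm(Y,'fro')
        break                      % relative change is small enough
    end
end
```

Each pass projects onto the positive semidefinite cone and then resets the diagonal to 1, so Y always has an exact unit diagonal, while its smallest eigenvalue can remain slightly negative until the iteration has converged.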

`Weights` — Weights for confidence levels of entries in input matrix
`[ ]` (default) | `matrix` | `vector`

Weights for confidence levels of entries in the input matrix, specified as the comma-separated pair consisting of 'Weights' and either a symmetric matrix or an N-by-1 vector.

Symmetric matrix: When you specify Weights as a symmetric matrix W with all elements >= 0 to do elementwise weighting, the nearest correlation matrix Y is computed by minimizing the Frobenius norm of (W ⚬ (A-Y)), where ⚬ denotes the elementwise product. Larger weight values place greater importance on the corresponding elements in A.
N-by-1 vector: When you specify Weights as an N-by-1 vector w with positive numeric values, the nearest correlation matrix Y is computed by minimizing the Frobenius norm of (diag(w)^0.5 × (A-Y) × diag(w)^0.5).

Note

Matrix weights put weight on individual entries of the correlation matrix. A full matrix must be specified, but you can control which entries are more important to match. Alternatively, vector weights put weight on a full column (and the corresponding row). Fewer weights need to be specified as compared to the matrix weights, but an entire column (and the corresponding row) is weighted by a single weight.
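The two weighted objectives above can be evaluated directly for any candidate matrix. The sketch below uses only base MATLAB. A and W are the inputs from the first example, Y is the unweighted nearcorr output shown there (rounded to four decimals), and the names R, fElem, fVec, Wequiv, and fEquiv are illustrative.

```matlab
A = [ 1.0000   0         0         0        -0.9360
      0        1.0000   -0.5500   -0.3645   -0.5300
      0       -0.5500    1.0000   -0.0351    0.0875
      0       -0.3645   -0.0351    1.0000    0.4557
     -0.9360  -0.5300    0.0875    0.4557    1.0000];

% Candidate matrix: the unweighted result from the first example.
Y = [ 1.0000   0.0372    0.0100   -0.0219   -0.8478
      0.0372   1.0000   -0.5449   -0.3757   -0.4849
      0.0100  -0.5449    1.0000   -0.0381    0.0996
     -0.0219  -0.3757   -0.0381    1.0000    0.4292
     -0.8478  -0.4849    0.0996    0.4292    1.0000];

W = [0.0000  1.0000  0.1000  0.1500  0.2500
     1.0000  0.0000  0.0500  0.0250  0.1500
     0.1000  0.0500  0.0000  0.2500  1.0000
     0.1500  0.0250  0.2500  0.0000  0.2500
     0.2500  0.1500  1.0000  0.2500  0.0000];
w = linspace(0.1,0.01,5)';          % vector weights, as in the example

R = A - Y;                          % residual between input and candidate

fElem = norm(W .* R,'fro');         % elementwise objective: W .* (A-Y)

D = diag(sqrt(w));                  % diag(w)^0.5
fVec = norm(D*R*D,'fro');           % vector-weight objective

Wequiv = sqrt(w)*sqrt(w)';          % entries sqrt(w(i)*w(j))
fEquiv = norm(Wequiv .* R,'fro');   % agrees with fVec up to round-off
```

The last two values agree because entry (i,j) of diag(w)^0.5 × (A-Y) × diag(w)^0.5 is sqrt(w(i)*w(j)) times entry (i,j) of A-Y, so a weight vector acts like a particular rank-one matrix of elementwise weights.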

Example: 'Weights',W

Data Types: single | double

Output Arguments

`Y` — Nearest correlation matrix to input A
positive semidefinite matrix

Nearest correlation matrix to the input A, returned as a positive semidefinite matrix.

References

[1] Higham, N. J. "Computing the Nearest Correlation Matrix — A Problem from Finance." IMA Journal of Numerical Analysis. Vol. 22, Issue 3, 2002.

[2] Qi, H. and D. Sun. "An Augmented Lagrangian Dual Approach for the H-Weighted Nearest Correlation Matrix Problem." IMA Journal of Numerical Analysis. Vol. 31, Issue 2, 2011.

[3] Pang, J. S., D. Sun, and J. Sun. "Semismooth Homeomorphisms and Strong Stability of Semidefinite and Lorentz Complementarity Problems." Mathematics of Operations Research. Vol. 28, Number 1, 2003.

Extended Capabilities

Thread-Based Environment
Run code in the background using MATLAB® `backgroundPool` or accelerate code with Parallel Computing Toolbox™ `ThreadPool`.

This function fully supports thread-based environments. For more information, see Run MATLAB Functions in Thread-Based Environment.

Version History

Introduced in R2019b
