loss

Regression loss for Gaussian kernel regression model

Syntax

L = loss(Mdl,X,Y)

L = loss(Mdl,Tbl,ResponseVarName)

L = loss(Mdl,Tbl,Y)

L = loss(___,Name,Value)

Description

L = loss(Mdl,X,Y) returns the mean squared error (MSE) for the Gaussian kernel regression model Mdl using the predictor data in X and the corresponding responses in Y.

L = loss(Mdl,Tbl,ResponseVarName) returns the MSE for the model Mdl using the predictor data in Tbl and the true responses in Tbl.ResponseVarName.

L = loss(Mdl,Tbl,Y) returns the MSE for the model Mdl using the predictor data in table Tbl and the true responses in Y.

L = loss(___,Name,Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. For example, you can specify a regression loss function and observation weights. Then, loss returns the weighted regression loss using the specified loss function.

Examples

Calculate Sample Loss for Gaussian Kernel Regression Model

Train a Gaussian kernel regression model for a tall array, then calculate the resubstitution mean squared error and epsilon-insensitive error.

When you perform calculations on tall arrays, MATLAB® uses either a parallel pool (default if you have Parallel Computing Toolbox™) or the local MATLAB session. To run the example using the local MATLAB session when you have Parallel Computing Toolbox, change the global execution environment by using the mapreducer function.

mapreducer(0)

Create a datastore that references the folder location with the data. The data can be contained in a single file, a collection of files, or an entire folder. Treat 'NA' values as missing data so that datastore replaces them with NaN values. Select a subset of the variables to use. Create a tall table on top of the datastore.

varnames = {'ArrTime','DepTime','ActualElapsedTime'};
ds = datastore('airlinesmall.csv','TreatAsMissing','NA',...
    'SelectedVariableNames',varnames);
t = tall(ds);

Specify DepTime and ArrTime as the predictor variables (X) and ActualElapsedTime as the response variable (Y). Select the observations for which ArrTime is later than DepTime.

daytime = t.ArrTime>t.DepTime;
Y = t.ActualElapsedTime(daytime);     % Response data
X = t{daytime,{'DepTime' 'ArrTime'}}; % Predictor data

Standardize the predictor variables.

Z = zscore(X); % Standardize the data

Train a default Gaussian kernel regression model with the standardized predictors. Set 'Verbose',0 to suppress diagnostic messages.

[Mdl,FitInfo] = fitrkernel(Z,Y,'Verbose',0)

Mdl = 
  RegressionKernel
              ResponseName: 'Y'
                   Learner: 'svm'
    NumExpansionDimensions: 64
               KernelScale: 1
                    Lambda: 8.5385e-06
             BoxConstraint: 1
                   Epsilon: 5.9303

FitInfo = struct with fields:
                  Solver: 'LBFGS-tall'
            LossFunction: 'epsiloninsensitive'
                  Lambda: 8.5385e-06
           BetaTolerance: 1.0000e-03
       GradientTolerance: 1.0000e-05
          ObjectiveValue: 26.1409
       GradientMagnitude: 0.0023
    RelativeChangeInBeta: 0.0150
                 FitTime: 32.5816
                 History: []

Mdl is a trained RegressionKernel model, and the structure array FitInfo contains optimization details.

Assess how well the trained model fits the training data by estimating the resubstitution mean squared error and epsilon-insensitive error.

lossMSE = loss(Mdl,Z,Y) % Resubstitution mean squared error

lossMSE =

  MxNx... tall array

    ?    ?    ?    ...
    ?    ?    ?    ...
    ?    ?    ?    ...
    :    :    :
    :    :    :

lossEI = loss(Mdl,Z,Y,'LossFun','epsiloninsensitive') % Resubstitution epsilon-insensitive error

lossEI =

  MxNx... tall array

    ?    ?    ?    ...
    ?    ?    ?    ...
    ?    ?    ?    ...
    :    :    :
    :    :    :

Evaluate the tall arrays and bring the results into memory by using gather.

[lossMSE,lossEI] = gather(lossMSE,lossEI)

Evaluating tall expression using the Local MATLAB Session:
- Pass 1 of 1: Completed in 1.7 sec
Evaluation completed in 2 sec

lossMSE = 2.5141e+03

lossEI = 25.5148

Specify Custom Regression Loss

Specify a custom regression loss (Huber loss) for a Gaussian kernel regression model.

Load the carbig data set.

load carbig

Specify the predictor variables (X) and the response variable (Y).

X = [Weight,Cylinders,Horsepower,Model_Year];
Y = MPG;

Delete rows of X and Y where either array has NaN values. Removing rows with NaN values before passing data to fitrkernel can speed up training and reduce memory usage.

R = rmmissing([X Y]); 
X = R(:,1:4); 
Y = R(:,end);

Reserve 10% of the observations as a holdout sample. Extract the training and test indices from the partition definition.

rng(10)  % For reproducibility
N = length(Y);
cvp = cvpartition(N,'Holdout',0.1);
idxTrn = training(cvp); % Training set indices
idxTest = test(cvp);    % Test set indices

Train the regression kernel model. Standardize the training data.

Xtrain = X(idxTrn,:);
Ytrain = Y(idxTrn);
Mdl = fitrkernel(Xtrain,Ytrain,'Standardize',true)

Mdl = 
  RegressionKernel
              ResponseName: 'Y'
                   Learner: 'svm'
    NumExpansionDimensions: 128
               KernelScale: 1
                    Lambda: 0.0028
             BoxConstraint: 1
                   Epsilon: 0.8617

Mdl is a RegressionKernel model.

Create an anonymous function that measures Huber loss $(δ = 1)$, that is,

$L = \frac{1}{\sum_{j = 1}^{n} w_{j}} \sum_{j = 1}^{n} w_{j} ℓ_{j},$

where

$ℓ_{j} = \begin{cases} 0.5 \hat{e}_{j}^{2}, & | \hat{e}_{j} | \leq 1 \\ | \hat{e}_{j} | - 0.5, & | \hat{e}_{j} | > 1 \end{cases}$

$\hat{e}_{j}$ is the residual for observation j. Custom loss functions must be written in a particular form. For rules on writing a custom loss function, see the 'LossFun' name-value argument.

% Quadratic branch for |residual| <= 1, linear branch (|residual| - 0.5) otherwise
huberloss = @(Y,Yhat,W)sum(W.*((0.5*(abs(Y-Yhat)<=1).*(Y-Yhat).^2) + ...
    ((abs(Y-Yhat)>1).*(abs(Y-Yhat)-0.5))))/sum(W);
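
To make the piecewise definition concrete, the following sketch evaluates $ℓ_{j}$ directly on a small, made-up residual vector using base MATLAB only. The names e, w, ell, and Lcheck are illustrative assumptions and are not part of the carbig example.

```matlab
% Illustrative residuals and weights (made-up values, not from carbig)
e = [0.5; -2; 1];
w = ones(3,1);

% Piecewise Huber loss with delta = 1
small = abs(e) <= 1;
ell = zeros(size(e));
ell(small)  = 0.5*e(small).^2;       % quadratic branch
ell(~small) = abs(e(~small)) - 0.5;  % linear branch

% Weighted average, L = sum(w.*ell)/sum(w)
Lcheck = sum(w.*ell)/sum(w);
```

For these values the per-observation losses are 0.125, 1.5, and 0.5, so Lcheck equals 2.125/3.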

Estimate the training set regression loss using the Huber loss function.

eTrain = loss(Mdl,Xtrain,Ytrain,'LossFun',huberloss)

eTrain = 1.7210

Estimate the test set regression loss using the Huber loss function.

Xtest = X(idxTest,:);
Ytest = Y(idxTest);

eTest = loss(Mdl,Xtest,Ytest,'LossFun',huberloss)

eTest = 1.3062

Input Arguments

`Mdl` — Kernel regression model
`RegressionKernel` model object

Kernel regression model, specified as a RegressionKernel model object. You can create a RegressionKernel model object using fitrkernel.

`X` — Predictor data
n-by-p numeric matrix

Predictor data, specified as an n-by-p numeric matrix, where n is the number of observations and p is the number of predictors. p must be equal to the number of predictors used to train Mdl.

Data Types: single | double

`Y` — Response data
numeric vector

Response data, specified as an n-dimensional numeric vector. The length of Y must be equal to the number of observations in X or Tbl.

Data Types: single | double

`Tbl` — Sample data
table

Sample data used to train the model, specified as a table. Each row of Tbl corresponds to one observation, and each column corresponds to one predictor variable. Optionally, Tbl can contain additional columns for the response variable and observation weights. Tbl must contain all the predictors used to train Mdl. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed.

If Tbl contains the response variable used to train Mdl, then you do not need to specify ResponseVarName or Y.

If you train Mdl using sample data contained in a table, then the input data for loss must also be in a table.

`ResponseVarName` — Response variable name
name of variable in `Tbl`

Response variable name, specified as the name of a variable in Tbl. The response variable must be a numeric vector. If Tbl contains the response variable used to train Mdl, then you do not need to specify ResponseVarName.

If you specify ResponseVarName, then you must specify it as a character vector or string scalar. For example, if the response variable is stored as Tbl.Y, then specify ResponseVarName as 'Y'. Otherwise, the software treats all columns of Tbl, including Tbl.Y, as predictors.

Data Types: char | string
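
As a minimal sketch of the two table-based syntaxes, the following code trains a model on a table built from the carbig data set and computes the loss with the response given first by name and then as a vector. The table layout and the choice of rng seed are assumptions made for illustration.

```matlab
load carbig
Tbl = table(Weight,Cylinders,Horsepower,Model_Year,MPG);
Tbl = rmmissing(Tbl);            % drop rows with missing values

rng(1)                           % for reproducibility
Mdl = fitrkernel(Tbl,'MPG');     % train with a table and a response variable name

L1 = loss(Mdl,Tbl,'MPG');        % response specified by variable name
L2 = loss(Mdl,Tbl,Tbl.MPG);      % response specified as a numeric vector
```

Both calls use the same predictors and responses, so L1 and L2 are expected to agree.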

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: L = loss(Mdl,X,Y,'LossFun','epsiloninsensitive','Weights',weights) returns the weighted regression loss using the epsilon-insensitive loss function.

`LossFun` — Loss function
`'mse'` (default) | `'epsiloninsensitive'` | function handle

Loss function, specified as the comma-separated pair consisting of 'LossFun' and a built-in loss function name or a function handle.

The following table lists the available loss functions. Specify one using its corresponding character vector or string scalar. In the table, $f (x) = T (x) β + b$, where:
- x is an observation (row vector) from p predictor variables.
- $T (\cdot)$ is a transformation of an observation (row vector) for feature expansion. T(x) maps x in $ℝ^{p}$ to a high-dimensional space ($ℝ^{m}$).
- β is a vector of m coefficients.
- b is the scalar bias.

Value	Description
`'epsiloninsensitive'`	Epsilon-insensitive loss: $ℓ [y, f (x)] = \max [0, | y - f (x) | - ε]$
`'mse'`	MSE: $ℓ [y, f (x)] = {[y - f (x)]}^{2}$

'epsiloninsensitive' is appropriate for SVM learners only.

Specify your own function by using function handle notation.
Let n be the number of observations in X. Your function must have this signature:
```
lossvalue = lossfun(Y,Yhat,W)
```
- The output argument lossvalue is a scalar.
- You choose the function name (lossfun).
- Y is an n-dimensional vector of observed responses. loss passes the input argument Y in for Y.
- Yhat is an n-dimensional vector of predicted responses, which is similar to the output of predict.
- W is an n-by-1 numeric vector of observation weights.
Specify your function using 'LossFun',@lossfun.

Data Types: char | string | function_handle
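
As an illustration of the required signature, the sketch below defines a weighted mean absolute error as an anonymous function and checks it on made-up vectors. The name maeloss and the toy data are hypothetical and do not come from the toolbox.

```matlab
% Hypothetical custom loss: weighted mean absolute error
maeloss = @(Y,Yhat,W) sum(W.*abs(Y - Yhat))/sum(W);

% Check on made-up column vectors
Ytoy    = [1; 2; 3];
YhatToy = [1; 4; 0];
Wtoy    = [1; 1; 2];
Ltoy = maeloss(Ytoy,YhatToy,Wtoy);   % (1*0 + 1*2 + 2*3)/4 = 2
```

Assuming a trained model Mdl and data X and Y exist in the workspace, you could pass the handle as loss(Mdl,X,Y,'LossFun',maeloss).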

`PredictionForMissingValue` — Predicted response value to use for observations with missing predictor values
`"median"` (default) | `"mean"` | `"omitted"` | numeric scalar

Since R2023b

Predicted response value to use for observations with missing predictor values, specified as "median", "mean", "omitted", or a numeric scalar.

Value	Description
`"median"`	`loss` uses the median of the observed response values in the training data as the predicted response value for observations with missing predictor values.
`"mean"`	`loss` uses the mean of the observed response values in the training data as the predicted response value for observations with missing predictor values.
`"omitted"`	`loss` excludes observations with missing predictor values from the loss computation.
Numeric scalar	`loss` uses this value as the predicted response value for observations with missing predictor values.

If an observation is missing an observed response value or an observation weight, then loss does not use the observation in the loss computation.

Example: "PredictionForMissingValue","omitted"

Data Types: single | double | char | string
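
The following sketch compares the default behavior with "omitted" on data where one predictor value has been set to NaN on purpose. It assumes R2023b or later; the variable names Xmiss, Lmedian, and Lomitted are illustrative.

```matlab
load carbig
R = rmmissing([Weight,Cylinders,Horsepower,Model_Year,MPG]);
X = R(:,1:4);
Y = R(:,end);

rng(1)                                   % for reproducibility
Mdl = fitrkernel(X,Y,'Standardize',true);

Xmiss = X;
Xmiss(1,2) = NaN;                        % illustrative: make one predictor value missing

Lmedian  = loss(Mdl,Xmiss,Y);            % default: training-set median as prediction
Lomitted = loss(Mdl,Xmiss,Y,"PredictionForMissingValue","omitted");
```

With "omitted", the first observation does not contribute to the loss; with the default, it contributes using the training-set median as its predicted response.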

`Weights` — Observation weights
`ones(size(X,1),1)` (default) | numeric vector | name of variable in `Tbl`

Observation weights, specified as the comma-separated pair consisting of 'Weights' and a numeric vector or the name of a variable in Tbl.

- If Weights is a numeric vector, then the size of Weights must be equal to the number of rows in X or Tbl.
- If Weights is the name of a variable in Tbl, you must specify Weights as a character vector or string scalar. For example, if the weights are stored as Tbl.W, then specify Weights as 'W'. Otherwise, the software treats all columns of Tbl, including Tbl.W, as predictors.

If you supply the observation weights, loss computes the weighted regression loss, that is, the Weighted Mean Squared Error or Epsilon-Insensitive Loss Function.

loss normalizes Weights to sum to 1.

Data Types: double | single | char | string
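
As a hedged sketch of how Weights enters the computation, the code below passes arbitrary positive weights to loss and compares the result with the weighted MSE computed by hand from predict. The weight vector and the names Lw and Lhand are assumptions made for illustration.

```matlab
load carbig
R = rmmissing([Weight,Cylinders,Horsepower,Model_Year,MPG]);
X = R(:,1:4);
Y = R(:,end);

rng(1)                                   % for reproducibility
Mdl = fitrkernel(X,Y);

w  = rand(size(X,1),1);                  % arbitrary positive weights
Lw = loss(Mdl,X,Y,'Weights',w);          % weighted MSE returned by loss

Yhat  = predict(Mdl,X);
Lhand = sum(w.*(Yhat - Y).^2)/sum(w);    % same quantity computed by hand
```

Because loss normalizes the weights to sum to 1, Lw and Lhand should agree up to floating-point rounding.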

Output Arguments

`L` — Regression loss
numeric scalar

Regression loss, returned as a numeric scalar. The interpretation of L depends on Weights and LossFun. For example, if you use the default observation weights and specify 'epsiloninsensitive' as the loss function, then L is the epsilon-insensitive loss.

More About

Weighted Mean Squared Error

The weighted mean squared error is calculated as follows:

$mse = \frac{\sum_{j = 1}^{n} w_{j} {(f (x_{j}) - y_{j})}^{2}}{\sum_{j = 1}^{n} w_{j}},$

where:

- n is the number of observations.
- $x_{j}$ is the jth observation (row of predictor data).
- $y_{j}$ is the observed response to $x_{j}$.
- $f (x_{j})$ is the response prediction of the Gaussian kernel regression model Mdl to $x_{j}$.
- w is the vector of observation weights.

By default, w is ones(n,1)/n, so each observation weight equals 1/n. You can specify different values for the observation weights by using the 'Weights' name-value argument. loss normalizes Weights to sum to 1.
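
The weighted MSE formula can be checked by hand on a few made-up numbers. The sketch below uses base MATLAB only; all variable names and values are illustrative. It also shows that rescaling the weights leaves the result unchanged, which is why normalizing them to sum to 1 is harmless.

```matlab
% Made-up responses, predictions, and weights
y    = [1; 2; 3];
yhat = [1.5; 2; 5];
w    = [1; 1; 2];

% Weighted MSE: sum of w_j*(f(x_j) - y_j)^2 divided by sum of w_j
mseW = sum(w.*(yhat - y).^2)/sum(w);               % (0.25 + 0 + 2*4)/4 = 2.0625

% Rescaling the weights does not change the result
mseScaled = sum((10*w).*(yhat - y).^2)/sum(10*w);
```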

Epsilon-Insensitive Loss Function

The epsilon-insensitive loss function ignores errors that are within the distance epsilon (ε) of the function value. The function is formally described as:

$Loss_{ε} = \begin{cases} 0, & \text{if } | y - f (x) | \leq ε \\ | y - f (x) | - ε, & \text{otherwise.} \end{cases}$

The mean epsilon-insensitive loss is calculated as follows:

$Loss = \frac{\sum_{j = 1}^{n} w_{j} \max (0, | y_{j} - f (x_{j}) | - ε)}{\sum_{j = 1}^{n} w_{j}},$

where:

- n is the number of observations.
- $x_{j}$ is the jth observation (row of predictor data).
- $y_{j}$ is the observed response to $x_{j}$.
- $f (x_{j})$ is the response prediction of the Gaussian kernel regression model Mdl to $x_{j}$.
- w is the vector of observation weights.
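
As a small worked instance of the mean epsilon-insensitive loss, the sketch below evaluates the formula on made-up values using base MATLAB only. The value of epsilon here is chosen arbitrarily for illustration; for a trained model, the corresponding value appears as the Epsilon property in the model display.

```matlab
% Made-up responses, predictions, weights, and epsilon
y       = [1; 2; 3];
f       = [1.2; 3; 0];
w       = ones(3,1);
epsilon = 0.5;

% Errors within epsilon of the prediction contribute zero loss
lossJ   = max(0, abs(y - f) - epsilon);   % per-observation losses: 0, 0.5, 2.5
lossEps = sum(w.*lossJ)/sum(w);           % (0 + 0.5 + 2.5)/3 = 1
```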

Extended Capabilities

Tall Arrays
Calculate with arrays that have more rows than fit in memory.

Usage notes and limitations:

loss does not support tall table data.

For more information, see Tall Arrays.

Version History

Introduced in R2018a

R2023b: Specify predicted response value to use for observations with missing predictor values

Starting in R2023b, when you predict or compute the loss, some regression models allow you to specify the predicted response value for observations with missing predictor values. Specify the PredictionForMissingValue name-value argument to use a numeric scalar, the training set median, or the training set mean as the predicted value. When computing the loss, you can also specify to omit observations with missing predictor values.

This table lists the object functions that support the PredictionForMissingValue name-value argument. By default, the functions use the training set median as the predicted response value for observations with missing predictor values.

Model Type	Model Objects	Object Functions
Gaussian process regression (GPR) model	`RegressionGP`, `CompactRegressionGP`	`loss`, `predict`, `resubLoss`, `resubPredict`
Gaussian process regression (GPR) model	`RegressionPartitionedGP`	`kfoldLoss`, `kfoldPredict`
Gaussian kernel regression model	`RegressionKernel`	`loss`, `predict`
Gaussian kernel regression model	`RegressionPartitionedKernel`	`kfoldLoss`, `kfoldPredict`
Linear regression model	`RegressionLinear`	`loss`, `predict`
Linear regression model	`RegressionPartitionedLinear`	`kfoldLoss`, `kfoldPredict`
Neural network regression model	`RegressionNeuralNetwork`, `CompactRegressionNeuralNetwork`	`loss`, `predict`, `resubLoss`, `resubPredict`
Neural network regression model	`RegressionPartitionedNeuralNetwork`	`kfoldLoss`, `kfoldPredict`
Support vector machine (SVM) regression model	`RegressionSVM`, `CompactRegressionSVM`	`loss`, `predict`, `resubLoss`, `resubPredict`
Support vector machine (SVM) regression model	`RegressionPartitionedSVM`	`kfoldLoss`, `kfoldPredict`

In previous releases, the regression model loss and predict functions listed above used NaN predicted response values for observations with missing predictor values. The software omitted observations with missing predictor values from the resubstitution ("resub") and cross-validation ("kfold") computations for prediction and loss.

R2022a: `loss` can return NaN for predictor data with missing values

The loss function no longer omits an observation with a NaN prediction when computing the weighted average regression loss. Therefore, loss can now return NaN when the predictor data X or the predictor variables in Tbl contain any missing values. In most cases, if the test set observations do not contain missing predictors, the loss function does not return NaN.

This change improves the automatic selection of a regression model when you use fitrauto. Before this change, the software might select a model (expected to best predict the responses for new data) with few non-NaN predictors.

If loss in your code returns NaN, you can update your code to avoid this result. Remove or replace the missing values by using rmmissing or fillmissing, respectively.
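
A minimal sketch of both options on made-up data follows. The fill value 0 is an arbitrary choice for illustration, and the final call is left as a comment because it assumes a trained model Mdl in the workspace.

```matlab
% Made-up predictor matrix with one missing value, and matching responses
X = [1 2; NaN 3; 4 5];
Y = [10; 20; 30];

% Option 1: drop observations with missing predictors
[Xclean,removed] = rmmissing(X);   % removed marks the dropped rows
Yclean = Y(~removed);

% Option 2: replace missing predictors with a constant (0 chosen arbitrarily)
Xfilled = fillmissing(X,'constant',0);

% With a trained model Mdl, you could then call, for example:
% L = loss(Mdl,Xclean,Yclean);
```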

The following table shows the regression models for which the loss object function might return NaN. For more details, see the Compatibility Considerations for each loss function.

Model Type	Full or Compact Model Object	`loss` Object Function
Gaussian process regression (GPR) model	`RegressionGP`, `CompactRegressionGP`	`loss`
Gaussian kernel regression model	`RegressionKernel`	`loss`
Linear regression model	`RegressionLinear`	`loss`
Neural network regression model	`RegressionNeuralNetwork`, `CompactRegressionNeuralNetwork`	`loss`
Support vector machine (SVM) regression model	`RegressionSVM`, `CompactRegressionSVM`	`loss`
