# margin

Margin of k-nearest neighbor classifier

## Description

m = margin(mdl,tbl,ResponseVarName) returns the classification margins for mdl, using the predictor data in tbl and the true class labels in tbl.ResponseVarName. If tbl contains the response variable used to train mdl, then you do not need to specify ResponseVarName.

m is returned as a numeric vector of length size(tbl,1). Each entry in m represents the margin for the corresponding row of tbl and the corresponding true class label in tbl.ResponseVarName, computed using mdl.

m = margin(mdl,tbl,Y) returns the classification margins for mdl, using the predictor data in tbl and the true class labels in Y.

m = margin(mdl,X,Y) returns the classification margins for mdl, using the predictor data in X and the true class labels in Y. m is returned as a numeric vector of length size(X,1).

## Examples

Create a k-nearest neighbor classifier for the Fisher iris data, where $k$ = 5.

Load the Fisher iris data set.

load fisheriris

Create a classifier for five nearest neighbors.

mdl = fitcknn(meas,species,'NumNeighbors',5);

Examine the margin of the classifier for a mean observation classified as 'versicolor'.

X = mean(meas);
Y = {'versicolor'};
m = margin(mdl,X,Y)
m = 1

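The margin is 1 because all five nearest neighbors are classified as 'versicolor': the score for the true class is 1, and the score for every other class is 0.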
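As a rough cross-check, the same margin can be reproduced from the scores that predict returns. This is a minimal sketch that assumes Statistics and Machine Learning Toolbox is available; the variable names score, isTrue, and mManual are illustrative and not part of the original example.

```matlab
% Recreate the classifier from the example.
load fisheriris
mdl = fitcknn(meas,species,'NumNeighbors',5);
X = mean(meas);

% predict returns one score per class; columns follow mdl.ClassNames.
[~,score] = predict(mdl,X);

% Logical mask that selects the score of the true class.
isTrue = strcmp(mdl.ClassNames,'versicolor')';

% Margin = true-class score minus the largest score among the other classes.
mManual = score(isTrue) - max(score(~isTrue))
```

If the sketch is right, mManual should agree with the value of m returned by margin.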

## Input Arguments

### mdl

k-nearest neighbor classifier model, specified as a ClassificationKNN object.

### tbl

Sample data for which to compute margins, specified as a table. Each row of tbl corresponds to one observation, and each column corresponds to one predictor variable. Optionally, tbl can contain one additional column for the response variable. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed.

If tbl contains the response variable used to train mdl, then you do not need to specify ResponseVarName or Y.

If you train mdl using sample data contained in a table, then the input data for margin must also be in a table.

Data Types: table

### ResponseVarName

Response variable name, specified as the name of a variable in tbl. If tbl contains the response variable used to train mdl, then you do not need to specify ResponseVarName.

You must specify ResponseVarName as a character vector or string scalar. For example, if the response variable is stored as tbl.response, then specify it as 'response'. Otherwise, the software treats all columns of tbl, including tbl.response, as predictors.

The response variable must be a categorical, character, or string array, logical or numeric vector, or cell array of character vectors. If the response variable is a character array, then each row of the array must hold the class label of one observation.

Data Types: char | string
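The table syntax can be sketched as follows. The column names (SepalLength and so on) and the response name Species are hypothetical choices made for this illustration; only fitcknn, margin, and the fisheriris data come from the documentation itself.

```matlab
load fisheriris

% Build a table of predictors; the variable names are chosen for this sketch.
tbl = array2table(meas,'VariableNames', ...
    {'SepalLength','SepalWidth','PetalLength','PetalWidth'});

% Store the class labels in the same table as the response variable.
tbl.Species = species;

% Train on the table, naming the response variable.
mdl = fitcknn(tbl,'Species','NumNeighbors',5);

% Because mdl was trained on a table, margin also takes a table.
m = margin(mdl,tbl,'Species');
```

Here m should contain one margin per row of tbl.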

### X

Predictor data, specified as a numeric matrix. Each row of X represents one observation, and each column represents one variable.

Data Types: single | double

### Y

Class labels, specified as a categorical, character, or string array, logical or numeric vector, or cell array of character vectors. Each row of Y represents the classification of the corresponding row of X or tbl.

Data Types: categorical | char | string | logical | single | double | cell

## More About

### Margin

The classification margin for each observation is the difference between the classification score for the true class and the maximal classification score for the false classes.
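In symbols, this definition can be written as follows. The notation is introduced here for illustration only: $s_c(x)$ denotes the classification score of class $c$ for observation $x$, and $y$ denotes the true class of $x$.

$$
m(x) = s_y(x) - \max_{c \neq y} s_c(x)
$$

Because each score is a posterior probability between 0 and 1, the margin lies between -1 and 1. A negative margin means that some false class scored higher than the true class.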

### Score

The score of a classification is the posterior probability of the classification. The posterior probability is the number of neighbors with that classification divided by the number of neighbors. For a more detailed definition that includes weights and prior probabilities, see Posterior Probability.
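The simple form of the score can be sketched by counting neighbors directly. This illustration assumes the default Euclidean distance and deliberately ignores observation weights and prior probabilities, so it only mirrors the unweighted definition given here; the variable names idx, neighborLabels, and scoreVersicolor are illustrative.

```matlab
load fisheriris
k = 5;
X = mean(meas);

% Indices of the k nearest training observations (Euclidean distance by default).
idx = knnsearch(meas,X,'K',k);

% Labels of those neighbors.
neighborLabels = species(idx);

% Unweighted score: fraction of the k neighbors labeled 'versicolor'.
scoreVersicolor = sum(strcmp(neighborLabels,'versicolor'))/k
```

For a classifier trained with non-default weights or priors, the score returned by predict can differ from this simple count.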