How can I identify the features of my data x that contribute most to the classification made by linear discriminant analysis?

Dear all, I would like to know whether it is possible to find out which features of a data set contribute most to the classification performed by linear discriminant analysis.
To make my question clearer, let's take the example available in MATLAB: Fisher's iris data.
Each row of the fisheriris data set contains one sample of an iris flower, and the columns hold the values of sepal length, sepal width, petal length, and petal width.
I would like to know which feature (for example sepal length, sepal width, or petal length) contributes most to the classification of a sample as setosa or virginica.
Can I obtain this information using classify or fitcdiscr?
I hope that my question is clear,
Thank you for your help,
Andrea
  2 comments
Krishna Bindumadhavan on 22 Nov 2017
Mathematically speaking, what measure do you have in mind for deciding which feature contributes the most? If you would like to visualize the effect of the various features, you could look at this example: https://www.mathworks.com/help/stats/create-and-visualize-discriminant-analysis-classifier.html .
Sand on 22 Nov 2017
Hi, thank you for your reply.
Let's take a concrete example. Assume that I have two groups of flowers, f1 = roses and f2 = jasmines. Each row represents one sample of rose/jasmine, and the columns give the petal size (column 1 = height and column 2 = width).
f1 = [1 2
2 3
1 4
2 3];
f2 = [1 7
2 9
1 10
2 8];
Let's assume that I build a discriminant model and use it to predict the class of a new sample x = [1 9]. The model tells me that the class of the new sample is f2. From this example it seems that the feature 'petal width' (column 2) is the most important for classifying the new sample. However, with 200 features (200 columns) or more, it becomes much harder to see which of these features contribute the most to the classification. So how can I obtain this information? Do the linear coefficients of the discriminant model convey it?
Thank you for your help
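
For reference, here is a minimal sketch of how the toy example above could be fitted and its linear coefficients inspected. It assumes the Statistics and Machine Learning Toolbox is available. The class labels 'f1' and 'f2', the variable names, and the rescaling by the column standard deviation are illustrative choices, not something prescribed in this thread or defined by the toolbox as an importance score.

```matlab
% Toy data from the comment above: rows are samples, columns are
% petal height (column 1) and petal width (column 2).
f1 = [1 2; 2 3; 1 4; 2 3];     % roses
f2 = [1 7; 2 9; 1 10; 2 8];    % jasmines

X = [f1; f2];
Y = [repmat({'f1'}, 4, 1); repmat({'f2'}, 4, 1)];

% Fit a discriminant model (the default discriminant type is linear).
Mdl = fitcdiscr(X, Y);

% Predict the class of the new sample from the comment.
label = predict(Mdl, [1 9]);

% Coefficients of the linear boundary between class f1 and class f2.
% Points x on the boundary satisfy Const + x*Linear = 0.
L = Mdl.Coeffs(1, 2).Linear;

% Heuristic (an assumption of this sketch): multiply each coefficient
% magnitude by the standard deviation of its column so that features
% measured on different scales become comparable.
scaledL = abs(L) .* std(X, 0, 1)';

disp(label)
disp(table({'height'; 'width'}, L, scaledL, ...
    'VariableNames', {'Feature', 'Coefficient', 'ScaledMagnitude'}))
```

In this toy data both classes have the same mean height, so one would expect the coefficient on column 1 to be close to zero and the boundary to be driven by column 2. With hundreds of columns, the same table can be sorted by the scaled magnitude.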


Accepted Answer

Bernhard Suhm on 13 Dec 2017
The magnitude of the coefficients is a measure of predictor importance. After training on normalized data (zero mean and unit variance), this measure is stored in the DeltaPredictor property. See also this answer from three years earlier: https://www.mathworks.com/matlabcentral/answers/119122-how-can-we-know-the-most-imortant-predictor-in-discriminant-analysis
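
As a rough sketch of what this could look like on the iris data from the question, assuming the Statistics and Machine Learning Toolbox is available (the variable names and the printed ranking are illustrative, not part of the original answer):

```matlab
load fisheriris    % provides meas (150-by-4) and species
featureNames = {'Sepal length', 'Sepal width', 'Petal length', 'Petal width'};

% Standardize each column to zero mean and unit variance.
Xz = zscore(meas);

% Fit a linear discriminant model on the standardized predictors.
Mdl = fitcdiscr(Xz, species);

% One value per predictor. A predictor's coefficients are set to zero
% once the model's Delta threshold exceeds its DeltaPredictor value,
% so larger values indicate predictors that stay in the model longer.
delta = Mdl.DeltaPredictor;

% List the predictors from largest to smallest value.
[~, order] = sort(delta, 'descend');
for k = order(:)'
    fprintf('%-13s %g\n', featureNames{k}, delta(k));
end
```

Because the predictors are standardized first, the values are on a common scale and can be compared across features. On raw data, a feature with a small numeric range can get a large coefficient without being more informative.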
