How can the Classification Learner app output a single tree after cross-validation?
Volkan Ozcan on 31 May 2022
Answered: Drew on 20 Oct 2022
Hello,
I am teaching myself machine learning and I do not understand how the Classification Learner app can output one tree even with cross-validation. I'll replicate the issue below:
When we build a tree with fitctree and cross-validation (let's say 5 folds), we get 5 different trees as an output.
cv_tree = fitctree(..., "KFold", 5);
cv_tree.Trained
ans =
5×1 cell array
{1×1 classreg.learning.classif.CompactClassificationTree}
{1×1 classreg.learning.classif.CompactClassificationTree}
{1×1 classreg.learning.classif.CompactClassificationTree}
{1×1 classreg.learning.classif.CompactClassificationTree}
{1×1 classreg.learning.classif.CompactClassificationTree}
We can make predictions using kfoldPredict. I guess it implements some kind of voting mechanism to extract one output from the 5 trees. Up to this point I can follow.
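For reference, here is a complete, runnable version of that command-line workflow, using the built-in fisheriris data purely for illustration (the actual predictors and response are elided in the snippet above):
load fisheriris                                % meas: 150x4 predictors, species: class labels
% 5-fold cross-validation returns a ClassificationPartitionedModel
cv_tree = fitctree(meas, species, "KFold", 5);
cv_tree.Trained                                % 5x1 cell array: one compact tree per fold
% Each observation is predicted by the fold model that was trained without it
labels = kfoldPredict(cv_tree);
err = kfoldLoss(cv_tree, "LossFun", "classiferror")   % cross-validated misclassification rate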
Now, I believe I do the same thing in the app:
After completing the training and exporting the model to the workspace using the Export Model button, we see that the trained model only has one tree inside it:
>> trainedModel.ClassificationTree
ans =
ClassificationTree
PredictorNames: ...
ResponseName: ...
CategoricalPredictors: ...
ClassNames: [0 1]
ScoreTransform: 'none'
NumObservations: ...
Properties, Methods
How could MATLAB combine the cross-validation trees into one single model?
Accepted Answer
Drew on 20 Oct 2022
The short answer is that the Classification Learner app does not export the k cross-validation models; it exports a final model trained on the entire training set.
This is described on the doc page: https://www.mathworks.com/help/stats/export-classification-model-for-use-with-new-data.html
in this highlighted note: "The final model Classification Learner exports is always trained using the full data set, excluding any data reserved for testing. The validation scheme that you use only affects the way that the app computes validation metrics. You can use the validation metrics and various plots that visualize results to pick the best model for your classification problem."
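A rough command-line equivalent of what the app does is sketched below, again with fisheriris standing in for your own training data (illustration only, not code taken from the app):
load fisheriris
% Final model: a single tree trained on all of the training data.
% This corresponds to what Export Model returns as trainedModel.ClassificationTree.
finalTree = fitctree(meas, species);
% Validation metrics only: cross-validate the same model specification to
% estimate accuracy. This does not change the exported model above.
cvModel = crossval(finalTree, "KFold", 5);
validationAccuracy = 1 - kfoldLoss(cvModel, "LossFun", "classiferror")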
More Answers (1)
Abhijeet on 1 Jul 2022
Hi Volkan,
I understand that you trained the model using cross-validation, once with the fitctree() function and once using the Classification Learner app, and you are wondering why the outputs of the two approaches differ.
The logic behind the output generated by the Classification Learner app becomes clear if you generate a function from the trained model.
Follow the steps below to generate a function from your trained model:
- Open the Classification Learner App
- Create a new session by importing your data and choose "Cross-Validation" as the validation scheme, with a positive number of folds, say 5
- Train the model
- Click the Export button on the top panel and generate a function for your trained model
Now, using the generated function, you can understand the logic behind the output.
In the generated function you will find a variable named "partitionedModel", which is exactly the kind of cross-validated model you get from fitctree() with the "KFold" option.
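As a rough guide, the two key steps inside the generated training function look something like the sketch below (simplified, not the verbatim generated code; the predictors and response variables are assembled from your imported data inside that function):
% Final model: one tree fit on the full training data; this becomes the
% exported trainedClassifier.ClassificationTree
classificationTree = fitctree(predictors, response);
trainedClassifier.ClassificationTree = classificationTree;
% Validation only: cross-validate the same specification to report accuracy.
% partitionedModel is the same kind of object that fitctree with "KFold" returns.
partitionedModel = crossval(trainedClassifier.ClassificationTree, "KFold", 5);
validationAccuracy = 1 - kfoldLoss(partitionedModel, "LossFun", "classiferror");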