What's the difference between the DeltaCritDecisionSplit property and Gini's Diversity Index?
I'm implementing Random Forests code for selecting the most important predictors for my application. The TreeBagger webinar has two examples of ways of estimating predictor importance (DeltaCritDecisionSplit, OOBPermutedVarDeltaError). Is DeltaCritDecisionSplit like the Gini diversity index (of predictorImportance)? If not, how are they different?
Answers (2)
Ilya
19 Jan 2012
0 votes
Yes, the DeltaCritDecisionSplit property of TreeBagger is the equivalent of the predictorImportance method for an ensemble produced by the fitensemble function. It is obtained by summing the impurity gain over all splits on a given predictor. Gini is the default impurity measure for classification trees.
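To make the per-split quantity concrete, here is a minimal sketch (in Python rather than MATLAB, and not the actual TreeBagger implementation) of the Gini diversity index and the impurity gain of one binary split; `gini` and `split_gain` are hypothetical helper names:

```python
from collections import Counter

def gini(labels):
    # Gini diversity index: 1 - sum of squared class proportions.
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_gain(parent, left, right):
    # Impurity decrease from splitting parent into left/right children,
    # weighting each child's impurity by its fraction of the parent.
    # This per-split gain is what gets summed over all splits on a
    # predictor to form an impurity-based importance estimate.
    n = len(parent)
    return (gini(parent)
            - (len(left) / n) * gini(left)
            - (len(right) / n) * gini(right))

# A perfectly mixed node has impurity 0.5; a pure split removes all of it.
mixed = ['a', 'a', 'b', 'b']
print(gini(mixed))                                 # 0.5
print(split_gain(mixed, ['a', 'a'], ['b', 'b']))   # 0.5
```

A split that leaves the children as mixed as the parent would score a gain of 0, so predictors that never produce useful splits accumulate little or no importance.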
1 comment
Offer
19 Jan 2012
Ilya
19 Jan 2012
0 votes
Predictor importance estimates for every tree in an ensemble are added together. The sum is then divided by the number of trees. This means that the estimates are comparable only if the two ensembles are composed of trees of roughly the same depth (that is, trees using roughly the same number of splits). Boosted trees by default use stumps (one-split trees), and many predictors may never be split on. Bagged trees by default are deep, and most predictors get many splits.
In general, comparing predictor importance estimates across ensembles of different types may not produce anything useful. These estimates can only tell you which predictors are important for that particular ensemble.
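The averaging described above can be sketched as follows (a simplified illustration, not the TreeBagger code; `ensemble_importance` and the example numbers are made up). It shows why a stump ensemble concentrates importance on few predictors while leaving the rest at zero:

```python
def ensemble_importance(per_tree_gains, n_predictors):
    # Average per-predictor impurity-gain sums over all trees.
    # per_tree_gains: one dict per tree, mapping predictor index ->
    # total impurity gain from splits on that predictor in that tree.
    # A predictor never split on in a tree contributes 0 for that tree.
    n_trees = len(per_tree_gains)
    totals = [0.0] * n_predictors
    for tree in per_tree_gains:
        for j, gain in tree.items():
            totals[j] += gain
    return [t / n_trees for t in totals]

# Three stump-like trees, each splitting on a single predictor:
stumps = [{0: 0.5}, {0: 0.4}, {1: 0.3}]
print(ensemble_importance(stumps, 3))  # [0.3, 0.1, 0.0]
```

Predictor 2 was never split on, so its averaged importance is exactly zero; with deep bagged trees, most predictors would instead pick up some gain in most trees.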
1 comment
Offer
20 Jan 2012