quantilePredict
Predict response quantile using bag of regression trees
Syntax
Description
returns a vector of medians of the predicted responses at YFit
= quantilePredict(Mdl
,X
)X
, a table or matrix of predictor data, and using the bag of regression trees Mdl
. Mdl
must be a TreeBagger
model object.
uses additional options specified by one or more YFit
= quantilePredict(Mdl
,X
,Name,Value
)Name,Value
pair arguments. For example, specify quantile probabilities or which trees to include for quantile estimation.
[
also returns a sparse matrix of response weights.YFit
,YW
]
= quantilePredict(___)
Input Arguments
Output Arguments
Examples
More About
Tips
quantilePredict
estimates the conditional distribution of the response using the training data every time you call it. To predict many quantiles efficiently, or quantiles for many observations efficiently, you should pass X
as a matrix or table of observations and specify all quantiles in a vector using the Quantile
name-value pair argument. That is, avoid calling quantilePredict
within a loop.
Algorithms
The
TreeBagger
grows a random forest of regression trees using the training data. Then, to implement quantile random forest,quantilePredict
predicts quantiles using the empirical conditional distribution of the response given an observation from the predictor variables. To obtain the empirical conditional distribution of the response:quantilePredict
passes all the training observations inMdl.X
through all the trees in the ensemble, and stores the leaf nodes of which the training observations are members.quantilePredict
similarly passes each observation inX
through all the trees in the ensemble.For each observation in
X
,quantilePredict
:Estimates the conditional distribution of the response by computing response weights for each tree.
For observation k in
X
, aggregates the conditional distributions for the entire ensemble:n is the number of training observations (
size(Y,1)
) and T is the number of trees in the ensemble (Mdl.NumTrees
).
For observation k in
X
, the τ quantile or, equivalently, the 100τ% percentile, is
This process describes how
quantilePredict
uses all specified weights.For all training observations j = 1,...,n and all chosen trees t = 1,...,T,
quantilePredict
attributes the product vtj = btjwj,obs to training observation j (stored inMdl.X(
andj
,:)Mdl.Y(
). btj is the number of times observation j is in the bootstrap sample for tree t. wj,obs is the observation weight inj
)Mdl.W(
.j
)For each chosen tree,
quantilePredict
identifies the leaves in which each training observation falls. Let St(xj) be the set of all observations contained in the leaf of tree t of which observation j is a member.For each chosen tree,
quantilePredict
normalizes all weights within a particular leaf to sum to 1, that is,For each training observation and tree,
quantilePredict
incorporates tree weights (wt,tree) specified byTreeWeights
, that is, w*tj,tree = wt,treevtj*Trees not chosen for prediction have 0 weight.For all test observations k = 1,...,K in
X
and all chosen trees t = 1,...,TquantilePredict
predicts the unique leaves in which the observations fall, and then identifies all training observations within the predicted leaves.quantilePredict
attributes the weight utj such thatquantilePredict
sums the weights over all chosen trees, that is,quantilePredict
creates response weights by normalizing the weights so that they sum to 1, that is,
References
[1] Breiman, L. "Random Forests." Machine Learning 45, pp. 5–32, 2001.
[2] Meinshausen, N. “Quantile Regression Forests.” Journal of Machine Learning Research, Vol. 7, 2006, pp. 983–999.
Version History
Introduced in R2016b