predict.randomForest {randomForest} | R Documentation |
Prediction of test data using random forest.
## S3 method for class 'randomForest':
predict(object, newdata, type="response", norm.votes=TRUE, predict.all=FALSE, proximity=FALSE, nodes=FALSE, cutoff, ...)
object |
an object of class randomForest, as
created by the function randomForest. |
newdata |
a data frame or matrix containing new data. (Note: If
not given, the out-of-bag prediction in object is returned.) |
type |
one of "response", "prob", or "vote",
indicating the type of output: predicted values, matrix of class
probabilities, or matrix of vote counts. "class" is allowed, but
automatically converted to "response", for backward compatibility. |
norm.votes |
Should the vote counts be normalized (i.e.,
expressed as fractions)? Ignored if object$type is
regression. |
predict.all |
Should the predictions of all trees be kept? |
proximity |
Should proximity measures be computed? An error is
issued if object$type is regression. |
nodes |
Should the terminal node indicators (an n by ntree matrix) be returned? If so, they are in the "nodes" attribute of the returned object. |
cutoff |
(Classification only) A vector of length equal to the
number of classes. The "winning" class for an observation is the
one with the maximum ratio of proportion of votes to cutoff.
Default is taken from the forest$cutoff component of
object (i.e., the setting used when running
randomForest). |
... |
not used currently. |
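The following is a minimal sketch of two of the arguments above: omitting newdata to get the out-of-bag predictions, and overriding cutoff at prediction time. The cutoff values and the variable names (oob, cutoff.vec, shifted) are illustrative assumptions, not recommended settings.

```r
library(randomForest)

data(iris)
set.seed(111)
iris.rf <- randomForest(Species ~ ., data = iris)

## No newdata: the out-of-bag predictions stored in the fitted object are returned.
oob <- predict(iris.rf)
table(observed = iris$Species, oob = oob)

## Illustrative cutoff vector (an assumption): one positive entry per class,
## in the order of levels(iris$Species), chosen here to sum to 1.
## A smaller cutoff raises that class's votes/cutoff ratio.
cutoff.vec <- c(0.5, 0.25, 0.25)
shifted <- predict(iris.rf, iris, cutoff = cutoff.vec)
table(default = predict(iris.rf, iris), shifted = shifted)
```

Because the winning class maximizes the ratio of vote proportion to cutoff, lowering a class's cutoff relative to the others can move borderline observations into that class.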
If object$type is regression, a vector of predicted values is
returned. If predict.all=TRUE, then the returned object is a list of
two components: aggregate, which is the vector of values predicted by
the forest, and individual, which is a matrix where each column
contains the predictions of one tree in the forest.
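As a rough illustration of the regression case, the sketch below fits a small forest to the built-in airquality data (the data set, ntree = 50, and the variable names are assumptions made for the example) and inspects the list returned with predict.all=TRUE.

```r
library(randomForest)

set.seed(1)
air <- na.omit(airquality)              # drop rows with missing values
air.rf <- randomForest(Ozone ~ ., data = air, ntree = 50)

p <- predict(air.rf, air[1:5, ], predict.all = TRUE)
str(p)                                  # list with components aggregate and individual

## One row per new observation, one column per tree.
dim(p$individual)

## With the default settings (no bias correction) the forest prediction
## should match the average of the per-tree predictions.
all.equal(unname(p$aggregate), unname(rowMeans(p$individual)))
```

Keeping the per-tree matrix is useful when the spread of the tree predictions is of interest, for example apply(p$individual, 1, sd).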
If object$type is classification, the object returned depends on the
argument type:
response |
predicted classes (the classes with majority vote). |
prob |
matrix of class probabilities (one column for each class and one row for each input). |
vote |
matrix of vote counts (one column for each class
and one row for each new input); either in raw counts or in fractions
(if norm.votes=TRUE). |
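The three output types can be compared side by side. This is a sketch on the iris split used elsewhere on this page; the variable names (test, cls, prb, raw) are chosen for the example.

```r
library(randomForest)

data(iris)
set.seed(111)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
iris.rf <- randomForest(Species ~ ., data = iris[ind == 1, ])
test <- iris[ind == 2, ]

cls <- predict(iris.rf, test, type = "response")                 # factor of predicted classes
prb <- predict(iris.rf, test, type = "prob")                     # class probabilities
raw <- predict(iris.rf, test, type = "vote", norm.votes = FALSE) # raw vote counts

head(cls)
head(prb)
head(raw)

## Every tree votes on every new observation, so each row of raw counts
## sums to the number of trees, and each row of probabilities sums to 1.
range(rowSums(unclass(raw)))
range(rowSums(unclass(prb)))
```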
If predict.all=TRUE, then the individual component of the returned
object is a character matrix where each column contains the predicted
class by a tree in the forest.
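A short sketch of the classification case with predict.all=TRUE, again on the iris split (variable names are assumptions for the example). It tabulates the per-tree class labels for one test observation.

```r
library(randomForest)

data(iris)
set.seed(111)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
iris.rf <- randomForest(Species ~ ., data = iris[ind == 1, ])
test <- iris[ind == 2, ]

p <- predict(iris.rf, test, predict.all = TRUE)

## individual: one row per test observation, one column per tree,
## each entry a class label stored as a character string.
dim(p$individual)
is.character(p$individual)

## Votes cast by the individual trees for the first test observation,
## next to the forest's aggregate prediction for it.
table(p$individual[1, ])
p$aggregate[1]
```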
If proximity=TRUE, the returned object is a list with two
components: pred is the prediction (as described above) and
proximity is the proximity matrix. An error is issued if
object$type is regression.
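The sketch below requests proximities for new data in a classification forest. To avoid depending on the exact name of the prediction component, it takes the first list element by position; that choice, and the variable names, are assumptions of this example.

```r
library(randomForest)

data(iris)
set.seed(111)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
iris.rf <- randomForest(Species ~ ., data = iris[ind == 1, ])
test <- iris[ind == 2, ]

p <- predict(iris.rf, test, proximity = TRUE)
names(p)

pred <- p[[1]]           # predictions, taken by position
prox <- p$proximity      # square matrix, one row and column per test observation

dim(prox)
range(prox)              # proximities are fractions of trees, so they lie in [0, 1]
```

A proximity matrix of this kind can be passed on to, for example, cmdscale(1 - prox) to get a low-dimensional view of the new observations.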
If nodes=TRUE, the returned object has a "nodes" attribute,
which is an n by ntree matrix, each column containing the node number
that the cases fall in for that tree.
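A minimal sketch of retrieving the terminal node matrix with attr(); the variable names are chosen for the example.

```r
library(randomForest)

data(iris)
set.seed(111)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
iris.rf <- randomForest(Species ~ ., data = iris[ind == 1, ])
test <- iris[ind == 2, ]

p <- predict(iris.rf, test, nodes = TRUE)
node.mat <- attr(p, "nodes")

## One row per test observation, one column per tree.
dim(node.mat)

## Terminal node numbers of the first three observations in the first five trees.
node.mat[1:3, 1:5]
```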
Andy Liaw andy_liaw@merck.com and Matthew Wiener matthew_wiener@merck.com, based on original Fortran code by Leo Breiman and Adele Cutler.
Breiman, L. (2001), Random Forests, Machine Learning 45(1), 5-32.
library(randomForest)
data(iris)
set.seed(111)
## Random split: roughly 80% training (ind == 1) and 20% test (ind == 2)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
iris.rf <- randomForest(Species ~ ., data = iris[ind == 1, ])
iris.pred <- predict(iris.rf, iris[ind == 2, ])
table(observed = iris[ind == 2, "Species"], predicted = iris.pred)