\documentclass[nojss]{jss}

%\VignetteIndexEntry{Constant Partying: Growing and Handling Trees with Constant Fits}
%\VignetteDepends{partykit, rpart, RWeka, pmml, datasets}
%\VignetteKeywords{recursive partitioning, regression trees, classification trees, decision trees}
%\VignettePackage{partykit}

%% packages
\usepackage{amstext}
\usepackage{amsfonts}
\usepackage{amsmath}
\usepackage{thumbpdf}
\usepackage{rotating}
%% need no \usepackage{Sweave}

%% additional commands
\newcommand{\squote}[1]{`{#1}'}
\newcommand{\dquote}[1]{``{#1}''}
\newcommand{\fct}[1]{{\texttt{#1()}}}
\newcommand{\class}[1]{\dquote{\texttt{#1}}}
\newcommand{\fixme}[1]{\emph{\marginpar{FIXME} (#1)}}

%% further commands
\renewcommand{\Prob}{\mathbb{P} }
\renewcommand{\E}{\mathbb{E}}
\newcommand{\V}{\mathbb{V}}
\newcommand{\Var}{\mathbb{V}}

\hyphenation{Qua-dra-tic}

\title{Constant Partying: Growing and Handling Trees with Constant Fits}
\author{Torsten Hothorn\\Universit\"at Z\"urich \And Achim Zeileis\\Universit\"at Innsbruck}
\Plainauthor{Torsten Hothorn, Achim Zeileis}

\Abstract{
This vignette describes infrastructure for regression and classification trees with simple constant fits in each of the terminal nodes. Thus, all observations that are predicted to be in the same terminal node also receive the same prediction, e.g., a mean for numeric responses or proportions for categorical responses. This class of trees is very common and includes all traditional tree variants (AID, CHAID, CART, C4.5, FACT, QUEST) and also more recent approaches like CTree. Trees inferred by any of these algorithms could in principle be represented by objects of class \class{constparty} in \pkg{partykit}, which then provides unified methods for printing, plotting, and predicting. Here, we describe how one can create \class{constparty} objects by (a)~coercion from other \proglang{R} classes, (b)~parsing of XML descriptions of trees learned in other software systems, and (c)~learning a tree using one's own algorithm.
}
\Keywords{recursive partitioning, regression trees, classification trees, decision trees}

\Address{
Torsten Hothorn\\
Institut f\"ur Epidemiologie, Biostatistik und Pr\"avention \\
Universit\"at Z\"urich \\
Hirschengraben 84\\
CH-8001 Z\"urich, Switzerland \\
E-mail: \email{Torsten.Hothorn@R-project.org}\\
URL: \url{http://user.math.uzh.ch/hothorn/}\\

Achim Zeileis\\
Department of Statistics \\
Faculty of Economics and Statistics \\
Universit\"at Innsbruck \\
Universit\"atsstr.~15 \\
6020 Innsbruck, Austria \\
E-mail: \email{Achim.Zeileis@R-project.org}\\
URL: \url{http://eeecon.uibk.ac.at/~zeileis/}
}

\begin{document}

\setkeys{Gin}{width=\textwidth}

\SweaveOpts{engine=R, eps=FALSE, keep.source=TRUE, eval=TRUE}

<<setup, echo=FALSE, results=hide>>=
suppressWarnings(RNGversion("3.5.2"))
options(width = 70)
library("partykit")
set.seed(290875)
@

\section{Classes and methods} \label{sec:classes}

This vignette describes the handling of trees with constant fits in the terminal nodes. This class of regression models includes most classical tree algorithms like AID \citep{Morgan+Sonquist:1963}, CHAID \citep{Kass:1980}, CART \citep{Breiman+Friedman+Olshen:1984}, FACT \citep{Loh+Vanichsetakul:1988}, QUEST \citep{Loh+Shih:1997}, C4.5 \citep{Quinlan:1993}, CTree \citep{Hothorn+Hornik+Zeileis:2006}, etc. In this class of tree models, one can compute simple predictions for new observations, such as the conditional mean in a regression setup, from the responses of those learning sample observations in the same terminal node.
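For the conditional mean, for example, the prediction for a new observation $x$ falling into terminal node $b$ is the (case-weighted) average of the learning sample responses in that node,
\[
\hat{y}(x) = \frac{\sum_{i = 1}^{n} w_i \mathbf{1}(x_i \in b) \, y_i}{\sum_{i = 1}^{n} w_i \mathbf{1}(x_i \in b)},
\]
where $\mathbf{1}(\cdot)$ denotes the indicator function and the case weights $w_i$ are all equal to one in the unweighted case.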
Therefore, such predictions can easily be computed if the following pieces of information are available: the observed responses in the learning sample, the terminal node IDs assigned to the observations in the learning sample, and potentially associated weights (if any). In \pkg{partykit} it is easy to create a \class{party} object that contains these pieces of information, yielding a \class{constparty} object. The technical details of the \class{party} class are discussed in Section~3.4 of \code{vignette("partykit", package = "partykit")}. In addition to the elements required for any \class{party}, a \class{constparty} needs to have variables \code{(fitted)} and \code{(response)} (and \code{(weights)}, if applicable) in the \code{fitted} data frame, along with the \code{terms} for the model. If such a \class{party} has been created, its properties can be checked and it can be coerced to class \class{constparty} by the \fct{as.constparty} function (a minimal construction of this kind is sketched at the end of this section).

Note that with such a \class{constparty} object it is possible to compute all kinds of predictions from the subsample in a given terminal node. For example, instead of the mean response, the median (or any other quantile) could be employed. Similarly, for a categorical response the predicted probabilities (i.e., relative frequencies) can be computed, or the corresponding mode, or a ranking of the levels, etc.

If the full response from the learning sample is not available but only the constant fit from each terminal node, then a \class{constparty} cannot be set up. Specifically, this is the case for trees saved in the XML format PMML \citep[Predictive Model Markup Language,][]{DMG:2014}, which does not provide the full learning sample. To also support such constant-fit trees based on simpler information, \pkg{partykit} provides the \class{simpleparty} class. Inspired by the PMML format, this requires that the \code{info} of every node in the tree provides list elements \code{prediction}, \code{n}, \code{error}, and \code{distribution}. For classification trees these should contain the following node-specific information: the single predicted factor level, the learning sample size, the misclassification error (in \%), and the absolute frequencies of all levels. For regression trees the contents should be: the predicted mean, the learning sample size, the error sum of squares, and \code{NULL}. The function \fct{as.simpleparty} can also coerce \class{constparty} trees to \class{simpleparty} trees by computing the above summary statistics from the full response associated with each node of the tree.

The remainder of this vignette consists of the following parts: In Section~\ref{sec:coerce} we assume that the trees were fitted using some other software (either within or outside of \proglang{R}) and we describe how these models can be coerced to \class{party} objects using either the \class{constparty} or \class{simpleparty} class. Emphasis is given to displaying such trees in textual and graphical ways. Subsequently, in Section~\ref{sec:mytree}, we show how a simple classification tree algorithm can easily be implemented using the \pkg{partykit} tools, yielding a \class{constparty} object. Section~\ref{sec:prediction} shows how to compute predictions in both scenarios before Section~\ref{sec:conclusion} gives a brief conclusion.
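As announced above, the following minimal sketch (not run) makes these requirements concrete for a small artificial data set: a stump with a single binary split is set up by hand, combined with the \code{(fitted)} and \code{(response)} information, and coerced to class \class{constparty}. The data set \code{d} and all object names are made up purely for illustration.

<<constparty-sketch, eval=FALSE>>=
## artificial learning sample with factor response y
d <- data.frame(x = gl(2, 50),
  y = gl(2, 25, length = 100, labels = c("A", "B")))
## stump sending the two levels of x to two daughter nodes
stump <- partynode(1L, split = partysplit(1L, index = 1:2),
  kids = list(partynode(2L), partynode(3L)))
## party object with terms and (fitted)/(response) information ...
py <- party(stump, data = d,
  fitted = data.frame(
    "(fitted)" = fitted_node(stump, data = d),
    "(response)" = d$y,
    check.names = FALSE),
  terms = terms(y ~ x, data = d))
## ... can be checked and coerced via as.constparty()
as.constparty(py)
@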
\section{Coercing tree objects} \label{sec:coerce}

For the illustrations, we use the Titanic data set from package \pkg{datasets}, consisting of four variables on each of the $2201$ Titanic passengers: gender (male, female), age (child, adult), class (1st, 2nd, 3rd, or crew), and survival status, set up as follows:

<<titanic>>=
data("Titanic", package = "datasets")
ttnc <- as.data.frame(Titanic)
ttnc <- ttnc[rep(1:nrow(ttnc), ttnc$Freq), 1:4]
names(ttnc)[2] <- "Gender"
@

The response variable describes whether or not the passenger survived the sinking of the ship.

\subsection{Coercing rpart objects}

We first fit a classification tree by means of the \fct{rpart} function from package \pkg{rpart} \citep{rpart} to this data set (make sure to set \code{model = TRUE}; otherwise \code{model.frame.rpart} will return the \code{rpart} object and not the data):

<<rpart>>=
library("rpart")
(rp <- rpart(Survived ~ ., data = ttnc, model = TRUE))
@

The \class{rpart} object \code{rp} can be coerced to a \class{constparty} by \fct{as.party}. Internally, this transforms the tree structure of the \class{rpart} tree to a \class{partynode} and combines it with the associated learning sample as described in Section~\ref{sec:classes}. All of this is done automatically by

<<rpart-party>>=
(party_rp <- as.party(rp))
@

Now, instead of the print method for \class{rpart} objects, the print method for \class{constparty} objects creates a textual display of the tree structure. In a similar way, the corresponding \fct{plot} method produces a graphical representation of this tree, see Figure~\ref{party_plot}.

\begin{figure}[p!]
\centering
<<rpart-plot-orig, echo=FALSE, fig=TRUE>>=
plot(rp)
text(rp)
@
<<rpart-plot, echo=FALSE, fig=TRUE>>=
plot(party_rp)
@
\caption{\class{rpart} tree of Titanic data plotted using \pkg{rpart} (top) and \pkg{partykit} (bottom) infrastructure. \label{party_plot}}
\end{figure}

By default, the \fct{predict} method for \class{rpart} objects computes conditional class probabilities. The same numbers are returned by the \fct{predict} method for \Sexpr{class(party_rp)[1L]} objects with the \code{type = "prob"} argument (see Section~\ref{sec:prediction} for more details):

<<rpart-pred>>=
all.equal(predict(rp), predict(party_rp, type = "prob"),
  check.attributes = FALSE)
@

Predictions are computed based on the \code{fitted} slot of a \class{constparty} object

<<rpart-fitted>>=
str(fitted(party_rp))
@

which contains the terminal node numbers and the response for each of the training samples. So, the conditional class probabilities for each terminal node can be computed via

<<rpart-prob>>=
prop.table(do.call("table", fitted(party_rp)), 1)
@

Optionally, weights can be stored in the \code{fitted} slot as well.

\subsection{Coercing J48 objects}

The \pkg{RWeka} package \citep{RWeka} provides an interface to the \pkg{Weka} machine learning library and we can use the \fct{J48} function to fit a J4.8 tree to the Titanic data:

<<j48>>=
if (require("RWeka")) {
  j48 <- J48(Survived ~ ., data = ttnc)
} else {
  j48 <- rpart(Survived ~ ., data = ttnc)
}
print(j48)
@

This object can be coerced to a \class{party} object using

<<j48-party>>=
(party_j48 <- as.party(j48))
@

and, again, the print method from the \pkg{partykit} package creates a textual display. Note that, unlike the \class{rpart} trees, this tree includes multiway splits. The \fct{plot} method draws this tree, see Figure~\ref{J48_plot}.

\begin{sidewaysfigure}
\centering
<<j48-plot, echo=FALSE, fig=TRUE>>=
plot(party_j48)
@
\caption{\class{J48} tree of Titanic data plotted using \pkg{partykit} infrastructure.
\label{J48_plot}}
\end{sidewaysfigure}

The conditional class probabilities computed by the \fct{predict} methods implemented in packages \pkg{RWeka} and \pkg{partykit} are equivalent:

<<j48-pred>>=
all.equal(predict(j48, type = "prob"),
  predict(party_j48, type = "prob"),
  check.attributes = FALSE)
@

In addition to \fct{J48}, \pkg{RWeka} provides several other tree learners, e.g., \fct{M5P} implementing M5' and \fct{LMT} implementing logistic model trees. These can also be coerced using \fct{as.party}. However, as these are not constant-fit trees, this yields plain \class{party} trees with some character information stored in the \code{info} slot.

\subsection{Importing trees from PMML files}

The previous two examples showed how trees learned by other \proglang{R} packages can be handled in a unified way using \pkg{partykit}. Additionally, \pkg{partykit} can be used to import trees from any other software package that supports the PMML (Predictive Model Markup Language) format. As an example, we used \proglang{SPSS} to fit a QUEST tree to the Titanic data and exported it from \proglang{SPSS} in PMML format. This file is shipped along with the \pkg{partykit} package and we can read it as follows:

<<quest-pmml>>=
ttnc_pmml <- file.path(system.file("pmml", package = "partykit"),
  "ttnc.pmml")
(ttnc_quest <- pmmlTreeModel(ttnc_pmml))
@
%
\begin{figure}[t!]
\centering
<<quest-plot1, echo=FALSE, fig=TRUE>>=
plot(ttnc_quest)
@
\caption{QUEST tree for Titanic data, fitted using \proglang{SPSS} and exported via PMML. \label{PMML-Titanic-plot1}}
\end{figure}
%
The object \code{ttnc_quest} is of class \class{simpleparty} and the corresponding graphical display is shown in Figure~\ref{PMML-Titanic-plot1}. As explained in Section~\ref{sec:classes}, the full learning data are not part of the PMML description and hence one can only obtain and display the summarized information provided by PMML.

In this particular case, however, we have the learning data available in \proglang{R} because we had exported the data from \proglang{R} to begin with. Hence, for this tree we can augment the \class{simpleparty} with the full learning sample to create a \class{constparty}. As \proglang{SPSS} had reordered some factor levels, we need to carry out this reordering as well:

<<quest-reorder>>=
ttnc2 <- ttnc[, names(ttnc_quest$data)]
for(n in names(ttnc2)) {
  if(is.factor(ttnc2[[n]])) ttnc2[[n]] <- factor(
    ttnc2[[n]], levels = levels(ttnc_quest$data[[n]]))
}
@
%
Using this data, all the information for a \class{constparty} can easily be computed:
%
<<quest-constparty>>=
ttnc_quest2 <- party(ttnc_quest$node, data = ttnc2,
  fitted = data.frame(
    "(fitted)" = predict(ttnc_quest, ttnc2, type = "node"),
    "(response)" = ttnc2$Survived,
    check.names = FALSE),
  terms = terms(Survived ~ ., data = ttnc2)
)
ttnc_quest2 <- as.constparty(ttnc_quest2)
@

This object is plotted in Figure~\ref{PMML-Titanic-plot2}.

\begin{figure}[t!]
\centering
<<quest-plot2, echo=FALSE, fig=TRUE>>=
plot(ttnc_quest2)
@
\caption{QUEST tree for Titanic data, fitted using \proglang{SPSS}, exported via PMML, and transformed into a \class{constparty} object. \label{PMML-Titanic-plot2}}
\end{figure}

Furthermore, we briefly point out that there is also the \proglang{R} package \pkg{pmml} \citep{pmml}, part of the \pkg{rattle} project \citep{rattle}, which can export PMML files for \pkg{rpart} trees from \proglang{R}.
For example, for the \class{rpart} tree for the Titanic data:

<<pmml-export>>=
library("pmml")
tfile <- tempfile()
write(toString(pmml(rp)), file = tfile)
@

Then, we can simply read this file and inspect the resulting tree:

<<pmml-import>>=
(party_pmml <- pmmlTreeModel(tfile))
all.equal(predict(party_rp, newdata = ttnc, type = "prob"),
  predict(party_pmml, newdata = ttnc, type = "prob"),
  check.attributes = FALSE)
@

Further example PMML files created with \pkg{rattle} are available from the Data Mining Group web page, e.g., \url{http://www.dmg.org/pmml_examples/rattle_pmml_examples/AuditTree.xml} or \url{http://www.dmg.org/pmml_examples/rattle_pmml_examples/IrisTree.xml}.

\section{Growing a simple classification tree} \label{sec:mytree}

Although the \pkg{partykit} package offers an extensive toolbox for handling trees along with implementations of various tree algorithms, it does not offer unified infrastructure for \emph{growing} trees. However, once you know how to estimate splits from data, it is fairly straightforward to implement trees.

Consider a very simple CHAID-style algorithm (in fact so simple that we would advise \emph{not to use it} for any real application). We assume that both response and explanatory variables are factors, as for the Titanic data set. First we determine the best explanatory variable by means of a global $\chi^2$ test, i.e., splitting up the response into all levels of each explanatory variable. Then, for the selected explanatory variable we search for the best binary split by means of $\chi^2$ tests, i.e., we cycle through all potential split points and assess the quality of each split by comparing the distributions of the response in the two groups so defined. In both cases, we select the split variable/point with the lowest $p$-value from the $\chi^2$ test; however, a split is only carried out if the global test is significant at the Bonferroni-corrected level $\alpha = 0.01$.

This strategy can be implemented based on the data (response and explanatory variables) and some case weights as follows (\code{response} is just the name of the response and \code{data} is a data frame with all variables):

<<findsplit>>=
findsplit <- function(response, data, weights, alpha = 0.01) {

  ## extract response values from data
  y <- factor(rep(data[[response]], weights))

  ## perform chi-squared test of y vs. x
  mychisqtest <- function(x) {
    x <- factor(x)
    if(length(levels(x)) < 2) return(NA)
    ct <- suppressWarnings(chisq.test(table(y, x), correct = FALSE))
    pchisq(ct$statistic, ct$parameter, log.p = TRUE, lower.tail = FALSE)
  }

  xselect <- which(names(data) != response)
  logp <- sapply(xselect, function(i) mychisqtest(rep(data[[i]], weights)))
  names(logp) <- names(data)[xselect]

  ## Bonferroni-adjusted p-value small enough?
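  ## (the adjustment below is of the Sidak form 1 - (1 - p)^m with
  ##  m = sum(!is.na(logp)) tests; the smallest p-value is recovered
  ##  from the log-scale p-values computed above)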
  if(all(is.na(logp))) return(NULL)
  minp <- exp(min(logp, na.rm = TRUE))
  minp <- 1 - (1 - minp)^sum(!is.na(logp))
  if(minp > alpha) return(NULL)

  ## for selected variable, search for split minimizing p-value
  xselect <- xselect[which.min(logp)]
  x <- rep(data[[xselect]], weights)

  ## set up all possible splits in two kid nodes
  lev <- levels(x[drop = TRUE])
  if(length(lev) == 2) {
    splitpoint <- lev[1]
  } else {
    comb <- do.call("c", lapply(1:(length(lev) - 2),
      function(x) combn(lev, x, simplify = FALSE)))
    xlogp <- sapply(comb, function(q) mychisqtest(x %in% q))
    splitpoint <- comb[[which.min(xlogp)]]
  }

  ## split into two groups (setting groups that do not occur to NA)
  splitindex <- !(levels(data[[xselect]]) %in% splitpoint)
  splitindex[!(levels(data[[xselect]]) %in% lev)] <- NA_integer_
  splitindex <- splitindex - min(splitindex, na.rm = TRUE) + 1L

  ## return split as partysplit object
  return(partysplit(varid = as.integer(xselect),
    index = splitindex,
    info = list(p.value = 1 - (1 - exp(logp))^sum(!is.na(logp)))))
}
@

In order to actually grow a tree on data, we have to set up the recursion that grows a \class{partynode} structure:

<<growtree>>=
growtree <- function(id = 1L, response, data, weights, minbucket = 30) {

  ## for fewer than minbucket observations, stop here
  if (sum(weights) < minbucket) return(partynode(id = id))

  ## find best split
  sp <- findsplit(response, data, weights)
  ## no split found, stop here
  if (is.null(sp)) return(partynode(id = id))

  ## actually split the data
  kidids <- kidids_split(sp, data = data)

  ## set up all daughter nodes
  kids <- vector(mode = "list", length = max(kidids, na.rm = TRUE))
  for (kidid in 1:length(kids)) {
    ## select observations for current node
    w <- weights
    w[kidids != kidid] <- 0
    ## get next node id
    if (kidid > 1) {
      myid <- max(nodeids(kids[[kidid - 1]]))
    } else {
      myid <- id
    }
    ## start recursion on this daughter node
    kids[[kidid]] <- growtree(id = as.integer(myid + 1), response, data, w,
      minbucket)
  }

  ## return nodes
  return(partynode(id = as.integer(id), split = sp, kids = kids,
    info = list(p.value = min(info_split(sp)$p.value, na.rm = TRUE))))
}
@

A very rough sketch of a formula-based user interface sets up the data and calls \fct{growtree}:

<<mytree>>=
mytree <- function(formula, data, weights = NULL) {

  ## name of the response variable
  response <- all.vars(formula)[1]
  ## data without missing values, response comes last
  data <- data[complete.cases(data), c(all.vars(formula)[-1], response)]
  ## data are factors only
  stopifnot(all(sapply(data, is.factor)))

  if (is.null(weights)) weights <- rep(1L, nrow(data))
  ## weights are case weights, i.e., integers
  stopifnot(length(weights) == nrow(data) &
    max(abs(weights - floor(weights))) < .Machine$double.eps)

  ## grow tree
  nodes <- growtree(id = 1L, response, data, weights)

  ## compute terminal node number for each observation
  fitted <- fitted_node(nodes, data = data)
  ## return rich constparty object
  ret <- party(nodes, data = data,
    fitted = data.frame("(fitted)" = fitted,
                        "(response)" = data[[response]],
                        "(weights)" = weights,
                        check.names = FALSE),
    terms = terms(formula))
  as.constparty(ret)
}
@

The call to the constructor \fct{party} sets up a \class{party} object with the tree structure contained in \code{nodes}, the training samples in \code{data}, and the corresponding \code{terms} object. Class \class{constparty} inherits all slots from class \class{party} and has an additional \code{fitted} slot for storing the terminal node numbers for each sample in the training data, the response variable(s), and case weights.
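This \code{fitted} information, described in more detail below, can be used to compute node-wise summaries by hand (a minimal sketch, not run; \code{tr} stands for any fitted \class{constparty} object, e.g., the tree grown on the Titanic data below):

<<fitted-sketch, eval=FALSE>>=
## class distribution per terminal node, computed by hand
f <- fitted(tr)
tapply(f[["(response)"]], f[["(fitted)"]],
  function(y) prop.table(table(y)))
@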
The \code{fitted} slot is a \class{data.frame} containing three variables: the fitted terminal node identifiers \code{"(fitted)"}, an integer vector with one element for each observation in \code{data}; the response variable(s) \code{"(response)"} as a vector (or \code{data.frame} for multivariate responses) with the same number of observations; and, optionally, a vector of weights \code{"(weights)"}. The additional \code{fitted} slot allows computing arbitrary summary measures for each terminal node by simply subsetting the \code{"(response)"} and \code{"(weights)"} variables by \code{"(fitted)"} before computing (weighted) means, medians, empirical cumulative distribution functions, Kaplan-Meier estimates, or whatever summary statistic might be appropriate for a certain response. The \fct{print}, \fct{plot}, and \fct{predict} methods for class \class{constparty} work this way, with suitable defaults for the summary statistics depending on the class of the response(s).

We can now fit this tree to the Titanic data; the \fct{print} method provides us with a first overview of the resulting model:

<<myttnc>>=
(myttnc <- mytree(Survived ~ Class + Age + Gender, data = ttnc))
@
%
\begin{figure}[t!]
\centering
<<myttnc-plot, echo=FALSE, fig=TRUE>>=
plot(myttnc)
@
\caption{Classification tree fitted by the \fct{mytree} function to the \code{ttnc} data. \label{plottree}}
\end{figure}
%
Of course, we can immediately use \code{plot(myttnc)} to obtain a graphical representation of this tree; the result is given in Figure~\ref{plottree}. The default behavior for trees with categorical responses is simply inherited from \class{constparty} and hence we readily obtain bar plots in all terminal nodes.

As the tree is fairly large, we might be interested in pruning it to a more reasonable size. For this purpose the \pkg{partykit} package provides the \fct{nodeprune} function that can prune back to nodes with selected IDs. As \fct{nodeprune} (by design) does not provide a specific pruning criterion, we need to determine ourselves which nodes to prune. Here, one idea could be to require significance at a level much stricter than the default $10^{-2}$, say $10^{-5}$, to obtain a strongly pruned tree. Hence we use \fct{nodeapply} to extract the minimal Bonferroni-corrected $p$-value from all inner nodes:
%
<<myttnc-pval>>=
nid <- nodeids(myttnc)
iid <- nid[!(nid %in% nodeids(myttnc, terminal = TRUE))]
(pval <- unlist(nodeapply(myttnc, ids = iid,
  FUN = function(n) info_node(n)$p.value)))
@

Then, the pruning of the nodes with the larger $p$-values can simply be carried out by
%
<<myttnc-prune>>=
myttnc2 <- nodeprune(myttnc, ids = iid[pval > 1e-5])
@
%
The corresponding visualization is shown in Figure~\ref{prunetree}.

\setkeys{Gin}{width=0.85\textwidth}
\begin{figure}[t!]
\centering
<<prune-plot, echo=FALSE, fig=TRUE>>=
plot(myttnc2)
@
\caption{Pruned classification tree fitted by the \fct{mytree} function to the \code{ttnc} data. \label{prunetree}}
\end{figure}
\setkeys{Gin}{width=\textwidth}

The accuracy of the tree grown with the default options could be assessed by the bootstrap, for example. Here, we want to compare our tree for the Titanic survivor data with a simple logistic regression model.
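The comparison is based on the binomial log-likelihood
\[
\ell = \sum_{i = 1}^{n} w_i \left\{ y_i \log(\pi_i) + (1 - y_i) \log(1 - \pi_i) \right\},
\]
where $y_i = 1$ codes survival, $\pi_i$ is the predicted survival probability, and the case weights $w_i$ select the observations on which $\ell$ is evaluated: in-sample for the logistic regression but out-of-bootstrap for the tree.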
First, we fit this simple GLM and compute the (in-sample) log-likelihood:

<<glm>>=
logLik(glm(Survived ~ Class + Age + Gender, data = ttnc,
  family = binomial()))
@

For our tree, we set up $25$ bootstrap samples

<<bootstrap-samples>>=
bs <- rmultinom(25, nrow(ttnc), rep(1, nrow(ttnc)) / nrow(ttnc))
@

and implement the log-likelihood of a binomial model

<<bloglik>>=
bloglik <- function(prob, weights)
  sum(weights * dbinom(ttnc$Survived == "Yes", size = 1,
    prob[,"Yes"], log = TRUE))
@

What remains to be done is to iterate over all bootstrap samples, refit the tree on each bootstrap sample, and evaluate the log-likelihood on the out-of-bootstrap samples based on the tree's predictions (details on how to compute predictions are given in the next section):

<<bootstrap-loglik>>=
f <- function(w) {
  tr <- mytree(Survived ~ Class + Age + Gender, data = ttnc, weights = w)
  bloglik(predict(tr, newdata = ttnc, type = "prob"),
    as.numeric(w == 0))
}
apply(bs, 2, f)
@

We see that the in-sample log-likelihood of the linear logistic regression model is much smaller than the out-of-sample log-likelihoods found for our tree, and thus we can conclude that our tree-based approach fits the data better than the linear model.

\section{Predictions} \label{sec:prediction}

As argued in Section~\ref{sec:classes}, arbitrary types of predictions can be computed from \class{constparty} objects because the full empirical distribution of the response in the learning sample nodes is available. All of these can easily be computed in the \fct{predict} method for \class{constparty} objects by supplying a suitable aggregation function. However, as certain types of predictions are much more commonly used, these are available even more easily by setting a \code{type} argument.

\begin{table}[b!]
\centering
\begin{tabular}{llll} \hline
Response class & \code{type = "node"} & \code{type = "response"} & \code{type = "prob"} \\ \hline
\class{factor} & terminal node number & majority class & class probabilities \\
\class{numeric} & terminal node number & mean & ECDF \\
\class{Surv} & terminal node number & median survival time & Kaplan-Meier \\ \hline
\end{tabular}
\caption{Overview of the types of predictions computed by the \fct{predict} method for \class{constparty} objects. For multivariate responses, combinations thereof are returned. \label{predict-type}}
\end{table}

The prediction \code{type} can either be \code{"node"}, \code{"response"}, or \code{"prob"} (see Table~\ref{predict-type}). The idea is that \code{"response"} always returns a prediction of the same class as the original response and \code{"prob"} returns some object that characterizes the entire empirical distribution. Hence, for different response classes, different types of predictions are produced, see Table~\ref{predict-type} for an overview. Additionally, for \class{numeric} responses, \code{type = "quantile"} and \code{type = "density"} are available. By default, these return functions for computing predicted quantiles and probability densities, respectively, but optionally these functions can be directly evaluated \code{at} given values and then return a vector/matrix.

Here, we illustrate the different types of predictions for all possible combinations of the explanatory factor levels:
<<nttnc>>=
nttnc <- expand.grid(Class = levels(ttnc$Class),
  Gender = levels(ttnc$Gender), Age = levels(ttnc$Age))
nttnc
@

The corresponding predicted nodes, modes, and probability distributions are:

<<predict>>=
predict(myttnc, newdata = nttnc, type = "node")
predict(myttnc, newdata = nttnc, type = "response")
predict(myttnc, newdata = nttnc, type = "prob")
@

Furthermore, the \fct{predict} method features a \code{FUN} argument that can be used to compute customized predictions. If we are, say, interested in the rank of the probabilities for the two classes, we can simply specify a function that implements this feature:

<<predict-fun>>=
predict(myttnc, newdata = nttnc, FUN = function(y, w)
  rank(table(rep(y, w))))
@

The user-supplied function \code{FUN} takes two arguments: \code{y} is the response and \code{w} is a vector of weights (case weights in this situation). Of course, it would have been easier to do these computations directly on the conditional class probabilities (\code{type = "prob"}), but the approach taken here for illustration generalizes to situations where this is not possible, especially for numeric responses.

\section{Conclusion} \label{sec:conclusion}

The classes \class{constparty} and \class{simpleparty} introduced here can be used to represent trees with constant fits in the terminal nodes, including most of the traditional tree variants. For a number of implementations it is possible to convert the resulting trees to one of these classes, thus offering unified methods for handling constant-fit trees. User-extensible methods for printing and plotting these trees are available. Also, computing non-standard predictions, such as the median or empirical cumulative distribution functions, is easily possible within this framework. With the infrastructure provided in \pkg{partykit} it is rather straightforward to implement a new (or old) tree algorithm, and therefore a prototype implementation of fancy ideas for improving trees is only a couple of lines of \proglang{R} code away.

\bibliography{party}

\end{document}