25. Tree Models#

Kriti Sehgal

Tree-based methods are supervised machine learning algorithms that can be used for both regression and classification tasks. They are particularly valuable when working with complex, nonlinear relationships and when model interpretability is important. The general idea behind tree-based methods is to segment the feature space into a number of simple regions. To make a prediction for a new observation, we first determine which region it falls into, then predict the mean (for regression) or mode (for classification) of the training observations in that region.
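
As a minimal sketch of this region-mean idea, the snippet below fits a depth-one regression tree with scikit-learn on a small synthetic dataset (the data and depth are illustrative choices, not from the text). The single split carves the feature space into two regions, and each prediction is the mean of the training responses in the matching region.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))                          # one feature
y = np.where(X[:, 0] < 5, 2.0, 8.0) + rng.normal(0, 0.5, 100)  # two regimes

# A depth-1 tree makes a single split, dividing the feature space into
# two regions and predicting each region's training mean.
tree = DecisionTreeRegressor(max_depth=1).fit(X, y)
print(tree.predict([[2.0], [8.0]]))  # approximately [2.0, 8.0]
```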

A decision tree is the basic building block of tree-based methods. It is a single tree structure that makes predictions by recursively splitting the data based on feature values. The term “tree” is used because the set of splitting rules used to segment the predictor space can be summarized in a tree-like structure, typically drawn upside down with the root at the top. Decision trees are highly interpretable: it is easy to understand and explain why the model made a particular prediction. However, individual decision trees tend to overfit the training data and often have lower prediction accuracy compared to other supervised learning algorithms.
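
To make that interpretability concrete, here is a small sketch, assuming scikit-learn and the classic iris dataset, that fits a shallow decision tree and prints its splitting rules as plain text.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(iris.data, iris.target)

# export_text renders the learned splits as human-readable if/else rules,
# so a stakeholder can trace exactly how any prediction is made.
print(export_text(clf, feature_names=iris.feature_names))
```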

To improve prediction accuracy, we use ensemble methods. The key idea is that instead of relying on a single model, we train many models and aggregate their predictions so that the final result is more accurate, more stable, and less prone to overfitting. Techniques such as Random Forests and Gradient Boosting produce multiple trees, which are then combined to yield a single consensus prediction. Combining a large number of trees can often result in dramatic improvements in prediction accuracy, at the expense of some loss in interpretability. “Tree-based methods” is the umbrella term covering decision trees as well as models built from many decision trees or decision tree variations.
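
As a rough sketch of this accuracy gain, the snippet below compares a single decision tree with a random forest and a gradient boosting model on a held-out split; the dataset and hyperparameters are illustrative choices, not prescriptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Fit each model on the training split and report held-out accuracy;
# the ensembles aggregate many trees into one consensus prediction.
for model in (DecisionTreeClassifier(random_state=0),
              RandomForestClassifier(n_estimators=200, random_state=0),
              GradientBoostingClassifier(random_state=0)):
    score = model.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{type(model).__name__}: test accuracy = {score:.3f}")
```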

In this chapter, we begin with decision trees, the fundamental tree-based method, and then extend these ideas to more powerful ensemble methods.

25.1. Why does interpretability matter?#

In many real-world applications, understanding why a model makes a particular prediction is just as important as the prediction itself. For example:

  • Regulatory and Legal Requirements: Financial institutions must comply with regulations like the Equal Credit Opportunity Act, which requires providing specific reasons when denying credit. An interpretable model can show exactly which factors (income, debt-to-income ratio, credit history) led to the denial.

  • Healthcare and Safety-Critical Applications: When a model suggests a patient has cancer, doctors need to understand why the model made that prediction before ordering invasive procedures. An interpretable model reveals the specific combinations of symptoms, test results, and patient history driving the prediction.

  • Debugging and Improvement: If a hiring model systematically discriminates against certain groups, an interpretable model can reveal which features drive these decisions, allowing data scientists to fix the issue.

With black-box models, understanding model decisions is far more difficult. This is where decision trees excel: they provide clear, human-readable rules that stakeholders can examine and trust. While ensemble methods sacrifice some interpretability for improved accuracy, they still offer tools, such as feature importance measures, that provide partial explanations.
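
For instance, scikit-learn's fitted tree ensembles expose impurity-based feature importances. The sketch below (dataset and settings are illustrative) prints a coarse, model-wide ranking of features, which is a partial explanation of the model rather than a per-prediction one.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(iris.data, iris.target)

# feature_importances_ summarizes which features drove the ensemble's
# splits overall; it does not explain any single prediction.
ranked = sorted(zip(iris.feature_names, forest.feature_importances_),
                key=lambda pair: -pair[1])
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```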