
7.3: Regression Trees


    Introduction to Regression Trees

Regression trees are a type of decision tree algorithm used when the target (dependent) variable is continuous rather than categorical. Instead of classifying observations into discrete categories, regression trees predict a numerical outcome by learning decision rules from input features. At each internal node, the dataset is split on the condition that most reduces prediction error, typically measured as the sum of squared deviations from the node mean. Each terminal (leaf) node holds a predicted numeric value: the average of the target variable for the training observations that reach that node. Regression trees are intuitive, visual, and capable of capturing non-linear relationships in the data.
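To make this concrete, here is a minimal sketch of fitting and querying a regression tree, assuming Python with scikit-learn (the chapter does not prescribe a tool, and the data below are synthetic and purely illustrative):

import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic data: one numeric feature with a non-linear relationship
# to a continuous target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * np.sin(X.ravel()) + rng.normal(0, 0.3, size=200)

# Each split is chosen to minimize the sum of squared deviations;
# each leaf predicts the mean target of the training rows it contains.
tree = DecisionTreeRegressor(max_depth=3, random_state=0)
tree.fit(X, y)

print(tree.predict([[2.5], [7.0]]))  # numeric predictions, one per row

Limiting max_depth keeps the tree small enough to read, which is often the point of using a tree in the first place.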

Differences Between Regression and Classification Trees

Feature             | Classification Trees               | Regression Trees
--------------------|------------------------------------|-----------------------------------------------------
Target Variable     | Categorical                        | Continuous
Prediction Output   | Class label (e.g., Yes/No)         | Numeric value
Splitting Criteria  | Gini Index, Entropy                | Mean Squared Error (MSE), Mean Absolute Error (MAE)
Evaluation Metrics  | Accuracy, Sensitivity, Specificity | RMSE, MAE, R-squared
Example Use Case    | Churn prediction                   | Revenue prediction

    Applications of Regression Trees

    Regression trees are useful in a variety of business and analytical scenarios where the outcome variable is numeric. For instance, marketers might predict customer lifetime value, sales teams might estimate future sales based on historical data, real estate agents can assess property prices based on location and features, and logistics managers may forecast delivery times depending on route and traffic patterns. These models help organizations make informed, data-driven decisions with a visual and interpretable output.

    Image of a Regression Tree and Interpretation

    In a regression tree, each node represents a split that attempts to reduce variance in the dependent variable. The branches reflect the conditions for the splits, and the leaf nodes show the predicted numeric values.

    Consider the following hypothetical business scenario:

A subscription-based telecommunications company wants to better understand the factors influencing customer revenue so it can more accurately forecast income and tailor pricing strategies. Using historical customer data, the company built a regression tree to predict monthly revenue based on two key variables: customer tenure and monthly charges. The model first splits customers by tenure, recognizing that newer customers often have different spending patterns than long-term subscribers. Among customers with less than 12 months of tenure, monthly charges are the next most important factor: predicted average revenue is $55 for lower-charged customers and $72 for higher-charged customers. Longer-tenure customers, regardless of monthly charges, show higher predicted revenue at $95. This analysis helps the company identify high-value segments, adjust retention offers for newer customers, and fine-tune pricing models to maximize revenue potential.

The following figure illustrates the regression tree based on the example above.

[Figure: a regression tree that splits first on tenure (under vs. at least 12 months), then splits short-tenure customers on monthly charges; leaf predictions are $55, $72, and $95.]
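The same decision rules can also be written as a short Python function. The sketch below is illustrative: the 12-month tenure cutoff and the leaf values ($55, $72, $95) come from the scenario, while the $60 monthly-charge threshold is a hypothetical value added for illustration, since the text does not state one.

def predict_monthly_revenue(tenure_months, monthly_charges):
    # Root split: customer tenure (from the scenario).
    if tenure_months < 12:
        # Second split: monthly charges; the $60 cutoff is hypothetical.
        if monthly_charges < 60:
            return 55   # newer, lower-charged customers
        return 72       # newer, higher-charged customers
    return 95           # long-tenure customers, regardless of charges

print(predict_monthly_revenue(6, 45))   # -> 55
print(predict_monthly_revenue(24, 80))  # -> 95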

    Metrics to Evaluate Regression Trees

    To evaluate how well a regression tree performs, analysts commonly use the following metrics:

• RMSE (Root Mean Squared Error): The square root of the average squared difference between predicted and actual values; it penalizes large errors more heavily.
• MAE (Mean Absolute Error): The average of the absolute differences between predicted and actual values; it is less sensitive to outliers than RMSE.
• R-squared: The proportion of variance in the dependent variable explained by the model.

These metrics help determine how closely the model’s predictions match the actual values; their standard definitions are given below.
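Writing \(y_i\) for the actual value, \(\hat{y}_i\) for the prediction, \(\bar{y}\) for the mean of the actual values, and \(n\) for the number of observations:

\( \text{RMSE} = \sqrt{\dfrac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \)

\( \text{MAE} = \dfrac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \)

\( R^2 = 1 - \dfrac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \)

Lower RMSE and MAE indicate a closer fit; an \(R^2\) closer to 1 indicates that more of the variance is explained.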

    Splitting Criteria for Regression Trees

    Regression trees use variance reduction methods to determine the best split. At each node, the algorithm evaluates which split yields the greatest decrease in variability of the target variable. This is typically assessed using Mean Squared Error (MSE), which ensures that the selected split results in subsets of data with minimal variation around the mean.
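As a concrete illustration, here is a minimal pure-Python/NumPy sketch of the split search on a single numeric feature. It is an illustrative implementation, not any particular library's: it tries every midpoint between adjacent feature values and keeps the threshold with the greatest reduction in the sum of squared errors (equivalently, variance).

import numpy as np

def sse(y):
    # Sum of squared deviations from the node mean; 0 for an empty node.
    return float(np.sum((y - y.mean()) ** 2)) if y.size else 0.0

def best_split(x, y):
    # Candidate thresholds: midpoints between adjacent unique feature values.
    vals = np.sort(np.unique(x))
    best_t, best_gain, parent = None, 0.0, sse(y)
    for t in vals[:-1] + np.diff(vals) / 2:
        gain = parent - sse(y[x <= t]) - sse(y[x > t])
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain

x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([5.0, 6.0, 5.0, 20.0, 21.0, 19.0])
print(best_split(x, y))  # splits at 6.5, separating the two clusters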

    Advantages and Disadvantages

    Advantages:

    • Easy to interpret and visualize
    • Can model non-linear relationships
    • No need for data normalization or scaling
    • Suitable for both small and large datasets

    Disadvantages:

• Prone to overfitting unless pruned or regularized (see the pruning sketch after this list)
    • Sensitive to minor data changes
    • May underperform compared to ensemble methods like Random Forests
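One common remedy for the first disadvantage is cost-complexity pruning. A brief sketch follows, assuming Python with scikit-learn and synthetic data; the ccp_alpha values are illustrative, not recommended settings.

import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(300, 1))
y = X.ravel() ** 1.5 + rng.normal(0, 1.0, size=300)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Larger ccp_alpha prunes more aggressively: fewer leaves, and often
# better generalization (higher test R-squared) than an unpruned tree.
for alpha in (0.0, 0.01, 0.1):
    tree = DecisionTreeRegressor(ccp_alpha=alpha, random_state=0)
    tree.fit(X_train, y_train)
    print(alpha, tree.get_n_leaves(), round(tree.score(X_test, y_test), 3))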

    Summary

    Regression trees are powerful yet intuitive tools for predicting numeric outcomes using decision rule logic. They are particularly valuable when transparency and interpretability are required. However, they may suffer from instability and limited accuracy when used alone.

    To overcome these issues, ensemble methods like Random Forests build multiple trees and aggregate their predictions for more robust results. In Chapter 8, we will explore Random Forests in detail and see how they can improve performance across both regression and classification tasks.


This page titled 7.3: Regression Trees is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by Elbert L. Hearon, M.B.A., M.S.
