
2.4 Obtaining the right sized trees

 

One of the main difficulties of inducing a recursive partitioning structure is knowing when to stop. Obtaining trees of the "right" size may be important for several reasons, which depend on the size of the classification problem [117]. For moderate-sized problems, the critical issues are generalization accuracy, honest error rate estimation, and gaining insight into the predictive and generalization structure of the data. For very large tree classifiers, the critical issue is optimizing structural properties (height, balance, etc.) [366,47].

Breiman et al. [29] pointed out that tree quality depends more on good stopping rules than on splitting rules. The effects of noise on generalization are discussed in [270,183]. Overfitting avoidance as a specific bias is studied in [377,317]. The effect of noise on classification tree construction methods is studied in the pattern recognition literature in [346].
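To make the notion of a stopping rule concrete, the following is a minimal sketch of recursive partitioning on a single numeric feature, where growth halts when a node is pure, holds too few samples, or reaches a depth cap. It is an illustration only and not a method taken from the cited works: the names `gini`, `grow` and `count_leaves`, the choice of Gini impurity as the splitting criterion, and the thresholds `max_depth` and `min_samples` are all assumptions introduced here.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def grow(points, depth=0, max_depth=3, min_samples=2):
    """Grow a tree on (x, label) pairs, where x is a single numeric feature.

    Returns a class label for a leaf, or (threshold, left, right) for a split.
    """
    labels = [y for _, y in points]
    majority = Counter(labels).most_common(1)[0][0]

    # Stopping rule (illustrative thresholds): pure node, too few samples,
    # or depth cap reached.
    if gini(labels) == 0.0 or len(points) < min_samples or depth >= max_depth:
        return majority

    # Candidate thresholds are midpoints between consecutive distinct x values.
    xs = sorted({x for x, _ in points})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [p for p in points if p[0] <= t]
        right = [p for p in points if p[0] > t]
        # Weighted impurity of the two children.
        score = (len(left) * gini([y for _, y in left])
                 + len(right) * gini([y for _, y in right])) / len(points)
        if best is None or score < best[0]:
            best = (score, t, left, right)

    if best is None:  # all x values identical, so no split is possible
        return majority

    _, t, left, right = best
    return (t,
            grow(left, depth + 1, max_depth, min_samples),
            grow(right, depth + 1, max_depth, min_samples))


def count_leaves(tree):
    """Number of leaves in a tree returned by grow()."""
    if not isinstance(tree, tuple):
        return 1
    return count_leaves(tree[1]) + count_leaves(tree[2])
```

Calling `grow` on the same data with different values of `max_depth` shows how the stopping rule alone changes tree size: a tight cap yields a single leaf, while a loose cap lets the tree grow until every leaf is pure.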

Several techniques have been suggested for obtaining right-sized trees. The most popular of these is pruning, which we discuss in Section 2.4.1. The following are some alternatives to pruning that have been attempted in the literature.








Sreerama Murthy
Thu Oct 19 17:40:24 EDT 1995