Title: Generating Accurate Rule Sets Without Global Optimization
Eibe Frank and Ian H. Witten
Department of Computer Science
University of Waikato
Hamilton
New Zealand
{eibe, ihw}@cs.waikato.ac.nz
Abstract
The two dominant schemes for rule learning, C4.5 and RIPPER, both operate in
two stages. First they induce an initial rule set and then they refine it
using a rather complex optimization stage that discards (C4.5) or adjusts
(RIPPER) individual rules to make them work better together. In contrast, this
paper shows how good rule sets can be learned one rule at a time, without any
need for global optimization. We present an algorithm for inferring rules by
repeatedly generating partial decision trees, thus combining the two major
paradigms for rule generation---creating rules from decision trees and the
separate-and-conquer rule-learning technique. The algorithm is straightforward
and elegant. Despite this simplicity, experiments on standard datasets show
that it produces rule sets that are as accurate as, and of similar size to,
those generated by C4.5, and more accurate than RIPPER's. Moreover, it operates
efficiently and, because it avoids postprocessing, does not suffer from the
extremely slow performance on pathological example sets for which the C4.5
method has been criticized.
Keywords: Rules, global optimization, partial decision trees.
Email address of contact author: eibe@cs.waikato.ac.nz
Phone number of contact author: 0064 7 856 2889
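
The separate-and-conquer loop summarized in the abstract can be illustrated
with a minimal sketch. It is not the algorithm of this paper: the partial
decision tree step is replaced by a deliberately simple stand-in,
learn_one_rule, which picks the single attribute-value test with the highest
accuracy on the remaining instances. All names (learn_one_rule,
separate_and_conquer, classify) and the toy data are hypothetical and serve
only to show how rules can be learned one at a time, with no global
optimization pass afterwards.

```python
from collections import Counter


def learn_one_rule(instances):
    """Stand-in for the partial-tree step (an assumption, not the paper's method).

    Returns (attribute, value, label) for the single test with the highest
    accuracy on `instances`; ties are broken by coverage, then by sort order.
    """
    best_score, best_rule = None, None
    for attr in sorted(instances[0][0]):
        for value in sorted({x[attr] for x, _ in instances}):
            covered = [y for x, y in instances if x[attr] == value]
            label, hits = Counter(covered).most_common(1)[0]
            score = (hits / len(covered), len(covered))
            if best_score is None or score > best_score:
                best_score, best_rule = score, (attr, value, label)
    return best_rule


def separate_and_conquer(instances):
    """Learn an ordered rule list one rule at a time."""
    rules = []
    remaining = list(instances)
    # Keep learning while the uncovered instances still mix several classes.
    while len({y for _, y in remaining}) > 1:
        attr, value, label = learn_one_rule(remaining)
        rules.append((attr, value, label))
        # "Separate": drop every instance the new rule covers.
        remaining = [(x, y) for x, y in remaining if x[attr] != value]
    # Default rule: the class of whatever is left, else the overall majority.
    source = remaining if remaining else instances
    default = Counter(y for _, y in source).most_common(1)[0][0]
    rules.append((None, None, default))
    return rules


def classify(rules, x):
    """Return the label of the first rule that fires."""
    for attr, value, label in rules:
        if attr is None or x[attr] == value:
            return label


# Toy data, invented for illustration only.
data = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "no"}, "play"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
]
rules = separate_and_conquer(data)
```

Each pass removes at least one instance, so the loop always terminates. No
rule is revisited once it has been added, which is the property the abstract
contrasts with the refinement stages of C4.5 and RIPPER.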