The Wrapper Approach

Ron Kohavi
Data Mining and Visualization
Silicon Graphics, Inc
2011 N. Shoreline Blvd.
Mountain View, CA 94043
ronnyk@sgi.com
robotics.stanford.edu/~ronnyk

George H. John
Data Mining
Epiphany, Inc
2300 Geng Road, Suite 200
Palo Alto, CA 94303
gjohn@cs.stanford.edu
robotics.stanford.edu/~gjohn

In the feature subset selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus its attention, while ignoring the rest. To achieve the best possible performance with a particular learning algorithm on a particular training set, a feature subset selection method should consider how the algorithm and the training set interact. We explore the relation between optimal feature subset selection and relevance. The wrapper method searches for an optimal feature subset tailored to a particular algorithm and a domain. We compare the wrapper approach to induction without feature subset selection and to Relief, a filter approach to feature subset selection. Improvement in accuracy is achieved for some datasets for the two families of induction algorithms used: decision trees and Naive-Bayes. In addition, the feature subsets selected by the wrapper are significantly smaller than the original subsets used by the learning algorithms, thus producing more comprehensible models.

Citation: Kohavi, Ron and John, George H. (1998) The Wrapper Approach. In H. Liu and H. Motoda (Eds.), Feature Selection for Knowledge Discovery in Databases. Springer-Verlag.

Significant parts of this chapter are reprinted from Artificial Intelligence Journal, Vol. 97, Nos. 1-2, pp. 273-324, 1997 with kind permission from Elsevier Science - NL, Sara Burgerhartstraat 25, 1055 KV Amsterdam, The Netherlands.
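
To make the wrapper idea from the abstract concrete, the following is a minimal sketch, not the authors' implementation. It treats an induction algorithm as a black box and searches over feature subsets, scoring each subset by the estimated accuracy of that algorithm. Several choices here are assumptions made only for illustration: the search is a greedy forward hill-climb, accuracy is estimated with 5-fold cross-validation, the induction algorithm is a tiny categorical Naive-Bayes stand-in, the data is synthetic, and all function names are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, features):
    """Fit a categorical Naive-Bayes model using only the given feature indices."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature, class) -> counts of values
    domain = defaultdict(set)            # feature -> values seen in training
    for row, label in zip(rows, labels):
        for f in features:
            value_counts[(f, label)][row[f]] += 1
            domain[f].add(row[f])
    return class_counts, value_counts, domain


def predict(model, row, features):
    """Return the class with the highest Laplace-smoothed log-posterior."""
    class_counts, value_counts, domain = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        count = class_counts[label]
        score = math.log(count / total)
        for f in features:
            numerator = value_counts[(f, label)][row[f]] + 1
            denominator = count + len(domain[f] | {row[f]})
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cv_accuracy(rows, labels, features, folds=5):
    """Estimate accuracy of the induction algorithm on a feature subset."""
    correct = 0
    for k in range(folds):
        train_idx = [i for i in range(len(rows)) if i % folds != k]
        test_idx = [i for i in range(len(rows)) if i % folds == k]
        model = train_naive_bayes(
            [rows[i] for i in train_idx], [labels[i] for i in train_idx], features
        )
        correct += sum(predict(model, rows[i], features) == labels[i] for i in test_idx)
    return correct / len(rows)


def wrapper_forward_selection(rows, labels, folds=5):
    """Greedy forward search: add the feature that most improves estimated accuracy."""
    n_features = len(rows[0])
    selected = []
    best = cv_accuracy(rows, labels, selected, folds)
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(cv_accuracy(rows, labels, selected + [f], folds), f) for f in candidates]
        # Break ties in favour of the lowest feature index.
        score, feature = max(scored, key=lambda pair: (pair[0], -pair[1]))
        if score <= best:  # no candidate improves the estimate: stop
            break
        selected.append(feature)
        best = score
    return sorted(selected), best


# Synthetic data (an assumption for illustration): feature 0 determines the
# label, features 1 and 2 are random noise.
rng = random.Random(0)
rows = [(i % 2, rng.randint(0, 1), rng.randint(0, 2)) for i in range(60)]
labels = [row[0] for row in rows]

selected, accuracy = wrapper_forward_selection(rows, labels)
print(selected, accuracy)
```

On this toy data the search keeps only feature 0, since no additional feature can raise the estimated accuracy. Swapping in a different induction algorithm only requires replacing `train_naive_bayes` and `predict`, which is the sense in which the selected subset is tailored to a particular algorithm and domain.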