Wrappers for Feature Subset Selection

Ron Kohavi and George H. John
Computer Science Department
Stanford University
Stanford, CA 94305
{ronnyk,gjohn}@CS.Stanford.EDU
http://robotics.stanford.edu/~{ronnyk,gjohn}

In the feature subset selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus its attention, while ignoring the rest. To achieve the best possible performance with a particular learning algorithm on a particular domain, a feature subset selection method should consider how the algorithm and the training data interact. We explore the relation between optimal feature subset selection and relevance. Our wrapper method searches for an optimal feature subset tailored to a particular algorithm and a domain. We study the strengths and weaknesses of the wrapper approach and show improvements over the original design. We compare the wrapper approach to induction without feature subset selection and to Relief, a filter-based approach to feature subset selection. Significant improvement in accuracy on real problems is achieved for the two families of induction algorithms used: decision trees and Naive-Bayes.

Citation: Kohavi, R. and John, G. H. (1997), Wrappers for Feature Subset Selection. _Artificial Intelligence Journal_. Forthcoming.