Automatic Parameter Selection by Minimizing Estimated Error

Ron Kohavi and George H. John
Computer Science Department, Stanford University
Stanford, CA 94305
{ronnyk,gjohn}@cs.Stanford.EDU
http://robotics.stanford.edu/~{ronnyk,gjohn}

We address the problem of finding the parameter settings that result in optimal performance of a given learning algorithm when a particular dataset is used as training data. We describe a "wrapper" method that treats the selection of the best parameters as a discrete function optimization problem. The method uses best-first search and cross-validation to wrap around the basic induction algorithm: the search explores the space of parameter values, running the basic algorithm many times on training and holdout sets produced by cross-validation to estimate the expected error of each parameter setting. The final selected parameter settings are thus tuned to the specific induction algorithm and dataset under study. We report experiments with this method on 33 datasets selected from the UCI and StatLog collections, using C4.5 as the basic induction algorithm. At a 90% confidence level, our method improves the performance of C4.5 on nine domains, degrades performance on one, and is statistically indistinguishable from C4.5 on the rest. On the sample of datasets used for comparison, our method yields an average 13% relative decrease in error rate. We expect similar performance improvements when our method is used with other machine learning algorithms.

Citation: Kohavi, R., and John, G. H. (1995), "Automatic Parameter Selection by Minimizing Estimated Error", in Prieditis & Russell, eds., Machine Learning: Proceedings of the Twelfth International Conference, Morgan Kaufmann Publishers, San Francisco, CA.
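
To make the wrapper idea concrete, the following is a minimal sketch of parameter selection by minimizing cross-validated error, not the authors' implementation: it assumes a scikit-learn-style estimator (DecisionTreeClassifier, a CART-based tree, stands in only roughly for C4.5), and the names cv_error, neighbors, best_first_search, the parameter grid, and the max_stale stopping rule are all illustrative assumptions rather than details from the paper.

import heapq

from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def cv_error(params, X, y, folds=10):
    # Estimated error of one parameter setting: 1 - mean CV accuracy.
    clf = DecisionTreeClassifier(random_state=0, **params)
    return 1.0 - cross_val_score(clf, X, y, cv=folds).mean()

def neighbors(params, grid):
    # Adjacent states: move exactly one parameter to a neighboring grid value.
    for name, values in grid.items():
        i = values.index(params[name])
        for j in (i - 1, i + 1):
            if 0 <= j < len(values):
                yield {**params, name: values[j]}

def best_first_search(X, y, grid, max_stale=5):
    # Repeatedly expand the open node with the lowest estimated error; stop
    # after max_stale consecutive expansions fail to improve the best error.
    start = {name: values[0] for name, values in grid.items()}
    seen = {tuple(sorted(start.items()))}
    tie = 0  # unique tiebreaker so the heap never compares dicts
    frontier = [(cv_error(start, X, y), tie, start)]
    best_err, best = frontier[0][0], start
    stale = 0
    while frontier and stale < max_stale:
        err, _, params = heapq.heappop(frontier)
        if err < best_err:
            best_err, best, stale = err, params, 0
        else:
            stale += 1
        for nb in neighbors(params, grid):
            k = tuple(sorted(nb.items()))
            if k not in seen:
                seen.add(k)
                tie += 1
                heapq.heappush(frontier, (cv_error(nb, X, y), tie, nb))
    return best, best_err

A usage example on a dataset that ships with scikit-learn (the grid values below are arbitrary choices for illustration):

from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
grid = {"max_depth": [2, 4, 8, 16, None],
        "min_samples_leaf": [1, 2, 5, 10]}
params, err = best_first_search(X, y, grid)
print(params, err)

Because every candidate setting is scored by rerunning the learner under cross-validation, the selected parameters are tuned to this particular algorithm-dataset pair, which is the essence of the wrapper approach described above.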