Cross-validated C4.5: Using Error Estimation for Automatic Parameter Selection
			    George H. John
		     Computer Science Department
			 Stanford University
			  Stanford, CA 94305
			gjohn@cs.Stanford.EDU

		   Technical Note STAN-CS-TN-94-12

  Machine learning algorithms for supervised learning are in wide
  use.  An important issue in the use of these algorithms is how to
  set their parameters.  While the default parameter values may
  be appropriate for a wide variety of tasks, they are
  not necessarily optimal for a given task.  In this paper, we
  investigate the use of cross-validation to select parameters for
  the C4.5 decision tree learning algorithm.  Experimental results
  on five datasets show that when cross-validation is used to
  select an important parameter for C4.5, the accuracy of the
  induced trees on independent test sets is generally higher than
  the accuracy when using the default parameter value.
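
  The general idea of selecting a parameter by cross-validation can
  be sketched as follows.  This is a minimal illustration, not the
  procedure or code used in the experiments: the function names
  (k_fold_indices, cv_accuracy, select_parameter), the choice of
  k = 10 folds, and the stand-in learner fit_knn_1d are all
  assumptions made for the example.  C4.5 itself is not implemented
  here; any learner that takes a parameter and returns a predictor
  could be plugged in as `fit`.

```python
import random
from collections import Counter


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_accuracy(fit, X, y, param, k=10, seed=0):
    """Estimate the accuracy of fit(X, y, param) by k-fold cross-validation.

    `fit` must return a callable that maps one example to a predicted label.
    """
    correct = 0
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train], param)
        # Score the model only on the fold it did not see during training.
        correct += sum(model(X[i]) == y[i] for i in held_out)
    return correct / len(X)


def select_parameter(fit, X, y, candidates, k=10, seed=0):
    """Return the candidate value with the highest cross-validated accuracy.

    Only the training data (X, y) is used; the independent test set is
    never consulted when choosing the parameter.
    """
    return max(candidates, key=lambda p: cv_accuracy(fit, X, y, p, k, seed))


def fit_knn_1d(X, y, n_neighbors):
    """Hypothetical stand-in learner (NOT C4.5): 1-D nearest-neighbour vote."""
    def predict(x):
        nearest = sorted(range(len(X)), key=lambda i: abs(X[i] - x))[:n_neighbors]
        return Counter(y[i] for i in nearest).most_common(1)[0][0]
    return predict
```

  In use, one would call select_parameter(fit, X_train, y_train,
  candidates) on the training set alone, retrain on all of the
  training data with the chosen value, and only then measure
  accuracy on the independent test set.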