Machine Learning, 30, 271-274 (1998)

©1998 Kluwer Academic Publishers, Boston, Manufactured in The Netherlands

**Editors:** Ron Kohavi (ronnyk@CS.Stanford.EDU) and Foster Provost (foster@Basit.COM)

To help readers understand common terms in machine learning, statistics, and data mining, we provide the following glossary. The definitions are not intended to be completely general; instead, they are aimed at the most common case.

**Accuracy (error rate)**- The rate of correct (incorrect) predictions made by the model over a data set (cf. coverage). Accuracy is usually estimated by using an independent test set that was not used at any time during the learning process. More complex accuracy estimation techniques, such as cross-validation and the bootstrap, are commonly used, especially with data sets containing a small number of instances.
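As a minimal sketch of the definition above, accuracy and error rate can be computed by comparing a model's predictions against the actual labels of an independent test set. The function names and the five labels below are hypothetical and serve only as illustration; cross-validation and the bootstrap are not shown.

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("cannot estimate accuracy on an empty test set")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def error_rate(actual, predicted):
    """Fraction of predictions that differ from the actual labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("cannot estimate error rate on an empty test set")
    incorrect = sum(a != p for a, p in zip(actual, predicted))
    return incorrect / len(actual)


# Hypothetical labels for a held-out test set of five instances.
test_labels = ["pos", "neg", "pos", "pos", "neg"]
model_output = ["pos", "neg", "neg", "pos", "neg"]

print(accuracy(test_labels, model_output))    # 4 of 5 predictions are correct
print(error_rate(test_labels, model_output))  # 1 of 5 predictions is incorrect
```

The two quantities always sum to one, so reporting either is enough.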
**Association learning**- Techniques that find conjunctive implication rules of the form ``X and Y implies A and B'' (associations) that satisfy given criteria. The conventional association algorithms are sound and complete methods for finding all associations that satisfy criteria for minimum support (at least a specified fraction of the instances must satisfy both sides of the rule) and minimum confidence (at least a specified fraction of instances satisfying the left hand side, or antecedent, must satisfy the right hand side, or consequent).
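The support and confidence criteria above can be sketched for a single rule as follows. This is an illustration under stated assumptions, not an association-mining algorithm: each instance is represented as a set of the items it satisfies, and the function names and the five instances are hypothetical.

```python
def support(instances, antecedent, consequent):
    """Fraction of all instances that satisfy both sides of the rule."""
    both = set(antecedent) | set(consequent)
    return sum(both <= inst for inst in instances) / len(instances)


def confidence(instances, antecedent, consequent):
    """Fraction of instances satisfying the antecedent that also satisfy the consequent."""
    antecedent = set(antecedent)
    both = antecedent | set(consequent)
    n_antecedent = sum(antecedent <= inst for inst in instances)
    if n_antecedent == 0:
        raise ValueError("no instance satisfies the antecedent")
    return sum(both <= inst for inst in instances) / n_antecedent


# Hypothetical data: each instance is the set of items it contains.
instances = [
    {"X", "Y", "A", "B"},
    {"X", "Y", "A"},
    {"X", "Y", "A", "B"},
    {"X", "B"},
    {"A", "B"},
]

# Rule: "X and Y implies A and B"
rule_support = support(instances, {"X", "Y"}, {"A", "B"})        # 2 of 5 instances
rule_confidence = confidence(instances, {"X", "Y"}, {"A", "B"})  # 2 of the 3 with X and Y
print(rule_support, rule_confidence)
```

A rule would be reported only if both values meet the user-specified minimums.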
**Attribute (field, variable, feature)**- A quantity describing an instance. An attribute has a domain defined by the attribute type, which denotes the values that can be taken by an attribute. The following domain types are common:
**Categorical**- A finite number of discrete values. The type *nominal* denotes that there is no ordering between the values, such as last names and colors. The type *ordinal* denotes that there is an ordering, such as in an attribute taking on the values low, medium, or high.
**Continuous (quantitative)**- Commonly, a subset of the real numbers, where there is a measurable difference between the possible values. Integers are usually treated as continuous in practical problems.

A *feature* is the specification of an attribute and its value. For example, color is an attribute. ``Color is blue'' is a feature of an example. Many transformations to the attribute set leave the feature set unchanged (for example, regrouping attribute values or transforming multi-valued attributes to binary attributes). Some authors use *feature* as a synonym for *attribute* (e.g., in feature-subset selection).
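To illustrate why transforming a multi-valued attribute to binary attributes leaves the feature set unchanged, the sketch below performs that transformation on one example. The helper name, the dictionary representation of an instance, and the color domain are assumptions made for illustration.

```python
def to_binary_attributes(instance, domains):
    """Replace each multi-valued attribute with one binary attribute per value."""
    binary = {}
    for attribute, values in domains.items():
        for value in values:
            # Each new attribute is named after the feature it tests for.
            binary[f"{attribute} is {value}"] = (instance[attribute] == value)
    return binary


# Hypothetical attribute domain and example.
domains = {"color": ["red", "green", "blue"]}
example = {"color": "blue"}

binary_example = to_binary_attributes(example, domains)
true_features = [name for name, flag in binary_example.items() if flag]
print(true_features)
```

The attribute set changes from one three-valued attribute to three binary ones, yet the example is still described by the single feature "color is blue".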

**Classifier**- A mapping from unlabeled instances to (discrete) classes. Classifiers have a form (e.g., decision tree) plus an interpretation procedure (including how to handle unknowns, etc.). Some classifiers also provide probability estimates (scores), which can be thresholded to yield a discrete class decision, thereby taking into account a utility function.
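The thresholding of scores described above can be sketched as follows. The example assumes a two-class problem, a score that is a calibrated probability of the positive class, and a simple utility function in which correct decisions cost nothing and each kind of error has a fixed cost. The function names and cost values are hypothetical.

```python
def classify(score, threshold=0.5):
    """Turn a probability estimate for the positive class into a discrete class."""
    return "positive" if score >= threshold else "negative"


def cost_based_threshold(cost_false_positive, cost_false_negative):
    """Threshold that minimizes expected cost under the assumptions above.

    Predicting positive costs (1 - p) * cost_false_positive in expectation,
    predicting negative costs p * cost_false_negative; the two are equal at
    p = cost_false_positive / (cost_false_positive + cost_false_negative).
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)


scores = [0.1, 0.3, 0.6, 0.9]

# Default threshold: both kinds of error are treated as equally costly.
default_decisions = [classify(s) for s in scores]

# Assumed costs: a missed positive is four times as costly as a false alarm.
threshold = cost_based_threshold(cost_false_positive=1, cost_false_negative=4)
cost_aware_decisions = [classify(s, threshold) for s in scores]

print(default_decisions)
print(threshold, cost_aware_decisions)
```

Lowering the threshold trades more false positives for fewer false negatives, which is how the utility function enters the class decision.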
**Confusion matrix**- A matrix showing the predicted and actual classifications. A confusion matrix is of size LxL, where L is the number of different label values. The following confusion matrix is for L=2:

| actual \ predicted | negative | positive |
|---|---|---|
| Negative | a | b |
| Positive | c | d |
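As a rough sketch, the cells a, b, c, and d of the L=2 matrix above can be obtained by counting (actual, predicted) pairs. Rows index the actual class and columns the predicted class, matching the layout above. The function name and the six labeled instances are hypothetical.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Return an LxL list of lists; rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(row, col)] for col in labels] for row in labels]


# Hypothetical actual and predicted classes for six instances.
actual = ["negative", "negative", "positive", "positive", "positive", "negative"]
predicted = ["negative", "positive", "positive", "negative", "positive", "negative"]

(a, b), (c, d) = confusion_matrix(actual, predicted, ["negative", "positive"])
print(a, b)
print(c, d)
```

The diagonal cells a and d count correct predictions, while b and c count the two kinds of error; the four cells sum to the number of instances.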
