The Earth Mover's Distance under a transformation set G is defined as
In words, we want to find the transformation of one distribution that minimizes its EMD to another distribution.
The general motivation for allowing transformations is to have a distance function which measures similarity modulo some factors. Consider the example below in which the two distributions have equal total weight.
The EMD between the distributions on the left is large because mass in the darker distributions must be transported large distances to change the darker distribution into the lighter distribution (lighter in terms of gray level, not total weight). The two distributions, however, appear visually similar. If we allow the darker distribution to be translated before comparing it to the lighter distribution, then the EMD is small. This is clearly shown in the right half of the above figure. We shall give more motivation for allowing transformations in the next section.
One class of distribution transformations contains transformations g that change a distribution's weights, but leave its points fixed:
Here y=(Y,u) splits the distribution into its points contained in the matrix Y, and its weights contained in the vector u. A useful transformation set of this class arises in the scale estimation phase of the SEDL image retrieval system. In this setting, transformations are parametrized by a scale factor c, and
Here, the distributions x and y are the color distributions for a database image and a query pattern, respectively, and the goal is to compute an estimate for the scale (size) at which the pattern may occur within the database image. See the description of the SEDL scale estimation phase for more details.
Another class of distribution transformations consists of transformations g that change the points of a distribution, but leave its weights fixed:
For example, suppose G=E, the group of Euclidean transformations. Distributions in E are defined by a rotation matrix R and a translation vector t, and
In words, gR,t rotates each distribution point by R, and then translates the result by t. For the case when transformations change only distribution points, we shall give an iteration that is guaranteed to converge, although it may converge to only a locally optimal transformation. Before we do this, however, let us give some more motivation for allowing transformations.
|top||Title, Table of Contents, The EMD|
|next||Motivation from Content-Based Image Retrieval|
The ideas and results contained in this document are part of my thesis, which will be published as a Stanford computer science technical report in June 1999.
S. Cohen. Finding Color and Shape Patterns in Images. Thesis Technical Report STAN-CS-TR-99-?. To be published June 1999.
Similar ideas applied to the EMD under translation have already been published in the technical report .
Email comments to firstname.lastname@example.org.