# The Earth Mover's Distance Under Transformations

We begin with a statement of the problem and the basic motivation for its study.

## The Problem

The Earth Mover's Distance under a transformation set G is defined as

EMDG(x,y) = ming in G EMD(x,g(y)).

In words, we want to find the transformation of one distribution that minimizes its EMD to another distribution.

The general motivation for allowing transformations is to have a distance function which measures similarity modulo some factors. Consider the example below in which the two distributions have equal total weight.

The EMD between the distributions on the left is large because mass in the darker distributions must be transported large distances to change the darker distribution into the lighter distribution (lighter in terms of gray level, not total weight). The two distributions, however, appear visually similar. If we allow the darker distribution to be translated before comparing it to the lighter distribution, then the EMD is small. This is clearly shown in the right half of the above figure. We shall give more motivation for allowing transformations in the next section.

One class of distribution transformations contains transformations g that change a distribution's weights, but leave its points fixed:

g(y) = g(Y,u) = (Y,g(u)).

Here y=(Y,u) splits the distribution into its points contained in the matrix Y, and its weights contained in the vector u. A useful transformation set of this class arises in the scale estimation phase of the SEDL image retrieval system. In this setting, transformations are parametrized by a scale factor c, and

gc(y) = gc(Y,u) = (Y,cu).

Here, the distributions x and y are the color distributions for a database image and a query pattern, respectively, and the goal is to compute an estimate for the scale (size) at which the pattern may occur within the database image. See the description of the SEDL scale estimation phase for more details.

Another class of distribution transformations consists of transformations g that change the points of a distribution, but leave its weights fixed:

g(y) = g(Y,u) = (g(Y),u).

For example, suppose G=E, the group of Euclidean transformations. Distributions in E are defined by a rotation matrix R and a translation vector t, and

gR,t(Y) = gR,t([y1 ... yn]) = [Ry1+t ... Ryn+t].

In words, gR,t rotates each distribution point by R, and then translates the result by t. For the case when transformations change only distribution points, we shall give an iteration that is guaranteed to converge, although it may converge to only a locally optimal transformation. Before we do this, however, let us give some more motivation for allowing transformations.