A Convergent Iteration

In this section, we present an iteration to search for an optimal transformation in the case that transformations change the points in a distribution, but leave the weights fixed. Since it is the weights in the distributions which determine the set of feasible flows,

F(x,g(y)) = F(x,y)    if g does not change the weights in x and y.

This fact is clear from examining the transportation problem which defines the EMD. Thus in the description below, the set of feasible flows between x and a transformed version of y is the same for every transformation.

The idea is to find the best flow for a fixed transformation, the best transformation for the previously computed flow, the best flow for the previously computed transformation, and so on:

(1)    F(k) = arg minF in F(x,y) WORK(F,x,g(k)(y)), and
(2)    g(k+1) = arg ming in G WORK(F(k),x,g(y)).

As shown in (1), the kth flow iterate is defined as a flow which minimizes the amount of work needed to match x and g(k)(y), where g(k) is the kth transformation iterate. As shown in (2), the (k+1)st transformation iterate is defined as a transformation g which minimizes the amount of work done by F(k) to match x and g(y). The iteration begins with an initial transformation g(0), and the order of evaluation is

g(0) --> F(0) --> g(1) --> F(1) --> ... .

An example will help clarify the iteration.

The left half of the figure below shows a dark and light distribution that we will match under translation using the previously given iteration.

[iter_ex.jpg]

The initial translation in this example is g(0)=0, and we shall translate the dark distribution. The best flow F(0) for g(0) is shown by the labelled arcs connecting dark and light masses in the left half of the diagram. This flow F(0) matches half (.5) the mass over a relatively large distance. When we ask for the best translation for this flow, we should expect that translation to move the .7 weight dark mass closer to the .8 weight light mass in order to decrease the total amount of work done by F(0). Indeed, the optimal translation g(1) aligns these two masses as shown in the right half of the figure. The best flow F(1) for this translation matches all of the .7 weight dark mass to the .8 weight light mass. Since these masses are on top of each other, it costs zero work to match 70% of the distributions. No further translation of the dark distribution improves the work -- g(2)=g(1) and the iteration converges.

The expression WORK(F,x,g(y)) defines a surface over the space FxG of all possible (flow,transformation)=(F,g) pairs. Our iteration provides a downhill path over this surface. The flow and transformation iterates define WORK iterates

WORK(k) = WORK(F(k),x,g(k)(y)).

From (2), we know that

(3)    WORK(F(k),x,g(k+1)(y)) <= WORK(F(k),x,g(k)(y))=WORK(k)

because g(k+1) is an optimal transformation for F(k) (and therefore yields a smaller work value than g(k) for F(k)). Similarly, (1) implies

(4)    WORK(k+1)=WORK(F(k+1),x,g(k+1)(y)) <= WORK(F(k),x,g(k+1)(y))

since F(k+1) is an optimal flow for g(k+1). Combining (3) and (4) shows

WORK(k+1) <= WORK(k).

Since the WORK sequence is decreasing and bounded below (by zero), it must converge. The corresponding transformation sequence may, however, converge to only a locally optimal transformation. Therefore, the iteration should be run with different initial transformations in search of the global minimum.

In just a couple of sections, we shall discuss two specific cases in which the EMD under transformation problem can be solved directly, without the aid of our iteration:

(i) equal-weight distributions under translation with the L2 distance squared as the ground distance, and
(ii) equal-weight 1D distributions under translation with the absolute value as the ground distance.

Although our iteration is not needed here, it is guaranteed to converge to the global minimum of the WORK function in case (i).

Our iteration is very general, in that it can be applied for a number of different (ground distance d, transformation set G) pairs. The optimal flow step (1) is a transportation problem, which can be solved with the transportation simplex algorithm ([3]). We call the problem in step (2) the optimal transformation problem. Our iteration can be applied whenever the optimal transformation problem can be solved.


top Title, Table of Contents, The EMD
prev Motivation from Content-Based Image Retrieval
next The Optimal Transformation Problem


The ideas and results contained in this document are part of my thesis, which will be published as a Stanford computer science technical report in June 1999.

S. Cohen. Finding Color and Shape Patterns in Images. Thesis Technical Report STAN-CS-TR-99-?. To be published June 1999.

Similar ideas applied to the EMD under translation have already been published in the technical report [1].

Email comments to scohen@cs.stanford.edu.