Suppose we are comparing scenes using the EMD between color distributions. One problem is that the EMD can be large between the color distributions of two images of the same scene taken under different illuminants, even if the camera location and orientation are fixed. This is because the pixel colors in the two images can be quite different, as illustrated in the images below.

Under certain assumptions on the reflectance functions of scene objects, a change in the illuminant spectral power distribution (SPD) from a(w) to b(w) causes a linear transformation A_{a,b} of image pixel colors ([2]).

By allowing for a linear transformation in the comparison between color distributions, the EMD can show, for example, that the scene under the white illuminant is similar to the scene under the red illuminant.
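As a rough illustration of this effect, the sketch below compares a tiny color distribution with a linearly transformed copy of itself, first directly and then after applying the transformation to one side. Everything in it is an assumption made for the example: the pixel colors, the diagonal matrix `A`, and the helper names `apply_linear` and `emd_equal_weights` are hypothetical, and the EMD is computed by brute force in the special case of equal-size sets with equal weights, where an optimal flow is a one-to-one matching. The transformation is also taken as known here rather than searched for.

```python
from itertools import permutations
from math import dist


def apply_linear(A, color):
    """Return the product of a 3x3 matrix A (list of rows) with a color triple."""
    return tuple(sum(a * c for a, c in zip(row, color)) for row in A)


def emd_equal_weights(xs, ys):
    """EMD between two equal-size point sets whose points all have weight 1/n.

    In this special case some optimal flow is a one-to-one matching, so a
    brute-force search over permutations is enough for tiny inputs.
    """
    n = len(xs)
    assert n == len(ys), "this sketch only handles equal-size point sets"
    return min(
        sum(dist(x, y) for x, y in zip(xs, perm)) / n
        for perm in permutations(ys)
    )


# Hypothetical pixel colors (RGB in [0, 1]) of a scene under a white illuminant.
scene_white = [(0.8, 0.8, 0.8), (0.2, 0.6, 0.4), (0.5, 0.25, 0.75)]

# Hypothetical illuminant change: a diagonal map that attenuates green and blue.
A = [
    [1.0, 0.0, 0.0],
    [0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5],
]
scene_red = [apply_linear(A, c) for c in scene_white]

# Direct comparison: the distributions look different.
emd_plain = emd_equal_weights(scene_white, scene_red)

# Comparison after transforming one distribution by A: the difference vanishes.
emd_transformed = emd_equal_weights(
    [apply_linear(A, c) for c in scene_white], scene_red
)
```

With these made-up values `emd_plain` is clearly positive while `emd_transformed` is zero, which is the behavior the paragraph above describes.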

Texture comparison is another example in which allowing transformations is useful. Excellent results using the EMD to compare textures were obtained by Y. Rubner ([4]). The main idea is to summarize a texture by a distribution of energy in the spatial frequency plane. A distribution point x_{i} is a point in the spatial frequency plane, and its weight w_{i} is the fraction of the total energy at that frequency. The textures shown below contain energy at only one spatial frequency, but this will be enough to make our point clear.

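Suppose we want the EMD to be small between the energy distributions for the left and right textures because these differ only by a scaling and rotation. As shown above, let q=(f_{x},f_{y}) denote a point in spatial frequency space. If we work in log-polar spatial frequency space, recording the logarithm of the magnitude of q together with its angle instead of (f_{x},f_{y}), then scaling and rotating a texture translates its energy distribution.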

By allowing for a translation in log-polar spatial frequency space, the EMD captures the similarity between textures that differ primarily in scale and orientation.
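The following sketch checks numerically that a scaling and rotation of a spatial frequency point becomes a translation in log-polar coordinates. The helper names `to_log_polar` and `scale_and_rotate`, the sample point, and the values of `s` and `theta` are assumptions chosen for illustration, and the scaling is applied directly to the frequency point rather than to a texture image.

```python
from math import atan2, cos, hypot, log, pi, sin


def to_log_polar(q):
    """Map q = (fx, fy) to (log of its magnitude, its angle in radians)."""
    fx, fy = q
    return (log(hypot(fx, fy)), atan2(fy, fx))


def scale_and_rotate(q, s, theta):
    """Scale q by s > 0 and rotate it counterclockwise by theta radians."""
    fx, fy = q
    return (
        s * (fx * cos(theta) - fy * sin(theta)),
        s * (fx * sin(theta) + fy * cos(theta)),
    )


# Hypothetical frequency point and transformation parameters.
q = (3.0, 1.0)
s, theta = 2.0, pi / 6

q2 = scale_and_rotate(q, s, theta)
r1, a1 = to_log_polar(q)
r2, a2 = to_log_polar(q2)

# The displacement in log-polar space depends only on (s, theta), not on q.
shift = (r2 - r1, a2 - a1)
```

Here `shift` agrees with `(log(s), theta)` up to rounding. The angles in this example are chosen so that no wrap-around at ±pi occurs; a fuller treatment would have to handle the periodicity of the angle coordinate.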

The need for transformations might be more direct than in the previous two applications, in the sense that distribution points may be points in the image plane instead of points in a color space or a spatial frequency space. Suppose, for example, that we wish to match features in a stereo pair of images as shown below.

if the thickness of the object is small in comparison to its distance from the camera center of projection.

The ideas and results contained in this document are part of my thesis, which will be published as a Stanford computer science technical report in June 1999.

S. Cohen.
**Finding Color and Shape Patterns in Images**.
*Thesis Technical Report STAN-CS-TR-99-?*.
To be published June 1999.

Similar ideas applied to the EMD under translation have already been published in the technical report [1].

Email comments to scohen@cs.stanford.edu.