## Some Notes

An excellent source for applications of the EMD in content-based image retrieval is the work of Y. Rubner. He has successfully applied the EMD to measure global color similarity between images ([5,6,7]) and texture similarity ([4,5,7]). In these cases, the feature space in which the distribution points live and the ground distance used to compare the points are different, but the same EMD framework is applied.

The EMD is a metric between equal-weight distributions whenever the underlying ground distance is a metric between points. The difficult part in the proof of this statement is to show that the triangle inequality holds. Assume that distributions have total weight one, so that the EMD is exactly the minimum of amount of work to change one distribution into another. One way to morph x into z is to use a minimum work morph of x into y, and then use a minimum work morph of y into z. The cost of the morph through y is EMD(x,y)+EMD(y,z). Since EMD(x,z) is the minimum amount of work required to change x into z, it cannot be more than the work for the particular morph through y. A formal proof of the metric property can be found in [5].

Another interesting property of the EMD is that the ground distance between the centroids of equal-weight distributions is a lower bound on their EMD for any ground distance between points induced by a vector norm (e.g. the Euclidean distance between points is the Euclidean norm of their difference). There are also lower bounds for the equal-weight case which use the EMD between projections of the distributions onto lines through the origin. The EMD between equal-weight 1D distributions can be computed much more efficiently than the EMD between more general distributions. Some information on the EMD in 1D can be found in this document, in the section on the EMD under translation in 1D. The technical report [1] covers the previously mentioned lower bounds, as well as centroid-based and projection-based lower bounds which can be applied in the partial matching case.

The EMD is more general than the way I have described it thus far. There is actually nothing in its formulation that assumes the xi's and the yj's in distributions x and y are points. The defining linear program uses only the distance d(xi,yj), so xi and yj could be any objects between which a distance can be computed. It is not necessary that these objects have a point representation, or that the ground distance is some standard point distance such as L2 or L1. In general, the EMD is a distance measure between two sets of weighted objects which is built upon a distance between individual objects.

Email comments to scohen@cs.stanford.edu.