to the optimal transformation problem when the transformation set is translation and the ground distance is the Euclidean distance squared. When x and y are equal-weight distributions, we must have
for any feasible flow between x and y. This is because all the mass in both distributions must be matched when the two distributions have the same total weight. Substituting (9) into (8), we see that
where
centroid(x) | = | sum_{i=1..m} w_{i} x_{i} / w^{S}, and |
centroid(y) | = | sum_{j=1..n} u_{j} y_{j} / u^{S}. |
Here we have also used the fact that x and y are equal-weight distributions: w^{S}=u^{S}=min(w^{S},u^{S}).
We have shown that the optimal translation in this case does not depend on the feasible flow used. The translation t^{*} which lines up the centroids of x and y is the unique optimal translation for every feasible flow. To compute the EMD under translation between equal-weight distributions with d=(L_{2})^{2}, we simply compute EMD(x,y+centroid(x)-centroid(y)). Although our iteration is not needed here, it is guaranteed to converge to the global minimum of the WORK function in this case. In fact, it will converge in only a couple of steps since g^{(2)}=g^{(1)}=t^{*}.
It turns out that there is a simple solution to computing the EMD between equal weight distributions in 1D that involves the cumulative distribution functions (CDFs) of the distributions. The CDF for x starts at 0, increases an amount w_{i} at each point x_{i}, and eventually becomes constant=w^{S} at the largest point x_{m}. The CDF for x is the bold staircase graph, while the CDF for y is the regular thickness staircase graph. Since x and y are equal-weight distributions, the two CDFs become constant at the same value w^{S}=u^{S}. The EMD between x and y is equal to the area between the CDFs of x and y divided by the total weight w^{S}=u^{S}. The area between the CDFs is shaded, and the corresponding optimal flow is indicated with arrows. This optimal flow is called the CDF flow, and is denoted F^{CDF}.
The CDF flow is given by
where
are the partial sums of the distribution weight vectors w and u, W_{0} = U_{0} = 0, and |X| denotes the size of interval X. The partial sums U_{0}, U_{1}, ..., U_{n} are the same for y and every translated version of y. This means that the CDF flow is an optimal flow between x and y+t for every translation t. An EMD computation between x and a translation of y is shown in the right half of the above figure. To compute the EMD under translation between equal-weight distributions in 1D with d=L_{1}, we simply solve the optimal translation problem for the fixed flow F=F^{CDF}. The optimal translation problem with d=L_{1} is covered in [1].
top | Title, Table of Contents, The EMD |
prev | The Optimal Transformation Problem |
next | References |
The ideas and results contained in this document are part of my thesis, which will be published as a Stanford computer science technical report in June 1999.
S. Cohen. Finding Color and Shape Patterns in Images. Thesis Technical Report STAN-CS-TR-99-?. To be published June 1999.
Similar ideas applied to the EMD under translation have already been published in the technical report [1].
Email comments to scohen@cs.stanford.edu.