A color signature is a collection of (color,weight) pairs, where the
weight of a color is the fraction of image pixels classified as that
color. For example the query color signature (Y,u), shows that
about u_{1}=23% of the query is y_{1}=white,
u_{2}=18% is y_{2}=dark green, and
u_{3}=17% is y_{3}=yellow. It is convenient to think
of color signatures as discrete distributions of mass/weight in some color
space with total weight 1=100%.

In the example above, the query is about 10% of the total database image area. Suppose we scale down the query signature to have total weight 0.10 instead of 1.0, as shown in the above color signature labelled "Scaled Comet Logo Signature". Then all the color mass in the scaled query signature can be matched to similar-color mass in the image signature (90% of the color mass in the image is not needed to perform this match). If we decrease the total weight of the query to less than 0.10, the match does not improve because we already had enough image mass of each query color to match all the query mass when the total was 0.10. If we increase the total weight of the query to more than 0.10, then, the match will get worse because, for example, there will not be enough red and yellow mass in the image to match the red and yellow in the scaled query.

There is a distance measure between discrete distributions
of mass called the
Earth Mover's Distance (EMD). One can prove that
the EMD between the image distribution and the scaled query
distribution decreases as the scale parameter c is decreased, and
furthermore that this distance becomes constant for all c <= c^{0}.
This coincides with the intuition given in the previous paragraph. A
portion of the graph of EMD versus c for the comet example is shown above,
to the right of the color signatures. The value
c^{0} is used as an estimate for the scale at which the
pattern appears in the image.

In the comet example, the scale estimate is very accurate. If,
however, there were a bit more red and yellow in the image than just in
the query occurence, then the scale estimate would be a bit too
large. This is because "background" red and yellow in the image
matches the red and yellow in the query just as well as the red
and yellow from the query occurrence. In general, the scale estimation
algorithm has the attractive property that the accuracy of the
scale estimate is determined by the query color with the
*least* background clutter within the image. The
larger this minimum background clutter, the more overestimated the
scale. Since there is no background clutter for red and yellow in the
comet example, the scale estimate were very accurate, and this is
despite the fact that there is a lot of background clutter for white
(since there is a lot of white in the image that is not part of the
query occurrence). In summary, just one query color with a small
amount of background clutter in the image is enough to obtain an
accurate scale estimate.

The user provides some minimum scale c_{min} that he/she is
interested in finding the pattern. For simplicity of description,
assume c^{0} >= c_{min}. Then initially we know is
that c^{0} is somewhere in the interval
[c_{min},1]. If we check the value of the EMD at the midpoint
c_{mid} of this interval, and, as in the example above, it is
greater than the EMD at c_{min}, then we know that
c^{0} is somewhere in the smaller interval
[c_{min},c_{mid}] (since the EMD decreases as c
decreases). By checking the value of the EMD at the midpoint of the
interval in which we are sure that c^{0} occurs, we can keep
reducing the size of this interval (shown as a thick line segment in
the graphs above) until it becomes smaller than some
user-specified precision for the scale estimate.

Although the EMD versus c curve must become constant for c small enough, this constant value may be large. This can happen, for example, if a substantial portion of the query pattern is red, but there is no color in the image which is at all similar to red. When the EMD curve flattens out at a large EMD value, the system asserts that the pattern does not occur within the image, and it does not perform the the initial placement and refinement stages.

Some results from the scale estimation phase are shown below. To the right of each image is the pattern scaled according to the estimate.

More results for the scale estimation are shown below.

top | Title, Table of Contents, Introduction |

prev | Image Signatures |

next | The Initial Placement Phase |

The ideas and results contained in this document are part of my thesis, which will be published as a Stanford computer science technical report in June 1999.

S. Cohen.
**Finding Color and Shape Patterns in Images**.
*Thesis Technical Report STAN-CS-TR-99-?*.
To be published June 1999.

Email comments to scohen@cs.stanford.edu.