The figure below illustrates the main ideas in this phase.
In the left column, a pattern is shown above an image that contains the pattern at a different scale and orientation. The center column shows the pattern and image signatures in Color x Position space. These signatures show only the locations and colors of the regions, not their areas. For example, the three blue circles in the pattern signature represent the three blue letters in the pattern. The occurrence of the pattern in the image gives rise to a similar set of signels in the image signature. There is a similarity transformation g=g* of the pattern regions that aligns them with regions of the same color in the image, and our goal is to find this optimal transformation. The main idea is to start from a transformation g0 determined by the scale estimation and initial placement phases (see the right column) and to iteratively refine the estimate of the pattern scale, orientation, and location until the match stops improving. In this example, only the pattern orientation needs to be adjusted, since the location and scale are already correct.
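We denote the image signature by X and the query (pattern) signature by Y:

$$X = \{((A_1,P_1),W_1), \ldots, ((A_M,P_M),W_M)\},$$
$$Y = \{((B_1,Q_1),U_1), \ldots, ((B_N,Q_N),U_N)\}.$$

Each signel pairs an attribute (A or B) with an image position (P or Q) and carries a weight (W or U).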
Here is the signature for the pattern in the above example.
We assume that both signatures are normalized to have total weight 1 (= 100% of an image).
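As a small illustration of this normalization, the sketch below rescales signel weights so that they sum to 1. The representation of a signature as a list of ((attribute, position), weight) tuples, the function name normalize_signature, and the sample numbers are assumptions made for this example, not part of SEDL.

```python
def normalize_signature(signature):
    """Rescale signel weights so that they sum to 1.

    `signature` is a list of ((attribute, position), weight) tuples.
    This layout is an assumption made for illustration.
    """
    total = sum(weight for _, weight in signature)
    if total <= 0:
        raise ValueError("signature must have positive total weight")
    return [(signel, weight / total) for signel, weight in signature]


# Three regions with made-up CIE-Lab colors, positions, and pixel areas.
pattern = [
    (((50.0, 10.0, -60.0), (12.0, 30.0)), 200.0),
    (((50.0, 10.0, -60.0), (40.0, 30.0)), 300.0),
    (((80.0, 0.0, 0.0), (26.0, 50.0)), 500.0),
]
normalized = normalize_signature(pattern)
```

After normalization each weight can be read as the fraction of the total area covered by that region.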
We shall define a distance between signatures in Attribute x Position space based on a distance between signels in Attribute x Position space. For the color case, the distance in Color x Position space measures the overall distance between two regions with possibly different colors and locations. We use a linear combination of attribute and position distance:

$$d_{ap}\big((A_I,P_I),(B_J,Q_J)\big) = d_{attr}(A_I,B_J) + K\, d_{pos}(P_I,Q_J),$$

where dpos is the distance between positions in the image plane and the constant K, written here as scaling the position term, trades off the importance of distance in attribute space against distance in the image plane (and accounts for the possibly different orders of magnitude returned by the individual attribute and position distance functions). In the color case, dattr(AI,BJ) is the Euclidean distance between colors AI and BJ represented in CIE-Lab coordinates. In the shape case, dattr(AI,BJ) is the distance in radians between two angles AI and BJ.
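The signel distance can be sketched in a few lines. This is only an illustration under stated assumptions: K is taken to multiply the position term, position distance is taken to be squared Euclidean distance in the image plane, and angles are treated as periodic with period 2π. The function names are hypothetical.

```python
import math


def color_distance(lab_a, lab_b):
    """Euclidean distance between two colors given as CIE-Lab triples."""
    return math.dist(lab_a, lab_b)


def angle_distance(angle_a, angle_b):
    """Distance in radians between two angles, wrapped to [0, pi].

    Assumption: angles are periodic with period 2*pi.
    """
    diff = abs(angle_a - angle_b) % (2.0 * math.pi)
    return min(diff, 2.0 * math.pi - diff)


def signel_distance(signel_a, signel_b, k, attr_distance):
    """Attribute distance plus K times position distance.

    Assumptions: K scales the position term, and position distance is
    the squared Euclidean distance between the two image positions.
    """
    (attr_a, pos_a), (attr_b, pos_b) = signel_a, signel_b
    return attr_distance(attr_a, attr_b) + k * math.dist(pos_a, pos_b) ** 2


# Color case: two regions with slightly different colors and positions.
region_a = ((50.0, 10.0, -60.0), (0.0, 0.0))
region_b = ((50.0, 13.0, -56.0), (3.0, 4.0))
d_color = signel_distance(region_a, region_b, 0.1, color_distance)

# Shape case: the attribute is an angle in radians.
d_shape = signel_distance((0.1, (0.0, 0.0)), (2.0 * math.pi - 0.1, (0.0, 0.0)),
                          0.1, angle_distance)
```

In the color example the color distance is 5 and the squared position distance is 25, so with K = 0.1 the two terms contribute 5 and 2.5. A small K like this keeps the position term from swamping the color term when positions are measured in pixels.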
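The distance between the image signature X and the pattern signature Y is given by

$$D_G(Y,X) = \min_{g \in G,\; f} \sum_{J=1}^{N} U_J \, d_{ap}\big((A_{f(J)},P_{f(J)}),\,(B_J,g(Q_J))\big),$$

where dap is the signel distance in Attribute x Position space and f ranges over the correspondences that assign each pattern region J an image region f(J).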
Let us explain this formula in terms of the color case. For the moment, ignore the minimization and just consider the summation. Each pattern region J is matched to some image region f(J) (in {1,...,M}). The distance between these regions in Color x Position space is computed and weighted by the area of pattern region J. These weighted region-region distances are summed over all pattern regions. The signature distance DG(Y,X) allows for a transformation g of the query region positions (taken from some given set of transformations G), and calls for the minimum weighted sum of region-region distances over all possible pairs of correspondences and transformations.
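For a fixed transformation g, the summation and the minimization over correspondences can be sketched as follows. The sketch assumes that f is unconstrained (several pattern regions may map to the same image region), so each pattern region can independently pick its cheapest image region. It also assumes K scales the position term and that position distance is squared Euclidean. All names and numbers are hypothetical.

```python
import math


def signel_distance(attr_a, pos_a, attr_b, pos_b, k):
    # Assumed form: Euclidean attribute distance + K * squared position distance.
    return math.dist(attr_a, attr_b) + k * math.dist(pos_a, pos_b) ** 2


def best_correspondences(image, pattern, g, k):
    """Return (f, cost) for a fixed transformation g.

    f[J] is the index of the image region matched to pattern region J, and
    cost is the weighted sum of region-region distances for that matching.
    """
    f, cost = [], 0.0
    for (attr_b, pos_q), weight_u in pattern:
        moved = g(pos_q)  # only the pattern position is transformed
        costs = [signel_distance(attr_a, pos_p, attr_b, moved, k)
                 for (attr_a, pos_p), _ in image]
        best = min(range(len(costs)), key=costs.__getitem__)
        f.append(best)
        cost += weight_u * costs[best]
    return f, cost


# Signatures as lists of ((Lab color, position), weight); numbers are made up.
image = [
    (((50.0, 0.0, 0.0), (10.0, 0.0)), 0.5),
    (((50.0, 0.0, 0.0), (20.0, 0.0)), 0.3),
    (((90.0, 0.0, 0.0), (10.0, 0.0)), 0.2),
]
pattern = [
    (((50.0, 0.0, 0.0), (0.0, 0.0)), 0.6),
    (((90.0, 0.0, 0.0), (1.0, 0.0)), 0.4),
]


def shift_right(pos):
    """Example transformation g: translate by 10 units in x."""
    return (pos[0] + 10.0, pos[1])


f, cost = best_correspondences(image, pattern, shift_right, k=1.0)
```

Here the first pattern region lands exactly on the first image region, and the second pattern region is matched to the third image region because their colors agree, even though the first image region is equally close in position.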
At least a locally optimal transformation can be computed by alternately computing the best set of correspondences for a given transformation, then the best transformation for the previously computed correspondences, then the best set of correspondences for the previously computed transformation, and so on. The summation in the formula for DG(Y,X) gives a value for every (correspondence set f, transformation g) pair, and thus defines a surface over the space FxG of all such pairs. The alternation strategy gives a way to follow a path downhill along this surface. The job of the scale estimation and initial placement phases is to define the initial transformation g0 to be close to the globally optimal transformation g* so that our iteration converges to this globally optimal transformation or to a transformation which is nearly optimal.
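The alternation can be sketched as below. This is a minimal illustration, not SEDL's implementation. It assumes that G is the set of 2D similarity transformations, written as z -> a*z + t on complex positions (|a| is the scale and the phase of a is the rotation), that position distance is squared Euclidean and scaled by K, and that correspondences are unconstrained. Under these assumptions the best transformation for fixed correspondences is a weighted least-squares fit with a closed form. All names and data are hypothetical.

```python
import cmath
import math


def match(image, pattern, a, t, k):
    """Correspondence step: for fixed g(z) = a*z + t, give each pattern
    signel the image signel that is cheapest in attribute-position distance."""
    f, cost = [], 0.0
    for (attr_b, q), u in pattern:
        moved = a * q + t
        costs = [math.dist(attr_a, attr_b) + k * abs(moved - p) ** 2
                 for (attr_a, p), _ in image]
        best = min(range(len(costs)), key=costs.__getitem__)
        f.append(best)
        cost += u * costs[best]
    return f, cost


def fit_similarity(image, pattern, f):
    """Transformation step: weighted least-squares similarity for fixed f."""
    us = [u for _, u in pattern]
    qs = [q for (_, q), _ in pattern]
    ps = [image[i][0][1] for i in f]
    total = sum(us)
    q_mean = sum(u * q for u, q in zip(us, qs)) / total
    p_mean = sum(u * p for u, p in zip(us, ps)) / total
    spread = sum(u * abs(q - q_mean) ** 2 for u, q in zip(us, qs))
    if spread == 0.0:
        raise ValueError("pattern positions must not all coincide")
    a = sum(u * (p - p_mean) * (q - q_mean).conjugate()
            for u, p, q in zip(us, ps, qs)) / spread
    return a, p_mean - a * q_mean


def refine(image, pattern, a0, t0, k, max_iters=50):
    """Alternate the two steps until the cost stops decreasing."""
    a, t = a0, t0
    f, cost = match(image, pattern, a, t, k)
    for _ in range(max_iters):
        new_a, new_t = fit_similarity(image, pattern, f)
        new_f, new_cost = match(image, pattern, new_a, new_t, k)
        if new_cost >= cost - 1e-12:
            break
        a, t, f, cost = new_a, new_t, new_f, new_cost
    return a, t, f, cost


# Made-up example: the image holds the pattern at scale 2, rotated by
# 0.3 rad and shifted by 5+3j, plus one distractor region.
a_true, t_true = 2.0 * cmath.exp(0.3j), 5 + 3j
colors = [(30.0, 20.0, -50.0), (50.0, 60.0, 40.0), (60.0, -50.0, 40.0)]
positions = [0j, 10 + 0j, 10j]
pattern = [((c, q), u) for c, q, u in zip(colors, positions, [0.3, 0.3, 0.4])]
image = [((c, a_true * q + t_true), 0.1) for c, q in zip(colors, positions)]
image.append((((90.0, 0.0, 80.0), 40 + 40j), 0.7))

# Start with the correct scale and location but the wrong orientation.
a_fit, t_fit, f_fit, cost_fit = refine(image, pattern, 2.0 + 0j, 5 + 3j, k=0.01)
scale, angle = abs(a_fit), cmath.phase(a_fit)
```

Each step can only lower the same weighted sum, so the cost never increases from one iteration to the next. As the surrounding text notes, the result depends on the starting transformation: a poor initial guess can leave the iteration in a local minimum.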
The search for the minimum DG(Y,X) is directed by the colors of the regions, since these are unchanged by the transformations g in G. The system will not match two regions that are close together unless their colors are similar. As in the initial placement phase, "directed" should be contrasted with "exhaustive": the search for the optimal transformation does not simply try a discrete sampling of all transformations in some neighborhood of the initial transformation. Instead, the verification and refinement stage uses the image data, in particular the colors and layout of the image regions, to refine its estimate of the pattern scale, orientation, and location.
Some results from the verification phase are shown below. The red rectangle indicates the scale, orientation, and location where SEDL believes that the pattern occurs.
More results from the verification phase are shown below.
The ideas and results contained in this document are part of my thesis, which will be published as a Stanford computer science technical report in June 1999.
S. Cohen. Finding Color and Shape Patterns in Images. Thesis Technical Report STAN-CS-TR-99-?. To be published June 1999.
Email comments to scohen@cs.stanford.edu.