SEDL : The Verification and Refinement Phase

The Verification and Refinement Phase

During the verification and refinement phase, SEDL examines the positions of the colors within the initial placement rectangles to verify that the pattern or a region similar to the pattern indeed occurs within the image. The pattern appears in the image to the extent that there is a transformation of pattern region positions that aligns regions in the query with regions in the image of similar colors.

The Main Ideas

The figure below illustrates the main ideas in this phase.

In the left column, there is a pattern above an image which contains that pattern at a different scale and orientation. In the center column, I have shown the pattern and image signatures in Color x Position space. In these signatures, only the locations and colors of the regions are shown, not their areas. For example, the three blue circles in the pattern signature represent the three blue letters in the pattern. The occurrence of the pattern in the image gives rise to a similar set of signels in the image signature. There is a similarity transformation g=g^* of the pattern regions that aligns the regions in the pattern with regions of the same color in the image. Our goal is to find that optimal transformation g=g^*. The main idea is to start from a transformation g⁰ determined by the scale estimation and initial placement phases (see the right column), and iteratively refine the estimate for the pattern scale, orientation, and location until the match stops improving. In this example, only the pattern orientation needs to be adjusted, since the location and scale are already correct.

Some Details

Let us denote the image and query signatures in Attribute x Position space as X and Y, respectively :

image	X	=	{ ((A₁,P₁),W₁), ... , ((A_M,P_M),W_M) }, and
query	Y	=	{ ((B₁,Q₁),U₁), ... , ((B_N,Q_N),U_N) }.

Here is the signature for the pattern in the above example.

We assume that both signatures are normalized to have total weight 1 (= 100% of an image).
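As a minimal sketch of this representation, a signature can be held as a list of (attribute, position, weight) triples and rescaled so that the weights sum to 1. The names `Signel` and `normalize`, and the sample colors, positions, and areas, are illustrative assumptions and are not taken from SEDL.

```python
from typing import List, NamedTuple, Tuple


class Signel(NamedTuple):
    """One signature element: (attribute, position), weight."""
    attr: Tuple[float, ...]    # e.g. a CIE-Lab color (L, a, b)
    pos: Tuple[float, float]   # region location in the image plane
    weight: float              # region area (raw or normalized)


def normalize(signature: List[Signel]) -> List[Signel]:
    """Rescale the weights so that they sum to 1 (100% of the image)."""
    total = sum(s.weight for s in signature)
    if total <= 0:
        raise ValueError("signature must have positive total weight")
    return [s._replace(weight=s.weight / total) for s in signature]


# Hypothetical three-region signature with raw pixel areas as weights.
raw = [
    Signel((32.0, 79.0, -108.0), (10.0, 20.0), 300.0),
    Signel((53.0, 80.0, 67.0), (40.0, 25.0), 500.0),
    Signel((97.0, -21.0, 94.0), (70.0, 22.0), 200.0),
]
Y = normalize(raw)
```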

We shall define a distance between signatures in Attribute x Position space based on a distance between signels in Attribute x Position space. For the color case, the distance in Color x Position space measures the overall distance between two regions with possibly different colors and locations. We use a linear combination of the attribute distance and the squared position distance:

d_ap((A_I,P_I), (B_J,Q_J)) = d_attr(A_I,B_J) + K ||P_I-Q_J||²,

where the constant K trades off the importance of distance in attribute space against distance in the image plane (and accounts for the possibly different orders of magnitude returned by the individual attribute and position distance functions). In the color case, d_attr(A_I,B_J) is the Euclidean distance between colors A_I and B_J represented in CIE-Lab coordinates. In the shape case, d_attr(A_I,B_J) is the distance in radians between two angles A_I and B_J.
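For the color case, the signel distance can be sketched in a few lines. The function names and the value K = 0.5 are assumptions chosen for illustration; the text does not say what value of K SEDL uses.

```python
import math


def d_attr_color(a, b):
    """Euclidean distance between two colors given as CIE-Lab triples."""
    return math.dist(a, b)


def d_ap(a, p, b, q, k):
    """Attribute x Position distance: d_attr(A, B) + K * ||P - Q||^2."""
    dx, dy = p[0] - q[0], p[1] - q[1]
    return d_attr_color(a, b) + k * (dx * dx + dy * dy)


# Two regions whose colors differ by 5 Lab units and whose positions are
# 2 units apart: 5 + 0.5 * 2^2 = 7.
d = d_ap((50.0, 0.0, 0.0), (0.0, 0.0), (50.0, 3.0, 4.0), (2.0, 0.0), k=0.5)
print(d)  # 7.0
```

A larger K penalizes spatial misalignment more heavily relative to color difference.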

The distance between the image signature X and the pattern signature Y is given by

D_G(Y,X) = min_{f in F, g in G} sum_{J=1..N} U_J x d_ap((A_{f(J)},P_{f(J)}), (B_J,g(Q_J))).

Let us explain this formula in terms of the color case. For the moment, ignore the minimization and just consider the summation. Each pattern region J is matched to some image region f(J) in {1,...,M}; the function f is one set of correspondences, and F is the set of all correspondence sets under consideration. The distance between matched regions in Color x Position space is computed and weighted by the area U_J of pattern region J. These weighted region-region distances are summed over all pattern regions. The signature distance D_G(Y,X) allows for a transformation g of the query region positions (taken from some given set of transformations G), and calls for the minimum weighted sum of region-region distances over all possible (correspondence set, transformation) pairs.
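The summation for one fixed pair (f, g) can be sketched as follows. Here f is a list of image indices, one per pattern region, and g is any function on positions. The toy signatures, the helper names, and K = 1 are assumptions made for the example.

```python
import math


def d_ap(a, p, b, q, k):
    """d_attr(A, B) + K * ||P - Q||^2, with Euclidean color distance."""
    return math.dist(a, b) + k * ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)


def matching_cost(X, Y, f, g, k):
    """Sum over pattern regions J of U_J * d_ap(image region f(J), g(pattern region J))."""
    total = 0.0
    for j, (b, q, u) in enumerate(Y):
        a, p, _w = X[f[j]]          # image region matched to pattern region j
        total += u * d_ap(a, p, b, g(q), k)
    return total


def identity(q):
    return q


def shift_left(q):
    """Translate a position one unit in the negative x direction."""
    return (q[0] - 1.0, q[1])


# Toy image signature X (3 regions) and pattern signature Y (2 regions),
# each entry being (color, position, weight).
X = [((50.0, 0.0, 0.0), (0.0, 0.0), 0.5),
     ((60.0, 10.0, 10.0), (5.0, 5.0), 0.3),
     ((20.0, 0.0, 0.0), (9.0, 9.0), 0.2)]
Y = [((50.0, 0.0, 0.0), (1.0, 0.0), 0.25),
     ((60.0, 10.0, 10.0), (6.0, 5.0), 0.75)]
f = [0, 1]

cost_identity = matching_cost(X, Y, f, identity, k=1.0)   # 0.25*1 + 0.75*1
cost_shifted = matching_cost(X, Y, f, shift_left, k=1.0)  # perfect alignment
print(cost_identity, cost_shifted)  # 1.0 0.0
```

D_G(Y,X) is the smallest value this function takes over all f in F and g in G.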

At least a locally optimal transformation can be computed by alternately computing the best set of correspondences for a given transformation, then the best transformation for the previously computed correspondences, then the best set of correspondences for the previously computed transformation, and so on. The summation in the formula for D_G(Y,X) gives a value for every (correspondence set f, transformation g) pair, and thus defines a surface over the space FxG of all such pairs. The alternation strategy gives a way to follow a path downhill along this surface. The job of the scale estimation and initial placement phases is to define the initial transformation g⁰ to be close to the globally optimal transformation g^* so that our iteration converges to this globally optimal transformation or to a transformation which is nearly optimal.
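The alternation can be sketched under several assumptions that the text above does not state: G is the set of planar similarity transformations without reflection, written with complex numbers as q -> s*q + t; F places no one-to-one constraint on correspondences, so each pattern region is matched independently; and the loop stops when the cost no longer decreases. For fixed correspondences only the squared position term depends on g, so the best similarity is a weighted least-squares fit with a closed form. All function names and the test data are illustrative, not SEDL's implementation.

```python
import cmath
import math


def d_ap(a, p, b, q, k):
    """Color distance plus K times squared position distance (p, q complex)."""
    return math.dist(a, b) + k * abs(p - q) ** 2


def best_correspondences(X, Y, g, k):
    """Fixed g = (s, t): match each pattern region to its nearest image region."""
    s, t = g
    return [min(range(len(X)),
                key=lambda i: d_ap(X[i][0], X[i][1], b, s * q + t, k))
            for (b, q, _u) in Y]


def best_similarity(X, Y, f):
    """Fixed f: weighted least-squares similarity q -> s*q + t."""
    wsum = sum(u for _b, _q, u in Y)
    q_bar = sum(u * q for _b, q, u in Y) / wsum
    p_bar = sum(u * X[f[j]][1] for j, (_b, _q, u) in enumerate(Y)) / wsum
    num = sum(u * (X[f[j]][1] - p_bar) * (q - q_bar).conjugate()
              for j, (_b, q, u) in enumerate(Y))
    den = sum(u * abs(q - q_bar) ** 2 for _b, q, u in Y)
    s = num / den if den > 0 else 1 + 0j   # degenerate pattern: keep unit scale
    return s, p_bar - s * q_bar


def cost(X, Y, f, g, k):
    s, t = g
    return sum(u * d_ap(X[f[j]][0], X[f[j]][1], b, s * q + t, k)
               for j, (b, q, u) in enumerate(Y))


def refine(X, Y, g0, k, max_iters=50, tol=1e-12):
    """Alternate between correspondences and transformation until no improvement."""
    g = g0
    f = best_correspondences(X, Y, g, k)
    best = cost(X, Y, f, g, k)
    for _ in range(max_iters):
        g_new = best_similarity(X, Y, f)
        f_new = best_correspondences(X, Y, g_new, k)
        c = cost(X, Y, f_new, g_new, k)
        if c >= best - tol:
            break
        f, g, best = f_new, g_new, c
    return f, g, best


# Pattern: three colored regions. Image: the same regions scaled by 2,
# rotated by 30 degrees and translated by 5+3j, plus one gray distractor.
colors = [(53.0, 80.0, 67.0), (88.0, -86.0, 83.0), (32.0, 79.0, -108.0)]
Y = [(c, q, 1 / 3) for c, q in zip(colors, [0j, 1 + 0j, 1j])]
s_true, t_true = 2 * cmath.exp(1j * math.pi / 6), 5 + 3j
X = [(c, s_true * q + t_true, 0.25) for c, q, _u in Y]
X.append(((50.0, 0.0, 0.0), 20 + 20j, 0.25))

# Initial estimate with the right scale and location but no rotation.
f, (s, t), final_cost = refine(X, Y, g0=(2 + 0j, 5 + 3j), k=0.01)
print(f, abs(s), math.degrees(cmath.phase(s)))  # scale near 2, angle near 30
```

As in the example from the main ideas, only the orientation needs adjusting here, and the colors alone fix the correspondences, so one transformation update recovers the alignment. With a poor initial estimate the same loop can stop at a local minimum.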

The search for the minimum D_G(Y,X) is directed by the colors of the regions, since these are unchanged by transformations g in G. The system will not match two regions which are close together unless their colors are similar. As in the initial placement phase, a directed search should be contrasted with an exhaustive one. Here, the search for the optimal transformation does not simply try a discrete sampling of all transformations in some neighborhood of the initial transformation. Instead, the verification and refinement stage uses the image data, in particular the colors and layouts of the image regions, to refine its estimate of the pattern scale, orientation, and location.

Results

Some results from the verification phase are shown below. The red rectangle indicates the scale, orientation, and location where SEDL believes that the pattern occurs.

In general, these results are excellent. The localization of the pattern in the breathe right, jello, reynolds, clorox, and cornpops advertisements is near perfect. The orientation is slightly off in the pert example in the first row, third column. The system makes a mistake in its search for the pert logo within the pert advertisement in the second row, third column. In the tide example in row three, column two, the scale is decreased from the initial overestimate, but the rectangle settles between the two pattern occurrences.

More results from the verification phase are shown below.

In the first two misty advertisements in row one, the system was searching for the entire cigarette box, but ended up aligning the final rectangle to a major portion of the pattern occurrence (the stripes on the box). In the third misty example in row one, SEDL finds one of the two pattern occurrences. In the scholl's example in row two, column three, SEDL finds one of the four pattern occurrences (despite an initial overestimate in scale). The position and orientation in the taco bell example could be better. The ziploc and casting results are near perfect. The system never quite recovered from the initial scale overestimate in the misty advertisement in row three, column two; the final rectangle, however, contains both occurrences of the pattern.

top Title, Table of Contents, Introduction
prev The Initial Placement Phase
next Experiments with a Color Database of Product Advertisements

The ideas and results contained in this document are part of my thesis, which will be published as a Stanford computer science technical report in June 1999.

S. Cohen. Finding Color and Shape Patterns in Images. Thesis Technical Report STAN-CS-TR-99-?. To be published June 1999.

Email comments to scohen@cs.stanford.edu.
