The Pattern Problem

An important problem in content-based image retrieval (CBIR) is

The Pattern Problem. Given an image and a query pattern, determine if the image contains a region which is visually similar to the query. If so, find at least one such image region.

For example,
[clorox_tmpl.jpg] [clorox_2.jpg]
[ntu_fs_m6.jpg] [ntu_fs_m557.jpg]
we would like to find the clorox logo within the clorox advertisement, and the Chinese character on the left within the Chinese character on the right. These two pairs of images are example inputs for the color and shape pattern problems. In the shape case, the occurrence of one Chinese character within another often implies that the characters are semantically related.

The pattern problem is important in image retrieval because its solution may allow users to query a database for images containing a specific item (e.g. object, product logo, machine part, etc.). This is a very common form of query in a text-based system, where, for example, a user may be looking for all articles that mention a specific term.

The main difficultly in efficiently solving a single pattern problem is the combination of partial matching and scaling. Allowing partial matching means that we do not know where in the image to look, and allowing scaling means that we do not know how much of the image to look at. Furthermore, there are many independent (i.e. non-overlapping) places in the image to look for very small scale occurrences of a pattern. In addition to these efficiency difficulties, there is the fundamental issue of how to combine color (shape) information with position information in judging the similarity of two color (shape) patterns.

In the context of database query, the fact that partial match distance measures do not obey the triangle inequality is also an impediment to efficiency. In the example below,

[notmetric.jpg]
we see a situation where Q and P are close to R (according to a partial match distance measure), but far from each other. Q is close to R because it is contained within R, and P is close to R because both contain sizeable regions which are identical. Yet Q is far from P because nothing in P looks like Q, and vice-versa. The lesson here is that a system cannot prune a search in R for Q because of a negative search result in P (as would be possible if the system had a distance measure which obeys the triangle inequality). Q occurs within R in a region that does not match anything in P.


top Title, Table of Contents, Introduction
prev Introduction
next The SEDL Image Retrieval System


The ideas and results contained in this document are part of my thesis, which will be published as a Stanford computer science technical report in June 1999.

S. Cohen. Finding Color and Shape Patterns in Images. Thesis Technical Report STAN-CS-TR-99-?. To be published June 1999.

Email comments to scohen@cs.stanford.edu.