The ideas and results contained in this document are part of my thesis, which will be published as a Stanford computer science technical report in June 1999. Click here for a PostScript version (20MB after decompression) of this entire document.
I thank Madirakshi Das from the FOCUS project at University of Massachussetts for the advertisements used in my color database experiments, and Oliver Lorenz for the Chinese characters used in my shape database experiments.
The motivation and setting for my work is content-based image retrieval (CBIR). There are many image retrieval systems that use a global measure of image similarity. By global, I mean that the notion of similarity assumes that all of the information in a database image matches all of the information in the query. A global or complete match distance measure penalizes any information in one image which does not match information in the other image. An example of a complete match distance measure is the L1 distance between color histograms taken over entire images.
While the assumption of complete matching is convenient and sometimes useful, it is not conducive to building systems that retrieve images which are semantically related to a given query image or drawing. Semantic image similarity, which is what CBIR users will expect and demand, often follows only from a partial match between images as shown in the figure below.
These two images are semantically related because they both contain zebras. This is obviously a case where a partial match distance measure is appropriate. There are no clouds and sky in the right image, and there are no trees in the left image.
If we ever want to obtain semantic similarity from measures of visual similarity, then our notion of visual similarity must allow for partial matching of images. A partial match distance measure is small whenever the image and query contain similar regions, and does not necessarily penalize information in one image that does not match information in the other.
Another important point to be made here is that semantic similarity often follows from only a very small partial match. For example,
|top||Title, Table of Contents, Introduction|
|next||The Pattern Problem|
The ideas and results contained in this document are part of my thesis, which will be published as a Stanford computer science technical report in June 1999.
S. Cohen. Finding Color and Shape Patterns in Images. Thesis Technical Report STAN-CS-TR-99-?. To be published June 1999.
Email comments to firstname.lastname@example.org.