The goal of the region extraction system is to obtain the spatial boundaries of the regions that will be of most interest to the user. The process of region extraction differs from image segmentation. Segmentation corresponds to a complete partitioning of the image such that each image point is assigned to one segment. With region extraction an image point may be assigned to many regions or to none. Conceptually, this is more desirable than segmentation because it supports an object-oriented representation of image content. For example, an image point corresponding to the wheel of a car can simultaneously belong to the two different regions that encapsulate respectively the wheel and the car as a whole.
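The difference from segmentation can be made concrete with a minimal sketch. It assumes, purely for illustration, that each extracted region is stored as an axis-aligned rectangle; the `Region` class, the `regions_at` helper, and the coordinates are hypothetical and not part of the system described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Axis-aligned rectangular region (an illustrative assumption)."""
    name: str
    top: int
    left: int
    bottom: int  # exclusive
    right: int   # exclusive

    def contains(self, row, col):
        return self.top <= row < self.bottom and self.left <= col < self.right


def regions_at(regions, row, col):
    """Return the names of all regions that contain the point (row, col)."""
    return [r.name for r in regions if r.contains(row, col)]


# Overlapping regions: the wheel lies inside the car.
regions = [
    Region("car", top=10, left=5, bottom=40, right=90),
    Region("wheel", top=30, left=15, bottom=40, right=25),
]

on_wheel = regions_at(regions, 35, 20)   # point inside both rectangles
on_body = regions_at(regions, 15, 50)    # point inside the car only
background = regions_at(regions, 0, 0)   # point inside no region
```

A segmentation would instead have to return exactly one label for each of these three points.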
There are several techniques for region extraction. The least complex method involves (1) manual or semi-automated extraction. In this process the images are evaluated by people, and the pertinent information is identified or confirmed visually. This is extremely tedious and time-consuming for large image and video databases. Another procedure for region extraction utilizes a (2) fixed block segmentation of the images. Representing the color content of small blocks independently increases the likelihood that matches between regions can be obtained. However, it is difficult to pick the best scale at which to block the images. A third technique involves (3) color segmentation. Several techniques have recently been proposed for this, such as color pairs [CLP94] and foreground object color extraction tools [HCP95].
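As an illustration of technique (2), the following sketch computes one color histogram per fixed block. It assumes the image has already been quantized to integer color indices and is stored as a list of rows; the function name `block_histograms` and the toy image are hypothetical.

```python
def block_histograms(image, block_size, num_colors):
    """Split `image` into non-overlapping square blocks and histogram each one.

    `image` is a list of rows of quantized color indices in range(num_colors).
    Returns a dict mapping (block_row, block_col) to a histogram list.
    Blocks on the bottom and right edges may be smaller than block_size.
    """
    height, width = len(image), len(image[0])
    histograms = {}
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            hist = [0] * num_colors
            for row in image[top:top + block_size]:
                for color in row[left:left + block_size]:
                    hist[color] += 1
            histograms[(top // block_size, left // block_size)] = hist
    return histograms


# Toy 4x4 image with three quantized colors.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 1, 1],
    [2, 2, 1, 1],
]

fine = block_histograms(image, block_size=2, num_colors=3)
coarse = block_histograms(image, block_size=4, num_colors=3)
```

With 2x2 blocks every block here contains a single color, whereas one 4x4 block mixes all three, which is the scale-selection difficulty noted above.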
We propose a new technique which partly employs the color histogram (4) back-projection developed by Swain and Ballard for matching images [SB91][SC95]. The basic idea behind the back-projection algorithm is that the most likely location of a spatially localized color histogram within an image is found by back-projecting onto the image the quotient of the query histogram and the image histogram. More specifically, given query histogram g[m] and image histogram h[m], let
$$s[m] = \min\left(\frac{g[m]}{h[m]},\ 1\right).$$
Then replace each point in the image by the corresponding confidence score B[m,n] = s[I[m,n]], where I[m,n] is the color index of the image at that point. After convolving B[m,n] with a blurring mask, the location of the peak value corresponds to the most likely location of the model (query) histogram within the image. In small image retrieval applications this computation is performed at the time of query to find objects within images [SB91][EM95]. However, for a large collection it is not feasible to compute the back-projection on the fly. A faster color indexing method is needed.
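The back-projection steps can be sketched as follows. This is a minimal illustration, not the implementation used in [SB91]: it assumes images of quantized color indices stored as lists of rows, takes the ratio as min(g[m]/h[m], 1), and uses a uniform 3x3 box sum with zero padding as the blurring mask, since the mask is not specified here. All function names are hypothetical.

```python
def histogram(image, num_colors):
    """Count how often each quantized color index occurs in the image."""
    hist = [0] * num_colors
    for row in image:
        for color in row:
            hist[color] += 1
    return hist


def ratio_histogram(query_hist, image_hist):
    """s[m] = min(g[m] / h[m], 1). Bins with h[m] == 0 are set to 0;
    they are never looked up because no pixel has that color."""
    return [min(g / h, 1.0) if h > 0 else 0.0
            for g, h in zip(query_hist, image_hist)]


def back_project(image, ratio):
    """B[m, n] = s[I[m, n]]: replace each pixel by its confidence score."""
    return [[ratio[color] for color in row] for row in image]


def box_blur(scores, radius=1):
    """Sum scores over a (2 * radius + 1)^2 window; outside the image counts as 0."""
    height, width = len(scores), len(scores[0])
    blurred = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            for rr in range(max(0, r - radius), min(height, r + radius + 1)):
                for cc in range(max(0, c - radius), min(width, c + radius + 1)):
                    blurred[r][c] += scores[rr][cc]
    return blurred


def peak_location(scores):
    """Return (row, col) of the largest score; the first one found wins ties."""
    best, best_pos = scores[0][0], (0, 0)
    for r, row in enumerate(scores):
        for c, value in enumerate(row):
            if value > best:
                best, best_pos = value, (r, c)
    return best_pos


# Toy 6x6 image: background color 0 with a 3x3 patch of color 1.
image = [[0] * 6 for _ in range(6)]
for r in range(2, 5):
    for c in range(1, 4):
        image[r][c] = 1

query_hist = [3, 9]  # hypothetical query: mostly color 1
ratio = ratio_histogram(query_hist, histogram(image, num_colors=2))
peak = peak_location(box_blur(back_project(image, ratio)))
```

In this toy case the peak falls at the center of the color-1 patch. Every step touches each pixel of the image, which is why performing the computation at query time does not scale to a large collection.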
We extend back-projection to retrieval from large databases by precomputing, for every image in the database, the back-projections of a collection of predefined color sets. Because these back-projections are processed ahead of time, the system returns the best matches directly, without new computation at the time of the query. More specifically, we modify the back-projection algorithm so that it back-projects binary color sets onto the images. Instead of blurring the back-projected images, we use morphological filtering to identify the color regions. This process is described in greater detail in later sections.
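A rough sketch of this modification is given below. The details of the actual procedure appear in later sections; here we simply assume that a binary color set is a set of quantized color indices and that the morphological filter is an opening (erosion followed by dilation) with a 3x3 square structuring element. Both are assumptions made for illustration, and all names are hypothetical.

```python
def back_project_color_set(image, color_set):
    """Binary back-projection: 1 where the pixel's color is in the set, else 0."""
    return [[1 if color in color_set else 0 for color in row] for row in image]


def _neighborhood(mask, row, col):
    """Values in the 3x3 window centered on (row, col); outside the image counts as 0."""
    height, width = len(mask), len(mask[0])
    return [
        mask[r][c] if 0 <= r < height and 0 <= c < width else 0
        for r in range(row - 1, row + 2)
        for c in range(col - 1, col + 2)
    ]


def erode(mask):
    """Keep a pixel only if its whole 3x3 neighborhood is set."""
    return [[1 if all(_neighborhood(mask, r, c)) else 0
             for c in range(len(mask[0]))] for r in range(len(mask))]


def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighborhood is set."""
    return [[1 if any(_neighborhood(mask, r, c)) else 0
             for c in range(len(mask[0]))] for r in range(len(mask))]


def open_mask(mask):
    """Morphological opening: removes specks smaller than the structuring element."""
    return dilate(erode(mask))


# Toy 6x6 image: a 3x3 patch of color 1 plus one isolated color-1 pixel.
image = [[0] * 6 for _ in range(6)]
for r in range(1, 4):
    for c in range(1, 4):
        image[r][c] = 1
image[5][5] = 1

# Hypothetical predefined color sets, processed once ahead of query time.
color_sets = {"set_a": {1}, "set_b": {0}}
precomputed = {
    name: open_mask(back_project_color_set(image, colors))
    for name, colors in color_sets.items()
}
region_a = precomputed["set_a"]
```

In this toy case the opening preserves the 3x3 patch and discards the isolated pixel, so the stored mask for `set_a` marks a single compact color region that can be looked up at query time.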