Color Similarity

To match color regions, we need a measure of color similarity under which, for example, pink is more similar to red than to blue. We base the measurement of color similarity on closeness in the HSV color space as follows: the similarity between any two colors, indexed by $m_i = (h_i, s_i, v_i)$ and $m_j = (h_j, s_j, v_j)$, is given by

$$a_{i,j} = 1 - \frac{1}{\sqrt{5}} \left[ (v_i - v_j)^2 + (s_i \cos h_i - s_j \cos h_j)^2 + (s_i \sin h_i - s_j \sin h_j)^2 \right]^{1/2} \qquad (3)$$

which corresponds to the proximity in the cylindrical HSV color space depicted in Figure 5. The measure of color similarity, $a_{i,j}$, is used within the computation of the distance between color distributions as described next.
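As a rough illustration of such a measure, the sketch below assumes that similarity is one minus the Euclidean distance between two colors in the cylindrical HSV space, normalized by the largest possible distance $\sqrt{5}$ (cylinder of radius 1 and height 1). The function name `hsv_similarity`, the use of radians for hue, and the sample color values are assumptions made for this example.

```python
import math

def hsv_similarity(c1, c2):
    """Similarity in [0, 1] between two HSV colors.

    Each color is a tuple (h, s, v) with h in radians and s, v in [0, 1].
    These conventions are assumptions of this sketch.
    """
    h1, s1, v1 = c1
    h2, s2, v2 = c2
    # Convert the (hue, saturation) polar pair to Cartesian coordinates.
    dx = s1 * math.cos(h1) - s2 * math.cos(h2)
    dy = s1 * math.sin(h1) - s2 * math.sin(h2)
    dv = v1 - v2
    # sqrt(5) is the largest distance inside a cylinder of radius 1, height 1.
    return 1.0 - math.sqrt(dx * dx + dy * dy + dv * dv) / math.sqrt(5.0)

# Illustrative colors (hypothetical values): pink is modeled as desaturated red.
red = (0.0, 1.0, 1.0)
pink = (0.0, 0.3, 1.0)
blue = (4.0 * math.pi / 3.0, 1.0, 1.0)
```

With these sample values, `hsv_similarity(red, pink)` is larger than `hsv_similarity(red, blue)`, which is the behavior the text asks of a similarity measure.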

Color Histograms

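A distribution of colors is defined by a color histogram. By transforming the three color channels of image $I[x,y]$ using transformation $T_c$ and quantization $Q_c^M$ as defined in Section 2, where $m \in \{0, \ldots, M-1\}$, the single-variable color histogram is given by

$$h_c[m] = \frac{1}{XY} \sum_{x=0}^{X-1} \sum_{y=0}^{Y-1} \begin{cases} 1 & \text{if } Q_c^M(T_c\, I[x,y]) = m \\ 0 & \text{otherwise,} \end{cases} \qquad (4)$$

where $X$ and $Y$ are the width and height of the image, respectively, which are used for normalization.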
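The histogram computation can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `quantize` argument stands in for the combined color transformation and quantization of Section 2 and is a hypothetical callable that maps a pixel to a color index in `[0, num_colors)`.

```python
def color_histogram(image, quantize, num_colors):
    """Normalized single-variable color histogram.

    image: list of rows, each row a list of pixels (any pixel type).
    quantize: hypothetical function mapping a pixel to an index in
        [0, num_colors); it stands in for the transformation and
        quantization steps.
    """
    height = len(image)
    width = len(image[0])
    counts = [0] * num_colors
    for row in image:
        for pixel in row:
            counts[quantize(pixel)] += 1
    # Dividing by width * height makes the bins sum to one.
    total = float(width * height)
    return [c / total for c in counts]
```

For a 2 x 2 image whose pixels are already color indices, `color_histogram([[0, 0], [1, 2]], lambda p: p, 3)` gives one half of the mass to color 0 and one quarter each to colors 1 and 2.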

Histogram Distance

The most common dissimilarity measures for feature vectors are based upon the Minkowski metric, which has the following form, where $h_q$ and $h_t$ are the query and target feature vectors, respectively:

$$d^r_{q,t} = \sum_{m=0}^{M-1} \left| h_q[m] - h_t[m] \right|^r \qquad (5)$$

For example, both the $L_1$ ($r = 1$) [1] and $L_2$ ($r = 2$) metrics have been used for measuring dissimilarity of histograms. However, histogram dissimilarity measures based upon the Minkowski metric compare only like bins and therefore ignore the similarity between different colors. For example, using a Minkowski metric, a dark red image is equally dissimilar to a red image as to a blue image. By using color similarity measures within the distance computation, a quadratic metric improves histogram matching.
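The limitation can be seen in a small sketch. It assumes the Minkowski-form dissimilarity is the sum over bins of the absolute difference raised to the power $r$ (without an outer root, which does not affect rankings), and it uses hypothetical three-bin histograms whose bins stand for dark red, red, and blue.

```python
def minkowski_distance(h_q, h_t, r=1):
    """Bin-by-bin dissimilarity: sum over bins of |h_q[m] - h_t[m]| ** r."""
    return sum(abs(q - t) ** r for q, t in zip(h_q, h_t))

# Hypothetical single-color images; bins are [dark red, red, blue].
dark_red_image = [1.0, 0.0, 0.0]
red_image = [0.0, 1.0, 0.0]
blue_image = [0.0, 0.0, 1.0]

d_to_red = minkowski_distance(dark_red_image, red_image, r=1)
d_to_blue = minkowski_distance(dark_red_image, blue_image, r=1)
```

Here `d_to_red` and `d_to_blue` are equal, because only like bins are compared and the closeness of dark red to red never enters the sum.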

Histogram Quadratic Distance

The QBIC project uses the histogram quadratic distance metric for matching images [3]. It measures the weighted similarity between histograms, which provides more desirable results than "like-bin" only comparisons. The quadratic distance between histograms $h_q$ and $h_t$ is given by

$$d^{hist}_{q,t} = (h_q - h_t)^T A \, (h_q - h_t) \qquad (6)$$

where $A = [a_{i,j}]$ and $a_{i,j}$ denotes the similarity between colors with indices $i$ and $j$. By defining color similarity in HSV color space, $a_{i,j}$ is given by Eq. 3. Since the histogram quadratic distance computes the cross similarity between colors, it is computationally expensive. Therefore, in large database applications, histogram indexing strategies, such as pre-filtering [5], are required to avoid exhaustive search.
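A direct evaluation of the quadratic form might look like the sketch below. The similarity matrix values are hypothetical (dark red and red are given a similarity of 0.8, and each is given 0.1 with blue); in practice the entries would come from the color similarity measure. The double loop over all bin pairs also shows where the quadratic cost in the number of colors comes from.

```python
def quadratic_distance(h_q, h_t, similarity):
    """Evaluate (h_q - h_t)^T A (h_q - h_t) with A given as nested lists."""
    diff = [q - t for q, t in zip(h_q, h_t)]
    n = len(diff)
    # Every pair of bins (i, j) contributes, weighted by their similarity.
    return sum(
        diff[i] * similarity[i][j] * diff[j]
        for i in range(n)
        for j in range(n)
    )

# Hypothetical similarity matrix; bins are [dark red, red, blue].
A = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
dark_red_image = [1.0, 0.0, 0.0]
red_image = [0.0, 1.0, 0.0]
blue_image = [0.0, 0.0, 1.0]

d_to_red = quadratic_distance(dark_red_image, red_image, A)
d_to_blue = quadratic_distance(dark_red_image, blue_image, A)
```

With these illustrative values the dark red image is much closer to the red image (about 0.4) than to the blue image (about 1.8), unlike a like-bin comparison.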

Color Sets

Alternatively, we utilize color sets to represent color information. The distinction is that color sets give only a selection of colors, whereas color histograms denote the relative amounts of colors. Although we use the above system for color set selection in order to extract regions, we note here that color sets can also be obtained by thresholding color histograms. For example, given threshold $\tau_m$ for color $m$, color sets $s_c$ are related to color histograms $h_c$ by

$$s_c[m] = \begin{cases} 1 & \text{if } h_c[m] \ge \tau_m \\ 0 & \text{otherwise.} \end{cases} \qquad (7)$$

Color sets work well to represent regional color since (1) the transformation $T_c$ and quantization $Q_c^M$ have been derived to give a complete set of distinct colors and (2) salient regions possess only a few, equally dominant colors [13].
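Thresholding a histogram into a color set can be sketched in a few lines. The sketch assumes a color belongs to the set when its histogram value is at least its threshold; whether the comparison is strict, and the threshold values used below, are assumptions of this example.

```python
def histogram_to_color_set(hist, thresholds):
    """Binary color set: 1 where the histogram meets its per-color threshold."""
    return [1 if h >= tau else 0 for h, tau in zip(hist, thresholds)]

# Hypothetical region histogram with two dominant colors and one minor color.
region_hist = [0.5, 0.05, 0.45]
color_set = histogram_to_color_set(region_hist, [0.2, 0.2, 0.2])
```

The result keeps only the selection of colors, here the first and third, and discards their relative amounts.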

Color Set Distance

We use a modification of the color histogram quadratic distance equation (Eq. 6) to measure the distance between color sets. The quadratic distance between two color sets $s_q$ and $s_t$ is given by

$$d^{set}_{q,t} = (s_q - s_t)^T A \, (s_q - s_t) \qquad (8)$$
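As a hedged illustration, the sketch below evaluates the quadratic form $(s_q - s_t)^T A (s_q - s_t)$ directly on binary color sets. Because each entry of the difference is -1, 0, or 1, only colors present in exactly one of the two sets contribute, so the loops can skip all other indices. This shortcut is an assumption of the example and is not the decomposition developed in the text; the similarity matrix values are hypothetical.

```python
def color_set_distance(s_q, s_t, similarity):
    """Quadratic distance between two binary color sets.

    Only indices where the sets differ have a nonzero difference,
    so the sum is restricted to those indices.
    """
    differing = [
        (i, s_q[i] - s_t[i]) for i in range(len(s_q)) if s_q[i] != s_t[i]
    ]
    return sum(
        d_i * similarity[i][j] * d_j
        for i, d_i in differing
        for j, d_j in differing
    )

# Hypothetical similarity matrix for three colors.
A = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
d_sets = color_set_distance([1, 1, 0], [1, 0, 1], A)
```

In this example the two sets share the first color, so only the second and third colors enter the sum, giving a distance of about 1.8.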

Considering the binary nature of the color sets, the computational complexity of the quadratic distance function can be reduced. We decompose the color set quadratic formula to provide for a more efficient computation and indexing. By defining