In this section we present CBIR results for the three benchmark images.
One way to evaluate CBIR results is to look at precision-recall statistics, but this may not be a fair measure here, for two reasons:
1. Groundtruth ambiguity.
Similarity in semantic meaning is not the same as similarity in low-level features. For example, two sunset/sunrise images can have vastly different colors, yet a "sky" image may look very similar to a "snow" image (Figure 4-1).
Figure 4-1. Perceptual similarity != semantic similarity (images #54, #176; #244, #105)
2. Recall is even harder to measure if we choose to return a fixed number of images, because the number of images in the same category (those that "should" be returned) differs from one query image to another; the sketch below illustrates this.
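As a minimal sketch of the point above, the snippet below computes precision and recall at a fixed cutoff k. The function names, image IDs, and category sizes are hypothetical placeholders, not the actual labels or code used in this experiment; they only show why recall saturates when k is smaller than the category size.

```python
def precision_at_k(returned_ids, relevant_ids, k):
    # Fraction of the top-k returned images that belong to the query's category.
    top_k = returned_ids[:k]
    hits = sum(1 for img_id in top_k if img_id in relevant_ids)
    return hits / k

def recall_at_k(returned_ids, relevant_ids, k):
    # Fraction of the query's category that appears in the top-k returns.
    top_k = returned_ids[:k]
    hits = sum(1 for img_id in top_k if img_id in relevant_ids)
    return hits / len(relevant_ids) if relevant_ids else 0.0

# Hypothetical example: 15 images returned, 40 images in the query's category.
# Even a perfect top-15 ranking caps recall at 15/40.
returned = list(range(1, 16))
relevant = set(range(1, 41))
print(precision_at_k(returned, relevant, 15))  # 1.0
print(recall_at_k(returned, relevant, 15))     # 0.375
```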
In our experiment, 10 or 15 images are returned for each query, and the image ID and distance are shown for each returned image. The query image is always shown as the first entry (distance 0).
All queries on this page use histogram intersection as the distance measure, and the maximum of the two per-feature distances (Mini-Max) is used to rank images; a sketch of this ranking scheme follows.
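The sketch below shows one way to implement this ranking: histogram intersection turned into a distance, with each database image scored by the larger of its two per-feature distances (Mini-Max) and results sorted in ascending order. The function names, the dictionary layout of the database, and the assumption that histograms are normalized are illustrative choices, not the actual code behind these results.

```python
import numpy as np

def hist_intersection_distance(h1, h2):
    # Histogram intersection expressed as a distance: 1 - sum(min(h1, h2)).
    # Assumes both histograms are normalized to sum to 1.
    return 1.0 - np.minimum(h1, h2).sum()

def rank_by_minimax(query_feats, db_feats, top_k=10):
    # Rank database images by the larger (worst-case) of the two per-feature
    # histogram-intersection distances, smallest distance first.
    # query_feats: (feature1, feature2) histograms of the query image.
    # db_feats: {image_id: (feature1, feature2)} for the whole database.
    scores = []
    for img_id, feats in db_feats.items():
        dist = max(hist_intersection_distance(q, f)
                   for q, f in zip(query_feats, feats))
        scores.append((img_id, dist))
    scores.sort(key=lambda pair: pair[1])
    # If the query image itself is in the database, it comes out first
    # with distance 0, matching the result listings on this page.
    return scores[:top_k]
```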
1. Query image "City Night"
Figure 4-2. Result for benchmark image "city night"
Two false returns ("flower") if we label the groundtruth as "city or night sky".
2. Query image "Sky"
Figure 4-3. Result for benchmark image "sky"
Here, labeling the groundtruth simply as "sky" does not really make sense, because the returned images are perceptually similar but do not all belong to the same semantic category.
Results are shown for both 10 and 15 returns. No. 12 ("plane") can be regarded as a false return.
3. Query image "Sunset"
Figure 4-4. Results for benchmark image "sunset"
Three false returns if the "animal" images are regarded as false matches.