a new problem in need of a dataset with spliced images. In response to the
need, we previously released an open dataset, the Columbia Image Splicing
Detection Evaluation Dataset [19], for the image splicing problem.
Another problem identified under passive-blind image authentication re-
search is the classification of photographic images (PIM) and photorealistic
computer graphics (PRCG) [20, 21]. Working on the PIM versus PRCG
classification problem requires a dataset containing PRCG of high
photorealism and PIM from reliable sources, with diversity in image content
and in image acquisition factors such as the types of camera used and the
photographing styles and techniques. Such a dataset is not readily
available, so we collected one, namely the Columbia Photographic Images and
Photorealistic Computer Graphics Dataset, while working on the PIM versus
PRCG classification problem. We are making
this dataset available to the research community. This report describes the
design and the implementation of the dataset.
Section 2 discusses the requirements for a dataset that caters to the
PIM versus PRCG classification problem. Section 3 gives an overview of the
Columbia Photographic Images and Photorealistic Computer Graphics Dataset.
The subsequent sections are dedicated to detailed descriptions of the
dataset components, i.e., the PRCG, the Personal, the Google and the
Recaptured PRCG image sets. Section 8 provides a guide for downloading the
respective components of the dataset. Finally, we conclude in Section 9.
2 Dataset Requirements for the PIM versus PRCG
Classification Problem
While the problem of classifying PIM versus general computer graphics
(including both photorealistic and non-photorealistic computer graphics)
has been studied for the purpose of improving video retrieval [22] and
other applications [23], the PIM versus PRCG classification problem in the
passive-blind image authentication setting is new. It emphasizes highly
photorealistic PRCG rather than normal or non-photorealistic computer
graphics, such as the cartoon-like images seen on television. In general,
a passive-blind PIM versus PRCG classifier would be evaluated on the
following aspects:
1. The discrimination rate/accuracy of the classifier.
2. The robustness of the classifier to various image processing operations