
a new problem in need of a dataset with spliced images. In response to this need, we earlier released an open dataset, the Columbia Image Splicing Detection Evaluation Dataset [19], for the image splicing problem.

Another problem identified in passive-blind image authentication research is the classification of photographic images (PIM) and photorealistic computer graphics (PRCG) [20, 21]. Working on the PIM versus PRCG classification problem requires a dataset containing PRCG of high photorealism, and PIM from reliable sources with diversity in image content and in image acquisition factors, such as the types of camera used and the photographing styles and techniques. Such a dataset is not readily available, so we collected one, namely the Columbia Photographic Images and Photorealistic Computer Graphics Dataset, while working on the PIM versus PRCG classification problem. We are making this dataset available to the research community.

This report describes the design and implementation of the dataset. Section 2 discusses the requirements for a dataset that caters to the PIM versus PRCG classification problem. In Section 3, we give an overview of the Columbia Photographic Images and Photorealistic Computer Graphics Dataset. The subsequent sections are dedicated to detailed descriptions of the dataset components, i.e., the PRCG, the Personal, the Google and the Recaptured PRCG image sets. Section 8 then provides a guide for downloading the respective components of the dataset. Finally, we conclude in Section 9.
2 Dataset Requirements for the PIM versus PRCG Classification Problem

While the problem of classifying PIM and general computer graphics (including both photorealistic and non-photorealistic computer graphics) has been studied for the purpose of improving video retrieval [22] and other applications [23], the PIM versus PRCG classification problem in the passive-blind image authentication setting is a new problem. It emphasizes highly photorealistic PRCG rather than ordinary or non-photorealistic computer graphics, such as the cartoon-like images seen on television. In general, a passive-blind PIM versus PRCG classifier would be evaluated on the following aspects:

1. The discrimination rate/accuracy of the classifier.
2. The robustness of the classifier to various image processing operations