on images, such as JPEG compression, resizing, the various in-camera
image processing operations for PIM, and so on.
3. The robustness of the classifier to various computer graphics
techniques, such as simulated camera depth-of-field (DoF) effects,
soft shadows, and so on.
4. The robustness of the classifier to various adversarial attacks. When
the algorithm of a classifier is known, the attacker may be able to
pre-process a PRCG such that it is classified as a photographic image.
5. The sensitivity of the classifier to image content, in particular to
ambiguous content such as recaptured PRCG or paintings, PRCG of
natural scenes, PIM of artificial objects, and so on.
Apart from facilitating the evaluation of a PIM versus PRCG classifier
according to the aspects listed above, a good dataset for the PIM and
PRCG classification problem in the passive-blind image authentication
setting should also model authentic and fake images well. Hence, we
have to ensure that the PIM are reliably authentic and that the PRCG
come from reliable sources and are of high photorealism.
High photorealism of the PRCG matters because only PRCG of high
photorealism would be used to fake PIM in realistic situations.
Unfortunately, such PRCG are not readily available in abundance on the
Internet. Many computer graphics images can be found on the Internet, but
many of them are not truly photorealistic, so a conscious effort is needed to
select only PRCG with high photorealism.
Besides that, we also need to make sure that the content of the PRCG
is comparable to that of the PIM. Content compatibility between the PIM
and the PRCG ensures that we are comparing apples to apples. Otherwise,
a trained classifier may overfit to the content discrepancy between the
two image sets. For example, this can happen if the dataset contains
mainly PIM of buildings and PRCG of forests. There are two ways to
ensure that the content matches. The first way is to narrowly restrict
the image content in both the PIM and the PRCG sets, e.g., by
restricting the dataset to images of vegetation only. The second way is
to define a broader scope for the content but ensure content diversity
within the scope, in order to lower the likelihood of content mismatch.
In our case, we follow the second way: we define the content scope to be
natural scenes and ensure content diversity within the defined scope.
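
The content-mismatch concern can be made concrete with a small sketch.
It rests on assumptions that are not part of the dataset description:
each image is assumed to carry a hypothetical content label (such as
"building" or "forest"), and mismatch is measured here by the total
variation distance between the label distributions of the two sets,
which is only one of several reasonable choices.

```python
from collections import Counter


def label_distribution(labels):
    """Return the relative frequency of each content label."""
    if not labels:
        raise ValueError("label list must not be empty")
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def content_mismatch(pim_labels, prcg_labels):
    """Total variation distance between two label distributions.

    0.0 means identical content proportions; 1.0 means the two sets
    share no content category at all.
    """
    p = label_distribution(pim_labels)
    q = label_distribution(prcg_labels)
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


# Hypothetical labels: a skewed pairing like the buildings/forests case.
pim_skewed = ["building"] * 9 + ["forest"] * 1
prcg_skewed = ["building"] * 1 + ["forest"] * 9

# Hypothetical labels: both sets spread evenly over the same categories.
pim_diverse = ["building", "forest", "water", "sky"] * 5
prcg_diverse = ["forest", "sky", "building", "water"] * 5

skewed_score = content_mismatch(pim_skewed, prcg_skewed)
diverse_score = content_mismatch(pim_diverse, prcg_diverse)
print(f"skewed sets:  {skewed_score:.2f}")
print(f"diverse sets: {diverse_score:.2f}")
```

A score near zero suggests that a classifier has little content
discrepancy to exploit. A large score, as in the skewed pairing, flags
a dataset where the classifier could separate the two sets by content
alone.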