Columbia Image Splicing Detection Evaluation Dataset


Image splicing is a simple process that crops and pastes regions from the same or separate sources. It is a fundamental step in digital photomontage, which refers to a paste-up produced by sticking together images using digital tools such as Photoshop. Examples of photomontages can be seen in several infamous news reporting cases involving faked images. In search of technical solutions for image authentication, researchers have recently started developing techniques for blind passive detection of image splicing. However, like most research communities that deal with data processing, we need an open data set with diverse content and realistic splicing conditions in order to expedite progress and facilitate collaborative studies.

In this report, we describe in detail a data set of 1845 image blocks with a fixed size of 128 x 128 pixels. The image blocks are extracted from images in the CalPhotos collection, with a small number of additional images captured by digital cameras. The dataset includes about the same number of authentic and spliced image blocks, which are further divided into subcategories (smooth vs. textured, arbitrary object boundary vs. straight boundary).
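As a rough illustration of the crop-and-paste operation described above, the following Python sketch builds a 128 x 128 block with a straight vertical splicing boundary from two source images. It is only a sketch under stated assumptions and is not the procedure used to produce the dataset: images are represented as plain nested lists of grayscale values, and the helper names crop, paste and splice_block are hypothetical.

```python
# Illustrative sketch only: images are nested lists of grayscale values,
# not the dataset's actual file format. All helper names are hypothetical.

BLOCK = 128  # block side length stated in the dataset description


def crop(image, top, left, height, width):
    """Return the height x width region of `image` starting at (top, left)."""
    if (top < 0 or left < 0
            or top + height > len(image)
            or left + width > len(image[0])):
        raise ValueError("crop region falls outside the image")
    return [row[left:left + width] for row in image[top:top + height]]


def paste(image, region, top, left):
    """Return a copy of `image` with `region` pasted at (top, left)."""
    if (top < 0 or left < 0
            or top + len(region) > len(image)
            or left + len(region[0]) > len(image[0])):
        raise ValueError("pasted region falls outside the image")
    out = [row[:] for row in image]  # copy so the input stays untouched
    for r, src_row in enumerate(region):
        out[top + r][left:left + len(src_row)] = src_row
    return out


def splice_block(source_a, source_b, boundary_col):
    """Build a BLOCK x BLOCK block whose columns left of `boundary_col`
    come from source_a and whose remaining columns come from source_b."""
    if not 0 < boundary_col < BLOCK:
        raise ValueError("boundary_col must lie strictly inside the block")
    block = crop(source_a, 0, 0, BLOCK, BLOCK)
    region = crop(source_b, 0, boundary_col, BLOCK, BLOCK - boundary_col)
    return paste(block, region, 0, boundary_col)


# Two synthetic, uniform "images" standing in for real photographs.
a = [[10] * 200 for _ in range(150)]
b = [[200] * 200 for _ in range(150)]
spliced = splice_block(a, b, 64)
```

In this toy example the left half of the block comes from the first source and the right half from the second, which loosely corresponds to the straight-boundary subcategory; arbitrary object boundaries would need a per-pixel mask instead of a single column index.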

Also take a look at our uncompressed spliced image dataset.

Detailed Information

The goal is to provide a dataset open to the research community on which new discoveries and newly developed technologies can be evaluated. For design criteria, dataset structure, and naming convention, please refer to the Detailed Information page.

Terms and Conditions

Users of the Columbia Splicing Detection Evaluation Data Set must agree that:

1. The use of the data set is restricted to research purposes only.
2. No redistribution of the dataset is allowed.
3. In any resultant publications of research that uses the dataset, due credit will be given to the DVMM Laboratory of Columbia University, the CalPhotos Digital Library and all the photographers who have kindly granted their permission so that the dataset could be produced using their copyrighted images. The list of photographer names can be found on this page.

Dataset Download

Please fill out the download request form here. You will receive an email with the dataset link, the username and the password.


For any work that makes use of the dataset, please include the following line in the acknowledgements section or a footnote of any resultant publication.

"Credits for the use of the Columbia Image Splicing Detection Evaluation Dataset are given to the DVMM Laboratory of Columbia University, CalPhotos Digital Library and the photographers listed in"


  1. Tian-Tsong Ng
  2. Jessie Hsu
  3. Shih-Fu Chang

Technical Report

A Data Set of Authentic and Spliced Image Blocks,
Tian-Tsong Ng, Shih-Fu Chang,
ADVENT Technical Report #203-2004-3, Columbia University, June 2004


A Model for Image Splicing,
Tian-Tsong Ng, Shih-Fu Chang,
IEEE International Conference on Image Processing (ICIP), Singapore, October 2004

Blind Detection of Photomontage Using Higher Order Statistics,
Tian-Tsong Ng, Shih-Fu Chang, Qibin Sun,
IEEE International Symposium on Circuits and Systems (ISCAS), Vancouver, Canada, May 2004

Blind Detection of Digital Photomontage using Higher Order Statistics,
Tian-Tsong Ng, Shih-Fu Chang,
ADVENT Technical Report #201-2004-1, Columbia University, June 2004

A database of photos of plants, animals, habitats and other natural history subjects,
CalPhotos, 2000

Detecting Digital Forgeries Using Bispectral Analysis,
H. Farid,
MIT AI Memo AIM-1657, MIT, 1999

A Picture Tells a Thousand Lies,
H. Farid,
New Scientist, vol. 179, pp. 38-41, 2003

When Is Seeing Believing?,
W. J. Mitchell,
Scientific American, pp. 44-49, 1994


We sincerely thank Ginger Ogle of the Berkeley Digital Library Project for help in contacting the original photographers and transferring the image files.


  1. TrustFoto
  2. Research Description Page - Image Splicing Detection Using Higher Order Statistics
  3. Columbia Uncompressed Image Splicing Detection Dataset
  4. Columbia University Digital Video and Multimedia Lab
  5. Columbia University Graphics Lab