Kodak's consumer video benchmark data set

Quick Guide to Kodak's consumer video benchmark data set

Citation:

Akira Yanagawa, Alexander C. Loui, Jiebo Luo, Shih-Fu Chang, Dan Ellis, Wei Jiang, Lyndon Kennedy, and Keansub Lee, "Kodak consumer video benchmark data set: concept definition and annotation," Columbia University ADVENT Technical Report 246-2008-4, September 2008. [pdf]

Consumer Video Dataset

Download the Consumer Video Dataset.
(7.9 MB file; expands to 20.8 MB on disk.)

Summary

Semantic indexing of images and videos in the consumer domain has become an important problem for both research and practical applications. In this work we developed Kodak’s consumer video benchmark data set, which includes (1) a significant number of videos from actual users (1358 video clips from consumers and 1873 clips from YouTube), (2) a rich lexicon that accommodates consumers’ needs (more than 100 concepts), and (3) annotations for a subset of 25 concepts over the entire video data set. To the best of our knowledge, this is the first systematic effort in the consumer domain to define a large lexicon, construct a large benchmark data set, and annotate videos in a rigorous fashion. We expect this effort to have a significant impact by providing a sound foundation for developing and evaluating large-scale, learning-based semantic indexing and annotation techniques in the consumer domain.

Details about the data and the data structures used in this dataset release can be found in this paper. The dataset includes the annotations, the extracted visual features of the consumer videos, and the URLs of the YouTube videos. To obtain sample anonymized video clips or keyframes for the consumer videos, please send us a request.

References

[1] Akira Yanagawa, Alexander C. Loui, Jiebo Luo, Shih-Fu Chang, Dan Ellis, Wei Jiang, Lyndon Kennedy, and Keansub Lee, "Kodak consumer video benchmark data set: concept definition and annotation," Columbia University ADVENT Technical Report 246-2008-4, September 2008.

[2] Shih-Fu Chang, Dan Ellis, Wei Jiang, Keansub Lee, Akira Yanagawa, Alexander C. Loui, and Jiebo Luo, "Large-Scale Multimodal Semantic Concept Detection for Consumer Video," in ACM SIGMM International Workshop on Multimedia Information Retrieval, Germany, September 2007.

[3] Alexander C. Loui, Jiebo Luo, Shih-Fu Chang, Dan Ellis, Wei Jiang, Lyndon Kennedy, Keansub Lee, and Akira Yanagawa, "Kodak's Consumer Video Benchmark Data Set: Concept Definition and Annotation," in ACM SIGMM International Workshop on Multimedia Information Retrieval, Germany, September 2007.

Last updated: September 12, 2008
