

Hari Sundaram, Shih-Fu Chang. Determining Computable Scenes in Films and their Structures using Audio Visual Memory Models. In ACM Multimedia, Los Angeles, CA, October 2000.


Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.


In this paper we present novel algorithms for computing scenes and within-scene structures in films. We begin by mapping insights from film-making rules and experimental results from the psychology of audition into a computational scene model. We define a computable scene to be a chunk of audio-visual data that exhibits long-term consistency with regard to three properties: (a) chromaticity, (b) lighting, and (c) ambient sound. Central to the computational model is the notion of a causal, finite-memory viewer model. We segment the audio and video data separately. In each case we determine the degree of correlation of the most recent data in the memory with the past. The respective scene boundaries are determined using local minima and aligned using a nearest neighbor algorithm. We introduce a periodic analysis transform to automatically determine the structure within a scene. We then use statistical tests on the transform to determine the presence of a dialogue. The algorithms were tested on a difficult data set: the first hour of each of five commercial films. The best results were 88% recall and 72% precision for scene detection, and 91% recall and 100% precision for dialogue detection.

Keywords: computable scenes, scene detection, shot-level structure, films, periodic analysis transform, memory models
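The boundary step described in the abstract (local minima of a correlation measure, then nearest neighbor alignment of the audio and video boundaries) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `local_minima` and `align_boundaries`, the `window` and `max_gap` parameters, the greedy one-to-one matching, and the sample values are all assumptions made for illustration.

```python
def local_minima(signal, window=2):
    """Return indices whose value is the unique minimum within +/- window samples.

    `signal` stands in for a per-time-step correlation (coherence) score;
    how that score is computed is outside the scope of this sketch.
    """
    minima = []
    for i in range(window, len(signal) - window):
        neighborhood = signal[i - window:i + window + 1]
        if signal[i] == min(neighborhood) and neighborhood.count(signal[i]) == 1:
            minima.append(i)
    return minima


def align_boundaries(audio_times, video_times, max_gap):
    """Greedily pair each audio boundary with its nearest unused video boundary.

    A pair is kept only if the two boundaries lie within `max_gap` seconds.
    Both the greedy strategy and the threshold are assumptions of this sketch.
    """
    pairs = []
    used = set()
    for a in audio_times:
        candidates = [
            (abs(a - v), j) for j, v in enumerate(video_times) if j not in used
        ]
        if not candidates:
            break  # every video boundary has already been matched
        dist, j = min(candidates)
        if dist <= max_gap:
            used.add(j)
            pairs.append((a, video_times[j]))
    return pairs


# Hypothetical correlation scores; low values suggest a scene change.
coherence = [0.9, 0.8, 0.2, 0.7, 0.9, 0.85, 0.3, 0.8, 0.9]
boundary_indices = local_minima(coherence, window=2)

# Hypothetical boundary times in seconds for the two modalities.
aligned = align_boundaries([10.0, 62.0, 130.0], [12.5, 60.0, 200.0], max_gap=5.0)
```

With these sample values, the two dips in `coherence` are reported as candidate boundaries, and the audio boundary at 130.0 s is left unmatched because no video boundary lies within 5 s of it.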


Hari Sundaram
Shih-Fu Chang

BibTeX Reference

   Author = {Sundaram, Hari and Chang, Shih-Fu},
   Title = {Determining Computable Scenes in Films and their Structures using Audio Visual Memory Models},
   BookTitle = {ACM Multimedia},
   Address = {Los Angeles, CA},
   Month = {October},
   Year = {2000}


