Summary

In this project, we develop novel
algorithms for computing scenes and within-scene structures in digital video,
experimenting with film content. We incorporate insights from film-making rules
and experimental results from the psychology of audition into a computational
scene model. We define a computable scene to be a chunk of audio-visual
data that exhibits long-term consistency with regard to three properties:
(a) chromaticity, (b) lighting, and (c) ambient sound. Central to the computational
model is the notion of a causal, finite-memory viewer model. In both audio
and video, we determine the degree of correlation between the most recent data
in the memory and the earlier data held there. Synchronization and complementary
relations between audio and visual scene boundaries allow us to define different
types of audio-visual scenes. In addition, we detect syntactical structures such
as dialog in films by analyzing the statistics of a periodic analysis transform
of shot sequences. Tests on five films show the following results: scene boundary
detection achieves 88% recall and 72% precision; dialog detection achieves 91%
recall and 100% precision.
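
The idea of comparing the most recent data in a finite memory with the older data in that memory can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the implementation from the publications listed on this page: the per-shot feature (a normalized color histogram), the similarity measure (histogram intersection), the function names, and the parameters `memory`, `attention`, and `threshold` are all hypothetical choices made for the example.

```python
def histogram_intersection(a, b):
    """Similarity in [0, 1] between two normalized histograms (assumed feature)."""
    return sum(min(x, y) for x, y in zip(a, b))


def memory_coherence(features, now, memory=8, attention=3):
    """Mean similarity between the `attention` most recent items and the
    older items still held in a memory of `memory` items.

    `now` is the number of items seen so far, so only features[:now] are
    used (the model is causal). Returns None if either part is empty.
    """
    recent = features[max(0, now - attention):now]
    past = features[max(0, now - memory):max(0, now - attention)]
    if not recent or not past:
        return None
    sims = [histogram_intersection(r, p) for r in recent for p in past]
    return sum(sims) / len(sims)


def detect_boundaries(features, memory=8, attention=3, threshold=0.5):
    """Return indices where a new scene is hypothesized to start.

    A candidate boundary sits at the split between the older memory content
    and the most recent items. It is reported when the coherence at that
    split is a local minimum and falls below `threshold` (a hypothetical rule).
    """
    curve = {}
    for now in range(attention + 1, len(features) + 1):
        split = now - attention  # first index of the "recent" part
        curve[split] = memory_coherence(features, now, memory, attention)

    boundaries = []
    for split, value in sorted(curve.items()):
        prev_value = curve.get(split - 1, float("inf"))
        next_value = curve.get(split + 1, float("inf"))
        if value < threshold and value <= prev_value and value <= next_value:
            boundaries.append(split)
    return boundaries
```

With twelve shots, six sharing one histogram followed by six sharing a disjoint one, `detect_boundaries` with the default parameters returns `[6]`, the index of the first shot of the second group. The same structure could be applied to audio features, with the two sets of boundaries then compared for synchronization.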
People

Hari Sundaram
Prof. Shih-Fu Chang
Publications

H. Sundaram and S.-F. Chang, "Determining Computable Scenes in Films and their Structures using Audio Visual Memory Models," ACM Multimedia 2000, Los Angeles, CA, Oct 30-Nov 3, 2000.
H. Sundaram and S.-F. Chang, "Audio Scene Segmentation using Multiple Models, Features and Time Scales," ICASSP 2000, Istanbul, Turkey, June 5-9, 2000.
H. Sundaram and S.-F. Chang, "Video Scene Segmentation using Audio and Video Features," ICME 2000, New York, New York, July 28-Aug 2, 2000.
S.-F. Chang and H. Sundaram, "Structural and Semantic Analysis of Video," ICME 2000, New York, New York, July 28-Aug 2, 2000.

Last updated: June 12, 2002.