Time: 12:00 pm, Friday, December 5, 2003
Location: ADVENT Lab, CEPSR 7LE3

Title:

Screenplay Alignment for Closed-System Content Analysis of Feature Films

Speaker:

Robert J Turetsky, Dept. of Electrical Engineering, Columbia University

Abstract:

Almost all feature-length films are produced with the aid of a
screenplay. The screenplay provides a unified vision of the story,
setting, dialogue and action of a film. The screenplay is a
currently untapped resource for obtaining a textual description of
important semantic objects within a film. Using it has the benefit
not only of bypassing the problem of the semantic gap, but also of
having these descriptions come straight from the filmmakers. The
screenplay
is available for thousands of films and follows a semi-regular
formatting standard, and thus is a reliable source of data.

We discuss a method for timestamping events in the screenplay by
aligning its dialogue with the closed captions extracted from the
film's DVD. This alignment can identify approximately 60% of the
lines of dialogue within a film. It is possible to recover the
remaining dialogue by using the aligned lines as labeled training
examples to create a statistical model for a generic classifier. As
a first application of the screenplay alignment work, we investigate
the problem of speaker identification. We are currently building a
system that will perform speaker ID without resorting to
hand-labeling or unsupervised clustering.
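
The alignment step described in the abstract can be illustrated with a
minimal sketch. It is not the speaker's implementation: the abstract
does not specify the matching algorithm, so this example assumes a
word-level similarity from Python's difflib and an order-preserving
dynamic-programming alignment. The function names, the 0.5 threshold,
and the sample dialogue are all hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, drop punctuation, and split into words."""
    return re.sub(r"[^a-z0-9 ]+", "", text.lower()).split()


def similarity(a, b):
    """Word-level similarity in [0, 1] between two lines of text."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def align(script_lines, captions, threshold=0.5):
    """Match screenplay lines to captions while preserving order.

    script_lines: list of (speaker, text)
    captions:     list of (start_sec, end_sec, text)
    Returns (script_index, caption_index) pairs that maximize total
    similarity; pairs below `threshold` (an assumed value) are skipped.
    """
    n, m = len(script_lines), len(captions)
    sim = [[similarity(s[1], c[2]) for c in captions] for s in script_lines]

    # best[i][j]: best total score using the first i lines and j captions.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(best[i - 1][j], best[i][j - 1])
            s = sim[i - 1][j - 1]
            if s >= threshold:
                best[i][j] = max(best[i][j], best[i - 1][j - 1] + s)

    # Trace back through the table to recover the matched pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        s = sim[i - 1][j - 1]
        if s >= threshold and best[i][j] == best[i - 1][j - 1] + s:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


def timestamp(script_lines, captions, pairs):
    """Attach caption start and end times to each aligned speaker line."""
    return [(script_lines[i][0], captions[j][0], captions[j][1])
            for i, j in pairs]


# Invented sample data, for illustration only.
script = [
    ("ALICE", "Where did you put the keys?"),
    ("BOB", "They are on the kitchen table."),
    ("ALICE", "I looked there already and found nothing."),
]
captions = [
    (12.0, 13.5, "Where did you put the keys?"),
    (14.0, 15.8, "They're on the table."),
    (40.0, 42.0, "[door slams]"),
]

pairs = align(script, captions)
labeled = timestamp(script, captions, pairs)
```

Here the first two screenplay lines are matched to captions and inherit
their timestamps, while the third line stays unaligned. Under the
approach outlined above, such unaligned lines are the ones a classifier
trained on the aligned, speaker-labeled segments would have to recover.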