Finding Meaningful Video Structure in News with Associated Text

Abstract
This work presents the first effort to automatically
annotate the semantic meaning of temporal video patterns obtained through
unsupervised discovery. The problem is of interest in domains
where neither the perceptual patterns nor the semantic concepts have simple structure.
Patterns in video are modeled with hierarchical hidden Markov models
(HHMMs), using efficient algorithms that learn the parameters, the model complexity,
and the relevant features; the meanings are drawn from the words of the
video's speech transcript. Pattern-word associations are obtained
via co-occurrence analysis and statistical machine translation models.
Extensive experiments on more than 20 hours
of TRECVID news video give promising results: video patterns associated with distinct topics
such as El Niño and politics are identified, and the HHMM temporal structure
model compares favorably with a non-temporal clustering algorithm.
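
To make the co-occurrence step concrete, the following is a minimal sketch rather than the method used in this work. It assumes each video segment has already been assigned a pattern label (for example, by a temporal model) and carries the words spoken during it. The function names, the toy segments, and the choice of lift as the association score are illustrative assumptions.

```python
from collections import Counter, defaultdict


def cooccurrence(segments):
    """Count how often each word appears in segments carrying each pattern label.

    segments: iterable of (pattern_label, words) pairs, one per video segment.
    Each word is counted at most once per segment.
    """
    joint = defaultdict(Counter)  # joint[label][word] = segments containing both
    label_count = Counter()       # segments per pattern label
    word_count = Counter()        # segments per word
    n = 0                         # total number of segments
    for label, words in segments:
        n += 1
        label_count[label] += 1
        for word in set(words):
            joint[label][word] += 1
            word_count[word] += 1
    return joint, label_count, word_count, n


def lift(joint, label_count, word_count, n, label, word):
    """Return P(label, word) / (P(label) * P(word)).

    Values above 1 suggest the word occurs with the pattern more often
    than it would if the two were independent.
    """
    c = joint[label][word]
    if c == 0:
        return 0.0
    return (c * n) / (label_count[label] * word_count[word])


# Hypothetical toy data: two pattern labels and a handful of transcript words.
toy_segments = [
    (0, ["storm", "rain", "nino"]),
    (0, ["nino", "weather"]),
    (1, ["senate", "vote"]),
    (1, ["vote", "weather"]),
]
stats = cooccurrence(toy_segments)
nino_with_pattern_0 = lift(*stats, 0, "nino")        # word seen only with pattern 0
weather_with_pattern_0 = lift(*stats, 0, "weather")  # word spread over both patterns
```

In this toy data, "nino" appears only in segments labeled 0 and so scores above 1 for that pattern, while "weather" is spread evenly across both labels and scores exactly 1. A real transcript would also need stop-word filtering and some tolerance for timing offsets between speech and picture, which the sketch leaves out.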

Publications and Reports
L. Xie, L. Kennedy, S.-F. Chang, A. Divakaran, H. Sun, C.-Y. Lin (2004).
"Discovering Meaningful Multimedia Patterns with Audio-visual Concepts and
Associated Text." IEEE International Conference on Image Processing
(ICIP 2004), Singapore, October 2004.

Last update: October 6, 2004