         News Video Story Boundary Detection in TRECVID 2006
======================================================================
	  Winston Hsu, Lyndon Kennedy, and Shih-Fu Chang
	
		Columbia University, New York
		
			- 08/18/2006 - 
	
::Download Site::
http://www.ee.columbia.edu/dvmm/downloads/cuex_story.htm


::Contact::
Winston Hsu <winston@ee.columbia.edu>
Lyndon Kennedy <lyndon@ee.columbia.edu>
Shih-Fu Chang <sfchang@ee.columbia.edu>


::Description::
This package contains the automatically detected story boundaries 
for the entire TRECVID 2006 test set (259 videos).

The detection algorithm utilizes the visual cue cluster construction (VC3) 
process based on the information bottleneck principle [1] and prosody 
features extracted from speech [2]. The approach emphasizes automatic 
discovery of salient features and effective classification via information 
theory measures. The technique was shown to be effective in the TRECVID 
2004 story segmentation task.

To explore unique production styles in different channels, detection is 
conducted in a language-dependent fashion. A separate detector is trained 
for each language: English, Chinese, and Arabic.


As in the prior work in TRECVID 2005, we used 75 videos from the TRECVID 2005 
development set for training and testing with 5-fold cross-validation. 
The detection performance for each language, measured by the F1 score used in 
prior story detection evaluations, is listed below.

Lang.	F1
-------------
ARB	0.821 
CHN	0.840
ENG	0.451
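
For reference, F1 is the harmonic mean of precision and recall over detected 
boundaries. The following is a minimal sketch of such a score, not the 
evaluation code used for the figures above. The function name boundary_f1, 
the 5-second tolerance window, and the greedy one-to-one matching are 
assumptions made for illustration.

```python
from typing import Sequence


def boundary_f1(detected: Sequence[float],
                reference: Sequence[float],
                tolerance: float = 5.0) -> float:
    """F1 of detected boundary times against reference times (seconds).

    Assumption: a detected boundary is a hit if it lies within `tolerance`
    seconds of a reference boundary that has not been matched yet.
    """
    unmatched = sorted(reference)
    hits = 0
    for t in sorted(detected):
        for r in unmatched:
            if abs(t - r) <= tolerance:
                unmatched.remove(r)  # each reference boundary matches once
                hits += 1
                break
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(reference) if reference else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up times: 2 of 3 detections match 2 of 4 reference boundaries.
score = boundary_f1([10.0, 50.0, 200.0], [12.0, 48.0, 100.0, 150.0])
print(round(score, 3))
```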


The ENG result on the TRECVID 2005 development set is hurt by special 
reports, such as those on the US presidential election and the Peterson 
case (e.g., TRECVID2005_252). We expect the performance figure to improve 
significantly if such special-report segments are not present.


The 2006 test set is different in that it includes videos captured long 
after the period covered by the training set, or from new channels not 
seen in 2005. This may degrade the performance of the story boundary 
detector; however, because no annotations are available, we have no 
performance evaluation of the story boundary detection over the 2006 
data set. To partially address this issue, we have adopted an adaptive 
detection threshold so that the expected number of stories in each video is 
comparable with that seen in the same channel or language over the 2005 data 
set. This adaptive scheme provides an automatic, unsupervised way to tune 
the detection threshold without performance validation on annotated data.
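
The idea of the adaptive threshold can be sketched as follows. This is an 
illustration only, not the released detector: it assumes each candidate 
boundary carries a confidence score, and that the threshold is set to the 
k-th highest score, where k is the story count expected for the channel or 
language. The names adaptive_threshold and detect, and the sample scores, 
are hypothetical.

```python
from typing import List, Sequence


def adaptive_threshold(scores: Sequence[float], expected_stories: int) -> float:
    """Return a threshold that keeps about `expected_stories` candidates.

    Assumption: the number of kept boundaries stands in for the number of
    stories; tied scores may keep slightly more than requested.
    """
    if not scores or expected_stories <= 0:
        return float("inf")  # keep nothing
    ranked = sorted(scores, reverse=True)
    k = min(expected_stories, len(ranked))
    return ranked[k - 1]


def detect(scores: Sequence[float], expected_stories: int) -> List[int]:
    """Indices of the candidates whose score reaches the adaptive threshold."""
    threshold = adaptive_threshold(scores, expected_stories)
    return [i for i, s in enumerate(scores) if s >= threshold]


# Made-up candidate scores for one video; keep the 2 strongest candidates.
scores = [0.1, 0.9, 0.4, 0.8, 0.3]
print(adaptive_threshold(scores, 2), detect(scores, 2))
```

Because the threshold is derived from a target count rather than from 
labeled data, it can be recomputed per video without any annotation.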


::Data Format::
The boundary files are in the "test" folder with the extension *.bdr. 
The language category of each video is listed in 
"SetTest06_{ARB/CHN/ENG}.txt".

Each boundary file has two columns: the first is the starting point 
(in seconds) of the boundary, and the second is reserved for the 
story (1) vs. non-story (0) classification of each story segment. For now 
the second column is set to "-1", since story vs. non-story classification 
is not performed in this release.
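
A boundary file could be read along these lines. This is a minimal sketch 
that assumes the two columns are separated by whitespace (the delimiter is 
not specified here); the function name parse_bdr and the sample values are 
hypothetical and do not come from the package.

```python
from typing import List, Tuple


def parse_bdr(text: str) -> List[Tuple[float, int]]:
    """Parse the contents of a .bdr file into (start_seconds, label) pairs."""
    boundaries = []
    for line in text.splitlines():
        fields = line.split()  # assumption: whitespace-separated columns
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        start_seconds = float(fields[0])
        label = int(float(fields[1]))  # -1 in this release (no classification)
        boundaries.append((start_seconds, label))
    return boundaries


# Made-up file contents for illustration.
sample = "0.00 -1\n35.20 -1\n81.75 -1\n"
for start_seconds, label in parse_bdr(sample):
    print(start_seconds, label)
```

To read a real file, pass the result of open(path).read() to parse_bdr.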


::References::
[1] Winston Hsu and Shih-Fu Chang, "Visual Cue Cluster Construction via 
Information Bottleneck Principle and Kernel Density Estimation," In 
International Conference on Content-Based Image and Video Retrieval (CIVR), 
Singapore, 2005. 

[2] Winston Hsu, Lyndon Kennedy, Shih-Fu Chang, Martin Franz, and 
John R. Smith, "Columbia-IBM News Video Story Segmentation in TRECVID 
2004," ADVENT Technical Report #207-2005-3, Columbia University, 2005.


