This course focuses on recent developments in statistical techniques that show promise for solving practical problems in video search and audio-visual content analysis. The goal is for students to become familiar with the state of the art, learn how to formulate and solve practical video indexing and analysis problems, and gain hands-on experience through actual experiments.
Topics will include:
Applications
- Image/Video Search
- Concept and Event Detection
- Video Structure Parsing
- Media Summarization and Highlight Generation
Technical Topics
- Image Preprocessing and Feature Extraction
- Content-Based Image/Video Retrieval
- Image/Video Annotation
- Video Copy Detection
- Multimodal Retrieval
- Search and Tagging of Media Content on Social Networking Sites
- Promising Pattern Recognition Tools such as HMMs, SVMs, Bayesian Methods, Neural Networks, and Clustering
- Evaluation Methods and Benchmarking Events
We will first present overviews of image representation, feature extraction, and search methods. Then, the class will collectively review, critique, and experiment with a set of selected papers. Each student will be assigned one paper and will summarize its technical content as well as related developments in the field. The student will then work with the instructor and TA to complete a small-scale simulation of the main ideas in the paper.
We will provide image/video data sets, features, and associated metadata (such as transcripts) for experiments in this class. The data sets will be available on the course web site.
In addition to reviewing and presenting a paper, each student must complete a course project on a topic related to those covered in the course. Students are encouraged to expand on the topic they reviewed and experimented with earlier. Team projects (no more than 2 people per team) are encouraged.
Required background includes familiarity with image processing, probability, and linear algebra. Familiarity with image processing tools and a programming language (MATLAB, Java, or C) will be useful.