

Eric Zavesky. A Guided, Low-Latency, and Relevance Propagation Framework for Interactive Multimedia Search. PhD Thesis, Graduate School of Arts and Sciences, Columbia University, 2010.

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Note on this paper

Advisor: Prof. Chang


This thesis investigates a number of problems associated with efficient and engaging ways of executing a multi-level interactive multimedia search. These problems are of interest as the availability of multimedia sources, both professional and personal, continues to grow in tandem with the need for users to search these libraries for consumable entertainment, captured personal memories, and automatically captured events with little or no forethought to manual indexing. Multimedia search refers to the retrieval of relevant content from databases containing multimedia documents. Interactive search means that a user is exploring a dynamic set of results according to parameters that he or she explicitly chose in response to a specific search topic. Multi-level search is the full utilization of a user's interaction with a system to not only provide explicitly requested results, but also to observe a user's preferences and implicitly personalize those results using only interactions the user has already performed. The goal of this thesis is to develop a framework that guides the user through his or her search process by providing dynamic suggestions and information from automatic algorithms while simultaneously leveraging cues observed during the search process to provide a customized set of results that most precisely matches the user's search target. Upon achieving this goal, the system aids the user through both explicit interaction and subsequent result personalization from implicit search choices. A prototype of the proposed system, called CuZero, has been implemented and evaluated across multiple challenging databases to discover search techniques that were previously unavailable.

Addressing problems in traditional query formulation, a system that interactively guides the user is proposed.
While previous works allow a user to specify different modalities for a multimedia search, such as textual keywords and image examples, this work also introduces a large library of 374 semantic concepts. Semantic concepts use pre-trained visual models to bridge the gap in perception between what a machine computes for a multimedia document and what a user can do with that computation. For example, a user need only utilize the concept "crowd" to return content containing large numbers of people attending a basketball tournament, a political protest, or an exclusive fashion show. Building on the familiar technique of text entry (typing in text keywords), the system returns a small subset of dynamically suggested concepts from a lexical mapping and statistical expansion of the user's entered text. These suggestions both engage and inform the user about what the system has indexed with respect to the current query text. Additionally, the introduction of a unique query visualization panel allows the user to interactively include arbitrary modalities (text, images, concepts, etc.) in his or her query. Traditional trial-and-error search with these different query parameters is avoided because the system allows the user to visually arrange his or her query according to personal intuitions about the search topic. Finally, while the query is being formulated, time otherwise lost while the user is thinking is utilized to simultaneously evaluate and load results for the current query at hand.

After a query is formulated during a guided and informative process, the formulation panel is subsequently utilized for query navigation, allowing the user to instantly review numerous query permutations with no perceived latency. With the intuitive mantra "closer to something is more like it", the user is prepared to instantly change the weights of the various parameters in his or her query.
To accommodate this flexibility, previous systems in interactive search resorted to burdening the user with a secondary query specification stage to tweak individual modality weights. However, the proposed approach to result browsing allows the user to navigate the query and result space in parallel, spanning a wide breadth of query permutations or a deep result depth for any one query permutation. Another classic barrier in multimedia search is the sensible inclusion of new search modalities; if no longer constrained to color or text cues, how can one include motion, audio, and local object similarity that has no textual correspondence? Fortunately, the proposed query navigation panel was created in such a way that any modalities developed in the future can be included with no additional algorithmic changes. This flexibility is best exemplified during the result browsing process, where a user can include another image for example-based search or a personalized snapshot of seen results into the query to quickly home in on desirable results.

A final proposal in this work is a scalable and real-time result personalization technique. One of the fastest ways to help a search system identify results relevant to a user's search topic is to explicitly solicit a user's preference. With interactive systems, this usually means interrupting the result browsing process and asking the user whether they like a particular result. While numerous methods for this type of relevance feedback have been proposed, they all currently trade off performance against speed. Typically this problem is due to the scale of the database in question and the number of results required for a system to eliminate confusion about the user's search target. On one end of this spectrum, supervised learning techniques can scale to process millions of results but also require a large number of labels.
On the other end, graph-based semi-supervised techniques have been able to achieve promising performance with only a handful of labels, but the time to construct a graph is unacceptable for real-time scenarios. In this work, state-of-the-art graph-based label propagation is aided by data approximation techniques in a proposed algorithm that is able to achieve higher accuracy in only a small fraction of the computation time when evaluated on a standard benchmark dataset. Using the real-time implementation of this technique, user search results can be personalized without the need to solicit result preferences en masse.

The specific contributions of this thesis are as follows. (1) A new system for query formulation, traditionally relegated to static, non-informative interfaces, is proposed that keeps the user engaged in the process and dynamically proposes query suggestions based on knowledge from the system. (2) A technique for visual query space navigation is proposed that allows an intuitive exploration of several query permutations with no additional latency from the explicit, real-time manipulation of modality weights. (3) Utilizing a proposed hybrid of graph-based label propagation and data approximation techniques, user search results are personalized in real-time using implicit user preferences. These proposals are included in a prototype system called CuZero, which is evaluated and shown to offer unique search opportunities unavailable in existing multimodal search systems.
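
To make the idea of graph-based label propagation in contribution (3) concrete, the following is a minimal sketch of a generic normalized-graph propagation iteration on toy data. It is not the algorithm proposed in the thesis: it builds a dense graph with no data approximation, and the function names (gaussian_affinity, propagate_labels), the parameters (sigma, alpha, iterations), and the toy feature vectors are all assumptions made for illustration.

```python
import math


def gaussian_affinity(points, sigma=1.0):
    """Dense affinity matrix with W[i][j] = exp(-||xi - xj||^2 / (2 sigma^2)).

    The diagonal is left at zero (no self-loops). Hypothetical helper.
    """
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            W[i][j] = W[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return W


def propagate_labels(W, seeds, alpha=0.9, iterations=50):
    """Iterate f <- alpha * S f + (1 - alpha) * y, with S = D^-1/2 W D^-1/2.

    seeds maps an item index to +1.0 (relevant) or -1.0 (not relevant);
    all other items start at 0. Returns one relevance score per item.
    """
    n = len(W)
    degree = [sum(row) for row in W]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degree]
    S = [[W[i][j] * inv_sqrt[i] * inv_sqrt[j] for j in range(n)]
         for i in range(n)]
    y = [seeds.get(i, 0.0) for i in range(n)]
    f = list(y)
    for _ in range(iterations):
        f = [alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f


# Toy feature vectors (assumed data): two well-separated clusters of results.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
# Two labels only: item 0 marked relevant, item 3 marked not relevant.
seeds = {0: 1.0, 3: -1.0}

scores = propagate_labels(gaussian_affinity(points), seeds)
# Re-rank results so that items near the relevant seed come first.
ranking = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
```

With only two labels, the unlabeled items inherit the sign of the seed in their own cluster, so the ranking places the neighbors of the relevant item first. The dense graph here costs quadratic time and memory in the number of results, which is the construction cost the abstract describes as unacceptable for real-time use and which the thesis addresses with data approximation.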


Eric Zavesky

BibTeX Reference

   Author = {Zavesky, Eric},
   Title = {A Guided, Low-Latency, and Relevance Propagation Framework for Interactive Multimedia Search},
   School = {Graduate School of Arts and Sciences, Columbia University},
   Year = {2010}


