

Alejandro Jaimes, Jeff Pelz, Tim Grabowski, Jason Babcock, Shih-Fu Chang. Using Human Observers' Eye Movements in Automatic Image Classifiers. In IS&T/SPIE Human Vision and Electronic Imaging, San Jose, CA, January 2001.

Download

Download paper: Adobe portable document (pdf)

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by the authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.


We explore the way in which people look at images of different semantic categories (e.g., handshake, landscape), and directly relate those results to computational approaches for automatic image classification. Our hypothesis is that the eye movements of human observers differ for images of different semantic categories, and that this information can be effectively used in automatic content-based classifiers. First, we present eye tracking experiments that show the variations in eye movements (i.e., fixations and saccades) across different individuals for images of five different categories: handshakes (two people shaking hands), crowd (cluttered scenes with many people), landscapes (nature scenes without people), main object in uncluttered background (e.g., an airplane flying), and miscellaneous (people and still lifes). The eye tracking results suggest that similar viewing patterns occur when different subjects view different images in the same semantic category. Using these results, we examine how empirical data obtained from eye tracking experiments across different semantic categories can be integrated with existing computational frameworks, or used to construct new ones. In particular, we examine the Visual Apprentice, a system in which image classifiers are learned (using machine learning) from user input as the user defines a multiple-level object definition hierarchy based on an object and its parts (scene, object, object-part, perceptual area, region), and labels examples for specific classes (e.g., handshake). The resulting classifiers are applied to automatically classify new images (e.g., as handshake/non-handshake). Although many eye tracking experiments have been performed, to our knowledge this is the first study that specifically compares eye movements across categories, and that links category-specific eye tracking results to automatic image classification techniques.
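The abstract distinguishes fixations from saccades in the recorded gaze data. The paper does not specify its event-detection method; as a hedged illustration only, the sketch below implements the standard dispersion-threshold (I-DT) approach to segmenting raw gaze samples into fixations, with illustrative thresholds and a made-up sample format that are assumptions, not details from the paper.

```python
# Hedged sketch: dispersion-threshold (I-DT) fixation detection, a common
# way to separate fixations from saccades in raw gaze data. The sample
# format (list of (x, y) points at a fixed rate) and the thresholds are
# illustrative assumptions, not taken from the paper.

def dispersion(points):
    """Spread of a gaze window: (max_x - min_x) + (max_y - min_y)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def detect_fixations(samples, max_dispersion=1.0, min_samples=5):
    """Return fixations as (start_index, end_index, centroid) tuples.

    A window of at least `min_samples` points whose dispersion stays
    below `max_dispersion` is grown as far as possible and recorded as
    one fixation; everything between fixations is treated as saccade.
    """
    fixations = []
    i, n = 0, len(samples)
    while i < n:
        j = i + min_samples
        if j > n:
            break
        if dispersion(samples[i:j]) <= max_dispersion:
            # Grow the window while the points stay tightly clustered.
            while j < n and dispersion(samples[i:j + 1]) <= max_dispersion:
                j += 1
            xs = [p[0] for p in samples[i:j]]
            ys = [p[1] for p in samples[i:j]]
            centroid = (sum(xs) / len(xs), sum(ys) / len(ys))
            fixations.append((i, j - 1, centroid))
            i = j
        else:
            i += 1  # Dispersed window: advance one sample (saccade).
    return fixations
```

For example, six samples clustered at one point followed by six at a distant point yield two fixations separated by the jump between them.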


Alejandro Jaimes
Shih-Fu Chang

BibTex Reference

@InProceedings{Jaimes01,
   Author = {Jaimes, Alejandro and Pelz, Jeff and Grabowski, Tim and Babcock, Jason and Chang, Shih-Fu},
   Title = {Using Human Observers' Eye Movements in Automatic Image Classifiers},
   BookTitle = {IS\&T/SPIE Human Vision and Electronic Imaging},
   Address = {San Jose, CA},
   Month = {January},
   Year = {2001}
}

EndNote Reference

Get EndNote Reference (.ref)


For problems or questions regarding this web site contact The Web Master.

This document was translated automatically from BibTeX by bib2html (Copyright 2003 © Eric Marchand, INRIA, Vista Project).