John R. Smith and Ana B. Benitez
Multimedia systems and technologies are evolving rapidly to meet the challenge posed by the tremendous growth of digital multimedia data. The technology focus is moving from treating multimedia content as signals towards more advanced extraction and processing of multimedia features, semantics and knowledge, as illustrated in Figure 10.1. By handling multimedia content as signals, the MPEG-1 and MPEG-2 audiovisual coding standards enabled efficient compression, storage and communication. More recently, by extracting and analyzing features of the multimedia signals, MPEG-4 has further improved coding efficiency, for example through object-based coding techniques. In addition, MPEG-7 supports content-based retrieval of multimedia by extracting features and searching them using similarity measures. However, automatically extracting semantic labels from multimedia content, including labels for objects, events, places, people and so forth, remains a challenge. Content labeled at the semantic level is easier to search, filter, index, summarize, personalize and repurpose. Ultimately, semantic labels allow knowledge to be extracted by mining multimedia archives, which can enable business use of multimedia, support decision making and allow seamless marketplaces for multimedia content.