Authors
Julia Hirschberg, John Choi, Christine H Nakatani, Steve Whittaker
Publication date
1998
Conference
Content Visualization and Intermedia Representations (CVIR’98)
Description
The current popularity of multimodal information retrieval research critically assumes that consumers will be found for the multimodal information thus retrieved, and that interfaces can be designed that will allow users to search and browse multimodal information effectively. While considerable effort has gone into developing the basic technologies needed for information retrieval in the audio, video, and text domains, basic research on how people browse and search in any of these domains, let alone in some combination, has lagged behind. In developing the SCAN (Spoken Content-based Audio Navigation) system to retrieve information from an audio domain, we have attempted to study the problems of how users navigate audio databases hand in hand with the development of the speech and information retrieval technologies that enable this navigation.
SCAN was developed initially for the TREC-6 Spoken Document Retrieval (SDR) task, which employs the NIST/DARPA HUB4 Broadcast News corpus. However, we are also developing a search and browsing system for voicemail access, over the telephone and via a GUI interface. To this end, we have built several user interfaces to both the voicemail and news domains, which we are employing in a series of laboratory experiments designed to identify limiting and enabling features of audio search and browsing interfaces. We want to examine the following questions: a) How do people want to search audio data? What sort of search and play capabilities do they make most use of, when given several alternatives? b) Do people search different sorts of audio data (e.g., familiar versus …