Authors
Steve Whittaker, Julia Hirschberg
Publication date
2007/4/1
Journal
Computer Speech & Language
Volume
21
Issue
2
Pages
296-324
Publisher
Academic Press
Description
When users access information from text, they engage in strategic fixation, visually scanning the text to focus on regions of interest. However, because speech is both serial and ephemeral, it does not readily support strategic fixation. This paper describes two design principles, indexing and transcript-centric access, that address the problem of speech access by supporting strategic fixation. Indexing involves users constructing external visual indices into speech. Users visually scan these indices to find information-rich regions of speech for more detailed processing and playback. Transcription involves transcribing speech using automatic speech recognition (ASR) and enriching that transcription with visual cues. The resulting enriched transcript is time-aligned to the original speech, allowing users to scan the transcript as a whole, or the additional visual cues present in the transcript, to fixate and play regions of …
Total citations
[Chart: citations per year, 2010-2023]
Scholar articles
S Whittaker, J Hirschberg - Computer Speech & Language, 2007