Finite-state multimodal parsing and understanding

Authors

Michael Johnston, Srinivas Bangalore

Publication date

2000

Conference

COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics

Description

Multimodal interfaces require effective parsing and understanding of utterances whose content is distributed across multiple input modes. Johnston (1998) presents an approach in which strategies for multimodal integration are stated declaratively using a unification-based grammar that is used by a multidimensional chart parser to compose inputs. This approach is highly expressive and supports a broad class of interfaces, but offers only limited potential for mutual compensation among the input modes, is subject to significant concerns in terms of computational complexity, and complicates selection among alternative multimodal interpretations of the input. In this paper, we present an alternative approach in which multimodal parsing and understanding are achieved using a weighted finite-state device which takes speech and gesture streams as inputs and outputs their joint interpretation. This approach is significantly more efficient, enables tight coupling of multimodal understanding with speech recognition, and provides a general probabilistic framework for multimodal ambiguity resolution.

Total citations

Cited by 165

[Chart: citations per year, 2000 to 2022]