The Integrality of Speech in Multimodal Interfaces

Citation of Original Publication

Michael A. Grasso, David Ebert, and Tim Finin. The Integrality of Speech in Multimodal Interfaces. ACM Transactions on Computer-Human Interaction (TOCHI), Volume 5, Issue 4, Dec. 1998, Pages 303-325. DOI: 10.1145/300520.300521


This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
© ACM, 1998. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Computer-Human Interaction (TOCHI), Vol. 5, Issue 4, Dec. 1998.


A framework of complementary behavior has been proposed that maintains that direct manipulation and speech interfaces have reciprocal strengths and weaknesses. This suggests that user interface performance and acceptance may increase by adopting a multimodal approach that combines speech and direct manipulation. This effort examined the hypothesis that the speed, accuracy, and acceptance of multimodal speech and direct-manipulation interfaces would increase when the modalities match the perceptual structure of the input attributes. A software prototype that supported a typical biomedical data collection task was developed to test this hypothesis. A group of 20 clinical and veterinary pathologists evaluated the prototype in an experimental setting using repeated measures. The results of this experiment supported the hypothesis that the perceptual structure of an input task is an important consideration when designing a multimodal computer interface. Task completion time, the number of speech errors, and user acceptance improved when the interface best matched the perceptual structure of the input attributes.