Multimodal Language Understanding

Mary Harper

Purdue University


UMIACS Computational Linguistics Colloquium

December 8, 2004, 11 am, AVW rm 2120

Abstract:

When people interpret human-to-human communication, they do not focus solely on words and their meanings.  They use every available cue, including visual information such as other speakers' gestures and gaze.  How this information is synthesized to reach understanding remains an important unanswered question.  If we can understand how a multimodal language performance encodes its meaning, that knowledge could be exploited to build computer models that support effective multimodal human-to-computer exchanges.  This talk describes research on how gesture and speech may be fused to provide a more complete understanding of human-to-human dialogs.  Initial work focused on careful measurement studies of how gesture and speech interact over time, for both normal participants and participants with Parkinson's disease.  These studies support the hypothesis that the modalities act synergistically.  We are currently evaluating the impact that gesture has on the identification of sentence boundaries, speech repairs, and topic shifts in videotaped dialogs.  Some findings for sentence boundaries will be discussed in this talk.

About the Speaker:

Mary Harper is a Professor of Electrical and Computer Engineering at Purdue University, where she teaches and conducts research primarily concerned with computer modeling of human communication.  Her focus is on methods for incorporating multiple types of knowledge into computer algorithms for modeling human communication.  Recent research has focused on the integration of speech and natural language processing systems; the integration of speech, gesture, and gaze (involving researchers from six universities covering the disciplines of computer science, computer engineering, electrical engineering, psychology, neurology, and linguistics); and the use of hierarchical structure learned in an unsupervised fashion to improve the classification accuracy of documents and images.  Dr. Harper is currently on leave from Purdue, serving as the Program Director of the new Human Language and Communication program in the Division of Information and Intelligent Systems of the National Science Foundation's CISE directorate.

For the colloquium series schedule, see http://www.umiacs.umd.edu/research/CLIP/colloq/.  If you are interested in meeting with the speaker, please contact Doug Oard <http://www.glue.umd.edu/~oard/> (oard@umiacs.umd.edu).