Multimodal Language Understanding
Mary Harper
Abstract:
When people interpret human-to-human communication, they do not focus solely on words and their meaning. They use every available cue, including visual information such as other speakers' gestures and gaze. How this information is synthesized to reach understanding remains an important open question. If we can understand how a multimodal language performance encodes its meaning, that knowledge could be exploited to build computer models that support effective multimodal human-to-computer exchanges. This talk describes research on how gesture and speech may be fused to provide a more complete understanding of human-to-human dialogs. Initial work focused on careful measurement studies of how gesture and speech interact over time, for participants with and without Parkinson's disease. These studies support the hypothesis that the modalities act synergistically. We are currently evaluating the impact of gesture on the identification of sentence boundaries, speech repairs, and topic shifts in videotaped dialogs. Some findings for sentence boundaries will be discussed in this talk.
About the Speaker:
Mary Harper is a Professor in Electrical and Computer Engineering at
For the colloquium series schedule, see http://www.umiacs.umd.edu/research/CLIP/colloq/. If you are interested in meeting with the speaker, please contact Doug Oard (http://www.glue.umd.edu/~oard/) at oard@umiacs.umd.edu.