Semi-Supervised Approaches for Information Analysis
Rebecca Hwa
ABSTRACT
Increasingly, information gathering and analysis have become an integral part of our daily activities. Product reviews written by other customers help us decide what to buy; online forums and blogs heighten our awareness of multiple perspectives on world events; powerful web search engines and community-maintained wikis allow us to learn about a topic easily and quickly, no matter how esoteric. Given the range and scale of the available data, however, it is difficult for us to pick out the relevant parts from the sea of information. Thus, developing automatic methods to preprocess these data is a major challenge. Machine learning approaches offer a way to enable a system to process a wide range of information input, but acquiring a sufficient amount of annotated training data is another concern.
In this talk, I discuss two approaches for addressing these challenges. One way to help users find relevant information more efficiently is to differentiate facts from opinions. I present a system for determining the strength of subjectivity in complex text. Another concern is that the information the user wants may be in a form unintelligible to the user, such as a foreign language. I present a framework for quickly developing the resources necessary for machine translation. Both approaches use semi-supervised learning methods to reduce the systems' reliance on annotated training data.
About the Speaker:
Rebecca Hwa joined the
For the colloquium series schedule, see the UMD Computational http://www.umiacs.umd.edu/research/CLIP/colloq/. If you are interested in meeting with the speaker, please contact Doug Oard <http://www.glue.umd.edu/~oard/> (