UMIACS Computational Linguistics Colloquium, October 23, 2002

Text Mining from MEDLINE using MeSHmap


Padmini Srinivasan


School of Library and Information Science, The University of Iowa
Visiting Faculty Scholar, Lister Hill Research Center, National Library of Medicine



October 23, 2002, 3:30pm, AVW Room 2120


Text mining is the automatic extraction of new knowledge from large text collections. Such knowledge is viewed as tentative, requiring further scientific study and verification. In this talk we present MeSHmap, our text mining tool developed for the MEDLINE database. Unlike several recent efforts at building similar tools, MeSHmap does not rely on co-occurrence of the concepts in the pair being examined. In addition, it is designed as a general-purpose text mining system capable of analysing individual concepts as well as pairs of concepts. Moreover, its capabilities are not restricted to particular types of concepts such as genes or drugs. This talk will include results from experiments designed to test MeSHmap's ability to make novel and potentially interesting discoveries. The first set of experiments tests MeSHmap's ability to identify meaningful concept associations in a pool that also includes random ones. The associations tested are derived from groups of drugs, diseases and genes. The second set of experiments investigates MeSHmap's ability to explore individual concepts. These experiments, conducted in the context of a disease epidemiology problem, also demonstrate the versatility of MeSHmap. Specifically, we use it to explore the relationship between the global prevalence of diseases and the global prevalence of research on those diseases.


For the colloquium series schedule, see the UMD Computational Linguistics Colloquium Series web page at http://umiacs.umd.edu/~resnik/cl_colloquium/. If you are interested in meeting with the speaker, please contact Philip Resnik (resnik@umiacs.umd.edu).