Research in the pharmaceutical industry has traditionally been "wet science," with typical drug discovery and manufacturing processes taking many years. However, due to increased international competition and pressure from government regulators to conduct more thorough testing, the pharmaceutical industry is currently looking for ways to speed up the drug discovery and manufacturing processes.
One approach that some pharmaceutical companies (in partnership with IBM) are taking to answer this challenge is to use various available information sources to limit the search space. For example, can one find relationships between drugs/molecules and symptoms from either open literature or proprietary data? If a newly discovered drug can help with a particular disease, are there other diseases it could cure? I will describe three tools (among others) that we are in the process of building to help answer these types of questions: i) an information entity extraction tool based on classifier combination, ii) a taxonomy mapping algorithm using logistic regression, and iii) a hashing-based chemical search engine.
Tapas Kanungo has been a Research Staff Member at the IBM Almaden Research Center since 2001. His current research focus is text analysis in the healthcare and pharmaceutical industry. Prior to IBM, Tapas was Co-Director of the Language and Media Processing Lab at the University of Maryland. He received his Ph.D. in Electrical Engineering from the University of Washington, Seattle, WA, in 1996.
This talk is part of the CLIP Colloquium Series, organized by Jimmy Lin (jimmylin -at- umd .dot. edu). For the complete schedule, please visit http://www.umiacs.umd.edu/research/CLIP/colloq/.