We consider the problem of accurately estimating the frequency of terms in spoken documents. Such estimates are needed to apply most techniques from plain-text information retrieval to speech. We introduce a discriminative approach to vocabulary-independent term frequency estimation and show that it can perform significantly better than a previously established generative model at this task. We further introduce a new evaluation framework for this problem, emphasizing a system's ability to represent complete document vectors with high fidelity rather than simply detecting or ranking term occurrences.
Our term frequency estimator is constructed using Generalized Additive Models (GAMs), which the talk will briefly introduce. GAMs generalize "generalized linear models" (e.g., logistic regression): rather than weighting each input feature linearly, they learn a smooth transformation of each feature, and they can be used for classification or regression problems.
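For readers unfamiliar with the model class, the contrast can be written in the standard textbook form below. This is a general sketch, not the specific model from the talk; the symbols (link function g, coefficients \beta_j, smooth functions f_j, features x_j) are notation assumed here for illustration.

```latex
% Generalized linear model: the link g of the expected response
% is a linear function of the input features x_1, ..., x_p.
g\bigl(\mathbb{E}[y \mid x]\bigr) = \beta_0 + \sum_{j=1}^{p} \beta_j x_j

% Generalized additive model: each linear term \beta_j x_j is
% replaced by a smooth function f_j learned from the data.
g\bigl(\mathbb{E}[y \mid x]\bigr) = \beta_0 + \sum_{j=1}^{p} f_j(x_j)

% Logistic regression is the special case of the first form with the
% logit link, where \mu = \mathbb{E}[y \mid x]:
g(\mu) = \log\frac{\mu}{1 - \mu}
```

Setting every f_j(x_j) = \beta_j x_j recovers the generalized linear model, which is the sense in which GAMs generalize it.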
Scott Olsson is a graduate student in Applied Mathematics and Scientific Computation (AMSC). He is advised by Doug Oard. Scott's primary research interests are spoken document retrieval and classification. He is currently a student fellow of the Human Language Technology Center of Excellence.
This talk is part of the CLIP Colloquium Series, organized by Jimmy Lin (jimmylin -at- umd .dot. edu). For the complete schedule, please visit http://www.umiacs.umd.edu/research/CLIP/colloq/.