Topic-focused multi-document summarization using an approximate oracle score

Title: Topic-focused multi-document summarization using an approximate oracle score
Publication Type: Conference Papers
Year of Publication: 2006
Authors: Conroy JM, Schlesinger JD, O'Leary DP
Conference Name: Proceedings of the COLING/ACL on Main conference poster sessions
Date Published: 2006
Publisher: Association for Computational Linguistics
Conference Location: Stroudsburg, PA, USA

We consider the problem of producing a multi-document summary given a collection of documents. Since most successful methods of multi-document summarization are still largely extractive, in this paper, we explore just how well an extractive method can perform. We introduce an "oracle" score, based on the probability distribution of unigrams in human summaries. We then demonstrate that with the oracle score, we can generate extracts which score, on average, better than the human summaries, when evaluated with ROUGE. In addition, we introduce an approximation to the oracle score which produces a system with the best known performance for the 2005 Document Understanding Conference (DUC) evaluation.
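The abstract does not give the oracle score's formula, so the following is only a minimal sketch of one plausible reading. It assumes that the probability of a unigram is estimated as the fraction of human summaries containing it, and that a sentence is scored by averaging that probability over its distinct terms. The function names (`tokenize`, `unigram_probabilities`, `oracle_score`, `rank_sentences`) and the whitespace tokenizer are hypothetical and are not taken from the paper.

```python
from collections import Counter

PUNCT = ".,;:!?\"'()"

def tokenize(text):
    # Assumed tokenizer: split on whitespace, strip punctuation, lowercase.
    stripped = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in stripped if w]

def unigram_probabilities(human_summaries):
    # Assumption: P(t) is the fraction of human summaries that contain term t.
    counts = Counter()
    for summary in human_summaries:
        counts.update(set(tokenize(summary)))
    n = len(human_summaries)
    return {term: c / n for term, c in counts.items()}

def oracle_score(sentence, probs):
    # Assumption: average P(t) over the distinct terms of the sentence.
    terms = set(tokenize(sentence))
    if not terms:
        return 0.0
    return sum(probs.get(t, 0.0) for t in terms) / len(terms)

def rank_sentences(sentences, human_summaries):
    # Order candidate sentences from highest to lowest score.
    probs = unigram_probabilities(human_summaries)
    return sorted(sentences, key=lambda s: oracle_score(s, probs), reverse=True)
```

Under these assumptions, an extract would be built by taking sentences from the top of the ranking until a length limit is reached. The approximation mentioned in the abstract would replace `probs` with estimates that do not require human summaries; how those estimates are obtained is not stated here.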