%0 Journal Article
%J Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization
%D 2005
%T A methodology for extrinsic evaluation of text summarization: Does ROUGE correlate?
%A Dorr, Bonnie J
%A Monz, C.
%A President, S.
%A Schwartz, R.
%A Zajic, David
%X This paper demonstrates the usefulness of summaries in an extrinsic task of relevance judgment based on a new method for measuring agreement, Relevance-Prediction, which compares subjects’ judgments on summaries with their own judgments on full text documents. We demonstrate that, because this measure is more reliable than previous gold-standard measures, we are able to make stronger statistical statements about the benefits of summarization. We found positive correlations between ROUGE scores and two different summary types, where only weak or negative correlations were found using other agreement measures. However, we show that ROUGE may be sensitive to the choice of summarization style. We discuss the importance of these results and the implications for future summarization evaluations.
%B Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization
%P 1-8
%8 2005///
%G eng