Evaluating evaluation methods for generation in the presence of variation

Authors

Amanda Stent, Matthew Marge, Mohit Singhai

Publication date

2005/2/13

Book

International Conference on Intelligent Text Processing and Computational Linguistics

Pages

341-351

Publisher

Springer Berlin Heidelberg

Description

Recent years have seen increasing interest in automatic metrics for the evaluation of generation systems. When a system can generate syntactic variation, automatic evaluation becomes more difficult. In this paper, we compare the performance of several automatic evaluation metrics using a corpus of automatically generated paraphrases. We show that these evaluation metrics can at least partially measure adequacy (similarity in meaning), but are not good measures of fluency (syntactic correctness). We make several proposals for improving the evaluation of generation systems that produce variation.

Total citations

Cited by 163

[Chart: citations per year, 2005-2024]
