Date: 12 June 2012

Speaker: Enrique Amigó (NLP&IR-UNED)

Venue: Room 6.02, ETSI Informática, UNED

Abstract: The heterogeneity property of text evaluation measures states that the probability of a real (i.e. human-assessed) similarity increase is directly related to the heterogeneity of the set of automatic similarity measures that corroborate that increase. In this talk we (i) generalize this principle to all Natural Language Processing tasks that involve computing similarity between texts; (ii) present empirical evidence that it holds in a wide range of tasks: Textual Entailment, Clustering, Document Retrieval, Machine Translation evaluation and Text Summarization evaluation; and (iii) introduce a combination method for similarity measures that is based on the heterogeneity principle. The method is completely unsupervised (it does not use any human assessments of the quality of the measures to be combined) and leads to top-performing combined similarity measures in all the tasks considered.