Date: June 17, 2013

Speaker: Eneko Agirre (Euskal Herriko Unibertsitatea/Universidad del País Vasco)

Venue: Salón de Actos, Facultad de Psicología, UNED


In recent years, many models have been proposed for predicting the clicks of web search users. In addition, some information retrieval evaluation metrics have been built on top of a user model. In this talk I bring these two directions together and propose a common approach to converting any click model into an evaluation metric. I then put the resulting model-based metrics, as well as traditional metrics (like DCG or Precision), into a common evaluation framework and compare them along a number of dimensions. One dimension I am particularly interested in is the agreement between offline and online experimental outcomes. It is widely believed, especially in industrial settings, that online A/B testing and interleaving experiments are generally better at capturing system quality than offline measurements. I show that offline metrics based on click models are more strongly correlated with online experimental outcomes than traditional offline metrics, especially when relevance judgements are incomplete.


(This is based on joint work with Aleksandr Chuklin and Pavel Serdyukov.)