Date: April 30, 2013

Speaker: Francisco Valverde (NLP&IR-UNED)

Venue: Room 1.03, ETSI Informática, UNED (map)

Abstract:

The most widespread measure of classification performance, accuracy, suffers from a paradox: predictive models with a given level of accuracy may have greater predictive power than models with higher accuracy. We argue that one reason for this may be that, despite optimizing the classification error rate, high-accuracy models fail to capture crucial information transfer in the classification task.

We therefore set out to solve the problem of assessing classification when maximizing the statistical information captured by the model is the main goal of the classification process, e.g. in exploratory analysis.

For this purpose we concentrate on a different quantity of a classifier, its perplexity, and show how it relates to classification accuracy. Using perplexity we are then able to obtain the normalized information transfer (NIT) factor, a measure of how efficiently information is transmitted from the input set of classes to the output set of classes.
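
The following is a minimal sketch (not code from the talk) of how these quantities could be computed from a confusion matrix. It assumes the NIT factor is the ratio between the perplexity transferred through the classifier, 2^I(X;Y), and the perplexity of the input class distribution, 2^H(X); the exact definition presented by the speaker may differ.

import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability vector (zero entries ignored)."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def nit_factor(confusion):
    """NIT factor from a confusion matrix (true classes on rows, predictions on columns)."""
    joint = confusion / confusion.sum()        # joint distribution P(X, Y)
    p_true = joint.sum(axis=1)                 # input class distribution P(X)
    p_pred = joint.sum(axis=0)                 # output class distribution P(Y)
    mutual_info = entropy(p_true) + entropy(p_pred) - entropy(joint.ravel())
    return 2 ** mutual_info / 2 ** entropy(p_true)

# Two classifiers on a 90/10 binary task: the first always predicts the
# majority class (90% accuracy, no information transferred), the second is
# less accurate (83%) but its predictions do carry information.
majority = np.array([[90, 0], [10, 0]])
informative = np.array([[75, 15], [2, 8]])
print(nit_factor(majority), nit_factor(informative))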

We claim that the NIT factor is a more natural measure of classification performance than accuracy when the assessment criterion is the transfer of information through the classifier rather than the classification error count. It also makes it harder for classifiers to 'cheat' using techniques like specialization. We show how to use it in classification assessment and how it rejects rankings based on accuracy.
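
As an illustration of the specialization point (a constructed example, not one from the talk, and using the hypothetical NIT definition sketched above): on a task where 90% of the items belong to one class, a classifier that always predicts that class reaches 90% accuracy yet transfers zero mutual information, leaving its NIT factor at its floor of 1/2^H(X), whereas a classifier with only 83% accuracy whose errors are spread across both classes transfers positive information and obtains a higher NIT factor.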