Date: 3 November 2015
Speaker: Oliver Schulte (Simon Fraser University, Vancouver, Canada)
Venue: Sala J.Mira, ETSI Informática, UNED
Many organizations maintain data in databases. Multi-relational databases contain information about entities, attributes of entities, links, and attributes of links. This talk presents methods for applying Bayesian network learning to multi-relational data. Generative graphical models like Bayesian networks support important applications such as information extraction, entity resolution, link-based clustering, link-based outlier detection, and query optimization. I describe a scalable parameter learning method, based on the Fast Moebius Transform, that integrates statistical information across multiple tables in the database. For learning the structure of a graphical model, I describe a lattice search algorithm that efficiently searches for probabilistic associations along increasingly long relational pathways. These methods scale to millions of data records, for instance to data from the Internet Movie Database. Both theoretical arguments and empirical evidence indicate that Bayesian network learning provides excellent estimates of statistical associations in a relational database.