Date: 19 June 2015, 11:30
Speaker: Piek Vossen (University Amsterdam)
Venue: Sala de Grados, E.T.S.I. Industriales, UNED
The FP7 project NewsReader develops "reading technology" that extracts events, participants, time and place from news in four languages and creates event-centric knowledge graphs in RDF. News comes from many different sources that partially report on the same events. Their reports overlap but also differ: they give different information on the same events or contradict each other. Furthermore, news is a stream of information that continues to reflect changes in the world as time passes. In our model we define events as instances linked to all their mentions in the sources. This allows us to aggregate information on the events across these mentions and to represent both the merged result and the conflicts. Event coreference is the core technology that derives an instance representation in RDF from the NLP analysis of the mentions. We applied the technology to millions of English articles on the automotive industry published between 2003 and 2013, but also to news in Spanish, Dutch and Italian. Although the NLP analysis is specific to each language, the instance representation of events in RDF is agnostic to the way an event is expressed in text and to the language that was used. We can thus merge the information coming from news in different languages into a single representation. Applying this to millions of news articles results in large knowledge graphs with hundreds of millions of events. We define an abstract model to construct storylines from these events. These storylines summarise event sequences on timelines, which provides a more natural way of accessing the data.
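
The instance/mention model described above can be illustrated with a minimal Python sketch. It is not the NewsReader implementation: the `Mention` and `EventInstance` classes, the `corefer` and `storyline` functions, the exact-match coreference key (action, participants, date) and the sample data are all assumptions made for illustration. Real event coreference relies on much richer evidence than an exact match, and the actual output is RDF rather than Python objects.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Mention:
    """One report of an event in a single source (hypothetical schema)."""
    source: str              # identifier of the article
    language: str            # language of the article, e.g. "en", "es"
    action: str              # language-neutral event type
    participants: frozenset  # language-neutral participant identifiers
    date: str                # ISO date, so string order equals time order
    attributes: tuple = ()   # extra (name, value) pairs reported by the source


@dataclass
class EventInstance:
    """One event, linked to every mention that reports it."""
    key: tuple
    mentions: list = field(default_factory=list)
    values: dict = field(default_factory=lambda: defaultdict(set))

    def add(self, mention):
        # Keep the link to the mention and aggregate what it reports.
        self.mentions.append(mention)
        for name, value in mention.attributes:
            self.values[name].add(value)

    def conflicts(self):
        # An attribute with more than one reported value is a conflict.
        return {n: v for n, v in self.values.items() if len(v) > 1}


def corefer(mentions):
    """Group mentions into instances (assumed rule: identical key)."""
    instances = {}
    for m in mentions:
        key = (m.action, m.participants, m.date)
        if key not in instances:
            instances[key] = EventInstance(key)
        instances[key].add(m)
    return list(instances.values())


def storyline(instances, participant):
    """Order the events that involve one participant on a timeline."""
    selected = [e for e in instances if participant in e.key[1]]
    return sorted(selected, key=lambda e: e.key[2])


# Invented sample data: two sources in different languages report the
# same acquisition with different prices; a third reports another event.
mentions = [
    Mention("article-1", "en", "acquire",
            frozenset({"CompanyA", "CompanyB"}), "2009-06-10",
            (("price", "2.0bn"),)),
    Mention("article-2", "es", "acquire",
            frozenset({"CompanyA", "CompanyB"}), "2009-06-10",
            (("price", "2.5bn"),)),
    Mention("article-3", "nl", "recall",
            frozenset({"CompanyA"}), "2008-01-15"),
]
events = corefer(mentions)
timeline = storyline(events, "CompanyA")
```

Because the grouping key contains only language-neutral fields, the English and Spanish mentions end up in the same instance, and the disagreement about the price is kept as a conflict instead of being overwritten.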