Date: February 25, 2014

Speaker: Hegler Tissot (Federal University of Paraná, Brazil)

Venue: Room 1.03, ETSI Informática, UNED

Abstract:

As more and more unstructured knowledge is represented in computer-readable form, it is necessary not only to understand how to use it, but also to build tools that can effectively extract and analyze information and make its meaning useful. Information Extraction (IE) systems and techniques can deal with the vast amount of textual information available today. Ontology-based Information Extraction (OBIE) is a subfield of IE that uses ontologies to assist in the extraction of domain-specific information. However, ontologies lack semantics for temporal information, so they cannot support reasoning about temporal knowledge. Medical records are an example of textual content in which events are related to temporal information. Without the ability to extract temporal data that places events on a timeline, it is difficult to understand how those events are ordered chronologically. Temporal information, however, is not always expressed precisely, as in expressions like "some days ago". Moreover, input text may contain spelling errors, which further complicate the understanding of such expressions. Although some proposed approaches address the identification of explicit, implicit, or even imprecise date and time expressions in text during the IE process, existing OBIE solutions do not have integrated support for uncertain temporal knowledge coupled with text that contains spelling errors. In this work we present a proposal to extract and handle precise and imprecise temporal data within an OBIE process. The temporal expressions may themselves contain spelling errors, which are handled by a novel phonetic search solution. We will present a set of components in an OBIE framework for handling these problems. We have already developed the phonetic search solution, which is a first result toward the final work.