Abstract
Emerging news on economic events such as acquisitions has a substantial impact on financial markets. Hence, it is important to identify economic events in news automatically, accurately, and in a timely manner. This requires processing a large number of heterogeneous sources of unstructured data to extract knowledge that can guide decision-making processes. We propose a Semantics-based Pipeline for Economic Event Detection (SPEED), with which we aim to extract financial events from emerging news gathered from RSS feeds and to annotate these with machine-understandable metadata, while remaining fast enough for real-time use.
Our framework is modeled as a pipeline that takes news messages as input and is driven by a financial ontology developed by domain experts, containing information extracted from Yahoo! Finance on NASDAQ-100 companies. In our implementation, we have reused some components of the existing General Architecture for Text Engineering (GATE) framework and developed new ones, e.g., an Ontology Gazetteer and a Word Sense Disambiguator.
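
To illustrate the idea of an ontology-driven gazetteer, the following is a minimal sketch in Python. It is not the SPEED implementation and does not use the GATE API; the lexicon entries, the concept identifiers, and the `annotate` function are all hypothetical and serve only to show how lexical representations stored in an ontology might be matched against news text.

```python
import re

# Hypothetical ontology fragment: lexical representation -> concept identifier.
# These entries are illustrative and are not taken from the SPEED ontology.
LEXICON = {
    "google": "Company:Google",
    "microsoft": "Company:Microsoft",
    "acquires": "Event:Acquisition",
    "acquisition": "Event:Acquisition",
}


def annotate(text, lexicon=LEXICON):
    """Return (start, end, surface, concept) for each word found in the lexicon."""
    annotations = []
    # Scan alphabetic tokens and look each one up case-insensitively.
    for match in re.finditer(r"[A-Za-z]+", text):
        concept = lexicon.get(match.group().lower())
        if concept is not None:
            annotations.append(
                (match.start(), match.end(), match.group(), concept)
            )
    return annotations


annotations = annotate("Google acquires a startup")
```

A single-token dictionary lookup like this keeps annotation fast, but it cannot decide between several senses of the same word, which is the kind of ambiguity a separate word sense disambiguation step would have to resolve.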
Experiments on 200 news messages fetched from the Yahoo! Business and Technology RSS feeds show fast, sub-second gazetteering and a precision and recall for concept identification of 86% and 81%, respectively. Precision and recall for fully decorated events are lower, at approximately 62% and 53%, because such events rely on multiple concepts that must each be identified correctly. Our Word Sense Disambiguator with an adapted SSI algorithm shows a precision and recall of 59%, compared to 53% and 31% for the original algorithm.
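
For readers unfamiliar with the reported measures, the sketch below shows how precision and recall are conventionally computed from annotation counts. The function name and the counts are hypothetical and chosen for illustration only; they are not the counts behind the figures reported above.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Compute precision and recall from raw annotation counts."""
    predicted = true_positives + false_positives  # everything the system annotated
    actual = true_positives + false_negatives     # everything in the gold standard
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts: 8 correct annotations, 2 spurious, 4 missed.
precision, recall = precision_recall(8, 2, 4)
```

Precision penalizes spurious annotations and recall penalizes missed ones, which is consistent with the observation that events depending on several correctly identified concepts score lower on both.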
Original language | English |
---|---|
Publication status | Published - 22 Oct 2012 |
Event | The Interface for Dutch ICT-Research 2012 (ICT.OPEN 2012) - Rotterdam, the Netherlands. Duration: 22 Oct 2012 → 23 Oct 2012 |
Conference
Conference | The Interface for Dutch ICT-Research 2012 (ICT.OPEN 2012) |
---|---|
City | Rotterdam, the Netherlands |
Period | 22/10/12 → 23/10/12 |
Research programs
- EUR ESE 32