A Semantic Approach for Extracting Domain Taxonomies from Text

K (Kevin) Meijer, Flavius Frasincar, Frederik Hogenboom

Research output: Contribution to journal › Article › Academic › peer-review

52 Citations (Scopus)

Abstract

In this paper we present a framework for the automatic building of a domain taxonomy from text corpora, called Automatic Taxonomy Construction from Text (ATCT). This framework comprises four steps. First, terms are extracted from a corpus of documents. From these extracted terms the ones that are most relevant for a specific domain are selected using a filtering approach in the second step. Third, the selected terms are disambiguated by means of a word sense disambiguation technique and concepts are generated. In the final step, the broader–narrower relations between concepts are determined using a subsumption technique that makes use of concept co-occurrences in a text. For evaluation, we assess the performance of the ATCT framework using the semantic precision, semantic recall, and the taxonomic F-measure that take into account the concept semantics. The proposed framework is evaluated in the field of economics and management as well as the medical domain.
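
As a rough illustration of the final step, the following Python sketch derives broader–narrower pairs from document-level concept co-occurrences. It is not the ATCT implementation. The function name `subsumption_pairs`, the 0.8 threshold, and the toy documents are assumptions made for illustration, and the rule shown is a generic co-occurrence subsumption test: concept x is treated as broader than concept y when x appears in most documents that contain y, but not the other way around.

```python
from itertools import permutations


def subsumption_pairs(docs, threshold=0.8):
    """Return (broader, narrower) concept pairs from co-occurrence counts.

    docs: list of sets, each holding the concepts found in one document.
    threshold: hypothetical cut-off for P(broader | narrower).
    """
    # Count in how many documents each concept occurs.
    doc_freq = {}
    for concepts in docs:
        for concept in concepts:
            doc_freq[concept] = doc_freq.get(concept, 0) + 1

    pairs = []
    for x, y in permutations(sorted(doc_freq), 2):
        # Number of documents in which both concepts occur.
        both = sum(1 for concepts in docs if x in concepts and y in concepts)
        p_x_given_y = both / doc_freq[y]
        p_y_given_x = both / doc_freq[x]
        # x subsumes y if x nearly always accompanies y, but not vice versa.
        if p_x_given_y >= threshold and p_y_given_x < p_x_given_y:
            pairs.append((x, y))
    return pairs


# Toy corpus (invented for this example): one set of concepts per document.
docs = [
    {"finance", "bank"},
    {"finance", "bank", "loan"},
    {"finance", "stock"},
    {"finance"},
]

for broader, narrower in subsumption_pairs(docs):
    print(f"{broader} > {narrower}")
```

In this toy corpus "finance" comes out as broader than "bank", "loan", and "stock", and "bank" as broader than "loan". A real pipeline would operate on disambiguated concepts rather than raw strings and would prune indirect links such as finance > loan.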
Original language: English
Pages (from-to): 78-93
Number of pages: 16
Journal: Decision Support Systems
Volume: 62
Publication status: Published - 27 Mar 2014