Word Sense Disambiguation for Automatic Taxonomy Construction from Text-Based Web Corpora

J de Knijff, K (Kevin) Meijer, Flavius Frasincar, Frederik Hogenboom

Research output: Chapter/Conference proceeding › Conference proceeding › Academic › peer-review

Abstract

In this paper, we propose the Automatic Taxonomy Construction from Text (ATCT) framework for building taxonomies from text-based Web corpora. The framework consists of several processing steps. First, domain terms are extracted using a filtering method. Next, Word Sense Disambiguation (WSD) is optionally applied to determine the senses of these terms. Then, by means of a subsumption technique, the resulting concepts are arranged in a hierarchy. We construct taxonomies with and without WSD and investigate the effect of WSD on the quality of the type-of relations between concepts, using an evaluation framework based on a golden taxonomy. We find that WSD improves the quality of the built taxonomy in terms of the taxonomic F-Measure.
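The subsumption step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact rule, so the code below assumes one common document co-occurrence formulation: concept x is treated as a parent of concept y when P(x | y) is at least a threshold and P(y | x) is below 1. The function name `build_subsumption_pairs`, the threshold value of 0.8, and the toy corpus are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from itertools import permutations


def build_subsumption_pairs(docs, threshold=0.8):
    """Return (parent, child) pairs under an assumed co-occurrence rule.

    docs: list of sets, each holding the concepts found in one document.
    Assumption (not from the paper): x subsumes y when
    P(x | y) >= threshold and P(y | x) < 1.
    """
    # Map each concept to the indices of the documents it occurs in.
    occurrences = {}
    for index, concepts in enumerate(docs):
        for concept in concepts:
            occurrences.setdefault(concept, set()).add(index)

    pairs = []
    # Check every ordered pair of distinct concepts.
    for x, y in permutations(sorted(occurrences), 2):
        both = len(occurrences[x] & occurrences[y])
        p_x_given_y = both / len(occurrences[y])
        p_y_given_x = both / len(occurrences[x])
        if p_x_given_y >= threshold and p_y_given_x < 1.0:
            pairs.append((x, y))
    return pairs


# Hypothetical toy corpus: each set lists the concepts in one document.
example_docs = [
    {"finance", "bank"},
    {"finance", "stock"},
    {"finance", "bank", "loan"},
    {"finance"},
]
example_pairs = build_subsumption_pairs(example_docs)
```

On this toy corpus, "finance" occurs wherever "bank", "stock", and "loan" occur, so it becomes their parent, and "bank" becomes a parent of "loan". With WSD applied beforehand, the sets would hold disambiguated senses rather than surface terms, so two senses of the same word could land in different places in the hierarchy.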
Original language: English
Title of host publication: Twelfth International Conference on Web Information System Engineering (WISE 2011)
Editors: A. Bouguettaya, M. Hauswirth, L. Liu
Place of Publication: Berlin
Publisher: Springer-Verlag
Pages: 241-248
Number of pages: 8
Volume: 6997
Publication status: Published - 13 Oct 2011

Research programs

  • EUR ESE 32
