Abstract
In this paper, we propose the Automatic Taxonomy Construction from Text (ATCT) framework for building taxonomies from text-based Web corpora. The framework consists of several processing steps. First, domain terms are extracted using a filtering method. Next, Word Sense Disambiguation (WSD) is optionally applied to determine the senses of these terms. Then, a subsumption technique arranges the resulting concepts in a hierarchy. We construct taxonomies with and without WSD and investigate the effect of WSD on the quality of the type-of relations between concepts, using an evaluation framework that compares the result against a golden taxonomy. We find that WSD improves the quality of the built taxonomy as measured by the taxonomic F-Measure.
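The hierarchy-building step can be illustrated with a minimal sketch. The abstract does not say which subsumption rule ATCT uses, so the code below assumes a common document co-occurrence formulation: a concept x is taken to subsume a concept y when x appears in at least a fraction `threshold` of the documents that contain y, while y does not appear in every document that contains x. The function name `build_subsumption_edges`, the `threshold` value of 0.8, and the toy documents are all hypothetical choices made for illustration, not details from the paper.

```python
from itertools import permutations


def build_subsumption_edges(docs, threshold=0.8):
    """Return (parent, child) pairs under an assumed co-occurrence rule.

    docs is a list of sets, each holding the concepts found in one document.
    The rule and the threshold are illustrative assumptions, not ATCT's
    confirmed method.
    """
    # Map each concept to the indices of the documents it occurs in.
    occurrences = {}
    for index, concepts in enumerate(docs):
        for concept in concepts:
            occurrences.setdefault(concept, set()).add(index)

    edges = set()
    for parent, child in permutations(occurrences, 2):
        shared = len(occurrences[parent] & occurrences[child])
        # Fraction of the child's documents that also contain the parent.
        p_parent_given_child = shared / len(occurrences[child])
        # Fraction of the parent's documents that also contain the child.
        p_child_given_parent = shared / len(occurrences[parent])
        if p_parent_given_child >= threshold and p_child_given_parent < 1.0:
            edges.add((parent, child))
    return edges


# Toy corpus: each set lists the concepts detected in one document.
docs = [
    {"science", "economics", "finance"},
    {"science", "economics"},
    {"science", "economics", "finance"},
    {"science"},
]

for parent, child in sorted(build_subsumption_edges(docs)):
    print(f"{parent} -> {child}")
```

This sketch keeps every qualifying pair, including indirect ones such as an ancestor two levels up. Turning the result into a tree would need an additional step that retains only the most specific parent for each concept. With WSD enabled, the keys would be disambiguated senses rather than surface terms.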
Original language | English |
---|---|
Title of host publication | Twelfth International Conference on Web Information System Engineering (WISE 2011) |
Editors | A. Bouguettaya, M. Hauswirth, L. Liu |
Place of Publication | Berlin |
Publisher | Springer-Verlag |
Pages | 241-248 |
Number of pages | 8 |
Volume | 6997 |
Publication status | Published - 13 Oct 2011 |
Research programs
- EUR ESE 32