Abstract
In this work, we investigate several machine learning methods for intent classification of dialogue utterances. We start with bag-of-words in combination with Naïve Bayes. After that, we employ continuous bag-of-words coupled with support vector machines (SVM). Then, we use long short-term memory (LSTM) networks, which we make bidirectional. The best-performing model is hierarchical, so that it can take advantage of the natural taxonomy within the classes. The main experiments compare these methods on an open-source academic dataset. In the first experiment, we consider the full dataset. We also consider the given subsets of the data separately, in order to compare our results with state-of-the-art vendor solutions. In general, we find that the SVM models outperform the LSTM models. The SVM models achieve the highest macro-F1 on the full dataset and on most of the individual datasets. We also find that incorporating the hierarchical structure of the intents improves performance.
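The abstract reports results as macro-F1. As a minimal sketch of that metric, assuming the standard definition (the unweighted mean of per-class F1 scores), it could be computed as below. The function name `macro_f1` and the intent labels are hypothetical and chosen only for illustration; they are not taken from the paper or its dataset.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0

    # Count true positives, false positives and false negatives per label.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    # Per-class F1 = 2*TP / (2*TP + FP + FN); every class has equal weight.
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical intent labels, for illustration only.
y_true = ["greet", "greet", "book_flight", "book_flight", "book_hotel"]
y_pred = ["greet", "greet", "book_flight", "book_hotel", "book_hotel"]
score = macro_f1(y_true, y_pred)
```

Because every class contributes equally regardless of how many utterances it has, macro-F1 penalizes poor performance on rare intents more than accuracy or micro-averaged F1 would.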
Original language | English |
---|---|
Article number | 88 |
Journal | IEEE Intelligent Systems |
Volume | 35 |
Issue number | 1 |
Early online date | 22 Nov 2019 |
DOIs | |
Publication status | Published - Feb 2020 |
Research programs
- ESE - E&MS