Abstract
Automated analysis of the ever-increasing number of reviews available on the Web can help businesses identify why people like or dislike products, brands, or specific aspects of them. Such analysis depends on a reliable indication of the sentiment each review is intended to convey. This sentiment is typically quantified in universal star ratings, which are not always available. We propose several statistical methods for automatically classifying the star ratings of reviews and compare their performance. Reviews are represented as binary vectors whose features signal the presence of sentiment-carrying words. A nearest neighbor classifier maximizes recall, whereas a naïve Bayes classifier excels in terms of precision, accuracy, and the root mean squared error of the assigned number of stars.
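As a rough illustration of the setup described in the abstract, the sketch below encodes reviews as binary word-presence vectors and classifies star ratings with a Bernoulli naïve Bayes model and a 1-nearest-neighbor rule. It is a minimal sketch under stated assumptions, not the authors' implementation: the lexicon, the toy reviews, the Laplace smoothing, the Hamming distance, and all function names are hypothetical choices made for illustration only.

```python
import math
from collections import defaultdict

# Hypothetical sentiment lexicon (assumption; the publication's word list is not given here).
LEXICON = ["great", "love", "good", "okay", "poor", "terrible"]


def to_binary_vector(text, lexicon=LEXICON):
    """Return 1 for each lexicon word present in the review, else 0."""
    tokens = set(text.lower().split())
    return [1 if word in tokens else 0 for word in lexicon]


def train_bernoulli_nb(vectors, stars):
    """Estimate class priors and Laplace-smoothed word-presence probabilities."""
    by_class = defaultdict(list)
    for vec, star in zip(vectors, stars):
        by_class[star].append(vec)
    model = {}
    for star, rows in by_class.items():
        prior = len(rows) / len(vectors)
        # zip(*rows) iterates over feature columns within this class.
        probs = [(sum(col) + 1) / (len(rows) + 2) for col in zip(*rows)]
        model[star] = (prior, probs)
    return model


def predict_nb(model, vec):
    """Pick the star rating with the highest log-posterior."""
    def log_posterior(star):
        prior, probs = model[star]
        return math.log(prior) + sum(
            math.log(p if x else 1 - p) for x, p in zip(vec, probs)
        )
    return max(model, key=log_posterior)


def predict_1nn(vectors, stars, vec):
    """Return the rating of the closest training vector (Hamming distance)."""
    distances = [sum(a != b for a, b in zip(v, vec)) for v in vectors]
    return stars[distances.index(min(distances))]


def rmse(predicted, actual):
    """Root mean squared error of the assigned number of stars."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


# Invented toy reviews, for illustration only.
TRAIN = [
    ("great product love it", 5),
    ("good but okay overall", 3),
    ("poor quality terrible support", 1),
    ("love it great value", 5),
    ("terrible poor build", 1),
    ("okay nothing special", 3),
]
train_vectors = [to_binary_vector(text) for text, _ in TRAIN]
train_stars = [star for _, star in TRAIN]
nb_model = train_bernoulli_nb(train_vectors, train_stars)

query = to_binary_vector("great phone")
print(predict_nb(nb_model, query), predict_1nn(train_vectors, train_stars, query))
```

Treating the star rating as a class label keeps both classifiers simple, while the root mean squared error still rewards predictions that miss by one star over predictions that miss by four.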
Original language | English |
---|---|
Title of host publication | Management Intelligent Systems |
Editors | J. Casillas, F. Martinez-Lopez, J. Corchado |
Place of Publication | Salamanca, Spain |
Publisher | Springer-Verlag |
Pages | 251-260 |
Number of pages | 10 |
Volume | 171 |
ISBN (Print) | 9783642308635 |
Publication status | Published - 11 Jul 2012 |
Research programs
- EUR ESE 32