Social media research: The application of supervised machine learning in organizational communication research

W. van Zoonen, T.G.L.A. van der Meer

Research output: Contribution to journalArticleAcademicpeer-review

45 Citations (Scopus)

Abstract

Despite the online availability of data, analysis of this information in academic research is arduous. This article explores the application of supervised machine learning (SML) to overcome challenges associated with online data analysis. In SML classifiers are used to categorize and code binary data. Based on a case study of Dutch employees’ work-related tweets, this paper compares the coding performance of three classifiers, Linear Support Vector Machine, Naïve Bayes, and logistic regression. The performance of these classifiers is assessed by examining accuracy, precision, recall, the area under the precision-recall curve, and Krippendorf’s Alpha. These indices are obtained by comparing the coding decisions of the classifier to manual coding decisions. The findings indicate that the Linear Support Vector Machine and Naïve Bayes classifiers outperform the logistic regression classifier. This study also compared the performance of these classifiers based on stratified random samples and random samples of training data. The findings indicate that in smaller training sets stratified random training samples perform better than random training samples, in large training sets (n = 4000) random samples yield better results. Finally, the Linear Support Vector Machine classifier was trained with 4000 tweets and subsequently used to categorize 578,581 tweets obtained from 430 employees.
Original languageEnglish
Pages (from-to)132-141
Number of pages10
JournalComputers in Human Behavior
Volume63
DOIs
Publication statusPublished - 2016
Externally publishedYes

Fingerprint

Dive into the research topics of 'Social media research: The application of supervised machine learning in organizational communication research'. Together they form a unique fingerprint.

Cite this