TY - JOUR
T1 - Chat mining
T2 - Predicting user and message attributes in computer-mediated communication
AU - Kucukyilmaz, Tayfun
AU - Cambazoglu, B. Barla
AU - Aykanat, Cevdet
AU - Can, Fazli
PY - 2008/7
Y1 - 2008/7
N2 - The focus of this paper is to investigate the possibility of predicting several user and message attributes in text-based, real-time, online messaging services. For this purpose, a large collection of chat messages is examined. The applicability of various supervised classification techniques for extracting information from the chat messages is evaluated. Two competing models are used for defining the chat mining problem. A term-based approach is used to investigate the user and message attributes in the context of vocabulary use while a style-based approach is used to examine the chat messages according to the variations in the authors' writing styles. Among 100 authors, the identity of an author is correctly predicted with 99.7% accuracy. Moreover, the reverse problem is exploited, and the effect of author attributes on computer-mediated communications is discussed.
UR - http://www.scopus.com/inward/record.url?scp=44649163258&partnerID=8YFLogxK
U2 - 10.1016/j.ipm.2007.12.009
DO - 10.1016/j.ipm.2007.12.009
M3 - Article
AN - SCOPUS:44649163258
SN - 0306-4573
VL - 44
SP - 1448
EP - 1466
JO - Information Processing and Management
JF - Information Processing and Management
IS - 4
ER -