Text Mining For Information Systems Researchers: An Annotated Topic Modeling Tutorial.

COMMUNICATIONS OF THE ASSOCIATION FOR INFORMATION SYSTEMS (2016)

Abstract
Analysts have estimated that more than 80 percent of today's data is stored in unstructured form (e.g., text, audio, image, video), much of it expressed in rich and ambiguous natural language. Traditionally, researchers have analyzed natural language with qualitative data-analysis approaches, such as manual coding. Yet the size of text data sets obtained from the Internet makes manual analysis virtually impossible. In this tutorial, we discuss the challenges encountered when applying automated text-mining techniques in information systems research. In particular, we showcase how to use probabilistic topic modeling via latent Dirichlet allocation, an unsupervised text-mining technique, together with a LASSO multinomial logistic regression to explain user satisfaction with an IT artifact by automatically analyzing more than 12,000 online customer reviews. For fellow information systems researchers, this tutorial provides guidance for conducting text-mining studies on their own and for evaluating the quality of others' studies.
Keywords
Text Mining, Topic Modeling, Latent Dirichlet Allocation, Online Customer Reviews, User Satisfaction