Topic Modeling of Political Dynamics with Shifted Cosine Similarity

Integrated Uncertainty in Knowledge Modelling and Decision Making (IUKM 2022)

Abstract
Topic modeling with community detection can be used to explore the latent semantic structure of documents: a network, i.e., a graph, depicts the semantic relations between words. In some network-based topic models, the similarity between words (vertices) is essential for obtaining a network with a clear community structure. Word embeddings trained on a large corpus perform well empirically as rich semantic representations, so this research constructs a novel similarity for a network-based topic model (NAM). In this paper, we first propose an intuitive similarity measure based on shifted cosine similarity between word embeddings. This measure replaces the similarity based on the typical point-wise mutual information (PMI). Second, topics of the corpus over the global period are induced by NAM under the different similarity measures. Finally, we use NAM to capture the dynamic changes of political topics in China and interpret these dynamics using their historical background. Although our similarity measure introduces semantic differences caused by the difference between data sets and has one more parameter, the experimental results show the effectiveness of our newly proposed measure.
Keywords
Topic model, Network analysis, Word embeddings
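
The abstract does not define the shifted cosine similarity, so the following is only a minimal sketch under a stated assumption: by analogy with shifted PMI, the cosine similarity between two word embeddings is reduced by a shift parameter and clipped at zero, so that weakly related word pairs produce no edge in the word network. The function names, the `shift` value, and the toy two-dimensional embeddings are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_u * norm_v)


def shifted_cosine(u, v, shift=0.3):
    """ASSUMED form: subtract a shift, then clip negative values to zero."""
    return max(cosine(u, v) - shift, 0.0)


def build_edges(embeddings, shift=0.3):
    """Weighted edges between word pairs whose shifted similarity is positive."""
    words = sorted(embeddings)
    edges = {}
    for i, w1 in enumerate(words):
        for w2 in words[i + 1:]:
            weight = shifted_cosine(embeddings[w1], embeddings[w2], shift)
            if weight > 0.0:
                edges[(w1, w2)] = weight
    return edges


# Toy embeddings, invented purely for illustration.
toy_embeddings = {
    "reform": [1.0, 0.0],
    "policy": [0.8, 0.6],
    "harvest": [0.0, 1.0],
}

edges = build_edges(toy_embeddings, shift=0.7)
print(edges)
```

Under this assumed form, a larger shift yields a sparser graph, which is one way the extra parameter could sharpen the community structure that a community detection step then partitions into topics.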