Story Forms Detection in Text through Concept-Based Co-Clustering

2016 IEEE International Conferences on Big Data and Cloud Computing (BDCloud), Social Computing and Networking (SocialCom), Sustainable Computing and Communications (SustainCom) (BDCloud-SocialCom-SustainCom) (2016)

Abstract
A story is defined as actors taking actions that culminate in resolutions. In this paper, we extract subject-verb-object relationships from paragraphs and generalize them into semantic conceptual representations. Overlapping generalized concepts and relationships correspond to archetypes/targets and actions that characterize story forms. We present an analytic framework that implements co-clustering based on generalized conceptual relationships to automatically detect such story forms. Co-clustering can help identify similarities that exist in low-dimensional subspaces of sparse data such as textual paragraphs. Through co-clustering, we detect not only the clusters themselves but also their characteristic features, which can be useful in describing and summarizing their contents. We perform co-clustering of stories using two different types of features: standard unigrams/bigrams and generalized concepts. We show that the residual error of factorization with concept-based features is significantly lower than the error with standard keyword-based features. Qualitative evaluations also suggest that concept-based features yield more coherent, distinctive, and interesting story forms than those produced with standard keyword-based features.
Keywords
Story forms, Narrative analysis, Non-negative matrix factorization, Co-clustering
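
The pipeline described in the abstract (subject-verb-object triples, generalization to concepts, then co-clustering by non-negative matrix factorization) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy triples in PARAGRAPHS, the hand-written CONCEPT_OF lexicon, the use of single concepts (rather than concept relationships) as features, the Lee-Seung multiplicative updates, and the argmax rule for assigning paragraphs and features to clusters.

```python
import numpy as np

# Hypothetical toy input: one (subject, verb, object) triple per paragraph.
PARAGRAPHS = [
    [("soldiers", "attack", "village")],
    [("troops", "raid", "town")],
    [("doctors", "treat", "patients")],
    [("nurses", "heal", "children")],
]

# Hypothetical word -> concept lexicon standing in for semantic generalization.
CONCEPT_OF = {
    "soldiers": "MILITARY", "troops": "MILITARY",
    "attack": "ASSAULT", "raid": "ASSAULT",
    "village": "SETTLEMENT", "town": "SETTLEMENT",
    "doctors": "MEDIC", "nurses": "MEDIC",
    "treat": "CARE", "heal": "CARE",
    "patients": "PERSON", "children": "PERSON",
}

def build_matrix(paragraphs, concept_of):
    """Return a paragraph-by-concept count matrix and the feature names."""
    features = sorted({concept_of[w] for p in paragraphs for t in p for w in t})
    index = {f: j for j, f in enumerate(features)}
    V = np.zeros((len(paragraphs), len(features)))
    for i, triples in enumerate(paragraphs):
        for triple in triples:
            for word in triple:
                V[i, index[concept_of[word]]] += 1.0
    return V, features

def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate V by W @ H (both non-negative) with multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k)) + eps
    H = rng.random((k, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def co_cluster(V, k, restarts=5):
    """Keep the best of several random restarts; label rows and columns
    by their strongest factor and report the Frobenius residual."""
    best = None
    for seed in range(restarts):
        W, H = nmf(V, k, seed=seed)
        err = np.linalg.norm(V - W @ H)
        if best is None or err < best[0]:
            best = (err, W, H)
    err, W, H = best
    return W.argmax(axis=1), H.argmax(axis=0), err

V, features = build_matrix(PARAGRAPHS, CONCEPT_OF)
rows, cols, err = co_cluster(V, k=2)

for c in range(2):
    members = [i for i, r in enumerate(rows) if r == c]
    traits = [f for f, l in zip(features, cols) if l == c]
    print(f"cluster {c}: paragraphs {members}, concepts {traits}")
print(f"residual error: {err:.4f}")
```

In this toy setting the four paragraphs share no surface words, so keyword features would give them nothing in common. After generalization, the two military paragraphs and the two medical paragraphs have identical concept rows, which is the kind of overlap the co-clustering step relies on. Each cluster comes with its characteristic concepts, which is what makes the result usable as a summary of a story form.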