Understanding the semantics of networked text (2012)

Abstract
Social networks are a powerful means of sharing information. A large social network typically has hundreds of millions of users, who are interconnected through social links to friends, colleagues, family members, and others. The frequent interaction and information exchange among these users form a massive heterogeneous information network. Understanding the semantic information in the textual data and the topological information in the social network poses a grand challenge for data mining researchers. This Ph.D. dissertation tackles the problem of understanding the unstructured or semi-structured data in social networks. First, we describe a parallel spectral clustering algorithm that makes clustering analysis possible on large-scale social networks with hundreds of millions of users. Comprehensive analysis, extraction, and integration of information from multiple sources are also necessary. To that end, we describe an information extraction engine that extracts data items from Web pages without knowing the data wrapping template. We also present an information integration approach that aggregates data tables collected from the Web and thereby better serves general Web search. To make information routing in collaborative networks more efficient, we describe generative models that characterize expertise awareness relationships between agents in collaborative networks and provide efficient task routing recommendations. We also describe, in depth, the first quantitative analysis of information flow efficiency in collaborative networks. Finally, to make use of the accumulated information, we develop a topic modeling approach that allows document retrieval across multiple document sets with possible semantic gaps and vocabulary gaps.
Keywords
social network, massive heterogeneous information network, information extraction engine, collaborative network, information integration approach, networked text, information sharing, semantic information, information routing, information flow efficiency, information exchange