Knowledge discovery of semantic relationships between words using nonparametric Bayesian graph model.
Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2008)
Abstract
We developed a nonparametric Bayesian model for automatically discovering semantic relationships between words in a corpus. The model targets semantic knowledge about words in particular domains, which has become increasingly important with the growing use of text mining, information retrieval, and speech recognition. We take the subject-predicate structure as the syntactic unit, with a noun as the subject and a verb as the predicate, and regard this structure as a graph. The generation of this graph can be modeled using the hierarchical Dirichlet process and the Pitman-Yor process. The probabilistic generative model we developed for this graph is built from subject-predicate structures extracted from a corpus. An evaluation that measured graph-clustering performance against WordNet similarities showed that the model outperforms the baseline models.
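For readers unfamiliar with the Pitman-Yor process named in the abstract, the following is a minimal sketch of its Chinese-restaurant seating scheme, which is one common way to draw a partition from it. This is not the authors' model: the function name `pitman_yor_crp`, its parameters, and the restriction to a positive concentration are assumptions made for illustration only. In a word-clustering setting, each "customer" could be read as a word occurrence and each "table" as a latent cluster.

```python
import random


def pitman_yor_crp(n_customers, discount, concentration, rng):
    """Seat customers one at a time under a Pitman-Yor Chinese restaurant process.

    Hypothetical helper for illustration. Returns (assignments, table_counts),
    where assignments[i] is the table index of customer i.
    """
    # Simplifying assumption: concentration > 0, so the first seating weight is positive.
    if not (0.0 <= discount < 1.0) or concentration <= 0.0:
        raise ValueError("need 0 <= discount < 1 and concentration > 0")

    table_counts = []  # number of customers at each existing table
    assignments = []
    for _ in range(n_customers):
        # Unnormalised weights: an existing table k gets (count_k - discount),
        # a new table gets (concentration + discount * number_of_tables).
        weights = [count - discount for count in table_counts]
        weights.append(concentration + discount * len(table_counts))

        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_counts):
            table_counts.append(1)  # open a new table
        else:
            table_counts[k] += 1    # join an existing table
        assignments.append(k)
    return assignments, table_counts


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    assignments, table_counts = pitman_yor_crp(200, 0.5, 1.0, rng)
    print("tables opened:", len(table_counts))
    print("largest tables:", sorted(table_counts, reverse=True)[:5])
```

A larger discount tends to open more small tables, giving the heavy-tailed cluster sizes that make the Pitman-Yor process a frequent choice for word-frequency data; setting the discount to 0 recovers the ordinary Dirichlet-process seating rule.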