An Approach to Generate Topic Similar Document by Seed Extraction-Based SeqGAN Training for Bait Document.

DSC(2018)

Abstract
In recent years, topic-similar document generation has drawn increasing attention in both academia and industry. Bait document generation in particular is important for security. To generate bait documents quickly and make them more closely resemble the subject document, we propose a topic-similar document generation model based on SeqGAN (TSDG-SeqGAN). In the training phase, we segment the training text with the jieba word segmentation tool, which greatly reduces training time. In the generation phase, we extract keywords and a key sentence from the subject document as seeds and feed these seeds into the trained generation network. The network then produces keyword-based documents and key-sentence-based documents. Finally, we output the documents most similar to the subject document as the final result. Experiments show the effectiveness of our model.
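
The generation phase described in the abstract (extract seeds, generate candidates, keep the candidate closest to the subject document) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how keywords and the key sentence are scored or which similarity measure is used, so the sketch assumes term-frequency scoring and bag-of-words cosine similarity. The function names, the small `STOPWORDS` set, and the regex tokenizer (a stand-in for a word segmenter such as jieba) are all hypothetical, and the trained SeqGAN generator is left out; candidate documents are simply passed in as strings.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "from", "and", "to", "in"}


def tokenize(text):
    """Stand-in for a word segmenter: lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_keywords(text, k=3):
    """Assumed seed rule: the k most frequent non-stopword tokens."""
    counts = Counter(t for t in tokenize(text) if t not in STOPWORDS)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


def extract_key_sentence(text):
    """Assumed seed rule: the sentence with the highest mean term frequency."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    doc_counts = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        total = sum(doc_counts[t] for t in tokens if t not in STOPWORDS)
        return total / max(len(tokens), 1)

    return max(sentences, key=score)


def cosine_similarity(a, b):
    """Bag-of-words cosine similarity between two texts."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_most_similar(subject, candidates):
    """Final step: keep the candidate closest to the subject document."""
    return max(candidates, key=lambda c: cosine_similarity(subject, c))
```

In a full pipeline, the keywords and the key sentence would each be fed to the trained generator, and `select_most_similar` would be applied to the combined pool of keyword-based and key-sentence-based outputs.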
Keywords
topic similar document generation, SeqGAN, keywords, key sentence, seeds, generation network, bait document