Identifying Relevant Sentences for Travel Blogs from Wikipedia Articles

DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2022, PT III (2022)

Abstract
Travel blogs are a rich source of information about tourist spots. Millions of travellers use them to decide where to go and what to do. However, they are often devoid of factual information, which undermines their trustworthiness and credibility. On the other hand, Wikipedia is rich with accurate factual descriptions but is often bland and unfit for blogs. We propose a system that identifies blog-worthy sentences on Wikipedia so that travel blogs can be augmented with factual information, making them more reliable and trustworthy. We curate a dataset of over 1.83 M Wikipedia sentences from 234,305 travel-related articles and assign a blog-worthiness score to each sentence by comparing it against 20 M sentences from approximately 600K blogs. Our best BERT-based model achieves an F1 score of 0.74 in identifying whether a sentence is blog-worthy.
Keywords
travel blogs,articles,relevant sentences