Recency and quality-based ranking question in CQAs: A Stack Overflow case study

Information Processing & Management (2021)

Abstract
Recency ranking, in Community-based Question Answering (CQA), refers to placing recent answers in the top positions of a list. Here, being recent does not depend on how new an answer's creation or editing date is, but on how current its content is. A good ranking should also consider answer quality, since a current but low-quality answer may be useless. Similarly, a high-quality answer, with adequate text and references but obsolete information, may be worthless. Combining these two aspects (recency and quality) is crucial, as users usually hope for current solutions and need fast, easy access (top items in the ranking) to the best answers so they can solve their problems quickly. CQAs usually provide voting mechanisms so that users can indicate the best-quality answers. However, this method does not account for the recency of the answers, and it is a slow and subjective process that does not keep up with the dynamism of new content. Therefore, we propose an automatic approach that considers the answer's recency, in addition to its quality, when generating the ranking. We use textual and non-textual features that indicate the answers' quality and recency, extracted from the users' answers in the CQA environment as a whole. In our approach, quality is used to classify the answers as good or poor using a threshold value, generating two sets of answers: high quality and low quality. Then, both sets are sorted by recency. Finally, these sets are concatenated to produce the final ranking, so that the best and most current answers are in the top positions. To verify our proposal's effectiveness, we performed a case study on the Stack Overflow CQA with a set of experiments, using different combinations of features and different learning-to-rank algorithms.
Our main contributions are: (1) an approach for ranking the answers in a question dataset by recency and quality; (2) a thorough evaluation of 9 learning-to-rank algorithms, showing that Coordinate Ascent and LambdaMART perform best in this task; (3) a feature analysis showing that features related to the age of the answer helped improve ranking performance when both recency and quality are taken into account. Furthermore, as far as we know, this is the first work that considers the recency of an answer in this task.
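The three-step ranking described in the abstract (split by a quality threshold, sort each set by recency, concatenate) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Answer` class, the `quality` and `recency` scores, and the default threshold of 0.5 are all hypothetical stand-ins, since the abstract does not specify how the scores are produced or which threshold is used.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Answer:
    answer_id: str
    quality: float  # hypothetical quality score (assumed: higher is better)
    recency: float  # hypothetical recency score (assumed: higher is more current)


def rank_answers(answers: List[Answer], quality_threshold: float = 0.5) -> List[Answer]:
    """Split answers by a quality threshold, sort each set by recency, concatenate."""
    # Step 1: classify answers as high or low quality (threshold value is an assumption).
    high = [a for a in answers if a.quality >= quality_threshold]
    low = [a for a in answers if a.quality < quality_threshold]
    # Step 2: sort each set so the most current answers come first.
    high.sort(key=lambda a: a.recency, reverse=True)
    low.sort(key=lambda a: a.recency, reverse=True)
    # Step 3: concatenate, high-quality set first.
    return high + low


# Illustrative data only; these scores are made up for the example.
answers = [
    Answer("a1", quality=0.9, recency=0.2),
    Answer("a2", quality=0.3, recency=0.95),
    Answer("a3", quality=0.7, recency=0.8),
    Answer("a4", quality=0.4, recency=0.1),
]
ranking = [a.answer_id for a in rank_answers(answers)]
print(ranking)  # ['a3', 'a1', 'a2', 'a4']
```

In this toy example, `a2` has the most current content but falls below the assumed quality threshold, so it is placed after both high-quality answers.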
Keywords
Community-based question answering, CQA ranking, Recency ranking, Quality ranking, Learning to rank, Recency features, Quality features, Textual features, Non-textual features