TRSAv1: A new benchmark dataset for classifying user reviews on Turkish e-commerce websites

Journal of Information Science (2023)

Abstract
The amount of data produced has increased significantly with the development of Internet technologies. Accordingly, the importance of natural language processing studies has increased, and the topic has become one of the most studied subjects in artificial intelligence. Although it is a popular and widely studied topic, not enough studies have been conducted on the Turkish language. Even the studies conducted in Turkey are primarily on English and other natural languages rather than Turkish. The lack of a Turkish dataset is the most crucial reason for this shortage of studies. Therefore, as a solution, user reviews on e-commerce websites were collected and labelled as positive, negative or neutral, and a new and unique dataset consisting of 150,000 reviews was created. This dataset, named TRSAv1, was shared publicly with researchers and will contribute to Turkish natural language processing studies. In addition, the effect of different word representation methods on algorithm performance was examined in detail, and the results were compared.
Keywords
Machine learning, TRSAv1 dataset, Turkish sentiment analysis, word embedding