Density Based Spatial Clustering of Applications with Noise and Sentence Bert Embedding for Indonesian Utterance Clustering

2023 International Conference on Computer Science, Information Technology and Engineering (ICCoSITE) (2023)

Abstract
Task-oriented chatbots are a sub-topic of chatbot research in which the chatbot performs certain tasks with specific goals. One part of building a task-oriented chatbot is intent classification, which is a text classification task. As with text classification in general, the dataset must be labeled before the classification process can be carried out. Clustering is an established method for speeding up and supporting the utterance analysis process. Density-based clustering is a family of clustering methods that can determine cluster patterns in arbitrary data, and DBSCAN is one of its algorithms. This research used 10,000 client utterances from a WhatsApp-based e-commerce conversation. SentenceBert was used as a state-of-the-art sentence embedding. The best result was a silhouette score of 0.327, obtained with an eps of 0.1 and a MinPts of 95. However, the cluster results show that sentences labelled as noise can be clustered further. Text preprocessing, text augmentation, and sentence embedding techniques can be explored to improve clustering performance.
Keywords
DBSCAN, SentenceBert, clustering, utterance
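
The pipeline summarized in the abstract (embed utterances, cluster them with DBSCAN, then score the result with the silhouette coefficient) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. Several things here are assumptions that the abstract does not state: the toy 2-D vectors stand in for SentenceBert embeddings, cosine distance is used as the metric, noise points are excluded from the silhouette score, and the `eps` and `min_pts` values are chosen for the toy data rather than taken from the paper (which reports eps = 0.1 and MinPts = 95 on 10,000 utterances). The function names are hypothetical.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def cosine_distance(a, b):
    """Cosine distance between two vectors (assumed metric, not from the paper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def dbscan(points, eps, min_pts, dist=cosine_distance):
    """Plain DBSCAN: returns one cluster label per point, NOISE for outliers."""
    n = len(points)
    # Each neighborhood includes the point itself.
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # core point: keep expanding
    return labels


def silhouette_score(points, labels, dist=cosine_distance):
    """Mean silhouette over clustered points (noise excluded by assumption)."""
    idx = [i for i, lab in enumerate(labels) if lab != NOISE]
    clusters = sorted({labels[i] for i in idx})
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i in idx:
        mean_to = {}
        for c in clusters:
            members = [j for j in idx if labels[j] == c and j != i]
            if members:
                total = sum(dist(points[i], points[j]) for j in members)
                mean_to[c] = total / len(members)
        own = labels[i]
        if own not in mean_to:
            scores.append(0.0)  # singleton cluster scores 0 by convention
            continue
        a = mean_to[own]
        b = min(v for c, v in mean_to.items() if c != own)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy stand-ins for sentence embeddings: two tight groups and one outlier.
embeddings = [
    (1.0, 0.0), (1.0, 0.05), (1.0, 0.1), (0.95, 0.02),
    (0.0, 1.0), (0.05, 1.0), (0.1, 1.0), (0.02, 0.95),
    (1.0, 1.0),
]
labels = dbscan(embeddings, eps=0.05, min_pts=3)
score = silhouette_score(embeddings, labels)
print(labels, round(score, 3))
```

On real data, the embeddings would come from a sentence embedding model and the clustering from an optimized library implementation. Sweeping `eps` and `min_pts` and comparing silhouette scores is one way to arrive at a best setting like the one the abstract reports, and the points labelled `NOISE` are the ones the abstract suggests clustering further.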