Question Classification from Thai Sentences by Considering Word Context to Question Generation

2022 Research, Invention, and Innovation Congress: Innovative Electricals and Electronics (RI2C) (2022)

Abstract
Automated question generation plays a role in many fields and applications, such as question answering systems, examination systems, and information retrieval. Before learning to generate questions, one should understand how to classify them. This research aims to generate possible questions by considering the possible question categories obtained from question classification based on Natural Language Processing. We compared the results of traditional classification models (Logistic Regression, Support Vector Machine, and Multinomial Naïve Bayes) with those of deep learning techniques (Convolutional Neural Networks, Bidirectional Long Short-Term Memory, combined CNN and BiLSTM models, and BERT models). The experimental results show that a preprocessing phase using Natural Language Processing can enhance question classification. Classifying sentences into question categories attained an average micro $F_{1}$-score of 91.40% when the BERT model, using pre-trained WangchanBERTa, was applied to simple sentences. In contrast, the SVM model with unigram + bigram TF-IDF features reached a satisfactory average micro $F_{1}$-score of 82.07% (up from 80.37% on the original input) when all POS tags were added. The CNN model with GloVe, with the focus POS tags added, achieved a satisfactory average micro $F_{1}$-score of 79.79%.
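Since every result above is reported as a micro $F_{1}$-score, a minimal sketch of how that metric can be computed may help. It assumes single-label, multi-class predictions (one question category per sentence); the function name `micro_f1` and the category labels are hypothetical and do not come from the paper.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for single-label multi-class predictions.

    True positives, false positives, and false negatives are pooled
    over all classes before precision and recall are combined.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # predicted class gets a false positive
            fn[gold] += 1  # gold class gets a false negative

    tp_sum = sum(tp.values())
    fp_sum = sum(fp.values())
    fn_sum = sum(fn.values())

    # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when there is no data.
    denom = 2 * tp_sum + fp_sum + fn_sum
    return 2 * tp_sum / denom if denom else 0.0


# Hypothetical question categories, for illustration only.
gold = ["who", "what", "where", "when", "what"]
pred = ["who", "what", "when", "when", "who"]
score = micro_f1(gold, pred)
print(f"micro F1 = {score:.2%}")
```

In this single-label setting each misclassification counts once as a false positive and once as a false negative, so the micro $F_{1}$-score coincides with plain accuracy.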
Keywords
question classification, automatic question generation, natural language processing, feature selection, Part of Speech Tags (POS)