Effect of Domain on Word Prediction Quality: Using Stochastic and Neural Network Models

OITS International Conference on Information Technology (2023)

Abstract
Text prediction is the technique of predicting text while the user types. It began as a way to enhance augmentative and alternative communication and was later implemented in applications such as phone text messaging, email systems and many more. This paper examines the predictive power of N-gram and character-buffer models trained with a GRU (Gated Recurrent Unit) on different datasets. We apply two approaches, statistical and neural, and find the neural approach to be the better option for feature sets such as bigrams and trigrams. We investigate how domain differences influence word prediction quality and quantify that influence, testing on datasets from different domains, each with the same corpus size. The results show a drop of 26.1% to 41.28% in word prediction accuracy. This work aims to raise awareness of domain differences in unstructured text data and to identify steps for handling them.
Keywords
Word prediction, N-Gram, Character Buffer, Neural Network
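
The domain effect described in the abstract can be illustrated with a minimal sketch of a statistical bigram predictor. This is not the authors' implementation: the function names, the toy corpora and the top-1 accuracy metric are all assumptions made for illustration, and the abstract does not state whether its reported drop is absolute or relative. The sketch reports an absolute difference.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count, for each word, how often each following word occurs."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, prev):
    """Return the most frequent follower of `prev`, or None if `prev` is unseen."""
    followers = counts.get(prev)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

def accuracy(counts, tokens):
    """Fraction of positions where the top-1 prediction equals the actual next word."""
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        return 0.0
    hits = sum(predict_next(counts, prev) == nxt for prev, nxt in pairs)
    return hits / len(pairs)

# Toy corpora (invented for this sketch, not data from the paper).
news_train = "the market fell today and the market fell again".split()
news_test = "and the market fell".split()      # same domain as training
chat_test = "and the movie was great".split()  # different domain

model = train_bigram(news_train)
in_domain = accuracy(model, news_test)
out_domain = accuracy(model, chat_test)
drop = in_domain - out_domain  # absolute drop in top-1 accuracy

print(f"in-domain: {in_domain:.2f}, out-of-domain: {out_domain:.2f}, drop: {drop:.2f}")
```

On real data, the training and test corpora would be drawn from the different domain datasets at equal corpus size, as the abstract describes, and a trigram or neural model could be substituted for `train_bigram` while keeping the same evaluation loop.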