Automated Prediction of Good Dictionary EXamples (GDEX): A Comprehensive Experiment with Distant Supervision, Machine Learning, and Word Embedding-Based Deep Learning Techniques

Complexity (2021)

Abstract
Dictionaries are not only a source of word meanings; they also help readers understand the context in which words are used. For this purpose, comprehensive print dictionaries, and more recently online dictionaries, accompany a word with a short example sentence. Lexicographers take meticulous care in eliciting Good Dictionary EXamples (GDEX), i.e., sentences that best fit a dictionary entry for the word's definition. The rules for eliciting GDEX are demanding, and applying them manually is time-consuming. This paper therefore focuses on two major tasks: the development of labelled corpora for the top 3K English words using a distant supervision approach, and the design of a state-of-the-art artificial intelligence-based automated procedure for discriminating good dictionary examples from bad ones. The proposed methodology involves a suite of five machine learning (ML) and five word embedding-based deep learning (DL) architectures. A thorough analysis of the results shows that GDEX elicitation can be done by both ML and DL models; however, DL-based models show only a marginal improvement of 3.5% over the conventional ML models. We find that random forests with part-of-speech information and a word2vec-based bidirectional LSTM are the best-performing ML and DL configurations for automated GDEX elicitation; on the test set, these models achieved balanced accuracies of 73% and 77%, respectively.
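The abstract reports results as balanced accuracy without defining it. As a minimal sketch, assuming the standard definition (the mean of per-class recall), the metric can be computed as follows. The function name and the toy labels are hypothetical and are not taken from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true.

    Assumption: this is the standard definition of balanced accuracy;
    the abstract does not state which formula the authors used.
    """
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        # Number of examples whose true label is c.
        total = sum(1 for t in y_true if t == c)
        # Number of those examples that were also predicted as c.
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Hypothetical toy labels: 1 = good dictionary example, 0 = bad example.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]

score = balanced_accuracy(y_true, y_pred)
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(score, plain_accuracy)
```

On this toy data the recall is 3/4 for the good class and 1/2 for the bad class, so the balanced accuracy is 0.625, while plain accuracy is 4/6. The gap illustrates why balanced accuracy is often preferred when the two classes are unevenly represented.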