Detection of Anorexic Girls-In Blog Posts Written in Hebrew Using a Combined Heuristic AI and NLP Method

IEEE Access (2022)

Abstract
In this study, we aim to detect, in social media texts written in Hebrew, girls who are suspected of being anorexic. We constructed a dataset containing 100 blog posts written by females who are probably anorexic and 100 blog posts written by females who are likely to be non-anorexic. The construction of this dataset was supervised and approved by an international expert on anorexia. We tested several text classification (TC) methods, using various feature sets (content-based and style-based), five machine learning (ML) methods, three RNN models, four BERT models, three basic preprocessing methods, three feature filtering methods, and parameter tuning. Our main findings are as follows. A set of 50 word n-grams (mostly word unigrams) provided by an expert proved to be a good basic detector. A heuristic process based on the random forest ML method overcame a combinatorial explosion and led to a significant improvement over the baseline result at a level of p = 0.01. Applying an iterative process that tests combinations of "k out of n'" feature sets, where n' < n (n is the number of feature sets), led to a result of 90.63%, using a combination of 300 features from ten feature sets.
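The "k out of n'" idea can be illustrated with a minimal sketch. This is not the authors' procedure, which the abstract does not spell out. It assumes one plausible reading: rank the n feature sets individually, keep the best n' of them, and then test every combination of k sets from that shortlist. The names `select_feature_sets` and `score` are hypothetical; in practice `score` might return the cross-validated accuracy of a random forest trained on the chosen feature sets.

```python
from itertools import combinations
from typing import Callable, Sequence, Tuple

# Hypothetical scoring callback: takes a tuple of feature-set names and
# returns a quality score (higher is better), e.g. classifier accuracy.
ScoreFn = Callable[[Tuple[str, ...]], float]


def select_feature_sets(
    names: Sequence[str], score: ScoreFn, n_prime: int, k: int
) -> Tuple[Tuple[str, ...], float]:
    """Shortlist the n_prime best single feature sets, then pick the best k-subset."""
    if not 0 < k <= n_prime <= len(names):
        raise ValueError("require 0 < k <= n_prime <= len(names)")

    # Step 1: score each feature set on its own and keep the top n_prime.
    ranked = sorted(names, key=lambda name: score((name,)), reverse=True)
    shortlist = ranked[:n_prime]

    # Step 2: exhaustively test all k-out-of-n_prime combinations, which is
    # far cheaper than testing subsets of the full collection of n sets.
    best = max(combinations(shortlist, k), key=score)
    return best, score(best)


if __name__ == "__main__":
    # Toy scores for illustration only; they are not results from the paper.
    weights = {"unigrams": 4, "bigrams": 3, "pos_tags": 2, "punctuation": 1}
    toy_score = lambda combo: sum(weights[name] for name in combo)
    print(select_feature_sets(list(weights), toy_score, n_prime=3, k=2))
```

Restricting the exhaustive step to a shortlist is what keeps the search tractable: the number of combinations tested grows with n' rather than with n.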
Keywords
Mental disorders, natural language processing, supervised machine learning, text analysis, text classification, text processing