Hybrid Framework for Named Entity Recognition in Turkish Social Media

2020 28th Signal Processing and Communications Applications Conference (SIU), 2020

Abstract
Named Entity Recognition (NER) is the task of extracting entities such as persons, locations, and organizations from text. NER is more challenging on social media texts than on formal texts because the language is noisy, with grammatical errors and abbreviations. Nevertheless, NER on social media has gained significant attention in the literature because of the volume of information that flows through these platforms. In this paper, we propose a comprehensive model for NER in Turkish texts from distinct social media domains, i.e., Twitter, Facebook, and Donanimhaber Forum. The model employs Conditional Random Fields followed by Bidirectional Long Short-Term Memory. To overcome the challenges of social media texts, we incorporate word embeddings, character representations, morphology, domain information, pattern-matching, dictionary, part-of-speech, and casing-based features into our model. We perform ablation studies to analyze the effect of these features. We demonstrate the success of our model for tagging Turkish social media texts on the largest Turkish NER database.
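To make the hand-crafted feature types named in the abstract more concrete, the following is a minimal sketch of how casing, pattern-matching, and domain features might be computed per token. It is an illustration only: the function names (`casing`, `token_features`), the casing classes, and the regular expressions are assumptions, not the authors' implementation, and the abstract does not specify the actual rules used.

```python
import re

# Hypothetical patterns for social media tokens; the paper's abstract
# does not list the actual pattern-matching rules.
MENTION = re.compile(r"^@\w+$")
HASHTAG = re.compile(r"^#\w+$")
URL = re.compile(r"^(https?://|www\.)\S+$")


def casing(token: str) -> str:
    """Return a coarse casing class for a single token (assumed classes)."""
    if token.isdigit():
        return "numeric"
    if token.isupper():
        return "all_upper"
    if token.islower():
        return "all_lower"
    if token[:1].isupper():
        return "init_upper"
    return "other"


def token_features(token: str, domain: str) -> dict:
    """Collect casing, pattern-matching, and domain features for one token."""
    return {
        "casing": casing(token),
        "is_mention": bool(MENTION.match(token)),
        "is_hashtag": bool(HASHTAG.match(token)),
        "is_url": bool(URL.match(token)),
        "domain": domain,  # e.g. "twitter", "facebook", "forum"
    }
```

In a setup like the one the abstract describes, such a feature dictionary would be combined with learned representations (word embeddings, character representations) before sequence tagging; how the authors combine them is not stated in the abstract.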
Keywords
Turkish named entity recognition, social media, informal texts