Saisiyat Is Where It Is At! Insights Into Backdoors And Debiasing Of Cross Lingual Transformers For Named Entity Recognition

Ricardo A. Calix, Jj Ben-Joseph, Nina Lopatina, Ryan Ashley, Mona Gogia, George Sieniawski, Andrea Brennen

Big Data (2022)

Abstract
Deep learning and, in particular, Transformer-based models are revolutionizing natural language processing (NLP). As a result, NLP models can now be pre-trained and fine-tuned by anyone with sufficient resources, and subsequently shared with the world at large. This is an unprecedented approach that helps level the AI playing field and improves productivity. However, this new AI sharing approach presents novel and largely unaddressed challenges involving bias and backdoors. This study has four objectives related to better understanding these issues and their causes: 1) determine whether there is bias in a cross-lingual (XL) Transformer model such as XLM-RoBERTa (XLM-R) for named entity recognition (NER), 2) provide a predictive explainability (interpretability) framework that addresses the reasons why the XL model may or may not have bias, 3) test this explainability framework on different scenarios to evaluate its predictive capabilities, and 4) consider the implications of any insights and future research directions. Based on experimental results, we find that XLM-R is not significantly biased in the NER task. The results suggest that name-related subwords heavily influence NER performance and that cross-lingual transfer learning is reasonably effective in Transformer models. Finally, we discuss a general framework for debiasing or backdooring Transformer models based on subword embedding representations. In general, by knowing the values of the embeddings of subwords in a Transformer model, one can select triggers (subwords) that impact the performance of the model's task either positively or negatively. As such, a broad-use backdoor scheme was developed and tested that significantly affects recall in both a NER task and a mask-based sentiment analysis task. The results are intriguing and promising.
Keywords
named entity recognition, cross-lingual transformers, debiasing, backdoors