NLP Verification: Towards a General Methodology for Certifying Robustness
arXiv (2024)
Abstract
Deep neural networks have exhibited substantial success in the field of
Natural Language Processing (NLP), and ensuring their safety and reliability is
crucial: there are safety-critical contexts where such models must be robust to
variability or attack, and give guarantees over their output. Unlike Computer
Vision, NLP lacks a unified verification methodology and, despite recent
advances, the literature is often light on the pragmatic issues of
NLP verification. In this paper, we attempt to distil and evaluate
general components of an NLP verification pipeline that emerges from the
progress in the field to date. Our contributions are two-fold. Firstly, we give
a general characterisation of verifiable subspaces that result from embedding
sentences into continuous spaces. We identify, and give an effective method to
deal with, the technical challenge of semantic generalisability of verified
subspaces, and propose it as a standard metric in NLP verification
pipelines (alongside the standard metrics of model accuracy and model
verifiability). Secondly, we propose a general methodology to analyse the
effect of the embedding gap, a problem that refers to the discrepancy between
verification of geometric subspaces on the one hand, and the semantic meaning of
sentences that the geometric subspaces are supposed to represent, on the other
hand. In extreme cases, poor choices in embedding of sentences may invalidate
verification results. We propose a number of practical NLP methods that can
help to identify the effects of the embedding gap and, in particular, we propose
the metric of falsifiability of semantic subspaces as another fundamental
metric to be reported as part of the NLP verification pipeline. We believe that
together these general principles pave the way towards a more consolidated and
effective development of this new domain.