TOPFORMER: Topology-Aware Authorship Attribution of Deepfake Texts with Diverse Writing Styles
arxiv(2023)
摘要
Recent advances in Large Language Models (LLMs) have enabled the generation
of open-ended high-quality texts, that are non-trivial to distinguish from
human-written texts. We refer to such LLM-generated texts as deepfake texts.
There are currently over 72K text generation models in the huggingface model
repo. As such, users with malicious intent can easily use these open-sourced
LLMs to generate harmful texts and dis/misinformation at scale. To mitigate
this problem, a computational method to determine if a given text is a deepfake
text or not is desired–i.e., Turing Test (TT). In particular, in this work, we
investigate the more general version of the problem, known as Authorship
Attribution (AA), in a multi-class setting–i.e., not only determining if a
given text is a deepfake text or not but also being able to pinpoint which LLM
is the author. We propose TopFormer to improve existing AA solutions by
capturing more linguistic patterns in deepfake texts by including a Topological
Data Analysis (TDA) layer in the Transformer-based model. We show the benefits
of having a TDA layer when dealing with imbalanced, and multi-style datasets,
by extracting TDA features from the reshaped pooled_output of our backbone
as input. This Transformer-based model captures contextual representations
(i.e., semantic and syntactic linguistic features), while TDA captures the
shape and structure of data (i.e., linguistic structures). Finally, TopFormer,
outperforms all baselines in all 3 datasets, achieving up to 7% increase in
Macro F1 score.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要