A Multi-solution Study on GDPR AI-enabled Completeness Checking of DPAs.
CoRR (2023)
Abstract
Specifying legal requirements for software systems to ensure their compliance
with the applicable regulations is a major concern to requirements engineering
(RE). Personal data which is collected by an organization is often shared with
other organizations to perform certain processing activities. In such cases,
the General Data Protection Regulation (GDPR) requires issuing a data
processing agreement (DPA) which regulates the processing and further ensures
that personal data remains protected. Violating GDPR can lead to huge fines
reaching billions of euros. Software systems involving personal data
processing must adhere to the legal obligations stipulated in GDPR and outlined
in DPAs. Requirements engineers can elicit from DPAs legal requirements for
regulating the data processing activities in software systems. Checking the
completeness of a DPA according to the GDPR provisions is therefore an
essential prerequisite to ensure that the elicited requirements are complete.
Analyzing DPAs entirely manually is time-consuming and requires adequate legal
expertise. In this paper, we propose an automation strategy to address the
completeness checking of DPAs against GDPR. Specifically, we pursue ten
alternative solutions which are enabled by different technologies, namely
traditional machine learning, deep learning, language modeling, and few-shot
learning. The goal of our work is to empirically examine how these different
technologies fare in the legal domain. We computed the F2 score on a set of 30
real DPAs. Our evaluation shows that the best-performing solutions, which yield
F2 scores of 86.7% and 89.7%, are based on pre-trained BERT and RoBERTa language models. Our
analysis further shows that other alternative solutions based on deep learning
(e.g., BiLSTM) and few-shot learning (e.g., SetFit) can achieve comparable
accuracy, yet are more efficient to develop.
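
For readers unfamiliar with the F2 score reported above, the following is a minimal sketch of the standard F-beta measure with beta = 2, which weights recall more heavily than precision. The function name `f_beta` and the counts in the usage lines are illustrative assumptions; they are not taken from the paper or its evaluation data.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """Standard F-beta score from raw counts (hypothetical helper).

    tp, fp, fn are true positives, false positives, and false negatives.
    beta > 1 weights recall above precision; beta = 2 gives the F2 score.
    """
    if tp == 0:
        # No correct predictions: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative counts only (not results from the paper).
f2 = f_beta(tp=9, fp=9, fn=1)            # high recall, low precision
f1 = f_beta(tp=9, fp=9, fn=1, beta=1.0)  # same counts, balanced weighting
```

With these made-up counts, recall (0.9) exceeds precision (0.5), so the F2 value comes out higher than the F1 value. A recall-weighted measure of this kind is a natural fit when missing a required provision costs more than raising a false alarm.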