Using natural language processing to detect privacy violations in online contracts

SAC '20: The 35th ACM/SIGAPP Symposium on Applied Computing Brno Czech Republic March, 2020(2020)

引用 15|浏览385
暂无评分
摘要
As information systems deal with contracts and documents in essential services, there is a lack of mechanisms to help organizations in protecting the involved data subjects. In this paper, we evaluate the use of named entity recognition as a way to identify, monitor and validate personally identifiable information. In our experiments, we use three of the most well-known Natural Language Processing tools (NLTK, Stanford CoreNLP, and spaCy). First, the effectiveness of the tools is evaluated in a generic dataset. Then, the tools are applied in datasets built based on contracts that contain personally identifiable information. The results show that models' performance was highly positive in accurately classifying both the generic and the contracts' data. Furthermore, we discuss how our proposal can effectively act as a Privacy Enhancing Technology.
更多
查看译文
关键词
Privacy Violations, Online Contracts, Natural Language Processing, Named Entity Recognition, Personally Identifiable Information
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要