An Empirical Study of License Conflict in Free and Open Source Software.

ICSE-SEIP(2023)

引用 0|浏览31
暂无评分
摘要
Free and Open Source Software (FOSS) has become the fundamental infrastructure of mainstream software projects. FOSS is subject to various legal terms and restrictions, depending on the type of open source license in force. Hence it is important to remain compliant with the FOSS license terms. Identifying the licenses that provide FOSS and understanding the terms of those licenses is not easy, especially when dealing with a large amount of reuse that is common in modern software development. Since reused software is often large, automated license analysis is needed to address these issues and support users in license compliant reuse of FOSS. However, existing license assessment tools can only identify the name and quantity of licenses embedded in software and thus cannot identify whether the licenses are being used safely and correctly. Moreover, they cannot provide a comprehensive analysis of the compatibility and potential risk that come with the term conflicts. In this paper, we propose DIKE, an automated tool that can perform license detection and conflict analysis for FOSS. First, DIKE extracts 12 terms under 3,256 unique open source licenses by manual analysis and Natural Language Processing (NLP) and constructs a license knowledge base containing the responsibilities of the terms. Second, DIKE scans all licenses from the code snippet for the input software and outputs the scan results in a tree structure. Third, the scan results match the license knowledge base to detect license conflicts from terms and conditions. DIKE designs two solutions for software with license conflicts: license replacement and code replacement. To demonstrate the effectiveness of DIKE, we first evaluate with the term extraction and responsibility classification, and the results show that their F1-scores reach 0.816 and 0.948, respectively. In addition, we conduct a measurement study of 16,341 popular projects from GitHub based on our proposed DIKE to explore the conflict of license usage in FOSS. The results show that 1,787 open source licenses are used in the project, and 27.2% of licenses conflict. Our new findings suggest that conflicts are prevalent in FOSS, warning the open source community about intellectual property risks.
更多
查看译文
关键词
Free and Open Source Software,license analysis,license conflict,natural language processing
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要