Malicious Package Detection using Metadata Information
CoRR(2024)
Abstract
Protecting software supply chains from malicious packages is paramount in the
evolving landscape of software development. Attacks on the software supply
chain involve attackers injecting harmful software into commonly used packages
or libraries in a software repository. For instance, JavaScript uses the Node
Package Manager (NPM) and Python uses the Python Package Index (PyPI) as their
respective package repositories. NPM has suffered attacks such as the
event-stream incident, in which a malicious package was introduced into a
popular NPM package, potentially impacting a wide range of projects. As the
integration of third-party packages becomes increasingly ubiquitous in modern
software development, accelerating the creation and deployment of applications,
the need for a robust detection mechanism has become critical. At the same
time, the sheer volume of new packages released daily makes identifying
malicious packages a significant challenge. To address this issue, we
introduce MeMPtec, a metadata-based malicious package detection model.
This model extracts a set of features from package
metadata information. These extracted features are classified as either
easy-to-manipulate (ETM) or difficult-to-manipulate (DTM) features based on
monotonicity and restricted control properties. Using these metadata
features, we not only improve the effectiveness of malicious package
detection but also demonstrate the model's resistance to adversarial attacks in
comparison with existing state-of-the-art approaches. Our experiments indicate a
significant reduction in both false positives (up to 97.56%) and false
negatives (up to 91.86%).
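
The split between easy-to-manipulate and difficult-to-manipulate features can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the metadata keys (`created`, `description`, `homepage`), the feature names, and the ETM/DTM label given to each feature. The intuition it tries to capture is that an author can rewrite a description at will, whereas a package's age only grows with time and cannot be set directly.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Feature:
    name: str
    value: float
    category: str  # "ETM" or "DTM" (labels assigned here are illustrative assumptions)


def extract_features(metadata: dict, now: datetime) -> list[Feature]:
    """Derive a few hypothetical features from a package metadata record."""
    created = datetime.fromisoformat(metadata["created"])
    age_days = (now - created).days
    return [
        # Freely editable by the package author, so treated as ETM in this sketch.
        Feature("description_length", float(len(metadata.get("description", ""))), "ETM"),
        Feature("has_homepage", float(bool(metadata.get("homepage"))), "ETM"),
        # Grows only with elapsed time (monotonic), so treated as DTM in this sketch.
        Feature("package_age_days", float(age_days), "DTM"),
    ]


def split_by_category(features: list[Feature]) -> dict[str, list[str]]:
    """Group feature names under their ETM or DTM label."""
    groups: dict[str, list[str]] = {"ETM": [], "DTM": []}
    for feature in features:
        groups[feature.category].append(feature.name)
    return groups


example = {
    "created": "2024-01-01T00:00:00+00:00",
    "description": "demo",
    "homepage": "",
}
features = extract_features(example, datetime(2024, 1, 11, tzinfo=timezone.utc))
groups = split_by_category(features)
```

A downstream classifier could then be trained on the DTM group alone, or on both groups, to compare how much an adversary who edits only ETM fields can shift its predictions. That comparison is a suggestion for how to use the sketch, not a description of the paper's experiments.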