An empirical study of IoT security aspects at sentence-level in developer textual discussions

Information and Software Technology (2022)

Abstract
Context: IoT is a rapidly emerging paradigm that now encompasses almost every aspect of modern life, so ensuring the security of IoT devices is crucial. IoT devices differ from traditional computing devices (e.g., they have limited power, storage, and computing capacity), which makes designing and implementing proper security measures for them challenging. We observed that IoT developers discuss their security-related challenges in developer forums like Stack Overflow (SO). However, we find that IoT security discussions can also be buried inside non-security discussions in SO.
Objective: In this paper, we aim to understand the challenges IoT developers face while applying security practices and techniques to IoT devices. We have two goals: (1) develop a model that can automatically find security-related IoT discussions in SO, and (2) study the model output (i.e., the security discussions) to learn about the security-related challenges of IoT developers.
Methods: First, we download all 53K posts from SO that contain discussions about various IoT devices, tools, and techniques. Second, we manually label 5,919 sentences from the 53K posts as 1 or 0 (i.e., whether or not they contain a security aspect). Third, we use this benchmark to investigate a suite of deep learning transformer models; we call the best-performing model SecBot. Fourth, we apply SecBot to all 53K posts and find around 30K sentences labeled as security-related. Fifth, we apply topic modeling to these 30K sentences, then label and categorize the resulting topics. Sixth, we analyze the evolution of the topics in SO.
Results: We found that (1) SecBot, which is based on retraining the deep learning model RoBERTa, offers the best F1-score of 0.935; (2) there are six error categories among the samples misclassified by SecBot, which was mostly wrong when the keywords/contexts were ambiguous (e.g., 'gateway' can be a security gateway or a simple gateway); (3) there are nine security topics grouped into three categories: Software, Hardware, and Network; and (4) the highest number of topics belongs to software security, followed by network security and hardware security.
Conclusion: IoT researchers and vendors can use SecBot to collect and analyze security-related discussions from developer posts in SO. The analysis of the nine security-related topics can guide major IoT stakeholders such as IoT security enthusiasts, developers, vendors, educators, and researchers in the rapidly emerging IoT ecosystems.
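As a rough illustration of the sentence-level setup described in the Methods, the sketch below splits posts into sentences, keeps those that a binary classifier labels 1, and computes an F1-score against gold labels. It is not the authors' implementation: the function names, the regex sentence splitter, and the keyword stub standing in for the fine-tuned model are all assumptions made for this example.

```python
import re
from typing import Callable, Iterable, List


def split_sentences(post: str) -> List[str]:
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", post.strip())
    return [p for p in parts if p]


def security_sentences(posts: Iterable[str],
                       classify: Callable[[str], int]) -> List[str]:
    """Return every sentence that the classifier labels 1 (security-related)."""
    return [s for post in posts for s in split_sentences(post) if classify(s) == 1]


def f1_score(gold: List[int], pred: List[int]) -> float:
    """F1 for the positive class (label 1): harmonic mean of precision and recall."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def keyword_stub(sentence: str) -> int:
    """Hypothetical stand-in for a trained model; a keyword match is NOT SecBot."""
    keywords = ("encrypt", "tls", "password")
    return int(any(k in sentence.lower() for k in keywords))


if __name__ == "__main__":
    posts = ["I use MQTT on an ESP8266. How do I enable TLS? Thanks!"]
    print(security_sentences(posts, keyword_stub))
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

In practice the `classify` argument would wrap a fine-tuned transformer rather than a keyword list; a keyword stub like this one would be prone to exactly the ambiguity errors the Results mention (e.g., 'gateway').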
Keywords
IoT, Security, Stack Overflow, Deep learning, Empirical study