Risks and Challenges of Training Classifiers for IoT

INTERNET OF THINGS - ICIOT 2021 (2022)

Abstract

Although deep learning algorithms can achieve high performance, deep models may not learn the right concepts and can easily overfit their training datasets. In the context of IoT devices, the problem is exacerbated by three factors. First, traffic may be encrypted, allowing very little visibility into the activity of the endpoints. Second, devices of different models and from different manufacturers may exhibit very different behaviors. Finally, unlike domains such as computer vision or natural language processing, there is no well-accepted representation for the network data that characterizes IoT devices. In this work, we capture real network traffic from different environments, and we demonstrate that training models to detect specific classes of IoT devices (e.g., cameras) using state-of-the-art techniques can lead to overfitting and very poor performance on independent datasets. However, we then show that by applying domain knowledge, one can manually define engineered features and train simple models (e.g., a decision tree) that achieve an F1 score of 0.956 on an independent dataset. These results show the feasibility of training generalizable models but, at the same time, raise questions about how best to transform and represent the raw network data to train classifiers for other classes of IoT devices (e.g., hubs, motion sensors) while minimizing manual feature engineering. We elaborate on the challenges, drawing analogies with other fields such as natural language processing.
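
The evaluation idea in the abstract (fit a simple model on engineered features from one environment, then score it with F1 on an independent one) can be illustrated with a minimal sketch. This is not the authors' pipeline: the single feature (an upstream-to-downstream byte ratio), the depth-1 decision tree, and every number below are hypothetical, synthetic values chosen only to make the example runnable.

```python
# Minimal sketch, NOT the paper's method: a depth-1 decision tree ("stump")
# trained on one hypothetical engineered feature and scored with F1 on a
# separate, independent dataset. All data below is synthetic.

def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def fit_stump(xs, ys):
    """Pick the threshold on a single feature that maximizes training accuracy."""
    values = sorted(set(xs))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_thr, best_acc = None, -1.0
    for thr in candidates:
        acc = sum((x > thr) == bool(y) for x, y in zip(xs, ys)) / len(xs)
        if acc > best_acc:
            best_thr, best_acc = thr, acc
    return best_thr

def predict(thr, xs):
    """Label a device as the positive class (1 = camera) when the feature exceeds thr."""
    return [1 if x > thr else 0 for x in xs]

# Hypothetical feature: upstream/downstream byte ratio per device.
# Environment A is used for training; environment B is the independent dataset.
train_x = [5.0, 4.2, 6.1, 0.3, 0.5, 0.2]
train_y = [1, 1, 1, 0, 0, 0]
indep_x = [3.8, 7.0, 1.9, 0.4, 0.1, 2.6]
indep_y = [1, 1, 1, 0, 0, 0]

threshold = fit_stump(train_x, train_y)
train_f1 = f1_score(train_y, predict(threshold, train_x))
indep_f1 = f1_score(indep_y, predict(threshold, indep_x))

print(f"threshold={threshold:.2f} train F1={train_f1:.3f} independent F1={indep_f1:.3f}")
```

In this toy setup the score on the training environment is higher than on the independent one, which is the kind of gap that only shows up when the model is evaluated on data captured elsewhere.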

Keywords

training classifiers,risks
