Efficient hard real-time implementation of CNNs on multi-core architectures.

COMPSAC (2023)

Abstract
Autonomous driving applications rely on processing large amounts of data to ensure sufficient perception performance. In this context, Convolutional Neural Networks (CNNs), which are used for object detection in camera images, are an integral part of sensor data processing. However, safety-related hard deadlines are counteracted by the required high data rates, which stress the platform with respect to memory interference. Providing high camera frame rates under hard latency requirements is still an open issue. In a case study focusing on high-performance multi-core architectures, we evaluate in detail the advantages of a private L2 and shared L3 cache architecture for CNN processing and show how to remove harmful data synchronization effects. In this context, we deploy MobileNet as well as YOLO on an Intel i5 processor and demonstrate their applicability to a worst-case design under high CNN frame rates. Finally, we provide a heuristic optimization scheme that is able to efficiently find feasible high frame rate configurations. Contrary to common assumptions, our results show that high-performance multi-core COTS platforms are suitable for the application of CNNs even under hard deadline constraints and, hence, offer substantial gains in cost and productivity due to their greater ease of programming.
Keywords
multi-core, real-time, safety-critical, worst-case design, CNN, automated driving