MODIPHY: Multimodal Obscured Detection for IoT using PHantom Convolution-Enabled Faster YOLO
CoRR (2024)
Abstract
Low-light conditions and occluded scenarios impede object detection in
real-world Internet of Things (IoT) applications like autonomous vehicles and
security systems. While advanced machine learning models strive for accuracy,
their computational demands clash with the limitations of resource-constrained
devices, hampering real-time performance. In our current research, we tackle
this challenge by introducing "YOLO Phantom", one of the smallest YOLO models
ever conceived. YOLO Phantom utilizes the novel Phantom Convolution block,
achieving comparable accuracy to the latest YOLOv8n model while simultaneously
reducing both parameters and model size by 43%, with a 19% reduction in Giga
Floating Point Operations (GFLOPs). YOLO Phantom leverages
transfer learning on our multimodal RGB-infrared dataset to address low-light
and occlusion issues, equipping it with robust vision under adverse conditions.
Its real-world efficacy is demonstrated on an IoT platform with advanced
low-light and RGB cameras, seamlessly connecting to an AWS-based notification
endpoint for efficient real-time object detection. Benchmarks reveal a
substantial boost of 17% and 14% in mAP for thermal and RGB
detection, respectively, compared to the baseline YOLOv8n model. For community
contribution, both the code and the multimodal dataset are available on GitHub.
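The abstract does not define the Phantom Convolution block itself, but its parameter savings over a standard convolution are in the spirit of "cheap operation" designs such as GhostConv, where a narrow ordinary convolution is paired with an inexpensive depthwise convolution. The sketch below is an assumption-based parameter-count comparison illustrating that general idea, not the paper's actual layer; the function names and the half-width split are illustrative choices.

```python
# Parameter-count illustration of the cheap-convolution idea (an assumption:
# the paper does not specify Phantom Convolution's internals in this abstract).

def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def ghost_style_params(c_in: int, c_out: int, k: int, dw_k: int = 3) -> int:
    """A primary conv produces half the output channels; a depthwise
    'cheap' conv generates the other half from them."""
    primary = c_in * (c_out // 2) * k * k      # ordinary conv at half width
    cheap = (c_out // 2) * dw_k * dw_k         # depthwise conv, one filter/channel
    return primary + cheap

if __name__ == "__main__":
    std = standard_conv_params(64, 128, 3)     # 73728 weights
    ghost = ghost_style_params(64, 128, 3)     # 37440 weights
    print(f"standard: {std}, ghost-style: {ghost}, saving: {1 - ghost / std:.0%}")
```

Running this for a 64-to-128-channel 3x3 layer roughly halves the weight count, which is the kind of per-block reduction that compounds into the model-wide size savings the abstract reports.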