EffiPerception: an Efficient Framework for Various Perception Tasks
arXiv (2024)
Abstract
The accuracy-speed-memory trade-off is a primary consideration in many
computer vision perception tasks.
Previous methods mainly focus on a single task or a small group of tasks,
for example by designing effective data augmentation, feature extractors, or
learning strategies. These approaches, however, can be inherently task-specific:
a proposed model's performance may depend on a particular perception task or
dataset.
To explore common learning patterns and increase module
robustness, we propose the EffiPerception framework.
It can achieve strong accuracy-speed performance at relatively low memory
cost across several perception tasks: 2D Object Detection, 3D Object Detection,
2D Instance Segmentation, and 3D Point Cloud Segmentation.
Overall, the framework consists of three parts:
(1) Efficient Feature Extractors, which extract the input features for each
modality. (2) Efficient Layers, plug-in, plug-out layers that further process
the feature representation, aggregating core learned information while pruning
noisy proposals. (3) EffiOptim, an 8-bit optimizer that further cuts
computational cost and improves performance stability.
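The abstract does not describe how EffiOptim works internally. As a rough illustration of what an 8-bit optimizer generally does, the sketch below stores a momentum buffer as signed 8-bit codes plus one floating-point scale (plain absmax quantization) and dequantizes it only for the update. This is a generic technique, not the paper's method; the names `quantize_absmax`, `dequantize`, and `momentum_step` are hypothetical.

```python
from array import array


def quantize_absmax(values):
    """Map floats to signed 8-bit codes using one shared absmax scale."""
    absmax = max((abs(v) for v in values), default=0.0)
    # Fall back to a scale of 1.0 when every value is zero.
    scale = absmax / 127.0 if absmax > 0 else 1.0
    codes = array("b", (int(round(v / scale)) for v in values))
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from 8-bit codes."""
    return [c * scale for c in codes]


def momentum_step(params, grads, codes, scale, lr=0.01, beta=0.9):
    """One SGD-with-momentum step whose state is kept in 8 bits.

    Assumption: generic illustration only, not EffiOptim itself.
    """
    momentum = dequantize(codes, scale)
    momentum = [beta * m + g for m, g in zip(momentum, grads)]
    new_params = [p - lr * m for p, m in zip(params, momentum)]
    new_codes, new_scale = quantize_absmax(momentum)
    return new_params, new_codes, new_scale
```

Each state element occupies one byte instead of the four bytes of a 32-bit float, at the price of a rounding error of at most half the scale per element. Practical 8-bit optimizers typically quantize in small blocks, each with its own scale, to keep that error low.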
Extensive experiments on the KITTI, SemanticKITTI, and COCO datasets
show that EffiPerception improves overall accuracy-speed-memory
performance on the four detection and segmentation tasks
compared with earlier, well-respected methods.