Pyramid Architecture for Multi-Scale Processing in Point Cloud Segmentation

IEEE Conference on Computer Vision and Pattern Recognition(2022)

引用 23|浏览73
暂无评分
摘要
Semantic segmentation of point cloud data is a critical task for autonomous driving and other applications. Recent advances of point cloud segmentation are mainly driven by new designs of local aggregation operators and point sampling methods. Unlike image segmentation, few efforts have been made to understand the fundamental issue of scale and how scales should interact and be fused. In this work, we investigate how to efficiently and effectively integrate features at varying scales and varying stages in a point cloud segmentation network. In particular, we open up the commonly used encoder-decoder architecture, and design scale pyramid architectures that allow information to flow more freely and systematically, both laterally and upward/downward in scale. Moreover, a cross-scale attention feature learning block has been designed to enhance the multi-scale feature fusion which occurs everywhere in the network. Such a design of multi-scale processing and fusion gains large improvements in accuracy without adding much additional computation. When built on top of the popular KPConv network, we see consistent improvements on a wide range of datasets, including achieving state-of-the-art performance on NPM3D and S3DIS. Moreover, the pyramid architecture is generic and can be applied to other network designs: we show an example of similar improvements over RandLANet.
更多
查看译文
关键词
Navigation and autonomous driving, Representation learning, Robot vision, Scene analysis and understanding, Segmentation,grouping and shape analysis
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要