A convolutional architecture for 3D model embedding using image views

The Visual Computer (2024)

Abstract
In recent years, substantial advances have been made in tasks such as 3D model retrieval, 3D model classification, and 3D model segmentation. Typical 3D representations such as point clouds, voxels, and polygon meshes are mostly suitable for rendering, while their use in cognitive tasks (retrieval, classification, segmentation) is limited by their high redundancy and complexity. We propose a deep learning architecture that takes 3D models represented as sets of image views as input. The architecture combines standard building blocks, such as Convolutional Neural Networks and autoencoders, to compute 3D model embeddings from sets of image views extracted from the 3D models, avoiding the view pooling layer commonly used in this setting. Our goal is to represent a 3D model as a vector carrying enough information to substitute for the 3D model in high-level tasks. Since this vector is a learned representation that aims to capture the relevant information of a 3D model, we show that the embedding conveys semantic information useful for assessing the similarity of 3D objects. We compare our embedding technique with state-of-the-art techniques for 3D model retrieval on the ShapeNet and ModelNet datasets. The embeddings obtained with our architecture achieve high effectiveness scores on both the normalized and perturbed versions of the ShapeNet dataset while improving training and inference times compared to standard state-of-the-art techniques.
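The abstract does not specify layer sizes, the number of views, or the training loss, so the following is only a minimal PyTorch sketch of the general idea it describes: a shared per-view CNN whose features are combined without a view pooling layer and compressed by an autoencoder-style bottleneck into a single embedding vector. The class name `MultiViewEmbedder`, the view count of 12, the embedding size of 256, and the concatenation of per-view features are illustrative assumptions, not the authors' architecture.

```python
# Hypothetical sketch: multi-view CNN + autoencoder bottleneck producing a
# 3D model embedding. All dimensions and design choices below are assumptions.
import torch
import torch.nn as nn

class MultiViewEmbedder(nn.Module):
    def __init__(self, n_views=12, embed_dim=256):
        super().__init__()
        # Shared CNN applied independently to each rendered view.
        self.view_cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),        # -> (B*V, 64)
        )
        # Instead of a view pooling layer, per-view features are concatenated
        # and compressed into the embedding by an autoencoder bottleneck.
        self.encoder = nn.Sequential(
            nn.Linear(64 * n_views, 512), nn.ReLU(),
            nn.Linear(512, embed_dim),                    # embedding vector
        )
        self.decoder = nn.Sequential(                     # reconstruction head used for training
            nn.Linear(embed_dim, 512), nn.ReLU(),
            nn.Linear(512, 64 * n_views),
        )

    def forward(self, views):                             # views: (B, V, 1, H, W)
        b, v, c, h, w = views.shape
        feats = self.view_cnn(views.view(b * v, c, h, w)).view(b, -1)
        z = self.encoder(feats)                           # (B, embed_dim)
        recon = self.decoder(z)
        return z, recon

model = MultiViewEmbedder()
z, recon = model(torch.randn(2, 12, 1, 64, 64))
print(z.shape)  # torch.Size([2, 256]) -- the embedding that stands in for the 3D model
```

In such a setup the embedding `z` would be used directly for retrieval (e.g., nearest-neighbor search under cosine or Euclidean distance), which is consistent with the paper's goal of replacing the raw 3D representation in high-level tasks.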
Keywords
3D model, Deep learning, Convolutional neural network, Embedding