Composite Shape Modeling via Latent Space Factorization: Supplementary Material

arXiv: Computer Vision and Pattern Recognition (2019)

Abstract
The Decomposer consists of a whole-shape encoder and K projection layers, where K is the number of semantic part labels. The architecture of the whole-shape encoder is given in Table 1. The projection layers are implemented as fully connected (FC) layers with 100 outputs each, where 100 is the dimension of the embedding space.

The Composer consists of a shared part decoder and a Spatial Transformer Network (STN). The architecture of the part decoder is given in Table 2. The STN, similar to the original design in [1], consists of a localization sub-network and a re-sampling module. The re-sampling module uses trilinear interpolation and has no learned parameters. The localization network receives two inputs: the K stacked decoded parts, and the 100-dimensional sum of part embeddings. The two inputs are first processed separately: the stacked decoded parts by two FC layers with 256 outputs each, and the sum of part embeddings by one FC layer with 128 outputs. The two results are then concatenated into a single 384-dimensional vector and processed by two additional FC layers with 128 and 12K outputs (12 affine transformation parameters per part), respectively. All FC layers except the last are followed by ReLU and dropout layers with keep probability 0.7.
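Since the localization sub-network is fully specified by the dimensions above, a minimal PyTorch sketch is given below. It is an illustration, not the authors' implementation: the voxel resolution of the decoded parts is not stated in this excerpt (it is determined by the part decoder in Table 2), so a 32^3 grid is assumed, flattening before the first FC layer is assumed, and the names LocalizationNet, grid_size, and embed_dim are hypothetical.

```python
import torch
import torch.nn as nn

class LocalizationNet(nn.Module):
    """Sketch of the STN localization sub-network described above.

    Inputs: K stacked decoded parts (B, K, D, H, W) and the
    100-dimensional sum of part embeddings (B, 100).
    Output: 12 affine parameters per part, reshaped to (B, K, 3, 4).
    """

    def __init__(self, num_parts: int, grid_size: int = 32, embed_dim: int = 100):
        super().__init__()
        # Dropout with keep probability 0.7 corresponds to drop probability 0.3.
        # Branch 1: stacked decoded parts -> two FC layers with 256 outputs each.
        self.parts_branch = nn.Sequential(
            nn.Flatten(),  # assumption: volumes are flattened before the FCs
            nn.Linear(num_parts * grid_size ** 3, 256), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(256, 256), nn.ReLU(), nn.Dropout(0.3),
        )
        # Branch 2: sum of part embeddings -> one FC layer with 128 outputs.
        self.embed_branch = nn.Sequential(
            nn.Linear(embed_dim, 128), nn.ReLU(), nn.Dropout(0.3),
        )
        # Head: 384-d concatenation -> FC(128) -> FC(12K).
        # The final layer has no ReLU/dropout, matching the text.
        self.head = nn.Sequential(
            nn.Linear(256 + 128, 128), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(128, 12 * num_parts),
        )

    def forward(self, stacked_parts: torch.Tensor, embed_sum: torch.Tensor) -> torch.Tensor:
        features = torch.cat(
            [self.parts_branch(stacked_parts), self.embed_branch(embed_sum)], dim=1
        )
        theta = self.head(features)  # (B, 12K)
        return theta.view(-1, stacked_parts.shape[1], 3, 4)  # one 3x4 affine per part
```

The predicted 3x4 matrices would then drive the trilinear re-sampling step; in PyTorch this maps naturally onto torch.nn.functional.affine_grid and grid_sample, which interpolate trilinearly on 5-D volumes, though the excerpt does not tie the module to any particular framework.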
Keywords
shape assembly, 3D spatial transformer network, in-network volumetric grid deformation, part-level shape manipulation, latent space factorization, semantic structure-aware 3D shape modeling, auto-encoder-based pipeline, shape composition, linear operations, data-dependent subspace factorization, factorized shape embedding space, Decomposer-Composer, part deformation module, neural network architecture