Deep Hierarchical Vision Transformer for Hyperspectral and LiDAR Data Classification

IEEE TRANSACTIONS ON IMAGE PROCESSING (2022)

Abstract
In this study, we develop a novel deep hierarchical vision transformer (DHViT) architecture for the joint classification of hyperspectral and light detection and ranging (LiDAR) data. Current classification methods are limited in heterogeneous feature representation and in the information fusion of multi-modality remote sensing data (e.g., hyperspectral and LiDAR data), and these shortcomings restrict the collaborative classification accuracy of remote sensing data. The proposed DHViT architecture exploits both the powerful capability to model long-range dependencies and the strong generalization ability across different domains of the transformer network, which is based exclusively on the self-attention mechanism. Specifically, a spectral sequence transformer handles the long-range dependencies along the spectral dimension of hyperspectral images, because all diagnostic spectral bands contribute to land cover classification. Thereafter, a spatial hierarchical transformer structure extracts hierarchical spatial features from hyperspectral and LiDAR data, which are also crucial for classification. Furthermore, the cross attention (CA) feature fusion pattern adaptively and dynamically fuses heterogeneous features from multi-modality data, and this context-aware fusion mode further improves the collaborative classification performance. Comparative experiments and ablation studies are conducted on three benchmark hyperspectral and LiDAR datasets, on which the DHViT model yields average overall classification accuracies of 99.58%, 99.55%, and 96.40%, respectively, demonstrating the effectiveness and superior performance of the proposed method.
Keywords
Feature extraction, Transformers, Hyperspectral imaging, Laser radar, Data mining, Collaboration, Data models, Hyperspectral image, light detection and ranging, joint classification, vision transformer, convolutional vision transformer, cross attention fusion
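
The cross attention (CA) fusion mentioned in the abstract can be illustrated with a generic single-head scaled dot-product cross attention, in which tokens from one modality act as queries and tokens from the other modality supply keys and values. The following is a minimal sketch under stated assumptions, not the DHViT implementation: the abstract does not specify the layer design, so the function names, the absence of learned projection matrices, and the toy token values are all hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross attention (illustrative only).

    queries: tokens from one modality (e.g. hyperspectral), shape [n_q][d]
    keys, values: tokens from the other modality (e.g. LiDAR), shape [n_k][d]
    Returns (fused, weights) with shapes [n_q][d] and [n_q][n_k].
    Assumption: no learned projections; a real layer would apply them first.
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    fused, weights = [], []
    for q in queries:
        # Similarity of this query token to every key token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value tokens, one output channel at a time.
        fused.append([
            sum(wj * v[c] for wj, v in zip(w, values))
            for c in range(len(values[0]))
        ])
    return fused, weights


# Toy tokens with made-up numbers, for illustration only.
hsi_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
lidar_tokens = [[2.0, 0.0], [0.0, 2.0]]

fused, weights = cross_attention(hsi_tokens, lidar_tokens, lidar_tokens)
```

Each row of `weights` sums to 1, so every fused token is a convex combination of the tokens of the other modality. The weights depend on the input content rather than being fixed, which is one way to read the "adaptive and dynamic" fusion the abstract describes.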