HTCViT: an effective network for image classification and segmentation based on natural disaster datasets

Vis. Comput. (2023)

Abstract
Classifying and segmenting natural disaster images is crucial for predicting and responding to disasters. However, current convolutional networks perform poorly on such images, and few networks are designed specifically for this task. To address the varying scales of the region of interest (ROI) in these images, we propose the Hierarchical TSAM-CB-ViT (HTCViT) network, which builds on the attention mechanism of the Vision Transformer (ViT). Because ViT excels at extracting global context but struggles with local features, our method combines the strengths of ViT and convolution, and uses the Triple-Strip Attention Mechanism (TSAM) structure to capture overall contextual information within each patch. Experiments show that HTCViT improves classification by 3-4% and segmentation by 1-2% on natural disaster datasets compared with the vanilla ViT network.
Keywords
Natural disaster image analysis, Vision transformer, Convolution, Hierarchical