Remote Sensing Image Captioning With Sequential Attention and Flexible Word Correlation

Jie Wang, Binze Wang, Jiangbo Xi, Xue Bai, Okan K. Ersoy, Ming Cong, Siyan Gao, Zhe Zhao

IEEE GEOSCIENCE AND REMOTE SENSING LETTERS (2024)

Abstract
Image captioning of remote sensing (RS) images, a successful application of machine learning that combines RS with natural language processing, has been actively developed. RS images are large, complex in features, and rich in information. It is difficult both to fully extract the visual features that reflect the underlying domain knowledge and to fully exploit the extracted features for caption generation. To overcome this difficulty, we propose a novel model based on the encoder-decoder framework, termed remote sensing image captioning with sequential attention and flexible word correlation (SA-FWC). In the encoder, we fuse features from different layers of VGG16 to extract global and local information. In the decoder, SA-FWC makes full use of the extracted visual information to generate accurate captions. Specifically, to make full use of the visual features from the encoder, highlight important information, and reduce redundant information, a long short-term memory (LSTM) network in SA-FWC is used to obtain better feature representations. A feature fusion strategy and a self-attention mechanism are also used to make full use of the visual features. Additionally, we provide a data augmentation strategy based on minimal training sample pairs. In the experiments, four evaluation metrics are used to evaluate the results, and the effects of various parameters on the results are discussed. The experimental results (BLEU = 0.72, ROUGE = 0.65, METEOR = 0.37, and CIDEr = 2.83) show that the proposed method is effective and outperforms other network structures.
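The abstract names two generic building blocks, multi-layer feature fusion and self-attention, without giving their exact form. As a rough illustration only, the sketch below fuses per-region feature vectors from two layers by concatenation and then applies plain scaled dot-product self-attention with identity projections. The function names `fuse_features` and `self_attention`, the choice of concatenation, and the absence of learned weights are all assumptions made for this example; they are not taken from the paper and should not be read as the SA-FWC design.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(shallow, deep):
    """Concatenate per-region vectors from two layers.

    Hypothetical fusion rule (an assumption): the paper's abstract only says
    that features of different VGG16 layers are fused.
    """
    if len(shallow) != len(deep):
        raise ValueError("both layers must describe the same number of regions")
    return [s + d for s, d in zip(shallow, deep)]  # list concatenation


def self_attention(features):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    Assumption: a real model would use learned projection matrices.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    attended = []
    for query in features:
        # Similarity of this region to every region, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in features]
        weights = softmax(scores)
        # Weighted average of all region vectors.
        attended.append([
            sum(w * value[j] for w, value in zip(weights, features))
            for j in range(dim)
        ])
    return attended


if __name__ == "__main__":
    shallow = [[1.0, 0.0], [0.0, 1.0]]   # toy "local" features for 2 regions
    deep = [[0.5], [0.5]]                # toy "global" features for 2 regions
    fused = fuse_features(shallow, deep)
    for row in self_attention(fused):
        print([round(x, 3) for x in row])
```

Each output row is a convex combination of the fused region vectors, so regions that resemble the query contribute more to its new representation, which is the sense in which attention can highlight important information and damp redundant information.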
Keywords
Feature extraction, Remote sensing, Data mining, Visualization, Training, Correlation, Decoding, Attention mechanism, Deep learning, Image captioning, Remote sensing (RS)