RSBuilding: Towards General Remote Sensing Image Building Extraction and Change Detection with Foundation Model
arXiv (2024)
Abstract
The intelligent interpretation of buildings plays a significant role in urban
planning and management, macroeconomic analysis, population dynamics, etc.
Remote sensing image building interpretation primarily encompasses building
extraction and change detection. However, current methodologies often treat
these two tasks as separate entities, thereby failing to leverage shared
knowledge. Moreover, the complexity and diversity of remote sensing image
scenes pose additional challenges, as most algorithms are designed to model
individual small datasets, thus lacking cross-scene generalization. In this
paper, we propose a comprehensive remote sensing image building understanding
model, termed RSBuilding, developed from the perspective of the foundation
model. RSBuilding is designed to enhance cross-scene generalization and task
universality. Specifically, we extract image features based on the prior
knowledge of the foundation model and devise a multi-level feature sampler to
augment scale information. To unify task representation and integrate image
spatiotemporal clues, we introduce a cross-attention decoder with task prompts.
Addressing the current shortage of datasets that incorporate annotations for
both tasks, we have developed a federated training strategy to facilitate
smooth model convergence even when supervision for some tasks is missing,
thereby bolstering the complementarity of different tasks. Our model was
trained on a dataset comprising up to 245,000 images and validated on multiple
building extraction and change detection datasets. The experimental results
substantiate that RSBuilding can concurrently handle two structurally distinct
tasks and exhibits robust zero-shot generalization capabilities.
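
The abstract does not spell out how training proceeds when a sample lacks labels for one of the two tasks. One common way to handle this is to mask the loss so that each task is averaged only over the samples that carry its annotations. The sketch below illustrates that general idea and is not the authors' implementation: the function names, the dictionary keys (`ext_pred`, `ext_label`, `cd_pred`, `cd_label`), and the choice of binary cross-entropy are all assumptions made for illustration.

```python
import math


def bce(prob, target, eps=1e-7):
    """Binary cross-entropy for one predicted probability and a 0/1 target."""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


def mean_bce(probs, targets):
    """Mean binary cross-entropy over a flattened mask."""
    return sum(bce(p, t) for p, t in zip(probs, targets)) / len(probs)


def masked_multitask_loss(batch):
    """Combine per-task losses, skipping samples whose label is missing.

    Each sample is a dict with hypothetical keys:
      'ext_pred', 'cd_pred'   -> lists of predicted probabilities
      'ext_label', 'cd_label' -> lists of 0/1 targets, or None if absent
    'ext' stands for building extraction, 'cd' for change detection.
    """
    total, n_tasks = 0.0, 0
    for task in ("ext", "cd"):
        # Keep only the samples that are annotated for this task.
        losses = [
            mean_bce(s[f"{task}_pred"], s[f"{task}_label"])
            for s in batch
            if s[f"{task}_label"] is not None
        ]
        if losses:
            total += sum(losses) / len(losses)
            n_tasks += 1
    # A batch with no labels at all contributes zero loss.
    return total / n_tasks if n_tasks else 0.0


# One sample annotated only for extraction, one only for change detection.
batch = [
    {"ext_pred": [0.9, 0.1], "ext_label": [1, 0],
     "cd_pred": [0.5, 0.5], "cd_label": None},
    {"ext_pred": [0.5, 0.5], "ext_label": None,
     "cd_pred": [0.8, 0.2], "cd_label": [1, 0]},
]
loss = masked_multitask_loss(batch)
```

With this masking, a batch drawn from an extraction-only dataset still yields a well-defined loss, and the predictions for the unlabeled task simply contribute nothing to it.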