Zero-shot Building Attribute Extraction from Large-Scale Vision and Language Models
CoRR(2023)
摘要
Existing building recognition methods, exemplified by BRAILS, utilize
supervised learning to extract information from satellite and street-view
images for classification and segmentation. However, each task module requires
human-annotated data, hindering the scalability and robustness to regional
variations and annotation imbalances. In response, we propose a new zero-shot
workflow for building attribute extraction that utilizes large-scale vision and
language models to mitigate reliance on external annotations. The proposed
workflow contains two key components: image-level captioning and segment-level
captioning for the building images based on the vocabularies pertinent to
structural and civil engineering. These two components generate descriptive
captions by computing feature representations of the image and the
vocabularies, and facilitating a semantic match between the visual and textual
representations. Consequently, our framework offers a promising avenue to
enhance AI-driven captioning for building attribute extraction in the
structural and civil engineering domains, ultimately reducing reliance on human
annotations while bolstering performance and adaptability.
更多查看译文
关键词
Applications,Structural engineering / civil engineering,Algorithms,Image recognition and understanding
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要