WeChat Mini Program
Old Version Features

Enriching Building Function Classification Using Large Language Model Embeddings of OpenStreetMap Tags

Earth Sci Informatics(2024)

GIScience Chair

Cited 0|Views1
Abstract
Automated methods for building function classification are essential due to restricted access to official building use data. Existing approaches utilize traditional Natural Language Processing (NLP) techniques to analyze textual data representing human activities, but they struggle with the ambiguity of semantic contexts. In contrast, Large Language Models (LLMs) excel at capturing the broader context of language. This study presents a method that uses LLMs to interpret OpenStreetMap (OSM) tags, combining them with physical and spatial metrics to classify urban building functions. We employed an XGBoost model trained on 32 features from six city datasets to classify urban building functions, demonstrating varying F1 scores from 67.80% in Madrid to 91.59% in Liberec. Integrating LLM embeddings enhanced the model's performance by an average of 12.5% across all cities compared to models using only physical and spatial metrics. Moreover, integrating LLM embeddings improved the model's performance by 6.2% over models that incorporate OSM tags as one-hot encodings, and when predicting based solely on OSM tags, the LLM approach outperforms traditional NLP methods in 5 out of 6 cities. These results suggest that deep contextual understanding, as captured by LLM embeddings more effectively than traditional NLP approaches, is beneficial for classification. Finally, a Pearson correlation coefficient of approximately -0.858 between population density and F1-scores suggests that denser areas present greater classification challenges. Moving forward, we recommend investigation into discrepancies in model performance across and within cities, aiming to identify generalized models.
More
Translated text
Key words
Building functions classification,Large language models,OpenStreetMap,Text embedding,Natural language processing
PDF
Bibtex
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Data Disclaimer
The page data are from open Internet sources, cooperative publishers and automatic analysis results through AI technology. We do not make any commitments and guarantees for the validity, accuracy, correctness, reliability, completeness and timeliness of the page data. If you have any questions, please contact us by email: report@aminer.cn
Chat Paper
Summary is being generated by the instructions you defined