SeSAM: semi-automated semantic analysis method of urban areas' events with extreme levels of popularity based on public open data

10TH INTERNATIONAL YOUNG SCIENTISTS CONFERENCE IN COMPUTATIONAL SCIENCE (YSC2021)(2021)

Abstract
This study describes a semi-automated pipeline for the comprehensive analysis of urban areas with extremely low and extremely high popularity levels. It includes a geo-frequency analysis of Russian-language Instagram publications for the St. Petersburg area and the selection of areas with extreme popularity values according to the number of publications in them. Semantic analysis of urban areas with an extremely low number of publications includes a comparison of algorithms for extracting and classifying descriptions of these areas, together with the results of such extraction and classification using the TF-IDF vectorization technique and extraction of the most valuable words. Semantic analysis of areas with an extremely high number of publications includes a description of the structure of such areas, a comparison of algorithms for extracting advertisement publications, the results of advertisement extraction using the BigARTM model, and the development and implementation of an algorithm for extracting events related to points of attraction in extremely popular urban areas. This algorithm is based on the strong time binding hypothesis and on the idea of similarity queries, combining LDA models for revealing semantic structure with an algorithm based on frequency analysis. The developed algorithm was tested on the urban area of St. Petersburg where the Ice Palace is located; it showed interpretable results and correctly extracted 89 of the 102 events that occurred in this area in 2019. Finally, the described algorithms were combined into the SeSAM pipeline for comprehensive urban analysis. (C) 2021 The Authors. Published by ELSEVIER B.V.
Keywords
urban study, social network data analysis, semantic analysis, topic modelling, events extraction
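
The geo-frequency step mentioned in the abstract (counting publications per area and selecting areas with extreme counts) can be illustrated with a minimal sketch. This is not the authors' implementation: the square grid, the cell size `cell_deg`, the quantile thresholds `low_q` and `high_q`, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_deg=0.01):
    """Map a coordinate to a square grid cell index.

    The regular grid and its cell size are assumptions; the abstract
    does not say how urban areas are delimited.
    """
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def count_posts_per_cell(points, cell_deg=0.01):
    """Count geotagged publications in each grid cell."""
    return Counter(grid_cell(lat, lon, cell_deg) for lat, lon in points)


def extreme_cells(counts, low_q=0.1, high_q=0.9):
    """Return (low, high): cells at or below the low quantile of the
    per-cell counts and cells at or above the high quantile.

    The quantile thresholds are hypothetical parameters.
    """
    if not counts:
        return set(), set()
    values = sorted(counts.values())

    def quantile(q):
        # Nearest-rank quantile, clamped to valid indices.
        idx = min(len(values) - 1, max(0, math.ceil(q * len(values)) - 1))
        return values[idx]

    low_threshold, high_threshold = quantile(low_q), quantile(high_q)
    low = {cell for cell, n in counts.items() if n <= low_threshold}
    high = {cell for cell, n in counts.items() if n >= high_threshold}
    return low, high
```

In this sketch, cells in `low` would be candidates for the description extraction and classification step, and cells in `high` for the advertisement filtering and event extraction step; how the paper actually routes areas between the two analyses is not specified in the abstract.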