Synthetic rainfall data generator development through decentralised model training

Journal of Hydrology (2022)

Abstract
Recent flood events induced by heavy rainfall, for example in Germany, Australia and the USA, have highlighted the importance of countermeasures for saving human lives and preventing property damage. Newly introduced ML-based flood forecasting methods rely on synthetic high-intensity rainfall events because real events of this kind are sparse. Such synthetic data instances can be produced by precipitation generators trained in an adversarial setting on historical rainfall data. The processes that capture rainfall data are often highly distributed, with multiple radar stations contributing to a centralised data set. However, data centralisation entails challenges regarding data-stream logistics, data locality, and memory overhead. Distributed Analytics (DA) aims to overcome these challenges through decentralised model training, bringing the algorithm to the data instead of vice versa. In this work, we present a feasibility study evaluating the applicability of DA to hydrological data. As an example use case, we choose the decentralised training of rainfall data generators. We introduce a rainfall generator training procedure based on Generative Adversarial Networks (GANs) and evaluate two DA algorithms: Federated Learning (FL) and Cyclic Institutional Incremental Learning (CIIL). We compare the resulting training outcomes with the centralised model training (CL) approach and find that CIIL performed similarly to CL but was less stable, while FL outperformed CL by 7.5%. We conclude that the demonstrated feasibility of FL in our simulated distributed setting lays the groundwork for applying this approach in realistic, larger-scale environments while overcoming the potential privacy concerns and logistical challenges of centralised analytics.
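The difference between the two DA algorithms named in the abstract can be illustrated with a minimal sketch. It is not the paper's training procedure: the GAN is replaced here by a toy one-parameter linear model y = w * x so that the example stays short and runnable, and all names (`local_train`, `federated_round`, `ciil_cycle`, `train`, the two station data sets) are hypothetical. The sketch assumes that FL averages the locally trained weights of all stations, weighted by local sample count, and that CIIL passes a single model from station to station in repeated cycles.

```python
from typing import Callable, List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held locally by one station


def local_train(w: float, data: Dataset, lr: float = 0.05, epochs: int = 5) -> float:
    """Run a few gradient-descent epochs on one station's data for the toy model y = w * x."""
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global: float, stations: List[Dataset]) -> float:
    """One FL round: every station starts from the same weights and trains locally;
    the results are averaged, weighted by the number of local samples."""
    total = sum(len(d) for d in stations)
    return sum(local_train(w_global, d) * len(d) for d in stations) / total


def ciil_cycle(w: float, stations: List[Dataset]) -> float:
    """One CIIL cycle: a single model visits the stations one after another,
    each station continuing from the weights left by the previous one."""
    for d in stations:
        w = local_train(w, d)
    return w


def train(stations: List[Dataset], rounds: int,
          step: Callable[[float, List[Dataset]], float]) -> float:
    """Repeat the given decentralised update step, starting from w = 0."""
    w = 0.0
    for _ in range(rounds):
        w = step(w, stations)
    return w


# Two hypothetical stations whose data both follow y = 3 * x.
stations: List[Dataset] = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5), (1.5, 4.5), (1.0, 3.0)],
]

w_fl = train(stations, rounds=20, step=federated_round)
w_ciil = train(stations, rounds=20, step=ciil_cycle)
print(f"FL estimate: {w_fl:.4f}, CIIL estimate: {w_ciil:.4f}")
```

In both variants only model weights leave a station, never the raw data. The toy data are identically distributed across stations, so the sketch says nothing about the stability or performance differences reported in the abstract.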
Keywords
Hydrological data,Distributed data,Distributed analytics,Data generation