Developing machine learning tools for long-lead heavy precipitation prediction with multi-sensor data

ICNSC(2015)

引用 21|浏览29
暂无评分
摘要
A large number of extreme floods were closely related to heavy precipitation which lasted for several days or weeks. Long-lead prediction of extreme precipitation, i.e., prediction of 6–15 days ahead of time, is important for understanding the prognostic forecasting potential of many natural disasters, such as floods. Yet, long-lead flood forecasting is a challenging task due to the cascaded uncertainty with prediction errors from measurements to modeling, which makes the current physics-based numerical simulation models extremely complex and inaccurate. In this paper, we formulate the modeling work as a machine learning problem and introduce a complementary data mining framework for heavy precipitation prediction. Heavy precipitation that may lead to extreme floods is a rare event. Long-lead prediction requires the corresponding feature space to be sampled from extremely high spatio-temporal dimensions. Such a complexity makes long-lead heavy precipitation prediction a high dimensional and imbalanced machine learning problem. In this work, we firstly define the extreme precipitation and non-extreme precipitation clusters and then design the Nearest-Sample Choosing method to handle the imbalanced data sets. We introduce streaming feature selection and subspace learning to extract the most relevant features from high dimensional data. We evaluate the machine learning tools using historical flood data collected in the State of Iowa, the United States and associated hydrometeorological variables from 1948 to 2010.
更多
查看译文
关键词
Heavy Precipitation Predicition, Nearest Sample Choosing, Fast Online Streaming Feature Selection, Machine Learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要