Evaluation and interpretation of driving risks: Automobile claim frequency modeling with telematics data

STATISTICAL ANALYSIS AND DATA MINING (2023)


Abstract

With the development of vehicle telematics and data mining technology, usage-based insurance (UBI) has attracted widespread interest from both academia and industry. The extensive driving behavior features make it possible to better understand the risks of insured vehicles, but they pose challenges in identifying and interpreting important ratemaking factors. This study, based on the telematics data of policyholders in mainland China, analyzes the insurance claim frequency of commercial trucks using both Poisson regression and several machine learning models, including a regression tree, random forest, gradient boosting tree, XGBoost, and a neural network. After selecting the best model, we analyze feature importance, feature effects, and the contribution of each feature to the prediction from an actuarial perspective. Our empirical study shows that XGBoost greatly outperforms the traditional models and detects some important risk factors, such as the average speed, the average mileage traveled per day, the fraction of night driving, the number of sudden brakes, and the fraction of left/right turns at intersections. These features usually have a nonlinear effect on driving risk, and there are complex interactions between features. To further distinguish high-risk from low-risk drivers, we run supervised clustering for risk segmentation according to drivers' driving habits. In summary, this study not only provides a more accurate prediction of driving risk, but also satisfies the interpretability requirements of insurance regulators and risk management.

Keywords

claim frequency, machine learning, risk management, telematics data, usage-based insurance
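
To make the Poisson regression baseline named in the abstract concrete, the following is a minimal sketch of a claim frequency model with a log-exposure offset, fitted by Newton's method (IRLS). It is not the authors' code: the data are synthetic, and the feature names (`night_fraction`, `sudden_brakes`), the coefficients in `true_beta`, and the helper `fit_poisson_glm` are all assumptions chosen for illustration.

```python
import numpy as np


def fit_poisson_glm(X, y, exposure, n_iter=25, tol=1e-8):
    """Fit a log-link Poisson GLM with a log-exposure offset by Newton's method."""
    design = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column
    offset = np.log(exposure)
    beta = np.zeros(design.shape[1])
    # Start the intercept at the log of the overall claim rate for stability.
    beta[0] = np.log(y.sum() / exposure.sum())
    for _ in range(n_iter):
        mu = np.exp(design @ beta + offset)        # expected claim counts
        grad = design.T @ (y - mu)                 # score vector
        hess = design.T @ (design * mu[:, None])   # Fisher information matrix
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Synthetic portfolio (hypothetical features, not the paper's data).
rng = np.random.default_rng(0)
n = 20000
night_fraction = rng.uniform(0.0, 0.6, n)        # share of driving done at night
sudden_brakes = rng.poisson(3.0, n).astype(float)  # sudden brakes per period
exposure = rng.uniform(0.25, 1.0, n)             # policy-years in force

X = np.column_stack([night_fraction, sudden_brakes])
true_beta = np.array([-2.0, 1.5, 0.1])           # assumed intercept and effects
rate = np.exp(true_beta[0] + X @ true_beta[1:])  # claims per policy-year
y = rng.poisson(rate * exposure)                 # observed claim counts

beta_hat = fit_poisson_glm(X, y, exposure)
print("estimated coefficients:", beta_hat)
```

A log-linear model of this kind captures only multiplicative, monotone effects of each feature, which is one plausible reason a tree-based model such as XGBoost can do better when effects are nonlinear and features interact, as the abstract reports.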
