ISSN 1003-8035 CN 11-2852/P
    曾韬睿,王林峰,张俞,等. 基于CatBoost-SHAP模型的滑坡易发性建模及可解释性[J]. 中国地质灾害与防治学报,2024,35(1): 1-14. DOI: 10.16031/j.cnki.issn.1003-8035.202309035
    ZENG Tao-rui,WANG Lin-feng,ZHANG Yu,et al. Landslide susceptibility modeling and interpretability based on CatBoost-SHAP model[J]. The Chinese Journal of Geological Hazard and Control,2024,35(1): 1-14. DOI: 10.16031/j.cnki.issn.1003-8035.202309035

    基于CatBoost-SHAP模型的滑坡易发性建模及可解释性

    Landslide susceptibility modeling and interpretability based on CatBoost-SHAP model

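    • Abstract (translated from the Chinese): This study examines the uncertainty and interpretability of ensemble learning models in landslide susceptibility modeling. Taking the eastern coastal mountainous area of Zhejiang Province as the study area, it uses historical Google imagery and Sentinel-2A imagery to record 552 shallow landslides triggered by Super Typhoon "Megi" in 2016. Continuous factors were first prepared under three scenario designs (ungraded, equal interval, and natural breaks), with the graded scenarios further divided into 4, 6, 8, 12, 16, and 20 classes. The categorical boosting model (CatBoost) was then used to evaluate landslide susceptibility under each scenario, and ROC (receiver operating characteristic) curves together with SHAP (SHapley Additive exPlanations) analysis were used to study the uncertainty and interpretability of the modeling process, with the aim of identifying the optimal modeling strategy. The results show that: 1) in the CatBoost model, distance to rivers is the most critical influencing factor, followed by factors related to geological conditions and human activity; 2) the ungraded scenario yields the highest AUC, 0.866. Compared with the equal interval method, the natural breaks method shows better generalization, and predictive performance increases with the number of classes; 3) the SHAP model reveals how the main influencing factors (distance to roads, distance to rivers, DEM, and aspect) control typhoon-induced landslides. The results can deepen the understanding of landslide susceptibility, improve the accuracy and reliability of landslide prediction, and provide a scientific basis for disaster prevention and mitigation in the region.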

       

      Abstract: This study examines the uncertainty and interpretability of ensemble learning models in landslide susceptibility modeling. Taking the eastern coastal mountainous area of Zhejiang Province as the study area, it uses historical Google imagery and Sentinel-2A imagery to document 552 shallow landslides triggered by Super Typhoon "Megi" in 2016. Continuous factors are first prepared under three scenarios (ungraded, equal interval, and natural breaks), and the graded scenarios are further divided into 4, 6, 8, 12, 16, and 20 classes. The categorical boosting (CatBoost) model is then used to assess landslide susceptibility under each scenario. ROC (receiver operating characteristic) curves and SHAP (SHapley Additive exPlanations) analysis are used to investigate the uncertainty and interpretability of the modeling process, with the aim of determining the optimal modeling strategy. The results indicate that: 1) in the CatBoost model, aspect is the most critical influencing factor, followed by factors related to water and geological conditions; 2) the ungraded scenario achieves the highest AUC, 0.866. Compared with the equal interval method, the natural breaks method shows better generalization, and the model's predictive performance improves as the number of classes increases; 3) the SHAP model reveals how the principal influencing factors (aspect, lithology, elevation, and road distance) control typhoon-induced landslides. These findings can deepen the understanding of landslide susceptibility, improve the accuracy and reliability of landslide prediction, and provide a scientific basis for disaster prevention and mitigation in the region.
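The comparison between ungraded and graded continuous factors can be illustrated with a small sketch. The code below is not the authors' workflow: it does not run CatBoost or SHAP, and it uses a single invented factor (a hypothetical distance-to-river column) with invented landslide labels. It only shows, under those assumptions, how equal interval grading assigns class indices and how AUC can be computed for the raw and graded versions of the same factor. The function names `equal_interval_bins` and `auc` are illustrative.

```python
from typing import List, Sequence


def equal_interval_bins(values: Sequence[float], n_classes: int) -> List[int]:
    """Assign each value to one of n_classes equal-width classes (0-based)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_classes
    # The maximum value would fall just past the last class, so clamp it back in.
    return [min(int((v - lo) / width), n_classes - 1) for v in values]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: distance to river in metres and landslide labels (1 = landslide).
distance = [10, 50, 120, 200, 260, 400, 520, 640, 800, 1000]
landslide = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]

# Assumption for this toy example only: shorter distance means higher susceptibility,
# so the negated distance (or negated class index) is used as the score.
bins = equal_interval_bins(distance, n_classes=4)
auc_raw = auc(landslide, [-d for d in distance])
auc_binned = auc(landslide, [-b for b in bins])

print("classes:", bins)
print("AUC ungraded:", auc_raw)
print("AUC 4-class equal interval:", auc_binned)
```

On this toy data the graded factor produces ties between samples that the ungraded factor can still rank, which is one plausible reason grading can cost discriminative power. Natural breaks grading would replace `equal_interval_bins` with a variance-minimising split and is omitted here for brevity.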

       
