ISSN 1003-8035 CN 11-2852/P

    Comparative study of typical statistical models and ensemble learning models for landslide susceptibility in Dechang County, Sichuan Province

    Abstract: Landslides are among the most frequent and destructive geological hazards in the mountainous areas of southwestern China, and their susceptibility is jointly controlled by topography, geology, meteorology, and human activity. To improve the accuracy of landslide risk identification in complex mountainous terrain, this study takes Dechang County, Sichuan Province, as the study area and builds three landslide susceptibility models from typical landslide influencing factors: Logistic Regression (LR), Random Forest (RF), and Gradient Boosting Decision Tree (GBDT). The models are compared systematically in terms of prediction performance, spatial distribution of susceptibility, and identification of dominant controlling factors. Nine influencing factors are used for modelling, and model performance is assessed with the area under the receiver operating characteristic curve (AUC), the Kappa coefficient, and overall accuracy (ACC). The results show that all three models effectively capture the spatial pattern of landslide susceptibility in Dechang County. The GBDT model achieves the highest prediction accuracy and identifies extremely low susceptibility zones well; the RF model performs stably and generalizes well; the LR model is readily interpretable but, constrained by multicollinearity among variables and by its linearity assumption, has relatively low prediction accuracy. In terms of feature importance, the Normalized Difference Vegetation Index (NDVI), slope, and distance to roads are the dominant factors identified by the RF and GBDT models, reflecting a landslide-triggering mechanism in Dechang County driven jointly by vegetation cover and engineering disturbance. Under the sample size and factor configuration of this study, the ensemble learning models outperform the traditional statistical model overall in both prediction accuracy and stability, which supports the effectiveness and applicability of ensemble-learning-based susceptibility assessment in complex mountainous environments. The model evaluation framework and workflow established here provide a reference for model selection and method optimization in landslide susceptibility assessment, and offer a scientific basis and methodological support for geological hazard risk identification, territorial spatial planning, and engineering site selection in Dechang County and similar mountainous areas.
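
For readers unfamiliar with the three evaluation metrics named in the abstract, the following is a minimal sketch of how ACC, the Kappa coefficient, and AUC can be computed for a binary landslide / non-landslide classification. It is not the authors' code: the function names, the 0.5 decision threshold, and the eight toy samples are assumptions made purely for illustration and are unrelated to the Dechang County data.

```python
from typing import Sequence


def overall_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def kappa_coefficient(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    p_observed = overall_accuracy(y_true, y_pred)
    labels = set(y_true) | set(y_pred)
    # Chance agreement from the marginal class frequencies of truth and prediction.
    p_chance = sum(
        (list(y_true).count(c) / n) * (list(y_pred).count(c) / n) for c in labels
    )
    # Undefined (division by zero) when chance agreement is exactly 1.
    return (p_observed - p_chance) / (1.0 - p_chance)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive sample outscores a
    random negative sample (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = landslide cell, 0 = non-landslide cell.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical susceptibility scores from some model (LR, RF, or GBDT).
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
# Assumed threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = overall_accuracy(y_true, y_pred)
kappa = kappa_coefficient(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"ACC={acc:.3f}  Kappa={kappa:.3f}  AUC={auc:.3f}")
```

On this toy data the sketch gives ACC = 0.750, Kappa = 0.500, and AUC = 0.875. ACC and Kappa depend on the chosen threshold, whereas AUC is threshold-free, which is one reason the three metrics are commonly reported together.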

       
