Abstract:
Debris flows are high-concentration, heterogeneous, multiphase flows typically triggered by intense rainfall or snowmelt. Because their formation and movement processes are complex, accurate susceptibility assessment is vital for disaster monitoring and mitigation. Traditional methods often fall short in predictive accuracy, which has led to the growing adoption of machine learning algorithms in this field in recent years. Using the upper Minjiang River Basin as a case study, this study proposes a debris flow susceptibility assessment model, SPY-RF, which integrates the spy technique (SPY) with the random forest (RF) algorithm. The SPY technique addresses the common issue of class imbalance by deriving high-quality pseudo-negative samples from unlabeled data, thereby enhancing the model’s classification performance. Fourteen assessment factors, including gully density, lithology, and area, were selected based on geological disaster data and remote sensing imagery to construct a comprehensive debris flow dataset. The SPY technique was used to optimize negative sample selection, and the selected samples were then combined with the RF model to evaluate susceptibility. The results indicate that the SPY-RF model outperforms the traditional RF model, achieving an AUC of 0.98 compared with 0.93. The predicted distribution of extremely high susceptibility areas aligns closely with the existing debris flow points, indicating that the SPY-RF model predicts debris flow susceptibility with greater accuracy and stability. The model also successfully identifies debris flow occurrences in areas of low and extremely low susceptibility. By improving how negative samples are acquired and filtered, the SPY technique greatly increased the quality of the negative samples, which in turn raised the prediction accuracy and reliability of the model. The proposed SPY-RF model provides useful guidance for managing debris flow risk in the upper Minjiang River Basin.
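
The negative-sample selection step summarized above can be illustrated with a minimal sketch of the general spy technique. It is not the implementation used in this study. The function names, the `spy_frac` and `noise` parameters, and their default values are assumptions made for illustration, and a simple centroid-distance scorer stands in for the RF so that the example needs only the Python standard library. In practice the scores would come from a trained RF, for example the positive-class probability returned by scikit-learn's `RandomForestClassifier.predict_proba`.

```python
import math
import random


def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def fit_scorer(pos, neg):
    """Stand-in classifier (NOT a random forest): returns a function that maps
    a sample to a pseudo-probability of 'debris flow', based on its distances
    to the centroids of the positive and negative training samples."""
    c_pos, c_neg = centroid(pos), centroid(neg)

    def score(x):
        d_pos, d_neg = math.dist(x, c_pos), math.dist(x, c_neg)
        total = d_pos + d_neg
        return 0.5 if total == 0 else d_neg / total

    return score


def spy_negatives(positives, unlabeled, spy_frac=0.15, noise=0.05, seed=0):
    """Select pseudo-negative samples from `unlabeled` with the spy technique.

    spy_frac and noise are illustrative defaults, not values from the study.
    """
    if len(positives) < 2:
        raise ValueError("need at least two positive samples")
    rng = random.Random(seed)

    # 1. Hide a random subset of known positives ("spies") in the unlabeled set.
    n_spies = min(max(1, round(spy_frac * len(positives))), len(positives) - 1)
    spy_idx = set(rng.sample(range(len(positives)), n_spies))
    spies = [p for i, p in enumerate(positives) if i in spy_idx]
    kept = [p for i, p in enumerate(positives) if i not in spy_idx]

    # 2. Train with the remaining positives against unlabeled + spies,
    #    treating everything in the second group as negative.
    score = fit_scorer(kept, unlabeled + spies)

    # 3. Threshold at a low quantile of the spy scores, so that a small
    #    share of outlying spies (controlled by `noise`) is tolerated.
    spy_scores = sorted(score(s) for s in spies)
    threshold = spy_scores[int(noise * len(spy_scores))]

    # 4. Unlabeled samples scoring below the threshold look less like debris
    #    flows than almost every spy and are kept as pseudo-negatives.
    return [u for u in unlabeled if score(u) < threshold]
```

The samples returned by `spy_negatives` would then be paired with the known debris flow points to train the final susceptibility classifier. Because the spies are true positives whose labels are hidden, their scores indicate how positives concealed in the unlabeled data are likely to be scored, which is what justifies the threshold in step 3.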