Abstract:
This study targets landslide-prone areas on the Loess Plateau in Northern Shaanxi, China. It aims to improve the accuracy and reliability of landslide susceptibility assessment by extending negative-sample selection methods to slope unit systems and by examining how the frequency ratio (FR) method couples with different machine learning algorithms. Using multi-source geographic data, four machine learning models, namely logistic regression, naive Bayes, support vector machine (SVM), and gradient boosting decision tree (GBDT), were combined with a slope unit framework and the FR method to build landslide susceptibility assessment models. Through a spatial heterogeneity synergy mechanism that integrates statistical optimization with spatial constraints, the FR method establishes a mapping between feature space and geographic space, which addresses the information loss associated with traditional sampling methods. Experimental results show substantial performance gains: for the SVM model, accuracy rose from 42.1% (random sampling) to 84.2% (FR method), the Matthews correlation coefficient (MCC) rose from -0.039 to 0.716, and the area under the receiver operating characteristic curve (AUC) rose from 0.65 to 0.96. The FR method improved both representational capacity and robustness for all four machine learning models, with the largest gain observed for SVM. By coordinating parameter optimization with spatial constraints, the FR method mitigates the limitations of conventional sampling approaches and plays a key role in improving machine learning model performance. This research provides theoretical and technical support for slope unit-based landslide susceptibility assessment on the Loess Plateau and helps make regional landslide risk evaluation more precise and reliable.
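To make the FR idea concrete, the following is a minimal Python sketch under stated assumptions, not the study's implementation. It uses the commonly cited definition of the frequency ratio for one class of a conditioning factor: the share of landslide units falling in that class divided by the share of all units falling in that class. The function names, the toy slope classes, and the rule of drawing negative-sample candidates from classes with FR below 1.0 are all hypothetical and chosen only for illustration; the abstract does not state the selection rule actually used.

```python
from collections import Counter


def frequency_ratio(classes, is_landslide):
    """Return {class: FR} for one conditioning factor.

    FR = (landslide units in class / all landslide units)
         / (units in class / all units)
    """
    total_units = len(classes)
    total_landslides = sum(is_landslide)
    if total_units == 0 or total_landslides == 0:
        raise ValueError("need at least one unit and one landslide unit")

    units_per_class = Counter(classes)
    landslides_per_class = Counter(
        c for c, y in zip(classes, is_landslide) if y
    )
    return {
        c: (landslides_per_class[c] / total_landslides) / (n / total_units)
        for c, n in units_per_class.items()
    }


def low_fr_negative_candidates(classes, is_landslide, threshold=1.0):
    """Indices of non-landslide units whose class FR is below the threshold.

    The threshold rule is an illustrative assumption, not the paper's rule.
    """
    fr = frequency_ratio(classes, is_landslide)
    return [
        i
        for i, (c, y) in enumerate(zip(classes, is_landslide))
        if not y and fr[c] < threshold
    ]


# Hypothetical toy data: one slope class and one label per slope unit.
slope_class = ["gentle"] * 4 + ["steep"] * 4
landslide = [0, 0, 0, 1, 1, 1, 1, 0]

fr = frequency_ratio(slope_class, landslide)
candidates = low_fr_negative_candidates(slope_class, landslide)
print(fr)
print(candidates)
```

In this toy case the non-landslide unit in the "steep" class is excluded from the negative-sample candidates because its class is over-represented among landslides, whereas random sampling would treat every non-landslide unit as equally eligible.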