Abstract:
This research explores the use of machine learning in assessing landslide susceptibility, with a focus on how non-landslide samples are selected. Taking Wenchuan County, Lixian County, and Maoxian County in Sichuan Province as the study area, seven evaluation factors were considered: slope, aspect, elevation, distance to the water system, distance to faults, lithology, and land use. Non-landslide samples were randomly selected from the low and extremely low susceptibility zones delineated by the Information Value model (I), Weight of Evidence model (WOE), Coefficient of Determination model (CF), and Frequency Ratio model (FR), as well as from the buffer zones (B) and the entire region (G). These samples were then analyzed using a Support Vector Machine (SVM) model. The results showed that the AUC values for I-SVM, WOE-SVM, CF-SVM, and FR-SVM were 0.9804, 0.9726, 0.9368, and 0.8451, respectively, which were higher than the AUC values of B-SVM (0.7869) and G-SVM (0.7389). This highlights the effectiveness of using mathematical-statistical models for the selection of non-landslide samples, in particular the Information Value model, which achieved the highest AUC. This study offers a novel approach to selecting non-landslide samples that significantly enhances predictive accuracy in landslide susceptibility assessments.
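
The sample-selection step summarized above can be illustrated with a minimal Python sketch. It assumes the commonly used information value form I = ln((N_i/N)/(S_i/S)), where N_i/N is the share of landslides in a factor class and S_i/S is the share of area in that class; the abstract does not state the exact formulation used. The function names, the `eps` floor for classes without landslides, and the `low_fraction` cutoff standing in for the low and extremely low zones are all hypothetical choices made for illustration, and the subsequent SVM training is not shown.

```python
import math
import random


def information_values(cells, factor, eps=1e-6):
    """Information value of each class of one factor.

    Assumes at least one landslide cell exists. `eps` is an assumed floor
    that keeps the logarithm finite for classes with no landslides.
    """
    n_cells = len(cells)
    n_slides = sum(c["landslide"] for c in cells)
    area, slides = {}, {}
    for c in cells:
        k = c[factor]
        area[k] = area.get(k, 0) + 1
        slides[k] = slides.get(k, 0) + c["landslide"]
    return {
        k: math.log(max((slides[k] / n_slides) / (area[k] / n_cells), eps))
        for k in area
    }


def sample_non_landslides(cells, factors, n, low_fraction=0.4, seed=0):
    """Randomly draw non-landslide cells from the lowest-scoring zone.

    `low_fraction` is an assumed stand-in for the low and extremely low
    susceptibility zones; the study's actual class breaks are not given.
    """
    tables = {f: information_values(cells, f) for f in factors}
    # Total information value per cell is the sum over all factors.
    scored = sorted(
        (sum(tables[f][c[f]] for f in factors), i) for i, c in enumerate(cells)
    )
    cutoff = max(1, int(len(scored) * low_fraction))
    low_zone = [i for _, i in scored[:cutoff] if cells[i]["landslide"] == 0]
    rng = random.Random(seed)
    return [cells[i] for i in rng.sample(low_zone, min(n, len(low_zone)))]


# Toy grid: landslides occur only on steep cells.
cells = [{"slope": "steep", "landslide": 1 if i < 8 else 0} for i in range(10)]
cells += [{"slope": "gentle", "landslide": 0} for _ in range(10)]
picked = sample_non_landslides(cells, ["slope"], n=5)
```

In this toy grid every drawn sample comes from the gentle class, which is the intended effect: negatives are taken from cells the statistical model already rates as unlikely to fail, rather than from the whole region.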