Abstract:
Accurate landslide susceptibility assessment (LSA) for earthquake-induced landslides is critical for post-earthquake emergency response and disaster risk management. However, conventional models often rely on randomly selected negative samples that are not constrained by physically grounded stability criteria (e.g., factor of safety, permanent displacement, or failure probability), which compromises prediction reliability. To investigate how different physically informed negative sampling strategies affect the accuracy of co-seismic LSA, this study employs three representative stability indicators, Factor of Safety (Fs), Newmark Displacement (Dn), and Landslide Failure Probability (Pf), to delineate stable regions from which negative sample datasets are constructed. These datasets are combined with three commonly used models, Logistic Regression (LR), Random Forest (RF), and Convolutional Neural Network (CNN), in a systematic comparative analysis. The results show that the Dn-based sampling strategy more effectively characterizes the physical triggering mechanisms of earthquake-induced landslides and achieves the highest predictive accuracy (AUC = 0.924), outperforming both the Pf-based (AUC = 0.912) and Fs-based (AUC = 0.908) strategies. Among the three models, CNN consistently exhibits superior performance in spatial prediction and classification accuracy, which is attributed to its capacity for nonlinear learning and hierarchical feature extraction. SHAP (SHapley Additive exPlanations) analysis further indicates that peak ground acceleration (PGA), slope, relative slope position, and distance to fault are the dominant factors controlling landslide occurrence. Overall, the Dn-CNN combination yields the most accurate and interpretable susceptibility results, offering valuable insights for improving negative sample construction and model selection in seismic LSA.
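
As an illustration of the negative sampling idea summarized above, the following minimal Python sketch draws non-landslide samples only from cells that a stability indicator marks as stable. It is not the study's implementation: the function name `sample_negatives`, the synthetic Dn grid, the stand-in landslide inventory, and the 1.0 threshold are all hypothetical, since the abstract does not state the thresholds or sampling ratios used.

```python
import numpy as np


def sample_negatives(indicator, landslide_mask, threshold, n_samples,
                     stable_if="below", seed=0):
    """Randomly pick negative-sample cells from physically stable terrain.

    indicator      : 2-D array of a stability indicator (e.g. Dn, Fs, or Pf).
    landslide_mask : 2-D boolean array, True where a landslide was mapped.
    threshold      : assumed cut-off separating stable from unstable cells.
    stable_if      : "below" when small values mean stable (Dn, Pf),
                     "above" when large values mean stable (Fs).
    Returns a (rows, cols) tuple of index arrays.
    """
    indicator = np.asarray(indicator, dtype=float)
    landslide_mask = np.asarray(landslide_mask, dtype=bool)

    if stable_if == "below":
        stable = indicator < threshold
    elif stable_if == "above":
        stable = indicator > threshold
    else:
        raise ValueError("stable_if must be 'below' or 'above'")

    # Candidates: stable, valid (finite) cells outside mapped landslides.
    valid = np.isfinite(indicator)
    candidates = np.flatnonzero(stable & valid & ~landslide_mask)
    if candidates.size < n_samples:
        raise ValueError("not enough stable cells for the requested sample size")

    rng = np.random.default_rng(seed)
    chosen = rng.choice(candidates, size=n_samples, replace=False)
    return np.unravel_index(chosen, indicator.shape)


# Toy example on synthetic data (all values are illustrative only).
grid_rng = np.random.default_rng(42)
dn = grid_rng.uniform(0.0, 10.0, size=(50, 50))  # synthetic Dn values
landslides = dn > 8.0                            # stand-in landslide inventory
rows, cols = sample_negatives(dn, landslides, threshold=1.0, n_samples=20)
```

For an Fs-based strategy the same function would be called with `stable_if="above"`, because larger factors of safety indicate more stable slopes, whereas for Dn and Pf smaller values do.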