Haiping Yuan, Yangyao Zou, Hengzhe Li, Shuaijie Ji, Ziang Gu, Liu He, Ruichao Hu, Assessment of peak particle velocity of blast vibration using hybrid soft computing approaches, Journal of Computational Design and Engineering, Volume 12, Issue 2, February 2025, Pages 154–176, https://doi.org/10.1093/jcde/qwaf007
Abstract
Blasting vibration is a major adverse effect in rock blasting excavation, and accurately predicting its peak particle velocity (PPV) is vital for ensuring engineering safety and risk management. This study proposes an innovative IHO-VMD-CatBoost model that integrates variational mode decomposition (VMD) and the CatBoost algorithm, with hyperparameters globally optimized using the improved hippopotamus optimization (IHO) algorithm. Compared to existing models, the proposed method improves feature extraction from vibration signals and significantly enhances prediction accuracy, especially in complex geological conditions. Using measured data from open-pit mine blasting, the model extracts key features such as maximum section charge, total charge, and horizontal distance, achieving superior performance compared to 13 traditional models. It reports a root mean square error of 0.28 cm/s, a mean absolute error of 0.17 cm/s, an index of agreement of 0.993, and a variance accounted for value of 97.28%, demonstrating superior prediction accuracy, a high degree of fit with observed data, and overall robustness in PPV prediction. Additionally, analyses based on the SHapley Additive Explanations framework provide insights into the complex nonlinear relationships between factors like horizontal distance and maximum section charge, improving the model’s interpretability. The model demonstrates robustness, stability, and applicability in various tests, confirming its reliability in complex engineering scenarios, and offering a valuable solution for safe mining and optimized blasting design.

Highlights

- A hybrid IHO-VMD-CatBoost model is developed for accurate PPV prediction in blast vibration.
- An improved hippopotamus optimization (IHO) algorithm is proposed for optimizing hyperparameters in predictive models.
- The IHO-VMD-CatBoost model significantly enhances the accuracy of peak particle velocity prediction in blasting vibration.
- Hybrid soft computing approaches improve the model's robustness in complex engineering tasks.
- SHAP analysis is incorporated for better interpretability of model predictions.
- The proposed approach provides valuable insights for safe and efficient blasting practices in construction and mining operations.
1. Introduction
The blasting method is widely used in open-pit mining due to its cost-effectiveness and efficiency (Bai et al., 2023; Guo et al., 2024; Ke et al., 2022). However, the process inevitably produces adverse effects such as vibration, flying rocks, air shock waves, and dust (Fan et al., 2023; Gou et al., 2020). Among these, vibration is of particular concern: most of the energy generated during blasting is converted into stress waves that propagate through rock and other media, negatively impacting nearby structures. To mitigate the hazards of blasting vibrations and ensure safety, controlling vibration intensity is crucial. The peak particle velocity of blasting vibration is often used to assess its impact on the surrounding environment and structures, making accurate prediction of this velocity essential before initiating blasting operations (Fan et al., 2024; Hosseini et al., 2023; Shahri et al., 2021; Singh et al., 2021).
Over the past decades, researchers have developed various empirical models to predict blast vibration velocities based on extensive field experiments. These models typically rely on the proportional distance relationship, primarily considering maximum section charge and the distance from the blast center (Kadingdi et al., 2022). While traditional empirical models are simple, convenient, and quick to calculate, their prediction accuracy is limited due to the oversimplification of influencing factors. These models are generally applicable only under specific site conditions and often fail to account for the impact of various factors in complex terrains, falling short of meeting rising environmental and safety standards. Therefore, accurately predicting peak blasting vibration velocity is of great engineering significance for ensuring the safety of open-pit mining operations.
With advances in computer technology, artificial intelligence models have been increasingly applied in engineering due to their powerful ability to address nonlinear continuous function problems. These models have shown promising development trends. As awareness of the limitations of traditional empirical models deepens, more researchers have turned to artificial intelligence models to predict blasting vibrations, effectively addressing the complexity of factors influencing peak particle velocity (PPV). Notable improvements in prediction accuracy have been achieved using models such as neural networks (Lawal et al., 2021; Liu et al., 2024), classification and regression trees (Komadja et al., 2022), and support vector machines (Nguyen et al., 2023). Despite the significant advancements made, current research still faces several challenges, such as the limited adaptability of traditional models to complex geological conditions and variable environments, as well as the inability of single learner models to achieve high prediction accuracy. Thus, a research gap persists in developing more advanced algorithmic models and integrated strategies to improve the precision of blasting vibration prediction (Fissha et al., 2024; Xie et al., 2024; Zhang et al., 2023).
The gradient boosting decision tree (GBDT) framework, based on the Boosting technique, integrates multiple weak learners to create a robust ensemble learning model (Peng et al., 2021; Tian et al., 2024). Within the GBDT family, the CatBoost algorithm offers high accuracy, resistance to overfitting, strong noise immunity, and excellent handling of discrete data. Despite advances in the prediction of blasting effects, ensemble learning models still face challenges related to model construction and parameter optimization, with hyperparameter selection playing a key role in prediction accuracy.
In recent years, bionic algorithms such as the grey wolf optimizer (GWO; Yan et al., 2022), sparrow search algorithm (SSA; Zhou et al., 2022), whale optimization algorithm (WOA; Yang et al., 2020), firefly algorithm (Chandrahas et al., 2023; El-Shorbagy & El-Refaey, 2022), and random forest (He et al., 2022) have been successfully applied in engineering, opening new avenues for improving blasting vibration prediction models and enhancing their accuracy (Fattahi & Hasanipanah, 2020). The hippopotamus optimization (HO) algorithm has shown superior search capabilities, reduced parameter tuning, and improved convergence compared with similar algorithms, making it effective for hyperparameter optimization (Maurya et al., 2024). However, like other metaheuristic algorithms (Cheng et al., 2021; Gorji, 2023), HO is susceptible to the stochastic nature of its search process, leading to a tendency to get trapped in local optima, which limits its ability to guarantee global optimal solutions.
To overcome HO’s tendency to converge on local optima and its limited precision in convergence, this study introduces three enhancement strategies: the Tent chaotic mapping strategy, the Cauchy perturbation strategy, and the adaptive weight factor strategy. These improvements culminate in the development of the improved hippopotamus optimization (IHO) algorithm. To further enhance prediction accuracy, the variational mode decomposition (VMD) method is integrated. VMD’s adaptive decomposition mechanism and nonrecursive nature make it particularly well-suited for high-precision, robust applications (Ding et al., 2021). Consequently, this study proposes a novel IHO-VMD-CatBoost model for predicting blasting vibration velocity, combining the strengths of VMD and the IHO algorithm. The model’s feasibility and stability in predicting blasting vibration velocity are validated through comparisons with multiple models and various evaluation metrics in the context of an open-pit mining application.
The primary contributions of this study are outlined as follows:
Development of an integrated predictive model: A predictive model combining VMD with CatBoost is proposed to accurately predict the PPV of blast-induced vibrations. The VMD algorithm excels in decomposing complex signals and extracting critical features, significantly enhancing the CatBoost model’s accuracy and generalization performance in handling intricate datasets. This model provides a robust and effective tool for predicting blast-induced vibrations, especially under challenging terrain and geological conditions.
Enhancements to the IHO algorithm: The IHO algorithm is enhanced by integrating the Tent chaotic mapping strategy to improve the diversity and quality of the initial population distribution. Additionally, the Cauchy perturbation and adaptive weighting strategies are incorporated to enhance search efficiency and optimization capabilities. These advancements notably improve the efficiency of hyperparameter tuning, particularly in the proposed predictive model.
Validation of IHO algorithm’s optimization performance: Through benchmark tests, the enhanced IHO algorithm is shown to outperform traditional optimization algorithms, such as GWO and WOA, in terms of optimization efficiency and solution quality. In the context of blast vibration prediction, the IHO-VMD-CatBoost model achieves superior prediction accuracy and reliability when compared with 13 traditional machine learning models, underscoring its state-of-the-art performance.
Interpretability and practical significance using SHAP analysis: The critical factors influencing blast-induced vibrations, including maximum section charge and horizontal distance, are identified and quantified using SHapley Additive exPlanations (SHAP) analysis. This analysis not only enhances the interpretability of the predictive model but also provides valuable, data-driven insights for optimizing blasting strategies, thereby improving safety and efficiency in practical engineering applications.
2. Research Methodology
The overall framework of this study is depicted in Figure 1. Section 3 elaborates on the engineering geological characteristics pertinent to the case study, details the selection of model input variables, and provides descriptive statistical analysis along with correlation analysis of these parameters. Section 4 offers an in-depth overview of the underlying principles of the employed algorithms and introduces the enhanced IHO algorithm, which is based on the HO model, leading to the development of the IHO-VMD-CatBoost model for predicting peak blast vibration velocities. Section 5 presents a comparative analysis of the prediction results generated by the IHO-VMD-CatBoost model, highlighting the model’s superior performance in predicting peak blast vibration velocity. Section 6 analyzes the influencing factors from both global and local perspectives using the SHAP methodology and provides corresponding engineering recommendations. Section 7 evaluates the model’s stability, robustness, and applicability, particularly under complex conditions. Finally, Section 8 summarizes the key findings, identifies the current study’s limitations, and proposes potential improvements along with directions for future research.

3. Data Source and Analysis
Blasting-induced vibration is influenced by multiple factors, which can be broadly categorized into the characteristics of the blasting source, the properties of the propagation medium, and the performance parameters of the measurement instruments. The primary factors affecting blasting-induced vibration are as follows:
Maximum section charge (X1, kg): This parameter represents the maximum amount of explosive used in a single blasting section, directly determining the magnitude of the blasting energy. It is quantified by recording the charge weight for each blasting hole section.
Total charge (X2, kg): The total charge refers to the cumulative weight of explosives used throughout the entire blasting process. It is measured by summing the charge weights of all blasting holes.
Horizontal distance (X3, m): The horizontal distance between the blasting source and the measurement point is a key factor in vibration attenuation, as it determines the distance the vibration waves must travel. This parameter can be measured directly on-site using a laser rangefinder or geodetic instruments.
Elevation difference (X4, m): The elevation difference between the blasting point and the monitoring point influences the propagation of vibration, as it affects the energy diffusion. This value can be obtained through topographical surveys or GPS measurements.
Minimum resistance line length (X5, m): This parameter represents the shortest path from the explosive to the free surface, affecting the efficiency of energy release. It is typically measured on-site using a combination of geodetic methods and manual measurements.
Differential time (X6, ms): Differential time refers to the time interval between successive detonations in the blasting sequence. Controlled and recorded by the blasting initiation system, this parameter plays a critical role in optimizing energy release and minimizing peak vibration by reducing constructive interference.
Rock integrity index (X7): The rock integrity coefficient indicates the degree of intactness of the rock mass, which significantly affects the propagation of vibration waves. It is usually determined through geological surveys, rock quality designation evaluations, or core sample analysis.
Rock firmness coefficient (X8): This coefficient reflects the hardness and density of the rock, influencing the rate of vibration attenuation. It can be measured using compressive strength tests or on-site hardness assessment equipment.
Angle between the measurement point and the minimum resistance line (X9, °): The angle between the measurement point and the direction of the minimum resistance line affects the directness of vibration propagation. This parameter is measured using geodetic instruments to ensure accurate recording of spatial orientation.
In this study, PPV (Y, cm/s) is chosen as the dependent variable, while the aforementioned nine parameters serve as independent variables. These variables provide a comprehensive framework for analyzing and predicting the vibration behavior induced by blasting, thus contributing to improved safety and control measures in blasting operations.
To develop a robust prediction model for blasting vibrations, this study utilizes 137 sets of measurement data collected from blasting vibration experiments conducted by Shi (2008) at the Copper Mountain open-pit mine in China. These data form the foundation for developing and validating the proposed model. A geological sketch of the study area is provided in Figure 2, offering additional context regarding the geological conditions of the measurement site (Zhao et al., 2023).

Figure 2. (a) Geographic outline map of China, (b) outline map of Hubei Province, and (c) geological sketch map of the study area.
3.1. Descriptive statistics
The descriptive statistics of the study variables are presented in Figure 3 (violin plot) and Table 1, which display the minimum, maximum, and median values. The violin plot effectively visualizes the distribution, spread, and variability of each parameter, while Table 1 provides precise numerical values that complement the range of these variables.

Table 1. Descriptive statistics of the study variables.

| Parameters | Minimum | Maximum | Median |
|---|---|---|---|
| X1 (kg) | 160.00 | 5590.00 | 840.00 |
| X2 (kg) | 936.00 | 9000.00 | 4360.00 |
| X3 (m) | 31.50 | 444.30 | 140.60 |
| X4 (m) | 6.00 | 109.30 | 53.00 |
| X5 (m) | 4.00 | 7.00 | 5.00 |
| X6 (ms) | 25.00 | 100.00 | 50.00 |
| X7 | 0.30 | 0.78 | 0.55 |
| X8 | 5.00 | 8.00 | 6.00 |
| X9 (°) | 0 | 180.00 | 160.00 |
| Y (cm s−1) | 0.101 | 7.470 | 0.825 |
The descriptive statistics reveal significant differences among the various parameters, highlighting the inherent heterogeneity of the blasting conditions. For example, the maximum section charge (X1) shows a wide range, from 160 to 5590 kg, whereas the total charge (X2) spans from 936 to 9000 kg. Other key parameters, such as the elevation difference (X4) and the minimum resistance line length (X5), exhibit similar levels of variability. These substantial differences emphasize the diversity of field conditions and the complex factors that influence blasting-induced vibration.
These descriptive statistics are essential for capturing the inherent variability within the data, which forms a critical foundation for developing the blasting vibration prediction model. By quantifying the range and distribution of the influencing parameters, this preliminary analysis provides valuable context for subsequent modeling efforts, ensuring that key factors are accurately represented and the model can be effectively applied in diverse scenarios.
3.2. Pearson correlation coefficient
To assess the relationships between the study variables, the strength of correlation is classified as follows: ±0.81 to ±1.00 indicates a strong correlation, ±0.61 to ±0.80 a moderate to strong correlation, ±0.41 to ±0.60 a moderate correlation, ±0.21 to ±0.40 a weak correlation, and ±0.00 to ±0.20 a negligible correlation. Figure 4 visually illustrates these relationships, displaying the correlation coefficients between the variables.

Figure 4. Correlation coefficient for variables available in the complete database.
From Figure 4, the following correlations are observed: (a) The rock integrity coefficient (X7) and the rock firmness coefficient (X8) exhibit a moderate to strong positive correlation (R = 0.68). (b) The maximum section charge (X1) and total charge (X2) show a moderate correlation (R = 0.43). (c) The rock integrity index (X7) and the angle between the measurement point and the minimum resistance line (X9) display a negligible correlation (R = 0.089). (d) The rock firmness coefficient (X8) and horizontal distance (X3) exhibit negligible correlation (R = 0.078). (e) The elevation difference (X4) and differential time (X6) show no significant correlation (R = 0.01). (f) The horizontal distance (X3) and minimum resistance line length (X5) also display negligible correlation (R = 0.01).
These results suggest that most of the variables in this study have either very weak correlations or no significant correlation at all. This is a crucial observation, as it implies that the majority of the variables are independent, thereby reducing the risk of multicollinearity in the model. Understanding these relationships helps identify the key factors that most influence PPV prediction, allowing for their prioritization in the model development process.
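For illustration, a correlation screen of this kind could be reproduced along these lines (a sketch only; the file name and column labels are assumptions matching the variable definitions in Section 3):

```python
import pandas as pd

# Placeholder file; columns X1..X9 and Y follow the definitions in Section 3.
df = pd.read_csv("blasting_data.csv")
cols = [f"X{i}" for i in range(1, 10)] + ["Y"]

# Pairwise Pearson correlation matrix, as visualized in Figure 4.
corr = df[cols].corr(method="pearson")

def strength(r: float) -> str:
    """Classify |R| with the thresholds used in this section."""
    r = abs(r)
    if r > 0.80:
        return "strong"
    if r > 0.60:
        return "moderate to strong"
    if r > 0.40:
        return "moderate"
    if r > 0.20:
        return "weak"
    return "negligible"

print(corr.round(3))
print("X7 vs X8:", strength(corr.loc["X7", "X8"]))  # R = 0.68 -> moderate to strong
```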
3.3. Multicollinearity analysis
Multicollinearity analyses were performed to assess potential interdependencies among the predictor variables in the model. Multicollinearity is a common issue in regression analysis, particularly when multiple input variables exhibit high correlations, which can affect the stability and predictive accuracy of the model. To quantify the degree of multicollinearity, the variance inflation factor (VIF) was employed, a widely recognized metric for evaluating linear dependence between variables. Multicollinearity is categorized as follows: VIF > 10 indicates “problematic multicollinearity,” 5 < VIF ≤ 10 represents “moderate multicollinearity,” 2.5 < VIF ≤ 5 denotes “considerable multicollinearity,” and VIF ≤ 2.5 indicates “weak multicollinearity” (Khatti & Grover, 2023). The results of this analysis, presented in Table 2, are used to evaluate the potential multicollinearity of the variables.
Table 2. Multicollinearity analysis of the predictor variables.

| Predictors | Coefficients | Standard error | t Stat | p-Value | Lower 95% | Upper 95% | R² | VIF | Level |
|---|---|---|---|---|---|---|---|---|---|
| X1 | 0.0005 | 0.0001 | 5.4082 | 0 | 0.0003 | 0.0007 | 0.1231 | 1.4602 | Weak |
| X2 | 0.0001 | 0 | 1.4441 | 0.1512 | 0 | 0.0002 | 0.1069 | 1.4524 | Weak |
| X3 | −0.0078 | 0.0011 | −7.1471 | 0 | −0.01 | −0.0057 | 0.2608 | 1.9009 | Weak |
| X4 | −0.0045 | 0.0043 | −1.0544 | 0.2937 | −0.013 | 0.004 | 0.1657 | 1.7863 | Weak |
| X5 | 0.3551 | 0.1305 | 2.7201 | 0.0074 | 0.0968 | 0.6134 | 0.1131 | 1.7925 | Weak |
| X6 | −0.0046 | 0.0041 | −1.1408 | 0.2561 | −0.0127 | 0.0034 | 0.0491 | 1.8037 | Weak |
| X7 | 3.6482 | 0.9143 | 3.9903 | 0.0001 | 1.8389 | 5.4575 | 0.0072 | 2.2621 | Weak |
| X8 | −0.1715 | 0.1299 | −1.3201 | 0.1892 | −0.4286 | 0.0856 | 0.0138 | 2.7302 | Considerable |
| X9 | 0.0034 | 0.0016 | 2.0793 | 0.0396 | 0.0002 | 0.0066 | 0.0996 | 1.572 | Weak |
| Constant | −0.8441 | 0.9543 | −0.8845 | 0.3781 | −2.7326 | 1.0445 | − | − | − |
As shown in Table 2, the rock firmness coefficient (X8) has a VIF value of 2.73, indicating "considerable multicollinearity." While this suggests some linear dependence between X8 and other variables, the correlation is not strong enough to undermine the robustness of the model. Therefore, X8 remains in the dataset, although potential collinearity issues should be monitored during subsequent stages of model development. In contrast, most other variables, such as maximum section charge (X1) and total charge (X2), exhibit VIF values below 2.5, indicating "weak multicollinearity." This minimal linear dependence reduces the risk of multicollinearity, thereby enhancing the model's stability and interpretability.
In conclusion, while X8 shows considerable multicollinearity, the remaining variables exhibit minimal linear dependence, which is essential for ensuring the robustness and reliability of the model. The current levels of multicollinearity do not pose a significant threat to the accuracy or applicability of the model, and therefore there is no immediate need for variable removal or extensive data adjustments.
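As a sketch of how the VIF screening above could be scripted (the file and column names are placeholders, not the authors' code):

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

df = pd.read_csv("blasting_data.csv")                      # placeholder dataset
X = sm.add_constant(df[[f"X{i}" for i in range(1, 10)]])   # predictors + intercept

def vif_level(v: float) -> str:
    """Categories follow Khatti & Grover (2023) as cited above."""
    if v > 10:
        return "problematic"
    if v > 5:
        return "moderate"
    if v > 2.5:
        return "considerable"
    return "weak"

for idx, name in enumerate(X.columns):
    if name == "const":
        continue
    v = variance_inflation_factor(X.values, idx)
    print(f"{name}: VIF = {v:.4f} ({vif_level(v)})")
```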
4. Computational Approaches
4.1. Categorical boosting algorithm
The CatBoost algorithm (Katlav & Ergen, 2024), a highly efficient ensemble learning framework based on gradient boosting, is illustrated in Figure 5. The dotted line in the figure indicates the sequential construction of N CART trees using N Bootstrap samples. CatBoost employs a fully symmetric decision tree structure as its base model, optimizing both the tree architecture and leaf node values during training. To ensure the balance of the tree structure and reduce the risk of overfitting, the algorithm quantizes numerical features. Moreover, it determines a novel traversal sequence by randomly permuting the dataset. When converting categorical features to numerical values, the final representation is determined by averaging the category labels over the preceding records and applying a weighting coefficient together with a prior. For the ith categorical feature of the kth sample, the corresponding formula is presented in Equation 1. In this equation, $x_{i,k}$ represents the ith categorical feature of the kth sample, $x_{i,j}$ denotes the ith categorical feature of the jth sample prior to the kth sample, $y_j$ is the labeled value of the jth sample, and $D_k$ is the dataset up to the kth sample. The parameter $p$ is the added prior, and $a$, typically greater than zero, is the weighting coefficient. The indicator function $[\cdot]$ outputs 1 if the condition is met and 0 otherwise. This design enables the model to effectively leverage prior sample information at each iteration, generating unbiased gradient estimates that enhance both model accuracy and generalization:
$$\hat{x}_{i,k} = \frac{\sum_{x_j \in D_k} \left[ x_{i,j} = x_{i,k} \right] \cdot y_j + a \cdot p}{\sum_{x_j \in D_k} \left[ x_{i,j} = x_{i,k} \right] + a} \tag{1}$$
Compared to other boosting-based ensemble learning algorithms, CatBoost demonstrates exceptional proficiency in handling discrete feature data, making it particularly effective for addressing problems with multiple feature inputs. This makes it an ideal solution for predicting blasting vibrations, providing critical insights into vibration velocity in complex terrains, and thus contributing to the safety of mining operations.
Nonetheless, CatBoost’s performance is highly sensitive to parameter configurations, which significantly impact model prediction accuracy. To enhance predictive performance, optimizing the model’s parameters using the IHO algorithm is essential. This study focuses on optimizing four critical parameters of the CatBoost model: iterations, learning rate, depth, and L2_leaf_reg. These optimizations aim to accelerate training, enhance prediction accuracy, and reduce overfitting.
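As a minimal sketch of the configuration being tuned (the numeric values below are placeholders; in this study the four hyperparameters are selected by the IHO algorithm rather than fixed by hand):

```python
import numpy as np
from catboost import CatBoostRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data with nine features, mirroring X1..X9 -> Y.
rng = np.random.default_rng(0)
X, y = rng.random((137, 9)), rng.random(137)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, shuffle=False)

model = CatBoostRegressor(
    iterations=500,      # number of boosting rounds (IHO-tuned in the study)
    learning_rate=0.05,  # step size of each boosting update (IHO-tuned)
    depth=6,             # depth of the symmetric trees (IHO-tuned)
    l2_leaf_reg=3.0,     # L2 regularization on leaf values (IHO-tuned)
    loss_function="RMSE",
    verbose=False,
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
```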
4.2. Variational mode decomposition algorithm
VMD is an adaptive, nonrecursive method designed for signal decomposition (Su et al., 2024). The primary objective of the VMD algorithm is to facilitate the prediction of blasting vibration velocity by decomposing the signal $f$ into $K$ intrinsic mode functions (IMFs), where each modal component represents a distinct oscillatory mode within the signal. Each IMF is associated with a specific center frequency and amplitude, enabling the extraction of key patterns and features embedded within the signal. The variational constraint model, as illustrated in Equation 2, defines $\{u_k\}$ as the set of modal components, $\{w_k\}$ as the corresponding center frequencies, $*$ as the convolution operator, and $\delta(t)$ as the Dirac distribution, which is infinite at a single point, zero elsewhere, and integrates to 1. Here, $\partial_t$ denotes the partial derivative with respect to time $t$, and $j$ represents the imaginary unit. The VMD algorithm iteratively updates each modal component $u_k(t)$ along with its corresponding center frequency $w_k$; based on the updated $u_k$, the new center frequency $w_k$ is recalculated iteratively until the objective function converges to its minimum value, thereby yielding the optimal solution.
By decomposing complex signals into multiple IMFs, the VMD algorithm isolates significant frequency components, filtering out noise and revealing underlying structures in the data. This decomposition refines the feature inputs for CatBoost, enhancing the model's prediction accuracy and generalization capability, and it mitigates the impact of noise during training, improving CatBoost's robustness when handling nonlinear and non-smooth data. In this study, the VMD algorithm functions as a critical preprocessing step that transforms the raw data into more separable and interpretable components, facilitating downstream feature extraction in machine learning algorithms such as CatBoost, as shown in Figure 1, and enabling the model to more effectively capture complex data patterns, ultimately improving predictive performance.
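For reference, the constrained variational model that Equation 2 denotes is conventionally written as follows (a reconstruction of the standard VMD formulation, consistent with the symbols defined above):

$$\min_{\{u_k\},\{w_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j w_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)$$

In practice, such a decomposition can be obtained with an off-the-shelf implementation; the sketch below uses the vmdpy package, which is an assumption about tooling rather than the authors' stated implementation:

```python
import numpy as np
from vmdpy import VMD  # pip install vmdpy

# Synthetic stand-in for a measured vibration-related signal.
t = np.linspace(0, 1, 1000, endpoint=False)
f = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)

alpha = 2000   # penalty factor (bandwidth constraint); IHO-tuned in the study
tau = 0.0      # noise tolerance (0 enforces exact reconstruction)
K = 4          # number of modes; IHO-tuned in the study
DC = 0         # do not impose a DC mode
init = 1       # initialize center frequencies uniformly
tol = 1e-7     # convergence tolerance

u, u_hat, omega = VMD(f, alpha, tau, K, DC, init, tol)
# u: the K intrinsic mode functions; omega: center-frequency trajectory.
```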
4.3. Improved group intelligence algorithm strategy
4.3.1. Hippopotamus optimization algorithm
The hyperparameters of CatBoost are typically determined through empirical methods; however, such an approach often falls short of ensuring optimal model performance. To address this limitation, population-based intelligence optimization algorithms offer a structured framework for exploring vast parameter spaces, enabling the identification of optimal hyperparameter configurations and considerably enhancing the model’s predictive accuracy. In 2024, Mohammad Hussein Amiri introduced the hippopotamus optimization (HO) algorithm, a novel nature-inspired algorithm that emulates the foraging behavior of hippopotamuses (Amiri et al., 2024). Hippopotamus groups mainly consist of adult females, juveniles, and dominant males. During foraging, hippos modify their routes in response to environmental changes, guided by continuous observation of their surroundings. The HO algorithm leverages this adaptive behavior by dynamically adjusting its search direction based on the current distribution of population solutions and the trends in objective function values. The HO algorithm is structured into three distinct phases:
Updating the hippopotamus’s position within the population
In the initial phase of population dynamics, male hippopotamuses are responsible for safeguarding both the group and their territory, while females tend to cluster around them. Adult male hippos establish dominance through competitive interactions, with their hierarchical position updated based on Equation 3. In this context, $X_i$ represents the position of the ith hippopotamus, and $x_{i,j}$ refers to the jth component of the decision vector within $X_i$. The position of the dominant male in the population is indicated as $X_i^{Mho}$, with $x_{i,j}^{Mho}$ representing its jth dimension. Additionally, $D^{ho}$ denotes the position of the dominant hippopotamus within the population. The variable $I$ is an integer that can take the value of either 1 or 2, and $r$ is a random number drawn from the interval [0,1]. The indices i and j are defined as i = 1, 2, …, N/2 and j = 1, 2, …, m, respectively:
If the updated position of the male hippopotamus ($X_i^{Mho}$) or the female or immature hippopotamus ($X_i^{FBho}$) results in an improvement in the objective function, these new positions are accepted into the population matrix ($X_i$); otherwise, the previous values of the population matrix are retained. In this context, $F_i^{Mho}$ and $F_i^{FBho}$ represent the updated objective function values for the male and female or immature hippopotamuses, respectively, while $F_i$ denotes the original objective function value. This process is outlined as follows:
Defending against predators
The primary function of hippopotamus herd behavior is to defend against predators, with a particular focus on protecting vulnerable individuals, such as the young and sick. Hippopotamuses compel predators to retreat by swiftly turning towards them, vocalizing, or actively approaching. The position of the predator in the search space is represented by Equation 6, where $Predator$ denotes the position of the predator, $Predator_j$ represents the jth dimension of the predator's position, $\vec{r}_1$ is a random vector within the range [0,1], and $ub_j$ and $lb_j$ are the upper and lower bounds of the decision variable, respectively.
Equation 7 defines the distance between the ith hippopotamus and the predator. When the predator is in close proximity, the hippopotamus rapidly turns towards it and actively advances, compelling the predator to retreat. In contrast, if the predator or intruder is positioned farther from the hippopotamus's territory, the response is more measured; the hippopotamus turns towards the predator but exhibits a limited range of movement, signaling that the predator has entered its territory. If the predator's objective function value, $F_{Predator_j}$, exceeds that of the hippopotamus, $F_i$, it indicates that the hippopotamus has been predated, prompting another hippopotamus to take its place within the group. Conversely, if the predator flees, the hippopotamus returns to the herd. This process is summarized in Equation 8, where $X_i^{HoR}$ denotes the updated position of the hippopotamus during the defense phase, and $F_i^{HoR}$ represents the updated objective function value in the predator defense phase:
Escaping from predators
When confronted with an overwhelming predator, hippopotamuses typically retreat to nearby water sources. This behavior can be modeled as part of the optimization process in local search. By generating random positions and evaluating a cost function, the hippopotamus assesses whether it has identified a safe location and adjusts its position accordingly, as represented in Equation 9:
In Equation 9, $x_{i,j}^{Ho\varepsilon}$ denotes the position selected by the hippopotamus during its search for the nearest safe location. The parameter $s$ is a random vector or value drawn from one of three scenarios: (1) a random vector between 0 and 1, (2) a random number within the range of 0 to 1, or (3) a normally distributed random number. The term $r_2$ is a random number within the range of 0 to 1, and $t$ serves as the iteration counter. The variables $ub_j^{local}$ and $lb_j^{local}$ define the upper and lower bounds, respectively, for the local decision variables. The function $F_i^{Ho\varepsilon}$ represents the value of the objective function after updating the position during the predator escape phase.
If the updated position $X_i^{Ho\varepsilon}$ results in an improvement in the objective function, it will be accepted and incorporated into the population matrix $X_i$, as detailed in Equation 11:
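As a schematic of how the three phases fit together (not the authors' implementation; the detailed update rules of Equations 3 to 11 are abstracted into simple placeholder moves):

```python
import numpy as np

def hippopotamus_optimize(obj, lb, ub, n_pop=30, n_iter=500, seed=0):
    """Schematic HO loop: position update, predator defense, and escape,
    each followed by greedy acceptance of improving candidates."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    dim = lb.size
    X = lb + rng.random((n_pop, dim)) * (ub - lb)   # initial population
    F = np.array([obj(x) for x in X])

    for t in range(1, n_iter + 1):
        best = X[np.argmin(F)]                      # dominant hippopotamus
        for i in range(n_pop):
            # Phase 1: move relative to the dominant individual (I in {1, 2}).
            I = rng.integers(1, 3)
            cand = X[i] + rng.random(dim) * (best - I * X[i])
            # Phase 2: react to a randomly placed "predator".
            predator = lb + rng.random(dim) * (ub - lb)
            cand += 0.05 * rng.random(dim) * (predator - X[i])
            # Phase 3: escape, i.e. a local search that shrinks over time.
            cand += (2 * rng.random(dim) - 1) * (ub - lb) / t
            cand = np.clip(cand, lb, ub)
            f_cand = obj(cand)
            if f_cand < F[i]:                       # greedy acceptance
                X[i], F[i] = cand, f_cand
    return X[np.argmin(F)], F.min()

# Example: minimize the sphere function f1 in [-100, 100]^10.
x_best, f_best = hippopotamus_optimize(lambda x: np.sum(x**2),
                                       [-100] * 10, [100] * 10,
                                       n_pop=30, n_iter=200)
```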
4.3.2. Improved hippopotamus optimization algorithm
The limitations of the HO algorithm are similar to those of all meta-heuristic algorithms, including the lack of guarantee in obtaining the global optimum due to the random search process. To address the tendency of the HO algorithm to fall into local optima and its poor convergence accuracy, this study introduces three strategies: the Tent chaotic mapping strategy (Khalil et al., 2021), the Cauchy perturbation strategy (Ru, 2024), and the adaptive weight factor strategy (Wang et al., 2024). By integrating these strategies, an IHO algorithm is developed, as illustrated in Figure 6. The IHO algorithm enhances search efficiency and optimization effectiveness while effectively mitigating the issue of local optima entrapment. Furthermore, these strategies expedite the prediction model’s parameter optimization, thereby overcoming challenges in model parameter tuning and significantly improving prediction accuracy.

The Tent chaotic mapping strategy
To enhance the quality of the solutions obtained by the HO algorithm through subsequent optimization, this study introduces Tent chaotic mapping to generate a well-distributed initial population. Tent chaotic mapping is widely utilized due to its excellent global traversal properties, which effectively expand the search range of the initial population and significantly improve the algorithm's global search capability. Specifically, the mathematical expression of Tent chaotic mapping is shown in Equations 12 and 13, where $y_{i,d}$ represents the component of the dth mapping sequence in the ith row. The chaotic sequence is mapped to the solution space, resulting in a new expression for the population sequence. This ensures an effective combination of randomness and determinism, thereby optimizing the algorithm's performance:
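The Tent map referenced by Equations 12 and 13 is commonly written in the following standard form (a reconstruction under that assumption), with the chaotic sequence then mapped onto the search bounds:

$$y_{i+1,d} = \begin{cases} \dfrac{y_{i,d}}{\theta}, & 0 \le y_{i,d} < \theta \\[4pt] \dfrac{1 - y_{i,d}}{1 - \theta}, & \theta \le y_{i,d} \le 1 \end{cases} \qquad x_{i,d} = lb_d + y_{i,d}\left( ub_d - lb_d \right)$$

where $\theta$ is the chaos parameter (often taken as 0.5) and $lb_d$, $ub_d$ are the bounds of the dth decision variable.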
The Cauchy perturbation strategy
To address the HO algorithm's tendency to fall into local optima, this study incorporates the Cauchy perturbation strategy to enhance population diversity, thereby improving the algorithm's global search capability and expanding the search space. The Cauchy distribution, with its smoother decline on both sides of the peak, mitigates the constraints imposed on hippopotamus individuals by local extreme points. By applying Cauchy perturbation, the constraints on individuals at local extreme points are reduced, decreasing the search time in neighboring intervals and allocating more resources to the global optimum search. This adjustment enables the IHO algorithm to exhibit superior performance in locating the global optimum. The standard density function of the Cauchy distribution is presented in Equation 14. Furthermore, Cauchy perturbation is applied to the variable $X_i^{Ho\varepsilon}$ in the IHO algorithm, as shown in Equation 15:
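The standard Cauchy density referred to as Equation 14 is:

$$f(x) = \frac{1}{\pi \left( 1 + x^2 \right)}, \qquad -\infty < x < +\infty$$

The perturbation of Equation 15 is typically of the multiplicative form $X_i^{Ho\varepsilon\prime} = X_i^{Ho\varepsilon} + X_i^{Ho\varepsilon} \cdot \mathrm{Cauchy}(0,1)$; this specific form is a common convention and an assumption here rather than a quotation of the original equation.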
The adaptive weight factor strategy
The inertia weight factor is a crucial parameter that regulates the search ability of the algorithm. When this parameter is large, the algorithm's global search capability is enhanced, increasing population diversity and allowing for the exploration of a wider area. Conversely, smaller inertia weights strengthen the algorithm's local search ability, enabling a more refined search near the potential optimal solution and thus accelerating convergence. To further optimize the global and local search abilities of the algorithm, this study introduces an adaptive weighting factor to improve Equations 8 and 11. This enhancement enables the algorithm to dynamically adjust the inertia weights based on the search stage and the current population state, thereby more effectively balancing the needs of exploration and exploitation. The specific improvement schemes and their mathematical expressions are detailed in Equations 17 and 18. In Equation 16, $\eta$ represents the adaptive weighting factor, while $\varepsilon$ denotes the parameter responsible for controlling the adaptive weighting, which is used to regulate the algorithm's search behavior. $T$ refers to the total number of iterations, and $t$ indicates the current iteration count:
4.3.3. Benchmark function testing
The convergence curves of the benchmark functions provide a clear visualization of both the convergence speed and solution accuracy of meta-heuristic algorithms, effectively demonstrating their ability to overcome local optima. To validate the effectiveness and robustness of the multistrategy fusion improvement scheme proposed for the IHO algorithm, six representative benchmark test functions from the IEEE CEC 2005 (Salgotra & Gandomi, 2024) suite were selected for numerical simulations. The outcomes of 30 independent simulation runs were compared against those of the HO algorithm, particle swarm optimization (PSO), grey wolf optimizer (GWO), WOA, seagull optimization algorithm (SOA), and SSA. The fundamental characteristics of the six benchmark functions are summarized in Table 3, with the population size for each algorithm set to 30 and the maximum number of iterations limited to 500.
Table 3. Benchmark test functions (S: search range; $f_{\min}$: theoretical optimum).

| Function expression | S | $f_{\min}$ |
|---|---|---|
| $f_1(x) = \sum_{i=1}^{n} x_i^2$ | $[-100, 100]^n$ | 0 |
| $f_2(x) = \sum_{i=1}^{n} \vert x_i \vert + \prod_{i=1}^{n} \vert x_i \vert$ | $[-10, 10]^n$ | 0 |
| $f_3(x) = \sum_{i=1}^{n} \left( \sum_{j=1}^{i} \vert x_j \vert \right)^2$ | $[-100, 100]^n$ | 0 |
| $f_{11}(x) = \frac{1}{4000} \sum_{i=1}^{n} x_i^2 - \prod_{i=1}^{n} \cos\left( \frac{x_i}{\sqrt{i}} \right) + 1$ | $[-600, 600]^n$ | 0 |
| $f_{12}(x) = \frac{\pi}{n} \left\{ 10 \sin(\pi y_1) + \sum_{i=1}^{n-1} (y_i - 1)^2 \left[ 1 + 10 \sin^2(\pi y_{i+1}) \right] + (y_n - 1)^2 \right\} + \sum_{i=1}^{n} u(x_i, 10, 100, 4)$, with $y_i = 1 + \frac{x_i + 1}{4}$ | $[-50, 50]^n$ | 0 |
| $f_{17}(x) = \left( x_2 - \frac{5.1}{4\pi^2} x_1^2 + \frac{5}{\pi} x_1 - 6 \right)^2 + 10 \left( 1 - \frac{1}{8\pi} \right) \cos x_1 + 10$ | $[-5, 10] \times [0, 15]$ | 0.398 |
Among the benchmark functions, f1, f2, and f3 are unimodal and primarily assess the algorithms' convergence speed and accuracy. In contrast, f11, f12, and f17 are multimodal, containing a single global optimum alongside multiple local optima, thereby evaluating the global search capability and local exploitation efficiency of the algorithms. The convergence curves of the six benchmark functions are illustrated in Figure 7, where the horizontal axis represents the number of iterations, and the vertical axis reflects the optimal value of the objective function following 30 independent runs for each test function.

From the convergence curves of f1, f2, and f3, it is evident that the IHO algorithm achieves the same solution accuracy as the HO algorithm within approximately 200 iterations, indicating a superior convergence rate. For the unimodal test functions, IHO displays a pronounced downward trend in its convergence curve, particularly during the initial iterations, achieving the optimal value sooner than both the GWO and the SSA. Conversely, the multimodal test functions, characterized by the presence of multiple local optima, primarily evaluate the algorithms' capability for global search and local exploitation. In the case of f11, the IHO algorithm demonstrates enhanced convergence compared to HO, effectively navigating past local optima and stabilizing near the global optimum. Similarly, for f12, IHO exhibits an excellent balance between global exploration and local exploitation, probing multiple local optima before converging towards a more optimal solution. Through adaptive strategy fusion, the IHO showcases robust global search capabilities, successfully preventing premature convergence.
In the test of f17, the IHO algorithm exhibits strong search performance during the early iterations, attaining the global optimum more rapidly than other algorithms, while also demonstrating greater robustness and stability. This observation suggests that IHO achieves a better balance between local and global exploration within complex search spaces.
In summary, the results illustrated in Figure 7 indicate that the IHO algorithm significantly enhances search performance relative to the HO algorithm, effectively preventing premature convergence. Furthermore, IHO demonstrates substantial improvements in stability, accuracy, and robustness compared to algorithms such as the GWO and the SSA. Overall, the benchmark results confirm that IHO, through multistrategy fusion, markedly enhances algorithmic stability, robustness, and global search capability, achieving faster convergence and higher solution accuracy than other meta-heuristic algorithms, particularly excelling in global search and avoiding local optima in multimodal problems.
4.4. SHapley Additive exPlanation method
As machine learning models become increasingly complex, understanding their inner workings and decision-making processes presents a significant challenge. The predictive accuracy demonstrated by these models alone is insufficient to ensure their credibility. Improving the interpretability of “black-box” models and clearly understanding their predictive basis has become essential for enhancing the generalization ability and credibility of machine learning algorithms in various applications.
In 2017, Lundberg and Lee (Fissha et al., 2023) proposed the SHAP (SHapley Additive exPlanation) method, which is based on the Shapley value from cooperative game theory. The Shapley value determines the fair distribution of profits or costs within a coalition. When extended to machine learning models, all features can be regarded as “contributors,” and the SHAP value represents the fair distribution of the predicted values generated in the prediction samples to each feature. The value assigned to each feature, also called the degree of contribution, influences the increase or decrease in the final model results (Maulana Kusdhany & Lyth, 2021).
For a given model f and input sample x, the SHAP value of feature i is calculated as shown in Equation 19. Here, N represents the set of all features; S is a subset of features excluding feature i; |S| denotes the number of elements in set S; |N| is the total number of features; $f_x(S \cup \{i\})$ and $f_x(S)$ are the predicted values of the model with and without feature i, respectively:
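$$\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|! \left( |N| - |S| - 1 \right)!}{|N|!} \left[ f_x\left( S \cup \{i\} \right) - f_x(S) \right] \tag{19}$$

(Equation 19 is reconstructed here in its classical Shapley-value form, which matches the definitions given above.)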
SHAP values provide a quantitative measure of each feature’s contribution to the predictions of a machine learning model. By obtaining SHAP values, the interpretability and transparency of the model are significantly improved, leading to a deeper understanding of the relationships between features. The calculation of these values ensures that the contribution of each feature is fairly considered. The SHAP method analyzes the impact of each feature on the model’s prediction results by calculating the degree to which each feature contributes to the model output. This approach further enhances the interpretability of black-box models or complex machine learning models.
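A minimal sketch of such an analysis with the shap package (assuming a fitted tree-ensemble model such as the CatBoost regressor sketched earlier; `model` and `X_train` are placeholders):

```python
import shap

# `model` is a fitted CatBoostRegressor; `X_train` holds the nine inputs.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_train)

# Global view: ranking of features by mean |SHAP| contribution.
shap.summary_plot(shap_values, X_train,
                  feature_names=[f"X{i}" for i in range(1, 10)])

# Local view: feature contributions to a single prediction.
shap.force_plot(explainer.expected_value, shap_values[0], X_train[0])
```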
4.5. Model building
In this study, the proposed model for predicting the PPV, based on a multistrategy fusion of CatBoost, is divided into three phases: decomposition, optimal model finding and prediction, and result analysis. The specific prediction process is as follows:
Step 1: The PPV data are assembled and pre-processed, and the IHO algorithm is used to optimize the VMD parameters, namely the penalty factor α and the number of decomposition layers K; during this search, the pre-processed data are fed into the CatBoost model for PPV prediction to evaluate candidate parameter settings. The pre-processed data are subsequently input into the VMD for decomposition.
Step 2: Using the optimized values of K and α obtained in Step 1, the original PPV sequence is decomposed using VMD to produce multiple subsequences IMF1, IMF2, …, IMFN, where N denotes the number of modal components.
Step 3: The four main parameters of the CatBoost model are optimized using IHO, and the individual modal components obtained from the VMD decomposition are input as features into the CatBoost model for prediction. During the result output stage, the IHO-VMD-CatBoost model prediction results are superimposed and reconstructed to produce the final predictions.
Step 4: The model's performance in predicting the PPV is evaluated using multiple metrics, including root mean square error (RMSE), mean absolute error (MAE), variance accounted for (VAF), performance index (PI), the ratio of RMSE to the standard deviation of the observations (RSR), normalized mean bias error (NMBE), mean absolute percentage error (MAPE), and Nash–Sutcliffe efficiency (NS), to comprehensively assess its accuracy, robustness, and reliability.
Step 5: The prediction results are interpreted and analyzed using the SHAP model. The analysis includes how each feature affects the model’s prediction results from three perspectives: individual sample features, global sample features, and feature interactions. The overall process is illustrated in Figure 8.
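A condensed sketch of Steps 1 to 4, with the IHO search abstracted away and library choices (vmdpy, catboost, scikit-learn) assumed rather than taken from the paper:

```python
import numpy as np
from vmdpy import VMD
from catboost import CatBoostRegressor
from sklearn.metrics import mean_squared_error

def predict_ppv(ppv, features, params):
    """Schematic VMD-CatBoost pipeline; `params` stands in for the values
    (alpha, K, iterations, learning_rate, depth, l2_leaf_reg) that the
    IHO algorithm would supply in Steps 1 and 3."""
    # vmdpy works with even-length signals; trim one sample if necessary.
    n = len(ppv) // 2 * 2
    ppv, features = ppv[:n], features[:n]

    # Step 2: decompose the PPV sequence into K modal components (IMFs).
    u, _, _ = VMD(ppv, params["alpha"], 0.0, params["K"], 0, 1, 1e-7)

    # Step 3: predict each IMF from the blasting features, then superimpose.
    n_train = int(0.7 * n)
    pred = np.zeros(n - n_train)
    for imf in u:
        model = CatBoostRegressor(
            iterations=params["iterations"],
            learning_rate=params["learning_rate"],
            depth=params["depth"],
            l2_leaf_reg=params["l2_leaf_reg"],
            verbose=False,
        )
        model.fit(features[:n_train], imf[:n_train])
        pred += model.predict(features[n_train:])

    # Step 4: one of the eight evaluation metrics (RMSE) as an example.
    rmse = mean_squared_error(ppv[n_train:], pred, squared=False)
    return pred, rmse
```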

5. Results
5.1. Parameter settings
The IHO-VMD-CatBoost model is employed to predict the PPV of blasting vibrations. To further verify its prediction performance and generalization ability, the model is compared with 13 regression prediction models, as detailed in the Appendix, Table A1 (Mehrabi Hashjin et al., 2024; Wang, Zhang et al., 2024). To avoid computational bias, the population size for optimization algorithms in the prediction models is set to 30, with the number of iterations set to 100. The first 70% of the data is used as the training set, while the remaining 30% is used as the test set. The mean results of each model after 30 runs are analyzed as the prediction results. The RMSE, MAE, VAF, PI, RSR, NMBE, MAPE, and NS were employed to comprehensively assess the prediction accuracy and overall performance of the models. The data experiments are conducted on a Windows 10, 64-bit operating system, with an Intel Core i7-9750H CPU @ 2.60GHz processor.
5.2. Effect of hyperparameters
To achieve high prediction accuracy and model reliability, it is essential to optimize key hyperparameters, as they significantly influence model convergence, stability, and generalization. Therefore, several hyperparameters in the VMD and CatBoost models were optimized using the IHO algorithm to enhance the prediction accuracy of the peak velocity of blast-induced vibrations. Figure 9 illustrates the effects of different hyperparameters, including the α parameter and the number of modes (K) for VMD, as well as iterations, learning rate (learning_rate), tree depth (depth), and L2 regularization (L2_leaf_reg) for CatBoost, on model performance as measured by RMSE. The specific settings for these key hyperparameters can be found in Table A3 (see the Appendix). By systematically tuning these parameters, an optimal balance was achieved between model convergence, prediction accuracy, and generalization capability.

Figure 9. Impact of key hyperparameters on IHO-VMD-CatBoost model performance.
Figure 9 shows that the key hyperparameters significantly affect the convergence speed and prediction accuracy of the model. For the CatBoost model, an appropriate number of iterations effectively reduces RMSE while mitigating overfitting. Moreover, a balanced learning_rate ensures stable convergence, thereby lowering prediction error. The tree depth determines the model’s capacity to capture data complexity, and selecting the optimal depth helps accurately capture underlying data patterns while avoiding overfitting. Additionally, L2_leaf_reg enhances the model’s generalization ability by constraining complexity, enabling better performance on unseen data.
For VMD, the α parameter controls the smoothness of the decomposed components, which directly influences noise suppression and feature extraction. An appropriate α value can efficiently reduce noise while preserving key features, thereby lowering RMSE. Similarly, the number of modes (K) governs the level of detail in the decomposition. An optimal K-value allows for effective feature capture while minimizing redundant information, resulting in improved prediction performance.
In summary, optimizing these critical hyperparameters enables the model to achieve a favorable balance between prediction accuracy and generalization capability. These findings underscore the pivotal role of hyperparameter optimization in enhancing model performance and provide valuable guidance for future improvements in similar applications.
5.3. Evaluation indicators
To comprehensively evaluate the predictive performance of the model for PPV induced by blasting, eight key performance metrics were utilized. These metrics cover various aspects of the model’s performance, including prediction error, model fit, and generalization capability, thereby ensuring a thorough and reliable assessment.
Firstly, RMSE and MAE quantify the magnitude of prediction errors, providing a direct measure of the model’s overall predictive accuracy. Lower values of RMSE and MAE indicate greater accuracy and reliability in PPV prediction. The VAF evaluates the proportion of variance in the observed data that the model can explain, thereby indicating the quality of the model’s fit. The PI further measures the model’s efficiency in capturing fundamental data patterns, while the RSR normalizes RMSE with respect to the standard deviation (Std) of the observed data, providing a standardized metric for prediction reliability.
In addition, the NMBE is employed to assess any systematic bias in model predictions, which helps in identifying tendencies towards overestimation or underestimation. The MAPE presents the average prediction error in percentage form, thereby enhancing the interpretability and usability of the model in practical contexts. Furthermore, the NS evaluates the model’s predictive performance in comparison to mean-based predictions, offering a robust benchmark for assessing the model’s overall efficacy. The formulas for calculating these metrics are provided in Equations 20 to 28, where $M_i$ and $P_i$ denote the actual and predicted values, respectively, $n$ represents the total number of data points, and $\overline{M}$ indicates the mean of the actual values.
When these metrics approach their ideal values, specifically RMSE = 0, MAE = 0, VAF = 100, PI = 2, RSR = 0, NMBE = 0, MAPE = 0, and NS = 1, the model demonstrates high accuracy and reliability in predicting PPV. Thus, these metrics collectively provide a comprehensive evaluation of the model’s strengths and weaknesses, offering clear guidance for subsequent optimization and performance enhancement.
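For reference, the standard definitions of these metrics, consistent with the ideal values quoted above, are given below; the PI formulation follows a convention common in geotechnical machine-learning studies and is adopted here as an assumption.

$$\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(M_i-P_i\right)^2},\qquad \mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|M_i-P_i\right|$$

$$\mathrm{VAF}=\left(1-\frac{\operatorname{var}\left(M_i-P_i\right)}{\operatorname{var}\left(M_i\right)}\right)\times 100,\qquad \mathrm{PI}=R^2+0.01\,\mathrm{VAF}-\mathrm{RMSE}$$

$$\mathrm{RSR}=\frac{\mathrm{RMSE}}{\mathrm{Std}\left(M_i\right)},\qquad \mathrm{NMBE}=\frac{1}{n\,\overline{M}}\sum_{i=1}^{n}\left(P_i-M_i\right)\times 100$$

$$\mathrm{MAPE}=\frac{100}{n}\sum_{i=1}^{n}\left|\frac{M_i-P_i}{M_i}\right|,\qquad \mathrm{NS}=1-\frac{\sum_{i=1}^{n}\left(M_i-P_i\right)^2}{\sum_{i=1}^{n}\left(M_i-\overline{M}\right)^2}$$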
5.4. Analysis of results
To comprehensively assess the predictive effectiveness of the IHO-VMD-CatBoost model and ensure the reliability and unbiasedness of the research results, 13 different regression models were selected for comparative experiments. These models range from traditional linear regression models such as Lasso-MLR to neural network models like Two-Layer BP, as well as other popular machine learning models, including the particle-swarm-optimized support vector machine (PSO-SVM) and ensemble learners such as LightGBM and ExtraTrees. The performance of these models was evaluated using eight key metrics: RMSE, MAE, VAF, PI, RSR, NMBE, MAPE, and NS. Together, these metrics ensure a comprehensive and accurate assessment of the models’ predictive capabilities.
According to the results in Table 4, the IHO-VMD-CatBoost model demonstrated superior performance among all the models tested. Specifically, its MAE was 0.17 cm/s and its RMSE was 0.28 cm/s, the lowest values among all models. Compared to ensemble learning models such as LightGBM and ExtraTrees, the IHO-VMD-CatBoost model reduced the RMSE by 79.6% and 80.0%, and the MAE by 78.8% and 79.8%, respectively. Compared to the unoptimized CatBoost model, its improvement in RMSE and MAE reached 81.6% and 81.5%, respectively. Additionally, compared to the neural network model Two-Layer BP and the support vector machine model PSO-SVM, the IHO-VMD-CatBoost model achieved reductions of 84.4% and 81.7% in RMSE and of 86.6% and 82.5% in MAE, respectively.
Table 4. Prediction performance of each model (test set, mean of 30 runs).

| Ranking | Model | RMSE (cm/s) | MAE (cm/s) | VAF (%) | PI | RSR | NMBE (%) | MAPE (%) | NS |
|---|---|---|---|---|---|---|---|---|---|
| 1 | IHO-VMD-CatBoost | 0.28 | 0.17 | 97.28 | 1.66 | 0.17 | 0.10 | 17.28 | 0.97 |
| 2 | LightGBM | 1.37 | 0.80 | −2.14 | −1.89 | 1.04 | 23.46 | 119.71 | −0.09 |
| 3 | ExtraTrees | 1.40 | 0.84 | 2.34 | −1.89 | 1.05 | 33.98 | 107.96 | −0.11 |
| 4 | AdaBoost | 1.36 | 0.85 | −8.73 | −2.23 | 1.11 | 35.59 | 105.08 | −0.23 |
| 5 | DT | 1.44 | 0.92 | −2.83 | −2.13 | 1.10 | 39.72 | 98.39 | −0.21 |
| 6 | CatBoost | 1.52 | 0.92 | 7.04 | −1.81 | 1.04 | 37.21 | 98.43 | −0.09 |
| 7 | RF | 1.53 | 0.96 | 4.78 | −1.87 | 1.05 | 36.75 | 110.41 | −0.11 |
| 8 | GBDT | 1.44 | 0.96 | −1.57 | −2.14 | 1.11 | 42.57 | 101.04 | −0.23 |
| 9 | PSO-SVM | 1.53 | 0.97 | 3.52 | −1.89 | 1.06 | 36.09 | 102.90 | −0.12 |
| 10 | KNN | 1.72 | 1.15 | 9.16 | −1.81 | 1.05 | 40.88 | 105.69 | −0.10 |
| 11 | Two-layerBP | 1.80 | 1.27 | 11.68 | −2.05 | 1.06 | 0.00 | 120.26 | −0.12 |
| 12 | BiLSTM | 2.15 | 1.45 | 63.13 | −3.50 | 1.29 | 16.16 | 146.03 | −0.66 |
| 13 | ELM | 6.65 | 2.33 | −1476.43 | −36.34 | 3.97 | 1.98 | 354.74 | −14.76 |
| 14 | Lasso-MLR | 26.25 | 4.79 | −22804.25 | −488.63 | 15.34 | −232.29 | 577.73 | −234.28 |
Examining the full set of metrics in Table 4 confirms the superior predictive performance of the IHO-VMD-CatBoost model for PPV. Specifically, it achieves the lowest RMSE (0.28 cm/s) and MAE (0.17 cm/s) among all evaluated models, indicating the highest prediction accuracy. Additionally, the model reports a VAF of 97.28%, reflecting an excellent fit to the observed data. The PI of 1.66 further underscores the model’s efficiency in capturing data trends. Moreover, the model features the smallest RMSE-to-standard-deviation ratio (RSR) of 0.17, highlighting its predictive consistency across different scenarios. The NS of 0.97, close to the ideal value of 1, further reinforces the robustness of the model. Collectively, these metrics confirm that the IHO-VMD-CatBoost model is highly effective and reliable for predicting PPV, offering distinct advantages over conventional methods.
Through comparative analysis, it is evident that the IHO-VMD-CatBoost model significantly enhances prediction accuracy while demonstrating superior generalization ability and robustness. As illustrated in Figure 10, the prediction performance plots for each model show that the IHO-VMD-CatBoost model offers greater flexibility and adaptability in fitting the actual observed data, particularly at higher peak particle velocities. This enhancement is attributed to the synergy between the IHO algorithm and VMD: IHO’s advanced optimization strategies refine hyperparameter tuning, while VMD decomposes complex signals into informative components, improving the model’s predictive performance. Consequently, this synergy enables the IHO-VMD-CatBoost model to outperform both HO-based and non-VMD-integrated models, as evidenced by the superior RMSE and MAE values in Table A2 in the Appendix. Such considerable improvements in accuracy and model fit establish the IHO-VMD-CatBoost model as a highly valuable tool for the demanding precision requirements of blasting vibration prediction, providing robust and reliable support for both research and practical applications in this field.
Figure 10. Prediction performance of each model.
5.5. Analysis of reliability
To evaluate the reliability of the model, the a20index, the index of dispersion (IOS), and the index of agreement (IOA) were employed. The a20index represents the proportion of model predictions that fall within ±20% of the experimental observations, providing a measure of the degree of agreement between predicted and observed values. A higher proportion indicates superior predictive performance. The IOS is calculated by dividing the RMSE by the mean of the observations, thereby quantifying the dispersion of the predicted values. Lower IOS values suggest less variability in the predicted results, which subsequently reflects higher model accuracy. The IOA, ranging from −1 to 1, is used to evaluate the consistency between predicted and observed values; values closer to 1 indicate a stronger agreement between predictions and actual observations. The equations for calculating these indices are provided in Equations 29 to 31, where M denotes the total number of samples, m20 represents the number of samples whose predicted-to-observed ratio lies between 0.8 and 1.2, and the average of the actual observations serves as the normalizing term.
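In standard form (IOA below is Willmott's index of agreement, taken here as the assumed formulation):

$$a20\text{-}\mathrm{index}=\frac{m20}{M}\times 100,\qquad \mathrm{IOS}=\frac{\mathrm{RMSE}}{\overline{M}}$$

$$\mathrm{IOA}=1-\frac{\sum_{i=1}^{M}\left(M_i-P_i\right)^2}{\sum_{i=1}^{M}\left(\left|P_i-\overline{M}\right|+\left|M_i-\overline{M}\right|\right)^2}$$

where $\overline{M}$ is the average of the actual observations.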
Furthermore, regression error characteristic (REC) analysis was conducted to provide a comprehensive assessment of the model’s performance. The REC curve visually evaluates the model’s performance across different error thresholds by depicting the relationship between absolute error (x-axis) and the percentage of predictions within the error range (y-axis). This method effectively illustrates the prediction accuracy under varying error tolerances, thereby enhancing the transparency and interpretability of the model’s overall performance. In this study, REC curves were utilized to present a visual representation of the model’s prediction accuracy under different error tolerance levels.
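An REC curve is straightforward to construct: sort the absolute errors and plot each tolerance value against the fraction of samples whose error does not exceed it. A minimal sketch, assuming `y_test` and a trained `model` are available:

```python
import numpy as np
import matplotlib.pyplot as plt

def rec_curve(y_true, y_pred):
    """REC curve: x = absolute-error tolerance, y = fraction of samples
    whose prediction error is at or below that tolerance."""
    errors = np.sort(np.abs(np.asarray(y_true) - np.asarray(y_pred)))
    coverage = np.arange(1, errors.size + 1) / errors.size
    return errors, coverage

# eps, cov = rec_curve(y_test, model.predict(X_test))
# plt.plot(eps, cov)
# plt.xlabel("Absolute error (cm/s)"); plt.ylabel("Accuracy"); plt.show()
```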
As shown in Table 5, the IHO-VMD-CatBoost model exhibits superior reliability compared to traditional models, demonstrating significant advancements in predictive robustness and accuracy. As shown in Figure 11a, the a20index reached 83.33%, indicating that the majority of deviations between predicted and observed values fall within an acceptable range, which underscores the model’s strong predictive capability. The IOS value of 0.1534 suggests minimal dispersion in prediction results, thereby enhancing the reliability of the model. Moreover, the IOA value of 0.993 further corroborates the high consistency between predicted and observed values. In Figure 11b, the steep upward trend observed in the REC curve indicates a high percentage of accurate predictions within a small error margin, further supporting the reliability of the model. In summary, these evaluation metrics collectively demonstrate that the IHO-VMD-CatBoost model outperforms conventional methods in terms of both accuracy and reliability in predicting PPV.
Figure 11. (a) Prediction performance of the IHO-VMD-CatBoost model. (b) REC curves for each model.
Table 5. Reliability indices of each model.

| Ranking | Model | a20index (%) | IOA | IOS |
|---|---|---|---|---|
| 1 | IHO-VMD-CatBoost | 83.333 | 0.993 | 0.153 |
| 2 | LightGBM | 19.048 | 0.508 | 0.969 |
| 3 | ExtraTrees | 16.667 | 0.486 | 0.980 |
| 4 | AdaBoost | 14.286 | 0.472 | 1.033 |
| 5 | DT | 21.429 | 0.508 | 1.023 |
| 6 | CatBoost | 16.667 | 0.460 | 0.971 |
| 7 | RF | 16.667 | 0.449 | 0.979 |
| 8 | GBDT | 21.429 | 0.480 | 1.029 |
| 9 | PSO-SVM | 23.810 | 0.437 | 0.982 |
| 10 | KNN | 19.048 | 0.464 | 0.976 |
| 11 | Two-layerBP | 21.429 | 0.667 | 0.983 |
| 12 | BiLSTM | 9.524 | 0.400 | 1.199 |
| 13 | ELM | 14.286 | 0.204 | 3.692 |
| 14 | Lasso-MLR | 21.429 | 0.006 | 14.264 |
6. Interpretation of SHAP Results
6.1. Local character analysis
Based on the SHAP framework, the contribution of each input feature is calculated for individual samples of the trained IHO-VMD-CatBoost model, such that the sum of all feature contributions for a sample equals the model’s output for that sample, i.e., the predicted peak blast vibration velocity. This study uses the waterfall plots of the 100th and 128th blast vibration records from the database as examples to visualize the local interpretation of these data under the SHAP method, as shown in Figure 12.
Figure 12. Interpretation of sample results: (a) the 100th record and (b) the 128th record.
In the waterfall plot, red arrows represent features with positive SHAP values, which push the model’s predicted output higher, while blue arrows represent features with negative SHAP values, which pull it lower. The baseline value of 1.194 is the model’s expected output in the absence of any feature information, i.e., the average prediction over the training data. Arrows of varying colors and lengths indicate the direction and magnitude of each feature’s effect on the peak velocity of blast vibration, illustrating how individual features move the prediction from the base value to the final value f(x).
As shown in Figure 12a, for the 100th blast vibration data, the horizontal distance and rock integrity coefficient significantly increase the predicted value of the blast vibration velocity. Conversely, the maximum section charge, elevation difference, and differential time significantly decrease the predicted value, while the remaining features have a relatively minor effect. The predicted peak velocity of the blast vibration is ultimately 0.59 cm/s, indicating that the horizontal distance and rock integrity coefficient are highly influential for this blast vibration data.
In Figure 12b, for the 128th blast vibration data, the horizontal distance and elevation difference significantly increase the predicted value of the blast vibration velocity. In contrast, the maximum section charge, rock integrity coefficient, and the angle between the measurement point and the direction of the minimum line of resistance significantly decrease the predicted value. The other features have a relatively minor effect. The predicted peak velocity of this blast vibration is 0.52 cm/s, highlighting the substantial influence of the horizontal distance and elevation difference on this blast vibration data.
From the prediction plots of the two different samples in Figure 12, it is evident that the influence of the same input feature on different samples varies. Therefore, the overall dataset needs to be analyzed in combination with the global features of the samples to obtain more accurate prediction results.
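Such local explanations can be reproduced with the shap package, whose TreeExplainer supports CatBoost models. In the sketch below, `model` and `X` stand for the trained IHO-VMD-CatBoost regressor and its feature matrix (ideally a pandas DataFrame with named columns); both are assumptions about the surrounding code.

```python
import shap  # pip install shap

explainer = shap.TreeExplainer(model)  # trained CatBoost regressor
sv = explainer(X)                      # Explanation object for all samples

# Waterfall plots for the 100th and 128th records (0-based indices 99 and 127);
# the base value of each plot corresponds to the 1.194 baseline discussed above.
shap.plots.waterfall(sv[99])
shap.plots.waterfall(sv[127])
```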
6.2. Global character analysis
To further elucidate the relationship between features and SHAP values, we plotted the relationship between each feature and its corresponding SHAP value based on the prediction process of the IHO-VMD-CatBoost model. This approach allows us to convert all local interpretations into approximate global interpretations. The SHAP values for each feature are depicted in Figure 13a, while the feature importance ranking is shown in Figure 13b.
Figure 13. Key feature analysis: (a) scatterplot of feature importance and (b) importance ranking of features.
In Figure 13a, each point represents a sample. The horizontal coordinate indicates the SHAP value, and the magnitude of the feature is encoded by color; the wider a region of points, the more samples cluster at that SHAP value. With a SHAP value of 0 as the midpoint, sample points to the left contribute negatively to the predicted value, whereas those to the right contribute positively. It is evident from Figure 13a that the maximum section charge, horizontal distance, rock integrity coefficient, and the angle between the measurement point and the direction of the minimum resistance line significantly influence the model’s prediction results. Specifically, the maximum section charge, rock integrity coefficient, and the angle between the measurement point and the direction of the minimum resistance line positively affect the model’s predictions, while the horizontal distance negatively impacts the peak vibration velocity.
From a blasting energy perspective, an increased horizontal distance weakens the vibration intensity as the energy generated by blasting decays with distance. Conversely, a larger maximum section charge releases more energy during blasting, potentially resulting in a higher vibration velocity. A high rock integrity coefficient indicates a more intact rock structure with fewer cracks and faults, allowing for more efficient propagation of vibration energy due to better conductivity and fewer points of energy dissipation. The angle between the measurement point and the direction of the line of least resistance indicates the measurement point’s position relative to this line in the blast area. A larger angle suggests the measurement point is closer to the straight-line propagation path, thereby receiving stronger vibration waves.
SHAP theory also measures feature importance using the average absolute value of each feature’s Shapley value. The larger this value, the more critical the feature, as shown in Figure 13b. From Figure 13b, it is evident that the horizontal distance has the highest value, indicating it is the most important feature in the model, contributing the most to the prediction results. Following this, the maximum section charge, rock integrity coefficient, and the angle between the measurement point and the direction of the minimum resistance line also have a significant impact on the model’s prediction results.
The interactive effect of the two input features on the model prediction results is visualized through the SHAP dependency graph in Figure 14. In this figure, the horizontal axis represents the different values of each key factor, the vertical axis shows the corresponding SHAP values, and the color gradient indicates the magnitude of the feature value on the right side. A SHAP value greater than 0 signifies a positive effect on the model’s prediction results, while a value less than 0 indicates a negative effect.
Figure 14. SHAP dependence plots of key feature interactions.
As an illustration, consider the interaction effect of the total charge (X2) and the rock integrity coefficient (X8). For the total charge (X2), SHAP values are mostly negative when the charge is below 6000 kg, indicating a negative impact on the model’s predictions, and mostly positive when it exceeds 6000 kg, indicating a positive impact; larger total charges therefore tend to push the predicted vibration velocity upward. Figure 14 further shows that as both the total charge (X2) and the rock integrity coefficient (X8) increase, their combined effect exerts an increasingly pronounced positive influence on the model’s prediction outcomes.
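Building on the Explanation object `sv` constructed in the Section 6.1 sketch, the global views in Figures 13 and 14 correspond to the beeswarm, bar, and dependence plots below; the column names "X2" and "X8" assume a DataFrame whose columns follow the paper's factor labels.

```python
# Global summaries built from the same Explanation object `sv`:
shap.plots.beeswarm(sv)  # per-sample SHAP values, as in Figure 13a
shap.plots.bar(sv)       # mean |SHAP| feature ranking, as in Figure 13b

# Dependence view of total charge (X2) coloured by the rock integrity
# coefficient (X8), mirroring the interaction discussed above.
shap.plots.scatter(sv[:, "X2"], color=sv[:, "X8"])
```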
6.3. Engineering recommendations analysis
Combining the global characterizations from Figures 13 and 14, the following engineering recommendations are proposed to better control blasting vibration, reduce its impact on the surrounding environment, and improve blasting efficiency and safety.
Horizontal distance: given its substantial influence on the PPV, the strategic arrangement of blasting points should be prioritized during the design phase. For instance, increasing the minimum safe distance from the blasting point to sensitive facilities (e.g., residential houses, highways, and dams) can significantly reduce vibration impact. Additionally, the mutual influence between different blasting points can be mitigated by employing a more detailed blasting grid.
Maximum section charge: owing to its significant impact on vibration, optimizing the use of explosives and blasting methods is recommended. Implementing segmental blasting and incremental delays can effectively control vibration rates and minimize environmental impact. Alternatively, reducing the maximum section charge, increasing the number of blasting segments, and using high-precision electronic detonators for precise control can also be effective strategies.
Rock integrity coefficient: a detailed geological investigation should precede blasting to assess rock integrity. For rock formations with low integrity, using lower-strength explosives or adjusting blasting parameters can reduce unintended vibrations caused by uneven rock fragmentation. In contrast, for highly intact rock formations, blasting parameters can be adjusted to optimize the blasting effect while ensuring safety.
Angle between the measurement point and the minimum resistance line: this parameter reflects the directional nature of blasting wave propagation. Engineering practice should consider adjusting the blasting direction so that vibration waves propagate in a direction that minimally impacts structures. Utilizing three-dimensional blasting design software for simulation can help identify the optimal blasting program.
By adhering to these recommendations, it is possible to enhance the control of blasting operations, thereby improving both safety and efficiency.
7. Model Performance Analysis
7.1. Stability test
When considering the application of machine learning models, it is crucial to recognize their limitations regarding applicability. While these models are well-suited for processing complex data and pattern recognition, they typically require substantial data support. Their performance may be constrained by the representativeness and quality of the training dataset. Therefore, rigorous and meticulous validation of the models’ generalization ability and applicability across different mining environments is necessary.
The primary goal of the stability test is to ensure that the model consistently produces reliable predictions across multiple runs. Since blasting vibration prediction data is typically random in nature, and the site operating conditions are challenging to fully control during the blasting process, it is essential that the model generates stable outputs across various runs. This consistency is critical to ensure that the results remain reliable under different scenarios. If the model lacks stability, the blasting operator will be unable to depend on the predicted outcomes to implement protective measures or adjust the blasting program, which could result in unnecessary safety risks. Therefore, stability testing plays a pivotal role in practical applications, as it guarantees the model's reliability and predictability in continuous operation, providing robust support for decision-making.
A standard method for verifying stability involves running the experiment multiple times and calculating statistical metrics such as Std and RMSE. In this study, the IHO-VMD-CatBoost model and several comparison models were executed multiple times on the same dataset. The RMSE and MAE were recorded for each run, and the Std for each model was computed. A lower Std signifies that the model performs more consistently across repeated experiments. In Figure 15, the models labeled Alg1–Alg10 correspond to rankings 1–10 in Table 4, and the error-bar graphs visualize each model’s stability. The results demonstrate that the Stds of RMSE and MAE for the IHO-VMD-CatBoost model (Alg1) are 0.02 and 0.01 cm/s, respectively, significantly lower than those of the other models. This indicates that the IHO-VMD-CatBoost model exhibits minimal fluctuation across multiple experiments, confirming its superior stability.
Figure 15. Error-bar plots of the RMSE and MAE stability of models Alg1–Alg10.
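Reusing the per-run error arrays from the repeated-run sketch in Section 5.1, the stability statistic reduces to the standard deviation across runs:

```python
import numpy as np

# rmses, maes: per-run arrays collected by repeated_runs() in the earlier sketch
rmse_std = float(np.std(rmses))  # reported as ~0.02 cm/s for IHO-VMD-CatBoost
mae_std = float(np.std(maes))    # reported as ~0.01 cm/s for IHO-VMD-CatBoost
```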
7.2. Robustness test
Robustness refers to the capability of an algorithm to maintain optimal performance amid varying input perturbations or data noise. Its primary purpose is to evaluate the model’s performance in response to such disruptions. In real-world mine blasting operations, data is typically sourced from multiple measurement points, where measurement equipment may be influenced by external environmental factors, equipment aging, or human operation, leading to a certain level of noise in the data. Consequently, the model’s robustness is critical, as it determines whether the model can still yield reliable predictions under noisy conditions. A model that is overly sensitive to noise risks generating inaccurate predictions from slightly erroneous data inputs. This sensitivity can result in significant errors in blasting safety assessments, ultimately jeopardizing the safety of surrounding buildings and individuals.
To assess the model’s robustness, this experiment introduces varying levels of noise (5%, 10%, and 20%) into the test dataset, evaluating the IHO-VMD-CatBoost model alongside the comparative models under these noisy conditions. The principal metric for robustness evaluation is the increase in RMSE; more robust models are expected to exhibit less performance degradation as noise levels rise. Algorithms Alg1–Alg10, illustrated in Figure 16, correspond to rankings 1–10 in Table 4. As depicted in the figure, the RMSE of each model increases to varying extents with rising noise levels. Notably, the IHO-VMD-CatBoost model (Alg1) shows the least variation in RMSE, indicating its superior robustness under noisy conditions.
Figure 16. RMSE of models Alg1–Alg10 under 5%, 10%, and 20% noise.
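The exact perturbation scheme is not specified in the text; one plausible reading, sketched below, injects zero-mean multiplicative Gaussian noise into the test features at the 5%, 10%, and 20% levels and tracks the resulting RMSE growth.

```python
import numpy as np

def rmse_under_noise(model, X_te, y_te, levels=(0.05, 0.10, 0.20), seed=0):
    """RMSE after multiplicative Gaussian noise is injected into the features."""
    rng = np.random.default_rng(seed)
    results = {}
    for level in levels:
        X_noisy = X_te * (1.0 + rng.normal(0.0, level, size=X_te.shape))
        pred = model.predict(X_noisy)
        results[level] = float(np.sqrt(np.mean((y_te - pred) ** 2)))
    return results  # smaller growth over the clean RMSE implies robustness
```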
7.3. Suitability test
The applicability test aims to evaluate the model’s performance across various datasets and diverse scenarios, thereby assessing its generalization capability. In practical blasting projects, geological conditions, mining methods, and blasting parameters can vary significantly from one mine to another. Therefore, the model must demonstrate robust adaptability to ensure accurate predictions under differing mining conditions. If the model is limited to the specific conditions of the training data and fails to generalize effectively to other scenarios, its applicability will be significantly constrained. Thus, the importance of applicability testing lies in ensuring the scalability and reliability of the model across different geological conditions and blasting projects. This approach guarantees that the model not only performs well on the current dataset but also retains its effectiveness in future applications within new mines or varied geological formations, which is essential for enhancing both the safety and economic efficiency of blasting operations.
Cross-validation serves as an effective method for assessing the generalization ability and applicability of a model. In this study, 5-fold cross-validation is employed to evaluate model performance by partitioning the dataset into several subsets. Each subset is utilized sequentially for testing, while the remaining subsets are reserved for model training, allowing for the calculation of an average result across multiple experiments. This methodology validates the applicability of the IHO-VMD-CatBoost model. As shown in Table 6, the model exhibits stable performance during the 5-fold cross-validation, with RMSE values ranging from 0.2748 to 0.2861 cm/s and MAE values between 0.1687 and 0.1743 cm/s, reflecting minimal fluctuations. These results indicate that the model consistently performs well under varying data distribution conditions. Furthermore, the robust applicability of the IHO-VMD-CatBoost model in addressing different data distributions is affirmed, suggesting that the model is capable of maintaining performance across diverse scenarios.
Table 6. 5-fold cross-validation results of the IHO-VMD-CatBoost model.

| Number of folds | RMSE (cm/s) | MAE (cm/s) |
|---|---|---|
| 1-Fold | 0.2813 | 0.1715 |
| 2-Fold | 0.2748 | 0.1687 |
| 3-Fold | 0.2861 | 0.1743 |
| 4-Fold | 0.2792 | 0.1695 |
| 5-Fold | 0.2830 | 0.1720 |
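A minimal sketch of this 5-fold protocol with scikit-learn's KFold is given below; the shuffling and seeding choices, and the `build_model` factory, are assumptions rather than the authors' exact setup.

```python
import numpy as np
from sklearn.model_selection import KFold

def cross_validate(build_model, X, y, k=5, seed=0):
    """k-fold CV: each fold serves once as the test set (cf. Table 6)."""
    scores = []
    for tr, te in KFold(n_splits=k, shuffle=True, random_state=seed).split(X):
        model = build_model(seed)
        model.fit(X[tr], y[tr])
        pred = model.predict(X[te])
        scores.append((np.sqrt(np.mean((y[te] - pred) ** 2)),  # fold RMSE
                       np.mean(np.abs(y[te] - pred))))         # fold MAE
    return np.array(scores)  # per-fold (RMSE, MAE) pairs
```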
In addition, Table A2 in the Appendix presents the computational costs of the various algorithms, including the IHO-VMD-CatBoost model. The testing cost of IHO-VMD-CatBoost is quite efficient, with a runtime of 0.065 seconds and memory usage of 1516.25 MB. Compared to other models, such as ExtraTrees (0.101 seconds, 1517.01 MB) and IHO-AdaBoost (0.110 seconds, 1515.73 MB), the IHO-VMD-CatBoost model demonstrates a good balance between performance and computational efficiency, offering relatively low computational costs without compromising accuracy. This efficiency is crucial when applying the model to real-world scenarios where both speed and memory constraints are key factors.
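How runtime and memory were measured is not specified; one simple way to obtain comparable numbers, assuming the psutil package and a trained `model` with test features `X_test`, is:

```python
import os
import time
import psutil  # assumed tooling for process memory measurement

proc = psutil.Process(os.getpid())
t0 = time.perf_counter()
_ = model.predict(X_test)                    # trained model and test features
duration = time.perf_counter() - t0          # cf. "Duration (s)" in Table A2
memory_mb = proc.memory_info().rss / 2**20   # cf. "Memory (MB)" in Table A2
```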
8. Conclusions
Blasting vibration is an inherent adverse effect associated with rock blasting and excavation activities, capable of causing significant damage to surrounding structures and the environment, ultimately leading to substantial economic losses. Therefore, accurate prediction of the PPV of blasting vibration is critical for ensuring safety and facilitating effective risk management. In this study, an IHO-VMD-CatBoost model was developed to predict the intensity of blasting vibration, and it demonstrated notable performance with the following key findings:
Model Efficacy: The proposed IHO-VMD-CatBoost model effectively addresses the initialization challenges inherent in the HO algorithm by incorporating advanced strategies, namely the tent chaotic mapping, Cauchy perturbation, and adaptive weighting factor strategies. These enhancements substantially improve both the global search capability and the overall optimization performance of the algorithm. Case study results reveal that the IHO-VMD-CatBoost model surpasses 13 widely-used regression models, including LightGBM, ExtraTrees, AdaBoost, and decision trees, in terms of prediction accuracy. Specifically, the model achieved a MAE of 0.17 cm/s and an RMSE of 0.28 cm/s, outperforming traditional approaches significantly, thereby highlighting its advantages in terms of both accuracy and reliability.
Model Limitations: Although the IHO-VMD-CatBoost model demonstrates strong predictive performance, certain limitations remain, particularly regarding the selection of factors for predicting blasting vibration intensity. The model's comprehensiveness and predictive accuracy could be further enhanced by incorporating additional rock and geological parameters, such as uniaxial compressive strength, rock density, and the geological strength index. Therefore, future research should focus on integrating a broader set of parameters that may significantly influence PPV, thereby improving the robustness and generalizability of the model.
Research Contribution: The IHO-VMD-CatBoost model uniquely integrates advanced methods, including the IHO algorithm, VMD, and the CatBoost algorithm. This combination not only improves the model’s global search and optimization capabilities but also enhances prediction accuracy. Furthermore, SHAP analysis illustrates that the model maintains a high degree of interpretability, effectively revealing the primary factors influencing blasting vibrations—such as horizontal distance and maximum section charge. These insights provide a scientific basis for optimizing blasting scheme designs, thereby enhancing the safety and efficiency of blasting operations.
Future Research Directions: Future research should extend the analysis to encompass additional geological and environmental parameters to provide a more comprehensive assessment of the environmental impacts of blasting vibrations. Additionally, further exploration of the model’s performance under more complex geological conditions or in environments characterized by high uncertainty is warranted. This would enhance the applicability and robustness of the model under diverse field scenarios.
In summary, the IHO-VMD-CatBoost model exhibits considerable stability, robustness, and applicability in the prediction of blasting vibration intensity. Its demonstrated potential in complex geological environments suggests its feasibility for broader applications in real-world settings. Future research should focus on refining model optimization and expanding variable selection to enhance the predictive capabilities and adaptability of the model for even more challenging geological conditions.
Conflicts of Interest
The authors declare no conflict of interest.
Author Contributions
Haiping Yuan: Software, Writing – original draft. Yangyao Zou: Conceptualization, Methodology, Software, Writing – original draft. Hengzhe Li: Writing – review & editing. Shuaijie Ji: Writing – review & editing. Ziang Gu: Writing – review & editing. Liu He: Writing – review & editing. Ruichao Hu: Writing – review & editing.
Acknowledgments
The authors gratefully acknowledge the valuable support for this research from the National Natural Science Foundation of China (Grant No. 51874112) and the State Key Laboratory Open Funding Project of Mining-Induced Response and Disaster Prevention and Control in Deep Coal Mines (Anhui University of Science and Technology) (Grant No. SKLMRDPC22KF02).
Data Availability
The data used in this study is sourced from the following link: https://kns.cnki.net/KCMS/detail/detail.aspx?dbcode=CDFD&dbname=CDFD9908&filename=2007198024.nh&v=.
References
Appendix
A. Algorithm Abbreviations, Computational Costs, and Key Parameters
This study employs various algorithmic models, with the corresponding abbreviations listed in Table A1. The computational costs for each model are presented in Table A2, while Table A3 outlines the key parameters of the models under consideration.
Table A1. Algorithm abbreviations.

| Abbreviation | Full name |
|---|---|
| LightGBM | Light gradient boosting machine |
| ExtraTrees | Extremely randomized trees |
| AdaBoost | Adaptive boosting |
| DT | Decision trees |
| CatBoost | Categorical boosting |
| RF | Random forest |
| GBDT | Gradient boosting decision tree |
| PSO-SVM | Particle swarm optimization-support vector machine |
| KNN | K-nearest neighbors |
| Two-layerBP | Two-layer backpropagation |
| BiLSTM | Bi-directional long short-term memory |
| ELM | Extreme learning machine |
| Lasso-MLR | Least absolute shrinkage and selection operator-multiple linear regression |
Table A2. Computational costs of each model.

| Ranking | Model | RMSE (cm/s) | MAE (cm/s) | Duration (s) | Memory (MB) |
|---|---|---|---|---|---|
| 1 | IHO-VMD-CatBoost | 0.28 | 0.17 | 0.065 | 1516.25 |
| 2 | IHO-CatBoost | 0.85 | 0.53 | 0.042 | 1515.73 |
| 3 | HO-VMD-CatBoost | 0.88 | 0.56 | 0.080 | 1515.85 |
| 4 | IHO-AdaBoost | 0.88 | 0.62 | 0.110 | 1515.73 |
| 5 | IHO-ExtraTrees | 0.92 | 0.60 | 0.110 | 1517.27 |
| 6 | IHO-GBDT | 0.93 | 0.68 | 0.054 | 1517.27 |
| 7 | HO-CatBoost | 1.00 | 0.62 | 0.120 | 1515.50 |
| 8 | IHO-RF | 1.00 | 0.68 | 0.182 | 1517.27 |
| 9 | IHO-LightGBM | 1.04 | 0.63 | 0.022 | 1517.27 |
| 10 | IHO-Lasso-MLR | 1.07 | 0.75 | 0.003 | 1517.27 |
| 11 | IHO-SVM | 1.20 | 0.79 | 0.003 | 1517.27 |
| 12 | IHO-DT | 1.32 | 0.92 | 0.002 | 1517.27 |
| 13 | IHO-ELM | 1.34 | 0.99 | 0.006 | 1516.25 |
| 14 | AdaBoost | 1.36 | 0.85 | 0.059 | 1517.27 |
| 15 | IHO-KNN | 1.36 | 0.99 | 0.002 | 1517.27 |
| 16 | LightGBM | 1.37 | 0.80 | 0.028 | 1516.76 |
| 17 | ExtraTrees | 1.40 | 0.84 | 0.101 | 1517.01 |
| 18 | GBDT | 1.44 | 0.96 | 0.043 | 1517.27 |
| 19 | DT | 1.44 | 0.92 | 0.002 | 1517.27 |
| 20 | CatBoost | 1.52 | 0.92 | 0.253 | 1514.99 |
| 21 | RF | 1.53 | 0.96 | 0.119 | 1517.27 |
| 22 | PSO-SVM | 1.53 | 0.97 | 0.003 | 1517.27 |
| 23 | IHO-BiLSTM | 1.57 | 1.15 | 0.257 | 1516.25 |
| 24 | KNN | 1.72 | 1.15 | 0.002 | 1517.27 |
| 25 | Two-layerBP | 1.80 | 1.27 | 0.215 | 1516.25 |
| 26 | BiLSTM | 2.15 | 1.45 | 0.282 | 1515.34 |
| 27 | ELM | 6.65 | 2.33 | 0.006 | 1515.33 |
| 28 | Lasso-MLR | 26.25 | 4.79 | 0.002 | 1517.27 |
| 29 | IHO-Two-layerBP | 41.82 | 32.60 | 0.199 | 1515.99 |
Table A3. Key hyperparameter settings of the models.

| Model | Key hyperparameters | Value |
|---|---|---|
| IHO-VMD-CatBoost | Learning_rate, depth, VMD α | 0.2, 5, 3800 |
| IHO-CatBoost | Learning_rate, depth, L2_leaf_reg | 0.15, 6, 3 |
| HO-VMD-CatBoost | Learning_rate, depth, VMD α | 0.13, 4, 3700 |
| IHO-AdaBoost | Learning_rate, n_estimators | 0.2, 100 |
| IHO-ExtraTrees | n_estimators, Max_depth | 200, 7 |
| IHO-GBDT | Learning_rate, Max_depth, n_estimators | 0.15, 4, 150 |
| HO-CatBoost | Learning_rate, depth, L2_leaf_reg | 0.12, 4, 5 |
| IHO-RF | n_estimators, Max_depth | 200, 8 |
| IHO-LightGBM | Learning_rate, num_leaves, Max_depth | 0.1, 31, 6 |
| IHO-Lasso-MLR | Alpha, Max_iter | 0.05, 1000 |
| IHO-SVM | C, Kernel, Epsilon | 10, rbf, 0.05 |
| IHO-DT | Criterion, Max_depth | squared_error, 8 |
| IHO-ELM | Alpha, Max_iter | 0.01, 1000 |
| AdaBoost | Learning_rate, n_estimators | 0.1, 50 |
| IHO-KNN | n_neighbors, weights | 10, uniform |
| LightGBM | Learning_rate, num_leaves, Max_depth | 0.1, 31, -1 |
| ExtraTrees | n_estimators, Max_depth | 100, 5 |
| GBDT | Learning_rate, Max_depth, n_estimators | 0.1, 3, 100 |
| DT | Criterion, Max_depth | squared_error, 5 |
| CatBoost | Loss_function, Verbose | RMSE, 0 |
| RF | n_estimators, Max_depth | 100, 6 |
| IHO-BiLSTM | Layers, units, activation | 2, 50–50, relu |
| KNN | n_neighbors, weights | 5, uniform |
| Lasso-MLR | Alpha, Max_iter | 0.1, 1000 |
| IHO-Two-layerBP | Layers, units, activation | 2, 64–32, relu |