-
PDF
- Split View
-
Views
-
Cite
Cite
Xinmin Ge, Yiren Fan, Yingchang Cao, Yang Wang, Yunhai Cong, Lailei Liu, A hybrid method for geological and geophysical data with multi-peak distributions using the PSO–GRG algorithm, Journal of Geophysics and Engineering, Volume 12, Issue 3, June 2015, Pages 283–291, https://doi.org/10.1088/1742-2132/12/3/283
- Share Icon Share
Abstract
To allow peak searching and parameter estimation for geological and geophysical data with multi-peak distributions, we explore a hybrid method based on a combination of the particle swarm optimization (PSO) and generalized reduced gradient (GRG) algorithms. After characterizing peaks using the additive Gaussian function, a nonlinear objective function is established, which transforms our task into a search for optimal solutions. In this process, PSO is used to obtain the initial values, aiming for global convergence, while GRG is subsequently implemented for higher stability. Iterations are stopped when the convergence criteria are satisfied. Finally, grayscale histograms of backscattering electron images of sandstone show that the proposed algorithm performs much better than other methods such as PSO, GRG, simulated annealing and differential evolution, achieving a faster convergence speed and minimal variances.
1. Introduction
Many geological and geophysical data show obvious multi-peak distribution properties, including the nuclear magnetic resonance transversal relaxation time (NMR T2) spectrum for porous rock, the absorption peaks of the gamma ray spectra for conventional radioactive elements such as uranium (238U), thorium (232Th) and potassium (40K) in geophysical well logging, the grayscale histograms of backscattering electron (BSE) images of rock, the amplitude or phase spectrum of reflected waves in seismic prospecting and the x-ray fluorescence spectrometry (XFS) of geological samples. A lot of interesting information is imbedded in these data, which makes it vital to extract the multi-peak parameters. For example, for the NMR T2 spectrum of porous rocks, if the multi-peak parameters are precisely evaluated, the pore size distribution and its interrelationship with reservoir type can be characterized well.
Methods for multi-peak decomposition and data fitting have been proposed by researchers in different fields. Fens (2000) provided quantitative equations for minerals and pores based on an additive multi-Gaussian function using a grayscale histogram obtained from a BSE image and introduced the generalized reduced gradient (GRG) algorithm to solve the problem. Shang et al (2000) discussed the effectiveness of four kinds of algorithms (the derivative function, peak comparison, Gaussian fit and symmetrical zero area methods) in the process of XFS peak seeking. It was demonstrated that the Gaussian fit and symmetrical zero area methods were superior to the other methods, and a generalized least squares algorithm was adopted to solve the function. Guillot (2001) presented a digital filtering technique for the peak parameter acquisition of full absorption peaks in airborne gamma spectrometry by studying variations in the spectrum profile, but failed to separate overlapping peaks. Siiskonen and Toivonen (2005) developed an accurate model for the peak shape of inelastic neutron scattering in the HPGe detector and used an iterative approach for quantitative analyses. Karakaplan (2007) provided a global search evolutionary genetic algorithm (GA) based on a stochastic search procedure for quantifying a combination of Gaussian and Lorentzian peaks. Desirable results for data with several overlapped Lorentzian peaks with random noise were achieved by the algorithms proposed. Huang and Chau (2008) developed the expectation maximization algorithm to estimate the number of Gaussian mixtures of histograms and their corresponding parameterization when searching for the global threshold of an image using a Gaussian mixture model. Miranda et al (2009) demonstrated the utility of a neural network algorithm in radiochemical separation techniques and isolated the alpha peaks of environmental samples with partially overlapping alpha peaks. Chen et al (2009) used a self-adjusting symmetrical zero-area peak searching algorithm for nuclide identification in portable spectrometers. Li and Jiang (2011) proposed a new strategy combining the principle of the golden method with the quasi-Newton iterative method to estimate the parameters of linear frequency modulation signals. Baeza et al (2013) developed the parabolic-Lorentzian modified Gaussian model to describe skewed chromatographic peaks that made the optimization process difficult when the initial values of the model parameters were far from optimal. Brunetti (2013) noted that the fitting of the spectrum was an ill-posed problem because of the presence of superimposed peaks in x-ray spectra and introduced a GA approach to solve the problem. Zhu et al (2013) presented a method of analyzing the static 1H NMR free induction decay (FID) signals of polymer solids with a mixed Gaussian and exponential kernel function and then applied the inverse Fredholm integral of the first kind for T2 spectrum analysis of FIDs into both exponential and Gaussian decays. Wang et al (2013) established a multi-Gaussian model for waveform analysis and solved it using a weighted least squares algorithm. The fitting accuracy was enhanced after employing a multi-criteria decision-making strategy to optimize the weights for the corresponding points of the pulse wave. Liu and Zhang (2013) applied a weighted regression to fit a grayscale histogram with a multi-Gaussian distribution and achieved segmentation of the image.
The accuracy of parameter estimation is controlled by the optimization algorithm since an unsuitable method may lead to local convergence. Using conventional algorithms, such as the least squares and reduced gradients methods, it is easy to reach local minimization; these methods are not considered in our study. With the progress of mathematical and computational techniques, more and more intelligent methods, such as artificial neural networks, simulated annealing (SA), GAs, stochastic inversion, differential evolution (DE), particle swarm optimization (PSO) and group search optimization, are being introduced to solve nonlinear optimization and inversion problems (Sen and Stoffa 1991, 1995, Kozlovskaya et al2007, Shaw and Srivastava 2007, Srivastava and Sen 2010, Göktürkler and Balkaya 2012, Deng et al2013, Sharma and Biswas 2013, Liu et al2014, Tan et al2014, Teimourzadeh and Zare 2014, Basu 2015). Compared with conventional algorithms, these intelligent algorithms show a stronger adaptability to the uncertainty of the data. The small requirements for the continuity and convexity of the objective function and constraint conditions also facilitate the utility of these algorithms. However, each intelligent algorithm has unique advantages and shortcomings, therefore amendments and combinations of these algorithms are necessary in applications (Chunduru et al1997, Dobróka and Szabó 2005, Athanasiou et al2007, Saraswat et al2010, Zhang et al2012, Ahmadi et al2013, Samuel et al2013, Tan et al2013).
Our aim is to develop a hybrid algorithm to obtain the peak parameters of geological and geophysical data. The rest of the paper is organized as follows. First, we give an overview of some frequently observed data with multi-peak distributions and generalize them to a nonlinear optimization problem. Then we introduce a hybrid method combining the PSO with the GRG for the nonlinear objective function, aiming for global optimal solutions. We present the algorithm in detail and take a BSE image as an example to verify its efficiency. Algorithms such as PSO, GRG, SA and DE are also executed on the same data for comparison. Finally, quantitative information about minerals and pores is derived using criteria presented by Fens (2000) and tested using data measured with a helium porosimeter and x-ray diffraction (XRD).
2. Problem overview
In this section we describe four groups of data which are commonly seen in the geological and geophysical fields. Figure 1 shows a typical BSE image and its grayscale histogram of a sandstone sample (where the acceleration voltage is 10 KeV, the magnification is 5000 × , the working distance is 2 mm, the tilt angle is 0°, and the horizontal field width is 53.7 µm). Three main peaks are observed in the grayscale histogram, representing different constituents of the sandstone. Based on previous publications by Fens (2000), Wong et al (2006) and Klaver et al (2012), we know that the dark areas in the image represent the pores since the impregnated epoxy has a low atomic density compared with other minerals. The dark gray areas represent the clays and the medium gray areas represent the quartz. The brighter areas contain feldspars and the brightest areas represent heavy minerals such as siderite and anhydrite. Thus, we consider that the three peaks are indicators of pore space, quartz and feldspars, and the other constituents are located among these peaks. The volumes of different constituents can be obtained using certain segmentation criteria on the BSE image.

The interaction between the beam and the sample generates different signals. (a) The BSE image. (b) The grayscale histogram.
Figure 2 shows the XRD spectrum of a sandstone sample. The principle of XRD is that the peak positions occur where the x-ray beam is diffracted by the crystal lattice and the unique set of d-spacings derived from this pattern can be used to ‘fingerprint’ the mineral (Cullity and Stock 2001). The constituents of minerals can be extracted using the phase retrieval method (Saldin et al2001, Środoń et al2001, Ufer et al2012). Some commercial software, such as Highscore (Philips) and Jade (MDI/Rigaku), is widely used for phase decomposition and matching of XRD spectra (Macchiarola et al2007). Among these techniques, peak searching is the most important for precise mineral component quantification.

Figure 3 depicts the NMR T2 spectrum of a fully water-saturated sandstone. The principles of NMR are given in detail by Dunn et al (2001). We use the same procedure and experimental conditions as previously published by Ge et al (2014) to investigate NMR signals. T2 spectra can be calibrated to pore radius distributions via an appropriate transformation for rocks with regularly distributed pores. The porosity and permeability of the gas-filled sample are 8.6% and 5.43 × 10-3 µm2, respectively. The three main peaks in the T2 spectrum indicate the complex pore system of the sample. The representative pore radius and the relative ratio can be computed if the positions and areas of these peaks can be fixed automatically.

The fourth example is the natural gamma ray spectrum in geophysical well logging. This is an effective method to estimate the elemental concentrations of 238U, 232Th and 40K in sedimentary rocks, which can be applied to interpret sedimentary environments and diagenesis. The theory of natural gamma ray spectrum logging has been explained thoroughly by Desbarats and Killeen (1990). The standard gamma ray spectra of 238U, 232Th and 40K and the recorded spectrum of a shale sample are shown in figure 4. Peak searching and spectrum decomposition techniques are necessary to quantify the energy ranges and contents of these radioactive elements.

Gamma ray spectra. (a) The standard gamma spectra for 238U, 232Th and 40K of BGO detector. (b) The typical gamma ray spectrum of a shale sample.
The task of peak fitting is then transformed into an issue of optimizing the multivariable, nonlinear objective function with constraints as in equations (5)–(8). N is larger than 3M for many cases; only the optimal solutions can be reached. There are several known methods for solving this problem. However, it is still very challenging with a large number of local minima values. In the following section, a hybrid algorithm combining the conventional and intelligent optimization methods will be introduced to find the global solutions of the nonsmooth, nonconvex function under condition of nonsmooth, nonconvex constraints.
3. A hybrid algorithm combing PSO and GRG
Let the position of the kth particle of the kth hereditary algebra be and its speed be (assuming the population size is Z) in 3M-dimensional space. Each particle has a fitness value determined by an optimized function and each particle has a present position and a best position which can be regarded as the discovered flight experience of the particle up to now. In addition, each particle has a best position found in the whole swarm up to now, which can be regarded as the flight experience of the particle’s companions. Each particle utilizes the following information to alter its present position: (1) its current position; (2) its current speed; (3) the distance between its current position and its best position; (4) the distance between its best position and the best position of the whole swarm.
PSO can possess strong convergence at the beginning of the calculation, but it may easily fall into ‘inertia’ which leads to premature convergence and insufficient local searching ability.
The iteration is halted if (where ε is a minimal constant. is the optimal solution).
Considering the advantages and disadvantages of PSO and GRG, a hybrid method (PSO–GRG) combining the PSO and GRG is proposed. We first use PSO to obtain a favorable solution, then use GRG to optimize this problem. In summary, the steps of PSO–GRG can be generalized as follows: (1) giving the population size Z, the learning factors and the inertia weight w, the dimension of searching space 3M, the maximal iteration times the minimum fitness value and the minimal constant ε; (2) calculating the fitness value of every particle; (3) finding particles satisfying the condition of the minimum fitness value as the initial solutions of GRG; (4) solving the objective function using GRG and obtaining the weights, peak values and standard deviations of the Gaussian functions. Figure 5 shows a schematic representation of the hybrid method.

4. Application and results
Figure 6 shows the fitting results of the data in figure 1(b). Three additive Gaussian functions are used to fit the data. The fitting result is close to the actual value with a mean square error of 3.87%. The grayscale values of these three peaks are 16.9, 119.1 and 136.8 while the standard deviations of these peaks are 3.57, 3.28 and 3.85, respectively.

The fitting results of the BSE image shown in figure 1. (a) A comparison of the grayscale histograms between the original and optimized data. (b) The residuals of the grayscale histograms between the original and optimized data.
After the segmentation, the lowest and highest grayscales of the pore space, clay, quartz, feldspars and heavy minerals can be determined automatically. The volumes of these constituents can be predicted simultaneously by frequency counting of the grayscales.
To demonstrate the efficiency of the hybrid methods, PSO, GRG, SA and DE are carried out for comparison. A close inspection of table 1 indicates that the performance of the proposed PSO–GRG algorithm is clearly and consistently superior to the other algorithms. The iteration time, absolute error and total computing time achieve the smallest values compared to the other methods. Additionally, DE seems to have a higher efficiency than the other algorithms except for PSO–GRG. In contrast, the GRG used by Fens (2000) shows the worst performance for our data.
Algorithm . | Total iteration number (times) . | Absolute error (%) . | Total computing time (s) . |
---|---|---|---|
PSO–GRG | 28 | 3.87 | 120 |
PSO | 23 | 9.24 | 86 |
GRG | 42 | 30.6 | 213 |
SA | 36 | 14.5 | 184 |
DE | 25 | 7.28 | 135 |
Algorithm . | Total iteration number (times) . | Absolute error (%) . | Total computing time (s) . |
---|---|---|---|
PSO–GRG | 28 | 3.87 | 120 |
PSO | 23 | 9.24 | 86 |
GRG | 42 | 30.6 | 213 |
SA | 36 | 14.5 | 184 |
DE | 25 | 7.28 | 135 |
Algorithm . | Total iteration number (times) . | Absolute error (%) . | Total computing time (s) . |
---|---|---|---|
PSO–GRG | 28 | 3.87 | 120 |
PSO | 23 | 9.24 | 86 |
GRG | 42 | 30.6 | 213 |
SA | 36 | 14.5 | 184 |
DE | 25 | 7.28 | 135 |
Algorithm . | Total iteration number (times) . | Absolute error (%) . | Total computing time (s) . |
---|---|---|---|
PSO–GRG | 28 | 3.87 | 120 |
PSO | 23 | 9.24 | 86 |
GRG | 42 | 30.6 | 213 |
SA | 36 | 14.5 | 184 |
DE | 25 | 7.28 | 135 |

Comparisons of the constituents between BSE and other methods. (a) A comparison of the porosity between BSE and the helium porosimeter. (b) A comparison of clay volume between BSE and XRD. (c) A comparison of quartz volume between BSE and XRD. (d) A comparison of feldspar volume between BSE and XRD. (e) A comparison of heavy mineral volume between BSE and XRD. (f) The distributions of relative errors for all samples.
5. Discussion and conclusions
The strategies for describing geological and geophysical data with multi-peak distributions are reviewed and an effective way to solve the proposed issue is investigated. We transform the problem of peak searching and data fitting into the task of finding optimal solutions for the objective function. Then a hybrid method called PSO–GRG is developed to avoid local and premature convergence. The following conclusions can be derived.
The proposed hybrid algorithm PSO–GRG is superior to other algorithms such as PSO, GRG, SA and DE since it makes it is easier to reach global convergence and produces smaller errors.
The parameters of the main peaks in grayscale histograms derived from BSE images of sandstone samples can be determined precisely and the volumes obtained of the constituents are in accordance with those from helium porosimeter and XRD tests.
Although no algorithm is universal for all optimization problems, the PSO–GRG algorithm is recommended for solving problems of this type using verifications of geological and geophysical data.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (41404086), the China Postdoctoral Science Foundation (2014M560591), the Major National Science and Technology Projects of China (2011ZX05020-008), the National Key Foundation for Exploring Scientific Instrument of China (2013YQ170463) and the CNPC Science and Technology Project (2012E-34-12). The authors are grateful to Dr Dong Li (a geophysical engineer at the SINOPEC Geophysical Research Institute) for his scientific support with the PSO algorithm. We also thank the anonymous reviewers for their constructive suggestions.