Earthquake Reaearch in China  2017, Vol. 31 Issue (2): 239-246
The Application of Multiquadric Function Fitting to Borehole Strain Time Series Data Processing
Peng Zhao1,2, Zhang Lei1, Chen Zhiyao2, Lv Pingji2
1. Earthquake Administration of Tianjin Municipality, Tianjin 300201, China;
2. Institute of Seismology, China Earthquake Administration, Wuhan 430071, China
Abstract: Based on the existing continuous borehole strain observation, the multiquadric function fitting method was used to deal with time series data. The impact of difference kernel function parameters was discussed to obtain a valuable fitting result, from which the physical connotation of the original data and its possible applications were analyzed. Meanwhile, a brief comparison was made between the results of multiquadric function fitting and polynomial fitting.
Key words: Multiquadric function fitting     Kernel function     Borehole strain time series

INTRODUCTION

Continuous borehole strain observation can measure deformation over frequencies from Hz to periods of days, months and years. Sensitive and continuous, the time series data contains a lot of information, including deformation signals at lower and higher frequencies and varieties of noise jamming. In this case, the method used in processing time series data should possess those features: Complete theory; stable and simple calculation in order to quickly deal with a large number of high-frequency time series data and easily analyze its geophysical content; high precision, in order to avoid a lot of weak crustal movement information being treated as noise; and characteristics and variation information including various types of noise which can be effectively separated, and feature extraction.

The polynomial fitting method and Chebyshev curve fitting method have been used in stations, with progressing of observation technology, many scientists are concentrating on searching a new method to replace those antiquated methods. In this paper, we introduce a new method to process time series data, that is, the multiquadric function fitting method.

1 METHOD OF MULTIQUADRIC FUNCTION FITTING 1.1 Introduction of Multiquadric Function Fitting Method

The method of multiquadric function fitting (MG), first described by Hardy R.L. in 1971, uses various harmonic and polynomial series to represent topography and other irregular surfaces that involves the summation of equations of quadric surfaces having unknown coefficients. One two-dimensional surface can be represented by the equation as follows:

 $f\left({x, y} \right) = \sum\limits_{j = 1}^m {{c_j}{q_j}(x, y, {x_j}, {y_j})}$ (1)

where, q is a single class of quadric surfaces, (xy, yj) is the horizontal position of q, cj is the associated coefficient.

The term "multiquadric" does not mean q must be a quadric equation. Of course, the principle can be extended to multicubic, multiquartic, and high-degree approaches. MG had been used in gravity field such as gravity anomalies, geoidal indulations and deflections of the vertical and geodesy field like crustal movement. In China, Huang Liren et al. (1993) described the application of MG in research on crustal vertical movement. Liu Jingnan et al. (2001) used MG to establish crustal movement speed field.

Using one-dimensional time tj to replace two-dimensional position (xj, yj), then equation (1) becomes:

 $f\left(t \right) = \sum\limits_{j = 1}^m {{c_j}{q_j}(t, {t_j})} = QC$ (2)

Using the method of least squares to determine the associated coefficient C. The error equation is:

 $L = QC - L$ (3)

L is a matrix of time series data.

Thus, the normal equation is:

 $NC + W = 0$ (4)

The solution is

 $C = - {N^{ - 1}}W$ (5)

Let the Q become

 $Q = {\rm{ }}\left[ {\begin{array}{*{20}{c}} {{q_{1, 1}}}& \cdots &{{q_{m, 1}}}\\ {{q_{1, 2}}}& \cdots &{{q_{m, 2}}}\\ \vdots&\ddots&\vdots \\ {{q_{1, j}}}& \cdots &{{q_{m, j}}} \end{array}} \right]\;\;\left({i =, 1, 2, \cdots, n} \right)$ (6)

Where n is number of the data, N=QTQ and W=QTL, as long as N is a nonsingular matrix, we can figure out the associated coefficient C. The mean square error is

 $M = \pm \sqrt {\frac{{\sum {{V^{\rm{T}}}V} }}{{n - m}}}$ (7)

It reflects the accuracy of the fitting.

1.2 The Selection of Kernel Function Parameters

We used multiquadric equations to fit concrete borehole strain time series data.The minute data of multi-component borehole instrument installed in Zhaotong station located in Yunnan Province between January 1, 2014 and January 3, 2014 proved to be stable and continuous, given that the has been used for more than ten years. After relative calibration and conversion according to Qiu Zehua et al. (2009), the minute strain change data of EW component and NS component were obtained.

 Fig. 1 Strain time series data of Zhaotong station

The experiment has not yet produced definite conclusions about which class of operational mode best represents any particular type of surfaces. However, good experimental results for topography have been obtained from smooth and harmonic models. For the time series data, a Gaussian function based on (8) is appropriate, although it is not necessarily the optimum.

 ${q_j}(t, {t_j}) = {{\rm{e}}^{ - {{\left({\frac{{t - {t_j}}}{k}} \right)}^a}}}$ (8)

in equation (8), a is power exponent, k is smooth parameter.

Parameter a and k determine the model function. The model function, the function points and the interval of points influence the fitting result together.

According to Qi Na et al. (2010), with the increasing of the smooth parameter k, the significance of simple point data reduces. For the time series data have thousands of data points, biggish k is appropriate. Thus, we set the smooth parameter k equal to 1000. Setting the power exponent a by incrementing 0.05 as let the initial number equal to 0.8.

The function points locate uniformly in the time series data, the interval of points takes 60, 90, 120, 180, 240 and 300 minutes, respectively. With the model function and the determined function points, equations of time series data can be defined. Then the mean square errors can be figured out to reflect the fitting accuracy.

According to Fig. 2 and Fig. 3, when a value is among 0.5-3.5 and interval of function points value is less than 240m, the mean square error M increases as the interval grows up for the same a. With the same interval, M increases as a grows up in most block. But when the a value is nearly integer 2 or 3, M increases sharply, indicating that the fitting becomes unstable, so we should not choose an integer power exponent when using the Gaussian function. When the a value is more than 3.5, a increases sharply with a growing up, indicating that the fitting becomes unstable. With the interval decrease, the critical a for unstable fitting decreases. When the interval values are more than 180m, the fitting is stable until a values reach more than 3.5, but when the interval values are between 90m and 120m, the critical a reduces to about 3, when the interval values are less than 60m, the critical a reduces to near 2.5, indicating that as M decreases with the interval abates, the stability of fitting reduces too, so we should choose the interval of function points according to the concrete situation.

 Fig. 2 The mean square errors of EW

 Fig. 3 The mean square errors of NS

In summary, let a values equal 2.6, and setting the interval equal to 90m is appropriate. Here M for EW component is 0.0039 and for NS component is 0.0036.

2 ANALYSIS OF FITTING RESULTS 2.1 The Fitting Results of MG

The results show that the fitting effect is controlled by the interval of the Kernel function points. In fact, the length of the Kernel function point interval mainly reflects the different frequency domain of the information. The longer the length of the Kernel function point interval is the more information tends to appear at low frequencies, and vice versa. The power exponent affects the fitting effect of the same information in the frequency domain. With a larger exponent, the influence of the single data point on the fitting result is larger. Placed in the borehole for observation, the borehole strain instrument cuts off many natural and human disturbances. So, in addition to strain tide and the band of crustal strain information, the observed data is also mixed with a large amount of different frequency interference and noise information, so the power exponent should not be too large.

 Fig. 4 The relationship between point interval and fitting errors

Fig. 3 shows the relationship between the point interval and the error in fitting, which may indicate the proportion of error in the different spectral ranges. The overall trend of the EW direction and NS direction error increases with the increase of length of point interval, reflecting that the borehole strain observation data is widely distributed in frequency domain, which accords with the characteristics of borehole strain observation. When the interval of point is less than 90m (1.5h), the variation of mean error is small, and the mean error is obviously increased in 90m-240m, which indicates that the interference and noise information in the observed time series data are mainly concentrated in the period of 90m-240m (1.5-4h). It can be seen from the observation log of the station that there was no large earthquake or human disturbance in this period, interference and noise information for the period should be mainly from short-term changes in the surrounding environment. The error of EW direction is more stable and the rate of change is smaller in the interval of 240m or above, while the variation rate of error in NS is more than 60m-90m in the interval of more than 240m, which may reflect the trend of crustal movement in the NS direction.

The mean error in NS direction is slightly smaller than that of EW, which may indicate that the observation environment in the NS direction is better than that of the EW direction, or the trend of some low-frequency long-period crustal movement in the NS direction represents the main influencing factor on the strain change in this direction; at the same time, in the EW direction there is no tendency of strain change, so the error in the background is slightly larger.

Fig. 5 and Fig. 6 show the fitting data and the corresponding residuals. From the figure we can see that the mean residuals of EW direction and NS direction are all within 0.01, the fitting result is good, and only the remnant of the two ends is larger which is due to the elimination of end effects. Taking a section of data outside the boundary to participate in the fitting calculation would help to eliminate the end effects.

 Fig. 5 The fitting results of EW time series and the corresponding residuals

 Fig. 6 The fitting result of NS series data and the corresponding residuals

This method can be used to interpolate and preprocess the time series data. Fitting the residual as the judgment factor, it could also be used in abnormal recognition and automatic alarm system. In addition, information separation, trend analysis and other calculations become possible, for the mathematical analysis formula of the time series data is obtained.

2.2 Comparing with Polynomial Approximations

The polynomial fitting method is used to deal with the time series data of Zhaotong station. The results of fitting with 21 orders polynomials are shown in Fig. 7. The mean errors are 0.0062 (in EW direction) and 0.0076 (in NS direction). Comparing the fitting results with MG, we can see:

 Fig. 7 The polynomial fitting data and residual of Zhaotong station

Firstly, the accuracy of the MG method is higher than that of the polynomial fitting method as the mean error is about half of the polynomial fitting method. The fitting accuracy of the polynomial fitting method decreases rapidly with the order decreasing. When the order is 10, the error in the fitting increases to about 0.08. The fitting precision of MG is less affected by the change of the interval and the power, the error of the fitting is less than 0.02 when the length of interval is between 60m-180m and the power is between 2-3.5, reflecting that MG has higher computational stability.

Secondly, the residual curve of polynomial fitting is similar to that of the original data, and residuals with large original data changes are also large too, which indicates that the fitting accuracy of the polynomial is affected by the shape of the graph. While the residuals of MG are nearly all the same over the whole period and are less affected by the shape of the figure, the reliability is higher than that of the polynomial fitting.

Thirdly, both methods have a more rigorous theoretical basis, while the calculation is relatively simple, suitable for computer automatic processing, but we need to be careful to avoid endpoint effects.

3 SUMMARY

In this paper, the method of multi-Kernel functions was used to fit time series data of Zhaotong multi-component borehole strain from January 1 to January 3 of 2014. We discussed the parameter selection according to the error analysis, obtained mathematical analysis formula of the fitting function, investigated its physical meaning and made a comparison with the polynomial fitting result.

The analysis shows that the Zhaotong station has a stable observation environment, the time series data has abundant information, wide frequency domain, less human disturbance and good quality. The noise period of the environment is mainly concentrated in 90m-240m (1.5-4h). There may be a long-term crustal movement trend in the NS direction.

In summary, the MG method has the characteristics of high accuracy, simple calculation, reliability and full functioning in borehole strain time series data fitting and information separation. Compared with the polynomial fitting method, it also has the following advantages: the extraction and separation of different frequency is easy to complete; analytical formulae could be mathematically obtained for further analysis; results are stable and fitting is not affected by the shape of the figure. All in all, the multiquadric function fitting method is a good method to deal with continuous borehole strain observation time series data.

REFERENCES
 Hardy R.L. Multiquadric equations of topography and other irregular surfaces[J]. J. Geophys. Res., 1971, 76: 1905–1915. DOI:10.1029/JB076i008p01905. Hardy R.L. The Application of Multi-quadric Equations and Point Mass Anomaly Models to Crustal Movement Studies[M]. NOAA Technical Report, 1978 Liu Jingnan, Shi Chuang, Yao Yibin, et al. Hardy function interpolation and its applications to the establishment of crustal movement speed field[J]. Geomatics and Information Science of Wuhan University, 2001, 26(6): 500–503+508. Liu Jingnan, Yao Yibin, Shi Chuang. Method for establishing the speed field model of crustal movement in China[J]. Geomatics and Information Science of Wuhan University, 2002, 27(4): 331–336. Qiu Zehua, Kang Baoxiang, Tang Lei. Conversion and application of 4-component borehole strainmeter data[J]. Earthquake, 2009, 29(04): 83–89. Qiu Zehua, Shi Yaolin. Development of borehole strain observation outside China[J]. Acta Seismologica Sinica, 2004, 26(Suppl.1): 162–168+176. Qi Na, Hu Liangbai. The research of choosing smoothing factor in the multi quadric method[J]. Journal of Geotechnical Investigation & Surreying, 2010, 38(9): 77–79. Tao Bengzhao, Yao Yibin. Parameter estimation based on multi-quadric collocation model[J]. Geomatics and Information Science of Wuhan University, 2003, 28(5): 547–550. Yang Bo, Zhang Fengxiang, Han Yueping. Method of multi-Kernel function and its application in GPS times series data processing[J]. Journal of Geodesy and Geodynamics, 2010, 30(04): 137–141. Yang Guohua, Huang Liren. Primary numerical study of several characteristics about multi-surface function in fitting method of velocity surface[J]. Crustal Deformation and Earthquake, 1990, 10(4): 70–82.

1. 天津市地震局，天津 300201;
2. 中国地震局地震研究所，中国地震局地震大地测量重点实验室，武汉 430071