A Performance Comparison of Various Bootstrap Methods for Diffusion Processes

In this paper, we compare the finite sample performances of various bootstrap methods for diffusion processes. Though diffusion processes are widely used to analyze stocks, bonds, and financial derivatives, hypothesis tests based on them are known to suffer from severe size distortions. While there are many bootstrap methods applicable to diffusion models to reduce such size distortions, their finite sample performances are yet to be investigated. We perform a Monte Carlo simulation comparing the finite sample properties, and our results show that the strong Taylor approximation method produces the best performance, followed by the Hermite expansion method.


Introduction
Markov processes play a central role in financial analyses, granted that prices efficiently reflect information in the market. Diffusion processes are continuous-time Markov processes that have a semi-martingale property, and they have been widely used to model stocks, bonds, and financial derivatives (Chan et al., 1992). However, hypothesis tests based on diffusion models are known to suffer from severe size distortions. While there are many bootstrap methods applicable to diffusion models to reduce such size distortions, their finite sample performances are yet to be investigated. Thus, we compare their performance by using Monte Carlo simulations.
It is well known that the bootstrap enhances finite sample performance in various statistical analyses, such as parameter estimation and hypothesis testing. For example, the bootstrap can be used to correct biases in estimation, as seen in Tang & Chen (2009). However, bootstrapping diffusion processes is not straightforward because of the dependence structure residing in the time series data (Kreiss & Lahiri, 2012; Horowitz, 2003; Härdle et al., 2003; Pan & Politis, 2016a; Pan & Politis, 2016b). There are various methods for circumventing this problem. For example, when the parametric dependence structure of the diffusion model is well known and easy to construct, we can generate the bootstrap samples using the parametric bootstrap. Conversely, if the parametric structure of the model is unknown or difficult to construct, we may need to utilize nonparametric approaches. In this study, we consider various parametric and nonparametric bootstrap resampling methods used in practice.
Among the parametric approaches, the strong Taylor approximation is straightforward and easy to implement, so it is preferred by many researchers. Having been used for a long time, it is still one of the most widely implemented methods, and many extensions are still proposed, as seen in Gyöngy & Rásonyi (2011) and Mikulevičius & Zhang (2015). A relatively new approach utilizes the Hermite expansion suggested by Aït-Sahalia (1999). Due to its sound theoretical background and good performance in estimation, parameter estimation using the Hermite expansion has gained popularity since its introduction. In this study, we utilize the Hermite expansion method to generate bootstrap samples.
Nonparametric approaches utilize a nonparametrically estimated conditional distribution. The conditional distribution contains essential characteristics of the diffusion process; therefore, it plays an important role in various diffusion analyses, as seen in Chen et al. (2008) and Bhardwaj et al. (2008). Due to its versatility and robustness to misspecification, the nonparametric approach is also preferred by many researchers. As for the bootstrap resampling, Horowitz (2003) suggested using the Markov conditional bootstrap (MCB), in which random samples are generated from the nonparametrically estimated conditional distributions. These parametric and nonparametric approaches have their own strengths and weaknesses, and their finite sample performances in hypothesis testing are yet to be investigated. Since the MCB outperforms the first-order asymptotics in our simulation, it is worthwhile to consider the MCB when the nonparametric method is inevitable.
In Section 2, we introduce various bootstrap resampling methods that generate samples from either parametrically or nonparametrically constructed transition densities. Section 3 presents the results of a Monte Carlo simulation on the performance of the bootstrap methods. Section 4 presents concluding remarks.

Various Bootstrap Methods for Diffusion Models
In this section, we introduce various bootstrap resampling methods that can be applied to diffusion models. For the analysis, we consider a stationary diffusion process X whose stochastic differential equation (SDE) is given as the following: dXt = μ(Xt)dt + σ(Xt)dWt, where W is the standard Brownian motion.
Since closed-form expressions of exact transition densities are not generally available, we first consider the strong Taylor approximation. The Euler and Milstein schemes are popular strong Taylor approximation methods. In addition to these, we also consider other more sophisticated methods, such as the Heun method and the order 1.5 strong Taylor scheme. Furthermore, we also consider an approximation utilizing the Hermite expansion of the transition density, as suggested by Aït-Sahalia (1999). Although it is one of the most popular methods used for parameter estimation, it is not very common for random sample generation. Lastly, we consider the MCB, a nonparametric bootstrap method proposed by Horowitz (2003). For the financial analysis using diffusion models, it is worthwhile to consider using nonparametric methods, since parametric diffusion models are often imprecisely specified for simplification.

Exact Transition Density:
In general, closed-form expressions of exact transition densities are not available except for very few diffusion processes. For stationary diffusions, only the following two models are known to have closed-form transition densities.
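• Vasicek model (Ornstein-Uhlenbeck process), attributed to Vasicek (1977). The SDE of the Vasicek model is given as $dX_t = \kappa(\alpha - X_t)\,dt + \sigma\,dW_t$ for $\kappa, \sigma \in \mathbb{R}_+$ and $\alpha \in \mathbb{R}$. The conditional distribution of $X_\Delta \mid X_0 = x$ is the normal distribution, whose density function is
$$p(\Delta, y \mid x) = \phi\big(y;\, m(\Delta, x),\, s^2(\Delta)\big),$$
where $\phi(\cdot\,; m, s^2)$ is the density function of the normal distribution with mean $m$ and variance $s^2$, and the conditional mean and variance are, respectively,
$$m(\Delta, x) = \alpha + (x - \alpha)e^{-\kappa\Delta}$$
and
$$s^2(\Delta) = \frac{\sigma^2}{2\kappa}\left(1 - e^{-2\kappa\Delta}\right).$$
• The Cox-Ingersoll-Ross model (Feller's square root process), attributed to Cox et al. (1985). The SDE is given as $dX_t = \kappa(\alpha - X_t)\,dt + \sigma\sqrt{X_t}\,dW_t$ for $\kappa, \alpha, \sigma \in \mathbb{R}_+$ such that $2\kappa\alpha > \sigma^2$. In this case, $X_\Delta \mid X_0 = x$ follows, up to a scale factor, a noncentral $\chi^2$-distribution. The transition density is
$$p(\Delta, y \mid x) = c\,e^{-u-v}\left(\frac{v}{u}\right)^{q/2} I_q\!\left(2\sqrt{uv}\right),$$
where $c = 2\kappa / \{\sigma^2(1 - e^{-\kappa\Delta})\}$, $u = c\,x\,e^{-\kappa\Delta}$, $v = cy$, $q = 2\kappa\alpha/\sigma^2 - 1$, and $I_q$ is the modified Bessel function of the first kind of order $q$.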
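Because the Vasicek transition is Gaussian, a discretely observed path can be drawn without any discretization error. The following is a minimal sketch of such exact sampling, assuming the standard conditional mean α + (x − α)e^{−κΔ} and variance σ²(1 − e^{−2κΔ})/(2κ). The function names and the sampling interval Δ = 1/12 are illustrative assumptions and are not taken from the paper.

```python
import math
import random

def ou_transition_moments(x, kappa, alpha, sigma, delta):
    """Mean and variance of X_{t+delta} given X_t = x for the Vasicek (OU) model."""
    decay = math.exp(-kappa * delta)
    mean = alpha + (x - alpha) * decay
    var = sigma ** 2 * (1.0 - decay ** 2) / (2.0 * kappa)
    return mean, var

def ou_exact_path(x0, kappa, alpha, sigma, delta, n, rng):
    """Draw X_0, X_delta, ..., X_{n*delta} from the exact Gaussian transition."""
    path = [x0]
    for _ in range(n):
        mean, var = ou_transition_moments(path[-1], kappa, alpha, sigma, delta)
        path.append(rng.gauss(mean, math.sqrt(var)))
    return path

# Illustrative values; delta = 1/12 is an assumption made for this sketch.
rng = random.Random(0)
path = ou_exact_path(0.6, 1.0, 0.6, 0.1, 1.0 / 12.0, 300, rng)
```

In a parametric bootstrap, the true parameter values would be replaced by their estimates before calling `ou_exact_path`.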

Strong Taylor Approximation:
A strong Taylor approximation can be used for bootstrap sampling: once we obtain the parameter estimates of the diffusion process, we can generate a stochastic process from the approximation. To define the strong Taylor approximation, we first introduce the absolute error criterion $E(|X_T - Y(T)|)$, where $Y$ is an approximation of $X$. Next, $Y^\delta$, a discrete-time approximation of $X$ on the time discretization $(\tau)_\delta = \{\tau_n : n = 0, 1, \ldots\}$ with maximum step size $\delta$, is said to converge strongly with order $\gamma$ if there exist constants $C$ and $\delta_0$ such that
$$E\left(|X_T - Y^\delta(T)|\right) \le C\delta^\gamma$$
for any $\delta \in (0, \delta_0)$. This definition can be naturally understood as a generalization of the convergence criterion for deterministic differential equations. A higher value of $\gamma$ implies a faster rate of convergence. For more details, see Platen (1999).
As a further convergence concept, a discrete-time approximation $Y^\delta$ is strongly consistent if, for some nonnegative function $c(\delta)$ with $\lim_{\delta \downarrow 0} c(\delta) = 0$, we have
$$E\left(\left|E\left(\frac{Y^\delta_{n+1} - Y^\delta_n}{\Delta_n}\,\Big|\,\mathcal{A}_{\tau_n}\right) - \mu(Y^\delta_n)\right|^2\right) \le c(\delta)$$
and
$$E\left(\frac{1}{\Delta_n}\left|Y^\delta_{n+1} - Y^\delta_n - E\left(Y^\delta_{n+1} - Y^\delta_n \mid \mathcal{A}_{\tau_n}\right) - \sigma(Y^\delta_n)\,\Delta W_n\right|^2\right) \le c(\delta)$$
for all $n = 0, 1, \ldots$, where $\{\mathcal{A}_{\tau_n}\}$ is a family of σ-algebras, $\Delta_n = \tau_{n+1} - \tau_n$, and $\Delta W_n = W_{\tau_{n+1}} - W_{\tau_n}$. The above two conditions roughly imply that $Y^\delta$ converges to $X$ in terms of the mean and variance. Thus, if an approximation is strongly consistent, then the two processes are pathwise close to each other. Under some regularity conditions such as the Lipschitz condition and the linear growth bound condition, we formulate the following theorem.
When we derive a discrete-time approximation in the strong convergence criterion, we refer to it as a strong Taylor approximation. As shown in the following, the convergence order is determined by how many terms we include in the expansion.
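The Euler-Maruyama scheme, also called the Euler scheme, is the simplest strong Taylor approximation, and in general, it attains the order of strong convergence $\gamma = 0.5$. The Euler scheme is given by
$$Y_{n+1} = Y_n + \mu(Y_n)\,\Delta + \sigma(Y_n)\,\Delta W_n,$$
where $\Delta = \tau_{n+1} - \tau_n$ is the length of the interval $[\tau_n, \tau_{n+1}]$ and $\Delta W_n = W_{\tau_{n+1}} - W_{\tau_n}$ is the increment of the standard Brownian motion $W$ on $[\tau_n, \tau_{n+1}]$.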
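As an illustration, the Euler recursion Y_{n+1} = Y_n + μ(Y_n)Δ + σ(Y_n)ΔW_n can be sketched as follows. The function name `euler_maruyama`, the callables `mu` and `sigma`, and the step size Δ = 1/12 are hypothetical choices made for this sketch only.

```python
import math
import random

def euler_maruyama(x0, mu, sigma, delta, n, rng):
    """Simulate Y_0, ..., Y_n with Y_{k+1} = Y_k + mu(Y_k)*delta + sigma(Y_k)*dW_k."""
    path = [x0]
    for _ in range(n):
        y = path[-1]
        dw = rng.gauss(0.0, math.sqrt(delta))  # Brownian increment over one step
        path.append(y + mu(y) * delta + sigma(y) * dw)
    return path

# OU example: drift kappa*(alpha - y) and constant diffusion coefficient.
kappa, alpha, sig = 1.0, 0.6, 0.1
path = euler_maruyama(
    0.6,
    lambda y: kappa * (alpha - y),
    lambda y: sig,
    1.0 / 12.0,   # assumed step size, not stated in the paper
    300,
    random.Random(1),
)
```

For bootstrap sampling, `mu` and `sigma` would be built from the estimated parameters rather than the true ones.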
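For the Euler scheme, we evaluate the right-hand side of the equation at the beginning of each interval $\tau_n < t < \tau_{n+1}$. We can obtain a more accurate approximation when we include more information about the process from elsewhere, for example, when we use the average of the values at both $\tau_n$ and $\tau_{n+1}$. In this case, we have
$$Y_{n+1} = Y_n + \tfrac{1}{2}\left\{\mu(Y_n) + \mu(Y_{n+1})\right\}\Delta + \sigma(Y_n)\,\Delta W_n.$$
This method is not feasible because the unknown quantity $Y_{n+1}$ appears on both sides. To address this issue, we use the Euler scheme to replace the term on the right-hand side. Accordingly, we obtain that
$$Y_{n+1} = Y_n + \tfrac{1}{2}\left\{\mu(Y_n) + \mu(\bar{Y}_{n+1})\right\}\Delta + \sigma(Y_n)\,\Delta W_n, \qquad \bar{Y}_{n+1} = Y_n + \mu(Y_n)\,\Delta + \sigma(Y_n)\,\Delta W_n,$$
or
$$Y_{n+1} = Y_n + \tfrac{1}{2}\left\{\mu(Y_n) + \mu\big(Y_n + \mu(Y_n)\,\Delta + \sigma(Y_n)\,\Delta W_n\big)\right\}\Delta + \sigma(Y_n)\,\Delta W_n.$$
This approximation is called the improved Euler scheme or the Heun method.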
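The predictor-corrector idea can be sketched as below. This is only one possible reading: it assumes that the averaging is applied to the drift term and that the unknown endpoint is replaced by an Euler predictor. The name `heun_step` is hypothetical.

```python
import math

def heun_step(y, mu, sigma, delta, dw):
    """One improved-Euler (Heun) step with an Euler predictor for the endpoint."""
    predictor = y + mu(y) * delta + sigma(y) * dw        # Euler guess of Y_{n+1}
    drift_avg = 0.5 * (mu(y) + mu(predictor))            # average drift at both ends
    return y + drift_avg * delta + sigma(y) * dw

# Deterministic check (sigma = 0): dX = -X dt, whose exact solution is exp(-t).
y_heun = heun_step(1.0, lambda y: -y, lambda y: 0.0, 0.1, 0.0)
y_euler = 1.0 + (-1.0) * 0.1
exact = math.exp(-0.1)
```

In this noise-free case the Heun value 0.905 is closer to the exact value e^{−0.1} than the Euler value 0.9, which illustrates the gain from using information at both ends of the interval.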
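The Milstein scheme is an order 1.0 strong Taylor approximation method. To obtain the Milstein scheme, we add the term $\tfrac{1}{2}\sigma\sigma'\{(\Delta W_n)^2 - \Delta\}$ from the Ito-Taylor expansion to the Euler scheme:
$$Y_{n+1} = Y_n + \mu\,\Delta + \sigma\,\Delta W_n + \tfrac{1}{2}\,\sigma\sigma'\left\{(\Delta W_n)^2 - \Delta\right\},$$
where $\mu = \mu(Y_n)$, $\sigma = \sigma(Y_n)$, and $\sigma' = \sigma'(Y_n)$.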
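To illustrate the difference in strong convergence order, the sketch below estimates the absolute error criterion E|X_T − Y(T)| by Monte Carlo for the Euler and Milstein schemes. It uses geometric Brownian motion dX_t = aX_t dt + bX_t dW_t purely because its exact solution is available along the same Brownian path; this test model, the function names, and all numerical values are assumptions of the sketch and are not part of the paper's experiment.

```python
import math
import random

def milstein_step(y, mu, sigma, dsigma, delta, dw):
    """One Milstein step; dsigma is the derivative of the diffusion coefficient."""
    return (y + mu(y) * delta + sigma(y) * dw
            + 0.5 * sigma(y) * dsigma(y) * (dw * dw - delta))

def strong_errors(a, b, x0, horizon, n_steps, n_paths, rng):
    """Monte Carlo estimates of E|X_T - Y_T| for Euler and Milstein on dX = aX dt + bX dW."""
    mu = lambda y: a * y
    sigma = lambda y: b * y
    dsigma = lambda y: b
    delta = horizon / n_steps
    err_euler = err_milstein = 0.0
    for _ in range(n_paths):
        y_e = y_m = x0
        w = 0.0
        for _ in range(n_steps):
            dw = rng.gauss(0.0, math.sqrt(delta))
            w += dw
            y_e = y_e + mu(y_e) * delta + sigma(y_e) * dw
            y_m = milstein_step(y_m, mu, sigma, dsigma, delta, dw)
        exact = x0 * math.exp((a - 0.5 * b * b) * horizon + b * w)
        err_euler += abs(exact - y_e)
        err_milstein += abs(exact - y_m)
    return err_euler / n_paths, err_milstein / n_paths

e_err, m_err = strong_errors(0.5, 0.8, 1.0, 1.0, 32, 1000, random.Random(2))
```

When the diffusion coefficient is constant, `dsigma` is zero and the Milstein step reduces to the Euler step, which is the case for the OU process.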
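An even more accurate approximation can be obtained by including more stochastic Taylor expansion terms. These additional terms consist of stochastic integrals that carry more information about the process, and they play an important role in improving the accuracy of the approximation since they represent the difference between stochastic and deterministic differential equations. To obtain a strong Taylor scheme of order $\gamma = 1.5$, we add more terms to the Milstein scheme utilizing the Ito lemma. Following Kloeden & Platen (1992), the scheme is
$$Y_{n+1} = Y_n + \mu\Delta + \sigma\Delta W_n + \tfrac{1}{2}\sigma\sigma'\left\{(\Delta W_n)^2 - \Delta\right\} + \mu'\sigma\,\Delta Z_n + \tfrac{1}{2}\left(\mu\mu' + \tfrac{1}{2}\sigma^2\mu''\right)\Delta^2 + \left(\mu\sigma' + \tfrac{1}{2}\sigma^2\sigma''\right)\left\{\Delta W_n\,\Delta - \Delta Z_n\right\} + \tfrac{1}{2}\sigma\left(\sigma\sigma'' + (\sigma')^2\right)\left\{\tfrac{1}{3}(\Delta W_n)^2 - \Delta\right\}\Delta W_n,$$
where $\mu$, $\sigma$, and their derivatives are evaluated at $Y_n$, and $\Delta Z_n = \int_{\tau_n}^{\tau_{n+1}}\int_{\tau_n}^{s} dW_r\,ds$ is normally distributed with mean zero, variance $\Delta^3/3$, and covariance $E(\Delta Z_n\,\Delta W_n) = \Delta^2/2$.

Hermite Expansion:
We next consider the approximation based on the Hermite expansion of the transition density, as suggested by Aït-Sahalia (1999). For this method, we apply the Lamperti transformation on the original diffusion process so that the transition density of the transformed model becomes more suitable for the Hermite expansion. In the brief overview of the transformations below, let $D_X = (\underline{x}, \bar{x})$ be the domain of the process $X$. For the approximation, we first transform $X$ into $Y$ such that
$$Y_t = f(X_t) = \int_{x^\#}^{X_t} \frac{du}{\sigma(u)},$$
where $x^\#$ is an arbitrary point in the domain $D_X$. Note that as $\sigma > 0$, the transformation function $f$ is strictly increasing and invertible. We denote $D_Y = (\underline{y}, \bar{y})$ as the domain of $Y$, where $\underline{y} = \lim_{x \downarrow \underline{x}} f(x)$ and $\bar{y} = \lim_{x \uparrow \bar{x}} f(x)$. Then, from Ito's lemma, we obtain
$$dY_t = \mu_Y(Y_t)\,dt + dW_t,$$
where
$$\mu_Y(y) = \frac{\mu(f^{-1}(y))}{\sigma(f^{-1}(y))} - \frac{1}{2}\,\sigma'(f^{-1}(y)).$$
Next, we transform $Y$ into $Z$ such that
$$Z_t = \Delta^{-1/2}(Y_t - y_0),$$
where $y_0 = f(x_0)$. For the processes $X$, $Y$, and $Z$, Aït-Sahalia (1999) derived the relations between their transition densities as follows:
$$p_Y(\Delta, y \mid y_0) = \sigma(f^{-1}(y))\; p_X\big(\Delta, f^{-1}(y) \mid f^{-1}(y_0)\big)$$
and
$$p_Z(\Delta, z \mid y_0) = \Delta^{1/2}\, p_Y\big(\Delta, \Delta^{1/2} z + y_0 \mid y_0\big).$$
We approximate the transition density of $Z$ as
$$p_Z^{(J)}(\Delta, z \mid y_0) = \phi(z) \sum_{j=0}^{J} \eta_j(\Delta, y_0)\, H_j(z),$$
where $\phi$ is the standard normal density and
$$\eta_j(\Delta, y_0) = \frac{1}{j!} \int_{-\infty}^{\infty} H_j(z)\, p_Z(\Delta, z \mid y_0)\, dz.$$
Here $H_j$ denotes the classical Hermite polynomials, which are given by
$$H_j(z) = e^{z^2/2}\, \frac{d^j}{dz^j}\left[e^{-z^2/2}\right], \qquad j \ge 0.$$
Finally, we derive the approximated transition densities of $Y$ and $X$ as
$$p_Y^{(J)}(\Delta, y \mid y_0) = \Delta^{-1/2}\, p_Z^{(J)}\big(\Delta, \Delta^{-1/2}(y - y_0) \mid y_0\big)$$
and
$$p_X^{(J)}(\Delta, x \mid x_0) = \sigma(x)^{-1}\, p_Y^{(J)}\big(\Delta, f(x) \mid f(x_0)\big).$$
Then the following convergence theorem holds under the conditions assumed in Aït-Sahalia (1999).
Theorem 1 (Aït-Sahalia, 1999). There exists $\bar{\Delta} > 0$ such that, for every $\Delta \in (0, \bar{\Delta})$ and $(x, x_0) \in D_X^2$, $p_X^{(J)}(\Delta, x \mid x_0) \to p_X(\Delta, x \mid x_0)$ as $J \to \infty$.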
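For the practical implementation of Theorem 1, we first compute $p_Z^{(J)}(\Delta, z \mid y_0)$ for a given $J$. To obtain the coefficients $\eta_j(\Delta, y_0)$ for $j = 0, 1, \ldots, J$, we have
$$\eta_j(\Delta, y_0) = \frac{1}{j!}\, E\left[H_j\big(\Delta^{-1/2}(Y_{t+\Delta} - y_0)\big) \,\Big|\, Y_t = y_0\right].$$
To calculate this expectation, we use the following lemma, which is obtained from the Taylor approximation.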
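Lemma 1 (Aït-Sahalia, 1999). Let $g$ be a function such that $g$ and all its derivatives have at most exponential growth. Then, for $\Delta \in (0, \bar{\Delta})$ with some $\bar{\Delta} > 0$ and $y_0 \in \mathbb{R}$, there exists $\delta \in [0, \Delta]$ such that
$$E\left[g(Y_{t+\Delta}) \mid Y_t = y_0\right] = \sum_{j=0}^{J} A^j \cdot g(y_0)\, \frac{\Delta^j}{j!} + E\left[A^{J+1} \cdot g(Y_{t+\delta}) \mid Y_t = y_0\right] \frac{\Delta^{J+1}}{(J+1)!},$$
where $A$ is the infinitesimal operator of the diffusion $Y$ defined by
$$A : g \mapsto \mu_Y(\cdot)\, \frac{\partial g}{\partial y} + \frac{1}{2}\, \frac{\partial^2 g}{\partial y^2},$$
with $\mu_Y$ denoting the drift of $Y$, and $A^j \cdot g(y_0)$ means $A$ applied $j$ times to the function $y \mapsto g(y)$ and evaluated at $y = y_0$.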
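Once the coefficients η_j are available, evaluating the truncated expansion φ(z) Σ_j η_j H_j(z) is mechanical. The sketch below assumes the convention H_j(z) = e^{z²/2} d^j/dz^j e^{−z²/2}, under which H_j equals (−1)^j times the probabilists' Hermite polynomial. The function names are hypothetical, and the coefficients are taken as given inputs; the sketch does not compute them.

```python
import math

def hermite(j, z):
    """H_j(z) under the convention H_j(z) = exp(z^2/2) d^j/dz^j exp(-z^2/2).

    Uses H_j = (-1)^j He_j, where He_j is the probabilists' Hermite polynomial
    satisfying the recursion He_{k+1}(z) = z*He_k(z) - k*He_{k-1}(z).
    """
    if j == 0:
        return 1.0
    he_prev, he = 1.0, z
    for k in range(1, j):
        he_prev, he = he, z * he - k * he_prev
    return (-1) ** j * he

def hermite_density(z, eta):
    """Evaluate phi(z) * sum_j eta[j] * H_j(z) for coefficients eta[0..J]."""
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return phi * sum(c * hermite(j, z) for j, c in enumerate(eta))

# With eta = [1.0] the expansion collapses to the standard normal density.
value_at_zero = hermite_density(0.0, [1.0])
```

Under this convention, the first polynomials are H_0(z) = 1, H_1(z) = −z, H_2(z) = z² − 1, and H_3(z) = −z³ + 3z.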
In practice, we need to consider how many terms should be included in this Taylor series. Aït-Sahalia (1999) suggested that one should first choose the order $J$ and then expand the expectation such that the approximation of the transition density is at most of order $\Delta^{J/2}$.

Markov Conditional Bootstrap:
Horowitz (2003) proposed the MCB to conduct bootstrap sampling for Markov processes, using a transition density to reproduce the dependence in time series data. We apply this idea by nonparametrically estimating the diffusion transition density. When it is applied to the diffusion model estimation, we expect in general that the nonparametric approaches will perform less accurately than the parametric approaches, since the nonparametric methods involve additional user-dependent factors. However, there are cases in which asymptotic expansion type approximations may not work well enough, so it is still worthwhile to check the performance of the nonparametric method. The MCB estimates the joint and marginal densities as follows:
$$\hat{p}_n(x, y) = \frac{1}{(n-1)\,h_n^2} \sum_{t=1}^{n-1} K\!\left(\frac{x - X_t}{h_n}\right) K\!\left(\frac{y - X_{t+1}}{h_n}\right)$$
and
$$\hat{p}_n(x) = \frac{1}{(n-1)\,h_n} \sum_{t=1}^{n-1} K\!\left(\frac{x - X_t}{h_n}\right),$$
where $K(\cdot)$ is the kernel function, $h_n$ is the bandwidth, and $X_1, \ldots, X_n$ are the observations. Bootstrap samples are then generated from the estimated conditional density $\hat{p}_n(y \mid x) = \hat{p}_n(x, y) / \hat{p}_n(x)$.
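A minimal sketch of drawing a bootstrap path from a kernel estimate of the conditional density is given below. It assumes a Gaussian product kernel, in which case the ratio of the joint to the marginal estimate is a mixture of N(X_{t+1}, h²) components with weights proportional to K((x − X_t)/h). The sketch samples from this mixture directly instead of using an accept-reject step. The rule-of-thumb bandwidth, the nearest-neighbor fallback, the AR(1) toy data, and all names are assumptions of the sketch.

```python
import math
import random
import statistics

def kernel_weight(u):
    """Gaussian kernel up to a constant; the constant cancels in the ratio."""
    return math.exp(-0.5 * u * u)

def mcb_draw_next(x, data, h, rng):
    """Draw from the kernel estimate of the density of X_{t+1} given X_t = x."""
    pairs = list(zip(data[:-1], data[1:]))
    weights = [kernel_weight((x - prev) / h) for prev, _ in pairs]
    if sum(weights) == 0.0:
        # x lies far outside the data: fall back to the nearest observed state.
        _, nxt = min(pairs, key=lambda p: abs(x - p[0]))
    else:
        _, nxt = rng.choices(pairs, weights=weights, k=1)[0]
    return rng.gauss(nxt, h)  # smoothing noise from the kernel in the y-direction

def mcb_path(data, h, rng):
    """Generate a bootstrap path of the same length as the data."""
    path = [data[0]]
    while len(path) < len(data):
        path.append(mcb_draw_next(path[-1], data, h, rng))
    return path

# Toy data from a Gaussian AR(1), used only to exercise the sampler.
rng = random.Random(4)
data = [0.6]
for _ in range(299):
    data.append(0.6 + 0.9 * (data[-1] - 0.6) + 0.03 * rng.gauss(0.0, 1.0))
h = 1.06 * statistics.stdev(data) * len(data) ** (-0.2)  # rule-of-thumb bandwidth
boot = mcb_path(data, h, rng)
```

The bandwidth is one of the user-dependent factors of the nonparametric approach, so a different choice of `h` would generally change the bootstrap samples.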

Monte Carlo Experiments
This section presents the results of the Monte Carlo simulation, which compares the numerical performance of the introduced bootstrap methods. To examine how well the bootstraps perform, we use the Ornstein-Uhlenbeck (OU) process: dXt = κ(α − Xt)dt + σdWt. We use this process for our simulation as it is one of the most widely used diffusion models in practice. Furthermore, its transition density function is known in closed form, and its sampling can be carried out without any cumbersome numerical approximation. We use the values κ = 1.0, α = 0.6, and σ = 0.1 and the sample size n = 300. Each experiment consists of 1000 Monte Carlo replications. The critical values are obtained from 300 bootstrap iterations, and the coefficients are estimated with the generalized method of moments (GMM). Though applying the exact maximum likelihood estimation is possible for this OU process, we utilize the GMM estimation because it is widely used in practice, whereas the exact maximum likelihood estimation is available only for a few diffusion models. In the simulation, we examine the performance of two-tailed t-tests with a significance level of 0.1.

GMM Moment Conditions and the Test Statistic:
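For the GMM estimation, we discretize our diffusion model as follows:
$$X_{t+\Delta} - X_t = \kappa(\alpha - X_t)\,\Delta + \varepsilon_{t+\Delta},$$
where $E[\varepsilon_{t+\Delta}] = 0$ and $E[\varepsilon_{t+\Delta}^2] = \sigma^2\Delta$, following Brennan & Schwartz (1988), Dietrich-Campbell & Schwartz (1986), and Sanders & Unal (2001). We let $\theta = (\kappa, \alpha, \sigma)'$. Given $\Delta$, the vector $f_t(X, \theta)$ is written as
$$f_t(X, \theta) = \begin{pmatrix} \varepsilon_{t+\Delta} \\ \varepsilon_{t+\Delta}\,X_t \\ \varepsilon_{t+\Delta}^2 - \sigma^2\Delta \\ (\varepsilon_{t+\Delta}^2 - \sigma^2\Delta)\,X_t \end{pmatrix}.$$
Denoting with $\theta_0$ the true but unknown value of $\theta$, we have $E[f_t(X, \theta_0)] = 0$. The GMM procedure estimates parameter values by finding the values that satisfy the sample version of the moment conditions, where $E[f_t(X, \theta)]$ is replaced with
$$g_T(X, \theta) = \frac{1}{T} \sum_{t=1}^{T} f_t(X, \theta).$$
Then the parameter estimates are given by the minimizer of the quadratic form
$$J_T(\theta) = g_T(X, \theta)'\, W_T(\theta)\, g_T(X, \theta),$$
where $W_T(\theta)$ is a positive-definite symmetric matrix. From the matrix differentiation, finding the minimum of $J_T(\theta)$ is equivalent to solving
$$D(\theta)'\, W_T(\theta)\, g_T(X, \theta) = 0,$$
where $D(\theta)$ is the Jacobian matrix of $g_T(X, \theta)$ with respect to $\theta$.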
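The sample moment vector and the GMM objective can be sketched as follows. The sketch assumes four moment functions in the style of Chan et al. (1992) for the discretized OU model, namely ε, εX_t, ε² − σ²Δ, and (ε² − σ²Δ)X_t with ε = X_{t+Δ} − X_t − κ(α − X_t)Δ. The function names and the identity default for the weighting matrix are illustrative assumptions, and the numerical minimization over θ is left to an optimizer of the reader's choice.

```python
def moment_vector(x, theta, delta):
    """Sample average g_T(X, theta) of the four assumed moment functions."""
    kappa, alpha, sigma = theta
    sums = [0.0, 0.0, 0.0, 0.0]
    for x_now, x_next in zip(x[:-1], x[1:]):
        eps = x_next - x_now - kappa * (alpha - x_now) * delta
        sq = eps * eps - sigma ** 2 * delta
        for i, f in enumerate((eps, eps * x_now, sq, sq * x_now)):
            sums[i] += f
    n_obs = len(x) - 1
    return [s / n_obs for s in sums]

def gmm_objective(x, theta, delta, weight=None):
    """Quadratic form g' W g; the identity matrix is used when no weight is given."""
    g = moment_vector(x, theta, delta)
    if weight is None:
        return sum(gi * gi for gi in g)
    return sum(g[i] * weight[i][j] * g[j] for i in range(4) for j in range(4))

# Noise-free toy series generated from the discretized model with sigma = 0.
series = [0.0]
for _ in range(20):
    series.append(series[-1] + 1.0 * (0.6 - series[-1]) * 0.1)
j_true = gmm_objective(series, (1.0, 0.6, 0.0), 0.1)
j_wrong = gmm_objective(series, (2.0, 0.6, 0.0), 0.1)
```

For the noise-free toy series, the objective is numerically zero at the generating parameters and strictly positive elsewhere, which is the property the minimizer exploits.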
The choice of the weighting matrix $W_T$ is also important since the performance of the GMM estimator $\theta_n$ depends on how we define $W_T$. Hansen (1982) showed that if we let $W_T(\theta) = S^{-1}(\theta)$, where $S(\theta) = E[f_t(X, \theta)\, f_t(X, \theta)']$, then the resulting GMM estimator achieves the smallest asymptotic variance. Denoting $S_0(\theta)$ as an estimator of $S(\theta)$, the asymptotic variance matrix of the GMM estimator is
$$\frac{1}{T}\left(D_0(\theta)'\, S_0^{-1}(\theta)\, D_0(\theta)\right)^{-1},$$
where $D_0(\theta)$ is the Jacobian matrix at the estimated values.
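We denote the consistent estimator of the asymptotic variance matrix of $\theta_n$ with $\Omega_n$; the $(r, r)$ component of $\Omega_n$ with $(\Omega_n)_{rr}$; and the $r$'th components of $\theta$ and $\theta_n$ with $\theta_r$ and $\theta_{nr}$, respectively. Thus, the t-statistic for the null hypothesis $H_0: \theta_r = \theta_{0r}$ is
$$t_{nr} = \frac{\theta_{nr} - \theta_{0r}}{(\Omega_n)_{rr}^{1/2}}.$$
To obtain the bootstrap versions of $t_{nr}$, we define $f_t^*$ using moment conditions such that
$$f_t^*(X^*, \theta) = f_t(X^*, \theta) - E^*[f_t(X^*, \theta_n)],$$
where $E^*$ is the bootstrap expectation and $X^*$ is a bootstrap sample. As in Hall & Horowitz (1996) and Andrews (2004), we apply the recentering technique to the bootstrap version since there is no $\theta$ such that $E^*[f_t(X^*, \theta)] = 0$ for an overidentified case. The bootstrap estimator of $\theta$, denoted by $\theta_n^*$, is obtained by replacing $f_t$ with $f_t^*$ and $X$ with $X^*$. The bootstrap version of $\Omega_n$, denoted by $\Omega_n^*$, is obtained by replacing $f_t$ with $f_t^*$, $X$ with $X^*$, and $\theta_n$ with $\theta_n^*$ in the expression for $\Omega_n$. The bootstrap version of the t-statistic is
$$t_{nr}^* = \frac{\theta_{nr}^* - \theta_{nr}}{(\Omega_n^*)_{rr}^{1/2}},$$
where $\theta_{nr}^*$ denotes the $r$'th component of $\theta_n^*$.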
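Given the bootstrap t-statistics from the bootstrap iterations, a critical value and a test decision can be obtained as in the sketch below. It assumes a symmetric two-tailed test, in which the critical value is the (1 − level) empirical quantile of the absolute bootstrap statistics. Whether the paper uses the symmetric or the equal-tailed version is not stated, and the function names are hypothetical.

```python
import math

def symmetric_bootstrap_critical_value(t_star, level):
    """(1 - level) empirical quantile of |t*| over the bootstrap replications."""
    abs_sorted = sorted(abs(t) for t in t_star)
    # Small tolerance guards against floating-point noise in (1 - level) * B.
    k = math.ceil((1.0 - level) * len(abs_sorted) - 1e-9) - 1
    return abs_sorted[max(k, 0)]

def covers(t_stat, t_star, level=0.1):
    """True when the test does not reject, i.e. the interval covers the null value."""
    return abs(t_stat) <= symmetric_bootstrap_critical_value(t_star, level)

# Toy bootstrap statistics, for illustration only.
toy_t_star = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
crit = symmetric_bootstrap_critical_value(toy_t_star, 0.1)
```

Averaging `covers` over the Monte Carlo replications gives the empirical coverage probability that is compared with the nominal level.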

Numerical Results:
Table 1 reports the results of the Monte Carlo simulation, listing the differences between the empirical and nominal coverage probabilities of the bootstrap tests for nominal 90% confidence intervals. For the comparison of the bootstrap performances, we only focus on the coverage probability of the drift term parameters κ and α, which are known to suffer from large size distortions. Since the Euler and Milstein schemes coincide with each other in the case of the OU process, the simulation of the Milstein scheme is not conducted. When the transition density is estimated by a nonparametric method or by the Hermite expansion, it is impossible to generate bootstrap samples with explicit formulae. Therefore, sampling is executed with the accept-reject method. To evaluate the accuracy of each bootstrap method, the summation of the absolute values of differences between the nominal and empirical coverage probabilities in each bootstrap test is shown at the bottom of the table.
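The accept-reject step can be sketched as follows for a density that can be evaluated pointwise but not sampled directly. The sketch assumes a uniform proposal on a bounded interval and a known upper bound on the density; the proposal actually used in the simulation is not stated in the paper. For illustration, the target is the exact Gaussian OU transition density truncated at six standard deviations, with an assumed Δ = 1/12.

```python
import math
import random

def accept_reject(density, lower, upper, bound, rng):
    """Draw one value from `density` restricted to [lower, upper].

    `bound` must satisfy density(y) <= bound for all y in the interval.
    """
    while True:
        y = rng.uniform(lower, upper)               # proposal draw
        if rng.uniform(0.0, bound) <= density(y):   # accept with prob density/bound
            return y

# Illustrative target: OU transition density given X_t = x (delta is assumed).
kappa, alpha, sigma, delta, x = 1.0, 0.6, 0.1, 1.0 / 12.0, 0.5
m = alpha + (x - alpha) * math.exp(-kappa * delta)
s = math.sqrt(sigma ** 2 * (1.0 - math.exp(-2.0 * kappa * delta)) / (2.0 * kappa))

def target(y):
    return math.exp(-0.5 * ((y - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

rng = random.Random(3)
draws = [accept_reject(target, m - 6 * s, m + 6 * s, target(m), rng)
         for _ in range(2000)]
```

With a uniform proposal over a wide interval, most proposals are rejected, so a proposal that follows the shape of the target more closely would reduce the computational cost.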
As seen in Table 1, the bootstraps utilizing the Hermite expansions of orders 1 and 2 outperform the first-order asymptotic test. However, they do not outperform the Euler approximation, which is unexpected. We conjecture that the accept-reject method brought about some inefficiency in the process of bootstrap sampling. Moreover, the bootstrap method with the best performance is the one utilizing the order 1.5 strong Taylor approximation. The higher the order of the strong Taylor approximation, the closer the empirical coverage probability of the bootstrap test is to the nominal level. The nonparametric MCB shows relatively larger errors, as it inevitably involves more factors to adjust, but it at least performs better than the first-order asymptotic test even for this simple OU model case.

Conclusion
This paper compares the performance of various bootstrap methods that are applicable to diffusion models. Among the various bootstrap methods, those using the idea of a strong Taylor approximation provide the most precise results. The bootstrap using the Hermite expansion fails to perform as well, possibly because of sampling errors in the accept-reject method. The nonparametric MCB shows relatively poor performance, but it still outperforms the first-order asymptotic test. Based on our results, we suggest using the strong Taylor approximation for bootstrapping diffusion processes. In addition, we also suggest the Hermite expansion method when the diffusion model is more complicated, as we only consider a simple diffusion model here. Though the MCB produces a rather unsatisfactory result, we may also consider it when we are not very certain about the model specification, since it outperforms the first-order asymptotics. Furthermore, a future study is needed to reduce the sampling error in the accept-reject method, which would enhance the performance of the bootstraps using the Hermite expansion or the nonparametric transition density estimation.