Higher-order adaptive methods for exit times of Itô diffusions

Abstract

We construct a higher-order adaptive method for strong approximations of exit times of Itô stochastic differential equations (SDEs). The method employs a strong Itô–Taylor scheme for simulating SDE paths, and adaptively decreases the step size in the numerical integration as the solution approaches the boundary of the domain. These techniques complement each other nicely: adaptive timestepping improves the accuracy of the exit time by reducing the magnitude of the overshoot of the numerical solution when it exits the domain, and higher-order schemes improve the approximation of the state of the diffusion process. We present two versions of the higher-order adaptive method. The first one uses the Milstein scheme as the numerical integrator and two step sizes for adaptive timestepping: |$h$| when far away from the boundary and |$h^2$| when close to the boundary. The second method is an extension of the first one using the strong Itô–Taylor scheme of order 1.5 as the numerical integrator and three step sizes for adaptive timestepping. Under some regularity assumptions, we show that for any |$\xi>0$|⁠, the strong error is |${\mathcal{O}}(h^{1-\xi })$| and |${\mathcal{O}}(h^{3/2-\xi })$| for the first and second method, respectively. Provided quite restrictive commutativity conditions hold for the diffusion coefficient, we further show that the expected computational cost for both methods is |${\mathcal{O}}(h^{-1} \log (h^{-1}))$|⁠. This results in a near doubling/trebling of the strong error rate compared to the standard Euler–Maruyama-based approach, while the computational cost rate is kept close to order one. Numerical examples that support the theoretical results are provided, and we discuss the potential for extensions that would further improve the strong convergence rate of the method.

exit time, SDE, adaptive timestepping, Itô–Taylor scheme, strong approximation, Feynman–Kac

1. Introduction

For a bounded domain |$D \subset{\mathbb{R}}^d$| and a |$d$|-dimensional Itô diffusion |$(X_t)_{t\ge 0}$| with |$X(0) \in D$|⁠, the goal of this work is to construct efficient higher-order adaptive numerical methods for strong approximations of the exit time

$$ \begin{align}& \tau:= \inf\{t \ge 0 \mid X(t) \notin D \} \wedge T,\end{align} $$

(1.1)

where |$T>0$| is given. The dynamics of the diffusion process are governed by the autonomous Itô stochastic differential equation (SDE)

$$ \begin{align} \begin{split} \textrm{d}X &= a(X)\,\textrm{d}t+ b(X)\,\textrm{d}W\\ X(0) &= x_0, \end{split}\end{align} $$

(1.2)

where |$a \colon \mathbb{R}^{d} \to \mathbb{R}^{d}$| and |$b \colon \mathbb{R}^{d} \to \mathbb{R}^{d \times m}$|⁠, |$x_0$| is a deterministic point in |$D$| and |$W$| is an |$m$|-dimensional standard Wiener process on a filtered probability space |$(\varOmega , {\mathcal{F}}, ({\mathcal{F}}_t)_{t \ge 0}, {\mathbb{P}})$|⁠. Further details on the regularity of the domain |$D$| and the coefficients of the SDE are provided in Section 2.

Our method employs a strong Itô–Taylor scheme for simulating SDE paths and carefully decreases the step size in the numerical integration as the solution approaches the boundary of the domain. We present two versions: the order 1 method and the order 1.5 method. The order 1 method uses the Milstein scheme as the numerical integrator and two step sizes for the adaptive timestepping, depending on the proximity of the state to the boundary of |$D$|⁠: a larger timestep when far away and a smaller timestep when close to |$\partial D$|⁠. The order 1.5 method uses the strong Itô–Taylor scheme of order 1.5 as the numerical integrator and, as an extension of the previous method, three different step sizes for the adaptive timestepping.

When adaptive timestepping is properly balanced against the order of the scheme and under sufficient regularity assumptions, then for any |$\xi>0$|⁠, the order 1 method achieves the strong convergence rate |${\mathcal{O}}(h^{1-\xi })$|⁠, cf. Theorem 2.7; and the order 1.5 method achieves the strong convergence rate |${\mathcal{O}}(h^{3/2-\xi })$|⁠, cf. Theorem 2.10. Furthermore, if the diffusion coefficient satisfies the commutativity condition (2.14) for the order 1 method and the conditions (2.14) and (2.18) for the order 1.5 method, then Theorems 2.8 and 2.11 show that both higher-order methods have an expected computational cost of |${\mathcal{O}}(h^{-1} \log (h^{-1}))$|⁠. When applicable, our method nearly preserves the same computational cost as the alternative standard approaches described below, while achieving substantial improvements in the strong convergence rate.

Exit times describe critical times in controlled dynamical systems, and they have applications in pricing of barrier options (Alsmeyer, 1994) and also in American options (Bayer et al., 2020), as the latter may be formulated as a control problem where exit times determine when to execute the option. They also appear in physics, for instance when studying the transition time between pseudo-stable states of molecular systems in the canonical ensemble (Weinan et al., 2021).

The mean exit-time problem may be solved by the Monte Carlo approach of direct simulation of SDE paths or by solving the associated partial differential equation (PDE), the Feynman–Kac equation (Friedman, 1975). In high-dimensional state space, the former approach is more tractable than the latter, due to the curse of dimensionality.

A Monte Carlo method using the Euler–Maruyama scheme with a uniform timestep to simulate SDE paths was shown to produce the weak convergence rate 1/2 for approximating the mean exit time in Gobet (2000). An improved weak order 1 method was achieved by reducing the overshoot error through the careful shifting of the boundary, cf. Broadie et al. (1997); Gobet (2001); Gobet & Menozzi (2004). The weak order 1 method was extended to problems with time-dependent boundaries in Gobet & Menozzi (2010) and Bouchard et al. (2017) showed that in |$L^1(\varOmega )$|-norm, the Euler–Maruyama scheme has an order 1/2 convergence rate. The contributions of this work may be viewed as extensions of this strong convergence study to higher-order Itô–Taylor schemes with adaptive timestepping.

In Jansons & Lythe (2000), a fundamentally different idea for approximating the mean exit time of a one-dimensional Wiener process was developed that reduces the overshoot error using timesteps sampled from an exponential distribution. Numerical simulations show that the method has a weak convergence rate 1. The conditional probability of the stochastic process having exited the domain, given its value at the beginning and at the end of an exponential timestep, is computed, and the process is considered to have exited the domain if this conditional probability is greater than the value of a random variable sampled from the standard uniform distribution. This way, the error committed when computing the empirical probability density function of the exit time is reduced, thereby reducing the overestimation of the mean exit time. The work was later extended to higher-dimensional Wiener process exiting a smooth bounded domain, and then to autonomous scalar SDE in Jansons & Lythe (2005) and Jansons & Lythe (2003), respectively. It remains unclear whether the method can be applied to more general bounded domains and SDE, as well as to multilevel Monte Carlo (MLMC) settings that require strong approximations.

MLMC methods for weak approximations of exit times have been developed in Higham et al. (2013); Giles & Bernal (2018). Using the Euler–Maruyama scheme with boundary shifting and conditional sampling of paths near the boundary, the method (Giles & Bernal, 2018) can reach the mean-square approximation error |${\mathcal{O}}(\epsilon ^2)$| at the near-optimal computational cost |${\mathcal{O}}(\epsilon ^{-2} \log (\epsilon )^2 )$|⁠. Since our methods efficiently approximate the strong exit time and admit a pairwise coupling of simulations on different resolutions, they are also suitable for MLMC. See Section 5 for an outline of the extension to MLMC.

Adaptive methods for SDE with discontinuous drift coefficients have been considered in Neuenkirch et al. (2019); Yaroslavtseva (2022). Therein, the discontinuity regions of the drift coefficient are associated with hypersurfaces. When one is close to such hypersurfaces, a small timestep is used; otherwise, a large timestep is used. This adaptive timestepping approach is similar to ours, as can be seen by viewing the boundary of |$D$| in our problem as a hypersurface, but the problem formulations differ, and our method is more general in the sense that it admits a sequence of timestep sizes and also higher-order numerical integrators. For numerical testing of strong convergence rates, we have employed the idea of pairwise coupling of non-nested adaptive simulations of SDE developed in Giles et al. (2016), which is an approach that also could be useful for combining our method with MLMC in the future. See also Merle & Prohl (2023) for a recent contribution on a posteriori adaptive methods for weak approximations of exit times and states of SDE, and Hoel et al. (2012); Katsiolides et al. (2018); Fang & Giles (2020) for other partly state-dependent adaptive MLMC methods for weak approximations of SDE in finite- and infinite-time settings.

The Walk-On-Spheres (WoS) method (Muller, 1956; Deaconu et al., 2017) is an alternative approach for weak approximations of exit points of bounded diffusions that relies on simulations of SDE paths whose increments typically are bounded in the state-space rather than in the timestep. WoS requires closed-form expressions or a tractable approximation of exit time distributions of spheres, or other suitable surfaces like cubes or ellipsoids, so that the exit time of the process can be generated from walking/hopping on such surfaces. For more general SDE, extensions of WoS that rely on such principles have been developed, including error- and complexity analysis, for weak approximations of exit points and for reflected diffusions on bounded domains by Milstein et al. (Milstein, 1996, 1997; Milstein & Tretyakov, 2004). See also Leimkuhler et al. (2023) for extensions of Milstein’s WoS method to settings with more complicated boundary conditions and over infinite time-windows. Improvements to the practical implementation of Milstein’s WoS method for bounded diffusions and efficient estimation of a point’s distance to the boundary are further explored in Bernal (2019).

Unlike our approach, the efficiency of Milstein’s WoS method is not limited by commutativity conditions on the diffusion coefficient. But we believe our method is more suitable for extension to MLMC and thus to exploit the potential efficiency gains that offer for weak approximations, as it is easy to construct the kind of pairwise coupled realizations on neighboring resolution levels needed in MLMC with our method.

The rest of this paper is organized as follows: Section 2 presents two versions of the higher-order adaptive method and states the main theoretical results on convergence and computational cost. Section 3 contains proofs of the theoretical results. Section 4 presents a collection of numerical examples supporting our theoretical results. Lastly, we summarize our findings and discuss interesting open questions in Section 5.

2. Notation and the adaptive numerical methods

In this section, we first introduce the necessary notation and assumptions on the SDE coefficients and the domain |$D$|⁠, and relate the exit-time problem to the Feynman–Kac PDE. Thereafter, the higher-order adaptive methods of order 1 and 1.5 are respectively described in Sections 2.4 and 2.5, with convergence and computational cost results.

2.1 Notation and assumptions

For any |$x,y \in{\mathbb{R}}^d$|⁠, the Euclidean distance is denoted by |$d(x,y):= |x-y|$|⁠, and for any nonempty sets |$A,B \subset{\mathbb{R}}^d$|⁠, we define

$$ \begin{align*} & {d}(A,B):= \inf_{(x,y) \in A \times B} {d}(x,y). \end{align*} $$

A domain in |${\mathbb{R}}^d$| is defined as an open, nonempty and connected subset of that space. The following assumptions on the domain of the diffusion process will be needed to relate the exit-time problem to a sufficiently smooth solution of a Feynman–Kac PDE.

Assumption A.

The domain |$D \subset{\mathbb{R}}^d$| is bounded and the boundary |$\partial D$| is |$C^{4}$| continuous.

Let |$n_D:\partial D \to{\mathbb{R}}^d$| denote the outward-pointing unit normal and for any |$r>0$|⁠, let

$$ \begin{align*} & D_r:= \{ x \in{\mathbb{R}}^d \mid \textrm{d}(x,D) < r \}. \end{align*} $$

Due to the regularity of the boundary, it holds that |$n_D \in C^{3}(\partial D,{\mathbb{R}}^d)$| and |$D$| satisfies the uniform ball property: there exists an |$R>0$| such that for every |$x \in{\mathbb{R}}^n$| such that |$d(x,\partial D)< R$|⁠, there is a unique nearest point |$y$| on the boundary |$\partial D$|⁠, and it holds that |$x = y \pm d(x,y) n_D(y)$|⁠, where the sign |$\pm $| depends on whether |$x$| belongs to the exterior or interior of |$D$|⁠. (That is, for every |$x$| sufficiently close to the boundary, there is a unique point on |$\partial D$| satisfying that |$y = \arg \min _{z \in \partial D} d(x,z)$|⁠.) We refer to Gilbarg & Trudinger (2001, Chapter 14.6) for further details on the regularity of |$\partial D$| and its unit normal |$n_D$|⁠, and to Dalphin (2014, Theorem 1.9) for the uniform ball property. For any domain |$D$| satisfying the uniform ball property, let |$\textrm{reach}(D)> 0$| denote the supremum of all |$R>0$| such that the property is satisfied. The uniform ball property is equivalent to satisfying both the uniform exterior sphere and the interior sphere conditions.

For any |$r\in (0, \textrm{reach}(D))$|⁠, the boundary of |$D_r$| can be expressed by

$$ \begin{align*} & \partial D_r = \bigcup_{y \in \partial D} \, y + r n_D(y), \end{align*} $$

and, since the mapping

$$ \begin{align}& \partial D \ni y \mapsto \pi_r(y) = y + r n_D(y) \in \partial D_r\end{align} $$

(2.1)

is a |$C^3$|-diffeomorphism and |$\partial D$| is |$C^4$|⁠, it follows that |$\partial D_r$| is at least |$C^3$|⁠. This implies that |$D_r$| also satisfies the uniform ball property, and since any point |$z \in \partial D_r$| can be expressed as |$z= y + r n_D(y)$| for a unique |$y \in \partial D$| and indeed |$n_D(y) = n_{D_r}(z)$|⁠, it follows that the reach of |$D_r$| is bounded from below by |$\textrm{reach}(D)-r$|⁠, which we state for later reference:

$$ \begin{align}& \textrm{reach}(D_r) \ge \textrm{reach}(D) -r \qquad \forall r \in [0,\textrm{reach}(D))\!.\end{align} $$

(2.2)

A similar construction applies to the boundary of the subdomain |$D_{-r}:= \{x \in D \mid d(x, \partial D)>r \}$|⁠, where one can show that |$\partial D_{-r}$| is at least |$C^3$| for all |$r \in (0,\textrm{reach}(D))$|⁠.

For any integer |$k\ge 1$|⁠, let |$C^{k}_b({\mathbb{R}}^d)$| denote the set of scalar-valued functions with continuous and uniformly bounded partial derivatives up to and including order |$k$|⁠. We make the following assumptions on the coefficients of the SDE (1.2).

Assumption B.

(B.1) For the SDE coefficients
$$ \begin{align*} & a(x) = \begin{bmatrix} a_1(x) \\ \vdots \\ a_d(x) \end{bmatrix} \qquad \textrm{and} \qquad b(x) = \begin{bmatrix} b_{11}(x) & \cdots & b_{1m}(x)\\ \vdots & & \vdots \\ b_{d1}(x)& \cdots & b_{dm}(x) \end{bmatrix} \end{align*} $$
it holds that |$a_i, b_{ij} \in C_b^{3}({\mathbb{R}}^d)$| for all |$i \in \{1,\ldots ,d\}$| and |$j \in \{1,\ldots , m\}$|⁠.
(B.2) (Uniform ellipticity) For some |$\bar R_D \in (0,\textrm{reach}(D)/2)$|⁠, there exists a constant |$\hat c_b>0$| such that
$$ \begin{align*} & \hat c_{b} \left| {\xi} \right|^{2} \le \xi^{\top} (b(x) b^{\top}(x)) \xi \qquad \forall (x, \xi) \in D_{\bar R_D} \times{\mathbb{R}}^d. \end{align*} $$
(B.3) There exists a constant |$\widehat C_{b}> 0$| such that
$$ \begin{align*} & \xi^{\top} \!\big( {b(x) b^{\top}(x)} \big) \xi \le \widehat C_{b} \left| {\xi} \right|^{2}, \qquad \forall (x,\xi) \in \mathbb{R}^{d}\times \mathbb{R}^{d}. \end{align*} $$

Remark 2.1

Assumption B.1 ensures well-posedness of strong exact and numerical solutions of the SDE (1.2) and is sufficient to obtain the sought for regularity for the solution of the Feynman–Kac PDE in Propositions 2.3 and 2.4.

B.2 ensures that the diffusion process will, loosely speaking, eventually exit the domain, and it is used to obtain well-posedness of the related Feynman–Kac PDE in Propositions 2.3 and 2.4. B.3 is introduced to bound the magnitude of the overshoot of numerical paths of the diffusion process when they exit |$D$|⁠.

To simplify technical arguments, we will prove convergence results for our numerical methods under Assumption B, but the assumption can be relaxed considerably:

Remark 2.2

Since the quantity we seek to compute, |$\tau $|⁠, only depends on the dynamics of the diffusion process until it exits |$D$|⁠, Assumption B.1 can be relaxed to:

$$ \begin{align*} & (B^{\prime}.1) \qquad a_i, b_{ij} \in C^{3}({\mathbb{R}}^d) \quad \textrm{for all} \quad i \in \{1,\ldots,d\} \quad \textrm{and} \quad j \in \{1,\ldots, m\}. \end{align*} $$

This is because any |$C^{3}$|-redefinition of all coefficients on |$\overline D^C$| such that |$a_i, b_{ij} \in C^{3}_b({\mathbb{R}}^d)$| will not change the exit time of the resulting diffusion process.

For instance, if

$$ \begin{align*} & \max \big(|a_i(x)|, \big|\partial_{x_j} a_i(x)\big|, \big|\partial_{x_jx_k} a_i(x)\big|, \big|\partial_{x_jx_k x_\ell} a_i(x)\big| \big)\le C(1+ |x|^n). \end{align*} $$

for some |$n\ge 1$|⁠, then the redefinition |$\tilde a_i(x) = a_i(x) \exp (- F(x))$|⁠, where |$F: {\mathbb{R}}^d \to [0, \infty )$| is a three times globally continuously partial-differentiable function such that |$F|_{D} =0$| and |$\lim _{|x| \to \infty } |F(x)|/|x|>1$|⁠, satisfies |$\tilde a_i \in C_b^{3}({\mathbb{R}}^d)$| and |$\tilde a_i|_{\overline D} = a_i|_{\overline D}$|⁠. Such a mapping |$F$| may for instance take the following form: let |$r> \max _{x \in \overline D} |x|$| and set |$F(x):= \mathbbm{1}_{{ |x|> r}} (|x| -r)^4$|⁠.

For any |$R>0$|⁠, Assumption B.3 can be relaxed to: there exists a |$\widehat C_b>0$| such that

$$ \begin{align*} & \textbf{(B}^{\prime}.\textbf{3)} \qquad \xi^{\top} \!\big( {b(x) b^{\top}(x)} \big) \xi \le \widehat C_{b} \left| {\xi} \right|^{2}, \qquad \forall x \in \overline D_R, \; \xi \in \mathbb{R}^{d}. \end{align*} $$

The relaxation is achieved through a |$C^{3}$|-redefinition of |$b$| on the exterior of |$\overline D$| satisfying that

$$ \begin{align*} & \sup_{x \in{\mathbb{R}}^d} \|b(x) b^{\top}(x)\|_2 = \max_{x \in \overline D_R} \|b(x) b^{\top}(x) \|_2, \end{align*} $$

where |$\| \cdot \|_2$| denotes the matrix |$2$|-norm.

2.2 Mean exit times and Feynman–Kac

Recalling that the exit time was defined by |$\tau = \inf \{t \ge 0 \mid X(t) \notin D \} \wedge T$|⁠, we extend this notation to the time-adjusted exit time of a path going through the point |$(t,x)\in [0,T] \times{\mathbb{R}}^d$|⁠:

$$ \begin{align*}& \tau^{t,x} := \left( \min\left\{s \geq t \; \big| \; X(s) \notin D \ \textrm{and}\ X(t) = x\right\} \, - \, t \, \right)\; \bigwedge \; \big(T - t\big). \end{align*} $$

Since |$X(0) = x_0$| for the diffusion process (1.2), we note that |$\tau = \tau ^{0,x_0}$|⁠. Under sufficient regularity, the mean (time-adjusted) exit time

$$ \begin{align}& u(t, x) := {\mathbb{E}} \left[{\tau^{t, x}} \right], \qquad (t,x) \in \! \left[{0, T} \right] \times \mathcal{D}\end{align} $$

(2.3)

solves the following parabolic PDE:

Proposition 2.3

(Feynman–Kac).

If Assumptions A, B.1 and B.2 hold, then the mean exit time (2.3) is the unique solution of the Feynman–Kac PDE

$$ \begin{align*} \partial_t u &= - a \cdot \nabla u - \frac{1}{2} \operatorname{tr}\!\big( {bb^{\top} \nabla^{2} u} \big) - 1 && \textrm{in} \quad (0,T) \times D, \\ u &= 0 && \textrm{on} \quad ([0,T) \times \partial D) \cup (\{T\} \times D). \end{align*} $$

Moreover, |$u \in C^{1,2}([0,T]\times D) \cap C^{0,0}([0,T] \times \overline D)$|⁠.

The result follows from Proposition 1 in Gobet & Menozzi (2010).

$$ \begin{align*} & \tau^{t,x}_r := \left( \min\left\{s \geq t \; \big| \; X(s) \notin D_r \ \textrm{and}\ X(t) = x\right\} \, - \, t \, \right)\; \bigwedge \; \big(T - t\big) \,\!, \end{align*} $$

Similarly as for the exit time for the domain |$D$|⁠, we introduce the shorthand |$\tau _r:= \tau _r^{0,x_0}$|⁠. The mean exit time for the enlarged domain is defined by

$$ \begin{align}& u_r(t,x):= {\mathbb{E}}_{} \left[{\tau^{t,x}_r} \right] \qquad (t,x) \in [0,T] \times{\mathbb{R}}^d.\end{align} $$

(2.4)

The function |$u_r(t,x)$| is also the unique solution of a Feynman–Kac equation:

Proposition 2.4

(Feynman–Kac on enlarged domains).

$$ \begin{align*} \partial_t u_r &= - a \cdot \nabla u_r - \frac{1}{2} \operatorname{tr}\!\big( {bb^{\top} \nabla^{2} u_r} \big) - 1 && \textrm{in} \quad (0,T) \times D_r, \\ u_r &= 0 && \textrm{on} \quad ([0,T) \times \partial D_r) \cup (\{T\} \times D_r) \,\!, \end{align*} $$

and |$u_r \in C^{1,2}([0,T]\times D_r) \cap C^{0,0}([0,T] \times \overline D_r)$|⁠.

Moreover,

$$ \begin{align}& u_r(t,x) \ge u(t,x) \qquad (t,x) \in [0,T] \times D,\end{align} $$

(2.5)

and there exists a uniform constant |$L>0$| for all |$r \in [0,\bar R_D]$| such that

$$ \begin{align}& | u_r(t,x) - u_r(t,y) | \le L |x - y| \quad \forall (t,x,y) \in [0,T] \times D_r \times \partial D_r \,.\end{align} $$

(2.6)

Proof.

Assumption A and the argument right below the assumption shows that |$\partial D_r$| is |$C^3$|⁠. By Gobet & Menozzi (2010, Proposition 1), we then conclude that |$u_r$| is the unique solution with the stated regularity. Inequality (2.5) follows from the observation that

$$ \begin{align*} & \tau_r^{t,x} \ge \tau^{t,x} \qquad \forall (t,x) \in [0,T] \times{\mathbb{R}}^d. \end{align*} $$

For the last inequality, we note that for any |$r \in [0,R_D]$|⁠, the nontruncated mean exit time

$$ \begin{align*} & \bar u_r(x):= {\mathbb{E}}_{} \left[{\bar \tau_r^x} \right]\!, \end{align*} $$

with |$\bar \tau _r^x:= \min \{t \ge 0 \mid X(s) \notin D_r \; \& \; X(0) = x \}$| is the unique solution of the strongly elliptic PDE

$$ \begin{align} \begin{split} a \cdot \nabla \bar u_r + \frac{1}{2} \operatorname{tr}\!\big( {bb^{\top} \nabla^{2} \bar u_r} \big) + 1 & = 0 \qquad \textrm{in} \quad D_r \\ \bar u_r &= 0 \qquad \textrm{on} \quad \partial D_r, \end{split}\end{align} $$

(2.7)

$$ \begin{align*} & |u_r(t,x) - u_r(t,y)| = u_r(t,x) - \underbrace{u_r(t,y)}_{=\ 0} \stackrel{{\textsubscript{\hyperlink{deqn07}{(2.5)}}}}{\le} \bar u_r(x) - \underbrace{\bar u_r(y)}_{=\ 0} = |\bar u_r(x) - \bar u_r(y)|. \end{align*} $$

It remains to bound the term in the last equation from above. Applying the modulus of continuity (Gilbarg & Trudinger, 2001, Section 14.5) and the boundary gradient bound (Gilbarg & Trudinger, 2001, Theorem 14.1) to the PDE (2.7) yields

$$ \begin{align*} & |\bar u_r(x) - \bar u_r(y)| \le \underbrace{\mu e^{\chi_r M_r}}_{=:\ L_r} |x - y| \qquad \forall (x,y) \in D_r \times \partial D_r, \end{align*} $$

where |$M_r = \sup _{x \in \overline D_r} |\bar u_r(x)|$|⁠,

$$ \begin{align*} & \mu = \frac{1+ \widehat C_b + \sup_{x \in D_{\bar R_D}} |a(x)|}{\hat c_b} \quad \textrm{and} \quad \chi_r = \left(1 + \frac{d-1}{\textrm{reach}(D_r)} \right) \mu. \end{align*} $$

For any |$ 0 \le r< s\le \bar R_D$|⁠, it holds that |$u_s \ge u_r \ge 0$| on |$\overline D_r$|⁠, which implies that |$M_r \le M_{\bar R_D}$|⁠. And from Section 2.1 and Assumption B.2, we know that |$\textrm{reach}(D_r) \ge \textrm{reach}(D)/2>0$| for all |$r \in [0,\bar R_D]$|⁠, which implies that

$$ \begin{align*} & \chi_r \le \left(1 + \frac{2(d-1)}{\textrm{reach}(D)} \right) \mu =: \bar \chi \quad \forall r \in [0, \bar R_D]. \end{align*} $$

We conclude that |$L_r \le \mu \exp ( \bar \chi M_{\bar R_D}) =: L < \infty $| for all |$r \in [0, \bar R_d]$|⁠.

Remark 2.5

Existence, uniqueness and regularity of the solution to the elliptic PDE (2.7) can be shown to hold under weaker regularity assumptions on coefficients, domain and boundary values (Gilbarg & Hörmander, 1980, Theorems 6.1, 6.2, 6.3).

2.3 Higher-order adaptive numerical methods

Let |$\gamma \in \{1,3/2\}$| denote the order of the strong Itô–Taylor method used in the numerical integration of the SDE (1.2), cf. Kloeden & Platen (1992, Chapter 10). In abstract form, our numerical method simulates the SDE (1.2) by the scheme

$$ \begin{align}& \overline{X}(t_{n+1}) = \varPsi_{\gamma}(\overline{X}(t_n), \varDelta t_n), \qquad n=0,1,\ldots,\end{align} $$

(2.8)

where |$\overline{X}(0) = x_0$| and |$\varPsi _\gamma : {\mathbb{R}}^d \times [0,\infty ) \times \varOmega \to{\mathbb{R}}^d$| denotes the higher-order Itô–Taylor scheme. The timestep |$\varDelta t_n = \varDelta t(\overline{X}(t_n))$| is adaptively chosen as a function of the current state |$\overline{X}(t_n)$| so that the step size is small when |$\overline{X}(t_n)$| is near the boundary |$\partial D$|⁠, and larger otherwise. Both the integrator |$\varPsi _\gamma $| and the adaptive timestepping depend on the order of |$\gamma $|⁠, see Sections 2.4 and 2.5 below for further details. The purpose of the adaptive strategy is to reduce the magnitude of the overshoot when the numerical solution exits |$D$|⁠, and the stochastic mesh

$$ \begin{align*} & t_0 = 0 \quad \textrm{and} \quad t_{n+1} = t_n + \varDelta t_n \qquad n =0,1,\ldots \end{align*} $$

is a sequence of stopping times with |$t_{n+1}$| being |${\mathcal{F}}_{t_n}$|-measurable for all |$n=0,1,\ldots $|⁠. The exit time of the numerical method is defined as the first time |$\overline{X}(t_n)$| exits the domain |$D$|⁠:

$$ \begin{align}& \nu:= \{t_n\ge 0 \mid \overline{X}(t_n) \notin D\} \wedge T,\end{align} $$

(2.9)

and we will use the following notation for the time mesh of the numerical solution up to or just beyond |$T$|⁠:

$$ \begin{align}& {\mathcal{T}}^{\varDelta t}:= \{t_0, t_1, \ldots, t_{N}\} \quad \textrm{where} \quad N:= \min\{n \in{\mathbb{N}} \; | \; t_n \ge T \}.\end{align} $$

(2.10)

For later analysis, the domain of definition for the numerical solution is extended to a piecewise constant solution over continuous time by

$$ \begin{align*} & \overline{X}(t):= \overline{X}(t_n) \qquad \textrm{for} \quad t \in [t_n,t_{n+1}) \quad \textrm{and} \quad n \in \{0,1,\ldots, N-1\}. \end{align*} $$

Sections 2.4 and 2.5 below present the details for the order 1 and order 1.5 adaptive methods, respectively.

2.4 Order 1 method

The |$i$|th component of the strong Itô–Taylor scheme (2.8) of order |$\gamma =1$| is given by

$$ \begin{align} \left(\varPsi_{1}(\overline{X}(t_{n}), \varDelta t_n ) \right)_i := \overline{X}_{i}(t_{n}) &+ a_{i}(\overline{X}(t_{n})) \varDelta t_n + \sum_{j=1}^{m} b_{ij}(\overline{X}(t_{n})) \varDelta W^j_n \nonumber\\ &+ \sum_{j_{1}=1}^m \sum_{j_2=1}^{m} \mathcal{L}^{j_{1}} b_{ij_{2}}(\overline{X}(t_{n})) \mathcal{I}_{(j_{1}, j_{2})},\end{align} $$

(2.11)

for |$i \in \{1,\ldots , d\}$|⁠, where we have introduced the shorthand |$\varDelta W^j_n = W^j(t_{n+1}) - W^j(t_n)$| for the |$n$|th increment of the |$j$|th component of the Wiener process, the operator

$$ \begin{align}& {\mathcal{L}}^j:= \sum_{k=1}^d b_{kj} \partial_{x_k}\end{align} $$

(2.12)

and the iterated Itô integral

$$ \begin{align}& \mathcal{I}_{(j_{1}, j_2)}:= \int_{t_{n}}^{t_{n+1}} \int_{t_{n}}^{s_{2}}\textrm{d}W^{j_{1}}(s_1)\,\textrm{d}W^{j_{2}}(s_2)\end{align} $$

(2.13)

of the components |$(j_1,j_2) \in \{1,\ldots ,m\} \times \{1,\ldots ,m\}$|⁠.

Remark 2.6

The iterated integrals |${\mathcal{I}}_{(j_1, j_2)}$| do not have a closed-form expression when |$j_1\neq j_2$|⁠, and numerical approximations of such terms impose a substantial cost to each iteration of |$\varPsi _1$| in (2.11). The cost of evaluating |$\varPsi _1$| reduces to |${\mathcal{O}}(1)$| in settings when the off-diagonal terms of |${\mathcal{I}}_{(j_1, j_2)}$| cancel; for instance, when the first commutativity condition holds (Kloeden & Platen, 1992, equation (10.3.13)):

$$ \begin{align} {\mathcal{L}}^{j_1} b_{ij_{2}} = {\mathcal{L}}^{j_2} b_{ij_{1}}, \qquad \forall \big( \; j_{1}, j_{2} \in \! \left\{ {1, \ldots, m} \right\} \ \textrm{and}\ i \in \{1, \ldots, d\} \big).\end{align} $$

(2.14)

The size of |$\varDelta t_n$| is determined adaptively by the state of the numerical solution. A small step size is employed when |$\overline{X}(t_n)$| is close to the boundary |$\partial D$|⁠, to reduce the magnitude of the overshoot of an exit of the domain and the likelihood of the numerical solution not capturing a true exit of the domain; and a larger step size is employed when |$\overline{X}(t_n)$| is farther away from the boundary. For any |$b>a\ge 0$|⁠, we introduce the following notation to describe the distance from the boundary:

$$ \begin{align} V_{\partial D}(a, b) := \! \left\{ {x \in D \; \big| \; d(x, \partial D) \in (a, b]} \right\}\!.\end{align} $$

(2.15)

Introducing the step size parameter |$h \in (0,1)$| and the threshold parameter

$$ \begin{align}& \delta(h):= \sqrt{8 \widehat C_b\, \textrm{d} h \log(h^{-1})},\end{align} $$

(2.16)

the ‘critical region’ of points near the boundary is given by

$$ \begin{align*} & V_{\partial D}(0, \delta) := \! \left\{ {x \in D \; \big| \; \textrm{d}(x, \partial D) \le \delta} \right\}\!, \end{align*} $$

and the adaptive timestepping by

$$ \begin{align}& \varDelta t_n = \varDelta t(\overline{X}(t_n)) := \begin{cases} h & \textrm{if} \quad \textrm{d}(\overline{X}(t), \partial D)> \delta, \\ h^2 & \text{otherwise.} \end{cases}\end{align} $$

(2.17)

This means that a large timestep is used when the |$\overline{X}(t_n)$| is in the noncritical region |$D\setminus V_{\partial D}(0, \delta )$| and a small step size |$h^{2}$| in the critical region near the boundary. (The step size used in |$D^C$|⁠, whether |$h$| or |$h^2$|⁠, is not of any practical importance for the output of the numerical method. But in our theoretical analysis we need to compute the numerical solution up to time |$T$|⁠, and in case |$\nu < T$|⁠, the step size for |$\overline{X}(t_n) \in D^C$| needs to be described to compute the solution for times in |$(\nu ,T]$|⁠.) The value of the parameter |$\delta $| is chosen as a compromise between accuracy and computational cost: it should be sufficiently large so that the strong-error convergence rate in |$h$| is kept at almost order 1, cf. Theorem 2.7, but it should also be as small as possible to keep the computational cost of the method low. From the proofs of Lemma 3.3 and Theorem 2.8, it follows that the formula (2.16) is a suitable compromise for |$\delta $|⁠.

We are now ready to present the main results on the strong convergence rate and computational cost of the order 1 method.

Theorem 2.7

(Strong convergence rate for the order 1 method).

If Assumptions A and B hold, then for any |$\xi \in (0,1)$|⁠, there exists a constant |$C_{\nu }>0$| such that

$$ \begin{align*} & {\mathbb{E}}_{} \left[ |\nu - \tau| \right] \le C_{\nu} h^{1-\xi} \end{align*} $$

holds for a sufficiently small |$h>0$|⁠.

We defer the proof to Section 3.1.

To study the complexity of the method, we first define the computational cost of a numerical solution as the number timesteps used to compute the exit time:

$$ \begin{align*} \textrm{Cost}\!\left({\nu}\right) := \int_{0}^{\nu} \frac{1}{\varDelta t(\overline{X}(t))}\,\textrm{d}t \,. \end{align*} $$

Observe that |$\textrm{Cost}\!\left ({\nu }\right )$| is a random variable since the exit time |$\nu $| is different for each realization of the numerical solution. We further make the assumption that every evaluation of the distance of the numerical solution |$\overline{X}(t)$| to the boundary of the domain |$\partial D$|⁠, required for adaptive refinement of timestep size, costs |${\mathcal{O}}(1)$|⁠. This is a limiting assumption, as it can be challenging to compute the distance to the boundary in more complicated domains. On that note, the distance function |$d(x,\partial D)$| is the solution of an Eikonal equation (Bernal, 2019), and it can be approximated numerically using Sethian’s fast marching method (Sethian, 1996).

Theorem 2.8

(Computational cost for the order 1 method).

Let Assumptions A and B hold and assume that one evaluation of |$\varPsi _1$| costs |${\mathcal{O}}(1)$|⁠. Then it holds that

$$ \begin{align*} & {\mathbb{E}}_{} \left[{\textrm{Cost}\!\left({\nu}\right)} \right] = {\mathcal{O}}( h^{-1} \log(h^{-1}) ). \end{align*} $$

We defer the proof to Section 3.1. When the Itô diffusion is a standard one-dimensional Wiener process, the upper bound on the computational cost can be proved by a more straightforward approach that we outline in Appendix B.

Remark 2.9

The proof of Theorem 2.8 uses an occupation-time estimate of the time the numerical solution spends in the critical region before exiting. Near-boundary occupation-time estimates have appeared frequently in the literature, as they often are useful tools for bounding approximation errors and complexity. For an SDE with low-regularity drift functions, the occupation-time on discontinuity sets is used to bound the computational cost of the adaptive method in Neuenkirch et al. (2019); Yaroslavtseva (2022). When |$d=1$|⁠, the approach (Yaroslavtseva, 2022) carries over to our order 1 method. For |$d>1$|⁠, occupation-time estimates using local time manipulations have been derived for the Euler–Maruyama method for diffusion processes (Neuenkirch et al., 2019, Lemma 3.3) and hypoelliptic diffusion processes (Gobet & Menozzi, 2004, Lemma 19).

2.5 The order 1.5 method

The general strong Itô–Taylor scheme of order |$\gamma =1.5$| is a complicated expression that can be found in Kloeden & Platen (1992, equation (10.4.6)). When the so-called second commutativity condition holds:

$$ \begin{align}& \mathcal{L}^{j_{1}} \mathcal{L}^{j_{2}} b_{i j_{3}} = \mathcal{L}^{j_{2}} \mathcal{L}^{j_{1}} b_{i j_{3}} \qquad \forall\left( j_{1}, j_{2}, j_{3} \in \{1, \ldots, m\} \ \textrm{and}\ i \in \{1, \ldots, d\}\right)\!,\end{align} $$

(2.18)

where the differential operator |${\mathcal{L}}^j$| is defined in (2.12), then the scheme simplifies to a practically useful form where the computational cost of one iteration of |$\varPsi _{1.5}$| is |${\mathcal{O}}(1)$|⁠, cf. Kloeden & Platen (1992, equation (10.4.15)). And in the special case of (2.18), when the diffusion coefficient is a diagonal matrix, the |$i$|th component of the scheme takes the form

$$ \begin{align} \notag \left(\varPsi_{1.5}(\overline{X}(t_{n}), \varDelta t_n ) \right)_i &:= \overline{X}_i(t_n) + a_i(\overline{X}(t_n)) \varDelta t_n + b_{ii}(\overline{X}(t_n)) \varDelta W^i_n\\ &\quad+ \frac{1}{2} {\mathcal{L}}^0 a_i(\overline{X}(t_n)) \varDelta t_n^2 + \frac{1}{2} {\mathcal{L}}^i b_{ii}(\overline{X}(t_n)) \left( \big(\varDelta W^i_n\big)^2 - \varDelta t_n \right)\notag\\ &\quad+ \frac{1}{2} {\mathcal{L}}^i {\mathcal{L}}^i b_{ii}(\overline{X}(t_n))\left( \frac{\left(\varDelta W_n^i\right)^2}{3} - \varDelta t_n\right)\varDelta W_n^i\notag\\ & \quad+ {\mathcal{L}}^0 b_{ii}(\overline{X}(t_n)) \big(\varDelta W^i_n \varDelta t_n - \varDelta Z^i_n\big) + \sum_{j=1}^d {\mathcal{L}}^j a_i(\overline{X}(t_n)) \varDelta Z_n^j, \end{align} $$

(2.19)

where we have introduced the differential operator

$$ \begin{align*} & {\mathcal{L}}^0:= \sum_{i=1}^{d} a_{i} \partial_{x_{i}} + \frac{1}{2} \sum_{i,j=1}^{d} \sum_{k=1}^{m} b_{ik} b_{jk} \partial_{x_i x_j} \end{align*} $$

and

$$ \begin{align*} & \varDelta Z^i_n:= \int_{t_n}^{t_{n+1}} \int_{t_n}^{s_2}\textrm{d}W^i(s_1) \, \textrm{d}s_2. \end{align*} $$

For computer implementations, let us add that the tuple of correlated random variables |$(\varDelta W^i_n, \varDelta Z_n^i)$| can be generated by

$$ \begin{align*} & \varDelta W^i_{n} = U_1 \sqrt{\varDelta t_n} \quad \textrm{and} \quad \varDelta Z_n^i = \frac{1}{2} \varDelta t_n^{3/2} \left( U_1 + \frac{1}{\sqrt{3}} U_2\right)\!, \end{align*} $$

where |$U_1$| and |$U_2$| are independent |$N(0,1)$|-distributed random variables.

The step size |$\varDelta t_n$| for the order 1.5 method is determined adaptively by the state of the numerical solution, but with one more resolution than for the order 1 method: a tiny step size is employed when |$\overline{X}(t_n)$| is very close to the boundary |$\partial D$|⁠, a small step size is employed when |$\overline{X}(t_n)$| is slightly farther away from the boundary, and the largest step size is employed when it is far away from the boundary. To describe this mathematically, we first introduce the step size parameter |$h \in (0,1)$|⁠, the threshold parameters

$$ \begin{align}& \delta_1:= \sqrt{12 \widehat C_b\, \textrm{d} h \log(h^{-1})} \quad \textrm{and} \quad \delta_2:= \sqrt{16 \widehat C_b\, \textrm{d} h^2 \log(h^{-1})},\end{align} $$

(2.20)

and the critical regions

$$ \begin{align*} & V_{\partial D}(\delta_2, \delta_1) = \{x \in D \mid \textrm{d}(x,\partial D) \in (\delta_2, \delta_1]\} \end{align*} $$

and

$$ \begin{align*} & V_{\partial D}(0, \delta_2) = \{x \in D \mid \textrm{d}(x,\partial D) \le \delta_2\}. \end{align*} $$

The timestepping is then given by

$$ \begin{align}& \varDelta t_n = \varDelta t(\overline{X}(t_n)) := \begin{cases} h & \textrm{if} \quad \textrm{d}(\overline{X}(t_n), \partial D)> \delta_1 \\ h^2 & \textrm{if} \quad \textrm{d}(\overline{X}(t_n), \partial D) \in(\delta_2, \delta_1]\\ h^3 & \textrm{if} \quad \textrm{d}(\overline{X}(t_n), \partial D) \le \delta_2. \end{cases}\end{align} $$

(2.21)

This means that the step size |$h$| is used when |$\overline{X}(t_n)$| is in the noncritical region |$D\setminus V_{\partial D}(0, \delta _1)$|⁠, the small step size |$h^2$| is used when |$\overline{X}(t_n)$| is in the critical region farthest from the boundary and the tiny step size |$h^3$| is used when |$\overline{X}(t_n)$| is in the critical region nearest the boundary. (The step size used in |$D^C$|⁠, whether |$h$|⁠, |$h^2$| or |$h^3$| is not of any practical importance, but is needed in the theoretical analysis to extend the numerical solution up to time |$T$| when |$\nu < T$|⁠, similarly as for the order 1 method.) The values of the threshold parameters |$(\delta _1, \delta _2)$| are chosen as a compromise between accuracy and computational cost: The critical regions should be sufficiently large so that the strong-error convergence rate in |$h$| is kept at almost order 1.5, cf. Theorem 2.10, but they should also be kept as small as possible to keep the computational cost of the method low. It follows from the proofs of Lemma 3.4 and Theorem 2.11 that (2.20) is a suitable compromise.

We are now ready to present the main results on the strong convergence rate and computational cost of the order 1.5 method.

Theorem 2.10

(Strong convergence rate for the order 1.5 method).

If Assumptions A and B hold, then for any |$\xi \in (0,1)$|⁠, there exists a constant |$C_{\nu }>0$| such that

$$ \begin{align*} & {\mathbb{E}}_{} \left[{ |\nu - \tau|} \right] \le C_{\nu} h^{3/2-\xi} \end{align*} $$

holds for sufficiently small |$h>0$|⁠.

We defer the proof to Section 3.2.

Theorem 2.11

Let Assumptions A and B hold and assume that one evaluation of |$\varPsi _{1.5}$| costs |${\mathcal{O}}(1)$|⁠. Then it holds that

$$ \begin{align*} & {\mathbb{E}}_{} \left[{\textrm{Cost}\!\left({\nu}\right)} \right] = {\mathcal{O}}\big( h^{-1} \log(h^{-1}) \big). \end{align*} $$

3. Theory and proofs

This section proves theoretical properties of the adaptive timestepping methods of order 1 and 1.5. We first describe how critical regions combined with adaptive timestepping can bound the overshoot of the diffusion process with high probability, and thereafter use this property to prove the strong convergence of the method given by Theorems 2.7 and 2.10. Lastly, we prove upper bounds for the expected computational cost of the methods.

Let us first state a few theoretical results that will be needed in later proofs.

Proposition 3.1

Let Assumption B hold. Recall that |$X$| denotes the exact solution of the SDE (1.2) and that |$\overline{X}$| denotes the numerical solution computed on the adaptive mesh |${\mathcal{T}}^{\varDelta t}$| with the strong Itô–Taylor scheme of order |$\gamma \in \{1, 1.5\}$|⁠. Then for any |$p \ge 1$|⁠, the following bound holds:

$$ \begin{align*} & {\mathbb{E}}_{} \left[{ \sup_{t_k \in{\mathcal{T}}^{\varDelta t}} |X(t_k) - \overline{X}(t_k)|^p} \right] \le C h^{ \gamma p}, \end{align*} $$

where |$C>0$| depends on |$p$|⁠.

See Kloeden & Platen (1992, Theorem 10.6.3) for a proof of Proposition 3.1.

Proposition 3.2

For the exact solution of the SDE (1.2) and a sufficiently small |$h>0$|⁠, it holds for any |$p \ge 1$| that

$$ \begin{align*} & {\mathbb{E}}_{} \left[{ \max_{k\in \{0,1,\ldots, \lceil T/ h\rceil\}} \sup_{s \in [0, h]} |X(kh+s) - X(k h)|^p} \right] \le C \sqrt{ h^p \log( h^{-1})}, \end{align*} $$

where |$C>0$| depends on |$p$|⁠.

The above proposition is a direct consequence of replacing the piecewise constant Euler–Maruyama approximation with a piecewise constant interpolation of the exact solution of the SDE in Müller-Gronbach (2002, Theorem 2).

3.1 Order 1 method

This section proves theoretical results for the order 1 method.

For the Itô process (1.2) and |$t>s\ge 0$|⁠, let

$$ \begin{align}& M(s,t) := \sup_{r\ \in\ [s, t]} \left| {X(r) - X(s)} \right|\!.\end{align} $$

(3.1)

One may view |$M(s,t)$| as the maximum stride the process |$X$| takes over the interval |$[s,t]$|⁠. The lemma below shows that the threshold parameter |$\delta $| is chosen sufficiently large to ensure that the maximum stride |$X$| takes over every interval in the mesh |${\mathcal{T}}^{\varDelta t}$| is with very high probability bounded by |$\delta $|⁠. This estimate will help us bound the probability that the numerical solution exits the domain |$D$| from the noncritical region in the proof of Theorem 2.7 (i.e., to bound the probability of exiting |$D$| when using a large timestep).

Lemma 3.3

Let Assumptions A and B hold, and assume that the timestep parameter |$h \in (0,1)$| is sufficiently small. For the threshold parameter |$\delta = \sqrt{8 \widehat C_b h d\, \log (h^{-1})}$| and the maximal-stride set

$$ \begin{align*} & A:= \big\{\omega \in \varOmega \mid M(t_n, t_{n} + \varDelta t_n) \le \delta \quad \forall n \in \{0, 1,\ldots, N-1\} \big\}, \end{align*} $$

it then holds that |${\mathbb{P}}(A) = 1 - {\mathcal{O}}(h)$|⁠.

Proof.

From the adaptive timestepping, we know that the mesh |${\mathcal{T}}^{\varDelta t}$| contains |$N = |{\mathcal{T}}^{\varDelta t}|-1$| many intervals where |$N$| is a random integer that is bounded from below by |$T/h$| and from above by |$T/h^2$|⁠. At most |$T/h$| of the intervals are of length |$h$| and at most |$T/h^2$| of the intervals are of length |$h^2$|⁠. To avoid complications due to a random number of elements in the mesh, we extend the mesh |${\mathcal{T}}^{\varDelta t}$| to span over |$[0,2T]$| in such a way that the extended mesh agrees with |${\mathcal{T}}^{\varDelta t}$| over the interval |$[0,t_N]$| and contains exactly |$T/h$| many intervals of length |$h$| and |$T/h^2$| many intervals of length |$h^2$|⁠. In other words,

$$ \begin{align*} & {\mathcal{T}}^{E} = \{t_0, t_1,\ldots, t_N, t_{N+1}, \ldots, t_{\widehat N}\}, \end{align*} $$

where |${\mathcal{T}}^{E} \cap{\mathcal{T}}^{\varDelta t} = {\mathcal{T}}^{\varDelta t}$| and |$t_{\widehat N} = 2T$| with |$\widehat N = T/h + T/h^2$|⁠. And for

$$ \begin{align*} & \widehat \varDelta^1:= \left\{ k \in \{0,1,\ldots, \widehat N-1\} \mid \varDelta t_k = h \right\} \end{align*} $$

and

$$ \begin{align*} & \widehat \varDelta^{2}:= \big\{ k \in \{0,1,\ldots, \widehat N-1\} \mid \varDelta t_k = h^2 \big\}, \end{align*} $$

we have that |$|\widehat \varDelta ^1| = T/h$| and |$|\widehat \varDelta ^2| = T/h^2$|⁠, respectively. We represent these two sets of integers and relabel their associated mesh points as follows:

$$ \begin{align*} & \widehat \varDelta^1 = \{ \widehat \varDelta^1(1), \widehat\varDelta^1(2),\ldots, \widehat\varDelta^1({T/h}) \} \quad \textrm{with} \quad t^h_n:= t_{\widehat\varDelta^1(n)}\qquad n \in \{1,2,\ldots, T/h\} \end{align*} $$

and

$$ \begin{align*} & \widehat \varDelta^{2} = \{\widehat\varDelta^{2}(1), \widehat\varDelta^{2}(2),\ldots, \widehat\varDelta^{2}({T/h^2}) \} \quad \textrm{with} \quad t^{h^2}_n:= t_{\widehat\varDelta^{2}(n)} \qquad n \in \{1,2,\ldots, T/h^2\}. \end{align*} $$

Introducing the maximal-stride set over the extended mesh

$$ \begin{align}& B := \! \left\{ {\omega \in \varOmega \; \big| \; M(t_{n}, t_{n+1}) \le \delta \quad \forall n \in \{0,1,\ldots, \widehat N-1\}} \right\}\!,\end{align} $$

(3.2)

and noting that |$B \subset A$|⁠, we achieve the following bound for the probability of |$A^{C}$|⁠:

$$ \begin{align} {\mathbb{P}}_{} \big( {A^{C}} \big) &\leq{\mathbb{P}}_{} \big( {B^{C}} \big)\nonumber\\ &\leq \sum_{n=0}^{\hat N-1} {\mathbb{P}}_{} \! \left( {M(t_{n}, t_{n} + \varDelta t_{n})> \delta} \right)\nonumber\\ &=\sum_{n =1}^{T/h} {\mathbb{P}}_{} \big( {M\big(t^{h}_{n}, t^{h}_{n} + h\big) > \delta} \big) + \sum_{n = 1}^{T/h^2} {\mathbb{P}}_{} \big( {M\big(t_{n}^{h^2}, t^{h^2}_{n} + h^2\big) > \delta} \big).\end{align} $$

(3.3)

Recall that the integral form of the SDE (1.2) is given by

$$ \begin{align}& X(r) = X(s) + \int_{s}^{r} a(X(u))\, \textrm{d}u + \int_{s}^{r} b(X(u))\,\textrm{d}W(u),\end{align} $$

(3.4)

and let

$$ \begin{align*} & C_a:= \sup_{x \in{\mathbb{R}}^d} |a(x)|. \end{align*} $$

This yields

$$ \begin{align*} & M\big(t^{h}_{n}, t^{h}_{n} + h\big) = \sup_{r\ \in\ [t_n, t_n+h]} |X(r) - X(t_n)| \le C_a h + \sup_{r\ \in\ [0, h]} \left|\int_{t_n^h}^{t_n^h+r} b(X(u))\,\textrm{d}W(u)\right|\!. \end{align*} $$

Assuming that |$h$| is sufficiently small so that

$$ \begin{align*} & \delta - C_a h \ge \frac{\delta} {\sqrt{2}}, \end{align*} $$

we obtain that

$$ \begin{align*} & {\mathbb{P}}_{} \big( {M\big(t^{h}_{n}, t^{h}_{n} + h\big)> \delta} \big) \le{\mathbb{P}}_{} \! \left( {\sup_{r\ \in\ [0, h]} \left| {\int_{t_n^h}^{t_n^h+r} b(X(u))\,\textrm{d}W(u)} \right| \ge \delta/\sqrt{2}} \right)\!. \end{align*} $$

Introducing |$\tilde{b}(u):= b(X(u+t_n))$| and |$\widetilde{W}(t):= W(t+t_n^h) - W(t_n^h)$|⁠, the above integral takes the form

$$ \begin{align*} & \int_{t_n^h}^{t_n^h+r} b(X(u))\,\textrm{d}W(u) = \int_0^r \tilde{b}(u) \, \textrm{d}\widetilde{W}(u). \end{align*} $$

Since |$t_n^h$| is an element in the mesh |${\mathcal{T}}^{E}$|⁠, it is a finite stopping time, and the strong Markov property therefore implies that |$\widetilde{W}(t)$| is a standard Wiener process associated to the filtration |$\sigma (\{\widetilde{W}(t)\}_{u \in [0,r]} ) \subset{\mathcal{F}}_{t_n+r}$|⁠, cf. Mörters & Peres (2010, Theorem 2.16). Furthermore, the integrand |$\tilde{b}(u)$| is a square-integrable and |${\mathcal{F}}_{t_n+u}$|-adapted stochastic process, and Assumption B.3 implies that

$$ \begin{align*} & |\xi|^{-2}\int_0^h \xi^\top \tilde{b}(u) \tilde{b}^\top(u) \xi \, \textrm{d}u \le \widehat C_b h \quad \quad \forall \xi \in{\mathbb{R}}^d\setminus\{0\}. \end{align*} $$

By Baldi (2017, Proposition 8.7), Doob’s martingale inequality yields that

$$ \begin{align*} & {\mathbb{P}}_{} \! \left( {\sup_{r\ \in\ [0, h]} \left| {\int_{0}^{r} \tilde b(u)\, \textrm{d}\widetilde{W}(u)} \right| \ge \delta/\sqrt{2}} \right) \le 2\, \textrm{d} \exp\left(-\frac{\delta^2}{4 \widehat C_b h d} \right) = 2\, \textrm{d} h^2 \end{align*} $$

for any |$n \in \{1, 2,\ldots , T/h\}$|⁠. Assuming |$h\le 2/3$|⁠, a similar argument yields that

$$ \begin{align*} & {\mathbb{P}} \big( {M\big(t_{n}^{h^2}, t^{h^2}_{n} + h^2\big)> \delta} \big) \le 2\, \textrm{d} h^3 \quad \forall n \in \{1,2,\ldots, T/h^2\}. \end{align*} $$

Inequality (3.3) yields that

$$ \begin{align*} & {\mathbb{P}}(A^C) \le \sum_{n = 1}^{T/h} {\mathbb{P}} \big( {M\big(t_{n}^{h}, t^{h}_{n} + h\big)> \delta} \big) + \sum_{n = 1}^{T/h^2} {\mathbb{P}}_{} \big( {M\big(t_{n}^{h^2}, t^{h^2}_{n} + h^2\big) > \delta} \big) \le 4\,\textrm{d}T h. \end{align*} $$

Proof of Theorem 2.7.

We first partition the exit-time error into two parts:

$$ \begin{align*} & {\mathbb{E}}_{} \left[{|\tau - \nu|} \right] = {\mathbb{E}}_{} \left[{ |\tau - \nu| 1_{\nu< \tau}} \right] + {\mathbb{E}}_{} \left[{ |\tau - \nu| \mathbbm{1}_{{\nu>\tau}}} \right] =: I + II. \end{align*} $$

Since |$\nu \in{\mathcal{T}}^{\varDelta t}$|⁠, Proposition 3.1 for |$\gamma =1$| implies that

$$ \begin{align*} & {\mathbb{E}}_{} \left[{|X(\nu) - \overline{X}(\nu)|} \right] \le{\mathbb{E}}_{} \left[{\max_{t_k \in{\mathcal{T}}^{\varDelta t}} |X(t_k) - \overline{X}(t_k)|} \right] = {\mathcal{O}}(h). \end{align*} $$

For term |$I$|⁠, |$\nu < \tau $| implies that |$\nu <T$|⁠. Consequently, |$\overline{X}(\nu ) \in D^C$| and |$X(\nu ) \in D$|⁠, so there exists a |$y \in \partial D$| satisfying that |$|X(\nu ) - y| \le |X(\nu ) -\overline{X}(\nu )|$|⁠, and of course also that |$\tau ^{\nu , y} = 0$|⁠. Thanks to the Lipschitz property (2.6), and the tower property for conditional expectations combined with |$\mathbbm{1}_{\nu <\tau }$| being |${\mathcal{F}}_{\min (\nu ,\tau )}$|-measurable, we obtain that

$$ \begin{align*} \begin{split} I &= {\mathbb{E}}_{} \left[{ {\mathbb{E}}_{} \left[{ (\tau - \nu) \mathbbm{1}_{\nu< \tau} \mid{\mathcal{F}}_\nu} \right]} \right]\\ &= {\mathbb{E}}_{} \big[{ {\mathbb{E}}_{} \big[{ \tau^{\nu, X(\nu)} \mathbbm{1}_{\nu< \tau} \mid{\mathcal{F}}_\nu} \big] } \big]\\ &= {\mathbb{E}}_{} \big[{ {\mathbb{E}}_{} \big[{ (\tau^{\nu, X(\nu)} - \tau^{\nu, y } ) \mathbbm{1}_{\nu< \tau}\mid{\mathcal{F}}_\nu} \big] } \big] \\ &= {\mathbb{E}}_{} \big[{ {\mathbb{E}}_{} \big[{ \tau^{\nu, X(\nu)} - \tau^{\nu, y } \mid{\mathcal{F}}_\nu} \big] \mathbbm{1}_{\nu< \tau}} \big] \\ &= {\mathbb{E}}_{} \left[{ \big(u(\nu, X(\nu)) - u(\nu, y) \big) \mathbbm{1}_{\nu< \tau}} \right] \\ & \le L {\mathbb{E}}_{} \left[{ |X(\nu) - y| \mathbbm{1}_{\nu< \tau}} \right]\\ & \le L {\mathbb{E}}_{} \left[{ |X(\nu) - \overline{X}(\nu)| } \right]\\ & = {\mathcal{O}}(h). \end{split} \end{align*} $$

For the second term, we assume for the given |$\xi>0$| that |$r:=h^{1-\xi } < \bar R_D$|⁠, where we recall that |$\bar R_D$| is defined in Assumption B.2, and introduce the second exit-time problem

$$ \begin{align*} & \tau_r = \inf\{t\ge 0 \mid X(t) \notin D_r\} \wedge T. \end{align*} $$

As |$r<\bar R_D$|⁠, Proposition 2.4 applies, which in particular means that the function |$u_r(t,x) = {\mathbb{E}}_{} \left [ {\tau _{r}^{t,x}} \right ]$| satisfies the Lipschitz property (2.6).

Noting that |$\tau _r \ge \tau $|⁠, we obtain

$$ \begin{align*} & II = {\mathbb{E}}_{} \left[{ (\nu - \tau) \mathbbm{1}_{\nu> \tau}} \right] \le{\mathbb{E}}_{} \left[{ \tau_r - \tau } \right] + {\mathbb{E}}_{} \left[{ (\nu - \tau_r) \mathbbm{1}_{\nu > \tau}} \right] =: II_1 + II_2. \end{align*} $$

Here,

$$ \begin{align*} & II_1 = {\mathbb{E}}_{} \left[{ {\mathbb{E}}_{} \left[{\tau_r - \tau \mid{\mathcal{F}}_\tau } \right] } \right] = {\mathbb{E}}_{} \big[{ {\mathbb{E}}_{} \big[{\tau_r^{\tau,X(\tau)} \mid{\mathcal{F}}_\tau } \big] } \big] = {\mathbb{E}}_{} \left[{ u_r(\tau, X(\tau)) } \right] \le L r = {\mathcal{O}}(h^{1-\xi}), \end{align*} $$

where the last inequality follows from (2.6) and

$$ \begin{align}& X(\tau) \in\partial D \implies \textrm{d}(X(\tau), \partial D_r) = r.\end{align} $$

(3.5)

$$ \begin{align}& \partial D \ni X(\tau) \mapsto \pi_{r}(X(\tau)) \in \partial D_r \qquad \textrm{satisfying that} \quad |\pi_{r}(X(\tau)) - X(\tau) | = r.\end{align} $$

(3.6)

Using that |$u_r(T, \cdot ) = 0$| and |$u_r(\tau , \pi _{r}(X(\tau ))) = 0$| whenever |$\tau <T$|⁠, we verify the last inequality for |$II_1$| as follows:

$$ \begin{align*} \begin{split} {\mathbb{E}}_{} \left[{ u_r(\tau, X(\tau))} \right] &= {\mathbb{E}}_{} \left[{ u_r(\tau, X(\tau)) \mathbbm{1}_{\tau <T}} \right]\\ & = {\mathbb{E}}_{} \left[{ \big|\, u_r(\tau, X(\tau)) - u_r(\tau, \pi_{r}(X(\tau))) \,\big| \mathbbm{1}_{\tau <T}} \right]\\ & \le{\mathbb{E}}_{} \left[{ L \big|\, X(\tau) -\pi_{r}(X(\tau)) \,\big| \mathbbm{1}_{\tau <T}} \right] \\ &\le L r. \end{split} \end{align*} $$

To bound |$II_2$|⁠, we first note that since |$(\nu - \tau _r) \mathbbm{1}_{\tau _r = T} \le 0$|⁠, it holds that

$$ \begin{align*} & II_2 \le{\mathbb{E}}_{} \big[{ (\nu - \tau_r) \mathbbm{1}_{ \{\nu> \tau\} \cap \{ \tau_r < T\}}} \big]. \end{align*} $$

Let us introduce

$$ \begin{align*} & t^*:= \max\{t_k \in{\mathcal{T}}^{\varDelta t} \mid t_k \le \tau_r \}, \end{align*} $$

and let |$A$| be the maximal-stride set defined in Lemma 3.3, for which we recall that |${\mathbb{P}}(A) = 1 - {\mathcal{O}}(h)$|⁠. Since |$\tau _r - t^* \le h$|⁠, it holds that

$$ \begin{align*} & | X(\tau_r, \omega)- X(t^*, \omega)| \le \delta \qquad \forall \omega \in A \end{align*} $$

and since |$\tau _r <T \implies X(\tau _r) \in \partial D_r$|⁠, we also have that

$$ \begin{align*}& \textrm{d}(X(t^*, \omega), \partial D_r) \le \delta \qquad \omega \in A \cap \{\tau_r <T\}. \end{align*} $$

To bound the distance between the exact process and the numerical one at time |$t^*$|⁠, let |$p^* \in{\mathbb{N}}$| be sufficiently large so that |$p^*\xi> 1$|⁠. Then by Proposition 3.1

$$ \begin{equation} {\mathbb{P}}(|\overline{X}(t^*) - X(t^*)| \ge r) \le \frac{{\mathbb{E}}_{} \big[{|\overline{X}(t^*) - X(t^*)|^{p^*}} \big]}{h^{(1-\xi) p^*}} = {\mathcal{O}}(h). \end{equation} $$

(3.7)

Let further |$\widetilde{A}:= \{\omega \in A \mid |\overline{X}(t^*) - X(t^*)| < r\}$|⁠, and note that |${\mathbb{P}}(\widetilde{A}) = 1 - {\mathcal{O}}(h)$|⁠. We will next show that for all paths in |$\widetilde A \cap \{\tau _r <T\}$|⁠, the numerical solution uses the smallest timestep at |$t^*$|⁠, meaning that

$$ \begin{align}& \varDelta t\big(\overline{X}(t^*,\omega)\big) = h^2 \quad \forall \omega \in \widetilde{A} \cap \{\tau_r <T\}.\end{align} $$

(3.8)

Recall first that the noncritical region of |$D$| for the order 1 method is given by |$\widetilde D = D \setminus V_{\partial D}(0,\delta )$|⁠, and observe that for all paths |$\omega \in \widetilde{A} \cap \{\tau _r <T\}$|⁠, it holds that

$$ \begin{align*} & \textrm{d}\left(\overline{X}(t^*, \omega), \partial D_r\right) \le \textrm{d}(\overline{X}(t^*, \omega), X(t^*)) + \textrm{d}\left(X(t^*),\partial D_r\right) < \delta +r. \end{align*} $$

Since |$d(\widetilde D, \partial D_r) \ge \delta +r$|⁠, we conclude that |$ \overline{X}(t^*, \omega ) \notin \widetilde D$| and (3.8) is verified. Thanks to (3.8), we can sharply estimate the distance between |$\overline{X}(t^*)$| and |$X(\tau _r)$| as follows:

$$ \begin{align*} \begin{split} {\mathbb{P}}&\left( \left\{\left|\overline{X}(t^*) - X(\tau_r)\right| \ge r \right\} \cap \{\tau_r <T\}\right) \\ &\le{\mathbb{P}}\big( \big\{\left|\overline{X}(t^*) - X(\tau_r)\right| \ge h^{1-\xi} \big\} \cap \{\tau_r <T\} \cap \widetilde{A} \big) + {\mathcal{O}}(h)\\ &\le \frac{{\mathbb{E}}_{} \big[{\left|\overline{X}(t^*) - X(\tau_r)\right|^{p^*} \mathbbm{1}_{\widetilde{A} \cap \{\tau_r <T\}}} \big]}{ h^{(1-\xi)p^*}} + {\mathcal{O}}(h)\\ &\le \frac{2^{p^*-1}{\mathbb{E}}_{} \big[{|\overline{X}(t^*) - X(t^*)|^{p^*} \mathbbm{1}_{\widetilde{A} \cap \{\tau_r <T\}} + \left|X(t^*) - X(\tau_r)\right|^{p^*}\mathbbm{1}_{\widetilde{A} \cap \{\tau_r <T\}}} \big]}{ h^{(1-\xi)p^*}} +{\mathcal{O}}(h) \\ &= {\mathcal{O}}(h). \end{split} \end{align*} $$

The first summand in the expectation in the last inequality is bounded by (3.7) and the second one is bounded by Proposition 3.2, equation (3.8) (which implies that |$\tau _r-t^* \le h^2$| for all |$\omega \in \widetilde A \cap \{\tau _r <T\}$|⁠), and

$$ \begin{align*} \begin{split} {\mathbb{E}}_{} \big[{|X(t^*) - X(\tau_r)|^{p^*}\mathbbm{1}_{ \widetilde{A}\cap \{\tau_r <T\}}} \big] &\le{\mathbb{E}}_{} \left[{ \max_{k\in \{0,\ldots, T/h^2 -1\}} \sup_{s \in [0,h^2] } |X(kh^2 + s) - X(k h^2)|^{p^*}} \right]\\ &= {\mathcal{O}}( h^{p^*} \sqrt{\log(h^{-1})} ). \end{split} \end{align*} $$

For |$G:= \{\omega \in \varOmega \mid |\overline{X}(t^*) - X(\tau _r)| < r \}$|⁠, we conclude that |${\mathbb{P}}(G^C \cap \{\tau _r <T\}) = {\mathcal{O}}(h)$| and

$$ \begin{align*} \begin{split} \omega \in G \cap \{\tau_r < T\} &\implies \textrm{d}(\overline{X}(t^*,\omega), \partial D_r) < r\\ &\implies \overline{X}(t^*,\omega) \notin D\\ &\implies \nu(\omega) \le t^*(\omega)\\ &\implies \nu(\omega) \le \tau_r(\omega). \end{split} \end{align*} $$

Consequently,

$$ \begin{align*} \begin{split} II_2 &\le{\mathbb{E}}_{} \big[{ (\nu - \tau_r) \mathbbm{1}_{\{\nu> \tau\} \cap \{\tau_r < T\}}} \big]\\ &\le \underbrace{{\mathbb{E}}_{} \big[{ (\nu - \tau_r) \mathbbm{1}_{\{\nu> \tau\} \cap \{\tau_r < T\}\cap G}} \big]}_{\le\ 0} + T \, {\mathbb{P}}\big(G^C \cap \{\tau_r <T\}\big)\\ & = {\mathcal{O}}(h). \end{split} \end{align*} $$

We next prove the computational cost result for the order 1 method.

Proof of Theorem 2.8.

Let |$s_n:= n h^2$| for |$n=0,1,\ldots $| denote a set of deterministic uniformly spaced mesh points. This mesh contains all realizations of the adaptive mesh as sub-meshes, in the sense that |${\mathcal{T}}^{\varDelta t}(\omega )\subset \{s_n\}_{n\ge 0}$| for all |$\omega \in \varOmega $|⁠, and we have that

$$ \begin{align} {\mathbb{E}}_{} \left[{\textrm{Cost}\!\left({\nu}\right)} \right] &= {\mathbb{E}}_{} \left[{\int_{0}^{\nu} \frac{1}{\varDelta t(\overline{X}(t))}\,\textrm{d}t} \right]\nonumber\\ & \le \int_0^T \frac{{\mathbb{E}}_{} \big[{\mathbbm{1}_{\{\nu>t\} \cap \{ \varDelta t(\overline{X}(t)) = h \}}} \big]}{h} + \frac{{\mathbb{E}}_{} \big[{\mathbbm{1}_{\{\nu>t\} \cap \{ \varDelta t(\overline{X}(t)) = h^2 \}}} \big]}{h^2} \,\textrm{d}t\nonumber\\ &\le \frac{T}{h} + \sum_{n=0}^{h^{-2}-1} {\mathbb{P}}\big(\{\nu>s_{n}\} \cap \big\{ \varDelta t(\overline{X}(s_n)) = h^2 \big\}\big). \end{align} $$

(3.9)

$$ \begin{align*} & \tau_{2\delta} = \inf\{ t \ge 0 \mid X(t) \notin D_{2 \delta} \} \wedge T. \end{align*} $$

Let further |$A$| denote the maximal-stride set defined in Lemma 3.3, and let |$B:= \{ \omega \mid \max _{t_k \in{\mathcal{T}}^{\varDelta t}} |X(t_k) - \overline{X}(t_k)| \le \delta \}$|⁠. Then it holds that

$$ \begin{align*} & \omega \in A\cap B \implies \nu(\omega) \le \tau_{2\delta}(\omega), \end{align*} $$

which we prove by contradiction as follows: suppose that |$\omega \in A \cap B$| and |$\tau _{2\delta }<\nu $|⁠. Then |$\overline{X}(\tau _{2\delta }) \in D$| and |$X(\tau _{2\delta }) \in \partial D_{2\delta }$|⁠. Let |$t_k$| denote the largest mesh point in |${\mathcal{T}}^{\varDelta t}$| that is smaller or equal to |$\tau _{2\delta }$|⁠. Then |$(t_k,\tau _{2\delta }) \subset (t_k,t_{k+1})$| and |$\omega \in A$| implies that |$X(t_k) \notin D_{\delta }$| while we have that |$\overline{X}(t_k) = \overline{X}(\tau _{2\delta }) \in D$|⁠. Hence, |$\textrm{d}(\overline{X}(t_k), X(t_k))> \delta $|⁠, which contradicts that |$\omega \in B$|⁠.

From Proposition 3.1, we have that

$$ \begin{align*} \begin{split} {\mathbb{P}}_{} \! \big( {B^C} \big) &\le{\mathbb{E}}_{} \left[{ \frac{\max_{t_k \in{\mathcal{T}}^{\varDelta t}} |X(t_k) - \overline{X}(t_k)|^2}{\delta^2}} \right]\\ & = {\mathcal{O}}\left( \frac{h}{\log(h^{-1})} \right) = o(h), \end{split} \end{align*} $$

and since |${\mathbb{P}}(A) = 1- {\mathcal{O}}(h)$|⁠, we conclude that |${\mathbb{P}}_{} \! \left ( {A\cap B} \right ) = 1 - {\mathcal{O}}(h)$|⁠. Recalling further that

$$ \begin{align*} & \varDelta t(\overline{X}(s_n)) = h^2 \iff \textrm{d}(\overline{X}(s_n), \partial D) \le \delta, \end{align*} $$

and

$$ \begin{align*} & B \cap \{ \textrm{d}(\overline{X}(s_n), \partial D) \le \delta \} \subset B \cap \{ \textrm{d}(X(s_n), \partial D) \le 2\delta \}, \end{align*} $$

we obtain that

$$ \begin{align} {\mathbb{P}}\big(\{\nu>s_{n}\} \cap \big\{ \varDelta t(\overline{X}(s_n))\phantom{h^2}\!\!\!\!\! &= h^2 \big\}\big)\nonumber\\ & = {\mathbb{P}}\big(A \cap B \cap \{\nu>s_{n}\} \cap \{ \textrm{d}(\overline{X}(s_n), \partial D) \le \delta \}\big) + {\mathcal{O}}(h)\nonumber \\ & \le{\mathbb{P}}\big(\{\tau_{2\delta}>s_{n}\} \cap \{ \textrm{d}(X(s_n), \partial D) \le 2\delta \}\big) + {\mathcal{O}}(h). \end{align} $$

(3.10)

To bound the first summand, note first that the ‘density’ of the SDE (1.2) on the domain |$D_{2\delta }$| with paths removed when they exit |$D_{2\delta }$| is a generalized function |$p(t,x): [0,T]\times \overline D_{2\delta } \to [0,\infty ]$| that solves the following absorbing-boundary Fokker–Planck equation (Schuss, 1980; Naeh et al., 1990):

$$ \begin{align}& \left.\begin{aligned} \partial_t p &= -\nabla \cdot (a p) + \frac{1}{2} \nabla \cdot \!\big( {bb^{\top} \nabla p} \big) && \textrm{in} \quad (0,T] \times D_{2\delta}, \\ p &= 0 && \textrm{on} \quad (0,T] \times \partial D_{2\delta}\\ p(0,x) &= \delta(x-x_0) && \quad x\in \overline D_{2\delta}. \end{aligned}\right\}\end{align} $$

(3.11)

By the regularity constraints imposed on coefficients in Assumption B with |$2\delta \le \bar R_D$| and since the boundary |$\partial D_{2\delta }$| is |$C^3$|⁠, the Fokker–Planck equation has a unique solution that blows up at |$(t,x) = (0,x_0)$|⁠, and the solution may be viewed as the Green’s function |$p(t,x) = G(t,x;0,x_0)$|⁠, cf. Friedman (1964, Chapter 3.7). Note that the well-posedness holds for any |$\delta \in [0,\bar R_D/2)$|⁠, and that we are considering a parametrized set of solutions |$p(x,t;\delta )$| to the PDE (3.11) where we suppress the dependence on |$\delta $| when confusion is not possible.

Considering |$p$| as a mapping from a smaller domain where a neighborhood |$N$| of the singular point |$(0,x_0)$| is removed from the full domain, it holds that |$p:([0,T] \times \overline{D}_{2\delta }) \setminus N \to [0,\infty )$| is a continuously differentiable function, cf. Friedman (1964, Chapters 1 and 3.7). Since we are interested in the properties of |$p$| near the cylindrical boundary |$[0,T]\times \partial D_{2\delta }$|⁠, we proceed as follows to remove a cylindrical neighborhood containing |$(0, x_0)$| from the domain of |$p$|⁠. For |$\lambda := d(x_0, \partial D)/2$|⁠, let |$\bar h>0$| be an upper bound for the step size parameter, such that |$d(x_0, \partial D_{-2\delta })> \lambda $| and |$\delta (h) < \bar R_D/2$| hold whenever |$h \le \bar h$|⁠. For |$\varGamma _{2\delta }:= D_{2\delta } \setminus D_{-2\delta }$|⁠, it then holds that |$d(x_0, \varGamma _{2\delta })>\lambda $| and Lemma A.1 implies that there exists a constant |$C_p>0$| that is uniform in |$h \in [0,\bar h]$| such that

$$ \begin{align}& \max_{(t,x) \in [0,T] \times \overline{\varGamma}_{2\delta}} p(t,x) \le 4C_p \delta.\end{align} $$

(3.12)

And Lemma A.2 implies there exists a constant |$C_{{\mathcal{I}}}>0$| such that

$$ \begin{align*} & \max_{r\ \in\ [-\bar R_D, \bar R_D]} \int_{\partial D_{r}} \textrm{d}S(x) \le C_{{\mathcal{I}}}. \end{align*} $$

Thanks to the co-area formula,

$$ \begin{align*} \begin{split} {\mathbb{P}}(\{\tau_{2\delta}>s_{n}\} \cap \{ \textrm{d}(X(s_n), \partial D) \le 2 \delta \}) &= \int_{\varGamma_{2\delta}} p(s_n,x)\, \textrm{d}x\\ &= \int_{-2\delta}^{2\delta} \int_{\partial D_{r}} p(s_n,x)\, \textrm{d}S(x)\, \textrm{d}r\\ &\stackrel{{\textsubscript{\hyperlink{deqn35}{(3.12)}}}}{\le} 4C_p \int_{-2\delta}^{2 \delta} \delta \int_{\partial D_{r}} \textrm{d}S(x)\, \textrm{d}r \\ & \le 16 C_p C_{\mathcal{I}} \, \delta^2, \end{split} \end{align*} $$

and it follows from (3.9) and (3.10) that

$$ \begin{align*} \begin{split} {\mathbb{E}} \left[{\textrm{Cost}\!\left({\nu}\right)} \right] &\le{\mathcal{O}}(h^{-1}) + \sum_{n=0}^{h^{-2}-1} {\mathbb{P}}(\{\tau_{2\delta}>s_{n}\} \cap \{\textrm{d}(X(s_n), \partial D)\} \le 2\delta) \\ &= {\mathcal{O}}(h^{-2} \delta^2 ) = {\mathcal{O}}(h^{-1} \log(h^{-1})). \end{split} \end{align*} $$

3.2 Order 1.5 method

This section proves theoretical results for the order 1.5 method.

The lemma below shows that the threshold parameters |$\delta _1$| and |$\delta _2$| are chosen sufficiently large to ensure that the maximum stride |$X$| takes over every interval in the mesh |${\mathcal{T}}^{\varDelta t}$| is with high likelihood bounded by either |$\delta _1$| or |$\delta _2$|⁠, depending on the length of the interval. The estimate will help us bound the probability that the numerical solution exits the domain |$D$| from anywhere, but the critical region nearest the boundary of |$D$| in the proof of Theorem 2.10 (i.e., bound the probability of exiting |$D$| using a larger timestep than |$h^3$|⁠).

Lemma 3.4

Let Assumptions A and B hold, and assume that the parameter |$h \in (0,1)$| is sufficiently small. For the timesteps in the mesh |${\mathcal{T}}^{\varDelta t}$|⁠, let

$$ \begin{align*} \begin{split} \varDelta^j &:= \big\{ k \in \{0,1,\ldots, N-1\} \mid \varDelta t_k = h^j \big\} \quad \textrm{for} \quad j=1,2,3. \end{split} \end{align*} $$

For the threshold parameters

$$ \begin{align*} & \delta_1 = \sqrt{12 \widehat C_b\, \textrm{d} h \, \log(h^{-1})} \quad \textrm{and} \quad \delta_2 = \sqrt{16 \widehat C_b\, \textrm{d} h^2\, \log(h^{-1})} \end{align*} $$

and the maximal-stride sets

$$ \begin{align*} \begin{split} A_1 &:= \big\{\omega \in \varOmega \mid M(t_n, t_{n} + \varDelta t_n) \le \delta_1 \quad \forall n \in \varDelta^1 \big\} \\ A_2 &:= \big\{\omega \in \varOmega \mid M(t_n, t_{n} + \varDelta t_n) \le \delta_2 \quad \forall n \in \varDelta^{2} \cup \varDelta^{3} \big\}, \end{split} \end{align*} $$

it then holds for |$A:= A_1\cap A_2$| that |${\mathbb{P}}(A) = 1 - {\mathcal{O}}(h^{2})$|⁠.

Proof.

From the adaptive timestepping, we know that the mesh |${\mathcal{T}}^{\varDelta t}$| contains |$N = |{\mathcal{T}}^{\varDelta t}|-1$| many intervals where |$N$| is a random integer that is bounded from below by |$T/h$| and from above by |$T/h^3$|⁠. At most |$T/h$| of the timesteps are of length |$h$|⁠, at most |$T/h^2$| are of length |$h^2$| and at most |$T/h^3$| are of length |$h^3$|⁠. To avoid complications due to a random number of elements in the mesh, we extend the mesh |${\mathcal{T}}^{\varDelta t}$| to span over |$[0,3T]$| in such a way that the extended mesh agrees with |${\mathcal{T}}^{\varDelta t}$| over the interval |$[0,t_N]$| and contains exactly |$T/h$| many timesteps of length |$h$|⁠, |$T/h^2$| many timesteps of length |$h^2$| and |$T/h^3$| many timesteps of length |$h^3$|⁠. In other words,

$$ \begin{align*} & {\mathcal{T}}^{E} = \{t_0, t_1,\ldots, t_N, t_{N+1}, \ldots, t_{\widehat N}\}, \end{align*} $$

where |${\mathcal{T}}^{E} \cap{\mathcal{T}}^{\varDelta t} = {\mathcal{T}}^{\varDelta t}$| and |$t_{\widehat N} = 3T$| with |$\widehat N = T/h + T/h^2 + T/h^3$|⁠. And for

$$ \begin{align*} & \widehat \varDelta^j:= \big\{ k \in \{0,1,\ldots, \widehat N-1\} \mid \varDelta t_k = h^j \big\} \quad \textrm{for} \quad j =1,2,3, \end{align*} $$

it holds that

$$ \begin{align*} & |\widehat \varDelta^j| = \frac{T}{h^j} \quad \textrm{for} \quad j=1,2,3. \end{align*} $$

We represent these three sets of integers and relabel their associated mesh points as follows: for |$j=1,2,3$|⁠, let

$$ \begin{align*} & \widehat \varDelta^j = \{ \widehat \varDelta^j(1), \widehat\varDelta^j(2),\ldots, \widehat\varDelta^j({T/h^{j}}) \} \quad \textrm{with} \quad t^{h^j}_n:= t_{\widehat\varDelta^j(n)}\qquad n \in \{1,2,\ldots, T/h^j\}. \end{align*} $$

Introducing the following maximal-stride sets over the extended mesh

$$ \begin{align} \begin{split} B_1 &:= \! \big\{ {\omega \in \varOmega \; \big| \; M(t_{n}, t_{n+1}) \le \delta_1 \quad \forall n \in \widehat \varDelta^1} \big\}\\ B_2 &:= \! \big\{ {\omega \in \varOmega \; \big| \; M(t_{n}, t_{n+1}) \le \delta_2 \quad \forall n \in \widehat \varDelta^2 \cup \widehat \varDelta^3} \big\} \end{split}\end{align} $$

(3.13)

and noting that |$B:= B_1 \cap B_2$| is a subset of |$A$|⁠, we obtain the following upper bound:

$$ \begin{align} {\mathbb{P}}_{} \big( {A^{C}} \big) &\leq{\mathbb{P}}_{} \big( {B^{C}} \big)\nonumber\\ &\leq \sum_{n =1}^{T/h} {\mathbb{P}}_{} \big( {M\big(t^{h}_{n}, t^{h}_{n} + h\big)> \delta_1} \big) + \sum_{n = 1}^{T/h^2} {\mathbb{P}}_{} \big( {M\big(t_{n}^{h^2}, t^{h^2}_{n} + h^2\big) > \delta_2} \big)\nonumber\\ & \quad + \sum_{n = 1}^{T/h^3} {\mathbb{P}}_{} \big( {M\big(t_{n}^{h^3}, t^{h^3}_{n} + h^3\big) > \delta_2} \big).\end{align} $$

(3.14)

Assuming that |$h \in (0,1)$| is sufficiently small so that

$$ \begin{align*} & \delta_1 - C_a h \ge \frac{\delta_1}{\sqrt{2}}, \qquad \delta_2 - C_a h^2 \ge \frac{\delta_2}{\sqrt{2}} \qquad \textrm{and} \quad h \le \frac{4}{5}, \end{align*} $$

then a similar use of Doob’s martingale inequality as in the proof of Lemma 3.3 yields

$$ \begin{align*} {\mathbb{P}}_{} \big( {M\big(t^{h}_{n}, t^{h}_{n} + h\big)> \delta_1} \big) &\le 2\, \textrm{d} h^{3} && \forall n \in \{1, 2,\ldots, T/h\},\\{\mathbb{P}}_{} \big( {M\big(t^{h^2}_{n}, t^{h^2}_{n} + h^2\big) > \delta_2} \big) &\le 2d \exp\left(-\frac{\delta_2^2}{4 \widehat C_b h^2 d} \right) = 2\, \textrm{d} h^{4} && \forall n \in \{1, 2,\ldots, T/h^2\},\\{\mathbb{P}}_{} \big( {M\big(t^{h^3}_{n}, t^{h^3}_{n} + h^3\big) > \delta_2} \big) &\le 2d \exp\left(-\frac{\delta_2^2}{4 \widehat C_b h^3 d} \right) \le 2\, \textrm{d} h^{5} &&\forall n \in \{1, 2,\ldots, T/h^3\}. \end{align*} $$

Conclusion: |${\mathbb{P}}(A^C) \le 6 d\, T \,h^{2}$|⁠.

Abbreviated proof of Theorem 2.10.

The exit-time error can be partitioned into two parts:

$$ \begin{align*} & {\mathbb{E}}_{} \left[{|\tau - \nu|} \right] = {\mathbb{E}}_{} \left[{ |\tau - \nu| 1_{\nu< \tau}} \right] + {\mathbb{E}}_{} \left[{ |\tau - \nu| 1_{\nu>\tau}} \right] =: I + II. \end{align*} $$

Similarly as in the proof of Theorem 2.7, but now using the strong Itô–Taylor method of order |$\gamma =1.5$|⁠, we obtain that

$$ \begin{align*} & I \le L {\mathbb{E}}_{} \left[{ |X(\nu) - \overline{X}(\nu)|} \right] = {\mathcal{O}}(h^{3/2}). \end{align*} $$

For the second term, assume for the given |$\xi>0$| that |$r:=h^{3/2-\xi } < \bar R_D$| and introduce the second exit-time problem

$$ \begin{align*} & \tau_r = \inf\{t\ge 0 \mid X(t) \notin D_r\} \wedge T. \end{align*} $$

As Proposition 2.4 applies, we recall that |$u_r(t,x) = {\mathbb{E}}_{} \left [ {\tau _{r}^{t,x}} \right ]$| satisfies the Lipschitz property (2.6). Noting that |$\tau _r \ge \tau $|⁠, we obtain

A similar argument as in the proof of Theorem 2.7 yields that

$$ \begin{align*} & II_1 = {\mathbb{E}}_{} \left[{ {\mathbb{E}}_{} \left[{\tau_r - \tau \mid{\mathcal{F}}_\tau } \right] } \right] = {\mathbb{E}}_{} \left[{ u_r(\tau, X(\tau)) } \right] \le L r = {\mathcal{O}}(h^{3/2-\xi}). \end{align*} $$

To bound |$II_2$|⁠, first observe that |$(\nu - \tau _r) \mathbbm{1}_{\tau _r = T} \le 0$| implies that

$$ \begin{align*} & II_2 \le{\mathbb{E}}_{} \left[{ (\nu - \tau_r) \mathbbm{1}_{ \{\nu> \tau\} \cap \{ \tau_r < T\}}} \right]. \end{align*} $$

We introduce

$$ \begin{align}& t^*:= \max\left\{t_k \in{\mathcal{T}}^{\varDelta t} \mid t_k \le \tau_r \right\}\end{align} $$

(3.15)

and note for later reference that |$\tau _r - t^* \le \varDelta t(\overline{X}(t^*))$|⁠. Let |$A$| denote the maximal-stride set defined in Lemma 3.4, where we recall that |${\mathbb{P}}(A) = 1 - {\mathcal{O}}(h^{2})$|⁠. Since |$\tau _r - t^* \le h$|⁠, it holds that

$$ \begin{align*} & |X(t^*, \omega) - X(\tau_r, \omega)| \le \delta_1 \qquad \forall \omega \in A \end{align*} $$

and

$$ \begin{align*}& d\left(X(t^*, \omega), \partial D_r\right) \le \delta_1 \qquad \omega \in A \cap \{\tau_r <T\}. \end{align*} $$

Using Proposition 3.1 for |$\gamma =1.5$|⁠, we proceed to bound the distance between the exact diffusion process and the numerical solution at time |$t^*$|⁠: Let |$p^* \in{\mathbb{N}}$| be sufficiently large so that |$p^*\xi> 1.5$|⁠. Then by Proposition 3.1,

$$ \begin{equation} {\mathbb{P}}(|\overline{X}(t^*) - X(t^*)| \ge r) \le \frac{{\mathbb{E}}_{} \big[{|\overline{X}(t^*) - X(t^*)|^{p^*}} \big]}{h^{(3/2-\xi) p^*}} = {\mathcal{O}}(h^{3/2}). \end{equation} $$

(3.16)

We conclude that for |$\widetilde{A}:= \{\omega \in A \mid |\overline{X}(t^*) - X(t^*)| < r\} $|⁠, it holds that |${\mathbb{P}}(\widetilde{A}) = 1 - {\mathcal{O}}(h^{3/2})$|⁠.

We will next show that |$\textrm{d}(\overline{X}(t^*, \omega ), \partial D_r) < \delta _2+r$| for all |$\omega \in \widetilde{A}\cap \{\tau _r <T\}$|⁠, which by the timestepping (2.21) implies the that the smallest step size is used at time |$t^*$| in the numerical solution:

$$ \begin{align}& \varDelta t(\overline{X}(t^*, \omega)) = h^3 \quad \forall \omega \in \widetilde{A} \cap \{\tau_r <T\}.\end{align} $$

(3.17)

Observe first that

$$ \begin{align*} & \textrm{d}(\overline{X}(t^*, \omega), \partial D_r) \le \textrm{d}(\overline{X}(t^*, \omega), X(t^*, \omega)) + \textrm{d}\left(X(t^*,\omega),\partial D_r\right) < \delta_1 +r \quad \forall \omega \in \widetilde{A} \cap \{\tau_r <T\}, \end{align*} $$

and let |$\widetilde D:= D \setminus V_{\partial D}(0,\delta _1)$|⁠. Since |$d(\widetilde D, \partial D_r) \ge \delta _1 +r$|⁠, we conclude that |$ \overline{X}(t^*) \notin \widetilde D$| and |$\varDelta t(\overline{X}(t^*)) \le h^2$| for all paths in |$\widetilde{A} \cap \{\tau _r <T\}$|⁠. A recursive argument, where we restrict ourselves to paths in |$\widetilde{A} \cap \{\tau _r <T\} \subset A$|⁠, will sharpen the step size estimate to (3.17): first recall that

$$ \begin{align*} & \tau_r - t^* \le \varDelta t(\overline{X}(t^*)) \le h^2, \end{align*} $$

and Lemma 3.4 implies that

$$ \begin{align*} & \textrm{d}\left(X(t^*), X(\tau_r)\right) \le \sup_{s \in [0,\varDelta t(\overline{X}(t^*))]} |X(t^{*}+s) - X(t^*)| \le \delta_2. \end{align*} $$

Consequently,

$$ \begin{align*} & \textrm{d}\left(\overline{X}(t^*), \partial D_r\right) \le \textrm{d}(\overline{X}(t^*), X(t^*)) + \textrm{d}\left(X(t^*), X(\tau_r)\right) < \delta_2 +r, \end{align*} $$

which implies that |$\overline{X}(t^*) \notin D \setminus V_{\partial D}(0,\delta _2)$|⁠. This verifies Property (3.17).

Thanks to (3.17), we can sharpen the estimate of the distance between |$\overline{X}(t^*)$| and |$X(\tau _r)$|⁠:

$$ \begin{align*} \begin{split} {\mathbb{P}}&\left( \left\{\left|\overline{X}(t^*) - X(\tau_r)\right| \ge r \right\} \cap \{\tau_r <T\}\right) \\ &\le{\mathbb{P}}\big( \big\{\left|\overline{X}(t^*) - X(\tau_r)\right| \ge h^{3/2-\xi} \big\} \cap \{\tau_r <T\} \cap \widetilde{A}\big) + {\mathbb{P}}(\widetilde{A}^C)\\ &\le \frac{{\mathbb{E}}_{} \big[{\left|\overline{X}(t^*) - X(\tau_r)\right|^{p^*} \mathbbm{1}_{\widetilde{A}\cap \{\tau_r <T\}}} \big]}{ h^{(3/2-\xi)p^*}} + {\mathcal{O}}(h^{3/2})\\ &\le \frac{2^{p^*-1} {\mathbb{E}}_{} \big[{ \left|\overline{X}(t^*) - X(t^*)\right|^{p^*} \mathbbm{1}_{\widetilde{A} \cap \{\tau_r <T\}} + \left|X(t^*) - X(\tau_r)\right|^{p^*}\mathbbm{1}_{\widetilde{A}\cap \{\tau_r <T\}}} \big]}{ h^{(3/2-\xi)p^*}} +{\mathcal{O}}(h^{3/2}) \\ &= {\mathcal{O}}(h^{3/2}). \end{split} \end{align*} $$

The first summand in the expectation in the last inequality is bounded by (3.16), and the second one is bounded by Proposition 3.2, (3.17) and

$$ \begin{align*} \begin{split} {\mathbb{E}}_{} \big[{\left|X(t^*) - X(\tau_r)\right|^{p^*}\mathbbm{1}_{\widetilde{A}\cap \{\tau_r <T\}}} \big] &\le{\mathbb{E}}_{} \left[{ \max_{k\in \{0,1,\ldots, \lceil T/h^3\rceil-1\}} \sup_{s \in [0,h^3] } |X(kh^3 + s) - X(k h^3)|^{p^*} \mathbbm{1}_{\widetilde{A}}} \right]\\ &= {\mathcal{O}}\big( h^{3p^*/2} \sqrt{\log(h^{-1})} \big)\,\!. \end{split} \end{align*} $$

For |$G:= \{\omega \in \varOmega \mid |\overline{X}(t^*) - X(\tau _r)| < r \}$|⁠, we obtain that |${\mathbb{P}}(G^C \cap \{\tau _r <T\}) = {\mathcal{O}}(h^{3/2})$| and

$$ \begin{align*} \begin{split} \omega \in G \cap \{\tau_r < T\} \implies \overline{X}(t^*,\omega) \notin D \implies \nu(\omega) \le t^*(\omega) &\implies \nu(\omega) \le \tau_r(\omega). \end{split} \end{align*} $$

We conclude the proof with the following observation:

$$ \begin{align*} II_2 &\le{\mathbb{E}}_{} \big[{ (\nu - \tau_r) \mathbbm{1}_{\{\nu> \tau\} \cap \{\tau_r < T\}}} \big] \\ &\le \underbrace{{\mathbb{E}}_{} \big[{ (\nu - \tau_r) \mathbbm{1}_{\{\nu> \tau\} \cap \{\tau_r < T\}\cap G}} \big]}_{ \le\ 0} + T \, {\mathbb{P}}(G^{C} \cap \{\tau_r < T\} )\\ &= {\mathcal{O}}(h^{3/2}). \end{align*} $$

Up next, we prove the computational cost result for the order 1.5 method.

Sketch of proof of Theorem 2.11.

Let |$s_n:= n h^3$| for |$n=0,1,\ldots $| denote a set of deterministic uniformly spaced mesh points. This mesh contains all realizations of the adaptive mesh as sub-meshes, in the sense that |${\mathcal{T}}^{\varDelta t}(\omega )\subset \{s_n\}_{n\ge 0}$| for all |$\omega \in \varOmega $|⁠. Similarly, as in (3.9), we obtain that

$$ \begin{align} {\mathbb{E}}_{} \left[{\textrm{Cost}\!\left({\nu}\right)} \right] &\le \frac{T}{h} + \sum_{n=0}^{h^{-3}-1} \frac{h^3}{h^2}{\mathbb{P}}\big(\{\nu>s_{n}\} \cap \big\{ \varDelta t(\overline{X}(s_n)) = h^2 \big\}\big)\nonumber \\ & \quad + \sum_{n=0}^{h^{-3}-1} {\mathbb{P}} \big(\{\nu>s_{n}\} \cap \big\{ \varDelta t(\overline{X}(s_n)) = h^3 \big\}\big).\end{align} $$

(3.18)

Let |$A$| denote the maximal stride set in Lemma 3.4 and let

$$ \begin{align*} & B_i = \left\{ \omega \mid \max_{t_k \in{\mathcal{T}}^{\varDelta t}} |X(t_k) - \overline{X}(t_k)| \le \delta_i \right\}\!, \quad i =1,2. \end{align*} $$

If |$\omega \in A \cap B_1$|⁠, we obtain by similar reasoning as in the proof of Theorem 2.8 that

$$\begin{align*} & A\cap B_1 \cap \{\nu>s_{n}\} \cap \big\{ \varDelta t(\overline{X}(s_n)) = h^2 \big\} \subset \left\{\tau_{2\delta_1} > s_n\right\} \cap \{ \textrm{d}(X(s_n),\partial D) \le 2\delta_1 \}, \end{align*}$$

$$ \begin{align*} & A\cap B_2 \cap \{\nu>s_{n}\} \cap \big\{ \varDelta t(\overline{X}(s_n)) = h^3 \big\} \subset \left\{\tau_{2\delta_2} > s_n\right\} \cap \{ \textrm{d}(X(s_n),\partial D) \le 2\delta_2 \}, \end{align*} $$

and

$$ \begin{align*} & {\mathbb{P}}(A\cap B_1) = 1 - {\mathcal{O}}(h^2) \quad \textrm{and}\quad{\mathbb{P}}(A\cap B_2) = 1 - {\mathcal{O}}(h^2). \end{align*} $$

This leads to

$$ \begin{align} {\mathbb{E}}_{} \left[{\textrm{Cost}\!\left({\nu}\right)} \right] &\le{\mathcal{O}}(h^{-1}) + \sum_{n=0}^{h^{-3}-1} \frac{h^3}{h^2}{\mathbb{P}} \left(\left\{\tau_{2\delta_1}>s_{n}\right\} \cap \{ \textrm{d}(X(s_n),\partial D) \le 2\delta_1\} \right) \nonumber\\ & \quad + \sum_{n=0}^{h^{-3}-1} {\mathbb{P}} \left(\left\{\tau_{2\delta_2}>s_{n}\right\} \cap \{ \textrm{d}(X(s_n),\partial D) \le 2\delta_2 \} \right)\!. \end{align} $$

(3.19)

By a similar argument as in the proof of Theorem 2.8, it holds that

$$ \begin{align*} & {\mathbb{P}} \left(\left\{\tau_{2\delta_1}>s_{n}\right\} \cap \{\textrm{d}(X(s_n),\partial D) \le 2\delta_1\} \right) = {\mathcal{O}}\big(\delta_1^{2}\big) = {\mathcal{O}}(h \log(h^{-1})) \end{align*} $$

and

$$ \begin{align*} & {\mathbb{P}} \left(\left\{\tau_{2\delta_2}>s_{n}\right\} \cap \{ \textrm{d}(X(s_n),\partial D) \le 2\delta_2 \} \right) = {\mathcal{O}}\big(\delta_2^{2}\big) = {\mathcal{O}}(h^{2} \log(h^{-1})), \end{align*} $$

so that

$$ \begin{align*} {\mathbb{E}}_{} \left[{\textrm{Cost}\!\left({\nu}\right)} \right] &= {\mathcal{O}}(h^{-1} + h^{-3} \times h^2 \log(h^{-1})) = {\mathcal{O}}(h^{-1} \log(h^{-1})). \end{align*} $$

4. Numerical experiments

We run several simulations to numerically verify the theoretical rates for strong convergence and computational cost for the order 1 and 1.5 methods. Algorithm 1 describes the implementation of the two methods for computing the exit time of one SDE path. In all the problems below, we consider exit times (1.1) with cut-off time |$T=10$|⁠.

$$\ $$

4.1 Geometric Brownian Motion

For the one-dimensional geometric Brownian motion

$$ \begin{align*} \textrm{d}X &= 0.05 X\, \textrm{d}t+ 0.2 X\,\textrm{d}W \\ X(0) &= 4, \end{align*} $$

we consider the exit time from the interval |$D = (1, 7)$|⁠. The reference solution to the mean exit time was computed by numerically solving the Feynman–Kac PDE, cf. Proposition 2.3, using the Crank–Nicolson method for time discretization and continuous, piecewise linear finite elements for spatial discretization. For this purpose, we use the Gridap.jl library (Badia & Verdugo, 2020; Verdugo & Badia, 2022) in the Julia programming language. The pseudo-reference solution for the mean exit time of the process starting at |$\overline{X}(0) = 4$| is |${\mathbb{E}}_{} \left [ {\tau } \right ] = 7.153211$|⁠, rounded to 7 significant digits. We let |$\nu _h$| denote the exit time of the numerical solution |$\overline{X}$| from the domain |$D$| using the |$h>0$|⁠. Monte Carlo estimates with |$M= 10^7$| samples are used to estimate the strong error

$$ \begin{align}& {\mathbb{E}}_{} \left[{|\tau - \nu_h|} \right] \approx \sum_{i=1}^M \frac{\left|\nu_{h}^i - \nu_{2h}^i\right|}{M}\end{align} $$

(4.1)

and the computational cost

$$ \begin{align}& {\mathbb{E}}_{} \left[{\textrm{Cost}\!\left(\nu_h\right)} \right] \approx \sum_{i=1}^M \frac{ \textrm{Cost}\!\left({\nu_{h}^i}\right)}{M}.\end{align} $$

(4.2)

Realizations on neighboring resolutions |$\nu _{h}^i$| and |$\nu _{2h}^i$| are pairwise coupled using the technique for non-nested meshes introduced in Giles et al. (2016) together with the procedure (Hoel & Krumscheid, 2019, Example 1.1) for coupling all driving noise in the order 1.5 method. This reduces the variance of the samples |$\nu _h^i - \nu _{2h}^i$| in the Monte Carlo estimators, leading to a more efficient Monte Carlo estimator.

$Numerical tests for the exit-time problem in Section 4.1: Strong error (top), computational cost (bottom left) and cost $\times $ strong error (bottom right).$

Fig. 1.

Numerical tests for the exit-time problem in Section 4.1: Strong error (top), computational cost (bottom left) and cost |$\times $| strong error (bottom right).

Open in new tab Download slide

Figure 1 shows that the strong convergence rate obtained from the numerical simulations agrees with the theory for the order 1 and order 1.5 methods and the weak convergence rate coincides with the strong rate. In alignment with theory, we further observe that the expected computational cost for both adaptive methods are |${\mathcal{O}}(h^{-1} \left | {\log (h)} \right |)$| and that

$$ \begin{align}& \frac{{\mathbb{E}}_{} \left[{\textrm{Cost}\!\left(\nu_h\right)} \right]}{|\!\log(h)|} \times{\mathbb{E}}_{} \left[{|\nu_h - \nu_{2h}|} \right] = {\mathcal{O}}(h^{\gamma-1}).\end{align} $$

(4.3)

Lastly, we approximate the weak error |$|{\mathbb{E}}_{} \left [ {\tau - \nu _h} \right ]|$| numerically by Monte Carlo in two different ways. First, using pairwise coupled realizations,

$$ \begin{align}& \left| \frac{1}{M}\sum_{i=1}^M \nu_{h}^i - \nu_{2h}^i \right|\end{align} $$

(4.4)

with |$M=4 \times 10^7$| samples. And second, using the aforementioned pseudo-reference solution of |${\mathbb{E}}_{} \left [ {\tau } \right ]$| computed with the finite element method, we sample

$$ \begin{align}& \left| {\mathbb{E}}_{} \left[{\tau} \right] - \frac{1}{M}\sum_{i=1}^M \nu_{h}^i \right|\end{align} $$

(4.5)

using |$M=4\times 10^7$| number of samples. Since the variance of |$\nu _h - \nu _{2h}$| is much smaller than that of |${\mathbb{E}}_{} \left [ {\tau } \right ] -\nu _h$| when |$h$| is sufficiently small, the last approach to estimating the weak error is more sensitive to statistical error and thus less tractable than the first one. We have nonetheless included the last approach to also verify numerically that our estimator |$\nu _h$| asymptotically is unbiased |$\big($|since |$\lim _{h \downarrow 0} {\mathbb{E}}_{} \left [ {\nu _h} \right ] = {\mathbb{E}}_{} \left [ {\nu } \right ]$| does not directly follow from |$\lim _{h \downarrow 0} |{\mathbb{E}}_{} \left [ {\nu _h -\nu _{2h}} \right ]\!|=0\big)$|⁠. Figure 2 shows the weak convergence rate |${\mathcal{O}}(h^\gamma )$| for the respective adaptive methods.

Fig. 2.

Numerical tests of the weak error for the exit-time problem in Section 4.1.

Open in new tab Download slide

$Numerical tests for the exit-time problem in Section 4.2: strong error (top), computational cost (bottom left) and cost $\times $ strong error (bottom right).$

Fig. 3.

Numerical tests for the exit-time problem in Section 4.2: strong error (top), computational cost (bottom left) and cost |$\times $| strong error (bottom right).

Open in new tab Download slide

4.2 Linear drift, cosine diffusion coefficient—1D

For the SDE

$$ \begin{align*} \textrm{d}X &= 0.1 X \,\textrm{d}t+ 0.3 \big(\cos(X) + 3 \big) \,\textrm{d}W \\ X(0) &= 4, \end{align*} $$

we study the exit time from the interval |$D = (1, 7)$|⁠. The pseudo-reference solution to the mean exit-time problem was computed by solving the Feynman–Kac PDE using the same numerical method as in the preceding example, yielding |${\mathbb{E}}_{} \left [ {\tau } \right ] = 5.504741$|⁠, rounded to 7 significant digits. Weak and strong errors and the computational cost of the adaptive methods are estimated using Monte Carlo with the same number of samples as in the preceding example.

Figure 3 shows numerical estimates of the strong error and the computational cost. Taking into account that the strong-error estimate is sensitive to statistical errors for small values of |$h$|⁠, the observed rates for the strong error and computational cost are consistent with theory.

Figure 4 shows that the weak convergence rate is |${\mathcal{O}}(h^\gamma )$| for the respective adaptive methods.

Fig. 4.

Numerical tests of the weak error for the exit-time problem in Section 4.2.

Open in new tab Download slide

We next consider two higher-dimensional exit-time problems.

4.3 Linear drift, linear diffusion—2D

We consider the SDE

$$ \begin{align*} \textrm{d}X_{1} &= 0.05 X_{2} \,\textrm{d}t+ 0.2 X_{1} \,\textrm{d}W^1\\ \textrm{d}X_{2} &= 0.05 X_1 \,\textrm{d}t+ 0.2 X_2 \,\textrm{d}W^2 \end{align*} $$

with initial condition |$ X(0) = (3, 3)^{\top }$|⁠, and we are interested in computing the exit time from the disk

$$ \begin{align} D := \left\{ x \in{\mathbb{R}}^{2} \; \big| \; \sqrt{(x_{1} - 3)^{2} + (x_{2} - 3)^{2}} < 3 \right\}\!.\end{align} $$

(4.6)

The pseudo-reference solution to the mean exit-time problem is computed by numerically solving the Feynman–Kac PDE, yielding |${\mathbb{E}}_{} \left [ {\tau } \right ] = 6.7737$|⁠, rounded to 5 significant digits. Weak and strong errors, and the computational cost of the adaptive methods are estimated using Monte Carlo with the same number of samples as in the previous examples.

Figure 5 shows numerical estimates of the strong error and computational cost, and Fig. 6 shows the estimates for the weak error. The observed rates for the strong error and the computational cost are once again consistent with theory. For the 1.5 order method, the weak error tests further indicate that the weak convergence rate may be slightly higher than 1.5, for this problem.

$Numerical tests for the exit-time problem in Section 4.3: Strong error (top), computational cost (bottom left) and cost $\times $ strong error (bottom right).$

Fig. 5.

Numerical tests for the exit-time problem in Section 4.3: Strong error (top), computational cost (bottom left) and cost |$\times $| strong error (bottom right).

Open in new tab Download slide

Fig. 6.

Numerical tests of the weak error for the exit-time problem in Section 4.3.

Open in new tab Download slide

4.4 Linear drift, cosine diffusion—2D

We consider the following two-dimensional SDE with linear drift and nonlinear, diagonal diffusion coefficient:

$$ \begin{align*} \textrm{d}X_{1} &= 0.1 X_{2} \,\textrm{d}t+ 0.25 \big( \cos(X_{1}) + 3 \big) \,\textrm{d}W^1\\ \textrm{d}X_{2} &= 0.1 X_1 \,\textrm{d}t+ 0.25 \big( \cos(X_{2}) + 3 \big) \,\textrm{d}W^2, \end{align*} $$

and with initial condition |$X(0) = (3, 3)^{\top }$|⁠. Note that the components of the SDE are coupled through the drift term, similarly as for the SDE in Section 4.3. We compute the exit time of the SDE from the disk domain given by equation (4.6). The reference solution to the mean exit-time problem is computed by numerically solving the Feynman–Kac PDE, yielding |${\mathbb{E}}_{} \left [ {\tau } \right ] = 5.0853$|⁠, rounded to 5 significant digits.

Monte Carlo estimates of strong and weak errors are estimated using the same number of samples and over the same ranges of |$h$|-values as in the previous example. But since the asymptotic regime of the computational cost seems to appear later—for smaller |$h$|-values—for this problem than the previous ones, we have estimated it over a range of smaller |$h$|-values, |$h \in [2^{-5}, 2^{-15}]$|⁠, using |$M=10^3$| samples (this seems to suffice as the cost estimate does not appear sensitive to the sample size).

Figure 7 shows numerical estimates of the strong error and computational cost, and Fig. 8 shows the estimates for the weak error. The observed rate for the strong error is consistent with theory. We note that the asymptotic regime for the computational cost is not reached before |$h<2^{-10}$| for neither of the methods, but that it looks proportional to |$h^{-1} |\!\log (h)|$| near the smallest |$h$|-values. As for the previous example, we also include a Monte Carlo estimate of computational cost times the strong error, cf. (4.3). This result is consistent with theory for the order |$1.5$| method, but inconclusive for the order |$1$| method. We believe this is because, as mentioned above, the |${\mathbb{E}}_{} \left [ {\textrm{Cost}\!\left (\nu _h\right )} \right ]$| function has not reached its asymptotic regime in the window of |$h$|-values that we can afford to study numerically, and that also the order |$1$| method will produce test results that are consistent with theory for sufficiently small |$h$|-values.

$Numerical tests for the exit-time problem in Section 4.4: Strong error (top), computational cost (bottom left) and cost $\times $ strong error (bottom right).$

Fig. 7.

Numerical tests for the exit-time problem in Section 4.4: Strong error (top), computational cost (bottom left) and cost |$\times $| strong error (bottom right).

Open in new tab Download slide

Fig. 8.

Numerical tests of the weak error for the exit-time problem in Section 4.4.

Open in new tab Download slide

5. Conclusion

In this paper, we developed a tractable higher-order adaptive timestepping method for the strong approximation of Itô-diffusion exit times. The fundamental idea is to use smaller step sizes as the numerical solution gets closer to the boundary of the domain, and to use higher-order integration schemes for better approximation of the state of the Itô diffusion. This reduces both the magnitude of the overshoot when the numerical solution exits the domain and the probability of the numerical solution missing an exit of the domain by the exact process.

We theoretically proved that the Milstein scheme combined with adaptive timestepping with two different step sizes lead to a strong convergence rate of |${\mathcal{O}}(h^{1-})$|⁠, and that the strong order 1.5 Itô–Taylor scheme combined with adaptive timestepping with three different step sizes lead to a strong convergence rates of |${\mathcal{O}}(h^{3/2-})$|⁠. We also showed that if the diffusion-coefficient commutativity conditions (2.14) and (2.18) hold, then the expected computational cost for both methods is bounded by |${\mathcal{O}}(h^{-1} \left | {\log (h)} \right |)$|⁠. Many diffusion coefficients with off-diagonal terms do not satisfy these conductivity conditions, and, although efficient when applicable, this does unfortunately limit the scope of our method considerably more than for example Milstein’s WoS methods.

There are several interesting ways to extend the current work. One direction would be to consider strong Itô–Taylor schemes of order |$\gamma> 3/2$| combined with adaptive timestepping that employs more than two critical regions, i.e., for some |$h \in (0, 1)$|⁠, we consider timestep sizes finer than |$h^{3}$| when the numerical solution is very close to the boundary |$\partial D$|⁠. This way, the strong convergence rate can be further improved while maintaining a computational complexity that is tractable. Another direction could be to extend the adaptive method to exit-time problems on more general domains, such as half-spaces or domains that may evolve in time, cf. Gobet (2001); Gobet & Menozzi (2010).

Although our adaptive method has been devised for improving the strong convergence rate of exit times of Itô-diffusions, the results easily extend to weak approximations of quantities of interest (QoI) that depend on the exit time and the state of the process:

$$ \begin{align*}& \textrm{QoI} := f(\tau, X_{\tau}) + \int_{0}^{\tau} g(s, X_{s})\, \textrm{d}s, \end{align*} $$

where we assume that |$f, g$| are functions that are sufficiently smooth.

For weak approximations, the combination of MLMC and adaptive timestepping can lead to substantial efficiency gains over standard Monte Carlo methods, cf. Hoel et al. (2012, 2014); Giles et al. (2016); Katsiolides et al. (2018); Kelly & Lord (2018); Fang & Giles (2020). A key component in the MLMC method is to achieve strong pairwise coupling of fine- and coarse resolution trajectories of the stochastic process. A smart way to couple solutions of the Euler–Maruyama method on non-nested adaptive meshes has been developed in Giles et al. (2016), and it would be interesting to study how extensible that approach is to our higher-order methods, using for instance the coupling procedure in Hoel & Krumscheid (2019, Example 1.1).

Finally, in Giles & Bernal (2018), a new MLMC methodology to compute the mean exit time or a QoI that depends on the exit time of a stochastic process was developed. The new MLMC method reduces the multilevel variance by computing the approximate conditional expectation once the coarse or fine trajectory has exited the domain. Combining our ideas on higher-order adaptive timestepping with the conditional MLMC method has the potential to reduce the computational cost of exit time simulations even further. This makes an interesting problem for future research.

Acknowledgements

We would like to thank Mike Giles for suggestions on improving the manuscript, particularly for the very nice idea to use the absorbing-boundary Fokker–Planck equation in the proof of the computational cost of the adaptive methods. This improved our computational cost result considerably! We also thank Snorre Harald Christensen, Kenneth Karlsen, Peter H.C. Pang and Nils Henrik Risebro for helpful discussions on parabolic and elliptic PDE, and the anonymous referees for very helpful remarks.

Funding

Mathematics Programme of the Trond Mohn Foundation.

References

Alsmeyer

(

1994

)

On the Markov renewal theorem

Stochastic Process. Appl.

–

Google Scholar

Crossref

WorldCat

Badia

Verdugo

(

2020

)

Gridap: an extensible finite element toolbox in Julia

J. Open Source Softw.

2520

Google Scholar

Crossref

WorldCat

Baldi

(

2017

)

Stochastic Calculus

Cham

Springer

Bayer

Tempone

Wolfers

(

2020

)

Pricing American options by exercise rate optimization

Quant. Finance

1749

–

1760

Google Scholar

Crossref

WorldCat

Bernal

(

2019

)

An implementation of Milstein’s method for general bounded diffusions

J. Sci. Comput.

867

–

890

Google Scholar

Crossref

WorldCat

Bouchard

Geiss

Gobet

(

2017

)

First time to exit of a continuous Itô process: general moment estimates and |${\textrm{L}}_1$|-convergence rate for discrete time approximations

Bernoulli

1631

–

1662

Google Scholar

Crossref

WorldCat

Broadie

Glasserman

Kou

(

1997

)

A continuity correction for discrete barrier options

Math. Finance

325

–

349

Google Scholar

Crossref

WorldCat

Dalphin

(

2014

)

Some characterizations of a uniform ball property

ESAIM Proc. Surv.

437

–

446

Google Scholar

Crossref

WorldCat

Deaconu

Herrmann

Maire

(

2017

)

The walk on moving spheres: a new tool for simulating Brownian motion’s exit time from a domain

Math. Comput. Simulation

135

–

Google Scholar

Crossref

WorldCat

Fang

Giles

M. B.

(

2020

)

Adaptive Euler–Maruyama method for SDEs with nonglobally Lipschitz drift

Ann. Appl. Probab.

526

–

560

Google Scholar

Crossref

WorldCat

Friedman

(

1964

)

Partial Differential Equations of Parabolic Type

Englewood Cliffs, NJ

Prentice-Hall, Inc.

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

Friedman

(

1975

)

Stochastic Differential Equations and Applications. Vol. 1

Probability and Mathematical Statistics

, vol. 28.

New York–London

Academic Press

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

Gilbarg

Hörmander

(

1980

)

Intermediate Schauder estimates

Arch. Rational Mech. Anal.

297

–

318

Google Scholar

Crossref

WorldCat

Gilbarg

Trudinger

N. S.

(

2001

)

Elliptic Partial Differential Equations of Second Order

Berlin

Springer

Giles

M. B.

Bernal

(

2018

)

Multilevel estimation of expected exit times and other functionals of stopped diffusions

SIAM/ASA J. Uncertain. Quantif.

1454

–

1474

Google Scholar

Crossref

WorldCat

Giles

M. B.

Lester

Whittle

(

2016

)

Non-nested adaptive timesteps in multilevel Monte Carlo computations

Monte Carlo and Quasi-Monte Carlo Methods, Springer Proceedings in Mathematics & Statistics

(R. Cools & D. Nuyens eds).

Cham

Springer International Publishing

, pp.

303

–

314

Gobet

(

2000

)

Weak approximation of killed diffusion using Euler schemes

Stochastic Process. Appl.

167

–

197

Google Scholar

Crossref

WorldCat

Gobet

(

2001

)

Euler schemes and half-space approximation for the simulation of diffusion in a domain

ESAIM Probab. Stat.

261

–

297

Google Scholar

Crossref

WorldCat

Gobet

Menozzi

(

2004

)

Exact approximation rate of killed hypoelliptic diffusions using the discrete Euler scheme

Stochastic Process. Appl.

112

201

–

223

Google Scholar

Crossref

WorldCat

Gobet

Menozzi

(

2010

)

Stopped diffusion processes: boundary corrections and overshoot

Stochastic Process. Appl.

120

130

–

162

Google Scholar

Crossref

WorldCat

Higham

D. J.

Mao

Roj

Song

Yin

(

2013

)

Mean exit times and the multilevel Monte Carlo method

SIAM/ASA J. Uncertain. Quantif.

–

Google Scholar

Crossref

WorldCat

Hoel

Krumscheid

(

2019

)

Central limit theorems for multilevel Monte Carlo methods

J. Complexity

101407

Google Scholar

Crossref

WorldCat

Hoel

Von Schwerin

Szepessy

Tempone

(

2012

)

Adaptive multilevel Monte Carlo simulation

Numerical Analysis of Multiscale Computations

Berlin, Heidelberg

Springer

, pp.

217

–

234

Hoel

Von Schwerin

Szepessy

Tempone

(

2014

)

Implementation and analysis of an adaptive multilevel Monte Carlo algorithm

Monte Carlo Methods Appl.

–

Google Scholar

Crossref

WorldCat

Jansons

K. M.

Lythe

G. D.

(

2000

)

Efficient numerical solution of stochastic differential equations using exponential timestepping

J. Statist. Phys.

100

1097

–

1109

Google Scholar

Crossref

WorldCat

Jansons

K. M.

Lythe

G. D.

(

2003

)

Exponential timestepping with boundary test for stochastic differential equations

SIAM J. Sci. Comput.

1809

–

1822

Google Scholar

Crossref

WorldCat

Jansons

K. M.

Lythe

G. D.

(

2005

)

Multidimensional exponential timestepping with boundary test

SIAM J. Sci. Comput.

793

–

808

Google Scholar

Crossref

WorldCat

Katsiolides

Müller

E. H.

Scheichl

Shardlow

Giles

M. B.

Thomson

D. J.

(

2018

)

Multilevel Monte Carlo and improved timestepping methods in atmospheric dispersion modelling

J. Comput. Phys.

354

320

–

343

Google Scholar

Crossref

WorldCat

Kelly

Lord

G. J.

(

2018

)

Adaptive time-stepping strategies for nonlinear stochastic systems

IMA J. Numer. Anal.

1523

–

1549

Google Scholar

Crossref

WorldCat

Klebaner

F. C.

(

2012

)

Introduction to Stochastic Calculus With Applications

. Third edition.

London

Imperial College Press

Kloeden

P. E.

Platen

(

1992

)

Numerical Solution of Stochastic Differential Equations

Appl. Math. (N.Y.)

, vol. 23.

Berlin

Springer

Leimkuhler

Sharma

Tretyakov

M. V.

(

2023

)

Simplest random walk for approximating Robin boundary value problems and ergodic limits of reflected diffusions

Ann. Appl. Probab.

1904

–

1960

Google Scholar

Crossref

WorldCat

Lieberman

G. M.

(

1996

)

Second Order Parabolic Differential Equations

River Edge, NJ

World Scientific Publishing Co., Inc.

Merle

Prohl

(

2023

)

A posteriori error analysis and adaptivity for high-dimensional elliptic and parabolic boundary value problems

Numer. Math.

153

827

–

884

Google Scholar

Crossref

WorldCat

Milstein

G. N.

(

1996

)

Application of the numerical integration of stochastic equations for the solution of boundary value problems with Neumann boundary conditions

Teor. Veroyatnost. i Primenen.

210

–

218

Google Scholar

Crossref

WorldCat

Milstein

G. N.

(

1997

)

Weak approximation of a diffusion process in a bounded domain

Stochastics Stochastics Rep.

147

–

200

Google Scholar

Crossref

WorldCat

Milstein

G. N.

Tretyakov

M. V.

(

2004

)

Stochastic Numerics for Mathematical Physics

Scientific Computation

Berlin

Springer

Mörters

Peres

(

2010

)

Brownian Motion

, vol. 30.

Cambridge

Cambridge University Press

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

Muller

M. E.

(

1956

)

Some continuous Monte Carlo methods for the Dirichlet problem

Ann. Math. Stat.

569

–

589

Google Scholar

Crossref

WorldCat

Müller-Gronbach

(

2002

)

The optimal uniform approximation of systems of stochastic differential equations

Ann. Appl. Probab.

664

–

690

Google Scholar

Crossref

WorldCat

Naeh

Kłosek

M. M.

Matkowsky

B. J.

Schuss

(

1990

)

A direct approach to the exit problem

SIAM J. Appl. Math.

595

–

627

Google Scholar

Crossref

WorldCat

Neuenkirch

Szölgyenyi

Szpruch

(

2019

)

An adaptive Euler–Maruyama scheme for stochastic differential equations with discontinuous drift and its convergence analysis

SIAM J. Numer. Anal.

378

–

403

Google Scholar

Crossref

WorldCat

Schuss

(

1980

)

Theory and Applications of Stochastic Differential Equations

Wiley Series in Probability and Statistics

New York

John Wiley & Sons, Inc.

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

Sethian

J. A.

(

1996

)

A fast marching level set method for monotonically advancing fronts

Proc. Natl. Acad. Sci. U.S.A.

1591

–

1595

Verdugo

Badia

(

2022

)

The software design of Gridap: a finite element package based on the Julia JIT compiler

Comput. Phys. Comm.

276

108341

Google Scholar

Crossref

WorldCat

Weinan

Vanden-Eijnden

(

2021

)

Applied Stochastic Analysis

, vol. 199.

Rhode Island

American Mathematical Soc

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

Yaroslavtseva

(

2022

)

An adaptive strong order 1 method for SDEs with discontinuous drift coefficient

J. Math. Anal. Appl.

513

126180

Google Scholar

Crossref

WorldCat

Appendix. A. Theoretical results

Lemma A.1

Let Assumptions A and B hold, and for the initial condition |$x_0 \in D$| of the SDE (1.2) let |$\lambda :=d(x_0, \partial D)/2$| and let |$\bar \delta>0$| be sufficiently small such that |$\bar \delta < \bar R_D/2$| and |$d(x_0, \varGamma _{2 \bar \delta })>\lambda $|⁠, where we recall that |$\varGamma _{2 \delta }:= D_{2\delta }\setminus D_{-2\delta }$| for |$\delta \in [0,\bar \delta ]$|⁠. Then, there exists a constant |$C_p>0$| such that the absorbing-boundary Fokker–Planck equation (3.11) satisfies

$$ \begin{align*} & \max_{(t,x) \in [0,T]\times \overline \varGamma_{2 \delta}} p(t,x;\delta) \le 4 C_p \delta, \qquad \forall \delta \in [0,\bar \delta], \end{align*} $$

where |$C_p$| is independent of |$\delta $|⁠.

Proof.

Following Friedman (1964, Chapter 3.7), the solution of the Fokker–Planck equation can be decomposed into the sum of two functions:

$$ \begin{align*} & p(t,x;\delta) = \widehat \varGamma(t,x;0,x_0) + V(t,x;\delta), \end{align*} $$

where |$\widehat \varGamma $| denotes a fundamental solution to a global parabolic extension of the Fokker–Planck PDE from the domain |$D_{2 \bar \delta }$| to |${\mathbb{R}}^d$| (making it a fundamental solution for |$p(t,x;\delta )$| for every |$\delta \in [0, \bar \delta ]$|⁠) and |$V$| is a boundary correction term solving the Fokker–Planck equation

$$ \begin{align*} \partial_t V &= -\nabla \cdot (a V) + \frac{1}{2} \nabla \cdot \!\big( {bb^{\top} \nabla V} \big) && \textrm{in} \quad (0,T] \times D_{2\delta}, \\ V(t,x;\delta) &= -\widehat \varGamma(t,x;0,x_0) && \textrm{on} \quad (0,T] \times \partial D_{2\delta}\\ V(0,x;\delta) &= 0 && \quad x\in \overline D_{2\delta}. \end{align*} $$

Assumption B and Friedman (1964, Chapter 1.6) implies that |$\widehat \varGamma \in C^{1,2}_b([0,T]\times \overline{\varGamma }_{2 \bar \delta })$|⁠. Let

$$ \begin{align*} & C_{\widehat \varGamma}:= \max_{(x,t) \in [0,T]\times \overline{\varGamma}_{2 \bar \delta}} \widehat \varGamma + |\partial_t \widehat \varGamma| + |\nabla \widehat \varGamma| + |\nabla^2 \widehat \varGamma|, \end{align*} $$

and note that also the boundary data for |$V$|⁠,

$$ \begin{align*} & \varphi(t,x;\delta) = \begin{cases} -\widehat \varGamma(t,x;0,x_0) & (t,x) \in (0,T]\times \partial D_{2\delta}\\ 0 & (t,x) \in \{0\} \times \overline D_{2\delta} \end{cases} \end{align*} $$

satisfies |$\varphi \in C^{1,2}_b\left ( \big ((0,T]\times \partial D_{2\delta } \big )\cup \big (\{0\} \times \overline D_{2\delta }\big )\right )$| with the following uniform upper bound over all points on the boundary domain

$$ \begin{align*} & |\varphi| + |\partial_t \varphi| + |\nabla \varphi| + |\nabla^2 \varphi| \le C_{\widehat \varGamma} \qquad \forall \delta \in [0,\bar \delta]. \end{align*} $$

The boundary gradient of |$V$| will be bounded using Lieberman (1996, Lemma 10.1), so we need to verify that the lemma applies to |$V$| for any |$\delta \in [0,\bar \delta ]$|⁠. Note first that by the maximum principle, it holds that |$|V|\le C_{\widehat \varGamma }$| for all |$\delta \in [0,\bar \delta ]$|⁠. Assumption A and |$\bar R_D < \textrm{reach}(D)/2$| further implies that |$D_{2\delta }$| satisfies the uniform ball condition with |$\textrm{reach}(D_{2\delta })> \textrm{reach}(D)/2>\bar R_D$| for all |$\delta \in [0,\bar \delta ]$|⁠. Therefore, any point |$x \in \partial D_{2\delta }$| for any |$\delta \in [0,\bar \delta ]$| satisfies the infinite exterior cylinder condition with radius |$\tilde r:= \bar R_D$|⁠, cf. Lieberman (1996, Lemma 10.1). Thanks to Assumption B, there exists a constant |$C_{ab}>0$| that depends on the supremum of the SDE coefficients |$a$| and |$b$|⁠, and their first and second partial derivatives on |$D_{\bar R_D}$| such that the left-hand side of Lieberman (1996, inequality (10.6)) is bounded from above by

$$ \begin{align*} & g_L(p):= 1+ \left(\widehat C_b + C_{ab}(C_{\widehat \varGamma} +1) \right)|p| \quad \textrm{for} \quad p \in{\mathbb{R}}^d \end{align*} $$

$$ \begin{align*} & g_L(p) \le g_R(p; \mu)\qquad \textrm{for all}\quad p \in{\mathbb{R}}^d \quad \textrm{such that} \quad |p| \ge p_0. \end{align*} $$

We have shown that Lieberman (1996, Lemma 10.1) applies, which means there exists a constant |$C_V>0$| that depends on |$\mu ,p_0$|⁠, the uniform infinite exterior cylinder radius |$\tilde r$| and |$C_{\widehat \varGamma }$| such that

$$ \begin{align*} & \sup_{(t,x,y) \in [0,T] \times D_{2\delta} \times \partial D_{2\delta}} |V(t,x;\delta) - V(t,y;\delta)| \le C_V |x-y| \end{align*} $$

holds for all |$\delta \in [0,\bar \delta ]$|⁠. And since

$$ \begin{align*} & |p(t,x;\delta) - p(t,y;\delta)| \le |\widehat \varGamma(t,x) - \widehat \varGamma(t,y)| + |V(t,x;\delta) - V(t,y;\delta)|, \end{align*} $$

and |$p(t,y;\delta ) =0$| for |$y \in \partial D_{2\delta }$|⁠, we conclude that

$$ \begin{align*} & \max_{(t,x) \in [0,T] \times \overline{\varGamma}_{2\delta}} p(t,x;\delta) \le \underbrace{(C_{\widehat \varGamma} + C_V)}_{=:\ C_p} \max_{x \in \overline{\varGamma}_{2\delta}} \textrm{d}(x,\partial D_{2\delta}) \le 4 C_p \delta. \end{align*} $$

Lemma A.2

For a domain |$D$| satisfying Assumption A and any |$R ={reach}(D)/2$|⁠, the mapping |${\mathcal{I}}:[-R,R] \to [0,\infty )$| defined by

$$ \begin{align*} & {\mathcal{I}}(r) = \int_{\partial D_{r}} \textrm{d}S(x) \end{align*} $$

satisfies that |$\max _{r\, \in\, [-R,R]} {\mathcal{I}}(r) < \infty $|⁠.

Proof.

For each |$r \in [-R, R]$|⁠, we recall that |$\partial D_{r}$| is |$C^3$| and there exists a constant |$C_r>0$| such that

$$ \begin{align*} & {\mathcal{I}}(r) \le C_r \int_{D_R} \textrm{d}x, \end{align*} $$

cf. Lieberman (1996, Lemma 6.36). To obtain a uniform upper bound, we introduce |$\varGamma _R = D_{R}\setminus D_{-R}$|⁠, its closure |$\overline{\varGamma }_R$| and the signed distance function

$$ \begin{align*} & \textrm{d}(x):= \begin{cases} \textrm{d}(x,D) & \textrm{if} \quad x \in D \cap \overline{\varGamma}_{R}\\ -\textrm{d}(x,D) & \textrm{if} \quad x \in D^C \cap \overline{\varGamma}_{R}. \end{cases} \end{align*} $$

A small extension of Gilbarg & Trudinger (2015, Lemma 14.16) yields that |$d \in C^3(\overline{\varGamma }_{R})$| and that for any |$x \in \partial D_r$| and |$r \in [-R, R]$|⁠,

$$ \begin{align*} & - \nabla d(x) = n_{D}\big(\pi_r^{-1}(x)\big) = n_{D_r}(x), \end{align*} $$

where we recall that |$n_{D_r}$| denotes the outward pointing normal on the boundary |$\partial D_r$|⁠. For any |$r \in [-R, R]$|⁠, the divergence theorem yields

$$ \begin{align*} \begin{split} |{\mathcal{I}}(r) - {\mathcal{I}}(0)| &= \left|\int_{\partial D_{r}} |\nabla\, \textrm{d}(x)|^2 \,\textrm{d}S(x) - \int_{\partial D} |\nabla \,\textrm{d}(x)|^2\, \textrm{d}S(x) \right|\\ &\le \int_{\varGamma_{R}} |\varDelta\, \textrm{d}(x)|\, \textrm{d}x < \infty, \end{split} \end{align*} $$

where the last inequality follows from |$\varDelta d \in C^1(\overline{\varGamma }_{ R})$|⁠. We conclude that

$$ \begin{align*} & \max_{r\, \in\, [-R, R]} {\mathcal{I}}(r) \le{\mathcal{I}}(0) + \max_{r \,\in\, [-R, R]}|{\mathcal{I}}(r) - {\mathcal{I}}(0)| < \infty. \end{align*} $$

Appendix. B. Computational cost for the order 1 method—1D Wiener process

In this section, we use a simple approach to derive an upper bound for the computational cost of implementing the order 1 adaptive method for a one-dimensional Wiener process |$W(t)$| exiting the interval |$(a, b)$|⁠. This is possible thanks to the availability of the joint density of |$W(t)$| and its running maximum/minimum.

Let |$\overline{W}(t)$| denote the continuous-time extension of the discretely sampled Wiener process. For |$t> s \geq 0$|⁠, let |$M_{W}(s, t)$| and |$m_{W}(s, t)$| denote the running maximum and the running minimum of the one-dimensional Wiener process |$W(t)$|⁠, respectively, over the interval |$[s, t]$|⁠, i.e.,

$$ \begin{align*} M_{W}(s, t) &:= \sup_{r\, \in\, [s, t]} \big( W(r) - W(s) \big), \\ m_{W}(s, t) &:= \inf_{r\, \in\, [s, t]} \big( W(r) - W(s) \big). \end{align*} $$

Let the shorthands |$M_{W}(t)$| and |$m_{W}(t)$| represent the running maximum and the running minimum of the Wiener process over the interval |$[0, t]$|⁠, respectively. Recall that the running maximum and the negative of the running minimum of the Wiener process have the same distribution, cf. Klebaner (2012, Section 3.6).

The threshold parameter |$\delta (h)$|⁠, for all |$h \in (0, 1)$|⁠, is defined as

$$ \begin{align} \delta(h) := \sqrt{4 h \log (h^{-1})}.\end{align} $$

(B.1)

We construct the set |$A$| to bound the strides of the running maximum and the running minimum of the Wiener process

$$ \begin{align} \begin{split} A &:= \left\{ \omega \in \varOmega \; \big| \; M_{W}(t_{n}, t_{n+1}) \leq \delta, \forall n \in \{ 0, 1, \ldots, N - 1 \} \right\} \\ & \qquad \cap \left\{ \omega \in \varOmega \; \big| \; m_{W}(t_{n}, t_{n+1}) \geq -\delta, \forall n \in \{ 0, 1, \ldots, N - 1 \} \right\}\!. \end{split}\end{align} $$

(B.2)

For all |$h < e^{-1}$| and |$\delta (h)$| given by (B.1), it can be shown similarly as in the proof of Lemma 3.3 that

$$ \begin{align*} & {\mathbb{P}}_{} \big( {A^{C}} \big) = {\mathcal{O}}(h \left| {\log (h)} \right|^{-1/2}). \end{align*} $$

Proposition B.1

(Computational cost for the order 1 method for 1D Wiener process).

Consider the above-described adaptive timestepping method for computing the expected time of exit of a Wiener process |$W(t)$| with |$W(0)=0$| from an interval |$(a,b)$| where we assume that |$a<0<b$|⁠. For sufficiently small |$h> 0$|⁠, it then holds that

$$ \begin{align*} & {\mathbb{E}}_{} \left[{\textrm{Cost}\!\left({\nu}\right)} \right] = {\mathcal{O}}(h^{-1} \left| {\log(h)} \right|)\,. \end{align*} $$

Proof.

Based on the size of the strides taken by the Wiener process, the computational cost can be decomposed as follows:

$$ \begin{align*} {\mathbb{E}}_{} \left[{\textrm{Cost}\!\left({\nu}\right)} \right] = {\mathbb{E}}_{} \left[{\textrm{Cost}\!\left({\nu}\right) \mathbbm{1}_{A^{C}}} \right] + {\mathbb{E}}_{} \left[{\textrm{Cost}\!\left({\nu}\right) \mathbbm{1}_{A}} \right] =: I + II, \end{align*} $$

where the set |$A$| was defined in (B.2). For term |$I$|⁠, note that the upper bound on the number of timesteps that can be taken by the Wiener process before it exits the domain, using the order 1 adaptive timestepping method, is |$T/h^{2}$|⁠. From this, we obtain

$$ \begin{align*} I &= {\mathbb{E}}_{} \left[{\textrm{Cost}\!\left({\nu}\right) \mathbbm{1}_{A^{C}}} \right] \leq \frac{T}{h^{2}} {\mathbb{E}}_{} \left[{\mathbbm{1}_{A^{C}}} \right] = \frac{T}{h^{2}} {\mathbb{P}}_{} \big( {A^{C}} \big) = {\mathcal{O}}(h^{-1} \left| {\log(h)} \right|^{-1/2}). \end{align*} $$

For term |$II$|⁠, we can write the computational cost as

$$ \begin{align*} II &= {\mathbb{E}}_{} \left[{\textrm{Cost}\!\left({\nu}\right) \mathbbm{1}_{A}} \right] \leq \frac{T}{h} + \int_{h}^{T} \frac{{\mathbb{E}}_{} \big[{\mathbbm{1}_{\{ \nu> t \} \cap \{ \varDelta t(\overline{W}(t)) = h^{2} \} \cap A}} \big]}{h^{2}}\,\textrm{d}t. \end{align*} $$

Note that

$$ \begin{multline*} \{ \nu> t \} \, {\cap} \, \{ \varDelta t(\overline{W}(t)) = h^{2} \} \, {\cap} \, A\\{\subset} \left(\{ M_{W}(t) \in [b - \delta, b + \delta) \} \, {\cap}\, \{ W(t) \in [b-\delta, b) \} \right) \\\{\cup} \left(\{ m_{W}(t) \in (a - \delta, a + \delta] \}\, {\cap} \,\{ W(t) \in (a, a + \delta] \} \right) \!. \end{multline*} $$

For any |$t>0$|⁠, the joint probability density of |$(W(t), M_{W}(t))$| is given by

$$ \begin{align*} \rho_{(W(t), M_{W}(t))}(w, m) = \sqrt{\frac{2}{\pi}} \frac{2m - w}{t^{3/2}} \exp \left( -\frac{(2m - w)^{2}}{2t} \right)\!, \quad m \geq 0, \; w \leq m. \end{align*} $$

$$ \begin{align*} \max_{t \in [h,T]} \rho_{(W(t), M_{W}(t))} (w, m) &= \max_{t \in [h,T]}\sqrt{\frac{2}{\pi}} \frac{(2m - w)}{t^{3/2}} \exp \left( -\frac{(2m - w)^{2}}{2 t} \right) \\ &\leq \max_{t \in [h,T]}\sqrt{\frac{2}{\pi}} \frac{b + 3 \delta}{t^{3/2}} \exp \left( -\frac{(b - 2\delta)^{2}}{2 t} \right) \\ &\leq \sup_{t \in (0, T]} \sqrt{\frac{2}{\pi}} \frac{2b} {t^{3/2}} \exp \left( -\frac{2b^2}{9 t} \right) \\ &=: C_{\rho}^b < \infty, \end{align*} $$

which further implies the stronger bound

$$ \begin{align*} \sup_{h \in (0,h_b]} \sup_{(t, w,m) \in [h,T] \times [b-\delta, b) \times [b- \delta, b + \delta]} \rho_{(W(t), M_{W}(t))} (w, m) \le C_{\rho}^b. \end{align*} $$

Using a similar argument, we can bound the joint probability density function |$\rho _{(W(t), m_{W}(t))}(w,m)$| uniformly for all |$(w,m) \in (a,a+\delta ] \times [a-\delta , a+\delta ]$| and |$t \in [h,T]$| by a positive constant |$C_{\rho }^{a} < \infty $|⁠. From the above results, we obtain that

$$ \begin{align*} II &\leq \frac{T}{h} + \frac{C_{\rho}^{a} \delta^{2} (T - h)}{h^{2}} + \frac{C_{\rho}^{b} \delta^{2} (T - h)}{h^{2}} = {\mathcal{O}}(h^{-1} \left| {\log(h)} \right|). \end{align*} $$

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Download all slides

Month:	Total Views:
October 2023	95
November 2023	94
December 2023	48
January 2024	54
February 2024	20
March 2024	43
April 2024	16
May 2024	15
June 2024	22
July 2024	26
August 2024	25
September 2024	21
October 2024	106
November 2024	56
December 2024	26
January 2025	17
February 2025	36
March 2025	37
April 2025	32
May 2025	5

Article Contents

Higher-order adaptive methods for exit times of Itô diffusions

Abstract

1. Introduction

2. Notation and the adaptive numerical methods

2.1 Notation and assumptions

2.2 Mean exit times and Feynman–Kac

2.3 Higher-order adaptive numerical methods

2.4 Order 1 method

2.5 The order 1.5 method

3. Theory and proofs

3.1 Order 1 method

3.2 Order 1.5 method

4. Numerical experiments

4.1 Geometric Brownian Motion

4.2 Linear drift, cosine diffusion coefficient—1D

4.3 Linear drift, linear diffusion—2D

4.4 Linear drift, cosine diffusion—2D

5. Conclusion

Acknowledgements

Funding

References

Appendix. A. Theoretical results

Appendix. B. Computational cost for the order 1 method—1D Wiener process

Citations

Views

Altmetric

Email alerts

Citing articles via

Latest

Most Read

Most Cited

Article Contents

Higher-order adaptive methods for exit times of Itô diffusions

Abstract

1. Introduction

2. Notation and the adaptive numerical methods

2.1 Notation and assumptions

2.2 Mean exit times and Feynman–Kac

2.3 Higher-order adaptive numerical methods

2.4 Order 1 method

2.5 The order 1.5 method

3. Theory and proofs

3.1 Order 1 method

3.2 Order 1.5 method

4. Numerical experiments

4.1 Geometric Brownian Motion

4.2 Linear drift, cosine diffusion coefficient—1D

4.3 Linear drift, linear diffusion—2D

4.4 Linear drift, cosine diffusion—2D

5. Conclusion

Acknowledgements

Funding

References

Appendix. A. Theoretical results

Appendix. B. Computational cost for the order 1 method—1D Wiener process

Citations

Views

Altmetric

Email alerts

Citing articles via

Latest

Most Read

Most Cited

This Feature Is Available To Subscribers Only