Gregor Gantner, Alexander Haberl, Dirk Praetorius, Bernhard Stiftner, Rate optimal adaptive FEM with inexact solver for nonlinear operators, IMA Journal of Numerical Analysis, Volume 38, Issue 4, October 2018, Pages 1797–1831, https://doi.org/10.1093/imanum/drx050
Abstract
We prove convergence with optimal algebraic rates for an adaptive finite element method for nonlinear equations with strongly monotone operator. Unlike prior works, our analysis also includes the iterative and inexact solution of the arising nonlinear systems by means of the Picard iteration. Using nested iteration, we prove, in particular, that the number of Picard iterations is uniformly bounded in generic cases, and the overall computational cost is (almost) optimal. Numerical experiments confirm the theoretical results.
1. Introduction
In recent years, the analysis of convergence and optimal convergence behaviour of adaptive finite element methods has matured. We refer to the seminal works for milestones concerning linear elliptic equations (Dörfler, 1996; Morin et al., 2000; Binev et al., 2004; Stevenson, 2007; Cascon et al., 2008; Feischl et al., 2014), nonlinear problems (Veeser, 2002; Diening & Kreuzer, 2008; Belenki et al., 2012; Garau et al., 2012) and a general abstract framework (Carstensen et al., 2014). Although the interplay between adaptive mesh refinement, optimal convergence rates and inexact solvers has already been addressed and analysed for linear PDEs (e.g., Stevenson, 2007; Arioli et al., 2013a,b; Carstensen et al., 2014) and for eigenvalue problems (Carstensen & Gedicke, 2012), the influence of inexact solvers for nonlinear equations has not been analysed yet. The work by Garau et al. (2011) considers adaptive mesh refinement in combination with a Kačanov-type iterative solver for strongly monotone operators. In the spirit of Morin et al. (2008) and Siebert (2011), the focus is on a plain convergence result of the overall strategy, whereas the proof of optimal convergence rates remains open.
On the other hand, there is a rich body of literature on a posteriori error estimation, which also includes the iterative and inexact solution of nonlinear problems (see, e.g., Ern & Vohralík, 2013). This study aims to close the gap between the numerical analysis (e.g., Carstensen et al., 2014) and the empirical evidence of optimal convergence rates (e.g., Garau et al., 2011; Ern & Vohralík, 2013) by analysing an adaptive algorithm from Congreve & Wihler (2017).
- (O1) |$\boldsymbol A$| is strongly monotone: There exists |$\alpha>0$| such that\begin{align*}\alpha\,\left\| {{w-v}} \right\|_{\mathcal{H}}^2 \le \operatorname{Re}\,\left\langle {{Aw-Av}\,,\,{w-v}} \right\rangle\quad\text{for all } v,w\in\mathcal{H}. \end{align*}
- (O2) |$\boldsymbol A$| is Lipschitz continuous: There exists |$L>0$| such that\begin{align*}\left\| {{Aw-Av}} \right\|_{\mathcal{H}^*} \le L\,\left\| {{w-v}} \right\|_{\mathcal{H}}\quad\text{for all } v,w\in\mathcal{H}. \end{align*}
- (O3) |$\boldsymbol{A}$| has a potential: There exists a Gâteaux differentiable function |$P:\mathcal{H}\to\mathbb{K}$|, such that its derivative |${\rm d}P:\mathcal{H}\to\mathcal{H}^*$| coincides with |$A$|, i.e., for all |$v,w\in\mathcal{H}$|, it holds that(1.2)\begin{align}\left\langle {{Aw}\,,\,{v}} \right\rangle=\left\langle {{{\rm d} P(w)}\,,\,{v}} \right\rangle=\lim_{\substack{r\to 0\\r\in\mathbb{R}}}\frac{P(w+rv)-P(w)}{r}.\end{align}
We note that (O1)–(O2) guarantee that there exists a unique solution |$u^\star \in\mathcal{H}$| to (1.1). This solution is the limit of any sequence of Picard iterates |$u^{n+1} = \varPhi(u^n)$|, |$n\in\mathbb{N}_0$|, with arbitrary initial guess |$u^0\in\mathcal{H}$|, where the nonlinear mapping |$\varPhi:\mathcal{H}\to\mathcal{H}$| is a contraction (see Section 2 for details). The additional assumption (O3) implies that the nonlinear problem (1.1) and its later discretization can equivalently be stated as energy-minimization problems (Lemma 5.1). In view of applications, we admit that (O1)–(O2) exclude the |$p$|-Laplacian (Veeser, 2002; Diening & Kreuzer, 2008; Belenki et al., 2012), but cover the same problem class as Garau et al. (2011, 2012) and Congreve & Wihler (2017). For applications of strongly monotone nonlinearities arising in magnetostatics, we refer to Bruckner et al. (2014).
Based on adaptive mesh refinement of an initial triangulation |$\mathcal{T}_0$|, our adaptive algorithm generates a sequence of conforming nested subspaces |$\mathcal{X}_\ell\subseteq\mathcal{X}_{\ell+1}\subset\mathcal{H}$|, corresponding discrete solutions |$u_\ell\in\mathcal{X}_\ell$| and a posteriori error estimators |$\eta_\ell(u_\ell),$| such that |$\left\| {{u^\star-u_\ell}} \right\|_{\mathcal{H}} \le C_{\rm rel}^\star \,\eta_\ell(u_\ell)\to0$| as |$\ell\to\infty$| at optimal algebraic rate in the sense of certain approximation classes (Cascon et al., 2008; Carstensen et al., 2014; Feischl et al., 2014). Although the plain convergence result from Garau et al. (2011) applies to various marking strategies, our convergence analysis follows the concepts from Cascon et al. (2008), Feischl et al. (2014) and Carstensen et al. (2014), and is hence tailored to the Dörfler marking strategy.
General notation.|$\quad$| Throughout all statements, all constants and their dependencies are explicitly given. In proofs, we may abbreviate the notation by the use of the symbol |$\lesssim$|, which indicates |$\leq$| up to some multiplicative constant that is clear from the context. Moreover, the symbol |$\simeq$| states that both estimates |$\lesssim$| and |$\gtrsim$| hold. If (discrete) quantities are related to some triangulation, this is explicitly stated by the use of appropriate indices, e.g., |$u_\bullet$| is the discrete solution for the triangulation |$\mathcal{T}_\bullet$|, |$v_\circ$| is a generic discrete function in the discrete space |$\mathcal{X}_\circ$|, and |$\eta_\ell(\cdot)$| is the error estimator with respect to the triangulation |$\mathcal{T}_\ell$|.
2. Banach fixpoint theorem
Hence, |$\varPhi:\mathcal{H}\to\mathcal{H}$| is a contraction with Lipschitz constant |$0<q<1$|. According to the Banach fixpoint theorem, |$\varPhi$| has a unique fixpoint |$u^\star\in\mathcal{H}$|, i.e., |$u^\star = \varPhi u^\star$|. By definition of |$\varPhi$|, the strong form (1.1) is equivalent to |$u^\star = \varPhi u^\star$|. Overall, (1.1) admits a unique solution.
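For orientation, we note that, consistent with the discrete mapping |$\varPhi_\bullet$| of Section 3.2, |$\varPhi$| is the Zarantonello-type mapping\begin{align*}\varPhi v := v-\frac{\alpha}{L^2}\,J^{-1}(Av-F)\quad\text{for all } v\in\mathcal{H},\end{align*}where |$J:\mathcal{H}\to\mathcal{H}^*$| denotes the Riesz mapping. The standard estimate for strongly monotone and Lipschitz continuous operators then yields the contraction constant |$q=\sqrt{1-\alpha^2/L^2}<1$|; we state this value as a sketch of the usual bound, since constants may be normalized differently.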
Thus, the a posteriori computable term |$\left\| {{u^n-u^{n-1}}} \right\|_{\mathcal{H}}$| provides an upper bound for |$\left\| {{u^\star-u^n}} \right\|_{\mathcal{H}}$| as well as a lower bound for |$\left\| {{u^\star-u^{n-1}}} \right\|_{\mathcal{H}}$|.
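To make these bounds concrete, consider a minimal numerical illustration (our own toy example, not from the paper) on |$\mathcal{H}=\mathbb{R}$| with |$A(u)=u+\arctan(u)$|, so that |$A'(u)\in(1,2]$| gives |$\alpha=1$|, |$L=2$| and hence |$q=\sqrt{1-\alpha^2/L^2}=\sqrt{3}/2$|:

```python
import numpy as np

# Toy instance on H = R: A(u) = u + arctan(u) has A'(u) in (1, 2],
# hence alpha = 1 and L = 2 (our own illustration, not from the paper).
alpha, L = 1.0, 2.0
q = np.sqrt(1.0 - (alpha / L) ** 2)              # contraction constant
A = lambda u: u + np.arctan(u)
f = 1.0                                          # right-hand side F

# Accurate reference solution, used only to report the true error.
u_star = 0.0
for _ in range(200):
    u_star -= (alpha / L ** 2) * (A(u_star) - f)

u_old = 0.0                                      # initial guess u^0
for n in range(1, 11):
    u = u_old - (alpha / L ** 2) * (A(u_old) - f)    # Picard step
    inc = abs(u - u_old)
    upper = q / (1.0 - q) * inc                  # upper bound for |u* - u^n|
    lower = inc / (1.0 + q)                      # lower bound for |u* - u^{n-1}|
    assert abs(u_star - u) <= upper + 1e-12
    assert abs(u_star - u_old) >= lower - 1e-12
    print(f"n={n:2d}  error={abs(u_star - u):.3e}  upper bound={upper:.3e}")
    u_old = u
```

Both asserted inequalities follow directly from the contraction property and the triangle inequality.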
3. Discretization and a priori error estimation
3.1 Nonlinear discrete problem
First, recall the following well-known Céa-type estimate for strongly monotone operators. We include its proof for the sake of completeness.
Since |$\mathcal{X}_\bullet$| is finite dimensional, the infimum over all |$w_\bullet \in \mathcal{X}_\bullet$| is, in fact, attained. □
3.2 Linearized discrete problem
Note that the nonlinear system (3.1) can hardly be solved exactly. With the discrete Riesz mapping |$I_\bullet:\mathcal{X}_\bullet\to\mathcal{X}_\bullet^*$| and the restriction |$F_\bullet\in\mathcal{X}_\bullet^*$| of |$F$| to |$\mathcal{X}_\bullet$|, define |$\varPhi_\bullet:\mathcal{X}_\bullet\to\mathcal{X}_\bullet$| by |$\varPhi_\bullet v_\bullet:=v_\bullet-(\alpha/L^2)I_\bullet^{-1}(A_\bullet v_\bullet-F_\bullet)$|. Given |$u_\bullet^{n-1} \in \mathcal{X}_\bullet$|, we thus compute the discrete Picard iterate |$u_\bullet^{n} := \varPhi_\bullet u^{n-1}_\bullet$| as follows:
(i) Solve the linear system |$(w_\bullet, v_\bullet)_\mathcal{H} = \left\langle {{A u_\bullet^{n-1} - F}\,,\,{v_\bullet}} \right\rangle$| for all |$v_\bullet \in \mathcal{X}_\bullet$| for the unknown |$w_\bullet \in \mathcal{X}_\bullet$|.
(ii) Define |$u_\bullet^{n} := u_\bullet^{n-1} - \frac{\alpha}{L^2} w_\bullet$|.
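In matrix form, one Picard step thus amounts to a single linear solve plus a vector update; a minimal sketch (the names `riesz` and `residual` are our placeholders for the Gram matrix of the |$\mathcal{H}$|-inner product in the finite element basis and the vector with entries |$\left\langle {{Au_\bullet^{n-1}-F}\,,\,{\varphi_i}} \right\rangle$|):

```python
import numpy as np

def picard_step(u, riesz, residual, alpha, L):
    """One discrete Picard step u^n = Phi_bullet(u^{n-1}).

    riesz    -- Gram matrix of the H-inner product in the FE basis
                (for H = H^1_0, this is the stiffness matrix)
    residual -- vector with entries <A u^{n-1} - F, phi_i>
    """
    w = np.linalg.solve(riesz, residual)  # Step (i): one linear solve
    return u - (alpha / L ** 2) * w       # Step (ii): damped update
```

In practice, the dense solve would be replaced by a sparse direct or multigrid solver; linear complexity of this step is precisely the assumption made in Section 6.4.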
Finally, we recall the following a priori estimate for the discrete Picard iteration from Congreve & Wihler (2017, Proposition 2.1) and also include its simple proof for the sake of completeness:
With |$\displaystyle \left\| {{u^\star-u_\bullet^n}} \right\|_{\mathcal{H}} \le \left\| {{u^\star-u_\bullet^\star}} \right\|_{\mathcal{H}} + \left\| {{u_\bullet^\star-u_\bullet^n}} \right\|_{\mathcal{H}} \stackrel{(3.3)}\le \left\| {{u^\star-u_\bullet^\star}} \right\|_{\mathcal{H}} + \frac{q^n}{1-q}\,\left\| {{u_\bullet^1-u_\bullet^0}} \right\|_{\mathcal{H}},$| the proof follows from the Céa-type estimate of Lemma 3.1. □
Therefore, boundedness of |$\left\| {{u_\bullet^1-u_\bullet^0}} \right\|_{\mathcal{H}}$| in the a priori estimate of Lemma 3.2 can be guaranteed independently of the space |$\mathcal{X}_\bullet \subset \mathcal{H}$| by choosing, e.g., |$u_\bullet^0:=0$|. If |$\min_{w_\bullet\in\mathcal{X}_\bullet}\left\| {{u^\star-w_\bullet}} \right\|_{\mathcal{H}} = \mathcal{O}(N^{-s})$| for some |$s>0$|, where |$N$| denotes the number of degrees of freedom associated with |$\mathcal{X}_\bullet$|, this suggests the choice |$n = \mathcal{O}(\log N)$| in Lemma 3.2 (see the discussion in Congreve & Wihler, 2017, Remark 3.7). Moreover, we shall see below that the choice of |$u_\bullet^0$| by nested iteration generically leads to |$n = \mathcal{O}(1)$| (see Proposition 4.6 below). □
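For the reader's convenience, the calculation behind the choice |$n=\mathcal{O}(\log N)$| reads as follows (a sketch with all constants suppressed): combining Lemma 3.2 with the assumed decay |$\mathcal{O}(N^{-s})$| of the best approximation error, one balances\begin{align*}\left\| {{u^\star-u_\bullet^n}} \right\|_{\mathcal{H}} \lesssim N^{-s} + \frac{q^n}{1-q} \stackrel{!}{\lesssim} N^{-s}\quad\Longleftrightarrow\quad q^n \lesssim N^{-s}\quad\Longleftrightarrow\quad n \gtrsim \frac{s}{|\log q|}\,\log N.\end{align*}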
4. Adaptive algorithm
4.1 Basic properties of mesh refinement
Suppose that all considered discrete spaces |$\mathcal{X}_\bullet\subset\mathcal{H}$| are associated with a triangulation |$\mathcal{T}_\bullet$| of a fixed bounded Lipschitz domain |$\varOmega \subset\mathbb{R}^d$|, |$d\ge2$|. Suppose that |$\operatorname{refine}(\cdot)$| is a fixed mesh-refinement strategy.
Given a triangulation |$\mathcal{T}_\bullet$| and |$\mathcal{M}_\bullet \subseteq \mathcal{T}_\bullet$|, let |$\mathcal{T}_\circ := \operatorname{refine}(\mathcal{T}_\bullet,\mathcal{M}_\bullet)$| be the coarsest triangulation such that all marked elements |$T \in \mathcal{M}_\bullet$| have been refined, i.e., |$\mathcal{M}_\bullet \subseteq \mathcal{T}_\bullet \setminus \mathcal{T}_\circ$|.
We write |$\mathcal{T}_\circ\in\operatorname{refine}(\mathcal{T}_\bullet)$| if |$\mathcal{T}_\circ$| is obtained by a finite number of refinement steps, i.e., there exists |$n\in\mathbb{N}_0$| as well as a finite sequence |$\mathcal{T}_{(0)},\dots,\mathcal{T}_{(n)}$| of triangulations and corresponding sets |$\mathcal{M}_{(j)}\subseteq\mathcal{T}_{(j)}$| such that
|$\mathcal{T}_\bullet = \mathcal{T}_{(0)}$|,
|$\mathcal{T}_{(j+1)} = \operatorname{refine}(\mathcal{T}_{(j)},\mathcal{M}_{(j)})$| for all |$j=0,\dots,n-1$|,
|$\mathcal{T}_\circ = \mathcal{T}_{(n)}$|.
In particular, |$\mathcal{T}_\bullet\in\operatorname{refine}(\mathcal{T}_\bullet)$|. Further, we suppose that refinement |$\mathcal{T}_\circ\in \operatorname{refine}(\mathcal{T}_\bullet)$| yields nestedness |$\mathcal{X}_\bullet\subseteq\mathcal{X}_\circ$| of the corresponding discrete spaces.
In view of the adaptive algorithm (Algorithm 4.3 below), let |$\mathcal{T}_0$| be a fixed initial triangulation. To ease notation, let |$\mathbb{T}:=\operatorname{refine}(\mathcal{T}_0)$| be the set of all possible triangulations that can be obtained by successively refining |$\mathcal{T}_0$|.
4.2 A posteriori error estimator
We suppose the following properties with fixed constants |$C_{\rm stb},C_{\rm red},C_{\rm rel}^\star,C_{\rm drel}^\star \ge 1$| and |$0<q_{\rm red}<1$|, which slightly generalise the axioms of adaptivity of Carstensen et al. (2014):
- (A1) Stability on nonrefined element domains: For all triangulations |$\mathcal{T}_\bullet\in\mathbb{T}$| and |$\mathcal{T}_\circ\in\operatorname{refine}(\mathcal{T}_\bullet)$|, arbitrary discrete functions |$v_\bullet\in\mathcal{X}_\bullet$| and |$v_\circ\in\mathcal{X}_\circ$| and an arbitrary set |$\mathcal{U}_\bullet\subseteq\mathcal{T}_\bullet\cap\mathcal{T}_\circ$| of nonrefined elements, it holds that$$|\eta_\circ(\mathcal{U}_\bullet,v_\circ)-\eta_\bullet(\mathcal{U}_\bullet,v_\bullet)| \le C_{\rm stb}\,\left\| {{v_\bullet-v_\circ}} \right\|_{\mathcal{H}}.$$
- (A2) Reduction on refined element domains: For all triangulations |$\mathcal{T}_\bullet\in\mathbb{T}$| and |$\mathcal{T}_\circ\in\operatorname{refine}(\mathcal{T}_\bullet)$|, and arbitrary |$v_\bullet\in\mathcal{X}_\bullet$| and |$v_\circ\in\mathcal{X}_\circ$|, it holds that$$\eta_\circ(\mathcal{T}_\circ\backslash\mathcal{T}_\bullet,v_\circ)^2 \le q_{\rm red}\,\eta_\bullet(\mathcal{T}_\bullet\backslash\mathcal{T}_\circ,v_\bullet)^2 + C_{\rm red}\,\left\| {{v_\circ-v_\bullet}} \right\|_{\mathcal{H}}^2.$$
- (A3) Reliability: For all triangulations |$\mathcal{T}_\bullet\in\mathbb{T}$|, the error of the discrete solution |$u_\bullet^\star\in\mathcal{X}_\bullet$| to (3.1) is controlled by$$\left\| {{u^\star-u_\bullet^\star}} \right\|_{\mathcal{H}}\le C_{\rm rel}^\star\,\eta_\bullet(u_\bullet^\star).$$
- (A4) Discrete reliability: For all |$\mathcal{T}_\bullet\in\mathbb{T}$| and all |$\mathcal{T}_\circ\in\operatorname{refine}(\mathcal{T}_\bullet)$|, there exists a set |$\mathcal{R}_{\bullet,\circ}\subseteq\mathcal{T}_\bullet$| with |$\mathcal{T}_\bullet\backslash\mathcal{T}_\circ \subseteq \mathcal{R}_{\bullet,\circ}$| as well as |$\#\mathcal{R}_{\bullet,\circ} \le C_{\rm drel}^\star\,\#(\mathcal{T}_\bullet\backslash\mathcal{T}_\circ)$|, such that the difference of the discrete solutions |$u_\bullet^\star \in \mathcal{X}_\bullet$| and |$u_\circ^\star \in \mathcal{X}_\circ$| is controlled by\begin{align*}\left\| {{u_\circ^\star-u_\bullet^\star}} \right\|_{\mathcal{H}} \le C_{\rm drel}^\star\,\eta_\bullet(\mathcal{R}_{\bullet,\circ},u_\bullet^\star). \end{align*}
Suppose the following approximation property of |$u^\star \in \mathcal{H}$|: for all |$\mathcal{T}_\bullet \in \mathbb{T}$| and all |$\varepsilon >0$|, there exists a refinement |$\mathcal{T}_\circ \in \operatorname{refine}(\mathcal{T}_\bullet)$|, such that |$\left\| {{u^\star - u_\circ^\star}} \right\|_{\mathcal{H}} \leq \varepsilon$|. Then, discrete reliability (A4) already implies reliability (A3) (see Carstensen et al., 2014, Lemma 3.4).
We note that (A3)–(A4) are formulated for the noncomputable Galerkin solutions |$u_\bullet^\star\in\mathcal{X}_\bullet$| to (3.1), whereas Algorithm 4.3 below generates approximations |$u_\bullet^{n}\approx u_\bullet^\star\in\mathcal{X}_\bullet$|. The following lemma proves that reliability (A3) transfers to certain Picard iterates.
Combining these estimates, we conclude the proof. □
4.3 Adaptive algorithm
In the sequel, we analyse the following adaptive algorithm which—up to a different a posteriori error estimation based on elliptic reconstruction—is also considered in Congreve & Wihler (2017).
Input: Initial triangulation |$\mathcal{T}_0$|, adaptivity parameters |$0<\theta\le 1$|, |$\lambda>0$| and |$C_{\rm mark}\ge1$|, arbitrary initial guess |$u_0^0\in\mathcal{X}_0$|, e.g., |$u_0^0:=0$|.
Adaptive loop: For all |$\ell=0,1,2,\dots$|, iterate the following Steps (i)–(iii).
(i) Repeat the following Steps (a)–(b) for all |$n=1,2,\dots$|, until |$\left\| {{u_\ell^n-u_\ell^{n-1}}} \right\|_{\mathcal{H}} \le \lambda\,\eta_\ell(u_\ell^n)$|.
(a) Compute discrete Picard iterate |$u_\ell^n = \varPhi_\ell u_\ell^{n-1}\in\mathcal{X}_\ell$|.
(b) Compute refinement indicators |$\eta_\ell(T,u_\ell^n)$| for all |$T\in\mathcal{T}_\ell$|.
(ii) Define |$u_\ell := u_\ell^n \in \mathcal{X}_\ell$| and determine a set |$\mathcal{M}_\ell\subseteq\mathcal{T}_\ell$| of marked elements that have minimal cardinality up to the multiplicative constant |$C_{\rm mark}$| and that satisfy the Dörfler marking criterion |$\theta\,\eta_\ell(u_\ell) \le \eta_\ell(\mathcal{M}_\ell,u_\ell)$|.
(iii) Generate the new triangulation |$\mathcal{T}_{\ell+1} := \operatorname{refine}(\mathcal{T}_\ell,\mathcal{M}_\ell)$| by refinement of (at least) all marked elements |$T\in\mathcal{M}_\ell$| and define |$u_{\ell+1}^0 := u_\ell \in\mathcal{X}_\ell\subseteq\mathcal{X}_{\ell+1}$|.
Output: Sequence of discrete solutions |$u_\ell\in\mathcal{X}_\ell$| and corresponding estimators |$\eta_\ell(u_\ell)$|. □
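To fix ideas, the following sketch mirrors Algorithm 4.3 in code; all callbacks (`solve_step`, `estimate`, `refine`, `h_norm`) are hypothetical placeholders for the problem-specific ingredients, and the outer loop is written without a stopping tolerance, exactly as in the algorithm:

```python
import numpy as np

def doerfler_mark(eta_T, theta):
    """Greedy Doerfler marking: a minimal-cardinality set M with
    theta * eta <= eta(M), where eta(M)^2 = sum_{T in M} eta_T(T)^2.
    Sorting costs O(N log N); Stevenson's binning variant realizes
    C_mark = 2 in linear time (cf. Section 6.4).
    """
    order = np.argsort(eta_T)[::-1]                 # largest indicators first
    cumulative = np.cumsum(eta_T[order] ** 2)
    goal = theta ** 2 * np.sum(eta_T ** 2)
    k = int(np.searchsorted(cumulative, goal)) + 1  # smallest admissible set
    return order[:k]

def adaptive_algorithm(mesh, u, theta, lam, solve_step, estimate, refine, h_norm):
    """Sketch of Algorithm 4.3; all callbacks are hypothetical placeholders:
    solve_step(mesh, u) -> one Picard iterate Phi_l(u) on `mesh`,
    estimate(mesh, u)   -> array of local indicators eta_l(T, u),
    refine(mesh, M, u)  -> refined mesh and prolongated u (nested spaces),
    h_norm(mesh, v)     -> norm of v in H.
    """
    while True:                                     # loop over l = 0, 1, 2, ...
        while True:                                 # Step (i): Picard loop
            u_new = solve_step(mesh, u)             # (a) Picard iterate
            eta_T = estimate(mesh, u_new)           # (b) refinement indicators
            eta = np.sqrt(np.sum(eta_T ** 2))
            stop = h_norm(mesh, u_new - u) <= lam * eta
            u = u_new
            if stop:
                break
        if eta == 0.0:                              # lucky breakdown (Prop. 4.5)
            return mesh, u
        marked = doerfler_mark(eta_T, theta)        # Step (ii)
        mesh, u = refine(mesh, marked, u)           # Step (iii) + nested iteration
```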
The following two results analyse the (lucky) breakdown of Algorithm 4.3. The first proposition shows that, if the repeat loop of Step (i) does not terminate after finitely many steps, then the exact solution |$u^\star=u_\ell^\star$| belongs to the discrete space |$\mathcal{X}_\ell$|.
This concludes (4.3). □
The second proposition shows that, if the repeat loop of Step (i) does terminate with |$\eta_\ell(u_\ell)=0$|, then |$u_\ell = u^\star$| as well as |$\eta_k(u_k)=0$| and |$u_k=u_k^1=u^\star$| for all |$k>\ell$|.
Suppose (O1)–(O2) for the operator |$A$| as well as stability (A1) and reliability (A3) for the error estimator. If Step (i) in Algorithm 4.3 terminates with |$\eta_\ell(u_\ell)=0$| (or equivalently with |$\mathcal{M}_\ell = \emptyset$| in Step (ii)) for some |$\ell\in\mathbb{N}_0$|, then |$u_k=u^\star$| as well as |$\mathcal{M}_k = \emptyset$| for all |$k \geq \ell$|. Moreover, for all |$k \geq \ell+1$|, Step (i) terminates after one iteration.
Clearly, |$\mathcal{M}_\ell = \emptyset$| implies |$\eta_\ell(u_\ell)=0$|. Conversely, |$\eta_\ell(u_\ell)=0$| also implies |$\mathcal{M}_\ell = \emptyset$|. In this case, Lemma 4.2 yields |$\left\| {{u^\star-u_\ell}} \right\|_{\mathcal{H}}=0$| and hence |$u_\ell=u^\star$|. Moreover, |$\mathcal{M}_\ell = \emptyset$| implies |$\mathcal{T}_{\ell+1} = \mathcal{T}_\ell$|. Nested iteration guarantees |$u_{\ell+1}^1 = \varPhi_{\ell+1} u_{\ell+1}^0 =\varPhi_{\ell+1} u_\ell =\varPhi_{\ell+1} u^\star = u^\star = u_\ell = u_{\ell+1}^0$| and therefore |$\left\| {{u_{\ell+1}^1 - u_{\ell+1}^0}} \right\|_{\mathcal{H}} =0$|. Together with |$\mathcal{T}_{\ell+1}=\mathcal{T}_\ell$|, this implies |$u_{\ell+1} = u_\ell$| and |$\eta_{\ell+1}(u_{\ell+1})=\eta_\ell(u_\ell)=0$|. Hence, for all |$k \geq {\ell+1}$|, Step (i) terminates after one iteration with |$u_k = u^\star$| and |$\mathcal{M}_k =\emptyset$|. □
For the rest of this section, we suppose that Step (i) of Algorithm 4.3 terminates with |$\eta_\ell(u_\ell)>0$| for all |$\ell \in \mathbb{N}_0$|. In this case, we control the number of Picard iterates |$\#{\rm Pic}(\ell)$|.
The constant |$C_{\rm pic}$| depends only on (O1)–(O2), (A1) and (A3).
The proof of Proposition 4.6 employs the following auxiliary result.
This concludes the proof. □
We prove the assertion in two steps.
This contradicts the minimality of |$k$| and concludes |$k \leq n$|.
Linear convergence (see Theorem 5.3 below) proves, in particular, that |$\eta_{\ell}(u_{\ell}) \le C_{\rm lin} q_{\rm lin}\,\eta_{\ell-1}(u_{\ell-1})$| for all |$\ell\in\mathbb{N}$|. In practice, we observe that |$\eta_{\ell}(u_{\ell})$| is only improved by a fixed factor, i.e., there also holds the converse estimate |$\eta_{\ell-1}(u_{\ell-1}) \le \widetilde C\,\eta_{\ell}(u_{\ell})$| for all |$\ell\in\mathbb{N}$|. Even though we cannot rigorously prove this fact, Proposition 4.6 shows that, up to such an assumption, the number of Picard iterations in each step of Algorithm 4.3 is, in fact, uniformly bounded with |$\# {\rm Pic}(\ell) \le C_{\rm pic}+\frac{1}{|\log q|}\,\log(\max\{1,\widetilde C\})$|.
4.4 Estimator convergence
In this section, we show that, if Step (i) of Algorithm 4.3 terminates for all |$\ell \in \mathbb{N}_0$|, then Algorithm 4.3 yields |$\eta_\ell(u_\ell)\to0$| as |$\ell\to\infty$|.
We first show that the iterates |$u_\bullet = u_\bullet^n$| of Algorithm 4.3 are close to the noncomputable exact Galerkin approximation |$u_\bullet^\star\in\mathcal{X}_\bullet$| to (3.1) and that the corresponding error estimators are equivalent.
This concludes the proof. □
The following proposition gives a first convergence result for Algorithm 4.3. Unlike the stronger convergence result of Theorem 5.3 below, plain convergence only relies on (O1)–(O2), but avoids the use of (O3).
We prove the assertion in three steps.
Note that |$\lambda < C_\lambda^{-1} \theta$| implies |$\theta'>0$|. Hence, this is the Dörfler marking for |$u_\ell^\star$| with parameter |$0<\theta'<\theta$|. Therefore, Carstensen et al. (2014, Lemma 4.7) proves (4.11).
In particular, we infer that |$\left\| {{u_{\ell+1}^\star-u_\ell^\star}} \right\|_{\mathcal{H}}^2\to0$| as |$\ell\to\infty$|.
Step 3.|$\quad$| According to (4.11) and Step 2, the sequence |$\big(\eta_\ell(u_\ell^\star)\big)_{\ell\in\mathbb{N}_0}$| is contractive up to a non-negative perturbation, which tends to zero. Basic calculus (e.g., Aurada et al., 2012, Lemma 2.3) proves |$\eta_\ell(u_\ell^\star)\to0$| as |$\ell\to\infty$|. Lemma 4.9 guarantees the equivalence |$\eta_\ell(u_\ell^\star)\simeq\eta_\ell(u_\ell)$|. This concludes the proof. □
If the ‘discrete limit space’ |$\mathcal{X}_\infty := \overline{\bigcup_{\ell=0}^\infty\mathcal{X}_\ell}$| satisfies |$\mathcal{X}_\infty = \mathcal{H}$|, possible smoothness of |$A$| guarantees |$\left\| {{u^\star-u_\ell^\star}} \right\|_{\mathcal{H}} + {\rm osc}_\ell(u_\ell^\star)\to0$| as |$\ell\to\infty$| (see the argumentation in Erath & Praetorius, 2016, Proof of Theorem 3). Moreover, |$\mathcal{X}_\infty = \mathcal{H}$| follows either implicitly if |$u^\star$| is ‘nowhere discrete’ or can explicitly be ensured by the marking strategy without deteriorating optimal convergence rates (see Bespalov et al., 2017, Section 3.2). Since |$0<\lambda<C_\lambda^{-1}$|, Lemma 4.9 guarantees estimator equivalence |$\eta_\ell(u_\ell) \simeq \eta_\ell(u_\ell^\star)$|. Overall, such a situation leads to |$\eta_\ell(u_\ell)\to0$| as |$\ell\to\infty$|.
5. Linear convergence
Suppose that |$A$| additionally satisfies (O3). For |$v \in \mathcal{H}$|, we define the energy |$Ev:=\operatorname{Re}(P-F)v$|, where |$P$| is the potential associated with |$A$| from (1.2) and |$F\in\mathcal{H}^*$| is the right-hand side of (1.1). The next lemma generalises Diening & Kreuzer (2008, Lemma 16) and Garau et al. (2012, Theorem 4.1) and states equivalence of the energy difference and the difference in norm.
Together with (5.3), we conclude the proof. □
Lemma 5.1 immediately implies that the Galerkin solution |$u_\bullet^\star\in\mathcal{X}_\bullet$| to (3.1) minimizes the energy |$E$| in |$\mathcal{X}_\bullet$|, i.e., |$E(u_\bullet^\star)\le E(v_\bullet)$| for all |$v_\bullet\in\mathcal{X}_\bullet$|. On the other hand, if |$w_\bullet\in\mathcal{X}_\bullet$| is a minimizer of the energy in |$\mathcal{X}_\bullet$|, we deduce |$E(w_\bullet)=E(u_\bullet^\star)$|. Lemma 5.1 thus implies |$w_\bullet=u_\bullet^\star$|. Therefore, solving the Galerkin formulation (3.1) is equivalent to the minimization of the energy |$E$| in |$\mathcal{X}_\bullet$|.
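As a concrete instance of (O3) and the energy |$E$|, consider the quasi-linear model operator of Section 8, |$\left\langle {{Aw}\,,\,{v}} \right\rangle = \int_\varOmega \mu(x,|\nabla w|^2)\,\nabla w\cdot\nabla v\,\mathrm{d}x$|. A standard computation (cf. Garau et al., 2012; stated here for illustration) shows that\begin{align*}P(w)=\frac{1}{2}\int_\varOmega\int_0^{|\nabla w(x)|^2}\mu(x,s)\,\mathrm{d}s\,\mathrm{d}x,\qquad Ev=\operatorname{Re}\big(P(v)-F(v)\big),\end{align*}and |${\rm d}P=A$| follows from the chain rule as in (1.2).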
Next, we prove a contraction property as in Diening & Kreuzer (2008, Theorem 20), Belenki et al. (2012, Theorem 4.7) and Garau et al. (2012, Theorem 4.2) and, in particular, obtain linear convergence of Algorithm 4.3 in the sense of Carstensen et al. (2014).
This concludes the proof. □
6. Optimal convergence rates
6.1 Fine properties of mesh refinement
The proof of optimal convergence rates requires the following additional properties of the mesh-refinement strategy.
- (R1) Splitting property: Each refined element is split into at least two and at most |$C_{\rm son}\ge2$| sons, i.e., for all |$\mathcal{T}_\bullet \in \mathbb{T}$| and all |$\mathcal{M}_\bullet \subseteq \mathcal{T}_\bullet$|, the refined triangulation |$\mathcal{T}_\circ = \operatorname{refine}(\mathcal{T}_\bullet, \mathcal{M}_\bullet)$| satisfies(6.1)\begin{align}\# (\mathcal{T}_\bullet \setminus \mathcal{T}_\circ) + \# \mathcal{T}_\bullet \leq \# \mathcal{T}_\circ \leq C_{\rm son} \, \# (\mathcal{T}_\bullet \setminus \mathcal{T}_\circ) + \# (\mathcal{T}_\bullet \cap \mathcal{T}_\circ).\end{align}
- (R2) Overlay estimate: For all meshes |$\mathcal{T} \in \mathbb{T}$| and |$\mathcal{T}_\bullet,\mathcal{T}_\circ \in \operatorname{refine}(\mathcal{T})$|, there exists a common refinement |$\mathcal{T}_\bullet \oplus \mathcal{T}_\circ \in \operatorname{refine}(\mathcal{T}_\bullet) \cap \operatorname{refine}(\mathcal{T}_\circ) \subseteq \operatorname{refine}(\mathcal{T})$|, which satisfies\begin{align*}\# (\mathcal{T}_\bullet \oplus \mathcal{T}_\circ) \leq \# \mathcal{T}_\bullet + \# \mathcal{T}_\circ - \# \mathcal{T}. \end{align*}
- (R3) Mesh-closure estimate: There exists |$C_{\rm mesh}>0$| such that the sequence |$\mathcal{T}_\ell$| with corresponding sets of marked elements |$\mathcal{M}_\ell \subseteq \mathcal{T}_\ell$|, which is generated by Algorithm 4.3, satisfies\begin{align*}\# \mathcal{T}_\ell - \# \mathcal{T}_0 \leq C_{\rm mesh} \sum_{j=0}^{\ell -1} \# \mathcal{M}_j. \end{align*}
For newest vertex bisection (NVB), the mesh-closure estimate (R3) was first proved for |$d=2$| in Binev et al. (2004) and later for general |$d \geq 2$| in Stevenson (2008). Although both works require an additional admissibility assumption on |$\mathcal{T}_0$|, Karkulik et al. (2013) proved that this condition is unnecessary for |$d=2$|. The proof of the overlay estimate (R2) can be found in Cascon et al. (2008) and Stevenson (2007). The lower bound in (R1) is clearly satisfied for any feasible mesh-refinement strategy. For NVB, the upper bound in (R1) is easily verified for |$d=2$| with |$C_{\rm son}=4$|, and the proof for general dimension |$d \geq 2$| can be found in Gallistl et al. (2014).
For red-refinement with first-order hanging nodes, the validity of (R1)–(R3) is shown in Bonito & Nochetto (2010). For mesh-refinement strategies in isogeometric analysis, we refer to Morgenstern & Peterseim (2015) for T-splines and to Buffa et al. (2016) and Gantner et al. (2017) for (truncated) hierarchical B-splines.
Using (R1) and the definition of |$\operatorname{refine}(\cdot)$|, an induction argument proves that the lower estimate |$\# (\mathcal{T}_\bullet \setminus \mathcal{T}_\circ) + \# \mathcal{T}_\bullet \leq \# \mathcal{T}_\circ$| in (6.1) holds true for all |$\mathcal{T}_\bullet \in \mathbb{T}$| and arbitrary refinements |$\mathcal{T}_\circ \in \operatorname{refine}(\mathcal{T}_\bullet)$|.
6.2 Approximation class
6.3 Main result
The following theorem is the main result of this work. It proves that Algorithm 4.3 not only leads to linear convergence, but also guarantees the best possible algebraic convergence rate for the error estimator |$\eta_\ell(u_\ell)$|.
Moreover, there holds |$C_{\rm opt} \,{=}\, C_{\rm opt}'\left\| {{u^\star}} \right\|_{\mathbb{A}_s}$|, where |$C_{\rm opt}'\,{>}\,0$| depends only on |$\mathcal{T}_0$|, |$\theta$|, |$\lambda$|, |$s$|, (A1)–(A4), (O1)–(O2) and (R1)–(R3).
The comparison lemma is found in Carstensen et al. (2014).
The constant |$C_{\rm comp}$| depends only on |$\theta'',s$| and the constants of (A1), (A2) and (A4). □
The proof of Theorem 6.2 follows ideas from Carstensen et al. (2014, Theorem 4.1).
We prove the assertion in three steps.
Step 1.|$\quad$| The implication ‘|$\Longleftarrow$|’ follows by definition of the approximation class, the equivalence |$\eta_\ell(u_\ell^\star)\simeq\eta_\ell(u_\ell)$| from Lemma 4.9 and the upper bound of (R1) (cf. Carstensen et al., 2014, Proposition 4.15). We thus focus on the converse, more important implication ‘|$\Longrightarrow$|’.
Since |$\eta_0(u_0) \simeq \eta_0(u_0^\star) \lesssim \left\| {{u^\star}} \right\|_{\mathbb{A}_s}$|, the latter inequality holds, in fact, for all |$\ell \geq 0$|. Rearranging this estimate, we conclude the proof of (6.5). □
6.4 (Almost) Optimal computational work
We show that Algorithm 4.3 not only leads to optimal algebraic convergence rates for the error estimator |$\eta_\ell(u_\ell)$|, but also guarantees that the overall cost of the algorithm is asymptotically (almost) optimal.
We suppose that the linear system involved in the computation of each step of the discrete Picard iteration |$u_\ell^n := \varPhi_\ell(u_\ell^{n-1})$| (see Section 3.2) can be solved (e.g., by multigrid) in linear complexity |$\mathcal{O}(\# \mathcal{T}_\ell)$|. Moreover, we suppose that the evaluation of |$\left\langle {{A u_\ell^{n-1} -F}\,,\,{v_\ell}} \right\rangle$| and |$\eta_\ell(T,v_\ell)$| for one fixed |$v_\ell \in \mathcal{X}_\ell$| and |$T\in\mathcal{T}_\ell$| is of order |$\mathcal{O}(1)$|. Then, with |$\# {\rm Pic}(\ell)\ge1$| being the number of Picard iterates in Step (i) of Algorithm 4.3, we require |$\mathcal{O}(\# {\rm Pic}(\ell) \, \# \mathcal{T}_\ell)$| operations to compute the discrete solution |$u_\ell \in \mathcal{X}_\ell$|.
We suppose that the set |$\mathcal{M}_\ell$| in Step (ii) and the local mesh refinement |$\mathcal{T}_{\ell+1} := \operatorname{refine}(\mathcal{T}_\ell,\mathcal{M}_\ell)$| in Step (iii) of Algorithm 4.3 are performed in linear complexity |$\mathcal{O}(\# \mathcal{T}_\ell)$| (see, e.g., Stevenson, 2007) with |$C_{\rm mark} = 2$| for Step (ii).
Optimal convergence behaviour of Algorithm 4.3 means that, given |$\left\| {{u^\star}} \right\|_{\mathbb{A}_s} < \infty$|, the error estimator |$\eta_\ell(u_\ell)$| decays with rate |$s>0$| with respect to the degrees of freedom |$\mathcal{O}(\# \mathcal{T}_\ell)$| (see Theorem 6.2). Optimal computational complexity would mean that, given |$\left\| {{u^\star}} \right\|_{\mathbb{A}_s} < \infty$|, the error estimator |$\eta_\ell(u_\ell)$| decays with rate |$s>0$| with respect to the computational cost (see Feischl, 2015, for linear problems). Up to some small perturbation, the latter is stated in the following theorem.
There holds |$C_{\rm work} = C_{\rm work}'\,\left\| {{u^\star}} \right\|_{\mathbb{A}_s}$|, where |$C_{\rm work}'>0$| depends only on |$\theta$|, |$\lambda$|, |$s$|, |$\varepsilon$|, (A1)–(A4), (O1)–(O2) and (R1)–(R3), as well as on |$\mathcal{T}_0$|, |$\eta_0(u_0)$| and |$ \# {\rm Pic}(0)$|.
We prove the assertion in three steps.
Rearranging the terms we conclude (6.10) with |$C:=\left(C''' \# \mathcal{T}_0 \, \frac{C_{\rm lin}^{1/s}}{1-q_{\rm lin}^{1/s}} \right)^s\,C_{\rm opt}^{-1}$|.
Step 2.|$\quad$| Let |$s,\delta\,{>}\,0$|. Recall that |$t^{\delta/s} \log(C_0 /t) \,{\to}\, 0$| as |$t \,{\to}\, 0$|. Hence, it follows |$\big(\log(C_0 / \eta_\ell(u_\ell))\big) \eta_\ell(u_\ell)^{\delta/s} \lesssim 1$| as |$\ell \to \infty$|. This implies |$\big(\log(C_0 / \eta_\ell(u_\ell))\big)^{-s} \eta_\ell(u_\ell)^{-\delta} \gtrsim 1$| and results in |$\big(\log(C_0 / \eta_\ell(u_\ell))\big)^{-s} \eta_\ell(u_\ell) \gtrsim \eta_\ell(u_\ell)^{1+\delta}$|, where the hidden constant depends only on |$\delta$|, |$s$|, |$C_0$| and |$\max_{\ell\in\mathbb{N}_0}\eta_\ell(u_\ell)\le C_{\rm mon}\,\eta_0(u_0)$|.
Rearranging the terms concludes the proof. □
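In numerical experiments, the statement of Theorem 6.4 can be checked by measuring the estimator decay not only against |$\#\mathcal{T}_\ell$| but also against the cumulative work |$\sum_{j\le\ell} \#{\rm Pic}(j)\,\#\mathcal{T}_j$|; a small post-processing sketch (the per-level records are hypothetical inputs from a run of Algorithm 4.3):

```python
import numpy as np

def empirical_rates(eta, n_elements, n_picard):
    """Decay rates of the estimator with respect to #T_l and with respect
    to the cumulative work sum_{j<=l} #Pic(j) * #T_j (cf. Theorem 6.4).
    Inputs are per-level records eta_l(u_l), #T_l and #Pic(l).
    """
    eta = np.asarray(eta, dtype=float)
    dofs = np.asarray(n_elements, dtype=float)
    work = np.cumsum(np.asarray(n_picard, dtype=float) * dofs)
    # least-squares slope of log(eta) versus log(x); negated to give the rate
    slope = lambda x: -np.polyfit(np.log(x), np.log(eta), 1)[0]
    return slope(dofs), slope(work)   # both should be close to s
```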
7. Optimal convergence of full sequence
In this section, we reformulate Algorithm 4.3 in the sense that the discrete solutions |$(u_\ell)_{\ell\in\mathbb{N}_0}$| correspond to a subsequence |$(\widetilde u_{\ell_k})_{k\in\mathbb{N}_0}$| obtained by the following algorithm, which also accounts for the Picard iterates (see Remark 7.2 below for details). To distinguish between the quantities of Algorithm 4.3 and Algorithm 7.1, we use the tilde for all quantities of Algorithm 7.1, e.g., |$\widetilde\eta_\ell(\widetilde u_\ell)$| is the error estimator corresponding to some |$\widetilde u_\ell\in\widetilde{\mathcal{X}}_\ell$|.
Input: Initial triangulation |$\widetilde{\mathcal{T}}_0:=\mathcal{T}_0$|, adaptivity parameters |$0<\theta\le1$|, |$\lambda\ge0$|, and |$C_{\rm mark}\ge1$|, initial guess |$\widetilde u_{-1}:=0$|.
Adaptive loop: For all |$\ell=0,1,2,\dots$|, iterate the following Steps (i)–(iv).
(i) Compute discrete Picard iterate |$\widetilde u_\ell = \widetilde\varPhi_\ell \widetilde u_{\ell-1}\in\widetilde{\mathcal{X}}_\ell$|.
(ii) Compute refinement indicators |$\widetilde\eta_\ell(T,\widetilde u_\ell)$| for all |$T\in\widetilde{\mathcal{T}}_\ell$|.
(iii) If |$\left\| {{\widetilde u_\ell-\widetilde u_{\ell-1}}} \right\|_{\mathcal{H}} \le \lambda\,\widetilde\eta_\ell(\widetilde u_\ell)$|, do the following Steps (a)–(b).
(a) Determine a set |$\widetilde{\mathcal{M}}_\ell\,{\subseteq}\,\widetilde{\mathcal{T}}_\ell$| of marked elements which has minimal cardinality up to the multiplicative constant |$C_{\rm mark}$| and which satisfies the Dörfler marking criterion |$\theta\,\widetilde\eta_\ell(\widetilde u_\ell) \,{\le}\, \widetilde\eta_\ell(\widetilde{\mathcal{M}}_\ell,\widetilde u_\ell)$|.
(b) Generate the new triangulation |$\widetilde{\mathcal{T}}_{\ell+1} \,{:=}\, \operatorname{refine}(\widetilde{\mathcal{T}}_\ell,\widetilde{\mathcal{M}}_\ell)$| by refinement of (at least) all marked elements |$T\in\widetilde{\mathcal{M}}_\ell$|.
(iv) Else, define |$\widetilde{\mathcal{T}}_{\ell+1} := \widetilde{\mathcal{T}}_\ell$|.
Output: Sequence of discrete solutions |$\widetilde u_\ell\in\widetilde{\mathcal{X}}_\ell$| and corresponding estimators |$\widetilde\eta_\ell(\widetilde u_\ell)$|. □
With |$\ell_{-1}:=-1$| and |$\widetilde{\mathcal{T}}_{-1}:=\mathcal{T}_0$|, define an index sequence |$(\ell_k)_{k\in\mathbb{N}_0}$| inductively as follows:
|$\ell_k > \ell_{k-1}$| is the smallest index such that |$\widetilde{\mathcal{T}}_{\ell_k+1}\neq\widetilde{\mathcal{T}}_{\ell_k}$|, or |$\ell_k=\infty$| if no such index exists.
In explicit terms, the indices |$\ell_k$| are chosen such that the mesh is refined after the computation of |$\widetilde u_{\ell_k}$|. Hence, Algorithms 4.3 and 7.1 are related as follows:
|$\widetilde u_{\ell_k} = u_k$| for all |$k\in\mathbb{N}_0$| with |$\ell_k<\infty$|;
|$\widetilde u_{\ell_{k-1}+m} = u_k^m$| for all |$k\in\mathbb{N}_0$| with |$\ell_{k-1}<\infty$| and all |$m\in\mathbb{N}_0$| with |$\ell_{k-1}+m \le \ell_k$|;
|$\widetilde{\mathcal{T}}_\ell = \mathcal{T}_k$| for all |$\ell=\ell_{k-1}+1,\dots,\ell_k$| if |$\ell_k<\infty$|;
|$\widetilde{\mathcal{T}}_\ell = \mathcal{T}_k$| for all |$\ell\ge \ell_{k-1}+1$| if |$\ell_{k-1}<\infty$| and |$\ell_k=\infty$|.
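The index bookkeeping of Remark 7.2 is straightforward to realize in code; a minimal sketch (the finite list `meshes` of triangulations produced so far is a hypothetical input, and indices with |$\ell_k=\infty$| are simply omitted):

```python
def index_sequence(meshes):
    """Return the indices l at which Algorithm 7.1 refines the mesh,
    i.e., all l with T_{l+1} != T_l (cf. Remark 7.2)."""
    return [l for l in range(len(meshes) - 1) if meshes[l + 1] != meshes[l]]
```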
The observations of Remark 7.2 allow us to transfer the results for Algorithm 4.3 to Algorithm 7.1. As a consequence of our preceding analysis, we obtain the following theorem:
Finally, it holds |$\widetilde C_{\rm opt} = C\,C_{\rm opt}$| and |$\widetilde{C}_{\rm work} = C\, C_{\rm work}$|, where |$C_{\rm opt}$| is the constant from Theorem 6.2, |$C_{\rm work}>0$| is the constant from Theorem 6.4 and |$C>0$| depends only on (O1)–(O2), (A3), (R1), |$\mathcal{T}_0$| and |$s$|.
We stick with the notation of Remark 7.2.
Step 1.|$\quad$| Suppose that there exists an index |$\ell\in\mathbb{N}_0$| with |$\left\| {{\widetilde u_\ell-\widetilde u_{\ell-1}}} \right\|_{\mathcal{H}} \le \lambda\,\widetilde\eta_\ell(\widetilde u_\ell)$| and |$\widetilde{\mathcal{M}}_\ell = \emptyset$|. Then, it follows |$\widetilde\eta_\ell(\widetilde u_\ell)=0$|. Proposition 4.5 concludes |$u^\star = \widetilde u_k$|, |$\widetilde{\mathcal{M}}_k = \emptyset$| and |$\widetilde\eta_k(\widetilde u_k)=0$| for all |$k\ge\ell$|. In particular, this proves (7.2)–(7.3) with |$\left\| {{u^\star}} \right\|_{\mathbb{A}_s}<\infty$| for all |$s>0$|.
Step 2.|$\quad$| Suppose that there exists an index |$k\in\mathbb{N}_0$| with |$\ell_{k-1}<\infty$| and |$\ell_k=\infty$|. Then, |$\widetilde{\mathcal{T}}_\ell=\widetilde{\mathcal{T}}_{\ell_{k-1}+1}$| for all |$\ell>\ell_{k-1}$|. According to Step 1, we may further assume that |$\left\| {{\widetilde u_\ell-\widetilde u_{\ell-1}}} \right\|_{\mathcal{H}} > \lambda\,\widetilde\eta_\ell(\widetilde u_\ell)$| for all |$\ell>\ell_{k-1}$|. Then, this corresponds to the situation of Proposition 4.4 and hence results in convergence |$\left\| {{u^\star-\widetilde u_\ell}} \right\|_{\mathcal{H}} + \widetilde\eta_\ell(\widetilde u_\ell)\to0$| as |$\ell\to\infty$| and |$\widetilde\eta_\ell(u^\star)=0$| for all |$\ell>\ell_{k-1}$|. Again, this proves (7.2)–(7.3) with |$\left\| {{u^\star}} \right\|_{\mathbb{A}_s}<\infty$| for all |$s>0$|.
Altogether, Theorem 5.3 proves convergence |$\left\| {{u^\star-\widetilde u_\ell}} \right\|_{\mathcal{H}}+\widetilde\eta_\ell(\widetilde u_\ell)\to0$| as |$\ell\to\infty$|.
Step 4.|$\quad$| According to Step 2, we may suppose |$\ell_k<\infty$| for all |$k\in\mathbb{N}_0$|. To show the implication ‘|$\Longleftarrow$|’ of (7.3), it is noted that the right-hand side of (7.3) implies |$\eta_{k}(u_k) \lesssim \big(\#\mathcal{T}_{k} - \# \mathcal{T}_0 +1 \big)^{-s}$| for all |$k\in\mathbb{N}_0$|. Hence, the claim follows from Theorem 6.2.
For |$m>0$|, estimate (4.6) proves |$\widetilde\eta_\ell(\widetilde u_\ell) =\eta_{k}(u_{k}^m) \lesssim \eta_{k-1}(u_{k-1})$|. For |$m=0$|, it holds |$\widetilde\eta_\ell(\widetilde u_\ell)=\eta_{k-1}(u_{k-1})$|. Combining this with the latter estimate, we conclude the proof.
Step 6.|$\quad$| The same argument as in Step 5 in combination with Theorem 6.4 proves (7.4). □
8. Numerical experiments
In this section, we present two numerical experiments in two dimensions to underpin our theoretical findings. In the experiments, we compare the performance of Algorithm 4.3 for
different values of |$\lambda \in \{1, 0.1, 0.01,\ldots,10^{-6}\}$|,
different values of |$\theta \in \{0.2, 0.4,\ldots,1\}$|,
nested iteration |$u_\ell^0 := u_{\ell-1}$| compared with a naive initial guess |$u_\ell^0 := 0$|.
As model problems, we consider nonlinear boundary value problems similar to those of Garau et al. (2011, 2012), Bruckner et al. (2014) and Congreve & Wihler (2017).
8.1 Model problem
We suppose that the scalar nonlinearity |$\mu: \varOmega \times \mathbb{R}_{\geq 0} \rightarrow \mathbb{R}$| satisfies the following properties (M1)–(M4), similarly considered in Garau et al. (2012).
- (M1) There exist constants |$0<\gamma_1<\gamma_2<\infty$| such that(8.2)\begin{align}\gamma_1 \le \mu(x,t) \le \gamma_2 \quad \textrm{for all } x \in \varOmega \text{ and all } t \geq 0.\end{align}
- (M2) There holds |$\mu(x,\cdot) \in C^1(\mathbb{R}_{\geq 0},\mathbb{R})$| for all |$x \in \varOmega$|, and there exist |$0 < \widetilde{\gamma}_1<\widetilde{\gamma}_2<\infty$| such that(8.3)\begin{align}\widetilde{\gamma}_1 \le \mu(x,t) + 2t\,\frac{\mathrm{d}}{\mathrm{d}t}\mu(x,t) \le \widetilde{\gamma}_2 \quad \textrm{for all } x \in \varOmega \text{ and all } t \geq 0.\end{align}
- (M3) Lipschitz continuity of |$\mu(x,t)$| in |$x$|, i.e., there exists |$L_{\mu}>0$| such that(8.4)\begin{align}| \mu(x,t) - \mu(y,t) | \le L_{\mu} | x - y | \quad \textrm{for all } x,y \in \varOmega \text{ and all } t \geq 0.\end{align}
- (M4) Lipschitz continuity of |$t \frac{\mathrm{d}}{\mathrm{d}t}\mu(x,t)$| in |$x$|, i.e., there exists |$\widetilde{L}_{\mu}>0$| such that(8.5)\begin{align}\left| t \frac{\mathrm{d}}{\mathrm{d}t}\mu(x,t) - t \frac{\mathrm{d}}{\mathrm{d}t}\mu(y,t) \right| \le \widetilde{L}_{\mu} | x - y | \quad \textrm{for all } x,y \in \varOmega \text{ and all } t \geq 0.\end{align}
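As a concrete example of (M1)–(M2), the |$x$|-independent nonlinearity |$\mu(x,t) = 1+\arctan(t)$| used in Section 8.5 satisfies the bounds with |$\gamma_1=\widetilde\gamma_1=1$|, |$\gamma_2 = 1+\pi/2$| and |$\widetilde\gamma_2 = 1+\pi/3+\sqrt{3}/2$| (the maximum of |$\mu(t)+2t\mu'(t)$|, attained at |$t=\sqrt{3}$|); a quick numerical sanity check of these bounds (our own sketch):

```python
import numpy as np

mu = lambda t: 1.0 + np.arctan(t)                 # nonlinearity of Section 8.5
dmu = lambda t: 1.0 / (1.0 + t ** 2)              # d/dt mu(t)

t = np.linspace(0.0, 1.0e4, 2_000_001)
m1 = mu(t)                                        # quantity bounded in (8.2)
m2 = mu(t) + 2.0 * t * dmu(t)                     # quantity bounded in (8.3)

assert m1.min() >= 1.0 and m1.max() <= 1.0 + np.pi / 2   # (M1)
assert m2.min() >= 1.0 - 1e-12                           # (M2), lower bound
print("max of mu + 2t*mu':", m2.max())            # approx 1 + pi/3 + sqrt(3)/2
print("exact maximum     :", 1.0 + np.pi / 3 + np.sqrt(3) / 2)
```

Via the lemma of Section 8.2 below, these values are consistent with the constants |$\alpha=1$| and |$L=1+\sqrt{3}/2+\pi/3$| reported in Section 8.5.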
8.2 Weak formulation
Next, we show that |$A$| satisfies (O1)–(O3). For the sake of completeness, we first recall an auxiliary lemma, which is a simplified version of Liu & Barrett (1996, Lemma 2.1).
Suppose that |$\mu: \varOmega \times \mathbb{R}_{\geq 0} \rightarrow \mathbb{R}$| satisfies (M1)–(M2). Then, the corresponding operator |$A$| satisfies (O1)–(O3) with constants |$\alpha := \widetilde{\gamma}_1$| and |$L :=\widetilde{\gamma}_2$|.
We prove the assertion in two steps.
Integration over |$\varOmega$| proves strong monotonicity (O1) and Lipschitz continuity (O2).
This concludes |$\left\langle {{{\rm d}P(w)}\,,\,{v}} \right\rangle = H^{\prime}(0) = \int_{\varOmega} \mu(x,|\nabla w|^2) \, \nabla w \cdot \nabla v \,\mathrm{d}x \stackrel{(8.7a)}{=} \left\langle {{Aw}\,,\,{v}} \right\rangle$|. □
8.3 Discretization and a posteriori error estimator
The well-posedness of the error estimator requires that the nonlinearity |$\mu(x,t)$| is Lipschitz continuous in |$x$|, i.e., (M3). Then, reliability (A3) and discrete reliability (A4) are proved as in the linear case (see, e.g., Cascon et al., 2008 for the linear case or Garau et al., 2012, Theorem 3.3 and Garau et al., 2012, Theorem 3.4, respectively, for strongly monotone nonlinearities).
The verification of stability (A1) and reduction (A2) requires the validity of a certain inverse estimate. For scalar nonlinearities and under the assumptions (M1)–(M4), the latter is proved in Garau et al. (2012, Lemma 3.7). Using this inverse estimate, the proof of (A1) and (A2) follows as in the linear case (see, e.g., Cascon et al., 2008, for the linear case or Garau et al., 2012, Section 3.3, for scalar nonlinearities). We note that the necessary inverse estimate remains open, in particular, for nonscalar nonlinearities. In any case, the arising constants in (A1)–(A4) also depend on the uniform shape regularity of the triangulations generated by newest vertex bisection.
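For orientation, we note that the local contributions of the standard residual error estimator for first-order elements take the form (a sketch; cf. Cascon et al., 2008; Garau et al., 2012)\begin{align*}\eta_\bullet(T,v_\bullet)^2 = h_T^2\,\big\| f+\operatorname{div}\big(\mu(\cdot,|\nabla v_\bullet|^2)\nabla v_\bullet\big)\big\|_{L^2(T)}^2 + h_T\,\big\|\big[\mu(\cdot,|\nabla v_\bullet|^2)\,\nabla v_\bullet\cdot n\big]\big\|_{L^2(\partial T\cap\varOmega)}^2,\end{align*}where |$[\,\cdot\,]$| denotes the jump across interior edges and |$h_T$| the local mesh size.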
8.4 Experiment with known solution

Fig. 1. Experiment with known solution from Section 8.4: |$Z$|-shaped domain |$\varOmega \subset \mathbb{R}^2$| with initial mesh |$\mathcal{T}_0$| (left), and the adaptively generated NVB mesh |$\mathcal{T}_{19}$| with |$5854$| elements (right). |$\Gamma_D$| is visualized by a thick red line.
Our empirical observations are the following: due to the singular behaviour of |$u^\star$|, uniform refinement leads to a reduced convergence rate |$\mathcal{O}(N^{-\beta/2})$| for both the energy error |$\left\| {{\nabla u^\star - \nabla u_\ell}} \right\|_{L^2(\varOmega)}$| and the error estimator |$\eta_\ell (u_\ell)$|. On the other hand, the adaptive refinement of Algorithm 4.3 regains the optimal convergence rate |$\mathcal{O}(N^{-1/2})$|, independently of the actual choice of |$\theta \in \{0.2, 0.4, 0.6, 0.8\}$| and |$\lambda \in \{1, 0.1, 0.01, \ldots, 10^{-6}\}$| (see Figs 2–4). Throughout, we compare the performance of the iterative solver for the naive initial guess |$u_\ell^0:=0$| as well as for nested iteration |$u_\ell^0:=u_{\ell-1}$|: the larger |$\lambda$|, the stronger its influence on the convergence behaviour (see Fig. 4). Moreover, Fig. 5 shows the number of Picard iterations for |$\theta\in\{0.2, 0.8\}$| and |$\lambda \in \{0.1, 0.01, \ldots, 10^{-6}\}$|. As expected from Remark 3.3, for the naive initial guess |$u_\ell^0:=0$|, the number of Picard iterations grows logarithmically with the number of elements |$\#\mathcal{T}_\ell$|, whereas we observe a bounded number of Picard iterations for nested iteration |$u_\ell^0:=u_{\ell-1}$| (cf. Proposition 4.6 and Remark 4.8).

Fig. 2. Experiment with known solution from Section 8.4: Convergence of |$\eta_\ell(u_\ell)$| (solid lines) and |$\left\| {{\nabla u^\star - \nabla u_\ell}} \right\|_{L^2(\varOmega)}$| (dashed lines) for |$\lambda = 0.1$| and different values of |$\theta\in\{0.2,\dots, 1\}$|, where we compare naive initial guesses |$u_\ell^0 := 0$| (left) and nested iteration |$u_\ell^0 := u_{\ell-1}$| (right).

Fig. 3. Experiment with known solution from Section 8.4: Convergence of |$\eta_\ell(u_\ell)$| (solid lines) and |$\left\| {{\nabla u^\star - \nabla u_\ell}} \right\|_{L^2(\varOmega)}$| (dashed lines) for |$\lambda = 10^{-5}$| and different values of |$\theta\in\{0.2,\dots, 1 \}$|, where we compare naive initial guesses |$u_\ell^0 := 0$| (left) and nested iteration |$u_\ell^0 := u_{\ell-1}$| (right).

Fig. 4. Experiment with known solution from Section 8.4: Convergence of |$\eta_\ell(u_\ell)$| (solid lines) and |$\left\| {{\nabla u^\star - \nabla u_\ell}} \right\|_{L^2(\varOmega)}$| (dashed lines) for |$\theta = 0.2$| and different values of |$\lambda\in\{1,0.1, 0.01,\dots, 10^{-6} \}$|, where we compare naive initial guesses |$u_\ell^0 := 0$| (left) and nested iteration |$u_\ell^0 := u_{\ell-1}$| (right).

Fig. 5. Experiment with known solution from Section 8.4: Number of Picard iterations used in each step of Algorithm 4.3 for different values of |$\lambda\in\{0.1,\dots, 10^{-6} \}$| and |$\theta = 0.2$| (left) as well as |$\theta = 0.8$| (right). As expected, we observe logarithmic growth for naive initial guesses |$u_\ell^0 := 0$| (dashed lines); see Remark 3.3. On the other hand, nested iteration |$u_\ell^0 := u_{\ell-1}$| (solid lines) leads to a bounded number of Picard iterations (see Remark 4.8).
8.5 Experiment with unknown solution
We consider the Z-shaped domain |$\varOmega \subset \mathbb{R}^2$| from Fig. 1 (left) and the nonlinear Dirichlet problem (8.1) with |$\Gamma=\Gamma_D$| and constant right-hand side |$f\equiv1$|, where |$\mu(x, |\nabla u^\star|^{2}) = 1+ \arctan(|\nabla u^\star|^2)$|. According to Congreve & Wihler (2017, Example 1), (O1)–(O2) hold with |$\alpha = 1$| and |$L = 1+\sqrt{3}/2 + \pi/3$|.
Because the exact solution is unknown, our empirical observations are concerned with the error estimator only (see Figs 6–9). Uniform mesh refinement leads to a suboptimal rate of convergence, whereas the use of Algorithm 4.3 regains the optimal rate of convergence. The latter appears to be robust with respect to |$\theta \in \{0.2,\dots, 0.8\}$| as well as |$\lambda \in \{1, 0.1,\ldots ,10^{-5}\}$|. Although naive initial guesses |$u_\ell^0:=0$| for the iterative solver lead to a logarithmic growth of the number of Picard iterations, the proposed use of nested iteration |$u_\ell^0 := u_{\ell-1}$| again leads to bounded iteration numbers.

Fig. 6. Experiment with unknown solution from Section 8.5: Convergence of |$\eta_\ell(u_\ell)$| for |$\lambda = 0.1$| and |$\theta\in\{0.2,\dots, 1\}$|, where we compare naive initial guesses |$u_\ell^0 := 0$| (left) and nested iteration |$u_\ell^0 := u_{\ell-1}$| (right).

Fig. 7. Experiment with unknown solution from Section 8.5: Convergence of |$\eta_\ell(u_\ell)$| for |$\lambda = 10^{-5}$| and |$\theta\in\{0.2,\dots, 1\}$|, where we compare naive initial guesses |$u_\ell^0 := 0$| (left) and nested iteration |$u_\ell^0 := u_{\ell-1}$| (right).

Fig. 8. Experiment with unknown solution from Section 8.5: Convergence of |$\eta_\ell(u_\ell)$| for |$\theta = 0.2$| and |$\lambda\in\{1,0.1, 0.01,\dots, 10^{-5} \}$|, where we compare naive initial guesses |$u_\ell^0 := 0$| (left) and nested iteration |$u_\ell^0 := u_{\ell-1}$| (right).

Fig. 9. Experiment with unknown solution from Section 8.5: Number of Picard iterations used in each step of Algorithm 4.3 for different values of |$\lambda\in\{1, 0.1,\dots, 10^{-5} \}$| and |$\theta = 0.2$| (left) as well as |$\theta = 0.8$| (right). As expected, we observe logarithmic growth for naive initial guesses |$u_\ell^0 := 0$| (dashed lines) as well as a bounded number of Picard iterations for nested iteration |$u_\ell^0 := u_{\ell-1}$| (solid lines).
9. Conclusions
In this article, we have analysed an adaptive finite element method (FEM) proposed in Congreve & Wihler (2017) for the numerical solution of quasi-linear partial differential equations (PDEs) with strongly monotone operators. Conceptually based on the residual error estimator and the Dörfler marking strategy, the algorithm steers the adaptive mesh refinement and the iterative solution of the arising nonlinear systems by means of a simple fixed point iteration, which is adaptively stopped if the fixed point iterates are sufficiently accurate. In the spirit of Carstensen et al. (2014), the numerical analysis is given in an abstract Hilbert space setting that (at least) covers conforming first-order FEM and certain scalar nonlinearities (Garau et al., 2012).
We prove that our adaptive algorithm guarantees convergence of the finite element solutions to the unique solution of the PDE at optimal algebraic rate for the error estimator (which, in usual applications, is equivalent to the energy error plus data oscillations, Garau et al., 2012). Employing nested iteration, we prove that the number of fixed point iterations per mesh is bounded logarithmically with respect to the improvement of the error estimator. As a consequence, we thus prove that the adaptive algorithm is not only convergent at optimal rate with respect to the degrees of freedom, but also at (almost) optimal rate with respect to the computational work.
Numerical experiments for quasi-linear PDEs in |$H^1_0(\varOmega)$| confirm our theory and underline the performance of the proposed adaptive strategy, where each step of the considered fixed point iteration requires only the numerical solution of one linear Poisson problem. In the present work, the iterative and inexact solution of these linear problems is not considered, but it can be included into the analysis (Haberl, yet unpublished).
Overall, this study appears to be the first that guarantees optimal convergence for an adaptive FEM with iterative solver for nonlinear PDEs. Open questions for future research include the following: Is it possible to generalise the numerical analysis to other types of error estimators (e.g., estimators based on equilibrated fluxes, Ern & Vohralík, 2013) as well as to higher-order and/or nonconforming FEM (see, e.g., Congreve & Wihler, 2017, for numerical experiments)? Is it possible to treat nonscalar nonlinearities?
Funding
Austrian Science Fund (FWF) through the research project Optimal adaptivity for BEM and FEM-BEM coupling (P27005 to A.H., D.P.) and the research project Optimal isogeometric boundary element methods (P29096 to D.P., G.G.); the FWF doctoral school Nonlinear PDEs (W1245 to D.P., G.G.); Vienna Science and Technology Fund (WWTF) through the research project Thermally controlled magnetization dynamics (MA14-44 to B.S., D.P.).
Footnotes
1 It is necessary to demand |$\ell\ge\ell_0$|. Indeed, the finitely many estimator values |$\widetilde\eta_\ell(\widetilde u_\ell)$| for |$\ell<\ell_0$| are obviously bounded by a constant. However, this constant does not depend on the constants on which |$\widetilde C_{\rm opt}$| resp. |$\widetilde C_{\rm work}$| depend.