-
PDF
- Split View
-
Views
-
Cite
Cite
Young-Ju Lee, Jongho Park, On the linear convergence of additive Schwarz methods for the p-Laplacian, IMA Journal of Numerical Analysis, 2024;, drae068, https://doi.org/10.1093/imanum/drae068
- Share Icon Share
Abstract
We consider additive Schwarz methods for boundary value problems involving the |$p$|-Laplacian. While existing theoretical estimates suggest a sublinear convergence rate for these methods, empirical evidence from numerical experiments demonstrates a linear convergence rate. In this paper we narrow the gap between these theoretical and empirical results by presenting a novel convergence analysis. First, we present a new convergence theory for additive Schwarz methods written in terms of a quasi-norm. This quasi-norm exhibits behaviour akin to the Bregman distance of the convex energy functional associated with the problem. Secondly, we provide a quasi-norm version of the Poincaré–Friedrichs inequality, which plays a crucial role in deriving a quasi-norm stable decomposition for a two-level domain decomposition setting. By utilizing these key elements we establish the asymptotic linear convergence of additive Schwarz methods for the |$p$|-Laplacian.
1. Introduction
Let |$\varOmega $| be a bounded polygonal domain in |$\mathbb{R}^{2}$| with the Lipschitz boundary |$\partial \varOmega $|. Given |$p \in (1,\infty )$| we consider the following |$p$|-Laplace equation:
where |$f \in L^{p^{*}}(\varOmega )$| with |$p^{*}$| being from the equation |$1/p + 1/p^{*} = 1$|.
The |$p$|-Laplacian is a standard example of nonlinear elliptic problems (Benedikt et al., 2018). Furthermore, it has a number of application areas, including glaciology, non-Newtonian fluids (Shapovalov, 2017), nonlinear diffusion and nonlinear elasticity; see Díaz (1985) and references therein. Thus, there has been extensive research on (1.1), especially for numerical solutions of (1.1). Some important early results can be found in Glowinski & Marrocco (1975); Ciarlet (2002). Finite element methods for the |$p$|-Laplacian were analysed in terms of the quasi-norm in Barrett & Liu (1993, 1994). Further studies on error estimates for the |$p$|-Laplacian in terms of the quasi-norm were conducted in Liu & Yan (2001, 2002); Ebmeyer & Liu (2005); Carstensen et al. (2006). Linear convergence of adaptive finite element methods for (1.1) was shown in Diening & Kreuzer (2008). Numerical homogenization for multiscale |$p$|-Laplacian problems was investigated in Liu et al. (2021).
It is well known that the boundary value problem (1.1) can be formulated in the following weak form (Glowinski & Marrocco, 1975; Ciarlet, 2002): find |$u \in W_{0}^{1,p}(\varOmega )$| such that
where |$W_{0}^{1,p} (\varOmega )$| is a usual Sobolev space consisting of the |$L^{p} (\varOmega )$|-functions vanishing on |$\partial \varOmega $| with |$(L^{p} (\varOmega ))^{2}$|-gradient. Equivalently, it is interpreted as the following convex optimization problem:
That is, one may deal with the convex optimization problem (1.2) to obtain a solution of (1.1). Based on the convex optimization formulation (1.2) multigrid and preconditioned descent methods were proposed in Bermejo & Infante (2000) and Huang et al. (2007), respectively. In particular, the framework of subspace correction methods (Xu, 1992) for (1.2) were considered in Tai & Xu (2002); Park (2020).
This paper is concerned with numerical solutions of boundary value problems involving the |$p$|-Laplacian by additive Schwarz methods. Additive Schwarz methods, also known as parallel subspace correction methods, have been broadly used as efficient numerical solvers for large-scale scientific problems; see Xu (1992); Toselli & Widlund (2005) and references therein for relevant results on linear problems. In additive Schwarz methods the domain of a target problem is decomposed into a union of several subdomains, and optimal local corrections on the subdomains with respect a numerical approximation for the solution are computed in parallel. The numerical approximation for the solution is iteratively updated by collecting all the local corrections. Due to their parallel structures additive Schwarz methods are suitable for massively parallel computation using distributed memory computers. In the past decades there have been a number of results on additive Schwarz methods for large-scale convex optimization problems. The framework of additive Schwarz methods was first considered for convex optimization in Tai & Espedal (1998), and subsequently applied to the |$p$|-Laplacian in Tai & Xu (2002). These methods have since been further investigated in several studies, including Badea (2006, 2019); Park (2020, 2022).
The convergence rate of additive Schwarz methods for the |$p$|-Laplacian problem (1.1) was first analysed in Tai & Xu (2002); the |$\mathcal{O} (n^{-\frac{\underline{p}(\underline{p}-1)}{(\overline{p}-\underline{p})(\overline{p}+\underline{p}-1)}})$| energy convergence of the methods was proven, where |$n$| denotes the number of iterations, |$\underline{p} = \min \{p, 2 \}$|, and |$\overline{p} = \max \{ p, 2 \}$|. Recently, Park (2020) showed that the methods satisfy the improved |$\mathcal{O} (n^{-\frac{\overline{p}(\underline{p}-1)}{\overline{p}-\underline{p}}})$| convergence rate (see Proposition 2.3). The results in both Tai & Xu (2002) and Park (2020) are based on some estimates for the Bregman distance of the energy functional |$F$| in (1.2). Roughly speaking, these estimates are written as
where |$\mu _{\overline{p}}$| and |$L_{\underline{p}}$| are positive constants independent of |$u$| and |$v$|, and |$D_{F} (u,v)$| is the Bregman distance of |$F$| defined by
Here, |$F^{\prime} (v)$| stands for the Frechét derivative of |$F$| at |$v$| given by
One may refer to Tai & Xu (2002, Lemma 2.1) and Park (2020, Section 6.1) for details on the estimate (1.3).
While both Tai & Xu (2002) and Park (2020) proved the sublinear convergence of additive Schwarz methods for the |$p$|-Laplacian, it was observed numerically in several works that the methods actually converge linearly; see, e.g., Park (2021, Fig. 2). Indeed, as we will demonstrate in the numerical experiments presented in Section 5 of this paper, additive Schwarz methods for (1.1) exhibit linear convergence empirically under various settings on discretization and domain decomposition. More precisely, each convergence curve of the energy error with respect to the number of iterations seems linear in the |$x$|-linear |$y$|-log scale plot when the number of iterations is sufficiently large, which means that the energy error decays exponentially as the number of iterations increases. This implies that the existing convergence estimates for additive Schwarz methods for the |$p$|-Laplacian may not be optimal.
The main motivation of this paper is to discuss a linear convergence analysis for additive Schwarz methods to solve the |$p$|-Laplacian problem (1.1). As we mentioned above, while the existing theoretical estimates (Tai & Xu, 2002; Park, 2020) for the convergence rate of additive Schwarz methods for the |$p$|-Laplacian are sublinear, the empirical convergence rate observed by numerical experiments is linear. This discrepancy between theoretical and empirical results motivates our work, as we aim to bridge this gap by rigorously proving the asymptotic linear convergence of additive Schwarz methods for the |$p$|-Laplacian.
In (1.3) |$\underline{p}$| and |$\overline{p}$| do not agree if |$p \neq 2$|, so that the lower and upper bounds for |$D_{F} (u,v)$| are expressed in powers of |$\| u - v \|_{W^{1,p}(\varOmega )}$| with different exponents. This discrepancy indicates that a power of norm is not adequate as a tight two-sided approximation for the Bregman distance; whenever we establish a bound for |$D_{F} (u,v)$| in terms of |$\| u - v \|_{W^{1,p}(\varOmega )}$| or vice versa we suffer from a kind of looseness. We claim that the sublinear convergence rates given in the existing works (Tai & Xu, 2002; Park, 2020) are caused by this looseness. To overcome this issue we propose to use the quasi-norm developed in Liu & Yan (2001, 2002); Ebmeyer & Liu (2005); Carstensen et al. (2006), which is relevant to the problem of consideration and approximates the Bregman distance appropriately, and then to derive the convergence estimate in terms of the quasi-norm. This approach is similar to obtain the convergence measure of the iterative method using the energy-like metric relevant to the problem to be solved, as discussed in Lee et al. (2008, 2009). We denote the quasi-norm by |$\| \cdot \|_{(\nabla v)}$| (see (3.1)) and show that
for some positive constants |$\mu _{p}$| and |$L_{p}$| (see Lemma 3.3), i.e., |$\| u - v \|_{(\nabla v)}^{2}$| approximates |$D_{F} (u,v)$| well up to a multiplicative constant. Meanwhile, we note that the quasi-norm |$\| \cdot \|_{(\nabla v)}$|, along with several alternative versions described in Diening & Růžička (2007), Diening & Kreuzer (2008), do not induce a norm. As a result existing convergence theories for additive Schwarz methods (Tai & Xu, 2002; Park, 2020) cannot directly utilize the estimate (1.6). A novelty in this paper is that, by extending the idea of Park (2020), a new convergence theory for additive Schwarz methods is obtained in terms of the quasi-norm, which utilizes (1.6) to obtain the asymptotic linear convergence rate of additive Schwarz methods for the |$p$|-Laplacian. In our linear convergence analysis a quasi-norm version of the Poincaré–Friedrichs inequality (see Lemmas 3.4 and 3.5) plays a critical role. We validate this asymptotic linear convergence result numerically in Section 5.
The rest of this paper is organized as follows. In Section 2 we present finite element approximations, domain decomposition settings and a two-level additive Schwarz method for the |$p$|-Laplacian problem. An asymptotic linear convergence analysis of the two-level additive Schwarz method is given in Section 3. In Section 4 we present details of the quasi-norm Poincaré–Friedrichs inequality that is used in the convergence analysis of the methods. In Section 5 we provide numerical results of the two-level additive Schwarz method for the |$p$|-Laplacian problem across various settings. Finally, we provide a concluding remark for our paper in Section 6.
2. Additive Schwarz methods
In this section we introduce finite element spaces and domain decomposition settings for the |$p$|-Laplacian problem (1.2). Based on these settings we present a two-level additive Schwarz method for (1.2) and its convergence theory, which explains the asymptotic linear convergence of the algorithm.
In what follows the notation |$A \lesssim B$| means that there exists a constant |$c> 0$| such that |$A \leq c B$|, where |$c$| is independent of the geometric parameters |$H$|, |$h$| and |$\delta $| relying on discretization and domain decomposition. We also write |$A \approx B$| if |$A \lesssim B$| and |$B \lesssim A$|.
2.1 Discretization and domain decomposition
Let |${\mathcal{T}}_{h}$| be a quasi-uniform triangulation of |$\varOmega $| with |$h$| the characteristic element diameter. The collection of continuous and piecewise linear functions on |${\mathcal{T}}_{h}$| vanishing on |$\partial \varOmega $| is denoted by |$V = S_{h} (\varOmega )$|. Clearly, we have |$V \subset W_{0}^{1, \infty } (\varOmega )$|. For continuous functions the nodal interpolation operator |$I_{h}$| onto |$S_{h} (\varOmega )$| is well-defined.
In what follows we consider the following conforming finite element approximation of (1.2) defined on |$V$|:
A unique solution of (2.1) is denoted by |$u^{*} \in V$|. Convergence properties of (2.1) as |$h \rightarrow 0$| can be found in Barrett & Liu (1993); Ciarlet (2002).
Next, we describe domain decomposition settings for the problem (2.1). We assume that |$\varOmega $| admits another quasi-uniform triangulation |${\mathcal{T}}_{H}$| with |$H$| the characteristic element diameter such that |${\mathcal{T}}_{h}$| is a refinement of |${\mathcal{T}}_{H}$|. A finite element space |$S_{H} (\varOmega )$| is defined in the same manner as |$S_{h} (\varOmega )$|. In the two-level additive Schwarz method for (2.1) |${\mathcal{T}}_{h}$| and |${\mathcal{T}}_{H}$| will play roles of fine and coarse meshes, respectively. Let |$\{ \varOmega _{k} \}_{k=1}^{N}$| be a nonoverlapping domain decomposition of |$\varOmega $| such that each |$\varOmega _{k}$| is the union of several coarse elements in |${\mathcal{T}}_{H}$| and the number of coarse elements consisting of |$\varOmega _{k}$| is uniformly bounded. For each subdomain |$\varOmega _{k}$|, |$1 \leq k \leq N$| we consider an enlarged region |$\varOmega _{k}^{\prime}$| consisting of the elements |$T \in{\mathcal{T}}_{h}$| with |$\operatorname{dist} (T, \varOmega _{k}) \leq \delta $|. Then |$\{ \varOmega _{k}^{\prime}\}_{k=1}^{N}$| forms an overlapping domain decomposition of |$\varOmega $|. We define |$S_{h} (\varOmega _{k} ^{\prime}) \subset W_{0}^{1, \infty }(\varOmega _{k}^{\prime})$| as the piecewise linear finite element space on the |${\mathcal{T}}_{h}|_{\varOmega _{k}^{\prime}}$| with the homogeneous essential boundary condition.
We set
A two-level domain decomposition for |$V$| is given by
where |$R_{k}^{*}$|: |$V_{k} \rightarrow V$|, |$1 \leq k \leq N$| is the natural extension-by-zero operator and |$R_{0}^{*}$|: |$V_{0} \rightarrow V$| is the natural interpolation operator. Let |$\{ \theta _{k} \}_{k=1}^{N}$| be the piecewise linear partition of unity for |$\varOmega $| subordinate to the covering |$\{ \varOmega _{k}^{\prime} \}_{k=1}^{N}$| that was presented in Toselli & Widlund (2005, Eq. (3.7)). It is known that |$\{ \theta _{k} \}_{k=1}^{N}$| satisfies the following properties:
The following lemma summarizes an important result on stable decomposition for the two-level domain decomposition (2.2) (see Tai & Xu (2002, Lemma 4.1)).
Using the usual colouring technique one can prove that the two-level domain decomposition (2.2) enjoys the strengthened convexity condition (see Park (2020, Assumption 4.2)).
2.2 Two-level additive Schwarz method
The two-level additive Schwarz method for (2.1) based on the space decomposition (2.2) is described in Algorithm 1. It is worth noting that this algorithm has been investigated in several prior works. The algorithm for smooth convex optimization was first considered in Tai & Espedal (1998), and then applied to the |$p$|-Laplacian in Tai & Xu (2002). Later, the framework was generalized to constrained and nonsmooth convex optimization problems in Badea (2006) and Park (2020, 2021), respectively. The constant |$\tau _{0}$| in Algorithm 1 was given in Lemma 2.2.
The following proposition summarizes the sublinear convergence rate of Algorithm 1 analysed in Park (2020, Theorem 6.1). It was discussed in Park (2021, Remark 4.2) that the rate presented in Proposition 2.3 is the sharpest estimate among the existing ones (Tai & Xu, 2002; Badea, 2006; Badea & Krause, 2012; Park, 2020).
While Proposition 2.3 ensures the sublinear convergence of Algorithm 1, as we will see in Section 5, the actual numerical behaviour indicates linear convergence. This observation motivates us to develop a new convergence theory for Algorithm 1 that can explain the linear convergence. We summarize our main result, the asymptotic linear convergence of Algorithm 1, in Theorem 2.4. The proof of Algorithm 1 will be provided in Section 3. We highlight that Theorem 2.4 stands as the first theoretical result that explains the linear convergence of the additive Schwarz method for the |$p$|-Laplacian.
Regarding the condition in Theorem 2.4 that requires the finite element solution |$u^{*}$| to satisfy |$| \nabla u^{*} | \neq 0$| on |$\varOmega $|, we discuss its validity for extreme values of |$p$|, particularly when |$p$| is either very large or close to |$1$|. As we will demonstrate in Section 5, for large |$p$|, the solution may develop a singularity (see Fig. 2(e, f)). Fortunately, this singularity does not violate the condition |$| \nabla u^{*} | \neq 0$|. However, when |$p$| is close to |$1$| the solution may exhibit a flat region, potentially leading to a vanishing gradient (see Fig. 2(a, b)). Consequently, the applicability of Theorem 2.4 to cases near |$p = 1$| may be limited.
Despite the potential limitations in applying Theorem 2.4 to such cases it remains practically relevant, as many real-world applications involving the |$p$|-Laplacian typically utilize moderate values of |$p$|. For instance, in modelling nonlinear Darcy law for fluid flows, as discussed in Benedikt et al. (2018), physically meaningful values for |$p$| are generally greater than |$3/2$|.
We conclude this section by mentioning several acceleration methodologies that can be applied to Algorithm 1. In Park (2021, 2022) acceleration schemes for additive Schwarz methods for convex optimization were proposed. As the energy functional |$F$| is convex these schemes can be directly applied to Algorithm 1 to yield accelerated variants. These accelerated methods show faster convergence behaviours than the vanilla method, while they have essentially the same computational cost per iteration; see Park (2022) for relevant numerical results. We do not deal with the accelerated methods in detail because they are beyond the scope of this paper.
3. Convergence analysis
The main objective of this section is to prove Theorem 2.4, which is the asymptotic linear convergence theorem for the two-level additive Schwarz method for the |$p$|-Laplacian. We begin by presenting some useful properties of the quasi-norm |$\| \cdot \|_{(\nabla v)}$| (Liu & Yan, 2001; Ebmeyer & Liu, 2005), which is defined as
Subsequently, we prove Theorem 2.4 by verifying a certain quasi-norm stable decomposition property (Tai & Xu, 2002; Park, 2020).
3.1 Properties of the quasi-norm
The quasi-norm |$\| \cdot \|_{(\nabla v)}$| given in (3.1) satisfies a scaling property in the sense that the |$\| tw \|_{(\nabla v)}$| is bounded by |$\| w \|_{(\nabla v)}$| multiplied by |$t^{\alpha }$| for some |$\alpha \in \mathbb{R}$|, where |$v,w \in W^{1,p} (\varOmega )$| and |$t \in [0,1]$|. Lemma 3.1 summarizes such a property.
The following lemma states that |$\| u - v \|_{(\nabla v)}$| is bounded by |$\| u - v \|_{(\nabla u)}$| up to a multiplicative constant independent of |$u,v \in W^{1,p} (\varOmega )$|.
In Barrett & Liu (1993, 1994) the following vector inequalities were established: there exist two positive constants |$C_{1}$| and |$C_{2}$| such that, for any |$\xi , \eta \in \mathbb{R}^{2}$|, the following hold:
Using (3.2) and proceeding similarly to Barrett & Liu (1993, Theorem 2.1) we prove Lemma 3.3, which says that the estimate (1.6) actually holds. Lemma 3.3 will play an important role in proving (3.8); see also Liu et al. (2021, Lemma 2.3).
3.2 Quasi-norm stable decomposition
The core step in the convergence analysis of additive Schwarz methods typically involves verifying a stable decomposition property; see, e.g., Tai & Xu (2002, Eq. (13)) and Park (2020, Assumption 4.1). In this section we derive a quasi-norm stable decomposition property associated with the space decomposition (2.2). A key distinction of the quasi-norm stable decomposition property considered in this section compared with the existing ones is that we use the quasi-norm |$\| \cdot \|_{(\nabla v)}$|, while the existing ones are written in terms of norms. As (1.3) implies a power of norm cannot approximate the Bregman distance of |$F$| by a multiplicative constant if |$p \neq 2$|. Our main insight is that if the quasi-norm can approximate the Bregman distance of |$F$| up to a multiplicative constant, i.e., if it satisfies an estimate of the form (1.6), then we can derive the asymptotic linear convergence of Algorithm 1 using this property.
We recall that two key ingredients for the stable decomposition analysis for linear elliptic problems are the Poincaré–Friedrichs inequality and interpolation error estimate; see Toselli & Widlund (2005, Chapter 3). Therefore, we need to establish these theories with respect to the quasi-norm for the stable decomposition analysis of the |$p$|-Laplacian.
In Lemmas 3.4 and 3.5 we present quasi-norm Poincaré–Friedrichs inequalities for the cases |$p \in (2, \infty )$| and |$p \in (1,2)$|, respectively, that are suitable for our purposes; more general results are proven in Section 4.
As stated in Lemmas 3.4 and 3.5 the quasi-norm Poincaré–Friedrichs inequality holds for all choices of |$v$| except for certain exceptional cases, which are detailed in Examples 4.8 and 4.15. Moreover, in most cases, the constant |$C_{p,v}^{\mathrm{PF}}$| demonstrates only a weak dependence on |$v$|. By the quasi-monotone argument (Galvis & Efendiev, 2010; Pechstein & Scheichl, 2013) presented in Section 4 we can ensure that the value of |$C_{p,v}^{\mathrm{PF}}$| is influenced by the local variation of |$| \nabla v |$| only. Consequently, even if |$| \nabla v |$| exhibits significant global variation |$C_{p,v}^{\mathrm{PF}}$| has a moderate value. One may refer to Scheichl et al. (2012) for relevant numerical evidences.
As noted in Lemmas 3.4 and 3.5, the quasi-norm Poincare–Friedrichs inequality may not hold in cases where |$\nabla v$| vanishes in a certain pattern, which makes the convergence analysis of the algorithm challenging. In order to address this issue one may consider regularization techniques as described in Diening et al. (2020); Liu et al. (2021). However, we do not adopt such techniques since they require a delicate convergence analysis for the case when the regularization parameter tends to |$0$|.
Next, we establish a quasi-norm error estimate for the nodal interpolation operator |$I_{h}$| onto the finite element space |$S_{h} (\varOmega )$|, as summarized in Lemma 3.7.
Using Lemmas 3.4, 3.5 and 3.7, we obtain the following quasi-norm stable decomposition result.
The estimate presented in Lemma 3.8 is not as sharp as the one in the norm-stable decomposition result given in Lemma 2.1. Specifically, in Lemma 3.8, the power of |$H/\delta $| is |$p$|, while in Lemma 2.1, it is |$p-1$|. The norm-stable decomposition achieves the sharp |$(H/\delta )^{p-1}$|-result using a trace theorem-type argument introduced in Dryja & Widlund (1994); see also Toselli & Widlund (2005, Lemma 3.10). Unfortunately, we were unable to make a similar argument in our quasi-norm analysis because the quasi-norm does not have a notion of trace. To obtain a sharp estimate it will be necessary to define an appropriate trace for the quasi-norm, which is remained as a topic for future research.
3.3 Proof of Theorem 2.4
The proof of Theorem 2.4 presented here uses a similar argument to Park (2020). However, due to the nonlinearity of the quasi-norm |$\| \cdot \|_{(\nabla v)}$|, we have to make a careful consideration on dealing with |$\| \cdot \|_{(\nabla v)}$|. In Lemma 3.10 we state the generalized additive Schwarz lemma (see Park (2020, Lemma 4.5)) applied to Algorithm 1 in a form suitable for our purposes.
In the following text, similar to Park (2020, Lemma 4.6), we prove that |$M_{\tau }$| defined in Lemma 3.10 is bounded below by |$D_{F}$| and above by |$\| \cdot \|_{(\nabla v)}^{2}$| up to a multiplicative constant. We note that Lemma 3.11 can be regarded as a variant of the Lipschitz-like/convexity condition discussed in Teboulle (2018).
By closely following the argument in (Park, 2020, Appendix A.4) and manipulating |$\| \cdot \|_{(\nabla v)}$|-terms using the properties of |$\| \cdot \|_{(\nabla v)}$| presented in Lemmas 3.1 to 3.3 we establish the following lemma, which provides an estimate for the ratio of two consecutive energy errors in Algorithm 1.
Finally, we are ready to present our proof of Theorem 2.4.
As stated in Lemma 3.3 the squared quasi-norm |$\| u - v \|_{(\nabla v)}^{2}$| is equivalent to the Bregman distance |$D_{F} (u, v)$| up to multiplicative constants for |$u, v \in W^{1,p} (\varOmega )$|. This equivalence allows us to perform the convergence analysis presented in this section using the Bregman distance instead of the quasi-norm. Indeed, the Bregman distance is frequently utilized in the analysis of convex optimization algorithms; see, e.g., Bauschke et al. (2017); Teboulle (2018). Nevertheless, we opted to present the convergence analysis of Algorithm 1 in terms of the quasi-norm in this paper, because using the quasi-norm has an advantage that we can simplify our proof by borrowing some useful techniques regarding the quasi-norm introduced in from the existing literature (Liu & Yan, 2001, 2002; Ebmeyer & Liu, 2005; Carstensen et al., 2006).
4. Quasi-norm Poincaré–Friedrichs inequality
This section is devoted to the proofs of Lemmas 3.4 and 3.5. Namely, we deal with quasi-norm Poincaré–Friedrichs inequalities of the form
where |$p \in (1, \infty )$| with |$p \neq 2$|. Unfortunately, the inequality (4.1) does not hold for every |$v, w \in W_{0}^{1,p} (\varOmega )$|; see Examples 4.8 and 4.15. Based on a quasi-monotonicity argument introduced in Pechstein & Scheichl (2013) we characterize the conditions on |$v$| such that the inequality (4.1) holds and provide a precise estimate for the Poincaré–Friedrichs constant |$C$| in (4.1). Throughout this section let |$W_{\varGamma }^{1,p} (\varOmega )$| denote the collection of all |$W^{1,p} (\varOmega )$|-functions vanishing on |$\varGamma \subset \partial \varOmega $|. In addition we use the conventions |$0/0 = 1$| and |$1/0 = \infty $|.
We first observe that a particular case of (4.1), when |$|\nabla v|$| is constant on |$\varOmega $|, is valid. By the same argument as in Liu & Yan (2002, Lemma 3.1) and Carstensen et al. (2006, Lemma. 4.1) we can prove the following lemma.
For a non-negative function |$\alpha \in L^{\infty } (\varOmega )$| and a partition |$\mathcal{Y} = \{ Y \}_{l=1}^{m}$| of |$\varOmega $| consisting of nonoverlapping polygonal regions we define two non-negative functions |$\underline{\alpha }_{\mathcal{Y}\,}, \overline{\alpha }^{\mathcal{Y}\,} \in L^{\infty } (\varOmega )$| as follows:
In the following text we address the cases |$p \in (2, \infty )$| and |$p \in (1, 2)$| separately. We first focus on the case |$p \in (2, \infty )$|. In Definition 4.2 we introduce the concept of quasi-monotone increase. We note that relevant notions were explored in Galvis & Efendiev (2010); Pechstein & Scheichl (2013).
Let |$\alpha \in L^{\infty } (\varOmega )$| be a non-negative function on |$\varOmega $|, and let |$\mathcal{Y}\, = \{ Y \}_{l=1}^{m}$| denote a partition of |$\varOmega $| into nonoverlapping polygonal regions.
We say that the region |$P_ {l_{1}, l_{s}} = ( \overline{Y}_{l_{1}} \cup \dots \cup \overline{Y}_{l_{s}} )^{\circ }$|, |$1 \leq l_{1}, \dots , l_{s} \leq m$|, is a quasi-monotonically increasing path from |$Y_{l_{1}}$| to |$Y_{l_{s}}$| with respect to |$\alpha $| if the following two conditions hold:
For each |$1 \leq i \leq s-1$| the regions |$\overline{Y}_{l_{i}}$| and |$\overline{Y}_{l_{i+1}}$| share a common edge.
|$\underline{\alpha }_{\mathcal{Y}\,} (Y_{l_{1}}) \leq \dots \leq \underline{\alpha }_{\mathcal{Y}\,} (Y_{l_{s}})$|.
We say that |$\alpha $| is |$\partial \varOmega $|-quasi-monotonically increasing on |$\mathcal{Y}\,\,$| if, for any |$1 \leq l \leq m$|, there exist an index |$l^{*}$| and a quasi-monotonically increasing path |$P_{l, l^{*}}$| from |$Y_{l}$| to |$Y_{l^{*}}$|, such that |$\partial Y_{l^{*}} \cap \partial \varOmega $| has nonvanishing one-dimensional measure.
By a similar argument, as in the proof of Pechstein & Scheichl (2013, Theorem 2.9), we prove the following lemma.
where the region |$P_{l,l^{*}}$| was given in Definition 4.2.
Due to Lemma 4.3 we are able to define the quasi-monotone increase constant |$C_{p, \alpha }^{\mathrm{QM}}$| for |$p \in (2, \infty )$|, as presented in Definition 4.4.
Note that the infimum in Definition 4.2 is well-defined because |$\alpha $| is |$\partial \varOmega $|-quasi-monotonically increasing on the trivial partition |$\{ \varOmega \}$|. In terms of the quasi-monotone increase constant |$C_{p, \alpha }^{\mathrm{QM}}$| we present a quasi-norm Poincaré–Friedrichs inequality for |$p \in (2, \infty )$| in Theorem 4.5.
Let |$W_{h} (\varOmega )$| be the space of piecewise constant functions on the triangulation |${\mathcal{T}}_{h}$|. Under an additional assumption that |$\alpha \in W_{h} (\varOmega )$| we can characterize the condition when the quasi-monotone increase constant |$C_{p, \alpha }^{\mathrm{QM}}$| is finite.
Assume that |$p \in (2, \infty )$|. Let |$\alpha \in W_{h} (\varOmega )$| be a non-negative piecewise constant function on |${\mathcal{T}}_{h}$|. Then, the quasi-monotone increase constant |$C_{p, \alpha }^{\mathrm{QM}}$| is finite if and only if every maximal polygonal region |$R \subset \varOmega $| with |$\alpha> 0$| satisfies that |$\partial R \cap \partial \varOmega $| has nonvanishing one-dimensional measure.
Combining Theorem 4.5 and Lemma 4.6 yields Corollary 4.7, in which Lemma 3.4 is a particular case |$\alpha = |\nabla v |$| of this result.
We show that, under the condition presented in Lemma 4.6 for the quasi-monotone increase constant |$C_{p, \alpha }$| to be infinite, the quasi-norm Poincaré–Friedrichs inequality of the form (4.1) is not valid. For simplicity we provide a counterexample in one-dimension; we note that the construction can be generalized to higher dimensions.
We now turn to the case |$p \in (1, 2)$|. In contrast to the case |$p \in (2, \infty )$|, which heavily relies on the quasi-monotone increase of |$\alpha $|, the analysis of the case |$p \in (1, 2)$| hinges on the quasi-monotone decrease of |$\alpha $|; see Definition 4.9.
Let |$\alpha \in L^{\infty } (\varOmega )$| be a non-negative function on |$\varOmega $|, and let |$\mathcal{Y}\, = \{ Y \}_{l=1}^{m}$| denote a partition of |$\varOmega $| into nonoverlapping polygonal regions.
We say that the region |$P_{l_{1}, l_{s}} = ( \overline{Y}_{l_{1}} \cup \dots \cup \overline{Y}_{l_{s}} )^{\circ }$|, |$1 \leq l_{1}, \dots , l_{s} \leq m$|, is a quasi-monotonically decreasing path from |$Y_{l_{1}}$| to |$Y_{l_{s}}$| with respect to |$\alpha $| if the following two conditions hold:
For each |$1 \leq i \leq s-1$| the regions |$\overline{Y}_{l_{i}}$| and |$\overline{Y}_{l_{i+1}}$| share a common edge.
|$\overline{\alpha }_{\mathcal{Y}\,} (Y_{l_{1}}) \geq \dots \geq \overline{\alpha }_{\mathcal{Y}\,} (Y_{l_{s}})$|.
We say that |$\alpha $| is |$\partial \varOmega $|-quasi-monotonically decreasing on |$\mathcal{Y}\,$| if, for any |$1 \leq l \leq m$|, there exist an index |$l^{*}$| and a quasi-monotonically decreasing path |$P_{l, l^{*}}$| from |$Y_{l}$| to |$Y_{l^{*}}$|, such that |$\partial Y_{l^{*}} \cap \partial \varOmega $| has nonvanishing one-dimensional measure.
One can prove the following lemma using the fact that the map |$x \mapsto x^{p-2}$| (|$x \geq 0$|) is decreasing and by following a similar argument to that used in the proof of Lemma 4.3.
where the region |$P_{l,l^{*}}$| was given in Definition 4.9.
Similar to Definition 4.4 we present the definition of the quasi-monotone decrease constant |$C_{p, \alpha }^{\mathrm{QM}}$| for |$p \in (1, 2)$| in the following text.
In terms of the quasi-monotone decrease constant |$C_{p, \alpha }^{\mathrm{QM}}$|, we present a quasi-norm Poincaré–Friedrichs inequality for |$p \in (1, 2)$| in Theorem 4.12, which can be proven in a similar manner to Theorem 4.5.
The following lemma characterizes the condition when the quasi-monotone decrease constant |$C_{p, \alpha }$| is finite, under an additional assumption that |$\alpha \in W_{h} (\varOmega )$|, i.e., |$\alpha $| is a non-negative piecewise constant function on the triangulation |${\mathcal{T}}_{h}$|.
Assume that |$p \in (1, 2)$|. Let |$\alpha \in W_{h} (\varOmega )$| be a non-negative piecewise constant function on |${\mathcal{T}}_{h}$|. Then, the quasi-monotone decrease constant |$C_{p, \alpha }^{\mathrm{QM}}$| is finite if and only if every maximal polygonal region |$S \subset \varOmega $| with |$\alpha = 0$| satisfies that |$\partial S \cap \partial \varOmega $| has nonvanishing one-dimensional measure.
The proof is analogous to that of Lemma 4.6.
We obtain Corollary 4.14, in which Lemma 3.5 is a particular case |$\alpha = |\nabla u |$|, as a direct consequence of Theorem 4.12 and Lemma 4.13.
Finally, we present a counterexample of the quasi-norm Poincaré–Friedrichs inequality (4.1) under the condition presented in Lemma 4.13 for the quasi-monotone decrease constant |$C_{p, \alpha }^{\mathrm{QM}}$| to be infinite.
5. Numerical experiments
In this section we present numerical results of the two-level additive Schwarz method for the |$p$|-Laplacian, which support our theoretical findings. All the algorithms were implemented in MATLAB R2022b. They were executed on a desktop equipped with AMD Ryzen 5 5600X CPU (3.7GHz, 6C), 40GB RAM and the operating system Windows 10 Pro.
In the model |$p$|-Laplacian problem (1.1) we set |$p \in \{ 1.05, 1.1, 1.5, 5, 10, 20 \}$|, |$\varOmega = [0,1]^{2} \subset \mathbb{R}^{2}$| and |$f = 1$|. The domain |$\varOmega $| is partitioned into |$2 \times 1/H \times 1/H$| uniform triangles to form a coarse triangulation |${\mathcal{T}}_{H}$| of |$\varOmega $|. We further refine |${\mathcal{T}}_{H}$| to obtain a fine triangulation |${\mathcal{T}}_{h}$|, which consists of total |$2 \times 1/h \times 1/h$| uniform triangles. Each subdomain |$\varOmega _{k}$|, |$1 \leq k \leq N$| (|$N = 1/H \times 1/H$|) is defined by a rectangular region consisting of two coarse triangles sharing a diagonal edge. Then we extend |$\varOmega _{k}$| by adding its surrounding layers of fine triangles in |${\mathcal{T}}_{h}$| with the width |$\delta $| to construct |$\varOmega _{k}^{\prime}$|, so that |$\{ \varOmega _{k}^{\prime} \}_{k=1}^{N}$| becomes an overlapping domain decomposition for |$\varOmega $|. If |$\delta \in (0, H/2)$| then |$\{ \varOmega _{k}^{\prime} \}_{k=1}^{N}$| can be coloured with four colours in the way described in Lemma 2.2. The discretization and domain decomposition settings described above are illustrated in Fig. 1.

Discretization and domain decomposition settings when |$h = 1/2^{4}$|, |$H = 1/2^{2}$| and |$\delta = h$|. (a) Coarse triangulation |${\mathcal{T}}_{H}$| and fine triangulation |${\mathcal{T}}_{h}$|. (b) Nonoverlapping domain decomposition |$\{ \varOmega _{k} \}_{k=1}^{N}$|. (c) Overlapping domain decomposition |$\{ \varOmega _{k} ^{\prime} \}_{k=1}^{N}$|.

Reference solutions of the |$p$|-Laplacian problem (2.1) (|$p \in \{ 1.05, 1.1, 1.5, 5, 10, 20 \}$|) computed by the adaptive Newton method (Mishchenko, 2023) (|$h = 2^{-5}$|).
In Algorithm 1 we set |$u^{(0)} = 0$| and |$\tau = \tau _{0} = 1/5$|. Local problems defined on |$V_{k}$|, |$1 \leq k \leq N$|, and coarse problems defined on |$V_{0}$| are solved by the adaptive Newton method proposed in Mishchenko (2023, Algorithm 2.1). We use the stop criterion
for both local and coarse problems, where |$F_{k}$| represents the energy functional corresponding to the local or coarse problems on |$V_{k}$|.
As alternatives to the adaptive Newton method used in this paper, which is a second-order optimization algorithm, first-order optimization algorithms (Teboulle, 2018) can be adopted to solve the local and coarse problems. These algorithms are generally easier to implement as they do not require the Hessian information of the energy functional, but known to converge slower than second-order algorithms. To accelerate the convergence rate of a first-order algorithm several techniques such as the FISTA momentum (Beck & Teboulle, 2009), restart scheme (O’Donoghue & Candes, 2015) and backtracking (Scheinberg et al., 2014) can be employed.
A reference solution |$u^{*} \in V$| for each |$p$| and |$h$| is computed by sufficiently many iterations of the adaptive Newton method applied to the full-dimension problem (2.1). The computed reference solutions for |$p \in \{ 1.05, 1.1, 1.5, 5, 10, 20 \}$| are plotted in Fig. 2. One can observe that for cases where |$p$| is close to |$1$| the reference solutions exhibit flat regions where the gradient vanishes. This observation implies that when |$p$| is close to |$1$| the assumption in Theorem 2.4 that |$\nabla u^{*}$| does not vanish may not hold. On the other hand, for cases where |$p$| is large, the reference solutions display peaks, leading to singular behaviour in the solution.
In Figs 3, 4, 5, 6, 7, 8, we depict the relative energy errors

Decay of the relative energy error (5.1) in the two-level additive Schwarz method (Algorithm 1) for the |$p$|-Laplacian problem (2.1) (|$p = 1.05$|). Parameters |$h$|, |$H$| and |$\delta $| stand for the characteristic element size, subdomain size and overlapping width among subdomains, respectively (|$H/h = 2^{3}$|).

Decay of the relative energy error (5.1) in the two-level additive Schwarz method (Algorithm 1) for the |$p$|-Laplacian problem (2.1) (|$p = 1.1$|). Parameters |$h$|, |$H$| and |$\delta $| stand for the characteristic element size, subdomain size and overlapping width among subdomains, respectively (|$H/h = 2^{3}$|).

Decay of the relative energy error (5.1) in the two-level additive Schwarz method (Algorithm 1) for the |$p$|-Laplacian problem (2.1) (|$p = 1.5$|). Parameters |$h$|, |$H$| and |$\delta $| stand for the characteristic element size, subdomain size and overlapping width among subdomains, respectively (|$H/h = 2^{3}$|).

Decay of the relative energy error (5.1) in the two-level additive Schwarz method (Algorithm 1) for the |$p$|-Laplacian problem (2.1) (|$p = 5$|). Parameters |$h$|, |$H$| and |$\delta $| stand for the characteristic element size, subdomain size and overlapping width among subdomains, respectively (|$H/h = 2^{3}$|).

Decay of the relative energy error (5.1) in the two-level additive Schwarz method (Algorithm 1) for the |$p$|-Laplacian problem (2.1) (|$p = 10$|). Parameters |$h$|, |$H$| and |$\delta $| stand for the characteristic element size, subdomain size and overlapping width among subdomains, respectively (|$H/h = 2^{3}$|).

Decay of the relative energy error (5.1) in the two-level additive Schwarz method (Algorithm 1) for the |$p$|-Laplacian problem (2.1) (|$p = 20$|). Parameters |$h$|, |$H$| and |$\delta $| stand for the characteristic element size, subdomain size and overlapping width among subdomains, respectively (|$H/h = 2^{3}$|).
of Algorithm 1 under various settings on |$p$|, |$h$|, |$H$| and |$\delta $|. More precisely, in Figs 5 and 6 we choose |$p$| as moderate values |$1.5$| and |$5$|, in Figs 3 and 4 |$p$| is chosen very close to |$1$| (|$p = 1.05, 1.1$|) and in Figs 7 and 8 |$p$| is chosen large (|$p = 10, 20$|). In all figures |$h$| and |$H$| vary such that |$H/h = 2^{3}$|, and |$\delta $| is chosen as |$\delta \in \{ 2^{0}h, 2^{1}h, 2^{2} h \}$|.
In every case we observe that the convergence curve of the relative energy error with respect to the number of iterations |$n$| appears linear in the |$x$|-linear |$y$|-log scale plot when |$n$| is large enough, consistent with our theoretical result presented in Theorem 2.4. It is noteworthy that even in cases where |$p$| is very close to |$1$| (see Figs 3 and 4), where Theorem 2.4 cannot be applied due to the flat region in the solution |$u^{*}$| as shown in Fig. 2(a, b), the convergence curve still appears linear. However, a theoretical explanation for the linear convergence in these cases is currently lacking.
On the other hand, for each |$p$|, we observe that the asymptotic convergence rate of Algorithm 1 shown in Figs 3, 4, 5, 6, 7, 8 remains bounded when |$h$| decreases keeping |$H / \delta $| constant. This behaviour aligns with the dependence of |$\gamma $| to |$H/ \delta $| explained in Theorem 2.4. Moreover, this observation implies that Algorithm 1 is numerically scalable; the asymptotic linear convergence rate is uniformly bounded when the ratio of the subdomain size to the overlapping width is fixed.
6. Conclusion
In this paper we developed a new convergence theory for additive Schwarz methods for boundary value problems involving the |$p$|-Laplacian. To the best of our knowledge our theory is the first theoretical result that explains the asymptotic linear convergence of additive Schwarz methods for the |$p$|-Laplacian. Our work successfully bridges the gap between theory and practice by demonstrating that our theoretical findings align well with numerical results.
While the convergence theory of subspace correction methods for linear problems appears to be well-developed (Xu & Zikatanov, 2002; Lee et al., 2008), there remains a need for further research on the theory of subspace correction methods for nonlinear problems. We believe that our result can serve as a foundation for the sharp convergence theory of general subspace correction methods for complex nonlinear problems.
Funding
NSF-DMS (2208499 to Y.-J.L.); Shapiro Fellowship from Penn State University in the Spring of 2022; REP grant for the year of 2022, from Texas State University; National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. 2021R1C1C2095193 to J.P.).
References