Double-Layer Potentials, Configuration Constants, and Applications to Numerical Ranges

Abstract

Given a compact convex planar domain |$\Omega $| with non-empty interior, the classical Neumann’s configuration constant |$c_{\mathbb{R}}(\Omega )$| is the norm of the Neumann–Poincaré operator |$K_\Omega $| acting on the space of continuous real-valued functions on the boundary |$\partial \Omega $|⁠, modulo constants. We investigate the related operator norm |$c_{\mathbb{C}}(\Omega )$| of |$K_\Omega $| on the corresponding space of complex-valued functions, and the norm |$a(\Omega )$| on the subspace of analytic functions. This change requires introduction of techniques much different from the ones used in the classical setting. We prove the equality |$c_{\mathbb{R}}(\Omega ) = c_{\mathbb{C}}(\Omega )$|⁠, the analytic Neumann-type inequality |$a(\Omega ) < 1$|⁠, and provide various estimates for these quantities expressed in terms of the geometry of |$\Omega $|⁠. We apply our results to estimates for the holomorphic functional calculus of operators on Hilbert space of the type |$\|p(T)\| \leq K \sup _{z \in \Omega } |p(z)|$|⁠, where |$p$| is a polynomial and |$\Omega $| is a domain containing the numerical range of the operator |$T$|⁠. Among other results, we show that the well-known Crouzeix–Palencia bound |$K \leq 1 + \sqrt{2}$| can be improved to |$K \leq 1 + \sqrt{1 + a(\Omega )}$|⁠. In the case that |$\Omega $| is an ellipse, this leads to an estimate of |$K$| in terms of the eccentricity of the ellipse.

1 Introduction

1.1 Double-layer potential

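Let |$\Omega $| be a compact convex planar domain with non-empty interior. The double-layer potential of a continuous function |$f$| on the boundary |$\partial \Omega $| is the function

$$ \begin{align}& u(z) = \frac{1}{\pi} \int_{\partial \Omega} f(\sigma) \, \textrm{d} \arg( \sigma - z) = \frac{1}{\pi} \int_{\partial \Omega} f(\sigma) \,\textrm{Re} \Bigg( \frac{N(\sigma)}{\sigma - z} \Bigg) \textrm{d}s, \quad z \in \Omega^{o}.\end{align} $$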

(1)

Here |$ds = |d\sigma |$| is the arclength measure on the rectifiable curve |$\partial \Omega $|, |$\Omega ^{o}$| is the interior of |$\Omega $|, and |$N(\sigma )$| is the outward-pointing unit normal at the boundary point |$\sigma $|. The equality between the two expressions for |$u(z)$| above follows from an elementary computation in the case that |$\partial \Omega $| is sufficiently smooth. In the general case, we interpret |$N(\sigma )(\sigma - z)^{-1}$| as a Borel measurable function on |$\partial \Omega $|. By convexity of the domain, both the unit tangent |$T(\sigma )$| and the normal |$N(\sigma )$| exist and are continuous at all but countably many points |$\sigma $|, which we will call corners, at which the discontinuity of |$T$| and |$N$| amounts to a jump in the argument. In Appendix A we include more details regarding boundaries of planar convex domains, and other facts mentioned below.

The Neumann–Poincaré operator appears in connection with the study of boundary behaviour of the double-layer potential. It is known that |$u$| given by (1) has a continuous extension to |$\partial \Omega $|⁠, and we have the representation

$$ \begin{align}& u(\zeta) = f(\zeta) + K_\Omega f(\zeta), \quad \zeta \in \partial \Omega\end{align} $$

(2)

where |$K_\Omega $| denotes the Neumann–Poincaré integral operator

$$ \begin{align*} & K_\Omega f(\zeta):= \int_{\partial \Omega} f(\sigma)\, \textrm{d}\mu_\zeta( \sigma), \quad \zeta \in \partial \Omega. \end{align*} $$

Here |$\mu _\zeta $| is the probability measure

$$ \begin{align}& d\mu_\zeta = (1-\theta_\zeta / \pi) d \delta_\zeta + \rho_\zeta ds\end{align} $$

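(3)

where |$\delta _\zeta $| is the unit point mass at |$\zeta $|, |$\theta _\zeta $| is the interior angle of |$\partial \Omega $| at |$\zeta $|, and the density |$\rho _\zeta $| of the absolutely continuous part is given by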

$$ \begin{align}& \rho_\zeta(\sigma):= \frac{d \mu_\zeta}{ds}(\sigma):= \frac{1}{\pi}\textrm{Re} \Bigg( \frac{N(\sigma)}{\sigma - \zeta} \Bigg) = \frac{1}{\pi}\textrm{Im} \Bigg( \frac{T(\sigma)}{\sigma - \zeta}\Bigg).\end{align} $$

(4)

It is natural to use the convention that |$\theta _\zeta = \pi $| if |$\zeta $| is not a corner. This occurs precisely when |$\mu _\zeta $| assigns no mass to the singleton |$\{\zeta \}$|⁠. We will say that the collection of measures |$\{\mu _\zeta \}_{\zeta \in \partial \Omega }$| is the Neumann–Poincaré kernel of |$\Omega $|⁠.

The density |$\rho _\zeta $| has the following useful geometric interpretation. If |$\sigma \in \partial \Omega \setminus \{\zeta \}$| is not a corner, and |$R_{\zeta , \sigma }$| is the radius of the unique circle passing through |$\zeta $| that is tangent to |$\partial \Omega $| at |$\sigma $|⁠, then the equality

$$ \begin{align}& \rho_\zeta(\sigma) = \frac{1}{2 \pi R_{\zeta,\sigma}}\end{align} $$

(5)

holds. The radius |$R_{\zeta , \sigma }$| may degenerate to |$\infty $| if |$\zeta $| is contained in the tangent line to |$\partial \Omega $| passing through |$\sigma $|⁠. In that case we see easily that |$\rho _\zeta (\sigma ) = 0$|⁠, so (5) still holds. To establish the formula, note that the center |$m$| of the circle in question is of the form |$m = \sigma - R N(\sigma )$|⁠, where the radius |$R = R_{\zeta ,\sigma }> 0$| of the circle satisfies |$|m - \zeta |^{2} = |(\sigma - \zeta ) - R N(\sigma )|^{2} = R^{2}$|⁠. Expanding the squares and solving for |$R$| leads to (5).
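For completeness, the computation behind (5) can be written out as follows. The abbreviation |$w:= \sigma - \zeta $| is introduced here only for brevity, and we use that |$|N(\sigma )| = 1$|.

$$ \begin{align*} & |w - R N(\sigma)|^{2} = |w|^{2} - 2R \,\textrm{Re} \big( w \overline{N(\sigma)} \big) + R^{2} = R^{2} \quad \Longrightarrow \quad R = \frac{|w|^{2}}{2 \,\textrm{Re} \big( w \overline{N(\sigma)} \big)}, \\ & \textrm{Re} \Bigg( \frac{N(\sigma)}{w} \Bigg) = \frac{\textrm{Re} \big( N(\sigma) \overline{w} \big)}{|w|^{2}} = \frac{1}{2R}, \quad \text{so that} \quad \rho_\zeta(\sigma) = \frac{1}{\pi} \cdot \frac{1}{2R} = \frac{1}{2 \pi R_{\zeta,\sigma}}. \end{align*} $$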

Fig. 1

Example domain |$\Omega $| with corner of angle |$\theta _{\zeta ^{\prime}}$| at |$\zeta ^{\prime}$|⁠, and a circle of radius |$R_{\zeta , \sigma }$| with center |$m$|⁠, tangent to |$\partial \Omega $| at |$\sigma $| and passing through |$\zeta $|⁠.

1.2 Neumann’s configuration constant

1.2.1 Real configuration constant

Historically, the Neumann–Poincaré operator has been used to solve the Dirichlet problem of finding a harmonic extension to |$\Omega ^{o}$| of a given continuous function |$u$| on |$\partial \Omega $|. The extension can be obtained by finding a function |$f \in C(\partial \Omega )$| that solves (2). Indeed, if such an |$f$| is found, then the extension of |$u$| to |$\Omega ^{o}$| is given by the double-layer potential in (1). This naturally leads to questions of invertibility of the operator |$I + K_\Omega $| appearing on the right-hand side of (2), and consequently to the introduction of Neumann’s configuration constant, which we shall soon define as the operator norm of |$K_\Omega $| acting on an appropriate space. Note that if |$\mathbf{1}$| denotes the constant function equal to one, then |$K_\Omega \mathbf{1} = \mathbf{1}$|, since each |$\mu _\zeta $| is a probability measure. Thus |$K_\Omega $| can be naturally defined as a linear mapping on the quotient space |$C(\partial \Omega )/{\mathbb{C}} \mathbf{1}$|. The classical approach is to instead consider |$K_\Omega $| as acting on the space of real-valued continuous functions |$C_{\mathbb{R}}(\partial \Omega )$|, in which case the corresponding quotient space |$C_{\mathbb{R}}(\partial \Omega )/{\mathbb{R}} \mathbf{1}$| is endowed with the norm

$$ \begin{align}& \| g + {\mathbb{R}} \mathbf{1} \|_{\partial \Omega}:= \max_{\zeta, \zeta^{\prime} \in \partial \Omega} \frac{|g(\zeta) - g(\zeta^{\prime})|}{2} = \min_{r \in{\mathbb{R}}} \, \max_{\zeta \in \partial \Omega} |g(\zeta) - r|.\end{align} $$

(6)

It is not hard to see that the two expressions above for the norm of the coset |$g + {\mathbb{R}} \mathbf{1}$| are equal: each is half of the length of the interval |$g(\partial \Omega ):= \{ g(\zeta ): \zeta \in \partial \Omega \}$|, the image of |$g$|. The right-most expression is minimized by choosing |$r$| to be the mid-point of the image interval. Neumann’s (real) configuration constant |$c_{\mathbb{R}}(\Omega )$| is defined as the operator norm of |$K_\Omega $| acting on the quotient space |$C_{\mathbb{R}}(\partial \Omega )/ {\mathbb{R}} \mathbf{1}$|:

$$ \begin{align}& c_{\mathbb{R}}(\Omega):= \| K_\Omega: C_{\mathbb{R}}(\partial \Omega)/ {\mathbb{R}} \mathbf{1} \to C_{\mathbb{R}}(\partial \Omega)/ {\mathbb{R}} \mathbf{1} \|.\end{align} $$

(7)

It is not hard to see that we may let |$K_\Omega $| instead act from |$C_{\mathbb{R}}(\partial \Omega )$| into the quotient |$C_{\mathbb{R}}(\partial \Omega ) / {\mathbb{R}} \mathbf{1}$| without affecting the operator norm. Since each measure |$\mu _\zeta $| is of unit mass, we have |$0 \leq c_{\mathbb{R}}(\Omega ) \leq 1$|⁠. If

$$ \begin{align*} & \|f \|_{\partial \Omega}:= \sup_{\zeta \in \partial \Omega} |f(\zeta)| \leq 1, \end{align*} $$

then

$$ \begin{align*} &|K_\Omega f(\zeta) - K_\Omega f(\zeta^{\prime})| \leq \| \mu_{\zeta} - \mu_{\zeta^{\prime}}\|,\end{align*} $$

where we use the total variation norm (functional norm) on the right-hand side. By varying |$f$| over the unit ball of |$C_{\mathbb{R}}(\partial \Omega )$| and |$\zeta , \zeta ^{\prime}$| over |$\partial \Omega $|⁠, we obtain the important relation

$$ \begin{align}& c_{\mathbb{R}}(\Omega) = \sup_{\zeta, \zeta^{\prime} \in \partial \Omega} \frac{\| \mu_{\zeta} - \mu_{\zeta^{\prime}}\|}{2}.\end{align} $$

(8)

This expression for |$c_{\mathbb{R}}(\Omega )$| will play a fundamental role in our study.
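As a rough numerical illustration of (8), the following Python sketch approximates the total variation distances between the measures |$\mu _\zeta $| for an ellipse, chosen only as a convenient smooth example. It assumes a boundary without corners, so that |$d\mu _\zeta = \rho _\zeta ds$|, and it uses the identity |$\rho _\zeta ds = \pi ^{-1} \textrm{Im} \big ( \gamma ^{\prime}(t) / (\gamma (t) - \zeta ) \big ) dt$| for a counterclockwise parametrization |$\gamma $|, which follows from (4). The function name, the grid sizes, and the midpoint quadrature are assumptions of this sketch and are not taken from the article.

```python
import numpy as np

def ellipse_configuration_constant(a, b, n_zeta=32, refine=64):
    """Approximate sup ||mu_z - mu_z'|| / 2 for the ellipse with semi-axes a, b."""
    n_sigma = n_zeta * refine
    # Boundary points zeta on a coarse grid; integration nodes sigma on a
    # staggered fine grid, so that sigma never coincides with zeta.
    s = 2 * np.pi * np.arange(n_zeta) / n_zeta
    t = 2 * np.pi * (np.arange(n_sigma) + 0.5) / n_sigma
    dt = 2 * np.pi / n_sigma

    def gamma(u):   # counterclockwise parametrization of the boundary
        return a * np.cos(u) + 1j * b * np.sin(u)

    def dgamma(u):  # derivative of the parametrization
        return -a * np.sin(u) + 1j * b * np.cos(u)

    # dens[j, k]: density of mu_{gamma(s_j)} with respect to dt at gamma(t_k).
    dens = np.imag(dgamma(t)[None, :] / (gamma(t)[None, :] - gamma(s)[:, None])) / np.pi

    masses = dens.sum(axis=1) * dt  # should be close to 1 (probability measures)
    # Total variation distance between every pair of measures (midpoint rule).
    tv = np.abs(dens[:, None, :] - dens[None, :, :]).sum(axis=2) * dt
    return tv.max() / 2, masses

c_approx, masses = ellipse_configuration_constant(2.0, 1.0)
print(c_approx, masses.min(), masses.max())
```

For a disk (|$a = b$|) all the densities coincide and the computed value is zero up to rounding, in line with the characterization of disks mentioned in Section 1.2.2.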

1.2.2 Neumann’s lemma

From (8) we can immediately deduce that |$c_{\mathbb{R}}(\Omega ) = 1$| in the case that |$\Omega $| is a triangle or a convex quadrilateral. Indeed, in those cases one sees from (3) and (4) that if |$\zeta _{1}$| and |$\zeta _{2}$| are corners of |$\Omega $| (opposing, in the case of the quadrilateral) then |$\mu _{\zeta _{1}}$| and |$\mu _{\zeta _{2}}$| are mutually singular, and so |$\| \mu _{\zeta _{1}} - \mu _{\zeta _{2}}\| = 2$|, implying |$c_{\mathbb{R}}(\Omega ) = 1$|. Neumann’s lemma, which first appeared in Neumann’s book [14], states that the triangle and the quadrilateral are the only exceptional cases: for any other type of domain we have the strict inequality |$c_{\mathbb{R}}(\Omega ) < 1$|. See [17] for a proof of this claim by Schober, and for the curious history of incomplete attempts at a valid proof in full generality. Neumann’s lemma implies the invertibility of |$I + K_\Omega $| on |$C_{\mathbb{R}}(\partial \Omega )/ {\mathbb{R}} \mathbf{1}$|, and thus the solvability of the Dirichlet problem on any convex domain |$\Omega $| that is not one of the two exceptional cases. The exceptional cases can be handled by considering instead powers of |$K_\Omega $|. See, for instance, [13, Theorem 3.8], [6, Proposition 7], or the article [16], which also contains an exposition of the double-layer potential and Neumann’s lemma.

At the other extreme, we have |$c_{\mathbb{R}}(\Omega ) = 0$| if and only if |$\Omega $| is a disk. This result will be proved in Section 5.

1.3 Complex and analytic configuration constants

1.3.1 Two new configuration constants

In the present article we will discuss certain applications of the double-layer potential to operator theory, which motivate the definition of the complex configuration constant

$$ \begin{align}& c_{\mathbb{C}}(\Omega):= \| K_\Omega: C(\partial \Omega)/ {\mathbb{C}} \mathbf{1} \to C(\partial \Omega)/ {\mathbb{C}} \mathbf{1} \|.\end{align} $$

(9)

The difference between (7) and (9) is that the latter is the norm of |$K_\Omega $| on the larger space of complex-valued functions. As a consequence, we have |$c_{\mathbb{R}}(\Omega ) \leq c_{\mathbb{C}}(\Omega ) \leq 1$|⁠. There is a principal difference between the geometric interpretations of the norms in the quotient spaces |$C_{\mathbb{R}}(\partial \Omega ) / {\mathbb{R}} \mathbf{1}$| and |$C(\partial \Omega ) / {\mathbb{C}} \mathbf{1}$|⁠. In the former case, as we have already noted, the norm (6) of the coset represented by the real-valued function |$g$| is equal to half of the length of the image of |$g$|⁠, this image being an interval on the real line |${\mathbb{R}}$|⁠. In the case of complex-valued |$g$|⁠, the quotient norm

$$ \begin{align}& \| g + {\mathbb{C}} \mathbf{1}\|_{\partial \Omega}:= \min_{\lambda \in{\mathbb{C}}} \max_{\zeta \in \partial \Omega} | g(\zeta) - \lambda|\end{align} $$

(10)

can instead be interpreted as the radius of the smallest disk containing the image of |$g$|. A crucial difference is that we lose the ability to estimate the norm of the coset |$g + {\mathbb{C}} \mathbf{1}$| by considering the quantities |$|g(\zeta ) - g(\zeta ^{\prime})|$| only. This is the essence of why new tools are required to treat this case.
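The gap between the quotient norm (10) and the half-diameter can be seen on a three-point image. The sketch below is only an illustration: the helper `min_enclosing_radius` is a hypothetical name, and it relies on the elementary fact that the smallest disk containing a triangle is the circumscribed disk when the triangle is acute, and otherwise the disk having the longest side as a diameter.

```python
import math

def min_enclosing_radius(p, q, r):
    """Radius of the smallest closed disk containing three complex numbers."""
    x, y, z = sorted([abs(p - q), abs(q - r), abs(r - p)])  # z is the longest side
    if z * z >= x * x + y * y:
        # Right, obtuse, or degenerate triangle: the longest side is a diameter.
        return z / 2
    # Acute triangle: circumradius = (product of the sides) / (4 * area).
    area = abs(((q - p).conjugate() * (r - p)).imag) / 2
    return x * y * z / (4 * area)

# Vertices of an equilateral triangle inscribed in the unit circle.
pts = [complex(math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3))
       for k in range(3)]
radius = min_enclosing_radius(*pts)
half_diameter = max(abs(u - v) for u in pts for v in pts) / 2
print(radius, half_diameter)
```

For these three points the radius is 1 while the half-diameter is |$\sqrt{3}/2$|, so pairwise differences alone underestimate the quotient norm.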

We will also study an analogous analytic constant, which is the norm of the operator |$K_\Omega $| restricted to the subspace of analytic functions in |$C(\partial \Omega )$|. More precisely, we let |$\mathcal{A}(\Omega )$| be the space of functions that are continuous in |$\Omega $| and analytic in |$\Omega ^{o}$|. By the maximum principle, each function in |$\mathcal{A}(\Omega )$| is uniquely determined by its restriction to |$\partial \Omega $|, and thus |$\mathcal{A}(\Omega )$| can be naturally identified with a subspace of |$C(\partial \Omega )$|. We define the analytic configuration constant as

$$ \begin{align}& a(\Omega):= \| K_\Omega: \mathcal{A}(\Omega) / {\mathbb{C}}\mathbf{1} \to C(\partial \Omega) / {\mathbb{C}}\mathbf{1} \|.\end{align} $$

(11)

The space |$\mathcal{A}(\Omega )$| is not invariant under |$K_\Omega $|⁠, but we do have that |$K_\Omega f$| is the complex conjugate of a function in |$\mathcal{A}(\Omega )$| (in [5, proof of Lemma 2.1] this claim is established for |$\Omega $| with smooth boundary, but the same argument works in general). Clearly, we have the inequality |$a(\Omega ) \leq c_{\mathbb{C}}(\Omega )$|⁠. We note also that if |$\widetilde{\Omega }$| is the image of |$\Omega $| under an affine transformation of the plane, then the configuration constants of the two domains are equal. We shall verify this claim in Section 6.

1.3.2 An application to functional calculi

Given an operator |$T$| on a Hilbert space |$\mathcal{H}$| with numerical range

$$ \begin{align*} & W(T):= \{ \langle Tx, x \rangle_{\mathcal{H}}: x \in{\mathcal{H}}, \|x\|_{{\mathcal{H}}} = 1\}, \end{align*} $$

we are interested in the optimal constant |$K> 0$| in the inequality

$$ \begin{align}& \|p(T)\| \leq K\cdot \sup_{z \in W(T)} |p(z)| = K \|p\|_{W(T)},\end{align} $$

(12)

where |$p$| is an analytic polynomial, and the left-hand side is the operator norm of |$p(T)$| acting on |$\mathcal{H}$|⁠. More generally, if |$W(T)$| in (12) is replaced by an arbitrary domain |$\Omega $|⁠, and if the corresponding inequality holds for some |$K$|⁠, then we say that |$\Omega $| is a |$K$|-spectral set for |$T$|⁠. Von Neumann’s inequality says that the unit disk is a |$1$|-spectral set for any contraction |$T$|⁠, and a result of Okubo–Ando from [15] says that any disk containing |$W(T)$| is a |$2$|-spectral set for |$T$|⁠.

The numerical range |$W(T)$| is a bounded convex subset of the plane, its closure |$\overline{W(T)}$| contains the spectrum |$\sigma (T)$| of |$T$|, and it has non-empty interior in the case that |$T$| is not a normal operator (see, for instance, [10, Chapter 1]). For normal operators, the bound (12) with constant |$K = 1$| is a consequence of the spectral theorem, and it suffices to take the supremum on the right-hand side over the smaller set |$\sigma (T)$|. For general |$T$|, even establishing the existence of a bound as in (12) is a non-trivial task. A result of Delyon–Delyon from [6, Theorem 3] establishes the existence of the bound, and shows that |$K$| can be chosen depending only on the area and the diameter of |$W(T)$|. The remarkable work of Crouzeix in [3] establishes that (12) holds with |$K \leq 11.08$|. A subsequent work of Crouzeix and Palencia in [5] improves the estimate to |$K \leq 1 + \sqrt{2}$|. The Neumann–Poincaré operator appears as an essential tool in all of the mentioned works. The standing conjecture of Crouzeix from [2] is that the bound holds with |$K = 2$|. At present, the conjecture is known to hold when |$\mathcal{H}$| has dimension 2, a case established by Crouzeix in [2].
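To make the inequality (12) concrete, the following Python sketch examines a standard two-dimensional example, chosen here for illustration and not taken from the article: the nilpotent matrix with a single off-diagonal entry equal to 2. It uses the well-known fact that the support function of |$W(T)$| in the direction |$e^{i\theta }$| is the largest eigenvalue of the Hermitian part of |$e^{-i\theta }T$|. The function name `support_function` is hypothetical.

```python
import numpy as np

def support_function(T, theta):
    """max of Re(e^{-i theta} w) over w in W(T): top eigenvalue of the Hermitian part."""
    A = np.exp(-1j * theta) * T
    H = (A + A.conj().T) / 2
    return np.linalg.eigvalsh(H)[-1]  # eigvalsh returns eigenvalues in ascending order

T = np.array([[0.0, 2.0], [0.0, 0.0]], dtype=complex)
thetas = np.linspace(0.0, 2 * np.pi, 360, endpoint=False)
h = np.array([support_function(T, th) for th in thetas])

op_norm = np.linalg.norm(T, 2)  # operator (spectral) norm of p(T) for p(z) = z
print(h.min(), h.max(), op_norm)
```

The support function equals 1 in every direction, so the closure of |$W(T)$| is the closed unit disk, while |$\|T\| = 2$|. With |$p(z) = z$| this shows that no constant smaller than 2 can work in (12) for all operators, which is consistent with the conjectured value |$K = 2$|.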

Our interest in the new notions of configuration constants is inspired by a recent work of Schwenninger and de Vries in [18], where bounds for general homomorphisms between uniform algebras and the algebras of bounded linear operators are studied. In Section 6 we will combine their arguments with the methods of Crouzeix–Palencia to obtain the following estimate:

$$ \begin{align}& \|p(T)\| \leq \bigg( 1 + \sqrt{1 + a(W)} \bigg) \| p \|_{W}, \quad W:= \overline{W(T)}.\end{align} $$

(13)

For instance, if |$W$| is a disk, then |$a(W) = 0$|⁠, which gives the Okubo–Ando result mentioned above. In [18], Schwenninger and de Vries recovered this result also. The estimate (13) is our motivation for the following investigation of the configuration constants |$c_{\mathbb{R}}(\Omega ), c_{\mathbb{C}}(\Omega )$|⁠, and |$a(\Omega )$|⁠, and the relations between them.

1.4 Main results

1.4.1 Relation between the real and complex constants

Consider the situation in Figure 2, where the triangular image of the complex-valued function |$g: \partial \Omega \to{\mathbb{C}}$| is contained in a disk of radius |$1$|⁠, and intersects the boundary circle of the disk in three distinct points. The three-point set |$\{g(\zeta _{1}), g(\zeta _{2}), g(\zeta _{3})\}$| is not contained in any open half-circle of the boundary, and it follows from a simple geometric argument (which we shall present in the proofs below) that |$\| g + {\mathbb{C}} \mathbf{1}\|_{\partial \Omega } = 1$|⁠. However, the sides of the triangular image of |$g$| are all of lengths strictly less than |$2$|⁠, and this implies that

$$ \begin{align*} &\| g + {\mathbb{C}} \mathbf{1}\|_{\partial \Omega} = 1> \max_{\zeta, \zeta^{\prime} \in \partial \Omega} \frac{|g(\zeta)- g(\zeta^{\prime})|}{2}.\end{align*} $$

Fig. 2

A triangular image of a complex-valued function |$g$| contained in a disk of radius |$1$|, with three points on the boundary of the disk.

If such a function |$g$| lies in the image of the unit ball of |$C(\partial \Omega )$| under the Neumann–Poincaré operator |$K_\Omega $| for some domain |$\Omega $| that satisfies |$c_{\mathbb{R}}(\Omega ) < 1$|⁠, then a strict inequality |$c_{\mathbb{R}}(\Omega ) < c_{\mathbb{C}}(\Omega )$| occurs. Our first main result excludes this possibility, and so establishes the simplest possible relation between the real and complex configuration constants.

Theorem 1.

The equality

$$ \begin{align*} & c_{\mathbb{R}}(\Omega) = c_{\mathbb{C}}(\Omega) \end{align*} $$

holds for every compact convex domain |$\Omega $| with non-empty interior.

It follows that every considered domain has a well-defined configuration constant |$c(\Omega )$|⁠, which is equal to the operator norm of |$K_\Omega $| on |$C(\partial \Omega ) / {\mathbb{C}} \mathbf{1}$|⁠, and which can be computed according to the right-hand side of (8). An important consequence of this result is the inequality

$$ \begin{align}& a(\Omega) \leq c(\Omega) = \sup_{\zeta, \zeta^{\prime} \in \partial \Omega} \frac{\| \mu_{\zeta} - \mu_{\zeta^{\prime}}\|}{2},\end{align} $$

(14)

which, as we shall soon see, has some interesting consequences.

Theorem 1 does not appear nearly as straightforward to prove as it is to state, and the proof takes up a large portion of the article. However, the only property of the Neumann–Poincaré operator used in the proof is that its integral kernel |$\{\mu _\zeta \}_{\zeta \in \partial \Omega }$| consists of real-valued measures. In fact, the theorem will be deduced as a corollary of a result that we call the Three-measures theorem, a general statement regarding the geometry of the space |$C(X)$| of continuous functions on a compact Hausdorff space |$X$|. This result, which we discuss and prove in Section 2, puts a restriction on the possible configurations of point sets in the plane that arise as values of a collection of real-valued functionals on |$C(X)$|.

1.4.2 Analytic Neumann’s lemma

Note that the above estimate in (14), together with Neumann’s lemma, implies that |$a(\Omega ) < 1$| whenever |$\Omega $| is not a triangle or a quadrilateral. This can be improved, for we have an analytic version of Neumann’s lemma, in which no exceptional cases occur.

Theorem 2.

The strict inequality

$$ \begin{align*} & a(\Omega) < 1 \end{align*} $$

holds for every compact convex domain |$\Omega $| with non-empty interior.

Our proof of Theorem 2 is quite different from Schober’s proof of the real Neumann’s lemma in [17], but it works also in the real context. At the end of Section 4 we show how our technique leads to a different proof of Neumann’s lemma.

1.4.3 Functional calculus bounds

The following result has already been mentioned above.

Theorem 3.

Let |$T: {\mathcal{H}} \to{\mathcal{H}}$| be a bounded linear operator on a Hilbert space |${\mathcal{H}}$| with numerical range |$W(T)$|⁠, which has non-empty interior. Then, for every polynomial |$p$|⁠, we have

$$ \begin{align*} & \| p(T)\| \leq \Bigl( 1 + \sqrt{1 + a(W)} \Bigr) \| p \|_{W(T)}. \end{align*} $$

Recall that if the numerical range of an operator has empty interior, then the operator is normal, and so (12) holds with |$K = 1$|⁠. From this observation and Theorem 2 we obtain that for any fixed operator |$T:{\mathcal{H}} \to{\mathcal{H}}$|⁠, the optimal constant |$K$| in (12) is always strictly smaller than |$1 + \sqrt{2}$|⁠. In fact, we deduce from our results that we have the inequality

$$ \begin{align*} & \|p(T)\| \leq K_{W} \| p \|_{W} \end{align*} $$

with a constant

$$ \begin{align*} &K_{W} < 1 + \sqrt{2},\end{align*} $$

which depends only on the shape of |$W = \overline{W(T)}$|⁠, and not on the operator |$T$| itself. We show in Section 5 that no better universal bound can be obtained by means of the analytic configuration constant: for any |$\epsilon> 0$| there exists a “thin” quadrilateral |$\Omega _\epsilon $| for which we have |$a(\Omega _\epsilon )> 1 - \epsilon $|⁠. However, fixing the dimension of the Hilbert space |${\mathcal{H}}$|⁠, one may combine earlier results of Crouzeix to obtain a uniform improvement. The optimal constant |$K$| in (12) varies with |$T$|⁠, and we may consider the supremum of these quantities among all operators |$T$| on a Hilbert space |${\mathcal{H}}$| of a fixed dimension |$N$|⁠. In [4, Theorem 2.2], Crouzeix proved that there exists an operator realizing this supremum. An immediate corollary of his result, Theorem 2 and Theorem 3 is the following.

Corollary 4.

For every positive integer |$N$|⁠, there exists a constant |$C_{N} < 1 + \sqrt{2}$| for which we have

$$ \begin{align*} & \| p(T)\| \leq C_{N} \|p\|_{W(T)}\end{align*} $$

whenever |$T$| is an operator on an |$N$|-dimensional Hilbert space, and |$p$| is a polynomial.

This improves the Crouzeix–Palencia bound, although by an unspecified amount.

1.4.4 Estimates for the configuration constants

In Section 5 we present also other computations and estimates for the configuration constants. Surprisingly, in the case of an elliptical domain, the configuration constant is computable exactly, and we obtain

$$ \begin{align*} & c(\Omega_{a,b}) = \frac{2}{\pi}\arctan\Bigl(\frac{1}{2}\Bigl|\frac{b}{a}-\frac{a}{b}\Bigr|\Bigr) = \frac{2}{\pi}\arctan\Bigl( \frac{1}{2} \frac{e^{2}}{\sqrt{1-e^{2}}}\Bigr)\end{align*} $$

where |$a$| and |$b$| are the lengths of the semi-axes of the ellipse |$\Omega _{a,b}$|, and |$e$| is the eccentricity of the ellipse, given by |$e:= \sqrt{1 - b^{2}/a^{2}}$| in the case |$a \geq b$|. This fact, together with Theorem 1, estimate (13), and the inequality |$a(\Omega ) \leq c(\Omega )$|, has the following consequence.

Corollary 5.

Let |$T: {\mathcal{H}} \to{\mathcal{H}}$| be a bounded linear operator on a Hilbert space |${\mathcal{H}}$| with numerical range contained in (or equal to) the ellipse |$\Omega _{a,b}$|⁠. Then, for every polynomial |$p$|⁠, we have

$$ \begin{align*} & \|p(T)\| \leq K(a,b) \| p \|_{\Omega_{a,b}}, \end{align*} $$

where

$$ \begin{align*} & K(a,b):= 1 + \sqrt{1 + \frac{2}{\pi}\arctan\Bigl(\frac{1}{2}\Bigl|\frac{b}{a}-\frac{a}{b}\Bigr|\Bigr)}.\end{align*} $$

Note that the function |$a \mapsto K(a,1)$| is continuous and increasing for |$a \geq 1$|⁠, and we have

$$ \begin{align*} & \lim_{a \to \infty} K(a,1) = 1 + \sqrt{2}, \quad \lim_{a \to 1} K(a,1) = 2.\end{align*} $$

Hence the estimate in Corollary 5 gets worse as the eccentricity of the ellipse |$\Omega _{a,b}$| grows, and approaches the Crouzeix–Palencia bound in the limit |$a \to \infty $|⁠. On the other hand, as |$a \to 1$|⁠, the eccentricity of the ellipse |$\Omega _{a,1}$| tends to |$0$|⁠. The estimate is then close to the conjectured optimal bound |$K = 2$| and coincides with the Okubo–Ando bound for |$a=1$|⁠, in which case the domain is a disk. From this perspective, Corollary 5 may be interpreted as an elliptical generalization of the Okubo–Ando estimate.
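The behaviour of |$K(a,1)$| described above is easy to tabulate. The following Python sketch simply evaluates the closed-form expressions from this section; the function names `ellipse_constant` and `spectral_bound` are chosen here for illustration.

```python
import math

def ellipse_constant(a, b):
    """Closed form for c(Omega_{a,b}) as stated in the text."""
    return (2 / math.pi) * math.atan(0.5 * abs(b / a - a / b))

def spectral_bound(a, b):
    """The constant K(a, b) of Corollary 5."""
    return 1 + math.sqrt(1 + ellipse_constant(a, b))

semi_axes = [1.0, 1.5, 2.0, 5.0, 100.0]
bounds = [spectral_bound(a, 1.0) for a in semi_axes]
for a, K in zip(semi_axes, bounds):
    print(a, K)
```

The printed values start at 2 for the disk and increase towards, but stay below, |$1 + \sqrt{2} \approx 2.414$|.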

For many other types of domains, the exact value of |$c(\Omega )$| is inaccessible. To help the situation, we establish an integral estimate, which gives an upper bound on |$c(\Omega )$| in terms of the curvature of |$\partial \Omega $|⁠, roughly speaking. For a fixed |$\sigma $| that is not a corner of |$\partial \Omega $|⁠, recall the definition of |$R_{\zeta , \sigma }$| in (5), and consider

$$ \begin{align}& R_\Omega(\sigma):= \sup_{\zeta \in \partial \Omega} R_{\zeta, \sigma}.\end{align} $$

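(15)

At points |$\sigma $| where the curvature |$\kappa (\sigma )$| of |$\partial \Omega $| is defined, it is recovered from these radii through the limit

$$ \begin{align}& 1/\kappa(\sigma) = \lim_{\zeta \to \sigma} R_{\zeta, \sigma},\end{align} $$

(16)

so that |$R_\Omega (\sigma ) \geq 1/\kappa (\sigma )$|.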

Theorem 6.

With the above notation, we have the estimate

$$ \begin{align*} & c(\Omega) \leq 1-\frac{1}{2\pi}\int_{\partial\Omega}\frac{ \textrm{d}s}{R_\Omega}.\end{align*} $$

The result implies spectral constant estimates similar to the one in Corollary 5 above. It also generalizes some similar results in the literature. See Section 5 for further details and examples.
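As a quick sanity check of Theorem 6, consider a disk of radius |$r$|, a symbol introduced here only for this example. Every circle that is tangent to |$\partial \Omega $| at |$\sigma $| and passes through another boundary point |$\zeta $| is the boundary circle itself, so |$R_{\zeta ,\sigma } = r$| for all |$\zeta $| and hence |$R_\Omega (\sigma ) = r$|. The estimate then reads

$$ \begin{align*} & c(\Omega) \leq 1-\frac{1}{2\pi}\int_{\partial\Omega}\frac{ \textrm{d}s}{r} = 1 - \frac{2\pi r}{2 \pi r} = 0,\end{align*} $$

which agrees with the fact, mentioned in Section 1.2.2, that the configuration constant of a disk vanishes.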

1.4.5 An unresolved matter

We have mentioned above that |$c(\Omega ) = 0$| if and only if |$\Omega $| is a disk. With some additional effort, we will show in Section 5 that the condition |$a(\Omega ) = 0$| also characterizes disks. In this case, we have the equality |$a(\Omega ) = c(\Omega )$|⁠. It is natural to ask whether other domains exist for which the equality occurs, or if the case of the disk is exceptional.

Question.

Do we always have the strict inequality

$$ \begin{align*} & a(\Omega) < c(\Omega)\end{align*} $$

whenever |$\Omega $| is not a disk?

As a consequence of Theorem 2 and the exceptional cases of Neumann’s lemma, we see that the strict inequality holds whenever |$\Omega $| is a triangle or a quadrilateral. The authors have not been able to confirm that the inequality holds in any other examples.

1.5 Notations

Some of our notation has already been introduced above. For a continuous function |$f$| defined on a set |$X$|⁠, we denote by |$\| f \|_{X}$| the supremum of |$|f|$| over |$X$|⁠. For cosets of the form |$f + {\mathbb{C}}\mathbf{1}$| we use the convention

$$ \begin{align*} &\| f + {\mathbb{C}} \mathbf{1}\|_{X}:= \inf_{\lambda \in{\mathbb{C}}} \|f + \lambda \|_{X},\end{align*} $$

with a similar convention for real-valued |$f$| and cosets |$f + {\mathbb{R}}\mathbf{1}$|. A norm |$\| \cdot \|$| without a subscript usually denotes a linear functional norm or a total variation norm of a measure. The distinction will be unimportant and should in any case be easy to deduce from context. We use boldface letters, such as |$\mathbf{x}$|, to denote vectors in |${\mathbb{R}}^{n}$|, and plain letters, such as |$x_{j}$|, to denote the coordinates.

2 The Three-Measures Theorem

2.1 Definitions of relevant spaces and operators

Theorem 1 will be proved as a corollary of our analysis of three-point configurations

$$ \begin{align*} & \big( \ell_{1}(x), \ell_{2}(x), \ell_{3}(x) \big) \in{\mathbb{C}}^{3}, \end{align*} $$

where |$x$| is an element of a given normed space |$\mathcal{N}$|⁠, and |$\ell _{1}, \ell _{2}, \ell _{3} \in \mathcal{N}^{*}$| are three bounded linear functionals on |$\mathcal{N}$|⁠. A point configuration of this type has to satisfy certain conditions. For instance, we must have the distance bound

$$ \begin{align*} & |\ell_{j}(x) - \ell_{k}(x)| \leq \|\ell_{j} - \ell_{k}\|_{\mathcal{N}^{*}} \|x\|_{\mathcal{N}}, \quad 1 \leq j,k \leq 3. \end{align*} $$

Our principal interest will be in estimating the radius of the smallest disk that contains such a three-point set.

In order to use the tools of functional analysis, we will formulate our problem as one of estimating the norm of an operator between normed spaces. To this end, we use the space |${\mathbb{C}}^{3}$| of triples of complex numbers, and we equip it with the following norm:

$$ \begin{align}& \| (a,b,c) \|_\infty:= \max \{|a|, |b|, |c| \}.\end{align} $$

(17)

Similarly to our previous notational conventions, we shall set |$\mathbf{1}:= (1,1,1) \in{\mathbb{C}}^{3}$|⁠. The quotient norm in the quotient space |${\mathbb{C}}^{3} / {\mathbb{C}} \mathbf{1}$| satisfies

$$ \begin{align*} & \| (a,b,c) + {\mathbb{C}} \mathbf{1} \|_{\infty}:= \min_{\lambda \in{\mathbb{C}}} \, \max \{ |a-\lambda|, |b-\lambda|, |c-\lambda|\}, \end{align*} $$

and it has the geometric interpretation adequate to our problem: it is the radius of the smallest disk containing the three-point set |$\{a,b,c\}$|. Given a normed space |$\mathcal{N}$| and three linear functionals |$\ell _{1}, \ell _{2}, \ell _{3} \in \mathcal{N}^{*}$|, we introduce the linear operator |$\mathcal{L}: \mathcal{N} \to{\mathbb{C}}^{3} / {\mathbb{C}} \mathbf{1}$| defined by

$$ \begin{align}& \mathcal{L} x:= \big( \ell_{1}(x), \ell_{2}(x), \ell_{3}(x) \big) + {\mathbb{C}} \mathbf{1}.\end{align} $$

(18)

With these conventions, each three-point configuration |$\big ( \ell _{1}(x), \ell _{2}(x), \ell _{3}(x) \big )$| is contained in a disk of radius at most |$\|\mathcal{L}\|_{\mathcal{N} \to{\mathbb{C}}^{3} / {\mathbb{C}} \mathbf{1}} \cdot \|x\|_\mathcal{N}$|⁠. We want to estimate the operator norm |$\|\mathcal{L}\|_{\mathcal{N} \to{\mathbb{C}}^{3} / {\mathbb{C}} \mathbf{1}}$|⁠.

2.2 Statement of the theorem

Without any information regarding the space |$\mathcal{N}$| or the functionals |$\ell _{1}, \ell _{2}, \ell _{3}$|⁠, the optimal estimate is

$$ \begin{align}& \|\mathcal{L}\|_{\mathcal{N} \to{\mathbb{C}}^{3} / {\mathbb{C}} \mathbf{1}} \leq \frac{1}{\sqrt{3}} \max_{j,k} \|\ell_{j} - \ell_{k}\|.\end{align} $$

(19)

Indeed, we see that we cannot do better by choosing |$\mathcal{N} = {\mathbb{C}}$|⁠, |$x = 1$|⁠, and the functionals (scalars) to be the vertices of an equilateral triangle inscribed in the unit circle. For instance,

$$ \begin{align*} & \ell_{1} = 1, \quad \ell_{2} = -1/2 + i\sqrt{3}/2, \quad \ell_{3} = -1/2 - i\sqrt{3}/2.\end{align*} $$

The sides of the triangle have common length |$| \ell _{j} - \ell _{k}| = \sqrt{3}$|, and the smallest disk containing the three points |$\ell _{j}(x) = \ell _{j}$| is the unit disk itself. Thus, in this case, (19) holds with equality. The estimate holds in general as a consequence of Jung’s theorem, which appeared first in [11], and which in the context of the plane says that any set of diameter |$d$| is contained in a disk of radius |$d/\sqrt{3}$|. In our setting, the three-point configuration of a unit vector |$x$| has diameter |$d \leq \max _{j,k} \|\ell _{j} - \ell _{k}\|_{\mathcal{N}^{*}}$|, and so the estimate (19) follows from Jung’s theorem.

In our intended application, the role of the space |$\mathcal{N}$| is played by |$C(X)$|⁠, the Banach space of continuous functions on a compact Hausdorff space |$X$|⁠, and the functionals are given by integration against real-valued measures

$$ \begin{align*} & f \mapsto \mu_{j}(f):= \int_{X} f \, \textrm{d} \mu_{j}.\end{align*} $$

It turns out that the three-point configurations that arise in this way are contained in disks of radius smaller than predicted by Jung’s theorem. The main result of the section is the following.

Theorem 7.

Let |$C(X)$| be the space of continuous functions on a compact Hausdorff space |$X$|⁠, and |$\mathcal{L}: C(X) \to{\mathbb{C}}^{3} / {\mathbb{C}} \mathbf{1}$| be the operator in (18) defined by three functionals induced by three finite real-valued Borel measures |$\mu _{1}, \mu _{2}, \mu _{3}$|⁠. Then

$$ \begin{align}& \|\mathcal{L}\|_{C(X) \to{\mathbb{C}}^{3} / {\mathbb{C}} \mathbf{1}} = \frac{1}{2} \max_{j,k} \|\mu_{j} - \mu_{k}\|.\end{align} $$

(20)

It is the “|$\leq $|” estimate in (20) that is the critical one. The lower bound “|$\geq $|” follows from the definition of the functional norm

$$ \begin{align*} & \frac{1}{2} \| \mu_{j} - \mu_{k}\| = \frac{1}{2}\sup_{f: \|f\|_{X} = 1} | \mu_{j}(f) - \mu_{k}(f)| \leq \|\mathcal{L}\|_{C(X) \to{\mathbb{C}}^{3} / {\mathbb{C}} \mathbf{1}}. \end{align*} $$

We will spend the rest of the section on proving Theorem 7. The outline of the proof is as follows. We will first use duality to formulate the problem in terms of the adjoint operator |$\mathcal{L}^{*}$| between the dual spaces. Next, a discretization will help us reduce the dual problem to a finite-dimensional optimization problem. Finally, we will solve the finite-dimensional problem by the use of techniques of convex analysis.

Before proceeding, we remark that the natural generalization of the above theorem to an arbitrary |$n$|-tuple of real-valued measures is valid. See Theorem 17 below.
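As a small illustration of Theorem 7 (our own example, not taken from the original text), let |$X = \{p, q\}$| be a two-point space and set

$$ \begin{align*} & \mu_{1}:= \delta_{p}, \quad \mu_{2}:= \delta_{q}, \quad \mu_{3}:= \tfrac{1}{2}(\delta_{p} + \delta_{q}).\end{align*} $$

Then |$\|\mu_{1} - \mu_{2}\| = 2$| and |$\|\mu_{1} - \mu_{3}\| = \|\mu_{2} - \mu_{3}\| = 1$|, so the right-hand side of (20) equals |$1$|. We assume here, as the discussion of (19) suggests, that the norm of a coset in |${\mathbb{C}}^{3} / {\mathbb{C}} \mathbf{1}$| is the radius of the smallest disk containing the three points. For |$f \in C(X)$| with |$\|f\|_{X} \leq 1$|, the three points are |$f(p)$|, |$f(q)$| and their midpoint, so that

$$ \begin{align*} & \|\mathcal{L} f\| = \frac{|f(p) - f(q)|}{2} \leq 1,\end{align*} $$

with equality for |$f(p) = 1$|, |$f(q) = -1$|. This agrees with (20), whereas the general estimate (19) only gives the bound |$2/\sqrt{3}$|.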

2.3 Dual problem

Let us denote by |$Y$| the space |${\mathbb{C}}^{3}/{\mathbb{C}} \mathbf{1}$| equipped with the norm in (17). Then the dual space |$Y^{*}$| is the two-dimensional space of three-tuples |$(\alpha ,\beta , \gamma )$| of complex numbers that satisfy

$$ \begin{align*} & \alpha+\beta+\gamma = 0,\end{align*} $$

and the norm on |$Y^{*}$| is given by

$$ \begin{align*} & \|(\alpha,\beta,\gamma)\|_{1}:= |\alpha| + |\beta| + |\gamma|.\end{align*} $$

In the case |$\mathcal{N} = C(X)$|⁠, the dual space |$(C(X))^{*}$| is just the space of finite Borel measures on |$X$|⁠. The adjoint operator |$\mathcal{L}^{*}: Y^{*} \to \mathcal{N}^{*}$| is then given by

$$ \begin{align*} & \mathcal{L}^{*}: (\alpha,\beta,\gamma) \mapsto \alpha\mu_{1} + \beta\mu_{2} + \gamma\mu_{3}\end{align*} $$

and the estimate (20) is equivalent to

$$ \begin{align}& \| \alpha \mu_{1} + \beta \mu_{2} + \gamma \mu_{3}\| \leq \frac{|\alpha| + |\beta| + |\gamma|}{2} \max_{j,k} \|\mu_{j} - \mu_{k}\|.\end{align} $$

(21)

Since |$\alpha + \beta = -\gamma $| and |$(\alpha + \beta + \gamma )\mu _{3} = 0$|, we may rewrite the above inequality as

$$ \begin{align}& \| \alpha \nu_{1} + \beta \nu_{2}\| \leq \frac{|\alpha| + |\beta| + |\alpha + \beta|}{2} \max \Bigl\{ \|\nu_{1}\|, \| \nu_{2}\|, \|\nu_{1} - \nu_{2}\| \Bigr\}\end{align} $$

(22)

where

$$ \begin{align*} & \nu_{1}:= \mu_{1} - \mu_{3}, \quad \nu_{2}:= \mu_{2} - \mu_{3}. \end{align*} $$

Note that |$\nu _{1}$| and |$\nu _{2}$| are real-valued if |$\mu _{1}, \mu _{2}, \mu _{3}$| are real-valued. Theorem 7 is thus a consequence of the following slightly more general statement in which the topological structure of |$X$| does not play a role.

Proposition 8.

Let |$\nu _{1}$| and |$\nu _{2}$| be two finite real-valued measures on a measurable space |$X$|⁠. Then for any complex numbers |$\alpha , \beta $| we have the inequality

$$ \begin{align}& \|\alpha\nu_{1}+\beta\nu_{2}\|\le \frac{|\alpha|+|\beta|+|\alpha + \beta|}{2}\max \Bigl\{ \|\nu_{1}\|, \| \nu_{2}\|, \|\nu_{1} - \nu_{2}\| \Bigr\},\end{align} $$

(23)

where the norm on the right-hand side is the total variation norm |$\|\mu \|:= |\mu |(X)$|⁠.

In our next step, we shall simplify the problem further, and show that Proposition 8 can be established by considering finite sets |$X$| only.

2.4 Discretization

With notations as in Proposition 8, set |$\sigma :=|\nu _{1}|+|\nu _{2}|$|⁠. Then |$\sigma $| is a positive finite measure on |$X$|⁠, and by the Radon–Nikodym theorem we have |$d\nu _{1}=f\,d\sigma $| and |$d\nu _{2}=g\,d\sigma $|⁠, where |$f,g$| are bounded real measurable functions on |$X$|⁠. For a moment, let |$\|\cdot \|_{\sigma ,1}$| denote the norm

$$ \begin{align*} & \|f \|_{\sigma,1}:= \int_{X} |f|\, \textrm{d}\sigma. \end{align*} $$

Then Proposition 8 is equivalent to the inequality

$$ \begin{align}& \int_{X}|\alpha f+\beta g|\, \textrm{d}\sigma \le \frac{|\alpha|+|\beta|+|\alpha+\beta|}{2}\max\Bigl\{\|f\|_{\sigma,1},\|g\|_{\sigma,1},\|f-g\|_{\sigma,1}\Bigr\}.\end{align} $$

(24)

We will say that a function is simple if it only takes on a finite number of distinct values. By standard measure theory, there exist simple measurable real functions |$f_{m}$|⁠, |$g_{m}$| on |$X$| such that |$f_{m}\to f$| and |$g_{m}\to g$| uniformly on |$X$|⁠. Clearly |$\| \alpha f_{m} + \beta g_{m}\|_{\sigma ,1} \to \| \alpha f + \beta g\|_{\sigma ,1}$|⁠. Likewise |$\|f_{m}\|_{\sigma ,1}\to \|f\|_{\sigma ,1}$| and |$\|g_{m}\|_{\sigma ,1}\to \|g\|_{\sigma ,1}$| and |$\|f_{m}-g_{m}\|_{\sigma ,1}\to \|f-g\|_{\sigma ,1}$|⁠. Thus, if the inequality (24) holds for each pair of simple functions, then it holds for |$f,g$|⁠. So it suffices to establish (24) when |$f,g$| are simple measurable real functions.

Hence, suppose that |$f,g$| are simple measurable real functions on |$X$|⁠. We can write them as |$f=\sum _{j=1}^{n} a_{j}1_{X_{j}}$| and |$g=\sum _{j=1}^{n} b_{j} 1_{X_{j}}$|⁠, where |$\{ X_{1},\dots ,X_{n} \}$| is a measurable partition of |$X$|⁠, and |$a_{j},b_{j}\in{\mathbb{R}}$| for all |$j$|⁠. The inequality in (24) becomes

$$ \begin{align*} & \sum_{j=1}^{n} |\alpha a_{j}+\beta b_{j}|\sigma(X_{j}) \le \frac{|\alpha|+|\beta|+|\alpha+\beta|}{2}\max\Bigl\{\sum_{j=1}^{n}|a_{j}|\sigma(X_{j}),\sum_{j=1}^{n}|b_{j}|\sigma(X_{j}), \sum_{j=1}^{n} |a_{j}-b_{j}|\sigma(X_{j})\Bigr\}. \end{align*} $$

Writing

$$ \begin{align*} &x_{j}:=a_{j}\sigma(X_{j}), \quad \mathbf{x} = (x_{1}, \ldots, x_{n})^{T} \in{\mathbb{R}}^{n}\end{align*} $$

and

$$ \begin{align*} & y_{j}:=b_{j}\sigma(X_{j}), \quad \mathbf{y} = (y_{1}, \ldots, y_{n})^{T} \in{\mathbb{R}}^{n}\end{align*} $$

we see that this becomes

$$ \begin{align*} & \|\alpha \mathbf{x}+\beta \mathbf{y}\|_{1}\le \frac{|\alpha|+|\beta|+|\alpha+\beta|}{2}\max\Bigl\{\|\mathbf{x}\|_{1},\|\mathbf{y}\|_{1},\|\mathbf{x-y}\|_{1}\Bigr\}, \end{align*} $$

where now |$\mathbf{x},\mathbf{y}$| are vectors in |${\mathbb{R}}^{n}$| and |$\|\cdot \|_{1}$| denotes the usual |$\ell ^{1}$|-norm on |${\mathbb{R}}^{n}$| given by

$$ \begin{align}& \|\mathbf{x}\|_{1}:= \sum_{j=1}^{n} |x_{j}|.\end{align} $$

(25)

To summarize, to prove Proposition 8 and consequently to prove Theorem 7, it suffices to establish the following discrete result.

Proposition 9.

Let |$n\ge 1$| and |$\mathbf{x},\mathbf{y}\in{\mathbb{R}}^{n}$|⁠. Then for all complex numbers |$\alpha , \beta $| we have the inequality

$$ \begin{align}& \|\alpha \mathbf{x}+\beta \mathbf{y}\|_{1}\le \frac{|\alpha|+|\beta|+|\alpha+\beta|}{2}\max\Bigl\{\|\mathbf{x}\|_{1},\|\mathbf{y}\|_{1},\|\mathbf{x-y}\|_{1}\Bigr\}.\end{align} $$

(26)

This reduction of the problem to the finite-dimensional setting allows us to use the tools of convex analysis.
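Before turning to the proof, the inequality (26) can be sanity-checked numerically. The Python sketch below is ours and is only an illustration under stated assumptions: the helper names (`l1`, `lhs`, `rhs`, `random_complex`), the sampling ranges and the random seed are all hypothetical choices. It samples random real vectors and complex coefficients, records the largest value of left-hand side minus right-hand side, and then evaluates one complex configuration, built from the equilateral triangle discussed in connection with (19), for which the inequality fails. This suggests that the real-valuedness assumption cannot simply be dropped.

```python
import cmath
import math
import random

def l1(v):
    """The l^1 norm of a finite sequence of real or complex numbers."""
    return sum(abs(t) for t in v)

def lhs(alpha, beta, x, y):
    """Left-hand side of (26)."""
    return l1([alpha * a + beta * b for a, b in zip(x, y)])

def rhs(alpha, beta, x, y):
    """Right-hand side of (26)."""
    m = max(l1(x), l1(y), l1([a - b for a, b in zip(x, y)]))
    return 0.5 * (abs(alpha) + abs(beta) + abs(alpha + beta)) * m

def random_complex(rng):
    return complex(rng.uniform(-1, 1), rng.uniform(-1, 1))

rng = random.Random(0)  # fixed seed, chosen arbitrarily
worst = -math.inf
for _ in range(20000):
    n = rng.randint(1, 5)
    x = [rng.uniform(-1, 1) for _ in range(n)]
    y = [rng.uniform(-1, 1) for _ in range(n)]
    a, b = random_complex(rng), random_complex(rng)
    worst = max(worst, lhs(a, b, x, y) - rhs(a, b, x, y))

# Complex "vectors" of length one: differences of the cube roots of unity.
omega = cmath.exp(2j * math.pi / 3)
x_c = [1 - omega.conjugate()]
y_c = [omega - omega.conjugate()]
gap = lhs(1, omega.conjugate(), x_c, y_c) - rhs(1, omega.conjugate(), x_c, y_c)

print(worst, gap)
```

For real vectors the recorded difference stays non-positive up to rounding, while in the complex configuration the left-hand side equals |$3$| and the right-hand side equals |$3\sqrt{3}/2$|.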

2.5 Optimization over a convex set

Consider the set

$$ \begin{align}& C_{n}:= \Bigl \{ (\mathbf{x},\mathbf{y}) \in{\mathbb{R}}^{n} \times{\mathbb{R}}^{n}: \|\mathbf{x}\|_{1} \leq 1, \|\mathbf{y} \|_{1} \leq 1, \| \mathbf{x - y}\|_{1} \leq 1 \Bigr\}.\end{align} $$

(27)

Thus |$C_{n}$| is a compact convex polytope in |${\mathbb{R}}^{n} \times{\mathbb{R}}^{n}$|, and so it has a finite number of extreme points, that is, points of |$C_{n}$| that do not lie in the interior of any line segment in |$C_{n}$|. A well-known theorem of Carathéodory says that each point of a compact convex polytope is a convex combination of its extreme points.

Lemma 10.

In order to establish Proposition 9, it suffices to show that the inequality (26) holds for every extreme point of |$C_{n}$|⁠.

Proof.

Let us fix |$\alpha , \beta \in{\mathbb{C}}$| and |$(\mathbf{x},\mathbf{y}) \in{\mathbb{R}}^{n} \times{\mathbb{R}}^{n}$|⁠. By the homogeneity of the inequality in (26), we may assume that

$$ \begin{align}& \max \Bigl \{ \|\mathbf{x}\|_{1}, \|\mathbf{y}\|_{1}, \|\mathbf{x-y}\|_{1} \Bigr\} = 1.\end{align} $$

(28)

Then |$(\mathbf{x},\mathbf{y}) \in C_{n}$| and so we may express it as a convex combination of the extreme points of |$C_{n}$|⁠, namely

$$ \begin{align*} & \mathbf{x} = \sum_{k=1}^{m} t_{k} \mathbf{e^{k}}, \quad \mathbf{y} = \sum_{k=1}^{m} t_{k} \mathbf{f^{k}} \end{align*} $$

where |$\mathbf{e^{k}} \in{\mathbb{R}}^{n}$|⁠, |$\mathbf{f^{k}} \in{\mathbb{R}}^{n}$|⁠, the pairs |$(\mathbf{e^{k}}, \mathbf{f^{k}})$| are extreme points of |$C_{n}$|⁠, |$t_{k}> 0$|⁠, and |$\sum _{k=1}^{m} t_{k} = 1$|⁠. Note that since |$(\mathbf{e^{k}}, \mathbf{f^{k}})$| is an extreme point of |$C_{n}$|⁠, we must have

$$ \begin{align*} & \max \Bigl \{ \|\mathbf{e^{k}}\|_{1}, \|\mathbf{f^{k}}\|_{1}, \|\mathbf{e^{k} - f^{k}}\|_{1} \Bigr\} = 1.\end{align*} $$

Since we are assuming that (26) holds for extreme points, we can estimate

$$ \begin{align*} \| \alpha \mathbf{x} + \beta \mathbf{y}\|_{1} &\leq \sum_{k=1}^{m} t_{k} \| \alpha \mathbf{e^{k}} + \beta \mathbf{f^{k}}\|_{1} \\ &\leq \sum_{k=1}^{m} t_{k} \frac{|\alpha| + |\beta| + |\alpha + \beta|}{2} \max \Bigl \{ \|\mathbf{e^{k}}\|_{1}, \|\mathbf{f^{k}}\|_{1}, \|\mathbf{e^{k} - f^{k}}\|_{1} \Bigr\} \\ &= \frac{|\alpha| + |\beta| + |\alpha + \beta|}{2}\sum_{k=1}^{m} t_{k} \\ &= \frac{|\alpha| + |\beta| + |\alpha + \beta|}{2}. \end{align*} $$

Recalling our normalization in (28), this is the desired estimate in (26).

From the above lemma and our sequence of reductions, it follows that in order to prove Theorem 7 it suffices to show that the inequality (26) holds at every extreme point of the polytope |$C_{n}$|. Proposition 11 below characterizes these extreme points by partitioning them into three equivalence classes.

Note that |$C_{n}$| is invariant under the following linear symmetries:

$$ \begin{align} &\begin{cases} x_{j}^{\prime}&:= x_{\pi(j)} \\ y_{j}^{\prime}&:= y_{\pi(j)} \end{cases}\end{align} $$

(29)

where |$\pi $| is any permutation of |$\{1,2,\dots ,n\}$|⁠,

$$ \begin{align} &\begin{cases} x_{j}^{\prime}&:=\epsilon_{j} x_{j} \\ y_{j}^{\prime}&:=\epsilon_{j} y_{j} \end{cases}\end{align} $$

(30)

for any choice of |$\epsilon _{1},\dots ,\epsilon _{n}\in \{-1,1\}$|⁠,

$$ \begin{align} &\begin{cases} \mathbf{x^{\prime}} &:=\mathbf{y} \\ \mathbf{y^{\prime}} &:=\mathbf{x} \end{cases}\end{align} $$

(31)

and

$$ \begin{align} &\begin{cases} \mathbf{x^{\prime}}&:=\mathbf{x} \\ \mathbf{y^{\prime}}&:=\mathbf{x-y} \end{cases}\end{align} $$

(32)

Denote by |$G_{n}$| the group generated by these symmetries. As these symmetries are linear automorphisms of |${\mathbb{R}}^{n} \times{\mathbb{R}}^{n}$|, it is clear that |$G_{n}$| leaves invariant the set of extreme points of |$C_{n}$|. We say that two extreme points of |$C_{n}$| are |$\textit{G}_{n}$|-equivalent if there is an element of |$G_{n}$| mapping one of them to the other. Thus the action of |$G_{n}$| on |$C_{n}$| partitions the set of extreme points of |$C_{n}$| into a finite number of equivalence classes. Note that if the inequality (26) holds for some |$(\mathbf{x},\mathbf{y}) \in C_{n}$|, then it also holds for any point of |$C_{n}$| in the orbit of |$(\mathbf{x},\mathbf{y})$| under the group action of |$G_{n}$| on |$C_{n}$|.

The extreme points of |$C_{n}$| are identified in the following proposition.

Proposition 11.

If |$n\ge 3$|⁠, then every extreme point |$(\mathbf{x},\mathbf{y})$| of |$C_{n}$| is |$G_{n}$|-equivalent to one of the pairs

$$ \begin{align*} & \begin{pmatrix} 1\\0\\0\\0\\ \vdots\\ 0 \end{pmatrix}, \begin{pmatrix} 1\\0\\0\\0\\ \vdots\\ 0 \end{pmatrix} \quad\textrm{and}\quad \begin{pmatrix} 1\\0\\0\\0\\ \vdots\\ 0 \end{pmatrix}, \begin{pmatrix} 1/2\\1/2\\0\\0\\ \vdots\\ 0 \end{pmatrix} \quad\textrm{and}\quad \begin{pmatrix} 1/2\\1/2\\0\\0\\ \vdots\\0 \end{pmatrix}, \begin{pmatrix} 1/2\\0\\1/2\\0\\ \vdots\\ 0 \end{pmatrix}. \end{align*} $$

One can readily check that each of the three above pairs really is an extreme point of |$C_{n}$|⁠. We omit the proof, since we do not actually need this fact. In the case that |$n = 1$|⁠, the same result holds, but only the first kind of pair can arise. Likewise, if |$n = 2$|⁠, the same result holds, but only the first two types of pairs can arise.

We will prove Proposition 11 in Section 2.6. For now let us see how Theorem 7 follows. In order to verify (26) for all extreme points of |$C_{n}$|⁠, it suffices to verify the inequality for the three pairs of vectors appearing in Proposition 11. This is an easy task. For instance, if |$(\mathbf{x},\mathbf{y})$| is the second pair in Proposition 11, then we have

$$ \begin{align*} \|\alpha \mathbf{x} + \beta \mathbf{y}\|_{1} &= |\alpha + \beta /2 | + |\beta|/2 \\ &= |\alpha/2 + \alpha/2 + \beta/2| + |\beta|/2 \\&\leq |\alpha + \beta|/2 + |\alpha|/2 + |\beta|/2. \end{align*} $$

The inequality for the other two pairs is verified similarly. Then from Lemma 10 we conclude that Proposition 9 holds, from which Theorem 7 follows by the earlier reduction.
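For the reader's convenience, the two remaining verifications can be sketched as follows (this is our own computation, in the same spirit as the one above). In each case the maximum on the right-hand side of (26) equals |$1$|. For the first pair we have |$\mathbf{x} = \mathbf{y}$|, and so

$$ \begin{align*} \|\alpha \mathbf{x} + \beta \mathbf{y}\|_{1} &= |\alpha + \beta| \\ &= |\alpha + \beta|/2 + |\alpha + \beta|/2 \\&\leq |\alpha + \beta|/2 + |\alpha|/2 + |\beta|/2. \end{align*} $$

For the third pair, the only non-zero coordinates of |$\alpha \mathbf{x} + \beta \mathbf{y}$| are |$(\alpha + \beta)/2$|, |$\alpha/2$| and |$\beta/2$|, and so

$$ \begin{align*} \|\alpha \mathbf{x} + \beta \mathbf{y}\|_{1} &= |\alpha + \beta|/2 + |\alpha|/2 + |\beta|/2, \end{align*} $$

that is, (26) holds with equality for this pair.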

It remains to prove Proposition 11.

2.6 Extreme points of the polytope

In the proof of Proposition 11, the group |$G_{n}$| generated by the symmetries (29)–(32) will be extensively used. In particular we will use the property that |$(\mathbf{x},\mathbf{y})$| is an extreme point of |$C_{n}$| if and only if some extreme point of |$C_{n}$| is |$G_{n}$|-equivalent to it. Moreover, the following two observations will be useful to single out.

Lemma 12.

If for a pair |$(\mathbf{x},\mathbf{y}) \in C_{n}$| there exist two distinct indices |$j,k$| such that

$$ \begin{align*} & x_{j}> 0, \quad y_{j} > 0, \quad x_{k} > 0, \quad y_{k} > 0 \end{align*} $$

then |$(\mathbf{x},\mathbf{y})$| is not an extreme point of |$C_{n}$|⁠.

More generally, if for two distinct indices |$j, k$| we have that two of the quantities |$x_{j}x_{k}$|⁠, |$y_{j}y_{k}$| and |$(x_{j}-y_{j})(x_{k}-y_{k})$| are non-zero and have the same sign, then |$(\mathbf{x},\mathbf{y})$| is not an extreme point of |$C_{n}$|⁠.

Proof.

Using the symmetry (29) we may suppose that |$j=1$|⁠, |$k=2$|⁠. Note that |$x_{1} < 1$|⁠, |$x_{2} < 1$|⁠, since |$\| \mathbf{x}\|_{1} \leq 1$|⁠. The same is true for the corresponding coordinates of |$\mathbf{y}$|⁠. Let |$\mathbf{d} = (1, -1, 0, \ldots , 0)^{T} \in{\mathbb{R}}^{n}$|⁠. It is easy to verify that if |$t$| is a real number, and |$|t|$| is sufficiently small, then we have

$$ \begin{align*} & (\mathbf{x},\mathbf{y}) + t(\mathbf{d},\mathbf{d}) = (\mathbf{x}+t\mathbf{d}, \mathbf{y}+t\mathbf{d}) \in C_{n}. \end{align*} $$

Thus |$(\mathbf{x},\mathbf{y})$| lies on a line segment inside |$C_{n}$|⁠, and so is not an extreme point of |$C_{n}$|⁠.

The more general statement follows by applications of a sequence of symmetries in (29)–(32) to transform |$(\mathbf{x},\mathbf{y})$| satisfying the more general assumption into a point |$(\mathbf{x^{\prime}}, \mathbf{y^{\prime}})$| where the first two coordinates of the vectors |$\mathbf{x^{\prime}}$| and |$\mathbf{y^{\prime}}$| are positive.
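The perturbation argument in the first part of the proof can be illustrated numerically. The sketch below is ours; the specific point |$(\mathbf{x},\mathbf{y})$|, the step size `t` and the helper names `l1`, `in_C` and `shift` are assumptions chosen for the example. It checks that moving along |$\pm t(\mathbf{d},\mathbf{d})$| stays inside |$C_{n}$|, so that the chosen point is the midpoint of a segment in |$C_{n}$|.

```python
def l1(v):
    """The l^1 norm of a real vector given as a list."""
    return sum(abs(t) for t in v)

def in_C(x, y, tol=1e-12):
    """Membership test for the polytope C_n, up to a rounding tolerance."""
    diff = [a - b for a, b in zip(x, y)]
    return max(l1(x), l1(y), l1(diff)) <= 1 + tol

# A point of C_3 whose first two coordinates are positive in both x and y.
x = [0.6, 0.4, 0.0]
y = [0.3, 0.2, 0.5]
d = [1.0, -1.0, 0.0]
t = 0.1  # any 0 < t < min(x_1, x_2, y_1, y_2) = 0.2 works for this point

def shift(v, s):
    """Return v + s * d."""
    return [a + s * e for a, e in zip(v, d)]

plus = (shift(x, t), shift(y, t))
minus = (shift(x, -t), shift(y, -t))
midpoint = (
    [(a + b) / 2 for a, b in zip(plus[0], minus[0])],
    [(a + b) / 2 for a, b in zip(plus[1], minus[1])],
)

print(in_C(x, y), in_C(*plus), in_C(*minus))
```

The difference |$\mathbf{x-y}$| is unchanged by the shift, which is why only the norms of the two shifted vectors need to be controlled.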

Lemma 13.

If for a pair |$(\mathbf{x},\mathbf{y}) \in C_{n}$| the vector |$\mathbf{x}$| or |$\mathbf{y}$| has at least three non-zero coordinates, then |$(\mathbf{x},\mathbf{y})$| is not an extreme point of |$C_{n}$|⁠.

Proof.

By using symmetries (29)–(31) we may suppose that the coordinates |$x_{1}, x_{2}, x_{3}$| are positive. If two of the coordinates |$y_{1}, y_{2}, y_{3}$| are positive, then by Lemma 12 we conclude that |$(\mathbf{x},\mathbf{y})$| is not an extreme point of |$C_{n}$|. In the contrary case, two of the coordinates |$y_{1}, y_{2}, y_{3}$| are non-positive. Then again by Lemma 12 and the symmetry (32) the pair |$(\mathbf{x},\mathbf{x-y}) \in C_{n}$| is not extreme, and thus neither is |$(\mathbf{x},\mathbf{y})$|, since these two pairs are |$G_{n}$|-equivalent.

We are ready to prove Proposition 11. We denote by |$\ell ^{1}_{n}$| the space |${\mathbb{R}}^{n}$| equipped with the norm |$\| \cdot \|_{1}$| given by (25). Recall that the extreme points of the unit ball |$B:= \{ \mathbf{x} \in{\mathbb{R}}^{n}: \|\mathbf{x}\|_{1} \leq 1\}$| are the vectors with precisely one non-zero coordinate, this coordinate being equal to |$\pm 1$|⁠.

Proof of Proposition 11.

We will split up the proof into three cases, each case corresponding to one of the pairs in the statement of the proposition.

Case 1: At least one of the norms |$\|\mathbf{x}\|_{1},\|\mathbf{y}\|_{1},\|\mathbf{x-y}\|_{1}$| is strictly less than |$1$|⁠. We will show that in this case |$(\mathbf{x},\mathbf{y})$| is |$G_{n}$|-equivalent to the first pair in the statement of the proposition.

By applying a suitable combination of symmetries (29)–(32), we may suppose that in fact |$\|\mathbf{x-y}\|_{1}<1$|⁠. We claim that |$\mathbf{x}$| must be an extreme point of the unit ball of |$\ell ^{1}_{n}$|⁠. For if not, then it lies at the midpoint of a line segment |$I$| such that |$\|\mathbf{x^{\prime}}\|_{1}\le 1$| for all |$\mathbf{x^{\prime}}\in I$|⁠. Since |$\|\mathbf{x-y}\|_{1} < 1$|⁠, by shrinking |$I$| if necessary, we also have |$\|\mathbf{x^{\prime}-y}\|_{1}<1$| for all |$\mathbf{x^{\prime}}\in I$|⁠. Thus |$I\times \{\mathbf{y}\}$| is a line segment in |$C_{n}$| with interior point |$(\mathbf{x},\mathbf{y})$|⁠, contradicting the fact that |$(\mathbf{x},\mathbf{y})$| is extreme.

Likewise, |$\mathbf{y}$| is extreme in the unit ball of |$\ell ^{1}_{n}$|⁠. Applying a suitable symmetry, we may suppose that |$x_{1}=1$| and |$y_{j}=\pm 1$| for some |$j$|⁠, all the other entries of |$\mathbf{x}$| and |$\mathbf{y}$| being |$0$|⁠. Since we must have |$\|\mathbf{x-y}\|_{1}<1$|⁠, this implies that actually |$j=1$| and |$y_{1}=1$|⁠. Thus |$(\mathbf{x},\mathbf{y})$| is equivalent to the first pair of vectors listed in the statement of the proposition. This concludes Case 1.

Case 2: We have |$\|\mathbf{x}\|_{1} = \|\mathbf{y}\|_{1} = \|\mathbf{x-y}\|_{1} = 1$|, and one of the vectors |$\mathbf{x}$|, |$\mathbf{y}$|, or |$\mathbf{x-y}$| has only one non-zero coordinate. In this case, |$(\mathbf{x},\mathbf{y})$| will now be shown to be |$G_{n}$|-equivalent to the second pair in the statement of the proposition.

Using our symmetries, we may suppose that |$\mathbf{x} = (1, 0, \ldots , 0)^{T}$|⁠. Note that

$$ \begin{align*} & \|\mathbf{x-y}\|_{1} = |1 - y_{1}| + |y_{2}| + \ldots + |y_{n}| = 1\end{align*} $$

and

$$ \begin{align*} & \|\mathbf{y} \|_{1} = |y_{1}| + |y_{2}| + \ldots + |y_{n}| = 1\end{align*} $$

force

$$ \begin{align*} &|1-y_{1}| = |y_{1}|,\end{align*} $$

the unique real solution |$y_{1}$| to this equation being |$y_{1} = 1/2$|⁠. By Lemma 13, |$\mathbf{y}$| has only one other non-zero coordinate, and |$\|\mathbf{y}\|_{1} = 1$| forces this coordinate to be equal to |$\pm 1/2$|⁠. Applying symmetries (29) and (30) we conclude that |$(\mathbf{x},\mathbf{y})$| is |$G_{n}$|-equivalent to the second pair in the statement. This concludes Case 2.

Case 3: We have |$\|\mathbf{x}\|_{1} = \|\mathbf{y}\|_{1} = \|\mathbf{x-y}\|_{1} = 1$|⁠, and all of the vectors |$\mathbf{x}$|⁠, |$\mathbf{y}$| and |$\mathbf{x-y}$| have exactly two non-zero coordinates. We will show that |$(\mathbf{x},\mathbf{y})$| is |$G_{n}$|-equivalent to the third pair in the statement of the proposition.

This case is slightly more complicated than the previous two. As before, we may suppose that |$x_{1}> 0$| and |$x_{2}> 0$|. We claim that |$y_{1}$| and |$y_{2}$| cannot both be equal to zero. If they were, then |$\mathbf{x-y}$| would have four non-zero coordinates, contrary to the assumption. In fact, precisely one of |$y_{1}$| and |$y_{2}$| must be non-zero. If both were non-zero, then since |$\mathbf{x-y}$| has exactly two non-zero coordinates, we would have |$x_{1} - y_{1} \neq 0$| and |$x_{2} - y_{2} \neq 0$|. Then the three quantities |$x_{1}x_{2}$|, |$y_{1}y_{2}$| and |$(x_{1}-y_{1})(x_{2} - y_{2})$| would all be non-zero, so at least two of them would have the same sign, and Lemma 12 would imply that |$(\mathbf{x},\mathbf{y})$| is not an extreme point.

By an application of symmetries we may, in addition to |$x_{1}> 0$| and |$x_{2}> 0$|⁠, suppose that |$y_{1} \neq 0$|⁠, |$y_{2} = 0$| and |$y_{3} = s> 0$|⁠. Since |$x_{1} + x_{2} = 1$|⁠, we have |$x_{1} = t$|⁠, |$x_{2} = 1-t$| for some |$t \in (0, 1)$|⁠. Our vectors thus have the following structure:

$$ \begin{align*} & \mathbf{x}= \begin{pmatrix} t\\1-t\\0\\0\\ \vdots\\0 \end{pmatrix}, \quad \mathbf{y}= \begin{pmatrix} y_{1} \\ 0 \\ s \\ 0\\\vdots\\0 \end{pmatrix}, \quad \mathbf{x-y}= \begin{pmatrix} t - y_{1}\\ 1-t \\-s\\0\\\vdots\\0 \end{pmatrix}. \end{align*} $$

Recall that |$\mathbf{x-y}$| has only two non-zero coordinates. Since |$1-t \neq 0$| and |$s \neq 0$|⁠, we conclude from the above that |$t = y_{1}$|⁠. But then |$\|\mathbf{x-y}\|_{1} = 1-t + s = 1$|⁠, and so |$t = s$|⁠. Finally, |$1 = \|\mathbf{y}\|_{1} = t + s = 2s$| shows that |$s = t = 1/2$|⁠, and so |$(\mathbf{x},\mathbf{y})$| is |$G_{n}$|-equivalent to the third pair in the statement of the proposition.

3 Proof of Theorem 1

In addition to Theorem 7 from Section 2, we will also need some facts from plane geometry in order to prove Theorem 1. In particular, we will need to discuss the minimum enclosing disk problem appearing in computational geometry.

3.1 Minimal enclosing disk

Let |$K$| be a compact subset of |$\mathbb{C}$| containing at least two points. Among all closed disks that contain |$K$| there exists a unique one of minimal radius. We will denote this disk by |${\mathbb{D}}_{K}$| and call it the minimal disk for |$K$|⁠. The radius of |${\mathbb{D}}_{K}$| will be denoted by |$R(K)$|⁠.

If |${\mathbb{D}}_{K}$| is minimal for |$K$|⁠, then the intersection |$K \cap \partial{\mathbb{D}}_{K}$| must obviously be non-empty. In fact, this intersection must contain at least two points, and there is also a restriction on the locations of the points in |$K \cap \partial{\mathbb{D}}_{K}$|⁠.

Lemma 14.

Let |$K$| be a compact subset of |$\mathbb{C}$| that contains at least two points. Then the intersection |$\partial{\mathbb{D}}_{K} \cap K$| is not contained in any arc of |$\partial{\mathbb{D}}_{K}$| that has length strictly smaller than half of the circumference of |$ {\mathbb{D}}_{K}$|. In particular, if |$K \cap \partial{\mathbb{D}}_{K} = \{a,b\}$| is a two-point set, then |$a$| and |$b$| are antipodal on |$\partial{\mathbb{D}}_{K}$|.

Proof.

Seeking a contradiction, assume that |$\partial{\mathbb{D}}_{K} \cap K$| is contained in an arc of length strictly less than half of the circumference of |${\mathbb{D}}_{K}$|⁠. By translation, rescaling, and rotation of the setting, we may assume that |${\mathbb{D}}_{K}$| is the unit disk, and that |$\partial{\mathbb{D}}_{K} \cap K$| is contained in some half-space

$$ \begin{align*} &\{ z \in \mathbb{C}: \textrm{Re} z> \delta \}, \quad \delta > 0.\end{align*} $$

By compactness, the distance between the compact sets |$K$| and |$\partial{\mathbb{D}}_{K} \cap \{ z \in \mathbb{C}: \textrm{Re} z \leq \delta /2 \}$| is positive. It follows that we may translate the disk |${\mathbb{D}}_{K}$| in the positive direction of the real axis, and then shrink the radius of the translated disk slightly, and the resulting disk will still contain |$K$|, yet be of strictly smaller radius than |$R(K)$|. See Figure 3. This contradiction establishes Lemma 14.

Fig. 3

The initial disk |${\mathbb{D}}_{K}$| is the dashed circle, and we assume that |$\partial{\mathbb{D}}_{K} \cap K$| is contained in the black thick arc. Then |$K$| will be contained in the grey disk, which is obtained from |${\mathbb{D}}_{K}$| by first translating |${\mathbb{D}}_{K}$| in the direction of the positive real axis, and then slightly shrinking the translated disk. This contradicts the minimality of |${\mathbb{D}}_{K}$|⁠.

Fig. 4

The thick arc |$J$| between |$a$| and |$b$| is the smallest arc containing the compact set |$K$|. It follows that the shorter arc between the antipodal points |$\tilde{a}$| and |$\tilde{b}$| must contain points of |$K$|.

Lemma 15.

Let |$T = \{a, b, c\}$| be a three-point set. If |$D$| is a closed disk for which |$T \subset \partial D$|, and |$T$| is not contained in any arc of |$\partial D$| of length strictly smaller than half of the circumference of |$D$|, then |$D = {\mathbb{D}}_{T}$|.

Proof.

Assume, seeking a contradiction, that |$D \neq{\mathbb{D}}_{T}$|, and so that |$R(T)$| is strictly smaller than the radius of |$D$|. Since |$\partial D$| is the unique circle passing through the three points |$a,b,c$|, we must have that |$T \cap \partial{\mathbb{D}}_{T}$| contains precisely two points. Say |$a, b \in \partial{\mathbb{D}}_{T}$| but |$c \not \in \partial{\mathbb{D}}_{T}$|. Lemma 14 implies that |$a$| and |$b$| are antipodal on |$\partial{\mathbb{D}}_{T}$|. By translation, rescaling, and rotation, we may assume that |${\mathbb{D}}_{T}$| is the unit disk, |$a = i, b = -i$|, |$c$| has non-negative real part and |$|c| < 1$|. After these operations, we have that |$R(T) = 1$| and the circumference of |$D$| is larger than |$2 \pi $|. Thus, by hypothesis, |$T$| is not contained in any arc of |$\partial D$| of length strictly smaller than |$\pi $|. But the shorter of the arcs of |$\partial D$| that contains |$T$| is then contained in |$\{ z \in \mathbb{C}: 0 \leq \textrm{Re} z, |z| \leq 1\}$|, and so this arc must have a length smaller than |$\pi $|. This is a contradiction, and the lemma follows.

3.2 Reduction to three-point sets

The following simple result on minimal disks makes it possible to apply Theorem 7 to more than three measures.

Lemma 16.

Let |$K$| be a compact subset of |${\mathbb{C}}$| containing at least two points. There exists a subset |$T \subset K$|⁠, which contains at most three points and for which |${\mathbb{D}}_{K} = {\mathbb{D}}_{T}$|⁠. In particular, |$R(K) = R(T)$|⁠.

It may be convenient to refer to Figure 4 during the reading of the proof.

Proof.

If there are two points in |$K$| that are antipodal on |$\partial{\mathbb{D}}_{K}$|, then we take |$T$| to consist of those two points. Clearly |${\mathbb{D}}_{K} = {\mathbb{D}}_{T}$|. In the case that no pair of antipodal points of |$\partial{\mathbb{D}}_{K}$| is contained in |$K$|, let |$J$| be the shortest closed arc of |$\partial{\mathbb{D}}_{K}$| that contains |$K$|, and let |$a,b \in J \cap K$| be the end-points of |$J$|. By Lemma 14, the length of |$J$| is strictly larger than half of the circumference of |$\partial{\mathbb{D}}_{K}$|, and so |$J$| is the longer of the two arcs between |$a$| and |$b$|. Let |$\tilde{a}$| and |$\tilde{b}$| be the points on |$\partial{\mathbb{D}}_{K}$| that are antipodal to |$a$| and |$b$|, respectively. By assumption, |$\tilde{a} \not \in K, \tilde{b} \not \in K$|. We claim that the shorter of the two open arcs between |$\tilde{a}$| and |$\tilde{b}$| must contain points of |$K$|. If not, then the longer of the two arcs between |$\tilde{a}$| and |$\tilde{b}$| would contain |$K$| in its interior, and this arc has the same length as |$J$|. A routine compactness argument would lead to a contradiction to the minimality of |$J$|.

Let |$T = \{a, b, c\}$|⁠, where |$c \in K$| is any point contained in the shorter open arc between |$\tilde{a}$| and |$\tilde{b}$|⁠. Note that any arc containing |$T$| must contain either |$\tilde{a}$| or |$\tilde{b}$|⁠. Then such an arc contains two antipodal points on |${\mathbb{D}}_{K}$|⁠, and so it has a length that is at least half of the circumference of |${\mathbb{D}}_{K}$|⁠. By Lemma 15 we conclude that |${\mathbb{D}}_{K} = {\mathbb{D}}_{T}$|⁠.
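For a finite set |$K$|, Lemma 16 suggests a simple (and inefficient) way to compute the minimal disk: try the disks having a pair of points of |$K$| as a diameter and the circumscribed disks of triples, and keep the smallest one containing |$K$|. The Python sketch below is our illustration only; the function names `disk_from_two`, `disk_from_three` and `minimal_disk`, the tolerances, and the sample point sets are assumptions, and this brute-force search is not an algorithm taken from the text.

```python
import cmath
import itertools
import math

def disk_from_two(a, b):
    """Disk having the segment [a, b] as a diameter."""
    return (a + b) / 2, abs(a - b) / 2

def disk_from_three(a, b, c):
    """Circumscribed disk of a, b, c, or None if the points are collinear."""
    ax, ay, bx, by, cx, cy = a.real, a.imag, b.real, b.imag, c.real, c.imag
    det = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(det) < 1e-14:
        return None
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / det
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / det
    centre = complex(ux, uy)
    return centre, abs(a - centre)

def minimal_disk(points, tol=1e-9):
    """Return (centre, radius, T) with T a 2- or 3-point subset determining the disk."""
    best = None
    for k in (2, 3):
        for subset in itertools.combinations(points, k):
            disk = disk_from_two(*subset) if k == 2 else disk_from_three(*subset)
            if disk is None:
                continue
            centre, radius = disk
            contains_all = all(abs(p - centre) <= radius + tol for p in points)
            if contains_all and (best is None or radius < best[1]):
                best = (centre, radius, subset)
    return best

# Two antipodal points already determine the disk.
K1 = [0j, 2 + 0j, 1 + 1j, 1 + 0.5j]
centre1, radius1, T1 = minimal_disk(K1)

# No antipodal pair: three points are needed.
K2 = [cmath.exp(2j * math.pi * k / 3) for k in range(3)] + [0j]
centre2, radius2, T2 = minimal_disk(K2)

print(radius1, len(T1), radius2, len(T2))
```

In the first sample the subset |$T$| found consists of the two antipodal points |$0$| and |$2$|, while in the second all three vertices of the triangle are required, matching the two cases in the proof.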

3.3 Finalizing the proof

We are finally ready to give a proof of the equality |$c_{\mathbb{R}}(\Omega ) = c_{\mathbb{C}}(\Omega )$|⁠.

Proof.

(Proof of Theorem 1)

Since |$c_{\mathbb{R}}(\Omega ) \leq c_{\mathbb{C}}(\Omega )$|, it will suffice to show the reverse inequality. To this end, we need to show that given |$f \in C(\partial \Omega )$| satisfying |$\|f \|_{\partial \Omega } \leq 1$|, we have that |$\|K_\Omega f + {\mathbb{C}} \mathbf{1}\|_{\partial \Omega } \leq c_{\mathbb{R}}(\Omega )$|. Since |$K_\Omega f$| is continuous, the image |$K = K_\Omega f(\partial \Omega )$| is a compact subset of |${\mathbb{C}}$|. If |$K$| consists of a single point, then |$\|K_\Omega f + {\mathbb{C}} \mathbf{1}\|_{\partial \Omega } = 0$|, and the proof is complete. Otherwise, let |${\mathbb{D}}_{K}$| be the minimal disk for |$K$|. We use Lemma 16 to obtain a three-point set |$T = \{a,b,c\} \subset K$| for which |$R(T) = R(K)$| (note that if |$K \cap \partial{\mathbb{D}}_{K}$| contains only two points |$\{a,b\}$|, then we may pick |$c \in K$| arbitrarily to complete |$T$| to a three-point set). The geometric interpretation of the quotient norm in |$C(\partial \Omega )/{\mathbb{C}} \mathbf{1}$| implies that |$\|K_\Omega f + {\mathbb{C}} \mathbf{1}\|_{\partial \Omega } = R(K) = R(T)$|. Since |$T$| is contained in the image of |$K_\Omega f$|, there exist |$\zeta _{1}, \zeta _{2}, \zeta _{3} \in \partial \Omega $| such that

$$ \begin{align*} & (a,b,c) = \bigl( K_\Omega f(\zeta_{1}), K_\Omega f(\zeta_{2}), K_\Omega f(\zeta_{3})\bigr).\end{align*} $$

Since |$K_\Omega f(\zeta _{j}) = \int _{\partial \Omega } f \, \textrm{d}\mu _{\zeta _{j}}$|⁠, we may apply Theorem 7 to |$X = \partial \Omega $|⁠, |$\mu _{j} = \mu _{\zeta _{j}}$| for |$j = 1,2,3$|⁠, and conclude that the operator |$\mathcal{L}: C(\partial \Omega ) \to{\mathbb{C}}^{3} / {\mathbb{C}} \mathbf{1}$| defined by

$$ \begin{align*} & \mathcal{L}: f \mapsto \bigl( K_\Omega f(\zeta_{1}), K_\Omega f(\zeta_{2}), K_\Omega f(\zeta_{3})\bigr) + {\mathbb{C}} \mathbf{1}\end{align*} $$

has a norm satisfying the bound (20). With |$\| \cdot \|_\infty $| denoting the norm on |${\mathbb{C}}^{3} / {\mathbb{C}} \mathbf{1}$| given in (17), we obtain

$$ \begin{align*} \|K_\Omega f + {\mathbb{C}} \mathbf{1}\|_{\partial \Omega} = R(T) &= \|(a,b,c) + {\mathbb{C}} \mathbf{1}\|_\infty \\ &= \|\mathcal{L} f \|_{\infty} \\ &\leq \|\mathcal{L}\|_{C(\partial \Omega) \to{\mathbb{C}}^{3} / {\mathbb{C}} \mathbf{1}} \\ &\leq \frac{1}{2} \max_{j,k} \|\mu_{\zeta_{j}} - \mu_{\zeta_{k}}\| \\ &\leq \frac{1}{2} \sup_{\zeta, \zeta^{\prime} \in \partial \Omega} \|\mu_\zeta - \mu_{\zeta^{\prime}}\|\\ & = c_{\mathbb{R}}(\Omega). \end{align*} $$

The earlier mentioned extension of Theorem 7 to an n-measures theorem is obtained by employing the same argument as in the above proof. The normed space |${\mathbb{C}}^{n} / {\mathbb{C}} \mathbf{1}$| appearing below is defined analogously to the case |$n=3$| treated in Section 2.1.

Theorem 17.

Let |$C(X)$| be the space of continuous functions on a compact Hausdorff space |$X$|⁠, |$n \geq 3$| an integer, and |$\mathcal{L}: C(X) \to{\mathbb{C}}^{n} / {\mathbb{C}} \mathbf{1}$| the operator defined by

$$ \begin{align*} & \mathcal{L} f = \big( \mu_{1}(f), \ldots, \mu_{n}(f) \big) + {\mathbb{C}} \mathbf{1} \end{align*} $$

where |$\mu _{1}, \ldots , \mu _{n}$| are finite real-valued Borel measures on |$X$|⁠. Then

$$ \begin{align*} & \| \mathcal{L} \|_{C(X) \to{\mathbb{C}}^{n} / {\mathbb{C}} \mathbf{1}} = \frac{1}{2} \max_{j,k} \|\mu_{j} - \mu_{k}\|.\end{align*} $$

Proof.

We use Lemma 16 to pick a three-point subset |$T$| of |$K = \{ \mu _{j}(f) \}_{j=1}^{n}$| for which we have |$R(K) = R(T)$|⁠, and apply Theorem 7 as in the preceding proof.

4 Proof of Theorem 2

4.1 Exploiting subsequences

We will argue by contradiction in order to prove Theorem 2. That is, we will assume that there exists a convex domain |$\Omega $| with |$a(\Omega ) = 1$|, and so that there exists a sequence of functions |$(f_{n})$| in |$\mathcal{A}(\Omega )$| that satisfy

$$ \begin{align*} & \|f_{n} + {\mathbb{C}} \mathbf{1}\|_{\Omega} = 1 \end{align*} $$

and

$$ \begin{align}& \lim_{n \to \infty} \| K_\Omega f_{n} + {\mathbb{C}} \mathbf{1}\|_{\Omega} = 1.\end{align} $$

(33)

We shall see that this leads to a contradiction. The proof technique below is different from the one employed by Schober in [17] in his proof of Neumann’s lemma, and analyticity is used only at the very end of the proof. In fact, we shall remark at the end of the section how our arguments lead to a new proof of Neumann’s lemma that is different from the one in [17].

Thus, for now, we assume merely that |$f_{n} \in C(\partial \Omega )$|⁠, and we will derive certain consequences of (33). In the course of the proof we shall replace the sequence |$(f_{n})$| by a subsequence multiple times, and for convenience we will not be changing the subscripts. We may suppose that |$\|f_{n}\|_\Omega = 1$|⁠, and consequently that the images

$$ \begin{align*} & K_\Omega f_{n}(\partial \Omega):= \{ K_\Omega f_{n}(\zeta): \zeta \in \partial \Omega \} \end{align*} $$

are contained in a closed disk of radius |$1$| centred at the origin. For large |$n$|, this observation and (33) force there to be points of the image of |$K_\Omega f_{n}$| outside of any disk centred at the origin of radius strictly less than |$1$|. By exchanging |$f_{n}$| for a unimodular multiple of itself, we may thus assume that there exists a sequence of points |$(\zeta _{n})$| in |$\partial \Omega $| for which we have

$$ \begin{align}& \lim_{n \to \infty} K_\Omega f_{n}(\zeta_{n}) = \lim_{n \to \infty} \int_{\partial \Omega} f_{n} \, \textrm{d}\mu_{\zeta_{n}} = 1.\end{align} $$

(34)

Using that the functions |$f_{n}$| are bounded by |$1$| in modulus, and the positive measures |$d\mu _\zeta $| are of unit mass, we obtain

$$ \begin{align*} \lim_{n \to \infty} \int_{\partial \Omega} |f_{n} - 1|^{2} \textrm{d}\mu_{\zeta_{n}} &= \lim_{n \to \infty} \int_{\partial \Omega} \big(|f_{n}|^{2} - 2 \textrm{Re} f_{n} + 1 \big) \textrm{d}\mu_{\zeta_{n}} \\ &\leq \lim_{n \to \infty} \Big( 2 - 2\textrm{Re} \int_{\partial \Omega} f_{n} \, \textrm{d}\mu_{\zeta_{n}} \Big) = 0. \end{align*} $$

Recall from (3) that |$\rho _{\zeta _{n}}$| denotes the |$ds$|-absolutely continuous part of |$\mu _{\zeta _{n}}$|⁠. The above computation implies that

$$ \begin{align}& \lim_{n \to \infty} \int_{\partial \Omega} |f_{n} - 1|^{2} \rho_{\zeta_{n}} \textrm{d}s = 0.\end{align} $$

(35)

Compactness of the boundary |$\partial \Omega $| implies that we may assume convergence of the sequence |$(\zeta _{n})$| to some point |$\zeta \in \partial \Omega $|. The following lemma shows that we may replace in (35) the densities |$\rho _{\zeta _{n}}$| with the density |$\rho _{\zeta }$|.

Lemma 18.

With notations as above, we have

$$ \begin{align}& \lim_{n \to \infty} \int_{\partial \Omega} |f_{n} - 1|^{2} \rho_{\zeta} \textrm{d}s = 0.\end{align} $$

(36)

Consequently, after passing to a subsequence, we can ensure that

$$ \begin{align*} & \lim_{n \to \infty} f_{n}(\sigma) = 1\end{align*} $$

for almost every |$\sigma \in \partial \Omega $| with respect to the measure |$\rho _\zeta d s$|⁠.

Proof.

Note that whenever |$\sigma $| is not a corner of |$\partial \Omega $| or any of the points |$\zeta _{n}$| or |$\zeta $|⁠, we have

$$ \begin{align*} & \rho_{\zeta_{n}}(\sigma) - \rho_\zeta(\sigma) = \textrm{Re} \frac{(\zeta_{n} - \zeta) N(\sigma)}{\pi(\sigma-\zeta)(\sigma-\zeta_{n})}.\end{align*} $$

If |$B = B(\zeta , \delta )$| is a disk around |$\zeta $| of small radius |$\delta> 0$|⁠, then for large enough |$n$| the denominator on the right-hand side above is uniformly bounded from below for |$\sigma \in \partial \Omega \setminus B$|⁠, with exception of a countable set. This shows uniform convergence of |$\rho _{\zeta _{n}}(\sigma )$| to |$\rho _\zeta (\sigma )$| for |$\sigma \in \partial \Omega \setminus B$|⁠, again with exception of an at most countable set. Since |$|f_{n} - 1|^{2} \leq 4$|⁠, we obtain from (35) that

$$ \begin{align*} \limsup_{n \to \infty} \int_{\partial \Omega} |f_{n} - 1|^{2} \rho_{\zeta} \textrm{d}s & \leq \limsup_{n \to \infty} \int_{\partial \Omega \cap B} |f_{n} - 1|^{2} \rho_{\zeta} \textrm{d}s \\ &+ \limsup_{n \to \infty} \int_{\partial \Omega \setminus B} |f_{n} - 1|^{2} \rho_{\zeta} \textrm{d}s \\ &\leq 4 \int_{\partial \Omega \cap B} \rho_\zeta \textrm{d}s. \end{align*} $$

Since |$\partial \Omega \cap B$| is an arc of length that tends to |$0$| as the radius |$\delta $| of |$B$| tends to |$0$|⁠, the last quantity above can be made arbitrarily small by choosing |$\delta $| small enough. This establishes (36). Basic measure theory now implies that we may pass again to a subsequence and ensure the pointwise convergence |$f_{n} \to 1$| almost everywhere with respect to |$\rho _\zeta d s$|⁠.

Our next observation extracts more information from (33). Consider the strips

$$ \begin{align*} &S_\delta = \{ z = re^{it}: 1-\delta < r \leq 1, |t| \in [\pi/4, \pi] \}, \quad \delta> 0. \end{align*} $$

These strips have a fixed large “length” but shrinking “width”. One such strip is marked in Figure 5. We claim that each one of the strips |$S_\delta $| intersects the images |$K_\Omega f_{n}(\partial \Omega )$| non-trivially for infinitely many indices |$n$|. For if not, then for some fixed |$\delta> 0$|, we would have that |$S_\delta \cap K_\Omega f_{n}(\partial \Omega )= \emptyset $| for all sufficiently large |$n$|, which means that the images |$K_\Omega f_{n}(\partial \Omega )$| are entirely contained in |$B(0,1) \setminus S_\delta $|, where |$B(0,1)$| denotes the closed disk of radius |$1$| centred at the origin. But if |$\epsilon _{1}$| and |$\epsilon _{2}$| are sufficiently small positive numbers, then |$B(0,1) \setminus S_\delta \subset B(\epsilon _{1}, 1-\epsilon _{2})$|, a disk of radius |$1-\epsilon _{2}$| centred at the point |$\epsilon _{1} \in{\mathbb{R}}$|. See Figure 5. Recalling the geometric interpretation of the norm |$\| K_\Omega f_{n} + {\mathbb{C}}\mathbf{1}\|_{\partial \Omega }$| as the radius of the smallest disk containing the image of |$K_\Omega f_{n}$|, we would arrive at a contradiction to (33). Thus every strip |$S_\delta $| contains points in the image of |$K_\Omega f_{n}$| for infinitely many |$n$|.

Fig. 5

The unit disk in dark grey with the strip |$S_\delta $| removed. The dotted circle containing the dark grey area has a radius slightly smaller than |$1$|⁠.
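
The containment |$B(0,1) \setminus S_\delta \subset B(\epsilon _{1}, 1-\epsilon _{2})$| used above can be sanity-checked numerically. The following Python sketch samples the closed unit disk on a polar grid and tests the inclusion at the sampled points. The values |$\delta = 0.2$|, |$\epsilon _{1} = 0.1$|, |$\epsilon _{2} = 0.05$|, the grid sizes, and the function names are illustrative assumptions, and the circle |$r = 1$| is counted as part of the strip.

```python
import cmath
import math


def in_strip(z: complex, delta: float) -> bool:
    """Membership in S_delta; the circle r = 1 is counted as part of the strip (assumption)."""
    r, t = abs(z), cmath.phase(z)  # phase lies in [-pi, pi]
    return 1 - delta < r <= 1 and abs(t) >= math.pi / 4


def remainder_inside_small_disk(delta, eps1, eps2, n_r=200, n_t=720):
    """Sample the closed unit disk on a polar grid and test whether every sampled
    point outside S_delta lies in the closed disk B(eps1, 1 - eps2)."""
    for i in range(n_r + 1):
        r = i / n_r
        for j in range(n_t):
            t = -math.pi + 2 * math.pi * j / n_t
            z = cmath.rect(r, t)
            if not in_strip(z, delta) and abs(z - eps1) > 1 - eps2:
                return False
    return True


if __name__ == "__main__":
    # Illustrative sample values, not taken from the paper.
    print(remainder_inside_small_disk(0.2, 0.1, 0.05))
    # Without the shift of the centre the inclusion cannot hold.
    print(remainder_inside_small_disk(0.2, 0.0, 0.05))
```

For these sample values the sampled inclusion holds, while it fails when the centre is not shifted (|$\epsilon _{1} = 0$|), because the point |$1$| lies outside every strip. A sampled check of this kind is only a plausibility test, not a proof.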

Lemma 19.

With notations as above, we may pass to a subsequence again, and obtain a new sequence |$(\zeta _{n}^{\prime})$| that converges to some point |$\zeta ^{\prime} \in \partial \Omega $|⁠, and such that

$$ \begin{align*} & \lim_{n \to \infty} f_{n}(\sigma) = \alpha\end{align*} $$

for some unimodular constant |$\alpha \neq 1$| and for almost every |$\sigma \in \partial \Omega $| with respect to the measure |$\rho _{\zeta ^{\prime}} d s$|⁠.

Proof.

Since each strip |$S_\delta $| intersects the images of |$K_\Omega f_{n}$| for infinitely many |$n$|⁠, passing to a subsequence and a routine compactness argument produces a sequence |$(\zeta ^{\prime}_{n})$| convergent to some |$\zeta ^{\prime} \in \partial \Omega $|⁠, for which |$K_\Omega f_{n}(\zeta ^{\prime}_{n}) \to \alpha $|⁠, with |$\alpha $| unimodular and lying in the closure of each of the strips |$S_\delta $|⁠. Thus |$\alpha \neq 1$|⁠. We therefore merely need to repeat the previous arguments to see that, after passing to a subsequence, we will have |$f_{n}(\sigma ) \to \alpha $| for almost every |$\sigma $| with respect to the measure |$\rho _{\zeta ^{\prime}} d s$|⁠.

4.2 Proof of Theorem 2

The above arguments are valid for |$f_{n} \in C(\partial \Omega )$|⁠. However, under the assumption of analyticity, the sequence |$(f_{n})$| cannot converge to two different constants on two different sets of positive arclength measure. To make this statement precise, we appeal to the classical theory of analytic functions in the (open) unit disk |${\mathbb{D}} = \{ z \in{\mathbb{C}}: |z| < 1 \}$|⁠. Here [8, Chapter II] is an excellent reference for the claims made in the following proof.

Proof of Theorem 2.

Let |$H^\infty = H^\infty ({\mathbb{D}})$| be the space of bounded analytic functions in |${\mathbb{D}}$|, identified as usual through boundary function correspondence with a weak-star closed subspace of the space |$L^\infty (\partial{\mathbb{D}}) = (L^{1}(\partial{\mathbb{D}}))^{*}$| of bounded measurable functions on |$\partial{\mathbb{D}}$|, the dual of the Lebesgue space |$L^{1}(\partial{\mathbb{D}})$| of functions integrable on |$\partial{\mathbb{D}}$| with respect to the Lebesgue measure (arclength measure) on |$\partial{\mathbb{D}}$|. It is well known that a function |$\tilde{f} \in H^\infty $| that vanishes on a subset of positive Lebesgue measure on |$\partial{\mathbb{D}}$| must vanish identically.

Fix some conformal mapping |$\phi : {\mathbb{D}} \to \Omega $|⁠. Under the assumption that |$f_{n} \in \mathcal{A}(\Omega )$|⁠, |$\|f_{n}\|_\Omega \leq 1$|⁠, the functions

$$ \begin{align*} & \tilde{f_{n}}:= f_{n} \circ \phi \in H^\infty, \quad n \geq 1 \end{align*} $$

are bounded in modulus by |$1$| in |${\mathbb{D}}$|⁠. By Carathéodory’s classical theorem (see, for instance, [9, Chapter I.3]), |$\phi $| extends to a homeomorphism between |$\partial{\mathbb{D}}$| and |$\partial \Omega $|⁠. If |$\|K_\Omega f_{n} + {\mathbb{C}} \mathbf{1}\|_\Omega \to 1$|⁠, then Lemmas 18 and 19 show that there exist two sets |$E, E^{\prime} \subset \partial \Omega $| that have positive arclength measure, such that

$$ \begin{align*} & \lim_{n \to \infty} \tilde{f_{n}}(\lambda) = 1, \quad \lambda \in \phi^{-1}(E) \end{align*} $$

and

$$ \begin{align*} & \lim_{n \to \infty} \tilde{f_{n}}(\lambda) = \alpha, \quad \lambda \in \phi^{-1}(E^{\prime}). \end{align*} $$

Since |$\Omega $| is convex, the curve |$\partial \Omega $| is rectifiable, and general theory of harmonic measures tells us that the sets |$\phi ^{-1}(E)$| and |$\phi ^{-1}(E^{\prime})$| have positive Lebesgue measure (see [9, Chapter VI]). Since |$L^{1}(\partial{\mathbb{D}})$| is separable and the functions |$\tilde{f_{n}}$| are uniformly bounded by |$1$| in modulus, the usual Helly-type selection process will produce a subsequence of |$(\tilde{f}_{n})$|⁠, which converges in the weak-star topology to some function |$\widetilde{f} \in H^\infty $|⁠. By the above pointwise convergence, we must have |$\tilde{f} \equiv 1$| on |$\phi ^{-1}(E)$| and |$\tilde{f} \equiv \alpha $| on |$\phi ^{-1}(E^{\prime})$|⁠. Then the non-zero function |$\widetilde{f} - 1$| vanishes on the subset |$\phi ^{-1}(E)$| of positive Lebesgue measure on |$\partial{\mathbb{D}}$|⁠. This is a contradiction, which shows that our assumption |$\|K_\Omega f_{n} + {\mathbb{C}} \mathbf{1}\|_\Omega \to 1$| must be false. Theorem 2 follows.

4.3 A proof of Neumann’s lemma

Lemma 20.

Fix |$\zeta \in \partial \Omega $|⁠. Any |$\sigma \in \partial \Omega \setminus \{\zeta \}$| that is not a corner of |$\partial \Omega $| and that satisfies |$\rho _\zeta (\sigma ) = 0$| is contained in the union of at most two line segments of |$\partial \Omega $| containing |$\zeta $|⁠.

Proof.

It will suffice to show that all |$\sigma $| satisfying the above conditions are contained in at most two different tangent lines to |$\Omega $|⁠. To see this, recall formula (5). The condition |$\rho _\zeta (\sigma ) = (2 \pi R_{\zeta ,\sigma })^{-1} = 0$| gives |$R_{\zeta , \sigma } = \infty $|⁠, and so |$\zeta $| is contained in the tangent line to |$\Omega $| at |$\sigma $|⁠. The tangent line divides the plane |${\mathbb{C}}$| into two half-planes, one of which contains |$\Omega $|⁠. Assume that two different tangent lines, at |$\sigma $| and |$\sigma ^{\prime}$|⁠, intersect at |$\zeta $|⁠. They divide the plane |${\mathbb{C}}$| into four sectors, and by convexity precisely one of those sectors contains |$\Omega $|⁠. Now, any line that passes through |$\zeta $| and the open sector containing |$\Omega $| must separate |$\sigma , \sigma ^{\prime} \in \partial \Omega $|⁠. Therefore, it is not a tangent to |$\Omega $|⁠.

Neumann’s lemma is established as follows. Assume that |$c(\Omega ) = 1$|⁠. From Lemmas 18 and 19 we see that two points |$\zeta , \zeta ^{\prime}$| exist for which the measures |$\rho _\zeta ds$| and |$\rho _{\zeta ^{\prime}} ds$| are mutually singular. From Lemma 20 we deduce that the support of |$\rho _\zeta ds$| is the union of at most two line segments containing |$\zeta ^{\prime}$|⁠, and the complement of the support of |$\rho _\zeta ds$| is also a union of at most two line segments. Thus |$\partial \Omega $| is the union of at most four line segments.

5 Examples

In this section, we compute and estimate the configuration constants for some types of domains.

5.1 Configuration constant of an ellipse

For |$a,b>0$|⁠, let

$$ \begin{align*} & \Omega_{a,b}:=\Bigl\{x+iy\in{\mathbb{C}}: \frac{x^{2}}{a^{2}}+\frac{y^{2}}{b^{2}}\le1\Bigr\} \end{align*} $$

be the ellipse centred at the origin with semi-axes of lengths |$a$| and |$b$|⁠, respectively. It is quite remarkable that the configuration constant can in this case be computed explicitly.

Proposition 21.

With the above notation, we have

$$ \begin{align*} & c(\Omega_{a,b})=\frac{2}{\pi}\arctan\Bigl(\frac{1}{2}\Bigl|\frac{b}{a}-\frac{a}{b}\Bigr|\Bigr). \end{align*} $$

In order to prove the proposition, our first step is to derive an expression for the density of the Neumann–Poincaré kernel of |$\Omega _{a,b}$|⁠. The boundary |$\partial \Omega _{a,b}$| is parametrized by

$$ \begin{align}& \gamma(t):=a\cos t+ib\sin t, \quad t\in[0,2\pi].\end{align} $$

(37)

In terms of this parametrization, the Neumann–Poincaré kernel takes the form

$$ \begin{align} d \mu_{\gamma(s)}(\gamma(t)) &= \rho_{\gamma(s)}(\gamma(t)) \, ds(\gamma(t)) \\ &= \frac{1}{\pi} \textrm{Im} \Bigg(\frac{T(\gamma(t))}{\gamma(t) - \gamma(s)} \Bigg) |\gamma^{\prime}(t)| \, dt \nonumber \\ &= \frac{1}{\pi}\textrm{Im}\frac{\gamma^{\prime}(t)}{\gamma(t)-\gamma(s)} \, dt. \nonumber\end{align} $$

(38)

Using (37), this formula can be greatly simplified.

Lemma 22.

With the notation above, we have

$$ \begin{align}& d\mu_{\gamma(s)}(\gamma(t))=\frac{1}{2\pi}\frac{A}{1+B\cos(t+s)}\,dt, \quad s,t\in[0,2\pi],\end{align} $$

(39)

where

$$ \begin{align*} & A:=\frac{2ab}{a^{2}+b^{2}} \quad\textrm{and}\quad B:=\frac{b^{2}-a^{2}}{b^{2}+a^{2}}. \end{align*} $$

The lemma is established by combining (37) and (38), and then using elementary trigonometric identities to simplify the resulting expression.
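
The simplification from (38) to (39) can be checked numerically. The Python sketch below evaluates both expressions on a grid of parameter pairs with |$t \neq s$| and reports the largest discrepancy. The function names, the sample semi-axes and the grid are illustrative assumptions.

```python
import math


def kernel_from_38(a, b, s, t):
    """Density (1/pi) * Im( gamma'(t) / (gamma(t) - gamma(s)) ) from (38)."""
    def gamma(u):
        return complex(a * math.cos(u), b * math.sin(u))

    dgamma = complex(-a * math.sin(t), b * math.cos(t))
    return (dgamma / (gamma(t) - gamma(s))).imag / math.pi


def kernel_from_39(a, b, s, t):
    """Closed form (1/(2 pi)) * A / (1 + B cos(t + s)) from (39)."""
    A = 2 * a * b / (a * a + b * b)
    B = (b * b - a * a) / (b * b + a * a)
    return A / (2 * math.pi * (1 + B * math.cos(t + s)))


def max_discrepancy(a, b, n=60):
    """Largest absolute gap between the two formulas on an offset grid,
    chosen so that t - s is never a multiple of 2*pi."""
    worst = 0.0
    for i in range(n):
        for j in range(n):
            s = 2 * math.pi * (i + 0.25) / n
            t = 2 * math.pi * (j + 0.75) / n
            gap = abs(kernel_from_38(a, b, s, t) - kernel_from_39(a, b, s, t))
            worst = max(worst, gap)
    return worst


if __name__ == "__main__":
    # Sample semi-axes chosen for illustration only.
    print(max_discrepancy(1.0, 2.0))
    print(max_discrepancy(3.0, 1.0))
```

The discrepancy stays at the level of floating-point rounding, which is consistent with Lemma 22.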

With this formula in hand, we now evaluate the configuration constant of the ellipse |$\Omega _{a,b}$|⁠.

Proof of Proposition 21.

Using the formulas (8) and (39), we obtain

$$ \begin{align*} & c(\Omega_{a,b}) =\sup_{s_{1},s_{2}\in[0,2\pi]}\frac{1}{2}\frac{1}{2\pi}\int_{[-\pi, \pi]}\Bigl|\frac{A}{1+B\cos(t+s_{1})}-\frac{A}{1+B\cos(t+s_{2})}\Bigr|\, \textrm{d}t. \end{align*} $$

By the periodicity of |$\cos $|⁠, this last expression simplifies to

$$ \begin{align*} & c(\Omega_{a,b}) =\sup_{s\in(0,2\pi)}\frac{1}{2}\frac{1}{2\pi}\int_{[-\pi,\pi]}\Bigl|\frac{A}{1+B\cos(t+s)}-\frac{A}{1+B\cos(t)}\Bigr|\, \textrm{d}t. \end{align*} $$

We readily verify that |$\cos (t) \geq \cos (t+s)$| if and only if |$t\in [-s/2,\,\pi -s/2]$|⁠. Therefore

$$ \begin{align*} &\frac{1}{2}\frac{1}{2\pi}\int_{[-\pi,\pi]}\Bigl|\frac{A}{1+B\cos(t+s)}-\frac{A}{1+B\cos(t)}\Bigr|\, \textrm{d}t\\ &=\frac{1}{2\pi}\int_{-s/2}^{\pi-s/2}\Bigl(\frac{A}{1+B\cos(t+s)}-\frac{A}{1+B\cos(t)}\Bigr)\, \textrm{d}t\\ &=\frac{1}{2\pi}\int_{s/2}^{\pi+s/2}\frac{A}{1+B\cos(t)}\, \textrm{d}t-\frac{1}{2\pi}\int_{-s/2}^{\pi-s/2}\frac{A}{1+B\cos(t)}\, \textrm{d}t\\ &=\frac{1}{2\pi}\int_{\pi-s/2}^{\pi+s/2}\frac{A}{1+B\cos(t)}\, \textrm{d}t-\frac{1}{2\pi}\int_{-s/2}^{s/2}\frac{A}{1+B\cos(t)}\, \textrm{d}t\\ &=\frac{1}{2\pi}\int_{-s/2}^{s/2}\frac{A}{1-B\cos(t)}\, \textrm{d}t-\frac{1}{2\pi}\int_{-s/2}^{s/2}\frac{A}{1+B\cos(t)}\, \textrm{d}t\\ &=\frac{1}{2\pi}\int_{-s/2}^{s/2}\frac{2AB\cos(t)}{1-B^{2}\cos^{2}(t)}\, \textrm{d}t. \end{align*} $$

In the case |$b \ge a$| we have |$B \ge 0$|, so the last integrand is non-negative for |$|t| \le \pi /2$| and non-positive for |$\pi /2 \le |t| \le \pi $|. The supremum over |$s$| is therefore attained at |$s = \pi $|, which gives

$$ \begin{align*} & c(\Omega_{a,b})=\frac{1}{2\pi}\int_{-\pi/2}^{\pi/2} \frac{2AB\cos(t)}{1-B^{2}\cos^{2}(t)}\, \textrm{d}t. \end{align*} $$

All that remains is to evaluate the integral. Making the substitution |$x=\sin t$|⁠, and exploiting the fact that |$A^{2}+B^{2}=1$|⁠, we have

$$ \begin{align*} \frac{1}{2\pi}\int_{-\pi/2}^{\pi/2} \frac{2AB\cos(t)}{1-B^{2}\cos^{2}(t)}\, \textrm{d}t &=\frac{1}{2\pi}\int_{-1}^{1} \frac{2AB}{1-B^{2}(1-x^{2})}\, \textrm{d}x\\ &=\frac{1}{\pi}\int_{-1}^{1} \frac{AB}{A^{2}+B^{2}x^{2}}\, \textrm{d}x\\ &=\frac{2}{\pi}\arctan\Bigl(\frac{B}{A}\Bigr)\\ &=\frac{2}{\pi}\arctan\Bigl(\frac{1}{2}\Bigl(\frac{b}{a}-\frac{a}{b}\Bigr)\Bigr). \end{align*} $$

This proves the result in the case when |$b\ge a$|⁠. The remaining case is obtained by exchanging the roles of |$a$| and |$b$|⁠.
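
As a numerical cross-check of Proposition 21, the following Python sketch approximates half the total variation distance between two of the measures in (39) by a midpoint rule, maximises over a grid of shifts |$s$|, and compares the result with the closed form. The function names, the sample semi-axes and the quadrature parameters are assumptions made for illustration.

```python
import math


def closed_form(a, b):
    """Value of c(Omega_{a,b}) stated in Proposition 21."""
    return (2 / math.pi) * math.atan(0.5 * abs(b / a - a / b))


def half_l1_distance(a, b, s, n=4000):
    """Midpoint-rule approximation of (1/2) * || mu_{gamma(s)} - mu_{gamma(0)} ||,
    computed from the density (39)."""
    A = 2 * a * b / (a * a + b * b)
    B = (b * b - a * a) / (b * b + a * a)
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = -math.pi + (k + 0.5) * h
        total += abs(A / (1 + B * math.cos(t + s)) - A / (1 + B * math.cos(t)))
    return 0.5 * total * h / (2 * math.pi)


def numeric_sup(a, b, n_s=180):
    """Maximise the half-distance over a grid of shifts s in (0, 2*pi)."""
    return max(half_l1_distance(a, b, 2 * math.pi * k / n_s) for k in range(1, n_s))


if __name__ == "__main__":
    a, b = 1.0, 2.0  # illustrative semi-axes
    print(closed_form(a, b), numeric_sup(a, b), half_l1_distance(a, b, math.pi))
```

The grid of shifts contains |$s = \pi $|, where the proof locates the supremum, so the numerical maximum should agree with the closed form up to quadrature error.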

5.2 Integral estimates

Assume that we find a Borel measure |$\nu $| on |$\partial \Omega $| such that

$$ \begin{align*} & k^\nu_\Omega:= \sup\{\|\mu_\zeta-\nu\|:\zeta\in\partial\Omega\} < 1. \end{align*} $$

If so, then, for every |$\phi \in C(\partial \Omega )$| with |$\|\phi \|_{\partial \Omega }\le 1$|⁠, we have

$$ \begin{align*} & \Bigl|K_\Omega\phi(\zeta)-\int_{\partial\Omega}\phi\, \textrm{d}\nu\Bigr|\le k^\nu_\Omega, \quad \zeta\in\partial\Omega, \end{align*} $$

which shows that the image of |$K_\Omega \phi $| is contained in a disk of radius |$k^\nu _\Omega $| centred at |$\int _{\partial \Omega } \phi \, \textrm{d}\nu $|⁠. Thus,

$$ \begin{equation*} c_{\mathbb{R}}(\Omega) = c_{\mathbb{C}}(\Omega) = \|K_\Omega:C(\partial\Omega)/{\mathbb{C}} \mathbf{1} \to C(\partial\Omega)/{\mathbb{C}} \mathbf{1}\|\le k^\nu_\Omega. \end{equation*} $$

If, moreover, |$\nu $| is a positive measure satisfying |$\mu _\zeta \geq \nu $| for every |$\zeta \in \partial \Omega $|, then each |$\mu _\zeta - \nu $| is a positive measure, so that

$$ \begin{align*} &\| \mu_\zeta - \nu\| = (\mu_\zeta - \nu)(\partial \Omega) = 1 - \nu(\partial \Omega),\end{align*} $$

and so |$k^\nu _\Omega = 1- \nu (\partial \Omega )$|⁠.

Fig. 6

A domain |$\Omega $| with two circles corresponding to values |$R_\Omega (\sigma )$| and |$R_\Omega (\sigma ^{\prime})$|

Lemma 23.

The function |$R_\Omega :\partial \Omega \to (0,\infty ]$| is lower semicontinuous. In particular, it is Borel measurable.

Proof.

Let |$\sigma \in \partial \Omega $| and let |$(\sigma _{n})$| be a sequence in |$\partial \Omega $| such that |$\sigma _{n}\to \sigma $|⁠. We need to show that |$\liminf _{n\to \infty }R_\Omega (\sigma _{n})\ge R_\Omega (\sigma )$|⁠. We can suppose that |$L:=\liminf _{n\to \infty }R_\Omega (\sigma _{n})<\infty $|⁠, otherwise there is nothing to prove. Let |$L^{\prime}>L$|⁠. Then, replacing |$(\sigma _{n})$| by a subsequence, we can suppose that |$R_\Omega (\sigma _{n})<L^{\prime}$| for all |$n$|⁠. Thus, for each |$n$|⁠, there exists a closed disk |$\Delta _{n}$| of radius |$L^{\prime}$| such that |$\Omega \subset \Delta _{n}$| and |$\sigma _{n}\in \partial \Delta _{n}$|⁠. The sequence of centres |$(c_{n})$| of the disks |$\Delta _{n}$| is bounded, so there exists a convergent subsequence |$c_{n_{j}}\to c$|⁠. Let |$\Delta $| be the closed disk with centre |$c$| and radius |$L^{\prime}$|⁠. Then we have |$\Omega \subset \Delta $| and |$\sigma \in \partial \Delta $|⁠. It follows that |$R_\Omega (\sigma )\le L^{\prime}$|⁠. As this last inequality holds for all |$L^{\prime}>L$|⁠, we deduce that |$R_\Omega (\sigma )\le L$|⁠. This completes the proof.

We set

$$ \begin{align}& d\nu:= \frac{ds}{2 \pi R_\Omega}.\end{align} $$

(40)

By Lemma 23 this defines a positive Borel measure on |$\partial \Omega $|, and for every |$\zeta \in \partial \Omega $| we have

$$ \begin{align}& \mu_\zeta \geq \nu.\end{align} $$

(41)

To see this, note that if |$\sigma $| is not a corner and |$R_{\zeta ,\sigma }$| is the radius of the unique circle tangent to |$\partial \Omega $| at |$\sigma $| and passing through |$\zeta $|⁠, then |$R_\Omega (\sigma ) \geq R_{\zeta , \sigma }$|⁠. Therefore, according to (5),

$$ \begin{align*} & \frac{1}{2 \pi R_\Omega(\sigma)} \leq \frac{1}{2 \pi R_{\zeta, \sigma}} = \rho_\zeta(\sigma) \end{align*} $$

for almost every |$\sigma $| with respect to arclength measure on |$\partial \Omega $|⁠. Inequality (41) follows. Although we shall skip a formal proof, we mention also that |$\nu $| is in fact the largest measure satisfying |$\mu _\zeta \geq \nu $| for all |$\zeta \in \partial \Omega $|⁠. This maximality property of |$\nu $| is to be interpreted in the following sense: if |$\nu ^{\prime}$| is any measure satisfying |$\mu _\zeta \geq \nu ^{\prime}$| for all |$\zeta $|⁠, then |$\nu \geq \nu ^{\prime}$|⁠.

Fig. 7

A quadrilateral domain |$\Omega _\epsilon $| with |$a(\Omega _\epsilon )> 1 - (4/\pi )\epsilon $|⁠.

By our earlier discussion, we obtain the following upper estimate for the configuration constant:

$$ \begin{align*} & c(\Omega) \leq 1-\frac{1}{2\pi}\int_{\partial\Omega}\frac{ \textrm{d}s}{R_\Omega}. \end{align*} $$

Note that this is precisely the assertion of Theorem 6 stated in Section 1.

We will now mention some consequences. Recall that if |$\gamma $| is a plane curve of class |$C^{2}$|⁠, then the radius of curvature of |$\gamma $| is the reciprocal of its curvature.

Corollary 24.

If |$\Omega $| has a |$C^{2}$|-boundary of length |$L$|⁠, whose radius of curvature is everywhere at most |$\rho $|⁠, then

$$ \begin{align*} &c(\Omega) \leq 1-\frac{L}{2\pi\rho}.\end{align*} $$

Proof.

In this case, one sees from (15) and (16) that |$R_\Omega (\sigma )\le \rho $| for all |$\sigma \in \partial \Omega $|⁠, from which the result follows.

This last result was already known. See, for example, [7, pp. 45–46] and [12, pp. 128–129]. However, the proofs in these references are quite different from the one above.
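
To see how Corollary 24 compares with an exact value, one can take |$\Omega $| to be the ellipse |$\Omega _{a,b}$| of Section 5.1, whose configuration constant is given by Proposition 21. The Python sketch below is a minimal illustration under stated assumptions: the perimeter is approximated by a midpoint rule, the largest radius of curvature of an ellipse is taken to be |$\max (a,b)^{2}/\min (a,b)$| (a standard fact, not derived in the text), and all function names are hypothetical.

```python
import math


def ellipse_perimeter(a, b, n=20000):
    """Midpoint-rule length of gamma(t) = a cos t + i b sin t, t in [0, 2*pi]."""
    h = 2 * math.pi / n
    return h * sum(
        math.hypot(a * math.sin((k + 0.5) * h), b * math.cos((k + 0.5) * h))
        for k in range(n)
    )


def max_radius_of_curvature(a, b):
    """Largest radius of curvature of the ellipse, attained at an end of the shorter axis."""
    return max(a, b) ** 2 / min(a, b)


def corollary_24_bound(a, b):
    """Upper bound 1 - L / (2 pi rho) from Corollary 24."""
    return 1 - ellipse_perimeter(a, b) / (2 * math.pi * max_radius_of_curvature(a, b))


def exact_constant(a, b):
    """Exact value of c(Omega_{a,b}) from Proposition 21."""
    return (2 / math.pi) * math.atan(0.5 * abs(b / a - a / b))


if __name__ == "__main__":
    for a, b in [(1.0, 1.0), (1.0, 2.0), (1.0, 10.0)]:  # illustrative semi-axes
        print(a, b, exact_constant(a, b), corollary_24_bound(a, b))
```

For a disk both quantities vanish, and for the other sample ellipses the exact constant lies strictly below the curvature bound, as the corollary requires.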

Corollary 25.

Consider a convex circular sector

$$ \begin{align*} & \Omega = \{z\in{\mathbb{C}}:0\le |z|\le r,\, 0\le\arg(z)\le\theta\}, \end{align*} $$

where |$r>0$| and |$0<\theta \le \pi $|⁠. Then

$$ \begin{align*} & c(\Omega) \leq 1-\frac{\theta}{2\pi}. \end{align*} $$

Proof.

Here |$R_\Omega = r$| on the circular arc, which has length |$r\theta $|, while |$R_\Omega = \infty $| on the two radial line segments. Hence

$$ \begin{align*} & \frac{1}{2\pi}\int_{\partial\Omega}\frac{ \textrm{d}s}{R_\Omega} =\frac{1}{2\pi}\frac{r\theta}{r}=\frac{\theta}{2\pi}. \end{align*} $$

The result now follows from Theorem 6.

5.3 Analytic configuration constants of quadrilaterals

Proposition 26.

For |$\epsilon>0$|⁠, let |$\Omega _\epsilon $| be the convex hull of |$\{\pm 1,\,\pm \epsilon i\}$|⁠. Then

$$ \begin{align*} & a(\Omega_\epsilon)\ge 1-(4/\pi)\epsilon. \end{align*} $$

Proof.

Let |$f$| be a conformal mapping of the interior of |$\Omega _\epsilon $| onto the unit disk |${\mathbb{D}}$|. By Carathéodory’s theorem, |$f$| extends to a homeomorphism of |$\Omega _\epsilon $| onto |$\overline{{\mathbb{D}}}$|, and so clearly |$f\in \mathcal{A}(\Omega _\epsilon )$|. Post-composing with a suitable automorphism of |$\overline{{\mathbb{D}}}$|, we may further suppose that |$f(1)=1$| and |$f(-1)=-1$|.

Consider |$\zeta =1$|⁠. Recalling (3), we have |$\mu _{1}=(1-\theta _{1}/\pi )\delta _{1}+(\theta _{1}/\pi )\nu $|⁠, where |$\theta _{1}$| is the angle of the aperture of |$\partial \Omega _\epsilon $| at |$1$|⁠, and |$\nu $| is a probability measure on |$\partial \Omega _\epsilon \setminus \{1\}$|⁠. It follows that

$$ \begin{align*} \textrm{Re} (K_{\Omega_\epsilon}f)(1) &= \int_{\partial\Omega_\epsilon}(\textrm{Re} f)\, \textrm{d}\mu_{1}\\ &=(1-\theta_{1}/\pi)\textrm{Re} f(1)+(\theta_{1}/\pi)\int_{\partial\Omega_\epsilon\setminus\{1\}}(\textrm{Re} f)\, \textrm{d}\nu\\ &\ge (1-\theta_{1}/\pi)(1)+(\theta_{1}/\pi)(-1)\\ &=1-2\theta_{1}/\pi.\end{align*} $$

Likewise

$$ \begin{align*}\textrm{Re} (K_{\Omega_\epsilon}f)(-1)&\le -(1-2\theta_{1}/\pi). \end{align*} $$

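It follows that the diameter of |$(K_{\Omega _\epsilon }f)(\partial \Omega _\epsilon )$| is at least |$2(1-2\theta _{1}/\pi )$|. Since |$\|f + {\mathbb{C}} \mathbf{1}\|_{\Omega _\epsilon } = 1$|, we conclude that |$a(\Omega _\epsilon )\ge 1-2\theta _{1}/\pi $|.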

Finally, by trigonometry, |$\theta _{1}$| is related to |$\epsilon $| by |$\tan (\theta _{1}/2)=\epsilon $|⁠, whence |$\theta _{1}=2\arctan \epsilon \le 2\epsilon $|⁠. The result follows.
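
The last step of the proof replaces the bound |$1 - 2\theta _{1}/\pi $| by the simpler |$1 - (4/\pi )\epsilon $|. The short Python sketch below tabulates both for a few values of |$\epsilon $|, to indicate how much is lost in the simplification. The function names and the sample values of |$\epsilon $| are illustrative assumptions.

```python
import math


def aperture_angle(eps):
    """Interior angle theta_1 at the vertex 1, determined by tan(theta_1 / 2) = eps."""
    return 2 * math.atan(eps)


def lower_bound_from_proof(eps):
    """The bound 1 - 2*theta_1/pi obtained in the proof of Proposition 26."""
    return 1 - 2 * aperture_angle(eps) / math.pi


def lower_bound_stated(eps):
    """The simplified bound 1 - (4/pi)*eps stated in Proposition 26."""
    return 1 - 4 * eps / math.pi


if __name__ == "__main__":
    for eps in (0.01, 0.1, 0.5, 1.0):  # illustrative values
        print(eps, lower_bound_from_proof(eps), lower_bound_stated(eps))
```

For small |$\epsilon $| the two bounds are nearly identical, while for |$\epsilon $| of order one the stated bound becomes negative and hence carries no information.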

5.4 Configuration constants equal to zero

Proposition 27.

Let |$\Omega $| be a compact convex domain with non-empty interior. The following are equivalent:

(i) |$\Omega $| is a disk,
(ii) |$c(\Omega ) = 0$|,
(iii) |$a(\Omega ) = 0$|.

Proof.

If |$\Omega $| is a disk, then (5) readily implies that |$\rho _\zeta (\sigma )$| is a constant independent of |$\zeta $|, and so for every |$\zeta \in \partial \Omega $|, the measure |$\mu _\zeta $| is the normalized arclength measure on the circular boundary |$\partial \Omega $|. It then follows from the definition that |$K_\Omega f$| is a constant function, and consequently |$\|K_\Omega f + {\mathbb{C}} \mathbf{1}\|_{\partial \Omega } = 0$|, so |$c(\Omega ) = a(\Omega ) = 0$|. This shows that the implication |$(i) \Rightarrow (ii)$| holds, and |$(ii) \Rightarrow (iii)$| is immediate from the inequality |$a(\Omega ) \leq c(\Omega )$|.
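
The claim that the density is constant on a disk is easy to confirm numerically. The Python sketch below assumes that the density has the form |$\rho _\zeta (\sigma ) = \pi ^{-1} \textrm{Re} \big( N(\sigma )/(\sigma - \zeta ) \big)$|, as read off from (1) and from the identity in the proof of Lemma 18, and evaluates it on a disk of arbitrary centre and radius. The function names and the sample disk are hypothetical.

```python
import cmath
import math


def density(zeta: complex, sigma: complex, normal: complex) -> float:
    """rho_zeta(sigma) = (1/pi) * Re( N(sigma) / (sigma - zeta) ) (assumed form)."""
    return (normal / (sigma - zeta)).real / math.pi


def disk_density_samples(center: complex, radius: float, n: int = 40):
    """Evaluate the density on the boundary circle for a grid of pairs zeta != sigma."""
    values = []
    for i in range(n):
        zeta = center + radius * cmath.exp(1j * 2 * math.pi * i / n)
        for j in range(n):
            # Outer unit normal at sigma; the half-step offset keeps sigma != zeta.
            normal = cmath.exp(1j * 2 * math.pi * (j + 0.5) / n)
            sigma = center + radius * normal
            values.append(density(zeta, sigma, normal))
    return values


if __name__ == "__main__":
    samples = disk_density_samples(1 + 2j, 3.0)  # illustrative disk
    print(min(samples), max(samples), 1 / (2 * math.pi * 3.0))
```

All sampled values agree with |$1/(2\pi R)$| for a disk of radius |$R$|, which is the density of normalized arclength measure.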

It remains to prove |$(iii) \Rightarrow (i)$|. Fix a conformal mapping |$\phi : {\mathbb{D}} \to \Omega ^{o}$|, where |${\mathbb{D}}$| is the open unit disk and |$\Omega ^{o}$| is the interior of |$\Omega $|. The mapping |$\phi $| extends to a homeomorphism between |$\partial{\mathbb{D}}$| and |$\partial \Omega $|, and so it makes sense to define the probability measures |$\mu ^\phi _\zeta $| on |$\partial{\mathbb{D}}$| by the equation

$$ \begin{align*} & \mu^\phi_\zeta(E):= \mu_\zeta( \phi(E) )\end{align*} $$

where |$E$| is a Borel subset of |$\partial{\mathbb{D}}$|⁠, and |$\{\mu _\zeta \}_{\zeta \in \partial \Omega }$| is the double-layer potential of |$\Omega $|⁠. Since |$a(\Omega ) = 0$|⁠, it follows that for every |$f \in \mathcal{A}(\Omega )$| and every pair of points |$\zeta , \zeta ^{\prime} \in \partial \Omega $| we have, by the change of variables formula, that

$$ \begin{align*} 0 &= \int_{\partial \Omega} f \, \textrm{d}\mu_\zeta - \int_{\partial \Omega} f \, \textrm{d}\mu_{\zeta^{\prime}} \\ &= \int_{\partial{\mathbb{D}}} f \circ \phi \, \textrm{d} \mu^\phi_\zeta - \int_{\partial{\mathbb{D}}} f \circ \phi \, \textrm{d} \mu^\phi_{\zeta^{\prime}}. \end{align*} $$

As |$f$| varies over |$\mathcal{A}(\Omega )$|, |$f \circ \phi $| varies over |$\mathcal{A}({\mathbb{D}}):= \mathcal{A}(\overline{{\mathbb{D}}})$|, and it follows that |$\mu ^\phi _\zeta - \mu ^\phi _{\zeta ^{\prime}}$| annihilates |$\mathcal{A}({\mathbb{D}})$|. Then the theorem of the brothers Riesz (see, for instance, [8, Exercise 1, Chapter III]) implies that

$$ \begin{align*} & \mu^\phi_\zeta - \mu^\phi_{\zeta^{\prime}} = h \cdot s_{\partial{\mathbb{D}}}\end{align*} $$

where |$h$| is a function with vanishing non-positive Fourier coefficients. Note that |$h$| is real-valued, so the positive Fourier coefficients also vanish, and consequently |$h \equiv 0$|⁠. Since |$\zeta , \zeta ^{\prime}$| were arbitrary, we conclude that the hypothesis |$a(\Omega ) = 0$| implies that all the measures |$\mu _\zeta $| are equal.

6 Application to Numerical Ranges

6.1 Spectral constant estimate

Our principal motivation for the introduction of the analytic configuration constant is the following result, which was mentioned in Section 1 and which we will now prove.

Theorem 28.

Let |$T$| be a bounded linear operator on a Hilbert space |${\mathcal{H}}$|⁠, and |$W = \overline{W(T)}$| the closure of the numerical range of |$T$|⁠. If |$W$| has non-empty interior, then for every |$f \in \mathcal{A}(W)$| we have

$$ \begin{align*} & \| f(T)\| \leq \Bigl( 1 + \sqrt{1 + a(W)} \Bigr) \| f \|_{W},\end{align*} $$

where |$a(W)$| is the analytic configuration constant in (11), and |$\mathcal{A}(W)$| is the space of continuous functions on |$W$|⁠, which are analytic in the interior of |$W$|⁠.

Of course, if |$W$| has no interior, then its convexity forces it to be a line segment. In that case |$T$| is a normal operator, and the spectral theorem gives us the better estimate |$\| f(T)\| \leq \|f\|_{\sigma (T)}$|⁠, where |$f$| may be any Borel measurable function on the spectrum |$\sigma (T)$|⁠. Thus Theorem 28 implies Theorem 3. In what follows, we will assume that |$W$| has non-empty interior.

Let us make some initial remarks before going into the proof of Theorem 28. If |$\sigma (T)$| is contained in the interior of |$W$|, then |$f(T)$| is defined, as usual, through the Dunford–Riesz holomorphic functional calculus. If |$\partial W \cap \sigma (T) \neq \emptyset $|, then this definition does not work. Nevertheless, if |$f \in \mathcal{A}(W)$|, then it is a standard result of approximation theory that a sequence of analytic polynomials |$(p_{n})$| exists, which converges to |$f$| uniformly on |$W$|. In the presence of any uniform bound of the form |$\|p(T)\| \leq K \|p\|_{W}$| for polynomials |$p$|, we may then define |$f(T)$| as the limit of the sequence |$(p_{n}(T))$| in the operator norm. Such bounds are known to exist, the strongest known bound |$K \leq 1 + \sqrt{2}$| being due to Crouzeix and Palencia. Theorem 28 improves this estimate given information about the numerical range of |$T$|.
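
To get a feeling for the size of the improvement, suppose that |$W$| is the ellipse |$\Omega _{a,b}$| of Section 5.1. The Python sketch below uses only the inequality |$a(W) \leq c(W)$| together with Proposition 21, so the numbers it produces are upper bounds for the constant in Theorem 28 and need not be the sharpest estimate available for ellipses. The function names and the sample semi-axes are illustrative assumptions.

```python
import math


def ellipse_configuration_constant(a, b):
    """c(Omega_{a,b}) from Proposition 21."""
    return (2 / math.pi) * math.atan(0.5 * abs(b / a - a / b))


def functional_calculus_bound(a, b):
    """1 + sqrt(1 + c(W)), an upper bound for 1 + sqrt(1 + a(W)) because a(W) <= c(W)."""
    return 1 + math.sqrt(1 + ellipse_configuration_constant(a, b))


if __name__ == "__main__":
    crouzeix_palencia = 1 + math.sqrt(2)
    for a, b in [(1.0, 1.0), (1.0, 2.0), (1.0, 10.0)]:  # illustrative semi-axes
        print(a, b, functional_calculus_bound(a, b), crouzeix_palencia)
```

For a disk the bound equals |$2$|, and it increases towards |$1 + \sqrt{2}$| as the ellipse becomes more eccentric.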

Our proof of Theorem 28 combines the argument of Crouzeix and Palencia from [5] with ideas of Schwenninger and de Vries from [18], where bounds for various functional calculi are derived as a consequence of the existence of extremal functions and extremal vectors. Let |$U$| be an open set in the plane, and |$H^\infty (U)$| be the algebra of bounded holomorphic functions on |$U$|⁠. Given an operator |$T: {\mathcal{H}} \to{\mathcal{H}}$| with |$\sigma (T)$| contained in |$U$|⁠, it is elementary that the quantity

$$ \begin{align*} & \sup \Big\{ \|f(T)\|: f \in H^\infty(U), \|f\|_{U} \leq 1\Big\} \end{align*} $$

is finite. A normal-families argument shows that an |$f \in H^\infty (U)$| exists with |$\|f\|_{U} = 1$| for which the supremum above is attained. Any such |$f$| will be called an extremal function. If, moreover, a vector |$x \in{\mathcal{H}}$| with |$\|x\|_{\mathcal{H}} = 1$| exists for which

$$ \begin{align*} & \sup \Big\{ \|f(T)\|: f \in H^\infty(U), \|f\|_{U} \leq 1\Big\} = \|f(T)x\|_{\mathcal{H}} \end{align*} $$

then we will say that |$x$| is an extremal vector, and |$(f,x)$| is an extremal pair. Unless |$\dim{\mathcal{H}} < \infty $|⁠, an extremal vector might not exist, but we will be able to reduce the proof to the finite-dimensional case. The importance of the concept of extremal pairs |$(f,x)$| stems from the following result. We refer the reader to [1, Theorem 4.5] for a proof (see also [18, Proposition 3]).

Lemma 29.

Let |$T:{\mathcal{H}} \to{\mathcal{H}}$| be a bounded linear operator, and |$U$| be an open neighbourhood of |$\sigma (T)$|⁠. Let |$(f,x)$| be a corresponding extremal pair. If |$\|f(T)\|>1$|⁠, then |$f(T)x$| is orthogonal to |$x$| in |${\mathcal{H}}$|⁠:

$$ \begin{align*} & \langle f(T)x,x\rangle_{\mathcal{H}} =0. \end{align*} $$

The next two lemmas will reduce our task to consideration of finite-dimensional Hilbert spaces, in which extremal vectors exist, and will dispose of the problematic set |$\sigma (T) \cap \partial W$|⁠. The first observation is essentially contained in [18, Proposition 9].

Lemma 30.

Let |$\Omega $| be a compact convex domain with non-empty interior. If for some |$K> 0$| the estimate

$$ \begin{align*} & \|p(T)\| \leq K \| p \|_\Omega\end{align*} $$

holds for every polynomial |$p$| and every operator |$T$| on a finite-dimensional Hilbert space with |$W(T)$| contained in the interior of |$\Omega $|⁠, then the same estimate with the same constant |$K$| holds also for operators |$T$| on infinite-dimensional Hilbert spaces with |$\overline{W(T)}$| contained in the interior of |$\Omega $|⁠.

Proof.

Let |$T: {\mathcal{H}} \to{\mathcal{H}}$| be as above, with |$\dim{\mathcal{H}} = \infty $|⁠. It suffices to show that

$$ \begin{align*} & \|p(T)x\|_{\mathcal{H}} \leq K \| p \|_\Omega \|x\|_{\mathcal{H}}\end{align*} $$

holds for every analytic polynomial |$p$| and every |$x \in{\mathcal{H}}$|⁠. Note that |$p(T)x$| is contained in the finite-dimensional subspace |$\mathcal{K}$| spanned by |$\{x, Tx, \ldots , T^{d}x\}$|⁠, where |$d$| is the degree of the polynomial |$p$|⁠. If |$\Pi : {\mathcal{H}} \to \mathcal{K}$| is the orthogonal projection, then |$p(T)x = \Pi p(T)x = p(\Pi T)x$|⁠, where |$\Pi T: \mathcal{K} \to \mathcal{K}$| is an operator on a finite-dimensional Hilbert space. Since |$W(\Pi T) \subset \overline{W(T)}$|⁠, our hypothesis implies

$$ \begin{align*} & \|p(T)x\|_{\mathcal{H}} = \|p(\Pi T)x\|_{\mathcal{K}} \leq K \|p\|_\Omega \|x\|_{\mathcal{K}} = K \|p\|_\Omega \|x\|_{\mathcal{H}}.\end{align*} $$

The lemma follows.

The proof of the next lemma will use affine invariance of the configuration constants. Let us fix |$\alpha , \beta \in{\mathbb{C}}$|⁠, |$\alpha \neq 0$|⁠, and an affine mapping |$A(z):= \alpha z + \beta $|⁠. Then |$A$| is a conformal transformation of |${\mathbb{C}}$| with the additional property of taking a line segment of length |$L$| to a line segment of length |$|\alpha |L$|⁠, and a circle of radius |$R$| to a circle of radius |$|\alpha |R$|⁠. Let |$\widetilde{\Omega } = A(\Omega )$| be the affine image of |$\Omega $| under |$A$|⁠, and recall the formula for the Neumann–Poincaré kernel in (3) and its geometric interpretation. If |$\zeta , \sigma \in \partial \Omega $|⁠, |$E$| is a Borel subset of |$\partial \Omega $|⁠, and |$s, \widetilde{s}$| are the arclength measures on |$\partial \Omega $| and |$\partial \widetilde{\Omega }$|⁠, respectively, then it follows from the properties of |$A$| listed above that

(i)
|$\theta _\zeta = \theta _{A(\zeta )}$|⁠,
(ii)
|$|\alpha | s(E) = \widetilde{s}(A(E))$|⁠,
(iii)
|$|\alpha | R_{\zeta , \sigma } = R_{A(\zeta ), A(\sigma )}$|⁠.

A consequence is that the Neumann–Poincaré kernels |$\{\mu _\zeta \}_{\zeta \in \partial \Omega }$| and |$\{\widetilde{\mu }_{A(\zeta )}\}_{A(\zeta ) \in \partial \widetilde{\Omega }}$| of the respective domains satisfy

$$ \begin{align*} & \widetilde{\mu}_{A(\zeta)}(A(E)) = \mu_\zeta(E), \quad E \textrm{ a Borel subset of}\ \partial \Omega.\end{align*} $$

Then a change of variables shows that |$K_\Omega (\widetilde{f} \circ A) = (K_{\widetilde{\Omega }} \widetilde{f}) \circ A$| for any |$\widetilde{f} \in C(\partial \widetilde{\Omega })$|, and it follows that

$$ \begin{align*} & a(\Omega) = a(\widetilde{\Omega}), \quad c(\Omega) = c(\widetilde{\Omega}).\end{align*} $$
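The invariance of the kernel can be checked numerically. The following is only a sketch under stated assumptions: the boundary is a sample ellipse with a hypothetical parametrization `gamma`, none of the helper names come from the text, and the density of |$\mu _\zeta $| with respect to the parameter |$t$| is computed as |$\pi ^{-1}\textrm{Im}\big (\gamma ^{\prime}(t)/(\gamma (t) - \zeta )\big )$|, which is formula (A.3) combined with |$ds = |\gamma ^{\prime}(t)|\,dt$|.

```python
import math


def ellipse(a, b):
    """Positively oriented ellipse gamma(t) = a*cos(t) + i*b*sin(t) and its derivative."""
    def gamma(t):
        return complex(a * math.cos(t), b * math.sin(t))

    def dgamma(t):
        return complex(-a * math.sin(t), b * math.cos(t))

    return gamma, dgamma


def affine_image(gamma, dgamma, alpha, beta):
    """Parametrization of the image curve under A(z) = alpha*z + beta."""
    def gamma_tilde(t):
        return alpha * gamma(t) + beta

    def dgamma_tilde(t):
        return alpha * dgamma(t)

    return gamma_tilde, dgamma_tilde


def kernel_dt(gamma, dgamma, t_zeta, t):
    """Density of mu_zeta with respect to dt at the boundary point gamma(t)."""
    return (dgamma(t) / (gamma(t) - gamma(t_zeta))).imag / math.pi


def mass_off_zeta(gamma, dgamma, t_zeta, n=4000):
    """Midpoint-rule approximation of mu_zeta(boundary minus {zeta})."""
    h = 2.0 * math.pi / n
    return h * sum(
        kernel_dt(gamma, dgamma, t_zeta, t_zeta + (k + 0.5) * h) for k in range(n)
    )


# Sample data (assumptions chosen only for illustration).
gamma, dgamma = ellipse(2.0, 1.0)
gamma_tilde, dgamma_tilde = affine_image(
    gamma, dgamma, complex(0.3, -1.7), complex(5.0, 2.0)
)

# The kernel density agrees at corresponding points of the two boundaries.
kernel_gap = abs(
    kernel_dt(gamma, dgamma, 0.4, 2.1)
    - kernel_dt(gamma_tilde, dgamma_tilde, 0.4, 2.1)
)

# At a smooth boundary point the aperture is pi, so the mass away from zeta is 1.
mass_original = mass_off_zeta(gamma, dgamma, 0.4)
mass_image = mass_off_zeta(gamma_tilde, dgamma_tilde, 0.4)
```

The factor |$\alpha $| cancels in the quotient |$\gamma ^{\prime}/(\gamma - \zeta )$|, so `kernel_gap` is zero up to rounding; this is the parametrized form of properties (i)–(iii).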

Armed with these equalities, we make our second observation.

Lemma 31.

Assume that the estimate

$$ \begin{align}& \|p(T)\| \leq \Bigl( 1 + \sqrt{1 + a(\Omega)} \Bigr) \|p \|_\Omega\end{align} $$

(42)

Proof.

Replacing |$T$| by an operator |$T + \beta I$| for some |$\beta \in{\mathbb{C}}$|⁠, we may assume that |$0$| lies in the interior of |$W(T)$|⁠. Let |$W = \overline{W(T)}$|⁠, and

$$ \begin{align*} & W_{r} = \{ rz: z \in W \}, \quad r> 1. \end{align*} $$

Then |$W_{r}$| is a convex domain that contains |$W$| in its interior. By our assumption, for any analytic polynomial |$p$| we have

$$ \begin{align*} & \|p(T)\| \leq \Bigl( 1 + \sqrt{1 + a(W_{r})} \Bigr) \|p\|_{W_{r}}.\end{align*} $$

Since |$W_{r}$| is an affine image of |$W$|⁠, we have |$a(W_{r}) = a(W)$|⁠. Since this holds for all |$r> 1$|⁠, and since |$\lim _{r \to 1} \|p\|_{W_{r}} = \|p\|_{W}$|⁠, we may let |$r \to 1$| to obtain the desired estimate whenever |$p$| is an analytic polynomial. The estimate for |$f \in \mathcal{A}(W)$| follows by density of polynomials in |$\mathcal{A}(W)$|⁠.
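The limit |$\lim _{r \to 1} \|p\|_{W_{r}} = \|p\|_{W}$| used in this proof can be illustrated numerically. The sketch below rests on assumptions that are not in the text: |$W$| is replaced by a sample ellipse containing |$0$| in its interior, the polynomial is arbitrary, and the supremum norm is approximated by sampling the boundary, which is enough by the maximum modulus principle.

```python
import math


def boundary_point(t):
    """Sample convex boundary: the ellipse 2*cos(t) + i*sin(t), which encloses 0."""
    return complex(2.0 * math.cos(t), math.sin(t))


def p(z):
    """An arbitrary test polynomial (an assumption, not taken from the text)."""
    return z**3 - 2.0 * z + 1.0


def sup_norm_on_dilate(r, n=4000):
    """Approximate the sup norm of p on W_r = {r*z : z in W} by sampling its boundary."""
    return max(abs(p(r * boundary_point(2.0 * math.pi * k / n))) for k in range(n))


# Sup norms on the dilates W_r for r decreasing to 1.
norms = {r: sup_norm_on_dilate(r) for r in (1.5, 1.1, 1.01, 1.001, 1.0)}
```

As |$r$| decreases to |$1$|, the sampled norms approach the value at |$r = 1$|, which is the continuity that allows the passage to the limit.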

Proof of Theorem 28.

By Lemma 31, it will be sufficient to establish the estimate (42) whenever |$\Omega $| contains |$W(T)$| in its interior |$\Omega ^{o}$|⁠. Moreover, by Lemma 30, we may assume that |$T$| is an operator on a finite-dimensional Hilbert space |${\mathcal{H}}$|⁠. Let |$U = \Omega ^{o}$| and |$(f,x)$| be an extremal pair corresponding to |$U$| and the operator |$T$|⁠. If |$\|f(T)\| \leq 1$|⁠, then (42) certainly holds, so we may assume that |$\|f(T)\|> 1$|⁠.

Let |$(f_{n})$| be a sequence in |$\mathcal{A}(\Omega )$| such that |$\|f_{n}\|_\Omega \le 1$| and |$f_{n}\to f$| locally uniformly in |$\Omega ^{o}$|. Then |$f_{n}(T)\to f(T)$| in operator norm. Set |$g_{n}:=K_{\Omega }\overline{f}_{n}$|. It is shown in [5, Lemmas 2.1 and 2.3] that |$g_{n} \in \mathcal{A}(\Omega )$| and

$$ \begin{align}& \|f_{n}(T)+g_{n}(T)^{*}\|\le 2.\end{align} $$

(43)

For each |$n$|⁠, we may choose |$\lambda _{n}\in{\mathbb{C}}$| such that

$$ \begin{align*} & \|g_{n}+\lambda_{n} \mathbf{1}\|_\Omega=\inf_{\lambda\in{\mathbb{C}}}\|g_{n}+\lambda \mathbf{1}\|_\Omega \le a(\Omega). \end{align*} $$

We now have the following identity:

$$ \begin{align} \langle f_{n}(T)x,\, f_{n}(T)x\rangle_{\mathcal{H}} =& \, \langle f_{n}(T)x, (f_{n}(T)+g_{n}(T)^{*})x\rangle_{\mathcal{H}} \\ &-\langle f_{n}(T)x, (g_{n}+\lambda_{n} \mathbf{1})(T)^{*}x\rangle_{\mathcal{H}} \nonumber \\ &+\lambda_{n}\langle f_{n}(T)x,x\rangle_{\mathcal{H}}. \nonumber\end{align} $$

(44)

Let us consider each of the terms in this identity. By the choice of |$x$|⁠, we have

$$ \begin{align*} & \langle f_{n}(T)x,f_{n}(T)x\rangle_{\mathcal{H}} =\|f_{n}(T)x\|^{2}\underset{n\to\infty}\longrightarrow\|f(T)x\|^{2}=\|f(T)\|^{2}. \end{align*} $$

Also, from (43) and the Cauchy–Schwarz inequality,

$$ \begin{align*} & \bigl|\langle f_{n}(T)x, (f_{n}(T)+g_{n}(T)^{*})x\rangle_{\mathcal{H}} \bigr|\le 2\|f_{n}(T)\|\underset{n\to\infty}\longrightarrow2\|f(T)\|. \end{align*} $$

By Lemma 29, we have

$$ \begin{align*} \bigl|\langle f_{n}(T)x, (g_{n}+\lambda_{n} \mathbf{1})(T)^{*}x\rangle\bigr| &= \bigl|\langle (f_{n}(g_{n}+\lambda_{n} \mathbf{1}))(T)x, x\rangle\bigr| \\ &\le \|f_{n}(g_{n}+\lambda_{n} \mathbf{1})\|_\Omega \\ &\le\|g_{n}+\lambda_{n} \mathbf{1}\|_\Omega \\ &\le a(\Omega). \end{align*} $$

By Lemma 29 again, |$\langle f(T)x,x\rangle _{\mathcal{H}} =0$|⁠. Since the sequence |$(\lambda _{n})$| is certainly bounded (indeed |$|\lambda _{n}|\le 2$|⁠), we deduce that

$$ \begin{align*} & \lambda_{n}\langle f_{n}(T)x,x\rangle_{\mathcal{H}} \underset{n\to\infty}\longrightarrow0. \end{align*} $$

Thus, letting |$n\to \infty $| in (44), we deduce that

$$ \begin{align*} & \|f(T)\|^{2}\le 2\|f(T)\|+a(\Omega). \end{align*} $$

Hence

$$ \begin{align*} & \|f(T)\|\le 1+\sqrt{1+a(\Omega)}. \end{align*} $$

In particular, for every polynomial |$p$| with |$\|p\|_\Omega = 1$| we have

$$ \begin{align*} & \|p(T)\| \leq \|f(T)\| \leq 1+\sqrt{1+a(\Omega)},\end{align*} $$

since |$f$| is extremal. This is equivalent to (42), and so the proof is complete.
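The last step of the argument is the elementary passage from |$t^{2} \leq 2t + a$| to |$t \leq 1 + \sqrt{1 + a}$|. A minimal sketch of this step follows; the function names are illustrative and the value |$a = 0.5$| is a hypothetical input, not a computed configuration constant.

```python
import math


def spectral_bound(a):
    """Largest t >= 0 with t**2 <= 2*t + a, namely the root 1 + sqrt(1 + a)."""
    if a < 0:
        raise ValueError("the constant a is assumed to be non-negative")
    return 1.0 + math.sqrt(1.0 + a)


def quadratic_gap(t, a):
    """Value of t**2 - 2*t - a; the inequality t**2 <= 2*t + a holds iff this is <= 0."""
    return t * t - 2.0 * t - a


# a = 1 recovers the Crouzeix-Palencia constant 1 + sqrt(2).
crouzeix_palencia = spectral_bound(1.0)

# Hypothetical value a = 0.5, used only to show that a < 1 gives a smaller bound.
improved = spectral_bound(0.5)
```

Since the bound is increasing in |$a$|, any estimate |$a(\Omega ) < 1$| gives a constant strictly below |$1 + \sqrt{2}$|.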

Funding

This work was carried out during Malman’s visit to the Département de mathématiques et de statistique, Université Laval, supported by the Simons-CRM Scholar-in-Residence program. Mashreghi’s research was supported by an NSERC Discovery Grant and the Canada Research Chairs program. O’Loughlin was supported by a CRM-Laval Postdoctoral Fellowship. Ransford’s research was supported by an NSERC Discovery Grant.

Acknowledgments

We are grateful to the referee for the careful reading of the paper, and for the suggestions that we used to improve the manuscript.

A Double-Layer Potential on a General Convex Domain

A.1 Convex domains

Let |$\Omega $| be a compact convex domain in the plane |$\mathbb{C}$| with non-empty interior |$\Omega ^{o}$|⁠. We will be making no assumptions regarding smoothness of the boundary |$\partial \Omega $|⁠. However, convexity itself implies that |$\partial \Omega $| is a rectifiable simple closed curve with some additional properties.

The orientation of |$\partial \Omega $| is to be counter-clockwise (i.e., positive), and we use |$\sigma ^{\prime} \uparrow \sigma $| and |$\sigma ^{\prime} \downarrow \sigma $| to denote, respectively, the counter-clockwise and clockwise one-sided convergence of |$\sigma ^{\prime}$| to |$\sigma $| within |$\partial \Omega $|⁠. As a consequence of convexity of |$\Omega $|⁠, the one-sided tangent angles exist at every point |$\sigma \in \partial \Omega $|⁠, are locally given by

$$ \begin{align*} &\alpha_{+}(\sigma):= \lim_{\sigma^{\prime} \downarrow \sigma} \arg ( \sigma^{\prime} - \sigma), \quad \alpha_{-}(\sigma):= \lim_{\sigma^{\prime} \uparrow \sigma} \arg (\sigma - \sigma^{\prime}),\end{align*} $$

and satisfy

$$ \begin{align*} &\alpha_{-}(\sigma) \leq \alpha_{+}(\sigma).\end{align*} $$

At every point |$\sigma $| where equality holds, that is, at every point that is not a corner, the tangent angle

$$ \begin{align*} &\alpha(\sigma):= \alpha_{+}(\sigma) = \alpha_{-}(\sigma)\end{align*} $$

is well-defined, and so is the tangent |$T(\sigma ):= e^{i \alpha (\sigma )}$| itself. If |$t \mapsto \gamma (t)$| is any (positively-oriented) parametrization of |$\partial \Omega $|⁠, and we set |$\alpha (\sigma ) = \alpha _{+}(\sigma )$| at the corners, then the locally defined function |$\alpha (\gamma (t))$| is increasing in |$t$|⁠, and consequently the tangent |$T$| is continuous at every point that is not a corner of |$\partial \Omega $|⁠. At a corner, the discontinuity of |$T$| amounts to a jump of the argument of |$T$|⁠. We denote by |$N(\sigma ):= -i T(\sigma )$| the outward-pointing normal at |$\sigma \in \partial \Omega $|⁠.

A.2 Double-layer potential

Let |$\Omega ^{o}$| denote the interior of |$\Omega $|⁠. To each |$z \in \Omega ^{o}$| we associate the measure |$\mu _{z}$| on |$\partial \Omega $|⁠, which for any arc |$J \subset \partial \Omega $| satisfies

$$ \begin{align}& \mu_{z}(J) = \frac{1}{\pi}\int_{J} \textrm{d}\arg (\sigma - z) = \frac{1}{\pi}\bigl(\textrm{angle subtended at}\ z\ \textrm{by}\ J \bigr).\end{align} $$

(A.1)

Here |$\arg (\sigma - z)$| is any locally defined continuous determination of the argument function on |$\partial \Omega $|⁠. Non-negativity of |$\mu _{z}$| follows from convexity of |$\Omega $| and our choice of positive orientation of |$\partial \Omega $|⁠. With respect to this orientation, every arc |$J =(a,b) \subset \partial \Omega $| has a start-point |$a$| and an end-point |$b$|⁠, and it is easy to see that

$$ \begin{align*} & \mu_{z}(J) = \frac{\arg(b - z) - \arg(a - z)}{\pi}. \end{align*} $$

In particular, |$\mu _{z}(\partial \Omega ) = 2$|⁠.
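For a convex polygon the formula above can be evaluated directly, since each side is an arc with explicit endpoints. The sketch below assumes a sample square with counter-clockwise vertices; the helper names are illustrative only.

```python
import cmath
import math


def mu_z_segment(z, a, b):
    """mu_z of the boundary segment from a to b: the angle it subtends at z, over pi."""
    return cmath.phase((b - z) / (a - z)) / math.pi


def mu_z_total(z, vertices):
    """mu_z of the whole boundary of a convex polygon with counter-clockwise vertices."""
    n = len(vertices)
    return sum(
        mu_z_segment(z, vertices[k], vertices[(k + 1) % n]) for k in range(n)
    )


# Sample square, listed counter-clockwise (an assumption for illustration).
square = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]

# From the centre, each side subtends a right angle, so its mass is 1/2.
side_mass_at_centre = mu_z_segment(0j, square[0], square[1])

# From any interior point the four subtended angles add up to 2*pi.
total_mass = mu_z_total(0.3 - 0.2j, square)
```

Because |$z$| is an interior point and the polygon is convex, every side subtends an angle strictly between |$0$| and |$\pi $|, so the principal value returned by `cmath.phase` is the correct increment of the argument.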

The measure |$\mu _{z}$| is absolutely continuous with respect to arclength |$s$| on |$\partial \Omega $|⁠. Indeed, if |$\sigma _{0} \in \partial \Omega $|⁠, |$J_{n} = (a_{n}, b_{n})$| is a sequence of arcs of |$\partial \Omega $| that are shrinking to |$\sigma _{0}$|⁠, and |$|J_{n}|$| are the corresponding arclengths, then

$$ \begin{align*} \frac{\pi \mu_{z}(J_{n})}{|J_{n}|} &= \frac{1}{|J_{n}|}\int_{J_{n}} \textrm{d}\arg(\sigma - z) \\ &= \frac{\arg(b_{n} - z) - \arg(a_{n} - z)}{|J_{n}|} \\ &= \textrm{Im} \Bigg( \frac{\log ( b_{n} - z) - \log (a_{n} - z)}{b_{n}-a_{n}} \cdot \frac{b_{n} - a_{n}}{|J_{n}|}\Bigg). \end{align*} $$

We use above an appropriate locally defined holomorphic branch of the logarithm. As |$n \to \infty $|⁠, the first factor inside the brackets satisfies

$$ \begin{align*} & \lim_{n \to \infty} \frac{\log ( b_{n} - z) - \log (a_{n} - z)}{b_{n}-a_{n}} = \frac{1}{\sigma_{0} - z},\end{align*} $$

while the second factor stays bounded as a consequence of the inequality |$|b_{n} - a_{n}| \leq |J_{n}|$|⁠. Thus

$$ \begin{align*} & \limsup_{n \to \infty} \frac{\mu_{z}(J_{n})}{|J_{n}|} < \infty \end{align*} $$

and from elementary measure theory we obtain that |$\mu _{z}$| is absolutely continuous with respect to |$s$|⁠. If moreover |$\sigma _{0}$| is not a corner, then it can be shown that

$$ \begin{align*} & \lim_{n \to \infty} \frac{|J_{n}|}{|b_{n}-a_{n}|} = 1, \end{align*} $$

and so in addition to boundedness we even have the convergence

$$ \begin{align*} & \lim_{n \to \infty} \frac{b_{n} - a_{n}}{|J_{n}|} = \lim_{n \to \infty} \frac{b_{n} - a_{n}}{|b_{n} - a_{n}|} = T(\sigma_{0}) = i N(\sigma_{0}).\end{align*} $$

Thus the Radon–Nikodym derivative satisfies

$$ \begin{align}& \rho_{z}(\sigma):= \frac{d \mu_{z}} {ds}(\sigma) = \frac{1}{\pi}\textrm{Im} \Bigg( \frac{T(\sigma)}{\sigma - z}\Bigg) = \frac{1}{\pi}\textrm{Re} \Bigg ( \frac{N(\sigma)}{\sigma - z} \Bigg)\end{align} $$

(A.2)

at every |$\sigma \in \partial \Omega $| that is not a corner.
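Formula (A.2) can be compared with the difference quotients |$\mu _{z}(J_{n})/|J_{n}|$| from which it was derived. The following sketch assumes a sample ellipse (which has no corners) and replaces the arclength of a short arc by its chord length, which is justified by the limit |$|J_{n}|/|b_{n} - a_{n}| \to 1$| noted above; all names are illustrative.

```python
import cmath
import math

A_AXIS, B_AXIS = 2.0, 1.0  # semi-axes of a sample ellipse (an assumption)


def point(t):
    """Boundary point of the positively oriented ellipse at parameter t."""
    return complex(A_AXIS * math.cos(t), B_AXIS * math.sin(t))


def outward_normal(t):
    """N = -i*T, where T is the unit tangent in the counter-clockwise direction."""
    d = complex(-A_AXIS * math.sin(t), B_AXIS * math.cos(t))
    return -1j * d / abs(d)


def density(z, t):
    """rho_z at sigma = point(t), computed as Re(N/(sigma - z))/pi."""
    return (outward_normal(t) / (point(t) - z)).real / math.pi


def difference_quotient(z, t, h=1e-4):
    """mu_z(J)/|J| for the short arc J from point(t - h) to point(t + h)."""
    a, b = point(t - h), point(t + h)
    angle = cmath.phase((b - z) / (a - z))
    return angle / (math.pi * abs(b - a))  # chord length stands in for arclength


z0 = 0.5 + 0.2j  # an interior point of the sample ellipse
gap = abs(density(z0, 1.0) - difference_quotient(z0, 1.0))
```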

A.3 Boundary kernel

The Neumann–Poincaré kernel is the boundary version of the family of measures |$\{\mu _{z}\}_{z \in \Omega ^{o}}$| introduced above. To each point |$\zeta \in \partial \Omega $| we associate the Borel probability measure |$\mu _\zeta $| on |$\partial \Omega $| defined by (A.1), with |$z$| replaced by |$\zeta $|, for arcs |$J \subset \partial \Omega $| not containing the point |$\zeta $|. Because |$\zeta \in \partial \Omega $|, this definition implies that |$\mu _\zeta ( \partial \Omega \setminus \{\zeta \}) = \theta _\zeta /\pi $|, where |$\theta _\zeta = \pi - \alpha _{+}(\zeta ) + \alpha _{-}(\zeta )$| can be interpreted as the angle of the aperture at |$\zeta $|. Indeed, |$\theta _\zeta $| is equal to the increase in the argument of |$\sigma - \zeta $| as we traverse one loop around |$\partial \Omega $| starting and ending at the point |$\zeta $|, and since |$\mu _\zeta $| is a probability measure, we must have

$$ \begin{align*} & \mu_\zeta(\{\zeta\}) = 1 - \frac{\theta_\zeta}{\pi}.\end{align*} $$

With the exception of this possible point mass, |$\mu _\zeta $| is otherwise absolutely continuous with respect to arclength. The corresponding Radon–Nikodym derivative is given by

$$ \begin{align}& \rho_\zeta(\sigma):= \frac{d \mu_\zeta} {ds}(\sigma) = \frac{1}{\pi}\textrm{Im} \Bigg( \frac{T(\sigma)}{\sigma - \zeta}\Bigg) = \frac{1}{\pi}\textrm{Re} \Bigg ( \frac{N(\sigma)}{\sigma - \zeta} \Bigg).\end{align} $$

(A.3)

The formula (A.3) is established analogously to (A.2). All in all, the measure |$\mu _\zeta $| can be decomposed as

$$ \begin{align*} & d\mu_\zeta = (1- \theta_\zeta/\pi) d \delta_\zeta + \rho_\zeta d s, \end{align*} $$

where |$\delta _\zeta $| denotes the unit point mass at |$\zeta $|.
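At a vertex of a convex polygon both parts of this decomposition can be computed by hand. The sketch below is an illustration under assumptions: the polygons are sample choices with counter-clockwise vertices, the helper names are hypothetical, and the absolutely continuous part is evaluated through subtended angles as in (A.1) rather than by integrating |$\rho _\zeta $|.

```python
import cmath
import math


def aperture(vertices, k):
    """Aperture theta at vertex k of a convex polygon with counter-clockwise vertices."""
    zeta = vertices[k]
    nxt = vertices[(k + 1) % len(vertices)]
    prv = vertices[k - 1]
    # Increase of arg(sigma - zeta) from the outgoing edge round to the incoming edge.
    return cmath.phase((prv - zeta) / (nxt - zeta))


def point_mass(vertices, k):
    """The point mass 1 - theta/pi carried by mu_zeta at the vertex zeta itself."""
    return 1.0 - aperture(vertices, k) / math.pi


def continuous_mass(vertices, k):
    """Mass of rho_zeta ds: angles subtended at the vertex by the edges not touching it."""
    n = len(vertices)
    zeta = vertices[k]
    total = 0.0
    for j in range(n):
        a, b = vertices[j], vertices[(j + 1) % n]
        if a == zeta or b == zeta:
            continue  # arg(sigma - zeta) is constant along the two edges through zeta
        total += cmath.phase((b - zeta) / (a - zeta))
    return total / math.pi


square = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
triangle = [0j, 1 + 0j, complex(0.5, math.sqrt(3.0) / 2.0)]

square_corner_mass = point_mass(square, 0)      # right angle: theta = pi/2
triangle_corner_mass = point_mass(triangle, 0)  # equilateral: theta = pi/3
```

In both cases the point mass and the continuous mass add up to |$1$|, in agreement with |$\mu _\zeta $| being a probability measure.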

A.4 Weak-star convergence

We establish now that

$$ \begin{align*} & \lim_{z \to \zeta} \mu_{z} = \delta_\zeta + \mu_\zeta \end{align*} $$

in the sense of the weak-star topology on measures. Note that if |$B = B(\zeta , \delta )$| is a ball of radius |$\delta> 0$| centred at |$\zeta \in \partial \Omega $|⁠, then expressions (A.2) and (A.3) for the densities of |$\mu _{z}$| and |$\mu _\zeta $| show that

$$ \begin{align}& \lim_{z \to \zeta} \int_{\partial \Omega \setminus B} f \, \textrm{d}\mu_{z} = \int_{\partial \Omega \setminus B} f \, \textrm{d}\mu_\zeta\end{align} $$

(A.4)

for every |$f \in C(\partial \Omega )$|⁠. In particular, choosing |$f = \mathbf{1}$|⁠, we obtain

$$ \begin{align*} & 2 = \mu_\zeta(\partial \Omega \setminus B) + \lim_{z \to \zeta} \mu_{z}(B).\end{align*} $$

Since

$$ \begin{align*} & \lim_{\delta \to 0} \mu_\zeta(\partial \Omega \setminus B) = \mu_\zeta(\partial \Omega \setminus \{\zeta\}) = \theta_\zeta/\pi\end{align*} $$

we see that, given |$\epsilon> 0$|, for all sufficiently small |$\delta> 0$| we have

$$ \begin{align*} & \limsup_{z \to \zeta} \, |\mu_{z}(B) - 2 + \theta_\zeta/\pi| \leq \epsilon.\end{align*} $$

Returning to general |$f \in C(\partial \Omega )$|⁠, we have

$$ \begin{align*} \int_{\partial \Omega} f \, \textrm{d}\mu_{z} - \int_{\partial \Omega} f \, \textrm{d}[\delta_\zeta + \mu_\zeta] =& \int_{\partial \Omega \setminus B} f \, \textrm{d}\mu_{z} - \int_{\partial \Omega \setminus B} f \, \textrm{d}\mu_\zeta \\ &+ \int_{B} \big(f - f(\zeta)\big) \, \textrm{d}\mu_{z} \\ &+ f(\zeta) \big(\mu_{z}(B) - 2 + \theta_\zeta/\pi \big) \\ &- \int_{B \setminus \{\zeta\}} f \, \textrm{d}\mu_\zeta. \end{align*} $$

On the right-hand side, the first term tends to zero as |$z \to \zeta $|⁠, the second can be made arbitrarily small by continuity of |$f$|⁠, the crude estimate |$\mu _{z}(B) \leq 2$| and choice of sufficiently small |$\delta $|⁠, the third is dominated in modulus by |$\|f\|_{\partial \Omega } \cdot \epsilon $| for |$z$| sufficiently close to |$\zeta $|⁠, and the fourth is dominated by |$\|f\|_{\partial \Omega } \cdot \mu _\zeta (B \setminus \{\zeta \})$|⁠, which also can be made arbitrarily small by choice of sufficiently small |$\delta $|⁠. The desired weak-star convergence follows.
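The convergence can be observed numerically in the simplest case. The sketch below assumes that |$\Omega $| is the closed unit disk, where |$N(\sigma ) = \sigma $|, |$ds = dt$|, and the boundary measure |$\mu _\zeta $| is normalized arclength with no point mass; the test function and all names are illustrative choices.

```python
import cmath
import math


def integrate_mu_z(f, z, n=20000):
    """Integral of f against mu_z on the unit circle, density Re(sigma/(sigma - z))/pi."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        sigma = cmath.exp(1j * k * h)
        total += f(sigma) * (sigma / (sigma - z)).real
    return total * h / math.pi


def integrate_limit(f, zeta, n=20000):
    """Integral of f against delta_zeta + mu_zeta, with mu_zeta = arclength/(2*pi)."""
    mean = sum(f(cmath.exp(2j * math.pi * k / n)) for k in range(n)) / n
    return f(zeta) + mean


def f(sigma):
    """A sample continuous function on the circle (an assumption)."""
    return sigma.real ** 2 + sigma.imag


zeta = cmath.exp(0.7j)

# Approach zeta radially from inside and record the distance to the limit.
gaps = [
    abs(integrate_mu_z(f, (1.0 - eps) * zeta) - integrate_limit(f, zeta))
    for eps in (1e-1, 1e-2, 1e-3)
]
```

On the disk the density of |$\mu _{z}$| is the sum of |$1/(2\pi )$| and one half of the Poisson kernel at |$z$|, so the gaps shrink at the same rate at which the Poisson integral of |$f$| approaches |$f(\zeta )$|.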

References

1. Bickel, Gorkin, Greenbaum, Ransford, F. L. Schwenninger, and Wegert. “Crouzeix’s conjecture and related problems.” Comput. Methods Funct. Theory, no. 3–4 (2020): 701–. doi:10.1007/s40315-020-00350-9.

2. Crouzeix. “Bounds for analytical functions of matrices.” Integral Equations Operator Theory (2004): 461–. doi:10.1007/s00020-002-1188-6.

3. Crouzeix. “Numerical range and functional calculus in Hilbert space.” J. Funct. Anal. 244 (2007): 668–. doi:10.1016/j.jfa.2006.10.013.

4. Crouzeix. “Some constants related to numerical ranges.” SIAM J. Matrix Anal. Appl. (2016): 420–.

5. Crouzeix and Palencia. “The numerical range is a |$\left (1+\sqrt{2}\right )$|-spectral set.” SIAM J. Matrix Anal. Appl. (2017): 649–.

6. Delyon and Delyon. “Generalization of von Neumann’s spectral sets and integral representation of operators.” Bull. Soc. Math. France 127 (1999).

7. Gaier. “Konstruktive Methoden der konformen Abbildung.” Springer Tracts Nat. Philos., vol. 3. New York, NY: Springer, 1964.

8. Garnett. “Bounded analytic functions.” Grad. Texts Math, vol. 236, revised 1st ed. New York, NY: Springer, 2006.

9. Garnett and Marshall. “Harmonic measure.” New Math. Monogr., vol. 2. Cambridge: Cambridge University Press, 2005.

10. K. E. Gustafson and D. K. M. Rao. Numerical Range. The Field of Values of Linear Operators and Matrices. Universitext. New York, NY: Springer, 1996.

11. Jung. “Über den kleinsten Kreis, der eine ebene Figur einschließt.” J. Reine Angew. Math. 1910 (1910): 310–. doi:10.1515/crll.1910.137.310.

12. Kantorovich and Krylov. Approximate Methods of Higher Analysis. Groningen: P. Noordhoff Ltd., 1958.

13. Kral. “Integral operators in potential theory.” Lect. Notes Math, vol. 823. Cham: Springer, 1980.

14. Neumann. Untersuchungen über das Logarithmische und Newton’sche Potential. Leipzig: Teubner, 1877.

15. Okubo and Ando. “Constants related to operators of class C|$_p$|.” Manuscr. Math. (1975): 385–.

16. Putinar and Sandberg. “A skew normal dilation on the numerical range of an operator.” Math. Ann. 331 (2005): 345–. doi:10.1007/s00208-004-0585-3.

17. Schober. “Neumann’s lemma.” Proc. Amer. Math. Soc. (1968): 306–. doi:10.1090/S0002-9939-1968-0224849-5.

18. Schwenninger and de Vries. “On abstract spectral constants.” In Operator and Matrix Theory, Function Spaces, and Applications. IWOTA 2022. Operator Theory: Advances and Applications, vol. 295. Cham: Birkhäuser.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
