Abstract

Given a compact convex planar domain $\Omega$ with non-empty interior, the classical Neumann’s configuration constant $c_{\mathbb{R}}(\Omega)$ is the norm of the Neumann–Poincaré operator $K_\Omega$ acting on the space of continuous real-valued functions on the boundary $\partial\Omega$, modulo constants. We investigate the related operator norm $c_{\mathbb{C}}(\Omega)$ of $K_\Omega$ on the corresponding space of complex-valued functions, and the norm $a(\Omega)$ on the subspace of analytic functions. This change requires the introduction of techniques quite different from those used in the classical setting. We prove the equality $c_{\mathbb{R}}(\Omega)=c_{\mathbb{C}}(\Omega)$ and the analytic Neumann-type inequality $a(\Omega)<1$, and we provide various estimates for these quantities expressed in terms of the geometry of $\Omega$. We apply our results to estimates for the holomorphic functional calculus of operators on Hilbert space of the type $\|p(T)\|\le K\sup_{z\in\Omega}|p(z)|$, where $p$ is a polynomial and $\Omega$ is a domain containing the numerical range of the operator $T$. Among other results, we show that the well-known Crouzeix–Palencia bound $K\le 1+\sqrt{2}$ can be improved to $K\le 1+\sqrt{1+a(\Omega)}$. In the case that $\Omega$ is an ellipse, this leads to an estimate of $K$ in terms of the eccentricity of the ellipse.

1 Introduction

1.1 Double-layer potential

Throughout this article, $\Omega$ will denote a compact convex planar domain with non-empty interior. If $C(\partial\Omega)$ is the space of continuous functions on the boundary $\partial\Omega$ and $f\in C(\partial\Omega)$, then its double-layer potential $u$ is the harmonic function

(1)

Here $ds=|d\sigma|$ is the arclength measure on the rectifiable curve $\partial\Omega$, $\Omega^o$ is the interior of $\Omega$, and $N(\sigma)$ is the outer-pointing normal at the boundary point $\sigma$. The equality between the two expressions for $u(z)$ above follows from an elementary computation in the case that $\partial\Omega$ is sufficiently smooth. In the general case, we interpret $N(\sigma)(\sigma-z)^{-1}$ as a Borel measurable function on $\partial\Omega$. By convexity of the domain, both the tangent $T(\sigma)$ and the normal $N(\sigma)$ exist and are continuous at all but a countable number of points $\sigma$, which we will call corners, at which the discontinuity of $T$ and $N$ amounts to a jump in the argument. In Appendix A we include more details regarding boundaries of planar convex domains, and other facts mentioned below.

The Neumann–Poincaré operator appears in connection with the study of boundary behaviour of the double-layer potential. It is known that u given by (1) has a continuous extension to Ω, and we have the representation

(2)

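where $K_\Omega$ denotes the Neumann–Poincaré integral operator

$$K_\Omega f(\zeta) := \int_{\partial\Omega} f(\sigma)\,d\mu_\zeta(\sigma), \qquad \zeta\in\partial\Omega.$$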

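Here $\mu_\zeta$ is the probability measure

$$\mu_\zeta := \Bigl(1-\frac{\theta_\zeta}{\pi}\Bigr)\delta_\zeta + \rho_\zeta\,ds, \tag{3}$$

where $\theta_\zeta$ can be interpreted as the angle of the aperture at the possible corner at $\zeta$ of $\partial\Omega$, $\delta_\zeta$ is a unit mass at the point $\zeta$, and $\rho_\zeta$ is the Radon–Nikodym derivative

$$\rho_\zeta(\sigma) := \frac{1}{\pi}\,\mathrm{Re}\,\frac{N(\sigma)}{\sigma-\zeta} = \frac{\langle N(\sigma),\sigma-\zeta\rangle}{\pi\,|\sigma-\zeta|^2}, \qquad \sigma\in\partial\Omega\setminus\{\zeta\}. \tag{4}$$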

It is natural to use the convention that $\theta_\zeta=\pi$ if $\zeta$ is not a corner. This occurs precisely when $\mu_\zeta$ assigns no mass to the singleton $\{\zeta\}$. We will say that the collection of measures $\{\mu_\zeta\}_{\zeta\in\partial\Omega}$ is the Neumann–Poincaré kernel of $\Omega$.

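The density $\rho_\zeta$ has the following useful geometric interpretation. If $\sigma\in\partial\Omega\setminus\{\zeta\}$ is not a corner, and $R_{\zeta,\sigma}$ is the radius of the unique circle passing through $\zeta$ that is tangent to $\partial\Omega$ at $\sigma$, then the equality

$$\rho_\zeta(\sigma) = \frac{1}{2\pi R_{\zeta,\sigma}} \tag{5}$$

holds. The radius $R_{\zeta,\sigma}$ may degenerate to $\infty$ if $\zeta$ is contained in the tangent line to $\partial\Omega$ passing through $\sigma$. In that case we see easily that $\rho_\zeta(\sigma)=0$, so (5) still holds. To establish the formula, note that the center $m$ of the circle in question is of the form $m=\sigma-RN(\sigma)$, where the radius $R=R_{\zeta,\sigma}>0$ of the circle satisfies $|m-\zeta|^2=|(\sigma-\zeta)-RN(\sigma)|^2=R^2$. Expanding the squares and solving for $R$ gives $R=|\sigma-\zeta|^2/\bigl(2\langle N(\sigma),\sigma-\zeta\rangle\bigr)$, which leads to (5).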
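The geometric interpretation above can be checked numerically. The following is a minimal sketch, not taken from the article: it assumes the density has the form $\rho_\zeta(\sigma)=\frac{1}{\pi}\mathrm{Re}\,[N(\sigma)/(\sigma-\zeta)]$, uses an ellipse with semi-axes `a`, `b` as a hypothetical test domain, and all function names are illustrative. It verifies that the density integrates to 1 along the boundary (no corners, so $\mu_\zeta$ has no point mass), and that the circle of radius $1/(2\pi\rho_\zeta(\sigma))$ centered at $\sigma-RN(\sigma)$ passes through $\zeta$.

```python
import math


def ellipse_point(a, b, t):
    """Boundary point sigma(t) = a*cos(t) + i*b*sin(t) of the test ellipse."""
    return complex(a * math.cos(t), b * math.sin(t))


def outer_normal_and_speed(a, b, t):
    """Unit outer normal N(sigma(t)) and the speed |sigma'(t)|, so ds = speed * dt."""
    velocity = complex(-a * math.sin(t), b * math.cos(t))
    speed = abs(velocity)
    # Rotating the counterclockwise tangent by -90 degrees gives the outer normal.
    return -1j * velocity / speed, speed


def density(a, b, t_zeta, t_sigma):
    """Assumed form of the density: Re(N(sigma) / (sigma - zeta)) / pi."""
    zeta = ellipse_point(a, b, t_zeta)
    sigma = ellipse_point(a, b, t_sigma)
    normal, _ = outer_normal_and_speed(a, b, t_sigma)
    return (normal / (sigma - zeta)).real / math.pi


def total_mass(a, b, t_zeta, n=2000):
    """Midpoint-rule approximation of the integral of the density against ds."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = t_zeta + (k + 0.5) * h  # midpoints never hit the point zeta itself
        _, speed = outer_normal_and_speed(a, b, t)
        total += density(a, b, t_zeta, t) * speed * h
    return total


def circle_defect(a, b, t_zeta, t_sigma):
    """| |m - zeta| - R | for R = 1 / (2*pi*rho) and m = sigma - R * N(sigma)."""
    zeta = ellipse_point(a, b, t_zeta)
    sigma = ellipse_point(a, b, t_sigma)
    normal, _ = outer_normal_and_speed(a, b, t_sigma)
    radius = 1.0 / (2.0 * math.pi * density(a, b, t_zeta, t_sigma))
    center = sigma - radius * normal
    return abs(abs(center - zeta) - radius)


if __name__ == "__main__":
    print(total_mass(2.0, 1.0, 0.3))
    print(circle_defect(2.0, 1.0, 0.3, 2.0))
```

The midpoint rule is used because the integrand is smooth and periodic in the parameter, so the sample points can avoid $\sigma=\zeta$ without losing accuracy.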

Example domain $\Omega $ with corner of angle $\theta _{\zeta ^{\prime}}$ at $\zeta ^{\prime}$, and a circle of radius $R_{\zeta , \sigma }$ with center $m$, tangent to $\partial \Omega $ at $\sigma $ and passing through $\zeta $.
Fig. 1


1.2 Neumann’s configuration constant

1.2.1 Real configuration constant

Historically, the Neumann–Poincaré operator has been used to solve the Dirichlet problem of finding a harmonic extension to $\Omega^o$ of a given continuous function $u$ on $\partial\Omega$. The extension can be obtained by finding $f\in C(\partial\Omega)$ that solves (2). Indeed, if such an $f$ is found, then the extension of $u$ to $\Omega^o$ is given by the double-layer potential in (1). This naturally leads to questions of invertibility of the operator $I+K_\Omega$ appearing on the right-hand side of (2), and consequently to the introduction of Neumann’s configuration constant, which we shall soon define as the operator norm of $K_\Omega$ acting on an appropriate space. Note that if $1$ is the constant function, then we have $K_\Omega 1=1$, since each $\mu_\zeta$ is a probability measure. Thus $K_\Omega$ can be naturally defined as a linear mapping on the quotient space $C(\partial\Omega)/\mathbb{C}1$. The classical approach is to instead consider $K_\Omega$ as acting on the space of real-valued continuous functions $C_{\mathbb{R}}(\partial\Omega)$, in which case the corresponding quotient space $C_{\mathbb{R}}(\partial\Omega)/\mathbb{R}1$ is endowed with the norm

(6)

It is not hard to see that the two above expressions for the norm of the coset $g+\mathbb{R}1$ are equivalent: they are both equal to half of the length of the interval $g(\partial\Omega):=\{g(\zeta):\zeta\in\partial\Omega\}$, the image of $g$. The right-most expression is minimized by choosing $r$ to be the mid-point of the image interval. Neumann’s (real) configuration constant $c_{\mathbb{R}}(\Omega)$ is defined as the operator norm of $K_\Omega$ acting on the quotient space $C_{\mathbb{R}}(\partial\Omega)/\mathbb{R}1$:

(7)

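It is not hard to see that we may let $K_\Omega$ instead act from $C_{\mathbb{R}}(\partial\Omega)$ into the quotient $C_{\mathbb{R}}(\partial\Omega)/\mathbb{R}1$ without affecting the operator norm. Since each measure $\mu_\zeta$ is of unit mass, we have $0\le c_{\mathbb{R}}(\Omega)\le 1$. If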

then

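where we use the total variation norm (functional norm) on the right-hand side. By varying $f$ over the unit ball of $C_{\mathbb{R}}(\partial\Omega)$ and $\zeta,\zeta'$ over $\partial\Omega$, we obtain the important relation

$$c_{\mathbb{R}}(\Omega) = \frac{1}{2}\sup_{\zeta,\zeta'\in\partial\Omega}\|\mu_\zeta-\mu_{\zeta'}\|. \tag{8}$$

This expression for $c_{\mathbb{R}}(\Omega)$ will play a fundamental role in our study.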

1.2.2 Neumann’s lemma

From (8) we can immediately deduce that $c_{\mathbb{R}}(\Omega)=1$ in the case that $\Omega$ is a triangle or a convex quadrilateral. Indeed, in those cases one sees from (3) and (4) that if $\zeta_1$ and $\zeta_2$ are corners of $\Omega$ (opposing, in the case of the quadrilateral) then $\mu_{\zeta_1}$ and $\mu_{\zeta_2}$ are mutually singular, and so $\|\mu_{\zeta_1}-\mu_{\zeta_2}\|=2$, implying $c_{\mathbb{R}}(\Omega)=1$. Neumann’s lemma, which appears initially in Neumann’s book [14], states that the cases of the triangle and quadrilateral are exceptional. For any other type of domain we have the strict inequality $c_{\mathbb{R}}(\Omega)<1$. See [17] for a proof of this claim by Schober, and the curious history of incomplete attempts at a valid proof in full generality. Neumann’s lemma implies the invertibility of $I+K_\Omega$ on $C_{\mathbb{R}}(\partial\Omega)/\mathbb{R}1$, and thus the solvability of the Dirichlet problem on a convex domain $\Omega$ that is not one of the two exceptional cases. The remaining cases can be handled by considering instead powers of $K_\Omega$. See, for instance, [13, Theorem 3.8], [6, Proposition 7], or the article [16], which also contains an exposition of the double-layer potential and Neumann’s lemma.

At the other extreme, we have $c_{\mathbb{R}}(\Omega)=0$ if and only if $\Omega$ is a disk. This result will be proved in Section 5.

1.3 Complex and analytic configuration constants

1.3.1 Two new configuration constants

In the present article we will discuss certain applications of the double-layer potential to operator theory, which motivate the definition of the complex configuration constant

(9)

The difference between (7) and (9) is that the latter is the norm of $K_\Omega$ on the larger space of complex-valued functions. As a consequence, we have $c_{\mathbb{R}}(\Omega)\le c_{\mathbb{C}}(\Omega)\le 1$. There is a principal difference between the geometric interpretations of the norms in the quotient spaces $C_{\mathbb{R}}(\partial\Omega)/\mathbb{R}1$ and $C(\partial\Omega)/\mathbb{C}1$. In the former case, as we have already noted, the norm (6) of the coset represented by the real-valued function $g$ is equal to half of the length of the image of $g$, this image being an interval on the real line $\mathbb{R}$. In the case of complex-valued $g$, the quotient norm

(10)

can instead be interpreted as the radius of the smallest disk containing the image of $g$. A crucial difference is that we lose the ability to estimate the norm of the coset $g+\mathbb{C}1$ by considering the quantities $|g(\zeta)-g(\zeta')|$ only. This is the essence of why new tools are required to treat this case.

We will also study an analogous analytic constant, which is the norm of the operator $K_\Omega$ restricted to the subspace of analytic functions in $C(\partial\Omega)$. More precisely, we let $A(\Omega)$ be the space of functions that are continuous in $\Omega$ and analytic in $\Omega^o$. By the maximum principle, each function in $A(\Omega)$ is uniquely determined by its restriction to $\partial\Omega$, and thus $A(\Omega)$ can be naturally identified with a subspace of $C(\partial\Omega)$. We define the analytic configuration constant as

(11)

The space $A(\Omega)$ is not invariant under $K_\Omega$, but for $f\in A(\Omega)$ we do have that $K_\Omega f$ is the complex conjugate of a function in $A(\Omega)$ (in [5, proof of Lemma 2.1] this claim is established for $\Omega$ with smooth boundary, but the same argument works in general). Clearly, we have the inequality $a(\Omega)\le c_{\mathbb{C}}(\Omega)$. We note also that if $\widetilde{\Omega}$ is the image of $\Omega$ under an affine transformation of the plane, then the configuration constants of the two domains are equal. We shall verify this claim in Section 6.

1.3.2 An application to functional calculi

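Given an operator $T$ on a Hilbert space $H$ with numerical range

$$W(T) := \{\langle Tx,x\rangle : x\in H,\ \|x\|=1\},$$

we are interested in the optimal constant $K>0$ in the inequality

$$\|p(T)\| \le K\sup_{z\in W(T)}|p(z)|, \tag{12}$$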

where p is an analytic polynomial, and the left-hand side is the operator norm of p(T) acting on H. More generally, if W(T) in (12) is replaced by an arbitrary domain Ω, and if the corresponding inequality holds for some K, then we say that Ω is a K-spectral set for T. Von Neumann’s inequality says that the unit disk is a 1-spectral set for any contraction T, and a result of Okubo–Ando from [15] says that any disk containing W(T) is a 2-spectral set for T.

The numerical range $W(T)$ is a bounded convex subset of the plane, its closure $\overline{W(T)}$ contains the spectrum $\sigma(T)$ of $T$, and it has non-empty interior in the case that $T$ is not a normal operator (see, for instance, [10, Chapter 1]). For normal operators, the bound (12) with constant $K=1$ is a consequence of the spectral theorem, and it suffices to take the supremum on the right-hand side over the smaller set $\sigma(T)$. For general $T$, even establishing the existence of a bound as in (12) is a non-trivial task. A result of Delyon–Delyon from [6, Theorem 3] establishes the existence of the bound, and shows that $K$ can be chosen depending only on the area and the diameter of $W(T)$. The remarkable work of Crouzeix in [3] establishes that (12) holds with $K\le 11.08$. A subsequent work of Crouzeix and Palencia in [5] improves the estimate to $K\le 1+\sqrt{2}$. The Neumann–Poincaré operator appears as an essential tool in all of the mentioned works. The standing conjecture of Crouzeix from [2] is that the bound holds with $K=2$. This bound is presently known to hold when $H$ has dimension 2, as established by Crouzeix in [2].
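To see why no constant smaller than 2 can work in (12), one can look at the standard example of the $2\times 2$ Jordan block $T=\begin{pmatrix}0&1\\0&0\end{pmatrix}$ with $p(z)=z$: here $\|T\|=1$, while the numerical range is the closed disk of radius $1/2$ centered at the origin. The following is a small illustrative sketch (the function names are ours, not from the article) that samples unit vectors and estimates $\sup_{z\in W(T)}|z|$.

```python
import cmath
import math
import random


def quad_form_jordan(x1, x2):
    """<Tx, x> for T = [[0, 1], [0, 0]]: Tx = (x2, 0), so <Tx, x> = x2 * conj(x1)."""
    return x2 * x1.conjugate()


def sampled_numerical_radius(samples=20000, seed=0):
    """Estimate sup |<Tx, x>| over unit vectors x in C^2 by random sampling."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(samples):
        theta = rng.uniform(0.0, math.pi / 2.0)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        # A global phase does not change <Tx, x>, so x1 may be taken real.
        x1 = complex(math.cos(theta), 0.0)
        x2 = cmath.exp(1j * phi) * math.sin(theta)
        best = max(best, abs(quad_form_jordan(x1, x2)))
    return best


if __name__ == "__main__":
    radius = sampled_numerical_radius()
    # The operator norm of T is 1, so the ratio below approximates 2.
    print(radius, 1.0 / radius)
```

Since $|\langle Tx,x\rangle|=|x_1||x_2|\le 1/2$ for unit vectors, the sampled value approaches $1/2$ from below, and the ratio $\|T\|/\sup_{z\in W(T)}|z|$ approaches 2.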

Our interest in the new notions of configuration constants is inspired by a recent work of Schwenninger and de Vries in [18], where bounds for general homomorphisms between uniform algebras and the algebras of bounded linear operators are studied. In Section 6 we will combine their arguments with the methods of Crouzeix–Palencia to obtain the following estimate, in which $W$ denotes the closure of $W(T)$:

$$\|p(T)\| \le \Bigl(1+\sqrt{1+a(W)}\Bigr)\sup_{z\in W}|p(z)|. \tag{13}$$

For instance, if $W$ is a disk, then $a(W)=0$, which gives the Okubo–Ando result mentioned above. In [18], Schwenninger and de Vries also recovered this result. The estimate (13) is our motivation for the following investigation of the configuration constants $c_{\mathbb{R}}(\Omega)$, $c_{\mathbb{C}}(\Omega)$, and $a(\Omega)$, and the relations between them.

1.4 Main results

1.4.1 Relation between the real and complex constants

Consider the situation in Figure 2, where the triangular image of the complex-valued function $g:\partial\Omega\to\mathbb{C}$ is contained in a disk of radius $1$, and intersects the boundary circle of the disk in three distinct points. The three-point set $\{g(\zeta_1),g(\zeta_2),g(\zeta_3)\}$ is not contained in any open half-circle of the boundary, and it follows from a simple geometric argument (which we shall present in the proofs below) that $\|g+\mathbb{C}1\|_{\partial\Omega}=1$. However, the sides of the triangular image of $g$ are all of lengths strictly less than $2$, and this implies that

A triangular image of a complex-valued function $g$ contained in a disk of radius $1$, with three points on the boundary of a disk.
Fig. 2


If such a function $g$ lies in the image of the unit ball of $C(\partial\Omega)$ under the Neumann–Poincaré operator $K_\Omega$ for some domain $\Omega$ that satisfies $c_{\mathbb{R}}(\Omega)<1$, then a strict inequality $c_{\mathbb{R}}(\Omega)<c_{\mathbb{C}}(\Omega)$ occurs. Our first main result excludes this possibility, and so establishes the simplest possible relation between the real and complex configuration constants.

 

Theorem 1.
The equality
$$c_{\mathbb{R}}(\Omega) = c_{\mathbb{C}}(\Omega)$$
holds for every compact convex domain $\Omega$ with non-empty interior.

It follows that every considered domain has a well-defined configuration constant $c(\Omega)$, which is equal to the operator norm of $K_\Omega$ on $C(\partial\Omega)/\mathbb{C}1$, and which can be computed according to the right-hand side of (8). An important consequence of this result is the inequality

$$a(\Omega) \le c(\Omega), \tag{14}$$

which, as we shall soon see, has some interesting consequences.

Theorem 1 is not nearly as straightforward to prove as it is to state, and the proof takes up a large portion of the article. However, the only property of the Neumann–Poincaré operator used in the proof is that its integral kernel $\{\mu_\zeta\}_{\zeta\in\partial\Omega}$ consists of real-valued measures. In fact, the theorem will be deduced as a corollary of a result that we call the Three-measures theorem, which is a general statement regarding the geometry of the space $C(X)$ of continuous functions on a compact Hausdorff space $X$. This result, which we discuss and prove in Section 2, puts a restriction on the possible configurations of point sets in the plane that arise as values of a collection of real-valued functionals on $C(X)$.

1.4.2 Analytic Neumann’s lemma

Note that the above estimate in (14), together with Neumann’s lemma, implies that a(Ω)<1 whenever Ω is not a triangle or a quadrilateral. This can be improved, for we have an analytic version of Neumann’s lemma, in which no exceptional cases occur.

 

Theorem 2.
The strict inequality
$$a(\Omega) < 1$$
holds for every compact convex domain $\Omega$ with non-empty interior.

Our proof of Theorem 2 is much different from the one given by Schober in his proof of the real Neumann’s lemma in [17], but it works also in the real context. At the end of Section 4 we show how our technique leads to a different proof of Neumann’s lemma.

1.4.3 Functional calculus bounds

The following result has already been mentioned above.

 

Theorem 3.
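Let $T:H\to H$ be a bounded linear operator on a Hilbert space $H$ with numerical range $W(T)$, which has non-empty interior, and let $W$ denote the closure of $W(T)$. Then, for every polynomial $p$, we have
$$\|p(T)\| \le \Bigl(1+\sqrt{1+a(W)}\Bigr)\sup_{z\in W}|p(z)|.$$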

Recall that if the numerical range of an operator has empty interior, then the operator is normal, and so (12) holds with $K=1$. From this observation and Theorem 2 we obtain that for any fixed operator $T:H\to H$, the optimal constant $K$ in (12) is always strictly smaller than $1+\sqrt{2}$. In fact, we deduce from our results that we have the inequality

with a constant

which depends only on the shape of $W=\overline{W(T)}$, and not on the operator $T$ itself. We show in Section 5 that no better universal bound can be obtained by means of the analytic configuration constant: for any $\epsilon>0$ there exists a “thin” quadrilateral $\Omega_\epsilon$ for which we have $a(\Omega_\epsilon)>1-\epsilon$. However, fixing the dimension of the Hilbert space $H$, one may combine earlier results of Crouzeix to obtain a uniform improvement. The optimal constant $K$ in (12) varies with $T$, and we may consider the supremum of these quantities among all operators $T$ on a Hilbert space $H$ of a fixed dimension $N$. In [4, Theorem 2.2], Crouzeix proved that there exists an operator realizing this supremum. An immediate corollary of his result, Theorem 2 and Theorem 3 is the following.

 

Corollary 4.
For every positive integer $N$, there exists a constant $C_N<1+\sqrt{2}$ for which we have
$$\|p(T)\| \le C_N\sup_{z\in W(T)}|p(z)|$$
whenever $T$ is an operator on an $N$-dimensional Hilbert space, and $p$ is a polynomial.

This improves the Crouzeix–Palencia bound in every fixed dimension, although by an unspecified amount.

1.4.4 Estimates for the configuration constants

In Section 5 we present also other computations and estimates for the configuration constants. Surprisingly, in the case of an elliptical domain, the configuration constant is computable exactly, and we obtain

where $a$ and $b$ are the lengths of the semi-axes of the ellipse $\Omega_{a,b}$, and $e$ is the eccentricity of the ellipse, given by $e:=\sqrt{1-b^2/a^2}$ in the case that $a\ge b$. This fact, together with Theorem 1, estimate (13), and the inequality $a(\Omega)\le c(\Omega)$, has the following consequence.

 

Corollary 5.
Let $T:H\to H$ be a bounded linear operator on a Hilbert space $H$ with numerical range contained in (or equal to) the ellipse $\Omega_{a,b}$. Then, for every polynomial $p$, we have
where

Note that the function $a\mapsto K(a,1)$ is continuous and increasing for $a\ge 1$, and we have

Hence the estimate in Corollary 5 gets worse as the eccentricity of the ellipse $\Omega_{a,b}$ grows, and approaches the Crouzeix–Palencia bound in the limit $a\to\infty$. On the other hand, as $a\to 1$, the eccentricity of the ellipse $\Omega_{a,1}$ tends to $0$. The estimate is then close to the conjectured optimal bound $K=2$ and coincides with the Okubo–Ando bound for $a=1$, in which case the domain is a disk. From this perspective, Corollary 5 may be interpreted as an elliptical generalization of the Okubo–Ando estimate.

For many other types of domains, the exact value of $c(\Omega)$ is inaccessible. To help the situation, we establish an integral estimate that gives an upper bound on $c(\Omega)$, roughly speaking in terms of the curvature of $\partial\Omega$. For a fixed $\sigma$ that is not a corner of $\partial\Omega$, recall the definition of $R_{\zeta,\sigma}$ in (5), and consider

$$R_\Omega(\sigma) := \sup_{\zeta\in\partial\Omega\setminus\{\sigma\}} R_{\zeta,\sigma}. \tag{15}$$

If $\kappa(\sigma)$ is the curvature of $\partial\Omega$ at $\sigma$, then $R_\Omega(\sigma)$ is at least as large as the radius of curvature

$$\frac{1}{\kappa(\sigma)}, \tag{16}$$

which is also the radius of the osculating circle at $\sigma$. Geometrically, $R_\Omega(\sigma)$ is the radius of the smallest disk tangent to $\partial\Omega$ at $\sigma$ that contains $\Omega$, if such a disk exists, and it is equal to $\infty$ otherwise. The latter case occurs, for instance, if $\sigma$ lies on a straight line segment contained in $\partial\Omega$. However, if $\partial\Omega$ is sufficiently curved on a segment of $\partial\Omega$, then $R_\Omega$ will be bounded above there. In such a situation we obtain a non-trivial upper bound on $c(\Omega)$.
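As an illustration of the quantity $R_\Omega(\sigma)$, the following sketch evaluates it on an ellipse. It rests on two assumptions that are ours: the formula $R_{\zeta,\sigma}=|\sigma-\zeta|^2/\bigl(2\langle N(\sigma),\sigma-\zeta\rangle\bigr)$, obtained by solving $|(\sigma-\zeta)-RN(\sigma)|^2=R^2$ for $R$, and the reading of $R_\Omega(\sigma)$ as the supremum of $R_{\zeta,\sigma}$ over $\zeta\in\partial\Omega\setminus\{\sigma\}$. The function names and the sampling scheme are hypothetical.

```python
import math


def tangent_circle_radius(a, b, t_sigma, t_zeta):
    """Radius of the circle tangent to the ellipse at sigma and passing through zeta."""
    sx, sy = a * math.cos(t_sigma), b * math.sin(t_sigma)
    zx, zy = a * math.cos(t_zeta), b * math.sin(t_zeta)
    # Outer normal at sigma, proportional to (b*cos t, a*sin t), then normalized.
    nx, ny = b * math.cos(t_sigma), a * math.sin(t_sigma)
    length = math.hypot(nx, ny)
    nx, ny = nx / length, ny / length
    dx, dy = sx - zx, sy - zy
    dot = nx * dx + ny * dy
    if dot <= 0.0:
        return math.inf  # zeta lies on the tangent line: degenerate circle
    return (dx * dx + dy * dy) / (2.0 * dot)


def sup_radius(a, b, t_sigma, n=4000):
    """Approximate the supremum over zeta by sampling the boundary, avoiding zeta = sigma."""
    h = 2.0 * math.pi / n
    return max(
        tangent_circle_radius(a, b, t_sigma, t_sigma + (k + 0.5) * h)
        for k in range(n)
    )


if __name__ == "__main__":
    a, b = 2.0, 1.0
    # At the end of the major axis the radius of curvature is b^2/a = 0.5,
    # while the smallest tangent disk containing the ellipse has radius a = 2.
    print(sup_radius(a, b, 0.0))
    # At the end of the minor axis the supremum equals the radius of curvature a^2/b = 4.
    print(sup_radius(a, b, math.pi / 2.0))
```

For the ellipse with semi-axes 2 and 1 one can check by hand that $R_{\zeta,\sigma}=\tfrac54-\tfrac34\cos t$ at $\sigma=(2,0)$ and $R_{\zeta,\sigma}=\tfrac52+\tfrac32\sin t$ at $\sigma=(0,1)$, where $\zeta=(2\cos t,\sin t)$, which matches the sampled values.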

 

Theorem 6.
With the above notation, we have the estimate

The result implies spectral constant estimates similar to the one in Corollary 5 above. It also generalizes some similar results in the literature. See Section 5 for further details and examples.

1.4.5 An unresolved matter

We have mentioned above that c(Ω)=0 if and only if Ω is a disk. With some additional effort, we will show in Section 5 that the condition a(Ω)=0 also characterizes disks. In this case, we have the equality a(Ω)=c(Ω). It is natural to ask whether other domains exist for which the equality occurs, or if the case of the disk is exceptional.

 

Question.
Do we always have the strict inequality
$$a(\Omega) < c(\Omega)$$
whenever $\Omega$ is not a disk?

As a consequence of Theorem 2 and the exceptional cases of Neumann’s lemma, we see that the strict inequality holds whenever Ω is a triangle or a quadrilateral. The authors have not been able to confirm that the inequality holds in any other examples.

1.5 Notations

Some of our notation has already been introduced above. For a continuous function $f$ defined on a set $X$, we denote by $\|f\|_X$ the supremum of $|f|$ over $X$. For cosets of the form $f+\mathbb{C}1$ we use the convention

with a similar convention for real-valued $f$ and cosets $f+\mathbb{R}1$. A norm without a subscript usually denotes a linear functional norm or a total variation norm of a measure. The distinction will be unimportant and should in any case be easy to deduce from context. We use boldface letters, such as $\mathbf{x}$, to denote vectors in $\mathbb{R}^n$, and plain letters, such as $x_j$, to denote the coordinates.

2 The Three-Measures Theorem

2.1 Definitions of relevant spaces and operators

Theorem 1 will be proved as a corollary of our analysis of three-point configurations

$$\bigl(\ell_1(x),\ell_2(x),\ell_3(x)\bigr)\in\mathbb{C}^3,$$

where $x$ is an element of a given normed space $N$, and $\ell_1,\ell_2,\ell_3\in N^*$ are three bounded linear functionals on $N$. A point configuration of this type has to satisfy certain conditions. For instance, we must have the distance bound

$$|\ell_j(x)-\ell_k(x)| \le \|\ell_j-\ell_k\|_{N^*}\,\|x\|_N, \qquad j,k\in\{1,2,3\}.$$

Our principal interest will be in estimating the radius of the smallest disk that contains such a three-point set.

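In order to use the tools of functional analysis, we will formulate our problem as one of estimating the norm of an operator between normed spaces. To this end, we use the space $\mathbb{C}^3$ of triples of complex numbers, and we equip it with the following norm:

$$\|(a,b,c)\|_{\mathbb{C}^3} := \max\{|a|,|b|,|c|\}. \tag{17}$$

Similarly to our previous notational conventions, we shall set $1:=(1,1,1)\in\mathbb{C}^3$. The quotient norm in the quotient space $\mathbb{C}^3/\mathbb{C}1$ satisfies

$$\|(a,b,c)+\mathbb{C}1\|_{\mathbb{C}^3/\mathbb{C}1} = \min_{\lambda\in\mathbb{C}}\max\{|a-\lambda|,|b-\lambda|,|c-\lambda|\},$$

and it has the geometric interpretation adequate to our problem: it is the radius of the smallest disk containing the three-point set $\{a,b,c\}$. Given a normed space $N$ and three linear functionals $\ell_1,\ell_2,\ell_3\in N^*$, we introduce the linear operator $L:N\to\mathbb{C}^3/\mathbb{C}1$ defined by

$$Lx := \bigl(\ell_1(x),\ell_2(x),\ell_3(x)\bigr)+\mathbb{C}1, \qquad x\in N. \tag{18}$$

With these conventions, each three-point configuration $(\ell_1(x),\ell_2(x),\ell_3(x))$ is contained in a disk of radius at most $\|L\|_{N\to\mathbb{C}^3/\mathbb{C}1}\,\|x\|_N$. We want to estimate the operator norm $\|L\|_{N\to\mathbb{C}^3/\mathbb{C}1}$.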

2.2 Statement of the theorem

Without any information regarding the space $N$ or the functionals $\ell_1,\ell_2,\ell_3$, the optimal estimate is

$$\|L\|_{N\to\mathbb{C}^3/\mathbb{C}1} \le \frac{1}{\sqrt{3}}\max_{j,k}\|\ell_j-\ell_k\|_{N^*}. \tag{19}$$

Indeed, we see that we cannot do better by choosing $N=\mathbb{C}$, $x=1$, and the functionals (scalars) to be the vertices of an equilateral triangle inscribed in the unit circle. For instance,

$$\ell_1 = 1, \qquad \ell_2 = e^{2\pi i/3}, \qquad \ell_3 = e^{4\pi i/3}.$$

The sides of the triangle have the common length $|\ell_i-\ell_j|=\sqrt{3}$, and the smallest disk containing the three points $\ell_i(x)=\ell_i$ is the unit disk itself. Thus, in this case, (19) holds with equality. The estimate holds in general as a consequence of Jung’s theorem, which appeared first in [11], and which in the context of the plane says that any set of diameter $d$ is contained in a disk of radius $d/\sqrt{3}$. In our setting, for $\|x\|_N\le 1$ we have $d\le\max_{j,k}\|\ell_j-\ell_k\|_{N^*}$, and so the estimate (19) follows from Jung’s theorem.
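The equality case of Jung’s theorem described above is easy to confirm numerically. The sketch below is an illustration under our own naming: it computes the radius of the smallest disk containing three points of the plane, using the elementary fact that this disk either has one side of the triangle as a diameter (right or obtuse triangles, and degenerate cases) or is the circumscribed disk (acute triangles), and then evaluates it at the cube roots of unity.

```python
import cmath
import math


def enclosing_radius(p, q, r):
    """Radius of the smallest disk containing the three complex points p, q, r."""
    points = [p, q, r]
    best = None
    # Candidate disks having one side of the triangle as a diameter.
    for i in range(3):
        u, v, w = points[i], points[(i + 1) % 3], points[(i + 2) % 3]
        center = (u + v) / 2.0
        radius = abs(u - v) / 2.0
        if abs(w - center) <= radius * (1.0 + 1e-12):
            best = radius if best is None else min(best, radius)
    if best is not None:
        return best
    # Acute triangle: circumradius = (product of side lengths) / (4 * area).
    side_a, side_b, side_c = abs(q - r), abs(p - r), abs(p - q)
    area = abs(((q - p).conjugate() * (r - p)).imag) / 2.0
    return side_a * side_b * side_c / (4.0 * area)


def diameter(p, q, r):
    """Largest pairwise distance among the three points."""
    return max(abs(p - q), abs(q - r), abs(p - r))


if __name__ == "__main__":
    roots = [cmath.exp(2j * math.pi * k / 3.0) for k in range(3)]
    print(enclosing_radius(*roots))          # radius of the smallest enclosing disk
    print(diameter(*roots) / math.sqrt(3.0))  # Jung's bound d / sqrt(3)
```

For the equilateral configuration both printed quantities agree up to rounding, while for a right triangle such as $0, 2, 2i$ the enclosing radius is half the hypotenuse and lies strictly below the Jung bound.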

In our intended application, the role of the space N is played by C(X), the Banach space of continuous functions on a compact Hausdorff space X, and the functionals are given by integration against real-valued measures

It turns out that the three-point configurations that arise in this way are contained in disks of radius smaller than predicted by Jung’s theorem. The main result of the section is the following.

 

Theorem 7.
Let $C(X)$ be the space of continuous functions on a compact Hausdorff space $X$, and let $L:C(X)\to\mathbb{C}^3/\mathbb{C}1$ be the operator in (18) defined by three functionals induced by three finite real-valued Borel measures $\mu_1,\mu_2,\mu_3$. Then
$$\|L\|_{C(X)\to\mathbb{C}^3/\mathbb{C}1} = \frac{1}{2}\max_{j,k}\|\mu_j-\mu_k\|. \tag{20}$$

It is the “$\le$” estimate in (20) that is the critical one. The lower bound “$\ge$” follows from the definition of the functional norm

We will spend the rest of the section on proving Theorem 7. The outline of the proof is as follows. We will first use duality to formulate the problem in terms of the adjoint operator $L^*$ between the dual spaces. Next, a discretization will help us reduce the dual problem to a finite-dimensional optimization problem. Finally, we will solve the finite-dimensional problem using techniques of convex analysis.

Before proceeding, we remark that the natural generalization of the above theorem to an arbitrary n-tuple of real-valued measures is valid. See Theorem 17 below.

2.3 Dual problem

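Let us denote by $Y$ the space $\mathbb{C}^3/\mathbb{C}1$ equipped with the quotient norm induced by (17). Then the dual space $Y^*$ is the two-dimensional space of three-tuples $(\alpha,\beta,\gamma)$ of complex numbers that satisfy

$$\alpha+\beta+\gamma=0,$$

and the norm on $Y^*$ is given by

$$\|(\alpha,\beta,\gamma)\|_{Y^*} = |\alpha|+|\beta|+|\gamma|.$$

In the case $N=C(X)$, the dual space $C(X)^*$ is just the space of finite Borel measures on $X$. The adjoint operator $L^*:Y^*\to N^*$ is then given by

$$L^*(\alpha,\beta,\gamma) = \alpha\mu_1+\beta\mu_2+\gamma\mu_3,$$

and the estimate (20) is equivalent to

$$\|\alpha\mu_1+\beta\mu_2+\gamma\mu_3\| \le \frac{|\alpha|+|\beta|+|\gamma|}{2}\max_{j,k}\|\mu_j-\mu_k\|, \qquad \alpha+\beta+\gamma=0. \tag{21}$$

Since $\alpha+\beta=-\gamma$ and $(\alpha+\beta+\gamma)\mu_3=0$, we may rewrite the above inequality as

$$\|\alpha\nu_1+\beta\nu_2\| \le \frac{|\alpha|+|\beta|+|\alpha+\beta|}{2}\max\{\|\nu_1\|,\|\nu_2\|,\|\nu_1-\nu_2\|\}, \tag{22}$$

where

$$\nu_1 := \mu_1-\mu_3, \qquad \nu_2 := \mu_2-\mu_3.$$

Note that $\nu_1$ and $\nu_2$ are real-valued if $\mu_1,\mu_2,\mu_3$ are real-valued. Theorem 7 is thus a consequence of the following slightly more general statement in which the topological structure of $X$ does not play a role.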

 

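Proposition 8.
Let $\nu_1$ and $\nu_2$ be two finite real-valued measures on a measurable space $X$. Then for any complex numbers $\alpha,\beta$ we have the inequality
$$\|\alpha\nu_1+\beta\nu_2\| \le \frac{|\alpha|+|\beta|+|\alpha+\beta|}{2}\max\{\|\nu_1\|,\|\nu_2\|,\|\nu_1-\nu_2\|\}, \tag{23}$$
where the norm is the total variation norm $\|\mu\|:=|\mu|(X)$.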

In our next step, we shall simplify the problem further, and show that Proposition 8 can be established by considering finite sets X only.

2.4 Discretization

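With notations as in Proposition 8, set $\sigma:=|\nu_1|+|\nu_2|$. Then $\sigma$ is a positive finite measure on $X$, and by the Radon–Nikodym theorem we have $d\nu_1=f\,d\sigma$ and $d\nu_2=g\,d\sigma$, where $f,g$ are bounded real measurable functions on $X$. For a moment, let $\|\cdot\|_{\sigma,1}$ denote the norm

$$\|h\|_{\sigma,1} := \int_X |h|\,d\sigma.$$

Then Proposition 8 is equivalent to the inequality

$$\|\alpha f+\beta g\|_{\sigma,1} \le \frac{|\alpha|+|\beta|+|\alpha+\beta|}{2}\max\{\|f\|_{\sigma,1},\|g\|_{\sigma,1},\|f-g\|_{\sigma,1}\}. \tag{24}$$

We will say that a function is simple if it only takes on a finite number of distinct values. By standard measure theory, there exist simple measurable real functions $f_m$, $g_m$ on $X$ such that $f_m\to f$ and $g_m\to g$ uniformly on $X$. Clearly $\|\alpha f_m+\beta g_m\|_{\sigma,1}\to\|\alpha f+\beta g\|_{\sigma,1}$. Likewise $\|f_m\|_{\sigma,1}\to\|f\|_{\sigma,1}$, $\|g_m\|_{\sigma,1}\to\|g\|_{\sigma,1}$ and $\|f_m-g_m\|_{\sigma,1}\to\|f-g\|_{\sigma,1}$. Thus, if the inequality (24) holds for each pair of simple functions, then it holds for $f,g$. So it suffices to establish (24) when $f,g$ are simple measurable real functions.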

Hence, suppose that $f,g$ are simple measurable real functions on $X$. We can write them as $f=\sum_{j=1}^n a_j 1_{X_j}$ and $g=\sum_{j=1}^n b_j 1_{X_j}$, where $\{X_1,\dots,X_n\}$ is a measurable partition of $X$, and $a_j,b_j\in\mathbb{R}$ for all $j$. The inequality in (24) becomes

Writing

and

we see that this becomes

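where now $\mathbf{x},\mathbf{y}$ are vectors in $\mathbb{R}^n$ and $\|\cdot\|_1$ denotes the usual $\ell^1$-norm on $\mathbb{R}^n$ given by

$$\|\mathbf{x}\|_1 := \sum_{j=1}^n |x_j|. \tag{25}$$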

To summarize, to prove Proposition 8 and consequently to prove Theorem 7, it suffices to establish the following discrete result.

 

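Proposition 9.
Let $n\ge 1$ and $\mathbf{x},\mathbf{y}\in\mathbb{R}^n$. Then for all complex numbers $\alpha,\beta$ we have the inequality
$$\|\alpha\mathbf{x}+\beta\mathbf{y}\|_1 \le \frac{|\alpha|+|\beta|+|\alpha+\beta|}{2}\max\{\|\mathbf{x}\|_1,\|\mathbf{y}\|_1,\|\mathbf{x}-\mathbf{y}\|_1\}. \tag{26}$$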

This reduction of the problem to the finite-dimensional setting allows us to use the tools of convex analysis.

2.5 Optimization over a convex set

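Consider the set

$$C_n := \bigl\{(\mathbf{x},\mathbf{y})\in\mathbb{R}^n\times\mathbb{R}^n : \max\{\|\mathbf{x}\|_1,\|\mathbf{y}\|_1,\|\mathbf{x}-\mathbf{y}\|_1\}\le 1\bigr\}. \tag{27}$$

Thus $C_n$ is a compact convex polytope in $\mathbb{R}^n\times\mathbb{R}^n$, and so it has a finite number of extreme points, that is, points of $C_n$ that do not lie in the interior of any line segment in $C_n$. A well-known theorem of Carathéodory says that each point of a compact convex polytope is a convex combination of its extreme points.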

 

Lemma 10.

In order to establish Proposition 9, it suffices to show that the inequality (26) holds for every extreme point of Cn.

 

Proof.
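Let us fix $\alpha,\beta\in\mathbb{C}$ and $(\mathbf{x},\mathbf{y})\in\mathbb{R}^n\times\mathbb{R}^n$. By the homogeneity of the inequality in (26), we may assume that
$$\max\{\|\mathbf{x}\|_1,\|\mathbf{y}\|_1,\|\mathbf{x}-\mathbf{y}\|_1\}=1. \tag{28}$$
Then $(\mathbf{x},\mathbf{y})\in C_n$ and so we may express it as a convex combination of the extreme points of $C_n$, namely
$$(\mathbf{x},\mathbf{y}) = \sum_{k=1}^m t_k(\mathbf{e}_k,\mathbf{f}_k),$$
where $\mathbf{e}_k\in\mathbb{R}^n$, $\mathbf{f}_k\in\mathbb{R}^n$, the pairs $(\mathbf{e}_k,\mathbf{f}_k)$ are extreme points of $C_n$, $t_k>0$, and $\sum_{k=1}^m t_k=1$. Note that since $(\mathbf{e}_k,\mathbf{f}_k)$ is an extreme point of $C_n$, it lies on the boundary of $C_n$, and so we must have
$$\max\{\|\mathbf{e}_k\|_1,\|\mathbf{f}_k\|_1,\|\mathbf{e}_k-\mathbf{f}_k\|_1\}=1.$$
Since we are assuming that (26) holds for extreme points, we can estimate
$$\|\alpha\mathbf{x}+\beta\mathbf{y}\|_1 \le \sum_{k=1}^m t_k\|\alpha\mathbf{e}_k+\beta\mathbf{f}_k\|_1 \le \sum_{k=1}^m t_k\,\frac{|\alpha|+|\beta|+|\alpha+\beta|}{2} = \frac{|\alpha|+|\beta|+|\alpha+\beta|}{2}.$$
Recalling our normalization in (28), this is the desired estimate in (26).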

From the above lemma and our sequence of reductions, it follows that in order to prove Theorem 7 it suffices to show that the inequality (26) holds at every extreme point of the polytope $C_n$. Proposition 11 below characterizes these extreme points by partitioning them into three equivalence classes.

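Note that $C_n$ is invariant under the following linear symmetries:

$$(\mathbf{x},\mathbf{y}) \mapsto \bigl((x_{\pi(1)},\dots,x_{\pi(n)})^T,(y_{\pi(1)},\dots,y_{\pi(n)})^T\bigr), \tag{29}$$

where $\pi$ is any permutation of $\{1,2,\dots,n\}$,

$$(\mathbf{x},\mathbf{y}) \mapsto \bigl((\epsilon_1x_1,\dots,\epsilon_nx_n)^T,(\epsilon_1y_1,\dots,\epsilon_ny_n)^T\bigr) \tag{30}$$

for any choice of $\epsilon_1,\dots,\epsilon_n\in\{-1,1\}$,

$$(\mathbf{x},\mathbf{y}) \mapsto (\mathbf{y},\mathbf{x}), \tag{31}$$

and

$$(\mathbf{x},\mathbf{y}) \mapsto (\mathbf{x},\mathbf{x}-\mathbf{y}). \tag{32}$$

Denote by $G_n$ the group generated by these symmetries. As these symmetries are linear automorphisms of $\mathbb{R}^n\times\mathbb{R}^n$, it is clear that $G_n$ leaves invariant the set of extreme points of $C_n$. We say that two extreme points of $C_n$ are $G_n$-equivalent if there is an element of $G_n$ mapping one of them to the other. Thus the action of $G_n$ on $C_n$ partitions the set of extreme points of $C_n$ into a finite number of equivalence classes. Note that if the inequality (26) holds for some $(\mathbf{x},\mathbf{y})\in C_n$, then it holds also for any point of $C_n$ in the orbit of $(\mathbf{x},\mathbf{y})$ under the group action of $G_n$ on $C_n$.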

The extreme points of Cn are identified in the following proposition.

 

Proposition 11.
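If $n\ge 3$, then every extreme point $(\mathbf{x},\mathbf{y})$ of $C_n$ is $G_n$-equivalent to one of the pairs
$$\bigl((1,0,0,\dots,0)^T,(1,0,0,\dots,0)^T\bigr), \qquad \bigl((1,0,0,\dots,0)^T,(\tfrac12,\tfrac12,0,\dots,0)^T\bigr), \qquad \bigl((\tfrac12,\tfrac12,0,\dots,0)^T,(\tfrac12,0,\tfrac12,\dots,0)^T\bigr).$$

One can readily check that each of the three above pairs really is an extreme point of $C_n$. We omit the proof, since we do not actually need this fact. In the case that $n=1$, the same result holds, but only the first kind of pair can arise. Likewise, if $n=2$, the same result holds, but only the first two types of pairs can arise.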

We will prove Proposition 11 in Section 2.6. For now let us see how Theorem 7 follows. In order to verify (26) for all extreme points of Cn, it suffices to verify the inequality for the three pairs of vectors appearing in Proposition 11. This is an easy task. For instance, if (x,y) is the second pair in Proposition 11, then we have

The inequality for the other two pairs is verified similarly. Then from Lemma 10 we conclude that Proposition 9 holds, from which Theorem 7 follows by the earlier reduction.
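The verification at the extreme points can also be spot-checked numerically. The sketch below is an illustration only and is not part of the proof. It assumes that (26) reads $\|\alpha\mathbf{x}+\beta\mathbf{y}\|_1\le\tfrac12(|\alpha|+|\beta|+|\alpha+\beta|)\max\{\|\mathbf{x}\|_1,\|\mathbf{y}\|_1,\|\mathbf{x}-\mathbf{y}\|_1\}$, and that the three representative pairs are those produced by Cases 1 to 3 of the proof in Section 2.6, written here for $n=3$; the helper names are hypothetical.

```python
import random


def norm1(vector):
    """The l^1 norm of a vector with real or complex entries."""
    return sum(abs(entry) for entry in vector)


def gap(alpha, beta, x, y):
    """Right-hand side minus left-hand side of the assumed form of (26)."""
    lhs = norm1([alpha * a + beta * b for a, b in zip(x, y)])
    difference = [a - b for a, b in zip(x, y)]
    scale = max(norm1(x), norm1(y), norm1(difference))
    rhs = 0.5 * (abs(alpha) + abs(beta) + abs(alpha + beta)) * scale
    return rhs - lhs


# Representatives of the three classes of extreme points (n = 3), as assumed above.
EXTREME_PAIRS = [
    ((1.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    ((1.0, 0.0, 0.0), (0.5, 0.5, 0.0)),
    ((0.5, 0.5, 0.0), (0.5, 0.0, 0.5)),
]


def random_complex(rng):
    return complex(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))


def smallest_gap_on_extreme_pairs(trials=2000, seed=1):
    """Smallest observed gap over random complex alpha, beta at the three pairs."""
    rng = random.Random(seed)
    return min(
        gap(random_complex(rng), random_complex(rng), x, y)
        for _ in range(trials)
        for x, y in EXTREME_PAIRS
    )


def smallest_gap_on_random_vectors(n=5, trials=2000, seed=2):
    """Smallest observed gap over random real vectors x, y in R^n."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        y = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        best = min(best, gap(random_complex(rng), random_complex(rng), x, y))
    return best


if __name__ == "__main__":
    print(smallest_gap_on_extreme_pairs())
    print(smallest_gap_on_random_vectors())
```

For the third pair the two sides of the inequality coincide for every $\alpha,\beta$, so the smallest gap on the extreme pairs is zero up to rounding; a negative gap beyond rounding error would indicate a counterexample.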

It remains to prove Proposition 11.

2.6 Extreme points of the polytope

In the proof of Proposition 11, the group Gn generated by the symmetries (29)–(32) will be extensively used. In particular we will use the property that (x,y) is an extreme point of Cn if and only if some extreme point of Cn is Gn-equivalent to it. Moreover, the following two observations will be useful to single out.

 

Lemma 12.
If for a pair $(\mathbf{x},\mathbf{y})\in C_n$ there exist two distinct indices $j,k$ such that
$$x_j>0, \qquad x_k>0, \qquad y_j>0, \qquad y_k>0,$$
then $(\mathbf{x},\mathbf{y})$ is not an extreme point of $C_n$.

More generally, if for two distinct indices $j,k$ we have that two of the quantities $x_jx_k$, $y_jy_k$ and $(x_j-y_j)(x_k-y_k)$ are non-zero and have the same sign, then $(\mathbf{x},\mathbf{y})$ is not an extreme point of $C_n$.

 

Proof.
Using the symmetry (29) we may suppose that $j=1$, $k=2$. Note that $x_1<1$, $x_2<1$, since $\|\mathbf{x}\|_1\le 1$. The same is true for the corresponding coordinates of $\mathbf{y}$. Let $\mathbf{d}=(1,-1,0,\dots,0)^T\in\mathbb{R}^n$. It is easy to verify that if $t$ is a real number, and $|t|$ is sufficiently small, then we have
$$(\mathbf{x}+t\mathbf{d},\mathbf{y}+t\mathbf{d})\in C_n.$$
Thus $(\mathbf{x},\mathbf{y})$ lies in the interior of a line segment inside $C_n$, and so is not an extreme point of $C_n$.

The more general statement follows by applying a sequence of the symmetries in (29)–(32) to transform $(\mathbf{x},\mathbf{y})$ satisfying the more general assumption into a point $(\mathbf{x}',\mathbf{y}')$ where the first two coordinates of the vectors $\mathbf{x}'$ and $\mathbf{y}'$ are positive.

 

Lemma 13.

If for a pair $(\mathbf{x},\mathbf{y})\in C_n$ the vector $\mathbf{x}$ or $\mathbf{y}$ has at least three non-zero coordinates, then $(\mathbf{x},\mathbf{y})$ is not an extreme point of $C_n$.

 

Proof.

By using symmetries (29)–(31) we may suppose that the coordinates $x_1,x_2,x_3$ are non-zero and positive. If two of the coordinates $y_1,y_2,y_3$ are positive, then by Lemma 12 we conclude that $(\mathbf{x},\mathbf{y})$ is not an extreme point of $C_n$. In the contrary case, two of the coordinates $y_1,y_2,y_3$ are non-positive. Then again by Lemma 12 and the symmetry (32) the pair $(\mathbf{x},\mathbf{x}-\mathbf{y})\in C_n$ is not extreme, and thus neither is $(\mathbf{x},\mathbf{y})$, since these two pairs are $G_n$-equivalent.

We are ready to prove Proposition 11. We denote by $\ell^1_n$ the space $\mathbb{R}^n$ equipped with the norm $\|\cdot\|_1$ given by (25). Recall that the extreme points of the unit ball $B:=\{\mathbf{x}\in\mathbb{R}^n:\|\mathbf{x}\|_1\le 1\}$ are the vectors with precisely one non-zero coordinate, this coordinate being equal to $\pm 1$.

 

Proof of Proposition 11.

We will split up the proof into three cases, each case corresponding to one of the pairs in the statement of the proposition.

Case 1: At least one of the norms $\|\mathbf{x}\|_1$, $\|\mathbf{y}\|_1$, $\|\mathbf{x}-\mathbf{y}\|_1$ is strictly less than $1$. We will show that in this case $(\mathbf{x},\mathbf{y})$ is $G_n$-equivalent to the first pair in the statement of the proposition.

By applying a suitable combination of symmetries (29)–(32), we may suppose that in fact $\|\mathbf{x}-\mathbf{y}\|_1<1$. We claim that $\mathbf{x}$ must be an extreme point of the unit ball of $\ell^1_n$. For if not, then it lies at the midpoint of a line segment $I$ such that $\|\mathbf{x}'\|_1\le 1$ for all $\mathbf{x}'\in I$. Since $\|\mathbf{x}-\mathbf{y}\|_1<1$, by shrinking $I$ if necessary, we also have $\|\mathbf{x}'-\mathbf{y}\|_1<1$ for all $\mathbf{x}'\in I$. Thus $I\times\{\mathbf{y}\}$ is a line segment in $C_n$ with interior point $(\mathbf{x},\mathbf{y})$, contradicting the fact that $(\mathbf{x},\mathbf{y})$ is extreme.

Likewise, $\mathbf{y}$ is extreme in the unit ball of $\ell^1_n$. Applying a suitable symmetry, we may suppose that $x_1=1$ and $y_j=\pm 1$ for some $j$, all the other entries of $\mathbf{x}$ and $\mathbf{y}$ being $0$. Since we must have $\|\mathbf{x}-\mathbf{y}\|_1<1$, this implies that actually $j=1$ and $y_1=1$. Thus $(\mathbf{x},\mathbf{y})$ is equivalent to the first pair of vectors listed in the statement of the proposition. This concludes Case 1.

Case 2: We have $\|x\|_1=\|y\|_1=\|x-y\|_1=1$, and one of the vectors $x$, $y$, or $x-y$ has only one non-zero coordinate. In this case, we will show that $(x,y)$ is $G_n$-equivalent to the second pair in the statement of the proposition.

Using our symmetries, we may suppose that $x=(1,0,\ldots,0)^T$. Note that
$$\|y\|_1=|y_1|+\sum_{j\ge 2}|y_j|=1$$
and
$$\|x-y\|_1=|1-y_1|+\sum_{j\ge 2}|y_j|=1$$
force
$$|1-y_1|=|y_1|,$$
the unique real solution $y_1$ to this equation being $y_1=1/2$. By Lemma 13, $y$ has only one other non-zero coordinate, and $\|y\|_1=1$ forces this coordinate to be equal to $\pm 1/2$. Applying the symmetries (29) and (30) we conclude that $(x,y)$ is $G_n$-equivalent to the second pair in the statement. This concludes Case 2.

Case 3: We have $\|x\|_1=\|y\|_1=\|x-y\|_1=1$, and all of the vectors $x$, $y$ and $x-y$ have exactly two non-zero coordinates. We will show that $(x,y)$ is $G_n$-equivalent to the third pair in the statement of the proposition.

This case is slightly more complicated than the previous two. As before, we may suppose that $x_1>0$ and $x_2>0$. We claim that $y_1$ and $y_2$ cannot both be equal to zero. If they were, then $x-y$ would have four non-zero coordinates, contrary to the assumption. In fact, precisely one of $y_1$ and $y_2$ must be non-zero. If both were non-zero, then since $x-y$ has exactly two non-zero coordinates, we would have $x_1-y_1\neq 0$ and $x_2-y_2\neq 0$. Then the three quantities $x_1x_2$, $y_1y_2$ and $(x_1-y_1)(x_2-y_2)$ would be non-zero, and Lemma 12 would imply that $(x,y)$ is not an extreme point.

By an application of symmetries we may, in addition to $x_1>0$ and $x_2>0$, suppose that $y_1\neq 0$, $y_2=0$ and $y_3=s>0$. Since $x_1+x_2=1$, we have $x_1=t$, $x_2=1-t$ for some $t\in(0,1)$. Our vectors thus have the following structure:
$$x=(t,\,1-t,\,0,\,0,\ldots,0)^T,\qquad y=(y_1,\,0,\,s,\,0,\ldots,0)^T,\qquad x-y=(t-y_1,\,1-t,\,-s,\,0,\ldots,0)^T.$$
Recall that $x-y$ has only two non-zero coordinates. Since $1-t\neq 0$ and $s\neq 0$, we conclude from the above that $t=y_1$. But then $\|x-y\|_1=1-t+s=1$, and so $t=s$. Finally, $1=\|y\|_1=t+s=2s$ shows that $s=t=1/2$, and so $(x,y)$ is $G_n$-equivalent to the third pair in the statement of the proposition.

3 Proof of Theorem 1

In addition to Theorem 7 from Section 2, we will also need some facts from plane geometry in order to prove Theorem 1. In particular, we will need to discuss the minimum enclosing disk problem appearing in computational geometry.

3.1 Minimal enclosing disk

Let $K$ be a compact subset of $\mathbb{C}$ containing at least two points. Among all closed disks that contain $K$ there exists a unique one of minimal radius. We will denote this disk by $\mathbb{D}_K$ and call it the minimal disk for $K$. The radius of $\mathbb{D}_K$ will be denoted by $R(K)$.

If $\mathbb{D}_K$ is minimal for $K$, then the intersection $K\cap\partial\mathbb{D}_K$ must be non-empty. In fact, this intersection must contain at least two points, and there is also a restriction on the locations of the points in $K\cap\partial\mathbb{D}_K$.

 

Lemma 14.

Let $K$ be a compact subset of $\mathbb{C}$ containing at least two points. Then the intersection $\partial\mathbb{D}_K\cap K$ is not contained in any arc of $\partial\mathbb{D}_K$ of length strictly smaller than half of the circumference of $\partial\mathbb{D}_K$. In particular, if $K\cap\partial\mathbb{D}_K=\{a,b\}$ is a two-point set, then $a$ and $b$ are antipodal on $\partial\mathbb{D}_K$.

 

Proof.
Seeking a contradiction, assume that $\partial\mathbb{D}_K\cap K$ is contained in an arc of length strictly less than half of the circumference of $\partial\mathbb{D}_K$. By translation, rescaling, and rotation of the setting, we may assume that $\mathbb{D}_K$ is the unit disk, and that $\partial\mathbb{D}_K\cap K$ is contained in some half-space
$$\{z\in\mathbb{C}:\operatorname{Re}z\ge\delta\},\qquad \delta>0.$$
By compactness, the distance between the compact sets $K$ and $\partial\mathbb{D}_K\cap\{z\in\mathbb{C}:\operatorname{Re}z\le\delta/2\}$ is positive. It follows that we may translate the disk $\mathbb{D}_K$ in the positive direction of the real axis, and then shrink the radius of the translated disk slightly, and the resulting disk will still contain $K$, yet be of strictly smaller radius than $R(K)$. See Figure 3. This contradiction establishes Lemma 14.

Fig. 3

The initial disk $\mathbb{D}_K$ is the dashed circle, and we assume that $\partial\mathbb{D}_K\cap K$ is contained in the thick black arc. Then $K$ will be contained in the grey disk, which is obtained from $\mathbb{D}_K$ by first translating $\mathbb{D}_K$ in the direction of the positive real axis, and then slightly shrinking the translated disk. This contradicts the minimality of $\mathbb{D}_K$.

Fig. 4

The thick arc $J$ between $a$ and $b$ is the smallest containing the compact set $K$. It follows that the shorter arc between the antipodal points $\tilde{a}$ and $\tilde{b}$ must contain points of $K$.

 

Lemma 15.

Let $T=\{a,b,c\}$ be a three-point set. If $D$ is a closed disk for which $T\subset\partial D$, and $T$ is not contained in any arc of $\partial D$ of length strictly smaller than half of the circumference of $\partial D$, then $D=\mathbb{D}_T$.

 

Proof.

Assume, seeking a contradiction, that $D\neq\mathbb{D}_T$, so that $R(T)$ is strictly smaller than the radius of $D$. Since $\partial D$ is the unique circle passing through the three points $a,b,c$, we must have that $T\cap\partial\mathbb{D}_T$ contains precisely two points. Say $a,b\in\partial\mathbb{D}_T$ but $c\notin\partial\mathbb{D}_T$. Lemma 14 implies that $a$ and $b$ are antipodal on $\partial\mathbb{D}_T$. By translation, rescaling, and rotation, we may assume that $\mathbb{D}_T$ is the unit disk, $a=i$, $b=-i$, $c$ has non-negative real part and $|c|<1$. After these operations, we have that $R(T)=1$ and the circumference of $\partial D$ is larger than $2\pi$. Thus by hypothesis, surely $T$ is not contained in any arc of $\partial D$ of length strictly smaller than $\pi$. But the shorter of the arcs of $\partial D$ that contain $T$ is then contained in $\{z\in\mathbb{C}:0\le\operatorname{Re}z,\ |z|\le 1\}$, and so this arc must have a length smaller than $\pi$. This is a contradiction, and the lemma follows.

3.2 Reduction to three-point sets

The following simple result on minimal disks makes it possible to apply Theorem 7 to more than three measures.

 

Lemma 16.

Let $K$ be a compact subset of $\mathbb{C}$ containing at least two points. There exists a subset $T\subset K$ that contains at most three points and for which $\mathbb{D}_K=\mathbb{D}_T$. In particular, $R(K)=R(T)$.

It may be convenient to refer to Figure 4 during the reading of the proof.

 

Proof.

If there are two points in $K$ that are antipodal on $\partial\mathbb{D}_K$, then we take $T$ to consist of those two points. Clearly $\mathbb{D}_K=\mathbb{D}_T$. In the case that no pair of antipodal points of $\partial\mathbb{D}_K$ is contained in $K$, let $J$ be the shortest closed arc of $\partial\mathbb{D}_K$ that contains $K$, and let $a,b\in J\cap K$ be the end-points of $J$. By Lemma 14, the length of $J$ is strictly larger than half of the circumference of $\partial\mathbb{D}_K$, and so $J$ is the longer of the two arcs between $a$ and $b$. Let $\tilde{a}$ and $\tilde{b}$ be the points on $\partial\mathbb{D}_K$ that are antipodal to $a$ and $b$, respectively. By assumption, $\tilde{a}\notin K$ and $\tilde{b}\notin K$. We claim that the shorter of the two open arcs between $\tilde{a}$ and $\tilde{b}$ must contain points of $K$. If not, then the longer of the two arcs between $\tilde{a}$ and $\tilde{b}$ would contain $K$ in its interior, and this arc has the same length as $J$. A routine compactness argument would then contradict the minimality of $J$.

Let $T=\{a,b,c\}$, where $c\in K$ is any point contained in the shorter open arc between $\tilde{a}$ and $\tilde{b}$. Note that any arc containing $T$ must contain either $\tilde{a}$ or $\tilde{b}$. Then such an arc contains two antipodal points on $\partial\mathbb{D}_K$, and so it has a length that is at least half of the circumference of $\partial\mathbb{D}_K$. By Lemma 15 we conclude that $\mathbb{D}_K=\mathbb{D}_T$.
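For a finite set $K$, Lemma 16 can be illustrated numerically. The following is a minimal brute-force sketch, not an algorithm taken from the paper: it assumes that $K$ is given as a list of Python complex numbers containing at least two distinct points, and the function names are hypothetical. It searches over all pairs (treated as diameters) and all triples (circumscribed circles), and keeps the smallest candidate disk that contains $K$, returning at the same time a subset $T$ of at most three points with $\mathbb{D}_K=\mathbb{D}_T$.

```python
from itertools import combinations

def disk_from_two(a, b):
    """Closed disk having the segment [a, b] as a diameter."""
    return (a + b) / 2, abs(a - b) / 2

def disk_from_three(a, b, c):
    """Circumscribed disk of the triangle a, b, c (None if nearly collinear)."""
    # Translate so that a is the origin, then solve for the circumcentre.
    b0, c0 = b - a, c - a
    d = 2 * (b0.real * c0.imag - b0.imag * c0.real)
    if abs(d) < 1e-12:
        return None
    ux = (c0.imag * abs(b0) ** 2 - b0.imag * abs(c0) ** 2) / d
    uy = (b0.real * abs(c0) ** 2 - c0.real * abs(b0) ** 2) / d
    centre = a + complex(ux, uy)
    return centre, abs(centre - a)

def contains(disk, points, tol=1e-9):
    """Check that every point lies in the closed disk, up to a tolerance."""
    centre, radius = disk
    return all(abs(p - centre) <= radius + tol for p in points)

def minimal_disk(points):
    """Return (centre, radius, T), where T is a subset of at most three points."""
    best = None
    for size, build in ((2, disk_from_two), (3, disk_from_three)):
        for T in combinations(points, size):
            disk = build(*T)
            if disk is None or not contains(disk, points):
                continue
            if best is None or disk[1] < best[1]:
                best = (disk[0], disk[1], T)
    return best

# Hypothetical example: the corners of a square plus one interior point.
K = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j, 0.3 + 0.2j]
centre, radius, T = minimal_disk(K)
print(centre, radius, T)
```

The search is cubic in the number of points and is meant only to make the statement concrete; for the square above, two antipodal corners already determine the minimal disk, which corresponds to the first case in the proof.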

3.3 Finalizing the proof

We are finally ready to give a proof of the equality cR(Ω)=cC(Ω).

 

Proof.
(Proof of Theorem 1)
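Since $c_{\mathbb{R}}(\Omega)\le c_{\mathbb{C}}(\Omega)$, it will suffice to show the reverse inequality. To this end, we need to show that given $f\in C(\partial\Omega)$ satisfying $\|f\|_{\partial\Omega}\le 1$, we have $\|K_\Omega f+\mathbb{C}1\|_{\partial\Omega}\le c_{\mathbb{R}}(\Omega)$. Since $K_\Omega f$ is continuous, the image $K=K_\Omega f(\partial\Omega)$ is a compact subset of $\mathbb{C}$. If $K$ consists of a single point, then $\|K_\Omega f+\mathbb{C}1\|_{\partial\Omega}=0$, and the proof is complete. Otherwise, let $\mathbb{D}_K$ be the minimal disk for $K$. We use Lemma 16 to obtain a three-point set $T=\{a,b,c\}\subset K$ for which $R(T)=R(K)$ (note that if $K\cap\partial\mathbb{D}_K$ contains only two points $\{a,b\}$, then we may pick $c\in K$ arbitrarily to complete $T$ to a three-point set). The geometric interpretation of the quotient norm in $C(\partial\Omega)/\mathbb{C}1$ implies that $\|K_\Omega f+\mathbb{C}1\|_{\partial\Omega}=R(K)=R(T)$. Since $T$ is contained in the image of $K_\Omega f$, there exist $\zeta_1,\zeta_2,\zeta_3\in\partial\Omega$ such that
Since $K_\Omega f(\zeta_j)=\int_{\partial\Omega}f\,d\mu_{\zeta_j}$, we may apply Theorem 7 to $X=\partial\Omega$, $\mu_j=\mu_{\zeta_j}$ for $j=1,2,3$, and conclude that the operator $L:C(\partial\Omega)\to\mathbb{C}^3/\mathbb{C}1$ defined by
has a norm satisfying the bound (20). With $\|\cdot\|$ denoting the norm on $\mathbb{C}^3/\mathbb{C}1$ given in (17), we obtain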

The earlier mentioned extension of Theorem 7 to an $n$-measures theorem is obtained by employing the same argument as in the above proof. The normed space $\mathbb{C}^n/\mathbb{C}1$ appearing below is defined analogously to the case $n=3$ treated in Section 2.1.

 

Theorem 17.
Let $C(X)$ be the space of continuous functions on a compact Hausdorff space $X$, $n\ge 3$ an integer, and $L:C(X)\to\mathbb{C}^n/\mathbb{C}1$ the operator defined by
where $\mu_1,\ldots,\mu_n$ are finite real-valued Borel measures on $X$. Then

 

Proof.

We use Lemma 16 to pick a three-point subset $T$ of $K=\{\mu_j(f)\}_{j=1}^n$ for which we have $R(K)=R(T)$, and apply Theorem 7 as in the preceding proof.

4 Proof of Theorem 2

4.1 Exploiting subsequences

We will argue by contradiction in order to prove Theorem 2. That is, we will assume that there exists a convex domain Ω with a(Ω)=1, and so that there exists a sequence of functions (fn) in A(Ω), which satisfy

and

(33)

We shall see that this leads to a contradiction. The proof technique below is different from the one employed by Schober in [17] in his proof of Neumann’s lemma, and analyticity is used only at the very end of the proof. In fact, we shall remark at the end of the section how our arguments lead to a new proof of Neumann’s lemma that is different from the one in [17].

Thus, for now, we assume merely that $f_n\in C(\partial\Omega)$, and we will derive certain consequences of (33). In the course of the proof we shall replace the sequence $(f_n)$ by a subsequence multiple times, and for convenience we will not change the subscripts. We may suppose that $\|f_n\|_{\partial\Omega}=1$, and consequently that the images

are contained in the closed disk of radius 1 centred at the origin. For large $n$, this observation and (33) force the image of $K_\Omega f_n$ to contain points outside of any disk centred at the origin of radius strictly less than 1. By exchanging $f_n$ for a unimodular multiple of itself, we may thus assume that there exists a sequence of points $(\zeta_n)$ in $\partial\Omega$ for which we have

(34)

Using that the functions $f_n$ are bounded by 1 in modulus, and that the positive measures $d\mu_\zeta$ are of unit mass, we obtain

Recall from (3) that $\rho_{\zeta_n}$ denotes the $ds$-absolutely continuous part of $\mu_{\zeta_n}$. The above computation implies that

(35)

Compactness of the boundary $\partial\Omega$ implies that we may assume convergence of the sequence $(\zeta_n)$ to some point $\zeta\in\partial\Omega$. The following lemma shows that we may replace in (35) the densities $\rho_{\zeta_n}$ with the density $\rho_\zeta$.

 

Lemma 18.
With notations as above, we have
(36)
Consequently, after passing to a subsequence, we can ensure that
for almost every $\sigma\in\partial\Omega$ with respect to the measure $\rho_\zeta\,ds$.

 

Proof.
Note that whenever $\sigma$ is not a corner of $\partial\Omega$ or any of the points $\zeta_n$ or $\zeta$, we have
If $B=B(\zeta,\delta)$ is a disk around $\zeta$ of small radius $\delta>0$, then for large enough $n$ the denominator on the right-hand side above is uniformly bounded from below for $\sigma\in\partial\Omega\setminus B$, with the exception of a countable set. This shows uniform convergence of $\rho_{\zeta_n}(\sigma)$ to $\rho_\zeta(\sigma)$ for $\sigma\in\partial\Omega\setminus B$, again with the exception of an at most countable set. Since $|f_n-1|^2\le 4$, we obtain from (35) that
Since $\partial\Omega\cap B$ is an arc whose length tends to 0 as the radius $\delta$ of $B$ tends to 0, the last quantity above can be made arbitrarily small by choosing $\delta$ small enough. This establishes (36). Basic measure theory now implies that we may pass again to a subsequence and ensure the pointwise convergence $f_n\to 1$ almost everywhere with respect to $\rho_\zeta\,ds$.

Our next observation extracts more information from (33). Consider the strips

These strips have a fixed large “length” but shrinking “width”. One such strip is marked in Figure 5. We claim that each one of the strips $S_\delta$ intersects the images $K_\Omega f_n(\partial\Omega)$ non-trivially for infinitely many indices $n$. For if not, then for some fixed $\delta>0$, we would have that $S_\delta\cap K_\Omega f_n(\partial\Omega)=\varnothing$ for all sufficiently large $n$, which means that the images $K_\Omega f_n(\partial\Omega)$ are entirely contained in $B(0,1)\setminus S_\delta$, where $B(0,1)$ denotes the closed disk of radius 1 centred at the origin. But if $\epsilon_1$ and $\epsilon_2$ are sufficiently small positive numbers, then $B(0,1)\setminus S_\delta\subset B(\epsilon_1,1-\epsilon_2)$, a disk of radius $1-\epsilon_2$ centred at the point $\epsilon_1\in\mathbb{R}$. See Figure 5. Recalling the geometric interpretation of the norm $\|K_\Omega f_n+\mathbb{C}1\|_{\partial\Omega}$ as the radius of the smallest disk containing the image of $K_\Omega f_n$, we would arrive at a contradiction to (33). Thus every strip $S_\delta$ contains points in the image of $K_\Omega f_n$ for infinitely many $n$.

Fig. 5

The unit disk in dark grey with the strip $S_\delta$ removed. The dotted circle containing the dark grey area has a radius slightly smaller than $1$.

 

Lemma 19.
With notations as above, we may pass to a subsequence again, and obtain a new sequence $(\zeta'_n)$ that converges to some point $\zeta'\in\partial\Omega$, and such that
for some unimodular constant $\alpha\neq 1$ and for almost every $\sigma\in\partial\Omega$ with respect to the measure $\rho_{\zeta'}\,ds$.

 

Proof.

Since each strip $S_\delta$ intersects the images of $K_\Omega f_n$ for infinitely many $n$, passing to a subsequence and a routine compactness argument produce a sequence $(\zeta'_n)$ convergent to some $\zeta'\in\partial\Omega$, for which $K_\Omega f_n(\zeta'_n)\to\alpha$, with $\alpha$ unimodular and lying in the closure of each of the strips $S_\delta$. Thus $\alpha\neq 1$. We therefore merely need to repeat the previous arguments to see that, after passing to a subsequence, we will have $f_n(\sigma)\to\alpha$ for almost every $\sigma$ with respect to the measure $\rho_{\zeta'}\,ds$.

4.2 Proof of Theorem 2

The above arguments are valid for $f_n\in C(\partial\Omega)$. However, under the assumption of analyticity, the sequence $(f_n)$ cannot converge to two different constants on two different sets of positive arclength measure. To make this statement precise, we appeal to the classical theory of analytic functions in the (open) unit disk $\mathbb{D}=\{z\in\mathbb{C}:|z|<1\}$. Here [8, Chapter II] is an excellent reference for the claims made in the following proof.

 

Proof of Theorem 2.

Let $H^\infty=H^\infty(\mathbb{D})$ be the space of bounded analytic functions in $\mathbb{D}$, identified as usual through boundary function correspondence with a weak-star closed subspace of the space $L^\infty(\partial\mathbb{D})=(L^1(\partial\mathbb{D}))^*$ of bounded measurable functions on $\partial\mathbb{D}$, the dual of the Lebesgue space $L^1(\partial\mathbb{D})$ of functions integrable on $\partial\mathbb{D}$ with respect to the Lebesgue measure (arclength measure) on $\partial\mathbb{D}$. It is well known that a function $\tilde{f}\in H^\infty$ that vanishes on a subset of positive Lebesgue measure on $\partial\mathbb{D}$ must vanish identically.

Fix some conformal mapping $\phi:\mathbb{D}\to\Omega^o$. Under the assumption that $f_n\in A(\Omega)$, $\|f_n\|_{\partial\Omega}\le 1$, the functions
are bounded in modulus by 1 in $\mathbb{D}$. By Carathéodory’s classical theorem (see, for instance, [9, Chapter I.3]), $\phi$ extends to a homeomorphism between $\overline{\mathbb{D}}$ and $\Omega$. If $\|K_\Omega f_n+\mathbb{C}1\|_{\partial\Omega}\to 1$, then Lemmas 18 and 19 show that there exist two sets $E,E'\subset\partial\Omega$ of positive arclength measure such that
and
Since $\Omega$ is convex, the curve $\partial\Omega$ is rectifiable, and the general theory of harmonic measure tells us that the sets $\phi^{-1}(E)$ and $\phi^{-1}(E')$ have positive Lebesgue measure (see [9, Chapter VI]). Since $L^1(\partial\mathbb{D})$ is separable and the functions $\tilde{f}_n$ are uniformly bounded by 1 in modulus, the usual Helly-type selection process will produce a subsequence of $(\tilde{f}_n)$ that converges in the weak-star topology to some function $\tilde{f}\in H^\infty$. By the above pointwise convergence, we must have $\tilde{f}\equiv 1$ on $\phi^{-1}(E)$ and $\tilde{f}\equiv\alpha$ on $\phi^{-1}(E')$. Then the non-zero function $\tilde{f}-1$ vanishes on the subset $\phi^{-1}(E)$ of positive Lebesgue measure on $\partial\mathbb{D}$. This is a contradiction, which shows that our assumption $\|K_\Omega f_n+\mathbb{C}1\|_{\partial\Omega}\to 1$ must be false. Theorem 2 follows.

4.3 A proof of Neumann’s lemma

We indicate how one may proceed to use our above arguments to obtain a proof of Neumann’s lemma, stating that c(Ω)=1 if and only if Ω is a triangle or a quadrilateral. We need only the following simple geometric observation regarding the densities ρζ.

 

Lemma 20.

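Fix $\zeta\in\partial\Omega$. Any $\sigma\in\partial\Omega\setminus\{\zeta\}$ that is not a corner of $\partial\Omega$ and that satisfies $\rho_\zeta(\sigma)=0$ is contained in the union of at most two line segments of $\partial\Omega$ containing $\zeta$.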

 

Proof.

It will suffice to show that all $\sigma$ satisfying the above conditions are contained in at most two different tangent lines to $\Omega$. To see this, recall formula (5). The condition $\rho_\zeta(\sigma)=(2\pi R_{\zeta,\sigma})^{-1}=0$ gives $R_{\zeta,\sigma}=\infty$, and so $\zeta$ is contained in the tangent line to $\Omega$ at $\sigma$. The tangent line divides the plane $\mathbb{C}$ into two half-planes, one of which contains $\Omega$. Assume that two different tangent lines, at $\sigma$ and $\sigma'$, intersect at $\zeta$. They divide the plane $\mathbb{C}$ into four sectors, and by convexity precisely one of those sectors contains $\Omega$. Now, any line that passes through $\zeta$ and the open sector containing $\Omega$ must separate $\sigma,\sigma'\in\partial\Omega$. Therefore, it is not a tangent to $\Omega$.

Neumann’s lemma is established as follows. Assume that $c(\Omega)=1$. From Lemmas 18 and 19 we see that two points $\zeta,\zeta'\in\partial\Omega$ exist for which the measures $\rho_\zeta\,ds$ and $\rho_{\zeta'}\,ds$ are mutually singular. From Lemma 20 we deduce that the support of $\rho_{\zeta'}\,ds$ is contained in the union of at most two line segments containing $\zeta$, and the complement of the support of $\rho_{\zeta'}\,ds$ is likewise contained in a union of at most two line segments. Thus $\partial\Omega$ is the union of at most four line segments.

5 Examples

In this section, we compute and estimate the configuration constants for some types of domains.

5.1 Configuration constant of an ellipse

For a,b>0, let

be the ellipse centred at the origin with semi-axes of lengths a and b, respectively. It is quite remarkable that the configuration constant can in this case be computed explicitly.

 

Proposition 21.
With the above notation, we have

In order to prove the proposition, our first step is to derive an expression for the density of the Neumann–Poincaré kernel of $\Omega_{a,b}$. The boundary $\partial\Omega_{a,b}$ is parametrized by

(37)

Here $[0,2\pi]$ can be replaced by any interval of length $2\pi$. Recalling formula (4) for $\mu_\zeta$ and setting $\zeta=\gamma(s)$, $\sigma=\gamma(t)$, we obtain

(38)

Using (37), this formula can be greatly simplified.

 

Lemma 22.
With the notation above, we have
(39)
where

The lemma is established by combining (37) and (38), and then using elementary trigonometric identities to simplify the resulting expression.

With this formula in hand, we now evaluate the configuration constant of the ellipse Ωa,b.

 

Proof of Proposition 21.
Using the formulas (8) and (39), we obtain
By the periodicity of cos, this last expression simplifies to
For the time being, let us assume that $b\le a$, so $B\ge 0$. Using the fact that (39) is the density of a probability measure for each $s\in[0,2\pi]$, we have
We readily verify that $\cos(t)\ge\cos(t+s)$ if and only if $t\in[-s/2,\pi-s/2]$ (modulo $2\pi$). Therefore
It is clear that this last integral is maximized over $s\in[0,2\pi]$ when $s=\pi$. Putting everything together, we deduce that, if $b\le a$, then
All that remains is to evaluate the integral. Making the substitution $x=\sin t$, and exploiting the fact that $A^2+B^2=1$, we have
This proves the result in the case when $b\le a$. The remaining case is obtained by exchanging the roles of $a$ and $b$.

5.2 Integral estimates

For a general domain, the exact value of $c(\Omega)$ is often inaccessible. In this section, we present a simple estimate, applicable to domains $\Omega$ whose boundary has a non-flat part, that yields an upper bound on $c(\Omega)$.

Assume that we find a Borel measure $\nu$ on $\partial\Omega$ such that

If so, then, for every $\phi\in C(\partial\Omega)$ with $\|\phi\|_{\partial\Omega}\le 1$, we have

which shows that the image of $K_\Omega\phi$ is contained in a disk of radius $k_\Omega^\nu$ centred at $\int_{\partial\Omega}\phi\,d\nu$. Thus,

One approach is to seek a positive measure $\nu$ on $\partial\Omega$ satisfying $\mu_\zeta\ge\nu$ for all $\zeta\in\partial\Omega$. Then

and so $k_\Omega^\nu=1-\nu(\partial\Omega)$.

We will construct the largest non-negative Borel measure $\nu$ on $\partial\Omega$ that satisfies $\mu_\zeta-\nu\ge 0$. The construction is based on the geometric interpretation of the density $\rho_\zeta(\sigma)$ in (5) and the quantity $R_\Omega$ appearing in (15). In order to avoid the need to establish Borel measurability of $R_\Omega$ defined as a supremum of an uncountable family as in (15), we define it in a slightly different but equivalent way. Namely, it is easy to see that, given $\sigma\in\partial\Omega$, if there exists a closed disk $\Delta$ such that $\Omega\subset\Delta$ and $\sigma\in\partial\Delta$, then there exists one of smallest radius. We denote this radius by $R_\Omega(\sigma)$. Note that if $\sigma$ is not a corner of $\partial\Omega$, then the corresponding disk must be tangent to $\partial\Omega$ at $\sigma$. If no disk passing through $\sigma$ exists that contains $\Omega$, then we set $R_\Omega(\sigma):=\infty$. This happens, for instance, if $\sigma$ is contained in the interior of a line segment in $\partial\Omega$. In particular, $R_\Omega(\sigma)=\infty$ for all but a finite number of points of any polygonal domain.

Fig. 6

A domain $\Omega$ with two circles corresponding to values $R_\Omega(\sigma)$ and $R_\Omega(\sigma')$

 

Lemma 23.

The function $R_\Omega:\partial\Omega\to(0,\infty]$ is lower semicontinuous. In particular, it is Borel measurable.

 

Proof.

Let $\sigma\in\partial\Omega$ and let $(\sigma_n)$ be a sequence in $\partial\Omega$ such that $\sigma_n\to\sigma$. We need to show that $\liminf_n R_\Omega(\sigma_n)\ge R_\Omega(\sigma)$. We can suppose that $L:=\liminf_n R_\Omega(\sigma_n)<\infty$, otherwise there is nothing to prove. Let $L'>L$. Then, replacing $(\sigma_n)$ by a subsequence, we can suppose that $R_\Omega(\sigma_n)<L'$ for all $n$. Thus, for each $n$, there exists a closed disk $\Delta_n$ of radius $L'$ such that $\Omega\subset\Delta_n$ and $\sigma_n\in\partial\Delta_n$. The sequence of centres $(c_n)$ of the disks $\Delta_n$ is bounded, so there exists a convergent subsequence $c_{n_j}\to c$. Let $\Delta$ be the closed disk with centre $c$ and radius $L'$. Then we have $\Omega\subset\Delta$ and $\sigma\in\partial\Delta$. It follows that $R_\Omega(\sigma)\le L'$. As this last inequality holds for all $L'>L$, we deduce that $R_\Omega(\sigma)\le L$. This completes the proof.

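We set

$$d\nu(\sigma):=\frac{ds(\sigma)}{2\pi R_\Omega(\sigma)},\qquad\sigma\in\partial\Omega, \tag{40}$$

with the convention $1/\infty=0$. By the above lemma, $\nu$ is a non-negative Borel measure on $\partial\Omega$. For any $\zeta\in\partial\Omega$, we have

$$\mu_\zeta\ge\nu. \tag{41}$$

To see this, note that if $\sigma$ is not a corner and $R_{\zeta,\sigma}$ is the radius of the unique circle tangent to $\partial\Omega$ at $\sigma$ and passing through $\zeta$, then $R_\Omega(\sigma)\ge R_{\zeta,\sigma}$. Therefore, according to (5),

$$\rho_\zeta(\sigma)=\frac{1}{2\pi R_{\zeta,\sigma}}\ge\frac{1}{2\pi R_\Omega(\sigma)}$$

for almost every $\sigma$ with respect to arclength measure on $\partial\Omega$. Inequality (41) follows. Although we shall skip a formal proof, we mention also that $\nu$ is in fact the largest measure satisfying $\mu_\zeta\ge\nu$ for all $\zeta\in\partial\Omega$. This maximality property of $\nu$ is to be interpreted in the following sense: if $\nu'$ is any measure satisfying $\mu_\zeta\ge\nu'$ for all $\zeta$, then $\nu'\le\nu$.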

Fig. 7

A quadrilateral domain $\Omega_\epsilon$ with $a(\Omega_\epsilon)>1-(4/\pi)\epsilon$.

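By our earlier discussion, we obtain the following upper estimate for the configuration constant:

$$c(\Omega)\le 1-\nu(\partial\Omega)=1-\frac{1}{2\pi}\int_{\partial\Omega}\frac{ds(\sigma)}{R_\Omega(\sigma)}.$$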

Note that this is precisely the assertion of Theorem 6 stated in Section 1.

We will now mention some consequences. Recall that if γ is a plane curve of class C2, then the radius of curvature of γ is the reciprocal of its curvature.

 

Corollary 24.
If $\Omega$ has a $C^2$-boundary of length $L$, whose radius of curvature is everywhere at most $\rho$, then
$$c(\Omega)\le 1-\frac{L}{2\pi\rho}.$$

 

Proof.

In this case, one sees from (15) and (16) that $R_\Omega(\sigma)\le\rho$ for all $\sigma\in\partial\Omega$, from which the result follows.

This last result was already known. See for example [7, pp. 45–46] and [12, pp. 128–129]. However the proofs in these references are quite different from the one above.

 

Corollary 25.
Consider a convex circular sector
where $r>0$ and $0<\theta\le\pi$. Then

 

Proof.
It is obvious that $R_\Omega(\sigma)=r$ for $\sigma$ in the curved part of $\partial\Omega$, and that $R_\Omega(\sigma)=\infty$ elsewhere. Hence
The result now follows from Theorem 6.
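The two corollaries can be sanity-checked with a few lines of arithmetic. The sketch below is an illustration under stated assumptions rather than a formula quoted from the paper: it assumes that the estimate of Theorem 6 takes the form $c(\Omega)\le 1-\frac{1}{2\pi}\int_{\partial\Omega}ds/R_\Omega$, which is what the inequality $\rho_\zeta\ge(2\pi R_\Omega)^{-1}$ suggests, and it assumes that $R_\Omega$ is constant on finitely many boundary pieces. The function names are hypothetical.

```python
import math

def configuration_upper_bound(pieces):
    """Evaluate 1 - (1/(2*pi)) * sum(length / R) over boundary pieces.

    `pieces` is a list of (arc_length, R) pairs on which R_Omega is assumed
    constant. R may be math.inf on flat pieces, which then contribute 0.
    """
    total = sum(length / R for length, R in pieces)
    return 1.0 - total / (2.0 * math.pi)

def disk_pieces(r):
    """A disk of radius r: one circular piece with R_Omega = r."""
    return [(2.0 * math.pi * r, r)]

def sector_pieces(r, theta):
    """A convex circular sector: an arc of length r*theta and two flat radii."""
    return [(r * theta, r), (2.0 * r, math.inf)]

print(configuration_upper_bound(disk_pieces(3.0)))
print(configuration_upper_bound(sector_pieces(2.0, math.pi / 2)))
```

Under these assumptions the bound vanishes for a disk, in agreement with the fact that $c(\Omega)=0$ when $\Omega$ is a disk, and for a sector of aperture $\theta$ it equals $1-\theta/(2\pi)$ independently of $r$.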

5.3 Analytic configuration constants of quadrilaterals

Theorem 2 shows that a(Ω)<1 for every Ω. Here we show by example that a(Ω) may be arbitrarily close to 1. Figure 7 shows a narrow quadrilateral domain for which this phenomenon occurs.

 

Proposition 26.
For $\epsilon>0$, let $\Omega_\epsilon$ be the convex hull of $\{\pm 1,\pm\epsilon i\}$. Then
$$a(\Omega_\epsilon)>1-\frac{4}{\pi}\epsilon.$$

 

Proof.

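Let $f$ be a conformal mapping of the interior of $\Omega_\epsilon$ onto the unit disk $\mathbb{D}$. By Carathéodory’s theorem, $f$ extends to a homeomorphism of $\Omega_\epsilon$ onto $\overline{\mathbb{D}}$, and so clearly $f\in A(\Omega_\epsilon)$. Post-composing with a suitable automorphism of $\mathbb{D}$, we may further suppose that $f(1)=1$ and $f(-1)=-1$.

Consider $\zeta=1$. Recalling (3), we have $\mu_1=(1-\theta_1/\pi)\delta_1+(\theta_1/\pi)\nu$, where $\theta_1$ is the angle of the aperture of $\Omega_\epsilon$ at 1, and $\nu$ is a probability measure on $\partial\Omega_\epsilon\setminus\{1\}$. It follows that
$$\bigl|(K_{\Omega_\epsilon}f)(1)-1\bigr|=\frac{\theta_1}{\pi}\Bigl|\int f\,d\nu-1\Bigr|\le\frac{2\theta_1}{\pi}.$$
Likewise
$$\bigl|(K_{\Omega_\epsilon}f)(-1)+1\bigr|\le\frac{2\theta_1}{\pi}.$$
It follows that the diameter of $(K_{\Omega_\epsilon}f)(\partial\Omega_\epsilon)$ is at least $2(1-2\theta_1/\pi)$, whence $a(\Omega_\epsilon)\ge 1-2\theta_1/\pi$.

Finally, by trigonometry, $\theta_1$ is related to $\epsilon$ by $\tan(\theta_1/2)=\epsilon$, whence $\theta_1=2\arctan\epsilon<2\epsilon$. The result follows.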
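The last step of the proof can be checked numerically. The following small sketch (function names are hypothetical) computes the aperture $\theta_1=2\arctan\epsilon$ and compares the lower bound $1-2\theta_1/\pi$ obtained in the proof with the simpler quantity $1-(4/\pi)\epsilon$ appearing in the caption of Figure 7.

```python
import math

def aperture_angle(eps):
    """Interior angle theta_1 of conv{+-1, +-eps*i} at the vertex 1."""
    return 2.0 * math.atan(eps)

def analytic_lower_bound(eps):
    """The lower bound 1 - 2*theta_1/pi for a(Omega_eps) from the proof."""
    return 1.0 - 2.0 * aperture_angle(eps) / math.pi

for eps in (0.5, 0.1, 0.01, 0.001):
    sharper = analytic_lower_bound(eps)
    simpler = 1.0 - 4.0 * eps / math.pi
    print(f"eps={eps:<6} 1-2*theta_1/pi={sharper:.6f}  1-4*eps/pi={simpler:.6f}")
```

Both quantities tend to 1 as $\epsilon\to 0$, which is the phenomenon the proposition records; for $\epsilon=1$ the domain is a square, the aperture is $\pi/2$, and the lower bound degenerates to 0.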

5.4 Configuration constants equal to zero

Recall the estimate (13) from Section 1, which will be proved in the next section. The estimate is strongest if a(W)=0, and in this case we reach the conjectured bound K=2. Unfortunately, the only domain W for which we have a(W)=0 is a disk, and in this case (13) reduces to the well-known Okubo–Ando bound from [15]. For completeness we give a proof of the statement that a(Ω)=0 if and only if Ω is a disk. More precisely, we have the following.

 

Proposition 27.

Let Ω be a compact convex domain with non-empty interior. The following are equivalent:

  • (i)

    Ω is a disk,

  • (ii)

    c(Ω)=0,

  • (iii)

    a(Ω)=0.

 

Proof.

In the case that $\Omega$ is a disk, (5) readily implies that $\rho_\zeta(\sigma)$ is a constant independent of $\zeta$, and so for every $\zeta\in\partial\Omega$, the measure $\mu_\zeta$ is the normalized arclength measure on the circular boundary $\partial\Omega$. Then it follows from the definition that $K_\Omega f$ is a constant function, and consequently $\|K_\Omega f+\mathbb{C}1\|_{\partial\Omega}=0$, so $c(\Omega)=a(\Omega)=0$. This shows that the implications (i)$\Rightarrow$(ii) and (ii)$\Rightarrow$(iii) hold.

It remains to prove (iii)$\Rightarrow$(i). Fix a conformal mapping $\phi:\mathbb{D}\to\Omega^o$, where $\mathbb{D}$ is the open unit disk and $\Omega^o$ is the interior of $\Omega$. The mapping $\phi$ extends to a homeomorphism of $\overline{\mathbb{D}}$ onto $\Omega$, and so it makes sense to define the probability measures $\mu_\zeta^\phi$ on $\partial\mathbb{D}$ by the equation
where $E$ is a Borel subset of $\partial\mathbb{D}$, and $\{\mu_\zeta\}_{\zeta\in\partial\Omega}$ is the double-layer potential of $\Omega$. Since $a(\Omega)=0$, it follows that for every $f\in A(\Omega)$ and every pair of points $\zeta,\zeta'\in\partial\Omega$ we have, by the change of variables formula, that
As $f$ varies over $A(\Omega)$, $f\circ\phi$ varies over $A(\mathbb{D}):=A(\overline{\mathbb{D}})$, and it follows that $\mu_\zeta^\phi-\mu_{\zeta'}^\phi$ annihilates $A(\mathbb{D})$. Then the theorem of the brothers Riesz (see, for instance, [8, Exercise 1, Chapter III]) implies that
where $h$ is a function with vanishing non-positive Fourier coefficients. Note that $h$ is real-valued, so the positive Fourier coefficients also vanish, and consequently $h\equiv 0$. Since $\zeta,\zeta'$ were arbitrary, we conclude that the hypothesis $a(\Omega)=0$ implies that all the measures $\mu_\zeta$ are equal.

The conclusion that $\Omega$ is a disk is now a consequence of the geometric formula for $\rho_\zeta(\sigma)$ in (5). Fix any $\sigma\in\partial\Omega$ that is not a corner. Since the measures $\mu_\zeta$ are all equal, so are their densities $\rho_\zeta(\sigma)$. Then the circles passing through $\zeta\in\partial\Omega$ and tangential to $\partial\Omega$ at $\sigma$ all have the same radius, and so they all coincide with each other. Thus one circle passes through all points $\zeta\in\partial\Omega$. Consequently $\partial\Omega$ is a circle, and so $\Omega$ is a disk.

6 Application to Numerical Ranges

6.1 Spectral constant estimate

Our principal motivation for the introduction of the analytic configuration constant is the following result, which was mentioned in Section 1 and which we will now prove.

 

Theorem 28.
Let $T$ be a bounded linear operator on a Hilbert space $\mathcal{H}$, and $W=\overline{W(T)}$ the closure of the numerical range of $T$. If $W$ has non-empty interior, then for every $f\in A(W)$ we have
$$\|f(T)\|\le\bigl(1+\sqrt{1+a(W)}\bigr)\|f\|_W,$$
where $a(W)$ is the analytic configuration constant in (11), and $A(W)$ is the space of continuous functions on $W$ that are analytic in the interior of $W$.

Of course, if $W$ has no interior, then its convexity forces it to be a line segment. In that case $T$ is a normal operator, and the spectral theorem gives us the better estimate $\|f(T)\|\le\|f\|_{\sigma(T)}$, where $f$ may be any Borel measurable function on the spectrum $\sigma(T)$. Thus Theorem 28 implies Theorem 3. In what follows, we will assume that $W$ has non-empty interior.

Let us make some initial remarks before going into the proof of Theorem 28. In the case that $\sigma(T)$ is contained in the interior of $W$, $f(T)$ is defined, as usual, through the Dunford–Riesz holomorphic functional calculus. If $\partial W\cap\sigma(T)\neq\varnothing$, then this definition does not work. Nevertheless, if $f\in A(W)$, then it is a standard result of approximation theory that a sequence of analytic polynomials $(p_n)$ exists that converges to $f$ uniformly on $W$. In the presence of any uniform bound of the form $\|p(T)\|\le K\|p\|_W$ for polynomials $p$, we may then define $f(T)$ as the limit of the sequence $(p_n(T))$ in the operator norm. Such bounds are known to exist, the strongest known bound $K\le 1+\sqrt{2}$ being due to Crouzeix and Palencia. Theorem 28 improves this estimate given information about the numerical range of $T$.
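To see how the constant interpolates between the two known bounds, one can tabulate it. The sketch below assumes the form $K\le 1+\sqrt{1+a(W)}$ stated in the abstract, with $0\le a(W)\le 1$; the function name is hypothetical.

```python
import math

def spectral_constant_bound(a):
    """Return 1 + sqrt(1 + a) for an analytic configuration constant a in [0, 1]."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("the analytic configuration constant lies in [0, 1]")
    return 1.0 + math.sqrt(1.0 + a)

for a in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"a = {a:.2f}  ->  K <= {spectral_constant_bound(a):.6f}")
```

At $a=0$ (a disk) the value is the conjectured bound 2, and at the excluded endpoint $a=1$ it coincides with the Crouzeix–Palencia constant $1+\sqrt{2}$, so any domain with $a(W)<1$ gives a strict improvement.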

Our proof of Theorem 28 combines the argument of Crouzeix and Palencia from [5] with ideas of Schwenninger and de Vries from [18], where bounds for various functional calculi are derived as a consequence of the existence of extremal functions and extremal vectors. Let $U$ be an open set in the plane, and $H^\infty(U)$ be the algebra of bounded holomorphic functions on $U$. Given an operator $T:\mathcal{H}\to\mathcal{H}$ with $\sigma(T)$ contained in $U$, it is elementary that the quantity

is finite. A normal-families argument shows that an $f\in H^\infty(U)$ exists with $\|f\|_U=1$ for which the supremum above is attained. Any such $f$ will be called an extremal function. If, moreover, a vector $x\in\mathcal{H}$ with $\|x\|_{\mathcal{H}}=1$ exists for which

then we will say that $x$ is an extremal vector, and $(f,x)$ is an extremal pair. Unless $\dim\mathcal{H}<\infty$, an extremal vector might not exist, but we will be able to reduce the proof to the finite-dimensional case. The importance of the concept of extremal pairs $(f,x)$ stems from the following result. We refer the reader to [1, Theorem 4.5] for a proof (see also [18, Proposition 3]).

 

Lemma 29.
Let $T:\mathcal{H}\to\mathcal{H}$ be a bounded linear operator, and $U$ be an open neighbourhood of $\sigma(T)$. Let $(f,x)$ be a corresponding extremal pair. If $\|f(T)\|>1$, then $f(T)x$ is orthogonal to $x$ in $\mathcal{H}$:
$$\langle f(T)x,x\rangle_{\mathcal{H}}=0.$$

The next two lemmas will reduce our task to consideration of finite-dimensional Hilbert spaces, in which extremal vectors exist, and will dispose of the problematic set $\sigma(T)\cap\partial W$. The first observation is essentially contained in [18, Proposition 9].

 

Lemma 30.
Let $\Omega$ be a compact convex domain with non-empty interior. If for some $K>0$ the estimate
$$\|p(T)\|\le K\|p\|_\Omega$$
holds for every polynomial $p$ and every operator $T$ on a finite-dimensional Hilbert space with $W(T)$ contained in the interior of $\Omega$, then the same estimate with the same constant $K$ holds also for operators $T$ on infinite-dimensional Hilbert spaces with $W(T)$ contained in the interior of $\Omega$.

 

Proof.
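Let $T:\mathcal{H}\to\mathcal{H}$ be as above, with $\dim\mathcal{H}=\infty$. It suffices to show that
$$\|p(T)x\|_{\mathcal{H}}\le K\|p\|_\Omega\|x\|_{\mathcal{H}}$$
holds for every analytic polynomial $p$ and every $x\in\mathcal{H}$. Note that $p(T)x$ is contained in the finite-dimensional subspace $\mathcal{K}$ spanned by $\{x,Tx,\ldots,T^dx\}$, where $d$ is the degree of the polynomial $p$. If $\Pi:\mathcal{H}\to\mathcal{K}$ is the orthogonal projection, then $p(T)x=\Pi p(T)x=p(\Pi T)x$, where $\Pi T:\mathcal{K}\to\mathcal{K}$ is an operator on a finite-dimensional Hilbert space. Since $W(\Pi T)\subset W(T)$, our hypothesis implies
$$\|p(T)x\|_{\mathcal{H}}=\|p(\Pi T)x\|_{\mathcal{K}}\le K\|p\|_\Omega\|x\|_{\mathcal{H}}.$$
The lemma follows.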
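The compression identity $p(T)x=p(\Pi T)x$ used in the proof can be verified on a small example. The following is a minimal pure-Python sketch under illustrative assumptions: the matrix, the vector, the polynomial and all function names are hypothetical choices, and a matrix on $\mathbb{C}^4$ stands in for the operator $T$. It builds an orthonormal basis of $\mathcal{K}=\operatorname{span}\{x,Tx,\ldots,T^dx\}$ by Gram–Schmidt and compares $p(T)x$ with $p(\Pi T)x$.

```python
def matvec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def inner(u, v):
    """Inner product <u, v>, conjugate-linear in the second argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def krylov_basis(T, x, d, tol=1e-12):
    """Gram-Schmidt orthonormal basis of span{x, Tx, ..., T^d x}."""
    basis, v = [], list(x)
    for _ in range(d + 1):
        w = list(v)
        for q in basis:
            c = inner(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = abs(inner(w, w)) ** 0.5
        if norm > tol:
            basis.append([wi / norm for wi in w])
        v = matvec(T, v)
    return basis

def project(basis, v):
    """Orthogonal projection of v onto the span of an orthonormal basis."""
    out = [0j] * len(v)
    for q in basis:
        c = inner(v, q)
        out = [o + c * qi for o, qi in zip(out, q)]
    return out

def apply_poly(coeffs, op, x):
    """Compute p(op) x by Horner's rule, where p(z) = sum coeffs[k] * z**k."""
    result = [0j] * len(x)
    for c in reversed(coeffs):
        result = [r + c * xi for r, xi in zip(op(result), x)]
    return result

# Hypothetical data: a 4x4 matrix T, a vector x, and p(z) = 2 - z + 0.5 z^2.
T = [[1, 2j, 0, 1], [0, 1, 1, 0], [3, 0, -1j, 2], [1, 1, 0, 0.5]]
x = [1, 0, 1j, 0]
coeffs = [2, -1, 0.5]

basis = krylov_basis(T, x, d=len(coeffs) - 1)
full = apply_poly(coeffs, lambda v: matvec(T, v), x)
compressed = apply_poly(coeffs, lambda v: project(basis, matvec(T, v)), x)
error = max(abs(a - b) for a, b in zip(full, compressed))
print(f"dim K = {len(basis)}, max |p(T)x - p(Pi T)x| = {error:.2e}")
```

The agreement relies on the fact that all intermediate Horner iterates lie in $\mathcal{K}$, so applying $\Pi$ after each multiplication by $T$ changes nothing as long as the degree of $p$ does not exceed $d$.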

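The proof of the next lemma will use affine invariance of the configuration constants. Let us fix $\alpha,\beta\in\mathbb{C}$, $\alpha\neq 0$, and an affine mapping $A(z):=\alpha z+\beta$. Then $A$ is a conformal transformation of $\mathbb{C}$ with the additional property of taking a line segment of length $L$ to a line segment of length $|\alpha|L$, and a circle of radius $R$ to a circle of radius $|\alpha|R$. Let $\tilde{\Omega}=A(\Omega)$ be the image of $\Omega$ under $A$, and recall the formula for the Neumann–Poincaré kernel in (3) and its geometric interpretation. If $\zeta,\sigma\in\partial\Omega$, $E$ is a Borel subset of $\partial\Omega$, and $s,\tilde{s}$ are the arclength measures on $\partial\Omega$ and $\partial\tilde{\Omega}$, respectively, then it follows from the properties of $A$ listed above that

  • (i)

    $\theta_\zeta=\theta_{A(\zeta)}$,

  • (ii)

    $|\alpha|\,s(E)=\tilde{s}(A(E))$,

  • (iii)

    $|\alpha|\,R_{\zeta,\sigma}=R_{A(\zeta),A(\sigma)}$.

A consequence is that the Neumann–Poincaré kernels $\{\mu_\zeta\}_{\zeta\in\partial\Omega}$ and $\{\tilde{\mu}_{A(\zeta)}\}_{A(\zeta)\in\partial\tilde{\Omega}}$ of the respective domains satisfy
$$\mu_\zeta(E)=\tilde{\mu}_{A(\zeta)}(A(E)).$$
Then a change of variables shows that $K_\Omega(\tilde{f}\circ A)=(K_{\tilde{\Omega}}\tilde{f})\circ A$ for any $\tilde{f}\in C(\partial\tilde{\Omega})$, and it follows that
$$c(\Omega)=c(\tilde{\Omega}),\qquad a(\Omega)=a(\tilde{\Omega}).$$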

Armed with these equalities, we make our second observation.

 

Lemma 31.
Assume that the estimate
$$\|p(T)\|\le\bigl(1+\sqrt{1+a(\Omega)}\bigr)\|p\|_\Omega \tag{42}$$
holds for every polynomial $p$, every compact convex domain $\Omega$, and every operator $T$ for which $W(T)$ is contained in the interior of $\Omega$. Then Theorem 28 holds.

 

Proof.
Replacing $T$ by an operator $T+\beta I$ for some $\beta\in\mathbb{C}$, we may assume that 0 lies in the interior of $W(T)$. Let $W=\overline{W(T)}$, and
$$W_r:=rW=\{rz:z\in W\},\qquad r>1.$$
Then $W_r$ is a convex domain that contains $W$ in its interior. By our assumption, for any analytic polynomial $p$ we have
$$\|p(T)\|\le\bigl(1+\sqrt{1+a(W_r)}\bigr)\|p\|_{W_r}.$$
Since $W_r$ is an affine image of $W$, we have $a(W_r)=a(W)$. Since this holds for all $r>1$, and since $\lim_{r\to 1}\|p\|_{W_r}=\|p\|_W$, we may let $r\to 1$ to obtain the desired estimate whenever $p$ is an analytic polynomial. The estimate for $f\in A(W)$ follows by density of polynomials in $A(W)$.

 

Proof of Theorem 28.

By Lemma 31, it will be sufficient to establish the estimate (42) whenever $\Omega$ contains $W(T)$ in its interior $\Omega^o$. Moreover, by Lemma 30, we may assume that $T$ is an operator on a finite-dimensional Hilbert space $H$. Let $U = \Omega^o$ and let $(f, x)$ be an extremal pair corresponding to $U$ and the operator $T$. If $\|f(T)\| \le 1$, then (42) certainly holds, so we may assume that $\|f(T)\| > 1$.

Let $(f_n)$ be a sequence in $A(\Omega)$ such that $\|f_n\|_\Omega \le 1$ and $f_n \to f$ locally uniformly in $\Omega^o$. Then $f_n(T) \to f(T)$ in operator norm. Set $g_n := K_\Omega \overline{f_n}$. It is shown in [5, Lemmas 2.1 and 2.3] that $g_n \in A(\Omega)$ and
(43)
For each $n$, we may choose $\lambda_n \in \mathbb{C}$ such that
We now have the following identity:
(44)
Let us consider each of the terms in this identity. By the choice of x, we have
Also, from (43) and the Cauchy–Schwarz inequality,
By Lemma 29, we have
By Lemma 29 again, $\langle f(T)x, x \rangle_H = 0$. Since the sequence $(\lambda_n)$ is certainly bounded (indeed $|\lambda_n| \le 2$), we deduce that
Thus, letting $n \to \infty$ in (44), we deduce that
Hence
In particular, for every polynomial $p$ with $\|p\|_\Omega = 1$ we have
since f is extremal. This is equivalent to (42), and so the proof is complete.
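The constant $1 + \sqrt{1 + a(\Omega)}$ stated in the abstract is the larger root of a quadratic. The sketch below assumes, as a guess at the shape of the limiting inequality and not as a statement taken from the text, that the argument ends with $X^2 \le 2X + a(\Omega)$ for $X = \|f(T)\|$. Under that assumption it checks that the bound solves the quadratic and stays below the Crouzeix–Palencia constant $1 + \sqrt{2}$ whenever $a(\Omega) < 1$. The function name and the sample values of $a$ are illustrative.

```python
import math

def analytic_bound(a):
    """Largest x with x**2 <= 2*x + a, namely x = 1 + sqrt(1 + a)."""
    return 1 + math.sqrt(1 + a)

crouzeix_palencia = 1 + math.sqrt(2)

# Hypothetical values of a(Omega) in [0, 1).
samples = [0.0, 0.25, 0.5, 0.9]
bounds = [analytic_bound(a) for a in samples]

# Residual of the quadratic x^2 - 2x - a at each computed bound.
residuals = [x * x - 2 * x - a for x, a in zip(bounds, samples)]
```

At $a = 1$ the formula returns exactly $1 + \sqrt{2}$, so any strict inequality $a(\Omega) < 1$ gives a strict improvement.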

Funding

This work has been done during Malman’s visit at Département de mathématiques et de statistique, Université Laval, supported by Simons-CRM Scholar-in-Residence program. Mashreghi’s research was supported by an NSERC Discovery Grant and the Canada Research Chairs program. O’Loughlin was supported by a CRM-Laval Postdoctoral Fellowship. Ransford’s research was supported by an NSERC Discovery Grant.

Acknowledgments

We are grateful to the referee for the careful reading of the paper, and for the suggestions that we used to improve the manuscript.

A Double-Layer Potential on a General Convex Domain

A.1 Convex domains

Let $\Omega$ be a compact convex domain in the plane $\mathbb{C}$ with non-empty interior $\Omega^o$. We will be making no assumptions regarding smoothness of the boundary $\partial\Omega$. However, convexity itself implies that $\partial\Omega$ is a rectifiable simple closed curve with some additional properties.

The orientation of $\partial\Omega$ is to be counter-clockwise (i.e., positive), and we use $\sigma' \downarrow \sigma$ and $\sigma' \uparrow \sigma$ to denote, respectively, the counter-clockwise and clockwise one-sided convergence of $\sigma'$ to $\sigma$ within $\partial\Omega$. As a consequence of convexity of $\Omega$, the one-sided tangent angles exist at every point $\sigma \in \partial\Omega$, are locally given by

and satisfy

Strict inequality may occur at most at a countable subset of $\partial\Omega$. If it occurs at $\sigma$, then we say that $\partial\Omega$ has a corner at $\sigma$. At any point that is not a corner, the tangent angle

is well-defined, and so is the tangent $T(\sigma) := e^{i\alpha(\sigma)}$ itself. If $t \mapsto \gamma(t)$ is any (positively-oriented) parametrization of $\partial\Omega$, and we set $\alpha(\sigma) = \alpha_+(\sigma)$ at the corners, then the locally defined function $\alpha(\gamma(t))$ is increasing in $t$, and consequently the tangent $T$ is continuous at every point that is not a corner of $\partial\Omega$. At a corner, the discontinuity of $T$ amounts to a jump of the argument of $T$. We denote by $N(\sigma) := -iT(\sigma)$ the outward-pointing normal at $\sigma \in \partial\Omega$.

A.2 Double-layer potential

Let $\Omega^o$ denote the interior of $\Omega$. To each $z \in \Omega^o$ we associate the measure $\mu_z$ on $\partial\Omega$, which for any arc $J \subset \partial\Omega$ satisfies

(A.1)

Here $\arg(\sigma - z)$ is any locally defined continuous determination of the argument function on $\partial\Omega$. Non-negativity of $\mu_z$ follows from convexity of $\Omega$ and our choice of positive orientation of $\partial\Omega$. With respect to this orientation, every arc $J = (a, b) \subset \partial\Omega$ has a start-point $a$ and an end-point $b$, and it is easy to see that

In particular, $\mu_z(\partial\Omega) = 2$.
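The total mass can be verified directly on a polygon. The sketch below assumes the normalization consistent with the total mass $2$ stated above, namely that an arc receives the increase of $\arg(\sigma - z)$ along it divided by $\pi$. The pentagon, the interior point and the function names are illustrative.

```python
import cmath
import math

def interior_mass(z, a, b):
    """Mass of the boundary arc from a to b seen from the interior point z:
    the angle subtended at z, divided by pi."""
    return cmath.phase((b - z) / (a - z)) / math.pi

def edge_masses(vertices, z):
    """Masses of the edges of a convex polygon listed counter-clockwise."""
    n = len(vertices)
    return [interior_mass(z, vertices[k], vertices[(k + 1) % n]) for k in range(n)]

# Hypothetical domain: regular pentagon inscribed in the unit circle.
pentagon = [cmath.exp(2j * math.pi * k / 5) for k in range(5)]

masses = edge_masses(pentagon, 0.3 + 0.2j)  # 0.3 + 0.2j lies inside the pentagon
total = sum(masses)
```

Every entry of `masses` is positive, reflecting convexity and the positive orientation, and the entries add up to $2$ because the argument of $\sigma - z$ increases by $2\pi$ over one loop.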

The measure $\mu_z$ is absolutely continuous with respect to arclength $s$ on $\partial\Omega$. Indeed, if $\sigma_0 \in \partial\Omega$, $J_n = (a_n, b_n)$ is a sequence of arcs of $\partial\Omega$ that are shrinking to $\sigma_0$, and $|J_n|$ are the corresponding arclengths, then

Above, we use an appropriate locally defined holomorphic branch of the logarithm. As $n \to \infty$, the first factor inside the brackets satisfies

while the second factor stays bounded as a consequence of the inequality $|b_n - a_n| \le |J_n|$. Thus

and from elementary measure theory we obtain that $\mu_z$ is absolutely continuous with respect to $s$. If moreover $\sigma_0$ is not a corner, then it can be shown that

and so in addition to boundedness we even have the convergence

Thus the Radon–Nikodym derivative satisfies

(A.2)

at every $\sigma \in \partial\Omega$ that is not a corner.
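The convergence of the difference quotients $\mu_z(J_n)/|J_n|$ can be seen numerically on the unit circle. The sketch below assumes that the density takes the form $\frac{1}{\pi} \operatorname{Im}\bigl(T(\sigma)/(\sigma - z)\bigr)$, which is what differentiating $\arg(\sigma - z)$ along the boundary gives; this form is an assumption about the elided display (A.2), and the names and sample values are illustrative.

```python
import cmath
import math

def arc_mass(z, t0, t1):
    """Mass of the unit-circle arc from angle t0 to angle t1 seen from z."""
    a, b = cmath.exp(1j * t0), cmath.exp(1j * t1)
    return cmath.phase((b - z) / (a - z)) / math.pi

def density(z, t):
    """Assumed density with respect to arclength at sigma = exp(i t)."""
    sigma = cmath.exp(1j * t)
    tangent = 1j * sigma  # unit tangent of the positively oriented circle
    return (tangent / (sigma - z)).imag / math.pi

z, t = 0.4 - 0.2j, 1.2  # hypothetical interior point and boundary angle

# Arcs of length h centred at angle t, shrinking to the boundary point.
quotients = [arc_mass(z, t - h / 2, t + h / 2) / h for h in (1e-1, 1e-2, 1e-3)]
errors = [abs(q - density(z, t)) for q in quotients]
```

The entries of `errors` shrink roughly quadratically in the arc length, as expected for a centred difference quotient at a smooth boundary point.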

A.3 Boundary kernel

The Neumann–Poincaré kernel is the boundary version of the family of measures $\{\mu_z\}_{z \in \Omega^o}$ introduced above. To each point $\zeta \in \partial\Omega$ we associate the Borel probability measure on $\partial\Omega$ defined by (A.1) for arcs $J \subset \partial\Omega$ not containing the point $\zeta$. Because $\zeta \in \partial\Omega$, this definition implies that $\mu_\zeta(\partial\Omega \setminus \{\zeta\}) = \theta_\zeta/\pi$, where $\theta_\zeta = \pi - \alpha_+(\zeta) + \alpha_-(\zeta)$ can be interpreted as the angle of the aperture at $\zeta$. Indeed, $\theta_\zeta$ is equal to the increase in the argument of $\sigma - \zeta$ as we traverse one loop around $\partial\Omega$ starting and ending at the point $\zeta$, and since $\mu_\zeta$ is a probability measure, we must have

$$\mu_\zeta(\{\zeta\}) = 1 - \frac{\theta_\zeta}{\pi}.$$

With the exception of this possible point mass, $\mu_\zeta$ is otherwise absolutely continuous with respect to arclength. The corresponding Radon–Nikodym derivative is given by

(A.3)

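The formula (A.3) is established analogously to (A.2). All in all, the measure $\mu_\zeta$ can be decomposed as

$$\mu_\zeta = \Bigl(1 - \frac{\theta_\zeta}{\pi}\Bigr)\,\delta_\zeta + \rho_\zeta\, ds,$$

where $\delta_\zeta$ is a unit mass at $\zeta \in \partial\Omega$, $\theta_\zeta$ is the angle of the aperture at $\zeta$ (with the convention that $\theta_\zeta = \pi$ if $\zeta$ is not a corner), and where the density $\rho_\zeta$ is given by (A.3).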
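The point mass at a corner can be computed explicitly for a polygon. The sketch below uses a hypothetical unit square and illustrative helper names. It sums the increase of $\arg(\sigma - \zeta)$ over the edges that do not contain $\zeta$, which gives $\theta_\zeta/\pi$, and recovers the point mass as the remainder of the total mass $1$.

```python
import cmath
import math

def on_segment(p, a, b, tol=1e-12):
    """True if the point p lies on the closed segment from a to b."""
    w = (b - a).conjugate() * (p - a)
    return abs(w.imag) < tol and -tol <= w.real <= abs(b - a) ** 2 + tol

def continuous_mass(vertices, zeta):
    """Mass of the boundary minus {zeta} for a convex polygon listed
    counter-clockwise, i.e. the aperture angle at zeta divided by pi."""
    total, n = 0.0, len(vertices)
    for k in range(n):
        a, b = vertices[k], vertices[(k + 1) % n]
        if on_segment(zeta, a, b):
            continue  # arg(sigma - zeta) is constant along an edge through zeta
        total += cmath.phase((b - zeta) / (a - zeta))
    return total / math.pi

square = [0j, 1 + 0j, 1 + 1j, 1j]  # hypothetical domain: the unit square

corner_mass = 1 - continuous_mass(square, 0j)        # zeta at a right-angled corner
smooth_mass = 1 - continuous_mass(square, 0.5 + 0j)  # zeta in the middle of an edge
```

At the right-angled corner the aperture is $\pi/2$, so half of the mass sits at the point itself, while at a point in the middle of an edge the aperture is $\pi$ and there is no point mass.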

A.4 Weak-star convergence

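We establish now that

$$\mu_z \to \delta_\zeta + \mu_\zeta \qquad \text{as } z \to \zeta,\ z \in \Omega^o,$$

in the sense of the weak-star topology on measures. Note that if $B = B(\zeta, \delta)$ is a ball of radius $\delta > 0$ centred at $\zeta \in \partial\Omega$, then expressions (A.2) and (A.3) for the densities of $\mu_z$ and $\mu_\zeta$ show that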

(A.4)

for every $f \in C(\partial\Omega)$. In particular, choosing $f = 1$, we obtain

Since

we see that, given $\epsilon > 0$, for all sufficiently small $\delta > 0$ we will have

Returning to general $f \in C(\partial\Omega)$, we have

On the right-hand side, the first term tends to zero as $z \to \zeta$, the second can be made arbitrarily small by continuity of $f$, the crude estimate $\mu_z(B) \le 2$ and choice of sufficiently small $\delta$, the third is dominated in modulus by $\|f\|_{\partial\Omega}\,\epsilon$ for $z$ sufficiently close to $\zeta$, and the fourth is dominated by $\|f\|_{\partial\Omega}\,\mu_\zeta(B \setminus \{\zeta\})$, which also can be made arbitrarily small by choice of sufficiently small $\delta$. The desired weak-star convergence follows.
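The convergence can be illustrated on the unit disk, where there are no corners and the boundary kernel is normalized arclength for every $\zeta$. The sketch below assumes that the interior density has the form $\frac{1}{\pi} \operatorname{Im}\bigl(T(\sigma)/(\sigma - z)\bigr)$ and that the weak-star limit is the unit mass at $\zeta$ plus the boundary kernel, in line with the total masses $2$ and $1$. The test function, the point $\zeta$ and the helper names are illustrative.

```python
import cmath
import math

def interior_integral(f, z, n=4000):
    """Trapezoid approximation of the integral of f against the interior
    measure at z on the unit circle (assumed density, see the text above)."""
    total = 0.0
    for k in range(n):
        sigma = cmath.exp(2j * math.pi * k / n)
        density = (1j * sigma / (sigma - z)).imag / math.pi
        total += f(sigma) * density
    return total * (2 * math.pi / n)

def boundary_integral(f, n=4000):
    """Integral of f against the boundary kernel of the unit disk,
    which is normalized arclength for every boundary point."""
    return sum(f(cmath.exp(2j * math.pi * k / n)) for k in range(n)) / n

def f(s):
    """Hypothetical continuous test function on the unit circle."""
    return s.real ** 2 + s.imag

zeta = cmath.exp(1j)
limit = f(zeta) + boundary_integral(f)

# Approach zeta radially from inside the disk.
gaps = [abs(interior_integral(f, r * zeta) - limit) for r in (0.5, 0.9, 0.99)]
```

The entries of `gaps` decrease towards zero as the interior point approaches $\zeta$, matching the extra unit mass that concentrates at $\zeta$ in the limit.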

References

1. Bickel, K., P. Gorkin, A. Greenbaum, T. Ransford, F. L. Schwenninger, and E. Wegert. “Crouzeix’s conjecture and related problems.” Comput. Methods Funct. Theory 20, no. 3–4 (2020): 701–28.

2. Crouzeix, M. “Bounds for analytical functions of matrices.” Integral Equations Operator Theory 48, no. 4 (2004): 461–77.

3. Crouzeix, M. “Numerical range and functional calculus in Hilbert space.” J. Funct. Anal. 244, no. 2 (2007): 668–90.

4. Crouzeix, M. “Some constants related to numerical ranges.” SIAM J. Matrix Anal. Appl. 37, no. 1 (2016): 420–42.

5. Crouzeix, M. and C. Palencia. “The numerical range is a (1+√2)-spectral set.” SIAM J. Matrix Anal. Appl. 38, no. 2 (2017): 649–55.

6. Delyon, B. and F. Delyon. “Generalization of von Neumann’s spectral sets and integral representation of operators.” Bull. Soc. Math. France 127, no. 1 (1999): 25–41.

7. Gaier, D. “Konstruktive Methoden der konformen Abbildung.” Springer Tracts Nat. Philos., vol. 3. New York, NY: Springer, 1964.

8. Garnett, J. “Bounded analytic functions.” Grad. Texts Math., vol. 236, revised 1st ed. New York, NY: Springer, 2006.

9. Garnett, J. and D. Marshall. “Harmonic measure.” New Math. Monogr., vol. 2. Cambridge: Cambridge University Press, 2005.

10. Gustafson, K. E. and D. K. M. Rao. Numerical Range. The Field of Values of Linear Operators and Matrices. Universitext. New York, NY: Springer, 1996.

11. Jung, H. “Über den kleinsten Kreis, der eine ebene Figur einschließt.” J. Reine Angew. Math. 1910 (1910): 310–3.

12. Kantorovich, L. and V. Krylov. Approximate Methods of Higher Analysis. Groningen: P. Noordhoff Ltd., 1958.

13. Kral, J. “Integral operators in potential theory.” Lect. Notes Math., vol. 823. Cham: Springer, 1980.

14. Neumann, C. Untersuchungen über das Logarithmische und Newton’sche Potential. Leipzig: Teubner, 1877.

15. Okubo, K. and T. Ando. “Constants related to operators of class C_ρ.” Manuscr. Math. 16 (1975): 385–94.

16. Putinar, M. and S. Sandberg. “A skew normal dilation on the numerical range of an operator.” Math. Ann. 331, no. 2 (2005): 345–57.

17. Schober, G. “Neumann’s lemma.” Proc. Amer. Math. Soc. 19 (1968): 306–11.

18. Schwenninger, F. and J. de Vries. “On abstract spectral constants.” In Operator and Matrix Theory, Function Spaces, and Applications. IWOTA 2022. Operator Theory: Advances and Applications, vol. 295. Birkhäuser, Cham.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.