Two-weight dyadic Hardy's inequalities

We present various results concerning the two-weight Hardy's inequality on infinite trees. Our main scope is to survey known characterizations (and proofs) for trace measures, as well as to provide some new ones. Also for some of the known characterizations we provide here new proofs. In particular, we obtain a new characterization based on a new reverse H\"older inequality for trace measures, and one based on the well known Muckenhoupt-Wheeden-Wolff inequality, of which we here give a new probabilistic proof. We provide a new direct proof for the so called isocapacitary characterization and a new simple proof, based on a monotonicity argument, for the so called mass-energy characterization. Furthermore, we introduce a conformally invariant version of the two-weight Hardy's inequality, we characterize the compactness of the Hardy operator, we provide a list of open problems and suggest some possible lines of future research.

The classical Hardy's inequality states that for every positive measurable function $f$ and every $p > 1$,
\[ \int_0^\infty \Big( \frac{1}{x} \int_0^x f(t)\,dt \Big)^p dx \le (p^*)^p \int_0^\infty f(x)^p\,dx, \]
where $p^* := p/(p-1)$, and the constant $(p^*)^p$ is optimal. Hardy himself, motivated by the goal of giving a simpler proof of "Hilbert's inequality for double series" [35], was actually primarily interested in the discrete analogue of the above inequality,
\[ \sum_{n=1}^\infty \Big( \frac{a_1 + \cdots + a_n}{n} \Big)^p \le (p^*)^p \sum_{n=1}^\infty a_n^p, \tag{Hardy} \]
where $a_n$ are positive real numbers. Indeed, for p = 2, the discrete version was the first one to be proved by Hardy in his earlier paper [34]. The discrete inequality for general p can either be deduced from the corresponding continuous one or proved directly, as communicated to Hardy by Landau [36]. Much more information on the fascinating history of the development of Hardy's original inequality can be found in the survey paper [41].
Despite its original purpose, it soon became clear that Hardy's inequality and its extensions lie at the heart of the developments in the broad area of harmonic analysis throughout the 20th century, up until modern days. On a macroscopic level that is because Hardy's inequality is the prototype of a (weighted) norm inequality for an integration (averaging) operator between $L^p$ spaces. Averaging operators together with maximal and singular operators are the pillars of harmonic analysis. In this light a vast number of theorems can be considered as Hardy-type inequalities. Hence, it comes as no surprise that the original Hardy's inequality was later generalized in many different directions.
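The discrete inequality $\sum_n ((a_1+\cdots+a_n)/n)^p \le (p^*)^p \sum_n a_n^p$ can be checked numerically on truncated sequences. The following is a minimal sketch, not part of the original text; the function name `hardy_sides` and the sample sequence are illustrative assumptions. Truncating a sequence only drops nonnegative terms from the left-hand side, so the inequality must still hold.

```python
def hardy_sides(a, p):
    """Return (lhs, rhs) of the discrete Hardy inequality for a finite
    nonnegative sequence a and an exponent p > 1.

    lhs = sum_n ((a_1 + ... + a_n) / n)^p
    rhs = (p*)^p * sum_n a_n^p, with p* = p / (p - 1).
    """
    p_star = p / (p - 1.0)
    partial = 0.0
    lhs = 0.0
    for n, a_n in enumerate(a, start=1):
        partial += a_n              # running sum a_1 + ... + a_n
        lhs += (partial / n) ** p   # p-th power of the n-th average
    rhs = p_star ** p * sum(x ** p for x in a)
    return lhs, rhs


if __name__ == "__main__":
    # Hypothetical test sequence a_n = 1/n, truncated at n = 2000.
    a = [1.0 / n for n in range(1, 2001)]
    for p in (1.5, 2.0, 3.0):
        lhs, rhs = hardy_sides(a, p)
        print(p, lhs, rhs, lhs <= rhs)
```

Such an experiment of course proves nothing about optimality of the constant; it only illustrates the two sides of the inequality.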
Tomaselli in [64] and Talenti in [62] made the first steps towards a weighted Hardy's inequality, i.e., an inequality of the form
\[ \int_0^\infty \Big( \int_0^x f(t)\,dt \Big)^p U(x)\,dx \le C \int_0^\infty f(x)^p\, V(x)\,dx, \tag{1} \]
in which V, U are positive measurable weights. Another way to state the same inequality is to say that the Hardy operator, which maps f to its primitive, is bounded from $L^p(\mathbb{R}_+, V(x)\,dx)$ to $L^p(\mathbb{R}_+, U(x)\,dx)$. The first to give a complete characterization of the weights such that the weighted Hardy's inequality holds was Muckenhoupt in [50]. It is worth taking a closer look at Muckenhoupt's condition.
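Theorem 1 (Muckenhoupt). Let 1 ≤ p ≤ ∞. There exists a constant C such that (1) is true if and only if
\[ B := \sup_{r>0} \Big( \int_r^\infty U(x)\,dx \Big) \Big( \int_0^r V(x)^{1-p^*}\,dx \Big)^{p-1} < \infty. \]
Furthermore, if C is the smallest constant such that the inequality holds, then $B \le C \le p\,(p^*)^{p-1} B$.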
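As a sanity check, the classical inequality can be recovered from Theorem 1. The computation below is only a sketch under an assumed normalization: we read (1) as $\int_0^\infty (\int_0^x f)^p\, U\,dx \le C \int_0^\infty f^p\, V\,dx$ and take $B = \sup_{r>0} (\int_r^\infty U)(\int_0^r V^{1-p^*})^{p-1}$, which is the normalization compatible with the stated bound $B \le C \le p(p^*)^{p-1}B$. With $U(x) = x^{-p}$, $V \equiv 1$ and $1 < p < \infty$:

```latex
% Assumed normalization (see the lead-in): U(x) = x^{-p}, V(x) = 1, 1 < p < \infty.
\begin{align*}
B &= \sup_{r>0} \Big( \int_r^\infty x^{-p}\,dx \Big) \Big( \int_0^r 1\,dx \Big)^{p-1}
   = \sup_{r>0} \frac{r^{1-p}}{p-1}\, r^{p-1}
   = \frac{1}{p-1}, \\
p\,(p^*)^{p-1} B &= \frac{p}{p-1}\,(p^*)^{p-1} = (p^*)^{p}.
\end{align*}
```

Under this reading, the upper bound of Theorem 1 reproduces exactly the optimal constant $(p^*)^p$ of the classical inequality.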
This theorem completes the picture of the weighted Hardy's inequality on $\mathbb{R}_+$. In the meantime several other extensions of Hardy's inequality to different spaces were considered: to higher dimensions, to different metric spaces, to fractional integral operators [59,61,51]. The weighted problem with different exponents, namely, the boundedness of the Hardy operator from $L^p(\mathbb{R}_+, V(x)\,dx)$ to $L^q(\mathbb{R}_+, U(x)\,dx)$, with p and q not necessarily coinciding, was also extensively studied: we mention Scott [22] for the case 1 < p ≤ q ≤ ∞, Maz'ya [47] and Sawyer [56] for the case p > q.
Let us mention that there also exists a different stream of research which is dedicated to finding weights which are optimal, in an appropriate sense, for weighted Hardy's inequalities, both in the continuous setting ([30,18]) and in the setting of graphs ([39,19]).

The two-weight Hardy's inequality on trees.
In the present paper we focus on a generalized version of the discrete Hardy's inequality (Hardy), the two-weight Hardy's inequality on trees. A particular case of this inequality is the so-called two-weight dyadic Hardy's inequality, presented in Section 9.1 in the Appendix. In that section we will also make clear the intuitive fact that the classical inequality (Hardy) is a special case of the inequality on trees, and we will provide some examples of applications of the dyadic inequality in Complex Analysis. In the paper, however, we are able to work in the generality of the two-weight Hardy's inequality on trees presented in this section. Before stating such an inequality we need to introduce some notation.
A tree T = (V, E) is a simple connected graph with no cycles, where V is a finite or countable set of vertices and E is the set of edges. The number of edges sharing each vertex is assumed to be finite, but we don't require further restrictions. We will consistently use Greek letters for edges and Latin letters for vertices and boundary points, to be defined below. Henceforth we will identify the tree T with its vertices V . We fix arbitrarily a root vertex o and we assume there exists a pre-root vertex o * which is connected to o and to no other vertex. We denote by ω the edge connecting o and o * and we call it the root edge .
A fundamental property of trees is that for any pair of vertices x, y ∈ T there exists a unique geodesic connecting the two, that is, a unique minimal sequence of pairwise connected vertices containing x and y. We write [x, y], or equivalently [y, x], for the (unique) set of edges connecting pairs of points in the geodesic joining x and y. The confluent of x and y is the vertex x ∧ y such that $[o^*, x \wedge y] = [o^*, x] \cap [o^*, y]$. The edge-counting distance on T is given by d(x, y) = ♯[x, y]. We use the same symbol for the vertex-counting distance on E, defined in the obvious way, and abbreviate d(α) for d(α, ω). If we assign to each edge a weight σ(α) > 0, we can define the associated distance $d_\sigma$ by adding the weights on the edges of a path, instead of counting edges. Due to the elementary topology of a tree, the new metric has the same geodesics.
If α is an edge, we denote by b(α) its endpoint vertex which is closest to $o^*$ and by e(α) the furthest. Observe that for any vertex x there exists a unique edge α with e(α) = x, while there are possibly many edges with b(α) = x. For each vertex x we define the predecessor of x as its unique neighbor vertex p(x) which is closer than x to $o^*$, and we denote by s(x) the set of the remaining neighbors, the children of x. Predecessors and children may also be defined for edges in the very same way.
The choice of a root induces a partial order on T and E. For edges, we write α ⊆ β if β ∈ [o * , e(α)], and for vertices we write x ⊆ y if [o * , y] ⊆ [o * , x]. We will also write x ⊆ α ⊆ y, comparing vertices and edges, with the obvious meaning. The notation ⊆ is a reminder that trees often come from dyadic decompositions of metric spaces (see Section 9.1).
The boundary of the rooted tree (T, ω), denoted by ∂T, is the set of the maximal geodesics emanating from $o^*$. In the infinite tree case, we always assume that all maximal geodesics starting at $o^*$ are infinite, so that ∂T ∩ T = ∅. We set $\overline{T} := T \cup \partial T$.
The most important geometric objects we deal with are the successor sets S(α) of an edge α, $S(\alpha) = \{x \in \overline{T} : [o^*, x] \ni \alpha\}$. We also write S(x) for S(α) when x = e(α). We remark that $\overline{T}$ is compact with respect to the topology generated by the family $\{S(\alpha)\}_{\alpha \in E}$ of successor sets and the singletons $\{x\}_{x \in T}$, which is why $\overline{T}$ is often referred to as the standard compactification of T.
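The notation introduced so far (predecessor, children, confluent, edge-counting and weighted distances, successor sets) can be made concrete on a finite tree. The sketch below is illustrative only: the class name `RootedTree`, its method names, and the convention of labelling each edge α by its endpoint e(α) are assumptions of ours, and the boundary ∂T is not modelled.

```python
class RootedTree:
    """Finite rooted tree. Each vertex x also labels the unique edge alpha
    with e(alpha) = x, so the root o labels the root edge omega."""

    def __init__(self, parent):
        # parent[x] = p(x); the root o has parent None (standing for o*).
        self.parent = dict(parent)
        self.children = {x: [] for x in parent}
        for x, px in parent.items():
            if px is not None:
                self.children[px].append(x)

    def path(self, x):
        """Edges of [o*, x], listed from x up to the root edge."""
        out = []
        while x is not None:
            out.append(x)
            x = self.parent[x]
        return out

    def confluent(self, x, y):
        """x ^ y: the deepest vertex lying on both [o*, x] and [o*, y]."""
        on_y = set(self.path(y))
        for z in self.path(x):  # walking up from x, first common vertex
            if z in on_y:
                return z

    def dist(self, x, y, sigma=None):
        """d(x, y) = #[x, y], or d_sigma(x, y) if edge weights are given."""
        weight = (lambda e: 1) if sigma is None else (lambda e: sigma[e])
        common = set(self.path(self.confluent(x, y)))
        return sum(weight(e)
                   for v in (x, y)
                   for e in self.path(v)
                   if e not in common)

    def successors(self, x):
        """Vertices of S(alpha), where alpha is the edge ending at x."""
        out, stack = [], [x]
        while stack:
            v = stack.pop()
            out.append(v)
            stack.extend(self.children[v])
        return out


if __name__ == "__main__":
    t = RootedTree({"o": None, "a": "o", "b": "o", "a0": "a", "a1": "a"})
    print(t.confluent("a0", "b"), t.dist("a0", "b"), t.successors("a"))
```

Since geodesics are unique, the weighted distance is computed along the same edge set as the edge-counting one, which mirrors the remark that $d_\sigma$ has the same geodesics as d.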
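The following picture should help the reader to visualize and summarize some of the fundamental notation just introduced.
We are now ready to introduce our main object of study, the two-weight Hardy's inequality on trees. Define the Hardy operator I as the operator mapping a function $\varphi : E \to \mathbb{R}_+$ to
\[ I\varphi(x) = \sum_{\alpha \supset x} \varphi(\alpha), \qquad x \in \overline{T}, \tag{2} \]
provided that the sum converges. Let $\pi : E \to \mathbb{R}_+$ be some fixed edge weight and µ ≥ 0 a Borel measure on $\overline{T}$, and write $\ell^p(\pi) = \ell^p(E, \pi)$ and $L^p(\mu) = L^p(\overline{T}, \mu)$. The main problem discussed here is whether the following two-weight Hardy's inequality on $\overline{T}$ holds,
\[ \int_{\overline{T}} (I\varphi)^p\, d\mu \le [\mu] \sum_{\alpha \in E} \varphi(\alpha)^p\, \pi(\alpha), \tag{H} \]
i.e., whether $I : \ell^p(\pi) \to L^p(\mu)$ is bounded, and what can be said about the constant $[\mu] = \|I\|^p_{\ell^p(\pi) \to L^p(\mu)}$. Any µ satisfying (H) (for some fixed π, p) with finite constant [µ] will be called a trace measure, and [µ] is the Carleson measure norm, or trace norm, of µ. We always consider π as a fixed, geometric object, so that [µ] depends on π and p as well. Also, the quantity [·] is chosen so as to be a sublinear functional of µ: $[\mu + \nu] \le [\mu] + [\nu]$.
The inequality (H) has a natural corresponding dual inequality, which will be of fundamental use in the sequel. For $\psi : \overline{T} \to \mathbb{R}$, we set
\[ I^*_\mu \psi(\alpha) = \int_{S(\alpha)} \psi\, d\mu, \qquad \alpha \in E. \]
By obvious duality,
\[ \|I\|_{\ell^p(\pi) \to L^p(\mu)} = \|I^*_\mu\|_{L^{p^*}(\mu) \to \ell^{p^*}(\pi^{1-p^*})}. \]
Thus, (H) is equivalent to
\[ \sum_{\alpha \in E} |I^*_\mu \psi(\alpha)|^{p^*}\, \pi(\alpha)^{1-p^*} \le [\mu]^{p^*-1} \int_{\overline{T}} |\psi|^{p^*}\, d\mu, \]
which we will refer to as the dual Hardy's inequality.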
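On a finite tree, with µ a sum of point masses at the vertices, both the Hardy operator and its dual are finite sums, and the duality behind the dual Hardy's inequality reduces to exchanging the order of summation. The sketch below is a hypothetical illustration: the function names, the encoding of the tree as a `parent` dictionary (each edge labelled by its endpoint e(α)), and the reading Iϕ(x) = sum of ϕ over the edges of [o*, x] are our assumptions.

```python
def hardy(parent, phi):
    """I phi(x): sum of phi over the edges of [o*, x].
    Edges are labelled by their endpoint e(alpha); parent[root] is None."""
    out = {}
    for x in parent:
        total, y = 0.0, x
        while y is not None:   # walk from x up to the root edge
            total += phi[y]
            y = parent[y]
        out[x] = total
    return out


def co_potential(parent, mu, psi):
    """I*_mu psi(alpha): integral of psi over S(alpha) against mu,
    where mu is a dict of point masses at the vertices."""
    out = {x: 0.0 for x in parent}
    for x in parent:
        y = x
        while y is not None:   # x belongs to S(alpha) for every edge above it
            out[y] += psi[x] * mu[x]
            y = parent[y]
    return out


def pairing_gap(parent, phi, mu, psi):
    """Difference between  sum_x I phi(x) psi(x) mu(x)  and
    sum_alpha phi(alpha) I*_mu psi(alpha); it should vanish up to rounding."""
    i_phi = hardy(parent, phi)
    i_star = co_potential(parent, mu, psi)
    lhs = sum(i_phi[x] * psi[x] * mu[x] for x in parent)
    rhs = sum(phi[a] * i_star[a] for a in parent)
    return lhs - rhs


if __name__ == "__main__":
    parent = {"o": None, "a": "o", "b": "o", "a0": "a", "a1": "a"}
    phi = {"o": 1.0, "a": 2.0, "b": 3.0, "a0": 4.0, "a1": 5.0}
    ones = {x: 1.0 for x in parent}
    print(hardy(parent, phi), pairing_gap(parent, phi, ones, ones))
```

For any test function ϕ, the ratio of the integral of (Iϕ)^p against µ to the weighted sum of ϕ^p π gives a lower bound for the trace norm on such a finite tree.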
We reserve a specific symbol for a particular edge weight which will show up often in the paper; we write |α| for the unique edge weight satisfying the recursive formula |α| = |p(α)|/q(e(α)), normalized so that |ω| = 1. On the homogeneous tree where each vertex has q + 1 neighbors, $|\alpha| = q^{-d(\alpha)}$. Observe that if the tree arises from a dyadic decomposition of the unit interval (see Section 9.1) then |α| is the Lebesgue measure of the corresponding subinterval in the decomposition. In full generality we will then call Lebesgue measure on the boundary of T the measure dx defined by $\int_{\partial S(\alpha)} dx = |\alpha|$ for every α ∈ E.
Characterizations of the triples (p, π, µ) for which (H) holds have been known for more than twenty years ([56,9,4,25]). The main scope of this expository paper is to survey old characterizations together with their proofs, to provide some new proofs and also to present some completely new characterizations. We point out that even the known proofs are here adapted to work in our general framework, since they have all been originally given for the dyadic homogeneous tree only.
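The canonical weight |α| can be computed by a single pass from the root. The sketch below rests on an assumption that the text does not spell out here: we take q(e(α)) to be the number of children of b(α), that is, the number of sibling edges of α including α itself. With this reading the weight splits equally among the children of each edge, so that the children's weights add up to the parent's weight, and on a homogeneous tree one recovers $|\alpha| = q^{-d(\alpha)}$. The names `binary_tree` and `canonical_weight` are hypothetical.

```python
def binary_tree(depth):
    """Parent map of the dyadic tree of the given depth. Vertices are
    0/1-strings; the empty string is the root o and labels the edge omega."""
    parent = {"": None}
    level = [""]
    for _ in range(depth):
        nxt = []
        for x in level:
            for bit in "01":
                parent[x + bit] = x
                nxt.append(x + bit)
        level = nxt
    return parent


def canonical_weight(parent):
    """|alpha| for every edge alpha (labelled by e(alpha)), normalized so
    that the root edge has weight 1.
    ASSUMPTION: q(e(alpha)) = number of children of b(alpha)."""
    children = {x: [] for x in parent}
    for x, px in parent.items():
        if px is not None:
            children[px].append(x)
    root = next(x for x, px in parent.items() if px is None)
    weight = {root: 1.0}
    stack = [root]
    while stack:                       # top-down traversal
        x = stack.pop()
        for c in children[x]:
            weight[c] = weight[x] / len(children[x])
            stack.append(c)
    return weight, children


if __name__ == "__main__":
    w, ch = canonical_weight(binary_tree(3))
    print(w["010"], sum(w[c] for c in ch["01"]) == w["01"])
```

On the dyadic tree the weight of the edge labelled by a binary word of length n is $2^{-n}$, the length of the corresponding dyadic subinterval of the unit interval.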
Our motivations are manifold. First, it is interesting and instructive to see the diverse machinery which can be employed in the solution of the problem. Second, we will see that the many conditions characterizing the measures µ for which the Hardy's inequality holds are rather different from one another, and their equivalence is a collection of interesting mathematical facts in itself. Third, as research moves to uncharted territories, such as multi-parameter dyadic Hardy inequalities, it is useful to have a place where different techniques are surveyed in a unified framework. We will mention some more recent areas of investigation, providing some results and mentioning a number of open problems.

Description of contents.
We proceed now to a detailed description of the main theorems of this paper. The main goal of Sections 2 and 3 is to prove the following theorem.
Theorem 2. Let T be a locally finite connected tree, 1 < p < ∞ and π a non-negative weight on the edges. Then for a positive Borel measure µ on $\overline{T}$ the following are equivalent.
(i) The Hardy's inequality (H) holds, i.e., µ is a trace measure;
(ii) The mass-energy condition holds;
(ME) $\sup_{\alpha \in E} \frac{1}{\mu(S(\alpha))} \sum_{\beta \subseteq \alpha} \mu(S(\beta))^{p^*}\, \pi(\beta)^{1-p^*} < \infty$;
(iii) The isocapacitary condition holds;
(ISO) $\sup \frac{\mu\big(\bigcup_{i=1}^n S(\alpha_i)\big)}{\mathrm{Cap}_{p,\pi}\big(\bigcup_{i=1}^n S(\alpha_i)\big)} < \infty$, the supremum being taken over all finite families $\alpha_1, \dots, \alpha_n \in E$,
where the capacity $\mathrm{Cap}_{p,\pi}$ is the one defined in Section 2.1. Furthermore, the following inequalities hold,
As we will show later, it is easy to prove that (H) implies (ME) and (ISO); the main issue in Theorem 2 is to prove the reverse implications. Section 2 is dedicated to developing a potential theory on the tree and proving the isocapacitary characterization, i.e., the equivalence of (i) and (iii) in Theorem 2. This equivalence was first proved in [11], though in an indirect way, passing through the mass-energy condition. Here we give a new direct proof that (iii) implies (i), which builds on ideas developed by Maz'ya in the continuous setting. The main tool is a strong capacitary inequality [46,1], which in tree language takes the form
\[ \sum_{k \in \mathbb{Z}} 2^{kp}\, \mathrm{Cap}_{p,\pi}\big(\{x \in \overline{T} : I\varphi(x) > 2^k\}\big) \le C\, \|\varphi\|^p_{\ell^p(\pi)}, \]
with a constant C depending only on p, and has an elementary proof. In Section 3 we turn to the mass-energy condition introduced in [9]. We will give three proofs of its equivalence with (H): one based on maximal functions (originally proved for the homogeneous tree in [10]); one relying on a simple monotonicity argument, which is new and the simplest available at the moment, but only works for p = 2; and a very recent one using a Bellman function argument (originally proved for the homogeneous tree in [4,25]).
The advantage of the isocapacitary condition (ISO) over the mass-energy condition (ME) is that the measure µ appears on the left hand side only. It is obvious from it, On the other hand, the mass-energy condition only has to be verified on single intervals, and not on arbitrary unions of them.
In some simple and particularly common cases of weights and trees, some further characterizations for trace measures can be proved to hold, lengthening the list of equivalent conditions summed up in Theorem 2. Sections 4 and 5 are dedicated to two different such additional characterizations.
More precisely, in Section 4, we restrict our attention to the case π = 1 and p = 2, and prove that in this case the Hardy's inequality is equivalent to a one parameter family of conditions. We prove the following theorem.
It can be readily verified that the s-testing condition is stronger than the mass-energy condition. The aim of the section is to prove that in fact they are all equivalent to the mass-energy condition which, in a sense, amounts to saying that Carleson measures satisfy a reverse Hölder inequality, see Theorem 10. The results in this section are new and the techniques employed are partially inspired by the work of Tchoundja [63].
In Section 5 we provide another characterization of trace measures which holds (for any p) for a family of edge weights (depending on p) on homogeneous trees, and it can be generalized to trees having Ahlfors regular boundary [15,Section 3]. More precisely we prove the following.
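Such a characterization is an immediate consequence of the Muckenhoupt-Wheeden inequality, which in this case reads as follows: for any 1 < p < ∞, 0 < s < 1 and for any measure µ on a homogeneous tree T,
\[ \int_{\partial T} \Big( \sum_{\alpha \supset x} \frac{\mu(S(\alpha))}{|\alpha|^s} \Big)^{p^*} dx \approx \sum_{\alpha \in E} \Big( \frac{\mu(S(\alpha))}{|\alpha|^s} \Big)^{p^*} |\alpha| \approx \int_{\partial T} \sup_{\alpha \supset x} \Big( \frac{\mu(S(\alpha))}{|\alpha|^s} \Big)^{p^*} dx. \]
Indeed, it is easy to recognize in the middle term the form that the sum appearing in the mass-energy condition takes for the particular choice of $\pi(\alpha) = |\alpha|^{\frac{1-p^*s}{1-p^*}}$. Hence, for this family of weights on homogeneous trees we have an alternative characterization of Carleson measures. The above choice of π is of particular interest because of the connection with the theory of Bessel potentials in $\mathbb{R}^n$, see Section 9.2 in the Appendix.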
Besides providing a different characterization of Carleson measures, the story of this inequality is itself interesting. That the term on the left is comparable to that on the right was proved in a non-dyadic language by Muckenhoupt and Wheeden in [51, Theorem 1]. They attribute the idea of the proof, which is a textbook example of a good-λ inequality, to Coifman and Fefferman [29]. Unaware of it, Wolff gave a wholly different proof [37, Theorem 1] that the term in the center is comparable to the term on the left. Independently of this, the first author, Rochberg and Sawyer gave a different proof that the central term is bounded by the one on the right [9]. In this section we will give a new proof of the "Wolff inequality" based on a probabilistic argument. The new proof has the advantage that it extends more easily to different settings. We will return to that in future work.
In Section 6 we discuss a conformally invariant version of (H) on the homogeneous tree. The left hand side in (H) evidently depends on the arbitrary choice of a root in our tree. To remedy this we propose the following modified inequality We prove that it is "invariant", i.e., if Ψ is a tree automorphism, then $[\Psi_*\mu]_{inv} = [\mu]_{inv}$, and that, surprisingly, it is equivalent to (H). We also provide a sharp estimate of the quantity $[\mu]_{inv}$ in terms of capacity (Theorem 20). All results in this section appear here for the first time.
In Section 7 we collect some miscellaneous results on the topic. They are all new. First we prove that a "vanishing" version of the mass-energy condition characterizes the compactness of the Hardy operator (Theorem 23). With a similar reasoning one can obtain an equivalent isocapacitary type vanishing condition (Theorem 24). We then discuss another very natural and easily determined necessary condition for (H) to hold, the simple box-type condition, In contrast to the mass-energy and the isocapacitary conditions, however, for many relevant weights and trees (SB) is not sufficient, see Example 25. We end the section by providing two easy examples of trees where the potential theory degenerates in two opposite ways.
In Section 8 we enlarge our horizon, including some dyadic structures variously related to the Hardy operator, or to its applications. Most of this territory is uncharted, only a few results are known, investigation is still in its infancy, and there is a high potential for applications to harmonic analysis, holomorphic function theory, and more. This section is essentially a description of the few results that are known in the literature. We omit most of the proofs. In Section 8.1 we introduce the viewpoint of reproducing kernel Hilbert spaces, which provides a unified view of the preceding inequalities and is instrumental to state the problem of Hardy-type inequalities for quotient structures in Section 8.2. In Section 8.3 we briefly account on the topic of Hardy inequalities on poly-trees. This is a new area of research where very little is known. Recently it has attracted a lot of interest because of the applications to function theory in the poly-disc.
We end the paper with an Appendix where we include the discussion on the model case of the purely dyadic Hardy's inequality (Section 9.1) and a comparison of the potential theory we use in the paper with that arising from Bessel potentials (Section 9.2).
In the text we mention a number of open problems that we think are interesting and deserve further attention.

Potential theory on trees and the isocapacitary characterization
This section is dedicated to proving the equivalence of (i) and (iii) in Theorem 2. Section 2.1 introduces the potential theory which we need to define a p-capacity on the tree, while the actual proof is given in Section 2.2.
2.1. Potential theory. We define a potential theory following Adams and Hedberg's axiomatic approach [2]. Other approaches are also possible, see for example [58]. Consider the compact Hausdorff space $\overline{T}$, and make E into a measure space by endowing it with the measure associated to a weight $\sigma : E \to \mathbb{R}_+$. We introduce the kernel $k : \overline{T} \times E \to \mathbb{R}_+$, given by the characteristic function $k(x, \alpha) = \chi_{\{\alpha \supset x\}}(x, \alpha)$. Observe that k(·, α) is continuous on ∂T, since ∂S(α) is open.
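Given a function $\varphi : E \to \mathbb{R}_+$, we define the potential of ϕ, $I_\sigma \varphi : \overline{T} \to \mathbb{R}_+ \cup \{+\infty\}$, by
\[ I_\sigma \varphi(x) = \sum_{\alpha \in E} k(x, \alpha)\, \varphi(\alpha)\, \sigma(\alpha) = \sum_{\alpha \supset x} \varphi(\alpha)\, \sigma(\alpha). \]
The co-potential of a function $\psi : \overline{T} \to \mathbb{R}_+$ with respect to a positive Borel measure µ on $\overline{T}$ is defined as the edge function
\[ I^*_\mu \psi(\alpha) = \int_{\overline{T}} k(x, \alpha)\, \psi(x)\, d\mu(x) = \int_{S(\alpha)} \psi\, d\mu. \]
The co-potential of µ is intended to be $I^*_\mu(\alpha) := I^*_\mu 1(\alpha) = \mu(S(\alpha))$. Observe that if $\psi \in L^1(\overline{T}, \mu)$, by Fubini's theorem we have $\langle I_\sigma \varphi, \psi \rangle_{L^2(\overline{T}, \mu)} = \langle \varphi, I^*_\mu \psi \rangle_{\ell^2(E, \sigma)}$. For a Hölder dual pair of exponents p, p* we can further associate to the measure µ a nonlinear Wolff's potential,
\[ V^{\mu,\sigma}_p(x) = I_\sigma\big( (I^*_\mu)^{p^*-1} \big)(x) = \sum_{\alpha \supset x} \mu(S(\alpha))^{p^*-1}\, \sigma(\alpha). \]
In the linear case p = 2 the sum and the integral can be switched and the above potential expressed as
\[ V^{\mu,\sigma}_2(x) = \int_{\overline{T}} d_\sigma(o^*, x \wedge y)\, d\mu(y). \]
The p-energy of the charge distribution µ is given by
\[ \mathcal{E}_{p,\sigma}(\mu) = \sum_{\alpha \in E} \mu(S(\alpha))^{p^*}\, \sigma(\alpha). \]
If the energy is finite, by Fubini's theorem it holds
\[ \mathcal{E}_{p,\sigma}(\mu) = \int_{\overline{T}} V^{\mu,\sigma}_p\, d\mu. \]
We define the capacity of a closed subset $A \subseteq \overline{T}$ as
\begin{align*}
\mathrm{Cap}^\sigma_p(A) &= \inf\big\{ \|\varphi\|^p_{\ell^p(E,\sigma)} : \varphi \ge 0,\ I_\sigma \varphi \ge 1 \text{ on } A \big\} \\
&= \sup\big\{ \mu(A)^p : \operatorname{supp}\mu \subseteq A,\ \mathcal{E}_{p,\sigma}(\mu) \le 1 \big\}.
\end{align*}
The equality between the first and second line above is given by a classical theorem in potential theory that can be found, for instance, in [2]. The (p, σ)-equilibrium function for A is the unique function ϕ satisfying $I_\sigma \varphi = 1$, $\mathrm{Cap}^\sigma_p$-quasi everywhere on A, and $\mathrm{Cap}^\sigma_p(A) = \|\varphi\|^p_{\ell^p(E,\sigma)}$. Similarly, one defines the (p, σ)-equilibrium measure for A as the unique measure such that $\mathrm{Cap}^\sigma_p(A) = \mu(A)$. Two of the authors recently found a characterization for equilibrium measures on trees [6].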
Observe that the boundedness of $I_\sigma : \ell^p(\sigma) \to L^p(\mu)$ is equivalent to that of the Hardy operator $I : \ell^p(\pi) \to L^p(\mu)$, under the correspondence $\pi(\alpha) = \sigma(\alpha)^{1-p}$. Following this paradigm, we can translate the natural potential theoretic objects into the π-dictionary, which turns out to be better suited to our purposes:
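The correspondence between σ and π can be verified in one line by a change of unknown. The derivation below is a sketch that assumes the potential is read as $I_\sigma\varphi(x) = \sum_{\alpha \supset x} \varphi(\alpha)\sigma(\alpha)$, i.e., the kernel k integrated against the measure σ on E; the auxiliary function g is introduced only for illustration.

```latex
% Assumption: I_\sigma\varphi(x) = \sum_{\alpha \supset x} \varphi(\alpha)\,\sigma(\alpha).
\begin{align*}
g := \varphi\,\sigma
  \quad &\Longrightarrow \quad
  I g(x) = \sum_{\alpha \supset x} \varphi(\alpha)\,\sigma(\alpha) = I_\sigma \varphi(x), \\
\|g\|^p_{\ell^p(\pi)}
  = \sum_{\alpha \in E} \varphi(\alpha)^p\, \sigma(\alpha)^p\, \pi(\alpha)
  &= \sum_{\alpha \in E} \varphi(\alpha)^p\, \sigma(\alpha)
  = \|\varphi\|^p_{\ell^p(\sigma)}
  \qquad \text{when } \pi = \sigma^{1-p}.
\end{align*}
```

Since $(1-p)(1-p^*) = 1$, the correspondence can be inverted as $\sigma = \pi^{1-p^*}$, which is the exponent appearing in the dual Hardy's inequality.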

Strong Capacitary Inequality and the isocapacitary characterization.
We are ready to prove the isocapacitary characterization for trace measures, that is, the equivalence of (i) and (iii) in Theorem 2. As is common, we will prove it passing through the so-called strong capacitary inequality. There are various versions of this inequality. Here we naturally treat the case of a tree; a proof for a large class of kernels in the continuous case can be found in [2, Theorem 7.1.1].
while if x = y and α is the unique edge such that x = e(α), we have $\varphi(\alpha) = I\varphi(x) - I\varphi(p(x)) > 2^{m+1} - 2^m$ and therefore Hence, $2^{-k}\varphi_k$ is a testing function for $\mathrm{Cap}_{p,\pi}(\{x : I\varphi(x) > 2^k\})$. Summing and using that the supports of the $\varphi_k$'s can meet only at points belonging to multiple boundaries,
Proof of the equivalence of (i) and (iii) in Theorem 2. Suppose that, as always, ϕ is a positive function defined on the edges of the tree; then using the distribution function we write This concludes the proof of sufficiency. To prove necessity let $\alpha_1, \dots, \alpha_n \in E$ and denote by ϕ the equilibrium function associated to the set $\bigcup_{i=1}^n S(\alpha_i)$. Then Iϕ ≥ 1, $\mathrm{Cap}_{p,\pi}$-quasi everywhere and in particular µ-a.e. on $\bigcup_{i=1}^n S(\alpha_i)$. It follows

Mass-energy characterization: three different proofs
In this section we will prove the mass-energy characterization for trace measures, that is, the equivalence of (i) and (ii) in Theorem 2. We will give three different proofs, based on different techniques. The easiest proof only works for p = 2, while the other two work for any 1 < p < ∞.
We remind that, and we call it the energy-mass ratio.
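3.1. Maximal function. This simple proof can be found, for instance, in [13]. It relies on the $L^p$ inequality for a suitable maximal function. If µ, σ ≥ 0 are measures on $\overline{T}$ and f is a function on $\overline{T}$, we set
\[ M_\mu(f\,d\sigma)(x) := \sup_{\alpha \supset x} \frac{1}{\mu(S(\alpha))} \int_{S(\alpha)} |f|\, d\sigma. \]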
We simplify the notation by setting M µ f = M µ (f dµ). We have the following weak-(1, 1) estimate.
Theorem 6. Suppose µ, σ ≥ 0 are measures on T and ψ a positive function on T . Then,
Proof. First we prove (a). Fix λ > 0 and set which proves the weak estimate. Then a simple application of a variation of the classical Marcinkiewicz interpolation theorem [32, Exercise 1.3.3] gives us (b) from (a).
Proof of the equivalence of (i) and (ii) in Theorem 2.

3.2. Monotone proof for p=2. This proof, the easiest of the three, only works for p = 2, because it uses the C*-identity for operators on Hilbert spaces.
Lemma 8. Let µ, ν ≥ 0 be measures on T , and suppose that µ(S(α)) ≤ ν(S(α)) holds for all α in E; and suppose that f :
Proof. It suffices to prove the inequality of the corresponding distribution functions. For t > 0,
Proof of the equivalence of (i) and (ii) in Theorem 2 for p = 2. Suppose initially that our tree is finite, but arbitrarily large. This ensures that all relevant operators are bounded. If we manage to estimate the norm of the Hardy operator independently of the length of the tree we can pass to the infinite case by a simple limiting argument.
For g : E → R + , we first compute Therefore, The second norm can be computed in terms of the norm of I. Consider a measure ρ on T such that $\rho(e(\alpha)) = \pi^{-1}(\alpha)\,\mu(S(\alpha))$. The mass-energy condition allows us to apply Lemma 8 to the measures ρ and µ, thereby obtaining A standard calculation shows that $T_2^* = T_1$ (with respect to the inner product in $\ell^2(\pi)$), hence we have for free the estimate on the norm of $T_1$. Putting everything together we get Since I is bounded, because our tree is finite, we can divide both sides of the inequality by its norm to get Notice that this proof gives the best constant.
3.3. Bellman function. In this section we provide a different proof, based on a Bellman function approach, of the fact that (ME) implies (H). Let T be a general rooted tree and denote by | · |, as usual, the canonical edge weight defined in Section 2.1. Set k β = |β|/|α| when β ∈ s(α).
The following is a tree version of the Weighted Dyadic Carleson Imbedding Theorem by Nazarov, Treil and Volberg [53]. Compared with the standard dyadic case, this tree analogue presents some extra difficulty, due to the fact that the objects in play are here not martingales but only supermartingales.
On the dyadic tree, a proof of the above theorem relying on a Bellman function method was first given in [4] for p = 2, and later extended to every 1 < p < ∞ in [25]. In the latter paper it is proven that the Bellman function employed is the Bellman function of a problem in stochastic optimal control. We give here a slightly adapted proof which works on every tree T. We also remark that the result remains true substituting the canonical weight | · | with a general weight w fulfilling the so-called flow condition, that is, $\sum_{\beta \in s(\alpha)} w(\beta) = w(\alpha)$ for any α ∈ E. Indeed, in the following proof the flow condition is the only property of the canonical weight which is used. We refer the reader to [26], [44] and [43] for some recent results concerning trees endowed with flow measures and flow weights.
Proof of Theorem 9. We begin by observing that it is enough to show the result for nonnegative functions. For any edge α and any quadruple of nonnegative real numbers F, f, A, v, define $\Omega_\alpha(F, f, A, v)$ to be the set of weights σ, measures λ and functions ϕ such that In order for $\Omega_\alpha(F, f, A, v)$ not to be empty, we must have $f^p \le F v^{p-1}$, the condition coming from Hölder's inequality. Moreover, , which is clearly convex, being the intersection of the half plane {A ≤ v} with the cylindroid having as a basis the convex set We aim to prove that In particular $x_\omega = x$, and it is clear that $x_\alpha \in D$ for each α. Moreover, the additivity of $I^*$ gives the following relations , the above relations can be rewritten as (9) x Now, suppose we can design a concrete function, B : Then, summing over α ∈ E both sides of the inequality and exploiting the telescopic structure of the summand we obtain from which the thesis follows, We now claim that the function fulfils the desired properties. The construction of a Bellman function is a delicate matter. The interested reader can find more information and examples in [23,54,17].
It is immediate that (i) holds on D. For any chosen x = (F, f, A, v) ∈ D, let $x_\alpha$ be the associated family of points solving (9). Then also the points $x^*_\alpha = x_\alpha - (0, 0, c_\alpha, 0)$ and $x^{**}_\alpha = x_\alpha - y_\alpha$ belong to the convex domain D. Since B is clearly concave in the third variable, we have the last inequality following from the domain constraint $A_\alpha \le v_\alpha$. Indeed, B is also concave as a function of four variables on the convex set D, as one can verify by checking that the Hessian matrix of (F, f, where the last inequality can be derived by direct calculations. Putting the pieces together we obtain Exploiting the concavity of B, which, by means of (11) yields It is easily seen that $c_\alpha = v_\alpha^p\, \sigma(\alpha)\, |\alpha|^{-1}$, which substituted above gives (ii).
It is clear that, a posteriori, the mass-energy and the isocapacitary conditions are equivalent, being both equivalent to (H). However, it is tempting to look for an a priori argument for the equivalence of these geometric conditions which does not pass through the boundedness of the Hardy operator. A direct proof that (ME) implies (ISO), for a family of weights including π = 1, is in [11], where it is also directly proven, for p = 2 and π = 1, that ν ≤ µ implies that
Problem 2. Find a proof of the equivalence between (ME) and (ISO) which works for every pair π, µ and does not require the boundedness of the Hardy operator.

A reverse Hölder inequality
In the particular case that π ≡ 1 and p = 2 the mass-energy condition can be rewritten in an interesting way as a consequence of the following calculation.
Therefore, the mass-energy condition can be expressed as The following variation on the above condition, is clearly stronger than the mass-energy condition for s > 1 due to Hölder's inequality. The surprising result is that in fact the conditions are equivalent, and the corresponding quantities are comparable. This result is in the spirit of the John-Nirenberg reverse Hölder inequality for BMO functions.
Theorem 10. For all measures µ, and s > 1, In an implicit form this result is contained in the work of Tchoundja [63]. The proof of the above theorem is based on a Calderón-Zygmund type theorem for the operator II * µ := T µ . More precisely, Theorem 11. Suppose that the operator is bounded. Then for any s ∈ (1, +∞) the operator Since the underlying measure µ is not necessarily doubling this theorem can be seen as a special case of [52, Theorem 1.1]. Here we shall give a direct proof which also provides better quantitative estimates of the constants involved based on a good-λ inequality as in [66,63].
Lemma 12 (Good-λ inequality). Let µ be a trace measure on T . Then for every η > 0, there exists γ(η) > 0 such that for any non-negative function f on T ,
Proof. Notice that the set $\{T_\mu f > \lambda\}$ is a stopping time. In other words it can be written as a disjoint union of tent regions, It is therefore sufficient to prove that for all $\alpha_i$ we have So for the rest of the proof we work on a fixed $S(\alpha_i)$ which we denote by S(α) to avoid an overload of notation. Let where we assume without loss of generality that $M_\mu(f^2)(e(\alpha)) \le \gamma^2 \lambda^2$, otherwise the left hand side is zero. It suffices therefore to choose
Proof of Theorem 11. Since the operator $T_\mu$ is self-adjoint it suffices to prove that $L^2(\mu)$ boundedness implies $L^s(\mu)$ boundedness for all s > 2. Let s > 2 and $f \in L^s(T, \mu)$. Exploiting Lemma 12 and Theorem 6, we get which proves the thesis if η is chosen small. In particular The reverse Hölder inequality is now a corollary of the above theorem.
Proof of Theorem 10. Suppose that µ satisfies the mass-energy condition. Then, ]. On the other hand, II * µ (χ(S(a)) s dµ, and the result follows from Theorem 11.

The inequality of Muckenhoupt and Wheeden, and Wolff
In this section we consider only the case when T is a homogeneous tree. We recall that if each vertex of T has q + 1 neighbors, then $|\alpha| = q^{-d(\alpha)}$ for each edge α. Let 0 < s < 1 and 1 < p* < ∞. For any measure µ on T we trivially have
\[ \sup_{\alpha \supset x} \frac{\mu(S(\alpha))}{|\alpha|^s} \le \Big( \sum_{\alpha \supset x} \Big( \frac{\mu(S(\alpha))}{|\alpha|^s} \Big)^{p^*} \Big)^{1/p^*} \le \sum_{\alpha \supset x} \frac{\mu(S(\alpha))}{|\alpha|^s}. \]
The inequality of Muckenhoupt and Wheeden [51, Theorem 1], (MW) in the sequel, says that the chain of inequalities can be reversed, on average.
Theorem 13. For any measure µ on ∂T , p* ≥ 1, and 0 < s < 1, there is a constant As usual, dx here is the Lebesgue measure for which $\int_{\partial S(\alpha)} dx = |\alpha|$. A first consequence of the (MW) inequality is that we have a one-parameter family of seemingly different conditions characterizing the measures µ for which the Hardy inequality holds, provided that the weight π has the special form $\pi(\alpha) = |\alpha|^{\frac{1-p^*s}{1-p^*}}$.
Proposition 14 (Wolff's inequality on the tree). Let µ be a non-negative Borel measure on ∂T . Then for any p* ≥ 1 and 0 < s < 1 one has Since the particular choice of q plays no role, from now on, to keep the notation lighter, we fix the homogeneity of the tree T by setting q = 2, i.e., we put ourselves back in the realm of the classical dyadic Hardy's inequality (33). Given an edge α ∈ E we denote by $\alpha_+$ and $\alpha_-$ its two children edges. In this setting, Proposition 14 follows from the slightly more general statement below. The function ϕ : T → R + is a logarithmic supermartingale with drift d > 0 if for every edge α one has
\[ \varphi(\alpha_+)\,\varphi(\alpha_-) \le e^{-2d}\, \varphi(\alpha)^2. \tag{14} \]
Proposition 15. Assume ϕ is a logarithmic supermartingale with drift d > 0, and that its jumps are bounded from above,
\[ \varphi(\alpha_\pm) \le C\, \varphi(\alpha), \tag{15} \]
for some constant C > 0. Then, for any p* ≥ 1 one has
Proof of Proposition 14. One only needs to observe that $\varphi(\alpha) := \mu(S(\alpha))/|\alpha|^s$, α ∈ E, defines a logarithmic supermartingale with drift log 2 · (1 − s) and bounded jumps. Indeed, given any edge α in T , we clearly have $|\alpha_\pm|^s = 2^{-s} |\alpha|^s$, hence, since $\mu(S(\alpha)) = \mu(S(\alpha_+)) + \mu(S(\alpha_-))$, we see that
\[ \varphi(\alpha_+)\,\varphi(\alpha_-) = \frac{\mu(S(\alpha_+))\,\mu(S(\alpha_-))}{|\alpha_+|^s\, |\alpha_-|^s} \le |\alpha|^{-2s}\, 2^{2(s-1)} \big( \mu(S(\alpha_+)) + \mu(S(\alpha_-)) \big)^2 = 2^{2(s-1)} \Big( \frac{\mu(S(\alpha))}{|\alpha|^s} \Big)^2. \]
The logarithmic supermartingale property follows immediately. On the other hand, $\frac{\mu(S(\alpha_\pm))}{|\alpha_\pm|^s} \le 2^s\,\frac{\mu(S(\alpha))}{|\alpha|^s}$, so the jumps of ϕ are clearly bounded from above. In order to prove Proposition 15 we will need the following lemma, whose proof we postpone.
Lemma 16 (Wolff's lemma). Fix δ > 0 and N > 1/2. Let ϕ be a logarithmic supermartingale with positive drift d > 0 satisfying (15). Then for any edge $\alpha_0$ in T the following inequality holds

Proof of Proposition 15. We only show the left inequality in (16). Let us introduce the following notation: write $[p^*]$ and $\{p^*\}$ for the integer and the fractional part of $p^*$ and set Note that $p^* = Q + r + r$. Then we have where $S_{Q+2}$ is the symmetric group of all permutations of {1, . . . , Q + 2} and $p_j = 1$ for 1 ≤ j ≤ Q, $p_{Q+1} = p_{Q+2} = r$. The next step is to use Wolff's lemma: given a permutation $\pi \in S_{Q+2}$ we apply (17) repeatedly to (18), obtaining Summing over all $\pi \in S_{Q+2}$ we obtain the first half of (16). We now prove Wolff's Lemma. The proof is based on a careful analysis of slowly and fast growing geodesics.
Proof of Lemma 16. Without any loss of generality we may assume that N = 1 (since the proof works all the same for every N) and α 0 = ω, the root edge.
What we are going to do next is to fix an edge α and look at the possible growth rate of ϕ(β) for β ⊆ α. The idea is that, if ϕ does not grow too fast in this region, then one could expect the second sum on the right-hand side of (17) to be dominated by the value at the starting point, $\sum_{\beta\subseteq\alpha}\varphi(\beta)|\beta| \lesssim \varphi(\alpha)|\alpha|$.
Given α ∈ E and k ≥ 0, let $A(\alpha, k)$ be the set of slowly growing successors (here $r = r(d, \delta) < 10^{-2}$ is some small constant to be chosen later), and let $A(\alpha) = \bigcup_{k\ge 0} A(\alpha, k)$. We have We start by estimating the second term, To deal with the first term we let and, as before, $B(\alpha) = \bigcup_{k\ge 0} B(\alpha, k)$. The function ϕ decays exponentially outside of B(α), in particular So far, we took care of two types of behaviour of ϕ: points of very fast growth (i.e., $\beta \notin A(\alpha)$), and points of very fast decay ($\beta \notin B(\alpha)$). Now we consider the points β ⊆ α where ϕ(β) is roughly comparable to ϕ(α). It turns out that these points are very rare in the successor set of α. More precisely we show that for every α ∈ E and k ≥ 0 The reason for this is that by the multiplicative property (14) the function ϕ decays exponentially on (geometric) average, and its pointwise growth rate is bounded from above. Let Φ = log ϕ. By (15), On the other hand (recall that $r < 10^{-2}$), if β ∈ B(α, k), then we get Choose r < min(1/100, d/10). The inequality (21) now follows from the following lemma.
and its differences are bounded from above Then for any k ≥ 0 and r ≤ d/10 one has Assume for a moment that we have Lemma 17, and hence (21). Then, we get (22) .
and we are done. Lemma 17 clearly follows from the following rescaled driftless version.
Then for any k ≥ 0 and η > 0 one has This Lemma is in turn a corollary of Azuma-Hoeffding inequality (essentially a good-λ argument for supermartingales).
Proposition 19 (Azuma-Hoeffding inequality). Let Z = {Z n } be a supermartingale with bounded differences, While the proposition above requires the supermartingale differences to be bounded above and below, it is not really relevant here. Namely, assume X satisfies the hypothesis of Lemma 18 and let S = {S n } be its differences, where P (β) is the parent of β. By (24), S ≤ 1. Consider the set otherwise.

Conformally invariant Hardy's inequality
While the right hand side of the Hardy's inequality (H) does not depend upon the choice of the root vertex o, the Hardy operator contained in the left hand side does, and consequently also the optimal constant [µ] = [µ] o depends on this choice. It is therefore natural to seek an alternative "conformal" invariant theory. The term "conformal invariant" should be interpreted in the sense that as (H) corresponds, as explained in the introduction, to a Carleson inequality for Besov spaces, in the same way the inequality we are going to introduce should correspond to a continuous inequality which remains invariant under the group of automorphisms of the unit disc.
We consider here the case p = 2, π ≡ 1. We also assume the tree is dyadic and not rooted: each vertex is the endpoint of three edges, and T is endowed with a rich group of automorphisms which, having the Poincaré distance in mind, play in T the role of conformal automorphisms. Such automorphisms are also isometries with respect to the distance d and act naturally also on the boundary ∂T (see [31] for a comprehensive exposition on the topic). Once we fix a root o, there are $3 \times 2^{n-1}$ vertices at distance n ≥ 1 from it.
It is easily seen that the Hardy's inequality (H), holding for functions f : E → R, is equivalent to where $\nabla F(\alpha) = F(e(\alpha)) - F(b(\alpha))$ depends on the choice of the root, but $|\nabla F(\alpha)|^2$ does not. A first attempt to write down a "conformally invariant" formulation of the Hardy's inequality is, assuming that µ(T) = 1, where $\mu(F) = \int_T F\,d\mu$ is the mean of F and $[\mu]_{inv} \in [0, +\infty]$ is the best constant in the inequality. The invariance is the following.
Let Ψ be an isometry of T and define $\Psi_*\mu(A) = \mu(\Psi^{-1}(A))$ and $\Psi^*F(x) = F(\Psi(x))$, A ⊆ T. Then, Observe that the finiteness of $[\mu]_o$ in (27) implies that µ has no atoms on ∂T. On the other hand, if µ is a Dirac delta measure supported on the boundary, the left-hand side of (CH) vanishes, while the average $\mu = \frac{\delta_x + \delta_y}{2}$ of two Dirac deltas gives a genuine, nontrivial inequality.
We will show that if µ is not a boundary Dirac delta, then (CH) is equivalent to (H). We present two separate arguments, one for measures supported on the boundary of the tree, and another one for measures supported on the vertex set. The next result shows that, for measures supported on the boundary, (27) and (CH) are equivalent, and relates the optimal constants to a capacitary expression.
Theorem 20. Let µ ≥ 0 be a Borel probability measure on T, giving no mass to vertices, and not being a Dirac delta on the boundary. Then, where the supremum is over all pairs of finite unions of arcs.
The following lemma provides a recursive formula for calculating the capacity of a condenser of a special type. Formulas of this kind often arise in the setting of discrete capacities.

Lemma 21.
Let o ∈ T be a vertex, and let $T_1, T_2, T_3$ be the dyadic subtrees having it as pre-root. Let $A_j \subseteq \partial T_j$ be finite unions of arcs. Then, In particular,

Proof. Let $c_j = \mathrm{Cap}_o(A_j)$. As in [12, Proposition 1], it can be proved that there exists an extremal function F ≥ 0 on T such that (a) $\lim_{T \ni x \to \zeta} F(x) = 0$ for ζ ∈ Of course, such |∇F| is not finitely supported. Similarly, there exist analogous extremal functions $F_j$ for $c_j$, j = 1, 2, 3, with $F_j(a) = 0$; $\lim_{T \ni x \to \xi} F_j(x) = 1$ for $\xi \in A_j$; $\sum_{\alpha\in E} \nabla F_j(\alpha)^2 = c_j$.
By extremality, it is obvious that there are numbers $0 < s_j \le 1$ such that $|\nabla F(\alpha)| = s_j |\nabla F_j(\alpha)|$ on the edges α of $T_j$, and since |∇F| adds to one along geodesics going from $A_1 \cup A_2$ to $A_3$, it must be $s_1 + s_2 = s_1 + s_3 = 1$. Again by minimality, we look for $t = s_1$ which minimizes which is minimal when $t = \frac{c_2 + c_3}{c_1 + c_2 + c_3}$, thus

Since clearly Cap
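The optimization step in the proof of Lemma 21 can be checked with exact arithmetic. The sketch below rests on an assumption: that the quantity being minimized is $t^2 c_1 + (1-t)^2 (c_2 + c_3)$, which is what the scaling $|\nabla F| = s_j |\nabla F_j|$ with $s_2 = s_3 = 1 - s_1$ suggests. The function names are hypothetical.

```python
from fractions import Fraction

def energy(t, c1, c2, c3):
    """Assumed energy of the rescaled function: t**2*c1 + (1-t)**2*(c2+c3)."""
    return t ** 2 * c1 + (1 - t) ** 2 * (c2 + c3)

def minimizer(c1, c2, c3):
    """Critical point t = (c2 + c3) / (c1 + c2 + c3) of the quadratic above."""
    return Fraction(c2 + c3) / Fraction(c1 + c2 + c3)

def minimal_energy(c1, c2, c3):
    """Value of the energy at the minimizer."""
    return energy(minimizer(c1, c2, c3), c1, c2, c3)
```

Under this assumption the minimal value equals $c_1(c_2+c_3)/(c_1+c_2+c_3)$, the familiar series law for two conductances $c_1$ and $c_2 + c_3$.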
Proof of Theorem 20. In one direction, In the other direction, consider closed subsets A, B ⊆ ∂T, which we might assume to be finite unions of arcs, and a function F with finitely supported |∇F| such that F = 1 on A and F = 0 on B. Then, Passing to the infimum over all such F's, we obtain Cap(A,B) implies that µ does not have more than one atom on the boundary, which would then be a Dirac delta. Hence, if µ has boundary atoms the statement holds.
Suppose now that µ is atomless. We claim: (i) if µ is a probability measure on ∂T having no atoms, then there are disjoint arcs I 1 ∪ I 2 = ∂T , such that 1/3 ≤ µ(I j ) ≤ 2/3.
With the claim given, let A be a finite union of arcs, let $I_1, I_2$ be as given by (i), and set $A_1 = A \cap I_2$, $A_2 = A \cap I_1$. Let $o_j$ be the pre-root of the dyadic tree $T^j_1$ having boundary $I_j$, and let $T^j_2, T^j_3$ be the two other dyadic trees having it as a pre-root, so that, for k ≠ j, $I_k = \partial T^j_2 \cup \partial T^j_3$, and,

We now come to the proof of the claim. Choose a vertex $x_0 \in T$ and for j = 1, 2, 3 let $T^0_j$ be the dyadic subtrees having pre-root at it, and call $x_j$ the neighbor of $x_0$ lying in $T^0_j$. For at least one j, $\mu(\partial T^0_j) \ge 1/3$; say j = 1. If $\mu(\partial T^0_1) \le 2/3$ we set $I_1 = \partial T^0_1$ and we are done. Otherwise, let $T^1_1, T^1_2$ be the two infinite subtrees of $T^0_1$ with pre-root at $x_1$. For one j ∈ {1, 2} we have $\mu(\partial T^1_j) \ge 1/3$; let it again be j = 1. As before, set $I_1 = \partial T^1_1$ if $\mu(\partial T^1_1) \le 2/3$; otherwise consider the two infinite subtrees of $T^1_1$ rooted at the neighbors of $x_1$, and iterate the reasoning. If the procedure never stops, we have a family of nested tents $\partial T^0_1 \supset \partial T^1_1 \supset \dots$, whose intersection is a single boundary point x with µ({x}) ≥ 2/3, contradicting the assumption.
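The descent used in the proof of the claim is a simple greedy procedure, which can be sketched as follows. This is an illustration under assumptions that are not in the text: subtrees are addressed by tuples (first entry in {0, 1, 2} for the three subtrees at $x_0$, later entries in {0, 1}), the measure is supplied as a hypothetical callable `mass(path)` returning the mass of the boundary of the addressed subtree, and the search is cut off at `max_depth`.

```python
from fractions import Fraction

def find_balanced_subtree(mass, max_depth=64):
    """Follow the heaviest subtree until its boundary mass is at most 2/3.

    Returns the address of a subtree whose boundary has mass in [1/3, 2/3],
    or None if no such subtree is met within max_depth steps
    (which is what happens when the measure has a heavy atom).
    """
    two_thirds = Fraction(2, 3)
    # Among the three subtrees at x0, the heaviest carries mass >= 1/3.
    path = max(((j,) for j in range(3)), key=mass)
    for _ in range(max_depth):
        if mass(path) <= two_thirds:
            return path
        # The current mass exceeds 2/3, so the heavier child carries more than 1/3.
        path = max((path + (0,), path + (1,)), key=mass)
    return None
```

For instance, if the three subtrees at $x_0$ carry masses 9/10, 1/20, 1/20 and mass is always split evenly between children, the procedure stops after one refinement at a subtree of mass 9/20.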
We come now to the case of measures supported on the vertex set of T . Since no extra work is required, we present a proof which holds in much higher generality.
Proposition 22. Let X be a locally compact space and B be a Banach space of functions on X such that for every compact K there is C(K) with $\sup_{x\in K} |f(x)| \le C(K)$ if $\|f\|_B \le 1$. Then the following are equivalent for a probability measure µ:

Suppose (a) holds and (b) does not. By (a):
We can find compact sets $K_n \nearrow X$ such that $\int_{K_n} |f_n|\,d\mu > \frac{M_n}{2}$. For any fixed compact S we have that $\int_S |f_n|\,d\mu \le C(S) \le \frac{M_n}{4}$ for n ≥ n(S), hence, for n ≥ n(S). Thus i.e., $\frac{|\mu(f_n)|}{\|f_n\|_{L^p(\mu)}} \le 4\mu(X \setminus S)^{1/p^*}$, which can be made as small as we wish, leading to a contradiction.
Problem 3. A more general problem is that of characterizing the probability measures λ on T × T such that which is conformally invariant as well, if we set $\Psi_*\lambda(A \times B) = \lambda(\Psi(A) \times \Psi(B))$, and reduces to the above for λ = µ ⊗ µ: No characterization of the measures λ for which (a) holds is known, to the best of our knowledge.

Problem 4.
A natural generalization of the conformal Hardy's inequality is to consider a p-version, or even a weighted version of it. Namely, given a weight function π : E → R+, characterize the positive Borel probability measures such that there exists a constant $[\mu]_{inv}$, possibly depending on µ, p, and π, such that There is a related, interesting, conformally invariant interpolation problem.
Given a subset Z ⊂ T , we say that it is universally interpolating for the seminorm · if i. z∈T w∈T ii. for all sequences {a(z)} z∈Z such that z∈T w∈T Condition (i) says that the measure λ = z,w∈T δ (z,w) d(z,w) satisfies (a). We call the sequence onto interpolating if just (ii) holds. We think that it is an interesting problem finding characterizations of universally, or onto, interpolating sequences.

7.1. Compactness.
We will briefly discuss the compactness conditions for the Hardy operator. As is natural to expect, the compactness of the Hardy operator corresponds to the "vanishing" versions of the conditions characterizing boundedness.
To see the necessity of the condition we can again work with I * µ courtesy of Schauder's Theorem. Suppose that (29) does not hold. Then there exist an ε > 0 and a sequence of edges $\{\alpha_k\}_k$ with $\lim_k d(\alpha_k) = \infty$ such that Then consider the sequence of testing functions $g_k := \mu(S(\alpha_k))^{-1/p^*}\chi_{S(\alpha_k)}$, which converges weakly to 0 in the space $L^{p^*}(\mu)$. By compactness we must have which contradicts the fact that the above quantity is bounded below by ε.
In a very similar way one can characterize the compactness of the Hardy operator in terms of a vanishing capacitary condition. We state the theorem without a proof in order to avoid repetition.
Problem 6. Find a simple characterization of measures µ such that the Hardy operator I : ℓ 2 (π) → L 2 (µ) belongs to the p−Schatten ideal.
In [45] Luecking has studied trace ideal criteria for Toeplitz operators on weighted Bergman spaces. It is very possible that some of his results apply also to our case, although his results are not complete.
Example 25 ((SB) does not imply (H)). Let T be a dyadic tree, π ≡ 1 and p = 2. At level $N_k = 2^k k$, choose $2^k$ vertices $\{z^k_n\}_{n=1}^{2^k}$, with children named $x^k_n$ and $y^k_n$, such that

Figure 2. A snapshot from the defined family of points and their relations: continuous lines represent edges and dashed lines paths of (many) edges. Next to some vertices their distance from the origin is specified in parentheses.

Let $\mu = \sum_{k,n} \frac{1}{M_k}\delta_{w^k_n}$. A simple reasoning shows that it suffices to verify the one-box condition at the nodes $z^k_n$: On the other hand, the mass-energy condition (ME) fails. To see this, denote by Z the minimal subtree containing all the points $z^k_n$. Then,

Problem 7. Find a characterization of those pairs (p, π) for which (SB) is not sufficient on the dyadic tree.

Two opposite examples.
In the generality in which we have stated it, the dyadic Hardy's inequality covers a variety of contexts; some of them very rich, some very poor. The richest context is in our opinion the unweighted case π = 1, corresponding to the classical Dirichlet space. We consider here two cases at the opposite extremes.
Example 26 (Boundary having null capacity). Consider the dyadic, infinite tree with the weight $\pi(\alpha) = 2^{-d(\alpha)}$ and p = 2. The reader familiar with martingale theory can fruitfully think of π(α) as the probability of a given outcome of d(α) tosses of a fair coin. Let us show that with this choice $\mathrm{Cap}_{2,\pi}(\partial T) = 0$. Let $g_N(\alpha) = \frac{1}{N}$ if d(α) ≤ N and $g_N(\alpha) = 0$ elsewhere. Then, It follows from the isocapacitary inequality (ISO) that ∂T does not support any Carleson measure for p = 2 and the chosen weight π. In particular, the Lebesgue measure $\mu_0(\partial S(\alpha)) = 2^{-d(\alpha)}$ does not define a Carleson measure. This fact is best appreciated having in mind basic martingale theory. Consider the filtration associated with the infinite tossing of a fair coin, where ∂T is the probability space and $\mu_0(\partial S(\alpha)) = 2^{-d(\alpha)}$ is the probability measure. A martingale for the filtration has the form $X_n(\zeta) = If(\alpha)$ for ζ ∈ ∂S(α) and d(α) = n. The martingale Hardy space $M^2$ contains those martingales for which i.e., $M^2 = \ell^2(\pi) \cap M$, where M is the space of all martingales. By definition, $\mu_0$ is a Carleson measure for $M^2$, but it is not for $\ell^2(\pi)$. The underlying reason is that in $M^2$ cancellations play a prominent role, and this greatly enlarges the class of the Carleson measures at the boundary. But suppose we ask a measure µ to be Carleson for the variation of the martingales in $M^2$, where the variation of X = If is $V(X)(\zeta) = \sum_{\alpha\in P(\zeta)} |f(\alpha)|$. It is easy to see that this is the same as asking µ to be Carleson for $\ell^2(\pi)$, hence µ = 0. This is reflected in the fact that functions in the classical Hardy space can have unbounded variation a.e. at the boundary of the unit disc [55,21].
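The vanishing of the capacity in Example 26 comes down to a one-line computation of the $\ell^2(\pi)$ norm of $g_N$. The sketch below carries it out exactly, under a counting convention that is an assumption here: edges are indexed by levels d = 1, ..., N with $2^d$ edges at level d. The function names are hypothetical.

```python
from fractions import Fraction

def energy_gN(N):
    """Sum over edges of g_N(alpha)**2 * pi(alpha), with pi(alpha) = 2**(-d(alpha)).

    Assumed convention: levels d = 1..N, with 2**d edges at level d.
    """
    return sum(2 ** d * Fraction(1, N) ** 2 * Fraction(1, 2 ** d) for d in range(1, N + 1))

def potential_gN(N):
    """Value of I g_N at a boundary point: N edges on the geodesic, each contributing 1/N."""
    return N * Fraction(1, N)
```

With this convention `energy_gN(N)` equals 1/N while `potential_gN(N)` equals 1, so $g_N$ is admissible for ∂T with energy tending to zero. A different indexing of the levels changes the energy only by a bounded factor.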
Example 27 (All boundary points have positive capacity). It is easy to see that for any tree T and any ζ ∈ ∂T , Cap σ p ({ζ}) = d σ (ζ) −1 . In fact, for any function f which is admissible for ζ, and the right handside is the ℓ p (σ)−norm of the admissible function taking constant value d σ (ζ) −1 on the edges α ⊃ ζ and zero elsewhere. Then, assuming that Cap σ p ({ζ}) ≥ c > 0 for all boundary points, is the same as saying that ∂T is bounded with respect to the distance d σ . This is the case, for example, for the weights σ(α) = 2 λd(α) with λ > 0. Under this assumption, all functions If with f ∈ ℓ p (π), are bounded on ∂T : from which follows that all bounded measures µ are Carleson for ℓ p (π).
Under this assumption, all functions If with $f \in \ell^p(\pi)$, where as usual $\pi = \sigma^{1-p}$, are continuous up to the boundary, with respect to the natural topology induced by the visual distance $d(\zeta, \xi) = e^{-d(\zeta\wedge\xi)}$: which tends to zero as ξ → ζ by dominated convergence. By the Weierstrass Theorem, all bounded measures µ are Carleson for $\ell^p(\pi)$.
8. Some variations on the structure
8.1. The viewpoint of reproducing kernels.
Let us recall that a reproducing kernel Hilbert space (RKHS) is a Hilbert space H of functions defined on a set X such that the point evaluation functionals are continuous or, equivalently, such that for any x ∈ X there exists an element $K_x \in H$ which fulfills the reproducing property It is easy to see that the function K : X × X → C given by is a kernel on H, that is, it is positive semi-definite, nonzero on the diagonal and satisfies $K(x, y) = \overline{K(y, x)}$. See [16] or [3] for a systematic treatment of the topic. Let $H_K$ be a RKHS of functions on a locally compact space X with continuous reproducing kernel K. A simple and well known "T*T argument" shows that the boundedness of the imbedding Id : $H_K \to L^2(\mu)$ (i.e., the fact that µ, a positive Borel measure on X, is a Carleson measure for $H_K$) can be rephrased in various ways in terms of integral inequalities on $L^2(\mu)$.
Lemma 28. Given a RKHS $H_K$ of functions on a locally compact space X with continuous reproducing kernel $K(x, y) = K_y(x)$, the following are equivalent for a Borel measure µ on X: (iv) $\int_X\int_X g(x)g(y)\,\mathrm{Re}\,K_y(x)\,d\mu(x)\,d\mu(y) \le \{\mu\}_2 \|g\|^2_{L^2(\mu)}$, for real (or even just positive) g, . Here the statements are assumed to hold for all f, g : X → C and the constants $\{\mu\}_1, \{\mu\}_2, \{\mu\}_3$, which are the best constants in the respective inequalities, might depend on µ and π but not on f, g. In particular, µ is a Carleson measure for $H_K$ if and only if it is a Carleson measure for $H_{\mathrm{Re}\,K}$.
The proof of Lemma 28 can be found with some variations in different sources [13,Proposition 4.9], [11,Lemma 24], [14, pp. 9-10]. The proof itself is a short "soft" analysis argument, and we write it here for convenience of the reader. The statement we give is an elaboration of various statements in the literature.
Proof of Lemma 28. (i) says that Id : $H_K \to L^2(\mu)$ is bounded with norm $\{\mu\}_1^{1/2}$, so Id* : $L^2(\mu) \to H_K$ is bounded with the same norm, where where in the last equality we used $K_x(y) = \overline{K_y(x)}$. Hence, showing that (i) and (ii) are equivalent. Testing (ii) on real g's we have that (iv) holds with $\{\mu\}_2 = \{\mu\}_1$. Conversely, if (iv) holds and $g = g_R + i g_I$ is the decomposition of g into real and imaginary parts, then , and we obtain the dual form of (i) with $\{\mu\}_1 \le 2\{\mu\}_2$. A similar reasoning works if we assume (iv) to hold for positive g's, but with a different constant.
It is clear that (vi) and (vii) are equivalent, and (vi) implies (ii) with $\{\mu\}_1 \le \{\mu\}_3$. In the other direction, set $Tg(x) = \int_X K_x(y)g(y)\,d\mu(y) = \mathrm{Id}^*g(x)$, where T is defined on $L^2(\mu)$, T = T* and T is positive, $\langle Tg, g\rangle_{L^2(\mu)} \ge 0$. Then it has a positive square root $\sqrt{T}$ and by (ii) Since Re K is positive definite, hence a reproducing kernel, (iii) and (v) are equivalent for the same reason (i) and (ii) are. Finally, (iii) clearly implies (iv) and, on the other hand, if (iv) holds, then

This lemma allows one to use methods from singular integral theory (where Re K is the kernel of the "singular" integral) on nonhomogeneous spaces. This point of view, at least in dyadic theory, started in [11] in connection with the problem of Carleson measures for the Drury-Arveson space, which we will mention below, and was then taken up by Tchoundja [63] and Volberg-Wick [67] in order to study more general function spaces on the unit ball.
What we aim at here, instead, is to give an interpretation of the conformally invariant inequality (CH) in terms of reproducing kernels. Indeed, we will provide such an interpretation for the even more general inequality (a) introduced in Problem 3. The idea is to read (a) as the imbedding inequality of an appropriate RKHS into $L^2(\mu \otimes \mu)$, where µ is a Borel measure on the rooted tree T. Lemma 28 would then provide various ways to reformulate inequality (a).
Let T be the rooted tree and π an arbitrary positive weight function. We first introduce the Dirichlet space $D(\pi) := \{F = If : f \in \ell^2(\pi)\}$, endowed with the inner product $\langle F, G\rangle_{D(\pi)} = \langle f, g\rangle_{\ell^2(\pi)} = \langle \nabla F, \nabla G\rangle_{\ell^2(\pi)}$, for F = If, G = Ig ∈ D(π). We claim that this (semi-)Hilbert space has a reproducing kernel, Indeed, for $f \in \ell^2(\pi)$ and F = If we have from which follows, It is then imprecise but harmless to say that D(π) is a RKHS. The inequality we are re-interpreting as a RKHS imbedding is (a) for λ = µ ⊗ µ, i.e., We are not quite done yet, since this inequality bounds the $L^2(\mu \otimes \mu)$ norm of the differences of a function with the (π = 1) Dirichlet (semi-)norm of the function itself. However, we argue here that, given a RKHS $H_K$ on a set X such that $1 \in H_K$ (assume $\|1\|_{H_K} = 1$), there is a canonical way to construct the RKHS of its differences, having as elements the functions $(x, y) \mapsto F(x) - F(y) =: \nabla F(x, y)$, with $F \in H_K$. We define the kernel κ : (X × X) × (X × X) → C as We show that κ reproduces the space $H_\nabla$ of the functions ∇F, endowed with the inner product (32) $\langle \nabla F, \nabla G\rangle_{H_\nabla}$ : which is well defined since ∇F = 0 if and only if F is constant, i.e., if and only if $F - 1\langle F, 1\rangle_{H_K} = 0$.
Lemma 29 in particular tells us that the space D ∇ of differences of functions in the Dirichlet space D = D(1) has reproducing kernel Inequality (a') represents then the boundedness of the imbedding D ∇ → L 2 (µ ⊗ µ), which by means of Lemma 28 admits various re-writings.
A picture shows that the definition of the kernel of D ∇ is independent of the choice of the root, as we know a priori by conformal invariance.
For many classical spaces of holomorphic functions, as far as their imbedding properties are concerned, one can replace the reproducing kernel with its absolute value without any loss. It is natural to ask whether the same applies here.

Problem 8. Is (a') equivalent to the imbedding in $L^2(\mu \otimes \mu)$ of the space having $|k_{(a,b)}(x, y)|$ as kernel?
As a comment to the above problem, we observe that the kernel $k_{(a,b)}(x, y)$ seems to present important cancellations, which might be an indication that tools from singular integral theory are needed in the characterization of the conformally invariant Hardy's inequality. Indeed, it is a simple exercise to check that for (a, b), (x, y) ∈ T × T and

8.2. Quotient structures.
Dyadic quotient structures appeared for the first time in [11], to the best of our knowledge, to deal with the problem of the Carleson measures for the Drury-Arveson space. Using the T*T argument outlined in Section 8.1, the problem was shown to be equivalent to the immersion Id : $H_K \to L^2(\mu)$ for a tree and a kernel which we are going to describe in a special case containing all essential information. Consider a 4-adic, rooted tree T, whose vertices x at level $d(o^*, x) = n$ might be labelled as 4-adic rationals $x = 0.t_1 \dots t_n$, with $t_j \in \mathbb{Z}/4\mathbb{Z}$ and an edge joining the parent $0.t_1 \dots t_{n-1}$ with the child $0.t_1 \dots t_{n-1}t_n$. Define similarly the dyadic tree U and consider the surjective map Φ : T → U induced by the map $[t]_{\mathrm{mod}\,4} \mapsto [t]_{\mathrm{mod}\,2}$, sending digits 0, 2 to binary digit 0, and digits 1, 3 to binary digit 1.
The map Φ is a root-preserving tree epimorphism: it is surjective, and Φ(x) and Φ(y) are joined by an edge in U if and only if x and y are joined by an edge in T . In other words, we have defined a quotient structure U = T /Φ on T .
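The digit-wise description of Φ is easy to make concrete. In the sketch below, which is only an illustration, a vertex $0.t_1 \dots t_n$ of T is encoded as the tuple of its 4-adic digits; this encoding and the helper names `phi` and `parent` are assumptions made for the example.

```python
from itertools import product

def phi(x):
    """Quotient map T -> U: reduce every 4-adic digit modulo 2."""
    return tuple(t % 2 for t in x)

def parent(x):
    """Parent of a non-root vertex: drop the last digit."""
    return x[:-1]

def fiber_sizes(n):
    """Number of level-n vertices of T mapped to each level-n vertex of U."""
    sizes = {}
    for x in product(range(4), repeat=n):
        sizes[phi(x)] = sizes.get(phi(x), 0) + 1
    return sizes
```

For example, `phi((0, 2, 1, 3))` is `(0, 0, 1, 1)`, the map commutes with taking parents (so edges of T are sent to edges of U), and each vertex of U at level n has exactly $2^n$ preimages.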
We define a kernel K G on T by, , which can be proved to be positive definite, hence defining a RKHS H K G . The wedge ∧ G is a modified version of the wedge we have used so far. For the exact definition the reader is referred to [11].
It is not clear if one needs to introduce the modified wedge ∧ G in order for the above theorem to hold. Thus the following problem remains open. Problem 9. Is it true that Theorem 30 remains true if we replace the kernel K G with the kernel The real part of the reproducing kernel of the Drury-Arveson space can be naturally written down as the quotient of two kernels, which reflect this stratification. Passing to dyadic decompositions, this leads to the kernels K G and K we have just described, and the Carleson measure problem for the Drury-Arveson space can be reduced to the theorem stated above.
We have seen that conditions similar to those in the theorem also provide alternative characterizations of the measures µ satisfying the Hardy inequality, at least when p = 2. We think that there are here some interesting questions for further investigation. Problem 10. Is it possible to have a characterization of the Carleson measures for H K in terms of the potential theory associated with the kernel K?
8.3. Product structures: poly-trees.
The dyadic tree T parametrizes the set of the dyadic subintervals of [0, 1], and the corresponding product structure $T^d$ is defined to parametrize Cartesian products $R = I_1 \times \dots \times I_d$ of such intervals: dyadic rectangles for d = 2, etcetera. Following the same lines of Section 2.1, a potential theory can now be defined on $T^d$ by taking tensor products of everything in sight, as we will detail below. This leads to a natural extension of the Hardy's inequality to the multi-parameter setting. In this situation, however, characterizing trace measures is a much more complicated problem. We remark that the poly-tree is not a tree, but a graph with cycles, and this creates new and major difficulties. So far, solutions to the problem are known for σ ≡ 1, p = 2 and for dimension d = 2, 3 only [7], [5], [49]. It is also known [48] that the techniques used in these works cannot be extended to d = 4 and p = 2. Let us briefly expand on that.
Once we have the kernel k, the general theory [2] provides us also with definitions of potentials and energies of measures, and set capacities of compacta K ⊆ $T^d$, as exposed in Section 2.1. We can hope at this point that the capacitary estimate does the job, Here a major difficulty appears: the standard proofs of (SCI) depend, more or less explicitly, on the boundedness principle for potentials of measures, $\sup\{V^{\mu,\sigma}_p(x) : x \in T^d\} \le B \cdot \max\{V^{\mu,\sigma}_p(x) : x \in \mathrm{supp}(\mu)\}$, but in the multi-parameter situation this principle fails.
The idea is to have a set K which is rarefied, but "curved" in such a way that many "not too thin" rectangles join it to the point x, like rays focusing on it.
The way out of this difficulty, implemented in [7] for d = 2, p = 2 and σ ≡ 1, is proving a distributional boundedness principle.
Theorem 32. There is C > 0 such that for λ > 1 and for an equilibrium measure µ, His very clever proof does not extend to the tri-parameter case. Moreover, his inequality is not dyadic, and covers the facts here surveyed only in the case of the trivial homogeneous rooted tree N.
This result was then extended to d = 3, but no higher, in [49]. With some major difficulty, the capacitary characterization of the measures for which the multiparameter, dyadic Hardy inequality holds is true at least for d = 2, 3. What about the other characterizations and proofs?
It is proved in [5] that a mass-energy condition holds as well in d = 2, and in [49] this was extended to d = 3, always for p = 2. More precisely, for d = 2, 3 This fact might surprise practitioners of the Hardy space on the bidisc. It was proved in [24] that Carleson measures for the Hardy space on the bidisc are not characterized by a "single-box condition" such as (ME⊗), and A. Chang proved in [27] that the characterization holds if one allows multiple boxes. One might expect that a multiple box condition like $\sum_{\beta\subseteq\bigcup_{j=1}^n \alpha^{(j)}} \mu(S(\beta))^2 \le [\mu]_{mult}\,\mu\big(\bigcup_{j=1}^n S(\alpha^{(j)})\big) < \infty$, for all $\alpha^{(1)}, \dots, \alpha^{(n)} \in E^d$, might not be weakened, but in fact this is not the case. The proofs we surveyed for the one-parameter Hardy operator seem not to work in the multi-parameter case. The simple maximal proof, for instance, does not work because, contrary to the usual dyadic, weighted maximal function, its multi-parameter versions are not necessarily weakly bounded on $L^1$, nor are they bounded on $L^2$. In the unweighted case, the $L^2$ boundedness of the multiparameter maximal function was proved in [38], and a nice account of multiparameter theory with applications to martingales and the Hardy space is in [33].
Problem 11. It would be interesting to know whether, like in the one parameter case, for 1 s < ∞, Carleson measure for $B^p_a$ is a positive Borel measure $\tilde\mu$ on the unit disc D of the complex plane for which there exists a constant $C(\tilde\mu) < \infty$ such that, for all f holomorphic on D, with 0 ≤ ap < 1. Such problems appear in connection with the characterization of multipliers and of exceptional sets at the boundary for spaces of holomorphic functions, sequences of interpolation, and more. The first result is by Stegenga [60], who characterized such measures for p = 2 in terms of a condition involving Riesz capacities of compact subsets of the unit disc. The roots of Stegenga's work can be traced back to earlier work of Maz'ya [46] and Adams [1]. For the case 1 < p < ∞, a similar characterization of Carleson measures in terms of nonlinear Riesz capacities was later obtained by Verbitskii [65] and rediscovered by Wu [68].
More recently it was proved in [40,9] that the Carleson inequality for Besov spaces is equivalent to a dyadic Hardy's inequality. More precisely, for a given dyadic interval $I = I_{n,j} \in D$, let Q(I) be the set of points $z = re^{i\theta}$ with θ/2π ∈ I and $1 - 2^{-n} \le r \le 1 - 2^{-n-1}$. Set also $\mu(I) := \tilde\mu(Q(I))$ and $\pi(I) = 2^{-an}$. Then $\tilde\mu$ is a Carleson measure for $B^p_a$, 0 ≤ ap < 1, if and only if the triple p, µ, π satisfies the dyadic Hardy's inequality (33). Motivated by this application we shall call Carleson measures (or trace measures) all measures µ : D → R+ satisfying the Hardy's inequality on trees (H).
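The passage from the disc to the tree amounts to locating each point in its box Q(I). The following sketch computes the indices of that box; it assumes, beyond what is stated in the text, that $I_{n,j} = [j 2^{-n}, (j+1) 2^{-n})$, and it assigns points on a shared radial boundary to the outer box. The function names `box_index` and `weight` are hypothetical.

```python
import math

def box_index(r, theta):
    """Return (n, j) such that z = r*exp(i*theta) lies in Q(I_{n,j}).

    Assumes I_{n,j} = [j/2**n, (j+1)/2**n) and 0 <= r < 1.
    """
    if not 0 <= r < 1:
        raise ValueError("r must satisfy 0 <= r < 1")
    # 1 - 2**-n <= r < 1 - 2**-(n+1)  is equivalent to  n <= -log2(1 - r) < n + 1.
    n = math.floor(-math.log2(1 - r))
    frac = (theta / (2 * math.pi)) % 1.0
    # Guard against frac rounding up to 1.0 for tiny negative angles.
    j = min(math.floor(frac * 2 ** n), 2 ** n - 1)
    return n, j

def weight(n, a):
    """The weight pi(I) = 2**(-a*n) attached to a dyadic interval of generation n."""
    return 2.0 ** (-a * n)
```

Summing the mass of a discretized measure on the disc over the points sharing the same `(n, j)` would then give the tree measure µ(I) described above.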
The dyadic setting is very flexible. The same inequality (33), with different choices of the weight π, can be used to characterize Carleson measures for holomorphic spaces in several dimensions or for spaces of harmonic functions, trace inequalities for potential spaces, and more. Many such problems, in fact, can be proven to be equivalent to their dyadic counterparts, and often (33) is the form they assume.

9.2. Bessel potentials on the boundary of the dyadic tree.
The content of this section is specific to the homogeneous tree. For simplicity, we consider only the dyadic case, but everything we say applies, mutatis mutandis, to homogeneous trees of any degree.
Our objective in this section is to introduce, using as usual the axiomatic theory of Adams and Hedberg, a seemingly different potential theory on the boundary of the dyadic tree which depends on two parameters p and s. Subsequently we will use the inequality of Muckenhoupt and Wheeden in order to prove that it is "equivalent" to the potential theory introduced in Section 2.1 for the same parameter p and a particular choice of the weight π. Equivalent means that for compact sets that lie on the boundary of the dyadic tree, the capacities of the sets measured by means of the two different potential theories are comparable.

$\approx \sum_{\alpha} \frac{\mu(S(\alpha))^{p^*}}{|\alpha|^{sp^*-1}} = E_{p,\pi}(\mu)$, where $\pi(\alpha) = |\alpha|^{\frac{1-sp^*}{1-p^*}}$. Notice that we have used the Muckenhoupt-Wheeden inequality (MW) in the last step.
Therefore, since the energies associated to a positive Borel measure via the two different potential theories are comparable, we can conclude that the Cap p,π capacity of a compact subset of the boundary of the dyadic tree and its s−Bessel capacity are comparable. In particular compact sets of zero Cap p,π capacity coincide with those of zero s−Bessel capacity.