Abstract
We investigate three aspects of weak* convergence of the n-step distributions of random walks on finite volume homogeneous spaces of semisimple real Lie groups. First, we look into the obvious obstruction to the upgrade from Cesàro to non-averaged convergence: periodicity. We give examples where it occurs and conditions under which it does not. In a second part, we prove convergence towards Haar measure with exponential speed from almost every starting point. Finally, we establish a strong uniformity property for the Cesàro convergence towards Haar measure for uniquely ergodic random walks.
2010 Mathematics Subject Classifications:
1. Introduction
Let G be a real Lie group and Γ a lattice in G, that is, a discrete subgroup of G such that the homogeneous space $X = G/\Gamma$ admits a G-invariant Borel probability measure $m_X$. This measure $m_X$ is unique and we refer to it as the (normalized) Haar measure on X. A good example to have in mind is $G = \mathrm{SL}_d(\mathbb{R})$ and $\Gamma = \mathrm{SL}_d(\mathbb{Z})$.
The objects of study in this paper are random walks on X, given by probability measures µ on G: A step corresponds to randomly choosing a group element $g$ according to µ and then moving from the current location $x \in X$ to $gx$. Starting at $x_0 \in X$, the distribution of the location after n steps is given by the convolution
$$\mu^{*n} * \delta_{x_0}, \tag{1}$$
which is the push-forward of the product measure $\mu^{\otimes n} \otimes \delta_{x_0}$ under the multiplication map $(g_n, \dots, g_1, x) \mapsto g_n \cdots g_1 x$.
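To make the definition of the n-step distribution concrete, here is a minimal Python sketch on a finite toy model. The space $\mathbb{Z}/5\mathbb{Z}$, the two affine maps carrying the measure `MU`, and all function names are illustrative assumptions and do not come from the article. The sketch only checks that iterating the one-step convolution gives the same measure as pushing the product measure forward under the multiplication map.

```python
from fractions import Fraction
from itertools import product

# Toy model (an assumption for illustration): X = Z/5Z, and each "group
# element" is an invertible affine map x -> a*x + b mod 5, stored as (a, b).
MODULUS = 5
MU = {(2, 1): Fraction(1, 2), (1, 3): Fraction(1, 2)}  # finitely supported mu


def act(g, x):
    """Apply the affine map g = (a, b) to the point x."""
    a, b = g
    return (a * x + b) % MODULUS


def step(dist):
    """One step of the walk: replace the distribution dist by mu * dist."""
    new = {}
    for x, px in dist.items():
        for g, pg in MU.items():
            y = act(g, x)
            new[y] = new.get(y, Fraction(0)) + pg * px
    return new


def n_step_distribution(x0, n):
    """Compute mu^{*n} * delta_{x0} by iterating the one-step convolution."""
    dist = {x0: Fraction(1)}
    for _ in range(n):
        dist = step(dist)
    return dist


def pushforward_of_product(x0, n):
    """Compute the same measure as the push-forward of the n-fold product
    of mu (times delta_{x0}) under (g_n, ..., g_1, x) -> g_n ... g_1 x."""
    dist = {}
    for word in product(MU, repeat=n):  # word = (g_1, ..., g_n)
        weight = Fraction(1)
        x = x0
        for g in word:  # g_1 is applied first, g_n last
            weight *= MU[g]
            x = act(g, x)
        dist[x] = dist.get(x, Fraction(0)) + weight
    return dist
```

Exact rational arithmetic is used so that the two computations can be compared with `==` instead of a floating point tolerance.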
The broader context in which the study of these random walks originated is that of subgroup actions on homogeneous spaces. After Ratner's treatment of the rigidity and asymptotic properties of unipotent actions in her celebrated series of articles [Citation21–24], a new approach was needed to understand the dynamics of non-unipotent actions. Passing from a deterministic to a probabilistic point of view turned out to be a particularly fruitful angle. Still, understanding the long-term behaviour of random walks on homogeneous spaces and the limiting behaviour of the n-step distributions (1) is a notoriously difficult problem. Major contributions to this line of study were made e.g. by Eskin–Margulis in their work on non-divergence [Citation15], and by Benoist–Quint in their breakthrough series of articles [Citation4,Citation6–8]. We reproduce one of the main results of [Citation8] as a motivating example. For the statement, recall that a probability measure ν on X is called homogeneous if there exists a closed subgroup H of G and a point $x \in X$ such that $\operatorname{supp}(\nu) = Hx$ is a closed orbit and ν is H-invariant.
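Theorem 1.1 (Benoist–Quint [Citation8])
Let µ be a compactly supported probability measure on G. Denote by $\mathcal{S}$ and $\Gamma_\mu$ the closed subsemigroup and subgroup of G generated by $\operatorname{supp}(\mu)$, respectively, and suppose that the Zariski closure of $\operatorname{Ad}(\Gamma_\mu)$ in $\operatorname{Aut}(\mathfrak{g})$, where $\mathfrak{g}$ is the Lie algebra of G, is Zariski connected, semisimple, and has no compact factors. Then for every $x_0 \in X$ there is a homogeneous probability measure $\nu_{x_0}$ on X with $\operatorname{supp}(\nu_{x_0}) = \overline{\Gamma_\mu x_0}$ and such that
$$\frac{1}{n}\sum_{k=0}^{n-1} \mu^{*k} * \delta_{x_0} \longrightarrow \nu_{x_0} \tag{2}$$
as $n \to \infty$ in the weak* topology.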
Here the weak* convergence (2) more explicitly means that for every compactly supported continuous function $f \in C_c(X)$ we have
$$\frac{1}{n}\sum_{k=0}^{n-1} \int_G f(g x_0)\, d\mu^{*k}(g) \longrightarrow \int_X f\, d\nu_{x_0}$$
as $n \to \infty$. Recently, it was shown by Bénard–de Saxcé [Citation3] that the compact support assumption on µ in Theorem 1.1 can be relaxed to a finite first moment assumption; see Remark 2.7. Another recent generalization of the theorem above in joint work of the author with Sert and Shi [Citation19] replaces the algebraic assumption on the support of µ by a certain expansion condition, which allows for cases in which µ is e.g. supported on a parabolic subgroup of a semisimple group.
Some questions left open by Theorem 1.1 are listed by Benoist–Quint at the end of their survey [Citation5]. A major one is the following.
Question 1.2
In the setting of Theorem 1.1, is it also true that
$$\mu^{*n} * \delta_{x_0} \longrightarrow \nu_{x_0} \tag{3}$$
as $n \to \infty$?
Answers are available only in special cases: Breuillard [Citation11] established (3) for certain measures µ supported on unipotent subgroups, Buenger [Citation12] proved it for some sparse solvable measures, and in previous work the author dealt with the case of spread out measures [Citation18]. Very recently, Bénard [Citation2] observed that (3) holds for aperiodic measures µ under the assumption that µ has two convolution powers which are not mutually singular.
The purpose of this article is to discuss three (largely independent) aspects of random walk convergence related to Theorem 1.1 and Question 1.2, mainly having in mind the case that G is a semisimple real Lie group. We are going to use the following terminology.
Definition 1.3
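Let ν be a probability measure on X and $x_0 \in X$. We say that the random walk on X given by µ converges to ν on average (resp. converges to ν) from the starting point $x_0$ if $\frac{1}{n}\sum_{k=0}^{n-1} \mu^{*k} * \delta_{x_0} \to \nu$ (resp. $\mu^{*n} * \delta_{x_0} \to \nu$) as $n \to \infty$ in the weak* topology.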
Convergence on average is also commonly referred to as Cesàro convergence. We use the two terms interchangeably.
The article is organized as follows.
In Section 2, we look into the obvious obstruction to the upgrade from Cesàro convergence to (non-averaged) convergence: periodicity. We show in Example 2.1 how (3) can fail when $x_0$ has finite orbit under $\Gamma_\mu$. Using a product construction, we can also produce a counterexample in which the orbit closure $\overline{\Gamma_\mu x_0}$ has positive dimension (Example 2.2). In both cases, the periodic behaviour occurs at the level of the connected components of the orbit closure. As it turns out, this is no coincidence: If, in the setting of Theorem 1.1, the orbit closure $\overline{\Gamma_\mu x_0}$ is connected, there can be no periodicity (Theorem 2.5) and we can show that the Cesàro convergence (2) also holds along arithmetic progressions (Corollary 2.8).
In Section 3, we establish effective convergence of random walks to the normalized Haar measure for typical starting points
: When
generates a Zariski dense subgroup of a semisimple real Lie group G without compact factors, for any fixed
-function f on X the convergence
not only holds but is in fact exponentially fast for
-almost every
(Theorem 3.2, Proposition 3.4). The proof relies on an
-spectral gap of the convolution operator
acting on measurable functions on X. Taking into account regularity of the function f, the above can be further strengthened to the statement that almost every
is exponentially generic (Definition 3.12): Up to a constant factor depending on derivatives of f, the exponential speed of convergence holds uniformly over all compactly supported smooth functions (Theorem 3.13). Key to this upgrade are the definition of suitable Sobolev norms and a functional analytic argument involving relative traces, first exploited in a dynamical context by Einsiedler–Margulis–Venkatesh [Citation13].
Finally, in Section 4 we prove that convergence on average to $m_X$ happens locally uniformly in $x_0$ in a strong way when the random walk is uniquely ergodic and admits a Lyapunov function (Theorem 4.13). For example, this is the case when G is a connected semisimple real algebraic group and $\operatorname{supp}(\mu)$ generates a non-discrete Zariski dense subgroup, and also in the setup of Simmons–Weiss [Citation27], which has connections to Diophantine approximation problems on fractals. To this end, we introduce the new concept of
-uniform recurrence (Definition 4.10), which refines recurrence properties of random walks previously studied in [Citation6,Citation15].
1.1. Standing assumptions & notation
As many of our arguments work in greater generality, in the remainder of the article we will relax the assumptions stated at the beginning of this introduction. The following setup shall be in place whenever nothing else is specified: G is a locally compact σ-compact metrizable group acting ergodically on a locally compact σ-compact metrizable space X endowed with a G-invariant probability measure $m_X$; and µ is a Borel probability measure on G.
2. Periodicity
In this section, we start with two simple counterexamples to (3), which illustrate ways in which a random walk may exhibit periodic behaviour (Section 2.1). Analysing these examples for their common feature, we are led to a simple condition ensuring aperiodicity, stated and proved in Section 2.2.
2.1. Examples
The first example with periodicity is on finite periodic orbits. In the following, for we denote by
the
-identity matrix.
Example 2.1
Consider the principal congruence lattice
in
. Being the kernel of the reduction homomorphism from
to
, we recognize
as a finite-index normal subgroup of
. In particular,
is a lattice in G. Let
with
Then the closed subgroup
generated by
is
, which is Zariski dense in G. The
-orbit of
is
with transitions as shown in the following diagram:
Consequently, we see that the random walk with starting point alternates between the two sets
The 2-step random walks on these sets constitute irreducible, aperiodic, finite state Markov chains, so that
as
in the weak* topology.
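The alternation between two sets can be checked by hand in a small finite model. The Python sketch below assumes, purely for illustration, the quotient $\mathrm{SL}_2(\mathbb{Z}/2\mathbb{Z})$ with six elements and the measure giving mass 1/2 to each of the two elementary unipotent matrices. These choices and all identifiers are hypothetical and need not coincide with the data of Example 2.1. The sketch compares the n-step distributions with their Cesàro averages.

```python
from fractions import Fraction

# Assumed toy data (not taken from the example): 2x2 matrices mod 2, stored
# as tuples (a, b, c, d) for the matrix [[a, b], [c, d]].
P = 2
IDENTITY = (1, 0, 0, 1)
U = (1, 1, 0, 1)  # upper unipotent matrix
V = (1, 0, 1, 1)  # lower unipotent matrix
MU = {U: Fraction(1, 2), V: Fraction(1, 2)}


def mul(g, h):
    """Matrix product g*h with entries reduced mod P."""
    a, b, c, d = g
    e, f, k, l = h
    return ((a * e + b * k) % P, (a * f + b * l) % P,
            (c * e + d * k) % P, (c * f + d * l) % P)


def step(dist):
    """One step of the walk: x -> g*x with g distributed according to MU."""
    new = {}
    for x, px in dist.items():
        for g, pg in MU.items():
            y = mul(g, x)
            new[y] = new.get(y, Fraction(0)) + pg * px
    return new


def distributions(x0, n_steps):
    """Return the list of n-step distributions for n = 0, ..., n_steps."""
    dists = [{x0: Fraction(1)}]
    for _ in range(n_steps):
        dists.append(step(dists[-1]))
    return dists


def cesaro_average(dists):
    """Average the given distributions with equal weights."""
    avg = {}
    for dist in dists:
        for x, p in dist.items():
            avg[x] = avg.get(x, Fraction(0)) + p / len(dists)
    return avg


dists = distributions(IDENTITY, 200)
even_support = set(dists[200])  # points charged after an even number of steps
odd_support = set(dists[199])   # points charged after an odd number of steps
```

In this model both matrices reduce to involutions, so the walk alternates between two disjoint three-point sets. The n-step distributions therefore cannot converge, while `cesaro_average(dists[:200])` is close to the uniform measure on all six points.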
In the example above, the support of µ generates a Zariski dense subgroup of G and the lattice Γ in G is irreducible. (Recall that, loosely speaking, ‘irreducibility’ of Γ means that it does not arise from a product construction, cf. [Citation20, Definition 5.20]). By the work of Benoist–Quint [Citation8, Corollary 1.8], these properties force any orbit closure to be either finite or all of X. As soon as intermediate orbit closures are possible, however, one can also construct examples with periodic behaviour on non-discrete orbit closures.
Example 2.2
Let G, Γ, ,
,
and
be as in Example 2.1 and choose a diagonal matrix
such that the diagonal entries of
are irrational. We are going to consider the random walk on the product space
given by the probability measure
on
with
The (closed) subgroup generated by the support of this measure µ is given by
. Indeed, the correct entry in the second copy of G can be arranged using a finite product of
, and then the entry in the first copy can be corrected using
. By Theorem 1.1 we thus know that for the starting point
we have the weak* convergence
as
, where
is the homogeneous probability measure on the closure of the
-orbit of
. (Recall that it makes no difference for the closure whether one considers the orbit under the generated subgroup or subsemigroup.)
Let us identify this orbit closure. In the first copy of X, we recognize the finite orbit from Example 2.1. In the second copy, we see the action of irrational conjugates of
. As the acting group has product structure, the orbit closure in question is the product of these two orbit closures in the components:
Since the orbit
is infinite by our choice of the matrix a, it follows from [Citation8, Corollary 1.8] that
, so that
for the normalized counting measure
on
and the normalized Haar measure
on X. However, in analogy to Example 2.1, the random walk is found to alternate between the sets
in the sense that
and
for all
. Hence, we conclude that the random walk starting from
does not converge to
.
Remark 2.3
The same behaviour as in the previous example can be arranged inside a homogeneous space that is the quotient of a semisimple real Lie group
by an irreducible lattice
. Indeed, this is only a matter of choosing suitable embeddings
and
, where G and X are as in Example 2.2. Concretely, one can e.g. consider the
-congruence lattice
in
and the diagonal embeddings
We therefore see that Example 2.2, i.e. periodic behaviour on a non-discrete orbit closure, can be realized inside
. Of course, after applying this embedding, the subgroup generated by the support of µ will no longer be Zariski dense in
.
2.2. An aperiodicity criterion
Inspecting the examples above, one may notice that their common salient feature is that the orbit closure is disconnected. This naturally raises the question whether periodic behaviour can also occur when this orbit closure is connected. In what follows, we answer this question in the negative. We shall use the following formalization of periodicity.
Definition 2.4
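Assume that the random walk on X given by µ converges on average to a probability measure ν on X from the starting point $x_0 \in X$. We say that this convergence is periodic if there exists an integer $d \ge 2$ and pairwise disjoint measurable subsets $A_0, \dots, A_{d-1} \subset X$ with $\nu(\partial A_i) = 0$ for $0 \le i < d$ and such that $(\mu^{*(nd+i)} * \delta_{x_0})(A_i) = 1$ for every $n \ge 0$ and $0 \le i < d$. Otherwise, we call the convergence aperiodic.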
The requirement on the boundaries of the sets is needed to ensure that the cyclic behaviour is witnessed by the limit measure ν. Without a condition of this sort, one could try to artificially define
as the set of all points in X that can be reached from
precisely in
steps. Indeed, this construction is possible for example when µ is finitely supported with the property that its support freely generates a discrete subsemigroup
of G and the starting point
has a free
-orbit. The latter is the case e.g. for
,
with
and
, and
for a diagonal matrix
such that the diagonal entries of
are irrational.
We are now ready to state the announced aperiodicity theorem.
Theorem 2.5
Retain the notation and assumptions from Theorem 1.1 and let be such that the orbit closure
is connected. Then the Cesàro convergence to
of the random walk on X given by µ starting from
is aperiodic.
For the proof we need the following simple lemma.
Lemma 2.6
Let H be a Zariski connected real algebraic group and S a subset of H generating a Zariski dense subsemigroup. Then for every , also the d-fold product set
generates a Zariski dense subsemigroup of H. In particular, if
generates a Zariski dense subsemigroup for some probability measure µ on H, the same is true for
.
Proof.
Let be a non-empty Zariski open subset and consider the map
. Since ϕ is Zariski continuous,
is Zariski open. Moreover, this preimage is non-empty because U is dense in the Lie group topology and ϕ is a diffeomorphism near the identity. By the assumption that S generates a Zariski dense subsemigroup, we can thus find an element
that is the product of finitely many elements of S. It follows that
lies in the intersection of U with the subsemigroup generated by
.
The second claim involving µ immediately follows from the above together with the inclusion .
Proof of Theorem 2.5
Suppose is an integer such that there are pairwise disjoint
with
for all
and such that
for all
as in the definition of periodicity. We have to show that d = 1.
First note that from Theorem 1.1 and the properties of the sets it follows that
(4)
(4) where the application of weak* convergence to the set
is justified since it has negligible boundary with respect to the limit measure
. In view of Lemma 2.6, Theorem 1.1 also applies to the d-step random walk given by
. Assuming for the moment that the limit measure for this d-step random walk starting from
coincides with
, we deduce that
(5)
Together, (4) and (5) imply d = 1, the desired conclusion.
It thus remains to show that the d-step random walk starting from does indeed have the same limit measure as the 1-step random walk. Denoting by
and
the closed subsemigroups of G generated by
and
, respectively, this statement is equivalent to the equality
of orbit closures. To prove this, let
be arbitrary. We claim that
Indeed, since
is homogeneous, it is invariant under the group generated by
. As
clearly contains
, the inclusion ‘
’ follows. For the reverse inclusion let
for some
. Choose
such that
. Then
and hence
, giving the claim.
We already noted that Theorem 1.1 applies to . In particular, the orbit closure
and its translates by
,
, are submanifolds of
. Necessarily, all these translates have the same dimension, and since together they make up
by the claim above, their shared dimension coincides with that of
. This implies that
is open in
. However, it is also closed, so that the assumed connectedness of
forces
. This completes the proof.
Remark 2.7
It was recently shown by Bénard–de Saxcé [Citation3] that the compact support assumption on µ in Theorem 1.1 can be relaxed. Indeed, their [Citation3, Theorem C] establishes the same conclusion under the substantially weaker assumption that µ has a finite first moment, meaning that
Relying on this stronger result, also our Theorem 2.5 above and Corollary 2.8 below are seen to hold under a finite first moment assumption on µ, instead of requiring compact support as in Theorem 1.1.
We end this section by recording a corollary of the proof above.
Corollary 2.8
Retain the notation and assumptions from Theorem 1.1 and suppose that is connected. Let
and denote by
the closed subsemigroup of G generated by
. Then
, and for the homogeneous probability measure
on this orbit closure we have for arbitrary
that
(6)
(6) as
in the weak* topology.
Proof.
The statement about orbit closures was established as part of the proof of Theorem 2.5. From Theorem 1.1 we thus get the weak* convergence
(7)
which is (6) for r = 0. Given
, the general case follows by applying (7) to the compactly supported continuous function
defined by
for
.
This corollary sharpens the convergence statement in Theorem 1.1 in the case of a connected orbit closure: The Cesàro convergence to holds along arbitrary arithmetic progressions. Although this does not provide an answer to Question 1.2, it at least allows the following conclusion to be drawn: If
is a sequence of indices such that
converges to a weak* limit different from
as
, then
cannot contain a density 1 subset of an infinite arithmetic progression.
3. Spectral gap
In this section, we will explain how a spectral gap of the convolution operator associated to a random walk entails the convergence of
towards
for
-a.e.
. In its simplest form, the involved argument works in great generality and also produces an exponential rate of convergence from almost every starting point when the test function f is fixed. This is done in Section 3.1. The following Sections 3.2–3.4 are dedicated to a substantial refinement of this spectral gap argument for random walks on homogeneous spaces of real Lie groups, making the exponentially fast convergence uniform over smooth test functions.
3.1. Generic points
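Recall that the convolution operator $\pi(\mu)$ is defined by $\pi(\mu)f(x) = \int_G f(gx)\, d\mu(g)$ for bounded measurable functions $f$ on X and $x \in X$, and that it extends to a continuous contraction on each $L^p(X, m_X)$-space (see [Citation9, Corollary 2.2]). We shall study its behaviour on $L^2(X, m_X)$. By ergodicity, the G-fixed functions are the constant functions, so we restrict our attention to their orthogonal complement $L^2_0(X, m_X)$ of $L^2$-functions with mean 0.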
Definition 3.1
We say that µ has a spectral gap on X if the associated convolution operator $\pi(\mu)$ restricted to $L^2_0(X, m_X)$ has spectral radius strictly less than 1.
We are going to use the notation $\rho(T)$ to denote the spectral radius of an operator T. Then by the spectral radius formula, µ having a spectral gap on X can be reformulated as the requirement that
$$\rho\bigl(\pi(\mu)|_{L^2_0}\bigr) = \lim_{n \to \infty} \bigl\|\pi(\mu)^n|_{L^2_0}\bigr\|^{1/n} < 1.$$
Given the existence of a spectral gap, we obtain an almost everywhere convergence result in a quite general setup.
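As a sanity check on the spectral radius formula, the following Python sketch computes the quantities $\|P^n f_0\|_2^{1/n}$ for a lazy random walk on the cyclic group $\mathbb{Z}/m\mathbb{Z}$. This finite stand-in is an assumption chosen for illustration and is not one of the spaces considered in the article. For this particular walk the averaging operator `averaging_operator` is symmetric with eigenvalues $\tfrac12 + \tfrac12\cos(2\pi k/m)$, so its spectral radius on mean-zero functions is $\tfrac12 + \tfrac12\cos(2\pi/m) < 1$.

```python
import math

M = 12  # size of the cyclic group Z/MZ (an arbitrary illustrative choice)


def averaging_operator(f):
    """Apply P f(x) = f(x)/2 + f(x+1)/4 + f(x-1)/4 on Z/MZ."""
    return [0.5 * f[x] + 0.25 * f[(x + 1) % M] + 0.25 * f[(x - 1) % M]
            for x in range(M)]


def l2_norm(f):
    """L^2 norm with respect to the uniform probability measure on Z/MZ."""
    return math.sqrt(sum(v * v for v in f) / M)


def remove_mean(f):
    """Project f onto the functions with mean zero."""
    mean = sum(f) / M
    return [v - mean for v in f]


def decay_rates(f, n_max):
    """Return ||P^n f0||_2^(1/n) for n = 1, ..., n_max, where f0 is the
    mean-zero part of f."""
    g = remove_mean(f)
    rates = []
    for n in range(1, n_max + 1):
        g = averaging_operator(g)
        rates.append(l2_norm(g) ** (1.0 / n))
    return rates


# Known spectral radius on mean-zero functions for this particular walk.
SPECTRAL_RADIUS = 0.5 + 0.5 * math.cos(2 * math.pi / M)

indicator = [1.0 if x == 0 else 0.0 for x in range(M)]
rates = decay_rates(indicator, 300)
```

The entries of `rates` approach `SPECTRAL_RADIUS` from below as n grows, which is the finite-dimensional shadow of the exponential decay exploited in the sequel. In a finite model every point has positive measure, so the distinction between "almost every" and "every" starting point is invisible here.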
Theorem 3.2
Suppose that µ has a spectral gap on X. Then -a.e.
is generic for the random walk on X given by µ, meaning that
as
in the weak* topology. This convergence is exponentially fast in the sense that for every fixed
we have
(8)
(8) for
-a.e.
.
Proof.
By separability of , for the statement about weak* convergence it suffices to prove
-a.s. convergence for one fixed function
. Consequently, it is enough to prove the second assertion of the theorem. To this end, fix a function
and a rational number
, and consider the
-function
. Then in view of the spectral radius formula we have
for sufficiently large
.
Fix in addition a rational number . By Chebyshev's inequality, the above implies that for large n we have
By Borel–Cantelli it follows that for all x in a full measure set
, the inequality
holds only for finitely many
. Since
, we conclude that (8) holds for all x in a countable intersection of the sets
over rational numbers α approaching
and ε approaching 0 from above.
Remark 3.3
In the second conclusion of Theorem 3.2, how long it takes for the exponential rate of convergence to kick in depends on the point x. However, the measure of sets on which one has to wait for a long time can be controlled as follows: Given , choose
such that
for all
. Then if we additionally take
and denote
the proof above gives the bound
for every
. In particular, the measure of the set on which the exponential convergence does not start during the first n steps decays exponentially in n.
We now demonstrate that the previous result covers the case announced in Section 1.
Proposition 3.4
Let G be a connected semisimple real Lie group without compact factors and with finite centre, a lattice, and X the homogeneous space
endowed with the Haar measure
. Suppose that the closed subsemigroup
generated by
has the property that
is Zariski dense in
. Then µ has a spectral gap on X.
Proof.
Consider the regular representation of G on . By Bekka [Citation1, Lemma 3] it doesn't weakly contain the trivial representation. From this, in view of [Citation25, Theorem C], the result follows if we can argue that the projection of µ to any simple factor of G is not supported on a closed amenable subgroup. However, since amenability passes to the Zariski closure (see e.g. [Citation28, Theorem 4.1.15]) the latter would imply that one of the simple factors of
is amenable, hence compact by a classical result of Furstenberg (see e.g. [Citation28, Proposition 4.1.8]).
3.2. Good height functions
Inspecting the proof of Theorem 3.2, one observes that every step is effective, with explicit bounds and good control over the measure of exceptional sets, except for the very first one: separability of the space of compactly supported continuous functions. In the remainder of this section, we aim to make this step effective as well, the goal being exponentially fast convergence
from almost every starting point, uniformly over functions f on X. As merely continuous functions can behave arbitrarily badly (with respect to the convergence problem at hand), there is no hope of achieving this feat for all
. We shall therefore restrict our attention to smooth functions of compact support, and take into account their regularity by considering not just their
, but also certain Sobolev norms. Built into the definition of these norms will be what we call a good height function, the concept of which is introduced in this subsection.
Our setup is as follows: Let G be a real Lie group with Lie algebra . We endow
with a scalar product, which we use to define a right-invariant metric
on G. Given a lattice
, this metric descends to a metric
on
such that the projection
is locally an isometry. Moreover, we fix an orthonormal basis of
, using which we will identify
with
. Here is the crucial definition.
Definition 3.5
We call a measurable function a good height function if there exists
and a function
with the following properties:
The restriction of the exponential map
is a diffeomorphism onto its image and we have
for all
, where
denotes the open ball of radius r around the identity
with respect to the metric
on G.
For all
, the projection
is injective.
There exist constants
such that
for all
.
There exists a constant
such that
for all
and all
.
The definition suggests thinking of a good height function as the reciprocal of the injectivity radius. And indeed, this viewpoint allows their construction on any homogeneous space $X = G/\Gamma$.
Proposition 3.6
Let G be a real Lie group and Γ a lattice in G. Then admits a good height function.
Proof.
Choose R>0 such that condition (i) of the definition is satisfied and set , where
is the injectivity radius at
, i.e. the maximal radius such that (ii) holds at x. Define
Then the only thing that needs to be verified is the validity of (iv). We claim that it holds with
. This will follow if we can show that
(9)
(9) whenever
. To this end, let
. Then by definition, there are distinct
such that
. As
, right-invariance of the metric implies
for i = 1, 2, and we also have
. This shows that
, and as
was arbitrary, we see that (9) holds.
Often, however, one might want to work with different, naturally occurring height functions. The flexibility in our definition of a good height function accommodates this possibility.
In the examples below, we denote by the length of a shortest non-zero vector in a lattice
.
Example 3.7
Let and
. Then
can be identified with the space of lattices in
with covolume 1 via
Then the function
, defined on X via the above identification, is a good height function. Indeed, one can first choose R>0 such that (i) is satisfied, and then set
as in the proof of Proposition 3.6. Then (ii) is automatically satisfied, and (iv) is valid for a suitable choice of σ due to the inequality
for
and
, where
denotes any matrix norm. To see that also (iii) holds, let
and suppose that hx = x for some
with
. Then for all
, the matrix
fixes the lattice
but is not the identity, so that
for some constants
. For a basis change
such that
consists of a reduced basis of the lattice x we have
for some
(cf. e.g. [Citation26, Chapter III]). With this choice, the above inequality implies
for
and
. Since near the identity, the metric
on G is Lipschitz-equivalent to the distance induced by
, this establishes (iii).
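In dimension 2, the length of a shortest non-zero lattice vector can be computed exactly by Lagrange–Gauss reduction. The Python sketch below does this for a lattice given by a basis. The function names, and the normalization of the height as the plain reciprocal of the shortest vector length, are assumptions made for illustration. The code plays no role in the argument.

```python
import math


def dot(u, v):
    """Euclidean inner product of two vectors in R^2."""
    return u[0] * v[0] + u[1] * v[1]


def shortest_vector_length(b1, b2):
    """Length of a shortest non-zero vector of the lattice Z*b1 + Z*b2 in
    R^2, via Lagrange-Gauss reduction. Assumes b1 and b2 are linearly
    independent."""
    u, v = b1, b2
    if dot(u, u) > dot(v, v):
        u, v = v, u
    while True:
        # Subtract the nearest integer multiple of the shorter vector u.
        k = round(dot(u, v) / dot(u, u))
        v = (v[0] - k * u[0], v[1] - k * u[1])
        if dot(v, v) >= dot(u, u):
            # The basis (u, v) is now reduced, so u is a shortest vector.
            return math.sqrt(dot(u, u))
        u, v = v, u


def height(b1, b2):
    """Reciprocal of the shortest vector length (assumed normalization)."""
    return 1.0 / shortest_vector_length(b1, b2)
```

For instance, the lattice spanned by `(0.5, 0.0)` and `(0.0, 2.0)` has covolume 1 and a shortest vector of length 0.5, so its height in this normalization is 2. Stretching the lattice further along the diagonal flow makes the height grow, matching the picture of a height function measuring how far a point lies in the cusp.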
A similar construction is possible in a more general context.
Example 3.8
[Citation13]
Let be the group of real points of a semisimple
-group
and Γ an arithmetic lattice in G. Choose a rational
-stable lattice
. Then, using similar reasoning as in the previous example, the function
on
defined by
for
is seen to be a good height function (cf. [Citation13, Section 3.6]).
3.3. Sobolev norms
Given a good height function on X, the associated Sobolev norm of degree
of a compactly supported smooth function
is defined by
where the sum runs over differential operators
given by monomials of degree at most ℓ in elements of the fixed orthonormal basis of
in the universal enveloping algebra.
In other words, the differential operators appearing above are
for any k-tuple
of elements of the fixed basis of
,
, where
for
is defined by
for
and
.
Here are two immediate observations.
Lemma 3.9
Let be a good height function on X and
the associated Sobolev norms.
The norms
are induced by inner products
on
.
Given
, there exists a constant
such that
.
Proof.
Part (i) is clear. Part (ii) is also immediate from the definition of the Sobolev norms, once we know that a good height function must be bounded away from 0. The latter, however, follows directly from property (iii) in the definition of a good height function, as the function r appearing there is assumed to be bounded.
The proof of our convergence result in Section 3.4 will depend on the following proposition.
Proposition 3.10
[Citation13]
For the Sobolev norms associated to a good height function on X, there exists a non-negative integer and a constant C>0 with the following properties:
(Sobolev embedding estimate [Citation13, (3.9)]) For every
it holds that
.
(Finite relative traces [Citation13, (3.10)]) For all integers
the relative trace
is finite, meaning that for any orthogonal basis
in the completion of
with respect to
We refer to Bernstein–Reznikov [Citation10] for a systematic treatment of relative traces. In particular, it is proved in this reference that the above expression is independent of the choice of orthogonal basis.
The proofs in [Citation13] of the statements in the above proposition are given for the height function from Example 3.8. However, the only properties used are those in our definition of a good height function. In fact, the arguments only depend on validity of the second statement in [Citation13, Lemma 5.1], which holds in our context, as we demonstrate below.
Lemma 3.11
Let be a good height function on X. Then there exists a non-negative integer
and a constant C>0 such that for every non-negative integer
and every differential operator
given by a monomial of degree at most ℓ in elements of the fixed basis of
we have
for every
and
.
Proof.
We inspect the function in a chart around x given by the exponential map: We set
, where
is the function from the definition of a good height function,
, and consider
Then by the first statement of [Citation13, Lemma 5.1], which is simply a Sobolev embedding estimate on
, we know
(10)
(10) where
is a constant depending only on the dimension d of
and
is the standard degree d Sobolev norm on the open subset
of
, i.e.
where the sum is over all multi-indices
of degree at most d and
is the corresponding standard partial derivative of
. Using property (iii) in the definition of a good height function, (10) implies that
(11)
(11) where
is another constant and we used that
is bounded away from 0 to replace
appearing in the exponent by
. Using properties (i) and (ii) in the definition of a good height function, we find
such that
(12)
(12) To see this, one needs to note two things: firstly, that by the chain rule the partial derivatives of
at a point
in the chart can be expressed as linear combinations of derivatives
appearing on the right-hand side in (12) evaluated at the corresponding point
, with fixed coefficient functions depending only on finitely many derivatives of the exponential map on
; and secondly, that the Haar measure
is a smooth measure, meaning that it has a smooth and nowhere vanishing density w.r.t. Lebesgue measure in the chart.
Combining (11), (12), condition (iv) in the definition of a good height function, and plugging back in the definition of F, we finally arrive at
for yet another constant
, which is the one appearing in the lemma.
3.4. Exponentially generic points
Now we are ready to define the notion of effective genericity we wish to establish, and to prove the main convergence result of this section.
Until the end of this section, we fix a good height function on X. Moreover, given a bounded measurable function f on X and
we will use the notation
for
. We refer to
as the time n discrepancy for the function f.
Definition 3.12
We say that a point is
-exponentially generic if
is a non-negative integer and β a real number in
satisfying
where
is the degree ℓ Sobolev norm associated to
.
With this terminology, we have the following result, which quantifies the dependence on the function f in the effective part of Theorem 3.2.
Theorem 3.13
Let G be a real Lie group, a lattice and
endowed with the Haar measure
. Suppose that µ has a spectral gap on X. Then there exists a non-negative integer
such that
-almost every point
is
-exponentially generic.
Our argument uses ideas from the proof of [Citation13, Proposition 9.2]. Recall that denotes the inner product associated to the Sobolev norm
.
Proof.
Set with
from Proposition 3.10. We denote by
the completion of
with respect to
.
The first step of the proof is to argue that admits an orthonormal basis
with respect to
that is also orthogonal with respect to
. To this end, let us endow
with the scalar product
associated to
. This makes
into a Hilbert space. As a consequence of Lemma 3.9(ii),
defines a bounded positive definite Hermitian form on
. Using Riesz representation it follows that there is a bounded positive self-adjoint operator T on
such that
for all
. Finiteness of the relative trace
from Proposition 3.10(ii) then translates into the statement that T is a trace-class operator on
(cf. [Citation14, Proposition 6.44]); in particular, the operator T is compact (cf. [Citation14, Proposition 6.42]). By the spectral theorem, T is thus diagonalizable. Hence, an orthonormal basis
of
consisting of eigenvectors of T is a basis with the desired properties.
Next, fix rational numbers and
. As in the proof of Theorem 3.2, using Chebyshev's inequality we find that for every
and large enough n we have
(13)
(13) where
. Since the relative trace
is finite by Proposition 3.10, the terms on the right-hand side of (13) are summable over
. Borel–Cantelli thus implies that
is a null set. Let
be the complement of this null set. We claim that any
is
-exponentially generic. Fix such a point x. Then we know that there are only finitely many pairs
with
. Thus, there exists
such that for
the inequality
holds for all k. Now let
be arbitrary and write
for the expansion of f in terms of the orthonormal basis
. Then, using the triangle inequality, we can estimate the time n discrepancy for f as follows:
(14)
The exchange of integral and summation involved in the above estimate is justified by part (i) of Proposition 3.10: It ensures that the functions
are defined pointwise and the series expansion of f converges uniformly. Next, for
an application of the Cauchy–Schwarz inequality implies that the right-hand side of (14) is strictly less than
(15)
Again by Proposition 3.10, the relative trace
is finite. Hence, in view of our definition of exponential genericity and the fact that
does not depend on f, combining (14) and (15) establishes the claim. It follows that all x in a countable intersection of the sets
over rational numbers α approaching
and ε approaching 0 from above are
-exponentially generic, giving the theorem.
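The way the orthonormal basis enters the final estimate can be summarized schematically. In the sketch below, $D_n(f)$ stands for the time n discrepancy of f at x, $c_k$ for the coefficients of f, $\|\cdot\|_{\mathrm{s}}$ and $\|\cdot\|_{\mathrm{w}}$ for the stronger and weaker Sobolev norm, and $C$ for a constant. All of these are assumed placeholder names, the per-basis-element bound is assumed to have the stated form, and the exact constants of (14) and (15) are not reproduced.

```latex
% Assumptions: f = \sum_k c_k e_k with (e_k) orthonormal for the stronger norm,
% and |D_n(e_k)| \le C \alpha^n \|e_k\|_w for all k once n \ge n_0. Requires amsmath.
\begin{align*}
  |D_n(f)| &\le \sum_k |c_k| \, |D_n(e_k)|
            \le C \alpha^n \sum_k |c_k| \, \|e_k\|_{\mathrm{w}} \\
           &\le C \alpha^n \Bigl( \sum_k |c_k|^2 \Bigr)^{1/2}
                \Bigl( \sum_k \|e_k\|_{\mathrm{w}}^2 \Bigr)^{1/2}
            = C \alpha^n \, \|f\|_{\mathrm{s}}
                \Bigl( \sum_k \|e_k\|_{\mathrm{w}}^2 \Bigr)^{1/2} .
\end{align*}
```

The last sum is the relative trace, which is finite, so the bound depends on f only through its stronger Sobolev norm.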
Remark 3.14
In analogy to Remark 3.3, we can control the measure of the set of points where exponentially generic behaviour is not observed for a given number of steps: If we define
for
,
and
, and
is chosen such that
for all
, then for every
it holds that
Indeed, we have
, as the proof of Theorem 3.13 demonstrates. Thus, again, the measure of the set of ‘bad points’, on which exponential genericity takes more than n steps to manifest, is itself exponentially small in n.
4. Uniform Cesàro convergence
In this last section, we explore the situation where the only possible limit in Theorem 1.1 is the normalized Haar measure . In this setting, by analogy with the case of unique ergodicity in classical ergodic theory, it is reasonable to expect the Cesàro convergence (2) to hold (locally) uniformly in the starting point
. We shall prove in Section 4.1 below that this indeed holds true. In Section 4.2, we conclude the article by showing that in many naturally occurring situations something even stronger than locally uniform convergence can be achieved.
Before continuing with the pertinent definitions, let us recall that even though the setup of Theorem 1.1 is our motivation and useful to have in mind, formally we are working with the assumptions stated at the end of Section 1: is merely required to be a space with a G-action for which
is invariant and ergodic.
Definition 4.1
A probability measure ν on X is called µ-stationary if $\mu * \nu = \nu$. The random walk on X induced by µ is called uniquely ergodic if
is the unique µ-stationary probability measure on X.
In particular, for a random walk to be uniquely ergodic, there must be no finite -orbits in X, where
denotes the closed subgroup of G generated by µ. In the case that $X = G/\Gamma$ for a lattice Γ in G, this happens if and only if
is not virtually contained in a conjugate of Γ. (Recall that a subgroup H of G is said to be virtually contained in a subgroup L of G if $H \cap L$ has finite index in H.) In fact, in many cases of interest, finite orbits are the only obstruction to unique ergodicity: For example, this is true when G is a connected semisimple Lie group without compact factors, Γ is an irreducible lattice,
, and
is Zariski dense in
(see [8, Corollary 1.8]); and also in the setting of [27], a special case of which is reproduced below as Example 4.8.
4.1. Locally uniform convergence
The notion of unique ergodicity introduced above coincides with the classical property of unique ergodicity of the Markov operator . When the space X is compact, this is enough to guarantee that the Cesàro convergence
as $n \to \infty$ is uniform in x (see e.g. [16, Section 5.1]). Without compactness, we also need to assume a form of recurrence.
Definition 4.2
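We say that the random walk on X given by µ is locally uniformly recurrent if for every compact subset $K \subset X$ and $\varepsilon > 0$ there exist $n_0 \in \mathbb{N}$ and a compact subset $M \subset X$ with $\mu^{*n} * \delta_x(M) \ge 1 - \varepsilon$ for all $n \ge n_0$ and $x \in K$. It is called locally uniformly recurrent on average if the above holds with the Cesàro averages $\frac{1}{n} \sum_{k=0}^{n-1} \mu^{*k} * \delta_x$ in place of $\mu^{*n} * \delta_x$.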
It is a simple exercise to check that locally uniform recurrence implies locally uniform recurrence on average. In concrete examples, recurrence properties such as these are typically established by constructing a Lyapunov function; see Section 4.2 below.
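For completeness, here is one way to carry out this exercise. The names $K$, $M$, $\varepsilon$ and $n_0$ refer to the data in the definition of locally uniform recurrence; the sketch assumes that the Cesàro averages run over $k = 0, \dots, n-1$, that the recurrence bound reads $\mu^{*n} * \delta_x(M) \ge 1 - \varepsilon$, and that $\varepsilon \le 1$.

```latex
% For x in K and n \ge n_0 / \varepsilon, drop the first n_0 terms and use
% the recurrence bound on the remaining ones. Requires amsmath.
\begin{align*}
  \frac{1}{n} \sum_{k=0}^{n-1} \mu^{*k} * \delta_x (M)
    &\ge \frac{1}{n} \sum_{k=n_0}^{n-1} \mu^{*k} * \delta_x (M)
     \ge \frac{n - n_0}{n} \, (1 - \varepsilon) \\
    &\ge (1 - \varepsilon)^2 \ge 1 - 2\varepsilon .
\end{align*}
```

Since ε is arbitrary, the same compact set M witnesses recurrence on average, at the cost of a larger threshold for n.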
The following well-known fact explains why these properties are referred to as ‘non-escape of mass’.
Lemma 4.3
Let the sequence of points in X be relatively compact and suppose that the random walk on X is locally uniformly recurrent (resp. on average). Then every weak* limit of the sequence
(resp.
) is a probability measure. □
The proof is immediate and left to the reader.
We are now ready to state and prove our first result on locally uniform Cesàro convergence.
Theorem 4.4
Suppose that the random walk on X induced by µ is uniquely ergodic and locally uniformly recurrent on average. Then for every , every compact
, and every
, there exists
such that for every
and
we have
Equivalently, considering the space of probability measures on X as endowed with the weak* topology, the sequence of functions
converges to
uniformly on compact subsets of X as
.
Proof.
The equivalence of the two formulations is due to the definition of neighbourhoods in the weak* topology by finitely many test functions in .
To prove the statement for individual functions, we proceed by contradiction. If the conclusion is false, then for some ,
compact and
there exist indices
and
with
(16)
for all
. Let ν be a weak* limit point of the sequence
Then ν is µ-stationary, and a probability measure because of our recurrence assumption and the fact that all
lie in the fixed compact set K (Lemma 4.3). But by unique ergodicity this forces
, contradicting (16).
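The stationarity of ν used in this proof follows from a telescoping identity, sketched here. The notation $\nu_j$ for the Cesàro averages along the sequence, the indices $n_j$ and points $x_j$, and $A_\mu f(x) = \int_G f(gx)\,d\mu(g)$ for the Markov operator are assumed for this illustration.

```latex
% Assumed notation: \nu_j = \frac{1}{n_j} \sum_{k=0}^{n_j - 1} \mu^{*k} * \delta_{x_j}.
% Requires amsmath.
\begin{align*}
  \mu * \nu_j - \nu_j
    &= \frac{1}{n_j} \sum_{k=0}^{n_j-1}
       \bigl( \mu^{*(k+1)} * \delta_{x_j} - \mu^{*k} * \delta_{x_j} \bigr)
     = \frac{1}{n_j} \bigl( \mu^{*n_j} * \delta_{x_j} - \delta_{x_j} \bigr), \\
  \Bigl| \int_X A_\mu f \, d\nu_j - \int_X f \, d\nu_j \Bigr|
    &\le \frac{2 \, \|f\|_\infty}{n_j} \xrightarrow[j \to \infty]{} 0
    \qquad \text{for bounded continuous } f .
\end{align*}
```

Because the limit ν is a probability measure, the convergence of $\nu_j$ to ν along the chosen subsequence also holds against the bounded continuous function $A_\mu f$. This gives $\int A_\mu f \, d\nu = \int f \, d\nu$ for all compactly supported continuous f, that is, $\mu * \nu = \nu$.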
4.2. Lyapunov functions & stronger uniformity
Loosely speaking, (Foster–)Lyapunov functions are functions enjoying certain contraction properties with respect to the random walk, to the effect that (on average) its dynamics are directed towards the ‘centre’ of the space, where the function takes values below some threshold. They were introduced into the study of random walks on homogeneous spaces by Eskin–Margulis [15], whose ideas were further developed by Benoist–Quint [6].
Definition 4.5
A measurable function is called a Lyapunov function for the random walk on X induced by µ if
(i) it is proper, in the sense that the sublevel sets
are relatively compact for
, and
(ii) there exist constants
,
such that
, where
is the convolution operator associated to µ introduced in Section 3.
The inequality in the second condition above is referred to as the contraction property of V.
Allowing Lyapunov functions to take the value ∞ is conceptually important for the proofs of results such as Theorem 1.1, in order to show that the random walk does not accumulate near a lower-dimensional homogeneous subspace. Also, allowing non-continuous Lyapunov functions is crucial in recent constructions given in the literature [6, 19]. For the purposes of the discussion in this section, however, it is no serious restriction to have in mind the case of a continuous Lyapunov function which is finite on all of X.
Remark 4.6
Let us collect some immediate observations about Lyapunov functions.
(i) If V is a Lyapunov function, then so are cV and V + c for any constant c > 0. In particular, one may impose an arbitrary lower bound on V, so that it is no restriction to assume that a Lyapunov function takes values
, say.
(ii) Given a Lyapunov function
for the
-step random walk (induced by the convolution power
), one can construct a Lyapunov function V for the random walk given by µ itself by setting
(iii) By enlarging α and using properness, the contraction property in the definition of a Lyapunov function V may be replaced by
for some compact
, where
denotes the indicator function of K (cf. [17, Lemma 15.2.8]).
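The last of these observations can be verified in a few lines. In the sketch below, the contraction property is assumed to read $A_\mu V \le \alpha V + \beta$ with $\alpha \in (0,1)$, $\beta \ge 0$ and $V \ge 0$, where $A_\mu$ is assumed notation for the convolution operator; $\alpha'$ and $R$ are auxiliary names introduced only for this illustration.

```latex
% Choose \alpha' \in (\alpha, 1), put R = \beta / (\alpha' - \alpha) and let K be the
% closure of the sublevel set \{ V \le R \}, which is compact by properness.
% Requires amsmath.
\begin{align*}
  x \in K:    \quad & A_\mu V(x) \le \alpha V(x) + \beta
                      \le \alpha' V(x) + \beta \, \mathbf{1}_K(x), \\
  x \notin K: \quad & V(x) > R \;\Longrightarrow\; \beta < (\alpha' - \alpha) V(x)
                      \;\Longrightarrow\; A_\mu V(x) \le \alpha V(x) + \beta
                      \le \alpha' V(x) .
\end{align*}
```

Both cases together give $A_\mu V \le \alpha' V + \beta \mathbf{1}_K$ with the enlarged contraction factor $\alpha' < 1$.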
Two examples in which a Lyapunov function exists are the following.
Example 4.7
[15]
Identify with the space of unimodular lattices in
as in Example 3.7 and recall that we denote by
the length of a shortest non-zero vector in
. Then for every compactly supported probability measure µ on G whose support generates a Zariski dense subgroup there exist
such that
is a finite continuous Lyapunov function for the
-step random walk on X induced by
for some
. This construction can be generalized to higher dimensions by taking into account the higher successive minima
of lattices in
. A more advanced construction also ensures existence of Lyapunov functions for Zariski dense probability measures with finite exponential moments when
is the group of real points of a Zariski connected semisimple algebraic group
defined over
such that G has no compact factors.
Example 4.8
[27]
Let ,
and
. For
let
be positive real numbers,
vectors such that
and
span
,
and set
Then for any choice of
with
, the measure
defines a uniquely ergodic random walk on X admitting a finite continuous Lyapunov function.
It is well known that existence of a Lyapunov function implies recurrence properties of the random walk.
Lemma 4.9
[15, Lemma 3.1]
Suppose the random walk on X given by µ admits a finite continuous Lyapunov function V. Then this random walk is locally uniformly recurrent.
The intuitive reason for this behaviour is simple: The contraction property means that after a step of the random walk, the value of the Lyapunov function V on average gets smaller by a constant factor, at least when starting outside some compact set K (cf. Remark 4.6(iii) above), which one can think of as the ‘centre’ of the space. The set K can be chosen as (closure of) a sublevel set of V. By the contraction property, the number of steps required to reach it is uniform over starting points x in any given sublevel set of V, or in any given compact subset of X in the case that V is finite and continuous. This suggests that we might even let the starting points diverge, as long as this divergence is outcompeted by the geometric rate of contraction of V. We are led to the following notion of recurrence.
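The uniformity described here comes from iterating the contraction property. A minimal sketch, assuming that it reads $A_\mu V \le \alpha V + \beta$ with $\alpha \in (0,1)$, $\beta \ge 0$ and $V \ge 0$, where $A_\mu$ is assumed notation for the convolution operator and $R > 0$ is an auxiliary threshold:

```latex
% A_\mu is positive and fixes constants, so the inequality can be iterated.
% Requires amsmath.
\begin{align*}
  A_\mu^n V &\le \alpha^n V + \beta \sum_{k=0}^{n-1} \alpha^k
             \le \alpha^n V + \frac{\beta}{1 - \alpha}, \\
  \mu^{*n} * \delta_x \bigl( \{ V > R \} \bigr)
            &\le \frac{1}{R} \int_X V \, d(\mu^{*n} * \delta_x)
             = \frac{A_\mu^n V(x)}{R}
             \le \frac{1}{R} \Bigl( \alpha^n V(x) + \frac{\beta}{1 - \alpha} \Bigr) .
\end{align*}
```

The second line is Markov's inequality. If $V(x) \le S$ for some $S \ge 1$, the term $\alpha^n V(x)$ drops below 1 as soon as $n \ge \log S / \log(1/\alpha)$, so the probability of being outside the sublevel set $\{V \le R\}$ is controlled uniformly over starting points in $\{V \le S\}$.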
Definition 4.10
Let be a sequence of subsets of X. We say that the random walk on X given by µ is
-uniformly recurrent if for every
there exists
and a compact subset
with
for all
and
. It is called
-uniformly recurrent on average if the above holds with the Cesàro averages
in place of
.
Remark 4.11
We point out that, in contrast to the locally uniform situation, it is in general not clear whether either of the two versions of this property (with or without averaging) implies the other.
We are now going to establish such recurrence properties for certain families of sublevel sets of Lyapunov functions, which can be chosen to be increasing and to exhaust the part of X where the Lyapunov function is finite. Recall that the Lyapunov exponent of a function
is the exponential growth rate
If
, we say that φ has sub-exponential growth.
Proposition 4.12
Let be a function. Suppose that the random walk on X induced by µ admits a Lyapunov function V with contraction factor
and set
.
(i) If φ has Lyapunov exponent
, then the random walk on X given by µ is
-uniformly recurrent. The number
in the definition can be chosen independently of ε.
(ii) If φ has sub-exponential growth, then the random walk on X given by µ is
-uniformly recurrent on average.
The proof is a refinement of the methods in [6, 15].
Proof.
Let be the constants from the contraction property of V and define
. We are going to use the same set M for both parts of the proposition, namely
, which is compact since V is proper. Then for
and
we find, by repeatedly using the contraction property of V,
When the exponential growth rate of φ is less than
, for some
we have
for all
. This proves (i).
In order to prove (ii) we use a similar estimate, but have to ensure that the values are small for a sufficiently large proportion of
. For
we find, as above,
(17)
Using straightforward manipulations, we further see
the right-hand side of which tends to 0 as
by sub-exponential growth of φ. Hence, with
, we may choose
large enough to ensure the above inequality holds for all
for
. For such n we conclude, using (17),
which ends the proof of (ii).
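One possible way to organize an estimate of this kind, not necessarily the manipulation used in the proof above, is to discard an initial proportion δ of the time steps. The sketch assumes that the sets in question are $X_n = \{V \le \varphi(n)\}$, that the contraction property reads $A_\mu V \le \alpha V + \beta$, writes $B = \beta/(1-\alpha)$, and uses auxiliary parameters $\delta \in (0,1)$ and $R > 0$; all of these names are assumptions made for illustration.

```latex
% For x in X_n: bound the first \lceil \delta n \rceil terms by 1 and the remaining
% ones by Markov's inequality combined with the iterated contraction property.
% Requires amsmath.
\begin{align*}
  \frac{1}{n} \sum_{k=0}^{n-1} \mu^{*k} * \delta_x \bigl( \{ V > R \} \bigr)
    &\le \frac{\lceil \delta n \rceil}{n}
       + \frac{1}{n} \sum_{k=\lceil \delta n \rceil}^{n-1}
         \frac{\alpha^k \varphi(n) + B}{R} \\
    &\le \delta + \frac{1}{n} + \frac{\alpha^{\delta n} \varphi(n) + B}{R} .
\end{align*}
```

Sub-exponential growth of φ gives $\alpha^{\delta n} \varphi(n) \to 0$ for every fixed δ. Choosing $\delta = \varepsilon/4$ and $R = 4(B+1)/\varepsilon$, the right-hand side is at most $3\varepsilon/4$ once n is large enough that $1/n \le \varepsilon/4$ and $\alpha^{\delta n} \varphi(n) \le 1$.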
Theorem 4.4 can now be strengthened in the following way.
Theorem 4.13
In addition to the assumptions of Theorem 4.4, suppose that the random walk on X induced by µ admits a Lyapunov function V. Let have sub-exponential growth. Then for every
we have
Proof.
Using -uniform recurrence on average for
from Proposition 4.12(ii), the proof of Theorem 4.4 goes through with the obvious modifications.
Acknowledgments
The author would like to express his gratitude to Andreas Wieser for valuable comments on preliminary versions of the article, and to Manfred Einsiedler for explaining how relative traces can be used to make separability effective. Thanks also go to HIM Bonn and the organizers of the trimester program ‘Dynamics: Topology and Numbers’, in the course of which parts of this manuscript were completed, for hospitality and providing an excellent working environment. Finally, the author is grateful to the anonymous referee for pointing out a simple way to establish a better speed of convergence in Theorems 3.2 and 3.13.
Disclosure statement
No potential conflict of interest was reported by the author(s).
References
[1] Bekka M.B., On uniqueness of invariant means, Proc. Amer. Math. Soc. 126 (1998), pp. 507–514.
[2] Bénard T., Equidistribution of mass for random processes on finite-volume spaces, Israel J. Math. 255 (2023), pp. 417–422.
[3] Bénard T. and de Saxcé N., Random walks with bounded first moment on finite-volume spaces, Geom. Funct. Anal. 32 (2022), pp. 687–724.
[4] Benoist Y. and Quint J.-F., Mesures stationnaires et fermés invariants des espaces homogènes, Ann. Math. (2) 174 (2011), pp. 1111–1162.
[5] Benoist Y. and Quint J.-F., Introduction to random walks on homogeneous spaces, Jpn. J. Math. 7 (2012), pp. 135–166.
[6] Benoist Y. and Quint J.-F., Random walks on finite volume homogeneous spaces, Invent. Math. 187 (2012), pp. 37–59.
[7] Benoist Y. and Quint J.-F., Stationary measures and invariant subsets of homogeneous spaces (II), J. Amer. Math. Soc. 26 (2013), pp. 659–734.
[8] Benoist Y. and Quint J.-F., Stationary measures and invariant subsets of homogeneous spaces (III), Ann. Math. (2) 178 (2013), pp. 1017–1059.
[9] Benoist Y. and Quint J.-F., Random Walks on Reductive Groups, Springer, Cham, 2016.
[10] Bernstein J. and Reznikov A., Sobolev norms of automorphic functionals, Int. Math. Res. Not. 2002 (2002), pp. 2155–2174.
[11] Breuillard E. F., Equidistribution of random walks on nilpotent Lie groups and homogeneous spaces, PhD thesis, Yale University, 2004.
[12] Davis Buenger C., Quantitative non-divergence, effective mixing, and random walks on homogeneous spaces, PhD thesis, The Ohio State University, 2016.
[13] Einsiedler M., Margulis G., and Venkatesh A., Effective equidistribution for closed orbits of semisimple groups on homogeneous spaces, Invent. Math. 177 (2009), pp. 137–212.
[14] Einsiedler M. and Ward T., Functional Analysis, Spectral Theory, and Applications, Springer, Cham, 2017.
[15] Eskin A. and Margulis G., Recurrence properties of random walks on finite volume homogeneous manifolds, in Random walks and geometry, Vadim A. Kaimanovich, ed., De Gruyter, Berlin, 2004, pp. 431–444. Proceedings of a Workshop at the Erwin Schrödinger Institute, Vienna, June 18–July 13, 2001. Corrected version: https://www.math.uchicago.edu/eskin/return.ps.
[16] Krengel U., Ergodic Theorems, de Gruyter, Berlin, 1985.
[17] Meyn S. and Tweedie R. L., Markov Chains and Stochastic Stability, 2nd ed., Cambridge University Press, Cambridge, 2009.
[18] Prohaska R., Spread out random walks on homogeneous spaces, Ergodic Theory Dynam. Syst. 41 (2021), pp. 3439–3473.
[19] Prohaska R., Sert C., and Shi R., Expanding measures: Random walks and rigidity on homogeneous spaces, Forum Math. Sigma 11(e59) (2023), pp. 1–61.
[20] Raghunathan M.S., Discrete Subgroups of Lie Groups, Springer, Berlin, 1972.
[21] Ratner M., On measure rigidity of unipotent subgroups of semisimple groups, Acta Math. 165 (1990), pp. 229–309.
[22] Ratner M., Strict measure rigidity for unipotent subgroups of solvable groups, Invent. Math. 101 (1990), pp. 449–482.
[23] Ratner M., On Raghunathan's measure conjecture, Ann. Math. (2) 134 (1991), pp. 545–607.
[24] Ratner M., Raghunathan's topological conjecture and distributions of unipotent flows, Duke Math. J. 63 (1991), pp. 235–280.
[25] Shalom Y., Explicit Kazhdan constants for representations of semisimple and arithmetic groups, Ann. Inst. Fourier (Grenoble) 50 (2000), pp. 833–863.
[26] Siegel C. L., Lectures on the Geometry of Numbers, Springer, Berlin, 1989.
[27] Simmons D. and Weiss B., Random walks on homogeneous spaces and diophantine approximation on fractals, Invent. Math. 216 (2019), pp. 337–394.
[28] Zimmer R. J., Ergodic Theory and Semisimple Groups, Birkhäuser, Boston, 1984.