In mathematics, the $L^p$ spaces are function spaces defined using a natural generalization of the p-norm for finite-dimensional vector spaces. They are sometimes called Lebesgue spaces, named after Henri Lebesgue (Dunford & Schwartz 1958, III.3), although according to the Bourbaki group (Bourbaki 1987) they were first introduced by Frigyes Riesz (Riesz 1910). $L^p$ spaces form an important class of Banach spaces in functional analysis, and of topological vector spaces. Lebesgue spaces have applications in physics, statistics, finance, engineering, and other disciplines.
In statistics, measures of central tendency and statistical dispersion, such as the mean, median, and standard deviation, are defined in terms of $L^p$ metrics, and measures of central tendency can be characterized as solutions to variational problems.
In penalized regression, "L1 penalty" and "L2 penalty" refer to penalizing either the $L^1$ norm of a solution's vector of parameter values (i.e. the sum of its absolute values), or its $L^2$ norm (its Euclidean length). Techniques which use an L1 penalty, like LASSO, encourage solutions where many parameters are zero. Techniques which use an L2 penalty, like ridge regression, encourage solutions where most parameter values are small. Elastic net regularization uses a penalty term that is a combination of the $L^1$ norm and the $L^2$ norm of the parameter vector.
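As a minimal sketch of the penalties just described, the following Python snippet computes the $L^1$ and $L^2$ norms of a parameter vector and a simple combination of the two. The function names and the mixing weight `alpha` are illustrative assumptions, not part of any particular library; practical elastic net implementations often use the squared $L^2$ norm instead.

```python
import math

def l1_penalty(beta):
    """Sum of absolute values of the parameters (the L^1 norm)."""
    return sum(abs(b) for b in beta)

def l2_penalty(beta):
    """Euclidean length of the parameter vector (the L^2 norm)."""
    return math.sqrt(sum(b * b for b in beta))

def elastic_net_penalty(beta, alpha=0.5):
    """Hypothetical convex combination of the two penalties above.

    `alpha` is an illustrative mixing weight, not taken from the text.
    """
    return alpha * l1_penalty(beta) + (1.0 - alpha) * l2_penalty(beta)

beta = [3.0, 0.0, -4.0]
print(l1_penalty(beta))           # 7.0
print(l2_penalty(beta))           # 5.0
print(elastic_net_penalty(beta))  # 6.0
```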
The Fourier transform for the real line (resp. for periodic functions, see Fourier series) maps $L^p(\mathbb{R})$ to $L^q(\mathbb{R})$ (resp. $L^p(\mathbb{T})$ to $\ell^q$), where 1 ≤ p ≤ 2 and 1/p + 1/q = 1. This is a consequence of the Riesz–Thorin interpolation theorem, and is made precise with the Hausdorff–Young inequality.
By contrast, if p > 2, the Fourier transform does not map into $L^q$.
Hilbert spaces are central to many applications, from quantum mechanics to stochastic calculus. The spaces $L^2$ and $\ell^2$ are both Hilbert spaces. In fact, by choosing a Hilbert basis (i.e., a maximal orthonormal subset of $L^2$ or any Hilbert space), one sees that all Hilbert spaces are isometric to $\ell^2(E)$, where E is a set with an appropriate cardinality.
The p-norm in finite dimensions
The Euclidean distance between two points x and y is the length $\|x - y\|_2$ of the straight line between the two points. In many situations, the Euclidean distance is insufficient for capturing the actual distances in a given space. An analogy to this is suggested by taxi drivers in a grid street plan who should measure distance not in terms of the length of the straight line to their destination, but in terms of the rectilinear distance, which takes into account that streets are either orthogonal or parallel to each other. The class of p-norms generalizes these two examples and has an abundance of applications in many parts of mathematics, physics, and computer science.
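For a real number p ≥ 1, the p-norm or $L^p$-norm of $x = (x_1, x_2, \ldots, x_n)$ is defined by
$$\|x\|_p = \left( |x_1|^p + |x_2|^p + \cdots + |x_n|^p \right)^{1/p}.$$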
Of course the absolute value bars are unnecessary when p is a rational number and, in reduced form, has an even numerator.
The Euclidean norm from above falls into this class and is the 2-norm, and the 1-norm is the norm that corresponds to the rectilinear distance.
The $L^\infty$-norm or maximum norm (or uniform norm) is the limit of the $L^p$-norms for p → ∞. It turns out that this limit is equivalent to the following definition:
$$\|x\|_\infty = \max \left\{ |x_1|, |x_2|, \ldots, |x_n| \right\}.$$
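The following Python sketch illustrates the two definitions on a small vector: the p-norm $(|x_1|^p + \cdots + |x_n|^p)^{1/p}$ and the maximum norm $\max_i |x_i|$. The function names and the sample vector are illustrative choices. Printing the p-norm for growing p shows the values decreasing toward the maximum norm.

```python
def p_norm(x, p):
    """p-norm of a finite sequence x for a real number p >= 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def max_norm(x):
    """Maximum norm: the largest absolute value among the entries."""
    return max(abs(t) for t in x)

x = [3.0, -4.0, 1.0]
for p in (1, 2, 10, 100):
    # The printed values decrease toward max_norm(x) as p grows.
    print(p, p_norm(x, p))
print("max", max_norm(x))
```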
For all p ≥ 1, the p-norms and maximum norm as defined above indeed satisfy the properties of a "length function" (or norm), which are that:
- only the zero vector has zero length,
- the length of the vector is positive homogeneous with respect to multiplication by a scalar (positive homogeneity), and
- the length of the sum of two vectors is no larger than the sum of lengths of the vectors (triangle inequality).
Abstractly speaking, this means that $\mathbb{R}^n$ together with the p-norm is a Banach space. This Banach space is the $L^p$-space over $\mathbb{R}^n$.
Relations between p-norms
The grid distance or rectilinear distance (sometimes called the "Manhattan distance") between two points is never shorter than the length of the line segment between them (the Euclidean or "as the crow flies" distance). Formally, this means that the Euclidean norm of any vector is bounded by its 1-norm:
$$\|x\|_2 \le \|x\|_1.$$
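This fact generalizes to p-norms in that the p-norm $\|x\|_p$ of any given vector x does not grow with p:
$$\|x\|_{p+a} \le \|x\|_p \quad \text{for any vector } x \text{ and real numbers } p \ge 1,\ a \ge 0.$$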
For the opposite direction, the following relation between the 1-norm and the 2-norm is known:
$$\|x\|_1 \le \sqrt{n} \, \|x\|_2.$$
This inequality depends on the dimension n of the underlying vector space and follows directly from the Cauchy–Schwarz inequality.
In general, for vectors in $\mathbb{C}^n$ where 0 < r < p:
$$\|x\|_p \le \|x\|_r \le n^{(1/r - 1/p)} \, \|x\|_p.$$
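The two-sided comparison $\|x\|_p \le \|x\|_r \le n^{1/r - 1/p} \|x\|_p$ for 0 < r < p can be checked numerically. The sketch below uses an arbitrary sample vector and exponents (both are illustrative assumptions) and also shows that a constant vector attains the right-hand bound.

```python
def p_norm(x, p):
    """(sum |x_i|^p)^(1/p) for a finite sequence x and real p > 0."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

x = [1.0, -2.0, 3.0, 0.5]
n = len(x)
r, p = 1.5, 3.0  # any exponents with 0 < r < p

lower = p_norm(x, p)
middle = p_norm(x, r)
upper = n ** (1.0 / r - 1.0 / p) * p_norm(x, p)
print(lower <= middle <= upper)  # True

# A constant vector attains the right-hand bound (up to rounding).
ones = [1.0] * n
print(p_norm(ones, r), n ** (1.0 / r - 1.0 / p) * p_norm(ones, p))
```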
When 0 < p < 1
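In $\mathbb{R}^n$ for n > 1, the formula
$$\|x\|_p = \left( |x_1|^p + |x_2|^p + \cdots + |x_n|^p \right)^{1/p}$$
defines an absolutely homogeneous function of degree 1 for 0 < p < 1; however, the resulting function does not define an F-norm, because it is not subadditive. In $\mathbb{R}^n$ for n > 1, the formula for 0 < p < 1
$$|x_1|^p + |x_2|^p + \cdots + |x_n|^p$$
defines a subadditive function, which does define an F-norm. This F-norm is homogeneous of degree p.
Hence, the function
$$d_p(x, y) = \sum_{i=1}^n |x_i - y_i|^p$$
defines a metric. The metric space $(\mathbb{R}^n, d_p)$ is denoted by $\ell_n^p$.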
Although the p-unit ball $B_n^p$ around the origin in this metric is "concave", the topology defined on $\mathbb{R}^n$ by the metric $d_p$ is the usual vector space topology of $\mathbb{R}^n$, hence $\ell_n^p$ is a locally convex topological vector space. Beyond this qualitative statement, a quantitative way to measure the lack of convexity of $\ell_n^p$ is to denote by $C_p(n)$ the smallest constant C such that the multiple $C \, B_n^p$ of the p-unit ball contains the convex hull of $B_n^p$, equal to $B_n^1$. The fact that for fixed p < 1 we have
$$C_p(n) = n^{1/p - 1} \to \infty \quad \text{as } n \to \infty$$
shows that the infinite-dimensional sequence space $\ell^p$ defined below is no longer locally convex.
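A small numerical sketch, assuming p = 1/2 and hand-picked vectors, makes the three points above concrete: the expression $(\sum |x_i|^p)^{1/p}$ violates the triangle inequality, the sum $\sum |x_i - y_i|^p$ stays subadditive, and the barycenter of the n unit vectors (a point of the convex hull of the p-unit ball) has quasi-norm $n^{1/p - 1}$. The function names are illustrative.

```python
def quasi_norm(x, p):
    """(sum |x_i|^p)^(1/p); only a quasi-norm when 0 < p < 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def d_p(x, y, p):
    """sum |x_i - y_i|^p, the metric described above for 0 < p < 1."""
    return sum(abs(a - b) ** p for a, b in zip(x, y))

p = 0.5
x, y, zero = [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]
s = [a + b for a, b in zip(x, y)]

# Triangle inequality fails for the quasi-norm: 4.0 > 1.0 + 1.0.
print(quasi_norm(s, p), quasi_norm(x, p) + quasi_norm(y, p))

# The metric d_p is still subadditive: 2.0 <= 1.0 + 1.0.
print(d_p(s, zero, p), d_p(x, zero, p) + d_p(y, zero, p))

# The barycenter of the n unit vectors lies in the convex hull of the
# p-unit ball, yet its quasi-norm is n^(1/p - 1).
n = 4
center = [1.0 / n] * n
print(quasi_norm(center, p), n ** (1.0 / p - 1.0))
```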
When p = 0
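There is one $\ell_0$ norm and another function called the $\ell_0$ "norm" (with quotation marks).
The space of sequences has a complete metric topology provided by the F-norm
$$(x_n) \mapsto \sum_n 2^{-n} \frac{|x_n|}{1 + |x_n|},$$
which is discussed by Stefan Rolewicz in Metric Linear Spaces. The $\ell_0$-normed space is studied in functional analysis, probability theory, and harmonic analysis.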
Another function, called the $\ell_0$ "norm" by David Donoho (whose quotation marks warn that this function is not a proper norm), is the number of non-zero entries of the vector x. Many authors abuse terminology by omitting the quotation marks. Defining $0^0 = 0$, the zero "norm" of x is equal to
$$|x_1|^0 + |x_2|^0 + \cdots + |x_n|^0.$$
This is not a norm because it is not homogeneous. Despite these defects as a mathematical norm, the non-zero counting "norm" has uses in scientific computing, information theory, and statistics, notably in compressed sensing in signal processing and computational harmonic analysis.
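The failure of homogeneity is easy to see in a short sketch. The helper name `l0_count` is an illustrative choice: scaling a vector by a non-zero constant leaves the number of non-zero entries unchanged, whereas a norm would scale by the absolute value of that constant.

```python
def l0_count(x):
    """Number of non-zero entries of x (the zero "norm")."""
    return sum(1 for t in x if t != 0)

x = [0.0, 2.5, 0.0, -1.0]
print(l0_count(x))                   # 2
print(l0_count([3 * t for t in x]))  # 2, not 3 * 2
```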
The p-norm in countably infinite dimensions and $\ell^p$ spaces
The p-norm can be extended to vectors that have an infinite number of components, which yields the space $\ell^p$. This contains as special cases:
- $\ell^1$, the space of sequences whose series is absolutely convergent,
- $\ell^2$, the space of square-summable sequences, which is a Hilbert space, and
- $\ell^\infty$, the space of bounded sequences.
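The space of sequences has a natural vector space structure by applying addition and scalar multiplication coordinate by coordinate. Explicitly, the vector sum and the scalar action for infinite sequences of real (or complex) numbers are given by:
$$(x_1, x_2, \ldots, x_n, x_{n+1}, \ldots) + (y_1, y_2, \ldots, y_n, y_{n+1}, \ldots) = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n, x_{n+1} + y_{n+1}, \ldots),$$
$$\lambda \cdot (x_1, x_2, \ldots, x_n, x_{n+1}, \ldots) = (\lambda x_1, \lambda x_2, \ldots, \lambda x_n, \lambda x_{n+1}, \ldots).$$
Define the p-norm:
$$\|x\|_p = \left( |x_1|^p + |x_2|^p + \cdots + |x_n|^p + |x_{n+1}|^p + \cdots \right)^{1/p}.$$
Here, a complication arises, namely that the series on the right is not always convergent, so for example, the sequence made up of only ones, (1, 1, 1, ...), will have an infinite p-norm for 1 ≤ p < ∞. The space $\ell^p$ is then defined as the set of all infinite sequences of real (or complex) numbers such that the p-norm is finite.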
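One can check that as p increases, the set $\ell^p$ grows larger. For example, the sequence
$$\left( 1, \frac{1}{2}, \ldots, \frac{1}{n}, \frac{1}{n+1}, \ldots \right)$$
is not in $\ell^1$, but it is in $\ell^p$ for p > 1, as the series
$$1^p + \frac{1}{2^p} + \cdots + \frac{1}{n^p} + \frac{1}{(n+1)^p} + \cdots$$
diverges for p = 1 (the harmonic series), but is convergent for p > 1.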
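The contrast between p = 1 and p > 1 for the sequence (1, 1/2, 1/3, ...) can be observed on partial sums. The sketch below, with an illustrative helper `partial_sum`, prints the partial sums of the series for p = 1 and p = 2. The first column keeps growing (roughly like the logarithm of the number of terms), while the second stays bounded.

```python
def partial_sum(p, n_terms):
    """Sum of 1 / k^p for k = 1, ..., n_terms."""
    return sum(1.0 / k ** p for k in range(1, n_terms + 1))

for n_terms in (10, 1000, 100000):
    # Columns: number of terms, partial sum for p = 1, partial sum for p = 2.
    print(n_terms, partial_sum(1, n_terms), partial_sum(2, n_terms))
```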
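One also defines the ∞-norm using the supremum:
$$\|x\|_\infty = \sup \left( |x_1|, |x_2|, \ldots, |x_n|, |x_{n+1}|, \ldots \right)$$
and the corresponding space $\ell^\infty$ of all bounded sequences. It turns out that
$$\|x\|_\infty = \lim_{p \to \infty} \|x\|_p$$
if the right-hand side is finite, or the left-hand side is infinite. Thus, we will consider $\ell^p$ spaces for 1 ≤ p ≤ ∞.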
The p-norm thus defined on $\ell^p$ is indeed a norm, and $\ell^p$ together with this norm is a Banach space. The fully general $L^p$ space is obtained, as seen below, by considering vectors, not only with finitely or countably-infinitely many components, but with "arbitrarily many components"; in other words, functions. An integral instead of a sum is used to define the p-norm.
An $L^p$ space may be defined as a space of functions for which the p-th power of the absolute value is Lebesgue integrable. More generally, let 1 ≤ p < ∞ and (S, Σ, μ) be a measure space. Consider the set of all measurable functions from S to $\mathbb{C}$ or $\mathbb{R}$ whose absolute value raised to the p-th power has finite integral, or equivalently, that
$$\|f\|_p \equiv \left( \int_S |f|^p \, d\mu \right)^{1/p} < \infty.$$
The set of such functions forms a vector space, with the following natural operations:
$$(f + g)(x) = f(x) + g(x), \qquad (\lambda f)(x) = \lambda f(x)$$
for every scalar λ.
That the sum of two p-th power integrable functions is again p-th power integrable follows from the inequality
$$\|f + g\|_p^p \le 2^{p-1} \left( \|f\|_p^p + \|g\|_p^p \right).$$
(This comes from the convexity of $t \mapsto t^p$ for p ≥ 1.)
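The convexity argument can be spelled out in two steps. The derivation below is a sketch for p ≥ 1, with a and b standing for arbitrary non-negative reals (notation introduced here for illustration).

```latex
% Convexity of t -> t^p on [0, \infty) for p >= 1, applied at the midpoint:
\left( \frac{a + b}{2} \right)^p \le \frac{a^p + b^p}{2}
\quad\Longrightarrow\quad
(a + b)^p \le 2^{p-1} \left( a^p + b^p \right), \qquad a, b \ge 0.

% Take a = |f(x)|, b = |g(x)|, use |f + g| <= |f| + |g|, and integrate over S:
\int_S |f + g|^p \, d\mu
\le \int_S \left( |f| + |g| \right)^p d\mu
\le 2^{p-1} \left( \int_S |f|^p \, d\mu + \int_S |g|^p \, d\mu \right) < \infty.
```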
In fact, more is true. Minkowski's inequality says the triangle inequality holds for $\| \cdot \|_p$. Thus the set of p-th power integrable functions, together with the function $\| \cdot \|_p$, is a seminormed vector space, which is denoted by $\mathcal{L}^p(S, \mu)$.
This can be made into a normed vector space in a standard way; one simply takes the quotient space with respect to the kernel of $\| \cdot \|_p$. Since for any measurable function f, we have that $\|f\|_p = 0$ if and only if f = 0 almost everywhere, the kernel of $\| \cdot \|_p$ does not depend upon p,
$$\mathcal{N} \equiv \ker(\| \cdot \|_p) = \{ f : f = 0 \ \mu\text{-almost everywhere} \}.$$
In the quotient space, two functions f and g are identified if f = g almost everywhere. The resulting normed vector space is, by definition,
$$L^p(S, \mu) \equiv \mathcal{L}^p(S, \mu) / \mathcal{N}.$$
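For p = ∞, the space $L^\infty(S, \mu)$ is defined as follows. We start with the set of all measurable functions from S to $\mathbb{C}$ or $\mathbb{R}$ which are bounded. Again two such functions are identified if they are equal almost everywhere. Denote this set by $L^\infty(S, \mu)$. For a function f in this set, its essential supremum serves as an appropriate norm:
$$\|f\|_\infty \equiv \inf \{ C \ge 0 : |f(x)| \le C \text{ for almost every } x \}.$$
As before, if there exists q < ∞ such that $f \in L^\infty(S, \mu) \cap L^q(S, \mu)$, then
$$\|f\|_\infty = \lim_{p \to \infty} \|f\|_p.$$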
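As a numerical illustration of the limit $\|f\|_p \to \|f\|_\infty$, the sketch below approximates the $L^p$ norm of f(x) = x on [0, 1] with Lebesgue measure by a midpoint rule. The function name, the choice of f, and the grid size are assumptions made for the example; the essential supremum of this f is 1, and the exact norm is $(p + 1)^{-1/p}$.

```python
def lp_norm_unit_interval(f, p, n=20000):
    """Midpoint-rule approximation of (integral over [0, 1] of |f|^p)^(1/p)."""
    h = 1.0 / n
    total = sum(abs(f((k + 0.5) * h)) ** p for k in range(n)) * h
    return total ** (1.0 / p)

f = lambda x: x
for p in (1, 2, 10, 50, 200):
    # Compare the approximation with the exact value (p + 1)^(-1/p).
    print(p, lp_norm_unit_interval(f, p), (p + 1.0) ** (-1.0 / p))
```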
For 1 ≤ p ≤ ∞, $L^p(S, \mu)$ is a Banach space. The fact that $L^p$ is complete is often referred to as the Riesz–Fischer theorem. Completeness can be checked using the convergence theorems for Lebesgue integrals.
When the underlying measure space S is understood, $L^p(S, \mu)$ is often abbreviated $L^p(\mu)$, or just $L^p$. The above definitions generalize to Bochner spaces.
Similar to the $\ell^p$ spaces, $L^2$ is the only Hilbert space among $L^p$ spaces. In the complex case, the inner product on $L^2$ is defined by
$$\langle f, g \rangle = \int_S f(x) \, \overline{g(x)} \, d\mu(x).$$
The additional inner product structure allows for a richer theory, with applications to, for instance, Fourier series and quantum mechanics. Functions in $L^2$ are sometimes called quadratically integrable functions, square-integrable functions or square-summable functions, but sometimes these terms are reserved for functions that are square-integrable in some other sense, such as in the sense of a Riemann integral (Titchmarsh 1976).
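To connect the inner product $\langle f, g \rangle = \int f \, \overline{g} \, d\mu$ with Fourier series, the sketch below approximates it on [0, 1] with Lebesgue measure by a midpoint rule and checks that the complex exponentials $x \mapsto e^{2\pi i m x}$ are orthonormal. The helper names `inner_product` and `e` and the grid size are illustrative assumptions.

```python
import cmath

def inner_product(f, g, n=2000):
    """Midpoint-rule approximation of the integral over [0, 1] of f * conj(g)."""
    h = 1.0 / n
    return sum(f((k + 0.5) * h) * g((k + 0.5) * h).conjugate() for k in range(n)) * h

def e(m):
    """The complex exponential x -> exp(2 pi i m x)."""
    return lambda x: cmath.exp(2j * cmath.pi * m * x)

print(abs(inner_product(e(1), e(1))))  # close to 1
print(abs(inner_product(e(1), e(3))))  # close to 0
```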
If we use complex-valued functions, the space $L^\infty$ is a commutative C*-algebra with pointwise multiplication and conjugation. For many measure spaces, including all sigma-finite ones, it is in fact a commutative von Neumann algebra. An element of $L^\infty$ defines a bounded operator on any $L^p$ space by multiplication.
For 1 ≤ p ≤ ∞ the $\ell^p$ spaces are a special case of $L^p$ spaces, when S = $\mathbb{N}$, and μ is the counting measure on $\mathbb{N}$. More generally, if one considers any set S with the counting measure, the resulting $L^p$ space is denoted $\ell^p(S)$. For example, the space $\ell^p(\mathbb{Z})$ is the space of all sequences indexed by the integers, and when defining the p-norm on such a space, one sums over all the integers. The space $\ell^p(n)$, where n is the set with n elements, is $\mathbb{R}^n$ with its p-norm as defined above. Like any Hilbert space, every space $L^2$ is linearly isometric to a suitable $\ell^2(I)$, where the cardinality of the set I is the cardinality of an arbitrary Hilbertian basis for this particular $L^2$.
Properties of $L^p$ spaces
The dual space (the Banach space of all continuous linear functionals) of $L^p(\mu)$ for 1 < p < ∞ has a natural isomorphism with $L^q(\mu)$, where q is such that 1/p + 1/q = 1 (i.e. q = p/(p − 1)). This isomorphism associates $g \in L^q(\mu)$ with the functional $\kappa_p(g) \in L^p(\mu)^*$ defined by
$$\kappa_p(g) : f \in L^p(\mu) \mapsto \int f g \, d\mu.$$
The fact that $\kappa_p(g)$ is well defined and continuous follows from Hölder's inequality. $\kappa_p : L^q(\mu) \to L^p(\mu)^*$ is a linear mapping which is an isometry by the extremal case of Hölder's inequality. It is also possible to show (for example with the Radon–Nikodym theorem) that any $G \in L^p(\mu)^*$ can be expressed this way: i.e., that $\kappa_p$ is onto. Since $\kappa_p$ is onto and isometric, it is an isomorphism of Banach spaces. With this (isometric) isomorphism in mind, it is usual to say simply that $L^q$ is the dual Banach space of $L^p$.
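For the counting measure on a finite set, the pairing $f \mapsto \int f g \, d\mu$ is a finite sum, and both Hölder's inequality and its extremal case can be checked directly. In the sketch below, the vectors and the exponent p = 3 are illustrative assumptions; the test function $f^* = \operatorname{sign}(g) |g|^{q-1}$ attains the bound, which is the reason the functional has norm exactly $\|g\|_q$.

```python
import math

def p_norm(x, p):
    """(sum |x_i|^p)^(1/p) for a finite sequence x."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def pairing(f, g):
    """The functional determined by g, applied to f (counting measure)."""
    return sum(a * b for a, b in zip(f, g))

p = 3.0
q = p / (p - 1.0)  # conjugate exponent, 1/p + 1/q = 1
g = [2.0, -1.0, 0.5]
f = [1.0, 4.0, -2.0]

# Hoelder's inequality: |<f, g>| <= ||f||_p * ||g||_q.
print(abs(pairing(f, g)) <= p_norm(f, p) * p_norm(g, q))  # True

# Extremal case: f_star = sign(g) * |g|^(q - 1) attains the bound.
f_star = [math.copysign(abs(t) ** (q - 1.0), t) for t in g]
print(pairing(f_star, g) / p_norm(f_star, p), p_norm(g, q))
```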
For 1 < p < ∞, the space $L^p(\mu)$ is reflexive. Let $\kappa_p$ be as above and let $\kappa_q : L^p(\mu) \to L^q(\mu)^*$ be the corresponding linear isometry. Consider the map from $L^p(\mu)$ to $L^p(\mu)^{**}$, obtained by composing $\kappa_q$ with the transpose (or adjoint) of the inverse of $\kappa_p$:
$$j_p : L^p(\mu) \xrightarrow{\ \kappa_q\ } L^q(\mu)^* \xrightarrow{\ (\kappa_p^{-1})^*\ } L^p(\mu)^{**}.$$
This map coincides with the canonical embedding J of $L^p(\mu)$ into its bidual. Moreover, the map $j_p$ is onto, as composition of two onto isometries, and this proves reflexivity.
If the measure μ on S is sigma-finite, then the dual of $L^1(\mu)$ is isometrically isomorphic to $L^\infty(\mu)$ (more precisely, the map $\kappa_1$ corresponding to p = 1 is an isometry from $L^\infty(\mu)$ onto $L^1(\mu)^*$).
The dual of $L^\infty$ is subtler. Elements of $L^\infty(\mu)^*$ can be identified with bounded signed finitely additive measures on S that are absolutely continuous with respect to μ. See ba space for more details. If we assume the axiom of choice, this space is much bigger than $L^1(\mu)$ except in some trivial cases. However, Saharon Shelah proved that there are relatively consistent extensions of Zermelo–Fraenkel set theory (ZF + DC + "Every subset of the real numbers has the Baire property") in which the dual of $\ell^\infty$ is $\ell^1$.
Colloquially, if 1 ≤ p < q ≤ ∞, then $L^p(S, \mu)$ contains functions that are more locally singular, while elements of $L^q(S, \mu)$ can be more spread out. Consider the Lebesgue measure on the half line (0, ∞). A continuous function in $L^1$ might blow up near 0 but must decay sufficiently fast toward infinity. On the other hand, continuous functions in $L^\infty$ need not decay at all but no blow-up is allowed. The precise technical result is the following. Suppose that 0 < p < q ≤ ∞. Then:
- $L^q(S, \mu) \subset L^p(S, \mu)$ iff S does not contain sets of finite but arbitrarily large measure, and
- $L^p(S, \mu) \subset L^q(S, \mu)$ iff S does not contain sets of non-zero but arbitrarily small measure.
Neither condition holds for the real line with the Lebesgue measure. In both cases the embedding is continuous, in that the identity operator is a bounded linear map from $L^q$ to $L^p$ in the first case, and $L^p$ to $L^q$ in the second. (This is a consequence of the closed graph theorem and properties of $L^p$ spaces.) Indeed, if the domain S has finite measure, one can make the following explicit calculation via Hölder's inequality:
$$\| \mathbf{1} \, |f|^p \|_1 \le \| \mathbf{1} \|_{q/(q-p)} \, \| |f|^p \|_{q/p}, \quad \text{leading to} \quad \|f\|_p \le \mu(S)^{1/p - 1/q} \, \|f\|_q.$$
The constant appearing in the above inequality is optimal, in the sense that the operator norm of the identity $I : L^q(S, \mu) \to L^p(S, \mu)$ is precisely
$$\|I\|_{q,p} = \mu(S)^{1/p - 1/q},$$
the case of equality being achieved exactly when |f| is constant μ-a.e., for example when f = 1 μ-a.e.
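The bound $\|f\|_p \le \mu(S)^{1/p - 1/q} \|f\|_q$ on a finite measure space can be tested on a toy example. The sketch below models S as three points with hypothetical weights `mu` (so μ(S) = 4); the weights, the function values and the exponents are all illustrative assumptions. The constant function attains the bound.

```python
def lp_norm(f, mu, p):
    """(sum |f_i|^p * mu_i)^(1/p) on a finite set with point masses mu_i."""
    return sum(abs(v) ** p * m for v, m in zip(f, mu)) ** (1.0 / p)

mu = [0.5, 1.5, 2.0]  # hypothetical point masses, total measure 4
f = [1.0, -3.0, 0.25]
p, q = 1.5, 4.0       # any exponents with p < q
total = sum(mu)
constant = total ** (1.0 / p - 1.0 / q)

# The L^p norm is bounded by the constant times the L^q norm.
print(lp_norm(f, mu, p) <= constant * lp_norm(f, mu, q))  # True

# Equality (up to rounding) for the constant function 1.
one = [1.0, 1.0, 1.0]
print(lp_norm(one, mu, p), constant * lp_norm(one, mu, q))
```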
Throughout this section we assume that 1 ≤ p < ∞.
Let (S, Σ, μ) be a measure space. An integrable simple function f on S is one of the form
$$f = \sum_{j=1}^n a_j \mathbf{1}_{A_j},$$
where $a_j$ is scalar, $A_j \in \Sigma$ has finite measure and $\mathbf{1}_{A_j}$ is the indicator function of the set $A_j$, for j = 1, ..., n. By construction of the integral, the vector space of integrable simple functions is dense in $L^p(S, \Sigma, \mu)$.
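More can be said when S is a normal topological space and Σ its Borel σ-algebra. Suppose V ⊂ S is an open set with μ(V) < ∞. It can be proved that for every Borel set A ∈ Σ contained in V, and for every ε > 0, there exist a closed set F and an open set U such that
$$F \subset A \subset U \subset V \quad \text{and} \quad \mu(U) - \mu(F) = \mu(U \setminus F) < \varepsilon.$$
It follows that there exists φ continuous on S such that
$$0 \le \varphi \le \mathbf{1}_V \quad \text{and} \quad \int_S | \mathbf{1}_A - \varphi | \, d\mu < \varepsilon.$$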
If S can be covered by an increasing sequence $(V_n)$ of open sets that have finite measure, then the space of p-integrable continuous functions is dense in $L^p(S, \Sigma, \mu)$. More precisely, one can use bounded continuous functions that vanish outside one of the open sets $V_n$.
This applies in particular when S = $\mathbb{R}^d$ and when μ is the Lebesgue measure. The space of continuous and compactly supported functions is dense in $L^p(\mathbb{R}^d)$. Similarly, the space of integrable step functions is dense in $L^p(\mathbb{R}^d)$; this space is the linear span of indicator functions of bounded intervals when d = 1, of bounded rectangles when d = 2 and more generally of products of bounded intervals.
Several properties of general functions in $L^p(\mathbb{R}^d)$ are first proved for continuous and compactly supported functions (sometimes for step functions), then extended by density to all functions. For example, it is proved this way that translations are continuous on $L^p(\mathbb{R}^d)$, in the following sense:
$$\forall f \in L^p(\mathbb{R}^d) : \quad \| \tau_t f - f \|_p \to 0 \quad \text{as } \mathbb{R}^d \ni t \to 0, \qquad \text{where } (\tau_t f)(x) = f(x - t).$$
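A worked example for a single step function shows what continuity of translations means and why the restriction p < ∞ matters. The computation below is a sketch for d = 1, with f the indicator function of [0, 1] and $(\tau_t f)(x) = f(x - t)$; this choice of f is an assumption made for illustration.

```latex
% f = \mathbf{1}_{[0,1]} on \mathbb{R}, and 0 < |t| \le 1.
% |\tau_t f - f| is the indicator of the symmetric difference of
% [t, 1 + t] and [0, 1], a set of Lebesgue measure 2|t|.
\| \tau_t f - f \|_p^p
  = \int_{\mathbb{R}} \left| \mathbf{1}_{[t, 1+t]}(x) - \mathbf{1}_{[0,1]}(x) \right|^p dx
  = 2 |t|,
\qquad\text{so}\qquad
\| \tau_t f - f \|_p = (2 |t|)^{1/p} \to 0 \quad (t \to 0).

% For p = \infty the same difference has essential supremum 1 for every t \ne 0:
\| \tau_t f - f \|_\infty = 1 \quad \text{for all } t \ne 0 .
```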
$L^p$ (0 < p < 1)
Let (S, Σ, μ) be a measure space. If 0 < p < 1, then $L^p(\mu)$ can be defined as above: it is the vector space of those measurable functions f such that
$$N_p(f) = \int_S |f|^p \, d\mu < \infty.$$
As before, we may introduce the p-norm $\|f\|_p = N_p(f)^{1/p}$, but $\| \cdot \|_p$ does not satisfy the triangle inequality in this case, and defines only a quasi-norm. The inequality $(a + b)^p \le a^p + b^p$, valid for a, b ≥ 0, implies that (Rudin 1991, §1.47)
$$N_p(f + g) \le N_p(f) + N_p(g),$$
and so the function
$$d_p(f, g) = N_p(f - g) = \|f - g\|_p^p$$
is a metric on $L^p(\mu)$. The resulting metric space is complete; the verification is similar to the familiar case when p ≥ 1.
In this setting $L^p$ satisfies a reverse Minkowski inequality, that is, for u, v in $L^p$,
$$\big\| \, |u| + |v| \, \big\|_p \ge \|u\|_p + \|v\|_p.$$
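The reverse Minkowski inequality $\| |u| + |v| \|_p \ge \|u\|_p + \|v\|_p$ for 0 < p < 1 can be probed numerically on finite sequences (the counting measure on a finite set). The sketch below uses p = 1/2 and pseudo-random vectors; the helper names, the seed and the vector length are illustrative assumptions.

```python
import random

def quasi_norm(x, p):
    """(sum |x_i|^p)^(1/p); a quasi-norm when 0 < p < 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def reverse_minkowski_gap(u, v, p):
    """|| |u| + |v| ||_p - (||u||_p + ||v||_p); non-negative when 0 < p < 1."""
    total = [abs(a) + abs(b) for a, b in zip(u, v)]
    return quasi_norm(total, p) - (quasi_norm(u, p) + quasi_norm(v, p))

random.seed(0)
p = 0.5
for _ in range(5):
    u = [random.uniform(-1.0, 1.0) for _ in range(6)]
    v = [random.uniform(-1.0, 1.0) for _ in range(6)]
    # Allow a tiny tolerance for floating-point rounding.
    print(reverse_minkowski_gap(u, v, p) >= -1e-12)
```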
The space $L^p$ for 0 < p < 1 is an F-space: it admits a complete translation-invariant metric with respect to which the vector space operations are continuous. It is also locally bounded, much like the case p ≥ 1. It is the prototypical example of an F-space that, for most reasonable measure spaces, is not locally convex: in $\ell^p$ or $L^p([0, 1])$, every open convex set containing the 0 function is unbounded for the p-quasi-norm; therefore, the 0 vector does not possess a fundamental system of convex neighborhoods. Specifically, this is true if the measure space S contains an infinite family of disjoint measurable sets of finite positive measure.
The only nonempty convex open set in $L^p([0, 1])$ is the entire space (Rudin 1991, §1.47). As a particular consequence, there are no nonzero continuous linear functionals on $L^p([0, 1])$: the dual space is the zero space. In the case of the counting measure on the natural numbers (producing the sequence space $L^p(\mu) = \ell^p$), the bounded linear functionals on $\ell^p$ are exactly those that are bounded on $\ell^1$, namely those given by sequences in $\ell^\infty$. Although $\ell^p$ does contain non-trivial convex open sets, it fails to have enough of them to give a base for the topology.
The situation of having no continuous linear functionals is highly undesirable for the purposes of doing analysis. In the case of the Lebesgue measure on $\mathbb{R}^n$, rather than work with $L^p$ for 0 < p < 1, it is common to work with the Hardy space $H^p$ whenever possible, as this has quite a few linear functionals: enough to distinguish points from one another. However, the Hahn–Banach theorem still fails in $H^p$ for p < 1 (Duren 1970, §7.5).
$L^0$, the space of measurable functions
The vector space of (equivalence classes of) measurable functions on (S, Σ, μ) is denoted $L^0(S, \Sigma, \mu)$ (Kalton, Peck & Roberts 1984). By definition, it contains all the $L^p$, and is equipped with the topology of convergence in measure. When μ is a probability measure (i.e., μ(S) = 1), this mode of convergence is named convergence in probability.
The description is easier when μ is finite. If μ is a finite measure on (S, Σ), the 0 function admits for the convergence in measure the following fundamental system of neighborhoods
$$V_\varepsilon = \big\{ f : \mu \big( \{ x : |f(x)| > \varepsilon \} \big) < \varepsilon \big\}, \qquad \varepsilon > 0.$$
The topology can be defined by any metric d of the form
$$d(f, g) = \int_S \varphi \big( |f(x) - g(x)| \big) \, d\mu(x),$$
where φ is bounded continuous concave and non-decreasing on [0, ∞), with φ(0) = 0 and φ(t) > 0 when t > 0 (for example, φ(t) = min(t, 1)). Such a metric is called Lévy-metric for $L^0$. Under this metric the space $L^0$ is complete (it is again an F-space). The space $L^0$ is in general not locally bounded, and not locally convex.
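The difference between convergence in measure and convergence in $L^1$ can be seen with the metric $d(f, g) = \int \min(|f - g|, 1) \, d\mu$. The sketch below discretizes a probability space into N equally weighted points (an assumption made for illustration) and uses hypothetical "spike" functions equal to n on a set of measure 1/n: their $L^1$ norm stays equal to 1, while their distance to 0 in the metric d shrinks like 1/n.

```python
def levy_metric(f, g, mu):
    """Integral of min(|f - g|, 1) over a finite measure space with masses mu."""
    return sum(min(abs(a - b), 1.0) * m for a, b, m in zip(f, g, mu))

def l1_norm(f, mu):
    """Integral of |f| over the same finite measure space."""
    return sum(abs(a) * m for a, m in zip(f, mu))

N = 1000
mu = [1.0 / N] * N  # uniform probability measure on N points
zero = [0.0] * N

def spike(n):
    """Function equal to n on the first N // n points and 0 elsewhere."""
    return [float(n) if k < N // n else 0.0 for k in range(N)]

for n in (1, 10, 100):
    f = spike(n)
    # The L^1 norm stays near 1 while the distance to 0 shrinks like 1 / n.
    print(n, l1_norm(f, mu), levy_metric(f, zero, mu))
```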
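For the infinite Lebesgue measure λ on $\mathbb{R}^n$, the definition of the fundamental system of neighborhoods could be modified as follows
$$W_\varepsilon = \left\{ f : \lambda \big( \{ x : |f(x)| > \varepsilon \text{ and } |x| < 1/\varepsilon \} \big) < \varepsilon \right\}.$$
The resulting space $L^0(\mathbb{R}^n, \lambda)$ coincides as topological vector space with $L^0(\mathbb{R}^n, g(x) \, d\lambda(x))$, for any positive λ-integrable density g.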
If f is in $L^p(S, \mu)$ for some p with 1 ≤ p < ∞, then by Markov's inequality, for all t > 0,
$$\mu \{ x \in S : |f(x)| > t \} \le \frac{\|f\|_p^p}{t^p}.$$
A function f is said to be in the space weak $L^p(S, \mu)$, or $L^{p,w}(S, \mu)$, if there is a constant C > 0 such that, for all t > 0,
$$\mu \{ x \in S : |f(x)| > t \} \le \frac{C^p}{t^p}.$$
The best constant C for this inequality is the $L^{p,w}$-norm of f, and is denoted by
$$\|f\|_{p,w} = \sup_{t > 0} \, t \, \mu \{ x \in S : |f(x)| > t \}^{1/p}.$$
The weak $L^p$ coincide with the Lorentz spaces $L^{p,\infty}$, so this notation is also used to denote them.
The $L^{p,w}$-norm is not a true norm, since the triangle inequality fails to hold. Nevertheless, for f in $L^p(S, \mu)$,
$$\|f\|_{p,w} \le \|f\|_p,$$
and in particular $L^p(S, \mu) \subset L^{p,w}(S, \mu)$. Under the convention that two functions are equal if they are equal μ almost everywhere, the spaces $L^{p,w}$ are complete (Grafakos 2004).
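A standard example shows that the inclusion of $L^p$ in weak $L^p$ is strict. The computation below is a sketch for p = 1, taking S = (0, ∞) with Lebesgue measure μ and f(x) = 1/x (choices made here for illustration): f satisfies the weak-type bound with C = 1 but is not integrable.

```latex
% f(x) = 1/x on S = (0, \infty) with Lebesgue measure \mu.
\mu \{ x > 0 : 1/x > t \} = \mu \big( (0, 1/t) \big) = \frac{1}{t}
\qquad (t > 0),

% so the weak L^1 quantity is finite:
\sup_{t > 0} \, t \, \mu \{ x > 0 : |f(x)| > t \} = 1 < \infty,

% while f is not in L^1:
\int_0^\infty \frac{dx}{x} = \infty .
```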
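For any 0 < r < p the expression
$$||| f |||_{L^{p,\infty}} = \sup_{0 < \mu(E) < \infty} \mu(E)^{-1/r + 1/p} \left( \int_E |f|^r \, d\mu \right)^{1/r}$$
is comparable to the $L^{p,w}$-norm. Further, in the case p > 1, this expression defines a norm if r = 1. Hence for p > 1 the weak $L^p$ spaces are Banach spaces (Grafakos 2004).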
Weighted $L^p$ spaces
As before, consider a measure space (S, Σ, μ). Let w : S → [0, ∞) be a measurable function. The w-weighted $L^p$ space is defined as $L^p(S, w \, d\mu)$, where w dμ means the measure ν defined by
$$\nu(A) \equiv \int_A w(x) \, d\mu(x), \qquad A \in \Sigma,$$
so that the norm for $L^p(S, w \, d\mu)$ is explicitly
$$\|u\|_{L^p(S, w \, d\mu)} \equiv \left( \int_S w(x) \, |u(x)|^p \, d\mu(x) \right)^{1/p}.$$
As $L^p$-spaces, the weighted spaces have nothing special, since $L^p(S, w \, d\mu)$ is equal to $L^p(S, d\nu)$. But they are the natural framework for several results in harmonic analysis (Grafakos 2004); they appear for example in the Muckenhoupt theorem: for 1 < p < ∞, the classical Hilbert transform is defined on $L^p(\mathbb{T}, \lambda)$ where $\mathbb{T}$ denotes the unit circle and λ the Lebesgue measure; the (nonlinear) Hardy–Littlewood maximal operator is bounded on $L^p(\mathbb{R}^n, \lambda)$. Muckenhoupt's theorem describes weights w such that the Hilbert transform remains bounded on $L^p(\mathbb{T}, w \, d\lambda)$ and the maximal operator on $L^p(\mathbb{R}^n, w \, d\lambda)$.
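On a finite set the identification of the weighted space with an ordinary $L^p$ space for the measure ν = w dμ is a one-line computation. The sketch below uses hypothetical point masses `mu`, weights `w` and values `u` (all illustrative assumptions) and checks that the two ways of computing the norm agree.

```python
def weighted_lp_norm(u, w, mu, p):
    """(sum w_i * |u_i|^p * mu_i)^(1/p): the norm with weight w against mu."""
    return sum(wi * abs(ui) ** p * mi for ui, wi, mi in zip(u, w, mu)) ** (1.0 / p)

def lp_norm(u, nu, p):
    """(sum |u_i|^p * nu_i)^(1/p): the unweighted norm against a measure nu."""
    return sum(abs(ui) ** p * ni for ui, ni in zip(u, nu)) ** (1.0 / p)

mu = [1.0, 0.5, 2.0]   # hypothetical point masses of the base measure
w = [2.0, 0.0, 0.25]   # hypothetical non-negative weight
u = [1.0, -5.0, 3.0]
p = 2.0

nu = [wi * mi for wi, mi in zip(w, mu)]  # the measure nu = w dmu
print(weighted_lp_norm(u, w, mu, p), lp_norm(u, nu, p))  # the two values agree
```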
$L^p$ spaces on manifolds
One may also define spaces $L^p(M)$ on a manifold, called the intrinsic $L^p$ spaces of the manifold, using densities.