Time and Band Limiting for Matrix Valued Functions, an Example

The main purpose of this paper is to extend, to a situation involving matrix valued orthogonal polynomials and spherical functions, a result that traces its origin and its importance to work of Claude Shannon in laying the mathematical foundations of information theory, and to a remarkable series of papers by D. Slepian, H. Landau and H. Pollak. To our knowledge, this is the first example showing, in a non-commutative setup, that a bispectral property implies that the corresponding global operator of "time and band limiting" admits a commuting local operator. This is a noncommutative analog of the famous prolate spheroidal wave operator.


Introduction
In a ground-breaking paper laying down the mathematical foundations of communication theory, Claude Shannon [S] considers a basic problem in harmonic analysis and signal processing: how to best concentrate a function both in physical and frequency space. This issue remained an important part of the work of C. Shannon for several years after the publication of this paper. The problem had appeared earlier in several versions, and one should at least mention the role of the Heisenberg inequality in this context: for a nice and simple proof of it, due to W. Pauli, see Hermann Weyl's book [W].
What is really novel in the way Shannon (and people working close to him) looked at this problem is the following question: suppose you consider an unknown signal f(t) of finite duration, i.e. the signal is non-zero only in the interval [−T, T]. The data you have are the values of the Fourier transform F(k) of f for values of k in the interval [−W, W]. What is the best use you can make of this data?
In practice, the values of F(k) will be corrupted by noise and one is dealing with a typical situation in signal processing: recovering an image from partial and noisy data in the presence of some a priori information.
This problem was treated originally by Shannon himself, but a full solution had to wait for the joint work, in different combinations, of three remarkable workers at Bell Labs in the 1960's: David Slepian, Henry Landau and Henry Pollak, see [SLP1, SLP2, SLP3, SLP4, SLP5].
A very good account of this development is a pair of papers by David Slepian, [S1, S2]. The first one is essentially the second Shannon lecture, given at the International Symposium on Information Theory in 1975. The abstract starts with the sentence "It is easy to argue that real signals must be bandlimited. It is also easy to argue that they cannot be so". The ideas in this paper took their definitive form in [M].
The second paper is on the occasion of the John von Neumann lecture given at the SIAM 30th anniversary meeting in 1982. Here is a quote from it: "There was a lot of serendipity here, clearly. And then our solution, too, seemed to hinge on a lucky accident, namely that we found a second-order differential operator that commuted with an integral operator that was at the heart of the problem".
What these workers found was that instead of looking for the unknown f(t) itself one should consider a certain integral operator with discrete spectrum in the open interval (0, 1) and a remarkable "spectral gap": about [4WT] (the integer part of 2W × 2T) eigenvalues are close to one, and all the remaining ones are essentially zero. They argue that in the presence of noisy data one should try to compute the projection of f(t) on the linear span of the eigenfunctions with "large" eigenvalues. The effective computation of these eigenfunctions is made possible by replacing the integral operator by the commuting differential one alluded to by D. Slepian (both have simple spectrum). From a theoretical point of view the eigenfunctions are the same, but using the differential operator instead of the integral one gives a manageable numerical problem. For a very recent account of several computational issues see [ORX].
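The size of this spectral gap is easy to observe numerically in the discrete analogue of the problem (Slepian's discrete prolate spheroidal sequences). The following sketch, which assumes only numpy and deals with the classical scalar setup rather than the operators of this paper, builds the sinc-kernel matrix for N samples and bandwidth W and counts the eigenvalues that are not essentially zero:

```python
import numpy as np

# Discrete analogue of time-and-band limiting (the DPSS setup of Slepian):
# time-limit to N samples, band-limit to frequencies |f| <= W.  The
# resulting operator has the sinc kernel below; its eigenvalues lie in
# (0, 1) and roughly 2*W*N of them are close to one, the rest near zero.
N, W = 128, 0.2
diff = np.arange(N)[:, None] - np.arange(N)[None, :]
safe = np.where(diff == 0, 1, diff)           # avoid 0/0 on the diagonal
A = np.where(diff == 0, 2 * W, np.sin(2 * np.pi * W * diff) / (np.pi * safe))
eig = np.sort(np.linalg.eigvalsh(A))[::-1]    # eigenvalues, descending
large = int(np.sum(eig > 0.5))                # eigenvalues above the plunge
print(large, 2 * W * N)                       # large is close to 2*W*N = 51.2
```

The count of eigenvalues above 1/2 tracks the time-bandwidth product 2WN, mirroring the [4WT] count quoted above.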
We still have to answer the question: How do you explain the existence of this local commuting operator?
To this day nobody has a simple explanation for this miracle. Indeed, there has been a systematic effort to see if the "bispectral property", first considered in [DG], guarantees the commutativity of these two operators, a global and a local one. A few papers where this question has been taken up include [G3, G4, G5, G6, GLP, GY, P1, P2].
The results in the present paper are a (first) natural extension of the work in [G3, GLP], where the classical orthogonal polynomials played a central role, to a matrix-valued case involving matrix orthogonal polynomials. In a case such as in [G3, GLP], or in the present paper, where physical and frequency space are of a different nature (one is continuous and the other one is discrete), one can deal (as explained in [G3, GLP]) either with an integral operator or with a full matrix. Each of these global objects will depend on two parameters that play the role of (T, W) in Shannon's case, and one will be looking for a commuting local object, i.e. a second order differential operator or a tridiagonal matrix. In this paper we will be dealing with a full matrix and a (block) tridiagonal one.
The study of the bispectral property has taken a life of its own, including a look at it in the non-commutative case when the operators in question have matrix valued coefficients. A sample of papers dealing with this includes [BHY,CG,DG1,G,GPT1,GPT5,G1,G2,G7,GH3,GH4,GH7,GHH,GI,GLP,GrTi,SZ,Tir,Z]. In some cases this has had unexpected pay-offs such as in [T2].
Finally a word about possible applications. The paper [GLP] was written with no particular application in mind, but with the expectation that the analysis of functions defined on the sphere could benefit from it. A few years later some applications did emerge, see [SD, SDW] and its references. One can only hope that the present paper dealing with matrix valued functions defined on spheres will find a natural application in the future.

Preliminaries
In this paper the role of the real line will be taken up by the n-dimensional sphere. We will consider 2 × 2 matrix valued functions defined on the sphere with the appropriate invariance that makes them functions of the colatitude θ, and we will use x = cos(θ) as the variable. The role of the Fourier transform will be taken by the expansion of our functions in terms of a basis of matrix valued orthogonal polynomials described below. This is similar to the situation discussed in [GLP], except for the crucial fact that our functions are now matrix valued. This situation has, to the best of our knowledge, not been considered before.
The matrix valued orthogonal polynomials considered here are those studied in [PZ], arising from the spherical functions of fundamental representations associated to the n-dimensional sphere S n ≃ G/K, where (G, K) = (SO(n + 1), SO(n)).
These spherical functions give rise to sequences $\{P_w\}_{w \geq 0}$ of matrix orthogonal polynomials depending on two parameters, n and p ∈ R with 0 < p < n. We recall that $C^{\lambda}_w$ is a polynomial of degree w whose leading coefficient is $\frac{2^w (\lambda)_w}{w!}$, where $(a)_w = a(a+1)\cdots(a+w-1)$ denotes the Pochhammer symbol.
In particular we have .
Let us observe that $\deg(P_w) = w$ and that the leading coefficient of $P_w$ is a non-singular scalar matrix. The matrix polynomials $\{P_w\}_{w \geq 0}$ are orthogonal with respect to the matrix valued inner product associated to the weight matrix $W_{p,n}$. Let us observe that by changing p into n − p the weight matrices are conjugated to each other. In fact, by taking $J = \left(\begin{smallmatrix} 0 & 1 \\ 1 & 0 \end{smallmatrix}\right)$ we get

(2)  $J W_{p,n} J^* = W_{n-p,n}$.
As a consequence of this fact (or directly from the explicit definition of $P_w$) we have that $(P_w)_{22}$, the (2, 2) entry of $P_w$, is the same as the entry $(P_w)_{11}$ after replacing p by n − p. Likewise the (2, 1) entry of $P_w$, $(P_w)_{21}$, is the entry $(P_w)_{12}$ after replacing p by n − p.
We have that $\langle P_w, P_w \rangle$ is always a diagonal matrix, whose entries one can verify explicitly. We consider the corresponding orthonormal sequence of matrix polynomials $\{Q_w\}_{w \geq 0}$ and display its first elements.
The matrix M

Given the sequence of matrix orthogonal polynomials $\{P_w\}_{w \geq 0}$ we fix a natural number N and α ∈ (−1, 1), and consider the matrix M of total size 2(N + 1) × 2(N + 1) whose (i, j)-block is the 2 × 2 matrix obtained by taking the inner product of the i-th and j-th normalized matrix valued orthogonal polynomials over the interval [−1, α]. It should be clear that the restriction to the interval [−1, α] implements "band-limiting", while the restriction to the range 0, 1, . . . , N takes care of "time-limiting". In the language of [GLP], where we were dealing with scalar valued functions defined on spheres, the first restriction gives us a "spherical cap" while the second one amounts to chopping the expansion in spherical harmonics.
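A scalar toy version of M makes this construction concrete. The sketch below assumes numpy and uses orthonormal Legendre polynomials on the interval in place of the matrix valued polynomials of this paper:

```python
import numpy as np
from numpy.polynomial import legendre

# Scalar analogue of the matrix M: entries are inner products of the
# orthonormal Legendre polynomials q_0, ..., q_N taken over [-1, alpha]
# instead of the full interval [-1, 1].
N, alpha = 10, 0.3
deg = N + 1
t, u = legendre.leggauss(2 * deg)            # Gauss quadrature on [-1, 1]
x = (alpha + 1) / 2 * t + (alpha - 1) / 2    # map nodes to [-1, alpha]
w = (alpha + 1) / 2 * u                      # rescaled weights
# rows of Q: orthonormal Legendre polynomials evaluated at the nodes
Q = np.array([np.sqrt((2 * n + 1) / 2) * legendre.legval(x, [0] * n + [1])
              for n in range(deg)])
M = (Q * w) @ Q.T                            # M[i, j] = <q_i, q_j> on [-1, alpha]
ev = np.linalg.eigvalsh(M)                   # eigenvalues lie in (0, 1)
```

As in the scalar case of [GLP], the eigenvalues of this matrix cluster near 1 and 0 with a sharp plunge in between.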
We gather here a few important properties of the matrix M.
The entries of the matrix $M = (M_{rs})_{1 \leq r,s \leq 2(N+1)}$ are related to the entries of the 2 × 2 block matrices $M_{i,j}$ from the definition (5). The weight matrices $W_{p,n}$ and $W_{n-p,n}$ are conjugated to each other; denoting by $M_{i,j}(p)$ the 2 × 2 block with parameter p, we get that the (1, 2) entry of $M_{i,j}$ is equal to the (2, 1) entry after replacing p by n − p.
In Section 6 we will give detailed information about the entries of M .

The commutant of M contains block tridiagonal matrices
The aim of this section is to find all block tridiagonal symmetric matrices L such that LM = ML. Notice that in principle there is no guarantee that we will find any such L except for scalar multiples of the identity. For the problem at hand we need to exhibit matrices L that have simple spectrum, so that their eigenvectors will automatically be eigenvectors of M.
Our main result is that the vector space of such matrices L has dimension 4: it consists of the linear span of the identity and of the matrices $L_1$, $L_2$ and $L_3$ given below. We only indicate explicitly the non-zero entries of these symmetric matrices, for k = 1, . . . , N + 1.
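Since LM = ML is linear in the entries of L, the dimension of such a commutant can be checked in examples by a nullspace computation. The sketch below, assuming numpy, does this for the scalar Legendre analogue of M rather than for the matrix valued case treated here:

```python
import numpy as np
from numpy.polynomial import legendre

# Scalar analogue of M: inner products of orthonormal Legendre
# polynomials taken over [-1, alpha].
N, alpha = 8, 0.4
deg = N + 1
t, u = legendre.leggauss(2 * deg)
x = (alpha + 1) / 2 * t + (alpha - 1) / 2
w = (alpha + 1) / 2 * u
Q = np.array([np.sqrt((2 * n + 1) / 2) * legendre.legval(x, [0] * n + [1])
              for n in range(deg)])
M = (Q * w) @ Q.T

def tridiag(v):
    """Symmetric tridiagonal matrix: diagonal v[:deg], off-diagonal v[deg:]."""
    return np.diag(v[:deg]) + np.diag(v[deg:], 1) + np.diag(v[deg:], -1)

# [L, M] = 0 is a linear condition on the 2*deg - 1 free entries of L;
# assemble the linear map and compute its numerical nullspace.
npar = 2 * deg - 1
A = np.array([(tridiag(b) @ M - M @ tridiag(b)).ravel()
              for b in np.eye(npar)]).T
sv = np.linalg.svd(A, compute_uv=False)
null_dim = int(np.sum(sv < 1e-8 * sv[0]))
# null_dim >= 2: the identity plus a nontrivial tridiagonal matrix
# commuting with M, matching the classical scalar result recalled below.
```

In the scalar case the nullspace is spanned by the identity and a single nontrivial tridiagonal matrix; the dimension-4 statement above is the matrix valued counterpart of this computation.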
In the scalar case the matrix L is unique up to shifts and scaling; in the 2 × 2 case at hand this is no longer true. This extra freedom can be traced back to a phenomenon first mentioned in [DG2].

The spectrum of L 1
Here we give a "hands on" argument to prove that the matrix L 1 has simple spectrum. From this, and the commutativity established earlier, it follows that every eigenvector of L 1 is automatically an eigenvector of M , and we have achieved our goal: for each value of the parameters (α, N ) we have a numerically efficient way to compute the eigenvectors of M.
First observe the structure of the matrix $L_1$. For N = 3 all the indicated entries $\ell_{i,j} = (L_1)_{i,j}$ are nonzero; see Section 4.
For general N, with the notation $E_{i,j}$, the matrix $L_1$ is block tridiagonal (note that $\ell_{0,2} = \ell_{2N+2,2N+4} = 0$), with 2 × 2 blocks whose coefficients $\ell_{i,j} = (L_1)_{i,j}$ were given in Section 4.
Assume that λ is an eigenvalue of $L_1$ (one can see by induction that it is non-zero) and denote by X one of its eigenvectors. We will show that X is a scalar multiple of a certain vector that depends only on the matrix $L_1$ and on the eigenvalue in question. This shows that the geometric multiplicity of λ is one.
Our strategy is now clear: by successively solving the equations (14), and using the relation between $x_{2j}$ and $x_{2j-1}$, we obtain expressions for $x_2, x_3, x_4, \ldots$ as quantities built out of $L_1$ and λ, all multiplied by the free parameter $x_1$.
This goes on until we try to solve the last equation in (14), with j = N + 1, where we meet our only restriction. With the variables $x_2, x_3, x_4, \ldots, x_{2N+1}, x_{2N+2}$ all expressed in the desired form (i.e. as certain multiples of $x_1$), the last equation becomes a polynomial in the elements of $L_1$ and the unknown quantity λ, all multiplied by $x_1$. Except for this factor, this is just the characteristic polynomial of $L_1$, and by assumption λ is a root of it.
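The same forward-substitution argument can be run numerically. Here is a sketch, assuming numpy and using a generic symmetric tridiagonal matrix with nonzero off-diagonal entries in place of $L_1$: the rows of $(L - \lambda I)X = 0$ determine $x_2, x_3, \ldots$ from the free parameter $x_1$, and the last row holds because λ is a root of the characteristic polynomial.

```python
import numpy as np

# A generic symmetric tridiagonal matrix with off-diagonal entries kept
# away from zero, standing in for L_1.
rng = np.random.default_rng(0)
n = 7
d = rng.normal(size=n)                   # diagonal entries
e = rng.uniform(0.5, 1.5, size=n - 1)    # nonzero off-diagonal entries
L = np.diag(d) + np.diag(e, 1) + np.diag(e, -1)
lam = np.linalg.eigvalsh(L)[0]           # any eigenvalue will do

# Solve (L - lam I) x = 0 row by row, starting from the free parameter x_1.
x = np.zeros(n)
x[0] = 1.0
x[1] = (lam - d[0]) * x[0] / e[0]        # row 0 determines x[1]
for j in range(1, n - 1):                # row j determines x[j + 1]
    x[j + 1] = ((lam - d[j]) * x[j] - e[j - 1] * x[j - 1]) / e[j]

# The last row is not used to define x; it vanishes precisely because
# lam is an eigenvalue (the characteristic-polynomial condition).
residual = e[n - 2] * x[n - 2] + (d[n - 1] - lam) * x[n - 1]
```

Since every component of X is a fixed multiple of $x_1$, each eigenvalue has a one-dimensional eigenspace, which is the simplicity statement needed above.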

Explicit expression for the matrix M
Writing down nice and explicit expressions for the entries of M is not easy. We display here some of the entries whose expressions are not too involved.
We start by computing explicitly the entries of the block $M_{1,1}$. By the definition of the weight matrix given in (1) and the explicit expression of $Q_0$ given in (4), one obtains a formula involving the factors $\pi$, $(p+1)$, $(n-p+1)$ and $\Gamma(\frac{n}{2}+1)$.
6.1. The matrix $\widetilde{M}$. To simplify some calculations that follow, and to avoid some square roots, we consider the following matrix $\widetilde{M}$. This matrix is close to the matrix M and is defined as a matrix of size 2(N + 1) × 2(N + 1) whose (i, j)-block is the 2 × 2 matrix given by the inner product of the i-th and j-th matrix orthogonal polynomials over [−1, α]. We observe that $\widetilde{M}$ is defined in the same way as M, using the sequence $(P_w)_{w \geq 0}$ of matrix orthogonal polynomials instead of the orthonormal polynomials $(Q_w)_{w \geq 0}$. The two sequences are related by $Q_w = S_w P_w$, where $S_w = \|P_w\|^{-1}$ (see (3)). Thus the blocks of the matrices M and $\widetilde{M}$ are related by $M_{i,j} = S_i \widetilde{M}_{i,j} S_j$, and since the matrices $S_j$ are diagonal this relation can be written out entrywise.

6.2. Non-diagonal elements for all the blocks of M. By direct computation we obtained the following expression for the non-diagonal elements of each 2 × 2 block $M_{i,j}$.
For 0 ≤ i, j ≤ N we get expressions in which $C^{\frac{n+1}{2}}_n(\alpha)$ denotes the n-th Gegenbauer polynomial evaluated at α.
By using (7) we have that the element $(M_{j-1,k-1})_{22}$ is obtained from $(M_{j-1,k-1})_{11}$ by replacing p by n − p.
Since we already know the explicit value of $M_{1,1}$ (see (15)), these expressions allow us to determine all the diagonal elements of M.