A System of Multivariable Krawtchouk Polynomials and a Probabilistic Application

The one variable Krawtchouk polynomials, a special case of the $_2F_1$ function did appear in the spectral representation of the transition kernel for a Markov chain studied a long time ago by M. Hoare and M. Rahman. A multivariable extension of this Markov chain was considered in a later paper by these authors where a certain two variable extension of the $F_1$ Appel function shows up in the spectral analysis of the corresponding transition kernel. Independently of any probabilistic consideration a certain multivariable version of the Gelfand-Aomoto hypergeometric function was considered in papers by H. Mizukawa and H. Tanaka. These authors and others such as P. Iliev and P. Tertwilliger treat the two-dimensional version of the Hoare-Rahman work from a Lie-theoretic point of view. P. Iliev then treats the general $n$-dimensional case. All of these authors proved several properties of these functions. Here we show that these functions play a crucial role in the spectral analysis of the transition kernel that comes from pushing the work of Hoare-Rahman to the multivariable case. The methods employed here to prove this as well as several properties of these functions are completely different to those used by the authors mentioned above.


Introduction
The genesis of this paper goes back to a joint work of Hoare and Rahman [8] in 1983, where the idea of the so-called "Cumulative Bernoulli Trials" (CBT) was introduced. The essential elements of this probabilistic model are as follows: A player (say, of poker dice) rolls a subset, i, of a fixed number, N , of dice for success (say, "aces") with a certain probability α. The player is allowed to save his/her k successes, and given a second chance, namely, to mix the i − k unsuccessful dice with the previously N − i unrolled ones. The player then rolls the combined dice, numbering N − k for success with probability β. Suppose the number of successes in this second try is j − k, which, when combined with the previously earned points, k, gives the total number of successes as j. If the number of successes after these two rolls is defined to be the state of our system we get a Markov chain by iterating this scheme which took as from state i to state j.
More explicitly the transition probability matrix of this Markov chain with state space 0, 1, 2, . . . , N is given by is the binomial distribution. The stationary distribution φ 0 (i) corresponding to this process can be defined by N j=0 K(i, j)φ 0 (j) = φ 0 (i).
A sufficient condition for some φ 0 (i) to satisfy this condition is K(i, j)φ 0 (j) = K(j, i)φ 0 (i), (1.3) and also that the summation part of K(j, i) is symmetric in i and j. Use of (1.1), (1.2) and (1.3) gives with D 1 = 1 + αβ 1−α . As determined in [8] the eigenvalues λ k of the eigenvalue equation where the hypergeometric function 2 F 1 in one variable i is just the Krawtchouk polynomials, see [10]. Clearly, by use of the orthogonality property of Krawtchouk polynomials we can write down the spectral representation of K(j, i), namely, It is obvious that the above prototype of dice-tossing and "saving" successes can be extended, on one hand, to more practical situations in which Bernoulli trials may be accumulated (for example, in 'infection-therapy' models, see [8,9]), and on the other hand, to multiple variables where one defines various kinds of success (say, aces, kings, queens, . . . ). Suppose we have a process where the total number of dice is N , of which n subsets i 1 , . . . , i n are tossed separately for successes of n different kinds with probabilities α 1 , α 2 , . . . , α n , respectively. Let k 1 , k 2 , . . . , k n be the number of successes in each category (that is, k 1 aces, k 2 kings, . . . , etc.). Mix the "unsuccessful" i 1 − k 1 , i 2 − k 2 , . . . , i n − k n trials with the remaining N − i 1 − i 2 − · · · − i n dice. The player "saves" the k successes and is allowed to try again with the resulting N −k 1 −· · ·−k n dice with probabilities β 1 , β 2 , . . . , β n of producing j 1 − k 1 aces, j 2 − k 2 kings, . . . , etc.
With the definition for the n-fold multinomial distribution b n given below we get that the transition probability kernel from the state (i 1 , i 2 , . . . , i n ) to (j 1 , j 2 , . . . , j n ) is The stationary distribution in this case, as before, is defined by which, together with (1.4) gives, as a sufficient condition which is the n-fold multinomial distribution. Using (1.4) and (1.5) we find that, in perfect analogy to the one-variable case, the η's are related to the probability parameters α's and β's in the following way: One of the questions we will address in this paper is this: what are the eigenvalues and eigenfunctions of K(i; j)? In other words, we will seek solutions of the eigenvalue problem: (1.7) In the single-variable case the eigenfunctions are simply φ 0 (i) times the ordinary Krawtchouk polynomials, as we mentioned earlier. So it is reasonable to expect that in n dimensions (n ≥ 2) the solutions will be an appropriate extension of these polynomials. The question is: which one? Even in the n = 2 case the question is not quite as straightforward as it would seem. In fact, the same authors, Hoare and Rahman, wrestled with this problem for a number of years until they were able to show that the Krawtchouk limit of the 9 − j symbols of quantum angular momentum theory in physics provides the answer which, written in slightly more convenient notation is where (t, u, v, w) satisfy certain relationships with η 1 and η 2 , and the F represents an iterate or a 2-variable extension of the familiar F 1 Appell function: that is, subject to conditions for convergence in case they are infinite sums. Unbeknownst to the authors, Hoare and Rahman, at the time they published the paper [9], a general n variable extension of the F (2) 1 in (1.8), namely, a special case of Gelfand hypergeometric function [2,4], was known to, among others, the Japanese authors Mizukawa and Tanaka [15], who used them to prove their orthogonality in some special cases. Later, Mizukawa [13] gave a complete orthogonality proof of (1.9) using Gelfand pairs, then [15] used character algebras and closely following this proof came Iliev and Terwilliger's proofs, first for n = 2 [11], then for general n in [12], in which they used tools from Lie algebra theory. In these proofs the authors found it convenient to use 1−u ij instead of u ij as parameters in (1.9), and also to use Our second objective in this paper is to give an alternate proof of orthogonality by using the more elementary method of hypergeometric functions and their various transformation properties and integral representation, and in doing so we find no special advantage of using 1 − u ij instead of u ij , or of using (1.10). Also, we believe that elementary and perhaps a bit cumbersome as it may be, our method yields a byproduct that throws some light on the underlying geometrical structure of these polynomials, which the other authors may have overlooked.
In Section 2 we will list the transformation and integral representation formulas for the multivariable hypergeometric functions that we shall use throughout this paper. Section 3 will be devoted to obtaining the necessary conditions of orthogonality, while in Sections 4 and 5 we shall deal with the sufficient conditions that will simultaneously establish the orthogonality relation where η are the parameters for the dual orthogonality. The relationship between the u's and the η's and η's will be given in latter sections, specially Section 6.
In Section 6 we shall examine the geometrical implications of the relationships between the u's that result from the necessary and sufficient conditions, while the last section will be aimed at proving that the n-dimensional extension of the function in (1.8) are precisely the eigenfunctions of the kernel K(x; y). In Section 7 we get the expressions for the eigenvalues and eigenfunctions of the transition probability kernel governing the evolution of our Markov chain.
Before closing this section we must mention that the polynomials in (1.9) were, in fact, introduced into the statistical literature by R.C. Griffiths [5] as early as 1971 which he defined as coefficients in an expansion of their generating function. However, Mizukawa and Tanaka [15] seem to have been the first to give the explicit expression in (1.9).
Few people would disagree with the importance of looking at certain mathematical objects from different points of view. We feel that this is certainly valid in the case of the present problem: character algebras, Gelfand-Aomoto functions, Lie algebras and the much older methods of hypergeometric functions including their series as well as their integral representations have a useful role to play. Such a wealth of approaches may be important if one tries to obtain matrix valued versions of these probabilistic models in the spirit of [6,7]. For a very rich and recent extension of the scalar valued solution of the hypergeometric equation, see [16]. It is worth noticing that in this case there is yet no extension of the Euler integral representation formula for the 2 F 1 function, and that algebraic methods such as those coming from Lie algebras or group representation theory, which have played such in important role in [11,12,13,14,15] have not made a mark in this matrix valued extension yet. Similar consideration would be relevant if one were to consider a matrix valued extension of, for instance, the work in [3].

Transformation formulas and integral representations (a) Transformation formulas
The point of this section is to establish (2.1), (2.2) and (2.3).
For references to some of the classical identities in this section the reader can consult for instance [1].

Necessary conditions of orthogonality
The point of this section is to show that (3.4) is a necessary condition to insure the desired orthogonality (3.2). Denoting the F (n) 1 polynomials in (1.9) by P m (x) for abbreviation, we find that which is just the generating function of P m (x). In a sense this represents the opposite point of view of Griffiths [5] where he defined the polynomials as the coefficients of the generating function.
Since one of our aims is to prove the orthogonality relation in which we assume that while the m's are nonnegative integers the α's are not, and nor is γ a nonpositive integer, and that Using (2.3) we then find that the above sum equals ξ i , and hence the sum inside the integral in (4.1)  We now expand the n-fold product on the right-hand side of (4.2) to get (4.4) Substitution of (4.2) and (4.4) inside the integrand of (4.1) and computing the integral, with the α i 's replaced by −m i 's and γ by −N , gives the value of the integral as In particular, for m = e 1 , and m ′ = e n ,

By similar arguments it follows that
which, by (4.3), means that n j=1 η j u rj u sj = 1, r, s = 1, 2, . . . , n, r = s. So the only terms that survive in (4.5) give The first factor in the numerator on the right-hand side of (4.7) is zero unless N −

Relations among the parameters
The previous sections were devoted to proving the orthogonality of the n 2 -parameter polynomials P m (x) defined in (1.9), with respect to the multinomial distribution b n (x; N ; η), η = (η 1 , . . . , η n ) (or b n (m; N ; η), η = (η 1 , . . . , η n ), with η i = η i ). Clearly, the n relations in (3.3) completely define the η i 's in terms of the u's, as they do the η i 's, by (3.4). If we think of the η's as given parameters then the n 2 polynomial parameters must satisfy in addition n 2 − (2n − 1) = (n − 1) 2 relations among themselves. Where do they come from and what do they mean geometrically is the question we shall examine in this section.
Note that the sufficient conditions of orthogonality (4.6) give n 2 further relations between the η's and u's. In fact, combining (3.3) and (4.6) we get n j=1 η j u rj (1 − u sj ) = 0, r, s = 1, 2, . . . , n, r = s. (6.1) Clearly, the positivity of all the probability parameters, i.e., η's, require that the determinant Det(u rj (1 − u sj )) = 0. It is possible to analyze these determinants and obtain the missing (n − 1) 2 relations. We shall, however, take a different route. Our proof of orthogonality is based almost entirely on the integral representation (2.3) and the simple multinomial summation formula. The proof is direct and elementary. However, if we had instead used the transformations (2.1) and (2.2) we could have reduced the problem to a (n − 1)-variable case, with the polynomials having (n − 1) 2 parameters, instead of n 2 . There are, of course, n ways to make this reduction, depending on which of the n variables we integrate first. If we do the x n summation first, then by n long and tedious set of computations we can find that the I m ′ m reduces to an expression proportional to the (n − 1)-fold sum , . . . , ζ n = η n−1 u n,n−1 (1 − u n,n−1 ) η n u nn (1 − u nn ) .
If we were to have orthogonality of the n-variable case then surely a reduction to a lower dimension will retain the same property. However, the appearance of the two F (n−1) 1 functions above doesn't suggest their parameters are the same. But the point is that they are, not identically, but consistent with the necessary conditions of orthogonality (3.3). To illustrate this point let us take the product of ζ's and the first n − 1 parameters of the first F (n−1) 1 , i.e., compute On the other hand, Similarly the equalities of other parameters are also established. The (n − 1) 2 relations can be expressed in the following compact form: where (one must, of course, take it for granted that u ij = 0, for any i, j).
The geometrical implication of these relations seems to suggest that the n-variable Krawtchouk polynomials (1.9) live on an (n − 1) 2 -dimensional submanifold, defined by (6.2), of the n 2 -dimensional space of (u ij ).